[CERTTF-388] [CERTTF-443] Implement masking on the agent #404
base: main
A masking command runner is used when either sensitive patterns are specified in the agent configuration or secrets are provided in the job data (or both).
# retrieve sensitive patterns from the agent configuration
sensitive_patterns = self.client.config.get("sensitive_patterns", [])
# retrieve secrets from the job
secrets_dict = self.job_data.get("secrets")
I think the logic is self-explanatory, no need for comments here
I must say that in this particular case the comments were deliberate: I mean to emphasize how the sources of these patterns (sensitive vs. secrets) are different, i.e. that the former is static and comes from the agent configuration whereas the latter is dynamic and differs for every job. I believe one can easily miss that by skimming over the code.
# inject secrets into the runner environment
environment = {
    **self.client.config,
    **secrets_dict,
}
# add secrets to the sensitive variables
sensitive_patterns.extend(secrets_dict.values())
# secrets and (possibly) sensitive information
same here
}

log_data = self.run_testcmds(job_data, client, tmp_path)
print(log_data)
are you supposed to print log data to stdout, not stderr?
This is a test, so neither stdout nor stderr is printed when the test is successful. Both are, however, captured and printed (separately) when the test fails, so this print is meant as a diagnostic: you only see it when the test fails, and it might help you understand what went wrong. I'd happily change this to print to stderr if you feel it makes any difference.
@boukeas I have a really stupid question to ask first... when/why do we want to use regexp to specify patterns that need to be masked, and why would they be in the agent config? I guess I just assumed that the only masking would be secrets that were sent to us from the server - which would be literal string matching in the output and not defined in the agent config. The specific example cited here is masking the IP address, but I don't think we ever want to mask that information do we? That could make interpreting the job really hard in many cases.
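To make the distinction in this thread concrete, here is a minimal sketch of how the two sources might be combined: regular expressions from the agent configuration (static) and secret values from the job data (dynamic, matched literally). The function name `collect_patterns`, the use of `re.escape`, and the sample values are assumptions made for illustration; they are not taken from the PR.

```python
import re


def collect_patterns(config, job_data):
    """Return regex patterns covering config patterns and job secrets.

    Hypothetical helper, not part of the PR.
    """
    # static source: regular expressions from the agent configuration
    # (copied, so the configuration itself is never modified)
    patterns = list(config.get("sensitive_patterns", []))
    # dynamic source: secrets differ for every job and are plain strings,
    # so they are escaped in order to be matched literally
    secrets = job_data.get("secrets") or {}
    patterns.extend(re.escape(value) for value in secrets.values())
    return patterns


# made-up sample data
config = {"sensitive_patterns": [r"password=\S+"]}
job_data = {"secrets": {"API_KEY": "a.b+c"}}

patterns = collect_patterns(config, job_data)
regex = re.compile("|".join(f"(?:{p})" for p in patterns))
masked = regex.sub("****", "key a.b+c and password=hunter2")
print(masked)

# without escaping, the secret is read as a regex and misses its own literal text
unescaped_hit = re.search("a.b+c", "key a.b+c")
print(unescaped_hit)
```

The last two lines show why literal secrets and regular expressions may need different handling: a secret containing characters such as `.` or `+` changes meaning when it is interpreted as a pattern.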
Description
As described in CR080 - Handling secrets and sensitive information in Testflinger, one of the tasks involved in supporting secrets in Testflinger is implementing functionality for masking sensitive information in the output generated by the Testflinger agent.
This PR:
- Adds a Masker class for masking sensitive information in text. What is considered sensitive is specified through a list of regular expressions. Note that the Masker currently replaces sensitive information with a hash of a specified length, which means that identical pieces of sensitive information appear in the output with the same hash.
- Adds a MaskingCommandRunner class, derived from the regular CommandRunner. Their only difference lies in the post_output method, where the masking variant of the runner applies a Masker to the output before posting it. The Masker to be used is provided to the runner upon its construction, so the runner itself is not aware of the specifics, i.e. it is not aware of the kind of information that is considered sensitive.
- Modifies the TestflingerJob class to check if "sensitive patterns" are specified in the agent's configuration file or if secrets are included in the job data. In either case, it employs a MaskingCommandRunner instead of a regular CommandRunner, to make sure that sensitive information is masked in the output. In the case of secrets, it also injects them into the corresponding environment variables of the runner (CERTTF-443).
Resolved issues
Resolves CERTTF-388 and CERTTF-443.
Some (reasonable) assumptions have been made regarding the way the server will pass secrets onto the agent along with other job data. This is separate work to be undertaken within CERTTF-441 and possibly CERTTF-444, so some minor modifications might be necessary when these tasks are also completed.
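The relationship between the masker and the masking runner outlined in the description can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: the SHA-256 hash, the `**...**` wrapping, the `hash_length` parameter, the `apply` method name, and the stand-in `CommandRunner` are all hypothetical.

```python
import hashlib
import re


class Masker:
    """Replace text matching any sensitive pattern with a short hash."""

    def __init__(self, patterns, hash_length=10):
        if not patterns:
            # an empty alternation would match the empty string everywhere
            raise ValueError("at least one pattern is required")
        # wrap each pattern in a non-capturing group before joining them
        self.regex = re.compile("|".join(f"(?:{p})" for p in patterns))
        self.hash_length = hash_length

    def _hash(self, match):
        # assumption: SHA-256, truncated to the requested length
        digest = hashlib.sha256(match.group(0).encode()).hexdigest()
        return f"**{digest[:self.hash_length]}**"

    def apply(self, text):
        return self.regex.sub(self._hash, text)


class CommandRunner:
    """Stand-in for the real runner: it only collects posted output."""

    def __init__(self):
        self.posted = []

    def post_output(self, data):
        self.posted.append(data)


class MaskingCommandRunner(CommandRunner):
    """Runner that masks output before posting it."""

    def __init__(self, masker):
        super().__init__()
        # the runner only holds the masker; it knows nothing about patterns
        self.masker = masker

    def post_output(self, data):
        super().post_output(self.masker.apply(data))


runner = MaskingCommandRunner(Masker([r"secret-\w+"]))
runner.post_output("token is secret-abc, again secret-abc")
print(runner.posted[0])
```

Because the replacement is derived from the matched text, both occurrences of the same value are replaced by the same hash, which keeps repeated values recognisable in the output without revealing them.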
Documentation
The newly introduced sensitive_patterns in the agent configuration file has been added to the relevant documentation.
Tests
Tests have been included for all the newly introduced classes and functionality, including integration-style tests that use the fake_connector device connector to run test_cmds that include sensitive information and/or secrets.
Also, this job was submitted to a local Testflinger deployment:
The testflinger agent had these lines added to its configuration file, specifying IP addresses as sensitive information (note the necessary escapes).
This was the output of the job, where the device IP has been masked.
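The configuration lines and the job output are not reproduced here. As a rough illustration of why escapes matter in a pattern for IP addresses, the sketch below compares a pattern with escaped dots against one without. Both patterns, the helper `matches`, and the sample strings are assumptions and are not the ones used in the deployment above.

```python
import re

# hypothetical patterns, not taken from the agent configuration
ESCAPED = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"
UNESCAPED = r"\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}"


def matches(pattern, text):
    """Return True if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


for candidate in ["192.168.1.10", "10-102-156-7"]:
    # an unescaped dot matches any character, so it also accepts dashes
    print(candidate, matches(ESCAPED, candidate), matches(UNESCAPED, candidate))
```

With the dots escaped, only dotted addresses are matched; without the escapes, unrelated text such as dash-separated numbers would be masked as well.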