Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Debug Mode configuration and move detailed logs to Debug log level #167

Merged
merged 4 commits into from
Oct 1, 2020

Conversation

abrahamko
Copy link
Contributor

What does this PR do?

  • Add configuration to allow Debug mode that will show detailed logs.
    By default, they will not be written, but can be added for investigation.
  • Change log level of detailed logs from Info to Debug

Most messages were moved to DEBUG, because INFO should be information on the operation that the invoker is interested in.
For authn client, this is the authentication.
It doesn't matter to him how it does it, only what it does, and when it is done or failed.
See the examples below for more info.

What ticket does this PR close?

Closes #134

Checklists

Change log

  • The CHANGELOG has been updated

Test coverage

  • The changes in this PR do not require tests

Documentation

  • Docs (e.g. READMEs) were updated in this PR, and/or there is a follow-on issue to update docs, or
  • This PR does not require updating any documentation

Example of debug mode off (default):

INFO:  2020/09/15 09:31:55.403986 main.go:16: CAKC014I Kubernetes Authenticator Client v0.18.1-b6d2d1e starting up...
INFO:  2020/09/15 09:31:55.425011 main.go:43: CAKC006I Authenticating as user 'host/conjur/authn-k8s/authn-dev-env/apps/local-secrets-provider/*/*'
INFO:  2020/09/15 09:31:55.665871 main.go:55: CAKC001I Successfully authenticated
INFO:  2020/09/15 09:31:55.665917 main.go:61: CAKC013I Waiting for 6m0s to re-authenticate

Example of debug mode off (default) with errors:

INFO:  2020/09/15 09:29:36.001761 main.go:16: CAKC014I Kubernetes Authenticator Client v0.18.1-b6d2d1e starting up...
INFO:  2020/09/15 09:29:36.025154 main.go:43: CAKC006I Authenticating as user 'host/conjur/authn-k8s/authn-dev-env/apps/INVALID/*/*'
ERROR: 2020/09/15 09:29:36.048248 authenticator.go:136: CAKC029E Received invalid response to certificate signing request. Reason: status code 401, 
ERROR: 2020/09/15 09:29:36.048335 authenticator.go:200: CAKC015E Login failed
ERROR: 2020/09/15 09:29:36.048349 main.go:47: CAKC016E Failed to authenticate

Example of debug mode on

INFO:  2020/09/15 09:32:50.828426 main.go:16: CAKC014I Kubernetes Authenticator Client v0.18.1-b6d2d1e starting up...
DEBUG: 2020/09/15 09:32:50.828527 logger.go:60: CAKC001D Debug mode is enabled
INFO:  2020/09/15 09:32:50.894115 main.go:43: CAKC006I Authenticating as user 'host/conjur/authn-k8s/authn-dev-env/apps/local-secrets-provider/*/*'
DEBUG: 2020/09/15 09:32:50.894248 authenticator.go:197: CAKC005I Trying to login Conjur...
DEBUG: 2020/09/15 09:32:50.894299 authenticator.go:116: CAKC007I Logging in as user 'host/conjur/authn-k8s/authn-dev-env/apps/local-secrets-provider/*/*'
DEBUG: 2020/09/15 09:32:50.895002 requests.go:23: CAKC011I Login request to: https://conjur-oss.local-conjur.svc.cluster.local/authn-k8s/authn-dev-env/inject_client_cert
DEBUG: 2020/09/15 09:32:51.074895 authenticator.go:159: CAKC015I Loaded client certificate successfully from /etc/conjur/ssl/client.pem
DEBUG: 2020/09/15 09:32:51.075120 authenticator.go:171: CAKC016I Deleted client certificate from memory
DEBUG: 2020/09/15 09:32:51.075153 authenticator.go:203: CAKC002I Logged in
DEBUG: 2020/09/15 09:32:51.075205 authenticator.go:186: CAKC008I Cert expires: 2020-09-18 09:32:50 +0000 UTC
DEBUG: 2020/09/15 09:32:51.075225 authenticator.go:187: CAKC009I Current date: 2020-09-15 09:32:51.0751843 +0000 UTC
DEBUG: 2020/09/15 09:32:51.075253 authenticator.go:188: CAKC010I Buffer time:  30s
DEBUG: 2020/09/15 09:32:51.075514 requests.go:47: CAKC012I Authn request to: https://conjur-oss.local-conjur.svc.cluster.local/authn-k8s/authn-dev-env/cucumber/host%2Fconjur%2Fauthn-k8s%2Fauthn-dev-env%2Fapps%2Flocal-secrets-provider%2F%2A%2F%2A/authenticate
INFO:  2020/09/15 09:32:51.141853 main.go:55: CAKC001I Successfully authenticated
INFO:  2020/09/15 09:32:51.142157 main.go:61: CAKC013I Waiting for 6m0s to re-authenticate

@abrahamko abrahamko requested a review from a team as a code owner September 15, 2020 09:46
cmd/authenticator/main.go Outdated Show resolved Hide resolved
@@ -113,7 +113,7 @@ func (auth *Authenticator) GenerateCSR(commonName string) ([]byte, error) {
// successfully retrieved
func (auth *Authenticator) Login() error {

log.Info(log.CAKC007I, auth.Config.Username)
log.Debug(log.CAKC007I, auth.Config.Username)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this opens something up. we need to change CAKC007I to CAKC007D (or some other number) as the last letter indicates the log level. we have a few options here:

  1. Change the suffix for every message that we move from Info to Debug
  2. Remove the suffice altogether as the level is logged anyway in the same message.
  3. Leave these messages with the I suffix and not have a D suffix at all (info and debug messages will have an I suffix).

The second option is the most correct one imo, although it is different from how we log in other projects (conjur, CSPFK). However, option 3 is backwards-compatible and doesn't require a doc change (also - customers may be monitoring these messages).

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point!
I agree that option 2 looks the most correct way to go, but it made me think: Does it mean that there could be 2 different messages, one for info and one for debug, but with the same ID?
It might create conflicts that will be hard to resolve.
And between 1 and 3, I prefer to go with the convention to decide, which is option 1.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they won't have the same ID as one will be CAKC001I and one will be CAKC001D. we have the same thing with Error log messages. If we go for option 2 then we will need to restructure all the codes from the beginning and it will break compatibility.

Option 1 is ok as well but it us not fully backwards-compatible. For example, a customer may monitor the log for CAKC007I but now it will change to CAKC007D and they will never catch it.

Option 3 is the safest one regarding backwards-compatibility. It has its flaws but I would go in that direction.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just read the Semantic Versioning doc this project adheres to.
It explicitly says: Major version zero (0.y.z) is for initial development. Anything MAY change at any time. The public API SHOULD NOT be considered stable.
Isn't it enough to allow the use of the 1st option?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, I also found this: https://semver.org/spec/v2.0.0.html#how-do-i-know-when-to-release-100.
@izgeri, why authn client wasn't released as 1.0.0 yet?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we can go for option 1 but in that case then we can go with option 2. Which one do you prefer?

For both options I would add a CHANGELOG entry saying that the error codes were changed to enhance supportability.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@abrahamko I don't know, but I've raised this with PM a couple of times recently. We will have to find a time to have a smallish focused effort on this project to get it to 1.0.0. cc @Tovli

Copy link
Contributor

@izgeri izgeri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is great! I look forward to seeing the positive impact on Secretless logs from this change :)

left a couple of requested changes from the changelog

CHANGELOG.md Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
@orenbm
Copy link
Member

orenbm commented Sep 15, 2020

@izgeri the CHANGELOG master!

@sgnn7
Copy link
Contributor

sgnn7 commented Sep 15, 2020

@abrahamko / @orenbm / @izgeri Just doing a drive-by comment on this:
While I know that reduction in default logging may be advantageous for smaller logs and less noise, usually the customer provides us with the default log when something goes wrong (which often happens with setups of authn-k8s) and I'm not sure we will get enough detail to troubleshoot those issues quickly and just from the defaults that this PR introduces. I think maybe some contact with support or post-sales would be good to ping on this before we merge the PR to see if this is a desired change.

@abrahamko abrahamko force-pushed the add-debug-logs branch 2 times, most recently from 709bc19 to d44f405 Compare September 17, 2020 07:38
@dataplex
Copy link

A few comments on this based on some recent conversations:

  1. Since we didn't publish a comprehensive list of info/error codes to date for customer consumption, we can change existing ones. We also are pre-version 1.0 so these are subject to change at this time.
  2. In the future, we should keep the info/error/debug codes fixed and just create new codes and deprecate the old codes but keep them in the documentation (much like PAS does currently). Handling log message code changes this way makes "breaking" logging code changes easier for customers to adjust to as they may have multiple versions of a component in the environment and migrations may take some time where both sets of log messages need to be handled.
  3. I like @orenbm 's comment to "Remove the suffice altogether as the level is logged anyway in the same message."
  4. I also share @sgnn7 's concern about not having enough information in the INFO logs currently for troubleshooting purposes. At the very least we should log the configuration INFO when the container is launched to allow us to have those values when troubleshooting or working a ticket. Otherwise we have to reproduce this information by enabling DEBUG, which wastes time on both ends of the interaction (customer/CyberArk support and/or R&D).

@izgeri
Copy link
Contributor

izgeri commented Sep 21, 2020

Thanks @dataplex! I think we could add a separate issue for adding configuration info logs on container startup - that sounds like a reasonable idea. I would suggest for this PR we:

  • Update the log codes to remove the log type suffix (since this is already echoed when the message is logged)
  • Review the log messages that will go out once this change is made to ensure there is sufficient info for initial troubleshooting (fwiw this looks reasonable to me, given the process for turning on debug in this container and redeploying is fairly simple - much simpler than in the follower)
  • File a follow-up issue to log the initial configuration info on startup

@orenbm
Copy link
Member

orenbm commented Sep 22, 2020

thanks @dataplex & @izgeri !

@abrahamko will you take this from here? when are you planning to do it?

- Add configuration to allow Debug mode that will show detailed logs.
  By default, they will not be written, but can be added for investigation.
- Change log level of detailed logs from Info to Debug
@abrahamko
Copy link
Contributor Author

thanks @dataplex & @izgeri !

@abrahamko will you take this from here? when are you planning to do it?
@orenbm I pushed the suffix removal.
@izgeri, you can review the leftover logs in the PR description.

@abrahamko
Copy link
Contributor Author

I opened issue #169 as a follow-up issue to write the configuration to the log.

Log suffix is redundant, as it is written in the same line with the log.
This allows the log level change from Info to Debug without impacting the
log identifier.
Copy link
Contributor

@izgeri izgeri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a few comments. It's also worth noting that I ran golint ./... against this branch and there were a number of issues - are there any that can be fixed as part of this PR?

CHANGELOG.md Show resolved Hide resolved
pkg/log/log_messages.go Outdated Show resolved Hide resolved
pkg/log/log_messages.go Outdated Show resolved Hide resolved
pkg/log/logger.go Show resolved Hide resolved
pkg/authenticator/config/config.go Show resolved Hide resolved
@orenbm
Copy link
Member

orenbm commented Sep 29, 2020

@abrahamko can you update on this ticket? where does it stand?
i am adding some new log messages in another PR and wondered if i should wait for this

@abrahamko
Copy link
Contributor Author

@orenbm I will do one last effort to finish. Otherwise it will have to wait to next R&D boost.

@abrahamko
Copy link
Contributor Author

@izgeri this PR diverged too much from its original purpose.
Please open an issue to address the issues arise from golint.

CHANGELOG.md Outdated Show resolved Hide resolved
izgeri
izgeri previously approved these changes Sep 29, 2020
Copy link
Contributor

@izgeri izgeri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a suggestion for the change log entry - it's worth calling out that the error messages are changing for some errors with this PR

but other than that, this LGTM, so I'll approve this to remove my "request changes" on the PR

Co-authored-by: Geri Jennings <[email protected]>
@orenbm orenbm merged commit ac965f4 into master Oct 1, 2020
@orenbm orenbm deleted the add-debug-logs branch October 1, 2020 06:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Separate debug logs from info
5 participants