DAOS-18292 ddb: Debug level of the DDB GO code can not be defined #17232

Open
knard38 wants to merge 14 commits into master from ckochhof/fix/master/daos-18292
Conversation

@knard38
Contributor

@knard38 knard38 commented Dec 5, 2025

Description

Properly configure C DAOS debug facilities from Golang DDB code.

Steps for the author:

  • Commit message follows the guidelines.
  • Appropriate Features or Test-tag pragmas were used.
  • Appropriate Functional Test Stages were run.
  • At least two positive code reviews including at least one code owner from each category referenced in the PR.
  • Testing is complete. If necessary, forced-landing label added and a reason added in a comment.

After all prior steps are complete:

  • Gatekeeper requested (daos-gatekeeper added as a reviewer).

@knard38 knard38 self-assigned this Dec 5, 2025
@github-actions
github-actions bot commented Dec 5, 2025

Ticket title is 'Debug level of the DDB GO code can not be defined'
Status is 'In Progress'
https://daosio.atlassian.net/browse/DAOS-18292

@daosbuild3
Collaborator

Test stage Functional Hardware Medium MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17232/1/execution/node/1276/log

@knard38 knard38 force-pushed the ckochhof/fix/master/daos-18292 branch 2 times, most recently from 3df67cc to 85e6a34 Compare December 15, 2025 19:25
@knard38 knard38 force-pushed the ckochhof/fix/master/daos-18292 branch from 85e6a34 to caf8936 Compare January 5, 2026 08:16
@knard38 knard38 force-pushed the ckochhof/fix/master/daos-18291 branch 2 times, most recently from 9891709 to 9cbf6fe Compare January 5, 2026 10:58
@knard38 knard38 force-pushed the ckochhof/fix/master/daos-18292 branch from caf8936 to 72a9f0c Compare January 5, 2026 11:06
@daosbuild3
Collaborator

Test stage Functional on EL 8.8 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos/job/PR-17232/4/display/redirect

@daosbuild3
Collaborator

Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17232/5/execution/node/1358/log

@daosbuild3
Collaborator

Test stage Functional Hardware Medium MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17232/5/execution/node/1399/log

@knard38 knard38 force-pushed the ckochhof/fix/master/daos-18292 branch 2 times, most recently from 795df75 to 0187ab0 Compare January 8, 2026 08:43
@knard38 knard38 added the CR Catastrophic Recovery Feature label Jan 8, 2026
@knard38 knard38 force-pushed the ckochhof/fix/master/daos-18292 branch from 0187ab0 to 0f34795 Compare January 29, 2026 08:46
@daosbuild3
Collaborator

Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17232/8/execution/node/1398/log

Properly configure C DAOS debug facilities from Golang DDB code.

Features: recovery
Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@hpe.com>
@knard38 knard38 force-pushed the ckochhof/fix/master/daos-18292 branch from 0f34795 to 981d4b8 Compare February 6, 2026 14:09
@knard38 knard38 changed the base branch from ckochhof/fix/master/daos-18291 to master February 6, 2026 14:10
@knard38 knard38 marked this pull request as ready for review February 6, 2026 14:10
@knard38 knard38 requested review from a team as code owners February 6, 2026 14:10
@daosbuild3
Collaborator

Test stage Functional Hardware Medium MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17232/13/execution/node/1189/log

@daosbuild3
Collaborator

Test stage Functional Hardware Medium MD on SSD completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-17232/12/testReport/

@daosbuild3
Collaborator

Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17232/12/execution/node/1392/log

Integrate reviewers comments:
- Fix UX interface

Features: recovery
Allow-unstable-test: true
Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@hpe.com>
tanabarr previously approved these changes Feb 12, 2026
@daosbuild3
Collaborator

Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17232/14/execution/node/1390/log

Integrate reviewers comments:
- Update documentation.

Features: recovery
Allow-unstable-test: true
Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@hpe.com>
@knard38
Contributor Author

knard38 commented Feb 13, 2026

To properly integrate the documentation on the logging facilities, I have added a ddb man page, as is already done for the dmg and daos commands.
I have also moved into this man page the detailed documentation that was previously printed with the help message.
A PDF of the generated man page is attached to this comment.
ddb.pdf

@knard38 knard38 requested a review from tanabarr February 13, 2026 15:59
kjacque previously approved these changes Feb 13, 2026
Contributor

@kjacque kjacque left a comment

Nice job adding documentation. I think this will be helpful to users. Thanks.

Remaining comments aren't blocking and could be addressed in a follow-on.

SysdbPath string `long:"db_path" short:"p" description:"Path to the sys db."`
VosPath string `long:"vos_path" short:"s" description:"Path to the VOS file to open."`
Version bool `short:"v" long:"version" description:"Show version"`
Debug bool `long:"debug" description:"Without this option, the console log level is set to ERROR. With this option, the console log level is set to INFO. For more detailed logs, provide the --log_dir path to log to a file."`
Contributor

From the user's perspective, the meaning of ERROR and INFO may not be obvious, so this option is mainly for DAOS developers. If that is true, a more convenient and flexible solution may be something like:
ddb --debug=mmm --logdir=nnn ...
where mmm is the debug level (ERROR, WARN, INFO, DEBUG, and so on) and nnn is the directory for log messages. The two options are independent of each other: even if the debug level is set to ERROR, the user should still be able to redirect the error messages to the specified log directory.

Just a suggestion. I am not sure how difficult it is to implement, so it is up to you.

Another question: how will these two options be used in interactive mode? Are they set per sub-command, or does the whole mini-shell share the same configuration from the start of the session?

Contributor Author

@knard38 knard38 Feb 16, 2026

> From the user's perspective, the meaning of ERROR and INFO may not be obvious, so this option is mainly for DAOS developers. If that is true, a more convenient and flexible solution may be something like:
> ddb --debug=mmm --logdir=nnn ...
> where mmm is the debug level (ERROR, WARN, INFO, DEBUG, and so on) and nnn is the directory for log messages. The two options are independent of each other: even if the debug level is set to ERROR, the user should still be able to redirect the error messages to the specified log directory.
>
> Just a suggestion. I am not sure how difficult it is to implement, so it is up to you.

At this time it is not really feasible, because the INFO logging facility is also used to print the output of ddb commands such as the version or help information. There may be other commands using the logging facilities in this way. The issue is fixed in a follow-up PR that is needed to properly test the ddb Go code. Once it is fixed, it should not be too difficult (at first glance) to change the behaviour as you asked. However, I would change it slightly: messages at ERROR level and above should always be printed on stderr. If --log_dir is defined, these messages will be printed both on stderr and in the log file.

If this last proposal makes sense to you, I can backport to this PR the part of the follow-up PR that fixes the invalid logging usage, and then change this one as you suggested. Does that make sense to you?

> Another question: how will these two options be used in interactive mode? Are they set per sub-command, or does the whole mini-shell share the same configuration from the start of the session?

In interactive mode, the whole mini-shell and its sub-commands share the same configuration.

Contributor Author

@knard38 knard38 Feb 16, 2026

  • Use consistent logging facilities

Contributor Author

  • Use consistent logging facilities

Fixed with commit 021ee3b

Fix building issue: add missing LD_LIBRARY path.

Features: recovery
Allow-unstable-test: true
Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@hpe.com>
Fix reviewers remarks:
- Add short description of VOS path in live help output.

Features: recovery
Allow-unstable-test: true
Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@hpe.com>
Fix reviewers remarks:
- Use consistent logging facilities

Features: recovery
Allow-unstable-test: true
Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@hpe.com>
@daosbuild3
Collaborator

Test stage Functional Hardware Large MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos/job/PR-17232/17/display/redirect

@knard38 knard38 requested review from Nasf-Fan and kjacque February 16, 2026 15:34
@daosbuild3
Collaborator

Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17232/18/execution/node/1347/log

Contributor

@kjacque kjacque left a comment

Comments are minor, not something I would block on. Nice improvements! I like the change to pass the log level--this really makes more sense from the perspective of a developer and/or a user collecting logs to troubleshoot.

}
}

func strToLogLevels(level string) (logging.LogLevel, engine.LogLevel, error) {
Contributor

This would be a good function to write a simple Go unit test for.
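
A table-driven check along the lines suggested here might look like the sketch below. It is written under stated assumptions: `parseLevel` and the `level` type are simplified, hypothetical stand-ins for `strToLogLevels` and the DAOS `logging.LogLevel` / `engine.LogLevel` types, the accepted names are the ones mentioned earlier in this review, and case-insensitive matching is assumed rather than known. In the DAOS tree the same table would more naturally live in a `_test.go` file and be driven by `testing.T`.

```go
package main

import (
	"fmt"
	"strings"
)

// level is a stand-in for the DAOS log level types, which are not
// reproduced here.
type level int

const (
	levelError level = iota
	levelWarn
	levelInfo
	levelDebug
)

// parseLevel is a hypothetical, simplified stand-in for strToLogLevels.
// Assumption: names are matched case-insensitively and surrounding
// whitespace is ignored.
func parseLevel(s string) (level, error) {
	switch strings.ToUpper(strings.TrimSpace(s)) {
	case "ERROR":
		return levelError, nil
	case "WARN":
		return levelWarn, nil
	case "INFO":
		return levelInfo, nil
	case "DEBUG":
		return levelDebug, nil
	}
	return levelError, fmt.Errorf("unknown log level %q", s)
}

// levelCase is one row of the test table.
type levelCase struct {
	input   string
	want    level
	wantErr bool
}

// levelCases covers each valid name plus a few invalid inputs.
var levelCases = []levelCase{
	{input: "ERROR", want: levelError},
	{input: "warn", want: levelWarn},
	{input: " Info ", want: levelInfo},
	{input: "DEBUG", want: levelDebug},
	{input: "", wantErr: true},
	{input: "verbose", wantErr: true},
}

// runLevelCases runs every row against parse and returns one message per
// failing row; an empty result means all rows passed.
func runLevelCases(parse func(string) (level, error)) []string {
	var failures []string
	for _, tc := range levelCases {
		got, err := parse(tc.input)
		switch {
		case (err != nil) != tc.wantErr:
			failures = append(failures, fmt.Sprintf("%q: unexpected error state: %v", tc.input, err))
		case err == nil && got != tc.want:
			failures = append(failures, fmt.Sprintf("%q: got %d, want %d", tc.input, got, tc.want))
		}
	}
	return failures
}

func main() {
	for _, f := range runLevelCases(parseLevel) {
		fmt.Println("FAIL:", f)
	}
}
```

Keeping the rows in a table makes it cheap to add a case whenever a new level name or alias is accepted.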

SysdbPath string `long:"db_path" short:"p" description:"Path to the sys db."`
VosPath string `long:"vos_path" short:"s" description:"Path to the VOS file to open."`
Version bool `short:"v" long:"version" description:"Show version"`
Debug string `long:"debug" description:"Logging log level (default to ERROR)"`
Contributor

Would be good to provide the list of options for log level here directly (or at least direct the user to the manpage).
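
One possible way to act on this suggestion is to spell out the accepted values in the `description` tag itself and point to the man page. The sketch below is only illustrative: `cliOptions` and `describe` are hypothetical names, and the listed level names are taken from the earlier review comment, not from the set ddb actually accepts. The struct tags are plain strings, so the example reads them back with `reflect` instead of depending on a flag-parsing library.

```go
package main

import (
	"fmt"
	"reflect"
)

// cliOptions mirrors the shape of the option struct quoted above.
// Assumption: the level names in the description come from the review
// thread and may not match the values ddb really accepts.
type cliOptions struct {
	Debug string `long:"debug" description:"Logging level, one of ERROR, WARN, INFO or DEBUG (default: ERROR). See the ddb man page for details."`
}

// describe returns the description tag of the named struct field, i.e. the
// text a tag-driven flag parser would show in its help output. It returns
// an empty string when the field does not exist.
func describe(v interface{}, field string) string {
	f, ok := reflect.TypeOf(v).FieldByName(field)
	if !ok {
		return ""
	}
	return f.Tag.Get("description")
}

func main() {
	fmt.Println(describe(cliOptions{}, "Debug"))
}
```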

Labels

CR Catastrophic Recovery Feature

6 participants