Contributor

@huven huven commented Oct 20, 2025

Fixes #5123

Todo:

  • Check std::stop_token cross-platform support
  • Unit test
  • Make all PasswordHash subclasses check stop_token
    • Argon2
    • PBKDF2
    • Scrypt
    • OpenPGP-S2K
    • Bcrypt-PBKDF
  • Drop added supports_cooperative_cancellation() when all done
  • Update documentation

Note:

  • The deprecated API (non-PasswordHash based) has not been touched
  • I've added the default argument std::nullopt only to the base class. Default arguments are not inherited by virtual overrides, which I think is OK because the default is taken from the static type the call is made through. This is similar to other methods like from_params.
  • macOS runners have been updated from 13/14 to 26
  • Added -march=armv6 flag to bare metal build (to ensure availability of atomic operations)
  • OpenPGP-S2K has been excluded from unit-testing, as it cannot be tuned above a few seconds on modern hardware, which is impossible to test reliably on GitHub runners.
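
To illustrate the note on default arguments, here is a minimal sketch of how C++ resolves them for virtual functions. Base_Hash, Derived_Hash, derive and call_via_base are hypothetical stand-ins, not Botan's classes or signatures, and std::optional<int> stands in for the optional stop token.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-ins; these are not Botan's real classes or signatures.
struct Base_Hash {
   virtual ~Base_Hash() = default;

   // The default argument is declared on the base class only.
   virtual std::string derive(const std::optional<int>& token = std::nullopt) const {
      return token.has_value() ? "base: token" : "base: no token";
   }
};

struct Derived_Hash final : Base_Hash {
   // No default argument here: an override does not inherit the base's default.
   std::string derive(const std::optional<int>& token) const override {
      return token.has_value() ? "derived: token" : "derived: no token";
   }
};

// The call goes through a base reference, so the default argument comes from
// Base_Hash (the static type) while the body that runs is Derived_Hash's
// (the dynamic type).
inline std::string call_via_base(const Base_Hash& hash) {
   return hash.derive();
}
```

With these stand-ins, calling derive() without arguments directly on a Derived_Hash object would not compile, which is the trade-off the note accepts.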

coveralls commented Oct 20, 2025

Coverage Status

coverage: 90.416% (-0.009%) from 90.425%
when pulling 627aed3 on huven:pwhash-stoppable
into f283c5d on randombit:master.

Collaborator

reneme commented Oct 21, 2025

Looking at the CI outputs, we seem to be running into trouble with macOS (Xcode 13 and 14) and the Android NDK, for both of which we claim to support only the "latest" version. Both could be updated on our CI: Xcode (to 26, which comes with Clang 17) and the NDK (from 28 to 29) [1, 2], which may or may not fix this issue.

Additionally the build configuration for arm32-baremetal (which links to -lnosys) fails because the symbol for __atomic_fetch_sub_4 is missing. I'm guessing that the stop token needs this for synchronization. Not sure whether that would become a showstopper, frankly.

Contributor Author

huven commented Oct 21, 2025

For macOS/iOS I know for sure it works with recent Xcode, as I already included a basic implementation in my app, works perfectly 😄.
It looks like the current images use clang 14 and 15 while stop_token was introduced at 16.

Will have some time later today to dive into this.

Contributor Author

huven commented Oct 21, 2025

Merged both PRs you linked in here, let's see what happens..

Collaborator

reneme commented Oct 21, 2025

> Merged both PRs you linked in here, let's see what happens..

Thanks. I was about to propose exactly that.

The macOS 26 PR currently doesn't work -- some Python-based test fails. But the build succeeds, despite the stop token usage. Similarly, NDK r29 seems to ship the stop token as well. So both are probably a green light (if we're willing to ditch support for older NDKs and Xcodes). 🙂

That leaves the linking issue on the "arm32-baremetal" CI job which doesn't seem to provide the required atomic functionality. 🙁

Contributor Author

huven commented Oct 21, 2025

One python test fails indeed on macos-26. I see the test is disabled on Windows.

I have one Mac here at 26, interestingly enough all tests run ok on it (though an immediate second run fails the cli_tls_socket_tests test (Error: server bind failed), but that looks like another issue).

Will debug this a bit further, would be nice if the cli_tls_proxy_tests would fail on my device too.

Contributor Author

huven commented Oct 21, 2025

Didn't succeed in reproducing the tls_proxy test failure. Installed both boost and python with the exact same version as in GitHub runner, copied the configure.py command verbatim, all tests still pass 😢

For now, I'm tempted to add macOS to the Windows exception for that test and leave fixing the test to another issue.

[...]
   INFO: Ran cli_tls_proxy_tests in 0.41 sec
[...]
Ran 238 tests with 0 failures in 7.82 seconds

Contributor Author

huven commented Oct 21, 2025

The arm32 cross-compile might need a -latomic added to the linker phase, will test that quick&dirty first. If that works it needs to be added somewhere in the configure system.

Copilot AI left a comment

Pull Request Overview

This PR introduces cooperative cancellation support for password hashing operations using std::stop_token. The implementation allows long-running password hash operations to be cancelled gracefully via a stop token passed through the API.

Key changes:

  • Added std::stop_token parameter to all PasswordHash::derive_key() methods
  • Implemented cancellation checking in Argon2's block processing loop
  • Updated test infrastructure to disable flaky TLS proxy tests on macOS and upgraded CI to use newer toolchain versions

Reviewed Changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated no comments.

Show a summary per file

| File | Description |
| --- | --- |
| src/lib/pbkdf/pwdhash.h | Added supports_cooperative_cancellation() method and stop_token parameter to base class |
| src/lib/pbkdf/pwdhash.cpp | Updated implementation to accept stop_token parameter |
| src/lib/pbkdf/argon2/argon2.h | Added supports_cooperative_cancellation() override returning true and updated signatures |
| src/lib/pbkdf/argon2/argon2.cpp | Implemented cancellation checking in process_block() loop |
| src/lib/pbkdf/argon2/argon2pwhash.cpp | Updated to pass stop_token through to argon2 implementation |
| src/lib/pbkdf/pbkdf2/pbkdf2.h | Added stop_token parameter to signature |
| src/lib/pbkdf/pbkdf2/pbkdf2.cpp | Updated to accept but not use stop_token parameter |
| src/lib/pbkdf/scrypt/scrypt.h | Added stop_token parameter to signature |
| src/lib/pbkdf/scrypt/scrypt.cpp | Updated to accept but not use stop_token parameter |
| src/lib/pbkdf/bcrypt_pbkdf/bcrypt_pbkdf.h | Added stop_token parameter to signature |
| src/lib/pbkdf/bcrypt_pbkdf/bcrypt_pbkdf.cpp | Updated to accept but not use stop_token parameter |
| src/lib/pbkdf/pgp_s2k/pgp_s2k.h | Added stop_token parameter to signature |
| src/lib/pbkdf/pgp_s2k/pgp_s2k.cpp | Updated to accept but not use stop_token parameter |
| src/lib/x509/certstor_system_macos/certstor_macos.h | Enhanced documentation comment |
| src/scripts/test_cli.py | Disabled flaky TLS proxy test on macOS |
| src/scripts/ci/setup_gh_actions.sh | Added fallback for Xcode version selection |
| src/configs/repo_config.env | Updated Android NDK version from r28 to r29 |
| .github/workflows/ci.yml | Updated CI matrix to use macos-26 runners |

Contributor Author

huven commented Oct 21, 2025

After some more experimenting (the -latomic didn't help, will revert that), it seems that the atomic primitives are only available if the cpu supports them, so e.g.

arm-none-eabi-c++ -std=c++20 -march=armv6 -O2 -specs=nosys.specs  stop_token_test.cpp

works like a charm, as ARMv6 introduced LDREX/STREX. Lowering that

arm-none-eabi-c++ -std=c++20 -march=armv5t -O2 -specs=nosys.specs  stop_token_test.cpp
/usr/lib/gcc/arm-none-eabi/12.2.1/../../../arm-none-eabi/bin/ld: /tmp/cc1A7gy2.o: in function `std::stop_token::stop_requested() const':
stop_token_test.cpp:(.text._ZNKSt10stop_token14stop_requestedEv[_ZNKSt10stop_token14stop_requestedEv]+0x14): undefined reference to `__sync_synchronize'
/usr/lib/gcc/arm-none-eabi/12.2.1/../../../arm-none-eabi/bin/ld: /tmp/cc1A7gy2.o: in function `main':
stop_token_test.cpp:(.text.startup+0x30): undefined reference to `__atomic_fetch_add_4'
/usr/lib/gcc/arm-none-eabi/12.2.1/../../../arm-none-eabi/bin/ld: stop_token_test.cpp:(.text.startup+0x70): undefined reference to `__atomic_fetch_sub_4'
/usr/lib/gcc/arm-none-eabi/12.2.1/../../../arm-none-eabi/bin/ld: stop_token_test.cpp:(.text.startup+0x88): undefined reference to `__atomic_fetch_sub_4'
[...]

Assuming Botan is used on pre-ARMv6 architectures, that will require some additional work, like providing a custom implementation of some atomic operations, using some kind of locking.
Or throwing in a lot of #ifdef's in the code, which I prefer to avoid..

Contributor Author

huven commented Oct 21, 2025

After giving this some thought: I think it's a good idea to decide first if it is safe to assume that ARMv6 is available, or if prior architectures need to be supported too.

  • If assuming ARMv6 is ok, the -march=armv6 can be added to CI fixing the build error.
  • If ARMv5 and older need to be supported, this feature will need a flag and use #ifdef's in the code.

@huven huven force-pushed the pwhash-stoppable branch 2 times, most recently from c70088a to 6773898 Compare October 21, 2025 21:02
Contributor Author

huven commented Oct 27, 2025

To follow up my own question: I will assume ARMv6 or higher, so I added that to the CI build. Should succeed now (let's see).

And I've added a unit test to test the stop_token with Argon2 (can be extended to other pwdhashes once they support the stop_token).
The test only succeeds if an Invalid_State exception is thrown (i.e. if the derivation was aborted).

Collaborator

@reneme reneme left a comment

> To follow up my own question: I will assume ARMv6 or higher, so added that to the CI build.

For the sake of this PR, assuming that is certainly fine. I have no way of knowing whether that holds in general though.

Given that we already wrap std::mutex and std::lock_guard, we could consider similarly wrapping the stop token. If a platform doesn't support it (e.g. because atomics are disabled for bare metal), it could simply fall back to a dummy implementation and ignore the cancellation request.

I'll look into that independently.
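
One possible shape for such a wrapper, as a rough sketch only: Cancellation_Token and SKETCH_HAS_STOP_TOKEN are hypothetical names, and detection via the standard feature-test macro __cpp_lib_jthread is an assumption of this sketch, not something decided in this thread. A real build would more likely use a configure-time flag.

```cpp
#include <utility>
#include <version>

#if defined(__cpp_lib_jthread)
   #include <stop_token>
   #define SKETCH_HAS_STOP_TOKEN 1  // hypothetical macro, not a Botan build flag
#endif

// Hypothetical wrapper: forwards to std::stop_token where available,
// otherwise degrades to a dummy that never reports a stop request.
class Cancellation_Token final {
   public:
      Cancellation_Token() = default;

#if defined(SKETCH_HAS_STOP_TOKEN)
      explicit Cancellation_Token(std::stop_token token) : m_token(std::move(token)) {}

      bool stop_requested() const noexcept { return m_token.stop_requested(); }

   private:
      std::stop_token m_token;
#else
      // Fallback: cancellation requests are silently ignored.
      bool stop_requested() const noexcept { return false; }
#endif
};
```

The cost of this design is that the public API would expose a library-specific type rather than the std:: one.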


while(index < segments) {
   if((index & 63) == 0 && stop_token.has_value() && stop_token->stop_requested()) {
      throw Botan::Invalid_State("Cancelled");
Collaborator

I agree that Invalid_State is probably the closest exception type for this. But I would argue that it is worthwhile to introduce a new exception type Operation_Canceled that inherits from Invalid_State so that applications can explicitly distinguish cancellations.

Contributor Author

My reasoning for using Invalid_State was that the caller can easily distinguish the cancelled state, because the caller owns the stop_source.

In my application, I currently call it like this:

try {
    passwordHash->derive_key(..., stop_source.get_token());
} catch (Botan::Exception& e) {
    if (stop_source.stop_requested()) {
        // handle cancellation
    } else {
        // handle other error
    }
}

Adding Operation_Canceled is nice, as it simplifies the catch to just catching Botan::Operation_Canceled.

I'll add it in the next commit.
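
A minimal sketch of what the simplified catch could look like, using stand-in exception types in a sketch namespace. These are not the Botan classes, which have different bases and constructors, and derive and run are hypothetical helpers.

```cpp
#include <stdexcept>
#include <string>

namespace sketch {

// Stand-ins for the exception types discussed here; not the Botan classes.
struct Invalid_State : std::runtime_error {
   explicit Invalid_State(const std::string& msg) : std::runtime_error(msg) {}
};

struct Operation_Canceled final : Invalid_State {
   explicit Operation_Canceled(const std::string& msg) : Invalid_State(msg) {}
};

// Hypothetical derivation that is either cancelled or fails for another reason.
inline void derive(bool cancelled) {
   if(cancelled) {
      throw Operation_Canceled("Cancelled");
   }
   throw Invalid_State("some other failure");
}

inline std::string run(bool cancelled) {
   try {
      derive(cancelled);
   } catch(const Operation_Canceled&) {
      return "cancelled";  // the more derived type has to be caught first
   } catch(const Invalid_State&) {
      return "other error";
   }
   return "ok";
}

}  // namespace sketch
```

Because Operation_Canceled inherits from Invalid_State in this sketch, existing handlers that catch Invalid_State would keep working unchanged.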

Contributor Author

I see I didn't add that logic to the unit test, my bad.. Anyway, will be fixed after adding the new exception.

Contributor Author

Operation_Canceled has been added in commit 0e63412

Contributor Author

In ffi.cpp I mapped OperationCanceled to BOTAN_FFI_ERROR_INVALID_OBJECT_STATE, as the stop_token is not exposed to FFI, so this error cannot occur inside ffi (currently).

If preferred I can add a BOTAN_FFI_ERROR_OPERATION_CANCELED there but that might confuse consumers of the ffi interface.

Collaborator

@reneme reneme Oct 29, 2025

I think mapping this in FFI as described is fine. Please just pair up the case Botan::ErrorType::OperationCanceled: with the existing case Botan::ErrorType::InvalidObjectState: as this seems to be what was done in this switch mapping for other ambiguous error codes.
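
The pairing suggested here could look roughly like the sketch below. The ErrorType enumerator names mirror those mentioned in this thread, but the enum itself, the SKETCH_FFI_* constants, their values and map_error are illustrative stand-ins and not the contents of ffi.cpp.

```cpp
// Stand-in enum; only the two enumerator names come from this thread.
enum class ErrorType { InvalidObjectState, OperationCanceled, Unknown };

// Illustrative error codes; the values are arbitrary.
enum Sketch_FFI_Error : int {
   SKETCH_FFI_ERROR_INVALID_OBJECT_STATE = -1,
   SKETCH_FFI_ERROR_UNKNOWN = -2,
};

inline int map_error(ErrorType type) {
   switch(type) {
      // Paired cases: both error types share a single FFI error code.
      case ErrorType::InvalidObjectState:
      case ErrorType::OperationCanceled:
         return SKETCH_FFI_ERROR_INVALID_OBJECT_STATE;
      case ErrorType::Unknown:
         return SKETCH_FFI_ERROR_UNKNOWN;
   }
   return SKETCH_FFI_ERROR_UNKNOWN;  // unreachable for valid enumerators
}
```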

Contributor Author

done

Contributor Author

huven commented Oct 28, 2025

> > To follow up my own question: I will assume ARMv6 or higher, so added that to the CI build.

> For the sake of this PR, assuming that is certainly fine. I have no way of knowing whether that holds in general though.

> Given that we already wrap std::mutex and std::lock_guard, we could consider similarly wrapping the stop token. If a platform doesn't support it (e.g. because atomics are disabled for bare metal), it could simply fall back to a dummy implementation and ignore the cancellation request.

Wrapping crossed my mind too. I tried to avoid that because derive_key is public API; having the std:: type there makes the API easier to use, since the caller already knows how the argument behaves.

I also considered #ifdef's inside the prototype, wrapping the prototype with ifdef/else, and overloading with ifdef. All of them have some arguments in favor and against (readability, duplication of code/comments).

I ended up assuming ARMv6 is ok🤞

Collaborator

reneme commented Oct 29, 2025

> I ended up assuming ARMv6 is ok🤞

I'm guessing it is. Existing users still have the escape hatch of disabling the modules that use password-based key derivation if they don't need that for their use case. @randombit gets the final say on that, though. :)

@huven huven force-pushed the pwhash-stoppable branch 6 times, most recently from c4290e2 to 8781a45 Compare November 1, 2025 17:44
@huven huven marked this pull request as ready for review November 1, 2025 18:09
Contributor Author

huven commented Nov 1, 2025

@randombit @reneme I think this PR is ready for review, looking forward to your comments if any further improvements are necessary. The opening comment contains a Note section with some thoughts and important changes to review.

@randombit randombit added this to the Botan 3.11 milestone Nov 4, 2025
@huven huven force-pushed the pwhash-stoppable branch from 43ae704 to bb77052 Compare January 4, 2026 16:39
@huven huven force-pushed the pwhash-stoppable branch from bb77052 to 627aed3 Compare January 4, 2026 20:25
Development

Successfully merging this pull request may close these issues.

Cooperative Cancellation in PasswordHash
