Experimental API for OAuth 2.0 in WindowsAppSDK #3202

Open
wants to merge 10 commits into
base: main-old

dunhor
Member

@dunhor dunhor commented Dec 2, 2022

Before continuing, I would advise you to familiarize yourself with the following RFCs:

There are other RFCs that are relevant for OAuth 2.0, but these are probably the most important in terms of understanding the design here.

This prototype is far enough along that it has a fully functional implementation with tests, has no known bugs, and has preliminary functional samples. We've done a preliminary review of the overall design, but no formal API review has been done yet, so names, usage patterns, etc. are likely to change. We are opening this PR somewhat early as a request for feedback and suggestions.

The samples can be found here, although some work will be necessary to get them to run since the API is not currently present in WindowsAppSDK: https://github.com/ujjwalchadha/OAuthConsumptionSample

Why

OAuth seems to be a common ask for Windows App SDK applications. UWP has the WebAuthenticationBroker class, however that API is not usable by non-UWP applications. One option was to port that API and modify it to work for non-UWP applications, however the code owners opted not to pursue that option. Therefore, we are taking a separate approach. As a result, we've opted to follow OAuth best practices more closely - e.g. using the user's default browser - and the end design is dramatically different from WebAuthenticationBroker, more closely resembling other APIs such as AppAuth.

Design Overview

The API opts to give more "raw" control over the use of OAuth. E.g. it has built-in helpers for all grant types in the OAuth spec as well as enough flexibility to use custom/extension values. This includes things like splitting authorization code requests into two parts. The API also uses best practices by default - such as PKCE - but does not mandate them. The only requirement imposed is the use of a state value in authorization code requests (which RFC 6749 recommends but does not strictly require), and this value must be globally unique across all authorization requests in the system since the state value is used to coordinate IPC. This only affects applications that manually specify state values, as the API will generate one that's guaranteed to be unique if none is provided.
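As a rough illustration of what a manually supplied state value could look like, here is a minimal sketch in standard C++. `GenerateState` and `Base64UrlEncode` are hypothetical helpers, not part of the proposed API, and the PR does not describe how the API generates its own value. The sketch assumes `std::random_device` is backed by an OS entropy source; with 32 random bytes a collision is extremely unlikely, but unlike the API's own value it is not guaranteed unique.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>

// Encodes bytes as base64url without padding (RFC 4648, section 5).
inline std::string Base64UrlEncode(const std::uint8_t* data, std::size_t size)
{
    static constexpr char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";
    std::string out;
    out.reserve((size * 4 + 2) / 3);
    std::size_t i = 0;
    for (; i + 3 <= size; i += 3)
    {
        std::uint32_t n = (data[i] << 16) | (data[i + 1] << 8) | data[i + 2];
        out += alphabet[(n >> 18) & 63];
        out += alphabet[(n >> 12) & 63];
        out += alphabet[(n >> 6) & 63];
        out += alphabet[n & 63];
    }
    if (size - i == 1)
    {
        // One trailing byte produces two characters.
        std::uint32_t n = data[i] << 16;
        out += alphabet[(n >> 18) & 63];
        out += alphabet[(n >> 12) & 63];
    }
    else if (size - i == 2)
    {
        // Two trailing bytes produce three characters.
        std::uint32_t n = (data[i] << 16) | (data[i + 1] << 8);
        out += alphabet[(n >> 18) & 63];
        out += alphabet[(n >> 12) & 63];
        out += alphabet[(n >> 6) & 63];
    }
    return out;
}

// Hypothetical helper: a 43-character, URL-safe state string from 32 random bytes.
inline std::string GenerateState()
{
    std::random_device rd; // assumption: non-deterministic on the target platform
    std::array<std::uint8_t, 32> bytes{};
    for (auto& b : bytes)
    {
        b = static_cast<std::uint8_t>(rd() & 0xFF);
    }
    return Base64UrlEncode(bytes.data(), bytes.size());
}
```

The result only uses characters that are safe in a query string, so it can be passed as the `state` parameter without further escaping.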

With the API, there are effectively three operations:

  1. Initiate an authorization request (i.e. launch the browser)
  2. Complete an authorization request
  3. Initiate a token request

1 & 2 are effectively the same operation, depending on how you think about it. They are separate out of necessity because the application needs to handle the redirect, either through a localhost server or through protocol activation (more on this below). 2 will take care of any necessary IPC if the redirect comes in a different process (protocol activation). 3 is effectively just a POST request and provided mostly out of convenience (although I doubt its usefulness as most consumers should be using client credentials, which should never be present on a user's machine).
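To make operation 1 concrete: per RFC 6749, section 4.1.1, an authorization code request is ultimately a URL opened in the browser. The following is a minimal sketch of how such a URL is assembled; `PercentEncode`, `BuildQuery`, and `BuildAuthorizationUrl` are illustrative names and are not how the proposed API is implemented. It assumes the endpoint has no existing query string.

```cpp
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Percent-encodes everything except RFC 3986 unreserved characters.
inline std::string PercentEncode(std::string_view in)
{
    static constexpr char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : in)
    {
        bool unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') || c == '-' || c == '.' || c == '_' || c == '~';
        if (unreserved)
        {
            out += static_cast<char>(c);
        }
        else
        {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 15];
        }
    }
    return out;
}

using Params = std::vector<std::pair<std::string, std::string>>;

// Joins key/value pairs as key=value&key=value, encoding both sides.
inline std::string BuildQuery(const Params& params)
{
    std::string out;
    for (const auto& [key, value] : params)
    {
        if (!out.empty()) out += '&';
        out += PercentEncode(key);
        out += '=';
        out += PercentEncode(value);
    }
    return out;
}

// Hypothetical helper: builds the URL that would be handed to the browser.
inline std::string BuildAuthorizationUrl(std::string_view endpoint,
    std::string_view clientId, std::string_view redirectUri,
    std::string_view scope, std::string_view state)
{
    Params params{
        {"response_type", "code"},
        {"client_id", std::string(clientId)},
        {"redirect_uri", std::string(redirectUri)},
        {"scope", std::string(scope)},
        {"state", std::string(state)},
    };
    return std::string(endpoint) + "?" + BuildQuery(params);
}
```

With PKCE enabled, `code_challenge` and `code_challenge_method` parameters (RFC 7636) would be appended in the same way.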

The request/response types all have properties that map 1:1 (with the exception of PKCE) with their counterparts in the OAuth standard. A map of key/value pairs is provided for specifying arguments outside the OAuth specification. The type of each argument is selected to avoid errors as much as possible. E.g. expires_in could arguably be represented as a DateTime, however for now is a String in an implicit grant type response since it is extracted from the response URI, and a Double in the token response since it comes out of the parsed JSON.
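Because expires_in arrives as a String in one place and a Double in another, callers may want to normalize both into a single duration type. A minimal sketch, assuming a hypothetical `ParseExpiresIn` helper that is not part of the API and that treats missing, malformed, or non-positive values as "not present":

```cpp
#include <charconv>
#include <chrono>
#include <cmath>
#include <optional>
#include <string_view>
#include <system_error>

// String form, e.g. as extracted from the redirect URI for the implicit grant type.
inline std::optional<std::chrono::seconds> ParseExpiresIn(std::string_view text)
{
    long long value = 0;
    const char* end = text.data() + text.size();
    auto [ptr, ec] = std::from_chars(text.data(), end, value);
    // Reject empty input, trailing garbage, overflow, and non-positive values.
    if (ec != std::errc{} || ptr != end || value <= 0)
    {
        return std::nullopt;
    }
    return std::chrono::seconds(value);
}

// Double form, e.g. as it comes out of a parsed JSON token response.
inline std::optional<std::chrono::seconds> ParseExpiresIn(double value)
{
    // The upper bound is an arbitrary sanity limit that keeps the cast well-defined.
    if (!std::isfinite(value) || value < 1.0 || value > 4.0e9)
    {
        return std::nullopt;
    }
    return std::chrono::seconds(static_cast<long long>(value));
}
```

Returning `std::optional` keeps "absent" distinct from any real duration, so the caller decides on a fallback such as the one-hour default used in the token exchange example.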

Redirection

To start out, we are not taking a stance on which redirection "scheme" the application should use: either localhost redirect or protocol activation. Both have their pros and cons:

Localhost Redirect
  Pros:
  • Redirect is handled in-proc
  • Allows custom UI to be served to the browser, such as "authorization complete, you can return to the application"
  • Custom JavaScript code can be used to recover the URI fragment for the implicit grant type
  Cons:
  • Does not transfer foreground rights to the application (although nothing is stopping you from writing custom JS to do an activation)
  • Complex to set up a web server
  • Redirect URI cannot be specific to your application
Protocol Activation
  Pros:
  • Allows foreground rights to be transferred to the application
  • Requires relatively few lines of code to set up
  Cons:
  • Activation is in a new process
  • Cannot serve custom HTML to the browser
  • Only usable for the authorization code grant type

We had discussed embedding a web server into the API, but opted - at least for now - against it for a few reasons:

  • Applications likely want to serve custom HTML to the browser (e.g. "branding")
  • We would likely get more value in developing a richer, general-purpose API for writing a web server in WinRT
  • Complexity of design and implementation would likely dramatically delay delivering a "v1" API

Security Considerations

RFC 6819 describes the general OAuth 2.0 threat model, and RFC 8252 has additional guidance that's specific to native applications. Therefore, I won't exhaustively cover everything here; only what's specific to this API. The most obvious attack surface that is unique to this API is the fact that we may need to perform IPC to communicate the authorization code back to the requesting process. A combination of client secrets, PKCE, and encryption is used/recommended to avoid potential issues.

NOTE: Regardless of any remediation we put in place, it will always be possible for another process with the same access level to read the application process' memory, so this should all be thought of more as defense in depth.

Scenario 1: "Intercepting" Communication

  • Legit app process A initiates an auth request
  • Attacker process B sees this (e.g. by listening to named port creation) and terminates A
  • Legit app process C gets protocol activated and sends the auth code to B

Remediations:

  • The auth code is encrypted using the state, which B does not know
  • PKCE is used, and B does not know the original code verifier
  • B does not have access to the client secret

Scenario 2: Sending Bad Data

  • Legit app process A initiates an auth request
  • Attacker process B sees this and sends its own authorization code to A

Remediations:

  • B is unable to encrypt the data it's sending because it does not know the original state
  • PKCE is used, so the code verifier does not match the one used by B to obtain the false code
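
A simpler, complementary check (this is not the encryption described above, and not necessarily what the implementation does) is for the requesting process to accept a response only for a state value it actually issued, and only once. A minimal sketch with hypothetical `PendingRequest` and `PendingRequests` types; it is not thread-safe as written:

```cpp
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical per-request bookkeeping kept only in the requesting process.
struct PendingRequest
{
    std::string codeVerifier; // PKCE code verifier; never sent with the auth request
};

class PendingRequests
{
public:
    // Registers a request. Returns false if the state value is already in use,
    // which enforces uniqueness within this process.
    bool Add(const std::string& state, PendingRequest request)
    {
        return m_pending.emplace(state, std::move(request)).second;
    }

    // Removes and returns the request matching this state. Returns nullopt for
    // an unknown or already-completed state, so unsolicited responses are dropped.
    std::optional<PendingRequest> Complete(const std::string& state)
    {
        auto it = m_pending.find(state);
        if (it == m_pending.end())
        {
            return std::nullopt;
        }
        PendingRequest request = std::move(it->second);
        m_pending.erase(it);
        return request;
    }

private:
    std::unordered_map<std::string, PendingRequest> m_pending;
};
```

Because the entry is erased on completion, a replayed response carrying the same state is rejected the second time.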

Scenario 3: Bad Initiator

  • Attacker process A initiates an auth request
  • Legit process B gets protocol activated and sends the authorization code to A

Remediations:

  • User validates client info on the authorization server's website and sees that the request is unexpected and does not complete the flow
  • B does not have access to the client secret

Note that this problem is unavoidable no matter what. The alternative might be to use localhost for the redirect URI so that the response comes in-proc, however localhost URIs are not specific to any one application, so an attacker could control the entire flow.

Example Code

Example code for a few common tasks:

Performing an Authorization Code Request

auto requestParams = AuthRequestParams::CreateForAuthorizationCodeRequest(L"my_client_id",
    Uri(L"my-app:/oauth-callback/"));
requestParams.Scope(L"user:email user:birthday");

auto requestResult = co_await AuthManager::InitiateAuthRequest(
    Uri(L"https://my.server.com/oauth/authorize"), requestParams);
if (auto response = requestResult.Response())
{
    DoTokenExchange(response);
}
else
{
    auto failure = requestResult.Failure();
    NotifyFailure(failure.Error(), failure.ErrorDescription());
}

Exchanging an Authorization Code for an Access Token

AuthResponse authResponse = authResult.Response();
auto tokenParams = TokenRequestParams::CreateForAuthorizationCodeRequest(authResponse);
auto clientAuth = ClientAuthentication::CreateForBasicAuthorization(L"my_client_id",
    L"my_client_secret");

auto tokenResult = co_await AuthManager::RequestTokenAsync(
    Uri(L"https://my.server.com/oauth/token"), tokenParams, clientAuth);
if (auto response = tokenResult.Response())
{
    auto authToken = response.Token();
    auto tokenType = response.TokenType();

    // RefreshToken string null/empty when not present
    if (auto refreshToken = response.RefreshToken(); !refreshToken.empty())
    {
        // ExpiresIn is zero when not present
        DateTime expires = winrt::clock::now();
        if (auto expiresIn = response.ExpiresIn(); expiresIn != 0)
        {
            expires += std::chrono::seconds(static_cast<int64_t>(expiresIn));
        }
        else
        {
            // Assume a duration of one hour
            expires += std::chrono::hours(1);
        }

        myAppState.ScheduleRefreshAt(expires, refreshToken);
    }

    DoRequestWithToken(authToken, tokenType);
}
else
{
    auto failure = tokenResult.Failure();
    NotifyFailure(failure.Error(), failure.ErrorDescription());
}
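
For reference, the token exchange is "effectively just a POST request": per RFC 6749, section 4.1.3, the body is form-encoded, and Basic client authentication is a base64-encoded `client_id:client_secret` header. The sketch below shows what such a request could contain. `BuildBasicAuthHeader` and `BuildAuthCodeTokenBody` are hypothetical helpers, not the API's implementation, and for brevity the sketch skips the form-encoding of the client ID and secret that RFC 6749, section 2.3.1 asks for before base64 encoding.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Percent-encodes everything except RFC 3986 unreserved characters.
inline std::string FormEncode(std::string_view in)
{
    static constexpr char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : in)
    {
        bool unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') || c == '-' || c == '.' || c == '_' || c == '~';
        if (unreserved) { out += static_cast<char>(c); }
        else { out += '%'; out += hex[c >> 4]; out += hex[c & 15]; }
    }
    return out;
}

// Standard base64 with '=' padding (RFC 4648, section 4).
inline std::string Base64Encode(std::string_view in)
{
    static constexpr char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    for (std::size_t i = 0; i < in.size(); i += 3)
    {
        std::size_t remaining = in.size() - i;
        std::uint32_t n = static_cast<unsigned char>(in[i]) << 16;
        if (remaining > 1) n |= static_cast<unsigned char>(in[i + 1]) << 8;
        if (remaining > 2) n |= static_cast<unsigned char>(in[i + 2]);
        out += alphabet[(n >> 18) & 63];
        out += alphabet[(n >> 12) & 63];
        out += remaining > 1 ? alphabet[(n >> 6) & 63] : '=';
        out += remaining > 2 ? alphabet[n & 63] : '=';
    }
    return out;
}

// Value for the Authorization header when using HTTP Basic client authentication.
inline std::string BuildBasicAuthHeader(std::string_view clientId, std::string_view clientSecret)
{
    std::string credentials(clientId);
    credentials += ':';
    credentials += clientSecret;
    return "Basic " + Base64Encode(credentials);
}

// Body for Content-Type: application/x-www-form-urlencoded.
// code_verifier (RFC 7636) is only included when PKCE was used.
inline std::string BuildAuthCodeTokenBody(std::string_view code,
    std::string_view redirectUri, std::string_view codeVerifier)
{
    std::string body = "grant_type=authorization_code&code=" + FormEncode(code) +
        "&redirect_uri=" + FormEncode(redirectUri);
    if (!codeVerifier.empty())
    {
        body += "&code_verifier=" + FormEncode(codeVerifier);
    }
    return body;
}
```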

Refreshing an Access Token

auto tokenParams = TokenRequestParams::CreateForRefreshToken(myRefreshToken);
auto clientAuth = ClientAuthentication::CreateForBasicAuthorization(L"my_client_id",
    L"my_client_secret");
auto tokenResult = co_await AuthManager::RequestTokenAsync(
    Uri(L"https://my.server.com/oauth/token"), tokenParams, clientAuth);
if (auto response = tokenResult.Response())
{
    UpdateToken(response.Token(), response.TokenType(), response.ExpiresIn());
}
else
{
    auto failure = tokenResult.Failure();
    NotifyFailure(failure.Error(), failure.ErrorDescription());
}

Completing an Authorization Request from a Protocol Activation

void App::OnActivated(const IActivatedEventArgs& args)
{
    if (args.Kind() == ActivationKind::Protocol)
    {
        auto protocolArgs = args.as<ProtocolActivatedEventArgs>();
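        if (AuthManager::CompleteAuthRequest(protocolArgs.Uri()))
        {
            // The response was handed off to the process that initiated the
            // request, so this process has nothing left to do.
            TerminateCurrentProcess();
        }

        // Not an OAuth redirect, or nobody was waiting for this response.
        DisplayUnhandledMessageToUser();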
    }
}
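
For context, the URI received on protocol activation carries the authorization response in its query string (`code` and `state` on success, `error` and optionally `error_description` on failure, per RFC 6749, section 4.1.2). What `CompleteAuthRequest` does internally is not shown in this PR; the following is only a minimal sketch of extracting those parameters, with `PercentDecode` and `ParseRedirectQuery` as hypothetical helper names.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <string_view>
#include <utility>

inline int HexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Decodes %XX escapes and treats '+' as a space (form encoding).
// Malformed escapes are copied through unchanged.
inline std::string PercentDecode(std::string_view in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        if (in[i] == '%' && i + 2 < in.size())
        {
            int hi = HexValue(in[i + 1]);
            int lo = HexValue(in[i + 2]);
            if (hi >= 0 && lo >= 0)
            {
                out += static_cast<char>(hi * 16 + lo);
                i += 2;
                continue;
            }
        }
        out += (in[i] == '+') ? ' ' : in[i];
    }
    return out;
}

// Returns the decoded query parameters of a redirect URI; empty if there is no query.
inline std::map<std::string, std::string> ParseRedirectQuery(std::string_view uri)
{
    std::map<std::string, std::string> result;
    auto start = uri.find('?');
    if (start == std::string_view::npos) return result;

    std::string_view query = uri.substr(start + 1);
    query = query.substr(0, query.find('#')); // drop any fragment

    while (!query.empty())
    {
        auto amp = query.find('&');
        std::string_view item = query.substr(0, amp);
        query = (amp == std::string_view::npos) ? std::string_view{} : query.substr(amp + 1);
        if (item.empty()) continue;

        auto eq = item.find('=');
        std::string key = PercentDecode(item.substr(0, eq));
        std::string value = (eq == std::string_view::npos)
            ? std::string{} : PercentDecode(item.substr(eq + 1));
        result[std::move(key)] = std::move(value);
    }
    return result;
}
```

The `state` entry is what lets a response be matched to the process that initiated the request.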

@ghost ghost added the needs-triage 🔍 label Dec 2, 2022
@@ -246,17 +246,6 @@ Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "Framework.Widgets", "test\D
EndProject
Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "DynamicDependencyLifetimeManagerShadow", "dev\DynamicDependency\DynamicDependencyLifetimeManagerShadow\DynamicDependencyLifetimeManagerShadow.vcxproj", "{6539E9E1-BF36-40E5-86BC-070E99DB7B7B}"
EndProject
Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "ToastNotificationTests", "test\ToastNotificationTests\ToastNotificationTests.vcxproj", "{E977B1BD-00DC-4085-A105-E0A18E0183D7}"
Member Author

This project seems to be mostly deleted and was causing issues with VS trying to re-add it back in when building

@dotMorten
Contributor

dotMorten commented Dec 3, 2022

Is there an API specification?

@dotMorten
Contributor

dotMorten commented Dec 3, 2022

My main issue with this approach is that it seems to require changes to Program.cs:
https://github.com/ujjwalchadha/OAuthConsumptionSample/blob/63365039cdbe885ef569261233e7e2a7a9951b96/OAuthConsumptionSample/Program.cs#L55-L65

This makes it hard for code to handle authentication in one place, and makes the life of libraries even harder, and you also need to provide this yourself instead of just relying on the auto-generated one.

I'd encourage you to look at what .NET MAUI provides, with just a single-line of code to perform OAuth: https://learn.microsoft.com/en-us/dotnet/maui/platform-integration/communication/authentication?view=net-maui-7.0&tabs=windows#using-webauthenticator

@dotMorten
Contributor

dotMorten commented Dec 3, 2022

This will fail if another process is already using port 5000: https://github.com/ujjwalchadha/OAuthConsumptionSample/blob/014a70ef4dd02d623e95bb01c91ee6689b406d0b/OAuthConsumptionSample/GoogleOAuthPage.xaml.cs#L169
It also looks like it never shuts it down, so that a second login attempt would try and create a second listener on the (now occupied) port.

@dunhor
Member Author

dunhor commented Dec 5, 2022

Is there an API specification?

Nothing here (yet). The sample was intended to be somewhat of a "how to," though the code paths there might not be representative enough to capture enough usage scenarios. I've copied some sample code that hopefully gives a better sense of how the API is used (though it still lacks examples for other grant types such as client credentials, but that's probably for the best).

@dunhor
Member Author

dunhor commented Dec 5, 2022

My main issue with this approach is that it seems to require changes to Program.cs: https://github.com/ujjwalchadha/OAuthConsumptionSample/blob/63365039cdbe885ef569261233e7e2a7a9951b96/OAuthConsumptionSample/Program.cs#L55-L65

This makes it hard for code to handle authentication in one place, and makes the life of libraries even harder, and you also need to provide this yourself instead of just relying on the auto-generated one.

Three-ish issues here:

  1. The API makes no assumption about the callback URL. I.e. the response may not come through the activation code path (the PR description covers this)
  2. There's no "central" activation code path we can realistically "hijack." Higher level libraries might be able to do some handling here (e.g. perhaps WinUI), but the core of the API wouldn't be able to rely on that
  3. Even if (2) was done, a well-authored application can't avoid modifying the activation code path. E.g. what if the process that initiated the auth flow got terminated? What if that process timed out and cancelled the operation?

There was a suggestion to make this slightly simpler: change/add an overload that accepts the event args object, though this would only save on the need to QI.

I'd encourage you to look at what .NET MAUI provides, with just a single-line of code to perform OAuth: https://learn.microsoft.com/en-us/dotnet/maui/platform-integration/communication/authentication?view=net-maui-7.0&tabs=windows#using-webauthenticator

The problem there, and the problem that heavily shaped the API here (and likely the reason that API is documented as non-functional on Windows) is that Windows and WindowsAppSDK lack key features that allow such a design to work. To list just a few things:

  1. Windows lacks a feature similar to in-app browser tabs. This has a number of repercussions: we don't know the lifetime of the request, the redirect does not include the fragment component, the browser tab "stays around" after the request is complete, foreground rights won't transfer without an activation, etc.
  2. Protocol activation occurs in a new process, meaning we need to handle the IPC to communicate the response back to the originating process, and the application needs to decide how to handle failure in the event that nobody is "listening" for the response

The "best" we could probably do is to have a local server running that handles a number of these, but it's not without tradeoffs. The PR description covers a good chunk of this.

@ayamadori

ayamadori commented Dec 10, 2022

One option was to port that API and modify it to work for non-UWP applications, however the code owners opted not to pursue that option.

In MicrosoftEdge/WebView2Feedback#1647, the following was commented:

Our short-term recommendation for a workaround is to launch the system browser and handle the auth flow there.
Longer term, our suggestion will be to use the Web Authentication Broker (WAB) API. The WAB API is a Windows API, vetted by Google, that will enable auth flows in your native applications. This API is currently UWP-only but has plans to be available in win32 and .NET as part of the WindowsAppSDK.

Is this API for the short term? Or was the longer-term plan changed?

@dunhor
Member Author

dunhor commented Dec 13, 2022

One option was to port that API and modify it to work for non-UWP applications, however the code owners opted not to pursue that option.

In MicrosoftEdge/WebView2Feedback#1647, the following was commented:

Our short-term recommendation for a workaround is to launch the system browser and handle the auth flow there.
Longer term, our suggestion will be to use the Web Authentication Broker (WAB) API. The WAB API is a Windows API, vetted by Google, that will enable auth flows in your native applications. This API is currently UWP-only but has plans to be available in win32 and .NET as part of the WindowsAppSDK.

Is this API for the short term? Or was the longer-term plan changed?

While I lack the context on that issue/conversation, it does link back to #441 in the following sentence, and that plan has been changed, as noted in the description that you quote. As far as WindowsAppSDK is concerned (or at least the current POR), there are no plans to port/open-source WAB, so it will remain UWP only.

@roxk
Contributor

roxk commented Jan 5, 2023

I'd encourage you to look at what .NET MAUI provides, with just a single-line of code to perform OAuth: https://learn.microsoft.com/en-us/dotnet/maui/platform-integration/communication/authentication?view=net-maui-7.0&tabs=windows#using-webauthenticator

That is only a single line in code. On all platforms other than Windows, additional setup is required, e.g. adding init code (Android) or interceptor code in the corresponding callback (iOS). Each platform also (rightfully) needs to define protocol schemes, or even additional activities (Android).

At least this is the case for the web authenticator of Xamarin, MAUI's predecessor. Maybe MAUI has some other tricks up its sleeve to reduce the setup steps required, but I think the API introduced in this PR is on the right track for a general-purpose API that can be consumed by e.g. WPF, MFC, or even console apps (think Git Credential Manager).

I'd also like WinUI 3 apps to have a nicer API, though. Maybe additional integration with WinUI 3 that builds on top of the API here can be provided as another package:

WASDK.Oauth (Base API introduced in this PR)
^
| use
|
WASDK.Oauth.WinUI3 (WinUI 3 integration)

@bpulliam bpulliam added the api-design and feature proposal labels Jan 25, 2023
@mominshaikhdevs

@dunhor what's the latest update on this?

@ayamadori

Spec dropped #4772


6 participants