-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
More sophisticated internal APIs for credentials #262
Comments
We use Azure OpenAI with access tokens (instead of API keys) and have something along the lines of this when we want to use ellmer. library(ellmer)
AOAI_access_token <- R6::R6Class(
"AOAI_access_token",
public = list(
access_token = NULL,
expires_at = NULL,
initialize = function(token = NULL) {
if (is.null(token)) {
self$refresh()
} else {
self$access_token <- token$access_token
self$expires_at <- token$expires_at
}
},
get_token = function() {
if (self$expires_at < Sys.time()) self$refresh()
return(self$access_token)
},
refresh = function() {
response <- httr::POST(
url = paste0("https://login.microsoftonline.com/", Sys.getenv("AZURE_TENANT_ID"), "/oauth2/token"),
body = list(
"grant_type" = "client_credentials",
"client_id" = Sys.getenv("AZURE_CLIENT_ID"),
"client_secret" = Sys.getenv("AZURE_CLIENT_SECRET"),
"resource" = "https://cognitiveservices.azure.com/"
)
)
content <- httr::content(response)
if (response$status_code != 200) stop(content$error_description)
self$access_token <- content$access_token
self$expires_at <- Sys.time() + as.numeric(content$expires_in) - 10 # add 10 seconds buffer just to be sure
},
print = function() {
cat("<Azure OpenAI Access Token>:\n")
cat(" Expires at:", format(as.POSIXct(self$expires_at), "%F %T UTC"), "\n")
}
)
)
access_token <- AOAI_access_token$new()
access_token
#> <Azure OpenAI Access Token>:
#> Expires at: 2025-01-22 09:51:33 UTC
chat <- chat_azure(
endpoint = Sys.getenv("AZURE_OPENAI_ENDPOINT"),
deployment_id = "gpt-4o",
api_key = "UNNEEDED", # Note this must be provided but is not used
token = access_token$get_token(),
api_version = Sys.getenv("AZURE_OPENAI_API_VERSION")
)
a <- chat$extract_data(
"My name is Susan and I'm 13 years old",
type = type_object(
age = type_number(),
name = type_string()
)
) Note that because we do not use an API Key directly but fetch an access token from Azure KeyVault (?), the It would be really convenient if the internals of |
@atheriel I think it's fine to use S7 in httr2, since the goal is to eventually convert to it. |
Inspired by a recent comment [0], this commit adds support for picking up on Azure service principals, which seem to be a common way to furnish `chat_azure()` with credentials without the need for an API key. This approach also handles caching and expiry for these tokens, an ongoing source of problems for users. Unit tests are included. I've also updated the documentation. [0]: #262 (comment) Signed-off-by: Aaron Jacobs <[email protected]>
@atheriel - I also tested the automatic detection portion, and it works for our setup as well! My only comment is whether you could change the environment variable names to be a bit more descriptive? We have multiple Azure Client IDs, so maybe this could use something like
That would make it consistent with other environment variables that all start with AZURE_OPENAI, like AZURE_OPENAI_API_KEY Thanks! Nice work! |
@SokolovAnatoliy I picked those environment variables because those are the ones that the official Azure SDKs detect and use, but it sounds like maybe you need an escape hatch here? Also, out of curiosity: why do you have multiple client IDs for the same app? Is it not possible to assign the right RBAC or API permissions to a single Azure service principal? |
Inspired by a recent comment [0], this commit adds support for picking up on Azure service principals, which seem to be a common way to furnish `chat_azure()` with credentials without the need for an API key. This approach also handles caching and expiry for these tokens, an ongoing source of problems for users. Unit tests are included. I've also updated the documentation. [0]: #262 (comment) Signed-off-by: Aaron Jacobs <[email protected]>
@atheriel - I meant to say we have multiple Client IDs under our Azure tenant. They are not for the same app. But it sounds like we should just create a custom credential function and use that, instead of relying on the |
This commit adopts the pattern used in `chat_azure()` to `chat_databricks()` and `chat_snowflake()`. That is: these providers now have a single `credentials` property that returns a list of headers when called. In addition, there is now consistent internal usage of a `default_x_credentials()` function that itself returns a `credentials` function. (Through experimenting with various designs, I decided that a simple function factory approach was preferable to the heavyweight S7 classes and generics approach I thought we could use initially.) Finally, I've added several new unit tests for Databricks and Snowflake authentication. Closes #262. Signed-off-by: Aaron Jacobs <[email protected]>
This commit adopts the pattern used in `chat_azure()` to `chat_databricks()` and `chat_snowflake()`. That is: these providers now have a single `credentials` property that returns a list of headers when called. In addition, there is now consistent internal usage of a `default_x_credentials()` function that itself returns a `credentials` function. (Through experimenting with various designs, I decided that a simple function factory approach was preferable to the heavyweight S7 classes and generics approach I thought we could use initially.) Finally, I've added several new unit tests for Databricks and Snowflake authentication. Closes #262. Signed-off-by: Aaron Jacobs <[email protected]>
@atheriel - could you please remove the requirement of just 1 of Thanks! |
This commit adopts the pattern used in `chat_azure()` to `chat_databricks()` and `chat_snowflake()`. That is: these providers now have a single `credentials` property that returns a list of headers when called. In addition, there is now consistent internal usage of a `default_x_credentials()` function that itself returns a `credentials` function. (Through experimenting with various designs, I decided that a simple function factory approach was preferable to the heavyweight S7 classes and generics approach I thought we could use initially.) Finally, I've added several new unit tests for Databricks and Snowflake authentication. Closes #262. Signed-off-by: Aaron Jacobs <[email protected]>
@atheriel actually, I realized we are seeing something else. Line 76 in check_exclusive(token, credentials, .require = FALSE)
Here is the example token <- NULL
creds <- function(x) "tony"
rlang::check_exclusive(token, creds, .require = FALSE) The result is: |
Edit: moved to #274. |
Right now the complex authentication code for Databricks, Snowflake, and Azure is a bit disorganised and inconsistent (which is my doing, of course). There are also issues with handling Azure and AWS credentials that expire correctly.
Some of this looks like internal mutable state management to me, so I'm thinking of creating an internal
CredentialSource
API using S7 classes and generics that allow us to have sophisticated cloud provider-specific logic around handling headers and expiry. Does that sound like a good idea? I'm also considering whether this could be of use beyondellmer
, too, though I'm hesitant to introduce an S7 generic to e.g.httr2
.The text was updated successfully, but these errors were encountered: