jjoyce0510
Collaborator

@jjoyce0510 jjoyce0510 commented Sep 8, 2025

In this PR, we add support for an OIDC OAuth authenticator that allows customers to use service accounts to access DataHub.

Specifically:

  • New data models for enabling dynamic configuration of OAuth Providers
  • Static configuration for an OAuth Provider
  • Periodic refresh of dynamic configs (to eventually support UI management of OAuth Providers)
  • application.yaml settings to easily enable static OAuth providers
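
The split between static and periodically refreshed dynamic providers listed above could be sketched roughly as follows. This is an illustration under stated assumptions only: the class `OAuthProviderRegistry`, its method names, and the issuer-to-JWKS-URI map are hypothetical and are not taken from this PR's code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch, not the PR's implementation.
// Providers are modelled as a simple issuer -> JWKS URI map.
final class OAuthProviderRegistry {
  private final Map<String, String> staticProviders;
  private final Supplier<Map<String, String>> dynamicSource;
  private volatile Map<String, String> current;

  OAuthProviderRegistry(
      Map<String, String> staticProviders, Supplier<Map<String, String>> dynamicSource) {
    this.staticProviders = Map.copyOf(staticProviders);
    this.dynamicSource = dynamicSource;
    this.current = this.staticProviders;
  }

  // Re-reads the dynamic providers and publishes a new immutable snapshot.
  void refresh() {
    try {
      Map<String, String> merged = new HashMap<>(dynamicSource.get());
      merged.putAll(staticProviders); // static entries win on conflict
      current = Map.copyOf(merged);
    } catch (RuntimeException e) {
      // Keep the last known-good snapshot if the dynamic source fails.
    }
  }

  Map<String, String> providers() {
    return current;
  }

  // Schedules refresh() at a fixed rate; the caller shuts the scheduler down.
  ScheduledExecutorService startRefreshing(long periodSeconds) {
    ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    scheduler.scheduleAtFixedRate(this::refresh, 0, periodSeconds, TimeUnit.SECONDS);
    return scheduler;
  }
}
```

Publishing a fresh immutable map on each refresh means request threads never observe a half-updated provider list.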

Validated this against the Okta Service Account APIs. In the next PR, I'll provide more details about how to configure Okta service accounts for use with this.

Testing

Confirmed by setting up a test Okta account with a service account inside:

john@Mac-2960 datahub-1 % curl -v -i \
  -X GET "http://localhost:8080/entitiesV2/urn:li:corpuser:john%40acryl.io" \
  -H "Authorization: Bearer eyJraWQiOiJKQjM4NkI3azZUcDJLb0JNQkc3TXRnZGhlN2NCb2gybjJ1ckp3ZnBkRjJ3IiwiYWxnIjoiUlMyNTYifQ.eyJ2ZXIiOjEsImp0aSI6IkFULmdRSzdYalpsdDVCU1k1MGZUYW5NWUV2N3BRM051Yndkd3dzYjk4aElUNkEiLCJpc3MiOiJodHRwczovL2ludGVncmF0b3ItNDk5NDk0NS5va3RhLmNvbS9vYXV0aDIvYXVzdjY5cTBnNnpHdUtZRXo2OTciLCJhdWQiOiJodHRwOi8vbG9jYWxob3N0OjkwMDIiLCJpYXQiOjE3NTc1NDMyNzAsImV4cCI6MTc1NzU0Njg3MCwiY2lkIjoiMG9hdjY5cGRwZG1MWmQ4cHA2OTciLCJzY3AiOlsiZGF0YWh1YiJdLCJzdWIiOiIwb2F2NjlwZHBkbUxaZDhwcDY5NyJ9.NggGScXymLeT-KuTpfFme2yqQBtY56R6KmuKEljAw6DHauX8Li2epKAl6klv7zTNq5sVaV9F_4ROklUai4IBdeUr2iQ8sMROCAMji0qqtN1CldU79nhNDOfeE6XyuaWoJyzCE_D5rKpD6plZzHJaObrxhUMkCdKf2StJnDAngqu91A3V-JaRJSPRuY4cIjflnGpd8r44ItdOkVyL5tZsX74VsmV7iei2nfDb4SzvhZ-_jSxHspVk3U0iqrsTBxrMhSu1eq_dQqXQA-SPh7se5gum3zKa88MQuztDghITAZc1PpU6aoFZJif6FJqdJE1pj4AL36mrvO05JTQ6Y1HYmA" \
  -H "Accept: application/json"
Note: Unnecessary use of -X or --request, GET is already inferred.
* Host localhost:8080 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
*   Trying [::1]:8080...
* Connected to localhost (::1) port 8080
> GET /entitiesV2/urn:li:corpuser:john%40acryl.io HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/8.7.1
> Authorization: Bearer eyJraWQiOiJKQjM4NkI3azZUcDJLb0JNQkc3TXRnZGhlN2NCb2gybjJ1ckp3ZnBkRjJ3IiwiYWxnIjoiUlMyNTYifQ.eyJ2ZXIiOjEsImp0aSI6IkFULmdRSzdYalpsdDVCU1k1MGZUYW5NWUV2N3BRM051Yndkd3dzYjk4aElUNkEiLCJpc3MiOiJodHRwczovL2ludGVncmF0b3ItNDk5NDk0NS5va3RhLmNvbS9vYXV0aDIvYXVzdjY5cTBnNnpHdUtZRXo2OTciLCJhdWQiOiJodHRwOi8vbG9jYWxob3N0OjkwMDIiLCJpYXQiOjE3NTc1NDMyNzAsImV4cCI6MTc1NzU0Njg3MCwiY2lkIjoiMG9hdjY5cGRwZG1MWmQ4cHA2OTciLCJzY3AiOlsiZGF0YWh1YiJdLCJzdWIiOiIwb2F2NjlwZHBkbUxaZDhwcDY5NyJ9.NggGScXymLeT-KuTpfFme2yqQBtY56R6KmuKEljAw6DHauX8Li2epKAl6klv7zTNq5sVaV9F_4ROklUai4IBdeUr2iQ8sMROCAMji0qqtN1CldU79nhNDOfeE6XyuaWoJyzCE_D5rKpD6plZzHJaObrxhUMkCdKf2StJnDAngqu91A3V-JaRJSPRuY4cIjflnGpd8r44ItdOkVyL5tZsX74VsmV7iei2nfDb4SzvhZ-_jSxHspVk3U0iqrsTBxrMhSu1eq_dQqXQA-SPh7se5gum3zKa88MQuztDghITAZc1PpU6aoFZJif6FJqdJE1pj4AL36mrvO05JTQ6Y1HYmA
> Accept: application/json
> 
* Request completely sent off
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
< Server: Jetty(12.0.21)
Server: Jetty(12.0.21)
< Date: Wed, 10 Sep 2025 22:42:14 GMT
Date: Wed, 10 Sep 2025 22:42:14 GMT
< Content-Type: application/json
Content-Type: application/json
< X-RestLi-Protocol-Version: 1.0.0
X-RestLi-Protocol-Version: 1.0.0
< Content-Length: 256
Content-Length: 256
< 

* Connection #0 to host localhost left intact
{"urn":"urn:li:corpuser:john@acryl.io","aspects":{"corpUserKey":{"created":{"actor":"urn:li:corpuser:__datahub_system","time":1757544137096},"name":"corpUserKey","type":"VERSIONED","version":0,"value":{"username":"john@acryl.io"}}},"entityName":"corpuser"}%
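
When reproducing a test like the one above, it can help to look at the claims carried by the bearer token (issuer, audience, scopes) before sending it. The following is a minimal sketch, assuming a standard three-segment JWT; the class name `JwtPayloadPeek` is hypothetical and not part of this PR, and the code deliberately does not verify the signature.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical debugging helper, not part of this PR.
// It only decodes the payload; it does NOT validate the token.
final class JwtPayloadPeek {

  // Returns the JSON payload, i.e. the second dot-separated segment of the JWT.
  static String decodePayload(String jwt) {
    String[] parts = jwt.split("\\.");
    if (parts.length != 3) {
      throw new IllegalArgumentException("Expected header.payload.signature");
    }
    // JWT segments are base64url-encoded, usually without padding.
    byte[] json = Base64.getUrlDecoder().decode(parts[1]);
    return new String(json, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    if (args.length != 1) {
      System.err.println("usage: JwtPayloadPeek <jwt>");
      return;
    }
    System.out.println(decodePayload(args[0]));
  }
}
```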

@github-actions github-actions bot added the docs, product, and devops labels Sep 8, 2025
}

private PublicKey loadPublicKey(String jwksUri, String keyId) throws Exception {
HttpRequest request = HttpRequest.newBuilder().uri(URI.create(jwksUri)).build();

@aikido-pr-checks aikido-pr-checks bot Sep 8, 2025

HTTP request might enable SSRF attack - high severity
If an attacker can control the URL input leading into this HTTP request, the attacker might be able to perform an SSRF attack. This kind of attack is even more dangerous if the application returns the result of the URL fetch to the user. It can serve as an initial access point for an attacker to steal cloud credentials.

Remediation: If possible, only allow requests to verified domains. If not, consult the article linked above to learn about other mitigating techniques such as disabling redirects, blocking private IPs and making sure private services have internal authentication. If you return data coming from the request to the user, validate the data before returning it to make sure you don't return random data.
View details in Aikido Security

Collaborator

Since this is a deployment-time setting and is not pulled from a user request, users cannot supply this URL. The recommendation that production deployments use a static provider addresses this risk.
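
For deployments that still want a defensive check on top of the reasoning above, the "only allow requests to verified domains" remediation could look roughly like this. It is a sketch under assumptions: `JwksUriGuard` and `requireAllowed` are hypothetical names, not code from this PR, and the allow-list is assumed to come from the same deployment-time configuration as the provider itself.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Set;

// Hypothetical guard, not part of this PR: rejects JWKS URIs whose host
// is not on a deployment-time allow-list.
final class JwksUriGuard {
  private final Set<String> allowedHosts; // expected to be lower-case host names

  JwksUriGuard(Set<String> allowedHosts) {
    this.allowedHosts = Set.copyOf(allowedHosts);
  }

  // Returns the parsed URI if it uses https and an allowed host; throws otherwise.
  URI requireAllowed(String jwksUri) {
    URI uri = URI.create(jwksUri); // throws IllegalArgumentException on bad syntax
    String host = uri.getHost();
    boolean https = "https".equalsIgnoreCase(uri.getScheme());
    if (!https || host == null || !allowedHosts.contains(host.toLowerCase(Locale.ROOT))) {
      throw new IllegalArgumentException("JWKS URI not allowed: " + jwksUri);
    }
    return uri;
  }
}
```

A caller would pass the returned `URI` to `HttpRequest.newBuilder().uri(...)` instead of calling `URI.create(jwksUri)` directly.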

@datahub-cyborg datahub-cyborg bot added the needs-review label Sep 8, 2025

codecov bot commented Sep 8, 2025

Bundle Report

Changes will decrease total bundle size by 2.47kB (-0.01%) ⬇️. This is within the configured threshold ✅

Detailed changes
Bundle name | Size | Change
datahub-react-web-esm | 28.56MB | -2.47kB (-0.01%) ⬇️

Affected Assets, Files, and Routes:

view changes for bundle: datahub-react-web-esm

Assets Changed:

Asset Name | Size Change | Total Size | Change (%)
assets/index-*.js | -2.47kB | 18.91MB | -0.01%


@chakru-r chakru-r self-requested a review September 9, 2025 12:08
Collaborator

@chakru-r chakru-r left a comment

LGTM with a few changes requested before merging.

@datahub-cyborg datahub-cyborg bot added pending-submitter-merge and removed needs-review labels Sep 9, 2025
@jjoyce0510
Collaborator Author

Addressing comments. Thank you!

}

private PublicKey loadPublicKey(String jwksUri, String keyId, String algorithm) throws Exception {
HttpRequest request = HttpRequest.newBuilder().uri(URI.create(jwksUri)).build();

@aikido-pr-checks aikido-pr-checks bot Sep 10, 2025

HTTP request might enable SSRF attack - high severity
If an attacker can control the URL input leading into this HTTP request, the attacker might be able to perform an SSRF attack. This kind of attack is even more dangerous if the application returns the result of the URL fetch to the user. It can serve as an initial access point for an attacker to steal cloud credentials.

Remediation: If possible, only allow requests to verified domains. If not, consult the article linked above to learn about other mitigating techniques such as disabling redirects, blocking private IPs and making sure private services have internal authentication. If you return data coming from the request to the user, validate the data before returning it to make sure you don't return random data.
View details in Aikido Security
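
Two of the mitigations named in the remediation text, disabling redirects and blocking private IPs, can be sketched with the JDK alone. This is an assumption-laden illustration, not the PR's code: `SsrfHardening`, `newClient`, and `isInternal` are hypothetical names, and the address check covers loopback, site-local, link-local, and wildcard addresses only (for example, it does not cover IPv6 unique local addresses).

```java
import java.net.InetAddress;
import java.net.http.HttpClient;
import java.time.Duration;

// Hypothetical helpers, not part of this PR.
final class SsrfHardening {

  // An HttpClient that never follows redirects, so an allowed host
  // cannot bounce the request to an internal address.
  static HttpClient newClient() {
    return HttpClient.newBuilder()
        .followRedirects(HttpClient.Redirect.NEVER)
        .connectTimeout(Duration.ofSeconds(5))
        .build();
  }

  // True for addresses that typically point at internal services,
  // e.g. 127.0.0.1, 10.x.x.x, 192.168.x.x, or 169.254.169.254.
  static boolean isInternal(InetAddress address) {
    return address.isLoopbackAddress()
        || address.isSiteLocalAddress()
        || address.isLinkLocalAddress()
        || address.isAnyLocalAddress();
  }
}
```

A caller would resolve the JWKS host with `InetAddress.getAllByName(host)` and refuse to connect if any resolved address is internal.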

# External OAuth Configuration
- EXTERNAL_OAUTH_ENABLED=true
- EXTERNAL_OAUTH_TRUSTED_ISSUERS=https://my-okta-domain.okta.com/oauth2/default
- EXTERNAL_OAUTH_ALLOWED_AUDIENCES=0oa1234567890abcdef

@aikido-pr-checks aikido-pr-checks bot Sep 10, 2025

Exposed secret in docs/authentication/external-oauth-providers.md - low severity
Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
View details in Aikido Security
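
To illustrate how settings like `EXTERNAL_OAUTH_TRUSTED_ISSUERS` and `EXTERNAL_OAUTH_ALLOWED_AUDIENCES` from the quoted docs snippet might be applied to a token's `iss` and `aud` claims, here is a minimal sketch. It assumes the values are comma-separated lists, which the snippet does not confirm; `ExternalOAuthSettings` and its methods are hypothetical and not taken from this PR.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch, not the PR's implementation.
final class ExternalOAuthSettings {
  final Set<String> trustedIssuers;
  final Set<String> allowedAudiences;

  ExternalOAuthSettings(String issuersCsv, String audiencesCsv) {
    this.trustedIssuers = parseCsv(issuersCsv);
    this.allowedAudiences = parseCsv(audiencesCsv);
  }

  // Assumption: values are comma-separated; blanks are ignored.
  static Set<String> parseCsv(String csv) {
    if (csv == null) {
      return Collections.emptySet();
    }
    return Arrays.stream(csv.split(","))
        .map(String::trim)
        .filter(s -> !s.isEmpty())
        .collect(Collectors.toSet());
  }

  // A token is accepted only if its issuer is trusted and at least one
  // of its audiences is allowed.
  boolean accepts(String issuer, List<String> audiences) {
    return trustedIssuers.contains(issuer)
        && audiences.stream().anyMatch(allowedAudiences::contains);
  }
}
```

In a real deployment the two constructor arguments would come from `System.getenv(...)`; signature and expiry checks would still be required in addition to this claim check.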
