Sample notes for issuance of temp creds for S3 #2

Draft: 2 commits to merge into base: main

aaronkanzer
Contributor

@satra @kabilar @yarikoptic

See README.md and the scripts: a very brief proof of concept for async issuance of temp creds. It could be packaged into a CLI option that lets a user bypass minting of presigned URLs and access data as fast as possible.

@yarikoptic

Is my understanding correct?

  • such functionality for minting temp creds would reside in dandi-archive, which holds the main AWS IAM credentials
  • dandi-archive, through an API (which endpoint?), would produce a full bundle of 4 fields and share it with dandi-cli for an authorized session:
            return {
                "AccessKeyId": credentials['AccessKeyId'],
                "SecretAccessKey": credentials['SecretAccessKey'],
                "SessionToken": credentials['SessionToken'],
                "Expiration": credentials['Expiration']
            }
  • that AccessKeyId and SecretAccessKey would not be the same as the "main AWS IAM" ones and would be short-lived; the client would need to re-request them upon Expiration.

It would be nice to complement the example with some use of those credentials for download.

@aaronkanzer
Contributor Author

aaronkanzer commented Dec 3, 2024

Yep, your understanding is correct! Theoretically, the endpoint that provides the temp creds could be tied to a user's DANDI_API_KEY, etc.

The example could support "download" operations, since s3:GetObject is included in the IAM policy.
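
A minimal sketch of how a client might use the 4-field bundle for a download, assuming boto3 on the client side and assuming `Expiration` arrives as a timezone-aware datetime (as boto3 returns it from STS). The helper names `client_kwargs`, `needs_refresh` and `download` are hypothetical and not part of this PR or of dandi-cli.

```python
from datetime import datetime, timedelta, timezone


def client_kwargs(creds):
    """Map the 4-field bundle onto the keyword names boto3.client() accepts."""
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }


def needs_refresh(creds, now=None, margin=timedelta(minutes=5)):
    """True once the creds are expired or within `margin` of expiring."""
    now = now or datetime.now(timezone.utc)
    return now >= creds["Expiration"] - margin


def download(creds, bucket, key, dest):
    """Download one object with the temporary credentials (requires boto3)."""
    import boto3  # imported lazily so the helpers above work without it

    s3 = boto3.client("s3", **client_kwargs(creds))
    s3.download_file(bucket, key, dest)
```

A caller would check `needs_refresh(creds)` before each batch of downloads and re-request the bundle from the (yet to be defined) endpoint when it returns True.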

@yarikoptic

Such an approach would provide "blanket" access to the resource (well -- as wide as the IAM policy prescribes), so it would work for LINC or any other deployment where overall access to the archive should be "gated", but it would not be applicable to any aspect of the current DANDI access permission schemes (public and embargoed), correct?

@aaronkanzer
Contributor Author

You could issue temporary IAM policies and roles on a per-embargoed-dandiset basis -- the engineering would be a bit more complex, but the access could be scoped more narrowly rather than being a blanket grant.

@yarikoptic

How do you see that being feasible, given that we do not have a "prefix" per dandiset, and blobs and zarrs are stored under their respective "keys" without any per-dandiset common prefix?

@aaronkanzer
Contributor Author

I could derive the S3 sub-directory value from the dandiset's assetSummary outputs

@yarikoptic

Sorry, I am not following how assetSummary could be used here. What I meant is that the actual content is spread across all the folders:

❯ dandi ls -f json_pp --metadata assets -r DANDI:000029 2>/dev/null | grep 'blobs/'
        "https://dandiarchive.s3.amazonaws.com/blobs/8e5/a7b/8e5a7b66-7608-421c-8f2a-bbb2f5a2cc5a"
        "https://dandiarchive.s3.amazonaws.com/blobs/031/7cf/0317cf5a-4047-4e19-aae1-4f7b7434d2d7"
        "https://dandiarchive.s3.amazonaws.com/blobs/56f/5b8/56f5b879-e5fa-476c-9893-dab482f66b3d"
        "https://dandiarchive.s3.amazonaws.com/blobs/031/7cf/0317cf5a-4047-4e19-aae1-4f7b7434d2d7"
        "https://dandiarchive.s3.amazonaws.com/blobs/e72/88a/e7288a2c-444c-42c5-b7bb-9a58c728992b"
        "https://dandiarchive.s3.amazonaws.com/blobs/c40/57c/c4057c5e-7af5-4370-878f-ccfc971aeba4"
        "https://dandiarchive.s3.amazonaws.com/blobs/180/9f5/1809f541-1cb1-48b8-b916-9a696bab488d"
        "https://dandiarchive.s3.amazonaws.com/blobs/2db/af0/2dbaf0fd-5003-4a0a-b4c0-bc8cdbdb3826"
        "https://dandiarchive.s3.amazonaws.com/blobs/2fd/746/2fd7464f-5459-4c96-a938-27cf13f4d330"

And this is a tiny one -- try it for 000026 ;)
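
To make the point concrete, here is a small sketch that reduces the first four URLs from the listing above to their S3 object keys and computes their longest shared prefix. The helper name `object_keys` is hypothetical; only the standard library is used.

```python
import os.path
from urllib.parse import urlparse

# First four asset URLs from the `dandi ls` output for DANDI:000029.
urls = [
    "https://dandiarchive.s3.amazonaws.com/blobs/8e5/a7b/8e5a7b66-7608-421c-8f2a-bbb2f5a2cc5a",
    "https://dandiarchive.s3.amazonaws.com/blobs/031/7cf/0317cf5a-4047-4e19-aae1-4f7b7434d2d7",
    "https://dandiarchive.s3.amazonaws.com/blobs/56f/5b8/56f5b879-e5fa-476c-9893-dab482f66b3d",
    "https://dandiarchive.s3.amazonaws.com/blobs/031/7cf/0317cf5a-4047-4e19-aae1-4f7b7434d2d7",
]


def object_keys(urls):
    """Strip scheme and host, leaving the S3 object key of each URL."""
    return [urlparse(u).path.lstrip("/") for u in urls]


keys = object_keys(urls)
shared = os.path.commonprefix(keys)  # character-wise longest common prefix
print(shared)
print(len(set(keys)), "distinct keys out of", len(keys))
```

The only prefix the keys share is `blobs/`, which covers every blob in the bucket rather than this dandiset alone, so a prefix-based IAM resource cannot be scoped to one dandiset.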

@aaronkanzer
Contributor Author

Ah, good point -- I see what you are saying (I'm watching 000026 stream the output right now 😂 ). In that case (to the point of your output here), we would have to loop over the assets, gather the blobs/<values-here> keys, and then construct the IAM policy, unless there is some form of tagging mechanism we could use to collect those sub-directories in a single place.

I guess this solution would work well for zarr; non-zarr might be tricky. Will have to think more...
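
A rough sketch of that "loop, gather, construct" step, assuming the object keys have already been collected. The function names and the `SESSION_POLICY_LIMIT` constant are hypothetical; the 2,048-character figure reflects my understanding of the STS plaintext limit for session policies and should be verified against the AWS documentation before relying on it.

```python
import json

# Assumption: STS caps the plaintext of a session policy at 2,048 characters.
SESSION_POLICY_LIMIT = 2048


def build_policy(bucket, keys):
    """Build a policy document allowing s3:GetObject on the given keys only."""
    # De-duplicate and sort so the same asset set always yields the same policy.
    resources = sorted({f"arn:aws:s3:::{bucket}/{key}" for key in keys})
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "s3:GetObject", "Resource": resources}
        ],
    }


def policy_size(policy):
    """Length of the policy serialized as compact JSON."""
    return len(json.dumps(policy, separators=(",", ":")))


def fits_session_policy(policy):
    """True if the serialized policy stays within the assumed limit."""
    return policy_size(policy) <= SESSION_POLICY_LIMIT
```

Each blob key adds roughly 80 characters to the policy, so under this assumption only a couple of dozen explicit keys would fit, far fewer than a dandiset like 000026 contains.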

@yarikoptic

It works well for a "single zarr". For a dandiset with lots of zarrs (and @satra aims for all NWBs to become zarrs ;) ) you run into the same issue as with 000026. Even now, 000108 already has thousands of zarrs.

Tagging -- I thought about that too, but it would be tricky as well, since blobs are shared across dandisets. Ensuring aligned tagging would be non-trivial.
