A cached data package archive file (aka zip-files) may contain restricted data entities #77

Closed
servilla opened this issue Mar 31, 2022 · 2 comments
Assignees
Labels
bug Bug

Comments

@servilla
Collaborator

Cached data package archive files may contain restricted data entities if (1) the archive was created by a user with read privileges on those data entities, or (2) the archive was created before a request to embargo the data package's data entities.

@servilla servilla added the bug Bug label Mar 31, 2022
@servilla servilla self-assigned this Mar 31, 2022
@servilla
Collaborator Author

servilla commented Mar 31, 2022

Per discussions with Kyle Zollo-Venecek, the approach to pursue is:

  1. No longer create a cached data package archive file if any data entity of the data package is read-restricted for the "public" user* (see the fix in "Zip archive incomplete with embargoed data even when permissions are set correctly" #17), and
  2. Have the process of embargoing a data package's data entities remove any existing archive file.

*Note: the createDataPackageArchive API method does not add a data entity to the archive file if the user does not have read permission for it; this condition will force the archive to be generated for each request.
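The two rules above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: `Entity`, `ArchiveStore`, `can_read`, and the in-memory cache are all hypothetical names, and a list of entity names stands in for the real zip file.

```python
from dataclasses import dataclass, field

PUBLIC = "public"  # hypothetical identifier for the anonymous "public" user


@dataclass
class Entity:
    name: str
    readers: set  # users allowed to read this entity


def can_read(entity, user):
    # Publicly readable entities are readable by everyone.
    return PUBLIC in entity.readers or user in entity.readers


@dataclass
class ArchiveStore:
    # package_id -> entity names in the cached archive (stand-in for a zip file)
    cached: dict = field(default_factory=dict)

    def get_archive(self, package_id, entities, user):
        if package_id in self.cached:
            return self.cached[package_id]
        # Only entities the requesting user may read go into the archive.
        names = [e.name for e in entities if can_read(e, user)]
        # Rule 1: cache only when no entity is read-restricted for "public".
        if all(PUBLIC in e.readers for e in entities):
            self.cached[package_id] = names
        return names

    def embargo(self, package_id, entities_to_embargo):
        for e in entities_to_embargo:
            e.readers.discard(PUBLIC)
        # Rule 2: embargoing removes any existing cached archive.
        self.cached.pop(package_id, None)
```

With this shape, an archive built for a privileged user while an embargo is in place is never written to the shared cache, so a later "public" request cannot receive it.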

@servilla
Collaborator Author

servilla commented Apr 5, 2022

Regarding the approach discussed above...

As noted in item 1 above, the solution presented in #17 creates a custom zip archive file for each user (including the "public" user) and returns a zip file from the cache only if one exists for the given user. Each zip file therefore contains only the resources that the given user is allowed to access. Assuming no cached zip files exist at the time of an embargo, all newly created zip files will contain only the resources permitted for that user. This approach, however, does not mitigate the issue when a zip file was created by the "public" user prior to an embargo. In that case, the zip file should be removed from the cache so that a new zip file is created with the correct set of resources. Therefore, the embargo process should automatically remove any existing zip archive file in the cache when the embargo is applied. Similarly, the embargo process should remove any zip archive file created during the embargo period so that new zip files are created post-embargo. The capability to remove zip archive files from the archive cache location is addressed in #78.
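To make the per-user cache and its invalidation concrete, here is a small sketch under stated assumptions: `PerUserArchiveCache`, `set_embargo`, and the `acl` mapping (entity name to set of permitted readers) are hypothetical, and a sorted list of entity names stands in for the zip file. It is not the code from #17 or #78.

```python
PUBLIC = "public"  # hypothetical identifier for the "public" user


class PerUserArchiveCache:
    """Caches one archive per (package, user) pair."""

    def __init__(self):
        self._entries = {}  # (package_id, user) -> entity names in the archive

    def get_or_build(self, package_id, user, acl):
        key = (package_id, user)
        if key not in self._entries:
            # Build an archive holding only what this user may read.
            self._entries[key] = sorted(
                name for name, readers in acl.items()
                if PUBLIC in readers or user in readers
            )
        return self._entries[key]

    def purge_package(self, package_id):
        # Drop every user's cached archive for this package.
        stale = [key for key in self._entries if key[0] == package_id]
        for key in stale:
            del self._entries[key]
        return len(stale)


def set_embargo(cache, acl, package_id, entity_name, embargoed):
    # Change "public" read access, then purge so archives are rebuilt.
    if embargoed:
        acl[entity_name].discard(PUBLIC)
    else:
        acl[entity_name].add(PUBLIC)
    cache.purge_package(package_id)
```

Purging on both transitions matters: without it, a pre-embargo "public" archive would keep serving the embargoed entity, and an archive built during the embargo would keep omitting it after the embargo ends.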

Projects
Status: Done