Investigate performance limits of DisclosureProtection metric #691

Open
frances-h opened this issue Dec 10, 2024 · 0 comments
Labels
feature request Request for a new feature

Comments

@frances-h
Contributor

Problem Description

Currently, the DisclosureProtection metric warns about poor performance when the input data has more than 50,000 rows. This threshold was chosen without investigating how the metric actually performs. It would be helpful to know how the metric's performance changes with the size of the input, so that we can warn the user about possible poor performance earlier and suggest an alternative metric.

Expected behavior

Investigate the performance of the DisclosureProtection metric as a function of the number of rows in the input data, the number of known and sensitive columns, and the number of unique discrete values in those columns. Also test across the different CAP methods.
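One possible way to structure the investigation is a small timing harness that sweeps the factors listed above. The sketch below is only an illustration, not the SDMetrics implementation: `make_discrete_rows`, `placeholder_metric`, and `benchmark` are hypothetical names, and `placeholder_metric` is a simplified stand-in that would be replaced by a wrapper around the real metric call (one wrapper per CAP method).

```python
import itertools
import random
import time
from collections import Counter, defaultdict


def make_discrete_rows(num_rows, num_columns, num_unique, seed):
    """Generate rows of random discrete values (hypothetical test data)."""
    rng = random.Random(seed)
    return [
        tuple(rng.randrange(num_unique) for _ in range(num_columns))
        for _ in range(num_rows)
    ]


def placeholder_metric(real_rows, synthetic_rows, num_known):
    """Hypothetical stand-in for the metric under test, NOT the real one.

    The first `num_known` values of each row are treated as known columns
    and the rest as sensitive columns.
    """
    # Count the sensitive values seen in synthetic data for each known key.
    by_key = defaultdict(Counter)
    for row in synthetic_rows:
        by_key[row[:num_known]][row[num_known:]] += 1

    # For each real row, score how often the matching synthetic rows
    # carry the same sensitive values.
    total = 0.0
    for row in real_rows:
        matches = by_key.get(row[:num_known])
        if matches:
            total += matches[row[num_known:]] / sum(matches.values())
    return 1.0 - total / len(real_rows) if real_rows else 1.0


def benchmark(metric_fn, row_counts, known_counts, sensitive_counts, unique_counts):
    """Time `metric_fn` over every combination of the given factors."""
    results = []
    grid = itertools.product(row_counts, known_counts, sensitive_counts, unique_counts)
    for num_rows, num_known, num_sensitive, num_unique in grid:
        num_columns = num_known + num_sensitive
        real = make_discrete_rows(num_rows, num_columns, num_unique, seed=0)
        synthetic = make_discrete_rows(num_rows, num_columns, num_unique, seed=1)

        start = time.perf_counter()
        metric_fn(real, synthetic, num_known)
        elapsed = time.perf_counter() - start

        results.append({
            "num_rows": num_rows,
            "num_known": num_known,
            "num_sensitive": num_sensitive,
            "num_unique": num_unique,
            "seconds": elapsed,
        })
    return results
```

Calling, for example, `benchmark(placeholder_metric, [1_000, 10_000, 50_000], [1, 3], [1, 2], [2, 10, 100])` yields one timing record per combination, which could then be tabulated to see where runtime starts to grow sharply. Repeating each measurement several times and taking the median would make the numbers less noisy.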

Once we have a good understanding of the performance, we should update the warning in DisclosureProtection based on the results of the investigation.

@frances-h frances-h added the feature request Request for a new feature label Dec 10, 2024