Investigate performance limits of DisclosureProtection metric #691
Labels: feature request
Problem Description
Currently, the `DisclosureProtection` metric warns about poor performance when the input data has more than 50,000 rows. This threshold was chosen without investigating the metric's performance. It'd be helpful to know how the metric's performance changes with the size of the input, so that we can warn the user of possible poor performance earlier and suggest an alternative metric.

Expected behavior
Investigate the performance of the `DisclosureProtection` metric, considering input data length, number of known/sensitive columns, and number of unique discrete values in those columns. Also test across the different CAP methods.

Once we have a good understanding of the performance, we should update the warning in `DisclosureProtection` based on the results of the investigation.
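One possible starting point for the investigation is a small timing harness that sweeps the dimensions listed above (row count, number of known and sensitive columns, number of unique discrete values, and CAP method). The sketch below is standard-library only and rests on assumptions: `make_rows`, `placeholder_metric`, and `run_grid` are hypothetical helpers, the method names are assumed rather than taken from this issue, and `placeholder_metric` is a cheap stand-in that does not implement CAP. For real measurements it would need to be swapped for a wrapper around the actual `DisclosureProtection` call.

```python
import itertools
import random
import time


def make_rows(num_rows, num_columns, num_categories, seed):
    """Generate rows of random discrete values (stand-in for real/synthetic data)."""
    rng = random.Random(seed)
    return [
        tuple(rng.randrange(num_categories) for _ in range(num_columns))
        for _ in range(num_rows)
    ]


def placeholder_metric(real, synthetic, num_known, method):
    """Hypothetical stand-in, NOT the real metric and NOT a CAP implementation.

    Returns the fraction of real rows whose known-column values never appear
    in the synthetic data. `method` is ignored here; a real wrapper would
    forward it to the metric under test.
    """
    synthetic_keys = {row[:num_known] for row in synthetic}
    hits = sum(row[:num_known] in synthetic_keys for row in real)
    return 1 - hits / len(real)


def run_grid(metric_fn, row_counts, known_counts, sensitive_counts,
             category_counts, methods):
    """Time metric_fn once for every combination of the given parameters."""
    results = []
    grid = itertools.product(
        row_counts, known_counts, sensitive_counts, category_counts, methods
    )
    for num_rows, num_known, num_sensitive, num_categories, method in grid:
        # The first num_known columns play the role of known columns,
        # the remaining ones the role of sensitive columns.
        num_columns = num_known + num_sensitive
        real = make_rows(num_rows, num_columns, num_categories, seed=0)
        synthetic = make_rows(num_rows, num_columns, num_categories, seed=1)

        start = time.perf_counter()
        metric_fn(real, synthetic, num_known, method)
        elapsed = time.perf_counter() - start

        results.append({
            'rows': num_rows,
            'known': num_known,
            'sensitive': num_sensitive,
            'categories': num_categories,
            'method': method,
            'seconds': elapsed,
        })
    return results


timings = run_grid(
    placeholder_metric,
    row_counts=[1_000, 5_000],
    known_counts=[1, 3],
    sensitive_counts=[1],
    category_counts=[5, 50],
    # Assumed method names; adjust to whatever the metric actually accepts.
    methods=['cap', 'zero_cap', 'generalized_cap'],
)
for entry in timings:
    print(entry)
```

Keeping the metric behind a plain callable means the same grid can be rerun unchanged once the real metric is plugged in, and the resulting records can be grouped by any one dimension to see where runtime starts to climb.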