Adding intersectional bias mitigation to AIF360 #538

ckalousi · 2024-08-26T12:22:46Z

We have implemented an intersectional bias mitigation algorithm based on https://doi.org/10.1007/978-3-030-87687-6_5 (see also https://doi.org/10.48550/arXiv.2010.13494 for the arxiv version) as discussed further on issue #537. Additional details are available in the demo notebook.

ckalousi · 2024-08-26T12:23:18Z

According to pytest, our code does not reach the desired coverage of 80%. This happens because our code is multi threaded and it was not obvious to us how pytest can support such kind of code. Nonetheless, we have checked that all functions of the main algorithm file are called during our tests.

Signed-off-by: Kalousios <[email protected]>

Signed-off-by: ckalousi <[email protected]>

RahulVadisetty91

In the init method, self.model is set to None, but there's no mechanism for initializing or assigning a model. While the abstract fit method may be responsible for setting the model, it's not clear how this works.

…bar support of the main algorithm added Signed-off-by: Kalousios <[email protected]>

ckalousi · 2024-11-02T22:39:59Z

@RahulVadisetty91

Dear Rahul,

we are very grateful for taking the time to review our code and offer some very valuable comments.

We are also extremely sorry we couldn't address your comments earlier as we had to observe some critical deadlines in our work and also coordinate our actions regarding our pull request.

About your comment on the init method. This is a great catch. Indeed self.model is not needed in the current setting. We had only put it there in case we wanted in the future to expand support of our Intersectional Fairness to more algorithms. Under current circumstances it makes sense to comment out this line (line 27 of your screenshot).

You have also made some very valuable comments in your first version of your comment (before the edit). We considered all of them, and although they are all very important we could only use current resources to address some of them.

More specifically, although it would be nice to switch to TensorFlow 2 for future compatibility, our code is based on the code of the Adversarial Debiasing algorithm found in AIF360 which in turn is based on TensorFlow 1. It is very difficult to modify our code to support TensorFlow 2 the time Adversarial Debiasing uses TensorFlow 1. If in the future the original algorithm is updated we would be happy to also update our code.

Following one of your comments we have now implemented evaluation progress bars in our algorithm.

Once again thank you for your time and we would be happy to further discuss any of your suggestions or concerns.

Best regards,
Chrysostomos

hoffmansc

Thanks for your patience here! My comments are mostly concerned with cleaning up the code and reducing redundancies but overall it's in pretty good shape. Good work!

hoffmansc · 2024-11-11T19:10:54Z

aif360/algorithms/isf_helpers/isf_metrics/disparate_impact.py

+    def calc_di(self, df, protected_attr_info, label_info):
+        """
+        Calculate Disparate Impact score
+
+        Parameters
+        ----------
+        df : DataFrame
+            DataFrame containing sensitive attributes and label
+        sensitive : dictionary
+            Privileged group (sensitive attribute name : attribute value)
+            e.g. {'Gender':1.0,'Race':'black'}
+        label_info : dictionary
+            Label definition (label attribute name : attribute values)
+            e.g. {'denied':1.0}
+
+        Returns
+        -------
+        return value : float
+            Disparete Impact score
+        """
+        df_bunshi, df_bunbo = self.calc_privilege_group(df, protected_attr_info)
+
+        if (len(df_bunshi) == 0):
+            return np.nan
+
+        if (len(df_bunbo) == 0):
+            return np.nan
+
+        label = list(label_info.keys())[0]
+        privileged_value = list(label_info.values())[0]
+
+        a = len(df_bunshi[df_bunshi[label] == privileged_value])
+        b = len(df_bunbo[df_bunbo[label] == privileged_value])
+
+        bunshi_rate = a / len(df_bunshi)
+        bunbo_rate = b / len(df_bunbo)
+
+        if bunbo_rate == 0:
+            return np.nan
+
+        return (bunshi_rate/bunbo_rate)


is it possible to use the built-in disparate_impact_ratio() here?

we usually calculate disparate impact as unprivileged rate / privileged rate so this will be the inverse of what aif360 will return

hoffmansc · 2024-11-11T19:11:57Z

aif360/algorithms/isf_helpers/isf_analysis/intersectional_bias.py

+def calc_intersectionalbias(dataset, metric="DisparateImpact"):
+    """
+    Calculate intersectional bias(DisparateImpact) by more than one sensitive attributes
+
+    Parameters
+    ----------
+    dataset : StructuredDataset
+        A dataset containing more than one sensitive attributes
+
+    metric : str
+        Fairness metric name
+        ["DisparateImpact"]
+
+    Returns
+    -------
+    df_result : DataFrame
+        Intersectional bias(DisparateImpact)
+    """
+
+    df = dataset.convert_to_dataframe()[0]
+    label_info = {dataset.label_names[0]: dataset.favorable_label}
+
+    if metric == "DisparateImpact":
+        fs = DisparateImpact()
+    else:
+        raise ValueError("metric name not in the list of allowed metrics")
+
+    df_result = pd.DataFrame(columns=[metric])
+    for multi_group_label in create_multi_group_label(dataset)[0]:
+        protected_attr_info = multi_group_label[0]
+        di = fs.bias_predict(df,
+                             protected_attr_info=protected_attr_info,
+                             label_info=label_info)
+        name = ''
+        for k, v in protected_attr_info.items():
+            name += k + " = " + str(v) + ","
+        df_result.loc[name[:-1]] = di
+
+    return df_result


is it possible to use the built-in one_vs_rest() here?

y = df.set_index(dataset.protected_attribute_names)[dataset.label_names] one_vs_rest(disparate_impact_ratio, y)

hoffmansc · 2024-11-11T19:22:45Z

aif360/algorithms/isf_helpers/isf_metrics/disparate_impact.py

+import matplotlib.cm as cm
+
+
+class DisparateImpact():


does this need to be a class as opposed to individual functions?

hoffmansc · 2024-11-11T19:46:58Z

aif360/algorithms/isf_helpers/isf_analysis/intersectional_bias.py

+def calc_intersectionalbias_matrix(dataset, metric="DisparateImpact"):
+    """
+    Comparison drawing of intersectional bias in heat map
+
+    Parameters
+    ----------
+    dataset : StructuredDataset
+        Dataset containing two sensitive attributes
+    metric : str
+        Fairness metric name
+        ["DisparateImpact"]
+
+    Returns
+    -------
+    df_result : DataFrame
+        Intersectional bias(DisparateImpact)
+    """
+
+    protect_attr = dataset.protected_attribute_names
+
+    if len(protect_attr) != 2:
+        raise ValueError("specify 2 sensitive attributes.")
+
+    if metric == "DisparateImpact":
+        fs = DisparateImpact()
+    else:
+        raise ValueError("metric name not in the list of allowed metrics")
+
+    df = dataset.convert_to_dataframe()[0]
+    label_info = {dataset.label_names[0]: dataset.favorable_label}
+
+    protect_attr0_values = list(set(df[protect_attr[0]]))
+    protect_attr1_values = list(set(df[protect_attr[1]]))
+
+    df_result = pd.DataFrame(columns=protect_attr1_values)
+
+    for val0 in protect_attr0_values:
+        tmp_li = []
+        col_list = []
+        for val1 in protect_attr1_values:
+            di = fs.bias_predict(df,
+                                 protected_attr_info={protect_attr[0]: val0, protect_attr[1]: val1},
+                                 label_info=label_info)
+            tmp_li += [di]
+            col_list += [protect_attr[1]+"="+str(val1)]
+
+        df_result.loc[protect_attr[0]+"="+str(val0)] = tmp_li
+    df_result = df_result.set_axis(col_list, axis=1)
+
+    return df_result


this seems largely redundant with calc_intersectionalbias() above but pivoted. can't we just accomplish this in pandas?

hoffmansc · 2024-11-11T20:08:22Z

aif360/algorithms/isf_helpers/isf_analysis/metrics.py

are these necessary for the algorithm? they seem non-specific and I don't really see them used anywhere.

hoffmansc · 2024-11-11T22:32:07Z

aif360/algorithms/isf_helpers/preprocessing/massaging.py

+        scale_orig = StandardScaler()
+        X = scale_orig.fit_transform(ds_train.features)


is this necessary inside fit()? can't the user just apply scaling before passing the dataset?

hoffmansc · 2024-11-11T22:36:40Z

aif360/algorithms/isf_helpers/inprocessing/adversarial_debiasing.py

is there any way we can just use the AdversarialDebiasing class directly instead of this wrapper? this doesn't seem to be doing much.

hoffmansc · 2024-11-11T22:39:05Z

aif360/algorithms/isf_helpers/postprocessing/reject_option_based_classification.py

same as before -- can we get rid of these wrapper classes?

hoffmansc · 2024-11-11T22:47:30Z

aif360/algorithms/intersectional_fairness.py

+from aif360.algorithms.isf_helpers.postprocessing.reject_option_based_classification import RejectOptionClassification
+from aif360.algorithms.isf_helpers.postprocessing.equalized_odds_postprocessing import EqualizedOddsPostProcessing
+
+from logging import getLogger, StreamHandler, ERROR, Formatter


can we get rid of the debugging lines or if they're still useful just add a verbose flag?

hoffmansc · 2024-11-11T22:52:09Z

tests/test_isf.py

+    def _read_modelanswer(self, s_result_singleattr, s_result_combattr):
+        # load of model answer
+        ma_singleattr_bias = pd.read_csv(MODEL_ANSWER_PATH + s_result_singleattr, index_col=0)
+        ma_combattr_bias = pd.read_csv(MODEL_ANSWER_PATH + s_result_combattr, index_col=0)
+        return ma_singleattr_bias, ma_combattr_bias
+
+    def _comp_dataframe(self, df1, df2):
+        try:
+            assert_frame_equal(df1, df2)
+        except AssertionError:
+            return False
+        return True


ckalousi added 2 commits August 26, 2024 15:26

Initial commit of intersectional bias mitigation algorithm

9080ddb

Signed-off-by: Kalousios <[email protected]>

remove unnecessary comments, clean code, bug fixes

86c141d

Signed-off-by: ckalousi <[email protected]>

ckalousi force-pushed the main branch from 9fdf797 to 86c141d Compare August 26, 2024 14:40

RahulVadisetty91 reviewed Sep 11, 2024

View reviewed changes

an issue of the init method regarding self.model clarified, progress …

b26b850

…bar support of the main algorithm added Signed-off-by: Kalousios <[email protected]>

hoffmansc requested changes Nov 11, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adding intersectional bias mitigation to AIF360 #538

Adding intersectional bias mitigation to AIF360 #538

ckalousi commented Aug 26, 2024 •

edited

Loading

ckalousi commented Aug 26, 2024

RahulVadisetty91 left a comment •

edited

Loading

ckalousi commented Nov 2, 2024

hoffmansc left a comment

hoffmansc Nov 11, 2024

hoffmansc Nov 11, 2024

hoffmansc Nov 11, 2024

hoffmansc Nov 11, 2024

hoffmansc Nov 11, 2024

hoffmansc Nov 11, 2024

hoffmansc Nov 11, 2024

hoffmansc Nov 11, 2024

hoffmansc Nov 11, 2024

hoffmansc Nov 11, 2024

hoffmansc Nov 11, 2024

		scale_orig = StandardScaler()
		X = scale_orig.fit_transform(ds_train.features)

Adding intersectional bias mitigation to AIF360 #538

Are you sure you want to change the base?

Adding intersectional bias mitigation to AIF360 #538

Conversation

ckalousi commented Aug 26, 2024 • edited Loading

ckalousi commented Aug 26, 2024

RahulVadisetty91 left a comment • edited Loading

Choose a reason for hiding this comment

ckalousi commented Nov 2, 2024

hoffmansc left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ckalousi commented Aug 26, 2024 •

edited

Loading

RahulVadisetty91 left a comment •

edited

Loading