Algo Fairness

Including Juba's video-lecture on algorithmic fairness
michelleg06 · Jan 30, 2024 · 5129a5e
1 parent 0ec8b22
commit 5129a5e

Showing 10 changed files with 78 additions and 51 deletions.
diff --git a/NeuralNets.html b/NeuralNets.html
@@ -275,6 +275,13 @@
     Classification:Logistic
   </a>
 </li>
+<li>
+  <a href="fairml.html">
+    <span class="fa fa-graduation-cap"></span>
+
+    Fair ML/Data Ethics
+  </a>
+</li>
 <li>
   <a href="treebasedmodels.html">
     <span class="fa fa-tree"></span>

diff --git a/_site.yml b/_site.yml
@@ -19,9 +19,9 @@ navbar:
         - text: "Classification:Logistic"
           href: classification.html
           icon: fa-solid fa-gears
-        #- text: "Fair ML/Data Ethics"
-        #  href: fairml.html
-        #  icon: fa fa-graduation-cap
+        - text: "Fair ML/Data Ethics"
+          href: fairml.html
+          icon: fa fa-graduation-cap
         #- text: "Classification"
         #  href: classification.html
         #  icon: fa-gear

diff --git a/discussionboard.html b/discussionboard.html
@@ -275,6 +275,13 @@
     Classification:Logistic
   </a>
 </li>
+<li>
+  <a href="fairml.html">
+    <span class="fa fa-graduation-cap"></span>
+
+    Fair ML/Data Ethics
+  </a>
+</li>
 <li>
   <a href="treebasedmodels.html">
     <span class="fa fa-tree"></span>

diff --git a/fairml.Rmd b/fairml.Rmd
@@ -18,24 +18,15 @@ Machine Learning promises to be an important tool for Policymakers who wish to i
 <center>
 ```{r, echo=FALSE}
 library("vembedr")
-embed_url("https://youtu.be/4Q5UDg8bu58")
+embed_url("https://youtu.be/4Q5UDg8bu58?si=o3gCzkOLXH_AD6oF")
 ```
 </center>
 
-In this guest video lecture, Dr. [Juba Ziani](https://www.juba-ziani.com/), Assistant Professor at Georgia Tech will give us an overview of the more common ethical dilemmas that arise in machine learning, and practical examples in the public sphere where these issues have had a regressive impact in society. The guest lecture also includes some ways in which we (data scientists, machine learning enthusiasts, and future policymakers) can minimise these biases and avoid negative impacts from ML in the policy decision-making process. Some of his recommendations to delve deeper into this topic include:
+In this video lecture, Dr. [Juba Ziani](https://www.juba-ziani.com/), Assistant Professor at Georgia Tech will give us an overview of the more common ethical dilemmas that arise in machine learning, and practical examples in the public sphere where these issues have had a regressive impact in society. The video also includes some ways in which we (data scientists, machine learning enthusiasts, and future policymakers) can minimise these biases and avoid negative impacts from ML in the policy decision-making process. Some of his recommendations to delve deeper into this topic include:
 
 - The Ethical Algorithm: The Science of Socially Aware Algorithm Design by Michael Kearns and Aaron Roth. Available at: [Amazon](https://www.amazon.com/Ethical-Algorithm-Science-Socially-Design/dp/0190948205).
 
 - Fairness and Machine Learning: Limitations and Opportunities by Solon Barocas, Moritz Hardt, Arvind Narayanan. Available at: [https://fairmlbook.org/](https://fairmlbook.org/)
 
 
-> "Wombats are the best. A wombat's main defense against predators is its butt. When a predator attacks a wombat, it runs to its burrow and uses its tough cartilage-filled bum to block the hole" 
-> `r tufte::quote_footer('--- Dr. Juba Ziani (and lots of Nature articles)')`
-
-<center>
-```{r pressure, echo=FALSE, fig.cap=" ", out.width = '65%'}
-knitr::include_graphics("/Users/michellegonzalez/Documents/GitHub/Machine-Learning-for-Public-Policy/Images/wombat.png")
-```
-</center>
-
-
+This lecture does not come with an applied R or Python exercise, but we do ask that you think about the different sources of bias and how they may come up in your (personal) research. 
diff --git a/fairml.html b/fairml.html
@@ -276,6 +276,13 @@
     Classification:Logistic
   </a>
 </li>
+<li>
+  <a href="fairml.html">
+    <span class="fa fa-graduation-cap"></span>
+
+    Fair ML/Data Ethics
+  </a>
+</li>
 <li>
   <a href="treebasedmodels.html">
     <span class="fa fa-tree"></span>
@@ -320,16 +327,15 @@ <h1 class="title toc-ignore">Algorithmic fairness</h1>
 </div>
 </div>
 </center>
-<p>In this guest video lecture, Dr. <a
-href="https://www.juba-ziani.com/">Juba Ziani</a>, Assistant Professor
-at Georgia Tech will give us an overview of the more common ethical
-dilemmas that arise in machine learning, and practical examples in the
-public sphere where these issues have had a regressive impact in
-society. The guest lecture also includes some ways in which we (data
-scientists, machine learning enthusiasts, and future policymakers) can
-minimise these biases and avoid negative impacts from ML in the policy
-decision-making process. Some of his recommendations to delve deeper
-into this topic include:</p>
+<p>In this video lecture, Dr. <a href="https://www.juba-ziani.com/">Juba
+Ziani</a>, Assistant Professor at Georgia Tech will give us an overview
+of the more common ethical dilemmas that arise in machine learning, and
+practical examples in the public sphere where these issues have had a
+regressive impact in society. The video also includes some ways in which
+we (data scientists, machine learning enthusiasts, and future
+policymakers) can minimise these biases and avoid negative impacts from
+ML in the policy decision-making process. Some of his recommendations to
+delve deeper into this topic include:</p>
 <ul>
 <li><p>The Ethical Algorithm: The Science of Socially Aware Algorithm
 Design by Michael Kearns and Aaron Roth. Available at: <a
@@ -338,21 +344,9 @@ <h1 class="title toc-ignore">Algorithmic fairness</h1>
 Solon Barocas, Moritz Hardt, Arvind Narayanan. Available at: <a
 href="https://fairmlbook.org/">https://fairmlbook.org/</a></p></li>
 </ul>
-<blockquote>
-“Wombats are the best. A wombat’s main defense against predators is its
-butt. When a predator attacks a wombat, it runs to its burrow and uses
-its tough cartilage-filled bum to block the hole”
-<footer>
-— Dr. Juba Ziani (and lots of Nature articles)
-</footer>
-</blockquote>
-<center>
-<div class="figure">
-<img src="Images/wombat.png" alt=" " width="65%" />
-<p class="caption">
-</p>
-</div>
-</center>
+<p>This lecture does not come with an applied R or Python exercise, but
+we do ask that you think about the different sources of bias and how
+they may come up in your (personal) research.</p>
 
 <!DOCTYPE html>
 <hr>

diff --git a/index.html b/index.html
@@ -275,6 +275,13 @@
     Classification:Logistic
   </a>
 </li>
+<li>
+  <a href="fairml.html">
+    <span class="fa fa-graduation-cap"></span>
+
+    Fair ML/Data Ethics
+  </a>
+</li>
 <li>
   <a href="treebasedmodels.html">
     <span class="fa fa-tree"></span>

diff --git a/intro.html b/intro.html
@@ -276,6 +276,13 @@
     Classification:Logistic
   </a>
 </li>
+<li>
+  <a href="fairml.html">
+    <span class="fa fa-graduation-cap"></span>
+
+    Fair ML/Data Ethics
+  </a>
+</li>
 <li>
   <a href="treebasedmodels.html">
     <span class="fa fa-tree"></span>
@@ -1280,8 +1287,8 @@ <h3><strong>An introduction to Python programming</strong></h3>
 ## Dep. Variable:                      y   R-squared:                       0.518
 ## Model:                            OLS   Adj. R-squared:                  0.507
 ## Method:                 Least Squares   F-statistic:                     46.27
-## Date:                Fri, 26 Jan 2024   Prob (F-statistic):           3.83e-62
-## Time:                        10:16:54   Log-Likelihood:                -2386.0
+## Date:                Tue, 30 Jan 2024   Prob (F-statistic):           3.83e-62
+## Time:                        15:04:19   Log-Likelihood:                -2386.0
 ## No. Observations:                 442   AIC:                             4794.
 ## Df Residuals:                     431   BIC:                             4839.
 ## Df Model:                          10                                         

diff --git a/predictionpolicy.html b/predictionpolicy.html
@@ -278,6 +278,13 @@
     Classification:Logistic
   </a>
 </li>
+<li>
+  <a href="fairml.html">
+    <span class="fa fa-graduation-cap"></span>
+
+    Fair ML/Data Ethics
+  </a>
+</li>
 <li>
   <a href="treebasedmodels.html">
     <span class="fa fa-tree"></span>
@@ -4896,8 +4903,8 @@ <h3>
 ## Dep. Variable:         lnexp_pc_month   R-squared:                       0.599
 ## Model:                            OLS   Adj. R-squared:                  0.598
 ## Method:                 Least Squares   F-statistic:                     498.3
-## Date:                Fri, 26 Jan 2024   Prob (F-statistic):               0.00
-## Time:                        10:17:10   Log-Likelihood:                -5189.2
+## Date:                Tue, 30 Jan 2024   Prob (F-statistic):               0.00
+## Time:                        15:04:35   Log-Likelihood:                -5189.2
 ## No. Observations:                9024   AIC:                         1.043e+04
 ## Df Residuals:                    8996   BIC:                         1.063e+04
 ## Df Model:                          27                                         
@@ -4955,8 +4962,8 @@ <h3>
 ## Dep. Variable:         lnexp_pc_month   R-squared:                       0.599
 ## Model:                            OLS   Adj. R-squared:                  0.598
 ## Method:                 Least Squares   F-statistic:                     431.9
-## Date:                Fri, 26 Jan 2024   Prob (F-statistic):               0.00
-## Time:                        10:17:11   Log-Likelihood:                -5189.2
+## Date:                Tue, 30 Jan 2024   Prob (F-statistic):               0.00
+## Time:                        15:04:35   Log-Likelihood:                -5189.2
 ## No. Observations:                9024   AIC:                         1.043e+04
 ## Df Residuals:                    8996   BIC:                         1.063e+04
 ## Df Model:                          27                                         

diff --git a/treebasedmodels.html b/treebasedmodels.html
@@ -276,6 +276,13 @@
     Classification:Logistic
   </a>
 </li>
+<li>
+  <a href="fairml.html">
+    <span class="fa fa-graduation-cap"></span>
+
+    Fair ML/Data Ethics
+  </a>
+</li>
 <li>
   <a href="treebasedmodels.html">
     <span class="fa fa-tree"></span>
@@ -810,7 +817,7 @@ <h3>
 <pre><code>## Confusion Matrix: [[ 529  279]
 ##  [ 150 1298]]</code></pre>
 <pre class="python"><code>ConfusionMatrixDisplay(confusion_matrix=cm).plot() # create confusion matrix plot</code></pre>
-<pre><code>## &lt;sklearn.metrics._plot.confusion_matrix.ConfusionMatrixDisplay object at 0x173505120&gt;</code></pre>
+<pre><code>## &lt;sklearn.metrics._plot.confusion_matrix.ConfusionMatrixDisplay object at 0x299c5b1c0&gt;</code></pre>
 <pre class="python"><code>plt.show() # display confusion matrix plot created above</code></pre>
 <p><img src="treebasedmodels_files/figure-html/unnamed-chunk-13-1.png" width="672" />
 Based on our out-of-sample predictions, the Random Forest algorithm
@@ -852,12 +859,12 @@ <h3>
 random_search.fit(X_train, y_train)</code></pre>
 <style>#sk-container-id-4 {color: black;background-color: white;}#sk-container-id-4 pre{padding: 0;}#sk-container-id-4 div.sk-toggleable {background-color: white;}#sk-container-id-4 label.sk-toggleable__label {cursor: pointer;display: block;width: 100%;margin-bottom: 0;padding: 0.3em;box-sizing: border-box;text-align: center;}#sk-container-id-4 label.sk-toggleable__label-arrow:before {content: "▸";float: left;margin-right: 0.25em;color: #696969;}#sk-container-id-4 label.sk-toggleable__label-arrow:hover:before {color: black;}#sk-container-id-4 div.sk-estimator:hover label.sk-toggleable__label-arrow:before {color: black;}#sk-container-id-4 div.sk-toggleable__content {max-height: 0;max-width: 0;overflow: hidden;text-align: left;background-color: #f0f8ff;}#sk-container-id-4 div.sk-toggleable__content pre {margin: 0.2em;color: black;border-radius: 0.25em;background-color: #f0f8ff;}#sk-container-id-4 input.sk-toggleable__control:checked~div.sk-toggleable__content {max-height: 200px;max-width: 100%;overflow: auto;}#sk-container-id-4 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {content: "▾";}#sk-container-id-4 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-4 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-4 input.sk-hidden--visually {border: 0;clip: rect(1px 1px 1px 1px);clip: rect(1px, 1px, 1px, 1px);height: 1px;margin: -1px;overflow: hidden;padding: 0;position: absolute;width: 1px;}#sk-container-id-4 div.sk-estimator {font-family: monospace;background-color: #f0f8ff;border: 1px dotted black;border-radius: 0.25em;box-sizing: border-box;margin-bottom: 0.5em;}#sk-container-id-4 div.sk-estimator:hover {background-color: #d4ebff;}#sk-container-id-4 div.sk-parallel-item::after {content: "";width: 100%;border-bottom: 1px solid gray;flex-grow: 1;}#sk-container-id-4 div.sk-label:hover 
label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-4 div.sk-serial::before {content: "";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 0;bottom: 0;left: 50%;z-index: 0;}#sk-container-id-4 div.sk-serial {display: flex;flex-direction: column;align-items: center;background-color: white;padding-right: 0.2em;padding-left: 0.2em;position: relative;}#sk-container-id-4 div.sk-item {position: relative;z-index: 1;}#sk-container-id-4 div.sk-parallel {display: flex;align-items: stretch;justify-content: center;background-color: white;position: relative;}#sk-container-id-4 div.sk-item::before, #sk-container-id-4 div.sk-parallel-item::before {content: "";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 0;bottom: 0;left: 50%;z-index: -1;}#sk-container-id-4 div.sk-parallel-item {display: flex;flex-direction: column;z-index: 1;position: relative;background-color: white;}#sk-container-id-4 div.sk-parallel-item:first-child::after {align-self: flex-end;width: 50%;}#sk-container-id-4 div.sk-parallel-item:last-child::after {align-self: flex-start;width: 50%;}#sk-container-id-4 div.sk-parallel-item:only-child::after {width: 0;}#sk-container-id-4 div.sk-dashed-wrapped {border: 1px dashed gray;margin: 0 0.4em 0.5em 0.4em;box-sizing: border-box;padding-bottom: 0.4em;background-color: white;}#sk-container-id-4 div.sk-label label {font-family: monospace;font-weight: bold;display: inline-block;line-height: 1.2em;}#sk-container-id-4 div.sk-label-container {text-align: center;}#sk-container-id-4 div.sk-container {/* jupyter's `normalize.less` sets `[hidden] { display: none; }` but bootstrap.min.css set `[hidden] { display: none !important; }` so we also need the `!important` here to be able to override the default hidden behavior on the sphinx rendered scikit-learn.org. 
See: https://github.com/scikit-learn/scikit-learn/issues/21755 */display: inline-block !important;position: relative;}#sk-container-id-4 div.sk-text-repr-fallback {display: none;}</style><div id="sk-container-id-4" class="sk-top-container"><div class="sk-text-repr-fallback"><pre>RandomizedSearchCV(cv=5, estimator=RandomForestClassifier(random_state=42),
                    n_iter=5,
-                   param_distributions={&#x27;max_depth&#x27;: &lt;scipy.stats._distn_infrastructure.rv_discrete_frozen object at 0x1735736d0&gt;,
-                                        &#x27;n_estimators&#x27;: &lt;scipy.stats._distn_infrastructure.rv_discrete_frozen object at 0x173573d30&gt;},
+                   param_distributions={&#x27;max_depth&#x27;: &lt;scipy.stats._distn_infrastructure.rv_discrete_frozen object at 0x10d95e8c0&gt;,
+                                        &#x27;n_estimators&#x27;: &lt;scipy.stats._distn_infrastructure.rv_discrete_frozen object at 0x10d95f490&gt;},
                    random_state=42)</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class="sk-container" hidden><div class="sk-item sk-dashed-wrapped"><div class="sk-label-container"><div class="sk-label sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-4" type="checkbox" ><label for="sk-estimator-id-4" class="sk-toggleable__label sk-toggleable__label-arrow">RandomizedSearchCV</label><div class="sk-toggleable__content"><pre>RandomizedSearchCV(cv=5, estimator=RandomForestClassifier(random_state=42),
                    n_iter=5,
-                   param_distributions={&#x27;max_depth&#x27;: &lt;scipy.stats._distn_infrastructure.rv_discrete_frozen object at 0x1735736d0&gt;,
-                                        &#x27;n_estimators&#x27;: &lt;scipy.stats._distn_infrastructure.rv_discrete_frozen object at 0x173573d30&gt;},
+                   param_distributions={&#x27;max_depth&#x27;: &lt;scipy.stats._distn_infrastructure.rv_discrete_frozen object at 0x10d95e8c0&gt;,
+                                        &#x27;n_estimators&#x27;: &lt;scipy.stats._distn_infrastructure.rv_discrete_frozen object at 0x10d95f490&gt;},
                    random_state=42)</pre></div></div></div><div class="sk-parallel"><div class="sk-parallel-item"><div class="sk-item"><div class="sk-label-container"><div class="sk-label sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-5" type="checkbox" ><label for="sk-estimator-id-5" class="sk-toggleable__label sk-toggleable__label-arrow">estimator: RandomForestClassifier</label><div class="sk-toggleable__content"><pre>RandomForestClassifier(random_state=42)</pre></div></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-estimator sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-6" type="checkbox" ><label for="sk-estimator-id-6" class="sk-toggleable__label sk-toggleable__label-arrow">RandomForestClassifier</label><div class="sk-toggleable__content"><pre>RandomForestClassifier(random_state=42)</pre></div></div></div></div></div></div></div></div></div></div>
 <pre class="python"><code># create an object / variable that containes the best hyperparameters, according to our search:
 

diff --git a/treebasedmodels_files/figure-html/unnamed-chunk-8-1.png b/treebasedmodels_files/figure-html/unnamed-chunk-8-1.png