datacarpentry · bencomp · Jun 12, 2023 · May 30, 2023 · May 30, 2023
diff --git a/episodes/02-working-with-openrefine.md b/episodes/02-working-with-openrefine.md
@@ -114,6 +114,10 @@ OpenRefine interface.
 
 Here we will use faceting to look for potential errors in data entry in the `village` column.
 
+::::::::::::::::::::::::: challenge
+
+### Finding (potential) errors
+
 1. Scroll over to the `village` column.
 2. Click the down arrow and choose `Facet` > `Text facet`.
 3. In the left panel, you'll now see a box containing every unique value in the
@@ -131,7 +135,7 @@ Here we will use faceting to look for potential errors in data entry in the `vil
 
 :::::::::::::::  solution
 
-## Solution
+### Solution
 
 - `Chirdozo` is likely a mis-entry of `Chirodzo`.
 - `Ruca` is likely a mis-entry of `Ruaca`.
@@ -141,13 +145,15 @@ Here we will use faceting to look for potential errors in data entry in the `vil
   mistyped entries in a later exercise.
 - The entry `49` is almost certainly an error but you will not be able to fix
   it by reference to other data.
-
 
 :::::::::::::::::::::::::
 
+:::::::::::::::::::::::::::::::::::
+
+
 :::::::::::::::::::::::::::::::::::::::  challenge
 
-## Exercise
+### Interview collection dates
 
 1. Using faceting, find out how many different `interview_date` values there
   are in the survey results.
@@ -162,7 +168,7 @@ Here we will use faceting to look for potential errors in data entry in the `vil
 
 :::::::::::::::  solution
 
-## Solution
+### Solution
 
 For the column `interview_date` do `Facet` > `Text facet`. A box will
 appear in the left panel showing that there are 19 unique entries in
@@ -284,14 +290,14 @@ and the quotes.
 
 :::::::::::::::::::::::::::::::::::::::  challenge
 
-## Exercise
+### Remove unwanted characters
 
 Use this same strategy to remove the single quote marks (`'`), the
 right square brackets (`]`), and spaces from the `items_owned` column.
 
 :::::::::::::::  solution
 
-## Solution
+### Solution
 
 1. `value.replace("'", "")`
 2. `value.replace("]", "")`
@@ -315,14 +321,14 @@ You should now see a new text facet box in the left-hand pane.
 
 :::::::::::::::::::::::::::::::::::::::  challenge
 
-## Exercise
+### Commonly owned items
 
 Which two items are the most commonly owned? Which are the two
 least commonly owned?
 
 :::::::::::::::  solution
 
-## Solution
+### Solution
 
 Select `Sort by:` `count`. The most commonly owned items are
 mobile phone and radio, the least commonly owned are cars and computers.
@@ -334,15 +340,15 @@ mobile phone and radio, the least commonly owned are cars and computers.
 
 :::::::::::::::::::::::::::::::::::::::  challenge
 
-## Exercise
+### Month(s) when farmers lacked food
 
 Perform the same clean up steps and customized text faceting for
 the `months_lack_food` column. Which month(s) were farmers
 more likely to lack food?
 
 :::::::::::::::  solution
 
-## Solution
+### Solution
 
 All four cleaning steps can be performed by combining `.replace`
 statements. The command is:
@@ -357,7 +363,7 @@ November was the most common month for respondents to lack food.
 
 :::::::::::::::::::::::::::::::::::::::  challenge
 
-## Exercise
+### Clean up other columns
 
 Perform the same clean up steps for the `months_no_water`, `liv_owned`,
 `res_change`, and `no_food_mitigation` columns.
@@ -376,7 +382,7 @@ provides `Undo` and `Redo` operations to make this easy.
 
 :::::::::::::::::::::::::::::::::::::::  challenge
 
-## Exercise
+### Explore undo and redo
 
 1. Click where it says `Undo / Redo` on the left side of the screen. All the
   changes you have made so far are listed here.
@@ -388,7 +394,6 @@ provides `Undo` and `Redo` operations to make this easy.
   Before moving on to the next lesson, redo all the steps in your analysis
   so that all of the columns you modified are lacking in square brackets,
   spaces, and single quotes.
-
 
 ::::::::::::::::::::::::::::::::::::::::::::::::::
 
@@ -409,13 +414,18 @@ This is then applied to the data in all columns.
 OpenRefine also provides a menu option to remove blank
 characters from the beginning and end of any entries in the column that you choose.
 
+::::::::::::::::::::::::: challenge
+
+### Remove a trailing space
+
 1. Edit the `village` on the first row to introduce a space at the end, set to `God `.
 2. Create a new text facet for the `village` column. You should now see two
   different entries for `God`, one of which has a trailing whitespace.
 3. To remove the whitespace, choose `Edit cells` > `Common transforms` >
   `Trim leading and trailing whitespace`.
 4. You should now see only four choices in your text facet again.
 
+:::::::::::::::::::::::::::::::::::
 
 
 :::::::::::::::::::::::::::::::::::::::: keypoints
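Note on the `items_owned` challenges in this episode: for readers who think in scripts, the chained `value.replace(...)` cleanup and the "most commonly owned" count can be sketched outside OpenRefine. This is a minimal Python illustration, not part of the lesson. The sample cell values, the `;` separator, and the `clean_cell` helper are assumptions made for the example.

```python
from collections import Counter


def clean_cell(value: str) -> str:
    """Remove brackets, single quotes, and spaces, like chained value.replace calls."""
    for unwanted in ("[", "]", "'", " "):
        value = value.replace(unwanted, "")
    return value


# Hypothetical cell values, invented for illustration only.
rows = [
    "['radio' ;  'mobile_phone']",
    "['mobile_phone' ;  'car']",
    "['radio' ;  'mobile_phone' ;  'computer']",
]

cleaned = [clean_cell(row) for row in rows]

# Split each cleaned cell on the assumed ';' separator and count every item,
# which is roughly what a facet sorted by count displays.
counts = Counter(item for cell in cleaned for item in cell.split(";"))

for item, n in counts.most_common():
    print(item, n)
```

Looping over the unwanted characters keeps the sketch short; in OpenRefine the same effect comes from chaining one `.replace` per character in a single expression.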

diff --git a/episodes/03-filter-sort.md b/episodes/03-filter-sort.md
@@ -25,23 +25,23 @@ There are many entries in our data table. We can filter it to work on a subset
 of the data in the list for the next set of operations. Please ensure you
 perform this step to save time during the class.
 
+:::::::::::::::::::::::::::::::::::::::  challenge
+
+### Using a text filter
+
 1. Click the down arrow next to `respondent_roof_type` > `Text filter`. A
   `respondent_roof_type` facet will appear on the left margin.
 2. Type in `mabat` and press return. There are 58 matching rows of the original
   131 rows (and these rows are selected for the subsequent steps).
 3. At the top, change the view to `Show` 50 `rows`. This way you will see most
   of the matching rows.
-
-:::::::::::::::::::::::::::::::::::::::  challenge
-
-## Exercise
-
-1. What roof types are selected by this procedure?
-2. How would you restrict this to only one of the roof types?
+4. Answer these questions:
+    1. What roof types are selected by this procedure?
+    2. How would you restrict this to only one of the roof types?
 
 :::::::::::::::  solution
 
-## Solution
+### Solution
 
 1. Do `Facet` > `Text facet` on the `respondent_roof_type` column after
   filtering. This will show that two names match your filter criteria.
@@ -53,7 +53,7 @@ perform this step to save time during the class.
 
 ::::::::::::::::::::::::::::::::::::::::::::::::::
 
-### Excluding entries
+## Excluding entries
 
 In addition to the simple text filtering we used above, another way to narrow
 our filter is to `include` and/or `exclude` entries in a facet. You will see
@@ -71,13 +71,13 @@ analysis.
 
 :::::::::::::::::::::::::::::::::::::::  challenge
 
-## Exercise
+### Including and excluding rows using a facet
 
 Use `include / exclude` to select only entries from one of these two roof types.
 
 :::::::::::::::  solution
 
-## Solution
+### Solution
 
 1. In the facet (left margin), click on one of the names, such as
   `mabatisloping`. Notice that when you click on the name, or hover over
@@ -108,14 +108,14 @@ sorting.
 
 :::::::::::::::::::::::::::::::::::::::  challenge
 
-## Exercise
+### Finding GPS altitude outliers
 
 Sort the data by `gps_Altitude`. Do you think the first few entries may have
 incorrect altitudes?
 
 :::::::::::::::  solution
 
-## Solution
+### Solution
 
 In the `gps_Altitude` column, select `Sort...` > `numbers` and select
 `smallest first`. The first few values are all 0. The altitudes are more
@@ -124,8 +124,6 @@ the gps information added automatically by the app. The lack of an altitude
 value suggests that the smartphone was unable to provide it and it
 defaulted to 0.
 
-
-
 :::::::::::::::::::::::::
 
 ::::::::::::::::::::::::::::::::::::::::::::::::::
@@ -150,7 +148,7 @@ only column sorted, then data reverts to its original order.
 
 :::::::::::::::::::::::::::::::::::::::  challenge
 
-## Exercise
+### Finding village "49"
 
 We discovered in an earlier lesson that the value for one of the `village`
 entries was given as 49. This is clearly wrong. By looking at the GPS
@@ -170,15 +168,13 @@ the data in that column was collected from?
 
 :::::::::::::::  solution
 
-## Solution
+### Solution
 
 The interview data for that row is in a small cluster of Chirodzo
 interviews when sorting by GPS coordinates. When sorting by interview date,
 it is also with Chirodzo interviews. In fact, only Chirodzo had interviews
 conducted on that date.
 
-
-
 :::::::::::::::::::::::::
 
 ::::::::::::::::::::::::::::::::::::::::::::::::::
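Note on the filter and sort challenges in this episode: the text filter on `respondent_roof_type` and the sort on `gps_Altitude` can be pictured as a substring match followed by a numeric sort. The following Python sketch is only an analogy for what the OpenRefine interface does. All records, the roof type `mabatipitched`, and the altitude values are hypothetical.

```python
# Hypothetical records, invented for illustration only.
records = [
    {"village": "God", "respondent_roof_type": "grass", "gps_Altitude": 698},
    {"village": "Ruaca", "respondent_roof_type": "mabatisloping", "gps_Altitude": 0},
    {"village": "Chirodzo", "respondent_roof_type": "mabatipitched", "gps_Altitude": 710},
    {"village": "God", "respondent_roof_type": "mabatisloping", "gps_Altitude": 690},
]

# Text filter: keep rows whose roof type contains the typed text.
matches = [r for r in records if "mabat" in r["respondent_roof_type"]]

# Text facet on the filtered rows: list the distinct values that matched.
roof_types = sorted({r["respondent_roof_type"] for r in matches})
print(roof_types)

# Sort by altitude, smallest first, so suspicious values surface at the top.
by_altitude = sorted(records, key=lambda r: r["gps_Altitude"])

# An altitude of exactly 0 is treated here as "probably missing".
suspect = [r for r in by_altitude if r["gps_Altitude"] == 0]
print(suspect)
```

Restricting the result to one roof type, as the include/exclude challenge asks, would correspond to replacing the substring test with an equality test on a single value.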

diff --git a/episodes/04-numbers.md b/episodes/04-numbers.md
@@ -39,24 +39,22 @@ right-justified, and black to green in color.
 
 :::::::::::::::::::::::::::::::::::::::  challenge
 
-## Exercise
+### Transforming column contents to numbers
 
 Transform three more columns, `no_membrs`, `years_liv`, and
 `buildings_in_compound`, from text to numbers. Can all columns be transformed
 to numbers? - Try it with `village` for example.
 
 :::::::::::::::  solution
 
-## Solution
+### Solution
 
 Only observations that include only numerals (0-9) can be transformed to
 numbers. If you apply a number transformation to a column that doesn't meet
 this criteria, and then click the `Undo / Redo` tab, you will see a step
 that starts with `Text transform on 0 cells`. This means that the data in
 that column was not transformed.
 
-
-
 :::::::::::::::::::::::::
 
 ::::::::::::::::::::::::::::::::::::::::::::::::::
@@ -69,7 +67,7 @@ them. We can do that with a `Numeric facet`.
 
 :::::::::::::::::::::::::::::::::::::::  challenge
 
-## Exercise
+### Using a numeric facet
 
 1. For a column you transformed to numbers, edit one or two cells, replacing
   the numbers with text (such as `abc`) or blank (no number or text). You
@@ -82,7 +80,6 @@ them. We can do that with a `Numeric facet`.
   `Non-numeric` and `Blank` if you changed some values.
 4. Experiment with checking or unchecking these boxes to select subsets of
   your data.
-
 
 ::::::::::::::::::::::::::::::::::::::::::::::::::
 

diff --git a/episodes/05-scripts.md b/episodes/05-scripts.md
@@ -30,7 +30,9 @@ files had the same column names, you could save the JSON script, open a new
 file to clean in OpenRefine, paste in the script and run it. This gives you a
 quick way to clean all of your related data.
 
-## Saving your work as a script
+:::::::::::::::::::::::::::::::::::::::  challenge
+
+### Saving your work as a script
 
 1. In the `Undo / Redo` section, click `Extract...`, and select the steps that
   you want to apply to other datasets by clicking the check boxes.
@@ -42,7 +44,12 @@ quick way to clean all of your related data.
   text file. In TextEdit, do this by selecting `Format` > `Make plain text`
   and save the file as a `.txt` file.
 
-## Importing a script to use against another dataset
+::::::::::::::::::::::::::::::::::::::::::::::::::
+
+
+:::::::::::::::::::::::::::::::::::::::  challenge
+
+### Importing a script to use against another dataset
 
 Let's practice running these steps on a new dataset. We'll test this on an
 uncleaned version of the dataset we've been working with.
@@ -54,6 +61,8 @@ uncleaned version of the dataset we've been working with.
 3. Click `Perform operations`. The dataset should now be the same as your other
   cleaned dataset.
 
+::::::::::::::::::::::::::::::::::::::::::::::::::
+
 For convenience, we used the same dataset. In reality you could use this
 process to clean related datasets. For example, data that you had collected
 over different fieldwork periods or data that was collected by different

diff --git a/episodes/06-saving.md b/episodes/06-saving.md
@@ -30,7 +30,9 @@ By default OpenRefine is saving your project continuously. If you close
 OpenRefine and open it up again, you'll see a list of your projects. You can
 click on any one of them to open it up again.
 
-### Exporting
+::::::::::::::::::::::::: challenge
+
+### Exporting the project
 
 You can also export a project. This is helpful, for instance, if you wanted to
 send your raw data and cleaning steps to a collaborator, or share this
@@ -65,6 +67,8 @@ You should see:
 
 :::::::::::::::::::::::::
 
+:::::::::::::::::::::::::::::::::::
+
 You can import an existing project into OpenRefine by clicking `Open...` in the
 upper right > `Import Project` and selecting the `tar.gz` project file. This
 project will include all of the raw data and cleaning steps that were part of

diff --git a/episodes/07-resources.md b/episodes/07-resources.md
@@ -48,15 +48,13 @@ your journey.
 
 :::::::::::::::::::::::::::::::::::::::  challenge
 
-## Exercise
+### Discuss a resource
 
 Visit one of these sites and share what you find with another person.
 
-
 ::::::::::::::::::::::::::::::::::::::::::::::::::
 
 
-
 :::::::::::::::::::::::::::::::::::::::: keypoints
 
 - Other examples and resources online are good for learning more about OpenRefine.