
Commit

Replace function example with one more relevant to target audience
mariadelmarq committed May 9, 2023
1 parent 0743adc commit b0c4017
Showing 2 changed files with 79 additions and 90 deletions.
4 changes: 2 additions & 2 deletions episodes/02-basics.md
@@ -1,7 +1,7 @@
---
title: Python basics
teaching: 120
exercises: 90
teaching: 90
exercises: 60
---

::::::::::::::::::::::::::::::::::::::: objectives
165 changes: 77 additions & 88 deletions episodes/04-reusable.md
@@ -21,20 +21,15 @@ exercises: 15

## Defining a function

We have already made use of several Python builtin functions like `print`, `list` and `range`.
We have already made use of several Python built-in functions like `print`, `list`, and `range`. In addition to the functions provided by Python, you can write your own. Functions are used when a section of code needs to be repeated several times in a program; writing it once as a function saves you rewriting it. In reality, you rarely need to repeat the _exact same_ code. Usually there will be some variation, for example in the variables the code needs to be run on. Because of this, when you create a function you can specify a set of `parameters` (also called arguments) for the function.

In addition to the functions provided by Python, you can write your own functions.

Functions are used when a section of code needs to be repeated at various different points in a program. It saves you re-writing it all. In reality you rarely need to repeat the exact same code. Usually there will be some variation in variable values needed. Because of this, when you create a function you are allowed to specify a set of `parameters` which represent variables in the function.

In our use of the `print` function, we have provided whatever we want to `print`, as a `parameter`. Typically whenever we use the `print` function, we pass a different `parameter` value.

The ability to specify parameters make functions very flexible.
When we used the `print` function, we provided the text we wanted to `print` as a `parameter`. Typically, whenever we use the `print` function, we pass a different `parameter` value. The ability to specify parameters makes functions very flexible.

```python
def get_item_count(items_str,sep):
    '''
    This function takes a string with a list of items and the character that they're separated by and returns the number of items
    This function takes a string with a list of items and the character that separates the items, and returns the number of items in the list
    '''
    items_list = items_str.split(sep)
    num_items = len(items_list)
@@ -52,22 +47,21 @@ print(get_item_count(items_owned,';'))

Points to note:

1. The definition of a function (or procedure) starts with the def keyword and is followed by the name of the function with any parameters used by the function in parentheses.
2. The definition clause is terminated with a `:` which causes indentation on the next and subsequent lines. All of these lines form the statements which make up the function. The function ends after the indentation is removed.
3. Within the function, the parameters behave as variables whose initial values will be those that they were given when the function was called.
4. functions have a return statement which specifies the value to be returned. This is the value assigned to the variable on the left-hand side of the call to the function. (power in the example above)
5. You call (run the code) of a function simply by providing its name and values for its parameters the same way you would for any builtin function.
1. The definition of a function (or procedure) starts with the keyword _def_ and is followed by the name you wish to give to the function, with any parameters used by the function in between parentheses.
2. The definition clause ends in `:`, which requires the next and subsequent lines to be indented. All of these lines are the statements which make up the function. The function ends where the indentation ends.
3. Within the function, the parameters behave as variables whose initial values will be those that were given when the function was called.
4. Functions usually "return" something, which is the result of the procedure applied to the parameters, and is the value assigned to the variable on the left-hand side of the call to the function. This is specified using the `return` keyword.
5. You call (run the code of) a function by providing its name and values for its parameters, the same way you would for any built-in function.
6. Once the definition of the function has been executed, it becomes part of Python for the current session and can be used anywhere.
7. Like any other builtin function you can use `shift` + `tab` in Jupyter to see the parameters.
8. At the beginning of the function code we have a multiline `comment` denoted by the `'''` at the beginning and end. This kind of comment is known as a `docstring` and can be used anywhere in Python code as a documentation aid. It is particularly common, and indeed best practice, to use them to give a brief description of the function at the beginning of a function definition in this way. This is because this description will be displayed along with the parameters when you use the help() function or `shift` + `tab` in Jupyter.
9. The variable `x` defined within the function only exists within the function, it cannot be used outside in the main program.
7. At the beginning of the function code we have a multiline string denoted by the `'''` at the beginning and end. When such a string is the first statement in a function it is known as a `docstring` and serves as a documentation aid. It is best practice to use one to give a brief description of the function, because this description will be displayed along with the parameters when you use the `help()` function or `shift` + `tab` in Jupyter.
8. Variables that are defined within a function only exist within the function itself; they cannot be used outside in the main program.
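
Several of the points above can be checked with a short sketch. The function is the same `get_item_count` as in the lesson; the `example_items` string is made-up data used only for illustration.

```python
def get_item_count(items_str, sep):
    '''
    Takes a string with a list of items and the character that
    separates the items, and returns the number of items in the list.
    '''
    items_list = items_str.split(sep)  # exists only inside the function
    num_items = len(items_list)        # exists only inside the function
    return num_items


# Hypothetical example data, not taken from the lesson
example_items = 'bicycle;television;radio'

# The returned value is assigned to the variable on the left-hand side
count = get_item_count(example_items, ';')
print(count)

# The docstring is stored with the function; help() shows the same text
print(get_item_count.__doc__)

# Variables defined inside the function are not visible out here
try:
    print(items_list)
except NameError as error:
    print('NameError:', error)
```

The last part is wrapped in `try`/`except` only so that the sketch runs to the end; in a notebook you would simply see the `NameError` message.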

In our `get_item_count` function we have two parameters which must be provided every time the function is used. You need to provide the parameters in the right order or to explicitly name the parameter you are referring to and use the `=` sign to give it a value.
Our `get_item_count` function has two parameters which must be provided every time the function is called. You must either provide the values in the right order, or explicitly name the parameter you are referring to and use the `=` sign to give it a value.

In many cases of functions we want to provide default values for parameters so the user doesn't have to. We can do this in the following way
In many cases, there is a value for a certain parameter that is more likely than others. In that case the value can be set as "default", and that is the value that will be taken if the user does not specify a value.

```python
def get_item_count(items_str,sep=';'):
def get_item_count(items_str, sep=';'):
    '''
    This function takes a string with a list of items and the character that they're separated by and returns the number of items
    '''
@@ -76,14 +70,14 @@ def get_item_count(items_str,sep=';'):
    return num_items


print(get_item_count(items_owned))
print(get_item_count(items_owned)) # Note that the separator is not specified
```

```output
4
```

The only change we have made is to provide a default value for the `sep` parameter. Now if the user does not provide a value, then the value of 2 will be used. Because `items_str` is the first parameter we can specify its value by position. We could however have explicitly named the parameters we were referring to.
The only change we made is to provide a default value for the `sep` parameter. Now if the user does not provide a value, then the value `';'` will be used. Because `items_str` is the first parameter, we can specify its value by position, without having to explicitly name it, but it can be clearer to explicitly name all parameters.

```python
print(get_item_count(items_owned, sep = ','))
@@ -97,91 +91,92 @@ print(get_item_count(items_str = items_owned, sep=';'))

::::::::::::::::::::::::::::::::::::::: challenge

## Volume of a cube
## Exercise

1. Write a function definition to calculate the volume of a cuboid. The function will use three parameters `h`, `w`
and `l` and return the volume.
1. Write a function definition to create an identifier for each survey participant.
The function requires three parameters: `first_name`, `surname`, and their six-digit staff ID number (`id`); and returns an identifier formed by the last letter of the first name, the two middle numbers of the staff ID, and the length of the surname.

2. Supposing that in addition to the volume I also wanted to calculate the surface area and the sum of all of the edges. Would I (or should I) have three separate functions or could I write a single function to provide all three values together?
2. Suppose that in addition to the identifier you also wanted to generate a username that each participant could use to log into a platform where you will display their results, formed by their first name initial plus the whole surname, all in lowercase; and also return their full name as one string.
Would you (or should you) have three separate functions or could you write a single function to provide all three values together?

::::::::::::::: solution

## Solution

- A function to calculate the volume of a cuboid could be:
1. A function to calculate the unique identifier as described in the exercise could be:

```python
def calculate_vol_cuboid(h, w, len):
def generate_identifier(first_name, surname, id):
    """
    Calculates the volume of a cuboid.
    Takes in h, w, len, that represent height, width, and length of the cube.
    Returns the volume.
    Generates an identifier formed by: the last letter of the first name, the two middle numbers of the staff ID, and the length of their surname.
    Takes in first_name, surname, and id; and returns the identifier.
    """
    volume = h * w * len
    return volume
    identifier = first_name[-1] + id[2:4] + str(len(surname))
    return identifier
```

- It depends. As a rule-of-thumb, we want our function to **do one thing and one thing only, and to do it well.**
If we always have to calculate these three pieces of information, the 'one thing' could be
'calculate the volume, surface area, and sum of all edges of a cube'. Our function would look like this:
2. It depends. As a rule of thumb, functions should __do one thing and one thing only, and do it well.__
If you always need these three pieces of information together, the 'one thing' could be
'for each participant, generate an identifier, username, and full name'. In that case, your function could look like this:

```python
# Method 1 - single function
def calculate_cuboid(h, w, len):
def generate_user_attributes(first_name, surname, id):
    """
    Calculates information about a cuboid defined by the dimensions h(eight), w(idth), and len(gth).
    Returns the volume, surface area, and sum of edges of the cuboid.
    Generates attributes needed for survey participant.
    Takes in first_name, surname, and id; and returns an identifier, username, and full name.
    """
    volume = h * w * len
    surface_area = 2 * (h * w + h * len + len * w)
    edges = 4 * (h + w + len)
    return volume, surface_area, edges
    identifier = first_name[-1] + id[2:4] + str(len(surname))
    username = (first_name[0] + surname).lower()
    full_name = first_name + " " + surname
    return identifier, username, full_name
```

It may be better, however, to break down our function into separate ones - one for each piece of information we are
calculating. Our functions would look like this:
It may be better, however, to break your function down: one for each piece of information you are
generating. Your functions could look like this:

```python
# Method 2 - separate functions
def calc_volume_of_cuboid(h, w, len):
def gen_identifier(first_name, surname, id):
    """
    Calculates the volume of a cuboid defined by the dimensions h(eight), w(idth), and len(gth).
    Generates an identifier formed by: the last letter of the first name, the two middle numbers of the staff ID, and the length of their surname.
    Takes in first_name, surname, and id; and returns the identifier.
    """
    volume = h * w * len
    return volume
    identifier = first_name[-1] + id[2:4] + str(len(surname))
    return identifier


def calc_surface_area_of_cuboid(h, w, len):
def gen_username(first_name, surname):
    """
    Calculates the surface area of a cuboid defined by the dimensions h(eight), w(idth), and len(gth).
    Generates a username formed by: their first name initial plus the whole surname, all in lowercase.
    Takes in first_name and surname, and returns the username.
    """
    surface_area = 2 * (h * w + h * len + len * w)
    return surface_area
    username = (first_name[0] + surname).lower()
    return username


def calc_sum_of_edges_of_cuboid(h, w, len):
def display_full_name(first_name, surname):
    """
    Calculates the sum of edges of a cuboid defined by the dimensions h(eight), w(idth), and len(gth).
    Builds the participant's full name.
    Takes in first_name and surname, and returns the full name.
    """
    sum_of_edges = 4 * (h + w + len)
    return sum_of_edges
    full_name = first_name + " " + surname
    return full_name
```

We could then rewrite our first solution:
We could then rewrite our first function, which returns all the attributes needed, so that it calls these smaller functions:

```python
def calculate_cuboid(h, w, len):
def gen_attributes(first_name, surname, id):
    """
    Calculates information about a cuboid defined by the dimensions h(eight), w(idth), and len(gth).
    Returns the volume, surface area, and sum of edges of the cuboid.
    Generates attributes needed for survey participant.
    Takes in first_name, surname, and id; and returns an identifier, username, and full name.
    """
    volume = calc_volume_of_cuboid(h, w, len)
    surface_area = calc_surface_area_of_cuboid(h, w, len)
    edges = calc_sum_of_edges_of_cuboid(h, w, len)
    identifier = gen_identifier(first_name, surname, id)
    username = gen_username(first_name, surname)
    full_name = display_full_name(first_name, surname)

    return volume, surface_area, edges
    return identifier, username, full_name
```
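
Whichever version you choose, a function that returns several values hands them back together as a tuple, which can be unpacked into separate variables. The sketch below repeats the single-function version so that it runs on its own. The participant details are invented for illustration, and it assumes the staff ID is passed as a string (slicing with `id[2:4]` would not work on an integer).

```python
def generate_user_attributes(first_name, surname, id):
    """
    Generates attributes needed for survey participant.
    Takes in first_name, surname, and id (assumed here to be a string);
    and returns an identifier, username, and full name.
    """
    identifier = first_name[-1] + id[2:4] + str(len(surname))
    username = (first_name[0] + surname).lower()
    full_name = first_name + " " + surname
    return identifier, username, full_name


# Hypothetical participant, not taken from the lesson data
attributes = generate_user_attributes("Ana", "Lopez", "123456")
print(attributes)  # a tuple holding the three values

# Unpack the three returned values into three variables
identifier, username, full_name = generate_user_attributes("Ana", "Lopez", "123456")
print(identifier)  # a345
print(username)    # alopez
print(full_name)   # Ana Lopez
```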

:::::::::::::::::::::::::
@@ -190,31 +185,27 @@ def calculate_cuboid(h, w, len):

## Using libraries

The functions we have created above only exist for the duration of the session in which they have been defined. If you start a new Jupyter notebook you will have to run the code to define them again.

If all of your code is in a single file or notebook this isn't really a problem.

There are however many (thousands) of useful functions which other people have written and have made available to all Python users by creating libraries (also referred to as packages or modules) of functions.

You can find out what all of these libraries are and their contents by visiting the main (python.org) site.
The functions we have created above only exist within the Jupyter notebook in which they have been defined, and only for the duration of the session. If you start a new Jupyter notebook you will have to copy and paste the functions in to define them again. If all of your code is in a single file or notebook this isn't really a problem. But if your project gets larger, it can be hard to keep track of where each function is saved.

We need to go through a 2-step process before we can use them in our own programs.
There are many thousands of useful functions which other people have written and have made available to all Python users by creating libraries (also referred to as packages or modules) of functions. You can find out more about existing Python packages by visiting [pypi.org](https://pypi.org/).

Step 1. use the `pip` command from the commandline. `pip` is installed as part of the Python install and is used to fetch the package from the Internet and install it in your Python configuration.
There are several ways to install third-party packages so that you can use them in your own code. Python 3.4 and later include by default a package installer called [pip](https://pypi.org/project/pip/), which can be used to install packages. From a Jupyter notebook, you would use the syntax:

```bash
$ pip install <package name>
```python
!pip install <package_name>
```

pip stands for Python install package and is a commandline function. Because we are using the Anaconda distribution of Python, all of the packages that we will be using in this lesson are already installed for us, so we can move straight on to step 2.

Step 2. In your Python code include an `import package-name` statement. Once this is done, you can use all of the functions contained within the package.

As all of these packages are produced by 3rd parties independently of each other, there is the strong possibility that there may be clashes in function names. To allow for this, when you are calling a function from a package that you have imported, you do so by prefixing the function name with the package name. This can make for long-winded function names so the `import` statement allows you to specify an `alias` for the package name which you must then use instead of the package name.
After installing the package, you still need to "import" the package into your notebook to be able to use the functions contained within the package. This is done by running:
```python
import <package_name>
```

In future episodes, we will be importing the `csv`, `json`, `pandas`, `numpy` and `matplotlib` modules. We will describe their use as we use them.
As all of these packages are produced by third parties independently of each other, there is a strong possibility of clashes in function names; that is, functions in two different packages may have exactly the same name. Therefore, when you call a function from a package that you have imported, you prefix the function name with the package name, which makes it clear which function you are expecting to run. This can make for long-winded function names, though! The `import` statement also allows you to specify an "alias" for the package, which you must then use instead of the full package name. For example:
```python
import numpy as np
```

The code that we will use is shown below
Many aliases (specified after the `as` keyword) are nearly universally adopted conventions used for very popular libraries, and you will almost certainly come across them when searching for example code. In future lessons, we will be importing the `csv`, `json`, `pandas`, `numpy`, and `matplotlib` modules, which we will describe as we use them. The code that we will use to import these packages is:

```python
import csv
@@ -224,9 +215,7 @@ import numpy as np
import matplotlib.pyplot as plt
```

The first two we don't alias as they have short names. The last three we do. Matplotlib is a very large library broken up into what can be thought of as sub-libraries. As we will only be using the functions contained in the `pyplot` sub-library we can specify that explicitly when we import. This saves time and space. It does not effect how we call the functions in our code.

The `alias` we use (specified after the `as` keyword) is entirely up to us. However those shown here for `pandas`, `numpy` and `matplotlib` are nearly universally adopted conventions used for these popular libraries. If you are searching for code examples for these libraries on the Internet, using these aliases will appear most of the time.
Matplotlib is a very large library broken up into what can be thought of as sub-libraries. As we will only be using the functions contained in the `pyplot` sub-library we can specify that explicitly when we import. This saves time and space, and does not affect how we call the functions in our code.
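
The ideas in this section (prefixing, name clashes, aliases, and importing a sub-library) can be tried out without installing anything. The sketch below uses only modules from Python's standard library as stand-ins; the aliases `stats` and `osp` are arbitrary choices for illustration, not established conventions.

```python
import math
import cmath                # a different module that also defines sqrt
import statistics as stats  # alias: use stats.<name> from now on
import os.path as osp       # import only a sub-library, with an alias

# Two modules define a function called sqrt; the prefix says which one runs
print(math.sqrt(16))    # 4.0
print(cmath.sqrt(-4))   # 2j (math.sqrt would raise an error here)

# After aliasing, the alias is the name to use
print(stats.mean([2, 4, 6]))  # 4

# Functions in a sub-library are called through the name given at import
print(osp.basename('data/survey.csv'))  # survey.csv
```

The `import os.path as osp` line follows the same pattern as `import matplotlib.pyplot as plt`: only the sub-library is named, and its functions are then reached through the alias.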

:::::::::::::::::::::::::::::::::::::::: keypoints
