layout | permalink | root |
---|---|---|
reference | /reference/ | .. |
- Python files have the `.py` extension.
- Python code can be written in a text file or a Jupyter Notebook.
  - Jupyter notebooks have the extension `.ipynb`.
  - Jupyter notebooks can be opened from Anaconda or through the command line by entering `$ jupyter notebook`.
  - Markdown and HTML are allowed in markdown cells for documenting code.
- Variables are stored using `=`.
  - Strings are defined in quotations `'...'`.
  - Integers and floating point numbers are defined without quotations.
- Variable names can contain letters, digits, and underscores `_`.
  - They cannot start with a digit.
  - Names that start with underscores should be avoided.
- Use `print(...)` to display values as text.
- Can use indexing on strings.
  - Indexing starts at 0.
  - Position is given in square brackets `[position]` following the variable name.
  - Take a slice using `[start:stop]`. This makes a copy of part of the original string.
    - `start` is the index of the first element.
    - `stop` is the index of the element after the last desired element.
- Use `len(...)` to find the length of a string.
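
The points above on assignment, indexing, and slicing can be sketched as follows. The variable names and values are made up for illustration.

```python
# Assign values to variables with `=` (names and values are illustrative only).
age = 42                 # an integer, written without quotes
weight_kg = 60.5         # a floating point number
atom_name = 'sodium'     # a string, written in quotes

print(atom_name, 'has', len(atom_name), 'characters')

first_char = atom_name[0]      # indexing starts at 0
first_three = atom_name[0:3]   # slice: index 0 up to, but not including, index 3

print(first_char)
print(first_three)
```

The slice is a new string; `atom_name` itself is unchanged.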
- Each value has a type. This controls what can be done with it.
  - `int` represents an integer.
  - `float` represents a floating point number.
  - `str` represents a string.
- To determine a variable's type, use the built-in function `type(...)`, including the variable name in the parentheses.
- Modifying strings:
  - Use `+` to concatenate strings.
  - Use `*` to repeat a string.
  - Numbers and strings cannot be added to one another.
    - Convert string to integer: `int(...)`.
    - Convert integer to string: `str(...)`.
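
A minimal sketch of checking types and converting between strings and integers. The names `count` and `label` are hypothetical.

```python
count = 3
label = 'sample'

print(type(count))   # the type of an integer value
print(type(label))   # the type of a string value

repeated = label * 2            # `*` repeats a string
combined = label + str(count)   # convert the integer before concatenating
total = int('10') + count       # convert the string before adding

try:
    label + count               # mixing str and int directly is an error
except TypeError as error:
    mixed_error = str(error)
    print('TypeError:', mixed_error)
```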
- To add a comment, place `#` before the text you do not wish to be executed.
- Commonly used built-in functions:
  - `min()` finds the smallest value.
  - `max()` finds the largest value.
  - `round()` rounds off a floating point number.
  - `help()` displays documentation for the function in the parentheses.
    - Other ways to get help include holding down `shift` and pressing `tab` in Jupyter Notebooks.
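
As a quick illustration of the built-in functions listed above (the numbers are arbitrary):

```python
# Example values chosen only for illustration.
smallest = min(3, 1, 2)
largest = max(3, 1, 2)
rounded = round(3.712)        # no second argument: rounds to the nearest integer
rounded_1 = round(3.712, 1)   # second argument: number of decimal places to keep

print(smallest, largest, rounded, rounded_1)

# help(round)  # uncomment to display the documentation for round()
```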
- Importing a library:
  - Use `import ...` to load a library.
  - Refer to things from the library by using `module_name.thing_name`.
    - `.` indicates 'part of'.
- To import a specific item from a library: `from ... import ...`
- To import a library using an alias: `import ... as ...`
- Importing the math library: `import math`
  - Example of referring to an item with the module's name: `math.cos(math.pi)`.
- Importing the plotting library as an alias: `import matplotlib as mpl`
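
The three import forms above can be compared side by side using the standard `math` library. The alias `m` is an arbitrary choice for this sketch.

```python
import math               # load the whole library
from math import sqrt     # import one specific item
import math as m          # import the library under an alias

cos_pi = math.cos(math.pi)   # module_name.thing_name
root = sqrt(16)              # no prefix needed after `from ... import ...`
same_pi = m.pi               # the alias refers to the same library

print(cos_pi, root, same_pi)
```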
- Use the pandas library to do statistics on tabular data. Load with `import pandas`.
  - To read in a csv: `pandas.read_csv()`, including the path name in the parentheses.
    - To specify that a column's values should be used as row headings: `pandas.read_csv('path', index_col='column name')`, where path and column name should be replaced with the relevant values.
- To get more information about a DataFrame, use `DataFrame.info()`, replacing `DataFrame` with the variable name of your DataFrame.
- Use `DataFrame.columns` to view the column names.
- Use `DataFrame.T` to transpose a DataFrame.
- Use `DataFrame.describe()` to get summary statistics about your data.
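
A small sketch of reading and inspecting a table. To keep it self-contained, it reads from an in-memory string (via `io.StringIO`) rather than a file on disk, and the station data is invented.

```python
import io
import pandas

# A tiny invented table standing in for a CSV file on disk.
csv_text = """station,temp_jan,temp_jul
north,1.5,20.0
south,3.0,25.5
east,4.5,22.0
"""

# index_col makes the 'station' column the row headings.
data = pandas.read_csv(io.StringIO(csv_text), index_col='station')

data.info()               # column types and non-null counts (prints directly)
print(data.columns)       # column names
print(data.T)             # rows and columns swapped
summary = data.describe() # summary statistics for each numeric column
print(summary)
```

With a real file, the first argument to `pandas.read_csv` would be the path to that file instead.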
- Select data using `[i, j]`.
  - To select by entry position: `DataFrame.iloc[..., ...]`
    - This is inclusive of everything except the final index.
  - To select by entry label: `DataFrame.loc[..., ...]`
    - Can select multiple rows or columns by listing labels.
    - This is inclusive of both ends.
  - Use `:` to select all rows or columns.
- Can also select data based on values using `True` and `False`. This is a Boolean mask.
  - `mask = subset > 10000`
  - We can then use this to select values.
- To use a select-apply-combine operation we use `data.apply(lambda x: x > x.mean())`, where `mean()` can be any operation the user would like to be applied to `x`.
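
The selection styles above can be sketched on a small DataFrame. The labels and numbers below are invented and serve only to show the syntax.

```python
import pandas

# Invented values, used only to show the selection syntax.
data = pandas.DataFrame(
    {'y2000': [5000, 12000, 20000], 'y2010': [9000, 15000, 8000]},
    index=['A', 'B', 'C'],
)

by_position = data.iloc[0:2, 0]         # rows 0 and 1 (2 is excluded), first column
by_label = data.loc['A':'B', 'y2000']   # rows 'A' through 'B', both ends included
all_rows = data.loc[:, 'y2010']         # `:` selects every row

mask = data > 10000                     # DataFrame of True/False values
large_only = data[mask]                 # entries where the mask is False become NaN

# Compare every value with the mean of its own column.
above_mean = data.apply(lambda x: x > x.mean())
print(above_mean)
```

Note that `iloc[0:2]` and `loc['A':'B']` return the same two rows here, even though one range excludes its end and the other includes it.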
- The most widely used plotting library is `matplotlib`.
  - Usually imported using `import matplotlib.pyplot as plt`.
  - To plot we use the command `plt.plot(time, position)`.
  - To create a legend use `plt.legend(['label1', 'label2'], loc='upper left')`.
    - Can also define labels within the plot statements by using `plt.plot(time, position, label='label')`. To make the legend show up, use `plt.legend()`.
  - To label the x and y axes, use `plt.xlabel('label')` and `plt.ylabel('label')`.
- Pandas DataFrames can be used to plot by using `DataFrame.plot()`. Any operations that can be used on a DataFrame can be applied while plotting.
  - To plot a bar plot: `data.plot(kind='bar')`
import matplotlib.pyplot as plt
time = [0, 1, 2, 3]              # example values
position = [0, 100, 200, 300]    # example values
plt.plot(time, position, label='label')
plt.xlabel('x axis label')
plt.ylabel('y axis label')
plt.legend()
- Lists are defined within `[...]`, with values separated by `,`.
  - An empty list can be created by using `[]`.
- Can use `len(...)` to determine how many values are in a list.
- Can index a list just as is done for strings.
  - Indexing can be used to reassign values: `list_name[0] = newvalue`.
- To add an item to a list use `list_name.append()`, with the item to append in the parentheses.
- To combine two lists use `list_name_1.extend(list_name_2)`.
- To remove an item from a list use `del list_name[index]`.
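
A short sketch of the list operations above, applied in sequence. The list name and values are hypothetical.

```python
pressures = []                      # start with an empty list
pressures.append(0.273)             # add single items to the end
pressures.append(0.275)
pressures.extend([0.277, 0.276])    # add every item from another list

pressures[0] = 0.265                # reassign a value by index
del pressures[1]                    # remove the item at index 1

print(pressures, len(pressures))
```

Unlike string slicing, these operations change the list in place rather than producing a copy.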
- Start a for loop with `for number in [1, 2, 3]:`, with the following lines indented.
  - `[1, 2, 3]` is considered the collection.
  - `number` is the loop variable.
  - The indented action following the collection is the body.
- To iterate over a sequence of numbers use `range(start, end)`. The sequence stops just before `end`.
for number in range(0, 5):
    print(number)
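
A loop body can also build up a result across iterations. The sketch below, with a hypothetical accumulator named `total`, sums the numbers produced by `range`.

```python
total = 0
for number in range(1, 5):    # yields 1, 2, 3, 4; the end value 5 is not included
    total = total + number    # the indented body runs once per value
print(total)
```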
- To process several files, use a for loop: `for filename in [file1, file2]:`
- To find a set of files using a pattern use `glob.glob`.
  - Must import first using `import glob`.
  - `*` indicates "match zero or more characters".
  - `?` indicates "match exactly one character".
    - For example: `glob.glob('*.txt')` will find all files that end with .txt in the current directory.
- Combine these by writing a loop using: `for filename in glob.glob('*.txt'):`
import glob
import pandas
for filename in glob.glob('*.txt'):
    data = pandas.read_csv(filename)
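
To show the difference between `*` and `?` without relying on files that already exist, the sketch below creates a few files in a temporary directory first. The file names are hypothetical.

```python
import glob
import os
import tempfile

# Work in a throwaway directory so the example does not touch real files.
with tempfile.TemporaryDirectory() as folder:
    for name in ['a1.txt', 'a22.txt', 'notes.md']:   # hypothetical file names
        with open(os.path.join(folder, name), 'w') as handle:
            handle.write('example\n')

    # `*` matches any number of characters, `?` matches exactly one.
    all_txt = sorted(glob.glob(os.path.join(folder, '*.txt')))
    one_char = sorted(glob.glob(os.path.join(folder, 'a?.txt')))

    matched_names = [os.path.basename(path) for path in all_txt]
    one_char_names = [os.path.basename(path) for path in one_char]

    for filename in all_txt:
        print(os.path.basename(filename))
```

`glob.glob` does not guarantee any particular order, which is why the results are sorted here.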
- Define a function using `def function_name(parameters):`. Replace `parameters` with the variables to use when the function is executed.
- Run by using `function_name(parameters)`.
- To return a result to the caller use `return ...` in the function.
def add_numbers(a, b):
    result = a + b
    return result
add_numbers(1, 4)
- A local variable is defined in a function and can only be seen and used within that function.
- A global variable is defined outside of a function and can be seen or used anywhere after definition.
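
The difference between local and global variables can be sketched as follows. The function `scaled_sum` and the variable `scale` are hypothetical names for this example.

```python
scale = 10                       # global: defined outside any function

def scaled_sum(a, b):
    result = (a + b) * scale     # `result` is local; `scale` is read from outside
    return result

answer = scaled_sum(1, 4)
print(answer)

try:
    print(result)                # `result` does not exist outside the function
except NameError:
    local_is_hidden = True
    print('result is not visible here')
```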
- Conditionals are defined similarly to a loop, using `if variable conditional value:`.
  - For example, `if variable > 5:`.
- Use `elif` for additional tests.
- Use `else:` for when none of the preceding tests is true.
- Can combine more than one conditional by using `and` or `or`.
- Often used in combination with for loops.
- Conditions that can be used:
  - `==` equal to.
  - `>=` greater than or equal to.
  - `<=` less than or equal to.
  - `>` greater than.
  - `<` less than.
for m in [3, 6, 7, 2, 8]:
    if m > 5:
        print(m, 'is large')
    elif m == 5:
        print(m, 'is 5')
    else:
        print(m, 'is small')
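
Combining tests with `and` and `or` might look like the sketch below. The category labels are invented for illustration.

```python
labels = []
for mass in [3, 6, 7, 2, 8]:
    if mass > 5 and mass % 2 == 0:     # both tests must be true
        labels.append('large and even')
    elif mass < 3 or mass == 7:        # at least one test must be true
        labels.append('tiny or seven')
    else:                              # runs when no earlier test was true
        labels.append('other')
print(labels)
```

Only the first branch whose test is true runs, so the order of the tests matters.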
- Document your code.
- Use clear and meaningful variable names.
- Follow the PEP8 style guide when setting up your code.
- Use assertions to check for internal errors.
- Use docstrings to provide help.
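
A minimal sketch combining a docstring with an assertion. The function `mean_of` is a hypothetical example, not part of any library.

```python
def mean_of(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    # Check an internal assumption before doing the calculation.
    assert len(values) > 0, 'values must not be empty'
    return sum(values) / len(values)

average = mean_of([2, 4, 6])
print(average)

# help(mean_of)  # uncomment to display the docstring above
```

Calling `mean_of([])` would stop with an `AssertionError` carrying the message given after the comma.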
{:auto_ids}
Arguments : Values passed to functions.
Array : A container holding elements of the same type.
Boolean : A value that is either `True` or `False`.
DataFrame : The way Pandas represents a table; a collection of series.
Element : An item in a list or an array. For a string, these are the individual characters.
Function : A block of code that can be called and re-used elsewhere.
Global variable : A variable defined outside of a function that can be used anywhere.
Index : The position of a given element.
Jupyter Notebook : Interactive coding environment allowing a combination of code and markdown.
Library : A collection of files containing functions used by other programs.
Local Variable : A variable defined inside of a function that can only be used inside of that function.
Mask : A boolean object used for selecting data from another object.
Method : An action tied to a particular object. Called by using `object.method`.
Modules : The files within a library containing functions used by other programs.
Parameters : Variables used when executing a function.
Series : A Pandas data structure to represent a column.
Substring : A part of a string.
Variables : Names for values.