Python Computing for Data Science

A Graduate Seminar Course at UC Berkeley (AY 250)

Campbell Hall, Mondays 4:10 - 7:00 PM, Spring 2022

Synopsis

Python has become the de facto superglue language for modern scientific computing. In this course we will learn Pythonic interactions with databases, image processing, advanced statistical and numerical packages, web frameworks, machine learning, and parallelism. Each week will involve lectures and coding projects. In the final capstone project, students will build a working codebase useful for their own research domain.

This class is for any student who works in a quantitative discipline and is already familiar with Python. Those who completed the Python Bootcamp or equivalent will be eligible. You should follow the steps to install the Anaconda 3-2021-* distribution as well as git.

Course Schedule

Date | Content | Reading | Leader
Jan 24 (online only) | Numpy, Scipy, & Pandas | scipy §§ 1.3, 1.5, 2.2; numpy; skim chap 4/5 of McKinney | Josh
Jan 31 | Data visualization (Matplotlib, Bokeh, Altair) | Skim Tufte's Visualization book; colormap talk (Scipy 2015) | Josh
Feb 7 | Application building and testing | None | Josh
Feb 14 | Parallelism (asyncio, dask, ray, jax) | None | Josh
Feb 21 | Holiday (no class) | |
Feb 28 | Database interaction (sqlite, postgres, SQLAlchemy); large datasets (xarray, HDF5) | None | Josh
Mar 7 | Machine Learning I (sklearn: regression, classification; dask-learn, auto-ml) | None | Josh
Mar 1428 | Machine Learning II (keras [tensorflow]) | Deep Learning with Keras | Josh
Mar 21 | Spring Break | |
Mar 28 | Interacting with the world (requests, email, IoT/pyserial) | None | Josh
Apr 1 (Friday 10-1pm) | Web frameworks & RESTful APIs, Flask | None | Josh
Apr 4 | No lecture | |
Apr 11 | Bayesian programming & Symbolic math | Probabilistic Programming eBook; install: pip install pymc3 | Josh
Apr 18 | Image processing (OpenCV, skimage) | None | Stefan van der Walt
Apr 25 | Speeding it up (Numba, Cython, wrapping legacy code) | None | Josh
Onward | Final project work | |

Useful Books

Sidebar Concepts

Throughout these lectures we will be peppering in sidebar knowledge concepts:

  • Jupyter & JupyterLab
  • using git & GitHub
  • Docker
  • Data science workflows
  • reproducible research
  • application building
  • debugging
  • testing

Workflow

Each Monday we will introduce a reasonably self-contained topic in two back-to-back lectures, with a short (~20 minute) breakout coding session in between. Homework assignments will require you to write a large (several-hundred-line) codebase.

Help sessions will be conducted interactively on the Piazza site for the course. There is also an in-person help session every TBD. Email Josh with any questions.

Contact

Email us at [email protected] or contact the professor directly ([email protected]). You can also contact the GSI, Ellianna Abrahams, at [email protected]. Auditing is not permitted by the University, but those wishing to sit in on a class or two should contact the professor before attending.