Course Description: This course targets undergraduate students, such as juniors and seniors. Nearly every student at UPenn, and in engineering in particular, is using progressively larger datasets to ask scientific questions. This course breaks down how we use data and modeling to ask scientific questions and teaches the basic toolkits for doing so. The goal is to enable any student who needs data to ask questions to identify which computational tools they need and to use existing tools to ask those questions. All teaching is small-group and team-based. The course uses a broad set of data representative of the school. It is open to upper-level undergraduate students who have some knowledge of Python.
Prerequisites: Some knowledge of Python
- W0-1: Why did this happen? Causality
- W2: Why do we need models? Types of models and how to fit them to data
- W3: What if data has problems? Hidden traps: data cleanup, outliers, missing values, data missing not at random
- W4: How do these variables relate to one another?
- W5: What can we expect in new data?
- W6: How can we detect structure in data?
- W7: How can we make predictions with data?
- W8: How do we do a data science project?
- W9: How can algorithms see?
- W10: How can algorithms read text?
- W11: How can data be used in reality?
The contents of this repository are shared under a Creative Commons Attribution 4.0 International License.
Software elements are additionally licensed under the BSD (3-Clause) License.
Derivative works may use the license that is more appropriate to the relevant context.