Check out our onboarding website with centralized resources here!
If there are any issues or areas for improvement you would like us to know about, please create a new entry in "Issues".
If you haven't already, fill out this form and join our mailing list. This will keep you up-to-date on the club.
- Download the files in this repo by clicking Code (the green button near the top) -> Download ZIP and unzip the files into a folder. You can of course also fork the repo if you have experience with Git.
- Follow the general setup guide.
- Follow the Git setup guide.
For most people, (3) is the hardest part of the tutorial! If you feel frustrated, know it is normal. Come see us at tutorials or office hours and we will help you out.
If you have trouble with the General Setup, you can follow the Google Colab setup guide and use Colab to complete the tutorials.
If you have trouble with the Git Setup, you can upload your files to GitHub directly: go to your GitHub repository and choose Add file -> Upload files.
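If you downloaded the ZIP and would rather unpack it from Python than from your file manager, here is a minimal sketch using the standard-library `zipfile` module. The archive name `onboarding-main.zip` and the destination folder `mdst-onboarding` are placeholders, not names taken from this repo; substitute whatever your browser actually saved.

```python
import zipfile
from pathlib import Path


def unzip_repo(zip_path, dest_dir):
    """Extract a downloaded repo ZIP into dest_dir and return its top-level entries."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return sorted(p.name for p in dest.iterdir())


if __name__ == "__main__":
    # Placeholder names: replace with the actual file in your Downloads folder.
    archive = Path("onboarding-main.zip")
    if archive.exists():
        print(unzip_repo(archive, "mdst-onboarding"))
    else:
        print(f"{archive} not found; download the ZIP first.")
```

Unzipping through your operating system works just as well; this is only a convenience if you are already in a Python session.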
Get started with tutorial0 and checkpoint0 in the tutorial0 folder, then move on to tutorial1 and checkpoint1 in the tutorial1 folder. We recommend working through each tutorial before attempting the corresponding checkpoint. However, if you have prior experience, feel free to skip part or all of a tutorial.
The Data-Visualization folder contains materials for those who want to get a head start. pandas.ipynb is a very brief introduction to the plotting tools built into pandas. The AnatomyofMatplotlib folder contains a comprehensive tutorial for the Matplotlib library, which most beginner projects use and which is foundational to other data visualization packages such as seaborn.
We also highly recommend that you look into Python virtual environments. You can do this at the beginning or after you complete the checkpoints. Our members have made resources explaining them here.
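As a rough illustration of what a virtual environment is, the sketch below creates one with the standard-library `venv` module. The helper name `make_env` and the folder name `.venv` are assumptions chosen for the example (`.venv` is a common convention, not something this repo requires), and the members' resources linked above may recommend a different tool.

```python
import venv
from pathlib import Path


def make_env(env_dir, with_pip=True):
    """Create a virtual environment at env_dir and return its path."""
    env_path = Path(env_dir)
    # with_pip=True also installs pip into the new environment.
    venv.create(env_path, with_pip=with_pip)
    return env_path
```

Calling `make_env(".venv")` from the folder where you unzipped the repo has the same effect as running `python -m venv .venv` in a terminal. You would then activate the environment with `source .venv/bin/activate` on macOS or Linux, or `.venv\Scripts\activate` on Windows, before installing packages.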
There are three optional challenges available to you: Machine Learning, Deep Learning, and RvF. They are located in three separate folders under Optional-Challenges, and you will write your code in the notebooks ending in .ipynb.
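If you are unsure which files are the notebooks, a small sketch like the following lists them. It assumes you run it from the root of the unzipped repo so that the Optional-Challenges folder is in the current directory; `find_notebooks` is a hypothetical helper, not part of the repo.

```python
from pathlib import Path


def find_notebooks(root="Optional-Challenges"):
    """Return every .ipynb file under root, sorted, skipping Jupyter checkpoint copies."""
    root_path = Path(root)
    if not root_path.is_dir():
        return []
    return sorted(
        p for p in root_path.rglob("*.ipynb")
        if ".ipynb_checkpoints" not in p.parts
    )


if __name__ == "__main__":
    for notebook in find_notebooks():
        print(notebook)
```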
You can choose to complete one or more of them. We usually put new members on beginner or intermediate projects for their very first semester, but you may want to work on advanced projects right away if you are experienced with data science. In that case, completion of at least one challenge will be required.
Machine Learning - Loan Approval Prediction
Deep Learning - Titanic
RvF - Computer Vision: Fake Face Detection
These checkpoints are not meant to be selective. Their sole purpose is to give you sufficient foundational knowledge about Python and some important packages so you can start contributing to a project.
For us, success means that everyone who begins the tutorials finishes them. To that end, we will be offering support through Office Hours.
- OHs are not mandatory
We have also created a forum where you can ask questions.
Due: 01/24/2025 11:59pm EST
Submission will open soon...
We are looking for:
- [REQUIRED] Checkpoint 0 and Checkpoint 1. These are assessed by completion and effort, not accuracy.
- [OPTIONAL] Any additional challenges you completed. These are assessed by merit.
All technical or logistical questions MUST be posted on the ED forum. We will not answer those questions over email.
If you have a personal question, email us at [email protected].
Below is a list of relevant Python libraries that are used extensively throughout the checkpoints, challenges, MDST projects, and beyond.
NumPy: https://numpy.org/doc/stable/
Pandas: https://pandas.pydata.org/docs/
Matplotlib: https://matplotlib.org/stable/gallery/index
Scikit-Learn: https://scikit-learn.org/stable/user_guide.html
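Once you have finished the setup guide, a quick check like the one below can tell you whether these four libraries are importable in your current environment. It is only a sketch: `check_libraries` is a hypothetical helper, and the list contains the import names (scikit-learn is imported as `sklearn`).

```python
import importlib.util

# Import names for the libraries listed above (scikit-learn is imported as "sklearn").
LIBRARIES = ["numpy", "pandas", "matplotlib", "sklearn"]


def check_libraries(names=LIBRARIES):
    """Map each import name to True if it can be found in the current environment."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, installed in check_libraries().items():
        print(f"{name}: {'installed' if installed else 'missing'}")
```

If something is reported as missing, revisit the setup guide or ask at office hours rather than debugging alone.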