The purpose of this project is for me to practice Python and machine learning.
It is a wrapper for trying out machine learning functions.
The project came to be as I was learning to use several machine learning models and got tired of writing a new script for every single one of them.
With scripts lying all over my computer, I decided to collect them into a single repo.
If you want to test out the project, it requires some basic knowledge of the command line, but the instructions are hopefully clear enough for a quick try.
This is mainly just a place for me to store my code, so it is not stable, but everything in this guide has been tested on Windows.
The project has been tested with these dependency versions:
- Python 3.7.x
You can check whether the correct version of Python is installed by opening the command line (Mac & Linux: Terminal, Windows: Command Prompt or PowerShell) and typing:
python --version
On Mac, the python command will most likely point to Python 2.7. To check the Python 3 version, use python3 --version
If the first numbers in the version are 3.7, it should work. If the first numbers are something else you can try, but if your Python is version 2.x.x (e.g. 2.7.0) it most certainly will not work.
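The same check can be expressed from within Python itself. A minimal sketch; the classification below mirrors the guide's advice, not anything in the project's code:

```python
import sys

def version_status(major, minor):
    """Classify an interpreter version against the tested Python 3.7.x."""
    if (major, minor) == (3, 7):
        return "ok"
    if major == 3:
        return "untested, but may work"
    return "unsupported"

# Report the status of the interpreter running this script.
print(version_status(*sys.version_info[:2]))
```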
Official Python downloads page:
https://www.python.org/downloads/
- Pip
Python's package installer, which is used to install packages.
The official installation guide is quite straightforward:
https://pip.pypa.io/en/stable/installing/ but the required steps are also here:
Download the get-pip.py file, which is used to install pip, via the command line:
$ curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
After the file has downloaded, run:
python get-pip.py
Or, on Mac, if python3 returned the Python 3 version, use:
python3 get-pip.py
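To confirm the install worked, pip should now be importable from the same interpreter. A quick sketch of that check:

```python
import importlib.util

# After running get-pip.py, pip should be importable from this interpreter.
pip_ok = importlib.util.find_spec("pip") is not None
print("pip available" if pip_ok else "pip missing")
```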
If you want to use data stored in a MySQL database:
If you don't care about the MySQL implementation, you can skip this part.
- MySQL 8.0.20
Link to Official MySQL documentation:
https://dev.mysql.com/doc/refman/8.0/en/installing.html
After installing Python
In the command line:
- Open the command line.
If you are using Mac or Linux, it is the application called Terminal; on Windows, Command Prompt or PowerShell.
- Move to your working directory. This is the directory where this project will be stored.
cd <working directory>
- Clone this project into it.
git clone [email protected]:doslindos/ml_stuff.git
- Move to the created project
cd ml_stuff
- Create a virtual environment
virtualenv env
- Activate the created virtualenv
on Mac and Linux: source env/bin/activate
on Windows: env\Scripts\activate
If (env) appeared at the left of the command line prompt, the virtualenv is activated.
- You can check that your virtualenv Python version is the one you want
python --version
- Before installing packages with pip, go to the requirements.txt file and uncomment the version of tensorflow you want to use!
If you do not have GPU support installed, use "tensorflow".
- If everything is in order, install the packages with pip
pip install -r requirements.txt
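After pip finishes, you can sanity-check that the packages are importable. A small sketch; the package names below are examples, substitute whatever your requirements.txt actually installs:

```python
import importlib.util

def find_missing(names):
    """Return the subset of package names the interpreter cannot find."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# "numpy" and "tensorflow" are illustrative, not the full requirements list.
print(find_missing(["numpy", "tensorflow"]))
```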
If the packages were installed without errors, the installation is done.
Now you can pat yourself on the back and jump to the Quick test guide!
This guide is for quickly trying out some models!
First, go to the project directory in the command line (the directory where you downloaded the project during setup, named ml_stuff):
cd <project directory>
Next activate the environment (just like in Setup #6)
This command downloads the MNIST dataset into the project folders (stored in data/handlers/mnist/datasets/mnist/), preprocesses the images, and lastly sets up the model and starts training it.
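The preprocessing step for image data typically just rescales pixel values from 0–255 into the 0–1 range. A toy sketch of the idea; the project's actual pipeline may differ:

```python
def normalize(image):
    """Scale raw 0-255 pixel values into the 0-1 range."""
    return [pixel / 255.0 for pixel in image]

print(normalize([0, 51, 255]))  # [0.0, 0.2, 1.0]
```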
When the training starts, the progress is printed to the terminal. Each line shows:
- the batch number: with the mnist_basic configuration, 50 batches equal the whole training set
- the validation loss: the model's performance on a set of data (the validation set) which it has not seen before. The loss is basically the difference between the true labels and the predictions, so if it goes down, the model's predictions are closer to the true labels.
- the accuracy (not implemented yet! for now it is empty)
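As a toy illustration of how a loss behaves (the project's actual loss function is not stated here, so cross-entropy is an assumption): the more probability the model puts on the true label, the lower the loss.

```python
import math

def cross_entropy(true_label, probs):
    """Negative log of the probability assigned to the true class."""
    return -math.log(probs[true_label])

# A confident correct prediction gives a low loss,
# an unsure prediction gives a higher one.
print(round(cross_entropy(0, [0.9, 0.05, 0.05]), 3))  # 0.105
print(round(cross_entropy(0, [0.4, 0.3, 0.3]), 3))    # 0.916
```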
The training command:
python create.py train -dh mnist -c mnist_basic
After the training is done you can test how the model performs on a test dataset (data which the model has not seen before).
The testing command:
python model_tests.py test_model -dh mnist -test classification_test
First this command will open up a window to select the model to be used.
It will look like this:
After you choose a model (a folder which has a date as its name), the test dataset inputs are run through the model, and the inputs together with the model's outputs are stored in a file (inside the model folder).
This step is done so that you do not have to rerun the model every time you run tests on the same dataset.
After this the confusion matrix is plotted and it looks like this:
In the picture, the x-axis represents the prediction and the y-axis the actual label of the instances.
For example, the block in the upper left corner is the number of instances the model predicted as zeros which in fact are zeros.
The second block to the right of it is the number of instances the model predicted as ones which were actually zeros.
The red and blue numbers at the right and bottom tell the total number of instances in the corresponding row or column.
For example, the red number at the end (right) of the first row (980) tells you the total number of actual zeros in the dataset used.
And the red number at the end (bottom) of the first column (983) tells you the number of times the model predicted an instance to be a zero.
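The row/column totals and the accuracy described above can be reproduced on a toy confusion matrix (the numbers here are made up, not from the MNIST run):

```python
# Rows are actual classes, columns are predicted classes.
cm = [[95, 5],   # actual 0: 95 predicted as 0, 5 predicted as 1
      [10, 90]]  # actual 1: 10 predicted as 0, 90 predicted as 1

row_totals = [sum(row) for row in cm]        # actual instances per class
col_totals = [sum(col) for col in zip(*cm)]  # predictions per class
# Accuracy = correct predictions (the diagonal) over all instances.
accuracy = sum(cm[i][i] for i in range(len(cm))) / sum(row_totals)

print(row_totals)  # [100, 100]
print(col_totals)  # [105, 95]
print(accuracy)    # 0.925
```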
The accuracy and the confusion matrix (in case the plot does not open as shown above) are also printed in the terminal.
TODO
This guide is for those who want to use MySQL data as a dataset or the Spotify API.
If you are using the MySQL functions, make sure that your project has a credentials.ini file with MySQL and Spotify credentials.
Create the file template:
python init.py config
After this you can open the credentials.ini file and add your credentials. (More info about the credentials below.)
Spotify credentials can be created at https://developer.spotify.com/dashboard/
- It requires a Spotify account, but if you have one, just sign in.
- After signing in, go to the Dashboard and click Create an app, fill in the info and click Create.
- The Dashboard should now display your app. Click your app and you can find the CLIENT_ID and CLIENT_SECRET.
- Put the CLIENT_ID and CLIENT_SECRET hashes (weird strings of numbers and letters) into your credentials.ini file, inside the empty strings
'client_id' and 'client_secret'.
Done.
Fill in the MySQL_connector_params.
More information: https://dev.mysql.com/doc/connector-python/en/connector-python-connectargs.html
The connector params are passed straight to the connector, so if you want to add attributes, just add them to the params with the attribute name as the key and its value as the value.
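A sketch of how such a section could be read and handed to the connector. The section name mirrors the guide, but the exact key names and values are assumptions:

```python
import configparser

# Example credentials.ini content (all values here are placeholders).
ini_text = """
[MySQL_connector_params]
user = myuser
password = secret
host = 127.0.0.1
database = mydb
"""

config = configparser.ConfigParser()
config.read_string(ini_text)

# Every key in the section becomes a keyword argument, so the call
# would look like mysql.connector.connect(**params).
params = dict(config["MySQL_connector_params"])
print(params["host"])  # 127.0.0.1
```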
A dataset is created with <command> dataset.
Argument | Flag | Info |
---|---|---|
TODO |
Training is called with <command> train.
Argument | Flag | Info |
---|---|---|
TODO |
Test functions are called with <command> test.
Argument | Flag | Info |
---|---|---|
TODO |