In this project, I aim to build a sequence model using an LSTM to automatically produce musical compositions. The model is currently capable of generating music by predicting the most likely next note given a sequence of notes, based on the compositions it has been trained on. Separately, the model also predicts the duration of each note and the time difference between notes in the same manner.
To get started, please ensure you have the packages below installed.
- Keras==2.3.1
- Numpy==1.18.2
- Pandas==1.0.3
- Music21==5.7.2
- Pygame==1.9.6
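The pinned versions above can be installed in one go; this is a minimal sketch assuming a standard pip setup (a `requirements.txt` file would work equally well):

```shell
# Install the pinned dependencies listed above
pip install Keras==2.3.1 Numpy==1.18.2 Pandas==1.0.3 Music21==5.7.2 Pygame==1.9.6
```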
Here is a summary of what the code in main.py does.
- Load the required packages (see the prerequisites listed above)
- Define music player functions to play .mid files from a folder, as well as in-session MIDI objects, using the Pygame package
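Playing a .mid file from disk might look like the sketch below. The function name `play_midi_file` is illustrative, not necessarily what main.py uses; it relies on pygame's `mixer.music` module, which supports MIDI playback:

```python
import pygame


def play_midi_file(path):
    """Play a .mid file from disk; blocks until playback finishes."""
    pygame.mixer.init()
    pygame.mixer.music.load(path)
    pygame.mixer.music.play()
    # Poll until the track has finished playing
    while pygame.mixer.music.get_busy():
        pygame.time.wait(100)
    pygame.mixer.quit()
```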
- Define a function to visualise model loss over epochs
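A loss-over-epochs plot can be drawn from the `History` object that Keras's `model.fit` returns; the sketch below (function name `plot_loss` is my own) uses matplotlib, which is not in the prerequisites list, so treat it as an assumption:

```python
import matplotlib.pyplot as plt


def plot_loss(history):
    """Plot training (and validation, if present) loss per epoch.

    `history` is the object returned by Keras `model.fit`.
    """
    plt.plot(history.history["loss"], label="train loss")
    if "val_loss" in history.history:
        plt.plot(history.history["val_loss"], label="val loss")
    plt.xlabel("epoch")
    plt.ylabel("loss")
    plt.legend()
    plt.show()
```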
- Load .mid tracks from the /Music folder and break the data down into the following lists: a) notes, b) durations, c) deltas (offsets)
- One-hot encode our data, e.g. say we have a total of 40 possible notes in a loaded track that is 100 notes long, then the one-hot encoded array will have shape (100, 40). Additionally, we break the track into overlapping segments of 50 notes each, i.e. the first segment covers positions 0 to 49, the second positions 1 to 50, and so on, each paired with the note that immediately follows it. In this example that yields 100 − 50 = 50 segments, so the final notes array has shape (50, 50, 40): 50 segments, each 50 notes long, each note one of 40 possibilities.
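The encoding and windowing step can be sketched in NumPy as follows; the function name `make_sequences` is illustrative, and it pairs each window with the note that follows it as the training target:

```python
import numpy as np


def make_sequences(notes, seq_len=50):
    """One-hot encode a note list and slice it into overlapping
    windows of `seq_len` notes, each paired with the next note."""
    vocab = sorted(set(notes))
    note_to_idx = {n: i for i, n in enumerate(vocab)}

    # One-hot matrix: one row per note in the track, one column per class
    encoded = np.zeros((len(notes), len(vocab)), dtype=np.float32)
    for i, n in enumerate(notes):
        encoded[i, note_to_idx[n]] = 1.0

    # Sliding window with step 1: inputs and next-note targets
    X = np.stack([encoded[i:i + seq_len] for i in range(len(notes) - seq_len)])
    y = encoded[seq_len:]
    return X, y, vocab
```

With a 100-note track, 40 distinct notes and `seq_len=50`, `X` comes out with shape (50, 50, 40) as described above.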
- We do the same for durations and deltas, then stack the notes, duration and delta arrays into one
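The stacking step amounts to concatenating the three one-hot encodings along the feature axis; the class counts below are made-up numbers for illustration:

```python
import numpy as np

# Illustrative shapes: 50 segments of 50 steps; 40 note classes,
# 8 duration classes and 8 delta classes (numbers are made up)
X_notes = np.zeros((50, 50, 40))
X_dur = np.zeros((50, 50, 8))
X_delta = np.zeros((50, 50, 8))

# Stack the three one-hot encodings feature-wise into a single input
X = np.concatenate([X_notes, X_dur, X_delta], axis=-1)
print(X.shape)  # (50, 50, 56)
```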
- Define our LSTM model architecture with one input and three outputs, corresponding to our notes, duration and delta (offset) predictions
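A one-input, three-output architecture can be expressed with the Keras functional API along these lines; the layer sizes, dropout rate and class counts are assumptions for the sketch, not the exact values used in main.py:

```python
from keras.layers import LSTM, Dense, Dropout, Input
from keras.models import Model


def build_model(seq_len=50, n_notes=40, n_durations=8, n_deltas=8):
    """One stacked one-hot input, three softmax heads."""
    inputs = Input(shape=(seq_len, n_notes + n_durations + n_deltas))
    x = LSTM(256, return_sequences=True)(inputs)
    x = Dropout(0.3)(x)
    x = LSTM(256)(x)
    # One classification head per predicted quantity
    notes_out = Dense(n_notes, activation="softmax", name="notes")(x)
    dur_out = Dense(n_durations, activation="softmax", name="duration")(x)
    delta_out = Dense(n_deltas, activation="softmax", name="delta")(x)
    model = Model(inputs, [notes_out, dur_out, delta_out])
    model.compile(loss="categorical_crossentropy", optimizer="adam")
    return model
```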
- Train the model
- Use the model to "create" a new song 100 notes long: a) randomly pick a seed sequence from the training data, b) use that sequence to predict the next note, duration and delta, c) remove the first note from the sequence and append the predicted values to the end, d) use this new sequence to predict the next note, duration and delta, e) repeat until we have 100 predicted notes, f) translate the predicted notes into MIDI format and save the file
- Training a model on more songs tends to produce worse results, because the songs are probably not all written in the same key or tempo
- For training, pick songs that are neither too fast nor too slow, preferably with a clear melody, as opposed to blues, jazz, bossa nova, etc.
Here are some sample clips from trained models. Try to listen to the clips in order so that you can fully appreciate the benefit of adding the extra predictive capabilities (note duration + note offset) to the LSTM model.
- LSTM model predicts pitch only
  - Sample output
- LSTM model predicts pitch + note duration
  - Sample output
- LSTM model predicts pitch + note duration + offset between notes
  - Sample output A
  - Sample output B
  - Sample output B (YouTube link)
I welcome anyone to contribute to this project, so if you are interested, feel free to add your code. Alternatively, if you are not a programmer but would still like to contribute, please click on the request feature button at the top of the page and provide your valuable feedback.
- Adding silence between notes (part of updates on 20/05/2020)
- Predict loudness/softness of notes
- Predict complementing notes from other instruments, e.g. violin
- The code is currently a bit of a mess, as I hacked through most of it
- The quality of the predicted notes seems to have suffered after the addition of the note duration and note offset prediction capabilities