Danger, Danger Will Shakespeare!
A sentiment analysis to track the emotional arcs of Shakespeare's plays
The .fasta files are edited versions of Project Gutenberg texts. The edits make the text readable for tokenization and are not meant to infringe on any rights. For full license information, see the original Gutenberg file. Each .fasta header reads as follows:
>character<ACT><SCENE>_<# of times they have spoken this act/scene>
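As a rough illustration of how a header in this layout could be split into its fields, here is a minimal sketch. The exact layout is an assumption: it supposes a header such as `>hamlet31_4` (character name, one digit for the act, one digit for the scene, an underscore, then the speech count). The names `HEADER_RE` and `parse_header` are hypothetical and are not taken from shakespeare_sentiment.py.

```python
import re

# ASSUMPTION: a header looks like ">hamlet31_4", i.e. character name,
# single-digit act, single-digit scene, "_", then the speech count.
HEADER_RE = re.compile(
    r"^>(?P<character>[A-Za-z]+)(?P<act>\d)(?P<scene>\d)_(?P<count>\d+)$"
)


def parse_header(line):
    """Split one header line into character, act, scene and speech count."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized header: {line!r}")
    return {
        "character": match.group("character").lower(),
        "act": int(match.group("act")),
        "scene": int(match.group("scene")),
        "count": int(match.group("count")),
    }


if __name__ == "__main__":
    print(parse_header(">hamlet31_4"))
```

If the real headers use separators or multi-digit scene numbers, the regular expression would need to be adjusted accordingly.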
Currently, the code is specific to Hamlet, but it can eventually be generalized to take in any .fasta-formatted Shakespeare play. This work is ongoing.
Each speech that a character is given is treated as a token. Polarity indicates positive or negative emotion, graphed in red or blue. Values of 0.00 are considered neutral and are not being properly classified in the play. The code produces a .csv file and plots the polarity over time (see graphs below).
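The positive/negative/neutral split described above can be sketched as a small helper. This is only an illustration: it assumes each speech already has a polarity score between -1.0 and 1.0 (the range TextBlob reports), and the function name `classify_polarity`, the `tolerance` parameter, and the sample scores are hypothetical.

```python
def classify_polarity(polarity, tolerance=0.0):
    """Label a polarity score as positive, negative or neutral.

    Scores within +/- tolerance of zero count as neutral; with the
    default tolerance only an exact 0.0 is neutral.
    """
    if polarity > tolerance:
        return "positive"
    if polarity < -tolerance:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    # Illustrative scores only, not real output for Hamlet.
    for score in (0.4, -0.25, 0.0):
        print(score, classify_polarity(score))
```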
- Download or clone the repo and run:
python shakespeare_sentiment.py -f hamlet.fasta
- Additional arguments can select a specific act, scene, and/or character.
To run the entire play (no specific character):
python shakespeare_sentiment.py -f hamlet.fasta
To run a specific act (no specific character, e.g. Act 3), use the -A option followed by the act value (accepts 3, three, or III):
python shakespeare_sentiment.py -f hamlet.fasta -A 3
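One way the three accepted spellings of an act (3, three, III) could be mapped to a single integer is sketched below. This is an assumption about how such normalization might work, not the script's actual code; `normalize_act` and the two lookup tables are hypothetical names, and only acts 1 to 5 are covered.

```python
# Lookup tables for the five acts of a Shakespeare play.
WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}
ROMAN = {"i": 1, "ii": 2, "iii": 3, "iv": 4, "v": 5}


def normalize_act(value):
    """Turn '3', 'three' or 'III' (any letter case) into the integer 3."""
    text = str(value).strip().lower()
    if text.isdigit():
        return int(text)
    if text in WORDS:
        return WORDS[text]
    if text in ROMAN:
        return ROMAN[text]
    raise ValueError(f"unrecognized act value: {value!r}")


if __name__ == "__main__":
    print([normalize_act(v) for v in ("3", "three", "III")])
```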
To run a specific act and scene (no specific character, e.g. Act 3, Scene 1), use -A to specify the act and -S for the scene. Selecting a scene requires both the act and the scene:
python shakespeare_sentiment.py -f hamlet.fasta -A 3 -S 1
To run any combination of acts and scenes for a specific character, add the -C option (e.g., Hamlet):
python shakespeare_sentiment.py -f hamlet.fasta -C hamlet
python shakespeare_sentiment.py -f hamlet.fasta -A 3 -C hamlet
python shakespeare_sentiment.py -f hamlet.fasta -A 3 -S 1 -C hamlet
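For readers curious how the -f, -A, -S and -C options and the "scene requires act" rule could be wired together, here is a minimal argparse sketch. It is not the script's actual implementation: the destination names (`fasta`, `act`, `scene`, `character`) and the helper names `build_parser` and `check_selection` are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser with the four short options used in the commands above."""
    parser = argparse.ArgumentParser(description="Sentiment arcs for a play")
    parser.add_argument("-f", dest="fasta", required=True, help=".fasta play file")
    parser.add_argument("-A", dest="act", default=None, help="act, e.g. 3")
    parser.add_argument("-S", dest="scene", default=None, help="scene, e.g. 1")
    parser.add_argument("-C", dest="character", default=None, help="character name")
    return parser


def check_selection(args):
    """Reject a scene that was given without an act."""
    if args.scene is not None and args.act is None:
        raise ValueError("-S requires -A: a scene needs both act and scene")
    return args


if __name__ == "__main__":
    demo = ["-f", "hamlet.fasta", "-A", "3", "-S", "1", "-C", "hamlet"]
    print(check_selection(build_parser().parse_args(demo)))
```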
The existing code classifies speeches based on the sentiment results of TextBlob. TextBlob was trained on modern movie reviews and isn't optimized for Early Modern English, so the line of blue dots on the 0 mark represents sentences in a speech that the program considered to be neutral statements. These neutral statements are false positive results and artificially pull up the average polarity of the entire play. The lines that the program was unable to parse failed either because of archaic language (“o fie!” 1.2.6) or because the program was not trained on Shakespearean word choice (“He was a man, take him for all in all, I shall not look upon his like again” 1.2.14).

Future work will train the classifiers on Shakespeare’s text (e.g. the sonnets). This process will include labelling specific words in Hamlet that carry stronger negative associations and are common in Shakespeare’s plays (e.g. serpent, foul, fate, ghost, rotten, harrow, villain). Once trained, I expect the overall trend to decline toward largely negative emotions and polarity.
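To make the effect of neutral scores on the average concrete, the sketch below compares the mean polarity with and without the 0.0 values. The scores are made-up illustrative numbers, not results from Hamlet, and `mean_polarity` is a hypothetical helper rather than part of the script.

```python
from statistics import mean


def mean_polarity(polarities, include_neutral=True):
    """Average polarity, optionally skipping scores that are exactly 0.0."""
    kept = [p for p in polarities if include_neutral or p != 0.0]
    if not kept:
        return 0.0
    return mean(kept)


if __name__ == "__main__":
    # Illustrative scores only: two neutral speeches among mostly negative ones.
    scores = [-0.5, 0.0, 0.0, 0.25]
    print(mean_polarity(scores))                         # all speeches
    print(mean_polarity(scores, include_neutral=False))  # neutrals dropped
```

With these sample numbers the mean is closer to zero when the neutral speeches are included, which is the sense in which spurious neutral scores pull a negative average upward.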
Figures: polarity plots for the full play and for Act I, Act II, Act III, Act IV, and Act V.
Figures: polarity plots throughout the play for Hamlet, Horatio, Gertrude, Claudius, Laertes, the Ghost of King Hamlet, Rosencrantz, Guildenstern, and the Players.