This repository contains the code used to extract co-occurrence networks from a tagged corpus of Shakespeare's plays.

The networks have been analysed using persistent homology, a technique from computational topology. Please refer to our paper

Shall I compare thee to a network? – Visualizing the Topological Structure of Shakespeare's Plays

for more details.

Data

The folder Corpus contains the original corpus that was used to calculate co-occurrence networks. Additional information about the amount of speech between certain characters has been added. Please refer to lexically.net for the original data.
The folder Networks contains the co-occurrence networks for all the plays that we used in the paper. Networks are categorized into speech-based and time-based filtrations. Please refer to the paper for more details.
The folder Plays contains the corrected variants of the plays, sorted into three broad categories.
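
For readers unfamiliar with the term, a co-occurrence network links two characters whenever they appear together in some unit of a play. The following is a minimal sketch of that idea only. It assumes each scene is given as a plain list of character names, which is not the tagged format found in Corpus, and it does not reproduce the speech-based or time-based filtrations from the paper. The function name and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_occurrence_weights(scenes):
    """Count how often each unordered pair of characters shares a scene."""
    weights = Counter()
    for characters in scenes:
        # Deduplicate and sort so that (A, B) and (B, A) map to the same key.
        for pair in combinations(sorted(set(characters)), 2):
            weights[pair] += 1
    return weights


# Hypothetical toy input: each inner list names the characters in one scene.
scenes = [
    ["HAMLET", "HORATIO"],
    ["HAMLET", "HORATIO", "GHOST"],
    ["HAMLET", "GHOST"],
]

for (a, b), weight in sorted(co_occurrence_weights(scenes).items()):
    print(a, b, weight)
```

Each key of the returned counter is an edge of the network and each value is its weight. Pairs that never share a scene simply do not appear.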

Usage

The main script is called co-occurrence.py. Given the filename of a tagged play, it produces a co-occurrence network using the speech-based filtration described in the paper. The network is stored in the current directory. To batch-process all plays automatically, you could for example use:

find ./Plays/ -name "*.txt" -exec ./co-occurrence.py {} \;

This traverses the folder Plays and executes the extraction script for every file. If you want the time-based filtration instead, use the parameter -t, i.e.:

find ./Plays/ -name "*.txt" -exec ./co-occurrence.py {} -t \;

Again, this will result in a set of networks. Note that any existing networks in the current folder will be overwritten.
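
If you prefer to drive the batch run from Python instead of find, something along the following lines should work. This is a sketch under assumptions: the helper names build_commands and run_all are hypothetical, and the script is assumed to be executable and located in the current directory, exactly as in the find calls above. Only the -t flag documented above is used.

```python
import subprocess
from pathlib import Path


def build_commands(plays_dir, script="./co-occurrence.py", time_based=False):
    """Return one argument list per *.txt file found below plays_dir."""
    commands = []
    for play in sorted(Path(plays_dir).rglob("*.txt")):
        command = [script, str(play)]
        if time_based:
            command.append("-t")  # same flag as in the find call above
        commands.append(command)
    return commands


def run_all(plays_dir, time_based=False):
    """Run the extraction script for every play, stopping on the first error."""
    for command in build_commands(plays_dir, time_based=time_based):
        subprocess.run(command, check=True)
```

Calling run_all("./Plays") corresponds to the first find call, and run_all("./Plays", time_based=True) to the second. Building the argument lists separately makes it easy to inspect what would be executed before anything is overwritten.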

Demo

A demo of all the extracted networks is available. The demo uses a simple force-directed graph layout to visualize the network.

Licence

The data and the code are released under an MIT licence. Please refer to the file LICENSE for more information.
