Cloujera lets you do a fine-grained search for spoken words in Coursera's videos. It does this by running full-text searches over the transcripts of the videos on Coursera.
- Bring up Vagrant (elasticsearch + redis):
vagrant up
- Compile the ClojureScript (make sure you have Java >1.7):
lein cljsbuild once
- Start the app:
lein run
- On the first run, visit http://127.0.0.1:8080/burglar/go to seed the db
  (it will error out ridiculously with an IndexMissingException from
  elasticsearch if you don't do this!).
$ vagrant ssh
$ cd /vagrant
$ ./scripts/deploy.sh
NOTE: the address to access the dockerized cloujera is http://127.0.0.1:8081 (see Vagrantfile).
$ vagrant ssh
$ cd /vagrant
$ source ./scripts/prod-env.sh
$ lein uberjar
$ java -jar ./target/uberjar/cloujera-*-standalone.jar
NOTE: the uberjarred cloujera runs on port 8080 in the VM; the address to access it is http://127.0.0.1:8082 (see Vagrantfile).
Visiting http://cloujera.whatever/burglar/go scrapes some 10 courses to get you started.
To scrape another course, you need to:
- Visit the Coursera session API https://api.coursera.org/api/catalog.v1/sessions and choose a course
- Sign up for the course and agree to the honour code manually for the [email protected] user
- Find the video lecture URL (videoLecturesURL)
- Perform an HTTP POST to http://cloujera.whatever/burglar/raid with this payload (JSON): { "url": videoLecturesURL }
For example:
{ "url": "https://class.coursera.org/apcalcpart1-001/lecture" }
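The POST step above can be sketched with curl. This is a minimal sketch, not part of the repo: the function names raid_payload and raid_course and the CLOUJERA_HOST variable are hypothetical, the Content-Type header is an assumption (the source only says the payload is JSON), and the URL is assumed to contain no double quotes or backslashes, since no JSON escaping is done.

```shell
# Build the JSON payload for /burglar/raid from a video lecture URL.
# Assumes the URL needs no JSON escaping (no quotes or backslashes).
raid_payload() {
  printf '{ "url": "%s" }' "$1"
}

# POST the payload to the raid endpoint.
# CLOUJERA_HOST is a hypothetical override; it defaults to the
# placeholder host used in this README.
raid_course() {
  curl -sS -X POST \
    -H 'Content-Type: application/json' \
    --data "$(raid_payload "$1")" \
    "${CLOUJERA_HOST:-http://cloujera.whatever}/burglar/raid"
}
```

With these defined, running raid_course https://class.coursera.org/apcalcpart1-001/lecture would submit the example course shown above.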
$ ssh user@cloudmachine
$ git clone https://github.com/vise890/cloujera
$ cd cloujera
$ sudo ./scripts/provision.sh
# in the cloujera directory...
$ ./scripts/deploy.sh
NOTE: deploy.sh pulls the most recent version of cloujera from the repo.
Ensure that all the containers are running in the VM:
$ vagrant ssh
$ sudo docker ps -a
You should see redis, elasticsearch and cloujera running.
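If you would rather not eyeball the listing, the check can be scripted. The sketch below assumes the containers are named exactly redis, elasticsearch and cloujera, and that your Docker version supports the --format option of docker ps (older releases may not); missing_containers is a hypothetical helper, not something shipped in the repo.

```shell
# Read running container names from stdin (one per line) and print
# every expected name that is absent. Prints nothing when all are up.
# The expected names are an assumption based on this README.
missing_containers() {
  running="$(cat)"
  for name in redis elasticsearch cloujera; do
    # -x matches the whole line, so "redis" does not match "redis-old"
    printf '%s\n' "$running" | grep -qx "$name" || printf '%s\n' "$name"
  done
}
```

Inside the VM it could be used as: sudo docker ps --format '{{.Names}}' | missing_containers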
$ vagrant ssh
$ sudo docker logs cloujera
Visit http://localhost:9200/; you should see status: 200.
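The same check can be done from a terminal. This is a rough sketch that assumes the root endpoint returns a JSON body containing a "status" field with the value 200, as described above; es_status_ok is a hypothetical helper name.

```shell
# Succeed (exit 0) if the JSON on stdin contains "status": 200,
# tolerating any whitespace around the colon.
es_status_ok() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*200'
}
```

For example: curl -s http://localhost:9200/ | es_status_ok && echo "elasticsearch is up"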
redis-cli will drop you into a Redis shell. Some useful commands are: INFO, MONITOR, HELP, HELP @server.
NOTE: this works from the host as well as in the Vagrant VM.
$ vagrant ssh   # or: ssh user@cloudbox
$ sudo docker exec -i -t cloujera bash
- lein run doesn't give any output initially
- lein run doesn't reload