EPIC-narrator is a tool written in Python to annotate actions in videos via narration. This narrator was used in the EPIC-KITCHENS-100 dataset. If you use this narrator, please kindly consider referencing the following:
@ARTICLE{Damen2020RESCALING,
title={Rescaling Egocentric Vision},
author={Damen, Dima and Doughty, Hazel and Farinella, Giovanni Maria and Furnari, Antonino
and Ma, Jian and Kazakos, Evangelos and Moltisanti, Davide and Munro, Jonathan
and Perrett, Toby and Price, Will and Wray, Michael},
journal = {CoRR},
volume = {abs/2006.13256},
year = {2020},
ee = {http://arxiv.org/abs/2006.13256},
}
For the first version of EPIC-KITCHENS we used a live-commentary approach (people narrating actions without pausing the video) to annotate the untrimmed videos. The narration timestamps served as an initial ground to collect action boundaries and object boxes.
As we collected the dataset, we wondered if the rough narration timestamps could be used to supervise action recognition models. We showed this is possible in our CVPR 2019 paper (see the project webpage).
While working on the paper we realised we were missing many actions in the videos. This was because annotators did not pause the videos while they were speaking, and as a result they were naturally not able to narrate actions that were happening as they spoke. Moreover, the timestamps were often not well aligned with the videos.
For the EPIC-KITCHENS extension we wanted to fix these issues, i.e. we wanted dense and precise narration timestamps. The EPIC Narrator was thus born! Our tool dramatically increased annotation density and precision. You can find more about the benefits of using the narrator in our arXiv paper.
VLC player must be installed in your system, regardless of your OS (unless you use flatpak).
Download the flatpak bundled with all the dependencies here
To use the flatpak bundle you will first need to install Flatpak on your Linux distro.
To install the narrator flatpak:
flatpak install epic_narrator.flatpak
To run it, just search for EPIC Narrator in your apps.
Alternatively, you can run it from the command line like this:
flatpak run uk.ac.bris.epic.narrator
If the flatpak installation fails with an error like this:
error: The application uk.ac.bris.epic.narrator/x86_64/master requires the runtime
org.gnome.Platform/x86_64/3.36 which was not found
Try running the following:
flatpak remote-add --if-not-exists flathub https://flathub.org/repo/flathub.flatpakrepo
and then run flatpak install epic_narrator.flatpak again. It should prompt you to download the missing dependencies.
If you don't want to use flatpak, use conda with the provided environment to install the necessary dependencies:
conda env create -f environment.yml
Alternatively, you can try to install the necessary modules yourself:
- GTK 3
- python-vlc
- pysoundfile
  - This requires the libsndfile library to be installed in your system. In Linux this should be available in all distributions. For Windows and MacOS, pip install pysoundfile should install the library automatically for you.
- pysounddevice
  - This requires PortAudio to be installed in your system. In Linux this should be available in all distributions. For Windows and MacOS, pip install sounddevice should install the library automatically for you.
- PyGObject
- matplotlib
- PyYAML
Note that the narrator works with Python 3 only.
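If you install the modules yourself, a quick check of what Python can actually find may save some debugging. The snippet below is a minimal sketch and not part of the narrator: `check_dependencies` and `REQUIRED_MODULES` are hypothetical names, and the mapping uses the usual import names of the packages listed above (for example `gi` for PyGObject and `yaml` for PyYAML).

```python
import importlib.util

# Usual import names for the packages listed above.
# GTK 3 itself is a system library reached through PyGObject's `gi` module.
REQUIRED_MODULES = {
    "PyGObject": "gi",
    "python-vlc": "vlc",
    "pysoundfile": "soundfile",
    "sounddevice": "sounddevice",
    "matplotlib": "matplotlib",
    "PyYAML": "yaml",
}


def check_dependencies(modules=REQUIRED_MODULES):
    """Return {package name: True/False} depending on whether the module can be found."""
    return {
        package: importlib.util.find_spec(module) is not None
        for package, module in modules.items()
    }


if __name__ == "__main__":
    for package, found in check_dependencies().items():
        print("{:12s} {}".format(package, "ok" if found else "MISSING"))
```

Note that this only checks that the Python modules are importable. It does not verify that the system libraries (GTK 3, libsndfile, PortAudio, VLC) are installed.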
If you experience choppy playback on Linux your VLC is probably not decoding the videos correctly.
Try installing libva1 and libva-{mesa,vdpau}-driver to fix this issue. More on this here
If you don't see some icons or checkboxes, try installing the package adwaita-icon-theme-full via apt or your distribution's package manager.
Use brew and pip to install the dependencies. Note that you should use pip3 (i.e. pip for Python 3.x).
Important: if you use conda, make sure you run conda deactivate before running the commands below (even the base environment must be deactivated).
brew install pygobject3 gtk+3 adwaita-icon-theme
python3 -m pip install matplotlib python-vlc sounddevice soundfile PyYAML
Bear in mind that the brew installation might take a while.
If you get the following error
ERROR: Could not find an activated virtualenv (required)
Try to export the following variable
export PIP_REQUIRE_VIRTUALENV=false
before installing the Python dependencies with python3 -m pip.
Start the program with python epic_narrator.py. Once the program has started:
- Make sure your microphone input is being captured correctly. You can do this by checking the signal displayed in the monitor level. If you don't see any signal as you speak, try selecting a different audio interface (see how below).
- Load the video:
File -> Load video
- Choose where you want to save your recordings. The program will create the folders epic_narrator_recordings/video_name/ under your selected output folder.
- Play the video and narrate actions
Use the playback buttons to pause/play the video, seek backwards and forwards, and mute/unmute the video.
You can use the slider to move across the video.
You can also change the speed of the playback.
To annotate an action press the microphone button. This will pause the video and will start recording your voice immediately. Once you have narrated the action, press the button again to stop the recording and continue annotating. The end of the recording will be delayed by 0.5 seconds to avoid clipping.
Alternatively, if you switch on the option Settings -> Hold to record, you can record while holding down either the record button or the enter key.
You will see all your recorded actions in the right-hand side panel.
- You can jump to the action location by left-clicking on the timestamp.
- You can overwrite a recording by right-clicking on the timestamp (see more below).
- You can also play and delete each recording with the corresponding buttons.
- If you want to play a recording and also jump to the video location at the same time, right-click the recording play button.
Finally, you can listen to the recordings as you watch the video by ticking the box Play recordings with video, which is located next to the time label.
- left arrow: seek backwards
- right arrow: seek forwards
- space bar: pause/play video
- enter: start/stop recording
- delete or backspace: delete the highlighted recording
- m: mute/unmute video
- o: overwrite highlighted recording
You can overwrite a recording by right-clicking on its timestamp in the recording panel.
Alternatively, you can press o, which will overwrite the highlighted recording.
In any case, you will be asked for confirmation before overwriting the recording. The recording will start as soon as you confirm. To stop the recording you will have to either click the record button or press Enter, even if you are using the hold-to-record mode.
To resume recording simply choose the same output folder you previously selected when you annotated the same video. This will automatically load all your recordings.
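To check what the narrator would pick up before resuming, you can look inside the recordings folder yourself. The following is a minimal sketch, not the narrator's own code: `recordings_dir` and `existing_recordings` are hypothetical helpers, and it assumes that `video_name` is the video file name without its extension and that recordings are stored as `.wav` files directly in that folder.

```python
from pathlib import Path


def recordings_dir(output_folder, video_path):
    """Folder layout described above: <output>/epic_narrator_recordings/<video_name>/."""
    # Assumption: video_name is the video file name without its extension.
    return Path(output_folder) / "epic_narrator_recordings" / Path(video_path).stem


def existing_recordings(output_folder, video_path):
    """List the .wav files already present for this video, sorted by file name."""
    folder = recordings_dir(output_folder, video_path)
    if not folder.is_dir():
        return []
    return sorted(folder.glob("*.wav"))
```

For example, `existing_recordings("/data/annotations", "/videos/P01_01.MP4")` would look in `/data/annotations/epic_narrator_recordings/P01_01/` (both paths are made up for illustration). An empty list suggests that a different output folder was used previously.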
Use the Select microphone menu to select the device you want to use.
By default the program will use the first device listed in the menu.
In the unlikely case the program crashes at start time due to some issues with the audio interface, try to launch the program as follows:
python epic_narrator.py --set_audio_device <device_id>
Run python epic_narrator.py --query_audio_devices to get the devices available in your system with their corresponding ids.
For example:
$ python epic_narrator.py --query_audio_devices
0 HDA Intel PCH: ALC3220 Analog (hw:0,0), ALSA (2 in, 0 out)
< 1 HDA NVidia: HDMI 0 (hw:1,3), ALSA (0 in, 8 out)
2 HDA NVidia: HDMI 1 (hw:1,7), ALSA (0 in, 8 out)
3 DELL UZ2315H: USB Audio (hw:2,0), ALSA (2 in, 2 out)
4 sysdefault, ALSA (128 in, 0 out)
> 5 default, ALSA (128 in, 0 out)
python epic_narrator.py --set_audio_device 3
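Only devices with at least one input channel are useful as a microphone, so it can help to filter the listing. The sketch below parses text in the format shown in the example above. The line format is inferred from that example only, the `<` and `>` markers appear to flag the default output and input devices, and `input_devices` is a hypothetical helper rather than part of the narrator.

```python
import re

# One listing line: optional marker, device id, name, host API, channel counts.
DEVICE_LINE = re.compile(
    r"^\s*(?P<mark>[<>*])?\s*(?P<id>\d+)\s+(?P<name>.+),\s*(?P<api>[^,()]+?)\s*"
    r"\((?P<ins>\d+) in, (?P<outs>\d+) out\)\s*$"
)


def input_devices(listing):
    """Return [(device_id, name), ...] for devices reporting at least one input channel."""
    devices = []
    for line in listing.splitlines():
        match = DEVICE_LINE.match(line)
        if match and int(match.group("ins")) > 0:
            devices.append((int(match.group("id")), match.group("name")))
    return devices


if __name__ == "__main__":
    example = """\
  0 HDA Intel PCH: ALC3220 Analog (hw:0,0), ALSA (2 in, 0 out)
< 1 HDA NVidia: HDMI 0 (hw:1,3), ALSA (0 in, 8 out)
  3 DELL UZ2315H: USB Audio (hw:2,0), ALSA (2 in, 2 out)
> 5 default, ALSA (128 in, 0 out)
"""
    for device_id, name in input_devices(example):
        print(device_id, name)
```

Any id returned this way is a candidate for --set_audio_device. The HDMI outputs in the example are skipped because they report 0 input channels.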
Recordings will be saved in uncompressed mono format (.wav), sampled at the default sample rate of your input audio interface.
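Because the sample rate depends on your audio interface, it can be worth inspecting a recording before processing it further. The sketch below uses only Python's standard `wave` module. `describe_wav` is a hypothetical helper, and it assumes the file is PCM-encoded, since the `wave` module does not read other WAV encodings.

```python
import wave


def describe_wav(path):
    """Return (channels, sample_rate_hz, duration_seconds) for a PCM .wav file."""
    with wave.open(str(path), "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
        duration = wav.getnframes() / float(rate)
    return channels, rate, duration
```

A narrator recording should report 1 channel. The sample rate will be whatever your input device uses by default.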
The narrator will save some settings under a directory named epic_narrator, automatically created in your home directory.
The settings will save the path of the video you narrated last, as well as the output path, the microphone id and a few other things.
The narrator will write event logs to a file under the same settings directory, i.e. <your_home>/epic_narrator/narrator.log.
The logs are saved in a rotating manner. Log files are limited to a maximum of 5MB, for a maximum of 3 files.
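For readers unfamiliar with rotating logs, the behaviour described above can be reproduced with Python's standard `logging.handlers.RotatingFileHandler`. This is only a sketch under assumptions: the README does not say which mechanism the narrator uses, `make_logger` and the logger name are hypothetical, and "a maximum of 3 files" is read here as one active file plus two backups.

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path


def make_logger(log_dir, max_bytes=5 * 1024 * 1024, max_files=3):
    """Build a logger writing to <log_dir>/narrator.log with size-based rotation."""
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    handler = RotatingFileHandler(
        str(log_dir / "narrator.log"),
        maxBytes=max_bytes,
        # backupCount excludes the active file, so 3 files in total means 2 backups.
        backupCount=max_files - 1,
    )
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger = logging.getLogger("narrator_sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger
```

With this setup, once narrator.log reaches the size limit it is renamed to narrator.log.1, the previous narrator.log.1 becomes narrator.log.2, and the oldest backup is discarded.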