Dumps and analyzes the raw data recorded by Puffer, a TV streaming website and Stanford research study. Anyone can download the daily data, build the analysis programs, and run the pipeline themselves. We encourage the public to replicate our results, posted every day on the Puffer website Results page along with the anonymized raw data. The Data Description webpage explains the format of these results, while this README details the analysis pipeline which produces them.
To set up a machine for the analysis, see scripts/init_data_release_vm.sh. This installs the dependencies listed in scripts/deps.sh, creates a local directory for the results, and builds the analysis programs.
Note that scripts/deps.sh installs packages using sudo, so users may prefer to manage dependencies on their own. Dependencies marked as "private" in the script are not required for users.
The analysis pipeline has been tested on Ubuntu 19.10 and 18.04 (the latter requires slight modifications; see scripts/init_data_release_vm.sh).
Given a date, the pipeline outputs CSVs containing the day’s (anonymized) raw data, as well as stream and scheme statistics. Scheme statistics are calculated over the day as well as several time periods preceding it (week, two-week, month, and experiment duration).
The full pipeline runs daily on the Puffer server, pushing the results to the puffer-data-release bucket. After the results have been uploaded, those who wish to reproduce them may run the "public" portion of the pipeline, using the CSVs in the bucket as input. Users outside the Puffer organization lack the permissions to run the "private" portion of the pipeline.
As shown in the diagram below, the data pipeline has three stages, executed by influx_to_csv, csv_to_stream_stats, and stream_to_scheme_stats, respectively. The final stage requires two metadata files generated by stream_stats_to_metadata: scheme intersection and watch times, described below.
The Puffer server runs the full pipeline via scripts/private_data_release.sh. This script first sets environment variables in scripts/export_constants.sh, then executes the "private" portion of the pipeline, namely scripts/private_entrance.sh. After the private program generates and uploads CSVs containing anonymized raw data, scripts/public_entrance.sh outputs statistics summarizing each stream, as well as each scheme's average performance over all streams. Finally, scripts/upload_public_results.sh uploads all non-private output to the bucket.
In order to reproduce the results generated by the Puffer server, members of the public may download the CSVs for the selected day from the bucket, then run
source ./export_constants.sh ~ <date>
to set up, followed by
scripts/public_entrance.sh
The <date> to be analyzed may be either the most recent available (yesterday) or some past day (YYYY-MM-DD). For instance, if <date> is 2020-02-14, then the corresponding bucket directory (from which the CSVs may be downloaded) is 2020-02-14T11_2020-02-15T11. This directory contains data from 2020-02-14 11AM UTC to 2020-02-15 11AM UTC.
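The mapping from a date to its bucket directory name can be sketched as below. This is a minimal illustration based only on the 2020-02-14 example above; the function name bucket_directory is hypothetical and is not part of the Puffer scripts, and the sketch handles only the YYYY-MM-DD form, not yesterday.

```python
from datetime import date, timedelta


def bucket_directory(day: str) -> str:
    """Return the bucket directory name for a YYYY-MM-DD date string.

    Hypothetical helper: assumes every directory spans 11AM UTC on the
    given day to 11AM UTC on the following day, as in the example above.
    """
    start = date.fromisoformat(day)
    end = start + timedelta(days=1)
    return f"{start.isoformat()}T11_{end.isoformat()}T11"


print(bucket_directory("2020-02-14"))  # 2020-02-14T11_2020-02-15T11
```

Using date arithmetic rather than string manipulation keeps month and year boundaries correct.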
Loading the Puffer video player starts a new streaming "session". Each channel change starts a new "stream" within the session. Each session is randomly assigned a "scheme" on load. A scheme, e.g. Fugu/BBR or Pensieve/Cubic, is the set of adaptive bit rate and congestion control algorithms used for the duration of a session. An "experiment" is the group of schemes from which each session’s scheme is randomly chosen. This group of schemes changes over time, as new experiments are performed. See the Puffer paper for further detail.
The scheme intersection file (i.e. intx_out.txt) lists the days on which all schemes in the desired day’s experiment were run (see Scheme Statistics for why this is necessary).
To produce the intersection file, first an experiment-agnostic scheme schedule (i.e. scheme_days_out.txt) is generated. The schedule enumerates the days each scheme ran, for schemes that appeared in any experiment(s) in the input data. For convenience, stream_stats_to_metadata also outputs a more human-readable scheme schedule in scheme_days_err.txt.
Given an experiment and a scheme schedule, stream_stats_to_metadata filters the scheme schedule to days on which all schemes in the experiment ran. For instance, if the scheme schedule is
{ Fugu/BBR => Jan 1 : Jan 4, BBA/BBR => Jan 4 : Jan 5, Pensieve/BBR => Jan 3 : Jan 5 }
and the experiment is
{ Fugu/BBR, Pensieve/BBR }
then the scheme intersection is { Jan 3 : Jan 4 }, since both schemes in the experiment ran on those days.
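The filtering step in this example amounts to a set intersection over days. The sketch below reproduces the example in Python for illustration only; it is not the stream_stats_to_metadata implementation. The names day_range and scheme_intersection are hypothetical, and the year 2020 is an assumption, since the example gives no year.

```python
from datetime import date, timedelta
from typing import Dict, Iterable, List, Set


def day_range(first: date, last: date) -> Set[date]:
    """All days from first to last, inclusive."""
    return {first + timedelta(days=i) for i in range((last - first).days + 1)}


def scheme_intersection(schedule: Dict[str, Set[date]],
                        experiment: Iterable[str]) -> List[date]:
    """Sorted days on which every scheme in the experiment ran."""
    day_sets = [schedule[scheme] for scheme in experiment]
    if not day_sets:
        return []
    return sorted(set.intersection(*day_sets))


# Scheme schedule from the example above (year assumed for illustration).
schedule = {
    "Fugu/BBR": day_range(date(2020, 1, 1), date(2020, 1, 4)),
    "BBA/BBR": day_range(date(2020, 1, 4), date(2020, 1, 5)),
    "Pensieve/BBR": day_range(date(2020, 1, 3), date(2020, 1, 5)),
}

days = scheme_intersection(schedule, ["Fugu/BBR", "Pensieve/BBR"])
print([d.isoformat() for d in days])  # ['2020-01-03', '2020-01-04']
```

Storing each scheme's days as a set, rather than as a start and end date, also covers schemes that ran on non-contiguous days.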
As shown in the diagram above, the entrance program uses the same scheme schedule and scheme intersection for all time periods leading up to a given day. In fact, these two files can be generated from any input data range inclusive of all desired periods. Since the monthly period includes the weekly and daily periods, the files can be produced once for all periods, using the month’s data as input.
For convenience, stream_to_scheme_stats outputs the experiment and scheme intersection passed to it as arguments; see logs/*scheme_stats_err.txt.
The watch times file is a static list of stream watch times from which to sample while calculating confidence intervals. Slow streams sample from a separate watch times file. The stream_stats_to_metadata program can generate these files, but they also reside in the root of the bucket. Downloading them from the bucket avoids materializing the large amount of input data that stream_stats_to_metadata needs to generate statistically sound watch times files.
A set of CSVs is produced for each day, containing the data collected by Puffer. The Data Description webpage explains the format. The influx_to_csv program anonymizes the raw data, which contains IPs and other user information, assigning numerical identifiers to each session and stream.
A stream statistics file is produced for each day. Each line in this file summarizes a stream’s performance during the day. Stream statistics include SSIM and stall ratio as well as several other metrics.
Stream statistics are output in *stream_stats.txt.
A scheme statistics file is produced for each day, along with the week, two-week, and month period preceding each day (inclusive of the day itself). Each line in this file summarizes a scheme’s stall ratio and SSIM during the time period, as an average over each stream assigned to that scheme during the period.
Averaging is performed only over days on which all schemes in the day’s experiment were run. This is important because network and TV station conditions change over time, so streams running on different days should not be directly compared.
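A minimal sketch of restricting the average to intersection days follows. The record fields (day, scheme, ssim) and the function name mean_ssim_by_scheme are hypothetical, and the plain unweighted mean is an assumption made for illustration; stream_to_scheme_stats may weight streams differently.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def mean_ssim_by_scheme(streams, intersection_days):
    """Average SSIM per scheme, using only streams from intersection days.

    streams: iterable of dicts with hypothetical keys "day", "scheme", "ssim".
    intersection_days: set of days on which all experiment schemes ran.
    """
    by_scheme = defaultdict(list)
    for stream in streams:
        if stream["day"] in intersection_days:
            by_scheme[stream["scheme"]].append(stream["ssim"])
    return {scheme: mean(values) for scheme, values in by_scheme.items()}


# Made-up records; the Jan 2 stream is dropped because it falls
# outside the intersection.
streams = [
    {"day": date(2020, 1, 2), "scheme": "Fugu/BBR", "ssim": 10.0},
    {"day": date(2020, 1, 3), "scheme": "Fugu/BBR", "ssim": 16.0},
    {"day": date(2020, 1, 4), "scheme": "Fugu/BBR", "ssim": 17.0},
    {"day": date(2020, 1, 3), "scheme": "Pensieve/BBR", "ssim": 15.0},
]
intersection = {date(2020, 1, 3), date(2020, 1, 4)}
print(mean_ssim_by_scheme(streams, intersection))
```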
Scheme statistics, including 95% confidence intervals, are output in *scheme_stats.txt and plotted in *plot.svg. Note that stall ratio is calculated using random sampling, so results will differ slightly across runs.
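To illustrate why sampling-based intervals vary between runs, the sketch below computes a generic percentile bootstrap interval for a stall ratio. It is not the procedure used by stream_to_scheme_stats, which samples from the watch times files described earlier; the function name, the (stall seconds, watch seconds) input format, and the resample count are all assumptions.

```python
import random


def bootstrap_stall_ratio_ci(streams, n_resamples=1000, seed=None):
    """Percentile-bootstrap 95% interval for total stall / total watch time.

    streams: list of (stall_seconds, watch_seconds) pairs with positive
    watch times (hypothetical input format, for illustration only).
    """
    if not streams:
        raise ValueError("need at least one stream")
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_resamples):
        # Resample streams with replacement, then recompute the ratio.
        sample = rng.choices(streams, k=len(streams))
        total_stall = sum(stall for stall, _ in sample)
        total_watch = sum(watch for _, watch in sample)
        ratios.append(total_stall / total_watch)
    ratios.sort()
    lower = ratios[int(0.025 * n_resamples)]
    upper = ratios[int(0.975 * n_resamples) - 1]
    return lower, upper


# Made-up streams: (stall seconds, watch seconds).
example_streams = [(0.0, 120.0), (3.5, 600.0), (0.0, 45.0),
                   (12.0, 900.0), (1.0, 300.0)]

# Without a seed, repeated calls generally give slightly different bounds.
print(bootstrap_stall_ratio_ci(example_streams))
print(bootstrap_stall_ratio_ci(example_streams))
```

Passing a fixed seed makes the sketch repeatable, which is useful when comparing two runs of the same analysis.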