This script pulls dependency versions (currently only gems and Ruby) out of the Gemfile.lock and stores them in JSON snapshot files for easy consumption. In other words, it transforms a Gemfile.lock into JSON. It then diffs two of these snapshots to track dependency updates over time.
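As a rough illustration of the Gemfile.lock to JSON step, here is a minimal Python sketch. It is not the script's actual code: the function name parse_gem_versions and the snapshot layout (repo name, then a "gems" mapping) are assumptions made for this example. It relies only on the standard Gemfile.lock layout, where resolved gems under "specs:" are indented by four spaces.

```python
import json
import re

# Resolved gems in a Gemfile.lock "specs:" block are indented by exactly
# four spaces, e.g. "    rack (2.2.4)". Deeper-indented lines are that
# gem's own dependencies and carry constraints, not resolved versions.
SPEC_LINE = re.compile(r"^ {4}(\S+) \(([^)]+)\)$")


def parse_gem_versions(lockfile_text):
    """Return {gem_name: resolved_version} from Gemfile.lock text."""
    versions = {}
    for line in lockfile_text.splitlines():
        match = SPEC_LINE.match(line)
        if match:
            name, version = match.groups()
            versions[name] = version
    return versions


SAMPLE_LOCK = """\
GEM
  remote: https://rubygems.org/
  specs:
    rack (2.2.4)
    rack-test (2.0.2)
      rack (>= 1.3)

PLATFORMS
  ruby

DEPENDENCIES
  rack-test
"""

# Hypothetical snapshot layout: one entry per repo, gems keyed by name.
snapshot = {"repo1": {"gems": parse_gem_versions(SAMPLE_LOCK)}}
print(json.dumps(snapshot, indent=2, sort_keys=True))
```

The real snapshot files may be structured differently; the point is only that each gem name ends up paired with its locked version.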
Currently, the script takes snapshots of gem versions and the Ruby version. If the Ruby version is not recorded in the Gemfile.lock file, the script reads the .ruby-version file in the directory specified by the repos.repo-name.gemfile_dir config.
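The fallback described above could look roughly like the following Python sketch. The function name find_ruby_version is hypothetical, and the sketch assumes the .ruby-version file holds just the version string. The "RUBY VERSION" section format is the one Bundler writes into Gemfile.lock.

```python
import re

# When Bundler records the Ruby version, the Gemfile.lock contains:
#   RUBY VERSION
#      ruby 3.1.2p20
RUBY_SECTION = re.compile(r"^RUBY VERSION\n\s+ruby (\S+)", re.MULTILINE)


def find_ruby_version(lockfile_text, ruby_version_file_text=None):
    """Prefer the Gemfile.lock entry; fall back to .ruby-version content."""
    match = RUBY_SECTION.search(lockfile_text)
    if match:
        return match.group(1)
    if ruby_version_file_text is not None:
        # Assumption: the file contains only the version, plus whitespace.
        return ruby_version_file_text.strip() or None
    return None


with_section = "GEM\n  specs:\n\nRUBY VERSION\n   ruby 3.1.2p20\n"
without_section = "GEM\n  specs:\n"

print(find_ruby_version(with_section))
print(find_ruby_version(without_section, "2.7.6\n"))
```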
This script was built in the following environment, which is therefore the recommended setup:
- macOS 12.6 or newer.
- Python 3, with pyenv as the Python version manager and pyenv-virtualenv as the virtualenv manager. pyenv reads the .python-version file and switches to that Python version when you cd into this directory.
- If you use pyenv and pyenv-virtualenv, install the required version of Python (if it is not already in the pyenv versions list) and create the virtualenv:
    pyenv install `cat .python-version`
    pyenv virtualenv `cat .python-version` dep
- Install the required package dependencies:
    pip install -r requirements.txt
- To pull data from GitHub, the environment variable GITHUB_TOKEN must be set to a token with permission to read the repositories specified in the config file. Create a personal access token in your GitHub account settings if you don't already have one.
- Create your configs/default.json config file. You can start from the sample:
    cp configs/default_sample.json configs/default.json
If you're using pyenv and pyenv-virtualenv, activate the virtualenv with pyenv activate dep, assuming you named the virtualenv dep during setup.
By default, if you just run ./dep, the script uses the config file configs/default.json and fails if this file doesn't exist. You can explicitly specify which config file to use with the -c flag. For example:
    ./dep -c services
This makes the script use the config file configs/services.json.
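The mapping from the -c value to a file under configs/ can be sketched as follows. This is an illustration, not the script's code: resolve_config_path is a hypothetical name, and tolerating a trailing ".json" in the value is an assumption made here for convenience.

```python
from pathlib import PurePosixPath

CONFIGS_DIR = PurePosixPath("configs")


def resolve_config_path(name=None):
    """Map the value passed to -c to a path under configs/."""
    if name is None:
        # No -c flag: fall back to the default config.
        name = "default"
    # Assumption: a trailing ".json" in the value is accepted and ignored.
    if name.endswith(".json"):
        name = name[: -len(".json")]
    return str(CONFIGS_DIR / f"{name}.json")


print(resolve_config_path())
print(resolve_config_path("services"))
```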
Every time the script runs successfully, it creates a JSON snapshot file in snapshots/.
Use the -d flag to create a diff file against an old snapshot. Diff files are CSV files stored in diffs/.
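To make the snapshot-to-diff step concrete, here is a small Python sketch of comparing two snapshots and rendering the changes as CSV. The snapshot shape (repo name mapped to dependency name mapped to version) and the CSV column names are assumptions for this example; the script's real files may differ.

```python
import csv
import io


def diff_snapshots(old, new):
    """Return (repo, dependency, old_version, new_version) for each change."""
    rows = []
    for repo in sorted(set(old) | set(new)):
        old_deps = old.get(repo, {})
        new_deps = new.get(repo, {})
        for dep in sorted(set(old_deps) | set(new_deps)):
            before = old_deps.get(dep, "")  # "" means the dependency was absent
            after = new_deps.get(dep, "")
            if before != after:
                rows.append((repo, dep, before, after))
    return rows


def rows_to_csv(rows):
    """Render diff rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["repo", "dependency", "old_version", "new_version"])
    writer.writerows(rows)
    return buffer.getvalue()


old_snapshot = {"repo1": {"rails": "6.1.7", "rack": "2.2.4"}}
new_snapshot = {"repo1": {"rails": "7.0.4", "rack": "2.2.4", "puma": "5.6.5"}}
changes = diff_snapshots(old_snapshot, new_snapshot)
print(rows_to_csv(changes))
```

Unchanged dependencies (rack in this sample) are left out, so the CSV lists only additions, removals, and version changes.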
- To take snapshots from GitHub, use a config file like this:
    {
      "owner": "github-owner",
      "repos": {
        "repo1": { },
        "repo2": { }
      }
    }
- To take snapshots from local checkouts without calling the GitHub API, use a config file like this:
    {
      "repos": {
        "repo1": { },
        "repo2": { }
      },
      "force_debug_mode": true,
      "debug": {
        "repos_dir": "/Users/reposdirectory"
      }
    }
- Let's say you have a config file configs/services.json like this:
    {
      "owner": "github-owner",
      "repos": {
        "repo1": { },
        "repo2": { }
      }
    }
- Let's also say you have already taken a few snapshots:
    snapshots/
    |
    +----- services_2022-10-01.json
    |
    +----- services_2022-11-01.json
- You can capture the dependency changes between one of these old snapshots and today with:
    ./dep -c services -d services_2022-11-01
  If today is 2022-12-01, this creates the diff file diffs/services_2022-11-01_2022-12-01.csv.
Diffing snapshots taken with two different config files is useful when you are in the process of upgrading Rails and you run tests on two different versions of Rails using different Gemfile.lock files, following a process similar to GitHub's.
For example, let's say you have a repo called repo1 which is upgrading from Rails 3 to Rails 4, using Gemfile.lock and Gemfile_next.lock respectively. Here's how to get the diff:
- Use two different config files; let's call them core_r3.json and core_r4.json:
    // core_r3.json
    {
      "owner": "github-owner",
      "repos": {
        "repo1": { "gemfile_name": "Gemfile.lock" }
      }
    }
    // core_r4.json
    {
      "owner": "github-owner",
      "repos": {
        "repo1": { "gemfile_name": "Gemfile_next.lock" }
      }
    }
- Take a snapshot using core_r3.json with:
    $ ./dep -c core_r3.json
    Using config file core_r3.json
    Wrote snapshots/core_r3_2022-12-07.json
- Take a snapshot using core_r4.json and diff it against the snapshot just taken for Rails 3:
    $ ./dep -c core_r4.json -d core_r3_2022-12-07
    Wrote snapshots/core_r4_2022-12-07-v1.json
    Wrote diffs/core_r3_2022-12-07_core_r4_2022-12-07.csv
That's it. The diff file diffs/core_r3_2022-12-07_core_r4_2022-12-07.csv will contain all the changes.
Config files are JSON files that live in the configs/ directory and have a defined structure. If no config file is specified, the script uses configs/default.json.
All config files are gitignored.
Config files have the following structure:
{
  "owner": "<github-org-or-owner>",
  "repos": {
    "<repo-name-1>": {
      "gemfile_dir": "/path/to/directory",
      "gemfile_name": "Gemfile.lock"
    },
    "<repo-name-2>": { ... },
    // ...
  },
  "force_debug_mode": false,
  "debug": {
    "repos_dir": "/path/to/directory"
  }
}
{
  "owner": "<github-org-or-owner>",
  // ...
}
Used when making API calls to GitHub. For example, when pulling Gemfile.lock files, the API call path includes the owner name.
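As an illustration of how the owner can end up in the API path, the sketch below builds (without sending) a request against GitHub's REST "get repository content" endpoint, /repos/{owner}/{repo}/contents/{path}. Whether the script uses this exact endpoint is an assumption, and build_contents_request is a hypothetical helper name.

```python
import os
import urllib.request
from urllib.parse import quote

API_ROOT = "https://api.github.com"


def build_contents_request(owner, repo, path, token):
    """Build, but do not send, a request for one file in a GitHub repo."""
    # The leading slash is dropped so the path is relative to the repo root.
    url = "{}/repos/{}/{}/contents/{}".format(
        API_ROOT, quote(owner), quote(repo), quote(path.lstrip("/"))
    )
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            # Ask for the raw file body instead of the JSON envelope.
            "Accept": "application/vnd.github.raw+json",
        },
    )


# "<token>" is a placeholder used when GITHUB_TOKEN is not set.
token = os.environ.get("GITHUB_TOKEN", "<token>")
request = build_contents_request("github-owner", "repo1", "/Gemfile.lock", token)
print(request.full_url)
```

Sending the request (for example with urllib.request.urlopen) needs a valid token with read access to the repo.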
{
  "repos": { /* ... */ },
  // ...
}
Repos to report on. Each key within this dictionary is a repo name. The values are configured as follows.
{
  "repos": {
    "<repo-name>": {
      "gemfile_dir": "/path/to/directory",
      // ...
    }
  },
  // ...
}
Defaults to "/" if this is not present. This is the directory where the script searches for the Gemfile.lock file and, if the Ruby version is not found in the Gemfile.lock, the .ruby-version file.
{
  "repos": {
    "<repo-name>": {
      "gemfile_name": "Gemfile.lock",
      // ...
    }
  },
  // ...
}
Defaults to "Gemfile.lock". This is the name of the lock file within the repo. Together with the gemfile_dir config, this is how the script finds the file. If you don't specify either config, the script looks for "/Gemfile.lock".
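The way the two settings and their defaults combine can be sketched in Python as follows. The helper name lockfile_path is hypothetical; the default values are the ones stated above.

```python
import posixpath

DEFAULT_GEMFILE_DIR = "/"
DEFAULT_GEMFILE_NAME = "Gemfile.lock"


def lockfile_path(repo_config):
    """Join gemfile_dir and gemfile_name, applying the documented defaults."""
    gemfile_dir = repo_config.get("gemfile_dir", DEFAULT_GEMFILE_DIR)
    gemfile_name = repo_config.get("gemfile_name", DEFAULT_GEMFILE_NAME)
    return posixpath.join(gemfile_dir, gemfile_name)


print(lockfile_path({}))
print(lockfile_path({"gemfile_dir": "/services/api", "gemfile_name": "Gemfile_next.lock"}))
```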
{
  "force_debug_mode": false,
  // ...
}
This config lets you avoid making any API calls to GitHub and look at the repos locally only. The script looks for them in the directory specified by the debug.repos_dir config.
This config name will likely change in the future to force_local_mode.
{
  "debug": { /* ... */ },
  // ...
}
Settings relevant only when the script runs in debug mode.
This config name will likely change in the future to local.
{
  "debug": {
    "repos_dir": "/path/to/directory",
    // ...
  },
  // ...
}
When running in debug mode, this is the directory where the script tries to find the repos instead of making API calls to GitHub.
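A possible shape for the local lookup is sketched below. It assumes each repo is checked out at <repos_dir>/<repo-name>, which is not confirmed here, and the helper names uses_local_mode and local_lockfile_path are hypothetical.

```python
import posixpath


def uses_local_mode(config):
    """True when the config asks to skip GitHub and read repos from disk."""
    return bool(config.get("force_debug_mode", False))


def local_lockfile_path(config, repo_name):
    """Locate a repo's lock file under debug.repos_dir."""
    repos_dir = config["debug"]["repos_dir"]
    repo_config = config["repos"][repo_name]
    gemfile_dir = repo_config.get("gemfile_dir", "/")
    gemfile_name = repo_config.get("gemfile_name", "Gemfile.lock")
    # gemfile_dir is relative to the repo root even though it starts with
    # "/", so strip the leading slash; otherwise join() would discard the
    # components that come before it.
    return posixpath.normpath(
        posixpath.join(repos_dir, repo_name, gemfile_dir.lstrip("/"), gemfile_name)
    )


config = {
    "repos": {"repo1": {}, "repo2": {"gemfile_dir": "/api"}},
    "force_debug_mode": True,
    "debug": {"repos_dir": "/Users/reposdirectory"},
}
if uses_local_mode(config):
    print(local_lockfile_path(config, "repo1"))
    print(local_lockfile_path(config, "repo2"))
```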
Snapshots are gitignored JSON files stored in snapshots/.
They contain all the dependency versions for each repo specified in the repos config. The script consumes these files to create the diffs. A new snapshot is created every time the script runs successfully.
The format is snapshots/<config>_<date>.json, where <config> is the name of the config file used to take the snapshot and <date> is the day the snapshot was taken. For example, snapshots/default_2022-12-07.json.
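The naming scheme can be expressed in a few lines of Python. Both helper names are hypothetical, and splitting a snapshot name on its last underscore is an assumption that holds only because ISO dates contain no underscores.

```python
from datetime import date


def snapshot_path(config_name, day):
    """Build snapshots/<config>_<date>.json for a config name and a date."""
    return f"snapshots/{config_name}_{day.isoformat()}.json"


def split_snapshot_name(name):
    """Split '<config>_<date>' into its two parts, on the last underscore."""
    config_name, day = name.rsplit("_", 1)
    return config_name, day


print(snapshot_path("default", date(2022, 12, 7)))
print(split_snapshot_name("core_r3_2022-12-07"))
```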
Diffs are gitignored CSV files stored in diffs/. They're created only when passing the -d flag.
The format is diffs/<config-from>_<date-from>_<config-to>_<date-to>.csv where:
- <config-from> is the name of the config file used in the snapshot we're diffing from.
- <date-from> is the date of the snapshot we're diffing from.
- <config-to> is the name of the config file used in the snapshot we're diffing to, i.e. the snapshot taken as part of this run.
- <date-to> is the date of the snapshot we're diffing to, i.e. the snapshot taken as part of this run.
When both the snapshot we're diffing from and the snapshot we're diffing to use the same config file, the _<config-to> part is removed. E.g. diffs/default_2022-11-01_2022-12-01.csv
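The naming rule, including the shortened form for a shared config, can be sketched as follows. The function name diff_path is hypothetical; the expected names match the examples used in this document.

```python
def diff_path(config_from, date_from, config_to, date_to):
    """Build the diff file name, dropping <config-to> when configs match."""
    if config_from == config_to:
        return f"diffs/{config_from}_{date_from}_{date_to}.csv"
    return f"diffs/{config_from}_{date_from}_{config_to}_{date_to}.csv"


print(diff_path("default", "2022-11-01", "default", "2022-12-01"))
print(diff_path("core_r3", "2022-12-07", "core_r4", "2022-12-07"))
```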