This is a bit of a story to answer any questions that might come up in the future.
- 2021-12-04: The Git scraping workflow gets paused by GitHub, nominally due to a lack of activity, for reasons unknown (the repo was getting new commits each night).
- 2021-12-16 08:45 US Central: Awkward Array gets posted to Hacker News and starts getting a lot of traction and lots of GitHub stars.
- 2021-12-16 12:00 US Central: Matthew notices that the GitHub Action has been paused and restarts the action. There are now 13 days of missing time values in the data set.
- 2021-12-17 UTC: The GitHub Action restarts and begins to Git scrape again, catching the start of the Awkward star rise in the 13 days of missing data.
- 2021-12-30 UTC: The Hacker News inspired star activity on Awkward dies off.
- 2022-03-21: @jpivarski gives a super nice talk on Metrics of computing trends in NHEP at the IRIS-HEP Topical meeting which includes the following plot on slide 6 (page 15)
At the talk, @alexander-held notices that the vertical dashed line marking when the Hacker News post happened sits after a large step and a bit of a climb. @matthewfeickert mentions that the jump is because additional stars were being added during the Git scraping stoppage, but the vertical line still seems misplaced.
This is because the 13 missing days are not plotted at all in matplotlib and so are simply cut out of the horizontal axis. When preparing his plot, @jpivarski very reasonably drew the vertical line by hand, as he describes:
"I drew the vertical line by hand, by doing a linear interpolation between the dates on the horizontal axis. (In Inkscape, I made a box connecting the two tick marks with a tool that snaps to points, calculated the fraction past the first date that I'd need, used a scaling dialog to shrink the box by exactly that percentage, then snapped the box to the first date and used it as a guide to add the dashed line.)"
The linear interpolation therefore assumes there are dates on the axis that don't exist, which shifts the location of the vertical line.
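The size of the shift can be estimated with a small sketch. It assumes, hypothetically, that the plot spans 2021-11-01 to 2022-01-31, that every scraped day takes one evenly spaced slot on the axis, and that the two tick marks used for the interpolation were 2021-12-01 and 2022-01-01. Only the 13-day gap (2021-12-04 through 2021-12-16) comes from the timeline above; the other dates and the helper `interpolated_slot` are made up for illustration.

```python
from datetime import date, timedelta

# Hypothetical plot range; only the gap dates come from the timeline above.
start, end = date(2021, 11, 1), date(2022, 1, 31)
gap_start, gap_end = date(2021, 12, 4), date(2021, 12, 16)

all_days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
scraped_days = [d for d in all_days if not (gap_start <= d <= gap_end)]

# Axis position when each scraped day occupies one evenly spaced slot,
# which is how the missing days end up cut out of the plot.
slot = {d: i for i, d in enumerate(scraped_days)}


def interpolated_slot(target, tick_lo, tick_hi):
    """Place `target` between two ticks by calendar fraction (the by-hand method)."""
    fraction = (target - tick_lo) / (tick_hi - tick_lo)
    return slot[tick_lo] + fraction * (slot[tick_hi] - slot[tick_lo])


# Hypothetical tick marks on either side of the Hacker News post.
by_hand = interpolated_slot(date(2021, 12, 16), date(2021, 12, 1), date(2022, 1, 1))
first_after_gap = slot[date(2021, 12, 17)]

print(f"slot of first scraped day after the gap: {first_after_gap}")
print(f"slot of the hand-interpolated line:      {by_hand:.1f}")
```

Under these assumptions the hand-interpolated line lands several slots to the right of the first data point after the gap, which is consistent with the dashed line appearing after the step and part of the climb.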
Drawing the vertical line in matplotlib when plotting the data itself
_date = "2021-12-17"  # Can't draw on 2021-12-16 as it isn't in the data set
ax.axvline(x=_date, color="grey", linestyle="dashed", label=_date)
# Hackily choose a date (x) and height (y) that put the label text in a reasonable location
ax.text("2021-11-05", 450, _date, color="grey", size=20)
shows that the plotted data line up correctly once the missing days are taken into account. 👍
This issue is just here to document what happened to avoid future confusion. Also, good eye to @alexander-held for catching this in the talk and for starting a fun little bit of forensic visualization. 🙂