[MRG] Austin workflow refactor part 2 by asoplata · Pull Request #134 · jonescompneurolab/textbook

asoplata · 2025-10-30T20:07:14Z

This is based on, and continues from, the PR dylansdaniels#2. It includes many large refactors, including changing ordered_page_links to a greatly expanded flat_index, along with many other changes.

If merged, this resolves #125 and #126.

asoplata · 2025-10-30T20:11:18Z

I've marked it as WIP, but almost all the remaining changes I plan on making for this PR are documentation changes. There may be some further code changes, but they will not be very large.

asoplata · 2025-10-31T22:26:17Z

This is now at a good stopping point and is ready for review. I did a bunch of the things we talked about in addition to the Issues mentioned previously. Much of this was adding and updating documentation, including fixing a newly found (but preexisting) bug or two.

This is the first part of a major refactoring overhaul of how we handle indexing, input-markdown and output-HTML pathing, and sidebar generation. This is still a work in progress, and most functionality has been commented out for debugging/testing.

Major changes:

- `update_page_index.py` has been renamed to `create_indices.py`. This new file now creates two indices:
  1. "hier_index", which is the same as the old "index.json". Its content has not changed. Instead of being saved to file every time and then loaded later during sidebar creation, it creates and returns the dictionary instead, and that is passed to sidebar for usage. I'm not sure if we need to save it as a file at all, since its value can always be easily debugged live, and we were not doing any kind of checking to skip its loading before (i.e. if we have to reconstruct it every time, then there's no gain from saving a copy to disk).
  2. "flat_index". This will eventually replace "ordered_links" and is greatly expanded from what "ordered_links" does. This is treated like "hier_index" as an explicit "generate_page_html" function, but will replace much functionality/data of the rest of the code. This index includes the ordering of legitimate "pages" (i.e. not "sections"), and for every page contains a dictionary containing:
     A. title of the page
     B. absolute path of the "input" markdown file
     C. absolute path of the "output" HTML file (including processing dev_build, and including creation of any parent directories to this not-yet-created file)
     D. "website-root"-relative path of the "output" HTML file (including processing dev_build)

The next major step is using "flat_index" to replace ordered_links and output path/filename lookups throughout the rest of generate_page_html.

Minor changes:

- Location of the saved index files has been moved to inside "scripts" from the root
- Many functions and some script files have been renamed
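As a rough illustration of the per-page dictionary described above, here is a minimal sketch of building one "flat_index" entry. The function name, the key names, and the assumption that markdown sources live under `<repo_root>/content` are hypothetical and are not taken from `create_indices.py`.

```python
from pathlib import Path


def build_flat_index_entry(title, md_path, repo_root, is_dev_build, create_dirs=False):
    """Return one hypothetical flat_index entry for a single page.

    Assumes (for illustration only) that markdown sources live under
    <repo_root>/content and that HTML is written to either "content"
    or "dev" depending on the build type.
    """
    repo_root = Path(repo_root).resolve()
    md_path = Path(md_path).resolve()

    # Path of the markdown file relative to the assumed source directory.
    rel_md = md_path.relative_to(repo_root / "content")

    # A dev build writes into "dev"; otherwise into "content".
    out_dir = "dev" if is_dev_build else "content"
    html_rel = Path(out_dir) / rel_md.with_suffix(".html")
    html_abs = repo_root / html_rel

    if create_dirs:
        # Create parent directories of the not-yet-created HTML file.
        html_abs.parent.mkdir(parents=True, exist_ok=True)

    return {
        "title": title,                          # A. title of the page
        "md_path": md_path,                      # B. absolute input path
        "html_path": html_abs,                   # C. absolute output path
        "html_rel_path": html_rel.as_posix(),    # D. website-root-relative path
    }
```

Because Python dictionaries preserve insertion order, a dict of such entries keyed by page could also carry the page ordering, though the real index may store ordering differently.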

This continues the use of "flat_index", including in every case of the main loop, except for the "ordered_links" which are next. This includes many minor changes as well for formatting (including a ruff run), documentation, and other small changes.

Among other minor changes, including adding a new optional "save_indices" CLI arg that does what it sounds like, this finishes applying the usage of the new "flat_index" to the remaining code, which was mostly the "ordered" links used in the per-page footers. Everything appears to be working correctly, including that the sidebar and footer navigation uses "dev" or "content" properly. Next up is general cleaning and documenting.
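The per-page footer navigation mentioned above could be derived from an ordered "flat_index" roughly as follows. This is only a sketch: `footer_links` and the `html_rel_path` key are hypothetical names, and the real code may structure the lookup differently.

```python
def footer_links(flat_index, page_key):
    """Return (previous, next) website-relative links for one page.

    `flat_index` is assumed to be a dict whose insertion order is the
    page order and whose values hold an "html_rel_path" key.
    """
    keys = list(flat_index)
    i = keys.index(page_key)

    # The first page has no previous link; the last page has no next link.
    prev_key = keys[i - 1] if i > 0 else None
    next_key = keys[i + 1] if i + 1 < len(keys) else None

    def link(key):
        return None if key is None else flat_index[key]["html_rel_path"]

    return link(prev_key), link(next_key)
```

Since each entry already stores the "dev" or "content" prefix in its relative path, the footer links follow the build type without any extra branching.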

This begins a multi-step transition from the CLI "--build-on-dev" argument to the combination of "--code-version" (formerly the unused "--build-type" argument) and "--custom-owner-commit". The CLI args have already been created, and this commit focuses on applying their use inside `get_commit_hash.py`. This also includes a great deal of minor changes to `get_commit_hash`, including better error messages, more explicit variable names, etc. There is also incomplete support for a new "--code-version=no-check" option that skips the version checking in case one is debugging or if one wants to run 'build.py' a lot, but doesn't want to spam HTTP requests. This use-case is useful in itself, but also related to a weak hypothesis I had: that the fact that version mismatches in the previous "default" case of 'build.py' usage provided only *warnings*, instead of RuntimeErrors, like in all other mismatch cases (see https://github.com/dylansdaniels/textbook/blob/workflow_refactor/build.py#L610 )

`dev_build` has finished being replaced by a combination of `code_version` and (either `custom_owner_commit` or `commit_hash`) depending on the function. Documentation has not yet been updated, but it was incomplete to begin with.

This comments out something that I think may have been added by accident. Compare this line to https://github.com/dylansdaniels/textbook/blob/31dbaf09d6d854a4e8527024fa99f2b415820192/scripts/convert_notebooks.py#L923 . At the time of the above commit, the situation was as follows: `notebook_was_run` recorded whether running the notebook was attempted at all, regardless of whether it succeeded (I have renamed this variable to `execution_initiated`). `notebook_executed` recorded whether the notebook successfully finished its execution (I have renamed this variable to `execution_successful`). The control flow has not changed at all. In this particular case, `notebook_was_run`/`execution_initiated` is False, meaning no execution was attempted at all. Also, in the original version, `dev_build` would have been activated, meaning we're in a dev state. However, even in a dev state, if the notebook execution was never initiated/attempted, I do not think it makes sense to record the commit hash of the current HNN version into the notebook. If we did, it would imply that the last dev execution used this commit, which is not the case.
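The reasoning above can be condensed into a small decision function. This is a hedged sketch, not the code in `convert_notebooks.py`: the function name, its parameters, and the choice to fall back to the previously recorded commit are all assumptions made for illustration.

```python
def commit_hash_to_record(is_dev_build, execution_initiated, installed_commit,
                          previously_recorded=None):
    """Decide which commit hash to store in the notebook metadata.

    Only a dev build that actually attempted execution should stamp the
    currently installed commit. Otherwise the previously recorded value
    (an assumed fallback) is kept, so the metadata never claims an
    execution that did not happen.
    """
    if is_dev_build and execution_initiated:
        return installed_commit
    return previously_recorded
```

Under this sketch, a dev build that skips a notebook leaves its recorded commit untouched, which matches the intent described in the commit message.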

This adds a large description of the building program as a whole to the CLI for 'build.py', such that it is output when calling `python build.py --help`. This includes a description of the assumed file structure (which has many, many assumptions), the overall code path, some examples, and the important options along the way. Minor: this also adds a docstring for 'get_hnn_commit_hash', renames the 'write_standalone_html' variable to 'save_standalone_nb_html', and makes a new CLI argument of a similar name. This documentation was first generated with Claude AI before undergoing *heavy* modification of every word.

This partially reverts our post-hnn_commit_hash-validation code's checks of the "code-version" to use a new var like in the original, called "is_dev_build". The distinction between a "dev" and non dev build is now fully documented, apparent, and set in 'build.py'. "is_dev_build" is also a proper boolean flag (true/false only) and the actual commit hash to use (when necessary) is handled separately. More refactoring: this also renames the hnn-core commit hash module and splits its functions, so that the 'get_hnn_commit_hash' function is ONLY getting the installed version, and the rest of the code is validation of the installed version against various other versions.

There are also some minor variable name changes

Previously, we were loading the same notebook from disk twice, once to calculate the hash, then once to actually use it. It's more efficient to load from disk once.
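A minimal sketch of the "load once, hash a copy" idea follows. It assumes the notebook has already been loaded into a plain dict, and it assumes (hypothetically) that outputs and execution counts are stripped before hashing; the actual `calculate_nb_hash` may hash different fields.

```python
import copy
import hashlib
import json


def calculate_nb_hash(nb):
    """Hash an already-loaded notebook dict without touching the original.

    The notebook is deep-copied so that stripping outputs for hashing
    (an assumed normalization step) does not alter the object that the
    caller goes on to use.
    """
    nb_copy = copy.deepcopy(nb)
    for cell in nb_copy.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None

    # sort_keys makes the serialization independent of key order.
    payload = json.dumps(nb_copy, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

The caller reads the notebook from disk a single time, passes the dict to this function, and then continues to use the same dict for execution or conversion.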

This includes a special comment about how `_read_nb_json_output_metadata` looks in different places for the JSON output files depending on if you're doing a stable or dev build

SHOULD be no code changes...

You can probably guess where this is going... Note: this does not affect the Workshop page, since it's raw HTML.

My Claude's filesystem scan was out-of-date, and lightly "hallucinated" a `templates/navbar.html` that no longer exists. This removes that.

Appears to work just as well.

asoplata · 2025-11-04T20:41:28Z

@dylansdaniels this has now been rebased off of main, and additionally the only merge conflict has been resolved. It's definitely ready for review now, and should be much easier to review.

asoplata · 2026-01-05T20:00:24Z

This is complicated enough that we may want to NOT squash these into one big mega commit before merge. Instead, we may want to wait until we're happy with the current state, then I do a rebase to resolve the merge conflicts, then we merge.

asoplata · 2026-01-21T18:47:32Z

For some reason, even if I re-add then re-delete the (no longer necessary) files for dev/05_erps/erps_in_gui.html and scripts/create_navbar.py, it still complains that these are blocking conflicts. Therefore, I'm going to re-add them, but we should manually delete them AFTER this PR is merged.

Everything else should be up to date and work with main.

dylansdaniels · 2026-01-27T22:27:17Z

@asoplata reviewed everything and looks good in theory. doing some local testing that the builds are working as expected. I pushed a commit to add "no-check" to the "--code-version" argument, as it was missing.

asoplata · 2026-01-28T15:32:56Z

Thanks for catching that, when going over this again I forgot that --code-version=no-check and --no-version-validation are not the same thing. In the name of reducing complexity, we may want to discuss merging those two options somehow, but that discussion can wait.

I'll add a new commit that forces no-check to force a dev build (instead of content) and to update the documentation.

asoplata · 2026-01-28T15:40:02Z

Another refactoring idea: even though we're already at pretty high CLI complexity, we may want to separate whether you're doing a content or dev build into its own CLI arg. Especially when it comes to a CLI, we want to minimize how much mental "control flow" the user needs to keep in their head when it comes to secondary consequences of their choices (as opposed to giving them a higher number of choices which are all independent of each other).

asoplata · 2026-01-28T15:46:44Z

Done with the addition, now --code-version=no-check causes a dev build (instead of content) and is properly documented (...probably)

dylansdaniels · 2026-01-28T18:08:26Z

build.py

+        be skipped) using the current master (development) branch that you have
+        installed (WITHOUT checking against the latest master branch on Github), and
+        build the website in your local `dev` directory (creating it if necessary):


I think this is confusing, because you technically don't need master installed since you're passing --no-version-validation, so it's actually not guaranteed that you're using the current master (dev) branch. I think we should remove example 4, reindex, and add explanation to the current example 8 stating that the --no-version-validation flag supersedes any use of --code-version. That, or we make it such that you can't actually pass --code-version with --no-version-validation

this is related to your comment on control flow. But I agree that we don't necessarily have to rethink that entirely in this PR, but I do think we should change the examples at the very least

I def agree it's super confusing. It's possible that I meant to strip out --code-version=no-check in its ENTIRETY, and only left it in (e.g. at the end of process_hnn_commit_hashes.py) by accident. It's just been so long that I don't remember.

What about removing no-check entirely? no-check and --no-version-validation are essentially doing the same thing in process_hnn_commit_hashes.py already (except for --code-version=stable where only --no-version-validation is doing the correct behavior and returning None for the commit instead of the current installed commit).

If we remove it entirely, it's much simpler, and so will the docs be. --code-version determines a content vs dev build based on either stable or {master, custom}. Then, --no-version-validation overrides the version validation by using the appropriate commit in the metadata (the installed one if dev, or None if content) and executing only using the installed commit.

I'll add that even if we drop no-check, it will be an improvement, but still has a lot of room left for further improvement after the PR

Hmm my gut is that it should be the opposite actually - we keep --code-version=no-check and remove --no-version-validation.

Then we entirely avoid the confusion of commands like: python build.py --execution-type=execute-updated-unskipped-notebooks --code-version=master --no-version-validation -> this implies it should be master, but we don't validate, so you can actually run it with stable and it will build anyway. users may think their build is running on master but it's not

There would never be a case where you need a code version specified when using --no-version-validation in its current form anyway, it's really an either or situation. so it might as well be an option for --code-version. and if there's no version validation, it should probably always be a dev build

I'm okay with removing --no-version-validation and replacing it exclusively with --code-version=no-check, since I agree that it is way easier to understand, EXCEPT for one caveat: I think we should simultaneously split dev build logic into its own CLI argument. Reducing the cognitive load of "no check" into a single item is definitely the way to go.

However, we need to be able to control whether we're doing a build into content or dev, independent of that. For example, when I was coding this, I was actively using that situation -- I was trying to test and code the situation where --code-version=stable (which is currently the only way to use the code paths that build into content instead of dev), but I didn't want to be pinging Github every time I tried to build (since I didn't want my IP to get on the bad list). Therefore, we (more so you and I, not necessarily the expected user) need to be able to use --code-version=no-check but control whether we're outputting to and building from content or dev.

This has the added benefit of making it very explicit at the CLI level where you're building to. We could still retain the current control flow (i.e. --code-version=stable defaults to building to content, but {master, custom, no-check} default to building to dev) BUT this gives us (and users) a way to force override that preference.

Makes sense! I think that's a good plan and I'm on board 👍

Implemented the change in the latest commit

This removes the CLI arg `--no-version-validation` (so that only `--code-version=no-check` remains), and adds a new CLI arg named `--build-directory`. This new arg accepts values of `auto` (the default, which follows our `is_dev_build` "algorithm"), `content` (user forces website to be built into content dir), and `dev` (user forces website to be built into `dev`). This also upgrades all the documentation to reflect these changes, along with some minor other docstring improvements.
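The resulting interaction between `--code-version` and `--build-directory` can be sketched with `argparse` as below. The option names and values come from the discussion above; the function names, the default for `--code-version`, and the exact resolution logic are assumptions rather than the contents of `build.py`.

```python
import argparse


def parse_args(argv=None):
    """Parse the two CLI args discussed above (sketch only)."""
    parser = argparse.ArgumentParser(prog="build.py")
    parser.add_argument(
        "--code-version",
        choices=["stable", "master", "custom", "no-check"],
        default="stable",  # assumed default, for illustration
    )
    parser.add_argument(
        "--build-directory",
        choices=["auto", "content", "dev"],
        default="auto",
    )
    return parser.parse_args(argv)


def resolve_is_dev_build(code_version, build_directory):
    """Return True if the site should be built into `dev`.

    With "auto", stable builds go to content and every other code
    version goes to dev. "content" or "dev" force the choice.
    """
    if build_directory == "auto":
        return code_version != "stable"
    return build_directory == "dev"
```

For example, `--code-version=no-check --build-directory=content` would resolve to a content build without version checking, which is the debugging scenario described earlier in the thread.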

asoplata requested a review from dylansdaniels October 30, 2025 20:09

asoplata force-pushed the austin-workflow-refactor-part-2 branch from 6b9f88e to 6ba91ac Compare October 31, 2025 14:40

asoplata changed the title ~~[WIP] Austin workflow refactor part 2~~ [MRG] Austin workflow refactor part 2 Oct 31, 2025

asoplata added 25 commits November 4, 2025 15:36

ref: minor of check_nb, replace os with pathlib

6c48f08

ref: build arg str var, use "repo-owner"

c35eb08

ref: minor log output improvement

4ebb35a

ref: add unused args+docs for new version args

c3a189b

ref: debug comments, and pathlib replacement

2043ea9

ref: add CLI args to get_commit and ruff

6d957e1

ref: restore future _structure_json function

72d5971

ref: rename execution-filter to execution-type

c4b25ba

propagate rename to CI

846d9a1

ref: finish overhaul of dev CLI args, part 2 of 2

7ebddf1

ref: ruff run

607ecb0

ref: incr recorded JSON flags verbosity

47d2ecd

ref: 99% formatting and tons of COMMENTS

d52206a

doc: docs and comments for get-hash

e7d1ca0

ref: move nb_html->json up in call tree

c473a42

doc: (mostly) incomplete updating of execute docs

a921def

ref: change calculate_nb_hash to copy not load nb

aeafd0a

doc: more docs

24b0b87

asoplata added 8 commits November 4, 2025 15:36

doc: docstring, comment, help, and ruff changes

738ffff

DELETE dev folder entirely, since bug is fixed

1d57e3b

ref: rename sidebar-header to sidebar-section

667f180

doc: remove Claude hallucination

9b838b2

ref: finish renaming navbar to sidebar

8be7cac

maint: remove deprecated ordered_links comments

5e38eb5

maint and doc: sidebar code docs, and rename funcs

937e9d8

doc: add tiny note about "dev" sidebar

86eb4da

asoplata force-pushed the austin-workflow-refactor-part-2 branch from 9400138 to 86eb4da Compare November 4, 2025 20:40

This was referenced Nov 4, 2025

[WIP] Test support for stable+dev builds and deployment on forks #135

Open

MAINT/DOC: Designing notebook execution workflows #100

Open

asoplata added 2 commits January 5, 2026 11:51

conflict resolution: re-add then delete HTML

12ee326

conf res: re-delete

e09b530

asoplata added 3 commits January 21, 2026 13:42

lazy file revert to remove conflicts

8ebadfd

delete files that no longer exist

6126ab0

re-add final scripts changes

b0b76b6

asoplata and others added 3 commits January 21, 2026 13:48

ugh re-add deleted files again

cc5be6b

accidentally moved HTML file to wrong folder

2c698ce

add misssing no-check argument to --code-version

3e2c104

fix: add no-check to dev builds, and document it

5ce4753

dylansdaniels reviewed Jan 28, 2026

asoplata added 2 commits January 30, 2026 17:02

ref: decr verbosity of some CLI args

cbf9312
