Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build: allow partial override of build steps #11710

Open
wants to merge 9 commits into
base: main
Choose a base branch
from
Open

Conversation

stsewd
Copy link
Member

@stsewd stsewd commented Oct 23, 2024

I was replacing some of the config models with dataclasses, but I found myself re-implementing some helpers that pydantic provides, so I'm introducing that new dep, it has everything we need, we may be able to use it to simplify our validation once all models are migrated to pydantic.

About incompatible options

I decided to just not allow formats when build.jobs.build is used, seems just easier that way. But not sure if users may want to just override the html build step while still using the default build pdf step, if that's the case, we may need to support using formats... or have another way of saying "use the default".

build.jobs.create_environment is kind of incompatible with python/conda.
Since users will need to manually create the environment,
but they may still want to use the defaults from build.jobs.install,
or maybe they can't even use the defaults, as they are really tied to the default create_environment step.

Which bring us to the next point,
if build.jobs.create_environment is overridden,
users will likely need to override build.jobs.install and build.jobs.build as well...
Maybe just save the user some time and require them to override all of them?
Or maybe just let them figure it out by themselves with the errors?
We could gather more information once we see some real usage of this...

Override them all

I chose to make build.html required if users override that step, seems logical to do so.

Do we want to allow users to build all formats? I'm starting with html and pdf, that seem the most used. But shouldn't be a problem to support all of them. I guess my only question would be about naming, htmlzip has always been a weird name, maybe just zip? I just went ahead and allowed all, with the same name as formats.

Docs

I didn't add docs yet because there is the question... should we just expose this to users? Or maybe just test it internally for now?

Closes #11551

@stsewd
Copy link
Member Author

stsewd commented Oct 24, 2024

Checks are failing because we have their deps hardcoded

https://github.com/readthedocs/common/blob/b8051be8f8221f5fe448dc81e1f2c7668090a732/pre-commit-config.yaml#L111-L113

They should pass after the PR is merged.

@stsewd stsewd marked this pull request as ready for review October 24, 2024 19:44
@stsewd stsewd requested a review from a team as a code owner October 24, 2024 19:44
Copy link
Member

@ericholscher ericholscher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a reasonably small set of changes overall compared to what I might have been worried about. Definitely seems nice introducing pydantic to have proper models in the config classes.

readthedocs/core/utils/objects.py Show resolved Hide resolved
readthedocs/core/utils/objects.py Show resolved Hide resolved
readthedocs/core/utils/objects.py Outdated Show resolved Hide resolved
@humitos
Copy link
Member

humitos commented Oct 29, 2024

I decided to just not allow formats when build.jobs.build is used, seems just easier that way. But not sure if users may want to just override the html build step while still using the default build pdf step, if that's the case, we may need to support using formats... or have another way of saying "use the default".

I think formats: [html, pdf] and build.jobs.build.html override would be a common/valid use case, and I say we should support it. I'm fine not supporting in the first iteration if it's complicated to do it, but we should consider doing it after this first iteration is done.

What is the complexity you found while working on this?

build.jobs.create_environment is kind of incompatible with python/conda. Since users will need to manually create the environment, but they may still want to use the defaults from build.jobs.install, or maybe they can't even use the defaults, as they are really tied to the default create_environment step.
Which bring us to the next point, if build.jobs.create_environment is overridden, users will likely need to override build.jobs.install and build.jobs.build as well... Maybe just save the user some time and require them to override all of them? Or maybe just let them figure it out by themselves with the errors? We could gather more information once we see some real usage of this...

I wouldn't start by adding all this complexity to the validation. If users override those steps but they follow what we expect (our contract) that seems fine.

I didn't add docs yet because there is the question... should we just expose this to users? Or maybe just test it internally for now?

I'm fine start by testing it internally first. We have a few issues from users where we can suggest them to try this approach and receive feedback from them to work in the next iteration.

Copy link
Member

@humitos humitos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This work looks good to me.

It seems there is no need to apply some of the restrictions implemented in this PR, tho. I left some suggestions that allow using formats together with build.jobs.build overrides.

I also think we should remove the restriction to always override build.jobs.build.html. I didn't see anything blocking us to allow that.

Comment on lines +394 to +395
if not "html" in build_jobs_build:
raise ConfigError(message_id=ConfigError.HTML_BUILD_STEP_REQUIRED)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think build.jobs.build.html shouldn't be mandatory. That means that users cannot override only any of the other formats like build.jobs.build.pdf or build.jobs.build.epub.

@@ -692,6 +727,15 @@ def validate_search(self):

return search

def validate_incompatible_keys(self):
# `formats` and `build.jobs.build.*` can't be used together.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove this restriction. It seems pretty common to use the default PDF builder (via formats: [pdf]) , but override build.jobs.build.html. What is the problem this restriction solves?

Comment on lines +200 to +210
build_overridden = config.build.jobs.build is not None
if build_overridden:
self.run_build_job("build.html")
self.run_build_job("build.htmlzip")
self.run_build_job("build.pdf")
self.run_build_job("build.epub")
else:
self.build_html()
self.build_htmlzip()
self.build_pdf()
self.build_epub()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is where we can remove the formats restriction:

for format in ["html", "htmlzip", "pdf", "epub"]:
  commands = get_dotted_attribute(self.data.config, f"build.jobs.build.{format}", [])
  if commands:
    self.run_build_job(f"build.{format}")
  else:
    method = getattr(self, f"build_{format}")
    method()

That allows users to override only the format they want to override while keeping the default commands when using formats as well.

@stsewd
Copy link
Member Author

stsewd commented Oct 29, 2024

Supporting formats and build.jobs.build.* will require users to specify the format in both places when overriding, and both places must match. This is:

formats: ["pdf"]
build:
  # pdf is duplicated twice
  pdf: echo pdf
  # This is invalid, as epub isn't in formats.
  epub: echo epub

It just feels more explicit having everything in one place, and not depend on other keys. Also, as mentioned in the other issue, overriding one step may require to update all build steps, which will make it not so common to re-use the default commands.

@humitos
Copy link
Member

humitos commented Oct 30, 2024

Supporting formats and build.jobs.build.* will require users to specify the format in both places when overriding, and both places must match

I don't see a problem in the example you shared and I don't think the build.epub key will be invalid there. In that example, the build will end up with PDF and ePUB formats built without issues. Also,

formats: ["epub"]
build:
  pdf:
    - echo pdf

Is also valid and will use the default command for ePUB and a custom one for PDF.

I don't think we need to perform any kind of validation.

@ericholscher
Copy link
Member

We agreed to move forward with:

  • Requiring formats key for building that step
  • If the user overrides the build job format build.jobs.build.pdf, it has to be in the format key.
  • The default step will be run if it isn't overridden.

@stsewd
Copy link
Member Author

stsewd commented Nov 4, 2024

We also noted that running a default step is only required for sphinx, mkdocs and other tools don't have a default step for pdf/zip/epub.

@ericholscher
Copy link
Member

ericholscher commented Nov 4, 2024

My only question is what to do with html. We don't currently allow it in formats, and adding it also makes it complex, because it will change the meaning of previous configs where it wasn't defined but still got built. Making it required makes sense to me in build.jobs.build.html, but then we don't have a way to specify "Use the default html build" 🙃

For example, does this config build html?

formats: ["pdf"]
build:
  pdf:
    - echo pdf

This one does:

build:
  html:
    - $RTD_DEFAULT_BUILD
  pdf:
    - echo pdf

This one doesn't:

build:
  pdf:
    - echo pdf

But it creates a lot of ambiguity.

So I do wonder if the current implementation is the best, where we disallow formats & build.jobs.build.*, and then give an explicit way to run default commands?

The other way to think about this would be using the approach we discussed, and require HTML, so maybe we just say html is always built, and is implied in formats?

So to build default HTML:

formats: ["pdf"]
build:
  pdf:
    - echo pdf

And to override HTML:

formats: ["pdf"]
build:
  html:
    - echo html
  pdf:
    - echo pdf

@ericholscher
Copy link
Member

ericholscher commented Nov 4, 2024

Thinking about this more -- I think the later option still makes sense. We just always assume HTML is built. I don't think there's a super valid use case for not building HTML (though something like Django's use of our platform might be a good example, though they still link to our HTML build when their docs are down as a mirror)

@stsewd
Copy link
Member Author

stsewd commented Nov 4, 2024

I guess your examples are for sphinx. But what about mkdocs and other tools?

We don't have a default build command for PDF.

formats: ["pdf"]
build:
   html: echo html

Should this config be invalid for non-sphinx configs?

Also, mkdocs does have a default commnad for html, so for mkdocs not specifying html should be valid I guess

@ericholscher
Copy link
Member

Yea other tools presumably dont have default commands for non-html builds. They should be able to override those formats though...

@humitos
Copy link
Member

humitos commented Nov 5, 2024

I think the later option still makes sense. We just always assume HTML is built

This is how Read the Docs currently works, and I don't think we have to change that.

You later option from #11710 (comment) makes sense to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Build: implement build.jobs.create_environment and build.jobs.install
3 participants