
imports in coherent.test mask project under test #3

Closed
jaraco opened this issue Jun 10, 2024 · 10 comments · Fixed by #9

@jaraco
Member

jaraco commented Jun 10, 2024

When running coherent.test, it imports pip_run and coherent.build:

import pip_run
from coherent.build import bootstrap

And then runs pytest in-process:

with bootstrap.write_pyproject(), pip_run.deps.load('.[test]') as home:
    sys.path.insert(0, str(home))
    runpy.run_module('pytest', run_name='__main__')

As a result, when testing coherent.build, pip-run, or any of their dependencies, the test run will not see the current code of the project under test, even though it was just installed, because those modules are already loaded into sys.modules.

I can think of a couple of possible approaches:

  • Run pytest in a clean subprocess.
  • Clean the imports from sys.modules before launching pytest (roughly as sketched below).
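
A minimal sketch of the second option, assuming the only prefixes that matter are the two named above (forget is a hypothetical helper, not existing code):

import sys

def forget(*prefixes):
    # drop previously imported modules so pytest re-imports them
    # from the freshly installed copy of the project under test
    for name in list(sys.modules):
        if name.startswith(prefixes):
            del sys.modules[name]

forget('pip_run', 'coherent.build')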

As a temporary workaround, I'm invoking coherent.test with an explicit early dependency on . so the local build shows up and is used for coherent.test (and thus gets used by pytest):

pip-run . coherent.test -- -m coherent.test
@pawamoy

pawamoy commented Jun 12, 2024

I'm watching activity on pfmoore/editables and I didn't want to derail the conversation there, so I will ask here. I saw that you wrote this:

Once the system has editable support, I'd like to use the editable install for installing . so that there's just one copy of the files under test.

Do I understand correctly that you want the tests to run on the sources rather than an installed version of the current project? Isn't testing the sources directly generally recommended against, as it could hide potential packaging issues?

@jaraco
Member Author

jaraco commented Jun 12, 2024

Love the engagement!

Do I understand correctly that you want the tests to run on the sources rather than an installed version of the current project?

Yes. For several reasons.

  • When developing, sources can change rapidly, for debugging and incremental changes. A developer should be able to test the code that's being changed.
  • Packaging the code between each test execution can be slow and tedious and necessarily requires the tooling to get involved (re-running pytest under a watch is not viable).
  • Error messages and tracebacks will point to code other than the source code.
  • Having the code under test in two places can lead to confusion and even errors. In some scenarios, pytest attempts to protect against unexpected imports with an ImportMismatchError, but this error is difficult to troubleshoot and has been disabled in the latest import mode.
  • Some tests (namely doctests) are necessarily (or at least conventionally) discovered in the source tree and not in the imported modules.

Isn't testing the sources directly generally recommended against, as it could hide potential packaging issues?

It could potentially hide packaging issues. And in fact it's happened to me many times that I've tested a change, got everything to pass, then committed and pushed upstream only to find that a new file wasn't added to the commit. However, this scenario isn't necessarily fixed by installing (as the build might pick up the file even if it's not under source control), and it's almost always caught in CI prior to any release. I've found it exceedingly rare that a change will pass checks in CI but still have packaging issues, especially after adopting setuptools_scm and include_package_data (and other conventions established in skeleton).

Moreover, with the Coherent System, I'm really aiming to take these concerns out of the users' hands. I'd like to provide some modestly-opinionated choices and clear guidance such that a user is unlikely to mess up packaging. For example, the current implementation includes all files and data in the package (with special exclusions for docs and tests). In coherent-oss/coherent.build#4, I plan to make that more sophisticated and require the files be in source control. And long term, I'd like to support a VCS like jj to eliminate the accidental omission of files (as it commits everything by default).

I want to eliminate the footguns (required config knobs) and duplicative steps (adding a file with MANIFEST.in + VCS) and provide an experience that's easy to get right and intuitive. For the reasons enumerated above, it's more intuitive for a developer to test their sources than to test their installed product, so in general, I prefer/advise sticking to one copy of the sources.

That said, an editable install doesn't necessarily have the right fidelity and certainly has limitations (especially with regard to built extensions or any solution that relies on installer behaviors to function, such as .pth files, cache warming, etc.), so I'm not ready to declare testing the installed version off limits, but I am willing to discourage it where it's not needed.

And in cases where it is needed or valuable, I recommend having a separate set of tests (perhaps akin to integration tests) that test the installed package, which can run locally or in CI but don't interact with the main test suite.

@jaraco
Member Author

jaraco commented Jun 12, 2024

I should also point out that this issue exists under the current regime, which does test the modules as installed (as there's not yet support for editable installs of coherent projects).

@pawamoy

pawamoy commented Jun 12, 2024

Thank you, that makes sense. I believe my own tests actually run against source files and not a non-editable installed version of my project. And it's working fine. I did make packaging mistakes in the past, forgetting to include files in dists or misconfiguring the dists builder, that the tests didn't catch. But nowadays, with my templated projects, this just doesn't happen anymore. If Coherent's build and test systems ensure that, then yes, the convenience of editable installs outweighs recommendations. Also, you're right, it makes sense to have additional integration tests that specifically check that dists are built as expected.

@jaraco
Member Author

jaraco commented Jun 19, 2024

Let me know what direction you're thinking of heading.

@jaraco jaraco assigned jaraco and unassigned bswck Aug 5, 2024
jaraco added a commit that referenced this issue Aug 5, 2024
Fixes #3 but still has problems when tests are launched under coherent.cli (Ref #7).
@jaraco
Member Author

jaraco commented Aug 5, 2024

In 7b3cc9a, I've drafted a fix to clean the imports, but this approach unfortunately doesn't work for coherent.cli (the issue reported in #7), because the modules aren't saved early enough (coherent.build is imported in coherent.cli.__main__).

So maybe the best option would be to always launch pytest in a subprocess. That would be the cleanest way to get a clean set of sys.modules for that invocation.

Other options I'm considering:

  • In coherent.cli.__main__, import coherent.test.modules (saving the modules early).
  • In coherent.cli.__main__, push coherent.build import to be delayed, so it's not imported on the code path.
  • In coherent.test, instead of saving the modules, explicitly search through sys.modules and remove anything that's not stdlib (roughly as sketched after this list).
  • In coherent.test, simply clear everything from sys.modules.
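
A rough illustration of the third option, assuming Python 3.10+ for sys.stdlib_module_names (a sketch only, not the actual implementation):

import sys

def prune_third_party():
    # keep only stdlib modules; anything else gets re-imported by pytest
    # from the freshly installed copy of the project under test
    keep = set(sys.stdlib_module_names)
    for name in list(sys.modules):
        if name.partition('.')[0] not in keep:
            del sys.modules[name]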

@pawamoy

pawamoy commented Aug 5, 2024

In the past, mkdocstrings introspected objects in the same process, and had to force-reload sys.modules when sources changed. This didn't work well with some packages like NumPy (we were computing the sys.modules diff from one iteration to the next to unload everything), which failed the second time because of some low-level memory initialization, I suppose.

I don't think it's insurmountable. I just opted to run a subprocess instead at the time, which was much easier.

jaraco added a commit that referenced this issue Aug 5, 2024
@jaraco
Member Author

jaraco commented Aug 5, 2024

In 761e182, I implement the subprocess-based approach. It's more involved because of all the wrapping behaviors that have to happen:

  • Prepending a value to PYTHONPATH.
  • Passing PYTHONPATH in the environment.
  • Explicitly passing sys.argv[1:].
  • Explicitly reflecting the result code from pytest.

It was so much more elegant to let all of those behaviors just fall through.
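
For reference, the wrapping amounts to roughly this (a simplified sketch, not the actual code in 761e182; run_pytest and its home parameter are illustrative names):

import os
import subprocess
import sys

def run_pytest(home):
    env = dict(os.environ)
    # prepend the temporary install dir so the test dependencies resolve
    env['PYTHONPATH'] = os.pathsep.join(
        filter(None, [str(home), env.get('PYTHONPATH')]))
    # forward the caller's arguments and reflect pytest's exit status
    cmd = [sys.executable, '-m', 'pytest', *sys.argv[1:]]
    raise SystemExit(subprocess.run(cmd, env=env).returncode)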

@jaraco
Member Author

jaraco commented Aug 5, 2024

This didn't work well with some packages like NumPy (we were computing the sys.modules diff from one iteration to the next to unload everything), which failed the second time because of some low-level memory initialization, I suppose.

This feedback is really useful. In this particular case, since the imports are largely under the control of the coherent system, there's an opportunity to limit the modules to those that work well in such a situation. But I agree, it would be better to avoid imposing these cross-project constraints (e.g. if typer were to adopt numpy, this problem would re-emerge under a module-clearing approach).

@bswck
Member

bswck commented Aug 5, 2024

Isolation is good!

jaraco added a commit that referenced this issue Aug 5, 2024
@jaraco jaraco linked a pull request Aug 5, 2024 that will close this issue
@jaraco jaraco closed this as completed in #9 Aug 5, 2024