pants export-codegen fails with "Error copying bytes" when run across entire repo #21771

Open
kdowney-talos opened this issue Dec 17, 2024 Discussed in #21713 · 5 comments

@kdowney-talos

Discussed in #21713

This particular repo has a number of modules, but only one has code to generate: a module with Protobuf files. However, if I widen the scope and run pants export-codegen :: for every module, I get what looks like a race condition related to copying a CSV file. I isolated it to a single module, serenity.risk: it fails even if I just run pants export-codegen serenity.risk::

(1) It's unclear to me why this file would be in scope for code generation.
(2) Regardless, the error itself seems like a red herring: it is a file-permissions error raised while copying.

I tried further narrowing to sub-directories, etc., and it does seem to be this particular file.

The only notable thing about this file is its size, more than 500K:

-rw-r--r--  1 serenitydev serenitydev 574136 May 13  2024  DAR_prices_data.csv
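Since size is the one thing that stands out about this file, it may help to check whether other files of similar size exist in the module. The following is a minimal sketch for doing that; the helper name `find_large_files` and the default 500,000-byte cutoff are assumptions chosen for illustration, not values taken from Pants.

```python
from pathlib import Path


def find_large_files(root, pattern="*.csv", min_bytes=500_000):
    """Return sorted (relative path, size in bytes) pairs for files under
    `root` whose name matches `pattern` and whose size is >= `min_bytes`."""
    base = Path(root)
    hits = []
    # rglob searches the whole tree below `base`, like a "**/<pattern>" glob.
    for path in base.rglob(pattern):
        if path.is_file():
            size = path.stat().st_size
            if size >= min_bytes:
                hits.append((path.relative_to(base).as_posix(), size))
    return sorted(hits)
```

Calling `find_large_files("serenity.risk")` from the repo root would list every CSV at or above the cutoff, which makes it easy to see whether DAR_prices_data.csv is the only candidate.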

Additional information:

  • Observed with pants 2.23.0 and 2.25.0.dev1
  • Running on Ubuntu jammy image
  • Inside Docker container, running in Docker Desktop 4.37 on MacOS 15.1.1
(.venv) serenitydev@adb319d55c1a:/workspaces/pms$ pants export-codegen ::
16:11:29.08 [INFO] Writing generated files to dist/codegen
16:11:29.09 [ERROR] 1 Exception encountered:

Engine traceback:
  in `export-codegen` goal

IntrinsicError: Error copying bytes from /home/serenitydev/.cache/pants/lmdb_store/immutable/files/2d/2d522219da490246fd3842165d3ebb7362314e218a510c16daeb09bc8d5fc5a6 to /workspaces/pms/dist/codegen/serenity.risk/experiments/dar-wilshire/DAR_prices_data.csv: Permission denied (os error 13)

Output of uname -a:

Linux c774de47c447 6.10.14-linuxkit #1 SMP Fri Nov 29 17:22:03 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux

Severity: very low, I can easily work around this.

@huonw
Contributor

huonw commented Dec 18, 2024

Sorry for the trouble.

When you see this error, can you manipulate that file directly (in the same shell running Pants)? For instance cp /home/serenitydev/.cache/pants/lmdb_store/immutable/files/2d/2d522219da490246fd3842165d3ebb7362314e218a510c16daeb09bc8d5fc5a6 ./dist/testing

#19826 is potentially slightly related?

@huonw
Contributor

huonw commented Dec 18, 2024

Also, can you work out what targets are generating the file? If so, can you share their definitions here (and pants.toml and any relevant other targets), in as much detail as you are comfortable?

@huonw huonw added the bug label Dec 18, 2024
@kdowney-talos
Author

Yes, I can copy it:

(.venv) serenitydev@c774de47c447:/workspaces/pms$ ls -la dist
total 564
drwxr-xr-x  5 serenitydev serenitydev    160 Dec 19 00:00 .
drwxr-xr-x 34 serenitydev serenitydev   1088 Dec 18 18:38 ..
-r-xr-xr-x  1 serenitydev serenitydev 574136 Dec 19 00:00 2d522219da490246fd3842165d3ebb7362314e218a510c16daeb09bc8d5fc5a6
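Note that the copy shows up as `-r-xr-xr-x`, with no write bit set for anyone. To compare the same attributes on the cached source file and on the destination under dist/codegen, a small helper along these lines could be used. It is only a sketch: `describe_path` is a hypothetical name, and nothing here is part of Pants.

```python
import os
import stat


def describe_path(path):
    """Summarise attributes that matter for a 'Permission denied' on copy."""
    st = os.stat(path)
    parent = os.path.dirname(os.path.abspath(path))
    return {
        # Symbolic mode string in the same style as `ls -l`.
        "mode": stat.filemode(st.st_mode),
        "size": st.st_size,
        "owner_uid": st.st_uid,
        # True only if the owner write bit is set on the file itself.
        "owner_writable": bool(st.st_mode & stat.S_IWUSR),
        # Whether the current user may create or replace entries in the parent.
        "parent_writable": os.access(parent, os.W_OK),
    }
```

Running it against both paths from the error message (if the destination exists) would show whether a read-only file is already sitting where Pants is trying to write.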

It's hard to identify the targets because, as I said, the module that's causing problems doesn't have any codegen targets, and if I specify just the module that has protobuf targets, the bug does not appear.

I cannot directly attach *.toml files, so I have provided pants.toml inline here:

[GLOBAL]
pants_version = "2.23.0"
backend_packages = [
  "pants.backend.build_files.fmt.ruff",
  "pants.backend.codegen.protobuf.lint.buf",
  "pants.backend.codegen.protobuf.python",
  "pants.backend.docker",
  "pants.backend.docker.lint.hadolint",
  "pants.backend.experimental.adhoc",
  "pants.backend.experimental.java",
  "pants.backend.experimental.openapi",
  "pants.backend.experimental.openapi.lint.openapi_format",
  "pants.backend.experimental.python",
  "pants.backend.experimental.python.lint.ruff.check",
  "pants.backend.experimental.python.lint.ruff.format",
  "pants.backend.experimental.terraform",
  "pants.backend.experimental.terraform.lint.tfsec",
  "pants.backend.experimental.tools.yamllint",
  "pants.backend.python",
  "pants.backend.python.typecheck.mypy",
  "pants.backend.shell.lint.shfmt",
  "pants.backend.tools.taplo",
]

[buf]
config = "build-support/buf.yaml"
known_versions = [
  "v1.47.2|linux_arm64 |e7188833039d4e7736de517eba6141b9306f4b60b00974392dac7ce38627321e|22700012",
  "v1.47.2|linux_x86_64|39716cfe0185df3cba21f66ec739620ffb6876c48b2da4338a8c68c290c9b116|24719887",
]
version = "v1.47.2"

[export]
py_editable_in_resolve = [
  "serenity-base",
  "serenity-base-services",
  "serenity-data-client",
  "serenity-data-pipelines",
  "serenity-middleware",
  "serenity-risk",
  "serenity-specifications",
]
resolve = [
  "serenity-deps"
]

[mypy]
config = "build-support/mypy.ini"
install_from_resolve = "mypy"

[pytest]
install_from_resolve = "pytest"
requirements = ["//3rdparty/python:pytest"]

[python]
default_resolve = "serenity-deps"
enable_resolves = true
interpreter_constraints = ['==3.12.*']

[python.resolves]

# tools
black = "3rdparty/python/black.lock"
flake8 = "3rdparty/python/flake8.lock"
isort = "3rdparty/python/isort.lock"
pytest = "3rdparty/python/pytest.lock"
mypy = "3rdparty/python/mypy.lock"
ruff = "3rdparty/python/ruff.lock"

# frameworks
dagster = "3rdparty/python/dagster.lock"

# modules
serenity-deps = "3rdparty/python/serenity-deps.lock"
serenity_data_pipelines-deps = "3rdparty/python/serenity_data_pipelines-deps.lock"

[python-protobuf]
infer_runtime_dependency = false

[ruff]
config = "build-support/ruff.toml"

[source]
root_patterns = [
  "src/protos",
  "src/python",
  "tests",
  "/serenity.analytics",
  "/serenity.base",
  "/serenity.base.services",
  "/serenity.data.client",
  "/serenity.data.pipelines",
  "/serenity.labs",
  "/serenity.middleware",
  "/serenity.specifications",
  "/serenity.risk",
  "/serenity.seal"
]

[taplo]
glob_pattern = ["**/*.toml", "!pyproject.toml", "!.taplo.toml", "!taplo.toml"]

[test]
attempts_default = 3

[yamllint]
config_file_name = "build-support/yamllint.yml"
exclude = ["**/k8s/**/*.yaml", "**/k8s/**/*.yml"]

@huonw
Contributor

huonw commented Dec 19, 2024

Thanks for the info.

Re the target: from the error message, it looks like the cached file holds the contents that Pants is trying to write to serenity.risk/experiments/dar-wilshire/DAR_prices_data.csv under dist/codegen. Are there any targets that interact with a file with that name?

@kdowney-talos
Author

Hi @huonw, thanks for picking this up. Here's the full BUILD under serenity.risk:

python_requirements(name="deps", source="pyproject.toml")

resources(name="metadata", sources=["**/*.csv"])
resource(name="pyproject", source="pyproject.toml")
resource(name="version", source="VERSION.txt")

python_distribution(
    name="dist",
    dependencies=["//serenity.risk/src/python:lib", ":pyproject", ":version"],
    provides=python_artifact(name="serenity-risk"),
    generate_setup=False,
)

As you can see, the CSV file gets picked up by the metadata resources target, which is then used as a dependency of the python_sources target in serenity.risk/src/python/BUILD:

resources(name="config", sources=["**/*.cfg", "**/*.json"])

python_sources(
    name="lib",
    sources=["**/*.py"],
    dependencies=[
        "//serenity.risk:deps",
        "//serenity.base/src/python:lib",
        "//serenity.data.client/src/python:lib",
        "//serenity.specifications/src/python:lib",
        "//serenity.risk:metadata",
        ":config",
    ],
)

Nothing code generation related. As a workaround, I could certainly refine that glob to avoid picking up files under the experiments directory without breaking our build.
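For reference, narrowing the glob might look roughly like this in serenity.risk/BUILD. This is a sketch rather than a tested fix: it relies on Pants treating a `!`-prefixed entry in `sources` as an exclusion, and the pattern `!experiments/**` is an assumption about the directory layout, inferred from the path in the error message.

```python
# serenity.risk/BUILD (sketch; only the metadata target changes)
resources(
    name="metadata",
    sources=[
        "**/*.csv",
        # Assumed exclusion: skip everything under experiments/, which
        # includes experiments/dar-wilshire/DAR_prices_data.csv.
        "!experiments/**",
    ],
)
```

If any code loads CSV files under experiments/ as resources at runtime, this change would drop them, so that is worth checking before applying it.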
