Skip to content

Latest commit

 

History

History
144 lines (130 loc) · 12.7 KB

2023-11-08.md

File metadata and controls

144 lines (130 loc) · 12.7 KB
title parent
2023-11-08
Meetings

ASWF CI Working Group

Meeting: 08 November 2023

https://zoom-lfx.platform.linuxfoundation.org/meeting/97730566101

Attendees

  • Jean-Francois Panisset, VES Technology Committee
  • Kerby Geffrard, OpenRV
  • Andrew Grimberg, LF Release Engineering
  • Christina Tempelaar-Lietz, ILM

Apologies

New items

  • aswf-docker updates

    • Very close to local Conan cache solution, will be in 2023.2
      • WIP branch
      • Docker BuildKit (buildx) builds for Conan, still serialized but will eventually be able to parallelize Conan builds (still using a non-buildx container run for login to Conan repo)
      • Big win for local builds: local Conan builds now live in BuildKit caches, no longer pulled into Docker build context (got huge with Clang / Qt / PySide Conan builds)
      • OpenEXR and OpenVDB getting pybind11 in their ci- build containers
      • May not get to fixing issue with GLfw cmake setup for OpenVDB in 2023.1 (transition to Conan) (maybe a short script that can be used in their CI?)
    • Immediate plans for 2024 (as soon as 2023.2 is released)
      • Qt 6
      • Clang 15/16: OK to drop 14? (question for Larry)
        • Does OpenEXR have a Clang build? Christina: yes we do, but will only support prior year in the CI. Don't think our CI builds with Clang 14. JF: the VFX Platform is silent about Clang versions. ci-openexr build container doesn't include Clang. Christina: may just be building on the GHA runner? We have Clang 14 and 15 builds. JF: trying to avoid combinatorial explosion, so drop 14 to add 16. Christina: should be fine. JF: Clang is actually in the ci-common container. TODO: JF to review what OpenEXR / Imath is doing with its CI.
        • Kerby: we use Clang through Xcode for macOS, otherwise Visual Studio for Windows and GCC for Linux. JF: comment in Slack from Neil Gompa about standard ffmpeg build he maintains for Fedora and EPEL, could be useful. Interesting perhaps for codec IP issues? Kerby: discussion with Apple about ProRes. In the past we tried to be as similar as possible on each OS, but getting more complicated.
      • Python 3.11
        • Mandated by platform
      • CUDA 12.x
        • Depending on GPU runners
      • OptiX 8
      • Non-platform package updates
      • More packages Conan-only (Clang, Ninja, Pybind11 Qt, PySide)
    • Longer term plans for 2024
      • OIIO build container
      • ORI build containers
      • Container size reduction effort
        • Get rid of nsight-compute from nvidia/cuda base container if possible
        • An issue for builds that also need Intel compilers
        • aswf-docker GHA removes a number of packages from GHA runners to free up disk space: aswf-docker docker-builds.yml GHA
      • Windows / macOS...
      • Your wishes go here
        • Linux Containers based on other OSes than Rocky?
          • CentOS Stream
          • Alma Linux / UBI / ...
          • Ubuntu
          • Building our own base image instead of using nvidia/cuda could help
        • Christina: request for mingw on Windows, so not until we have a Windows container. Windows is higher priority than macOS. We get more reports of issues on Windows builds that are not captured by our CI than we do for MacOS. We have had reports of issues building with mingw, looking at adding a CI job. JF: mingw can be an alternative to Visual Studio (even though it has a license for open source use)
        • Kerby: nothing right now
  • Managing for pay resources

    • GPU / larger runners / ARM Linux runners / Apple Silicon runners
    • More visibility to TAC
      • Integration with LFX dashboard. Andrew: GitHub has no API for that, they say something is coming soonish. Want an API also to see how many free resources we're consuming, zero visibility into how many free resources are being used.
    • Resource allocation to projects
      • Dealing with bursts (OpenVDB October release)
      • Mechanism for projects to request additional resources
    • Carrying balance forward
      • $1450 cap/month (with $1500 hard cap). JF is it "use it or lose it"? Andrew: not really, LF IT keeps track of exact spend per month, LF AP passes all invoices to us for validation, we keep running tally, since we know what the cap is supposed to be, we know how much free space we have left in the budget. So last month was able to raise limit on credit card for October since we had enough budget. So there's a manual carry forward process. JF: would be great if it could somehow be automated.
    • Are we still one of the few LF sub foundations that use non free resources?
      • Andrew: CNCF are a big user, but they self support / built a shadow IT organization, but CNCF is 100x the size of ASWF
    • Andrew: asked John to double CI budget for next year since we know the GPU runners are now in open beta, ARM runners are expected some time in next 2 quarters (we will be taking part in the beta). I expect we'll see these things come in, ASWF will want to use them, need to have the capacity for that. Also additional projects will be joining ASWF. Right now we're running so close to the budget that we need to double it, but it needs to be approved. JF: several TSC chairs should be happy to provide supporting quotes for an increase.
    • JF: better coordination via #wg-ci TODO: post to various project channels about this
  • OpenRV Updates

    • Dockerfile(s) ready, but not à la asfw-docker... What's next?
      • At Autodesk we build on CentOS 7 and Rocky 9, tried to use in aswf-docker but realized the Dockerfiles are a bit "special".
      • Don't need GPU driver or anything like that.
      • JF: can walk you through how to add a ci-openrv build container
      • JF: you can start with the closest container and just run a script in your CI to add missing dependencies
    • Pre-caching dependencies using artifactory
      • OpenRV Dependencies
      • Automate the build, if you click on link above, all the dependencies in OpenRV
      • More than half our build time is taken up by dependency builds, and that's the most difficulty. Becomes really painful, and need a bunch of stuff to your own environment. Starting to talk about caching some dependencies that don't change.
      • Anybody else have a project in mind? Do you know if OIIO or OCIO have talked about providing pre-compiled binaries? Want't to know what's the right way to do this, don't want to put yet another Python / Qt in Artifactory for now good reason. But need to know the info about similar projects. xStudio might benefit from this, first step of their repo is to gather all their dependencies, so a script that can go and get them would help.
      • Andrew: we have to be careful about what we put in there, must be stuff we produced ourselves? Kerby: could we use GitHub releases? Andrew: you may be using open source projects, but don't necessarily have the license to redistribute. We have to be careful about.
      • JF: aswf-docker uses both Conan/Artifactory and Docker Hub
      • Andrew: our agreement with Artifatory is that anything we put in there is produced by our projects, so we aren't authorized to host other projects.
      • Kerby: we want to pre-compile Qt, OpenSSL... Andrew: if the other project is vendored into our own...
      • Andrew: the issue is with Artifactory specifically. Kerby: so GitHub releases would be fine? Andrew: that should be OK, we just can't do that with Artifactory because of our JFrog agreement. Kerby: so we could put resulting OpenRV build in there? What if it bundles a bunch of other stuff? Andrew: I am not a lawyer! Kerby: we find way too much time fighting dependencies. This is coming, we might ask more questions in the future.
    • Cost of CI, what is required to be known?
      • If we want to have a CI for OpenRV, what do we need to know to put something live? We don't need GPU, we don't need ARM / Apple Silicon.
      • Andrew: if you can fit in 2 core / 2 GB... Otherwise you need the larger runners. We run about $600/month below budget.
      • Kerby: at Autodesk we build with 8 cores? JF: look at definitions for for-pay Linux builders in #wg-ci. Kerby: what would be a correct target of spend per month? I know nothing about the cost of open source projects. At Autodesk we have a datacenter where we deploy VMs, now we need to think about the cost of what we're doing, don't know what's "reasonable". Andrew: that's hard to say, we get alerts from GitHub when we near our budget cap. If we run out of capacity near middle of the month we'll pull reports and check, which we did last month for OpenVDB. If we are starting to run close to the budget. "Reasonable" is a matter of opinion across projects.
      • Andrew: we have unlimited free minutes across public repos, but are limited in concurrency. But we are an "enterprise" GitHub plan, so we have largest capacity for concurrency, it's larger than what we should actually have.
      • Kerby: we don't use ccache any more to make the build simpler.
  • Current available for-pay runners (Andrew)

    • ubuntu-20.04-4c-16g-150h
    • ubuntu-20.04-8c-32g-300h
    • ubuntu-20.04-16c-64g-600h
    • windows-2022-8c-32g-600h
    • windows-2022-16c-64g-600h
    • ubuntu-20.04-gpu-6c-112g-336h-16vr
    • ubuntu-20.04-gpu-12c-224g-672h-32vr
    • ubuntu-20.04-gpu-24c-448g-1344h-64vr
  • Maintaining custom ASWF GH actions

    • Does it make sense / would it be useful for projects?

Follow Ups

Tools and Links