---
title: 2022-04-27
parent: Meetings
---

# ASWF CI Working Group

## Meeting: 27 April 2022

https://zoom.us/j/757849640?pwd=QzE1K2hrL2FHSFhKK3h5Z3BWTFJsZz09

### Attendees

  • Jean-Francois Panisset (VES Technology Committee)
  • Andrew Grimberg (LF RelEng)
  • Ryan Bottriell (spk / ILM)
  • Aloys Baillet (NVIDIA)
  • Jean-Christophe Morin
  • Larry Gritz (Sony Imageworks)
  • Dan Bailey
  • Allan Johns (Rez)
  • Matias Codesal

### Apologies

### New items

  • Allan Johns: Rez recipe repository and CI requirements
    • Cycling back to this idea after a few weeks. Working with Stephen MacKenzie, starting on a repository of Rez recipes to build specific packages for the VFX Reference Platform.
    • rez-recipes-vfxrefplatform
    • Configures things that the packages can't configure themselves.
    • The recipes directory encodes a basic idea of a VFX Reference Platform package: it carries a number of weak references in Rez, which set the bounds of what's possible in terms of versions.
    • In recipes/cmake/package.py the package can be defined dynamically; Python code sets the versions (see the sketch after this item).
    • recipes/cmake/build.py: the general idea is to create a template for a CMake build, setting parameters based on build information that comes from Rez, for instance specific CMake flags to enable Qt.
    • The aim is to come up with a first iteration we can start to build on. One thing missing in Rez is "package features": a way to describe an attribute of a package, such as a build configuration, which could then be referenced at resolution time. This can be done currently, but in a janky way.
    • Enabling "hash variant mode" in Rez would help with the combinatorial explosion of variants in a public repo: it generates a SHA hash that ends up on disk. Combined with package features, you end up with a bunch of hashed package artifacts on disk, getting similar to Conda.
    • Initially, not trying to solve the problem of pulling from an artifact repo, just being able to build everything you want present for the VFX Reference Platform. Longer term it would make sense to integrate with an artifact repo, which would need updated build scripts aware of the repo.
    • Ryan: the intention would be to have a public repo that can be cloned and used to build everything you need? Allan: yes, that's the initial goal, something that would be broadly useful to the studios. Ryan: how does Rez handle source files? Does it expect that you have already cloned them? Allan: it may support both existing source files and a clone action, whatever is most appropriate for the specific project (git clone, pull a tarball...).
    • Aloys: any desire to make it easier in package.py to specify the version from the command line? It's possible to do this, but it's a bit of boilerplate to copy into all the package.py files; is there any cleaner way? Allan: something to think about; the recipes repo will have multiple versions of the packages, so which one do you want to specify? Initially the plan is something basic: being able to define the packages you want to build, or the whole VFX Platform, as a request of what to build. Keep it simple initially; don't want to get bogged down in trying to solve these issues up front.
    • Allan: hopefully this is something that multiple studios can use. It implies packages being named the same way; "python" vs "Python" is often an issue. How variants are laid out can depend on whether studios have rezified down to the compiler version level. May end up with config-driven ways to change package definitions; not sure what this ends up looking like, or whether studios can use it verbatim (with local changes) or it's more of a template they adapt for their own use.
    • Ryan: had this same conversation around spk: what's the override system that allows using a public repository? It's a useful feature to be able to leverage something that exists, with just your patches on top.
    • Andrew: we do something like that in the way we manage Packer builds. We provide a library of Packer builds, and people symlink things to include the basic definitions and overlay their own on top; that gives them the ability to use and add to what LF RelEng does. Packer is JSON based for some of the files, some are YAML. But yes, it's a static structure, although Packer now accepts HCL, which is kind of a programming language and harder to overlay.
    • Allan: in Rez, package aliases help with packages that do the same thing but have different names at different facilities.
    • Should there be an Ansible repo as well? It could give structure, a way to set up your various build parameters in a known way, e.g. instantiations of CMake files.
    • Ryan: along those lines, having a container available for CI would be a similarly useful way to approach this: an image that can mount the storage location. But Ansible could be applied to an existing host. Allan: the two aren't mutually exclusive; a way to create the project structure might make it clear where studio-specific differences go.
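A minimal sketch of what a dynamic recipe along these lines could look like, for illustration only: the version list, the RECIPE_CMAKE_VERSION environment variable, and the single weak requirement are hypothetical (not taken from rez-recipes-vfxrefplatform), while hashed_variants, weak "~" requirements, and the {root}/{install} expansions are existing Rez package.py constructs.

```python
# Hypothetical sketch of a dynamic Rez recipe in the spirit of
# recipes/cmake/package.py; all names and versions are illustrative.
import os

name = "cmake"

# The recipe knows which versions it can build; a hypothetical environment
# variable lets a caller request one of them (roughly the per-file
# boilerplate Aloys asked about reducing).
_known_versions = ["3.18.6", "3.21.7", "3.23.1"]
version = os.environ.get("RECIPE_CMAKE_VERSION", _known_versions[-1])

# Weak requirements ("~" prefix) bound what a resolve may contain without
# pulling the package in, setting "the bounds of what's possible in terms
# of versions" for a VFX Reference Platform year.
requires = [
    "~python-3.7",  # illustrative bound
]

# Hash variant mode: variant subdirectories on disk become a hash of the
# variant requirements rather than the spelled-out requirements, keeping a
# public repo with many variant combinations manageable.
hashed_variants = True

# Delegate the actual CMake-driven build to the recipe's build script;
# {root} and {install} are expanded by Rez at build time.
build_command = "python {root}/build.py {install}"
```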
  • Larry Gritz: Unity builds and CMake, experience with OIIO: AcademySoftwareFoundation/OpenImageIO#3402
    • Optimizing Pixar's USD Compile Time
    • Larry: the CMake docs describe unity builds; he assumed it referred to the Unity game engine, but that's not the case. Someone pointed out that the phrase means combining translation units / cpp files into bigger globs, which can limit the work done separately for every translation unit: digesting the same set of large, expensive headers, plus other sources of overhead.
    • If you use a large template in a header, turning it into code for specific types may have to be done over and over for each translation unit, then thrown away at link time.
    • CMake lets you do this more or less automatically: it generates a shell cpp that #includes all the other cpps into one translation unit (see the sketch after this item).
    • Most code bases aren't safe to do this without some surgery. There are obvious requirements, such as headers needing #pragma once or include guards, but there are subtler ways to violate the one definition rule by doing things differently in different modules.
    • Tried this on OIIO; the two cases that needed to be tracked down were a file-scope static variable and an inline function with the same name in two files. That's normally safe since each is scoped to its file, but not once you merge them into a single compilation unit. Not difficult to resolve, but it can take time to track down.
    • A series of PRs fixed these one by one, then PR 3402 turns on unity builds; it has a table with timings. Whether it's useful will greatly depend on the code base. You gain efficiency by removing redundant work, but can lose ground on a system with lots of cores doing parallel builds: you can't load-balance as well with a few large cpp files instead of lots of small ones, and sometimes doing more total work is better if you can do it in parallel. After all this work, on an 8-core laptop and a 32-core workstation it's kind of a wash whether there are any savings, maybe not worth the trouble. But if you force a single-threaded build you come out way ahead, and one situation where this happens is CI, where the GitHub runners get about "1.5" cores. You never know which specific runner you are getting, and build times are not stable, but the trends show you can pretty reliably cut build time by 30%-40%. There are 30 different jobs in the test matrix, and the Windows / Mac VMs are extra slow, so anything that decreases CI turnaround time is useful. The default will be off in this codebase for individual users, but turned on for CI by default.
    • On the TODO list: look at some of OIIO's dependencies. OIIO relies on the aswf-docker containers for a lot of these, but it also compiles a lot of versions that aren't in the VFX Platform, or builds against the latest version of the most important ones, to find out as quickly as possible whether those dependencies are introducing breaking changes; speeding up those builds also helps. Quite a coincidence that someone posted a blog post today about unity builds for USD ("Optimizing Pixar's USD Compile Time", linked above); they got a huge boost due to heavy template use. It's a lot of work to get to the point where it works.
    • Worth looking at the table of results, and also at how much variability there is in CI builds. Could be useful to plant this idea in other projects, so different projects could do their own work on unity builds and anyone building downstream would be safe.
    • Some people say unity builds (the CMake terminology), others call them jumbo builds.
    • Tried to determine which headers in OIIO are most expensive to include, and tried to include them only in the modules that really need them.
    • Haven't explored yet: you might get better optimized code from a unity build, similar to link-time optimization: better inlining, cache-friendly layout. You don't compile everything as a single module; the build is already broken up into specific libraries, and there are parameters that control how many cpp files are mashed into a single one.
    • Aloys: interesting to hear. Did a pass at switching systems at AL to unity builds, a long process but kind of worth it. For USD, just turning on ccache helped. Larry: also used that, but had to disable ccache for the unity build tests. Ccache helps subsequent builds, but not when you start from scratch.
    • Aloys: a successful build after ccache would be 10x faster for USD
    • Dan: OpenVDB has been looking at this for a while; it's the most notorious example of crazy template instantiation. Haven't tried unity builds, but they might not squeeze into memory. Larry: yes, you can run out of memory on the runners if you combine enough cpps to get a benefit. Dan: added explicit template instantiation, which has an impact on build time: doing the work to instantiate those pieces up front improved the build time, and it also benefits client code / downstream users. Haven't done the whole codebase yet. Building on top of OpenVDB should be a lot faster, but the core library is much larger since it carries all the instantiations (included in OpenVDB 9.0).
    • Larry: explicit instantiation also increases the surface of the ABI. If the template lives only in the header, that's more resistant to ABI changes, whereas explicitly instantiating the templates makes the template expansion part of the ABI. Dan: yes, there are pros and cons to all techniques.
    • JF: what about deferring template instantiation to link time, or precompiled headers? Larry: not sure whether current compilers do this, or when they actually get around to instantiating. Dan: quite a bit of redundant instantiation is happening. Looking forward to dropping C++14 support soon to be able to use C++17 constexpr features, which should help. Larry: yes, same.
    • Larry: do you have a web reference for these template tricks? When trying to explain this to people it can be tricky, and there doesn't seem to be a good reference that spells it out. Dan: found a few references and will try to dig them up. Also, it's one of those things that seems tantalizingly easy but can be quite a lot of work once you try it.
    • Dan: a pet peeve is that a lot of universities focus on OO programming, which leads to lots of algorithms living in templated classes; moving algorithms to a separate place from the data is a fundamental design consideration from the beginning of a project.
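To make the mechanism concrete: a unity (or jumbo) build just generates "shell" sources that #include the real cpp files, which is what CMake's UNITY_BUILD target property automates. The Python sketch below is illustrative only (hypothetical function and file names, not CMake's actual implementation); its final comment notes the one-definition-rule hazard Larry described.

```python
# Illustrative sketch of the unity-build mechanism: batch .cpp files into
# generated "unity" sources that #include the originals. CMake's real
# implementation (UNITY_BUILD, UNITY_BUILD_BATCH_SIZE) is more elaborate.
from pathlib import Path

def write_unity_sources(cpp_files, out_dir, batch_size=8):
    """Return paths of generated unity_*.cpp files covering cpp_files."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    unity_files = []
    for batch_index, start in enumerate(range(0, len(cpp_files), batch_size)):
        batch = cpp_files[start : start + batch_size]
        unity = out_dir / f"unity_{batch_index}.cpp"
        # Each generated file is nothing but #include lines; the compiler
        # then sees the whole batch as one translation unit, so expensive
        # shared headers are digested once per batch instead of once per file.
        unity.write_text(
            "".join(f'#include "{Path(p).resolve()}"\n' for p in batch)
        )
        unity_files.append(unity)
    return unity_files

# Hazard from the discussion: if two files in the same batch each define,
# e.g., 'static int verbose = 0;' or an inline helper with the same name,
# they were fine as separate translation units but collide in the merged
# one, violating the one definition rule.
```

In CMake itself this is enabled per target with set_target_properties(tgt PROPERTIES UNITY_BUILD ON) or globally via CMAKE_UNITY_BUILD, and the SKIP_UNITY_BUILD_INCLUSION source property can exclude files that aren't yet unity-safe.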
  • Native GitHub support for DCO:
    • John Mertic: We've been working with GitHub over the past few months on making DCO sign-offs easier in the browser, and we are pleased to announce that we've turned on this beta feature for testing in all ASWF repos where a DCO is used.
    • GitHub survey: GitHub DCO Survey
    • Andrew: last year GitHub turned off all the Probot integrations in Q3, which caused a lot of problems for people running open source projects, including all the LF projects that used the DCO Probot; GitHub backtracked and turned it back on. To avoid breaking everybody again, GitHub needs native DCO support like Gerrit and GitLab have, and they have been working on this. What's rolling out is part 1 of 2 of their beta: part 1 is in the GitHub UI, where if you have access to the repository it will add your "Signed-off-by" line for you. The survey is along the lines of "if you have used this feature, how easy is it to use". Phase 2 will be branch protection to disallow a PR that doesn't have DCO sign-off on all commits, stopping someone from raising a PR if not all commits are signed (instead of checking in a CI stage). An LF project like ASWF should require DCO on every commit of every repo, but there have been loopholes for some repos targeted at non-developers working through the GitHub UI, such as the Landscape repo that targets the marketing team. This is a way to enforce DCO on every project.
  • Aswf-docker updates
    • Planning to move to USD 22.05 soon, but haven't started yet
    • Also CentOS 7.9
  • Permissions update on the ASWF wg-ci GitHub repo: PR approval is no longer needed to merge, since GitHub has no way to require PR approval while still allowing self-approval

### Tools

### Follow Ups

  • From Rob Fanner at Foundry about sharing Conan recipes:
    • Less progress than hoped, as there is a bit more actual work involved than expected to clear out some ugly bits, and that needs to be scheduled among other objectives.
      • The legal team on our side is keen to progress with Apache 2.0 for the code we want to share. Cleaner than going the evaluation-license route, which would need multiple orgs signing off (as noted by you/others in the earlier meeting).
      • Work to clean things up before sharing tentatively scheduled for June. It’s not an enormous amount of work, but enough to make it tricky to expedite vs other work internally.
  • Updates on GHA custom / for-pay instances (Andrew)
    • No update yet; in the last two months of GitHub syncs, the people in charge of that functionality weren't available