[This proposal is still work in progress and will be updated]
Initial rough outline of the ask
A supplier provides a package 'MGR::foo:1.00' for which no
source code is available, but an SBOM exists, e.g. 'foo.spdx.json'.
ORT should "know" about this association to enable
the following:
- Dependencies from the SBOM should be added to the dependency graph as dependencies of MGR::foo:1.00, along with proper metadata.
- The scanner should add the data from the SBOMs for 'MGR::foo:1.00' and its dependencies instead of attempting to scan with ScanCode and failing with a scan issue.
- The evaluator rules should work for MGR::foo:1.00 and its dependencies as usual, and should have an indication of whether an examined package corresponds to an SBOM or to a dependency of an SBOM.
- Advisor(s) should query vulnerabilities including SBOM dependencies.
- The notifier should also have such an indication available, so that vulnerabilities can be propagated to different channels.
- Produce a combined SBOM. TBC whether it is necessary to produce an archive which includes all corresponding original SBOMs in unmodified form.
- Produce a NOTICE file including entries for SBOM dependencies.
- Turnaround time for making curations / seeing the effect should be low, to enable an efficient clearance process. (Could also be solved outside ORT.)
Known difficulties for solution design choices
- A certain package may be mentioned by multiple SBOMs in different ways, e.g. with conflicting data.
  a.) Data could be merged manually, e.g. into a single curation.
  b.) Data could be namespaced.
- Flexibility in SBOM representations: It may not be possible to write a single on-the-fly extraction which works for arbitrary SBOMs.
- Data in SBOMs may need to be curated.
- There are various SBOM formats and versions.
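To illustrate option b.), here is a minimal Python sketch of namespacing. It is not ORT code: the function name `namespace_conflicts`, the `<id>@<sbom>` key format and the second SBOM file `baz.spdx.json` are all assumptions made up for illustration. Packages whose data agrees across SBOMs keep a single entry, and only conflicting ones get a per-SBOM key.

```python
from collections import defaultdict


def namespace_conflicts(records):
    """Group (sbom_name, package_id, data) tuples by package id.

    Ids whose data is identical in every SBOM keep one shared entry.
    Ids with differing data get one entry per SBOM, keyed by a
    hypothetical "<id>@<sbom_name>" form (an assumption, not an ORT format).
    """
    by_id = defaultdict(list)
    for sbom_name, package_id, data in records:
        by_id[package_id].append((sbom_name, data))

    result = {}
    for package_id, entries in by_id.items():
        first = entries[0][1]
        if all(data == first for _, data in entries):
            result[package_id] = first  # consistent: one shared entry
        else:
            for sbom_name, data in entries:
                result[f"{package_id}@{sbom_name}"] = data
    return result


# "baz.spdx.json" is a made-up second SBOM used only for this example.
records = [
    ("foo.spdx.json", "MGR::bar:1.00", {"declared_licenses": ["Apache-2.0"]}),
    ("baz.spdx.json", "MGR::bar:1.00", {"declared_licenses": ["MIT"]}),
    ("foo.spdx.json", "MGR::car:1.00", {"declared_licenses": ["MIT"]}),
    ("baz.spdx.json", "MGR::car:1.00", {"declared_licenses": ["MIT"]}),
]
resolved = namespace_conflicts(records)
```

The trade-off is visible in the result: namespacing needs no manual merge, but the conflicting package appears twice under different keys.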
Solution approaches
Common principles
- Define ORT data structures for the needed data (e.g. curations), which are SBOM-format-agnostic.
- The data in these data structures is then assumed to be correct / already curated.
- No on-the-fly extraction (+ SBOM curations) from original SBOMs. As a follow-up, it could be an option to look into providing a CLI helper command for this task.
- A new dedicated scanner inserts data such as detected copyrights / licenses.
- Curation data is centralized, so it has to be added / corrected only once for the entire ORG.
- Consequence: There are no redundant error-fixing iterations by various teams.
- Data is correct once added, in contrast to on-the-fly parsing and then curating.
- No SBOMs committed to the project's code repository.
Approach 1 (de-duplicated / merged sbom data)
Turn data from the SBOM into the following curations:
- Curations to define additional dependencies
  a. Enhance package curations
     id: MGR::foo:1.00
     additional-dependencies:
     - MGR::bar:1.00
     - MGR::car:1.00
  b. A hierarchical directory structure which maps Identifier -> Set<Identifier>
- A hierarchical directory structure ort-sboms (file-path is id) with data per package from the SBOM
     purl: "some/purl"
     declared_licenses:
     - Apache-2.0
     labels:
     ...
     detected_licenses:
     ....
  If multiple SBOMs contain the same dependency with inconsistent data, the extracted data would need to be
  merged when committing to this file.
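The two structures of this approach and the merge step could be sketched as follows. This is a rough Python illustration under stated assumptions, not an ORT implementation: `merge_package_data`, `add_record` and `ConflictError` are hypothetical names, and plain dictionaries stand in for the directory structures. A conflicting value is rejected so that someone has to resolve it before the data is committed.

```python
class ConflictError(Exception):
    """Raised when two SBOMs disagree about a field of the same package."""


def merge_package_data(existing, incoming):
    """Merge two per-package records; raise if a shared key has different values."""
    merged = dict(existing)
    for key, value in incoming.items():
        if key in merged and merged[key] != value:
            raise ConflictError(
                f"conflicting {key!r}: {merged[key]!r} vs {value!r}"
            )
        merged[key] = value
    return merged


# Identifier -> set of Identifiers (the additional dependencies).
additional_dependencies = {
    "MGR::foo:1.00": {"MGR::bar:1.00", "MGR::car:1.00"},
}

# One de-duplicated record per package (stand-in for the ort-sboms directory).
package_data = {}


def add_record(package_id, data):
    """Add data extracted from one SBOM, merging with what is already stored."""
    current = package_data.get(package_id, {})
    package_data[package_id] = merge_package_data(current, data)


add_record("MGR::bar:1.00", {"purl": "some/purl", "declared_licenses": ["Apache-2.0"]})
add_record("MGR::bar:1.00", {"purl": "some/purl"})  # consistent, merges cleanly

try:
    add_record("MGR::bar:1.00", {"declared_licenses": ["MIT"]})  # inconsistent
    conflict = None
except ConflictError as error:
    conflict = str(error)
```

Because the failing merge raises before anything is stored, the record for `MGR::bar:1.00` stays unchanged until the conflict is resolved manually.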
pros / cons
- Not possible to have different data for a particular package. This seems good, but maybe in edge cases it does not work?
- de-duplicated data.
- Will produce a clean product SBOM.
- Nicer identifiers (no automatic conflict resolution necessary).
- no redundancy.
- lower maintenance effort.
Approach 2
Similar to approach 1, but without the level of indirection.
So, define a hierarchical directory sboms which maps Identifier -> SbomData,
where SbomData would look like:
id: MGR::foo:1.00
dependencies:
- MGR::bar:1.00
purl: "some/purl"
declared_licenses:
- Apache-2.0
labels:
....
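As a rough illustration of this mapping, the Python sketch below models SbomData with the fields shown above (labels omitted) and keeps one entry per SBOM-backed package. The class layout, the helper `repeated_dependencies` and the second package `MGR::baz:2.00` are assumptions for the example, not ORT types or data.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class SbomData:
    """Hypothetical in-memory form of one entry in the sboms directory."""

    id: str
    dependencies: list = field(default_factory=list)
    purl: str = ""
    declared_licenses: list = field(default_factory=list)


# Identifier -> SbomData; "MGR::baz:2.00" is a made-up second package.
sboms = {
    "MGR::foo:1.00": SbomData(
        id="MGR::foo:1.00",
        dependencies=["MGR::bar:1.00"],
        purl="some/purl",
        declared_licenses=["Apache-2.0"],
    ),
    "MGR::baz:2.00": SbomData(
        id="MGR::baz:2.00",
        dependencies=["MGR::bar:1.00"],
    ),
}


def repeated_dependencies(entries):
    """Return dependency ids that are referenced by more than one entry."""
    counts = Counter(
        dependency
        for data in entries.values()
        for dependency in data.dependencies
    )
    return sorted(dep for dep, count in counts.items() if count > 1)


shared = repeated_dependencies(sboms)
```

A check like this only reports where the same dependency shows up in several entries; with this approach nothing de-duplicates it.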
pros / cons
- no de-duplication
- conflicting dependency ids possible, which may n
- A lot of redundancy of dependency data. Probably much harder to maintain when the database grows, e.g. when used on an ORG-wide scale.
- SBOM data is not scattered, just in one place.
- possible duplicates in results and reports, such as the SBOM and NOTICE file, which is not so nice.