Create SBOMs for ASF application binary distributions #35

Open
ppkarwasz opened this issue Nov 12, 2024 · 10 comments

Comments

@ppkarwasz

The purpose of this issue is to provide SBOMs for all ASF binary distributions, starting with those that:

  • contain executable applications,
  • bundle all or most of their dependencies.

These are IMHO the most critical distributions, since users cannot easily upgrade vulnerable dependencies without a new release.

The following table contains a list of binary application distributions of various Apache TLPs.
Currently only 10% of the TLPs are included:

| Application | Download URL | Tool | SBOM |
|---|---|---|---|
| Accumulo | https://accumulo.apache.org/downloads/ | Maven Assembly | [ ] |
| ActiveMQ Artemis | https://activemq.apache.org/components/artemis/download/ | Maven Assembly | [ ] |
| ActiveMQ Classic | https://activemq.apache.org/components/classic/download/ | Maven Assembly | [ ] |
| Age | only source releases? | | [ ] |
| Airavata | https://airavata.apache.org/development.html#downloads | Maven Assembly | [ ] |
| Airflow | only Python packages + Docker? | Python | [x] |
| Allura | only source releases? | | [ ] |
| Ambari | only source releases? | | [ ] |
| Ant | https://ant.apache.org/bindownload.cgi | Ant | [ ] |
| APISIX | only source releases + Docker | | [ ] |
| Aries | only sample applications | | [ ] |
| Arrow | only libraries | | [ ] |
| AsterixDB | https://asterixdb.apache.org/download.html | Maven Assembly | [ ] |
| Atlas | only libraries? | | [ ] |
| Avro | only libraries? | | [ ] |
| Axis | https://axis.apache.org/axis2/java/core/download.html | Maven WAR | [ ] |
| Beam | only libraries? | | [ ] |
| Bigtop | https://bigtop.apache.org/download.html#releases | Dpkg and Rpm | [ ] |
| Bookkeeper | https://bookkeeper.apache.org/releases/ | Maven Assembly | [ ] |
| Brooklyn | https://brooklyn.apache.org/download/index.html | Maven Assembly | [ ] |

@ppkarwasz
Author

@hboutemy,

As my initial scan shows, the Maven Assembly Plugin is the most common way to create binary application distributions. The CycloneDX Maven Plugin and SPDX Maven Plugin should probably have some special support for the Assembly plugin to generate an appropriate SBOM.

In CycloneDX the application should probably be represented as a CycloneDX assembly of libraries. What do you think?

@raboof
Member

raboof commented Nov 12, 2024

(linking CycloneDX/cyclonedx-maven-plugin#472 and spdx/spdx-maven-plugin#159 here as those issues are not really about shading specifically but really about all plugins that embed code from dependencies into a more 'fat' artifact - though the implementation might of course be separate)

@hboutemy
Member

hboutemy commented Nov 13, 2024

@ppkarwasz war already has support in the CycloneDX Maven Plugin: we can start there.
But yes, I'm not surprised that the assembly plugin is the most used: the order in CycloneDX/cyclonedx-maven-plugin/pull/576 was quite intentional.

> In CycloneDX the application should probably be represented as a CycloneDX assembly of libraries. What do you think?

I have no clue, this is the most important question I never had any clear answer on and could not easily create myself: need to see an example, be it in XML or json

@raboof shade adds additional problems to solve: this will be the final issue, and not one we can make progress on easily. We need to divide and conquer: war, then assembly, then shade. Perhaps we'll find other intermediate steps, but I at least know these ones.

@hboutemy
Member

@ppkarwasz BTW, in the list of TLPs that do SBOMs, there are a few that do such application binary distributions

I did a first test on Maven itself and JSPWiki in #23 with a "check SBOM against binary content" approach, but there are other ones: these ones are interesting because we have an initial SBOM that we can compare with proposed target content
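
A "check SBOM against binary content" comparison can be sketched roughly as follows. This is only an illustration, not the code used in #23: the function names (`sbom_jar_names`, `archive_jar_names`, `compare`) are hypothetical, and it assumes that the distribution is a zip file, that every bundled library is a JAR named `<name>-<version>.jar`, and that the SBOM lists the bundled libraries in the top-level `components` element.

```python
import xml.etree.ElementTree as ET
import zipfile


def local(tag):
    """Strip the XML namespace so the sketch does not depend on the CycloneDX schema version."""
    return tag.rsplit("}", 1)[-1]


def sbom_jar_names(sbom_xml):
    """Expected '<name>-<version>.jar' file names for the top-level SBOM components.

    Assumption: Maven-style file naming; classifiers and non-JAR types are ignored.
    """
    names = set()
    root = ET.fromstring(sbom_xml)
    for section in root:
        # Only the top-level <components>, not the <metadata> component itself.
        if local(section.tag) != "components":
            continue
        for elem in section.iter():
            if local(elem.tag) != "component":
                continue
            fields = {local(c.tag): (c.text or "").strip() for c in elem}
            if fields.get("name") and fields.get("version"):
                names.add(f"{fields['name']}-{fields['version']}.jar")
    return names


def archive_jar_names(archive):
    """Base names of all .jar entries in a zip distribution (path or file object)."""
    with zipfile.ZipFile(archive) as zf:
        return {n.rsplit("/", 1)[-1] for n in zf.namelist() if n.endswith(".jar")}


def compare(sbom_xml, archive):
    """Report JARs present on one side only."""
    expected = sbom_jar_names(sbom_xml)
    actual = archive_jar_names(archive)
    return {
        "missing_from_sbom": sorted(actual - expected),
        "missing_from_archive": sorted(expected - actual),
    }
```

Calling `compare(sbom_text, "apache-maven-3.9.9-bin.zip")` with the published SBOM text and a downloaded archive would return the two difference lists. A file-name match is a weak check; comparing hashes would be stricter, but it needs the SBOM to carry hashes for each component.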

@ppkarwasz
Author

ppkarwasz commented Nov 13, 2024

> > In CycloneDX the application should probably be represented as a CycloneDX assembly of libraries. What do you think?
>
> I have no clue, this is the most important question I never had any clear answer on and could not easily create myself: need to see an example, be it in XML or json

As an example, currently the Maven binary distribution is represented as (the example has been edited following Arnout's comment):

<metadata>
  <component type="library" bom-ref="pkg:maven/org.apache.maven/apache-maven@3.9.9?type=pom">
    <publisher>The Apache Software Foundation</publisher>
    <group>org.apache.maven</group>
    <name>apache-maven</name>
    <version>3.9.9</version>
    <description>
      The Apache Maven distribution, source and binary, in zip and tar.gz formats.
    </description>
    <purl>pkg:maven/org.apache.maven/apache-maven@3.9.9?type=pom</purl>
  </component>
</metadata>
<components>
  <component type="library" bom-ref="pkg:maven/org.apache.maven/maven-embedder@3.9.9?type=jar">
    <publisher>The Apache Software Foundation</publisher>
    <group>org.apache.maven</group>
    <name>maven-embedder</name>
    <version>3.9.9</version>
    <description>
      Maven embeddable component, with CLI and logging support.
    </description>
    <purl>pkg:maven/org.apache.maven/maven-embedder@3.9.9?type=jar</purl>
  </component>
  ...
</components>
<dependencies>
  <dependency ref="pkg:maven/org.apache.maven/apache-maven@3.9.9?type=pom">
    <dependency ref="pkg:maven/org.apache.maven/maven-embedder@3.9.9?type=jar"/>
    ...
  </dependency>
</dependencies>

The relation between apache-maven and maven-embedder should probably be represented as:

<metadata>
  <component type="application" bom-ref="pkg:maven/org.apache.maven/apache-maven@3.9.9?classifier=bin">
    <publisher>The Apache Software Foundation</publisher>
    <group>org.apache.maven</group>
    <name>apache-maven</name>
    <version>3.9.9</version>
    <description>
      The Apache Maven binary distribution, in zip and tar.gz formats.
    </description>
    <purl>pkg:maven/org.apache.maven/apache-maven@3.9.9?classifier=bin</purl>
    <components>
      <component type="library" bom-ref="pkg:maven/org.apache.maven/maven-embedder@3.9.9?type=jar">
        <publisher>The Apache Software Foundation</publisher>
        <group>org.apache.maven</group>
        <name>maven-embedder</name>
        <version>3.9.9</version>
        <description>
          Maven embeddable component, with CLI and logging support.
        </description>
        <purl>pkg:maven/org.apache.maven/maven-embedder@3.9.9?type=jar</purl>
      </component>
      ...
    </components>
  </component>
</metadata>

Compared to the current SBOM, the component type and the PURL should be adapted. In the example above I consider maven-embedder as a sort of "dynamically linked application", while apache-maven is a sort of "statically linked application".
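
To make the proposed change concrete, here is a minimal sketch of the transformation from the first layout to the second. It only illustrates the idea discussed in this issue and is not how either Maven plugin works: `nest_components` is a hypothetical helper, it moves (rather than duplicates) the top-level components, and it leaves the PURL, `bom-ref` and `dependencies` entries untouched, which a real implementation would also have to adjust.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace from a tag name."""
    return tag.rsplit("}", 1)[-1]


def child(parent, name):
    """First direct child with the given local name, or None."""
    return next((c for c in parent if local(c.tag) == name), None)


def nest_components(bom_xml):
    """Move top-level <components> under <metadata>/<component> and mark it as an application."""
    root = ET.fromstring(bom_xml)
    metadata = child(root, "metadata")
    main = child(metadata, "component") if metadata is not None else None
    top_level = child(root, "components")
    if main is None or top_level is None:
        return root  # nothing to restructure

    main.set("type", "application")

    # Reuse the namespaced tag of the existing <components> element.
    nested = child(main, "components")
    if nested is None:
        nested = ET.SubElement(main, top_level.tag)

    # Move each bundled library; list() avoids mutating while iterating.
    for comp in list(top_level):
        top_level.remove(comp)
        nested.append(comp)
    root.remove(top_level)
    return root
```

`ET.tostring(nest_components(xml_text), encoding="unicode")` would serialize the result. Whether consumers of CycloneDX documents actually look inside nested components is a separate question that this sketch does not answer.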

@raboof
Member

raboof commented Nov 13, 2024

> As an example, currently the Maven binary distribution is represented as:

Where did you get that? In https://repo1.maven.org/maven2/org/apache/maven/apache-maven/3.9.9/apache-maven-3.9.9-cyclonedx.xml, pkg:maven/org.apache.maven/apache-maven@3.9.9?type=pom is not in components, but in metadata?

> The relation between apache-maven and maven-embedder should probably be represented as:

Right - and if pkg:maven/org.apache.maven/apache-maven@3.9.9?type=pom is the metadata.component, in that case pkg:maven/org.apache.maven/maven-embedder@3.9.9?type=jar would be moved from components to metadata.component.components? (also CycloneDX/cyclonedx-maven-plugin#472 (comment))

@ppkarwasz
Author

> > As an example, currently the Maven binary distribution is represented as:
>
> Where did you get that? In https://repo1.maven.org/maven2/org/apache/maven/apache-maven/3.9.9/apache-maven-3.9.9-cyclonedx.xml, pkg:maven/org.apache.maven/apache-maven@3.9.9?type=pom is not in components, but in metadata?

I took the liberty of "simplifying" the example. I have now edited it to follow the real structure more closely.

@hboutemy
Member

hboutemy commented Nov 13, 2024

I'm so surprised to see all the components in the distribution moving from components to metadata.components
I suppose they move, isn't it? they are not duplicated?

I can't imagine a Docker image doing that for everything that is embedded in the image (I'm trying to compare to other ecosystems, where there is an equivalent embedding)

@raboof
Member

raboof commented Nov 14, 2024

> I'm so surprised to see all the components in the distribution moving from components to metadata.components

I didn't expect that either

> I suppose they move, isn't it? they are not duplicated?

That is my understanding, yes

> I can't imagine a Docker image doing that for everything that is embedded in the image (I'm trying to compare to other ecosystems, where there is an equivalent embedding)

indeed all the tools I'm aware of just put them in the main components - brought it up in CycloneDX/cyclonedx-maven-plugin#472

@hboutemy
Member

I imagine there is more diversity than what was expected.
In my world, I imagine that when describing an application, we expect that dependencies are embedded, and de facto when describing a library, dependencies are "soft referenced": but maybe it's not always true. And it does not cover the advanced case of shaded dependencies, where just a few dependencies are embedded into a library.

We'll need to have a more interactive discussion with SBOM experts, because this is starting to get too complex for just an issue tracker.
