Clarify language around 'identity' #1133

Closed
TomHennen opened this issue Sep 18, 2024 · 24 comments

@TomHennen
Contributor

The way the spec talks about 'identity', it could be taken to mean that at Source Level 2+ SLSA wants to require source control platforms to verify the legal identity of open source contributors. I don't believe that is anyone's intent, and I certainly didn't intend that interpretation. I think what we meant to get at was being able to associate some token (e.g. account name, handle, signing key) trusted by the specific community with commits and reviews (which most, if not all, source control systems already use to manage changes).

We should make the language we use much more crisp to avoid any ambiguity.

I'll track down some language and make a proposal but I wanted to document this as an issue as it came up as a hot topic during a panel discussion at OSS EU on Tuesday.

If anyone has suggestions or disagrees, your thoughts are welcome.

@TomHennen TomHennen self-assigned this Sep 18, 2024
@github-project-automation github-project-automation bot moved this to 🆕 New in Issue triage Sep 18, 2024
@TomHennen TomHennen moved this from 🆕 New to 🏗 In progress in Issue triage Sep 18, 2024
@TomHennen TomHennen moved this to In progress in SLSA Source Track Sep 18, 2024
@TomHennen
Contributor Author

Related questions about 'identity' in this comment thread:

#1094 (comment)

@marcelamelara

Identity management is a slippery slope ;) The juxtaposition of federated authentication and custom implementation is not really clear to me, especially with the given examples. Is the intent here that something like OAuth/email and key-based approaches are both ways to achieve this requirement?

It also seems like some crucial properties/requirements for the identities themselves should be included here: e.g., unique identities, the "root" issuer for identities, mapping between usernames and other identifiers on cloud-hosted SCPs, etc. Teasing these out might also clarify the security objective of the Identity Management requirement more generally.

@zachariahcox

I hope we don't need to define things down to that level!
It would be nice if we can just leave it at

Something or somebody stores and deals with source revisions: let's call that thing the "scp"
The SCP needs to explain how it identifies the actors who do things and record what it saw them do.
SCPs issue attestations of the above. VSAs can use them or not.

I think it's clear there's more work to do here.

@zachariahcox
Contributor

zachariahcox commented Sep 19, 2024

My initial thinking on this is that at a high level, SCPs will need to be responsible for:

  1. tracking which actors made changes
  2. optionally linking that actor to external (perhaps government-provided) identity management systems.
  3. adding policy enforcement tooling to ensure that "all contributors have the external identity service linkage" etc.

I do not think developer tools should try to assert the legal identity of users in provenance attestations.

Also, I don't think we should reference signatures directly due to how easily they can be misused in version control systems. We should default to "strongly authenticated" verbiage to ensure tools can use the best possible authn technologies.
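
As a rough illustration of the three SCP responsibilities listed in this comment, here is a minimal Python sketch. Everything in it is hypothetical (the `Actor` and `ChangeLedger` names, the `external_identity` field, the `corp-sso:` prefix); nothing here comes from the SLSA spec or any real SCP. The point it tries to show is that the actor handle is an opaque, SCP-scoped token and that any external linkage is optional.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Actor:
    # Opaque handle that is only meaningful inside this SCP (account name, key ID, ...).
    handle: str
    # Optional pointer into an external identity system; None means "not linked".
    external_identity: Optional[str] = None


@dataclass
class ChangeLedger:
    # (actor, revision) pairs in the order the SCP observed them.
    entries: List[Tuple[Actor, str]] = field(default_factory=list)

    def record(self, actor: Actor, revision: str) -> None:
        """Responsibility 1: track which actor made which change."""
        self.entries.append((actor, revision))

    def unlinked_actors(self) -> List[str]:
        """Responsibility 3: report actors that lack an external linkage."""
        return sorted({a.handle for a, _ in self.entries if a.external_identity is None})


ledger = ChangeLedger()
ledger.record(Actor("alice"), "rev-1")  # pseudonymous handle, no linkage
ledger.record(Actor("bob", external_identity="corp-sso:bob"), "rev-2")  # responsibility 2

# An opt-in policy could flag the repository if this list is non-empty.
print(ledger.unlinked_actors())
```

In this sketch the linkage check is something an organization chooses to turn on; a project that accepts pseudonymous contributors would simply not enforce it.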

@TomHennen
Contributor Author

Regardless of the specific requirements we put on SCPs I wonder if we can make a clear statement about non-requirements as well.

Something along the lines of "Nothing in this specification should be taken to mean that open source software contributors need to, or should, be mapped to their legal identities."

@adityasaky
Member

I think being clear about:
a) identities are for internal consistency, so it's possible to track and set policies on the actions of the entity in question
b) identities are not required to be mapped to legal IRL identities
should address this, I'm in support of adding the text @TomHennen proposed.

Also, I don't think we should reference signatures directly due to how easily they can be misused in version control systems. We should default to "strongly authenticated" verbiage to ensure tools can use the best possible authn technologies.

I imagine we'd say "strongly authenticated" and qualify with a non-exhaustive set of examples that include SCP mechanisms, enterprise hosted identity providers, and so on. Not referencing signatures / the ability to sign as a means for authenticating a developer would perhaps stand out in that case. @zachariahcox, could you clarify the concerns you have with their misuse in version control systems? Maybe we can caveat / suggest possible solutions if someone were to go that route.

@TomHennen
Contributor Author

@zachariahcox any more thoughts on this discussion?

@TomHennen
Contributor Author

I'm also interested if folks think this should be restricted to the Source Track at first (where the issue came up) or if we should have a separate landing page 'Identities in SLSA' to discuss the topic? (I'm leaning towards the latter).

@hepwori
Contributor

hepwori commented Oct 15, 2024

fwiw I agree with your instincts to broaden, along the lines of 'Identities in SLSA'.

I wondered if projects like Sigstore, which are even more closely identity-adjacent, might have prior art or concept definitions SLSA could borrow. Nothing immediately leaped out but I did see that the OpenID docs gesture loosely in the direction of an identity being "the outcome of an authentication process".

It'd be good to have SLSA include some words on how identity should and shouldn't be understood in this context.

@marcelamelara
Contributor

Chiming in a little late... some thoughts:

I do not think developer tools should try to assert the legal identity of users in provenance attestations.

I completely agree, and agree with @zachariahcox 's suggestion to focus on "strongly authenticated" (the security objective), but like @adityasaky would like to also better understand what the specific concerns are with signatures in VCS's.

identities are for internal consistency, so it's possible to track and set policies on the actions of the entity in question

I'd even go a bit further and say that identities are for consistency within some application context, e.g., within a single enterprise (all users of company XYZ) or within an SCS. Maybe this is already what you meant @adityasaky ?

I'm also interested if folks think this should be restricted to the Source Track at first (where the issue came up) or if we should have a separate landing page 'Identities in SLSA' to discuss the topic? (I'm leaning towards the latter).

@TomHennen I also agree with the latter. Identities are a cross-cutting aspect across tracks.

I wondered if projects like Sigstore, which are even more closely identity-adjacent, might have prior art or concept definitions SLSA could borrow.

This is a good idea @hepwori . I do think we need to be a bit careful for SLSA not to prescribe the use of Sigstore, but we can certainly align on general terminology.

@adityasaky
Member

I'd even go a bit further and say that identities are for consistency within some application context, e.g., within a single enterprise (all users of company XYZ) or within an SCS. Maybe this is already what you meant @adityasaky?

Yeah, that's what I meant! It's for consistency within the context of the policy, which may be for a particular application or organization-wide.

@mlieberman85
Member

mlieberman85 commented Oct 23, 2024

@hepwori So, I don't know if it was on purpose but from the Sigstore docs, I always felt they sidestepped the question of what abstract concept an identity maps to. They just refer to identities as either proof of ownership of a key or federated identities through OIDC. It's been a while since I've looked through Sigstore, but I think they've largely just used identity as a thing unto itself and not talked about what an identity maps to whether it be a human, set of humans, a system, or anything else.

As a specification as opposed to a suite of tooling I think we might want to be a bit clearer but I think focusing on "what we are" over "what we aren't" is probably the way to go. I think mapping or not mapping identity to literal human is largely out of scope. I think folks will implement SLSA internally at their organization and do want to map OIDC or key to employee. However, in the open source space there's nothing SLSA gives folks to make unmasking even possible, not that we'd ever want to get into that in the first place.

Now with that said, I do think we've had enough folks (on both sides of the argument) rehash the misconception that we're somehow working to unmask anonymous/pseudonymous open source contributors that it would help to have something in our docs we can point to that explicitly states we're not doing this. I have seen folks from the open source community make wild claims that SLSA is trying to dox open source contributors, but I've also seen folks from large enterprises who are worried about XZ want SLSA to unmask anonymous potential bad actors.

@hepwori
Contributor

hepwori commented Nov 9, 2024

How about the following, in a putative "identities and SLSA" section?

_References to identity in the SLSA specification refer to supply chain actors—human or automated—which are authenticated and uniquely identified in a way which is meaningful to those making downstream trust decisions. There is no requirement to use one identity namespace over another, one authentication domain over another, or to use real human identities over pseudonymous ones. Closed SLSA ecosystems may use an identity system entirely internal to their scope.

In particular, nothing in this specification should be taken to mean that identifiers referring to open source software contributors need to be, or should be, mapped to their real-world legal identities._

@arewm
Member

arewm commented Nov 11, 2024

Identity discussions have come up a couple times in SLSA. I feel like it would likely be beneficial to address it broadly, but I feel like that will sidetrack the current conversation which feels more targeted towards the source track itself.

Is the mention of an identity in the source track primarily to enforce non-repudiation? If so, would we just try and be more specific about that target goal?

@TomHennen
Contributor Author

Is the mention of an identity in the source track primarily to enforce non-repudiation? If so, would we just try and be more specific about that target goal?

I think non-repudiation may be too strong a term? I don't think we're looking for anything that doesn't allow code authors to repudiate something. Rather, it's more like "if there's a compromised account it should be possible to find other changes made by that account" and "consumers should be able to know which account purportedly made a change so that they can transfer whatever trust they have for that account to the change". Maybe that's in line with what you're thinking?

FWIW I do think we need to make a pretty strong statement here. I've had a couple of conversations where people hear "identity" and assume it means a person's legal identity. That sometimes provokes fairly strong reactions (rightfully so), and saying "well, that's not what we mean, here's some nuanced language" isn't nearly as helpful as being able to say "here's where we say explicitly we don't want to de-anonymize anyone".

I'd be fine keeping this in the source track for now, but SLSA as a project being able to say we never want to do this would be helpful in preventing this type of misunderstanding in the future, especially as new tracks are developed. E.g. you could imagine the dependency track thinking about wanting to know the identity of the entity that published a given dependency.
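
To make the "find other changes made by that account" goal concrete, here is a minimal Python sketch. The account names, revision IDs, and the `index_by_account` helper are all made up for illustration; the only assumption is that the source control system records some account token alongside each revision, with no tie to a legal identity.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def index_by_account(changes: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group revision IDs by the account the SCP attributed them to."""
    index: Dict[str, List[str]] = defaultdict(list)
    for account, revision in changes:
        index[account].append(revision)
    return dict(index)


# Hypothetical (account, revision) history as recorded by the SCP.
history = [("acct-7", "a1b2"), ("acct-9", "c3d4"), ("acct-7", "e5f6")]
by_account = index_by_account(history)

# If acct-7 is later found to be compromised, these are the revisions to re-review.
suspect_revisions = by_account.get("acct-7", [])
print(suspect_revisions)
```

Note that nothing in this lookup needs to know who is behind `acct-7`; a stable account token is enough.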

@arewm
Member

arewm commented Nov 11, 2024

NIST's definition of non-repudiation often references an identity as associated with a private key: https://csrc.nist.gov/glossary/term/non_repudiation

I don't think we're looking for anything that doesn't allow code authors to repudiate something

I think that I agree with all of the negatives in there.

if there's a compromised account it should be possible to find other changes made by that account

I guess that this is not non-repudiation because if we use that terminology then we are implying that these account holders may want to deny that they produced an artifact (i.e. commit). Is the difference here who wants to actually be able to say "account X did this action" vs someone who wants to try to claim that they didn't?

Can we assume within the source track that artifact-creating identities have a specific account ... and therefore word in terms of accounts?

FWIW I do think we need to make a pretty strong statement here.

Strong statement == within SLSA? There are some places where the identity -> person relationship is more readily assumed (i.e. it is easier in the source track than for a build platform identity).

@hepwori
Contributor

hepwori commented Nov 18, 2024

I think the statement I proposed in #1133 (comment) would do the job of reassuring folks about how SLSA expects/intends identity to be used.

To me, the position in the language is at the same time true, uncontroversial, helpful, responsive to real concerns, and likely to be effective in putting them to rest.

Are there downsides with the language which I'm not seeing but might persuade us not to include it? @arewm do you have specific concerns with the proposal as written?

There is probably much more we can say in due course, and much more specificity we may want to layer atop over time (non-repudiation and the like; references to normative standards in other domains; etc.). But in the spirit of not making the perfect the enemy of the good, I'm trying to see if we can converge on something merely "acceptable" for now :)

@zachariahcox
Contributor

zachariahcox commented Nov 21, 2024

This is a long thread 😓

I think this line needs to be changed to avoid confusion about legal identity:

Benefits: Allows source consumers to track changes to the software over time and attribute those changes to the people that made them.

To @arewm 's point about addressing the topic broadly, how about this:

If an attestation issuer needs to say something about an identity / actor / user, SLSA can say it MUST rely on strong authentication (authn) systems. The issuer is allowed to trust whatever subsystems it wants to handle that, including AAD, federal / legal stuff, or crypto signatures.

SLSA attestation schemes do NOT guarantee that consumers will know the exact "people" that made a change, but they can know which actor the issuer believes made the change.

SLSA can say that the issuer MUST make it clear what kind of identity technologies it uses for actor identification stuff.


☝ if all that's fine, we can probably comment about "how to specify identities in attestations" over in distributing provenance.
We would probably give it a header "Referencing identities in SLSA attestations".

Thoughts?
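
For discussion, a minimal Python sketch of what "the issuer MUST make it clear what kind of identity technologies it uses" could look like as a check on an actor reference. The field names (`actor`, `id`, `identity_system`) and the example values are hypothetical and are not part of any SLSA predicate or provenance format.

```python
from typing import Any, Dict, List

# Hypothetical minimum: which actor the issuer believes acted, and which
# identity technology the issuer used to identify that actor.
REQUIRED_ACTOR_FIELDS = ("id", "identity_system")


def actor_reference_problems(statement: Dict[str, Any]) -> List[str]:
    """Return a list of problems with the statement's actor reference (empty if none)."""
    actor = statement.get("actor")
    if not isinstance(actor, dict):
        return ["missing actor reference"]
    return [f"actor.{name} is missing" for name in REQUIRED_ACTOR_FIELDS if not actor.get(name)]


# Example statement an issuer might emit; "octo-dev" is a pseudonymous handle.
statement = {
    "revision": "a1b2",
    "actor": {"id": "octo-dev", "identity_system": "scp-account"},
}
print(actor_reference_problems(statement))
```

The check only asks that the issuer name the actor and the identity system; it says nothing about the real-world person behind the handle.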

@zachariahcox
Contributor

build attestations might need a way to specify the "actor that kicked-off the workflow" for example.

@david-a-wheeler
Member

We could also be more general, "to the people that made them" -> "to who made them".

@trishankatdatadog
Member

Late to the party, but Why Not Just™️ public keys as identities at the end of the day? 🙂

@lehors
Member

lehors commented Nov 21, 2024

It'd be good to get @AevaOnline's take on this

@trishankatdatadog
Member

Got much better context around this from @TomHennen from the SLSA Spec meeting, thanks!

Yes, we should clarify that SLSA is not requiring OSS contributors to deanonymize themselves. At best, we may record things like GitHub usernames, email addresses, and even public keys, all of which do not need to map to real identities.

@zachariahcox
Contributor

Discussed today at the spec meeting

TomHennen added a commit to TomHennen/slsa that referenced this issue Dec 2, 2024
…bution in SLSA

As work on the source track progresses the topic of 'identity' comes up quite a bit.
There has been some confusion about what this means, that it could be that SLSA
intends to require legal identities for all contributors.  That isn't the case.

Many in the open source world prefer to contribute without revealing their 'real'
identities as has been practiced for many years. SLSA does not intend to change that.

This PR tries to make it clear that SLSA does not require real identities.

refs slsa-framework#1133

Signed-off-by: Tom Hennen <[email protected]>
TomHennen added a commit that referenced this issue Dec 11, 2024
…SLSA (#1249)

As work on the source track progresses the topic of 'identity' comes up
quite a bit. There has been some confusion about what this means, that
it could be that SLSA intends to require legal identities for all
contributors. That isn't the case.

Many in the open source world prefer to contribute without revealing
their 'real' identities as has been practiced for many years. SLSA does
not intend to change that.

This PR tries to make it clear that SLSA does not require real
identities.

refs #1133

---------

Signed-off-by: Tom Hennen <[email protected]>
Signed-off-by: Tom Hennen <[email protected]>
Co-authored-by: Aditya Sirish <[email protected]>
Co-authored-by: Andrew McNamara <[email protected]>
@TomHennen
Contributor Author

I think we can close this now and revisit if we get additional feedback.

@github-project-automation github-project-automation bot moved this from 🏗 In progress to ✅ Done in Issue triage Dec 13, 2024
@github-project-automation github-project-automation bot moved this from In progress to Done in SLSA Source Track Dec 13, 2024
@adityasaky
Member

I'm not sure if we want to reopen this or track this separately. The source track currently says:

There exists an identity management system or some other means of identifying actors. This system may be a federated authentication system (AAD, Google, Okta, GitHub, etc) or custom implementation (gittuf, gpg-signatures on commits, etc). The SCS MUST document how actors are identified for the purposes of attribution.

Should we clarify the text in the table so we aren't distinguishing between "federated" and "custom" implementations? I'm not sure we want to be bucketing specific mechanisms anymore, for what it's worth.
