Merge pull request #34 from zvr/reword-id-structure
Reword the SWHID structure description
rdicosmo authored Oct 1, 2023
2 parents 42a0d1d + 5333e77 commit 26798af
Showing 1 changed file with 19 additions and 14 deletions.
Chapters/5.Core_identifiers.md
@@ -1,22 +1,27 @@
 # 5 Core identifiers
 
-A core SWHID identifier is composed of four fields, separated by a colon
-`:`. The `swh` prefix makes it explicit that this is a SWHID identifier. `1`
-(`<scheme_version>`) is the current version of this identifier *scheme*. Future
-editions will use higher version numbers, possibly breaking backward
-compatibility. The third field is a tag that correspond to the type of object
+A core SWHID identifier is composed of four fields, separated by a colon `:`.
+
+The first field is the *type of the identifier*
+and it is defined to be `swh`.
+
+The second field is the *version of the identifier scheme*
+and for this version of the specification
+it is defined to be `1`.
+
+The third field is a tag corresponding to the *type of object*
 identified:
 
-- `cnt` for **contents**.
-- `dir` for **directories**,
-- `rev` for **revisions**,
-- `rel` for **releases**,
-- `snp` for **snapshots**,
+- `cnt` for **contents** (see 5.1)
+- `dir` for **directories** (see 5.2)
+- `rev` for **revisions** (see 5.3)
+- `rel` for **releases** (see 5.4)
+- `snp` for **snapshots** (see 5.5)
+
+The fourth field is the *intrinsic identifier* of the object.
+This is a hex-encoded (using lowercase ASCII characters) hash value
+computed by the content and relevant metadata of the object.
 
-The fourth and last field is the *intrinsic identifier* of the object. In this
-version of the specification, this is a hex-encoded (using lowercase ASCII
-characters) SHA1 computed on the content and relevant metadata of the object
-itself, as follows.
 ## 5.1 Contents
 
 A *content* is an uninterpreted byte sequence, typically, the content of a file.
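
As a rough illustration of the four-field structure described in the reworded text, a core SWHID could be split and checked as in the Python sketch below. The names `CoreSWHID`, `parse_core_swhid`, and `OBJECT_TYPES` are hypothetical and are not taken from the specification or from any Software Heritage library. The 40-character length check is an assumption based on the SHA1 wording in the removed text; the new wording only says "hash value".

```python
import re
from typing import NamedTuple

# Object-type tags listed for the third field.
OBJECT_TYPES = {"cnt", "dir", "rev", "rel", "snp"}

# Assumption: the intrinsic identifier is a SHA1 digest, i.e. exactly
# 40 lowercase hexadecimal characters.
_HEX_ID = re.compile(r"[0-9a-f]{40}")


class CoreSWHID(NamedTuple):
    """Hypothetical container for the four colon-separated fields."""
    scheme: str
    version: str
    object_type: str
    object_id: str


def parse_core_swhid(text: str) -> CoreSWHID:
    """Split a core SWHID into its four fields and validate each one."""
    fields = text.split(":")
    if len(fields) != 4:
        raise ValueError(f"expected 4 colon-separated fields, got {len(fields)}")
    scheme, version, object_type, object_id = fields
    if scheme != "swh":
        raise ValueError(f"first field must be 'swh', got {scheme!r}")
    if version != "1":
        raise ValueError(f"second field must be '1', got {version!r}")
    if object_type not in OBJECT_TYPES:
        raise ValueError(f"unknown object type tag {object_type!r}")
    if not _HEX_ID.fullmatch(object_id):
        raise ValueError("fourth field must be 40 lowercase hex characters")
    return CoreSWHID(scheme, version, object_type, object_id)


# Placeholder identifier for illustration only; the hex value is made up.
EXAMPLE = "swh:1:cnt:0123456789abcdef0123456789abcdef01234567"
parsed = parse_core_swhid(EXAMPLE)
```

Rejecting uppercase hex digits follows the "lowercase ASCII characters" requirement in both the old and new wording. A parser meant for real use would also need to decide how to treat future scheme versions, which this sketch simply refuses.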