---
title: Supply Chain Security for Version Control Systems
abbrev: Supply Chain Security for VCSs
docname: draft-nhw-openpgp-supply-chain-security-vcs-00
date: 2023-06-20
category: info
submissiontype: independent
ipr: trust200902
area: int
workgroup: openpgp
keyword: Internet-Draft
stand_alone: yes
pi: [toc, sortrefs, symrefs]
venue:
  group: "OpenPGP"
  type: "Working Group"
  mail: "openpgp@ietf.org"
  arch: "https://mailarchive.ietf.org/arch/browse/openpgp/"
  repo: "https://gitlab.com/sequoia-pgp/sequoia-git"
  latest: "https://sequoia-pgp.gitlab.io/sequoia-git/"
author:
  -
    ins: N.H. Walfield
    name: Neal H. Walfield
    org: Sequoia PGP
    email: neal@sequoia-pgp.org
  -
    ins: J. Winter
    name: Justus Winter
    org: Sequoia PGP
    email: justus@sequoia-pgp.org
normative:
  RFC2119:
  RFC4880:
  RFC8174:
  toml:
    author:
      -
        ins: T. Preston Werner
        name: Tom Preston-Werner
      -
        ins: P. Gedam
        name: Pradyun Gedam
    title: TOML v1.0.0
    date: 2021-01-12
    target: https://toml.io/en/v1.0.0
informative:
  event-stream:
    author:
      -
        ins: T. Hunter
        name: Thomas Hunter II
    title: "Compromised npm Package: event-stream"
    date: 2018-11-27
    target: https://medium.com/intrinsic-blog/compromised-npm-package-event-stream-d47d08605502
  dependency-confusion:
    author:
      -
        ins: A. Birsan
        name: Alex Birsan
    title: "Dependency Confusion: How I Hacked Into Apple, Microsoft and Dozens of Other Companies"
    date: 2021-02-09
    target: https://medium.com/@alex.birsan/dependency-confusion-4a5d60fec610
  reflections-on-trusting-trust: DOI.10.1145/358198.358210
  guix:
    author:
      -
        ins: L. Courtès
        name: Ludovic Courtès
    title: Building a Secure Software Supply Chain with GNU Guix
    date: 2022-06
    doi: 10.48550/arXiv.2206.14606
    target: https://arxiv.org/abs/2206.14606
--- abstract
In a software supply chain attack, an attacker injects malicious code
into some software, which they then leverage to compromise systems
that depend on that software. A simple example of a supply chain
attack is when SourceForge, a once popular open source software forge,
injected advertising into the binaries that they delivered on behalf
of the projects that they hosted. Software supply chain attacks are
different from normal bugs in that the intent of the perpetrator is
different: in the former case, bugs are added with the intent to harm,
and in the latter they are added inadvertently, or due to negligence.
Software supply chain security starts on a developer's machine. By
signing a commit or a tag, a developer can assert that they wrote or
approved the change. This allows users of a code base to determine
whether a version has been approved, and by whom, and then make a
policy decision based on that information. For instance, a packager
may require that software releases be signed with a particular
certificate.
Version control systems such as git have long included support for
signed commits and tags. Most developers don't sign their commits,
and in the cases where they do, it is usually unclear what the
semantics are.
This document describes a set of semantics for signed commits and
tags, and a framework to work with them in a version control system,
in particular, in a git repository. The framework is designed to be
self-contained. That is, given a repository, it is possible to add
changes, or authenticate a version without consulting any third
parties; all of the relevant information is stored in the repository
itself.
By publishing this draft we hope to clarify and enrich the semantics
of signing in version control system repositories thereby enabling a
new tooling ecosystem, which can strengthen software supply chain
security.
--- middle
# Introduction
## Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY",
and "OPTIONAL" in this document are to be interpreted as described
in BCP 14 {{RFC2119}} {{RFC8174}} when, and only when, they appear in
all capitals, as shown here.
## Terminology
- "Maintainer" is a software developer, who is responsible for a
software project in the sense that they act as a gatekeeper, and
decide with other maintainers what changes are acceptable, and
should be added to the software.
- "Contributor" is someone who contributes changes to a software
project. Unlike a maintainer, a contributor cannot add their
changes to a project on their own accord.
- "Software supply chain" is the collection of software that
something depends on. For instance, a software package depends on
libraries, it is built by a compiler, it is distributed by a
package registry, etc.
- "Software supply chain attack" is an attack in which an attacker
compromises a software supply chain. For instance, a maintainer
or a contributor may stealthily insert malicious code into a
software project in order to compromise the security of a system
that depends on that software.
- "Version control system" is a database, which contains versions of
a software project. Each version includes links to preceding
versions.
- "git" is a popular version control system. Although "git" is
distributed and does not rely on a central authority, it is often
used with one to simplify collaboration. Examples of centralized
authorities include gitea, GitHub, and GitLab.
- "Commit" is a version that is added to the "version control
system". In git, commits are identified by their message digest.
- "Branch" is a typically human readable name given to a particular
commit. When a commit is superseded, the branch is updated to
point to the new commit. Repositories normally have at least one
branch called "main" or "master" where most work is done.
- "Tag" is a name given to a particular commit. Tags are usually
only added for significant versions like releases and are normally
not changed once published.
- "Change" is a commit or a tag.
- "Forge" is a service which hosts software repositories, and often
provides additional services like a bug tracker. Examples of
forges are codeberg, GitHub, and GitLab.
- "Registry" or "Package Registry" is a service that provides an
index of software packages. Maintainers register their software
there under a well-known name. Build tools like `cargo` fetch
dependencies by looking up the software by its name.
- "Authentication" is the process of determining whether something
should be considered authentic.
- "Trust model" is a process for determining what evidence to
consider, and how to weigh it when doing authentication.
- "OpenPGP certificate" or just "certificate" is the data structure
that section 11.2 of {{RFC4880}} defines as a "Transferable Public
Key". A certificate is sometimes called a key, but this is
confusing, because a certificate contains components that are also
called keys.
- "Liveness" is a property of a certificate, a signature, etc. An
object is considered live with respect to some reference time if,
as of the reference time, its creation time is in the past, and it
has not expired.
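
The liveness definition can be read as a simple predicate. The following is a minimal sketch, not part of the specification; the function name `is_live` and the handling of the boundary cases are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_live(created: datetime, expires: Optional[datetime],
            reference: datetime) -> bool:
    """Return True if an object is live at `reference`.

    Assumptions (not stated in the definition): a creation time equal to
    the reference time counts as "in the past", and an object whose
    expiration time equals the reference time has already expired.
    `expires` is None for objects that never expire.
    """
    if created > reference:
        return False  # created in the future relative to the reference
    return expires is None or reference < expires


# A certificate created one day before the reference time, valid for a year.
reference = datetime(2023, 6, 9, tzinfo=timezone.utc)
created = reference - timedelta(days=1)
print(is_live(created, created + timedelta(days=365), reference))
```
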
# Problem Statement
Consider the following scenario. Alice and Bob are developers. They
are the primary maintainers of the Xyzzy project, which is a free and
open source project. Although they do most of the work on the
project, they also have occasional collaborators like Carol, and
drive-by contributions from people like Dave. Paul packages their
software for an operating system distribution. Ted from Ty Coon
Corporation integrates it into his company's software. And, Mallory
is an adversary who is trying to subvert the project.
When someone updates their local copy of Xyzzy's source code
repository, they want to authenticate any changes before they use them.
That is, they want to know that each change was made or approved by
someone whom they consider authorized to make that change.
In the Xyzzy project, Alice is willing to rely on Bob to check in
changes he makes, and to approve contributions from third parties
without auditing the code herself. But, she doesn't want to rely on
anyone else without checking their proposed changes manually. Bob
feels the same way about Alice.
In version control systems like `git`, the meta-data for a commit or
tag includes `author` and `committer` fields. By themselves, these
fields cannot be used to reliably determine who a change's author and
committer are, because these fields are set by the committer and
unauthenticated. That is, Mallory could author a commit, set both of
these fields to "Bob," and push the malicious commit. No one would be
able to tell that they came from Mallory and not Bob.
There are two main ways to authenticate changes. First, changes to a
repository or branch can be mediated by a trusted third party, which
enforces a policy at the time a change is added to the repository.
Second, individual changes can be signed, and a policy can be
evaluated at any time. These two approaches can be mixed.
## Repositories Protected by a Trusted Third Party
When using a trusted third party, only certain users are allowed to
change the repository. This is often realized using access control
lists: the trusted third party has a list of users who are allowed to
do certain types of modifications. Before the trusted third party
allows a user to modify the repository, the user has to authenticate
themselves. When they attempt to make a change, the trusted third
party checks that they are authorized. If they are, the third party
allows the modification. If not, it is rejected. A user of this
repository can now conclude that if they can authenticate the trusted
third party, then the changes were approved.
A drawback of using a trusted third party is that it relies on
centralized infrastructure. This means the only way for a user to
determine if a version of Xyzzy is authentic is to fetch it from the
trusted third party; the repository is not self-authenticating. If
the third party ever disappears, users will no longer be able to
authenticate the project's source code.
Another disadvantage is that this approach doesn't expose the
project's policy to its users. This means that both first-parties
like Alice and third-parties like Paul are not able to audit the
trusted third party. This is the case even if the set of users that
are currently authorized to make changes are exposed via a separate
API end point: because the set of authorized users changes with time,
all updates to the ACLs would need to be exposed along with
information about what user authorized each change.
## Self-Authenticating Repositories
An alternative approach is to have authors and committers sign their
changes. Users then check that the changes are signed correctly, and
authenticate the signers. For instance, for the Xyzzy project, Paul
might decide that Alice or Bob are allowed to make changes. So when
Paul fetches changes, he checks whether Alice or Bob signed the new
changes, and flags changes made by anyone else. If Alice and Bob
later decide that Carol should also be allowed to directly commit her
changes, Paul needs to update his policy. If Bob leaves the team,
Paul needs to pay enough attention to notice, and then disallow
changes made by Bob after a certain date.
For projects that sign their commits today, this is more or less the
status quo. Most users, however, do not want to maintain their own
policy, and aren't even in a good position to do so. Since users are
willing to rely on the maintainers to make changes to the project,
they can just as well delegate the policy to them. Now, a user like
Paul just needs to designate an initial policy. If he knows when the
policy changes, and can authenticate changes to the policy based on
the existing policy, then he is able to authenticate any subsequent
changes to the repository.
An easy way to manage the policy is to include it in the repository
itself. Then changes to the policy can be authenticated in the same
way as normal changes. This also makes the repository
self-authenticating, because it is self-contained.
One issue is how users should handle forks to a project. A fork in a
project may occur due to a social or technical conflict, or because
the project dies, and is later revived by a different party. In both
cases, it may not be possible for there to be a clean hand off to the
new maintainer. That is, Alice or Bob may not be willing or able to
change the policy file to allow Dave to seamlessly continue the
development of Xyzzy.
Forks are straightforward to handle, but require user intervention:
from the system's perspective, Dave is not authorized, so his changes
are rejected. And that's good, as Dave may be an attacker; the system
can't tell. Users opt in to a fork by changing their trust root to
designate a version in which Dave is authorized to make changes.
# Threat Model
Consider an attacker, Mallory, who is trying to compromise a user,
Ursula, by injecting a vulnerability into the software supply chain of
a piece of software, Super Frob, that she uses. There are several
different ways that Mallory could accomplish this. These include:
- Mallory could pose as a contributor, and convince a developer to
authorize a malicious change to one of Super Frob's dependencies,
such as a library.
- Mallory could take over an abandoned package that Super Frob
depends on, and publish a new version with malicious code.
- Mallory could use typosquatting to inject malicious software into
Super Frob's supply chain, either opportunistically or through
social engineering.
For instance, Mallory could publish a library called `libevent`,
which is a copy of `libevents`, but includes a malicious change,
and Super Frob accidentally includes `libevent` as a dependency
instead of `libevents`.
- Mallory could publish a malicious package that has the same name as
a package on another registry in order to confuse Super Frob's
build tools.
This type of attack is called a dependency confusion attack,
{{dependency-confusion}}. It can be launched when an organization
uses an internal registry and a public registry to find
dependencies. As dependencies are often referenced by name, and
that name does not include the registry, an attacker may trick the
organization into using their malicious version of the package.
- Mallory could sneak a change into one of Super Frob's build
dependencies, like the compiler.
Whereas software maintainers have a large degree of control over
their direct dependencies, they have more limited control over the
tools downstream users use to build their software. In the
extreme, a software project may include a copy of a dependency in
their version control system, or depend on a specific version of a
dependency by cryptographic hash, but only specify a standard that
the compiler needs, like C99.
This attack is best known from Ken Thompson's Turing Award lecture,
Reflections on Trusting Trust,
{{reflections-on-trusting-trust}}.
- Mallory could compromise the tools that a developer uses, e.g., by
publishing a useful, but malicious plug-in for an editor, which
detects certain code patterns, and quietly modifies them to insert
malicious code.
- Mallory could compromise the systems that the developers use, and
modify their source code repositories.
For instance, if Mallory gets access to a developer's machine, he
could stealthily modify code before it is signed and committed. Or,
he could exfiltrate the developer's signing key, or login
credentials and imitate her. Similarly, if a software project uses
a forge and Mallory is able to compromise the forge, he could
modify the source code.
- Mallory could compromise Super Frob or one of its dependencies as
it is being downloaded.
For instance, if a package registry like `crates.io` depends on a
content delivery network (CDN) to distribute packages, a
compromised node in the CDN may return a modified version of the
software to the user.
The setting is as follows. To protect herself from Mallory, Ursula
has to make sure that versions of the software she obtains do not
contain malicious code. Ursula cannot afford to audit every version
of the software, but she is willing to rely on the maintainers of the
project to not add malicious code, and to review contributions from
third parties.
The framework presented in this specification allows Ursula to audit a
dependency and its developers once, and then to delegate decisions of
what code and dependencies to include to the developers. Assuming the
developers are reliable, this can protect Ursula from attacks where
Mallory is not explicitly authorized to make a change. For instance,
if the developers of an abandoned software package do not authorize a
new maintainer, Ursula will be warned when a package has a new
maintainer, as she can no longer authenticate it. She can then
re-audit it. Similarly, when the software is modified in transit by a
machine in the middle, Ursula will not be able to authenticate it.
This can also stop dependency confusion attacks, because the software
cannot be authenticated. It won't, however, stop a downgrade attack,
as older versions can still be authenticated.
This framework cannot protect Ursula from mistakes that she or a
developer of the software that she depends on makes. For instance, if
Mallory is able to convince a developer to authorize a malicious
change to their software, this framework considers the change to be
legitimate. This framework can facilitate forensic analysis in these
cases by making it easier to identify changes approved by the same
person (potentially across different projects) and thereby conduct a
targeted audit.
# Authentication
This framework helps users authenticate three types of artifacts:
commits, tags, and tarballs or other archives.
## Policy
Every commit has an associated policy. If a commit contains the file
`openpgp-policy.toml` in the root directory, then that file describes
the commit's policy. If the commit does not contain that file, the
void policy is used. The void policy rejects everything.
`openpgp-policy.toml` is a TOML v1.0.0 file {{toml}}. Version 0
defines the following three top-level keys: `version`,
`authorization`, and `commit_goodlist`.
If a parser recognizes the version, but encounters keys that it does
not know, then it must ignore the unknown keys. This allows a degree
of forwards compatibility.
### version
The value of the `version` key is an integer and must be `0`:

    version = 0

If the value of `version` is not recognized, the implementation SHOULD
error out. It MAY instead treat the policy as the void policy.
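
The rules for a missing policy file, unknown keys, and unrecognized versions could be applied to an already-parsed policy roughly as follows. This is a minimal sketch under stated assumptions: `interpret_policy` and `UnsupportedPolicyVersion` are hypothetical names, the input is the table that some TOML parser produced for `openpgp-policy.toml` (or `None` when the commit has no such file), and a missing `version` key is treated as an unrecognized version, which the text above does not specify.

```python
from typing import Any, Dict, Optional

KNOWN_KEYS = {"version", "authorization", "commit_goodlist"}
SUPPORTED_VERSIONS = {0}


class UnsupportedPolicyVersion(Exception):
    """Raised when the `version` key holds a value we do not recognize."""


def interpret_policy(parsed: Optional[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Return the recognized part of a policy, or None for the void policy."""
    if parsed is None:
        return None  # no openpgp-policy.toml: the void policy rejects everything
    version = parsed.get("version")
    # bool is a subclass of int in Python, so `False` must be excluded explicitly.
    if (not isinstance(version, int) or isinstance(version, bool)
            or version not in SUPPORTED_VERSIONS):
        # This sketch errors out; treating the policy as void is the other option.
        raise UnsupportedPolicyVersion(repr(version))
    # Unknown top-level keys are ignored for forwards compatibility.
    return {key: value for key, value in parsed.items() if key in KNOWN_KEYS}


policy = interpret_policy({"version": 0, "authorization": {}, "future_key": 1})
print(sorted(policy))
```

Raising an exception corresponds to the SHOULD behavior; an implementation that prefers the MAY behavior would return `None` instead.
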
### authorization
`authorization` is a table of authorization entries.
Each key in the `authorization` table is a free-form identifier, which
is chosen by the user of the system. The identifier SHOULD be a UTF-8
encoded, human-readable string that identifies an entity. Examples of
identifiers are `alice`, `Bob <bob@example.org>`, `Boty McBotface
<bot@mcbotface.org>`.
The value of each authorization entry is another table. The table has
the following entries:
- `keyring`
- `sign_commit`
- `sign_tag`
- `sign_archive`
- `audit`
- `add_user`
- `retire_user`
#### keyring
The value of `keyring` is a string. It contains one or more OpenPGP
certificates. The OpenPGP certificates MUST be ASCII-armored. An
ASCII-armored block MAY contain more than one OpenPGP certificate.
The string MAY contain multiple ASCII-armored blocks.
An implementation SHOULD ignore valid OpenPGP certificates that it
does not support, and MAY emit a warning that a certificate, or
component is not supported. An implementation SHOULD return an error
if it encounters something other than an OpenPGP certificate encoded
with ASCII armor.
When adding a certificate, an implementation SHOULD only add
components that are needed to validate the signatures. That is, an
implementation SHOULD strip subkeys that are not signing capable, and
third-party signatures. For components that are kept, an
implementation SHOULD include all known self signatures, and not just
the newest self signature.
#### sign_commit
The value of `sign_commit` is a boolean. If `true`, then the entity
is authorized to sign commits.
#### sign_tag
The value of `sign_tag` is a boolean. If `true`, then the entity is
authorized to sign tags.
#### sign_archive
The value of `sign_archive` is a boolean. If `true`, then the entity
is authorized to sign tarballs or other archives.
#### audit
The value of `audit` is a boolean. If `true`, then the entity is
authorized to add commits to the top-level `commit_goodlist` array.
#### add_user
The value of `add_user` is a boolean. If `true`, then the entity is
authorized to add new entities to the authorization table, to grant
them any capabilities that they have, and to add new certificates to
any entity's keyring.
Note: no special capability is required to extend an existing
certificate. For instance, an entity that has the `sign_commit`
capability can add new user IDs, new subkeys, and new signatures to
any existing certificate. Adding new certificates requires the
`add_user` capability, and removing most packets from an existing
certificate requires the `retire_user` capability.
#### retire_user
The value of `retire_user` is a boolean. If `true`, then the entity
is authorized to retire capabilities from any entity. This includes
capabilities that they do not have. The entity is also authorized to
remove certificates, and to strip components and signatures from
existing certificates.
If an entity does not have the `retire_user` capability, it is still
possible for the entity to remove some packets. The following algorithm
determines whether a change is allowed:
- Ignore marker packets.
- Ignore third-party certifications. A third-party certification is a
signature packet where none of the issuer packets and none of the
issuer fingerprint packets alias the certificate's fingerprint.
- Consider all of the remaining non-signature packets to be
components.
- Iterate over the packets in the certificate in the parent commit's
policy in order. For each signature create a tuple consisting of
the signature and the preceding component. Call the set of tuples
`P`.
- Repeat the previous step for the version of the certificate in the
child commit's policy, but call the set of tuples `C`.
- If `P`, the set of tuples derived from the version of the
certificate in the parent policy, minus `C`, the set of tuples
derived from the version of the certificate in the child policy, is
not empty, then the update requires the `retire_user` right.
Note: This algorithm does not check signatures for cryptographic validity.
This means it is possible to handle signatures that use signature
versions, and cryptographic algorithms that the implementation does not
support.
Changing a signature's associated component is only allowed if the entity
has the `retire_user` right.
An entity can always add new signatures.
Components are only considered in the context of a signature. Consider
the following certificate:
- Primary Key
- Signature
- User ID A
- User ID B
- Signature
Since the algorithm above would not create any tuples consisting of user
ID `A` and a signature, removing the user ID `A` packet does not require
the `retire_user` right.
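
The packet comparison described above can be sketched as a set difference over (signature, component) tuples. The model below is deliberately simplified and is not an OpenPGP parser: `Packet`, `signature_tuples`, and `requires_retire_user` are hypothetical names, packet contents are represented by plain strings, and issuer and issuer fingerprint values are assumed to have already been normalized to fingerprints.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Packet:
    kind: str   # "marker", "signature", or a component kind such as "key" or "userid"
    body: str   # stand-in for the packet's serialized content
    issuers: FrozenSet[str] = frozenset()  # only meaningful for signatures


def signature_tuples(cert: List[Packet],
                     fingerprint: str) -> Set[Tuple[Packet, Optional[Packet]]]:
    """Pair each self signature with the component that precedes it."""
    tuples = set()
    component = None
    for packet in cert:
        if packet.kind == "marker":
            continue  # marker packets are ignored
        if packet.kind == "signature":
            if fingerprint not in packet.issuers:
                continue  # third-party certification: ignored
            tuples.add((packet, component))
        else:
            component = packet  # any other packet is a component
    return tuples


def requires_retire_user(parent: List[Packet], child: List[Packet],
                         fingerprint: str) -> bool:
    """True if some tuple in the parent's version is missing from the child's."""
    return bool(signature_tuples(parent, fingerprint)
                - signature_tuples(child, fingerprint))


FPR = "EXAMPLE-FINGERPRINT"  # placeholder, not a real fingerprint
pk = Packet("key", "primary key")
sig_pk = Packet("signature", "signature over primary key", frozenset({FPR}))
uid_a = Packet("userid", "A")
uid_b = Packet("userid", "B")
sig_b = Packet("signature", "signature over B", frozenset({FPR}))
parent = [pk, sig_pk, uid_a, uid_b, sig_b]

# Dropping the unsigned user ID A removes no tuple.
print(requires_retire_user(parent, [pk, sig_pk, uid_b, sig_b], FPR))
# Dropping the signature over user ID B removes a tuple.
print(requires_retire_user(parent, [pk, sig_pk, uid_a, uid_b], FPR))
```

Because signatures are compared as opaque values and never verified, the sketch shares the property noted above: it works even for signatures that use algorithms the implementation does not support.
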
#### Example
The following is an example of an authorization entry. The user has
been granted all the capabilities. The user is identified by two
different OpenPGP certificates. The certificates are contained in two
concatenated ASCII armored blocks.

    [authorization."Neal H. Walfield <neal@pep.foundation>"]
    sign_commit = true
    sign_tag = true
    sign_archive = true
    add_user = true
    retire_user = true
    audit = true
    keyring = """
    -----BEGIN PGP PUBLIC KEY BLOCK-----
    Comment: F717 3B3C 7C68 5CD9 ECC4 191B 74E4 45BA 0E15 C957
    Comment: Neal H. Walfield (Code Signing Key) <neal@pep.foundatio

    xjMEWhaZ2xYJKwYBBAHaRw8BAQdAinglS6SRXyMb51hMk+mpM4y0Uh0vcGcTyXa+
    ...
    =i3xd
    -----END PGP PUBLIC KEY BLOCK-----
    -----BEGIN PGP PUBLIC KEY BLOCK-----
    Comment: 8F17 7771 18A3 3DDA 9BA4 8E62 AACB 3243 6300 52D9
    Comment: Neal H. Walfield <neal@gnupg.org>
    Comment: Neal H. Walfield <neal@pep-project.org>
    Comment: Neal H. Walfield <neal@pep.foundation>
    Comment: Neal H. Walfield <neal@sequoia-pgp.org>
    Comment: Neal H. Walfield <neal@walfield.org>

    xsEhBFUjmukBDqCpmVI7Ve+2xTFSTG+mXMFHml63/Yai2nqxBk9gBfQfRFIjMt74
    =MESu
    -----END PGP PUBLIC KEY BLOCK-----
    """

### commit_goodlist
The value of `commit_goodlist` is an array of strings where each
string contains a commit identifier. The commit identifier MUST be a
full hash. The commit identifier MUST NOT be a branch name, a tag
name, or a truncated hash.
Commits listed in the `commit_goodlist` are commits that have
retroactively been marked as valid. This may be useful when a
certificate's private key material has been compromised.
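
A syntactic check for `commit_goodlist` entries might look like the following. This is a sketch only: the name `is_full_commit_hash` is hypothetical, and it assumes that a full git commit hash is either 40 hexadecimal digits (SHA-1 repositories) or 64 hexadecimal digits (SHA-256 repositories). It does not check that the commit exists in the repository.

```python
import re

# Assumption: full hashes are 40 (SHA-1) or 64 (SHA-256) hexadecimal digits.
FULL_HASH = re.compile(r"[0-9a-fA-F]{40}|[0-9a-fA-F]{64}")


def is_full_commit_hash(identifier: str) -> bool:
    """Accept full hashes; reject branch names, tag names, and truncated hashes."""
    return FULL_HASH.fullmatch(identifier) is not None


for candidate in ("main", "v1.0.0", "a1b2c3d", "a" * 40):
    print(candidate, is_full_commit_hash(candidate))
```
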
## Authenticating Commits
Each commit in a `git` repository is part of a directed acyclic graph
(DAG) where a node is a commit, and a directed edge shows how two
commits are related. Specifically, the head of a directed edge is a
commit that is derived from the tail. Except for the root commits,
each commit has one or more parents. A commit that has multiple
parents is derived from multiple commits. Conceptually, it merges
multiple paths, and as such is called a merge commit.
A commit is considered authenticated if at least one of its parent
commits considers the commit to be authenticated. This rule is
different from Guix's *authorization invariant* as described in
{{guix}}, which states that all parent commits must consider the
commit to be authenticated. The semantics described here allow a
developer to add commits from unauthorized third-parties as-is using a
merge commit. Using Guix's authorization invariant, the third party's
commit would have to be re-signed, which loses the third-party's
signature, and consequently complicates forensic analysis.
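
The difference between the two rules comes down to "any parent" versus "all parents". The sketch below illustrates this for a merge commit; the function names and the `parent_accepts` callback are hypothetical, commits are represented by plain labels, and root commits (which have no parents) are outside its scope.

```python
from typing import Callable, Iterable


def authenticated_by_any_parent(commit: str, parents: Iterable[str],
                                parent_accepts: Callable[[str, str], bool]) -> bool:
    """Rule described here: one accepting parent is enough."""
    return any(parent_accepts(parent, commit) for parent in parents)


def authenticated_by_all_parents(commit: str, parents: Iterable[str],
                                 parent_accepts: Callable[[str, str], bool]) -> bool:
    """Stricter rule attributed above to Guix: every parent must accept."""
    return all(parent_accepts(parent, commit) for parent in parents)


# A maintainer merges an unauthorized third party's commit as-is. Only the
# policy in the maintainer's branch authorizes the signer of the merge.
def accepts(parent: str, commit: str) -> bool:
    return parent == "maintainer-tip"


parents = ["maintainer-tip", "third-party-tip"]
print(authenticated_by_any_parent("merge", parents, accepts))
print(authenticated_by_all_parents("merge", parents, accepts))
```
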
A commit's parent authenticates it as follows.
First, the implementation looks up the signer's certificate in the
parent commit's policy file. If the implementation finds a
certificate, it scans the commit's policy file for any updates to that
certificate (and only that certificate) except for revocations. That
is, the implementation iterates over all of the certificates in the
commit's policy file, and looks for certificates with the same
fingerprint. If it finds any, it merges them into the original
certificate with the exception of any revocation signatures. In this
way, it is straightforward for a user to recover if the certificate in
the parent commit's policy file is no longer usable, e.g., because it
has expired, or the signing subkey has been replaced. Consider a
parent commit whose policy file contains a certificate that
expires at time `t`. After `t`, the certificate is unusable; it can't
be used to authenticate any commits made at or after `t`. This
mechanism allows the user to easily add new commits by extending their
certificate's expiration, and adding the update to a new commit.
Revocation certificates are skipped so that it is possible for a user
to add a commit that revokes their own certificate, or a component
thereof.
The implementation SHOULD then canonicalize the certificate so that
the active self signatures are those that were active when the
signature was made. A self signature is valid, if it is not revoked,
and not expired. A self signature is active, if it is the most
recent, valid self signature prior to a reference time. That is, if a
new commit was made on June 9, 2023, then each component's most recent
signature as of June 9, 2023, which is also not revoked, and not
expired, is considered that component's active self signature.
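
Selecting a component's active self signature can be sketched as a filter followed by a maximum. The model is simplified and the names `SelfSig` and `active_self_signature` are hypothetical: revocation is reduced to a boolean flag rather than a revocation signature with its own time, and a signature created exactly at the reference time is assumed to count.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class SelfSig:
    created: datetime
    expires: Optional[datetime] = None  # None: never expires
    revoked: bool = False               # simplification: no revocation time


def active_self_signature(sigs: Iterable[SelfSig],
                          reference: datetime) -> Optional[SelfSig]:
    """Most recent self signature that is valid as of the reference time."""
    candidates = [
        sig for sig in sigs
        if sig.created <= reference
        and not sig.revoked
        and (sig.expires is None or reference < sig.expires)
    ]
    return max(candidates, key=lambda sig: sig.created, default=None)


sigs = [
    SelfSig(datetime(2022, 1, 1), expires=datetime(2023, 1, 1)),
    SelfSig(datetime(2023, 1, 1)),
    SelfSig(datetime(2023, 7, 1)),
]
# For a commit made on June 9, 2023, the July signature does not exist yet
# and the 2022 signature has expired.
print(active_self_signature(sigs, datetime(2023, 6, 9)))
```
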
If the canonicalized certificate is valid as of the signature's time,
not expired as of that time, not soft revoked as of that time, not
hard revoked at any time, and the signature is correct, then the
signature is considered verified. The implementation MAY consider
certificate updates from other sources. If it does, it SHOULD only
consider hard revocations.
The implementation MUST then check that the type of change is
authorized by the policy.
The following capabilities allow the specified types of changes:
- `sign_commit`: Needed for any change.
- `add_user`: Needed to delegate a capability to another user.
Updating `keyring` does not require this capability if a
certificate is only updated, and not added.
- `retire_user`: Needed to rescind a capability from another user.
- `audit`: Needed to modify the `version` field, and the
`commit_goodlist` list.
If the signature is considered verified, and the signer is authorized
to make the type of change that was made, then the commit is
considered authenticated.
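
One way to read the capability rules is as a function from a policy change to the set of capabilities that the signer needs. The sketch below makes several simplifying assumptions: policies are the plain tables a TOML parser would produce, the signer is identified by their key in the `authorization` table rather than by certificate, changes to `keyring` are not examined, and the restriction that an entity can only grant capabilities it holds itself is not modeled. All function names are hypothetical.

```python
from typing import Any, Dict, Set

CAPABILITIES = ("sign_commit", "sign_tag", "sign_archive",
                "audit", "add_user", "retire_user")


def granted(entry: Dict[str, Any]) -> Set[str]:
    """Capabilities that an authorization entry sets to true."""
    return {cap for cap in CAPABILITIES if entry.get(cap) is True}


def required_capabilities(parent: Dict[str, Any],
                          child: Dict[str, Any]) -> Set[str]:
    """Capabilities needed to go from the parent's policy to the child's."""
    needed = {"sign_commit"}  # needed for any change
    if (parent.get("version") != child.get("version")
            or parent.get("commit_goodlist", []) != child.get("commit_goodlist", [])):
        needed.add("audit")
    old = parent.get("authorization", {})
    new = child.get("authorization", {})
    for name in set(old) | set(new):
        before = granted(old.get(name, {}))
        after = granted(new.get(name, {}))
        if name not in old or after - before:
            needed.add("add_user")     # new entity or newly delegated capability
        if name not in new or before - after:
            needed.add("retire_user")  # removed entity or rescinded capability
    return needed


def signer_is_authorized(signer: str, parent: Dict[str, Any],
                         child: Dict[str, Any]) -> bool:
    """The signer's capabilities are looked up in the parent's policy."""
    have = granted(parent.get("authorization", {}).get(signer, {}))
    return required_capabilities(parent, child) <= have


parent = {"version": 0, "authorization": {
    "alice": {"sign_commit": True, "add_user": True},
    "bob": {"sign_commit": True},
}}
child = {"version": 0, "authorization": {
    "alice": {"sign_commit": True, "add_user": True},
    "bob": {"sign_commit": True},
    "carol": {"sign_commit": True},
}}
print(sorted(required_capabilities(parent, child)))
print(signer_is_authorized("alice", parent, child),
      signer_is_authorized("bob", parent, child))
```
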
If the commit is not considered authenticated, because the signer's
certificate has been hard revoked, but the commit is included in a
later commit's `commit_goodlist`, then the commit is considered to be
authenticated.
When authenticating a range of commits, a commit is considered to
occur later than the commit in question if it is a direct descendant
of that commit, and it is in the commit range. Consider the three
commits `a`, `b`, and `c` where `a` is `b`'s parent, `b` is `c`'s
parent, the certificate used to sign `b` has been hard revoked, and
`c` includes `b` in its `commit_goodlist`. In this case, the hard
revocation of the certificate used to sign `b` is ignored. All other
criteria including
the fact that the signature on `b` is valid are still checked.
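
The `commit_goodlist` override can be illustrated with the `a`, `b`, `c` example. The sketch below assumes a linear history stored from ancestor to descendant, uses hypothetical type and field names, and does not itself check that the later commit is authenticated; a real implementation would have to handle arbitrary commit graphs.

```rust
/// A commit with the results of its checks (hypothetical type).
struct Commit {
    id: String,
    /// Outcome of every check other than the hard-revocation check,
    /// such as signature correctness and policy authorization.
    other_checks_ok: bool,
    /// The signer's certificate has been hard revoked.
    signer_hard_revoked: bool,
    /// Commit ids listed in this commit's `commit_goodlist`.
    commit_goodlist: Vec<String>,
}

/// `range` is ordered from ancestor to descendant. Returns whether
/// `range[index]` is considered authenticated within that range.
fn is_authenticated(range: &[Commit], index: usize) -> bool {
    let commit = &range[index];
    // The goodlist never overrides any criterion other than hard revocation.
    if !commit.other_checks_ok {
        return false;
    }
    if !commit.signer_hard_revoked {
        return true;
    }
    // Hard revoked: accept only if a later commit in the range lists it.
    range[index + 1..]
        .iter()
        .any(|later| later.commit_goodlist.iter().any(|id| id == &commit.id))
}
```

Note that the outcome depends on the range being authenticated: if the range ends at `b`, the goodlist entry in `c` is not seen, and `b` is not considered authenticated.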
## Authenticating Tags
A tag is a special type of commit in `git`, which has no content, but
assigns a name to a specific commit. A tag is usually used to mark
release points.
A tag is authenticated in the same way as a commit, as described in
the previous section, with the following exceptions.
First, the tagged commit is considered a parent commit, and the tag is
considered its child commit.
Second, the entity that signed the tag needs the `sign_tag` capability,
and only the `sign_tag` capability.
## Authenticating Archives
Archives like tarballs are often generated as part of a software
project's release process. These may be signed. To authenticate an
archive with respect to a signature and a trust root, the trust root's
policy is used to authenticate the archive's signature. The entity that
signed the archive must have the `sign_archive` capability.
Unlike a commit, an archive does not have a pointer to the commit that
it was derived from. Thus, if an archive is derived from commit `c`,
it may be possible to authenticate commit `c`, as well as tags
referring to commit `c`, using a given trust root, but not possible to
authenticate an archive derived from commit `c` using the same trust
root, because the policy changed in the meantime.
If the signature includes the notation
`commit@notations.sequoia-pgp.org`, then the value of the notation is
interpreted as the commit that the archive is derived from. The value
of the notation is a hexadecimal value corresponding to the commit's
full hash. Truncated hashes MUST be considered erroneous. The commit
identifier MUST NOT be a branch name, a tag name, or a truncated hash.
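
A validation of the notation's value might look like the sketch below. The accepted lengths are an assumption that this document does not state: 40 hexadecimal digits for a SHA-1 object name and 64 for a SHA-256 object name, the two object formats that git supports. The function name is hypothetical.

```rust
/// Returns true if `value` looks like a full commit hash: only
/// hexadecimal digits, and exactly as long as a SHA-1 (40 digits) or
/// SHA-256 (64 digits) object name. The lengths are an assumption.
fn is_full_commit_hash(value: &str) -> bool {
    let length_ok = value.len() == 40 || value.len() == 64;
    length_ok && value.bytes().all(|b| b.is_ascii_hexdigit())
}
```

Branch names, tag names, and truncated hashes all fail this check, either on length or on character set. A branch or tag name that happens to consist of exactly 40 or 64 hexadecimal digits would pass it, so the value still needs to be resolved as a commit in the repository.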
Since archives are often verified outside of a repository, one or more
repositories may be specified using the
`repository@notations.sequoia-pgp.org` notation. In that case, each
notation indicates a git repository. For example, the main repository
of the reference implementation, `sq-git`, is
`https://gitlab.com/sequoia-pgp/sequoia-git.git`. So, archives of
`sq-git` SHOULD include the `repository@notations.sequoia-pgp.org`
notation with `https://gitlab.com/sequoia-pgp/sequoia-git.git` as the
value.
When `commit@notations.sequoia-pgp.org` is present in the signature,
the implementation MUST use that commit's policy to authenticate the
archive, and then authenticate that commit by chaining back to the
trust root, as described above; in this case, it MUST NOT use the
trust root's policy directly unless the specified commit is also the
trust root.
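
The choice of policy for an archive can be sketched as a small decision function. The names `PolicySource` and `archive_policy_source` are hypothetical, and commits are represented by their hexadecimal hashes, compared case-insensitively as an assumption.

```rust
/// Which policy to use when authenticating an archive (hypothetical type).
#[derive(Debug, PartialEq)]
enum PolicySource<'a> {
    /// Use the trust root's policy directly.
    TrustRoot(&'a str),
    /// Use the named commit's policy, then authenticate that commit by
    /// chaining back to the trust root.
    Commit { commit: &'a str, chain_from: &'a str },
}

/// `commit_notation` is the value of the
/// `commit@notations.sequoia-pgp.org` notation, if present.
fn archive_policy_source<'a>(
    trust_root: &'a str,
    commit_notation: Option<&'a str>,
) -> PolicySource<'a> {
    match commit_notation {
        // A commit other than the trust root: its policy must be used.
        Some(commit) if !commit.eq_ignore_ascii_case(trust_root) => {
            PolicySource::Commit { commit, chain_from: trust_root }
        }
        // No notation, or the notation names the trust root itself.
        _ => PolicySource::TrustRoot(trust_root),
    }
}
```

In the `Commit` case, the archive's signer needs the `sign_archive` capability in the named commit's policy rather than in the trust root's policy.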
# Reference implementation
A Rust implementation of this specification is part of Sequoia. See
https://gitlab.com/sequoia-pgp/sequoia-git for the source code.
# Security Concerns
## Malicious vs. Buggy Changes
The scheme presented here can help mitigate malicious attacks on a
code base, but it does nothing to prevent design flaws or code errors.
That is, this scheme does not and cannot provide any protections from
normal bugs.
## Trusted Developers
The protections outlined in this document are mainly designed to stop
third parties from adding malicious code to a project. This system
provides no protection from a developer who is authorized to make
changes and turns out to be malicious. That said, when malicious code
is discovered, an audit is required to restore trust in the code base.
Because commits are signed, it is easier to identify other code added
by the same person, and to focus the audit on that code.
## Judging Code vs. Judging Humans
The approach described in this document relies on transitive trust.
The basic idea is that if a user is willing to run a developer's code,
then they can reasonably rely on that developer to modify the code,
and to delegate that capability to a third party.
Yet, writing and reviewing code is fundamentally different from
evaluating another person's intentions. This is demonstrated quite well
by the events surrounding the popular `event-stream` npm package,
{{event-stream}}. In 2018, a new developer gained the trust of the
package's maintainer by contributing a number of high-quality changes.
The original developer eventually made the new developer the
maintainer, and the new maintainer introduced malicious code to steal
users' credentials.
## Operational Security
Signing commits relies on each developer having a long-term identity
key, which they keep safe. If the key is compromised, the attacker is
able to impersonate the developer. It is possible to limit the damage
by revoking the compromised key, or having another authorized user
retire the developer's access.
In this regard, sigstore appears to be better as it relies on
ephemeral signing keys, which are issued by a central authority.
However, in order to obtain a signing key, the user needs to log in.
If they use a password, then an attacker who obtains the password can
impersonate the developer. If the developer uses a second factor like
a hardware token, then they are again using private key cryptography,
and may as well put their private keys on a hardware token and forgo
the centralized infrastructure.
## Dependencies
This specification has concentrated on enabling a user of a software
project to authenticate new versions. But most software has its own
dependencies, and those also need to be authenticated. A user could
identify all software that they are willing to rely on, but this is
more work than most users are willing and able to do. But, just as
developers are usually in a better position to evaluate who should be
allowed to contribute to their project, they are also in a better
position to designate a trust root for their dependencies.
Enabling this functionality requires ecosystem-specific tooling. The
developer needs to be able to specify a trust root for each
dependency, and the build infrastructure needs to authenticate the
dependencies. For instance, the Rust ecosystem uses Cargo for
building and dependency management. Currently, to add
`sequoia-openpgp` as a dependency to a project, a developer would
modify their `Cargo.toml` file as follows:

    [dependencies]
    sequoia-openpgp = { version = "1" }

With this scheme, they would also specify a trust root, which they
have presumably audited:

    [dependencies]
    sequoia-openpgp = { version = "1", trust-root = "HASH" }

When downloading the dependency, `cargo` would make sure that the
dependency can be authenticated from the specified trust root and, if
not, report an error.
## Document History
This is a first draft that has not been published.
# Acknowledgments
My thanks go---in particular, but not only---to the Sequoia PGP team
for many fruitful discussions. Funding for this project was provided
by the Sovereign Tech Fund.