archive mirror
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <>
Cc: "Jeff King" <>, "Patrick Steinhardt" <>,
	"Christian Couder" <>,
	"Albert Cui" <>,
	"Jonathan Tan" <>,
	"Ævar Arnfjörð Bjarmason" <>
Subject: [RFC PATCH 13/13] bundle-uri docs: add design notes
Date: Thu,  5 Aug 2021 17:07:29 +0200	[thread overview]
Message-ID: <> (raw)
In-Reply-To: <>

Add a design doc for the bundle-uri protocol extension to go along
with the packfile-uri extension added in cd8402e0fd8 (Documentation:
add Packfile URIs design doc, 2020-06-10).

Signed-off-by: Ævar Arnfjörð Bjarmason <>
 Documentation/technical/bundle-uri.txt  | 119 ++++++++++++++++++++++++
 Documentation/technical/protocol-v2.txt |   5 +
 2 files changed, 124 insertions(+)
 create mode 100644 Documentation/technical/bundle-uri.txt

diff --git a/Documentation/technical/bundle-uri.txt b/Documentation/technical/bundle-uri.txt
new file mode 100644
index 0000000000..5ae9a15eaf
--- /dev/null
+++ b/Documentation/technical/bundle-uri.txt
@@ -0,0 +1,119 @@
+Bundle URI Design Notes
+See `bundle-uri` in the link:protocol-v2.html[protocol-v2]
+documentation for a discussion of the bundle-uri command, and the
+expectations of clients and servers.
+This document is a a more general discussion of how the `bundle-uri`
+command fits in with the rest of the git ecosystem, its design goals
+and non-goals, comparison to alternatives etc.
+Comparison with Packfile URIs
+There is a similar "Packfile URIs" facility, see the
+link:packfile-uri.html[packfile-uri] documentation for details.
+The Packfile URIs facility requires a much closer cooperation between
+CDN and server than the bundle URI facility.
+I.e. the server MUST know what objects exist in the packfile URI it's
+pointing to, as well as its pack checksum. Failure to do so will not
+only result in a client error (the packfile hash won't match), but
+even if it got past that would likely result in a corrupt repository
+with tips pointing to unreachable objects.
+By comparison the bundle URIs are meant to be a "dumb" solution
+friendly to e.g. having a weekly cronjob take a snapshot of a git
+repository, that snapshot being uploaded to a network of FTP mirrors
+(which may be inconsistent or out of date).
+The server does not need to know what state the side-channel download
+is at, because the client will first validate it, and then optionally
+negotiate with the server using what it discovers there.
+Using the local `transfer.injectBundleURI` configuration variable (see
+linkgit:git-config[1]) the `bundle-uri` mechanism doesn't even need
+the server to support it.
+The omission of something equivalent to the packfile <OID> in the
+Packfile URIs protocol is intentional, as having it would require
+closer server and CDN cooperation than some server operators are
+comfortable with.
+Furthermore, it is not needed for security. The server doesn't need to
+trust its CDN. If the server were to attempt to send harmful content
+to the client, the result would not validate against the server's
+provided ref tips gotten from ls-refs.
+The lack of a such a hash does leave room open to a malicious CDN
+operation to be annoying however. E.g. they could inject irrelevant
+objects into the bundles, which would enlarge the downloaded
+repository until a "gc" would eventually throw them away.
+In practice the lack of a hash is considered to be a non-issue. Anyone
+concerned about such security problems between their server and their
+CDN is going to be pointing to a "https" URL under their control. For
+a client the "threat" is the same as without bundle-uri, i.e. a server
+is free to be annoying today and send you garbage in the PACK that you
+won't need.
+Security issues peculiar to bundle-uri
+Both packfile-uri and bundle-uri use the `fetch.uriProtocols`
+configuration variable (see linkgit:git-config[1]) to configure which
+protocols they support.
+By default this is set to "http,https" for both, but bundle-uri
+supports adding "file" to that list. The server can thus point to
+"file://" URIs it expects the client to have access to.
+This is primarily intended for use with the `transfer.injectBundleURI`
+mechanism, but can also be useful e.g. in a centralized environment
+where a server might point to a "file:///mnt/bundles/big-repo.bdl" it
+knows to be mounted on the local machine (e.g. a racked server),
+points to it in its "bundle-uri" response.
+The client can then add "file" to the `fetch.uriProtocols` list to
+obey such responses. That does mean that a malicious server can point
+to any arbitrary file on the local machine. The threat of this is
+considered minimal, since anyone adding `file` to `fetch.uriProtocols`
+likely knows what they're doing and controls both ands, and the worst
+they can do is make a curl(1) pipe garbage into "index-pack" (which
+will likely promptly die on the non-PACK-file).
+Security comparison with packfile-uri
+The initial implementation of packfile-uri needed special adjusting to
+run "git fsck" on incoming .gitmodules files, this was to deal with a
+general security issue in git, See CVE-2018-17456.
+The current packfile-uri mechanism requires special handling around
+"fsck" to do such cross-PACK fsck's, this is because it first indexes
+the "incremental" PACK, and then any PACK(s) provided via
+packfile-uri, before finally doing a full connectivity check.
+This is effect doing the fsck one might do via "clone" and "fetch" in
+reverse, or the equivalent of starting with the incremental "fetch",
+followed by the "clone".
+Since the packfile-uri mechanism can result in the .gitmodules blob
+referenced by such a "fetch" to be in the pack for the "clone" the
+fetch-pack process needs to keep state between the indexing of
+multiple packs, to remember to fsck the blob (via the "clone") later
+after seeing it in a tree (from the "fetch).
+There are no known security issues with the way packfile-uri does
+this, but since bundle-uri effectively emulates what a which doesn't
+support either "bundle-uri" or "packfile-uri" would do on clone/fetch,
+any future security issues peculiar to the packfile-uri approach are
+unlikely to be shared by it.
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index d10d5e9ef6..5536ea4b7e 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -696,3 +696,8 @@ intended to support future features such as:
    they've got that OID already (for multi-tips the client would need
    to fetch the bundle, or do e.g. HTTP range requests to get its
+bundle-uri SEE ALSO
+See the link:bundle-uri.html[Bundle URI Design Notes] for more.

  parent reply	other threads:[~2021-08-05 15:08 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-05 15:07 [RFC PATCH 00/13] Add bundle-uri: resumably clones, static "dumb" CDN etc Ævar Arnfjörð Bjarmason
2021-08-05 15:07 ` [RFC PATCH 01/13] serve: add command to advertise bundle URIs Ævar Arnfjörð Bjarmason
2021-08-10 13:58   ` Derrick Stolee
2021-08-23 13:25     ` Ævar Arnfjörð Bjarmason
2021-08-05 15:07 ` [RFC PATCH 02/13] bundle-uri client: add "bundle-uri" parsing + tests Ævar Arnfjörð Bjarmason
2021-08-05 15:07 ` [RFC PATCH 03/13] connect.c: refactor sending of agent & object-format Ævar Arnfjörð Bjarmason
2021-08-05 15:07 ` [RFC PATCH 04/13] bundle-uri client: add minimal NOOP client Ævar Arnfjörð Bjarmason
2021-08-05 15:07 ` [RFC PATCH 05/13] bundle-uri client: add "git ls-remote-bundle-uri" Ævar Arnfjörð Bjarmason
2021-08-05 15:07 ` [RFC PATCH 06/13] bundle-uri client: add transfer.injectBundleURI support Ævar Arnfjörð Bjarmason
2021-08-05 15:07 ` [RFC PATCH 07/13] bundle-uri client: add boolean transfer.bundleURI setting Ævar Arnfjörð Bjarmason
2021-08-05 15:07 ` [RFC PATCH 08/13] bundle.h: make "fd" version of read_bundle_header() public Ævar Arnfjörð Bjarmason
2021-08-05 15:07 ` [RFC PATCH 09/13] fetch-pack: add a deref_without_lazy_fetch_extended() Ævar Arnfjörð Bjarmason
2021-08-05 15:07 ` [RFC PATCH 10/13] fetch-pack: move --keep=* option filling to a function Ævar Arnfjörð Bjarmason
2021-08-05 15:07 ` [RFC PATCH 11/13] index-pack: add --progress-title option Ævar Arnfjörð Bjarmason
2021-08-05 15:07 ` [RFC PATCH 12/13] bundle-uri client: support for bundle-uri with "clone" Ævar Arnfjörð Bjarmason
2021-08-05 15:07 ` Ævar Arnfjörð Bjarmason [this message]
2021-08-24 21:48   ` [RFC PATCH 13/13] bundle-uri docs: add design notes brian m. carlson
2021-08-24 22:33     ` Ævar Arnfjörð Bjarmason
2021-08-06 14:38 ` [RFC PATCH 00/13] Add bundle-uri: resumably clones, static "dumb" CDN etc Jonathan Nieder
2021-08-06 16:26   ` Ævar Arnfjörð Bjarmason
2021-08-06 20:40     ` Jonathan Nieder
2021-08-07  2:19       ` Ævar Arnfjörð Bjarmason
2021-08-10 13:55 ` Derrick Stolee
2021-08-23 13:28   ` Ævar Arnfjörð Bjarmason
2021-08-24  2:03     ` Derrick Stolee
2021-08-24 22:00       ` Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \ \ \ \ \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).