From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: git@vger.kernel.org
Cc: "Jeff King" <peff@peff.net>, "Patrick Steinhardt" <ps@pks.im>,
"Christian Couder" <christian.couder@gmail.com>,
"Albert Cui" <albertqcui@gmail.com>,
"Jonathan Tan" <jonathantanmy@google.com>,
"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: [RFC PATCH 13/13] bundle-uri docs: add design notes
Date: Thu, 5 Aug 2021 17:07:29 +0200 [thread overview]
Message-ID: <RFC-patch-13.13-1e657ed27a-20210805T150534Z-avarab@gmail.com> (raw)
In-Reply-To: <RFC-cover-00.13-0000000000-20210805T150534Z-avarab@gmail.com>
Add a design doc for the bundle-uri protocol extension to go along
with the packfile-uri extension added in cd8402e0fd8 (Documentation:
add Packfile URIs design doc, 2020-06-10).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
Documentation/technical/bundle-uri.txt | 119 ++++++++++++++++++++++++
Documentation/technical/protocol-v2.txt | 5 +
2 files changed, 124 insertions(+)
create mode 100644 Documentation/technical/bundle-uri.txt
diff --git a/Documentation/technical/bundle-uri.txt b/Documentation/technical/bundle-uri.txt
new file mode 100644
index 0000000000..5ae9a15eaf
--- /dev/null
+++ b/Documentation/technical/bundle-uri.txt
@@ -0,0 +1,119 @@
+Bundle URI Design Notes
+=======================
+
+Protocol
+--------
+
+See `bundle-uri` in the link:protocol-v2.html[protocol-v2]
+documentation for a discussion of the bundle-uri command, and the
+expectations of clients and servers.
+
+This document is a a more general discussion of how the `bundle-uri`
+command fits in with the rest of the git ecosystem, its design goals
+and non-goals, comparison to alternatives etc.
+
+Comparison with Packfile URIs
+-----------------------------
+
+There is a similar "Packfile URIs" facility, see the
+link:packfile-uri.html[packfile-uri] documentation for details.
+
+The Packfile URIs facility requires a much closer cooperation between
+CDN and server than the bundle URI facility.
+
+I.e. the server MUST know what objects exist in the packfile URI it's
+pointing to, as well as its pack checksum. Failure to do so will not
+only result in a client error (the packfile hash won't match), but
+even if it got past that would likely result in a corrupt repository
+with tips pointing to unreachable objects.
+
+By comparison the bundle URIs are meant to be a "dumb" solution
+friendly to e.g. having a weekly cronjob take a snapshot of a git
+repository, that snapshot being uploaded to a network of FTP mirrors
+(which may be inconsistent or out of date).
+
+The server does not need to know what state the side-channel download
+is at, because the client will first validate it, and then optionally
+negotiate with the server using what it discovers there.
+
+Using the local `transfer.injectBundleURI` configuration variable (see
+linkgit:git-config[1]) the `bundle-uri` mechanism doesn't even need
+the server to support it.
+
+Security
+--------
+
+The omission of something equivalent to the packfile <OID> in the
+Packfile URIs protocol is intentional, as having it would require
+closer server and CDN cooperation than some server operators are
+comfortable with.
+
+Furthermore, it is not needed for security. The server doesn't need to
+trust its CDN. If the server were to attempt to send harmful content
+to the client, the result would not validate against the server's
+provided ref tips gotten from ls-refs.
+
+The lack of a such a hash does leave room open to a malicious CDN
+operation to be annoying however. E.g. they could inject irrelevant
+objects into the bundles, which would enlarge the downloaded
+repository until a "gc" would eventually throw them away.
+
+In practice the lack of a hash is considered to be a non-issue. Anyone
+concerned about such security problems between their server and their
+CDN is going to be pointing to a "https" URL under their control. For
+a client the "threat" is the same as without bundle-uri, i.e. a server
+is free to be annoying today and send you garbage in the PACK that you
+won't need.
+
+Security issues peculiar to bundle-uri
+--------------------------------------
+
+Both packfile-uri and bundle-uri use the `fetch.uriProtocols`
+configuration variable (see linkgit:git-config[1]) to configure which
+protocols they support.
+
+By default this is set to "http,https" for both, but bundle-uri
+supports adding "file" to that list. The server can thus point to
+"file://" URIs it expects the client to have access to.
+
+This is primarily intended for use with the `transfer.injectBundleURI`
+mechanism, but can also be useful e.g. in a centralized environment
+where a server might point to a "file:///mnt/bundles/big-repo.bdl" it
+knows to be mounted on the local machine (e.g. a racked server),
+points to it in its "bundle-uri" response.
+
+The client can then add "file" to the `fetch.uriProtocols` list to
+obey such responses. That does mean that a malicious server can point
+to any arbitrary file on the local machine. The threat of this is
+considered minimal, since anyone adding `file` to `fetch.uriProtocols`
+likely knows what they're doing and controls both ands, and the worst
+they can do is make a curl(1) pipe garbage into "index-pack" (which
+will likely promptly die on the non-PACK-file).
+
+Security comparison with packfile-uri
+-------------------------------------
+
+The initial implementation of packfile-uri needed special adjusting to
+run "git fsck" on incoming .gitmodules files, this was to deal with a
+general security issue in git, See CVE-2018-17456.
+
+The current packfile-uri mechanism requires special handling around
+"fsck" to do such cross-PACK fsck's, this is because it first indexes
+the "incremental" PACK, and then any PACK(s) provided via
+packfile-uri, before finally doing a full connectivity check.
+
+This is effect doing the fsck one might do via "clone" and "fetch" in
+reverse, or the equivalent of starting with the incremental "fetch",
+followed by the "clone".
+
+Since the packfile-uri mechanism can result in the .gitmodules blob
+referenced by such a "fetch" to be in the pack for the "clone" the
+fetch-pack process needs to keep state between the indexing of
+multiple packs, to remember to fsck the blob (via the "clone") later
+after seeing it in a tree (from the "fetch).
+
+There are no known security issues with the way packfile-uri does
+this, but since bundle-uri effectively emulates what a which doesn't
+support either "bundle-uri" or "packfile-uri" would do on clone/fetch,
+any future security issues peculiar to the packfile-uri approach are
+unlikely to be shared by it.
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index d10d5e9ef6..5536ea4b7e 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -696,3 +696,8 @@ intended to support future features such as:
they've got that OID already (for multi-tips the client would need
to fetch the bundle, or do e.g. HTTP range requests to get its
header).
+
+bundle-uri SEE ALSO
+^^^^^^^^^^^^^^^^^^^
+
+See the link:bundle-uri.html[Bundle URI Design Notes] for more.
--
2.33.0.rc0.646.g585563e77f
next prev parent reply other threads:[~2021-08-05 15:08 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-08-05 15:07 [RFC PATCH 00/13] Add bundle-uri: resumably clones, static "dumb" CDN etc Ævar Arnfjörð Bjarmason
2021-08-05 15:07 ` [RFC PATCH 01/13] serve: add command to advertise bundle URIs Ævar Arnfjörð Bjarmason
2021-08-10 13:58 ` Derrick Stolee
2021-08-23 13:25 ` Ævar Arnfjörð Bjarmason
2021-08-05 15:07 ` [RFC PATCH 02/13] bundle-uri client: add "bundle-uri" parsing + tests Ævar Arnfjörð Bjarmason
2021-08-05 15:07 ` [RFC PATCH 03/13] connect.c: refactor sending of agent & object-format Ævar Arnfjörð Bjarmason
2021-08-05 15:07 ` [RFC PATCH 04/13] bundle-uri client: add minimal NOOP client Ævar Arnfjörð Bjarmason
2021-08-05 15:07 ` [RFC PATCH 05/13] bundle-uri client: add "git ls-remote-bundle-uri" Ævar Arnfjörð Bjarmason
2021-08-05 15:07 ` [RFC PATCH 06/13] bundle-uri client: add transfer.injectBundleURI support Ævar Arnfjörð Bjarmason
2021-08-05 15:07 ` [RFC PATCH 07/13] bundle-uri client: add boolean transfer.bundleURI setting Ævar Arnfjörð Bjarmason
2021-08-05 15:07 ` [RFC PATCH 08/13] bundle.h: make "fd" version of read_bundle_header() public Ævar Arnfjörð Bjarmason
2021-08-05 15:07 ` [RFC PATCH 09/13] fetch-pack: add a deref_without_lazy_fetch_extended() Ævar Arnfjörð Bjarmason
2021-08-05 15:07 ` [RFC PATCH 10/13] fetch-pack: move --keep=* option filling to a function Ævar Arnfjörð Bjarmason
2021-08-05 15:07 ` [RFC PATCH 11/13] index-pack: add --progress-title option Ævar Arnfjörð Bjarmason
2021-08-05 15:07 ` [RFC PATCH 12/13] bundle-uri client: support for bundle-uri with "clone" Ævar Arnfjörð Bjarmason
2021-08-05 15:07 ` Ævar Arnfjörð Bjarmason [this message]
2021-08-24 21:48 ` [RFC PATCH 13/13] bundle-uri docs: add design notes brian m. carlson
2021-08-24 22:33 ` Ævar Arnfjörð Bjarmason
2021-08-06 14:38 ` [RFC PATCH 00/13] Add bundle-uri: resumably clones, static "dumb" CDN etc Jonathan Nieder
2021-08-06 16:26 ` Ævar Arnfjörð Bjarmason
2021-08-06 20:40 ` Jonathan Nieder
2021-08-07 2:19 ` Ævar Arnfjörð Bjarmason
2021-08-10 13:55 ` Derrick Stolee
2021-08-23 13:28 ` Ævar Arnfjörð Bjarmason
2021-08-24 2:03 ` Derrick Stolee
2021-08-24 22:00 ` Ævar Arnfjörð Bjarmason
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=RFC-patch-13.13-1e657ed27a-20210805T150534Z-avarab@gmail.com \
--to=avarab@gmail.com \
--cc=albertqcui@gmail.com \
--cc=christian.couder@gmail.com \
--cc=git@vger.kernel.org \
--cc=jonathantanmy@google.com \
--cc=peff@peff.net \
--cc=ps@pks.im \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).