All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jonathan Tan <jonathantanmy@google.com>
To: git@vger.kernel.org
Cc: Jonathan Tan <jonathantanmy@google.com>
Subject: [WIP 6/4] Documentation: add Packfile URIs design doc
Date: Fri, 11 Jan 2019 14:18:19 -0800	[thread overview]
Message-ID: <e621cd3fb84a8d6865b5d8fe7f14e253a62af936.1547244620.git.jonathantanmy@google.com> (raw)
In-Reply-To: <cover.1547244620.git.jonathantanmy@google.com>

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 Documentation/technical/packfile-uri.txt | 83 ++++++++++++++++++++++++
 Documentation/technical/protocol-v2.txt  |  6 +-
 2 files changed, 88 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/technical/packfile-uri.txt

diff --git a/Documentation/technical/packfile-uri.txt b/Documentation/technical/packfile-uri.txt
new file mode 100644
index 0000000000..6535801486
--- /dev/null
+++ b/Documentation/technical/packfile-uri.txt
@@ -0,0 +1,83 @@
+Packfile URIs
+=============
+
+This feature allows servers to serve part of their packfile response as URIs.
+This allows server designs that improve scalability in bandwidth and CPU usage
+(for example, by serving some data through a CDN), and (in the future) provides
+some measure of resumability to clients.
+
+This feature is available only in protocol version 2.
+
+Protocol
+--------
+
+The server advertises `packfile-uris`.
+
+If the client replies with the following arguments:
+
+ * packfile-uris
+ * thin-pack
+ * ofs-delta
+
+when the server sends the packfile, it MAY send a `packfile-uris` section
+directly before the `packfile` section (right after `wanted-refs` if it is
+sent) containing HTTP(S) URIs. See protocol-v2.txt for the documentation of
+this section.
+
+Clients then should understand that the returned packfile could be incomplete,
+and that it needs to download all the given URIs before the fetch or clone is
+complete. Each URI should point to a Git packfile (which may be a thin pack and
+which may contain offset deltas).
+
+Server design
+-------------
+
+The server can be trivially made compatible with the proposed protocol by
+having it advertise `packfile-uris`, tolerating the client sending
+`packfile-uris`, and never sending any `packfile-uris` section. But we should
+include some sort of non-trivial implementation in the Minimum Viable Product,
+at least so that we can test the client.
+
+This is the implementation: a feature, marked experimental, that allows the
+server to be configured by one or more `uploadpack.blobPackfileUri=<sha1>
+<uri>` entries. Whenever the list of objects to be sent is assembled, a blob
+with the given sha1 can be replaced by the given URI. This allows, for example,
+servers to delegate serving of large blobs to CDNs.
+
+Client design
+-------------
+
+While fetching, the client needs to remember the list of URIs and cannot
+declare that the fetch is complete until all URIs have been downloaded as
+packfiles.
+
+The division of work (initial fetch + additional URIs) introduces convenient
+points for resumption of an interrupted clone - such resumption can be done
+after the Minimum Viable Product (see "Future work").
+
+The client can inhibit this feature (i.e. refrain from sending the
+`packfile-urls` parameter) by passing --no-packfile-urls to `git fetch`.
+
+Future work
+-----------
+
+The protocol design allows some evolution of the server and client without any
+need for protocol changes, so only a small-scoped design is included here to
+form the MVP. For example, the following can be done:
+
+ * On the server, a long-running process that takes in entire requests and
+   outputs a list of URIs and the corresponding inclusion and exclusion sets of
+   objects. This allows, e.g., signed URIs to be used and packfiles for common
+   requests to be cached.
+ * On the client, resumption of clone. If a clone is interrupted, information
+   could be recorded in the repository's config and a "clone-resume" command
+   can resume the clone in progress. (Resumption of subsequent fetches is more
+   difficult because that must deal with the user wanting to use the repository
+   even after the fetch was interrupted.)
+
+There are some possible features that will require a change in protocol:
+
+ * Additional HTTP headers (e.g. authentication)
+ * Byte range support
+ * Different file formats referenced by URIs (e.g. raw object)
+
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index c76f2311c2..de5c63bc9b 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -323,7 +323,8 @@ header. Most sections are sent only when the packfile is sent.
 
     output = acknowledgements flush-pkt |
 	     [acknowledgments delim-pkt] [shallow-info delim-pkt]
-	     [wanted-refs delim-pkt] packfile flush-pkt
+	     [wanted-refs delim-pkt] [packfile-uris delim-pkt]
+	     packfile flush-pkt
 
     acknowledgments = PKT-LINE("acknowledgments" LF)
 		      (nak | *ack)
@@ -341,6 +342,9 @@ header. Most sections are sent only when the packfile is sent.
 		  *PKT-LINE(wanted-ref LF)
     wanted-ref = obj-id SP refname
 
+    packfile-uris = PKT-LINE("packfile-uris" LF) *packfile-uri
+    packfile-uri = PKT-LINE("uri" SP *%x20-ff LF)
+
     packfile = PKT-LINE("packfile" LF)
 	       *PKT-LINE(%x01-03 *%x00-ff)
 
-- 
2.19.0.271.gfe8321ec05.dirty


  parent reply	other threads:[~2019-01-11 22:18 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-11 22:18 [PATCH 0/4] Sideband the whole fetch v2 response Jonathan Tan
2019-01-11 22:18 ` [PATCH 1/4] pkt-line: introduce struct packet_writer Jonathan Tan
2019-01-14 20:07   ` Junio C Hamano
2019-01-15 18:18     ` Jonathan Tan
2019-01-11 22:18 ` [PATCH 2/4] sideband: reverse its dependency on pkt-line Jonathan Tan
2019-01-14 20:07   ` Junio C Hamano
2019-01-14 22:11     ` Junio C Hamano
2019-01-15 18:38       ` Jonathan Tan
2019-01-15 18:36     ` Jonathan Tan
2019-01-11 22:18 ` [PATCH 3/4] {fetch,upload}-pack: sideband v2 fetch response Jonathan Tan
2019-01-14 22:27   ` Junio C Hamano
2019-01-15 19:18     ` Jonathan Tan
2019-01-11 22:18 ` [PATCH 4/4] tests: define GIT_TEST_SIDEBAND_ALL Jonathan Tan
2019-01-11 22:18 ` [WIP 5/4] Documentation: order protocol v2 sections Jonathan Tan
2019-01-11 22:18 ` Jonathan Tan [this message]
2019-01-11 22:18 ` [WIP 7/4] upload-pack: refactor reading of pack-objects out Jonathan Tan
2019-01-11 22:18 ` [WIP 8/4] upload-pack: send part of packfile response as uri Jonathan Tan
2019-01-15 19:40 ` [PATCH v2 0/4] Sideband the whole fetch v2 response Jonathan Tan
2019-01-15 19:40   ` [PATCH v2 1/4] pkt-line: introduce struct packet_writer Jonathan Tan
2019-01-15 19:40   ` [PATCH v2 2/4] sideband: reverse its dependency on pkt-line Jonathan Tan
2019-01-16 10:34     ` SZEDER Gábor
2019-01-16 17:43       ` Jonathan Tan
2019-01-16 19:17         ` SZEDER Gábor
2019-01-15 19:40   ` [PATCH v2 3/4] {fetch,upload}-pack: sideband v2 fetch response Jonathan Tan
2019-01-15 19:40   ` [PATCH v2 4/4] tests: define GIT_TEST_SIDEBAND_ALL Jonathan Tan
2019-01-15 19:50   ` [PATCH v2 0/4] Sideband the whole fetch v2 response Junio C Hamano
2019-01-15 21:11   ` Junio C Hamano
2019-01-15 22:08     ` Jonathan Tan
2019-01-15 22:35       ` Junio C Hamano
2019-01-15 23:02         ` Jonathan Tan
2019-01-15 23:13           ` Junio C Hamano
2019-01-16  0:38             ` Jonathan Tan
2019-01-16  4:12               ` Junio C Hamano
2019-01-16 19:28 ` [PATCH v3 " Jonathan Tan
2019-01-16 19:28   ` [PATCH v3 1/4] pkt-line: introduce struct packet_writer Jonathan Tan
2019-01-16 19:28   ` [PATCH v3 2/4] sideband: reverse its dependency on pkt-line Jonathan Tan
2019-01-16 19:28   ` [PATCH v3 3/4] {fetch,upload}-pack: sideband v2 fetch response Jonathan Tan
2019-01-16 19:28   ` [PATCH v3 4/4] tests: define GIT_TEST_SIDEBAND_ALL Jonathan Tan
2019-01-29 23:27     ` Jonathan Nieder
2019-02-13  6:49       ` Jeff King
2019-01-17 19:43   ` [PATCH v3 0/4] Sideband the whole fetch v2 response Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e621cd3fb84a8d6865b5d8fe7f14e253a62af936.1547244620.git.jonathantanmy@google.com \
    --to=jonathantanmy@google.com \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.