All of lore.kernel.org
 help / color / mirror / Atom feed
From: Josh Steadmon <steadmon@google.com>
To: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, gitster@pobox.com, me@ttaylorr.com,
	newren@gmail.com, avarab@gmail.com, dyroneteng@gmail.com,
	Johannes.Schindelin@gmx.de, szeder.dev@gmail.com,
	mjcheetham@outlook.com, Derrick Stolee <derrickstolee@github.com>
Subject: Re: [PATCH 0/5] Bundle URIs II: git clone --bundle-uri
Date: Wed, 27 Jul 2022 15:11:44 -0700	[thread overview]
Message-ID: <YuG4ICBFNbLPR9Iv@google.com> (raw)
In-Reply-To: <pull.1300.git.1658781277.gitgitgadget@gmail.com>

On 2022.07.25 20:34, Derrick Stolee via GitGitGadget wrote:
> This is the second series building the bundle URI feature as discussed in
> the previous series that added the design document [1]. This series does not
> modify the design document, so the patches are independent and can be
> applied to the latest 'master'.
> 
> [1]
> https://lore.kernel.org/git/pull.1248.v3.git.1658757188.gitgitgadget@gmail.com
> 
> This series brings in just enough logic that we can bootstrap clones from a
> single bundle using git clone --bundle-uri=<X>.
> 
>  * Patch 1 adds a 'get' capability to 'git remote-https' which allows
>    downloading the contents of a URI to a local file.
>  * Patch 2 creates basic file-copy logic within a new bundle-uri.c file. It
>    is not used until patch 3.
>  * Patch 3 creates the git clone --bundle-uri=<X> option, allowing Git to
>    bootstrap a clone from a bundle, but get the remaining objects from the
>    origin URL. (As of this patch, it only accepts a filename.)
>  * Patch 4 extends the git clone --bundle-uri=<X> option to allow file://
>    and https:// URIs.
>  * Patch 5 is a CLI helper to avoid using --bundle-uri and --depth at the
>    same time in git clone.
> 
> As outlined in [1], the next steps after this are:
> 
>  1. Allow parsing a bundle list as a config file at the given URI. The
>     key-value format is unified with the protocol v2 verb (coming in (3)).
>     [2]
>  2. Implement the protocol v2 verb, re-using the bundle list logic from (2).
>     Use this to auto-discover bundle URIs during 'git clone' (behind a
>     config option). [3]
>  3. Implement the 'creationToken' heuristic, allowing incremental 'git
>     fetch' commands to download a bundle list from a configured URI, and
>     only download bundles that are new based on the creation token values.
>     [4]
> 
> I have prepared some of this work as pull requests on my personal fork so
> curious readers can look ahead to where we are going:
> 
> [2] https://github.com/derrickstolee/git/pull/20
> 
> [3] https://github.com/derrickstolee/git/pull/21
> 
> [4] https://github.com/derrickstolee/git/pull/22
> 
> Note: this series includes some code pulled out of the first series [1], and
> in particular the git fetch --bundle-uri=<X> option is removed. The
> intention was to replace that with git bundle fetch <X>, but that conflicts
> with the work to refactor how subcommands are parsed. The git bundle fetch
> subcommand could be added later for maximum flexibility on the client side,
> but we can also move forward without it.
> 
> Thanks, -Stolee
> 
> P.S. Here is the range-diff compared to v2 of the previous bundle URI
> series:
> 
> 1:  d444042dc4dcc < -:  ------------- docs: document bundle URI standard
> 2:  0a2cf60437f78 ! 1:  40808e92afb7b remote-curl: add 'get' capability
>     @@ remote-curl.c: static void parse_fetch(struct strbuf *buf)
>       
>      +static void parse_get(struct strbuf *buf)
>      +{
>     -+	struct http_get_options opts = { 0 };
>      +	struct strbuf url = STRBUF_INIT;
>      +	struct strbuf path = STRBUF_INIT;
>      +	const char *p, *space;
>     @@ remote-curl.c: static void parse_fetch(struct strbuf *buf)
>      +	strbuf_add(&url, p, space - p);
>      +	strbuf_addstr(&path, space + 1);
>      +
>     -+	if (http_get_file(url.buf, path.buf, &opts))
>     ++	if (http_get_file(url.buf, path.buf, NULL))
>      +		die(_("failed to download file at URL '%s'"), url.buf);
>      +
>      +	strbuf_release(&url);
>     @@ t/t5557-http-get.sh (new)
>      +	get $url file1
>      +	EOF
>      +
>     -+	test_must_fail git remote-http $url $url <input 2>err &&
>     ++	test_must_fail git remote-http $url <input 2>err &&
>      +	test_path_is_missing file1 &&
>      +	grep "failed to download file at URL" err &&
>      +	rm file1.temp
>     @@ t/t5557-http-get.sh (new)
>      +
>      +	EOF
>      +
>     -+	GIT_TRACE2_PERF=1 git remote-http $url $url <input &&
>     ++	GIT_TRACE2_PERF=1 git remote-http $url <input &&
>      +	test_cmp "$HTTPD_DOCUMENT_ROOT_PATH/exists.txt" file2
>      +'
>      +
>      +test_done
>     -
>     - ## transport-helper.c ##
>     -@@ transport-helper.c: struct helper_data {
>     - 		check_connectivity : 1,
>     - 		no_disconnect_req : 1,
>     - 		no_private_update : 1,
>     --		object_format : 1;
>     -+		object_format : 1,
>     -+		get : 1;
>     - 
>     - 	/*
>     - 	 * As an optimization, the transport code may invoke fetch before
>     -@@ transport-helper.c: static struct child_process *get_helper(struct transport *transport)
>     - 			data->no_private_update = 1;
>     - 		} else if (starts_with(capname, "object-format")) {
>     - 			data->object_format = 1;
>     -+		} else if (!strcmp(capname, "get")) {
>     -+			data->get = 1;
>     - 		} else if (mandatory) {
>     - 			die(_("unknown mandatory capability %s; this remote "
>     - 			      "helper probably needs newer version of Git"),
> 3:  abec47564fd9c ! 2:  7d3159f0d9a29 bundle-uri: create basic file-copy logic
>     @@ Commit message
>          file, not a bundle list. Bundle lists will be implemented in a future
>          change.
>      
>     +    Note that the discovery of a temporary filename is slightly racy because
>     +    the odb_mkstemp() relies on the temporary file not existing. With the
>     +    current implementation being limited to file copies, we could replace
>     +    the copy_file() with copy_fd(). The tricky part comes in future changes
>     +    that send the filename to 'git remote-https' and its 'get' capability.
>     +    At that point, we need the file descriptor closed _and_ the file
>     +    unlinked. If we were to keep the file descriptor open for the sake of
>     +    normal file copies, then we would pollute the rest of the code for
>     +    little benefit. This is especially the case because we expect that most
>     +    bundle URI use will be based on HTTPS instead of file copies.
>     +
>          Signed-off-by: Derrick Stolee <derrickstolee@github.com>
>      
>       ## Makefile ##
>     @@ bundle-uri.c (new)
>      +#include "refs.h"
>      +#include "run-command.h"
>      +
>     -+static void find_temp_filename(struct strbuf *name)
>     ++static int find_temp_filename(struct strbuf *name)
>      +{
>      +	int fd;
>      +	/*
>     @@ bundle-uri.c (new)
>      +	 * racy, but unlikely to collide.
>      +	 */
>      +	fd = odb_mkstemp(name, "bundles/tmp_uri_XXXXXX");
>     -+	if (fd < 0)
>     -+		die(_("failed to create temporary file"));
>     ++	if (fd < 0) {
>     ++		warning(_("failed to create temporary file"));
>     ++		return -1;
>     ++	}
>     ++
>      +	close(fd);
>      +	unlink(name->buf);
>     ++	return 0;
>      +}
>      +
>     -+static int copy_uri_to_file(const char *uri, const char *file)
>     ++static int copy_uri_to_file(const char *file, const char *uri)
>      +{
>     -+	/* Copy as a file */
>     -+	return copy_file(file, uri, 0444);
>     ++	/* File-based URIs only for now. */
>     ++	return copy_file(file, uri, 0);
>      +}
>      +
>      +static int unbundle_from_file(struct repository *r, const char *file)
>     @@ bundle-uri.c (new)
>      +	int result = 0;
>      +	int bundle_fd;
>      +	struct bundle_header header = BUNDLE_HEADER_INIT;
>     -+	struct strvec extra_index_pack_args = STRVEC_INIT;
>      +	struct string_list_item *refname;
>      +	struct strbuf bundle_ref = STRBUF_INIT;
>      +	size_t bundle_prefix_len;
>     @@ bundle-uri.c (new)
>      +	if ((bundle_fd = read_bundle_header(file, &header)) < 0)
>      +		return 1;
>      +
>     -+	if ((result = unbundle(r, &header, bundle_fd, &extra_index_pack_args)))
>     ++	if ((result = unbundle(r, &header, bundle_fd, NULL)))
>      +		return 1;
>      +
>      +	/*
>     @@ bundle-uri.c (new)
>      +	int result = 0;
>      +	struct strbuf filename = STRBUF_INIT;
>      +
>     -+	find_temp_filename(&filename);
>     -+	if ((result = copy_uri_to_file(uri, filename.buf)))
>     ++	if ((result = find_temp_filename(&filename)))
>     ++		goto cleanup;
>     ++
>     ++	if ((result = copy_uri_to_file(filename.buf, uri))) {
>     ++		warning(_("failed to download bundle from URI '%s'"), uri);
>      +		goto cleanup;
>     ++	}
>      +
>     -+	if ((result = !is_bundle(filename.buf, 0)))
>     ++	if ((result = !is_bundle(filename.buf, 0))) {
>     ++		warning(_("file at URI '%s' is not a bundle"), uri);
>      +		goto cleanup;
>     ++	}
>      +
>     -+	if ((result = unbundle_from_file(r, filename.buf)))
>     ++	if ((result = unbundle_from_file(r, filename.buf))) {
>     ++		warning(_("failed to unbundle bundle from URI '%s'"), uri);
>      +		goto cleanup;
>     ++	}
>      +
>      +cleanup:
>      +	unlink(filename.buf);
> 4:  f6255ec518857 < -:  ------------- fetch: add --bundle-uri option
> -:  ------------- > 3:  29e645a54ba7f clone: add --bundle-uri option
> 5:  bfbd11b48bf1b ! 4:  f6bc3177332e8 bundle-uri: add support for http(s):// and file://
>     @@ Metadata
>       ## Commit message ##
>          bundle-uri: add support for http(s):// and file://
>      
>     -    The previous change created the 'git fetch --bundle-uri=<uri>' option.
>     +    The previous change created the 'git clone --bundle-uri=<uri>' option.
>          Currently, <uri> must be a filename.
>      
>          Update copy_uri_to_file() to first inspect the URI for an HTTP(S) prefix
>     @@ Commit message
>          Signed-off-by: Derrick Stolee <derrickstolee@github.com>
>      
>       ## bundle-uri.c ##
>     -@@ bundle-uri.c: static void find_temp_filename(struct strbuf *name)
>     - 	unlink(name->buf);
>     +@@ bundle-uri.c: static int find_temp_filename(struct strbuf *name)
>     + 	return 0;
>       }
>       
>     -+static int download_https_uri_to_file(const char *uri, const char *file)
>     -+{
>     +-static int copy_uri_to_file(const char *file, const char *uri)
>     ++static int download_https_uri_to_file(const char *file, const char *uri)
>     + {
>     +-	/* File-based URIs only for now. */
>     +-	return copy_file(file, uri, 0);
>      +	int result = 0;
>      +	struct child_process cp = CHILD_PROCESS_INIT;
>      +	FILE *child_in = NULL, *child_out = NULL;
>     @@ bundle-uri.c: static void find_temp_filename(struct strbuf *name)
>      +	return result;
>      +}
>      +
>     - static int copy_uri_to_file(const char *uri, const char *file)
>     - {
>     ++static int copy_uri_to_file(const char *filename, const char *uri)
>     ++{
>      +	const char *out;
>     ++
>      +	if (skip_prefix(uri, "https:", &out) ||
>      +	    skip_prefix(uri, "http:", &out))
>     -+		return download_https_uri_to_file(uri, file);
>     ++		return download_https_uri_to_file(filename, uri);
>      +
>      +	if (!skip_prefix(uri, "file://", &out))
>      +		out = uri;
>      +
>     - 	/* Copy as a file */
>     --	return copy_file(file, uri, 0444);
>     -+	return !!copy_file(file, out, 0);
>     ++	/* Copy as a file */
>     ++	return copy_file(filename, out, 0);
>       }
>       
>       static int unbundle_from_file(struct repository *r, const char *file)
>      
>     - ## t/t5558-fetch-bundle-uri.sh ##
>     -@@ t/t5558-fetch-bundle-uri.sh: test_expect_success 'fetch file bundle' '
>     + ## t/t5558-clone-bundle-uri.sh ##
>     +@@ t/t5558-clone-bundle-uri.sh: test_expect_success 'clone with path bundle' '
>           test_cmp expect actual
>       '
>       
>     -+test_expect_success 'fetch file:// bundle' '
>     -+	git init fetch-file &&
>     -+	git -C fetch-file fetch --bundle-uri="file://$(pwd)/fetch-from/B.bundle" &&
>     -+	git -C fetch-file rev-parse refs/bundles/topic >actual &&
>     -+	git -C fetch-from rev-parse topic >expect &&
>     ++test_expect_success 'clone with file:// bundle' '
>     ++	git clone --bundle-uri="file://$(pwd)/clone-from/B.bundle" \
>     ++		clone-from clone-file &&
>     ++	git -C clone-file rev-parse refs/bundles/topic >actual &&
>     ++	git -C clone-from rev-parse topic >expect &&
>      +	test_cmp expect actual
>      +'
>      +
>     @@ t/t5558-fetch-bundle-uri.sh: test_expect_success 'fetch file bundle' '
>      +start_httpd
>      +
>      +test_expect_success 'fail to fetch from non-existent HTTP URL' '
>     -+	test_must_fail git fetch --bundle-uri="$HTTPD_URL/does-not-exist" 2>err &&
>     ++	test_when_finished rm -rf test &&
>     ++	git clone --bundle-uri="$HTTPD_URL/does-not-exist" . test 2>err &&
>      +	grep "failed to download bundle from URI" err
>      +'
>      +
>      +test_expect_success 'fail to fetch from non-bundle HTTP URL' '
>     ++	test_when_finished rm -rf test &&
>      +	echo bogus >"$HTTPD_DOCUMENT_ROOT_PATH/bogus" &&
>     -+	test_must_fail git fetch --bundle-uri="$HTTPD_URL/bogus" 2>err &&
>     ++	git clone --bundle-uri="$HTTPD_URL/bogus" . test 2>err &&
>      +	grep "is not a bundle" err
>      +'
>      +
>     -+test_expect_success 'fetch HTTP bundle' '
>     -+	cp fetch-from/B.bundle "$HTTPD_DOCUMENT_ROOT_PATH/B.bundle" &&
>     -+	git init fetch-http &&
>     -+	git -C fetch-http fetch --bundle-uri="$HTTPD_URL/B.bundle" &&
>     -+	git -C fetch-http rev-parse refs/bundles/topic >actual &&
>     -+	git -C fetch-from rev-parse topic >expect &&
>     -+	test_cmp expect actual
>     ++test_expect_success 'clone HTTP bundle' '
>     ++	cp clone-from/B.bundle "$HTTPD_DOCUMENT_ROOT_PATH/B.bundle" &&
>     ++
>     ++	git clone --no-local --mirror clone-from \
>     ++		"$HTTPD_DOCUMENT_ROOT_PATH/fetch.git" &&
>     ++
>     ++	git clone --bundle-uri="$HTTPD_URL/B.bundle" \
>     ++		"$HTTPD_URL/smart/fetch.git" clone-http &&
>     ++	git -C clone-http rev-parse refs/bundles/topic >actual &&
>     ++	git -C clone-from rev-parse topic >expect &&
>     ++	test_cmp expect actual &&
>     ++
>     ++	test_config -C clone-http log.excludedecoration refs/bundle/
>      +'
>      +
>      +# Do not add tests here unless they use the HTTP server, as they will
> 6:  a217e9a0640b4 < -:  ------------- fetch: add 'refs/bundle/' to log.excludeDecoration
> -:  ------------- > 5:  e823b168ab725 clone: --bundle-uri cannot be combined with --depth
> 
> 
> Derrick Stolee (5):
>   remote-curl: add 'get' capability
>   bundle-uri: create basic file-copy logic
>   clone: add --bundle-uri option
>   bundle-uri: add support for http(s):// and file://
>   clone: --bundle-uri cannot be combined with --depth
> 
>  Documentation/git-clone.txt         |   7 ++
>  Documentation/gitremote-helpers.txt |   9 ++
>  Makefile                            |   1 +
>  builtin/clone.c                     |  18 +++
>  bundle-uri.c                        | 168 ++++++++++++++++++++++++++++
>  bundle-uri.h                        |  14 +++
>  remote-curl.c                       |  32 ++++++
>  t/t5557-http-get.sh                 |  37 ++++++
>  t/t5558-clone-bundle-uri.sh         |  81 ++++++++++++++
>  t/t5606-clone-options.sh            |   8 ++
>  10 files changed, 375 insertions(+)
>  create mode 100644 bundle-uri.c
>  create mode 100644 bundle-uri.h
>  create mode 100755 t/t5557-http-get.sh
>  create mode 100755 t/t5558-clone-bundle-uri.sh
> 
> 
> base-commit: e72d93e88cb20b06e88e6e7d81bd1dc4effe453f
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1300%2Fderrickstolee%2Fbundle-redo%2Fclone-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1300/derrickstolee/bundle-redo/clone-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1300
> -- 
> gitgitgadget

Looks good to me, with just one small fix and a couple of optional
nitpicks. Thaks for the series!

Reviewed-by: Josh Steadmon <steadmon@google.com>

  parent reply	other threads:[~2022-07-27 22:15 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-25 20:34 [PATCH 0/5] Bundle URIs II: git clone --bundle-uri Derrick Stolee via GitGitGadget
2022-07-25 20:34 ` [PATCH 1/5] remote-curl: add 'get' capability Derrick Stolee via GitGitGadget
2022-07-27 22:08   ` Josh Steadmon
2022-07-27 23:00   ` Ævar Arnfjörð Bjarmason
2022-08-01 13:55     ` Derrick Stolee
2022-08-01 14:39       ` Ævar Arnfjörð Bjarmason
2022-07-25 20:34 ` [PATCH 2/5] bundle-uri: create basic file-copy logic Derrick Stolee via GitGitGadget
2022-07-27 22:09   ` Josh Steadmon
2022-08-01 13:58     ` Derrick Stolee
2022-07-25 20:34 ` [PATCH 3/5] clone: add --bundle-uri option Derrick Stolee via GitGitGadget
2022-07-25 20:34 ` [PATCH 4/5] bundle-uri: add support for http(s):// and file:// Derrick Stolee via GitGitGadget
2022-07-27 22:09   ` Josh Steadmon
2022-08-01 14:00     ` Derrick Stolee
2022-07-25 20:34 ` [PATCH 5/5] clone: --bundle-uri cannot be combined with --depth Derrick Stolee via GitGitGadget
2022-07-27 22:11 ` Josh Steadmon [this message]
2022-08-01 14:00   ` [PATCH 0/5] Bundle URIs II: git clone --bundle-uri Derrick Stolee
2022-08-02 12:29 ` [PATCH v2 " Derrick Stolee via GitGitGadget
2022-08-02 12:29   ` [PATCH v2 1/5] remote-curl: add 'get' capability Derrick Stolee via GitGitGadget
2022-08-02 12:29   ` [PATCH v2 2/5] bundle-uri: create basic file-copy logic Derrick Stolee via GitGitGadget
2022-08-02 12:29   ` [PATCH v2 3/5] clone: add --bundle-uri option Derrick Stolee via GitGitGadget
2022-08-02 12:29   ` [PATCH v2 4/5] bundle-uri: add support for http(s):// and file:// Derrick Stolee via GitGitGadget
2022-08-02 21:32     ` Junio C Hamano
2022-08-04 15:34       ` Derrick Stolee
2022-08-04 18:19         ` Junio C Hamano
2022-08-02 12:29   ` [PATCH v2 5/5] clone: --bundle-uri cannot be combined with --depth Derrick Stolee via GitGitGadget
2022-08-09 13:11   ` [PATCH v3 0/5] Bundle URIs II: git clone --bundle-uri Derrick Stolee via GitGitGadget
2022-08-09 13:11     ` [PATCH v3 1/5] remote-curl: add 'get' capability Derrick Stolee via GitGitGadget
2022-08-09 13:11     ` [PATCH v3 2/5] bundle-uri: create basic file-copy logic Derrick Stolee via GitGitGadget
2022-08-09 13:11     ` [PATCH v3 3/5] clone: add --bundle-uri option Derrick Stolee via GitGitGadget
2022-08-22 21:24       ` Junio C Hamano
2022-08-23 14:05         ` Derrick Stolee
2022-08-24 15:46           ` Junio C Hamano
2022-08-09 13:11     ` [PATCH v3 4/5] bundle-uri: add support for http(s):// and file:// Derrick Stolee via GitGitGadget
2022-08-29  4:58       ` Teng Long
2022-08-30 13:33         ` Derrick Stolee
2022-08-09 13:11     ` [PATCH v3 5/5] clone: --bundle-uri cannot be combined with --depth Derrick Stolee via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YuG4ICBFNbLPR9Iv@google.com \
    --to=steadmon@google.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=avarab@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=dyroneteng@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=me@ttaylorr.com \
    --cc=mjcheetham@outlook.com \
    --cc=newren@gmail.com \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.