All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] Add fetch.updateHead option
@ 2023-04-05  1:27 Felipe Contreras
  2023-04-05  1:27 ` [PATCH 1/2] " Felipe Contreras
  2023-04-05  1:27 ` [PATCH 2/2] fetch: add support for HEAD update on mirrors Felipe Contreras
  0 siblings, 2 replies; 9+ messages in thread
From: Felipe Contreras @ 2023-04-05  1:27 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, Jeff King,
	Patrick Steinhardt, Jonathan Nieder, Daniel Martí,
	Felipe Contreras

It's surprising that `git clone` and `git init && git remote add -f` don't
create the same remote state.

Fix this by introducing a new configuration: `fetch.updateHead` which updates
the remote `HEAD` when it's not present with "missing", or always with
"always".

By default it's "never", which retains the current behavior.

This has already been discussed before [1].

[1] https://lore.kernel.org/git/20201118091219.3341585-1-felipe.contreras@gmail.com/

Felipe Contreras (2):
  Add fetch.updateHead option
  fetch: add support for HEAD update on mirrors

 Documentation/config/fetch.txt  |  4 ++
 Documentation/config/remote.txt |  3 ++
 builtin/fetch.c                 | 69 ++++++++++++++++++++++++++++++++-
 remote.c                        | 21 ++++++++++
 remote.h                        | 11 ++++++
 t/t5510-fetch.sh                | 49 +++++++++++++++++++++++
 6 files changed, 156 insertions(+), 1 deletion(-)

-- 
2.40.0+fc1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/2] Add fetch.updateHead option
  2023-04-05  1:27 [PATCH 0/2] Add fetch.updateHead option Felipe Contreras
@ 2023-04-05  1:27 ` Felipe Contreras
  2023-04-05  9:16   ` Ævar Arnfjörð Bjarmason
  2023-04-05  9:28   ` Ævar Arnfjörð Bjarmason
  2023-04-05  1:27 ` [PATCH 2/2] fetch: add support for HEAD update on mirrors Felipe Contreras
  1 sibling, 2 replies; 9+ messages in thread
From: Felipe Contreras @ 2023-04-05  1:27 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, Jeff King,
	Patrick Steinhardt, Jonathan Nieder, Daniel Martí,
	Felipe Contreras

Users might change the behavior when running "git fetch" so that the
remote's HEAD symbolic ref is updated at certain point.

For example after running "git remote add" the remote HEAD is not
set like it is with "git clone".

Setting "fetch.updatehead = missing" would probably be a sensible
default that everyone would want, but for now the default behavior is to
never update HEAD, so there shouldn't be any functional changes.

For the next major version of Git, we might want to change this default.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
---
 Documentation/config/fetch.txt  |  4 +++
 Documentation/config/remote.txt |  3 ++
 builtin/fetch.c                 | 64 ++++++++++++++++++++++++++++++++-
 remote.c                        | 21 +++++++++++
 remote.h                        | 11 ++++++
 t/t5510-fetch.sh                | 31 ++++++++++++++++
 6 files changed, 133 insertions(+), 1 deletion(-)

diff --git a/Documentation/config/fetch.txt b/Documentation/config/fetch.txt
index 568f0f75b3..dc147ffb35 100644
--- a/Documentation/config/fetch.txt
+++ b/Documentation/config/fetch.txt
@@ -120,3 +120,7 @@ fetch.bundleCreationToken::
 The creation token values are chosen by the provider serving the specific
 bundle URI. If you modify the URI at `fetch.bundleURI`, then be sure to
 remove the value for the `fetch.bundleCreationToken` value before fetching.
+
+fetch.updateHead::
+	Defines when to update the remote HEAD symbolic ref. Values are 'never',
+	'missing' (update only when HEAD is missing), and 'always'.
diff --git a/Documentation/config/remote.txt b/Documentation/config/remote.txt
index 0678b4bcfe..9d739d2ed4 100644
--- a/Documentation/config/remote.txt
+++ b/Documentation/config/remote.txt
@@ -86,3 +86,6 @@ remote.<name>.partialclonefilter::
 	Changing or clearing this value will only affect fetches for new commits.
 	To fetch associated objects for commits already present in the local object
 	database, use the `--refetch` option of linkgit:git-fetch[1].
+
+remote.<name>.updateHead::
+	Defines when to update the remote HEAD symbolic ref. See `fetch.updateHead`.
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 7221e57f35..7e93a1aa46 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -59,6 +59,8 @@ static int fetch_prune_tags_config = -1; /* unspecified */
 static int prune_tags = -1; /* unspecified */
 #define PRUNE_TAGS_BY_DEFAULT 0 /* do we prune tags by default? */
 
+static int fetch_update_head = FETCH_UPDATE_HEAD_DEFAULT;
+
 static int all, append, dry_run, force, keep, multiple, update_head_ok;
 static int write_fetch_head = 1;
 static int verbosity, deepen_relative, set_upstream, refetch;
@@ -129,6 +131,9 @@ static int git_fetch_config(const char *k, const char *v, void *cb)
 		return 0;
 	}
 
+	if (!strcmp(k, "fetch.updatehead"))
+		return parse_update_head(&fetch_update_head, k, v);
+
 	return git_default_config(k, v, cb);
 }
 
@@ -1579,6 +1584,47 @@ static int backfill_tags(struct transport *transport,
 	return retcode;
 }
 
+static void update_head(int config, const struct ref *head, const struct remote *remote)
+{
+	char *ref, *target;
+	const char *r;
+	int flags;
+
+	if (!head || !head->symref || !remote)
+		return;
+
+	ref = apply_refspecs((struct refspec *)&remote->fetch, "refs/heads/HEAD");
+	target = apply_refspecs((struct refspec *)&remote->fetch, head->symref);
+
+	if (!ref || !target) {
+		warning(_("could not update remote head"));
+		return;
+	}
+
+	r = resolve_ref_unsafe(ref, 0, NULL, &flags);
+
+	if (r) {
+		if (config == FETCH_UPDATE_HEAD_MISSING) {
+			if (flags & REF_ISSYMREF)
+				/* already present */
+				return;
+		} else if (config == FETCH_UPDATE_HEAD_ALWAYS) {
+			if (!strcmp(r, target))
+				/* already up-to-date */
+				return;
+		} else
+			/* should never happen */
+			return;
+	}
+
+	if (!create_symref(ref, target, "remote update head")) {
+		if (verbosity >= 0)
+			printf(_("Updated remote '%s' HEAD\n"), remote->name);
+	} else {
+		warning(_("could not update remote head"));
+	}
+}
+
 static int do_fetch(struct transport *transport,
 		    struct refspec *rs)
 {
@@ -1592,6 +1638,7 @@ static int do_fetch(struct transport *transport,
 	int must_list_refs = 1;
 	struct fetch_head fetch_head = { 0 };
 	struct strbuf err = STRBUF_INIT;
+	int need_update_head = 0, update_head_config = 0;
 
 	if (tags == TAGS_DEFAULT) {
 		if (transport->remote->fetch_tags == 2)
@@ -1626,9 +1673,21 @@ static int do_fetch(struct transport *transport,
 	} else {
 		struct branch *branch = branch_get(NULL);
 
-		if (transport->remote->fetch.nr)
+		if (transport->remote->fetch.nr) {
+
+			if (transport->remote->update_head)
+				update_head_config = transport->remote->update_head;
+			else
+				update_head_config = fetch_update_head;
+
+			need_update_head = update_head_config && update_head_config != FETCH_UPDATE_HEAD_NEVER;
+
+			if (need_update_head)
+				strvec_push(&transport_ls_refs_options.ref_prefixes, "HEAD");
 			refspec_ref_prefixes(&transport->remote->fetch,
 					     &transport_ls_refs_options.ref_prefixes);
+		}
+
 		if (branch_has_merge_config(branch) &&
 		    !strcmp(branch->remote_name, transport->remote->name)) {
 			int i;
@@ -1737,6 +1796,9 @@ static int do_fetch(struct transport *transport,
 
 	commit_fetch_head(&fetch_head);
 
+	if (need_update_head)
+		update_head(update_head_config, find_ref_by_name(remote_refs, "HEAD"), transport->remote);
+
 	if (set_upstream) {
 		struct branch *branch = branch_get("HEAD");
 		struct ref *rm;
diff --git a/remote.c b/remote.c
index 641b083d90..5f3a9aa53e 100644
--- a/remote.c
+++ b/remote.c
@@ -344,6 +344,25 @@ static void read_branches_file(struct remote_state *remote_state,
 	remote->fetch_tags = 1; /* always auto-follow */
 }
 
+int parse_update_head(int *r, const char *var, const char *value)
+{
+	if (!r)
+		return -1;
+	else if (!value)
+		return config_error_nonbool(var);
+	else if (!strcmp(value, "never"))
+		*r = FETCH_UPDATE_HEAD_NEVER;
+	else if (!strcmp(value, "missing"))
+		*r = FETCH_UPDATE_HEAD_MISSING;
+	else if (!strcmp(value, "always"))
+		*r = FETCH_UPDATE_HEAD_ALWAYS;
+	else {
+		error(_("malformed value for %s: %s"), var, value);
+		return error(_("must be one of never, missing, or always"));
+	}
+	return 0;
+}
+
 static int handle_config(const char *key, const char *value, void *cb)
 {
 	const char *name;
@@ -473,6 +492,8 @@ static int handle_config(const char *key, const char *value, void *cb)
 					 key, value);
 	} else if (!strcmp(subkey, "vcs")) {
 		return git_config_string(&remote->foreign_vcs, key, value);
+	} else if (!strcmp(subkey, "updatehead")) {
+		return parse_update_head(&remote->update_head, key, value);
 	}
 	return 0;
 }
diff --git a/remote.h b/remote.h
index 73638cefeb..9dce42d65d 100644
--- a/remote.h
+++ b/remote.h
@@ -22,6 +22,13 @@ enum {
 	REMOTE_BRANCHES
 };
 
+enum {
+	FETCH_UPDATE_HEAD_DEFAULT = 0,
+	FETCH_UPDATE_HEAD_NEVER,
+	FETCH_UPDATE_HEAD_MISSING,
+	FETCH_UPDATE_HEAD_ALWAYS,
+};
+
 struct rewrite {
 	const char *base;
 	size_t baselen;
@@ -97,6 +104,8 @@ struct remote {
 	int prune;
 	int prune_tags;
 
+	int update_head;
+
 	/**
 	 * The configured helper programs to run on the remote side, for
 	 * Git-native protocols.
@@ -449,4 +458,6 @@ void apply_push_cas(struct push_cas_option *, struct remote *, struct ref *);
 char *relative_url(const char *remote_url, const char *url,
 		   const char *up_path);
 
+int parse_update_head(int *r, const char *var, const char *value);
+
 #endif
diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index dc44da9c79..dbeb2928ae 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -814,6 +814,37 @@ test_expect_success 'fetch from multiple configured URLs in single remote' '
 	git fetch multipleurls
 '
 
+test_cmp_symbolic_ref () {
+	git symbolic-ref "$1" >actual &&
+	echo "$2" >expected &&
+	test_cmp expected actual
+}
+
+test_expect_success 'updatehead' '
+	test_when_finished "rm -rf updatehead" &&
+
+	git init updatehead &&
+	(
+		cd updatehead &&
+
+		git config fetch.updateHead never &&
+		git remote add origin .. &&
+		git fetch &&
+		test_must_fail git rev-parse --verify refs/remotes/origin/HEAD &&
+
+		git config fetch.updateHead missing &&
+		git fetch &&
+		test_cmp_symbolic_ref refs/remotes/origin/HEAD refs/remotes/origin/main &&
+		git symbolic-ref refs/remotes/origin/HEAD refs/remotes/origin/side &&
+		git fetch &&
+		test_cmp_symbolic_ref refs/remotes/origin/HEAD refs/remotes/origin/side &&
+
+		git config fetch.updateHead always &&
+		git fetch &&
+		test_cmp_symbolic_ref refs/remotes/origin/HEAD refs/remotes/origin/main
+	)
+'
+
 # configured prune tests
 
 set_config_tristate () {
-- 
2.40.0+fc1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/2] fetch: add support for HEAD update on mirrors
  2023-04-05  1:27 [PATCH 0/2] Add fetch.updateHead option Felipe Contreras
  2023-04-05  1:27 ` [PATCH 1/2] " Felipe Contreras
@ 2023-04-05  1:27 ` Felipe Contreras
  1 sibling, 0 replies; 9+ messages in thread
From: Felipe Contreras @ 2023-04-05  1:27 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, Jeff King,
	Patrick Steinhardt, Jonathan Nieder, Daniel Martí,
	Felipe Contreras

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
---
 builtin/fetch.c  | 15 ++++++++++-----
 t/t5510-fetch.sh | 18 ++++++++++++++++++
 2 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 7e93a1aa46..6bf147b012 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1593,12 +1593,17 @@ static void update_head(int config, const struct ref *head, const struct remote
 	if (!head || !head->symref || !remote)
 		return;
 
-	ref = apply_refspecs((struct refspec *)&remote->fetch, "refs/heads/HEAD");
-	target = apply_refspecs((struct refspec *)&remote->fetch, head->symref);
+	if (!remote->mirror) {
+		ref = apply_refspecs((struct refspec *)&remote->fetch, "refs/heads/HEAD");
+		target = apply_refspecs((struct refspec *)&remote->fetch, head->symref);
 
-	if (!ref || !target) {
-		warning(_("could not update remote head"));
-		return;
+		if (!ref || !target) {
+			warning(_("could not update remote head"));
+			return;
+		}
+	} else {
+		ref = "HEAD";
+		target = head->symref;
 	}
 
 	r = resolve_ref_unsafe(ref, 0, NULL, &flags);
diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index dbeb2928ae..d3f3b24378 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -845,6 +845,24 @@ test_expect_success 'updatehead' '
 	)
 '
 
+test_expect_success 'updatehead mirror' '
+	test_when_finished "rm -rf updatehead" &&
+
+	git clone --mirror . updatehead &&
+	(
+		cd updatehead &&
+
+		git config fetch.updateHead missing &&
+		git symbolic-ref HEAD refs/heads/side &&
+		git fetch &&
+		test_cmp_symbolic_ref HEAD refs/heads/side &&
+
+		git config fetch.updateHead always &&
+		git fetch &&
+		test_cmp_symbolic_ref HEAD refs/heads/main
+	)
+'
+
 # configured prune tests
 
 set_config_tristate () {
-- 
2.40.0+fc1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] Add fetch.updateHead option
  2023-04-05  1:27 ` [PATCH 1/2] " Felipe Contreras
@ 2023-04-05  9:16   ` Ævar Arnfjörð Bjarmason
  2023-04-05 10:15     ` Patrick Steinhardt
  2023-04-05 14:55     ` Felipe Contreras
  2023-04-05  9:28   ` Ævar Arnfjörð Bjarmason
  1 sibling, 2 replies; 9+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2023-04-05  9:16 UTC (permalink / raw)
  To: Felipe Contreras
  Cc: git, Jeff King, Patrick Steinhardt, Jonathan Nieder, Daniel Martí


On Tue, Apr 04 2023, Felipe Contreras wrote:

> Users might change the behavior when running "git fetch" so that the
> remote's HEAD symbolic ref is updated at certain point.
>
> For example after running "git remote add" the remote HEAD is not
> set like it is with "git clone".
>
> Setting "fetch.updatehead = missing" would probably be a sensible
> default that everyone would want, but for now the default behavior is to
> never update HEAD, so there shouldn't be any functional changes.
>
> For the next major version of Git, we might want to change this default.
>
> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
> ---
>  Documentation/config/fetch.txt  |  4 +++
>  Documentation/config/remote.txt |  3 ++
>  builtin/fetch.c                 | 64 ++++++++++++++++++++++++++++++++-
>  remote.c                        | 21 +++++++++++
>  remote.h                        | 11 ++++++
>  t/t5510-fetch.sh                | 31 ++++++++++++++++
>  6 files changed, 133 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/config/fetch.txt b/Documentation/config/fetch.txt
> index 568f0f75b3..dc147ffb35 100644
> --- a/Documentation/config/fetch.txt
> +++ b/Documentation/config/fetch.txt
> @@ -120,3 +120,7 @@ fetch.bundleCreationToken::
>  The creation token values are chosen by the provider serving the specific
>  bundle URI. If you modify the URI at `fetch.bundleURI`, then be sure to
>  remove the value for the `fetch.bundleCreationToken` value before fetching.
> +
> +fetch.updateHead::
> +	Defines when to update the remote HEAD symbolic ref. Values are 'never',
> +	'missing' (update only when HEAD is missing), and 'always'.
> diff --git a/Documentation/config/remote.txt b/Documentation/config/remote.txt
> index 0678b4bcfe..9d739d2ed4 100644
> --- a/Documentation/config/remote.txt
> +++ b/Documentation/config/remote.txt
> @@ -86,3 +86,6 @@ remote.<name>.partialclonefilter::
>  	Changing or clearing this value will only affect fetches for new commits.
>  	To fetch associated objects for commits already present in the local object
>  	database, use the `--refetch` option of linkgit:git-fetch[1].
> +
> +remote.<name>.updateHead::
> +	Defines when to update the remote HEAD symbolic ref. See `fetch.updateHead`.
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 7221e57f35..7e93a1aa46 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -59,6 +59,8 @@ static int fetch_prune_tags_config = -1; /* unspecified */
>  static int prune_tags = -1; /* unspecified */
>  #define PRUNE_TAGS_BY_DEFAULT 0 /* do we prune tags by default? */
>  
> +static int fetch_update_head = FETCH_UPDATE_HEAD_DEFAULT;
> +
>  static int all, append, dry_run, force, keep, multiple, update_head_ok;
>  static int write_fetch_head = 1;
>  static int verbosity, deepen_relative, set_upstream, refetch;
> @@ -129,6 +131,9 @@ static int git_fetch_config(const char *k, const char *v, void *cb)
>  		return 0;
>  	}
>  
> +	if (!strcmp(k, "fetch.updatehead"))
> +		return parse_update_head(&fetch_update_head, k, v);
> +
>  	return git_default_config(k, v, cb);
>  }
>  
> @@ -1579,6 +1584,47 @@ static int backfill_tags(struct transport *transport,
>  	return retcode;
>  }
>  
> +static void update_head(int config, const struct ref *head, const struct remote *remote)

Here you pass a "const struct remote".

> +{
> +	char *ref, *target;
> +	const char *r;
> +	int flags;
> +
> +	if (!head || !head->symref || !remote)
> +		return;
> +
> +	ref = apply_refspecs((struct refspec *)&remote->fetch, "refs/heads/HEAD");
> +	target = apply_refspecs((struct refspec *)&remote->fetch, head->symref);

But here we end up with this cast, as it's not const after all, we're
modifying it.

I think this sort of thing makes the code harder to read & reason about,
and adds cast verbosity.

If you want to clearly communicate that the "remote->name" and
"remote->mirror" you're using are "const" I think a better way to do
this is to pass those as explicit parameters to this new static helper
function, and then just pass a "struct refspec *fetch_rs" directly.

> +
> +	if (!ref || !target) {
> +		warning(_("could not update remote head"));
> +		return;
> +	}
> +
> +	r = resolve_ref_unsafe(ref, 0, NULL, &flags);
> +
> +	if (r) {
> +		if (config == FETCH_UPDATE_HEAD_MISSING) {
> +			if (flags & REF_ISSYMREF)
> +				/* already present */
> +				return;
> +		} else if (config == FETCH_UPDATE_HEAD_ALWAYS) {
> +			if (!strcmp(r, target))
> +				/* already up-to-date */
> +				return;

I think you should name the "enum" you're adding below, the one that
contains the new "FETCH_UPDATE_HEAD_DEFAULT".

Then this could be a "switch", and the compiler could check the
arguments, i.e. you could pass an enum type instead of an "int".

> +		} else

{} missing, if you keep this, but...

> +			/* should never happen */
> +			return;

...so, here we're not checking some enum values, but presumably other
things check this, I haven't checked.


But for a "should never happen", should we make this a "BUG()", or is it
user-controlled?



> +	}
> +
> +	if (!create_symref(ref, target, "remote update head")) {
> +		if (verbosity >= 0)
> +			printf(_("Updated remote '%s' HEAD\n"), remote->name);
> +	} else {
> +		warning(_("could not update remote head"));
> +	}
> +}
> +
>  static int do_fetch(struct transport *transport,
>  		    struct refspec *rs)
>  {
> @@ -1592,6 +1638,7 @@ static int do_fetch(struct transport *transport,
>  	int must_list_refs = 1;
>  	struct fetch_head fetch_head = { 0 };
>  	struct strbuf err = STRBUF_INIT;
> +	int need_update_head = 0, update_head_config = 0;
>  
>  	if (tags == TAGS_DEFAULT) {
>  		if (transport->remote->fetch_tags == 2)
> @@ -1626,9 +1673,21 @@ static int do_fetch(struct transport *transport,
>  	} else {
>  		struct branch *branch = branch_get(NULL);
>  
> -		if (transport->remote->fetch.nr)
> +		if (transport->remote->fetch.nr) {
> +
> +			if (transport->remote->update_head)
> +				update_head_config = transport->remote->update_head;
> +			else
> +				update_head_config = fetch_update_head;
> +
> +			need_update_head = update_head_config && update_head_config != FETCH_UPDATE_HEAD_NEVER;
> +
> +			if (need_update_head)
> +				strvec_push(&transport_ls_refs_options.ref_prefixes, "HEAD");
>  			refspec_ref_prefixes(&transport->remote->fetch,
>  					     &transport_ls_refs_options.ref_prefixes);
> +		}
> +
>  		if (branch_has_merge_config(branch) &&
>  		    !strcmp(branch->remote_name, transport->remote->name)) {
>  			int i;
> @@ -1737,6 +1796,9 @@ static int do_fetch(struct transport *transport,
>  
>  	commit_fetch_head(&fetch_head);
>  
> +	if (need_update_head)
> +		update_head(update_head_config, find_ref_by_name(remote_refs, "HEAD"), transport->remote);

Some overly long lines here...

> +
>  	if (set_upstream) {
>  		struct branch *branch = branch_get("HEAD");
>  		struct ref *rm;
> diff --git a/remote.c b/remote.c
> index 641b083d90..5f3a9aa53e 100644
> --- a/remote.c
> +++ b/remote.c
> @@ -344,6 +344,25 @@ static void read_branches_file(struct remote_state *remote_state,
>  	remote->fetch_tags = 1; /* always auto-follow */
>  }
>  
> +int parse_update_head(int *r, const char *var, const char *value)
> +{
> +	if (!r)
> +		return -1;
> +	else if (!value)
> +		return config_error_nonbool(var);
> +	else if (!strcmp(value, "never"))
> +		*r = FETCH_UPDATE_HEAD_NEVER;
> +	else if (!strcmp(value, "missing"))
> +		*r = FETCH_UPDATE_HEAD_MISSING;
> +	else if (!strcmp(value, "always"))
> +		*r = FETCH_UPDATE_HEAD_ALWAYS;

Ditto, this could really benefit from an enum type, instead of the bare
"int".

> +	else {
> +		error(_("malformed value for %s: %s"), var, value);
> +		return error(_("must be one of never, missing, or always"));

Shouldn't we use git_die_config() or similar here, to get the line
number etc., or do we get that somehow (I can't recall).

> +	}
> +	return 0;
> +}
> +
>  static int handle_config(const char *key, const char *value, void *cb)
>  {
>  	const char *name;
> @@ -473,6 +492,8 @@ static int handle_config(const char *key, const char *value, void *cb)
>  					 key, value);
>  	} else if (!strcmp(subkey, "vcs")) {
>  		return git_config_string(&remote->foreign_vcs, key, value);
> +	} else if (!strcmp(subkey, "updatehead")) {
> +		return parse_update_head(&remote->update_head, key, value);
>  	}
>  	return 0;
>  }
> diff --git a/remote.h b/remote.h
> index 73638cefeb..9dce42d65d 100644
> --- a/remote.h
> +++ b/remote.h
> @@ -22,6 +22,13 @@ enum {
>  	REMOTE_BRANCHES
>  };
>  
> +enum {
> +	FETCH_UPDATE_HEAD_DEFAULT = 0,

We tend to only init these to 0 when the default being 0 matters,
i.e. we use it as a boolean, but is that the case here?

> +	FETCH_UPDATE_HEAD_NEVER,
> +	FETCH_UPDATE_HEAD_MISSING,
> +	FETCH_UPDATE_HEAD_ALWAYS,
> +};

I.e. let's name this.


> +
>  struct rewrite {
>  	const char *base;
>  	size_t baselen;
> @@ -97,6 +104,8 @@ struct remote {
>  	int prune;
>  	int prune_tags;
>  
> +	int update_head;
> +
>  	/**
>  	 * The configured helper programs to run on the remote side, for
>  	 * Git-native protocols.
> @@ -449,4 +458,6 @@ void apply_push_cas(struct push_cas_option *, struct remote *, struct ref *);
>  char *relative_url(const char *remote_url, const char *url,
>  		   const char *up_path);
>  
> +int parse_update_head(int *r, const char *var, const char *value);

For new functions and/or enums, having some brief API docs (using the
"/** ... */" syntax) would be better.


> +
>  #endif
> diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
> index dc44da9c79..dbeb2928ae 100755
> --- a/t/t5510-fetch.sh
> +++ b/t/t5510-fetch.sh
> @@ -814,6 +814,37 @@ test_expect_success 'fetch from multiple configured URLs in single remote' '
>  	git fetch multipleurls
>  '
>  
> +test_cmp_symbolic_ref () {
> +	git symbolic-ref "$1" >actual &&
> +	echo "$2" >expected &&
> +	test_cmp expected actual
> +}

Sort of an aside, but this seems to be the Nth use of this pattern in
the test suite, e.g. t1401-symbolic-ref.sh repeatedly hardcodes the
same.

I wonder if a prep commit to stick this in test-lib-functions.sh would
be in order, or maybe a "--symbolic" argument to "test_cmp_rev"?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] Add fetch.updateHead option
  2023-04-05  1:27 ` [PATCH 1/2] " Felipe Contreras
  2023-04-05  9:16   ` Ævar Arnfjörð Bjarmason
@ 2023-04-05  9:28   ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 9+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2023-04-05  9:28 UTC (permalink / raw)
  To: Felipe Contreras
  Cc: git, Jeff King, Patrick Steinhardt, Jonathan Nieder, Daniel Martí


On Tue, Apr 04 2023, Felipe Contreras wrote:

> Users might change the behavior when running "git fetch" so that the
> remote's HEAD symbolic ref is updated at certain point.
>
> For example after running "git remote add" the remote HEAD is not
> set like it is with "git clone".
>
> Setting "fetch.updatehead = missing" would probably be a sensible
> default that everyone would want, but for now the default behavior is to
> never update HEAD, so there shouldn't be any functional changes.
>
> For the next major version of Git, we might want to change this default.
>
> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
> ---
>  Documentation/config/fetch.txt  |  4 +++
>  Documentation/config/remote.txt |  3 ++
>  builtin/fetch.c                 | 64 ++++++++++++++++++++++++++++++++-
>  remote.c                        | 21 +++++++++++
>  remote.h                        | 11 ++++++
>  t/t5510-fetch.sh                | 31 ++++++++++++++++
>  6 files changed, 133 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/config/fetch.txt b/Documentation/config/fetch.txt
> index 568f0f75b3..dc147ffb35 100644
> --- a/Documentation/config/fetch.txt
> +++ b/Documentation/config/fetch.txt
> @@ -120,3 +120,7 @@ fetch.bundleCreationToken::
>  The creation token values are chosen by the provider serving the specific
>  bundle URI. If you modify the URI at `fetch.bundleURI`, then be sure to
>  remove the value for the `fetch.bundleCreationToken` value before fetching.
> +
> +fetch.updateHead::
> +	Defines when to update the remote HEAD symbolic ref. Values are 'never',
> +	'missing' (update only when HEAD is missing), and 'always'.

Missed the first time around, I think it would be useful to explain the
historical behavior heher, and why it's been in place.

I.e. that we use this during the initial fetch/clone to find the "HEAD",
for discovering the default branch, but then proceed to not care about
it after that.

We should also link and cross-link to the other recent-ish config
options (whose name I'm blanking on), which implement the "take the
remote's suggestion of a default branch name" here.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] Add fetch.updateHead option
  2023-04-05  9:16   ` Ævar Arnfjörð Bjarmason
@ 2023-04-05 10:15     ` Patrick Steinhardt
  2023-04-05 14:55     ` Felipe Contreras
  1 sibling, 0 replies; 9+ messages in thread
From: Patrick Steinhardt @ 2023-04-05 10:15 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Felipe Contreras, git, Jeff King, Jonathan Nieder, Daniel Martí

[-- Attachment #1: Type: text/plain, Size: 4005 bytes --]

On Wed, Apr 05, 2023 at 11:16:12AM +0200, Ævar Arnfjörð Bjarmason wrote:
> On Tue, Apr 04 2023, Felipe Contreras wrote:
[snip]
> > @@ -1579,6 +1584,47 @@ static int backfill_tags(struct transport *transport,
> >  	return retcode;
> >  }
> >  
> > +static void update_head(int config, const struct ref *head, const struct remote *remote)
> 
> Here you pass a "const struct remote".
> 
> > +{
> > +	char *ref, *target;
> > +	const char *r;
> > +	int flags;
> > +
> > +	if (!head || !head->symref || !remote)
> > +		return;
> > +
> > +	ref = apply_refspecs((struct refspec *)&remote->fetch, "refs/heads/HEAD");
> > +	target = apply_refspecs((struct refspec *)&remote->fetch, head->symref);
> 
> But here we end up with this cast, as it's not const after all, we're
> modifying it.
> 
> I think this sort of thing makes the code harder to read & reason about,
> and adds cast verbosity.
> 
> If you want to clearly communicate that the "remote->name" and
> "remote->mirror" you're using are "const" I think a better way to do
> this is to pass those as explicit parameters to this new static helper
> function, and then just pass a "struct refspec *fetch_rs" directly.

I think the underlying problem is that `apply_refspecs()` and
transitively called functions expect the argument to be non-const even
though they never modify it.

So maybe the proper way to handle this would be to add a preparatory
patch that constifies the parameter. Something like what I've attached
to the end of this mail.

Patrick

-- >8 --

diff --git a/remote.c b/remote.c
index b04e5da338..1752c391c3 100644
--- a/remote.c
+++ b/remote.c
@@ -851,7 +851,7 @@ static int refspec_match(const struct refspec_item *refspec,
 	return !strcmp(refspec->src, name);
 }
 
-int omit_name_by_refspec(const char *name, struct refspec *rs)
+int omit_name_by_refspec(const char *name, const struct refspec *rs)
 {
 	int i;
 
@@ -880,7 +880,7 @@ struct ref *apply_negative_refspecs(struct ref *ref_map, struct refspec *rs)
 	return ref_map;
 }
 
-static int query_matches_negative_refspec(struct refspec *rs, struct refspec_item *query)
+static int query_matches_negative_refspec(const struct refspec *rs, struct refspec_item *query)
 {
 	int i, matched_negative = 0;
 	int find_src = !query->src;
@@ -968,7 +968,7 @@ static void query_refspecs_multiple(struct refspec *rs,
 	}
 }
 
-int query_refspecs(struct refspec *rs, struct refspec_item *query)
+int query_refspecs(const struct refspec *rs, struct refspec_item *query)
 {
 	int i;
 	int find_src = !query->src;
@@ -1002,7 +1002,7 @@ int query_refspecs(struct refspec *rs, struct refspec_item *query)
 	return -1;
 }
 
-char *apply_refspecs(struct refspec *rs, const char *name)
+char *apply_refspecs(const struct refspec *rs, const char *name)
 {
 	struct refspec_item query;
 
diff --git a/remote.h b/remote.h
index 5b38ee20b8..cd3c1439ab 100644
--- a/remote.h
+++ b/remote.h
@@ -253,7 +253,7 @@ struct ref *ref_remove_duplicates(struct ref *ref_map);
  * Check whether a name matches any negative refspec in rs. Returns 1 if the
  * name matches at least one negative refspec, and 0 otherwise.
  */
-int omit_name_by_refspec(const char *name, struct refspec *rs);
+int omit_name_by_refspec(const char *name, const struct refspec *rs);
 
 /*
  * Remove all entries in the input list which match any negative refspec in
@@ -261,8 +261,8 @@ int omit_name_by_refspec(const char *name, struct refspec *rs);
  */
 struct ref *apply_negative_refspecs(struct ref *ref_map, struct refspec *rs);
 
-int query_refspecs(struct refspec *rs, struct refspec_item *query);
-char *apply_refspecs(struct refspec *rs, const char *name);
+int query_refspecs(const struct refspec *rs, struct refspec_item *query);
+char *apply_refspecs(const struct refspec *rs, const char *name);
 
 int check_push_refs(struct ref *src, struct refspec *rs);
 int match_push_refs(struct ref *src, struct ref **dst,


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] Add fetch.updateHead option
  2023-04-05  9:16   ` Ævar Arnfjörð Bjarmason
  2023-04-05 10:15     ` Patrick Steinhardt
@ 2023-04-05 14:55     ` Felipe Contreras
  2023-04-06  7:33       ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 9+ messages in thread
From: Felipe Contreras @ 2023-04-05 14:55 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Patrick Steinhardt, Jonathan Nieder, Daniel Martí

On Wed, Apr 5, 2023 at 4:28 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>
>
> On Tue, Apr 04 2023, Felipe Contreras wrote:
>
> > Users might change the behavior when running "git fetch" so that the
> > remote's HEAD symbolic ref is updated at certain point.
> >
> > For example after running "git remote add" the remote HEAD is not
> > set like it is with "git clone".
> >
> > Setting "fetch.updatehead = missing" would probably be a sensible
> > default that everyone would want, but for now the default behavior is to
> > never update HEAD, so there shouldn't be any functional changes.
> >
> > For the next major version of Git, we might want to change this default.
> >
> > Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
> > ---
> >  Documentation/config/fetch.txt  |  4 +++
> >  Documentation/config/remote.txt |  3 ++
> >  builtin/fetch.c                 | 64 ++++++++++++++++++++++++++++++++-
> >  remote.c                        | 21 +++++++++++
> >  remote.h                        | 11 ++++++
> >  t/t5510-fetch.sh                | 31 ++++++++++++++++
> >  6 files changed, 133 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/config/fetch.txt b/Documentation/config/fetch.txt
> > index 568f0f75b3..dc147ffb35 100644
> > --- a/Documentation/config/fetch.txt
> > +++ b/Documentation/config/fetch.txt
> > @@ -120,3 +120,7 @@ fetch.bundleCreationToken::
> >  The creation token values are chosen by the provider serving the specific
> >  bundle URI. If you modify the URI at `fetch.bundleURI`, then be sure to
> >  remove the value for the `fetch.bundleCreationToken` value before fetching.
> > +
> > +fetch.updateHead::
> > +     Defines when to update the remote HEAD symbolic ref. Values are 'never',
> > +     'missing' (update only when HEAD is missing), and 'always'.
> > diff --git a/Documentation/config/remote.txt b/Documentation/config/remote.txt
> > index 0678b4bcfe..9d739d2ed4 100644
> > --- a/Documentation/config/remote.txt
> > +++ b/Documentation/config/remote.txt
> > @@ -86,3 +86,6 @@ remote.<name>.partialclonefilter::
> >       Changing or clearing this value will only affect fetches for new commits.
> >       To fetch associated objects for commits already present in the local object
> >       database, use the `--refetch` option of linkgit:git-fetch[1].
> > +
> > +remote.<name>.updateHead::
> > +     Defines when to update the remote HEAD symbolic ref. See `fetch.updateHead`.
> > diff --git a/builtin/fetch.c b/builtin/fetch.c
> > index 7221e57f35..7e93a1aa46 100644
> > --- a/builtin/fetch.c
> > +++ b/builtin/fetch.c
> > @@ -59,6 +59,8 @@ static int fetch_prune_tags_config = -1; /* unspecified */
> >  static int prune_tags = -1; /* unspecified */
> >  #define PRUNE_TAGS_BY_DEFAULT 0 /* do we prune tags by default? */
> >
> > +static int fetch_update_head = FETCH_UPDATE_HEAD_DEFAULT;
> > +
> >  static int all, append, dry_run, force, keep, multiple, update_head_ok;
> >  static int write_fetch_head = 1;
> >  static int verbosity, deepen_relative, set_upstream, refetch;
> > @@ -129,6 +131,9 @@ static int git_fetch_config(const char *k, const char *v, void *cb)
> >               return 0;
> >       }
> >
> > +     if (!strcmp(k, "fetch.updatehead"))
> > +             return parse_update_head(&fetch_update_head, k, v);
> > +
> >       return git_default_config(k, v, cb);
> >  }
> >
> > @@ -1579,6 +1584,47 @@ static int backfill_tags(struct transport *transport,
> >       return retcode;
> >  }
> >
> > +static void update_head(int config, const struct ref *head, const struct remote *remote)
>
> Here you pass a "const struct remote".
>
> > +{
> > +     char *ref, *target;
> > +     const char *r;
> > +     int flags;
> > +
> > +     if (!head || !head->symref || !remote)
> > +             return;
> > +
> > +     ref = apply_refspecs((struct refspec *)&remote->fetch, "refs/heads/HEAD");
> > +     target = apply_refspecs((struct refspec *)&remote->fetch, head->symref);
>
> But here we end up with this cast, as it's not const after all, we're
> modifying it.

It is a const, and we are not modifying it. `apply_refspecs()` is not
saying what it should say: the refspec remains constant.

As Patrick explained: `apply_refspecs()` should probably be fixed.

> > +
> > +     if (!ref || !target) {
> > +             warning(_("could not update remote head"));
> > +             return;
> > +     }
> > +
> > +     r = resolve_ref_unsafe(ref, 0, NULL, &flags);
> > +
> > +     if (r) {
> > +             if (config == FETCH_UPDATE_HEAD_MISSING) {
> > +                     if (flags & REF_ISSYMREF)
> > +                             /* already present */
> > +                             return;
> > +             } else if (config == FETCH_UPDATE_HEAD_ALWAYS) {
> > +                     if (!strcmp(r, target))
> > +                             /* already up-to-date */
> > +                             return;
>
> I think you should name the "enum" you're adding below, the one that
> contains the new "FETCH_UPDATE_HEAD_DEFAULT".
>
> Then this could be a "switch", and the compiler could check the
> arguments, i.e. you could pass an enum type instead of an "int".

Sure, it can be an `enum fetch_update_mode` instead of `int`, but I
don't see what value it provides, other than more verbosity. The enum
right above is also unnamed, and 'remote->origin' is an int. And it's
not the only enum of that kind in the source code.

Using a switch is better, but that doesn't require an enum type. The
multiple ifs are just a remnant of a previous version of the code.

> > +             } else
>
> {} missing, if you keep this, but...
>
> > +                     /* should never happen */
> > +                     return;
>
> ...so, here we're not checking some enum values, but presumably other
> things check this, I haven't checked.

Yes, the function cannot be called otherwise.

> But for a "should never happen", should we make this a "BUG()", or is it
> user-controlled?

Sure, it can be a `BUG()`. It truly should not happen.

> > +     }
> > +
> > +     if (!create_symref(ref, target, "remote update head")) {
> > +             if (verbosity >= 0)
> > +                     printf(_("Updated remote '%s' HEAD\n"), remote->name);
> > +     } else {
> > +             warning(_("could not update remote head"));
> > +     }
> > +}
> > +
> >  static int do_fetch(struct transport *transport,
> >                   struct refspec *rs)
> >  {
> > @@ -1592,6 +1638,7 @@ static int do_fetch(struct transport *transport,
> >       int must_list_refs = 1;
> >       struct fetch_head fetch_head = { 0 };
> >       struct strbuf err = STRBUF_INIT;
> > +     int need_update_head = 0, update_head_config = 0;
> >
> >       if (tags == TAGS_DEFAULT) {
> >               if (transport->remote->fetch_tags == 2)
> > @@ -1626,9 +1673,21 @@ static int do_fetch(struct transport *transport,
> >       } else {
> >               struct branch *branch = branch_get(NULL);
> >
> > -             if (transport->remote->fetch.nr)
> > +             if (transport->remote->fetch.nr) {
> > +
> > +                     if (transport->remote->update_head)
> > +                             update_head_config = transport->remote->update_head;
> > +                     else
> > +                             update_head_config = fetch_update_head;
> > +
> > +                     need_update_head = update_head_config && update_head_config != FETCH_UPDATE_HEAD_NEVER;
> > +
> > +                     if (need_update_head)
> > +                             strvec_push(&transport_ls_refs_options.ref_prefixes, "HEAD");
> >                       refspec_ref_prefixes(&transport->remote->fetch,
> >                                            &transport_ls_refs_options.ref_prefixes);
> > +             }
> > +
> >               if (branch_has_merge_config(branch) &&
> >                   !strcmp(branch->remote_name, transport->remote->name)) {
> >                       int i;
> > @@ -1737,6 +1796,9 @@ static int do_fetch(struct transport *transport,
> >
> >       commit_fetch_head(&fetch_head);
> >
> > +     if (need_update_head)
> > +             update_head(update_head_config, find_ref_by_name(remote_refs, "HEAD"), transport->remote);
>
> Some overly long lines here...

Not unique in this document:

117:                                         warning(_("rejected %s
because shallow roots are not allowed to be updated"),
115:                                         warning(_("multiple
branches detected, incompatible with --set-upstream"));
112:         OPT_STRING_LIST('o', "server-option", &server_options,
N_("server-specific"), N_("option to transmit")),
111:                         need_update_head = update_head_config &&
update_head_config != FETCH_UPDATE_HEAD_NEVER;
108:                                   "you need to specify exactly
one branch with the --set-upstream option"));
106:                         die(_("options '%s' and '%s' cannot be
used together"), "--depth", "--unshallow");
106:                 update_head(update_head_config,
find_ref_by_name(remote_refs, "HEAD"), transport->remote);
103:                                 die(_("fetching a group and
specifying refspecs does not make sense"));
103:                         die(_("options '%s' and '%s' cannot be
used together"), "--deepen", "--depth");
103:                                 warning(_("not setting upstream
for a remote remote-tracking branch"));

But I seem to recall previous discussions (perhaps in LKML) where
people accepted that lines 120-characters long are OK. We don't live
in the 80's anymore, terminals have more than 80 columns.

> > +
> >       if (set_upstream) {
> >               struct branch *branch = branch_get("HEAD");
> >               struct ref *rm;
> > diff --git a/remote.c b/remote.c
> > index 641b083d90..5f3a9aa53e 100644
> > --- a/remote.c
> > +++ b/remote.c
> > @@ -344,6 +344,25 @@ static void read_branches_file(struct remote_state *remote_state,
> >       remote->fetch_tags = 1; /* always auto-follow */
> >  }
> >
> > +int parse_update_head(int *r, const char *var, const char *value)
> > +{
> > +     if (!r)
> > +             return -1;
> > +     else if (!value)
> > +             return config_error_nonbool(var);
> > +     else if (!strcmp(value, "never"))
> > +             *r = FETCH_UPDATE_HEAD_NEVER;
> > +     else if (!strcmp(value, "missing"))
> > +             *r = FETCH_UPDATE_HEAD_MISSING;
> > +     else if (!strcmp(value, "always"))
> > +             *r = FETCH_UPDATE_HEAD_ALWAYS;
>
> Ditto, this could really benefit from an enum type, instead of the bare
> "int".

What would change other than `int *r` -> `enum name *r`?

> > +     else {
> > +             error(_("malformed value for %s: %s"), var, value);
> > +             return error(_("must be one of never, missing, or always"));
>
> Shouldn't we use git_die_config() or similar here, to get the line
> number etc., or do we get that somehow (I can't recall).

There's plenty of `error()` in config.c, including
`git_default_push_config`, which this was based on.

> > +     }
> > +     return 0;
> > +}
> > +
> >  static int handle_config(const char *key, const char *value, void *cb)
> >  {
> >       const char *name;
> > @@ -473,6 +492,8 @@ static int handle_config(const char *key, const char *value, void *cb)
> >                                        key, value);
> >       } else if (!strcmp(subkey, "vcs")) {
> >               return git_config_string(&remote->foreign_vcs, key, value);
> > +     } else if (!strcmp(subkey, "updatehead")) {
> > +             return parse_update_head(&remote->update_head, key, value);
> >       }
> >       return 0;
> >  }
> > diff --git a/remote.h b/remote.h
> > index 73638cefeb..9dce42d65d 100644
> > --- a/remote.h
> > +++ b/remote.h
> > @@ -22,6 +22,13 @@ enum {
> >       REMOTE_BRANCHES
> >  };
> >
> > +enum {
> > +     FETCH_UPDATE_HEAD_DEFAULT = 0,
>
> We tend to only init these to 0 when the default being 0 matters,
> i.e. we use it as a boolean, but is that the case here?

In the current version of the code it doesn't matter, but the default
could be different later on.

For example if the default is not specified `!fetch_update_head` the
code could do some guessing, like doing "always" if the remote is a
mirror.

I learned this lesson reorganizing the options of `builtin/pull.c` in
a patch that was never merged. [1]

> > +
> >  #endif
> > diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
> > index dc44da9c79..dbeb2928ae 100755
> > --- a/t/t5510-fetch.sh
> > +++ b/t/t5510-fetch.sh
> > @@ -814,6 +814,37 @@ test_expect_success 'fetch from multiple configured URLs in single remote' '
> >       git fetch multipleurls
> >  '
> >
> > +test_cmp_symbolic_ref () {
> > +     git symbolic-ref "$1" >actual &&
> > +     echo "$2" >expected &&
> > +     test_cmp expected actual
> > +}
>
> Sort of an aside, but this seems to be the Nth use of this pattern in
> the test suite, e.g. t1401-symbolic-ref.sh repeatedly hardcodes the
> same.
>
> I wonder if a prep commit to stick this in test-lib-functions.sh would
> be in order, or maybe a "--symbolic" argument to "test_cmp_rev"?

Sure. If I had incline that such a patch would be merged (or this one)
I would do it, but I have a plethora of cleanup patches just gathering
dust, so I'd rather not.

Cheers.

[1] https://lore.kernel.org/git/20210705123209.1808663-24-felipe.contreras@gmail.com/

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] Add fetch.updateHead option
  2023-04-05 14:55     ` Felipe Contreras
@ 2023-04-06  7:33       ` Ævar Arnfjörð Bjarmason
  2023-04-07  2:41         ` Felipe Contreras
  0 siblings, 1 reply; 9+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2023-04-06  7:33 UTC (permalink / raw)
  To: Felipe Contreras
  Cc: git, Jeff King, Patrick Steinhardt, Jonathan Nieder, Daniel Martí


On Wed, Apr 05 2023, Felipe Contreras wrote:

> On Wed, Apr 5, 2023 at 4:28 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>>
>>
>> On Tue, Apr 04 2023, Felipe Contreras wrote:
>>
>> > Users might change the behavior when running "git fetch" so that the
>> > remote's HEAD symbolic ref is updated at certain point.
>> >
>> > For example after running "git remote add" the remote HEAD is not
>> > set like it is with "git clone".
>> >
>> > Setting "fetch.updatehead = missing" would probably be a sensible
>> > default that everyone would want, but for now the default behavior is to
>> > never update HEAD, so there shouldn't be any functional changes.
>> >
>> > For the next major version of Git, we might want to change this default.
>> >
>> > Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
>> > ---
>> >  Documentation/config/fetch.txt  |  4 +++
>> >  Documentation/config/remote.txt |  3 ++
>> >  builtin/fetch.c                 | 64 ++++++++++++++++++++++++++++++++-
>> >  remote.c                        | 21 +++++++++++
>> >  remote.h                        | 11 ++++++
>> >  t/t5510-fetch.sh                | 31 ++++++++++++++++
>> >  6 files changed, 133 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/Documentation/config/fetch.txt b/Documentation/config/fetch.txt
>> > index 568f0f75b3..dc147ffb35 100644
>> > --- a/Documentation/config/fetch.txt
>> > +++ b/Documentation/config/fetch.txt
>> > @@ -120,3 +120,7 @@ fetch.bundleCreationToken::
>> >  The creation token values are chosen by the provider serving the specific
>> >  bundle URI. If you modify the URI at `fetch.bundleURI`, then be sure to
>> >  remove the value for the `fetch.bundleCreationToken` value before fetching.
>> > +
>> > +fetch.updateHead::
>> > +     Defines when to update the remote HEAD symbolic ref. Values are 'never',
>> > +     'missing' (update only when HEAD is missing), and 'always'.
>> > diff --git a/Documentation/config/remote.txt b/Documentation/config/remote.txt
>> > index 0678b4bcfe..9d739d2ed4 100644
>> > --- a/Documentation/config/remote.txt
>> > +++ b/Documentation/config/remote.txt
>> > @@ -86,3 +86,6 @@ remote.<name>.partialclonefilter::
>> >       Changing or clearing this value will only affect fetches for new commits.
>> >       To fetch associated objects for commits already present in the local object
>> >       database, use the `--refetch` option of linkgit:git-fetch[1].
>> > +
>> > +remote.<name>.updateHead::
>> > +     Defines when to update the remote HEAD symbolic ref. See `fetch.updateHead`.
>> > diff --git a/builtin/fetch.c b/builtin/fetch.c
>> > index 7221e57f35..7e93a1aa46 100644
>> > --- a/builtin/fetch.c
>> > +++ b/builtin/fetch.c
>> > @@ -59,6 +59,8 @@ static int fetch_prune_tags_config = -1; /* unspecified */
>> >  static int prune_tags = -1; /* unspecified */
>> >  #define PRUNE_TAGS_BY_DEFAULT 0 /* do we prune tags by default? */
>> >
>> > +static int fetch_update_head = FETCH_UPDATE_HEAD_DEFAULT;
>> > +
>> >  static int all, append, dry_run, force, keep, multiple, update_head_ok;
>> >  static int write_fetch_head = 1;
>> >  static int verbosity, deepen_relative, set_upstream, refetch;
>> > @@ -129,6 +131,9 @@ static int git_fetch_config(const char *k, const char *v, void *cb)
>> >               return 0;
>> >       }
>> >
>> > +     if (!strcmp(k, "fetch.updatehead"))
>> > +             return parse_update_head(&fetch_update_head, k, v);
>> > +
>> >       return git_default_config(k, v, cb);
>> >  }
>> >
>> > @@ -1579,6 +1584,47 @@ static int backfill_tags(struct transport *transport,
>> >       return retcode;
>> >  }
>> >
>> > +static void update_head(int config, const struct ref *head, const struct remote *remote)
>>
>> Here you pass a "const struct remote".
>>
>> > +{
>> > +     char *ref, *target;
>> > +     const char *r;
>> > +     int flags;
>> > +
>> > +     if (!head || !head->symref || !remote)
>> > +             return;
>> > +
>> > +     ref = apply_refspecs((struct refspec *)&remote->fetch, "refs/heads/HEAD");
>> > +     target = apply_refspecs((struct refspec *)&remote->fetch, head->symref);
>>
>> But here we end up with this cast, as it's not const after all, we're
>> modifying it.
>
> It is a const, and we are not modifying it. `apply_refspecs()` is not
> saying what it should say: the refspec remains constant.
>
> As Patrick explained: `apply_refspecs()` should probably be fixed.

Yes, that's a much better fix. I'd assumed that we were altering it, but
a prep change like that to give it "const" would be much better, then we
can avoid the cast.

>> > +
>> > +     if (!ref || !target) {
>> > +             warning(_("could not update remote head"));
>> > +             return;
>> > +     }
>> > +
>> > +     r = resolve_ref_unsafe(ref, 0, NULL, &flags);
>> > +
>> > +     if (r) {
>> > +             if (config == FETCH_UPDATE_HEAD_MISSING) {
>> > +                     if (flags & REF_ISSYMREF)
>> > +                             /* already present */
>> > +                             return;
>> > +             } else if (config == FETCH_UPDATE_HEAD_ALWAYS) {
>> > +                     if (!strcmp(r, target))
>> > +                             /* already up-to-date */
>> > +                             return;
>>
>> I think you should name the "enum" you're adding below, the one that
>> contains the new "FETCH_UPDATE_HEAD_DEFAULT".
>>
>> Then this could be a "switch", and the compiler could check the
>> arguments, i.e. you could pass an enum type instead of an "int".
>
> Sure, it can be an `enum fetch_update_mode` instead of `int`, but I
> don't see what value it provides, other than more verbosity. The enum
> right above is also unnamed, and 'remote->origin' is an int. And it's
> not the only enum of that kind in the source code.
>
> Using a switch is better, but that doesn't require an enum type. The
> multiple ifs are just a remnant of a previous version of the code.

More on this below, but it's for self-documentation (makes the code
easier to follow), and the compiler can notice missing "case" arms,
which isn't the case with an "int".

>> > +             } else
>>
>> {} missing, if you keep this, but...
>>
>> > +                     /* should never happen */
>> > +                     return;
>>
>> ...so, here we're not checking some enum values, but presumably other
>> things check this, I haven't checked.
>
> Yes, the function cannot be called otherwise.

...more on this below...
>
>> But for a "should never happen", should we make this a "BUG()", or is it
>> user-controlled?
>
> Sure, it can be a `BUG()`. It truly should not happen.

...more on this below...

>> > +     }
>> > +
>> > +     if (!create_symref(ref, target, "remote update head")) {
>> > +             if (verbosity >= 0)
>> > +                     printf(_("Updated remote '%s' HEAD\n"), remote->name);
>> > +     } else {
>> > +             warning(_("could not update remote head"));
>> > +     }
>> > +}
>> > +
>> >  static int do_fetch(struct transport *transport,
>> >                   struct refspec *rs)
>> >  {
>> > @@ -1592,6 +1638,7 @@ static int do_fetch(struct transport *transport,
>> >       int must_list_refs = 1;
>> >       struct fetch_head fetch_head = { 0 };
>> >       struct strbuf err = STRBUF_INIT;
>> > +     int need_update_head = 0, update_head_config = 0;
>> >
>> >       if (tags == TAGS_DEFAULT) {
>> >               if (transport->remote->fetch_tags == 2)
>> > @@ -1626,9 +1673,21 @@ static int do_fetch(struct transport *transport,
>> >       } else {
>> >               struct branch *branch = branch_get(NULL);
>> >
>> > -             if (transport->remote->fetch.nr)
>> > +             if (transport->remote->fetch.nr) {
>> > +
>> > +                     if (transport->remote->update_head)
>> > +                             update_head_config = transport->remote->update_head;
>> > +                     else
>> > +                             update_head_config = fetch_update_head;
>> > +
>> > +                     need_update_head = update_head_config && update_head_config != FETCH_UPDATE_HEAD_NEVER;
>> > +
>> > +                     if (need_update_head)
>> > +                             strvec_push(&transport_ls_refs_options.ref_prefixes, "HEAD");
>> >                       refspec_ref_prefixes(&transport->remote->fetch,
>> >                                            &transport_ls_refs_options.ref_prefixes);
>> > +             }
>> > +
>> >               if (branch_has_merge_config(branch) &&
>> >                   !strcmp(branch->remote_name, transport->remote->name)) {
>> >                       int i;
>> > @@ -1737,6 +1796,9 @@ static int do_fetch(struct transport *transport,
>> >
>> >       commit_fetch_head(&fetch_head);
>> >
>> > +     if (need_update_head)
>> > +             update_head(update_head_config, find_ref_by_name(remote_refs, "HEAD"), transport->remote);
>>
>> Some overly long lines here...
>
> Not unique in this document:

Yes...

> 117:                                         warning(_("rejected %s
> because shallow roots are not allowed to be updated"),
> 115:                                         warning(_("multiple
> branches detected, incompatible with --set-upstream"));
> 112:         OPT_STRING_LIST('o', "server-option", &server_options,
> N_("server-specific"), N_("option to transmit")),

...I think we have an informal exception for longer strings more often than not...

> 111:                         need_update_head = update_head_config &&
> update_head_config != FETCH_UPDATE_HEAD_NEVER;

...here's another thing you're adding in these proposed patches, so that doesn't really count...

> 108:                                   "you need to specify exactly
> one branch with the --set-upstream option"));
> 106:                         die(_("options '%s' and '%s' cannot be
> used together"), "--depth", "--unshallow");

..more strings...

> 106:                 update_head(update_head_config,
> find_ref_by_name(remote_refs, "HEAD"), transport->remote);

...ditto stuff you're adding...

> 103:                                 die(_("fetching a group and
> specifying refspecs does not make sense"));
> 103:                         die(_("options '%s' and '%s' cannot be
> used together"), "--deepen", "--depth");
> 103:                                 warning(_("not setting upstream
> for a remote remote-tracking branch"));

...some strings...

> But I seem to recall previous discussions (perhaps in LKML) where
> people accepted that lines 120-characters long are OK. We don't live
> in the 80's anymore, terminals have more than 80 columns.

I don't know what the kernel does, but we try to conform to our
CodingGuidelines, which sets a limit of 80.

But whatever else we do, we don't generally say that a newly added
function to a given file should be exempted from the preferred coding
style because the file isn't consistently using it.

>> > +
>> >       if (set_upstream) {
>> >               struct branch *branch = branch_get("HEAD");
>> >               struct ref *rm;
>> > diff --git a/remote.c b/remote.c
>> > index 641b083d90..5f3a9aa53e 100644
>> > --- a/remote.c
>> > +++ b/remote.c
>> > @@ -344,6 +344,25 @@ static void read_branches_file(struct remote_state *remote_state,
>> >       remote->fetch_tags = 1; /* always auto-follow */
>> >  }
>> >
>> > +int parse_update_head(int *r, const char *var, const char *value)
>> > +{
>> > +     if (!r)
>> > +             return -1;
>> > +     else if (!value)
>> > +             return config_error_nonbool(var);
>> > +     else if (!strcmp(value, "never"))
>> > +             *r = FETCH_UPDATE_HEAD_NEVER;
>> > +     else if (!strcmp(value, "missing"))
>> > +             *r = FETCH_UPDATE_HEAD_MISSING;
>> > +     else if (!strcmp(value, "always"))
>> > +             *r = FETCH_UPDATE_HEAD_ALWAYS;
>>
>> Ditto, this could really benefit from an enum type, instead of the bare
>> "int".
>
> What would change other than `int *r` -> `enum name *r`?

More on that below...

>> > +     else {
>> > +             error(_("malformed value for %s: %s"), var, value);
>> > +             return error(_("must be one of never, missing, or always"));
>>
>> Shouldn't we use git_die_config() or similar here, to get the line
>> number etc., or do we get that somehow (I can't recall).
>
> There's plenty of `error()` in config.c, including
> `git_default_push_config`, which this was based on.

Ah, I see in these cases the config API handles emitting the bad line
number, nevermind.

As an aside, I think we could avoid some of these "malformed value" if
we just made git_die_config_linenr() slighly smarter, and had it print
the bad value in cases where there's only one value, but that's
unrelated.

>> > +     }
>> > +     return 0;
>> > +}
>> > +
>> >  static int handle_config(const char *key, const char *value, void *cb)
>> >  {
>> >       const char *name;
>> > @@ -473,6 +492,8 @@ static int handle_config(const char *key, const char *value, void *cb)
>> >                                        key, value);
>> >       } else if (!strcmp(subkey, "vcs")) {
>> >               return git_config_string(&remote->foreign_vcs, key, value);
>> > +     } else if (!strcmp(subkey, "updatehead")) {
>> > +             return parse_update_head(&remote->update_head, key, value);
>> >       }
>> >       return 0;
>> >  }
>> > diff --git a/remote.h b/remote.h
>> > index 73638cefeb..9dce42d65d 100644
>> > --- a/remote.h
>> > +++ b/remote.h
>> > @@ -22,6 +22,13 @@ enum {
>> >       REMOTE_BRANCHES
>> >  };
>> >
>> > +enum {
>> > +     FETCH_UPDATE_HEAD_DEFAULT = 0,
>>
>> We tend to only init these to 0 when the default being 0 matters,
>> i.e. we use it as a boolean, but is that the case here?
>
> In the current version of the code it doesn't matter, but the default
> could be different later on.
>
> For example if the default is not specified `!fetch_update_head` the
> code could do some guessing, like doing "always" if the remote is a
> mirror.
>
> I learned this lesson reorganizing the options of `builtin/pull.c` in
> a patch that was never merged. [1]
>
>> > +
>> >  #endif
>> > diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
>> > index dc44da9c79..dbeb2928ae 100755
>> > --- a/t/t5510-fetch.sh
>> > +++ b/t/t5510-fetch.sh
>> > @@ -814,6 +814,37 @@ test_expect_success 'fetch from multiple configured URLs in single remote' '
>> >       git fetch multipleurls
>> >  '
>> >
>> > +test_cmp_symbolic_ref () {
>> > +     git symbolic-ref "$1" >actual &&
>> > +     echo "$2" >expected &&
>> > +     test_cmp expected actual
>> > +}
>>
>> Sort of an aside, but this seems to be the Nth use of this pattern in
>> the test suite, e.g. t1401-symbolic-ref.sh repeatedly hardcodes the
>> same.
>>
>> I wonder if a prep commit to stick this in test-lib-functions.sh would
>> be in order, or maybe a "--symbolic" argument to "test_cmp_rev"?
>
> Sure. If I had incline that such a patch would be merged (or this one)
> I would do it, but I have a plethora of cleanup patches just gathering
> dust, so I'd rather not.

Fair enough, thanks.

Re the "more below" above, I tried hacking some of what I suggested
upthread on top of your patches, here's the result of
that. Changes/commentary:

 * Switched the "int" to "enum"

 * You've prepared the parse_update_head() to accept a NULL "r", but as
   this & your other code shows, we never pass it NULL. I don't get why
   we'd have it handle that case, as surely all plausible users are
   "populate this config variable for me", no?

 * I think better than a BUG() call in the new update_head() we should
   just drop "need_update_head" entirely. It ends up just being a
   variable that states "is missing or always", so for update_head() we
   can just pass a boolean "missing?".

   The two added "switch" statements are a bit verbose, I mainly
   included them to show what the pre-image is implicitly assuming with
   the "update_head_config && ...", and that you init the
   "update_head_config" to "0", instead of using
   "FETCH_UPDATE_HEAD_DEFAULT".

 * I renamed "update_head" to "fetch_update_head" just to have the
   compiler catch cases where we were using the old "int", but if you
   find some of this useful we could keep the old name.

Hope some of that helps.

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 6bf147b0123..6492e88d779 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -59,7 +59,7 @@ static int fetch_prune_tags_config = -1; /* unspecified */
 static int prune_tags = -1; /* unspecified */
 #define PRUNE_TAGS_BY_DEFAULT 0 /* do we prune tags by default? */
 
-static int fetch_update_head = FETCH_UPDATE_HEAD_DEFAULT;
+static enum fetch_update_head fetch_update_head = FETCH_UPDATE_HEAD_DEFAULT;
 
 static int all, append, dry_run, force, keep, multiple, update_head_ok;
 static int write_fetch_head = 1;
@@ -1584,7 +1584,8 @@ static int backfill_tags(struct transport *transport,
 	return retcode;
 }
 
-static void update_head(int config, const struct ref *head, const struct remote *remote)
+static void update_head(int fetch_missing, const struct ref *head,
+			struct remote *remote)
 {
 	char *ref, *target;
 	const char *r;
@@ -1594,7 +1595,7 @@ static void update_head(int config, const struct ref *head, const struct remote
 		return;
 
 	if (!remote->mirror) {
-		ref = apply_refspecs((struct refspec *)&remote->fetch, "refs/heads/HEAD");
+		ref = apply_refspecs(&remote->fetch, "refs/heads/HEAD");
 		target = apply_refspecs((struct refspec *)&remote->fetch, head->symref);
 
 		if (!ref || !target) {
@@ -1609,17 +1610,14 @@ static void update_head(int config, const struct ref *head, const struct remote
 	r = resolve_ref_unsafe(ref, 0, NULL, &flags);
 
 	if (r) {
-		if (config == FETCH_UPDATE_HEAD_MISSING) {
+		if (fetch_missing) {
 			if (flags & REF_ISSYMREF)
 				/* already present */
 				return;
-		} else if (config == FETCH_UPDATE_HEAD_ALWAYS) {
-			if (!strcmp(r, target))
-				/* already up-to-date */
-				return;
-		} else
-			/* should never happen */
+		} else if (!strcmp(r, target)) {
+			/* already up-to-date */
 			return;
+		}
 	}
 
 	if (!create_symref(ref, target, "remote update head")) {
@@ -1643,7 +1641,7 @@ static int do_fetch(struct transport *transport,
 	int must_list_refs = 1;
 	struct fetch_head fetch_head = { 0 };
 	struct strbuf err = STRBUF_INIT;
-	int need_update_head = 0, update_head_config = 0;
+	enum fetch_update_head update_head_config = FETCH_UPDATE_HEAD_DEFAULT;
 
 	if (tags == TAGS_DEFAULT) {
 		if (transport->remote->fetch_tags == 2)
@@ -1680,15 +1678,19 @@ static int do_fetch(struct transport *transport,
 
 		if (transport->remote->fetch.nr) {
 
-			if (transport->remote->update_head)
-				update_head_config = transport->remote->update_head;
+			if (transport->remote->fetch_update_head != FETCH_UPDATE_HEAD_DEFAULT)
+				update_head_config = transport->remote->fetch_update_head;
 			else
 				update_head_config = fetch_update_head;
 
-			need_update_head = update_head_config && update_head_config != FETCH_UPDATE_HEAD_NEVER;
-
-			if (need_update_head)
+			switch (update_head_config) {
+			case FETCH_UPDATE_HEAD_MISSING:
+			case FETCH_UPDATE_HEAD_ALWAYS:
 				strvec_push(&transport_ls_refs_options.ref_prefixes, "HEAD");
+			case FETCH_UPDATE_HEAD_DEFAULT:
+			case FETCH_UPDATE_HEAD_NEVER:
+				break;
+			}
 			refspec_ref_prefixes(&transport->remote->fetch,
 					     &transport_ls_refs_options.ref_prefixes);
 		}
@@ -1801,8 +1803,16 @@ static int do_fetch(struct transport *transport,
 
 	commit_fetch_head(&fetch_head);
 
-	if (need_update_head)
-		update_head(update_head_config, find_ref_by_name(remote_refs, "HEAD"), transport->remote);
+	switch (update_head_config) {
+	case FETCH_UPDATE_HEAD_MISSING:
+	case FETCH_UPDATE_HEAD_ALWAYS:
+		update_head(update_head_config == FETCH_UPDATE_HEAD_MISSING,
+			    find_ref_by_name(remote_refs, "HEAD"),
+			    transport->remote);
+	case FETCH_UPDATE_HEAD_DEFAULT:
+	case FETCH_UPDATE_HEAD_NEVER:
+		break;
+	}
 
 	if (set_upstream) {
 		struct branch *branch = branch_get("HEAD");
diff --git a/remote.c b/remote.c
index 5f3a9aa53ec..c05c344d806 100644
--- a/remote.c
+++ b/remote.c
@@ -344,11 +344,10 @@ static void read_branches_file(struct remote_state *remote_state,
 	remote->fetch_tags = 1; /* always auto-follow */
 }
 
-int parse_update_head(int *r, const char *var, const char *value)
+int parse_update_head(enum fetch_update_head *r, const char *var,
+		      const char *value)
 {
-	if (!r)
-		return -1;
-	else if (!value)
+	if (!value)
 		return config_error_nonbool(var);
 	else if (!strcmp(value, "never"))
 		*r = FETCH_UPDATE_HEAD_NEVER;
@@ -493,7 +492,8 @@ static int handle_config(const char *key, const char *value, void *cb)
 	} else if (!strcmp(subkey, "vcs")) {
 		return git_config_string(&remote->foreign_vcs, key, value);
 	} else if (!strcmp(subkey, "updatehead")) {
-		return parse_update_head(&remote->update_head, key, value);
+		return parse_update_head(&remote->fetch_update_head, key,
+					 value);
 	}
 	return 0;
 }
diff --git a/remote.h b/remote.h
index 9dce42d65d0..80b8cc24b6b 100644
--- a/remote.h
+++ b/remote.h
@@ -22,8 +22,8 @@ enum {
 	REMOTE_BRANCHES
 };
 
-enum {
-	FETCH_UPDATE_HEAD_DEFAULT = 0,
+enum fetch_update_head {
+	FETCH_UPDATE_HEAD_DEFAULT,
 	FETCH_UPDATE_HEAD_NEVER,
 	FETCH_UPDATE_HEAD_MISSING,
 	FETCH_UPDATE_HEAD_ALWAYS,
@@ -104,7 +104,7 @@ struct remote {
 	int prune;
 	int prune_tags;
 
-	int update_head;
+	enum fetch_update_head fetch_update_head;
 
 	/**
 	 * The configured helper programs to run on the remote side, for
@@ -458,6 +458,7 @@ void apply_push_cas(struct push_cas_option *, struct remote *, struct ref *);
 char *relative_url(const char *remote_url, const char *url,
 		   const char *up_path);
 
-int parse_update_head(int *r, const char *var, const char *value);
+int parse_update_head(enum fetch_update_head *r, const char *var,
+		      const char *value);
 
 #endif

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] Add fetch.updateHead option
  2023-04-06  7:33       ` Ævar Arnfjörð Bjarmason
@ 2023-04-07  2:41         ` Felipe Contreras
  0 siblings, 0 replies; 9+ messages in thread
From: Felipe Contreras @ 2023-04-07  2:41 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Felipe Contreras
  Cc: git, Jeff King, Patrick Steinhardt, Jonathan Nieder, Daniel Martí

Ævar Arnfjörð Bjarmason wrote:
> On Wed, Apr 05 2023, Felipe Contreras wrote:
> > On Wed, Apr 5, 2023 at 4:28 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> >> On Tue, Apr 04 2023, Felipe Contreras wrote:

> >> > +
> >> > +     if (!ref || !target) {
> >> > +             warning(_("could not update remote head"));
> >> > +             return;
> >> > +     }
> >> > +
> >> > +     r = resolve_ref_unsafe(ref, 0, NULL, &flags);
> >> > +
> >> > +     if (r) {
> >> > +             if (config == FETCH_UPDATE_HEAD_MISSING) {
> >> > +                     if (flags & REF_ISSYMREF)
> >> > +                             /* already present */
> >> > +                             return;
> >> > +             } else if (config == FETCH_UPDATE_HEAD_ALWAYS) {
> >> > +                     if (!strcmp(r, target))
> >> > +                             /* already up-to-date */
> >> > +                             return;
> >>
> >> I think you should name the "enum" you're adding below, the one that
> >> contains the new "FETCH_UPDATE_HEAD_DEFAULT".
> >>
> >> Then this could be a "switch", and the compiler could check the
> >> arguments, i.e. you could pass an enum type instead of an "int".
> >
> > Sure, it can be an `enum fetch_update_mode` instead of `int`, but I
> > don't see what value it provides, other than more verbosity. The enum
> > right above is also unnamed, and 'remote->origin' is an int. And it's
> > not the only enum of that kind in the source code.
> >
> > Using a switch is better, but that doesn't require an enum type. The
> > multiple ifs are just a remnant of a previous version of the code.
> 
> More on this below, but it's for self-documentation (makes the code
> easier to follow),

I guess it *can* make the code easier to follow for some people, but only very
marginally, and certainly not for me.

> and the compiler can notice missing "case" arms,
> which isn't the case with an "int".

Yes, but that has never been useful in my experience.

> > But I seem to recall previous discussions (perhaps in LKML) where
> > people accepted that lines 120-characters long are OK. We don't live
> > in the 80's anymore, terminals have more than 80 columns.
> 
> I don't know what the kernel does, but we try to conform to our
> CodingGuidelines, which sets a limit of 80.

There's a difference between claiming we try to conform to X, and actually
trying to conform to X. I think the evidence above shows it's not the latter.

That is: the guideline says something which isn't actually true. Or at least:
if we are trying, we are not trying very hard.

> But whatever else we do, we don't generally say that a newly added
> function to a given file should be exempted from the preferred coding
> style because the file isn't consistently using it.

A guideline is not a law.

If the guideline says "try to do X", and a patch doesn't do X, that's not a
valid reason to reject it. It is not a prescriptive command, it has no "shall"
or "must". It's merely a suggestion.

And as I showed above, a suggestion that is clearly not followed to the tee in
all the code base.

> >> > diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
> >> > index dc44da9c79..dbeb2928ae 100755
> >> > --- a/t/t5510-fetch.sh
> >> > +++ b/t/t5510-fetch.sh
> >> > @@ -814,6 +814,37 @@ test_expect_success 'fetch from multiple configured URLs in single remote' '
> >> >       git fetch multipleurls
> >> >  '
> >> >
> >> > +test_cmp_symbolic_ref () {
> >> > +     git symbolic-ref "$1" >actual &&
> >> > +     echo "$2" >expected &&
> >> > +     test_cmp expected actual
> >> > +}
> >>
> >> Sort of an aside, but this seems to be the Nth use of this pattern in
> >> the test suite, e.g. t1401-symbolic-ref.sh repeatedly hardcodes the
> >> same.
> >>
> >> I wonder if a prep commit to stick this in test-lib-functions.sh would
> >> be in order, or maybe a "--symbolic" argument to "test_cmp_rev"?
> >
> > Sure. If I had incline that such a patch would be merged (or this one)
> > I would do it, but I have a plethora of cleanup patches just gathering
> > dust, so I'd rather not.
> 
> Fair enough, thanks.
> 
> Re the "more below" above, I tried hacking some of what I suggested
> upthread on top of your patches, here's the result of
> that. Changes/commentary:
> 
>  * Switched the "int" to "enum"

For the record: I still don't see any value in doing that.

But I also don't see any harm, so I'm OK with that change.

>  * You've prepared the parse_update_head() to accept a NULL "r", but as
>    this & your other code shows, we never pass it NULL. I don't get why
>    we'd have it handle that case, as surely all plausible users are
>    "populate this config variable for me", no?

Yeah, I don't see any value in checking that, it probably was already there
from the original function I copied the code.

>  * I think better than a BUG() call in the new update_head() we should
>    just drop "need_update_head" entirely. It ends up just being a
>    variable that states "is missing or always", so for update_head() we
>    can just pass a boolean "missing?".

Actually, `need_update_head` doesn't equal "is missing or always": it mostly
tracks the fact that we sent "HEAD" as part of the refspecs sent to the remote.

For example if you do `git fetch`, that sets `need_update_head`, but if you do
`git fetch master` it does not (AFAIK).

So your change is not equivalent, and it would call update_head() unnecessarily
in many instances where there is no "HEAD" coming back from the remote, so
`struct ref *head` is NULL. That's not a big issue, since the function will
simply return in those cases.

But...

According to Jeff King, there are some instances where "HEAD" is coming back
from the server, even if we didn't request it, in those cases we would want the
local "remote/foo/HEAD" to be updated as well (if configured).

So your change is not functionally equivalent: it's actually better. The reason
I didn't implement the logic Jeff King suggested is that I didn't see a way to
do it without complicating the code, but your suggestion is the way.

>  * I renamed "update_head" to "fetch_update_head" just to have the
>    compiler catch cases where we were using the old "int", but if you
>    find some of this useful we could keep the old name.

I do find a lot of this useful (bar the switch to an enum), but I think the old
name is better, since it's a configuration that affects commands other than
`git fetch`, for example `git remote update`.

> Hope some of that helps.
> 
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 6bf147b0123..6492e88d779 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -59,7 +59,7 @@ static int fetch_prune_tags_config = -1; /* unspecified */
>  static int prune_tags = -1; /* unspecified */
>  #define PRUNE_TAGS_BY_DEFAULT 0 /* do we prune tags by default? */
>  
> -static int fetch_update_head = FETCH_UPDATE_HEAD_DEFAULT;
> +static enum fetch_update_head fetch_update_head = FETCH_UPDATE_HEAD_DEFAULT;
>  
>  static int all, append, dry_run, force, keep, multiple, update_head_ok;
>  static int write_fetch_head = 1;
> @@ -1584,7 +1584,8 @@ static int backfill_tags(struct transport *transport,
>  	return retcode;
>  }
>  
> -static void update_head(int config, const struct ref *head, const struct remote *remote)
> +static void update_head(int fetch_missing, const struct ref *head,
> +			struct remote *remote)

This is good, it simplifies the logic below.

>  {
>  	char *ref, *target;
>  	const char *r;
> @@ -1594,7 +1595,7 @@ static void update_head(int config, const struct ref *head, const struct remote
>  		return;
>  
>  	if (!remote->mirror) {
> -		ref = apply_refspecs((struct refspec *)&remote->fetch, "refs/heads/HEAD");
> +		ref = apply_refspecs(&remote->fetch, "refs/heads/HEAD");
>  		target = apply_refspecs((struct refspec *)&remote->fetch, head->symref);

Small nit: you didn't drop this casting.

>  
>  		if (!ref || !target) {
> @@ -1609,17 +1610,14 @@ static void update_head(int config, const struct ref *head, const struct remote
>  	r = resolve_ref_unsafe(ref, 0, NULL, &flags);
>  
>  	if (r) {
> -		if (config == FETCH_UPDATE_HEAD_MISSING) {
> +		if (fetch_missing) {
>  			if (flags & REF_ISSYMREF)
>  				/* already present */
>  				return;
> -		} else if (config == FETCH_UPDATE_HEAD_ALWAYS) {
> -			if (!strcmp(r, target))
> -				/* already up-to-date */
> -				return;
> -		} else
> -			/* should never happen */
> +		} else if (!strcmp(r, target)) {
> +			/* already up-to-date */
>  			return;
> +		}

I prefer to have the main logic (always) on top, but otherwise good.

>  	}
>  
>  	if (!create_symref(ref, target, "remote update head")) {
> @@ -1643,7 +1641,7 @@ static int do_fetch(struct transport *transport,
>  	int must_list_refs = 1;
>  	struct fetch_head fetch_head = { 0 };
>  	struct strbuf err = STRBUF_INIT;
> -	int need_update_head = 0, update_head_config = 0;
> +	enum fetch_update_head update_head_config = FETCH_UPDATE_HEAD_DEFAULT;
>  
>  	if (tags == TAGS_DEFAULT) {
>  		if (transport->remote->fetch_tags == 2)
> @@ -1680,15 +1678,19 @@ static int do_fetch(struct transport *transport,
>  
>  		if (transport->remote->fetch.nr) {
>  
> -			if (transport->remote->update_head)
> -				update_head_config = transport->remote->update_head;
> +			if (transport->remote->fetch_update_head != FETCH_UPDATE_HEAD_DEFAULT)

In Git codestyle implicit tends to be preferred over explicit (as is in the Linux codesyle)

 * `if (p)` over `if (p != NULL)`
 * `if (i)` over `if (i != 0)`
 * `if (!strcmp(...)` over `if (strcmp(...) == 0`

And so on, so I think this strongly suggests this is preferred:

  if (transport->remote->update_head)

Over

  if (transport->remote->fetch_update_head != FETCH_UPDATE_HEAD_DEFAULT)

> +				update_head_config = transport->remote->fetch_update_head;
>  			else
>  				update_head_config = fetch_update_head;
>  
> -			need_update_head = update_head_config && update_head_config != FETCH_UPDATE_HEAD_NEVER;
> -
> -			if (need_update_head)
> +			switch (update_head_config) {
> +			case FETCH_UPDATE_HEAD_MISSING:
> +			case FETCH_UPDATE_HEAD_ALWAYS:
>  				strvec_push(&transport_ls_refs_options.ref_prefixes, "HEAD");
> +			case FETCH_UPDATE_HEAD_DEFAULT:
> +			case FETCH_UPDATE_HEAD_NEVER:

I would rather have "default:" here to catch all the rest.

I suppose this could be clearer to some people (although IMO overly verbose),
but this has nothing to do with the enum change, as it can be done with `int
update_head_config`.

> +				break;
> +			}
>  			refspec_ref_prefixes(&transport->remote->fetch,
>  					     &transport_ls_refs_options.ref_prefixes);
>  		}
> @@ -1801,8 +1803,16 @@ static int do_fetch(struct transport *transport,
>  
>  	commit_fetch_head(&fetch_head);
>  
> -	if (need_update_head)
> -		update_head(update_head_config, find_ref_by_name(remote_refs, "HEAD"), transport->remote);
> +	switch (update_head_config) {
> +	case FETCH_UPDATE_HEAD_MISSING:
> +	case FETCH_UPDATE_HEAD_ALWAYS:
> +		update_head(update_head_config == FETCH_UPDATE_HEAD_MISSING,
> +			    find_ref_by_name(remote_refs, "HEAD"),
> +			    transport->remote);
> +	case FETCH_UPDATE_HEAD_DEFAULT:
> +	case FETCH_UPDATE_HEAD_NEVER:
> +		break;
> +	}

Ditto. Although this isn't functionally equivalent, it's actually better.

---

I've integrated the important parts of these changes into v2 and sent that.

I still don't see an incline of this patch ever being merged, so it's probably
just an exercise, but for the record there it is.

Cheers.

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-04-07  2:41 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-05  1:27 [PATCH 0/2] Add fetch.updateHead option Felipe Contreras
2023-04-05  1:27 ` [PATCH 1/2] " Felipe Contreras
2023-04-05  9:16   ` Ævar Arnfjörð Bjarmason
2023-04-05 10:15     ` Patrick Steinhardt
2023-04-05 14:55     ` Felipe Contreras
2023-04-06  7:33       ` Ævar Arnfjörð Bjarmason
2023-04-07  2:41         ` Felipe Contreras
2023-04-05  9:28   ` Ævar Arnfjörð Bjarmason
2023-04-05  1:27 ` [PATCH 2/2] fetch: add support for HEAD update on mirrors Felipe Contreras

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.