* [PATCH 0/3] Clarify pseudo-ref terminology
@ 2024-04-29 13:41 Patrick Steinhardt
  2024-04-29 13:41 ` [PATCH 1/3] refs: move `is_special_ref()` Patrick Steinhardt
                   ` (6 more replies)
  0 siblings, 7 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-04-29 13:41 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Karthik Nayak


Hi,

there is quite some confusion around what is a pseudo ref and what is a
special ref, and the documentation in gitglossary(7) overlaps. This
patch series clarifies that:

  - A pseudo ref is anything in the root directory that conforms to the
    pseudo ref syntax (all uppercase, must end with _HEAD), except for
    special refs. There are some exceptions that are now listed
    explicitly, and these only exist due to historic reasons.

  - Special refs are really only either FETCH_HEAD or MERGE_HEAD, where
    the reason is that those aren't really refs, but can sometimes be
    used as such.

I was very much pressed to go a completely different way: drop the
special ref term again that we have recently introduced and replace the
pseudo ref term with it. This would return pseudo refs to their original
meaning, namely something that is not a ref, but behaves like one at
times.

The current class of pseudo refs I would then drop completely -- in my
opinion there is just no need for it. Everything should be a ref, and
what we currently call pseudo refs (things in the root hierarchy) is
really just a naming policy. Refs must either have pseudoref syntax or
they must start with "refs/". It doesn't help in my opinion that we give
refs which conform to that naming policy but happen to live in the root
directory a separate name. They behave no differently from a normal ref
anyway: they are stored in the refdb and can be read and written like any
other ref starting with "refs/".

So, if we went down that road, we would:

  - Not have special refs anymore.

  - Have two pseudo refs, FETCH_HEAD and MERGE_HEAD. There is no other
    pseudo ref.

  - Clarify that refs must either start with "refs/", or have an
    all-uppercase name ending with "_HEAD". Exceptions to this rule are
    "HEAD" and a couple of others that we wrote due to historic reasons.
    All refs that match these rules behave the same; there is no
    difference between root refs and refs living in "refs/".

I think that this would be quite a lot easier to understand than the
current state of affairs we have, and also return us to the original
meaning of pseudorefs.
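
A minimal sketch of the naming policy proposed above, under a few
assumptions: `follows_ref_naming_policy()` is a hypothetical helper that does
not exist in Git, underscores are treated as part of an "all-uppercase" name,
and the historic exceptions are taken from the list of irregular pseudo refs
in patch 2/3. It only checks the shape of a name. It does not validate the
part after "refs/", and it does not single out FETCH_HEAD and MERGE_HEAD,
which would remain pseudo refs rather than refs under this proposal.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Historic exceptions: "HEAD" plus the irregular names from patch 2/3. */
static const char *const historic_root_refs[] = {
	"HEAD",
	"AUTO_MERGE",
	"BISECT_EXPECTED_REV",
	"NOTES_MERGE_PARTIAL",
	"NOTES_MERGE_REF",
	"MERGE_AUTOSTASH",
};

/* Hypothetical helper: does `name` satisfy the proposed naming policy? */
static int follows_ref_naming_policy(const char *name)
{
	size_t i, len;

	assert(name != NULL);

	/* Rule 1: anything below "refs/" is a ref name. */
	if (!strncmp(name, "refs/", 5))
		return 1;

	/* Rule 2: the fixed list of historic exceptions. */
	for (i = 0; i < sizeof(historic_root_refs) / sizeof(*historic_root_refs); i++)
		if (!strcmp(name, historic_root_refs[i]))
			return 1;

	/* Rule 3: all-uppercase (plus '_') and ending with "_HEAD". */
	len = strlen(name);
	if (len < 5 || strcmp(name + len - 5, "_HEAD"))
		return 0;
	for (i = 0; i < len; i++)
		if (!isupper((unsigned char)name[i]) && name[i] != '_')
			return 0;
	return 1;
}
```

With this sketch, "refs/heads/main", "BISECT_HEAD" and "HEAD" are accepted,
while "config" and "bisect_head" are rejected.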

If people agree with that line of thought I'll happily revise this patch
series. I didn't do that yet because it would be quite a lot more work,
so I first wanted to get some buy-in.

Patrick

Patrick Steinhardt (3):
  refs: move `is_special_ref()`
  refs: do not label special refs as pseudo refs
  refs: fix segfault in `is_pseudoref()` when ref cannot be resolved

 Documentation/glossary-content.txt | 36 +++++++------
 refs.c                             | 81 ++++++++++++++----------------
 t/t6302-for-each-ref-filter.sh     | 17 +++++++
 3 files changed, 76 insertions(+), 58 deletions(-)

-- 
2.45.0-rc1



* [PATCH 1/3] refs: move `is_special_ref()`
  2024-04-29 13:41 [PATCH 0/3] Clarify pseudo-ref terminology Patrick Steinhardt
@ 2024-04-29 13:41 ` Patrick Steinhardt
  2024-04-29 13:41 ` [PATCH 2/3] refs: do not label special refs as pseudo refs Patrick Steinhardt
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-04-29 13:41 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Karthik Nayak


Move `is_special_ref()` further up so that we can start using it in
`is_pseudoref()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c | 62 +++++++++++++++++++++++++++++-----------------------------
 1 file changed, 31 insertions(+), 31 deletions(-)

diff --git a/refs.c b/refs.c
index 55d2e0b2cb..c64f66bff9 100644
--- a/refs.c
+++ b/refs.c
@@ -844,6 +844,37 @@ int is_per_worktree_ref(const char *refname)
 	       starts_with(refname, "refs/rewritten/");
 }
 
+static int is_special_ref(const char *refname)
+{
+	/*
+	 * Special references are refs that have different semantics compared
+	 * to "normal" refs. These refs can thus not be stored in the ref
+	 * backend, but must always be accessed via the filesystem. The
+	 * following refs are special:
+	 *
+	 * - FETCH_HEAD may contain multiple object IDs, and each one of them
+	 *   carries additional metadata like where it came from.
+	 *
+	 * - MERGE_HEAD may contain multiple object IDs when merging multiple
+	 *   heads.
+	 *
+	 * Reading, writing or deleting references must consistently go either
+	 * through the filesystem (special refs) or through the reference
+	 * backend (normal ones).
+	 */
+	static const char * const special_refs[] = {
+		"FETCH_HEAD",
+		"MERGE_HEAD",
+	};
+	size_t i;
+
+	for (i = 0; i < ARRAY_SIZE(special_refs); i++)
+		if (!strcmp(refname, special_refs[i]))
+			return 1;
+
+	return 0;
+}
+
 static int is_pseudoref_syntax(const char *refname)
 {
 	const char *c;
@@ -1876,37 +1907,6 @@ static int refs_read_special_head(struct ref_store *ref_store,
 	return result;
 }
 
-static int is_special_ref(const char *refname)
-{
-	/*
-	 * Special references are refs that have different semantics compared
-	 * to "normal" refs. These refs can thus not be stored in the ref
-	 * backend, but must always be accessed via the filesystem. The
-	 * following refs are special:
-	 *
-	 * - FETCH_HEAD may contain multiple object IDs, and each one of them
-	 *   carries additional metadata like where it came from.
-	 *
-	 * - MERGE_HEAD may contain multiple object IDs when merging multiple
-	 *   heads.
-	 *
-	 * Reading, writing or deleting references must consistently go either
-	 * through the filesystem (special refs) or through the reference
-	 * backend (normal ones).
-	 */
-	static const char * const special_refs[] = {
-		"FETCH_HEAD",
-		"MERGE_HEAD",
-	};
-	size_t i;
-
-	for (i = 0; i < ARRAY_SIZE(special_refs); i++)
-		if (!strcmp(refname, special_refs[i]))
-			return 1;
-
-	return 0;
-}
-
 int refs_read_raw_ref(struct ref_store *ref_store, const char *refname,
 		      struct object_id *oid, struct strbuf *referent,
 		      unsigned int *type, int *failure_errno)
-- 
2.45.0-rc1



* [PATCH 2/3] refs: do not label special refs as pseudo refs
  2024-04-29 13:41 [PATCH 0/3] Clarify pseudo-ref terminology Patrick Steinhardt
  2024-04-29 13:41 ` [PATCH 1/3] refs: move `is_special_ref()` Patrick Steinhardt
@ 2024-04-29 13:41 ` Patrick Steinhardt
  2024-04-29 15:12   ` Phillip Wood
                     ` (3 more replies)
  2024-04-29 13:41 ` [PATCH 3/3] refs: fix segfault in `is_pseudoref()` when ref cannot be resolved Patrick Steinhardt
                   ` (4 subsequent siblings)
  6 siblings, 4 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-04-29 13:41 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Karthik Nayak


We have two refs which almost behave like a ref in many contexts, but
aren't really:

  - MERGE_HEAD contains the list of parents during a merge.

  - FETCH_HEAD contains the list of fetched references after
    git-fetch(1) with some annotations.

These references have been declared "special refs" in 8df4c5d205
(Documentation: add "special refs" to the glossary, 2024-01-19).

Due to their "_HEAD" suffix, those special refs also almost look like a
pseudo ref, even though they aren't. But because `is_pseudoref()` labels
anything as a pseudo ref that ends with the `_HEAD` suffix, it will also
happily label both of the above special refs as pseudo refs.

This mis-labeling creates some weirdness and inconsistent behaviour
across ref backends. As special refs are never stored via a ref backend,
the backends theoretically cannot know about them. But with the
recent introduction of the `--include-root-refs` flag this isn't quite
true anymore: the "files" backend will yield all refs that look like a
pseudo ref or "HEAD" stored in the root directory. And given that both
of the above look like pseudo refs, the "files" backend will list those,
too. The "reftable" backend naturally cannot know about those, and
teaching it to parse and yield these special refs very much feels like
the wrong way to go. So, arguably, the better direction to go is to mark
the "files" behaviour as a bug and stop yielding special refs there.

Conceptually, this feels like the right thing to do, too. Special refs
really aren't refs; they are a different file format that may in part
behave like a ref. If we were designing these special refs from
scratch, we would likely never have named them anything like a "ref" at
all.

So let's double down on the path that the mentioned commit has started,
which is to cleanly distinguish special refs and pseudo refs.
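
The distinction can be sketched as a small classifier for names found in the
root directory. This is only an illustration under stated assumptions:
`classify_root_name()` and `should_list_root_name()` are hypothetical and not
part of refs.c, the sketch ignores the irregular pseudo refs, and it does not
check whether the ref actually exists.

```c
#include <assert.h>
#include <string.h>

enum root_kind {
	ROOT_OTHER,     /* does not look like a ref at all */
	ROOT_HEAD,      /* "HEAD" itself */
	ROOT_PSEUDOREF, /* for example ORIG_HEAD or BISECT_HEAD */
	ROOT_SPECIAL    /* FETCH_HEAD or MERGE_HEAD */
};

/* Hypothetical classifier for a file name in the root directory. */
static enum root_kind classify_root_name(const char *name)
{
	size_t len;

	assert(name != NULL);

	if (!strcmp(name, "HEAD"))
		return ROOT_HEAD;

	/* Special refs are checked first so they never count as pseudo refs. */
	if (!strcmp(name, "FETCH_HEAD") || !strcmp(name, "MERGE_HEAD"))
		return ROOT_SPECIAL;

	/* Pseudo ref syntax in this sketch: [A-Z_]+ ending with "_HEAD". */
	len = strlen(name);
	if (len >= 5 && !strcmp(name + len - 5, "_HEAD") &&
	    strspn(name, "ABCDEFGHIJKLMNOPQRSTUVWXYZ_") == len)
		return ROOT_PSEUDOREF;

	return ROOT_OTHER;
}

/* A root-ref listing would yield HEAD and pseudo refs, but no special refs. */
static int should_list_root_name(const char *name)
{
	enum root_kind kind = classify_root_name(name);

	return kind == ROOT_HEAD || kind == ROOT_PSEUDOREF;
}
```

Checking the special refs before the syntax check mirrors the order this
patch uses in `is_pseudoref()`, so a name such as MERGE_HEAD is rejected
even though it matches the pseudo ref syntax.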

Ideally, the proper way would be to return to the original meaning that
pseudo refs really had: something that behaves like a ref for the most
part, but isn't really a ref. We would essentially replace the current
"pseudoref" term with the "special ref" term. The consequence is that
all refs except for FETCH_HEAD and MERGE_HEAD would be normal refs,
regardless of whether they live in the root hierarchy or not. The way
that pseudorefs are enforced now would then change to be a naming policy
for refs, only. It's unclear though how sensible it would be to do such
a large change to terminology now, which is why this commit does the
next best thing.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/glossary-content.txt | 36 ++++++++++++++++++------------
 refs.c                             |  2 ++
 t/t6302-for-each-ref-filter.sh     | 17 ++++++++++++++
 3 files changed, 41 insertions(+), 14 deletions(-)

diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index d71b199955..4275918fa0 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -497,20 +497,28 @@ exclude;;
 	unusual refs.
 
 [[def_pseudoref]]pseudoref::
-	Pseudorefs are a class of files under `$GIT_DIR` which behave
-	like refs for the purposes of rev-parse, but which are treated
-	specially by git.  Pseudorefs both have names that are all-caps,
-	and always start with a line consisting of a
-	<<def_SHA1,SHA-1>> followed by whitespace.  So, HEAD is not a
-	pseudoref, because it is sometimes a symbolic ref.  They might
-	optionally contain some additional data.  `MERGE_HEAD` and
-	`CHERRY_PICK_HEAD` are examples.  Unlike
-	<<def_per_worktree_ref,per-worktree refs>>, these files cannot
-	be symbolic refs, and never have reflogs.  They also cannot be
-	updated through the normal ref update machinery.  Instead,
-	they are updated by directly writing to the files.  However,
-	they can be read as if they were refs, so `git rev-parse
-	MERGE_HEAD` will work.
+	Pseudorefs are references that live in the root of the reference
+	hierarchy, outside of the usual "refs/" hierarchy. Pseudorefs have an
+	all-uppercase name and must end with a "_HEAD" suffix, for example
+	"`BISECT_HEAD`". Other than that, pseudorefs behave the exact same as
+	any other reference and can be both read and written via regular Git
+	tooling.
++
+<<def_special_ref,Special refs>> are not pseudorefs.
++
+Due to historic reasons, Git has several irregular pseudo refs that do not
+follow above rules. The following list of irregular pseudo refs is exhaustive
+and shall not be extended in the future:
+
+ - "`AUTO_MERGE`"
+
+ - "`BISECT_EXPECTED_REV`"
+
+ - "`NOTES_MERGE_PARTIAL`"
+
+ - "`NOTES_MERGE_REF`"
+
+ - "`MERGE_AUTOSTASH`"
 
 [[def_pull]]pull::
 	Pulling a <<def_branch,branch>> means to <<def_fetch,fetch>> it and
diff --git a/refs.c b/refs.c
index c64f66bff9..567c6fc6ff 100644
--- a/refs.c
+++ b/refs.c
@@ -905,6 +905,8 @@ int is_pseudoref(struct ref_store *refs, const char *refname)
 
 	if (!is_pseudoref_syntax(refname))
 		return 0;
+	if (is_special_ref(refname))
+		return 0;
 
 	if (ends_with(refname, "_HEAD")) {
 		refs_resolve_ref_unsafe(refs, refname,
diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh
index 948f1bb5f4..8c92fbde79 100755
--- a/t/t6302-for-each-ref-filter.sh
+++ b/t/t6302-for-each-ref-filter.sh
@@ -52,6 +52,23 @@ test_expect_success '--include-root-refs pattern prints pseudorefs' '
 	test_cmp expect actual
 '
 
+test_expect_success '--include-root-refs pattern does not print special refs' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		git rev-parse HEAD >.git/MERGE_HEAD &&
+		git for-each-ref --format="%(refname)" --include-root-refs >actual &&
+		cat >expect <<-EOF &&
+		HEAD
+		$(git symbolic-ref HEAD)
+		refs/tags/initial
+		EOF
+		test_cmp expect actual
+	)
+'
+
 test_expect_success '--include-root-refs with other patterns' '
 	cat >expect <<-\EOF &&
 	HEAD
-- 
2.45.0-rc1



* [PATCH 3/3] refs: fix segfault in `is_pseudoref()` when ref cannot be resolved
  2024-04-29 13:41 [PATCH 0/3] Clarify pseudo-ref terminology Patrick Steinhardt
  2024-04-29 13:41 ` [PATCH 1/3] refs: move `is_special_ref()` Patrick Steinhardt
  2024-04-29 13:41 ` [PATCH 2/3] refs: do not label special refs as pseudo refs Patrick Steinhardt
@ 2024-04-29 13:41 ` Patrick Steinhardt
  2024-04-29 15:25   ` Phillip Wood
  2024-04-29 18:57   ` Karthik Nayak
  2024-04-30 12:26 ` [PATCH v2 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-04-29 13:41 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Karthik Nayak


The `is_pseudoref()` function has somewhat weird behaviour in that it
checks not only whether a reference looks like a pseudoref, but also
that the reference actually resolves to an object ID.

In case a reference does not resolve though we can run into a segfault
because we never initialize the local `struct object_id` variable. Thus,
when `refs_resolve_ref_unsafe()` is unable to resolve the reference, the
variable will stay uninitialized. We then try to look up the hash algo
via the uninitialized value when calling `is_null_oid()`, which causes
us to segfault.
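
The failure mode can be illustrated with stand-in types. Everything in the
following sketch is invented for illustration (`struct sketch_oid`,
`sketch_resolve()`, `sketch_resolves_to_object()`); none of it is Git's real
API. The resolver writes its output only on success, so a caller must not
inspect the output unless the call succeeded.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_HASH_LEN 20

/* Invented stand-in for an object ID; not Git's `struct object_id`. */
struct sketch_oid {
	unsigned char hash[SKETCH_HASH_LEN];
};

/*
 * Invented resolver: on success it fills `*oid` and returns `refname`;
 * on failure it returns NULL and leaves `*oid` untouched.
 */
static const char *sketch_resolve(const char *refname, struct sketch_oid *oid)
{
	assert(refname != NULL && oid != NULL);

	if (strcmp(refname, "ORIG_HEAD"))
		return NULL;
	memset(oid->hash, 0xab, sizeof(oid->hash));
	return refname;
}

static int sketch_is_null(const struct sketch_oid *oid)
{
	static const unsigned char zero[SKETCH_HASH_LEN];

	return !memcmp(oid->hash, zero, SKETCH_HASH_LEN);
}

static int sketch_resolves_to_object(const char *refname)
{
	struct sketch_oid oid; /* uninitialized, like the local in the bug */

	/* Guard: never read `oid` when the resolver reported failure. */
	if (!sketch_resolve(refname, &oid))
		return 0;
	return !sketch_is_null(&oid);
}
```

Dropping the guard and calling `sketch_is_null(&oid)` unconditionally would
read indeterminate memory on the failure path, which is the pattern described
above.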

It is somewhat questionable in the first place that we declare a ref to
be a pseudoref depending on whether it resolves to an object ID or not.
And to make things even worse, a symbolic ref is currently considered to
not be a pseudo ref either because of `RESOLVE_REF_NO_RECURSE`, which
will cause us to not resolve them to an object ID. Last but not least,
it also is inconsistent with `is_headref()`, which only checks for the
reference to exist via `refs_ref_exists()`.

Refactor the code to do the same. While that still feels somewhat fishy,
it at least fixes the segfault for now. I have not been able to come up
with a reproducible test case that does not rely on other bugs and very
intricate state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c | 17 ++++-------------
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/refs.c b/refs.c
index 567c6fc6ff..b35485f150 100644
--- a/refs.c
+++ b/refs.c
@@ -900,7 +900,6 @@ int is_pseudoref(struct ref_store *refs, const char *refname)
 		"NOTES_MERGE_REF",
 		"MERGE_AUTOSTASH",
 	};
-	struct object_id oid;
 	size_t i;
 
 	if (!is_pseudoref_syntax(refname))
@@ -908,20 +907,12 @@ int is_pseudoref(struct ref_store *refs, const char *refname)
 	if (is_special_ref(refname))
 		return 0;
 
-	if (ends_with(refname, "_HEAD")) {
-		refs_resolve_ref_unsafe(refs, refname,
-					RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
-					&oid, NULL);
-		return !is_null_oid(&oid);
-	}
+	if (ends_with(refname, "_HEAD"))
+		return refs_ref_exists(refs, refname);
 
 	for (i = 0; i < ARRAY_SIZE(irregular_pseudorefs); i++)
-		if (!strcmp(refname, irregular_pseudorefs[i])) {
-			refs_resolve_ref_unsafe(refs, refname,
-						RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
-						&oid, NULL);
-			return !is_null_oid(&oid);
-		}
+		if (!strcmp(refname, irregular_pseudorefs[i]))
+			return refs_ref_exists(refs, refname);
 
 	return 0;
 }
-- 
2.45.0-rc1



* Re: [PATCH 2/3] refs: do not label special refs as pseudo refs
  2024-04-29 13:41 ` [PATCH 2/3] refs: do not label special refs as pseudo refs Patrick Steinhardt
@ 2024-04-29 15:12   ` Phillip Wood
  2024-04-30  7:30     ` Patrick Steinhardt
  2024-04-29 16:24   ` Junio C Hamano
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 93+ messages in thread
From: Phillip Wood @ 2024-04-29 15:12 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: Jeff King, Karthik Nayak

Hi Patrick

On 29/04/2024 14:41, Patrick Steinhardt wrote:
> diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
> index d71b199955..4275918fa0 100644
> --- a/Documentation/glossary-content.txt
> +++ b/Documentation/glossary-content.txt
> @@ -497,20 +497,28 @@ exclude;;
>   	unusual refs.
>   
>   [[def_pseudoref]]pseudoref::
> -	Pseudorefs are a class of files under `$GIT_DIR` which behave
> -	like refs for the purposes of rev-parse, but which are treated
> -	specially by git.  Pseudorefs both have names that are all-caps,
> -	and always start with a line consisting of a
> -	<<def_SHA1,SHA-1>> followed by whitespace.  So, HEAD is not a
> -	pseudoref, because it is sometimes a symbolic ref.  They might
> -	optionally contain some additional data.  `MERGE_HEAD` and
> -	`CHERRY_PICK_HEAD` are examples.  Unlike
> -	<<def_per_worktree_ref,per-worktree refs>>, these files cannot
> -	be symbolic refs, and never have reflogs.  They also cannot be
> -	updated through the normal ref update machinery.  Instead,
> -	they are updated by directly writing to the files.  However,
> -	they can be read as if they were refs, so `git rev-parse
> -	MERGE_HEAD` will work.
> +	Pseudorefs are references that live in the root of the reference
> +	hierarchy, outside of the usual "refs/" hierarchy. Pseudorefs have an
> +	all-uppercase name and must end with a "_HEAD" suffix, for example
> +	"`BISECT_HEAD`". Other than that, pseudorefs behave the exact same as
> +	any other reference and can be both read and written via regular Git
> +	tooling.

This changes the definition to allow pseudorefs to be symbolic refs.
When is_pseudoref() was introduced Junio and I had a brief discussion
about this restriction and he was not in favor of allowing pseudorefs to
be symbolic refs [1].

Are there any practical implications of the changes in this patch for
users running commands like "git log FETCH_HEAD"? I can't think of any
off the top of my head, but it would be good to have some reassurance on
that point in the commit message.

Best Wishes

Phillip

[1] https://lore.kernel.org/git/xmqq34u2q3zs.fsf@gitster.g/

> +<<def_special_ref>,Special refs>> are not pseudorefs.
> ++
> +Due to historic reasons, Git has several irregular pseudo refs that do not
> +follow above rules. The following list of irregular pseudo refs is exhaustive
> +and shall not be extended in the future:
> +
> + - "`AUTO_MERGE`"
> +
> + - "`BISECT_EXPECTED_REV`"
> +
> + - "`NOTES_MERGE_PARTIAL`"
> +
> + - "`NOTES_MERGE_REF`"
> +
> + - "`MERGE_AUTOSTASH`"
>   
>   [[def_pull]]pull::
>   	Pulling a <<def_branch,branch>> means to <<def_fetch,fetch>> it and
> diff --git a/refs.c b/refs.c
> index c64f66bff9..567c6fc6ff 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -905,6 +905,8 @@ int is_pseudoref(struct ref_store *refs, const char *refname)
>   
>   	if (!is_pseudoref_syntax(refname))
>   		return 0;
> +	if (is_special_ref(refname))
> +		return 0;
>   
>   	if (ends_with(refname, "_HEAD")) {
>   		refs_resolve_ref_unsafe(refs, refname,
> diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh
> index 948f1bb5f4..8c92fbde79 100755
> --- a/t/t6302-for-each-ref-filter.sh
> +++ b/t/t6302-for-each-ref-filter.sh
> @@ -52,6 +52,23 @@ test_expect_success '--include-root-refs pattern prints pseudorefs' '
>   	test_cmp expect actual
>   '
>   
> +test_expect_success '--include-root-refs pattern does not print special refs' '
> +	test_when_finished "rm -rf repo" &&
> +	git init repo &&
> +	(
> +		cd repo &&
> +		test_commit initial &&
> +		git rev-parse HEAD >.git/MERGE_HEAD &&
> +		git for-each-ref --format="%(refname)" --include-root-refs >actual &&
> +		cat >expect <<-EOF &&
> +		HEAD
> +		$(git symbolic-ref HEAD)
> +		refs/tags/initial
> +		EOF
> +		test_cmp expect actual
> +	)
> +'
> +
>   test_expect_success '--include-root-refs with other patterns' '
>   	cat >expect <<-\EOF &&
>   	HEAD


* Re: [PATCH 3/3] refs: fix segfault in `is_pseudoref()` when ref cannot be resolved
  2024-04-29 13:41 ` [PATCH 3/3] refs: fix segfault in `is_pseudoref()` when ref cannot be resolved Patrick Steinhardt
@ 2024-04-29 15:25   ` Phillip Wood
  2024-04-29 18:57   ` Karthik Nayak
  1 sibling, 0 replies; 93+ messages in thread
From: Phillip Wood @ 2024-04-29 15:25 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: Jeff King, Karthik Nayak

Hi Patrick

On 29/04/2024 14:41, Patrick Steinhardt wrote:
> The `is_pseudoref()` function has somewhat weird behaviour in that it
> both checks whether a reference looks like a pseudoref, but also that
> the reference actually resolves to an object ID.
> 
> In case a reference does not resolve though we can run into a segfault
> because we never initialize the local `struct object_id` variable. Thus,
> when `refs_resolve_ref_unsafe()` is unable to resolve the reference, the
> variable will stay uninitialize. We then try to look up the hash algo

s/uninitialize/uninitialized/

> via the uninitialized value when calling `is_null_oid()`, which causes
> us to segfault.
> 
> It is somewhat questionable in the first place that we declare a ref to
> be a pseudorefe depending on whether it resolves to an object ID or not.

If I remember rightly Karthik added that check to avoid the files 
backend calling a file with a name that matched the pseudoref syntax a 
pseudoref when it wasn't actually a pseudoref.

> And to make things even worse, a symbolic ref is currently considered to
> not be a pseudo ref either because of `RRESOLVE_REF_NO_RECURSE`,

s/RR/R/

That was a deliberate choice to fit with the definition of pseudorefs 
excluding symbolic refs.

> which
> will cause us to not resolve them to an object ID. Last but not least,
> it also is inconsistent with `is_headref()`, which only checks for the
> reference to exist via `refs_ref_exists()`.
> 
> Refactor the code to do the same. While that still feels somewhat fishy,
> it at least fixes the segfault for now.

Alternatively we could call oidclr() when refs_resolve_ref_unsafe()
returns NULL.
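
A rough sketch of that alternative, using invented stand-ins rather than
Git's real types and functions (`struct sketch_oid`, `sketch_resolve()` and
`sketch_is_pseudoref_cleared()` are all hypothetical, and the `memset()` only
stands in for the clearing step):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_HASH_LEN 20

/* Invented stand-in for an object ID. */
struct sketch_oid {
	unsigned char hash[SKETCH_HASH_LEN];
};

/* Invented resolver: fills `*oid` on success, returns NULL otherwise. */
static const char *sketch_resolve(const char *refname, struct sketch_oid *oid)
{
	assert(refname != NULL && oid != NULL);

	if (strcmp(refname, "ORIG_HEAD"))
		return NULL;
	memset(oid->hash, 0xab, sizeof(oid->hash));
	return refname;
}

static int sketch_is_pseudoref_cleared(const char *refname)
{
	static const unsigned char zero[SKETCH_HASH_LEN];
	struct sketch_oid oid;

	/* Clear the output when resolution fails, so the check below is defined. */
	if (!sketch_resolve(refname, &oid))
		memset(&oid, 0, sizeof(oid));

	/* Same "is it the null ID?" test as before, now on initialized data. */
	return memcmp(oid.hash, zero, SKETCH_HASH_LEN) != 0;
}
```

This keeps the existing "resolves to a non-null object ID" semantics, whereas
the patch switches to a plain existence check.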

Best Wishes

Phillip

> I have not been able to come up
> with a reproducible test case that does not rely on other bugs and very
> intricate state.
> 
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>   refs.c | 17 ++++-------------
>   1 file changed, 4 insertions(+), 13 deletions(-)
> 
> diff --git a/refs.c b/refs.c
> index 567c6fc6ff..b35485f150 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -900,7 +900,6 @@ int is_pseudoref(struct ref_store *refs, const char *refname)
>   		"NOTES_MERGE_REF",
>   		"MERGE_AUTOSTASH",
>   	};
> -	struct object_id oid;
>   	size_t i;
>   
>   	if (!is_pseudoref_syntax(refname))
> @@ -908,20 +907,12 @@ int is_pseudoref(struct ref_store *refs, const char *refname)
>   	if (is_special_ref(refname))
>   		return 0;
>   
> -	if (ends_with(refname, "_HEAD")) {
> -		refs_resolve_ref_unsafe(refs, refname,
> -					RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
> -					&oid, NULL);
> -		return !is_null_oid(&oid);
> -	}
> +	if (ends_with(refname, "_HEAD"))
> +		return refs_ref_exists(refs, refname);
>   
>   	for (i = 0; i < ARRAY_SIZE(irregular_pseudorefs); i++)
> -		if (!strcmp(refname, irregular_pseudorefs[i])) {
> -			refs_resolve_ref_unsafe(refs, refname,
> -						RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
> -						&oid, NULL);
> -			return !is_null_oid(&oid);
> -		}
> +		if (!strcmp(refname, irregular_pseudorefs[i]))
> +			return refs_ref_exists(refs, refname);
>   
>   	return 0;
>   }


* Re: [PATCH 2/3] refs: do not label special refs as pseudo refs
  2024-04-29 13:41 ` [PATCH 2/3] refs: do not label special refs as pseudo refs Patrick Steinhardt
  2024-04-29 15:12   ` Phillip Wood
@ 2024-04-29 16:24   ` Junio C Hamano
  2024-04-29 22:52   ` Justin Tobler
  2024-05-09 17:29   ` Jean-Noël AVILA
  3 siblings, 0 replies; 93+ messages in thread
From: Junio C Hamano @ 2024-04-29 16:24 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Jeff King, Karthik Nayak

Patrick Steinhardt <ps@pks.im> writes:

> +Due to historic reasons, Git has several irregular pseudo refs that do not
> +follow above rules. The following list of irregular pseudo refs is exhaustive
> +and shall not be extended in the future:

I like this part of the patch the most ;-).


* Re: [PATCH 3/3] refs: fix segfault in `is_pseudoref()` when ref cannot be resolved
  2024-04-29 13:41 ` [PATCH 3/3] refs: fix segfault in `is_pseudoref()` when ref cannot be resolved Patrick Steinhardt
  2024-04-29 15:25   ` Phillip Wood
@ 2024-04-29 18:57   ` Karthik Nayak
  2024-04-29 19:47     ` Phillip Wood
  2024-04-30  7:30     ` Patrick Steinhardt
  1 sibling, 2 replies; 93+ messages in thread
From: Karthik Nayak @ 2024-04-29 18:57 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: Jeff King


> It is somewhat questionable in the first place that we declare a ref to
> be a pseudorefe depending on whether it resolves to an object ID or not.

s/pseudorefe/pseudoref

[snip]

Phillip Wood <phillip.wood123@gmail.com> writes:
>> via the uninitialized value when calling `is_null_oid()`, which causes
>> us to segfault.
>>
>> It is somewhat questionable in the first place that we declare a ref to
>> be a pseudorefe depending on whether it resolves to an object ID or not.
>
> If I remember rightly Karthik added that check to avoid the files
> backend calling a file with a name that matched the pseudoref syntax a
> pseudoref when it wasn't actually a pseudoref.

Not sure I follow. I think it was strictly done to ensure we don't
consider symrefs as pseudorefs [1].

[1]: https://lore.kernel.org/git/xmqqfrymeega.fsf@gitster.g/


* Re: [PATCH 3/3] refs: fix segfault in `is_pseudoref()` when ref cannot be resolved
  2024-04-29 18:57   ` Karthik Nayak
@ 2024-04-29 19:47     ` Phillip Wood
  2024-04-29 20:44       ` Karthik Nayak
  2024-04-30  7:30     ` Patrick Steinhardt
  1 sibling, 1 reply; 93+ messages in thread
From: Phillip Wood @ 2024-04-29 19:47 UTC (permalink / raw)
  To: Karthik Nayak, Patrick Steinhardt, git; +Cc: Jeff King

Hi Karthik

On 29/04/2024 19:57, Karthik Nayak wrote:
> Phillip Wood <phillip.wood123@gmail.com> writes:
>>> via the uninitialized value when calling `is_null_oid()`, which causes
>>> us to segfault.
>>>
>>> It is somewhat questionable in the first place that we declare a ref to
>>> be a pseudorefe depending on whether it resolves to an object ID or not.
>>
>> If I remember rightly Karthik added that check to avoid the files
>> backend calling a file with a name that matched the pseudoref syntax a
>> pseudoref when it wasn't actually a pseudoref.
> 
> Not sure I follow. I think it was strictly done to ensure we don't
> consider symrefs as pseudorefs [1].

Junio suggested using refs_read_ref_unsafe() to ensure we don't consider 
symrefs as pseudorefs but your patch was already reading the ref to 
ensure it was not some random file whose name matches the pseudoref syntax.

Best Wishes

Phillip


* Re: [PATCH 3/3] refs: fix segfault in `is_pseudoref()` when ref cannot be resolved
  2024-04-29 19:47     ` Phillip Wood
@ 2024-04-29 20:44       ` Karthik Nayak
  0 siblings, 0 replies; 93+ messages in thread
From: Karthik Nayak @ 2024-04-29 20:44 UTC (permalink / raw)
  To: phillip.wood, Patrick Steinhardt, git; +Cc: Jeff King


Hey Phillip,

Phillip Wood <phillip.wood123@gmail.com> writes:

> Hi Karthik
>
> On 29/04/2024 19:57, Karthik Nayak wrote:
>> Phillip Wood <phillip.wood123@gmail.com> writes:
>>>> via the uninitialized value when calling `is_null_oid()`, which causes
>>>> us to segfault.
>>>>
>>>> It is somewhat questionable in the first place that we declare a ref to
>>>> be a pseudorefe depending on whether it resolves to an object ID or not.
>>>
>>> If I remember rightly Karthik added that check to avoid the files
>>> backend calling a file with a name that matched the pseudoref syntax a
>>> pseudoref when it wasn't actually a pseudoref.
>>
>> Not sure I follow. I think it was strictly done to ensure we don't
>> consider symrefs as pseudorefs [1].
>
> Junio suggested using refs_read_ref_unsafe() to ensure we don't consider
> symrefs as pseudorefs but your patch was already reading the ref to
> ensure it was not some random file whose name matches the pseudoref syntax.
>
> Best Wishes
>
> Phillip

Oh yes. You're absolutely correct. I just didn't understand what you
were referring to :)


* Re: [PATCH 2/3] refs: do not label special refs as pseudo refs
  2024-04-29 13:41 ` [PATCH 2/3] refs: do not label special refs as pseudo refs Patrick Steinhardt
  2024-04-29 15:12   ` Phillip Wood
  2024-04-29 16:24   ` Junio C Hamano
@ 2024-04-29 22:52   ` Justin Tobler
  2024-04-30  7:29     ` Patrick Steinhardt
  2024-05-09 17:29   ` Jean-Noël AVILA
  3 siblings, 1 reply; 93+ messages in thread
From: Justin Tobler @ 2024-04-29 22:52 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Jeff King, Karthik Nayak

On 24/04/29 03:41PM, Patrick Steinhardt wrote:
> diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
> index d71b199955..4275918fa0 100644
> --- a/Documentation/glossary-content.txt
> +++ b/Documentation/glossary-content.txt
> @@ -497,20 +497,28 @@ exclude;;
>  	unusual refs.
>  
>  [[def_pseudoref]]pseudoref::
> -	Pseudorefs are a class of files under `$GIT_DIR` which behave
> -	like refs for the purposes of rev-parse, but which are treated
> -	specially by git.  Pseudorefs both have names that are all-caps,
> -	and always start with a line consisting of a
> -	<<def_SHA1,SHA-1>> followed by whitespace.  So, HEAD is not a
> -	pseudoref, because it is sometimes a symbolic ref.  They might

We remove the example here about HEAD not being a pseudoref. This
example seems helpful to indicate that a pseudoref cannot be a symbolic
ref. Is this no longer the case and the change intended?

> -	optionally contain some additional data.  `MERGE_HEAD` and
> -	`CHERRY_PICK_HEAD` are examples.  Unlike
> -	<<def_per_worktree_ref,per-worktree refs>>, these files cannot
> -	be symbolic refs, and never have reflogs.  They also cannot be
> -	updated through the normal ref update machinery.  Instead,
> -	they are updated by directly writing to the files.  However,
> -	they can be read as if they were refs, so `git rev-parse
> -	MERGE_HEAD` will work.
> +	Pseudorefs are references that live in the root of the reference
> +	hierarchy, outside of the usual "refs/" hierarchy. Pseudorefs have an
> +	all-uppercase name and must end with a "_HEAD" suffix, for example
> +	"`BISECT_HEAD`". Other than that, pseudorefs behave the exact same as
> +	any other reference and can be both read and written via regular Git
> +	tooling.

Pseudorefs behaving the same and using the same tooling seems to
contradict the previous documentation. I assume the previous information
was out-of-date, but it might be nice to explain this in the message.

> ++
> +<<def_special_ref>,Special refs>> are not pseudorefs.
> ++
> +Due to historic reasons, Git has several irregular pseudo refs that do not
> +follow above rules. The following list of irregular pseudo refs is exhaustive

We seem to be inconsistent between using "pseudoref" and "pseudo ref".
Not sure if we want to be consistent here.

-Justin

> +and shall not be extended in the future:
> +
> + - "`AUTO_MERGE`"
> +
> + - "`BISECT_EXPECTED_REV`"
> +
> + - "`NOTES_MERGE_PARTIAL`"
> +
> + - "`NOTES_MERGE_REF`"
> +
> + - "`MERGE_AUTOSTASH`"
>  
>  [[def_pull]]pull::
>  	Pulling a <<def_branch,branch>> means to <<def_fetch,fetch>> it and
> diff --git a/refs.c b/refs.c
> index c64f66bff9..567c6fc6ff 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -905,6 +905,8 @@ int is_pseudoref(struct ref_store *refs, const char *refname)
>  
>  	if (!is_pseudoref_syntax(refname))
>  		return 0;
> +	if (is_special_ref(refname))
> +		return 0;
>  
>  	if (ends_with(refname, "_HEAD")) {
>  		refs_resolve_ref_unsafe(refs, refname,
> diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh
> index 948f1bb5f4..8c92fbde79 100755
> --- a/t/t6302-for-each-ref-filter.sh
> +++ b/t/t6302-for-each-ref-filter.sh
> @@ -52,6 +52,23 @@ test_expect_success '--include-root-refs pattern prints pseudorefs' '
>  	test_cmp expect actual
>  '
>  
> +test_expect_success '--include-root-refs pattern does not print special refs' '
> +	test_when_finished "rm -rf repo" &&
> +	git init repo &&
> +	(
> +		cd repo &&
> +		test_commit initial &&
> +		git rev-parse HEAD >.git/MERGE_HEAD &&
> +		git for-each-ref --format="%(refname)" --include-root-refs >actual &&
> +		cat >expect <<-EOF &&
> +		HEAD
> +		$(git symbolic-ref HEAD)
> +		refs/tags/initial
> +		EOF
> +		test_cmp expect actual
> +	)
> +'
> +
>  test_expect_success '--include-root-refs with other patterns' '
>  	cat >expect <<-\EOF &&
>  	HEAD
> -- 
> 2.45.0-rc1
> 



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 2/3] refs: do not label special refs as pseudo refs
  2024-04-29 22:52   ` Justin Tobler
@ 2024-04-30  7:29     ` Patrick Steinhardt
  0 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-04-30  7:29 UTC (permalink / raw)
  To: git, Jeff King, Karthik Nayak

[-- Attachment #1: Type: text/plain, Size: 5114 bytes --]

On Mon, Apr 29, 2024 at 05:52:41PM -0500, Justin Tobler wrote:
> On 24/04/29 03:41PM, Patrick Steinhardt wrote:
> > diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
> > index d71b199955..4275918fa0 100644
> > --- a/Documentation/glossary-content.txt
> > +++ b/Documentation/glossary-content.txt
> > @@ -497,20 +497,28 @@ exclude;;
> >  	unusual refs.
> >  
> >  [[def_pseudoref]]pseudoref::
> > -	Pseudorefs are a class of files under `$GIT_DIR` which behave
> > -	like refs for the purposes of rev-parse, but which are treated
> > -	specially by git.  Pseudorefs both have names that are all-caps,
> > -	and always start with a line consisting of a
> > -	<<def_SHA1,SHA-1>> followed by whitespace.  So, HEAD is not a
> > -	pseudoref, because it is sometimes a symbolic ref.  They might
> 
> We remove the example here about HEAD not being a pseudoref. This
> example seems helpful to indicate that a pseudoref cannot be a symbolic
> ref. Is this no longer the case and the change intended?

I just don't see why we would want to have this restriction. Honestly,
the more I think about this whole topic the more I want to go into the
direction I've hinted at in the cover letter: drop "special refs" and
define pseudo refs as either FETCH_HEAD or MERGE_HEAD. Everything else
is just a normal ref, even though some of those may live in the root
directory if they conform to a set of strict rules:

  - All uppercase characters plus underscores.

  - Must end with "_HEAD", except a list of known irregular root refs.

I feel like the world would be better like this.
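
To make those two rules concrete, here is a minimal sketch of a purely
syntactic check. It is not code from this series: `is_root_ref_name()` is
a hypothetical helper, the irregular names are the ones listed in the
quoted glossary hunk, and requiring at least one character before "_HEAD"
is my own assumption. Plain "HEAD" is not covered by the two rules as
written and would need its own case.

```c
#include <stddef.h>
#include <string.h>

/* Irregular names from the glossary hunk; they lack the "_HEAD" suffix. */
static const char *const irregular_root_refs[] = {
	"AUTO_MERGE",
	"BISECT_EXPECTED_REV",
	"NOTES_MERGE_PARTIAL",
	"NOTES_MERGE_REF",
	"MERGE_AUTOSTASH",
};

/* Hypothetical helper: returns 1 if `name` satisfies both rules, else 0. */
static int is_root_ref_name(const char *name)
{
	static const char suffix[] = "_HEAD";
	size_t suffix_len = sizeof(suffix) - 1;
	size_t len = strlen(name);
	size_t i;

	if (!len)
		return 0;

	/* Rule 1: only 'A'-'Z' and '_' are allowed. */
	for (i = 0; i < len; i++)
		if (!((name[i] >= 'A' && name[i] <= 'Z') || name[i] == '_'))
			return 0;

	/* Rule 2: ends with "_HEAD" (assumption: something must precede it) ... */
	if (len > suffix_len && !strcmp(name + len - suffix_len, suffix))
		return 1;

	/* ... or is one of the known irregular names. */
	for (i = 0; i < sizeof(irregular_root_refs) / sizeof(*irregular_root_refs); i++)
		if (!strcmp(name, irregular_root_refs[i]))
			return 1;

	return 0;
}
```

Note that this only looks at the name. FETCH_HEAD and MERGE_HEAD pass it
too, so excluding them would have to be a separate check.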

> > -	optionally contain some additional data.  `MERGE_HEAD` and
> > -	`CHERRY_PICK_HEAD` are examples.  Unlike
> > -	<<def_per_worktree_ref,per-worktree refs>>, these files cannot
> > -	be symbolic refs, and never have reflogs.  They also cannot be
> > -	updated through the normal ref update machinery.  Instead,
> > -	they are updated by directly writing to the files.  However,
> > -	they can be read as if they were refs, so `git rev-parse
> > -	MERGE_HEAD` will work.
> > +	Pseudorefs are references that live in the root of the reference
> > +	hierarchy, outside of the usual "refs/" hierarchy. Pseudorefs have an
> > +	all-uppercase name and must end with a "_HEAD" suffix, for example
> > +	"`BISECT_HEAD`". Other than that, pseudorefs behave the exact same as
> > +	any other reference and can be both read and written via regular Git
> > +	tooling.
> 
> Pseudorefs behaving the same and using the same tooling seems to
> contradict the previous documentation. I assume the previous information
> was out-of-date, but it might be nice to explain this in the message.

Yes, and I actually want to change this. We never enforced restrictions
for pseudorefs anyway, they can be symrefs just fine. And neither would
I see any reason why that should be the case in the first place.

> > ++
> > +<<def_special_ref>,Special refs>> are not pseudorefs.
> > ++
> > +Due to historic reasons, Git has several irregular pseudo refs that do not
> > +follow above rules. The following list of irregular pseudo refs is exhaustive
> 
> We seem to be inconsistent between using "pseudoref" and "pseudo ref".
> Not sure if we want to be consistent here.

Makes sense.

Patrick

> -Justin
> 
> > +and shall not be extended in the future:
> > +
> > + - "`AUTO_MERGE`"
> > +
> > + - "`BISECT_EXPECTED_REV`"
> > +
> > + - "`NOTES_MERGE_PARTIAL`"
> > +
> > + - "`NOTES_MERGE_REF`"
> > +
> > + - "`MERGE_AUTOSTASH`"
> >  
> >  [[def_pull]]pull::
> >  	Pulling a <<def_branch,branch>> means to <<def_fetch,fetch>> it and
> > diff --git a/refs.c b/refs.c
> > index c64f66bff9..567c6fc6ff 100644
> > --- a/refs.c
> > +++ b/refs.c
> > @@ -905,6 +905,8 @@ int is_pseudoref(struct ref_store *refs, const char *refname)
> >  
> >  	if (!is_pseudoref_syntax(refname))
> >  		return 0;
> > +	if (is_special_ref(refname))
> > +		return 0;
> >  
> >  	if (ends_with(refname, "_HEAD")) {
> >  		refs_resolve_ref_unsafe(refs, refname,
> > diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh
> > index 948f1bb5f4..8c92fbde79 100755
> > --- a/t/t6302-for-each-ref-filter.sh
> > +++ b/t/t6302-for-each-ref-filter.sh
> > @@ -52,6 +52,23 @@ test_expect_success '--include-root-refs pattern prints pseudorefs' '
> >  	test_cmp expect actual
> >  '
> >  
> > +test_expect_success '--include-root-refs pattern does not print special refs' '
> > +	test_when_finished "rm -rf repo" &&
> > +	git init repo &&
> > +	(
> > +		cd repo &&
> > +		test_commit initial &&
> > +		git rev-parse HEAD >.git/MERGE_HEAD &&
> > +		git for-each-ref --format="%(refname)" --include-root-refs >actual &&
> > +		cat >expect <<-EOF &&
> > +		HEAD
> > +		$(git symbolic-ref HEAD)
> > +		refs/tags/initial
> > +		EOF
> > +		test_cmp expect actual
> > +	)
> > +'
> > +
> >  test_expect_success '--include-root-refs with other patterns' '
> >  	cat >expect <<-\EOF &&
> >  	HEAD
> > -- 
> > 2.45.0-rc1
> > 
> 
> 

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 2/3] refs: do not label special refs as pseudo refs
  2024-04-29 15:12   ` Phillip Wood
@ 2024-04-30  7:30     ` Patrick Steinhardt
  2024-04-30  9:59       ` Phillip Wood
  2024-04-30 10:23       ` Jeff King
  0 siblings, 2 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-04-30  7:30 UTC (permalink / raw)
  To: phillip.wood; +Cc: git, Jeff King, Karthik Nayak

[-- Attachment #1: Type: text/plain, Size: 4169 bytes --]

On Mon, Apr 29, 2024 at 04:12:37PM +0100, Phillip Wood wrote:
> Hi Patrick
> 
> On 29/04/2024 14:41, Patrick Steinhardt wrote:
> > diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
> > index d71b199955..4275918fa0 100644
> > --- a/Documentation/glossary-content.txt
> > +++ b/Documentation/glossary-content.txt
> > @@ -497,20 +497,28 @@ exclude;;
> >   	unusual refs.
> >   [[def_pseudoref]]pseudoref::
> > -	Pseudorefs are a class of files under `$GIT_DIR` which behave
> > -	like refs for the purposes of rev-parse, but which are treated
> > -	specially by git.  Pseudorefs both have names that are all-caps,
> > -	and always start with a line consisting of a
> > -	<<def_SHA1,SHA-1>> followed by whitespace.  So, HEAD is not a
> > -	pseudoref, because it is sometimes a symbolic ref.  They might
> > -	optionally contain some additional data.  `MERGE_HEAD` and
> > -	`CHERRY_PICK_HEAD` are examples.  Unlike
> > -	<<def_per_worktree_ref,per-worktree refs>>, these files cannot
> > -	be symbolic refs, and never have reflogs.  They also cannot be
> > -	updated through the normal ref update machinery.  Instead,
> > -	they are updated by directly writing to the files.  However,
> > -	they can be read as if they were refs, so `git rev-parse
> > -	MERGE_HEAD` will work.
> > +	Pseudorefs are references that live in the root of the reference
> > +	hierarchy, outside of the usual "refs/" hierarchy. Pseudorefs have an
> > +	all-uppercase name and must end with a "_HEAD" suffix, for example
> > +	"`BISECT_HEAD`". Other than that, pseudorefs behave the exact same as
> > +	any other reference and can be both read and written via regular Git
> > +	tooling.
> 
> This changes the definition to allow pseudorefs to be symbolic refs. When
> is_pseudoref() was introduced Junio and I had a brief discussion about this
> restriction and he was not in favor of allowing pseudorefs to be symbolic
> refs [1].

So the reason why pseudorefs exist is that some refs behave like a ref
sometimes, but not always. And in my book that really only applies to
MERGE_HEAD and FETCH_HEAD, because those contain additional metadata
that makes them not-a-ref. And for those I very much see that they
should not ever be a symref.

But everything else living in the root of the ref hierarchy is not
special in any way, at least not in my opinion. We have never enforced
that those cannot be symrefs, and it makes our terminology needlessly
confusing.

I think I'm going to reroll this patch series and go down the nuclear
path that I've hinted at in the cover letter:

  - Pseudo refs can only be either FETCH_HEAD or MERGE_HEAD.

  - Refs starting with "refs/" are just plain normal refs.

  - Refs living in the root of the ref hierarchy need to conform to a
    set of strict rules, as Peff is starting to enforce in a separate
    patch series. These are just normal refs, as well, even though we
    may call them "root ref" in our tooling as they live in the root of
    the ref hierarchy.
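
As a rough sketch of that three-way split (not code from the series;
`enum ref_kind` and `classify_ref()` are hypothetical names used only for
illustration):

```c
#include <string.h>

/* Hypothetical classification mirroring the three bullet points. */
enum ref_kind {
	REF_KIND_PSEUDO,  /* FETCH_HEAD or MERGE_HEAD: carries extra metadata */
	REF_KIND_REGULAR, /* anything below "refs/" */
	REF_KIND_ROOT,    /* root ref candidate; syntax rules are checked elsewhere */
};

static enum ref_kind classify_ref(const char *name)
{
	/* The only two pseudo refs under the proposed definition. */
	if (!strcmp(name, "FETCH_HEAD") || !strcmp(name, "MERGE_HEAD"))
		return REF_KIND_PSEUDO;

	/* Plain normal refs. */
	if (!strncmp(name, "refs/", strlen("refs/")))
		return REF_KIND_REGULAR;

	/* Everything else lives in the root of the ref hierarchy. */
	return REF_KIND_ROOT;
}
```

The sketch deliberately does not validate root ref names; whether a
candidate such as "ORIG_HEAD" is acceptable is a separate syntactic
question.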

I just don't think that the current state makes sense to anybody. It's
majorly confusing -- I've spent the last 8 months working in our refs
code almost exclusively and still forget what's what. How are our users
expected to understand this?

> Are there any practical implications of the changes in this patch for users
> running commands like "git log FETCH_HEAD" (I can't think of any off the top
> of my head but it would be good to have some reassurance on that point in
> the commit message)

Not really, no. We have never been doing a good job at enforcing the
difference between pseudo refs or normal refs anyway. Pseudo refs can be
symrefs just fine, and our tooling won't complain. The only exception
where I want us to become stricter is in how we enforce the syntax rules
for root refs (which is handled by Peff in a separate patch series), and
that we start to not treat FETCH_HEAD and MERGE_HEAD as proper refs.
They should still resolve when you ask git-rev-parse(1), but when you
iterate through refs they should not be surfaced as they _aren't_ refs.

Patrick

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 3/3] refs: fix segfault in `is_pseudoref()` when ref cannot be resolved
  2024-04-29 18:57   ` Karthik Nayak
  2024-04-29 19:47     ` Phillip Wood
@ 2024-04-30  7:30     ` Patrick Steinhardt
  1 sibling, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-04-30  7:30 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git, Jeff King

[-- Attachment #1: Type: text/plain, Size: 1431 bytes --]

On Mon, Apr 29, 2024 at 11:57:53AM -0700, Karthik Nayak wrote:
> > It is somewhat questionable in the first place that we declare a ref to
> > be a pseudorefe depending on whether it resolves to an object ID or not.
> 
> s/pseudorefe/pseudoref
> 
> [snip]
> 
> Phillip Wood <phillip.wood123@gmail.com> writes:
> >> via the uninitialized value when calling `is_null_oid()`, which causes
> >> us to segfault.
> >>
> >> It is somewhat questionable in the first place that we declare a ref to
> >> be a pseudorefe depending on whether it resolves to an object ID or not.
> >
> > If I remember rightly Karthik added that check to avoid the files
> > backend calling a file with a name that matched the pseudoref syntax a
> > pseudoref when it wasn't actually a pseudoref.
> 
> Not sure I follow. I think it was strictly done to ensure we don't
> consider symrefs as pseudorefs [1].
> 
> [1]: https://lore.kernel.org/git/xmqqfrymeega.fsf@gitster.g/

And that's fair from a terminology perspective. But honestly, I really
doubt that any user will understand that REBASE_HEAD is a pseudoref when
it contains an object ID, but is not a pseudoref when it is a symref.

Anyway, as I've said in parallel mails, I want to change the definition
of what a pseudoref is. I just think that the current mess is understood
by nobody and doesn't make any sense.

I'll thus implicitly address this in my v2.

Patrick

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 2/3] refs: do not label special refs as pseudo refs
  2024-04-30  7:30     ` Patrick Steinhardt
@ 2024-04-30  9:59       ` Phillip Wood
  2024-04-30 12:11         ` Patrick Steinhardt
  2024-04-30 10:23       ` Jeff King
  1 sibling, 1 reply; 93+ messages in thread
From: Phillip Wood @ 2024-04-30  9:59 UTC (permalink / raw)
  To: Patrick Steinhardt, phillip.wood; +Cc: git, Jeff King, Karthik Nayak

Hi Patrick

On 30/04/2024 08:30, Patrick Steinhardt wrote:
> On Mon, Apr 29, 2024 at 04:12:37PM +0100, Phillip Wood wrote:
>
>> This changes the definition to allow pseudorefs to be symbolic refs. When
>> is_pseudoref() was introduced Junio and I had a brief discussion about this
>> restriction and he was not in favor of allowing pseudorefs to be symbolic
>> refs [1].
> 
> So the reason why pseudorefs exist is that some refs behave like a ref
> sometimes, but not always. And in my book that really only applies to
> MERGE_HEAD and FETCH_HEAD, because those contain additional metadata
> that makes them not-a-ref. And for those I very much see that they
> should not ever be a symref.
> 
> But everything else living in the root of the ref hierarchy is not
> special in any way, at least not in my opinion. We have never enforced
> that those cannot be symrefs, and it makes our terminology needlessly
> confusing.

I agree HEAD not being a pseudoref and having special refs as well as
pseudorefs is confusing. I do have some sympathy for the argument
that pseudorefs should not be symbolic refs though as AUTO_MERGE,
CHERRY_PICK_HEAD, ORIG_HEAD etc. are all pointers to a commit and it
would be a bug for them to be a symbolic ref. It is unfortunate that in
the move away from accessing those refs as files we lost the check that
they are not symbolic refs.

> I think I'm going to reroll this patch series and go down the nuclear
> path that I've hinted at in the cover letter:
> 
>    - Pseudo refs can only be either FETCH_HEAD or MERGE_HEAD.
> 
>    - Refs starting with "refs/" are just plain normal refs.
> 
>    - Refs living in the root of the ref hierarchy need to conform to a
>      set of strict rules, as Peff is starting to enforce in a separate
>      patch series. These are just normal refs, as well, even though we
>      may call them "root ref" in our tooling as they live in the root of
>      the ref hierarchy.

That would certainly be simpler.

> I just don't think that the current state makes sense to anybody. It's
> majorly confusing -- I've spent the last 8 months working in our refs
> code almost exclusively and still forget what's what. How are our users
> expected to understand this?

The current state is confusing but arguably there is a logic to the 
various distinctions - whether those distinctions are useful in practice 
is open to debate though. I wonder how much users really care about 
these distinctions and whether it affects their use of git. I was 
unaware of the distinction between HEAD and pseudorefs until I reviewed 
Karthik's for-each-ref series a couple of months ago and I don't think 
that lack of knowledge had caused me any trouble when using git.

>> Are there any practical implications of the changes in this patch for users
>> running commands like "git log FETCH_HEAD" (I can't think of any off the top
>> of my head but it would be good to have some reassurance on that point in
>> the commit message)
> 
> Not really, no. We have never been doing a good job at enforcing the
> difference between pseudo refs or normal refs anyway. Pseudo refs can be
> symrefs just fine, and our tooling won't complain. The only exception
> where I want us to become stricter is in how we enforce the syntax rules
> for root refs (which is handled by Peff in a separate patch series), and
> that we start to not treat FETCH_HEAD and MERGE_HEAD as proper refs.
> They should still resolve when you ask git-rev-parse(1), but when you
> iterate through refs they should not be surfaced as they _aren't_ refs.

That's good

Thanks

Phillip


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 2/3] refs: do not label special refs as pseudo refs
  2024-04-30  7:30     ` Patrick Steinhardt
  2024-04-30  9:59       ` Phillip Wood
@ 2024-04-30 10:23       ` Jeff King
  2024-04-30 12:07         ` Karthik Nayak
  2024-04-30 12:16         ` Patrick Steinhardt
  1 sibling, 2 replies; 93+ messages in thread
From: Jeff King @ 2024-04-30 10:23 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: phillip.wood, git, Karthik Nayak

On Tue, Apr 30, 2024 at 09:30:05AM +0200, Patrick Steinhardt wrote:

> So the reason why pseudorefs exist is that some refs behave like a ref
> sometimes, but not always. And in my book that really only applies to
> MERGE_HEAD and FETCH_HEAD, because those contain additional metadata
> that makes them not-a-ref. And for those I very much see that they
> should not ever be a symref.
> 
> But everything else living in the root of the ref hierarchy is not
> special in any way, at least not in my opinion. We have never enforced
> that those cannot be symrefs, and it makes our terminology needlessly
> confusing.
> 
> I think I'm going to reroll this patch series and go down the nuclear
> path that I've hinted at in the cover letter:
> 
>   - Pseudo refs can only be either FETCH_HEAD or MERGE_HEAD.
> 
>   - Refs starting with "refs/" are just plain normal refs.
> 
>   - Refs living in the root of the ref hierarchy need to conform to a
>     set of strict rules, as Peff is starting to enforce in a separate
>     patch series. These are just normal refs, as well, even though we
>     may call them "root ref" in our tooling as they live in the root of
>     the ref hierarchy.
> 
> I just don't think that the current state makes sense to anybody. It's
> majorly confusing -- I've spent the last 8 months working in our refs
> code almost exclusively and still forget what's what. How are our users
> expected to understand this?

Yes, I very much agree with your final paragraph. I have been working on
Git for 18 years, and am learning new things about pseudo and special
refs in this thread. ;) (Admittedly, I think that distinction is new in
the past few months).

I think the "everything is a ref, even at the root" is the simplest
thing for users. And the only rules they need to know are the syntactic
ones: names start with "refs/" or are all-caps and underscore. But I do
not see the value in them caring that HEAD can be a symref or that
MERGE_HEAD cannot (nor the value in the code making such a distinction).

My series does not enforce the "_HEAD" suffix (plus special cases) as a
syntactic rule, but we could do that easily on top. That would help
protect case-insensitive filesystems from the same shenanigans that my
series aims for (e.g., "CONFIG" on such a system will still look at the
"config" file).
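
To illustrate the "CONFIG" case, here is a small sketch. It is not taken
from either series; all three helpers are hypothetical, and
`collides_on_icase_fs()` only models a case-insensitive filesystem by
comparing names with POSIX strcasecmp().

```c
#include <string.h>
#include <strings.h> /* strcasecmp(), POSIX */

/* Hypothetical: the all-caps-and-underscore rule on its own. */
static int is_all_caps_name(const char *name)
{
	if (!*name)
		return 0;
	for (; *name; name++)
		if (!((*name >= 'A' && *name <= 'Z') || *name == '_'))
			return 0;
	return 1;
}

/* Hypothetical: the additional "_HEAD" suffix rule. */
static int has_head_suffix(const char *name)
{
	size_t len = strlen(name);
	return len >= 5 && !strcmp(name + len - 5, "_HEAD");
}

/* Models a case-insensitive lookup: would `refname` open `file`? */
static int collides_on_icase_fs(const char *refname, const char *file)
{
	return !strcasecmp(refname, file);
}
```

With these, "CONFIG" passes the all-caps rule and collides with "config",
but fails the suffix rule, which is why adding that rule on top would
close the hole.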

It is unfortunate to me that we even need to call out FETCH_HEAD and
MERGE_HEAD. I know they are special within Git, and probably ref
backends need to be aware (because they have to be able to carry extra
data). But from a user's perspective they resolve in the normal way
(unless you are trying to look at them in their special non-ref way).
I guess the user must care that they will always be in the filesystem in
order to access them in that special way, though.

> > Are there any practical implications of the changes in this patch for users
> > running commands like "git log FETCH_HEAD" (I can't think of any off the top
> > of my head but it would be good to have some reassurance on that point in
> > the commit message)
> 
> Not really, no. We have never been doing a good job at enforcing the
> difference between pseudo refs or normal refs anyway. Pseudo refs can be
> symrefs just fine, and our tooling won't complain. The only exception
> where I want us to become stricter is in how we enforce the syntax rules
> for root refs (which is handled by Peff in a separate patch series), and
> that we start to not treat FETCH_HEAD and MERGE_HEAD as proper refs.
> They should still resolve when you ask git-rev-parse(1), but when you
> iterate through refs they should not be surfaced as they _aren't_ refs.

I actually would not even mind if they are surfaced when iterating with
--include-root-refs. But then I am a little skeptical of the purpose of
that feature in the first place. After all, the reason code shoves stuff
into .git/FOO_HEAD is precisely because we don't want other stuff
iterating over them, using them for reachability, and so on. That is why
"--all" does not include them, for example.

But I did not follow the development of the feature, so maybe I am
missing some cool use case.

-Peff

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 2/3] refs: do not label special refs as pseudo refs
  2024-04-30 10:23       ` Jeff King
@ 2024-04-30 12:07         ` Karthik Nayak
  2024-04-30 12:33           ` Patrick Steinhardt
  2024-04-30 12:16         ` Patrick Steinhardt
  1 sibling, 1 reply; 93+ messages in thread
From: Karthik Nayak @ 2024-04-30 12:07 UTC (permalink / raw)
  To: Jeff King, Patrick Steinhardt; +Cc: phillip.wood, git

[-- Attachment #1: Type: text/plain, Size: 4505 bytes --]

Jeff King <peff@peff.net> writes:

> On Tue, Apr 30, 2024 at 09:30:05AM +0200, Patrick Steinhardt wrote:
>
>> So the reason why pseudorefs exist is that some refs behave like a ref
>> sometimes, but not always. And in my book that really only applies to
>> MERGE_HEAD and FETCH_HEAD, because those contain additional metadata
>> that makes them not-a-ref. And for those I very much see that they
>> should not ever be a symref.
>>
>> But everything else living in the root of the ref hierarchy is not
>> special in any way, at least not in my opinion. We have never enforced
>> that those cannot be symrefs, and it makes our terminology needlessly
>> confusing.
>>
>> I think I'm going to reroll this patch series and go down the nuclear
>> path that I've hinted at in the cover letter:
>>
>>   - Pseudo refs can only be either FETCH_HEAD or MERGE_HEAD.
>>
>>   - Refs starting with "refs/" are just plain normal refs.
>>
>>   - Refs living in the root of the ref hierarchy need to conform to a
>>     set of strict rules, as Peff is starting to enforce in a separate
>>     patch series. These are just normal refs, as well, even though we
>>     may call them "root ref" in our tooling as they live in the root of
>>     the ref hierarchy.
>>
>> I just don't think that the current state makes sense to anybody. It's
>> majorly confusing -- I've spent the last 8 months working in our refs
>> code almost exclusively and still forget what's what. How are our users
>> expected to understand this?
>
> Yes, I very much agree with your final paragraph. I have been working on
> Git for 18 years, and am learning new things about pseudo and special
> refs in this thread. ;) (Admittedly, I think that distinction is new in
> the past few months).
>
> I think the "everything is a ref, even at the root" is the simplest
> thing for users. And the only rules they need to know are the syntactic
> ones: names start with "refs/" or are all-caps and underscore. But I do
> not see the value in them caring that HEAD can be a symref or that
> MERGE_HEAD cannot (nor the value in the code making such a distinction).
>
> My series does not enforce the "_HEAD" suffix (plus special cases) as a
> syntactic rule, but we could do that easily on top. That would help
> protect case-insensitive filesystems from the same shenanigans that my
> series aims for (e.g., "CONFIG" on such a system will still look at the
> "config" file).
>
> It is unfortunate to me that we even need to call out FETCH_HEAD and
> MERGE_HEAD. I know they are special within Git, and probably ref
> backends need to be aware (because they have to be able to carry extra
> data). But from a user's perspective they resolve in the normal way
> (unless you are trying to look at them in their special non-ref way).
> I guess the user must care that they will always be in the filesystem in
> order to access them in that special way, though.
>
>> > Are there any practical implications of the changes in this patch for users
>> > running commands like "git log FETCH_HEAD" (I can't think of any off the top
>> > of my head but it would be good to have some reassurance on that point in
>> > the commit message)
>>
>> Not really, no. We have never been doing a good job at enforcing the
>> difference between pseudo refs or normal refs anyway. Pseudo refs can be
>> symrefs just fine, and our tooling won't complain. The only exception
>> where I want us to become stricter is in how we enforce the syntax rules
>> for root refs (which is handled by Peff in a separate patch series), and
>> that we start to not treat FETCH_HEAD and MERGE_HEAD as proper refs.
>> They should still resolve when you ask git-rev-parse(1), but when you
>> iterate through refs they should not be surfaced as they _aren't_ refs.
>
> I actually would not even mind if they are surfaced when iterating with
> --include-root-refs. But then I am a little skeptical of the purpose of
> that feature in the first place. After all, the reason code shoves stuff
> into .git/FOO_HEAD is precisely because we don't want other stuff
> iterating over them, using them for reachability, and so on. That is why
> "--all" does not include them, for example.
>
> But I did not follow the development of the feature, so maybe I am
> missing some cool use case.
>

The use case was to allow us to look at these refs when working with
the reftable backend, where there is otherwise no way to do that. With
the files backend, well, you could just read the files. So it is mostly
a debugging use case.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 2/3] refs: do not label special refs as pseudo refs
  2024-04-30  9:59       ` Phillip Wood
@ 2024-04-30 12:11         ` Patrick Steinhardt
  0 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-04-30 12:11 UTC (permalink / raw)
  To: phillip.wood; +Cc: git, Jeff King, Karthik Nayak

[-- Attachment #1: Type: text/plain, Size: 4721 bytes --]

On Tue, Apr 30, 2024 at 10:59:36AM +0100, Phillip Wood wrote:
> Hi Patrick
> 
> On 30/04/2024 08:30, Patrick Steinhardt wrote:
> > On Mon, Apr 29, 2024 at 04:12:37PM +0100, Phillip Wood wrote:
> > 
> > > This changes the definition to allow pseudorefs to be symbolic refs. When
> > > is_pseudoref() was introduced Junio and I had a brief discussion about this
> > > restriction and he was not in favor of allowing pseudorefs to be symbolic
> > > refs [1].
> > 
> > So the reason why pseudorefs exist is that some refs behave like a ref
> > sometimes, but not always. And in my book that really only applies to
> > MERGE_HEAD and FETCH_HEAD, because those contain additional metadata
> > that makes them not-a-ref. And for those I very much see that they
> > should not ever be a symref.
> > 
> > But everything else living in the root of the ref hierarchy is not
> > special in any way, at least not in my opinion. We have never enforced
> > that those cannot be symrefs, and it makes our terminology needlessly
> > confusing.
> 
> I agree HEAD not being a pseudoref and having special refs as well as
> pseudorefs is confusing. I do have some sympathy for the argument that
> pseudorefs should not be symbolic refs though as AUTO_MERGE,
> CHERRY_PICK_HEAD, ORIG_HEAD etc. are all pointers to a commit and it would
> be a bug for them to be a symbolic ref. It is unfortunate that in the move
> away from accessing those refs as files we lost the check that they are not
> symbolic refs.

While I agree that conceptually these should always be "regular" refs, I
feel like that is higher-level logic that belongs in the respective
subsystems that write those. I just don't see why the ref backend should
care about the particular use cases that those higher-level subsystems
have, and I can very much see that there might eventually be another
subsystem that actually wants a specific ref to be a symref.

Now we could of course start to hard-code all kinds of refs into the ref
layer. But I think that this is the wrong way to go. Treating the
ref store as just that, a generic store where you can store refs
without attaching specific meaning to any of them, is the proper way
to go.

> > I think I'm going to reroll this patch series and go down the nuclear
> > path that I've hinted at in the cover letter:
> > 
> >    - Pseudo refs can only be either FETCH_HEAD or MERGE_HEAD.
> > 
> >    - Refs starting with "refs/" are just plain normal refs.
> > 
> >    - Refs living in the root of the ref hierarchy need to conform to a
> >      set of strict rules, as Peff is starting to enforce in a separate
> >      patch series. These are just normal refs, as well, even though we
> >      may call them "root ref" in our tooling as they live in the root of
> >      the ref hierarchy.
> 
> That would certainly be simpler.
> 
> > I just don't think that the current state makes sense to anybody. It's
> > majorly confusing -- I've spent the last 8 months working in our refs
> > code almost exclusively and still forget what's what. How are our users
> > expected to understand this?
> 
> The current state is confusing but arguably there is a logic to the various
> distinctions - whether those distinctions are useful in practice is open to
> debate though. I wonder how much users really care about these distinctions
> and whether it affects their use of git. I was unaware of the distinction
> between HEAD and pseudorefs until I reviewed Karthik's for-each-ref series a
> couple of months ago and I don't think that lack of knowledge had caused me
> any trouble when using git.

There is some logic, that's true enough. I just don't think that anybody
understands the logic.

Patrick

> > > Are there any practical implications of the changes in this patch for users
> > > running commands like "git log FETCH_HEAD" (I can't think of any off the top
> > > of my head but it would be good to have some reassurance on that point in
> > > the commit message)
> > 
> > Not really, no. We have never been doing a good job at enforcing the
> > difference between pseudo refs or normal refs anyway. Pseudo refs can be
> > symrefs just fine, and our tooling won't complain. The only exception
> > where I want us to become stricter is in how we enforce the syntax rules
> > for root refs (which is handled by Peff in a separate patch series), and
> > that we start to not treat FETCH_HEAD and MERGE_HEAD as proper refs.
> > They should still resolve when you ask git-rev-parse(1), but when you
> > iterate through refs they should not be surfaced as they _aren't_ refs.
> 
> That's good
> 
> Thanks
> 
> Phillip
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 2/3] refs: do not label special refs as pseudo refs
  2024-04-30 10:23       ` Jeff King
  2024-04-30 12:07         ` Karthik Nayak
@ 2024-04-30 12:16         ` Patrick Steinhardt
  1 sibling, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-04-30 12:16 UTC (permalink / raw)
  To: Jeff King; +Cc: phillip.wood, git, Karthik Nayak

[-- Attachment #1: Type: text/plain, Size: 2747 bytes --]

On Tue, Apr 30, 2024 at 06:23:10AM -0400, Jeff King wrote:
> On Tue, Apr 30, 2024 at 09:30:05AM +0200, Patrick Steinhardt wrote:
[snip]
> > > Are there any practical implications of the changes in this patch for users
> > > running commands like "git log FETCH_HEAD" (I can't think of any off the top
> > > of my head but it would be good to have some reassurance on that point in
> > > the commit message)
> > 
> > Not really, no. We have never been doing a good job at enforcing the
> > difference between pseudo refs or normal refs anyway. Pseudo refs can be
> > symrefs just fine, and our tooling won't complain. The only exception
> > where I want us to become stricter is in how we enforce the syntax rules
> > for root refs (which is handled by Peff in a separate patch series), and
> > that we start to not treat FETCH_HEAD and MERGE_HEAD as proper refs.
> > They should still resolve when you ask git-rev-parse(1), but when you
> > iterate through refs they should not be surfaced as they _aren't_ refs.
> 
> I actually would not even mind if they are surfaced when iterating with
> --include-root-refs. But then I am a little skeptical of the purpose of
> that feature in the first place. After all, the reason code shoves stuff
> into .git/FOO_HEAD is precisely because we don't want other stuff
> iterating over them, using them for reachability, and so on. That is why
> "--all" does not include them, for example.
> 
> But I did not follow the development of the feature, so maybe I am
> missing some cool use case.

The thing is that once we start to surface pseudorefs (in the sense of
these _really_ aren't refs) in ref-related tooling, users will want to
treat them as a ref, as well. And that's just bound to happen with
plumbing like `git for-each-ref`, where a user may rightfully expect
that all output here can be treated like a normal ref.

In fact though, I want to double down on restrictions regarding the
pseudorefs FETCH_HEAD and MERGE_HEAD. While it's fair enough that those
can be read like a ref, writing to them is a totally different thing. It
does not make any sense to try and write such refs, and our abstractions
aren't even prepared to write them correctly. They go through the ref
backend, and thus the "reftable" backend would write them into the
reftable stack instead of into the filesystem. Now you could argue that
this should be fixed, but I don't think it is reasonable to expect the
reftable backend to start writing loose refs for those pseudorefs.

So I'd really like to stick with the current explanation that we have in
the "special ref" glossary: pseudorefs must be written via the
filesystem and can't ever go through the ref backends.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* [PATCH v2 00/10] Clarify pseudo-ref terminology
  2024-04-29 13:41 [PATCH 0/3] Clarify pseudo-ref terminology Patrick Steinhardt
                   ` (2 preceding siblings ...)
  2024-04-29 13:41 ` [PATCH 3/3] refs: fix segfault in `is_pseudoref()` when ref cannot be resolved Patrick Steinhardt
@ 2024-04-30 12:26 ` Patrick Steinhardt
  2024-04-30 12:26   ` [PATCH v2 01/10] Documentation/glossary: redefine pseudorefs as special refs Patrick Steinhardt
                     ` (9 more replies)
  2024-05-02  8:17 ` [PATCH v3 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
                   ` (2 subsequent siblings)
  6 siblings, 10 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-04-30 12:26 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 2969 bytes --]

Hi,

this is the second version of my patch series that tries to clarify the
pseudoref terminology.

As I have alluded to in my first version of this patch series, I'd
really like to return the pseudoref terminology back to its original
roots. Namely, a pseudoref is not a ref and does not conform to the
format of refs as they may contain additional metadata. With the new
definition, we really only have two pseudorefs: FETCH_HEAD and
MERGE_HEAD.

This has multiple consequences:

  - Pseudorefs are never stored via the ref backend.

  - Pseudorefs can be read via tools like git-rev-parse(1).

  - Pseudorefs are not surfaced by tools like git-for-each-ref(1). They
    are not refs, so a tool that goes through all refs should not
    surface them.

  - Pseudorefs cannot be written via tools like git-update-ref(1). They
    are always written by the respective subsystems that create them via
    the filesystem directly.

  - All other refs in the root hierarchy are just plain refs. They are
    not special. They can be symbolic or regular refs. The only thing of
    note here is a set of restrictions that apply to their naming.

  - Special refs are no more. Or rather, special refs are the new
    pseudorefs.

Overall, this significantly simplifies our whole terminology around
refs that most people didn't really understand in the first place,
including myself. Furthermore, it makes it so that the ref backends
don't need to know about any policy except for what is a proper ref
name. Whether refs should be symbolic or direct refs is higher-level
logic that belongs in the respective subsystems, and the ref backends
should not stand in the way as generic vessels for refs.
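
The consequences listed above can be summed up as a small capability
table. What follows is only a minimal sketch and not code from this
series: `struct ref_capabilities` and `capabilities_for()` are
hypothetical names made up for illustration. The sketch looks at the
name alone and does not check whether the name is valid or whether the
ref exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical summary of what tooling may do with a given name. */
struct ref_capabilities {
	bool readable;        /* resolvable, e.g. via git-rev-parse(1) */
	bool listed;          /* may be surfaced when iterating over refs */
	bool writable;        /* may be updated, e.g. via git-update-ref(1) */
	bool may_be_symbolic; /* may point at another ref */
};

/* Under the new definition only these two names are pseudorefs. */
static bool is_pseudoref_name(const char *refname)
{
	return !strcmp(refname, "FETCH_HEAD") ||
	       !strcmp(refname, "MERGE_HEAD");
}

struct ref_capabilities capabilities_for(const char *refname)
{
	/* Plain refs, including root refs, have no special limitations. */
	struct ref_capabilities caps = { true, true, true, true };

	assert(refname);

	if (is_pseudoref_name(refname)) {
		/* Pseudorefs can be read, but that is all. */
		caps.listed = false;
		caps.writable = false;
		caps.may_be_symbolic = false;
	}

	return caps;
}
```

With this table, "ORIG_HEAD" and "refs/heads/main" come out identical,
which is the point of the new terminology: only the two pseudorefs are
treated differently.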

This patch series makes the necessary changes to our glossary as well as
the code.

Patrick

Patrick Steinhardt (10):
  Documentation/glossary: redefine pseudorefs as special refs
  Documentation/glossary: clarify limitations of pseudorefs
  Documentation/glossary: define root refs as refs
  refs: rename `is_pseudoref()` to `is_root_ref()`
  refs: refname `is_special_ref()` to `is_pseudo_ref()`
  refs: classify HEAD as a root ref
  refs: root refs can be symbolic refs
  refs: pseudorefs are no refs
  ref-filter: properly distinuish pseudo and root refs
  refs: refuse to write pseudorefs

 Documentation/glossary-content.txt |  75 +++++++++---------
 builtin/for-each-ref.c             |   2 +-
 ref-filter.c                       |  16 ++--
 ref-filter.h                       |   4 +-
 refs.c                             | 120 ++++++++++++++++-------------
 refs.h                             |  50 +++++++++++-
 refs/files-backend.c               |   3 +-
 refs/reftable-backend.c            |   3 +-
 t/t5510-fetch.sh                   |   6 +-
 t/t6302-for-each-ref-filter.sh     |  34 ++++++++
 10 files changed, 207 insertions(+), 106 deletions(-)

-- 
2.45.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* [PATCH v2 01/10] Documentation/glossary: redefine pseudorefs as special refs
  2024-04-30 12:26 ` [PATCH v2 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
@ 2024-04-30 12:26   ` Patrick Steinhardt
  2024-04-30 12:49     ` Karthik Nayak
                       ` (2 more replies)
  2024-04-30 12:26   ` [PATCH v2 02/10] Documentation/glossary: clarify limitations of pseudorefs Patrick Steinhardt
                     ` (8 subsequent siblings)
  9 siblings, 3 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-04-30 12:26 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 6065 bytes --]

Nowadays, Git knows about three different kinds of refs. As defined in
gitglossary(7):

  - Regular refs that start with "refs/", like "refs/heads/main".

  - Pseudorefs, which live in the root directory. These must have
    all-caps names and must be a file that starts with an object hash.
    Consequently, symbolic refs are not pseudorefs because they do not
    start with an object hash.

  - Special refs, of which we only have "FETCH_HEAD" and "MERGE_HEAD".

This state is extremely confusing, and I would claim that most folks
don't fully understand what is what here. The current definitions also
have several problems:

  - Where does "HEAD" fit in? It's not a pseudoref because it can be
    a symbolic ref. It's not a regular ref because it does not start
    with "refs/". And it's not a special ref, either.

  - There is a strong overlap between pseudorefs and special refs. The
    pseudoref section for example mentions "MERGE_HEAD", even though it
    is a special ref. Is it thus both a pseudoref and a special ref?

  - Why do we even need to distinguish refs that live in the root from
    other refs when they behave just like a regular ref anyway?

In other words, the current state is quite a mess and leads to wild
inconsistencies without much of a good reason.

The original reason why pseudorefs were introduced is that there are
some refs that sometimes behave like a ref, even though they aren't a
ref. And we really only have two of these nowadays, namely "MERGE_HEAD"
and "FETCH_HEAD". Those files are never written via the ref backends,
but are instead written by git-fetch(1), git-pull(1) and git-merge(1).
They contain additional metadata that highlights where a ref has been
fetched from or the list of commits that have been merged.
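
To make the "additional metadata" point concrete, here is a minimal
sketch of parsing one line of FETCH_HEAD. It assumes the layout
"<hex object ID> TAB <"not-for-merge" or empty> TAB <note>" and object
IDs of 40 or 64 hex digits. `struct fetch_head_entry` and
`parse_fetch_head_line()` are hypothetical names for illustration and
are not part of Git's code.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of a single FETCH_HEAD line. */
struct fetch_head_entry {
	char oid[65];        /* hex object ID, NUL-terminated */
	bool not_for_merge;  /* set when the second field says so */
	const char *note;    /* rest of the line; points into the input */
};

/* Returns true when the line matches the assumed three-field layout. */
bool parse_fetch_head_line(const char *line, struct fetch_head_entry *out)
{
	static const char marker[] = "not-for-merge";
	size_t hexlen = 0;
	const char *field, *tab;

	assert(line && out);

	/* Field 1: the object ID, terminated by a tab. */
	while (isxdigit((unsigned char)line[hexlen]))
		hexlen++;
	if ((hexlen != 40 && hexlen != 64) || line[hexlen] != '\t')
		return false;
	memcpy(out->oid, line, hexlen);
	out->oid[hexlen] = '\0';

	/* Field 2: either empty or the "not-for-merge" marker. */
	field = line + hexlen + 1;
	tab = strchr(field, '\t');
	if (!tab)
		return false;
	out->not_for_merge = (size_t)(tab - field) == strlen(marker) &&
			     !strncmp(field, marker, strlen(marker));

	/* Field 3: free-form note describing where the object came from. */
	out->note = tab + 1;
	return true;
}
```

A plain ref stores nothing but an object ID or the name of another ref,
so a line like this cannot be represented in a generic ref backend
without losing the second and third fields.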

This original intent in fact matches the definition of special refs that
we have recently introduced in 8df4c5d205 (Documentation: add "special
refs" to the glossary, 2024-01-19). Due to the introduction of the new
reftable backend we were forced to distinguish those refs more clearly
such that we don't ever try to read or write them via the reftable
backend. In the same series, we also addressed all the other cases where
we used to write those special refs via the filesystem directly, thus
circumventing the ref backend, to instead write them via the backends.
Consequently, there are no other refs left anymore which are special.

Let's address this mess and return the pseudoref terminology back to its
original intent: a ref that sometimes behaves like a ref, but which isn't
really a ref because it gets written to the filesystem directly. Or in
other words, let's redefine pseudorefs to match the current definition
of special refs. As special refs and pseudorefs are now the same per
definition, we can drop the "special refs" term again. It's not exposed
to our users and thus they wouldn't ever encounter that term anyway.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/glossary-content.txt | 42 +++++++++---------------------
 1 file changed, 13 insertions(+), 29 deletions(-)

diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index d71b199955..f5c0f49150 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -496,21 +496,19 @@ exclude;;
 	that start with `refs/bisect/`, but might later include other
 	unusual refs.
 
-[[def_pseudoref]]pseudoref::
-	Pseudorefs are a class of files under `$GIT_DIR` which behave
-	like refs for the purposes of rev-parse, but which are treated
-	specially by git.  Pseudorefs both have names that are all-caps,
-	and always start with a line consisting of a
-	<<def_SHA1,SHA-1>> followed by whitespace.  So, HEAD is not a
-	pseudoref, because it is sometimes a symbolic ref.  They might
-	optionally contain some additional data.  `MERGE_HEAD` and
-	`CHERRY_PICK_HEAD` are examples.  Unlike
-	<<def_per_worktree_ref,per-worktree refs>>, these files cannot
-	be symbolic refs, and never have reflogs.  They also cannot be
-	updated through the normal ref update machinery.  Instead,
-	they are updated by directly writing to the files.  However,
-	they can be read as if they were refs, so `git rev-parse
-	MERGE_HEAD` will work.
+[[def_pseudoref]]pseudoref ref::
+	A ref that has different semantics than normal refs. These refs can be
+	accessed via normal Git commands but may not behave the same as a
+	normal ref in some cases.
++
+The following pseudorefs are known to Git:
+
+ - "`FETCH_HEAD`" is written by linkgit:git-fetch[1] or linkgit:git-pull[1]. It
+   may refer to multiple object IDs. Each object ID is annotated with metadata
+   indicating where it was fetched from and its fetch status.
+
+ - "`MERGE_HEAD`" is written by linkgit:git-merge[1] when resolving merge
+   conflicts. It contains all commit IDs which are being merged.
 
 [[def_pull]]pull::
 	Pulling a <<def_branch,branch>> means to <<def_fetch,fetch>> it and
@@ -638,20 +636,6 @@ The most notable example is `HEAD`.
 	An <<def_object,object>> used to temporarily store the contents of a
 	<<def_dirty,dirty>> working directory and the index for future reuse.
 
-[[def_special_ref]]special ref::
-	A ref that has different semantics than normal refs. These refs can be
-	accessed via normal Git commands but may not behave the same as a
-	normal ref in some cases.
-+
-The following special refs are known to Git:
-
- - "`FETCH_HEAD`" is written by linkgit:git-fetch[1] or linkgit:git-pull[1]. It
-   may refer to multiple object IDs. Each object ID is annotated with metadata
-   indicating where it was fetched from and its fetch status.
-
- - "`MERGE_HEAD`" is written by linkgit:git-merge[1] when resolving merge
-   conflicts. It contains all commit IDs which are being merged.
-
 [[def_submodule]]submodule::
 	A <<def_repository,repository>> that holds the history of a
 	separate project inside another repository (the latter of
-- 
2.45.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v2 02/10] Documentation/glossary: clarify limitations of pseudorefs
  2024-04-30 12:26 ` [PATCH v2 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
  2024-04-30 12:26   ` [PATCH v2 01/10] Documentation/glossary: redefine pseudorefs as special refs Patrick Steinhardt
@ 2024-04-30 12:26   ` Patrick Steinhardt
  2024-04-30 13:35     ` Kristoffer Haugsbakk
  2024-04-30 12:26   ` [PATCH v2 03/10] Documentation/glossary: define root refs as refs Patrick Steinhardt
                     ` (7 subsequent siblings)
  9 siblings, 1 reply; 93+ messages in thread
From: Patrick Steinhardt @ 2024-04-30 12:26 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 1126 bytes --]

Clarify limitations that pseudorefs have:

  - They can be read via git-rev-parse(1) and similar tools.

  - They are not surfaced when iterating through refs, like when using
    git-for-each-ref(1). They are not refs, so iterating through refs
    should not surface them.

  - They cannot be written via git-update-ref(1) and related commands.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/glossary-content.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index f5c0f49150..13e1aa63ab 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -498,8 +498,8 @@ exclude;;
 
 [[def_pseudoref]]pseudoref ref::
 	A ref that has different semantics than normal refs. These refs can be
-	accessed via normal Git commands but may not behave the same as a
-	normal ref in some cases.
+	read via normal Git commands, but cannot be written to by commands like
+	linkgit:git-update-ref[1].
 +
 The following pseudorefs are known to Git:
 
-- 
2.45.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v2 03/10] Documentation/glossary: define root refs as refs
  2024-04-30 12:26 ` [PATCH v2 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
  2024-04-30 12:26   ` [PATCH v2 01/10] Documentation/glossary: redefine pseudorefs as special refs Patrick Steinhardt
  2024-04-30 12:26   ` [PATCH v2 02/10] Documentation/glossary: clarify limitations of pseudorefs Patrick Steinhardt
@ 2024-04-30 12:26   ` Patrick Steinhardt
  2024-04-30 12:56     ` Karthik Nayak
  2024-04-30 12:26   ` [PATCH v2 04/10] refs: rename `is_pseudoref()` to `is_root_ref()` Patrick Steinhardt
                     ` (6 subsequent siblings)
  9 siblings, 1 reply; 93+ messages in thread
From: Patrick Steinhardt @ 2024-04-30 12:26 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 2738 bytes --]

Except for the pseudorefs MERGE_HEAD and FETCH_HEAD, all refs that live
in the root of the ref hierarchy behave the exact same as normal refs.
They can be symbolic refs or direct refs and can be read, iterated over
and written via normal tooling. All of these refs are stored in the ref
backends, which further demonstrates that they are just normal refs.

Extend the definition of "ref" to also cover such root refs. The only
additional restriction for root refs is that they must conform to a
specific naming schema.
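
As a rough illustration of that naming schema, the rules could be
checked as sketched below. This is not the implementation in refs.c:
`is_valid_root_ref_name()` and `has_suffix()` are hypothetical helpers,
and the check is purely syntactic, so it does not look at whether the
ref actually exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: does `s` end with `suffix`? */
static bool has_suffix(const char *s, const char *suffix)
{
	size_t slen = strlen(s), suffixlen = strlen(suffix);

	return slen >= suffixlen && !strcmp(s + slen - suffixlen, suffix);
}

bool is_valid_root_ref_name(const char *name)
{
	/* Irregular names that are grandfathered in for historic reasons. */
	static const char *const irregular[] = {
		"AUTO_MERGE",
		"BISECT_EXPECTED_REV",
		"NOTES_MERGE_PARTIAL",
		"NOTES_MERGE_REF",
		"MERGE_AUTOSTASH",
	};
	const char *c;
	size_t i;

	assert(name);

	if (!*name)
		return false;

	/* Rule 1: only upper-case characters or underscores. */
	for (c = name; *c; c++)
		if (!(*c >= 'A' && *c <= 'Z') && *c != '_')
			return false;

	/* Rule 2: the name is "HEAD" or ends with "_HEAD". */
	if (!strcmp(name, "HEAD") || has_suffix(name, "_HEAD"))
		return true;

	/* Otherwise the name must be one of the known exceptions. */
	for (i = 0; i < sizeof(irregular) / sizeof(irregular[0]); i++)
		if (!strcmp(name, irregular[i]))
			return true;

	return false;
}
```

Note that "FETCH_HEAD" and "MERGE_HEAD" pass this purely syntactic
check as well. They are excluded from being refs because they are
pseudorefs, not because of their names.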

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/glossary-content.txt | 33 +++++++++++++++++++++++-------
 1 file changed, 26 insertions(+), 7 deletions(-)

diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index 13e1aa63ab..683b727349 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -550,20 +550,39 @@ The following pseudorefs are known to Git:
 	to the result.
 
 [[def_ref]]ref::
-	A name that begins with `refs/` (e.g. `refs/heads/master`)
-	that points to an <<def_object_name,object name>> or another
-	ref (the latter is called a <<def_symref,symbolic ref>>).
+	A name that points to an <<def_object_name,object name>> or
+	another ref (the latter is called a <<def_symref,symbolic ref>>).
 	For convenience, a ref can sometimes be abbreviated when used
 	as an argument to a Git command; see linkgit:gitrevisions[7]
 	for details.
 	Refs are stored in the <<def_repository,repository>>.
 +
 The ref namespace is hierarchical.
-Different subhierarchies are used for different purposes (e.g. the
-`refs/heads/` hierarchy is used to represent local branches).
+Ref names must either start with `refs/` or be located in the root of
+the hierarchy. In that case, their name must conform to the following
+rules:
 +
-There are a few special-purpose refs that do not begin with `refs/`.
-The most notable example is `HEAD`.
+ - The name consists of only upper-case characters or underscores.
+
+ - The name ends with "`_HEAD`" or is equal to "`HEAD`".
++
+There are some irregular refs in the root of the hierarchy that do not
+match these rules. The following list is exhaustive and shall not be
+extended in the future:
++
+ - AUTO_MERGE
+
+ - BISECT_EXPECTED_REV
+
+ - NOTES_MERGE_PARTIAL
+
+ - NOTES_MERGE_REF
+
+ - MERGE_AUTOSTASH
++
+Different subhierarchies are used for different purposes. For example,
+the `refs/heads/` hierarchy is used to represent local branches whereas
+the `refs/tags/` hierarchy is used to represent local tags.
 
 [[def_reflog]]reflog::
 	A reflog shows the local "history" of a ref.  In other words,
-- 
2.45.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v2 04/10] refs: rename `is_pseudoref()` to `is_root_ref()`
  2024-04-30 12:26 ` [PATCH v2 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2024-04-30 12:26   ` [PATCH v2 03/10] Documentation/glossary: define root refs as refs Patrick Steinhardt
@ 2024-04-30 12:26   ` Patrick Steinhardt
  2024-04-30 20:20     ` Junio C Hamano
  2024-04-30 12:26   ` [PATCH v2 05/10] refs: refname `is_special_ref()` to `is_pseudo_ref()` Patrick Steinhardt
                     ` (5 subsequent siblings)
  9 siblings, 1 reply; 93+ messages in thread
From: Patrick Steinhardt @ 2024-04-30 12:26 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 5028 bytes --]

Rename `is_pseudoref()` to `is_root_ref()` to adapt to the newly defined
terminology in our gitglossary(7).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 ref-filter.c            |  2 +-
 refs.c                  | 14 +++++++-------
 refs.h                  | 28 +++++++++++++++++++++++++++-
 refs/files-backend.c    |  2 +-
 refs/reftable-backend.c |  2 +-
 5 files changed, 37 insertions(+), 11 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 59ad6f54dd..361beb6619 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2756,7 +2756,7 @@ static int ref_kind_from_refname(const char *refname)
 			return ref_kind[i].kind;
 	}
 
-	if (is_pseudoref(get_main_ref_store(the_repository), refname))
+	if (is_root_ref(get_main_ref_store(the_repository), refname))
 		return FILTER_REFS_PSEUDOREFS;
 
 	return FILTER_REFS_OTHERS;
diff --git a/refs.c b/refs.c
index 55d2e0b2cb..0a4acde3ca 100644
--- a/refs.c
+++ b/refs.c
@@ -844,7 +844,7 @@ int is_per_worktree_ref(const char *refname)
 	       starts_with(refname, "refs/rewritten/");
 }
 
-static int is_pseudoref_syntax(const char *refname)
+static int is_root_ref_syntax(const char *refname)
 {
 	const char *c;
 
@@ -860,9 +860,9 @@ static int is_pseudoref_syntax(const char *refname)
 	return 1;
 }
 
-int is_pseudoref(struct ref_store *refs, const char *refname)
+int is_root_ref(struct ref_store *refs, const char *refname)
 {
-	static const char *const irregular_pseudorefs[] = {
+	static const char *const irregular_root_refs[] = {
 		"AUTO_MERGE",
 		"BISECT_EXPECTED_REV",
 		"NOTES_MERGE_PARTIAL",
@@ -872,7 +872,7 @@ int is_pseudoref(struct ref_store *refs, const char *refname)
 	struct object_id oid;
 	size_t i;
 
-	if (!is_pseudoref_syntax(refname))
+	if (!is_root_ref_syntax(refname))
 		return 0;
 
 	if (ends_with(refname, "_HEAD")) {
@@ -882,8 +882,8 @@ int is_pseudoref(struct ref_store *refs, const char *refname)
 		return !is_null_oid(&oid);
 	}
 
-	for (i = 0; i < ARRAY_SIZE(irregular_pseudorefs); i++)
-		if (!strcmp(refname, irregular_pseudorefs[i])) {
+	for (i = 0; i < ARRAY_SIZE(irregular_root_refs); i++)
+		if (!strcmp(refname, irregular_root_refs[i])) {
 			refs_resolve_ref_unsafe(refs, refname,
 						RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
 						&oid, NULL);
@@ -902,7 +902,7 @@ int is_headref(struct ref_store *refs, const char *refname)
 }
 
 static int is_current_worktree_ref(const char *ref) {
-	return is_pseudoref_syntax(ref) || is_per_worktree_ref(ref);
+	return is_root_ref_syntax(ref) || is_per_worktree_ref(ref);
 }
 
 enum ref_worktree_type parse_worktree_ref(const char *maybe_worktree_ref,
diff --git a/refs.h b/refs.h
index d278775e08..d0374c3275 100644
--- a/refs.h
+++ b/refs.h
@@ -1051,7 +1051,33 @@ extern struct ref_namespace_info ref_namespace[NAMESPACE__COUNT];
  */
 void update_ref_namespace(enum ref_namespace namespace, char *ref);
 
-int is_pseudoref(struct ref_store *refs, const char *refname);
+/*
+ * Check whether the reference is an existing root reference.
+ *
+ * A root ref is a reference that lives in the root of the reference hierarchy.
+ * These references must conform to special syntax:
+ *
+ *   - Their name must be all-uppercase or underscores ("_").
+ *
+ *   - Their name must end with "_HEAD".
+ *
+ *   - Their name may not contain a slash.
+ *
+ * There is a special set of irregular root refs that exist due to historic
+ * reasons, only. This list shall not be expanded in the future:
+ *
+ *   - AUTO_MERGE
+ *
+ *   - BISECT_EXPECTED_REV
+ *
+ *   - NOTES_MERGE_PARTIAL
+ *
+ *   - NOTES_MERGE_REF
+ *
+ *   - MERGE_AUTOSTASH
+ */
+int is_root_ref(struct ref_store *refs, const char *refname);
+
 int is_headref(struct ref_store *refs, const char *refname);
 
 #endif /* REFS_H */
diff --git a/refs/files-backend.c b/refs/files-backend.c
index a098d14ea0..0fcb601444 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -351,7 +351,7 @@ static void add_pseudoref_and_head_entries(struct ref_store *ref_store,
 		strbuf_addstr(&refname, de->d_name);
 
 		dtype = get_dtype(de, &path, 1);
-		if (dtype == DT_REG && (is_pseudoref(ref_store, de->d_name) ||
+		if (dtype == DT_REG && (is_root_ref(ref_store, de->d_name) ||
 								is_headref(ref_store, de->d_name)))
 			loose_fill_ref_dir_regular_file(refs, refname.buf, dir);
 
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 1cda48c504..5a5e64fe69 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -356,7 +356,7 @@ static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator)
 		 */
 		if (!starts_with(iter->ref.refname, "refs/") &&
 		    !(iter->flags & DO_FOR_EACH_INCLUDE_ROOT_REFS &&
-		     (is_pseudoref(&iter->refs->base, iter->ref.refname) ||
+		     (is_root_ref(&iter->refs->base, iter->ref.refname) ||
 		      is_headref(&iter->refs->base, iter->ref.refname)))) {
 			continue;
 		}
-- 
2.45.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v2 05/10] refs: refname `is_special_ref()` to `is_pseudo_ref()`
  2024-04-30 12:26 ` [PATCH v2 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2024-04-30 12:26   ` [PATCH v2 04/10] refs: rename `is_pseudoref()` to `is_root_ref()` Patrick Steinhardt
@ 2024-04-30 12:26   ` Patrick Steinhardt
  2024-04-30 12:58     ` Karthik Nayak
  2024-04-30 12:26   ` [PATCH v2 06/10] refs: classify HEAD as a root ref Patrick Steinhardt
                     ` (4 subsequent siblings)
  9 siblings, 1 reply; 93+ messages in thread
From: Patrick Steinhardt @ 2024-04-30 12:26 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 2585 bytes --]

Rename `is_special_ref()` to `is_pseudo_ref()` to adapt to the newly
defined terminology in our gitglossary(7). Note that in the preceding
commit we have just renamed `is_pseudoref()` to `is_root_ref()`, where
there may be confusion for in-flight patch series that add new calls to
`is_pseudoref()`. In order to intentionally break such patch series we
have thus picked `is_pseudo_ref()` instead of `is_pseudoref()` as the
new name.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/refs.c b/refs.c
index 0a4acde3ca..6266f77474 100644
--- a/refs.c
+++ b/refs.c
@@ -1876,13 +1876,13 @@ static int refs_read_special_head(struct ref_store *ref_store,
 	return result;
 }
 
-static int is_special_ref(const char *refname)
+static int is_pseudo_ref(const char *refname)
 {
 	/*
-	 * Special references are refs that have different semantics compared
-	 * to "normal" refs. These refs can thus not be stored in the ref
-	 * backend, but must always be accessed via the filesystem. The
-	 * following refs are special:
+	 * Pseudorefs are refs that have different semantics compared to
+	 * "normal" refs. These refs can thus not be stored in the ref backend,
+	 * but must always be accessed via the filesystem. The following refs
+	 * are pseudorefs:
 	 *
 	 * - FETCH_HEAD may contain multiple object IDs, and each one of them
 	 *   carries additional metadata like where it came from.
@@ -1891,17 +1891,17 @@ static int is_special_ref(const char *refname)
 	 *   heads.
 	 *
 	 * Reading, writing or deleting references must consistently go either
-	 * through the filesystem (special refs) or through the reference
+	 * through the filesystem (pseudorefs) or through the reference
 	 * backend (normal ones).
 	 */
-	static const char * const special_refs[] = {
+	static const char * const pseudo_refs[] = {
 		"FETCH_HEAD",
 		"MERGE_HEAD",
 	};
 	size_t i;
 
-	for (i = 0; i < ARRAY_SIZE(special_refs); i++)
-		if (!strcmp(refname, special_refs[i]))
+	for (i = 0; i < ARRAY_SIZE(pseudo_refs); i++)
+		if (!strcmp(refname, pseudo_refs[i]))
 			return 1;
 
 	return 0;
@@ -1912,7 +1912,7 @@ int refs_read_raw_ref(struct ref_store *ref_store, const char *refname,
 		      unsigned int *type, int *failure_errno)
 {
 	assert(failure_errno);
-	if (is_special_ref(refname))
+	if (is_pseudo_ref(refname))
 		return refs_read_special_head(ref_store, refname, oid, referent,
 					      type, failure_errno);
 
-- 
2.45.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v2 06/10] refs: classify HEAD as a root ref
  2024-04-30 12:26 ` [PATCH v2 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
                     ` (4 preceding siblings ...)
  2024-04-30 12:26   ` [PATCH v2 05/10] refs: refname `is_special_ref()` to `is_pseudo_ref()` Patrick Steinhardt
@ 2024-04-30 12:26   ` Patrick Steinhardt
  2024-04-30 12:26   ` [PATCH v2 07/10] refs: root refs can be symbolic refs Patrick Steinhardt
                     ` (3 subsequent siblings)
  9 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-04-30 12:26 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 3168 bytes --]

Root refs are those refs that live in the root of the ref hierarchy.
Our old and venerable "HEAD" reference falls into this category, but we
don't yet classify it as such in `is_root_ref()`.

Adapt the function to also treat "HEAD" as a root ref. This change is
safe to do for all current callers:

- `ref_kind_from_refname()` already handles "HEAD" explicitly before
  calling `is_root_ref()`.

- The "files" and "reftable" backends explicitly called both
  `is_root_ref()` and `is_headref()`.

This change should thus essentially be a no-op.
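
To illustrate why callers can drop the separate `is_headref()` check, here
is a minimal standalone sketch. It is not the code from refs.c: the
`toy_*` names and the `struct toy_store` stand-in are hypothetical, and
the model ignores the syntax check and the irregular root refs. It only
shows that folding the "HEAD" case into the root-ref predicate gives the
same answer as the old `is_root_ref() || is_headref()` pattern.

```c
#include <string.h>

/* Hypothetical stand-in for a ref store: a list of ref names that exist. */
struct toy_store {
	const char *const *names;
	size_t nr;
};

static int toy_exists(const struct toy_store *s, const char *refname)
{
	for (size_t i = 0; i < s->nr; i++)
		if (!strcmp(s->names[i], refname))
			return 1;
	return 0;
}

static int toy_ends_with(const char *str, const char *suffix)
{
	size_t len = strlen(str), suffix_len = strlen(suffix);
	return len >= suffix_len && !strcmp(str + len - suffix_len, suffix);
}

/* Plays the role of is_headref(): the name is "HEAD" and the ref exists. */
static int toy_is_headref(const struct toy_store *s, const char *refname)
{
	return !strcmp(refname, "HEAD") && toy_exists(s, refname);
}

/* Old behaviour (simplified): only existing "*_HEAD" names qualify. */
static int toy_is_root_ref_old(const struct toy_store *s, const char *refname)
{
	return toy_ends_with(refname, "_HEAD") && toy_exists(s, refname);
}

/* New behaviour (simplified): "HEAD" is handled inside the predicate. */
static int toy_is_root_ref_new(const struct toy_store *s, const char *refname)
{
	if (toy_is_headref(s, refname))
		return 1;
	return toy_is_root_ref_old(s, refname);
}
```

With this model, `toy_is_root_ref_new(s, name)` equals
`toy_is_root_ref_old(s, name) || toy_is_headref(s, name)` for every name,
which is the property the two backend call sites rely on.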

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                  | 2 ++
 refs.h                  | 6 +++++-
 refs/files-backend.c    | 3 +--
 refs/reftable-backend.c | 3 +--
 4 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/refs.c b/refs.c
index 6266f77474..5b89e83ad7 100644
--- a/refs.c
+++ b/refs.c
@@ -874,6 +874,8 @@ int is_root_ref(struct ref_store *refs, const char *refname)
 
 	if (!is_root_ref_syntax(refname))
 		return 0;
+	if (is_headref(refs, refname))
+		return 1;
 
 	if (ends_with(refname, "_HEAD")) {
 		refs_resolve_ref_unsafe(refs, refname,
diff --git a/refs.h b/refs.h
index d0374c3275..4ac454b0c3 100644
--- a/refs.h
+++ b/refs.h
@@ -1059,7 +1059,8 @@ void update_ref_namespace(enum ref_namespace namespace, char *ref);
  *
  *   - Their name must be all-uppercase or underscores ("_").
  *
- *   - Their name must end with "_HEAD".
+ *   - Their name must end with "_HEAD". As a special rule, "HEAD" is a root
+ *     ref, as well.
  *
  *   - Their name may not contain a slash.
  *
@@ -1078,6 +1079,9 @@ void update_ref_namespace(enum ref_namespace namespace, char *ref);
  */
 int is_root_ref(struct ref_store *refs, const char *refname);
 
+/*
+ * Check whether the reference is "HEAD" and whether it exists.
+ */
 int is_headref(struct ref_store *refs, const char *refname);
 
 #endif /* REFS_H */
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 0fcb601444..ea927c516d 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -351,8 +351,7 @@ static void add_pseudoref_and_head_entries(struct ref_store *ref_store,
 		strbuf_addstr(&refname, de->d_name);
 
 		dtype = get_dtype(de, &path, 1);
-		if (dtype == DT_REG && (is_root_ref(ref_store, de->d_name) ||
-								is_headref(ref_store, de->d_name)))
+		if (dtype == DT_REG && is_root_ref(ref_store, de->d_name))
 			loose_fill_ref_dir_regular_file(refs, refname.buf, dir);
 
 		strbuf_setlen(&refname, dirnamelen);
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 5a5e64fe69..41555fcf64 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -356,8 +356,7 @@ static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator)
 		 */
 		if (!starts_with(iter->ref.refname, "refs/") &&
 		    !(iter->flags & DO_FOR_EACH_INCLUDE_ROOT_REFS &&
-		     (is_root_ref(&iter->refs->base, iter->ref.refname) ||
-		      is_headref(&iter->refs->base, iter->ref.refname)))) {
+		      is_root_ref(&iter->refs->base, iter->ref.refname))) {
 			continue;
 		}
 
-- 
2.45.0



* [PATCH v2 07/10] refs: root refs can be symbolic refs
  2024-04-30 12:26 ` [PATCH v2 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
                     ` (5 preceding siblings ...)
  2024-04-30 12:26   ` [PATCH v2 06/10] refs: classify HEAD as a root ref Patrick Steinhardt
@ 2024-04-30 12:26   ` Patrick Steinhardt
  2024-04-30 17:09     ` Justin Tobler
  2024-04-30 12:26   ` [PATCH v2 08/10] refs: pseudorefs are no refs Patrick Steinhardt
                     ` (2 subsequent siblings)
  9 siblings, 1 reply; 93+ messages in thread
From: Patrick Steinhardt @ 2024-04-30 12:26 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 4965 bytes --]

Before this patch series, root refs except for "HEAD" and our special
refs were classified as pseudorefs. Furthermore, our terminology
clarified that pseudorefs must not be symbolic refs. This restriction
is enforced in `is_root_ref()`, which explicitly checks that a supposed
root ref resolves to an object ID without recursing.

This has been extremely confusing right from the start because (in old
terminology) a ref name may sometimes be a pseudoref and sometimes not
depending on whether it is a symbolic or regular ref. This behaviour
does not seem reasonable at all and I very much doubt that it results in
anything sane.

Furthermore, the behaviour is different to `is_headref()`, which only
checks for the ref to exist. While that is in line with our glossary,
this inconsistency only adds to the confusion.

Last but not least, the current behaviour can actually lead to a
segfault when calling `is_root_ref()` with a reference that either does
not exist or that is a symbolic ref because we never initialized `oid`.

Let's loosen the restrictions in accordance with the new definition of
root refs, which are simply plain refs that may as well be a symbolic
ref. Consequently, we can just check for the ref to exist instead of
requiring it to be a regular ref.

Add a test that verifies that this does not change user-visible
behaviour. Namely, we still don't want to show broken refs to the user
by default in git-for-each-ref(1). What this does allow though is for
internal callers to surface dangling root refs when they pass in the
`DO_FOR_EACH_INCLUDE_BROKEN` flag.
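
The two points above (the uninitialized `oid` and checking for existence
via the return value) can be sketched in isolation. The following is a toy
model rather than the refs.c code: `struct toy_oid` and the `toy_*`
functions are hypothetical, and the lookup is assumed to leave the
out-parameter untouched on failure, which is the situation that made the
old `is_null_oid(&oid)` check read an indeterminate value.

```c
#include <string.h>

/* Hypothetical object ID type, standing in for `struct object_id`. */
struct toy_oid {
	unsigned char hash[20];
};

static int toy_is_null_oid(const struct toy_oid *oid)
{
	static const struct toy_oid null_oid; /* static storage: all zeroes */
	return !memcmp(oid->hash, null_oid.hash, sizeof(oid->hash));
}

/*
 * Hypothetical raw lookup: returns 0 and fills in `oid` when the ref
 * exists, returns -1 and leaves `oid` untouched otherwise.
 */
static int toy_read_raw_ref(const char *refname, struct toy_oid *oid)
{
	if (strcmp(refname, "ORIG_HEAD"))
		return -1;
	memset(oid->hash, 0xab, sizeof(oid->hash));
	return 0;
}

/*
 * Old style: infer existence from the object ID. This is only
 * well-defined because `oid` is zero-initialized here; without the
 * initializer a failed lookup would leave it indeterminate.
 */
static int toy_exists_via_oid(const char *refname)
{
	struct toy_oid oid = { { 0 } };
	toy_read_raw_ref(refname, &oid);
	return !toy_is_null_oid(&oid);
}

/* New style: existence is simply "the raw read succeeded". */
static int toy_exists_via_return_code(const char *refname)
{
	struct toy_oid oid = { { 0 } };
	return !toy_read_raw_ref(refname, &oid);
}
```

Relying on the return code also means a symbolic ref, whose raw read
succeeds without yielding an object ID, would count as existing in this
model.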

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                         | 50 ++++++++++++++++++++++++----------
 t/t6302-for-each-ref-filter.sh | 17 ++++++++++++
 2 files changed, 53 insertions(+), 14 deletions(-)

diff --git a/refs.c b/refs.c
index 5b89e83ad7..ca9844bc3e 100644
--- a/refs.c
+++ b/refs.c
@@ -869,7 +869,10 @@ int is_root_ref(struct ref_store *refs, const char *refname)
 		"NOTES_MERGE_REF",
 		"MERGE_AUTOSTASH",
 	};
-	struct object_id oid;
+	struct strbuf referent = STRBUF_INIT;
+	struct object_id oid = { 0 };
+	int failure_errno, ret = 0;
+	unsigned int flags;
 	size_t i;
 
 	if (!is_root_ref_syntax(refname))
@@ -877,30 +880,49 @@ int is_root_ref(struct ref_store *refs, const char *refname)
 	if (is_headref(refs, refname))
 		return 1;
 
+	/*
+	 * Note that we cannot use `refs_ref_exists()` here because that also
+	 * checks whether its target ref exists in case refname is a symbolic
+	 * ref.
+	 */
 	if (ends_with(refname, "_HEAD")) {
-		refs_resolve_ref_unsafe(refs, refname,
-					RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
-					&oid, NULL);
-		return !is_null_oid(&oid);
+		ret = !refs_read_raw_ref(refs, refname, &oid, &referent,
+					 &flags, &failure_errno);
+		goto done;
 	}
 
-	for (i = 0; i < ARRAY_SIZE(irregular_root_refs); i++)
+	for (i = 0; i < ARRAY_SIZE(irregular_root_refs); i++) {
 		if (!strcmp(refname, irregular_root_refs[i])) {
-			refs_resolve_ref_unsafe(refs, refname,
-						RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
-						&oid, NULL);
-			return !is_null_oid(&oid);
+			ret = !refs_read_raw_ref(refs, refname, &oid, &referent,
+						 &flags, &failure_errno);
+			goto done;
 		}
+	}
 
-	return 0;
+done:
+	strbuf_release(&referent);
+	return ret;
 }
 
 int is_headref(struct ref_store *refs, const char *refname)
 {
-	if (!strcmp(refname, "HEAD"))
-		return refs_ref_exists(refs, refname);
+	struct strbuf referent = STRBUF_INIT;
+	struct object_id oid = { 0 };
+	int failure_errno, ret = 0;
+	unsigned int flags;
 
-	return 0;
+	/*
+	 * Note that we cannot use `refs_ref_exists()` here because that also
+	 * checks whether its target ref exists in case refname is a symbolic
+	 * ref.
+	 */
+	if (!strcmp(refname, "HEAD")) {
+		ret = !refs_read_raw_ref(refs, refname, &oid, &referent,
+					 &flags, &failure_errno);
+	}
+
+	strbuf_release(&referent);
+	return ret;
 }
 
 static int is_current_worktree_ref(const char *ref) {
diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh
index 948f1bb5f4..92ed8957c8 100755
--- a/t/t6302-for-each-ref-filter.sh
+++ b/t/t6302-for-each-ref-filter.sh
@@ -62,6 +62,23 @@ test_expect_success '--include-root-refs with other patterns' '
 	test_cmp expect actual
 '
 
+test_expect_success '--include-root-refs omits dangling symrefs' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		git symbolic-ref DANGLING_HEAD refs/heads/missing &&
+		cat >expect <<-EOF &&
+		HEAD
+		$(git symbolic-ref HEAD)
+		refs/tags/initial
+		EOF
+		git for-each-ref --format="%(refname)" --include-root-refs >actual &&
+		test_cmp expect actual
+	)
+'
+
 test_expect_success 'filtering with --points-at' '
 	cat >expect <<-\EOF &&
 	refs/heads/main
-- 
2.45.0



* [PATCH v2 08/10] refs: pseudorefs are no refs
  2024-04-30 12:26 ` [PATCH v2 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
                     ` (6 preceding siblings ...)
  2024-04-30 12:26   ` [PATCH v2 07/10] refs: root refs can be symbolic refs Patrick Steinhardt
@ 2024-04-30 12:26   ` Patrick Steinhardt
  2024-04-30 12:27   ` [PATCH v2 09/10] ref-filter: properly distinguish pseudo and root refs Patrick Steinhardt
  2024-04-30 12:27   ` [PATCH v2 10/10] refs: refuse to write pseudorefs Patrick Steinhardt
  9 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-04-30 12:26 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 4360 bytes --]

The `is_root_ref()` function will happily classify a pseudoref as a root
ref, even though pseudorefs are no refs. Next to being wrong, it also
leads to inconsistent behaviour across ref backends: while the "files"
backend accidentally knows to parse those pseudorefs and thus yields
them to the caller, the "reftable" backend won't ever see the pseudoref
at all because they are never stored in the "reftable" backend.

Fix this issue by filtering out pseudorefs in `is_root_ref()`.
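
As a rough standalone sketch of that filter: the pseudoref list below is
the one from this series, while the syntax check is an assumption
reconstructed from the comment in refs.h (uppercase letters and
underscores only) and may differ from the real `is_root_ref_syntax()`.
All `toy_*` names are hypothetical, and the existence check that
`is_root_ref()` performs afterwards is left out.

```c
#include <ctype.h>
#include <string.h>

/* The two pseudorefs named in this series. */
static int toy_is_pseudo_ref(const char *refname)
{
	static const char *const pseudo_refs[] = {
		"FETCH_HEAD",
		"MERGE_HEAD",
	};

	for (size_t i = 0; i < sizeof(pseudo_refs) / sizeof(pseudo_refs[0]); i++)
		if (!strcmp(refname, pseudo_refs[i]))
			return 1;
	return 0;
}

/* Assumed syntax rule: only uppercase letters and underscores. */
static int toy_is_root_ref_syntax(const char *refname)
{
	for (const char *c = refname; *c; c++)
		if (!isupper((unsigned char)*c) && *c != '_')
			return 0;
	return 1;
}

/*
 * Name-based part of the root-ref check: the syntax must match and the
 * name must not be a pseudoref.
 */
static int toy_may_be_root_ref(const char *refname)
{
	return toy_is_root_ref_syntax(refname) && !toy_is_pseudo_ref(refname);
}
```

Because the pseudoref test depends only on the name, it gives the same
answer for the "files" and "reftable" backends.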

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                         | 65 +++++++++++++++++-----------------
 t/t6302-for-each-ref-filter.sh | 17 +++++++++
 2 files changed, 50 insertions(+), 32 deletions(-)

diff --git a/refs.c b/refs.c
index ca9844bc3e..dec9dbdc2d 100644
--- a/refs.c
+++ b/refs.c
@@ -844,6 +844,37 @@ int is_per_worktree_ref(const char *refname)
 	       starts_with(refname, "refs/rewritten/");
 }
 
+static int is_pseudo_ref(const char *refname)
+{
+	/*
+	 * Pseudorefs are refs that have different semantics compared to
+	 * "normal" refs. These refs can thus not be stored in the ref backend,
+	 * but must always be accessed via the filesystem. The following refs
+	 * are pseudorefs:
+	 *
+	 * - FETCH_HEAD may contain multiple object IDs, and each one of them
+	 *   carries additional metadata like where it came from.
+	 *
+	 * - MERGE_HEAD may contain multiple object IDs when merging multiple
+	 *   heads.
+	 *
+	 * Reading, writing or deleting references must consistently go either
+	 * through the filesystem (pseudorefs) or through the reference
+	 * backend (normal ones).
+	 */
+	static const char * const pseudo_refs[] = {
+		"FETCH_HEAD",
+		"MERGE_HEAD",
+	};
+	size_t i;
+
+	for (i = 0; i < ARRAY_SIZE(pseudo_refs); i++)
+		if (!strcmp(refname, pseudo_refs[i]))
+			return 1;
+
+	return 0;
+}
+
 static int is_root_ref_syntax(const char *refname)
 {
 	const char *c;
@@ -875,7 +906,8 @@ int is_root_ref(struct ref_store *refs, const char *refname)
 	unsigned int flags;
 	size_t i;
 
-	if (!is_root_ref_syntax(refname))
+	if (!is_root_ref_syntax(refname) ||
+	    is_pseudo_ref(refname))
 		return 0;
 	if (is_headref(refs, refname))
 		return 1;
@@ -1900,37 +1932,6 @@ static int refs_read_special_head(struct ref_store *ref_store,
 	return result;
 }
 
-static int is_pseudo_ref(const char *refname)
-{
-	/*
-	 * Pseudorefs are refs that have different semantics compared to
-	 * "normal" refs. These refs can thus not be stored in the ref backend,
-	 * but must always be accessed via the filesystem. The following refs
-	 * are pseudorefs:
-	 *
-	 * - FETCH_HEAD may contain multiple object IDs, and each one of them
-	 *   carries additional metadata like where it came from.
-	 *
-	 * - MERGE_HEAD may contain multiple object IDs when merging multiple
-	 *   heads.
-	 *
-	 * Reading, writing or deleting references must consistently go either
-	 * through the filesystem (pseudorefs) or through the reference
-	 * backend (normal ones).
-	 */
-	static const char * const pseudo_refs[] = {
-		"FETCH_HEAD",
-		"MERGE_HEAD",
-	};
-	size_t i;
-
-	for (i = 0; i < ARRAY_SIZE(pseudo_refs); i++)
-		if (!strcmp(refname, pseudo_refs[i]))
-			return 1;
-
-	return 0;
-}
-
 int refs_read_raw_ref(struct ref_store *ref_store, const char *refname,
 		      struct object_id *oid, struct strbuf *referent,
 		      unsigned int *type, int *failure_errno)
diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh
index 92ed8957c8..163c378cfd 100755
--- a/t/t6302-for-each-ref-filter.sh
+++ b/t/t6302-for-each-ref-filter.sh
@@ -52,6 +52,23 @@ test_expect_success '--include-root-refs pattern prints pseudorefs' '
 	test_cmp expect actual
 '
 
+test_expect_success '--include-root-refs pattern does not print special refs' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		git rev-parse HEAD >.git/MERGE_HEAD &&
+		git for-each-ref --format="%(refname)" --include-root-refs >actual &&
+		cat >expect <<-EOF &&
+		HEAD
+		$(git symbolic-ref HEAD)
+		refs/tags/initial
+		EOF
+		test_cmp expect actual
+	)
+'
+
 test_expect_success '--include-root-refs with other patterns' '
 	cat >expect <<-\EOF &&
 	HEAD
-- 
2.45.0



* [PATCH v2 09/10] ref-filter: properly distinguish pseudo and root refs
  2024-04-30 12:26 ` [PATCH v2 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
                     ` (7 preceding siblings ...)
  2024-04-30 12:26   ` [PATCH v2 08/10] refs: pseudorefs are no refs Patrick Steinhardt
@ 2024-04-30 12:27   ` Patrick Steinhardt
  2024-04-30 13:11     ` Karthik Nayak
  2024-04-30 12:27   ` [PATCH v2 10/10] refs: refuse to write pseudorefs Patrick Steinhardt
  9 siblings, 1 reply; 93+ messages in thread
From: Patrick Steinhardt @ 2024-04-30 12:27 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 5906 bytes --]

The ref-filter interfaces currently define root refs as either a
detached HEAD or a pseudo ref. Pseudo refs aren't root refs though, so
let's properly distinguish those ref types.
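
A small sketch of what the distinction means at the flag level. The bit
values are the ones visible in the ref-filter.h hunk of this patch;
everything prefixed with `TOY_`/`toy_` is hypothetical, and the root-ref
test is reduced to a "_HEAD" suffix match without the existence and
syntax checks done by `is_root_ref()`.

```c
#include <string.h>

/* Bit values as they appear in the ref-filter.h hunk of this patch. */
#define TOY_DETACHED_HEAD 0x0020u
#define TOY_PSEUDOREFS    0x0040u
#define TOY_ROOT_REFS_OLD (TOY_DETACHED_HEAD | TOY_PSEUDOREFS)
#define TOY_ROOT_REFS_NEW 0x0080u

static int toy_is_pseudo_ref(const char *refname)
{
	return !strcmp(refname, "FETCH_HEAD") || !strcmp(refname, "MERGE_HEAD");
}

/* Simplified: any name ending in "_HEAD" is treated as a root ref. */
static int toy_looks_like_root_ref(const char *refname)
{
	size_t len = strlen(refname);
	return len >= 5 && !strcmp(refname + len - 5, "_HEAD");
}

/*
 * Same ordering as the patched ref_kind_from_refname(): pseudorefs are
 * checked first so that they never get the root-ref kind.
 */
static unsigned int toy_ref_kind(const char *refname)
{
	if (toy_is_pseudo_ref(refname))
		return TOY_PSEUDOREFS;
	if (toy_looks_like_root_ref(refname))
		return TOY_ROOT_REFS_NEW;
	return 0; /* neither kind; the real code returns FILTER_REFS_OTHERS */
}
```

With the old composite definition, `TOY_PSEUDOREFS & TOY_ROOT_REFS_OLD` is
non-zero, so a filter asking for root refs also matched pseudorefs. With a
dedicated bit the two kinds no longer overlap.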

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/for-each-ref.c |  2 +-
 ref-filter.c           | 16 +++++++++-------
 ref-filter.h           |  4 ++--
 refs.c                 | 18 +-----------------
 refs.h                 | 18 ++++++++++++++++++
 5 files changed, 31 insertions(+), 27 deletions(-)

diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
index 919282e12a..5517a4a1c0 100644
--- a/builtin/for-each-ref.c
+++ b/builtin/for-each-ref.c
@@ -98,7 +98,7 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
 	}
 
 	if (include_root_refs)
-		flags |= FILTER_REFS_ROOT_REFS;
+		flags |= FILTER_REFS_ROOT_REFS | FILTER_REFS_DETACHED_HEAD;
 
 	filter.match_as_path = 1;
 	filter_and_format_refs(&filter, flags, sorting, &format);
diff --git a/ref-filter.c b/ref-filter.c
index 361beb6619..d72113edfe 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2628,7 +2628,7 @@ static int for_each_fullref_in_pattern(struct ref_filter *filter,
 				       each_ref_fn cb,
 				       void *cb_data)
 {
-	if (filter->kind == FILTER_REFS_KIND_MASK) {
+	if (filter->kind & FILTER_REFS_ROOT_REFS) {
 		/* In this case, we want to print all refs including root refs. */
 		return refs_for_each_include_root_refs(get_main_ref_store(the_repository),
 						       cb, cb_data);
@@ -2756,8 +2756,10 @@ static int ref_kind_from_refname(const char *refname)
 			return ref_kind[i].kind;
 	}
 
-	if (is_root_ref(get_main_ref_store(the_repository), refname))
+	if (is_pseudo_ref(refname))
 		return FILTER_REFS_PSEUDOREFS;
+	if (is_root_ref(get_main_ref_store(the_repository), refname))
+		return FILTER_REFS_ROOT_REFS;
 
 	return FILTER_REFS_OTHERS;
 }
@@ -2794,11 +2796,11 @@ static struct ref_array_item *apply_ref_filter(const char *refname, const struct
 	/*
 	 * Generally HEAD refs are printed with special description denoting a rebase,
 	 * detached state and so forth. This is useful when only printing the HEAD ref
-	 * But when it is being printed along with other pseudorefs, it makes sense to
-	 * keep the formatting consistent. So we mask the type to act like a pseudoref.
+	 * But when it is being printed along with other root refs, it makes sense to
+	 * keep the formatting consistent. So we mask the type to act like a root ref.
 	 */
-	if (filter->kind == FILTER_REFS_KIND_MASK && kind == FILTER_REFS_DETACHED_HEAD)
-		kind = FILTER_REFS_PSEUDOREFS;
+	if (filter->kind & FILTER_REFS_ROOT_REFS && kind == FILTER_REFS_DETACHED_HEAD)
+		kind = FILTER_REFS_ROOT_REFS;
 	else if (!(kind & filter->kind))
 		return NULL;
 
@@ -3072,7 +3074,7 @@ static int do_filter_refs(struct ref_filter *filter, unsigned int type, each_ref
 		 * When printing all ref types, HEAD is already included,
 		 * so we don't want to print HEAD again.
 		 */
-		if (!ret && (filter->kind != FILTER_REFS_KIND_MASK) &&
+		if (!ret && !(filter->kind & FILTER_REFS_ROOT_REFS) &&
 		    (filter->kind & FILTER_REFS_DETACHED_HEAD))
 			head_ref(fn, cb_data);
 	}
diff --git a/ref-filter.h b/ref-filter.h
index 0ca28d2bba..27ae1aa0d1 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -23,9 +23,9 @@
 				    FILTER_REFS_REMOTES | FILTER_REFS_OTHERS)
 #define FILTER_REFS_DETACHED_HEAD  0x0020
 #define FILTER_REFS_PSEUDOREFS     0x0040
-#define FILTER_REFS_ROOT_REFS      (FILTER_REFS_DETACHED_HEAD | FILTER_REFS_PSEUDOREFS)
+#define FILTER_REFS_ROOT_REFS      0x0080
 #define FILTER_REFS_KIND_MASK      (FILTER_REFS_REGULAR | FILTER_REFS_DETACHED_HEAD | \
-				    FILTER_REFS_PSEUDOREFS)
+				    FILTER_REFS_PSEUDOREFS | FILTER_REFS_ROOT_REFS)
 
 struct atom_value;
 struct ref_sorting;
diff --git a/refs.c b/refs.c
index dec9dbdc2d..50d679b7e7 100644
--- a/refs.c
+++ b/refs.c
@@ -844,24 +844,8 @@ int is_per_worktree_ref(const char *refname)
 	       starts_with(refname, "refs/rewritten/");
 }
 
-static int is_pseudo_ref(const char *refname)
+int is_pseudo_ref(const char *refname)
 {
-	/*
-	 * Pseudorefs are refs that have different semantics compared to
-	 * "normal" refs. These refs can thus not be stored in the ref backend,
-	 * but must always be accessed via the filesystem. The following refs
-	 * are pseudorefs:
-	 *
-	 * - FETCH_HEAD may contain multiple object IDs, and each one of them
-	 *   carries additional metadata like where it came from.
-	 *
-	 * - MERGE_HEAD may contain multiple object IDs when merging multiple
-	 *   heads.
-	 *
-	 * Reading, writing or deleting references must consistently go either
-	 * through the filesystem (pseudorefs) or through the reference
-	 * backend (normal ones).
-	 */
 	static const char * const pseudo_refs[] = {
 		"FETCH_HEAD",
 		"MERGE_HEAD",
diff --git a/refs.h b/refs.h
index 4ac454b0c3..8255989e7e 100644
--- a/refs.h
+++ b/refs.h
@@ -1084,4 +1084,22 @@ int is_root_ref(struct ref_store *refs, const char *refname);
  */
 int is_headref(struct ref_store *refs, const char *refname);
 
+/*
+ * Pseudorefs are refs that have different semantics compared to
+ * "normal" refs. These refs can thus not be stored in the ref backend,
+ * but must always be accessed via the filesystem. The following refs
+ * are pseudorefs:
+ *
+ * - FETCH_HEAD may contain multiple object IDs, and each one of them
+ *   carries additional metadata like where it came from.
+ *
+ * - MERGE_HEAD may contain multiple object IDs when merging multiple
+ *   heads.
+ *
+ * Reading, writing or deleting references must consistently go either
+ * through the filesystem (pseudorefs) or through the reference
+ * backend (normal ones).
+ */
+int is_pseudo_ref(const char *refname);
+
 #endif /* REFS_H */
-- 
2.45.0



* [PATCH v2 10/10] refs: refuse to write pseudorefs
  2024-04-30 12:26 ` [PATCH v2 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
                     ` (8 preceding siblings ...)
  2024-04-30 12:27   ` [PATCH v2 09/10] ref-filter: properly distinguish pseudo and root refs Patrick Steinhardt
@ 2024-04-30 12:27   ` Patrick Steinhardt
  9 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-04-30 12:27 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 2583 bytes --]

Pseudorefs are not stored in the ref database because, by definition, they
carry additional metadata that essentially makes them not a ref. As
such, writing pseudorefs via the ref backend does not make any sense
whatsoever as the ref backend wouldn't know how exactly to store the
data.

Restrict writing pseudorefs via the ref backend.
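
The shape of the guard can be sketched outside of Git as follows. This is
an illustration only: `toy_queue_update()`, its error buffer and the value
of `TOY_SKIP_REFNAME_VERIFICATION` are hypothetical, and only the error
message text is taken from the patch.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical flag bit, standing in for REF_SKIP_REFNAME_VERIFICATION. */
#define TOY_SKIP_REFNAME_VERIFICATION (1u << 0)

static int toy_is_pseudo_ref(const char *refname)
{
	return !strcmp(refname, "FETCH_HEAD") || !strcmp(refname, "MERGE_HEAD");
}

/*
 * Returns 0 if the update may be queued and -1 if it is refused. On
 * refusal a message is written to `err`.
 */
static int toy_queue_update(const char *refname, unsigned int flags,
			    char *err, size_t errlen)
{
	if (!(flags & TOY_SKIP_REFNAME_VERIFICATION) &&
	    toy_is_pseudo_ref(refname)) {
		snprintf(err, errlen, "refusing to update pseudoref '%s'",
			 refname);
		return -1;
	}
	return 0;
}
```

In this model a caller that passes the skip flag bypasses the check, which
mirrors how the guard in the patch is conditioned on
`REF_SKIP_REFNAME_VERIFICATION`.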

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c           | 7 +++++++
 t/t5510-fetch.sh | 6 +++---
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/refs.c b/refs.c
index 50d679b7e7..7c3c7465a4 100644
--- a/refs.c
+++ b/refs.c
@@ -1307,6 +1307,13 @@ int ref_transaction_update(struct ref_transaction *transaction,
 		return -1;
 	}
 
+	if (!(flags & REF_SKIP_REFNAME_VERIFICATION) &&
+	    is_pseudo_ref(refname)) {
+		strbuf_addf(err, _("refusing to update pseudoref '%s'"),
+			    refname);
+		return -1;
+	}
+
 	if (flags & ~REF_TRANSACTION_UPDATE_ALLOWED_FLAGS)
 		BUG("illegal flags 0x%x passed to ref_transaction_update()", flags);
 
diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index 33d34d5ae9..4eb569f4df 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -518,7 +518,7 @@ test_expect_success 'fetch with a non-applying branch.<name>.merge' '
 test_expect_success 'fetch from GIT URL with a non-applying branch.<name>.merge [1]' '
 	one_head=$(cd one && git rev-parse HEAD) &&
 	this_head=$(git rev-parse HEAD) &&
-	git update-ref -d FETCH_HEAD &&
+	rm .git/FETCH_HEAD &&
 	git fetch one &&
 	test $one_head = "$(git rev-parse --verify FETCH_HEAD)" &&
 	test $this_head = "$(git rev-parse --verify HEAD)"
@@ -530,7 +530,7 @@ test_expect_success 'fetch from GIT URL with a non-applying branch.<name>.merge
 	one_ref=$(cd one && git symbolic-ref HEAD) &&
 	git config branch.main.remote blub &&
 	git config branch.main.merge "$one_ref" &&
-	git update-ref -d FETCH_HEAD &&
+	rm .git/FETCH_HEAD &&
 	git fetch one &&
 	test $one_head = "$(git rev-parse --verify FETCH_HEAD)" &&
 	test $this_head = "$(git rev-parse --verify HEAD)"
@@ -540,7 +540,7 @@ test_expect_success 'fetch from GIT URL with a non-applying branch.<name>.merge
 # the merge spec does not match the branch the remote HEAD points to
 test_expect_success 'fetch from GIT URL with a non-applying branch.<name>.merge [3]' '
 	git config branch.main.merge "${one_ref}_not" &&
-	git update-ref -d FETCH_HEAD &&
+	rm .git/FETCH_HEAD &&
 	git fetch one &&
 	test $one_head = "$(git rev-parse --verify FETCH_HEAD)" &&
 	test $this_head = "$(git rev-parse --verify HEAD)"
-- 
2.45.0



* Re: [PATCH 2/3] refs: do not label special refs as pseudo refs
  2024-04-30 12:07         ` Karthik Nayak
@ 2024-04-30 12:33           ` Patrick Steinhardt
  0 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-04-30 12:33 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: Jeff King, phillip.wood, git

[-- Attachment #1: Type: text/plain, Size: 2114 bytes --]

On Tue, Apr 30, 2024 at 05:07:19AM -0700, Karthik Nayak wrote:
> Jeff King <peff@peff.net> writes:
> 
> > On Tue, Apr 30, 2024 at 09:30:05AM +0200, Patrick Steinhardt wrote:
[snip]
> >> > Are there any practical implications of the changes in this patch for users
> >> > running commands like "git log FETCH_HEAD" (I can't think of any off the top
> >> > of my head but it would be good to have some reassurance on that point in
> >> > the commit message)
> >>
> >> Not really, no. We have never been doing a good job at enforcing the
> >> difference between pseudo refs or normal refs anyway. Pseudo refs can be
> >> symrefs just fine, and our tooling won't complain. The only exception
> >> where I want us to become stricter is in how we enforce the syntax rules
> >> for root refs (which is handled by Peff in a separate patch series), and
> >> that we start to not treat FETCH_HEAD and MERGE_HEAD as proper refs.
> >> They should still resolve when you ask git-rev-parse(1), but when you
> >> iterate through refs they should not be surfaced as they _aren't_ refs.
> >
> > I actually would not even mind if they are surfaced when iterating with
> > --include-root-refs. But then I am a little skeptical of the purpose of
> > that feature in the first place. After all, the reason code shoves stuff
> > into .git/FOO_HEAD is precisely because we don't want other stuff
> > iterating over them, using them for reachability, and so on. That is why
> > "--all" does not include them, for example.
> >
> > But I did not follow the development of the feature, so maybe I am
> > missing some cool use case.
> >
> 
> The use case was to allow us to look at these refs when working with
> the reftable backend. Currently there is no way to do that, with the
> files backend, well you could just read the files. So mostly a debugging
> usecase.

That's true for normal root refs, only, though. The pseudorefs (current
special refs) can still be surfaced even if the for-each-ref machinery
doesn't surface them because by definition, they always live in the
filesystem.

Patrick


* Re: [PATCH v2 01/10] Documentation/glossary: redefine pseudorefs as special refs
  2024-04-30 12:26   ` [PATCH v2 01/10] Documentation/glossary: redefine pseudorefs as special refs Patrick Steinhardt
@ 2024-04-30 12:49     ` Karthik Nayak
  2024-04-30 17:17     ` Justin Tobler
  2024-04-30 20:12     ` Junio C Hamano
  2 siblings, 0 replies; 93+ messages in thread
From: Karthik Nayak @ 2024-04-30 12:49 UTC (permalink / raw)
  To: Patrick Steinhardt, git
  Cc: Jeff King, Phillip Wood, Junio C Hamano, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 6507 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> Nowadays, Git knows about three different kinds of refs. As defined in
> gitglossary(7):
>
>   - Regular refs that start with "refs/", like "refs/heads/main".
>
>   - Pseudorefs, which live in the root directory. These must have
>     all-caps names and must be a file that start with an object hash.
>     Consequently, symbolic refs are not pseudorefs because they do not
>     start with an object hash.
>
>   - Special refs, of which we only have "FETCH_HEAD" and "MERGE_HEAD".
>

Nit: but since you go into explaining what the _old_ pseudoref is,
perhaps you should also add a line about why "FETCH_HEAD" and
"MERGE_HEAD" were called special refs.

> This state is extremely confusing, and I would claim that most folks
> don't fully understand what is what here. The current definitions also
> have several problems:
>
>   - Where does "HEAD" fit in? It's not a pseudoref because it can be
>     a symbolic ref. It's not a regular ref because it does not start
>     with "refs/". And it's not a special ref, either.
>
>   - There is a strong overlap between pseudorefs and special refs. The
>     pseudoref section for example mentions "MERGE_HEAD", even though it
>     is a special ref. Is it thus both a pseudoref and a special ref?
>
>   - Why do we even need to distinguish refs that live in the root from
>     other refs when they behave just like a regular ref anyway?
>
> In other words, the current state is quite a mess and leads to wild
> inconsistencies without much of a good reason.
>
> The original reason why pseudorefs were introduced is that there are
> some refs that sometimes behave like a ref, even though they aren't a
> ref. And we really only have two of these nowadads, namely "MERGE_HEAD"
> and "FETCH_HEAD". Those files are never written via the ref backends,
> but are instead written by git-fetch(1), git-pull(1) and git-merge(1).
> They contain additional metadata that hihlights where a ref has been

s/hihlights/highlights

> fetched from or the list of commits that have been merged.

This is good detail and I guess you can skip my earlier suggestion.

> This original intent in fact matches the definition of special refs that
> we have recently introduced in 8df4c5d205 (Documentation: add "special
> refs" to the glossary, 2024-01-19). Due to the introduction of the new
> reftable backend we were forced to distinguish those refs more clearly
> such that we don't ever try to read or write them via the reftable
> backend. In the same series, we also addressed all the other cases where
> we used to write those special refs via the filesystem directly, thus
> circumventing the ref backend, to instead write them via the backends.
> Consequently, there are no other refs left anymore which are special.
>
> Let's address this mess and return the pseudoref terminology back to its
> original intent: a ref that sometimes behave like a ref, but which isn't
> really a ref because it gets written to the filesystem directly. Or in
> other words, let's redefine pseudorefs to match the current definition
> of special refs. As special refs and pseudorefs are now the same per
> definition, we can drop the "special refs" term again. It's not exposed
> to our users and thus they wouldn't ever encounter that term anyway.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  Documentation/glossary-content.txt | 42 +++++++++---------------------
>  1 file changed, 13 insertions(+), 29 deletions(-)
>
> diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
> index d71b199955..f5c0f49150 100644
> --- a/Documentation/glossary-content.txt
> +++ b/Documentation/glossary-content.txt
> @@ -496,21 +496,19 @@ exclude;;
>  	that start with `refs/bisect/`, but might later include other
>  	unusual refs.
>
> -[[def_pseudoref]]pseudoref::
> -	Pseudorefs are a class of files under `$GIT_DIR` which behave
> -	like refs for the purposes of rev-parse, but which are treated
> -	specially by git.  Pseudorefs both have names that are all-caps,
> -	and always start with a line consisting of a
> -	<<def_SHA1,SHA-1>> followed by whitespace.  So, HEAD is not a
> -	pseudoref, because it is sometimes a symbolic ref.  They might
> -	optionally contain some additional data.  `MERGE_HEAD` and
> -	`CHERRY_PICK_HEAD` are examples.  Unlike
> -	<<def_per_worktree_ref,per-worktree refs>>, these files cannot
> -	be symbolic refs, and never have reflogs.  They also cannot be
> -	updated through the normal ref update machinery.  Instead,
> -	they are updated by directly writing to the files.  However,
> -	they can be read as if they were refs, so `git rev-parse
> -	MERGE_HEAD` will work.
> +[[def_pseudoref]]pseudoref ref::

shouldn't this just be 'pseudoref'?

> +	A ref that has different semantics than normal refs. These refs can be
> +	accessed via normal Git commands but may not behave the same as a
> +	normal ref in some cases.
> ++
> +The following pseudorefs are known to Git:
> +
> + - "`FETCH_HEAD`" is written by linkgit:git-fetch[1] or linkgit:git-pull[1]. It
> +   may refer to multiple object IDs. Each object ID is annotated with metadata
> +   indicating where it was fetched from and its fetch status.
> +
> + - "`MERGE_HEAD`" is written by linkgit:git-merge[1] when resolving merge
> +   conflicts. It contains all commit IDs which are being merged.
>
>  [[def_pull]]pull::
>  	Pulling a <<def_branch,branch>> means to <<def_fetch,fetch>> it and
> @@ -638,20 +636,6 @@ The most notable example is `HEAD`.
>  	An <<def_object,object>> used to temporarily store the contents of a
>  	<<def_dirty,dirty>> working directory and the index for future reuse.
>
> -[[def_special_ref]]special ref::
> -	A ref that has different semantics than normal refs. These refs can be
> -	accessed via normal Git commands but may not behave the same as a
> -	normal ref in some cases.
> -+
> -The following special refs are known to Git:
> -
> - - "`FETCH_HEAD`" is written by linkgit:git-fetch[1] or linkgit:git-pull[1]. It
> -   may refer to multiple object IDs. Each object ID is annotated with metadata
> -   indicating where it was fetched from and its fetch status.
> -
> - - "`MERGE_HEAD`" is written by linkgit:git-merge[1] when resolving merge
> -   conflicts. It contains all commit IDs which are being merged.
> -
>  [[def_submodule]]submodule::
>  	A <<def_repository,repository>> that holds the history of a
>  	separate project inside another repository (the latter of
> --
> 2.45.0

* Re: [PATCH v2 03/10] Documentation/glossary: define root refs as refs
  2024-04-30 12:26   ` [PATCH v2 03/10] Documentation/glossary: define root refs as refs Patrick Steinhardt
@ 2024-04-30 12:56     ` Karthik Nayak
  0 siblings, 0 replies; 93+ messages in thread
From: Karthik Nayak @ 2024-04-30 12:56 UTC (permalink / raw)
  To: Patrick Steinhardt, git
  Cc: Jeff King, Phillip Wood, Junio C Hamano, Justin Tobler

Patrick Steinhardt <ps@pks.im> writes:

> Except for the pseudorefs MERGE_HEAD and FETCH_HEAD, all refs that live
> in the root of the ref hierarchy behave the exact same as normal refs.
> They can be symbolic refs or direct refs and can be read, iterated over
> and written via normal tooling. All of these refs are stored in the ref
> backends, which further demonstrates that they are just normal refs.
>
> Extend the definition of "ref" to also cover such root refs. The only
> additional restriction for root refs is that they must conform to a
> specific naming schema.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  Documentation/glossary-content.txt | 33 +++++++++++++++++++++++-------
>  1 file changed, 26 insertions(+), 7 deletions(-)
>
> diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
> index 13e1aa63ab..683b727349 100644
> --- a/Documentation/glossary-content.txt
> +++ b/Documentation/glossary-content.txt
> @@ -550,20 +550,39 @@ The following pseudorefs are known to Git:
>  	to the result.
>
>  [[def_ref]]ref::
> -	A name that begins with `refs/` (e.g. `refs/heads/master`)
> -	that points to an <<def_object_name,object name>> or another
> -	ref (the latter is called a <<def_symref,symbolic ref>>).
> +	A name that that points to an <<def_object_name,object name>> or
> +	another ref (the latter is called a <<def_symref,symbolic ref>>).
>  	For convenience, a ref can sometimes be abbreviated when used
>  	as an argument to a Git command; see linkgit:gitrevisions[7]
>  	for details.
>  	Refs are stored in the <<def_repository,repository>>.
>  +
>  The ref namespace is hierarchical.
> -Different subhierarchies are used for different purposes (e.g. the
> -`refs/heads/` hierarchy is used to represent local branches).
> +Ref names must either start with `refs/` or be located in the root of
> +the hierarchy. In that case, their name must conform to the following
> +rules:
>  +

The last sentence here doesn't clarify what it is referring to, perhaps
something like 'For the latter, their name must follow these rules:'
instead?

> -There are a few special-purpose refs that do not begin with `refs/`.
> -The most notable example is `HEAD`.
> + - The name consists of only upper-case characters or underscores.
> +
> + - The name ends with "`_HEAD`" or is equal to "`HEAD`".
> ++
> +There are some irregular refs in the root of the hierarchy that do not
> +match these rules. The following list is exhaustive and shall not be
> +extended in the future:
> ++
> + - AUTO_MERGE
> +
> + - BISECT_EXPECTED_REV
> +
> + - NOTES_MERGE_PARTIAL
> +
> + - NOTES_MERGE_REF
> +
> + - MERGE_AUTOSTASH
> ++
> +Different subhierarchies are used for different purposes. For example,
> +the `refs/heads/` hierarchy is used to represent local branches whereas
> +the `refs/tags/` hierarchy is used to represent local tags..
>
>  [[def_reflog]]reflog::
>  	A reflog shows the local "history" of a ref.  In other words,
> --
> 2.45.0

* Re: [PATCH v2 05/10] refs: refname `is_special_ref()` to `is_pseudo_ref()`
  2024-04-30 12:26   ` [PATCH v2 05/10] refs: refname `is_special_ref()` to `is_pseudo_ref()` Patrick Steinhardt
@ 2024-04-30 12:58     ` Karthik Nayak
  0 siblings, 0 replies; 93+ messages in thread
From: Karthik Nayak @ 2024-04-30 12:58 UTC (permalink / raw)
  To: Patrick Steinhardt, git
  Cc: Jeff King, Phillip Wood, Junio C Hamano, Justin Tobler

Patrick Steinhardt <ps@pks.im> writes:

> Rename `is_special_ref()` to `is_pseudo_ref()` to adapt to the newly
> defined terminology in our gitglossary(7). Note that in the preceding
> commit we have just renamed `is_pseudoref()` to `is_root_ref()`, where
> there may be confusion for in-flight patch series that add new calls to
> `is_pseudoref()`. In order to intentionall break such patch series we

s/intentionall/intentionally


[snip]

* Re: [PATCH v2 09/10] ref-filter: properly distinuish pseudo and root refs
  2024-04-30 12:27   ` [PATCH v2 09/10] ref-filter: properly distinuish pseudo and root refs Patrick Steinhardt
@ 2024-04-30 13:11     ` Karthik Nayak
  2024-05-02  8:08       ` Patrick Steinhardt
  0 siblings, 1 reply; 93+ messages in thread
From: Karthik Nayak @ 2024-04-30 13:11 UTC (permalink / raw)
  To: Patrick Steinhardt, git
  Cc: Jeff King, Phillip Wood, Junio C Hamano, Justin Tobler

In the subject: s/distinuish/distinguish

Patrick Steinhardt <ps@pks.im> writes:

> The ref-filter interfaces currently define root refs as either a
> detached HEAD or a pseudo ref. Pseudo refs aren't root refs though, so
> let's properly distinguish those ref types.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  builtin/for-each-ref.c |  2 +-
>  ref-filter.c           | 16 +++++++++-------
>  ref-filter.h           |  4 ++--
>  refs.c                 | 18 +-----------------
>  refs.h                 | 18 ++++++++++++++++++
>  5 files changed, 31 insertions(+), 27 deletions(-)
>
> diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
> index 919282e12a..5517a4a1c0 100644
> --- a/builtin/for-each-ref.c
> +++ b/builtin/for-each-ref.c
> @@ -98,7 +98,7 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
>  	}
>
>  	if (include_root_refs)
> -		flags |= FILTER_REFS_ROOT_REFS;
> +		flags |= FILTER_REFS_ROOT_REFS | FILTER_REFS_DETACHED_HEAD;

The only issue I see with this patch is that it makes me think that HEAD
is not a root ref anymore. I get that this is the best way to define the
directives because otherwise you'd need a new flag something like
`FILTER_REFS_ROOT_REFS_WITHOUT_HEAD` and `FILTER_REFS_ROOT_REFS` would
be the summation of that and the HEAD flag.

Apart from this, the patch looks good.

* Re: [PATCH v2 02/10] Documentation/glossary: clarify limitations of pseudorefs
  2024-04-30 12:26   ` [PATCH v2 02/10] Documentation/glossary: clarify limitations of pseudorefs Patrick Steinhardt
@ 2024-04-30 13:35     ` Kristoffer Haugsbakk
  0 siblings, 0 replies; 93+ messages in thread
From: Kristoffer Haugsbakk @ 2024-04-30 13:35 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, git

On Tue, Apr 30, 2024, at 14:26, Patrick Steinhardt wrote:
>   - They are not surfaced when iterating through refs, like when using
>     git-for-each-ref(1). They are no ref, so iterating through refs
>     should not surface them.

s/They are no ref/They are not refs

-- 
Kristoffer Haugsbakk


* Re: [PATCH v2 07/10] refs: root refs can be symbolic refs
  2024-04-30 12:26   ` [PATCH v2 07/10] refs: root refs can be symbolic refs Patrick Steinhardt
@ 2024-04-30 17:09     ` Justin Tobler
  2024-05-02  8:07       ` Patrick Steinhardt
  0 siblings, 1 reply; 93+ messages in thread
From: Justin Tobler @ 2024-04-30 17:09 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano

On 24/04/30 02:26PM, Patrick Steinhardt wrote:
> Before this patch series, root refs except for "HEAD" and our special
> refs were classified as pseudorefs. Furthermore, our terminology
> clarified that pseudorefs must not be symbolic refs. This restriction
> is enforced in `is_root_ref()`, which explicitly checks that a supposed
> root ref resolves to an object ID without recursing.
> 
> This has been extremely confusing right from the start because (in old
> terminology) a ref name may sometimes be a pseudoref and sometimes not
> depending on whether it is a symbolic or regular ref. This behaviour
> does not seem reasonable at all and I very much doubt that it results in
> anything sane.
> 
> Furthermore, the behaviour is different to `is_headref()`, which only
> checks for the ref to exist. While that is in line with our glossary,
> this inconsistency only adds to the confusion.
> 
> Last but not least, the current behaviour can actually lead to a
> segfault when calling `is_root_ref()` with a reference that either does
> not exist or that is a symbolic ref because we never intialized `oid`.

s/intialized/initialized/

> Let's loosen the restrictions in accordance to the new definition of
> root refs, which are simply plain refs that may as well be a symbolic
> ref. Consequently, we can just check for the ref to exist instead of
> requiring it to be a regular ref.
> 
> Add a test that verifies that this does not change user-visible
> behaviour. Namely, we still don't want to show broken refs to the user
> by default in git-for-each-ref(1). What this does allow though is for
> internal callers to surface dangling root refs when they pass in the
> `DO_FOR_EACH_INCLUDE_BROKEN` flag.
> 
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  refs.c                         | 50 ++++++++++++++++++++++++----------
>  t/t6302-for-each-ref-filter.sh | 17 ++++++++++++
>  2 files changed, 53 insertions(+), 14 deletions(-)
> 
> diff --git a/refs.c b/refs.c
> index 5b89e83ad7..ca9844bc3e 100644
> --- a/refs.c
> +++ b/refs.c
...  
>  int is_headref(struct ref_store *refs, const char *refname)
>  {
> -	if (!strcmp(refname, "HEAD"))
> -		return refs_ref_exists(refs, refname);
> +	struct strbuf referent = STRBUF_INIT;
> +	struct object_id oid = { 0 };
> +	int failure_errno, ret = 0;
> +	unsigned int flags;
>  
> -	return 0;
> +	/*
> +	 * Note that we cannot use `refs_ref_exists()` here because that also
> +	 * checks whether its target ref exists in case refname is a symbolic
> +	 * ref.
> +	 */
> +	if (!strcmp(refname, "HEAD")) {
> +		ret = !refs_read_raw_ref(refs, refname, &oid, &referent,
> +					 &flags, &failure_errno);
> +	}
> +
> +	strbuf_release(&referent);
> +	return ret;
>  }

I'm not quite sure I understand why we are changing the behavior of
`is_headref()` here. Do we no longer want to validate the ref exists if
it is symbolic?

In a prior commit, `is_headref()` is commented to mention that we check
whether the reference exists. Maybe that could use some additional
clarification?

-Justin

* Re: [PATCH v2 01/10] Documentation/glossary: redefine pseudorefs as special refs
  2024-04-30 12:26   ` [PATCH v2 01/10] Documentation/glossary: redefine pseudorefs as special refs Patrick Steinhardt
  2024-04-30 12:49     ` Karthik Nayak
@ 2024-04-30 17:17     ` Justin Tobler
  2024-04-30 20:12     ` Junio C Hamano
  2 siblings, 0 replies; 93+ messages in thread
From: Justin Tobler @ 2024-04-30 17:17 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano

On 24/04/30 02:26PM, Patrick Steinhardt wrote:
> Nowadays, Git knows about three different kinds of refs. As defined in
> gitglossary(7):
> 
>   - Regular refs that start with "refs/", like "refs/heads/main".
> 
>   - Pseudorefs, which live in the root directory. These must have
>     all-caps names and must be a file that start with an object hash.
>     Consequently, symbolic refs are not pseudorefs because they do not
>     start with an object hash.
> 
>   - Special refs, of which we only have "FETCH_HEAD" and "MERGE_HEAD".
> 
> This state is extremely confusing, and I would claim that most folks
> don't fully understand what is what here. The current definitions also
> have several problems:
> 
>   - Where does "HEAD" fit in? It's not a pseudoref because it can be
>     a symbolic ref. It's not a regular ref because it does not start
>     with "refs/". And it's not a special ref, either.
> 
>   - There is a strong overlap between pseudorefs and special refs. The
>     pseudoref section for example mentions "MERGE_HEAD", even though it
>     is a special ref. Is it thus both a pseudoref and a special ref?
> 
>   - Why do we even need to distinguish refs that live in the root from
>     other refs when they behave just like a regular ref anyway?
> 
> In other words, the current state is quite a mess and leads to wild
> inconsistencies without much of a good reason.
> 
> The original reason why pseudorefs were introduced is that there are
> some refs that sometimes behave like a ref, even though they aren't a
> ref. And we really only have two of these nowadads, namely "MERGE_HEAD"

s/nowadads/nowadays/

-Justin

* Re: [PATCH v2 01/10] Documentation/glossary: redefine pseudorefs as special refs
  2024-04-30 12:26   ` [PATCH v2 01/10] Documentation/glossary: redefine pseudorefs as special refs Patrick Steinhardt
  2024-04-30 12:49     ` Karthik Nayak
  2024-04-30 17:17     ` Justin Tobler
@ 2024-04-30 20:12     ` Junio C Hamano
  2024-05-02  8:07       ` Patrick Steinhardt
  2 siblings, 1 reply; 93+ messages in thread
From: Junio C Hamano @ 2024-04-30 20:12 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Jeff King, Karthik Nayak, Phillip Wood, Justin Tobler

Patrick Steinhardt <ps@pks.im> writes:

> Let's address this mess and return the pseudoref terminology back to its
> original intent: a ref that sometimes behave like a ref, but which isn't
> really a ref because it gets written to the filesystem directly. Or in
> other words, let's redefine pseudorefs to match the current definition
> of special refs. As special refs and pseudorefs are now the same per
> definition, we can drop the "special refs" term again. It's not exposed
> to our users and thus they wouldn't ever encounter that term anyway.

Good intentions.

I do not agree with "the ones at the root should not be special" at
all, though.  We need to reject names like 'config' somehow, as long
as there are users who use files backend.

* Re: [PATCH v2 04/10] refs: rename `is_pseudoref()` to `is_root_ref()`
  2024-04-30 12:26   ` [PATCH v2 04/10] refs: rename `is_pseudoref()` to `is_root_ref()` Patrick Steinhardt
@ 2024-04-30 20:20     ` Junio C Hamano
  0 siblings, 0 replies; 93+ messages in thread
From: Junio C Hamano @ 2024-04-30 20:20 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Jeff King, Karthik Nayak, Phillip Wood, Justin Tobler

Patrick Steinhardt <ps@pks.im> writes:

> Rename `is_pseudoref()` to `is_root_ref()` to adapt to the newly defined
> terminology in our gitglossary(7).

OK.  root-ref is a good name that is not yet tainted.

* Re: [PATCH v2 01/10] Documentation/glossary: redefine pseudorefs as special refs
  2024-04-30 20:12     ` Junio C Hamano
@ 2024-05-02  8:07       ` Patrick Steinhardt
  0 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  8:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Karthik Nayak, Phillip Wood, Justin Tobler

On Tue, Apr 30, 2024 at 01:12:33PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > Let's address this mess and return the pseudoref terminology back to its
> > original intent: a ref that sometimes behave like a ref, but which isn't
> > really a ref because it gets written to the filesystem directly. Or in
> > other words, let's redefine pseudorefs to match the current definition
> > of special refs. As special refs and pseudorefs are now the same per
> > definition, we can drop the "special refs" term again. It's not exposed
> > to our users and thus they wouldn't ever encounter that term anyway.
> 
> Good intentions.
> 
> I do not agree with "the ones at the root should not be special" at
> all, though.  We need to reject names like 'config' somehow, as long
> as there are users who use files backend.

Oh, yes, I totally agree and thought I'd mentioned this in the message.
But it seems like I only mention this in a subsequent message. Let me
add a hint to the commit message that mentions that a subsequent commit
will clearly define "root refs".

In any case, root refs should not be special regarding their behaviour,
but should have a strict naming schema:

    - Only uppercase letters or underscores.

    - Must end with "_HEAD" or be called "HEAD".

    - There is an exhaustive list of legacy root refs that don't conform
      to this naming schema, like "AUTO_MERGE". This list shall not be
      extended in the future.

This explanation is added in patch 3.
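
As a rough illustration of the naming schema above (a sketch only, not
the code from this series; `root_ref_name_ok()` and `has_suffix()` are
hypothetical names), a name-only check could look like this:

```c
#include <stddef.h>
#include <string.h>

/* Legacy root refs that predate the naming rules (list taken from patch 3). */
static const char *const irregular_root_refs[] = {
	"AUTO_MERGE",
	"BISECT_EXPECTED_REV",
	"NOTES_MERGE_PARTIAL",
	"NOTES_MERGE_REF",
	"MERGE_AUTOSTASH",
};

/* Hypothetical helper: returns 1 if `name` ends with `suffix`. */
static int has_suffix(const char *name, const char *suffix)
{
	size_t len = strlen(name), slen = strlen(suffix);

	return len >= slen && !strcmp(name + len - slen, suffix);
}

/*
 * Hypothetical helper, not a function from refs.c. It checks the naming
 * schema only and does not look at any ref backend.
 */
int root_ref_name_ok(const char *name)
{
	size_t i;

	/* The exhaustive list of irregular names is accepted as-is. */
	for (i = 0; i < sizeof(irregular_root_refs) / sizeof(*irregular_root_refs); i++)
		if (!strcmp(name, irregular_root_refs[i]))
			return 1;

	/* Only uppercase letters or underscores are allowed. */
	for (i = 0; name[i]; i++)
		if (!((name[i] >= 'A' && name[i] <= 'Z') || name[i] == '_'))
			return 0;

	/* The name must be "HEAD" or end with "_HEAD". */
	return !strcmp(name, "HEAD") || has_suffix(name, "_HEAD");
}
```

With this, "HEAD" and "ORIG_HEAD" pass, while "config" and
"refs/heads/main" are rejected. Note that the check looks at the name
only: "FETCH_HEAD" and "MERGE_HEAD" pass it as well, even though they
are pseudorefs and would have to be excluded separately.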

Patrick

* Re: [PATCH v2 07/10] refs: root refs can be symbolic refs
  2024-04-30 17:09     ` Justin Tobler
@ 2024-05-02  8:07       ` Patrick Steinhardt
  2024-05-03 20:49         ` Justin Tobler
  0 siblings, 1 reply; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  8:07 UTC (permalink / raw)
  To: git, Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano

On Tue, Apr 30, 2024 at 12:09:57PM -0500, Justin Tobler wrote:
> On 24/04/30 02:26PM, Patrick Steinhardt wrote:
[snip]
> > diff --git a/refs.c b/refs.c
> > index 5b89e83ad7..ca9844bc3e 100644
> > --- a/refs.c
> > +++ b/refs.c
> ...  
> >  int is_headref(struct ref_store *refs, const char *refname)
> >  {
> > -	if (!strcmp(refname, "HEAD"))
> > -		return refs_ref_exists(refs, refname);
> > +	struct strbuf referent = STRBUF_INIT;
> > +	struct object_id oid = { 0 };
> > +	int failure_errno, ret = 0;
> > +	unsigned int flags;
> >  
> > -	return 0;
> > +	/*
> > +	 * Note that we cannot use `refs_ref_exists()` here because that also
> > +	 * checks whether its target ref exists in case refname is a symbolic
> > +	 * ref.
> > +	 */
> > +	if (!strcmp(refname, "HEAD")) {
> > +		ret = !refs_read_raw_ref(refs, refname, &oid, &referent,
> > +					 &flags, &failure_errno);
> > +	}
> > +
> > +	strbuf_release(&referent);
> > +	return ret;
> >  }
> 
> I'm not quite sure I understand why we are changing the behavior of
> `is_headref()` here. Do we no longer want to validate the ref exists if
> it is symbolic?

The implementation does not conform to the definition of a "HEAD" ref.
Even before this patch series, a "HEAD" ref could either be a symbolic
or a regular ref. So to answer the question of "Is this a HEAD ref?" you
only need to check whether the ref exists, not whether its target
exists.
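
To make the difference concrete, here is a toy model (not Git's ref
API; every `toy_*` name is made up for this example) of an unborn
branch, where "HEAD" is a symbolic ref whose target does not exist yet:

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a ref store; none of this is Git's real API. */
struct toy_ref {
	const char *name;
	const char *symref_target; /* NULL for a regular (direct) ref */
};

struct toy_store {
	const struct toy_ref *refs;
	size_t nr;
};

/* Look up a ref by name without following symbolic refs. */
static const struct toy_ref *toy_read_raw(const struct toy_store *s,
					  const char *name)
{
	size_t i;

	for (i = 0; i < s->nr; i++)
		if (!strcmp(s->refs[i].name, name))
			return &s->refs[i];
	return NULL;
}

/* Follow symbolic refs; returns 0 if the chain dangles or is too deep. */
int toy_ref_resolves(const struct toy_store *s, const char *name)
{
	int depth;

	for (depth = 0; depth < 5; depth++) {
		const struct toy_ref *ref = toy_read_raw(s, name);

		if (!ref)
			return 0;
		if (!ref->symref_target)
			return 1;
		name = ref->symref_target;
	}
	return 0;
}

/* "Is this a HEAD ref?" only needs the ref itself to exist. */
int toy_is_headref(const struct toy_store *s, const char *name)
{
	return !strcmp(name, "HEAD") && toy_read_raw(s, name) != NULL;
}
```

For a store that contains only "HEAD" pointing at "refs/heads/main",
`toy_is_headref()` returns 1 whereas `toy_ref_resolves()` returns 0.
The latter corresponds to a check that also requires the target to
exist.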

> In a prior commit, `is_headref()` is commented to mention that we check
> whether the reference exists. Maybe that could use some additional
> clarification?

Which particular commit do you refer to? It's not part of this series,
is it?

Patrick

* Re: [PATCH v2 09/10] ref-filter: properly distinuish pseudo and root refs
  2024-04-30 13:11     ` Karthik Nayak
@ 2024-05-02  8:08       ` Patrick Steinhardt
  2024-05-02 10:03         ` Karthik Nayak
  0 siblings, 1 reply; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  8:08 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git, Jeff King, Phillip Wood, Junio C Hamano, Justin Tobler

On Tue, Apr 30, 2024 at 06:11:05AM -0700, Karthik Nayak wrote:
> In the subject: s/distinuish/distinguish
> 
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > The ref-filter interfaces currently define root refs as either a
> > detached HEAD or a pseudo ref. Pseudo refs aren't root refs though, so
> > let's properly distinguish those ref types.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> >  builtin/for-each-ref.c |  2 +-
> >  ref-filter.c           | 16 +++++++++-------
> >  ref-filter.h           |  4 ++--
> >  refs.c                 | 18 +-----------------
> >  refs.h                 | 18 ++++++++++++++++++
> >  5 files changed, 31 insertions(+), 27 deletions(-)
> >
> > diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
> > index 919282e12a..5517a4a1c0 100644
> > --- a/builtin/for-each-ref.c
> > +++ b/builtin/for-each-ref.c
> > @@ -98,7 +98,7 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
> >  	}
> >
> >  	if (include_root_refs)
> > -		flags |= FILTER_REFS_ROOT_REFS;
> > +		flags |= FILTER_REFS_ROOT_REFS | FILTER_REFS_DETACHED_HEAD;
> 
> The only issue I see with this patch is that it makes me think that HEAD
> is not a root ref anymore. I get that this is the best way to define the
> directives because otherwise you'd need a new flag something like
> `FILTER_REFS_ROOT_REFS_WITHOUT_HEAD` and `FILTER_REFS_ROOT_REFS` would
> be the summation of that and the HEAD flag.
> 
> Apart from this, the patch looks good.

Well, it is a root ref, but we treat it differently in the ref-filter
interface because it's rendered differently than any other root ref.
Furthermore, the ref-filter interfaces allow you to _only_ list the HEAD
ref, which is another reason why it's singled out.

Renaming this to FILTER_REFS_ROOT_REFS_WITHOUT_HEAD would be quite
misleading, too, because we have the following snippet:

@@ -2794,11 +2796,11 @@ static struct ref_array_item *apply_ref_filter(const char *refname, const struct
        /*
         * Generally HEAD refs are printed with special description denoting a rebase,
         * detached state and so forth. This is useful when only printing the HEAD ref
         * But when it is being printed along with other root refs, it makes sense to
         * keep the formatting consistent. So we mask the type to act like a root ref.
         */
        if (filter->kind & FILTER_REFS_ROOT_REFS && kind == FILTER_REFS_DETACHED_HEAD)
                kind = FILTER_REFS_ROOT_REFS;
        else if (!(kind & filter->kind))
                return NULL;

If we named this FILTER_REFS_ROOT_REFS_WITHOUT_HEAD then the above code
would be even more surprising.

So yeah, it's a bit weird, but I think it's more sensible to retain the
code as proposed.
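
Reduced to a standalone sketch, the masking logic from the snippet
behaves as follows. The flag values and the `toy_effective_kind()`
name are assumptions made for this example; the real flags are defined
in ref-filter.h:

```c
/* Hypothetical flag values; the real ones are defined in ref-filter.h. */
#define TOY_FILTER_REFS_DETACHED_HEAD 0x1u
#define TOY_FILTER_REFS_ROOT_REFS     0x2u

/*
 * Mirrors the condition in the snippet above: returns the kind the ref
 * is rendered with, or 0 if the filter rejects it.
 */
unsigned int toy_effective_kind(unsigned int filter_kind, unsigned int kind)
{
	/* HEAD listed together with root refs is formatted like a root ref. */
	if ((filter_kind & TOY_FILTER_REFS_ROOT_REFS) &&
	    kind == TOY_FILTER_REFS_DETACHED_HEAD)
		return TOY_FILTER_REFS_ROOT_REFS;
	/* Anything the filter did not ask for is dropped. */
	if (!(kind & filter_kind))
		return 0;
	return kind;
}
```

When both flags are set, HEAD comes back as a root ref. With only the
detached HEAD flag set, HEAD keeps its own kind and other root refs
are filtered out.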

Patrick

* [PATCH v3 00/10] Clarify pseudo-ref terminology
  2024-04-29 13:41 [PATCH 0/3] Clarify pseudo-ref terminology Patrick Steinhardt
                   ` (3 preceding siblings ...)
  2024-04-30 12:26 ` [PATCH v2 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
@ 2024-05-02  8:17 ` Patrick Steinhardt
  2024-05-02  8:17   ` [PATCH v3 01/10] Documentation/glossary: redefine pseudorefs as special refs Patrick Steinhardt
                     ` (9 more replies)
  2024-05-10  8:48 ` [PATCH v4 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
  2024-05-15  6:50 ` [PATCH v5 " Patrick Steinhardt
  6 siblings, 10 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  8:17 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk

Hi,

this is the third version of my patch series that aims to clarify the
pseudo-ref terminology.

Changes compared to v2:

    - Various typo fixes.

    - Added a note in the first commit that we're about to clearly
      define rules around "root refs" in a subsequent commit. While this
      patch series will make root refs act like "just a normal ref", we
      will still have strict limits around the naming policy for them.

Thanks!

Patrick

Patrick Steinhardt (10):
  Documentation/glossary: redefine pseudorefs as special refs
  Documentation/glossary: clarify limitations of pseudorefs
  Documentation/glossary: define root refs as refs
  refs: rename `is_pseudoref()` to `is_root_ref()`
  refs: refname `is_special_ref()` to `is_pseudo_ref()`
  refs: classify HEAD as a root ref
  refs: root refs can be symbolic refs
  refs: pseudorefs are no refs
  ref-filter: properly distinuish pseudo and root refs
  refs: refuse to write pseudorefs

 Documentation/glossary-content.txt |  72 ++++++++---------
 builtin/for-each-ref.c             |   2 +-
 ref-filter.c                       |  16 ++--
 ref-filter.h                       |   4 +-
 refs.c                             | 120 ++++++++++++++++-------------
 refs.h                             |  50 +++++++++++-
 refs/files-backend.c               |   3 +-
 refs/reftable-backend.c            |   3 +-
 t/t5510-fetch.sh                   |   6 +-
 t/t6302-for-each-ref-filter.sh     |  34 ++++++++
 10 files changed, 205 insertions(+), 105 deletions(-)

Range-diff against v2:
 1:  2489bb5585 !  1:  e651bae690 Documentation/glossary: redefine pseudorefs as special refs
    @@ Commit message
     
         The original reason why pseudorefs were introduced is that there are
         some refs that sometimes behave like a ref, even though they aren't a
    -    ref. And we really only have two of these nowadads, namely "MERGE_HEAD"
    +    ref. And we really only have two of these nowadays, namely "MERGE_HEAD"
         and "FETCH_HEAD". Those files are never written via the ref backends,
         but are instead written by git-fetch(1), git-pull(1) and git-merge(1).
    -    They contain additional metadata that hihlights where a ref has been
    +    They contain additional metadata that highlights where a ref has been
         fetched from or the list of commits that have been merged.
     
         This original intent in fact matches the definition of special refs that
    @@ Commit message
         definition, we can drop the "special refs" term again. It's not exposed
         to our users and thus they wouldn't ever encounter that term anyway.
     
    +    Refs that live in the root of the ref hierarchy but which are not
    +    pseudorefs will be further defined in a subsequent commit.
    +
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
      ## Documentation/glossary-content.txt ##
     @@ Documentation/glossary-content.txt: exclude;;
    - 	that start with `refs/bisect/`, but might later include other
      	unusual refs.
      
    --[[def_pseudoref]]pseudoref::
    + [[def_pseudoref]]pseudoref::
     -	Pseudorefs are a class of files under `$GIT_DIR` which behave
     -	like refs for the purposes of rev-parse, but which are treated
     -	specially by git.  Pseudorefs both have names that are all-caps,
    @@ Documentation/glossary-content.txt: exclude;;
     -	they are updated by directly writing to the files.  However,
     -	they can be read as if they were refs, so `git rev-parse
     -	MERGE_HEAD` will work.
    -+[[def_pseudoref]]pseudoref ref::
     +	A ref that has different semantics than normal refs. These refs can be
     +	accessed via normal Git commands but may not behave the same as a
     +	normal ref in some cases.
 2:  1f2f8cf3f2 !  2:  66ac046132 Documentation/glossary: clarify limitations of pseudorefs
    @@ Commit message
           - They can be read via git-rev-parse(1) and similar tools.
     
           - They are not surfaced when iterating through refs, like when using
    -        git-for-each-ref(1). They are no ref, so iterating through refs
    +        git-for-each-ref(1). They are not refs, so iterating through refs
             should not surface them.
     
           - They cannot be written via git-update-ref(1) and related commands.
    @@ Commit message
      ## Documentation/glossary-content.txt ##
     @@ Documentation/glossary-content.txt: exclude;;
      
    - [[def_pseudoref]]pseudoref ref::
    + [[def_pseudoref]]pseudoref::
      	A ref that has different semantics than normal refs. These refs can be
     -	accessed via normal Git commands but may not behave the same as a
     -	normal ref in some cases.
 3:  9659d7da3f !  3:  243d616101 Documentation/glossary: define root refs as refs
    @@ Documentation/glossary-content.txt: The following pseudorefs are known to Git:
     -Different subhierarchies are used for different purposes (e.g. the
     -`refs/heads/` hierarchy is used to represent local branches).
     +Ref names must either start with `refs/` or be located in the root of
    -+the hierarchy. In that case, their name must conform to the following
    -+rules:
    ++the hierarchy. For the latter, their name must follow these rules:
      +
     -There are a few special-purpose refs that do not begin with `refs/`.
     -The most notable example is `HEAD`.
 4:  3d7ea70417 =  4:  0a116f9d11 refs: rename `is_pseudoref()` to `is_root_ref()`
 5:  e6b6db972d !  5:  484a0856bc refs: refname `is_special_ref()` to `is_pseudo_ref()`
    @@ Commit message
         defined terminology in our gitglossary(7). Note that in the preceding
         commit we have just renamed `is_pseudoref()` to `is_root_ref()`, where
         there may be confusion for in-flight patch series that add new calls to
    -    `is_pseudoref()`. In order to intentionall break such patch series we
    +    `is_pseudoref()`. In order to intentionally break such patch series we
         have thus picked `is_pseudo_ref()` instead of `is_pseudoref()` as the
         new name.
     
 6:  44f72a7baf =  6:  c196fe3c45 refs: classify HEAD as a root ref
 7:  e90b2f8aa9 !  7:  92a71222e1 refs: root refs can be symbolic refs
    @@ Commit message
     
         Last but not least, the current behaviour can actually lead to a
         segfault when calling `is_root_ref()` with a reference that either does
    -    not exist or that is a symbolic ref because we never intialized `oid`.
    +    not exist or that is a symbolic ref because we never initialized `oid`.
     
         Let's loosen the restrictions in accordance to the new definition of
         root refs, which are simply plain refs that may as well be a symbolic
 8:  bc82d7ae65 =  8:  8bd52e5363 refs: pseudorefs are no refs
 9:  95d7547b2e =  9:  cd6d745a01 ref-filter: properly distinguish pseudo and root refs
10:  b2029612dd = 10:  6956fccced refs: refuse to write pseudorefs
-- 
2.45.0


* [PATCH v3 01/10] Documentation/glossary: redefine pseudorefs as special refs
  2024-05-02  8:17 ` [PATCH v3 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
@ 2024-05-02  8:17   ` Patrick Steinhardt
  2024-05-02  8:17   ` [PATCH v3 02/10] Documentation/glossary: clarify limitations of pseudorefs Patrick Steinhardt
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  8:17 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk

Nowadays, Git knows about three different kinds of refs. As defined in
gitglossary(7):

  - Regular refs that start with "refs/", like "refs/heads/main".

  - Pseudorefs, which live in the root directory. These must have
    all-caps names and must be a file that starts with an object hash.
    Consequently, symbolic refs are not pseudorefs because they do not
    start with an object hash.

  - Special refs, of which we only have "FETCH_HEAD" and "MERGE_HEAD".

This state is extremely confusing, and I would claim that most folks
don't fully understand what is what here. The current definitions also
have several problems:

  - Where does "HEAD" fit in? It's not a pseudoref because it can be
    a symbolic ref. It's not a regular ref because it does not start
    with "refs/". And it's not a special ref, either.

  - There is a strong overlap between pseudorefs and special refs. The
    pseudoref section for example mentions "MERGE_HEAD", even though it
    is a special ref. Is it thus both a pseudoref and a special ref?

  - Why do we even need to distinguish refs that live in the root from
    other refs when they behave just like a regular ref anyway?

In other words, the current state is quite a mess and leads to wild
inconsistencies without much of a good reason.

The original reason why pseudorefs were introduced is that there are
some files that sometimes behave like refs, even though they aren't
refs. And we really only have two of these nowadays, namely "MERGE_HEAD"
and "FETCH_HEAD". Those files are never written via the ref backends,
but are instead written by git-fetch(1), git-pull(1) and git-merge(1).
They contain additional metadata that highlights where a ref has been
fetched from or the list of commits that have been merged.

This original intent in fact matches the definition of special refs that
we have recently introduced in 8df4c5d205 (Documentation: add "special
refs" to the glossary, 2024-01-19). Due to the introduction of the new
reftable backend we were forced to distinguish those refs more clearly
such that we don't ever try to read or write them via the reftable
backend. In the same series, we also addressed all the other cases where
we used to write those special refs via the filesystem directly, thus
circumventing the ref backend, to instead write them via the backends.
Consequently, there are no other refs left anymore which are special.

Let's address this mess and restore the pseudoref terminology to its
original intent: a file that sometimes behaves like a ref, but which
isn't really a ref because it gets written to the filesystem directly.
In other words, let's redefine pseudorefs to match the current
definition of special refs. As special refs and pseudorefs are now the
same by definition, we can drop the "special refs" term again. It's not
exposed to our users and thus they wouldn't ever encounter that term
anyway.

Refs that live in the root of the ref hierarchy but which are not
pseudorefs will be further defined in a subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/glossary-content.txt | 40 +++++++++---------------------
 1 file changed, 12 insertions(+), 28 deletions(-)

diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index d71b199955..ca04768e3b 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -497,20 +497,18 @@ exclude;;
 	unusual refs.
 
 [[def_pseudoref]]pseudoref::
-	Pseudorefs are a class of files under `$GIT_DIR` which behave
-	like refs for the purposes of rev-parse, but which are treated
-	specially by git.  Pseudorefs both have names that are all-caps,
-	and always start with a line consisting of a
-	<<def_SHA1,SHA-1>> followed by whitespace.  So, HEAD is not a
-	pseudoref, because it is sometimes a symbolic ref.  They might
-	optionally contain some additional data.  `MERGE_HEAD` and
-	`CHERRY_PICK_HEAD` are examples.  Unlike
-	<<def_per_worktree_ref,per-worktree refs>>, these files cannot
-	be symbolic refs, and never have reflogs.  They also cannot be
-	updated through the normal ref update machinery.  Instead,
-	they are updated by directly writing to the files.  However,
-	they can be read as if they were refs, so `git rev-parse
-	MERGE_HEAD` will work.
+	A ref that has different semantics than normal refs. These refs can be
+	accessed via normal Git commands but may not behave the same as a
+	normal ref in some cases.
++
+The following pseudorefs are known to Git:
+
+ - "`FETCH_HEAD`" is written by linkgit:git-fetch[1] or linkgit:git-pull[1]. It
+   may refer to multiple object IDs. Each object ID is annotated with metadata
+   indicating where it was fetched from and its fetch status.
+
+ - "`MERGE_HEAD`" is written by linkgit:git-merge[1] when resolving merge
+   conflicts. It contains all commit IDs which are being merged.
 
 [[def_pull]]pull::
 	Pulling a <<def_branch,branch>> means to <<def_fetch,fetch>> it and
@@ -638,20 +636,6 @@ The most notable example is `HEAD`.
 	An <<def_object,object>> used to temporarily store the contents of a
 	<<def_dirty,dirty>> working directory and the index for future reuse.
 
-[[def_special_ref]]special ref::
-	A ref that has different semantics than normal refs. These refs can be
-	accessed via normal Git commands but may not behave the same as a
-	normal ref in some cases.
-+
-The following special refs are known to Git:
-
- - "`FETCH_HEAD`" is written by linkgit:git-fetch[1] or linkgit:git-pull[1]. It
-   may refer to multiple object IDs. Each object ID is annotated with metadata
-   indicating where it was fetched from and its fetch status.
-
- - "`MERGE_HEAD`" is written by linkgit:git-merge[1] when resolving merge
-   conflicts. It contains all commit IDs which are being merged.
-
 [[def_submodule]]submodule::
 	A <<def_repository,repository>> that holds the history of a
 	separate project inside another repository (the latter of
-- 
2.45.0


* [PATCH v3 02/10] Documentation/glossary: clarify limitations of pseudorefs
  2024-05-02  8:17 ` [PATCH v3 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
  2024-05-02  8:17   ` [PATCH v3 01/10] Documentation/glossary: redefine pseudorefs as special refs Patrick Steinhardt
@ 2024-05-02  8:17   ` Patrick Steinhardt
  2024-05-02  8:17   ` [PATCH v3 03/10] Documentation/glossary: define root refs as refs Patrick Steinhardt
                     ` (7 subsequent siblings)
  9 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  8:17 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk

Clarify limitations that pseudorefs have:

  - They can be read via git-rev-parse(1) and similar tools.

  - They are not surfaced when iterating through refs, like when using
    git-for-each-ref(1). They are not refs, so iterating through refs
    should not surface them.

  - They cannot be written via git-update-ref(1) and related commands.
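
The three limitations above can be sketched as a small access check. This
is only an illustration and not code from this series: `ref_op_allowed()`,
`is_pseudoref_name()` and `enum ref_op` are hypothetical names, and the
list of pseudorefs is assumed to be just FETCH_HEAD and MERGE_HEAD, as
stated in the glossary text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical operation kinds, mirroring the three bullet points. */
enum ref_op {
	REF_OP_READ,    /* e.g. git-rev-parse(1) */
	REF_OP_ITERATE, /* e.g. git-for-each-ref(1) */
	REF_OP_WRITE    /* e.g. git-update-ref(1) */
};

/* Assumption: only these two names are pseudorefs. */
static int is_pseudoref_name(const char *refname)
{
	static const char *const pseudorefs[] = { "FETCH_HEAD", "MERGE_HEAD" };
	size_t i;

	for (i = 0; i < sizeof(pseudorefs) / sizeof(pseudorefs[0]); i++)
		if (!strcmp(refname, pseudorefs[i]))
			return 1;
	return 0;
}

/*
 * Return 1 if `op` is permitted on `refname` under the rules above:
 * pseudorefs can be read, but are neither iterated over nor written.
 * Any other name is not restricted by this check.
 */
static int ref_op_allowed(const char *refname, enum ref_op op)
{
	assert(refname);
	if (!is_pseudoref_name(refname))
		return 1;
	return op == REF_OP_READ;
}
```

With this model, reading "MERGE_HEAD" is allowed while writing it or
listing it is not, and "refs/heads/main" is unaffected.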

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/glossary-content.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index ca04768e3b..b464b926d5 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -498,8 +498,8 @@ exclude;;
 
 [[def_pseudoref]]pseudoref::
 	A ref that has different semantics than normal refs. These refs can be
-	accessed via normal Git commands but may not behave the same as a
-	normal ref in some cases.
+	read via normal Git commands, but cannot be written to by commands like
+	linkgit:git-update-ref[1].
 +
 The following pseudorefs are known to Git:
 
-- 
2.45.0


* [PATCH v3 03/10] Documentation/glossary: define root refs as refs
  2024-05-02  8:17 ` [PATCH v3 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
  2024-05-02  8:17   ` [PATCH v3 01/10] Documentation/glossary: redefine pseudorefs as special refs Patrick Steinhardt
  2024-05-02  8:17   ` [PATCH v3 02/10] Documentation/glossary: clarify limitations of pseudorefs Patrick Steinhardt
@ 2024-05-02  8:17   ` Patrick Steinhardt
  2024-05-02  8:17   ` [PATCH v3 04/10] refs: rename `is_pseudoref()` to `is_root_ref()` Patrick Steinhardt
                     ` (6 subsequent siblings)
  9 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  8:17 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk

Except for the pseudorefs MERGE_HEAD and FETCH_HEAD, all refs that live
in the root of the ref hierarchy behave the exact same as normal refs.
They can be symbolic refs or direct refs and can be read, iterated over
and written via normal tooling. All of these refs are stored in the ref
backends, which further demonstrates that they are just normal refs.

Extend the definition of "ref" to also cover such root refs. The only
additional restriction for root refs is that they must conform to a
specific naming schema.
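
As a rough sketch of that naming schema, the check below accepts a name
if it is one of the irregular root refs listed in the glossary, or if it
consists only of upper-case characters and underscores and is either
"HEAD" or ends with "_HEAD". The helper names `has_suffix()` and
`root_ref_name_ok()` are made up for this example. The check is purely
syntactic: it neither looks at the ref store nor excludes the pseudorefs
FETCH_HEAD and MERGE_HEAD, which share the same syntax.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Irregular root refs as enumerated in the glossary text. */
static const char *const irregular_root_refs[] = {
	"AUTO_MERGE",
	"BISECT_EXPECTED_REV",
	"NOTES_MERGE_PARTIAL",
	"NOTES_MERGE_REF",
	"MERGE_AUTOSTASH",
};

/* Return 1 if `name` ends with `suffix`, 0 otherwise. */
static int has_suffix(const char *name, const char *suffix)
{
	size_t nlen = strlen(name), slen = strlen(suffix);

	return nlen >= slen && !strcmp(name + nlen - slen, suffix);
}

/* Hypothetical helper: does `name` follow the root ref naming rules? */
static int root_ref_name_ok(const char *name)
{
	const char *c;
	size_t i;

	assert(name);
	if (!*name)
		return 0;

	/* The exhaustive list of historic exceptions is always accepted. */
	for (i = 0; i < sizeof(irregular_root_refs) / sizeof(irregular_root_refs[0]); i++)
		if (!strcmp(name, irregular_root_refs[i]))
			return 1;

	/* Only upper-case ASCII letters and underscores are allowed. */
	for (c = name; *c; c++)
		if (!(*c >= 'A' && *c <= 'Z') && *c != '_')
			return 0;

	return !strcmp(name, "HEAD") || has_suffix(name, "_HEAD");
}
```

Under these assumptions "ORIG_HEAD" and "AUTO_MERGE" pass the check,
while "FOO" and "refs/heads/main" do not.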

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/glossary-content.txt | 32 +++++++++++++++++++++++-------
 1 file changed, 25 insertions(+), 7 deletions(-)

diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index b464b926d5..d6cf907a19 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -550,20 +550,38 @@ The following pseudorefs are known to Git:
 	to the result.
 
 [[def_ref]]ref::
-	A name that begins with `refs/` (e.g. `refs/heads/master`)
-	that points to an <<def_object_name,object name>> or another
-	ref (the latter is called a <<def_symref,symbolic ref>>).
+	A name that points to an <<def_object_name,object name>> or
+	another ref (the latter is called a <<def_symref,symbolic ref>>).
 	For convenience, a ref can sometimes be abbreviated when used
 	as an argument to a Git command; see linkgit:gitrevisions[7]
 	for details.
 	Refs are stored in the <<def_repository,repository>>.
 +
 The ref namespace is hierarchical.
-Different subhierarchies are used for different purposes (e.g. the
-`refs/heads/` hierarchy is used to represent local branches).
+Ref names must either start with `refs/` or be located in the root of
+the hierarchy. For the latter, their name must follow these rules:
 +
-There are a few special-purpose refs that do not begin with `refs/`.
-The most notable example is `HEAD`.
+ - The name consists of only upper-case characters or underscores.
+
+ - The name ends with "`_HEAD`" or is equal to "`HEAD`".
++
+There are some irregular refs in the root of the hierarchy that do not
+match these rules. The following list is exhaustive and shall not be
+extended in the future:
++
+ - AUTO_MERGE
+
+ - BISECT_EXPECTED_REV
+
+ - NOTES_MERGE_PARTIAL
+
+ - NOTES_MERGE_REF
+
+ - MERGE_AUTOSTASH
++
+Different subhierarchies are used for different purposes. For example,
+the `refs/heads/` hierarchy is used to represent local branches whereas
+the `refs/tags/` hierarchy is used to represent local tags.
 
 [[def_reflog]]reflog::
 	A reflog shows the local "history" of a ref.  In other words,
-- 
2.45.0


* [PATCH v3 04/10] refs: rename `is_pseudoref()` to `is_root_ref()`
  2024-05-02  8:17 ` [PATCH v3 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2024-05-02  8:17   ` [PATCH v3 03/10] Documentation/glossary: define root refs as refs Patrick Steinhardt
@ 2024-05-02  8:17   ` Patrick Steinhardt
  2024-05-02  8:17   ` [PATCH v3 05/10] refs: rename `is_special_ref()` to `is_pseudo_ref()` Patrick Steinhardt
                     ` (5 subsequent siblings)
  9 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  8:17 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk

Rename `is_pseudoref()` to `is_root_ref()` to adapt to the newly defined
terminology in our gitglossary(7).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 ref-filter.c            |  2 +-
 refs.c                  | 14 +++++++-------
 refs.h                  | 28 +++++++++++++++++++++++++++-
 refs/files-backend.c    |  2 +-
 refs/reftable-backend.c |  2 +-
 5 files changed, 37 insertions(+), 11 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 59ad6f54dd..361beb6619 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2756,7 +2756,7 @@ static int ref_kind_from_refname(const char *refname)
 			return ref_kind[i].kind;
 	}
 
-	if (is_pseudoref(get_main_ref_store(the_repository), refname))
+	if (is_root_ref(get_main_ref_store(the_repository), refname))
 		return FILTER_REFS_PSEUDOREFS;
 
 	return FILTER_REFS_OTHERS;
diff --git a/refs.c b/refs.c
index 55d2e0b2cb..0a4acde3ca 100644
--- a/refs.c
+++ b/refs.c
@@ -844,7 +844,7 @@ int is_per_worktree_ref(const char *refname)
 	       starts_with(refname, "refs/rewritten/");
 }
 
-static int is_pseudoref_syntax(const char *refname)
+static int is_root_ref_syntax(const char *refname)
 {
 	const char *c;
 
@@ -860,9 +860,9 @@ static int is_pseudoref_syntax(const char *refname)
 	return 1;
 }
 
-int is_pseudoref(struct ref_store *refs, const char *refname)
+int is_root_ref(struct ref_store *refs, const char *refname)
 {
-	static const char *const irregular_pseudorefs[] = {
+	static const char *const irregular_root_refs[] = {
 		"AUTO_MERGE",
 		"BISECT_EXPECTED_REV",
 		"NOTES_MERGE_PARTIAL",
@@ -872,7 +872,7 @@ int is_pseudoref(struct ref_store *refs, const char *refname)
 	struct object_id oid;
 	size_t i;
 
-	if (!is_pseudoref_syntax(refname))
+	if (!is_root_ref_syntax(refname))
 		return 0;
 
 	if (ends_with(refname, "_HEAD")) {
@@ -882,8 +882,8 @@ int is_pseudoref(struct ref_store *refs, const char *refname)
 		return !is_null_oid(&oid);
 	}
 
-	for (i = 0; i < ARRAY_SIZE(irregular_pseudorefs); i++)
-		if (!strcmp(refname, irregular_pseudorefs[i])) {
+	for (i = 0; i < ARRAY_SIZE(irregular_root_refs); i++)
+		if (!strcmp(refname, irregular_root_refs[i])) {
 			refs_resolve_ref_unsafe(refs, refname,
 						RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
 						&oid, NULL);
@@ -902,7 +902,7 @@ int is_headref(struct ref_store *refs, const char *refname)
 }
 
 static int is_current_worktree_ref(const char *ref) {
-	return is_pseudoref_syntax(ref) || is_per_worktree_ref(ref);
+	return is_root_ref_syntax(ref) || is_per_worktree_ref(ref);
 }
 
 enum ref_worktree_type parse_worktree_ref(const char *maybe_worktree_ref,
diff --git a/refs.h b/refs.h
index d278775e08..d0374c3275 100644
--- a/refs.h
+++ b/refs.h
@@ -1051,7 +1051,33 @@ extern struct ref_namespace_info ref_namespace[NAMESPACE__COUNT];
  */
 void update_ref_namespace(enum ref_namespace namespace, char *ref);
 
-int is_pseudoref(struct ref_store *refs, const char *refname);
+/*
+ * Check whether the reference is an existing root reference.
+ *
+ * A root ref is a reference that lives in the root of the reference hierarchy.
+ * These references must conform to special syntax:
+ *
+ *   - Their name must be all-uppercase or underscores ("_").
+ *
+ *   - Their name must end with "_HEAD".
+ *
+ *   - Their name may not contain a slash.
+ *
+ * There is a special set of irregular root refs that exist due to historic
+ * reasons, only. This list shall not be expanded in the future:
+ *
+ *   - AUTO_MERGE
+ *
+ *   - BISECT_EXPECTED_REV
+ *
+ *   - NOTES_MERGE_PARTIAL
+ *
+ *   - NOTES_MERGE_REF
+ *
+ *   - MERGE_AUTOSTASH
+ */
+int is_root_ref(struct ref_store *refs, const char *refname);
+
 int is_headref(struct ref_store *refs, const char *refname);
 
 #endif /* REFS_H */
diff --git a/refs/files-backend.c b/refs/files-backend.c
index a098d14ea0..0fcb601444 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -351,7 +351,7 @@ static void add_pseudoref_and_head_entries(struct ref_store *ref_store,
 		strbuf_addstr(&refname, de->d_name);
 
 		dtype = get_dtype(de, &path, 1);
-		if (dtype == DT_REG && (is_pseudoref(ref_store, de->d_name) ||
+		if (dtype == DT_REG && (is_root_ref(ref_store, de->d_name) ||
 								is_headref(ref_store, de->d_name)))
 			loose_fill_ref_dir_regular_file(refs, refname.buf, dir);
 
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 1cda48c504..5a5e64fe69 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -356,7 +356,7 @@ static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator)
 		 */
 		if (!starts_with(iter->ref.refname, "refs/") &&
 		    !(iter->flags & DO_FOR_EACH_INCLUDE_ROOT_REFS &&
-		     (is_pseudoref(&iter->refs->base, iter->ref.refname) ||
+		     (is_root_ref(&iter->refs->base, iter->ref.refname) ||
 		      is_headref(&iter->refs->base, iter->ref.refname)))) {
 			continue;
 		}
-- 
2.45.0


* [PATCH v3 05/10] refs: rename `is_special_ref()` to `is_pseudo_ref()`
  2024-05-02  8:17 ` [PATCH v3 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2024-05-02  8:17   ` [PATCH v3 04/10] refs: rename `is_pseudoref()` to `is_root_ref()` Patrick Steinhardt
@ 2024-05-02  8:17   ` Patrick Steinhardt
  2024-05-02  8:17   ` [PATCH v3 06/10] refs: classify HEAD as a root ref Patrick Steinhardt
                     ` (4 subsequent siblings)
  9 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  8:17 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk

Rename `is_special_ref()` to `is_pseudo_ref()` to adapt to the newly
defined terminology in our gitglossary(7). Note that in the preceding
commit we have just renamed `is_pseudoref()` to `is_root_ref()`, where
there may be confusion for in-flight patch series that add new calls to
`is_pseudoref()`. In order to intentionally break such patch series we
have thus picked `is_pseudo_ref()` instead of `is_pseudoref()` as the
new name.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/refs.c b/refs.c
index 0a4acde3ca..6266f77474 100644
--- a/refs.c
+++ b/refs.c
@@ -1876,13 +1876,13 @@ static int refs_read_special_head(struct ref_store *ref_store,
 	return result;
 }
 
-static int is_special_ref(const char *refname)
+static int is_pseudo_ref(const char *refname)
 {
 	/*
-	 * Special references are refs that have different semantics compared
-	 * to "normal" refs. These refs can thus not be stored in the ref
-	 * backend, but must always be accessed via the filesystem. The
-	 * following refs are special:
+	 * Pseudorefs are refs that have different semantics compared to
+	 * "normal" refs. These refs can thus not be stored in the ref backend,
+	 * but must always be accessed via the filesystem. The following refs
+	 * are pseudorefs:
 	 *
 	 * - FETCH_HEAD may contain multiple object IDs, and each one of them
 	 *   carries additional metadata like where it came from.
@@ -1891,17 +1891,17 @@ static int is_special_ref(const char *refname)
 	 *   heads.
 	 *
 	 * Reading, writing or deleting references must consistently go either
-	 * through the filesystem (special refs) or through the reference
+	 * through the filesystem (pseudorefs) or through the reference
 	 * backend (normal ones).
 	 */
-	static const char * const special_refs[] = {
+	static const char * const pseudo_refs[] = {
 		"FETCH_HEAD",
 		"MERGE_HEAD",
 	};
 	size_t i;
 
-	for (i = 0; i < ARRAY_SIZE(special_refs); i++)
-		if (!strcmp(refname, special_refs[i]))
+	for (i = 0; i < ARRAY_SIZE(pseudo_refs); i++)
+		if (!strcmp(refname, pseudo_refs[i]))
 			return 1;
 
 	return 0;
@@ -1912,7 +1912,7 @@ int refs_read_raw_ref(struct ref_store *ref_store, const char *refname,
 		      unsigned int *type, int *failure_errno)
 {
 	assert(failure_errno);
-	if (is_special_ref(refname))
+	if (is_pseudo_ref(refname))
 		return refs_read_special_head(ref_store, refname, oid, referent,
 					      type, failure_errno);
 
-- 
2.45.0


* [PATCH v3 06/10] refs: classify HEAD as a root ref
  2024-05-02  8:17 ` [PATCH v3 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
                     ` (4 preceding siblings ...)
  2024-05-02  8:17   ` [PATCH v3 05/10] refs: rename `is_special_ref()` to `is_pseudo_ref()` Patrick Steinhardt
@ 2024-05-02  8:17   ` Patrick Steinhardt
  2024-05-02  8:17   ` [PATCH v3 07/10] refs: root refs can be symbolic refs Patrick Steinhardt
                     ` (3 subsequent siblings)
  9 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  8:17 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk

Root refs are those refs that live in the root of the ref hierarchy.
Our old and venerable "HEAD" reference falls into this category, but we
don't yet classify it as such in `is_root_ref()`.

Adapt the function to also treat "HEAD" as a root ref. This change is
safe to do for all current callers:

- `ref_kind_from_refname()` already handles "HEAD" explicitly before
  calling `is_root_ref()`.

- The "files" and "reftable" backends explicitly call both
  `is_root_ref()` and `is_headref()`.

This change should thus essentially be a no-op.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                  | 2 ++
 refs.h                  | 6 +++++-
 refs/files-backend.c    | 3 +--
 refs/reftable-backend.c | 3 +--
 4 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/refs.c b/refs.c
index 6266f77474..5b89e83ad7 100644
--- a/refs.c
+++ b/refs.c
@@ -874,6 +874,8 @@ int is_root_ref(struct ref_store *refs, const char *refname)
 
 	if (!is_root_ref_syntax(refname))
 		return 0;
+	if (is_headref(refs, refname))
+		return 1;
 
 	if (ends_with(refname, "_HEAD")) {
 		refs_resolve_ref_unsafe(refs, refname,
diff --git a/refs.h b/refs.h
index d0374c3275..4ac454b0c3 100644
--- a/refs.h
+++ b/refs.h
@@ -1059,7 +1059,8 @@ void update_ref_namespace(enum ref_namespace namespace, char *ref);
  *
  *   - Their name must be all-uppercase or underscores ("_").
  *
- *   - Their name must end with "_HEAD".
+ *   - Their name must end with "_HEAD". As a special rule, "HEAD" is a root
+ *     ref, as well.
  *
  *   - Their name may not contain a slash.
  *
@@ -1078,6 +1079,9 @@ void update_ref_namespace(enum ref_namespace namespace, char *ref);
  */
 int is_root_ref(struct ref_store *refs, const char *refname);
 
+/*
+ * Check whether the reference is "HEAD" and whether it exists.
+ */
 int is_headref(struct ref_store *refs, const char *refname);
 
 #endif /* REFS_H */
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 0fcb601444..ea927c516d 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -351,8 +351,7 @@ static void add_pseudoref_and_head_entries(struct ref_store *ref_store,
 		strbuf_addstr(&refname, de->d_name);
 
 		dtype = get_dtype(de, &path, 1);
-		if (dtype == DT_REG && (is_root_ref(ref_store, de->d_name) ||
-								is_headref(ref_store, de->d_name)))
+		if (dtype == DT_REG && is_root_ref(ref_store, de->d_name))
 			loose_fill_ref_dir_regular_file(refs, refname.buf, dir);
 
 		strbuf_setlen(&refname, dirnamelen);
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 5a5e64fe69..41555fcf64 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -356,8 +356,7 @@ static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator)
 		 */
 		if (!starts_with(iter->ref.refname, "refs/") &&
 		    !(iter->flags & DO_FOR_EACH_INCLUDE_ROOT_REFS &&
-		     (is_root_ref(&iter->refs->base, iter->ref.refname) ||
-		      is_headref(&iter->refs->base, iter->ref.refname)))) {
+		      is_root_ref(&iter->refs->base, iter->ref.refname))) {
 			continue;
 		}
 
-- 
2.45.0


* [PATCH v3 07/10] refs: root refs can be symbolic refs
  2024-05-02  8:17 ` [PATCH v3 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
                     ` (5 preceding siblings ...)
  2024-05-02  8:17   ` [PATCH v3 06/10] refs: classify HEAD as a root ref Patrick Steinhardt
@ 2024-05-02  8:17   ` Patrick Steinhardt
  2024-05-03 18:13     ` Jeff King
  2024-05-02  8:17   ` [PATCH v3 08/10] refs: pseudorefs are no refs Patrick Steinhardt
                     ` (2 subsequent siblings)
  9 siblings, 1 reply; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  8:17 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk

Before this patch series, root refs except for "HEAD" and our special
refs were classified as pseudorefs. Furthermore, our terminology
clarified that pseudorefs must not be symbolic refs. This restriction
is enforced in `is_root_ref()`, which explicitly checks that a supposed
root ref resolves to an object ID without recursing.

This has been extremely confusing right from the start because (in old
terminology) a ref name may sometimes be a pseudoref and sometimes not
depending on whether it is a symbolic or regular ref. This behaviour
does not seem reasonable at all and I very much doubt that it results in
anything sane.

Furthermore, the behaviour is different to `is_headref()`, which only
checks for the ref to exist. While that is in line with our glossary,
this inconsistency only adds to the confusion.

Last but not least, the current behaviour can actually lead to a
segfault when calling `is_root_ref()` with a reference that either does
not exist or that is a symbolic ref because we never initialized `oid`.

Let's loosen the restrictions in accordance with the new definition of
root refs, which are simply plain refs that may as well be a symbolic
ref. Consequently, we can just check for the ref to exist instead of
requiring it to be a regular ref.

Add a test that verifies that this does not change user-visible
behaviour. Namely, we still don't want to show broken refs to the user
by default in git-for-each-ref(1). What this does allow though is for
internal callers to surface dangling root refs when they pass in the
`DO_FOR_EACH_INCLUDE_BROKEN` flag.
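
To illustrate why checking for mere existence differs from resolving the
ref, here is a toy model that is independent of Git's actual ref store
API. All names in it (`toy_ref`, `toy_store`, `toy_raw_exists()`,
`toy_resolved_exists()`) are invented for this sketch. A "raw" lookup
only asks whether an entry with that name exists, whereas a "resolved"
lookup follows symbolic refs and fails if the final target is missing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A toy ref: symbolic refs carry a target name, direct refs do not. */
struct toy_ref {
	const char *name;
	const char *symref_target; /* NULL for a direct ref */
};

/* Toy store with one healthy symref and one dangling symref. */
static const struct toy_ref toy_store[] = {
	{ "HEAD", "refs/heads/main" },
	{ "refs/heads/main", NULL },
	{ "DANGLING_HEAD", "refs/heads/missing" },
};

static const struct toy_ref *toy_lookup(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(toy_store) / sizeof(toy_store[0]); i++)
		if (!strcmp(toy_store[i].name, name))
			return &toy_store[i];
	return NULL;
}

/* Raw existence: is there an entry at all, symbolic or not? */
static int toy_raw_exists(const char *name)
{
	assert(name);
	return toy_lookup(name) != NULL;
}

/* Resolved existence: follow symrefs (bounded) to a direct ref. */
static int toy_resolved_exists(const char *name)
{
	int depth;

	assert(name);
	for (depth = 0; depth < 5; depth++) {
		const struct toy_ref *ref = toy_lookup(name);

		if (!ref)
			return 0;
		if (!ref->symref_target)
			return 1;
		name = ref->symref_target;
	}
	return 0; /* too many levels of indirection */
}
```

In this model "DANGLING_HEAD" exists in the raw sense but does not
resolve, which corresponds to the dangling symref created in the new
test: it counts as a root ref, yet is not shown to the user by default.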

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                         | 50 ++++++++++++++++++++++++----------
 t/t6302-for-each-ref-filter.sh | 17 ++++++++++++
 2 files changed, 53 insertions(+), 14 deletions(-)

diff --git a/refs.c b/refs.c
index 5b89e83ad7..ca9844bc3e 100644
--- a/refs.c
+++ b/refs.c
@@ -869,7 +869,10 @@ int is_root_ref(struct ref_store *refs, const char *refname)
 		"NOTES_MERGE_REF",
 		"MERGE_AUTOSTASH",
 	};
-	struct object_id oid;
+	struct strbuf referent = STRBUF_INIT;
+	struct object_id oid = { 0 };
+	int failure_errno, ret = 0;
+	unsigned int flags;
 	size_t i;
 
 	if (!is_root_ref_syntax(refname))
@@ -877,30 +880,49 @@ int is_root_ref(struct ref_store *refs, const char *refname)
 	if (is_headref(refs, refname))
 		return 1;
 
+	/*
+	 * Note that we cannot use `refs_ref_exists()` here because that also
+	 * checks whether its target ref exists in case refname is a symbolic
+	 * ref.
+	 */
 	if (ends_with(refname, "_HEAD")) {
-		refs_resolve_ref_unsafe(refs, refname,
-					RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
-					&oid, NULL);
-		return !is_null_oid(&oid);
+		ret = !refs_read_raw_ref(refs, refname, &oid, &referent,
+					 &flags, &failure_errno);
+		goto done;
 	}
 
-	for (i = 0; i < ARRAY_SIZE(irregular_root_refs); i++)
+	for (i = 0; i < ARRAY_SIZE(irregular_root_refs); i++) {
 		if (!strcmp(refname, irregular_root_refs[i])) {
-			refs_resolve_ref_unsafe(refs, refname,
-						RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
-						&oid, NULL);
-			return !is_null_oid(&oid);
+			ret = !refs_read_raw_ref(refs, refname, &oid, &referent,
+						 &flags, &failure_errno);
+			goto done;
 		}
+	}
 
-	return 0;
+done:
+	strbuf_release(&referent);
+	return ret;
 }
 
 int is_headref(struct ref_store *refs, const char *refname)
 {
-	if (!strcmp(refname, "HEAD"))
-		return refs_ref_exists(refs, refname);
+	struct strbuf referent = STRBUF_INIT;
+	struct object_id oid = { 0 };
+	int failure_errno, ret = 0;
+	unsigned int flags;
 
-	return 0;
+	/*
+	 * Note that we cannot use `refs_ref_exists()` here because that also
+	 * checks whether its target ref exists in case refname is a symbolic
+	 * ref.
+	 */
+	if (!strcmp(refname, "HEAD")) {
+		ret = !refs_read_raw_ref(refs, refname, &oid, &referent,
+					 &flags, &failure_errno);
+	}
+
+	strbuf_release(&referent);
+	return ret;
 }
 
 static int is_current_worktree_ref(const char *ref) {
diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh
index 948f1bb5f4..92ed8957c8 100755
--- a/t/t6302-for-each-ref-filter.sh
+++ b/t/t6302-for-each-ref-filter.sh
@@ -62,6 +62,23 @@ test_expect_success '--include-root-refs with other patterns' '
 	test_cmp expect actual
 '
 
+test_expect_success '--include-root-refs omits dangling symrefs' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		git symbolic-ref DANGLING_HEAD refs/heads/missing &&
+		cat >expect <<-EOF &&
+		HEAD
+		$(git symbolic-ref HEAD)
+		refs/tags/initial
+		EOF
+		git for-each-ref --format="%(refname)" --include-root-refs >actual &&
+		test_cmp expect actual
+	)
+'
+
 test_expect_success 'filtering with --points-at' '
 	cat >expect <<-\EOF &&
 	refs/heads/main
-- 
2.45.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v3 08/10] refs: pseudorefs are no refs
  2024-05-02  8:17 ` [PATCH v3 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
                     ` (6 preceding siblings ...)
  2024-05-02  8:17   ` [PATCH v3 07/10] refs: root refs can be symbolic refs Patrick Steinhardt
@ 2024-05-02  8:17   ` Patrick Steinhardt
  2024-05-02  8:17   ` [PATCH v3 09/10] ref-filter: properly distinuish pseudo and root refs Patrick Steinhardt
  2024-05-02  8:17   ` [PATCH v3 10/10] refs: refuse to write pseudorefs Patrick Steinhardt
  9 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  8:17 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk

[-- Attachment #1: Type: text/plain, Size: 4360 bytes --]

The `is_root_ref()` function will happily classify a pseudoref as a root
ref, even though pseudorefs are no refs. Besides being wrong, it also
leads to inconsistent behaviour across ref backends: while the "files"
backend accidentally knows to parse those pseudorefs and thus yields
them to the caller, the "reftable" backend won't ever see the pseudorefs
at all because they are never stored in the "reftable" backend.

Fix this issue by filtering out pseudorefs in `is_root_ref()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                         | 65 +++++++++++++++++-----------------
 t/t6302-for-each-ref-filter.sh | 17 +++++++++
 2 files changed, 50 insertions(+), 32 deletions(-)
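
For readers following along, the filtering described in the commit message
can be sketched roughly as below. This is a simplified, standalone
illustration and not the code from refs.c: `looks_like_root_ref()` is a
hypothetical name, the syntax check is an assumed approximation (uppercase
ASCII letters and underscores only), and the lookup in the ref store that
the real `is_root_ref()` performs is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The two pseudorefs named in the patch. */
static int is_pseudo_ref(const char *refname)
{
	static const char *const pseudo_refs[] = { "FETCH_HEAD", "MERGE_HEAD" };
	size_t i;

	assert(refname);
	for (i = 0; i < sizeof(pseudo_refs) / sizeof(pseudo_refs[0]); i++)
		if (!strcmp(refname, pseudo_refs[i]))
			return 1;
	return 0;
}

/* Assumption: root ref names consist of 'A'..'Z' and '_' only. */
static int is_root_ref_syntax(const char *refname)
{
	const char *c;

	assert(refname);
	if (!*refname)
		return 0;
	for (c = refname; *c; c++)
		if (!(*c >= 'A' && *c <= 'Z') && *c != '_')
			return 0;
	return 1;
}

/*
 * Hypothetical name-only stand-in for is_root_ref(): anything with root
 * ref syntax qualifies unless it is a pseudoref.
 */
static int looks_like_root_ref(const char *refname)
{
	if (!is_root_ref_syntax(refname) || is_pseudo_ref(refname))
		return 0;
	return 1;
}
```

With this sketch, "HEAD" and "ORIG_HEAD" pass the check while "MERGE_HEAD"
and "refs/heads/main" do not, which matches the expectation encoded in the
new test below.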

diff --git a/refs.c b/refs.c
index ca9844bc3e..dec9dbdc2d 100644
--- a/refs.c
+++ b/refs.c
@@ -844,6 +844,37 @@ int is_per_worktree_ref(const char *refname)
 	       starts_with(refname, "refs/rewritten/");
 }
 
+static int is_pseudo_ref(const char *refname)
+{
+	/*
+	 * Pseudorefs are refs that have different semantics compared to
+	 * "normal" refs. These refs can thus not be stored in the ref backend,
+	 * but must always be accessed via the filesystem. The following refs
+	 * are pseudorefs:
+	 *
+	 * - FETCH_HEAD may contain multiple object IDs, and each one of them
+	 *   carries additional metadata like where it came from.
+	 *
+	 * - MERGE_HEAD may contain multiple object IDs when merging multiple
+	 *   heads.
+	 *
+	 * Reading, writing or deleting references must consistently go either
+	 * through the filesystem (pseudorefs) or through the reference
+	 * backend (normal ones).
+	 */
+	static const char * const pseudo_refs[] = {
+		"FETCH_HEAD",
+		"MERGE_HEAD",
+	};
+	size_t i;
+
+	for (i = 0; i < ARRAY_SIZE(pseudo_refs); i++)
+		if (!strcmp(refname, pseudo_refs[i]))
+			return 1;
+
+	return 0;
+}
+
 static int is_root_ref_syntax(const char *refname)
 {
 	const char *c;
@@ -875,7 +906,8 @@ int is_root_ref(struct ref_store *refs, const char *refname)
 	unsigned int flags;
 	size_t i;
 
-	if (!is_root_ref_syntax(refname))
+	if (!is_root_ref_syntax(refname) ||
+	    is_pseudo_ref(refname))
 		return 0;
 	if (is_headref(refs, refname))
 		return 1;
@@ -1900,37 +1932,6 @@ static int refs_read_special_head(struct ref_store *ref_store,
 	return result;
 }
 
-static int is_pseudo_ref(const char *refname)
-{
-	/*
-	 * Pseudorefs are refs that have different semantics compared to
-	 * "normal" refs. These refs can thus not be stored in the ref backend,
-	 * but must always be accessed via the filesystem. The following refs
-	 * are pseudorefs:
-	 *
-	 * - FETCH_HEAD may contain multiple object IDs, and each one of them
-	 *   carries additional metadata like where it came from.
-	 *
-	 * - MERGE_HEAD may contain multiple object IDs when merging multiple
-	 *   heads.
-	 *
-	 * Reading, writing or deleting references must consistently go either
-	 * through the filesystem (pseudorefs) or through the reference
-	 * backend (normal ones).
-	 */
-	static const char * const pseudo_refs[] = {
-		"FETCH_HEAD",
-		"MERGE_HEAD",
-	};
-	size_t i;
-
-	for (i = 0; i < ARRAY_SIZE(pseudo_refs); i++)
-		if (!strcmp(refname, pseudo_refs[i]))
-			return 1;
-
-	return 0;
-}
-
 int refs_read_raw_ref(struct ref_store *ref_store, const char *refname,
 		      struct object_id *oid, struct strbuf *referent,
 		      unsigned int *type, int *failure_errno)
diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh
index 92ed8957c8..163c378cfd 100755
--- a/t/t6302-for-each-ref-filter.sh
+++ b/t/t6302-for-each-ref-filter.sh
@@ -52,6 +52,23 @@ test_expect_success '--include-root-refs pattern prints pseudorefs' '
 	test_cmp expect actual
 '
 
+test_expect_success '--include-root-refs pattern does not print special refs' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		git rev-parse HEAD >.git/MERGE_HEAD &&
+		git for-each-ref --format="%(refname)" --include-root-refs >actual &&
+		cat >expect <<-EOF &&
+		HEAD
+		$(git symbolic-ref HEAD)
+		refs/tags/initial
+		EOF
+		test_cmp expect actual
+	)
+'
+
 test_expect_success '--include-root-refs with other patterns' '
 	cat >expect <<-\EOF &&
 	HEAD
-- 
2.45.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v3 09/10] ref-filter: properly distinuish pseudo and root refs
  2024-05-02  8:17 ` [PATCH v3 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
                     ` (7 preceding siblings ...)
  2024-05-02  8:17   ` [PATCH v3 08/10] refs: pseudorefs are no refs Patrick Steinhardt
@ 2024-05-02  8:17   ` Patrick Steinhardt
  2024-05-02  8:17   ` [PATCH v3 10/10] refs: refuse to write pseudorefs Patrick Steinhardt
  9 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  8:17 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk

[-- Attachment #1: Type: text/plain, Size: 5906 bytes --]

The ref-filter interfaces currently define root refs as either a
detached HEAD or a pseudo ref. Pseudo refs aren't root refs though, so
let's properly distinguish those ref types.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/for-each-ref.c |  2 +-
 ref-filter.c           | 16 +++++++++-------
 ref-filter.h           |  4 ++--
 refs.c                 | 18 +-----------------
 refs.h                 | 18 ++++++++++++++++++
 5 files changed, 31 insertions(+), 27 deletions(-)
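
As a rough illustration of why root refs get their own bit, the kind
masking in `apply_ref_filter()` can be modelled in isolation as below.
The three flag values are taken from ref-filter.h in this patch; the
`KIND_` names and `effective_kind()` are hypothetical and exist only for
this sketch.

```c
#include <assert.h>

/* Values mirror the FILTER_REFS_* bits from this patch; names are made up. */
enum {
	KIND_DETACHED_HEAD = 0x0020,
	KIND_PSEUDOREFS    = 0x0040,
	KIND_ROOT_REFS     = 0x0080,
};

/*
 * Returns the kind a ref is reported as, or 0 if the filter drops it.
 * A detached HEAD is reported as a root ref whenever root refs were
 * requested, so that it is formatted like the other root refs.
 */
static unsigned int effective_kind(unsigned int filter_kind, unsigned int kind)
{
	assert(kind != 0);
	if ((filter_kind & KIND_ROOT_REFS) && kind == KIND_DETACHED_HEAD)
		return KIND_ROOT_REFS;
	if (!(kind & filter_kind))
		return 0;
	return kind;
}
```

Because `KIND_ROOT_REFS` no longer overlaps with `KIND_PSEUDOREFS`, a
filter of `KIND_ROOT_REFS | KIND_DETACHED_HEAD` (what `--include-root-refs`
sets) drops refs classified as pseudorefs instead of silently matching
them.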

diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
index 919282e12a..5517a4a1c0 100644
--- a/builtin/for-each-ref.c
+++ b/builtin/for-each-ref.c
@@ -98,7 +98,7 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
 	}
 
 	if (include_root_refs)
-		flags |= FILTER_REFS_ROOT_REFS;
+		flags |= FILTER_REFS_ROOT_REFS | FILTER_REFS_DETACHED_HEAD;
 
 	filter.match_as_path = 1;
 	filter_and_format_refs(&filter, flags, sorting, &format);
diff --git a/ref-filter.c b/ref-filter.c
index 361beb6619..d72113edfe 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2628,7 +2628,7 @@ static int for_each_fullref_in_pattern(struct ref_filter *filter,
 				       each_ref_fn cb,
 				       void *cb_data)
 {
-	if (filter->kind == FILTER_REFS_KIND_MASK) {
+	if (filter->kind & FILTER_REFS_ROOT_REFS) {
 		/* In this case, we want to print all refs including root refs. */
 		return refs_for_each_include_root_refs(get_main_ref_store(the_repository),
 						       cb, cb_data);
@@ -2756,8 +2756,10 @@ static int ref_kind_from_refname(const char *refname)
 			return ref_kind[i].kind;
 	}
 
-	if (is_root_ref(get_main_ref_store(the_repository), refname))
+	if (is_pseudo_ref(refname))
 		return FILTER_REFS_PSEUDOREFS;
+	if (is_root_ref(get_main_ref_store(the_repository), refname))
+		return FILTER_REFS_ROOT_REFS;
 
 	return FILTER_REFS_OTHERS;
 }
@@ -2794,11 +2796,11 @@ static struct ref_array_item *apply_ref_filter(const char *refname, const struct
 	/*
 	 * Generally HEAD refs are printed with special description denoting a rebase,
 	 * detached state and so forth. This is useful when only printing the HEAD ref
-	 * But when it is being printed along with other pseudorefs, it makes sense to
-	 * keep the formatting consistent. So we mask the type to act like a pseudoref.
+	 * But when it is being printed along with other root refs, it makes sense to
+	 * keep the formatting consistent. So we mask the type to act like a root ref.
 	 */
-	if (filter->kind == FILTER_REFS_KIND_MASK && kind == FILTER_REFS_DETACHED_HEAD)
-		kind = FILTER_REFS_PSEUDOREFS;
+	if (filter->kind & FILTER_REFS_ROOT_REFS && kind == FILTER_REFS_DETACHED_HEAD)
+		kind = FILTER_REFS_ROOT_REFS;
 	else if (!(kind & filter->kind))
 		return NULL;
 
@@ -3072,7 +3074,7 @@ static int do_filter_refs(struct ref_filter *filter, unsigned int type, each_ref
 		 * When printing all ref types, HEAD is already included,
 		 * so we don't want to print HEAD again.
 		 */
-		if (!ret && (filter->kind != FILTER_REFS_KIND_MASK) &&
+		if (!ret && !(filter->kind & FILTER_REFS_ROOT_REFS) &&
 		    (filter->kind & FILTER_REFS_DETACHED_HEAD))
 			head_ref(fn, cb_data);
 	}
diff --git a/ref-filter.h b/ref-filter.h
index 0ca28d2bba..27ae1aa0d1 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -23,9 +23,9 @@
 				    FILTER_REFS_REMOTES | FILTER_REFS_OTHERS)
 #define FILTER_REFS_DETACHED_HEAD  0x0020
 #define FILTER_REFS_PSEUDOREFS     0x0040
-#define FILTER_REFS_ROOT_REFS      (FILTER_REFS_DETACHED_HEAD | FILTER_REFS_PSEUDOREFS)
+#define FILTER_REFS_ROOT_REFS      0x0080
 #define FILTER_REFS_KIND_MASK      (FILTER_REFS_REGULAR | FILTER_REFS_DETACHED_HEAD | \
-				    FILTER_REFS_PSEUDOREFS)
+				    FILTER_REFS_PSEUDOREFS | FILTER_REFS_ROOT_REFS)
 
 struct atom_value;
 struct ref_sorting;
diff --git a/refs.c b/refs.c
index dec9dbdc2d..50d679b7e7 100644
--- a/refs.c
+++ b/refs.c
@@ -844,24 +844,8 @@ int is_per_worktree_ref(const char *refname)
 	       starts_with(refname, "refs/rewritten/");
 }
 
-static int is_pseudo_ref(const char *refname)
+int is_pseudo_ref(const char *refname)
 {
-	/*
-	 * Pseudorefs are refs that have different semantics compared to
-	 * "normal" refs. These refs can thus not be stored in the ref backend,
-	 * but must always be accessed via the filesystem. The following refs
-	 * are pseudorefs:
-	 *
-	 * - FETCH_HEAD may contain multiple object IDs, and each one of them
-	 *   carries additional metadata like where it came from.
-	 *
-	 * - MERGE_HEAD may contain multiple object IDs when merging multiple
-	 *   heads.
-	 *
-	 * Reading, writing or deleting references must consistently go either
-	 * through the filesystem (pseudorefs) or through the reference
-	 * backend (normal ones).
-	 */
 	static const char * const pseudo_refs[] = {
 		"FETCH_HEAD",
 		"MERGE_HEAD",
diff --git a/refs.h b/refs.h
index 4ac454b0c3..8255989e7e 100644
--- a/refs.h
+++ b/refs.h
@@ -1084,4 +1084,22 @@ int is_root_ref(struct ref_store *refs, const char *refname);
  */
 int is_headref(struct ref_store *refs, const char *refname);
 
+/*
+ * Pseudorefs are refs that have different semantics compared to
+ * "normal" refs. These refs can thus not be stored in the ref backend,
+ * but must always be accessed via the filesystem. The following refs
+ * are pseudorefs:
+ *
+ * - FETCH_HEAD may contain multiple object IDs, and each one of them
+ *   carries additional metadata like where it came from.
+ *
+ * - MERGE_HEAD may contain multiple object IDs when merging multiple
+ *   heads.
+ *
+ * Reading, writing or deleting references must consistently go either
+ * through the filesystem (pseudorefs) or through the reference
+ * backend (normal ones).
+ */
+int is_pseudo_ref(const char *refname);
+
 #endif /* REFS_H */
-- 
2.45.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v3 10/10] refs: refuse to write pseudorefs
  2024-05-02  8:17 ` [PATCH v3 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
                     ` (8 preceding siblings ...)
  2024-05-02  8:17   ` [PATCH v3 09/10] ref-filter: properly distinuish pseudo and root refs Patrick Steinhardt
@ 2024-05-02  8:17   ` Patrick Steinhardt
  9 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  8:17 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk

[-- Attachment #1: Type: text/plain, Size: 2583 bytes --]

Pseudorefs are not stored in the ref database because, by definition,
they carry additional metadata that essentially makes them not a ref. As
such, writing pseudorefs via the ref backend does not make any sense
whatsoever, as the ref backend wouldn't know how exactly to store the
data.

Refuse to write pseudorefs via the ref backend.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c           | 7 +++++++
 t/t5510-fetch.sh | 6 +++---
 2 files changed, 10 insertions(+), 3 deletions(-)
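
The guard added to `ref_transaction_update()` can be pictured with the
standalone sketch below. It is only a model of the check: `queue_update()`
and `SKIP_REFNAME_VERIFICATION` are hypothetical stand-ins for the
transaction code and for `REF_SKIP_REFNAME_VERIFICATION`, and a plain
character buffer replaces the strbuf used for the error message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for REF_SKIP_REFNAME_VERIFICATION. */
#define SKIP_REFNAME_VERIFICATION (1u << 0)

static int is_pseudo_ref(const char *refname)
{
	return !strcmp(refname, "FETCH_HEAD") || !strcmp(refname, "MERGE_HEAD");
}

/*
 * Returns 0 if the update would be accepted, or -1 with a message in
 * `err` if the refname is a pseudoref and verification was not skipped.
 */
static int queue_update(const char *refname, unsigned int flags,
			char *err, size_t errlen)
{
	assert(refname && err && errlen > 0);
	err[0] = '\0';
	if (!(flags & SKIP_REFNAME_VERIFICATION) && is_pseudo_ref(refname)) {
		snprintf(err, errlen, "refusing to update pseudoref '%s'",
			 refname);
		return -1;
	}
	return 0;
}
```

This is also why the tests below switch from `git update-ref -d FETCH_HEAD`
to removing the file directly: the ref backend no longer accepts updates
to that name.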

diff --git a/refs.c b/refs.c
index 50d679b7e7..7c3c7465a4 100644
--- a/refs.c
+++ b/refs.c
@@ -1307,6 +1307,13 @@ int ref_transaction_update(struct ref_transaction *transaction,
 		return -1;
 	}
 
+	if (!(flags & REF_SKIP_REFNAME_VERIFICATION) &&
+	    is_pseudo_ref(refname)) {
+		strbuf_addf(err, _("refusing to update pseudoref '%s'"),
+			    refname);
+		return -1;
+	}
+
 	if (flags & ~REF_TRANSACTION_UPDATE_ALLOWED_FLAGS)
 		BUG("illegal flags 0x%x passed to ref_transaction_update()", flags);
 
diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index 33d34d5ae9..4eb569f4df 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -518,7 +518,7 @@ test_expect_success 'fetch with a non-applying branch.<name>.merge' '
 test_expect_success 'fetch from GIT URL with a non-applying branch.<name>.merge [1]' '
 	one_head=$(cd one && git rev-parse HEAD) &&
 	this_head=$(git rev-parse HEAD) &&
-	git update-ref -d FETCH_HEAD &&
+	rm .git/FETCH_HEAD &&
 	git fetch one &&
 	test $one_head = "$(git rev-parse --verify FETCH_HEAD)" &&
 	test $this_head = "$(git rev-parse --verify HEAD)"
@@ -530,7 +530,7 @@ test_expect_success 'fetch from GIT URL with a non-applying branch.<name>.merge
 	one_ref=$(cd one && git symbolic-ref HEAD) &&
 	git config branch.main.remote blub &&
 	git config branch.main.merge "$one_ref" &&
-	git update-ref -d FETCH_HEAD &&
+	rm .git/FETCH_HEAD &&
 	git fetch one &&
 	test $one_head = "$(git rev-parse --verify FETCH_HEAD)" &&
 	test $this_head = "$(git rev-parse --verify HEAD)"
@@ -540,7 +540,7 @@ test_expect_success 'fetch from GIT URL with a non-applying branch.<name>.merge
 # the merge spec does not match the branch the remote HEAD points to
 test_expect_success 'fetch from GIT URL with a non-applying branch.<name>.merge [3]' '
 	git config branch.main.merge "${one_ref}_not" &&
-	git update-ref -d FETCH_HEAD &&
+	rm .git/FETCH_HEAD &&
 	git fetch one &&
 	test $one_head = "$(git rev-parse --verify FETCH_HEAD)" &&
 	test $this_head = "$(git rev-parse --verify HEAD)"
-- 
2.45.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 09/10] ref-filter: properly distinuish pseudo and root refs
  2024-05-02  8:08       ` Patrick Steinhardt
@ 2024-05-02 10:03         ` Karthik Nayak
  0 siblings, 0 replies; 93+ messages in thread
From: Karthik Nayak @ 2024-05-02 10:03 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Jeff King, Phillip Wood, Junio C Hamano, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 2910 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> On Tue, Apr 30, 2024 at 06:11:05AM -0700, Karthik Nayak wrote:
>> In the subject: s/distinuish/distinguish
>>
>> Patrick Steinhardt <ps@pks.im> writes:
>>
>> > The ref-filter interfaces currently define root refs as either a
>> > detached HEAD or a pseudo ref. Pseudo refs aren't root refs though, so
>> > let's properly distinguish those ref types.
>> >
>> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
>> > ---
>> >  builtin/for-each-ref.c |  2 +-
>> >  ref-filter.c           | 16 +++++++++-------
>> >  ref-filter.h           |  4 ++--
>> >  refs.c                 | 18 +-----------------
>> >  refs.h                 | 18 ++++++++++++++++++
>> >  5 files changed, 31 insertions(+), 27 deletions(-)
>> >
>> > diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
>> > index 919282e12a..5517a4a1c0 100644
>> > --- a/builtin/for-each-ref.c
>> > +++ b/builtin/for-each-ref.c
>> > @@ -98,7 +98,7 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
>> >  	}
>> >
>> >  	if (include_root_refs)
>> > -		flags |= FILTER_REFS_ROOT_REFS;
>> > +		flags |= FILTER_REFS_ROOT_REFS | FILTER_REFS_DETACHED_HEAD;
>>
>> The only issue I see with this patch is that it makes me think that HEAD
>> is not a root ref anymore. I get that this is the best way to define the
>> directives because otherwise you'd need a new flag something like
>> `FILTER_REFS_ROOT_REFS_WITHOUT_HEAD` and `FILTER_REFS_ROOT_REFS` would
>> be the summation of that and the HEAD flag.
>>
>> Apart from this, the patch looks good.
>
> Well, it is a root ref, but we treat it differently in the ref-filter
> interface because it's rendered differently than any other root ref.
> Furthermore, the ref-filter interfaces allow you to _only_ list the HEAD
> ref, which is another reason why it's singled out.
>
> Renaming this to FILTER_REFS_ROOT_REFS_WITHOUT_HEAD would be quite
> misleading, too, because we have the following snippet:
>
> @@ -2794,11 +2796,11 @@ static struct ref_array_item *apply_ref_filter(const char *refname, const struct
>         /*
>          * Generally HEAD refs are printed with special description denoting a rebase,
>          * detached state and so forth. This is useful when only printing the HEAD ref
>          * But when it is being printed along with other root refs, it makes sense to
>          * keep the formatting consistent. So we mask the type to act like a root ref.
>          */
>         if (filter->kind & FILTER_REFS_ROOT_REFS && kind == FILTER_REFS_DETACHED_HEAD)
>                 kind = FILTER_REFS_ROOT_REFS;
>         else if (!(kind & filter->kind))
>                 return NULL;
>

Right..

> If we named this FILTER_REFS_ROOT_REFS_WITHOUT_HEAD then the above code
> would be even more surprising.
>
> So yeah, it's a bit weird, but I think it's more sensible to retain the
> code as proposed.

Agreed. Let's keep it as is!

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v3 07/10] refs: root refs can be symbolic refs
  2024-05-02  8:17   ` [PATCH v3 07/10] refs: root refs can be symbolic refs Patrick Steinhardt
@ 2024-05-03 18:13     ` Jeff King
  2024-05-15  4:16       ` Patrick Steinhardt
  0 siblings, 1 reply; 93+ messages in thread
From: Jeff King @ 2024-05-03 18:13 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Karthik Nayak, Phillip Wood, Junio C Hamano, Justin Tobler,
	Kristoffer Haugsbakk

On Thu, May 02, 2024 at 10:17:42AM +0200, Patrick Steinhardt wrote:

> Before this patch series, root refs except for "HEAD" and our special
> refs were classified as pseudorefs. Furthermore, our terminology
> clarified that pseudorefs must not be symbolic refs. This restriction
> is enforced in `is_root_ref()`, which explicitly checks that a supposed
> root ref resolves to an object ID without recursing.
> 
> This has been extremely confusing right from the start because (in old
> terminology) a ref name may sometimes be a pseudoref and sometimes not
> depending on whether it is a symbolic or regular ref. This behaviour
> does not seem reasonable at all and I very much doubt that it results in
> anything sane.
> 
> Furthermore, the behaviour is different to `is_headref()`, which only
> checks for the ref to exist. While that is in line with our glossary,
> this inconsistency only adds to the confusion.
> 
> Last but not least, the current behaviour can actually lead to a
> segfault when calling `is_root_ref()` with a reference that either does
> not exist or that is a symbolic ref because we never initialized `oid`.
> 
> Let's loosen the restrictions in accordance to the new definition of
> root refs, which are simply plain refs that may as well be a symbolic
> ref. Consequently, we can just check for the ref to exist instead of
> requiring it to be a regular ref.

It's not clear to me that this existence check is particularly useful.
A name will fail read_raw_ref() if:

  - the file does not exist at all. But then how did somebody find out
    about it at all to ask is_pseudoref()?

  - it does exist, but does not look like a ref. Is this important? If I
    do "echo foo >.git/CHERRY_PICK_HEAD", does it become not a root ref
    anymore? Or is it a root ref that is broken? I'd have thought the
    latter, and the syntax is what distinguishes it.

Making the classification purely syntactic based on the name feels
simpler to me to reason about. You'll never run into confusing cases
where repo state changes how commands may behave.

And arguably is_pseudoref_syntax() should be taking into account the
"_HEAD" restriction and special names anyway. It is a bit weird that
even if we tighten up the refname checking to use is_pseudoref_syntax(),
you'd still be able to "git update-ref FOO" but then not see it as a
root ref!
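
To make the idea concrete, a purely name-based check along these lines
might look like the sketch below. It is an illustration only, not code
from the series: `is_root_ref_name()` and `has_suffix()` are hypothetical
names, the allowed character set is an assumption, and the list of
irregular names is the one from the glossary patch in this series.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int has_suffix(const char *s, const char *suffix)
{
	size_t slen = strlen(s), suflen = strlen(suffix);

	return slen >= suflen && !strcmp(s + slen - suflen, suffix);
}

/*
 * Classify by name alone: uppercase letters and underscores only, and
 * either "HEAD", a "_HEAD" suffix, or one of the historic exceptions.
 * No repository state is consulted.
 */
static int is_root_ref_name(const char *refname)
{
	static const char *const irregular[] = {
		"AUTO_MERGE",
		"BISECT_EXPECTED_REV",
		"NOTES_MERGE_PARTIAL",
		"NOTES_MERGE_REF",
		"MERGE_AUTOSTASH",
	};
	const char *c;
	size_t i;

	assert(refname);
	if (!*refname)
		return 0;
	for (c = refname; *c; c++)
		if (!(*c >= 'A' && *c <= 'Z') && *c != '_')
			return 0;
	if (!strcmp(refname, "HEAD") || has_suffix(refname, "_HEAD"))
		return 1;
	for (i = 0; i < sizeof(irregular) / sizeof(irregular[0]); i++)
		if (!strcmp(refname, irregular[i]))
			return 1;
	return 0;
}
```

Under such a rule "FOO" is rejected by name alone, regardless of what is
on disk, while "CHERRY_PICK_HEAD" is accepted whether or not the file
currently holds a valid ref. Names like "FETCH_HEAD" would still need to
be excluded separately.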

-Peff

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 07/10] refs: root refs can be symbolic refs
  2024-05-02  8:07       ` Patrick Steinhardt
@ 2024-05-03 20:49         ` Justin Tobler
  2024-05-07 10:32           ` Patrick Steinhardt
  0 siblings, 1 reply; 93+ messages in thread
From: Justin Tobler @ 2024-05-03 20:49 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano

On 24/05/02 10:07AM, Patrick Steinhardt wrote:
> On Tue, Apr 30, 2024 at 12:09:57PM -0500, Justin Tobler wrote:

> > I'm not quite sure I understand why we are changing the behavior of
> > `is_headref()` here. Do we no longer want to validate the ref exists if
> > it is symbolic?
> 
> The implementation does not conform to the definition of a "HEAD" ref.
> Even before this patch series, a "HEAD" ref could either be a symbolic
> or a regular ref. So to answer the question of "Is this a HEAD ref?" you
> only need to check whether the ref exists, not whether its target
> exists.

Thanks Patrick! I think this explanation might be good to add to the
commit message.

> > In a prior commit, `is_headref()` is commented to mention that we check
> > whether the reference exists. Maybe that could use some additional
> > clarification?
> 
> Which particular commit do you refer to? It's not part of this series,
> is it?

I'm referring to the comment added above `is_headref()` in
(refs: classify HEAD as a root ref, 2024-04-30):

"Check whether the reference is "HEAD" and whether it exists."

Maybe I misunderstand its intent though.

-Justin

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v2 07/10] refs: root refs can be symbolic refs
  2024-05-03 20:49         ` Justin Tobler
@ 2024-05-07 10:32           ` Patrick Steinhardt
  0 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-07 10:32 UTC (permalink / raw)
  To: git, Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 1633 bytes --]

On Fri, May 03, 2024 at 03:49:25PM -0500, Justin Tobler wrote:
> On 24/05/02 10:07AM, Patrick Steinhardt wrote:
> > On Tue, Apr 30, 2024 at 12:09:57PM -0500, Justin Tobler wrote:
> 
> > > I'm not quite sure I understand why we are changing the behavior of
> > > `is_headref()` here. Do we no longer want to validate the ref exists if
> > > it is symbolic?
> > 
> > The implementation does not conform to the definition of a "HEAD" ref.
> > Even before this patch series, a "HEAD" ref could either be a symbolic
> > or a regular ref. So to answer the question of "Is this a HEAD ref?" you
> > only need to check whether the ref exists, not whether its target
> > exists.
> 
> Thanks Patrick! I think this explanation might be good to add to the
> commit message.

I'll restructure this a bit. In fact, we can even get rid of
`is_headref()` completely as it is now covered by `is_root_ref()`, and
there are no callers of `is_headref()`.

> > > In a prior commit, `is_headref()` is commented to mention that we check
> > > whether the reference exists. Maybe that could use some additional
> > > clarification?
> > 
> > Which particular commit do you refer to? It's not part of this series,
> > is it?
> 
> I'm referring to the comment added above `is_headref()` in
> (refs: classify HEAD as a root ref, 2024-04-30):
> 
> "Check whether the reference is "HEAD" and whether it exists."
> 
> Maybe I misunderstand its intent though.

Ah, now I get what you're saying. Yeah, this could indeed use a
clarification. I'll add it to `is_root_ref()` though given that
`is_headref()` will go away.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 2/3] refs: do not label special refs as pseudo refs
  2024-04-29 13:41 ` [PATCH 2/3] refs: do not label special refs as pseudo refs Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2024-04-29 22:52   ` Justin Tobler
@ 2024-05-09 17:29   ` Jean-Noël AVILA
  2024-05-10  8:33     ` Patrick Steinhardt
  3 siblings, 1 reply; 93+ messages in thread
From: Jean-Noël AVILA @ 2024-05-09 17:29 UTC (permalink / raw)
  To: git, Patrick Steinhardt; +Cc: Jeff King, Karthik Nayak

Hello,

On Monday, 29 April 2024 15:41:28 CEST Patrick Steinhardt wrote:
> ---
>  Documentation/glossary-content.txt | 36 ++++++++++++++++++------------
>  refs.c                             |  2 ++
>  t/t6302-for-each-ref-filter.sh     | 17 ++++++++++++++
>  3 files changed, 41 insertions(+), 14 deletions(-)
> 
> diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
> index d71b199955..4275918fa0 100644
> --- a/Documentation/glossary-content.txt
> +++ b/Documentation/glossary-content.txt
> @@ -497,20 +497,28 @@ exclude;;
...
> +
> + - "`AUTO_MERGE`"
> +
> + - "`BISECT_EXPECTED_REV`"
> +
> + - "`NOTES_MERGE_PARTIAL`"
> +
> + - "`NOTES_MERGE_REF`"
> +
> + - "`MERGE_AUTOSTASH`"

Quoting the names seems overkill here, as they are already formatted with the 
back-quotes. The rendering as bold or monospace is enough.

Regards,

Jean-Noël





^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 2/3] refs: do not label special refs as pseudo refs
  2024-05-09 17:29   ` Jean-Noël AVILA
@ 2024-05-10  8:33     ` Patrick Steinhardt
  0 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-10  8:33 UTC (permalink / raw)
  To: Jean-Noël AVILA; +Cc: git, Jeff King, Karthik Nayak

[-- Attachment #1: Type: text/plain, Size: 1227 bytes --]

On Thu, May 09, 2024 at 07:29:18PM +0200, Jean-Noël AVILA wrote:
> Hello,
> 
> On Monday, 29 April 2024 15:41:28 CEST Patrick Steinhardt wrote:
> > ---
> >  Documentation/glossary-content.txt | 36 ++++++++++++++++++------------
> >  refs.c                             |  2 ++
> >  t/t6302-for-each-ref-filter.sh     | 17 ++++++++++++++
> >  3 files changed, 41 insertions(+), 14 deletions(-)
> > 
> > diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
> > index d71b199955..4275918fa0 100644
> > --- a/Documentation/glossary-content.txt
> > +++ b/Documentation/glossary-content.txt
> > @@ -497,20 +497,28 @@ exclude;;
> ...
> > +
> > + - "`AUTO_MERGE`"
> > +
> > + - "`BISECT_EXPECTED_REV`"
> > +
> > + - "`NOTES_MERGE_PARTIAL`"
> > +
> > + - "`NOTES_MERGE_REF`"
> > +
> > + - "`MERGE_AUTOSTASH`"
> 
> Quoting the names seems overkill here, as they are already formatted with the 
> back-quotes. The rendering as bold or monospace is enough.
> 
> Regards,
> 
> Jean-Noël

These don't exist in later versions of this patch series anymore. But
there are other cases where I did the same in the current version, so
let me drop that.

Thanks!

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* [PATCH v4 00/10] Clarify pseudo-ref terminology
  2024-04-29 13:41 [PATCH 0/3] Clarify pseudo-ref terminology Patrick Steinhardt
                   ` (4 preceding siblings ...)
  2024-05-02  8:17 ` [PATCH v3 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
@ 2024-05-10  8:48 ` Patrick Steinhardt
  2024-05-10  8:48   ` [PATCH v4 01/10] Documentation/glossary: redefine pseudorefs as special refs Patrick Steinhardt
                     ` (10 more replies)
  2024-05-15  6:50 ` [PATCH v5 " Patrick Steinhardt
  6 siblings, 11 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-10  8:48 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk, Jean-Noël AVILA

[-- Attachment #1: Type: text/plain, Size: 9595 bytes --]

Hi,

this is the fourth version of my patch series that aims to clarify the
pseudo-ref terminology. Changes compared to v3:

  - Render refs in "Documentation/glossary-content.txt" more
    consistently, with backticks only.

  - Reorder patches 6 and 7 such that we first correct `is_root_ref()`,
    and then adapt `is_headref()`.

  - Furthermore, I have inlined `is_headref()` into `is_root_ref()`
    completely now as it didn't have any users anymore.

Thanks!

Patrick

Patrick Steinhardt (10):
  Documentation/glossary: redefine pseudorefs as special refs
  Documentation/glossary: clarify limitations of pseudorefs
  Documentation/glossary: define root refs as refs
  refs: rename `is_pseudoref()` to `is_root_ref()`
  refs: refname `is_special_ref()` to `is_pseudo_ref()`
  refs: root refs can be symbolic refs
  refs: classify HEAD as a root ref
  refs: pseudorefs are no refs
  ref-filter: properly distinuish pseudo and root refs
  refs: refuse to write pseudorefs

 Documentation/glossary-content.txt |  72 +++++++++----------
 builtin/for-each-ref.c             |   2 +-
 ref-filter.c                       |  16 +++--
 ref-filter.h                       |   4 +-
 refs.c                             | 108 ++++++++++++++---------------
 refs.h                             |  48 ++++++++++++-
 refs/files-backend.c               |   3 +-
 refs/reftable-backend.c            |   3 +-
 t/t5510-fetch.sh                   |   6 +-
 t/t6302-for-each-ref-filter.sh     |  34 +++++++++
 10 files changed, 185 insertions(+), 111 deletions(-)

Range-diff against v3:
 1:  e651bae690 !  1:  b1fc4c1ac7 Documentation/glossary: redefine pseudorefs as special refs
    @@ Documentation/glossary-content.txt: exclude;;
     ++
     +The following pseudorefs are known to Git:
     +
    -+ - "`FETCH_HEAD`" is written by linkgit:git-fetch[1] or linkgit:git-pull[1]. It
    ++ - `FETCH_HEAD` is written by linkgit:git-fetch[1] or linkgit:git-pull[1]. It
     +   may refer to multiple object IDs. Each object ID is annotated with metadata
     +   indicating where it was fetched from and its fetch status.
     +
    -+ - "`MERGE_HEAD`" is written by linkgit:git-merge[1] when resolving merge
    ++ - `MERGE_HEAD` is written by linkgit:git-merge[1] when resolving merge
     +   conflicts. It contains all commit IDs which are being merged.
      
      [[def_pull]]pull::
 2:  66ac046132 =  2:  dce3a0fa7e Documentation/glossary: clarify limitations of pseudorefs
 3:  243d616101 !  3:  79249962f5 Documentation/glossary: define root refs as refs
    @@ Documentation/glossary-content.txt: The following pseudorefs are known to Git:
     +match these rules. The following list is exhaustive and shall not be
     +extended in the future:
     ++
    -+ - AUTO_MERGE
    ++ - `AUTO_MERGE`
     +
    -+ - BISECT_EXPECTED_REV
    ++ - `BISECT_EXPECTED_REV`
     +
    -+ - NOTES_MERGE_PARTIAL
    ++ - `NOTES_MERGE_PARTIAL`
     +
    -+ - NOTES_MERGE_REF
    ++ - `NOTES_MERGE_REF`
     +
    -+ - MERGE_AUTOSTASH
    ++ - `MERGE_AUTOSTASH`
     ++
     +Different subhierarchies are used for different purposes. For example,
     +the `refs/heads/` hierarchy is used to represent local branches whereas
 4:  0a116f9d11 =  4:  ee2b090f75 refs: rename `is_pseudoref()` to `is_root_ref()`
 5:  484a0856bc =  5:  2c09bc7690 refs: refname `is_special_ref()` to `is_pseudo_ref()`
 7:  92a71222e1 !  6:  5e402811a6 refs: root refs can be symbolic refs
    @@ Commit message
         does not seem reasonable at all and I very much doubt that it results in
         anything sane.
     
    -    Furthermore, the behaviour is different to `is_headref()`, which only
    -    checks for the ref to exist. While that is in line with our glossary,
    -    this inconsistency only adds to the confusion.
    -
         Last but not least, the current behaviour can actually lead to a
         segfault when calling `is_root_ref()` with a reference that either does
         not exist or that is a symbolic ref because we never initialized `oid`.
    @@ refs.c: int is_root_ref(struct ref_store *refs, const char *refname)
      	size_t i;
      
      	if (!is_root_ref_syntax(refname))
    -@@ refs.c: int is_root_ref(struct ref_store *refs, const char *refname)
    - 	if (is_headref(refs, refname))
    - 		return 1;
    + 		return 0;
      
     +	/*
     +	 * Note that we cannot use `refs_ref_exists()` here because that also
    @@ refs.c: int is_root_ref(struct ref_store *refs, const char *refname)
      }
      
      int is_headref(struct ref_store *refs, const char *refname)
    - {
    --	if (!strcmp(refname, "HEAD"))
    --		return refs_ref_exists(refs, refname);
    -+	struct strbuf referent = STRBUF_INIT;
    -+	struct object_id oid = { 0 };
    -+	int failure_errno, ret = 0;
    -+	unsigned int flags;
    - 
    --	return 0;
    -+	/*
    -+	 * Note that we cannot use `refs_ref_exists()` here because that also
    -+	 * checks whether its target ref exists in case refname is a symbolic
    -+	 * ref.
    -+	 */
    -+	if (!strcmp(refname, "HEAD")) {
    -+		ret = !refs_read_raw_ref(refs, refname, &oid, &referent,
    -+					 &flags, &failure_errno);
    -+	}
    -+
    -+	strbuf_release(&referent);
    -+	return ret;
    - }
    +
    + ## refs.h ##
    +@@ refs.h: extern struct ref_namespace_info ref_namespace[NAMESPACE__COUNT];
    + void update_ref_namespace(enum ref_namespace namespace, char *ref);
      
    - static int is_current_worktree_ref(const char *ref) {
    + /*
    +- * Check whether the reference is an existing root reference.
    ++ * Check whether the reference is an existing root reference. A root reference
    ++ * that is a dangling symbolic ref is considered to exist.
    +  *
    +  * A root ref is a reference that lives in the root of the reference hierarchy.
    +  * These references must conform to special syntax:
     
      ## t/t6302-for-each-ref-filter.sh ##
     @@ t/t6302-for-each-ref-filter.sh: test_expect_success '--include-root-refs with other patterns' '
 6:  c196fe3c45 !  7:  b32c56afcb refs: classify HEAD as a root ref
    @@ Commit message
         - The "files" and "reftable" backends explicitly called both
           `is_root_ref()` and `is_headref()`.
     
    -    This change should thus essentially be a no-op.
    +    This also aligns behaviour of `is_root_ref()` and `is_headref()` such
    +    that we also return a trueish value when the ref is a dangling symbolic
    +    ref. As there are no callers of `is_headref()` left after the refactoring
    +    we absorb it completely into `is_root_ref()`.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
      ## refs.c ##
    +@@ refs.c: static int is_root_ref_syntax(const char *refname)
    + int is_root_ref(struct ref_store *refs, const char *refname)
    + {
    + 	static const char *const irregular_root_refs[] = {
    ++		"HEAD",
    + 		"AUTO_MERGE",
    + 		"BISECT_EXPECTED_REV",
    + 		"NOTES_MERGE_PARTIAL",
     @@ refs.c: int is_root_ref(struct ref_store *refs, const char *refname)
    + 	return ret;
    + }
      
    - 	if (!is_root_ref_syntax(refname))
    - 		return 0;
    -+	if (is_headref(refs, refname))
    -+		return 1;
    - 
    - 	if (ends_with(refname, "_HEAD")) {
    - 		refs_resolve_ref_unsafe(refs, refname,
    +-int is_headref(struct ref_store *refs, const char *refname)
    +-{
    +-	if (!strcmp(refname, "HEAD"))
    +-		return refs_ref_exists(refs, refname);
    +-
    +-	return 0;
    +-}
    +-
    + static int is_current_worktree_ref(const char *ref) {
    + 	return is_root_ref_syntax(ref) || is_per_worktree_ref(ref);
    + }
     
      ## refs.h ##
     @@ refs.h: void update_ref_namespace(enum ref_namespace namespace, char *ref);
    @@ refs.h: void update_ref_namespace(enum ref_namespace namespace, char *ref);
       */
      int is_root_ref(struct ref_store *refs, const char *refname);
      
    -+/*
    -+ * Check whether the reference is "HEAD" and whether it exists.
    -+ */
    - int is_headref(struct ref_store *refs, const char *refname);
    - 
    +-int is_headref(struct ref_store *refs, const char *refname);
    +-
      #endif /* REFS_H */
     
      ## refs/files-backend.c ##
 8:  8bd52e5363 !  8:  19af8c754c refs: pseudorefs are no refs
    @@ refs.c: int is_root_ref(struct ref_store *refs, const char *refname)
     +	if (!is_root_ref_syntax(refname) ||
     +	    is_pseudo_ref(refname))
      		return 0;
    - 	if (is_headref(refs, refname))
    - 		return 1;
    + 
    + 	/*
     @@ refs.c: static int refs_read_special_head(struct ref_store *ref_store,
      	return result;
      }
 9:  cd6d745a01 !  9:  86f7f2d2d8 ref-filter: properly distinuish pseudo and root refs
    @@ refs.c: int is_per_worktree_ref(const char *refname)
      		"MERGE_HEAD",
     
      ## refs.h ##
    -@@ refs.h: int is_root_ref(struct ref_store *refs, const char *refname);
    +@@ refs.h: void update_ref_namespace(enum ref_namespace namespace, char *ref);
       */
    - int is_headref(struct ref_store *refs, const char *refname);
    + int is_root_ref(struct ref_store *refs, const char *refname);
      
     +/*
     + * Pseudorefs are refs that have different semantics compared to
10:  6956fccced = 10:  640d3b169f refs: refuse to write pseudorefs

base-commit: 0f3415f1f8478b05e64db11eb8aaa2915e48fef6
-- 
2.45.0



* [PATCH v4 01/10] Documentation/glossary: redefine pseudorefs as special refs
  2024-05-10  8:48 ` [PATCH v4 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
@ 2024-05-10  8:48   ` Patrick Steinhardt
  2024-05-10  8:48   ` [PATCH v4 02/10] Documentation/glossary: clarify limitations of pseudorefs Patrick Steinhardt
                     ` (9 subsequent siblings)
  10 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-10  8:48 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk, Jean-Noël AVILA

[-- Attachment #1: Type: text/plain, Size: 6091 bytes --]

Nowadays, Git knows about three different kinds of refs. As defined in
gitglossary(7):

  - Regular refs that start with "refs/", like "refs/heads/main".

  - Pseudorefs, which live in the root directory. These must have
    all-caps names and must be files that start with an object hash.
    Consequently, symbolic refs are not pseudorefs because they do not
    start with an object hash.

  - Special refs, of which we only have "FETCH_HEAD" and "MERGE_HEAD".

This state is extremely confusing, and I would claim that most folks
don't fully understand what is what here. The current definitions also
have several problems:

  - Where does "HEAD" fit in? It's not a pseudoref because it can be
    a symbolic ref. It's not a regular ref because it does not start
    with "refs/". And it's not a special ref, either.

  - There is a strong overlap between pseudorefs and special refs. The
    pseudoref section for example mentions "MERGE_HEAD", even though it
    is a special ref. Is it thus both a pseudoref and a special ref?

  - Why do we even need to distinguish refs that live in the root from
    other refs when they behave just like a regular ref anyway?

In other words, the current state is quite a mess and leads to wild
inconsistencies without much of a good reason.

The original reason why pseudorefs were introduced is that there are
some refs that sometimes behave like a ref, even though they aren't a
ref. And we really only have two of these nowadays, namely "MERGE_HEAD"
and "FETCH_HEAD". Those files are never written via the ref backends,
but are instead written by git-fetch(1), git-pull(1) and git-merge(1).
They contain additional metadata that highlights where a ref has been
fetched from or the list of commits that have been merged.

This original intent in fact matches the definition of special refs that
we have recently introduced in 8df4c5d205 (Documentation: add "special
refs" to the glossary, 2024-01-19). Due to the introduction of the new
reftable backend we were forced to distinguish those refs more clearly
such that we don't ever try to read or write them via the reftable
backend. In the same series, we also addressed all the other cases where
we used to write those special refs via the filesystem directly, thus
circumventing the ref backend, to instead write them via the backends.
Consequently, there are no other refs left anymore which are special.

Let's address this mess and return the pseudoref terminology to its
original intent: a ref that sometimes behaves like a ref, but which isn't
really a ref because it gets written to the filesystem directly. Or in
other words, let's redefine pseudorefs to match the current definition
of special refs. As special refs and pseudorefs are now the same per
definition, we can drop the "special refs" term again. It's not exposed
to our users and thus they wouldn't ever encounter that term anyway.

Refs that live in the root of the ref hierarchy but which are not
pseudorefs will be further defined in a subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/glossary-content.txt | 40 +++++++++---------------------
 1 file changed, 12 insertions(+), 28 deletions(-)

diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index d71b199955..e686c83026 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -497,20 +497,18 @@ exclude;;
 	unusual refs.
 
 [[def_pseudoref]]pseudoref::
-	Pseudorefs are a class of files under `$GIT_DIR` which behave
-	like refs for the purposes of rev-parse, but which are treated
-	specially by git.  Pseudorefs both have names that are all-caps,
-	and always start with a line consisting of a
-	<<def_SHA1,SHA-1>> followed by whitespace.  So, HEAD is not a
-	pseudoref, because it is sometimes a symbolic ref.  They might
-	optionally contain some additional data.  `MERGE_HEAD` and
-	`CHERRY_PICK_HEAD` are examples.  Unlike
-	<<def_per_worktree_ref,per-worktree refs>>, these files cannot
-	be symbolic refs, and never have reflogs.  They also cannot be
-	updated through the normal ref update machinery.  Instead,
-	they are updated by directly writing to the files.  However,
-	they can be read as if they were refs, so `git rev-parse
-	MERGE_HEAD` will work.
+	A ref that has different semantics than normal refs. These refs can be
+	accessed via normal Git commands but may not behave the same as a
+	normal ref in some cases.
++
+The following pseudorefs are known to Git:
+
+ - `FETCH_HEAD` is written by linkgit:git-fetch[1] or linkgit:git-pull[1]. It
+   may refer to multiple object IDs. Each object ID is annotated with metadata
+   indicating where it was fetched from and its fetch status.
+
+ - `MERGE_HEAD` is written by linkgit:git-merge[1] when resolving merge
+   conflicts. It contains all commit IDs which are being merged.
 
 [[def_pull]]pull::
 	Pulling a <<def_branch,branch>> means to <<def_fetch,fetch>> it and
@@ -638,20 +636,6 @@ The most notable example is `HEAD`.
 	An <<def_object,object>> used to temporarily store the contents of a
 	<<def_dirty,dirty>> working directory and the index for future reuse.
 
-[[def_special_ref]]special ref::
-	A ref that has different semantics than normal refs. These refs can be
-	accessed via normal Git commands but may not behave the same as a
-	normal ref in some cases.
-+
-The following special refs are known to Git:
-
- - "`FETCH_HEAD`" is written by linkgit:git-fetch[1] or linkgit:git-pull[1]. It
-   may refer to multiple object IDs. Each object ID is annotated with metadata
-   indicating where it was fetched from and its fetch status.
-
- - "`MERGE_HEAD`" is written by linkgit:git-merge[1] when resolving merge
-   conflicts. It contains all commit IDs which are being merged.
-
 [[def_submodule]]submodule::
 	A <<def_repository,repository>> that holds the history of a
 	separate project inside another repository (the latter of
-- 
2.45.0
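
The glossary text in the patch above says that `FETCH_HEAD` may hold
several object IDs, each annotated with metadata. A minimal sketch of why
such a file is not an ordinary ref, assuming the tab-separated record
layout commonly seen in `.git/FETCH_HEAD` (object ID, an optional
"not-for-merge" marker, then a free-form origin description). The struct
and function names are hypothetical and not part of Git:

```c
#include <assert.h>
#include <string.h>

/* One record of a FETCH_HEAD-like file (hypothetical type, not Git's). */
struct fetch_head_entry {
	char oid[65];        /* hex object ID, NUL-terminated */
	int not_for_merge;   /* 1 if the record carries "not-for-merge" */
	const char *origin;  /* points into the input line */
};

/*
 * Parse "<oid>\t[not-for-merge]\t<origin>". Returns 0 on success and -1
 * if the line does not match the assumed layout.
 */
static int parse_fetch_head_line(const char *line,
				 struct fetch_head_entry *out)
{
	const char *tab1, *tab2;
	size_t oid_len;

	assert(line && out);

	tab1 = strchr(line, '\t');
	if (!tab1)
		return -1;

	oid_len = (size_t)(tab1 - line);
	if (oid_len == 0 || oid_len >= sizeof(out->oid))
		return -1;
	/* Everything before the first tab must be lowercase hex. */
	if (strspn(line, "0123456789abcdef") != oid_len)
		return -1;

	tab2 = strchr(tab1 + 1, '\t');
	if (!tab2)
		return -1;

	memcpy(out->oid, line, oid_len);
	out->oid[oid_len] = '\0';
	out->not_for_merge = (tab2 - tab1 - 1 == 13) &&
			     !strncmp(tab1 + 1, "not-for-merge", 13);
	out->origin = tab2 + 1;
	return 0;
}
```

A regular ref stores exactly one object ID or one symbolic target, so a
file made of several such records cannot be represented in a ref backend
as-is.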



* [PATCH v4 02/10] Documentation/glossary: clarify limitations of pseudorefs
  2024-05-10  8:48 ` [PATCH v4 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
  2024-05-10  8:48   ` [PATCH v4 01/10] Documentation/glossary: redefine pseudorefs as special refs Patrick Steinhardt
@ 2024-05-10  8:48   ` Patrick Steinhardt
  2024-05-10  8:48   ` [PATCH v4 03/10] Documentation/glossary: define root refs as refs Patrick Steinhardt
                     ` (8 subsequent siblings)
  10 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-10  8:48 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk, Jean-Noël AVILA

[-- Attachment #1: Type: text/plain, Size: 1124 bytes --]

Clarify limitations that pseudorefs have:

  - They can be read via git-rev-parse(1) and similar tools.

  - They are not surfaced when iterating through refs, like when using
    git-for-each-ref(1). They are not refs, so iterating through refs
    should not surface them.

  - They cannot be written via git-update-ref(1) and related commands.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/glossary-content.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index e686c83026..d8c04b37be 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -498,8 +498,8 @@ exclude;;
 
 [[def_pseudoref]]pseudoref::
 	A ref that has different semantics than normal refs. These refs can be
-	accessed via normal Git commands but may not behave the same as a
-	normal ref in some cases.
+	read via normal Git commands, but cannot be written to by commands like
+	linkgit:git-update-ref[1].
 +
 The following pseudorefs are known to Git:
 
-- 
2.45.0



* [PATCH v4 03/10] Documentation/glossary: define root refs as refs
  2024-05-10  8:48 ` [PATCH v4 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
  2024-05-10  8:48   ` [PATCH v4 01/10] Documentation/glossary: redefine pseudorefs as special refs Patrick Steinhardt
  2024-05-10  8:48   ` [PATCH v4 02/10] Documentation/glossary: clarify limitations of pseudorefs Patrick Steinhardt
@ 2024-05-10  8:48   ` Patrick Steinhardt
  2024-05-10  8:48   ` [PATCH v4 04/10] refs: rename `is_pseudoref()` to `is_root_ref()` Patrick Steinhardt
                     ` (7 subsequent siblings)
  10 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-10  8:48 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk, Jean-Noël AVILA

[-- Attachment #1: Type: text/plain, Size: 2736 bytes --]

Except for the pseudorefs MERGE_HEAD and FETCH_HEAD, all refs that live
in the root of the ref hierarchy behave exactly the same as normal refs.
They can be symbolic refs or direct refs and can be read, iterated over
and written via normal tooling. All of these refs are stored in the ref
backends, which further demonstrates that they are just normal refs.

Extend the definition of "ref" to also cover such root refs. The only
additional restriction for root refs is that they must conform to a
specific naming schema.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/glossary-content.txt | 32 +++++++++++++++++++++++-------
 1 file changed, 25 insertions(+), 7 deletions(-)

diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index d8c04b37be..c434387186 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -550,20 +550,38 @@ The following pseudorefs are known to Git:
 	to the result.
 
 [[def_ref]]ref::
-	A name that begins with `refs/` (e.g. `refs/heads/master`)
-	that points to an <<def_object_name,object name>> or another
-	ref (the latter is called a <<def_symref,symbolic ref>>).
+	A name that points to an <<def_object_name,object name>> or
+	another ref (the latter is called a <<def_symref,symbolic ref>>).
 	For convenience, a ref can sometimes be abbreviated when used
 	as an argument to a Git command; see linkgit:gitrevisions[7]
 	for details.
 	Refs are stored in the <<def_repository,repository>>.
 +
 The ref namespace is hierarchical.
-Different subhierarchies are used for different purposes (e.g. the
-`refs/heads/` hierarchy is used to represent local branches).
+Ref names must either start with `refs/` or be located in the root of
+the hierarchy. For the latter, their name must follow these rules:
 +
-There are a few special-purpose refs that do not begin with `refs/`.
-The most notable example is `HEAD`.
+ - The name consists of only upper-case characters or underscores.
+
+ - The name ends with "`_HEAD`" or is equal to "`HEAD`".
++
+There are some irregular refs in the root of the hierarchy that do not
+match these rules. The following list is exhaustive and shall not be
+extended in the future:
++
+ - `AUTO_MERGE`
+
+ - `BISECT_EXPECTED_REV`
+
+ - `NOTES_MERGE_PARTIAL`
+
+ - `NOTES_MERGE_REF`
+
+ - `MERGE_AUTOSTASH`
++
+Different subhierarchies are used for different purposes. For example,
+the `refs/heads/` hierarchy is used to represent local branches whereas
+the `refs/tags/` hierarchy is used to represent local tags.
 
 [[def_reflog]]reflog::
 	A reflog shows the local "history" of a ref.  In other words,
-- 
2.45.0
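
The naming rules that the glossary hunk above spells out for root refs can
be sketched as a purely syntactic check. This is an illustration of the
glossary wording only, not Git's `is_root_ref()`: the function name is
hypothetical, it does not consult any ref store, and it makes no attempt
to exclude the pseudorefs `FETCH_HEAD` and `MERGE_HEAD`:

```c
#include <assert.h>
#include <string.h>

/* Return 1 if `s` ends with `suffix`, 0 otherwise. */
static int has_suffix(const char *s, const char *suffix)
{
	size_t slen = strlen(s), suflen = strlen(suffix);

	return slen >= suflen && !strcmp(s + slen - suflen, suffix);
}

/*
 * Hypothetical helper: does `name` satisfy the root ref naming rules
 * from the glossary (upper-case or underscores only, and either "HEAD"
 * or ending in "_HEAD"), or is it one of the listed irregular names?
 */
static int looks_like_root_ref_name(const char *name)
{
	static const char *const irregular[] = {
		"AUTO_MERGE",
		"BISECT_EXPECTED_REV",
		"NOTES_MERGE_PARTIAL",
		"NOTES_MERGE_REF",
		"MERGE_AUTOSTASH",
	};
	const char *c;
	size_t i;

	assert(name);

	for (i = 0; i < sizeof(irregular) / sizeof(irregular[0]); i++)
		if (!strcmp(name, irregular[i]))
			return 1;

	for (c = name; *c; c++)
		if (!((*c >= 'A' && *c <= 'Z') || *c == '_'))
			return 0;

	return !strcmp(name, "HEAD") || has_suffix(name, "_HEAD");
}
```

Under these rules `ORIG_HEAD` and `AUTO_MERGE` qualify, while
`refs/heads/main` and an arbitrary all-caps name such as `FOO` do not.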



* [PATCH v4 04/10] refs: rename `is_pseudoref()` to `is_root_ref()`
  2024-05-10  8:48 ` [PATCH v4 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2024-05-10  8:48   ` [PATCH v4 03/10] Documentation/glossary: define root refs as refs Patrick Steinhardt
@ 2024-05-10  8:48   ` Patrick Steinhardt
  2024-05-10  8:48   ` [PATCH v4 05/10] refs: refname `is_special_ref()` to `is_pseudo_ref()` Patrick Steinhardt
                     ` (6 subsequent siblings)
  10 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-10  8:48 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk, Jean-Noël AVILA

[-- Attachment #1: Type: text/plain, Size: 5028 bytes --]

Rename `is_pseudoref()` to `is_root_ref()` to adapt to the newly defined
terminology in our gitglossary(7).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 ref-filter.c            |  2 +-
 refs.c                  | 14 +++++++-------
 refs.h                  | 28 +++++++++++++++++++++++++++-
 refs/files-backend.c    |  2 +-
 refs/reftable-backend.c |  2 +-
 5 files changed, 37 insertions(+), 11 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 59ad6f54dd..361beb6619 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2756,7 +2756,7 @@ static int ref_kind_from_refname(const char *refname)
 			return ref_kind[i].kind;
 	}
 
-	if (is_pseudoref(get_main_ref_store(the_repository), refname))
+	if (is_root_ref(get_main_ref_store(the_repository), refname))
 		return FILTER_REFS_PSEUDOREFS;
 
 	return FILTER_REFS_OTHERS;
diff --git a/refs.c b/refs.c
index 55d2e0b2cb..0a4acde3ca 100644
--- a/refs.c
+++ b/refs.c
@@ -844,7 +844,7 @@ int is_per_worktree_ref(const char *refname)
 	       starts_with(refname, "refs/rewritten/");
 }
 
-static int is_pseudoref_syntax(const char *refname)
+static int is_root_ref_syntax(const char *refname)
 {
 	const char *c;
 
@@ -860,9 +860,9 @@ static int is_pseudoref_syntax(const char *refname)
 	return 1;
 }
 
-int is_pseudoref(struct ref_store *refs, const char *refname)
+int is_root_ref(struct ref_store *refs, const char *refname)
 {
-	static const char *const irregular_pseudorefs[] = {
+	static const char *const irregular_root_refs[] = {
 		"AUTO_MERGE",
 		"BISECT_EXPECTED_REV",
 		"NOTES_MERGE_PARTIAL",
@@ -872,7 +872,7 @@ int is_pseudoref(struct ref_store *refs, const char *refname)
 	struct object_id oid;
 	size_t i;
 
-	if (!is_pseudoref_syntax(refname))
+	if (!is_root_ref_syntax(refname))
 		return 0;
 
 	if (ends_with(refname, "_HEAD")) {
@@ -882,8 +882,8 @@ int is_pseudoref(struct ref_store *refs, const char *refname)
 		return !is_null_oid(&oid);
 	}
 
-	for (i = 0; i < ARRAY_SIZE(irregular_pseudorefs); i++)
-		if (!strcmp(refname, irregular_pseudorefs[i])) {
+	for (i = 0; i < ARRAY_SIZE(irregular_root_refs); i++)
+		if (!strcmp(refname, irregular_root_refs[i])) {
 			refs_resolve_ref_unsafe(refs, refname,
 						RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
 						&oid, NULL);
@@ -902,7 +902,7 @@ int is_headref(struct ref_store *refs, const char *refname)
 }
 
 static int is_current_worktree_ref(const char *ref) {
-	return is_pseudoref_syntax(ref) || is_per_worktree_ref(ref);
+	return is_root_ref_syntax(ref) || is_per_worktree_ref(ref);
 }
 
 enum ref_worktree_type parse_worktree_ref(const char *maybe_worktree_ref,
diff --git a/refs.h b/refs.h
index d278775e08..d0374c3275 100644
--- a/refs.h
+++ b/refs.h
@@ -1051,7 +1051,33 @@ extern struct ref_namespace_info ref_namespace[NAMESPACE__COUNT];
  */
 void update_ref_namespace(enum ref_namespace namespace, char *ref);
 
-int is_pseudoref(struct ref_store *refs, const char *refname);
+/*
+ * Check whether the reference is an existing root reference.
+ *
+ * A root ref is a reference that lives in the root of the reference hierarchy.
+ * These references must conform to special syntax:
+ *
+ *   - Their name must be all-uppercase or underscores ("_").
+ *
+ *   - Their name must end with "_HEAD".
+ *
+ *   - Their name may not contain a slash.
+ *
+ * There is a special set of irregular root refs that exist due to historic
+ * reasons, only. This list shall not be expanded in the future:
+ *
+ *   - AUTO_MERGE
+ *
+ *   - BISECT_EXPECTED_REV
+ *
+ *   - NOTES_MERGE_PARTIAL
+ *
+ *   - NOTES_MERGE_REF
+ *
+ *   - MERGE_AUTOSTASH
+ */
+int is_root_ref(struct ref_store *refs, const char *refname);
+
 int is_headref(struct ref_store *refs, const char *refname);
 
 #endif /* REFS_H */
diff --git a/refs/files-backend.c b/refs/files-backend.c
index a098d14ea0..0fcb601444 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -351,7 +351,7 @@ static void add_pseudoref_and_head_entries(struct ref_store *ref_store,
 		strbuf_addstr(&refname, de->d_name);
 
 		dtype = get_dtype(de, &path, 1);
-		if (dtype == DT_REG && (is_pseudoref(ref_store, de->d_name) ||
+		if (dtype == DT_REG && (is_root_ref(ref_store, de->d_name) ||
 								is_headref(ref_store, de->d_name)))
 			loose_fill_ref_dir_regular_file(refs, refname.buf, dir);
 
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 010ef811b6..36ab3357a7 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -354,7 +354,7 @@ static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator)
 		 */
 		if (!starts_with(iter->ref.refname, "refs/") &&
 		    !(iter->flags & DO_FOR_EACH_INCLUDE_ROOT_REFS &&
-		     (is_pseudoref(&iter->refs->base, iter->ref.refname) ||
+		     (is_root_ref(&iter->refs->base, iter->ref.refname) ||
 		      is_headref(&iter->refs->base, iter->ref.refname)))) {
 			continue;
 		}
-- 
2.45.0



* [PATCH v4 05/10] refs: refname `is_special_ref()` to `is_pseudo_ref()`
  2024-05-10  8:48 ` [PATCH v4 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2024-05-10  8:48   ` [PATCH v4 04/10] refs: rename `is_pseudoref()` to `is_root_ref()` Patrick Steinhardt
@ 2024-05-10  8:48   ` Patrick Steinhardt
  2024-05-10  8:48   ` [PATCH v4 06/10] refs: root refs can be symbolic refs Patrick Steinhardt
                     ` (5 subsequent siblings)
  10 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-10  8:48 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk, Jean-Noël AVILA

[-- Attachment #1: Type: text/plain, Size: 2586 bytes --]

Rename `is_special_ref()` to `is_pseudo_ref()` to adapt to the newly
defined terminology in our gitglossary(7). Note that the preceding
commit has just renamed `is_pseudoref()` to `is_root_ref()`, so in-flight
patch series that add new calls to `is_pseudoref()` could silently pick
up the wrong function if we reused that name. In order to intentionally
break such patch series we have thus picked `is_pseudo_ref()` instead of
`is_pseudoref()` as the new name.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/refs.c b/refs.c
index 0a4acde3ca..6266f77474 100644
--- a/refs.c
+++ b/refs.c
@@ -1876,13 +1876,13 @@ static int refs_read_special_head(struct ref_store *ref_store,
 	return result;
 }
 
-static int is_special_ref(const char *refname)
+static int is_pseudo_ref(const char *refname)
 {
 	/*
-	 * Special references are refs that have different semantics compared
-	 * to "normal" refs. These refs can thus not be stored in the ref
-	 * backend, but must always be accessed via the filesystem. The
-	 * following refs are special:
+	 * Pseudorefs are refs that have different semantics compared to
+	 * "normal" refs. These refs can thus not be stored in the ref backend,
+	 * but must always be accessed via the filesystem. The following refs
+	 * are pseudorefs:
 	 *
 	 * - FETCH_HEAD may contain multiple object IDs, and each one of them
 	 *   carries additional metadata like where it came from.
@@ -1891,17 +1891,17 @@ static int is_special_ref(const char *refname)
 	 *   heads.
 	 *
 	 * Reading, writing or deleting references must consistently go either
-	 * through the filesystem (special refs) or through the reference
+	 * through the filesystem (pseudorefs) or through the reference
 	 * backend (normal ones).
 	 */
-	static const char * const special_refs[] = {
+	static const char * const pseudo_refs[] = {
 		"FETCH_HEAD",
 		"MERGE_HEAD",
 	};
 	size_t i;
 
-	for (i = 0; i < ARRAY_SIZE(special_refs); i++)
-		if (!strcmp(refname, special_refs[i]))
+	for (i = 0; i < ARRAY_SIZE(pseudo_refs); i++)
+		if (!strcmp(refname, pseudo_refs[i]))
 			return 1;
 
 	return 0;
@@ -1912,7 +1912,7 @@ int refs_read_raw_ref(struct ref_store *ref_store, const char *refname,
 		      unsigned int *type, int *failure_errno)
 {
 	assert(failure_errno);
-	if (is_special_ref(refname))
+	if (is_pseudo_ref(refname))
 		return refs_read_special_head(ref_store, refname, oid, referent,
 					      type, failure_errno);
 
-- 
2.45.0
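
The comment in the hunk above requires that a given ref is consistently
accessed either through the filesystem (pseudorefs) or through the
reference backend (everything else). A minimal sketch of that routing
decision, using a hypothetical enum and function name that do not exist
in Git:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical: where reads, writes and deletes for a ref must go. */
enum ref_storage {
	REF_STORAGE_BACKEND,    /* "files" or "reftable" backend */
	REF_STORAGE_FILESYSTEM, /* plain file under $GIT_DIR */
};

/* Route by name only; the two pseudorefs bypass the ref backend. */
static enum ref_storage storage_for_refname(const char *refname)
{
	static const char *const pseudo_refs[] = {
		"FETCH_HEAD",
		"MERGE_HEAD",
	};
	size_t i;

	assert(refname);

	for (i = 0; i < sizeof(pseudo_refs) / sizeof(pseudo_refs[0]); i++)
		if (!strcmp(refname, pseudo_refs[i]))
			return REF_STORAGE_FILESYSTEM;

	return REF_STORAGE_BACKEND;
}
```

Deciding on the name alone keeps the choice stable: the same ref can never
be written through one path and read through the other.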



* [PATCH v4 06/10] refs: root refs can be symbolic refs
  2024-05-10  8:48 ` [PATCH v4 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
                     ` (4 preceding siblings ...)
  2024-05-10  8:48   ` [PATCH v4 05/10] refs: refname `is_special_ref()` to `is_pseudo_ref()` Patrick Steinhardt
@ 2024-05-10  8:48   ` Patrick Steinhardt
  2024-05-10  8:48   ` [PATCH v4 07/10] refs: classify HEAD as a root ref Patrick Steinhardt
                     ` (4 subsequent siblings)
  10 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-10  8:48 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk, Jean-Noël AVILA

[-- Attachment #1: Type: text/plain, Size: 4671 bytes --]

Before this patch series, root refs except for "HEAD" and our special
refs were classified as pseudorefs. Furthermore, our terminology
clarified that pseudorefs must not be symbolic refs. This restriction
is enforced in `is_root_ref()`, which explicitly checks that a supposed
root ref resolves to an object ID without recursing.

This has been extremely confusing right from the start because (in old
terminology) a ref name may sometimes be a pseudoref and sometimes not
depending on whether it is a symbolic or regular ref. This behaviour
does not seem reasonable at all and I very much doubt that it results in
anything sane.

Last but not least, the current behaviour can actually lead to a
segfault when calling `is_root_ref()` with a reference that either does
not exist or that is a symbolic ref because we never initialized `oid`.

Let's loosen the restrictions in accordance with the new definition of
root refs, which are simply plain refs that may as well be symbolic
refs. Consequently, we can just check for the ref to exist instead of
requiring it to be a regular ref.

Add a test that verifies that this does not change user-visible
behaviour. Namely, we still don't want to show broken refs to the user
by default in git-for-each-ref(1). What this does allow though is for
internal callers to surface dangling root refs when they pass in the
`DO_FOR_EACH_INCLUDE_BROKEN` flag.
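
The difference between "resolves to an object ID" and "exists" can be
sketched with a toy ref table. All types and functions below are
hypothetical stand-ins rather than Git's ref store API. The point is that
existence is taken from the return value of a raw read, and that the
output variables are initialized before the call:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_oid { unsigned char hash[20]; };

struct toy_ref {
	const char *name;
	const char *symref_target; /* NULL for a direct ref */
	struct toy_oid oid;        /* only meaningful for a direct ref */
};

/*
 * Raw read without following symbolic refs. Returns 0 when `name` is
 * present and fills in either `*oid` or `*referent`; returns -1 and
 * leaves both outputs untouched when the ref is missing.
 */
static int toy_read_raw_ref(const struct toy_ref *refs, size_t nr,
			    const char *name, struct toy_oid *oid,
			    const char **referent)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (strcmp(refs[i].name, name))
			continue;
		if (refs[i].symref_target)
			*referent = refs[i].symref_target;
		else
			*oid = refs[i].oid;
		return 0;
	}
	return -1;
}

/* A ref exists if the raw read succeeds, even if it dangles. */
static int toy_ref_exists(const struct toy_ref *refs, size_t nr,
			  const char *name)
{
	struct toy_oid oid = { { 0 } };
	const char *referent = NULL;

	assert(refs && name);
	return !toy_read_raw_ref(refs, nr, name, &oid, &referent);
}
```

With this shape a symbolic ref whose target is missing still counts as
existing, and a missing ref is reported without ever inspecting an object
ID that was never written.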

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                         | 31 ++++++++++++++++++++-----------
 refs.h                         |  3 ++-
 t/t6302-for-each-ref-filter.sh | 17 +++++++++++++++++
 3 files changed, 39 insertions(+), 12 deletions(-)

diff --git a/refs.c b/refs.c
index 6266f77474..d63d60a0dc 100644
--- a/refs.c
+++ b/refs.c
@@ -869,28 +869,37 @@ int is_root_ref(struct ref_store *refs, const char *refname)
 		"NOTES_MERGE_REF",
 		"MERGE_AUTOSTASH",
 	};
-	struct object_id oid;
+	struct strbuf referent = STRBUF_INIT;
+	struct object_id oid = { 0 };
+	int failure_errno, ret = 0;
+	unsigned int flags;
 	size_t i;
 
 	if (!is_root_ref_syntax(refname))
 		return 0;
 
+	/*
+	 * Note that we cannot use `refs_ref_exists()` here because that also
+	 * checks whether its target ref exists in case refname is a symbolic
+	 * ref.
+	 */
 	if (ends_with(refname, "_HEAD")) {
-		refs_resolve_ref_unsafe(refs, refname,
-					RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
-					&oid, NULL);
-		return !is_null_oid(&oid);
+		ret = !refs_read_raw_ref(refs, refname, &oid, &referent,
+					 &flags, &failure_errno);
+		goto done;
 	}
 
-	for (i = 0; i < ARRAY_SIZE(irregular_root_refs); i++)
+	for (i = 0; i < ARRAY_SIZE(irregular_root_refs); i++) {
 		if (!strcmp(refname, irregular_root_refs[i])) {
-			refs_resolve_ref_unsafe(refs, refname,
-						RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
-						&oid, NULL);
-			return !is_null_oid(&oid);
+			ret = !refs_read_raw_ref(refs, refname, &oid, &referent,
+						 &flags, &failure_errno);
+			goto done;
 		}
+	}
 
-	return 0;
+done:
+	strbuf_release(&referent);
+	return ret;
 }
 
 int is_headref(struct ref_store *refs, const char *refname)
diff --git a/refs.h b/refs.h
index d0374c3275..b15ac3835e 100644
--- a/refs.h
+++ b/refs.h
@@ -1052,7 +1052,8 @@ extern struct ref_namespace_info ref_namespace[NAMESPACE__COUNT];
 void update_ref_namespace(enum ref_namespace namespace, char *ref);
 
 /*
- * Check whether the reference is an existing root reference.
+ * Check whether the reference is an existing root reference. A root reference
+ * that is a dangling symbolic ref is considered to exist.
  *
  * A root ref is a reference that lives in the root of the reference hierarchy.
  * These references must conform to special syntax:
diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh
index 948f1bb5f4..92ed8957c8 100755
--- a/t/t6302-for-each-ref-filter.sh
+++ b/t/t6302-for-each-ref-filter.sh
@@ -62,6 +62,23 @@ test_expect_success '--include-root-refs with other patterns' '
 	test_cmp expect actual
 '
 
+test_expect_success '--include-root-refs omits dangling symrefs' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		git symbolic-ref DANGLING_HEAD refs/heads/missing &&
+		cat >expect <<-EOF &&
+		HEAD
+		$(git symbolic-ref HEAD)
+		refs/tags/initial
+		EOF
+		git for-each-ref --format="%(refname)" --include-root-refs >actual &&
+		test_cmp expect actual
+	)
+'
+
 test_expect_success 'filtering with --points-at' '
 	cat >expect <<-\EOF &&
 	refs/heads/main
-- 
2.45.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v4 07/10] refs: classify HEAD as a root ref
  2024-05-10  8:48 ` [PATCH v4 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
                     ` (5 preceding siblings ...)
  2024-05-10  8:48   ` [PATCH v4 06/10] refs: root refs can be symbolic refs Patrick Steinhardt
@ 2024-05-10  8:48   ` Patrick Steinhardt
  2024-05-10  8:48   ` [PATCH v4 08/10] refs: pseudorefs are no refs Patrick Steinhardt
                     ` (3 subsequent siblings)
  10 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-10  8:48 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk, Jean-Noël AVILA

[-- Attachment #1: Type: text/plain, Size: 3716 bytes --]

Root refs are those refs that live in the root of the ref hierarchy.
Our old and venerable "HEAD" reference falls into this category, but we
don't yet classify it as such in `is_root_ref()`.

Adapt the function to also treat "HEAD" as a root ref. This change is
safe to do for all current callers:

- `ref_kind_from_refname()` already handles "HEAD" explicitly before
  calling `is_root_ref()`.

- The "files" and "reftable" backends explicitly called both
  `is_root_ref()` and `is_headref()`.

This also aligns the behaviour of `is_root_ref()` and `is_headref()` such
that we also return a trueish value when the ref is a dangling symbolic
ref. As there are no callers of `is_headref()` left after the refactoring,
we absorb it completely into `is_root_ref()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                  | 9 +--------
 refs.h                  | 5 ++---
 refs/files-backend.c    | 3 +--
 refs/reftable-backend.c | 3 +--
 4 files changed, 5 insertions(+), 15 deletions(-)

diff --git a/refs.c b/refs.c
index d63d60a0dc..077ae5756a 100644
--- a/refs.c
+++ b/refs.c
@@ -863,6 +863,7 @@ static int is_root_ref_syntax(const char *refname)
 int is_root_ref(struct ref_store *refs, const char *refname)
 {
 	static const char *const irregular_root_refs[] = {
+		"HEAD",
 		"AUTO_MERGE",
 		"BISECT_EXPECTED_REV",
 		"NOTES_MERGE_PARTIAL",
@@ -902,14 +903,6 @@ int is_root_ref(struct ref_store *refs, const char *refname)
 	return ret;
 }
 
-int is_headref(struct ref_store *refs, const char *refname)
-{
-	if (!strcmp(refname, "HEAD"))
-		return refs_ref_exists(refs, refname);
-
-	return 0;
-}
-
 static int is_current_worktree_ref(const char *ref) {
 	return is_root_ref_syntax(ref) || is_per_worktree_ref(ref);
 }
diff --git a/refs.h b/refs.h
index b15ac3835e..f6f4d61e1b 100644
--- a/refs.h
+++ b/refs.h
@@ -1060,7 +1060,8 @@ void update_ref_namespace(enum ref_namespace namespace, char *ref);
  *
  *   - Their name must be all-uppercase or underscores ("_").
  *
- *   - Their name must end with "_HEAD".
+ *   - Their name must end with "_HEAD". As a special rule, "HEAD" is a root
+ *     ref, as well.
  *
  *   - Their name may not contain a slash.
  *
@@ -1079,6 +1080,4 @@ void update_ref_namespace(enum ref_namespace namespace, char *ref);
  */
 int is_root_ref(struct ref_store *refs, const char *refname);
 
-int is_headref(struct ref_store *refs, const char *refname);
-
 #endif /* REFS_H */
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 0fcb601444..ea927c516d 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -351,8 +351,7 @@ static void add_pseudoref_and_head_entries(struct ref_store *ref_store,
 		strbuf_addstr(&refname, de->d_name);
 
 		dtype = get_dtype(de, &path, 1);
-		if (dtype == DT_REG && (is_root_ref(ref_store, de->d_name) ||
-								is_headref(ref_store, de->d_name)))
+		if (dtype == DT_REG && is_root_ref(ref_store, de->d_name))
 			loose_fill_ref_dir_regular_file(refs, refname.buf, dir);
 
 		strbuf_setlen(&refname, dirnamelen);
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 36ab3357a7..7ad4337229 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -354,8 +354,7 @@ static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator)
 		 */
 		if (!starts_with(iter->ref.refname, "refs/") &&
 		    !(iter->flags & DO_FOR_EACH_INCLUDE_ROOT_REFS &&
-		     (is_root_ref(&iter->refs->base, iter->ref.refname) ||
-		      is_headref(&iter->refs->base, iter->ref.refname)))) {
+		      is_root_ref(&iter->refs->base, iter->ref.refname))) {
 			continue;
 		}
 
-- 
2.45.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v4 08/10] refs: pseudorefs are no refs
  2024-05-10  8:48 ` [PATCH v4 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
                     ` (6 preceding siblings ...)
  2024-05-10  8:48   ` [PATCH v4 07/10] refs: classify HEAD as a root ref Patrick Steinhardt
@ 2024-05-10  8:48   ` Patrick Steinhardt
  2024-05-10  8:48   ` [PATCH v4 09/10] ref-filter: properly distinguish pseudo and root refs Patrick Steinhardt
                     ` (2 subsequent siblings)
  10 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-10  8:48 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk, Jean-Noël AVILA

[-- Attachment #1: Type: text/plain, Size: 4321 bytes --]

The `is_root_ref()` function will happily classify a pseudoref as a root
ref, even though pseudorefs are no refs. Next to being wrong, it also
leads to inconsistent behaviour across ref backends: while the "files"
backend accidentally knows to parse those pseudorefs and thus yields
them to the caller, the "reftable" backend won't ever see pseudorefs
at all because they are never stored in the "reftable" backend.

Fix this issue by filtering out pseudorefs in `is_root_ref()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                         | 65 +++++++++++++++++-----------------
 t/t6302-for-each-ref-filter.sh | 17 +++++++++
 2 files changed, 50 insertions(+), 32 deletions(-)

diff --git a/refs.c b/refs.c
index 077ae5756a..f5e98e5b46 100644
--- a/refs.c
+++ b/refs.c
@@ -844,6 +844,37 @@ int is_per_worktree_ref(const char *refname)
 	       starts_with(refname, "refs/rewritten/");
 }
 
+static int is_pseudo_ref(const char *refname)
+{
+	/*
+	 * Pseudorefs are refs that have different semantics compared to
+	 * "normal" refs. These refs can thus not be stored in the ref backend,
+	 * but must always be accessed via the filesystem. The following refs
+	 * are pseudorefs:
+	 *
+	 * - FETCH_HEAD may contain multiple object IDs, and each one of them
+	 *   carries additional metadata like where it came from.
+	 *
+	 * - MERGE_HEAD may contain multiple object IDs when merging multiple
+	 *   heads.
+	 *
+	 * Reading, writing or deleting references must consistently go either
+	 * through the filesystem (pseudorefs) or through the reference
+	 * backend (normal ones).
+	 */
+	static const char * const pseudo_refs[] = {
+		"FETCH_HEAD",
+		"MERGE_HEAD",
+	};
+	size_t i;
+
+	for (i = 0; i < ARRAY_SIZE(pseudo_refs); i++)
+		if (!strcmp(refname, pseudo_refs[i]))
+			return 1;
+
+	return 0;
+}
+
 static int is_root_ref_syntax(const char *refname)
 {
 	const char *c;
@@ -876,7 +907,8 @@ int is_root_ref(struct ref_store *refs, const char *refname)
 	unsigned int flags;
 	size_t i;
 
-	if (!is_root_ref_syntax(refname))
+	if (!is_root_ref_syntax(refname) ||
+	    is_pseudo_ref(refname))
 		return 0;
 
 	/*
@@ -1878,37 +1910,6 @@ static int refs_read_special_head(struct ref_store *ref_store,
 	return result;
 }
 
-static int is_pseudo_ref(const char *refname)
-{
-	/*
-	 * Pseudorefs are refs that have different semantics compared to
-	 * "normal" refs. These refs can thus not be stored in the ref backend,
-	 * but must always be accessed via the filesystem. The following refs
-	 * are pseudorefs:
-	 *
-	 * - FETCH_HEAD may contain multiple object IDs, and each one of them
-	 *   carries additional metadata like where it came from.
-	 *
-	 * - MERGE_HEAD may contain multiple object IDs when merging multiple
-	 *   heads.
-	 *
-	 * Reading, writing or deleting references must consistently go either
-	 * through the filesystem (pseudorefs) or through the reference
-	 * backend (normal ones).
-	 */
-	static const char * const pseudo_refs[] = {
-		"FETCH_HEAD",
-		"MERGE_HEAD",
-	};
-	size_t i;
-
-	for (i = 0; i < ARRAY_SIZE(pseudo_refs); i++)
-		if (!strcmp(refname, pseudo_refs[i]))
-			return 1;
-
-	return 0;
-}
-
 int refs_read_raw_ref(struct ref_store *ref_store, const char *refname,
 		      struct object_id *oid, struct strbuf *referent,
 		      unsigned int *type, int *failure_errno)
diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh
index 92ed8957c8..163c378cfd 100755
--- a/t/t6302-for-each-ref-filter.sh
+++ b/t/t6302-for-each-ref-filter.sh
@@ -52,6 +52,23 @@ test_expect_success '--include-root-refs pattern prints pseudorefs' '
 	test_cmp expect actual
 '
 
+test_expect_success '--include-root-refs pattern does not print special refs' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		git rev-parse HEAD >.git/MERGE_HEAD &&
+		git for-each-ref --format="%(refname)" --include-root-refs >actual &&
+		cat >expect <<-EOF &&
+		HEAD
+		$(git symbolic-ref HEAD)
+		refs/tags/initial
+		EOF
+		test_cmp expect actual
+	)
+'
+
 test_expect_success '--include-root-refs with other patterns' '
 	cat >expect <<-\EOF &&
 	HEAD
-- 
2.45.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v4 09/10] ref-filter: properly distinguish pseudo and root refs
  2024-05-10  8:48 ` [PATCH v4 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
                     ` (7 preceding siblings ...)
  2024-05-10  8:48   ` [PATCH v4 08/10] refs: pseudorefs are no refs Patrick Steinhardt
@ 2024-05-10  8:48   ` Patrick Steinhardt
  2024-05-10  8:48   ` [PATCH v4 10/10] refs: refuse to write pseudorefs Patrick Steinhardt
  2024-05-10 18:59   ` [PATCH v4 00/10] Clarify pseudo-ref terminology Junio C Hamano
  10 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-10  8:48 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk, Jean-Noël AVILA

[-- Attachment #1: Type: text/plain, Size: 5913 bytes --]

The ref-filter interfaces currently define root refs as either a
detached HEAD or a pseudo ref. Pseudo refs aren't root refs though, so
let's properly distinguish those ref types.
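
One way to see why the root-ref kind benefits from a bit of its own is
the following minimal sketch. It is an illustration under assumptions,
not code from this patch: the `KIND_*` names and `kind_requested()` are
hypothetical, and only the three bit values are taken from ref-filter.h
as shown in the diff below. With the old union-style definition, a
plain `&` test would also fire for a filter that asked for the detached
HEAD only; with a dedicated bit it does not.

```c
/* Bit values mirror ref-filter.h in this patch; the names are hypothetical. */
#define KIND_DETACHED_HEAD 0x0020u
#define KIND_PSEUDOREFS    0x0040u

/* Old definition: a union of two other kinds. */
#define KIND_ROOT_REFS_OLD (KIND_DETACHED_HEAD | KIND_PSEUDOREFS)

/* New definition: a bit of its own. */
#define KIND_ROOT_REFS_NEW 0x0080u

/*
 * Hypothetical helper: returns 1 if the requested set of kinds shares
 * at least one bit with `mask`, 0 otherwise.
 */
int kind_requested(unsigned int requested, unsigned int mask)
{
	return (requested & mask) != 0;
}
```

With the union definition, `kind_requested(KIND_DETACHED_HEAD,
KIND_ROOT_REFS_OLD)` is true even though no root refs were asked for,
which is presumably why the old code compared against the full kind
mask instead of testing a single flag.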

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/for-each-ref.c |  2 +-
 ref-filter.c           | 16 +++++++++-------
 ref-filter.h           |  4 ++--
 refs.c                 | 18 +-----------------
 refs.h                 | 18 ++++++++++++++++++
 5 files changed, 31 insertions(+), 27 deletions(-)

diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
index 919282e12a..5517a4a1c0 100644
--- a/builtin/for-each-ref.c
+++ b/builtin/for-each-ref.c
@@ -98,7 +98,7 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
 	}
 
 	if (include_root_refs)
-		flags |= FILTER_REFS_ROOT_REFS;
+		flags |= FILTER_REFS_ROOT_REFS | FILTER_REFS_DETACHED_HEAD;
 
 	filter.match_as_path = 1;
 	filter_and_format_refs(&filter, flags, sorting, &format);
diff --git a/ref-filter.c b/ref-filter.c
index 361beb6619..d72113edfe 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2628,7 +2628,7 @@ static int for_each_fullref_in_pattern(struct ref_filter *filter,
 				       each_ref_fn cb,
 				       void *cb_data)
 {
-	if (filter->kind == FILTER_REFS_KIND_MASK) {
+	if (filter->kind & FILTER_REFS_ROOT_REFS) {
 		/* In this case, we want to print all refs including root refs. */
 		return refs_for_each_include_root_refs(get_main_ref_store(the_repository),
 						       cb, cb_data);
@@ -2756,8 +2756,10 @@ static int ref_kind_from_refname(const char *refname)
 			return ref_kind[i].kind;
 	}
 
-	if (is_root_ref(get_main_ref_store(the_repository), refname))
+	if (is_pseudo_ref(refname))
 		return FILTER_REFS_PSEUDOREFS;
+	if (is_root_ref(get_main_ref_store(the_repository), refname))
+		return FILTER_REFS_ROOT_REFS;
 
 	return FILTER_REFS_OTHERS;
 }
@@ -2794,11 +2796,11 @@ static struct ref_array_item *apply_ref_filter(const char *refname, const struct
 	/*
 	 * Generally HEAD refs are printed with special description denoting a rebase,
 	 * detached state and so forth. This is useful when only printing the HEAD ref
-	 * But when it is being printed along with other pseudorefs, it makes sense to
-	 * keep the formatting consistent. So we mask the type to act like a pseudoref.
+	 * But when it is being printed along with other root refs, it makes sense to
+	 * keep the formatting consistent. So we mask the type to act like a root ref.
 	 */
-	if (filter->kind == FILTER_REFS_KIND_MASK && kind == FILTER_REFS_DETACHED_HEAD)
-		kind = FILTER_REFS_PSEUDOREFS;
+	if (filter->kind & FILTER_REFS_ROOT_REFS && kind == FILTER_REFS_DETACHED_HEAD)
+		kind = FILTER_REFS_ROOT_REFS;
 	else if (!(kind & filter->kind))
 		return NULL;
 
@@ -3072,7 +3074,7 @@ static int do_filter_refs(struct ref_filter *filter, unsigned int type, each_ref
 		 * When printing all ref types, HEAD is already included,
 		 * so we don't want to print HEAD again.
 		 */
-		if (!ret && (filter->kind != FILTER_REFS_KIND_MASK) &&
+		if (!ret && !(filter->kind & FILTER_REFS_ROOT_REFS) &&
 		    (filter->kind & FILTER_REFS_DETACHED_HEAD))
 			head_ref(fn, cb_data);
 	}
diff --git a/ref-filter.h b/ref-filter.h
index 0ca28d2bba..27ae1aa0d1 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -23,9 +23,9 @@
 				    FILTER_REFS_REMOTES | FILTER_REFS_OTHERS)
 #define FILTER_REFS_DETACHED_HEAD  0x0020
 #define FILTER_REFS_PSEUDOREFS     0x0040
-#define FILTER_REFS_ROOT_REFS      (FILTER_REFS_DETACHED_HEAD | FILTER_REFS_PSEUDOREFS)
+#define FILTER_REFS_ROOT_REFS      0x0080
 #define FILTER_REFS_KIND_MASK      (FILTER_REFS_REGULAR | FILTER_REFS_DETACHED_HEAD | \
-				    FILTER_REFS_PSEUDOREFS)
+				    FILTER_REFS_PSEUDOREFS | FILTER_REFS_ROOT_REFS)
 
 struct atom_value;
 struct ref_sorting;
diff --git a/refs.c b/refs.c
index f5e98e5b46..c882ece6e7 100644
--- a/refs.c
+++ b/refs.c
@@ -844,24 +844,8 @@ int is_per_worktree_ref(const char *refname)
 	       starts_with(refname, "refs/rewritten/");
 }
 
-static int is_pseudo_ref(const char *refname)
+int is_pseudo_ref(const char *refname)
 {
-	/*
-	 * Pseudorefs are refs that have different semantics compared to
-	 * "normal" refs. These refs can thus not be stored in the ref backend,
-	 * but must always be accessed via the filesystem. The following refs
-	 * are pseudorefs:
-	 *
-	 * - FETCH_HEAD may contain multiple object IDs, and each one of them
-	 *   carries additional metadata like where it came from.
-	 *
-	 * - MERGE_HEAD may contain multiple object IDs when merging multiple
-	 *   heads.
-	 *
-	 * Reading, writing or deleting references must consistently go either
-	 * through the filesystem (pseudorefs) or through the reference
-	 * backend (normal ones).
-	 */
 	static const char * const pseudo_refs[] = {
 		"FETCH_HEAD",
 		"MERGE_HEAD",
diff --git a/refs.h b/refs.h
index f6f4d61e1b..815dc514c7 100644
--- a/refs.h
+++ b/refs.h
@@ -1080,4 +1080,22 @@ void update_ref_namespace(enum ref_namespace namespace, char *ref);
  */
 int is_root_ref(struct ref_store *refs, const char *refname);
 
+/*
+ * Pseudorefs are refs that have different semantics compared to
+ * "normal" refs. These refs can thus not be stored in the ref backend,
+ * but must always be accessed via the filesystem. The following refs
+ * are pseudorefs:
+ *
+ * - FETCH_HEAD may contain multiple object IDs, and each one of them
+ *   carries additional metadata like where it came from.
+ *
+ * - MERGE_HEAD may contain multiple object IDs when merging multiple
+ *   heads.
+ *
+ * Reading, writing or deleting references must consistently go either
+ * through the filesystem (pseudorefs) or through the reference
+ * backend (normal ones).
+ */
+int is_pseudo_ref(const char *refname);
+
 #endif /* REFS_H */
-- 
2.45.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v4 10/10] refs: refuse to write pseudorefs
  2024-05-10  8:48 ` [PATCH v4 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
                     ` (8 preceding siblings ...)
  2024-05-10  8:48   ` [PATCH v4 09/10] ref-filter: properly distinguish pseudo and root refs Patrick Steinhardt
@ 2024-05-10  8:48   ` Patrick Steinhardt
  2024-05-10 18:59   ` [PATCH v4 00/10] Clarify pseudo-ref terminology Junio C Hamano
  10 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-10  8:48 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk, Jean-Noël AVILA

[-- Attachment #1: Type: text/plain, Size: 2583 bytes --]

Pseudorefs are not stored in the ref database because, by definition, they
carry additional metadata that essentially makes them not a ref. As
such, writing pseudorefs via the ref backend does not make any sense
whatsoever as the ref backend wouldn't know how exactly to store the
data.

Restrict writing pseudorefs via the ref backend.
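
The shape of such a guard can be sketched in isolation as follows. This
is a simplified stand-in, not the actual change:
`check_pseudoref_update()`, `names_pseudo_ref()` and the
`SKIP_REFNAME_VERIFICATION` constant are hypothetical, and a plain
character buffer replaces the strbuf used for error reporting. Only the
two pseudoref names and the error wording come from this series.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for REF_SKIP_REFNAME_VERIFICATION. */
#define SKIP_REFNAME_VERIFICATION 0x1u

/* The two pseudorefs named in this series. */
static int names_pseudo_ref(const char *refname)
{
	return !strcmp(refname, "FETCH_HEAD") ||
	       !strcmp(refname, "MERGE_HEAD");
}

/*
 * Returns 0 if the update may proceed. Returns -1 and writes a message
 * into `err` if `refname` is a pseudoref and the caller did not opt out
 * of refname verification.
 */
int check_pseudoref_update(const char *refname, unsigned int flags,
			   char *err, size_t errlen)
{
	if (!(flags & SKIP_REFNAME_VERIFICATION) &&
	    names_pseudo_ref(refname)) {
		snprintf(err, errlen, "refusing to update pseudoref '%s'",
			 refname);
		return -1;
	}
	return 0;
}
```

Callers that pass the skip flag keep an escape hatch, which mirrors the
condition used in the hunk below.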

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c           | 7 +++++++
 t/t5510-fetch.sh | 6 +++---
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/refs.c b/refs.c
index c882ece6e7..f2507c5a74 100644
--- a/refs.c
+++ b/refs.c
@@ -1285,6 +1285,13 @@ int ref_transaction_update(struct ref_transaction *transaction,
 		return -1;
 	}
 
+	if (!(flags & REF_SKIP_REFNAME_VERIFICATION) &&
+	    is_pseudo_ref(refname)) {
+		strbuf_addf(err, _("refusing to update pseudoref '%s'"),
+			    refname);
+		return -1;
+	}
+
 	if (flags & ~REF_TRANSACTION_UPDATE_ALLOWED_FLAGS)
 		BUG("illegal flags 0x%x passed to ref_transaction_update()", flags);
 
diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index 33d34d5ae9..4eb569f4df 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -518,7 +518,7 @@ test_expect_success 'fetch with a non-applying branch.<name>.merge' '
 test_expect_success 'fetch from GIT URL with a non-applying branch.<name>.merge [1]' '
 	one_head=$(cd one && git rev-parse HEAD) &&
 	this_head=$(git rev-parse HEAD) &&
-	git update-ref -d FETCH_HEAD &&
+	rm .git/FETCH_HEAD &&
 	git fetch one &&
 	test $one_head = "$(git rev-parse --verify FETCH_HEAD)" &&
 	test $this_head = "$(git rev-parse --verify HEAD)"
@@ -530,7 +530,7 @@ test_expect_success 'fetch from GIT URL with a non-applying branch.<name>.merge
 	one_ref=$(cd one && git symbolic-ref HEAD) &&
 	git config branch.main.remote blub &&
 	git config branch.main.merge "$one_ref" &&
-	git update-ref -d FETCH_HEAD &&
+	rm .git/FETCH_HEAD &&
 	git fetch one &&
 	test $one_head = "$(git rev-parse --verify FETCH_HEAD)" &&
 	test $this_head = "$(git rev-parse --verify HEAD)"
@@ -540,7 +540,7 @@ test_expect_success 'fetch from GIT URL with a non-applying branch.<name>.merge
 # the merge spec does not match the branch the remote HEAD points to
 test_expect_success 'fetch from GIT URL with a non-applying branch.<name>.merge [3]' '
 	git config branch.main.merge "${one_ref}_not" &&
-	git update-ref -d FETCH_HEAD &&
+	rm .git/FETCH_HEAD &&
 	git fetch one &&
 	test $one_head = "$(git rev-parse --verify FETCH_HEAD)" &&
 	test $this_head = "$(git rev-parse --verify HEAD)"
-- 
2.45.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* Re: [PATCH v4 00/10] Clarify pseudo-ref terminology
  2024-05-10  8:48 ` [PATCH v4 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
                     ` (9 preceding siblings ...)
  2024-05-10  8:48   ` [PATCH v4 10/10] refs: refuse to write pseudorefs Patrick Steinhardt
@ 2024-05-10 18:59   ` Junio C Hamano
  10 siblings, 0 replies; 93+ messages in thread
From: Junio C Hamano @ 2024-05-10 18:59 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Jeff King, Karthik Nayak, Phillip Wood, Justin Tobler,
	Kristoffer Haugsbakk, Jean-Noël AVILA

Patrick Steinhardt <ps@pks.im> writes:

>   - Reorder patches 6 and 7 such that we first correct `is_root_ref()`,
>     and then adapt `is_headref()`.
>
>   - Furthermore, I have inlined `is_headref()` into `is_root_ref()`
>     completely now as it didn't have any users anymore.

This does look like a good change, relative to the previous
iteration.  The code paths get greatly simplified.


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v3 07/10] refs: root refs can be symbolic refs
  2024-05-03 18:13     ` Jeff King
@ 2024-05-15  4:16       ` Patrick Steinhardt
  2024-05-15  4:39         ` Patrick Steinhardt
  2024-05-15  6:20         ` Jeff King
  0 siblings, 2 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-15  4:16 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Karthik Nayak, Phillip Wood, Junio C Hamano, Justin Tobler,
	Kristoffer Haugsbakk

[-- Attachment #1: Type: text/plain, Size: 3966 bytes --]

On Fri, May 03, 2024 at 02:13:39PM -0400, Jeff King wrote:
> On Thu, May 02, 2024 at 10:17:42AM +0200, Patrick Steinhardt wrote:
> 
> > Before this patch series, root refs except for "HEAD" and our special
> > refs were classified as pseudorefs. Furthermore, our terminology
> > clarified that pseudorefs must not be symbolic refs. This restriction
> > is enforced in `is_root_ref()`, which explicitly checks that a supposed
> > root ref resolves to an object ID without recursing.
> > 
> > This has been extremely confusing right from the start because (in old
> > terminology) a ref name may sometimes be a pseudoref and sometimes not
> > depending on whether it is a symbolic or regular ref. This behaviour
> > does not seem reasonable at all and I very much doubt that it results in
> > anything sane.
> > 
> > Furthermore, the behaviour is different to `is_headref()`, which only
> > checks for the ref to exist. While that is in line with our glossary,
> > this inconsistency only adds to the confusion.
> > 
> > Last but not least, the current behaviour can actually lead to a
> > segfault when calling `is_root_ref()` with a reference that either does
> > not exist or that is a symbolic ref because we never initialized `oid`.
> > 
> > Let's loosen the restrictions in accordance with the new definition of
> > root refs, which are simply plain refs that may also be symbolic
> > refs. Consequently, we can just check for the ref to exist instead of
> > requiring it to be a regular ref.
> 
> It's not clear to me that this existence check is particularly useful.
> Something that fails read_raw_ref() will fail if:
> 
>   - the file does not exist at all. But then how did somebody find out
>     about it at all to ask is_pseudoref()?
> 
>   - it does exist, but does not look like a ref. Is this important? If I
>     do "echo foo >.git/CHERRY_PICK_HEAD", does it become not a root ref
>     anymore? Or is it a root ref that is broken? I'd have thought the
>     latter, and the syntax is what distinguishes it.
> 
> Making the classification purely syntactic based on the name feels
> simpler to me to reason about. You'll never run into confusing cases
> where repo state changes how commands may behave.

I certainly agree and have been complaining about that in the past, too.
I didn't dare to change the semantics this far yet. Let's have a look at
the callers:

  - "ref-filter.c:ref_kind_from_refname()" uses it to classify refs.
    It's clear that the intent is to classify based on the ref name,
    only.

  - "refs/files-backend.c:add_pseudoref_and_head_entries()" uses it to
    determine whether it should add a ref to the root directory. It
    feels fishy that this uses ref existence checks to do that.

  - "refs/reftable-backend.c:reftable_ref_iterator_advance()" uses it to
    filter root refs. Again, using existence checks is pointless here as
    the iterator has just surfaced the ref, so it does exist.

  - "refs.c:is_current_worktree_ref()" uses it. Fishy as well, as the
    call to `is_per_worktree_ref()` also only checks for the refname.

So let's remove these existence checks altogether and make this a check
that looks purely at the ref name's syntax.
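
As a rough illustration of what a purely name-based check could look
like, here is a minimal standalone sketch. It is not code from this
series: `is_root_ref_by_name()`, `has_root_ref_syntax()`, `has_suffix()`
and `is_pseudo_ref_name()` are hypothetical names. The list of irregular
root refs is copied from `is_root_ref()` as of this series, and the
syntax rule is assumed to be the one described in the refs.h comment
(uppercase letters and underscores only).

```c
#include <stddef.h>
#include <string.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical helper: 1 if `name` ends with `suffix`, 0 otherwise. */
static int has_suffix(const char *name, const char *suffix)
{
	size_t nlen = strlen(name), slen = strlen(suffix);

	return nlen >= slen && !strcmp(name + nlen - slen, suffix);
}

/* Non-empty, and made up of uppercase ASCII letters and underscores only. */
static int has_root_ref_syntax(const char *name)
{
	const char *c;

	if (!*name)
		return 0;
	for (c = name; *c; c++)
		if ((*c < 'A' || *c > 'Z') && *c != '_')
			return 0;
	return 1;
}

/* FETCH_HEAD and MERGE_HEAD are pseudorefs, not root refs. */
static int is_pseudo_ref_name(const char *name)
{
	return !strcmp(name, "FETCH_HEAD") || !strcmp(name, "MERGE_HEAD");
}

/* Classify by name alone; the ref store is never consulted. */
int is_root_ref_by_name(const char *name)
{
	static const char *const irregular_root_refs[] = {
		"HEAD",
		"AUTO_MERGE",
		"BISECT_EXPECTED_REV",
		"NOTES_MERGE_PARTIAL",
		"NOTES_MERGE_REF",
		"MERGE_AUTOSTASH",
	};
	size_t i;

	if (!has_root_ref_syntax(name) || is_pseudo_ref_name(name))
		return 0;

	if (has_suffix(name, "_HEAD"))
		return 1;

	for (i = 0; i < ARRAY_LEN(irregular_root_refs); i++)
		if (!strcmp(name, irregular_root_refs[i]))
			return 1;

	return 0;
}
```

Because the result depends on the name only, a broken or missing
CHERRY_PICK_HEAD would still be classified as a root ref, while "FOO"
would not be, regardless of repository state.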

> And arguably is_pseudoref_syntax() should be taking into account the
> "_HEAD" restriction and special names anyway. It is a bit weird that
> even if we tighten up the refname checking to use is_pseudoref_syntax(),
> you'd still be able to "git update-ref FOO" but then not see it as a
> root ref!

True, as well. I'm less comfortable with doing that change in this
series though as it does impose a major restriction that did not exist
previously. We probably want some escape hatches so that it would still
be possible to modify those refs when really required, for example to
delete such broken refs.

I would thus like to defer this to a follow up patch series, if you
don't mind.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v3 07/10] refs: root refs can be symbolic refs
  2024-05-15  4:16       ` Patrick Steinhardt
@ 2024-05-15  4:39         ` Patrick Steinhardt
  2024-05-15  6:22           ` Jeff King
  2024-05-15  6:20         ` Jeff King
  1 sibling, 1 reply; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-15  4:39 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Karthik Nayak, Phillip Wood, Junio C Hamano, Justin Tobler,
	Kristoffer Haugsbakk

[-- Attachment #1: Type: text/plain, Size: 1323 bytes --]

On Wed, May 15, 2024 at 06:16:18AM +0200, Patrick Steinhardt wrote:
> On Fri, May 03, 2024 at 02:13:39PM -0400, Jeff King wrote:
> > On Thu, May 02, 2024 at 10:17:42AM +0200, Patrick Steinhardt wrote:
[snip]
> > And arguably is_pseudoref_syntax() should be taking into account the
> > "_HEAD" restriction and special names anyway. It is a bit weird that
> > even if we tighten up the refname checking to use is_pseudoref_syntax(),
> > you'd still be able to "git update-ref FOO" but then not see it as a
> > root ref!
> 
> True, as well. I'm less comfortable with doing that change in this
> series though as it does impose a major restriction that did not exist
> previously. We probably want some escape hatches so that it would still
> be possible to modify those refs when really required, for example to
> delete such broken refs.
> 
> I would thus like to defer this to a follow up patch series, if you
> don't mind.

Arguably, we don't need `is_pseudoref_syntax()` (which is being renamed
to `is_root_ref_syntax()`) at all anymore after this series lands
because it can be neatly rolled into `is_root_ref()`. The only caller,
`is_current_worktree_ref()`, should really call `is_root_ref()` and not
`is_root_ref_syntax()`.

But again, I'll defer this to a follow-up patch series.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v3 07/10] refs: root refs can be symbolic refs
  2024-05-15  4:16       ` Patrick Steinhardt
  2024-05-15  4:39         ` Patrick Steinhardt
@ 2024-05-15  6:20         ` Jeff King
  1 sibling, 0 replies; 93+ messages in thread
From: Jeff King @ 2024-05-15  6:20 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Karthik Nayak, Phillip Wood, Junio C Hamano, Justin Tobler,
	Kristoffer Haugsbakk

On Wed, May 15, 2024 at 06:16:18AM +0200, Patrick Steinhardt wrote:

> > Making the classification purely syntactic based on the name feels
> > simpler to me to reason about. You'll never run into confusing cases
> > where repo state changes how commands may behave.
> 
> I certainly agree and have been complaining about that in the past, too.
> I didn't dare to change the semantics this far yet. Let's have a look at
> the callers:
> 
>   - "ref-filter.c:ref_kind_from_refname()" uses it to classify refs.
>     It's clear that the intent is to classify based on the ref name,
>     only.
> 
>   - "refs/files-backend.c:add_pseudoref_and_head_entries()" uses it to
>     determine whether it should add a ref to the root directory. It
>     feels fishy that this uses ref existence checks to do that.
> 
>   - "refs/reftable-backend.c:reftable_ref_iterator_advance()" uses it to
>     filter root refs. Again, using existence checks is pointless here as
>     the iterator has just surfaced the ref, so it does exist.
> 
>   - "refs.c:is_current_worktree_ref()" uses it. Fishy as well, as the
>     call to `is_per_worktree_ref()` also only checks for the refname.
> 
> So let's remove these existence checks altogether and make this a check
> that looks purely at the ref name's syntax.

Thanks for doing the digging on the callers. That matches my intuition /
light analysis, which is good. ;)

> > And arguably is_pseudoref_syntax() should be taking into account the
> > "_HEAD" restriction and special names anyway. It is a bit weird that
> > even if we tighten up the refname checking to use is_pseudoref_syntax(),
> > you'd still be able to "git update-ref FOO" but then not see it as a
> > root ref!
> 
> True, as well. I'm less comfortable with doing that change in this
> series though as it does impose a major restriction that did not exist
> previously. We probably want some escape hatches so that it would still
> be possible to modify those refs when really required, for example to
> delete such broken refs.
> 
> I would thus like to defer this to a follow up patch series, if you
> don't mind.

I don't mind deferring. I thought it might make the simplifications
you're doing in this series easier to reason about. But TBH I haven't
had the chance to look through your series very carefully yet (and I'm
still a bit back-logged), so I'm happy to go with your judgement on how
to split it up.

-Peff

* Re: [PATCH v3 07/10] refs: root refs can be symbolic refs
  2024-05-15  4:39         ` Patrick Steinhardt
@ 2024-05-15  6:22           ` Jeff King
  2024-05-15  6:35             ` Patrick Steinhardt
  0 siblings, 1 reply; 93+ messages in thread
From: Jeff King @ 2024-05-15  6:22 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Karthik Nayak, Phillip Wood, Junio C Hamano, Justin Tobler,
	Kristoffer Haugsbakk

On Wed, May 15, 2024 at 06:39:52AM +0200, Patrick Steinhardt wrote:

> On Wed, May 15, 2024 at 06:16:18AM +0200, Patrick Steinhardt wrote:
> > On Fri, May 03, 2024 at 02:13:39PM -0400, Jeff King wrote:
> > > On Thu, May 02, 2024 at 10:17:42AM +0200, Patrick Steinhardt wrote:
> [snip]
> > > And arguably is_pseudoref_syntax() should be taking into account the
> > > "_HEAD" restriction and special names anyway. It is a bit weird that
> > > even if we tighten up the refname checking to use is_pseudoref_syntax(),
> > > you'd still be able to "git update-ref FOO" but then not see it as a
> > > root ref!
> > 
> > True, as well. I'm less comfortable with doing that change in this
> > series though as it does impose a major restriction that did not exist
> > previously. We probably want some escape hatches so that it would still
> > be possible to modify those refs when really required, for example to
> > delete such broken refs.
> > 
> > I would thus like to defer this to a follow up patch series, if you
> > don't mind.
> 
> Arguably, we don't need `is_pseudoref_syntax()` (which is being renamed
> to `is_root_ref_syntax()`) at all anymore after this series lands
> because it can be neatly rolled into `is_root_ref()`. The only caller,
> `is_current_worktree_ref()`, should really call `is_root_ref()` and not
> `is_root_ref_syntax()`.

Yeah, and I'd expect that the more-strict check_refname_format() that I
proposed elsewhere would be in the same boat. The only reason I used the
"_syntax()" variant is that it was obviously wrong to do existence
checks there. Once those are gone, then naturally it should be able to
rely on is_root_ref() itself.

-Peff

* Re: [PATCH v3 07/10] refs: root refs can be symbolic refs
  2024-05-15  6:22           ` Jeff King
@ 2024-05-15  6:35             ` Patrick Steinhardt
  2024-05-15  6:49               ` Jeff King
  0 siblings, 1 reply; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-15  6:35 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Karthik Nayak, Phillip Wood, Junio C Hamano, Justin Tobler,
	Kristoffer Haugsbakk

On Wed, May 15, 2024 at 02:22:20AM -0400, Jeff King wrote:
> On Wed, May 15, 2024 at 06:39:52AM +0200, Patrick Steinhardt wrote:
> 
> > On Wed, May 15, 2024 at 06:16:18AM +0200, Patrick Steinhardt wrote:
> > > On Fri, May 03, 2024 at 02:13:39PM -0400, Jeff King wrote:
> > > > On Thu, May 02, 2024 at 10:17:42AM +0200, Patrick Steinhardt wrote:
> > [snip]
> > > > And arguably is_pseudoref_syntax() should be taking into account the
> > > > "_HEAD" restriction and special names anyway. It is a bit weird that
> > > > even if we tighten up the refname checking to use is_pseudoref_syntax(),
> > > > you'd still be able to "git update-ref FOO" but then not see it as a
> > > > root ref!
> > > 
> > > True, as well. I'm less comfortable with doing that change in this
> > > series though as it does impose a major restriction that did not exist
> > > previously. We probably want some escape hatches so that it would still
> > > be possible to modify those refs when really required, for example to
> > > delete such broken refs.
> > > 
> > > I would thus like to defer this to a follow up patch series, if you
> > > don't mind.
> > 
> > Arguably, we don't need `is_pseudoref_syntax()` (which is being renamed
> > to `is_root_ref_syntax()`) at all anymore after this series lands
> > because it can be neatly rolled into `is_root_ref()`. The only caller,
> > `is_current_worktree_ref()`, should really call `is_root_ref()` and not
> > `is_root_ref_syntax()`.
> 
> Yeah, and I'd expect that the more-strict check_refname_format() that I
> proposed elsewhere would be in the same boat. The only reason I used the
> "_syntax()" variant is that it was obviously wrong to do existence
> checks there. Once those are gone, then naturally it should be able to
> rely on is_root_ref() itself.

This series hasn't been queued/merged yet, right? Do you plan to reroll
it? I think that the changes in there are a good complementary addition
to the clarifications in my patch series.

Patrick

* Re: [PATCH v3 07/10] refs: root refs can be symbolic refs
  2024-05-15  6:35             ` Patrick Steinhardt
@ 2024-05-15  6:49               ` Jeff King
  2024-05-15  6:59                 ` Patrick Steinhardt
  0 siblings, 1 reply; 93+ messages in thread
From: Jeff King @ 2024-05-15  6:49 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Karthik Nayak, Phillip Wood, Junio C Hamano, Justin Tobler,
	Kristoffer Haugsbakk

On Wed, May 15, 2024 at 08:35:52AM +0200, Patrick Steinhardt wrote:

> > Yeah, and I'd expect that the more-strict check_refname_format() that I
> > proposed elsewhere would be in the same boat. The only reason I used the
> > "_syntax()" variant is that it was obviously wrong to do existence
> > checks there. Once those are gone, then naturally it should be able to
> > rely on is_root_ref() itself.
> 
> This series hasn't been queued/merged yet, right? Do you plan to reroll
> it? I think that the changes in there are a good complementary addition
> to the clarifications in my patch series.

Correct, I don't think Junio picked it up. It needed a re-roll anyway,
so I'd plan to do it on top of your patches, assuming they are on track
to get merged (and it sounds like there are no real objections).

-Peff

* [PATCH v5 00/10] Clarify pseudo-ref terminology
  2024-04-29 13:41 [PATCH 0/3] Clarify pseudo-ref terminology Patrick Steinhardt
                   ` (5 preceding siblings ...)
  2024-05-10  8:48 ` [PATCH v4 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
@ 2024-05-15  6:50 ` Patrick Steinhardt
  2024-05-15  6:50   ` [PATCH v5 01/10] Documentation/glossary: redefine pseudorefs as special refs Patrick Steinhardt
                     ` (9 more replies)
  6 siblings, 10 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-15  6:50 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk, Jean-Noël AVILA

Hi,

this is the fifth version of my patch series that aims to clarify the
pseudo-ref terminology. Changes compared to v4:

  - Dropped a now-unneeded comment in `is_root_ref_syntax()` which
    claimed that "HEAD is not a pseudoref". Due to the rename of the
    function this comment no longer applies.

  - Adapted `is_root_ref()` so that it does not check the ref for
    existence at all anymore, as proposed by Peff. This makes the
    function's behaviour way less confusing overall. I also added some
    explanations to the commit message to explain why this is okay to
    do.

  - Adapted the commit message of the patch that touches `is_headref()`
    to explain some of the subtleties that result from the removed ref
    existence check.
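
As a rough illustration of the second item, a name-only root ref check
could look like the sketch below. This is a standalone approximation of
the end state of the series as I read the range-diff, not the code from
refs.c: all `sketch_` names are hypothetical, and the allowed character
set (uppercase letters, "-" and "_") is an assumption. The list of
irregular names is the one visible in the range-diff.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: does `str` end with `suffix`? */
static int sketch_ends_with(const char *str, const char *suffix)
{
	size_t len = strlen(str), suffix_len = strlen(suffix);

	return len >= suffix_len &&
	       !strcmp(str + len - suffix_len, suffix);
}

/* Pseudorefs per the new terminology: only these two names. */
static int sketch_is_pseudo_ref(const char *refname)
{
	return !strcmp(refname, "FETCH_HEAD") ||
	       !strcmp(refname, "MERGE_HEAD");
}

/* Assumed syntax rule: only uppercase letters, '-' and '_'. */
static int sketch_is_root_ref_syntax(const char *refname)
{
	const char *c;

	for (c = refname; *c; c++)
		if (!isupper((unsigned char)*c) && *c != '-' && *c != '_')
			return 0;
	return 1;
}

/* Name-only check: never consults the ref store or the filesystem. */
static int sketch_is_root_ref(const char *refname)
{
	static const char *const irregular_root_refs[] = {
		"HEAD", "AUTO_MERGE", "BISECT_EXPECTED_REV",
		"NOTES_MERGE_PARTIAL", "NOTES_MERGE_REF", "MERGE_AUTOSTASH",
	};
	size_t i;

	if (!sketch_is_root_ref_syntax(refname) ||
	    sketch_is_pseudo_ref(refname))
		return 0;
	if (sketch_ends_with(refname, "_HEAD"))
		return 1;
	for (i = 0; i < sizeof(irregular_root_refs) / sizeof(*irregular_root_refs); i++)
		if (!strcmp(refname, irregular_root_refs[i]))
			return 1;
	return 0;
}
```

Because the answer depends only on the name, the same input always
gives the same result regardless of repository state. A name such as
"FOO" has valid syntax in this sketch but is still not classified as a
root ref, which is the oddity discussed earlier in the thread.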

Thanks!

Patrick Steinhardt (10):
  Documentation/glossary: redefine pseudorefs as special refs
  Documentation/glossary: clarify limitations of pseudorefs
  Documentation/glossary: define root refs as refs
  refs: rename `is_pseudoref()` to `is_root_ref()`
  refs: rename `is_special_ref()` to `is_pseudo_ref()`
  refs: do not check ref existence in `is_root_ref()`
  refs: classify HEAD as a root ref
  refs: pseudorefs are no refs
  ref-filter: properly distinuish pseudo and root refs
  refs: refuse to write pseudorefs

 Documentation/glossary-content.txt | 72 +++++++++++-----------
 builtin/for-each-ref.c             |  2 +-
 ref-filter.c                       | 16 ++---
 ref-filter.h                       |  4 +-
 refs.c                             | 98 +++++++++++-------------------
 refs.h                             | 48 ++++++++++++++-
 refs/files-backend.c               |  3 +-
 refs/reftable-backend.c            |  3 +-
 t/t5510-fetch.sh                   |  6 +-
 t/t6302-for-each-ref-filter.sh     | 34 +++++++++++
 10 files changed, 169 insertions(+), 117 deletions(-)

Range-diff against v4:
 1:  b1fc4c1ac7 =  1:  1f2445b95b Documentation/glossary: redefine pseudorefs as special refs
 2:  dce3a0fa7e =  2:  d328081c52 Documentation/glossary: clarify limitations of pseudorefs
 3:  79249962f5 =  3:  0d185e6479 Documentation/glossary: define root refs as refs
 4:  ee2b090f75 !  4:  33b221b248 refs: rename `is_pseudoref()` to `is_root_ref()`
    @@ refs.c: int is_per_worktree_ref(const char *refname)
      	const char *c;
      
     @@ refs.c: static int is_pseudoref_syntax(const char *refname)
    + 			return 0;
    + 	}
    + 
    +-	/*
    +-	 * HEAD is not a pseudoref, but it certainly uses the
    +-	 * pseudoref syntax.
    +-	 */
      	return 1;
      }
      
 5:  2c09bc7690 !  5:  9087696d82 refs: refname `is_special_ref()` to `is_pseudo_ref()`
    @@ Metadata
     Author: Patrick Steinhardt <ps@pks.im>
     
      ## Commit message ##
    -    refs: refname `is_special_ref()` to `is_pseudo_ref()`
    +    refs: rename `is_special_ref()` to `is_pseudo_ref()`
     
         Rename `is_special_ref()` to `is_pseudo_ref()` to adapt to the newly
         defined terminology in our gitglossary(7). Note that in the preceding
 6:  5e402811a6 !  6:  af22581c22 refs: root refs can be symbolic refs
    @@ Metadata
     Author: Patrick Steinhardt <ps@pks.im>
     
      ## Commit message ##
    -    refs: root refs can be symbolic refs
    +    refs: do not check ref existence in `is_root_ref()`
     
         Before this patch series, root refs except for "HEAD" and our special
         refs were classified as pseudorefs. Furthermore, our terminology
    @@ Commit message
     
         Last but not least, the current behaviour can actually lead to a
         segfault when calling `is_root_ref()` with a reference that either does
    -    not exist or that is a symbolic ref because we never initialized `oid`.
    +    not exist or that is a symbolic ref because we never initialized `oid`,
    +    but then read it via `is_null_oid()`.
     
    -    Let's loosen the restrictions in accordance to the new definition of
    -    root refs, which are simply plain refs that may as well be a symbolic
    -    ref. Consequently, we can just check for the ref to exist instead of
    -    requiring it to be a regular ref.
    +    We have now changed terminology to clarify that pseudorefs are really
    +    only "MERGE_HEAD" and "FETCH_HEAD", whereas all the other refs that live
    +    in the root of the ref hierarchy are just plain refs. Thus, we do not
    +    need to check whether the ref is symbolic or not. In fact, we can now
    +    avoid looking up the ref completely as the name is sufficient for us to
    +    figure out whether something would be a root ref or not.
    +
    +    This change of course changes semantics for our callers. As there are
    +    only three of them we can assess each of them individually:
    +
    +      - "ref-filter.c:ref_kind_from_refname()" uses it to classify refs.
    +        It's clear that the intent is to classify based on the ref name,
    +        only.
    +
    +      - "refs/reftable_backend.c:reftable_ref_iterator_advance()" uses it to
    +        filter root refs. Again, using existence checks is pointless here as
    +        the iterator has just surfaced the ref, so we know it does exist.
    +
    +      - "refs/files_backend.c:add_pseudoref_and_head_entries()" uses it to
    +        determine whether it should add a ref to the root directory of its
    +        iterator. This had the effect that we skipped over any files that
    +        are either a symbolic ref, or which are not a ref at all.
    +
    +        The new behaviour is to include symbolic refs now, which aligns us
    +        with the adapted terminology. Furthermore, files which look like
    +        root refs but aren't are now marked as "broken". As broken refs
    +        are not surfaced by our tooling, this should not lead to a change in
    +        user-visible behaviour, but may cause us to emit warnings. This
    +        feels like the right thing to do as we would otherwise just silently
    +        ignore corrupted root refs completely.
    +
    +    So in all cases the existence check was either superfluous, not in line
    +    with the adapted terminology or masked potential issues. This commit
    +    thus changes the behaviour as proposed and drops the existence check
    +    altogether.
     
         Add a test that verifies that this does not change user-visible
         behaviour. Namely, we still don't want to show broken refs to the user
    @@ Commit message
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
    + ## ref-filter.c ##
    +@@ ref-filter.c: static int ref_kind_from_refname(const char *refname)
    + 			return ref_kind[i].kind;
    + 	}
    + 
    +-	if (is_root_ref(get_main_ref_store(the_repository), refname))
    ++	if (is_root_ref(refname))
    + 		return FILTER_REFS_PSEUDOREFS;
    + 
    + 	return FILTER_REFS_OTHERS;
    +
      ## refs.c ##
    +@@ refs.c: static int is_root_ref_syntax(const char *refname)
    + 	return 1;
    + }
    + 
    +-int is_root_ref(struct ref_store *refs, const char *refname)
    ++int is_root_ref(const char *refname)
    + {
    + 	static const char *const irregular_root_refs[] = {
    + 		"AUTO_MERGE",
     @@ refs.c: int is_root_ref(struct ref_store *refs, const char *refname)
      		"NOTES_MERGE_REF",
      		"MERGE_AUTOSTASH",
      	};
     -	struct object_id oid;
    -+	struct strbuf referent = STRBUF_INIT;
    -+	struct object_id oid = { 0 };
    -+	int failure_errno, ret = 0;
    -+	unsigned int flags;
      	size_t i;
      
      	if (!is_root_ref_syntax(refname))
      		return 0;
      
    -+	/*
    -+	 * Note that we cannot use `refs_ref_exists()` here because that also
    -+	 * checks whether its target ref exists in case refname is a symbolic
    -+	 * ref.
    -+	 */
    - 	if (ends_with(refname, "_HEAD")) {
    +-	if (ends_with(refname, "_HEAD")) {
     -		refs_resolve_ref_unsafe(refs, refname,
     -					RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
     -					&oid, NULL);
     -		return !is_null_oid(&oid);
    -+		ret = !refs_read_raw_ref(refs, refname, &oid, &referent,
    -+					 &flags, &failure_errno);
    -+		goto done;
    - 	}
    +-	}
    ++	if (ends_with(refname, "_HEAD"))
    ++		return 1;
      
    --	for (i = 0; i < ARRAY_SIZE(irregular_root_refs); i++)
    -+	for (i = 0; i < ARRAY_SIZE(irregular_root_refs); i++) {
    - 		if (!strcmp(refname, irregular_root_refs[i])) {
    + 	for (i = 0; i < ARRAY_SIZE(irregular_root_refs); i++)
    +-		if (!strcmp(refname, irregular_root_refs[i])) {
     -			refs_resolve_ref_unsafe(refs, refname,
     -						RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
     -						&oid, NULL);
     -			return !is_null_oid(&oid);
    -+			ret = !refs_read_raw_ref(refs, refname, &oid, &referent,
    -+						 &flags, &failure_errno);
    -+			goto done;
    - 		}
    -+	}
    +-		}
    ++		if (!strcmp(refname, irregular_root_refs[i]))
    ++			return 1;
      
    --	return 0;
    -+done:
    -+	strbuf_release(&referent);
    -+	return ret;
    + 	return 0;
      }
    - 
    - int is_headref(struct ref_store *refs, const char *refname)
     
      ## refs.h ##
     @@ refs.h: extern struct ref_namespace_info ref_namespace[NAMESPACE__COUNT];
    @@ refs.h: extern struct ref_namespace_info ref_namespace[NAMESPACE__COUNT];
      
      /*
     - * Check whether the reference is an existing root reference.
    -+ * Check whether the reference is an existing root reference. A root reference
    -+ * that is a dangling symbolic ref is considered to exist.
    ++ * Check whether the provided name names a root reference. This function only
    ++ * performs a syntactic check.
       *
       * A root ref is a reference that lives in the root of the reference hierarchy.
       * These references must conform to special syntax:
    +@@ refs.h: void update_ref_namespace(enum ref_namespace namespace, char *ref);
    +  *
    +  *   - MERGE_AUTOSTASH
    +  */
    +-int is_root_ref(struct ref_store *refs, const char *refname);
    ++int is_root_ref(const char *refname);
    + 
    + int is_headref(struct ref_store *refs, const char *refname);
    + 
    +
    + ## refs/files-backend.c ##
    +@@ refs/files-backend.c: static void add_pseudoref_and_head_entries(struct ref_store *ref_store,
    + 		strbuf_addstr(&refname, de->d_name);
    + 
    + 		dtype = get_dtype(de, &path, 1);
    +-		if (dtype == DT_REG && (is_root_ref(ref_store, de->d_name) ||
    +-								is_headref(ref_store, de->d_name)))
    ++		if (dtype == DT_REG && (is_root_ref(de->d_name) ||
    ++					is_headref(ref_store, de->d_name)))
    + 			loose_fill_ref_dir_regular_file(refs, refname.buf, dir);
    + 
    + 		strbuf_setlen(&refname, dirnamelen);
    +
    + ## refs/reftable-backend.c ##
    +@@ refs/reftable-backend.c: static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator)
    + 		 */
    + 		if (!starts_with(iter->ref.refname, "refs/") &&
    + 		    !(iter->flags & DO_FOR_EACH_INCLUDE_ROOT_REFS &&
    +-		     (is_root_ref(&iter->refs->base, iter->ref.refname) ||
    ++		     (is_root_ref(iter->ref.refname) ||
    + 		      is_headref(&iter->refs->base, iter->ref.refname)))) {
    + 			continue;
    + 		}
     
      ## t/t6302-for-each-ref-filter.sh ##
     @@ t/t6302-for-each-ref-filter.sh: test_expect_success '--include-root-refs with other patterns' '
 7:  b32c56afcb !  7:  b719fb7110 refs: classify HEAD as a root ref
    @@ Commit message
         Adapt the function to also treat "HEAD" as a root ref. This change is
         safe to do for all current callers:
     
    -    - `ref_kind_from_refname()` already handles "HEAD" explicitly before
    -      calling `is_root_ref()`.
    +      - `ref_kind_from_refname()` already handles "HEAD" explicitly before
    +        calling `is_root_ref()`.
     
    -    - The "files" and "reftable" backends explicitly called both
    -      `is_root_ref()` and `is_headref()`.
    +      - The "files" and "reftable" backends explicitly call both
    +        `is_root_ref()` and `is_headref()` together.
     
         This also aligns behaviour of `is_root_ref()` and `is_headref()` such
    -    that we also return a trueish value when the ref is a dangling symbolic
    -    ref. As there are no callers of `is_headref()` left afer the refactoring
    -    we absorb it completely into `is_root_ref()`.
    +    that we stop checking for ref existence. This changes semantics for our
    +    backends:
    +
    +      - In the reftable backend we already know that the ref must exist
    +        because `is_headref()` is called as part of the ref iterator. The
    +        existence check is thus redundant, and the change is safe to do.
    +
    +      - In the files backend we use it when populating root refs, where we
    +        would skip adding the "HEAD" file if it was not possible to resolve
    +        it. The new behaviour is to instead mark "HEAD" as broken, which
    +        will cause us to emit warnings in various places.
    +
    +    As there are no callers of `is_headref()` left after the refactoring, we
    +    can absorb it completely into `is_root_ref()`.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
      ## refs.c ##
     @@ refs.c: static int is_root_ref_syntax(const char *refname)
    - int is_root_ref(struct ref_store *refs, const char *refname)
    + int is_root_ref(const char *refname)
      {
      	static const char *const irregular_root_refs[] = {
     +		"HEAD",
      		"AUTO_MERGE",
      		"BISECT_EXPECTED_REV",
      		"NOTES_MERGE_PARTIAL",
    -@@ refs.c: int is_root_ref(struct ref_store *refs, const char *refname)
    - 	return ret;
    +@@ refs.c: int is_root_ref(const char *refname)
    + 	return 0;
      }
      
     -int is_headref(struct ref_store *refs, const char *refname)
    @@ refs.h: void update_ref_namespace(enum ref_namespace namespace, char *ref);
       *
     @@ refs.h: void update_ref_namespace(enum ref_namespace namespace, char *ref);
       */
    - int is_root_ref(struct ref_store *refs, const char *refname);
    + int is_root_ref(const char *refname);
      
     -int is_headref(struct ref_store *refs, const char *refname);
     -
    @@ refs/files-backend.c: static void add_pseudoref_and_head_entries(struct ref_stor
      		strbuf_addstr(&refname, de->d_name);
      
      		dtype = get_dtype(de, &path, 1);
    --		if (dtype == DT_REG && (is_root_ref(ref_store, de->d_name) ||
    --								is_headref(ref_store, de->d_name)))
    -+		if (dtype == DT_REG && is_root_ref(ref_store, de->d_name))
    +-		if (dtype == DT_REG && (is_root_ref(de->d_name) ||
    +-					is_headref(ref_store, de->d_name)))
    ++		if (dtype == DT_REG && is_root_ref(de->d_name))
      			loose_fill_ref_dir_regular_file(refs, refname.buf, dir);
      
      		strbuf_setlen(&refname, dirnamelen);
    @@ refs/reftable-backend.c: static int reftable_ref_iterator_advance(struct ref_ite
      		 */
      		if (!starts_with(iter->ref.refname, "refs/") &&
      		    !(iter->flags & DO_FOR_EACH_INCLUDE_ROOT_REFS &&
    --		     (is_root_ref(&iter->refs->base, iter->ref.refname) ||
    +-		     (is_root_ref(iter->ref.refname) ||
     -		      is_headref(&iter->refs->base, iter->ref.refname)))) {
    -+		      is_root_ref(&iter->refs->base, iter->ref.refname))) {
    ++		      is_root_ref(iter->ref.refname))) {
      			continue;
      		}
      
 8:  19af8c754c !  8:  5709d7f780 refs: pseudorefs are no refs
    @@ refs.c: int is_per_worktree_ref(const char *refname)
      static int is_root_ref_syntax(const char *refname)
      {
      	const char *c;
    -@@ refs.c: int is_root_ref(struct ref_store *refs, const char *refname)
    - 	unsigned int flags;
    +@@ refs.c: int is_root_ref(const char *refname)
    + 	};
      	size_t i;
      
     -	if (!is_root_ref_syntax(refname))
    @@ refs.c: int is_root_ref(struct ref_store *refs, const char *refname)
     +	    is_pseudo_ref(refname))
      		return 0;
      
    - 	/*
    + 	if (ends_with(refname, "_HEAD"))
     @@ refs.c: static int refs_read_special_head(struct ref_store *ref_store,
      	return result;
      }
 9:  86f7f2d2d8 !  9:  c7e90e3170 ref-filter: properly distinuish pseudo and root refs
    @@ ref-filter.c: static int ref_kind_from_refname(const char *refname)
      			return ref_kind[i].kind;
      	}
      
    --	if (is_root_ref(get_main_ref_store(the_repository), refname))
    +-	if (is_root_ref(refname))
     +	if (is_pseudo_ref(refname))
      		return FILTER_REFS_PSEUDOREFS;
    -+	if (is_root_ref(get_main_ref_store(the_repository), refname))
    ++	if (is_root_ref(refname))
     +		return FILTER_REFS_ROOT_REFS;
      
      	return FILTER_REFS_OTHERS;
    @@ refs.c: int is_per_worktree_ref(const char *refname)
      ## refs.h ##
     @@ refs.h: void update_ref_namespace(enum ref_namespace namespace, char *ref);
       */
    - int is_root_ref(struct ref_store *refs, const char *refname);
    + int is_root_ref(const char *refname);
      
     +/*
     + * Pseudorefs are refs that have different semantics compared to
10:  640d3b169f = 10:  15595991dc refs: refuse to write pseudorefs

base-commit: 83f1add914c6b4682de1e944ec0d1ac043d53d78
-- 
2.45.GIT


* [PATCH v5 01/10] Documentation/glossary: redefine pseudorefs as special refs
  2024-05-15  6:50 ` [PATCH v5 " Patrick Steinhardt
@ 2024-05-15  6:50   ` Patrick Steinhardt
  2024-05-15  6:50   ` [PATCH v5 02/10] Documentation/glossary: clarify limitations of pseudorefs Patrick Steinhardt
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-15  6:50 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk, Jean-Noël AVILA

Nowadays, Git knows about three different kinds of refs. As defined in
gitglossary(7):

  - Regular refs that start with "refs/", like "refs/heads/main".

  - Pseudorefs, which live in the root directory. These must have
    all-caps names and must be a file that starts with an object hash.
    Consequently, symbolic refs are not pseudorefs because they do not
    start with an object hash.

  - Special refs, of which we only have "FETCH_HEAD" and "MERGE_HEAD".

This state is extremely confusing, and I would claim that most folks
don't fully understand what is what here. The current definitions also
have several problems:

  - Where does "HEAD" fit in? It's not a pseudoref because it can be
    a symbolic ref. It's not a regular ref because it does not start
    with "refs/". And it's not a special ref, either.

  - There is a strong overlap between pseudorefs and special refs. The
    pseudoref section for example mentions "MERGE_HEAD", even though it
    is a special ref. Is it thus both a pseudoref and a special ref?

  - Why do we even need to distinguish refs that live in the root from
    other refs when they behave just like a regular ref anyway?

In other words, the current state is quite a mess and leads to wild
inconsistencies without much of a good reason.

The original reason why pseudorefs were introduced is that there are
some refs that sometimes behave like a ref, even though they aren't a
ref. And we really only have two of these nowadays, namely "MERGE_HEAD"
and "FETCH_HEAD". Those files are never written via the ref backends,
but are instead written by git-fetch(1), git-pull(1) and git-merge(1).
They contain additional metadata that highlights where a ref has been
fetched from or the list of commits that have been merged.
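
To make the "additional metadata" point more concrete, here is a small
sketch that splits one line of a "FETCH_HEAD"-style file. It assumes
the commonly seen layout of one tab-separated line per fetched ref: an
object ID, an optional "not-for-merge" marker, and a free-form
description. That layout is an assumption for illustration and is not
something this series specifies; the struct and function names are
hypothetical.

```c
#include <stddef.h>
#include <string.h>

struct sketch_fetch_head_entry {
	char oid[65];       /* hex object ID, with room for SHA-256 */
	int not_for_merge;  /* 1 if the middle field is "not-for-merge" */
	const char *note;   /* points into the input, after the second tab */
};

/* Returns 0 on success, -1 if the line lacks the two expected tabs. */
static int sketch_parse_fetch_head_line(const char *line,
					struct sketch_fetch_head_entry *out)
{
	static const char marker[] = "not-for-merge";
	const char *tab1 = strchr(line, '\t');
	const char *tab2 = tab1 ? strchr(tab1 + 1, '\t') : NULL;
	size_t oid_len;

	if (!tab2)
		return -1;

	oid_len = (size_t)(tab1 - line);
	if (!oid_len || oid_len >= sizeof(out->oid))
		return -1;
	memcpy(out->oid, line, oid_len);
	out->oid[oid_len] = '\0';

	/* The middle field is either empty or the literal marker. */
	out->not_for_merge =
		(size_t)(tab2 - tab1 - 1) == strlen(marker) &&
		!strncmp(tab1 + 1, marker, strlen(marker));
	out->note = tab2 + 1;
	return 0;
}
```

A regular ref resolves to exactly one object ID or one symbolic target,
so a record with several such lines and extra fields cannot be stored
by a ref backend as-is.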

This original intent in fact matches the definition of special refs that
we have recently introduced in 8df4c5d205 (Documentation: add "special
refs" to the glossary, 2024-01-19). Due to the introduction of the new
reftable backend we were forced to distinguish those refs more clearly
such that we don't ever try to read or write them via the reftable
backend. In the same series, we also addressed all the other cases where
we used to write those special refs via the filesystem directly, thus
circumventing the ref backend, to instead write them via the backends.
Consequently, there are no other refs left anymore which are special.

Let's address this mess and return the pseudoref terminology back to its
original intent: a ref that sometimes behaves like a ref, but which isn't
really a ref because it gets written to the filesystem directly. Or in
other words, let's redefine pseudorefs to match the current definition
of special refs. As special refs and pseudorefs are now the same per
definition, we can drop the "special refs" term again. It's not exposed
to our users and thus they wouldn't ever encounter that term anyway.

Refs that live in the root of the ref hierarchy but which are not
pseudorefs will be further defined in a subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/glossary-content.txt | 40 +++++++++---------------------
 1 file changed, 12 insertions(+), 28 deletions(-)

diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index d71b199955..e686c83026 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -497,20 +497,18 @@ exclude;;
 	unusual refs.
 
 [[def_pseudoref]]pseudoref::
-	Pseudorefs are a class of files under `$GIT_DIR` which behave
-	like refs for the purposes of rev-parse, but which are treated
-	specially by git.  Pseudorefs both have names that are all-caps,
-	and always start with a line consisting of a
-	<<def_SHA1,SHA-1>> followed by whitespace.  So, HEAD is not a
-	pseudoref, because it is sometimes a symbolic ref.  They might
-	optionally contain some additional data.  `MERGE_HEAD` and
-	`CHERRY_PICK_HEAD` are examples.  Unlike
-	<<def_per_worktree_ref,per-worktree refs>>, these files cannot
-	be symbolic refs, and never have reflogs.  They also cannot be
-	updated through the normal ref update machinery.  Instead,
-	they are updated by directly writing to the files.  However,
-	they can be read as if they were refs, so `git rev-parse
-	MERGE_HEAD` will work.
+	A ref that has different semantics than normal refs. These refs can be
+	accessed via normal Git commands but may not behave the same as a
+	normal ref in some cases.
++
+The following pseudorefs are known to Git:
+
+ - `FETCH_HEAD` is written by linkgit:git-fetch[1] or linkgit:git-pull[1]. It
+   may refer to multiple object IDs. Each object ID is annotated with metadata
+   indicating where it was fetched from and its fetch status.
+
+ - `MERGE_HEAD` is written by linkgit:git-merge[1] when resolving merge
+   conflicts. It contains all commit IDs which are being merged.
 
 [[def_pull]]pull::
 	Pulling a <<def_branch,branch>> means to <<def_fetch,fetch>> it and
@@ -638,20 +636,6 @@ The most notable example is `HEAD`.
 	An <<def_object,object>> used to temporarily store the contents of a
 	<<def_dirty,dirty>> working directory and the index for future reuse.
 
-[[def_special_ref]]special ref::
-	A ref that has different semantics than normal refs. These refs can be
-	accessed via normal Git commands but may not behave the same as a
-	normal ref in some cases.
-+
-The following special refs are known to Git:
-
- - "`FETCH_HEAD`" is written by linkgit:git-fetch[1] or linkgit:git-pull[1]. It
-   may refer to multiple object IDs. Each object ID is annotated with metadata
-   indicating where it was fetched from and its fetch status.
-
- - "`MERGE_HEAD`" is written by linkgit:git-merge[1] when resolving merge
-   conflicts. It contains all commit IDs which are being merged.
-
 [[def_submodule]]submodule::
 	A <<def_repository,repository>> that holds the history of a
 	separate project inside another repository (the latter of
-- 
2.45.GIT


* [PATCH v5 02/10] Documentation/glossary: clarify limitations of pseudorefs
  2024-05-15  6:50 ` [PATCH v5 " Patrick Steinhardt
  2024-05-15  6:50   ` [PATCH v5 01/10] Documentation/glossary: redefine pseudorefs as special refs Patrick Steinhardt
@ 2024-05-15  6:50   ` Patrick Steinhardt
  2024-05-15  6:50   ` [PATCH v5 03/10] Documentation/glossary: define root refs as refs Patrick Steinhardt
                     ` (7 subsequent siblings)
  9 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-15  6:50 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk, Jean-Noël AVILA

[-- Attachment #1: Type: text/plain, Size: 1126 bytes --]

Clarify limitations that pseudorefs have:

  - They can be read via git-rev-parse(1) and similar tools.

  - They are not surfaced when iterating through refs, like when using
    git-for-each-ref(1). They are not refs, so iterating through refs
    should not surface them.

  - They cannot be written via git-update-ref(1) and related commands.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/glossary-content.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index e686c83026..d8c04b37be 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -498,8 +498,8 @@ exclude;;
 
 [[def_pseudoref]]pseudoref::
 	A ref that has different semantics than normal refs. These refs can be
-	accessed via normal Git commands but may not behave the same as a
-	normal ref in some cases.
+	read via normal Git commands, but cannot be written to by commands like
+	linkgit:git-update-ref[1].
 +
 The following pseudorefs are known to Git:
 
-- 
2.45.GIT



* [PATCH v5 03/10] Documentation/glossary: define root refs as refs
  2024-05-15  6:50 ` [PATCH v5 " Patrick Steinhardt
  2024-05-15  6:50   ` [PATCH v5 01/10] Documentation/glossary: redefine pseudorefs as special refs Patrick Steinhardt
  2024-05-15  6:50   ` [PATCH v5 02/10] Documentation/glossary: clarify limitations of pseudorefs Patrick Steinhardt
@ 2024-05-15  6:50   ` Patrick Steinhardt
  2024-05-15  6:50   ` [PATCH v5 04/10] refs: rename `is_pseudoref()` to `is_root_ref()` Patrick Steinhardt
                     ` (6 subsequent siblings)
  9 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-15  6:50 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk, Jean-Noël AVILA

[-- Attachment #1: Type: text/plain, Size: 2738 bytes --]

Except for the pseudorefs MERGE_HEAD and FETCH_HEAD, all refs that live
in the root of the ref hierarchy behave the exact same as normal refs.
They can be symbolic refs or direct refs and can be read, iterated over
and written via normal tooling. All of these refs are stored in the ref
backends, which further demonstrates that they are just normal refs.

Extend the definition of "ref" to also cover such root refs. The only
additional restriction for root refs is that they must conform to a
specific naming schema.
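
For illustration only, the naming schema from the glossary text in this
patch could be checked roughly as follows. This is a sketch, not code from
the series: `glossary_root_name_ok()` and `has_suffix()` are hypothetical
names, and the check looks at the name alone, so it makes no statement about
whether a ref exists or whether it is a pseudoref.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Irregular root refs as listed in the glossary text of this patch. */
static const char *const irregular_names[] = {
	"AUTO_MERGE",
	"BISECT_EXPECTED_REV",
	"NOTES_MERGE_PARTIAL",
	"NOTES_MERGE_REF",
	"MERGE_AUTOSTASH",
};

/* Hypothetical helper: does `name` end with `suffix`? */
static int has_suffix(const char *name, const char *suffix)
{
	size_t n = strlen(name), s = strlen(suffix);

	return n >= s && !strcmp(name + n - s, suffix);
}

/*
 * Hypothetical helper, not part of Git: return 1 if `name` is acceptable
 * as a root ref name under the glossary rules, 0 otherwise.
 */
int glossary_root_name_ok(const char *name)
{
	const char *c;
	size_t i;

	assert(name);

	/* The exhaustive list of historic exceptions is accepted as-is. */
	for (i = 0; i < sizeof(irregular_names) / sizeof(*irregular_names); i++)
		if (!strcmp(name, irregular_names[i]))
			return 1;

	/* Rule 1: only upper-case characters or underscores. */
	for (c = name; *c; c++)
		if (!((*c >= 'A' && *c <= 'Z') || *c == '_'))
			return 0;

	/* Rule 2: the name ends with "_HEAD" or is equal to "HEAD". */
	return !strcmp(name, "HEAD") || has_suffix(name, "_HEAD");
}
```

With this sketch, "HEAD", "ORIG_HEAD" and "AUTO_MERGE" are accepted, while
"refs/heads/main", "head" and "FOO" are rejected.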

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/glossary-content.txt | 32 +++++++++++++++++++++++-------
 1 file changed, 25 insertions(+), 7 deletions(-)

diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index d8c04b37be..c434387186 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -550,20 +550,38 @@ The following pseudorefs are known to Git:
 	to the result.
 
 [[def_ref]]ref::
-	A name that begins with `refs/` (e.g. `refs/heads/master`)
-	that points to an <<def_object_name,object name>> or another
-	ref (the latter is called a <<def_symref,symbolic ref>>).
+	A name that points to an <<def_object_name,object name>> or
+	another ref (the latter is called a <<def_symref,symbolic ref>>).
 	For convenience, a ref can sometimes be abbreviated when used
 	as an argument to a Git command; see linkgit:gitrevisions[7]
 	for details.
 	Refs are stored in the <<def_repository,repository>>.
 +
 The ref namespace is hierarchical.
-Different subhierarchies are used for different purposes (e.g. the
-`refs/heads/` hierarchy is used to represent local branches).
+Ref names must either start with `refs/` or be located in the root of
+the hierarchy. For the latter, their name must follow these rules:
 +
-There are a few special-purpose refs that do not begin with `refs/`.
-The most notable example is `HEAD`.
+ - The name consists of only upper-case characters or underscores.
+
+ - The name ends with "`_HEAD`" or is equal to "`HEAD`".
++
+There are some irregular refs in the root of the hierarchy that do not
+match these rules. The following list is exhaustive and shall not be
+extended in the future:
++
+ - `AUTO_MERGE`
+
+ - `BISECT_EXPECTED_REV`
+
+ - `NOTES_MERGE_PARTIAL`
+
+ - `NOTES_MERGE_REF`
+
+ - `MERGE_AUTOSTASH`
++
+Different subhierarchies are used for different purposes. For example,
+the `refs/heads/` hierarchy is used to represent local branches whereas
+the `refs/tags/` hierarchy is used to represent local tags.
 
 [[def_reflog]]reflog::
 	A reflog shows the local "history" of a ref.  In other words,
-- 
2.45.GIT



* [PATCH v5 04/10] refs: rename `is_pseudoref()` to `is_root_ref()`
  2024-05-15  6:50 ` [PATCH v5 " Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2024-05-15  6:50   ` [PATCH v5 03/10] Documentation/glossary: define root refs as refs Patrick Steinhardt
@ 2024-05-15  6:50   ` Patrick Steinhardt
  2024-05-15  6:50   ` [PATCH v5 05/10] refs: rename `is_special_ref()` to `is_pseudo_ref()` Patrick Steinhardt
                     ` (5 subsequent siblings)
  9 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-15  6:50 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk, Jean-Noël AVILA

[-- Attachment #1: Type: text/plain, Size: 5153 bytes --]

Rename `is_pseudoref()` to `is_root_ref()` to adapt to the newly defined
terminology in our gitglossary(7).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 ref-filter.c            |  2 +-
 refs.c                  | 18 +++++++-----------
 refs.h                  | 28 +++++++++++++++++++++++++++-
 refs/files-backend.c    |  2 +-
 refs/reftable-backend.c |  2 +-
 5 files changed, 37 insertions(+), 15 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 59ad6f54dd..361beb6619 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2756,7 +2756,7 @@ static int ref_kind_from_refname(const char *refname)
 			return ref_kind[i].kind;
 	}
 
-	if (is_pseudoref(get_main_ref_store(the_repository), refname))
+	if (is_root_ref(get_main_ref_store(the_repository), refname))
 		return FILTER_REFS_PSEUDOREFS;
 
 	return FILTER_REFS_OTHERS;
diff --git a/refs.c b/refs.c
index 55d2e0b2cb..434c4da7ce 100644
--- a/refs.c
+++ b/refs.c
@@ -844,7 +844,7 @@ int is_per_worktree_ref(const char *refname)
 	       starts_with(refname, "refs/rewritten/");
 }
 
-static int is_pseudoref_syntax(const char *refname)
+static int is_root_ref_syntax(const char *refname)
 {
 	const char *c;
 
@@ -853,16 +853,12 @@ static int is_pseudoref_syntax(const char *refname)
 			return 0;
 	}
 
-	/*
-	 * HEAD is not a pseudoref, but it certainly uses the
-	 * pseudoref syntax.
-	 */
 	return 1;
 }
 
-int is_pseudoref(struct ref_store *refs, const char *refname)
+int is_root_ref(struct ref_store *refs, const char *refname)
 {
-	static const char *const irregular_pseudorefs[] = {
+	static const char *const irregular_root_refs[] = {
 		"AUTO_MERGE",
 		"BISECT_EXPECTED_REV",
 		"NOTES_MERGE_PARTIAL",
@@ -872,7 +868,7 @@ int is_pseudoref(struct ref_store *refs, const char *refname)
 	struct object_id oid;
 	size_t i;
 
-	if (!is_pseudoref_syntax(refname))
+	if (!is_root_ref_syntax(refname))
 		return 0;
 
 	if (ends_with(refname, "_HEAD")) {
@@ -882,8 +878,8 @@ int is_pseudoref(struct ref_store *refs, const char *refname)
 		return !is_null_oid(&oid);
 	}
 
-	for (i = 0; i < ARRAY_SIZE(irregular_pseudorefs); i++)
-		if (!strcmp(refname, irregular_pseudorefs[i])) {
+	for (i = 0; i < ARRAY_SIZE(irregular_root_refs); i++)
+		if (!strcmp(refname, irregular_root_refs[i])) {
 			refs_resolve_ref_unsafe(refs, refname,
 						RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
 						&oid, NULL);
@@ -902,7 +898,7 @@ int is_headref(struct ref_store *refs, const char *refname)
 }
 
 static int is_current_worktree_ref(const char *ref) {
-	return is_pseudoref_syntax(ref) || is_per_worktree_ref(ref);
+	return is_root_ref_syntax(ref) || is_per_worktree_ref(ref);
 }
 
 enum ref_worktree_type parse_worktree_ref(const char *maybe_worktree_ref,
diff --git a/refs.h b/refs.h
index d278775e08..d0374c3275 100644
--- a/refs.h
+++ b/refs.h
@@ -1051,7 +1051,33 @@ extern struct ref_namespace_info ref_namespace[NAMESPACE__COUNT];
  */
 void update_ref_namespace(enum ref_namespace namespace, char *ref);
 
-int is_pseudoref(struct ref_store *refs, const char *refname);
+/*
+ * Check whether the reference is an existing root reference.
+ *
+ * A root ref is a reference that lives in the root of the reference hierarchy.
+ * These references must conform to special syntax:
+ *
+ *   - Their name must be all-uppercase or underscores ("_").
+ *
+ *   - Their name must end with "_HEAD".
+ *
+ *   - Their name may not contain a slash.
+ *
+ * There is a special set of irregular root refs that exist due to historic
+ * reasons, only. This list shall not be expanded in the future:
+ *
+ *   - AUTO_MERGE
+ *
+ *   - BISECT_EXPECTED_REV
+ *
+ *   - NOTES_MERGE_PARTIAL
+ *
+ *   - NOTES_MERGE_REF
+ *
+ *   - MERGE_AUTOSTASH
+ */
+int is_root_ref(struct ref_store *refs, const char *refname);
+
 int is_headref(struct ref_store *refs, const char *refname);
 
 #endif /* REFS_H */
diff --git a/refs/files-backend.c b/refs/files-backend.c
index a098d14ea0..0fcb601444 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -351,7 +351,7 @@ static void add_pseudoref_and_head_entries(struct ref_store *ref_store,
 		strbuf_addstr(&refname, de->d_name);
 
 		dtype = get_dtype(de, &path, 1);
-		if (dtype == DT_REG && (is_pseudoref(ref_store, de->d_name) ||
+		if (dtype == DT_REG && (is_root_ref(ref_store, de->d_name) ||
 								is_headref(ref_store, de->d_name)))
 			loose_fill_ref_dir_regular_file(refs, refname.buf, dir);
 
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 010ef811b6..36ab3357a7 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -354,7 +354,7 @@ static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator)
 		 */
 		if (!starts_with(iter->ref.refname, "refs/") &&
 		    !(iter->flags & DO_FOR_EACH_INCLUDE_ROOT_REFS &&
-		     (is_pseudoref(&iter->refs->base, iter->ref.refname) ||
+		     (is_root_ref(&iter->refs->base, iter->ref.refname) ||
 		      is_headref(&iter->refs->base, iter->ref.refname)))) {
 			continue;
 		}
-- 
2.45.GIT



* [PATCH v5 05/10] refs: rename `is_special_ref()` to `is_pseudo_ref()`
  2024-05-15  6:50 ` [PATCH v5 " Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2024-05-15  6:50   ` [PATCH v5 04/10] refs: rename `is_pseudoref()` to `is_root_ref()` Patrick Steinhardt
@ 2024-05-15  6:50   ` Patrick Steinhardt
  2024-05-15  6:50   ` [PATCH v5 06/10] refs: do not check ref existence in `is_root_ref()` Patrick Steinhardt
                     ` (4 subsequent siblings)
  9 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-15  6:50 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk, Jean-Noël AVILA

[-- Attachment #1: Type: text/plain, Size: 2588 bytes --]

Rename `is_special_ref()` to `is_pseudo_ref()` to adapt to the newly
defined terminology in our gitglossary(7). Note that in the preceding
commit we have just renamed `is_pseudoref()` to `is_root_ref()`, where
there may be confusion for in-flight patch series that add new calls to
`is_pseudoref()`. In order to intentionally break such patch series we
have thus picked `is_pseudo_ref()` instead of `is_pseudoref()` as the
new name.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/refs.c b/refs.c
index 434c4da7ce..c1c406fc5f 100644
--- a/refs.c
+++ b/refs.c
@@ -1872,13 +1872,13 @@ static int refs_read_special_head(struct ref_store *ref_store,
 	return result;
 }
 
-static int is_special_ref(const char *refname)
+static int is_pseudo_ref(const char *refname)
 {
 	/*
-	 * Special references are refs that have different semantics compared
-	 * to "normal" refs. These refs can thus not be stored in the ref
-	 * backend, but must always be accessed via the filesystem. The
-	 * following refs are special:
+	 * Pseudorefs are refs that have different semantics compared to
+	 * "normal" refs. These refs can thus not be stored in the ref backend,
+	 * but must always be accessed via the filesystem. The following refs
+	 * are pseudorefs:
 	 *
 	 * - FETCH_HEAD may contain multiple object IDs, and each one of them
 	 *   carries additional metadata like where it came from.
@@ -1887,17 +1887,17 @@ static int is_special_ref(const char *refname)
 	 *   heads.
 	 *
 	 * Reading, writing or deleting references must consistently go either
-	 * through the filesystem (special refs) or through the reference
+	 * through the filesystem (pseudorefs) or through the reference
 	 * backend (normal ones).
 	 */
-	static const char * const special_refs[] = {
+	static const char * const pseudo_refs[] = {
 		"FETCH_HEAD",
 		"MERGE_HEAD",
 	};
 	size_t i;
 
-	for (i = 0; i < ARRAY_SIZE(special_refs); i++)
-		if (!strcmp(refname, special_refs[i]))
+	for (i = 0; i < ARRAY_SIZE(pseudo_refs); i++)
+		if (!strcmp(refname, pseudo_refs[i]))
 			return 1;
 
 	return 0;
@@ -1908,7 +1908,7 @@ int refs_read_raw_ref(struct ref_store *ref_store, const char *refname,
 		      unsigned int *type, int *failure_errno)
 {
 	assert(failure_errno);
-	if (is_special_ref(refname))
+	if (is_pseudo_ref(refname))
 		return refs_read_special_head(ref_store, refname, oid, referent,
 					      type, failure_errno);
 
-- 
2.45.GIT



* [PATCH v5 06/10] refs: do not check ref existence in `is_root_ref()`
  2024-05-15  6:50 ` [PATCH v5 " Patrick Steinhardt
                     ` (4 preceding siblings ...)
  2024-05-15  6:50   ` [PATCH v5 05/10] refs: rename `is_special_ref()` to `is_pseudo_ref()` Patrick Steinhardt
@ 2024-05-15  6:50   ` Patrick Steinhardt
  2024-05-15 20:38     ` Justin Tobler
  2024-05-15  6:50   ` [PATCH v5 07/10] refs: classify HEAD as a root ref Patrick Steinhardt
                     ` (3 subsequent siblings)
  9 siblings, 1 reply; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-15  6:50 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk, Jean-Noël AVILA

[-- Attachment #1: Type: text/plain, Size: 8055 bytes --]

Before this patch series, root refs except for "HEAD" and our special
refs were classified as pseudorefs. Furthermore, our terminology
clarified that pseudorefs must not be symbolic refs. This restriction
is enforced in `is_root_ref()`, which explicitly checks that a supposed
root ref resolves to an object ID without recursing.

This has been extremely confusing right from the start because (in old
terminology) a ref name may sometimes be a pseudoref and sometimes not
depending on whether it is a symbolic or regular ref. This behaviour
does not seem reasonable at all and I very much doubt that it results in
anything sane.

Last but not least, the current behaviour can actually lead to a
segfault when calling `is_root_ref()` with a reference that either does
not exist or that is a symbolic ref because we never initialized `oid`,
but then read it via `is_null_oid()`.
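
To make the hazard concrete, here is a self-contained sketch with stand-in
types. Everything prefixed with `demo_` is hypothetical and not Git's API;
in particular, the assumption that the resolver leaves its output untouched
on failure is taken from the description above, not verified against
`refs_resolve_ref_unsafe()`. The sketch shows the defensive pattern of
zeroing the output and honouring the return value:

```c
#include <assert.h>
#include <string.h>

/* Stand-in for an object ID; not Git's `struct object_id`. */
struct demo_oid {
	unsigned char hash[20];
};

static int demo_is_null_oid(const struct demo_oid *oid)
{
	static const struct demo_oid null_oid = { {0} };

	return !memcmp(oid->hash, null_oid.hash, sizeof(oid->hash));
}

/*
 * Stand-in resolver: fills `*oid` and returns 0 for the one name it
 * knows, and returns -1 WITHOUT touching `*oid` for anything else.
 */
static int demo_resolve(const char *name, struct demo_oid *oid)
{
	assert(name && oid);

	if (strcmp(name, "ORIG_HEAD"))
		return -1;
	memset(oid->hash, 0xab, sizeof(oid->hash));
	return 0;
}

/*
 * Defensive pattern: give the output a defined value up front and only
 * inspect it when the resolver reports success. Reading `oid` after a
 * failed call without the memset() would read indeterminate memory.
 */
int demo_ref_exists(const char *name)
{
	struct demo_oid oid;

	memset(&oid, 0, sizeof(oid));
	if (demo_resolve(name, &oid) < 0)
		return 0;
	return !demo_is_null_oid(&oid);
}
```

Note that this patch goes a different way and drops the lookup entirely,
which removes the problem at its root.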

We have now changed terminology to clarify that pseudorefs are really
only "MERGE_HEAD" and "FETCH_HEAD", whereas all the other refs that live
in the root of the ref hierarchy are just plain refs. Thus, we do not
need to check whether the ref is symbolic or not. In fact, we can now
avoid looking up the ref completely as the name is sufficient for us to
figure out whether something would be a root ref or not.

This change of course changes semantics for our callers. As there are
only three of them we can assess each of them individually:

  - "ref-filter.c:ref_kind_from_refname()" uses it to classify refs.
    It's clear that the intent is to classify based on the ref name,
    only.

  - "refs/reftable-backend.c:reftable_ref_iterator_advance()" uses it to
    filter root refs. Again, using existence checks is pointless here as
    the iterator has just surfaced the ref, so we know it does exist.

  - "refs/files-backend.c:add_pseudoref_and_head_entries()" uses it to
    determine whether it should add a ref to the root directory of its
    iterator. This had the effect that we skipped over any files that
    are either a symbolic ref, or which are not a ref at all.

    The new behaviour is to include symbolic refs now, which aligns us
    with the adapted terminology. Furthermore, files which look like
    root refs but aren't are now marked as "broken". As broken refs
    are not surfaced by our tooling, this should not lead to a change in
    user-visible behaviour, but may cause us to emit warnings. This
    feels like the right thing to do as we would otherwise just silently
    ignore corrupted root refs completely.

So in all cases the existence check was either superfluous, not in line
with the adapted terminology or masked potential issues. This commit
thus changes the behaviour as proposed and drops the existence check
altogether.

Add a test that verifies that this does not change user-visible
behaviour. Namely, we still don't want to show broken refs to the user
by default in git-for-each-ref(1). What this does allow though is for
internal callers to surface dangling root refs when they pass in the
`DO_FOR_EACH_INCLUDE_BROKEN` flag.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 ref-filter.c                   |  2 +-
 refs.c                         | 19 +++++--------------
 refs.h                         |  5 +++--
 refs/files-backend.c           |  4 ++--
 refs/reftable-backend.c        |  2 +-
 t/t6302-for-each-ref-filter.sh | 17 +++++++++++++++++
 6 files changed, 29 insertions(+), 20 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 361beb6619..23e81e3e04 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2756,7 +2756,7 @@ static int ref_kind_from_refname(const char *refname)
 			return ref_kind[i].kind;
 	}
 
-	if (is_root_ref(get_main_ref_store(the_repository), refname))
+	if (is_root_ref(refname))
 		return FILTER_REFS_PSEUDOREFS;
 
 	return FILTER_REFS_OTHERS;
diff --git a/refs.c b/refs.c
index c1c406fc5f..4fec29e660 100644
--- a/refs.c
+++ b/refs.c
@@ -856,7 +856,7 @@ static int is_root_ref_syntax(const char *refname)
 	return 1;
 }
 
-int is_root_ref(struct ref_store *refs, const char *refname)
+int is_root_ref(const char *refname)
 {
 	static const char *const irregular_root_refs[] = {
 		"AUTO_MERGE",
@@ -865,26 +865,17 @@ int is_root_ref(struct ref_store *refs, const char *refname)
 		"NOTES_MERGE_REF",
 		"MERGE_AUTOSTASH",
 	};
-	struct object_id oid;
 	size_t i;
 
 	if (!is_root_ref_syntax(refname))
 		return 0;
 
-	if (ends_with(refname, "_HEAD")) {
-		refs_resolve_ref_unsafe(refs, refname,
-					RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
-					&oid, NULL);
-		return !is_null_oid(&oid);
-	}
+	if (ends_with(refname, "_HEAD"))
+		return 1;
 
 	for (i = 0; i < ARRAY_SIZE(irregular_root_refs); i++)
-		if (!strcmp(refname, irregular_root_refs[i])) {
-			refs_resolve_ref_unsafe(refs, refname,
-						RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
-						&oid, NULL);
-			return !is_null_oid(&oid);
-		}
+		if (!strcmp(refname, irregular_root_refs[i]))
+			return 1;
 
 	return 0;
 }
diff --git a/refs.h b/refs.h
index d0374c3275..8a574a22c7 100644
--- a/refs.h
+++ b/refs.h
@@ -1052,7 +1052,8 @@ extern struct ref_namespace_info ref_namespace[NAMESPACE__COUNT];
 void update_ref_namespace(enum ref_namespace namespace, char *ref);
 
 /*
- * Check whether the reference is an existing root reference.
+ * Check whether the provided name names a root reference. This function only
+ * performs a syntactic check.
  *
  * A root ref is a reference that lives in the root of the reference hierarchy.
  * These references must conform to special syntax:
@@ -1076,7 +1077,7 @@ void update_ref_namespace(enum ref_namespace namespace, char *ref);
  *
  *   - MERGE_AUTOSTASH
  */
-int is_root_ref(struct ref_store *refs, const char *refname);
+int is_root_ref(const char *refname);
 
 int is_headref(struct ref_store *refs, const char *refname);
 
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 0fcb601444..06240ce327 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -351,8 +351,8 @@ static void add_pseudoref_and_head_entries(struct ref_store *ref_store,
 		strbuf_addstr(&refname, de->d_name);
 
 		dtype = get_dtype(de, &path, 1);
-		if (dtype == DT_REG && (is_root_ref(ref_store, de->d_name) ||
-								is_headref(ref_store, de->d_name)))
+		if (dtype == DT_REG && (is_root_ref(de->d_name) ||
+					is_headref(ref_store, de->d_name)))
 			loose_fill_ref_dir_regular_file(refs, refname.buf, dir);
 
 		strbuf_setlen(&refname, dirnamelen);
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 36ab3357a7..bc927ef17b 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -354,7 +354,7 @@ static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator)
 		 */
 		if (!starts_with(iter->ref.refname, "refs/") &&
 		    !(iter->flags & DO_FOR_EACH_INCLUDE_ROOT_REFS &&
-		     (is_root_ref(&iter->refs->base, iter->ref.refname) ||
+		     (is_root_ref(iter->ref.refname) ||
 		      is_headref(&iter->refs->base, iter->ref.refname)))) {
 			continue;
 		}
diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh
index 948f1bb5f4..92ed8957c8 100755
--- a/t/t6302-for-each-ref-filter.sh
+++ b/t/t6302-for-each-ref-filter.sh
@@ -62,6 +62,23 @@ test_expect_success '--include-root-refs with other patterns' '
 	test_cmp expect actual
 '
 
+test_expect_success '--include-root-refs omits dangling symrefs' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		git symbolic-ref DANGLING_HEAD refs/heads/missing &&
+		cat >expect <<-EOF &&
+		HEAD
+		$(git symbolic-ref HEAD)
+		refs/tags/initial
+		EOF
+		git for-each-ref --format="%(refname)" --include-root-refs >actual &&
+		test_cmp expect actual
+	)
+'
+
 test_expect_success 'filtering with --points-at' '
 	cat >expect <<-\EOF &&
 	refs/heads/main
-- 
2.45.GIT



* [PATCH v5 07/10] refs: classify HEAD as a root ref
  2024-05-15  6:50 ` [PATCH v5 " Patrick Steinhardt
                     ` (5 preceding siblings ...)
  2024-05-15  6:50   ` [PATCH v5 06/10] refs: do not check ref existence in `is_root_ref()` Patrick Steinhardt
@ 2024-05-15  6:50   ` Patrick Steinhardt
  2024-05-15 20:44     ` Justin Tobler
  2024-05-15  6:51   ` [PATCH v5 08/10] refs: pseudorefs are no refs Patrick Steinhardt
                     ` (2 subsequent siblings)
  9 siblings, 1 reply; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-15  6:50 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk, Jean-Noël AVILA

[-- Attachment #1: Type: text/plain, Size: 4093 bytes --]

Root refs are those refs that live in the root of the ref hierarchy.
Our old and venerable "HEAD" reference falls into this category, but we
don't yet classify it as such in `is_root_ref()`.

Adapt the function to also treat "HEAD" as a root ref. This change is
safe to do for all current callers:

  - `ref_kind_from_refname()` already handles "HEAD" explicitly before
    calling `is_root_ref()`.

  - The "files" and "reftable" backends explicitly call both
    `is_root_ref()` and `is_headref()` together.

This also aligns behaviour of `is_root_ref()` and `is_headref()` such
that we stop checking for ref existence. This changes semantics for our
backends:

  - In the reftable backend we already know that the ref must exist
    because `is_headref()` is called as part of the ref iterator. The
    existence check is thus redundant, and the change is safe to do.

  - In the files backend we use it when populating root refs, where we
    would skip adding the "HEAD" file if it was not possible to resolve
    it. The new behaviour is to instead mark "HEAD" as broken, which
    will cause us to emit warnings in various places.

As there are no callers of `is_headref()` left after the refactoring, we
can absorb it completely into `is_root_ref()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                  | 9 +--------
 refs.h                  | 5 ++---
 refs/files-backend.c    | 3 +--
 refs/reftable-backend.c | 3 +--
 4 files changed, 5 insertions(+), 15 deletions(-)

diff --git a/refs.c b/refs.c
index 4fec29e660..9fb1061d52 100644
--- a/refs.c
+++ b/refs.c
@@ -859,6 +859,7 @@ static int is_root_ref_syntax(const char *refname)
 int is_root_ref(const char *refname)
 {
 	static const char *const irregular_root_refs[] = {
+		"HEAD",
 		"AUTO_MERGE",
 		"BISECT_EXPECTED_REV",
 		"NOTES_MERGE_PARTIAL",
@@ -880,14 +881,6 @@ int is_root_ref(const char *refname)
 	return 0;
 }
 
-int is_headref(struct ref_store *refs, const char *refname)
-{
-	if (!strcmp(refname, "HEAD"))
-		return refs_ref_exists(refs, refname);
-
-	return 0;
-}
-
 static int is_current_worktree_ref(const char *ref) {
 	return is_root_ref_syntax(ref) || is_per_worktree_ref(ref);
 }
diff --git a/refs.h b/refs.h
index 8a574a22c7..8489b45265 100644
--- a/refs.h
+++ b/refs.h
@@ -1060,7 +1060,8 @@ void update_ref_namespace(enum ref_namespace namespace, char *ref);
  *
  *   - Their name must be all-uppercase or underscores ("_").
  *
- *   - Their name must end with "_HEAD".
+ *   - Their name must end with "_HEAD". As a special rule, "HEAD" is a root
+ *     ref, as well.
  *
  *   - Their name may not contain a slash.
  *
@@ -1079,6 +1080,4 @@ void update_ref_namespace(enum ref_namespace namespace, char *ref);
  */
 int is_root_ref(const char *refname);
 
-int is_headref(struct ref_store *refs, const char *refname);
-
 #endif /* REFS_H */
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 06240ce327..6f9a631592 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -351,8 +351,7 @@ static void add_pseudoref_and_head_entries(struct ref_store *ref_store,
 		strbuf_addstr(&refname, de->d_name);
 
 		dtype = get_dtype(de, &path, 1);
-		if (dtype == DT_REG && (is_root_ref(de->d_name) ||
-					is_headref(ref_store, de->d_name)))
+		if (dtype == DT_REG && is_root_ref(de->d_name))
 			loose_fill_ref_dir_regular_file(refs, refname.buf, dir);
 
 		strbuf_setlen(&refname, dirnamelen);
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index bc927ef17b..821acd461a 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -354,8 +354,7 @@ static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator)
 		 */
 		if (!starts_with(iter->ref.refname, "refs/") &&
 		    !(iter->flags & DO_FOR_EACH_INCLUDE_ROOT_REFS &&
-		     (is_root_ref(iter->ref.refname) ||
-		      is_headref(&iter->refs->base, iter->ref.refname)))) {
+		      is_root_ref(iter->ref.refname))) {
 			continue;
 		}
 
-- 
2.45.GIT



* [PATCH v5 08/10] refs: pseudorefs are no refs
  2024-05-15  6:50 ` [PATCH v5 " Patrick Steinhardt
                     ` (6 preceding siblings ...)
  2024-05-15  6:50   ` [PATCH v5 07/10] refs: classify HEAD as a root ref Patrick Steinhardt
@ 2024-05-15  6:51   ` Patrick Steinhardt
  2024-05-15  6:51   ` [PATCH v5 09/10] ref-filter: properly distinguish pseudo and root refs Patrick Steinhardt
  2024-05-15  6:51   ` [PATCH v5 10/10] refs: refuse to write pseudorefs Patrick Steinhardt
  9 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-15  6:51 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk, Jean-Noël AVILA

[-- Attachment #1: Type: text/plain, Size: 4312 bytes --]

The `is_root_ref()` function will happily classify a pseudoref as a root
ref, even though pseudorefs are no refs. Next to being wrong, it also
leads to inconsistent behaviour across ref backends: while the "files"
backend accidentally knows to parse those pseudorefs and thus yields
them to the caller, the "reftable" backend won't ever see the pseudoref
at all because they are never stored in the "reftable" backend.

Fix this issue by filtering out pseudorefs in `is_root_ref()`.
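
As a rough, name-only sketch of the resulting classification (the `demo_`
names and the enum are hypothetical and exist only for illustration, and
the character check is a simplification of the syntax rules documented in
refs.h), the ordering of the checks might look like this:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum demo_kind {
	DEMO_REGULAR_REF,	/* starts with "refs/" */
	DEMO_PSEUDOREF,		/* FETCH_HEAD or MERGE_HEAD */
	DEMO_ROOT_REF,		/* HEAD, *_HEAD or an irregular root ref */
	DEMO_OTHER
};

static int demo_in_list(const char *name, const char *const *list, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (!strcmp(name, list[i]))
			return 1;
	return 0;
}

/* Classify a name purely by its spelling; nothing is looked up. */
enum demo_kind demo_classify(const char *name)
{
	static const char *const pseudo[] = { "FETCH_HEAD", "MERGE_HEAD" };
	static const char *const irregular[] = {
		"HEAD", "AUTO_MERGE", "BISECT_EXPECTED_REV",
		"NOTES_MERGE_PARTIAL", "NOTES_MERGE_REF", "MERGE_AUTOSTASH",
	};
	const char *c;
	size_t len;

	assert(name);

	if (!strncmp(name, "refs/", 5))
		return DEMO_REGULAR_REF;

	/* Pseudorefs are filtered out before any root ref check. */
	if (demo_in_list(name, pseudo, sizeof(pseudo) / sizeof(*pseudo)))
		return DEMO_PSEUDOREF;

	/* Root ref syntax: upper-case characters and underscores only. */
	for (c = name; *c; c++)
		if (!((*c >= 'A' && *c <= 'Z') || *c == '_'))
			return DEMO_OTHER;

	len = strlen(name);
	if (len >= 5 && !strcmp(name + len - 5, "_HEAD"))
		return DEMO_ROOT_REF;
	if (demo_in_list(name, irregular, sizeof(irregular) / sizeof(*irregular)))
		return DEMO_ROOT_REF;

	return DEMO_OTHER;
}
```

The point of the ordering is that "FETCH_HEAD" and "MERGE_HEAD" would
otherwise match the "_HEAD" suffix rule and be reported as root refs.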

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                         | 65 +++++++++++++++++-----------------
 t/t6302-for-each-ref-filter.sh | 17 +++++++++
 2 files changed, 50 insertions(+), 32 deletions(-)

diff --git a/refs.c b/refs.c
index 9fb1061d52..2074281a0e 100644
--- a/refs.c
+++ b/refs.c
@@ -844,6 +844,37 @@ int is_per_worktree_ref(const char *refname)
 	       starts_with(refname, "refs/rewritten/");
 }
 
+static int is_pseudo_ref(const char *refname)
+{
+	/*
+	 * Pseudorefs are refs that have different semantics compared to
+	 * "normal" refs. These refs can thus not be stored in the ref backend,
+	 * but must always be accessed via the filesystem. The following refs
+	 * are pseudorefs:
+	 *
+	 * - FETCH_HEAD may contain multiple object IDs, and each one of them
+	 *   carries additional metadata like where it came from.
+	 *
+	 * - MERGE_HEAD may contain multiple object IDs when merging multiple
+	 *   heads.
+	 *
+	 * Reading, writing or deleting references must consistently go either
+	 * through the filesystem (pseudorefs) or through the reference
+	 * backend (normal ones).
+	 */
+	static const char * const pseudo_refs[] = {
+		"FETCH_HEAD",
+		"MERGE_HEAD",
+	};
+	size_t i;
+
+	for (i = 0; i < ARRAY_SIZE(pseudo_refs); i++)
+		if (!strcmp(refname, pseudo_refs[i]))
+			return 1;
+
+	return 0;
+}
+
 static int is_root_ref_syntax(const char *refname)
 {
 	const char *c;
@@ -868,7 +899,8 @@ int is_root_ref(const char *refname)
 	};
 	size_t i;
 
-	if (!is_root_ref_syntax(refname))
+	if (!is_root_ref_syntax(refname) ||
+	    is_pseudo_ref(refname))
 		return 0;
 
 	if (ends_with(refname, "_HEAD"))
@@ -1856,37 +1888,6 @@ static int refs_read_special_head(struct ref_store *ref_store,
 	return result;
 }
 
-static int is_pseudo_ref(const char *refname)
-{
-	/*
-	 * Pseudorefs are refs that have different semantics compared to
-	 * "normal" refs. These refs can thus not be stored in the ref backend,
-	 * but must always be accessed via the filesystem. The following refs
-	 * are pseudorefs:
-	 *
-	 * - FETCH_HEAD may contain multiple object IDs, and each one of them
-	 *   carries additional metadata like where it came from.
-	 *
-	 * - MERGE_HEAD may contain multiple object IDs when merging multiple
-	 *   heads.
-	 *
-	 * Reading, writing or deleting references must consistently go either
-	 * through the filesystem (pseudorefs) or through the reference
-	 * backend (normal ones).
-	 */
-	static const char * const pseudo_refs[] = {
-		"FETCH_HEAD",
-		"MERGE_HEAD",
-	};
-	size_t i;
-
-	for (i = 0; i < ARRAY_SIZE(pseudo_refs); i++)
-		if (!strcmp(refname, pseudo_refs[i]))
-			return 1;
-
-	return 0;
-}
-
 int refs_read_raw_ref(struct ref_store *ref_store, const char *refname,
 		      struct object_id *oid, struct strbuf *referent,
 		      unsigned int *type, int *failure_errno)
diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh
index 92ed8957c8..163c378cfd 100755
--- a/t/t6302-for-each-ref-filter.sh
+++ b/t/t6302-for-each-ref-filter.sh
@@ -52,6 +52,23 @@ test_expect_success '--include-root-refs pattern prints pseudorefs' '
 	test_cmp expect actual
 '
 
+test_expect_success '--include-root-refs pattern does not print special refs' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		git rev-parse HEAD >.git/MERGE_HEAD &&
+		git for-each-ref --format="%(refname)" --include-root-refs >actual &&
+		cat >expect <<-EOF &&
+		HEAD
+		$(git symbolic-ref HEAD)
+		refs/tags/initial
+		EOF
+		test_cmp expect actual
+	)
+'
+
 test_expect_success '--include-root-refs with other patterns' '
 	cat >expect <<-\EOF &&
 	HEAD
-- 
2.45.GIT



* [PATCH v5 09/10] ref-filter: properly distinuish pseudo and root refs
  2024-05-15  6:50 ` [PATCH v5 " Patrick Steinhardt
                     ` (7 preceding siblings ...)
  2024-05-15  6:51   ` [PATCH v5 08/10] refs: pseudorefs are no refs Patrick Steinhardt
@ 2024-05-15  6:51   ` Patrick Steinhardt
  2024-05-15  6:51   ` [PATCH v5 10/10] refs: refuse to write pseudorefs Patrick Steinhardt
  9 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-15  6:51 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk, Jean-Noël AVILA

The ref-filter interfaces currently define root refs as either a
detached HEAD or a pseudo ref. Pseudo refs aren't root refs though, so
let's properly distinguish those ref types.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
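Not part of the patch: a small standalone sketch of why the checks move
from `== FILTER_REFS_KIND_MASK` to `& FILTER_REFS_ROOT_REFS`. The bit
values for DETACHED_HEAD, PSEUDOREFS and ROOT_REFS come from the
ref-filter.h hunk below. The values of the four regular kinds, the
assumption that git-for-each-ref(1) starts from FILTER_REFS_REGULAR,
and both `sketch_*` helpers are illustrative only.

```c
/* Assumed values for the regular kinds; not shown in this patch. */
#define FILTER_REFS_TAGS           0x0002
#define FILTER_REFS_BRANCHES       0x0004
#define FILTER_REFS_REMOTES        0x0008
#define FILTER_REFS_OTHERS         0x0010
#define FILTER_REFS_REGULAR        (FILTER_REFS_TAGS | FILTER_REFS_BRANCHES | \
				    FILTER_REFS_REMOTES | FILTER_REFS_OTHERS)
/* Values as in the ref-filter.h hunk of this patch. */
#define FILTER_REFS_DETACHED_HEAD  0x0020
#define FILTER_REFS_PSEUDOREFS     0x0040
#define FILTER_REFS_ROOT_REFS      0x0080
#define FILTER_REFS_KIND_MASK      (FILTER_REFS_REGULAR | FILTER_REFS_DETACHED_HEAD | \
				    FILTER_REFS_PSEUDOREFS | FILTER_REFS_ROOT_REFS)

/* Hypothetical helper: flags as built for --include-root-refs. */
static unsigned int sketch_for_each_ref_flags(int include_root_refs)
{
	unsigned int flags = FILTER_REFS_REGULAR; /* assumed starting point */

	if (include_root_refs)
		flags |= FILTER_REFS_ROOT_REFS | FILTER_REFS_DETACHED_HEAD;
	return flags;
}

/*
 * Hypothetical helper modelled on the kind masking in apply_ref_filter().
 * Returns the kind to report, or 0 when the ref is filtered out.
 */
static unsigned int sketch_effective_kind(unsigned int filter_kind,
					  unsigned int kind)
{
	if ((filter_kind & FILTER_REFS_ROOT_REFS) &&
	    kind == FILTER_REFS_DETACHED_HEAD)
		return FILTER_REFS_ROOT_REFS;
	if (!(kind & filter_kind))
		return 0;
	return kind;
}
```

Under these assumptions the flags for --include-root-refs no longer
contain the PSEUDOREFS bit, so they differ from FILTER_REFS_KIND_MASK
and an equality test against the mask would not trigger. A detached
HEAD is reported as a root ref, and pseudorefs are filtered out.
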
 builtin/for-each-ref.c |  2 +-
 ref-filter.c           | 16 +++++++++-------
 ref-filter.h           |  4 ++--
 refs.c                 | 18 +-----------------
 refs.h                 | 18 ++++++++++++++++++
 5 files changed, 31 insertions(+), 27 deletions(-)

diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
index 919282e12a..5517a4a1c0 100644
--- a/builtin/for-each-ref.c
+++ b/builtin/for-each-ref.c
@@ -98,7 +98,7 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
 	}
 
 	if (include_root_refs)
-		flags |= FILTER_REFS_ROOT_REFS;
+		flags |= FILTER_REFS_ROOT_REFS | FILTER_REFS_DETACHED_HEAD;
 
 	filter.match_as_path = 1;
 	filter_and_format_refs(&filter, flags, sorting, &format);
diff --git a/ref-filter.c b/ref-filter.c
index 23e81e3e04..41f639bc2f 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2628,7 +2628,7 @@ static int for_each_fullref_in_pattern(struct ref_filter *filter,
 				       each_ref_fn cb,
 				       void *cb_data)
 {
-	if (filter->kind == FILTER_REFS_KIND_MASK) {
+	if (filter->kind & FILTER_REFS_ROOT_REFS) {
 		/* In this case, we want to print all refs including root refs. */
 		return refs_for_each_include_root_refs(get_main_ref_store(the_repository),
 						       cb, cb_data);
@@ -2756,8 +2756,10 @@ static int ref_kind_from_refname(const char *refname)
 			return ref_kind[i].kind;
 	}
 
-	if (is_root_ref(refname))
+	if (is_pseudo_ref(refname))
 		return FILTER_REFS_PSEUDOREFS;
+	if (is_root_ref(refname))
+		return FILTER_REFS_ROOT_REFS;
 
 	return FILTER_REFS_OTHERS;
 }
@@ -2794,11 +2796,11 @@ static struct ref_array_item *apply_ref_filter(const char *refname, const struct
 	/*
 	 * Generally HEAD refs are printed with special description denoting a rebase,
 	 * detached state and so forth. This is useful when only printing the HEAD ref
-	 * But when it is being printed along with other pseudorefs, it makes sense to
-	 * keep the formatting consistent. So we mask the type to act like a pseudoref.
+	 * But when it is being printed along with other root refs, it makes sense to
+	 * keep the formatting consistent. So we mask the type to act like a root ref.
 	 */
-	if (filter->kind == FILTER_REFS_KIND_MASK && kind == FILTER_REFS_DETACHED_HEAD)
-		kind = FILTER_REFS_PSEUDOREFS;
+	if (filter->kind & FILTER_REFS_ROOT_REFS && kind == FILTER_REFS_DETACHED_HEAD)
+		kind = FILTER_REFS_ROOT_REFS;
 	else if (!(kind & filter->kind))
 		return NULL;
 
@@ -3072,7 +3074,7 @@ static int do_filter_refs(struct ref_filter *filter, unsigned int type, each_ref
 		 * When printing all ref types, HEAD is already included,
 		 * so we don't want to print HEAD again.
 		 */
-		if (!ret && (filter->kind != FILTER_REFS_KIND_MASK) &&
+		if (!ret && !(filter->kind & FILTER_REFS_ROOT_REFS) &&
 		    (filter->kind & FILTER_REFS_DETACHED_HEAD))
 			head_ref(fn, cb_data);
 	}
diff --git a/ref-filter.h b/ref-filter.h
index 0ca28d2bba..27ae1aa0d1 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -23,9 +23,9 @@
 				    FILTER_REFS_REMOTES | FILTER_REFS_OTHERS)
 #define FILTER_REFS_DETACHED_HEAD  0x0020
 #define FILTER_REFS_PSEUDOREFS     0x0040
-#define FILTER_REFS_ROOT_REFS      (FILTER_REFS_DETACHED_HEAD | FILTER_REFS_PSEUDOREFS)
+#define FILTER_REFS_ROOT_REFS      0x0080
 #define FILTER_REFS_KIND_MASK      (FILTER_REFS_REGULAR | FILTER_REFS_DETACHED_HEAD | \
-				    FILTER_REFS_PSEUDOREFS)
+				    FILTER_REFS_PSEUDOREFS | FILTER_REFS_ROOT_REFS)
 
 struct atom_value;
 struct ref_sorting;
diff --git a/refs.c b/refs.c
index 2074281a0e..c13b8ff6d8 100644
--- a/refs.c
+++ b/refs.c
@@ -844,24 +844,8 @@ int is_per_worktree_ref(const char *refname)
 	       starts_with(refname, "refs/rewritten/");
 }
 
-static int is_pseudo_ref(const char *refname)
+int is_pseudo_ref(const char *refname)
 {
-	/*
-	 * Pseudorefs are refs that have different semantics compared to
-	 * "normal" refs. These refs can thus not be stored in the ref backend,
-	 * but must always be accessed via the filesystem. The following refs
-	 * are pseudorefs:
-	 *
-	 * - FETCH_HEAD may contain multiple object IDs, and each one of them
-	 *   carries additional metadata like where it came from.
-	 *
-	 * - MERGE_HEAD may contain multiple object IDs when merging multiple
-	 *   heads.
-	 *
-	 * Reading, writing or deleting references must consistently go either
-	 * through the filesystem (pseudorefs) or through the reference
-	 * backend (normal ones).
-	 */
 	static const char * const pseudo_refs[] = {
 		"FETCH_HEAD",
 		"MERGE_HEAD",
diff --git a/refs.h b/refs.h
index 8489b45265..dc4358727f 100644
--- a/refs.h
+++ b/refs.h
@@ -1080,4 +1080,22 @@ void update_ref_namespace(enum ref_namespace namespace, char *ref);
  */
 int is_root_ref(const char *refname);
 
+/*
+ * Pseudorefs are refs that have different semantics compared to
+ * "normal" refs. These refs can thus not be stored in the ref backend,
+ * but must always be accessed via the filesystem. The following refs
+ * are pseudorefs:
+ *
+ * - FETCH_HEAD may contain multiple object IDs, and each one of them
+ *   carries additional metadata like where it came from.
+ *
+ * - MERGE_HEAD may contain multiple object IDs when merging multiple
+ *   heads.
+ *
+ * Reading, writing or deleting references must consistently go either
+ * through the filesystem (pseudorefs) or through the reference
+ * backend (normal ones).
+ */
+int is_pseudo_ref(const char *refname);
+
 #endif /* REFS_H */
-- 
2.45.GIT



* [PATCH v5 10/10] refs: refuse to write pseudorefs
  2024-05-15  6:50 ` [PATCH v5 " Patrick Steinhardt
                     ` (8 preceding siblings ...)
  2024-05-15  6:51   ` [PATCH v5 09/10] ref-filter: properly distinuish pseudo and root refs Patrick Steinhardt
@ 2024-05-15  6:51   ` Patrick Steinhardt
  9 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-15  6:51 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Justin Tobler, Kristoffer Haugsbakk, Jean-Noël AVILA

Pseudorefs are not stored in the ref database because, by definition,
they carry additional metadata that essentially makes them not a ref.
As such, writing pseudorefs via the ref backend does not make any sense
whatsoever as the ref backend wouldn't know how exactly to store the
data.

Refuse to write pseudorefs via the ref backend.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
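Not part of the patch: a standalone sketch of the guard added to
ref_transaction_update(), reduced to a plain function. The helper
names, the numeric value of the skip flag and the plain `char *`
error buffer (standing in for a strbuf) are assumptions made for
illustration.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Assumed bit value; only the flag's role matters for this sketch. */
#define SKETCH_REF_SKIP_REFNAME_VERIFICATION (1u << 11)

/* Pseudorefs as named in this series. */
static int sketch_is_pseudo_ref(const char *refname)
{
	return !strcmp(refname, "FETCH_HEAD") ||
	       !strcmp(refname, "MERGE_HEAD");
}

/*
 * Hypothetical stand-in for the new check: returns -1 and fills `err`
 * (when provided) if the update targets a pseudoref and refname
 * verification has not been explicitly skipped; returns 0 otherwise.
 */
static int sketch_check_update(const char *refname, unsigned int flags,
			       char *err, size_t errlen)
{
	if (!(flags & SKETCH_REF_SKIP_REFNAME_VERIFICATION) &&
	    sketch_is_pseudo_ref(refname)) {
		if (err && errlen)
			snprintf(err, errlen,
				 "refusing to update pseudoref '%s'",
				 refname);
		return -1;
	}
	return 0;
}
```

In this sketch an update of "FETCH_HEAD" is rejected unless the skip
flag is set, while regular refs pass through untouched. That is
consistent with the t5510 changes below, which remove the file
directly instead of going through git-update-ref(1).
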
 refs.c           | 7 +++++++
 t/t5510-fetch.sh | 6 +++---
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/refs.c b/refs.c
index c13b8ff6d8..9abd9e5b86 100644
--- a/refs.c
+++ b/refs.c
@@ -1263,6 +1263,13 @@ int ref_transaction_update(struct ref_transaction *transaction,
 		return -1;
 	}
 
+	if (!(flags & REF_SKIP_REFNAME_VERIFICATION) &&
+	    is_pseudo_ref(refname)) {
+		strbuf_addf(err, _("refusing to update pseudoref '%s'"),
+			    refname);
+		return -1;
+	}
+
 	if (flags & ~REF_TRANSACTION_UPDATE_ALLOWED_FLAGS)
 		BUG("illegal flags 0x%x passed to ref_transaction_update()", flags);
 
diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index 9441793d06..2af277be9a 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -518,7 +518,7 @@ test_expect_success 'fetch with a non-applying branch.<name>.merge' '
 test_expect_success 'fetch from GIT URL with a non-applying branch.<name>.merge [1]' '
 	one_head=$(cd one && git rev-parse HEAD) &&
 	this_head=$(git rev-parse HEAD) &&
-	git update-ref -d FETCH_HEAD &&
+	rm .git/FETCH_HEAD &&
 	git fetch one &&
 	test $one_head = "$(git rev-parse --verify FETCH_HEAD)" &&
 	test $this_head = "$(git rev-parse --verify HEAD)"
@@ -530,7 +530,7 @@ test_expect_success 'fetch from GIT URL with a non-applying branch.<name>.merge
 	one_ref=$(cd one && git symbolic-ref HEAD) &&
 	git config branch.main.remote blub &&
 	git config branch.main.merge "$one_ref" &&
-	git update-ref -d FETCH_HEAD &&
+	rm .git/FETCH_HEAD &&
 	git fetch one &&
 	test $one_head = "$(git rev-parse --verify FETCH_HEAD)" &&
 	test $this_head = "$(git rev-parse --verify HEAD)"
@@ -540,7 +540,7 @@ test_expect_success 'fetch from GIT URL with a non-applying branch.<name>.merge
 # the merge spec does not match the branch the remote HEAD points to
 test_expect_success 'fetch from GIT URL with a non-applying branch.<name>.merge [3]' '
 	git config branch.main.merge "${one_ref}_not" &&
-	git update-ref -d FETCH_HEAD &&
+	rm .git/FETCH_HEAD &&
 	git fetch one &&
 	test $one_head = "$(git rev-parse --verify FETCH_HEAD)" &&
 	test $this_head = "$(git rev-parse --verify HEAD)"
-- 
2.45.GIT



* Re: [PATCH v3 07/10] refs: root refs can be symbolic refs
  2024-05-15  6:49               ` Jeff King
@ 2024-05-15  6:59                 ` Patrick Steinhardt
  0 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-15  6:59 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Karthik Nayak, Phillip Wood, Junio C Hamano, Justin Tobler,
	Kristoffer Haugsbakk

On Wed, May 15, 2024 at 02:49:12AM -0400, Jeff King wrote:
> On Wed, May 15, 2024 at 08:35:52AM +0200, Patrick Steinhardt wrote:
> 
> > > Yeah, and I'd expect that the more-strict check_refname_format() that I
> > > proposed elsewhere would be in the same boat. The only reason I used the
> > > "_syntax()" variant is that it was obviously wrong to do existence
> > > checks there. Once those are gone, then naturally it should be able to
> > > rely on is_root_ref() itself.
> > 
> > This series hasn't been queued/merged yet, right? Do you plan to reroll
> > it? I think that the changes in there are a good complementary addition
> > to the clarifications in my patch series.
> 
> Correct, I don't think Junio picked it up. It needed a re-roll anyway,
> so I'd plan to do it on top of your patches, assuming they are on track
> to get merged (and it sounds like there are no real objections).

That feels sensible. The series needs another thorough review and an Ack
by somebody before Junio wants to merge it, but so far I'm not aware of
any objections, yeah. So it should hopefully land soonish.

Patrick


* Re: [PATCH v5 06/10] refs: do not check ref existence in `is_root_ref()`
  2024-05-15  6:50   ` [PATCH v5 06/10] refs: do not check ref existence in `is_root_ref()` Patrick Steinhardt
@ 2024-05-15 20:38     ` Justin Tobler
  2024-05-16  4:13       ` Patrick Steinhardt
  0 siblings, 1 reply; 93+ messages in thread
From: Justin Tobler @ 2024-05-15 20:38 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Kristoffer Haugsbakk, Jean-Noël AVILA

On 24/05/15 08:50AM, Patrick Steinhardt wrote:
> Before this patch series, root refs except for "HEAD" and our special
> refs were classified as pseudorefs. Furthermore, our terminology
> clarified that pseudorefs must not be symbolic refs. This restriction
> is enforced in `is_root_ref()`, which explicitly checks that a supposed
> root ref resolves to an object ID without recursing.
> 
> This has been extremely confusing right from the start because (in old
> terminology) a ref name may sometimes be a pseudoref and sometimes not
> depending on whether it is a symbolic or regular ref. This behaviour
> does not seem reasonable at all and I very much doubt that it results in
> anything sane.
> 
> Last but not least, the current behaviour can actually lead to a
> segfault when calling `is_root_ref()` with a reference that either does
> not exist or that is a symbolic ref because we never initialized `oid`,
> but then read it via `is_null_oid()`.
> 
> We have now changed terminology to clarify that pseudorefs are really
> only "MERGE_HEAD" and "FETCH_HEAD", whereas all the other refs that live
> in the root of the ref hierarchy are just plain refs. Thus, we do not
> need to check whether the ref is symbolic or not. In fact, we can now
> avoid looking up the ref completely as the name is sufficient for us to
> figure out whether something would be a root ref or not.
> 
> This change of course changes semantics for our callers. As there are
> only three of them we can assess each of them individually:
> 
>   - "ref-filter.c:ref_kind_from_refname()" uses it to classify refs.
>     It's clear that the intent is to classify based on the ref name,
>     only.
> 
>   - "refs/reftable_backend.c:reftable_ref_iterator_advance()" uses it to
>     filter root refs. Again, using existence checks is pointless here as
>     the iterator has just surfaced the ref, so we know it does exist.
> 
>   - "refs/files_backend.c:add_pseudoref_and_head_entries()" uses it to
>     determine whether it should add a ref to the root directory of its
>     iterator. This had the effect that we skipped over any files that
>     are either a symbolic ref, or which are not a ref at all.
> 
>     The new behaviour is to include symbolic refs know, which aligns us

s/know/now/

>     with the adapted terminology. Furthermore, files which look like
>     root refs but aren't are now marked as "broken". As broken refs
>     are not surfaced by our tooling, this should not lead to a change in
>     user-visible behaviour, but may cause us to emit warnings. This
>     feels like the right thing to do as we would otherwise just silently
>     ignore corrupted root refs completely.

Is there an expected source of broken root refs? Or would it just be due
to bugs?

> So in all cases the existence check was either superfluous, not in line
> with the adapted terminology or masked potential issues. This commit
> thus changes the behaviour as proposed and drops the existence check
> altogether.

Dropping the existence check makes sense to me. It also has the added
benefit of simplifying `is_root_ref()` which is nice.

> 
> Add a test that verifies that this does not change user-visible
> behaviour. Namely, we still don't want to show broken refs to the user
> by default in git-for-each-ref(1). What this does allow though is for
> internal callers to surface dangling root refs when they pass in the
> `DO_FOR_EACH_INCLUDE_BROKEN` flag.


* Re: [PATCH v5 07/10] refs: classify HEAD as a root ref
  2024-05-15  6:50   ` [PATCH v5 07/10] refs: classify HEAD as a root ref Patrick Steinhardt
@ 2024-05-15 20:44     ` Justin Tobler
  0 siblings, 0 replies; 93+ messages in thread
From: Justin Tobler @ 2024-05-15 20:44 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Kristoffer Haugsbakk, Jean-Noël AVILA

On 24/05/15 08:50AM, Patrick Steinhardt wrote:
> Root refs are those refs that live in the root of the ref hierarchy.
> Our old and venerable "HEAD" reference falls into this category, but we
> don't yet classify it as such in `is_root_ref()`.
> 
> Adapt the function to also treat "HEAD" as a root ref. This change is
> safe to do for all current callers:

I like that this change gives HEAD a proper home now. :)


* Re: [PATCH v5 06/10] refs: do not check ref existence in `is_root_ref()`
  2024-05-15 20:38     ` Justin Tobler
@ 2024-05-16  4:13       ` Patrick Steinhardt
  0 siblings, 0 replies; 93+ messages in thread
From: Patrick Steinhardt @ 2024-05-16  4:13 UTC (permalink / raw)
  To: git, Jeff King, Karthik Nayak, Phillip Wood, Junio C Hamano,
	Kristoffer Haugsbakk, Jean-Noël AVILA

On Wed, May 15, 2024 at 03:38:47PM -0500, Justin Tobler wrote:
> On 24/05/15 08:50AM, Patrick Steinhardt wrote:
[snip]
> >     The new behaviour is to include symbolic refs know, which aligns us
> 
> s/know/now/

Fixed locally. I'll refrain from sending a new version just to fix this
typo though.

> >     with the adapted terminology. Furthermore, files which look like
> >     root refs but aren't are now marked as "broken". As broken refs
> >     are not surfaced by our tooling, this should not lead to a change in
> >     user-visible behaviour, but may cause us to emit warnings. This
> >     feels like the right thing to do as we would otherwise just silently
> >     ignore corrupted root refs completely.
> 
> Is there an expected source of broken root refs? Or would it just be due
> to bugs?

Dangling symbolic refs are the only expected source. The fact that we
did not include those here feels like a bug to me.

Patrick


Thread overview: 93+ messages
2024-04-29 13:41 [PATCH 0/3] Clarify pseudo-ref terminology Patrick Steinhardt
2024-04-29 13:41 ` [PATCH 1/3] refs: move `is_special_ref()` Patrick Steinhardt
2024-04-29 13:41 ` [PATCH 2/3] refs: do not label special refs as pseudo refs Patrick Steinhardt
2024-04-29 15:12   ` Phillip Wood
2024-04-30  7:30     ` Patrick Steinhardt
2024-04-30  9:59       ` Phillip Wood
2024-04-30 12:11         ` Patrick Steinhardt
2024-04-30 10:23       ` Jeff King
2024-04-30 12:07         ` Karthik Nayak
2024-04-30 12:33           ` Patrick Steinhardt
2024-04-30 12:16         ` Patrick Steinhardt
2024-04-29 16:24   ` Junio C Hamano
2024-04-29 22:52   ` Justin Tobler
2024-04-30  7:29     ` Patrick Steinhardt
2024-05-09 17:29   ` Jean-Noël AVILA
2024-05-10  8:33     ` Patrick Steinhardt
2024-04-29 13:41 ` [PATCH 3/3] refs: fix segfault in `is_pseudoref()` when ref cannot be resolved Patrick Steinhardt
2024-04-29 15:25   ` Phillip Wood
2024-04-29 18:57   ` Karthik Nayak
2024-04-29 19:47     ` Phillip Wood
2024-04-29 20:44       ` Karthik Nayak
2024-04-30  7:30     ` Patrick Steinhardt
2024-04-30 12:26 ` [PATCH v2 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
2024-04-30 12:26   ` [PATCH v2 01/10] Documentation/glossary: redefine pseudorefs as special refs Patrick Steinhardt
2024-04-30 12:49     ` Karthik Nayak
2024-04-30 17:17     ` Justin Tobler
2024-04-30 20:12     ` Junio C Hamano
2024-05-02  8:07       ` Patrick Steinhardt
2024-04-30 12:26   ` [PATCH v2 02/10] Documentation/glossary: clarify limitations of pseudorefs Patrick Steinhardt
2024-04-30 13:35     ` Kristoffer Haugsbakk
2024-04-30 12:26   ` [PATCH v2 03/10] Documentation/glossary: define root refs as refs Patrick Steinhardt
2024-04-30 12:56     ` Karthik Nayak
2024-04-30 12:26   ` [PATCH v2 04/10] refs: rename `is_pseudoref()` to `is_root_ref()` Patrick Steinhardt
2024-04-30 20:20     ` Junio C Hamano
2024-04-30 12:26   ` [PATCH v2 05/10] refs: refname `is_special_ref()` to `is_pseudo_ref()` Patrick Steinhardt
2024-04-30 12:58     ` Karthik Nayak
2024-04-30 12:26   ` [PATCH v2 06/10] refs: classify HEAD as a root ref Patrick Steinhardt
2024-04-30 12:26   ` [PATCH v2 07/10] refs: root refs can be symbolic refs Patrick Steinhardt
2024-04-30 17:09     ` Justin Tobler
2024-05-02  8:07       ` Patrick Steinhardt
2024-05-03 20:49         ` Justin Tobler
2024-05-07 10:32           ` Patrick Steinhardt
2024-04-30 12:26   ` [PATCH v2 08/10] refs: pseudorefs are no refs Patrick Steinhardt
2024-04-30 12:27   ` [PATCH v2 09/10] ref-filter: properly distinuish pseudo and root refs Patrick Steinhardt
2024-04-30 13:11     ` Karthik Nayak
2024-05-02  8:08       ` Patrick Steinhardt
2024-05-02 10:03         ` Karthik Nayak
2024-04-30 12:27   ` [PATCH v2 10/10] refs: refuse to write pseudorefs Patrick Steinhardt
2024-05-02  8:17 ` [PATCH v3 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
2024-05-02  8:17   ` [PATCH v3 01/10] Documentation/glossary: redefine pseudorefs as special refs Patrick Steinhardt
2024-05-02  8:17   ` [PATCH v3 02/10] Documentation/glossary: clarify limitations of pseudorefs Patrick Steinhardt
2024-05-02  8:17   ` [PATCH v3 03/10] Documentation/glossary: define root refs as refs Patrick Steinhardt
2024-05-02  8:17   ` [PATCH v3 04/10] refs: rename `is_pseudoref()` to `is_root_ref()` Patrick Steinhardt
2024-05-02  8:17   ` [PATCH v3 05/10] refs: refname `is_special_ref()` to `is_pseudo_ref()` Patrick Steinhardt
2024-05-02  8:17   ` [PATCH v3 06/10] refs: classify HEAD as a root ref Patrick Steinhardt
2024-05-02  8:17   ` [PATCH v3 07/10] refs: root refs can be symbolic refs Patrick Steinhardt
2024-05-03 18:13     ` Jeff King
2024-05-15  4:16       ` Patrick Steinhardt
2024-05-15  4:39         ` Patrick Steinhardt
2024-05-15  6:22           ` Jeff King
2024-05-15  6:35             ` Patrick Steinhardt
2024-05-15  6:49               ` Jeff King
2024-05-15  6:59                 ` Patrick Steinhardt
2024-05-15  6:20         ` Jeff King
2024-05-02  8:17   ` [PATCH v3 08/10] refs: pseudorefs are no refs Patrick Steinhardt
2024-05-02  8:17   ` [PATCH v3 09/10] ref-filter: properly distinuish pseudo and root refs Patrick Steinhardt
2024-05-02  8:17   ` [PATCH v3 10/10] refs: refuse to write pseudorefs Patrick Steinhardt
2024-05-10  8:48 ` [PATCH v4 00/10] Clarify pseudo-ref terminology Patrick Steinhardt
2024-05-10  8:48   ` [PATCH v4 01/10] Documentation/glossary: redefine pseudorefs as special refs Patrick Steinhardt
2024-05-10  8:48   ` [PATCH v4 02/10] Documentation/glossary: clarify limitations of pseudorefs Patrick Steinhardt
2024-05-10  8:48   ` [PATCH v4 03/10] Documentation/glossary: define root refs as refs Patrick Steinhardt
2024-05-10  8:48   ` [PATCH v4 04/10] refs: rename `is_pseudoref()` to `is_root_ref()` Patrick Steinhardt
2024-05-10  8:48   ` [PATCH v4 05/10] refs: refname `is_special_ref()` to `is_pseudo_ref()` Patrick Steinhardt
2024-05-10  8:48   ` [PATCH v4 06/10] refs: root refs can be symbolic refs Patrick Steinhardt
2024-05-10  8:48   ` [PATCH v4 07/10] refs: classify HEAD as a root ref Patrick Steinhardt
2024-05-10  8:48   ` [PATCH v4 08/10] refs: pseudorefs are no refs Patrick Steinhardt
2024-05-10  8:48   ` [PATCH v4 09/10] ref-filter: properly distinuish pseudo and root refs Patrick Steinhardt
2024-05-10  8:48   ` [PATCH v4 10/10] refs: refuse to write pseudorefs Patrick Steinhardt
2024-05-10 18:59   ` [PATCH v4 00/10] Clarify pseudo-ref terminology Junio C Hamano
2024-05-15  6:50 ` [PATCH v5 " Patrick Steinhardt
2024-05-15  6:50   ` [PATCH v5 01/10] Documentation/glossary: redefine pseudorefs as special refs Patrick Steinhardt
2024-05-15  6:50   ` [PATCH v5 02/10] Documentation/glossary: clarify limitations of pseudorefs Patrick Steinhardt
2024-05-15  6:50   ` [PATCH v5 03/10] Documentation/glossary: define root refs as refs Patrick Steinhardt
2024-05-15  6:50   ` [PATCH v5 04/10] refs: rename `is_pseudoref()` to `is_root_ref()` Patrick Steinhardt
2024-05-15  6:50   ` [PATCH v5 05/10] refs: rename `is_special_ref()` to `is_pseudo_ref()` Patrick Steinhardt
2024-05-15  6:50   ` [PATCH v5 06/10] refs: do not check ref existence in `is_root_ref()` Patrick Steinhardt
2024-05-15 20:38     ` Justin Tobler
2024-05-16  4:13       ` Patrick Steinhardt
2024-05-15  6:50   ` [PATCH v5 07/10] refs: classify HEAD as a root ref Patrick Steinhardt
2024-05-15 20:44     ` Justin Tobler
2024-05-15  6:51   ` [PATCH v5 08/10] refs: pseudorefs are no refs Patrick Steinhardt
2024-05-15  6:51   ` [PATCH v5 09/10] ref-filter: properly distinuish pseudo and root refs Patrick Steinhardt
2024-05-15  6:51   ` [PATCH v5 10/10] refs: refuse to write pseudorefs Patrick Steinhardt
