git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Yasushi SHOJI <yasushi.shoji@gmail.com>
Cc: Denton Liu <liu.denton@gmail.com>,
	Junio C Hamano <gitster@pobox.com>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: Segfault: git show-branch --reflog refs/pullreqs/1
Date: Wed, 21 Feb 2024 03:42:50 -0500	[thread overview]
Message-ID: <20240221084250.GA25385@coredump.intra.peff.net> (raw)
In-Reply-To: <CAELBRWK-bZTV0qx6_34HAgpmYwy+5Zo2E0M+4B6yZJJ3CqweTw@mail.gmail.com>

On Wed, Feb 21, 2024 at 10:48:25AM +0900, Yasushi SHOJI wrote:

> Does anyone see a segfault on `git show-branch --reflog refs/pullreqs/1`?
> 
> It seems files_reflog_path() creates a wrong path with the above command
> using REF_WORKTREE_SHARED.

I can trigger a segfault, but I think the issue is simply a ref that has
no reflog. Here's a simple reproduction:

  $ git init
  $ git commit --allow-empty -m foo
  $ rm -rf .git/logs
  $ git show-branch --reflog
  Segmentation fault

The bug is in read_ref_at(). When asked for the reflog at position "0",
it calls refs_for_each_reflog_ent_reverse() with a special callback, but
does not check that it actually found anything! So we return "0" for
success, but all of the returned fields are garbage (including the
pointer to reflog message, which is where I see the segfault).

The bug was introduced by 6436a20284 (refs: allow @{n} to work with
n-sized reflog, 2021-01-07). Probably the fix is something like:

diff --git a/refs.c b/refs.c
index 03968ad787..c2a48f8188 100644
--- a/refs.c
+++ b/refs.c
@@ -945,6 +945,8 @@ static int read_ref_at_ent_newest(struct object_id *ooid, struct object_id *noid
 
 	set_read_ref_cutoffs(cb, timestamp, tz, message);
 	oidcpy(cb->oid, noid);
+	cb->reccnt++;
+	cb->found_it = 1;
 	/* We just want the first entry */
 	return 1;
 }
@@ -980,12 +982,10 @@ int read_ref_at(struct ref_store *refs, const char *refname,
 	cb.cutoff_cnt = cutoff_cnt;
 	cb.oid = oid;
 
-	if (cb.cnt == 0) {
+	if (cb.cnt == 0)
 		refs_for_each_reflog_ent_reverse(refs, refname, read_ref_at_ent_newest, &cb);
-		return 0;
-	}
-
-	refs_for_each_reflog_ent_reverse(refs, refname, read_ref_at_ent, &cb);
+	else
+		refs_for_each_reflog_ent_reverse(refs, refname, read_ref_at_ent, &cb);
 
 	if (!cb.reccnt) {
 		if (flags & GET_OID_QUIETLY)

but that breaks t1508.35, which explicitly tests for branch@{0} to work
with an empty reflog file (added by that same commit). The code in
get_oid_basic() to parse reflogs doesn't suffer from the same bugs: it
checks up front that the reflog file exists, it preloads the output oid
with the current ref value, and it doesn't look at other fields (like
the reflog message).

So I'm not sure if read_ref_at() needs to be made safer, or if
cmd_show_branch() needs to learn the same tricks as get_oid_basic().
Those are the only two callers of read_ref_at().

Beyond that confusion, I noticed we do not have many tests for
show-branch, and none for "--reflog". So I thought to add a basic one
where we _do_ have an actual reflog to show. But wow, this has been
broken for some time. I found at least two issues trying to run a test
like:

diff --git a/t/t3207-show-branch-reflog.sh b/t/t3207-show-branch-reflog.sh
new file mode 100755
index 0000000000..7f52c8dcb1
--- /dev/null
+++ b/t/t3207-show-branch-reflog.sh
@@ -0,0 +1,34 @@
+#!/bin/sh
+
+test_description='show-branch reflog tests'
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	test_commit base &&
+	git checkout -b branch &&
+	test_commit one &&
+	git reset --hard HEAD^ &&
+	test_commit two &&
+	test_commit three
+'
+
+test_expect_success '--reflog shows reflog entries' '
+	cat >expect <<-\EOF &&
+	! [branch@{0}] (0 seconds ago) commit: three
+	 ! [branch@{1}] (60 seconds ago) commit: two
+	  ! [branch@{2}] (2 minutes ago) reset: moving to HEAD^
+	   ! [branch@{3}] (2 minutes ago) commit: one
+	----
+	+    [refs/heads/branch@{0}] three
+	++   [refs/heads/branch@{1}] two
+	   + [refs/heads/branch@{3}] one
+	++++ [refs/heads/branch@{2}] base
+	EOF
+	# the output always contains relative timestamps; use
+	# a known time to get deterministic results
+	GIT_TEST_DATE_NOW=$test_tick \
+	git show-branch --reflog branch >actual &&
+	test_cmp expect actual
+'
+
+test_done

The first is that "show-branch" does not print the correct reflog
message, and you get output like this:

  ! [branch@{0}] (0 seconds ago) (none)
   ! [branch@{1}] (0 seconds ago) (none)
    ! [branch@{2}] (60 seconds ago) (none)
     ! [branch@{3}] (2 minutes ago) (none)

Once upon a time, read_ref_at() returned the whole reflog line, and
show-branch had to find the tab-separator. But since 4207ed285f (refs.c:
change read_ref_at to use the reflog iterators, 2014-06-03), it returns
just the actual message (curiously, with the newline still attached). So
we need something like this to fix it:

diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index d6d2dabeca..b678b9fedb 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -761,7 +761,7 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 		for (i = 0; i < reflog; i++) {
 			char *logmsg;
 			char *nth_desc;
-			const char *msg;
+			char *eol;
 			timestamp_t timestamp;
 			int tz;
 
@@ -771,15 +771,13 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 				reflog = i;
 				break;
 			}
-			msg = strchr(logmsg, '\t');
-			if (!msg)
-				msg = "(none)";
-			else
-				msg++;
+			eol = strchr(logmsg, '\n');
+			if (eol)
+				*eol = '\0';
 			reflog_msg[i] = xstrfmt("(%s) %s",
 						show_date(timestamp, tz,
 							  DATE_MODE(RELATIVE)),
-						msg);
+						logmsg);
 			free(logmsg);
 
 			nth_desc = xstrfmt("%s@{%d}", *av, base+i);

Easy enough. But the output is still subtly wrong! Now we're back to
6436a20284 (refs: allow @{n} to work with n-sized reflog, 2021-01-07)
again. Before that commit, applying the fix above gives the expected
output from my test:

  ! [branch@{0}] (0 seconds ago) commit: three
   ! [branch@{1}] (60 seconds ago) commit: two
    ! [branch@{2}] (2 minutes ago) reset: moving to HEAD^
     ! [branch@{3}] (2 minutes ago) commit: one

but afterwards, entries higher than one are all shifted (so 1 is a
duplicate of 0, 2 is the old 1, and so on):

  ! [branch@{0}] (0 seconds ago) commit: three
   ! [branch@{1}] (0 seconds ago) commit: three
    ! [branch@{2}] (60 seconds ago) commit: two
     ! [branch@{3}] (2 minutes ago) reset: moving to HEAD^

I am still trying to wrap my head around how it can get such wrong
results for show-branch when asking for "git rev-parse branch@{0}", etc,
are correct. I think it is that "rev-parse branch@{0}" is only looking
at the output oid and does not consider the reflog message at all. So I
think it is subtly broken, but in a way that happens to work for that
caller. But I'm not sure of the correct fix. At least not at this time
of night.

Cc-ing folks involved in 6436a20284.

-Peff

  reply	other threads:[~2024-02-21  8:42 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-21  1:48 Segfault: git show-branch --reflog refs/pullreqs/1 Yasushi SHOJI
2024-02-21  8:42 ` Jeff King [this message]
2024-02-21 10:05   ` Patrick Steinhardt
2024-02-21 17:38     ` Jeff King
2024-02-21 17:44   ` Junio C Hamano
2024-02-22  9:02     ` Patrick Steinhardt
2024-02-22 16:32       ` Junio C Hamano
2024-02-22 17:22         ` Jeff King
2024-02-26 10:00           ` [PATCH 0/3] show-branch --reflog fixes Jeff King
2024-02-26 10:02             ` [PATCH 1/3] Revert "refs: allow @{n} to work with n-sized reflog" Jeff King
2024-02-26 10:04             ` [PATCH 2/3] get_oid_basic(): special-case ref@{n} for oldest reflog entry Jeff King
2024-02-26 15:59               ` Junio C Hamano
2024-02-26 10:08             ` [PATCH 3/3] read_ref_at(): special-case ref@{0} for an empty reflog Jeff King
2024-02-26 10:10               ` Jeff King
2024-02-26 17:25                 ` Junio C Hamano
2024-02-27  8:07                   ` Jeff King
2024-02-26 17:25               ` Junio C Hamano
2024-02-27  8:05                 ` Jeff King
2024-02-27 17:03                   ` Junio C Hamano
2024-02-21  9:52 ` Segfault: git show-branch --reflog refs/pullreqs/1 Patrick Steinhardt
2024-02-21  9:56 ` [PATCH 0/2] Detect empty or missing reflogs with `ref@{0}` Patrick Steinhardt
2024-02-21  9:56   ` [PATCH 1/2] object-name: detect and report empty reflogs Patrick Steinhardt
2024-02-21 10:37     ` Kristoffer Haugsbakk
2024-02-21 16:48     ` Eric Sunshine
2024-02-21 17:31     ` Jeff King
2024-02-21  9:56   ` [PATCH 2/2] builtin/show-branch: detect " Patrick Steinhardt
2024-02-21 17:35     ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240221084250.GA25385@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=liu.denton@gmail.com \
    --cc=yasushi.shoji@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).