All of lore.kernel.org
 help / color / mirror / Atom feed
* First/oldest entry in reflog dropped
@ 2010-11-21  3:11 Martin von Zweigbergk
  2010-11-21  5:35 ` Jeff King
  0 siblings, 1 reply; 8+ messages in thread
From: Martin von Zweigbergk @ 2010-11-21  3:11 UTC (permalink / raw)
  To: git

Can someone explain the behavior in the execution below?

# I expected this reflog...
$ git branch tmp
$ git reflog show refs/heads/tmp
b60a214 refs/heads/tmp@{0}: branch: Created from master

# ... and this one as well...
$ git update-ref refs/heads/tmp HEAD^
$ git reflog show refs/heads/tmp
7d1a0b8 refs/heads/tmp@{0}:
b60a214 refs/heads/tmp@{1}: branch: Created from master

# ... but why is the first entry (i.e. "branch: Created from master")
# dropped here?
$ git update-ref refs/heads/tmp HEAD
$ git reflog show refs/heads/tmp
b60a214 refs/heads/tmp@{0}:
7d1a0b8 refs/heads/tmp@{1}:

If the ref is updated once more (to e.g. HEAD^^) before being moved back
to HEAD, the first entry will be shown in the output.

If this is a bug, it seems to be in reflog, rather than in update-ref,
because the first entry does exist in .git/logs/refs/heads/tmp.

Thanks,
Martin

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: First/oldest entry in reflog dropped
  2010-11-21  3:11 First/oldest entry in reflog dropped Martin von Zweigbergk
@ 2010-11-21  5:35 ` Jeff King
  2010-11-21 11:36   ` Johannes Schindelin
  2010-11-21 22:57   ` Martin von Zweigbergk
  0 siblings, 2 replies; 8+ messages in thread
From: Jeff King @ 2010-11-21  5:35 UTC (permalink / raw)
  To: Martin von Zweigbergk; +Cc: Johannes Schindelin, git

On Sat, Nov 20, 2010 at 10:11:34PM -0500, Martin von Zweigbergk wrote:

> Can someone explain the behavior in the execution below?
> 
> # I expected this reflog...
> $ git branch tmp
> $ git reflog show refs/heads/tmp
> b60a214 refs/heads/tmp@{0}: branch: Created from master
> 
> # ... and this one as well...
> $ git update-ref refs/heads/tmp HEAD^
> $ git reflog show refs/heads/tmp
> 7d1a0b8 refs/heads/tmp@{0}:
> b60a214 refs/heads/tmp@{1}: branch: Created from master
> 
> # ... but why is the first entry (i.e. "branch: Created from master")
> # dropped here?
> $ git update-ref refs/heads/tmp HEAD
> $ git reflog show refs/heads/tmp
> b60a214 refs/heads/tmp@{0}:
> 7d1a0b8 refs/heads/tmp@{1}:
> 
> If the ref is updated once more (to e.g. HEAD^^) before being moved back
> to HEAD, the first entry will be shown in the output.
> 
> If this is a bug, it seems to be in reflog, rather than in update-ref,
> because the first entry does exist in .git/logs/refs/heads/tmp.

I think it's a bug in the reflog-walking machinery, which is sort of
bolted onto the regular revision traversal machinery. When we hit
b60a214 the first time, we show it and set the SHOWN flag (since the
normal traversal machinery would not want to show a commit twice). When
we hit it again, simplify_commit() sees that it is SHOWN and tells us to
skip it.

However, the bolted-on reflog-walking machinery does have a way of
handling this. While we are traversing via get_revision(), we notice
that we are doing a reflog walk and call fake_reflog_parent. This
function is responsible for replacing the actual parents of the commit
with a fake list consisting of the previous reflog entry (so we
basically pretend that the history consists of a string of commits, each
one pointing to the previous reflog entry, not the actual parent).

This function _also_ clears some flags, including the SHOWN flag, in
what almost seems like a tacked-on side effect. So if we hit the same
commit twice, we will actually show it again. Which is what makes
reflogs with repeated commits work at all.

However, there is a subtle bug: it clears the flags at the very end of
the function. But through the function, if we see that there are no fake
parents (because we are on the very first reflog entry), we do an early
return. But we not only skip the later "set up parents" code, we also
accidentally skip the "clear SHOWN flag" side-effect code.

So I believe we will always fail to show the very first reflog if it is
a repeated commit.

The fix, AFAICT, is to just move the flag clearing above the early
returns (patch below). But I have to admit I do not quite understand
what the ADDED and SEEN flags are doing here, as this is the first time
I have ever looked at the reflog-walk code. So possibly just the SHOWN
flag should be unconditionally cleared.

This patch clears up your bug, and doesn't break any tests. But I'd
really like to get a second opinion on the significance of those other
flags, or why the flag clearing was at the bottom of the function in the
first place.

-Peff

---
diff --git a/reflog-walk.c b/reflog-walk.c
index 4879615..d5d055b 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -210,44 +210,45 @@ int add_reflog_for_walk(struct reflog_walk_info *info,
 	add_commit_info(commit, commit_reflog, &info->reflogs);
 	return 0;
 }
 
 void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
 {
 	struct commit_info *commit_info =
 		get_commit_info(commit, &info->reflogs, 0);
 	struct commit_reflog *commit_reflog;
 	struct reflog_info *reflog;
 
+	commit->object.flags &= ~(ADDED | SEEN | SHOWN);
+
 	info->last_commit_reflog = NULL;
 	if (!commit_info)
 		return;
 
 	commit_reflog = commit_info->util;
 	if (commit_reflog->recno < 0) {
 		commit->parents = NULL;
 		return;
 	}
 
 	reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
 	info->last_commit_reflog = commit_reflog;
 	commit_reflog->recno--;
 	commit_info->commit = (struct commit *)parse_object(reflog->osha1);
 	if (!commit_info->commit) {
 		commit->parents = NULL;
 		return;
 	}
 
 	commit->parents = xcalloc(sizeof(struct commit_list), 1);
 	commit->parents->item = commit_info->commit;
-	commit->object.flags &= ~(ADDED | SEEN | SHOWN);
 }
 
 void get_reflog_selector(struct strbuf *sb,
 			 struct reflog_walk_info *reflog_info,
 			 enum date_mode dmode,
 			 int shorten)
 {
 	struct commit_reflog *commit_reflog = reflog_info->last_commit_reflog;
 	struct reflog_info *info;
 	const char *printed_ref;
 

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: First/oldest entry in reflog dropped
  2010-11-21  5:35 ` Jeff King
@ 2010-11-21 11:36   ` Johannes Schindelin
  2010-11-22  4:42     ` Jeff King
  2010-11-21 22:57   ` Martin von Zweigbergk
  1 sibling, 1 reply; 8+ messages in thread
From: Johannes Schindelin @ 2010-11-21 11:36 UTC (permalink / raw)
  To: Jeff King; +Cc: Martin von Zweigbergk, git

Hi,

On Sun, 21 Nov 2010, Jeff King wrote:

> This patch clears up your bug, and doesn't break any tests. But I'd 
> really like to get a second opinion on the significance of those other 
> flags, or why the flag clearing was at the bottom of the function in the 
> first place.

The flag clearing was at the bottom because I had the impression that 
something function one might want to call in that function in the future 
could set the flags again. Maybe a goto would be appropriate here instead 
of the early returns?

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: First/oldest entry in reflog dropped
  2010-11-21  5:35 ` Jeff King
  2010-11-21 11:36   ` Johannes Schindelin
@ 2010-11-21 22:57   ` Martin von Zweigbergk
  1 sibling, 0 replies; 8+ messages in thread
From: Martin von Zweigbergk @ 2010-11-21 22:57 UTC (permalink / raw)
  To: Jeff King; +Cc: Johannes Schindelin, git

On Sun, Nov 21, 2010 at 12:35 AM, Jeff King <peff@peff.net> wrote:
> On Sat, Nov 20, 2010 at 10:11:34PM -0500, Martin von Zweigbergk wrote:
>
>> Can someone explain the behavior in the execution below?
>>
>> # I expected this reflog...
>> $ git branch tmp
>> $ git reflog show refs/heads/tmp
>> b60a214 refs/heads/tmp@{0}: branch: Created from master
>>
>> # ... and this one as well...
>> $ git update-ref refs/heads/tmp HEAD^
>> $ git reflog show refs/heads/tmp
>> 7d1a0b8 refs/heads/tmp@{0}:
>> b60a214 refs/heads/tmp@{1}: branch: Created from master
>>
>> # ... but why is the first entry (i.e. "branch: Created from master")
>> # dropped here?
>> $ git update-ref refs/heads/tmp HEAD
>> $ git reflog show refs/heads/tmp
>> b60a214 refs/heads/tmp@{0}:
>> 7d1a0b8 refs/heads/tmp@{1}:
>>
>> If the ref is updated once more (to e.g. HEAD^^) before being moved back
>> to HEAD, the first entry will be shown in the output.
>>
>> If this is a bug, it seems to be in reflog, rather than in update-ref,
>> because the first entry does exist in .git/logs/refs/heads/tmp.
>
> I think it's a bug in the reflog-walking machinery, which is sort of
> bolted onto the regular revision traversal machinery. When we hit
> b60a214 the first time, we show it and set the SHOWN flag (since the
> normal traversal machinery would not want to show a commit twice). When
> we hit it again, simplify_commit() sees that it is SHOWN and tells us to
> skip it.
>
> However, the bolted-on reflog-walking machinery does have a way of
> handling this. While we are traversing via get_revision(), we notice
> that we are doing a reflog walk and call fake_reflog_parent. This
> function is responsible for replacing the actual parents of the commit
> with a fake list consisting of the previous reflog entry (so we
> basically pretend that the history consists of a string of commits, each
> one pointing to the previous reflog entry, not the actual parent).
>
> This function _also_ clears some flags, including the SHOWN flag, in
> what almost seems like a tacked-on side effect. So if we hit the same
> commit twice, we will actually show it again. Which is what makes
> reflogs with repeated commits work at all.
>
> However, there is a subtle bug: it clears the flags at the very end of
> the function. But through the function, if we see that there are no fake
> parents (because we are on the very first reflog entry), we do an early
> return. But we not only skip the later "set up parents" code, we also
> accidentally skip the "clear SHOWN flag" side-effect code.
>
> So I believe we will always fail to show the very first reflog if it is
> a repeated commit.
>
> The fix, AFAICT, is to just move the flag clearing above the early
> returns (patch below). But I have to admit I do not quite understand
> what the ADDED and SEEN flags are doing here, as this is the first time
> I have ever looked at the reflog-walk code. So possibly just the SHOWN
> flag should be unconditionally cleared.
>
> This patch clears up your bug, and doesn't break any tests. But I'd
> really like to get a second opinion on the significance of those other
> flags, or why the flag clearing was at the bottom of the function in the
> first place.
>
> -Peff
>
> ---
> diff --git a/reflog-walk.c b/reflog-walk.c
> index 4879615..d5d055b 100644
> --- a/reflog-walk.c
> +++ b/reflog-walk.c
> @@ -210,44 +210,45 @@ int add_reflog_for_walk(struct reflog_walk_info *info,
>        add_commit_info(commit, commit_reflog, &info->reflogs);
>        return 0;
>  }
>
>  void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
>  {
>        struct commit_info *commit_info =
>                get_commit_info(commit, &info->reflogs, 0);
>        struct commit_reflog *commit_reflog;
>        struct reflog_info *reflog;
>
> +       commit->object.flags &= ~(ADDED | SEEN | SHOWN);
> +
>        info->last_commit_reflog = NULL;
>        if (!commit_info)
>                return;
>
>        commit_reflog = commit_info->util;
>        if (commit_reflog->recno < 0) {
>                commit->parents = NULL;
>                return;
>        }
>
>        reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
>        info->last_commit_reflog = commit_reflog;
>        commit_reflog->recno--;
>        commit_info->commit = (struct commit *)parse_object(reflog->osha1);
>        if (!commit_info->commit) {
>                commit->parents = NULL;
>                return;
>        }
>
>        commit->parents = xcalloc(sizeof(struct commit_list), 1);
>        commit->parents->item = commit_info->commit;
> -       commit->object.flags &= ~(ADDED | SEEN | SHOWN);
>  }
>
>  void get_reflog_selector(struct strbuf *sb,
>                         struct reflog_walk_info *reflog_info,
>                         enum date_mode dmode,
>                         int shorten)
>  {
>        struct commit_reflog *commit_reflog = reflog_info->last_commit_reflog;
>        struct reflog_info *info;
>        const char *printed_ref;
>
>

Good job! And yes, if the first commit is included in any cycles, it
seems to be dropped. I must have been blind yesterday...

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: First/oldest entry in reflog dropped
  2010-11-21 11:36   ` Johannes Schindelin
@ 2010-11-22  4:42     ` Jeff King
  2010-11-22  9:32       ` Johannes Schindelin
  2010-11-24  0:35       ` Junio C Hamano
  0 siblings, 2 replies; 8+ messages in thread
From: Jeff King @ 2010-11-22  4:42 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Junio C Hamano, Martin von Zweigbergk, git

On Sun, Nov 21, 2010 at 12:36:21PM +0100, Johannes Schindelin wrote:

> On Sun, 21 Nov 2010, Jeff King wrote:
> 
> > This patch clears up your bug, and doesn't break any tests. But I'd 
> > really like to get a second opinion on the significance of those other 
> > flags, or why the flag clearing was at the bottom of the function in the 
> > first place.
> 
> The flag clearing was at the bottom because I had the impression that 
> something function one might want to call in that function in the future 
> could set the flags again. Maybe a goto would be appropriate here instead 
> of the early returns?

That makes sense. I did a quick skim of the called code and I'm not sure
any flags could be set, but even if I am right, I think it is better to
be defensive.

So let's do this, which is the equivalent behavior to your gotos, but
this structure makes more sense to me as a reader (and it doesn't
involve goto :) ).

-- >8 --
Subject: [PATCH] reflogs: clear flags properly in corner case

The reflog-walking mechanism is based on the regular
revision traversal. We just rewrite the parents of each
commit in fake_reflog_parent to point to the commit in the
next reflog entry instead of the real parents.

However, the regular revision traversal tries not to show
the same commit twice, and so sets the SHOWN flag on each
commit it shows. In a reflog, however, we may want to see
the same commit more than once if it appears in the reflog
multiple times (which easily happens, for example, if you do
a reset to a prior state).

The fake_reflog_parent function takes care of this by
clearing flags, including SHOWN. Unfortunately, it does so
at the very end of the function, and it is possible to
return early from the function if there is no fake parent to
set up (e.g., because we are at the very first reflog entry
on the branch). In such a case the flag is not cleared, and
the entry is skipped by the revision traversal machinery as
already shown.

You can see this by walking the log of a ref which is set to
its very first commit more than once (the test below shows
such a situation). In this case the reflog walk will fail to
show the entry for the initial creation of the ref.

We don't want to simply move the flag-clearing to the top of
the function; we want to make sure flags set during the
fake-parent installation are also cleared. Instead, let's
hoist the flag-clearing out of the fake_reflog_parent
function entirely. It's not really about fake parents
anyway, and the only caller is the get_revision machinery.

Reported-by: Martin von Zweigbergk <martin.von.zweigbergk@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
---
 reflog-walk.c          |    1 -
 revision.c             |    4 +++-
 t/t1412-reflog-loop.sh |   34 ++++++++++++++++++++++++++++++++++
 3 files changed, 37 insertions(+), 2 deletions(-)
 create mode 100755 t/t1412-reflog-loop.sh

diff --git a/reflog-walk.c b/reflog-walk.c
index 4879615..5d81d39 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -239,7 +239,6 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
 
 	commit->parents = xcalloc(sizeof(struct commit_list), 1);
 	commit->parents->item = commit_info->commit;
-	commit->object.flags &= ~(ADDED | SEEN | SHOWN);
 }
 
 void get_reflog_selector(struct strbuf *sb,
diff --git a/revision.c b/revision.c
index b1c1890..ded8812 100644
--- a/revision.c
+++ b/revision.c
@@ -2030,8 +2030,10 @@ static struct commit *get_revision_1(struct rev_info *revs)
 		revs->commits = entry->next;
 		free(entry);
 
-		if (revs->reflog_info)
+		if (revs->reflog_info) {
 			fake_reflog_parent(revs->reflog_info, commit);
+			commit->object.flags &= ~(ADDED | SEEN | SHOWN);
+		}
 
 		/*
 		 * If we haven't done the list limiting, we need to look at
diff --git a/t/t1412-reflog-loop.sh b/t/t1412-reflog-loop.sh
new file mode 100755
index 0000000..7f519e5
--- /dev/null
+++ b/t/t1412-reflog-loop.sh
@@ -0,0 +1,34 @@
+#!/bin/sh
+
+test_description='reflog walk shows repeated commits again'
+. ./test-lib.sh
+
+test_expect_success 'setup commits' '
+	test_tick &&
+	echo content >file && git add file && git commit -m one &&
+	git tag one &&
+	echo content >>file && git add file && git commit -m two &&
+	git tag two
+'
+
+test_expect_success 'setup reflog with alternating commits' '
+	git checkout -b topic &&
+	git reset one &&
+	git reset two &&
+	git reset one &&
+	git reset two
+'
+
+test_expect_success 'reflog shows all entries' '
+	cat >expect <<-\EOF
+		topic@{0} two: updating HEAD
+		topic@{1} one: updating HEAD
+		topic@{2} two: updating HEAD
+		topic@{3} one: updating HEAD
+		topic@{4} branch: Created from HEAD
+	EOF
+	git log -g --format="%gd %gs" topic >actual &&
+	test_cmp expect actual
+'
+
+test_done
-- 
1.7.3.2.509.g15259

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: First/oldest entry in reflog dropped
  2010-11-22  4:42     ` Jeff King
@ 2010-11-22  9:32       ` Johannes Schindelin
  2010-11-24  0:35       ` Junio C Hamano
  1 sibling, 0 replies; 8+ messages in thread
From: Johannes Schindelin @ 2010-11-22  9:32 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Martin von Zweigbergk, git

Hi,

On Sun, 21 Nov 2010, Jeff King wrote:

> Subject: [PATCH] reflogs: clear flags properly in corner case
> 
> [...]

ACK,
Dscho

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: First/oldest entry in reflog dropped
  2010-11-22  4:42     ` Jeff King
  2010-11-22  9:32       ` Johannes Schindelin
@ 2010-11-24  0:35       ` Junio C Hamano
  2010-11-25 16:15         ` Jeff King
  1 sibling, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2010-11-24  0:35 UTC (permalink / raw)
  To: Jeff King; +Cc: Johannes Schindelin, Martin von Zweigbergk, git

Jeff King <peff@peff.net> writes:

> On Sun, Nov 21, 2010 at 12:36:21PM +0100, Johannes Schindelin wrote:
>
>> On Sun, 21 Nov 2010, Jeff King wrote:
>> 
>> > This patch clears up your bug, and doesn't break any tests. But I'd 
>> > really like to get a second opinion on the significance of those other 
>> > flags, or why the flag clearing was at the bottom of the function in the 
>> > first place.
>> 
>> The flag clearing was at the bottom because I had the impression that 
>> something function one might want to call in that function in the future 
>> could set the flags again. Maybe a goto would be appropriate here instead 
>> of the early returns?
>
> That makes sense. I did a quick skim of the called code and I'm not sure
> any flags could be set, but even if I am right, I think it is better to
> be defensive.
>
> So let's do this, which is the equivalent behavior to your gotos, but
> this structure makes more sense to me as a reader (and it doesn't
> involve goto :) ).

I have a feeling that we probably should have structured the code a bit
more modularly, placing some info in "rev" that tells us how to "pop" a
commit from the rev->list (typically "not the ones that we have shown"),
what other commits to push back into the queue (typically, "all the
parents that are not interesting"), and what side effects we should cause
when we do so (typically, "mark uninteresting parents"), etc., instead of
the current "if we are walking reflog, here is the special codepath we
take", so that the walking is more generalized when we did the reflog
walking (in fact, if we did this properly we probably wouldn't be calling
it "bolted on").  But for now let's refrain from doing such a rewrite.

Thanks, both.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: First/oldest entry in reflog dropped
  2010-11-24  0:35       ` Junio C Hamano
@ 2010-11-25 16:15         ` Jeff King
  0 siblings, 0 replies; 8+ messages in thread
From: Jeff King @ 2010-11-25 16:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Schindelin, Martin von Zweigbergk, git

On Tue, Nov 23, 2010 at 04:35:52PM -0800, Junio C Hamano wrote:

> > So let's do this, which is the equivalent behavior to your gotos, but
> > this structure makes more sense to me as a reader (and it doesn't
> > involve goto :) ).
> 
> I have a feeling that we probably should have structured the code a bit
> more modularly, placing some info in "rev" that tells us how to "pop" a
> commit from the rev->list (typically "not the ones that we have shown"),
> what other commits to push back into the queue (typically, "all the
> parents that are not interesting"), and what side effects we should cause
> when we do so (typically, "mark uninteresting parents"), etc., instead of
> the current "if we are walking reflog, here is the special codepath we
> take", so that the walking is more generalized when we did the reflog
> walking (in fact, if we did this properly we probably wouldn't be calling
> it "bolted on").  But for now let's refrain from doing such a rewrite.

FWIW, I agree with all of that. I don't think it's worth fixing now, but
perhaps in libgit2. :)

-Peff

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2010-11-25 16:15 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-11-21  3:11 First/oldest entry in reflog dropped Martin von Zweigbergk
2010-11-21  5:35 ` Jeff King
2010-11-21 11:36   ` Johannes Schindelin
2010-11-22  4:42     ` Jeff King
2010-11-22  9:32       ` Johannes Schindelin
2010-11-24  0:35       ` Junio C Hamano
2010-11-25 16:15         ` Jeff King
2010-11-21 22:57   ` Martin von Zweigbergk

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.