git.vger.kernel.org archive mirror
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Han-Wen Nienhuys <hanwen@google.com>
Cc: git <git@vger.kernel.org>, Jeff King <peff@peff.net>,
	Martin Fick <mfick@codeaurora.org>
Subject: Re: Distinguishing FF vs non-FF updates in the reflog?
Date: Mon, 22 Mar 2021 14:26:36 +0100
Message-ID: <87eeg7qpyr.fsf@evledraar.gmail.com>
In-Reply-To: <CAFQ2z_MefCwiWdhs0buJv5Zok+nsgaOvUCcsSnfm_PP0WozZKA@mail.gmail.com>


On Wed, Mar 17 2021, Han-Wen Nienhuys wrote:

> Hi there,
>
> I'm working on some extensions to Gerrit for which it would be very
> beneficial if we could tell from the reflog if an update is a
> fast-forward or not: if we find a SHA1 in the reflog, and see there
> were only FF updates since, we can be sure that the SHA1 is reachable
> from the branch, without having to open packfiles and decode commits.
>
> For the reftable format, I think we could store this easily by
> introducing more record types. [snip].

Aside from what others have mentioned here, you're talking about the
log_type field, are you not? I.e.:
https://googlers.googlesource.com/sop/jgit/+/reftable/Documentation/technical/reftable.md#log-block-format

Has that "log_type = 0x0" tombstone proven to be a worthwhile
optimization beyond the stash case mentioned there (which is presumably
not relevant to the vast majority of Google's use-cases)?

I.e. it's redundant with looking at the record and seeing whether
new_id == ZERO_OID.

Similarly, can't ff vs. non-ff be deduced unambiguously by looking ahead
to the next record, and seeing if the current record's "old_id" matches
the last record's "new_id"? If it does it's a FF, if not it's a
non-FF (or a create/delete).

I'm not arguing that a quicker lookup isn't needed, I'm just trying to
dig at what "beneficial" here is. The format is ordered, and the common
case is that the page we have in memory has the last record.

What sort of case are we talking about where not unpacking the log_data
segment is making a difference?

> However, the textual reflog format doesn't easily allow for this.
> However, we might add a convention, eg. have the message start with
> 'FF' or 'NFF' depending on the nature of the update.

Maybe a bit ugly, but a ".." and "..." prefix would at least be
consistent with "fetch" output. Or e.g. "commit:" and "+commit:" for ff
and non-ff (and we could make it "\t commit:" vs. "\t+commit:"
vs. the current "\tcommit:" to distinguish all three in the current
text-based format). See "OUTPUT" in git-fetch(1).
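
To make the three-way prefix idea concrete, here is a rough Python sketch
of a reader for it. Nothing in git writes such markers today; the function
name and the "unmarked" label are made up for illustration, and the only
assumption about the file format is that the message follows the first tab
on each reflog line.

```python
def classify_reflog_message(line: str) -> str:
    """Classify one textual reflog line by the character after the tab.

    Hypothetical convention from the discussion above:
      "\t commit: ..."  -> "ff"
      "\t+commit: ..."  -> "non-ff"
      "\tcommit: ..."   -> "unmarked" (today's format, nothing recorded)
    """
    # Everything after the first tab is the reflog message.
    _, sep, message = line.partition("\t")
    if not sep:
        # No tab at all, so there is no message to inspect.
        return "unmarked"
    if message.startswith(" "):
        return "ff"
    if message.startswith("+"):
        return "non-ff"
    return "unmarked"
```

A reader that sees "unmarked" would have to fall back to whatever it does
now, so old and new entries could coexist in one file.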

> [Ævar: snipped from earlier] Today we have 0 = deletion, 1 = update,
> and we could add 2 = FF update, 3 = non-FF update.

I've written log table implementations (a site table in an RDBMS) for git
(one table for refs) which had:

    create, ff, non-ff, delete

I wonder if that quad-state would be useful for reftable too. With this
proposed change you'd still need to unpack the record and see if the
old_id is ZERO_OID to check if it's a creation, would you not?
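
As a sketch of what deriving that quad-state from an unpacked record looks
like (Python, purely illustrative): create and delete fall out of comparing
against ZERO_OID, while ff vs. non-ff needs an ancestry answer from
somewhere. The is_ancestor callback is a stand-in I'm assuming here; it
could be backed by something like "git merge-base --is-ancestor", which is
exactly the object access a stored flag would avoid.

```python
from typing import Callable

ZERO_OID = "0" * 40  # all-zero SHA-1, hex-encoded


def classify_update(
    old_id: str,
    new_id: str,
    is_ancestor: Callable[[str, str], bool],
) -> str:
    """Return "create", "delete", "ff" or "non-ff" for one log record.

    is_ancestor(a, b) is a hypothetical callback that answers whether
    commit a is reachable from commit b.
    """
    if old_id == ZERO_OID:
        return "create"
    if new_id == ZERO_OID:
        return "delete"
    # Only plain updates need the (potentially expensive) ancestry check.
    return "ff" if is_ancestor(old_id, new_id) else "non-ff"
```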

I also wonder if it couldn't be:

    0 = deletion, 1 = non-ff-update, 2 = ff-update, 4 = creation

So the format wouldn't forever carry the historical wart of this not
having been considered from the beginning.

It would mean that the few current reftable users (just Google?) would
have to look at the record to see if it's *really* a non-ff-update, but
presumably they need to do so today for ff vs. non-ff, so they'd be no
worse off.

Then when those users know they're on a version that distinguishes these
they can rely on 1 meaning "non-ff for sure", rather than a "maybe"
status, for new updates. Presumably they either don't care about ancient
reflog records, or a one-off migration of rewriting the records for older
entries could be done.
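
Roughly, a reader under that scheme would do something like the following
(Python sketch). The numbering is the one floated above, not anything in
the published reftable spec, and writer_distinguishes_ff is a hypothetical
flag for "this record was written by a version that knows about ff vs.
non-ff".

```python
# log_type values as floated in this message; not part of any released spec.
PROPOSED_LOG_TYPES = {
    0: "deletion",
    1: "non-ff-update",
    2: "ff-update",
    4: "creation",
}


def interpret_log_type(log_type: int, writer_distinguishes_ff: bool) -> str:
    """Map a log_type to its meaning under the proposed numbering.

    For records from writers that predate the distinction, 1 only means
    "some update", so the caller still has to inspect the record.
    """
    if log_type not in PROPOSED_LOG_TYPES:
        raise ValueError(f"unknown log_type: {log_type}")
    if log_type == 1 and not writer_distinguishes_ff:
        return "update (ff or non-ff unknown)"
    return PROPOSED_LOG_TYPES[log_type]
```

After a one-off rewrite of old entries the flag would always be true and
the "unknown" branch goes away.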

Also between my [1] and this proposal we have at least a reftable v1.01
in the wild (the filename locking behavior change discussed in [1]), and
this would make it v1.02, but the only up-to-date spec is for v1.00 (and
maybe JGit has other changes I haven't tracked).

That [1] change is minor, but still, a spec change.

So just a *poke* that having some version where the spec is kept
up-to-date with that and this change if it happens would be very useful,
especially if the reftable-in-git.git lands one of these days.

1. https://lore.kernel.org/git/87k0tzulf1.fsf@evledraar.gmail.com/
