From: Jeff King <peff@peff.net>
To: Han-Wen Nienhuys <hanwen@google.com>
Cc: git <git@vger.kernel.org>
Subject: Re: Distinguishing FF vs non-FF updates in the reflog?
Date: Thu, 18 Mar 2021 15:47:27 -0400
Message-ID: <YFOuT6L0dsrCGTBk@coredump.intra.peff.net>
In-Reply-To: <CAFQ2z_MefCwiWdhs0buJv5Zok+nsgaOvUCcsSnfm_PP0WozZKA@mail.gmail.com>

On Wed, Mar 17, 2021 at 09:06:06PM +0100, Han-Wen Nienhuys wrote:

> I'm working on some extensions to Gerrit for which it would be very
> beneficial if we could tell from the reflog if an update is a
> fast-forward or not: if we find a SHA1 in the reflog, and see there
> were only FF updates since, we can be sure that the SHA1 is reachable
> from the branch, without having to open packfiles and decode commits.

I left some numbers in another part of the thread, but IMHO performance
isn't that compelling a reason to do this these days, if you are using
commit-graphs.

Just walking the reflog might be _slightly_ faster, though not
necessarily (it depends on whether the object graph or the reflog
chain is deeper). It might matter more if you are
using a more exotic storage scheme, where switching from accessing
reflogs to objects implies extra round-trips to a server (e.g., custom
storage backends with JGit; I don't know the state of the art in what
Google is doing there).

> For the reftable format, I think we could store this easily by
> introducing more record types. Today we have 0 = deletion, 1 = update,
> and we could add 2 = FF update, 3 = non-FF update.
> 
> However, the textual reflog format doesn't easily allow for this.
> However, we might add a convention, eg. have the message start with
> 'FF' or 'NFF' depending on the nature of the update.
> 
> Does this make sense, and if yes is it worth proposing a change?
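
For illustration, the check described in the quoted text could look
roughly like the sketch below. It assumes the files-backend reflog line
layout (old SHA, new SHA, identity, then a tab and the message) and the
proposed convention, which does not exist in Git today, that the message
starts with the word 'FF' or 'NFF'. The names parse_reflog_line and
known_reachable are made up for this example.

```python
from typing import Iterable, NamedTuple


class ReflogEntry(NamedTuple):
    old: str
    new: str
    message: str


def parse_reflog_line(line: str) -> ReflogEntry:
    # Assumed layout: "<old> <new> <identity> <time> <tz>\t<message>"
    header, _, message = line.rstrip("\n").partition("\t")
    old, new, _identity = header.split(" ", 2)
    return ReflogEntry(old, new, message)


def is_ff(entry: ReflogEntry) -> bool:
    # Hypothetical convention: the first word of the message is "FF".
    return entry.message.split(" ", 1)[0] == "FF"


def known_reachable(lines: Iterable[str], sha: str) -> bool:
    """True if the log alone proves sha is reachable from the ref tip.

    Lines are expected oldest first. False means "cannot tell", not
    "unreachable".
    """
    seen = False
    prev_new = None
    for entry in map(parse_reflog_line, lines):
        # A gap in the chain (e.g. expired entries) voids earlier evidence.
        if prev_new is not None and entry.old != prev_new:
            seen = False
        # sha was the tip just before this update.
        if entry.old == sha:
            seen = True
        # Any update not marked FF may have dropped sha from history.
        if seen and not is_ff(entry):
            seen = False
        # sha is the tip right after this update.
        if entry.new == sha:
            seen = True
        prev_new = entry.new
    return seen
```

A False result only means the reflog cannot answer the question; a
caller would then fall back to a normal object walk.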

At GitHub we do something similar. We don't generally use reflogs much
at all, but we keep a custom "audit log": a single append-only file that
records every ref update in the repository. And its format just happens
to be one reflog entry per line, prefixed by the updated ref.

And there we do generally annotate the FF-ness of an update by stuffing
it into the free-form message field (in fact, we shove in a small JSON
object, so we record multiple fields like the pushing id, IP, etc).
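
As a rough sketch of what reading such a line could look like: the
layout below (ref name, then a reflog-style entry whose message is a
JSON object) follows the description above, but the exact field order
and the JSON keys "ff", "user" and "ip" are assumptions made for this
example, not the real format. parse_audit_line and forced_updates are
likewise hypothetical names.

```python
import json
from typing import Any, Dict, Iterable, Iterator, NamedTuple


class AuditEntry(NamedTuple):
    ref: str
    old: str
    new: str
    meta: Dict[str, Any]


def parse_audit_line(line: str) -> AuditEntry:
    # Assumed layout:
    #   "<ref> <old> <new> <identity> <time> <tz>\t<json message>"
    # Ref names cannot contain spaces, so splitting on spaces is safe.
    header, _, message = line.rstrip("\n").partition("\t")
    ref, old, new, _identity = header.split(" ", 3)
    return AuditEntry(ref, old, new, json.loads(message))


def forced_updates(lines: Iterable[str]) -> Iterator[AuditEntry]:
    # Yield entries explicitly recorded as non-fast-forward. This works
    # even after the objects themselves have been pruned, because the
    # answer was written down at update time.
    for line in lines:
        entry = parse_audit_line(line)
        if entry.meta.get("ff") is False:
            yield entry
```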

But the main goal there isn't performance (and in fact we don't
generally consult it for anything outside of debugging). The reason we
record FF-ness is for later debugging or analysis. We don't prune from
the audit log, and we don't consider it for reachability when we prune
objects (since otherwise you'd never be able to prune anything!). So the
objects sometimes aren't available later to compute FF-ness from, but we
still want to know if the user did a force-push, etc.

I don't think that really applies to regular reflogs, because they do
imply reachability (and they are not great for later analysis, because
we may selectively expire unreachable entries).

-Peff
