git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Stefan Beller <sbeller@google.com>
To: Derrick Stolee <stolee@gmail.com>
Cc: git <git@vger.kernel.org>, "Jeff King" <peff@peff.net>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Jakub Narębski" <jnareb@gmail.com>,
	"Derrick Stolee" <dstolee@microsoft.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: Re: [RFC] Generation Number v2
Date: Mon, 29 Oct 2018 12:22:03 -0700	[thread overview]
Message-ID: <CAGZ79ka-FTqaXdrMixjUp2THJ3L0YvEnkKxs3XFgB3WEEy2-Tg@mail.gmail.com> (raw)
In-Reply-To: <6367e30a-1b3a-4fe9-611b-d931f51effef@gmail.com>

> Based on the performance results alone, we should remove minimum
> generation numbers, (epoch, date) pairs, and FELINE index from
> consideration. There are enough examples of these indexes performing
> poorly.
>
> In contrast, maximum generation numbers and corrected commit
> dates both performed quite well. They are frequently the top
> two performing indexes, and rarely significantly different.
>
> The trade-off here now seems to be: which _property_ is more important,
> locally-computable or backwards-compatible?
>
> * Maximum generation number is backwards-compatible but not
>    locally-computable or immutable.

These maximum generation numbers sound like the reverse of
the generation numbers as they are currently implemented, i.e.
we count all commits between the commit A and all heads.
How would this impact creation of a commit?

The current generation numbers can be lazily updated or not
updated at all. In my understanding of the maximum generation
numbers, a new commit would make these maximum generation
numbers invalid (i.e. they have to be recomputed).

Are there ways out by deferring the computation of maximum
generation numbers while still being able to work with some
commits that are un-accounted for?

When recomputing these numbers, the current generation number
(V0) has the property that already existing numbers can be re-used
as-is. We only need to compute the numbers for new commits,
and then insert this to the appropriate data structures (which is
a separate problem, one could imagine a split generation
numbers file like the split index)

For the V2 maximum generation numbers, would we need to
rewrite the numbers for all commits once we recompute them?
Assuming that is true, it sounds like the benchmark doesn't
cover the whole costs associated with V2, which is why the
exceptional performance can be explained.

(Note: The table as read in
https://public-inbox.org/git/6367e30a-1b3a-4fe9-611b-d931f51effef@gmail.com/
seems line broken, using gmails web interface is not good
for ASCII art and patches, git-send-email would fare better)

>
> * Corrected commit-date is locally-computable and immutable,
>    but not backwards-compatible.

How are these dates not backwards incompatible?
We don't have to expose these dates to the user, but
- just like generation numbers - could store them and use them
but not tell the user about it.

We would need to be really careful to not expose them at all
as they look like the real dates, so that could make for an
awkward bug report.

The approach of "corrected commit date" sounds like we could
have a very lazy approach, i.e. no extra data structures needed
for many commits (as the corrected date equals the real date)
and only need to store the corrections for some commits.
Such an approach however would not make it easy to know
if we operate on corrected dates, or if we even checked them
for correctness before.

So if we'd have an additional row in the generation numbers file
telling the corrected date, then we should be able to be backwards
compatible?

  reply	other threads:[~2018-10-29 19:22 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-29 16:55 [RFC] Generation Number v2 Derrick Stolee
2018-10-29 19:22 ` Stefan Beller [this message]
2018-10-29 20:06   ` Derrick Stolee
2018-11-01 20:06   ` Jakub Narebski
2018-11-02  9:30     ` Jakub Narebski
2018-11-03 17:27       ` Jakub Narebski
2018-10-29 20:25 ` Derrick Stolee
2018-11-01 22:13   ` Jakub Narebski
2018-10-30  3:59 ` Junio C Hamano
2018-10-31 12:30   ` Derrick Stolee
2018-11-02 13:33     ` Jakub Narebski
2018-10-31 12:54   ` Ævar Arnfjörð Bjarmason
2018-10-31 13:04     ` Derrick Stolee
2018-11-02 17:44       ` Jakub Narebski
2018-11-01 12:27 ` Jakub Narebski
2018-11-01 13:29   ` Derrick Stolee
2018-11-03 12:33     ` Jakub Narebski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAGZ79ka-FTqaXdrMixjUp2THJ3L0YvEnkKxs3XFgB3WEEy2-Tg@mail.gmail.com \
    --to=sbeller@google.com \
    --cc=avarab@gmail.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jnareb@gmail.com \
    --cc=peff@peff.net \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).