All of lore.kernel.org
 help / color / mirror / Atom feed
From: Han-Wen Nienhuys <hanwen@google.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>, Taylor Blau <me@ttaylorr.com>,
	Han-Wen Nienhuys via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org, lucamilanesio <luca.milanesio@gmail.com>
Subject: Re: [PATCH 2/3] t1405: mark test that checks existence as REFFILES
Date: Thu, 3 Feb 2022 19:10:43 +0100	[thread overview]
Message-ID: <CAFQ2z_MkZBtjViTsDNuKLYUXzFXJM6sPLOvXdiRAZrs84pggUw@mail.gmail.com> (raw)
In-Reply-To: <220203.867dab6dmp.gmgdl@evledraar.gmail.com>

On Thu, Feb 3, 2022 at 6:53 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> >> Yes.  It is exactly the point of the question I asked.  If it is
> >> simple and easy to add such a new type that is ignored/skipped by
> >> existing clients, then we can go that route.  If it is simple and
> >> easy to add a new bit per ref that existing clients would not barf,
> >> we can use that as an alternative implementation strategy.
> >
> > I'm not sure that there are any JGit clients: I committed reftable
> > support at the end of 2019. Before that time, we were running it
> > internally at Google, but only ref storage, and without the posix
> > part. Reflogs were never stored in refable, and I actually found a
> > couple of bugs in Shawn's Java code.
> >
> > Gerrit has increasingly started using Git as a database, and the
> > packed/loose system is just not a very good database, so that
> > motivates the work reftable in general. But the folks who run Gerrit
> > on a POSIX filesystem want to be sure that isn't a fringe feature, so
> > they only want to start using it once Git itself supports it. So there
> > is a chicken & egg problem.
> >
> > It's sad that we have to introduce an existence bit to make things
> > work, but overall it is probably easier for me to do than trying to
> > make sense of sequencer.c and how it uses refs/stash@{0}.
> >
> > Technically, the only obstacle I see is that we'd need to treat an
> > existence entry especially for the purpose of compaction/gc: we can
> > discard older entries, but we shouldn't discard the existence bit, no
> > matter how old it is.
>
> Ah, that's very informative. I had been assuming (or misremembered) that
> reftable was already seeing production use at Google. Perhaps I
> remembering the now-dead Google Code (or whatever it was called). Maybe
> not.

We use the format (the JGit code) at Google, but we only use it to
store refs, that is, the refname => {SHA1, symref, tag} mapping. We
currently don't store reflog data in reftable, and the bugs I found
were just in the reflog parts.

We store the tables in bigtable (among others), so the part that does
the POSIX file locking is new (basically, everything in
reftable/stack.c and its equivalent in JGit).

So we are locked into the format to some degree.

For the existence bit, I think I could simply record a $zeroid =>
$zeroid ref update in ref log and treat that specially.

> In any case, not being locked into the format as specified is very
> nice. So is it basically seeing no (production) use anywhere as far as
> you know? Whether that's in production at Google, or some third parties
> via JGit-something (maybe as editor libraries?).

I think our friends at Gerritforge have been experimenting with it,
but not in a production setting, AFAIK. Luca might confirm.

> Taking a bit of a step back.
>
> I do think that generally speaking parts of this series are putting the
> cart before the horse in seemingly trying to get the test suite clean
> before we have the integration in-tree.
>
> Not everything you have here, but some of it.
>
> I know I'm the one who started encouraging you to work towards getting
> the test mode passing, but I think that while it's good to mark some
> obviously file-only tests beforehand, anything where we have different
> behavior under reftable should really come after.

(I can't parse your last sentence)

> Of course that will mean we'll have some interim period where our test
> suite is a dumpster fire under GIT_TEST_REFTABLE=true, and I think

It's actually not that bad. By my last count, there were 38 files with
test failures.

> that's fine, as long as we work towards getting it passing, and as long
> as the non-stability of the nascent backend is very prominently
> advertised in the interim.
>
> I.e. I think *the* issue with the original series you had in this regard
> was that git-init.txt (or whatever it was) basically just discussed
> enabling reftable matter-of-factly, when we were still failing
> tens/hundreds of tests, which is just setting up a big bear trap for
> users to step into.

I read that comment, and I removed that long ago. Right now the only
way to get a reftable is to say GIT_TEST_REFTABLE=1 on init.

> But if we just changed those docs a bit to note "!!WARNING WARNING!!
> EXPERIMENTAL AND UNSTABLE !!WARNING WARNING!!" or whatever we could
> merge the API integration parts sooner than later, even with a lot of
> known-broken tests.
>
> We could then whitelist the broken parts, and work on narrowing that set
> down. Similar what the SANITIZE=leak mode is currently doing for memory
> leaks.
>
> I think that would make things a lot easier when reviewing submissions
> like these, in that we have reftable/* in-tree already, but with the
> "real" integration we could check how files/reftable backends behave,
> add the diverging behavior to tests etc.
>
> What do you think?

Sounds more fun than the current model :-)

The latest version of the code is here:
https://github.com/hanwen/git/tree/merged-seen-20220117

What do you need besides the "RFC: reftable backend" commit?

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

  reply	other threads:[~2022-02-03 18:10 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-31 17:50 [PATCH 0/3] reftable related test tweaks Han-Wen Nienhuys via GitGitGadget
2022-01-31 17:50 ` [PATCH 1/3] t1405: explictly delete reflogs for reftable Han-Wen Nienhuys via GitGitGadget
2022-01-31 17:50 ` [PATCH 2/3] t1405: mark test that checks existence as REFFILES Han-Wen Nienhuys via GitGitGadget
2022-01-31 21:26   ` Taylor Blau
2022-01-31 22:15     ` Junio C Hamano
2022-02-01 20:06       ` Han-Wen Nienhuys
2022-02-01 21:03         ` Junio C Hamano
2022-02-01 21:22           ` Ævar Arnfjörð Bjarmason
2022-02-01 22:11             ` Junio C Hamano
2022-02-03 16:02               ` Han-Wen Nienhuys
2022-02-03 17:39                 ` Ævar Arnfjörð Bjarmason
2022-02-03 18:10                   ` Han-Wen Nienhuys [this message]
2022-02-03 23:06                 ` Junio C Hamano
2022-02-07  9:48                   ` Han-Wen Nienhuys
2022-02-07 16:52                     ` Han-Wen Nienhuys
2022-02-07 23:40                       ` Junio C Hamano
2022-02-08 14:58                         ` Han-Wen Nienhuys
2022-01-31 17:50 ` [PATCH 3/3] t5312: prepare for reftable Han-Wen Nienhuys via GitGitGadget
2022-02-01 21:17   ` Ævar Arnfjörð Bjarmason
2022-02-03 14:24     ` Han-Wen Nienhuys
2022-02-03 18:31       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAFQ2z_MkZBtjViTsDNuKLYUXzFXJM6sPLOvXdiRAZrs84pggUw@mail.gmail.com \
    --to=hanwen@google.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=luca.milanesio@gmail.com \
    --cc=me@ttaylorr.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.