All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jonathan Nieder <jrnieder@gmail.com>
To: Han-Wen Nienhuys <hanwen@google.com>
Cc: git <git@vger.kernel.org>, Jeff King <peff@peff.net>,
	"brian m. carlson" <sandals@crustytoothpaste.net>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Michael Haggerty <mhagger@alum.mit.edu>,
	Patrick Steinhardt <ps@pks.im>
Subject: Re: How to move forward with Reftable
Date: Mon, 11 May 2020 11:36:02 -0700	[thread overview]
Message-ID: <20200511183602.GA141481@google.com> (raw)
In-Reply-To: <CAFQ2z_Ne-1TWMRS88ADhzQmx4OfzNEkVUJ1anjf57mQD4Gdywg@mail.gmail.com>

(+cc: a few people I'm particularly interested in feedback from)

Han-Wen Nienhuys wrote[3]:

> I have noticed that not everyone is happy with the reftable series as
> it currently stands. I wanted to give some background on why it looks
> the way it does today, I want to solicit feedback on where to go from
> here, and share my worries about the negative feedback I've gotten so
> far.
[...]
> As it currently stands, the library implements the specification
> completely, including the parts that are currently probably not useful
> for Git, like support for the object-ID/ref mapping. The API between
> reftable and git is extremely narrow: it's about 20 functions, and 5
> structure definitions. It's also well tested (see coverage [2]. As of
> today, the tests are leak-free according to valgrind). This is also
> why I don't believe Johannes' argument that, because he found one bug,
> it must be full of bugs.
>
> This approach has been successful, in the sense that the libgit2
> project has experimented with integrating it, see
> https://github.com/libgit2/libgit2/pull/5462. They seem happy at the
> prospect of integrating existing code rather than reimplementing it.

Separate from the integration aspect is that this is not yet
battle-tested code.  One benefit of sharing code is to be able to share
the benefit of users testing it.

Since the ref system is fairly modular and this is about a non-default
backend, it's likely okay to integrate it initially as "experimental"
and then update docs as we gain confidence.

> Johannes had suggested that this should be developed and maintained in
> git-core first, and the result could then be reused by libgit2
> project. According to the libgit2 folks, this what that would look
> like:
>
> """
>     - It needs to be easy to split out from git-core. If it is
>       self-contained in a single directory, then I'd be sufficiently
>       happy already.
>
>     - It should continue providing a clean interface to external
>       callers. This also means that its interface should stay stable so
>       we don't have to adapt on every update. git-core historically
>       never had such promises, but it kind of worked out for the xdiff
>       code.
>
>     - My most important fear would be that the reftable interface
>       becomes heavily dependent on git-core's own data types. We had
>       this discussion at the Contributor's Summit already, but if it
>       starts adopting things like git-core's own string buffer then it
>       would become a lot harder for us to use it.
>
>     - Probably obvious, but contained in the above is that it stays
>       compilable on its own. So even if you split out its directory and
>       wire up some build instructions, it should not have any
>       dependencies on git-core.
> """
>
> (for the discussion at the summit:
> https://lore.kernel.org/git/1B71B54C-E000-4CEB-8AC6-3DB86E96E31A@jramsay.com.au/)
>
> I can make that work, but it would be good to know if this is
> something the project is OK with in principle, or whether the code
> needs to be completely Git-ified. If the latter happens, that would
> effectively fork the code, which I think is a missed opportunity.

There's been some discussion about use of strbuf versus the module's
own growing-buffer "struct slice".  Is that the only instance of this
kind of infrastructure duplication or are there others?

> I have received a lot of negative comments, and I can't gauge well
> what to make of them, but they kindle several worries:
>
> * The git community decides they don't need the object reverse index,
>   and do not write 'o' section in reftables, because that culls 1000
>   lines of code. This generally works, but will cause hard to debug
>   performance regressions if command-line git is used on a gerrit
>   server.
>
> * The git community looks at this, and decides the standard is too
>   complex, and goes off to create a reftable v3.
>
> * The git community asks me to do a ton of superficial work (eg. slice
>   -> strbuf), and then decides the overall design needs to be different,
>   and should be completely rewritten.
>
> Jonathan Nieder said I shouldn't worry about standards compliance,
> because the Git project has already ratified the reftable standard,
> and wouldn't want to break JGit compatibility, but it would be good to
> have the community leaders reaffirm this stance.

Recording from today's IRC standup for easy reference later:

  <gitster> jrnieder: that might have been solved already by me declaring
  that whether they like it or not, reftable must be in git-core to match
  what JGit has.

> For my last worry, it would be good if someone would make an
> assessment of the overall design to see if it is acceptable.

Thanks.  I think Peff did a little in this area; help from others
would be welcome as well.

> Once we have a path forward we can think of a way of integrating the
> code. I think it may take a while to shake out bugs. Not bugs in the
> reftable library itself, but Git is not very strict in enforcing
> proper use of the ref backend abstraction (eg. pseudorefs [1]). Many
> integration tests also don't go through proper channels for accessing
> refs.

Yes, I'm *very* excited that the series includes a knob for running
the testsuite using the reftable ref backend, providing a way for
anyone interested to pitch in to help with the issues they reveal
(both in Git and in its testsuite).

Summary: thanks for writing this.  I personally think your proposal is
a reasonable one, but I'm looking forward to hearing from others as
well.  I'm also looking forward to discussing "struct slice" and the
overall architecture in the context of the patches[4] and making some
more progress.

Sincerely,
Jonathan

> cheers,
>
> [1] https://lore.kernel.org/git/CAFQ2z_NZgkPE+3oazfb_m0_7TWxHjje1yYCc0bMZG05_KUKEow@mail.gmail.com/
>
> [2] https://hanwen.home.xs4all.nl/public/software/reftable-coverage/

[3] https://lore.kernel.org/git/CAFQ2z_Ne-1TWMRS88ADhzQmx4OfzNEkVUJ1anjf57mQD4Gdywg@mail.gmail.com/

[4] https://lore.kernel.org/git/pull.539.v12.git.1588845585.gitgitgadget@gmail.com/

  reply	other threads:[~2020-05-11 18:36 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-07 18:51 How to move forward with Reftable Han-Wen Nienhuys
2020-05-11 18:36 ` Jonathan Nieder [this message]
2020-05-11 18:50   ` Junio C Hamano
2020-05-12 23:37   ` brian m. carlson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200511183602.GA141481@google.com \
    --to=jrnieder@gmail.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=hanwen@google.com \
    --cc=mhagger@alum.mit.edu \
    --cc=peff@peff.net \
    --cc=ps@pks.im \
    --cc=sandals@crustytoothpaste.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.