On 2020-05-11 at 18:36:02, Jonathan Nieder wrote:
> Separate from the integration aspect is that this is not yet
> battle-tested code. One benefit of sharing code is to be able to share
> the benefit of users testing it.
>
> Since the ref system is fairly modular and this is about a non-default
> backend, it's likely okay to integrate it initially as "experimental"
> and then update docs as we gain confidence.

If we're going to integrate it, I would like to pass the testsuite when
we do. We can certainly do a series of preparatory patches (e.g., to the
testsuite) to get it ready, but once people can turn it on or use it
(via config, I assume), it should be fully functional and tested.

Having said that, I agree we should mark it as experimental first, at
least for a while. I'm interested to see how it works both from a
functionality perspective as well as a performance perspective. For
example, is reftable a win with a relatively large number of refs (say,
tens of thousands)? Operational experience will tell us that and help
guide us to figure out if and when it should be the default.

> > Johannes had suggested that this should be developed and maintained in
> > git-core first, and the result could then be reused by libgit2
> > project. According to the libgit2 folks, this what that would look
> > like:
> >
> > """
> > - It needs to be easy to split out from git-core. If it is
> >   self-contained in a single directory, then I'd be sufficiently
> >   happy already.
> >
> > - It should continue providing a clean interface to external
> >   callers. This also means that its interface should stay stable so
> >   we don't have to adapt on every update. git-core historically
> >   never had such promises, but it kind of worked out for the xdiff
> >   code.
> >
> > - My most important fear would be that the reftable interface
> >   becomes heavily dependent on git-core's own data types. We had
> >   this discussion at the Contributor's Summit already, but if it
> >   starts adopting things like git-core's own string buffer then it
> >   would become a lot harder for us to use it.
> >
> > - Probably obvious, but contained in the above is that it stays
> >   compilable on its own. So even if you split out its directory and
> >   wire up some build instructions, it should not have any
> >   dependencies on git-core.
> > """
> >
> > (for the discussion at the summit:
> > https://lore.kernel.org/git/1B71B54C-E000-4CEB-8AC6-3DB86E96E31A@jramsay.com.au/)
> >
> > I can make that work, but it would be good to know if this is
> > something the project is OK with in principle, or whether the code
> > needs to be completely Git-ified. If the latter happens, that would
> > effectively fork the code, which I think is a missed opportunity.
>
> There's been some discussion about use of strbuf versus the module's
> own growing-buffer "struct slice". Is that the only instance of this
> kind of infrastructure duplication or are there others?

There's duplication of the hash algorithm stuff. I don't know what else
because I haven't taken an in-depth look at the code other than for
SHA-256 compatibility. I think Dscho has more insight here.

In general, my view here is that if this is going to be a part of core
Git, then it should live in core Git and use our standard tooling. If
this is going to be a logically independent shared library (like zlib)
or an optional feature that one can compile with or not, then it can
live outside of the tree as a separate project (and shared library) that
we link against.

We've already seen a bunch of compatibility pain from sha1dc, which has
a much smaller, more well-defined interface. I'd like to not repeat
that behavior with the reftable code, especially since Git runs on a
wide variety of systems and has significant compatibility needs.

I also don't love the fact that we have an update script that overwrites
all of our changes with the upstream code when there are some of us who
have no intention of contributing to upstream (e.g., for CLA reasons).
Barring some way of addressing those concerns, I think we're going to
need to assume that updates require some sort of manual rebase work like
with other code that we import.

> Yes, I'm *very* excited that the series includes a knob for running
> the testsuite using the reftable ref backend, providing a way for
> anyone interested to pitch in to help with the issues they reveal
> (both in Git and in its testsuite).

I think this is a good feature to have and definitely the right way
forward.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204