From: Avery Pennarun <apenwarr@gmail.com>
To: Stefan Beller <sbeller@google.com>
Cc: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Git Mailing List" <git@vger.kernel.org>,
	"Junio C Hamano" <gitster@pobox.com>, "Jeff King" <peff@peff.net>,
	"Stephen R Guglielmo" <srguglielmo@gmail.com>,
	"A . Wilcox" <AWilcox@wilcox-tech.com>,
	"David Aguilar" <davvid@gmail.com>
Subject: Re: [PATCH 0/4] subtree: move out of contrib
Date: Mon, 30 Apr 2018 17:53:28 -0400
Message-ID: <CAHqTa-1KCsbG=6T8M0PLuM5s-j972jiv=vvZHUiwOxwgpPWJeA@mail.gmail.com>
In-Reply-To: <CAGZ79kakirTjA32cTmByLpjnb3QKUL5eGEgPFFMhUnewC73S8Q@mail.gmail.com>

On Mon, Apr 30, 2018 at 5:38 PM, Stefan Beller <sbeller@google.com> wrote:
> On Mon, Apr 30, 2018 at 1:45 PM, Avery Pennarun <apenwarr@gmail.com> wrote:
> No objections from me either.
>
> Submodules seem to serve a slightly different purpose, though?

I think the purpose is actually the same - it's just the tradeoffs
that are different.  Both sets of tradeoffs kind of suck.

> With Subtrees the superproject always contains all the code,
> even when you squash the subtree history when merging it in.
> In the submodule world, you may not have access to one of the
> submodules.

Right.  Personally I think it's a disadvantage of subtree that it
always contains all the code (what if some people don't want the code
for a particular build variant?).  However, it's a huge pain that
submodules *don't* contain all the code (what if I'm not online right
now, or the site supposedly containing the code goes offline, or I
want to make my own fork?).

For the best of both worlds, I've often thought that a good balance
would be to use the same data structure that submodule uses, but to
store all the code in a single git repo under different refs, which we
might or might not download (or might or might not have different
ACLs) under different circumstances.  However, when some projects get
really huge (lots of very big submodule dependencies), then repacking
one-big-repo starts becoming unwieldy; in that situation git-subtree
also fails completely.
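
To make the "one repo, different refs" idea a bit more concrete, here
is a minimal sketch that fetches a dependency's branches into a private
ref namespace of the superproject.  The refs/modules/<name>/ namespace
and the function name are assumptions made up for illustration; git has
no such convention today, and nothing here teaches the submodule code to
look for commits there.

```shell
#!/bin/sh
# Hypothetical helper: store a dependency's branches in this repo under
# refs/modules/<name>/heads/ instead of keeping a separate clone.
# The refs/modules/ namespace is an assumption, not a git convention.
import_module_refs() {
    name=$1   # short name for the dependency (placeholder)
    url=$2    # where the dependency lives (placeholder)
    # The leading "+" allows non-fast-forward updates of the imported refs.
    git fetch --no-tags "$url" "+refs/heads/*:refs/modules/$name/heads/*"
}
```

Something like `import_module_refs libfoo <url>` would then leave the
dependency's history in the same object store, where whether to fetch
it at all becomes a per-ref decision.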

> Submodules do not need to produce a synthetic project history
> when splitting off again, as the history is genuine. This allows
> for easier work with upstream.

Splitting for easier work upstream is great, and there really ought to
be an official version of 'git subtree split', which is good for all
sorts of purposes.

However, I suspect almost all uses of the split feature are a)
splitting a subtree that you previously merged in, or b) splitting a
subtree into a separate project that you want to maintain separately
from now on.  Repeated splits in case (a) are only necessary because
you're not using submodules, or in case (b) are only necessary because
you didn't *switch* to submodules when it finally came time to split
the projects.  (In both cases you probably didn't switch to submodules
because you didn't like one of its tradeoffs, especially the need to
track multiple repos when you fork.)
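
For reference, a split in either case looks roughly like this.  It is
only a sketch: the wrapper name and its arguments are placeholders, and
it assumes git-subtree is installed (it still lives in contrib).

```shell
#!/bin/sh
# Extract the history of one directory into its own branch.
# split_subtree is a hypothetical wrapper; prefix and branch are placeholders.
split_subtree() {
    prefix=$1   # directory inside the superproject, e.g. vendor/libfoo
    branch=$2   # new branch that will hold the synthetic history
    git subtree split --prefix="$prefix" -b "$branch"
}
```

The resulting branch contains only commits touching that directory,
rewritten so the directory is the project root, and can then be pushed
to the separate project.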

> Subtrees present you the whole history by default and the user
> needs to be explicit about not wanting to see history from the
> subtree, which is the opposite of submodules (though this
> may be planned in the future to switch).

It turns out that AFAIK, almost everyone prefers git subtree's
--squash option, which squashes into a single commit each time you merge,
much like git submodule does.  I doubt people would cry too much if
the full-history feature went away.
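
Concretely, --squash is an option to the add, merge and pull
subcommands.  A sketch of the usual workflow, with hypothetical wrapper
names and placeholder arguments:

```shell
#!/bin/sh
# Hypothetical wrappers around the squash workflow.
# $1 = prefix directory, $2 = repository URL, $3 = ref to import.
vendor_add() {
    # First import: one squashed commit plus a merge commit.
    git subtree add --prefix="$1" --squash "$2" "$3"
}
vendor_update() {
    # Later updates: squash everything new upstream into one commit.
    git subtree pull --prefix="$1" --squash "$2" "$3"
}
```

Each update then shows up in the superproject as a single squashed
commit rather than the dependency's full history.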

There's one exception, which is doing a one-time permanent merge of
two projects into one.  That's a nice feature, but is probably used
extremely rarely.  More often people get into a
merge-split-merge-split cycle that would be better served by a
slightly improved git-submodule.

>> The gerrit team (e.g. Stefan Beller) has been doing some really great
>> stuff to make submodules more usable by helping with relative
>> submodule links and by auto-updating links in supermodules at the
>> right times.  Unfortunately doing that requires help from the server
>> side, which kind of messes up decentralization and so doesn't solve
>> the problem in the general case.
>
> Conceptually Gerrit is doing
>
>   while true:
>     git submodule update --remote
>     if worktree is dirty:
>         git commit "update the submodules"
>
> except that Gerrit doesn't poll; it is event-driven.

...and it's super handy :)  The problem is it's fundamentally
centralized: because gerrit can serialize merges into the submodule,
it also knows exactly how to update the link in the supermodule.  If
there was wild branching and merging (as there often is in git) and
you had to resolve conflicts between two submodules, I don't think it
would be obvious at all how to do it automatically when pushing a
submodule.  (This also works quite badly with git subtree --squash.)
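
Spelled out as real commands, one iteration of the quoted loop might
look like the sketch below.  This is only an approximation of the
pseudocode, not what Gerrit actually runs; the function name is
hypothetical, and it assumes a linear upstream so there is never a
conflict to resolve.

```shell
#!/bin/sh
# One iteration of the conceptual polling loop (hypothetical helper).
update_submodule_links() {
    # Move each submodule to the tip of its remote-tracking branch.
    git submodule update --remote || return 1
    # "git diff --quiet" exits non-zero when tracked content, including
    # submodule links, differs from the index.
    if ! git diff --quiet; then
        git commit -a -m "update the submodules"
    fi
}
```

Wrapping this in `while true; do update_submodule_links; sleep 60; done`
gives the polling version; the hard part described above, picking the
right link when two submodule histories have diverged, is exactly what
this sketch leaves out.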

>> I really wish there were a good answer, but I don't know what it is.
>> I do know that lots of people seem to at least be happy using
>> git-subtree, and would be even happier if it were installed
>> automatically with git.
>
> https://trends.google.com/trends/explore?date=all&q=git%20subtree,git%20submodule
>
> Not sure what to make of this data.

Clearly people need a lot more help when using submodules than when
using subtree :)

Have fun,

Avery
