From: Avery Pennarun <apenwarr@gmail.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Bryan Larsen <bryan.larsen@gmail.com>, git <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: Avery Pennarun's git-subtree?
Date: Wed, 21 Jul 2010 17:09:17 -0400
Message-ID: <AANLkTimiROxqf7KcRKTZvMvsFdd4w3jK_GLeZR8n7tdA@mail.gmail.com>
In-Reply-To: <AANLkTikl2zKcie3YGhBHrGbYbX3yB9QCtuJTKjsAfK07@mail.gmail.com>

On Wed, Jul 21, 2010 at 4:36 PM, Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> On Wed, Jul 21, 2010 at 19:56, Avery Pennarun <apenwarr@gmail.com> wrote:
>> No amount of bugfixing in git submodule can fix this workflow, because
>> it's not a result of bugs.  (The bugs, particularly the
>> disconnected-by-default HEADs on submodule checkouts, do make it a bit
>> worse :( )  It would require a fundamental redesign to make this work
>> nicely with submodules.
> [...]
> I think most of those can be fixed, actually. The only requirement
> that the git plumbing imposes on git-submodules is that a "commit"
> entry exist in your tree, the rest is just (ugly plumbing).

Sure.  But this commit object (and the objects it points to) are never
automatically pushed, fetched, or fsck'd.  They're second class
citizens.  As it turns out, this was a major design mistake in
implementing the submodule commit objects.
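
The second-class status can be seen in a throwaway sandbox. This is a
minimal sketch, not anything from the thread: the repository names
"sub" and "super" are hypothetical, and the 'commit' entry is created
directly with git update-index rather than through git-submodule.

```shell
set -e
# Throwaway sandbox; every repository name below is hypothetical.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A stand-alone project that plays the role of the submodule.
git init -q sub
echo hello > sub/file.txt
git -C sub add file.txt
git -C sub commit -q -m "sub: initial"
sub_commit=$(git -C sub rev-parse HEAD)

# The superproject records only the commit id, as a mode-160000 entry.
git init -q super
git -C super update-index --add --cacheinfo 160000,"$sub_commit",sub
git -C super commit -q -m "super: pin sub"

# The tree entry has type "commit" ...
entry=$(git -C super ls-tree HEAD sub)
echo "$entry"

# ... but the object it names is absent from the superproject's store,
if git -C super cat-file -e "$sub_commit" 2>/dev/null; then
    present=yes
else
    present=no
fi
echo "sub commit present in super: $present"

# and fsck still passes, because it does not follow 'commit' entries.
fsck_ok=no
git -C super fsck && fsck_ok=yes
echo "fsck clean: $fsck_ok"
```

Nothing in the superproject notices that the pinned commit is missing;
only a separate fetch into a separate repository would supply it.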

All the behaviour people *currently* get from submodules could have
been obtained without using a new 'commit' tree entry type at all.  Just
add a commitid to the horrible junk (including repo URLs, argh) that
already needs to get pasted into .gitmodules, and have git-commit at
the top level update .gitmodules automatically (as it currently
updates the 'commit' tree entries).  Problem solved (at least, solved
to exactly the extent that it is today).
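
As a rough sketch of what that alternative could look like: the
"commit" key below is purely HYPOTHETICAL (git neither defines nor
reads it), and the submodule name, path, URL and commit id are made
up for illustration. Only "path" and "url" are keys that .gitmodules
carries today.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Made-up commit id; any 40-hex value serves for the sketch.
pinned=0123456789abcdef0123456789abcdef01234567

# Keys that .gitmodules already needs today.
git config -f .gitmodules submodule.libfoo.path vendor/libfoo
git config -f .gitmodules submodule.libfoo.url git://example.com/libfoo.git
# HYPOTHETICAL key: stock git does not define or consult "commit" here.
git config -f .gitmodules submodule.libfoo.commit "$pinned"

cat .gitmodules
recorded=$(git config -f .gitmodules --get submodule.libfoo.commit)
echo "pinned commit: $recorded"
```

Under this scheme the pin would travel as ordinary file content, so it
would be versioned and merged like any other tracked file.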

What we *really* want is a way to have git actually recurse through
commit objects when doing *any* operation, as if they were tree
objects.  If we had that, submodules could be beautiful (because you'd
push them to the same repo, etc., and users would see none of the
complexity).  But this doesn't exist.  And for backward compatibility
at this point, we'd probably need to introduce an entirely new kind of
tree entry to support such a thing.

> Thus, we could:
>
>   * Hack git-submodule (or its replacement) to check import the tree
>     that contains that "commit" into one central .git

This part is relatively easy, I think - at least in concept, although
I bet there would be widespread implementation tweaks - and would
clean up a lot of the mess.  However it would require a change to the
.git/index file format to remember when a subdir is a commit and not a
"normal" tree so that it doesn't silently commit the next thing as a
tree instead.

>   * Fix git status / git commit so that you could commit into
>     submodules, i.e.:
>
>     for each submodule in this-commit:
>         chdir $submodule && commit
>     done && cd $root && commit -m"bumping submodules"

After making the earlier change to get rid of the extra .git subdirs,
this next requirement would actually be considerably more work,
because 'git commit' would need to know how to update a subcommit
without changing HEAD.  You certainly couldn't just code it up as a
recursive "git commit" as you imply (and as you could do right now).

>   * Make git-push push the submodule contents and the
>     superprojects. You'd just need to have commit access to the url
>     listed in .gitmodules.

This is really a *killer* problem, and you're making it sound easy.
Let's imagine that my app has 25 different submodules - not
unreasonable at all in a world with dozens of ever-changing ruby gems
and suchlike.

Now, if I want to branch my project, I might have to branch 25
projects just so I can push my changes?  It's totally awful.  And the
awfulness is multiplied many times over if .gitmodules has hard-coded
repo paths, because then I have to update the repo path in my branch
but not the other branch, and merging will have conflicts.  You might
think that my .git/config could just override .gitmodules, but then
some guy trying to fetch my branch will fail to fetch the submodules
from my branch and get errors and have no idea what's going on.

And you might think that using relative repo paths in .gitmodules
would work, but that's only if I branched all 25 submodules in the
*first* place.  In real life, most subprojects point at the original
project's home repo by default (because nobody thinks they'll be
patching 25 subprojects when they start, and they're probably right),
but then you have to individually change the URLs when you decide you
need to patch them, and life gets complicated and ugly, especially
when the next guy goes to fork your project and now needs to fork some
subprojects but not others.

There is no good solution to the submodule problem if each submodule
has to go in its own repo.  I've been thinking about this for years
now, and watching lots of discussions about it on the git mailing
list, and I just can't see any other option.  All the submodules have
to get pushed to and fetched from the same repo by default.  Anything
else is insane.

One option might be to store the submodule commit refs as refs in your
superproject.  That wouldn't actually be so bad, except for the
aforementioned problem that fetch/push/clone/etc don't actually trace
through commit objects when deciding what objects to send you, so
fetching the ref of the superproject wouldn't autofetch the subproject
refs.  Also, you could accidentally delete one of the subproject refs
and lose tons of history without ever realizing it.  That's
error-prone and confusing... and clutters up your repo refs list with
administrative stuff you didn't actually want in the first place.
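
A small sandbox sketch of that refs-based option follows. The ref
namespace refs/subprojects/ and the repository names are assumptions
chosen for illustration, not an existing git convention.

```shell
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical subproject with one commit.
git init -q sub
echo hello > sub/file.txt
git -C sub add file.txt
git -C sub commit -q -m "sub: initial"
sub_commit=$(git -C sub rev-parse HEAD)

git init -q super
cd super
# Store the subproject's history in this repo under a made-up namespace.
git fetch -q ../sub HEAD:refs/subprojects/sub
# Pin the same commit as a 'commit' entry in the superproject tree.
git update-index --add --cacheinfo 160000,"$sub_commit",sub
git commit -q -m "super: pin sub"

# Walking from HEAD stops at the 'commit' entry ...
if git rev-list --objects HEAD | grep -q file.txt; then
    via_head=yes
else
    via_head=no
fi
# ... so only the extra ref keeps the subproject's objects reachable.
if git rev-list --objects refs/subprojects/sub | grep -q file.txt; then
    via_ref=yes
else
    via_ref=no
fi
echo "reachable from HEAD: $via_head, from the extra ref: $via_ref"
```

Deleting refs/subprojects/sub here would leave the pinned history
unreachable, even though the superproject tree still names it.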

> What's missing from that (which would be nice) is the ability to check
> out a subdirectory from another repository. That could (I think) be
> done by just adding a normal "tree" entry, and then specifying that
> that tree can be found in git://... instead of the main tree.

Actually that's already easy with submodules (and git-subtree makes it
easy too, though slightly different).  Just fetch the commit from the
other repo, and do:

   git checkout FETCH_HEAD -- subdirname
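
Spelled out end to end in a sandbox, that could look like the sketch
below. The repositories "other" and "mine" and the file names are
hypothetical; only the fetch and checkout steps come from the
command above.

```shell
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# "other" stands in for the repository holding the wanted subdirectory.
git init -q other
mkdir other/subdirname
echo "shared code" > other/subdirname/lib.txt
echo "unrelated" > other/README
git -C other add .
git -C other commit -q -m "other: initial"

# "mine" is the project that wants only subdirname.
git init -q mine
cd mine
git commit -q --allow-empty -m "mine: initial"

# Fetch the other repository's HEAD; git records it as FETCH_HEAD.
git fetch -q ../other HEAD
# Copy just that subdirectory into the index and working tree.
git checkout FETCH_HEAD -- subdirname

# Show what got staged.
git diff --cached --name-only
```

Only subdirname is copied over; the rest of the other project's tree
stays out of the working directory.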

>> If we can get some kind of consensus in principle that git-subtree is
>> a good idea to merge into git core, I can prepare some patches and we
>> can talk about the details.
>
> From having looked at it briefly it looks very nice. But it looks to
> me as if the main differences between git-submodule and git-subtree
> are in the porcelain, not the plumbing.

No.  The fundamental difference is exactly one: git-subtree uses
normal 'tree' entries (rather than commits) in its trees, so that all
the git tools recurse through them like any other tree.  Thus you
don't need any extra refs, extra .git dirs, etc.  That allows you to
bypass all the useless behaviour git has around 'commit' entries.
This is very much a plumbing difference.
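
To illustrate the entry-type difference with stock plumbing only: the
sketch below imports another project's tree under a prefix using
git read-tree. It is an approximation for illustration, not
git-subtree's own implementation, and the names "lib", "app" and
vendor/lib are hypothetical.

```shell
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical subproject.
git init -q lib
echo "int answer(void) { return 42; }" > lib/answer.c
git -C lib add answer.c
git -C lib commit -q -m "lib: initial"

# Hypothetical superproject.
git init -q app
cd app
echo app > README
git add README
git commit -q -m "app: initial"

# Import lib's tree under vendor/lib/ as ordinary tree entries.
git fetch -q ../lib HEAD
git read-tree --prefix=vendor/lib/ -u FETCH_HEAD
git commit -q -m "app: import lib under vendor/lib"

# The subproject appears as a plain 'tree' entry, not a 'commit' entry.
git ls-tree HEAD vendor/
# Its blobs are reachable from the superproject's own history.
git rev-list --objects HEAD | grep answer.c
```

Because the import is a normal tree, push, fetch, clone and fsck all
cover it without any extra refs or a second .git directory.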

The git-submodule porcelain happens to independently be kind of
annoying and inconvenient, but that would be much easier to fix if it
weren't for the plumbing-related problems.

> It would be a lot less confusing to users of Git in the long term if
> we would at least try to unify these two approaches instead of having
> two mutually incompatible ways of doing essentially the same thing.

True.  But I don't have the time, and implementing the new 'commit'
entry semantics sounds like a lot of work (as opposed to arguing about
them, which I guess I'm good at but which seems unproductive).

In productive terms: git-subtree is solving problems for real users
right now.  It might solve more problems for more users if it were
integrated into the core and thus made "official."  Nothing precludes
making submodules better later.

Have fun,

Avery
