All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matheus Tavares Bernardino <matheus.bernardino@usp.br>
To: Elijah Newren via GitGitGadget <gitgitgadget@gmail.com>
Cc: git <git@vger.kernel.org>, Derrick Stolee <dstolee@microsoft.com>,
	Elijah Newren <newren@gmail.com>
Subject: Re: [PATCH] unpack-trees: do not set SKIP_WORKTREE on submodules
Date: Wed, 17 Jun 2020 14:58:41 -0300	[thread overview]
Message-ID: <CAHd-oW5gTJO=6pYXvg3v=JfjffcajPyMpsUOoqXnozwYrg3WwQ@mail.gmail.com> (raw)
In-Reply-To: <pull.809.git.git.1592356884310.gitgitgadget@gmail.com>

On Tue, Jun 16, 2020 at 10:21 PM Elijah Newren via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Elijah Newren <newren@gmail.com>
>
> As noted in commit e7d7c73249 ("git-sparse-checkout: clarify
> interactions with submodules", 2020-06-10), sparse-checkout cannot
> remove submodules even if they don't match the sparsity patterns,
> because doing so would risk data loss -- unpushed, uncommitted, or
> untracked changes could all be lost.  That commit also updated the
> documentation to point out that submodule initialization state was a
> parallel, orthogonal reason that entries in the index might not be
> present in the working tree.
>
> However, sparsity and submodule initialization weren't actually fully
> orthogonal yet.  The SKIP_WORKTREE handling in unpack_trees would
> attempt to set the SKIP_WORKTREE bit on submodules when the submodule
> did not match the sparsity patterns.  This resulted in innocuous but
> potentially alarming warning messages:
>
>     warning: unable to rmdir 'sha1collisiondetection': Directory not empty
>
> It could also make things confusing since the entry would be marked as
> SKIP_WORKTREE in the index but actually still be present in the working
> tree:
>
>     $ git ls-files -t | grep sha1collisiondetection
>     S sha1collisiondetection
>     $ ls -al sha1collisiondetection/ | wc -l
>     13
>
> Submodules have always been their own form of "partial checkout"
> behavior, with their own "present or not" state determined by running
> "git submodule [init|deinit|update]".  Enforce that separation by having
> the SKIP_WORKTREE logic not touch submodules and allow submodules to
> continue using their own initialization state for determining if the
> submodule is present.

Makes sense to me.

I'm just thinking about the possible implications in grep (with
mt/grep-sparse-checkout). As you mentioned in [1], users might think
of "git grep $rev $pat" as an optimized version of "git checkout $rev
&& git grep $pat". And, in this sense, they probably expect the output
of these two operations to be equal. But if we don't set the
SKIP_WORKTREE bit for submodules they might diverge.

As an example, if we have a repository like:
.
|-- A
|   `-- sub
`-- B

And the [cone-mode] sparsity rules:
/*
!/*/
/B/

Then, "git grep --recurse-submodules $rev $pat" would search only in B
(as A doesn't match the sparsity patterns and thus, is not recursed
into). But "git checkout $rev && git grep --recurse-submodules $pat"
would search in both B and A/sub (as the latter would not have the
SKIP_WORKTREE bit set).

This might be a problem for git-grep, not git-sparse-checkout. But I'm
not sure how we could solve it efficiently, as the submodule might be
deep down in a path whose first dir was already ignored for not
matching the sparsity patterns. Is this a problem we should consider,
or is it OK if the outputs of these two operations diverge?

[1]: https://lore.kernel.org/git/CABPp-BFKHivKffBPO0M_s-JtpiLyEMLZr+sX9_yZk9ZX7ULtbw@mail.gmail.com/

  reply	other threads:[~2020-06-17 17:58 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-17  1:21 [PATCH] unpack-trees: do not set SKIP_WORKTREE on submodules Elijah Newren via GitGitGadget
2020-06-17 17:58 ` Matheus Tavares Bernardino [this message]
2020-06-18  0:24   ` Elijah Newren
2020-06-18 14:34     ` Matheus Tavares Bernardino
2020-06-18 19:19       ` Elijah Newren
2020-06-18 20:29         ` Matheus Tavares Bernardino
2020-06-18  1:51 How soon
2020-10-06  0:06 Luv MeZza

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHd-oW5gTJO=6pYXvg3v=JfjffcajPyMpsUOoqXnozwYrg3WwQ@mail.gmail.com' \
    --to=matheus.bernardino@usp.br \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=newren@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.