From: Elijah Newren <newren@gmail.com>
To: Derrick Stolee <stolee@gmail.com>
Cc: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>,
	Git Mailing List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>,
	Matheus Tavares Bernardino <matheus.bernardino@usp.br>,
	Eric Sunshine <sunshine@sunshineco.com>,
	Derrick Stolee <derrickstolee@github.com>
Subject: Re: [PATCH v2 0/5] Sparse Index: Integrate with 'git add'
Date: Wed, 28 Jul 2021 20:57:14 -0600
Message-ID: <CABPp-BGUTg=GarkhP0MwjWKWmDyRJiEL2J75wFz52y2xi_50mw@mail.gmail.com>
In-Reply-To: <6a63736a-feb8-b74b-ef68-73cc71009e1d@gmail.com>

On Wed, Jul 28, 2021 at 8:03 PM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 7/28/2021 7:13 PM, Elijah Newren wrote:
> > On Mon, Jul 26, 2021 at 9:18 AM Derrick Stolee via GitGitGadget
> > <gitgitgadget@gmail.com> wrote:
> ...
> >>  * a full proposal for what to do with "git (add|mv|rm)" and paths outside
> >>    the cone is delayed to another series (with an RFC round) because the
> >>    behavior of the sparse-index matches a full index with sparse-checkout.
> >
> > I think this makes sense.
> >
> > I've read through the patches, and I like this version...with one
> > exception.  Can we mark the test added in patch 1 under
> >
> >      # 3. Rename the file to another sparse filename and
> >      #    accept conflict markers as resolved content.
> >
> > as NEEDSWORK or even MAYNEEDWORK?
>
> I have no objection to adding a blurb such as:
>
>         # NEEDSWORK: allowing adds outside the sparse cone can be
>         # confusingto users, as the file can disappear from the
>         # worktree without warning in later Git commands.
>

Sounds great to me other than the simple typo (s/confusingto/confusing to/)

> And perhaps I'm misunderstanding the situation a bit, but that
> seems to apply not just to this third case, but all of them. I
> don't see why the untracked case is special compared to the
> tracked case. More investigation may be required on my part.

The possible cases for files outside the sparsity patterns are:
  a) untracked
  b) tracked and SKIP_WORKTREE
  c) tracked and !SKIP_WORKTREE (e.g. because of merge conflicts)

From the above set, we've been talking about untracked and I think
we're on the same page about those.  Case (b) was already corrected by
Matheus a number of releases back; git-add will throw an error
explaining the situation and prevent the adding.  The error tells the
user to expand their sparsity set to work on those files.  For case
(c), you are right that those are problematic in the same way (they
can disappear later after a git-add)...but we're also in the situation
where the only way to get rid of the conflicting stages is to run git
add.  So, in my mind, case (c) puts us between a rock and a hard
place, and we probably need to allow the git-add.
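
To make the three cases concrete, here is a rough sketch of the
classification and of how git-add treats each case as described above.
It is only a model of this discussion, not Git's actual code; the names
PathState, classify, and add_is_allowed are made up for illustration.

```python
from enum import Enum, auto


class PathState(Enum):
    """Possible states of a path outside the sparsity patterns."""
    UNTRACKED = auto()              # case (a)
    TRACKED_SKIP_WORKTREE = auto()  # case (b)
    TRACKED_IN_WORKTREE = auto()    # case (c), e.g. after a merge conflict


def classify(tracked: bool, skip_worktree: bool) -> PathState:
    """Map the two index facts about a path onto cases (a), (b), (c)."""
    if not tracked:
        return PathState.UNTRACKED
    if skip_worktree:
        return PathState.TRACKED_SKIP_WORKTREE
    return PathState.TRACKED_IN_WORKTREE


def add_is_allowed(state: PathState) -> bool:
    """Model of the behavior described in this thread (hypothetical helper).

    (b) is refused with advice to expand the sparsity set; (a) and (c)
    are accepted today, even though the file may later disappear from
    the working tree.
    """
    return state is not PathState.TRACKED_SKIP_WORKTREE
```

Under this model only case (b) is refused; whether (a) should be refused
as well is exactly the point under dispute.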

> >  I'm still quite unconvinced that it
> > is testing for correct behavior, and don't want to paint ourselves
> > into a corner.  In particular, we don't allow folks to "git add
> > $IGNORED_FILE" without a --force override because it's likely to be a
> > mistake.
>
> I agree about ignored files, and that is true whether or not they
> are in the sparse cone.

Yes, and...

> > I think the same logic holds for adding untracked files
> > outside the sparsity cone.

In my opinion, "outside the sparsity cone" is another form of "being
ignored", and in my mind should be treated similarly -- it should
generally require an override to add such files.  (Case (c) possibly
being an exception, though maybe even it shouldn't be.)

> >  But it's actually even worse than that
> > case because there's a secondary level of surprise too: adding files
> > outside the sparsity cone will result in delayed user surprises when
> > the next git command that happens to call unpack_trees() (which are
> > found all over the codebase) removes the file from the working tree.
> > I've had some such reports already.
>
> I believe this is testing a realistic scenario that users are
> hitting in the wild today. I would believe that users succeed with
> these commands more often than they are confused by the file
> disappearing from the worktree in a later Git command, so having
> this sequence of events be documented as a potential use case has
> some value.
>
> I simultaneously don't think it is behavior we want to commit to
> as a contract for all future Git versions, but there is value in
> showing how this situation changes with any future meddling. In
> particular: will users be able to self-discover the "new" way of
> doing things?

Oh, I totally agree that documenting how things work definitely has
value.  I've added several test_expect_failure cases and whatnot to
the testsuite.  But there's a big difference between documenting how
things work and documenting how we expect them to work.  If the two
differ, then any good provided by documenting how things work with a
test marked as test_expect_success may be counterbalanced or even
overwhelmed by the harm it also causes, particularly in areas where
working around backward compatibility constraints is more difficult.

For example, not that long ago, it seemed people agreed (even Junio)
that commit hooks were never intended to be part of rebase (they
aren't part of the apply backend, and were only part of the
merge/interactive backend due to historical accident) and could be
removed (being replaced by just a rebase hook called at the end of the
rebase instead of with every commit).  There were user complaints
about the commit hooks being triggered when the default backend
switched, backing up the expectation.  But no one jumped in to fix it
at the time.  Then when it was brought up again recently, Junio said
we couldn't just remove those because of backward compatibility.
That's forcing me to consider suggesting a bunch of new arguments to
rebase to let users get unbroken when they discover they need it, or
maybe even a new toplevel command because we painted ourselves into a
corner (there are more backward compatibility corners in rebase
too...).

Trying to get out of a corner we paint ourselves into with
sparse-checkout would be massively harder, which is why I keep harping
on this kind of thing.  I'm very concerned it's happening even despite
my numerous comments and worries about it.

> The proposal part of changing how add/mv/rm behave in these cases
> would need to adjust this test with something that would also help
> direct users to a helpful resolution. For example, the first run
> of
>
>         git add sparse/dir/file
>
> could error out with an error message saying "The pathspec is
> outside of your sparse cone, so staging the file might lead to
> a staged change that is removed from your working directory."

Yes, much like we currently do with tracked files which are SKIP_WORKTREE.

> But we should _also_ include two strategies for getting out of
> this state:
>
> 1. Adjust your sparse-checkout definition so this file is in scope.
>
> -or- (and this is the part that would be new)
>
> 2. If you understand the risks of staging a file outside the sparse
>    cone, then run 'git add --sparse sparse/dir/file'.
>
> (Insert whatever option would be appropriate for --sparse here.)
>
> Such a warning message would allow users who follow the steps listed
> in the test to know how to adjust their usage to then get into a
> good state.

Choice 2 doesn't exist yet, but yeah your suggestion makes sense.
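
A minimal sketch of that proposed flow, assuming an override along the
lines of the placeholder --sparse option (which, again, does not exist
yet).  The function check_add and its parameters are hypothetical and
only model the control flow; the message wording is taken from the
suggestion quoted above.

```python
from typing import Tuple


def check_add(path: str, in_cone: bool,
              sparse_override: bool = False) -> Tuple[bool, str]:
    """Decide whether a modeled 'git add <path>' should proceed.

    in_cone: whether the path falls inside the sparse-checkout cone
             (computed elsewhere; taken as given here).
    sparse_override: stands in for the placeholder '--sparse' option.
    Returns (allowed, message); message is empty when allowed.
    """
    if in_cone or sparse_override:
        return True, ""
    message = (
        "The pathspec is outside of your sparse cone, so staging the file "
        "might lead to a staged change that is removed from your working "
        "directory.\n"
        "hint: 1. Adjust your sparse-checkout definition so this file is "
        "in scope.\n"
        "hint: 2. If you understand the risks of staging a file outside "
        f"the sparse cone, then run 'git add --sparse {path}'."
    )
    return False, message


if __name__ == "__main__":
    allowed, msg = check_add("sparse/dir/file", in_cone=False)
    print(allowed)
    print(msg)
```

The first call is refused and prints both ways out; repeating it with
sparse_override=True models the user accepting the risk.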

> > If that test is marked as NEEDSWORK or even as the correct behavior
> > still being under dispute, then you can happily apply my:
>
> I would classify this as "The test documents current behavior, but
> isn't a contract for future behavior." With a concept such as my
> suggestion above, the test could be modified to check for the
> warning and then run the second command with the extra option and
> complete the test's expectations. Having the existing behavior
> documented in a test helps demonstrate how behavior is changing.
>
> As we've discussed, we want to give such a behavior change the
> right venue for feedback and suggestions for alternate approaches,
> and this series is not the right place for that. Hopefully you
> can tell that it is on my mind and that I want to recommend a
> change in the near future.

I'm totally fine with such changes not being part of this series.  I
just don't want a test_expect_success that checks for behavior that I
consider buggy unless it comes with a disclaimer that it's checking
for existing rather than expected behavior.
