git.vger.kernel.org archive mirror
From: Elijah Newren <newren@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Bert Wesarg <bert.wesarg@googlemail.com>,
	Matheus Tavares Bernardino <matheus.bernardino@usp.br>,
	git <git@vger.kernel.org>, Jeff King <peff@peff.net>,
	Derrick Stolee <dstolee@microsoft.com>,
	Taylor Blau <me@ttaylorr.com>
Subject: Re: git-grep in sparse checkout
Date: Wed, 2 Oct 2019 09:46:23 -0700
Message-ID: <CABPp-BE6w_GJ6+N0PJBpJh=pguM85izUYCfFy=AoE53OiifAUg@mail.gmail.com>
In-Reply-To: <xmqqy2y3ejwe.fsf@gitster-ct.c.googlers.com>

On Tue, Oct 1, 2019 at 11:33 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > * other commands (archive, bisect, clean?, gitk, shortlog, blame,
> > fsck?, etc.) likely need to pay attention to sparsity patterns as
> > well, but there are some special cases:
>
> "git archive" falls into the same class as fast-(im|ex)port; it
> should ignore the sparse cone by default.  I suspect you threw
> "fsck" as a joke, but I do not think it should pay attention to the
> sparse cone, either (besides, most of the time in fsck the objects
> subject to checking do not know all the paths that reach them).

Putting archive in the same category as fast-(im|ex)port makes sense.
I'm not sure whether "ignore the sparse cone" by default makes sense,
or whether it should be a case where we error out if
--ignore-sparsity-patterns isn't specified, especially if history is
also sparse.
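
A minimal sketch of the "error out" alternative, as a toy Python model
rather than Git code. The function, the exception, and the is_sparse
argument are hypothetical names for illustration; the
ignore_sparsity_patterns parameter mirrors the
--ignore-sparsity-patterns flag floated above, which is a proposal in
this thread and not an existing option.

```python
class SparseCheckoutError(Exception):
    """Raised when an export-style command runs in a sparse checkout
    without an explicit opt-in to full-tree behaviour."""


def paths_to_archive(all_paths, is_sparse, ignore_sparsity_patterns=False):
    # `all_paths` stands in for every path in the tree being archived.
    # In a sparse checkout, refuse to guess: the caller must say
    # explicitly that the full tree is wanted.
    if is_sparse and not ignore_sparsity_patterns:
        raise SparseCheckoutError(
            "sparse checkout in use; pass --ignore-sparsity-patterns "
            "to archive the full tree"
        )
    # Either the checkout is dense or the caller opted in: full tree.
    return sorted(all_paths)
```

The other option discussed above (ignore the sparse cone by default)
would correspond to dropping the check and always returning the full
list.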

In terms of fsck, I agree that if history is dense and the worktree is
sparse, you want to walk all history.  I was thinking further along
the lines of when partial clones and sparse checkouts are combined, so
that history is also sparse.  In cases where a partial clone is in
use, rather than download everything in order to walk it, wouldn't it
make more sense to have fsck walk over the bits that are already
downloaded?  I don't really know how that'd all work, but it seems
that if fsck walked over all history it'd be treated as a
useless/dangerous command by those who are doing partial clones
because the repo is just too big.
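
The idea of walking only "the bits that are already downloaded" can be
sketched as a graph walk that records absent objects instead of
fetching them. This is a toy Python model under stated assumptions,
not how fsck is implemented: the object graph is a plain dict, and
walk_local_objects, references, and present are hypothetical names.

```python
def walk_local_objects(roots, references, present):
    """Walk the object graph from `roots` without descending into
    objects that are not available locally.

    references: dict mapping object id -> list of ids it points at
    present:    set of object ids that exist in the local store
    Returns (checked, skipped): ids visited locally, and ids that were
    referenced but absent.
    """
    checked, skipped = set(), set()
    stack = list(roots)
    while stack:
        oid = stack.pop()
        if oid in checked or oid in skipped:
            continue
        if oid not in present:
            # Checking this would require a download; note it and move on.
            skipped.add(oid)
            continue
        checked.add(oid)
        stack.extend(references.get(oid, []))
    return checked, skipped


# Hypothetical example: one commit, one tree, two blobs, one blob absent.
references = {"commit1": ["tree1"], "tree1": ["blobA", "blobB"]}
present = {"commit1", "tree1", "blobA"}
checked, skipped = walk_local_objects(["commit1"], references, present)
```

Here checked holds the three local objects and skipped holds only
blobB, so nothing outside the local store is ever requested.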

> > * merge, cherry-pick, and rebase (anything touching the merge
> > machinery) will need to expand the size of the non-sparse worktree if
> > there are files outside the sparsity patterns with conflicts.  (Though
> > merge should do a better job of not expanding the non-sparse worktree
> > when files can cleanly be resolved.)
>
> I think the important point is what is done to the result of the
> operation.  The results of operations that create new commits are
> meant to be consumed by other people, who may not share your
> definition of sparse cone.  And such a command (i.e. those whose
> results are consumed by others who may have different sparse cone)
> must be full-tree by default.
>
> > * fast-export and format-patch are not about viewing history but about
> > exporting it, and limiting to sparsity patterns would result in the
> > creation of an incompatible history.
>
> I agree with the conclusion; see above.
>
> > * New worktrees, by default, should copy the sparsity-patterns of the
> > worktree they were created from (much like a new shell inherits the
> > current working directory of its parent process)
>
> Sorry, but I do not share this view at all.
>
> In my mental model, "worktree new" is to attach a brand-new worktree
> to a bare repository that underlies the existing worktree I happen
> to be in, and that existing worktree that I happen to type "worktree
> new" in is no more or no less special than other worktrees.
>
> The above isn't to say that I'd veto your "a new worktree inherits
> traits from an existing worktree that 'git worktree add' was invoked
> in" idea.  I am just saying that I have a problem with that mode of
> operation and mental model being the default.

If worktrees are the only area we disagree on, then I'll happily take
the stuff we agree on and can overlook this piece.

But perhaps some further explanation of worktrees might help us reach
some middle ground.  If worktrees are dense by default and folks have
not only sparse checkouts but sparse history, then creating a new
worktree would suddenly mandate downloading a lot more of history --
which could be prohibitively expensive, forcing people to instead have
N clones without any shared history.  That may be fine (I tend to not
be a heavy worktree user, I just support some users who are), but is
it the route we want to push people with big repos towards?
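
The download cost described above can be illustrated with a toy Python
model. It rests on loud assumptions: the commit's tree is a dict of
path -> blob id, the partial clone is a set of blob ids already
present, and sparsity patterns are simplified to directory prefixes
(real patterns are richer). blobs_to_fetch and all example paths are
hypothetical.

```python
def blobs_to_fetch(tree, local_blobs, patterns=None):
    """Return the blob ids a new worktree would have to download.

    tree:        dict mapping path -> blob id for the commit checked out
    local_blobs: set of blob ids already present in the partial clone
    patterns:    directory prefixes to check out, or None for a dense
                 (full) worktree
    """
    def wanted(path):
        return patterns is None or any(path.startswith(p) for p in patterns)

    needed = {blob for path, blob in tree.items() if wanted(path)}
    return needed - local_blobs


# Hypothetical repo where only src/ has been downloaded so far.
tree = {"src/a.c": "b1", "src/b.c": "b2",
        "docs/x.txt": "b3", "vendor/big.bin": "b4"}
local = {"b1", "b2"}

dense_cost = blobs_to_fetch(tree, local)                       # dense default
inherited_cost = blobs_to_fetch(tree, local, patterns=["src/"])  # inherit patterns
```

In this model a dense new worktree has to fetch b3 and b4, while one
that inherits the existing patterns fetches nothing.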


Thanks for the feedback on the ideas,
Elijah
