git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Elijah Newren <newren@gmail.com>
To: Victoria Dye <vdye@github.com>
Cc: Victoria Dye via GitGitGadget <gitgitgadget@gmail.com>,
	Git Mailing List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>,
	Derrick Stolee <derrickstolee@github.com>
Subject: Re: [PATCH v2 5/7] read-tree: narrow scope of index expansion for '--prefix'
Date: Fri, 25 Feb 2022 23:52:25 -0800	[thread overview]
Message-ID: <CABPp-BE6n2-v7-91b-aXOsMod6d3dA4zZE4nNyzxac+tVpaaCw@mail.gmail.com> (raw)
In-Reply-To: <686e9c8e-7c5b-7cbc-4772-92b9754caa07@github.com>

Hi Victoria,

On Fri, Feb 25, 2022 at 12:25 PM Victoria Dye <vdye@github.com> wrote:
>
> Elijah Newren wrote:
> > On Thu, Feb 24, 2022 at 2:34 PM Victoria Dye via GitGitGadget
> > <gitgitgadget@gmail.com> wrote:
> >>
> >> From: Victoria Dye <vdye@github.com>
> >>
> >> When 'git read-tree' is provided with a prefix, expand the index only if the
> >> prefix is equivalent to a sparse directory or contained within one. If the
> >> index is not expanded in these cases, 'ce_in_traverse_path' will indicate
> >> that the relevant sparse directory is not in the prefix/traverse path,
> >> skipping past it and not unpacking the appropriate tree(s).
> >>
> >> If the prefix is in-cone, its sparse subdirectories (if any) will be
> >> traversed correctly without index expansion.
> >>
> >> Signed-off-by: Victoria Dye <vdye@github.com>
> >> ---
> >>  builtin/read-tree.c                      |  3 +--
> >>  t/t1092-sparse-checkout-compatibility.sh |  8 ++++++-
> >>  unpack-trees.c                           | 30 ++++++++++++++++++++++++
> >>  3 files changed, 38 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/builtin/read-tree.c b/builtin/read-tree.c
> >> index c2fdbc2657f..a7b7f822281 100644
> >> --- a/builtin/read-tree.c
> >> +++ b/builtin/read-tree.c
> >> @@ -213,8 +213,7 @@ int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix)
> >>         if (opts.merge && !opts.index_only)
> >>                 setup_work_tree();
> >>
> >> -       /* TODO: audit sparse index behavior in unpack_trees */
> >> -       if (opts.skip_sparse_checkout || opts.prefix)
> >> +       if (opts.skip_sparse_checkout)
> >>                 ensure_full_index(&the_index);
> >>
> >>         if (opts.merge) {
> >> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> >> index ae44451a0a9..a404be0a10f 100755
> >> --- a/t/t1092-sparse-checkout-compatibility.sh
> >> +++ b/t/t1092-sparse-checkout-compatibility.sh
> >> @@ -1415,7 +1415,13 @@ test_expect_success 'sparse index is not expanded: read-tree' '
> >>         do
> >>                 ensure_not_expanded read-tree -mu $MERGE_TREES &&
> >>                 ensure_not_expanded reset --hard HEAD || return 1
> >> -       done
> >> +       done &&
> >> +
> >> +       rm -rf sparse-index/deep/deeper2 &&
> >> +       ensure_not_expanded add . &&
> >> +       ensure_not_expanded commit -m "test" &&
> >> +
> >> +       ensure_not_expanded read-tree --prefix=deep/deeper2 -u deepest
> >>  '
> >>
> >>  test_expect_success 'ls-files' '
> >> diff --git a/unpack-trees.c b/unpack-trees.c
> >> index 360844bda3a..dba122a02bb 100644
> >> --- a/unpack-trees.c
> >> +++ b/unpack-trees.c
> >> @@ -1739,6 +1739,36 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
> >>                 setup_standard_excludes(o->dir);
> >>         }
> >>
> >> +       /*
> >> +        * If the prefix is equal to or contained within a sparse directory, the
> >> +        * index needs to be expanded to traverse with the specified prefix.
> >> +        */
> >> +       if (o->prefix && o->src_index->sparse_index) {
> >> +               int prefix_len = strlen(o->prefix);
> >> +
> >> +               while (prefix_len > 0 && o->prefix[prefix_len - 1] == '/')
> >> +                       prefix_len--;
> >> +
> >> +               if (prefix_len > 0) {
> >
> > Is this condition check necessary?  If we want some safety check here,
> > could it instead be something like
> >
> >    if (prefix_len <= 0)
> >        BUG("Broken prefix passed to unpack_trees");
> >
>
> This condition was intended to skip unnecessary computation for the
> (probably unlikely, but still technically valid) case where the prefix is
> the repo root (e.g., '--prefix=/') - because the repo root is represented
> with only directory separator(s), `prefix_len` would be 0 after removing
> trailing '/'. In that scenario, the index won't need to be expanded, so we
> don't need to go looking in the index for that path.

Wow, that doesn't result in an error?!?  That surprises me.  I never
even thought to test such a thing.  Clearly, the following related
command does give an error:

  git read-tree --prefix=/toplevel/ $TREE

(namely, "error: invalid path '/toplevel/subpath'")

whereas, the reason a single slash is accepted it because git is
trying to be forgiving and treat the following two commands the same:

  git read-tree --prefix=subdir/ $TREE
  git read-tree --prefix=subdir $TREE

i.e. it tries to allow the trailing slash to be optional.  And, by
implementation quirk, making a trailing slash be optional turns out to
mean that --prefix=/ is treated the same as no prefix at all, because
empty string prefix just happens to give the same behavior as NULL
prefix.

I think we should just throw an error if prefix starts with '/'.
unpack_trees() can make it be a BUG() (and at the beginning of the
function rather than down at this point and only inside some
conditional). builtin/read-tree.c, the only thing that ever sets
prefix in unpack_trees_options, should die() if it's passed something
that starts with a '/'.  Having paths start with a '/' is antithetical
to how "prefix" is used throughout the codebase, though I guess I can
see people making that mistake if they are used to gitignore-style
patterns instead.

> None of that is particularly clear from reading the patch, though, so I'll
> add a comment & test covering it explicitly.
>
> > and then dedent the following code?  (Or are callers allowed to not
> > sanitize their input before passing to unpack_trees(), meaning that we
> > should use a die() rather than a BUG()?)
> >
> > To test this idea, near the top of unpack_trees(), I added:
> >     if (o->prefix)
> >         assert(*o->prefix && *o->prefix != '/');
> > and reran all tests.  They all ran without hitting that assertion.  FWIW.
> >
> >> +                       struct strbuf ce_prefix = STRBUF_INIT;
> >> +                       strbuf_grow(&ce_prefix, prefix_len + 1);
> >> +                       strbuf_add(&ce_prefix, o->prefix, prefix_len);
> >> +                       strbuf_addch(&ce_prefix, '/');
> >> +
> >> +                       /*
> >> +                        * If the prefix is not inside the sparse cone, then the
> >> +                        * index is explicitly expanded if it is found as a sparse
> >> +                        * directory, or implicitly expanded (by 'index_name_pos')
> >> +                        * if the path is inside a sparse directory.
> >> +                        */
> >> +                       if (!path_in_cone_mode_sparse_checkout(ce_prefix.buf, o->src_index) &&
> >> +                           index_name_pos(o->src_index, ce_prefix.buf, ce_prefix.len) >= 0)
> >
> > style nit: Can you rewrap both the comments and the code at 80 characters?
> >
>
> I couldn't think of a way to wrap the condition that wouldn't make it more
> difficult to read. The best I could come up with was:
>
>                         if (!path_in_cone_mode_sparse_checkout(ce_prefix.buf,
>                                                                o->src_index) &&
>                             index_name_pos(o->src_index,
>                                            ce_prefix.buf,
>                                            ce_prefix.len) >= 0)
>                                 ensure_full_index(o->src_index);
>
>
> which, to me, is a bit hard to parse. Alternatively, though, I can move the
> prefix-checking logic into its own function (kind of like
> 'pathspec_needs_expanded_index(...)' in [1]), in which case I won't need to
> change the current wrapping to keep it under 80 characters.
>
> [1] https://lore.kernel.org/git/822d7344587f698e73abba1ca726c3a905f7b403.1638201164.git.gitgitgadget@gmail.com/
>
> > It took me a bit of playing and testing to understand these two lines.
> > The comment helps, but it's still a bit dense to unpack; somehow I
> > didn't understand that the comment was referring to index_name_pos()'s
> > call to ensure_full_index().  Once I understood that, it all looks
> > good.
> >
>
> Sorry about that, I'll revise to make that clearer.

Thanks.  :-)

> >
> >> +                               ensure_full_index(o->src_index);
> >> +
> >> +                       strbuf_release(&ce_prefix);
> >> +               }
> >> +       }
> >> +
> >>         if (!core_apply_sparse_checkout || !o->update)
> >>                 o->skip_sparse_checkout = 1;
> >>         if (!o->skip_sparse_checkout && !o->pl) {
> >> --
> >> gitgitgadget

  reply	other threads:[~2022-02-26  7:52 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-23 18:25 [PATCH 0/7] Sparse index: integrate with 'read-tree' Victoria Dye via GitGitGadget
2022-02-23 18:25 ` [PATCH 1/7] sparse-index: prevent repo root from becoming sparse Victoria Dye via GitGitGadget
2022-02-24 16:48   ` Derrick Stolee
2022-02-24 21:42     ` Victoria Dye
2022-02-23 18:25 ` [PATCH 2/7] status: fix nested sparse directory diff in sparse index Victoria Dye via GitGitGadget
2022-02-23 18:25 ` [PATCH 3/7] read-tree: expand sparse checkout test coverage Victoria Dye via GitGitGadget
2022-02-23 18:25 ` [PATCH 4/7] read-tree: integrate with sparse index Victoria Dye via GitGitGadget
2022-02-23 18:25 ` [PATCH 5/7] read-tree: narrow scope of index expansion for '--prefix' Victoria Dye via GitGitGadget
2022-02-23 18:25 ` [PATCH 6/7] read-tree: make two-way merge sparse-aware Victoria Dye via GitGitGadget
2022-02-26  8:05   ` Elijah Newren
2022-02-28 18:04     ` Victoria Dye
2022-03-01  2:56       ` Elijah Newren
2022-02-23 18:25 ` [PATCH 7/7] read-tree: make three-way " Victoria Dye via GitGitGadget
2022-02-24 16:59 ` [PATCH 0/7] Sparse index: integrate with 'read-tree' Derrick Stolee
2022-02-24 22:34 ` [PATCH v2 " Victoria Dye via GitGitGadget
2022-02-24 22:34   ` [PATCH v2 1/7] sparse-index: prevent repo root from becoming sparse Victoria Dye via GitGitGadget
2022-02-24 22:34   ` [PATCH v2 2/7] status: fix nested sparse directory diff in sparse index Victoria Dye via GitGitGadget
2022-02-25  7:45     ` Elijah Newren
2022-02-28 23:17       ` Victoria Dye
2022-02-24 22:34   ` [PATCH v2 3/7] read-tree: expand sparse checkout test coverage Victoria Dye via GitGitGadget
2022-02-26  8:41     ` Elijah Newren
2022-02-28 18:14       ` Victoria Dye
2022-02-28 23:09     ` Ævar Arnfjörð Bjarmason
2022-02-28 23:27       ` Victoria Dye
2022-02-28 23:46         ` Ævar Arnfjörð Bjarmason
2022-02-24 22:34   ` [PATCH v2 4/7] read-tree: integrate with sparse index Victoria Dye via GitGitGadget
2022-02-24 22:34   ` [PATCH v2 5/7] read-tree: narrow scope of index expansion for '--prefix' Victoria Dye via GitGitGadget
2022-02-25  8:38     ` Elijah Newren
2022-02-25 20:25       ` Victoria Dye
2022-02-26  7:52         ` Elijah Newren [this message]
2022-02-28 18:44           ` Victoria Dye
2022-02-24 22:34   ` [PATCH v2 6/7] read-tree: make two-way merge sparse-aware Victoria Dye via GitGitGadget
2022-02-24 22:34   ` [PATCH v2 7/7] read-tree: make three-way " Victoria Dye via GitGitGadget
2022-02-26  8:46   ` [PATCH v2 0/7] Sparse index: integrate with 'read-tree' Elijah Newren
2022-03-01 20:24   ` [PATCH v3 0/8] " Victoria Dye via GitGitGadget
2022-03-01 20:24     ` [PATCH v3 1/8] sparse-index: prevent repo root from becoming sparse Victoria Dye via GitGitGadget
2022-03-01 20:24     ` [PATCH v3 2/8] status: fix nested sparse directory diff in sparse index Victoria Dye via GitGitGadget
2022-03-01 20:24     ` [PATCH v3 3/8] read-tree: explicitly disallow prefixes with a leading '/' Victoria Dye via GitGitGadget
2022-03-01 20:24     ` [PATCH v3 4/8] read-tree: expand sparse checkout test coverage Victoria Dye via GitGitGadget
2022-03-01 20:24     ` [PATCH v3 5/8] read-tree: integrate with sparse index Victoria Dye via GitGitGadget
2022-03-01 20:24     ` [PATCH v3 6/8] read-tree: narrow scope of index expansion for '--prefix' Victoria Dye via GitGitGadget
2022-03-03 17:54       ` Glen Choo
2022-03-03 21:19         ` Victoria Dye
2022-03-04 18:47           ` Glen Choo
2022-03-01 20:24     ` [PATCH v3 7/8] read-tree: make two-way merge sparse-aware Victoria Dye via GitGitGadget
2022-03-01 20:24     ` [PATCH v3 8/8] read-tree: make three-way " Victoria Dye via GitGitGadget
2022-03-02  7:22     ` [PATCH v3 0/8] Sparse index: integrate with 'read-tree' Elijah Newren
2022-03-02 13:40       ` Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CABPp-BE6n2-v7-91b-aXOsMod6d3dA4zZE4nNyzxac+tVpaaCw@mail.gmail.com \
    --to=newren@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=vdye@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).