From: Elijah Newren <newren@gmail.com>
To: Victoria Dye <vdye@github.com>
Cc: Victoria Dye via GitGitGadget <gitgitgadget@gmail.com>,
Git Mailing List <git@vger.kernel.org>,
Junio C Hamano <gitster@pobox.com>,
Derrick Stolee <derrickstolee@github.com>
Subject: Re: [PATCH v2 5/7] read-tree: narrow scope of index expansion for '--prefix'
Date: Fri, 25 Feb 2022 23:52:25 -0800
Message-ID: <CABPp-BE6n2-v7-91b-aXOsMod6d3dA4zZE4nNyzxac+tVpaaCw@mail.gmail.com>
In-Reply-To: <686e9c8e-7c5b-7cbc-4772-92b9754caa07@github.com>
Hi Victoria,
On Fri, Feb 25, 2022 at 12:25 PM Victoria Dye <vdye@github.com> wrote:
>
> Elijah Newren wrote:
> > On Thu, Feb 24, 2022 at 2:34 PM Victoria Dye via GitGitGadget
> > <gitgitgadget@gmail.com> wrote:
> >>
> >> From: Victoria Dye <vdye@github.com>
> >>
> >> When 'git read-tree' is provided with a prefix, expand the index only if the
> >> prefix is equivalent to a sparse directory or contained within one. If the
> >> index is not expanded in these cases, 'ce_in_traverse_path' will indicate
> >> that the relevant sparse directory is not in the prefix/traverse path,
> >> skipping past it and not unpacking the appropriate tree(s).
> >>
> >> If the prefix is in-cone, its sparse subdirectories (if any) will be
> >> traversed correctly without index expansion.
> >>
> >> Signed-off-by: Victoria Dye <vdye@github.com>
> >> ---
> >> builtin/read-tree.c | 3 +--
> >> t/t1092-sparse-checkout-compatibility.sh | 8 ++++++-
> >> unpack-trees.c | 30 ++++++++++++++++++++++++
> >> 3 files changed, 38 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/builtin/read-tree.c b/builtin/read-tree.c
> >> index c2fdbc2657f..a7b7f822281 100644
> >> --- a/builtin/read-tree.c
> >> +++ b/builtin/read-tree.c
> >> @@ -213,8 +213,7 @@ int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix)
> >> if (opts.merge && !opts.index_only)
> >> setup_work_tree();
> >>
> >> - /* TODO: audit sparse index behavior in unpack_trees */
> >> - if (opts.skip_sparse_checkout || opts.prefix)
> >> + if (opts.skip_sparse_checkout)
> >> ensure_full_index(&the_index);
> >>
> >> if (opts.merge) {
> >> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> >> index ae44451a0a9..a404be0a10f 100755
> >> --- a/t/t1092-sparse-checkout-compatibility.sh
> >> +++ b/t/t1092-sparse-checkout-compatibility.sh
> >> @@ -1415,7 +1415,13 @@ test_expect_success 'sparse index is not expanded: read-tree' '
> >> do
> >> ensure_not_expanded read-tree -mu $MERGE_TREES &&
> >> ensure_not_expanded reset --hard HEAD || return 1
> >> - done
> >> + done &&
> >> +
> >> + rm -rf sparse-index/deep/deeper2 &&
> >> + ensure_not_expanded add . &&
> >> + ensure_not_expanded commit -m "test" &&
> >> +
> >> + ensure_not_expanded read-tree --prefix=deep/deeper2 -u deepest
> >> '
> >>
> >> test_expect_success 'ls-files' '
> >> diff --git a/unpack-trees.c b/unpack-trees.c
> >> index 360844bda3a..dba122a02bb 100644
> >> --- a/unpack-trees.c
> >> +++ b/unpack-trees.c
> >> @@ -1739,6 +1739,36 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
> >> setup_standard_excludes(o->dir);
> >> }
> >>
> >> + /*
> >> + * If the prefix is equal to or contained within a sparse directory, the
> >> + * index needs to be expanded to traverse with the specified prefix.
> >> + */
> >> + if (o->prefix && o->src_index->sparse_index) {
> >> + int prefix_len = strlen(o->prefix);
> >> +
> >> + while (prefix_len > 0 && o->prefix[prefix_len - 1] == '/')
> >> + prefix_len--;
> >> +
> >> + if (prefix_len > 0) {
> >
> > Is this condition check necessary? If we want some safety check here,
> > could it instead be something like
> >
> > if (prefix_len <= 0)
> > BUG("Broken prefix passed to unpack_trees");
> >
>
> This condition was intended to skip unnecessary computation for the
> (probably unlikely, but still technically valid) case where the prefix is
> the repo root (e.g., '--prefix=/') - because the repo root is represented
> with only directory separator(s), `prefix_len` would be 0 after removing
> trailing '/'. In that scenario, the index won't need to be expanded, so we
> don't need to go looking in the index for that path.

Wow, that doesn't result in an error?!? That surprises me. I never
even thought to test such a thing. Clearly, the following related
command does give an error:

    git read-tree --prefix=/toplevel/ $TREE

(namely, "error: invalid path '/toplevel/subpath'")

whereas the reason a single slash is accepted is because git is
trying to be forgiving and treat the following two commands the same:

    git read-tree --prefix=subdir/ $TREE
    git read-tree --prefix=subdir $TREE

i.e. it tries to allow the trailing slash to be optional. And, by
implementation quirk, making a trailing slash be optional turns out to
mean that --prefix=/ is treated the same as no prefix at all, because
empty string prefix just happens to give the same behavior as NULL
prefix.
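
To make the quirk concrete, here is a minimal standalone sketch of the
trailing-slash stripping, modeled on the loop in the quoted hunk. The
helper name is hypothetical and is not a function in git; it only
illustrates why a prefix consisting solely of slashes collapses to
length zero.

```c
#include <string.h>

/*
 * Hypothetical helper (not part of git): return the length of 'prefix'
 * once any trailing '/' characters are ignored, mirroring the loop in
 * the quoted unpack_trees() hunk.
 */
static size_t prefix_len_without_trailing_slashes(const char *prefix)
{
	size_t len = strlen(prefix);

	/* Drop every trailing directory separator. */
	while (len > 0 && prefix[len - 1] == '/')
		len--;
	return len;
}
```

With this helper, "subdir/" and "subdir" both yield 6, while "/" yields
0, which is the case that ends up behaving like an empty prefix.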
I think we should just throw an error if prefix starts with '/'.
unpack_trees() can make it be a BUG() (and at the beginning of the
function rather than down at this point and only inside some
conditional). builtin/read-tree.c, the only thing that ever sets
prefix in unpack_trees_options, should die() if it's passed something
that starts with a '/'. Having paths start with a '/' is antithetical
to how "prefix" is used throughout the codebase, though I guess I can
see people making that mistake if they are used to gitignore-style
patterns instead.
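
A minimal sketch of the proposed check, under the assumption that it
lives in a small predicate; the function name is hypothetical and not
taken from git. In this sketch a NULL or empty prefix is not flagged,
only a leading '/'. The idea would be for builtin/read-tree.c to die()
when the predicate is true for the user-supplied prefix, and for
unpack_trees() to treat the same condition as a BUG().

```c
#include <stddef.h>

/*
 * Hypothetical predicate (not part of git): report whether 'prefix'
 * begins with a directory separator. A NULL prefix means "no prefix"
 * and is never flagged.
 */
static int prefix_starts_with_slash(const char *prefix)
{
	return prefix != NULL && prefix[0] == '/';
}
```

Both "/" and "/toplevel/" would be rejected by such a check, while
"subdir/" and "subdir" would still be accepted.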
> None of that is particularly clear from reading the patch, though, so I'll
> add a comment & test covering it explicitly.
>
> > and then dedent the following code? (Or are callers allowed to not
> > sanitize their input before passing to unpack_trees(), meaning that we
> > should use a die() rather than a BUG()?)
> >
> > To test this idea, near the top of unpack_trees(), I added:
> > if (o->prefix)
> > assert(*o->prefix && *o->prefix != '/');
> > and reran all tests. They all ran without hitting that assertion. FWIW.
> >
> >> + struct strbuf ce_prefix = STRBUF_INIT;
> >> + strbuf_grow(&ce_prefix, prefix_len + 1);
> >> + strbuf_add(&ce_prefix, o->prefix, prefix_len);
> >> + strbuf_addch(&ce_prefix, '/');
> >> +
> >> + /*
> >> + * If the prefix is not inside the sparse cone, then the
> >> + * index is explicitly expanded if it is found as a sparse
> >> + * directory, or implicitly expanded (by 'index_name_pos')
> >> + * if the path is inside a sparse directory.
> >> + */
> >> + if (!path_in_cone_mode_sparse_checkout(ce_prefix.buf, o->src_index) &&
> >> + index_name_pos(o->src_index, ce_prefix.buf, ce_prefix.len) >= 0)
> >
> > style nit: Can you rewrap both the comments and the code at 80 characters?
> >
>
> I couldn't think of a way to wrap the condition that wouldn't make it more
> difficult to read. The best I could come up with was:
>
> if (!path_in_cone_mode_sparse_checkout(ce_prefix.buf,
> o->src_index) &&
> index_name_pos(o->src_index,
> ce_prefix.buf,
> ce_prefix.len) >= 0)
> ensure_full_index(o->src_index);
>
>
> which, to me, is a bit hard to parse. Alternatively, though, I can move the
> prefix-checking logic into its own function (kind of like
> 'pathspec_needs_expanded_index(...)' in [1]), in which case I won't need to
> change the current wrapping to keep it under 80 characters.
>
> [1] https://lore.kernel.org/git/822d7344587f698e73abba1ca726c3a905f7b403.1638201164.git.gitgitgadget@gmail.com/
>
> > It took me a bit of playing and testing to understand these two lines.
> > The comment helps, but it's still a bit dense to unpack; somehow I
> > didn't understand that the comment was referring to index_name_pos()'s
> > call to ensure_full_index(). Once I understood that, it all looks
> > good.
> >
>
> Sorry about that, I'll revise to make that clearer.
Thanks. :-)
> >
> >> + ensure_full_index(o->src_index);
> >> +
> >> + strbuf_release(&ce_prefix);
> >> + }
> >> + }
> >> +
> >> if (!core_apply_sparse_checkout || !o->update)
> >> o->skip_sparse_checkout = 1;
> >> if (!o->skip_sparse_checkout && !o->pl) {
> >> --
> >> gitgitgadget