From: ZheNing Hu <adlternative@gmail.com>
To: Elijah Newren <newren@gmail.com>
Cc: "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com>,
git@vger.kernel.org,
"Christian Couder" <christian.couder@gmail.com>,
"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
"Jeff King" <peff@peff.net>, "Junio C Hamano" <gitster@pobox.com>,
"Derrick Stolee" <derrickstolee@github.com>,
"Johannes Schindelin" <johannes.schindelin@gmx.de>,
"Victoria Dye" <vdye@github.com>,
"Emily Shaffer" <emilyshaffer@google.com>,
"Matheus Tavares Bernardino" <matheus.bernardino@usp.br>,
"Shaoxuan Yuan" <shaoxuan.yuan02@gmail.com>
Subject: Re: [PATCH] [RFC] diff: introduce scope option
Date: Sun, 6 Nov 2022 10:11:02 +0800 [thread overview]
Message-ID: <CAOLTT8TceM-NpV2_hUCZj2Dx=W30f_9SHW8CcRH-pw32BRd-oA@mail.gmail.com> (raw)
In-Reply-To: <CABPp-BGUXKk-LSJtHP2jSDSVYNpQgzOeferx6xJ36ntDgrBNCw@mail.gmail.com>
inHi,
Elijah Newren <newren@gmail.com> 于2022年11月1日周二 13:18写道:
>
> Hi,
>
> On Sun, Oct 30, 2022 at 9:11 PM ZheNing Hu via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> >
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > When we use sparse-checkout, we often want the set of files
> > that some commands operate on to be restricted to the
> > sparse-checkout specification.
>
> It seems a bit premature to send this, when the guideline document[*]
> detailing how these options should work is still in the "Needs Review"
> state. I know, it's been waiting for a while, but it's a _long_
> document.
>
> [*] https://lore.kernel.org/git/pull.1367.v3.git.1665269538608.gitgitgadget@gmail.com/
>
Fine, I just want to start trying to experiment with this feature in
git-diff earlier,
and I can wait for the sparse-checkout.txt documentation to finish
first if needed :)
> > So introduce the `--scope` option to git diff, which have two
> > value: "sparse" and "all". "sparse" mean that diff is performed
> > restrict to paths which matching sparse-checkout specification,
> > "all" mean that diff is performed regardless of whether the path
> > meets the sparse-checkout specification.
>
> The wording probably needs some care to reflect the fact that it only
> affects cases where either --cached or revisions are passed. In
> particular, your wording for "all" suggests behavior very different
> from today, whereas "all" is probably best thought of as today's
> current behavior. For example, a plain `git diff` without --cached or
> revisions, should be unaffected by either of these flags.
>
Yes, after re-reading your sparse-checkout.txt patch, I realized that I
misinterpreted "--scope=sparse" as sparse patterns instead of sparse
specification, and misinterpreted "-scope=all" as diff on all files.
> > `--no-scope` is the default
> > option for now.
>
> What does --no-scope even mean? You didn't explain it, and I don't
> see how it could make sense. We explicitly avoided a `--no-` prefix
> by allowing the --scope option to take a value. I don't think there
> should be a --no-scope option.
>
I think the “--no-scope” here does nothing, as if it were unaffected by scope
(just like correctly "--scope=all", right?)
> > Add `diff.scope={sparse, all}` config, which can also have the same
> > capabilities as `--scope`, and it will be covered by `--scope` option.
>
> This is not what we want. The high level usecases should not need to
> be configured per-command. There should be a config option which
> reflects the high level use cases (e.g. sparse.scope) and then all
> relevant commands (diff, log, grep, etc.) can key off it.
>
Ok, using a global config should indeed be more useful.
> > Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> > ---
> > [RFC] diff: introduce scope option
> >
> > In [1], we discovered that users working on different sparse-checkout
> > specification may download unnecessary blobs from each other's
> > specification in collaboration. In [2] Junio suggested that maybe we can
> > restrict some git command's filespec in sparse-checkout specification to
> > elegantly solve this problem above. In [3]: Newren and Derrick Stolee
> > prefer to name the option --scope={sparse, all}.
> >
> > So this patch is attempt to do this thing on git diff:
> >
> > v1:
> >
> > 1. add --restrict option to git diff, which restrict diff filespec in
> > sparse-checkout specification. [4] v2.
> > 2. rename --restrict to --scope={sparse, all}, support --no-scope.
> > 3. add config: diff.scope={sparse,all}.
> >
> > Unresolved work:
> >
> > 1. how to properly pass this --scope={sparse, all} to other commands
> > like git log, git format-patch, etc.
>
> log & grep should accept a similar flag. format-patch should not, and
> should ignore any config in this area.
>
> > 2. how to set the default value of scope for different diff commands.
>
> I don't understand this.
>
Oh, I was just curious if the config defaults for scope needed to be configured
separately for the different diff commands git diff-files, git diff-index,
git diff-no-index, git diff-tree, since sparse-checkout.txt mentions
the different
behavior of scope for them. Now I think this just needs to be handled in command
code logic and no additional command level configuration is needed.
> > [1]:
> > https://lore.kernel.org/git/CAOLTT8SHo66kGbvWr=+LQ9UVd1NHgqGGEYK2qq6==QgRCgLZqQ@mail.gmail.com/
> > [2]: https://lore.kernel.org/git/xmqqzgeqw0sy.fsf@gitster.g/ [3]:
> > https://lore.kernel.org/git/07a25d48-e364-0d9b-6ffa-41a5984eb5db@github.com/
> > [4]:
> > https://lore.kernel.org/git/pull.1368.git.1664036052741.gitgitgadget@gmail.com/
> >
> > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1398%2Fadlternative%2Fzh%2Fdiff-scope-v1
> > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1398/adlternative/zh/diff-scope-v1
> > Pull-Request: https://github.com/gitgitgadget/git/pull/1398
> >
> > Documentation/config/diff.txt | 12 +
> > Documentation/diff-options.txt | 18 +
> > builtin/diff.c | 4 +
> > diff-lib.c | 36 +-
> > diff-no-index.c | 4 +
> > diff.c | 39 +++
> > diff.h | 11 +
> > t/t4070-diff-sparse-checkout-scope.sh | 469 ++++++++++++++++++++++++++
> > tree-diff.c | 5 +
> > 9 files changed, 597 insertions(+), 1 deletion(-)
> > create mode 100644 t/t4070-diff-sparse-checkout-scope.sh
> >
> > diff --git a/Documentation/config/diff.txt b/Documentation/config/diff.txt
> > index 35a7bf86d77..52707e1b2d6 100644
> > --- a/Documentation/config/diff.txt
> > +++ b/Documentation/config/diff.txt
> > @@ -201,6 +201,18 @@ diff.algorithm::
> > --
> > +
> >
> > +diff.scope::
> > + Choose diff scope. The variants are as follows:
> > ++
> > +--
> > +`sparse`;;
> > + Restrict diff paths to those matching sparse-checkout specification.
> > +`all`;;
> > + Without restriction, diff is performed regardless of whether the path
> > + meets the sparse-checkout specification.
>
> As noted above, this is the wrong level to specify things. The
> description for "all" is misleading as well and suggests something
> other than "behavior B" from the direction document.
>
So do we need to make "--scope=all" correspond to the "behavior B",
The correct description of it should be: "worktree-sparse-history-dense"...
> > diff.wsErrorHighlight::
> > Highlight whitespace errors in the `context`, `old` or `new`
> > lines of the diff. Multiple values are separated by comma,
> > diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
> > index 3674ac48e92..04bf83e8be1 100644
> > --- a/Documentation/diff-options.txt
> > +++ b/Documentation/diff-options.txt
> > @@ -195,6 +195,24 @@ For instance, if you configured the `diff.algorithm` variable to a
> > non-default value and want to use the default one, then you
> > have to use `--diff-algorithm=default` option.
> >
> > +ifndef::git-format-patch[]
> > +ifndef::git-log[]
> > +
> > +--scope={sparse|all}::
> > + Choose diff scope. The variants are as follows:
> > ++
> > +--
> > +`--sparse`;;
> > + Restrict diff paths to those matching sparse-checkout specification.
> > +`--all`;;
> > + Without restriction, diff is performed regardless of whether the path
> > + meets the sparse-checkout specification.
> > +--
> > ++
> > +
> > +endif::git-log[]
> > +endif::git-format-patch[]
>
> What about diff-files, diff-index, diff-tree, and show?
>
diff-options.txt included in their documents, and git-format-patch.txt,
git-log.txt, but should not in git-format-patch.txt. I don't know if it
should be included in git-diff-files.txt, because git diff-files compare
the files in the working tree and the index (so it's the same as
"simple" git-diff, which should not be affected by scope?)
> > +
> > --stat[=<width>[,<name-width>[,<count>]]]::
> > Generate a diffstat. By default, as much space as necessary
> > will be used for the filename part, and the rest for the graph
> > diff --git a/builtin/diff.c b/builtin/diff.c
> > index 854d2c5a5c4..6b450f7184c 100644
> > --- a/builtin/diff.c
> > +++ b/builtin/diff.c
> > @@ -54,6 +54,10 @@ static void stuff_change(struct diff_options *opt,
> > oideq(old_oid, new_oid) && (old_mode == new_mode))
> > return;
> >
> > + if (opt->scope == DIFF_SCOPE_SPARSE &&
> > + !diff_paths_in_sparse_checkout(old_path, new_path))
> > + return;
>
> This can't be right.
> git diff c231e0f26fe9b2ea9ec46aa68ff95ba984ce592e
> 72d42bd856228c15f702fa3c353432f4f1defe03
> (to directly diff two known blobs) will go through this function, with
> old_path == c231e0f26fe9b2ea9ec46aa68ff95ba984ce592e and new_path ==
> 72d42bd856228c15f702fa3c353432f4f1defe03. But those aren't real
> paths, and sparse-checkout should not restrict what is shown in those
> cases.
>
Yeah, it's buggy that I forget to check two blobs without paths.
> > +
> > if (opt->flags.reverse_diff) {
> > SWAP(old_mode, new_mode);
> > SWAP(old_oid, new_oid);
> > diff --git a/diff-lib.c b/diff-lib.c
> > index 2edea41a234..a3381f2e0ff 100644
> > --- a/diff-lib.c
> > +++ b/diff-lib.c
> > @@ -88,6 +88,22 @@ static int match_stat_with_submodule(struct diff_options *diffopt,
> > return changed;
> > }
> >
> > +int diff_path_in_sparse_checkout(const char *path) {
> > + if (core_sparse_checkout_cone)
> > + return path_in_cone_mode_sparse_checkout(path, the_repository->index);
> > + else
> > + return path_in_sparse_checkout(path, the_repository->index);
> > +}
>
> This says we are including the path if it matches the sparsity
> patterns. Thus, we have to be careful when we use this function,
> because the relevant paths are ones that match the sparsity
> specification. The sparsity specification will always match the
> sparsity patterns when diffing two commits, but when either the index
> or the working tree is part of the diff, the sparsity specification
> *might* be temporarily expanded.
>
Yes, I may have to look at more code to better understand how and when the
"sparsity specification" is extended. Any recommendations for places to read?
> > +int diff_paths_in_sparse_checkout(const char *one, const char*two) {
> > + if (one == two || !strcmp(one, two))
> > + return diff_path_in_sparse_checkout(one);
> > + else
> > + return diff_path_in_sparse_checkout(one) &&
> > + diff_path_in_sparse_checkout(two);
>
> Why && rather than || ?
>
Agree, we do need to use || here, because the one diff side in the sparse
specification, we should diff the two files.
> > +}
> > +
> > +
> > int run_diff_files(struct rev_info *revs, unsigned int option)
> > {
> > int entries, i;
> > @@ -113,6 +129,9 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
> >
> > if (diff_can_quit_early(&revs->diffopt))
> > break;
> > + if (revs->diffopt.scope == DIFF_SCOPE_SPARSE &&
> > + !diff_path_in_sparse_checkout(ce->name))
> > + continue;
>
> Here you've cut off the possibility of showing diffs for anything
> outside the sparsity patterns, which is a mistake. We need to handle
> a temporarily expanded sparse specification too.
>
Agree.
> > if (!ce_path_match(istate, ce, &revs->prune_data, NULL))
> > continue;
> > @@ -202,7 +221,8 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
> > continue;
> > }
> >
> > - if (ce_uptodate(ce) || ce_skip_worktree(ce))
> > + if (ce_uptodate(ce) ||
> > + (revs->diffopt.scope != DIFF_SCOPE_ALL && ce_skip_worktree(ce)))
> > continue;
>
> Here you make --scope=all show files even if they are skip-worktree,
> making them appear to have been deleted. I called out your
> description earlier as potentially misleading, because it could imply
> this behavior. It looks like you were consistent with the description
> and implementation, it just doesn't match what we want.
>
Agree too. So we should not do anything in run_diff_files.
> > /*
> > @@ -439,6 +459,20 @@ static void do_oneway_diff(struct unpack_trees_options *o,
>
> do_oneway_diff is for cases where we are diffing against the index...
>
> > return; /* nothing to diff.. */
> > }
> >
> > + if (revs->diffopt.scope == DIFF_SCOPE_SPARSE) {
> > + if (idx && tree) {
> > + if (!diff_paths_in_sparse_checkout(idx->name, tree->name))
> > + return;
> > + } else if (idx) {
> > + if (!diff_path_in_sparse_checkout(idx->name))
> > + return;
> > + } else if (tree) {
> > + if (!diff_path_in_sparse_checkout(tree->name))
> > + return;
> > + } else
> > + return;
> > + }
>
> ...and you again mistakenly only compare to the sparsity patterns
> instead of the sparse specification.
>
So here we should use ce_skip_worktree(idx) to check if this index entry matches
sparse specification.
> > +
> > /* if the entry is not checked out, don't examine work tree */
> > cached = o->index_only ||
> > (idx && ((idx->ce_flags & CE_VALID) || ce_skip_worktree(idx)));
>
>
>
> > diff --git a/diff-no-index.c b/diff-no-index.c
> > index 18edbdf4b59..ea94a104ea4 100644
> > --- a/diff-no-index.c
> > +++ b/diff-no-index.c
> > @@ -281,6 +281,10 @@ int diff_no_index(struct rev_info *revs,
> >
> > fixup_paths(paths, &replacement);
> >
> > + if (revs->diffopt.scope == DIFF_SCOPE_SPARSE &&
> > + !diff_paths_in_sparse_checkout(paths[0], paths[1]))
> > + goto out;
>
> --no-index means we're diffing two files that are not tracked, or at
> least treating them as not tracked. sparse-checkout should not affect
> such files.
>
Yeah, we should care about untracked files sparse-checkout only when
we are using git add/rm/update-index...
> > +
> > revs->diffopt.skip_stat_unmatch = 1;
> > if (!revs->diffopt.output_format)
> > revs->diffopt.output_format = DIFF_FORMAT_PATCH;
> > diff --git a/diff.c b/diff.c
> > index 285d6e2d575..9de4044ae05 100644
> > --- a/diff.c
> > +++ b/diff.c
> > @@ -48,6 +48,7 @@ static int diff_interhunk_context_default;
> > static const char *diff_word_regex_cfg;
> > static const char *external_diff_cmd_cfg;
> > static const char *diff_order_file_cfg;
> > +static const char *external_diff_scope_cfg;
> > int diff_auto_refresh_index = 1;
> > static int diff_mnemonic_prefix;
> > static int diff_no_prefix;
> > @@ -57,6 +58,7 @@ static int diff_dirstat_permille_default = 30;
> > static struct diff_options default_diff_options;
> > static long diff_algorithm;
> > static unsigned ws_error_highlight_default = WSEH_NEW;
> > +static enum diff_scope external_diff_scope;
>
> Why is this called "external"?
>
Learn from external_diff_cmd_cfg, I should remove it.
> > static char diff_colors[][COLOR_MAXLEN] = {
> > GIT_COLOR_RESET,
> > @@ -423,6 +425,16 @@ int git_diff_ui_config(const char *var, const char *value, void *cb)
> > return 0;
> > }
> >
> > + if (!strcmp(var, "diff.scope")) {
> > + git_config_string(&external_diff_scope_cfg, var, value);
> > + if (!strcmp(value, "all"))
> > + external_diff_scope = DIFF_SCOPE_ALL;
> > + else if (!strcmp(value, "sparse"))
> > + external_diff_scope = DIFF_SCOPE_SPARSE;
> > + else
> > + return -1;
> > + }
> > +
> > if (git_color_config(var, value, cb) < 0)
> > return -1;
> >
> > @@ -4663,6 +4675,7 @@ void repo_diff_setup(struct repository *r, struct diff_options *options)
> >
> > options->color_moved = diff_color_moved_default;
> > options->color_moved_ws_handling = diff_color_moved_ws_default;
> > + options->scope = external_diff_scope;
> >
> > prep_parse_options(options);
> > }
> > @@ -4914,6 +4927,29 @@ static int parse_dirstat_opt(struct diff_options *options, const char *params)
> > return 1;
> > }
> >
> > +static int diff_opt_diff_scope(const struct option *option,
> > + const char *optarg, int unset)
> > +{
> > + struct diff_options *opt = option->value;
> > +
> > + if (unset) {
> > + opt->scope = DIFF_SCOPE_NONE;
>
> I think we should instead have a
> BUG_ON_OPT_NEG(unset)
> or, even better, a
> BUG_ON_OPT_NEG_NOARG(unset, optarg)
> at the beginning of this function...
>
> > + } else if (optarg) {
>
> ...which would also allow you to drop this if and dedent the rest of
> the function.
>
Agree.
> > + if (!strcmp(optarg, "all")) {
> > + if (core_apply_sparse_checkout) {
> > + opt->scope = DIFF_SCOPE_ALL;
> > + }
> > + } else if (!strcmp(optarg, "sparse")) {
> > + if (core_apply_sparse_checkout) {
> > + opt->scope = DIFF_SCOPE_SPARSE;
> > + }
>
> If core_apply_sparse_checkout is false, should we perhaps throw an
> error instead of just silently ignoring the option the user passed?
>
Agree.
> > + } else
> > + return error(_("invalid --scope value: %s"), optarg);
> > + }
>
> As written with no follow-on else clause here, wouldn't this silently
> accept "--scope" without an "=<something>" argument without an error
> and without properly initializing opt->scope?
>
Because opt will be initializing with default_diff_options in repo_diff_setup(),
and opt->scope should respect config value first. So I don't think there's a
mistake here.
> > +
> > + return 0;
> > +}
> > +
> > static int diff_opt_diff_filter(const struct option *option,
> > const char *optarg, int unset)
> > {
> > @@ -5683,6 +5719,9 @@ static void prep_parse_options(struct diff_options *options)
> > OPT_CALLBACK_F(0, "diff-filter", options, N_("[(A|C|D|M|R|T|U|X|B)...[*]]"),
> > N_("select files by diff type"),
> > PARSE_OPT_NONEG, diff_opt_diff_filter),
> > + OPT_CALLBACK_F(0, "scope", options, N_("[sparse|all]"),
> > + N_("choose diff scope"),
>
> maybe "choose diff scope in sparse checkouts"?
>
OK.
> > + PARSE_OPT_OPTARG, diff_opt_diff_scope),
> > { OPTION_CALLBACK, 0, "output", options, N_("<file>"),
> > N_("output to a specific file"),
> > PARSE_OPT_NONEG, NULL, 0, diff_opt_output },
> > diff --git a/diff.h b/diff.h
> > index 8ae18e5ab1e..90f7512034c 100644
> > --- a/diff.h
> > +++ b/diff.h
> > @@ -230,6 +230,12 @@ enum diff_submodule_format {
> > DIFF_SUBMODULE_INLINE_DIFF
> > };
> >
> > +enum diff_scope {
> > + DIFF_SCOPE_NONE = 0,
> > + DIFF_SCOPE_ALL,
> > + DIFF_SCOPE_SPARSE,
> > +};
> > +
> > /**
> > * the set of options the calling program wants to affect the operation of
> > * diffcore library with.
> > @@ -285,6 +291,9 @@ struct diff_options {
> > /* diff-filter bits */
> > unsigned int filter, filter_not;
> >
> > + /* diff sparse-checkout scope */
> > + enum diff_scope scope;
> > +
> > int use_color;
> >
> > /* Number of context lines to generate in patch output. */
> > @@ -696,4 +705,6 @@ void print_stat_summary(FILE *fp, int files,
> > int insertions, int deletions);
> > void setup_diff_pager(struct diff_options *);
> >
> > +int diff_path_in_sparse_checkout(const char *path);
> > +int diff_paths_in_sparse_checkout(const char *one, const char *two);
> > #endif /* DIFF_H */
> > diff --git a/t/t4070-diff-sparse-checkout-scope.sh b/t/t4070-diff-sparse-checkout-scope.sh
> > new file mode 100644
>
> This needs to be fixed.
>
> > index 00000000000..dca75a3308b
> > --- /dev/null
> > +++ b/t/t4070-diff-sparse-checkout-scope.sh
> > @@ -0,0 +1,469 @@
> > +#!/bin/sh
> > +
> > +test_description='diff sparse-checkout scope'
> > +
> > +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
> > +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
> > +
> > +. ./test-lib.sh
> > +
> > +
> > +test_expect_success 'setup' '
> > + git init temp &&
> > + (
> > + cd temp &&
> > + mkdir sub1 &&
> > + mkdir sub2 &&
> > + echo sub1/file1 >sub1/file1 &&
> > + echo sub2/file2 >sub2/file2 &&
> > + echo file1 >file1 &&
> > + echo file2 >file2 &&
> > + git add --all &&
> > + git commit -m init &&
> > + echo sub1/file1 >>sub1/file1 &&
> > + echo sub1/file2 >>sub1/file2 &&
> > + echo sub2/file1 >>sub2/file1 &&
> > + echo sub2/file2 >>sub2/file2 &&
> > + echo file1 >>file1 &&
> > + echo file2 >>file2 &&
> > + git add --all &&
> > + git commit -m change1 &&
> > + echo sub1/file1 >>sub1/file1 &&
> > + echo sub1/file2 >>sub1/file2 &&
> > + echo sub2/file1 >>sub2/file1 &&
> > + echo sub2/file2 >>sub2/file2 &&
> > + echo file1 >>file1 &&
> > + echo file2 >>file2 &&
> > + git add --all &&
> > + git commit -m change2
> > + )
> > +'
> > +
> > +reset_repo () {
> > + rm -rf repo &&
> > + git clone --no-checkout temp repo
>
> Why --no-checkout rather than say --sparse?
>
This is because I accidentally associated it with a
partial clone. I often use "git clone -filter=blob:none -no-checkout"
to do a partial clone, then "git sparse- checkout set <pattern>"
to set sparse-checkout patterns, and "git checkout" to download
the missing blobs and checkout to a branch. But in this
test file, we only need sparse-checkout, so it's true that I should
not do such strange no-checkout thing.
> > +}
> > +
> > +reset_with_sparse_checkout() {
> > + reset_repo &&
> > + git -C repo sparse-checkout set $1 sub1 &&
> > + git -C repo checkout
>
> Fixing the above would let us get rid of this really weird extra
> checkout command too.
>
> > +}
> > +
> > +change_worktree_and_index() {
> > + (
> > + cd repo &&
> > + mkdir sub2 sub3 &&
> > + echo sub1/file3 >sub1/file3 &&
> > + echo sub2/file3 >sub2/file3 &&
> > + echo sub3/file3 >sub3/file3 &&
> > + echo file3 >file3 &&
> > + git add --all --sparse &&
> > + echo sub1/file3 >>sub1/file3 &&
> > + echo sub2/file3 >>sub2/file3 &&
> > + echo sub3/file3 >>sub3/file3 &&
> > + echo file3 >>file3
> > + )
> > +}
>
> It would be nice to modify different paths in the working tree and
> index, to see if we can handle cases where the sparse specification
> does not match the sparsity patterns better. (So, modify files inside
> and outside the sparsity patterns, stage those changes, and then do a
> `git sparse-checkout reapply` to make the files outside the sparsity
> patterns disappear from the working tree...then modify different files
> in the working tree both inside and outside the sparsity patterns.
> And also remove some file that matches the sparsity patterns and
> manually mark it as SKIP_WORKTREE.) That'd give us much better
> coverage.
>
Nice addition. So I should use git update-index --skip-worktree to set
skip_worktree bit for index entries.
> > +
> > +diff_scope() {
> > + title=$1
> > + need_change_worktree_and_index=$2
> > + sparse_checkout_option=$3
> > + scope_option=$4
> > + expect=$5
> > + shift 5
> > + args=("$@")
> > +
> > + test_expect_success "$title $sparse_checkout_option $scope_option" "
> > + reset_with_sparse_checkout $sparse_checkout_option &&
> > + if test \"$need_change_worktree_and_index\" = \"true\" ; then
> > + change_worktree_and_index
> > + fi &&
> > + git -C repo diff $scope_option ${args[*]} >actual &&
> > + if test -z \"$expect\" ; then
> > + >expect
> > + else
> > + cat > expect <<-EOF
> > +$expect
> > + EOF
> > + fi &&
> > + test_cmp expect actual
> > + "
> > +}
> > +
> > +args=("--name-only" "HEAD" "HEAD~")
> > +diff_scope builtin_diff_tree false "--no-cone" "--scope=sparse" \
> > +"sub1/file1
> > +sub1/file2" "${args[@]}"
> > +
> > +diff_scope builtin_diff_tree false "--no-cone" "--scope=all" \
> > +"file1
> > +file2
> > +sub1/file1
> > +sub1/file2
> > +sub2/file1
> > +sub2/file2" "${args[@]}"
> > +
> > +diff_scope builtin_diff_tree false "--no-cone" "--no-scope" \
> > +"file1
> > +file2
> > +sub1/file1
> > +sub1/file2
> > +sub2/file1
> > +sub2/file2" "${args[@]}"
> > +
> > +diff_scope builtin_diff_tree false "--cone" "--scope=sparse" \
> > +"file1
> > +file2
> > +sub1/file1
> > +sub1/file2" "${args[@]}"
> > +
> > +diff_scope builtin_diff_tree false "--cone" "--scope=all" \
> > +"file1
> > +file2
> > +sub1/file1
> > +sub1/file2
> > +sub2/file1
> > +sub2/file2" "${args[@]}"
> > +
> > +diff_scope builtin_diff_tree false "--cone" "--no-scope" \
> > +"file1
> > +file2
> > +sub1/file1
> > +sub1/file2
> > +sub2/file1
> > +sub2/file2" "${args[@]}"
> > +
> > +args=("--name-only" "HEAD~")
> > +diff_scope builtin_diff_index true "--no-cone" "--scope=sparse" \
> > +"sub1/file1
> > +sub1/file2
> > +sub1/file3" "${args[@]}"
>
> Here's a good example where the testcase is doing the wrong thing.
> The expected answer here would also include file3, sub2/file3, and
> sub3/file3.
>
Yeah. Files that are not part of the sparse-checkout patterns are temporarily
extended into the sparse specification.
> > +
> > +diff_scope builtin_diff_index true "--no-cone" "--scope=all" \
> > +"file1
> > +file2
> > +file3
> > +sub1/file1
> > +sub1/file2
> > +sub1/file3
> > +sub2/file1
> > +sub2/file2
> > +sub2/file3
> > +sub3/file3" "${args[@]}"
> > +
> > +diff_scope builtin_diff_index true "--no-cone" "--no-scope" \
> > +"file1
> > +file2
> > +file3
> > +sub1/file1
> > +sub1/file2
> > +sub1/file3
> > +sub2/file1
> > +sub2/file2
> > +sub2/file3
> > +sub3/file3" "${args[@]}"
> > +
> > +diff_scope builtin_diff_index true "--cone" "--scope=sparse" \
> > +"file1
> > +file2
> > +file3
> > +sub1/file1
> > +sub1/file2
> > +sub1/file3" "${args[@]}"
>
> This is also wrong; it's missing sub2/file3 and sub3/file3.
>
> > +
> > +diff_scope builtin_diff_index true "--cone" "--scope=all" \
> > +"file1
> > +file2
> > +file3
> > +sub1/file1
> > +sub1/file2
> > +sub1/file3
> > +sub2/file1
> > +sub2/file2
> > +sub2/file3
> > +sub3/file3" "${args[@]}"
> > +
> > +diff_scope builtin_diff_index true "--cone" "--no-scope" \
> > +"file1
> > +file2
> > +file3
> > +sub1/file1
> > +sub1/file2
> > +sub1/file3
> > +sub2/file1
> > +sub2/file2
> > +sub2/file3
> > +sub3/file3" "${args[@]}"
> > +
> > +args=("--name-only" "file3" "sub1/" "sub2/")
> > +
> > +diff_scope builtin_diff_files true "--no-cone" "--scope=sparse" \
> > +"sub1/file3" "${args[@]}"
>
> This should also include file3, sub2/file3, and sub3/file3.
> `--scope=` should not affect diff output at all if neither --cached
> nor revision arguments are supplied.
>
Agree.
> > +
> > +diff_scope builtin_diff_files true "--no-cone" "--scope=all" \
> > +"file3
> > +sub1/file3
> > +sub2/file1
> > +sub2/file2
> > +sub2/file3" "${args[@]}"
>
> This is wrong due to including too much; it should not include
> sub2/file1 or sub2/file2 (it is only including those because it is
> showing them as deleted, when they are not deleted but are
> SKIP_WORKTREE).
>
Agree.
> I think I'm going to stop reviewing here. I'm probably just going to
> keep repeating the same issues I identified earlier if I continue.
Thank you very much for your review, you have pointed out very many
problems with this patch :)
--
ZheNing Hu
next prev parent reply other threads:[~2022-11-06 2:17 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-31 4:11 [PATCH] [RFC] diff: introduce scope option ZheNing Hu via GitGitGadget
2022-11-01 1:34 ` Taylor Blau
2022-11-01 2:13 ` ZheNing Hu
2022-11-01 5:18 ` Elijah Newren
2022-11-06 2:11 ` ZheNing Hu [this message]
2022-11-06 6:58 ` Elijah Newren
2022-11-14 9:08 ` ZheNing Hu
2022-11-25 2:45 ` [PATCH v2] [RFC] diff: introduce --scope option ZheNing Hu via GitGitGadget
2022-11-29 12:00 ` [PATCH v3 0/2] [RFC] diff: introduce scope option ZheNing Hu via GitGitGadget
2022-11-29 12:00 ` [PATCH v3 1/2] [RFC] diff: introduce --scope option ZheNing Hu via GitGitGadget
2022-11-29 12:00 ` [PATCH v3 2/2] [RPC] grep: " ZheNing Hu via GitGitGadget
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAOLTT8TceM-NpV2_hUCZj2Dx=W30f_9SHW8CcRH-pw32BRd-oA@mail.gmail.com' \
--to=adlternative@gmail.com \
--cc=avarab@gmail.com \
--cc=christian.couder@gmail.com \
--cc=derrickstolee@github.com \
--cc=emilyshaffer@google.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=gitster@pobox.com \
--cc=johannes.schindelin@gmx.de \
--cc=matheus.bernardino@usp.br \
--cc=newren@gmail.com \
--cc=peff@peff.net \
--cc=shaoxuan.yuan02@gmail.com \
--cc=vdye@github.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).