git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "Kevin Willford via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, stolee@gmail.com, newren@gmail.com,
	Victoria Dye <vdye@github.com>,
	Kevin Willford <kewillf@microsoft.com>
Subject: Re: [PATCH 1/7] reset: behave correctly with sparse-checkout
Date: Thu, 30 Sep 2021 11:34:28 -0700	[thread overview]
Message-ID: <xmqqpmspc3pn.fsf@gitster.g> (raw)
In-Reply-To: <65905bf4e001118e8b9ced95c1bcecbacb6334ac.1633013461.git.gitgitgadget@gmail.com> (Kevin Willford via GitGitGadget's message of "Thu, 30 Sep 2021 14:50:55 +0000")

"Kevin Willford via GitGitGadget" <gitgitgadget@gmail.com> writes:

> @@ -127,12 +129,49 @@ static void update_index_from_diff(struct diff_queue_struct *q,
>  		struct diff_options *opt, void *data)
>  {
>  	int i;
> +	int pos;
>  	int intent_to_add = *(int *)data;
>  
>  	for (i = 0; i < q->nr; i++) {
>  		struct diff_filespec *one = q->queue[i]->one;
> +		struct diff_filespec *two = q->queue[i]->two;
>  		int is_missing = !(one->mode && !is_null_oid(&one->oid));
> +		int was_missing = !two->mode && is_null_oid(&two->oid);

Not a problem introduced by this patch per-se, but is_missing is a
counter-intuitive name for what the boolean wants to represent, I
think, which was brought in by b4b313f9 (reset: support "--mixed
--intent-to-add" mode, 2014-02-04).  Before the commit, we used to
say

 	for (i = 0; i < q->nr; i++) {
 		struct diff_filespec *one = q->queue[i]->one;
		if (one->mode && !is_null_sha1(one->sha1)) {
			... create ce out of one and add to the	index ...
		} else
 			remove_file_from_cache(one->path);
		...

i.e. "if one is not missing, create a ce and add it, otherwise
remove the path".

It should have been called "one_is_missing" if we wanted to
literally express the condition the code checked, but an even better
name would have been given after the intent of what the code wants
to do with the information.  If the resetted-to tree (that is what
'one' side of the comparison in diff_cache() is) has a valid blob,
we want it to be in the index, and otherwise, we do not want it in
the index.

Now, the patch makes things worse and I had to do the above digging
to see why the new code is even more confusing.  The 'two' side of
the comparison is what is in the to-be-corrected-by-reset index.
"was_missing" in contrast to "is_missing" makes it sound as if it
was the state before whatever "is_missing" tries to represent, but
that is not what is happening.  "is_missing" does not mean "the
entry is currently not there in the index", but "was_missing" does
mean exactly that: "the entry is currently not there in the index".

There isn't any "was" missing about it.  It "is" missing in the
index.  Instead of renaming, I wonder if we can do without this new
variable.  Let's read on the patch.

Also, now the code uses both sides of the filepair, we must double
check that our do_diff_cache() is *not* doing any rename detection.
It might be even prudent to ensure that 

	if (strcmp(one->path, two->path))
		BUG("reset drove diff-cache with rename detection");

but it might be with too much paranoia.  I dunno.

>  		struct cache_entry *ce;
> +		struct cache_entry *ce_before;
> +		struct checkout state = CHECKOUT_INIT;

These two new variables do not need this wide a scope, I would
think.  Shouldn't it be inside the body of the new "if" statement
this patch adds?

> +		/*
> +		 * When using the sparse-checkout feature the cache entries
> +		 * that are added here will not have the skip-worktree bit
> +		 * set. Without this code there is data that is lost because
> +		 * the files that would normally be in the working directory
> +		 * are not there and show as deleted for the next status.
> +		 * In the case of added files, they just disappear.
> +		 *
> +		 * We need to create the previous version of the files in
> +		 * the working directory so that they will have the right
> +		 * content and the next status call will show modified or
> +		 * untracked files correctly.
> +		 */
> +		if (core_apply_sparse_checkout && !file_exists(two->path)) {

In a sparsely checked out working tree, there is nothing in the
working tree at the path.  It may be because it is sparse and we
didn't want to have anything there, or it may be because the user
wanted to get rid of it and said "rm path" (not "git rm path") and
this part of the tree were of interest even if the sparse checkout
feature was used to hide other parts of the tree.  With the above
two checks alone, we cannot tell which.  Let's read on.

> +			pos = cache_name_pos(two->path, strlen(two->path));

We check the index to see if there is an entry for it.  I suspect
that because we need to do this check anyway, we shouldn't even have
to look at 'two' (and add a new 'was_missing' variable), because
'one' and 'two' came from a comparison between the resetted-to tree
object and the current index, and if cache_name_pos() for the path
(we can use 'one->path') says there is an entry in the index, by
definition, 'two' would not be showing a "removed" state (i.e. "the
resetted-to tree had it, the index does not" is what "was_missing"
wants to say).

So I wonder if it is better to

 - use one->path for !file_exists() above and cache_name_pos() here
   instead of two->path.

 - drop the confusingly named 'was_missing', because (pos < 0) is
   equivalent to it after this point, and we didn't need it up to
   this point.

> +			if ((pos >= 0 && ce_skip_worktree(active_cache[pos])) &&

And we do find an entry for it.  So this path is not something
sparse cone specifies not to check out (otherwise we would have a
tree-like entry that covers this path in the index and not an entry
for this specific path)?

Anyway, if it is marked with the skip-worktree bit, does that mean
there is no risk that the reason why two->path does not exist in the
working tree is because we earlier gave it in the working tree but
it was later removed by the user?  Just making sure that we are not
breaking the end-user's wish that the path should be removed by
resurrecting it in the working tree with a new call to
checkout_entry().

> +			    (is_missing || !was_missing)) {

And in such a case, if the resetted-to tree says we shouldn't have
the path in the resulting index, or if the original state in the
index had this path (but because (0 <= pos) must be true for us to 
reach this point, I am not sure if "was_missing" can ever be true
here), then do the following, which is ...

> +				state.force = 1;
> +				state.refresh_cache = 1;
> +				state.istate = &the_index;
> +
> +				ce_before = make_cache_entry(&the_index, two->mode,
> +							     &two->oid, two->path,
> +							     0, 0);
> +				if (!ce_before)
> +					die(_("make_cache_entry failed for path '%s'"),
> +						two->path);
> +
> +				checkout_entry(ce_before, &state, NULL, NULL);

... to resurrect the last "git add"ed state from the index and write
it out to the working tree.  As I suspected, ce_before and state
should be scoped inside this block and not visible outside, no?

I am not sure why this behaviour is desirable.  The "mixed" reset
should not have to touch the working tree in the first place.

The large comment before this block says "... will not have the
skip-worktree bit set", but we are dealing with a case where the
original index had a cache entry there with skip-worktree bit set,
so isn't the more desirable outcome that the cache entry added back
to the index has the skip-worktree bit still set and there is no
working tree file that the user did not desire to have?

And isn't it the matter of preserving the skip-worktree bit when the
code in the post context of this hunk this patch did not touch adds
the entry back to the index?

> +			}
> +		}
>  
>  		if (is_missing && !intent_to_add) {
>  			remove_file_from_cache(one->path);

If we look at the code after this point, we do use "is_missing"
information to tweak ce->ce_flags with the intent-to-add bit.

Perhaps we can do a similar tweak to the cache entry to mark it with
skip-worktree bit if the index had a cache entry at the path with
the bit set?  The code that needs to do so would only have to
remember if the one->path is in the current index and the cache
entry for the path has the skip-worktree bit in the body of the new
if() statement about (core_apply_sparse_checkout && !file_exists())
added by this patch (I am not sure if !file_exists() even matters,
though, as the approach I am suggesting is to preserve the skip bit
and not disturb the working tree files at all).

Thanks.





  reply	other threads:[~2021-09-30 18:34 UTC|newest]

Thread overview: 114+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-30 14:50 [PATCH 0/7] Sparse Index: integrate with reset Victoria Dye via GitGitGadget
2021-09-30 14:50 ` [PATCH 1/7] reset: behave correctly with sparse-checkout Kevin Willford via GitGitGadget
2021-09-30 18:34   ` Junio C Hamano [this message]
2021-10-01 14:55     ` Victoria Dye
2021-09-30 14:50 ` [PATCH 2/7] sparse-index: update command for expand/collapse test Victoria Dye via GitGitGadget
2021-09-30 19:17   ` Taylor Blau
2021-09-30 20:11     ` Victoria Dye
2021-09-30 21:32       ` Junio C Hamano
2021-09-30 22:59         ` Victoria Dye
2021-10-01  0:04           ` Junio C Hamano
2021-10-04 13:47             ` Victoria Dye
2021-10-01  9:14   ` Bagas Sanjaya
2021-09-30 14:50 ` [PATCH 3/7] reset: expand test coverage for sparse checkouts Victoria Dye via GitGitGadget
2021-09-30 14:50 ` [PATCH 4/7] reset: integrate with sparse index Victoria Dye via GitGitGadget
2021-09-30 14:50 ` [PATCH 5/7] reset: make sparse-aware (except --mixed) Victoria Dye via GitGitGadget
2021-09-30 14:51 ` [PATCH 6/7] reset: make --mixed sparse-aware Victoria Dye via GitGitGadget
2021-10-01 15:03   ` Victoria Dye
2021-09-30 14:51 ` [PATCH 7/7] unpack-trees: improve performance of next_cache_entry Victoria Dye via GitGitGadget
2021-10-05 13:20 ` [PATCH v2 0/7] Sparse Index: integrate with reset Victoria Dye via GitGitGadget
2021-10-05 13:20   ` [PATCH v2 1/7] reset: behave correctly with sparse-checkout Kevin Willford via GitGitGadget
2021-10-05 19:30     ` Junio C Hamano
2021-10-05 21:59       ` Victoria Dye
2021-10-06 12:44         ` Junio C Hamano
2021-10-06  1:46     ` Elijah Newren
2021-10-06 20:09       ` Victoria Dye
2021-10-06 10:31     ` Bagas Sanjaya
2021-10-05 13:20   ` [PATCH v2 2/7] update-index: add --force-full-index option for expand/collapse test Victoria Dye via GitGitGadget
2021-10-06  2:00     ` Elijah Newren
2021-10-06 20:40       ` Victoria Dye
2021-10-08  3:42         ` Elijah Newren
2021-10-08 17:11           ` Junio C Hamano
2021-10-06 10:33     ` Bagas Sanjaya
2021-10-05 13:20   ` [PATCH v2 3/7] reset: expand test coverage for sparse checkouts Victoria Dye via GitGitGadget
2021-10-06  2:04     ` Elijah Newren
2021-10-05 13:20   ` [PATCH v2 4/7] reset: integrate with sparse index Victoria Dye via GitGitGadget
2021-10-06  2:15     ` Elijah Newren
2021-10-06 17:48       ` Junio C Hamano
2021-10-05 13:20   ` [PATCH v2 5/7] reset: make sparse-aware (except --mixed) Victoria Dye via GitGitGadget
2021-10-06  3:43     ` Elijah Newren
2021-10-06 20:56       ` Victoria Dye
2021-10-06 10:34     ` Bagas Sanjaya
2021-10-05 13:20   ` [PATCH v2 6/7] reset: make --mixed sparse-aware Victoria Dye via GitGitGadget
2021-10-06  4:43     ` Elijah Newren
2021-10-07 14:34       ` Victoria Dye
2021-10-05 13:20   ` [PATCH v2 7/7] unpack-trees: improve performance of next_cache_entry Victoria Dye via GitGitGadget
2021-10-06 10:37     ` Bagas Sanjaya
2021-10-05 15:34   ` [PATCH v2 0/7] Sparse Index: integrate with reset Ævar Arnfjörð Bjarmason
2021-10-05 20:44     ` Victoria Dye
2021-10-06  5:46   ` Elijah Newren
2021-10-07 21:15   ` [PATCH v3 0/8] " Victoria Dye via GitGitGadget
2021-10-07 21:15     ` [PATCH v3 1/8] reset: rename is_missing to !is_in_reset_tree Victoria Dye via GitGitGadget
2021-10-07 21:15     ` [PATCH v3 2/8] reset: preserve skip-worktree bit in mixed reset Kevin Willford via GitGitGadget
2021-10-08  9:04       ` Junio C Hamano
2021-10-07 21:15     ` [PATCH v3 3/8] update-index: add --force-full-index option for expand/collapse test Victoria Dye via GitGitGadget
2021-10-08  2:50       ` Bagas Sanjaya
2021-10-08  5:24       ` Junio C Hamano
2021-10-08 15:47         ` Victoria Dye
2021-10-08 17:19           ` Junio C Hamano
2021-10-11 14:12             ` Derrick Stolee
2021-10-11 15:05               ` Victoria Dye
2021-10-11 15:24               ` Junio C Hamano
2021-10-07 21:15     ` [PATCH v3 4/8] reset: expand test coverage for sparse checkouts Victoria Dye via GitGitGadget
2021-10-07 21:15     ` [PATCH v3 5/8] reset: integrate with sparse index Victoria Dye via GitGitGadget
2021-10-07 21:15     ` [PATCH v3 6/8] reset: make sparse-aware (except --mixed) Victoria Dye via GitGitGadget
2021-10-08 11:09       ` Phillip Wood
2021-10-08 17:14         ` Victoria Dye
2021-10-08 18:31           ` Junio C Hamano
2021-10-09 11:18             ` Phillip Wood
2021-10-10 22:03               ` Junio C Hamano
2021-10-11 15:55                 ` Victoria Dye
2021-10-11 16:16                   ` Junio C Hamano
2021-10-12 10:16                 ` Phillip Wood
2021-10-12 19:15                   ` Junio C Hamano
2021-10-12 10:17           ` Phillip Wood
2021-10-07 21:15     ` [PATCH v3 7/8] reset: make --mixed sparse-aware Victoria Dye via GitGitGadget
2021-11-20 22:02       ` Elijah Newren
2021-11-22 16:46         ` Victoria Dye
2021-10-07 21:15     ` [PATCH v3 8/8] unpack-trees: improve performance of next_cache_entry Victoria Dye via GitGitGadget
2021-10-11 20:30     ` [PATCH v4 0/8] Sparse Index: integrate with reset Victoria Dye via GitGitGadget
2021-10-11 20:30       ` [PATCH v4 1/8] reset: rename is_missing to !is_in_reset_tree Victoria Dye via GitGitGadget
2021-10-11 20:30       ` [PATCH v4 2/8] reset: preserve skip-worktree bit in mixed reset Victoria Dye via GitGitGadget
2021-10-22  4:19         ` Emily Shaffer
2021-10-25 15:59           ` Victoria Dye
2021-10-11 20:30       ` [PATCH v4 3/8] sparse-index: update command for expand/collapse test Victoria Dye via GitGitGadget
2021-10-11 20:30       ` [PATCH v4 4/8] reset: expand test coverage for sparse checkouts Victoria Dye via GitGitGadget
2021-10-11 20:30       ` [PATCH v4 5/8] reset: integrate with sparse index Victoria Dye via GitGitGadget
2021-10-11 20:30       ` [PATCH v4 6/8] reset: make sparse-aware (except --mixed) Victoria Dye via GitGitGadget
2021-10-11 20:30       ` [PATCH v4 7/8] reset: make --mixed sparse-aware Victoria Dye via GitGitGadget
2021-10-11 20:30       ` [PATCH v4 8/8] unpack-trees: improve performance of next_cache_entry Victoria Dye via GitGitGadget
2021-10-27 14:39       ` [PATCH v5 0/8] Sparse Index: integrate with reset Victoria Dye via GitGitGadget
2021-10-27 14:39         ` [PATCH v5 1/8] reset: rename is_missing to !is_in_reset_tree Victoria Dye via GitGitGadget
2021-10-27 14:39         ` [PATCH v5 2/8] reset: preserve skip-worktree bit in mixed reset Victoria Dye via GitGitGadget
2021-10-27 14:39         ` [PATCH v5 3/8] sparse-index: update command for expand/collapse test Victoria Dye via GitGitGadget
2021-10-27 14:39         ` [PATCH v5 4/8] reset: expand test coverage for sparse checkouts Victoria Dye via GitGitGadget
2021-10-27 14:39         ` [PATCH v5 5/8] reset: integrate with sparse index Victoria Dye via GitGitGadget
2021-10-27 14:39         ` [PATCH v5 6/8] reset: make sparse-aware (except --mixed) Victoria Dye via GitGitGadget
2021-10-27 14:39         ` [PATCH v5 7/8] reset: make --mixed sparse-aware Victoria Dye via GitGitGadget
2021-11-20 22:05           ` Elijah Newren
2021-11-22 16:54             ` Victoria Dye
2021-10-27 14:39         ` [PATCH v5 8/8] unpack-trees: improve performance of next_cache_entry Victoria Dye via GitGitGadget
2021-11-20 22:13         ` [PATCH v5 0/8] Sparse Index: integrate with reset Elijah Newren
2021-11-29 15:52         ` [PATCH v6 " Victoria Dye via GitGitGadget
2021-11-29 15:52           ` [PATCH v6 1/8] reset: rename is_missing to !is_in_reset_tree Victoria Dye via GitGitGadget
2021-11-29 15:52           ` [PATCH v6 2/8] reset: preserve skip-worktree bit in mixed reset Victoria Dye via GitGitGadget
2021-11-29 15:52           ` [PATCH v6 3/8] sparse-index: update command for expand/collapse test Victoria Dye via GitGitGadget
2021-11-29 15:52           ` [PATCH v6 4/8] reset: expand test coverage for sparse checkouts Victoria Dye via GitGitGadget
2021-11-29 15:52           ` [PATCH v6 5/8] reset: integrate with sparse index Victoria Dye via GitGitGadget
2021-11-29 15:52           ` [PATCH v6 6/8] reset: make sparse-aware (except --mixed) Victoria Dye via GitGitGadget
2021-11-29 15:52           ` [PATCH v6 7/8] reset: make --mixed sparse-aware Victoria Dye via GitGitGadget
2021-11-29 15:52           ` [PATCH v6 8/8] unpack-trees: improve performance of next_cache_entry Victoria Dye via GitGitGadget
2021-11-29 18:37           ` [PATCH v6 0/8] Sparse Index: integrate with reset Elijah Newren
2021-11-29 19:44             ` Victoria Dye
2021-11-30  3:59               ` Elijah Newren
2021-12-01 20:26               ` Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqpmspc3pn.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=kewillf@microsoft.com \
    --cc=newren@gmail.com \
    --cc=stolee@gmail.com \
    --cc=vdye@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).