All of lore.kernel.org
 help / color / mirror / Atom feed
From: Victoria Dye <vdye@github.com>
To: Derrick Stolee <stolee@gmail.com>, Junio C Hamano <gitster@pobox.com>
Cc: Victoria Dye via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Subject: Re: [PATCH 2/2] sparse-index: update index read to consider index.sparse config
Date: Mon, 18 Oct 2021 10:14:56 -0400	[thread overview]
Message-ID: <4656a934-5305-fbdf-76ca-17562fca62ef@github.com> (raw)
In-Reply-To: <b51500f2-854a-3006-810a-fb7fb8d8dcfb@gmail.com>

Derrick Stolee wrote:
> On 10/17/2021 9:15 PM, Junio C Hamano wrote:
>> Derrick Stolee <stolee@gmail.com> writes:
>>
>>>>     * In addition, with these patches, if index.sparse=false, a
>>>>       sparse index will be expaned to full upon reading, and if it
>>>>       is true, a non-sparse index will be shrunk to sparse upon
>>>>       reading
>>>
>>> This is only half true. If "index.sparse=false", then a sparse
>>> index will be expanded to full upon reading. If "index.sparse=true"
>>> then nothing special will happen to an index on reading.
>>
>> OK.  I somehow got the impression that we convert in both ways from
>> the patch text under discussion, specifically this part in
>> do_read_index():
>>
>>> -	if (istate->repo->settings.command_requires_full_index)
>>> +	if (!istate->repo->settings.sparse_index ||
>>> +	    istate->repo->settings.command_requires_full_index)
>>>  		ensure_full_index(istate);
>>> +	else if (!istate->sparse_index)
>>> +		convert_to_sparse(istate, 0);
>>>  
>>>  	return istate->cache_nr;
>>
>> We used to say "when we know we are running a command that is not
>> sparse ready, expand it here" and nothing else.
>>
>> We now added a bit more condition for expansion, i.e. "when we are
>> told that the repository does not want sparse index, or when the
>> command is not sparse ready".
>>
>> But the patch goes one more step.  "If we didn't find a reason to
>> expand to full, and if the in-core index we read is not sparse, then
>> call convert_to_sparse() on it".  So if the repo settings tells us
>> that the repository wants a sparse index, and the index we read was
>> not sparse, what does convert_to_sparse() without MEM_ONLY flag bit
>> do?  Nothing special?
> 
> You are absolutely right. I've been talking about what I _thought_
> the code does (and should do) but I missed this 'else if' which is
> in fact doing what you have been claiming the entire time. I should
> have done a better job double-checking the code before doubling
> down on my claims.
> 
> I think the 'else if' should be removed, which would match my
> expectations.
> 

By leaving that part out, wouldn't you only solve half of the "mismatch"
between in-core index and repo setting? Earlier, you described your
expectation as:

> * If index.sparse=false, then a sparse index will be converted to
>   full upon read.
> 
> * If index.sparse=true, then a full index will be converted to sparse
>   on write.

Why should the direction of change to the setting value (false->true vs
true->false) cause the index to convert at different times? Consider the
scenario:

# In a cone-mode, sparse index-enabled sparse checkout repo
$ git -c index.sparse=false status    # 1
$ git status                          # 2
$ git status                          # 3

Before this patch, the index has the following states per command:

(1) the index is sparse in-core, writes full on-disk
(2) the index is full in-core, writes sparse on-disk
(3) the index is sparse in-core, writes sparse on-disk

Here, I see two mismatches in my expectations: (1) operates on an in-core
sparse index, despite `index.sparse=false`, and (2) operates on an in-core
full index, despite `index.sparse=true`. 

What you're suggesting solves only the first mismatch. However, the second
mismatch incurs the performance hit of a supposedly-sparse command actually
operating on an in-core full index. It also creates an inconsistency between
(2) and (3) in their use of the sparse index. What I'd like this patch to do
is:

(1) the index is full in-core, writes full on-disk
(2) the index is sparse in-core, writes sparse on-disk
(3) the index is sparse in-core, writes sparse on-disk

Here, there are no more mismatches between in-core index usage and what is
written to disk, and (2) and (3) use the same index sparsity.

>> I see many unconditional calls to convert_to_sparse(istate, 0) on
>> the write code paths, where I instead would expect "if the repo
>> wants sparse, call convert_to_sparse(), and if the repo does not
>> want sparse, call ensure_full_index()", before actualy writing the
>> entries out.  These also are setting traps to confuse readers.
>>
>> Perhaps we want a helper function "ensure_right_sparsity(istate)"
>> that would be called where we have
>>
>> 	if (condition)
>> 		convert_to_sparse();
>> 	else
>> 		ensure_full_index();
>>
>> or an unconditonal call to convert_to_sparse() to unconfuse readers?
> 
> convert_to_sparse() has several checks that prevent the conversion
> from happening, such as having a split index. In particular, it will
> skip the conversion if the_repository->settings.sparse_index is false.
> Thus, calling it unconditionally in the write code is correct.
> 

I may have introduced some confusion by redundantly checking
`!istate->sparse_index` before converting to sparse (my goal was readability
- showing the index does not need to be converted to sparse if it is already
sparse - but I think it ended up doing the opposite). The condition could be
removed entirely, thanks to an equivalent check inside `convert_to_sparse`:

-       if (istate->repo->settings.command_requires_full_index)
+       if (!istate->repo->settings.sparse_index ||
+           istate->repo->settings.command_requires_full_index)
                ensure_full_index(istate);
+       else
+               convert_to_sparse(istate, 0);

  reply	other threads:[~2021-10-18 14:17 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-15 19:54 [PATCH 0/2] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
2021-10-15 19:54 ` [PATCH 1/2] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
2021-10-15 19:54 ` [PATCH 2/2] sparse-index: update index read to consider index.sparse config Victoria Dye via GitGitGadget
2021-10-15 21:53   ` Junio C Hamano
2021-10-17  1:21     ` Derrick Stolee
2021-10-17  5:58       ` Junio C Hamano
2021-10-17 19:33         ` Derrick Stolee
2021-10-18  1:15           ` Junio C Hamano
2021-10-18 13:25             ` Derrick Stolee
2021-10-18 14:14               ` Victoria Dye [this message]
2021-10-21 13:26                 ` Derrick Stolee
2021-10-18 16:09               ` Junio C Hamano
2021-10-21 20:48 ` [PATCH v2 0/3] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
2021-10-21 20:48   ` [PATCH v2 1/3] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
2021-10-21 20:48   ` [PATCH v2 2/3] sparse-index: add ensure_correct_sparsity function Victoria Dye via GitGitGadget
2021-10-21 22:20     ` Junio C Hamano
2021-10-27 17:21       ` Victoria Dye
2021-10-21 20:48   ` [PATCH v2 3/3] sparse-index: update do_read_index to ensure correct sparsity Victoria Dye via GitGitGadget
2021-10-27 18:20   ` [PATCH v3 0/3] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
2021-10-27 18:20     ` [PATCH v3 1/3] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
2021-10-27 18:20     ` [PATCH v3 2/3] sparse-index: add ensure_correct_sparsity function Victoria Dye via GitGitGadget
2021-10-27 20:07       ` Derrick Stolee
2021-10-27 21:32         ` Junio C Hamano
2021-10-28  1:24           ` Derrick Stolee
2021-10-29 13:43             ` Victoria Dye
2021-10-27 18:20     ` [PATCH v3 3/3] sparse-index: update do_read_index to ensure correct sparsity Victoria Dye via GitGitGadget
2021-10-27 20:08     ` [PATCH v3 0/3] sparse-index: expand/collapse based on 'index.sparse' Derrick Stolee
2021-10-29 13:51     ` [PATCH v4 0/4] " Victoria Dye via GitGitGadget
2021-10-29 13:51       ` [PATCH v4 1/4] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
2021-10-29 13:51       ` [PATCH v4 2/4] sparse-index: avoid unnecessary cache tree clearing Victoria Dye via GitGitGadget
2021-10-29 18:45         ` Junio C Hamano
2021-10-29 19:00           ` Derrick Stolee
2021-10-29 20:01             ` Junio C Hamano
2021-10-29 13:51       ` [PATCH v4 3/4] sparse-index: add ensure_correct_sparsity function Victoria Dye via GitGitGadget
2021-10-29 13:51       ` [PATCH v4 4/4] sparse-index: update do_read_index to ensure correct sparsity Victoria Dye via GitGitGadget
2021-11-22 17:36         ` Elijah Newren
2021-11-22 18:59           ` Victoria Dye
2021-11-22 17:40       ` [PATCH v4 0/4] sparse-index: expand/collapse based on 'index.sparse' Elijah Newren
2021-11-23  0:20       ` [PATCH v5 " Victoria Dye via GitGitGadget
2021-11-23  0:20         ` [PATCH v5 1/4] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
2021-11-23  0:20         ` [PATCH v5 2/4] sparse-index: avoid unnecessary cache tree clearing Victoria Dye via GitGitGadget
2021-11-23  0:20         ` [PATCH v5 3/4] sparse-index: add ensure_correct_sparsity function Victoria Dye via GitGitGadget
2021-11-23  0:20         ` [PATCH v5 4/4] sparse-index: update do_read_index to ensure correct sparsity Victoria Dye via GitGitGadget
2021-11-23 17:21         ` [PATCH v5 0/4] sparse-index: expand/collapse based on 'index.sparse' Elijah Newren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4656a934-5305-fbdf-76ca-17562fca62ef@github.com \
    --to=vdye@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.