git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: Elijah Newren <newren@gmail.com>,
	Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: "Git Mailing List" <git@vger.kernel.org>,
	"Junio C Hamano" <gitster@pobox.com>, "Jeff King" <peff@peff.net>,
	"Jonathan Nieder" <jrnieder@gmail.com>,
	"Eric Sunshine" <sunshine@sunshineco.com>,
	"Nguyễn Thái Ngọc" <pclouds@gmail.com>,
	"Derrick Stolee" <derrickstolee@github.com>,
	"Derrick Stolee" <dstolee@microsoft.com>
Subject: Re: [PATCH 02/27] sparse-index: implement ensure_full_index()
Date: Wed, 27 Jan 2021 08:43:28 -0500	[thread overview]
Message-ID: <ef86c7fc-e811-036b-b2e3-221e3bdb624a@gmail.com> (raw)
In-Reply-To: <CABPp-BE3tLmfwyncbdTKZUgLYH_8M9zMjH=+LJG4bdGcbYDPMg@mail.gmail.com>

On 1/26/2021 10:05 PM, Elijah Newren wrote:
> On Mon, Jan 25, 2021 at 9:42 AM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
...
>> Sparse directory entries have a specific 'ce_mode' value. The macro
>> S_ISSPARSEDIR(ce) can check if a cache_entry 'ce' has this type. This
>> ce_mode is not possible with the existing index formats, so we don't
>> also verify all properties of a sparse-directory entry, which are:
>>
>>  1. ce->ce_mode == 01000755
> 
> This is a weird number.  What's the reason for choosing it?  It looks
> deceptively close to 0100755, normal executable files, but has the
> extra 0, meaning that ce->ce_mode & S_IFMT is 0, suggesting it has no
> file type.
> 
> Since it's a directory, why not use S_IFDIR (040000)?
> 
> (GITLINK does use the weird 0160000 value, but it happens to be
> S_IFLNK | S_IFDIR == 0120000 | 040000, which conveys "it's both a
> directory and a symlink")

I forget how exactly I came up with these magic constants, but then
completely forgot to think of them critically because I haven't had
to look at them in a while. They _are_ important, especially because
these values affect the file format itself.

I'll think harder on this before submitting a series intended for
merging.

>>  2. ce->flags & CE_SKIP_WORKTREE is true
> 
> Makes sense.
> 
>>  3. ce->name[ce->namelen - 1] == '/' (ends in dir separator)
> 
> Is there a particular reason for this?  I'm used to seeing names
> without the trailing slash, both in the index and in tree objects.  I
> don't know enough to be for or against this idea; just curious at this
> point.

It's yet another way to distinguish directories from files, but
there are cases where we do string searches up to a prefix, and
having these directory separators did help, IIRC.

>>  4. ce->oid references a tree object.
> 
> Makes sense...but doesn't that suggest we'd want to use ce->ce_mode = 040000?

...

>> +#define CE_MODE_SPARSE_DIRECTORY 01000755
>> +#define SPARSE_DIR_MODE 0100
> 
> Another magic value.  Feels like the commit message should reference
> this one and why it was picked.  Seems odd to me, and possibly
> problematic to re-use file permission bits that might collide with
> files recorded by really old versions of git.  Maybe that's not a
> concern, though.
> 
>> +#define S_ISSPARSEDIR(m) ((m)->ce_mode == CE_MODE_SPARSE_DIRECTORY)
> 
> Should the special sauce apply to ce_flags rather than ce_mode?  Thus,
> instead of an S_ISSPARSEDIR, perhaps have a ce_sparse_dir macro
> (similar to ce_skip_worktree) based on a CE_SPARSE_DIR value (similar
> to CE_SKIP_WORKTREE)?
>
> Or, alternatively, do we need a single special state here?  Could we
> check for a combination of ce_mode == 040000 && ce_skip_worktree(ce)?

The intention was that ce_mode be a unique value that could only
be assigned to a directory entry, which would then by necessity be
sparse. Checking both ce_mode and ce_flags seemed wasteful with the
given assumptions

...

>> +       /* Copy back into original index. */
>> +       memcpy(&istate->name_hash, &full->name_hash, sizeof(full->name_hash));
>> +       istate->sparse_index = 0;
>> +       istate->cache = full->cache;
> 
> Haven't you leaked the original istate->cache here?

Yes, seems so. Will fix.

Thanks,
-Stolee

  reply	other threads:[~2021-01-27 13:45 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-25 17:41 [PATCH 00/27] [RFC] Sparse Index Derrick Stolee via GitGitGadget
2021-01-25 17:41 ` [PATCH 01/27] sparse-index: add guard to ensure full index Derrick Stolee via GitGitGadget
2021-01-25 17:41 ` [PATCH 02/27] sparse-index: implement ensure_full_index() Derrick Stolee via GitGitGadget
2021-01-27  3:05   ` Elijah Newren
2021-01-27 13:43     ` Derrick Stolee [this message]
2021-01-27 16:38       ` Elijah Newren
2021-01-28  5:25     ` Junio C Hamano
2021-01-25 17:41 ` [PATCH 03/27] t1092: compare sparse-checkout to sparse-index Derrick Stolee via GitGitGadget
2021-01-27  3:08   ` Elijah Newren
2021-01-27 13:30     ` Derrick Stolee
2021-01-27 16:54       ` Elijah Newren
2021-01-25 17:41 ` [PATCH 04/27] test-read-cache: print cache entries with --table Derrick Stolee via GitGitGadget
2021-01-27  3:25   ` Elijah Newren
2021-01-25 17:41 ` [PATCH 05/27] test-tool: read-cache --table --no-stat Derrick Stolee via GitGitGadget
2021-01-25 17:41 ` [PATCH 06/27] test-tool: don't force full index Derrick Stolee via GitGitGadget
2021-01-25 17:41 ` [PATCH 07/27] unpack-trees: ensure " Derrick Stolee via GitGitGadget
2021-01-27  4:43   ` Elijah Newren
2021-01-25 17:41 ` [PATCH 08/27] sparse-checkout: hold pattern list in index Derrick Stolee via GitGitGadget
2021-01-27 17:00   ` Elijah Newren
2021-01-28 13:12     ` Derrick Stolee
2021-01-25 17:41 ` [PATCH 09/27] sparse-index: convert from full to sparse Derrick Stolee via GitGitGadget
2021-01-27 17:30   ` Elijah Newren
2021-01-25 17:41 ` [PATCH 10/27] submodule: sparse-index should not collapse links Derrick Stolee via GitGitGadget
2021-01-25 17:41 ` [PATCH 11/27] unpack-trees: allow sparse directories Derrick Stolee via GitGitGadget
2021-01-27 17:36   ` Elijah Newren
2021-01-25 17:41 ` [PATCH 12/27] sparse-index: check index conversion happens Derrick Stolee via GitGitGadget
2021-01-27 17:46   ` Elijah Newren
2021-01-25 17:41 ` [PATCH 13/27] sparse-index: create extension for compatibility Derrick Stolee via GitGitGadget
2021-01-27 18:03   ` Elijah Newren
2021-01-25 17:42 ` [PATCH 14/27] sparse-checkout: toggle sparse index from builtin Derrick Stolee via GitGitGadget
2021-01-27 18:18   ` Elijah Newren
2021-01-28 15:26     ` Derrick Stolee
2021-01-25 17:42 ` [PATCH 15/27] [RFC-VERSION] *: ensure full index Derrick Stolee via GitGitGadget
2021-02-01 20:22   ` Elijah Newren
2021-02-01 21:10     ` Derrick Stolee
2021-01-25 17:42 ` [PATCH 16/27] unpack-trees: make sparse aware Derrick Stolee via GitGitGadget
2021-02-01 20:50   ` Elijah Newren
2021-02-09 17:23     ` Derrick Stolee
2021-01-25 17:42 ` [PATCH 17/27] dir.c: accept a directory as part of cone-mode patterns Derrick Stolee via GitGitGadget
2021-02-01 22:12   ` Elijah Newren
2021-01-25 17:42 ` [PATCH 18/27] status: use sparse-index throughout Derrick Stolee via GitGitGadget
2021-01-25 17:42 ` [PATCH 19/27] status: skip sparse-checkout percentage with sparse-index Derrick Stolee via GitGitGadget
2021-01-25 17:42 ` [PATCH 20/27] sparse-index: expand_to_path() trivial implementation Derrick Stolee via GitGitGadget
2021-01-25 17:42 ` [PATCH 21/27] sparse-index: expand_to_path no-op if path exists Derrick Stolee via GitGitGadget
2021-02-01 22:34   ` Elijah Newren
2021-01-25 17:42 ` [PATCH 22/27] add: allow operating on a sparse-only index Derrick Stolee via GitGitGadget
2021-02-01 23:08   ` Elijah Newren
2021-01-25 17:42 ` [PATCH 23/27] submodule: die_path_inside_submodule is sparse aware Derrick Stolee via GitGitGadget
2021-01-25 17:42 ` [PATCH 24/27] dir: use expand_to_path in add_patterns() Derrick Stolee via GitGitGadget
2021-02-01 23:21   ` Elijah Newren
2021-01-25 17:42 ` [PATCH 25/27] fsmonitor: disable if index is sparse Derrick Stolee via GitGitGadget
2021-01-25 17:42 ` [PATCH 26/27] pathspec: stop calling ensure_full_index Derrick Stolee via GitGitGadget
2021-02-01 23:24   ` Elijah Newren
2021-02-02  2:39     ` Derrick Stolee
2021-01-25 17:42 ` [PATCH 27/27] cache-tree: integrate with sparse directory entries Derrick Stolee via GitGitGadget
2021-02-01 23:54   ` Elijah Newren
2021-02-02  2:41     ` Derrick Stolee
2021-02-02  3:05       ` Elijah Newren
2021-01-25 20:10 ` [PATCH 00/27] [RFC] Sparse Index Junio C Hamano
2021-01-25 21:18   ` Derrick Stolee
2021-02-02  3:11 ` Elijah Newren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ef86c7fc-e811-036b-b2e3-221e3bdb624a@gmail.com \
    --to=stolee@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    --cc=newren@gmail.com \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).