All of lore.kernel.org
 help / color / mirror / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: Phillip Wood via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Cc: "Derrick Stolee" <dstolee@microsoft.com>,
	"René Scharfe" <l.s.r@web.de>, "Elijah Newren" <newren@gmail.com>,
	"Johannes Schindelin" <Johannes.Schindelin@gmx.de>,
	"Phillip Wood" <phillip.wood@dunelm.org.uk>,
	vdye@github.com
Subject: Re: [PATCH] [RFC] sparse index: fix use-after-free bug in cache_tree_verify()
Date: Wed, 6 Oct 2021 07:20:19 -0400	[thread overview]
Message-ID: <2b1f0e56-5bb4-7f41-5a1e-d8a21096084a@gmail.com> (raw)
In-Reply-To: <pull.1053.git.1633512591608.gitgitgadget@gmail.com>

On 10/6/2021 5:29 AM, Phillip Wood via GitGitGadget wrote:
> From: Phillip Wood <phillip.wood@dunelm.org.uk>
> 
> In a sparse index it is possible for the tree that is being verified
> to be freed while it is being verified. This happens when
> index_name_pos() looks up a entry that is missing from the index and
> that would be a descendant of a sparse entry. That triggers a call to
> ensure_full_index() which frees the cache tree that is being verified.
> Carrying on trying to verify the tree after this results in a
> use-after-free bug. Instead restart the verification if a sparse index
> is converted to a full index. This bug is triggered by a call to
> reset_head() in "git rebase --apply". Thanks to René Scharfe for his
> help analyzing the problem.

Thank you for identifying an interesting case! I hadn't thought to
change the mode from --merge to --apply.

>     In a sparse index it is possible for the tree that is being verified to
>     be freed while it is being verified. This is an RFC as I'm not familiar
>     with the cache tree code. I'm confused as to why this bug is triggered
>     by the sequence
>     
>     unpack_trees()
>     prime_cache_tree()
>     write_locked_index()
>     
>     but not
>     
>     unpack_trees()
>     write_locked_index()
>     
>     
>     as unpack_trees() appears to update the cache tree with
>     
>     if (!cache_tree_fully_valid(o->result.cache_tree))
>                 cache_tree_update(&o->result,
>                           WRITE_TREE_SILENT |
>                           WRITE_TREE_REPAIR);
>     
>     
>     and I don't understand why the cache tree from prime_cache_tree()
>     results in different behavior. It concerns me that this fix is hiding
>     another bug.

prime_cache_tree() appears to clear the cache tree and start from scratch
from a tree object instead of using the index.

In particular, prime_cache_tree_rec() does not stop at the sparse-checkout
cone, so the cache tree is the full size at that point.

When the verify_one() method reaches these nodes that are outside of the
cone, index_name_pos() triggers the index expansion in a way that the
cache-tree that is restricted to the sparse-checkout cone does not.

Hopefully that helps clear up _why_ this happens.

There is a remaining issue that "git rebase --apply" will be a lot slower
than "git rebase --merge" because of this construction of a cache-tree
that is much larger than necessary.

I will make note of this as a potential improvement for the future.

> -static void verify_one(struct repository *r,
> -		       struct index_state *istate,
> -		       struct cache_tree *it,
> -		       struct strbuf *path)
> +static int verify_one(struct repository *r,
> +		      struct index_state *istate,
> +		      struct cache_tree *it,
> +		      struct strbuf *path)
>  {
>  	int i, pos, len = path->len;
>  	struct strbuf tree_buf = STRBUF_INIT;
> @@ -837,21 +837,30 @@ static void verify_one(struct repository *r,
>  
>  	for (i = 0; i < it->subtree_nr; i++) {
>  		strbuf_addf(path, "%s/", it->down[i]->name);
> -		verify_one(r, istate, it->down[i]->cache_tree, path);
> +		if (verify_one(r, istate, it->down[i]->cache_tree, path))
> +			return 1;
>  		strbuf_setlen(path, len);
>  	}
>  
>  	if (it->entry_count < 0 ||
>  	    /* no verification on tests (t7003) that replace trees */
>  	    lookup_replace_object(r, &it->oid) != &it->oid)
> -		return;
> +		return 0;
>  
>  	if (path->len) {
> +		/*
> +		 * If the index is sparse index_name_pos() may trigger
> +		 * ensure_full_index() which will free the tree that is being
> +		 * verified.
> +		 */
> +		int is_sparse = istate->sparse_index;
>  		pos = index_name_pos(istate, path->buf, path->len);
> +		if (is_sparse && !istate->sparse_index)
> +			return 1;

I think this guard is good to have, even if we fix prime_cache_tree() to
avoid triggering expansion here in most cases.

>  		if (pos >= 0) {
>  			verify_one_sparse(r, istate, it, path, pos);
> -			return;
> +			return 0;
>  		}
>  
>  		pos = -pos - 1;
> @@ -899,6 +908,7 @@ static void verify_one(struct repository *r,
>  		    oid_to_hex(&new_oid), oid_to_hex(&it->oid));
>  	strbuf_setlen(path, len);
>  	strbuf_release(&tree_buf);
> +	return 0;
>  }
>  
>  void cache_tree_verify(struct repository *r, struct index_state *istate)
> @@ -907,6 +917,9 @@ void cache_tree_verify(struct repository *r, struct index_state *istate)
>  
>  	if (!istate->cache_tree)
>  		return;
> -	verify_one(r, istate, istate->cache_tree, &path);
> +	if (verify_one(r, istate, istate->cache_tree, &path)) {
> +		strbuf_reset(&path);
> +		verify_one(r, istate, istate->cache_tree, &path);
> +	}

And this limits us to doing at most two passes. Good.

>  test_expect_success 'merge, cherry-pick, and rebase' '
>  	init_repos &&
>  
> -	for OPERATION in "merge -m merge" cherry-pick rebase
> +	for OPERATION in "merge -m merge" cherry-pick "rebase --apply" "rebase --merge"

Thank you for the additional test!

Thanks,
-Stolee

  reply	other threads:[~2021-10-06 11:20 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-06  9:29 [PATCH] [RFC] sparse index: fix use-after-free bug in cache_tree_verify() Phillip Wood via GitGitGadget
2021-10-06 11:20 ` Derrick Stolee [this message]
2021-10-06 14:01   ` Phillip Wood
2021-10-06 14:19     ` Derrick Stolee
2021-10-06 19:17 ` Junio C Hamano
2021-10-06 20:43   ` Derrick Stolee
2021-10-07  9:50 ` [PATCH v2] " Phillip Wood via GitGitGadget
2021-10-07 13:35   ` Derrick Stolee
2021-10-07 14:59     ` Phillip Wood
2021-10-07 13:53   ` Derrick Stolee
2021-10-07 15:05     ` Phillip Wood
2021-10-07 15:44       ` Derrick Stolee
2021-10-07 17:59         ` Phillip Wood
2021-10-07 18:07   ` [PATCH v3] " Phillip Wood via GitGitGadget
2021-10-07 21:23     ` Junio C Hamano
2021-10-08  9:09       ` Phillip Wood
2021-10-08 18:53         ` Derrick Stolee
2021-10-08 19:57         ` Junio C Hamano
2021-10-14 13:34           ` Phillip Wood
2021-10-14 16:42             ` Junio C Hamano
2021-10-08  9:38     ` Bagas Sanjaya
2021-10-14  9:40       ` Phillip Wood
2021-10-16  9:07     ` [PATCH v4] " Phillip Wood via GitGitGadget
2021-10-17  5:38       ` Junio C Hamano
2021-10-17 19:35         ` Derrick Stolee
2021-10-18  9:37         ` Phillip Wood

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2b1f0e56-5bb4-7f41-5a1e-d8a21096084a@gmail.com \
    --to=stolee@gmail.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=l.s.r@web.de \
    --cc=newren@gmail.com \
    --cc=phillip.wood@dunelm.org.uk \
    --cc=vdye@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.