All of lore.kernel.org
 help / color / mirror / Atom feed
From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, gitster@pobox.com, newren@gmail.com,
	matheus.bernardino@usp.br, stolee@gmail.com,
	Derrick Stolee <derrickstolee@github.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH v3 4/8] unpack-trees: fix nested sparse-dir search
Date: Thu, 19 Aug 2021 10:01:20 +0200 (CEST)	[thread overview]
Message-ID: <nycvar.QRO.7.76.6.2108190950540.55@tvgsbejvaqbjf.bet> (raw)
In-Reply-To: <10bcadb284e49419f9b4baf75e05f719ec395d98.1629206603.git.gitgitgadget@gmail.com>

Hi Stolee,

On Tue, 17 Aug 2021, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <dstolee@microsoft.com>
>
> The iterated search in find_cache_entry() was recently modified to
> include a loop that searches backwards for a sparse directory entry that
> matches the given traverse_info and name_entry. However, the string
> comparison failed to actually concatenate those two strings, so this
> failed to find a sparse directory when it was not a top-level directory.
>
> This caused some errors in rare cases where a 'git checkout' spanned a
> diff that modified files within the sparse directory entry, but we could
> not correctly find the entry.

Good explanation.

I wonder a bit about the performance impact. How "hot" is this function?
I.e. how often is it called, on average?

I ask because I see opportunities to optimize in both directions: it could
be written more concisely (if speed does not matter as much), and it could
be made faster (if speed matters a lot). See below for more.

>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  unpack-trees.c | 18 +++++++++++++-----
>  1 file changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 5786645f315..df1f4437723 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -1255,9 +1255,10 @@ static int sparse_dir_matches_path(const struct cache_entry *ce,
>  static struct cache_entry *find_cache_entry(struct traverse_info *info,
>  					    const struct name_entry *p)
>  {
> -	struct cache_entry *ce;
> +	struct cache_entry *ce = NULL;

Makes sense: since you need to release the allocated memory, you can no
longer `return NULL` early, but have to break out of the loop and return
`ce`.

>  	int pos = find_cache_pos(info, p->path, p->pathlen);
>  	struct unpack_trees_options *o = info->data;
> +	struct strbuf full_path = STRBUF_INIT;
>
>  	if (0 <= pos)
>  		return o->src_index->cache[pos];
> @@ -1273,6 +1274,10 @@ static struct cache_entry *find_cache_entry(struct traverse_info *info,
>  	if (pos < 0 || pos >= o->src_index->cache_nr)
>  		return NULL;
>
> +	strbuf_addstr(&full_path, info->traverse_path);
> +	strbuf_add(&full_path, p->path, p->pathlen);
> +	strbuf_addch(&full_path, '/');

This could be reduced to:

	strbuf_addf(&full_path, "%s%.*s/",
		    info->traverse_path, (int)p->pathlen, p->path);

But if speed matters, we probably need something more like this:

	size_t full_path_len;
	const char *full_path;
	char *full_path_1 = NULL;

	if (!*info->traverse_path) {
		full_path = p->path;
		full_path_len = p->pathlen;
	} else {
		size_t len = strlen(info->traverse_path);

		full_path_len = len + p->pathlen + 1;
		full_path = full_path_1 = xmalloc(full_path_len + 1);
		memcpy(full_path_1, info->traverse_path, len);
		memcpy(full_path_1 + len, p->path, p->pathlen);
		full_path_1[full_path_len - 1] = '/';
		full_path_1[full_path_len] = '\0';
	}

	[...]

	free(full_path_1);

It would obviously be much nicer if we did not have to go for that ugly
long version...

> +
>  	/*
>  	 * Due to lexicographic sorting and sparse directory
>  	 * entries ending with a trailing slash, our path as a
> @@ -1283,17 +1288,20 @@ static struct cache_entry *find_cache_entry(struct traverse_info *info,
>  	while (pos >= 0) {
>  		ce = o->src_index->cache[pos];
>
> -		if (strncmp(ce->name, p->path, p->pathlen))
> -			return NULL;
> +		if (strncmp(ce->name, full_path.buf, full_path.len)) {
> +			ce = NULL;
> +			break;
> +		}
>
>  		if (S_ISSPARSEDIR(ce->ce_mode) &&
>  		    sparse_dir_matches_path(ce, info, p))
> -			return ce;
> +			break;
>
>  		pos--;
>  	}
>
> -	return NULL;
> +	strbuf_release(&full_path);
> +	return ce;
>  }
>
>  static void debug_path(struct traverse_info *info)
> --
> gitgitgadget
>
>

  reply	other threads:[~2021-08-19  8:09 UTC|newest]

Thread overview: 92+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-29 17:27 [PATCH 0/2] Sparse index: delete ignored files outside sparse cone Derrick Stolee via GitGitGadget
2021-07-29 17:27 ` [PATCH 1/2] t7519: rewrite sparse index test Derrick Stolee via GitGitGadget
2021-07-29 17:27 ` [PATCH 2/2] sparse-checkout: clear tracked sparse dirs Derrick Stolee via GitGitGadget
2021-07-30 13:52   ` Elijah Newren
2021-08-02 14:34     ` Derrick Stolee
2021-08-02 16:17       ` Elijah Newren
2021-08-05  1:55         ` Derrick Stolee
2021-08-05  3:54           ` Elijah Newren
2021-07-30 13:11 ` [PATCH 0/2] Sparse index: delete ignored files outside sparse cone Elijah Newren
2021-08-10 19:50 ` [PATCH v2 0/8] " Derrick Stolee via GitGitGadget
2021-08-10 19:50   ` [PATCH v2 1/8] t7519: rewrite sparse index test Derrick Stolee via GitGitGadget
2021-08-10 19:50   ` [PATCH v2 2/8] sparse-index: silently return when not using cone-mode patterns Derrick Stolee via GitGitGadget
2021-08-10 19:50   ` [PATCH v2 3/8] sparse-index: silently return when cache tree fails Derrick Stolee via GitGitGadget
2021-08-19 18:24     ` Elijah Newren
2021-08-20 15:04       ` Derrick Stolee
2021-08-10 19:50   ` [PATCH v2 4/8] unpack-trees: fix nested sparse-dir search Derrick Stolee via GitGitGadget
2021-08-10 19:50   ` [PATCH v2 5/8] sparse-checkout: create helper methods Derrick Stolee via GitGitGadget
2021-08-12 17:29     ` Derrick Stolee
2021-08-10 19:50   ` [PATCH v2 6/8] attr: be careful about sparse directories Derrick Stolee via GitGitGadget
2021-08-10 19:50   ` [PATCH v2 7/8] sparse-index: add SPARSE_INDEX_IGNORE_CONFIG flag Derrick Stolee via GitGitGadget
2021-08-10 19:50   ` [PATCH v2 8/8] sparse-checkout: clear tracked sparse dirs Derrick Stolee via GitGitGadget
2021-08-17 13:23   ` [PATCH v3 0/8] Sparse index: delete ignored files outside sparse cone Derrick Stolee via GitGitGadget
2021-08-17 13:23     ` [PATCH v3 1/8] t7519: rewrite sparse index test Derrick Stolee via GitGitGadget
2021-08-19  7:45       ` Johannes Schindelin
2021-08-20 15:09         ` Derrick Stolee
2021-08-20 16:40           ` Eric Sunshine
2021-08-17 13:23     ` [PATCH v3 2/8] sparse-index: silently return when not using cone-mode patterns Derrick Stolee via GitGitGadget
2021-08-17 13:23     ` [PATCH v3 3/8] sparse-index: silently return when cache tree fails Derrick Stolee via GitGitGadget
2021-08-17 13:23     ` [PATCH v3 4/8] unpack-trees: fix nested sparse-dir search Derrick Stolee via GitGitGadget
2021-08-19  8:01       ` Johannes Schindelin [this message]
2021-08-20 15:18         ` Derrick Stolee
2021-08-20 19:35           ` René Scharfe
2021-08-20 20:22             ` René Scharfe
2021-08-19 18:29       ` Elijah Newren
2021-08-17 13:23     ` [PATCH v3 5/8] sparse-checkout: create helper methods Derrick Stolee via GitGitGadget
2021-08-19  8:07       ` Johannes Schindelin
2021-08-20 15:30         ` Derrick Stolee
2021-08-17 13:23     ` [PATCH v3 6/8] attr: be careful about sparse directories Derrick Stolee via GitGitGadget
2021-08-19  8:11       ` Johannes Schindelin
2021-08-20 15:36         ` Derrick Stolee
2021-08-19 20:53       ` Elijah Newren
2021-08-20 15:39         ` Derrick Stolee
2021-08-20 16:05           ` Elijah Newren
2021-08-17 13:23     ` [PATCH v3 7/8] sparse-index: add SPARSE_INDEX_IGNORE_CONFIG flag Derrick Stolee via GitGitGadget
2021-08-18 18:59       ` Derrick Stolee
2021-08-17 13:23     ` [PATCH v3 8/8] sparse-checkout: clear tracked sparse dirs Derrick Stolee via GitGitGadget
2021-08-19  8:48       ` Johannes Schindelin
2021-08-20 15:49         ` Derrick Stolee
2021-08-20 16:15           ` Elijah Newren
2021-08-20 15:56         ` Elijah Newren
2021-08-23 20:00           ` Johannes Schindelin
2021-08-17 14:09     ` [PATCH v3 0/8] Sparse index: delete ignored files outside sparse cone Elijah Newren
2021-08-24 21:51     ` [PATCH v4 00/10] " Derrick Stolee via GitGitGadget
2021-08-24 21:51       ` [PATCH v4 01/10] t7519: rewrite sparse index test Derrick Stolee via GitGitGadget
2021-08-24 21:51       ` [PATCH v4 02/10] sparse-index: silently return when not using cone-mode patterns Derrick Stolee via GitGitGadget
2021-08-24 21:51       ` [PATCH v4 03/10] sparse-index: silently return when cache tree fails Derrick Stolee via GitGitGadget
2021-08-24 21:51       ` [PATCH v4 04/10] sparse-index: use WRITE_TREE_MISSING_OK Derrick Stolee via GitGitGadget
2021-08-27 21:33         ` Elijah Newren
2021-08-30 13:19           ` Derrick Stolee
2021-08-30 20:08             ` Elijah Newren
2021-08-24 21:51       ` [PATCH v4 05/10] unpack-trees: fix nested sparse-dir search Derrick Stolee via GitGitGadget
2021-08-24 22:21         ` René Scharfe
2021-08-25  1:09           ` Derrick Stolee
2021-08-24 21:51       ` [PATCH v4 06/10] sparse-checkout: create helper methods Derrick Stolee via GitGitGadget
2021-08-24 21:51       ` [PATCH v4 07/10] attr: be careful about sparse directories Derrick Stolee via GitGitGadget
2021-08-24 21:51       ` [PATCH v4 08/10] sparse-index: add SPARSE_INDEX_MEMORY_ONLY flag Derrick Stolee via GitGitGadget
2021-08-24 21:51       ` [PATCH v4 09/10] sparse-checkout: clear tracked sparse dirs Derrick Stolee via GitGitGadget
2021-08-24 21:51       ` [PATCH v4 10/10] sparse-checkout: add config to disable deleting dirs Derrick Stolee via GitGitGadget
2021-08-27 20:58         ` Elijah Newren
2021-08-30 13:30           ` Derrick Stolee
2021-08-30 20:11             ` Elijah Newren
2021-08-27 21:56       ` [PATCH v4 00/10] Sparse index: delete ignored files outside sparse cone Elijah Newren
2021-08-27 22:01         ` Elijah Newren
2021-08-30 13:34           ` Derrick Stolee
2021-08-30 20:14             ` Elijah Newren
2021-08-30 13:54         ` Derrick Stolee
2021-08-30 20:23           ` Elijah Newren
2021-09-08  1:42       ` [PATCH v5 0/9] " Derrick Stolee via GitGitGadget
2021-09-08  1:42         ` [PATCH v5 1/9] t7519: rewrite sparse index test Derrick Stolee via GitGitGadget
2021-09-08  1:42         ` [PATCH v5 2/9] sparse-index: silently return when not using cone-mode patterns Derrick Stolee via GitGitGadget
2021-09-08  1:42         ` [PATCH v5 3/9] unpack-trees: fix nested sparse-dir search Derrick Stolee via GitGitGadget
2021-09-08  1:42         ` [PATCH v5 4/9] sparse-index: silently return when cache tree fails Derrick Stolee via GitGitGadget
2021-09-08  1:42         ` [PATCH v5 5/9] sparse-index: use WRITE_TREE_MISSING_OK Derrick Stolee via GitGitGadget
2021-09-08  1:42         ` [PATCH v5 6/9] sparse-checkout: create helper methods Derrick Stolee via GitGitGadget
2021-09-08  1:42         ` [PATCH v5 7/9] attr: be careful about sparse directories Derrick Stolee via GitGitGadget
2021-09-08  1:42         ` [PATCH v5 8/9] sparse-index: add SPARSE_INDEX_MEMORY_ONLY flag Derrick Stolee via GitGitGadget
2021-09-08  1:42         ` [PATCH v5 9/9] sparse-checkout: clear tracked sparse dirs Derrick Stolee via GitGitGadget
2021-09-08  5:21         ` [PATCH v5 0/9] Sparse index: delete ignored files outside sparse cone Junio C Hamano
2021-09-08  6:56           ` Junio C Hamano
2021-09-08 11:39             ` Derrick Stolee
2021-09-08 16:11               ` Junio C Hamano
2021-09-08  5:30         ` Elijah Newren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=nycvar.QRO.7.76.6.2108190950540.55@tvgsbejvaqbjf.bet \
    --to=johannes.schindelin@gmx.de \
    --cc=derrickstolee@github.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=matheus.bernardino@usp.br \
    --cc=newren@gmail.com \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.