All of lore.kernel.org
 help / color / mirror / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: Taylor Blau <me@ttaylorr.com>, Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, dstolee@microsoft.com, szeder.dev@gmail.com,
	Jeff King <peff@peff.net>
Subject: Re: [PATCH 12/12] builtin/commit-graph.c: introduce '--max-new-filters=<n>'
Date: Mon, 14 Sep 2020 16:31:03 -0400	[thread overview]
Message-ID: <134d64a0-abb6-bdc9-2c05-7aded01a906a@gmail.com> (raw)
In-Reply-To: <20200914201258.GA12431@nand.local>

On 9/14/2020 4:12 PM, Taylor Blau wrote:
> On Fri, Sep 11, 2020 at 03:25:55PM -0400, Taylor Blau wrote:
>> On Fri, Sep 11, 2020 at 02:59:34PM -0400, Taylor Blau wrote:
>>> On Fri, Sep 11, 2020 at 01:52:16PM -0400, Jeff King wrote:
>>>> On Wed, Sep 09, 2020 at 11:24:00AM -0400, Taylor Blau wrote:
>>>>> +With the `--max-new-filters=<n>` option, generate at most `n` new Bloom
>>>>> +filters (if `--changed-paths` is specified). If `n` is `-1`, no limit is
>>>>> +enforced. Commits whose filters are not calculated are stored as a
>>>>> +length zero Bloom filter, and their bit is marked in the `BFXL` chunk.
>>>>> +Overrides the `commitGraph.maxNewFilters` configuration.
>>>>
>>>> The BFXL chunk doesn't exist anymore in this iteration, right?
>>>
>>> Ack; I'll have to drop that.
>>
>> Junio, I know that I've already sent one replacement patch. If you don't
>> mind, here's another (and if you do mind, I'm happy to re-roll the
>> series).
> 
> Just kidding. Let's use *this* version which fixes a bug reading the
> commitGraph.maxNewFilters configuration. At this point, the
> fix-ups are:
> 
>   - This patch (attached below the scisors) instead of 12/12, and
> 
>   - This [1] patch instead of 10/12.
> 
> [1]: https://lore.kernel.org/git/20200910154516.GA32117@nand.local/
> 
> Let me know if you'd rather have a full re-roll.

It's getting a bit difficult to track all of these "use this instead"
patches. But, I'm not the one applying them, so maybe that's not actually
a problem.

You might need a re-roll, anyway, as I have a few comments here:

> --- 8< ---
> 
> Subject: [PATCH] builtin/commit-graph.c: introduce '--max-new-filters=<n>'

You also introduce commitGraph.maxNewFitlers here, which is not
mentioned in the commit message anywhere. In fact, it might be
good to include it as a separate patch so its implementation and
tests can be isolated from the command-line functionality.

> +length zero Bloom filter. Overrides the `commitGraph.maxNewFilters`
> +configuration.

We have found it valuable to demonstrate these overrides in tests.
Let's inspect your tests for this.

> +test_bloom_filters_computed () {
> +	commit_graph_args=$1
> +	rm -f "$TRASH_DIRECTORY/trace.event" &&
> +	GIT_TRACE2_EVENT="$TRASH_DIRECTORY/trace.event" git commit-graph write \
> +		$commit_graph_args &&
> +	grep "\"filter_not_computed\":$2" "$TRASH_DIRECTORY/trace.event" &&
> +	grep "\"filter_trunc_large\":$3" "$TRASH_DIRECTORY/trace.event" &&
> +	grep "\"filter_computed\":$4" "$TRASH_DIRECTORY/trace.event"
> +}

If the arguments were moved to the last parameter, then we could do a few
interesting things here.

test_bloom_filters_computed () {
	NOT_COMPUTED="\"filter_not_computed\":$1" &&
	shift &&
	TRUNCATED="\"filter_trunc_large\":$1" &&
	shift &&
	COMPUTED="\"filter_computed\":$1" &&
	shift &&
	rm -f "$TRASH_DIRECTORY/trace.event" &&
	GIT_TRACE2_EVENT="$TRASH_DIRECTORY/trace.event" git commit-graph write $@ &&
	grep "$NOT_COMPUTED" "$TRASH_DIRECTORY/trace.event" &&
	grep "$TRUNCATED" "$TRASH_DIRECTORY/trace.event" &&
	grep "$COMPUTED" "$TRASH_DIRECTORY/trace.event"
}


(I have not tested this script. It might need some work.)
This would make your callers a bit cleaner-looking, for example:

test_expect_success 'Bloom generation is limited by --max-new-filters' '
	(
		cd limits &&
		test_commit c2 filter &&
		test_commit c3 filter &&
		test_commit c4 no-filter &&
		test_bloom_filters_computed 3 0 2 \
			--reachable --changed-paths --split=replace --max-new-filters=2
	)
'

At least, this looks nicer to me.

> +test_expect_success 'Bloom generation backfills previously-skipped filters' '
> +	# Check specifying commitGraph.maxNewFilters over "git config" works.
> +	test_config -C limits commitGraph.maxNewFilters 1 &&
> +	(
> +		cd limits &&
> +		test_bloom_filters_computed "--reachable --changed-paths --split=replace" \
> +			4 0 1
> +	)
> +'

Adding a case for `commitGraph.maxNewFilters=1` and `--max-new-filters=2` might
be interesting for the override rules.

> +
> +test_expect_success 'Bloom generation backfills empty commits' '
> +	git init empty &&
> +	test_when_finished "rm -fr empty" &&
> +	(
> +		cd empty &&
> +		for i in $(test_seq 1 6)
> +		do
> +			git commit --allow-empty -m "$i"
> +		done &&
> +
> +		# Generate Bloom filters for empty commits 1-6, two at a time.
> +		test_bloom_filters_computed "--reachable --changed-paths --max-new-filters=2" \
> +			4 0 2 &&
> +		test_bloom_filters_computed "--reachable --changed-paths --max-new-filters=2" \
> +			4 0 2 &&
> +		test_bloom_filters_computed "--reachable --changed-paths --max-new-filters=2" \
> +			4 0 2 &&

I'm concerned that the max-new-filters limit (2) is a divisor
of the full number of commits (6). It might be good to add one
more commit here and test again with a limit of 2. That would
handle both "equal to limit" and "less than limit" cases.

> +		# Finally, make sure that once all commits have filters, that
> +		# none are subsequently recomputed.
> +		test_bloom_filters_computed "--reachable --changed-paths --max-new-filters=2" \
> +			6 0 0
> +	)
> +'

Thanks,
-Stolee


  reply	other threads:[~2020-09-14 20:32 UTC|newest]

Thread overview: 75+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-09 15:22 [PATCH 00/12] more miscellaneous Bloom filter improvements, redux Taylor Blau
2020-09-09 15:22 ` [PATCH 01/12] commit-graph: introduce 'get_bloom_filter_settings()' Taylor Blau
2020-09-09 15:22 ` [PATCH 02/12] t4216: use an '&&'-chain Taylor Blau
2020-09-09 15:22 ` [PATCH 03/12] commit-graph: pass a 'struct repository *' in more places Taylor Blau
2020-09-09 15:23 ` [PATCH 04/12] t/helper/test-read-graph.c: prepare repo settings Taylor Blau
2020-09-09 15:23 ` [PATCH 05/12] commit-graph: respect 'commitGraph.readChangedPaths' Taylor Blau
2020-09-09 15:23 ` [PATCH 06/12] commit-graph.c: store maximum changed paths Taylor Blau
2020-09-09 15:23 ` [PATCH 07/12] bloom: split 'get_bloom_filter()' in two Taylor Blau
2020-09-09 15:23 ` [PATCH 08/12] bloom: use provided 'struct bloom_filter_settings' Taylor Blau
2020-09-09 15:23 ` [PATCH 09/12] bloom/diff: properly short-circuit on max_changes Taylor Blau
2020-09-09 15:23 ` [PATCH 10/12] bloom: encode out-of-bounds filters as non-empty Taylor Blau
2020-09-10  3:35   ` Taylor Blau
2020-09-10 15:45     ` Taylor Blau
2020-09-11 18:15       ` Derrick Stolee
2020-09-09 15:23 ` [PATCH 11/12] commit-graph: rename 'split_commit_graph_opts' Taylor Blau
2020-09-09 15:24 ` [PATCH 12/12] builtin/commit-graph.c: introduce '--max-new-filters=<n>' Taylor Blau
2020-09-11 17:52   ` Jeff King
2020-09-11 18:59     ` Taylor Blau
2020-09-11 19:25       ` Taylor Blau
2020-09-14 20:12         ` Taylor Blau
2020-09-14 20:31           ` Derrick Stolee [this message]
2020-09-14 20:36             ` Taylor Blau
2020-09-15  0:59               ` Derrick Stolee
2020-09-15  4:31                 ` Taylor Blau
2020-09-15 21:49               ` Junio C Hamano
2020-09-15 21:53                 ` Taylor Blau
2020-09-11 19:47       ` Jeff King
2020-09-11 19:31     ` Junio C Hamano
2020-09-16 18:06 ` [PATCH v2 00/13] more miscellaneous Bloom filter improvements, redux Taylor Blau
2020-09-16 18:06   ` [PATCH v2 01/13] commit-graph: introduce 'get_bloom_filter_settings()' Taylor Blau
2020-09-16 18:07   ` [PATCH v2 02/13] t4216: use an '&&'-chain Taylor Blau
2020-09-16 18:07   ` [PATCH v2 03/13] commit-graph: pass a 'struct repository *' in more places Taylor Blau
2020-09-16 18:07   ` [PATCH v2 04/13] t/helper/test-read-graph.c: prepare repo settings Taylor Blau
2020-09-16 18:07   ` [PATCH v2 05/13] commit-graph: respect 'commitGraph.readChangedPaths' Taylor Blau
2020-09-16 18:07   ` [PATCH v2 06/13] commit-graph.c: store maximum changed paths Taylor Blau
2020-09-16 18:07   ` [PATCH v2 07/13] bloom: split 'get_bloom_filter()' in two Taylor Blau
2020-09-16 18:07   ` [PATCH v2 08/13] bloom: use provided 'struct bloom_filter_settings' Taylor Blau
2020-09-16 18:07   ` [PATCH v2 09/13] bloom/diff: properly short-circuit on max_changes Taylor Blau
2020-09-16 18:07   ` [PATCH v2 10/13] bloom: encode out-of-bounds filters as non-empty Taylor Blau
2020-09-17 22:13     ` SZEDER Gábor
2020-09-17 23:13       ` Taylor Blau
2020-09-18  0:52         ` Junio C Hamano
2020-09-18  1:15           ` Taylor Blau
2020-09-16 18:08   ` [PATCH v2 11/13] commit-graph: rename 'split_commit_graph_opts' Taylor Blau
2020-09-16 18:08   ` [PATCH v2 12/13] builtin/commit-graph.c: introduce '--max-new-filters=<n>' Taylor Blau
2020-09-18  9:23     ` SZEDER Gábor
2020-09-18 13:27       ` Taylor Blau
2020-09-16 18:08   ` [PATCH v2 13/13] commit-graph: introduce 'commitGraph.maxNewFilters' Taylor Blau
2020-09-16 22:51   ` [PATCH v2 00/13] more miscellaneous Bloom filter improvements, redux Derrick Stolee
2020-09-16 23:07     ` Junio C Hamano
2020-09-17  0:45       ` Taylor Blau
2020-09-17  0:59         ` Junio C Hamano
2020-09-17  1:10           ` Taylor Blau
2020-09-17 13:34             ` Taylor Blau
2020-09-17 13:38               ` Derrick Stolee
2020-09-18  2:58 ` [PATCH v3 " Taylor Blau
2020-09-18  2:58   ` [PATCH v3 01/13] commit-graph: introduce 'get_bloom_filter_settings()' Taylor Blau
2020-09-18  2:58   ` [PATCH v3 02/13] t4216: use an '&&'-chain Taylor Blau
2020-09-18  2:59   ` [PATCH v3 03/13] commit-graph: pass a 'struct repository *' in more places Taylor Blau
2020-09-18  2:59   ` [PATCH v3 04/13] t/helper/test-read-graph.c: prepare repo settings Taylor Blau
2020-09-18  2:59   ` [PATCH v3 05/13] commit-graph: respect 'commitGraph.readChangedPaths' Taylor Blau
2020-09-18  2:59   ` [PATCH v3 06/13] commit-graph.c: store maximum changed paths Taylor Blau
2020-09-18  2:59   ` [PATCH v3 07/13] bloom: split 'get_bloom_filter()' in two Taylor Blau
2020-09-18  2:59   ` [PATCH v3 08/13] bloom: use provided 'struct bloom_filter_settings' Taylor Blau
2020-09-18 16:27     ` SZEDER Gábor
2020-09-18 16:32       ` Taylor Blau
2020-09-18  2:59   ` [PATCH v3 09/13] bloom/diff: properly short-circuit on max_changes Taylor Blau
2020-09-18  2:59   ` [PATCH v3 10/13] bloom: encode out-of-bounds filters as non-empty Taylor Blau
2020-09-18  2:59   ` [PATCH v3 11/13] commit-graph: rename 'split_commit_graph_opts' Taylor Blau
2020-09-18  2:59   ` [PATCH v3 12/13] builtin/commit-graph.c: introduce '--max-new-filters=<n>' Taylor Blau
2020-09-18  2:59   ` [PATCH v3 13/13] commit-graph: introduce 'commitGraph.maxNewFilters' Taylor Blau
2020-09-18 13:29     ` Taylor Blau
2020-09-18 17:43       ` Junio C Hamano
2020-09-18 13:31   ` [PATCH v3 00/13] more miscellaneous Bloom filter improvements, redux Taylor Blau
2020-09-18 13:34     ` Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=134d64a0-abb6-bdc9-2c05-7aded01a906a@gmail.com \
    --to=stolee@gmail.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.