Git Mailing List Archive on lore.kernel.org
 help / color / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: Jeff Hostetler via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Cc: Jeff Hostetler <jeffhost@microsoft.com>
Subject: Re: [PATCH 17/23] fsmonitor--daemon: periodically truncate list of modified files
Date: Tue, 27 Apr 2021 09:24:38 -0400
Message-ID: <1f4a4de9-4067-8608-0379-84098c9b404f@gmail.com> (raw)
In-Reply-To: <9f263e70c7245a01210ac743ce3dfda4dcc08308.1617291666.git.gitgitgadget@gmail.com>

On 4/1/2021 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhost@microsoft.com>
> 
> Teach fsmonitor--daemon to periodically truncate the list of
> modified files to save some memory.
> 
> Clients will ask for the set of changes relative to a token that they
> found in the FSMN index extension in the index.  (This token is like a
> point in time, but different).  Clients will then update the index to
> contain the response token (so that subsequent commands will be
> relative to this new token).
> 
> Therefore, the daemon can gradually truncate the in-memory list of
> changed paths as they become obsolete (older that the previous token).

s/older that/older than/

> Since we may have multiple clients making concurrent requests with a
> skew of tokens and clients may be racing to the talk to the daemon,
> we lazily truncate the list.
> 
> We introduce a 5 minute delay and truncate batches 5 minutes after
> they are considered obsolete.

5 minutes seems like a good default timeframe. We can consider
making this customizable in the future.

> +/*
> + * To keep the batch list from growing unbounded in response to filesystem
> + * activity, we try to truncate old batches from the end of the list as
> + * they become irrelevant.
> + *
> + * We assume that the .git/index will be updated with the most recent token
> + * any time the index is updated.  And future commands will only ask for
> + * recent changes *since* that new token.  So as tokens advance into the
> + * future, older batch items will never be requested/needed.  So we can
> + * truncate them without loss of functionality.
> + *
> + * However, multiple commands may be talking to the daemon concurrently
> + * or perform a slow command, so a little "token skew" is possible.
> + * Therefore, we want this to be a little bit lazy and have a generous
> + * delay.

I appreciate this documentation of the "expected" behavior and how it
compares to the "possible" behavior.

> + * The current reader thread walked backwards in time from `token->batch_head`
> + * back to `batch_marker` somewhere in the middle of the batch list.
> + *
> + * Let's walk backwards in time from that marker an arbitrary delay
> + * and truncate the list there.  Note that these timestamps are completely
> + * artificial (based on when we pinned the batch item) and not on any
> + * filesystem activity.
> + */
> +#define MY_TIME_DELAY (5 * 60) /* seconds */

Perhaps put the units into the macro? MY_TIME_DELAY_SECONDS?
> +static void fsmonitor_batch__truncate(struct fsmonitor_daemon_state *state,
> +				      const struct fsmonitor_batch *batch_marker)
> +{
> +	/* assert state->main_lock */

If this comment is intended to be a warning for consumers that they should
have the lock around this method, then maybe that should be in a documentation
comment above the method declaration.

> +	const struct fsmonitor_batch *batch;
> +	struct fsmonitor_batch *rest;
> +	struct fsmonitor_batch *p;
> +	time_t t;

This is only used within the for loop, so it could be defined there.

> +
> +	if (!batch_marker)
> +		return;
> +
> +	trace_printf_key(&trace_fsmonitor, "TRNC mark (%"PRIu64",%"PRIu64")",

What's the value of abbreviating "truncate" like this? Is there a special
reason?

> +			 batch_marker->batch_seq_nr,
> +			 (uint64_t)batch_marker->pinned_time);
> +
> +	for (batch = batch_marker; batch; batch = batch->next) {
> +		if (!batch->pinned_time) /* an overflow batch */
> +			continue;
> +
> +		t = batch->pinned_time + MY_TIME_DELAY;
> +		if (t > batch_marker->pinned_time) /* too close to marker */
> +			continue;> +
> +		goto truncate_past_here;
> +	}
> +
> +	return;
> +
> +truncate_past_here:
> +	state->current_token_data->batch_tail = (struct fsmonitor_batch *)batch;
> +
> +	rest = ((struct fsmonitor_batch *)batch)->next;
> +	((struct fsmonitor_batch *)batch)->next = NULL;
> +
> +	for (p = rest; p; p = fsmonitor_batch__free(p)) {
> +		trace_printf_key(&trace_fsmonitor,
> +				 "TRNC kill (%"PRIu64",%"PRIu64")",
> +				 p->batch_seq_nr, (uint64_t)p->pinned_time);
> +	}

I see that you are not using the method that frees the entire list so
you can trace each entry as it is deleted. That works.

> +}
> +
>  static void fsmonitor_free_token_data(struct fsmonitor_token_data *token)
>  {
>  	struct fsmonitor_batch *p;
> @@ -647,6 +716,15 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
>  			 * that work.
>  			 */
>  			fsmonitor_free_token_data(token_data);
> +		} else if (batch) {
> +			/*
> +			 * This batch is the first item in the list
> +			 * that is older than the requested sequence
> +			 * number and might be considered to be
> +			 * obsolete.  See if we can truncate the list
> +			 * and save some memory.
> +			 */
> +			fsmonitor_batch__truncate(state, batch);

Seems to work as advertised.

Thanks,
-Stolee

  reply index

Thread overview: 118+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
2021-04-01 15:40 ` [PATCH 01/23] fsmonitor--daemon: man page and documentation Jeff Hostetler via GitGitGadget
2021-04-26 14:13   ` Derrick Stolee
2021-04-28 13:54     ` Jeff Hostetler
2021-04-01 15:40 ` [PATCH 02/23] fsmonitor-ipc: create client routines for git-fsmonitor--daemon Jeff Hostetler via GitGitGadget
2021-04-26 14:31   ` Derrick Stolee
2021-04-26 20:20     ` Eric Sunshine
2021-04-26 21:02       ` Derrick Stolee
2021-04-28 19:26     ` Jeff Hostetler
2021-04-01 15:40 ` [PATCH 03/23] config: FSMonitor is repository-specific Johannes Schindelin via GitGitGadget
2021-04-01 15:40 ` [PATCH 04/23] fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC Johannes Schindelin via GitGitGadget
2021-04-26 14:56   ` Derrick Stolee
2021-04-27  9:20     ` Ævar Arnfjörð Bjarmason
2021-04-27 12:42       ` Derrick Stolee
2021-04-28  7:59         ` Ævar Arnfjörð Bjarmason
2021-04-28 16:26           ` [PATCH] repo-settings.c: simplify the setup Ævar Arnfjörð Bjarmason
2021-04-28 19:09             ` Nesting topics within other threads (was: [PATCH] repo-settings.c: simplify the setup) Derrick Stolee
2021-04-28 23:01               ` Ævar Arnfjörð Bjarmason
2021-05-05 16:12                 ` Johannes Schindelin
2021-04-29  5:12               ` Nesting topics within other threads Junio C Hamano
2021-04-29 12:14                 ` Ævar Arnfjörð Bjarmason
2021-04-29 20:14                   ` Jeff King
2021-04-30  0:07                   ` Junio C Hamano
2021-04-30 14:23     ` [PATCH 04/23] fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC Jeff Hostetler
2021-04-01 15:40 ` [PATCH 05/23] fsmonitor--daemon: add a built-in fsmonitor daemon Jeff Hostetler via GitGitGadget
2021-04-26 15:08   ` Derrick Stolee
2021-04-26 15:45     ` Derrick Stolee
2021-04-30 14:31       ` Jeff Hostetler
2021-04-01 15:40 ` [PATCH 06/23] fsmonitor--daemon: implement client command options Jeff Hostetler via GitGitGadget
2021-04-26 15:12   ` Derrick Stolee
2021-04-30 14:33     ` Jeff Hostetler
2021-04-01 15:40 ` [PATCH 07/23] fsmonitor-fs-listen-win32: stub in backend for Windows Jeff Hostetler via GitGitGadget
2021-04-26 15:23   ` Derrick Stolee
2021-04-01 15:40 ` [PATCH 08/23] fsmonitor-fs-listen-macos: stub in backend for MacOS Jeff Hostetler via GitGitGadget
2021-04-01 15:40 ` [PATCH 09/23] fsmonitor--daemon: implement daemon command options Jeff Hostetler via GitGitGadget
2021-04-26 15:47   ` Derrick Stolee
2021-04-26 16:12     ` Derrick Stolee
2021-04-30 15:18       ` Jeff Hostetler
2021-04-30 15:59     ` Jeff Hostetler
2021-04-01 15:40 ` [PATCH 10/23] fsmonitor--daemon: add pathname classification Jeff Hostetler via GitGitGadget
2021-04-26 19:17   ` Derrick Stolee
2021-04-26 20:11     ` Eric Sunshine
2021-04-26 20:24       ` Derrick Stolee
2021-04-01 15:40 ` [PATCH 11/23] fsmonitor--daemon: define token-ids Jeff Hostetler via GitGitGadget
2021-04-26 19:49   ` Derrick Stolee
2021-04-26 20:01     ` Eric Sunshine
2021-04-26 20:03       ` Derrick Stolee
2021-04-30 16:17     ` Jeff Hostetler
2021-04-01 15:40 ` [PATCH 12/23] fsmonitor--daemon: create token-based changed path cache Jeff Hostetler via GitGitGadget
2021-04-26 20:22   ` Derrick Stolee
2021-04-30 17:36     ` Jeff Hostetler
2021-04-01 15:40 ` [PATCH 13/23] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows Jeff Hostetler via GitGitGadget
2021-04-27 17:22   ` Derrick Stolee
2021-04-27 17:41     ` Eric Sunshine
2021-04-30 19:32     ` Jeff Hostetler
2021-04-01 15:40 ` [PATCH 14/23] fsmonitor-fs-listen-macos: add macos header files for FSEvent Jeff Hostetler via GitGitGadget
2021-04-27 18:13   ` Derrick Stolee
2021-04-01 15:40 ` [PATCH 15/23] fsmonitor-fs-listen-macos: implement FSEvent listener on MacOS Jeff Hostetler via GitGitGadget
2021-04-27 18:35   ` Derrick Stolee
2021-04-30 20:05     ` Jeff Hostetler
2021-04-01 15:40 ` [PATCH 16/23] fsmonitor--daemon: implement handle_client callback Jeff Hostetler via GitGitGadget
2021-04-26 21:01   ` Derrick Stolee
2021-05-03 15:04     ` Jeff Hostetler
2021-05-13 18:52   ` Derrick Stolee
2021-04-01 15:40 ` [PATCH 17/23] fsmonitor--daemon: periodically truncate list of modified files Jeff Hostetler via GitGitGadget
2021-04-27 13:24   ` Derrick Stolee [this message]
2021-04-01 15:41 ` [PATCH 18/23] fsmonitor--daemon:: introduce client delay for testing Jeff Hostetler via GitGitGadget
2021-04-27 13:36   ` Derrick Stolee
2021-04-01 15:41 ` [PATCH 19/23] fsmonitor--daemon: use a cookie file to sync with file system Jeff Hostetler via GitGitGadget
2021-04-27 14:23   ` Derrick Stolee
2021-05-03 21:59     ` Jeff Hostetler
2021-04-01 15:41 ` [PATCH 20/23] fsmonitor: force update index when fsmonitor token advances Jeff Hostetler via GitGitGadget
2021-04-27 14:52   ` Derrick Stolee
2021-04-01 15:41 ` [PATCH 21/23] t7527: create test for fsmonitor--daemon Jeff Hostetler via GitGitGadget
2021-04-27 15:41   ` Derrick Stolee
2021-04-01 15:41 ` [PATCH 22/23] p7519: add fsmonitor--daemon Jeff Hostetler via GitGitGadget
2021-04-27 15:45   ` Derrick Stolee
2021-04-01 15:41 ` [PATCH 23/23] t7527: test status with untracked-cache and fsmonitor--daemon Jeff Hostetler via GitGitGadget
2021-04-27 15:51   ` Derrick Stolee
2021-04-16 22:44 ` [PATCH 00/23] [RFC] Builtin FSMonitor Feature Junio C Hamano
2021-04-20 15:27   ` Johannes Schindelin
2021-04-20 19:13     ` Jeff Hostetler
2021-04-21 13:17     ` Derrick Stolee
2021-04-27 18:49 ` FS Monitor Windows Performance (was [PATCH 00/23] [RFC] Builtin FSMonitor Feature) Derrick Stolee
2021-04-27 19:31 ` FS Monitor macOS " Derrick Stolee
2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 01/28] simple-ipc: preparations for supporting binary messages Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 02/28] fsmonitor--daemon: man page Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 03/28] fsmonitor--daemon: update fsmonitor documentation Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 04/28] fsmonitor-ipc: create client routines for git-fsmonitor--daemon Jeff Hostetler via GitGitGadget
2021-06-02 11:24     ` Johannes Schindelin
2021-05-22 13:56   ` [PATCH v2 05/28] help: include fsmonitor--daemon feature flag in version info Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 06/28] config: FSMonitor is repository-specific Johannes Schindelin via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 07/28] fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC Johannes Schindelin via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 08/28] fsmonitor--daemon: add a built-in fsmonitor daemon Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 09/28] fsmonitor--daemon: implement client command options Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 10/28] t/helper/fsmonitor-client: create IPC client to talk to FSMonitor Daemon Jeff Hostetler via GitGitGadget
2021-06-11  6:32     ` Junio C Hamano
2021-05-22 13:56   ` [PATCH v2 11/28] fsmonitor-fs-listen-win32: stub in backend for Windows Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 12/28] fsmonitor-fs-listen-macos: stub in backend for MacOS Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 13/28] fsmonitor--daemon: implement daemon command options Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 14/28] fsmonitor--daemon: add pathname classification Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 15/28] fsmonitor--daemon: define token-ids Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 16/28] fsmonitor--daemon: create token-based changed path cache Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 17/28] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 18/28] fsmonitor-fs-listen-macos: add macos header files for FSEvent Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 19/28] fsmonitor-fs-listen-macos: implement FSEvent listener on MacOS Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 20/28] fsmonitor--daemon: implement handle_client callback Jeff Hostetler via GitGitGadget
2021-05-22 13:57   ` [PATCH v2 21/28] fsmonitor--daemon: periodically truncate list of modified files Jeff Hostetler via GitGitGadget
2021-05-22 13:57   ` [PATCH v2 22/28] fsmonitor--daemon: use a cookie file to sync with file system Jeff Hostetler via GitGitGadget
2021-05-22 13:57   ` [PATCH v2 23/28] fsmonitor: enhance existing comments Jeff Hostetler via GitGitGadget
2021-05-22 13:57   ` [PATCH v2 24/28] fsmonitor: force update index after large responses Jeff Hostetler via GitGitGadget
2021-05-22 13:57   ` [PATCH v2 25/28] t7527: create test for fsmonitor--daemon Jeff Hostetler via GitGitGadget
2021-05-22 13:57   ` [PATCH v2 26/28] p7519: add fsmonitor--daemon Jeff Hostetler via GitGitGadget
2021-05-22 13:57   ` [PATCH v2 27/28] t7527: test status with untracked-cache and fsmonitor--daemon Jeff Hostetler via GitGitGadget
2021-05-22 13:57   ` [PATCH v2 28/28] t/perf: avoid copying builtin fsmonitor files into test repo Jeff Hostetler via GitGitGadget
2021-05-27  2:06   ` [PATCH v2 00/28] Builtin FSMonitor Feature Junio C Hamano
2021-06-02 11:28     ` Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1f4a4de9-4067-8608-0379-84098c9b404f@gmail.com \
    --to=stolee@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=jeffhost@microsoft.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Git Mailing List Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/git/0 git/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 git git/ https://lore.kernel.org/git \
		git@vger.kernel.org
	public-inbox-index git

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.git


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git