git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Neeraj Singh <nksingh85@gmail.com>,
	"Neeraj K. Singh via GitGitGadget" <gitgitgadget@gmail.com>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Bagas Sanjaya <bagasdotme@gmail.com>,
	Git List <git@vger.kernel.org>,
	Neeraj Singh <neerajsi@microsoft.com>,
	Elijah Newren <newren@gmail.com>,
	"Randall S. Becker" <rsbecker@nexbridge.com>,
	"brian m. carlson" <sandals@crustytoothpaste.net>
Subject: Re: do we have too much fsync() configuration in 'next'? (was: [PATCH v7] core.fsync: documentation and user-friendly aggregate options)
Date: Mon, 28 Mar 2022 12:56:57 +0200	[thread overview]
Message-ID: <YkGUeQH4y1KIAdCc@ncase> (raw)
In-Reply-To: <220327.86y20veeua.gmgdl@evledraar.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 6443 bytes --]

On Sun, Mar 27, 2022 at 02:43:48PM +0200, Ævar Arnfjörð Bjarmason wrote:
> 
> On Sat, Mar 26 2022, Neeraj Singh wrote:
> 
> > On Sat, Mar 26, 2022 at 8:34 AM Ævar Arnfjörð Bjarmason
> > <avarab@gmail.com> wrote:
> >>
> >>
> >> On Fri, Mar 25 2022, Neeraj Singh wrote:
> >>
> >> > On Fri, Mar 25, 2022 at 5:33 PM Ævar Arnfjörð Bjarmason
> >> > <avarab@gmail.com> wrote:
> [...]
> >> > I want to make a comment about the Index here.  Syncing the index is
> >> > strictly required for the "added" level of consistency, so that we
> >> > don't lose stuff that leaves the work tree but is staged.  But my
> >> > Windows enlistment has an index that's 266MB, which would be painful
> >> > to sync even with all the optimizations.  Maybe with split-index, this
> >> > wouldn't be so bad, but I just wanted to call out that some advanced
> >> > users may really care about the configurability.
> >>
> >> So for that use-case you'd like to fsync the loose objects (if any), but
> >> not the index? So the FS will "flush" up to the index, and then queue
> >> the index for later syncing to platter?
> >>
> >>
> >> But even in that case don't the settings need to be tied to one another,
> >> because in the method=bulk sync=index && sync=!loose case wouldn't we be
> >> syncing "loose" in any case?
> >>
> >> > As Git's various database implementations improve, the fsync stuff
> >> > will hopefully be more optimal and self-tuning.  But as that happens,
> >> > Git could just start ignoring settings that lose meaning without tying
> >> > anyones hands.
> >>
> >> Yeah that would alleviate most of my concerns here, but the docs aren't
> >> saying anything like that. Since you added them & they just landed, do
> >> you mind doing a small follow-up where we e.g. say that these new
> >> settings are "EXPERIMENTAL" or whatever, and subject to drastic change?
> >
> > The doc is already pretty prescriptive.  It has this line at the end
> > of the first  paragraph:
> > "Unless you
> > have special requirements, it is recommended that you leave
> > this option empty or pick one of `committed`, `added`,
> > or `all`."
> >
> > Those values are already designed to change as Git changes.
> 
> I'm referring to the documentation as it stands not being marked as
> experimental in the sense that we might decide to re-do this to a large
> extent, i.e. something like the diff I suggested upthread in
> https://lore.kernel.org/git/220323.86fsn8ohg8.gmgdl@evledraar.gmail.com/
> 
> So yes, I agree that it e.g. clearly states that you can add a new
> core.git=foobar or whatever down the line, but it clearly doesn't
> suggest that e.g. core.fsync might have boolean semantics in some later
> version, or that the rest might simply be ignored, even if that
> e.g. means that we wouldn't sync loose objects on
> core.fsync=loose-object, as we'd just warn with a "we don't provide this
> anymore".
> 
> Or do you disagree with that? IOW I mean that we'd do something like
> this, either in docs or code:
> 
> diff --git a/config.c b/config.c
> index 3c9b6b589ab..94548566073 100644
> --- a/config.c
> +++ b/config.c
> @@ -1675,6 +1675,9 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
>  	}
>  
>  	if (!strcmp(var, "core.fsync")) {
> +		if (!the_repository->settings.feature_experimental)
> +			warning(_("the '%s' configuration option is EXPERIMENTAL. opt-in to use it with feature.experimental=true"),
> +				var);
>  		if (!value)
>  			return config_error_nonbool(var);
>  		fsync_components = parse_fsync_components(var, value);
> @@ -1682,6 +1685,9 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
>  	}
>  
>  	if (!strcmp(var, "core.fsyncmethod")) {
> +		if (!the_repository->settings.feature_experimental)
> +			warning(_("the '%s' configuration option is EXPERIMENTAL. opt-in to use it with feature.experimental=true"),
> +				var);
>  		if (!value)
>  			return config_error_nonbool(var);
>  		if (!strcmp(value, "fsync"))

Let's please not tie this to `feature.experimental=true`. Setting that
option has unintended sideeffects and will also change defaults which we
may not want to have in production. I don't mind adding a warning in the
docs though that the specific items which can be configured may be
subject to change in the future.

At GitLab, we've got a three-step plan:

    1. We need to migrate to `core.fsync` in the first place. In order
       to not migrate and change behaviour at the same point in time we
       already benefit from the fine-grainedness of this config because
       we can simply say `core.fsync=loose-objects` and have the same
       behaviour as before with `core.fsyncLooseObjects=true`.

    2. We'll want to enable syncing of packfiles, which I think wasn't
       previously covered by `core.fsyncLooseobjects`.

    3. We'll add `refs` to also sync loose refs to disk.

So while the end result will be the same as `committed`, having this
level of control helps us to assess the impact in a nicer way by being
able to do this step by step with feature flags.

On the other hand, many of the other parts we don't really care about.
Auxiliary metadata like the commit-graph or pack indices are data that
can in the worst case be regenerated by us, so it's not clear to me
whether it makes to also enable fsyncing those in production.

So altogether, I agree with Neeraj: having the fine-grainedness greatly
helps us to roll out changes like this and be able to pick what we deem
to be important. Personally I would be fine with explicitly pointing out
that there are two groups of this config in our docs though:

    1. The "porcelain" group: "committed", "added", "all", "none". These
       are abstract groups whose behaviour should adapt as we change
       implementations, and are those that should typically be set by a
       user, if intended.

    2. The "plumbing" or "expert" group: these are fine-grained options
       which shouldn't typically be used by Git users. They still have
       merit though in hosting environments, where requirements are
       typically a lot more specific.

We may also provide different guarantees for both groups. The first one
should definitely be stable, but we might state that the second group is
subject to change in the future.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2022-03-28 10:57 UTC|newest]

Thread overview: 122+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-04  3:28 [PATCH 0/2] A design for future-proofing fsync() configuration Neeraj K. Singh via GitGitGadget
2021-12-04  3:28 ` [PATCH 1/2] fsync: add writeout-only mode for fsyncing repo data Neeraj Singh via GitGitGadget
2021-12-06  7:54   ` Neeraj Singh
2021-12-04  3:28 ` [PATCH 2/2] core.fsync: introduce granular fsync control Neeraj Singh via GitGitGadget
2021-12-07  2:46 ` [PATCH v2 0/3] A design for future-proofing fsync() configuration Neeraj K. Singh via GitGitGadget
2021-12-07  2:46   ` [PATCH v2 1/3] core.fsyncmethod: add writeout-only mode Neeraj Singh via GitGitGadget
2021-12-07 11:44     ` Patrick Steinhardt
2021-12-07 12:14       ` Ævar Arnfjörð Bjarmason
2021-12-07 23:29       ` Neeraj Singh
2021-12-07 12:18     ` Ævar Arnfjörð Bjarmason
2021-12-07 23:58       ` Neeraj Singh
2021-12-07  2:46   ` [PATCH v2 2/3] core.fsync: introduce granular fsync control Neeraj Singh via GitGitGadget
2021-12-07 11:53     ` Patrick Steinhardt
2021-12-07 20:46       ` Neeraj Singh
2021-12-07 12:29     ` Ævar Arnfjörð Bjarmason
2021-12-07 21:44       ` Neeraj Singh
2021-12-08 10:05         ` Ævar Arnfjörð Bjarmason
2021-12-09  0:14           ` Neeraj Singh
2021-12-09  0:44             ` Junio C Hamano
2021-12-09  4:08             ` Ævar Arnfjörð Bjarmason
2021-12-09  6:18               ` Neeraj Singh
2022-01-18 23:50                 ` Neeraj Singh
2022-01-19 15:28                   ` Ævar Arnfjörð Bjarmason
2022-01-19 14:52                 ` Ævar Arnfjörð Bjarmason
2022-01-28  1:28                   ` Neeraj Singh
2021-12-07  2:46   ` [PATCH v2 3/3] core.fsync: new option to harden the index Neeraj Singh via GitGitGadget
2021-12-07 11:56   ` [PATCH v2 0/3] A design for future-proofing fsync() configuration Patrick Steinhardt
2021-12-08  0:44     ` Neeraj Singh
2021-12-09  0:57   ` [PATCH v3 0/4] " Neeraj K. Singh via GitGitGadget
2021-12-09  0:57     ` [PATCH v3 1/4] core.fsyncmethod: add writeout-only mode Neeraj Singh via GitGitGadget
2021-12-09  0:57     ` [PATCH v3 2/4] core.fsync: introduce granular fsync control Neeraj Singh via GitGitGadget
2021-12-09  0:57     ` [PATCH v3 3/4] core.fsync: new option to harden the index Neeraj Singh via GitGitGadget
2021-12-09  0:57     ` [PATCH v3 4/4] core.fsync: add a `derived-metadata` aggregate option Neeraj Singh via GitGitGadget
2022-01-08  1:13     ` [PATCH v3 0/4] A design for future-proofing fsync() configuration Neeraj Singh
2022-01-09  0:55       ` rsbecker
2022-01-10 19:00         ` Neeraj Singh
2022-02-01  3:33     ` [PATCH v4 " Neeraj K. Singh via GitGitGadget
2022-02-01  3:33       ` [PATCH v4 1/4] core.fsyncmethod: add writeout-only mode Neeraj Singh via GitGitGadget
2022-02-01  3:33       ` [PATCH v4 2/4] core.fsync: introduce granular fsync control Neeraj Singh via GitGitGadget
2022-02-02  0:51         ` Junio C Hamano
2022-02-02  1:42           ` Junio C Hamano
2022-02-11 21:18             ` Neeraj Singh
2022-02-11 22:19               ` Junio C Hamano
2022-02-11 23:04                 ` Neeraj Singh
2022-02-11 23:15                   ` Junio C Hamano
2022-02-12  0:39                     ` rsbecker
2022-02-14  7:04                     ` Patrick Steinhardt
2022-02-14 17:17                       ` Junio C Hamano
2022-03-09 13:42                         ` Patrick Steinhardt
2022-03-09 18:50                           ` Ævar Arnfjörð Bjarmason
2022-03-09 20:03                           ` Junio C Hamano
2022-03-10 12:33                             ` Patrick Steinhardt
2022-03-10 17:15                               ` Junio C Hamano
2022-03-09 20:05                           ` Neeraj Singh
2022-02-11 20:38           ` Neeraj Singh
2022-02-01  3:33       ` [PATCH v4 3/4] core.fsync: new option to harden the index Neeraj Singh via GitGitGadget
2022-02-01  3:33       ` [PATCH v4 4/4] core.fsync: add a `derived-metadata` aggregate option Neeraj Singh via GitGitGadget
2022-03-09 23:03       ` [PATCH v5 0/5] A design for future-proofing fsync() configuration Neeraj K. Singh via GitGitGadget
2022-03-09 23:03         ` [PATCH v5 1/5] wrapper: move inclusion of CSPRNG headers the wrapper.c file Neeraj Singh via GitGitGadget
2022-03-09 23:29           ` Junio C Hamano
2022-03-10  1:21             ` Neeraj Singh
2022-03-10  1:26           ` brian m. carlson
2022-03-10  1:56             ` Neeraj Singh
2022-03-09 23:03         ` [PATCH v5 2/5] core.fsyncmethod: add writeout-only mode Neeraj Singh via GitGitGadget
2022-03-09 23:48           ` Junio C Hamano
2022-03-09 23:03         ` [PATCH v5 3/5] core.fsync: introduce granular fsync control Neeraj Singh via GitGitGadget
2022-03-10  0:21           ` Junio C Hamano
2022-03-10  2:53             ` Neeraj Singh
2022-03-10  7:19               ` Junio C Hamano
2022-03-10 18:38                 ` Neeraj Singh
2022-03-10 18:44                   ` Junio C Hamano
2022-03-10 19:57                     ` Junio C Hamano
2022-03-10 20:25                       ` Neeraj Singh
2022-03-10 21:17                         ` Junio C Hamano
2022-03-10 13:11               ` Johannes Schindelin
2022-03-10 17:18               ` Junio C Hamano
2022-03-09 23:03         ` [PATCH v5 4/5] core.fsync: new option to harden the index Neeraj Singh via GitGitGadget
2022-03-09 23:03         ` [PATCH v5 5/5] core.fsync: documentation and user-friendly aggregate options Neeraj Singh via GitGitGadget
2022-03-10  9:53         ` Future-proofed syncing of refs Patrick Steinhardt
2022-03-10  9:53         ` [PATCH 6/8] core.fsync: add `fsync_component()` wrapper which doesn't die Patrick Steinhardt
2022-03-10 17:34           ` Junio C Hamano
2022-03-10 18:40             ` Neeraj Singh
2022-03-10  9:53         ` [PATCH 7/8] core.fsync: new option to harden loose references Patrick Steinhardt
2022-03-10 18:25           ` Junio C Hamano
2022-03-10 19:03             ` Neeraj Singh
2022-03-10 22:54           ` Neeraj Singh
2022-03-11  6:40           ` Junio C Hamano
2022-03-11  9:15             ` Patrick Steinhardt
2022-03-11  9:36               ` Ævar Arnfjörð Bjarmason
2022-03-10  9:53         ` [PATCH 8/8] core.fsync: new option to harden packed references Patrick Steinhardt
2022-03-10 18:28           ` Junio C Hamano
2022-03-11  9:10             ` Patrick Steinhardt
2022-03-10 22:43         ` [PATCH v6 0/6] A design for future-proofing fsync() configuration Neeraj K. Singh via GitGitGadget
2022-03-10 22:43           ` [PATCH v6 1/6] wrapper: make inclusion of Windows csprng header tightly scoped Neeraj Singh via GitGitGadget
2022-03-10 22:43           ` [PATCH v6 2/6] core.fsyncmethod: add writeout-only mode Neeraj Singh via GitGitGadget
2022-03-10 22:43           ` [PATCH v6 3/6] core.fsync: introduce granular fsync control infrastructure Neeraj Singh via GitGitGadget
2022-03-10 22:43           ` [PATCH v6 4/6] core.fsync: add configuration parsing Neeraj Singh via GitGitGadget
2022-03-28 11:06             ` Jiang Xin
2022-03-28 19:45               ` Neeraj Singh
2022-03-10 22:43           ` [PATCH v6 5/6] core.fsync: new option to harden the index Neeraj Singh via GitGitGadget
2022-03-10 22:43           ` [PATCH v6 6/6] core.fsync: documentation and user-friendly aggregate options Neeraj Singh via GitGitGadget
2022-03-15 19:12             ` [PATCH v7] " Neeraj Singh
2022-03-15 19:32               ` Junio C Hamano
2022-03-15 19:56                 ` Neeraj Singh
2022-03-23 14:20               ` do we have too much fsync() configuration in 'next'? (was: [PATCH v7] core.fsync: documentation and user-friendly aggregate options) Ævar Arnfjörð Bjarmason
2022-03-25 21:24                 ` Neeraj Singh
2022-03-26  0:24                   ` Ævar Arnfjörð Bjarmason
2022-03-26  1:23                     ` do we have too much fsync() configuration in 'next'? Junio C Hamano
2022-03-26  1:25                     ` do we have too much fsync() configuration in 'next'? (was: [PATCH v7] core.fsync: documentation and user-friendly aggregate options) Neeraj Singh
2022-03-26 15:31                       ` Ævar Arnfjörð Bjarmason
2022-03-27  5:27                         ` Neeraj Singh
2022-03-27 12:43                           ` Ævar Arnfjörð Bjarmason
2022-03-28 10:56                             ` Patrick Steinhardt [this message]
2022-03-28 11:25                               ` Ævar Arnfjörð Bjarmason
2022-03-28 19:56                                 ` Neeraj Singh
2022-03-30 16:59                                   ` Neeraj Singh
2022-03-10 23:34           ` [PATCH v6 0/6] A design for future-proofing fsync() configuration Junio C Hamano
2022-03-11  0:03             ` Neeraj Singh
2022-03-11 18:50               ` Neeraj Singh
2022-03-13 23:50             ` Junio C Hamano
2022-03-11  9:58           ` [PATCH v2] core.fsync: new option to harden references Patrick Steinhardt
2022-03-25  6:11             ` SZEDER Gábor

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YkGUeQH4y1KIAdCc@ncase \
    --to=ps@pks.im \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=avarab@gmail.com \
    --cc=bagasdotme@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=neerajsi@microsoft.com \
    --cc=newren@gmail.com \
    --cc=nksingh85@gmail.com \
    --cc=rsbecker@nexbridge.com \
    --cc=sandals@crustytoothpaste.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).