git.vger.kernel.org archive mirror
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: git@vger.kernel.org
Cc: Patrick Steinhardt <ps@pks.im>, Jeff King <peff@peff.net>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Junio C Hamano <gitster@pobox.com>,
	"Neeraj K. Singh" <neerajsi@microsoft.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Eric Wong <e@80x24.org>, Christoph Hellwig <hch@lst.de>,
	Emily Shaffer <emilyshaffer@google.com>
Subject: RFC: A configuration design for future-proofing fsync() configuration
Date: Wed, 10 Nov 2021 16:09:33 +0100
Message-ID: <211110.86r1bogg27.gmgdl@evledraar.gmail.com>

As a follow-up to various fsync topics in-flight I've been encouraging
those involved to come up with some way to configure fsync() in a way
that'll make holistic sense in the end-state.

Continuing a discussion from [1], currently we have:

    ; Defaults to 'false'
    core.fsyncObjectFiles = [true|false]

In master..next this has been extended by Neeraj to:

   core.fsyncObjectFiles = [true|false|batch]

Which, as an aside I hadn't considered before and I think we need to
change before it lands on "master": we really don't want config that
users want to enable to make older versions hard die. It's annoying to
want to configure a new thing and not be able to put it in .gitconfig
because older versions die on it:

    $ git -c core.fsyncObjectFiles=batch status; echo $?
    fatal: bad boolean config value 'batch' for 'core.fsyncobjectfiles'
    128

Then there's Eric Wong's proposed[2]:

    core.fsync = <bool>

And now Patrick Steinhardt has a proposal to extend Neeraj's with[3]:

    ; Like core.fsyncObjectFiles, but apparently for .git/refs, not
    ; .git/objects (but see my confusion on that topic in [1])
    core.fsyncRefFiles = [<bool>|batch]

I think this sort of config schema would make everyone above happy.

It would:

 A) Be easy to extend for any future fsync behavior we'd reasonably
    implement
 
 B) Not make older git versions die. It's fine if they warn(), but not die.

 C) Have some pretty contrived key names, but I'm trying to maintain the
    constraint that you can set both fsck.X=Y and
    e.g. fetch.fsck.X=Y. I.e. we should be able to configure things
    globally *and* per-command, like color.*, fsck.* etc.
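
Point B) can be illustrated with a small sketch. This is not git's
actual parser, just a Python model of the behavior being asked for: an
unrecognized value produces a warning and a fallback instead of a hard
die. The function name, the fallback default and the set of accepted
boolean spellings (a subset of what git accepts) are assumptions made
for the example.

```python
import warnings

# A subset of the boolean spellings git accepts; enough for this sketch.
_TRUE = {"true", "yes", "on", "1"}
_FALSE = {"false", "no", "off", "0"}
# Non-boolean values a (hypothetical) newer version understands.
KNOWN_MODES = {"batch"}


def parse_fsync_object_files(value, default=False):
    """Return True, False or a known mode string.

    Unknown values warn and fall back to `default` rather than aborting,
    so a config written for a newer version stays usable.
    """
    v = value.strip().lower()
    if v in _TRUE:
        return True
    if v in _FALSE:
        return False
    if v in KNOWN_MODES:
        return v
    warnings.warn(
        f"unknown core.fsyncObjectFiles value '{value}', using default"
    )
    return default
```

With this shape, a version that predates "batch" would simply leave it
out of KNOWN_MODES and carry on with its default.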

Proposal:

  ; Turns on/off all fsync, whatever the method is. I.e. allows you to
  ; never make any fsync() calls whatsoever (which we have another
  ; in-flight topic for).

  ; The "false" was controversial, and we could just leave it
  ; unimplemented
  core.fsync = <bool>

  ; Optional, by default we'd use the most pedantic method (I'd call our
  ; current one "loose"; whether we want to forward-support it is another
  ; matter).
  ;
  ; Whatever names we pick an option like this should ignore (or at most
  ; warn about) values it doesn't know about, not hard die on it.
  ;
  ; Here "batch" is what Neeraj and Patrick are pursuing, a hypothetical
  ; "POSIX" would be a pedantic way of exhaustively fsyncing everything.
  ;
  ; We'd leave the door open to e.g. setting it to "linux:ext4" or whatever,
  ; to do only the work needed on some specific popular FS.
  core.fsyncMethod = loose | POSIX | batch | linux:ext4 | NTFS | ...

  ; Turn on or off entire categories of files we'd like to sync. This
  ; way Neeraj's and Patrick's approach would be to set
  ; core.fsyncMethod=batch, and then core.fsyncGroup=files &
  ; core.fsyncGroup=refs.

  ; If we learn about a new core.fsyncGroup = xyz in the future a <bool>
  ; in "core.fsyncGroupDefault" will prevail. I.e. if true it's
  ; included, if false not.
  ;
  ; Whether "false" or "true" is the default depends on
  ; core.fsyncMethod. For POSIX it would be true, for "loose" it's
  ; false.
  core.fsyncGroup = files
  core.fsyncGroup = refs
  core.fsyncGroup = objects

I'm not sure I like calling it "group". Maybe "class", "category"? Doing
it with this structure is extensible to the two-level keys, as noted
above.
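
To make the interplay of core.fsync, core.fsyncMethod, core.fsyncGroup
and core.fsyncGroupDefault concrete, here is one possible reading of it
as a Python sketch. The function name and the dict-based config are
invented for the example. The POSIX and "loose" defaults come from the
description above; treating every other method as "false", and "POSIX"
as the method used when none is set, are assumptions.

```python
# Default for categories not named in core.fsyncGroup, per method.
# POSIX -> True and loose -> False are from the proposal text; falling
# back to False for any other method is an assumption of this sketch.
METHOD_GROUP_DEFAULT = {"POSIX": True, "loose": False}


def should_fsync(group, config):
    """Decide whether files in `group` get synced.

    `config` maps lowercased keys to values; the multi-valued
    core.fsyncGroup is represented as a list of names.
    """
    if not config.get("core.fsync", True):
        return False  # master switch: no fsync() calls whatsoever
    if group in config.get("core.fsyncgroup", []):
        return True  # explicitly requested category
    explicit = config.get("core.fsyncgroupdefault")
    if explicit is not None:
        return explicit  # user-set default prevails for unlisted groups
    method = config.get("core.fsyncmethod", "POSIX")
    return METHOD_GROUP_DEFAULT.get(method, False)
```

So under core.fsyncMethod=loose a category added in some future version
stays off unless the user lists it or sets core.fsyncGroupDefault=true,
while under POSIX it is picked up automatically.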

  ; Our existing config knob. When "false", synonymous with:
  ;
  ;     core.fsync = true
  ;     core.fsyncMethod = loose
  ;     core.fsyncGroup = pack
  ;
  ; When "true", synonymous with the above, plus:
  ;     core.fsyncGroup = loose
  ;
  ; Or something like that. I.e. we'll fsync *.pack, *.bitmap etc, and
  ; probably some other stuff, but not loose objects etc.
  ;
  ; Whatever exactly we fsync now, this schema should be generic enough
  ; to support it.
  core.fsyncObjectFiles = <bool>
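
The backwards-compatibility mapping for core.fsyncObjectFiles could be
modeled like this. It only restates the "or something like that"
mapping from the comment above as Python; the function name and the
returned dict layout are made up for the example.

```python
def expand_fsync_object_files(enabled):
    """Translate the legacy boolean into the proposed keys.

    Mirrors the tentative mapping in the proposal: packs are always
    synced, loose objects only when the legacy knob is true.
    """
    groups = ["pack"]
    if enabled:
        groups.append("loose")
    return {
        "core.fsync": True,
        "core.fsyncMethod": "loose",
        "core.fsyncGroup": groups,
    }
```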

  ; A namespace for core.fsyncMethod = <X>. Specific methods will
  ; own this namespace and can configure whatever they want.
  fsyncMethod.<x>.<a> = <b>

E.g. we might have:

  fsyncMethod.POSIX.content = true
  fsyncMethod.POSIX.metadata = false

If we know we'd like (depending on other config) to fsync things
exhaustively or not, but do different things depending on file content
or metadata. I.e. maybe your FS's fsync() on a file fd always implies a
sync of the metadata, and maybe not.

  ; Change whatever fsync configuration you want per-command, similar to
  ; fsck.* and fetch.fsck.*
  transfer.fsyncGroup=*
  fetch.fsyncGroup=*
  ...
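
The per-command lookup could work along these lines. This is a sketch
only: the function name and the flat dict config are invented, and the
rule "the command's own section first, then core.*" is an assumption
modeled on how fetch.fsck.* relates to fsck.*.

```python
def effective_value(config, command, key):
    """Return '<command>.<key>' if set, else 'core.<key>', else None.

    `config` maps lowercased 'section.key' names to values.
    """
    for section in (command, "core"):
        name = f"{section}.{key}".lower()
        if name in config:
            return config[name]
    return None
```

E.g. with core.fsyncGroup=refs and fetch.fsyncGroup=objects, a fetch
would see "objects" and every other command "refs".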

1. https://lore.kernel.org/git/211110.86v910gi9a.gmgdl@evledraar.gmail.com/
2. https://lore.kernel.org/git/20211028002102.19384-1-e@80x24.org/
3. https://lore.kernel.org/git/cover.1636544377.git.ps@pks.im/


