All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Neeraj K. Singh via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: rsbecker@nexbridge.com, bagasdotme@gmail.com, newren@gmail.com,
	avarab@gmail.com, nksingh85@gmail.com, ps@pks.im,
	"Neeraj K. Singh" <neerajsi@microsoft.com>
Subject: [PATCH v3 0/4] A design for future-proofing fsync() configuration
Date: Thu, 09 Dec 2021 00:57:09 +0000	[thread overview]
Message-ID: <pull.1093.v3.git.1639011433.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.1093.v2.git.1638845211.gitgitgadget@gmail.com>

This is an implementation of an extensible configuration mechanism for
fsyncing persistent components of a repo.

The main goals are to separate the "what" to sync from the "how". There are
now two settings: core.fsync - Control the 'what', including the index.
core.fsyncMethod - Control the 'how'. Currently we support writeout-only and
full fsync.

Syncing of refs can be layered on top of core.fsync. And batch mode will be
layered on core.fsyncMethod.

core.fsyncobjectfiles is removed and will issue a deprecation warning if
it's seen.

I'd like to get agreement on this direction before submitting batch mode to
the list. The batch mode series is available to view at
https://github.com/neerajsi-msft/git/pull/1.

Please see [1], [2], and [3] for discussions that led to this series.

One major concern I'd voice that is adverse to this change: when a new
persistent file is added to the Git repo, the person adding that file will
need to update this configuration code and the documentation. Maybe this is
the right thing to always think about, but the FSYNC_COMPONENT lists will be
a single place that will receive a number of updates over time.

Note: There's a minor conflict with ns/tmp-objdir. In
object-file.c:close_loose_object we need to resolve it like this:

if (!the_repository->objects->odb->will_destroy)
fsync_component_or_die(FSYNC_COMPONENT_LOOSE_OBJECT, fd, "loose object
file");

V3 changes:

 * Remove relative path from git-compat-util.h include [4].
 * Updated newly added warning texts to have more context for localization
   [4].
 * Fixed tab spacing in enum fsync_action
 * Moved the fsync looping out to a helper and do it consistently. [4]
 * Changed commit description to use camelCase for config names. [5]
 * Add an optional fourth patch with derived-metadata so that the user can
   exclude a forward-compatible set of things that should be recomputable
   given existing data.

V2 changes:

 * Updated the documentation for core.fsyncmethod to be less certain.
   writeout-only probably does not do the right thing on Linux.
 * Split out the core.fsync=index change into its own commit.
 * Rename REPO_COMPONENT to FSYNC_COMPONENT. This is really specific to
   fsyncing, so the name should reflect that.
 * Re-add missing Makefile change for SYNC_FILE_RANGE.
 * Tested writeout-only mode, index syncing, and general config settings.

[1] https://lore.kernel.org/git/211110.86r1bogg27.gmgdl@evledraar.gmail.com/
[2]
https://lore.kernel.org/git/dd65718814011eb93ccc4428f9882e0f025224a6.1636029491.git.ps@pks.im/
[3]
https://lore.kernel.org/git/pull.1076.git.git.1629856292.gitgitgadget@gmail.com/
[4]
https://lore.kernel.org/git/CANQDOdf8C4-haK9=Q_J4Cid8bQALnmGDm=SvatRbaVf+tkzqLw@mail.gmail.com/
[5] https://lore.kernel.org/git/211207.861r2opplg.gmgdl@evledraar.gmail.com/

Neeraj Singh (4):
  core.fsyncmethod: add writeout-only mode
  core.fsync: introduce granular fsync control
  core.fsync: new option to harden the index
  core.fsync: add a `derived-metadata` aggregate option

 Documentation/config/core.txt       | 35 ++++++++---
 Makefile                            |  6 ++
 builtin/fast-import.c               |  2 +-
 builtin/index-pack.c                |  4 +-
 builtin/pack-objects.c              |  8 ++-
 bulk-checkin.c                      |  5 +-
 cache.h                             | 49 +++++++++++++++-
 commit-graph.c                      |  3 +-
 compat/mingw.h                      |  3 +
 compat/win32/flush.c                | 28 +++++++++
 config.c                            | 90 ++++++++++++++++++++++++++++-
 config.mak.uname                    |  3 +
 configure.ac                        |  8 +++
 contrib/buildsystems/CMakeLists.txt |  3 +-
 csum-file.c                         |  5 +-
 csum-file.h                         |  3 +-
 environment.c                       |  3 +-
 git-compat-util.h                   | 24 ++++++++
 midx.c                              |  3 +-
 object-file.c                       |  3 +-
 pack-bitmap-write.c                 |  3 +-
 pack-write.c                        | 13 +++--
 read-cache.c                        | 19 ++++--
 wrapper.c                           | 64 ++++++++++++++++++++
 write-or-die.c                      | 10 ++--
 25 files changed, 354 insertions(+), 43 deletions(-)
 create mode 100644 compat/win32/flush.c


base-commit: abe6bb3905392d5eb6b01fa6e54d7e784e0522aa
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1093%2Fneerajsi-msft%2Fns%2Fcore-fsync-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1093/neerajsi-msft/ns/core-fsync-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1093

Range-diff vs v2:

 1:  e79522cbdd4 ! 1:  15edfe51509 core.fsyncmethod: add writeout-only mode
     @@ Metadata
       ## Commit message ##
          core.fsyncmethod: add writeout-only mode
      
     -    This commit introduces the `core.fsyncmethod` configuration
     +    This commit introduces the `core.fsyncMethod` configuration
          knob, which can currently be set to `fsync` or `writeout-only`.
      
          The new writeout-only mode attempts to tell the operating system to
     @@ Commit message
          directive to the storage controller. This change updates fsync to do
          fcntl(F_FULLFSYNC) to make fsync actually durable. We maintain parity
          with existing behavior on Apple platforms by setting the default value
     -    of the new core.fsyncmethod option.
     +    of the new core.fsyncMethod option.
      
          Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
      
     @@ compat/mingw.h: int mingw_getpagesize(void);
      
       ## compat/win32/flush.c (new) ##
      @@
     -+#include "../../git-compat-util.h"
     ++#include "git-compat-util.h"
      +#include <winternl.h>
      +#include "lazyload.h"
      +
     @@ config.c: static int git_default_core_config(const char *var, const char *value,
      +		else if (!strcmp(value, "writeout-only"))
      +			fsync_method = FSYNC_METHOD_WRITEOUT_ONLY;
      +		else
     -+			warning(_("unknown %s value '%s'"), var, value);
     ++			warning(_("ignoring unknown core.fsyncMethod value '%s'"), value);
      +
      +	}
      +
     @@ git-compat-util.h: __attribute__((format (printf, 1, 2))) NORETURN
      +#endif
      +
      +enum fsync_action {
     -+    FSYNC_WRITEOUT_ONLY,
     -+    FSYNC_HARDWARE_FLUSH
     ++	FSYNC_WRITEOUT_ONLY,
     ++	FSYNC_HARDWARE_FLUSH
      +};
      +
      +/*
     @@ wrapper.c: int xmkstemp_mode(char *filename_template, int mode)
       	return fd;
       }
       
     ++/*
     ++ * Some platforms return EINTR from fsync. Since fsync is invoked in some
     ++ * cases by a wrapper that dies on failure, do not expose EINTR to callers.
     ++ */
     ++static int fsync_loop(int fd)
     ++{
     ++	int err;
     ++
     ++	do {
     ++		err = fsync(fd);
     ++	} while (err < 0 && errno == EINTR);
     ++	return err;
     ++}
     ++
      +int git_fsync(int fd, enum fsync_action action)
      +{
      +	switch (action) {
     @@ wrapper.c: int xmkstemp_mode(char *filename_template, int mode)
      +		 * on macOS, fsync just causes filesystem cache writeback but does not
      +		 * flush hardware caches.
      +		 */
     -+		return fsync(fd);
     ++		return fsync_loop(fd);
      +#endif
      +
      +#ifdef HAVE_SYNC_FILE_RANGE
     @@ wrapper.c: int xmkstemp_mode(char *filename_template, int mode)
      +		 * case, since callers asking for a hardware flush may die if
      +		 * this function returns an error.
      +		 */
     -+		for (;;) {
     -+			int err;
      +#ifdef __APPLE__
     -+			err = fcntl(fd, F_FULLFSYNC);
     ++		return fcntl(fd, F_FULLFSYNC);
      +#else
     -+			err = fsync(fd);
     ++		return fsync_loop(fd);
      +#endif
     -+			if (err >= 0 || errno != EINTR)
     -+				return err;
     -+		}
     -+
      +	default:
      +		BUG("unexpected git_fsync(%d) call", action);
      +	}
 2:  ff80a94bf9a ! 2:  080be1a6f64 core.fsync: introduce granular fsync control
     @@ config.c: static int git_parse_maybe_bool_text(const char *value)
      +
      +		if (!found) {
      +			char *component = xstrndup(string, len);
     -+			warning(_("unknown %s value '%s'"), var, component);
     ++			warning(_("ignoring unknown core.fsync component '%s'"), component);
      +			free(component);
      +		}
      +
 3:  86e39b8f8d1 = 3:  2207950beba core.fsync: new option to harden the index
 -:  ----------- > 4:  a830d177d4c core.fsync: add a `derived-metadata` aggregate option

-- 
gitgitgadget

  parent reply	other threads:[~2021-12-09  0:57 UTC|newest]

Thread overview: 122+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-04  3:28 [PATCH 0/2] A design for future-proofing fsync() configuration Neeraj K. Singh via GitGitGadget
2021-12-04  3:28 ` [PATCH 1/2] fsync: add writeout-only mode for fsyncing repo data Neeraj Singh via GitGitGadget
2021-12-06  7:54   ` Neeraj Singh
2021-12-04  3:28 ` [PATCH 2/2] core.fsync: introduce granular fsync control Neeraj Singh via GitGitGadget
2021-12-07  2:46 ` [PATCH v2 0/3] A design for future-proofing fsync() configuration Neeraj K. Singh via GitGitGadget
2021-12-07  2:46   ` [PATCH v2 1/3] core.fsyncmethod: add writeout-only mode Neeraj Singh via GitGitGadget
2021-12-07 11:44     ` Patrick Steinhardt
2021-12-07 12:14       ` Ævar Arnfjörð Bjarmason
2021-12-07 23:29       ` Neeraj Singh
2021-12-07 12:18     ` Ævar Arnfjörð Bjarmason
2021-12-07 23:58       ` Neeraj Singh
2021-12-07  2:46   ` [PATCH v2 2/3] core.fsync: introduce granular fsync control Neeraj Singh via GitGitGadget
2021-12-07 11:53     ` Patrick Steinhardt
2021-12-07 20:46       ` Neeraj Singh
2021-12-07 12:29     ` Ævar Arnfjörð Bjarmason
2021-12-07 21:44       ` Neeraj Singh
2021-12-08 10:05         ` Ævar Arnfjörð Bjarmason
2021-12-09  0:14           ` Neeraj Singh
2021-12-09  0:44             ` Junio C Hamano
2021-12-09  4:08             ` Ævar Arnfjörð Bjarmason
2021-12-09  6:18               ` Neeraj Singh
2022-01-18 23:50                 ` Neeraj Singh
2022-01-19 15:28                   ` Ævar Arnfjörð Bjarmason
2022-01-19 14:52                 ` Ævar Arnfjörð Bjarmason
2022-01-28  1:28                   ` Neeraj Singh
2021-12-07  2:46   ` [PATCH v2 3/3] core.fsync: new option to harden the index Neeraj Singh via GitGitGadget
2021-12-07 11:56   ` [PATCH v2 0/3] A design for future-proofing fsync() configuration Patrick Steinhardt
2021-12-08  0:44     ` Neeraj Singh
2021-12-09  0:57   ` Neeraj K. Singh via GitGitGadget [this message]
2021-12-09  0:57     ` [PATCH v3 1/4] core.fsyncmethod: add writeout-only mode Neeraj Singh via GitGitGadget
2021-12-09  0:57     ` [PATCH v3 2/4] core.fsync: introduce granular fsync control Neeraj Singh via GitGitGadget
2021-12-09  0:57     ` [PATCH v3 3/4] core.fsync: new option to harden the index Neeraj Singh via GitGitGadget
2021-12-09  0:57     ` [PATCH v3 4/4] core.fsync: add a `derived-metadata` aggregate option Neeraj Singh via GitGitGadget
2022-01-08  1:13     ` [PATCH v3 0/4] A design for future-proofing fsync() configuration Neeraj Singh
2022-01-09  0:55       ` rsbecker
2022-01-10 19:00         ` Neeraj Singh
2022-02-01  3:33     ` [PATCH v4 " Neeraj K. Singh via GitGitGadget
2022-02-01  3:33       ` [PATCH v4 1/4] core.fsyncmethod: add writeout-only mode Neeraj Singh via GitGitGadget
2022-02-01  3:33       ` [PATCH v4 2/4] core.fsync: introduce granular fsync control Neeraj Singh via GitGitGadget
2022-02-02  0:51         ` Junio C Hamano
2022-02-02  1:42           ` Junio C Hamano
2022-02-11 21:18             ` Neeraj Singh
2022-02-11 22:19               ` Junio C Hamano
2022-02-11 23:04                 ` Neeraj Singh
2022-02-11 23:15                   ` Junio C Hamano
2022-02-12  0:39                     ` rsbecker
2022-02-14  7:04                     ` Patrick Steinhardt
2022-02-14 17:17                       ` Junio C Hamano
2022-03-09 13:42                         ` Patrick Steinhardt
2022-03-09 18:50                           ` Ævar Arnfjörð Bjarmason
2022-03-09 20:03                           ` Junio C Hamano
2022-03-10 12:33                             ` Patrick Steinhardt
2022-03-10 17:15                               ` Junio C Hamano
2022-03-09 20:05                           ` Neeraj Singh
2022-02-11 20:38           ` Neeraj Singh
2022-02-01  3:33       ` [PATCH v4 3/4] core.fsync: new option to harden the index Neeraj Singh via GitGitGadget
2022-02-01  3:33       ` [PATCH v4 4/4] core.fsync: add a `derived-metadata` aggregate option Neeraj Singh via GitGitGadget
2022-03-09 23:03       ` [PATCH v5 0/5] A design for future-proofing fsync() configuration Neeraj K. Singh via GitGitGadget
2022-03-09 23:03         ` [PATCH v5 1/5] wrapper: move inclusion of CSPRNG headers the wrapper.c file Neeraj Singh via GitGitGadget
2022-03-09 23:29           ` Junio C Hamano
2022-03-10  1:21             ` Neeraj Singh
2022-03-10  1:26           ` brian m. carlson
2022-03-10  1:56             ` Neeraj Singh
2022-03-09 23:03         ` [PATCH v5 2/5] core.fsyncmethod: add writeout-only mode Neeraj Singh via GitGitGadget
2022-03-09 23:48           ` Junio C Hamano
2022-03-09 23:03         ` [PATCH v5 3/5] core.fsync: introduce granular fsync control Neeraj Singh via GitGitGadget
2022-03-10  0:21           ` Junio C Hamano
2022-03-10  2:53             ` Neeraj Singh
2022-03-10  7:19               ` Junio C Hamano
2022-03-10 18:38                 ` Neeraj Singh
2022-03-10 18:44                   ` Junio C Hamano
2022-03-10 19:57                     ` Junio C Hamano
2022-03-10 20:25                       ` Neeraj Singh
2022-03-10 21:17                         ` Junio C Hamano
2022-03-10 13:11               ` Johannes Schindelin
2022-03-10 17:18               ` Junio C Hamano
2022-03-09 23:03         ` [PATCH v5 4/5] core.fsync: new option to harden the index Neeraj Singh via GitGitGadget
2022-03-09 23:03         ` [PATCH v5 5/5] core.fsync: documentation and user-friendly aggregate options Neeraj Singh via GitGitGadget
2022-03-10  9:53         ` Future-proofed syncing of refs Patrick Steinhardt
2022-03-10  9:53         ` [PATCH 6/8] core.fsync: add `fsync_component()` wrapper which doesn't die Patrick Steinhardt
2022-03-10 17:34           ` Junio C Hamano
2022-03-10 18:40             ` Neeraj Singh
2022-03-10  9:53         ` [PATCH 7/8] core.fsync: new option to harden loose references Patrick Steinhardt
2022-03-10 18:25           ` Junio C Hamano
2022-03-10 19:03             ` Neeraj Singh
2022-03-10 22:54           ` Neeraj Singh
2022-03-11  6:40           ` Junio C Hamano
2022-03-11  9:15             ` Patrick Steinhardt
2022-03-11  9:36               ` Ævar Arnfjörð Bjarmason
2022-03-10  9:53         ` [PATCH 8/8] core.fsync: new option to harden packed references Patrick Steinhardt
2022-03-10 18:28           ` Junio C Hamano
2022-03-11  9:10             ` Patrick Steinhardt
2022-03-10 22:43         ` [PATCH v6 0/6] A design for future-proofing fsync() configuration Neeraj K. Singh via GitGitGadget
2022-03-10 22:43           ` [PATCH v6 1/6] wrapper: make inclusion of Windows csprng header tightly scoped Neeraj Singh via GitGitGadget
2022-03-10 22:43           ` [PATCH v6 2/6] core.fsyncmethod: add writeout-only mode Neeraj Singh via GitGitGadget
2022-03-10 22:43           ` [PATCH v6 3/6] core.fsync: introduce granular fsync control infrastructure Neeraj Singh via GitGitGadget
2022-03-10 22:43           ` [PATCH v6 4/6] core.fsync: add configuration parsing Neeraj Singh via GitGitGadget
2022-03-28 11:06             ` Jiang Xin
2022-03-28 19:45               ` Neeraj Singh
2022-03-10 22:43           ` [PATCH v6 5/6] core.fsync: new option to harden the index Neeraj Singh via GitGitGadget
2022-03-10 22:43           ` [PATCH v6 6/6] core.fsync: documentation and user-friendly aggregate options Neeraj Singh via GitGitGadget
2022-03-15 19:12             ` [PATCH v7] " Neeraj Singh
2022-03-15 19:32               ` Junio C Hamano
2022-03-15 19:56                 ` Neeraj Singh
2022-03-23 14:20               ` do we have too much fsync() configuration in 'next'? (was: [PATCH v7] core.fsync: documentation and user-friendly aggregate options) Ævar Arnfjörð Bjarmason
2022-03-25 21:24                 ` Neeraj Singh
2022-03-26  0:24                   ` Ævar Arnfjörð Bjarmason
2022-03-26  1:23                     ` do we have too much fsync() configuration in 'next'? Junio C Hamano
2022-03-26  1:25                     ` do we have too much fsync() configuration in 'next'? (was: [PATCH v7] core.fsync: documentation and user-friendly aggregate options) Neeraj Singh
2022-03-26 15:31                       ` Ævar Arnfjörð Bjarmason
2022-03-27  5:27                         ` Neeraj Singh
2022-03-27 12:43                           ` Ævar Arnfjörð Bjarmason
2022-03-28 10:56                             ` Patrick Steinhardt
2022-03-28 11:25                               ` Ævar Arnfjörð Bjarmason
2022-03-28 19:56                                 ` Neeraj Singh
2022-03-30 16:59                                   ` Neeraj Singh
2022-03-10 23:34           ` [PATCH v6 0/6] A design for future-proofing fsync() configuration Junio C Hamano
2022-03-11  0:03             ` Neeraj Singh
2022-03-11 18:50               ` Neeraj Singh
2022-03-13 23:50             ` Junio C Hamano
2022-03-11  9:58           ` [PATCH v2] core.fsync: new option to harden references Patrick Steinhardt
2022-03-25  6:11             ` SZEDER Gábor

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.1093.v3.git.1639011433.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=avarab@gmail.com \
    --cc=bagasdotme@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=neerajsi@microsoft.com \
    --cc=newren@gmail.com \
    --cc=nksingh85@gmail.com \
    --cc=ps@pks.im \
    --cc=rsbecker@nexbridge.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.