From: Johannes Sixt <j6t@kdbg.org>
To: "Junio C Hamano" <gitster@pobox.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: git@vger.kernel.org, tytso@mit.edu,
	Christoph Hellwig <hch@lst.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [RFC PATCH 2/2] core.fsyncObjectFiles: make the docs less flippant
Date: Thu, 17 Sep 2020 22:15:10 +0200
Message-ID: <5b969f59-0006-8632-d040-6a816416f51a@kdbg.org>
In-Reply-To: <xmqqv9gcs91k.fsf@gitster.c.googlers.com>

On 17.09.20 at 17:43, Junio C Hamano wrote:
> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
> 
>> As amusing as Linus's original prose[1] is here it doesn't really explain
>> in any detail to the uninitiated why you would or wouldn't enable
>> this, and the counter-intuitive reason for why git wouldn't fsync your
>> precious data.
>>
>> So elaborate (a lot) on why this may or may not be needed. This is my
>> best-effort attempt to summarize the various points raised in the last
>> ML[2] discussion about this.
>>
>> 1.  aafe9fbaf4 ("Add config option to enable 'fsync()' of object
>>     files", 2008-06-18)
>> 2. https://lore.kernel.org/git/20180117184828.31816-1-hch@lst.de/
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>  Documentation/config/core.txt | 42 ++++++++++++++++++++++++++++++-----
>>  1 file changed, 36 insertions(+), 6 deletions(-)
> 
> When I saw the subject in my mailbox, I expected to see that you
> would resurrect Christoph's updated text in [*1*], but you wrote a
> whole lot more ;-) And they are quite informative to help readers to
> understand what the option does.  I am not sure if the understanding
> directly help readers to decide if it is appropriate for their own
> repositories, though X-<.

Not only that; the new text also uses the term "fsync" as if it were
an actual English word, which, so far, I doubt it is ;) Slightly less
1337 wording would serve users better.

> 
> 
> Thanks.
> 
> [Reference]
> 
> *1* https://public-inbox.org/git/20180117193510.GA30657@lst.de/
> 
>>
>> diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
>> index 74619a9c03..5b47670c16 100644
>> --- a/Documentation/config/core.txt
>> +++ b/Documentation/config/core.txt
>> @@ -548,12 +548,42 @@ core.whitespace::
>>    errors. The default tab width is 8. Allowed values are 1 to 63.
>>  
>>  core.fsyncObjectFiles::
>> -	This boolean will enable 'fsync()' when writing object files.
>> -+
>> -This is a total waste of time and effort on a filesystem that orders
>> -data writes properly, but can be useful for filesystems that do not use
>> -journalling (traditional UNIX filesystems) or that only journal metadata
>> -and not file contents (OS X's HFS+, or Linux ext3 with "data=writeback").
>> +	This boolean will enable 'fsync()' when writing loose object
>> +	files. Both the file itself and its containing directory will
>> +	be fsynced.
>> ++
>> +When git writes data any required object writes will precede the
>> +corresponding reference update(s). For example, a
>> +linkgit:git-receive-pack[1] accepting a push might write a pack or
>> +loose objects (depending on settings such as `transfer.unpackLimit`).
>> ++
>> +Therefore on a journaled file system which ensures that data is
>> +flushed to disk in chronological order an fsync shouldn't be
>> +needed. The loose objects might be lost with a crash, but so will the
>> +ref update that would have referenced them. Git's own state in such a
>> +crash will remain consistent.
>> ++
>> +This option exists because that assumption doesn't hold on filesystems
>> +where the data ordering is not preserved, such as on ext3 and ext4
>> +with "data=writeback". On such a filesystem the `rename()` that drops
>> +the new reference in place might be preserved, but the contents or
>> +directory entry for the loose object(s) might not have been synced to
>> +disk.
>> ++
>> +Enabling this option might slow git down by a lot in some
>> +cases. E.g. in the case of a naïve bulk import tool which might create
>> +a million loose objects before a final ref update and `gc`. In other
>> +more common cases such as on a server being pushed to with default
>> +`transfer.unpackLimit` settings the difference might not be noticeable.
>> ++
>> +However, that's highly filesystem-dependent; on some filesystems
>> +simply calling fsync() might force an unrelated bulk background write
>> +to be serialized to disk. Such edge cases are the reason this option
>> +is off by default. That default setting might change in future
>> +versions.
>> ++
>> +In older versions of git only the descriptor for the file itself was
>> +fsynced, not its directory entry.
>>  
>>  core.preloadIndex::
>>  	Enable parallel index preload for operations like 'git diff'
> 
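
For readers unfamiliar with the pattern the quoted text describes (flush
the file contents, then flush the containing directory so the new entry
itself is durable), here is a minimal sketch in Python. It is not git's
implementation: the function name `write_durably` and the temp-file
prefix are hypothetical, and it assumes a POSIX system where a directory
can be opened read-only and passed to fsync().

```python
import os
import tempfile


def write_durably(directory: str, name: str, data: bytes) -> str:
    """Write data to directory/name, flushing both file and directory entry.

    Hypothetical helper for illustration only; assumes POSIX semantics.
    """
    final_path = os.path.join(directory, name)
    # Create the temporary file in the same directory so the rename
    # below never crosses a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="tmp_obj_")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()             # push Python's buffer to the kernel
            os.fsync(f.fileno())  # ask the kernel to flush file contents
        os.replace(tmp_path, final_path)  # atomically move into place
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Flush the directory so the new name survives a crash as well.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return final_path
```

A caller that wants the ordering described in the quoted text would
write all objects this way first and only then update whatever refers
to them.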


