All of lore.kernel.org
 help / color / mirror / Atom feed
From: Richard Weinberger <richard@nod.at>
To: Theodore Ts'o <tytso@mit.edu>
Cc: Johannes Schindelin <johannes.schindelin@gmx.de>,
	git@vger.kernel.org, David Gstir <david@sigma-star.at>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: broken repo after power cut
Date: Mon, 22 Jun 2015 13:19:59 +0200	[thread overview]
Message-ID: <5587EF5F.90207@nod.at> (raw)
In-Reply-To: <20150622003551.GP29480@thunk.org>

Am 22.06.2015 um 02:35 schrieb Theodore Ts'o:
> On Sun, Jun 21, 2015 at 03:07:41PM +0200, Richard Weinberger wrote:
> 
>>> I was then shocked to learn that ext4 apparently has a default
>>> setting that allows it to truncate files upon power failure
>>> (something about a full journal vs a fast journal or some such)
> 
> s/ext4/all modern file systems/
> 
> POSIX makes **no guarantees** about what happens after a power failure
> unless you use fsync() --- which git does not do by default (see below).

Thanks for pointing this out.

> The bottome lins is that if you care about files being written, you
> need to use fsync().  Should git use fsync() by default?  Well, if you
> are willing to accept that if your system crashes within a second or
> so of your last git operation, you might need to run "git fsck" and
> potentially recover from a busted repo, maybe speed is more important
> for you (and git is known for its speed/performance, after all. :-)
> 
> The actual state of the source tree would have been written using a
> text editor which tends to be paranoid about using fsync (at least, if
> you use a real editor like Emacs or Vi, as opposed to the toy notepad
> editors shipped with GNOME or KDE :-).  So as long as you know what
> you're doing, it's unlikely that you will actually lose any work.
> 
> Personally, I have core.fsyncobjectfiles set to yes in my .gitconfig.
> Part of this is because I have an SSD, so the speed hit really doesn't
> bother me, and needing to recover a corrupted git repository is a pain
> (although I have certainly done it in the past).

I think core.fsyncObjectFiles documentation really needs an update.
What about this one?

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 43bb53c..b08fa11 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -693,10 +693,16 @@ core.whitespace::
 core.fsyncObjectFiles::
 	This boolean will enable 'fsync()' when writing object files.
 +
-This is a total waste of time and effort on a filesystem that orders
-data writes properly, but can be useful for filesystems that do not use
-journalling (traditional UNIX filesystems) or that only journal metadata
-and not file contents (OS X's HFS+, or Linux ext3 with "data=writeback").
+For performance reasons git does not call 'fsync()' after writing object
+files. This means that after a power cut your git repository can get
+corrupted as not all data hit the storage media. Especially on modern
+filesystems like ext4, xfs or btrfs this can happen very easily.
+If you have to face power cuts and care about your data it is strongly
+recommended to enable this setting.
+Please note that git's behavior used to be safe on ext3 with data=ordered,
+for any other filesystems or mount settings this is not the case as
+POSIX clearly states that you have to call 'fsync()' to make sure that
+all data is written.

 core.preloadIndex::
 	Enable parallel index preload for operations like 'git diff'

--
Thanks,
//richard

  reply	other threads:[~2015-06-22 11:20 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-20 19:40 broken repo after power cut Richard Weinberger
2015-06-21 12:28 ` Johannes Schindelin
2015-06-21 13:07   ` Richard Weinberger
2015-06-21 13:59     ` Christoph Hellwig
2015-06-21 14:08       ` Richard Weinberger
2015-06-22  0:35     ` Theodore Ts'o
2015-06-22 11:19       ` Richard Weinberger [this message]
2015-06-22 12:31         ` Theodore Ts'o

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5587EF5F.90207@nod.at \
    --to=richard@nod.at \
    --cc=david@sigma-star.at \
    --cc=git@vger.kernel.org \
    --cc=johannes.schindelin@gmx.de \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.