From: Andreas Dilger <adilger@dilger.ca>
To: 焦晓冬 <milestonejxd@gmail.com>
Cc: Dave Chinner <david@fromorbit.com>,
	cmumford@cmumford.com, linux-btrfs <linux-btrfs@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Ext4 Developers List <linux-ext4@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: metadata operation reordering regards to crash
Date: Sat, 15 Sep 2018 12:04:51 -0600
Message-ID: <22C71398-EFD7-4638-AAE4-CE7E30E95B7E@dilger.ca>
In-Reply-To: <CAJDTihx2MEpdWdp6V6AsOD35Lh_6R=uFXBGtO8fL8s0o5C_t8Q@mail.gmail.com>

On Sep 15, 2018, at 12:58 AM, 焦晓冬 <milestonejxd@gmail.com> wrote:
> 
> On Sat, Sep 15, 2018 at 6:23 AM Dave Chinner <david@fromorbit.com> wrote:
>> 
>> On Fri, Sep 14, 2018 at 05:06:44PM +0800, 焦晓冬 wrote:
>>> Hi, all,
>>> 
>>> A probably somewhat complex question:
>>> Do today's practical filesystems, e.g., extX, btrfs, preserve metadata
>>> operation order through a crash/power failure?
>> 
>> Yes.
>> 
>> Behaviour is filesystem dependent, but we have tests in fstests that
>> specifically exercise order preservation across filesystem failures.
>> 
>>> What I know is modern filesystems ensure metadata consistency
>>> after crash/power failure. Journal filesystems like extX do that by
>>> write-ahead logging of metadata operations into transactions. Other
>>> filesystems do that in various ways; btrfs, for example, does it by COW.
>>> 
>>> What I'm not yet clear on is whether these filesystems preserve
>>> metadata operation order after a crash.
>>> 
>>> For example,
>>> op 1.  rename(A, B)
>>> op 2.  rename(C, D)
>>> 
>>> As mentioned above, metadata consistency is ensured after a crash.
>>> Thus, B is either the original B (or does not exist) or has been replaced by A.
>>> The same goes for D.
>>> 
>>> Is it possible that, after a crash, D has been replaced by C but B is still
>>> the original file (or does not exist)?
>> 
>> Not for XFS, ext4, btrfs or f2fs. Other filesystems might be
>> different.
> 
> Thanks, Dave,
> 
> I found this archive:
> https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg31937.html
> 
> It seems the btrfs people think reordering could happen.
> 
> It is a relatively old reply. Has the implementation changed? Or is there
> some new standard that requires that reordering not happen?

There is nothing in POSIX that requires any particular ordering.  However,
the sequence "A, B, C, sync C" on ext3/ext4 has "always" resulted in A, B
also being sync'd to disk (including parent directory creation, etc).

For a while, ext4 with delayed allocation could leave "B" without any data
after "write A, rename A->B" followed by a crash (worked around in commit
v2.6.29-5120-g8750c6d).  Although applications that rely on this are
depending on non-POSIX behaviour, the operation ordering behaviour has been
around so long that applications have grown to depend on it, and consider
the filesystem to have a bug when it doesn't behave that way.
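
As an illustration of the "write A, rename A->B" case, here is a minimal
Python sketch of replacing a file without depending on the filesystem's
ordering behaviour. The function name durable_replace and the ".tmp" suffix
are made up for this example, and os.O_DIRECTORY assumes Linux; treat it as
a sketch under those assumptions, not as the one correct recipe.

```python
import os


def durable_replace(path: str, data: bytes) -> None:
    """Replace `path` with `data` without relying on filesystem ordering."""
    directory = os.path.dirname(os.path.abspath(path))
    tmp_path = path + ".tmp"  # hypothetical temp-name convention

    # 1. Write the new contents to a temporary file ("A") and force them
    #    to stable storage before the name is switched over.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)
    finally:
        os.close(fd)

    # 2. Atomically rename A -> B.
    os.rename(tmp_path, path)

    # 3. fsync the parent directory so the rename itself is on disk
    #    (os.O_DIRECTORY is assumed to be available, i.e. Linux).
    dir_fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

With the explicit fsync() in step 1, "B" should never end up naming a file
whose data has not reached the disk, whether or not the filesystem happens
to preserve ordering.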

If you want to write a robust application, you should fsync() the files you
care about (possibly with AIO so you get a notification on completion rather
than waiting).
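
A rough sketch of the "notification on completion" idea, in Python. Note
that this does not use kernel AIO: it substitutes a worker thread that calls
fsync() and then invokes a callback, which gives a similar "don't block the
caller" effect. The names fsync_async and on_done are invented for this
example.

```python
import os
from concurrent.futures import Future, ThreadPoolExecutor

# Small pool of worker threads that block in fsync() on the caller's behalf.
_pool = ThreadPoolExecutor(max_workers=2)


def fsync_async(fd: int, on_done) -> Future:
    """Run os.fsync(fd) in a worker thread, then call on_done(error).

    `error` is None on success, or the exception raised by fsync().
    The caller must keep `fd` open until on_done has been called.
    """
    future = _pool.submit(os.fsync, fd)

    def _notify(f: Future) -> None:
        # f.exception() returns None if fsync() succeeded.
        on_done(f.exception())

    future.add_done_callback(_notify)
    return future
```

The caller can keep working after fsync_async() returns and only treat the
data as durable once on_done has been called with None.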

Cheers, Andreas





