From: Amir Goldstein <amir73il@gmail.com>
To: lsf-pc@lists.linux-foundation.org
Cc: Dave Chinner <david@fromorbit.com>, Theodore Tso <tytso@mit.edu>,
	Jan Kara <jack@suse.cz>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Jayashree Mohan <jaya@cs.utexas.edu>,
	Vijaychidambaram Velayudhan Pillai <vijay@cs.utexas.edu>,
	Filipe Manana <fdmanana@suse.com>
Subject: [TOPIC] Extending the filesystem crash recovery guaranties contract
Date: Sat, 27 Apr 2019 17:00:14 -0400
Message-ID: <CAOQ4uxjZm6E2TmCv8JOyQr7f-2VB0uFRy7XEp8HBHQmMdQg+6w@mail.gmail.com>

Suggestion for another filesystems track topic.

Some of you may remember the emotional(?) discussions that ensued
when the CrashMonkey developers embarked on a mission to document
and verify filesystem crash recovery guarantees:

https://lore.kernel.org/linux-fsdevel/CAOQ4uxj8YpYPPdEvAvKPKXO7wdBg6T1O3osd6fSPFKH9j=i2Yg@mail.gmail.com/

There are two camps among filesystem developers, and each camp
has good arguments: one for wanting to document existing behavior,
the other for not wanting to document anything beyond
"use fsync if you want any guarantee".

I would like to take a suggestion proposed by Jan in a related discussion:
https://lore.kernel.org/linux-fsdevel/CAOQ4uxjQx+TO3Dt7TA3ocXnNxbr3+oVyJLYUSpv4QCt_Texdvw@mail.gmail.com/

and make a proposal that may be able to meet the concerns of
both camps.

The proposal is to add new APIs which communicate
crash consistency requirements of the application to the filesystem.

An example API could look like this:
renameat2(..., RENAME_METADATA_BARRIER | RENAME_DATA_BARRIER)
It's just an example. The API could take another form and may need
more barrier types (I proposed using new sync_file_range() flags).

The idea is simple though.
METADATA_BARRIER means that if the rename is observed after a crash,
all of the inode's metadata will be observed as well.
DATA_BARRIER provides the same guarantee for file data.
We may also want an "ALL_METADATA_BARRIER" and/or
"METADATA_DEPENDENCY_BARRIER" to more accurately
describe what SOMC (strictly ordered metadata consistency)
guarantees actually provide today.
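
To make the application side concrete, here is a rough userspace sketch
of the common "write a temp file, then rename it over the target"
pattern using the example flags. The two flag names come from the
example above, but their bit values are made up for illustration, as
are the helper names write_all(), rename_with_barrier() and
replace_file_with_barrier(). No current kernel knows these flags, so
renameat2() is expected to reject them with EINVAL and the sketch falls
back to fsync() + rename(), which is what applications have to do today.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* HYPOTHETICAL flags: names from the proposal, bit values invented here. */
#define RENAME_METADATA_BARRIER (1U << 30)
#define RENAME_DATA_BARRIER     (1U << 31)

/* Write the whole buffer, retrying on short writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
	while (len > 0) {
		ssize_t n = write(fd, buf, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		buf += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Ask for a rename with both (hypothetical) barriers via the raw syscall. */
static int rename_with_barrier(const char *from, const char *to)
{
#ifdef SYS_renameat2
	return (int)syscall(SYS_renameat2, AT_FDCWD, from, AT_FDCWD, to,
			    RENAME_METADATA_BARRIER | RENAME_DATA_BARRIER);
#else
	errno = ENOSYS;
	return -1;
#endif
}

/*
 * Write buf to tmp, then rename tmp over dst.
 * Preferred path: a barrier rename (ordering only, no forced writeback).
 * Fallback path: fsync() before rename(), which is stronger and slower.
 */
int replace_file_with_barrier(const char *tmp, const char *dst,
			      const char *buf, size_t len)
{
	int saved;
	int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);

	if (fd < 0)
		return -1;
	if (write_all(fd, buf, len) < 0)
		goto fail;

	if (rename_with_barrier(tmp, dst) == 0)
		return close(fd);
	/* Unknown flags or no renameat2(): fall back to today's idiom. */
	if (errno != EINVAL && errno != ENOSYS)
		goto fail;

	if (fsync(fd) < 0)
		goto fail;
	if (rename(tmp, dst) < 0)
		goto fail;
	return close(fd);
fail:
	saved = errno;
	close(fd);
	errno = saved;
	return -1;
}
```

A caller would use it as
replace_file_with_barrier("conf.tmp", "conf", data, len). Note that the
fallback is not equivalent to the proposed barrier: fsync() forces the
data to stable storage immediately, while the barrier only asks that
the data and metadata not be observed missing if the rename itself
survives a crash.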

The implementation is also simple. Filesystems that currently
have SOMC behavior don't need to do anything to respect
METADATA_BARRIER and only need to call
filemap_write_and_wait_range() to respect DATA_BARRIER.
Filesystem developers are thus not tying their hands w.r.t. future
performance optimizations for operations that do not explicitly
request a barrier.

Thanks,
Amir.
