linux-fsdevel.vger.kernel.org archive mirror
From: Dave Chinner <david@fromorbit.com>
To: Theodore Ts'o <tytso@mit.edu>
Cc: Amir Goldstein <amir73il@gmail.com>,
	Vijay Chidambaram <vijay@cs.utexas.edu>,
	lsf-pc@lists.linux-foundation.org,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	Jan Kara <jack@suse.cz>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Jayashree Mohan <jaya@cs.utexas.edu>,
	Filipe Manana <fdmanana@suse.com>, Chris Mason <clm@fb.com>,
	lwn@lwn.net
Subject: Re: [TOPIC] Extending the filesystem crash recovery guaranties contract
Date: Thu, 9 May 2019 12:36:23 +1000
Message-ID: <20190509023623.GU1454@dread.disaster.area>
In-Reply-To: <20190503095846.GE23724@mit.edu>

On Fri, May 03, 2019 at 05:58:46AM -0400, Theodore Ts'o wrote:
> On Fri, May 03, 2019 at 12:16:32AM -0400, Amir Goldstein wrote:
> > OK. we can leave that one for later.
> > Although I am not sure what the concern is.
> > If we are able to agree and document a LINK_ATOMIC flag,
> > what would be the down side of documenting a RENAME_ATOMIC
> > flag with same semantics? After all, as I said, this is what many users
> > already expect when renaming a temp file (as ext4 heuristics prove).
> 
> The problem is if the "temp file" has been hardlinked to 1000
> different directories, does the rename() have to guarantee that we
> have to make sure that the changes to all 1000 directories have been
> persisted to disk?

No.

Dependency creation is directional.

If a parent directory entry that points to an inode is modified,
then a dependency (via the inode link count modification) is created.
Modifying an inode does not create a dependency on the parent
directory, because the parent directory is not modified by
inode-specific changes.
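
As a toy illustration of that directionality (not code from any
filesystem; the class and method names are hypothetical), the sketch
below records a dependency only when a directory operation changes an
inode's link count. Inode-only changes record nothing.

```python
from collections import defaultdict


class DependencyTracker:
    """Toy model of directional dependency creation (hypothetical names)."""

    def __init__(self):
        # inode -> set of directories with uncommitted dirent changes
        # that modified this inode's link count
        self.pending_dirs = defaultdict(set)

    def modify_dirent(self, directory, inode):
        # Directory-side change: it touches the inode's link count, so a
        # dependency between the directory and the inode is recorded.
        self.pending_dirs[inode].add(directory)

    def modify_inode(self, inode):
        # Inode-side change (data, timestamps, ...): the parent directory
        # is not modified, so no dependency is recorded.
        pass

    def dependencies(self, inode):
        # Directories whose pending changes are tied to this inode.
        return set(self.pending_dirs.get(inode, ()))


tracker = DependencyTracker()
tracker.modify_inode("tmpfile")            # e.g. a write to the file
assert tracker.dependencies("tmpfile") == set()
tracker.modify_dirent("dirA", "tmpfile")   # e.g. a link or rename into dirA
assert tracker.dependencies("tmpfile") == {"dirA"}
```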

Yes, sometimes the dependency graph will resolve to fsync other
directories, e.g. because hardlinks to the same inode were created
and this is the first fsync on the inode that stabilises the link
count. Because the link count is being stabilised, all the current
dependencies on that link count (i.e. all the directories with
uncommitted dirent modifications that modified the link count in
that inode) /may/ be included in the fsync.

However, if the filesystem tracks every change to the inode link
count as a separate modification, it only needs to commit the directory
modifications that occurred /before/ the one being fsync'd. i.e.
SOMC doesn't require "sync the world" behaviour, it's just that we
have filesystems that currently behave that way because it's a
simple and efficient way of tracking and resolving ordering
dependencies.
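
A minimal sketch of the difference, assuming a per-inode log in which
each link count change gets a sequence number (the LinkCountLog class
and its methods are hypothetical, and treating the fsync'd change
itself as part of the committed prefix is an assumption of this
sketch): the coarse variant commits everything pending, the ordered
variant commits only the prefix up to the change being fsync'd.

```python
class LinkCountLog:
    """Toy per-inode log of uncommitted link count changes (hypothetical)."""

    def __init__(self):
        self.pending = []    # [(seq, directory)] in modification order
        self.next_seq = 0

    def record(self, directory):
        """Record a dirent change that modified the link count."""
        seq = self.next_seq
        self.next_seq += 1
        self.pending.append((seq, directory))
        return seq

    def fsync_coarse(self):
        """Commit every pending change ("sync the world" for this inode)."""
        committed = [d for _, d in self.pending]
        self.pending = []
        return committed

    def fsync_ordered(self, seq):
        """Commit only the changes up to and including `seq`."""
        committed = [d for s, d in self.pending if s <= seq]
        self.pending = [(s, d) for s, d in self.pending if s > seq]
        return committed


log = LinkCountLog()
a = log.record("dirA")
b = log.record("dirB")
c = log.record("dirC")
print(log.fsync_ordered(b))  # ['dirA', 'dirB']
print(log.fsync_coarse())    # ['dirC']
```

In this model the later change to dirC stays pending after the ordered
fsync, which is the isolation the finer-grained tracking buys.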

IOWs, SOMC is all about cross-object dependencies and how they are
resolved. If you have no cross-object dependencies or your
operations are isolated to a non-shared set of objects, then SOMC
allows them to operate in 100% isolation from everything else and the
filesystem can optimise this in whatever way it wants.

SOMC is not the end of the world, Ted. It's just a consistency model
that has been proposed that could allow substantial optimisation of
application operations and filesystem behaviour. You're free to go
in other directions if you want - diversity is good. :)

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
