linux-api.vger.kernel.org archive mirror
From: Trond Myklebust <trondmy@hammerspace.com>
To: "bfields@fieldses.org" <bfields@fieldses.org>
Cc: "zohar@linux.ibm.com" <zohar@linux.ibm.com>,
	"djwong@kernel.org" <djwong@kernel.org>,
	"xiubli@redhat.com" <xiubli@redhat.com>,
	"brauner@kernel.org" <brauner@kernel.org>,
	"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
	"linux-api@vger.kernel.org" <linux-api@vger.kernel.org>,
	"neilb@suse.de" <neilb@suse.de>,
	"david@fromorbit.com" <david@fromorbit.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"jlayton@kernel.org" <jlayton@kernel.org>,
	"chuck.lever@oracle.com" <chuck.lever@oracle.com>,
	"linux-ceph@vger.kernel.org" <linux-ceph@vger.kernel.org>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	"tytso@mit.edu" <tytso@mit.edu>,
	"viro@zeniv.linux.org.uk" <viro@zeniv.linux.org.uk>,
	"jack@suse.cz" <jack@suse.cz>,
	"linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"lczerner@redhat.com" <lczerner@redhat.com>,
	"adilger.kernel@dilger.ca" <adilger.kernel@dilger.ca>,
	"walters@verbum.org" <walters@verbum.org>
Subject: Re: [PATCH v3 1/7] iversion: update comments with info about atime updates
Date: Tue, 30 Aug 2022 15:43:13 +0000
Message-ID: <3e8c7af5d39870c5b0dc61736a79bd134be5a9b3.camel@hammerspace.com>
In-Reply-To: <20220830151715.GE26330@fieldses.org>

On Tue, 2022-08-30 at 11:17 -0400, J. Bruce Fields wrote:
> On Tue, Aug 30, 2022 at 02:58:27PM +0000, Trond Myklebust wrote:
> > On Tue, 2022-08-30 at 10:44 -0400, J. Bruce Fields wrote:
> > > On Tue, Aug 30, 2022 at 09:50:02AM -0400, Jeff Layton wrote:
> > > > On Tue, 2022-08-30 at 09:24 -0400, J. Bruce Fields wrote:
> > > > > On Tue, Aug 30, 2022 at 07:40:02AM -0400, Jeff Layton wrote:
> > > > > > Yes, saying only that it must be different is intentional. What we
> > > > > > really want is for consumers to treat this as an opaque value for the
> > > > > > most part [1]. Therefore an implementation based on hashing would
> > > > > > conform to the spec, I'd think, as long as all of the relevant info is
> > > > > > part of the hash.
> > > > > 
> > > > > It'd conform, but it might not be as useful as an increasing value.
> > > > > 
> > > > > E.g. a client can use that to work out which of a series of reordered
> > > > > write replies is the most recent, and I seem to recall that can prevent
> > > > > unnecessary invalidations in some cases.
> > > > > 
> > > > 
> > > > That's a good point; the linux client does this. That said, NFSv4 has a
> > > > way for the server to advertise its change attribute behavior [1]
> > > > (though nfsd hasn't implemented this yet).
> > > 
> > > It was implemented and reverted.  The issue was that I thought nfsd
> > > should mix in the ctime to prevent the change attribute going backwards
> > > on reboot (see fs/nfsd/nfsfh.h:nfsd4_change_attribute()), but Trond was
> > > concerned about the possibility of time going backwards.  See
> > > 1631087ba872 "Revert "nfsd4: support change_attr_type attribute"".
> > > There's some mailing list discussion to that I'm not turning up right
> > > now.
> 
> https://lore.kernel.org/linux-nfs/a6294c25cb5eb98193f609a52aa8f4b5d4e81279.camel@hammerspace.com/
> is what I was thinking of but it isn't actually that interesting.
> 
> > My main concern was that some filesystems (e.g. ext3) were failing to
> > provide sufficient timestamp resolution to actually label the resulting
> > 'change attribute' as being updated monotonically. If the time stamp
> > doesn't change when the file data or metadata are changed, then the
> > client has to perform extra checks to try to figure out whether or not
> > its caches are up to date.
> 
> That's a different issue from the one you were raising in that
> discussion.
> 
> > > Did NFSv4 add change_attr_type because some implementations needed the
> > > unordered case, or because they realized ordering was useful but wanted
> > > to keep backwards compatibility?  I don't know which it was.
> > 
> > We implemented it because, as implied above, knowledge of whether or
> > not the change attribute behaves monotonically, or strictly
> > monotonically, enables a number of optimisations.
> 
> Of course, but my question was about the value of the old behavior, not
> about the value of the monotonic behavior.
> 
> Put differently, if we could redesign the protocol from scratch would we
> actually have included the option of non-monotonic behavior?
> 

If we could design the filesystems from scratch, we probably would not.
The protocol ended up being as it is because people were trying to make
it as easy to implement as possible.

So if we could design the filesystem from scratch, we would probably
have designed it along the lines of what AFS does, i.e. each explicit
change is accompanied by a single bump of the change attribute, so that
the clients can not only decide the order of the resulting changes, but
also detect whether they have missed a change (that might have been
made by a different client).
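
To make the "one bump per change" idea concrete, here is a minimal
sketch of the client-side bookkeeping it would allow. It is only an
illustration under stated assumptions: the ChangeTracker class and its
method names are hypothetical, it is not how the Linux NFS client or AFS
is implemented, and it assumes the server bumps the change attribute by
exactly one for every explicit change and returns the post-change value
in each reply.

```python
class ChangeTracker:
    """Hypothetical client-side bookkeeping for one cached file.

    Assumes the server bumps the change attribute by exactly one per
    explicit change (the AFS-like behaviour described above).
    """

    def __init__(self, base: int) -> None:
        self.base = base      # change attribute when the cache was last validated
        self.own_changes = 0  # replies received for our own changes since then
        self.latest = base    # highest change attribute seen in any reply

    def on_reply(self, change_attr: int) -> bool:
        """Record the reply to one of our own changes.

        Returns True if this reply is the most recent one seen so far,
        which lets the caller ignore attributes from reordered, older
        replies.
        """
        self.own_changes += 1
        if change_attr > self.latest:
            self.latest = change_attr
            return True
        return False

    def missed_changes(self) -> int:
        """Number of changes made by someone else since the cache was validated.

        Only meaningful once every outstanding request has been answered;
        a non-zero result means the cached data cannot be trusted.
        """
        return self.latest - self.base - self.own_changes
```

With a starting value of 5, two of our own writes answered out of order
as 7 and then 6 leave missed_changes() at 0, so the cache stays valid.
Replies of 8 and 6 instead leave it at 1, which means some other client
changed the file in between. A hash-based or otherwise unordered change
attribute would support neither check.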

However that would be a requirement that is likely to be very specific
to distributed caches (and hence distributed filesystems). I doubt
there are many user space applications that would need that high
precision. Maybe MPI, but that's the only candidate I can think of for
now?

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com




Thread overview: 46+ messages
2022-08-26 21:46 [PATCH v3 0/7] vfs: clean up i_version behavior and expose it via statx Jeff Layton
2022-08-26 21:46 ` [PATCH v3 1/7] iversion: update comments with info about atime updates Jeff Layton
2022-08-29  7:56   ` Dave Chinner
2022-08-29 10:39     ` Jeff Layton
2022-08-29 22:58       ` NeilBrown
2022-08-30 11:40         ` Jeff Layton
2022-08-30 13:24           ` J. Bruce Fields
2022-08-30 13:50             ` Jeff Layton
2022-08-30 14:44               ` J. Bruce Fields
2022-08-30 14:58                 ` Trond Myklebust
2022-08-30 15:17                   ` J. Bruce Fields
2022-08-30 15:43                     ` Trond Myklebust [this message]
2022-08-30 17:02                       ` Jeff Layton
2022-08-30 17:47                         ` Trond Myklebust
2022-08-30 17:53                           ` Jeff Layton
2022-08-30 18:25                             ` Trond Myklebust
2022-08-30 19:11                               ` Jeff Layton
2022-08-30 18:32                         ` J. Bruce Fields
2022-08-30 19:30                           ` Jeff Layton
2022-08-30 19:46                             ` J. Bruce Fields
2022-08-30 19:57                               ` Jeff Layton
2022-08-30 20:08                                 ` J. Bruce Fields
2022-08-30  1:04       ` Dave Chinner
2022-08-30 12:38         ` Jeff Layton
2022-08-26 21:46 ` [PATCH v3 2/7] ext4: fix i_version handling in ext4 Jeff Layton
2022-08-26 21:46 ` [PATCH v3 3/7] ext4: unconditionally enable the i_version counter Jeff Layton
2022-08-29 14:51   ` Jan Kara
2022-08-26 21:47 ` [PATCH v3 4/7] xfs: don't bump the i_version on an atime update in xfs_vn_update_time Jeff Layton
2022-08-27  7:26   ` Amir Goldstein
2022-08-27  8:01     ` Amir Goldstein
2022-08-27 13:14       ` Jeff Layton
2022-08-27 15:46         ` Darrick J. Wong
2022-08-27 16:03           ` Trond Myklebust
2022-08-27 16:10             ` Jeff Layton
2022-08-27 17:06               ` Trond Myklebust
2022-08-28 13:25               ` Amir Goldstein
2022-08-28 14:37                 ` Jeff Layton
2022-08-28 16:53                   ` Amir Goldstein
2022-08-29  5:48                   ` Dave Chinner
2022-08-29 10:33                     ` Jeff Layton
2022-08-30  0:08                       ` Dave Chinner
2022-08-30 11:20                         ` Jeff Layton
2022-08-28 17:30           ` Amir Goldstein
2022-08-26 21:47 ` [PATCH v3 5/7] vfs: report an inode version in statx for IS_I_VERSION inodes Jeff Layton
2022-08-26 21:47 ` [PATCH v3 6/7] nfs: report the inode version in statx if requested Jeff Layton
2022-08-26 21:47 ` [PATCH v3 7/7] ceph: fill in the change attribute in statx requests Jeff Layton
