All of lore.kernel.org
 help / color / mirror / Atom feed
From: bfields <bfields@fieldses.org>
To: Daire Byrne <daire@dneg.com>
Cc: Trond Myklebust <trondmy@hammerspace.com>,
	linux-cachefs <linux-cachefs@redhat.com>,
	linux-nfs <linux-nfs@vger.kernel.org>
Subject: Re: Adventures in NFS re-exporting
Date: Fri, 13 Nov 2020 17:26:00 -0500	[thread overview]
Message-ID: <20201113222600.GC1299@fieldses.org> (raw)
In-Reply-To: <20201113145050.GB1299@fieldses.org>

On Fri, Nov 13, 2020 at 09:50:50AM -0500, bfields wrote:
> On Thu, Nov 12, 2020 at 11:05:57PM +0000, Daire Byrne wrote:
> > So, I can't lay claim to identifying the exact optimisation/hack that
> > improves the retention of the re-export server's client cache when
> > re-exporting an NFSv3 server (which is then read by many clients). We
> > were working with an engineer at the time who showed an interest in
> > our use case and after we supplied a reproducer he suggested modifying
> > the nfs/inode.c
> > 
> > -		if (!inode_eq_iversion_raw(inode, fattr->change_attr)) {
> > +		if (inode_peek_iversion_raw(inode) < fattr->change_attr)
> > {
> > 
> > His reasoning at the time was:
> > 
> > "Fixes inode invalidation caused by read access. The least important
> > bit is ORed with 1 and causes the inode version to differ from the one
> > seen on the NFS share. This in turn causes unnecessary re-download
> > impacting the performance significantly. This fix makes it only
> > re-fetch file content if inode version seen on the server is newer
> > than the one on the client."
> > 
> > But I've always been puzzled by why this only seems to be the case
> > when using knfsd to re-export the (NFSv3) client mount. Using multiple
> > processes on a standard client mount never causes any similar
> > re-validations. And this happens with a completely read-only share
> > which is why I started to think it has something to do with atimes as
> > that could perhaps still cause a "write" modification even when
> > read-only?
> 
> Ah-hah!  So, it's inode_query_iversion() that's modifying a nfs inode's
> i_version.  That's a special thing that only nfsd would do.
> 
> I think that's totally fixable, we'll just have to think a little about
> how....

I wonder if something like this helps?--b.

commit 0add88a9ccc5
Author: J. Bruce Fields <bfields@redhat.com>
Date:   Fri Nov 13 17:03:04 2020 -0500

    nfs: don't mangle i_version on NFS
    
    The i_version on NFS has pretty much opaque to the client, so we don't
    want to give the low bit any special interpretation.
    
    Define a new FS_PRIVATE_I_VERSION flag for filesystems that manage the
    i_version on their own.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>

diff --git a/fs/nfs/fs_context.c b/fs/nfs/fs_context.c
index 29ec8b09a52d..9b8dd5b713a7 100644
--- a/fs/nfs/fs_context.c
+++ b/fs/nfs/fs_context.c
@@ -1488,7 +1488,8 @@ struct file_system_type nfs_fs_type = {
 	.init_fs_context	= nfs_init_fs_context,
 	.parameters		= nfs_fs_parameters,
 	.kill_sb		= nfs_kill_super,
-	.fs_flags		= FS_RENAME_DOES_D_MOVE|FS_BINARY_MOUNTDATA,
+	.fs_flags		= FS_RENAME_DOES_D_MOVE|FS_BINARY_MOUNTDATA|
+				  FS_PRIVATE_I_VERSION,
 };
 MODULE_ALIAS_FS("nfs");
 EXPORT_SYMBOL_GPL(nfs_fs_type);
@@ -1500,7 +1501,8 @@ struct file_system_type nfs4_fs_type = {
 	.init_fs_context	= nfs_init_fs_context,
 	.parameters		= nfs_fs_parameters,
 	.kill_sb		= nfs_kill_super,
-	.fs_flags		= FS_RENAME_DOES_D_MOVE|FS_BINARY_MOUNTDATA,
+	.fs_flags		= FS_RENAME_DOES_D_MOVE|FS_BINARY_MOUNTDATA|
+				  FS_PRIVATE_I_VERSION,
 };
 MODULE_ALIAS_FS("nfs4");
 MODULE_ALIAS("nfs4");
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 21cc971fd960..c5bb4268228b 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2217,6 +2217,7 @@ struct file_system_type {
 #define FS_HAS_SUBTYPE		4
 #define FS_USERNS_MOUNT		8	/* Can be mounted by userns root */
 #define FS_DISALLOW_NOTIFY_PERM	16	/* Disable fanotify permission events */
+#define FS_PRIVATE_I_VERSION	32	/* i_version managed by filesystem */
 #define FS_THP_SUPPORT		8192	/* Remove once all fs converted */
 #define FS_RENAME_DOES_D_MOVE	32768	/* FS will handle d_move() during rename() internally. */
 	int (*init_fs_context)(struct fs_context *);
diff --git a/include/linux/iversion.h b/include/linux/iversion.h
index 2917ef990d43..52c790a847de 100644
--- a/include/linux/iversion.h
+++ b/include/linux/iversion.h
@@ -307,6 +307,8 @@ inode_query_iversion(struct inode *inode)
 	u64 cur, old, new;
 
 	cur = inode_peek_iversion_raw(inode);
+	if (inode->i_sb->s_type->fs_flags & FS_PRIVATE_I_VERSION)
+		return cur;
 	for (;;) {
 		/* If flag is already set, then no need to swap */
 		if (cur & I_VERSION_QUERIED) {

  reply	other threads:[~2020-11-13 22:26 UTC|newest]

Thread overview: 129+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-07 17:31 Adventures in NFS re-exporting Daire Byrne
2020-09-08  9:40 ` Mkrtchyan, Tigran
2020-09-08 11:06   ` Daire Byrne
2020-09-15 17:21 ` J. Bruce Fields
2020-09-15 19:59   ` Trond Myklebust
2020-09-16 16:01     ` Daire Byrne
2020-10-19 16:19       ` Daire Byrne
2020-10-19 17:53         ` [PATCH 0/2] Add NFSv3 emulation of the lookupp operation trondmy
2020-10-19 17:53           ` [PATCH 1/2] NFSv3: Refactor nfs3_proc_lookup() to split out the dentry trondmy
2020-10-19 17:53             ` [PATCH 2/2] NFSv3: Add emulation of the lookupp() operation trondmy
2020-10-19 20:05         ` [PATCH v2 0/2] Add NFSv3 emulation of the lookupp operation trondmy
2020-10-19 20:05           ` [PATCH v2 1/2] NFSv3: Refactor nfs3_proc_lookup() to split out the dentry trondmy
2020-10-19 20:05             ` [PATCH v2 2/2] NFSv3: Add emulation of the lookupp() operation trondmy
2020-10-20 18:37         ` [PATCH v3 0/3] Add NFSv3 emulation of the lookupp operation trondmy
2020-10-20 18:37           ` [PATCH v3 1/3] NFSv3: Refactor nfs3_proc_lookup() to split out the dentry trondmy
2020-10-20 18:37             ` [PATCH v3 2/3] NFSv3: Add emulation of the lookupp() operation trondmy
2020-10-20 18:37               ` [PATCH v3 3/3] NFSv4: Observe the NFS_MOUNT_SOFTREVAL flag in _nfs4_proc_lookupp trondmy
2020-10-21  9:33         ` Adventures in NFS re-exporting Daire Byrne
2020-11-09 16:02           ` bfields
2020-11-12 13:01             ` Daire Byrne
2020-11-12 13:57               ` bfields
2020-11-12 18:33                 ` Daire Byrne
2020-11-12 20:55                   ` bfields
2020-11-12 23:05                     ` Daire Byrne
2020-11-13 14:50                       ` bfields
2020-11-13 22:26                         ` bfields [this message]
2020-11-14 12:57                           ` Daire Byrne
2020-11-16 15:18                             ` bfields
2020-11-16 15:53                             ` bfields
2020-11-16 19:21                               ` Daire Byrne
2020-11-16 15:29                           ` Jeff Layton
2020-11-16 15:56                             ` bfields
2020-11-16 16:03                               ` Jeff Layton
2020-11-16 16:14                                 ` bfields
2020-11-16 16:38                                   ` Jeff Layton
2020-11-16 19:03                                     ` bfields
2020-11-16 20:03                                       ` Jeff Layton
2020-11-17  3:16                                         ` bfields
2020-11-17  3:18                                           ` [PATCH 1/4] nfsd: move fill_{pre,post}_wcc to nfsfh.c J. Bruce Fields
2020-11-17  3:18                                             ` [PATCH 2/4] nfsd: pre/post attr is using wrong change attribute J. Bruce Fields
2020-11-17 12:34                                               ` Jeff Layton
2020-11-17 15:26                                                 ` J. Bruce Fields
2020-11-17 15:34                                                   ` Jeff Layton
2020-11-20 22:38                                                     ` J. Bruce Fields
2020-11-20 22:39                                                       ` [PATCH 1/8] nfsd: only call inode_query_iversion in the I_VERSION case J. Bruce Fields
2020-11-20 22:39                                                         ` [PATCH 2/8] nfsd: simplify nfsd4_change_info J. Bruce Fields
2020-11-20 22:39                                                         ` [PATCH 3/8] nfsd: minor nfsd4_change_attribute cleanup J. Bruce Fields
2020-11-21  0:34                                                           ` Jeff Layton
2020-11-20 22:39                                                         ` [PATCH 4/8] nfsd4: don't query change attribute in v2/v3 case J. Bruce Fields
2020-11-20 22:39                                                         ` [PATCH 5/8] nfs: use change attribute for NFS re-exports J. Bruce Fields
2020-11-20 22:39                                                         ` [PATCH 6/8] nfsd: move change attribute generation to filesystem J. Bruce Fields
2020-11-21  0:58                                                           ` Jeff Layton
2020-11-21  1:01                                                             ` J. Bruce Fields
2020-11-21 13:00                                                           ` Jeff Layton
2020-11-20 22:39                                                         ` [PATCH 7/8] nfsd: skip some unnecessary stats in the v4 case J. Bruce Fields
2020-11-20 22:39                                                         ` [PATCH 8/8] Revert "nfsd4: support change_attr_type attribute" J. Bruce Fields
2020-11-20 22:44                                                       ` [PATCH 2/4] nfsd: pre/post attr is using wrong change attribute J. Bruce Fields
2020-11-21  1:03                                                         ` Jeff Layton
2020-11-21 21:44                                                           ` Daire Byrne
2020-11-22  0:02                                                             ` bfields
2020-11-22  1:55                                                               ` Daire Byrne
2020-11-22  3:03                                                                 ` bfields
2020-11-23 20:07                                                                   ` Daire Byrne
2020-11-17 15:25                                               ` J. Bruce Fields
2020-11-17  3:18                                             ` [PATCH 3/4] nfs: don't mangle i_version on NFS J. Bruce Fields
2020-11-17 12:27                                               ` Jeff Layton
2020-11-17 14:14                                                 ` J. Bruce Fields
2020-11-17  3:18                                             ` [PATCH 4/4] nfs: support i_version in the NFSv4 case J. Bruce Fields
2020-11-17 12:34                                               ` Jeff Layton
2020-11-24 20:35               ` Adventures in NFS re-exporting Daire Byrne
2020-11-24 21:15                 ` bfields
2020-11-24 22:15                   ` Frank Filz
2020-11-25 14:47                     ` 'bfields'
2020-11-25 16:25                       ` Frank Filz
2020-11-25 19:03                         ` 'bfields'
2020-11-26  0:04                           ` Frank Filz
2020-11-25 17:14                   ` Daire Byrne
2020-11-25 19:31                     ` bfields
2020-12-03 12:20                     ` Daire Byrne
2020-12-03 18:51                       ` bfields
2020-12-03 20:27                         ` Trond Myklebust
2020-12-03 21:13                           ` bfields
2020-12-03 21:32                             ` Frank Filz
2020-12-03 21:34                             ` Trond Myklebust
2020-12-03 21:45                               ` Frank Filz
2020-12-03 21:57                                 ` Trond Myklebust
2020-12-03 22:04                                   ` bfields
2020-12-03 22:14                                     ` Trond Myklebust
2020-12-03 22:39                                       ` Frank Filz
2020-12-03 22:50                                         ` Trond Myklebust
2020-12-03 23:34                                           ` Frank Filz
2020-12-03 22:44                                       ` bfields
2020-12-03 21:54                               ` bfields
2020-12-03 22:45                               ` bfields
2020-12-03 22:53                                 ` Trond Myklebust
2020-12-03 23:16                                   ` bfields
2020-12-03 23:28                                     ` Frank Filz
2020-12-04  1:02                                     ` Trond Myklebust
2020-12-04  1:41                                       ` bfields
2020-12-04  2:27                                         ` Trond Myklebust
2020-09-17 16:01   ` Daire Byrne
2020-09-17 19:09     ` bfields
2020-09-17 20:23       ` Frank van der Linden
2020-09-17 21:57         ` bfields
2020-09-19 11:08           ` Daire Byrne
2020-09-22 16:43         ` Chuck Lever
2020-09-23 20:25           ` Daire Byrne
2020-09-23 21:01             ` Frank van der Linden
2020-09-26  9:00               ` Daire Byrne
2020-09-28 15:49                 ` Frank van der Linden
2020-09-28 16:08                   ` Chuck Lever
2020-09-28 17:42                     ` Frank van der Linden
2020-09-22 12:31 ` Daire Byrne
2020-09-22 13:52   ` Trond Myklebust
2020-09-23 12:40     ` J. Bruce Fields
2020-09-23 13:09       ` Trond Myklebust
2020-09-23 17:07         ` bfields
2020-09-30 19:30   ` [Linux-cachefs] " Jeff Layton
2020-10-01  0:09     ` Daire Byrne
2020-10-01 10:36       ` Jeff Layton
2020-10-01 12:38         ` Trond Myklebust
2020-10-01 16:39           ` Jeff Layton
2020-10-05 12:54         ` Daire Byrne
2020-10-13  9:59           ` Daire Byrne
2020-10-01 18:41     ` J. Bruce Fields
2020-10-01 19:24       ` Trond Myklebust
2020-10-01 19:26         ` bfields
2020-10-01 19:29           ` Trond Myklebust
2020-10-01 19:51             ` bfields

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201113222600.GC1299@fieldses.org \
    --to=bfields@fieldses.org \
    --cc=daire@dneg.com \
    --cc=linux-cachefs@redhat.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=trondmy@hammerspace.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.