All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chuck Lever <chuck.lever@oracle.com>
To: Jeff Layton <jlayton@kernel.org>
Cc: Benjamin Coddington <bcodding@redhat.com>,
	trond.myklebust@hammerspace.com, anna@kernel.org,
	linux-nfs@vger.kernel.org
Subject: Re: [PATCH 2/2] NFSv4: Allow per-mount tuning of READDIR attrs
Date: Wed, 18 Oct 2023 10:25:05 -0400	[thread overview]
Message-ID: <ZS/qwYzQvrgJNEv6@tissot.1015granger.net> (raw)
In-Reply-To: <6157b73e380e5b625cd8ed0133ef392d0dd4bd8b.camel@kernel.org>

On Wed, Oct 18, 2023 at 09:33:40AM -0400, Jeff Layton wrote:
> On Wed, 2023-10-18 at 08:56 -0400, Chuck Lever wrote:
> > On Tue, Oct 17, 2023 at 05:30:44PM -0400, Benjamin Coddington wrote:
> > > Expose a per-mount knob in sysfs to set the READDIR requested attributes
> > > for a non-plus READDIR request.
> > > 
> > > For example:
> > > 
> > >   echo 0x800 0x800000 0x0 > /sys/fs/nfs/0\:57/v4_readdir_attrs
> > > 
> > > .. will revert the client to only request rdattr_error and
> > > mounted_on_fileid for any non "plus" READDIR, as before the patch
> > > preceeding this one in this series.  This provides existing installations
> > > an option to fix a potential performance regression that may occur after
> > > NFS clients update to request additional default READDIR attributes.
> > > 
> > > Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
> > > ---
> > >  fs/nfs/client.c           |  2 +
> > >  fs/nfs/nfs4client.c       |  4 ++
> > >  fs/nfs/nfs4proc.c         |  1 +
> > >  fs/nfs/nfs4xdr.c          |  7 ++--
> > >  fs/nfs/sysfs.c            | 81 +++++++++++++++++++++++++++++++++++++++
> > >  include/linux/nfs_fs_sb.h |  1 +
> > >  include/linux/nfs_xdr.h   |  1 +
> > >  7 files changed, 93 insertions(+), 4 deletions(-)
> > 
> > Admittedly, it would be much easier for humans to use if the API was
> > based on the symbolic names of the bits rather than a triplet of raw
> > hexadecimal values.
> > 
> 
> I think there are some significant footguns with this interface. It'd be
> very easy to set this wrong and get weird behavior.  OTOH, we could push
> that complexity into userland and provide some sort of script in nfs-
> utils for tuning this.
> 
> That said...
> 
> When we look at interfaces like this, we have to consider that they may
> be around for a long, long time (decades, even), and people will come to
> rely on them to do strange things that are difficult for us to support.
> If we have someone saying that their READDIR performance slowed down, we
> now have to grab those settings from this sysfs file and validate them
> when trying to help them.
> 
> Personally, I'd prefer a simple binary "make it work the old way"
> switch, if we're concerned about performance regressions here. I think
> that's the sort of thing that is simple to explain to admins that are
> suffering from this problem and (more importantly) the sort of setting
> we can later remove when it's no longer needed.
> 
> Adding this sort of fine-grained knob will create more problems than it
> solves, as people will (inevitably) use it incorrectly.

Totally agree with this assessment. It will get baked into wacky
knowledge base articles for decades. Voice of experience here ;-)

It's not yet clear sysadmins will even need a switch like this, so I
would go further and say you should hold off on merging anything
like it until there is an actual reported need for it.

Now, full control over that bitmap is still very neat thing for
experimentation by NFS developers. Hiding this behind a Kconfig
option would let you merge it but then turn it off in production
kernels.


> > > diff --git a/fs/nfs/client.c b/fs/nfs/client.c
> > > index 44eca51b2808..e9aa1fd4335f 100644
> > > --- a/fs/nfs/client.c
> > > +++ b/fs/nfs/client.c
> > > @@ -922,6 +922,8 @@ void nfs_server_copy_userdata(struct nfs_server *target, struct nfs_server *sour
> > >  	target->options = source->options;
> > >  	target->auth_info = source->auth_info;
> > >  	target->port = source->port;
> > > +	memcpy(target->readdir_attrs, source->readdir_attrs,
> > > +			sizeof(target->readdir_attrs));
> > >  }
> > >  EXPORT_SYMBOL_GPL(nfs_server_copy_userdata);
> > >  
> > > diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
> > > index 11e3a285594c..3bbfb4244c14 100644
> > > --- a/fs/nfs/nfs4client.c
> > > +++ b/fs/nfs/nfs4client.c
> > > @@ -1115,6 +1115,10 @@ static int nfs4_server_common_setup(struct nfs_server *server,
> > >  
> > >  	nfs4_server_set_init_caps(server);
> > >  
> > > +	/* Default (non-plus) v4 readdir attributes */
> > > +	server->readdir_attrs[0] = FATTR4_WORD0_TYPE|FATTR4_WORD0_RDATTR_ERROR;
> > > +	server->readdir_attrs[1] = FATTR4_WORD1_MOUNTED_ON_FILEID;
> > > +
> > >  	/* Probe the root fh to retrieve its FSID and filehandle */
> > >  	error = nfs4_get_rootfh(server, mntfh, auth_probe);
> > >  	if (error < 0)
> > > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > > index 7016eaadf555..0f0028de7941 100644
> > > --- a/fs/nfs/nfs4proc.c
> > > +++ b/fs/nfs/nfs4proc.c
> > > @@ -5113,6 +5113,7 @@ static int _nfs4_proc_readdir(struct nfs_readdir_arg *nr_arg,
> > >  		.pgbase = 0,
> > >  		.count = nr_arg->page_len,
> > >  		.plus = nr_arg->plus,
> > > +		.server = server,
> > >  	};
> > >  	struct nfs4_readdir_res res;
> > >  	struct rpc_message msg = {
> > > diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
> > > index 7200d6f7cd7b..45a9b40b801e 100644
> > > --- a/fs/nfs/nfs4xdr.c
> > > +++ b/fs/nfs/nfs4xdr.c
> > > @@ -1601,16 +1601,15 @@ static void encode_read(struct xdr_stream *xdr, const struct nfs_pgio_args *args
> > >  
> > >  static void encode_readdir(struct xdr_stream *xdr, const struct nfs4_readdir_arg *readdir, struct rpc_rqst *req, struct compound_hdr *hdr)
> > >  {
> > > -	uint32_t attrs[3] = {
> > > -		FATTR4_WORD0_TYPE|FATTR4_WORD0_RDATTR_ERROR,
> > > -		FATTR4_WORD1_MOUNTED_ON_FILEID,
> > > -	};
> > > +	uint32_t attrs[3];
> > >  	uint32_t dircount = readdir->count;
> > >  	uint32_t maxcount = readdir->count;
> > >  	__be32 *p, verf[2];
> > >  	uint32_t attrlen = 0;
> > >  	unsigned int i;
> > >  
> > > +	memcpy(attrs, readdir->server->readdir_attrs, sizeof(attrs));
> > > +
> > >  	if (readdir->plus) {
> > >  		attrs[0] |= FATTR4_WORD0_CHANGE|FATTR4_WORD0_SIZE|
> > >  			FATTR4_WORD0_FSID|FATTR4_WORD0_FILEHANDLE|FATTR4_WORD0_FILEID;
> > > diff --git a/fs/nfs/sysfs.c b/fs/nfs/sysfs.c
> > > index bf378ecd5d9f..6d4f52bf47b3 100644
> > > --- a/fs/nfs/sysfs.c
> > > +++ b/fs/nfs/sysfs.c
> > > @@ -270,7 +270,82 @@ shutdown_store(struct kobject *kobj, struct kobj_attribute *attr,
> > >  	return count;
> > >  }
> > >  
> > > +static ssize_t
> > > +v4_readdir_attrs_show(struct kobject *kobj, struct kobj_attribute *attr,
> > > +				char *buf)
> > > +{
> > > +	struct nfs_server *server;
> > > +	server = container_of(kobj, struct nfs_server, kobj);
> > > +
> > > +	return sysfs_emit(buf, "0x%x 0x%x 0x%x\n",
> > > +			server->readdir_attrs[0],
> > > +			server->readdir_attrs[1],
> > > +			server->readdir_attrs[2]);
> > > +}
> > > +
> > > +static ssize_t
> > > +v4_readdir_attrs_store(struct kobject *kobj, struct kobj_attribute *attr,
> > > +				const char *buf, size_t count)
> > > +{
> > > +	struct nfs_server *server;
> > > +	u32 attrs[3];
> > > +	char p[36], *v;
> > > +	size_t token = 0;
> > > +	int i;
> > > +
> > > +	if (count > 36)
> > > +		return -EINVAL;
> > > +
> > > +	v = strncpy(p, buf, count);
> > > +
> > > +	for (i = 0; i < 3; i++) {
> > > +		token += strcspn(v, " ") + 1;
> > > +		if (token > 34)
> > > +			return -EINVAL;
> > > +
> > > +		p[token - 1] = '\0';
> > > +		if (kstrtoint(v, 0, &attrs[i]))
> > > +			return -EINVAL;
> > > +		v = &p[token];
> > > +	}
> > > +
> > > +	/* Allow only what we decode - see decode_getfattr_attrs() */
> > > +	if (attrs[0] & ~(FATTR4_WORD0_TYPE |
> > > +			FATTR4_WORD0_CHANGE |
> > > +			FATTR4_WORD0_SIZE |
> > > +			FATTR4_WORD0_FSID |
> > > +			FATTR4_WORD0_RDATTR_ERROR |
> > > +			FATTR4_WORD0_FILEHANDLE |
> > > +			FATTR4_WORD0_FILEID |
> > > +			FATTR4_WORD0_FS_LOCATIONS) ||
> > > +		attrs[1] & ~(FATTR4_WORD1_MODE |
> > > +			FATTR4_WORD1_NUMLINKS |
> > > +			FATTR4_WORD1_OWNER |
> > > +			FATTR4_WORD1_OWNER_GROUP |
> > > +			FATTR4_WORD1_RAWDEV |
> > > +			FATTR4_WORD1_SPACE_USED |
> > > +			FATTR4_WORD1_TIME_ACCESS |
> > > +			FATTR4_WORD1_TIME_METADATA |
> > > +			FATTR4_WORD1_TIME_MODIFY |
> > > +			FATTR4_WORD1_MOUNTED_ON_FILEID) ||
> > > +		attrs[2] & ~(FATTR4_WORD2_MDSTHRESHOLD |
> > > +			FATTR4_WORD2_SECURITY_LABEL))
> > > +		return -EINVAL;
> > > +
> > > +	server = container_of(kobj, struct nfs_server, kobj);
> > > +
> > > +	if (attrs[0])
> > > +		server->readdir_attrs[0] = attrs[0];
> > > +	if (attrs[1])
> > > +		server->readdir_attrs[1] = attrs[1];
> > > +	if (attrs[2])
> > > +		server->readdir_attrs[2] = attrs[2];
> > > +
> > > +	return count;
> > > +}
> > > +
> > >  static struct kobj_attribute nfs_sysfs_attr_shutdown = __ATTR_RW(shutdown);
> > > +static struct kobj_attribute nfs_sysfs_attr_v4_readdir_attrs = __ATTR_RW(v4_readdir_attrs);
> > >  
> > >  #define RPC_CLIENT_NAME_SIZE 64
> > >  
> > > @@ -325,6 +400,12 @@ void nfs_sysfs_add_server(struct nfs_server *server)
> > >  	if (ret < 0)
> > >  		pr_warn("NFS: sysfs_create_file_ns for server-%d failed (%d)\n",
> > >  			server->s_sysfs_id, ret);
> > > +
> > > +	ret = sysfs_create_file_ns(&server->kobj, &nfs_sysfs_attr_v4_readdir_attrs.attr,
> > > +				nfs_netns_server_namespace(&server->kobj));
> > > +	if (ret < 0)
> > > +		pr_warn("NFS: sysfs_create_file_ns for server-%d failed (%d)\n",
> > > +			server->s_sysfs_id, ret);
> > >  }
> > >  EXPORT_SYMBOL_GPL(nfs_sysfs_add_server);
> > >  
> > > diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
> > > index cd628c4b011e..db87261b7cd7 100644
> > > --- a/include/linux/nfs_fs_sb.h
> > > +++ b/include/linux/nfs_fs_sb.h
> > > @@ -219,6 +219,7 @@ struct nfs_server {
> > >  						   of change attribute, size, ctime
> > >  						   and mtime attributes supported by
> > >  						   the server */
> > > +	u32			readdir_attrs[3]; /* V4 tuneable default readdir attrs */
> > >  	u32			acl_bitmask;	/* V4 bitmask representing the ACEs
> > >  						   that are supported on this
> > >  						   filesystem */
> > > diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
> > > index 12bbb5c63664..e05d861b1788 100644
> > > --- a/include/linux/nfs_xdr.h
> > > +++ b/include/linux/nfs_xdr.h
> > > @@ -1142,6 +1142,7 @@ struct nfs4_readdir_arg {
> > >  	struct page **			pages;	/* zero-copy data */
> > >  	unsigned int			pgbase;	/* zero-copy data */
> > >  	const u32 *			bitmask;
> > > +	const struct nfs_server		*server;
> > >  	bool				plus;
> > >  };
> > >  
> > > -- 
> > > 2.41.0
> > > 
> > 
> 
> -- 
> Jeff Layton <jlayton@kernel.org>

-- 
Chuck Lever

  parent reply	other threads:[~2023-10-18 14:27 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-17 21:30 [PATCH 0/2] NFSv4 READDIR d_type fixup Benjamin Coddington
2023-10-17 21:30 ` [PATCH 1/2] NFSv4: Always ask for type with READDIR Benjamin Coddington
2023-10-17 21:30 ` [PATCH 2/2] NFSv4: Allow per-mount tuning of READDIR attrs Benjamin Coddington
2023-10-18 12:56   ` Chuck Lever
2023-10-18 13:33     ` Jeff Layton
2023-10-18 14:24       ` Benjamin Coddington
2023-10-18 14:33         ` Chuck Lever
2023-10-18 14:25       ` Chuck Lever [this message]
2023-10-18 18:38         ` Anna Schumaker
2023-10-18 19:08           ` Benjamin Coddington
2023-10-18 19:17           ` Benjamin Coddington
2023-10-19 12:06   ` kernel test robot
2023-10-19 12:52     ` Benjamin Coddington
2023-10-19 12:18   ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZS/qwYzQvrgJNEv6@tissot.1015granger.net \
    --to=chuck.lever@oracle.com \
    --cc=anna@kernel.org \
    --cc=bcodding@redhat.com \
    --cc=jlayton@kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=trond.myklebust@hammerspace.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.