From: Trond Myklebust <trondmy@hammerspace.com>
To: "hsiangkao@linux.alibaba.com" <hsiangkao@linux.alibaba.com>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	"joseph.qi@linux.alibaba.com" <joseph.qi@linux.alibaba.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"anna.schumaker@netapp.com" <anna.schumaker@netapp.com>
Subject: Re: [PATCH] nfs: set block size according to pnfs_blksize first
Date: Wed, 16 Jun 2021 15:14:17 +0000
Message-ID: <80199ffaf89fc5ef2ad77245f9a5e75beed2dc37.camel@hammerspace.com>
In-Reply-To: <YMoNnr1RYDOLXtKJ@B-P7TQMD6M-0146.local>

On Wed, 2021-06-16 at 22:41 +0800, Gao Xiang wrote:
> Hi Trond,
> 
> On Wed, Jun 16, 2021 at 02:20:49PM +0000, Trond Myklebust wrote:
> > On Wed, 2021-06-16 at 22:06 +0800, Gao Xiang wrote:
> > > On Wed, Jun 16, 2021 at 01:47:13PM +0000, Trond Myklebust wrote:
> > > > On Wed, 2021-06-16 at 20:44 +0800, Gao Xiang wrote:
> > > > > > When testing fstests with ext4 over nfs 4.2, I found generic/486
> > > > > > failed. The root cause is that the length of its xattr value is
> > > > > >   min(st_blksize * 3 / 4, XATTR_SIZE_MAX)
> > > > > > 
> > > > > > which is 4096 * 3 / 4 = 3072 for underlayfs ext4 rather than
> > > > > > XATTR_SIZE_MAX = 65536 for nfs since the block size would be wsize
> > > > > > (=131072) if bsize is not specified.
> > > > > 
> > > > > Let's use pnfs_blksize first instead of using wsize directly if
> > > > > bsize isn't specified. And the testcase itself can pass now.
> > > > > 
> > > > > Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
> > > > > Cc: Anna Schumaker <anna.schumaker@netapp.com>
> > > > > Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
> > > > > Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
> > > > > ---
> > > > > Considering bsize is not specified, we might use pnfs_blksize
> > > > > directly first rather than wsize.
> > > > > 
> > > > >  fs/nfs/super.c | 8 ++++++--
> > > > >  1 file changed, 6 insertions(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/fs/nfs/super.c b/fs/nfs/super.c
> > > > > index fe58525cfed4..5015edf0cd9a 100644
> > > > > --- a/fs/nfs/super.c
> > > > > +++ b/fs/nfs/super.c
> > > > > @@ -1068,9 +1068,13 @@ static void nfs_fill_super(struct
> > > > > super_block
> > > > > *sb, struct nfs_fs_context *ctx)
> > > > >         snprintf(sb->s_id, sizeof(sb->s_id),
> > > > >                  "%u:%u", MAJOR(sb->s_dev), MINOR(sb->s_dev));
> > > > >  
> > > > > -       if (sb->s_blocksize == 0)
> > > > > -               sb->s_blocksize = nfs_block_bits(server->wsize,
> > > > > +       if (sb->s_blocksize == 0) {
> > > > > +               unsigned int blksize = server->pnfs_blksize ?
> > > > > +                       server->pnfs_blksize : server->wsize;
> > > > 
> > > > > NACK. The pnfs block size is a layout driver-specific quantity,
> > > > > and should not be used to substitute for the server-advertised
> > > > > block size. It only applies to I/O _if_ the client is holding a
> > > > > layout for a specific file and is using pNFS to do I/O to that
> > > > > file.
> > > 
> > > Honestly, I'm not sure if it's ok as well.
> > > 
> > > > 
> > > > It has nothing to do with xattrs at all.
> > > 
> > > > Yet my question is how to deal with generic/486, should we just
> > > > skip the case directly? I cannot find some proper way to get
> > > > underlayfs block size or real xattr value limit.
> > > 
> > 
> > > RFC8276 provides no method for determining the xattr size limits. It
> > > just notes that such limits may exist, and provides the error code
> > > NFS4ERR_XATTR2BIG, that the server may use as a return value when
> > > those limits are exceeded.
> > 
> > > For now, generic/486 will return ENOSPC at
> > > fsetxattr(fd, "user.world", value, 65536, XATTR_REPLACE);
> > > when testing new nfs4.2 xattr support.
> > > 
> > 
> > As noted above, the NFS server should really be returning
> > NFS4ERR_XATTR2BIG in this case, which the client, again, should be
> > transforming into -E2BIG. Where does ENOSPC come from?
> 
> Thanks for the detailed explanation...
> 
> I think that is due to ext4 returning ENOSPC since I tested
> 
> fsetxattr(fd, "user.world", value, 65536, XATTR_REPLACE);
> with ext4 as well and it returned ENOSPC, and I think it's reasonable
> since setxattr() will return ENOSPC for such cases.
> https://man7.org/linux/man-pages/man2/setxattr.2.html
> 
> should we transform it to E2BIG instead (at least in NFS
> protocol)? but I'm still not sure that E2BIG is a valid return code for
> setxattr()...

The setxattr() manpage appears to suggest ERANGE is the correct return
value here.

       ERANGE The size of name or value exceeds a filesystem-specific limit.


However, I can't tell whether ext4 and xfs ever return ERANGE.
Furthermore, it looks as if the VFS always returns E2BIG when
size > XATTR_SIZE_MAX.
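
To make the numbers in this thread concrete, here is a minimal sketch,
not taken from the patch or from fstests. It reproduces the value-length
formula quoted above and reports whichever errno a given filesystem
hands back for the XATTR_REPLACE call. The helper names xattr_test_len()
and probe_replace_errno() are hypothetical, and the fallback definition
of XATTR_SIZE_MAX assumes the 65536 figure mentioned earlier.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* Assumption: the VFS-wide cap on one xattr value, as quoted in the thread. */
#ifndef XATTR_SIZE_MAX
#define XATTR_SIZE_MAX 65536
#endif

/*
 * Hypothetical helper: value length as described in the patch
 * description, min(st_blksize * 3 / 4, XATTR_SIZE_MAX).
 */
size_t xattr_test_len(size_t blksize)
{
	size_t len = blksize * 3 / 4;

	return len < XATTR_SIZE_MAX ? len : XATTR_SIZE_MAX;
}

/*
 * Hypothetical helper: try to replace "user.world" on fd with a
 * zero-filled value of `size` bytes. Returns 0 on success, otherwise
 * the errno reported by fsetxattr(). Which errno comes back for an
 * oversized value is filesystem-dependent, so nothing is assumed here.
 */
int probe_replace_errno(int fd, size_t size)
{
	char *value = calloc(size ? size : 1, 1);
	int err = 0;

	if (!value)
		return ENOMEM;
	if (fsetxattr(fd, "user.world", value, size, XATTR_REPLACE) != 0)
		err = errno;
	free(value);
	return err;
}
```

With a 4096-byte st_blksize, xattr_test_len() gives 3072. With the
131072-byte wsize case it is capped at 65536, which is the size that
reaches the underlying filesystem. Calling probe_replace_errno() on a
file that already carries "user.world", once on ext4 directly and once
over the NFS mount, would show whether the two paths disagree.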

> 
> If necessary, I will look into it more tomorrow....
> 
> Thanks,
> Gao Xiang
> 
> > 
> > > Thanks,
> > > Gao Xiang
> > > 
> > > > 
> > > > > +
> > > > > +               sb->s_blocksize = nfs_block_bits(blksize,
> > > > >                                                  &sb-
> > > > > > s_blocksize_bits);
> > > > > +       }
> > > > >  
> > > > >         nfs_super_set_maxbytes(sb, server->maxfilesize);
> > > > >         server->has_sec_mnt_opts = ctx->has_sec_mnt_opts;
> > 
> > -- 
> > Trond Myklebust
> > Linux NFS client maintainer, Hammerspace
> > trond.myklebust@hammerspace.com
> > 
> > 

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



