All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Theodore Ts'o" <tytso@mit.edu>
To: Gao Xiang <hsiangkao@linux.alibaba.com>
Cc: Trond Myklebust <trondmy@hammerspace.com>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	"joseph.qi@linux.alibaba.com" <joseph.qi@linux.alibaba.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"anna.schumaker@netapp.com" <anna.schumaker@netapp.com>
Subject: Re: [PATCH] nfs: set block size according to pnfs_blksize first
Date: Wed, 16 Jun 2021 12:14:44 -0400	[thread overview]
Message-ID: <YMojdN145g9JqAC8@mit.edu> (raw)
In-Reply-To: <YMoNnr1RYDOLXtKJ@B-P7TQMD6M-0146.local>

On Wed, Jun 16, 2021 at 10:41:34PM +0800, Gao Xiang wrote:
> > > Yet my question is how to deal with generic/486, should we just skip
> > > the case directly? I cannot find some proper way to get underlayfs
> > > block size or real xattr value limit.

Note that the block size does not necessarily have thing to do with
the xattr value limit.  For example, in ext4, if you create the file
system with the ea_inode feature enabled, you can create extended
attributes up to the maximum value of 64k defined by the xattr
interface --- unless, of course, there isn't enough space in the file
system.

(The ea_inode feature will also use shared space for large xattrs, so
if you are storing hundreds of thousands of files that all have an
identical 20 kilbyte Windows security id or ACL, ea_inode is going to
be much more efficient way of supporting that particular use case.)

> > As noted above, the NFS server should really be returning
> > NFS4ERR_XATTR2BIG in this case, which the client, again, should be
> > transforming into -E2BIG. Where does ENOSPC come from?
> 
> Thanks for the detailed explanation...
> 
> I think that is due to ext4 returning ENOSPC since I tested

It's not just ext4.  From inspection, under some circumstances f2fs
and btrfs can return ENOSPC.

The distinction is that E2BIG is (from the attr_set man page):

       [E2BIG]          The value of the given attribute is too large,
       			it exceeds the maximum allowable size of an
			attribute value.

The maximum allowable size (enforced by the VFS) is XATTR_MAX_SIZE,
which is 65536 bytes.  Some file systems may impose a smaller max
allowable size.

In contrast ENOSPC means something different:

       ENOSPC		No space left on device.

This isn't necessarily just block space, BTW; it might some other file
system space --- thre might not be a free inode in the case of ext4's
ea_inode feature.  Or it be the f2fs file system not being able to
allocate a node id via f2fs_alloc_nid().

Note that generic/486 is testing a very specific case, which is
replacing a small xattr (16 bytes) with an xattr with a large value.
This is would be a really interesting test for ext4 ea_inode, when we
are replacing an xattr stored inline in the inode, or in a single 4k
block, with an xattr stored in a separate inode.  But not the way
src/attr_replace_test.c (which does all of the heavy lifting for the
generic/486 test) is currently written.

So what I would suggest is to have attr_replace_test.c try to
determine the maximum attr value size using binary search.  Start with
min=16, and max=65536, and try creating an xattr of a particular size,
and then delete the xattr, and then retry creating the xattr with the
next binary search size.  This will allow you to create a function
which determines the maximum size attr for a particular file system,
especially in those cases where it is dependent on how the file system
is configured.

> should we transform it to E2BIG instead (at least in NFS
> protocol)? but I'm still not sure that E2BIG is a valid return code for
> setxattr()...

E2BIG is defined in the attr_set(3) man page.  ENOSPC isn't mentioned
in the attr_set man page, but given that multiple file systems return
ENOSPC, and ENOSPC has a well-defined meaning in POSIX.1 which very
much makes sense when creating extended attributes, we should fix that
by adding ENOSPC to the attr_set(3) man page (which is shipped as part
of the libattr library sources).

Cheers,

						- Ted

  parent reply	other threads:[~2021-06-16 16:15 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-16 12:44 Gao Xiang
2021-06-16 13:47 ` Trond Myklebust
2021-06-16 14:06   ` Gao Xiang
2021-06-16 14:20     ` Trond Myklebust
2021-06-16 14:41       ` Gao Xiang
2021-06-16 15:14         ` Trond Myklebust
2021-06-16 17:08           ` Theodore Ts'o
2021-06-16 17:17           ` Frank van der Linden
2021-06-16 22:51             ` Theodore Ts'o
2021-06-16 16:14         ` Theodore Ts'o [this message]
2021-06-16 17:51           ` Gao Xiang
2021-06-16 18:51             ` Trond Myklebust
2021-06-16 22:55             ` Theodore Ts'o
2021-06-17  2:39               ` Gao Xiang
2021-06-17 13:08                 ` Theodore Ts'o

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YMojdN145g9JqAC8@mit.edu \
    --to=tytso@mit.edu \
    --cc=anna.schumaker@netapp.com \
    --cc=hsiangkao@linux.alibaba.com \
    --cc=joseph.qi@linux.alibaba.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=trondmy@hammerspace.com \
    --subject='Re: [PATCH] nfs: set block size according to pnfs_blksize first' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.