Linux-NFS Archive on lore.kernel.org
 help / color / Atom feed
From: Chuck Lever <chucklever@gmail.com>
To: Frank van der Linden <fllinden@amazon.com>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	linux-fsdevel@vger.kernel.org
Subject: Re: [RFC PATCH 00/35] user xattr support (RFC8276)
Date: Thu, 24 Oct 2019 16:16:33 -0400
Message-ID: <9CAEB69A-A92C-47D8-9871-BA6EA83E1881@gmail.com> (raw)
In-Reply-To: <cover.1568309119.git.fllinden@amazon.com>

Hi Frank-

> On Sep 12, 2019, at 1:25 PM, Frank van der Linden <fllinden@amazon.com> wrote:
> 
> This RFC patch series implements both client and server side support
> for RFC8276 user extended attributes.
> 
> The server side should be complete (except for modifications made
> after comments, of course). The client side lacks caching. I am
> working on that, but am not happy with my current implementation.
> So I'm sending this out for input. Having a reviewed base to work
> from will make adding client-side caching support easier to add,
> in any case.
> 
> Thanks for any input!

This prototype looks very good. I have some general comments as I'm
getting to know your code. I assume your end goal is to get this
work merged into the upstream Linux kernel.

First some comments about patch submission and review:

- IMO you can post future updates just to linux-nfs. Note that the
kernel NFS client and server are maintained separately, so when it
comes time to submit final patches, you will send the client work
to Trond and Anna, and the server work to Bruce (and maybe me).

- We like patches that are as small as possible but no smaller.
Some of these might be too small. For example, you don't need to add
the XDR encoders, decoders, and reply size macros in separate patches.
That might help get the total patch count down without reducing
review-ability. I can help offline with other examples.

- Please run scripts/checkpatch.pl on each patch before you post
again. This will help identify coding convention issues that should
be addressed before merge. sparse is also a good idea too.
clang-format is also nice but is entirely optional.

- I was not able to get 34/35 to apply. The series might be missing
a patch that adds nfsd_getxattr and friends.

- Do you have man page updates for the new mount and export options?


Now some comments about code design:

- I'm not clear why new CONFIG options are necessary. These days we
try to avoid adding new CONFIG options if possible. I can't think of
a reason someone would need to compile user xattr support out if
NFSv4.2 is enabled.

- Can you explain why an NFS server administrator might want to
disable user xattr support on a share?

- Probably you are correct that the design choices you made regarding
multi-message LISTXATTR are the best that can be done. Hopefully that
is not a frequent operation, but for something like "tar" it might be.
Do you have any feeling for how to assess performance?

- Regarding client caching... the RFC is notably vague about what
is needed there. You might be able to get away with no caching, just
as a start. Do you (and others) think that this series would be
acceptable and mergeable without any client caching support?

- Do you have access to an RDMA-capable platform? RPC/RDMA needs to
be able to predict how large each reply will be, in order to reserve
appropriate memory resources to land the whole RPC reply on the client.
I'm wondering if you've found any particular areas where that might be
a challenge.


Testing:

- Does fstests already have user xattr functional tests? If not, how
do you envision testing this new code?

- How should we test the logic that deals with delegation recall?

- Do you have plans to submit patches to pyNFS?


> Some notable issues:
> 
> * RFC8276 explicitly only concerns itself with the user xattr namespace.
>  Linux is the only OS that encodes the namespace in the attribute name,
>  adding "user." for the user namespace. Others, like FreeBSD and MacOS,
>  pass the namespace as a system call argument. Since the "user." prefix
>  is specific to Linux, it is stripped off on the client side, and added
>  back on the server side.
> 
> * Extended attributes tend to be small in size, but they may be somewhat
>  large. The Linux-imposed limit is 64k. The sunrpc XDR interfaces only
>  allow sizes > PAGE_SIZE to be encoded using pages. That works out
>  great for reads/writes, but since there are no page-based xattr
>  FS interfaces, it means some extra explicit copying.
> 
> * A quirk of the Linux xattr implementation is that you can create
>  more extended attributes per file than listxattr can handle in
>  its maximum size, meaning that you get E2BIG back. There is no
>  1-1 translation for that error to the LISTXATTR op. I chose to
>  return NFS4ERR_TOOSMALL (which is a valid error for the operation),
>  and have the client code check for it and return E2BIG. Not
>  great, but it seemed to be the best option.
> 
> * The LISTXATTR op uses cookies to support multiple calls to retrieve
>  the complete list, like READDDIR. This means that there's the old
>  "how to deal with non-0 cookies" issue.
> 
>  In the vast majority of cases, LISTXATTR should not need multiple
>  calls. However, you never know what a client will do. READDIR
>  uses the cookie as a seek offset in to the directory. It's not
>  quite as simple with LISTXATTR. First, there is no seperate
>  FS interface to list only user xattrs. Which means that, based
>  on permissions, the buffer returned by vfs_listxattr might turn
>  out differently (e.g. if you have no permission to read some
>  system. attributes). That means that using just a hard offset
>  in to the buffer (with verifications) can't work, as its a
>  context-specific value, and the client couldn't cache it if
>  it wanted to.
> 
>  My code returns the attribute count as a cookie. The server
>  always asks vfs_listxattr for a XATTR_LIST_MAX buffer (see
>  below), and then starts XDR encoding at the Nth user xattr
>  attribute, where N is the cookie value. There are bounds
>  checks of course.
> 
>  This isn't great, but probably the best you can do.
> 
> * There is no FS interface to only read user extended attributes,
>  and the server code needs to strip off the "user." prefix.
>  Additionally, the RFC specifies that the max length field for the
>  LISTXATTR op refers to the total XDR-encoded length. Given all
>  this, it is not possible to know what the length is it should pass
>  to vfs_listxattr. So it just passes an XATTR_LIST_MAX sized buffer
>  to vfs_listxattr, and then encodes as many user xattrs as it can
>  from the returned buffer.
> 
> * Given that this is an extension to v4.2, it is only supported
>  if 4.2 is used (even though it has no dependencies on 4.2
>  specific features).
> 
> * There is a new mount option, already known for other filesystems,
>  "user_xattr", which must be passed explicitly as it stands
>  right now, together with vers=4.2
> 
> * There is a new export option to turn off extended attributes for
>  a filesystem.
> 
> * Modifications outside of the NFS(D) code were minor. They were
>  all in the xattr code, to export versions of vfs_setxattr and
>  vfs_removexattr that the NFS code should use (inode rwsem taken
>  because of atomic change info), and to have them break delegations,
>  as specified by RFC8276.
> 
> * As I mentioned, this has no client caching. I have one implementation,
>  but I'm not that happy with it.
> 
>  The main issue with client caching is that, for virtually all expected
>  cases, the number of user extended attributes per file will be small,
>  and their data small. But, theoretically, you can, within current Linux
>  limitations, create some 9000 user extended attributes per inode, each
>  64k in size.
> 
>  My current implementation uses an inode LRU chain (much like the access
>  cache), and an rhashtable per inode (I picked rhashtables because
>  they automatically grow). This seems like overkill, though.
> 
>  If there are any suggestions on better implementations, I'll be happy
>  to take them. The client caching I mention here is not in this series,
>  as I need to clean it up a bit (and am not sure if I really want to use
>  it). But I can share it if asked. It will be in the next iteration,
>  in whatever shape or form it might take.
> 
> Frank van der Linden (35):
>  nfsd: make sure the nfsd4_ops array has the right size
>  nfs/nfsd: basic NFS4 extended attribute definitions
>  NFSv4.2: query the server for extended attribute support
>  nfs: parse the {no}user_xattr option
>  NFSv4.2: define a function to compute the maximum XDR size for
>    listxattr
>  NFSv4.2: define and set initial limits for extended attributes
>  NFSv4.2: define argument and response structures for xattr operations
>  NFSv4.2: define the encode/decode sizes for the XATTR operations
>  NFSv4.2: define and use extended attribute overhead sizes
>  NFSv4.2: add client side XDR handling for extended attributes
>  nfs: define nfs_access_get_cached function
>  NFSv4.2: query the extended attribute access bits
>  nfs: modify update_changeattr to deal with regular files
>  nfs: define and use the NFS_INO_INVALID_XATTR flag
>  nfs: make the buf_to_pages_noslab function available to the nfs code
>  NFSv4.2: add the extended attribute proc functions.
>  NFSv4.2: hook in the user extended attribute handlers
>  NFSv4.2: add client side xattr caching functions
>  NFSv4.2: call the xattr cache functions
>  nfs: add the NFS_V4_XATTR config option
>  xattr: modify vfs_{set,remove}xattr for NFS server use
>  nfsd: split off the write decode code in to a seperate function
>  nfsd: add defines for NFSv4.2 extended attribute support
>  nfsd: define xattr functions to call in to their vfs counterparts
>  nfsd: take xattr access bits in to account when checking
>  nfsd: add structure definitions for xattr requests / responses
>  nfsd: implement the xattr procedure functions.
>  nfsd: define xattr reply size functions
>  nfsd: add xattr XDR decode functions
>  nfsd: add xattr XDR encode functions
>  nfsd: add xattr operations to ops array
>  xattr: add a function to check if a namespace is supported
>  nfsd: add fattr support for user extended attributes
>  nfsd: add export flag to disable user extended attributes
>  nfsd: add NFSD_V4_XATTR config option
> 
> fs/nfs/Kconfig                   |   9 +
> fs/nfs/Makefile                  |   1 +
> fs/nfs/client.c                  |  17 +-
> fs/nfs/dir.c                     |  24 +-
> fs/nfs/inode.c                   |   7 +-
> fs/nfs/internal.h                |  24 ++
> fs/nfs/nfs42.h                   |  29 ++
> fs/nfs/nfs42proc.c               | 242 ++++++++++++++
> fs/nfs/nfs42xattr.c              |  72 ++++
> fs/nfs/nfs42xdr.c                | 446 +++++++++++++++++++++++++
> fs/nfs/nfs4_fs.h                 |   5 +
> fs/nfs/nfs4client.c              |  31 ++
> fs/nfs/nfs4proc.c                | 246 ++++++++++++--
> fs/nfs/nfs4super.c               |   1 +
> fs/nfs/nfs4xdr.c                 |  35 ++
> fs/nfs/nfstrace.h                |   3 +-
> fs/nfs/super.c                   |  11 +
> fs/nfsd/Kconfig                  |  10 +
> fs/nfsd/export.c                 |   1 +
> fs/nfsd/nfs4proc.c               | 145 +++++++-
> fs/nfsd/nfs4xdr.c                | 548 +++++++++++++++++++++++++++++--
> fs/nfsd/nfsd.h                   |  13 +-
> fs/nfsd/vfs.c                    | 153 +++++++++
> fs/nfsd/vfs.h                    |  10 +
> fs/nfsd/xdr4.h                   |  31 ++
> fs/xattr.c                       |  90 ++++-
> include/linux/nfs4.h             |  27 +-
> include/linux/nfs_fs.h           |   6 +
> include/linux/nfs_fs_sb.h        |   7 +
> include/linux/nfs_xdr.h          |  62 +++-
> include/linux/xattr.h            |   4 +
> include/uapi/linux/nfs4.h        |   3 +
> include/uapi/linux/nfs_fs.h      |   1 +
> include/uapi/linux/nfsd/export.h |   3 +-
> 34 files changed, 2233 insertions(+), 84 deletions(-)
> create mode 100644 fs/nfs/nfs42xattr.c
> 
> -- 
> 2.17.2
> 

--
Chuck Lever
chucklever@gmail.com




  parent reply index

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-12 17:25 Frank van der Linden
2019-07-01 17:56 ` [RFC PATCH 02/35] nfs/nfsd: basic NFS4 extended attribute definitions Frank van der Linden
2019-08-26 21:38 ` [RFC PATCH 03/35] NFSv4.2: query the server for extended attribute support Frank van der Linden
2019-08-26 21:53 ` [RFC PATCH 04/35] nfs: parse the {no}user_xattr option Frank van der Linden
2019-08-26 22:06 ` [RFC PATCH 05/35] NFSv4.2: define a function to compute the maximum XDR size for listxattr Frank van der Linden
2019-08-26 22:23 ` [RFC PATCH 06/35] NFSv4.2: define and set initial limits for extended attributes Frank van der Linden
2019-08-26 22:32 ` [RFC PATCH 07/35] NFSv4.2: define argument and response structures for xattr operations Frank van der Linden
2019-08-26 22:44 ` [RFC PATCH 08/35] NFSv4.2: define the encode/decode sizes for the XATTR operations Frank van der Linden
2019-08-26 23:09 ` [RFC PATCH 09/35] NFSv4.2: define and use extended attribute overhead sizes Frank van der Linden
2019-08-27 15:34 ` [RFC PATCH 10/35] NFSv4.2: add client side XDR handling for extended attributes Frank van der Linden
2019-08-27 15:46 ` [RFC PATCH 11/35] nfs: define nfs_access_get_cached function Frank van der Linden
2019-08-27 16:01 ` [RFC PATCH 12/35] NFSv4.2: query the extended attribute access bits Frank van der Linden
2019-08-27 22:51 ` [RFC PATCH 13/35] nfs: modify update_changeattr to deal with regular files Frank van der Linden
2019-08-30 22:48 ` [RFC PATCH 14/35] nfs: define and use the NFS_INO_INVALID_XATTR flag Frank van der Linden
2019-08-30 22:56 ` [RFC PATCH 15/35] nfs: make the buf_to_pages_noslab function available to the nfs code Frank van der Linden
2019-08-30 23:15 ` [RFC PATCH 16/35] NFSv4.2: add the extended attribute proc functions Frank van der Linden
2019-08-30 23:31 ` [RFC PATCH 17/35] NFSv4.2: hook in the user extended attribute handlers Frank van der Linden
2019-08-30 23:38 ` [RFC PATCH 18/35] NFSv4.2: add client side xattr caching functions Frank van der Linden
2019-08-30 23:45 ` [RFC PATCH 19/35] NFSv4.2: call the xattr cache functions Frank van der Linden
2019-08-30 23:59 ` [RFC PATCH 20/35] nfs: add the NFS_V4_XATTR config option Frank van der Linden
2019-08-31  2:12 ` [RFC PATCH 21/35] xattr: modify vfs_{set,remove}xattr for NFS server use Frank van der Linden
2019-08-31 19:00 ` [RFC PATCH 22/35] nfsd: split off the write decode code in to a seperate function Frank van der Linden
2019-08-31 19:19 ` [RFC PATCH 23/35] nfsd: add defines for NFSv4.2 extended attribute support Frank van der Linden
2019-08-31 21:35 ` [RFC PATCH 26/35] nfsd: add structure definitions for xattr requests / responses Frank van der Linden
2019-08-31 23:53 ` [RFC PATCH 24/35] nfsd: define xattr functions to call in to their vfs counterparts Frank van der Linden
2019-09-01  0:13 ` [RFC PATCH 25/35] nfsd: take xattr access bits in to account when checking Frank van der Linden
2019-09-01  1:19 ` [RFC PATCH 27/35] nfsd: implement the xattr procedure functions Frank van der Linden
2019-09-01  2:46 ` [RFC PATCH 01/35] nfsd: make sure the nfsd4_ops array has the right size Frank van der Linden
2019-09-02 19:40 ` [RFC PATCH 28/35] nfsd: define xattr reply size functions Frank van der Linden
2019-09-02 19:58 ` [RFC PATCH 29/35] nfsd: add xattr XDR decode functions Frank van der Linden
2019-09-02 20:09 ` [RFC PATCH 30/35] nfsd: add xattr XDR encode functions Frank van der Linden
2019-09-02 20:19 ` [RFC PATCH 31/35] nfsd: add xattr operations to ops array Frank van der Linden
2019-09-02 20:34 ` [RFC PATCH 32/35] xattr: add a function to check if a namespace is supported Frank van der Linden
2019-09-02 21:30 ` [RFC PATCH 33/35] nfsd: add fattr support for user extended attributes Frank van der Linden
2019-09-02 23:06 ` [RFC PATCH 34/35] nfsd: add export flag to disable " Frank van der Linden
2019-09-02 23:17 ` [RFC PATCH 35/35] nfsd: add NFSD_V4_XATTR config option Frank van der Linden
2019-10-24 20:16 ` Chuck Lever [this message]
2019-10-24 23:15   ` [RFC PATCH 00/35] user xattr support (RFC8276) Frank van der Linden
2019-10-25 19:55     ` Chuck Lever
2019-11-04  3:01       ` bfields
2019-11-04 15:36         ` Chuck Lever
2019-11-04 16:21           ` Frank van der Linden
2019-11-04 22:58             ` Bruce Fields
2019-11-05  0:06               ` Frank van der Linden
2019-11-05 15:44               ` Chuck Lever

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9CAEB69A-A92C-47D8-9871-BA6EA83E1881@gmail.com \
    --to=chucklever@gmail.com \
    --cc=fllinden@amazon.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-NFS Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-nfs/0 linux-nfs/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-nfs linux-nfs/ https://lore.kernel.org/linux-nfs \
		linux-nfs@vger.kernel.org
	public-inbox-index linux-nfs

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-nfs


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git