All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] getacl fixes
@ 2017-02-17 16:44 J. Bruce Fields
  2017-02-17 16:44 ` [PATCH 1/3] nfsd4: fix getacl head length estimation J. Bruce Fields
                   ` (4 more replies)
  0 siblings, 5 replies; 46+ messages in thread
From: J. Bruce Fields @ 2017-02-17 16:44 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker
  Cc: linux-nfs, Andreas Gruenbacher, Weston Andros Adamson, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

The getacl code is allocating enough space to handle the ACL data but
not to handle the bitmask, which can lead to spurious ERANGE errors when
the end of the ACL gets close to a page boundary.

Dros addressed this by letting the rpc layer allocate pages as necessary
on demand, as the NFSv3 ACL code does.

On its own that didn't do the job either, because we don't handle the
case where xdr_shrink_bufhead needs to move data around in the xdr buf.
And xdr_shrink_bufhead was getting called every time due to an incorrect
estimate in an xdr_inline_pages call.

So, I fixed that estimate.  That still leaves the chance of a bug in the
rare case xdr_shrink_bufhead is called.

We could fix up the handling of the xdr_shrink_bufhead case, but I don't
see the point of shifting this data around in the first place.  We're
not doing anything like zero-copy here, we're just going to copy the
data out into the buffer we were passed.  The NFSv3 ACL code doesn't
bother with this.

It's simpler just to pass down the buffer to the xdr layer and let it
copy the ACL out.

The result looks a lot simpler and more obviously correct than this code
has been, though I'm not particularly happy with the sequence of patches
that gets us there; it would be better to squash together Dros's and my
patch and then split out the result some more sensible way.

Sorry for the delay getting back to this.  Older discussions:

	https://marc.info/?t=138452791200001&r=1&w=2
	http://marc.info/?t=138506891000003&r=1&w=2

J. Bruce Fields (2):
  nfsd4: fix getacl head length estimation
  nfsd4: simplify getacl decoding

Weston Andros Adamson (1):
  NFSv4: fix getacl ERANGE for some ACL buffer sizes

 fs/nfs/nfs4proc.c       | 116 +++++++++++++++++++++++-------------------------
 fs/nfs/nfs4xdr.c        |  29 +++---------
 include/linux/nfs_xdr.h |   4 +-
 3 files changed, 64 insertions(+), 85 deletions(-)

-- 
2.9.3


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 1/3] nfsd4: fix getacl head length estimation
  2017-02-17 16:44 [PATCH 0/3] getacl fixes J. Bruce Fields
@ 2017-02-17 16:44 ` J. Bruce Fields
  2017-02-17 16:44 ` [PATCH 2/3] NFSv4: fix getacl ERANGE for some ACL buffer sizes J. Bruce Fields
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 46+ messages in thread
From: J. Bruce Fields @ 2017-02-17 16:44 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker
  Cc: linux-nfs, Andreas Gruenbacher, Weston Andros Adamson, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Bitmap and attrlen follow immediately after the op reply header, so I'm
not sure what this extra "+1" was for.

Consequences of this are just minor efficiency (extra calls to
xdr_shrink_bufhead).

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfs/nfs4xdr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index e9255cb453e6..bb95dd2edeef 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -2524,7 +2524,7 @@ static void nfs4_xdr_enc_getacl(struct rpc_rqst *req, struct xdr_stream *xdr,
 	encode_compound_hdr(xdr, req, &hdr);
 	encode_sequence(xdr, &args->seq_args, &hdr);
 	encode_putfh(xdr, args->fh, &hdr);
-	replen = hdr.replen + op_decode_hdr_maxsz + 1;
+	replen = hdr.replen + op_decode_hdr_maxsz;
 	encode_getattr_two(xdr, FATTR4_WORD0_ACL, 0, &hdr);
 
 	xdr_inline_pages(&req->rq_rcv_buf, replen << 2,
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 2/3] NFSv4: fix getacl ERANGE for some ACL buffer sizes
  2017-02-17 16:44 [PATCH 0/3] getacl fixes J. Bruce Fields
  2017-02-17 16:44 ` [PATCH 1/3] nfsd4: fix getacl head length estimation J. Bruce Fields
@ 2017-02-17 16:44 ` J. Bruce Fields
  2017-02-17 16:44 ` [PATCH 3/3] nfsd4: simplify getacl decoding J. Bruce Fields
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 46+ messages in thread
From: J. Bruce Fields @ 2017-02-17 16:44 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker
  Cc: linux-nfs, Andreas Gruenbacher, Weston Andros Adamson,
	Weston Andros Adamson, J. Bruce Fields

From: Weston Andros Adamson <dros@netapp.com>

Fix a bug where getxattr returns ERANGE when the attr buffer
length is close enough to the nearest PAGE_SIZE multiple that adding
the extra bytes leaves too little room for the ACL buffer.

Besides storing the ACL buffer, the getacl decoder uses the inline
pages for the attr bitmap and buffer length. __nfs4_get_acl_uncached
must allocate enough page space for all of the data to be decoded.

This patch uses xdr_partial_copy_from_skb to allocate any needed pages.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
[bfields: restore xdr_enter_page, don't preallocate a page, misc. cleanup]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfs/nfs4proc.c | 43 ++++++++++++-------------------------------
 1 file changed, 12 insertions(+), 31 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 0a0eaecf9676..2d485dab343e 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5060,20 +5060,17 @@ static void nfs4_write_cached_acl(struct inode *inode, struct page **pages, size
 /*
  * The getxattr API returns the required buffer length when called with a
  * NULL buf. The NFSv4 acl tool then calls getxattr again after allocating
- * the required buf.  On a NULL buf, we send a page of data to the server
- * guessing that the ACL request can be serviced by a page. If so, we cache
- * up to the page of ACL data, and the 2nd call to getxattr is serviced by
- * the cache. If not so, we throw away the page, and cache the required
- * length. The next getxattr call will then produce another round trip to
- * the server, this time with the input buf of the required size.
+ * the required buf. Cache the result from the first getxattr call to avoid
+ * sending another RPC.
  */
 static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t buflen)
 {
-	struct page *pages[NFS4ACL_MAXPAGES] = {NULL, };
+	/* enough pages to hold ACL data plus the bitmap and acl length */
+	struct page *pages[NFS4ACL_MAXPAGES + 1] = {NULL, };
 	struct nfs_getaclargs args = {
 		.fh = NFS_FH(inode),
+		/* The xdr layer may allocate pages here. */
 		.acl_pages = pages,
-		.acl_len = buflen,
 	};
 	struct nfs_getaclres res = {
 		.acl_len = buflen,
@@ -5083,31 +5080,17 @@ static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t bu
 		.rpc_argp = &args,
 		.rpc_resp = &res,
 	};
-	unsigned int npages = DIV_ROUND_UP(buflen, PAGE_SIZE);
 	int ret = -ENOMEM, i;
 
-	/* As long as we're doing a round trip to the server anyway,
-	 * let's be prepared for a page of acl data. */
-	if (npages == 0)
-		npages = 1;
-	if (npages > ARRAY_SIZE(pages))
-		return -ERANGE;
-
-	for (i = 0; i < npages; i++) {
-		pages[i] = alloc_page(GFP_KERNEL);
-		if (!pages[i])
-			goto out_free;
-	}
-
 	/* for decoding across pages */
 	res.acl_scratch = alloc_page(GFP_KERNEL);
 	if (!res.acl_scratch)
-		goto out_free;
+		return -ENOMEM;
 
-	args.acl_len = npages * PAGE_SIZE;
+	args.acl_len = ARRAY_SIZE(pages) << PAGE_SHIFT;
 
-	dprintk("%s  buf %p buflen %zu npages %d args.acl_len %zu\n",
-		__func__, buf, buflen, npages, args.acl_len);
+	dprintk("%s  buf %p buflen %zu args.acl_len %zu\n",
+		__func__, buf, buflen, args.acl_len);
 	ret = nfs4_call_sync(NFS_SERVER(inode)->client, NFS_SERVER(inode),
 			     &msg, &args.seq_args, &res.seq_res, 0);
 	if (ret)
@@ -5132,11 +5115,9 @@ static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t bu
 out_ok:
 	ret = res.acl_len;
 out_free:
-	for (i = 0; i < npages; i++)
-		if (pages[i])
-			__free_page(pages[i]);
-	if (res.acl_scratch)
-		__free_page(res.acl_scratch);
+	for (i = 0; i < ARRAY_SIZE(pages) && pages[i]; i++)
+		__free_page(pages[i]);
+	__free_page(res.acl_scratch);
 	return ret;
 }
 
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 3/3] nfsd4: simplify getacl decoding
  2017-02-17 16:44 [PATCH 0/3] getacl fixes J. Bruce Fields
  2017-02-17 16:44 ` [PATCH 1/3] nfsd4: fix getacl head length estimation J. Bruce Fields
  2017-02-17 16:44 ` [PATCH 2/3] NFSv4: fix getacl ERANGE for some ACL buffer sizes J. Bruce Fields
@ 2017-02-17 16:44 ` J. Bruce Fields
  2017-02-17 19:15   ` kbuild test robot
                     ` (2 more replies)
  2017-02-17 20:36 ` [PATCH 0/3] getacl fixes Chuck Lever
  2017-02-19  2:07 ` [PATCH 0/6] getacl fixes V2 J. Bruce Fields
  4 siblings, 3 replies; 46+ messages in thread
From: J. Bruce Fields @ 2017-02-17 16:44 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker
  Cc: linux-nfs, Andreas Gruenbacher, Weston Andros Adamson, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

We still have a bug in the case xdr_enter_page() ends up needing to do
xdr_shrink_bufhead().  In that case some of the acl data could end up
shifting into the tail of the xdr_buf, but we're not prepared to handle
that.

There's not really much point to shifting the data around anyway, we're
just going to be copying it out into the buf that the vfs passed us.  So
let's just pass that buf down to the xdr layer and let it copy data
directly into it.

That means we need to allocate our own buf in the case the user didn't
give us one, so that we can still cache in that case (to efficiently
handle the common case of a getxattr(., ., NULL, 0) to get the size
followed immediately by a getxattr(., ., buf, size)).

But this still works out to be much simpler.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfs/nfs4proc.c       | 79 +++++++++++++++++++++++++++++--------------------
 fs/nfs/nfs4xdr.c        | 27 ++++-------------
 include/linux/nfs_xdr.h |  4 +--
 3 files changed, 54 insertions(+), 56 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 2d485dab343e..6262b2955d0e 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5035,17 +5035,23 @@ static inline ssize_t nfs4_read_cached_acl(struct inode *inode, char *buf, size_
 	return ret;
 }
 
-static void nfs4_write_cached_acl(struct inode *inode, struct page **pages, size_t pgbase, size_t acl_len)
+/*
+ * We don't cache ACLs over a certain size.  I don't know if we have any
+ * real reason for this policy:
+ */
+#define NFS4_MAX_CACHED_ACL PAGE_SIZE
+
+static void nfs4_write_cached_acl(struct inode *inode, void *buf, size_t acl_len)
 {
 	struct nfs4_cached_acl *acl;
 	size_t buflen = sizeof(*acl) + acl_len;
 
-	if (buflen <= PAGE_SIZE) {
+	if (buflen <= NFS4_MAX_CACHED_ACL) {
 		acl = kmalloc(buflen, GFP_KERNEL);
 		if (acl == NULL)
 			goto out;
 		acl->cached = 1;
-		_copy_from_pages(acl->data, pages, pgbase, acl_len);
+		memcpy(acl->data, buf, acl_len);
 	} else {
 		acl = kmalloc(sizeof(*acl), GFP_KERNEL);
 		if (acl == NULL)
@@ -5057,13 +5063,7 @@ static void nfs4_write_cached_acl(struct inode *inode, struct page **pages, size
 	nfs4_set_cached_acl(inode, acl);
 }
 
-/*
- * The getxattr API returns the required buffer length when called with a
- * NULL buf. The NFSv4 acl tool then calls getxattr again after allocating
- * the required buf. Cache the result from the first getxattr call to avoid
- * sending another RPC.
- */
-static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t buflen)
+static ssize_t nfs4_do_get_acl(struct inode *inode, void *buf, size_t buflen)
 {
 	/* enough pages to hold ACL data plus the bitmap and acl length */
 	struct page *pages[NFS4ACL_MAXPAGES + 1] = {NULL, };
@@ -5074,13 +5074,14 @@ static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t bu
 	};
 	struct nfs_getaclres res = {
 		.acl_len = buflen,
+		.buf = buf,
 	};
 	struct rpc_message msg = {
 		.rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_GETACL],
 		.rpc_argp = &args,
 		.rpc_resp = &res,
 	};
-	int ret = -ENOMEM, i;
+	int i;
 
 	/* for decoding across pages */
 	res.acl_scratch = alloc_page(GFP_KERNEL);
@@ -5093,34 +5094,48 @@ static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t bu
 		__func__, buf, buflen, args.acl_len);
 	ret = nfs4_call_sync(NFS_SERVER(inode)->client, NFS_SERVER(inode),
 			     &msg, &args.seq_args, &res.seq_res, 0);
-	if (ret)
-		goto out_free;
+	if (ret == 0)
+		ret = res.acl_len;
 
-	/* Handle the case where the passed-in buffer is too short */
-	if (res.acl_flags & NFS4_ACL_TRUNC) {
-		/* Did the user only issue a request for the acl length? */
-		if (buf == NULL)
-			goto out_ok;
-		ret = -ERANGE;
-		goto out_free;
-	}
-	nfs4_write_cached_acl(inode, pages, res.acl_data_offset, res.acl_len);
-	if (buf) {
-		if (res.acl_len > buflen) {
-			ret = -ERANGE;
-			goto out_free;
-		}
-		_copy_from_pages(buf, pages, res.acl_data_offset, res.acl_len);
-	}
-out_ok:
-	ret = res.acl_len;
-out_free:
 	for (i = 0; i < ARRAY_SIZE(pages) && pages[i]; i++)
 		__free_page(pages[i]);
 	__free_page(res.acl_scratch);
 	return ret;
 }
 
+/*
+ * The getxattr API returns the required buffer length when called with a
+ * NULL buf. The NFSv4 acl tool then calls getxattr again after allocating
+ * the required buf. Cache the result from the first getxattr call to avoid
+ * sending another RPC.
+ */
+static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t buflen)
+{
+	void *priv_buf = NULL;
+	void *our_buf = buf;
+	int our_buflen = buflen;
+	static ssize_t ret = -ENOMEM;
+
+	if (!buf) {
+		priv_buf = kmalloc(NFS4_MAX_CACHED_ACL, GFP_KERNEL);
+		if (!priv_buf)
+			goto out;
+		our_buf = priv_buf;
+		our_buflen = NFS4_MAX_CACHED_ACL;
+	}
+	ret = nfs4_do_get_acl(inode, our_buf, our_buflen);
+	if (ret < 0)
+		goto out;
+	if (ret <= our_buflen)
+		nfs4_write_cached_acl(inode, our_buf, ret);
+	if (buf && ret > buflen)
+		ret = -ERANGE;
+out:
+	if (priv_buf)
+		kfree(priv_buf);
+	return ret;
+}
+
 static ssize_t nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t buflen)
 {
 	struct nfs4_exception exception = { };
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index bb95dd2edeef..534b377084fb 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -5353,39 +5353,24 @@ static int decode_getacl(struct xdr_stream *xdr, struct rpc_rqst *req,
 	uint32_t attrlen,
 		 bitmap[3] = {0};
 	int status;
-	unsigned int pg_offset;
 
-	res->acl_len = 0;
 	if ((status = decode_op_hdr(xdr, OP_GETATTR)) != 0)
 		goto out;
-
-	xdr_enter_page(xdr, xdr->buf->page_len);
-
-	/* Calculate the offset of the page data */
-	pg_offset = xdr->buf->head[0].iov_len;
-
 	if ((status = decode_attr_bitmap(xdr, bitmap)) != 0)
 		goto out;
 	if ((status = decode_attr_length(xdr, &attrlen, &savep)) != 0)
 		goto out;
-
+	if (unlikely(attrlen > (xdr->nwords << 2)))
+		return -EIO;
 	if (unlikely(bitmap[0] & (FATTR4_WORD0_ACL - 1U)))
 		return -EIO;
 	if (likely(bitmap[0] & FATTR4_WORD0_ACL)) {
+		unsigned int offset = xdr_stream_pos(xdr);
 
-		/* The bitmap (xdr len + bitmaps) and the attr xdr len words
-		 * are stored with the acl data to handle the problem of
-		 * variable length bitmaps.*/
-		res->acl_data_offset = xdr_stream_pos(xdr) - pg_offset;
+		if (attrlen <= res->acl_len)
+			read_bytes_from_xdr_buf(xdr->buf, offset,
+						res->buf, attrlen);
 		res->acl_len = attrlen;
-
-		/* Check for receive buffer overflow */
-		if (res->acl_len > (xdr->nwords << 2) ||
-		    res->acl_len + res->acl_data_offset > xdr->buf->page_len) {
-			res->acl_flags |= NFS4_ACL_TRUNC;
-			dprintk("NFS: acl reply: attrlen %u > page_len %u\n",
-					attrlen, xdr->nwords << 2);
-		}
 	} else
 		status = -EOPNOTSUPP;
 
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index 348f7c158084..418116d62740 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -747,13 +747,11 @@ struct nfs_getaclargs {
 	struct page **			acl_pages;
 };
 
-/* getxattr ACL interface flags */
-#define NFS4_ACL_TRUNC		0x0001	/* ACL was truncated */
 struct nfs_getaclres {
 	struct nfs4_sequence_res	seq_res;
+	void *				buf;
 	size_t				acl_len;
 	size_t				acl_data_offset;
-	int				acl_flags;
 	struct page *			acl_scratch;
 };
 
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 3/3] nfsd4: simplify getacl decoding
  2017-02-17 16:44 ` [PATCH 3/3] nfsd4: simplify getacl decoding J. Bruce Fields
@ 2017-02-17 19:15   ` kbuild test robot
  2017-02-17 19:33     ` J. Bruce Fields
  2017-02-17 19:35   ` [PATCH] nfsd4: fix ifnullfree.cocci warnings kbuild test robot
  2017-02-17 19:35   ` [PATCH 3/3] nfsd4: simplify getacl decoding kbuild test robot
  2 siblings, 1 reply; 46+ messages in thread
From: kbuild test robot @ 2017-02-17 19:15 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: kbuild-all, Trond Myklebust, Anna Schumaker, linux-nfs,
	Andreas Gruenbacher, Weston Andros Adamson, J. Bruce Fields

[-- Attachment #1: Type: text/plain, Size: 2958 bytes --]

Hi Bruce,

[auto build test ERROR on nfs/linux-next]
[also build test ERROR on v4.10-rc8 next-20170217]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/J-Bruce-Fields/getacl-fixes/20170218-022320
base:   git://git.linux-nfs.org/projects/trondmy/linux-nfs.git linux-next
config: x86_64-randconfig-x015-201707 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All error/warnings (new ones prefixed by >>):

   fs/nfs/nfs4proc.c: In function 'nfs4_do_get_acl':
>> fs/nfs/nfs4proc.c:5095:2: error: 'ret' undeclared (first use in this function)
     ret = nfs4_call_sync(NFS_SERVER(inode)->client, NFS_SERVER(inode),
     ^~~
   fs/nfs/nfs4proc.c:5095:2: note: each undeclared identifier is reported only once for each function it appears in
>> fs/nfs/nfs4proc.c:5104:1: warning: control reaches end of non-void function [-Wreturn-type]
    }
    ^

vim +/ret +5095 fs/nfs/nfs4proc.c

cfc94798 Weston Andros Adamson 2017-02-17  5089  		return -ENOMEM;
5a006899 Sachin Prabhu         2012-04-17  5090  
cfc94798 Weston Andros Adamson 2017-02-17  5091  	args.acl_len = ARRAY_SIZE(pages) << PAGE_SHIFT;
5a006899 Sachin Prabhu         2012-04-17  5092  
cfc94798 Weston Andros Adamson 2017-02-17  5093  	dprintk("%s  buf %p buflen %zu args.acl_len %zu\n",
cfc94798 Weston Andros Adamson 2017-02-17  5094  		__func__, buf, buflen, args.acl_len);
bf118a34 Andy Adamson          2011-12-07 @5095  	ret = nfs4_call_sync(NFS_SERVER(inode)->client, NFS_SERVER(inode),
bf118a34 Andy Adamson          2011-12-07  5096  			     &msg, &args.seq_args, &res.seq_res, 0);
942bf94d J. Bruce Fields       2017-02-17  5097  	if (ret == 0)
1f1ea6c2 Trond Myklebust       2012-08-26  5098  		ret = res.acl_len;
942bf94d J. Bruce Fields       2017-02-17  5099  
cfc94798 Weston Andros Adamson 2017-02-17  5100  	for (i = 0; i < ARRAY_SIZE(pages) && pages[i]; i++)
bf118a34 Andy Adamson          2011-12-07  5101  		__free_page(pages[i]);
331818f1 Trond Myklebust       2012-02-03  5102  	__free_page(res.acl_scratch);
e50a1c2e J. Bruce Fields       2005-06-22  5103  	return ret;
e50a1c2e J. Bruce Fields       2005-06-22 @5104  }
e50a1c2e J. Bruce Fields       2005-06-22  5105  
942bf94d J. Bruce Fields       2017-02-17  5106  /*
942bf94d J. Bruce Fields       2017-02-17  5107   * The getxattr API returns the required buffer length when called with a

:::::: The code at line 5095 was first introduced by commit
:::::: bf118a342f10dafe44b14451a1392c3254629a1f NFSv4: include bitmap in nfsv4 get acl data

:::::: TO: Andy Adamson <andros@netapp.com>
:::::: CC: Trond Myklebust <Trond.Myklebust@netapp.com>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 26364 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 3/3] nfsd4: simplify getacl decoding
  2017-02-17 19:15   ` kbuild test robot
@ 2017-02-17 19:33     ` J. Bruce Fields
  0 siblings, 0 replies; 46+ messages in thread
From: J. Bruce Fields @ 2017-02-17 19:33 UTC (permalink / raw)
  To: kbuild test robot
  Cc: J. Bruce Fields, kbuild-all, Trond Myklebust, Anna Schumaker,
	linux-nfs, Andreas Gruenbacher, Weston Andros Adamson

Whoops, forgot to retst after a last minute change!  Fixed locally.
Also fixed up the subject lines that refer to "nfsd".

--b.

On Sat, Feb 18, 2017 at 03:15:11AM +0800, kbuild test robot wrote:
> Hi Bruce,
> 
> [auto build test ERROR on nfs/linux-next]
> [also build test ERROR on v4.10-rc8 next-20170217]
> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
> 
> url:    https://github.com/0day-ci/linux/commits/J-Bruce-Fields/getacl-fixes/20170218-022320
> base:   git://git.linux-nfs.org/projects/trondmy/linux-nfs.git linux-next
> config: x86_64-randconfig-x015-201707 (attached as .config)
> compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
> reproduce:
>         # save the attached .config to linux build tree
>         make ARCH=x86_64 
> 
> All error/warnings (new ones prefixed by >>):
> 
>    fs/nfs/nfs4proc.c: In function 'nfs4_do_get_acl':
> >> fs/nfs/nfs4proc.c:5095:2: error: 'ret' undeclared (first use in this function)
>      ret = nfs4_call_sync(NFS_SERVER(inode)->client, NFS_SERVER(inode),
>      ^~~
>    fs/nfs/nfs4proc.c:5095:2: note: each undeclared identifier is reported only once for each function it appears in
> >> fs/nfs/nfs4proc.c:5104:1: warning: control reaches end of non-void function [-Wreturn-type]
>     }
>     ^
> 
> vim +/ret +5095 fs/nfs/nfs4proc.c
> 
> cfc94798 Weston Andros Adamson 2017-02-17  5089  		return -ENOMEM;
> 5a006899 Sachin Prabhu         2012-04-17  5090  
> cfc94798 Weston Andros Adamson 2017-02-17  5091  	args.acl_len = ARRAY_SIZE(pages) << PAGE_SHIFT;
> 5a006899 Sachin Prabhu         2012-04-17  5092  
> cfc94798 Weston Andros Adamson 2017-02-17  5093  	dprintk("%s  buf %p buflen %zu args.acl_len %zu\n",
> cfc94798 Weston Andros Adamson 2017-02-17  5094  		__func__, buf, buflen, args.acl_len);
> bf118a34 Andy Adamson          2011-12-07 @5095  	ret = nfs4_call_sync(NFS_SERVER(inode)->client, NFS_SERVER(inode),
> bf118a34 Andy Adamson          2011-12-07  5096  			     &msg, &args.seq_args, &res.seq_res, 0);
> 942bf94d J. Bruce Fields       2017-02-17  5097  	if (ret == 0)
> 1f1ea6c2 Trond Myklebust       2012-08-26  5098  		ret = res.acl_len;
> 942bf94d J. Bruce Fields       2017-02-17  5099  
> cfc94798 Weston Andros Adamson 2017-02-17  5100  	for (i = 0; i < ARRAY_SIZE(pages) && pages[i]; i++)
> bf118a34 Andy Adamson          2011-12-07  5101  		__free_page(pages[i]);
> 331818f1 Trond Myklebust       2012-02-03  5102  	__free_page(res.acl_scratch);
> e50a1c2e J. Bruce Fields       2005-06-22  5103  	return ret;
> e50a1c2e J. Bruce Fields       2005-06-22 @5104  }
> e50a1c2e J. Bruce Fields       2005-06-22  5105  
> 942bf94d J. Bruce Fields       2017-02-17  5106  /*
> 942bf94d J. Bruce Fields       2017-02-17  5107   * The getxattr API returns the required buffer length when called with a
> 
> :::::: The code at line 5095 was first introduced by commit
> :::::: bf118a342f10dafe44b14451a1392c3254629a1f NFSv4: include bitmap in nfsv4 get acl data
> 
> :::::: TO: Andy Adamson <andros@netapp.com>
> :::::: CC: Trond Myklebust <Trond.Myklebust@netapp.com>
> 
> ---
> 0-DAY kernel test infrastructure                Open Source Technology Center
> https://lists.01.org/pipermail/kbuild-all                   Intel Corporation



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] nfsd4: fix ifnullfree.cocci warnings
  2017-02-17 19:35   ` [PATCH] nfsd4: fix ifnullfree.cocci warnings kbuild test robot
@ 2017-02-17 19:34     ` J. Bruce Fields
  0 siblings, 0 replies; 46+ messages in thread
From: J. Bruce Fields @ 2017-02-17 19:34 UTC (permalink / raw)
  To: kbuild test robot
  Cc: J. Bruce Fields, kbuild-all, Trond Myklebust, Anna Schumaker,
	linux-nfs, Andreas Gruenbacher, Weston Andros Adamson

Fixed locally.--b.

On Sat, Feb 18, 2017 at 03:35:21AM +0800, kbuild test robot wrote:
> fs/nfs/nfs4proc.c:5135:2-7: WARNING: NULL check before freeing functions like kfree, debugfs_remove, debugfs_remove_recursive or usb_free_urb is not needed. Maybe consider reorganizing relevant code to avoid passing NULL values.
> 
>  NULL check before some freeing functions is not needed.
> 
>  Based on checkpatch warning
>  "kfree(NULL) is safe this check is probably not required"
>  and kfreeaddr.cocci by Julia Lawall.
> 
> Generated by: scripts/coccinelle/free/ifnullfree.cocci
> 
> CC: J. Bruce Fields <bfields@redhat.com>
> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
> ---
> 
>  nfs4proc.c |    3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -5131,8 +5131,7 @@ static ssize_t __nfs4_get_acl_uncached(s
>  	if (buf && ret > buflen)
>  		ret = -ERANGE;
>  out:
> -	if (priv_buf)
> -		kfree(priv_buf);
> +	kfree(priv_buf);
>  	return ret;
>  }
>  
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 3/3] nfsd4: simplify getacl decoding
  2017-02-17 16:44 ` [PATCH 3/3] nfsd4: simplify getacl decoding J. Bruce Fields
  2017-02-17 19:15   ` kbuild test robot
  2017-02-17 19:35   ` [PATCH] nfsd4: fix ifnullfree.cocci warnings kbuild test robot
@ 2017-02-17 19:35   ` kbuild test robot
  2 siblings, 0 replies; 46+ messages in thread
From: kbuild test robot @ 2017-02-17 19:35 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: kbuild-all, Trond Myklebust, Anna Schumaker, linux-nfs,
	Andreas Gruenbacher, Weston Andros Adamson, J. Bruce Fields

Hi Bruce,

[auto build test WARNING on nfs/linux-next]
[also build test WARNING on v4.10-rc8 next-20170217]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/J-Bruce-Fields/getacl-fixes/20170218-022320
base:   git://git.linux-nfs.org/projects/trondmy/linux-nfs.git linux-next


coccinelle warnings: (new ones prefixed by >>)

>> fs/nfs/nfs4proc.c:5135:2-7: WARNING: NULL check before freeing functions like kfree, debugfs_remove, debugfs_remove_recursive or usb_free_urb is not needed. Maybe consider reorganizing relevant code to avoid passing NULL values.

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH] nfsd4: fix ifnullfree.cocci warnings
  2017-02-17 16:44 ` [PATCH 3/3] nfsd4: simplify getacl decoding J. Bruce Fields
  2017-02-17 19:15   ` kbuild test robot
@ 2017-02-17 19:35   ` kbuild test robot
  2017-02-17 19:34     ` J. Bruce Fields
  2017-02-17 19:35   ` [PATCH 3/3] nfsd4: simplify getacl decoding kbuild test robot
  2 siblings, 1 reply; 46+ messages in thread
From: kbuild test robot @ 2017-02-17 19:35 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: kbuild-all, Trond Myklebust, Anna Schumaker, linux-nfs,
	Andreas Gruenbacher, Weston Andros Adamson, J. Bruce Fields

fs/nfs/nfs4proc.c:5135:2-7: WARNING: NULL check before freeing functions like kfree, debugfs_remove, debugfs_remove_recursive or usb_free_urb is not needed. Maybe consider reorganizing relevant code to avoid passing NULL values.

 NULL check before some freeing functions is not needed.

 Based on checkpatch warning
 "kfree(NULL) is safe this check is probably not required"
 and kfreeaddr.cocci by Julia Lawall.

Generated by: scripts/coccinelle/free/ifnullfree.cocci

CC: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
---

 nfs4proc.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5131,8 +5131,7 @@ static ssize_t __nfs4_get_acl_uncached(s
 	if (buf && ret > buflen)
 		ret = -ERANGE;
 out:
-	if (priv_buf)
-		kfree(priv_buf);
+	kfree(priv_buf);
 	return ret;
 }
 

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 0/3] getacl fixes
  2017-02-17 16:44 [PATCH 0/3] getacl fixes J. Bruce Fields
                   ` (2 preceding siblings ...)
  2017-02-17 16:44 ` [PATCH 3/3] nfsd4: simplify getacl decoding J. Bruce Fields
@ 2017-02-17 20:36 ` Chuck Lever
  2017-02-17 20:52   ` J. Bruce Fields
  2017-02-19  2:07 ` [PATCH 0/6] getacl fixes V2 J. Bruce Fields
  4 siblings, 1 reply; 46+ messages in thread
From: Chuck Lever @ 2017-02-17 20:36 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Trond Myklebust, Anna Schumaker, Linux NFS Mailing List,
	Andreas Gruenbacher, Dros Adamson


> On Feb 17, 2017, at 11:44 AM, J. Bruce Fields <bfields@redhat.com> wrote:
> 
> From: "J. Bruce Fields" <bfields@redhat.com>
> 
> The getacl code is allocating enough space to handle the ACL data but
> not to handle the bitmask, which can lead to spurious ERANGE errors when
> the end of the ACL gets close to a page boundary.
> 
> Dros addressed this by letting the rpc layer allocate pages as necessary
> on demand, as the NFSv3 ACL code does.
> 
> On its own that didn't do the job either, because we don't handle the
> case where xdr_shrink_bufhead needs to move data around in the xdr buf.
> And xdr_shrink_bufhead was getting called every time due to an incorrect
> estimate in an xdr_inline_pages call.
> 
> So, I fixed that estimate.  That still leaves the chance of a bug in the
> rare case xdr_shrink_bufhead is called.
> 
> We could fix up the handling of the xdr_shrink_bufhead case, but I don't
> see the point of shifting this data around in the first place.  We're
> not doing anything like zero-copy here, we're just going to copy the
> data out into the buffer we were passed.  The NFSv3 ACL code doesn't
> bother with this.
> 
> It's simpler just to pass down the buffer to the xdr layer and let it
> copy the ACL out.

I haven't looked closely at these yet, but I have some general
thoughts (worth approximately 2 cents).

NFS/RDMA clients have to pre-allocate and register a receive buffer
for requests with large replies. The client's RPC layer can't allocate
more memory if the reply overruns the existing buffer.

(Note that the server doesn't have the same problem: the client
sends an RPC-over-RDMA message telling the server exactly how large
the RPC Call message is, and the server prepares RDMA Read operations
to pull it over.)

ACLs are particularly troublesome because there doesn't seem to be
a way for a client to ask a server "how big is this ACL?" before it
actually asks for the ACL. And at least for NFSACL there does
not seem to be a protocol-defined size limit for these objects.

If the server can't fit an ACL into the client-provided reply buffer,
that causes a transport level error. The blast radius of this failure
includes any RPC that happens to be running on that connection, which
will have to be retransmitted.

If the client has sent a COMPOUND with a non-idempotent request in
the same COMPOUND with a GETATTR requesting the ACL, there could
be a problem if the server can't return the RPC because the client's
receive buffer is too small. The solution there is to always send
such operations in separate COMPOUNDs.

So I prefer in general that the NFS client (above RPC) provide as
large a buffer as practical for NFSACL GETACL and NFSv4 GETATTR
requesting an ACL. IIUC that is the direction your patches are
going.

We likely have a similar conundrum with security labels.


> The result looks a lot simpler and more obviously correct than this code
> has been, though I'm not particularly happy with the sequence of patches
> that gets us there; it would be better to squash together Dros's and my
> patch and then split out the result some more sensible way.
> 
> Sorry for the delay getting back to this.  Older discussions:
> 
> 	https://marc.info/?t=138452791200001&r=1&w=2
> 	http://marc.info/?t=138506891000003&r=1&w=2
> 
> J. Bruce Fields (2):
>  nfsd4: fix getacl head length estimation
>  nfsd4: simplify getacl decoding
> 
> Weston Andros Adamson (1):
>  NFSv4: fix getacl ERANGE for some ACL buffer sizes
> 
> fs/nfs/nfs4proc.c       | 116 +++++++++++++++++++++++-------------------------
> fs/nfs/nfs4xdr.c        |  29 +++---------
> include/linux/nfs_xdr.h |   4 +-
> 3 files changed, 64 insertions(+), 85 deletions(-)
> 
> -- 
> 2.9.3
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
Chuck Lever




^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 0/3] getacl fixes
  2017-02-17 20:36 ` [PATCH 0/3] getacl fixes Chuck Lever
@ 2017-02-17 20:52   ` J. Bruce Fields
  2017-02-17 21:21     ` Chuck Lever
  0 siblings, 1 reply; 46+ messages in thread
From: J. Bruce Fields @ 2017-02-17 20:52 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Trond Myklebust, Anna Schumaker, Linux NFS Mailing List,
	Andreas Gruenbacher, Dros Adamson

On Fri, Feb 17, 2017 at 03:36:38PM -0500, Chuck Lever wrote:
> 
> > On Feb 17, 2017, at 11:44 AM, J. Bruce Fields <bfields@redhat.com> wrote:
> > 
> > From: "J. Bruce Fields" <bfields@redhat.com>
> > 
> > The getacl code is allocating enough space to handle the ACL data but
> > not to handle the bitmask, which can lead to spurious ERANGE errors when
> > the end of the ACL gets close to a page boundary.
> > 
> > Dros addressed this by letting the rpc layer allocate pages as necessary
> > on demand, as the NFSv3 ACL code does.
> > 
> > On its own that didn't do the job either, because we don't handle the
> > case where xdr_shrink_bufhead needs to move data around in the xdr buf.
> > And xdr_shrink_bufhead was getting called every time due to an incorrect
> > estimate in an xdr_inline_pages call.
> > 
> > So, I fixed that estimate.  That still leaves the chance of a bug in the
> > rare case xdr_shrink_bufhead is called.
> > 
> > We could fix up the handling of the xdr_shrink_bufhead case, but I don't
> > see the point of shifting this data around in the first place.  We're
> > not doing anything like zero-copy here, we're just going to copy the
> > data out into the buffer we were passed.  The NFSv3 ACL code doesn't
> > bother with this.
> > 
> > It's simpler just to pass down the buffer to the xdr layer and let it
> > copy the ACL out.
> 
> I haven't looked closely at these yet, but I have some general
> thoughts (worth approximately 2 cents).
> 
> NFS/RDMA clients have to pre-allocate and register a receive buffer
> for requests with large replies. The client's RPC layer can't allocate
> more memory if the reply overruns the existing buffer.
> 
> (Note that the server doesn't have the same problem: the client
> sends an RPC-over-RDMA message telling the server exactly how large
> the RPC Call message is, and the server prepares RDMA Read operations
> to pull it over.)
> 
> ACLs are particularly troublesome because there doesn't seem to be
> a way for a client to ask a server "how big is this ACL?" before it
> actually asks for the ACL. And at least for NFSACL there does
> not seem to be a protocol-defined size limit for these objects.

I think in practice the OS/filesystem limits end up being the limiting
factor.  V4.0 might be the more annoying case, partly thanks to all
those string names.

> If the server can't fit an ACL into the client-provided reply buffer,
> that causes a transport level error. The blast radius of this failure
> includes any RPC that happens to be running on that connection, which
> will have to be retransmitted.
> 
> If the client has sent a COMPOUND with a non-idempotent request in
> the same COMPOUND with a GETATTR requesting the ACL, there could
> be a problem if the server can't return the RPC because the client's
> receive buffer is too small. The solution there is to always send
> such operations in separate COMPOUNDs.
> 
> So I prefer in general that the NFS client (above RPC) provide as
> large a buffer as practical for NFSACL GETACL and NFSv4 GETATTR
> requesting an ACL. IIUC that is the direction your patches are
> going.

No, the net effect is to make the v4 code like the v3 code and allocate
pages for the reply only on demand.  (I understand the confusion,
there's multiple buffers involved here, and my description could
probably be better.)

Ugh.

Does the RDMA protocol give us any other mechanism we can use for the
case of ACL replies?

It probably wouldn't be so terrible to preallocate the maximum number of
pages possible if that's really the only option.  May as well get rid of
the allocations in xdr_partial_copy_from_skb if we do that, as I don't
think there are other users?

--b.

> 
> We likely have a similar conundrum with security labels.
> 
> 
> > The result looks a lot simpler and more obviously correct than this code
> > has been, though I'm not particularly happy with the sequence of patches
> > that gets us there; it would be better to squash together Dros's and my
> > patch and then split out the result some more sensible way.
> > 
> > Sorry for the delay getting back to this.  Older discussions:
> > 
> > 	https://marc.info/?t=138452791200001&r=1&w=2
> > 	http://marc.info/?t=138506891000003&r=1&w=2
> > 
> > J. Bruce Fields (2):
> >  nfsd4: fix getacl head length estimation
> >  nfsd4: simplify getacl decoding
> > 
> > Weston Andros Adamson (1):
> >  NFSv4: fix getacl ERANGE for some ACL buffer sizes
> > 
> > fs/nfs/nfs4proc.c       | 116 +++++++++++++++++++++++-------------------------
> > fs/nfs/nfs4xdr.c        |  29 +++---------
> > include/linux/nfs_xdr.h |   4 +-
> > 3 files changed, 64 insertions(+), 85 deletions(-)
> > 
> > -- 
> > 2.9.3
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> --
> Chuck Lever
> 
> 
> 

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 0/3] getacl fixes
  2017-02-17 20:52   ` J. Bruce Fields
@ 2017-02-17 21:21     ` Chuck Lever
  0 siblings, 0 replies; 46+ messages in thread
From: Chuck Lever @ 2017-02-17 21:21 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Trond Myklebust, Anna Schumaker, Linux NFS Mailing List,
	Andreas Gruenbacher, Dros Adamson


> On Feb 17, 2017, at 3:52 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> 
> On Fri, Feb 17, 2017 at 03:36:38PM -0500, Chuck Lever wrote:
>> 
>>> On Feb 17, 2017, at 11:44 AM, J. Bruce Fields <bfields@redhat.com> wrote:
>>> 
>>> From: "J. Bruce Fields" <bfields@redhat.com>
>>> 
>>> The getacl code is allocating enough space to handle the ACL data but
>>> not to handle the bitmask, which can lead to spurious ERANGE errors when
>>> the end of the ACL gets close to a page boundary.
>>> 
>>> Dros addressed this by letting the rpc layer allocate pages as necessary
>>> on demand, as the NFSv3 ACL code does.
>>> 
>>> On its own that didn't do the job either, because we don't handle the
>>> case where xdr_shrink_bufhead needs to move data around in the xdr buf.
>>> And xdr_shrink_bufhead was getting called every time due to an incorrect
>>> estimate in an xdr_inline_pages call.
>>> 
>>> So, I fixed that estimate.  That still leaves the chance of a bug in the
>>> rare case xdr_shrink_bufhead is called.
>>> 
>>> We could fix up the handling of the xdr_shrink_bufhead case, but I don't
>>> see the point of shifting this data around in the first place.  We're
>>> not doing anything like zero-copy here, we're just going to copy the
>>> data out into the buffer we were passed.  The NFSv3 ACL code doesn't
>>> bother with this.
>>> 
>>> It's simpler just to pass down the buffer to the xdr layer and let it
>>> copy the ACL out.
>> 
>> I haven't looked closely at these yet, but I have some general
>> thoughts (worth approximately 2 cents).
>> 
>> NFS/RDMA clients have to pre-allocate and register a receive buffer
>> for requests with large replies. The client's RPC layer can't allocate
>> more memory if the reply overruns the existing buffer.
>> 
>> (Note that the server doesn't have the same problem: the client
>> sends an RPC-over-RDMA message telling the server exactly how large
>> the RPC Call message is, and the server prepares RDMA Read operations
>> to pull it over.)
>> 
>> ACLs are particularly troublesome because there doesn't seem to be
>> a way for a client to ask a server "how big is this ACL?" before it
>> actually asks for the ACL. And at least for NFSACL there does
>> not seem to be a protocol-defined size limit for these objects.
> 
> I think in practice the OS/filesystem limits end up being the limiting
> factor.  V4.0 might be the more annoying case, partly thanks to all
> those string names.

Agree, though there is no sure-fire way for either
side to know what the other peer's limits might be,
unlike, say, SYMLINK.


>> If the server can't fit an ACL into the client-provided reply buffer,
>> that causes a transport level error. The blast radius of this failure
>> includes any RPC that happens to be running on that connection, which
>> will have to be retransmitted.
>> 
>> If the client has sent a COMPOUND with a non-idempotent request in
>> the same COMPOUND with a GETATTR requesting the ACL, there could
>> be a problem if the server can't return the RPC because the client's
>> receive buffer is too small. The solution there is to always send
>> such operations in separate COMPOUNDs.
>> 
>> So I prefer in general that the NFS client (above RPC) provide as
>> large a buffer as practical for NFSACL GETACL and NFSv4 GETATTR
>> requesting an ACL. IIUC that is the direction your patches are
>> going.
> 
> No, the net effect is to make the v4 code like the v3 code and allocate
> pages for the reply only on demand.  (I understand the confusion,
> there's multiple buffers involved here, and my description could
> probably be better.)

There is a similar hack in xprtrdma's marshaling code
that allocates reply pages while constructing the RPC
Call if the upper layer hasn't provided them.

It would be great if instead, retrieving an ACL worked
like other NFS operations.


> Ugh.
> 
> Does the RDMA protocol give us any other mechanism we can use for the
> case of ACL replies?

The current RPC-over-RDMA Version One transport provides
two options:

If the RPC Reply is guaranteed to be smaller than the
inline threshold (the size of what can be sent with RDMA
Send/Receive, which can be as small as 1KB), then RDMA
Send with pre-allocated reusable buffers is used to send
the reply. This works fine as long as the expected
Reply message will be small.

If the RPC Reply could be larger than the inline threshold,
the client has to provide a Reply chunk, which is a region
of client memory that is registered so that the server can
use RDMA Writes to return the reply.

If that region is too small, the server is supposed to
return a transport-specific error instead of the RPC
Reply.

(There is a third mechanism but it is forbidden for
everything but NFS READ/WRITE, and it would have the
same problem because the client has to know the size
of the Reply in advance).

For NFSACL GETACL, for example, the client doesn't know
how large the RPC Reply message might be. So it will always
register a Reply chunk for GETACL requests, and it risks
underestimating the size of this region (though I've never
seen it actually get overrun, since real world ACLs tend to
be small).


> It probably wouldn't be so terrible to preallocate the maximum number of
> pages possible if that's really the only option.  May as well get rid of
> the allocations in xdr_partial_copy_from_skb if we do that, as I don't
> think there are other users?

I can't think of any other use cases which rely on this
mechanism.

That mechanism is unreliable, isn't it? It can fail
because it cannot use GFP_KERNEL allocation while
receiving a reply, or am I misinformed. That makes
either of the current implementation choices less
preferable than having the NFS client always allocate
a large buffer while still in process context, AFAICS.


> --b.
> 
>> 
>> We likely have a similar conundrum with security labels.
>> 
>> 
>>> The result looks a lot simpler and more obviously correct than this code
>>> has been, though I'm not particularly happy with the sequence of patches
>>> that gets us there; it would be better to squash together Dros's and my
>>> patch and then split out the result some more sensible way.
>>> 
>>> Sorry for the delay getting back to this.  Older discussions:
>>> 
>>> 	https://marc.info/?t=138452791200001&r=1&w=2
>>> 	http://marc.info/?t=138506891000003&r=1&w=2
>>> 
>>> J. Bruce Fields (2):
>>> nfsd4: fix getacl head length estimation
>>> nfsd4: simplify getacl decoding
>>> 
>>> Weston Andros Adamson (1):
>>> NFSv4: fix getacl ERANGE for some ACL buffer sizes
>>> 
>>> fs/nfs/nfs4proc.c       | 116 +++++++++++++++++++++++-------------------------
>>> fs/nfs/nfs4xdr.c        |  29 +++---------
>>> include/linux/nfs_xdr.h |   4 +-
>>> 3 files changed, 64 insertions(+), 85 deletions(-)
>>> 
>>> -- 
>>> 2.9.3
>>> 
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> 
>> --
>> Chuck Lever
>> 
>> 
>> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
Chuck Lever




^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 0/6] getacl fixes V2
  2017-02-17 16:44 [PATCH 0/3] getacl fixes J. Bruce Fields
                   ` (3 preceding siblings ...)
  2017-02-17 20:36 ` [PATCH 0/3] getacl fixes Chuck Lever
@ 2017-02-19  2:07 ` J. Bruce Fields
  2017-02-19  2:07   ` [PATCH 1/6] NFSv4: fix getacl head length estimation J. Bruce Fields
                     ` (5 more replies)
  4 siblings, 6 replies; 46+ messages in thread
From: J. Bruce Fields @ 2017-02-19  2:07 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker
  Cc: linux-nfs, Andreas Gruenbacher, Weston Andros Adamson, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

> The getacl code is allocating enough space to handle the ACL data but
> not to handle the bitmask, which can lead to spurious ERANGE errors when
> the end of the ACL gets close to a page boundary.
> 
> Dros addressed this by letting the rpc layer allocate pages as necessary
> on demand, as the NFSv3 ACL code does.
> 
> On its own that didn't do the job either, because we don't handle the
> case where xdr_shrink_bufhead needs to move data around in the xdr buf.
> And xdr_shrink_bufhead was getting called every time due to an incorrect
> estimate in an xdr_inline_pages call.
> 
> So, I fixed that estimate.  That still leaves the chance of a bug in the
> rare case xdr_shrink_bufhead is called.
> 
> We could fix up the handling of the xdr_shrink_bufhead case, but I don't
> see the point of shifting this data around in the first place.  We're
> not doing anything like zero-copy here, we're just going to copy the
> data out into the buffer we were passed.  The NFSv3 ACL code doesn't
> bother with this.
> 
> It's simpler just to pass down the buffer to the xdr layer and let it
> copy the ACL out.
> 
> The result looks a lot simpler and more obviously correct than this code
> has been, though I'm not particularly happy with the sequence of patches
> that gets us there; it would be better to squash together Dros's and my
> patch and then split out the result some more sensible way.

Done; here it is split out so the first two patches are minimal
bugfixes, the next two do some minor cleanup, the fifth converts to
copying data into "buf" from the xdr decoder, and the final one converts
to on-demand allocation (the part Chuck is asking about).

I left Dros as the author on the minimal ERANGE fix and the on-demand
allocation, though that choice is a little arbitrary as the changes are
all split up differently; Dros, let me know if that's problem.

It'd be nice to get at least the first two in (and possibly cc'd to
stable as well).

--b.

> 
> Sorry for the delay getting back to this.  Older discussions:
> 
>         https://marc.info/?t=138452791200001&r=1&w=2
>         http://marc.info/?t=138506891000003&r=1&w=2



J. Bruce Fields (4):
  NFSv4: fix getacl head length estimation
  NFSv4: minor acl caching policy documentation
  NFSv4: minor getacl cleanup
  NFSv4: simplify getacl decoding

Weston Andros Adamson (2):
  NFSv4: fix getacl ERANGE for some ACL buffer sizes
  NFSv4: allow getacl rpc to allocate pages on demand

 fs/nfs/nfs4proc.c       | 114 +++++++++++++++++++++++-------------------------
 fs/nfs/nfs4xdr.c        |  29 +++---------
 include/linux/nfs_xdr.h |   4 +-
 3 files changed, 62 insertions(+), 85 deletions(-)

-- 
2.9.3


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 1/6] NFSv4: fix getacl head length estimation
  2017-02-19  2:07 ` [PATCH 0/6] getacl fixes V2 J. Bruce Fields
@ 2017-02-19  2:07   ` J. Bruce Fields
  2017-02-20 13:19     ` Kinglong Mee
  2017-02-19  2:07   ` [PATCH 2/6] NFSv4: fix getacl ERANGE for some ACL buffer sizes J. Bruce Fields
                     ` (4 subsequent siblings)
  5 siblings, 1 reply; 46+ messages in thread
From: J. Bruce Fields @ 2017-02-19  2:07 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker
  Cc: linux-nfs, Andreas Gruenbacher, Weston Andros Adamson, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Bitmap and attrlen follow immediately after the op reply header, so I'm
not sure what this extra "+1" was for.

Consequences of this are just minor efficiency (extra calls to
xdr_shrink_bufhead).

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfs/nfs4xdr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index e9255cb453e6..bb95dd2edeef 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -2524,7 +2524,7 @@ static void nfs4_xdr_enc_getacl(struct rpc_rqst *req, struct xdr_stream *xdr,
 	encode_compound_hdr(xdr, req, &hdr);
 	encode_sequence(xdr, &args->seq_args, &hdr);
 	encode_putfh(xdr, args->fh, &hdr);
-	replen = hdr.replen + op_decode_hdr_maxsz + 1;
+	replen = hdr.replen + op_decode_hdr_maxsz;
 	encode_getattr_two(xdr, FATTR4_WORD0_ACL, 0, &hdr);
 
 	xdr_inline_pages(&req->rq_rcv_buf, replen << 2,
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 2/6] NFSv4: fix getacl ERANGE for some ACL buffer sizes
  2017-02-19  2:07 ` [PATCH 0/6] getacl fixes V2 J. Bruce Fields
  2017-02-19  2:07   ` [PATCH 1/6] NFSv4: fix getacl head length estimation J. Bruce Fields
@ 2017-02-19  2:07   ` J. Bruce Fields
  2017-02-21 19:46     ` Weston Andros Adamson
  2017-02-19  2:07   ` [PATCH 3/6] NFSv4: minor acl caching policy documentation J. Bruce Fields
                     ` (3 subsequent siblings)
  5 siblings, 1 reply; 46+ messages in thread
From: J. Bruce Fields @ 2017-02-19  2:07 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker
  Cc: linux-nfs, Andreas Gruenbacher, Weston Andros Adamson,
	Weston Andros Adamson, J. Bruce Fields

From: Weston Andros Adamson <dros@netapp.com>

We're not taking into account that the space needed for the (variable
length) attr bitmap, with the result that we'd sometimes get a spurious
ERANGE when the ACL data got close to the end of a page.

Just add in an extra page to make sure.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfs/nfs4proc.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 0a0eaecf9676..60501e10625d 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5069,7 +5069,7 @@ static void nfs4_write_cached_acl(struct inode *inode, struct page **pages, size
  */
 static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t buflen)
 {
-	struct page *pages[NFS4ACL_MAXPAGES] = {NULL, };
+	struct page *pages[NFS4ACL_MAXPAGES + 1] = {NULL, };
 	struct nfs_getaclargs args = {
 		.fh = NFS_FH(inode),
 		.acl_pages = pages,
@@ -5083,13 +5083,9 @@ static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t bu
 		.rpc_argp = &args,
 		.rpc_resp = &res,
 	};
-	unsigned int npages = DIV_ROUND_UP(buflen, PAGE_SIZE);
+	unsigned int npages = DIV_ROUND_UP(buflen, PAGE_SIZE) + 1;
 	int ret = -ENOMEM, i;
 
-	/* As long as we're doing a round trip to the server anyway,
-	 * let's be prepared for a page of acl data. */
-	if (npages == 0)
-		npages = 1;
 	if (npages > ARRAY_SIZE(pages))
 		return -ERANGE;
 
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 3/6] NFSv4: minor acl caching policy documentation
  2017-02-19  2:07 ` [PATCH 0/6] getacl fixes V2 J. Bruce Fields
  2017-02-19  2:07   ` [PATCH 1/6] NFSv4: fix getacl head length estimation J. Bruce Fields
  2017-02-19  2:07   ` [PATCH 2/6] NFSv4: fix getacl ERANGE for some ACL buffer sizes J. Bruce Fields
@ 2017-02-19  2:07   ` J. Bruce Fields
  2017-02-19  2:07   ` [PATCH 4/6] NFSv4: minor getacl cleanup J. Bruce Fields
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 46+ messages in thread
From: J. Bruce Fields @ 2017-02-19  2:07 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker
  Cc: linux-nfs, Andreas Gruenbacher, Weston Andros Adamson, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfs/nfs4proc.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 60501e10625d..a4e33e519f6e 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5035,12 +5035,18 @@ static inline ssize_t nfs4_read_cached_acl(struct inode *inode, char *buf, size_
 	return ret;
 }
 
+/*
+ * We don't cache ACLs over a certain size.  I don't know if we have any
+ * real reason for this policy:
+ */
+#define NFS4_MAX_CACHED_ACL PAGE_SIZE
+
 static void nfs4_write_cached_acl(struct inode *inode, struct page **pages, size_t pgbase, size_t acl_len)
 {
 	struct nfs4_cached_acl *acl;
 	size_t buflen = sizeof(*acl) + acl_len;
 
-	if (buflen <= PAGE_SIZE) {
+	if (buflen <= NFS4_MAX_CACHED_ACL) {
 		acl = kmalloc(buflen, GFP_KERNEL);
 		if (acl == NULL)
 			goto out;
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 4/6] NFSv4: minor getacl cleanup
  2017-02-19  2:07 ` [PATCH 0/6] getacl fixes V2 J. Bruce Fields
                     ` (2 preceding siblings ...)
  2017-02-19  2:07   ` [PATCH 3/6] NFSv4: minor acl caching policy documentation J. Bruce Fields
@ 2017-02-19  2:07   ` J. Bruce Fields
  2017-02-20 22:38     ` Andreas Gruenbacher
  2017-02-19  2:07   ` [PATCH 5/6] NFSv4: simplify getacl decoding J. Bruce Fields
  2017-02-19  2:07   ` [PATCH 6/6] NFSv4: allow getacl rpc to allocate pages on demand J. Bruce Fields
  5 siblings, 1 reply; 46+ messages in thread
From: J. Bruce Fields @ 2017-02-19  2:07 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker
  Cc: linux-nfs, Andreas Gruenbacher, Weston Andros Adamson, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfs/nfs4proc.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index a4e33e519f6e..eb6c34db9a79 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5134,11 +5134,9 @@ static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t bu
 out_ok:
 	ret = res.acl_len;
 out_free:
-	for (i = 0; i < npages; i++)
-		if (pages[i])
-			__free_page(pages[i]);
-	if (res.acl_scratch)
-		__free_page(res.acl_scratch);
+	for (i = 0; i < ARRAY_SIZE(pages) && pages[i]; i++)
+		__free_page(pages[i]);
+	__free_page(res.acl_scratch);
 	return ret;
 }
 
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 5/6] NFSv4: simplify getacl decoding
  2017-02-19  2:07 ` [PATCH 0/6] getacl fixes V2 J. Bruce Fields
                     ` (3 preceding siblings ...)
  2017-02-19  2:07   ` [PATCH 4/6] NFSv4: minor getacl cleanup J. Bruce Fields
@ 2017-02-19  2:07   ` J. Bruce Fields
  2017-02-20 22:30     ` Andreas Gruenbacher
  2017-02-19  2:07   ` [PATCH 6/6] NFSv4: allow getacl rpc to allocate pages on demand J. Bruce Fields
  5 siblings, 1 reply; 46+ messages in thread
From: J. Bruce Fields @ 2017-02-19  2:07 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker
  Cc: linux-nfs, Andreas Gruenbacher, Weston Andros Adamson, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

There's not really much point to shifting the data around with
xdr_enter_page() (and xdr_shrink_bufhead) when we're just going to
copying the data out of the xdr buf anyway instead of trying to place
the data right for zero-copy.  And it turns out if we want to leave
page allocation to the xdr_partial_copy_from_skb() (as we will in a
following patch) shifting the data brings more ocmplications.

So let's just pass that buf down to the xdr layer and let it copy data
directly into it.

That means we need to allocate our own buf in the case the user didn't
give us one, so that we can still cache in that case (to efficiently
handle the common case of a getxattr(., ., NULL, 0) to get the size
followed immediately by a getxattr(., ., buf, size)).

But this still works out to be much simpler.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfs/nfs4proc.c       | 73 +++++++++++++++++++++++++------------------------
 fs/nfs/nfs4xdr.c        | 27 ++++--------------
 include/linux/nfs_xdr.h |  4 +--
 3 files changed, 45 insertions(+), 59 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index eb6c34db9a79..3e3dbba4aa74 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5041,7 +5041,7 @@ static inline ssize_t nfs4_read_cached_acl(struct inode *inode, char *buf, size_
  */
 #define NFS4_MAX_CACHED_ACL PAGE_SIZE
 
-static void nfs4_write_cached_acl(struct inode *inode, struct page **pages, size_t pgbase, size_t acl_len)
+static void nfs4_write_cached_acl(struct inode *inode, void *buf, size_t acl_len)
 {
 	struct nfs4_cached_acl *acl;
 	size_t buflen = sizeof(*acl) + acl_len;
@@ -5051,7 +5051,7 @@ static void nfs4_write_cached_acl(struct inode *inode, struct page **pages, size
 		if (acl == NULL)
 			goto out;
 		acl->cached = 1;
-		_copy_from_pages(acl->data, pages, pgbase, acl_len);
+		memcpy(acl->data, buf, acl_len);
 	} else {
 		acl = kmalloc(sizeof(*acl), GFP_KERNEL);
 		if (acl == NULL)
@@ -5063,26 +5063,16 @@ static void nfs4_write_cached_acl(struct inode *inode, struct page **pages, size
 	nfs4_set_cached_acl(inode, acl);
 }
 
-/*
- * The getxattr API returns the required buffer length when called with a
- * NULL buf. The NFSv4 acl tool then calls getxattr again after allocating
- * the required buf.  On a NULL buf, we send a page of data to the server
- * guessing that the ACL request can be serviced by a page. If so, we cache
- * up to the page of ACL data, and the 2nd call to getxattr is serviced by
- * the cache. If not so, we throw away the page, and cache the required
- * length. The next getxattr call will then produce another round trip to
- * the server, this time with the input buf of the required size.
- */
-static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t buflen)
+static ssize_t nfs4_do_get_acl(struct inode *inode, void *buf, size_t buflen)
 {
 	struct page *pages[NFS4ACL_MAXPAGES + 1] = {NULL, };
 	struct nfs_getaclargs args = {
 		.fh = NFS_FH(inode),
 		.acl_pages = pages,
-		.acl_len = buflen,
 	};
 	struct nfs_getaclres res = {
 		.acl_len = buflen,
+		.buf = buf,
 	};
 	struct rpc_message msg = {
 		.rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_GETACL],
@@ -5112,27 +5102,8 @@ static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t bu
 		__func__, buf, buflen, npages, args.acl_len);
 	ret = nfs4_call_sync(NFS_SERVER(inode)->client, NFS_SERVER(inode),
 			     &msg, &args.seq_args, &res.seq_res, 0);
-	if (ret)
-		goto out_free;
-
-	/* Handle the case where the passed-in buffer is too short */
-	if (res.acl_flags & NFS4_ACL_TRUNC) {
-		/* Did the user only issue a request for the acl length? */
-		if (buf == NULL)
-			goto out_ok;
-		ret = -ERANGE;
-		goto out_free;
-	}
-	nfs4_write_cached_acl(inode, pages, res.acl_data_offset, res.acl_len);
-	if (buf) {
-		if (res.acl_len > buflen) {
-			ret = -ERANGE;
-			goto out_free;
-		}
-		_copy_from_pages(buf, pages, res.acl_data_offset, res.acl_len);
-	}
-out_ok:
-	ret = res.acl_len;
+	if (ret == 0)
+		ret = res.acl_len;
 out_free:
 	for (i = 0; i < ARRAY_SIZE(pages) && pages[i]; i++)
 		__free_page(pages[i]);
@@ -5140,6 +5111,38 @@ static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t bu
 	return ret;
 }
 
+/*
+ * The getxattr API returns the required buffer length when called with a
+ * NULL buf. The NFSv4 acl tool then calls getxattr again after allocating
+ * the required buf. Cache the result from the first getxattr call to avoid
+ * sending another RPC.
+ */
+static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t buflen)
+{
+	void *priv_buf = NULL;
+	void *our_buf = buf;
+	int our_buflen = buflen;
+	static ssize_t ret = -ENOMEM;
+
+	if (!buf) {
+		priv_buf = kmalloc(NFS4_MAX_CACHED_ACL, GFP_KERNEL);
+		if (!priv_buf)
+			goto out;
+		our_buf = priv_buf;
+		our_buflen = NFS4_MAX_CACHED_ACL;
+	}
+	ret = nfs4_do_get_acl(inode, our_buf, our_buflen);
+	if (ret < 0)
+		goto out;
+	if (ret <= our_buflen)
+		nfs4_write_cached_acl(inode, our_buf, ret);
+	if (buf && ret > buflen)
+		ret = -ERANGE;
+out:
+	kfree(priv_buf);
+	return ret;
+}
+
 static ssize_t nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t buflen)
 {
 	struct nfs4_exception exception = { };
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index bb95dd2edeef..534b377084fb 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -5353,39 +5353,24 @@ static int decode_getacl(struct xdr_stream *xdr, struct rpc_rqst *req,
 	uint32_t attrlen,
 		 bitmap[3] = {0};
 	int status;
-	unsigned int pg_offset;
 
-	res->acl_len = 0;
 	if ((status = decode_op_hdr(xdr, OP_GETATTR)) != 0)
 		goto out;
-
-	xdr_enter_page(xdr, xdr->buf->page_len);
-
-	/* Calculate the offset of the page data */
-	pg_offset = xdr->buf->head[0].iov_len;
-
 	if ((status = decode_attr_bitmap(xdr, bitmap)) != 0)
 		goto out;
 	if ((status = decode_attr_length(xdr, &attrlen, &savep)) != 0)
 		goto out;
-
+	if (unlikely(attrlen > (xdr->nwords << 2)))
+		return -EIO;
 	if (unlikely(bitmap[0] & (FATTR4_WORD0_ACL - 1U)))
 		return -EIO;
 	if (likely(bitmap[0] & FATTR4_WORD0_ACL)) {
+		unsigned int offset = xdr_stream_pos(xdr);
 
-		/* The bitmap (xdr len + bitmaps) and the attr xdr len words
-		 * are stored with the acl data to handle the problem of
-		 * variable length bitmaps.*/
-		res->acl_data_offset = xdr_stream_pos(xdr) - pg_offset;
+		if (attrlen <= res->acl_len)
+			read_bytes_from_xdr_buf(xdr->buf, offset,
+						res->buf, attrlen);
 		res->acl_len = attrlen;
-
-		/* Check for receive buffer overflow */
-		if (res->acl_len > (xdr->nwords << 2) ||
-		    res->acl_len + res->acl_data_offset > xdr->buf->page_len) {
-			res->acl_flags |= NFS4_ACL_TRUNC;
-			dprintk("NFS: acl reply: attrlen %u > page_len %u\n",
-					attrlen, xdr->nwords << 2);
-		}
 	} else
 		status = -EOPNOTSUPP;
 
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index 348f7c158084..418116d62740 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -747,13 +747,11 @@ struct nfs_getaclargs {
 	struct page **			acl_pages;
 };
 
-/* getxattr ACL interface flags */
-#define NFS4_ACL_TRUNC		0x0001	/* ACL was truncated */
 struct nfs_getaclres {
 	struct nfs4_sequence_res	seq_res;
+	void *				buf;
 	size_t				acl_len;
 	size_t				acl_data_offset;
-	int				acl_flags;
 	struct page *			acl_scratch;
 };
 
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 6/6] NFSv4: allow getacl rpc to allocate pages on demand
  2017-02-19  2:07 ` [PATCH 0/6] getacl fixes V2 J. Bruce Fields
                     ` (4 preceding siblings ...)
  2017-02-19  2:07   ` [PATCH 5/6] NFSv4: simplify getacl decoding J. Bruce Fields
@ 2017-02-19  2:07   ` J. Bruce Fields
  2017-02-19 19:29     ` Chuck Lever
  2017-02-20 22:38     ` Andreas Gruenbacher
  5 siblings, 2 replies; 46+ messages in thread
From: J. Bruce Fields @ 2017-02-19  2:07 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker
  Cc: linux-nfs, Andreas Gruenbacher, Weston Andros Adamson,
	Weston Andros Adamson, J. Bruce Fields

From: Weston Andros Adamson <dros@netapp.com>

Instead of preallocating pags, allow xdr_partial_copy_from_skb() to
allocate whatever pages we need on demand.  This is what the NFSv3 ACL
code does.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfs/nfs4proc.c | 23 +++++++----------------
 1 file changed, 7 insertions(+), 16 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 3e3dbba4aa74..7842c73fddfc 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5068,6 +5068,7 @@ static ssize_t nfs4_do_get_acl(struct inode *inode, void *buf, size_t buflen)
 	struct page *pages[NFS4ACL_MAXPAGES + 1] = {NULL, };
 	struct nfs_getaclargs args = {
 		.fh = NFS_FH(inode),
+		/* The xdr layer may allocate pages here. */
 		.acl_pages = pages,
 	};
 	struct nfs_getaclres res = {
@@ -5079,32 +5080,22 @@ static ssize_t nfs4_do_get_acl(struct inode *inode, void *buf, size_t buflen)
 		.rpc_argp = &args,
 		.rpc_resp = &res,
 	};
-	unsigned int npages = DIV_ROUND_UP(buflen, PAGE_SIZE) + 1;
-	int ret = -ENOMEM, i;
-
-	if (npages > ARRAY_SIZE(pages))
-		return -ERANGE;
-
-	for (i = 0; i < npages; i++) {
-		pages[i] = alloc_page(GFP_KERNEL);
-		if (!pages[i])
-			goto out_free;
-	}
+	int ret, i;
 
 	/* for decoding across pages */
 	res.acl_scratch = alloc_page(GFP_KERNEL);
 	if (!res.acl_scratch)
-		goto out_free;
+		return -ENOMEM;
 
-	args.acl_len = npages * PAGE_SIZE;
+	args.acl_len = ARRAY_SIZE(pages) << PAGE_SHIFT;
 
-	dprintk("%s  buf %p buflen %zu npages %d args.acl_len %zu\n",
-		__func__, buf, buflen, npages, args.acl_len);
+	dprintk("%s  buf %p buflen %zu args.acl_len %zu\n",
+		__func__, buf, buflen, args.acl_len);
 	ret = nfs4_call_sync(NFS_SERVER(inode)->client, NFS_SERVER(inode),
 			     &msg, &args.seq_args, &res.seq_res, 0);
 	if (ret == 0)
 		ret = res.acl_len;
-out_free:
+
 	for (i = 0; i < ARRAY_SIZE(pages) && pages[i]; i++)
 		__free_page(pages[i]);
 	__free_page(res.acl_scratch);
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 6/6] NFSv4: allow getacl rpc to allocate pages on demand
  2017-02-19  2:07   ` [PATCH 6/6] NFSv4: allow getacl rpc to allocate pages on demand J. Bruce Fields
@ 2017-02-19 19:29     ` Chuck Lever
  2017-02-20 16:09       ` J. Bruce Fields
  2017-02-20 22:38     ` Andreas Gruenbacher
  1 sibling, 1 reply; 46+ messages in thread
From: Chuck Lever @ 2017-02-19 19:29 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Trond Myklebust, Anna Schumaker, Linux NFS Mailing List,
	Andreas Gruenbacher, Dros Adamson, Weston Andros Adamson


> On Feb 18, 2017, at 9:07 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> 
> From: Weston Andros Adamson <dros@netapp.com>
> 
> Instead of preallocating pags, allow xdr_partial_copy_from_skb() to
> allocate whatever pages we need on demand.  This is what the NFSv3 ACL
> code does.

The patch description does not explain why this change is
being done.

The matching hack in xprtrdma is in rpcrdma_convert_iovs().
Note that those are GFP_ATOMIC allocations, whereas here
they are GFP_KERNEL, and are thus more reliable.

IMO this is a step in the wrong direction. We should not be
adding more upper layer dependencies on memory allocation
in the transport layer.

I strongly prefer that rather the NFSACL code works the way
this code currently does, and that the hacks be removed from
the transport implementations.


> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> ---
> fs/nfs/nfs4proc.c | 23 +++++++----------------
> 1 file changed, 7 insertions(+), 16 deletions(-)
> 
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 3e3dbba4aa74..7842c73fddfc 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -5068,6 +5068,7 @@ static ssize_t nfs4_do_get_acl(struct inode *inode, void *buf, size_t buflen)
> 	struct page *pages[NFS4ACL_MAXPAGES + 1] = {NULL, };
> 	struct nfs_getaclargs args = {
> 		.fh = NFS_FH(inode),
> +		/* The xdr layer may allocate pages here. */

Sure, it is called xdr_partial_copy_from_skb, but that function
lives in socklib.c and is invoked only from xprtsock.c. Also, a
similar hack has to be done in xprtrdma.

So IMO this is a transport layer hack, and not part of the
(generic) XDR layer.


> 		.acl_pages = pages,
> 	};
> 	struct nfs_getaclres res = {
> @@ -5079,32 +5080,22 @@ static ssize_t nfs4_do_get_acl(struct inode *inode, void *buf, size_t buflen)
> 		.rpc_argp = &args,
> 		.rpc_resp = &res,
> 	};
> -	unsigned int npages = DIV_ROUND_UP(buflen, PAGE_SIZE) + 1;
> -	int ret = -ENOMEM, i;
> -
> -	if (npages > ARRAY_SIZE(pages))
> -		return -ERANGE;
> -
> -	for (i = 0; i < npages; i++) {
> -		pages[i] = alloc_page(GFP_KERNEL);
> -		if (!pages[i])
> -			goto out_free;
> -	}
> +	int ret, i;
> 
> 	/* for decoding across pages */
> 	res.acl_scratch = alloc_page(GFP_KERNEL);
> 	if (!res.acl_scratch)
> -		goto out_free;
> +		return -ENOMEM;
> 
> -	args.acl_len = npages * PAGE_SIZE;
> +	args.acl_len = ARRAY_SIZE(pages) << PAGE_SHIFT;
> 
> -	dprintk("%s  buf %p buflen %zu npages %d args.acl_len %zu\n",
> -		__func__, buf, buflen, npages, args.acl_len);
> +	dprintk("%s  buf %p buflen %zu args.acl_len %zu\n",
> +		__func__, buf, buflen, args.acl_len);
> 	ret = nfs4_call_sync(NFS_SERVER(inode)->client, NFS_SERVER(inode),
> 			     &msg, &args.seq_args, &res.seq_res, 0);
> 	if (ret == 0)
> 		ret = res.acl_len;
> -out_free:
> +
> 	for (i = 0; i < ARRAY_SIZE(pages) && pages[i]; i++)
> 		__free_page(pages[i]);
> 	__free_page(res.acl_scratch);
> -- 
> 2.9.3
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
Chuck Lever




^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/6] NFSv4: fix getacl head length estimation
  2017-02-19  2:07   ` [PATCH 1/6] NFSv4: fix getacl head length estimation J. Bruce Fields
@ 2017-02-20 13:19     ` Kinglong Mee
  2017-02-20 15:50       ` J. Bruce Fields
  0 siblings, 1 reply; 46+ messages in thread
From: Kinglong Mee @ 2017-02-20 13:19 UTC (permalink / raw)
  To: J. Bruce Fields, Trond Myklebust, Anna Schumaker
  Cc: linux-nfs, Andreas Gruenbacher, Weston Andros Adamson, Kinglong Mee

On 2/19/2017 10:07, J. Bruce Fields wrote:
> From: "J. Bruce Fields" <bfields@redhat.com>
> 
> Bitmap and attrlen follow immediately after the op reply header, so I'm
> not sure what this extra "+1" was for.
> 
> Consequences of this are just minor efficiency (extra calls to
> xdr_shrink_bufhead).
> 
> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> ---
>  fs/nfs/nfs4xdr.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
> index e9255cb453e6..bb95dd2edeef 100644
> --- a/fs/nfs/nfs4xdr.c
> +++ b/fs/nfs/nfs4xdr.c
> @@ -2524,7 +2524,7 @@ static void nfs4_xdr_enc_getacl(struct rpc_rqst *req, struct xdr_stream *xdr,
>  	encode_compound_hdr(xdr, req, &hdr);
>  	encode_sequence(xdr, &args->seq_args, &hdr);
>  	encode_putfh(xdr, args->fh, &hdr);
> -	replen = hdr.replen + op_decode_hdr_maxsz + 1;

The +1 is set at 28f566942c "NFS: use dynamically computed compound_hdr.replen
for xdr_inline_pages offset", 

@@ -1827,20 +1806,18 @@ nfs4_xdr_enc_getacl(struct rpc_rqst *req, __be32 *p,
        xdr_init_encode(&xdr, &req->rq_snd_buf, p);
        encode_compound_hdr(&xdr, req, &hdr);
        encode_putfh(&xdr, args->fh, &hdr);
+       replen = hdr.replen + nfs4_fattr_bitmap_maxsz + 1;
        encode_getattr_two(&xdr, FATTR4_WORD0_ACL, 0, &hdr);

-       /* set up reply buffer: */
-       replen = (RPC_REPHDRSIZE + auth->au_rslack + NFS4_dec_getacl_sz) << 2;
-       xdr_inline_pages(&req->rq_rcv_buf, replen,
+       xdr_inline_pages(&req->rq_rcv_buf, replen << 2,
                args->acl_pages, args->acl_pgbase, args->acl_len);
        encode_nops(&hdr);
        return 0;

#define NFS4_dec_getacl_sz      (compound_decode_hdr_maxsz + \
                                decode_sequence_maxsz + \
                                decode_putfh_maxsz + \
                                *decode_getacl_maxsz*)

#define decode_getacl_maxsz     (op_decode_hdr_maxsz + \
                                 nfs4_fattr_bitmap_maxsz + 1)

But, forget remove it at commit bf118a342f,
"NFSv4: include bitmap in nfsv4 get acl data".

Reviewed-by: Kinglong Mee <kinglongmee@gmail.com>

thanks,
Kinglong Mee
> +	replen = hdr.replen + op_decode_hdr_maxsz;
>  	encode_getattr_two(xdr, FATTR4_WORD0_ACL, 0, &hdr);
>  
>  	xdr_inline_pages(&req->rq_rcv_buf, replen << 2,
> 

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/6] NFSv4: fix getacl head length estimation
  2017-02-20 13:19     ` Kinglong Mee
@ 2017-02-20 15:50       ` J. Bruce Fields
  2017-02-20 20:27         ` [PATCH] " J. Bruce Fields
  0 siblings, 1 reply; 46+ messages in thread
From: J. Bruce Fields @ 2017-02-20 15:50 UTC (permalink / raw)
  To: Kinglong Mee
  Cc: Trond Myklebust, Anna Schumaker, linux-nfs, Andreas Gruenbacher,
	Weston Andros Adamson

On Mon, Feb 20, 2017 at 09:19:14PM +0800, Kinglong Mee wrote:
> On 2/19/2017 10:07, J. Bruce Fields wrote:
> > From: "J. Bruce Fields" <bfields@redhat.com>
> > 
> > Bitmap and attrlen follow immediately after the op reply header, so I'm
> > not sure what this extra "+1" was for.
...
> The +1 is set at 28f566942c "NFS: use dynamically computed compound_hdr.replen
> for xdr_inline_pages offset", 
...
> But, forget remove it at commit bf118a342f,
> "NFSv4: include bitmap in nfsv4 get acl data".
> 
> Reviewed-by: Kinglong Mee <kinglongmee@gmail.com>

Thanks for the history, and the review.  I've added a "Fixes:
bf118a342f10" line.

--b.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 6/6] NFSv4: allow getacl rpc to allocate pages on demand
  2017-02-19 19:29     ` Chuck Lever
@ 2017-02-20 16:09       ` J. Bruce Fields
  2017-02-20 16:42         ` Chuck Lever
  0 siblings, 1 reply; 46+ messages in thread
From: J. Bruce Fields @ 2017-02-20 16:09 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Trond Myklebust, Anna Schumaker, Linux NFS Mailing List,
	Andreas Gruenbacher, Dros Adamson, Weston Andros Adamson

On Sun, Feb 19, 2017 at 02:29:03PM -0500, Chuck Lever wrote:
> 
> > On Feb 18, 2017, at 9:07 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> > 
> > From: Weston Andros Adamson <dros@netapp.com>
> > 
> > Instead of preallocating pags, allow xdr_partial_copy_from_skb() to
> > allocate whatever pages we need on demand.  This is what the NFSv3 ACL
> > code does.
> 
> The patch description does not explain why this change is
> being done.

The only justification I see is avoiding allocating pages unnecessarily.
Without this patch, for each getacl, we allocate 17 pages (if I'm
calculating correctly) and probably rarely use most of them.

In the v3 case I think it's 7 pages instead of 17.

Do we have reason to believe that's actually a big deal?

--b.

> The matching hack in xprtrdma is in rpcrdma_convert_iovs().
> Note that those are GFP_ATOMIC allocations, whereas here
> they are GFP_KERNEL, and are thus more reliable.
> 
> IMO this is a step in the wrong direction. We should not be
> adding more upper layer dependencies on memory allocation
> in the transport layer.
> 
> I strongly prefer that rather the NFSACL code works the way
> this code currently does, and that the hacks be removed from
> the transport implementations.
> 
> 
> > Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> > ---
> > fs/nfs/nfs4proc.c | 23 +++++++----------------
> > 1 file changed, 7 insertions(+), 16 deletions(-)
> > 
> > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > index 3e3dbba4aa74..7842c73fddfc 100644
> > --- a/fs/nfs/nfs4proc.c
> > +++ b/fs/nfs/nfs4proc.c
> > @@ -5068,6 +5068,7 @@ static ssize_t nfs4_do_get_acl(struct inode *inode, void *buf, size_t buflen)
> > 	struct page *pages[NFS4ACL_MAXPAGES + 1] = {NULL, };
> > 	struct nfs_getaclargs args = {
> > 		.fh = NFS_FH(inode),
> > +		/* The xdr layer may allocate pages here. */
> 
> Sure, it is called xdr_partial_copy_from_skb, but that function
> lives in socklib.c and is invoked only from xprtsock.c. Also, a
> similar hack has to be done in xprtrdma.
> 
> So IMO this is a transport layer hack, and not part of the
> (generic) XDR layer.
> 
> 
> > 		.acl_pages = pages,
> > 	};
> > 	struct nfs_getaclres res = {
> > @@ -5079,32 +5080,22 @@ static ssize_t nfs4_do_get_acl(struct inode *inode, void *buf, size_t buflen)
> > 		.rpc_argp = &args,
> > 		.rpc_resp = &res,
> > 	};
> > -	unsigned int npages = DIV_ROUND_UP(buflen, PAGE_SIZE) + 1;
> > -	int ret = -ENOMEM, i;
> > -
> > -	if (npages > ARRAY_SIZE(pages))
> > -		return -ERANGE;
> > -
> > -	for (i = 0; i < npages; i++) {
> > -		pages[i] = alloc_page(GFP_KERNEL);
> > -		if (!pages[i])
> > -			goto out_free;
> > -	}
> > +	int ret, i;
> > 
> > 	/* for decoding across pages */
> > 	res.acl_scratch = alloc_page(GFP_KERNEL);
> > 	if (!res.acl_scratch)
> > -		goto out_free;
> > +		return -ENOMEM;
> > 
> > -	args.acl_len = npages * PAGE_SIZE;
> > +	args.acl_len = ARRAY_SIZE(pages) << PAGE_SHIFT;
> > 
> > -	dprintk("%s  buf %p buflen %zu npages %d args.acl_len %zu\n",
> > -		__func__, buf, buflen, npages, args.acl_len);
> > +	dprintk("%s  buf %p buflen %zu args.acl_len %zu\n",
> > +		__func__, buf, buflen, args.acl_len);
> > 	ret = nfs4_call_sync(NFS_SERVER(inode)->client, NFS_SERVER(inode),
> > 			     &msg, &args.seq_args, &res.seq_res, 0);
> > 	if (ret == 0)
> > 		ret = res.acl_len;
> > -out_free:
> > +
> > 	for (i = 0; i < ARRAY_SIZE(pages) && pages[i]; i++)
> > 		__free_page(pages[i]);
> > 	__free_page(res.acl_scratch);
> > -- 
> > 2.9.3
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> --
> Chuck Lever
> 
> 
> 

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 6/6] NFSv4: allow getacl rpc to allocate pages on demand
  2017-02-20 16:09       ` J. Bruce Fields
@ 2017-02-20 16:42         ` Chuck Lever
  2017-02-20 17:15           ` J. Bruce Fields
  0 siblings, 1 reply; 46+ messages in thread
From: Chuck Lever @ 2017-02-20 16:42 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Trond Myklebust, Anna Schumaker, Linux NFS Mailing List,
	Andreas Gruenbacher, Dros Adamson, Weston Andros Adamson


> On Feb 20, 2017, at 11:09 AM, J. Bruce Fields <bfields@redhat.com> wrote:
> 
> On Sun, Feb 19, 2017 at 02:29:03PM -0500, Chuck Lever wrote:
>> 
>>> On Feb 18, 2017, at 9:07 PM, J. Bruce Fields <bfields@redhat.com> wrote:
>>> 
>>> From: Weston Andros Adamson <dros@netapp.com>
>>> 
>>> Instead of preallocating pags, allow xdr_partial_copy_from_skb() to
>>> allocate whatever pages we need on demand.  This is what the NFSv3 ACL
>>> code does.
>> 
>> The patch description does not explain why this change is
>> being done.
> 
> The only justification I see is avoiding allocating pages unnecessarily.

That makes sense. Is there a real world workload that has seen
a negative effect?


> Without this patch, for each getacl, we allocate 17 pages (if I'm
> calculating correctly) and probably rarely use most of them.
> 
> In the v3 case I think it's 7 pages instead of 17.

I would have guessed 9. Out of curiosity, is there a reason
documented for these size limits?


> Do we have reason to believe that's actually a big deal?

The xprtrdma hack already has to allocate the full set of
pages for NFSACL GETACL.

If NFSv4 GETATTR(fs_acl4) already works this way and there
are no real problems, I can't see any issue with NFSACL GETACL
using the same mechanism to retrieve smaller objects.

The only risk to overallocating is that it could drive some
page reclaims. The upper layer should be in a better position
to prevent deadlock in this case than the transport layer is.
However if NFSv4 doesn't see a problem here, then there isn't
likely to be an issue for NFSACL GETACL, IMO.


> --b.
> 
>> The matching hack in xprtrdma is in rpcrdma_convert_iovs().
>> Note that those are GFP_ATOMIC allocations, whereas here
>> they are GFP_KERNEL, and are thus more reliable.
>> 
>> IMO this is a step in the wrong direction. We should not be
>> adding more upper layer dependencies on memory allocation
>> in the transport layer.
>> 
>> I strongly prefer that rather the NFSACL code works the way
>> this code currently does, and that the hacks be removed from
>> the transport implementations.
>> 
>> 
>>> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
>>> ---
>>> fs/nfs/nfs4proc.c | 23 +++++++----------------
>>> 1 file changed, 7 insertions(+), 16 deletions(-)
>>> 
>>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>>> index 3e3dbba4aa74..7842c73fddfc 100644
>>> --- a/fs/nfs/nfs4proc.c
>>> +++ b/fs/nfs/nfs4proc.c
>>> @@ -5068,6 +5068,7 @@ static ssize_t nfs4_do_get_acl(struct inode *inode, void *buf, size_t buflen)
>>> 	struct page *pages[NFS4ACL_MAXPAGES + 1] = {NULL, };
>>> 	struct nfs_getaclargs args = {
>>> 		.fh = NFS_FH(inode),
>>> +		/* The xdr layer may allocate pages here. */
>> 
>> Sure, it is called xdr_partial_copy_from_skb, but that function
>> lives in socklib.c and is invoked only from xprtsock.c. Also, a
>> similar hack has to be done in xprtrdma.
>> 
>> So IMO this is a transport layer hack, and not part of the
>> (generic) XDR layer.
>> 
>> 
>>> 		.acl_pages = pages,
>>> 	};
>>> 	struct nfs_getaclres res = {
>>> @@ -5079,32 +5080,22 @@ static ssize_t nfs4_do_get_acl(struct inode *inode, void *buf, size_t buflen)
>>> 		.rpc_argp = &args,
>>> 		.rpc_resp = &res,
>>> 	};
>>> -	unsigned int npages = DIV_ROUND_UP(buflen, PAGE_SIZE) + 1;
>>> -	int ret = -ENOMEM, i;
>>> -
>>> -	if (npages > ARRAY_SIZE(pages))
>>> -		return -ERANGE;
>>> -
>>> -	for (i = 0; i < npages; i++) {
>>> -		pages[i] = alloc_page(GFP_KERNEL);
>>> -		if (!pages[i])
>>> -			goto out_free;
>>> -	}
>>> +	int ret, i;
>>> 
>>> 	/* for decoding across pages */
>>> 	res.acl_scratch = alloc_page(GFP_KERNEL);
>>> 	if (!res.acl_scratch)
>>> -		goto out_free;
>>> +		return -ENOMEM;
>>> 
>>> -	args.acl_len = npages * PAGE_SIZE;
>>> +	args.acl_len = ARRAY_SIZE(pages) << PAGE_SHIFT;
>>> 
>>> -	dprintk("%s  buf %p buflen %zu npages %d args.acl_len %zu\n",
>>> -		__func__, buf, buflen, npages, args.acl_len);
>>> +	dprintk("%s  buf %p buflen %zu args.acl_len %zu\n",
>>> +		__func__, buf, buflen, args.acl_len);
>>> 	ret = nfs4_call_sync(NFS_SERVER(inode)->client, NFS_SERVER(inode),
>>> 			     &msg, &args.seq_args, &res.seq_res, 0);
>>> 	if (ret == 0)
>>> 		ret = res.acl_len;
>>> -out_free:
>>> +
>>> 	for (i = 0; i < ARRAY_SIZE(pages) && pages[i]; i++)
>>> 		__free_page(pages[i]);
>>> 	__free_page(res.acl_scratch);
>>> -- 
>>> 2.9.3
>>> 
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> 
>> --
>> Chuck Lever
>> 
>> 
>> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
Chuck Lever




^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 6/6] NFSv4: allow getacl rpc to allocate pages on demand
  2017-02-20 16:42         ` Chuck Lever
@ 2017-02-20 17:15           ` J. Bruce Fields
  2017-02-20 21:31             ` Andreas Gruenbacher
  0 siblings, 1 reply; 46+ messages in thread
From: J. Bruce Fields @ 2017-02-20 17:15 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Trond Myklebust, Anna Schumaker, Linux NFS Mailing List,
	Andreas Gruenbacher, Dros Adamson, Weston Andros Adamson

On Mon, Feb 20, 2017 at 11:42:31AM -0500, Chuck Lever wrote:
> 
> > On Feb 20, 2017, at 11:09 AM, J. Bruce Fields <bfields@redhat.com> wrote:
> > 
> > On Sun, Feb 19, 2017 at 02:29:03PM -0500, Chuck Lever wrote:
> >> 
> >>> On Feb 18, 2017, at 9:07 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> >>> 
> >>> From: Weston Andros Adamson <dros@netapp.com>
> >>> 
> >>> Instead of preallocating pags, allow xdr_partial_copy_from_skb() to
> >>> allocate whatever pages we need on demand.  This is what the NFSv3 ACL
> >>> code does.
> >> 
> >> The patch description does not explain why this change is
> >> being done.
> > 
> > The only justification I see is avoiding allocating pages unnecessarily.
> 
> That makes sense. Is there a real world workload that has seen
> a negative effect?
> 
> 
> > Without this patch, for each getacl, we allocate 17 pages (if I'm
> > calculating correctly) and probably rarely use most of them.
> > 
> > In the v3 case I think it's 7 pages instead of 17.
> 
> I would have guessed 9. Out of curiosity, is there a reason
> documented for these size limits?


In the v4 case: 

	#define NFS4ACL_MAXPAGES DIV_ROUND_UP(XATTR_SIZE_MAX, PAGE_SIZE)

And I believe XATTR_SIZE_MAX is a global maximum on the size of any
extend attribute value.

In the v3 case:

	/* Maximum number of ACL entries over NFS */
	#define NFS_ACL_MAX_ENTRIES     1024

	#define NFSACL_MAXPAGES         ((2*(8+12*NFS_ACL_MAX_ENTRIES) + PAGE_SIZE-1) \
	                                 >> PAGE_SHIFT)

No idea where that 1024 comes from.

--b.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH] NFSv4: fix getacl head length estimation
  2017-02-20 15:50       ` J. Bruce Fields
@ 2017-02-20 20:27         ` J. Bruce Fields
  0 siblings, 0 replies; 46+ messages in thread
From: J. Bruce Fields @ 2017-02-20 20:27 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Kinglong Mee, Trond Myklebust, Anna Schumaker, linux-nfs,
	Andreas Gruenbacher, Weston Andros Adamson

From: "J. Bruce Fields" <bfields@redhat.com>

Bitmap and attrlen follow immediately after the op reply header.  This
was an oversight from commit bf118a342f.

Consequences of this are just minor efficiency (extra calls to
xdr_shrink_bufhead).

Fixes: bf118a342f10 "NFSv4: include bitmap in nfsv4 get acl data"
Reviewed-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfs/nfs4xdr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

On Mon, Feb 20, 2017 at 10:50:19AM -0500, J. Bruce Fields wrote:
> Thanks for the history, and the review.  I've added a "Fixes:
> bf118a342f10" line.


diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index e9255cb453e6..bb95dd2edeef 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -2524,7 +2524,7 @@ static void nfs4_xdr_enc_getacl(struct rpc_rqst *req, struct xdr_stream *xdr,
 	encode_compound_hdr(xdr, req, &hdr);
 	encode_sequence(xdr, &args->seq_args, &hdr);
 	encode_putfh(xdr, args->fh, &hdr);
-	replen = hdr.replen + op_decode_hdr_maxsz + 1;
+	replen = hdr.replen + op_decode_hdr_maxsz;
 	encode_getattr_two(xdr, FATTR4_WORD0_ACL, 0, &hdr);
 
 	xdr_inline_pages(&req->rq_rcv_buf, replen << 2,
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 6/6] NFSv4: allow getacl rpc to allocate pages on demand
  2017-02-20 17:15           ` J. Bruce Fields
@ 2017-02-20 21:31             ` Andreas Gruenbacher
  2017-02-21 18:46               ` Chuck Lever
  0 siblings, 1 reply; 46+ messages in thread
From: Andreas Gruenbacher @ 2017-02-20 21:31 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Chuck Lever, Trond Myklebust, Anna Schumaker,
	Linux NFS Mailing List, Dros Adamson, Weston Andros Adamson

On Mon, Feb 20, 2017 at 6:15 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> On Mon, Feb 20, 2017 at 11:42:31AM -0500, Chuck Lever wrote:
>>
>> > On Feb 20, 2017, at 11:09 AM, J. Bruce Fields <bfields@redhat.com> wrote:
>> >
>> > On Sun, Feb 19, 2017 at 02:29:03PM -0500, Chuck Lever wrote:
>> >>
>> >>> On Feb 18, 2017, at 9:07 PM, J. Bruce Fields <bfields@redhat.com> wrote:
>> >>>
>> >>> From: Weston Andros Adamson <dros@netapp.com>
>> >>>
>> >>> Instead of preallocating pags, allow xdr_partial_copy_from_skb() to
>> >>> allocate whatever pages we need on demand.  This is what the NFSv3 ACL
>> >>> code does.
>> >>
>> >> The patch description does not explain why this change is
>> >> being done.
>> >
>> > The only justification I see is avoiding allocating pages unnecessarily.
>>
>> That makes sense. Is there a real world workload that has seen
>> a negative effect?
>>
>>
>> > Without this patch, for each getacl, we allocate 17 pages (if I'm
>> > calculating correctly) and probably rarely use most of them.
>> >
>> > In the v3 case I think it's 7 pages instead of 17.
>>
>> I would have guessed 9. Out of curiosity, is there a reason
>> documented for these size limits?
>
>
> In the v4 case:
>
>         #define NFS4ACL_MAXPAGES DIV_ROUND_UP(XATTR_SIZE_MAX, PAGE_SIZE)
>
> And I believe XATTR_SIZE_MAX is a global maximum on the size of any
> extend attribute value.

XATTR_SIZE_MAX is the maximum size of an extended attribute. NFSv4
ACLs are passed through unchanged in "system.nfs4_acl".

> In the v3 case:
>
>         /* Maximum number of ACL entries over NFS */
>         #define NFS_ACL_MAX_ENTRIES     1024
>
>         #define NFSACL_MAXPAGES         ((2*(8+12*NFS_ACL_MAX_ENTRIES) + PAGE_SIZE-1) \
>                                          >> PAGE_SHIFT)
>
> No idea where that 1024 comes from.

The 1024-entry limit is arbitrary.

Andreas

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 5/6] NFSv4: simplify getacl decoding
  2017-02-19  2:07   ` [PATCH 5/6] NFSv4: simplify getacl decoding J. Bruce Fields
@ 2017-02-20 22:30     ` Andreas Gruenbacher
  0 siblings, 0 replies; 46+ messages in thread
From: Andreas Gruenbacher @ 2017-02-20 22:30 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Trond Myklebust, Anna Schumaker, Linux NFS Mailing List,
	Weston Andros Adamson

On Sun, Feb 19, 2017 at 3:07 AM, J. Bruce Fields <bfields@redhat.com> wrote:
> From: "J. Bruce Fields" <bfields@redhat.com>
>
> There's not really much point to shifting the data around with
> xdr_enter_page() (and xdr_shrink_bufhead) when we're just going to
> copying the data out of the xdr buf anyway instead of trying to place
> the data right for zero-copy.  And it turns out if we want to leave
> page allocation to the xdr_partial_copy_from_skb() (as we will in a
> following patch) shifting the data brings more ocmplications.

Typo above ("complications").

> So let's just pass that buf down to the xdr layer and let it copy data
> directly into it.
>
> That means we need to allocate our own buf in the case the user didn't
> give us one, so that we can still cache in that case (to efficiently
> handle the common case of a getxattr(., ., NULL, 0) to get the size
> followed immediately by a getxattr(., ., buf, size)).
>
> But this still works out to be much simpler.
>
> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>

> ---
>  fs/nfs/nfs4proc.c       | 73 +++++++++++++++++++++++++------------------------
>  fs/nfs/nfs4xdr.c        | 27 ++++--------------
>  include/linux/nfs_xdr.h |  4 +--
>  3 files changed, 45 insertions(+), 59 deletions(-)
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index eb6c34db9a79..3e3dbba4aa74 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -5041,7 +5041,7 @@ static inline ssize_t nfs4_read_cached_acl(struct inode *inode, char *buf, size_
>   */
>  #define NFS4_MAX_CACHED_ACL PAGE_SIZE
>
> -static void nfs4_write_cached_acl(struct inode *inode, struct page **pages, size_t pgbase, size_t acl_len)
> +static void nfs4_write_cached_acl(struct inode *inode, void *buf, size_t acl_len)
>  {
>         struct nfs4_cached_acl *acl;
>         size_t buflen = sizeof(*acl) + acl_len;
> @@ -5051,7 +5051,7 @@ static void nfs4_write_cached_acl(struct inode *inode, struct page **pages, size
>                 if (acl == NULL)
>                         goto out;
>                 acl->cached = 1;
> -               _copy_from_pages(acl->data, pages, pgbase, acl_len);
> +               memcpy(acl->data, buf, acl_len);
>         } else {
>                 acl = kmalloc(sizeof(*acl), GFP_KERNEL);
>                 if (acl == NULL)
> @@ -5063,26 +5063,16 @@ static void nfs4_write_cached_acl(struct inode *inode, struct page **pages, size
>         nfs4_set_cached_acl(inode, acl);
>  }
>
> -/*
> - * The getxattr API returns the required buffer length when called with a
> - * NULL buf. The NFSv4 acl tool then calls getxattr again after allocating
> - * the required buf.  On a NULL buf, we send a page of data to the server
> - * guessing that the ACL request can be serviced by a page. If so, we cache
> - * up to the page of ACL data, and the 2nd call to getxattr is serviced by
> - * the cache. If not so, we throw away the page, and cache the required
> - * length. The next getxattr call will then produce another round trip to
> - * the server, this time with the input buf of the required size.
> - */
> -static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t buflen)
> +static ssize_t nfs4_do_get_acl(struct inode *inode, void *buf, size_t buflen)
>  {
>         struct page *pages[NFS4ACL_MAXPAGES + 1] = {NULL, };
>         struct nfs_getaclargs args = {
>                 .fh = NFS_FH(inode),
>                 .acl_pages = pages,
> -               .acl_len = buflen,
>         };
>         struct nfs_getaclres res = {
>                 .acl_len = buflen,
> +               .buf = buf,
>         };
>         struct rpc_message msg = {
>                 .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_GETACL],
> @@ -5112,27 +5102,8 @@ static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t bu
>                 __func__, buf, buflen, npages, args.acl_len);
>         ret = nfs4_call_sync(NFS_SERVER(inode)->client, NFS_SERVER(inode),
>                              &msg, &args.seq_args, &res.seq_res, 0);
> -       if (ret)
> -               goto out_free;
> -
> -       /* Handle the case where the passed-in buffer is too short */
> -       if (res.acl_flags & NFS4_ACL_TRUNC) {
> -               /* Did the user only issue a request for the acl length? */
> -               if (buf == NULL)
> -                       goto out_ok;
> -               ret = -ERANGE;
> -               goto out_free;
> -       }
> -       nfs4_write_cached_acl(inode, pages, res.acl_data_offset, res.acl_len);
> -       if (buf) {
> -               if (res.acl_len > buflen) {
> -                       ret = -ERANGE;
> -                       goto out_free;
> -               }
> -               _copy_from_pages(buf, pages, res.acl_data_offset, res.acl_len);
> -       }
> -out_ok:
> -       ret = res.acl_len;
> +       if (ret == 0)
> +               ret = res.acl_len;
>  out_free:
>         for (i = 0; i < ARRAY_SIZE(pages) && pages[i]; i++)
>                 __free_page(pages[i]);
> @@ -5140,6 +5111,38 @@ static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t bu
>         return ret;
>  }
>
> +/*
> + * The getxattr API returns the required buffer length when called with a
> + * NULL buf. The NFSv4 acl tool then calls getxattr again after allocating
> + * the required buf. Cache the result from the first getxattr call to avoid
> + * sending another RPC.
> + */
> +static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t buflen)
> +{
> +       void *priv_buf = NULL;
> +       void *our_buf = buf;
> +       int our_buflen = buflen;
> +       static ssize_t ret = -ENOMEM;
> +
> +       if (!buf) {
> +               priv_buf = kmalloc(NFS4_MAX_CACHED_ACL, GFP_KERNEL);
> +               if (!priv_buf)
> +                       goto out;
> +               our_buf = priv_buf;
> +               our_buflen = NFS4_MAX_CACHED_ACL;
> +       }
> +       ret = nfs4_do_get_acl(inode, our_buf, our_buflen);
> +       if (ret < 0)
> +               goto out;
> +       if (ret <= our_buflen)
> +               nfs4_write_cached_acl(inode, our_buf, ret);
> +       if (buf && ret > buflen)
> +               ret = -ERANGE;
> +out:
> +       kfree(priv_buf);
> +       return ret;
> +}
> +
>  static ssize_t nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t buflen)
>  {
>         struct nfs4_exception exception = { };
> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
> index bb95dd2edeef..534b377084fb 100644
> --- a/fs/nfs/nfs4xdr.c
> +++ b/fs/nfs/nfs4xdr.c
> @@ -5353,39 +5353,24 @@ static int decode_getacl(struct xdr_stream *xdr, struct rpc_rqst *req,
>         uint32_t attrlen,
>                  bitmap[3] = {0};
>         int status;
> -       unsigned int pg_offset;
>
> -       res->acl_len = 0;
>         if ((status = decode_op_hdr(xdr, OP_GETATTR)) != 0)
>                 goto out;
> -
> -       xdr_enter_page(xdr, xdr->buf->page_len);
> -
> -       /* Calculate the offset of the page data */
> -       pg_offset = xdr->buf->head[0].iov_len;
> -
>         if ((status = decode_attr_bitmap(xdr, bitmap)) != 0)
>                 goto out;
>         if ((status = decode_attr_length(xdr, &attrlen, &savep)) != 0)
>                 goto out;
> -
> +       if (unlikely(attrlen > (xdr->nwords << 2)))
> +               return -EIO;
>         if (unlikely(bitmap[0] & (FATTR4_WORD0_ACL - 1U)))
>                 return -EIO;
>         if (likely(bitmap[0] & FATTR4_WORD0_ACL)) {
> +               unsigned int offset = xdr_stream_pos(xdr);
>
> -               /* The bitmap (xdr len + bitmaps) and the attr xdr len words
> -                * are stored with the acl data to handle the problem of
> -                * variable length bitmaps.*/
> -               res->acl_data_offset = xdr_stream_pos(xdr) - pg_offset;
> +               if (attrlen <= res->acl_len)
> +                       read_bytes_from_xdr_buf(xdr->buf, offset,
> +                                               res->buf, attrlen);
>                 res->acl_len = attrlen;
> -
> -               /* Check for receive buffer overflow */
> -               if (res->acl_len > (xdr->nwords << 2) ||
> -                   res->acl_len + res->acl_data_offset > xdr->buf->page_len) {
> -                       res->acl_flags |= NFS4_ACL_TRUNC;
> -                       dprintk("NFS: acl reply: attrlen %u > page_len %u\n",
> -                                       attrlen, xdr->nwords << 2);
> -               }
>         } else
>                 status = -EOPNOTSUPP;
>
> diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
> index 348f7c158084..418116d62740 100644
> --- a/include/linux/nfs_xdr.h
> +++ b/include/linux/nfs_xdr.h
> @@ -747,13 +747,11 @@ struct nfs_getaclargs {
>         struct page **                  acl_pages;
>  };
>
> -/* getxattr ACL interface flags */
> -#define NFS4_ACL_TRUNC         0x0001  /* ACL was truncated */
>  struct nfs_getaclres {
>         struct nfs4_sequence_res        seq_res;
> +       void *                          buf;
>         size_t                          acl_len;
>         size_t                          acl_data_offset;
> -       int                             acl_flags;
>         struct page *                   acl_scratch;
>  };
>
> --
> 2.9.3
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 6/6] NFSv4: allow getacl rpc to allocate pages on demand
  2017-02-19  2:07   ` [PATCH 6/6] NFSv4: allow getacl rpc to allocate pages on demand J. Bruce Fields
  2017-02-19 19:29     ` Chuck Lever
@ 2017-02-20 22:38     ` Andreas Gruenbacher
  2017-02-21 18:35       ` J. Bruce Fields
  1 sibling, 1 reply; 46+ messages in thread
From: Andreas Gruenbacher @ 2017-02-20 22:38 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Trond Myklebust, Anna Schumaker, Linux NFS Mailing List,
	Weston Andros Adamson, Weston Andros Adamson

On Sun, Feb 19, 2017 at 3:07 AM, J. Bruce Fields <bfields@redhat.com> wrote:
> From: Weston Andros Adamson <dros@netapp.com>
>
> Instead of preallocating pags, allow xdr_partial_copy_from_skb() to
> allocate whatever pages we need on demand.  This is what the NFSv3 ACL
> code does.
>
> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Reviewed-by: Andreas Gruenbacher <agreuenba@rdhat.com>

> ---
>  fs/nfs/nfs4proc.c | 23 +++++++----------------
>  1 file changed, 7 insertions(+), 16 deletions(-)
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 3e3dbba4aa74..7842c73fddfc 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -5068,6 +5068,7 @@ static ssize_t nfs4_do_get_acl(struct inode *inode, void *buf, size_t buflen)
>         struct page *pages[NFS4ACL_MAXPAGES + 1] = {NULL, };
>         struct nfs_getaclargs args = {
>                 .fh = NFS_FH(inode),
> +               /* The xdr layer may allocate pages here. */
>                 .acl_pages = pages,
>         };
>         struct nfs_getaclres res = {
> @@ -5079,32 +5080,22 @@ static ssize_t nfs4_do_get_acl(struct inode *inode, void *buf, size_t buflen)
>                 .rpc_argp = &args,
>                 .rpc_resp = &res,
>         };
> -       unsigned int npages = DIV_ROUND_UP(buflen, PAGE_SIZE) + 1;
> -       int ret = -ENOMEM, i;
> -
> -       if (npages > ARRAY_SIZE(pages))
> -               return -ERANGE;
> -
> -       for (i = 0; i < npages; i++) {
> -               pages[i] = alloc_page(GFP_KERNEL);
> -               if (!pages[i])
> -                       goto out_free;
> -       }
> +       int ret, i;
>
>         /* for decoding across pages */
>         res.acl_scratch = alloc_page(GFP_KERNEL);
>         if (!res.acl_scratch)
> -               goto out_free;
> +               return -ENOMEM;
>
> -       args.acl_len = npages * PAGE_SIZE;
> +       args.acl_len = ARRAY_SIZE(pages) << PAGE_SHIFT;

Define acl_len directly in the args initializer instead of here.

> -       dprintk("%s  buf %p buflen %zu npages %d args.acl_len %zu\n",
> -               __func__, buf, buflen, npages, args.acl_len);
> +       dprintk("%s  buf %p buflen %zu args.acl_len %zu\n",
> +               __func__, buf, buflen, args.acl_len);
>         ret = nfs4_call_sync(NFS_SERVER(inode)->client, NFS_SERVER(inode),
>                              &msg, &args.seq_args, &res.seq_res, 0);
>         if (ret == 0)
>                 ret = res.acl_len;
> -out_free:
> +
>         for (i = 0; i < ARRAY_SIZE(pages) && pages[i]; i++)
>                 __free_page(pages[i]);
>         __free_page(res.acl_scratch);
> --
> 2.9.3
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 4/6] NFSv4: minor getacl cleanup
  2017-02-19  2:07   ` [PATCH 4/6] NFSv4: minor getacl cleanup J. Bruce Fields
@ 2017-02-20 22:38     ` Andreas Gruenbacher
  0 siblings, 0 replies; 46+ messages in thread
From: Andreas Gruenbacher @ 2017-02-20 22:38 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Trond Myklebust, Anna Schumaker, Linux NFS Mailing List,
	Weston Andros Adamson

On Sun, Feb 19, 2017 at 3:07 AM, J. Bruce Fields <bfields@redhat.com> wrote:
> From: "J. Bruce Fields" <bfields@redhat.com>
>
> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Reviewed-by: Andreas Gruenbacher <agreuenba@rdhat.com>

> ---
>  fs/nfs/nfs4proc.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index a4e33e519f6e..eb6c34db9a79 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -5134,11 +5134,9 @@ static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t bu
>  out_ok:
>         ret = res.acl_len;
>  out_free:
> -       for (i = 0; i < npages; i++)
> -               if (pages[i])
> -                       __free_page(pages[i]);
> -       if (res.acl_scratch)
> -               __free_page(res.acl_scratch);
> +       for (i = 0; i < ARRAY_SIZE(pages) && pages[i]; i++)
> +               __free_page(pages[i]);
> +       __free_page(res.acl_scratch);
>         return ret;
>  }
>
> --
> 2.9.3
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 6/6] NFSv4: allow getacl rpc to allocate pages on demand
  2017-02-20 22:38     ` Andreas Gruenbacher
@ 2017-02-21 18:35       ` J. Bruce Fields
  2017-02-21 19:45         ` Weston Andros Adamson
  0 siblings, 1 reply; 46+ messages in thread
From: J. Bruce Fields @ 2017-02-21 18:35 UTC (permalink / raw)
  To: Andreas Gruenbacher
  Cc: Trond Myklebust, Anna Schumaker, Linux NFS Mailing List,
	Weston Andros Adamson, Weston Andros Adamson

On Mon, Feb 20, 2017 at 11:38:22PM +0100, Andreas Gruenbacher wrote:
> On Sun, Feb 19, 2017 at 3:07 AM, J. Bruce Fields <bfields@redhat.com> wrote:
> > @@ -5079,32 +5080,22 @@ static ssize_t nfs4_do_get_acl(struct inode *inode, void *buf, size_t buflen)
> >                 .rpc_argp = &args,
> >                 .rpc_resp = &res,
> >         };
> > -       unsigned int npages = DIV_ROUND_UP(buflen, PAGE_SIZE) + 1;
> > -       int ret = -ENOMEM, i;
> > -
> > -       if (npages > ARRAY_SIZE(pages))
> > -               return -ERANGE;
> > -
> > -       for (i = 0; i < npages; i++) {
> > -               pages[i] = alloc_page(GFP_KERNEL);
> > -               if (!pages[i])
> > -                       goto out_free;
> > -       }
> > +       int ret, i;
> >
> >         /* for decoding across pages */
> >         res.acl_scratch = alloc_page(GFP_KERNEL);
> >         if (!res.acl_scratch)
> > -               goto out_free;
> > +               return -ENOMEM;
> >
> > -       args.acl_len = npages * PAGE_SIZE;
> > +       args.acl_len = ARRAY_SIZE(pages) << PAGE_SHIFT;
> 
> Define acl_len directly in the args initializer instead of here.

Got it, thanks.  I'll send another revision of the series.

--b.

> 
> > -       dprintk("%s  buf %p buflen %zu npages %d args.acl_len %zu\n",
> > -               __func__, buf, buflen, npages, args.acl_len);
> > +       dprintk("%s  buf %p buflen %zu args.acl_len %zu\n",
> > +               __func__, buf, buflen, args.acl_len);
> >         ret = nfs4_call_sync(NFS_SERVER(inode)->client, NFS_SERVER(inode),
> >                              &msg, &args.seq_args, &res.seq_res, 0);
> >         if (ret == 0)
> >                 ret = res.acl_len;
> > -out_free:
> > +
> >         for (i = 0; i < ARRAY_SIZE(pages) && pages[i]; i++)
> >                 __free_page(pages[i]);
> >         __free_page(res.acl_scratch);
> > --
> > 2.9.3
> >

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 6/6] NFSv4: allow getacl rpc to allocate pages on demand
  2017-02-20 21:31             ` Andreas Gruenbacher
@ 2017-02-21 18:46               ` Chuck Lever
  2017-02-21 21:21                 ` Andreas Gruenbacher
  0 siblings, 1 reply; 46+ messages in thread
From: Chuck Lever @ 2017-02-21 18:46 UTC (permalink / raw)
  To: Andreas Gruenbacher
  Cc: J. Bruce Fields, Trond Myklebust, Anna Schumaker,
	Linux NFS Mailing List, Dros Adamson, Weston Andros Adamson

Hi Andreas-


> On Feb 20, 2017, at 4:31 PM, Andreas Gruenbacher <agruenba@redhat.com> wrote:
> 
> On Mon, Feb 20, 2017 at 6:15 PM, J. Bruce Fields <bfields@redhat.com> wrote:
>> On Mon, Feb 20, 2017 at 11:42:31AM -0500, Chuck Lever wrote:
>>> 
>>>> On Feb 20, 2017, at 11:09 AM, J. Bruce Fields <bfields@redhat.com> wrote:
>>>> 
>>>> On Sun, Feb 19, 2017 at 02:29:03PM -0500, Chuck Lever wrote:
>>>>> 
>>>>>> On Feb 18, 2017, at 9:07 PM, J. Bruce Fields <bfields@redhat.com> wrote:
>>>>>> 
>>>>>> From: Weston Andros Adamson <dros@netapp.com>
>>>>>> 
>>>>>> Instead of preallocating pags, allow xdr_partial_copy_from_skb() to
>>>>>> allocate whatever pages we need on demand.  This is what the NFSv3 ACL
>>>>>> code does.
>>>>> 
>>>>> The patch description does not explain why this change is
>>>>> being done.
>>>> 
>>>> The only justification I see is avoiding allocating pages unnecessarily.
>>> 
>>> That makes sense. Is there a real world workload that has seen
>>> a negative effect?
>>> 
>>> 
>>>> Without this patch, for each getacl, we allocate 17 pages (if I'm
>>>> calculating correctly) and probably rarely use most of them.
>>>> 
>>>> In the v3 case I think it's 7 pages instead of 17.
>>> 
>>> I would have guessed 9. Out of curiosity, is there a reason
>>> documented for these size limits?
>> 
>> 
>> In the v4 case:
>> 
>>        #define NFS4ACL_MAXPAGES DIV_ROUND_UP(XATTR_SIZE_MAX, PAGE_SIZE)
>> 
>> And I believe XATTR_SIZE_MAX is a global maximum on the size of any
>> extend attribute value.
> 
> XATTR_SIZE_MAX is the maximum size of an extended attribute. NFSv4
> ACLs are passed through unchanged in "system.nfs4_acl".

"Extended attribute" means this is a Linux-specific limit?

Is there anything that prevents a non-Linux system from constructing
or returning an ACL that is larger than that?

What happens on a Linux client when a server returns an ACL that does
not fit in this allotment?


>> In the v3 case:
>> 
>>        /* Maximum number of ACL entries over NFS */
>>        #define NFS_ACL_MAX_ENTRIES     1024
>> 
>>        #define NFSACL_MAXPAGES         ((2*(8+12*NFS_ACL_MAX_ENTRIES) + PAGE_SIZE-1) \
>>>> PAGE_SHIFT)
>> 
>> No idea where that 1024 comes from.
> 
> The 1024-entry limit is arbitrary.
> 
> Andreas
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
Chuck Lever




^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 6/6] NFSv4: allow getacl rpc to allocate pages on demand
  2017-02-21 18:35       ` J. Bruce Fields
@ 2017-02-21 19:45         ` Weston Andros Adamson
  0 siblings, 0 replies; 46+ messages in thread
From: Weston Andros Adamson @ 2017-02-21 19:45 UTC (permalink / raw)
  To: Bruce Fields
  Cc: Andreas Gruenbacher, Trond Myklebust, Anna Schumaker,
	linux-nfs list, Weston Andros Adamson



> On Feb 21, 2017, at 1:35 PM, J. Bruce Fields <bfields@redhat.com> wrote:
>=20
> On Mon, Feb 20, 2017 at 11:38:22PM +0100, Andreas Gruenbacher wrote:
>> On Sun, Feb 19, 2017 at 3:07 AM, J. Bruce Fields <bfields@redhat.com> wr=
ote:
>>> @@ -5079,32 +5080,22 @@ static ssize_t nfs4_do_get_acl(struct inode *in=
ode, void *buf, size_t buflen)
>>>                .rpc_argp =3D &args,
>>>                .rpc_resp =3D &res,
>>>        };
>>> -       unsigned int npages =3D DIV_ROUND_UP(buflen, PAGE_SIZE) + 1;
>>> -       int ret =3D -ENOMEM, i;
>>> -
>>> -       if (npages > ARRAY_SIZE(pages))
>>> -               return -ERANGE;
>>> -
>>> -       for (i =3D 0; i < npages; i++) {
>>> -               pages[i] =3D alloc_page(GFP_KERNEL);
>>> -               if (!pages[i])
>>> -                       goto out_free;
>>> -       }
>>> +       int ret, i;
>>>=20
>>>        /* for decoding across pages */
>>>        res.acl_scratch =3D alloc_page(GFP_KERNEL);
>>>        if (!res.acl_scratch)
>>> -               goto out_free;
>>> +               return -ENOMEM;
>>>=20
>>> -       args.acl_len =3D npages * PAGE_SIZE;
>>> +       args.acl_len =3D ARRAY_SIZE(pages) << PAGE_SHIFT;
>>=20
>> Define acl_len directly in the args initializer instead of here.
>=20
> Got it, thanks.  I'll send another revision of the series.

Thanks Bruce,

You can add:

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>

on the next revision.

-dros

>=20
> --b.
>=20
>>=20
>>> -       dprintk("%s  buf %p buflen %zu npages %d args.acl_len %zu\n",
>>> -               __func__, buf, buflen, npages, args.acl_len);
>>> +       dprintk("%s  buf %p buflen %zu args.acl_len %zu\n",
>>> +               __func__, buf, buflen, args.acl_len);
>>>        ret =3D nfs4_call_sync(NFS_SERVER(inode)->client, NFS_SERVER(ino=
de),
>>>                             &msg, &args.seq_args, &res.seq_res, 0);
>>>        if (ret =3D=3D 0)
>>>                ret =3D res.acl_len;
>>> -out_free:
>>> +
>>>        for (i =3D 0; i < ARRAY_SIZE(pages) && pages[i]; i++)
>>>                __free_page(pages[i]);
>>>        __free_page(res.acl_scratch);
>>> --
>>> 2.9.3
>>>=20
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/6] NFSv4: fix getacl ERANGE for some ACL buffer sizes
  2017-02-19  2:07   ` [PATCH 2/6] NFSv4: fix getacl ERANGE for some ACL buffer sizes J. Bruce Fields
@ 2017-02-21 19:46     ` Weston Andros Adamson
  2017-02-22 22:36       ` J. Bruce Fields
  0 siblings, 1 reply; 46+ messages in thread
From: Weston Andros Adamson @ 2017-02-21 19:46 UTC (permalink / raw)
  To: Bruce Fields
  Cc: Trond Myklebust, Anna Schumaker, linux-nfs list,
	Andreas Gruenbacher, Weston Andros Adamson


> On Feb 18, 2017, at 9:07 PM, J. Bruce Fields <bfields@redhat.com> wrote:
>=20
> From: Weston Andros Adamson <dros@netapp.com>
>=20
> We're not taking into account that the space needed for the (variable
> length) attr bitmap, with the result that we'd sometimes get a spurious
> ERANGE when the ACL data got close to the end of a page.
>=20
> Just add in an extra page to make sure.
>=20
> Signed-off-by: J. Bruce Fields <bfields@redhat.com>

Thanks, you can add:

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>


> ---
> fs/nfs/nfs4proc.c | 8 ++------
> 1 file changed, 2 insertions(+), 6 deletions(-)
>=20
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 0a0eaecf9676..60501e10625d 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -5069,7 +5069,7 @@ static void nfs4_write_cached_acl(struct inode *ino=
de, struct page **pages, size
>  */
> static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf, si=
ze_t buflen)
> {
> -=09struct page *pages[NFS4ACL_MAXPAGES] =3D {NULL, };
> +=09struct page *pages[NFS4ACL_MAXPAGES + 1] =3D {NULL, };
> =09struct nfs_getaclargs args =3D {
> =09=09.fh =3D NFS_FH(inode),
> =09=09.acl_pages =3D pages,
> @@ -5083,13 +5083,9 @@ static ssize_t __nfs4_get_acl_uncached(struct inod=
e *inode, void *buf, size_t bu
> =09=09.rpc_argp =3D &args,
> =09=09.rpc_resp =3D &res,
> =09};
> -=09unsigned int npages =3D DIV_ROUND_UP(buflen, PAGE_SIZE);
> +=09unsigned int npages =3D DIV_ROUND_UP(buflen, PAGE_SIZE) + 1;
> =09int ret =3D -ENOMEM, i;
>=20
> -=09/* As long as we're doing a round trip to the server anyway,
> -=09 * let's be prepared for a page of acl data. */
> -=09if (npages =3D=3D 0)
> -=09=09npages =3D 1;
> =09if (npages > ARRAY_SIZE(pages))
> =09=09return -ERANGE;
>=20
> --=20
> 2.9.3
>=20
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 6/6] NFSv4: allow getacl rpc to allocate pages on demand
  2017-02-21 18:46               ` Chuck Lever
@ 2017-02-21 21:21                 ` Andreas Gruenbacher
  2017-02-21 21:37                   ` J. Bruce Fields
  0 siblings, 1 reply; 46+ messages in thread
From: Andreas Gruenbacher @ 2017-02-21 21:21 UTC (permalink / raw)
  To: Chuck Lever
  Cc: J. Bruce Fields, Trond Myklebust, Anna Schumaker,
	Linux NFS Mailing List, Dros Adamson, Weston Andros Adamson

On Tue, Feb 21, 2017 at 7:46 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
> Hi Andreas-
>
>
>> On Feb 20, 2017, at 4:31 PM, Andreas Gruenbacher <agruenba@redhat.com> wrote:
>>
>> On Mon, Feb 20, 2017 at 6:15 PM, J. Bruce Fields <bfields@redhat.com> wrote:
>>> On Mon, Feb 20, 2017 at 11:42:31AM -0500, Chuck Lever wrote:
>>>>
>>>>> On Feb 20, 2017, at 11:09 AM, J. Bruce Fields <bfields@redhat.com> wrote:
>>>>>
>>>>> On Sun, Feb 19, 2017 at 02:29:03PM -0500, Chuck Lever wrote:
>>>>>>
>>>>>>> On Feb 18, 2017, at 9:07 PM, J. Bruce Fields <bfields@redhat.com> wrote:
>>>>>>>
>>>>>>> From: Weston Andros Adamson <dros@netapp.com>
>>>>>>>
>>>>>>> Instead of preallocating pags, allow xdr_partial_copy_from_skb() to
>>>>>>> allocate whatever pages we need on demand.  This is what the NFSv3 ACL
>>>>>>> code does.
>>>>>>
>>>>>> The patch description does not explain why this change is
>>>>>> being done.
>>>>>
>>>>> The only justification I see is avoiding allocating pages unnecessarily.
>>>>
>>>> That makes sense. Is there a real world workload that has seen
>>>> a negative effect?
>>>>
>>>>
>>>>> Without this patch, for each getacl, we allocate 17 pages (if I'm
>>>>> calculating correctly) and probably rarely use most of them.
>>>>>
>>>>> In the v3 case I think it's 7 pages instead of 17.
>>>>
>>>> I would have guessed 9. Out of curiosity, is there a reason
>>>> documented for these size limits?
>>>
>>>
>>> In the v4 case:
>>>
>>>        #define NFS4ACL_MAXPAGES DIV_ROUND_UP(XATTR_SIZE_MAX, PAGE_SIZE)
>>>
>>> And I believe XATTR_SIZE_MAX is a global maximum on the size of any
>>> extend attribute value.
>>
>> XATTR_SIZE_MAX is the maximum size of an extended attribute. NFSv4
>> ACLs are passed through unchanged in "system.nfs4_acl".
>
> "Extended attribute" means this is a Linux-specific limit?

Yes.

> Is there anything that prevents a non-Linux system from constructing
> or returning an ACL that is larger than that?

No.

> What happens on a Linux client when a server returns an ACL that does
> not fit in this allotment?

I would hope an error, but I haven't tested it.

Andreas

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 6/6] NFSv4: allow getacl rpc to allocate pages on demand
  2017-02-21 21:21                 ` Andreas Gruenbacher
@ 2017-02-21 21:37                   ` J. Bruce Fields
  2017-02-21 21:45                     ` Andreas Gruenbacher
  0 siblings, 1 reply; 46+ messages in thread
From: J. Bruce Fields @ 2017-02-21 21:37 UTC (permalink / raw)
  To: Andreas Gruenbacher
  Cc: Chuck Lever, Trond Myklebust, Anna Schumaker,
	Linux NFS Mailing List, Dros Adamson, Weston Andros Adamson

On Tue, Feb 21, 2017 at 10:21:05PM +0100, Andreas Gruenbacher wrote:
> On Tue, Feb 21, 2017 at 7:46 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
> > Hi Andreas-
> >
> >
> >> On Feb 20, 2017, at 4:31 PM, Andreas Gruenbacher <agruenba@redhat.com> wrote:
> >>
> >> On Mon, Feb 20, 2017 at 6:15 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> >>> On Mon, Feb 20, 2017 at 11:42:31AM -0500, Chuck Lever wrote:
> >>>>
> >>>>> On Feb 20, 2017, at 11:09 AM, J. Bruce Fields <bfields@redhat.com> wrote:
> >>>>>
> >>>>> On Sun, Feb 19, 2017 at 02:29:03PM -0500, Chuck Lever wrote:
> >>>>>>
> >>>>>>> On Feb 18, 2017, at 9:07 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> >>>>>>>
> >>>>>>> From: Weston Andros Adamson <dros@netapp.com>
> >>>>>>>
> >>>>>>> Instead of preallocating pags, allow xdr_partial_copy_from_skb() to
> >>>>>>> allocate whatever pages we need on demand.  This is what the NFSv3 ACL
> >>>>>>> code does.
> >>>>>>
> >>>>>> The patch description does not explain why this change is
> >>>>>> being done.
> >>>>>
> >>>>> The only justification I see is avoiding allocating pages unnecessarily.
> >>>>
> >>>> That makes sense. Is there a real world workload that has seen
> >>>> a negative effect?
> >>>>
> >>>>
> >>>>> Without this patch, for each getacl, we allocate 17 pages (if I'm
> >>>>> calculating correctly) and probably rarely use most of them.
> >>>>>
> >>>>> In the v3 case I think it's 7 pages instead of 17.
> >>>>
> >>>> I would have guessed 9. Out of curiosity, is there a reason
> >>>> documented for these size limits?
> >>>
> >>>
> >>> In the v4 case:
> >>>
> >>>        #define NFS4ACL_MAXPAGES DIV_ROUND_UP(XATTR_SIZE_MAX, PAGE_SIZE)
> >>>
> >>> And I believe XATTR_SIZE_MAX is a global maximum on the size of any
> >>> extend attribute value.
> >>
> >> XATTR_SIZE_MAX is the maximum size of an extended attribute. NFSv4
> >> ACLs are passed through unchanged in "system.nfs4_acl".
> >
> > "Extended attribute" means this is a Linux-specific limit?
> 
> Yes.
> 
> > Is there anything that prevents a non-Linux system from constructing
> > or returning an ACL that is larger than that?
> 
> No.

In the >=v4.1 case there are session limits, but they'll typically be
less.  In the 4.0 case I think there's no explicit limit at all.  In
practice I bet other systems are similar to Linux in that the assume
peers won't send rpc replies or requests larger than about the
maximum-sized read or write.  But again that'll usually be a higher
limit than our ACL limit.

> > What happens on a Linux client when a server returns an ACL that does
> > not fit in this allotment?
> 
> I would hope an error, but I haven't tested it.

I haven't tested either, but it looks to me like the rpc layer receives
a truncated request, the xdr decoding recognizes that it's truncated,
and the result is an -ERANGE.

Looking now I think that my "NFSv4: simplify getacl decoding" changes
that to an -EIO.  More importantly, it makes that an EIO even when the
calling application was only asking for the length, not the actual ACL
data.  I'll fix that.

--b.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 6/6] NFSv4: allow getacl rpc to allocate pages on demand
  2017-02-21 21:37                   ` J. Bruce Fields
@ 2017-02-21 21:45                     ` Andreas Gruenbacher
  2017-02-22  1:53                       ` J. Bruce Fields
  0 siblings, 1 reply; 46+ messages in thread
From: Andreas Gruenbacher @ 2017-02-21 21:45 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Chuck Lever, Trond Myklebust, Anna Schumaker,
	Linux NFS Mailing List, Dros Adamson, Weston Andros Adamson

On Tue, Feb 21, 2017 at 10:37 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> On Tue, Feb 21, 2017 at 10:21:05PM +0100, Andreas Gruenbacher wrote:
>> On Tue, Feb 21, 2017 at 7:46 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
>> > Hi Andreas-
>> >
>> >
>> >> On Feb 20, 2017, at 4:31 PM, Andreas Gruenbacher <agruenba@redhat.com> wrote:
>> >>
>> >> On Mon, Feb 20, 2017 at 6:15 PM, J. Bruce Fields <bfields@redhat.com> wrote:
>> >>> On Mon, Feb 20, 2017 at 11:42:31AM -0500, Chuck Lever wrote:
>> >>>>
>> >>>>> On Feb 20, 2017, at 11:09 AM, J. Bruce Fields <bfields@redhat.com> wrote:
>> >>>>>
>> >>>>> On Sun, Feb 19, 2017 at 02:29:03PM -0500, Chuck Lever wrote:
>> >>>>>>
>> >>>>>>> On Feb 18, 2017, at 9:07 PM, J. Bruce Fields <bfields@redhat.com> wrote:
>> >>>>>>>
>> >>>>>>> From: Weston Andros Adamson <dros@netapp.com>
>> >>>>>>>
>> >>>>>>> Instead of preallocating pags, allow xdr_partial_copy_from_skb() to
>> >>>>>>> allocate whatever pages we need on demand.  This is what the NFSv3 ACL
>> >>>>>>> code does.
>> >>>>>>
>> >>>>>> The patch description does not explain why this change is
>> >>>>>> being done.
>> >>>>>
>> >>>>> The only justification I see is avoiding allocating pages unnecessarily.
>> >>>>
>> >>>> That makes sense. Is there a real world workload that has seen
>> >>>> a negative effect?
>> >>>>
>> >>>>
>> >>>>> Without this patch, for each getacl, we allocate 17 pages (if I'm
>> >>>>> calculating correctly) and probably rarely use most of them.
>> >>>>>
>> >>>>> In the v3 case I think it's 7 pages instead of 17.
>> >>>>
>> >>>> I would have guessed 9. Out of curiosity, is there a reason
>> >>>> documented for these size limits?
>> >>>
>> >>>
>> >>> In the v4 case:
>> >>>
>> >>>        #define NFS4ACL_MAXPAGES DIV_ROUND_UP(XATTR_SIZE_MAX, PAGE_SIZE)
>> >>>
>> >>> And I believe XATTR_SIZE_MAX is a global maximum on the size of any
>> >>> extend attribute value.
>> >>
>> >> XATTR_SIZE_MAX is the maximum size of an extended attribute. NFSv4
>> >> ACLs are passed through unchanged in "system.nfs4_acl".
>> >
>> > "Extended attribute" means this is a Linux-specific limit?
>>
>> Yes.
>>
>> > Is there anything that prevents a non-Linux system from constructing
>> > or returning an ACL that is larger than that?
>>
>> No.
>
> In the >=v4.1 case there are session limits, but they'll typically be
> less.  In the 4.0 case I think there's no explicit limit at all.  In
> practice I bet other systems are similar to Linux in that the assume
> peers won't send rpc replies or requests larger than about the
> maximum-sized read or write.  But again that'll usually be a higher
> limit than our ACL limit.
>
>> > What happens on a Linux client when a server returns an ACL that does
>> > not fit in this allotment?
>>
>> I would hope an error, but I haven't tested it.
>
> I haven't tested either, but it looks to me like the rpc layer receives
> a truncated request, the xdr decoding recognizes that it's truncated,
> and the result is an -ERANGE.
>
> Looking now I think that my "NFSv4: simplify getacl decoding" changes
> that to an -EIO.  More importantly, it makes that an EIO even when the
> calling application was only asking for the length, not the actual ACL
> data.  I'll fix that.

Just be careful not to return a length from getxattr(path, name, NULL,
0) that will cause getxattr(path, name, buffer, size) to fail with
ERANGE, please. Otherwise, user space might get very confused.

Thanks,
Andreas

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 6/6] NFSv4: allow getacl rpc to allocate pages on demand
  2017-02-21 21:45                     ` Andreas Gruenbacher
@ 2017-02-22  1:53                       ` J. Bruce Fields
  2017-02-23 10:28                         ` Andreas Gruenbacher
  0 siblings, 1 reply; 46+ messages in thread
From: J. Bruce Fields @ 2017-02-22  1:53 UTC (permalink / raw)
  To: Andreas Gruenbacher
  Cc: Chuck Lever, Trond Myklebust, Anna Schumaker,
	Linux NFS Mailing List, Dros Adamson, Weston Andros Adamson

On Tue, Feb 21, 2017 at 10:45:35PM +0100, Andreas Gruenbacher wrote:
> On Tue, Feb 21, 2017 at 10:37 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> > On Tue, Feb 21, 2017 at 10:21:05PM +0100, Andreas Gruenbacher wrote:
> >> On Tue, Feb 21, 2017 at 7:46 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
> >> > Hi Andreas-
> >> >
> >> >
> >> >> On Feb 20, 2017, at 4:31 PM, Andreas Gruenbacher <agruenba@redhat.com> wrote:
> >> >>
> >> >> On Mon, Feb 20, 2017 at 6:15 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> >> >>> On Mon, Feb 20, 2017 at 11:42:31AM -0500, Chuck Lever wrote:
> >> >>>>
> >> >>>>> On Feb 20, 2017, at 11:09 AM, J. Bruce Fields <bfields@redhat.com> wrote:
> >> >>>>>
> >> >>>>> On Sun, Feb 19, 2017 at 02:29:03PM -0500, Chuck Lever wrote:
> >> >>>>>>
> >> >>>>>>> On Feb 18, 2017, at 9:07 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> >> >>>>>>>
> >> >>>>>>> From: Weston Andros Adamson <dros@netapp.com>
> >> >>>>>>>
> >> >>>>>>> Instead of preallocating pags, allow xdr_partial_copy_from_skb() to
> >> >>>>>>> allocate whatever pages we need on demand.  This is what the NFSv3 ACL
> >> >>>>>>> code does.
> >> >>>>>>
> >> >>>>>> The patch description does not explain why this change is
> >> >>>>>> being done.
> >> >>>>>
> >> >>>>> The only justification I see is avoiding allocating pages unnecessarily.
> >> >>>>
> >> >>>> That makes sense. Is there a real world workload that has seen
> >> >>>> a negative effect?
> >> >>>>
> >> >>>>
> >> >>>>> Without this patch, for each getacl, we allocate 17 pages (if I'm
> >> >>>>> calculating correctly) and probably rarely use most of them.
> >> >>>>>
> >> >>>>> In the v3 case I think it's 7 pages instead of 17.
> >> >>>>
> >> >>>> I would have guessed 9. Out of curiosity, is there a reason
> >> >>>> documented for these size limits?
> >> >>>
> >> >>>
> >> >>> In the v4 case:
> >> >>>
> >> >>>        #define NFS4ACL_MAXPAGES DIV_ROUND_UP(XATTR_SIZE_MAX, PAGE_SIZE)
> >> >>>
> >> >>> And I believe XATTR_SIZE_MAX is a global maximum on the size of any
> >> >>> extend attribute value.
> >> >>
> >> >> XATTR_SIZE_MAX is the maximum size of an extended attribute. NFSv4
> >> >> ACLs are passed through unchanged in "system.nfs4_acl".
> >> >
> >> > "Extended attribute" means this is a Linux-specific limit?
> >>
> >> Yes.
> >>
> >> > Is there anything that prevents a non-Linux system from constructing
> >> > or returning an ACL that is larger than that?
> >>
> >> No.
> >
> > In the >=v4.1 case there are session limits, but they'll typically be
> > less.  In the 4.0 case I think there's no explicit limit at all.  In
> > practice I bet other systems are similar to Linux in that the assume
> > peers won't send rpc replies or requests larger than about the
> > maximum-sized read or write.  But again that'll usually be a higher
> > limit than our ACL limit.
> >
> >> > What happens on a Linux client when a server returns an ACL that does
> >> > not fit in this allotment?
> >>
> >> I would hope an error, but I haven't tested it.
> >
> > I haven't tested either, but it looks to me like the rpc layer receives
> > a truncated request, the xdr decoding recognizes that it's truncated,
> > and the result is an -ERANGE.
> >
> > Looking now I think that my "NFSv4: simplify getacl decoding" changes
> > that to an -EIO.  More importantly, it makes that an EIO even when the
> > calling application was only asking for the length, not the actual ACL
> > data.  I'll fix that.
> 
> Just be careful not to return a length from getxattr(path, name, NULL,
> 0) that will cause getxattr(path, name, buffer, size) to fail with
> ERANGE, please. Otherwise, user space might get very confused.

Ugh, OK.  So there could be userspace code that does something like

	while (getxattr(path, name, buf, size) == -ERANGE) {
		/* oops, must have raced with a size change */
		size = getxattr(path, name, NULL, 0);
		buf = realloc(buf, size);
	}

and you'd consider that a kernel bug not a userspace bug?

I suspect that can happen both before and after my changes.

So what do we want for that case?  Just -EIO?

--b.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/6] NFSv4: fix getacl ERANGE for some ACL buffer sizes
  2017-02-21 19:46     ` Weston Andros Adamson
@ 2017-02-22 22:36       ` J. Bruce Fields
  2017-02-23 14:55         ` Anna Schumaker
  0 siblings, 1 reply; 46+ messages in thread
From: J. Bruce Fields @ 2017-02-22 22:36 UTC (permalink / raw)
  To: Weston Andros Adamson
  Cc: Trond Myklebust, Anna Schumaker, linux-nfs list,
	Andreas Gruenbacher, Weston Andros Adamson

On Tue, Feb 21, 2017 at 07:46:58PM +0000, Weston Andros Adamson wrote:
> 
> > On Feb 18, 2017, at 9:07 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> > 
> > From: Weston Andros Adamson <dros@netapp.com>
> > 
> > We're not taking into account that the space needed for the (variable
> > length) attr bitmap, with the result that we'd sometimes get a spurious
> > ERANGE when the ACL data got close to the end of a page.
> > 
> > Just add in an extra page to make sure.
> > 
> > Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> 
> Thanks, you can add:
> 
> Signed-off-by: Weston Andros Adamson <dros@primarydata.com>

Thanks.

Anna, could we get this one in now?

The rest of it still needs some work to account for the problem Andreas
notes (where we can return a length successfully even though we'll never
accept an ACL of that length).  But this one is an easy fix for a real
bug.

Let me know if you need it resent.

--b.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 6/6] NFSv4: allow getacl rpc to allocate pages on demand
  2017-02-22  1:53                       ` J. Bruce Fields
@ 2017-02-23 10:28                         ` Andreas Gruenbacher
  2017-02-23 20:20                           ` J. Bruce Fields
  0 siblings, 1 reply; 46+ messages in thread
From: Andreas Gruenbacher @ 2017-02-23 10:28 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Chuck Lever, Trond Myklebust, Anna Schumaker,
	Linux NFS Mailing List, Dros Adamson, Weston Andros Adamson

On Wed, Feb 22, 2017 at 2:53 AM, J. Bruce Fields <bfields@redhat.com> wrote:
> On Tue, Feb 21, 2017 at 10:45:35PM +0100, Andreas Gruenbacher wrote:
>> On Tue, Feb 21, 2017 at 10:37 PM, J. Bruce Fields <bfields@redhat.com> wrote:
>> > On Tue, Feb 21, 2017 at 10:21:05PM +0100, Andreas Gruenbacher wrote:
>> >> On Tue, Feb 21, 2017 at 7:46 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
>> >> > Hi Andreas-
>> >> >
>> >> >
>> >> >> On Feb 20, 2017, at 4:31 PM, Andreas Gruenbacher <agruenba@redhat.com> wrote:
>> >> >>
>> >> >> On Mon, Feb 20, 2017 at 6:15 PM, J. Bruce Fields <bfields@redhat.com> wrote:
>> >> >>> On Mon, Feb 20, 2017 at 11:42:31AM -0500, Chuck Lever wrote:
>> >> >>>>
>> >> >>>>> On Feb 20, 2017, at 11:09 AM, J. Bruce Fields <bfields@redhat.com> wrote:
>> >> >>>>>
>> >> >>>>> On Sun, Feb 19, 2017 at 02:29:03PM -0500, Chuck Lever wrote:
>> >> >>>>>>
>> >> >>>>>>> On Feb 18, 2017, at 9:07 PM, J. Bruce Fields <bfields@redhat.com> wrote:
>> >> >>>>>>>
>> >> >>>>>>> From: Weston Andros Adamson <dros@netapp.com>
>> >> >>>>>>>
>> >> >>>>>>> Instead of preallocating pags, allow xdr_partial_copy_from_skb() to
>> >> >>>>>>> allocate whatever pages we need on demand.  This is what the NFSv3 ACL
>> >> >>>>>>> code does.
>> >> >>>>>>
>> >> >>>>>> The patch description does not explain why this change is
>> >> >>>>>> being done.
>> >> >>>>>
>> >> >>>>> The only justification I see is avoiding allocating pages unnecessarily.
>> >> >>>>
>> >> >>>> That makes sense. Is there a real world workload that has seen
>> >> >>>> a negative effect?
>> >> >>>>
>> >> >>>>
>> >> >>>>> Without this patch, for each getacl, we allocate 17 pages (if I'm
>> >> >>>>> calculating correctly) and probably rarely use most of them.
>> >> >>>>>
>> >> >>>>> In the v3 case I think it's 7 pages instead of 17.
>> >> >>>>
>> >> >>>> I would have guessed 9. Out of curiosity, is there a reason
>> >> >>>> documented for these size limits?
>> >> >>>
>> >> >>>
>> >> >>> In the v4 case:
>> >> >>>
>> >> >>>        #define NFS4ACL_MAXPAGES DIV_ROUND_UP(XATTR_SIZE_MAX, PAGE_SIZE)
>> >> >>>
>> >> >>> And I believe XATTR_SIZE_MAX is a global maximum on the size of any
>> >> >>> extend attribute value.
>> >> >>
>> >> >> XATTR_SIZE_MAX is the maximum size of an extended attribute. NFSv4
>> >> >> ACLs are passed through unchanged in "system.nfs4_acl".
>> >> >
>> >> > "Extended attribute" means this is a Linux-specific limit?
>> >>
>> >> Yes.
>> >>
>> >> > Is there anything that prevents a non-Linux system from constructing
>> >> > or returning an ACL that is larger than that?
>> >>
>> >> No.
>> >
>> > In the >=v4.1 case there are session limits, but they'll typically be
>> > less.  In the 4.0 case I think there's no explicit limit at all.  In
>> > practice I bet other systems are similar to Linux in that the assume
>> > peers won't send rpc replies or requests larger than about the
>> > maximum-sized read or write.  But again that'll usually be a higher
>> > limit than our ACL limit.
>> >
>> >> > What happens on a Linux client when a server returns an ACL that does
>> >> > not fit in this allotment?
>> >>
>> >> I would hope an error, but I haven't tested it.
>> >
>> > I haven't tested either, but it looks to me like the rpc layer receives
>> > a truncated request, the xdr decoding recognizes that it's truncated,
>> > and the result is an -ERANGE.
>> >
>> > Looking now I think that my "NFSv4: simplify getacl decoding" changes
>> > that to an -EIO.  More importantly, it makes that an EIO even when the
>> > calling application was only asking for the length, not the actual ACL
>> > data.  I'll fix that.
>>
>> Just be careful not to return a length from getxattr(path, name, NULL,
>> 0) that will cause getxattr(path, name, buffer, size) to fail with
>> ERANGE, please. Otherwise, user space might get very confused.
>
> Ugh, OK.  So there could be userspace code that does something like
>
>         while (getxattr(path, name, buf, size) == -ERANGE) {
>                 /* oops, must have raced with a size change */
>                 size = getxattr(path, name, NULL, 0);
>                 buf = realloc(buf, size);
>         }
>
> and you'd consider that a kernel bug not a userspace bug?

It would at least provoke errors if the above loop (with an additional
check for size == -1) didn't terminate, so I'd like to avoid that. I
see now that there is botched code in fs/xattr.c that tries to prevent
that, so I'll try to fix that so that file systems won't have to
bother.

> I suspect that can happen both before and after my changes.
>
> So what do we want for that case?  Just -EIO?

getxattr and listxattr are trying to cast that kind of error to
-E2BIG, which seems okay.

Thanks,
Andreas

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/6] NFSv4: fix getacl ERANGE for some ACL buffer sizes
  2017-02-22 22:36       ` J. Bruce Fields
@ 2017-02-23 14:55         ` Anna Schumaker
  2017-02-23 19:43           ` J. Bruce Fields
  0 siblings, 1 reply; 46+ messages in thread
From: Anna Schumaker @ 2017-02-23 14:55 UTC (permalink / raw)
  To: J. Bruce Fields, Weston Andros Adamson
  Cc: Trond Myklebust, Anna Schumaker, linux-nfs list,
	Andreas Gruenbacher, Weston Andros Adamson



On 02/22/2017 05:36 PM, J. Bruce Fields wrote:
> On Tue, Feb 21, 2017 at 07:46:58PM +0000, Weston Andros Adamson wrote:
>>
>>> On Feb 18, 2017, at 9:07 PM, J. Bruce Fields <bfields@redhat.com> wrote:
>>>
>>> From: Weston Andros Adamson <dros@netapp.com>
>>>
>>> We're not taking into account that the space needed for the (variable
>>> length) attr bitmap, with the result that we'd sometimes get a spurious
>>> ERANGE when the ACL data got close to the end of a page.
>>>
>>> Just add in an extra page to make sure.
>>>
>>> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
>>
>> Thanks, you can add:
>>
>> Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
> 
> Thanks.
> 
> Anna, could we get this one in now?
> 
> The rest of it still needs some work to account for the problem Andreas
> notes (where we can return a length successfully even though we'll never
> accept an ACL of that length).  But this one is an easy fix for a real
> bug.
> 
> Let me know if you need it resent.

I don't mind taking just the one patch.  Have you made any changes to it since this posting?  If not, then I can take it from the email.

Thanks,
Anna

> 
> --b.
> 

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/6] NFSv4: fix getacl ERANGE for some ACL buffer sizes
  2017-02-23 14:55         ` Anna Schumaker
@ 2017-02-23 19:43           ` J. Bruce Fields
  2017-02-23 19:53             ` [PATCH 1/2] NFSv4: fix getacl head length estimation J. Bruce Fields
  0 siblings, 1 reply; 46+ messages in thread
From: J. Bruce Fields @ 2017-02-23 19:43 UTC (permalink / raw)
  To: Anna Schumaker
  Cc: Weston Andros Adamson, Trond Myklebust, Anna Schumaker,
	linux-nfs list, Andreas Gruenbacher, Weston Andros Adamson

On Thu, Feb 23, 2017 at 09:55:35AM -0500, Anna Schumaker wrote:
> 
> 
> On 02/22/2017 05:36 PM, J. Bruce Fields wrote:
> > On Tue, Feb 21, 2017 at 07:46:58PM +0000, Weston Andros Adamson wrote:
> >>
> >>> On Feb 18, 2017, at 9:07 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> >>>
> >>> From: Weston Andros Adamson <dros@netapp.com>
> >>>
> >>> We're not taking into account that the space needed for the (variable
> >>> length) attr bitmap, with the result that we'd sometimes get a spurious
> >>> ERANGE when the ACL data got close to the end of a page.
> >>>
> >>> Just add in an extra page to make sure.
> >>>
> >>> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> >>
> >> Thanks, you can add:
> >>
> >> Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
> > 
> > Thanks.
> > 
> > Anna, could we get this one in now?
> > 
> > The rest of it still needs some work to account for the problem Andreas
> > notes (where we can return a length successfully even though we'll never
> > accept an ACL of that length).  But this one is an easy fix for a real
> > bug.
> > 
> > Let me know if you need it resent.
> 
> I don't mind taking just the one patch.  Have you made any changes to it since this posting?  If not, then I can take it from the email.

Just Dros's signoff.  Actually, we should probably take the first two.
I'll resend them just to make sure.

--b.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 1/2] NFSv4: fix getacl head length estimation
  2017-02-23 19:43           ` J. Bruce Fields
@ 2017-02-23 19:53             ` J. Bruce Fields
  2017-02-23 19:54               ` [PATCH 2/2] NFSv4: fix getacl ERANGE for some ACL buffer sizes J. Bruce Fields
  0 siblings, 1 reply; 46+ messages in thread
From: J. Bruce Fields @ 2017-02-23 19:53 UTC (permalink / raw)
  To: Anna Schumaker
  Cc: Weston Andros Adamson, Trond Myklebust, Anna Schumaker,
	linux-nfs list, Andreas Gruenbacher, Weston Andros Adamson

From: "J. Bruce Fields" <bfields@redhat.com>

Bitmap and attrlen follow immediately after the op reply header.  This
was an oversight from commit bf118a342f.

Consequences of this are just minor efficiency (extra calls to
xdr_shrink_bufhead).

Fixes: bf118a342f10 "NFSv4: include bitmap in nfsv4 get acl data"
Reviewed-by: Kinglong Mee <kinglongmee@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfs/nfs4xdr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index e9255cb453e6..bb95dd2edeef 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -2524,7 +2524,7 @@ static void nfs4_xdr_enc_getacl(struct rpc_rqst *req, struct xdr_stream *xdr,
 	encode_compound_hdr(xdr, req, &hdr);
 	encode_sequence(xdr, &args->seq_args, &hdr);
 	encode_putfh(xdr, args->fh, &hdr);
-	replen = hdr.replen + op_decode_hdr_maxsz + 1;
+	replen = hdr.replen + op_decode_hdr_maxsz;
 	encode_getattr_two(xdr, FATTR4_WORD0_ACL, 0, &hdr);
 
 	xdr_inline_pages(&req->rq_rcv_buf, replen << 2,
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 2/2] NFSv4: fix getacl ERANGE for some ACL buffer sizes
  2017-02-23 19:53             ` [PATCH 1/2] NFSv4: fix getacl head length estimation J. Bruce Fields
@ 2017-02-23 19:54               ` J. Bruce Fields
  2017-02-23 21:54                 ` Anna Schumaker
  0 siblings, 1 reply; 46+ messages in thread
From: J. Bruce Fields @ 2017-02-23 19:54 UTC (permalink / raw)
  To: Anna Schumaker
  Cc: Weston Andros Adamson, Trond Myklebust, Anna Schumaker,
	linux-nfs list, Andreas Gruenbacher, Weston Andros Adamson

From: Weston Andros Adamson <dros@primarydata.com>

We're not taking into account that the space needed for the (variable
length) attr bitmap, with the result that we'd sometimes get a spurious
ERANGE when the ACL data got close to the end of a page.

Just add in an extra page to make sure.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfs/nfs4proc.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 0a0eaecf9676..60501e10625d 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5069,7 +5069,7 @@ static void nfs4_write_cached_acl(struct inode *inode, struct page **pages, size
  */
 static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t buflen)
 {
-	struct page *pages[NFS4ACL_MAXPAGES] = {NULL, };
+	struct page *pages[NFS4ACL_MAXPAGES + 1] = {NULL, };
 	struct nfs_getaclargs args = {
 		.fh = NFS_FH(inode),
 		.acl_pages = pages,
@@ -5083,13 +5083,9 @@ static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t bu
 		.rpc_argp = &args,
 		.rpc_resp = &res,
 	};
-	unsigned int npages = DIV_ROUND_UP(buflen, PAGE_SIZE);
+	unsigned int npages = DIV_ROUND_UP(buflen, PAGE_SIZE) + 1;
 	int ret = -ENOMEM, i;
 
-	/* As long as we're doing a round trip to the server anyway,
-	 * let's be prepared for a page of acl data. */
-	if (npages == 0)
-		npages = 1;
 	if (npages > ARRAY_SIZE(pages))
 		return -ERANGE;
 
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 6/6] NFSv4: allow getacl rpc to allocate pages on demand
  2017-02-23 10:28                         ` Andreas Gruenbacher
@ 2017-02-23 20:20                           ` J. Bruce Fields
  0 siblings, 0 replies; 46+ messages in thread
From: J. Bruce Fields @ 2017-02-23 20:20 UTC (permalink / raw)
  To: Andreas Gruenbacher
  Cc: Chuck Lever, Trond Myklebust, Anna Schumaker,
	Linux NFS Mailing List, Dros Adamson, Weston Andros Adamson

On Thu, Feb 23, 2017 at 11:28:46AM +0100, Andreas Gruenbacher wrote:
> On Wed, Feb 22, 2017 at 2:53 AM, J. Bruce Fields <bfields@redhat.com> wrote:
> > On Tue, Feb 21, 2017 at 10:45:35PM +0100, Andreas Gruenbacher wrote:
> >> On Tue, Feb 21, 2017 at 10:37 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> >> > On Tue, Feb 21, 2017 at 10:21:05PM +0100, Andreas Gruenbacher wrote:
> >> >> On Tue, Feb 21, 2017 at 7:46 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
> >> >> > Hi Andreas-
> >> >> >
> >> >> >
> >> >> >> On Feb 20, 2017, at 4:31 PM, Andreas Gruenbacher <agruenba@redhat.com> wrote:
> >> >> >>
> >> >> >> On Mon, Feb 20, 2017 at 6:15 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> >> >> >>> On Mon, Feb 20, 2017 at 11:42:31AM -0500, Chuck Lever wrote:
> >> >> >>>>
> >> >> >>>>> On Feb 20, 2017, at 11:09 AM, J. Bruce Fields <bfields@redhat.com> wrote:
> >> >> >>>>>
> >> >> >>>>> On Sun, Feb 19, 2017 at 02:29:03PM -0500, Chuck Lever wrote:
> >> >> >>>>>>
> >> >> >>>>>>> On Feb 18, 2017, at 9:07 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> >> >> >>>>>>>
> >> >> >>>>>>> From: Weston Andros Adamson <dros@netapp.com>
> >> >> >>>>>>>
> >> >> >>>>>>> Instead of preallocating pags, allow xdr_partial_copy_from_skb() to
> >> >> >>>>>>> allocate whatever pages we need on demand.  This is what the NFSv3 ACL
> >> >> >>>>>>> code does.
> >> >> >>>>>>
> >> >> >>>>>> The patch description does not explain why this change is
> >> >> >>>>>> being done.
> >> >> >>>>>
> >> >> >>>>> The only justification I see is avoiding allocating pages unnecessarily.
> >> >> >>>>
> >> >> >>>> That makes sense. Is there a real world workload that has seen
> >> >> >>>> a negative effect?
> >> >> >>>>
> >> >> >>>>
> >> >> >>>>> Without this patch, for each getacl, we allocate 17 pages (if I'm
> >> >> >>>>> calculating correctly) and probably rarely use most of them.
> >> >> >>>>>
> >> >> >>>>> In the v3 case I think it's 7 pages instead of 17.
> >> >> >>>>
> >> >> >>>> I would have guessed 9. Out of curiosity, is there a reason
> >> >> >>>> documented for these size limits?
> >> >> >>>
> >> >> >>>
> >> >> >>> In the v4 case:
> >> >> >>>
> >> >> >>>        #define NFS4ACL_MAXPAGES DIV_ROUND_UP(XATTR_SIZE_MAX, PAGE_SIZE)
> >> >> >>>
> >> >> >>> And I believe XATTR_SIZE_MAX is a global maximum on the size of any
> >> >> >>> extend attribute value.
> >> >> >>
> >> >> >> XATTR_SIZE_MAX is the maximum size of an extended attribute. NFSv4
> >> >> >> ACLs are passed through unchanged in "system.nfs4_acl".
> >> >> >
> >> >> > "Extended attribute" means this is a Linux-specific limit?
> >> >>
> >> >> Yes.
> >> >>
> >> >> > Is there anything that prevents a non-Linux system from constructing
> >> >> > or returning an ACL that is larger than that?
> >> >>
> >> >> No.
> >> >
> >> > In the >=v4.1 case there are session limits, but they'll typically be
> >> > less.  In the 4.0 case I think there's no explicit limit at all.  In
> >> > practice I bet other systems are similar to Linux in that the assume
> >> > peers won't send rpc replies or requests larger than about the
> >> > maximum-sized read or write.  But again that'll usually be a higher
> >> > limit than our ACL limit.
> >> >
> >> >> > What happens on a Linux client when a server returns an ACL that does
> >> >> > not fit in this allotment?
> >> >>
> >> >> I would hope an error, but I haven't tested it.
> >> >
> >> > I haven't tested either, but it looks to me like the rpc layer receives
> >> > a truncated request, the xdr decoding recognizes that it's truncated,
> >> > and the result is an -ERANGE.
> >> >
> >> > Looking now I think that my "NFSv4: simplify getacl decoding" changes
> >> > that to an -EIO.  More importantly, it makes that an EIO even when the
> >> > calling application was only asking for the length, not the actual ACL
> >> > data.  I'll fix that.
> >>
> >> Just be careful not to return a length from getxattr(path, name, NULL,
> >> 0) that will cause getxattr(path, name, buffer, size) to fail with
> >> ERANGE, please. Otherwise, user space might get very confused.
> >
> > Ugh, OK.  So there could be userspace code that does something like
> >
> >         while (getxattr(path, name, buf, size) == -ERANGE) {
> >                 /* oops, must have raced with a size change */
> >                 size = getxattr(path, name, NULL, 0);
> >                 buf = realloc(buf, size);
> >         }
> >
> > and you'd consider that a kernel bug not a userspace bug?
> 
> It would at least provoke errors if the above loop (with an additional
> check for size == -1) didn't terminate, so I'd like to avoid that. I
> see now that there is botched code in fs/xattr.c that tries to prevent
> that, so I'll try to fix that so that file systems won't have to
> bother.

Having seen your patch on fs-devel....  OK, so after that point, we can
choose in NFS to either to return -E2BIG ourselves or to return success
with the large length and let fs/xattr convert to -E2BIG if necessary.
Thanks, that makes sense.

> > I suspect that can happen both before and after my changes.
> >
> > So what do we want for that case?  Just -EIO?
> 
> getxattr and listxattr are trying to cast that kind of error to
> -E2BIG, which seems okay.

Got it, thanks.

--b.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] NFSv4: fix getacl ERANGE for some ACL buffer sizes
  2017-02-23 19:54               ` [PATCH 2/2] NFSv4: fix getacl ERANGE for some ACL buffer sizes J. Bruce Fields
@ 2017-02-23 21:54                 ` Anna Schumaker
  0 siblings, 0 replies; 46+ messages in thread
From: Anna Schumaker @ 2017-02-23 21:54 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Weston Andros Adamson, Trond Myklebust, Anna Schumaker,
	linux-nfs list, Andreas Gruenbacher, Weston Andros Adamson

Thanks, Bruce and Dros!

Anna

On 02/23/2017 02:54 PM, J. Bruce Fields wrote:
> From: Weston Andros Adamson <dros@primarydata.com>
> 
> We're not taking into account that the space needed for the (variable
> length) attr bitmap, with the result that we'd sometimes get a spurious
> ERANGE when the ACL data got close to the end of a page.
> 
> Just add in an extra page to make sure.
> 
> Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
> Cc: stable@vger.kernel.org
> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> ---
>  fs/nfs/nfs4proc.c | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 0a0eaecf9676..60501e10625d 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -5069,7 +5069,7 @@ static void nfs4_write_cached_acl(struct inode *inode, struct page **pages, size
>   */
>  static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t buflen)
>  {
> -	struct page *pages[NFS4ACL_MAXPAGES] = {NULL, };
> +	struct page *pages[NFS4ACL_MAXPAGES + 1] = {NULL, };
>  	struct nfs_getaclargs args = {
>  		.fh = NFS_FH(inode),
>  		.acl_pages = pages,
> @@ -5083,13 +5083,9 @@ static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf, size_t bu
>  		.rpc_argp = &args,
>  		.rpc_resp = &res,
>  	};
> -	unsigned int npages = DIV_ROUND_UP(buflen, PAGE_SIZE);
> +	unsigned int npages = DIV_ROUND_UP(buflen, PAGE_SIZE) + 1;
>  	int ret = -ENOMEM, i;
>  
> -	/* As long as we're doing a round trip to the server anyway,
> -	 * let's be prepared for a page of acl data. */
> -	if (npages == 0)
> -		npages = 1;
>  	if (npages > ARRAY_SIZE(pages))
>  		return -ERANGE;
>  
> 

^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2017-02-23 21:55 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-17 16:44 [PATCH 0/3] getacl fixes J. Bruce Fields
2017-02-17 16:44 ` [PATCH 1/3] nfsd4: fix getacl head length estimation J. Bruce Fields
2017-02-17 16:44 ` [PATCH 2/3] NFSv4: fix getacl ERANGE for some ACL buffer sizes J. Bruce Fields
2017-02-17 16:44 ` [PATCH 3/3] nfsd4: simplify getacl decoding J. Bruce Fields
2017-02-17 19:15   ` kbuild test robot
2017-02-17 19:33     ` J. Bruce Fields
2017-02-17 19:35   ` [PATCH] nfsd4: fix ifnullfree.cocci warnings kbuild test robot
2017-02-17 19:34     ` J. Bruce Fields
2017-02-17 19:35   ` [PATCH 3/3] nfsd4: simplify getacl decoding kbuild test robot
2017-02-17 20:36 ` [PATCH 0/3] getacl fixes Chuck Lever
2017-02-17 20:52   ` J. Bruce Fields
2017-02-17 21:21     ` Chuck Lever
2017-02-19  2:07 ` [PATCH 0/6] getacl fixes V2 J. Bruce Fields
2017-02-19  2:07   ` [PATCH 1/6] NFSv4: fix getacl head length estimation J. Bruce Fields
2017-02-20 13:19     ` Kinglong Mee
2017-02-20 15:50       ` J. Bruce Fields
2017-02-20 20:27         ` [PATCH] " J. Bruce Fields
2017-02-19  2:07   ` [PATCH 2/6] NFSv4: fix getacl ERANGE for some ACL buffer sizes J. Bruce Fields
2017-02-21 19:46     ` Weston Andros Adamson
2017-02-22 22:36       ` J. Bruce Fields
2017-02-23 14:55         ` Anna Schumaker
2017-02-23 19:43           ` J. Bruce Fields
2017-02-23 19:53             ` [PATCH 1/2] NFSv4: fix getacl head length estimation J. Bruce Fields
2017-02-23 19:54               ` [PATCH 2/2] NFSv4: fix getacl ERANGE for some ACL buffer sizes J. Bruce Fields
2017-02-23 21:54                 ` Anna Schumaker
2017-02-19  2:07   ` [PATCH 3/6] NFSv4: minor acl caching policy documentation J. Bruce Fields
2017-02-19  2:07   ` [PATCH 4/6] NFSv4: minor getacl cleanup J. Bruce Fields
2017-02-20 22:38     ` Andreas Gruenbacher
2017-02-19  2:07   ` [PATCH 5/6] NFSv4: simplify getacl decoding J. Bruce Fields
2017-02-20 22:30     ` Andreas Gruenbacher
2017-02-19  2:07   ` [PATCH 6/6] NFSv4: allow getacl rpc to allocate pages on demand J. Bruce Fields
2017-02-19 19:29     ` Chuck Lever
2017-02-20 16:09       ` J. Bruce Fields
2017-02-20 16:42         ` Chuck Lever
2017-02-20 17:15           ` J. Bruce Fields
2017-02-20 21:31             ` Andreas Gruenbacher
2017-02-21 18:46               ` Chuck Lever
2017-02-21 21:21                 ` Andreas Gruenbacher
2017-02-21 21:37                   ` J. Bruce Fields
2017-02-21 21:45                     ` Andreas Gruenbacher
2017-02-22  1:53                       ` J. Bruce Fields
2017-02-23 10:28                         ` Andreas Gruenbacher
2017-02-23 20:20                           ` J. Bruce Fields
2017-02-20 22:38     ` Andreas Gruenbacher
2017-02-21 18:35       ` J. Bruce Fields
2017-02-21 19:45         ` Weston Andros Adamson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.