From: Chuck Lever <chuck.lever@oracle.com>
To: "J. Bruce Fields" <bfields@redhat.com>
Cc: Trond Myklebust <trond.myklebust@primarydata.com>,
	Anna Schumaker <schumakeranna@gmail.com>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	Andreas Gruenbacher <agruenba@redhat.com>,
	Dros Adamson <dros@primarydata.com>,
	Weston Andros Adamson <dros@netapp.com>
Subject: Re: [PATCH 6/6] NFSv4: allow getacl rpc to allocate pages on demand
Date: Mon, 20 Feb 2017 11:42:31 -0500
Message-ID: <4824B968-4ED6-44AA-A935-3D309D76EFFF@oracle.com>
In-Reply-To: <20170220160940.GB12335@parsley.fieldses.org>


> On Feb 20, 2017, at 11:09 AM, J. Bruce Fields <bfields@redhat.com> wrote:
> 
> On Sun, Feb 19, 2017 at 02:29:03PM -0500, Chuck Lever wrote:
>> 
>>> On Feb 18, 2017, at 9:07 PM, J. Bruce Fields <bfields@redhat.com> wrote:
>>> 
>>> From: Weston Andros Adamson <dros@netapp.com>
>>> 
>>> Instead of preallocating pages, allow xdr_partial_copy_from_skb() to
>>> allocate whatever pages we need on demand.  This is what the NFSv3 ACL
>>> code does.
>> 
>> The patch description does not explain why this change is
>> being done.
> 
> The only justification I see is avoiding allocating pages unnecessarily.

That makes sense. Is there a real-world workload that has seen
a negative effect?


> Without this patch, for each getacl, we allocate 17 pages (if I'm
> calculating correctly) and probably rarely use most of them.
> 
> In the v3 case I think it's 7 pages instead of 17.

I would have guessed 9. Out of curiosity, is there a reason
documented for these size limits?
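
For reference, both figures can be reproduced with a little
arithmetic. This is only a sketch under assumptions that are not
stated in this thread: 4 KiB pages, an NFSv4 limit sized to hold one
64 KiB extended attribute, and an NFSv3 limit sized for two POSIX
ACLs of up to 1024 entries each (8-byte header, 12 bytes per entry).
The "+ 1" on the v4 side comes from pages[NFS4ACL_MAXPAGES + 1] in
the patch. The function and macro names are made up for illustration.

```c
/* Assumed values, not taken from the thread. */
#define ASSUMED_PAGE_SIZE        4096u
#define ASSUMED_XATTR_SIZE_MAX   65536u
#define ASSUMED_ACL_MAX_ENTRIES  1024u

/* Round-up division, the same idea as the kernel's DIV_ROUND_UP(). */
static unsigned int div_round_up(unsigned int n, unsigned int d)
{
	return (n + d - 1) / d;
}

/* NFSv4: enough pages for one maximal xattr, plus the extra page. */
unsigned int v4_getacl_pages(void)
{
	return div_round_up(ASSUMED_XATTR_SIZE_MAX, ASSUMED_PAGE_SIZE) + 1;
}

/* NFSv3: enough pages for two maximal ACLs (access and default). */
unsigned int v3_getacl_pages(void)
{
	unsigned int bytes = 2 * (8 + 12 * ASSUMED_ACL_MAX_ENTRIES);

	return div_round_up(bytes, ASSUMED_PAGE_SIZE);
}
```

Under those assumptions v4_getacl_pages() gives 17 and
v3_getacl_pages() gives 7. A different page size or entry limit
changes both results.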


> Do we have reason to believe that's actually a big deal?

The xprtrdma hack already has to allocate the full set of
pages for NFSACL GETACL.

If NFSv4 GETATTR(fs_acl4) already works this way and there
are no real problems, I can't see any issue with NFSACL GETACL
using the same mechanism to retrieve smaller objects.

The only risk of overallocating is that it could drive some
page reclaims. The upper layer should be in a better position
to prevent deadlock in this case than the transport layer is.
However, if NFSv4 doesn't see a problem here, then there isn't
likely to be an issue for NFSACL GETACL, IMO.
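
To make the two schemes concrete, here is a userspace sketch of the
on-demand variant. It is not the kernel code: malloc() stands in for
alloc_page(), and copy_on_demand() and free_used_pages() are
hypothetical names. The array starts out all NULL, the receive side
fills a slot only when reply data actually lands in it, and the
caller frees up to the first NULL slot, as the cleanup loop in the
patch does.

```c
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_MAXPAGES  17u

/*
 * Copy len bytes of reply data into a page array, allocating each
 * page only when data lands in it. Returns 0 on success, -1 if the
 * data does not fit or an allocation fails. Pages allocated before
 * a failure stay in the array for the caller to free.
 */
int copy_on_demand(void *pages[], size_t npages, const void *src, size_t len)
{
	const unsigned char *p = src;
	size_t i;

	if (len > npages * SKETCH_PAGE_SIZE)
		return -1;

	for (i = 0; len > 0; i++) {
		size_t chunk = len < SKETCH_PAGE_SIZE ? len : SKETCH_PAGE_SIZE;

		if (!pages[i]) {
			pages[i] = malloc(SKETCH_PAGE_SIZE);
			if (!pages[i])
				return -1;
		}
		memcpy(pages[i], p, chunk);
		p += chunk;
		len -= chunk;
	}
	return 0;
}

/* Free pages up to the first unused slot; returns how many were freed. */
size_t free_used_pages(void *pages[], size_t npages)
{
	size_t i;

	for (i = 0; i < npages && pages[i]; i++) {
		free(pages[i]);
		pages[i] = NULL;
	}
	return i;
}
```

With a 5000-byte reply, only the first two slots get filled. The
preallocating scheme would have populated the slots before the call
was sent, whether or not the reply used them.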


> --b.
> 
>> The matching hack in xprtrdma is in rpcrdma_convert_iovs().
>> Note that those are GFP_ATOMIC allocations, whereas here
>> they are GFP_KERNEL, and are thus more reliable.
>> 
>> IMO this is a step in the wrong direction. We should not be
>> adding more upper layer dependencies on memory allocation
>> in the transport layer.
>> 
>> I would strongly prefer that the NFSACL code instead work the way
>> this code currently does, and that the hacks be removed from
>> the transport implementations.
>> 
>> 
>>> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
>>> ---
>>> fs/nfs/nfs4proc.c | 23 +++++++----------------
>>> 1 file changed, 7 insertions(+), 16 deletions(-)
>>> 
>>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>>> index 3e3dbba4aa74..7842c73fddfc 100644
>>> --- a/fs/nfs/nfs4proc.c
>>> +++ b/fs/nfs/nfs4proc.c
>>> @@ -5068,6 +5068,7 @@ static ssize_t nfs4_do_get_acl(struct inode *inode, void *buf, size_t buflen)
>>> 	struct page *pages[NFS4ACL_MAXPAGES + 1] = {NULL, };
>>> 	struct nfs_getaclargs args = {
>>> 		.fh = NFS_FH(inode),
>>> +		/* The xdr layer may allocate pages here. */
>> 
>> Sure, it is called xdr_partial_copy_from_skb, but that function
>> lives in socklib.c and is invoked only from xprtsock.c. Also, a
>> similar hack has to be done in xprtrdma.
>> 
>> So IMO this is a transport layer hack, and not part of the
>> (generic) XDR layer.
>> 
>> 
>>> 		.acl_pages = pages,
>>> 	};
>>> 	struct nfs_getaclres res = {
>>> @@ -5079,32 +5080,22 @@ static ssize_t nfs4_do_get_acl(struct inode *inode, void *buf, size_t buflen)
>>> 		.rpc_argp = &args,
>>> 		.rpc_resp = &res,
>>> 	};
>>> -	unsigned int npages = DIV_ROUND_UP(buflen, PAGE_SIZE) + 1;
>>> -	int ret = -ENOMEM, i;
>>> -
>>> -	if (npages > ARRAY_SIZE(pages))
>>> -		return -ERANGE;
>>> -
>>> -	for (i = 0; i < npages; i++) {
>>> -		pages[i] = alloc_page(GFP_KERNEL);
>>> -		if (!pages[i])
>>> -			goto out_free;
>>> -	}
>>> +	int ret, i;
>>> 
>>> 	/* for decoding across pages */
>>> 	res.acl_scratch = alloc_page(GFP_KERNEL);
>>> 	if (!res.acl_scratch)
>>> -		goto out_free;
>>> +		return -ENOMEM;
>>> 
>>> -	args.acl_len = npages * PAGE_SIZE;
>>> +	args.acl_len = ARRAY_SIZE(pages) << PAGE_SHIFT;
>>> 
>>> -	dprintk("%s  buf %p buflen %zu npages %d args.acl_len %zu\n",
>>> -		__func__, buf, buflen, npages, args.acl_len);
>>> +	dprintk("%s  buf %p buflen %zu args.acl_len %zu\n",
>>> +		__func__, buf, buflen, args.acl_len);
>>> 	ret = nfs4_call_sync(NFS_SERVER(inode)->client, NFS_SERVER(inode),
>>> 			     &msg, &args.seq_args, &res.seq_res, 0);
>>> 	if (ret == 0)
>>> 		ret = res.acl_len;
>>> -out_free:
>>> +
>>> 	for (i = 0; i < ARRAY_SIZE(pages) && pages[i]; i++)
>>> 		__free_page(pages[i]);
>>> 	__free_page(res.acl_scratch);
>>> -- 
>>> 2.9.3
>>> 
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> 
>> --
>> Chuck Lever
>> 
>> 
>> 

--
Chuck Lever



