From: Chuck Lever <chuck.lever@oracle.com>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: linux-rdma <linux-rdma@vger.kernel.org>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH v3 09/18] sunrpc: Allocate one more page per svc_rqst
Date: Thu, 29 Jun 2017 20:23:04 -0400
Message-ID: <C17CF389-2673-4478-B662-D6F890E3D047@oracle.com>
In-Reply-To: <EB8E2100-6176-424A-B687-FE84758889E0@oracle.com>


> On Jun 29, 2017, at 4:44 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
> 
> Thanks for the review!
> 
>> On Jun 29, 2017, at 4:20 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
>> 
>> I'm confused by this one:
>> 
>> On Fri, Jun 23, 2017 at 05:18:16PM -0400, Chuck Lever wrote:
>>> svcrdma needs 259 pages for 1MB NFSv4.0 READ requests:
>>> 
>>> - 1 page for the transport header and head iovec
>>> - 256 pages for the data payload
>>> - 1 page for the trailing GETATTR request (since NFSD XDR decoding
>>>  does not look for a tail iovec, the GETATTR is stuck at the end
>>>  of the rqstp->rq_arg.pages list)
>>> - 1 page for building the reply xdr_buf
>>> 
>>> Note that RPCSVC_MAXPAGES is defined as:
>>> 
>>> 	((RPCSVC_MAXPAYLOAD+PAGE_SIZE-1)/PAGE_SIZE + 2 + 1)
>>> 
>>> I don't understand why the "+PAGE_SIZE-1" is in there, because
>>> division will always round the result down to
>>> 
>>> 	(RPCSVC_MAXPAYLOAD / PAGE_SIZE + 2 + 1)
>> 
>> I think you're assuming that RPCSVC_MAXPAYLOAD is a multiple of
>> PAGE_SIZE.  Maybe that's true in all cases (I haven't checked), but it
>> seems harmless to handle the case where it's not.
> 
> include/linux/sunrpc/svc.h says:
> 
> 126  * Maximum payload size supported by a kernel RPC server.
> 127  * This is use to determine the max number of pages nfsd is
> 128  * willing to return in a single READ operation.
> 129  *
> 130  * These happen to all be powers of 2, which is not strictly
> 131  * necessary but helps enforce the real limitation, which is
> 132  * that they should be multiples of PAGE_SIZE.
> 
> Note that the same calculation in svc_alloc_arg() assumes
> that serv->sv_max_mesg is a multiple of PAGE_SIZE:
> 
> 	pages = (serv->sv_max_mesg + PAGE_SIZE) / PAGE_SIZE;
> 
> It's confusing that the value of RPCSVC_MAXPAGES is not computed
> the same way as "pages" is.
> 
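> For example, for a page-aligned 1MB payload with 4KB pages:
> 
> 	(1048576 + 4096) / 4096     = 257
> 	(1048576 + 4096 - 1) / 4096 = 256
> 
> so the two expressions differ by one page for every page-aligned
> payload.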
> 
>>> Let's remove the "-1" to get 260 pages maximum.
>> 
>> Why?  (I'm not necessarily opposed, I just missed the explanation.)
> 
> The top of the patch description explains why I need 259 pages.
> 
> RPCSVC_MAXPAGES has to be 260 in order for this check in svc_alloc_arg()
> 
> 	if (pages >= RPCSVC_MAXPAGES) {
> 		/* use as many pages as possible */
> 		pages = RPCSVC_MAXPAGES - 1;
> 
> to allow 260 elements of the array to be filled in.
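> 
> (With RPCSVC_MAXPAGES at 260, "pages" is clamped to 259, so the loop
> can fill rq_pages[0] through rq_pages[258] and still leave
> rq_pages[259] for the NULL terminator: 259 pages plus the terminator,
> or 260 elements in all.)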

OK, I see why I was confused about this. I expected a variable
called "pages" to be the count of pages. But it actually contains
the maximum index of the rq_pages array, which is one less than
the number of array elements. However...

It looks like the rq_pages array already has 259 elements, but
the last element always contains NULL.

 682         rqstp->rq_page_end = &rqstp->rq_pages[i];
 683         rqstp->rq_pages[i++] = NULL; /* this might be seen in nfs_read_actor */

So we actually have just 258 pages in this 259-entry array. That's
one short of the 259 pages needed by svcrdma, and one less than
the value of RPCSVC_MAXPAGES.
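
To put numbers on it (assuming 4KB pages, the 1MB maximum payload,
and an sv_max_mesg of the payload plus one page):

	RPCSVC_MAXPAGES = (1048576 + 4095) / 4096 + 2 + 1 = 259
	pages           = (1048576 + 4096 + 4096) / 4096  = 258

so the loop fills rq_pages[0] through rq_pages[257], leaving 258
pages plus the NULL terminator at rq_pages[258].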

(Note that the "i++" isn't necessary, as "i" is not accessed again in
this function.)

I'm sure there's a better way to fix this. For instance, since
the last element of rq_pages is always NULL, perhaps the structure
definition could be

	struct page *rq_pages[RPCSVC_MAXPAGES + 1];

And the logic in svc_alloc_arg() could be adjusted accordingly
to fill in at most RPCSVC_MAXPAGES pages.
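
A rough (untested) sketch of that approach:

	/* include/linux/sunrpc/svc.h */
	struct page *	rq_pages[RPCSVC_MAXPAGES + 1];

	/* net/sunrpc/svc_xprt.c: svc_alloc_arg() */
	pages = (serv->sv_max_mesg + 2 * PAGE_SIZE) >> PAGE_SHIFT;
	if (pages > RPCSVC_MAXPAGES)
		/* use as many pages as possible */
		pages = RPCSVC_MAXPAGES;
	for (i = 0; i < pages; i++)
		while (rqstp->rq_pages[i] == NULL) {
			struct page *p = alloc_page(GFP_KERNEL);
			...
		}
	rqstp->rq_page_end = &rqstp->rq_pages[i];
	rqstp->rq_pages[i] = NULL;	/* terminate the array */

The extra array element guarantees room for the terminator even when
all RPCSVC_MAXPAGES entries hold pages.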


>> --b.
>> 
>>> Then svc_alloc_arg() needs to ask for pages according to this
>>> same formula, otherwise it will never allocate enough.
>>> 
>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>> ---
>>> include/linux/sunrpc/svc.h |    3 +--
>>> net/sunrpc/svc_xprt.c      |    8 +++++---
>>> 2 files changed, 6 insertions(+), 5 deletions(-)
>>> 
>>> diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
>>> index 11cef5a..35c274f 100644
>>> --- a/include/linux/sunrpc/svc.h
>>> +++ b/include/linux/sunrpc/svc.h
>>> @@ -176,8 +176,7 @@ static inline void svc_get(struct svc_serv *serv)
>>> * We using ->sendfile to return read data, we might need one extra page
>>> * if the request is not page-aligned.  So add another '1'.
>>> */
>>> -#define RPCSVC_MAXPAGES		((RPCSVC_MAXPAYLOAD+PAGE_SIZE-1)/PAGE_SIZE \
>>> -				+ 2 + 1)
>>> +#define RPCSVC_MAXPAGES	((RPCSVC_MAXPAYLOAD + PAGE_SIZE) / PAGE_SIZE + 2 + 1)
>>> 
>>> static inline u32 svc_getnl(struct kvec *iov)
>>> {
>>> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
>>> index 7bfe1fb..9bd484d 100644
>>> --- a/net/sunrpc/svc_xprt.c
>>> +++ b/net/sunrpc/svc_xprt.c
>>> @@ -659,11 +659,13 @@ static int svc_alloc_arg(struct svc_rqst *rqstp)
>>> 	int i;
>>> 
>>> 	/* now allocate needed pages.  If we get a failure, sleep briefly */
>>> -	pages = (serv->sv_max_mesg + PAGE_SIZE) / PAGE_SIZE;
>>> -	WARN_ON_ONCE(pages >= RPCSVC_MAXPAGES);
>>> -	if (pages >= RPCSVC_MAXPAGES)
>>> +	pages = (serv->sv_max_mesg + (2 * PAGE_SIZE)) >> PAGE_SHIFT;
>>> +	if (pages >= RPCSVC_MAXPAGES) {
>>> +		pr_warn_once("svc: warning: pages=%u >= RPCSVC_MAXPAGES=%lu\n",
>>> +			     pages, RPCSVC_MAXPAGES);
>>> 		/* use as many pages as possible */
>>> 		pages = RPCSVC_MAXPAGES - 1;
>>> +	}
>>> 	for (i = 0; i < pages ; i++)
>>> 		while (rqstp->rq_pages[i] == NULL) {
>>> 			struct page *p = alloc_page(GFP_KERNEL);
> 
> --
> Chuck Lever
> 
> 
> 

--
Chuck Lever


