From: Chuck Lever <chuck.lever@oracle.com>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: linux-rdma <linux-rdma@vger.kernel.org>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH v3 09/18] sunrpc: Allocate one more page per svc_rqst
Date: Thu, 29 Jun 2017 20:23:04 -0400
Message-ID: <C17CF389-2673-4478-B662-D6F890E3D047@oracle.com>
In-Reply-To: <EB8E2100-6176-424A-B687-FE84758889E0@oracle.com>

> On Jun 29, 2017, at 4:44 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
> 
> Thanks for the review!
> 
>> On Jun 29, 2017, at 4:20 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
>> 
>> I'm confused by this one:
>> 
>> On Fri, Jun 23, 2017 at 05:18:16PM -0400, Chuck Lever wrote:
>>> svcrdma needs 259 pages for 1MB NFSv4.0 READ requests:
>>> 
>>> - 1 page for the transport header and head iovec
>>> - 256 pages for the data payload
>>> - 1 page for the trailing GETATTR request (since NFSD XDR decoding
>>>   does not look for a tail iovec, the GETATTR is stuck at the end
>>>   of the rqstp->rq_arg.pages list)
>>> - 1 page for building the reply xdr_buf
>>> 
>>> Note that RPCSVC_MAXPAGES is defined as:
>>> 
>>> ((RPCSVC_MAXPAYLOAD+PAGE_SIZE-1)/PAGE_SIZE + 2 + 1)
>>> 
>>> I don't understand why the "+PAGE_SIZE-1" is in there, because
>>> division will always round the result down to
>>> 
>>> (RPCSVC_MAXPAYLOAD / PAGE_SIZE + 2 + 1)
>> 
>> I think you're assuming that RPCSVC_MAXPAYLOAD is a multiple of
>> PAGE_SIZE. Maybe that's true in all cases (I haven't checked), but it
>> seems harmless to handle the case where it's not.
> 
> include/linux/sunrpc/svc.h says:
> 
> 126  * Maximum payload size supported by a kernel RPC server.
> 127  * This is use to determine the max number of pages nfsd is
> 128  * willing to return in a single READ operation.
> 129  *
> 130  * These happen to all be powers of 2, which is not strictly
> 131  * necessary but helps enforce the real limitation, which is
> 132  * that they should be multiples of PAGE_SIZE.
> 
> Note that the same calculation in svc_alloc_arg() assumes
> that serv->sv_max_mesg is a multiple of PAGE_SIZE:
> 
> 	pages = (serv->sv_max_mesg + PAGE_SIZE) / PAGE_SIZE;
> 
> It's confusing that the value of RPCSVC_MAXPAGES is not computed
> the same way as "pages" is.
> 
> 
>>> Let's remove the "-1" to get 260 pages maximum.
>> 
>> Why? (I'm not necessarily opposed, I just missed the explanation.)
> 
> The top of the patch description explains why I need 259 pages.
> 
> RPCSVC_MAXPAGES has to be 260 in order for this check in svc_alloc_arg()
> 
> 	if (pages >= RPCSVC_MAXPAGES) {
> 		/* use as many pages as possible */
> 		pages = RPCSVC_MAXPAGES - 1;
> 
> to allow 260 elements of the array to be filled in.

OK, I see why I was confused about this. I expected a variable called
"pages" to be the count of pages. But it actually contains the maximum
index of the rq_pages array, which is one less than the number of array
elements.

However... It looks like the rq_pages array already has 259 elements,
but the last element always contains NULL.

682		rqstp->rq_page_end = &rqstp->rq_pages[i];
683		rqstp->rq_pages[i++] = NULL; /* this might be seen in nfs_read_actor */

So we actually have just 258 pages in this 259-entry array. That's one
short of the 259 pages needed by svcrdma, and one less than the value
of RPCSVC_MAXPAGES.

(Note that "i++" isn't necessary, as "i" is not accessed again in this
function.)

I'm sure there's a better way to fix this. For instance, since the last
element of rq_pages is always NULL, perhaps the structure definition
could be

	struct page *rq_pages[RPCSVC_MAXPAGES + 1];

And the logic in svc_alloc_arg() could be adjusted accordingly to fill
in at most RPCSVC_MAXPAGES pages.

>> --b.
>> 
>>> Then svc_alloc_args() needs to ask for pages according to this
>>> same formula, otherwise it will never allocate enough.
>>> 
>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>> ---
>>>  include/linux/sunrpc/svc.h |    3 +--
>>>  net/sunrpc/svc_xprt.c      |    8 +++++---
>>>  2 files changed, 6 insertions(+), 5 deletions(-)
>>> 
>>> diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
>>> index 11cef5a..35c274f 100644
>>> --- a/include/linux/sunrpc/svc.h
>>> +++ b/include/linux/sunrpc/svc.h
>>> @@ -176,8 +176,7 @@ static inline void svc_get(struct svc_serv *serv)
>>>   * We using ->sendfile to return read data, we might need one extra page
>>>   * if the request is not page-aligned. So add another '1'.
>>>   */
>>> -#define RPCSVC_MAXPAGES	((RPCSVC_MAXPAYLOAD+PAGE_SIZE-1)/PAGE_SIZE \
>>> -				+ 2 + 1)
>>> +#define RPCSVC_MAXPAGES	((RPCSVC_MAXPAYLOAD + PAGE_SIZE) / PAGE_SIZE + 2 + 1)
>>> 
>>>  static inline u32 svc_getnl(struct kvec *iov)
>>>  {
>>> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
>>> index 7bfe1fb..9bd484d 100644
>>> --- a/net/sunrpc/svc_xprt.c
>>> +++ b/net/sunrpc/svc_xprt.c
>>> @@ -659,11 +659,13 @@ static int svc_alloc_arg(struct svc_rqst *rqstp)
>>>  	int i;
>>> 
>>>  	/* now allocate needed pages. If we get a failure, sleep briefly */
>>> -	pages = (serv->sv_max_mesg + PAGE_SIZE) / PAGE_SIZE;
>>> -	WARN_ON_ONCE(pages >= RPCSVC_MAXPAGES);
>>> -	if (pages >= RPCSVC_MAXPAGES)
>>> +	pages = (serv->sv_max_mesg + (2 * PAGE_SIZE)) >> PAGE_SHIFT;
>>> +	if (pages >= RPCSVC_MAXPAGES) {
>>> +		pr_warn_once("svc: warning: pages=%u >= RPCSVC_MAXPAGES=%lu\n",
>>> +			     pages, RPCSVC_MAXPAGES);
>>>  		/* use as many pages as possible */
>>>  		pages = RPCSVC_MAXPAGES - 1;
>>> +	}
>>>  	for (i = 0; i < pages ; i++)
>>>  		while (rqstp->rq_pages[i] == NULL) {
>>>  			struct page *p = alloc_page(GFP_KERNEL);
>> 
> 
> -- 
> Chuck Lever

-- 
Chuck Lever