From: "J. Bruce Fields" <bfields@fieldses.org>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: linux-nfs@vger.kernel.org
Subject: Re: [PATCH v1] sunrpc: Allocate up to RPCSVC_MAXPAGES per svc_rqst
Date: Fri, 30 Jun 2017 16:16:53 -0400
Message-ID: <20170630201653.GD14868@fieldses.org>
In-Reply-To: <20170630160258.11995.10205.stgit@klimt.1015granger.net>

I like that better.

I'm still pretty confused.

The long comment above RPCSVC_MAXPAGES convinces me that it's
calculating the maximum number of pages a TCP request might need.  But
apparently rq_pages has never been large enough to hold that many pages.
Has nobody ever tried an unaligned 1MB read?

We could also use a comment at each of the other spots where we add
another seemingly arbitrary 1 or 2.  What does sv_max_mesg really mean,
why are we adding 2 to it in svc_alloc_arg, etc.?

It might make more sense if RPCSVC_MAXPAYLOAD were explicitly
calculated as a max over the compiled-in transports, too.
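
What I mean is something like the following sketch.  (Hypothetical
arrangement: today RPCSVC_MAXPAYLOAD is just the 1MB constant and the
per-transport limits are defined in terms of it, so this inverts that
relationship.)

```c
/* Per-transport payload limits come first... */
#define RPCSVC_MAXPAYLOAD_TCP	(1u << 20)	/* 1MB */
#define RPCSVC_MAXPAYLOAD_UDP	32768u		/* 32KB */

/* ...and the overall limit is the max over whatever is compiled in,
 * so RPCSVC_MAXPAGES follows from it automatically. */
#define RPCSVC_MAXPAYLOAD \
	(RPCSVC_MAXPAYLOAD_TCP > RPCSVC_MAXPAYLOAD_UDP ? \
	 RPCSVC_MAXPAYLOAD_TCP : RPCSVC_MAXPAYLOAD_UDP)
```

A new transport with a bigger payload would then only need to add its
own limit rather than touch the page math.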

I feel like we've actually been through this before, but somehow I'm
still confused every time I look at it.

--b.

On Fri, Jun 30, 2017 at 12:03:54PM -0400, Chuck Lever wrote:
> svcrdma needs 259 pages for 1MB NFSv4.0 READ requests:
> 
>  - 1 page for the transport header and head iovec
>  - 256 pages for the data payload
>  - 1 page for the trailing GETATTR request (since NFSD XDR decoding
>    does not look for a tail iovec, the GETATTR is stuck at the end
>    of the rqstp->rq_arg.pages list)
>  - 1 page for building the reply xdr_buf
> 
> But RPCSVC_MAXPAGES is already 259 (on x86_64). The problem is that
> svc_alloc_arg never allocates that many pages. To address this:
> 
> 1. The final element of rq_pages always points to NULL. To
>    accommodate up to 259 pages in rq_pages, add an extra element
>    to rq_pages for the array termination sentinel.
> 
> 2. Adjust the calculation of "pages" to match how RPCSVC_MAXPAGES
>    is calculated, so it can go up to 259. Bruce noted that the
>    calculation assumes sv_max_mesg is a multiple of PAGE_SIZE,
>    which might not always be true. I didn't change this assumption.
> 
> 3. Change the loop boundaries to allow 259 pages to be allocated.
> 
> Additional clean-up: WARN_ON_ONCE adds an extra conditional branch,
> which is basically never taken. And there's no need to dump the
> stack here because svc_alloc_arg has only one caller.
> 
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
> Hi Bruce-
> 
> This is what I proposed yesterday evening. Instead of changing
> the value of RPCSVC_MAXPAGES, ensure that svc_alloc_arg can
> safely allocate that many pages.
> 
> 
>  include/linux/sunrpc/svc.h |    2 +-
>  net/sunrpc/svc_xprt.c      |   10 ++++++----
>  2 files changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
> index 11cef5a..d741399 100644
> --- a/include/linux/sunrpc/svc.h
> +++ b/include/linux/sunrpc/svc.h
> @@ -246,7 +246,7 @@ struct svc_rqst {
>  	size_t			rq_xprt_hlen;	/* xprt header len */
>  	struct xdr_buf		rq_arg;
>  	struct xdr_buf		rq_res;
> -	struct page *		rq_pages[RPCSVC_MAXPAGES];
> +	struct page		*rq_pages[RPCSVC_MAXPAGES + 1];
>  	struct page *		*rq_respages;	/* points into rq_pages */
>  	struct page *		*rq_next_page; /* next reply page to use */
>  	struct page *		*rq_page_end;  /* one past the last page */
> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> index 7bfe1fb..d16a8b4 100644
> --- a/net/sunrpc/svc_xprt.c
> +++ b/net/sunrpc/svc_xprt.c
> @@ -659,11 +659,13 @@ static int svc_alloc_arg(struct svc_rqst *rqstp)
>  	int i;
>  
>  	/* now allocate needed pages.  If we get a failure, sleep briefly */
> -	pages = (serv->sv_max_mesg + PAGE_SIZE) / PAGE_SIZE;
> -	WARN_ON_ONCE(pages >= RPCSVC_MAXPAGES);
> -	if (pages >= RPCSVC_MAXPAGES)
> +	pages = (serv->sv_max_mesg + 2 * PAGE_SIZE) >> PAGE_SHIFT;
> +	if (pages > RPCSVC_MAXPAGES) {
> +		pr_warn_once("svc: warning: pages=%u > RPCSVC_MAXPAGES=%lu\n",
> +			     pages, RPCSVC_MAXPAGES);
>  		/* use as many pages as possible */
> -		pages = RPCSVC_MAXPAGES - 1;
> +		pages = RPCSVC_MAXPAGES;
> +	}
>  	for (i = 0; i < pages ; i++)
>  		while (rqstp->rq_pages[i] == NULL) {
>  			struct page *p = alloc_page(GFP_KERNEL);

