All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1] xprtrdma: Simplify rpcrdma_convert_kvec() and frwr_map()
@ 2021-02-02 14:42 Chuck Lever
  2021-02-02 19:18 ` Tom Talpey
  0 siblings, 1 reply; 7+ messages in thread
From: Chuck Lever @ 2021-02-02 14:42 UTC (permalink / raw)
  To: anna.schumaker; +Cc: linux-nfs, linux-rdma

Clean up.

Support for FMR was removed by commit ba69cd122ece ("xprtrdma:
Remove support for FMR memory registration") [Dec 2018]. That means
the buffer-splitting behavior of rpcrdma_convert_kvec(), added by
commit 821c791a0bde ("xprtrdma: Segment head and tail XDR buffers
on page boundaries") [Mar 2016], is no longer necessary. FRWR
memory registration handles this case with aplomb.

A related simplification removes an extra conditional branch from
the SGL set-up loop in frwr_map(): Instead of using either
sg_set_page() or sg_set_buf(), initialize the mr_page field properly
when rpcrdma_convert_kvec() converts the kvec to an SGL entry.
frwr_map() can then invoke sg_set_page() unconditionally.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xprtrdma/frwr_ops.c  |   10 ++--------
 net/sunrpc/xprtrdma/rpc_rdma.c  |   21 +++++----------------
 net/sunrpc/xprtrdma/xprt_rdma.h |    2 +-
 3 files changed, 8 insertions(+), 25 deletions(-)

diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c
index baca49fe83af..5eb044a5f0be 100644
--- a/net/sunrpc/xprtrdma/frwr_ops.c
+++ b/net/sunrpc/xprtrdma/frwr_ops.c
@@ -306,14 +306,8 @@ struct rpcrdma_mr_seg *frwr_map(struct rpcrdma_xprt *r_xprt,
 	if (nsegs > ep->re_max_fr_depth)
 		nsegs = ep->re_max_fr_depth;
 	for (i = 0; i < nsegs;) {
-		if (seg->mr_page)
-			sg_set_page(&mr->mr_sg[i],
-				    seg->mr_page,
-				    seg->mr_len,
-				    offset_in_page(seg->mr_offset));
-		else
-			sg_set_buf(&mr->mr_sg[i], seg->mr_offset,
-				   seg->mr_len);
+		sg_set_page(&mr->mr_sg[i], seg->mr_page,
+			    seg->mr_len, offset_in_page(seg->mr_offset));
 
 		++seg;
 		++i;
diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
index 8f5d0cb68360..529adb6ad4db 100644
--- a/net/sunrpc/xprtrdma/rpc_rdma.c
+++ b/net/sunrpc/xprtrdma/rpc_rdma.c
@@ -204,9 +204,7 @@ rpcrdma_alloc_sparse_pages(struct xdr_buf *buf)
 	return 0;
 }
 
-/* Split @vec on page boundaries into SGEs. FMR registers pages, not
- * a byte range. Other modes coalesce these SGEs into a single MR
- * when they can.
+/* Convert @vec to a single SGL element.
  *
  * Returns pointer to next available SGE, and bumps the total number
  * of SGEs consumed.
@@ -215,21 +213,12 @@ static struct rpcrdma_mr_seg *
 rpcrdma_convert_kvec(struct kvec *vec, struct rpcrdma_mr_seg *seg,
 		     unsigned int *n)
 {
-	u32 remaining, page_offset;
-	char *base;
-
-	base = vec->iov_base;
-	page_offset = offset_in_page(base);
-	remaining = vec->iov_len;
-	while (remaining) {
-		seg->mr_page = NULL;
-		seg->mr_offset = base;
-		seg->mr_len = min_t(u32, PAGE_SIZE - page_offset, remaining);
-		remaining -= seg->mr_len;
-		base += seg->mr_len;
+	if (vec->iov_len) {
+		seg->mr_page = virt_to_page(vec->iov_base);
+		seg->mr_offset = vec->iov_base;
+		seg->mr_len = vec->iov_len;
 		++seg;
 		++(*n);
-		page_offset = 0;
 	}
 	return seg;
 }
diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
index 94b28657aeeb..4a9fe6592795 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -285,7 +285,7 @@ enum {
 
 struct rpcrdma_mr_seg {		/* chunk descriptors */
 	u32		mr_len;		/* length of chunk or segment */
-	struct page	*mr_page;	/* owning page, if any */
+	struct page	*mr_page;	/* underlying struct page */
 	char		*mr_offset;	/* kva if no page, else offset */
 };
 



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v1] xprtrdma: Simplify rpcrdma_convert_kvec() and frwr_map()
  2021-02-02 14:42 [PATCH v1] xprtrdma: Simplify rpcrdma_convert_kvec() and frwr_map() Chuck Lever
@ 2021-02-02 19:18 ` Tom Talpey
  2021-02-02 19:20   ` Chuck Lever
  2021-02-02 19:40   ` Anna Schumaker
  0 siblings, 2 replies; 7+ messages in thread
From: Tom Talpey @ 2021-02-02 19:18 UTC (permalink / raw)
  To: Chuck Lever, anna.schumaker; +Cc: linux-nfs, linux-rdma

What's not to like about a log that uses the words "with aplomb"? :)

Minor related comment/question below.

On 2/2/2021 9:42 AM, Chuck Lever wrote:
> Clean up.
> 
> Support for FMR was removed by commit ba69cd122ece ("xprtrdma:
> Remove support for FMR memory registration") [Dec 2018]. That means
> the buffer-splitting behavior of rpcrdma_convert_kvec(), added by
> commit 821c791a0bde ("xprtrdma: Segment head and tail XDR buffers
> on page boundaries") [Mar 2016], is no longer necessary. FRWR
> memory registration handles this case with aplomb.
> 
> A related simplification removes an extra conditional branch from
> the SGL set-up loop in frwr_map(): Instead of using either
> sg_set_page() or sg_set_buf(), initialize the mr_page field properly
> when rpcrdma_convert_kvec() converts the kvec to an SGL entry.
> frwr_map() can then invoke sg_set_page() unconditionally.
> 
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>   net/sunrpc/xprtrdma/frwr_ops.c  |   10 ++--------
>   net/sunrpc/xprtrdma/rpc_rdma.c  |   21 +++++----------------
>   net/sunrpc/xprtrdma/xprt_rdma.h |    2 +-
>   3 files changed, 8 insertions(+), 25 deletions(-)
> 
> diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c
> index baca49fe83af..5eb044a5f0be 100644
> --- a/net/sunrpc/xprtrdma/frwr_ops.c
> +++ b/net/sunrpc/xprtrdma/frwr_ops.c
> @@ -306,14 +306,8 @@ struct rpcrdma_mr_seg *frwr_map(struct rpcrdma_xprt *r_xprt,
>   	if (nsegs > ep->re_max_fr_depth)
>   		nsegs = ep->re_max_fr_depth;
>   	for (i = 0; i < nsegs;) {
> -		if (seg->mr_page)
> -			sg_set_page(&mr->mr_sg[i],
> -				    seg->mr_page,
> -				    seg->mr_len,
> -				    offset_in_page(seg->mr_offset));
> -		else
> -			sg_set_buf(&mr->mr_sg[i], seg->mr_offset,
> -				   seg->mr_len);
> +		sg_set_page(&mr->mr_sg[i], seg->mr_page,
> +			    seg->mr_len, offset_in_page(seg->mr_offset));
>   
>   		++seg;
>   		++i;
> diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
> index 8f5d0cb68360..529adb6ad4db 100644
> --- a/net/sunrpc/xprtrdma/rpc_rdma.c
> +++ b/net/sunrpc/xprtrdma/rpc_rdma.c
> @@ -204,9 +204,7 @@ rpcrdma_alloc_sparse_pages(struct xdr_buf *buf)
>   	return 0;
>   }
>   
> -/* Split @vec on page boundaries into SGEs. FMR registers pages, not
> - * a byte range. Other modes coalesce these SGEs into a single MR
> - * when they can.
> +/* Convert @vec to a single SGL element.
>    *
>    * Returns pointer to next available SGE, and bumps the total number
>    * of SGEs consumed.
> @@ -215,21 +213,12 @@ static struct rpcrdma_mr_seg *
>   rpcrdma_convert_kvec(struct kvec *vec, struct rpcrdma_mr_seg *seg,
>   		     unsigned int *n)
>   {
> -	u32 remaining, page_offset;
> -	char *base;
> -
> -	base = vec->iov_base;
> -	page_offset = offset_in_page(base);
> -	remaining = vec->iov_len;
> -	while (remaining) {
> -		seg->mr_page = NULL;
> -		seg->mr_offset = base;
> -		seg->mr_len = min_t(u32, PAGE_SIZE - page_offset, remaining);
> -		remaining -= seg->mr_len;
> -		base += seg->mr_len;
> +	if (vec->iov_len) {
> +		seg->mr_page = virt_to_page(vec->iov_base);
> +		seg->mr_offset = vec->iov_base;
> +		seg->mr_len = vec->iov_len;
>   		++seg;
>   		++(*n);
> -		page_offset = 0;
>   	}
>   	return seg;
>   }
> diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
> index 94b28657aeeb..4a9fe6592795 100644
> --- a/net/sunrpc/xprtrdma/xprt_rdma.h
> +++ b/net/sunrpc/xprtrdma/xprt_rdma.h
> @@ -285,7 +285,7 @@ enum {
>   
>   struct rpcrdma_mr_seg {		/* chunk descriptors */
>   	u32		mr_len;		/* length of chunk or segment */
> -	struct page	*mr_page;	/* owning page, if any */
> +	struct page	*mr_page;	/* underlying struct page */
>   	char		*mr_offset;	/* kva if no page, else offset */

Is this comment ("kva if no page") actually correct? The hunk just
above is an example of a case where mr_page is set, yet mr_offset
is an iov_base. Is iov_base not a kva?

Tom.

>   };
>   
> 
> 
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1] xprtrdma: Simplify rpcrdma_convert_kvec() and frwr_map()
  2021-02-02 19:18 ` Tom Talpey
@ 2021-02-02 19:20   ` Chuck Lever
  2021-02-02 21:50     ` Tom Talpey
  2021-02-02 19:40   ` Anna Schumaker
  1 sibling, 1 reply; 7+ messages in thread
From: Chuck Lever @ 2021-02-02 19:20 UTC (permalink / raw)
  To: Tom Talpey; +Cc: Anna Schumaker, Linux NFS Mailing List, linux-rdma



> On Feb 2, 2021, at 2:18 PM, Tom Talpey <tom@talpey.com> wrote:
> 
> What's not to like about a log that uses the words "with aplomb"? :)
> 
> Minor related comment/question below.
> 
> On 2/2/2021 9:42 AM, Chuck Lever wrote:
>> Clean up.
>> Support for FMR was removed by commit ba69cd122ece ("xprtrdma:
>> Remove support for FMR memory registration") [Dec 2018]. That means
>> the buffer-splitting behavior of rpcrdma_convert_kvec(), added by
>> commit 821c791a0bde ("xprtrdma: Segment head and tail XDR buffers
>> on page boundaries") [Mar 2016], is no longer necessary. FRWR
>> memory registration handles this case with aplomb.
>> A related simplification removes an extra conditional branch from
>> the SGL set-up loop in frwr_map(): Instead of using either
>> sg_set_page() or sg_set_buf(), initialize the mr_page field properly
>> when rpcrdma_convert_kvec() converts the kvec to an SGL entry.
>> frwr_map() can then invoke sg_set_page() unconditionally.
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>>  net/sunrpc/xprtrdma/frwr_ops.c  |   10 ++--------
>>  net/sunrpc/xprtrdma/rpc_rdma.c  |   21 +++++----------------
>>  net/sunrpc/xprtrdma/xprt_rdma.h |    2 +-
>>  3 files changed, 8 insertions(+), 25 deletions(-)
>> diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c
>> index baca49fe83af..5eb044a5f0be 100644
>> --- a/net/sunrpc/xprtrdma/frwr_ops.c
>> +++ b/net/sunrpc/xprtrdma/frwr_ops.c
>> @@ -306,14 +306,8 @@ struct rpcrdma_mr_seg *frwr_map(struct rpcrdma_xprt *r_xprt,
>>  	if (nsegs > ep->re_max_fr_depth)
>>  		nsegs = ep->re_max_fr_depth;
>>  	for (i = 0; i < nsegs;) {
>> -		if (seg->mr_page)
>> -			sg_set_page(&mr->mr_sg[i],
>> -				    seg->mr_page,
>> -				    seg->mr_len,
>> -				    offset_in_page(seg->mr_offset));
>> -		else
>> -			sg_set_buf(&mr->mr_sg[i], seg->mr_offset,
>> -				   seg->mr_len);
>> +		sg_set_page(&mr->mr_sg[i], seg->mr_page,
>> +			    seg->mr_len, offset_in_page(seg->mr_offset));
>>    		++seg;
>>  		++i;
>> diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
>> index 8f5d0cb68360..529adb6ad4db 100644
>> --- a/net/sunrpc/xprtrdma/rpc_rdma.c
>> +++ b/net/sunrpc/xprtrdma/rpc_rdma.c
>> @@ -204,9 +204,7 @@ rpcrdma_alloc_sparse_pages(struct xdr_buf *buf)
>>  	return 0;
>>  }
>>  -/* Split @vec on page boundaries into SGEs. FMR registers pages, not
>> - * a byte range. Other modes coalesce these SGEs into a single MR
>> - * when they can.
>> +/* Convert @vec to a single SGL element.
>>   *
>>   * Returns pointer to next available SGE, and bumps the total number
>>   * of SGEs consumed.
>> @@ -215,21 +213,12 @@ static struct rpcrdma_mr_seg *
>>  rpcrdma_convert_kvec(struct kvec *vec, struct rpcrdma_mr_seg *seg,
>>  		     unsigned int *n)
>>  {
>> -	u32 remaining, page_offset;
>> -	char *base;
>> -
>> -	base = vec->iov_base;
>> -	page_offset = offset_in_page(base);
>> -	remaining = vec->iov_len;
>> -	while (remaining) {
>> -		seg->mr_page = NULL;
>> -		seg->mr_offset = base;
>> -		seg->mr_len = min_t(u32, PAGE_SIZE - page_offset, remaining);
>> -		remaining -= seg->mr_len;
>> -		base += seg->mr_len;
>> +	if (vec->iov_len) {
>> +		seg->mr_page = virt_to_page(vec->iov_base);
>> +		seg->mr_offset = vec->iov_base;
>> +		seg->mr_len = vec->iov_len;
>>  		++seg;
>>  		++(*n);
>> -		page_offset = 0;
>>  	}
>>  	return seg;
>>  }
>> diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
>> index 94b28657aeeb..4a9fe6592795 100644
>> --- a/net/sunrpc/xprtrdma/xprt_rdma.h
>> +++ b/net/sunrpc/xprtrdma/xprt_rdma.h
>> @@ -285,7 +285,7 @@ enum {
>>    struct rpcrdma_mr_seg {		/* chunk descriptors */
>>  	u32		mr_len;		/* length of chunk or segment */
>> -	struct page	*mr_page;	/* owning page, if any */
>> +	struct page	*mr_page;	/* underlying struct page */
>>  	char		*mr_offset;	/* kva if no page, else offset */
> 
> Is this comment ("kva if no page") actually correct? The hunk just
> above is an example of a case where mr_page is set, yet mr_offset
> is an iov_base. Is iov_base not a kva?

Ah, well the "if no page" part is now obsolete.

I suppose it should be set to "offset_in_page(vec->iov_base)" ?


--
Chuck Lever




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1] xprtrdma: Simplify rpcrdma_convert_kvec() and frwr_map()
  2021-02-02 19:18 ` Tom Talpey
  2021-02-02 19:20   ` Chuck Lever
@ 2021-02-02 19:40   ` Anna Schumaker
  1 sibling, 0 replies; 7+ messages in thread
From: Anna Schumaker @ 2021-02-02 19:40 UTC (permalink / raw)
  To: Tom Talpey; +Cc: Chuck Lever, Linux NFS Mailing List, List Linux RDMA Mailing

On Tue, Feb 2, 2021 at 2:24 PM Tom Talpey <tom@talpey.com> wrote:
>
> What's not to like about a log that uses the words "with aplomb"? :)

As far as I can tell, it'll be the first use of "aplomb" in the entire
kernel git history

>
> Minor related comment/question below.
>
> On 2/2/2021 9:42 AM, Chuck Lever wrote:
> > Clean up.
> >
> > Support for FMR was removed by commit ba69cd122ece ("xprtrdma:
> > Remove support for FMR memory registration") [Dec 2018]. That means
> > the buffer-splitting behavior of rpcrdma_convert_kvec(), added by
> > commit 821c791a0bde ("xprtrdma: Segment head and tail XDR buffers
> > on page boundaries") [Mar 2016], is no longer necessary. FRWR
> > memory registration handles this case with aplomb.
> >
> > A related simplification removes an extra conditional branch from
> > the SGL set-up loop in frwr_map(): Instead of using either
> > sg_set_page() or sg_set_buf(), initialize the mr_page field properly
> > when rpcrdma_convert_kvec() converts the kvec to an SGL entry.
> > frwr_map() can then invoke sg_set_page() unconditionally.
> >
> > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > ---
> >   net/sunrpc/xprtrdma/frwr_ops.c  |   10 ++--------
> >   net/sunrpc/xprtrdma/rpc_rdma.c  |   21 +++++----------------
> >   net/sunrpc/xprtrdma/xprt_rdma.h |    2 +-
> >   3 files changed, 8 insertions(+), 25 deletions(-)
> >
> > diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c
> > index baca49fe83af..5eb044a5f0be 100644
> > --- a/net/sunrpc/xprtrdma/frwr_ops.c
> > +++ b/net/sunrpc/xprtrdma/frwr_ops.c
> > @@ -306,14 +306,8 @@ struct rpcrdma_mr_seg *frwr_map(struct rpcrdma_xprt *r_xprt,
> >       if (nsegs > ep->re_max_fr_depth)
> >               nsegs = ep->re_max_fr_depth;
> >       for (i = 0; i < nsegs;) {
> > -             if (seg->mr_page)
> > -                     sg_set_page(&mr->mr_sg[i],
> > -                                 seg->mr_page,
> > -                                 seg->mr_len,
> > -                                 offset_in_page(seg->mr_offset));
> > -             else
> > -                     sg_set_buf(&mr->mr_sg[i], seg->mr_offset,
> > -                                seg->mr_len);
> > +             sg_set_page(&mr->mr_sg[i], seg->mr_page,
> > +                         seg->mr_len, offset_in_page(seg->mr_offset));
> >
> >               ++seg;
> >               ++i;
> > diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
> > index 8f5d0cb68360..529adb6ad4db 100644
> > --- a/net/sunrpc/xprtrdma/rpc_rdma.c
> > +++ b/net/sunrpc/xprtrdma/rpc_rdma.c
> > @@ -204,9 +204,7 @@ rpcrdma_alloc_sparse_pages(struct xdr_buf *buf)
> >       return 0;
> >   }
> >
> > -/* Split @vec on page boundaries into SGEs. FMR registers pages, not
> > - * a byte range. Other modes coalesce these SGEs into a single MR
> > - * when they can.
> > +/* Convert @vec to a single SGL element.
> >    *
> >    * Returns pointer to next available SGE, and bumps the total number
> >    * of SGEs consumed.
> > @@ -215,21 +213,12 @@ static struct rpcrdma_mr_seg *
> >   rpcrdma_convert_kvec(struct kvec *vec, struct rpcrdma_mr_seg *seg,
> >                    unsigned int *n)
> >   {
> > -     u32 remaining, page_offset;
> > -     char *base;
> > -
> > -     base = vec->iov_base;
> > -     page_offset = offset_in_page(base);
> > -     remaining = vec->iov_len;
> > -     while (remaining) {
> > -             seg->mr_page = NULL;
> > -             seg->mr_offset = base;
> > -             seg->mr_len = min_t(u32, PAGE_SIZE - page_offset, remaining);
> > -             remaining -= seg->mr_len;
> > -             base += seg->mr_len;
> > +     if (vec->iov_len) {
> > +             seg->mr_page = virt_to_page(vec->iov_base);
> > +             seg->mr_offset = vec->iov_base;
> > +             seg->mr_len = vec->iov_len;
> >               ++seg;
> >               ++(*n);
> > -             page_offset = 0;
> >       }
> >       return seg;
> >   }
> > diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
> > index 94b28657aeeb..4a9fe6592795 100644
> > --- a/net/sunrpc/xprtrdma/xprt_rdma.h
> > +++ b/net/sunrpc/xprtrdma/xprt_rdma.h
> > @@ -285,7 +285,7 @@ enum {
> >
> >   struct rpcrdma_mr_seg {             /* chunk descriptors */
> >       u32             mr_len;         /* length of chunk or segment */
> > -     struct page     *mr_page;       /* owning page, if any */
> > +     struct page     *mr_page;       /* underlying struct page */
> >       char            *mr_offset;     /* kva if no page, else offset */
>
> Is this comment ("kva if no page") actually correct? The hunk just
> above is an example of a case where mr_page is set, yet mr_offset
> is an iov_base. Is iov_base not a kva?
>
> Tom.
>
> >   };
> >
> >
> >
> >

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1] xprtrdma: Simplify rpcrdma_convert_kvec() and frwr_map()
  2021-02-02 19:20   ` Chuck Lever
@ 2021-02-02 21:50     ` Tom Talpey
  2021-02-02 21:55       ` Chuck Lever
  0 siblings, 1 reply; 7+ messages in thread
From: Tom Talpey @ 2021-02-02 21:50 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Anna Schumaker, Linux NFS Mailing List, linux-rdma

On 2/2/2021 2:20 PM, Chuck Lever wrote:
> 
> 
>> On Feb 2, 2021, at 2:18 PM, Tom Talpey <tom@talpey.com> wrote:
>>
>> What's not to like about a log that uses the words "with aplomb"? :)
>>
>> Minor related comment/question below.
>>
>> On 2/2/2021 9:42 AM, Chuck Lever wrote:
>>> Clean up.
>>> Support for FMR was removed by commit ba69cd122ece ("xprtrdma:
>>> Remove support for FMR memory registration") [Dec 2018]. That means
>>> the buffer-splitting behavior of rpcrdma_convert_kvec(), added by
>>> commit 821c791a0bde ("xprtrdma: Segment head and tail XDR buffers
>>> on page boundaries") [Mar 2016], is no longer necessary. FRWR
>>> memory registration handles this case with aplomb.
>>> A related simplification removes an extra conditional branch from
>>> the SGL set-up loop in frwr_map(): Instead of using either
>>> sg_set_page() or sg_set_buf(), initialize the mr_page field properly
>>> when rpcrdma_convert_kvec() converts the kvec to an SGL entry.
>>> frwr_map() can then invoke sg_set_page() unconditionally.
>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>> ---
>>>   net/sunrpc/xprtrdma/frwr_ops.c  |   10 ++--------
>>>   net/sunrpc/xprtrdma/rpc_rdma.c  |   21 +++++----------------
>>>   net/sunrpc/xprtrdma/xprt_rdma.h |    2 +-
>>>   3 files changed, 8 insertions(+), 25 deletions(-)
>>> diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c
>>> index baca49fe83af..5eb044a5f0be 100644
>>> --- a/net/sunrpc/xprtrdma/frwr_ops.c
>>> +++ b/net/sunrpc/xprtrdma/frwr_ops.c
>>> @@ -306,14 +306,8 @@ struct rpcrdma_mr_seg *frwr_map(struct rpcrdma_xprt *r_xprt,
>>>   	if (nsegs > ep->re_max_fr_depth)
>>>   		nsegs = ep->re_max_fr_depth;
>>>   	for (i = 0; i < nsegs;) {
>>> -		if (seg->mr_page)
>>> -			sg_set_page(&mr->mr_sg[i],
>>> -				    seg->mr_page,
>>> -				    seg->mr_len,
>>> -				    offset_in_page(seg->mr_offset));
>>> -		else
>>> -			sg_set_buf(&mr->mr_sg[i], seg->mr_offset,
>>> -				   seg->mr_len);
>>> +		sg_set_page(&mr->mr_sg[i], seg->mr_page,
>>> +			    seg->mr_len, offset_in_page(seg->mr_offset));
>>>     		++seg;
>>>   		++i;
>>> diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
>>> index 8f5d0cb68360..529adb6ad4db 100644
>>> --- a/net/sunrpc/xprtrdma/rpc_rdma.c
>>> +++ b/net/sunrpc/xprtrdma/rpc_rdma.c
>>> @@ -204,9 +204,7 @@ rpcrdma_alloc_sparse_pages(struct xdr_buf *buf)
>>>   	return 0;
>>>   }
>>>   -/* Split @vec on page boundaries into SGEs. FMR registers pages, not
>>> - * a byte range. Other modes coalesce these SGEs into a single MR
>>> - * when they can.
>>> +/* Convert @vec to a single SGL element.
>>>    *
>>>    * Returns pointer to next available SGE, and bumps the total number
>>>    * of SGEs consumed.
>>> @@ -215,21 +213,12 @@ static struct rpcrdma_mr_seg *
>>>   rpcrdma_convert_kvec(struct kvec *vec, struct rpcrdma_mr_seg *seg,
>>>   		     unsigned int *n)
>>>   {
>>> -	u32 remaining, page_offset;
>>> -	char *base;
>>> -
>>> -	base = vec->iov_base;
>>> -	page_offset = offset_in_page(base);
>>> -	remaining = vec->iov_len;
>>> -	while (remaining) {
>>> -		seg->mr_page = NULL;
>>> -		seg->mr_offset = base;
>>> -		seg->mr_len = min_t(u32, PAGE_SIZE - page_offset, remaining);
>>> -		remaining -= seg->mr_len;
>>> -		base += seg->mr_len;
>>> +	if (vec->iov_len) {
>>> +		seg->mr_page = virt_to_page(vec->iov_base);
>>> +		seg->mr_offset = vec->iov_base;
>>> +		seg->mr_len = vec->iov_len;
>>>   		++seg;
>>>   		++(*n);
>>> -		page_offset = 0;
>>>   	}
>>>   	return seg;
>>>   }
>>> diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
>>> index 94b28657aeeb..4a9fe6592795 100644
>>> --- a/net/sunrpc/xprtrdma/xprt_rdma.h
>>> +++ b/net/sunrpc/xprtrdma/xprt_rdma.h
>>> @@ -285,7 +285,7 @@ enum {
>>>     struct rpcrdma_mr_seg {		/* chunk descriptors */
>>>   	u32		mr_len;		/* length of chunk or segment */
>>> -	struct page	*mr_page;	/* owning page, if any */
>>> +	struct page	*mr_page;	/* underlying struct page */
>>>   	char		*mr_offset;	/* kva if no page, else offset */
>>
>> Is this comment ("kva if no page") actually correct? The hunk just
>> above is an example of a case where mr_page is set, yet mr_offset
>> is an iov_base. Is iov_base not a kva?
> 
> Ah, well the "if no page" part is now obsolete.
> 
> I suppose it should be set to "offset_in_page(vec->iov_base)" ?

Seems like it, yes. Assuming that only the first element in the sgl
has a possibly non-zero offset ("FBO"). All others must be zero for
the FRMR.

Is it guaranteed that each kvec is at most one physical page? If not,
then the length may span into a random physical page, that was not
necessarily contiguous in the original KVA-addressed buffer.

Tom.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1] xprtrdma: Simplify rpcrdma_convert_kvec() and frwr_map()
  2021-02-02 21:50     ` Tom Talpey
@ 2021-02-02 21:55       ` Chuck Lever
  2021-02-02 22:21         ` Tom Talpey
  0 siblings, 1 reply; 7+ messages in thread
From: Chuck Lever @ 2021-02-02 21:55 UTC (permalink / raw)
  To: Tom Talpey; +Cc: Anna Schumaker, Linux NFS Mailing List, linux-rdma



> On Feb 2, 2021, at 4:50 PM, Tom Talpey <tom@talpey.com> wrote:
> 
> On 2/2/2021 2:20 PM, Chuck Lever wrote:
>>> On Feb 2, 2021, at 2:18 PM, Tom Talpey <tom@talpey.com> wrote:
>>> 
>>> What's not to like about a log that uses the words "with aplomb"? :)
>>> 
>>> Minor related comment/question below.
>>> 
>>> On 2/2/2021 9:42 AM, Chuck Lever wrote:
>>>> Clean up.
>>>> Support for FMR was removed by commit ba69cd122ece ("xprtrdma:
>>>> Remove support for FMR memory registration") [Dec 2018]. That means
>>>> the buffer-splitting behavior of rpcrdma_convert_kvec(), added by
>>>> commit 821c791a0bde ("xprtrdma: Segment head and tail XDR buffers
>>>> on page boundaries") [Mar 2016], is no longer necessary. FRWR
>>>> memory registration handles this case with aplomb.
>>>> A related simplification removes an extra conditional branch from
>>>> the SGL set-up loop in frwr_map(): Instead of using either
>>>> sg_set_page() or sg_set_buf(), initialize the mr_page field properly
>>>> when rpcrdma_convert_kvec() converts the kvec to an SGL entry.
>>>> frwr_map() can then invoke sg_set_page() unconditionally.
>>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>>> ---
>>>>  net/sunrpc/xprtrdma/frwr_ops.c  |   10 ++--------
>>>>  net/sunrpc/xprtrdma/rpc_rdma.c  |   21 +++++----------------
>>>>  net/sunrpc/xprtrdma/xprt_rdma.h |    2 +-
>>>>  3 files changed, 8 insertions(+), 25 deletions(-)
>>>> diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c
>>>> index baca49fe83af..5eb044a5f0be 100644
>>>> --- a/net/sunrpc/xprtrdma/frwr_ops.c
>>>> +++ b/net/sunrpc/xprtrdma/frwr_ops.c
>>>> @@ -306,14 +306,8 @@ struct rpcrdma_mr_seg *frwr_map(struct rpcrdma_xprt *r_xprt,
>>>>  	if (nsegs > ep->re_max_fr_depth)
>>>>  		nsegs = ep->re_max_fr_depth;
>>>>  	for (i = 0; i < nsegs;) {
>>>> -		if (seg->mr_page)
>>>> -			sg_set_page(&mr->mr_sg[i],
>>>> -				    seg->mr_page,
>>>> -				    seg->mr_len,
>>>> -				    offset_in_page(seg->mr_offset));
>>>> -		else
>>>> -			sg_set_buf(&mr->mr_sg[i], seg->mr_offset,
>>>> -				   seg->mr_len);
>>>> +		sg_set_page(&mr->mr_sg[i], seg->mr_page,
>>>> +			    seg->mr_len, offset_in_page(seg->mr_offset));
>>>>    		++seg;
>>>>  		++i;
>>>> diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
>>>> index 8f5d0cb68360..529adb6ad4db 100644
>>>> --- a/net/sunrpc/xprtrdma/rpc_rdma.c
>>>> +++ b/net/sunrpc/xprtrdma/rpc_rdma.c
>>>> @@ -204,9 +204,7 @@ rpcrdma_alloc_sparse_pages(struct xdr_buf *buf)
>>>>  	return 0;
>>>>  }
>>>>  -/* Split @vec on page boundaries into SGEs. FMR registers pages, not
>>>> - * a byte range. Other modes coalesce these SGEs into a single MR
>>>> - * when they can.
>>>> +/* Convert @vec to a single SGL element.
>>>>   *
>>>>   * Returns pointer to next available SGE, and bumps the total number
>>>>   * of SGEs consumed.
>>>> @@ -215,21 +213,12 @@ static struct rpcrdma_mr_seg *
>>>>  rpcrdma_convert_kvec(struct kvec *vec, struct rpcrdma_mr_seg *seg,
>>>>  		     unsigned int *n)
>>>>  {
>>>> -	u32 remaining, page_offset;
>>>> -	char *base;
>>>> -
>>>> -	base = vec->iov_base;
>>>> -	page_offset = offset_in_page(base);
>>>> -	remaining = vec->iov_len;
>>>> -	while (remaining) {
>>>> -		seg->mr_page = NULL;
>>>> -		seg->mr_offset = base;
>>>> -		seg->mr_len = min_t(u32, PAGE_SIZE - page_offset, remaining);
>>>> -		remaining -= seg->mr_len;
>>>> -		base += seg->mr_len;
>>>> +	if (vec->iov_len) {
>>>> +		seg->mr_page = virt_to_page(vec->iov_base);
>>>> +		seg->mr_offset = vec->iov_base;
>>>> +		seg->mr_len = vec->iov_len;
>>>>  		++seg;
>>>>  		++(*n);
>>>> -		page_offset = 0;
>>>>  	}
>>>>  	return seg;
>>>>  }
>>>> diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
>>>> index 94b28657aeeb..4a9fe6592795 100644
>>>> --- a/net/sunrpc/xprtrdma/xprt_rdma.h
>>>> +++ b/net/sunrpc/xprtrdma/xprt_rdma.h
>>>> @@ -285,7 +285,7 @@ enum {
>>>>    struct rpcrdma_mr_seg {		/* chunk descriptors */
>>>>  	u32		mr_len;		/* length of chunk or segment */
>>>> -	struct page	*mr_page;	/* owning page, if any */
>>>> +	struct page	*mr_page;	/* underlying struct page */
>>>>  	char		*mr_offset;	/* kva if no page, else offset */
>>> 
>>> Is this comment ("kva if no page") actually correct? The hunk just
>>> above is an example of a case where mr_page is set, yet mr_offset
>>> is an iov_base. Is iov_base not a kva?
>> Ah, well the "if no page" part is now obsolete.
>> I suppose it should be set to "offset_in_page(vec->iov_base)" ?
> 
> Seems like it, yes. Assuming that only the first element in the sgl
> has a possibly non-zero offset ("FBO"). All others must be zero for
> the FRMR.
> 
> Is it guaranteed that each kvec is at most one physical page? If not,
> then the length may span into a random physical page, that was not
> necessarily contiguous in the original KVA-addressed buffer.

IIUC kmalloc'd buffers are backed by physically contiguous pages.


--
Chuck Lever




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1] xprtrdma: Simplify rpcrdma_convert_kvec() and frwr_map()
  2021-02-02 21:55       ` Chuck Lever
@ 2021-02-02 22:21         ` Tom Talpey
  0 siblings, 0 replies; 7+ messages in thread
From: Tom Talpey @ 2021-02-02 22:21 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Anna Schumaker, Linux NFS Mailing List, linux-rdma

On 2/2/2021 4:55 PM, Chuck Lever wrote:
> 
> 
>> On Feb 2, 2021, at 4:50 PM, Tom Talpey <tom@talpey.com> wrote:
>>
>> On 2/2/2021 2:20 PM, Chuck Lever wrote:
>>>> On Feb 2, 2021, at 2:18 PM, Tom Talpey <tom@talpey.com> wrote:
>>>>
>>>> What's not to like about a log that uses the words "with aplomb"? :)
>>>>
>>>> Minor related comment/question below.
>>>>
>>>> On 2/2/2021 9:42 AM, Chuck Lever wrote:
>>>>> Clean up.
>>>>> Support for FMR was removed by commit ba69cd122ece ("xprtrdma:
>>>>> Remove support for FMR memory registration") [Dec 2018]. That means
>>>>> the buffer-splitting behavior of rpcrdma_convert_kvec(), added by
>>>>> commit 821c791a0bde ("xprtrdma: Segment head and tail XDR buffers
>>>>> on page boundaries") [Mar 2016], is no longer necessary. FRWR
>>>>> memory registration handles this case with aplomb.
>>>>> A related simplification removes an extra conditional branch from
>>>>> the SGL set-up loop in frwr_map(): Instead of using either
>>>>> sg_set_page() or sg_set_buf(), initialize the mr_page field properly
>>>>> when rpcrdma_convert_kvec() converts the kvec to an SGL entry.
>>>>> frwr_map() can then invoke sg_set_page() unconditionally.
>>>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>>>> ---
>>>>>   net/sunrpc/xprtrdma/frwr_ops.c  |   10 ++--------
>>>>>   net/sunrpc/xprtrdma/rpc_rdma.c  |   21 +++++----------------
>>>>>   net/sunrpc/xprtrdma/xprt_rdma.h |    2 +-
>>>>>   3 files changed, 8 insertions(+), 25 deletions(-)
>>>>> diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c
>>>>> index baca49fe83af..5eb044a5f0be 100644
>>>>> --- a/net/sunrpc/xprtrdma/frwr_ops.c
>>>>> +++ b/net/sunrpc/xprtrdma/frwr_ops.c
>>>>> @@ -306,14 +306,8 @@ struct rpcrdma_mr_seg *frwr_map(struct rpcrdma_xprt *r_xprt,
>>>>>   	if (nsegs > ep->re_max_fr_depth)
>>>>>   		nsegs = ep->re_max_fr_depth;
>>>>>   	for (i = 0; i < nsegs;) {
>>>>> -		if (seg->mr_page)
>>>>> -			sg_set_page(&mr->mr_sg[i],
>>>>> -				    seg->mr_page,
>>>>> -				    seg->mr_len,
>>>>> -				    offset_in_page(seg->mr_offset));
>>>>> -		else
>>>>> -			sg_set_buf(&mr->mr_sg[i], seg->mr_offset,
>>>>> -				   seg->mr_len);
>>>>> +		sg_set_page(&mr->mr_sg[i], seg->mr_page,
>>>>> +			    seg->mr_len, offset_in_page(seg->mr_offset));
>>>>>     		++seg;
>>>>>   		++i;
>>>>> diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
>>>>> index 8f5d0cb68360..529adb6ad4db 100644
>>>>> --- a/net/sunrpc/xprtrdma/rpc_rdma.c
>>>>> +++ b/net/sunrpc/xprtrdma/rpc_rdma.c
>>>>> @@ -204,9 +204,7 @@ rpcrdma_alloc_sparse_pages(struct xdr_buf *buf)
>>>>>   	return 0;
>>>>>   }
>>>>>   -/* Split @vec on page boundaries into SGEs. FMR registers pages, not
>>>>> - * a byte range. Other modes coalesce these SGEs into a single MR
>>>>> - * when they can.
>>>>> +/* Convert @vec to a single SGL element.
>>>>>    *
>>>>>    * Returns pointer to next available SGE, and bumps the total number
>>>>>    * of SGEs consumed.
>>>>> @@ -215,21 +213,12 @@ static struct rpcrdma_mr_seg *
>>>>>   rpcrdma_convert_kvec(struct kvec *vec, struct rpcrdma_mr_seg *seg,
>>>>>   		     unsigned int *n)
>>>>>   {
>>>>> -	u32 remaining, page_offset;
>>>>> -	char *base;
>>>>> -
>>>>> -	base = vec->iov_base;
>>>>> -	page_offset = offset_in_page(base);
>>>>> -	remaining = vec->iov_len;
>>>>> -	while (remaining) {
>>>>> -		seg->mr_page = NULL;
>>>>> -		seg->mr_offset = base;
>>>>> -		seg->mr_len = min_t(u32, PAGE_SIZE - page_offset, remaining);
>>>>> -		remaining -= seg->mr_len;
>>>>> -		base += seg->mr_len;
>>>>> +	if (vec->iov_len) {
>>>>> +		seg->mr_page = virt_to_page(vec->iov_base);
>>>>> +		seg->mr_offset = vec->iov_base;
>>>>> +		seg->mr_len = vec->iov_len;
>>>>>   		++seg;
>>>>>   		++(*n);
>>>>> -		page_offset = 0;
>>>>>   	}
>>>>>   	return seg;
>>>>>   }
>>>>> diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
>>>>> index 94b28657aeeb..4a9fe6592795 100644
>>>>> --- a/net/sunrpc/xprtrdma/xprt_rdma.h
>>>>> +++ b/net/sunrpc/xprtrdma/xprt_rdma.h
>>>>> @@ -285,7 +285,7 @@ enum {
>>>>>     struct rpcrdma_mr_seg {		/* chunk descriptors */
>>>>>   	u32		mr_len;		/* length of chunk or segment */
>>>>> -	struct page	*mr_page;	/* owning page, if any */
>>>>> +	struct page	*mr_page;	/* underlying struct page */
>>>>>   	char		*mr_offset;	/* kva if no page, else offset */
>>>>
>>>> Is this comment ("kva if no page") actually correct? The hunk just
>>>> above is an example of a case where mr_page is set, yet mr_offset
>>>> is an iov_base. Is iov_base not a kva?
>>> Ah, well the "if no page" part is now obsolete.
>>> I suppose it should be set to "offset_in_page(vec->iov_base)" ?
>>
>> Seems like it, yes. Assuming that only the first element in the sgl
>> has a possibly non-zero offset ("FBO"). All others must be zero for
>> the FRMR.
>>
>> Is it guaranteed that each kvec is at most one physical page? If not,
>> then the length may span into a random physical page, that was not
>> necessarily contiguous in the original KVA-addressed buffer.
> 
> IIUC kmalloc'd buffers are backed by physically contiguous pages.

Indeed yes, kmalloc is heroic. If the kvec's are based on kmalloc'd
buffers it's good for any iov_len.

Tom.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-02-02 22:23 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-02 14:42 [PATCH v1] xprtrdma: Simplify rpcrdma_convert_kvec() and frwr_map() Chuck Lever
2021-02-02 19:18 ` Tom Talpey
2021-02-02 19:20   ` Chuck Lever
2021-02-02 21:50     ` Tom Talpey
2021-02-02 21:55       ` Chuck Lever
2021-02-02 22:21         ` Tom Talpey
2021-02-02 19:40   ` Anna Schumaker

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.