All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1 0/5] Fix NFSv3 READDIRPLUS failures
@ 2022-06-05 19:58 Chuck Lever
  2022-06-05 19:58 ` [PATCH v1 1/5] SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer() Chuck Lever
                   ` (5 more replies)
  0 siblings, 6 replies; 21+ messages in thread
From: Chuck Lever @ 2022-06-05 19:58 UTC (permalink / raw)
  To: linux-nfs

While looking into the filecache CPU soft lock-up issue, I ran
across this problem. I thought I could run it down in just an
afternoon (I was wrong) and that this problem probably affects more
users than the soft lock-up (jury's still out).

Anyway, NFSD's new READDIRPLUS dirent encoder blows past the end of
the directory payload xdr_stream when the client requests more than
a page worth of directory entries. I tracked this down to how
xdr_get_next_encode_buffer() computes xdr->end. First patch in this
series is the fix. The remaining patches are clean-ups and
optimizations.

I want to get this into 5.19-rc quickly. I would appreciate getting
at least two R-b's for this series.

---

Chuck Lever (5):
      SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer()
      SUNRPC: Optimize xdr_reserve_space()
      SUNRPC: Clean up xdr_commit_encode()
      SUNRPC: Clean up xdr_get_next_encode_buffer()
      SUNRPC: Remove pointer type casts from xdr_get_next_encode_buffer()


 net/sunrpc/xdr.c | 31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)

--
Chuck Lever


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v1 1/5] SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer()
  2022-06-05 19:58 [PATCH v1 0/5] Fix NFSv3 READDIRPLUS failures Chuck Lever
@ 2022-06-05 19:58 ` Chuck Lever
  2022-06-06 22:09   ` J. Bruce Fields
  2022-06-05 19:58 ` [PATCH v1 2/5] SUNRPC: Optimize xdr_reserve_space() Chuck Lever
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Chuck Lever @ 2022-06-05 19:58 UTC (permalink / raw)
  To: linux-nfs

I found that NFSD's new NFSv3 READDIRPLUS XDR encoder was screwing up
right at the end of the page array. xdr_get_next_encode_buffer() does
not compute the value of xdr->end correctly:

 * The check to see if we're on the final available page in xdr->buf
   needs to account for the space consumed by @nbytes.

 * The new xdr->end value needs to account for the portion of @nbytes
   that is to be encoded into the previous buffer.

Fixes: 2825a7f90753 ("nfsd4: allow encoding across page boundaries")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xdr.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index df194cc07035..b57cf9df4de8 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -979,7 +979,11 @@ static __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr,
 	 */
 	xdr->p = (void *)p + frag2bytes;
 	space_left = xdr->buf->buflen - xdr->buf->len;
-	xdr->end = (void *)p + min_t(int, space_left, PAGE_SIZE);
+	if (space_left - nbytes >= PAGE_SIZE)
+		xdr->end = (void *)p + PAGE_SIZE;
+	else
+		xdr->end = (void *)p + space_left - frag1bytes;
+
 	xdr->buf->page_len += frag2bytes;
 	xdr->buf->len += nbytes;
 	return p;



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v1 2/5] SUNRPC: Optimize xdr_reserve_space()
  2022-06-05 19:58 [PATCH v1 0/5] Fix NFSv3 READDIRPLUS failures Chuck Lever
  2022-06-05 19:58 ` [PATCH v1 1/5] SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer() Chuck Lever
@ 2022-06-05 19:58 ` Chuck Lever
  2022-06-07  1:03   ` NeilBrown
  2022-06-05 19:58 ` [PATCH v1 3/5] SUNRPC: Clean up xdr_commit_encode() Chuck Lever
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Chuck Lever @ 2022-06-05 19:58 UTC (permalink / raw)
  To: linux-nfs

The xdr_get_next_encode_buffer() function is called infrequently.
On a typical build/test workload, it's called about 1 time in 400
calls to xdr_reserve_space() (measured on NFSD).

Force the compiler to remove it from xdr_reserve_space(), which is
a hot path. This change reduces the size of xdr_reserve_space() from
10 cache lines to 4 on my test system.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xdr.c |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index b57cf9df4de8..08a85375b311 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -945,8 +945,13 @@ inline void xdr_commit_encode(struct xdr_stream *xdr)
 }
 EXPORT_SYMBOL_GPL(xdr_commit_encode);
 
-static __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr,
-		size_t nbytes)
+/*
+ * The buffer space to be reserved crosses the boundary between
+ * xdr->buf->head and xdr->buf->pages, or between two pages
+ * in xdr->buf->pages.
+ */
+static noinline __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr,
+						   size_t nbytes)
 {
 	__be32 *p;
 	int space_left;



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v1 3/5] SUNRPC: Clean up xdr_commit_encode()
  2022-06-05 19:58 [PATCH v1 0/5] Fix NFSv3 READDIRPLUS failures Chuck Lever
  2022-06-05 19:58 ` [PATCH v1 1/5] SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer() Chuck Lever
  2022-06-05 19:58 ` [PATCH v1 2/5] SUNRPC: Optimize xdr_reserve_space() Chuck Lever
@ 2022-06-05 19:58 ` Chuck Lever
  2022-06-07  1:05   ` NeilBrown
  2022-06-05 19:58 ` [PATCH v1 4/5] SUNRPC: Clean up xdr_get_next_encode_buffer() Chuck Lever
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Chuck Lever @ 2022-06-05 19:58 UTC (permalink / raw)
  To: linux-nfs

Both the ::iov_len field and the third parameter of memcpy() and
memmove() are size_t. There's no reason for the implicit conversion
from size_t to int and back. Change the type of @shift to make the
code easier to read and understand.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xdr.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index 08a85375b311..de8f71468637 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -933,14 +933,16 @@ EXPORT_SYMBOL_GPL(xdr_init_encode);
  */
 inline void xdr_commit_encode(struct xdr_stream *xdr)
 {
-	int shift = xdr->scratch.iov_len;
+	size_t shift = xdr->scratch.iov_len;
 	void *page;
 
 	if (shift == 0)
 		return;
+
 	page = page_address(*xdr->page_ptr);
 	memcpy(xdr->scratch.iov_base, page, shift);
 	memmove(page, page + shift, (void *)xdr->p - page);
+
 	xdr_reset_scratch_buffer(xdr);
 }
 EXPORT_SYMBOL_GPL(xdr_commit_encode);



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v1 4/5] SUNRPC: Clean up xdr_get_next_encode_buffer()
  2022-06-05 19:58 [PATCH v1 0/5] Fix NFSv3 READDIRPLUS failures Chuck Lever
                   ` (2 preceding siblings ...)
  2022-06-05 19:58 ` [PATCH v1 3/5] SUNRPC: Clean up xdr_commit_encode() Chuck Lever
@ 2022-06-05 19:58 ` Chuck Lever
  2022-06-07  1:07   ` NeilBrown
  2022-06-05 19:58 ` [PATCH v1 5/5] SUNRPC: Remove pointer type casts from xdr_get_next_encode_buffer() Chuck Lever
  2022-06-07 11:37 ` [PATCH v1 0/5] Fix NFSv3 READDIRPLUS failures NeilBrown
  5 siblings, 1 reply; 21+ messages in thread
From: Chuck Lever @ 2022-06-05 19:58 UTC (permalink / raw)
  To: linux-nfs

The value of @p is not used until the "location of the next item" is
computed. Help human readers by moving its initial assignment to the
paragraph where that value is used and by clarifying the antecedents
in the documenting comment.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xdr.c |    8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index de8f71468637..89cb48931a1f 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -971,6 +971,7 @@ static noinline __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr,
 		xdr->buf->page_len += frag1bytes;
 	xdr->page_ptr++;
 	xdr->iov = NULL;
+
 	/*
 	 * If the last encode didn't end exactly on a page boundary, the
 	 * next one will straddle boundaries.  Encode into the next
@@ -979,11 +980,12 @@ static noinline __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr,
 	 * space at the end of the previous buffer:
 	 */
 	xdr_set_scratch_buffer(xdr, xdr->p, frag1bytes);
-	p = page_address(*xdr->page_ptr);
+
 	/*
-	 * Note this is where the next encode will start after we've
-	 * shifted this one back:
+	 * xdr->p is where the next encode will start after
+	 * xdr_commit_encode() has shifted this one back:
 	 */
+	p = page_address(*xdr->page_ptr);
 	xdr->p = (void *)p + frag2bytes;
 	space_left = xdr->buf->buflen - xdr->buf->len;
 	if (space_left - nbytes >= PAGE_SIZE)



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v1 5/5] SUNRPC: Remove pointer type casts from xdr_get_next_encode_buffer()
  2022-06-05 19:58 [PATCH v1 0/5] Fix NFSv3 READDIRPLUS failures Chuck Lever
                   ` (3 preceding siblings ...)
  2022-06-05 19:58 ` [PATCH v1 4/5] SUNRPC: Clean up xdr_get_next_encode_buffer() Chuck Lever
@ 2022-06-05 19:58 ` Chuck Lever
  2022-06-07  1:08   ` NeilBrown
  2022-06-07 11:37 ` [PATCH v1 0/5] Fix NFSv3 READDIRPLUS failures NeilBrown
  5 siblings, 1 reply; 21+ messages in thread
From: Chuck Lever @ 2022-06-05 19:58 UTC (permalink / raw)
  To: linux-nfs

To make the code easier to read, remove visual clutter by changing
the declared type of @p.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xdr.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index 89cb48931a1f..4e4cad7aeec2 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -955,9 +955,9 @@ EXPORT_SYMBOL_GPL(xdr_commit_encode);
 static noinline __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr,
 						   size_t nbytes)
 {
-	__be32 *p;
 	int space_left;
 	int frag1bytes, frag2bytes;
+	void *p;
 
 	if (nbytes > PAGE_SIZE)
 		goto out_overflow; /* Bigger buffers require special handling */
@@ -986,12 +986,12 @@ static noinline __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr,
 	 * xdr_commit_encode() has shifted this one back:
 	 */
 	p = page_address(*xdr->page_ptr);
-	xdr->p = (void *)p + frag2bytes;
+	xdr->p = p + frag2bytes;
 	space_left = xdr->buf->buflen - xdr->buf->len;
 	if (space_left - nbytes >= PAGE_SIZE)
-		xdr->end = (void *)p + PAGE_SIZE;
+		xdr->end = p + PAGE_SIZE;
 	else
-		xdr->end = (void *)p + space_left - frag1bytes;
+		xdr->end = p + space_left - frag1bytes;
 
 	xdr->buf->page_len += frag2bytes;
 	xdr->buf->len += nbytes;



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 1/5] SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer()
  2022-06-05 19:58 ` [PATCH v1 1/5] SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer() Chuck Lever
@ 2022-06-06 22:09   ` J. Bruce Fields
  2022-06-07  0:59     ` NeilBrown
  0 siblings, 1 reply; 21+ messages in thread
From: J. Bruce Fields @ 2022-06-06 22:09 UTC (permalink / raw)
  To: Chuck Lever; +Cc: linux-nfs

On Sun, Jun 05, 2022 at 03:58:25PM -0400, Chuck Lever wrote:
> I found that NFSD's new NFSv3 READDIRPLUS XDR encoder was screwing up
> right at the end of the page array. xdr_get_next_encode_buffer() does
> not compute the value of xdr->end correctly:
> 
>  * The check to see if we're on the final available page in xdr->buf
>    needs to account for the space consumed by @nbytes.
> 
>  * The new xdr->end value needs to account for the portion of @nbytes
>    that is to be encoded into the previous buffer.
> 
> Fixes: 2825a7f90753 ("nfsd4: allow encoding across page boundaries")
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  net/sunrpc/xdr.c |    6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
> index df194cc07035..b57cf9df4de8 100644
> --- a/net/sunrpc/xdr.c
> +++ b/net/sunrpc/xdr.c
> @@ -979,7 +979,11 @@ static __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr,
>  	 */
>  	xdr->p = (void *)p + frag2bytes;
>  	space_left = xdr->buf->buflen - xdr->buf->len;
> -	xdr->end = (void *)p + min_t(int, space_left, PAGE_SIZE);
> +	if (space_left - nbytes >= PAGE_SIZE)
> +		xdr->end = (void *)p + PAGE_SIZE;
> +	else
> +		xdr->end = (void *)p + space_left - frag1bytes;
> +

I think that's right.

Couldn't you just make that


-	xdr->end = (void *)p + min_t(int, space_left, PAGE_SIZE);
+	xdr->end = (void *)p + min_t(int, space_left - nbytes, PAGE_SIZE);

?

(Note we know space_left >= nbytes from the second "if" of this
function.)

No strong opinion either way, really, I just wonder if I'm missing
something.

--b.

>  	xdr->buf->page_len += frag2bytes;
>  	xdr->buf->len += nbytes;
>  	return p;
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 1/5] SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer()
  2022-06-06 22:09   ` J. Bruce Fields
@ 2022-06-07  0:59     ` NeilBrown
  2022-06-07  2:35       ` Chuck Lever III
  0 siblings, 1 reply; 21+ messages in thread
From: NeilBrown @ 2022-06-07  0:59 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Chuck Lever, linux-nfs

On Tue, 07 Jun 2022, J. Bruce Fields wrote:
> On Sun, Jun 05, 2022 at 03:58:25PM -0400, Chuck Lever wrote:
> > I found that NFSD's new NFSv3 READDIRPLUS XDR encoder was screwing up
> > right at the end of the page array. xdr_get_next_encode_buffer() does
> > not compute the value of xdr->end correctly:
> > 
> >  * The check to see if we're on the final available page in xdr->buf
> >    needs to account for the space consumed by @nbytes.
> > 
> >  * The new xdr->end value needs to account for the portion of @nbytes
> >    that is to be encoded into the previous buffer.
> > 
> > Fixes: 2825a7f90753 ("nfsd4: allow encoding across page boundaries")
> > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > ---
> >  net/sunrpc/xdr.c |    6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
> > index df194cc07035..b57cf9df4de8 100644
> > --- a/net/sunrpc/xdr.c
> > +++ b/net/sunrpc/xdr.c
> > @@ -979,7 +979,11 @@ static __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr,
> >  	 */
> >  	xdr->p = (void *)p + frag2bytes;
> >  	space_left = xdr->buf->buflen - xdr->buf->len;
> > -	xdr->end = (void *)p + min_t(int, space_left, PAGE_SIZE);
> > +	if (space_left - nbytes >= PAGE_SIZE)
> > +		xdr->end = (void *)p + PAGE_SIZE;
> > +	else
> > +		xdr->end = (void *)p + space_left - frag1bytes;
> > +
> 
> I think that's right.

I think I agree.

> 
> Couldn't you just make that
> 
> 
> -	xdr->end = (void *)p + min_t(int, space_left, PAGE_SIZE);
> +	xdr->end = (void *)p + min_t(int, space_left - nbytes, PAGE_SIZE);

I don't think that is the same.
When space_left is small, this results in "p + space_left - nbytes" but
the one we both this is right results in  "p + space_left - frag1bytes".

I'm going to suggest a more radical change.
Let's start off with 

   int space_avail = xdr->buf->buflen - xdr->buf->len;

In this function we sometime care about space before we consume any, and
something care about space after we consume some.  "space_left" sounds
more like the latter.  "space_avail" sounds more like the former.
Current space_left is assigned to the former, which I find confused.

Then the second "if" which Bruce highlighted becomes

   if (nbytes > space_avail)
          goto out_overflow;

which is obviously correct.

We then assign frag{1,2}bytes and have another chunk of code that looks
wrong to me.  I'd like

   if (xdr->iov) {
	xdr->iov->iov_len += frag1bytes;
	xdr->iov = NULL;
   } else {
        xdr->buf->page_len += frag1bytes;
        xdr->page_ptr++;
   }

Note that this changes the code NOT to increment pagE_ptr if iov was not
NULL.  I cannot see how it would be correct to do that.  Presumably this
code is never called with iov != NULL ???

Now I want to rearrange the assignments at the end:

	xdr->p = (void *)p + frag2bytes;
	xdr->buf->page_len += frag2bytes;
	xdr->buf->len += nbytes;
	space_left = xdr->buf->buflen - xdr->buf->len;
	xdr->end = (void *)xdr->p + min_t(int, space_left, PAGE_SIZE);

Note that the last line "xdr->p" in place of "p".
We still have "space_left", but it now is the space that is left
after we have consumed some.

Possibly the space_left line could be

       space_left -= nbytes;

NeilBrown

> 
> ?
> 
> (Note we know space_left >= nbytes from the second "if" of this
> function.)
> 
> No strong opinion either way, really, I just wonder if I'm missing
> something.
> 
> --b.
> 
> >  	xdr->buf->page_len += frag2bytes;
> >  	xdr->buf->len += nbytes;
> >  	return p;
> > 
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 2/5] SUNRPC: Optimize xdr_reserve_space()
  2022-06-05 19:58 ` [PATCH v1 2/5] SUNRPC: Optimize xdr_reserve_space() Chuck Lever
@ 2022-06-07  1:03   ` NeilBrown
  2022-06-07  2:19     ` Chuck Lever III
  0 siblings, 1 reply; 21+ messages in thread
From: NeilBrown @ 2022-06-07  1:03 UTC (permalink / raw)
  To: Chuck Lever; +Cc: linux-nfs

On Mon, 06 Jun 2022, Chuck Lever wrote:
> The xdr_get_next_encode_buffer() function is called infrequently.
> On a typical build/test workload, it's called about 1 time in 400
> calls to xdr_reserve_space() (measured on NFSD).
> 
> Force the compiler to remove it from xdr_reserve_space(), which is
> a hot path. This change reduces the size of xdr_reserve_space() from
> 10 cache lines to 4 on my test system.

Does this really help at all?  Are the instructions that are executed in
the common case distributed over those 10 cache line, or are they all in
the first 4?

I would have thought the "unlikely" in xdr_reserve_space() would have
pushed all the code from xdr_get_next_encode_buffer() to the end of the
function leaving the remainder in a small contiguous chunk.

NeilBrown


> 
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  net/sunrpc/xdr.c |    9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
> index b57cf9df4de8..08a85375b311 100644
> --- a/net/sunrpc/xdr.c
> +++ b/net/sunrpc/xdr.c
> @@ -945,8 +945,13 @@ inline void xdr_commit_encode(struct xdr_stream *xdr)
>  }
>  EXPORT_SYMBOL_GPL(xdr_commit_encode);
>  
> -static __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr,
> -		size_t nbytes)
> +/*
> + * The buffer space to be reserved crosses the boundary between
> + * xdr->buf->head and xdr->buf->pages, or between two pages
> + * in xdr->buf->pages.
> + */
> +static noinline __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr,
> +						   size_t nbytes)
>  {
>  	__be32 *p;
>  	int space_left;
> 
> 
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 3/5] SUNRPC: Clean up xdr_commit_encode()
  2022-06-05 19:58 ` [PATCH v1 3/5] SUNRPC: Clean up xdr_commit_encode() Chuck Lever
@ 2022-06-07  1:05   ` NeilBrown
  2022-06-07  2:21     ` Chuck Lever III
  0 siblings, 1 reply; 21+ messages in thread
From: NeilBrown @ 2022-06-07  1:05 UTC (permalink / raw)
  To: Chuck Lever; +Cc: linux-nfs

On Mon, 06 Jun 2022, Chuck Lever wrote:
> Both the ::iov_len field and the third parameter of memcpy() and
> memmove() are size_t. There's no reason for the implicit conversion
> from size_t to int and back. Change the type of @shift to make the
> code easier to read and understand.

I wouldn't object to "shift" being renamed "frag1bytes" to make the
connection with xdr_get_next_encode_buffer() more blatant..

Reviewed-by: NeilBrown <neilb@suse.de>

Thanks,
NeilBrown


> 
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  net/sunrpc/xdr.c |    4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
> index 08a85375b311..de8f71468637 100644
> --- a/net/sunrpc/xdr.c
> +++ b/net/sunrpc/xdr.c
> @@ -933,14 +933,16 @@ EXPORT_SYMBOL_GPL(xdr_init_encode);
>   */
>  inline void xdr_commit_encode(struct xdr_stream *xdr)
>  {
> -	int shift = xdr->scratch.iov_len;
> +	size_t shift = xdr->scratch.iov_len;
>  	void *page;
>  
>  	if (shift == 0)
>  		return;
> +
>  	page = page_address(*xdr->page_ptr);
>  	memcpy(xdr->scratch.iov_base, page, shift);
>  	memmove(page, page + shift, (void *)xdr->p - page);
> +
>  	xdr_reset_scratch_buffer(xdr);
>  }
>  EXPORT_SYMBOL_GPL(xdr_commit_encode);
> 
> 
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 4/5] SUNRPC: Clean up xdr_get_next_encode_buffer()
  2022-06-05 19:58 ` [PATCH v1 4/5] SUNRPC: Clean up xdr_get_next_encode_buffer() Chuck Lever
@ 2022-06-07  1:07   ` NeilBrown
  0 siblings, 0 replies; 21+ messages in thread
From: NeilBrown @ 2022-06-07  1:07 UTC (permalink / raw)
  To: Chuck Lever; +Cc: linux-nfs

On Mon, 06 Jun 2022, Chuck Lever wrote:
> The value of @p is not used until the "location of the next item" is
> computed. Help human readers by moving its initial assignment to the
> paragraph where that value is used and by clarifying the antecedents
> in the documenting comment.
> 
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Reviewed-by: NeilBrown <neilb@suse.com>


> ---
>  net/sunrpc/xdr.c |    8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
> index de8f71468637..89cb48931a1f 100644
> --- a/net/sunrpc/xdr.c
> +++ b/net/sunrpc/xdr.c
> @@ -971,6 +971,7 @@ static noinline __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr,
>  		xdr->buf->page_len += frag1bytes;
>  	xdr->page_ptr++;
>  	xdr->iov = NULL;
> +
>  	/*
>  	 * If the last encode didn't end exactly on a page boundary, the
>  	 * next one will straddle boundaries.  Encode into the next
> @@ -979,11 +980,12 @@ static noinline __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr,
>  	 * space at the end of the previous buffer:
>  	 */
>  	xdr_set_scratch_buffer(xdr, xdr->p, frag1bytes);
> -	p = page_address(*xdr->page_ptr);
> +
>  	/*
> -	 * Note this is where the next encode will start after we've
> -	 * shifted this one back:
> +	 * xdr->p is where the next encode will start after
> +	 * xdr_commit_encode() has shifted this one back:
>  	 */
> +	p = page_address(*xdr->page_ptr);
>  	xdr->p = (void *)p + frag2bytes;
>  	space_left = xdr->buf->buflen - xdr->buf->len;
>  	if (space_left - nbytes >= PAGE_SIZE)
> 
> 
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 5/5] SUNRPC: Remove pointer type casts from xdr_get_next_encode_buffer()
  2022-06-05 19:58 ` [PATCH v1 5/5] SUNRPC: Remove pointer type casts from xdr_get_next_encode_buffer() Chuck Lever
@ 2022-06-07  1:08   ` NeilBrown
  0 siblings, 0 replies; 21+ messages in thread
From: NeilBrown @ 2022-06-07  1:08 UTC (permalink / raw)
  To: Chuck Lever; +Cc: linux-nfs

On Mon, 06 Jun 2022, Chuck Lever wrote:
> To make the code easier to read, remove visual clutter by changing
> the declared type of @p.

Oh yes - that's much nicer!

Reviewed-by: NeilBrown <neilb@suse.de>

Thanks,
NeilBrown

> 
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  net/sunrpc/xdr.c |    8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
> index 89cb48931a1f..4e4cad7aeec2 100644
> --- a/net/sunrpc/xdr.c
> +++ b/net/sunrpc/xdr.c
> @@ -955,9 +955,9 @@ EXPORT_SYMBOL_GPL(xdr_commit_encode);
>  static noinline __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr,
>  						   size_t nbytes)
>  {
> -	__be32 *p;
>  	int space_left;
>  	int frag1bytes, frag2bytes;
> +	void *p;
>  
>  	if (nbytes > PAGE_SIZE)
>  		goto out_overflow; /* Bigger buffers require special handling */
> @@ -986,12 +986,12 @@ static noinline __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr,
>  	 * xdr_commit_encode() has shifted this one back:
>  	 */
>  	p = page_address(*xdr->page_ptr);
> -	xdr->p = (void *)p + frag2bytes;
> +	xdr->p = p + frag2bytes;
>  	space_left = xdr->buf->buflen - xdr->buf->len;
>  	if (space_left - nbytes >= PAGE_SIZE)
> -		xdr->end = (void *)p + PAGE_SIZE;
> +		xdr->end = p + PAGE_SIZE;
>  	else
> -		xdr->end = (void *)p + space_left - frag1bytes;
> +		xdr->end = p + space_left - frag1bytes;
>  
>  	xdr->buf->page_len += frag2bytes;
>  	xdr->buf->len += nbytes;
> 
> 
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 2/5] SUNRPC: Optimize xdr_reserve_space()
  2022-06-07  1:03   ` NeilBrown
@ 2022-06-07  2:19     ` Chuck Lever III
  2022-06-07  2:35       ` NeilBrown
  0 siblings, 1 reply; 21+ messages in thread
From: Chuck Lever III @ 2022-06-07  2:19 UTC (permalink / raw)
  To: Neil Brown; +Cc: Linux NFS Mailing List



> On Jun 6, 2022, at 9:03 PM, NeilBrown <neilb@suse.de> wrote:
> 
> On Mon, 06 Jun 2022, Chuck Lever wrote:
>> The xdr_get_next_encode_buffer() function is called infrequently.
>> On a typical build/test workload, it's called about 1 time in 400
>> calls to xdr_reserve_space() (measured on NFSD).
>> 
>> Force the compiler to remove it from xdr_reserve_space(), which is
>> a hot path. This change reduces the size of xdr_reserve_space() from
>> 10 cache lines to 4 on my test system.
> 
> Does this really help at all?  Are the instructions that are executed in
> the common case distributed over those 10 cache line, or are they all in
> the first 4?
> 
> I would have thought the "unlikely" in xdr_reserve_space() would have
> pushed all the code from xdr_get_next_encode_buffer() to the end of the
> function leaving the remainder in a small contiguous chunk.

Well, granted that I'm compiling with -Os, not -O2. The compiler inlines
xdr_get_next_encode_buffer() right in the middle of xdr_reserve_space().


> NeilBrown
> 
> 
>> 
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>> net/sunrpc/xdr.c |    9 +++++++--
>> 1 file changed, 7 insertions(+), 2 deletions(-)
>> 
>> diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
>> index b57cf9df4de8..08a85375b311 100644
>> --- a/net/sunrpc/xdr.c
>> +++ b/net/sunrpc/xdr.c
>> @@ -945,8 +945,13 @@ inline void xdr_commit_encode(struct xdr_stream *xdr)
>> }
>> EXPORT_SYMBOL_GPL(xdr_commit_encode);
>> 
>> -static __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr,
>> -		size_t nbytes)
>> +/*
>> + * The buffer space to be reserved crosses the boundary between
>> + * xdr->buf->head and xdr->buf->pages, or between two pages
>> + * in xdr->buf->pages.
>> + */
>> +static noinline __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr,
>> +						   size_t nbytes)
>> {
>> 	__be32 *p;
>> 	int space_left;
>> 
>> 
>> 

--
Chuck Lever




^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 3/5] SUNRPC: Clean up xdr_commit_encode()
  2022-06-07  1:05   ` NeilBrown
@ 2022-06-07  2:21     ` Chuck Lever III
  2022-06-07  2:37       ` NeilBrown
  0 siblings, 1 reply; 21+ messages in thread
From: Chuck Lever III @ 2022-06-07  2:21 UTC (permalink / raw)
  To: Neil Brown; +Cc: Linux NFS Mailing List



> On Jun 6, 2022, at 9:05 PM, NeilBrown <neilb@suse.de> wrote:
> 
> On Mon, 06 Jun 2022, Chuck Lever wrote:
>> Both the ::iov_len field and the third parameter of memcpy() and
>> memmove() are size_t. There's no reason for the implicit conversion
>> from size_t to int and back. Change the type of @shift to make the
>> code easier to read and understand.
> 
> I wouldn't object to "shift" being renamed "frag1bytes" to make the
> connection with xdr_get_next_encode_buffer() more blatant..

I thought of that. Readers would wonder why frag1bytes in
xdr_commit_encode() was a size_t, but in xdr_get_next_encode_buffer()
it's a signed int. It started to become a longer string to pull.
Maybe it's worth it?


> Reviewed-by: NeilBrown <neilb@suse.de>
> 
> Thanks,
> NeilBrown
> 
> 
>> 
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>> net/sunrpc/xdr.c |    4 +++-
>> 1 file changed, 3 insertions(+), 1 deletion(-)
>> 
>> diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
>> index 08a85375b311..de8f71468637 100644
>> --- a/net/sunrpc/xdr.c
>> +++ b/net/sunrpc/xdr.c
>> @@ -933,14 +933,16 @@ EXPORT_SYMBOL_GPL(xdr_init_encode);
>>  */
>> inline void xdr_commit_encode(struct xdr_stream *xdr)
>> {
>> -	int shift = xdr->scratch.iov_len;
>> +	size_t shift = xdr->scratch.iov_len;
>> 	void *page;
>> 
>> 	if (shift == 0)
>> 		return;
>> +
>> 	page = page_address(*xdr->page_ptr);
>> 	memcpy(xdr->scratch.iov_base, page, shift);
>> 	memmove(page, page + shift, (void *)xdr->p - page);
>> +
>> 	xdr_reset_scratch_buffer(xdr);
>> }
>> EXPORT_SYMBOL_GPL(xdr_commit_encode);
>> 
>> 
>> 

--
Chuck Lever




^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 2/5] SUNRPC: Optimize xdr_reserve_space()
  2022-06-07  2:19     ` Chuck Lever III
@ 2022-06-07  2:35       ` NeilBrown
  0 siblings, 0 replies; 21+ messages in thread
From: NeilBrown @ 2022-06-07  2:35 UTC (permalink / raw)
  To: Chuck Lever III; +Cc: Linux NFS Mailing List

On Tue, 07 Jun 2022, Chuck Lever III wrote:
> 
> > On Jun 6, 2022, at 9:03 PM, NeilBrown <neilb@suse.de> wrote:
> > 
> > On Mon, 06 Jun 2022, Chuck Lever wrote:
> >> The xdr_get_next_encode_buffer() function is called infrequently.
> >> On a typical build/test workload, it's called about 1 time in 400
> >> calls to xdr_reserve_space() (measured on NFSD).
> >> 
> >> Force the compiler to remove it from xdr_reserve_space(), which is
> >> a hot path. This change reduces the size of xdr_reserve_space() from
> >> 10 cache lines to 4 on my test system.
> > 
> > Does this really help at all?  Are the instructions that are executed in
> > the common case distributed over those 10 cache line, or are they all in
> > the first 4?
> > 
> > I would have thought the "unlikely" in xdr_reserve_space() would have
> > pushed all the code from xdr_get_next_encode_buffer() to the end of the
> > function leaving the remainder in a small contiguous chunk.
> 
> Well, granted that I'm compiling with -Os, not -O2. The compiler inlines
> xdr_get_next_encode_buffer() right in the middle of xdr_reserve_space().

Interesting.  I tried with -O2 and it move xdr_get_next_encode_buffer()
to the end, but inlined xdr_commit_encode() in the middle.
I changed xdr_commit_encode() to wrap the "shift==0" test in likely(),
and then it produced a more reasonable result.

With your noinline patch, the "return xdr_get_next_encode_buffer()"
becomes a jump, not a jump-to-subroutine, so there is little cost in it.

Might I suggest:
  Move the "xdr->scratch.iov_len" test out of xdr_commit_encode() and
  put it in both callers as "unlikely".
  Mark both xdr_commit_encode and xdr_get_next_encode_buffer() as
  noinline
  mention in the commit message that with -Os the "unlikely" in
  xdr_reserve_space() doesn't help
??

Thanks,
NeilBrown


> 
> 
> > NeilBrown
> > 
> > 
> >> 
> >> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> >> ---
> >> net/sunrpc/xdr.c |    9 +++++++--
> >> 1 file changed, 7 insertions(+), 2 deletions(-)
> >> 
> >> diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
> >> index b57cf9df4de8..08a85375b311 100644
> >> --- a/net/sunrpc/xdr.c
> >> +++ b/net/sunrpc/xdr.c
> >> @@ -945,8 +945,13 @@ inline void xdr_commit_encode(struct xdr_stream *xdr)
> >> }
> >> EXPORT_SYMBOL_GPL(xdr_commit_encode);
> >> 
> >> -static __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr,
> >> -		size_t nbytes)
> >> +/*
> >> + * The buffer space to be reserved crosses the boundary between
> >> + * xdr->buf->head and xdr->buf->pages, or between two pages
> >> + * in xdr->buf->pages.
> >> + */
> >> +static noinline __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr,
> >> +						   size_t nbytes)
> >> {
> >> 	__be32 *p;
> >> 	int space_left;
> >> 
> >> 
> >> 
> 
> --
> Chuck Lever
> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 1/5] SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer()
  2022-06-07  0:59     ` NeilBrown
@ 2022-06-07  2:35       ` Chuck Lever III
  2022-06-07  3:12         ` Chuck Lever III
  0 siblings, 1 reply; 21+ messages in thread
From: Chuck Lever III @ 2022-06-07  2:35 UTC (permalink / raw)
  To: Neil Brown; +Cc: Bruce Fields, Linux NFS Mailing List


> On Jun 6, 2022, at 8:59 PM, NeilBrown <neilb@suse.de> wrote:
> 
> On Tue, 07 Jun 2022, J. Bruce Fields wrote:
>> On Sun, Jun 05, 2022 at 03:58:25PM -0400, Chuck Lever wrote:
>>> I found that NFSD's new NFSv3 READDIRPLUS XDR encoder was screwing up
>>> right at the end of the page array. xdr_get_next_encode_buffer() does
>>> not compute the value of xdr->end correctly:
>>> 
>>> * The check to see if we're on the final available page in xdr->buf
>>>   needs to account for the space consumed by @nbytes.
>>> 
>>> * The new xdr->end value needs to account for the portion of @nbytes
>>>   that is to be encoded into the previous buffer.
>>> 
>>> Fixes: 2825a7f90753 ("nfsd4: allow encoding across page boundaries")
>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>> ---
>>> net/sunrpc/xdr.c |    6 +++++-
>>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>> 
>>> diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
>>> index df194cc07035..b57cf9df4de8 100644
>>> --- a/net/sunrpc/xdr.c
>>> +++ b/net/sunrpc/xdr.c
>>> @@ -979,7 +979,11 @@ static __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr,
>>> 	 */
>>> 	xdr->p = (void *)p + frag2bytes;
>>> 	space_left = xdr->buf->buflen - xdr->buf->len;
>>> -	xdr->end = (void *)p + min_t(int, space_left, PAGE_SIZE);
>>> +	if (space_left - nbytes >= PAGE_SIZE)
>>> +		xdr->end = (void *)p + PAGE_SIZE;
>>> +	else
>>> +		xdr->end = (void *)p + space_left - frag1bytes;
>>> +
>> 
>> I think that's right.
> 
> I think I agree.
> 
>> 
>> Couldn't you just make that
>> 
>> 
>> -	xdr->end = (void *)p + min_t(int, space_left, PAGE_SIZE);
>> +	xdr->end = (void *)p + min_t(int, space_left - nbytes, PAGE_SIZE);
> 
> I don't think that is the same.
> When space_left is small, this results in "p + space_left - nbytes" but
> the one we both this is right results in  "p + space_left - frag1bytes".

Exactly.


> I'm going to suggest a more radical change.
> Let's start off with 
> 
>   int space_avail = xdr->buf->buflen - xdr->buf->len;
> 
> In this function we sometime care about space before we consume any, and
> something care about space after we consume some.  "space_left" sounds
> more like the latter.  "space_avail" sounds more like the former.
> Current space_left is assigned to the former, which I find confused.
> 
> Then the second "if" which Bruce highlighted becomes
> 
>   if (nbytes > space_avail)
>          goto out_overflow;
> 
> which is obviously correct.

IIUC, it's correct only if space_avail is an unsigned integer type.
Otherwise, comparison of space_avail with a size_t will cause it
to get converted to an unsigned integer, risking an underflow.

I've toyed with the idea of changing all of these signed integer
local variables to size_t, since they are used for pointer
arithmetic. Bruce has taught me to be wary about that kind of
change, however.


> We then assign frag{1,2}bytes and have another chunk of code that looks
> wrong to me.  I'd like
> 
>   if (xdr->iov) {
> 	xdr->iov->iov_len += frag1bytes;
> 	xdr->iov = NULL;
>   } else {
>        xdr->buf->page_len += frag1bytes;
>        xdr->page_ptr++;
>   }
> 
> Note that this changes the code NOT to increment pagE_ptr if iov was not
> NULL.  I cannot see how it would be correct to do that.  Presumably this
> code is never called with iov != NULL ???

That strikes me as a good change. I will add it to this series as
another patch.

Yes, this code is called by the server's READDIR encoder with iov = NULL.
See nfsd3_init_dirlist_pages().


> Now I want to rearrange the assignments at the end:
> 
> 	xdr->p = (void *)p + frag2bytes;
> 	xdr->buf->page_len += frag2bytes;
> 	xdr->buf->len += nbytes;
> 	space_left = xdr->buf->buflen - xdr->buf->len;
> 	xdr->end = (void *)xdr->p + min_t(int, space_left, PAGE_SIZE);
> 
> Note that the last line "xdr->p" in place of "p".
> We still have "space_left", but it now is the space that is left
> after we have consumed some.
> 
> Possibly the space_left line could be
> 
>       space_left -= nbytes;

This part of the code can become too clever by half very quickly.
Since this function is called infrequently, we can afford to err
on the side of easy readability. The way it is after my patch seems
straightforward to me, but I will keep your suggestion in mind.


> NeilBrown
> 
>> 
>> ?
>> 
>> (Note we know space_left >= nbytes from the second "if" of this
>> function.)
>> 
>> No strong opinion either way, really, I just wonder if I'm missing
>> something.
>> 
>> --b.
>> 
>>> 	xdr->buf->page_len += frag2bytes;
>>> 	xdr->buf->len += nbytes;
>>> 	return p;
>>> 
>> 

--
Chuck Lever




^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 3/5] SUNRPC: Clean up xdr_commit_encode()
  2022-06-07  2:21     ` Chuck Lever III
@ 2022-06-07  2:37       ` NeilBrown
  0 siblings, 0 replies; 21+ messages in thread
From: NeilBrown @ 2022-06-07  2:37 UTC (permalink / raw)
  To: Chuck Lever III; +Cc: Linux NFS Mailing List

On Tue, 07 Jun 2022, Chuck Lever III wrote:
> 
> > On Jun 6, 2022, at 9:05 PM, NeilBrown <neilb@suse.de> wrote:
> > 
> > On Mon, 06 Jun 2022, Chuck Lever wrote:
> >> Both the ::iov_len field and the third parameter of memcpy() and
> >> memmove() are size_t. There's no reason for the implicit conversion
> >> from size_t to int and back. Change the type of @shift to make the
> >> code easier to read and understand.
> > 
> > I wouldn't object to "shift" being renamed "frag1bytes" to make the
> > connection with xdr_get_next_encode_buffer() more blatant..
> 
> I thought of that. Readers would wonder why frag1bytes in
> xdr_commit_encode() was a size_t, but in xdr_get_next_encode_buffer()
> it's a signed int. It started to become a longer string to pull.
> Maybe it's worth it?
> 

Probably worth it.  Not necessarily worth it today.  I have no strong
preference.

NeilBrown

> 
> > Reviewed-by: NeilBrown <neilb@suse.de>
> > 
> > Thanks,
> > NeilBrown
> > 
> > 
> >> 
> >> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> >> ---
> >> net/sunrpc/xdr.c |    4 +++-
> >> 1 file changed, 3 insertions(+), 1 deletion(-)
> >> 
> >> diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
> >> index 08a85375b311..de8f71468637 100644
> >> --- a/net/sunrpc/xdr.c
> >> +++ b/net/sunrpc/xdr.c
> >> @@ -933,14 +933,16 @@ EXPORT_SYMBOL_GPL(xdr_init_encode);
> >>  */
> >> inline void xdr_commit_encode(struct xdr_stream *xdr)
> >> {
> >> -	int shift = xdr->scratch.iov_len;
> >> +	size_t shift = xdr->scratch.iov_len;
> >> 	void *page;
> >> 
> >> 	if (shift == 0)
> >> 		return;
> >> +
> >> 	page = page_address(*xdr->page_ptr);
> >> 	memcpy(xdr->scratch.iov_base, page, shift);
> >> 	memmove(page, page + shift, (void *)xdr->p - page);
> >> +
> >> 	xdr_reset_scratch_buffer(xdr);
> >> }
> >> EXPORT_SYMBOL_GPL(xdr_commit_encode);
> >> 
> >> 
> >> 
> 
> --
> Chuck Lever
> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 1/5] SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer()
  2022-06-07  2:35       ` Chuck Lever III
@ 2022-06-07  3:12         ` Chuck Lever III
  2022-06-07  3:29           ` NeilBrown
  0 siblings, 1 reply; 21+ messages in thread
From: Chuck Lever III @ 2022-06-07  3:12 UTC (permalink / raw)
  To: Neil Brown; +Cc: Bruce Fields, Linux NFS Mailing List


> On Jun 6, 2022, at 10:35 PM, Chuck Lever III <chuck.lever@oracle.com> wrote:
> 
>> On Jun 6, 2022, at 8:59 PM, NeilBrown <neilb@suse.de> wrote:
>> We then assign frag{1,2}bytes and have another chunk of code that looks
>> wrong to me.  I'd like
>> 
>>  if (xdr->iov) {
>> 	xdr->iov->iov_len += frag1bytes;
>> 	xdr->iov = NULL;
>>  } else {
>>       xdr->buf->page_len += frag1bytes;
>>       xdr->page_ptr++;
>>  }
>> 
>> Note that this changes the code NOT to increment pagE_ptr if iov was not
>> NULL.  I cannot see how it would be correct to do that.  Presumably this
>> code is never called with iov != NULL ???
> 
> That strikes me as a good change. I will add it to this series as
> another patch.
> 
> Yes, this code is called by the server's READDIR encoder with iov = NULL.
> See nfsd3_init_dirlist_pages().

This change breaks READDIR, looks like.

--
Chuck Lever




^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 1/5] SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer()
  2022-06-07  3:12         ` Chuck Lever III
@ 2022-06-07  3:29           ` NeilBrown
  0 siblings, 0 replies; 21+ messages in thread
From: NeilBrown @ 2022-06-07  3:29 UTC (permalink / raw)
  To: Chuck Lever III; +Cc: Bruce Fields, Linux NFS Mailing List

On Tue, 07 Jun 2022, Chuck Lever III wrote:
> > On Jun 6, 2022, at 10:35 PM, Chuck Lever III <chuck.lever@oracle.com> wrote:
> > 
> >> On Jun 6, 2022, at 8:59 PM, NeilBrown <neilb@suse.de> wrote:
> >> We then assign frag{1,2}bytes and have another chunk of code that looks
> >> wrong to me.  I'd like
> >> 
> >>  if (xdr->iov) {
> >> 	xdr->iov->iov_len += frag1bytes;
> >> 	xdr->iov = NULL;
> >>  } else {
> >>       xdr->buf->page_len += frag1bytes;
> >>       xdr->page_ptr++;
> >>  }
> >> 
> >> Note that this changes the code NOT to increment pagE_ptr if iov was not
> >> NULL.  I cannot see how it would be correct to do that.  Presumably this
> >> code is never called with iov != NULL ???
> > 
> > That strikes me as a good change. I will add it to this series as
> > another patch.
> > 
> > Yes, this code is called by the server's READDIR encoder with iov = NULL.
> > See nfsd3_init_dirlist_pages().
> 
> This change breaks READDIR, looks like.

Thanks for testing...

I looked again, and I see that svcxdr_init_encode() initialises
->page_ptr to "buf->page - 1".  So we need a +1 when moving from the
'head' to the pages.

Commit 05638dc73af2 ("nfsd4: simplify server xdr->next_page use")

explains why.  I'm not sure I completely agree with the reasoning (head
really is a special case), but it probably isn't worth "fixing".

Thanks,
NeilBrown

> 
> --
> Chuck Lever
> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 0/5] Fix NFSv3 READDIRPLUS failures
  2022-06-05 19:58 [PATCH v1 0/5] Fix NFSv3 READDIRPLUS failures Chuck Lever
                   ` (4 preceding siblings ...)
  2022-06-05 19:58 ` [PATCH v1 5/5] SUNRPC: Remove pointer type casts from xdr_get_next_encode_buffer() Chuck Lever
@ 2022-06-07 11:37 ` NeilBrown
  2022-06-07 14:16   ` Chuck Lever III
  5 siblings, 1 reply; 21+ messages in thread
From: NeilBrown @ 2022-06-07 11:37 UTC (permalink / raw)
  To: Chuck Lever; +Cc: linux-nfs

On Mon, 06 Jun 2022, Chuck Lever wrote:
> While looking into the filecache CPU soft lock-up issue, I ran
> across this problem. I thought I could run it down in just an
> afternoon (I was wrong) and that this problem probably affects more
> users than the soft lock-up (jury's still out).
> 
> Anyway, NFSD's new READDIRPLUS dirent encoder blows past the end of
> the directory payload xdr_stream when the client requests more than
> a page worth of directory entries. I tracked this down to how
> xdr_get_next_encode_buffer() computes xdr->end. First patch in this
> series is the fix. The remaining patches are clean-ups and
> optimizations.
> 
> I want to get this into 5.19-rc quickly. I would appreciate getting
> at least two R-b's for this series.

Just for completeness:
  Reviewed-by: NeilBrown <neilb@suse.de>
for the whole series. Nothing I saw would justify any delay.

NeilBrown

> 
> ---
> 
> Chuck Lever (5):
>       SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer()
>       SUNRPC: Optimize xdr_reserve_space()
>       SUNRPC: Clean up xdr_commit_encode()
>       SUNRPC: Clean up xdr_get_next_encode_buffer()
>       SUNRPC: Remove pointer type casts from xdr_get_next_encode_buffer()
> 
> 
>  net/sunrpc/xdr.c | 31 ++++++++++++++++++++++---------
>  1 file changed, 22 insertions(+), 9 deletions(-)
> 
> --
> Chuck Lever
> 
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 0/5] Fix NFSv3 READDIRPLUS failures
  2022-06-07 11:37 ` [PATCH v1 0/5] Fix NFSv3 READDIRPLUS failures NeilBrown
@ 2022-06-07 14:16   ` Chuck Lever III
  0 siblings, 0 replies; 21+ messages in thread
From: Chuck Lever III @ 2022-06-07 14:16 UTC (permalink / raw)
  To: Neil Brown; +Cc: Linux NFS Mailing List



> On Jun 7, 2022, at 7:37 AM, NeilBrown <neilb@suse.de> wrote:
> 
> On Mon, 06 Jun 2022, Chuck Lever wrote:
>> While looking into the filecache CPU soft lock-up issue, I ran
>> across this problem. I thought I could run it down in just an
>> afternoon (I was wrong) and that this problem probably affects more
>> users than the soft lock-up (jury's still out).
>> 
>> Anyway, NFSD's new READDIRPLUS dirent encoder blows past the end of
>> the directory payload xdr_stream when the client requests more than
>> a page worth of directory entries. I tracked this down to how
>> xdr_get_next_encode_buffer() computes xdr->end. First patch in this
>> series is the fix. The remaining patches are clean-ups and
>> optimizations.
>> 
>> I want to get this into 5.19-rc quickly. I would appreciate getting
>> at least two R-b's for this series.
> 
> Just for completeness:
>  Reviewed-by: NeilBrown <neilb@suse.de>
> for the whole series. Nothing I saw would justify any delay.

Thanks for your review. I have some thoughts about how to deal
idiomatically with your inlining optimization suggestion. I will
post a v2, I think that patch will be the only update.


> NeilBrown
> 
>> 
>> ---
>> 
>> Chuck Lever (5):
>>      SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer()
>>      SUNRPC: Optimize xdr_reserve_space()
>>      SUNRPC: Clean up xdr_commit_encode()
>>      SUNRPC: Clean up xdr_get_next_encode_buffer()
>>      SUNRPC: Remove pointer type casts from xdr_get_next_encode_buffer()
>> 
>> 
>> net/sunrpc/xdr.c | 31 ++++++++++++++++++++++---------
>> 1 file changed, 22 insertions(+), 9 deletions(-)
>> 
>> --
>> Chuck Lever

--
Chuck Lever




^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2022-06-07 14:16 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-05 19:58 [PATCH v1 0/5] Fix NFSv3 READDIRPLUS failures Chuck Lever
2022-06-05 19:58 ` [PATCH v1 1/5] SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer() Chuck Lever
2022-06-06 22:09   ` J. Bruce Fields
2022-06-07  0:59     ` NeilBrown
2022-06-07  2:35       ` Chuck Lever III
2022-06-07  3:12         ` Chuck Lever III
2022-06-07  3:29           ` NeilBrown
2022-06-05 19:58 ` [PATCH v1 2/5] SUNRPC: Optimize xdr_reserve_space() Chuck Lever
2022-06-07  1:03   ` NeilBrown
2022-06-07  2:19     ` Chuck Lever III
2022-06-07  2:35       ` NeilBrown
2022-06-05 19:58 ` [PATCH v1 3/5] SUNRPC: Clean up xdr_commit_encode() Chuck Lever
2022-06-07  1:05   ` NeilBrown
2022-06-07  2:21     ` Chuck Lever III
2022-06-07  2:37       ` NeilBrown
2022-06-05 19:58 ` [PATCH v1 4/5] SUNRPC: Clean up xdr_get_next_encode_buffer() Chuck Lever
2022-06-07  1:07   ` NeilBrown
2022-06-05 19:58 ` [PATCH v1 5/5] SUNRPC: Remove pointer type casts from xdr_get_next_encode_buffer() Chuck Lever
2022-06-07  1:08   ` NeilBrown
2022-06-07 11:37 ` [PATCH v1 0/5] Fix NFSv3 READDIRPLUS failures NeilBrown
2022-06-07 14:16   ` Chuck Lever III

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.