* [PATCH 0/8] Fix various issues in the SUNRPC xdr code @ 2020-11-22 20:52 trondmy 2020-11-22 20:52 ` [PATCH 1/8] NFSv4: Fix the alignment of page data in the getdeviceinfo reply trondmy 0 siblings, 1 reply; 15+ messages in thread From: trondmy @ 2020-11-22 20:52 UTC (permalink / raw) To: linux-nfs From: Trond Myklebust <trond.myklebust@hammerspace.com> When looking at the issues raised by Tigran's testing of the NFS client updates, I noticed a couple of things in the generic SUNRPC xdr code that want to be fixed. This patch series replaces an earlier series that attempted to just fix the XDR padding in the NFS code. This series fixes up a number of issues w.r.t. bounds checking in the xdr_stream code. It corrects the behaviour of xdr_read_pages() for the case where the XDR object size is larger than the buffer page array length and simplifies the code. Trond Myklebust (8): NFSv4: Fix the alignment of page data in the getdeviceinfo reply SUNRPC: Fix up typo in xdr_init_decode() SUNRPC: Clean up helpers xdr_set_iov() and xdr_set_page_base() SUNRPC: Fix up xdr_read_pages() to take arbitrary object lengths SUNRPC: Don't truncate tail in xdr_inline_pages() SUNRPC: Fix up xdr_set_page() SUNRPC: Fix open coded xdr_stream_remaining() NFSv4: Fix open coded xdr_stream_remaining() fs/nfs/nfs4xdr.c | 16 +++++--- net/sunrpc/xdr.c | 101 +++++++++++++++++++++-------------------------- 2 files changed, 56 insertions(+), 61 deletions(-) -- 2.28.0 ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 1/8] NFSv4: Fix the alignment of page data in the getdeviceinfo reply 2020-11-22 20:52 [PATCH 0/8] Fix various issues in the SUNRPC xdr code trondmy @ 2020-11-22 20:52 ` trondmy 2020-11-22 20:52 ` [PATCH 2/8] SUNRPC: Fix up typo in xdr_init_decode() trondmy 0 siblings, 1 reply; 15+ messages in thread From: trondmy @ 2020-11-22 20:52 UTC (permalink / raw) To: linux-nfs From: Trond Myklebust <trond.myklebust@hammerspace.com> We can fit the device_addr4 opaque data padding in the pages. Fixes: cf500bac8fd4 ("SUNRPC: Introduce rpc_prepare_reply_pages()") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> --- fs/nfs/nfs4xdr.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c index c6dbfcae7517..9913aec3ee32 100644 --- a/fs/nfs/nfs4xdr.c +++ b/fs/nfs/nfs4xdr.c @@ -3009,15 +3009,19 @@ static void nfs4_xdr_enc_getdeviceinfo(struct rpc_rqst *req, struct compound_hdr hdr = { .minorversion = nfs4_xdr_minorversion(&args->seq_args), }; + uint32_t replen; encode_compound_hdr(xdr, req, &hdr); encode_sequence(xdr, &args->seq_args, &hdr); + + replen = hdr.replen + op_decode_hdr_maxsz; + encode_getdeviceinfo(xdr, args, &hdr); - /* set up reply kvec. Subtract notification bitmap max size (2) - * so that notification bitmap is put in xdr_buf tail */ + /* set up reply kvec. device_addr4 opaque data is read into the + * pages */ rpc_prepare_reply_pages(req, args->pdev->pages, args->pdev->pgbase, - args->pdev->pglen, hdr.replen - 2); + args->pdev->pglen, replen + 2); encode_nops(&hdr); } -- 2.28.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 2/8] SUNRPC: Fix up typo in xdr_init_decode() 2020-11-22 20:52 ` [PATCH 1/8] NFSv4: Fix the alignment of page data in the getdeviceinfo reply trondmy @ 2020-11-22 20:52 ` trondmy 2020-11-22 20:52 ` [PATCH 3/8] SUNRPC: Clean up helpers xdr_set_iov() and xdr_set_page_base() trondmy 0 siblings, 1 reply; 15+ messages in thread From: trondmy @ 2020-11-22 20:52 UTC (permalink / raw) To: linux-nfs From: Trond Myklebust <trond.myklebust@hammerspace.com> We already know that the head buffer and page are empty, so if there is any data, it is in the tail. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> --- net/sunrpc/xdr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c index b1cda6d85ded..bc7a622016ee 100644 --- a/net/sunrpc/xdr.c +++ b/net/sunrpc/xdr.c @@ -1060,7 +1060,7 @@ void xdr_init_decode(struct xdr_stream *xdr, struct xdr_buf *buf, __be32 *p, else if (buf->page_len != 0) xdr_set_page_base(xdr, 0, buf->len); else - xdr_set_iov(xdr, buf->head, buf->len); + xdr_set_iov(xdr, buf->tail, buf->len); if (p != NULL && p > xdr->p && xdr->end >= p) { xdr->nwords -= p - xdr->p; xdr->p = p; -- 2.28.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 3/8] SUNRPC: Clean up helpers xdr_set_iov() and xdr_set_page_base() 2020-11-22 20:52 ` [PATCH 2/8] SUNRPC: Fix up typo in xdr_init_decode() trondmy @ 2020-11-22 20:52 ` trondmy 2020-11-22 20:52 ` [PATCH 4/8] SUNRPC: Fix up xdr_read_pages() to take arbitrary object lengths trondmy 0 siblings, 1 reply; 15+ messages in thread From: trondmy @ 2020-11-22 20:52 UTC (permalink / raw) To: linux-nfs From: Trond Myklebust <trond.myklebust@hammerspace.com> Allow xdr_set_iov() to set a base so that we can use it to set the cursor to a specific position in the kvec buffer. If the new base overflows the kvec/pages buffer in either xdr_set_iov() or xdr_set_page_base(), then truncate it so that we point to the end of the buffer. Finally, change both function to return the number of bytes remaining to read in their buffers. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> --- net/sunrpc/xdr.c | 36 +++++++++++++++++++----------------- 1 file changed, 19 insertions(+), 17 deletions(-) diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c index bc7a622016ee..394297ec1cb9 100644 --- a/net/sunrpc/xdr.c +++ b/net/sunrpc/xdr.c @@ -970,19 +970,22 @@ void xdr_write_pages(struct xdr_stream *xdr, struct page **pages, unsigned int b } EXPORT_SYMBOL_GPL(xdr_write_pages); -static void xdr_set_iov(struct xdr_stream *xdr, struct kvec *iov, - unsigned int len) +static unsigned int xdr_set_iov(struct xdr_stream *xdr, struct kvec *iov, + unsigned int base, unsigned int len) { if (len > iov->iov_len) len = iov->iov_len; - xdr->p = (__be32*)iov->iov_base; + if (unlikely(base > len)) + base = len; + xdr->p = (__be32*)(iov->iov_base + base); xdr->end = (__be32*)(iov->iov_base + len); xdr->iov = iov; xdr->page_ptr = NULL; + return len - base; } -static int xdr_set_page_base(struct xdr_stream *xdr, - unsigned int base, unsigned int len) +static unsigned int xdr_set_page_base(struct xdr_stream *xdr, + unsigned int base, unsigned int len) { unsigned int pgnr; unsigned int maxlen; @@ -991,9 +994,11 @@ static int xdr_set_page_base(struct xdr_stream *xdr, void *kaddr; maxlen = xdr->buf->page_len; - if (base >= maxlen) - return -EINVAL; - maxlen -= base; + if (base >= maxlen) { + base = maxlen; + maxlen = 0; + } else + maxlen -= base; if (len > maxlen) len = maxlen; @@ -1011,14 +1016,14 @@ static int xdr_set_page_base(struct xdr_stream *xdr, pgend = PAGE_SIZE; xdr->end = (__be32*)(kaddr + pgend); xdr->iov = NULL; - return 0; + return len; } static void xdr_set_page(struct xdr_stream *xdr, unsigned int base, unsigned int len) { - if (xdr_set_page_base(xdr, base, len) < 0) - xdr_set_iov(xdr, xdr->buf->tail, xdr->nwords << 2); + if (xdr_set_page_base(xdr, base, len) == 0) + xdr_set_iov(xdr, xdr->buf->tail, 0, xdr_stream_remaining(xdr)); } static void xdr_set_next_page(struct xdr_stream *xdr) @@ -1055,12 +1060,9 @@ void xdr_init_decode(struct xdr_stream *xdr, struct xdr_buf *buf, __be32 *p, xdr->scratch.iov_base = NULL; xdr->scratch.iov_len = 0; xdr->nwords = XDR_QUADLEN(buf->len); - if (buf->head[0].iov_len != 0) - xdr_set_iov(xdr, buf->head, buf->len); - else if (buf->page_len != 0) - xdr_set_page_base(xdr, 0, buf->len); - else - xdr_set_iov(xdr, buf->tail, buf->len); + if (xdr_set_iov(xdr, buf->head, 0, buf->len) == 0 && + xdr_set_page_base(xdr, 0, buf->len) == 0) + xdr_set_iov(xdr, buf->tail, 0, buf->len); if (p != NULL && p > xdr->p && xdr->end >= p) { xdr->nwords -= p - xdr->p; xdr->p = p; -- 2.28.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 4/8] SUNRPC: Fix up xdr_read_pages() to take arbitrary object lengths 2020-11-22 20:52 ` [PATCH 3/8] SUNRPC: Clean up helpers xdr_set_iov() and xdr_set_page_base() trondmy @ 2020-11-22 20:52 ` trondmy 2020-11-22 20:52 ` [PATCH 5/8] SUNRPC: Don't truncate tail in xdr_inline_pages() trondmy 0 siblings, 1 reply; 15+ messages in thread From: trondmy @ 2020-11-22 20:52 UTC (permalink / raw) To: linux-nfs From: Trond Myklebust <trond.myklebust@hammerspace.com> Fix up xdr_read_pages() so that it can handle object lengths that are larger than the page length, by simply aligning to the next object in the buffer tail. The function will continue to return the length of the truncate object data that actually fit into the pages. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> --- net/sunrpc/xdr.c | 41 +++++++++++++++-------------------------- 1 file changed, 15 insertions(+), 26 deletions(-) diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c index 394297ec1cb9..3ce0a5daa9eb 100644 --- a/net/sunrpc/xdr.c +++ b/net/sunrpc/xdr.c @@ -1219,44 +1219,33 @@ static unsigned int xdr_align_pages(struct xdr_stream *xdr, unsigned int len) } /** - * xdr_read_pages - Ensure page-based XDR data to decode is aligned at current pointer position + * xdr_read_pages - align page-based XDR data to current pointer position * @xdr: pointer to xdr_stream struct * @len: number of bytes of page data * * Moves data beyond the current pointer position from the XDR head[] buffer - * into the page list. Any data that lies beyond current position + "len" - * bytes is moved into the XDR tail[]. + * into the page list. Any data that lies beyond current position + @len + * bytes is moved into the XDR tail[]. The xdr_stream current position is + * then advanced past that data to align to the next XDR object in the tail. * * Returns the number of XDR encoded bytes now contained in the pages */ unsigned int xdr_read_pages(struct xdr_stream *xdr, unsigned int len) { - struct xdr_buf *buf = xdr->buf; - struct kvec *iov; - unsigned int nwords; - unsigned int end; - unsigned int padding; + unsigned int nwords = XDR_QUADLEN(len); + unsigned int base, end, pglen; - len = xdr_align_pages(xdr, len); - if (len == 0) + pglen = xdr_align_pages(xdr, nwords << 2); + if (pglen == 0) return 0; - nwords = XDR_QUADLEN(len); - padding = (nwords << 2) - len; - xdr->iov = iov = buf->tail; - /* Compute remaining message length. */ - end = ((xdr->nwords - nwords) << 2) + padding; - if (end > iov->iov_len) - end = iov->iov_len; - /* - * Position current pointer at beginning of tail, and - * set remaining message length. - */ - xdr->p = (__be32 *)((char *)iov->iov_base + padding); - xdr->end = (__be32 *)((char *)iov->iov_base + end); - xdr->page_ptr = NULL; - xdr->nwords = XDR_QUADLEN(end - padding); - return len; + xdr->nwords -= nwords; + base = (nwords << 2) - pglen; + end = xdr_stream_remaining(xdr) - pglen; + + if (xdr_set_iov(xdr, xdr->buf->tail, base, end) == 0) + xdr->nwords = 0; + return len <= pglen ? len : pglen; } EXPORT_SYMBOL_GPL(xdr_read_pages); -- 2.28.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 5/8] SUNRPC: Don't truncate tail in xdr_inline_pages() 2020-11-22 20:52 ` [PATCH 4/8] SUNRPC: Fix up xdr_read_pages() to take arbitrary object lengths trondmy @ 2020-11-22 20:52 ` trondmy 2020-11-22 20:52 ` [PATCH 6/8] SUNRPC: Fix up xdr_set_page() trondmy ` (2 more replies) 0 siblings, 3 replies; 15+ messages in thread From: trondmy @ 2020-11-22 20:52 UTC (permalink / raw) To: linux-nfs From: Trond Myklebust <trond.myklebust@hammerspace.com> True that if the length of the pages[] array is not 4-byte aligned, then we will need to store the padding in the tail, but there is no need to truncate the total buffer length here. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> --- net/sunrpc/xdr.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c index 3ce0a5daa9eb..5a450055469f 100644 --- a/net/sunrpc/xdr.c +++ b/net/sunrpc/xdr.c @@ -193,9 +193,6 @@ xdr_inline_pages(struct xdr_buf *xdr, unsigned int offset, tail->iov_base = buf + offset; tail->iov_len = buflen - offset; - if ((xdr->page_len & 3) == 0) - tail->iov_len -= sizeof(__be32); - xdr->buflen += len; } EXPORT_SYMBOL_GPL(xdr_inline_pages); -- 2.28.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 6/8] SUNRPC: Fix up xdr_set_page() 2020-11-22 20:52 ` [PATCH 5/8] SUNRPC: Don't truncate tail in xdr_inline_pages() trondmy @ 2020-11-22 20:52 ` trondmy 2020-11-22 20:52 ` [PATCH 7/8] SUNRPC: Fix open coded xdr_stream_remaining() trondmy 2020-11-23 1:24 ` [PATCH 5/8] SUNRPC: Don't truncate tail in xdr_inline_pages() Chuck Lever 2020-11-23 17:24 ` Anna Schumaker 2 siblings, 1 reply; 15+ messages in thread From: trondmy @ 2020-11-22 20:52 UTC (permalink / raw) To: linux-nfs From: Trond Myklebust <trond.myklebust@hammerspace.com> While we always want to align to the next page and/or the beginning of the tail buffer when we call xdr_set_next_page(), the functions xdr_align_data() and xdr_expand_hole() really want to align to the next object in that next page or tail. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> --- net/sunrpc/xdr.c | 21 ++++++++++++--------- 1 file changed, 12 insertions(+), 9 deletions(-) diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c index 5a450055469f..ddd5cc2281ab 100644 --- a/net/sunrpc/xdr.c +++ b/net/sunrpc/xdr.c @@ -1019,8 +1019,10 @@ static unsigned int xdr_set_page_base(struct xdr_stream *xdr, static void xdr_set_page(struct xdr_stream *xdr, unsigned int base, unsigned int len) { - if (xdr_set_page_base(xdr, base, len) == 0) - xdr_set_iov(xdr, xdr->buf->tail, 0, xdr_stream_remaining(xdr)); + if (xdr_set_page_base(xdr, base, len) == 0) { + base -= xdr->buf->page_len; + xdr_set_iov(xdr, xdr->buf->tail, base, len); + } } static void xdr_set_next_page(struct xdr_stream *xdr) @@ -1029,17 +1031,18 @@ static void xdr_set_next_page(struct xdr_stream *xdr) newbase = (1 + xdr->page_ptr - xdr->buf->pages) << PAGE_SHIFT; newbase -= xdr->buf->page_base; - - xdr_set_page(xdr, newbase, PAGE_SIZE); + if (newbase < xdr->buf->page_len) + xdr_set_page_base(xdr, newbase, xdr_stream_remaining(xdr)); + else + xdr_set_iov(xdr, xdr->buf->tail, 0, xdr_stream_remaining(xdr)); } static bool xdr_set_next_buffer(struct xdr_stream *xdr) { if (xdr->page_ptr != NULL) xdr_set_next_page(xdr); - else if (xdr->iov == xdr->buf->head) { - xdr_set_page(xdr, 0, PAGE_SIZE); - } + else if (xdr->iov == xdr->buf->head) + xdr_set_page(xdr, 0, xdr_stream_remaining(xdr)); return xdr->p != xdr->end; } @@ -1277,7 +1280,7 @@ uint64_t xdr_align_data(struct xdr_stream *xdr, uint64_t offset, uint32_t length } xdr->nwords -= XDR_QUADLEN(length); - xdr_set_page(xdr, from + length, PAGE_SIZE); + xdr_set_page(xdr, from + length, xdr_stream_remaining(xdr)); return length; } EXPORT_SYMBOL_GPL(xdr_align_data); @@ -1314,7 +1317,7 @@ uint64_t xdr_expand_hole(struct xdr_stream *xdr, uint64_t offset, uint64_t lengt _zero_pages(buf->pages, buf->page_base + offset, length); buf->len += length - (from - offset) - truncated; - xdr_set_page(xdr, offset + length, PAGE_SIZE); + xdr_set_page(xdr, offset + length, xdr_stream_remaining(xdr)); return length; } EXPORT_SYMBOL_GPL(xdr_expand_hole); -- 2.28.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 7/8] SUNRPC: Fix open coded xdr_stream_remaining() 2020-11-22 20:52 ` [PATCH 6/8] SUNRPC: Fix up xdr_set_page() trondmy @ 2020-11-22 20:52 ` trondmy 2020-11-22 20:52 ` [PATCH 8/8] NFSv4: " trondmy 0 siblings, 1 reply; 15+ messages in thread From: trondmy @ 2020-11-22 20:52 UTC (permalink / raw) To: linux-nfs From: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> --- net/sunrpc/xdr.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c index ddd5cc2281ab..c852d199c789 100644 --- a/net/sunrpc/xdr.c +++ b/net/sunrpc/xdr.c @@ -1261,7 +1261,7 @@ uint64_t xdr_align_data(struct xdr_stream *xdr, uint64_t offset, uint32_t length xdr_realign_pages(xdr); from = xdr_page_pos(xdr); - bytes = xdr->nwords << 2; + bytes = xdr_stream_remaining(xdr); if (length < bytes) bytes = length; @@ -1298,7 +1298,7 @@ uint64_t xdr_expand_hole(struct xdr_stream *xdr, uint64_t offset, uint64_t lengt xdr_realign_pages(xdr); from = xdr_page_pos(xdr); - bytes = xdr->nwords << 2; + bytes = xdr_stream_remaining(xdr); if (offset + length + bytes > buf->page_len) { unsigned int shift = (offset + length + bytes) - buf->page_len; -- 2.28.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 8/8] NFSv4: Fix open coded xdr_stream_remaining() 2020-11-22 20:52 ` [PATCH 7/8] SUNRPC: Fix open coded xdr_stream_remaining() trondmy @ 2020-11-22 20:52 ` trondmy 0 siblings, 0 replies; 15+ messages in thread From: trondmy @ 2020-11-22 20:52 UTC (permalink / raw) To: linux-nfs From: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> --- fs/nfs/nfs4xdr.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c index 9913aec3ee32..910468c77c58 100644 --- a/fs/nfs/nfs4xdr.c +++ b/fs/nfs/nfs4xdr.c @@ -5335,11 +5335,11 @@ static int decode_getacl(struct xdr_stream *xdr, struct rpc_rqst *req, res->acl_len = attrlen; /* Check for receive buffer overflow */ - if (res->acl_len > (xdr->nwords << 2) || + if (res->acl_len > xdr_stream_remaining(xdr) || res->acl_len + res->acl_data_offset > xdr->buf->page_len) { res->acl_flags |= NFS4_ACL_TRUNC; - dprintk("NFS: acl reply: attrlen %u > page_len %u\n", - attrlen, xdr->nwords << 2); + dprintk("NFS: acl reply: attrlen %u > page_len %zu\n", + attrlen, xdr_stream_remaining(xdr)); } } else status = -EOPNOTSUPP; -- 2.28.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 5/8] SUNRPC: Don't truncate tail in xdr_inline_pages() 2020-11-22 20:52 ` [PATCH 5/8] SUNRPC: Don't truncate tail in xdr_inline_pages() trondmy 2020-11-22 20:52 ` [PATCH 6/8] SUNRPC: Fix up xdr_set_page() trondmy @ 2020-11-23 1:24 ` Chuck Lever 2020-11-23 4:29 ` Trond Myklebust 2020-11-23 17:24 ` Anna Schumaker 2 siblings, 1 reply; 15+ messages in thread From: Chuck Lever @ 2020-11-23 1:24 UTC (permalink / raw) To: trondmy; +Cc: Linux NFS Mailing List > On Nov 22, 2020, at 3:52 PM, trondmy@kernel.org wrote: > > From: Trond Myklebust <trond.myklebust@hammerspace.com> > > True that if the length of the pages[] array is not 4-byte aligned, then > we will need to store the padding in the tail, but there is no need to > truncate the total buffer length here. This description confuses me. The existing code reduces the length of the tail, not the "total buffer length." And what the removed logic is doing is taking out the length of the XDR pad for the pages array when it is not expected to be used. > Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> > --- > net/sunrpc/xdr.c | 3 --- > 1 file changed, 3 deletions(-) > > diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c > index 3ce0a5daa9eb..5a450055469f 100644 > --- a/net/sunrpc/xdr.c > +++ b/net/sunrpc/xdr.c > @@ -193,9 +193,6 @@ xdr_inline_pages(struct xdr_buf *xdr, unsigned int offset, > > tail->iov_base = buf + offset; > tail->iov_len = buflen - offset; > - if ((xdr->page_len & 3) == 0) > - tail->iov_len -= sizeof(__be32); > - > xdr->buflen += len; > } > EXPORT_SYMBOL_GPL(xdr_inline_pages); > -- > 2.28.0 > -- Chuck Lever ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 5/8] SUNRPC: Don't truncate tail in xdr_inline_pages() 2020-11-23 1:24 ` [PATCH 5/8] SUNRPC: Don't truncate tail in xdr_inline_pages() Chuck Lever @ 2020-11-23 4:29 ` Trond Myklebust 2020-11-23 14:52 ` Chuck Lever 0 siblings, 1 reply; 15+ messages in thread From: Trond Myklebust @ 2020-11-23 4:29 UTC (permalink / raw) To: chuck.lever, trondmy; +Cc: linux-nfs On Sun, 2020-11-22 at 20:24 -0500, Chuck Lever wrote: > > > > On Nov 22, 2020, at 3:52 PM, trondmy@kernel.org wrote: > > > > From: Trond Myklebust <trond.myklebust@hammerspace.com> > > > > True that if the length of the pages[] array is not 4-byte aligned, > > then > > we will need to store the padding in the tail, but there is no need > > to > > truncate the total buffer length here. > > This description confuses me. The existing code reduces the length of > the tail, not the "total buffer length." And what the removed logic > is > doing is taking out the length of the XDR pad for the pages array > when > it is not expected to be used. Why are we bothering to do that? There is nothing problematic with just ignoring this test and leaving the tail length as it is, nor is there anything to be gained by applying it. > > Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> > > --- > > net/sunrpc/xdr.c | 3 --- > > 1 file changed, 3 deletions(-) > > > > diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c > > index 3ce0a5daa9eb..5a450055469f 100644 > > --- a/net/sunrpc/xdr.c > > +++ b/net/sunrpc/xdr.c > > @@ -193,9 +193,6 @@ xdr_inline_pages(struct xdr_buf *xdr, unsigned > > int offset, > > > > tail->iov_base = buf + offset; > > tail->iov_len = buflen - offset; > > - if ((xdr->page_len & 3) == 0) > > - tail->iov_len -= sizeof(__be32); > > - > > xdr->buflen += len; > > } > > EXPORT_SYMBOL_GPL(xdr_inline_pages); > > -- > > 2.28.0 > > > > -- > Chuck Lever > > > -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 5/8] SUNRPC: Don't truncate tail in xdr_inline_pages() 2020-11-23 4:29 ` Trond Myklebust @ 2020-11-23 14:52 ` Chuck Lever 2020-11-23 15:37 ` Trond Myklebust 0 siblings, 1 reply; 15+ messages in thread From: Chuck Lever @ 2020-11-23 14:52 UTC (permalink / raw) To: Trond Myklebust; +Cc: trondmy, Linux NFS Mailing List > On Nov 22, 2020, at 11:29 PM, Trond Myklebust <trondmy@hammerspace.com> wrote: > > On Sun, 2020-11-22 at 20:24 -0500, Chuck Lever wrote: >> >> >>> On Nov 22, 2020, at 3:52 PM, trondmy@kernel.org wrote: >>> >>> From: Trond Myklebust <trond.myklebust@hammerspace.com> >>> >>> True that if the length of the pages[] array is not 4-byte aligned, >>> then >>> we will need to store the padding in the tail, but there is no need >>> to >>> truncate the total buffer length here. >> >> This description confuses me. The existing code reduces the length of >> the tail, not the "total buffer length." And what the removed logic >> is >> doing is taking out the length of the XDR pad for the pages array >> when >> it is not expected to be used. > > Why are we bothering to do that? There is nothing problematic with just > ignoring this test and leaving the tail length as it is, nor is there > anything to be gained by applying it. You are correct that leaving the buffer a little long is not going to harm normal operation. After all, we lived with a wildly over- estimated slack length for years. The purpose of this code path is to prepare the receive buffer with the memory resources and expected length of the Reply. The series of patches that introduced this particular change was all about ensuring that the estimated length of the reply message was exact. If the reply message size is overestimated, that moves the end-of- message sentinel that is later set by xdr_init_decode(). We then miss subtle problems like our fixed size estimates are incorrect or a man-in-the-middle is extending the RPC message or the server is malfunctioning. <scratches chin> After moving the ->pages pad into ->pages, I'm wondering if you should revert 02ef04e432ba ("NFS: Account for XDR pad of buf->pages") -- the maxsz macros don't need to account for the XDR pad of ->pages any more. Then the below hunk makes sense. The patch description still doesn't, though ;-) And then you should confirm that we are still getting the receive buffer size estimate right for krb5i and krb5p. >>> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> >>> --- >>> net/sunrpc/xdr.c | 3 --- >>> 1 file changed, 3 deletions(-) >>> >>> diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c >>> index 3ce0a5daa9eb..5a450055469f 100644 >>> --- a/net/sunrpc/xdr.c >>> +++ b/net/sunrpc/xdr.c >>> @@ -193,9 +193,6 @@ xdr_inline_pages(struct xdr_buf *xdr, unsigned >>> int offset, >>> >>> tail->iov_base = buf + offset; >>> tail->iov_len = buflen - offset; >>> - if ((xdr->page_len & 3) == 0) >>> - tail->iov_len -= sizeof(__be32); >>> - >>> xdr->buflen += len; >>> } >>> EXPORT_SYMBOL_GPL(xdr_inline_pages); >>> -- >>> 2.28.0 >>> >> >> -- >> Chuck Lever >> >> >> > > -- > Trond Myklebust > Linux NFS client maintainer, Hammerspace > trond.myklebust@hammerspace.com -- Chuck Lever ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 5/8] SUNRPC: Don't truncate tail in xdr_inline_pages() 2020-11-23 14:52 ` Chuck Lever @ 2020-11-23 15:37 ` Trond Myklebust 2020-11-23 15:47 ` Chuck Lever 0 siblings, 1 reply; 15+ messages in thread From: Trond Myklebust @ 2020-11-23 15:37 UTC (permalink / raw) To: chuck.lever; +Cc: linux-nfs On Mon, 2020-11-23 at 09:52 -0500, Chuck Lever wrote: > > > > On Nov 22, 2020, at 11:29 PM, Trond Myklebust < > > trondmy@hammerspace.com> wrote: > > > > On Sun, 2020-11-22 at 20:24 -0500, Chuck Lever wrote: > > > > > > > > > > On Nov 22, 2020, at 3:52 PM, trondmy@kernel.org wrote: > > > > > > > > From: Trond Myklebust <trond.myklebust@hammerspace.com> > > > > > > > > True that if the length of the pages[] array is not 4-byte > > > > aligned, > > > > then > > > > we will need to store the padding in the tail, but there is no > > > > need > > > > to > > > > truncate the total buffer length here. > > > > > > This description confuses me. The existing code reduces the > > > length of > > > the tail, not the "total buffer length." And what the removed > > > logic > > > is > > > doing is taking out the length of the XDR pad for the pages array > > > when > > > it is not expected to be used. > > > > Why are we bothering to do that? There is nothing problematic with > > just > > ignoring this test and leaving the tail length as it is, nor is > > there > > anything to be gained by applying it. > > You are correct that leaving the buffer a little long is not going > to harm normal operation. After all, we lived with a wildly over- > estimated slack length for years. > > The purpose of this code path is to prepare the receive buffer with > the memory resources and expected length of the Reply. The series > of patches that introduced this particular change was all about > ensuring that the estimated length of the reply message was exact. > > If the reply message size is overestimated, that moves the end-of- > message sentinel that is later set by xdr_init_decode(). We then > miss subtle problems like our fixed size estimates are incorrect > or a man-in-the-middle is extending the RPC message or the server > is malfunctioning. > > <scratches chin> > > After moving the ->pages pad into ->pages, I'm wondering if you > should revert 02ef04e432ba ("NFS: Account for XDR pad of buf->pages") > -- > the maxsz macros don't need to account for the XDR pad of ->pages > any more. Then the below hunk makes sense. The patch description > still doesn't, though ;-) > I don't think it needs to be reverted. I think you are right to include the padding in the buffer size that we use to set the value of task- >tk_rqstp->rq_rcvsize. That said, it seems wrong to include that padding as part of the 'hdrsize' argument in rpc_prepare_reply_pages(). That just causes confusion, because the padding is not part of the header in front of the array of pages. It is part of the tail data after the array of pages. So I think a cleanup there may be warranted. The other thing that I'm considering is that we may want to optimise to avoid setting up an RDMA SEND just for the padding if that is truly the last word in the RPC call (it matters less if there is other data that requires us to set up such a SEND anyway). Not sure how to do that in a clean manner, though. Perhaps we'd have to pass in the padding size as a separate argument to xdr_inline_pages() (and also to rpc_prepare_reply_pages())? > And then you should confirm that we are still getting the receive > buffer size estimate right for krb5i and krb5p. > > > > > > Signed-off-by: Trond Myklebust > > > > <trond.myklebust@hammerspace.com> > > > > --- > > > > net/sunrpc/xdr.c | 3 --- > > > > 1 file changed, 3 deletions(-) > > > > > > > > diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c > > > > index 3ce0a5daa9eb..5a450055469f 100644 > > > > --- a/net/sunrpc/xdr.c > > > > +++ b/net/sunrpc/xdr.c > > > > @@ -193,9 +193,6 @@ xdr_inline_pages(struct xdr_buf *xdr, > > > > unsigned > > > > int offset, > > > > > > > > tail->iov_base = buf + offset; > > > > tail->iov_len = buflen - offset; > > > > - if ((xdr->page_len & 3) == 0) > > > > - tail->iov_len -= sizeof(__be32); > > > > - > > > > xdr->buflen += len; > > > > } > > > > EXPORT_SYMBOL_GPL(xdr_inline_pages); > > > > -- > > > > 2.28.0 > -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 5/8] SUNRPC: Don't truncate tail in xdr_inline_pages() 2020-11-23 15:37 ` Trond Myklebust @ 2020-11-23 15:47 ` Chuck Lever 0 siblings, 0 replies; 15+ messages in thread From: Chuck Lever @ 2020-11-23 15:47 UTC (permalink / raw) To: Trond Myklebust; +Cc: Linux NFS Mailing List > On Nov 23, 2020, at 10:37 AM, Trond Myklebust <trondmy@hammerspace.com> wrote: > > On Mon, 2020-11-23 at 09:52 -0500, Chuck Lever wrote: >> >> >>> On Nov 22, 2020, at 11:29 PM, Trond Myklebust < >>> trondmy@hammerspace.com> wrote: >>> >>> On Sun, 2020-11-22 at 20:24 -0500, Chuck Lever wrote: >>>> >>>> >>>>> On Nov 22, 2020, at 3:52 PM, trondmy@kernel.org wrote: >>>>> >>>>> From: Trond Myklebust <trond.myklebust@hammerspace.com> >>>>> >>>>> True that if the length of the pages[] array is not 4-byte >>>>> aligned, >>>>> then >>>>> we will need to store the padding in the tail, but there is no >>>>> need >>>>> to >>>>> truncate the total buffer length here. >>>> >>>> This description confuses me. The existing code reduces the >>>> length of >>>> the tail, not the "total buffer length." And what the removed >>>> logic >>>> is >>>> doing is taking out the length of the XDR pad for the pages array >>>> when >>>> it is not expected to be used. >>> >>> Why are we bothering to do that? There is nothing problematic with >>> just >>> ignoring this test and leaving the tail length as it is, nor is >>> there >>> anything to be gained by applying it. >> >> You are correct that leaving the buffer a little long is not going >> to harm normal operation. After all, we lived with a wildly over- >> estimated slack length for years. >> >> The purpose of this code path is to prepare the receive buffer with >> the memory resources and expected length of the Reply. The series >> of patches that introduced this particular change was all about >> ensuring that the estimated length of the reply message was exact. >> >> If the reply message size is overestimated, that moves the end-of- >> message sentinel that is later set by xdr_init_decode(). We then >> miss subtle problems like our fixed size estimates are incorrect >> or a man-in-the-middle is extending the RPC message or the server >> is malfunctioning. >> >> <scratches chin> >> >> After moving the ->pages pad into ->pages, I'm wondering if you >> should revert 02ef04e432ba ("NFS: Account for XDR pad of buf->pages") >> -- >> the maxsz macros don't need to account for the XDR pad of ->pages >> any more. Then the below hunk makes sense. The patch description >> still doesn't, though ;-) >> > > I don't think it needs to be reverted. I think you are right to include > the padding in the buffer size that we use to set the value of task- >> tk_rqstp->rq_rcvsize. > > That said, it seems wrong to include that padding as part of the > 'hdrsize' argument in rpc_prepare_reply_pages(). That just causes > confusion, because the padding is not part of the header in front of > the array of pages. It is part of the tail data after the array of > pages. So I think a cleanup there may be warranted. Agreed, dealing with the tail size is confusing. > The other thing that I'm considering is that we may want to optimise to > avoid setting up an RDMA SEND just for the padding if that is truly the > last word in the RPC call (it matters less if there is other data that > requires us to set up such a SEND anyway). Not sure how to do that in a > clean manner, though. Perhaps we'd have to pass in the padding size as > a separate argument to xdr_inline_pages() (and also to > rpc_prepare_reply_pages())? In the current version of RPC/RDMA, there's always exactly one RDMA Send per RPC message. The Linux client implementation is also careful to exclude XDR padding in both Read and Write chunks because the protocol makes the inclusion of padding on the wire optional. The only issue I see is that the upper layer needs to identify to the transport the exact size of the data item that is being transferred in a chunk so that the padding can be properly excluded. Currently rpcrdma makes some assumptions about how the data items are laid out in the xdr_buf when XDRBUF_READ/WRITE is set. >> And then you should confirm that we are still getting the receive >> buffer size estimate right for krb5i and krb5p. >> >> >>>>> Signed-off-by: Trond Myklebust >>>>> <trond.myklebust@hammerspace.com> >>>>> --- >>>>> net/sunrpc/xdr.c | 3 --- >>>>> 1 file changed, 3 deletions(-) >>>>> >>>>> diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c >>>>> index 3ce0a5daa9eb..5a450055469f 100644 >>>>> --- a/net/sunrpc/xdr.c >>>>> +++ b/net/sunrpc/xdr.c >>>>> @@ -193,9 +193,6 @@ xdr_inline_pages(struct xdr_buf *xdr, >>>>> unsigned >>>>> int offset, >>>>> >>>>> tail->iov_base = buf + offset; >>>>> tail->iov_len = buflen - offset; >>>>> - if ((xdr->page_len & 3) == 0) >>>>> - tail->iov_len -= sizeof(__be32); >>>>> - >>>>> xdr->buflen += len; >>>>> } >>>>> EXPORT_SYMBOL_GPL(xdr_inline_pages); >>>>> -- >>>>> 2.28.0 >> > > -- > Trond Myklebust > Linux NFS client maintainer, Hammerspace > trond.myklebust@hammerspace.com -- Chuck Lever ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 5/8] SUNRPC: Don't truncate tail in xdr_inline_pages() 2020-11-22 20:52 ` [PATCH 5/8] SUNRPC: Don't truncate tail in xdr_inline_pages() trondmy 2020-11-22 20:52 ` [PATCH 6/8] SUNRPC: Fix up xdr_set_page() trondmy 2020-11-23 1:24 ` [PATCH 5/8] SUNRPC: Don't truncate tail in xdr_inline_pages() Chuck Lever @ 2020-11-23 17:24 ` Anna Schumaker 2 siblings, 0 replies; 15+ messages in thread From: Anna Schumaker @ 2020-11-23 17:24 UTC (permalink / raw) To: trondmy; +Cc: Linux NFS Mailing List Hi Trond, On Sun, Nov 22, 2020 at 4:07 PM <trondmy@kernel.org> wrote: > > From: Trond Myklebust <trond.myklebust@hammerspace.com> > > True that if the length of the pages[] array is not 4-byte aligned, then > we will need to store the padding in the tail, but there is no need to > truncate the total buffer length here. Just a heads up, after applying this patch there are a *lot* of xfstests that fail when run on v4.2 against a server that supports READ_PLUS Anna > > Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> > --- > net/sunrpc/xdr.c | 3 --- > 1 file changed, 3 deletions(-) > > diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c > index 3ce0a5daa9eb..5a450055469f 100644 > --- a/net/sunrpc/xdr.c > +++ b/net/sunrpc/xdr.c > @@ -193,9 +193,6 @@ xdr_inline_pages(struct xdr_buf *xdr, unsigned int offset, > > tail->iov_base = buf + offset; > tail->iov_len = buflen - offset; > - if ((xdr->page_len & 3) == 0) > - tail->iov_len -= sizeof(__be32); > - > xdr->buflen += len; > } > EXPORT_SYMBOL_GPL(xdr_inline_pages); > -- > 2.28.0 > ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2020-11-23 17:25 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-11-22 20:52 [PATCH 0/8] Fix various issues in the SUNRPC xdr code trondmy 2020-11-22 20:52 ` [PATCH 1/8] NFSv4: Fix the alignment of page data in the getdeviceinfo reply trondmy 2020-11-22 20:52 ` [PATCH 2/8] SUNRPC: Fix up typo in xdr_init_decode() trondmy 2020-11-22 20:52 ` [PATCH 3/8] SUNRPC: Clean up helpers xdr_set_iov() and xdr_set_page_base() trondmy 2020-11-22 20:52 ` [PATCH 4/8] SUNRPC: Fix up xdr_read_pages() to take arbitrary object lengths trondmy 2020-11-22 20:52 ` [PATCH 5/8] SUNRPC: Don't truncate tail in xdr_inline_pages() trondmy 2020-11-22 20:52 ` [PATCH 6/8] SUNRPC: Fix up xdr_set_page() trondmy 2020-11-22 20:52 ` [PATCH 7/8] SUNRPC: Fix open coded xdr_stream_remaining() trondmy 2020-11-22 20:52 ` [PATCH 8/8] NFSv4: " trondmy 2020-11-23 1:24 ` [PATCH 5/8] SUNRPC: Don't truncate tail in xdr_inline_pages() Chuck Lever 2020-11-23 4:29 ` Trond Myklebust 2020-11-23 14:52 ` Chuck Lever 2020-11-23 15:37 ` Trond Myklebust 2020-11-23 15:47 ` Chuck Lever 2020-11-23 17:24 ` Anna Schumaker
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).