linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/3] xprtrdma: disconnect and flush cqs before freeing buffers
@ 2015-09-21 17:24 Steve Wise
       [not found] ` <20150921172423.9761.92399.stgit-PBeJgSbIpn97NCTnQtmixQ@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Steve Wise @ 2015-09-21 17:24 UTC (permalink / raw)
  To: trond.myklebust-7I+n7zu2hftEKMMhf/gKZA, bfields-uC3wQj2KruNg9hUCZPvPmw
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA

Otherwise a FRMR completion can cause a touch-after-free crash.

In xprt_rdma_destroy(), call rpcrdma_buffer_destroy() only after calling
rpcrdma_ep_destroy().

In rpcrdma_ep_destroy(), disconnect the cm_id first which should flush the
qp, then drain the cqs, then destroy the qp, and finally destroy the cqs.

Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
Tested-by: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
---

 net/sunrpc/xprtrdma/transport.c |    2 +-
 net/sunrpc/xprtrdma/verbs.c     |    9 ++++++---
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/net/sunrpc/xprtrdma/transport.c b/net/sunrpc/xprtrdma/transport.c
index 64443eb..41e452b 100644
--- a/net/sunrpc/xprtrdma/transport.c
+++ b/net/sunrpc/xprtrdma/transport.c
@@ -270,8 +270,8 @@ xprt_rdma_destroy(struct rpc_xprt *xprt)
 
 	xprt_clear_connected(xprt);
 
-	rpcrdma_buffer_destroy(&r_xprt->rx_buf);
 	rpcrdma_ep_destroy(&r_xprt->rx_ep, &r_xprt->rx_ia);
+	rpcrdma_buffer_destroy(&r_xprt->rx_buf);
 	rpcrdma_ia_close(&r_xprt->rx_ia);
 
 	xprt_rdma_free_addresses(xprt);
diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 6829967..01a314a 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -755,19 +755,22 @@ rpcrdma_ep_destroy(struct rpcrdma_ep *ep, struct rpcrdma_ia *ia)
 
 	cancel_delayed_work_sync(&ep->rep_connect_worker);
 
-	if (ia->ri_id->qp) {
+	if (ia->ri_id->qp)
 		rpcrdma_ep_disconnect(ep, ia);
+
+	rpcrdma_clean_cq(ep->rep_attr.recv_cq);
+	rpcrdma_clean_cq(ep->rep_attr.send_cq);
+
+	if (ia->ri_id->qp) {
 		rdma_destroy_qp(ia->ri_id);
 		ia->ri_id->qp = NULL;
 	}
 
-	rpcrdma_clean_cq(ep->rep_attr.recv_cq);
 	rc = ib_destroy_cq(ep->rep_attr.recv_cq);
 	if (rc)
 		dprintk("RPC:       %s: ib_destroy_cq returned %i\n",
 			__func__, rc);
 
-	rpcrdma_clean_cq(ep->rep_attr.send_cq);
 	rc = ib_destroy_cq(ep->rep_attr.send_cq);
 	if (rc)
 		dprintk("RPC:       %s: ib_destroy_cq returned %i\n",

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/3] svcrdma: handle rdma read with a non-zero initial page offset
       [not found] ` <20150921172423.9761.92399.stgit-PBeJgSbIpn97NCTnQtmixQ@public.gmane.org>
@ 2015-09-21 17:24   ` Steve Wise
       [not found]     ` <20150921172428.9761.27838.stgit-PBeJgSbIpn97NCTnQtmixQ@public.gmane.org>
  2015-09-21 17:24   ` [PATCH 3/3] xprtrdma: don't log warnings for flushed completions Steve Wise
  2015-09-28 14:30   ` [PATCH 1/3] xprtrdma: disconnect and flush cqs before freeing buffers Steve Wise
  2 siblings, 1 reply; 11+ messages in thread
From: Steve Wise @ 2015-09-21 17:24 UTC (permalink / raw)
  To: trond.myklebust-7I+n7zu2hftEKMMhf/gKZA, bfields-uC3wQj2KruNg9hUCZPvPmw
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA

The server rdma_read_chunk_lcl() and rdma_read_chunk_frmr() functions
were not taking into account the initial page_offset when determining
the rdma read length.  This resulted in a read who's starting address
and length exceeded the base/bounds of the frmr.

Most work loads don't tickle this bug apparently, but one test hit it
every time: building the linux kernel on a 16 core node with 'make -j
16 O=/mnt/0' where /mnt/0 is a ramdisk mounted via NFSRDMA.

This bug seems to only be tripped with devices having small fastreg page
list depths.  I didn't see it with mlx4, for instance.

Fixes: 0bf4828983df ('svcrdma: refactor marshalling logic')
Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
Tested-by: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
---

 net/sunrpc/xprtrdma/svc_rdma_recvfrom.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index cb51742..5f6ca47 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -136,7 +136,8 @@ int rdma_read_chunk_lcl(struct svcxprt_rdma *xprt,
 	ctxt->direction = DMA_FROM_DEVICE;
 	ctxt->read_hdr = head;
 	pages_needed = min_t(int, pages_needed, xprt->sc_max_sge_rd);
-	read = min_t(int, pages_needed << PAGE_SHIFT, rs_length);
+	read = min_t(int, (pages_needed << PAGE_SHIFT) - *page_offset,
+		     rs_length);
 
 	for (pno = 0; pno < pages_needed; pno++) {
 		int len = min_t(int, rs_length, PAGE_SIZE - pg_off);
@@ -235,7 +236,8 @@ int rdma_read_chunk_frmr(struct svcxprt_rdma *xprt,
 	ctxt->direction = DMA_FROM_DEVICE;
 	ctxt->frmr = frmr;
 	pages_needed = min_t(int, pages_needed, xprt->sc_frmr_pg_list_len);
-	read = min_t(int, pages_needed << PAGE_SHIFT, rs_length);
+	read = min_t(int, (pages_needed << PAGE_SHIFT) - *page_offset,
+		     rs_length);
 
 	frmr->kva = page_address(rqstp->rq_arg.pages[pg_no]);
 	frmr->direction = DMA_FROM_DEVICE;

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/3] xprtrdma: don't log warnings for flushed completions
       [not found] ` <20150921172423.9761.92399.stgit-PBeJgSbIpn97NCTnQtmixQ@public.gmane.org>
  2015-09-21 17:24   ` [PATCH 2/3] svcrdma: handle rdma read with a non-zero initial page offset Steve Wise
@ 2015-09-21 17:24   ` Steve Wise
  2015-09-28 14:30   ` [PATCH 1/3] xprtrdma: disconnect and flush cqs before freeing buffers Steve Wise
  2 siblings, 0 replies; 11+ messages in thread
From: Steve Wise @ 2015-09-21 17:24 UTC (permalink / raw)
  To: trond.myklebust-7I+n7zu2hftEKMMhf/gKZA, bfields-uC3wQj2KruNg9hUCZPvPmw
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA

Unsignaled send WRs can get flushed as part of normal unmount, so don't
log them as warnings.

Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
---

 net/sunrpc/xprtrdma/frwr_ops.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c
index d6653f5..f1868ba 100644
--- a/net/sunrpc/xprtrdma/frwr_ops.c
+++ b/net/sunrpc/xprtrdma/frwr_ops.c
@@ -257,8 +257,11 @@ frwr_sendcompletion(struct ib_wc *wc)
 
 	/* WARNING: Only wr_id and status are reliable at this point */
 	r = (struct rpcrdma_mw *)(unsigned long)wc->wr_id;
-	pr_warn("RPC:       %s: frmr %p flushed, status %s (%d)\n",
-		__func__, r, ib_wc_status_msg(wc->status), wc->status);
+	if (wc->status == IB_WC_WR_FLUSH_ERR)
+		dprintk("RPC:       %s: frmr %p flushed\n", __func__, r);
+	else
+		pr_warn("RPC:       %s: frmr %p error, status %s (%d)\n",
+			__func__, r, ib_wc_status_msg(wc->status), wc->status);
 	r->r.frmr.fr_state = FRMR_IS_STALE;
 }
 

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/3] xprtrdma: disconnect and flush cqs before freeing buffers
       [not found] ` <20150921172423.9761.92399.stgit-PBeJgSbIpn97NCTnQtmixQ@public.gmane.org>
  2015-09-21 17:24   ` [PATCH 2/3] svcrdma: handle rdma read with a non-zero initial page offset Steve Wise
  2015-09-21 17:24   ` [PATCH 3/3] xprtrdma: don't log warnings for flushed completions Steve Wise
@ 2015-09-28 14:30   ` Steve Wise
       [not found]     ` <56094F16.3040608-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
  2 siblings, 1 reply; 11+ messages in thread
From: Steve Wise @ 2015-09-28 14:30 UTC (permalink / raw)
  To: trond.myklebust-7I+n7zu2hftEKMMhf/gKZA, bfields-uC3wQj2KruNg9hUCZPvPmw
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 9/21/2015 12:24 PM, Steve Wise wrote:
> Otherwise a FRMR completion can cause a touch-after-free crash.
>
> In xprt_rdma_destroy(), call rpcrdma_buffer_destroy() only after calling
> rpcrdma_ep_destroy().
>
> In rpcrdma_ep_destroy(), disconnect the cm_id first which should flush the
> qp, then drain the cqs, then destroy the qp, and finally destroy the cqs.
>
> Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
> Tested-by: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
> ---

Hey Trond,  I'm hoping this can make 4.3-rc (and stable if you agree).

Thanks,

Steve.

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] svcrdma: handle rdma read with a non-zero initial page offset
       [not found]     ` <20150921172428.9761.27838.stgit-PBeJgSbIpn97NCTnQtmixQ@public.gmane.org>
@ 2015-09-28 14:31       ` Steve Wise
       [not found]         ` <56094F3D.2030002-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Steve Wise @ 2015-09-28 14:31 UTC (permalink / raw)
  To: trond.myklebust-7I+n7zu2hftEKMMhf/gKZA, bfields-uC3wQj2KruNg9hUCZPvPmw
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 9/21/2015 12:24 PM, Steve Wise wrote:
> The server rdma_read_chunk_lcl() and rdma_read_chunk_frmr() functions
> were not taking into account the initial page_offset when determining
> the rdma read length.  This resulted in a read who's starting address
> and length exceeded the base/bounds of the frmr.
>
> Most work loads don't tickle this bug apparently, but one test hit it
> every time: building the linux kernel on a 16 core node with 'make -j
> 16 O=/mnt/0' where /mnt/0 is a ramdisk mounted via NFSRDMA.
>
> This bug seems to only be tripped with devices having small fastreg page
> list depths.  I didn't see it with mlx4, for instance.
>
> Fixes: 0bf4828983df ('svcrdma: refactor marshalling logic')
> Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
> Tested-by: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
> ---
>
>

Hey Bruce, can this make 4.3-rc?  Also, what do you think about pushing 
it to stable?

Thanks,

Steve.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/3] xprtrdma: disconnect and flush cqs before freeing buffers
       [not found]     ` <56094F16.3040608-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
@ 2015-09-28 14:45       ` Anna Schumaker
       [not found]         ` <56095286.6030201-ZwjVKphTwtPQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Anna Schumaker @ 2015-09-28 14:45 UTC (permalink / raw)
  To: Steve Wise, trond.myklebust-7I+n7zu2hftEKMMhf/gKZA,
	bfields-uC3wQj2KruNg9hUCZPvPmw
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA

Hi Steve,

On 09/28/2015 10:30 AM, Steve Wise wrote:
> On 9/21/2015 12:24 PM, Steve Wise wrote:
>> Otherwise a FRMR completion can cause a touch-after-free crash.
>>
>> In xprt_rdma_destroy(), call rpcrdma_buffer_destroy() only after calling
>> rpcrdma_ep_destroy().
>>
>> In rpcrdma_ep_destroy(), disconnect the cm_id first which should flush the
>> qp, then drain the cqs, then destroy the qp, and finally destroy the cqs.
>>
>> Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
>> Tested-by: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
>> ---
> 
> Hey Trond,  I'm hoping this can make 4.3-rc (and stable if you agree).

This patch looks fine to me.  I'll pass it on to Trond!

I'll save patch 3/3 for the Linux 4.4 merge.

Thanks,
Anna

> 
> Thanks,
> 
> Steve.
> 
> -- 
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH 1/3] xprtrdma: disconnect and flush cqs before freeing buffers
       [not found]         ` <56095286.6030201-ZwjVKphTwtPQT0dZR+AlfA@public.gmane.org>
@ 2015-09-28 14:50           ` Steve Wise
  2015-09-28 14:57             ` Anna Schumaker
  0 siblings, 1 reply; 11+ messages in thread
From: Steve Wise @ 2015-09-28 14:50 UTC (permalink / raw)
  To: 'Anna Schumaker',
	trond.myklebust-7I+n7zu2hftEKMMhf/gKZA,
	bfields-uC3wQj2KruNg9hUCZPvPmw
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA



> -----Original Message-----
> From: Anna Schumaker [mailto:Anna.Schumaker-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org]
> Sent: Monday, September 28, 2015 9:45 AM
> To: Steve Wise; trond.myklebust-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org; bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org
> Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Subject: Re: [PATCH 1/3] xprtrdma: disconnect and flush cqs before freeing buffers
> 
> Hi Steve,
> 
> On 09/28/2015 10:30 AM, Steve Wise wrote:
> > On 9/21/2015 12:24 PM, Steve Wise wrote:
> >> Otherwise a FRMR completion can cause a touch-after-free crash.
> >>
> >> In xprt_rdma_destroy(), call rpcrdma_buffer_destroy() only after calling
> >> rpcrdma_ep_destroy().
> >>
> >> In rpcrdma_ep_destroy(), disconnect the cm_id first which should flush the
> >> qp, then drain the cqs, then destroy the qp, and finally destroy the cqs.
> >>
> >> Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
> >> Tested-by: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
> >> ---
> >
> > Hey Trond,  I'm hoping this can make 4.3-rc (and stable if you agree).
> 
> This patch looks fine to me.  I'll pass it on to Trond!
> 
> I'll save patch 3/3 for the Linux 4.4 merge.
> 
> Thanks,
> Anna
> 

Thanks.  Going forward I'll make sure you are CC'd for client patches too!  I wasn't sure if you are formally taking all xprtrdma patches and sending them to Trond...

Steve.


--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/3] xprtrdma: disconnect and flush cqs before freeing buffers
  2015-09-28 14:50           ` Steve Wise
@ 2015-09-28 14:57             ` Anna Schumaker
  0 siblings, 0 replies; 11+ messages in thread
From: Anna Schumaker @ 2015-09-28 14:57 UTC (permalink / raw)
  To: Steve Wise, trond.myklebust-7I+n7zu2hftEKMMhf/gKZA,
	bfields-uC3wQj2KruNg9hUCZPvPmw
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 09/28/2015 10:50 AM, Steve Wise wrote:
> 
> 
>> -----Original Message-----
>> From: Anna Schumaker [mailto:Anna.Schumaker-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org]
>> Sent: Monday, September 28, 2015 9:45 AM
>> To: Steve Wise; trond.myklebust-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org; bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org
>> Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>> Subject: Re: [PATCH 1/3] xprtrdma: disconnect and flush cqs before freeing buffers
>>
>> Hi Steve,
>>
>> On 09/28/2015 10:30 AM, Steve Wise wrote:
>>> On 9/21/2015 12:24 PM, Steve Wise wrote:
>>>> Otherwise a FRMR completion can cause a touch-after-free crash.
>>>>
>>>> In xprt_rdma_destroy(), call rpcrdma_buffer_destroy() only after calling
>>>> rpcrdma_ep_destroy().
>>>>
>>>> In rpcrdma_ep_destroy(), disconnect the cm_id first which should flush the
>>>> qp, then drain the cqs, then destroy the qp, and finally destroy the cqs.
>>>>
>>>> Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
>>>> Tested-by: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
>>>> ---
>>>
>>> Hey Trond,  I'm hoping this can make 4.3-rc (and stable if you agree).
>>
>> This patch looks fine to me.  I'll pass it on to Trond!
>>
>> I'll save patch 3/3 for the Linux 4.4 merge.
>>
>> Thanks,
>> Anna
>>
> 
> Thanks.  Going forward I'll make sure you are CC'd for client patches too!  I wasn't sure if you are formally taking all xprtrdma patches and sending them to Trond...

Yeah, I'm taking all the client RDMA patches.  I'm glad that's cleared up now! :)

Anna

> 
> Steve.
> 
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] svcrdma: handle rdma read with a non-zero initial page offset
       [not found]         ` <56094F3D.2030002-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
@ 2015-09-28 21:04           ` J. Bruce Fields
       [not found]             ` <20150928210459.GC3190-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: J. Bruce Fields @ 2015-09-28 21:04 UTC (permalink / raw)
  To: Steve Wise
  Cc: trond.myklebust-7I+n7zu2hftEKMMhf/gKZA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Mon, Sep 28, 2015 at 09:31:25AM -0500, Steve Wise wrote:
> On 9/21/2015 12:24 PM, Steve Wise wrote:
> >The server rdma_read_chunk_lcl() and rdma_read_chunk_frmr() functions
> >were not taking into account the initial page_offset when determining
> >the rdma read length.  This resulted in a read who's starting address
> >and length exceeded the base/bounds of the frmr.
> >
> >Most work loads don't tickle this bug apparently, but one test hit it
> >every time: building the linux kernel on a 16 core node with 'make -j
> >16 O=/mnt/0' where /mnt/0 is a ramdisk mounted via NFSRDMA.
> >
> >This bug seems to only be tripped with devices having small fastreg page
> >list depths.  I didn't see it with mlx4, for instance.
> >
> >Fixes: 0bf4828983df ('svcrdma: refactor marshalling logic')
> >Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
> >Tested-by: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
> >---
> >
> >
> 
> Hey Bruce, can this make 4.3-rc?  Also, what do you think about
> pushing it to stable?

It looks like a reasonable candidate for stable.  Apologies, somehow I
missed it when you posted it--would you mind resending?

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH 2/3] svcrdma: handle rdma read with a non-zero initial page offset
       [not found]             ` <20150928210459.GC3190-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
@ 2015-09-28 21:49               ` Steve Wise
  2015-09-29 15:40                 ` 'J. Bruce Fields'
  0 siblings, 1 reply; 11+ messages in thread
From: Steve Wise @ 2015-09-28 21:49 UTC (permalink / raw)
  To: 'J. Bruce Fields'
  Cc: trond.myklebust-7I+n7zu2hftEKMMhf/gKZA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA



> -----Original Message-----
> From: J. Bruce Fields [mailto:bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org]
> Sent: Monday, September 28, 2015 4:05 PM
> To: Steve Wise
> Cc: trond.myklebust-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org; linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Subject: Re: [PATCH 2/3] svcrdma: handle rdma read with a non-zero initial page offset
> 
> On Mon, Sep 28, 2015 at 09:31:25AM -0500, Steve Wise wrote:
> > On 9/21/2015 12:24 PM, Steve Wise wrote:
> > >The server rdma_read_chunk_lcl() and rdma_read_chunk_frmr() functions
> > >were not taking into account the initial page_offset when determining
> > >the rdma read length.  This resulted in a read who's starting address
> > >and length exceeded the base/bounds of the frmr.
> > >
> > >Most work loads don't tickle this bug apparently, but one test hit it
> > >every time: building the linux kernel on a 16 core node with 'make -j
> > >16 O=/mnt/0' where /mnt/0 is a ramdisk mounted via NFSRDMA.
> > >
> > >This bug seems to only be tripped with devices having small fastreg page
> > >list depths.  I didn't see it with mlx4, for instance.
> > >
> > >Fixes: 0bf4828983df ('svcrdma: refactor marshalling logic')
> > >Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
> > >Tested-by: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
> > >---
> > >
> > >
> >
> > Hey Bruce, can this make 4.3-rc?  Also, what do you think about
> > pushing it to stable?
> 
> It looks like a reasonable candidate for stable.  Apologies, somehow I
> missed it when you posted it--would you mind resending?
> 
> --b.

resent this one patch.

What is your process for pushing to stable?

Thanks,

Steve.

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] svcrdma: handle rdma read with a non-zero initial page offset
  2015-09-28 21:49               ` Steve Wise
@ 2015-09-29 15:40                 ` 'J. Bruce Fields'
  0 siblings, 0 replies; 11+ messages in thread
From: 'J. Bruce Fields' @ 2015-09-29 15:40 UTC (permalink / raw)
  To: Steve Wise
  Cc: trond.myklebust-7I+n7zu2hftEKMMhf/gKZA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Mon, Sep 28, 2015 at 04:49:21PM -0500, Steve Wise wrote:
> 
> 
> > -----Original Message-----
> > From: J. Bruce Fields [mailto:bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org]
> > Sent: Monday, September 28, 2015 4:05 PM
> > To: Steve Wise
> > Cc: trond.myklebust-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org; linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > Subject: Re: [PATCH 2/3] svcrdma: handle rdma read with a non-zero initial page offset
> > 
> > On Mon, Sep 28, 2015 at 09:31:25AM -0500, Steve Wise wrote:
> > > On 9/21/2015 12:24 PM, Steve Wise wrote:
> > > >The server rdma_read_chunk_lcl() and rdma_read_chunk_frmr() functions
> > > >were not taking into account the initial page_offset when determining
> > > >the rdma read length.  This resulted in a read who's starting address
> > > >and length exceeded the base/bounds of the frmr.
> > > >
> > > >Most work loads don't tickle this bug apparently, but one test hit it
> > > >every time: building the linux kernel on a 16 core node with 'make -j
> > > >16 O=/mnt/0' where /mnt/0 is a ramdisk mounted via NFSRDMA.
> > > >
> > > >This bug seems to only be tripped with devices having small fastreg page
> > > >list depths.  I didn't see it with mlx4, for instance.
> > > >
> > > >Fixes: 0bf4828983df ('svcrdma: refactor marshalling logic')
> > > >Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
> > > >Tested-by: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
> > > >---
> > > >
> > > >
> > >
> > > Hey Bruce, can this make 4.3-rc?  Also, what do you think about
> > > pushing it to stable?
> > 
> > It looks like a reasonable candidate for stable.  Apologies, somehow I
> > missed it when you posted it--would you mind resending?
> > 
> > --b.
> 
> resent this one patch.
> 
> What is your process for pushing to stable?

Currently my standard regression tests don't seem to be passing on
recent 4.3; once I figure that out, I'll push out a branch with your
patch (and any others required) to

	git://linux-nfs.org/~bfields for-4.3

and then send a pull request to Linus.  Should be by the end of the
week.

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2015-09-29 15:40 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-21 17:24 [PATCH 1/3] xprtrdma: disconnect and flush cqs before freeing buffers Steve Wise
     [not found] ` <20150921172423.9761.92399.stgit-PBeJgSbIpn97NCTnQtmixQ@public.gmane.org>
2015-09-21 17:24   ` [PATCH 2/3] svcrdma: handle rdma read with a non-zero initial page offset Steve Wise
     [not found]     ` <20150921172428.9761.27838.stgit-PBeJgSbIpn97NCTnQtmixQ@public.gmane.org>
2015-09-28 14:31       ` Steve Wise
     [not found]         ` <56094F3D.2030002-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2015-09-28 21:04           ` J. Bruce Fields
     [not found]             ` <20150928210459.GC3190-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2015-09-28 21:49               ` Steve Wise
2015-09-29 15:40                 ` 'J. Bruce Fields'
2015-09-21 17:24   ` [PATCH 3/3] xprtrdma: don't log warnings for flushed completions Steve Wise
2015-09-28 14:30   ` [PATCH 1/3] xprtrdma: disconnect and flush cqs before freeing buffers Steve Wise
     [not found]     ` <56094F16.3040608-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2015-09-28 14:45       ` Anna Schumaker
     [not found]         ` <56095286.6030201-ZwjVKphTwtPQT0dZR+AlfA@public.gmane.org>
2015-09-28 14:50           ` Steve Wise
2015-09-28 14:57             ` Anna Schumaker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).