All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] Three SRP driver patches for kernel 4.5
@ 2015-12-31  8:54 Bart Van Assche
       [not found] ` <5684ED4B.2010303-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Bart Van Assche @ 2015-12-31  8:54 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Christoph Hellwig, Sagi Grimberg, linux-rdma-u79uwXL29TY76Z2rM5mHXA

Hello Doug,

This series of three patches fixes new issues that exist in the
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma.git/k.o/for-4.5 
branch. It would be appreciated if these patches could be included in 
the kernel 4.5 RDMA pull request.

0001-irq_poll-Fix-irq_poll_sched.patch
0002-IB-srpt-Fix-the-RDMA-completion-handlers.patch
0003-IB-srpt-Fix-a-race-condition-related-to-SRP-login.patch

Thanks,

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/3] irq_poll: Fix irq_poll_sched()
       [not found] ` <5684ED4B.2010303-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
@ 2015-12-31  8:56   ` Bart Van Assche
       [not found]     ` <5684EDA3.60802-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
  2015-12-31  8:56   ` [PATCH 2/3] IB/srpt: Fix the RDMA completion handlers Bart Van Assche
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 12+ messages in thread
From: Bart Van Assche @ 2015-12-31  8:56 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Christoph Hellwig, Sagi Grimberg, linux-rdma-u79uwXL29TY76Z2rM5mHXA

The IRQ_POLL_F_SCHED bit is set as long as polling is ongoing.
This means that irq_poll_sched() must proceed if this bit has
not yet been set.

Fixes: commit ea51190c0315 ("irq_poll: fold irq_poll_sched_prep into irq_poll_sched").
Signed-off-by: Bart Van Assche <bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
Reviewed-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
---
 lib/irq_poll.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/irq_poll.c b/lib/irq_poll.c
index 2836620..836f7db 100644
--- a/lib/irq_poll.c
+++ b/lib/irq_poll.c
@@ -29,7 +29,7 @@ void irq_poll_sched(struct irq_poll *iop)
 
 	if (test_bit(IRQ_POLL_F_DISABLE, &iop->state))
 		return;
-	if (!test_and_set_bit(IRQ_POLL_F_SCHED, &iop->state))
+	if (test_and_set_bit(IRQ_POLL_F_SCHED, &iop->state))
 		return;
 
 	local_irq_save(flags);
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/3] IB/srpt: Fix the RDMA completion handlers
       [not found] ` <5684ED4B.2010303-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
  2015-12-31  8:56   ` [PATCH 1/3] irq_poll: Fix irq_poll_sched() Bart Van Assche
@ 2015-12-31  8:56   ` Bart Van Assche
       [not found]     ` <5684EDCB.30606-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
  2015-12-31  8:57   ` [PATCH 3/3] IB/srpt: Fix a race condition related to SRP login Bart Van Assche
  2016-01-19 20:43   ` [PATCH 0/3] Three SRP driver patches for kernel 4.5 Doug Ledford
  3 siblings, 1 reply; 12+ messages in thread
From: Bart Van Assche @ 2015-12-31  8:56 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Christoph Hellwig, Sagi Grimberg, linux-rdma-u79uwXL29TY76Z2rM5mHXA

Avoid that the following kernel crash is triggered when processing
an RDMA completion:

BUG: unable to handle kernel paging request at 0000000100000198
IP: [<ffffffff810a4ea2>] __lock_acquire+0xa2/0x560
Call Trace:
 [<ffffffff810a53c2>] lock_acquire+0x62/0x80
 [<ffffffff8151bd33>] _raw_spin_lock_irqsave+0x43/0x60
 [<ffffffffa04fd437>] srpt_rdma_read_done+0x57/0x120 [ib_srpt]
 [<ffffffffa0144dd3>] __ib_process_cq+0x43/0xc0 [ib_core]
 [<ffffffffa0145115>] ib_cq_poll_work+0x25/0x70 [ib_core]
 [<ffffffff8107184d>] process_one_work+0x1bd/0x460
 [<ffffffff81073148>] worker_thread+0x118/0x420
 [<ffffffff81078454>] kthread+0xe4/0x100
 [<ffffffff8151cbbf>] ret_from_fork+0x3f/0x70

Fixes: commit 59fae4deaad3 ("IB/srpt: chain RDMA READ/WRITE requests").
Signed-off-by: Bart Van Assche <bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
Reviewed-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
---
 drivers/infiniband/ulp/srpt/ib_srpt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 105512d..b42cf0d 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -1395,7 +1395,7 @@ static void srpt_rdma_read_done(struct ib_cq *cq, struct ib_wc *wc)
 {
 	struct srpt_rdma_ch *ch = cq->cq_context;
 	struct srpt_send_ioctx *ioctx =
-		container_of(wc->wr_cqe, struct srpt_send_ioctx, ioctx.cqe);
+		container_of(wc->wr_cqe, struct srpt_send_ioctx, rdma_cqe);
 
 	WARN_ON(ioctx->n_rdma <= 0);
 	atomic_add(ioctx->n_rdma, &ch->sq_wr_avail);
@@ -1418,7 +1418,7 @@ static void srpt_rdma_read_done(struct ib_cq *cq, struct ib_wc *wc)
 static void srpt_rdma_write_done(struct ib_cq *cq, struct ib_wc *wc)
 {
 	struct srpt_send_ioctx *ioctx =
-		container_of(wc->wr_cqe, struct srpt_send_ioctx, ioctx.cqe);
+		container_of(wc->wr_cqe, struct srpt_send_ioctx, rdma_cqe);
 
 	if (unlikely(wc->status != IB_WC_SUCCESS)) {
 		pr_info("RDMA_WRITE for ioctx 0x%p failed with status %d\n",
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/3] IB/srpt: Fix a race condition related to SRP login
       [not found] ` <5684ED4B.2010303-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
  2015-12-31  8:56   ` [PATCH 1/3] irq_poll: Fix irq_poll_sched() Bart Van Assche
  2015-12-31  8:56   ` [PATCH 2/3] IB/srpt: Fix the RDMA completion handlers Bart Van Assche
@ 2015-12-31  8:57   ` Bart Van Assche
       [not found]     ` <5684EE16.8070701-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
  2016-01-19 20:43   ` [PATCH 0/3] Three SRP driver patches for kernel 4.5 Doug Ledford
  3 siblings, 1 reply; 12+ messages in thread
From: Bart Van Assche @ 2015-12-31  8:57 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Christoph Hellwig, Sagi Grimberg, linux-rdma-u79uwXL29TY76Z2rM5mHXA

Since patch "IB/srpt: chain RDMA READ/WRITE requests" there are
two loops that process the command wait list. ch->cmd_wait_list
is accessed without locking which means that all code that
accesses this list must be serialized. Since processing of the
RTU event happens from another context than IB WC processing,
remove the wait list processing code from the RTU handler.

Fixes: commit a42d985bd5b2 ("ib_srpt: Initial SRP Target merge for v3.3-rc1").
Signed-off-by: Bart Van Assche <bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
Cc: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
---
 drivers/infiniband/ulp/srpt/ib_srpt.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
index b42cf0d..48b27dc 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -2540,15 +2540,7 @@ static void srpt_cm_rtu_recv(struct ib_cm_id *cm_id)
 	BUG_ON(!ch);
 
 	if (srpt_test_and_set_ch_state(ch, CH_CONNECTING, CH_LIVE)) {
-		struct srpt_recv_ioctx *ioctx, *ioctx_tmp;
-
 		ret = srpt_ch_qp_rts(ch, ch->qp);
-
-		list_for_each_entry_safe(ioctx, ioctx_tmp, &ch->cmd_wait_list,
-					 wait_list) {
-			list_del(&ioctx->wait_list);
-			srpt_handle_new_iu(ch, ioctx, NULL);
-		}
 		if (ret)
 			srpt_close_ch(ch);
 	}
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] IB/srpt: Fix a race condition related to SRP login
       [not found]     ` <5684EE16.8070701-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
@ 2016-01-03 10:51       ` Christoph Hellwig
       [not found]         ` <20160103105127.GA10025-jcswGhMUV9g@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2016-01-03 10:51 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Doug Ledford, Christoph Hellwig, Sagi Grimberg,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Thu, Dec 31, 2015 at 09:57:58AM +0100, Bart Van Assche wrote:
> Since patch "IB/srpt: chain RDMA READ/WRITE requests" there are
> two loops that process the command wait list. ch->cmd_wait_list
> is accessed without locking which means that all code that
> accesses this list must be serialized. Since processing of the
> RTU event happens from another context than IB WC processing,
> remove the wait list processing code from the RTU handler.

But now the first I/O(s) could be lost if no other I/O comes in,
right?  I suspect that we need to keep this loop to protect against
such corner cases.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] irq_poll: Fix irq_poll_sched()
       [not found]     ` <5684EDA3.60802-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
@ 2016-01-03 11:15       ` Sagi Grimberg
  0 siblings, 0 replies; 12+ messages in thread
From: Sagi Grimberg @ 2016-01-03 11:15 UTC (permalink / raw)
  To: Bart Van Assche, Doug Ledford
  Cc: Christoph Hellwig, Sagi Grimberg, linux-rdma-u79uwXL29TY76Z2rM5mHXA

Looks good,

Reviewed-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] IB/srpt: Fix the RDMA completion handlers
       [not found]     ` <5684EDCB.30606-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
@ 2016-01-03 11:16       ` Sagi Grimberg
  0 siblings, 0 replies; 12+ messages in thread
From: Sagi Grimberg @ 2016-01-03 11:16 UTC (permalink / raw)
  To: Bart Van Assche, Doug Ledford
  Cc: Christoph Hellwig, Sagi Grimberg, linux-rdma-u79uwXL29TY76Z2rM5mHXA

Looks good,

Reviewed-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] IB/srpt: Fix a race condition related to SRP login
       [not found]         ` <20160103105127.GA10025-jcswGhMUV9g@public.gmane.org>
@ 2016-01-03 11:24           ` Sagi Grimberg
       [not found]             ` <56890507.9080404-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Sagi Grimberg @ 2016-01-03 11:24 UTC (permalink / raw)
  To: Christoph Hellwig, Bart Van Assche
  Cc: Doug Ledford, Sagi Grimberg, linux-rdma-u79uwXL29TY76Z2rM5mHXA



> But now the first I/O(s) could be lost if no other I/O comes in,
> right?  I suspect that we need to keep this loop to protect against
> such corner cases.

It can happen theoretically, but why do we even bother? Why not just
post the recv buffer after we moved in to CH_LIVE? This why we let the
RC transport handle the "Recv-Not-Ready" NAK by retrying?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] IB/srpt: Fix a race condition related to SRP login
       [not found]             ` <56890507.9080404-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2016-01-03 13:01               ` Bart Van Assche
       [not found]                 ` <56891BB8.6040502-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Bart Van Assche @ 2016-01-03 13:01 UTC (permalink / raw)
  To: Sagi Grimberg, Christoph Hellwig
  Cc: Doug Ledford, Sagi Grimberg, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 01/03/2016 12:25 PM, Sagi Grimberg wrote:
>> But now the first I/O(s) could be lost if no other I/O comes in,
>> right?  I suspect that we need to keep this loop to protect against
>> such corner cases.
>
> It can happen theoretically, but why do we even bother? Why not just
> post the recv buffer after we moved in to CH_LIVE? This why we let the
> RC transport handle the "Recv-Not-Ready" NAK by retrying?

Making wait list processing in the RTU handler safe would require to 
introduce additional locking. Posting receive buffers after the RTU 
event has been received would introduce another race condition because 
then it can happen that an initiator starts sending data before the 
receive buffers have been posted. A possible solution would be to 
trigger the receive handler after the RTU event has been received in 
such a way that all invocations of the receive handler are still serialized.

Bart.


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] IB/srpt: Fix a race condition related to SRP login
       [not found]                 ` <56891BB8.6040502-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
@ 2016-01-03 13:09                   ` Sagi Grimberg
       [not found]                     ` <56891DA5.5060506-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Sagi Grimberg @ 2016-01-03 13:09 UTC (permalink / raw)
  To: Bart Van Assche, Christoph Hellwig
  Cc: Doug Ledford, Sagi Grimberg, linux-rdma-u79uwXL29TY76Z2rM5mHXA


>>> But now the first I/O(s) could be lost if no other I/O comes in,
>>> right?  I suspect that we need to keep this loop to protect against
>>> such corner cases.
>>
>> It can happen theoretically, but why do we even bother? Why not just
>> post the recv buffer after we moved in to CH_LIVE? This why we let the
>> RC transport handle the "Recv-Not-Ready" NAK by retrying?
>
> Making wait list processing in the RTU handler safe would require to
> introduce additional locking. Posting receive buffers after the RTU
> event has been received would introduce another race condition because
> then it can happen that an initiator starts sending data before the
> receive buffers have been posted.

And what is the problem with that? The peer HCA will get a RNR_NAK
and retry a pre-configured number of retries (7). The initiator stack
won't feel it at all...

I'd say it's a clean way to go.

> A possible solution would be to  trigger the receive handler after
> the RTU event has been received in
> such a way that all invocations of the receive handler are still
> serialized.

We can also do that, it's less trivial though.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] IB/srpt: Fix a race condition related to SRP login
       [not found]                     ` <56891DA5.5060506-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2016-01-15 22:14                       ` Bart Van Assche
  0 siblings, 0 replies; 12+ messages in thread
From: Bart Van Assche @ 2016-01-15 22:14 UTC (permalink / raw)
  To: Sagi Grimberg, Christoph Hellwig
  Cc: Doug Ledford, Sagi Grimberg, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 01/03/2016 05:09 AM, Sagi Grimberg wrote:
>>>> But now the first I/O(s) could be lost if no other I/O comes in,
>>>> right?  I suspect that we need to keep this loop to protect against
>>>> such corner cases.
>>>
>>> It can happen theoretically, but why do we even bother? Why not just
>>> post the recv buffer after we moved in to CH_LIVE? This why we let the
>>> RC transport handle the "Recv-Not-Ready" NAK by retrying?
>>
>> Making wait list processing in the RTU handler safe would require to
>> introduce additional locking. Posting receive buffers after the RTU
>> event has been received would introduce another race condition because
>> then it can happen that an initiator starts sending data before the
>> receive buffers have been posted.
> 
> And what is the problem with that? The peer HCA will get a RNR_NAK
> and retry a pre-configured number of retries (7). The initiator stack
> won't feel it at all...
> 
> I'd say it's a clean way to go.
> 
>> A possible solution would be to  trigger the receive handler after
>> the RTU event has been received in
>> such a way that all invocations of the receive handler are still
>> serialized.
> 
> We can also do that, it's less trivial though.

Hello Sagi,

I have started to test the patch below. I will repost this series
after the merge window has closed and after Doug has had the chance
to process his e-mail backlog.

From: Bart Van Assche <bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
Subject: [PATCH] IB/srpt: Fix wait list processing

Since the wait list is not protected against concurrent access
it must be processed from the context of the completion handler.
Replace the wait list processing code in the IB CM RTU callback
handler by code that triggers a completion handler. This patch
fixes the following rare crash:

WARNING: CPU: 2 PID: 78656 at lib/list_debug.c:53 __list_del_entry+0x67/0xd0()
list_del corruption, ffff88041ae404b8->next is LIST_POISON1 (dead000000000100)
Call Trace:
 [<ffffffff81251c6b>] dump_stack+0x4f/0x74
 [<ffffffff810574ab>] warn_slowpath_common+0x8b/0xd0
 [<ffffffff81057591>] warn_slowpath_fmt+0x41/0x70
 [<ffffffff8126f007>] __list_del_entry+0x67/0xd0
 [<ffffffff8126f081>] list_del+0x11/0x40
 [<ffffffffa0265242>] srpt_cm_handler+0x172/0x1a4 [ib_srpt]
 [<ffffffffa0370370>] cm_process_work+0x20/0xf0 [ib_cm]
 [<ffffffffa0370dae>] cm_establish_handler+0xbe/0x110 [ib_cm]
 [<ffffffffa03733e7>] cm_work_handler+0x67/0xd0 [ib_cm]
 [<ffffffff8107184d>] process_one_work+0x1bd/0x460
 [<ffffffff81073148>] worker_thread+0x118/0x420
 [<ffffffff81078444>] kthread+0xe4/0x100
 [<ffffffff8151caff>] ret_from_fork+0x3f/0x70

---
 drivers/infiniband/ulp/srpt/ib_srpt.c | 64 +++++++++++++++++++++--------------
 1 file changed, 38 insertions(+), 26 deletions(-)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 2cc9813..8991d3f 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -96,7 +96,7 @@ static void srpt_free_ch(struct kref *kref);
 static int srpt_queue_status(struct se_cmd *cmd);
 static void srpt_recv_done(struct ib_cq *cq, struct ib_wc *wc);
 static void srpt_send_done(struct ib_cq *cq, struct ib_wc *wc);
-static void srpt_zerolength_write_done(struct ib_cq *cq, struct ib_wc *wc);
+static void srpt_process_wait_list(struct srpt_rdma_ch *ch);
 
 /*
  * The only allowed channel state changes are those that change the channel
@@ -832,12 +832,14 @@ static void srpt_zerolength_write_done(struct ib_cq *cq, struct ib_wc *wc)
 {
 	struct srpt_rdma_ch *ch = cq->cq_context;
 
-	WARN(wc->status == IB_WC_SUCCESS, "%s-%d: QP not in error state\n",
-	     ch->sess_name, ch->qp->qp_num);
-	if (srpt_set_ch_state(ch, CH_DISCONNECTED))
-		schedule_work(&ch->release_work);
-	else
-		WARN_ONCE("%s-%d\n", ch->sess_name, ch->qp->qp_num);
+	if (wc->status == IB_WC_SUCCESS) {
+		srpt_process_wait_list(ch);
+	} else {
+		if (srpt_set_ch_state(ch, CH_DISCONNECTED))
+			schedule_work(&ch->release_work);
+		else
+			WARN_ONCE("%s-%d\n", ch->sess_name, ch->qp->qp_num);
+	}
 }
 
 /**
@@ -1734,6 +1736,28 @@ static void srpt_recv_done(struct ib_cq *cq, struct ib_wc *wc)
 	}
 }
 
+/*
+ * This function must be called from the context of the IB completion handler
+ * because it accesses the wait list without protection against access from
+ * other threads.
+ */
+static void srpt_process_wait_list(struct srpt_rdma_ch *ch)
+{
+	struct srpt_send_ioctx *ioctx;
+
+	while (!list_empty(&ch->cmd_wait_list) &&
+	       ch->state >= CH_LIVE &&
+	       (ioctx = srpt_get_send_ioctx(ch)) != NULL) {
+		struct srpt_recv_ioctx *recv_ioctx;
+
+		recv_ioctx = list_first_entry(&ch->cmd_wait_list,
+					      struct srpt_recv_ioctx,
+					      wait_list);
+		list_del(&recv_ioctx->wait_list);
+		srpt_handle_new_iu(ch, recv_ioctx, ioctx);
+	}
+}
+
 /**
  * Note: Although this has not yet been observed during tests, at least in
  * theory it is possible that the srpt_get_send_ioctx() call invoked by
@@ -1773,17 +1797,7 @@ static void srpt_send_done(struct ib_cq *cq, struct ib_wc *wc)
 		       " wr_id = %u.\n", ioctx->ioctx.index);
 	}
 
-	while (!list_empty(&ch->cmd_wait_list) &&
-	       ch->state == CH_LIVE &&
-	       (ioctx = srpt_get_send_ioctx(ch)) != NULL) {
-		struct srpt_recv_ioctx *recv_ioctx;
-
-		recv_ioctx = list_first_entry(&ch->cmd_wait_list,
-					      struct srpt_recv_ioctx,
-					      wait_list);
-		list_del(&recv_ioctx->wait_list);
-		srpt_handle_new_iu(ch, recv_ioctx, ioctx);
-	}
+	srpt_process_wait_list(ch);
 }
 
 /**
@@ -2328,17 +2342,15 @@ static void srpt_cm_rtu_recv(struct srpt_rdma_ch *ch)
 	int ret;
 
 	if (srpt_set_ch_state(ch, CH_LIVE)) {
-		struct srpt_recv_ioctx *ioctx, *ioctx_tmp;
-
 		ret = srpt_ch_qp_rts(ch, ch->qp);
 
-		list_for_each_entry_safe(ioctx, ioctx_tmp, &ch->cmd_wait_list,
-					 wait_list) {
-			list_del(&ioctx->wait_list);
-			srpt_handle_new_iu(ch, ioctx, NULL);
-		}
-		if (ret)
+		if (ret == 0) {
+			/* Trigger wait list processing. */
+			ret = srpt_zerolength_write(ch);
+			WARN_ONCE(ret < 0, "%d\n", ret);
+		} else {
 			srpt_close_ch(ch);
+		}
 	}
 }
 
-- 
2.1.4


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] Three SRP driver patches for kernel 4.5
       [not found] ` <5684ED4B.2010303-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
                     ` (2 preceding siblings ...)
  2015-12-31  8:57   ` [PATCH 3/3] IB/srpt: Fix a race condition related to SRP login Bart Van Assche
@ 2016-01-19 20:43   ` Doug Ledford
  3 siblings, 0 replies; 12+ messages in thread
From: Doug Ledford @ 2016-01-19 20:43 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Christoph Hellwig, Sagi Grimberg, linux-rdma-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 709 bytes --]

On 12/31/2015 03:54 AM, Bart Van Assche wrote:
> Hello Doug,
> 
> This series of three patches fixes new issues that exist in the
> git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma.git/k.o/for-4.5
> branch. It would be appreciated if these patches could be included in
> the kernel 4.5 RDMA pull request.
> 
> 0001-irq_poll-Fix-irq_poll_sched.patch
> 0002-IB-srpt-Fix-the-RDMA-completion-handlers.patch
> 0003-IB-srpt-Fix-a-race-condition-related-to-SRP-login.patch

Since you are still working on the last patch, I dropped it, but I've
picked up the first two.  Thanks.


-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
              GPG KeyID: 0E572FDD



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 884 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2016-01-19 20:43 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-12-31  8:54 [PATCH 0/3] Three SRP driver patches for kernel 4.5 Bart Van Assche
     [not found] ` <5684ED4B.2010303-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
2015-12-31  8:56   ` [PATCH 1/3] irq_poll: Fix irq_poll_sched() Bart Van Assche
     [not found]     ` <5684EDA3.60802-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
2016-01-03 11:15       ` Sagi Grimberg
2015-12-31  8:56   ` [PATCH 2/3] IB/srpt: Fix the RDMA completion handlers Bart Van Assche
     [not found]     ` <5684EDCB.30606-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
2016-01-03 11:16       ` Sagi Grimberg
2015-12-31  8:57   ` [PATCH 3/3] IB/srpt: Fix a race condition related to SRP login Bart Van Assche
     [not found]     ` <5684EE16.8070701-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
2016-01-03 10:51       ` Christoph Hellwig
     [not found]         ` <20160103105127.GA10025-jcswGhMUV9g@public.gmane.org>
2016-01-03 11:24           ` Sagi Grimberg
     [not found]             ` <56890507.9080404-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2016-01-03 13:01               ` Bart Van Assche
     [not found]                 ` <56891BB8.6040502-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
2016-01-03 13:09                   ` Sagi Grimberg
     [not found]                     ` <56891DA5.5060506-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2016-01-15 22:14                       ` Bart Van Assche
2016-01-19 20:43   ` [PATCH 0/3] Three SRP driver patches for kernel 4.5 Doug Ledford

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.