* [PATCH rdma-next] RDMA/addr: Fix race with netevent_callback()/rdma_addr_cancel()
@ 2020-09-30 7:20 Leon Romanovsky
2020-09-30 18:31 ` Jason Gunthorpe
0 siblings, 1 reply; 2+ messages in thread
From: Leon Romanovsky @ 2020-09-30 7:20 UTC (permalink / raw)
To: Doug Ledford, Jason Gunthorpe; +Cc: Dan Aloni, linux-rdma
From: Jason Gunthorpe <jgg@nvidia.com>
This three thread race can result in the work being run once the callback
becomes NULL:
CPU1 CPU2 CPU3
netevent_callback()
process_one_req() rdma_addr_cancel()
[..]
spin_lock_bh()
set_timeout()
spin_unlock_bh()
spin_lock_bh()
list_del_init(&req->list);
spin_unlock_bh()
req->callback = NULL
spin_lock_bh()
if (!list_empty(&req->list))
// Skipped!
// cancel_delayed_work(&req->work);
spin_unlock_bh()
process_one_req() // again
req->callback() // BOOM
cancel_delayed_work_sync()
The solution is to always cancel the work once it is completed so any
in between set_timeout() does not result in it running again.
Fixes: 44e75052bc2a ("RDMA/rdma_cm: Make rdma_addr_cancel into a fence")
Reported-by: Dan Aloni <dan@kernelim.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
drivers/infiniband/core/addr.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
index 3a98439bba83..0abce004a959 100644
--- a/drivers/infiniband/core/addr.c
+++ b/drivers/infiniband/core/addr.c
@@ -647,13 +647,12 @@ static void process_one_req(struct work_struct *_work)
req->callback = NULL;
spin_lock_bh(&lock);
+ /*
+ * Although the work will normally have been canceled by the workqueue,
+ * it can still be requeued as long as it is on the req_list.
+ */
+ cancel_delayed_work(&req->work);
if (!list_empty(&req->list)) {
- /*
- * Although the work will normally have been canceled by the
- * workqueue, it can still be requeued as long as it is on the
- * req_list.
- */
- cancel_delayed_work(&req->work);
list_del_init(&req->list);
kfree(req);
}
--
2.26.2
^ permalink raw reply related [flat|nested] 2+ messages in thread
* Re: [PATCH rdma-next] RDMA/addr: Fix race with netevent_callback()/rdma_addr_cancel()
2020-09-30 7:20 [PATCH rdma-next] RDMA/addr: Fix race with netevent_callback()/rdma_addr_cancel() Leon Romanovsky
@ 2020-09-30 18:31 ` Jason Gunthorpe
0 siblings, 0 replies; 2+ messages in thread
From: Jason Gunthorpe @ 2020-09-30 18:31 UTC (permalink / raw)
To: Leon Romanovsky; +Cc: Doug Ledford, Dan Aloni, linux-rdma
On Wed, Sep 30, 2020 at 10:20:07AM +0300, Leon Romanovsky wrote:
> From: Jason Gunthorpe <jgg@nvidia.com>
>
> This three thread race can result in the work being run once the callback
> becomes NULL:
>
> CPU1 CPU2 CPU3
> netevent_callback()
> process_one_req() rdma_addr_cancel()
> [..]
> spin_lock_bh()
> set_timeout()
> spin_unlock_bh()
>
> spin_lock_bh()
> list_del_init(&req->list);
> spin_unlock_bh()
>
> req->callback = NULL
> spin_lock_bh()
> if (!list_empty(&req->list))
> // Skipped!
> // cancel_delayed_work(&req->work);
> spin_unlock_bh()
>
> process_one_req() // again
> req->callback() // BOOM
> cancel_delayed_work_sync()
>
> The solution is to always cancel the work once it is completed so any
> in between set_timeout() does not result in it running again.
>
> Fixes: 44e75052bc2a ("RDMA/rdma_cm: Make rdma_addr_cancel into a fence")
> Reported-by: Dan Aloni <dan@kernelim.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---
> drivers/infiniband/core/addr.c | 11 +++++------
> 1 file changed, 5 insertions(+), 6 deletions(-)
Applied to for-next
Jason
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2020-09-30 18:31 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-30 7:20 [PATCH rdma-next] RDMA/addr: Fix race with netevent_callback()/rdma_addr_cancel() Leon Romanovsky
2020-09-30 18:31 ` Jason Gunthorpe
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.