All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V2] IB/uverbs: Fix race between uverbs_close and remove_one
@ 2016-03-07  9:44 Devesh Sharma
       [not found] ` <1457343873-14869-1-git-send-email-devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Devesh Sharma @ 2016-03-07  9:44 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w, Devesh Sharma

Fixes: 35d4a0b63dc0 ("IB/uverbs: Fix race between ib_uverbs_open and remove_one")

While testing ocrdma for disassociate_ucontext support following
kernel panic was seen:

BUG: unable to handle kernel paging request at ffffffffa07ccd7a
[67139.981020] IP: [<ffffffffa07ccd7a>] 0xffffffffa07ccd7a
[67139.987185] PGD 19c5067 PUD 19c6063 PMD 469d08067 PTE 0
[67139.993370] Oops: 0010 [#1] SMP
[67140.257286] Call Trace:
[67140.260665]  [<ffffffff810c16a0>] ? prepare_to_wait_event+0xf0/0xf0
[67140.268337]  [<ffffffffa04cabc3>] ? ib_dereg_mr+0x23/0x30 [ib_core]
[67140.276009]  [<ffffffffa03ee5f0>] ? ib_uverbs_cleanup_ucontext+0x320/0x440 [ib_uverbs]
[67140.285550]  [<ffffffffa03ee9e9>] ? ib_uverbs_close+0x59/0xb0 [ib_uverbs]
[67140.293807]  [<ffffffff811ff744>] ? __fput+0xe4/0x210
[67140.300132]  [<ffffffff811ff8ae>] ? ____fput+0xe/0x10
[67140.306457]  [<ffffffff8109b697>] ? task_work_run+0x77/0x90
[67140.313388]  [<ffffffff81081af2>] ? do_exit+0x2d2/0xab0
[67140.319910]  [<ffffffff8108234f>] ? do_group_exit+0x3f/0xa0
[67140.326821]  [<ffffffff8108d54c>] ? get_signal+0x1cc/0x5e0
[67140.333635]  [<ffffffff81017387>] ? do_signal+0x37/0x660
[67140.340257]  [<ffffffffa021669a>] ? ucma_write+0x7a/0xc0 [rdma_ucm]
[67140.347949]  [<ffffffff81079c37>] ? exit_to_usermode_loop+0x59/0xa2
[67140.355651]  [<ffffffff81003bad>] ? syscall_return_slowpath+0x8d/0xa0
[67140.363554]  [<ffffffff81680dcc>] ? int_ret_from_sys_call+0x25/0x8f
[67140.371259] Code:  Bad RIP value.
[67140.375678] RIP  [<ffffffffa07ccd7a>] 0xffffffffa07ccd7a
[67140.382314]  RSP <ffff880867727ad8>
[67140.386894] CR2: ffffffffa07ccd7a
[67140.393737] ---[ end trace 807b4472c30412d0 ]---
[67141.682413] Kernel panic - not syncing: Fatal exception
[67141.682431] Kernel Offset: disabled
[67141.733934] ---[ end Kernel panic - not syncing: Fatal exception

Root Cause:

During rmmod <vendor-driver> "ib_uverbs_close()" context is
still running, while "ib_uverbs_remove_one()" context completes and
ends up freeing ib_dev pointer, thus causing a Kernel Panic.

This patch fixes the race. ib_uverbs_close validates dev->ib_dev against NULL
inside an srcu lock. If it is NULL, it waits for a completion and drops the srcu
else continues with the normal flow.

CC: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/uverbs.h      |  1 +
 drivers/infiniband/core/uverbs_main.c | 18 ++++++++++++++++--
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index 612ccfd..94a7339 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -121,6 +121,7 @@ struct ib_uverbs_file {
 	struct ib_event_handler			event_handler;
 	struct ib_uverbs_event_file	       *async_file;
 	struct list_head			list;
+	struct completion			fcomp;
 	int					is_closed;
 };
 
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index 39680ae..9531168 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -928,6 +928,7 @@ static int ib_uverbs_open(struct inode *inode, struct file *filp)
 	file->async_file = NULL;
 	kref_init(&file->ref);
 	mutex_init(&file->mutex);
+	init_completion(&file->fcomp);
 
 	filp->private_data = file;
 	kobject_get(&dev->kobj);
@@ -954,21 +955,33 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp)
 	struct ib_uverbs_file *file = filp->private_data;
 	struct ib_uverbs_device *dev = file->device;
 	struct ib_ucontext *ucontext = NULL;
+	struct ib_device *ib_dev;
+	int srcu_key;
 
-	mutex_lock(&file->device->lists_mutex);
+	srcu_key = srcu_read_lock(&dev->disassociate_srcu);
+	ib_dev = srcu_dereference(dev->ib_dev,
+				  &dev->disassociate_srcu);
+	if (!ib_dev)
+		goto out;
+
+	mutex_lock(&dev->lists_mutex);
 	ucontext = file->ucontext;
 	file->ucontext = NULL;
 	if (!file->is_closed) {
 		list_del(&file->list);
 		file->is_closed = 1;
 	}
-	mutex_unlock(&file->device->lists_mutex);
+	mutex_unlock(&dev->lists_mutex);
 	if (ucontext)
 		ib_uverbs_cleanup_ucontext(file, ucontext);
 
 	if (file->async_file)
 		kref_put(&file->async_file->ref, ib_uverbs_release_event_file);
 
+	complete(&file->fcomp);
+out:
+	wait_for_completion(&file->fcomp);
+	srcu_read_unlock(&dev->disassociate_srcu, srcu_key);
 	kref_put(&file->ref, ib_uverbs_release_file);
 	kobject_put(&dev->kobj);
 
@@ -1199,6 +1212,7 @@ static void ib_uverbs_free_hw_resources(struct ib_uverbs_device *uverbs_dev,
 		}
 
 		mutex_lock(&uverbs_dev->lists_mutex);
+		complete(&file->fcomp);
 		kref_put(&file->ref, ib_uverbs_release_file);
 	}
 
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH V2] IB/uverbs: Fix race between uverbs_close and remove_one
       [not found] ` <1457343873-14869-1-git-send-email-devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
@ 2016-03-07 11:14   ` Yishai Hadas
       [not found]     ` <56DD6295.6000705-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  2016-03-07 19:08   ` Jason Gunthorpe
  1 sibling, 1 reply; 18+ messages in thread
From: Yishai Hadas @ 2016-03-07 11:14 UTC (permalink / raw)
  To: Devesh Sharma, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w, Majd Dibbiny

On 3/7/2016 11:44 AM, Devesh Sharma wrote:
> Fixes: 35d4a0b63dc0 ("IB/uverbs: Fix race between ib_uverbs_open and remove_one")

It fixes 036b10635739 (IB/uverbs: Enable device removal when there are 
active user space applications) and not the commit that you pointed on.


>
> While testing ocrdma for disassociate_ucontext support following
> kernel panic was seen:
>
> BUG: unable to handle kernel paging request at ffffffffa07ccd7a
> [67139.981020] IP: [<ffffffffa07ccd7a>] 0xffffffffa07ccd7a
> [67139.987185] PGD 19c5067 PUD 19c6063 PMD 469d08067 PTE 0
> [67139.993370] Oops: 0010 [#1] SMP
> [67140.257286] Call Trace:
> [67140.260665]  [<ffffffff810c16a0>] ? prepare_to_wait_event+0xf0/0xf0
> [67140.268337]  [<ffffffffa04cabc3>] ? ib_dereg_mr+0x23/0x30 [ib_core]
> [67140.276009]  [<ffffffffa03ee5f0>] ? ib_uverbs_cleanup_ucontext+0x320/0x440 [ib_uverbs]
> [67140.285550]  [<ffffffffa03ee9e9>] ? ib_uverbs_close+0x59/0xb0 [ib_uverbs]
> [67140.293807]  [<ffffffff811ff744>] ? __fput+0xe4/0x210
> [67140.300132]  [<ffffffff811ff8ae>] ? ____fput+0xe/0x10
> [67140.306457]  [<ffffffff8109b697>] ? task_work_run+0x77/0x90
> [67140.313388]  [<ffffffff81081af2>] ? do_exit+0x2d2/0xab0
> [67140.319910]  [<ffffffff8108234f>] ? do_group_exit+0x3f/0xa0
> [67140.326821]  [<ffffffff8108d54c>] ? get_signal+0x1cc/0x5e0
> [67140.333635]  [<ffffffff81017387>] ? do_signal+0x37/0x660
> [67140.340257]  [<ffffffffa021669a>] ? ucma_write+0x7a/0xc0 [rdma_ucm]
> [67140.347949]  [<ffffffff81079c37>] ? exit_to_usermode_loop+0x59/0xa2
> [67140.355651]  [<ffffffff81003bad>] ? syscall_return_slowpath+0x8d/0xa0
> [67140.363554]  [<ffffffff81680dcc>] ? int_ret_from_sys_call+0x25/0x8f
> [67140.371259] Code:  Bad RIP value.
> [67140.375678] RIP  [<ffffffffa07ccd7a>] 0xffffffffa07ccd7a
> [67140.382314]  RSP <ffff880867727ad8>
> [67140.386894] CR2: ffffffffa07ccd7a
> [67140.393737] ---[ end trace 807b4472c30412d0 ]---
> [67141.682413] Kernel panic - not syncing: Fatal exception
> [67141.682431] Kernel Offset: disabled
> [67141.733934] ---[ end Kernel panic - not syncing: Fatal exception
>
> Root Cause:
>
> During rmmod <vendor-driver> "ib_uverbs_close()" context is
> still running, while "ib_uverbs_remove_one()" context completes and
> ends up freeing ib_dev pointer, thus causing a Kernel Panic.

Kernel Panic -> kernel panic.
>
> This patch fixes the race. ib_uverbs_close validates dev->ib_dev against NULL
> inside an srcu lock. If it is NULL, it waits for a completion and drops the srcu
> else continues with the normal flow.

Need to fix this description as of the expected code change, see below.
Please also describe why/how it solves the problem and not what it does.

>
> CC: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
> ---
>   drivers/infiniband/core/uverbs.h      |  1 +
>   drivers/infiniband/core/uverbs_main.c | 18 ++++++++++++++++--
>   2 files changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
> index 612ccfd..94a7339 100644
> --- a/drivers/infiniband/core/uverbs.h
> +++ b/drivers/infiniband/core/uverbs.h
> @@ -121,6 +121,7 @@ struct ib_uverbs_file {
>   	struct ib_event_handler			event_handler;
>   	struct ib_uverbs_event_file	       *async_file;
>   	struct list_head			list;
> +	struct completion			fcomp;
>   	int					is_closed;
>   };
>
> diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
> index 39680ae..9531168 100644
> --- a/drivers/infiniband/core/uverbs_main.c
> +++ b/drivers/infiniband/core/uverbs_main.c
> @@ -928,6 +928,7 @@ static int ib_uverbs_open(struct inode *inode, struct file *filp)
>   	file->async_file = NULL;
>   	kref_init(&file->ref);
>   	mutex_init(&file->mutex);
> +	init_completion(&file->fcomp);
>
>   	filp->private_data = file;
>   	kobject_get(&dev->kobj);
> @@ -954,21 +955,33 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp)
>   	struct ib_uverbs_file *file = filp->private_data;
>   	struct ib_uverbs_device *dev = file->device;
>   	struct ib_ucontext *ucontext = NULL;
> +	struct ib_device *ib_dev;
> +	int srcu_key;
>
> -	mutex_lock(&file->device->lists_mutex);
> +	srcu_key = srcu_read_lock(&dev->disassociate_srcu);
> +	ib_dev = srcu_dereference(dev->ib_dev,
> +				  &dev->disassociate_srcu);
> +	if (!ib_dev)

You need to free the sruc lock, wait for completion then go to out.
Doing in the opposite order as you did below, might end up with a 
dead-lock in case ib_uverbs_free_hw_resources is waiting for the 
synchronize_srcu(disassociate_srcu) and can't mark the file as completed.

> +		goto out;
> +
> +	mutex_lock(&dev->lists_mutex);
>   	ucontext = file->ucontext;
>   	file->ucontext = NULL;
>   	if (!file->is_closed) {
>   		list_del(&file->list);
>   		file->is_closed = 1;
>   	}
> -	mutex_unlock(&file->device->lists_mutex);
> +	mutex_unlock(&dev->lists_mutex);
>   	if (ucontext)
>   		ib_uverbs_cleanup_ucontext(file, ucontext);
>
>   	if (file->async_file)
>   		kref_put(&file->async_file->ref, ib_uverbs_release_event_file);
>
> +	complete(&file->fcomp);

No need to do that in that flow, only in above flow.

> +out:
> +	wait_for_completion(&file->fcomp);
> +	srcu_read_unlock(&dev->disassociate_srcu, srcu_key);

See above, might end-up with a dead-lock, should be done above in 
opposite order.

>   	kref_put(&file->ref, ib_uverbs_release_file);
>   	kobject_put(&dev->kobj);
>
> @@ -1199,6 +1212,7 @@ static void ib_uverbs_free_hw_resources(struct ib_uverbs_device *uverbs_dev,
>   		}
>
>   		mutex_lock(&uverbs_dev->lists_mutex);
> +		complete(&file->fcomp);
>   		kref_put(&file->ref, ib_uverbs_release_file);
>   	}
>
>

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V2] IB/uverbs: Fix race between uverbs_close and remove_one
       [not found] ` <1457343873-14869-1-git-send-email-devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
  2016-03-07 11:14   ` Yishai Hadas
@ 2016-03-07 19:08   ` Jason Gunthorpe
       [not found]     ` <20160307190833.GA1886-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  1 sibling, 1 reply; 18+ messages in thread
From: Jason Gunthorpe @ 2016-03-07 19:08 UTC (permalink / raw)
  To: Devesh Sharma
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w

On Mon, Mar 07, 2016 at 04:44:33AM -0500, Devesh Sharma wrote:

> [67140.260665]  [<ffffffff810c16a0>] ? prepare_to_wait_event+0xf0/0xf0
> [67140.268337]  [<ffffffffa04cabc3>] ? ib_dereg_mr+0x23/0x30 [ib_core]

So, ib_dereg_mr at this point:

        ret = mr->device->dereg_mr(mr);

Is running when mr->device is already freed?

> During rmmod <vendor-driver> "ib_uverbs_close()" context is
> still running, while "ib_uverbs_remove_one()" context completes and
> ends up freeing ib_dev pointer, thus causing a Kernel Panic.

Hurm..

So ib_uverbs_close is busy running in ib_uverbs_cleanup_ucontext and
then ib_uverbs_free_hw_resources is called?

At first blush it certainly looks like the locking around
ib_uverbs_cleanup_context is wrong.

> This patch fixes the race. ib_uverbs_close validates dev->ib_dev against NULL
> inside an srcu lock. If it is NULL, it waits for a completion and drops the srcu
> else continues with the normal flow.

Hum.. So this is really weird, this patch is bascially duplicating a
mutex with srcu and a completion??

What is wrong with simply this:

--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -962,9 +962,9 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp)
                list_del(&file->list);
                file->is_closed = 1;
        }
-       mutex_unlock(&file->device->lists_mutex);
        if (ucontext)
                ib_uverbs_cleanup_ucontext(file, ucontext);
+       mutex_unlock(&file->device->lists_mutex);
 

??

Noting that ib_uverbs_free_hw_resources holds lists_mutex while
calling ib_uverbs_cleanup_ucontext, so it should be safe, or we have
another bug?

Certainly, the above is closer to the original intent of how this was
supposed to work...

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V2] IB/uverbs: Fix race between uverbs_close and remove_one
       [not found]     ` <56DD6295.6000705-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2016-03-08  9:49       ` Devesh Sharma
  0 siblings, 0 replies; 18+ messages in thread
From: Devesh Sharma @ 2016-03-08  9:49 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas,
	Majd Dibbiny

On Mon, Mar 7, 2016 at 4:44 PM, Yishai Hadas <yishaih-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> On 3/7/2016 11:44 AM, Devesh Sharma wrote:
>>
>> Fixes: 35d4a0b63dc0 ("IB/uverbs: Fix race between ib_uverbs_open and
>> remove_one")
>
>
> It fixes 036b10635739 (IB/uverbs: Enable device removal when there are
> active user space applications) and not the commit that you pointed on.
>
>
>
>>
>> While testing ocrdma for disassociate_ucontext support following
>> kernel panic was seen:
>>
>> BUG: unable to handle kernel paging request at ffffffffa07ccd7a
>> [67139.981020] IP: [<ffffffffa07ccd7a>] 0xffffffffa07ccd7a
>> [67139.987185] PGD 19c5067 PUD 19c6063 PMD 469d08067 PTE 0
>> [67139.993370] Oops: 0010 [#1] SMP
>> [67140.257286] Call Trace:
>> [67140.260665]  [<ffffffff810c16a0>] ? prepare_to_wait_event+0xf0/0xf0
>> [67140.268337]  [<ffffffffa04cabc3>] ? ib_dereg_mr+0x23/0x30 [ib_core]
>> [67140.276009]  [<ffffffffa03ee5f0>] ?
>> ib_uverbs_cleanup_ucontext+0x320/0x440 [ib_uverbs]
>> [67140.285550]  [<ffffffffa03ee9e9>] ? ib_uverbs_close+0x59/0xb0
>> [ib_uverbs]
>> [67140.293807]  [<ffffffff811ff744>] ? __fput+0xe4/0x210
>> [67140.300132]  [<ffffffff811ff8ae>] ? ____fput+0xe/0x10
>> [67140.306457]  [<ffffffff8109b697>] ? task_work_run+0x77/0x90
>> [67140.313388]  [<ffffffff81081af2>] ? do_exit+0x2d2/0xab0
>> [67140.319910]  [<ffffffff8108234f>] ? do_group_exit+0x3f/0xa0
>> [67140.326821]  [<ffffffff8108d54c>] ? get_signal+0x1cc/0x5e0
>> [67140.333635]  [<ffffffff81017387>] ? do_signal+0x37/0x660
>> [67140.340257]  [<ffffffffa021669a>] ? ucma_write+0x7a/0xc0 [rdma_ucm]
>> [67140.347949]  [<ffffffff81079c37>] ? exit_to_usermode_loop+0x59/0xa2
>> [67140.355651]  [<ffffffff81003bad>] ? syscall_return_slowpath+0x8d/0xa0
>> [67140.363554]  [<ffffffff81680dcc>] ? int_ret_from_sys_call+0x25/0x8f
>> [67140.371259] Code:  Bad RIP value.
>> [67140.375678] RIP  [<ffffffffa07ccd7a>] 0xffffffffa07ccd7a
>> [67140.382314]  RSP <ffff880867727ad8>
>> [67140.386894] CR2: ffffffffa07ccd7a
>> [67140.393737] ---[ end trace 807b4472c30412d0 ]---
>> [67141.682413] Kernel panic - not syncing: Fatal exception
>> [67141.682431] Kernel Offset: disabled
>> [67141.733934] ---[ end Kernel panic - not syncing: Fatal exception
>>
>> Root Cause:
>>
>> During rmmod <vendor-driver> "ib_uverbs_close()" context is
>> still running, while "ib_uverbs_remove_one()" context completes and
>> ends up freeing ib_dev pointer, thus causing a Kernel Panic.
>
>
> Kernel Panic -> kernel panic.

Will fix this in V3.

>>
>>
>> This patch fixes the race. ib_uverbs_close validates dev->ib_dev against
>> NULL
>> inside an srcu lock. If it is NULL, it waits for a completion and drops
>> the srcu
>> else continues with the normal flow.
>
>
> Need to fix this description as of the expected code change, see below.
> Please also describe why/how it solves the problem and not what it does.

Sure, will do that.

>
>
>>
>> CC: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> Signed-off-by: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
>> ---
>>   drivers/infiniband/core/uverbs.h      |  1 +
>>   drivers/infiniband/core/uverbs_main.c | 18 ++++++++++++++++--
>>   2 files changed, 17 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/infiniband/core/uverbs.h
>> b/drivers/infiniband/core/uverbs.h
>> index 612ccfd..94a7339 100644
>> --- a/drivers/infiniband/core/uverbs.h
>> +++ b/drivers/infiniband/core/uverbs.h
>> @@ -121,6 +121,7 @@ struct ib_uverbs_file {
>>         struct ib_event_handler                 event_handler;
>>         struct ib_uverbs_event_file            *async_file;
>>         struct list_head                        list;
>> +       struct completion                       fcomp;
>>         int                                     is_closed;
>>   };
>>
>> diff --git a/drivers/infiniband/core/uverbs_main.c
>> b/drivers/infiniband/core/uverbs_main.c
>> index 39680ae..9531168 100644
>> --- a/drivers/infiniband/core/uverbs_main.c
>> +++ b/drivers/infiniband/core/uverbs_main.c
>> @@ -928,6 +928,7 @@ static int ib_uverbs_open(struct inode *inode, struct
>> file *filp)
>>         file->async_file = NULL;
>>         kref_init(&file->ref);
>>         mutex_init(&file->mutex);
>> +       init_completion(&file->fcomp);
>>
>>         filp->private_data = file;
>>         kobject_get(&dev->kobj);
>> @@ -954,21 +955,33 @@ static int ib_uverbs_close(struct inode *inode,
>> struct file *filp)
>>         struct ib_uverbs_file *file = filp->private_data;
>>         struct ib_uverbs_device *dev = file->device;
>>         struct ib_ucontext *ucontext = NULL;
>> +       struct ib_device *ib_dev;
>> +       int srcu_key;
>>
>> -       mutex_lock(&file->device->lists_mutex);
>> +       srcu_key = srcu_read_lock(&dev->disassociate_srcu);
>> +       ib_dev = srcu_dereference(dev->ib_dev,
>> +                                 &dev->disassociate_srcu);
>> +       if (!ib_dev)
>
>
> You need to free the sruc lock, wait for completion then go to out.
> Doing in the opposite order as you did below, might end up with a dead-lock
> in case ib_uverbs_free_hw_resources is waiting for the
> synchronize_srcu(disassociate_srcu) and can't mark the file as completed.

Agreed. Will fix it.

>
>> +               goto out;
>> +
>> +       mutex_lock(&dev->lists_mutex);
>>         ucontext = file->ucontext;
>>         file->ucontext = NULL;
>>         if (!file->is_closed) {
>>                 list_del(&file->list);
>>                 file->is_closed = 1;
>>         }
>> -       mutex_unlock(&file->device->lists_mutex);
>> +       mutex_unlock(&dev->lists_mutex);
>>         if (ucontext)
>>                 ib_uverbs_cleanup_ucontext(file, ucontext);
>>
>>         if (file->async_file)
>>                 kref_put(&file->async_file->ref,
>> ib_uverbs_release_event_file);
>>
>> +       complete(&file->fcomp);
>
>
> No need to do that in that flow, only in above flow.

Agreed.

>
>> +out:
>> +       wait_for_completion(&file->fcomp);
>> +       srcu_read_unlock(&dev->disassociate_srcu, srcu_key);
>
>
> See above, might end-up with a dead-lock, should be done above in opposite
> order.

Will fix in next version.

>
>
>>         kref_put(&file->ref, ib_uverbs_release_file);
>>         kobject_put(&dev->kobj);
>>
>> @@ -1199,6 +1212,7 @@ static void ib_uverbs_free_hw_resources(struct
>> ib_uverbs_device *uverbs_dev,
>>                 }
>>
>>                 mutex_lock(&uverbs_dev->lists_mutex);
>> +               complete(&file->fcomp);
>>                 kref_put(&file->ref, ib_uverbs_release_file);
>>         }
>>
>>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V2] IB/uverbs: Fix race between uverbs_close and remove_one
       [not found]     ` <20160307190833.GA1886-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-03-08 10:54       ` Devesh Sharma
       [not found]         ` <CANjDDBiYagKm79n5sWNsCnxruSzqDqZYREmw1mGBR_upapF4hQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Devesh Sharma @ 2016-03-08 10:54 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas

On Tue, Mar 8, 2016 at 12:38 AM, Jason Gunthorpe
<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> On Mon, Mar 07, 2016 at 04:44:33AM -0500, Devesh Sharma wrote:
>
>> [67140.260665]  [<ffffffff810c16a0>] ? prepare_to_wait_event+0xf0/0xf0
>> [67140.268337]  [<ffffffffa04cabc3>] ? ib_dereg_mr+0x23/0x30 [ib_core]
>
> So, ib_dereg_mr at this point:
>
>         ret = mr->device->dereg_mr(mr);
>
> Is running when mr->device is already freed?

Yes.

>
>> During rmmod <vendor-driver> "ib_uverbs_close()" context is
>> still running, while "ib_uverbs_remove_one()" context completes and
>> ends up freeing ib_dev pointer, thus causing a Kernel Panic.
>
> Hurm..
>
> So ib_uverbs_close is busy running in ib_uverbs_cleanup_ucontext and
> then ib_uverbs_free_hw_resources is called?

Yes, and completed also to unblock ib_unregister_device() which actually free-up
device pointer.

>
> At first blush it certainly looks like the locking around
> ib_uverbs_cleanup_context is wrong.

I agree, from both locations it is called without any synchronization.

>
>> This patch fixes the race. ib_uverbs_close validates dev->ib_dev against NULL
>> inside an srcu lock. If it is NULL, it waits for a completion and drops the srcu
>> else continues with the normal flow.
>
> Hum.. So this is really weird, this patch is bascially duplicating a
> mutex with srcu and a completion??

Agreed.

>
> What is wrong with simply this:
>
> --- a/drivers/infiniband/core/uverbs_main.c
> +++ b/drivers/infiniband/core/uverbs_main.c
> @@ -962,9 +962,9 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp)
>                 list_del(&file->list);
>                 file->is_closed = 1;
>         }
> -       mutex_unlock(&file->device->lists_mutex);
>         if (ucontext)
>                 ib_uverbs_cleanup_ucontext(file, ucontext);
> +       mutex_unlock(&file->device->lists_mutex);
>
>
> ??

There is following comment about list_mutex in uverbs_main.c around
line number 1200:
/* We must release the mutex before going ahead and calling
 * disassociate_ucontext. disassociate_ucontext might end up
 * indirectly calling uverbs_close, for example due to freeing
 * the resources (e.g mmput).
 */

>
> Noting that ib_uverbs_free_hw_resources holds lists_mutex while
> calling ib_uverbs_cleanup_ucontext, so it should be safe, or we have
> another bug?

No, ib_uverbs_cleanup_ucontext is called outside mutex from this context.
the code takes the reference of the file pointer from the list, then
releases the mutex
then calls ib_uverbs_cleanup_ucontext. After the call is done, mutext
is acquired again.

>
> Certainly, the above is closer to the original intent of how this was
> supposed to work...
>
> Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V2] IB/uverbs: Fix race between uverbs_close and remove_one
       [not found]         ` <CANjDDBiYagKm79n5sWNsCnxruSzqDqZYREmw1mGBR_upapF4hQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-03-08 14:33           ` Yishai Hadas
  2016-03-08 17:53           ` Jason Gunthorpe
  1 sibling, 0 replies; 18+ messages in thread
From: Yishai Hadas @ 2016-03-08 14:33 UTC (permalink / raw)
  To: Devesh Sharma, Jason Gunthorpe
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas,
	Majd Dibbiny

On 3/8/2016 12:54 PM, Devesh Sharma wrote:
>> What is wrong with simply this:
>>
>> --- a/drivers/infiniband/core/uverbs_main.c
>> +++ b/drivers/infiniband/core/uverbs_main.c
>> @@ -962,9 +962,9 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp)
>>                  list_del(&file->list);
>>                  file->is_closed = 1;
>>          }
>> -       mutex_unlock(&file->device->lists_mutex);
>>          if (ucontext)
>>                  ib_uverbs_cleanup_ucontext(file, ucontext);
>> +       mutex_unlock(&file->device->lists_mutex);
>>
>>
>> ??
>
> There is following comment about list_mutex in uverbs_main.c around
> line number 1200:
> /* We must release the mutex before going ahead and calling
>   * disassociate_ucontext. disassociate_ucontext might end up
>   * indirectly calling uverbs_close, for example due to freeing
>   * the resources (e.g mmput).
>   */
>

Correct.

In addition it's very *bad/incorrect* to call ib_uverbs_cleanup_ucontext 
under the list mutex as it will prevent parallel cleanups of different 
contexts, this may end up with softlock ups as cleanup may involve FW 
commands and may take time.

The correct solution is as I described in my comments to V2, need V3 
with the fixes.




--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V2] IB/uverbs: Fix race between uverbs_close and remove_one
       [not found]         ` <CANjDDBiYagKm79n5sWNsCnxruSzqDqZYREmw1mGBR_upapF4hQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2016-03-08 14:33           ` Yishai Hadas
@ 2016-03-08 17:53           ` Jason Gunthorpe
       [not found]             ` <20160308175334.GB10805-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  1 sibling, 1 reply; 18+ messages in thread
From: Jason Gunthorpe @ 2016-03-08 17:53 UTC (permalink / raw)
  To: Devesh Sharma
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas

On Tue, Mar 08, 2016 at 04:24:51PM +0530, Devesh Sharma wrote:

> > +++ b/drivers/infiniband/core/uverbs_main.c
> > @@ -962,9 +962,9 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp)
> >                 list_del(&file->list);
> >                 file->is_closed = 1;
> >         }
> > -       mutex_unlock(&file->device->lists_mutex);
> >         if (ucontext)
> >                 ib_uverbs_cleanup_ucontext(file, ucontext);
> > +       mutex_unlock(&file->device->lists_mutex);
> >
> >
> > ??
> 
> There is following comment about list_mutex in uverbs_main.c around
> line number 1200:
> /* We must release the mutex before going ahead and calling
>  * disassociate_ucontext. disassociate_ucontext might end up
>  * indirectly calling uverbs_close, for example due to freeing
>  * the resources (e.g mmput).
>  */

Okay, now I remember this discussion, and being unhappy about this
during review.

However, this comment is talking about disassociate_ucontext, the bug
is with ib_uverbs_cleanup_ucontext. We can't re-entre uverbs_close
while we are already in uverbs_close, so that doesn't explain why it
cannot be in the mutex.

So, Yishai, what is the problem with the above lock placement?

The only issue you raised was with multi-file close concurrency, and
that is trivially solved with another mutex.

I'd rather see another mutex added then this ugly add-hoc
srcu/completion thing.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V2] IB/uverbs: Fix race between uverbs_close and remove_one
       [not found]             ` <20160308175334.GB10805-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-03-09 16:48               ` Yishai Hadas
       [not found]                 ` <56E053C8.8050008-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Yishai Hadas @ 2016-03-09 16:48 UTC (permalink / raw)
  To: Jason Gunthorpe, Devesh Sharma
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas,
	Majd Dibbiny

On 3/8/2016 7:53 PM, Jason Gunthorpe wrote:
> On Tue, Mar 08, 2016 at 04:24:51PM +0530, Devesh Sharma wrote:
>
>>> +++ b/drivers/infiniband/core/uverbs_main.c
>>> @@ -962,9 +962,9 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp)
>>>                  list_del(&file->list);
>>>                  file->is_closed = 1;
>>>          }
>>> -       mutex_unlock(&file->device->lists_mutex);
>>>          if (ucontext)
>>>                  ib_uverbs_cleanup_ucontext(file, ucontext);
>>> +       mutex_unlock(&file->device->lists_mutex);
>>>
>>>
>>> ??
>>
>> There is following comment about list_mutex in uverbs_main.c around
>> line number 1200:
>> /* We must release the mutex before going ahead and calling
>>   * disassociate_ucontext. disassociate_ucontext might end up
>>   * indirectly calling uverbs_close, for example due to freeing
>>   * the resources (e.g mmput).
>>   */
>
> Okay, now I remember this discussion, and being unhappy about this
> during review.
>
> However, this comment is talking about disassociate_ucontext, the bug
> is with ib_uverbs_cleanup_ucontext. We can't re-entre uverbs_close
> while we are already in uverbs_close, so that doesn't explain why it
> cannot be in the mutex.
>
> So, Yishai, what is the problem with the above lock placement?
>
> The only issue you raised was with multi-file close concurrency, and
> that is trivially solved with another mutex.
>
> I'd rather see another mutex added then this ugly add-hoc
> srcu/completion thing.

The srcu with NULL checking by itself can prevent the race, no need for 
the "completion" mechanism. ib_uverbs_free_hw_resources uses 
synchronize_srcu just after that ib_dev was set to NULL as part of 
ib_uverbs_remove_one.

The reason for the extra "completion" that I suggested comes to make 
sure that when an application returns from its close API the underlying 
resources were really freed, this is open in current code even if the 
race *wasn't* hit.

As we need to enable parallel closing it seems to be the preferred way 
to go.

Devesh, can you send V3 with above suggestion to help people reviewing 
it ? if you have some other solution with mutex that addressed above 
points please come it to the list for a review.






--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V2] IB/uverbs: Fix race between uverbs_close and remove_one
       [not found]                 ` <56E053C8.8050008-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2016-03-09 19:03                   ` Jason Gunthorpe
       [not found]                     ` <20160309190354.GD21139-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2016-03-10  8:16                   ` Devesh Sharma
  1 sibling, 1 reply; 18+ messages in thread
From: Jason Gunthorpe @ 2016-03-09 19:03 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: Devesh Sharma, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Yishai Hadas, Majd Dibbiny

On Wed, Mar 09, 2016 at 06:48:08PM +0200, Yishai Hadas wrote:
 
> The srcu with NULL checking by itself can prevent the race, no need for the
> "completion" mechanism. ib_uverbs_free_hw_resources uses synchronize_srcu
> just after that ib_dev was set to NULL as part of ib_uverbs_remove_one.

No, I don't think that is true, the completion looks like it is
actually needed because the goto out in ib_uverbs_close needs to wait
for ib_uverbs_free_hw_resources to do the cleanups ib_uverbs_close
skipped over before it can go ahead and kref_put things.

So, this is too ugly, do not create a mutex out of srcu and completion.

Your performance reason not to use the existing lists_mutex seems
reasonable, so add a new cleanup mutex for this purpose.

Something like this. I would also get rid of file->is_closed and use
list_del_init & list_empty instead.

diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index 39680aed99dd..8d192234fdd6 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -953,18 +953,20 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp)
 {
 	struct ib_uverbs_file *file = filp->private_data;
 	struct ib_uverbs_device *dev = file->device;
-	struct ib_ucontext *ucontext = NULL;
 
 	mutex_lock(&file->device->lists_mutex);
-	ucontext = file->ucontext;
-	file->ucontext = NULL;
 	if (!file->is_closed) {
 		list_del(&file->list);
 		file->is_closed = 1;
 	}
 	mutex_unlock(&file->device->lists_mutex);
-	if (ucontext)
-		ib_uverbs_cleanup_ucontext(file, ucontext);
+
+	mutex_lock(&file->cleanup_mutex);
+	if (file->ucontext) {
+		ib_uverbs_cleanup_ucontext(file, file->ucontext);
+		file->ucontext = NULL;
+	}
+	mutex_unlock(&file->cleanup_mutex);
 
 	if (file->async_file)
 		kref_put(&file->async_file->ref, ib_uverbs_release_event_file);
@@ -1177,26 +1179,26 @@ static void ib_uverbs_free_hw_resources(struct ib_uverbs_device *uverbs_dev,
 
 	mutex_lock(&uverbs_dev->lists_mutex);
 	while (!list_empty(&uverbs_dev->uverbs_file_list)) {
-		struct ib_ucontext *ucontext;
-
 		file = list_first_entry(&uverbs_dev->uverbs_file_list,
 					struct ib_uverbs_file, list);
 		file->is_closed = 1;
-		ucontext = file->ucontext;
 		list_del(&file->list);
-		file->ucontext = NULL;
 		kref_get(&file->ref);
 		mutex_unlock(&uverbs_dev->lists_mutex);
+
 		/* We must release the mutex before going ahead and calling
 		 * disassociate_ucontext. disassociate_ucontext might end up
 		 * indirectly calling uverbs_close, for example due to freeing
 		 * the resources (e.g mmput).
 		 */
 		ib_uverbs_event_handler(&file->event_handler, &event);
-		if (ucontext) {
-			ib_dev->disassociate_ucontext(ucontext);
-			ib_uverbs_cleanup_ucontext(file, ucontext);
+		mutex_lock(&file->cleanup_mutex);
+		if (file->ucontext) {
+			ib_dev->disassociate_ucontext(file->ucontext);
+			ib_uverbs_cleanup_ucontext(file, file->ucontext);
+			file->ucontext = NULL;
 		}
+		mutex_unlock(&file->cleanup_mutex);
 
 		mutex_lock(&uverbs_dev->lists_mutex);
 		kref_put(&file->ref, ib_uverbs_release_file);
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH V2] IB/uverbs: Fix race between uverbs_close and remove_one
       [not found]                 ` <56E053C8.8050008-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  2016-03-09 19:03                   ` Jason Gunthorpe
@ 2016-03-10  8:16                   ` Devesh Sharma
  1 sibling, 0 replies; 18+ messages in thread
From: Devesh Sharma @ 2016-03-10  8:16 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: Jason Gunthorpe, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Yishai Hadas, Majd Dibbiny

On Wed, Mar 9, 2016 at 10:18 PM, Yishai Hadas
<yishaih-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> On 3/8/2016 7:53 PM, Jason Gunthorpe wrote:
>>
>> On Tue, Mar 08, 2016 at 04:24:51PM +0530, Devesh Sharma wrote:
>>
>>>> +++ b/drivers/infiniband/core/uverbs_main.c
>>>> @@ -962,9 +962,9 @@ static int ib_uverbs_close(struct inode *inode,
>>>> struct file *filp)
>>>>                  list_del(&file->list);
>>>>                  file->is_closed = 1;
>>>>          }
>>>> -       mutex_unlock(&file->device->lists_mutex);
>>>>          if (ucontext)
>>>>                  ib_uverbs_cleanup_ucontext(file, ucontext);
>>>> +       mutex_unlock(&file->device->lists_mutex);
>>>>
>>>>
>>>> ??
>>>
>>>
>>> There is following comment about list_mutex in uverbs_main.c around
>>> line number 1200:
>>> /* We must release the mutex before going ahead and calling
>>>   * disassociate_ucontext. disassociate_ucontext might end up
>>>   * indirectly calling uverbs_close, for example due to freeing
>>>   * the resources (e.g mmput).
>>>   */
>>
>>
>> Okay, now I remember this discussion, and being unhappy about this
>> during review.
>>
>> However, this comment is talking about disassociate_ucontext, the bug
>> is with ib_uverbs_cleanup_ucontext. We can't re-entre uverbs_close
>> while we are already in uverbs_close, so that doesn't explain why it
>> cannot be in the mutex.
>>
>> So, Yishai, what is the problem with the above lock placement?
>>
>> The only issue you raised was with multi-file close concurrency, and
>> that is trivially solved with another mutex.
>>
>> I'd rather see another mutex added then this ugly add-hoc
>> srcu/completion thing.
>
>
> The srcu with NULL checking by itself can prevent the race, no need for the
> "completion" mechanism. ib_uverbs_free_hw_resources uses synchronize_srcu
> just after that ib_dev was set to NULL as part of ib_uverbs_remove_one.
>
> The reason for the extra "completion" that I suggested comes to make sure
> that when an application returns from its close API the underlying resources
> were really freed, this is open in current code even if the race *wasn't*
> hit.
>
> As we need to enable parallel closing it seems to be the preferred way to
> go.
>
> Devesh, can you send V3 with above suggestion to help people reviewing it ?
> if you have some other solution with mutex that addressed above points
> please come it to the list for a review.

Sure, I should be able to post it toady.

>
>
>
>
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V2] IB/uverbs: Fix race between uverbs_close and remove_one
       [not found]                     ` <20160309190354.GD21139-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-03-10  9:04                       ` Devesh Sharma
       [not found]                         ` <CANjDDBj=F-LTSDMesD97CvvJQWOW6fecuDLY2a9sBZ220jMYMg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2016-03-10 11:26                       ` Yishai Hadas
  1 sibling, 1 reply; 18+ messages in thread
From: Devesh Sharma @ 2016-03-10  9:04 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Yishai Hadas, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Yishai Hadas, Majd Dibbiny

On Thu, Mar 10, 2016 at 12:33 AM, Jason Gunthorpe
<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> On Wed, Mar 09, 2016 at 06:48:08PM +0200, Yishai Hadas wrote:
>
>> The srcu with NULL checking by itself can prevent the race, no need for the
>> "completion" mechanism. ib_uverbs_free_hw_resources uses synchronize_srcu
>> just after that ib_dev was set to NULL as part of ib_uverbs_remove_one.
>
> No, I don't think that is true, the completion looks like it is
> actually needed because the goto out in ib_uverbs_close needs to wait
> for ib_uverbs_free_hw_resources to do the cleanups ib_uverbs_close
> skipped over before it can go ahead and kref_put things.

Yes, I agree with the above explanation. Even my testing showed evidence.

>
> So, this is too ugly, do not create a mutex out of srcu and completion.
>
> Your performance reason not to use the existing lists_mutex seems
> reasonable, so add a new cleanup mutex for this purpose.
>
> Something like this. I would also get rid of file->is_closed and use
> list_del_init & list_empty instead.

Below patch seems to be working in the code analysis. I will test though.
I would test this approach as well before posting V3.

>
> diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
> index 39680aed99dd..8d192234fdd6 100644
> --- a/drivers/infiniband/core/uverbs_main.c
> +++ b/drivers/infiniband/core/uverbs_main.c
> @@ -953,18 +953,20 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp)
>  {
>         struct ib_uverbs_file *file = filp->private_data;
>         struct ib_uverbs_device *dev = file->device;
> -       struct ib_ucontext *ucontext = NULL;
>

ib_dev pointer will not be NULL if cleanup_mutex is taken, thus no
need to check here.

>         mutex_lock(&file->device->lists_mutex);
> -       ucontext = file->ucontext;
> -       file->ucontext = NULL;
>         if (!file->is_closed) {
>                 list_del(&file->list);
>                 file->is_closed = 1;
>         }
>         mutex_unlock(&file->device->lists_mutex);
> -       if (ucontext)
> -               ib_uverbs_cleanup_ucontext(file, ucontext);
> +
> +       mutex_lock(&file->cleanup_mutex);
> +       if (file->ucontext) {
> +               ib_uverbs_cleanup_ucontext(file, file->ucontext);
> +               file->ucontext = NULL;
> +       }
> +       mutex_unlock(&file->cleanup_mutex);
>
>         if (file->async_file)
>                 kref_put(&file->async_file->ref, ib_uverbs_release_event_file);
> @@ -1177,26 +1179,26 @@ static void ib_uverbs_free_hw_resources(struct ib_uverbs_device *uverbs_dev,
>
>         mutex_lock(&uverbs_dev->lists_mutex);
>         while (!list_empty(&uverbs_dev->uverbs_file_list)) {
> -               struct ib_ucontext *ucontext;
> -
>                 file = list_first_entry(&uverbs_dev->uverbs_file_list,
>                                         struct ib_uverbs_file, list);
>                 file->is_closed = 1;
> -               ucontext = file->ucontext;
>                 list_del(&file->list);
> -               file->ucontext = NULL;
>                 kref_get(&file->ref);
>                 mutex_unlock(&uverbs_dev->lists_mutex);
> +
>                 /* We must release the mutex before going ahead and calling
>                  * disassociate_ucontext. disassociate_ucontext might end up
>                  * indirectly calling uverbs_close, for example due to freeing
>                  * the resources (e.g mmput).
>                  */
>                 ib_uverbs_event_handler(&file->event_handler, &event);
> -               if (ucontext) {
> -                       ib_dev->disassociate_ucontext(ucontext);
> -                       ib_uverbs_cleanup_ucontext(file, ucontext);
> +               mutex_lock(&file->cleanup_mutex);
> +               if (file->ucontext) {
> +                       ib_dev->disassociate_ucontext(file->ucontext);
> +                       ib_uverbs_cleanup_ucontext(file, file->ucontext);
> +                       file->ucontext = NULL;
>                 }
> +               mutex_unlock(&file->cleanup_mutex);
>
>                 mutex_lock(&uverbs_dev->lists_mutex);
>                 kref_put(&file->ref, ib_uverbs_release_file);
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V2] IB/uverbs: Fix race between uverbs_close and remove_one
       [not found]                     ` <20160309190354.GD21139-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2016-03-10  9:04                       ` Devesh Sharma
@ 2016-03-10 11:26                       ` Yishai Hadas
       [not found]                         ` <56E159CC.3090805-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  1 sibling, 1 reply; 18+ messages in thread
From: Yishai Hadas @ 2016-03-10 11:26 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Devesh Sharma, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Yishai Hadas, Majd Dibbiny

On 3/9/2016 9:03 PM, Jason Gunthorpe wrote:
> On Wed, Mar 09, 2016 at 06:48:08PM +0200, Yishai Hadas wrote:
>
>> The srcu with NULL checking by itself can prevent the race, no need for the
>> "completion" mechanism. ib_uverbs_free_hw_resources uses synchronize_srcu
>> just after that ib_dev was set to NULL as part of ib_uverbs_remove_one.
>
> No, I don't think that is true, the completion looks like it is
> actually needed because the goto out in ib_uverbs_close needs to wait
> for ib_uverbs_free_hw_resources to do the cleanups ib_uverbs_close
> skipped over before it can go ahead and kref_put things.

Why not ? the final cleanup as part of uverbs_close doesn't depend on 
ib_dev, the kref should be fine for that. The race is *only* for 
ib_uverbs_cleanup_ucontext that uses ib_dev and it should be solved as 
of above suggestion.

> So, this is too ugly, do not create a mutex out of srcu and completion.
>
> Your performance reason not to use the existing lists_mutex seems
> reasonable, so add a new cleanup mutex for this purpose.
>
> Something like this. I would also get rid of file->is_closed and use
> list_del_init & list_empty instead.

Your suggestion is wrong, it doesn't handle the race and might end up in 
other case with deadlock, see below.

> diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
> index 39680aed99dd..8d192234fdd6 100644
> --- a/drivers/infiniband/core/uverbs_main.c
> +++ b/drivers/infiniband/core/uverbs_main.c
> @@ -953,18 +953,20 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp)
>   {
>   	struct ib_uverbs_file *file = filp->private_data;
>   	struct ib_uverbs_device *dev = file->device;
> -	struct ib_ucontext *ucontext = NULL;
>
>   	mutex_lock(&file->device->lists_mutex);
> -	ucontext = file->ucontext;
> -	file->ucontext = NULL;
>   	if (!file->is_closed) {
>   		list_del(&file->list);
>   		file->is_closed = 1;
>   	}
>   	mutex_unlock(&file->device->lists_mutex);

At that point file was deleted from the list and there is *no* sync any 
more with ib_uverbs_free_hw_resources relates to that file.

If here ib_uverbs_free_hw_resource will run to its end freeing ib_dev we 
hit the race as part of ib_uverbs_cleanup_ucontext below, the new added 
lock won't help.

> -	if (ucontext)
> -		ib_uverbs_cleanup_ucontext(file, ucontext);
> +
> +	mutex_lock(&file->cleanup_mutex);
> +	if (file->ucontext) {
> +		ib_uverbs_cleanup_ucontext(file, file->ucontext);
> +		file->ucontext = NULL;
> +	}
> +	mutex_unlock(&file->cleanup_mutex);
>
>   	if (file->async_file)
>   		kref_put(&file->async_file->ref, ib_uverbs_release_event_file);
> @@ -1177,26 +1179,26 @@ static void ib_uverbs_free_hw_resources(struct ib_uverbs_device *uverbs_dev,
>
>   	mutex_lock(&uverbs_dev->lists_mutex);
>   	while (!list_empty(&uverbs_dev->uverbs_file_list)) {
> -		struct ib_ucontext *ucontext;
> -
>   		file = list_first_entry(&uverbs_dev->uverbs_file_list,
>   					struct ib_uverbs_file, list);
>   		file->is_closed = 1;
> -		ucontext = file->ucontext;
>   		list_del(&file->list);
> -		file->ucontext = NULL;
>   		kref_get(&file->ref);
>   		mutex_unlock(&uverbs_dev->lists_mutex);
> +
>   		/* We must release the mutex before going ahead and calling
>   		 * disassociate_ucontext. disassociate_ucontext might end up
>   		 * indirectly calling uverbs_close, for example due to freeing
>   		 * the resources (e.g mmput).
>   		 */
>   		ib_uverbs_event_handler(&file->event_handler, &event);
> -		if (ucontext) {
> -			ib_dev->disassociate_ucontext(ucontext);
> -			ib_uverbs_cleanup_ucontext(file, ucontext);
> +		mutex_lock(&file->cleanup_mutex);
> +		if (file->ucontext) {
> +			ib_dev->disassociate_ucontext(file->ucontext);

This might end up with deadlock, what is the difference between taking 
this cleanup mutex comparing the list mutex ? see above comment re 
calling disassociate_ucontext under the lock.

> +			ib_uverbs_cleanup_ucontext(file, file->ucontext);
> +			file->ucontext = NULL;
>   		}
> +		mutex_unlock(&file->cleanup_mutex);
>
>   		mutex_lock(&uverbs_dev->lists_mutex);
>   		kref_put(&file->ref, ib_uverbs_release_file);
>

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V2] IB/uverbs: Fix race between uverbs_close and remove_one
       [not found]                         ` <CANjDDBj=F-LTSDMesD97CvvJQWOW6fecuDLY2a9sBZ220jMYMg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-03-10 15:25                           ` Devesh Sharma
       [not found]                             ` <CANjDDBhnJgic4QP-mL7_7cTAh-CH7xaTO147MNqat=aZ45B1nw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Devesh Sharma @ 2016-03-10 15:25 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Yishai Hadas, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Yishai Hadas, Majd Dibbiny

Hi Jason.

On Thu, Mar 10, 2016 at 2:34 PM, Devesh Sharma
<devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org> wrote:
> On Thu, Mar 10, 2016 at 12:33 AM, Jason Gunthorpe
> <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
>> On Wed, Mar 09, 2016 at 06:48:08PM +0200, Yishai Hadas wrote:
>>
>>> The srcu with NULL checking by itself can prevent the race, no need for the
>>> "completion" mechanism. ib_uverbs_free_hw_resources uses synchronize_srcu
>>> just after that ib_dev was set to NULL as part of ib_uverbs_remove_one.
>>
>> No, I don't think that is true, the completion looks like it is
>> actually needed because the goto out in ib_uverbs_close needs to wait
>> for ib_uverbs_free_hw_resources to do the cleanups ib_uverbs_close
>> skipped over before it can go ahead and kref_put things.
>
> Yes, I agree with the above explanation. Even my testing showed evidence.
>
>>
>> So, this is too ugly, do not create a mutex out of srcu and completion.
>>
>> Your performance reason not to use the existing lists_mutex seems
>> reasonable, so add a new cleanup mutex for this purpose.
>>
>> Something like this. I would also get rid of file->is_closed and use
>> list_del_init & list_empty instead.
>
> Below patch seems to be working in the code analysis. I will test though.
> I would test this approach as well before posting V3.

I tested this approach and again hit the same panic. Thus, looks like
there is no escape other than using
srcu and complete().

>
>>
>> diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
>> index 39680aed99dd..8d192234fdd6 100644
>> --- a/drivers/infiniband/core/uverbs_main.c
>> +++ b/drivers/infiniband/core/uverbs_main.c
>> @@ -953,18 +953,20 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp)
>>  {
>>         struct ib_uverbs_file *file = filp->private_data;
>>         struct ib_uverbs_device *dev = file->device;
>> -       struct ib_ucontext *ucontext = NULL;
>>
>
> ib_dev pointer will not be NULL if cleanup_mutex is taken, thus no
> need to check here.
>
>>         mutex_lock(&file->device->lists_mutex);
>> -       ucontext = file->ucontext;
>> -       file->ucontext = NULL;
>>         if (!file->is_closed) {
>>                 list_del(&file->list);
>>                 file->is_closed = 1;
>>         }
>>         mutex_unlock(&file->device->lists_mutex);
>> -       if (ucontext)
>> -               ib_uverbs_cleanup_ucontext(file, ucontext);
>> +
>> +       mutex_lock(&file->cleanup_mutex);
>> +       if (file->ucontext) {
>> +               ib_uverbs_cleanup_ucontext(file, file->ucontext);
>> +               file->ucontext = NULL;
>> +       }
>> +       mutex_unlock(&file->cleanup_mutex);
>>
>>         if (file->async_file)
>>                 kref_put(&file->async_file->ref, ib_uverbs_release_event_file);
>> @@ -1177,26 +1179,26 @@ static void ib_uverbs_free_hw_resources(struct ib_uverbs_device *uverbs_dev,
>>
>>         mutex_lock(&uverbs_dev->lists_mutex);
>>         while (!list_empty(&uverbs_dev->uverbs_file_list)) {
>> -               struct ib_ucontext *ucontext;
>> -
>>                 file = list_first_entry(&uverbs_dev->uverbs_file_list,
>>                                         struct ib_uverbs_file, list);
>>                 file->is_closed = 1;
>> -               ucontext = file->ucontext;
>>                 list_del(&file->list);
>> -               file->ucontext = NULL;
>>                 kref_get(&file->ref);
>>                 mutex_unlock(&uverbs_dev->lists_mutex);
>> +
>>                 /* We must release the mutex before going ahead and calling
>>                  * disassociate_ucontext. disassociate_ucontext might end up
>>                  * indirectly calling uverbs_close, for example due to freeing
>>                  * the resources (e.g mmput).
>>                  */
>>                 ib_uverbs_event_handler(&file->event_handler, &event);
>> -               if (ucontext) {
>> -                       ib_dev->disassociate_ucontext(ucontext);
>> -                       ib_uverbs_cleanup_ucontext(file, ucontext);
>> +               mutex_lock(&file->cleanup_mutex);
>> +               if (file->ucontext) {
>> +                       ib_dev->disassociate_ucontext(file->ucontext);
>> +                       ib_uverbs_cleanup_ucontext(file, file->ucontext);
>> +                       file->ucontext = NULL;
>>                 }
>> +               mutex_unlock(&file->cleanup_mutex);
>>
>>                 mutex_lock(&uverbs_dev->lists_mutex);
>>                 kref_put(&file->ref, ib_uverbs_release_file);
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V2] IB/uverbs: Fix race between uverbs_close and remove_one
       [not found]                             ` <CANjDDBhnJgic4QP-mL7_7cTAh-CH7xaTO147MNqat=aZ45B1nw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-03-10 15:44                               ` Yishai Hadas
       [not found]                                 ` <56E19676.4070805-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Yishai Hadas @ 2016-03-10 15:44 UTC (permalink / raw)
  To: Devesh Sharma, Jason Gunthorpe
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas,
	Majd Dibbiny

On 3/10/2016 5:25 PM, Devesh Sharma wrote:
> Hi Jason.
> I tested this approach and again hit the same panic. Thus, looks like
> there is no escape other than using
> srcu and complete().

As expected, see my analysis in my answer to Jason, his suggestion 
didn't fix the race, need to go with the sruc solution.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V2] IB/uverbs: Fix race between uverbs_close and remove_one
       [not found]                                 ` <56E19676.4070805-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2016-03-10 15:57                                   ` Devesh Sharma
  0 siblings, 0 replies; 18+ messages in thread
From: Devesh Sharma @ 2016-03-10 15:57 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: Jason Gunthorpe, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Yishai Hadas, Majd Dibbiny

On Thu, Mar 10, 2016 at 9:14 PM, Yishai Hadas
<yishaih-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> On 3/10/2016 5:25 PM, Devesh Sharma wrote:
>>
>> Hi Jason.
>> I tested this approach and again hit the same panic. Thus, looks like
>> there is no escape other than using
>> srcu and complete().
>
>
> As expected, see my analysis in my answer to Jason, his suggestion didn't
> fix the race, need to go with the sruc solution.

Yes, I also had doubts so thought of confirming before proceeding. I
will post V3

>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V2] IB/uverbs: Fix race between uverbs_close and remove_one
       [not found]                         ` <56E159CC.3090805-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2016-03-10 21:05                           ` Jason Gunthorpe
       [not found]                             ` <20160310210535.GA9735-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Jason Gunthorpe @ 2016-03-10 21:05 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: Devesh Sharma, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Yishai Hadas, Majd Dibbiny

On Thu, Mar 10, 2016 at 01:26:04PM +0200, Yishai Hadas wrote:

>> No, I don't think that is true, the completion looks like it is
>> actually needed because the goto out in ib_uverbs_close needs to wait
>> for ib_uverbs_free_hw_resources to do the cleanups ib_uverbs_close

>> skipped over before it can go ahead and kref_put things.
> Why not ? the final cleanup as part of uverbs_close doesn't depend on ib_dev,
> the kref should be fine for that. The race is *only* for
> ib_uverbs_cleanup_ucontext that uses ib_dev and it should be solved as of
> above suggestion.

It has nothing to do with ib_dev, srcu doesn't lock the list:

CPU 0                                   CPU 1
rcu_assign_pointer(ib_dev, null)
ib_uverbs_free_hw_resources()
synchronize_srcu();
                                        ib_uverbs_close()
                                        srcu_read_lock
					.. goto out
					kref_put(file->ref) .. kfree(file)
ib_uverbs_free_hw_resources()
mutex_lock(&lists_mutex);
while (!list_empty(&uverbs_dev->uverbs_file_list))
  .. Boom, use after free of file->list ..

Ie, as I said, we can't put until we know the list_del is done, and
the goto skips over list_del.

The completion is preventing the above scenario, can't remove it.

> >@@ -953,18 +953,20 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp)
> >  {
> >  	struct ib_uverbs_file *file = filp->private_data;
> >  	struct ib_uverbs_device *dev = file->device;
> >-	struct ib_ucontext *ucontext = NULL;
> >
> >  	mutex_lock(&file->device->lists_mutex);
> >-	ucontext = file->ucontext;
> >-	file->ucontext = NULL;
> >  	if (!file->is_closed) {
> >  		list_del(&file->list);
> >  		file->is_closed = 1;
> >  	}
> >  	mutex_unlock(&file->device->lists_mutex);
> 
> At that point file was deleted from the list and there is *no* sync any more
> with ib_uverbs_free_hw_resources relates to that file.

Yes, that is right. The ordering of the two locking blocks in ib_uverbs_close should
be swapped to prevent this.

> >-		if (ucontext) {
> >-			ib_dev->disassociate_ucontext(ucontext);
> >-			ib_uverbs_cleanup_ucontext(file, ucontext);
> >+		mutex_lock(&file->cleanup_mutex);
> >+		if (file->ucontext) {
> >+			ib_dev->disassociate_ucontext(file->ucontext);
> 
> This might end up with deadlock, what is the difference between taking this
> cleanup mutex comparing the list mutex ? see above comment re calling
> disassociate_ucontext under the lock.

If the above can deadlock then so can the wait on completion, since it
is basically the same construct.

Fortunately that isn't hard to deal with, more like this:

diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index 39680aed99dd..b46bbd3dda98 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -953,18 +953,20 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp)
 {
 	struct ib_uverbs_file *file = filp->private_data;
 	struct ib_uverbs_device *dev = file->device;
-	struct ib_ucontext *ucontext = NULL;
+
+	mutex_lock(&file->cleanup_mutex);
+	if (file->ucontext) {
+		ib_uverbs_cleanup_ucontext(file, file->ucontext);
+		file->ucontext = NULL;
+	}
+	mutex_unlock(&file->cleanup_mutex);
 
 	mutex_lock(&file->device->lists_mutex);
-	ucontext = file->ucontext;
-	file->ucontext = NULL;
 	if (!file->is_closed) {
 		list_del(&file->list);
 		file->is_closed = 1;
 	}
 	mutex_unlock(&file->device->lists_mutex);
-	if (ucontext)
-		ib_uverbs_cleanup_ucontext(file, ucontext);
 
 	if (file->async_file)
 		kref_put(&file->async_file->ref, ib_uverbs_release_event_file);
@@ -1177,26 +1179,28 @@ static void ib_uverbs_free_hw_resources(struct ib_uverbs_device *uverbs_dev,
 
 	mutex_lock(&uverbs_dev->lists_mutex);
 	while (!list_empty(&uverbs_dev->uverbs_file_list)) {
-		struct ib_ucontext *ucontext;
-
 		file = list_first_entry(&uverbs_dev->uverbs_file_list,
 					struct ib_uverbs_file, list);
 		file->is_closed = 1;
-		ucontext = file->ucontext;
 		list_del(&file->list);
-		file->ucontext = NULL;
 		kref_get(&file->ref);
 		mutex_unlock(&uverbs_dev->lists_mutex);
+
 		/* We must release the mutex before going ahead and calling
 		 * disassociate_ucontext. disassociate_ucontext might end up
 		 * indirectly calling uverbs_close, for example due to freeing
 		 * the resources (e.g mmput).
 		 */
 		ib_uverbs_event_handler(&file->event_handler, &event);
-		if (ucontext) {
-			ib_dev->disassociate_ucontext(ucontext);
-			ib_uverbs_cleanup_ucontext(file, ucontext);
+		mutex_lock(&file->cleanup_mutex);
+		if (file->ucontext) {
+			file->ucontext = NULL;
+			mutex_unlock(&file->cleanup_mutex);
+			ib_dev->disassociate_ucontext(file->ucontext);
+			ib_uverbs_cleanup_ucontext(file, file->ucontext);
 		}
+		else
+			mutex_unlock(&file->cleanup_mutex);
 
 		mutex_lock(&uverbs_dev->lists_mutex);
 		kref_put(&file->ref, ib_uverbs_release_file);
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH V2] IB/uverbs: Fix race between uverbs_close and remove_one
       [not found]                             ` <20160310210535.GA9735-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-03-14 15:55                               ` Yishai Hadas
       [not found]                                 ` <56E6DEEB.30904-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Yishai Hadas @ 2016-03-14 15:55 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Devesh Sharma, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Yishai Hadas, Majd Dibbiny

On 3/10/2016 11:05 PM, Jason Gunthorpe wrote:
> On Thu, Mar 10, 2016 at 01:26:04PM +0200, Yishai Hadas wrote:
>
>>> No, I don't think that is true, the completion looks like it is
>>> actually needed because the goto out in ib_uverbs_close needs to wait
>>> for ib_uverbs_free_hw_resources to do the cleanups ib_uverbs_close
>
>>> skipped over before it can go ahead and kref_put things.
>> Why not ? the final cleanup as part of uverbs_close doesn't depend on ib_dev,
>> the kref should be fine for that. The race is *only* for
>> ib_uverbs_cleanup_ucontext that uses ib_dev and it should be solved as of
>> above suggestion.
>
> It has nothing to do with ib_dev, srcu doesn't lock the list:
>
> CPU 0                                   CPU 1
> rcu_assign_pointer(ib_dev, null)
> ib_uverbs_free_hw_resources()
> synchronize_srcu();
>                                          ib_uverbs_close()
>                                          srcu_read_lock
> 					.. goto out
> 					kref_put(file->ref) .. kfree(file)
> ib_uverbs_free_hw_resources()
> mutex_lock(&lists_mutex);
> while (!list_empty(&uverbs_dev->uverbs_file_list))
>    .. Boom, use after free of file->list ..
>
> Ie, as I said, we can't put until we know the list_del is done, and
> the goto skips over list_del.
>
> The completion is preventing the above scenario, can't remove it.

Why do we need the completion for that ? the race can be prevented by 
below flow as part of uverbs_close.

ib_uverbs_close()
{
..
struct ib_ucontext *ucontext = NULL;
+ struct ib_device *ib_dev;

+ srcu_read_lock(...)
+ ib_dev = srcu_dereference(...);
mutex_lock(&file->device->lists_mutex);
ucontext = file->ucontext;
file->ucontext = NULL;
- if (!file->is_closed) {
+ if (ib_dev) {
	list_del(&file->list);
	file->is_closed = 1;
}
mutex_unlock(&file->device->lists_mutex);
if (ucontext)
	ib_uverbs_cleanup_ucontext(file, ucontext);

+ srcu_read_unlock()
if (file->async_file)
	kref_put(&file->async_file->ref, ib_uverbs_release_event_file);
...

}

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V2] IB/uverbs: Fix race between uverbs_close and remove_one
       [not found]                                 ` <56E6DEEB.30904-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2016-03-14 17:29                                   ` Jason Gunthorpe
  0 siblings, 0 replies; 18+ messages in thread
From: Jason Gunthorpe @ 2016-03-14 17:29 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: Devesh Sharma, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Yishai Hadas, Majd Dibbiny

On Mon, Mar 14, 2016 at 05:55:23PM +0200, Yishai Hadas wrote:
> On 3/10/2016 11:05 PM, Jason Gunthorpe wrote:
> >On Thu, Mar 10, 2016 at 01:26:04PM +0200, Yishai Hadas wrote:
> >
> >>>No, I don't think that is true, the completion looks like it is
> >>>actually needed because the goto out in ib_uverbs_close needs to wait
> >>>for ib_uverbs_free_hw_resources to do the cleanups ib_uverbs_close
> >
> >>>skipped over before it can go ahead and kref_put things.
> >>Why not ? the final cleanup as part of uverbs_close doesn't depend on ib_dev,
> >>the kref should be fine for that. The race is *only* for
> >>ib_uverbs_cleanup_ucontext that uses ib_dev and it should be solved as of
> >>above suggestion.
> >
> >It has nothing to do with ib_dev, srcu doesn't lock the list:
> >
> >CPU 0                                   CPU 1
> >rcu_assign_pointer(ib_dev, null)
> >ib_uverbs_free_hw_resources()
> >synchronize_srcu();
> >                                         ib_uverbs_close()
> >                                         srcu_read_lock
> >					.. goto out
> >					kref_put(file->ref) .. kfree(file)
> >ib_uverbs_free_hw_resources()
> >mutex_lock(&lists_mutex);
> >while (!list_empty(&uverbs_dev->uverbs_file_list))
> >   .. Boom, use after free of file->list ..
> >
> >Ie, as I said, we can't put until we know the list_del is done, and
> >the goto skips over list_del.
> >
> >The completion is preventing the above scenario, can't remove it.
> 
> Why do we need the completion for that ? the race can be prevented by below
> flow as part of uverbs_close.

I'm not sure what this is trying to show.

> ib_uverbs_close()
> {
> ..
> struct ib_ucontext *ucontext = NULL;
> + struct ib_device *ib_dev;
> 
> + srcu_read_lock(...)
> + ib_dev = srcu_dereference(...);
> mutex_lock(&file->device->lists_mutex);
> ucontext = file->ucontext;
> file->ucontext = NULL;
> - if (!file->is_closed) {
> + if (ib_dev) {
> 	list_del(&file->list);
> 	file->is_closed = 1;
> }
> mutex_unlock(&file->device->lists_mutex);
> if (ucontext)
> 	ib_uverbs_cleanup_ucontext(file, ucontext);
> 
> + srcu_read_unlock()

This doesn't do anything, there is no assurance that ib_uverbs_close
doesn't start after synchronize_srcu.

Did you mean to include a goto? In which case, it is broken because
skipping over the ib_uverbs_cleanup_ucontext and doing list_del will
leak something.

Seriously, sit down and make a proper patch for this already.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2016-03-14 17:29 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-07  9:44 [PATCH V2] IB/uverbs: Fix race between uverbs_close and remove_one Devesh Sharma
     [not found] ` <1457343873-14869-1-git-send-email-devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
2016-03-07 11:14   ` Yishai Hadas
     [not found]     ` <56DD6295.6000705-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2016-03-08  9:49       ` Devesh Sharma
2016-03-07 19:08   ` Jason Gunthorpe
     [not found]     ` <20160307190833.GA1886-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-03-08 10:54       ` Devesh Sharma
     [not found]         ` <CANjDDBiYagKm79n5sWNsCnxruSzqDqZYREmw1mGBR_upapF4hQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-03-08 14:33           ` Yishai Hadas
2016-03-08 17:53           ` Jason Gunthorpe
     [not found]             ` <20160308175334.GB10805-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-03-09 16:48               ` Yishai Hadas
     [not found]                 ` <56E053C8.8050008-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2016-03-09 19:03                   ` Jason Gunthorpe
     [not found]                     ` <20160309190354.GD21139-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-03-10  9:04                       ` Devesh Sharma
     [not found]                         ` <CANjDDBj=F-LTSDMesD97CvvJQWOW6fecuDLY2a9sBZ220jMYMg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-03-10 15:25                           ` Devesh Sharma
     [not found]                             ` <CANjDDBhnJgic4QP-mL7_7cTAh-CH7xaTO147MNqat=aZ45B1nw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-03-10 15:44                               ` Yishai Hadas
     [not found]                                 ` <56E19676.4070805-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2016-03-10 15:57                                   ` Devesh Sharma
2016-03-10 11:26                       ` Yishai Hadas
     [not found]                         ` <56E159CC.3090805-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2016-03-10 21:05                           ` Jason Gunthorpe
     [not found]                             ` <20160310210535.GA9735-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-03-14 15:55                               ` Yishai Hadas
     [not found]                                 ` <56E6DEEB.30904-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2016-03-14 17:29                                   ` Jason Gunthorpe
2016-03-10  8:16                   ` Devesh Sharma

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.