* [PATCH v2] IB/umem: ib_ucontext already have tgid, remove pid from ib_umem structure
@ 2018-05-08 8:50 Lidong Chen
2018-05-15 23:14 ` Jason Gunthorpe
2018-06-13 4:36 ` Jason Gunthorpe
0 siblings, 2 replies; 5+ messages in thread
From: Lidong Chen @ 2018-05-08 8:50 UTC (permalink / raw)
To: dledford, jgg, akpm, qing.huang, leon, artemyko, dan.j.williams
Cc: linux-rdma, linux-kernel, adido, galsha, aviadye, Lidong Chen
Userspace may invoke ibv_reg_mr and ibv_dereg_mr from different threads.
If the thread that invoked ibv_reg_mr has already exited when ibv_dereg_mr
is invoked, get_pid_task returns NULL and ib_umem_release does not decrease
mm->pinned_vm. This patch fixes that by using the tgid in the ib_ucontext
struct.
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
---
[v2]
- use ib_ucontext tgid instead of tgid in ib_umem structure
drivers/infiniband/core/umem.c | 7 +------
include/rdma/ib_umem.h | 1 -
2 files changed, 1 insertion(+), 7 deletions(-)
diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 9a4e899..2b6c9b5 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -119,7 +119,6 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
umem->length = size;
umem->address = addr;
umem->page_shift = PAGE_SHIFT;
- umem->pid = get_task_pid(current, PIDTYPE_PID);
/*
* We ask for writable memory if any of the following
* access flags are set. "Local write" and "remote write"
@@ -132,7 +131,6 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
if (access & IB_ACCESS_ON_DEMAND) {
- put_pid(umem->pid);
ret = ib_umem_odp_get(context, umem, access);
if (ret) {
kfree(umem);
@@ -148,7 +146,6 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
page_list = (struct page **) __get_free_page(GFP_KERNEL);
if (!page_list) {
- put_pid(umem->pid);
kfree(umem);
return ERR_PTR(-ENOMEM);
}
@@ -231,7 +228,6 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
if (ret < 0) {
if (need_release)
__ib_umem_release(context->device, umem, 0);
- put_pid(umem->pid);
kfree(umem);
} else
current->mm->pinned_vm = locked;
@@ -274,8 +270,7 @@ void ib_umem_release(struct ib_umem *umem)
__ib_umem_release(umem->context->device, umem, 1);
- task = get_pid_task(umem->pid, PIDTYPE_PID);
- put_pid(umem->pid);
+ task = get_pid_task(umem->context->tgid, PIDTYPE_PID);
if (!task)
goto out;
mm = get_task_mm(task);
diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h
index 23159dd..a1fd638 100644
--- a/include/rdma/ib_umem.h
+++ b/include/rdma/ib_umem.h
@@ -48,7 +48,6 @@ struct ib_umem {
int writable;
int hugetlb;
struct work_struct work;
- struct pid *pid;
struct mm_struct *mm;
unsigned long diff;
struct ib_umem_odp *odp_data;
--
1.8.3.1
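The failure mode described in the commit message can be illustrated with a small userspace model. This is only a sketch: `toy_mm`, `toy_thread`, `toy_get_pid_task` and `toy_release` are hypothetical stand-ins, not kernel types or functions, and the model assumes that a lookup on an exited thread's pid returns NULL while the thread group leader stays reachable.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; none of these are real kernel types. */
struct toy_mm { long pinned_vm; };
struct toy_thread { int pid; int tgid; int alive; struct toy_mm *mm; };

/* Models get_pid_task(): the live thread with this pid, or NULL. */
static struct toy_thread *toy_get_pid_task(struct toy_thread *threads,
                                           size_t n, int pid)
{
    for (size_t i = 0; i < n; i++)
        if (threads[i].alive && threads[i].pid == pid)
            return &threads[i];
    return NULL;
}

/*
 * Models the release path keyed by a stored pid.
 * Returns 1 if the pages were uncharged, 0 if the lookup failed and
 * pinned_vm was left untouched (the leak described above).
 */
static int toy_release(struct toy_thread *threads, size_t n,
                       int pid, long npages)
{
    struct toy_thread *task = toy_get_pid_task(threads, n, pid);

    if (!task)
        return 0;
    assert(task->mm != NULL);
    task->mm->pinned_vm -= npages;
    return 1;
}
```

Keying the release on the pid of a thread that has exited leaves the counter unchanged, whereas keying it on the tgid still reaches the shared mm as long as the leader is alive in this model.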
* Re: [PATCH v2] IB/umem: ib_ucontext already have tgid, remove pid from ib_umem structure
2018-05-08 8:50 [PATCH v2] IB/umem: ib_ucontext already have tgid, remove pid from ib_umem structure Lidong Chen
@ 2018-05-15 23:14 ` Jason Gunthorpe
2018-05-16 7:32 ` 858585 jemmy
2018-06-13 4:36 ` Jason Gunthorpe
1 sibling, 1 reply; 5+ messages in thread
From: Jason Gunthorpe @ 2018-05-15 23:14 UTC (permalink / raw)
To: Lidong Chen
Cc: dledford, akpm, qing.huang, leon, artemyko, dan.j.williams,
linux-rdma, linux-kernel, adido, galsha, aviadye, Lidong Chen
On Tue, May 08, 2018 at 04:50:16PM +0800, Lidong Chen wrote:
> Userspace may invoke ibv_reg_mr and ibv_dereg_mr from different threads.
> If the thread that invoked ibv_reg_mr has already exited when ibv_dereg_mr
> is invoked, get_pid_task returns NULL and ib_umem_release does not decrease
> mm->pinned_vm. This patch fixes that by using the tgid in the ib_ucontext
> struct.
>
> Signed-off-by: Lidong Chen <lidongchen@tencent.com>
> ---
> [v2]
> - use ib_ucontext tgid instead of tgid in ib_umem structure
>
> drivers/infiniband/core/umem.c | 7 +------
> include/rdma/ib_umem.h | 1 -
> 2 files changed, 1 insertion(+), 7 deletions(-)
Applied to for-rc, thanks.
It would be nice to send a cleanup to have all the users of tgid doing
this pattern
task = get_pid_task(umem->context->tgid, PIDTYPE_PID);
if (!task)
goto out;
mm = get_task_mm(task);
To call some kind of common function like ib_get_mr_mm(), just to make
it really clear what is happening.
Jason
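A rough userspace sketch of the kind of consolidation suggested above, where the repeated lookup pattern moves into one function. Everything here is an assumption for illustration: `toy_get_mr_mm` only borrows its name from the proposed ib_get_mr_mm(), which is not defined anywhere in this thread, and the `toy_*` types are not kernel types.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for kernel objects; none of these are real kernel types. */
struct toy_mm { long pinned_vm; };
struct toy_task { int pid; int alive; struct toy_mm *mm; };

#define TOY_MAX_TASKS 8
struct toy_pid_table {
    struct toy_task tasks[TOY_MAX_TASKS];
    size_t count;
};

/* Models get_pid_task(): the live task with this pid, or NULL. */
static struct toy_task *toy_find_task(struct toy_pid_table *t, int pid)
{
    for (size_t i = 0; i < t->count; i++)
        if (t->tasks[i].alive && t->tasks[i].pid == pid)
            return &t->tasks[i];
    return NULL;
}

/*
 * Hypothetical helper in the spirit of ib_get_mr_mm(): the single place
 * that turns a stored tgid into an mm, or NULL if the owner is gone.
 */
static struct toy_mm *toy_get_mr_mm(struct toy_pid_table *t, int tgid)
{
    struct toy_task *task = toy_find_task(t, tgid);

    return task ? task->mm : NULL;
}

/* A caller: undo the pin accounting. Returns 0 on success, -1 if no mm. */
static int toy_umem_release(struct toy_pid_table *t, int tgid, long npages)
{
    struct toy_mm *mm = toy_get_mr_mm(t, tgid);

    if (!mm)
        return -1;
    assert(mm->pinned_vm >= npages);
    mm->pinned_vm -= npages;
    return 0;
}
```

With the lookup in one helper, each caller only has to handle the NULL case, and any later change to which pid is used for the lookup happens in a single place.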
* Re: [PATCH v2] IB/umem: ib_ucontext already have tgid, remove pid from ib_umem structure
2018-05-15 23:14 ` Jason Gunthorpe
@ 2018-05-16 7:32 ` 858585 jemmy
0 siblings, 0 replies; 5+ messages in thread
From: 858585 jemmy @ 2018-05-16 7:32 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: dledford, akpm, qing.huang, Leon Romanovsky, artemyko,
dan.j.williams, linux-rdma, linux-kernel, adido, Gal Shachaf,
Aviad Yehezkel, Lidong Chen
On Wed, May 16, 2018 at 7:14 AM, Jason Gunthorpe <jgg@ziepe.ca> wrote:
> On Tue, May 08, 2018 at 04:50:16PM +0800, Lidong Chen wrote:
>> Userspace may invoke ibv_reg_mr and ibv_dereg_mr from different threads.
>> If the thread that invoked ibv_reg_mr has already exited when ibv_dereg_mr
>> is invoked, get_pid_task returns NULL and ib_umem_release does not decrease
>> mm->pinned_vm. This patch fixes that by using the tgid in the ib_ucontext
>> struct.
>>
>> Signed-off-by: Lidong Chen <lidongchen@tencent.com>
>> ---
>> [v2]
>> - use ib_ucontext tgid instead of tgid in ib_umem structure
>>
>> drivers/infiniband/core/umem.c | 7 +------
>> include/rdma/ib_umem.h | 1 -
>> 2 files changed, 1 insertion(+), 7 deletions(-)
>
> Applied to for-rc, thanks.
>
> It would be nice to send a cleanup to have all the users of tgid doing
> this pattern
>
> task = get_pid_task(umem->context->tgid, PIDTYPE_PID);
> if (!task)
> goto out;
> mm = get_task_mm(task);
>
> To call some kind of common function like ib_get_mr_mm(), just to make
> it really clear what is happening.
OK, I will submit a patch for this.
>
> Jason
* Re: [PATCH v2] IB/umem: ib_ucontext already have tgid, remove pid from ib_umem structure
2018-05-08 8:50 [PATCH v2] IB/umem: ib_ucontext already have tgid, remove pid from ib_umem structure Lidong Chen
2018-05-15 23:14 ` Jason Gunthorpe
@ 2018-06-13 4:36 ` Jason Gunthorpe
2018-06-13 9:25 ` 858585 jemmy
1 sibling, 1 reply; 5+ messages in thread
From: Jason Gunthorpe @ 2018-06-13 4:36 UTC (permalink / raw)
To: Lidong Chen
Cc: dledford, akpm, qing.huang, leon, artemyko, dan.j.williams,
linux-rdma, linux-kernel, adido, galsha, aviadye, Lidong Chen
On Tue, May 08, 2018 at 04:50:16PM +0800, Lidong Chen wrote:
> Userspace may invoke ibv_reg_mr and ibv_dereg_mr from different threads.
> If the thread that invoked ibv_reg_mr has already exited when ibv_dereg_mr
> is invoked, get_pid_task returns NULL and ib_umem_release does not decrease
> mm->pinned_vm. This patch fixes that by using the tgid in the ib_ucontext
> struct.
>
> Signed-off-by: Lidong Chen <lidongchen@tencent.com>
> ---
> [v2]
> - use ib_ucontext tgid instead of tgid in ib_umem structure
I'm looking at this again, and it doesn't seem quite right..
> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> index 9a4e899..2b6c9b5 100644
> --- a/drivers/infiniband/core/umem.c
> +++ b/drivers/infiniband/core/umem.c
> @@ -119,7 +119,6 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
> umem->length = size;
> umem->address = addr;
> umem->page_shift = PAGE_SHIFT;
> - umem->pid = get_task_pid(current, PIDTYPE_PID);
> /*
> * We ask for writable memory if any of the following
> * access flags are set. "Local write" and "remote write"
> @@ -132,7 +131,6 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
> IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
>
> if (access & IB_ACCESS_ON_DEMAND) {
> - put_pid(umem->pid);
> ret = ib_umem_odp_get(context, umem, access);
> if (ret) {
> kfree(umem);
> @@ -148,7 +146,6 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
>
> page_list = (struct page **) __get_free_page(GFP_KERNEL);
> if (!page_list) {
> - put_pid(umem->pid);
> kfree(umem);
> return ERR_PTR(-ENOMEM);
> }
in ib_umem_get we are doing this:
down_write(&current->mm->mmap_sem);
locked = npages + current->mm->pinned_vm;
And then in release we now do:
task = get_pid_task(umem->context->tgid, PIDTYPE_PID);
if (!task)
goto out;
mm = get_task_mm(task);
mm->pinned_vm -= diff;
But there is no guarantee that context->tgid and 'current' are the
same thing during ib_umem_get..
So in the dysfunctional case where someone forks and keeps the context
FD open on both sides of the fork they can cause the pinned_vm
counter to become wrong in the processes. Sounds bad..
Thus, I think we need to go back to storing the tgid in the ib_umem
and just fix it to store the group leader not the thread PID?
And then even more we need the ib_get_mr_mm() helper to make sense of
this, because all the drivers are doing the wrong thing by using the
context->tgid too.
Is that all right?
Jason
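The accounting mismatch described above can be modeled in userspace. This is a sketch under stated assumptions: a forked child has its own mm and shares the context FD with its parent; all `toy_*` names are hypothetical and are not kernel types or functions.

```c
#include <assert.h>

/* Toy stand-ins; none of these are real kernel types. */
struct toy_mm { long pinned_vm; };
struct toy_proc { int tgid; struct toy_mm mm; };
struct toy_context { struct toy_proc *opener; };  /* models context->tgid */
struct toy_umem {
    struct toy_context *ctx;
    struct toy_proc *owner;   /* process that registered the memory */
    long npages;
};

/* Registration charges the caller's mm, as ib_umem_get does via current->mm. */
static struct toy_umem toy_umem_get(struct toy_context *ctx,
                                    struct toy_proc *current_proc, long npages)
{
    assert(npages > 0);
    current_proc->mm.pinned_vm += npages;
    struct toy_umem u = { .ctx = ctx, .owner = current_proc, .npages = npages };
    return u;
}

/* Release as in the applied patch: uncharge whoever opened the context. */
static void toy_release_via_context(struct toy_umem *u)
{
    u->ctx->opener->mm.pinned_vm -= u->npages;
}

/* Release as proposed above: uncharge the process recorded at registration. */
static void toy_release_via_owner(struct toy_umem *u)
{
    u->owner->mm.pinned_vm -= u->npages;
}
```

If the child registers memory through the parent's context, releasing via the context leaves the child's counter too high and drives the parent's below zero, while releasing via the recorded owner restores both to their starting values.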
* Re: [PATCH v2] IB/umem: ib_ucontext already have tgid, remove pid from ib_umem structure
2018-06-13 4:36 ` Jason Gunthorpe
@ 2018-06-13 9:25 ` 858585 jemmy
0 siblings, 0 replies; 5+ messages in thread
From: 858585 jemmy @ 2018-06-13 9:25 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: dledford, akpm, qing.huang, Leon Romanovsky, artemyko,
dan.j.williams, linux-rdma, linux-kernel, Adi Dotan, Gal Shachaf,
Aviad Yehezkel, Lidong Chen
On Wed, Jun 13, 2018 at 12:36 PM, Jason Gunthorpe <jgg@ziepe.ca> wrote:
> On Tue, May 08, 2018 at 04:50:16PM +0800, Lidong Chen wrote:
>> Userspace may invoke ibv_reg_mr and ibv_dereg_mr from different threads.
>> If the thread that invoked ibv_reg_mr has already exited when ibv_dereg_mr
>> is invoked, get_pid_task returns NULL and ib_umem_release does not decrease
>> mm->pinned_vm. This patch fixes that by using the tgid in the ib_ucontext
>> struct.
>>
>> Signed-off-by: Lidong Chen <lidongchen@tencent.com>
>> ---
>> [v2]
>> - use ib_ucontext tgid instead of tgid in ib_umem structure
>
> I'm looking at this again, and it doesn't seem quite right..
>
>> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
>> index 9a4e899..2b6c9b5 100644
>> --- a/drivers/infiniband/core/umem.c
>> +++ b/drivers/infiniband/core/umem.c
>> @@ -119,7 +119,6 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
>> umem->length = size;
>> umem->address = addr;
>> umem->page_shift = PAGE_SHIFT;
>> - umem->pid = get_task_pid(current, PIDTYPE_PID);
>> /*
>> * We ask for writable memory if any of the following
>> * access flags are set. "Local write" and "remote write"
>> @@ -132,7 +131,6 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
>> IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
>>
>> if (access & IB_ACCESS_ON_DEMAND) {
>> - put_pid(umem->pid);
>> ret = ib_umem_odp_get(context, umem, access);
>> if (ret) {
>> kfree(umem);
>> @@ -148,7 +146,6 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
>>
>> page_list = (struct page **) __get_free_page(GFP_KERNEL);
>> if (!page_list) {
>> - put_pid(umem->pid);
>> kfree(umem);
>> return ERR_PTR(-ENOMEM);
>> }
>
> in ib_umem_get we are doing this:
>
> down_write(&current->mm->mmap_sem);
> locked = npages + current->mm->pinned_vm;
>
> And then in release we now do:
>
> task = get_pid_task(umem->context->tgid, PIDTYPE_PID);
> if (!task)
> goto out;
> mm = get_task_mm(task);
> mm->pinned_vm -= diff;
>
> But there is no guarantee that context->tgid and 'current' are the
> same thing during ib_umem_get..
context->tgid and current may be different, but different threads in one
process should point to the same mm structure, so it should work for the
multithreaded case.
>
> So in the dysfunctional case where someone forks and keeps the context
> FD open on both sides of the fork they can cause the pinned_vm
> counter to become wrong in the processes. Sounds bad..
I am not sure about fork support; I will look into this problem.
>
> Thus, I think we need to go back to storing the tgid in the ib_umem
> and just fix it to store the group leader not the thread PID?
>
> And then even more we need the ib_get_mr_mm() helper to make sense of
> this, because all the drivers are doing the wrong thing by using the
> context->tgid too.
>
> Is that all right?
>
> Jason