* [PATCH] drm/amdgpu: add rcu_barrier after entity fini
@ 2018-05-17 10:03 Emily Deng
       [not found] ` <1526551432-12599-1-git-send-email-Emily.Deng-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Emily Deng @ 2018-05-17 10:03 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Emily Deng

Freeing a fence from amdgpu_fence_slab takes two call_rcu invocations. To keep
amdgpu_fence_slab_fini from calling kmem_cache_destroy(amdgpu_fence_slab) before
kmem_cache_free(amdgpu_fence_slab, fence) has run, add an rcu_barrier after
drm_sched_entity_fini.

The call trace leading to kmem_cache_free(amdgpu_fence_slab, fence) is as follows:
1.drm_sched_entity_fini ->
drm_sched_entity_cleanup ->
dma_fence_put(entity->last_scheduled) ->
drm_sched_fence_release_finished ->
drm_sched_fence_release_scheduled ->
call_rcu(&fence->finished.rcu, drm_sched_fence_free)

2.drm_sched_fence_free ->
dma_fence_put(fence->parent) ->
amdgpu_fence_release ->
call_rcu(&f->rcu, amdgpu_fence_free) ->
kmem_cache_free(amdgpu_fence_slab, fence);

v2: put the barrier before kmem_cache_destroy

Change-Id: I8dcadd3372f97e72461bf46b41cc26d90f09b8df
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 39ec6b8..42be65b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -69,6 +69,7 @@ int amdgpu_fence_slab_init(void)
 void amdgpu_fence_slab_fini(void)
 {
 	rcu_barrier();
+	rcu_barrier();
 	kmem_cache_destroy(amdgpu_fence_slab);
 }
 /*
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* Re: [PATCH] drm/amdgpu: add rcu_barrier after entity fini
       [not found] ` <1526551432-12599-1-git-send-email-Emily.Deng-5C7GfCeVMHo@public.gmane.org>
@ 2018-05-17 11:07   ` Christian König
       [not found]     ` <ec5de5ac-f9b8-53e5-0db4-3e0791469b40-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Christian König @ 2018-05-17 11:07 UTC (permalink / raw)
  To: Emily Deng, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Am 17.05.2018 um 12:03 schrieb Emily Deng:
> To free the fence from the amdgpu_fence_slab, need twice call_rcu, to avoid
> the amdgpu_fence_slab_fini call kmem_cache_destroy(amdgpu_fence_slab) before
> kmem_cache_free(amdgpu_fence_slab, fence), add rcu_barrier after drm_sched_entity_fini.
>
> The kmem_cache_free(amdgpu_fence_slab, fence)'s call trace as below:
> 1.drm_sched_entity_fini ->
> drm_sched_entity_cleanup ->
> dma_fence_put(entity->last_scheduled) ->
> drm_sched_fence_release_finished ->
> drm_sched_fence_release_scheduled ->
> call_rcu(&fence->finished.rcu, drm_sched_fence_free)
>
> 2.drm_sched_fence_free ->
> dma_fence_put(fence->parent) ->
> amdgpu_fence_release ->
> call_rcu(&f->rcu, amdgpu_fence_free) ->
> kmem_cache_free(amdgpu_fence_slab, fence);
>
> v2:put the barrier before the kmem_cache_destroy
>
> Change-Id: I8dcadd3372f97e72461bf46b41cc26d90f09b8df
> Signed-off-by: Emily Deng <Emily.Deng@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> index 39ec6b8..42be65b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> @@ -69,6 +69,7 @@ int amdgpu_fence_slab_init(void)
>   void amdgpu_fence_slab_fini(void)
>   {
>   	rcu_barrier();
> +	rcu_barrier();

Well, you should have noticed that there is already an rcu_barrier here,
and adding another one shouldn't have any additional effect. So your
explanation and the proposed solution don't make much sense.

I think the problem you are running into is rather that the fence is
reference counted and might live longer than the module that created it.

It's a complicated issue. One possible solution would be to release
fence->parent earlier in the scheduler fence, but that doesn't sound like
a general-purpose solution.

Christian.

>   	kmem_cache_destroy(amdgpu_fence_slab);
>   }
>   /*


* RE: [PATCH] drm/amdgpu: add rcu_barrier after entity fini
       [not found]     ` <ec5de5ac-f9b8-53e5-0db4-3e0791469b40-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2018-05-18  3:20       ` Deng, Emily
       [not found]         ` <CY4PR12MB112526B02190A3D892CCB84F8F900-rpdhrqHFk07v2MZdTKcfDgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Deng, Emily @ 2018-05-18  3:20 UTC (permalink / raw)
  To: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Hi Christian,
     Yes, there is already one rcu_barrier, but call_rcu is invoked twice, so a single rcu_barrier sometimes waits for only one of the two call_rcu callbacks.
     After I added a second rcu_barrier, the kernel issue disappeared.

Best Wishes,
Emily Deng


* RE: [PATCH] drm/amdgpu: add rcu_barrier after entity fini
       [not found]         ` <CY4PR12MB112526B02190A3D892CCB84F8F900-rpdhrqHFk07v2MZdTKcfDgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2018-05-18  9:41           ` Deng, Emily
       [not found]             ` <BN6PR12MB11210121E65A155D901B94AE8F900-/b2+HYfkarSgw6z4+5+8kgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Deng, Emily @ 2018-05-18  9:41 UTC (permalink / raw)
  To: Deng, Emily, Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Ping......

Best Wishes,
Emily Deng





* Re: [PATCH] drm/amdgpu: add rcu_barrier after entity fini
       [not found]             ` <BN6PR12MB11210121E65A155D901B94AE8F900-/b2+HYfkarSgw6z4+5+8kgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2018-05-18  9:45               ` Christian König
       [not found]                 ` <3f9070bd-ebce-77d4-5979-6ce383885064-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Christian König @ 2018-05-18  9:45 UTC (permalink / raw)
  To: Deng, Emily, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

OK, I'm lost: where do we use call_rcu() twice? Because that sounds
incorrect in the first place.

Christian.


* RE: [PATCH] drm/amdgpu: add rcu_barrier after entity fini
       [not found]                 ` <3f9070bd-ebce-77d4-5979-6ce383885064-5C7GfCeVMHo@public.gmane.org>
@ 2018-05-18  9:56                   ` Deng, Emily
       [not found]                     ` <BN6PR12MB1121DDBFDB283E01AC198B038F900-/b2+HYfkarSgw6z4+5+8kgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Deng, Emily @ 2018-05-18  9:56 UTC (permalink / raw)
  To: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



Hi Christian,
     When we free an IB fence, call_rcu is first invoked in drm_sched_fence_release_scheduled (call trace 1). Once that callback has run,
     call_rcu is invoked a second time in amdgpu_fence_release (call trace 2), as shown below.

     The call trace leading to kmem_cache_free(amdgpu_fence_slab, fence) is as follows:
    1.drm_sched_entity_fini ->
    drm_sched_entity_cleanup ->
    dma_fence_put(entity->last_scheduled) ->
    drm_sched_fence_release_finished ->
    drm_sched_fence_release_scheduled ->
    call_rcu(&fence->finished.rcu, drm_sched_fence_free)

    2.drm_sched_fence_free ->
    dma_fence_put(fence->parent) ->
    amdgpu_fence_release ->
    call_rcu(&f->rcu, amdgpu_fence_free) ->
    kmem_cache_free(amdgpu_fence_slab, fence);
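
The ordering problem in the two traces above can be reproduced with a small
userspace model. This is only a sketch under stated assumptions, not kernel
code: toy_call_rcu() and toy_rcu_barrier() are hypothetical stand-ins for
call_rcu() and rcu_barrier(), and the model assumes that a barrier waits only
for callbacks that were already queued when it was called. Under that
assumption the callback queued by trace 1 queues the callback of trace 2, so
the slab object is still allocated after a single barrier.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model, NOT kernel RCU: a FIFO of deferred callbacks. */
#define TOY_MAX_CBS 8

typedef void (*toy_cb_t)(void);

static toy_cb_t toy_queue[TOY_MAX_CBS];
static size_t toy_head, toy_tail;
static int hw_fence_freed; /* becomes 1 once the slab object is released */

/* Hypothetical stand-in for call_rcu(): defer cb until a later barrier. */
static void toy_call_rcu(toy_cb_t cb)
{
	assert(toy_tail < TOY_MAX_CBS);
	toy_queue[toy_tail++] = cb;
}

/*
 * Hypothetical stand-in for rcu_barrier(): run only the callbacks that
 * were queued on entry. Anything queued while draining stays pending.
 */
static void toy_rcu_barrier(void)
{
	size_t end = toy_tail;

	while (toy_head < end) {
		toy_cb_t cb = toy_queue[toy_head++];
		cb();
	}
}

/* Number of callbacks still waiting for a barrier. */
static size_t toy_pending(void)
{
	return toy_tail - toy_head;
}

/* End of trace 2: models amdgpu_fence_free -> kmem_cache_free. */
static void toy_amdgpu_fence_free(void)
{
	hw_fence_freed = 1;
}

/*
 * Start of trace 2: models drm_sched_fence_free dropping fence->parent,
 * which ends in the second call_rcu.
 */
static void toy_drm_sched_fence_free(void)
{
	toy_call_rcu(toy_amdgpu_fence_free);
}

/* Trace 1: models drm_sched_entity_fini ending in the first call_rcu. */
static void toy_entity_fini(void)
{
	toy_call_rcu(toy_drm_sched_fence_free);
}

/* Return the model to its initial state. */
static void toy_reset(void)
{
	toy_head = toy_tail = 0;
	hw_fence_freed = 0;
}
```

In this model, toy_entity_fini() followed by one toy_rcu_barrier() leaves
hw_fence_freed at 0 with one callback still pending; only a second
toy_rcu_barrier() runs the free.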


* Re: [PATCH] drm/amdgpu: add rcu_barrier after entity fini
       [not found]                     ` <BN6PR12MB1121DDBFDB283E01AC198B038F900-/b2+HYfkarSgw6z4+5+8kgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2018-05-18 10:35                       ` Christian König
       [not found]                         ` <bc3fbefd-957a-9b89-8096-22b44f6dc3d2-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Christian König @ 2018-05-18 10:35 UTC (permalink / raw)
  To: Deng, Emily, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



Ah! So we have one RCU grace period caused by the scheduler fence and
another one from the hardware fence.

Thanks for explaining that once more; it was really not obvious from
reading the code.

But this means we could also fix it by moving the
"dma_fence_put(fence->parent);" from drm_sched_fence_free() into
drm_sched_fence_release_scheduled(), directly before the call_rcu(),
couldn't we?

I think that would be cleaner, because otherwise every driver using the
GPU scheduler would need that workaround.

And drm_sched_fence_release_scheduled() is only called when the
reference count becomes zero, so anybody accessing the structure
after that would be a bug as well.
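
As a rough illustration of why the reordering would help, here is a userspace
toy model, not kernel code. toy_call_rcu() and toy_rcu_barrier() are
hypothetical stand-ins for call_rcu() and rcu_barrier(); the model assumes a
barrier waits only for callbacks already queued when it is called, and that
the put on fence->parent drops the last reference. With the put moved in
front of the scheduler fence's call_rcu(), both callbacks are queued before
the barrier starts, so one barrier covers both.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model, NOT kernel RCU: a FIFO of deferred callbacks. */
#define TOY_MAX_CBS 8

typedef void (*toy_cb_t)(void);

static toy_cb_t toy_queue[TOY_MAX_CBS];
static size_t toy_head, toy_tail;
static int sched_fence_freed, hw_fence_freed;

/* Hypothetical stand-in for call_rcu(): defer cb until a later barrier. */
static void toy_call_rcu(toy_cb_t cb)
{
	assert(toy_tail < TOY_MAX_CBS);
	toy_queue[toy_tail++] = cb;
}

/*
 * Hypothetical stand-in for rcu_barrier(): run only the callbacks that
 * were queued on entry. Anything queued while draining stays pending.
 */
static void toy_rcu_barrier(void)
{
	size_t end = toy_tail;

	while (toy_head < end) {
		toy_cb_t cb = toy_queue[toy_head++];
		cb();
	}
}

/* Number of callbacks still waiting for a barrier. */
static size_t toy_pending(void)
{
	return toy_tail - toy_head;
}

/* Models amdgpu_fence_free returning the object to the slab. */
static void toy_hw_fence_free(void)
{
	hw_fence_freed = 1;
}

/* Models drm_sched_fence_free, which no longer touches fence->parent. */
static void toy_sched_fence_free(void)
{
	sched_fence_freed = 1;
}

/*
 * Models the proposed drm_sched_fence_release_scheduled(): the parent
 * reference is dropped first (assumed to be the last one, so the
 * hardware fence's callback is queued right away), then the scheduler
 * fence's own callback is queued.
 */
static void toy_release_scheduled(void)
{
	toy_call_rcu(toy_hw_fence_free);    /* dma_fence_put(fence->parent) */
	toy_call_rcu(toy_sched_fence_free); /* call_rcu(..., drm_sched_fence_free) */
}
```

In this model a single toy_rcu_barrier() after toy_release_scheduled() runs
both callbacks and leaves nothing pending.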

Thanks,
Christian.


* RE: [PATCH] drm/amdgpu: add rcu_barrier after entity fini
       [not found]                         ` <bc3fbefd-957a-9b89-8096-22b44f6dc3d2-5C7GfCeVMHo@public.gmane.org>
@ 2018-05-21  7:54                           ` Deng, Emily
  0 siblings, 0 replies; 17+ messages in thread
From: Deng, Emily @ 2018-05-21  7:54 UTC (permalink / raw)
  To: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



Hi Christian,
     Thanks for your advice; I will send a modified patch later.

Best Wishes,
Emily Deng
From: Koenig, Christian
Sent: Friday, May 18, 2018 6:36 PM
To: Deng, Emily <Emily.Deng@amd.com>; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: add rcu_barrier after entity fini

Ah! So we have one RCU grace period caused by the scheduler fence and another one from the hardware fence.

Thanks for explaining that once more, that was really not obvious from reading the code.

But this means we could we also fix it by moving the "dma_fence_put(fence->parent);" from drm_sched_fence_free() into drm_sched_fence_release_scheduled() directly before the call_rcu(), can't we?

I think that would be cleaner, cause otherwise every driver using the GPU scheduler would need that workaround.

And drm_sched_fence_release_scheduled() is only called when the reference count becomes zero, so when anybody accesses the structure after that then that would be a bug as well.

Thanks,
Christian.

Am 18.05.2018 um 11:56 schrieb Deng, Emily:
Hi Christian,
        When we free an IB fence, we first call one call_rcu in drm_sched_fence_release_scheduled as the call trace one, then after the call trace one,
we call the call_rcu second in the  amdgpu_fence_release in call trace two, as below.

The kmem_cache_free(amdgpu_fence_slab, fence)'s call trace as below:
    1.drm_sched_entity_fini ->
    drm_sched_entity_cleanup ->
    dma_fence_put(entity->last_scheduled) ->
    drm_sched_fence_release_finished ->
    drm_sched_fence_release_scheduled ->
    call_rcu(&fence->finished.rcu, drm_sched_fence_free)

    2.drm_sched_fence_free ->
    dma_fence_put(fence->parent) ->
    amdgpu_fence_release ->
    call_rcu(&f->rcu, amdgpu_fence_free) ->
    kmem_cache_free(amdgpu_fence_slab, fence);

> -----Original Message-----
> From: Koenig, Christian
> Sent: Friday, May 18, 2018 5:46 PM
> To: Deng, Emily <Emily.Deng@amd.com><mailto:Emily.Deng@amd.com>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
> Subject: Re: [PATCH] drm/amdgpu: add rcu_barrier after entity fini
>
> Ok, I'm lost where do we use call_rcu() twice? Cause that sounds incorrect in
> the first place.
>
> Christian.
>
> Am 18.05.2018 um 11:41 schrieb Deng, Emily:
> > Ping......
> >
> > Best Wishes,
> > Emily Deng
> >
> >
> >
> >
> >> -----Original Message-----
> >> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On
> >> Behalf Of Deng, Emily
> >> Sent: Friday, May 18, 2018 11:20 AM
> >> To: Koenig, Christian <Christian.Koenig@amd.com>; amd-gfx@lists.freedesktop.org
> >> Subject: RE: [PATCH] drm/amdgpu: add rcu_barrier after entity fini
> >>
> >> Hi Christian,
> >>       Yes, there is already one rcu_barrier, but call_rcu is called twice,
> >> so a single rcu_barrier may only wait for the first call_rcu.
> >>      After I added another rcu_barrier, the kernel issue disappeared.
> >>
> >> Best Wishes,
> >> Emily Deng
> >>
> >>> -----Original Message-----
> >>> From: Christian König [mailto:ckoenig.leichtzumerken@gmail.com]
> >>> Sent: Thursday, May 17, 2018 7:08 PM
> >>> To: Deng, Emily <Emily.Deng@amd.com>; amd-gfx@lists.freedesktop.org
> >>> Subject: Re: [PATCH] drm/amdgpu: add rcu_barrier after entity fini
> >>>
> >>> On 17.05.2018 at 12:03, Emily Deng wrote:
> >>>> To free the fence from the amdgpu_fence_slab, need twice call_rcu,
> >>>> to avoid the amdgpu_fence_slab_fini call
> >>>> kmem_cache_destroy(amdgpu_fence_slab) before
> >>>> kmem_cache_free(amdgpu_fence_slab, fence), add rcu_barrier after
> >>>> drm_sched_entity_fini.
> >>>>
> >>>> The kmem_cache_free(amdgpu_fence_slab, fence)'s call trace as below:
> >>>> 1.drm_sched_entity_fini ->
> >>>> drm_sched_entity_cleanup ->
> >>>> dma_fence_put(entity->last_scheduled) ->
> >>>> drm_sched_fence_release_finished ->
> >>>> drm_sched_fence_release_scheduled ->
> >>>> call_rcu(&fence->finished.rcu, drm_sched_fence_free)
> >>>>
> >>>> 2.drm_sched_fence_free ->
> >>>> dma_fence_put(fence->parent) ->
> >>>> amdgpu_fence_release ->
> >>>> call_rcu(&f->rcu, amdgpu_fence_free) ->
> >>>> kmem_cache_free(amdgpu_fence_slab, fence);
> >>>>
> >>>> v2:put the barrier before the kmem_cache_destroy
> >>>>
> >>>> Change-Id: I8dcadd3372f97e72461bf46b41cc26d90f09b8df
> >>>> Signed-off-by: Emily Deng <Emily.Deng@amd.com>
> >>>> ---
> >>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 1 +
> >>>>    1 file changed, 1 insertion(+)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> >>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> >>>> index 39ec6b8..42be65b 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> >>>> @@ -69,6 +69,7 @@ int amdgpu_fence_slab_init(void)
> >>>>    void amdgpu_fence_slab_fini(void)
> >>>>    {
> >>>>           rcu_barrier();
> >>>> +        rcu_barrier();
> >>> Well, you should have noted that there is already an rcu_barrier
> >>> here and adding another one shouldn't have any additional effect. So
> >>> your explanation and the proposed solution don't make too much
> >>> sense.
> >>>
> >>> I think the problem you run into is rather that the fence is
> >>> reference counted and might live longer than the module that created it.
> >>>
> >>> Complicated issue, one possible solution would be to release
> >>> fence->parent earlier in the scheduler fence, but that doesn't sound like
> >>> a general purpose solution.
> >>>
> >>> Christian.
> >>>
> >>>>           kmem_cache_destroy(amdgpu_fence_slab);
> >>>>    }
> >>>>    /*
> >> _______________________________________________
> >> amd-gfx mailing list
> >> amd-gfx@lists.freedesktop.org
> >> https://lists.freedesktop.org/mailman/listinfo/amd-gfx





^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH] drm/amdgpu: add rcu_barrier after entity fini
       [not found]         ` <c19068bf-4c4d-bb74-f31c-f5b71a2f8333-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2018-05-23  7:47           ` Deng, Emily
  0 siblings, 0 replies; 17+ messages in thread
From: Deng, Emily @ 2018-05-23  7:47 UTC (permalink / raw)
  To: Koenig, Christian; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Thanks.

Best Wishes,
Emily Deng

> -----Original Message-----
> From: Christian König [mailto:ckoenig.leichtzumerken@gmail.com]
> Sent: Wednesday, May 23, 2018 3:33 PM
> To: Deng, Emily <Emily.Deng@amd.com>; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: add rcu_barrier after entity fini
> 
> Sorry missed that one.
> 
> Patch is Reviewed-by: Christian König <christian.koenig@amd.com>.
> 
> Nice work,
> Christian.
> 
> On 23.05.2018 at 07:25, Deng, Emily wrote:
> > Ping ......
> >
> >> -----Original Message-----
> >> From: Emily Deng [mailto:Emily.Deng@amd.com]
> >> Sent: Monday, May 21, 2018 4:09 PM
> >> To: amd-gfx@lists.freedesktop.org
> >> Cc: Deng, Emily <Emily.Deng@amd.com>
> >> Subject: [PATCH] drm/amdgpu: add rcu_barrier after entity fini
> >>
> >> To free the fence from the amdgpu_fence_slab, need twice call_rcu, to
> >> avoid the amdgpu_fence_slab_fini call
> >> kmem_cache_destroy(amdgpu_fence_slab) before
> >> kmem_cache_free(amdgpu_fence_slab, fence), add rcu_barrier after
> >> drm_sched_entity_fini.
> >>
> >> The kmem_cache_free(amdgpu_fence_slab, fence)'s call trace as below:
> >> 1.drm_sched_entity_fini ->
> >> drm_sched_entity_cleanup ->
> >> dma_fence_put(entity->last_scheduled) ->
> >> drm_sched_fence_release_finished ->
> >> drm_sched_fence_release_scheduled ->
> >> call_rcu(&fence->finished.rcu, drm_sched_fence_free)
> >>
> >> 2.drm_sched_fence_free ->
> >> dma_fence_put(fence->parent) ->
> >> amdgpu_fence_release ->
> >> call_rcu(&f->rcu, amdgpu_fence_free) ->
> >> kmem_cache_free(amdgpu_fence_slab, fence);
> >>
> >> v2:put the barrier before the kmem_cache_destroy
> >> v3:put the dma_fence_put(fence->parent) before call_rcu in
> >> drm_sched_fence_release_scheduled
> >>
> >> Change-Id: I8dcadd3372f97e72461bf46b41cc26d90f09b8df
> >> Signed-off-by: Emily Deng <Emily.Deng@amd.com>
> >> ---
> >>   drivers/gpu/drm/scheduler/sched_fence.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c
> >> b/drivers/gpu/drm/scheduler/sched_fence.c
> >> index 786b47f..df44616 100644
> >> --- a/drivers/gpu/drm/scheduler/sched_fence.c
> >> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> >> @@ -98,7 +98,6 @@ static void drm_sched_fence_free(struct rcu_head *rcu)
> >>   	struct dma_fence *f = container_of(rcu, struct dma_fence, rcu);
> >>   	struct drm_sched_fence *fence = to_drm_sched_fence(f);
> >>
> >> -	dma_fence_put(fence->parent);
> >>   	kmem_cache_free(sched_fence_slab, fence);
> >>   }
> >>
> >> @@ -114,6 +113,7 @@ static void drm_sched_fence_release_scheduled(struct dma_fence *f)
> >>   {
> >>   	struct drm_sched_fence *fence = to_drm_sched_fence(f);
> >>
> >> +	dma_fence_put(fence->parent);
> >>   	call_rcu(&fence->finished.rcu, drm_sched_fence_free);
> >>   }
> >>
> >> --
> >> 2.7.4
> > _______________________________________________
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drm/amdgpu: add rcu_barrier after entity fini
       [not found]     ` <CY4PR12MB1125509E5759D4FDB7C097998F6B0-rpdhrqHFk07v2MZdTKcfDgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2018-05-23  7:32       ` Christian König
       [not found]         ` <c19068bf-4c4d-bb74-f31c-f5b71a2f8333-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Christian König @ 2018-05-23  7:32 UTC (permalink / raw)
  To: Deng, Emily, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Sorry missed that one.

Patch is Reviewed-by: Christian König <christian.koenig@amd.com>.

Nice work,
Christian.

On 23.05.2018 at 07:25, Deng, Emily wrote:
> Ping ......
>
>> -----Original Message-----
>> From: Emily Deng [mailto:Emily.Deng@amd.com]
>> Sent: Monday, May 21, 2018 4:09 PM
>> To: amd-gfx@lists.freedesktop.org
>> Cc: Deng, Emily <Emily.Deng@amd.com>
>> Subject: [PATCH] drm/amdgpu: add rcu_barrier after entity fini
>>
>> To free the fence from the amdgpu_fence_slab, need twice call_rcu, to
>> avoid the amdgpu_fence_slab_fini call
>> kmem_cache_destroy(amdgpu_fence_slab) before
>> kmem_cache_free(amdgpu_fence_slab, fence), add rcu_barrier after
>> drm_sched_entity_fini.
>>
>> The kmem_cache_free(amdgpu_fence_slab, fence)'s call trace as below:
>> 1.drm_sched_entity_fini ->
>> drm_sched_entity_cleanup ->
>> dma_fence_put(entity->last_scheduled) ->
>> drm_sched_fence_release_finished ->
>> drm_sched_fence_release_scheduled ->
>> call_rcu(&fence->finished.rcu, drm_sched_fence_free)
>>
>> 2.drm_sched_fence_free ->
>> dma_fence_put(fence->parent) ->
>> amdgpu_fence_release ->
>> call_rcu(&f->rcu, amdgpu_fence_free) ->
>> kmem_cache_free(amdgpu_fence_slab, fence);
>>
>> v2:put the barrier before the kmem_cache_destroy
>> v3:put the dma_fence_put(fence->parent) before call_rcu in
>> drm_sched_fence_release_scheduled
>>
>> Change-Id: I8dcadd3372f97e72461bf46b41cc26d90f09b8df
>> Signed-off-by: Emily Deng <Emily.Deng@amd.com>
>> ---
>>   drivers/gpu/drm/scheduler/sched_fence.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c
>> b/drivers/gpu/drm/scheduler/sched_fence.c
>> index 786b47f..df44616 100644
>> --- a/drivers/gpu/drm/scheduler/sched_fence.c
>> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
>> @@ -98,7 +98,6 @@ static void drm_sched_fence_free(struct rcu_head *rcu)
>>   	struct dma_fence *f = container_of(rcu, struct dma_fence, rcu);
>>   	struct drm_sched_fence *fence = to_drm_sched_fence(f);
>>
>> -	dma_fence_put(fence->parent);
>>   	kmem_cache_free(sched_fence_slab, fence);
>>   }
>>
>> @@ -114,6 +113,7 @@ static void drm_sched_fence_release_scheduled(struct dma_fence *f)
>>   {
>>   	struct drm_sched_fence *fence = to_drm_sched_fence(f);
>>
>> +	dma_fence_put(fence->parent);
>>   	call_rcu(&fence->finished.rcu, drm_sched_fence_free);
>>   }
>>
>> --
>> 2.7.4
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx


^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH] drm/amdgpu: add rcu_barrier after entity fini
       [not found] ` <1526890130-673-1-git-send-email-Emily.Deng-5C7GfCeVMHo@public.gmane.org>
@ 2018-05-23  5:25   ` Deng, Emily
       [not found]     ` <CY4PR12MB1125509E5759D4FDB7C097998F6B0-rpdhrqHFk07v2MZdTKcfDgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Deng, Emily @ 2018-05-23  5:25 UTC (permalink / raw)
  To: Deng, Emily, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Ping ......

> -----Original Message-----
> From: Emily Deng [mailto:Emily.Deng@amd.com]
> Sent: Monday, May 21, 2018 4:09 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deng, Emily <Emily.Deng@amd.com>
> Subject: [PATCH] drm/amdgpu: add rcu_barrier after entity fini
> 
> To free the fence from the amdgpu_fence_slab, need twice call_rcu, to
> avoid the amdgpu_fence_slab_fini call
> kmem_cache_destroy(amdgpu_fence_slab) before
> kmem_cache_free(amdgpu_fence_slab, fence), add rcu_barrier after
> drm_sched_entity_fini.
> 
> The kmem_cache_free(amdgpu_fence_slab, fence)'s call trace as below:
> 1.drm_sched_entity_fini ->
> drm_sched_entity_cleanup ->
> dma_fence_put(entity->last_scheduled) ->
> drm_sched_fence_release_finished ->
> drm_sched_fence_release_scheduled ->
> call_rcu(&fence->finished.rcu, drm_sched_fence_free)
> 
> 2.drm_sched_fence_free ->
> dma_fence_put(fence->parent) ->
> amdgpu_fence_release ->
> call_rcu(&f->rcu, amdgpu_fence_free) ->
> kmem_cache_free(amdgpu_fence_slab, fence);
> 
> v2:put the barrier before the kmem_cache_destroy
> v3:put the dma_fence_put(fence->parent) before call_rcu in
> drm_sched_fence_release_scheduled
> 
> Change-Id: I8dcadd3372f97e72461bf46b41cc26d90f09b8df
> Signed-off-by: Emily Deng <Emily.Deng@amd.com>
> ---
>  drivers/gpu/drm/scheduler/sched_fence.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c
> b/drivers/gpu/drm/scheduler/sched_fence.c
> index 786b47f..df44616 100644
> --- a/drivers/gpu/drm/scheduler/sched_fence.c
> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> @@ -98,7 +98,6 @@ static void drm_sched_fence_free(struct rcu_head *rcu)
>  	struct dma_fence *f = container_of(rcu, struct dma_fence, rcu);
>  	struct drm_sched_fence *fence = to_drm_sched_fence(f);
> 
> -	dma_fence_put(fence->parent);
>  	kmem_cache_free(sched_fence_slab, fence);
>  }
> 
> @@ -114,6 +113,7 @@ static void drm_sched_fence_release_scheduled(struct dma_fence *f)
>  {
>  	struct drm_sched_fence *fence = to_drm_sched_fence(f);
> 
> +	dma_fence_put(fence->parent);
>  	call_rcu(&fence->finished.rcu, drm_sched_fence_free);
>  }
> 
> --
> 2.7.4


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH] drm/amdgpu: add rcu_barrier after entity fini
@ 2018-05-21  8:08 Emily Deng
       [not found] ` <1526890130-673-1-git-send-email-Emily.Deng-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Emily Deng @ 2018-05-21  8:08 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Emily Deng

Freeing a fence from the amdgpu_fence_slab takes two chained call_rcu
invocations. To keep amdgpu_fence_slab_fini from calling
kmem_cache_destroy(amdgpu_fence_slab) before
kmem_cache_free(amdgpu_fence_slab, fence) has run, add an rcu_barrier after
drm_sched_entity_fini.

The call trace leading to kmem_cache_free(amdgpu_fence_slab, fence) is as follows:
1.drm_sched_entity_fini ->
drm_sched_entity_cleanup ->
dma_fence_put(entity->last_scheduled) ->
drm_sched_fence_release_finished ->
drm_sched_fence_release_scheduled ->
call_rcu(&fence->finished.rcu, drm_sched_fence_free)

2.drm_sched_fence_free ->
dma_fence_put(fence->parent) ->
amdgpu_fence_release ->
call_rcu(&f->rcu, amdgpu_fence_free) ->
kmem_cache_free(amdgpu_fence_slab, fence);

v2:put the barrier before the kmem_cache_destroy
v3:put the dma_fence_put(fence->parent) before call_rcu in
drm_sched_fence_release_scheduled

Change-Id: I8dcadd3372f97e72461bf46b41cc26d90f09b8df
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
---
 drivers/gpu/drm/scheduler/sched_fence.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
index 786b47f..df44616 100644
--- a/drivers/gpu/drm/scheduler/sched_fence.c
+++ b/drivers/gpu/drm/scheduler/sched_fence.c
@@ -98,7 +98,6 @@ static void drm_sched_fence_free(struct rcu_head *rcu)
 	struct dma_fence *f = container_of(rcu, struct dma_fence, rcu);
 	struct drm_sched_fence *fence = to_drm_sched_fence(f);
 
-	dma_fence_put(fence->parent);
 	kmem_cache_free(sched_fence_slab, fence);
 }
 
@@ -114,6 +113,7 @@ static void drm_sched_fence_release_scheduled(struct dma_fence *f)
 {
 	struct drm_sched_fence *fence = to_drm_sched_fence(f);
 
+	dma_fence_put(fence->parent);
 	call_rcu(&fence->finished.rcu, drm_sched_fence_free);
 }
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* RE: [PATCH] drm/amdgpu: add rcu_barrier after entity fini
       [not found]     ` <a2990863-eea0-ecdd-0e86-f208fd91d59c-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2018-05-17 10:01       ` Deng, Emily
  0 siblings, 0 replies; 17+ messages in thread
From: Deng, Emily @ 2018-05-17 10:01 UTC (permalink / raw)
  To: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Hi Christian,
     Good suggestion. Putting the rcu_barrier() before the kmem_cache_destroy() will be better. I will send a v2 patch later. Thanks.

Best Wishes,
Emily Deng

> -----Original Message-----
> From: Christian König [mailto:ckoenig.leichtzumerken@gmail.com]
> Sent: Thursday, May 17, 2018 4:39 PM
> To: Deng, Emily <Emily.Deng@amd.com>; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: add rcu_barrier after entity fini
> 
> On 17.05.2018 at 05:05, Emily Deng wrote:
> > To free the fence from the amdgpu_fence_slab, need twice call_rcu, to
> > avoid the amdgpu_fence_slab_fini call
> > kmem_cache_destroy(amdgpu_fence_slab) before
> > kmem_cache_free(amdgpu_fence_slab, fence), add rcu_barrier after
> > drm_sched_entity_fini.
> >
> > The kmem_cache_free(amdgpu_fence_slab, fence)'s call trace as below:
> > 1.drm_sched_entity_fini ->
> > drm_sched_entity_cleanup ->
> > dma_fence_put(entity->last_scheduled) ->
> > drm_sched_fence_release_finished ->
> > drm_sched_fence_release_scheduled ->
> > call_rcu(&fence->finished.rcu, drm_sched_fence_free)
> >
> > 2.drm_sched_fence_free ->
> > dma_fence_put(fence->parent) ->
> > amdgpu_fence_release ->
> > call_rcu(&f->rcu, amdgpu_fence_free) ->
> > kmem_cache_free(amdgpu_fence_slab, fence);
> >
> > Change-Id: I8dcadd3372f97e72461bf46b41cc26d90f09b8df
> > Signed-off-by: Emily Deng <Emily.Deng@amd.com>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 1 +
> >   1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > index cc3b067..07b2e10 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > @@ -134,6 +134,7 @@ static void amdgpu_ttm_global_fini(struct amdgpu_device *adev)
> >   	if (adev->mman.mem_global_referenced) {
> >   		drm_sched_entity_fini(adev->mman.entity.sched,
> >   				      &adev->mman.entity);
> > +		rcu_barrier();
> 
> Good catch, but why don't you put the rcu_barrier() before the
> kmem_cache_destroy()?
> 
> That looks like it would make more sense.
> 
> Christian.
> 
> >   		mutex_destroy(&adev->mman.gtt_window_lock);
> >   		drm_global_item_unref(&adev->mman.bo_global_ref.ref);
> >   		drm_global_item_unref(&adev->mman.mem_global_ref);


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drm/amdgpu: add rcu_barrier after entity fini
       [not found] ` <1526526326-5130-1-git-send-email-Emily.Deng-5C7GfCeVMHo@public.gmane.org>
  2018-05-17  3:32   ` Zhou, David(ChunMing)
  2018-05-17  7:42   ` Michel Dänzer
@ 2018-05-17  8:39   ` Christian König
       [not found]     ` <a2990863-eea0-ecdd-0e86-f208fd91d59c-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2 siblings, 1 reply; 17+ messages in thread
From: Christian König @ 2018-05-17  8:39 UTC (permalink / raw)
  To: Emily Deng, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 17.05.2018 at 05:05, Emily Deng wrote:
> To free the fence from the amdgpu_fence_slab, need twice call_rcu, to avoid
> the amdgpu_fence_slab_fini call kmem_cache_destroy(amdgpu_fence_slab) before
> kmem_cache_free(amdgpu_fence_slab, fence), add rcu_barrier after drm_sched_entity_fini.
>
> The kmem_cache_free(amdgpu_fence_slab, fence)'s call trace as below:
> 1.drm_sched_entity_fini ->
> drm_sched_entity_cleanup ->
> dma_fence_put(entity->last_scheduled) ->
> drm_sched_fence_release_finished ->
> drm_sched_fence_release_scheduled ->
> call_rcu(&fence->finished.rcu, drm_sched_fence_free)
>
> 2.drm_sched_fence_free ->
> dma_fence_put(fence->parent) ->
> amdgpu_fence_release ->
> call_rcu(&f->rcu, amdgpu_fence_free) ->
> kmem_cache_free(amdgpu_fence_slab, fence);
>
> Change-Id: I8dcadd3372f97e72461bf46b41cc26d90f09b8df
> Signed-off-by: Emily Deng <Emily.Deng@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index cc3b067..07b2e10 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -134,6 +134,7 @@ static void amdgpu_ttm_global_fini(struct amdgpu_device *adev)
>   	if (adev->mman.mem_global_referenced) {
>   		drm_sched_entity_fini(adev->mman.entity.sched,
>   				      &adev->mman.entity);
> +		rcu_barrier();

Good catch, but why don't you put the rcu_barrier() before the
kmem_cache_destroy()?

That looks like it would make more sense.

Christian.

>   		mutex_destroy(&adev->mman.gtt_window_lock);
>   		drm_global_item_unref(&adev->mman.bo_global_ref.ref);
>   		drm_global_item_unref(&adev->mman.mem_global_ref);


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drm/amdgpu: add rcu_barrier after entity fini
       [not found] ` <1526526326-5130-1-git-send-email-Emily.Deng-5C7GfCeVMHo@public.gmane.org>
  2018-05-17  3:32   ` Zhou, David(ChunMing)
@ 2018-05-17  7:42   ` Michel Dänzer
  2018-05-17  8:39   ` Christian König
  2 siblings, 0 replies; 17+ messages in thread
From: Michel Dänzer @ 2018-05-17  7:42 UTC (permalink / raw)
  To: Emily Deng; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 2018-05-17 05:05 AM, Emily Deng wrote:
> To free the fence from the amdgpu_fence_slab, need twice call_rcu, to avoid
> the amdgpu_fence_slab_fini call kmem_cache_destroy(amdgpu_fence_slab) before
> kmem_cache_free(amdgpu_fence_slab, fence), add rcu_barrier after drm_sched_entity_fini.
> 
> The kmem_cache_free(amdgpu_fence_slab, fence)'s call trace as below:
> 1.drm_sched_entity_fini ->
> drm_sched_entity_cleanup ->
> dma_fence_put(entity->last_scheduled) ->
> drm_sched_fence_release_finished ->
> drm_sched_fence_release_scheduled ->
> call_rcu(&fence->finished.rcu, drm_sched_fence_free)
> 
> 2.drm_sched_fence_free ->
> dma_fence_put(fence->parent) ->
> amdgpu_fence_release ->
> call_rcu(&f->rcu, amdgpu_fence_free) ->
> kmem_cache_free(amdgpu_fence_slab, fence);
> 
> Change-Id: I8dcadd3372f97e72461bf46b41cc26d90f09b8df
> Signed-off-by: Emily Deng <Emily.Deng@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index cc3b067..07b2e10 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -134,6 +134,7 @@ static void amdgpu_ttm_global_fini(struct amdgpu_device *adev)
>  	if (adev->mman.mem_global_referenced) {
>  		drm_sched_entity_fini(adev->mman.entity.sched,
>  				      &adev->mman.entity);
> +		rcu_barrier();
>  		mutex_destroy(&adev->mman.gtt_window_lock);
>  		drm_global_item_unref(&adev->mman.bo_global_ref.ref);
>  		drm_global_item_unref(&adev->mman.mem_global_ref);
> 

Hmm, this makes me wonder if
https://bugs.freedesktop.org/show_bug.cgi?id=106225 could be related to
something like this as well.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH] drm/amdgpu: add rcu_barrier after entity fini
       [not found] ` <1526526326-5130-1-git-send-email-Emily.Deng-5C7GfCeVMHo@public.gmane.org>
@ 2018-05-17  3:32   ` Zhou, David(ChunMing)
  2018-05-17  7:42   ` Michel Dänzer
  2018-05-17  8:39   ` Christian König
  2 siblings, 0 replies; 17+ messages in thread
From: Zhou, David(ChunMing) @ 2018-05-17  3:32 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Deng, Emily

Looks good, Acked-by: Chunming Zhou <david1.zhou@amd.com>

-----Original Message-----
From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf Of Emily Deng
Sent: Thursday, May 17, 2018 11:05 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deng, Emily <Emily.Deng@amd.com>
Subject: [PATCH] drm/amdgpu: add rcu_barrier after entity fini

To free the fence from the amdgpu_fence_slab, need twice call_rcu, to avoid the amdgpu_fence_slab_fini call kmem_cache_destroy(amdgpu_fence_slab) before kmem_cache_free(amdgpu_fence_slab, fence), add rcu_barrier after drm_sched_entity_fini.

The kmem_cache_free(amdgpu_fence_slab, fence)'s call trace as below:
1.drm_sched_entity_fini ->
drm_sched_entity_cleanup ->
dma_fence_put(entity->last_scheduled) ->
drm_sched_fence_release_finished ->
drm_sched_fence_release_scheduled ->
call_rcu(&fence->finished.rcu, drm_sched_fence_free)

2.drm_sched_fence_free ->
dma_fence_put(fence->parent) ->
amdgpu_fence_release ->
call_rcu(&f->rcu, amdgpu_fence_free) ->
kmem_cache_free(amdgpu_fence_slab, fence);

Change-Id: I8dcadd3372f97e72461bf46b41cc26d90f09b8df
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index cc3b067..07b2e10 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -134,6 +134,7 @@ static void amdgpu_ttm_global_fini(struct amdgpu_device *adev)
 	if (adev->mman.mem_global_referenced) {
 		drm_sched_entity_fini(adev->mman.entity.sched,
 				      &adev->mman.entity);
+		rcu_barrier();
 		mutex_destroy(&adev->mman.gtt_window_lock);
 		drm_global_item_unref(&adev->mman.bo_global_ref.ref);
 		drm_global_item_unref(&adev->mman.mem_global_ref);
--
2.7.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH] drm/amdgpu: add rcu_barrier after entity fini
@ 2018-05-17  3:05 Emily Deng
       [not found] ` <1526526326-5130-1-git-send-email-Emily.Deng-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Emily Deng @ 2018-05-17  3:05 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Emily Deng

Freeing a fence from the amdgpu_fence_slab takes two chained call_rcu
invocations. To keep amdgpu_fence_slab_fini from calling
kmem_cache_destroy(amdgpu_fence_slab) before
kmem_cache_free(amdgpu_fence_slab, fence) has run, add an rcu_barrier after
drm_sched_entity_fini.

The call trace leading to kmem_cache_free(amdgpu_fence_slab, fence) is as follows:
1.drm_sched_entity_fini ->
drm_sched_entity_cleanup ->
dma_fence_put(entity->last_scheduled) ->
drm_sched_fence_release_finished ->
drm_sched_fence_release_scheduled ->
call_rcu(&fence->finished.rcu, drm_sched_fence_free)

2.drm_sched_fence_free ->
dma_fence_put(fence->parent) ->
amdgpu_fence_release ->
call_rcu(&f->rcu, amdgpu_fence_free) ->
kmem_cache_free(amdgpu_fence_slab, fence);

Change-Id: I8dcadd3372f97e72461bf46b41cc26d90f09b8df
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index cc3b067..07b2e10 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -134,6 +134,7 @@ static void amdgpu_ttm_global_fini(struct amdgpu_device *adev)
 	if (adev->mman.mem_global_referenced) {
 		drm_sched_entity_fini(adev->mman.entity.sched,
 				      &adev->mman.entity);
+		rcu_barrier();
 		mutex_destroy(&adev->mman.gtt_window_lock);
 		drm_global_item_unref(&adev->mman.bo_global_ref.ref);
 		drm_global_item_unref(&adev->mman.mem_global_ref);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2018-05-23  7:47 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-17 10:03 [PATCH] drm/amdgpu: add rcu_barrier after entity fini Emily Deng
     [not found] ` <1526551432-12599-1-git-send-email-Emily.Deng-5C7GfCeVMHo@public.gmane.org>
2018-05-17 11:07   ` Christian König
     [not found]     ` <ec5de5ac-f9b8-53e5-0db4-3e0791469b40-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2018-05-18  3:20       ` Deng, Emily
     [not found]         ` <CY4PR12MB112526B02190A3D892CCB84F8F900-rpdhrqHFk07v2MZdTKcfDgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2018-05-18  9:41           ` Deng, Emily
     [not found]             ` <BN6PR12MB11210121E65A155D901B94AE8F900-/b2+HYfkarSgw6z4+5+8kgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2018-05-18  9:45               ` Christian König
     [not found]                 ` <3f9070bd-ebce-77d4-5979-6ce383885064-5C7GfCeVMHo@public.gmane.org>
2018-05-18  9:56                   ` Deng, Emily
     [not found]                     ` <BN6PR12MB1121DDBFDB283E01AC198B038F900-/b2+HYfkarSgw6z4+5+8kgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2018-05-18 10:35                       ` Christian König
     [not found]                         ` <bc3fbefd-957a-9b89-8096-22b44f6dc3d2-5C7GfCeVMHo@public.gmane.org>
2018-05-21  7:54                           ` Deng, Emily
  -- strict thread matches above, loose matches on Subject: below --
2018-05-21  8:08 Emily Deng
     [not found] ` <1526890130-673-1-git-send-email-Emily.Deng-5C7GfCeVMHo@public.gmane.org>
2018-05-23  5:25   ` Deng, Emily
     [not found]     ` <CY4PR12MB1125509E5759D4FDB7C097998F6B0-rpdhrqHFk07v2MZdTKcfDgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2018-05-23  7:32       ` Christian König
     [not found]         ` <c19068bf-4c4d-bb74-f31c-f5b71a2f8333-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2018-05-23  7:47           ` Deng, Emily
2018-05-17  3:05 Emily Deng
     [not found] ` <1526526326-5130-1-git-send-email-Emily.Deng-5C7GfCeVMHo@public.gmane.org>
2018-05-17  3:32   ` Zhou, David(ChunMing)
2018-05-17  7:42   ` Michel Dänzer
2018-05-17  8:39   ` Christian König
     [not found]     ` <a2990863-eea0-ecdd-0e86-f208fd91d59c-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2018-05-17 10:01       ` Deng, Emily
