From: Hangyu Hua <hbh25y@gmail.com>
To: Andrey Grodzovsky <andrey.grodzovsky@amd.com>,
	yuq825@gmail.com, airlied@linux.ie, daniel@ffwll.ch
Cc: dri-devel@lists.freedesktop.org, lima@lists.freedesktop.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] gpu: drm: remove redundant dma_fence_put() when drm_sched_job_add_dependency() fails
Date: Thu, 28 Apr 2022 16:56:14 +0800	[thread overview]
Message-ID: <d0790635-4b2e-cd58-0a51-36427800b39c@gmail.com> (raw)
In-Reply-To: <65b6cc23-1a77-7df0-5768-f81cd03b6514@amd.com>

On 2022/4/27 22:43, Andrey Grodzovsky wrote:
> 
> On 2022-04-26 22:31, Hangyu Hua wrote:
>> On 2022/4/26 22:55, Andrey Grodzovsky wrote:
>>>
>>> On 2022-04-25 22:54, Hangyu Hua wrote:
>>>> On 2022/4/25 23:42, Andrey Grodzovsky wrote:
>>>>> On 2022-04-25 04:36, Hangyu Hua wrote:
>>>>>
>>>>>> When drm_sched_job_add_dependency() fails, dma_fence_put() will be 
>>>>>> called
>>>>>> internally. Calling it again after drm_sched_job_add_dependency() 
>>>>>> finishes
>>>>>> may result in a dangling pointer.
>>>>>>
>>>>>> Fix this by removing redundant dma_fence_put().
>>>>>>
>>>>>> Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
>>>>>> ---
>>>>>>   drivers/gpu/drm/lima/lima_gem.c        | 1 -
>>>>>>   drivers/gpu/drm/scheduler/sched_main.c | 1 -
>>>>>>   2 files changed, 2 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/lima/lima_gem.c 
>>>>>> b/drivers/gpu/drm/lima/lima_gem.c
>>>>>> index 55bb1ec3c4f7..99c8e7f6bb1c 100644
>>>>>> --- a/drivers/gpu/drm/lima/lima_gem.c
>>>>>> +++ b/drivers/gpu/drm/lima/lima_gem.c
>>>>>> @@ -291,7 +291,6 @@ static int lima_gem_add_deps(struct drm_file 
>>>>>> *file, struct lima_submit *submit)
>>>>>>           err = drm_sched_job_add_dependency(&submit->task->base, 
>>>>>> fence);
>>>>>>           if (err) {
>>>>>> -            dma_fence_put(fence);
>>>>>>               return err;
>>>>>
>>>>>
>>>>> Makes sense here
>>>>>
>>>>>
>>>>>>           }
>>>>>>       }
>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
>>>>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> index b81fceb0b8a2..ebab9eca37a8 100644
>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> @@ -708,7 +708,6 @@ int 
>>>>>> drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job,
>>>>>>           dma_fence_get(fence);
>>>>>>           ret = drm_sched_job_add_dependency(job, fence);
>>>>>>           if (ret) {
>>>>>> -            dma_fence_put(fence);
>>>>>
>>>>>
>>>>>
>>>>> Not sure about this one since if you look at the relevant commits -
>>>>> 'drm/scheduler: fix drm_sched_job_add_implicit_dependencies' and
>>>>> 'drm/scheduler: fix drm_sched_job_add_implicit_dependencies harder'
>>>>> You will see that the dma_fence_put here balances the extra 
>>>>> dma_fence_get
>>>>> above
>>>>>
>>>>> Andrey
>>>>>
>>>>
>>>> I don't think so. I checked the call chain and found no additional 
>>>> dma_fence_get(). But dma_fence_get() needs to be called before 
>>>> drm_sched_job_add_dependency() to keep the counter balanced. 
>>>
>>>
>>> I'm not saying there is an additional get. I'm saying that 
>>> drm_sched_job_add_dependency doesn't grab an extra reference to the 
>>> fences it stores, so this needs to be done outside, and for that
>>> drm_sched_job_add_implicit_dependencies->dma_fence_get is called and, 
>>> if this addition fails, you just call dma_fence_put to keep the 
>>> counter balanced.
>>>
>>
>> drm_sched_job_add_implicit_dependencies() will call 
>> drm_sched_job_add_dependency(). And drm_sched_job_add_dependency() 
>> already calls dma_fence_put() when it fails. Calling dma_fence_put() 
>> twice doesn't make sense.
>>
>> dma_fence_get() is in [2]. But dma_fence_put() will be called in [1] 
>> and [3] when xa_alloc() fails.
> 
> 
> The way I see it, [2] and [3] are the matching *get* and *put* 
> respectively. The [1] *put* is against the original 
> dma_fence_init->kref_init of the fence, which always sets the refcount to 1.
> Also in support of this see commit 'drm/scheduler: fix 
> drm_sched_job_add_implicit_dependencies harder' - it says there 
> "drm_sched_job_add_dependency() could drop the last ref"  - this last 
> ref is the original refcount set by dma_fence_init->kref
> 
> Andrey


You can see that drm_sched_job_add_dependency() has three return paths: 
[4], [5] and [1]. [4] and [5] return 0; [1] returns an error.

There would be three strange problems if you were right:

1. Path [5] would trigger a refcount leak, because ret is 0 at the *if* [6]. 
Otherwise, [2] and [5] are the matching *get* and *put* here.

2. Path [4] needs an additional dma_fence_get() to add the fence as a job 
dependency. The fence comes from obj->resv. Taking etnaviv as an example, 
obj->resv comes from etnaviv_ioctl_gem_submit()->submit_lookup_objects(). 
An object cannot have *refcount == 1* while being referenced in two 
places, so the dma_fence_get() called at [2] is there for [4]. By the 
way, [3] does not execute in this case.

3. This one is more of a question. As "[PATCH] drm/scheduler: fix 
drm_sched_job_add_implicit_dependencies harder" says, 
drm_sched_job_add_dependency() could drop the last ref, so we need to do
the dma_fence_get() first. But the last ref will still be dropped at [3] if 
drm_sched_job_add_dependency() takes path [1], and there is only a *return* 
between [1] and [3]. Is that necessary? I think Rob Clark wanted to avoid 
the last ref being dropped inside drm_sched_job_add_implicit_dependencies(), 
because the fence is still used by obj->resv.


int drm_sched_job_add_dependency(struct drm_sched_job *job,
                                 struct dma_fence *fence)
{
         ...
         xa_for_each(&job->dependencies, index, entry) {
                 if (entry->context != fence->context)
                         continue;

                 if (dma_fence_is_later(fence, entry)) {
                         dma_fence_put(entry);
                         xa_store(&job->dependencies, index, fence,
                                  GFP_KERNEL);   <---- [4]
                 } else {
                         dma_fence_put(fence);   <---- [5]
                 }
                 return 0;
         }

         ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b,
                        GFP_KERNEL);
         if (ret != 0)
                 dma_fence_put(fence);   <---- [1]

         return ret;
}


int drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job,
                                             struct drm_gem_object *obj,
                                             bool write)
{
         struct dma_resv_iter cursor;
         struct dma_fence *fence;
         int ret;

         dma_resv_for_each_fence(&cursor, obj->resv, write, fence) {
                 /* Make sure to grab an additional ref on the added fence */
                 dma_fence_get(fence);   <---- [2]
                 ret = drm_sched_job_add_dependency(job, fence);
                 if (ret) {      <---- [6]
                         dma_fence_put(fence);   <---- [3]
                         return ret;
                 }
         }
         return 0;
}
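To make the counting argument concrete, here is a toy stand-in (not the real dma_fence API; struct toy_fence, toy_get/toy_put and simulate() are all hypothetical names) that tracks a plain integer refcount through the paths marked [1], [2] and [3] above:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a dma_fence refcount. */
struct toy_fence {
	int refcount;
};

/* Model dma_fence_get()/dma_fence_put(). */
static void toy_get(struct toy_fence *f) { f->refcount++; }
static void toy_put(struct toy_fence *f) { f->refcount--; }

/* Models drm_sched_job_add_dependency(): on the xa_alloc() failure
 * path [1] it already drops the reference that was passed in. */
static int toy_add_dependency(struct toy_fence *f, bool alloc_fails)
{
	if (alloc_fails) {
		toy_put(f);	/* [1]: internal put on the error path */
		return -12;	/* -ENOMEM */
	}
	return 0;		/* success: the job now owns the reference */
}

/* Models the caller: dma_fence_init leaves refcount at 1, [2] takes an
 * extra ref, and extra_put says whether the put at [3] is still there.
 * Returns the final refcount. */
static int simulate(bool alloc_fails, bool extra_put)
{
	struct toy_fence fence = { .refcount = 1 };

	toy_get(&fence);			/* [2] */
	if (toy_add_dependency(&fence, alloc_fails) && extra_put)
		toy_put(&fence);		/* [3] */
	return fence.refcount;
}
```

In this model, a failing add without [3] leaves the count back at 1, exactly where it started; with [3] it hits 0, releasing the last reference while obj->resv still points at the fence, which is the dangling-pointer scenario described above.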

Thanks,
hangyu

> 
>>
>>
>> int drm_sched_job_add_dependency(struct drm_sched_job *job,
>>                  struct dma_fence *fence)
>> {
>>     ...
>>     ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b, 
>> GFP_KERNEL);
>>     if (ret != 0)
>>         dma_fence_put(fence);    <--- [1]
>>
>>     return ret;
>> }
>> EXPORT_SYMBOL(drm_sched_job_add_dependency);
>>
>>
>> int drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job,
>>                         struct drm_gem_object *obj,
>>                         bool write)
>> {
>>     struct dma_resv_iter cursor;
>>     struct dma_fence *fence;
>>     int ret;
>>
>>     dma_resv_for_each_fence(&cursor, obj->resv, write, fence) {
>>         /* Make sure to grab an additional ref on the added fence */
>>         dma_fence_get(fence);    <--- [2]
>>         ret = drm_sched_job_add_dependency(job, fence);
>>         if (ret) {
>>             dma_fence_put(fence);    <--- [3]
>>             return ret;
>>         }
>>     }
>>     return 0;
>> }
>>
>>
>>>
>>>> On the other hand, dma_fence_get() and dma_fence_put() are 
>>>> meaningless here if there is an extra dma_fence_get(), because the 
>>>> counter will not decrease to 0 during drm_sched_job_add_dependency().
>>>>
>>>> I check the call chain as follows:
>>>>
>>>> msm_ioctl_gem_submit()
>>>> -> submit_fence_sync()
>>>> -> drm_sched_job_add_implicit_dependencies()
>>>
>>>
>>> Can you maybe trace or print one such example of problematic refcount 
>>> that you are trying to fix ? I still don't see where is the problem.
>>>
>>> Andrey
>>>
>>
>> I also wish I could. System logs would make this easy. But I don't have 
>> a corresponding physical GPU device, and 
>> drm_sched_job_add_implicit_dependencies() is only used by a few drivers.
>>
>> Thanks.
>>>
>>>>
>>>> Thanks,
>>>> Hangyu
>>>>
>>>>>
>>>>>>               return ret;
>>>>>>           }
>>>>>>       }

  reply	other threads:[~2022-04-28  8:56 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-25  8:36 [PATCH] gpu: drm: remove redundant dma_fence_put() when drm_sched_job_add_dependency() fails Hangyu Hua
2022-04-25 15:42 ` Andrey Grodzovsky
2022-04-26  2:54   ` Hangyu Hua
2022-04-26 14:48     ` Andrey Grodzovsky
2022-04-26 14:55     ` Andrey Grodzovsky
2022-04-27  2:31       ` Hangyu Hua
2022-04-27 14:43         ` Andrey Grodzovsky
2022-04-28  8:56           ` Hangyu Hua [this message]
2022-04-28 15:27             ` Andrey Grodzovsky
2022-04-29  3:03               ` Hangyu Hua
2022-04-29 16:34                 ` Andrey Grodzovsky
