All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
To: "Christian König" <ckoenig.leichtzumerken@gmail.com>,
	dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org
Cc: jingwen.chen2@amd.com, Christian.Koenig@amd.com,
	monk.liu@amd.com, yiqing.yao@amd.com
Subject: Re: [PATCH 1/5] drm/amdgpu: Fix possible refcount leak for release of external_hw_fence
Date: Thu, 23 Jun 2022 17:18:18 -0400	[thread overview]
Message-ID: <32e06ffd-a551-a05b-7993-1b6901455a21@amd.com> (raw)
In-Reply-To: <16353a69-64e1-2f1c-8859-8000ac6266ce@gmail.com>


On 2022-06-22 11:04, Christian König wrote:
> Am 22.06.22 um 17:01 schrieb Andrey Grodzovsky:
>>
>> On 2022-06-22 05:00, Christian König wrote:
>>> Am 21.06.22 um 21:34 schrieb Andrey Grodzovsky:
>>>> On 2022-06-21 03:19, Christian König wrote:
>>>>
>>>>> Am 21.06.22 um 00:02 schrieb Andrey Grodzovsky:
>>>>>> Problem:
>>>>>> In amdgpu_job_submit_direct - The refcount should drop by 2
>>>>>> but it drops only by 1.
>>>>>>
>>>>>> amdgpu_ib_sched->emit -> refcount 1 from first fence init
>>>>>> dma_fence_get -> refcount 2
>>>>>> dme_fence_put -> refcount 1
>>>>>>
>>>>>> Fix:
>>>>>> Add put for external_hw_fence in amdgpu_job_free/free_cb
>>>>>
>>>>> Well what is the external_hw_fence good for in this construct?
>>>>
>>>>
>>>> As far as I understand for direct submissions you don't want to 
>>>> pass a job
>>>> pointer to ib_schedule and so u can't use the embedded fence for 
>>>> this case.
>>>
>>> Can you please look a bit deeper into this, we now have a couple of 
>>> fields in the job structure which have no obvious use.
>>>
>>> I think we could pass a job structure to ib_schedule even for direct 
>>> submit now.
>>
>>
>> Are you sure  ? I see a lot of activities in amdgpu_ib_schedule 
>> depend on presence of  vm and fence_ctx which are set if the job 
>> pointer argument != NULL, might this have a negative impact on direct 
>> submit ?
>
> Not 100% sure, but we did tons of workarounds because we didn't had a 
> job pointer for direct submit.
>
> But this was before we embedded the IBs at the end of the job.
>
> It's quite likely that this should be possible now, it's just that 
> somebody needs to double check.
>
> Christian.


Looking more i see stuff like amdgpu_vm_flush and 
amdgpu_ring_emit_cntxcntl, emit_frame_cntl that are conditioned on job 
argument, doesn't look to me like this is relevant to direct submit ?

I also noticed that direct submit passes back the created fence to it's 
caller while freeing the job immediately, Using embedded job here will 
increase the time the job object will hang around the memory
without any use as long as it's fence is referenced. Job object is much 
larger then a single fence.

Andrey


>
>>
>> Andrey
>>
>>
>>>
>>> Regards,
>>> Christian.
>>>
>>>>
>>>> Andrey
>>>>
>>>>
>>>>>
>>>>>>
>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>>> ---
>>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 8 ++++++--
>>>>>>   1 file changed, 6 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>>> index 10aa073600d4..58568fdde2d0 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>>> @@ -152,8 +152,10 @@ static void amdgpu_job_free_cb(struct 
>>>>>> drm_sched_job *s_job)
>>>>>>       /* only put the hw fence if has embedded fence */
>>>>>>       if (job->hw_fence.ops != NULL)
>>>>>>           dma_fence_put(&job->hw_fence);
>>>>>> -    else
>>>>>> +    else {
>>>>>
>>>>> When one side of the if uses {} the other side should use {} as 
>>>>> well, e.g. use } else { here.
>>>>>
>>>>> Christian.
>>>>>
>>>>>> + dma_fence_put(job->external_hw_fence);
>>>>>>           kfree(job);
>>>>>> +    }
>>>>>>   }
>>>>>>     void amdgpu_job_free(struct amdgpu_job *job)
>>>>>> @@ -165,8 +167,10 @@ void amdgpu_job_free(struct amdgpu_job *job)
>>>>>>       /* only put the hw fence if has embedded fence */
>>>>>>       if (job->hw_fence.ops != NULL)
>>>>>>           dma_fence_put(&job->hw_fence);
>>>>>> -    else
>>>>>> +    else {
>>>>>> +        dma_fence_put(job->external_hw_fence);
>>>>>>           kfree(job);
>>>>>> +    }
>>>>>>   }
>>>>>>     int amdgpu_job_submit(struct amdgpu_job *job, struct 
>>>>>> drm_sched_entity *entity,
>>>>>
>>>
>

  reply	other threads:[~2022-06-23 21:18 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-20 22:02 [PATCH 0/5] Rework amdgpu HW fence refocunt and update scheduler parent fence refcount Andrey Grodzovsky
2022-06-20 22:02 ` Andrey Grodzovsky
2022-06-20 22:02 ` [PATCH 1/5] drm/amdgpu: Fix possible refcount leak for release of external_hw_fence Andrey Grodzovsky
2022-06-20 22:02   ` Andrey Grodzovsky
2022-06-21  7:19   ` Christian König
2022-06-21 19:34     ` Andrey Grodzovsky
2022-06-22  9:00       ` Christian König
2022-06-22 15:01         ` Andrey Grodzovsky
2022-06-22 15:04           ` Christian König
2022-06-23 21:18             ` Andrey Grodzovsky [this message]
2022-06-24  6:00               ` Christian König
2022-06-20 22:02 ` [PATCH 2/5] drm/amdgpu: Add put fence in amdgpu_fence_driver_clear_job_fences Andrey Grodzovsky
2022-06-20 22:02   ` Andrey Grodzovsky
2022-06-21  7:21   ` Christian König
2022-06-20 22:03 ` [PATCH 3/5] drm/amdgpu: Prevent race between late signaled fences and GPU reset Andrey Grodzovsky
2022-06-20 22:03   ` Andrey Grodzovsky
2022-06-21  7:25   ` Christian König
2022-06-21 19:45     ` Andrey Grodzovsky
2022-06-22  1:47       ` VURDIGERENATARAJ, CHANDAN
2022-06-22  2:41         ` Andrey Grodzovsky
2022-06-22 17:31       ` Andrey Grodzovsky
2022-06-23  5:57         ` Christian König
2022-06-20 22:03 ` [PATCH 4/5] drm/sched: Partial revert of 'drm/sched: Keep s_fence->parent pointer' Andrey Grodzovsky
2022-06-20 22:03   ` Andrey Grodzovsky
2022-06-21  7:26   ` Christian König
2022-06-20 22:03 ` [PATCH 5/5] drm/amdgpu: Follow up change to previous drm scheduler change Andrey Grodzovsky
2022-06-20 22:03   ` Andrey Grodzovsky
2022-06-21  7:28   ` Christian König
2022-06-21 20:00     ` Andrey Grodzovsky
2022-06-22  7:17       ` Christian König
2022-06-22 17:19         ` Andrey Grodzovsky
2022-06-23  5:52           ` Christian König
2022-06-23 14:51             ` Andrey Grodzovsky
2022-06-23 14:54               ` Christian König

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=32e06ffd-a551-a05b-7993-1b6901455a21@amd.com \
    --to=andrey.grodzovsky@amd.com \
    --cc=Christian.Koenig@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=ckoenig.leichtzumerken@gmail.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=jingwen.chen2@amd.com \
    --cc=monk.liu@amd.com \
    --cc=yiqing.yao@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.