From: Felix Kuehling <felix.kuehling@amd.com>
To: "Christian König" <ckoenig.leichtzumerken@gmail.com>,
	"Lang Yu" <Lang.Yu@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>,
	Huang Rui <ray.huang@amd.com>,
	amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdkfd: don't add DOORBELL and MMIO BOs to validate list
Date: Wed, 25 May 2022 18:17:29 -0400
Message-ID: <e46a9b9f-f8ca-a701-530c-20080f50a3f8@amd.com>
In-Reply-To: <77ece2f1-97e6-f44d-0a30-971b28693c3c@gmail.com>


On 2022-05-25 06:37, Christian König wrote:
> On 25.05.22 at 11:25, Lang Yu wrote:
>> On 05/25/ , Christian König wrote:
>>> On 25.05.22 at 10:43, Lang Yu wrote:
>>>> DOORBELL and MMIO BOs never move once created.
>>>> No need to validate them after that.
>>> Yeah, but you still need to make sure their page tables are up to date.
>>>
>>> So this might break horribly.
>> These BOs (and their attachments) are validated when allocated and mapped.
>> Their page tables should be set up at that time.
>>
>> The kfd_bo_list is used to restore BOs after evictions.
>>
>> Do you mean their page tables could be changed? Thanks.
>
> Yes, page tables can be destroyed under memory pressure as well.

Is that actually happening today, or is that some future optimization 
you have in mind? I know page tables can get evicted, but I didn't think 
they were destroyed unless the memory at that address is unmapped (which 
never happens for pinned BOs).


>
> Not sure how the KFD handles that, but in theory we should have every 
> BO used by a process on the validation list. Even the ones pinned.

Then we already have some other broken cases for the small number of 
kmapped BOs that are pinned and currently removed from the validation 
list (see amdgpu_amdkfd_gpuvm_map_gtt_bo_to_kernel).

Regards,
   Felix


>
> Regards,
> Christian.
>
>>
>>
>>> Christian.
>>>
>>>> Signed-off-by: Lang Yu <Lang.Yu@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 14 +++++++++-----
>>>>    1 file changed, 9 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>>> index 34ba9e776521..45de9cadd771 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>>> @@ -808,6 +808,10 @@ static void add_kgd_mem_to_kfd_bo_list(struct kgd_mem *mem,
>>>>        struct ttm_validate_buffer *entry = &mem->validate_list;
>>>>        struct amdgpu_bo *bo = mem->bo;
>>>> +    if (mem->alloc_flags & (KFD_IOC_ALLOC_MEM_FLAGS_DOORBELL |
>>>> +                KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP))
>>>> +        return;
>>>> +
>>>>        INIT_LIST_HEAD(&entry->head);
>>>>        entry->num_shared = 1;
>>>>        entry->bo = &bo->tbo;
>>>> @@ -824,6 +828,10 @@ static void remove_kgd_mem_from_kfd_bo_list(struct kgd_mem *mem,
>>>>    {
>>>>        struct ttm_validate_buffer *bo_list_entry;
>>>> +    if (mem->alloc_flags & (KFD_IOC_ALLOC_MEM_FLAGS_DOORBELL |
>>>> +                KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP))
>>>> +        return;
>>>> +
>>>>        bo_list_entry = &mem->validate_list;
>>>>        mutex_lock(&process_info->lock);
>>>>        list_del(&bo_list_entry->head);
>>>> @@ -1649,7 +1657,6 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
>>>>        unsigned long bo_size = mem->bo->tbo.base.size;
>>>>        struct kfd_mem_attachment *entry, *tmp;
>>>>        struct bo_vm_reservation_context ctx;
>>>> -    struct ttm_validate_buffer *bo_list_entry;
>>>>        unsigned int mapped_to_gpu_memory;
>>>>        int ret;
>>>>        bool is_imported = false;
>>>> @@ -1677,10 +1684,7 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
>>>>        }
>>>>        /* Make sure restore workers don't access the BO any more */
>>>> -    bo_list_entry = &mem->validate_list;
>>>> -    mutex_lock(&process_info->lock);
>>>> -    list_del(&bo_list_entry->head);
>>>> -    mutex_unlock(&process_info->lock);
>>>> +    remove_kgd_mem_from_kfd_bo_list(mem, process_info);
>>>>        /* No more MMU notifiers */
>>>>        amdgpu_mn_unregister(mem->bo);
>
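
For readers following the quoted patch, the behaviour under discussion can
be modelled in a few lines of userspace C. This is only a rough sketch and
not kernel code: every sketch_* name and SKETCH_FLAG_* constant is
hypothetical, the list is a plain singly linked list rather than the
kernel's list_head, and locking is omitted. It also assumes that the
restore pass is the only place that refreshes page-table state, which is
exactly the point still being questioned in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the KFD_IOC_ALLOC_MEM_FLAGS_* bits in the patch. */
#define SKETCH_FLAG_DOORBELL   (1u << 0)
#define SKETCH_FLAG_MMIO_REMAP (1u << 1)
#define SKETCH_FLAG_VRAM       (1u << 2)

/* Simplified model of a buffer object tracked per process. */
struct sketch_mem {
	unsigned int alloc_flags;
	bool on_list;            /* true while linked into the validate list */
	bool page_tables_valid;  /* set by the restore pass (an assumption) */
	struct sketch_mem *next;
};

struct sketch_process {
	struct sketch_mem *validate_list;
};

/* DOORBELL and MMIO BOs never move, so the patch treats them as fixed. */
static bool sketch_is_fixed(const struct sketch_mem *mem)
{
	return (mem->alloc_flags &
		(SKETCH_FLAG_DOORBELL | SKETCH_FLAG_MMIO_REMAP)) != 0;
}

/* Mirrors the early return added to the "add" helper in the patch. */
static void sketch_add(struct sketch_process *p, struct sketch_mem *mem)
{
	if (sketch_is_fixed(mem))
		return;
	mem->next = p->validate_list;
	p->validate_list = mem;
	mem->on_list = true;
}

/* Mirrors the early return added to the "remove" helper in the patch. */
static void sketch_remove(struct sketch_process *p, struct sketch_mem *mem)
{
	struct sketch_mem **pp;

	if (sketch_is_fixed(mem))
		return;
	for (pp = &p->validate_list; *pp; pp = &(*pp)->next) {
		if (*pp == mem) {
			*pp = mem->next;
			mem->next = NULL;
			mem->on_list = false;
			return;
		}
	}
}

/* Restore pass: only BOs on the list are revisited. Returns how many. */
static int sketch_restore(struct sketch_process *p)
{
	struct sketch_mem *m;
	int n = 0;

	for (m = p->validate_list; m; m = m->next) {
		m->page_tables_valid = true;
		n++;
	}
	return n;
}
```

In this model a BO that is skipped at add time is never seen by
sketch_restore(), so whether the early return is safe depends on whether
anything outside the restore path can invalidate that BO's page tables.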

Thread overview: 9+ messages
2022-05-25  8:43 [PATCH] drm/amdkfd: don't add DOORBELL and MMIO BOs to validate list Lang Yu
2022-05-25  8:45 ` Christian König
2022-05-25  9:25   ` Lang Yu
2022-05-25 10:37     ` Christian König
2022-05-25 11:37       ` Lang Yu
2022-05-25 11:43         ` Christian König
2022-05-25 12:46           ` Lang Yu
2022-05-25 22:17       ` Felix Kuehling [this message]
2022-05-30  9:57         ` Christian König
