Re: [PATCH] drm/amdgpu: remove PT BOs when unmapping

* Re: [PATCH] drm/amdgpu: remove PT BOs when unmapping
@ 2019-10-30 16:30 ` Koenig, Christian
  0 siblings, 0 replies; 32+ messages in thread
From: Koenig, Christian @ 2019-10-30 16:30 UTC (permalink / raw)
  To: Huang, JinHuiEric
  Cc: Kuehling, Felix, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

[-- Attachment #1.1: Type: text/plain, Size: 6635 bytes --]

Am 30.10.2019 17:19 schrieb "Huang, JinHuiEric" <JinHuiEric.Huang@amd.com>:

I tested it that it saves a lot of vram on KFD big buffer stress test. I think there are two reasons:

1. Calling amdgpu_vm_update_ptes() during unmapping will allocate unnecessary pts, because there is no flag to determine if the VA is mapping or unmapping in function
amdgpu_vm_update_ptes(). It saves the most of memory.

That's not correct. The valid flag is used for this.

2. Intentionally removing those unmapping pts is logical expectation, although it is not removing so much pts.

Well I actually don't see a change to what update_ptes is doing and have the strong suspicion that the patch is simply broken.

You either free page tables which are potentially still in use or update_pte doesn't free page tables when the valid but is not set.

Regards,
Christian.

Regards,

Eric

On 2019-10-30 11:57 a.m., Koenig, Christian wrote:

Am 30.10.2019 16:47 schrieb "Kuehling, Felix" <Felix.Kuehling@amd.com><mailto:Felix.Kuehling@amd.com>:
On 2019-10-30 9:52 a.m., Christian König wrote:
> Am 29.10.19 um 21:06 schrieb Huang, JinHuiEric:
>> The issue is PT BOs are not freed when unmapping VA,
>> which causes vram usage accumulated is huge in some
>> memory stress test, such as kfd big buffer stress test.
>> Function amdgpu_vm_bo_update_mapping() is called by both
>> amdgpu_vm_bo_update() and amdgpu_vm_clear_freed(). The
>> solution is replacing amdgpu_vm_bo_update_mapping() in
>> amdgpu_vm_clear_freed() with removing PT BOs function
>> to save vram usage.
>
> NAK, that is intentional behavior.
>
> Otherwise we can run into out of memory situations when page tables
> need to be allocated again under stress.

That's a bit arbitrary and inconsistent. We are freeing page tables in
other situations, when a mapping uses huge pages in
amdgpu_vm_update_ptes. Why not when a mapping is destroyed completely?

I'm actually a bit surprised that the huge-page handling in
amdgpu_vm_update_ptes isn't kicking in to free up lower-level page
tables when a BO is unmapped.

Well it does free the lower level, and that is already causing problems (that's why I added the reserved space).

What we don't do is freeing the higher levels.

E.g. when you free a 2MB BO we free the lowest level, if we free a 1GB BO we free the two lowest levels etc...

The problem with freeing the higher levels is that you don't know who is also using this. E.g. we would need to check all entries when we unmap one.

It's simply not worth it for a maximum saving of 2MB per VM.

Writing this I'm actually wondering how you ended up in this issue? There shouldn't be much savings from this.

Regards,
Christian.

Regards,
   Felix

>
> Regards,
> Christian.
>
>>
>> Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
>> Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com><mailto:JinhuiEric.Huang@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56
>> +++++++++++++++++++++++++++++-----
>>   1 file changed, 48 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 0f4c3b2..8a480c7 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -1930,6 +1930,51 @@ static void amdgpu_vm_prt_fini(struct
>> amdgpu_device *adev, struct amdgpu_vm *vm)
>>   }
>>     /**
>> + * amdgpu_vm_remove_ptes - free PT BOs
>> + *
>> + * @adev: amdgpu device structure
>> + * @vm: amdgpu vm structure
>> + * @start: start of mapped range
>> + * @end: end of mapped entry
>> + *
>> + * Free the page table level.
>> + */
>> +static int amdgpu_vm_remove_ptes(struct amdgpu_device *adev,
>> +        struct amdgpu_vm *vm, uint64_t start, uint64_t end)
>> +{
>> +    struct amdgpu_vm_pt_cursor cursor;
>> +    unsigned shift, num_entries;
>> +
>> +    amdgpu_vm_pt_start(adev, vm, start, &cursor);
>> +    while (cursor.level < AMDGPU_VM_PTB) {
>> +        if (!amdgpu_vm_pt_descendant(adev, &cursor))
>> +            return -ENOENT;
>> +    }
>> +
>> +    while (cursor.pfn < end) {
>> +        amdgpu_vm_free_table(cursor.entry);
>> +        num_entries = amdgpu_vm_num_entries(adev, cursor.level - 1);
>> +
>> +        if (cursor.entry != &cursor.parent->entries[num_entries - 1]) {
>> +            /* Next ptb entry */
>> +            shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>> +            cursor.pfn += 1ULL << shift;
>> +            cursor.pfn &= ~((1ULL << shift) - 1);
>> +            cursor.entry++;
>> +        } else {
>> +            /* Next ptb entry in next pd0 entry */
>> +            amdgpu_vm_pt_ancestor(&cursor);
>> +            shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>> +            cursor.pfn += 1ULL << shift;
>> +            cursor.pfn &= ~((1ULL << shift) - 1);
>> +            amdgpu_vm_pt_descendant(adev, &cursor);
>> +        }
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +/**
>>    * amdgpu_vm_clear_freed - clear freed BOs in the PT
>>    *
>>    * @adev: amdgpu_device pointer
>> @@ -1949,7 +1994,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device
>> *adev,
>>                 struct dma_fence **fence)
>>   {
>>       struct amdgpu_bo_va_mapping *mapping;
>> -    uint64_t init_pte_value = 0;
>>       struct dma_fence *f = NULL;
>>       int r;
>>   @@ -1958,13 +2002,10 @@ int amdgpu_vm_clear_freed(struct
>> amdgpu_device *adev,
>>               struct amdgpu_bo_va_mapping, list);
>>           list_del(&mapping->list);
>>   -        if (vm->pte_support_ats &&
>> -            mapping->start < AMDGPU_GMC_HOLE_START)
>> -            init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
>> +        r = amdgpu_vm_remove_ptes(adev, vm,
>> +                (mapping->start + 0x1ff) & (~0x1ffll),
>> +                (mapping->last + 1) & (~0x1ffll));
>>   -        r = amdgpu_vm_bo_update_mapping(adev, vm, false, NULL,
>> -                        mapping->start, mapping->last,
>> -                        init_pte_value, 0, NULL, &f);
>>           amdgpu_vm_free_mapping(adev, vm, mapping, f);
>>           if (r) {
>>               dma_fence_put(f);
>> @@ -1980,7 +2021,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device
>> *adev,
>>       }
>>         return 0;
>> -
>>   }
>>     /**
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[-- Attachment #1.2: Type: text/html, Size: 13541 bytes --]

[-- Attachment #2: Type: text/plain, Size: 153 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread