* Re: [PATCH] drm/amdgpu: remove PT BOs when unmapping
From: Koenig, Christian @ 2019-10-30 16:30 UTC
  To: Huang, JinHuiEric; +Cc: Kuehling, Felix, amd-gfx





On 30.10.2019 at 17:19, "Huang, JinHuiEric" <JinHuiEric.Huang@amd.com> wrote:

I tested it, and it saves a lot of VRAM in the KFD big buffer stress test. I think there are two reasons:

1. Calling amdgpu_vm_update_ptes() during unmapping allocates unnecessary PTs, because there is no flag in amdgpu_vm_update_ptes() to determine whether the VA is being mapped or unmapped. Avoiding this accounts for most of the memory saved.

That's not correct. The valid flag is used for this.


2. Intentionally removing the PTs of unmapped ranges is the logically expected behavior, although it does not remove that many PTs.

Well, I actually don't see a change to what update_ptes is doing and have the strong suspicion that the patch is simply broken.

You either free page tables which are potentially still in use, or update_pte doesn't free page tables when the valid bit is not set.

Regards,
Christian.




Regards,

Eric

On 2019-10-30 11:57 a.m., Koenig, Christian wrote:


On 30.10.2019 at 16:47, "Kuehling, Felix" <Felix.Kuehling@amd.com> wrote:
On 2019-10-30 9:52 a.m., Christian König wrote:
> On 29.10.19 at 21:06, Huang, JinHuiEric wrote:
>> The issue is that PT BOs are not freed when unmapping a VA,
>> which causes VRAM usage to accumulate heavily in some
>> memory stress tests, such as the KFD big buffer stress test.
>> The function amdgpu_vm_bo_update_mapping() is called by both
>> amdgpu_vm_bo_update() and amdgpu_vm_clear_freed(). The
>> solution is to replace amdgpu_vm_bo_update_mapping() in
>> amdgpu_vm_clear_freed() with a function that removes PT BOs,
>> to save VRAM.
>
> NAK, that is intentional behavior.
>
> Otherwise we can run into out of memory situations when page tables
> need to be allocated again under stress.

That's a bit arbitrary and inconsistent. We are freeing page tables in
other situations, when a mapping uses huge pages in
amdgpu_vm_update_ptes. Why not when a mapping is destroyed completely?

I'm actually a bit surprised that the huge-page handling in
amdgpu_vm_update_ptes isn't kicking in to free up lower-level page
tables when a BO is unmapped.

Well, it does free the lower level, and that is already causing problems (that's why I added the reserved space).

What we don't do is free the higher levels.

E.g. when you free a 2MB BO we free the lowest level; if we free a 1GB BO we free the two lowest levels, etc.

The problem with freeing the higher levels is that you don't know who else is using them. E.g. we would need to check all entries when we unmap one.

It's simply not worth it for a maximum saving of 2MB per VM.
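
The level arithmetic behind the 2MB and 1GB figures can be sketched as follows. This is only an illustration: it assumes 4 KiB pages and 512 entries (9 address bits) per page-table level, which the thread does not state, and the helper names are made up for the example and are not driver functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed geometry (not stated in the thread): 4 KiB pages and
 * 512 entries, i.e. 9 address bits, per page-table level. */
#define PAGE_SHIFT     12u
#define BITS_PER_LEVEL 9u
#define MAX_LEVELS     4u

/* Bytes of address space covered by one whole table, counted from the
 * bottom: 0 = lowest-level table, 1 = the directory above it, ... */
uint64_t table_span_bytes(unsigned levels_above_leaf)
{
	assert(levels_above_leaf < MAX_LEVELS);
	return 1ULL << (PAGE_SHIFT + BITS_PER_LEVEL * (levels_above_leaf + 1));
}

/* Number of levels, counted from the bottom, that a naturally aligned
 * mapping of `size` bytes covers completely. Only those levels hold
 * tables that are used by nothing else. */
unsigned levels_fully_covered(uint64_t size)
{
	unsigned n = 0;

	while (n < MAX_LEVELS && size >= table_span_bytes(n))
		n++;
	return n;
}
```

Under these assumptions a 2 MiB mapping covers exactly one lowest-level table and a 1 GiB mapping covers one table on each of the two lowest levels, while anything smaller than 2 MiB covers no table completely.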

Writing this, I'm actually wondering how you ended up with this issue. There shouldn't be much savings from this.

Regards,
Christian.


Regards,
   Felix


>
> Regards,
> Christian.
>
>>
>> Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
>> Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56 +++++++++++++++++++++++++++++-----
>>  1 file changed, 48 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 0f4c3b2..8a480c7 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -1930,6 +1930,51 @@ static void amdgpu_vm_prt_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
>>  }
>>
>>  /**
>> + * amdgpu_vm_remove_ptes - free PT BOs
>> + *
>> + * @adev: amdgpu device structure
>> + * @vm: amdgpu vm structure
>> + * @start: start of mapped range
>> + * @end: end of mapped entry
>> + *
>> + * Free the page table level.
>> + */
>> +static int amdgpu_vm_remove_ptes(struct amdgpu_device *adev,
>> +        struct amdgpu_vm *vm, uint64_t start, uint64_t end)
>> +{
>> +    struct amdgpu_vm_pt_cursor cursor;
>> +    unsigned shift, num_entries;
>> +
>> +    amdgpu_vm_pt_start(adev, vm, start, &cursor);
>> +    while (cursor.level < AMDGPU_VM_PTB) {
>> +        if (!amdgpu_vm_pt_descendant(adev, &cursor))
>> +            return -ENOENT;
>> +    }
>> +
>> +    while (cursor.pfn < end) {
>> +        amdgpu_vm_free_table(cursor.entry);
>> +        num_entries = amdgpu_vm_num_entries(adev, cursor.level - 1);
>> +
>> +        if (cursor.entry != &cursor.parent->entries[num_entries - 1]) {
>> +            /* Next ptb entry */
>> +            shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>> +            cursor.pfn += 1ULL << shift;
>> +            cursor.pfn &= ~((1ULL << shift) - 1);
>> +            cursor.entry++;
>> +        } else {
>> +            /* Next ptb entry in next pd0 entry */
>> +            amdgpu_vm_pt_ancestor(&cursor);
>> +            shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>> +            cursor.pfn += 1ULL << shift;
>> +            cursor.pfn &= ~((1ULL << shift) - 1);
>> +            amdgpu_vm_pt_descendant(adev, &cursor);
>> +        }
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +/**
>>   * amdgpu_vm_clear_freed - clear freed BOs in the PT
>>   *
>>   * @adev: amdgpu_device pointer
>> @@ -1949,7 +1994,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
>>                struct dma_fence **fence)
>>  {
>>      struct amdgpu_bo_va_mapping *mapping;
>> -    uint64_t init_pte_value = 0;
>>      struct dma_fence *f = NULL;
>>      int r;
>>
>> @@ -1958,13 +2002,10 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
>>              struct amdgpu_bo_va_mapping, list);
>>          list_del(&mapping->list);
>>
>> -        if (vm->pte_support_ats &&
>> -            mapping->start < AMDGPU_GMC_HOLE_START)
>> -            init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
>> +        r = amdgpu_vm_remove_ptes(adev, vm,
>> +                (mapping->start + 0x1ff) & (~0x1ffll),
>> +                (mapping->last + 1) & (~0x1ffll));
>>
>> -        r = amdgpu_vm_bo_update_mapping(adev, vm, false, NULL,
>> -                        mapping->start, mapping->last,
>> -                        init_pte_value, 0, NULL, &f);
>>          amdgpu_vm_free_mapping(adev, vm, mapping, f);
>>          if (r) {
>>              dma_fence_put(f);
>> @@ -1980,7 +2021,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
>>      }
>>
>>      return 0;
>> -
>>  }
>>
>>  /**
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx




_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* Re: [PATCH] drm/amdgpu: remove PT BOs when unmapping
From: Huang, JinHuiEric @ 2019-10-30 16:55 UTC
  To: Koenig, Christian; +Cc: Kuehling, Felix, amd-gfx



The valid flag doesn't take effect in this function. amdgpu_vm_alloc_pts() is always executed; it depends only on "cursor.pfn < end". The valid flag is only checked here, for ASICs below GMC v9:

if (adev->asic_type < CHIP_VEGA10 &&
            (flags & AMDGPU_PTE_VALID))...

Regards,

Eric

On 2019-10-30 12:30 p.m., Koenig, Christian wrote:


On 30.10.2019 at 17:19, "Huang, JinHuiEric" <JinHuiEric.Huang@amd.com> wrote:

I tested it, and it saves a lot of VRAM in the KFD big buffer stress test. I think there are two reasons:

1. Calling amdgpu_vm_update_ptes() during unmapping allocates unnecessary PTs, because there is no flag in amdgpu_vm_update_ptes() to determine whether the VA is being mapped or unmapped. Avoiding this accounts for most of the memory saved.

That's not correct. The valid flag is used for this.


2. Intentionally removing the PTs of unmapped ranges is the logically expected behavior, although it does not remove that many PTs.

Well, I actually don't see a change to what update_ptes is doing and have the strong suspicion that the patch is simply broken.

You either free page tables which are potentially still in use, or update_pte doesn't free page tables when the valid bit is not set.

Regards,
Christian.




Regards,

Eric



* Re: [PATCH] drm/amdgpu: remove PT BOs when unmapping
From: Koenig, Christian @ 2019-10-30 17:42 UTC
  To: Huang, JinHuiEric; +Cc: Kuehling, Felix, amd-gfx



The valid flag doesn't take effect in this function.
That's irrelevant.

What amdgpu_vm_update_ptes() does is first determine the fragment size:
        amdgpu_vm_fragment(params, frag_start, end, flags, &frag, &frag_end);

Then we walk down the tree:
        amdgpu_vm_pt_start(adev, params->vm, start, &cursor);
        while (cursor.pfn < end) {

And make sure that the page tables covering the address range are actually allocated:
                r = amdgpu_vm_alloc_pts(params->adev, params->vm, &cursor);

Then we update the tables with the flags and addresses and free up subsequent tables in the case of huge pages or freed-up areas:
                        /* Free all child entries */
                        while (cursor.pfn < frag_start) {
                                amdgpu_vm_free_pts(adev, params->vm, &cursor);
                                amdgpu_vm_pt_next(adev, &cursor);
                        }

This is the maximum you can free, because all other page tables are not completely covered by the range and so are potentially still in use.
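
As a rough illustration of "completely covered", the sketch below shrinks an inclusive page-frame range to the part made up of whole lowest-level tables. It assumes 512 entries per table; the struct and function names are hypothetical and not part of the driver. The rounding mirrors the expressions used in the quoted patch, (start + 0x1ff) & ~0x1ff and (last + 1) & ~0x1ff, but it says nothing about whether the patch's table walk stays inside that range.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one lowest-level table holds 512 entries (512 page frames). */
#define PTB_ENTRIES 512ULL

struct pfn_range {
	uint64_t start; /* first page frame of the first fully covered table */
	uint64_t end;   /* one past the last page frame of the last fully covered table */
};

/* Shrink the inclusive mapping [start, last] to whole tables only.
 * The result is empty when start >= end. Ranges whose last page frame
 * is UINT64_MAX are not handled in this sketch. */
struct pfn_range fully_covered_tables(uint64_t start, uint64_t last)
{
	struct pfn_range r;

	assert(start <= last);
	r.start = (start + PTB_ENTRIES - 1) & ~(PTB_ENTRIES - 1); /* round up */
	r.end = (last + 1) & ~(PTB_ENTRIES - 1);                  /* round down */
	return r;
}
```

For a mapping of page frames 0x100 to 0x4ff this yields [0x200, 0x400): only the middle table is fully covered, and the tables holding 0x100..0x1ff and 0x400..0x4ff may still contain entries that belong to other mappings.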

And I have the strong suspicion that this is what your patch is actually doing wrong. In other words, you are also freeing page tables which are only partially covered by the range and so are potentially still in use.

Since we don't have any tracking of how many entries in a page table are currently valid and how many are invalid, we actually can't implement what you are trying to do here. So the patch is definitely somehow broken.
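
To make the missing bookkeeping concrete, here is a toy model of what per-table tracking of valid entries could look like. Everything in it is hypothetical: the thread states that the driver keeps no such counter, and the types and function names exist only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

#define TABLE_ENTRIES 512u

/* Hypothetical bookkeeping; not a structure from the amdgpu driver. */
struct toy_table {
	bool valid[TABLE_ENTRIES];
	unsigned valid_count; /* number of entries currently marked valid */
};

/* Mark `count` entries starting at `first` as valid. */
void toy_map(struct toy_table *t, unsigned first, unsigned count)
{
	assert(first <= TABLE_ENTRIES && count <= TABLE_ENTRIES - first);
	for (unsigned i = first; i < first + count; i++) {
		if (!t->valid[i]) {
			t->valid[i] = true;
			t->valid_count++;
		}
	}
}

/* Invalidate `count` entries starting at `first`. Returns true only when
 * no valid entry is left, i.e. when the table could be freed safely. */
bool toy_unmap(struct toy_table *t, unsigned first, unsigned count)
{
	assert(first <= TABLE_ENTRIES && count <= TABLE_ENTRIES - first);
	for (unsigned i = first; i < first + count; i++) {
		if (t->valid[i]) {
			t->valid[i] = false;
			t->valid_count--;
		}
	}
	return t->valid_count == 0;
}
```

With two mappings sharing one table, unmapping the first leaves valid_count above zero, so the table has to stay. Without a counter of this kind, the same decision would need a scan of all entries on every unmap.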

Regards,
Christian.

On 30.10.19 at 17:55, Huang, JinHuiEric wrote:

The valid flag doesn't take effect in this function. amdgpu_vm_alloc_pts() is always executed; it depends only on "cursor.pfn < end". The valid flag is only checked here, for ASICs below GMC v9:

if (adev->asic_type < CHIP_VEGA10 &&
            (flags & AMDGPU_PTE_VALID))...

Regards,

Eric

On 2019-10-30 12:30 p.m., Koenig, Christian wrote:


On 30.10.2019 at 17:19, "Huang, JinHuiEric" <JinHuiEric.Huang@amd.com> wrote:

I tested it, and it saves a lot of VRAM in the KFD big buffer stress test. I think there are two reasons:

1. Calling amdgpu_vm_update_ptes() during unmapping allocates unnecessary PTs, because there is no flag in amdgpu_vm_update_ptes() to tell whether the VA is being mapped or unmapped. This accounts for most of the memory saved.

That's not correct. The valid flag is used for this.


2. Intentionally removing the PTs of unmapped ranges is the logically expected behavior, although it does not remove that many PTs.

Well, I actually don't see a change to what update_ptes is doing, and I have the strong suspicion that the patch is simply broken.

You either free page tables which are potentially still in use, or update_pte doesn't free page tables when the valid bit is not set.

Regards,
Christian.




Regards,

Eric

On 2019-10-30 11:57 a.m., Koenig, Christian wrote:


On 30.10.2019 at 16:47, "Kuehling, Felix" <Felix.Kuehling@amd.com> wrote:
On 2019-10-30 9:52 a.m., Christian König wrote:
> On 29.10.19 at 21:06, Huang, JinHuiEric wrote:
>> The issue is that PT BOs are not freed when unmapping a VA,
>> which makes VRAM usage accumulate heavily in some
>> memory stress tests, such as the KFD big buffer stress test.
>> Function amdgpu_vm_bo_update_mapping() is called by both
>> amdgpu_vm_bo_update() and amdgpu_vm_clear_freed(). The
>> solution is to replace amdgpu_vm_bo_update_mapping() in
>> amdgpu_vm_clear_freed() with a function that removes PT BOs,
>> to save VRAM.
>
> NAK, that is intentional behavior.
>
> Otherwise we can run into out of memory situations when page tables
> need to be allocated again under stress.

That's a bit arbitrary and inconsistent. We are freeing page tables in
other situations, when a mapping uses huge pages in
amdgpu_vm_update_ptes. Why not when a mapping is destroyed completely?

I'm actually a bit surprised that the huge-page handling in
amdgpu_vm_update_ptes isn't kicking in to free up lower-level page
tables when a BO is unmapped.

Well, it does free the lower level, and that is already causing problems (that's why I added the reserved space).

What we don't do is free the higher levels.

E.g. when you free a 2MB BO we free the lowest level; if we free a 1GB BO we free the two lowest levels, and so on.
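
The sizes follow from the table geometry. A small sketch of the arithmetic, assuming 4 KiB pages and 512 entries per table at every level (both are assumptions for illustration, and the helper names are made up):

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_ASSUMED 12 /* assumed 4 KiB pages */
#define BITS_PER_LEVEL 9      /* assumed 512 entries per table */
#define MAX_LEVELS 4          /* keeps the shift below 64 bits */

/* Bytes of address space spanned by one table, counted from the bottom. */
static uint64_t table_span_bytes(unsigned int level_from_bottom)
{
        assert(level_from_bottom < MAX_LEVELS);
        return 1ULL << (PAGE_SHIFT_ASSUMED +
                        BITS_PER_LEVEL * (level_from_bottom + 1));
}

/*
 * Number of lowest levels whose tables can be spanned completely by a
 * naturally aligned buffer of the given size.
 */
static unsigned int levels_covered(uint64_t size)
{
        unsigned int n = 0;

        while (n < MAX_LEVELS && size >= table_span_bytes(n))
                n++;
        return n;
}
```

With these assumptions a lowest-level table spans 2 MiB and the next level spans 1 GiB, which matches the 2MB and 1GB figures above.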

The problem with freeing the higher levels is that you don't know who else is using them. E.g. we would need to check all entries when we unmap one.

It's simply not worth it for a maximum saving of 2MB per VM.

Writing this, I'm actually wondering how you ran into this issue. There shouldn't be much savings from this.

Regards,
Christian.


Regards,
   Felix


>
> Regards,
> Christian.
>
>>
>> Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
>> Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56
>> +++++++++++++++++++++++++++++-----
>>   1 file changed, 48 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 0f4c3b2..8a480c7 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -1930,6 +1930,51 @@ static void amdgpu_vm_prt_fini(struct
>> amdgpu_device *adev, struct amdgpu_vm *vm)
>>   }
>>     /**
>> + * amdgpu_vm_remove_ptes - free PT BOs
>> + *
>> + * @adev: amdgpu device structure
>> + * @vm: amdgpu vm structure
>> + * @start: start of mapped range
>> + * @end: end of mapped entry
>> + *
>> + * Free the page table level.
>> + */
>> +static int amdgpu_vm_remove_ptes(struct amdgpu_device *adev,
>> +        struct amdgpu_vm *vm, uint64_t start, uint64_t end)
>> +{
>> +    struct amdgpu_vm_pt_cursor cursor;
>> +    unsigned shift, num_entries;
>> +
>> +    amdgpu_vm_pt_start(adev, vm, start, &cursor);
>> +    while (cursor.level < AMDGPU_VM_PTB) {
>> +        if (!amdgpu_vm_pt_descendant(adev, &cursor))
>> +            return -ENOENT;
>> +    }
>> +
>> +    while (cursor.pfn < end) {
>> +        amdgpu_vm_free_table(cursor.entry);
>> +        num_entries = amdgpu_vm_num_entries(adev, cursor.level - 1);
>> +
>> +        if (cursor.entry != &cursor.parent->entries[num_entries - 1]) {
>> +            /* Next ptb entry */
>> +            shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>> +            cursor.pfn += 1ULL << shift;
>> +            cursor.pfn &= ~((1ULL << shift) - 1);
>> +            cursor.entry++;
>> +        } else {
>> +            /* Next ptb entry in next pd0 entry */
>> +            amdgpu_vm_pt_ancestor(&cursor);
>> +            shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>> +            cursor.pfn += 1ULL << shift;
>> +            cursor.pfn &= ~((1ULL << shift) - 1);
>> +            amdgpu_vm_pt_descendant(adev, &cursor);
>> +        }
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +/**
>>    * amdgpu_vm_clear_freed - clear freed BOs in the PT
>>    *
>>    * @adev: amdgpu_device pointer
>> @@ -1949,7 +1994,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device
>> *adev,
>>                 struct dma_fence **fence)
>>   {
>>       struct amdgpu_bo_va_mapping *mapping;
>> -    uint64_t init_pte_value = 0;
>>       struct dma_fence *f = NULL;
>>       int r;
>>   @@ -1958,13 +2002,10 @@ int amdgpu_vm_clear_freed(struct
>> amdgpu_device *adev,
>>               struct amdgpu_bo_va_mapping, list);
>>           list_del(&mapping->list);
>>   -        if (vm->pte_support_ats &&
>> -            mapping->start < AMDGPU_GMC_HOLE_START)
>> -            init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
>> +        r = amdgpu_vm_remove_ptes(adev, vm,
>> +                (mapping->start + 0x1ff) & (~0x1ffll),
>> +                (mapping->last + 1) & (~0x1ffll));
>>   -        r = amdgpu_vm_bo_update_mapping(adev, vm, false, NULL,
>> -                        mapping->start, mapping->last,
>> -                        init_pte_value, 0, NULL, &f);
>>           amdgpu_vm_free_mapping(adev, vm, mapping, f);
>>           if (r) {
>>               dma_fence_put(f);
>> @@ -1980,7 +2021,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device
>> *adev,
>>       }
>>         return 0;
>> -
>>   }
>>     /**
>


* Re: [PATCH] drm/amdgpu: remove PT BOs when unmapping
@ 2019-10-30 17:57             ` Christian König
  0 siblings, 0 replies; 32+ messages in thread
From: Christian König @ 2019-10-30 17:57 UTC (permalink / raw)
  To: Koenig, Christian, Huang, JinHuiEric
  Cc: Kuehling, Felix, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



One thing I've forgotten:

What you could maybe do to improve the situation is to join adjacent
ranges in amdgpu_vm_clear_freed(), but I'm not sure how likely it is
that the ranges are all freed together.
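
As a rough illustration of joining adjacent ranges, here is a self-contained sketch. It works on a plain array instead of the driver's list of freed mappings, and struct pfn_range and join_adjacent() are hypothetical names, so treat it as an outline of the idea rather than a patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical inclusive page-frame range [start, last]. */
struct pfn_range {
        uint64_t start;
        uint64_t last;
};

static int cmp_range(const void *a, const void *b)
{
        const struct pfn_range *x = a;
        const struct pfn_range *y = b;

        return (x->start > y->start) - (x->start < y->start);
}

/*
 * Sort the ranges by start and merge every pair that touches or overlaps.
 * The merge is done in place; the return value is the new range count.
 */
static size_t join_adjacent(struct pfn_range *r, size_t n)
{
        size_t out = 0, i;

        if (n == 0)
                return 0;
        assert(r != NULL);

        qsort(r, n, sizeof(*r), cmp_range);

        for (i = 1; i < n; i++) {
                /* The UINT64_MAX check avoids overflow in "last + 1". */
                if (r[out].last == UINT64_MAX ||
                    r[i].start <= r[out].last + 1) {
                        if (r[i].last > r[out].last)
                                r[out].last = r[i].last;
                } else {
                        r[++out] = r[i];
                }
        }

        return out + 1;
}
```

Assuming 512-entry tables, the freed ranges [0, 255] and [256, 511] each cover no complete page table on their own, but joined with [512, 1023] they form [0, 1023], which covers two.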

The only other alternative I can see would be to check the mappings of a
range in amdgpu_vm_update_ptes() and see if you could walk up the tree when
the valid flag is not set and there are no mappings left for a page table.
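
The check "no mappings left for a page table" could look roughly like the sketch below. It scans a plain array of remaining mappings, whereas the driver keeps its mappings in its own data structures; table_in_use(), struct pfn_range and the 512-entry table size are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PT_ENTRIES 512ULL /* assumed number of entries per page table */

/* Hypothetical inclusive page-frame range [start, last]. */
struct pfn_range {
        uint64_t start;
        uint64_t last;
};

/*
 * Return true if any remaining mapping overlaps the page table that
 * contains pfn. Only when this returns false would it be safe to free
 * that table and continue the check one level up.
 */
static bool table_in_use(const struct pfn_range *maps, size_t n, uint64_t pfn)
{
        uint64_t first = pfn & ~(PT_ENTRIES - 1);
        uint64_t last = first + PT_ENTRIES - 1;
        size_t i;

        assert(maps != NULL || n == 0);

        for (i = 0; i < n; i++) {
                if (maps[i].start <= last && maps[i].last >= first)
                        return true;
        }
        return false;
}
```

A linear scan like this is only meant to show the condition; doing it on every unmap is the per-entry checking cost mentioned earlier in the thread.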

Regards,
Christian.

On 30.10.19 at 18:42, Koenig, Christian wrote:
>> The valid flag doesn't take effect in this function.
> That's irrelevant.
>
> Look at what amdgpu_vm_update_ptes() does: it first determines the
> fragment size:
>> amdgpu_vm_fragment(params, frag_start, end, flags, &frag, &frag_end);
>
> Then we walk down the tree:
>>         amdgpu_vm_pt_start(adev, params->vm, start, &cursor);
>>         while (cursor.pfn < end) {
>
> And make sure that the page tables covering the address range are 
> actually allocated:
>>                 r = amdgpu_vm_alloc_pts(params->adev, params->vm, 
>> &cursor);
>
> Then we update the tables with the flags and addresses and free up 
> subsequent tables in the case of huge pages or freed up areas:
>>                         /* Free all child entries */
>>                         while (cursor.pfn < frag_start) {
>>                                 amdgpu_vm_free_pts(adev, params->vm, 
>> &cursor);
>>                                 amdgpu_vm_pt_next(adev, &cursor);
>>                         }
>
> This is the maximum you can free, because all other page tables are not
> completely covered by the range and so are potentially still in use.
>
> And I have the strong suspicion that this is what your patch is
> actually doing wrong. In other words you are also freeing page tables
> which are only partially covered by the range and so potentially still
> in use.
>
> Since we don't track how many entries in a page table are currently
> valid and how many are invalid, we can't actually implement what you
> are trying to do here. So the patch is definitely broken in some way.
>
> Regards,
> Christian.
>
> On 30.10.19 at 17:55, Huang, JinHuiEric wrote:
>>
>> The valid flag doesn't take effect in this function.
>> amdgpu_vm_alloc_pts() is always executed; it depends only on
>> "cursor.pfn < end". The valid flag is only checked here, for
>> ASICs below GMC v9:
>>
>> if (adev->asic_type < CHIP_VEGA10 &&
>>             (flags & AMDGPU_PTE_VALID))...
>>
>> Regards,
>>
>> Eric
>>
>> On 2019-10-30 12:30 p.m., Koenig, Christian wrote:
>>>
>>>
>>> On 30.10.2019 at 17:19, "Huang, JinHuiEric"
>>> <JinHuiEric.Huang-5C7GfCeVMHo@public.gmane.org> wrote:
>>>
>>>     I tested it, and it saves a lot of VRAM in the KFD big buffer
>>>     stress test. I think there are two reasons:
>>>
>>>     1. Calling amdgpu_vm_update_ptes() during unmapping allocates
>>>     unnecessary PTs, because there is no flag in
>>>     amdgpu_vm_update_ptes() to tell whether the VA is being mapped
>>>     or unmapped. This accounts for most of the memory saved.
>>>
>>> That's not correct. The valid flag is used for this.
>>>
>>>     2. Intentionally removing the PTs of unmapped ranges is the
>>>     logically expected behavior, although it does not remove that
>>>     many PTs.
>>>
>>> Well, I actually don't see a change to what update_ptes is doing, and
>>> I have the strong suspicion that the patch is simply broken.
>>>
>>> You either free page tables which are potentially still in use, or
>>> update_pte doesn't free page tables when the valid bit is not set.
>>>
>>> Regards,
>>> Christian.
>>>
>>>
>>>
>>>     Regards,
>>>
>>>     Eric
>>>
>>>     On 2019-10-30 11:57 a.m., Koenig, Christian wrote:
>>>
>>>
>>>
>>>         On 30.10.2019 at 16:47, "Kuehling, Felix"
>>>         <Felix.Kuehling-5C7GfCeVMHo@public.gmane.org> wrote:
>>>
>>>             On 2019-10-30 9:52 a.m., Christian König wrote:
>>>             > On 29.10.19 at 21:06, Huang, JinHuiEric wrote:
>>>             >> The issue is that PT BOs are not freed when unmapping a VA,
>>>             >> which makes VRAM usage accumulate heavily in some
>>>             >> memory stress tests, such as the KFD big buffer stress test.
>>>             >> Function amdgpu_vm_bo_update_mapping() is called by both
>>>             >> amdgpu_vm_bo_update() and amdgpu_vm_clear_freed(). The
>>>             >> solution is to replace amdgpu_vm_bo_update_mapping() in
>>>             >> amdgpu_vm_clear_freed() with a function that removes PT BOs,
>>>             >> to save VRAM.
>>>             >
>>>             > NAK, that is intentional behavior.
>>>             >
>>>             > Otherwise we can run into out of memory situations
>>>             when page tables
>>>             > need to be allocated again under stress.
>>>
>>>             That's a bit arbitrary and inconsistent. We are freeing
>>>             page tables in
>>>             other situations, when a mapping uses huge pages in
>>>             amdgpu_vm_update_ptes. Why not when a mapping is
>>>             destroyed completely?
>>>
>>>             I'm actually a bit surprised that the huge-page handling in
>>>             amdgpu_vm_update_ptes isn't kicking in to free up
>>>             lower-level page
>>>             tables when a BO is unmapped.
>>>
>>>
>>>         Well, it does free the lower level, and that is already
>>>         causing problems (that's why I added the reserved space).
>>>
>>>         What we don't do is free the higher levels.
>>>
>>>         E.g. when you free a 2MB BO we free the lowest level; if we
>>>         free a 1GB BO we free the two lowest levels, and so on.
>>>
>>>         The problem with freeing the higher levels is that you don't
>>>         know who else is using them. E.g. we would need to check all
>>>         entries when we unmap one.
>>>
>>>         It's simply not worth it for a maximum saving of 2MB per VM.
>>>
>>>         Writing this, I'm actually wondering how you ran into this
>>>         issue. There shouldn't be much savings from this.
>>>
>>>         Regards,
>>>         Christian.
>>>
>>>
>>>             Regards,
>>>                Felix
>>>
>>>
>>>             >
>>>             > Regards,
>>>             > Christian.
>>>             >
>>>             >>
>>>             >> Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
>>>             >> Signed-off-by: Eric Huang <JinhuiEric.Huang-5C7GfCeVMHo@public.gmane.org>
>>>             >> ---
>>>             >> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56
>>>             >> +++++++++++++++++++++++++++++-----
>>>             >>   1 file changed, 48 insertions(+), 8 deletions(-)
>>>             >>
>>>             >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>             >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>             >> index 0f4c3b2..8a480c7 100644
>>>             >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>             >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>             >> @@ -1930,6 +1930,51 @@ static void
>>>             amdgpu_vm_prt_fini(struct
>>>             >> amdgpu_device *adev, struct amdgpu_vm *vm)
>>>             >>   }
>>>             >>     /**
>>>             >> + * amdgpu_vm_remove_ptes - free PT BOs
>>>             >> + *
>>>             >> + * @adev: amdgpu device structure
>>>             >> + * @vm: amdgpu vm structure
>>>             >> + * @start: start of mapped range
>>>             >> + * @end: end of mapped entry
>>>             >> + *
>>>             >> + * Free the page table level.
>>>             >> + */
>>>             >> +static int amdgpu_vm_remove_ptes(struct
>>>             amdgpu_device *adev,
>>>             >> +        struct amdgpu_vm *vm, uint64_t start,
>>>             uint64_t end)
>>>             >> +{
>>>             >> +    struct amdgpu_vm_pt_cursor cursor;
>>>             >> +    unsigned shift, num_entries;
>>>             >> +
>>>             >> + amdgpu_vm_pt_start(adev, vm, start, &cursor);
>>>             >> +    while (cursor.level < AMDGPU_VM_PTB) {
>>>             >> +        if (!amdgpu_vm_pt_descendant(adev, &cursor))
>>>             >> +            return -ENOENT;
>>>             >> +    }
>>>             >> +
>>>             >> +    while (cursor.pfn < end) {
>>>             >> + amdgpu_vm_free_table(cursor.entry);
>>>             >> +        num_entries = amdgpu_vm_num_entries(adev,
>>>             cursor.level - 1);
>>>             >> +
>>>             >> +        if (cursor.entry !=
>>>             &cursor.parent->entries[num_entries - 1]) {
>>>             >> +            /* Next ptb entry */
>>>             >> +            shift = amdgpu_vm_level_shift(adev,
>>>             cursor.level - 1);
>>>             >> + cursor.pfn += 1ULL << shift;
>>>             >> + cursor.pfn &= ~((1ULL << shift) - 1);
>>>             >> + cursor.entry++;
>>>             >> +        } else {
>>>             >> +            /* Next ptb entry in next pd0 entry */
>>>             >> + amdgpu_vm_pt_ancestor(&cursor);
>>>             >> +            shift = amdgpu_vm_level_shift(adev,
>>>             cursor.level - 1);
>>>             >> + cursor.pfn += 1ULL << shift;
>>>             >> + cursor.pfn &= ~((1ULL << shift) - 1);
>>>             >> + amdgpu_vm_pt_descendant(adev, &cursor);
>>>             >> +        }
>>>             >> +    }
>>>             >> +
>>>             >> +    return 0;
>>>             >> +}
>>>             >> +
>>>             >> +/**
>>>             >>    * amdgpu_vm_clear_freed - clear freed BOs in the PT
>>>             >>    *
>>>             >>    * @adev: amdgpu_device pointer
>>>             >> @@ -1949,7 +1994,6 @@ int
>>>             amdgpu_vm_clear_freed(struct amdgpu_device
>>>             >> *adev,
>>>             >> struct dma_fence **fence)
>>>             >>   {
>>>             >>       struct amdgpu_bo_va_mapping *mapping;
>>>             >> -    uint64_t init_pte_value = 0;
>>>             >>       struct dma_fence *f = NULL;
>>>             >>       int r;
>>>             >>   @@ -1958,13 +2002,10 @@ int
>>>             amdgpu_vm_clear_freed(struct
>>>             >> amdgpu_device *adev,
>>>             >>               struct amdgpu_bo_va_mapping, list);
>>>             >> list_del(&mapping->list);
>>>             >>   -        if (vm->pte_support_ats &&
>>>             >> - mapping->start < AMDGPU_GMC_HOLE_START)
>>>             >> - init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
>>>             >> +        r = amdgpu_vm_remove_ptes(adev, vm,
>>>             >> + (mapping->start + 0x1ff) & (~0x1ffll),
>>>             >> + (mapping->last + 1) & (~0x1ffll));
>>>             >>   -        r = amdgpu_vm_bo_update_mapping(adev, vm,
>>>             false, NULL,
>>>             >> - mapping->start, mapping->last,
>>>             >> - init_pte_value, 0, NULL, &f);
>>>             >> amdgpu_vm_free_mapping(adev, vm, mapping, f);
>>>             >>           if (r) {
>>>             >> dma_fence_put(f);
>>>             >> @@ -1980,7 +2021,6 @@ int
>>>             amdgpu_vm_clear_freed(struct amdgpu_device
>>>             >> *adev,
>>>             >>       }
>>>             >>         return 0;
>>>             >> -
>>>             >>   }
>>>             >>     /**
>>>             >

* Re: [PATCH] drm/amdgpu: remove PT BOs when unmapping
@ 2019-10-30 17:57             ` Christian König
  0 siblings, 0 replies; 32+ messages in thread
From: Christian König @ 2019-10-30 17:57 UTC (permalink / raw)
  To: Koenig, Christian, Huang, JinHuiEric; +Cc: Kuehling, Felix, amd-gfx



One thing I've forgotten:

What you could maybe do to improve the situation is to join adjacent
ranges in amdgpu_vm_clear_freed(), but I'm not sure how likely it is
that the ranges are all freed together.

The only other alternative I can see would be to check the mappings of a
range in amdgpu_vm_update_ptes() and see if you could walk up the tree when
the valid flag is not set and there are no mappings left for a page table.

Regards,
Christian.

On 30.10.19 at 18:42, Koenig, Christian wrote:
>> The valid flag doesn't take effect in this function.
> That's irrelevant.
>
> Look at what amdgpu_vm_update_ptes() does: it first determines the
> fragment size:
>> amdgpu_vm_fragment(params, frag_start, end, flags, &frag, &frag_end);
>
> Then we walk down the tree:
>>         amdgpu_vm_pt_start(adev, params->vm, start, &cursor);
>>         while (cursor.pfn < end) {
>
> And make sure that the page tables covering the address range are 
> actually allocated:
>>                 r = amdgpu_vm_alloc_pts(params->adev, params->vm, 
>> &cursor);
>
> Then we update the tables with the flags and addresses and free up 
> subsequent tables in the case of huge pages or freed up areas:
>>                         /* Free all child entries */
>>                         while (cursor.pfn < frag_start) {
>>                                 amdgpu_vm_free_pts(adev, params->vm, 
>> &cursor);
>>                                 amdgpu_vm_pt_next(adev, &cursor);
>>                         }
>
> This is the maximum you can free, because all other page tables are not
> completely covered by the range and so are potentially still in use.
>
> And I have the strong suspicion that this is what your patch is
> actually doing wrong. In other words you are also freeing page tables
> which are only partially covered by the range and so potentially still
> in use.
>
> Since we don't track how many entries in a page table are currently
> valid and how many are invalid, we can't actually implement what you
> are trying to do here. So the patch is definitely broken in some way.
>
> Regards,
> Christian.
>
> On 30.10.19 at 17:55, Huang, JinHuiEric wrote:
>>
>> The valid flag doesn't take effect in this function.
>> amdgpu_vm_alloc_pts() is always executed; it depends only on
>> "cursor.pfn < end". The valid flag is only checked here, for
>> ASICs below GMC v9:
>>
>> if (adev->asic_type < CHIP_VEGA10 &&
>>             (flags & AMDGPU_PTE_VALID))...
>>
>> Regards,
>>
>> Eric
>>
>> On 2019-10-30 12:30 p.m., Koenig, Christian wrote:
>>>
>>>
>>> On 30.10.2019 at 17:19, "Huang, JinHuiEric"
>>> <JinHuiEric.Huang@amd.com> wrote:
>>>
>>>     I tested it, and it saves a lot of VRAM in the KFD big buffer
>>>     stress test. I think there are two reasons:
>>>
>>>     1. Calling amdgpu_vm_update_ptes() during unmapping allocates
>>>     unnecessary PTs, because there is no flag in
>>>     amdgpu_vm_update_ptes() to tell whether the VA is being mapped
>>>     or unmapped. This accounts for most of the memory saved.
>>>
>>> That's not correct. The valid flag is used for this.
>>>
>>>     2. Intentionally removing the PTs of unmapped ranges is the
>>>     logically expected behavior, although it does not remove that
>>>     many PTs.
>>>
>>> Well, I actually don't see a change to what update_ptes is doing, and
>>> I have the strong suspicion that the patch is simply broken.
>>>
>>> You either free page tables which are potentially still in use, or
>>> update_pte doesn't free page tables when the valid bit is not set.
>>>
>>> Regards,
>>> Christian.
>>>
>>>
>>>
>>>     Regards,
>>>
>>>     Eric
>>>
>>>     On 2019-10-30 11:57 a.m., Koenig, Christian wrote:
>>>
>>>
>>>
>>>         On 30.10.2019 at 16:47, "Kuehling, Felix"
>>>         <Felix.Kuehling@amd.com> wrote:
>>>
>>>             On 2019-10-30 9:52 a.m., Christian König wrote:
>>>             > On 29.10.19 at 21:06, Huang, JinHuiEric wrote:
>>>             >> The issue is that PT BOs are not freed when unmapping a VA,
>>>             >> which causes accumulated VRAM usage to become huge in some
>>>             >> memory stress tests, such as the KFD big buffer stress test.
>>>             >> Function amdgpu_vm_bo_update_mapping() is called by both
>>>             >> amdgpu_vm_bo_update() and amdgpu_vm_clear_freed(). The
>>>             >> solution is to replace amdgpu_vm_bo_update_mapping() in
>>>             >> amdgpu_vm_clear_freed() with a function that removes PT BOs,
>>>             >> to save VRAM.
>>>             >
>>>             > NAK, that is intentional behavior.
>>>             >
>>>             > Otherwise we can run into out of memory situations when page tables
>>>             > need to be allocated again under stress.
>>>
>>>             That's a bit arbitrary and inconsistent. We are freeing page tables in
>>>             other situations, when a mapping uses huge pages in
>>>             amdgpu_vm_update_ptes. Why not when a mapping is destroyed completely?
>>>
>>>             I'm actually a bit surprised that the huge-page handling in
>>>             amdgpu_vm_update_ptes isn't kicking in to free up lower-level page
>>>             tables when a BO is unmapped.
>>>
>>>
>>>         Well it does free the lower level, and that is already
>>>         causing problems (that's why I added the reserved space).
>>>
>>>         What we don't do is free the higher levels.
>>>
>>>         E.g. when you free a 2MB BO we free the lowest level, if we
>>>         free a 1GB BO we free the two lowest levels etc...
>>>
>>>         The problem with freeing the higher levels is that you don't
>>>         know who is also using this. E.g. we would need to check all
>>>         entries when we unmap one.
>>>
>>>         It's simply not worth it for a maximum saving of 2MB per VM.
>>>
>>>         Writing this I'm actually wondering how you ended up with this
>>>         issue? There shouldn't be much savings from this.
>>>
>>>         Regards,
>>>         Christian.
>>>
>>>
>>>             Regards,
>>>                Felix
>>>
>>>
>>>             >
>>>             > Regards,
>>>             > Christian.
>>>             >
>>>             >>
>>>             >> Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
>>>             >> Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
>>>             >> ---
>>>             >>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56
>>>             >> +++++++++++++++++++++++++++++-----
>>>             >>   1 file changed, 48 insertions(+), 8 deletions(-)
>>>             >>
>>>             >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>             >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>             >> index 0f4c3b2..8a480c7 100644
>>>             >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>             >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>             >> @@ -1930,6 +1930,51 @@ static void amdgpu_vm_prt_fini(struct
>>>             >> amdgpu_device *adev, struct amdgpu_vm *vm)
>>>             >>   }
>>>             >>     /**
>>>             >> + * amdgpu_vm_remove_ptes - free PT BOs
>>>             >> + *
>>>             >> + * @adev: amdgpu device structure
>>>             >> + * @vm: amdgpu vm structure
>>>             >> + * @start: start of mapped range
>>>             >> + * @end: end of mapped entry
>>>             >> + *
>>>             >> + * Free the page table level.
>>>             >> + */
>>>             >> +static int amdgpu_vm_remove_ptes(struct amdgpu_device *adev,
>>>             >> +        struct amdgpu_vm *vm, uint64_t start, uint64_t end)
>>>             >> +{
>>>             >> +    struct amdgpu_vm_pt_cursor cursor;
>>>             >> +    unsigned shift, num_entries;
>>>             >> +
>>>             >> +    amdgpu_vm_pt_start(adev, vm, start, &cursor);
>>>             >> +    while (cursor.level < AMDGPU_VM_PTB) {
>>>             >> +        if (!amdgpu_vm_pt_descendant(adev, &cursor))
>>>             >> +            return -ENOENT;
>>>             >> +    }
>>>             >> +
>>>             >> +    while (cursor.pfn < end) {
>>>             >> +        amdgpu_vm_free_table(cursor.entry);
>>>             >> +        num_entries = amdgpu_vm_num_entries(adev, cursor.level - 1);
>>>             >> +
>>>             >> +        if (cursor.entry != &cursor.parent->entries[num_entries - 1]) {
>>>             >> +            /* Next ptb entry */
>>>             >> +            shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>>>             >> +            cursor.pfn += 1ULL << shift;
>>>             >> +            cursor.pfn &= ~((1ULL << shift) - 1);
>>>             >> +            cursor.entry++;
>>>             >> +        } else {
>>>             >> +            /* Next ptb entry in next pd0 entry */
>>>             >> +            amdgpu_vm_pt_ancestor(&cursor);
>>>             >> +            shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>>>             >> +            cursor.pfn += 1ULL << shift;
>>>             >> +            cursor.pfn &= ~((1ULL << shift) - 1);
>>>             >> +            amdgpu_vm_pt_descendant(adev, &cursor);
>>>             >> +        }
>>>             >> +    }
>>>             >> +
>>>             >> +    return 0;
>>>             >> +}
>>>             >> +
>>>             >> +/**
>>>             >>    * amdgpu_vm_clear_freed - clear freed BOs in the PT
>>>             >>    *
>>>             >>    * @adev: amdgpu_device pointer
>>>             >> @@ -1949,7 +1994,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device
>>>             >> *adev,
>>>             >>                 struct dma_fence **fence)
>>>             >>   {
>>>             >>       struct amdgpu_bo_va_mapping *mapping;
>>>             >> -    uint64_t init_pte_value = 0;
>>>             >>       struct dma_fence *f = NULL;
>>>             >>       int r;
>>>             >>   @@ -1958,13 +2002,10 @@ int amdgpu_vm_clear_freed(struct
>>>             >> amdgpu_device *adev,
>>>             >>               struct amdgpu_bo_va_mapping, list);
>>>             >>           list_del(&mapping->list);
>>>             >>   -        if (vm->pte_support_ats &&
>>>             >> -            mapping->start < AMDGPU_GMC_HOLE_START)
>>>             >> -            init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
>>>             >> +        r = amdgpu_vm_remove_ptes(adev, vm,
>>>             >> +                (mapping->start + 0x1ff) & (~0x1ffll),
>>>             >> +                (mapping->last + 1) & (~0x1ffll));
>>>             >>   -        r = amdgpu_vm_bo_update_mapping(adev, vm, false, NULL,
>>>             >> -                        mapping->start, mapping->last,
>>>             >> -                        init_pte_value, 0, NULL, &f);
>>>             >>           amdgpu_vm_free_mapping(adev, vm, mapping, f);
>>>             >>           if (r) {
>>>             >>               dma_fence_put(f);
>>>             >> @@ -1980,7 +2021,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device
>>>             >> *adev,
>>>             >>       }
>>>             >>         return 0;
>>>             >> -
>>>             >>   }
>>>             >>     /**



* Re: [PATCH] drm/amdgpu: remove PT BOs when unmapping
@ 2019-10-30 18:00                 ` Huang, JinHuiEric
  0 siblings, 0 replies; 32+ messages in thread
From: Huang, JinHuiEric @ 2019-10-30 18:00 UTC (permalink / raw)
  To: Koenig, Christian
  Cc: Kuehling, Felix, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



Actually, I do prevent in-use PTs from being removed, with this:

+               r = amdgpu_vm_remove_ptes(adev, vm,
+                               (mapping->start + 0x1ff) & (~0x1ffll),
+                               (mapping->last + 1) & (~0x1ffll));

This only removes page tables that are aligned to 2M. I have tested it, at least on the KFD tests, without anything breaking.
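
To make the rounding above concrete, here is a minimal standalone sketch (not driver code). It assumes 4 KiB GPU pages and 512 entries per page table block, so that 0x200 page frames span 2M. The names align_up_2m(), align_down_2m() and full_ptbs() are hypothetical helpers used only for illustration. Note that mapping->last is inclusive, hence the "+ 1" before rounding down.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one page table block (PTB) covers 0x200 page frames,
 * i.e. 2M with 4 KiB pages. These helpers are illustrative only. */
#define PTB_PFN_MASK 0x1ffULL

/* Round a page frame number up to the next PTB boundary. */
static uint64_t align_up_2m(uint64_t pfn)
{
    return (pfn + PTB_PFN_MASK) & ~PTB_PFN_MASK;
}

/* Round a page frame number down to the previous PTB boundary. */
static uint64_t align_down_2m(uint64_t pfn)
{
    return pfn & ~PTB_PFN_MASK;
}

/* Number of PTBs fully contained in the inclusive range [start, last]. */
static uint64_t full_ptbs(uint64_t start, uint64_t last)
{
    uint64_t lo, hi;

    assert(start <= last);
    lo = align_up_2m(start);
    hi = align_down_2m(last + 1);
    return hi > lo ? (hi - lo) >> 9 : 0;
}
```

For example, a mapping covering page frames 0x100..0x5ff fully contains two blocks (0x200..0x3ff and 0x400..0x5ff), while the partially covered block at 0x0..0x1ff is skipped. A mapping smaller than 2M, or one that straddles a boundary without filling a block, yields zero.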

By the way, I am not familiar with the memory code. This patch is the best I can do for now. Could you take a look at the Jira ticket SWDEV-201443 and find a better solution? Thanks!

Regards,

Eric

On 2019-10-30 1:57 p.m., Christian König wrote:
One thing I've forgotten:

What you could maybe do to improve the situation is to join adjacent ranges in amdgpu_vm_clear_freed(), but I'm not sure how likely it is that the ranges are all freed together.

The only other alternative I can see would be to check the mappings of a range in amdgpu_vm_update_ptes() and see if you could walk up the tree when the valid flag is not set and there are no mappings left for a page table.
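
The first idea, joining adjacent ranges, could look roughly like the sketch below. This is only an illustration under stated assumptions: struct freed_range and join_adjacent() are hypothetical, the ranges are inclusive page frame ranges already sorted by start, and a plain array stands in for the driver's list of freed mappings to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a freed mapping: inclusive pfn range. */
struct freed_range {
    uint64_t start;
    uint64_t last;
};

/* Merge touching or overlapping ranges in place. The array must be
 * sorted by start, and no range may end at UINT64_MAX.
 * Returns the number of ranges left after merging. */
static size_t join_adjacent(struct freed_range *r, size_t n)
{
    size_t out = 0, i;

    if (n == 0)
        return 0;

    for (i = 1; i < n; i++) {
        assert(r[i].start >= r[out].start); /* input must be sorted */
        if (r[i].start <= r[out].last + 1) {
            /* Touches or overlaps the current range: extend it. */
            if (r[i].last > r[out].last)
                r[out].last = r[i].last;
        } else {
            /* Gap between the ranges: start a new output range. */
            r[++out] = r[i];
        }
    }
    return out + 1;
}
```

Two freed ranges 0x100..0x1ff and 0x200..0x3ff would be joined into 0x100..0x3ff, so a larger aligned region becomes visible to whatever frees the page tables afterwards. Whether freed ranges are adjacent often enough for this to help is exactly the open question raised above.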

Regards,
Christian.

On 30.10.19 at 18:42, Koenig, Christian wrote:
The valid flag doesn't take effect in this function.
That's irrelevant.

See, what amdgpu_vm_update_ptes() does is first determine the fragment size:
amdgpu_vm_fragment(params, frag_start, end, flags, &frag, &frag_end);

Then we walk down the tree:
        amdgpu_vm_pt_start(adev, params->vm, start, &cursor);
        while (cursor.pfn < end) {

And make sure that the page tables covering the address range are actually allocated:
                r = amdgpu_vm_alloc_pts(params->adev, params->vm, &cursor);

Then we update the tables with the flags and addresses and free up subsequent tables in the case of huge pages or freed up areas:
                        /* Free all child entries */
                        while (cursor.pfn < frag_start) {
                                amdgpu_vm_free_pts(adev, params->vm, &cursor);
                                amdgpu_vm_pt_next(adev, &cursor);
                        }

This is the maximum you can free, because all other page tables are not completely covered by the range and so potentially still in use.

And I have the strong suspicion that this is what your patch is actually doing wrong. In other words you are also freeing page tables which are only partially covered by the range and so potentially still in use.

Since we don't have any tracking of how many entries in a page table are currently valid and how many are invalid, we actually can't implement what you are trying to do here. So the patch is definitely somehow broken.
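
As a toy illustration of the missing bookkeeping (this is not how amdgpu works, and every name here is hypothetical), consider a page table block that counts its valid entries. Only with such a counter can an unmap that covers the table partially tell whether the table has become empty:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTB_ENTRIES 512u /* assumption: entries per page table block */

/* Toy page table block with a per-table count of valid entries. */
struct toy_ptb {
    bool valid[PTB_ENTRIES];
    unsigned int num_valid;
};

/* Mark entries [first, last] valid. */
static void toy_map(struct toy_ptb *pt, unsigned int first, unsigned int last)
{
    unsigned int i;

    assert(first <= last && last < PTB_ENTRIES);
    for (i = first; i <= last; i++) {
        if (!pt->valid[i]) {
            pt->valid[i] = true;
            pt->num_valid++;
        }
    }
}

/* Invalidate entries [first, last]. Returns true if the table is now
 * empty, i.e. it could be freed without affecting another mapping. */
static bool toy_unmap(struct toy_ptb *pt, unsigned int first, unsigned int last)
{
    unsigned int i;

    assert(first <= last && last < PTB_ENTRIES);
    for (i = first; i <= last; i++) {
        if (pt->valid[i]) {
            pt->valid[i] = false;
            pt->num_valid--;
        }
    }
    return pt->num_valid == 0;
}
```

With two mappings sharing one table (entries 0..255 and 256..511), unmapping the first leaves 256 valid entries, so the table must stay. Without the counter, the only safe case is the one described above: the unmapped range covers the table completely.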

Regards,
Christian.

On 30.10.19 at 17:55, Huang, JinHuiEric wrote:

The valid flag doesn't take effect in this function. amdgpu_vm_alloc_pts() is always executed, depending only on "cursor.pfn < end". The valid flag is only checked here, for ASICs below GMC v9:

if (adev->asic_type < CHIP_VEGA10 &&
            (flags & AMDGPU_PTE_VALID))...

Regards,

Eric

On 2019-10-30 12:30 p.m., Koenig, Christian wrote:


On 30.10.2019 at 17:19, "Huang, JinHuiEric" <JinHuiEric.Huang@amd.com> wrote:

I tested it and it saves a lot of VRAM on the KFD big buffer stress test. I think there are two reasons:

1. Calling amdgpu_vm_update_ptes() during unmapping will allocate unnecessary PTs, because there is no flag in amdgpu_vm_update_ptes() to determine whether the VA is being mapped or unmapped. Avoiding that allocation saves most of the memory.

That's not correct. The valid flag is used for this.


2. Intentionally removing the PTs of an unmapped range is the logically expected behavior, although it does not remove many PTs.

Well I actually don't see a change to what update_ptes is doing and have the strong suspicion that the patch is simply broken.

You either free page tables which are potentially still in use or update_pte doesn't free page tables when the valid bit is not set.

Regards,
Christian.




Regards,

Eric

On 2019-10-30 11:57 a.m., Koenig, Christian wrote:


On 30.10.2019 at 16:47, "Kuehling, Felix" <Felix.Kuehling@amd.com> wrote:
On 2019-10-30 9:52 a.m., Christian König wrote:
> On 29.10.19 at 21:06, Huang, JinHuiEric wrote:
>> The issue is that PT BOs are not freed when unmapping a VA,
>> which causes accumulated VRAM usage to become huge in some
>> memory stress tests, such as the KFD big buffer stress test.
>> Function amdgpu_vm_bo_update_mapping() is called by both
>> amdgpu_vm_bo_update() and amdgpu_vm_clear_freed(). The
>> solution is to replace amdgpu_vm_bo_update_mapping() in
>> amdgpu_vm_clear_freed() with a function that removes PT BOs,
>> to save VRAM.
>
> NAK, that is intentional behavior.
>
> Otherwise we can run into out of memory situations when page tables
> need to be allocated again under stress.

That's a bit arbitrary and inconsistent. We are freeing page tables in
other situations, when a mapping uses huge pages in
amdgpu_vm_update_ptes. Why not when a mapping is destroyed completely?

I'm actually a bit surprised that the huge-page handling in
amdgpu_vm_update_ptes isn't kicking in to free up lower-level page
tables when a BO is unmapped.

Well it does free the lower level, and that is already causing problems (that's why I added the reserved space).

What we don't do is free the higher levels.

E.g. when you free a 2MB BO we free the lowest level, if we free a 1GB BO we free the two lowest levels etc...

The problem with freeing the higher levels is that you don't know who is also using this. E.g. we would need to check all entries when we unmap one.

It's simply not worth it for a maximum saving of 2MB per VM.
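
The 2MB and 1GB examples above follow from simple arithmetic, sketched here under two assumptions that may not hold for every ASIC or configuration: 4 KiB pages and 512 entries (9 address bits) per page table level. levels_covered() is a hypothetical helper, not a driver function.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_4K 12u /* assumption: 4 KiB GPU pages */
#define LEVEL_BITS     9u /* assumption: 512 entries per table */

/* How many of the lowest page table levels can be completely covered
 * by a naturally aligned range of the given size in bytes. */
static unsigned int levels_covered(uint64_t size)
{
    unsigned int levels = 0;
    unsigned int shift = PAGE_SHIFT_4K + LEVEL_BITS; /* 2 MiB */

    assert(size > 0);
    while (shift < 64 && size >= (1ULL << shift)) {
        levels++;
        shift += LEVEL_BITS;
    }
    return levels;
}
```

With these assumptions a 2 MiB range covers exactly one lowest-level table, and a 1 GiB range covers 512 lowest-level tables plus the directory that points to them, which matches the "lowest level" and "two lowest levels" cases. Anything above that is shared with neighbouring ranges, which is why freeing it would require checking all other entries.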

Writing this I'm actually wondering how you ended up with this issue? There shouldn't be much savings from this.

Regards,
Christian.


Regards,
   Felix


>
> Regards,
> Christian.
>
>>
>> Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
>> Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56
>> +++++++++++++++++++++++++++++-----
>>   1 file changed, 48 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 0f4c3b2..8a480c7 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -1930,6 +1930,51 @@ static void amdgpu_vm_prt_fini(struct
>> amdgpu_device *adev, struct amdgpu_vm *vm)
>>   }
>>     /**
>> + * amdgpu_vm_remove_ptes - free PT BOs
>> + *
>> + * @adev: amdgpu device structure
>> + * @vm: amdgpu vm structure
>> + * @start: start of mapped range
>> + * @end: end of mapped entry
>> + *
>> + * Free the page table level.
>> + */
>> +static int amdgpu_vm_remove_ptes(struct amdgpu_device *adev,
>> +        struct amdgpu_vm *vm, uint64_t start, uint64_t end)
>> +{
>> +    struct amdgpu_vm_pt_cursor cursor;
>> +    unsigned shift, num_entries;
>> +
>> +    amdgpu_vm_pt_start(adev, vm, start, &cursor);
>> +    while (cursor.level < AMDGPU_VM_PTB) {
>> +        if (!amdgpu_vm_pt_descendant(adev, &cursor))
>> +            return -ENOENT;
>> +    }
>> +
>> +    while (cursor.pfn < end) {
>> +        amdgpu_vm_free_table(cursor.entry);
>> +        num_entries = amdgpu_vm_num_entries(adev, cursor.level - 1);
>> +
>> +        if (cursor.entry != &cursor.parent->entries[num_entries - 1]) {
>> +            /* Next ptb entry */
>> +            shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>> +            cursor.pfn += 1ULL << shift;
>> +            cursor.pfn &= ~((1ULL << shift) - 1);
>> +            cursor.entry++;
>> +        } else {
>> +            /* Next ptb entry in next pd0 entry */
>> +            amdgpu_vm_pt_ancestor(&cursor);
>> +            shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>> +            cursor.pfn += 1ULL << shift;
>> +            cursor.pfn &= ~((1ULL << shift) - 1);
>> +            amdgpu_vm_pt_descendant(adev, &cursor);
>> +        }
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +/**
>>    * amdgpu_vm_clear_freed - clear freed BOs in the PT
>>    *
>>    * @adev: amdgpu_device pointer
>> @@ -1949,7 +1994,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device
>> *adev,
>>                 struct dma_fence **fence)
>>   {
>>       struct amdgpu_bo_va_mapping *mapping;
>> -    uint64_t init_pte_value = 0;
>>       struct dma_fence *f = NULL;
>>       int r;
>>   @@ -1958,13 +2002,10 @@ int amdgpu_vm_clear_freed(struct
>> amdgpu_device *adev,
>>               struct amdgpu_bo_va_mapping, list);
>>           list_del(&mapping->list);
>>   -        if (vm->pte_support_ats &&
>> -            mapping->start < AMDGPU_GMC_HOLE_START)
>> -            init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
>> +        r = amdgpu_vm_remove_ptes(adev, vm,
>> +                (mapping->start + 0x1ff) & (~0x1ffll),
>> +                (mapping->last + 1) & (~0x1ffll));
>>   -        r = amdgpu_vm_bo_update_mapping(adev, vm, false, NULL,
>> -                        mapping->start, mapping->last,
>> -                        init_pte_value, 0, NULL, &f);
>>           amdgpu_vm_free_mapping(adev, vm, mapping, f);
>>           if (r) {
>>               dma_fence_put(f);
>> @@ -1980,7 +2021,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device
>> *adev,
>>       }
>>         return 0;
>> -
>>   }
>>     /**

* Re: [PATCH] drm/amdgpu: remove PT BOs when unmapping
@ 2019-10-30 18:11                     ` Koenig, Christian
  0 siblings, 0 replies; 32+ messages in thread
From: Koenig, Christian @ 2019-10-30 18:11 UTC (permalink / raw)
  To: Huang, JinHuiEric
  Cc: Kuehling, Felix, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



Then I don't see how this patch actually changes anything.

Could only be a bug in amdgpu_vm_update_ptes(). Going to investigate this, but I won't have time to look into the ticket in detail.

Regards,
Christian.

On 30.10.19 at 19:00, Huang, JinHuiEric wrote:

Actually, I do prevent in-use PTs from being removed, with this:

+               r = amdgpu_vm_remove_ptes(adev, vm,
+                               (mapping->start + 0x1ff) & (~0x1ffll),
+                               (mapping->last + 1) & (~0x1ffll));

This only removes page tables that are aligned to 2M. I have tested it, at least on the KFD tests, without anything breaking.

By the way, I am not familiar with the memory code. This patch is the best I can do for now. Could you take a look at the Jira ticket SWDEV-201443 and find a better solution? Thanks!

Regards,

Eric

On 2019-10-30 1:57 p.m., Christian König wrote:
One thing I've forgotten:

What you could maybe do to improve the situation is to join adjacent ranges in amdgpu_vm_clear_freed(), but I'm not sure how likely it is that the ranges are all freed together.

The only other alternative I can see would be to check the mappings of a range in amdgpu_vm_update_ptes() and see if you could walk up the tree when the valid flag is not set and there are no mappings left for a page table.

Regards,
Christian.

On 30.10.19 at 18:42, Koenig, Christian wrote:
The valid flag doesn't take effect in this function.
That's irrelevant.

See, what amdgpu_vm_update_ptes() does is first determine the fragment size:
amdgpu_vm_fragment(params, frag_start, end, flags, &frag, &frag_end);

Then we walk down the tree:
        amdgpu_vm_pt_start(adev, params->vm, start, &cursor);
        while (cursor.pfn < end) {

And make sure that the page tables covering the address range are actually allocated:
                r = amdgpu_vm_alloc_pts(params->adev, params->vm, &cursor);

Then we update the tables with the flags and addresses and free up subsequent tables in the case of huge pages or freed up areas:
                        /* Free all child entries */
                        while (cursor.pfn < frag_start) {
                                amdgpu_vm_free_pts(adev, params->vm, &cursor);
                                amdgpu_vm_pt_next(adev, &cursor);
                        }

This is the maximum you can free, cause all other page tables are not completely covered by the range and so potentially still in use.

And I have the strong suspicion that this is what your patch is actually doing wrong. In other words you are also freeing page tables which are only partially covered by the range and so potentially still in use.

Since we don't track how many entries in a page table are currently valid and how many are invalid, we can't actually implement what you are trying to do here. So the patch is definitely broken in some way.
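The point about missing tracking can be illustrated with a toy model in C. This is only a sketch under loud assumptions: the toy_ptb structure and its valid_count field are hypothetical and do not exist in amdgpu; they show what kind of bookkeeping would be needed before a partially covered page table could be freed safely.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PTB_ENTRIES 512u

/* Hypothetical page table with a per-table count of valid entries. */
struct toy_ptb {
	bool valid[TOY_PTB_ENTRIES];
	unsigned valid_count;
};

/* Mark entries [first, last] valid and keep the counter in sync. */
static void toy_map(struct toy_ptb *pt, unsigned first, unsigned last)
{
	for (unsigned i = first; i <= last && i < TOY_PTB_ENTRIES; i++) {
		if (!pt->valid[i]) {
			pt->valid[i] = true;
			pt->valid_count++;
		}
	}
}

/*
 * Mark entries [first, last] invalid. Returns true only when no valid
 * entry is left, i.e. when freeing the whole table would be safe.
 */
static bool toy_unmap(struct toy_ptb *pt, unsigned first, unsigned last)
{
	for (unsigned i = first; i <= last && i < TOY_PTB_ENTRIES; i++) {
		if (pt->valid[i]) {
			pt->valid[i] = false;
			pt->valid_count--;
		}
	}
	return pt->valid_count == 0;
}
```

With two mappings sharing one table, unmapping the first must not free the table, because the second half is still valid. Without a counter like this, the only safe choice is to free tables that the unmapped range covers completely.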

Regards,
Christian.

On 30.10.19 at 17:55, Huang, JinHuiEric wrote:

The valid flag doesn't take effect in this function. amdgpu_vm_alloc_pts() is always executed, depending only on "cursor.pfn < end". The valid flag is only checked here, for ASICs below GMC v9:

if (adev->asic_type < CHIP_VEGA10 &&
            (flags & AMDGPU_PTE_VALID))...

Regards,

Eric

On 2019-10-30 12:30 p.m., Koenig, Christian wrote:


On 30.10.2019 at 17:19, "Huang, JinHuiEric" <JinHuiEric.Huang@amd.com> wrote:

I tested it, and it saves a lot of VRAM in the KFD big buffer stress test. I think there are two reasons:

1. Calling amdgpu_vm_update_ptes() during unmapping allocates unnecessary PTs, because there is no flag in amdgpu_vm_update_ptes() to determine whether the VA is being mapped or unmapped. Avoiding this saves most of the memory.

That's not correct. The valid flag is used for this.


2. Intentionally removing the PTs of the unmapped range is the logically expected behavior, although it does not remove that many PTs.

Well I actually don't see a change to what update_ptes is doing and have the strong suspicion that the patch is simply broken.

Either you free page tables which are potentially still in use, or update_ptes doesn't free page tables when the valid bit is not set.

Regards,
Christian.




Regards,

Eric

On 2019-10-30 11:57 a.m., Koenig, Christian wrote:


On 30.10.2019 at 16:47, "Kuehling, Felix" <Felix.Kuehling@amd.com> wrote:
On 2019-10-30 9:52 a.m., Christian König wrote:
> On 29.10.19 at 21:06, Huang, JinHuiEric wrote:
>> The issue is that PT BOs are not freed when a VA is unmapped,
>> which causes VRAM usage to accumulate to a huge amount in some
>> memory stress tests, such as the KFD big buffer stress test.
>> The function amdgpu_vm_bo_update_mapping() is called by both
>> amdgpu_vm_bo_update() and amdgpu_vm_clear_freed(). The
>> solution is to replace amdgpu_vm_bo_update_mapping() in
>> amdgpu_vm_clear_freed() with a function that removes the PT BOs
>> to save VRAM.
>
> NAK, that is intentional behavior.
>
> Otherwise we can run into out of memory situations when page tables
> need to be allocated again under stress.

That's a bit arbitrary and inconsistent. We are freeing page tables in
other situations, when a mapping uses huge pages in
amdgpu_vm_update_ptes. Why not when a mapping is destroyed completely?

I'm actually a bit surprised that the huge-page handling in
amdgpu_vm_update_ptes isn't kicking in to free up lower-level page
tables when a BO is unmapped.

Well it does free the lower level, and that is already causing problems (that's why I added the reserved space).

What we don't do is free the higher levels.

E.g. when you free a 2MB BO we free the lowest level, if we free a 1GB BO we free the two lowest levels etc...
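The 2MB and 1GB figures follow from simple arithmetic, sketched below in C. The sketch assumes 4KB pages and 9 address bits (512 entries) per page table level, and a range that is aligned to its own size; the toy_* names are illustrative and not part of the driver.

```c
#include <stdint.h>

#define TOY_PAGE_SHIFT     12u /* assumed 4KB pages */
#define TOY_BITS_PER_LEVEL 9u  /* assumed 512 entries per table */
#define TOY_MAX_LEVELS     4u

/* Bytes of address space covered by one table; level 0 is the lowest. */
static uint64_t toy_table_coverage(unsigned level)
{
	return 1ULL << (TOY_PAGE_SHIFT + TOY_BITS_PER_LEVEL * (level + 1));
}

/*
 * Number of lowest levels whose tables are completely covered by a
 * size-aligned range of size_bytes, and which can therefore be freed.
 */
static unsigned toy_levels_fully_covered(uint64_t size_bytes)
{
	unsigned n = 0;

	while (n < TOY_MAX_LEVELS && size_bytes >= toy_table_coverage(n))
		n++;
	return n;
}
```

Under these assumptions a lowest-level table covers 2MB and the next level covers 1GB, so a 2MB range frees one level and a 1GB range frees two.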

The problem with freeing the higher levels is that you don't know who else is using them. E.g. we would need to check all entries when we unmap one.

It's simply not worth it for a maximum saving of 2MB per VM.

Writing this, I'm actually wondering how you ran into this issue? There shouldn't be much savings from this.

Regards,
Christian.


Regards,
   Felix


>
> Regards,
> Christian.
>
>>
>> Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
>> Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56
>> +++++++++++++++++++++++++++++-----
>>   1 file changed, 48 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 0f4c3b2..8a480c7 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -1930,6 +1930,51 @@ static void amdgpu_vm_prt_fini(struct
>> amdgpu_device *adev, struct amdgpu_vm *vm)
>>   }
>>     /**
>> + * amdgpu_vm_remove_ptes - free PT BOs
>> + *
>> + * @adev: amdgpu device structure
>> + * @vm: amdgpu vm structure
>> + * @start: start of mapped range
>> + * @end: end of mapped entry
>> + *
>> + * Free the page table level.
>> + */
>> +static int amdgpu_vm_remove_ptes(struct amdgpu_device *adev,
>> +        struct amdgpu_vm *vm, uint64_t start, uint64_t end)
>> +{
>> +    struct amdgpu_vm_pt_cursor cursor;
>> +    unsigned shift, num_entries;
>> +
>> +    amdgpu_vm_pt_start(adev, vm, start, &cursor);
>> +    while (cursor.level < AMDGPU_VM_PTB) {
>> +        if (!amdgpu_vm_pt_descendant(adev, &cursor))
>> +            return -ENOENT;
>> +    }
>> +
>> +    while (cursor.pfn < end) {
>> +        amdgpu_vm_free_table(cursor.entry);
>> +        num_entries = amdgpu_vm_num_entries(adev, cursor.level - 1);
>> +
>> +        if (cursor.entry != &cursor.parent->entries[num_entries - 1]) {
>> +            /* Next ptb entry */
>> +            shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>> +            cursor.pfn += 1ULL << shift;
>> +            cursor.pfn &= ~((1ULL << shift) - 1);
>> +            cursor.entry++;
>> +        } else {
>> +            /* Next ptb entry in next pd0 entry */
>> +            amdgpu_vm_pt_ancestor(&cursor);
>> +            shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>> +            cursor.pfn += 1ULL << shift;
>> +            cursor.pfn &= ~((1ULL << shift) - 1);
>> +            amdgpu_vm_pt_descendant(adev, &cursor);
>> +        }
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +/**
>>    * amdgpu_vm_clear_freed - clear freed BOs in the PT
>>    *
>>    * @adev: amdgpu_device pointer
>> @@ -1949,7 +1994,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device
>> *adev,
>>                 struct dma_fence **fence)
>>   {
>>       struct amdgpu_bo_va_mapping *mapping;
>> -    uint64_t init_pte_value = 0;
>>       struct dma_fence *f = NULL;
>>       int r;
>>   @@ -1958,13 +2002,10 @@ int amdgpu_vm_clear_freed(struct
>> amdgpu_device *adev,
>>               struct amdgpu_bo_va_mapping, list);
>>           list_del(&mapping->list);
>>   -        if (vm->pte_support_ats &&
>> -            mapping->start < AMDGPU_GMC_HOLE_START)
>> -            init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
>> +        r = amdgpu_vm_remove_ptes(adev, vm,
>> +                (mapping->start + 0x1ff) & (~0x1ffll),
>> +                (mapping->last + 1) & (~0x1ffll));
>>   -        r = amdgpu_vm_bo_update_mapping(adev, vm, false, NULL,
>> -                        mapping->start, mapping->last,
>> -                        init_pte_value, 0, NULL, &f);
>>           amdgpu_vm_free_mapping(adev, vm, mapping, f);
>>           if (r) {
>>               dma_fence_put(f);
>> @@ -1980,7 +2021,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device
>> *adev,
>>       }
>>         return 0;
>> -
>>   }
>>     /**
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] drm/amdgpu: remove PT BOs when unmapping
@ 2019-10-31 10:41                         ` Koenig, Christian
  0 siblings, 0 replies; 32+ messages in thread
From: Koenig, Christian @ 2019-10-31 10:41 UTC (permalink / raw)
  To: Huang, JinHuiEric
  Cc: Kuehling, Felix, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



Just tested this and amdgpu_vm_update_ptes() indeed works as expected.

When you free at least 2MB, the lowest level of page tables is freed up again.

BTW: What hardware have you tested this on? On gfx8 and older it is expected that page tables are never freed.

Regards,
Christian.

On 30.10.19 at 19:11, Christian König wrote:
Then I don't see how this patch actually changes anything.

Could only be a bug in amdgpu_vm_update_ptes(). Going to investigate this, but I won't have time to look into the ticket in detail.

Regards,
Christian.

On 30.10.19 at 19:00, Huang, JinHuiEric wrote:

Actually, I do prevent in-use PTs from being removed with this:

+               r = amdgpu_vm_remove_ptes(adev, vm,
+                               (mapping->start + 0x1ff) & (~0x1ffll),
+                               (mapping->last + 1) & (~0x1ffll));

This only removes page tables that are aligned to 2MB. I have also tested it, at least with the KFD tests, without anything breaking.

By the way, I am not familiar with the memory stuff. This patch is the best I can do for now. Could you take a look at the Jira ticket SWDEV-201443 and find a better solution? Thanks!

Regards,

Eric

On 2019-10-30 1:57 p.m., Christian König wrote:
One thing I've forgotten:

What you could maybe do to improve the situation is to join adjacent ranges in amdgpu_vm_clear_freed(), but I'm not sure how likely it is that the ranges are all freed together.

The only other alternative I can see would be to check the mappings of a range in amdgpu_vm_update_ptes() and see if you could walk up the tree when the valid flag is not set and there are no mappings left for a page table.

Regards,
Christian.

On 30.10.19 at 18:42, Koenig, Christian wrote:
> The valid flag doesn't take effect in this function.

That's irrelevant.

What amdgpu_vm_update_ptes() does is first determine the fragment size:
amdgpu_vm_fragment(params, frag_start, end, flags, &frag, &frag_end);

Then we walk down the tree:
        amdgpu_vm_pt_start(adev, params->vm, start, &cursor);
        while (cursor.pfn < end) {

And make sure that the page tables covering the address range are actually allocated:
                r = amdgpu_vm_alloc_pts(params->adev, params->vm, &cursor);

Then we update the tables with the flags and addresses and free up subsequent tables in the case of huge pages or freed up areas:
                        /* Free all child entries */
                        while (cursor.pfn < frag_start) {
                                amdgpu_vm_free_pts(adev, params->vm, &cursor);
                                amdgpu_vm_pt_next(adev, &cursor);
                        }

This is the maximum you can free, because all other page tables are not completely covered by the range and so are potentially still in use.

And I have the strong suspicion that this is what your patch is actually doing wrong. In other words you are also freeing page tables which are only partially covered by the range and so potentially still in use.

Since we don't track how many entries in a page table are currently valid and how many are invalid, we can't actually implement what you are trying to do here. So the patch is definitely broken in some way.

Regards,
Christian.

On 30.10.19 at 17:55, Huang, JinHuiEric wrote:

The valid flag doesn't take effect in this function. amdgpu_vm_alloc_pts() is always executed, depending only on "cursor.pfn < end". The valid flag is only checked here, for ASICs below GMC v9:

if (adev->asic_type < CHIP_VEGA10 &&
            (flags & AMDGPU_PTE_VALID))...

Regards,

Eric

On 2019-10-30 12:30 p.m., Koenig, Christian wrote:


On 30.10.2019 at 17:19, "Huang, JinHuiEric" <JinHuiEric.Huang@amd.com> wrote:

I tested it, and it saves a lot of VRAM in the KFD big buffer stress test. I think there are two reasons:

1. Calling amdgpu_vm_update_ptes() during unmapping allocates unnecessary PTs, because there is no flag in amdgpu_vm_update_ptes() to determine whether the VA is being mapped or unmapped. Avoiding this saves most of the memory.

That's not correct. The valid flag is used for this.


2. Intentionally removing the PTs of the unmapped range is the logically expected behavior, although it does not remove that many PTs.

Well I actually don't see a change to what update_ptes is doing and have the strong suspicion that the patch is simply broken.

Either you free page tables which are potentially still in use, or update_ptes doesn't free page tables when the valid bit is not set.

Regards,
Christian.




Regards,

Eric

On 2019-10-30 11:57 a.m., Koenig, Christian wrote:


On 30.10.2019 at 16:47, "Kuehling, Felix" <Felix.Kuehling@amd.com> wrote:
On 2019-10-30 9:52 a.m., Christian König wrote:
> On 29.10.19 at 21:06, Huang, JinHuiEric wrote:
>> The issue is that PT BOs are not freed when a VA is unmapped,
>> which causes VRAM usage to accumulate to a huge amount in some
>> memory stress tests, such as the KFD big buffer stress test.
>> The function amdgpu_vm_bo_update_mapping() is called by both
>> amdgpu_vm_bo_update() and amdgpu_vm_clear_freed(). The
>> solution is to replace amdgpu_vm_bo_update_mapping() in
>> amdgpu_vm_clear_freed() with a function that removes the PT BOs
>> to save VRAM.
>
> NAK, that is intentional behavior.
>
> Otherwise we can run into out of memory situations when page tables
> need to be allocated again under stress.

That's a bit arbitrary and inconsistent. We are freeing page tables in
other situations, when a mapping uses huge pages in
amdgpu_vm_update_ptes. Why not when a mapping is destroyed completely?

I'm actually a bit surprised that the huge-page handling in
amdgpu_vm_update_ptes isn't kicking in to free up lower-level page
tables when a BO is unmapped.

Well it does free the lower level, and that is already causing problems (that's why I added the reserved space).

What we don't do is free the higher levels.

E.g. when you free a 2MB BO we free the lowest level, if we free a 1GB BO we free the two lowest levels etc...

The problem with freeing the higher levels is that you don't know who else is using them. E.g. we would need to check all entries when we unmap one.

It's simply not worth it for a maximum saving of 2MB per VM.

Writing this, I'm actually wondering how you ran into this issue? There shouldn't be much savings from this.

Regards,
Christian.


Regards,
   Felix


>
> Regards,
> Christian.
>
>>
>> Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
>> Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56
>> +++++++++++++++++++++++++++++-----
>>   1 file changed, 48 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 0f4c3b2..8a480c7 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -1930,6 +1930,51 @@ static void amdgpu_vm_prt_fini(struct
>> amdgpu_device *adev, struct amdgpu_vm *vm)
>>   }
>>     /**
>> + * amdgpu_vm_remove_ptes - free PT BOs
>> + *
>> + * @adev: amdgpu device structure
>> + * @vm: amdgpu vm structure
>> + * @start: start of mapped range
>> + * @end: end of mapped entry
>> + *
>> + * Free the page table level.
>> + */
>> +static int amdgpu_vm_remove_ptes(struct amdgpu_device *adev,
>> +        struct amdgpu_vm *vm, uint64_t start, uint64_t end)
>> +{
>> +    struct amdgpu_vm_pt_cursor cursor;
>> +    unsigned shift, num_entries;
>> +
>> +    amdgpu_vm_pt_start(adev, vm, start, &cursor);
>> +    while (cursor.level < AMDGPU_VM_PTB) {
>> +        if (!amdgpu_vm_pt_descendant(adev, &cursor))
>> +            return -ENOENT;
>> +    }
>> +
>> +    while (cursor.pfn < end) {
>> +        amdgpu_vm_free_table(cursor.entry);
>> +        num_entries = amdgpu_vm_num_entries(adev, cursor.level - 1);
>> +
>> +        if (cursor.entry != &cursor.parent->entries[num_entries - 1]) {
>> +            /* Next ptb entry */
>> +            shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>> +            cursor.pfn += 1ULL << shift;
>> +            cursor.pfn &= ~((1ULL << shift) - 1);
>> +            cursor.entry++;
>> +        } else {
>> +            /* Next ptb entry in next pd0 entry */
>> +            amdgpu_vm_pt_ancestor(&cursor);
>> +            shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>> +            cursor.pfn += 1ULL << shift;
>> +            cursor.pfn &= ~((1ULL << shift) - 1);
>> +            amdgpu_vm_pt_descendant(adev, &cursor);
>> +        }
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +/**
>>    * amdgpu_vm_clear_freed - clear freed BOs in the PT
>>    *
>>    * @adev: amdgpu_device pointer
>> @@ -1949,7 +1994,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device
>> *adev,
>>                 struct dma_fence **fence)
>>   {
>>       struct amdgpu_bo_va_mapping *mapping;
>> -    uint64_t init_pte_value = 0;
>>       struct dma_fence *f = NULL;
>>       int r;
>>   @@ -1958,13 +2002,10 @@ int amdgpu_vm_clear_freed(struct
>> amdgpu_device *adev,
>>               struct amdgpu_bo_va_mapping, list);
>>           list_del(&mapping->list);
>>   -        if (vm->pte_support_ats &&
>> -            mapping->start < AMDGPU_GMC_HOLE_START)
>> -            init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
>> +        r = amdgpu_vm_remove_ptes(adev, vm,
>> +                (mapping->start + 0x1ff) & (~0x1ffll),
>> +                (mapping->last + 1) & (~0x1ffll));
>>   -        r = amdgpu_vm_bo_update_mapping(adev, vm, false, NULL,
>> -                        mapping->start, mapping->last,
>> -                        init_pte_value, 0, NULL, &f);
>>           amdgpu_vm_free_mapping(adev, vm, mapping, f);
>>           if (r) {
>>               dma_fence_put(f);
>> @@ -1980,7 +2021,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device
>> *adev,
>>       }
>>         return 0;
>> -
>>   }
>>     /**
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] drm/amdgpu: remove PT BOs when unmapping
@ 2019-10-31 10:41                         ` Koenig, Christian
  0 siblings, 0 replies; 32+ messages in thread
From: Koenig, Christian @ 2019-10-31 10:41 UTC (permalink / raw)
  To: Huang, JinHuiEric; +Cc: Kuehling, Felix, amd-gfx


[-- Attachment #1.1: Type: text/plain, Size: 10583 bytes --]

Just tested this and amdgpu_vm_update_ptes() indeed works as expected.

When you free at least a 2MB the lowest level of page tables is freed up again.

BTW: What hardware have you tested this on? On gfx8 and older it is expected that page tables are never freed.

Regards,
Christian.

Am 30.10.19 um 19:11 schrieb Christian König:
Then I don't see how this patch actually changes anything.

Could only be a bug in amdgpu_vm_update_ptes(). Going to investigate this, but I won't have time to look into the ticket in detail.

Regards,
Christian.

Am 30.10.19 um 19:00 schrieb Huang, JinHuiEric:

Actually I do prevent to remove in-use pts by this:

+               r = amdgpu_vm_remove_ptes(adev, vm,
+                               (mapping->start + 0x1ff) & (~0x1ffll),
+                               (mapping->last + 1) & (~0x1ffll));

Which is only removing aligned page table for 2M. And I have tested it at least on KFD tests without anything broken.

By the way, I am not familiar with memory staff. This patch is the best I can do for now. Could you take a look at the Jira ticket SWDEV-201443 ? and find the better solution. Thanks!

Regards,

Eric

On 2019-10-30 1:57 p.m., Christian König wrote:
One thing I've forgotten:

What you could maybe do to improve the situation is to join adjacent ranges in amdgpu_vm_clear_freed(), but I'm not sure how the chances are that the ranges are freed all together.

The only other alternative I can see would be to check the mappings of a range in amdgpu_update_ptes() and see if you could walk the tree up if the valid flag is not set and there are no mappings left for a page table.

Regards,
Christian.

Am 30.10.19 um 18:42 schrieb Koenig, Christian:
The vaild flag doesn't take effect in this function.
That's irrelevant.

See what amdgpu_vm_update_ptes() does is to first determine the fragment size:
amdgpu_vm_fragment(params, frag_start, end, flags, &frag, &frag_end);

Then we walk down the tree:
        amdgpu_vm_pt_start(adev, params->vm, start, &cursor);
        while (cursor.pfn < end) {

And make sure that the page tables covering the address range are actually allocated:
                r = amdgpu_vm_alloc_pts(params->adev, params->vm, &cursor);

Then we update the tables with the flags and addresses and free up subsequent tables in the case of huge pages or freed up areas:
                        /* Free all child entries */
                        while (cursor.pfn < frag_start) {
                                amdgpu_vm_free_pts(adev, params->vm, &cursor);
                                amdgpu_vm_pt_next(adev, &cursor);
                        }

This is the maximum you can free, because all other page tables are not completely covered by the range and so are potentially still in use.

And I have the strong suspicion that this is what your patch is actually doing wrong. In other words you are also freeing page tables which are only partially covered by the range and so potentially still in use.

Since we don't track how many entries in a page table are currently valid and how many are invalid, we actually can't implement what you are trying to do here. So the patch is definitely broken somehow.
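The distinction between completely and partially covered page tables can be sketched as follows. This is only an illustration under stated assumptions: 512 entries per table, a range given as page numbers with an exclusive `end` (matching the `cursor.pfn < end` loop condition), and a hypothetical `pt_classify` helper that does not exist in amdgpu.

```c
#include <stdint.h>

/* Assumption: one page table spans 512 consecutive pages. */
#define PT_ENTRIES 0x200ULL

enum pt_cover {
	PT_COVER_NONE,    /* range does not touch the table */
	PT_COVER_PARTIAL, /* some entries may still be in use by others */
	PT_COVER_FULL     /* every entry falls inside the range */
};

/*
 * Hypothetical helper: classify the table starting at 'table_pfn'
 * against the page range [start, end).
 */
static enum pt_cover pt_classify(uint64_t table_pfn,
				 uint64_t start, uint64_t end)
{
	uint64_t table_end = table_pfn + PT_ENTRIES;

	if (end <= table_pfn || start >= table_end)
		return PT_COVER_NONE;
	if (start <= table_pfn && end >= table_end)
		return PT_COVER_FULL;
	return PT_COVER_PARTIAL;
}
```

Without a per-table count of valid entries, only a table classified as `PT_COVER_FULL` is known to be unused after the range is unmapped; a `PT_COVER_PARTIAL` table may still hold entries belonging to other mappings.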

Regards,
Christian.

Am 30.10.19 um 17:55 schrieb Huang, JinHuiEric:

The valid flag doesn't take effect in this function. amdgpu_vm_alloc_pts() is always executed; it depends only on "cursor.pfn < end". The valid flag is only checked here, for ASICs below GMC v9:

if (adev->asic_type < CHIP_VEGA10 &&
            (flags & AMDGPU_PTE_VALID))...

Regards,

Eric

On 2019-10-30 12:30 p.m., Koenig, Christian wrote:


Am 30.10.2019 17:19 schrieb "Huang, JinHuiEric" <JinHuiEric.Huang@amd.com>:

I tested it, and it saves a lot of VRAM in the KFD big buffer stress test. I think there are two reasons:

1. Calling amdgpu_vm_update_ptes() during unmapping allocates unnecessary PTs, because there is no flag in amdgpu_vm_update_ptes() to determine whether the VA is being mapped or unmapped. This accounts for most of the memory saved.

That's not correct. The valid flag is used for this.


2. Intentionally removing the PTs of unmapped ranges is the logically expected behavior, although it does not remove that many PTs.

Well I actually don't see a change to what update_ptes is doing and have the strong suspicion that the patch is simply broken.

You either free page tables which are potentially still in use, or update_pte doesn't free page tables when the valid bit is not set.

Regards,
Christian.




Regards,

Eric

On 2019-10-30 11:57 a.m., Koenig, Christian wrote:


Am 30.10.2019 16:47 schrieb "Kuehling, Felix" <Felix.Kuehling@amd.com>:
On 2019-10-30 9:52 a.m., Christian König wrote:
> Am 29.10.19 um 21:06 schrieb Huang, JinHuiEric:
>> The issue is PT BOs are not freed when unmapping VA,
>> which causes vram usage accumulated is huge in some
>> memory stress test, such as kfd big buffer stress test.
>> Function amdgpu_vm_bo_update_mapping() is called by both
>> amdgpu_vm_bo_update() and amdgpu_vm_clear_freed(). The
>> solution is replacing amdgpu_vm_bo_update_mapping() in
>> amdgpu_vm_clear_freed() with removing PT BOs function
>> to save vram usage.
>
> NAK, that is intentional behavior.
>
> Otherwise we can run into out of memory situations when page tables
> need to be allocated again under stress.

That's a bit arbitrary and inconsistent. We are freeing page tables in
other situations, when a mapping uses huge pages in
amdgpu_vm_update_ptes. Why not when a mapping is destroyed completely?

I'm actually a bit surprised that the huge-page handling in
amdgpu_vm_update_ptes isn't kicking in to free up lower-level page
tables when a BO is unmapped.

Well it does free the lower level, and that is already causing problems (that's why I added the reserved space).

What we don't do is free the higher levels.

E.g. when you free a 2MB BO we free the lowest level, if we free a 1GB BO we free the two lowest levels etc...
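The 2MB and 1GB figures follow from the address span of each page table level. A minimal sketch of that arithmetic, assuming 4KB pages and 512 entries (9 address bits) per level; the actual values depend on hardware and driver configuration, and the helper names are hypothetical:

```c
#include <stdint.h>

#define GPU_PAGE_SHIFT 12u /* assumption: 4KB pages */
#define BITS_PER_LEVEL 9u  /* assumption: 512 entries per table */

/*
 * Bytes addressed by one table that sits 'level' steps above the PTEs
 * (level 1 = lowest page table, level 2 = its parent directory, ...).
 */
static uint64_t table_span_bytes(unsigned level)
{
	return 1ULL << (GPU_PAGE_SHIFT + BITS_PER_LEVEL * level);
}

/*
 * How many of the lowest levels a suitably aligned BO of 'size' bytes
 * can cover completely. Capped at 4 levels to keep the shift in range.
 */
static unsigned levels_fully_covered(uint64_t size)
{
	unsigned level = 0;

	while (level < 4 && size >= table_span_bytes(level + 1))
		level++;
	return level;
}
```

Under these assumptions a lowest-level table spans 2MB and its parent spans 1GB, so an aligned 2MB BO covers one level completely and an aligned 1GB BO covers two.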

The problem with freeing the higher levels is that you don't know who else is using them. E.g. we would need to check all entries when we unmap one.

It's simply not worth it for a maximum saving of 2MB per VM.

Writing this I'm actually wondering how you ended up in this issue? There shouldn't be much savings from this.

Regards,
Christian.


Regards,
   Felix


>
> Regards,
> Christian.
>
>>
>> Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
>> Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56
>> +++++++++++++++++++++++++++++-----
>>   1 file changed, 48 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 0f4c3b2..8a480c7 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -1930,6 +1930,51 @@ static void amdgpu_vm_prt_fini(struct
>> amdgpu_device *adev, struct amdgpu_vm *vm)
>>   }
>>     /**
>> + * amdgpu_vm_remove_ptes - free PT BOs
>> + *
>> + * @adev: amdgpu device structure
>> + * @vm: amdgpu vm structure
>> + * @start: start of mapped range
>> + * @end: end of mapped entry
>> + *
>> + * Free the page table level.
>> + */
>> +static int amdgpu_vm_remove_ptes(struct amdgpu_device *adev,
>> +        struct amdgpu_vm *vm, uint64_t start, uint64_t end)
>> +{
>> +    struct amdgpu_vm_pt_cursor cursor;
>> +    unsigned shift, num_entries;
>> +
>> +    amdgpu_vm_pt_start(adev, vm, start, &cursor);
>> +    while (cursor.level < AMDGPU_VM_PTB) {
>> +        if (!amdgpu_vm_pt_descendant(adev, &cursor))
>> +            return -ENOENT;
>> +    }
>> +
>> +    while (cursor.pfn < end) {
>> +        amdgpu_vm_free_table(cursor.entry);
>> +        num_entries = amdgpu_vm_num_entries(adev, cursor.level - 1);
>> +
>> +        if (cursor.entry != &cursor.parent->entries[num_entries - 1]) {
>> +            /* Next ptb entry */
>> +            shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>> +            cursor.pfn += 1ULL << shift;
>> +            cursor.pfn &= ~((1ULL << shift) - 1);
>> +            cursor.entry++;
>> +        } else {
>> +            /* Next ptb entry in next pd0 entry */
>> +            amdgpu_vm_pt_ancestor(&cursor);
>> +            shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>> +            cursor.pfn += 1ULL << shift;
>> +            cursor.pfn &= ~((1ULL << shift) - 1);
>> +            amdgpu_vm_pt_descendant(adev, &cursor);
>> +        }
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +/**
>>    * amdgpu_vm_clear_freed - clear freed BOs in the PT
>>    *
>>    * @adev: amdgpu_device pointer
>> @@ -1949,7 +1994,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device
>> *adev,
>>                 struct dma_fence **fence)
>>   {
>>       struct amdgpu_bo_va_mapping *mapping;
>> -    uint64_t init_pte_value = 0;
>>       struct dma_fence *f = NULL;
>>       int r;
>>   @@ -1958,13 +2002,10 @@ int amdgpu_vm_clear_freed(struct
>> amdgpu_device *adev,
>>               struct amdgpu_bo_va_mapping, list);
>>           list_del(&mapping->list);
>>   -        if (vm->pte_support_ats &&
>> -            mapping->start < AMDGPU_GMC_HOLE_START)
>> -            init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
>> +        r = amdgpu_vm_remove_ptes(adev, vm,
>> +                (mapping->start + 0x1ff) & (~0x1ffll),
>> +                (mapping->last + 1) & (~0x1ffll));
>>   -        r = amdgpu_vm_bo_update_mapping(adev, vm, false, NULL,
>> -                        mapping->start, mapping->last,
>> -                        init_pte_value, 0, NULL, &f);
>>           amdgpu_vm_free_mapping(adev, vm, mapping, f);
>>           if (r) {
>>               dma_fence_put(f);
>> @@ -1980,7 +2021,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device
>> *adev,
>>       }
>>         return 0;
>> -
>>   }
>>     /**
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[-- Attachment #1.2: Type: text/html, Size: 20941 bytes --]

[-- Attachment #2: Type: text/plain, Size: 153 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] drm/amdgpu: remove PT BOs when unmapping
@ 2019-10-31 14:08                             ` StDenis, Tom
  0 siblings, 0 replies; 32+ messages in thread
From: StDenis, Tom @ 2019-10-31 14:08 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Koenig, Christian

I could try it on my carrizo/polaris setup.  Is there a test procedure I 
could follow to trigger the changed code paths?


Tom

On 2019-10-31 6:41 a.m., Koenig, Christian wrote:
> Just tested this and amdgpu_vm_update_ptes() indeed works as expected.
>
> When you free at least a 2MB the lowest level of page tables is freed 
> up again.
>
> BTW: What hardware have you tested this on? On gfx8 and older it is 
> expected that page tables are never freed.
>
> Regards,
> Christian.
>
> Am 30.10.19 um 19:11 schrieb Christian König:
>> Then I don't see how this patch actually changes anything.
>>
>> Could only be a bug in amdgpu_vm_update_ptes(). Going to investigate 
>> this, but I won't have time to look into the ticket in detail.
>>
>> Regards,
>> Christian.
>>
>> Am 30.10.19 um 19:00 schrieb Huang, JinHuiEric:
>>>
>>> Actually I do prevent to remove in-use pts by this:
>>>
>>> +               r = amdgpu_vm_remove_ptes(adev, vm,
>>> +                               (mapping->start + 0x1ff) & (~0x1ffll),
>>> +                               (mapping->last + 1) & (~0x1ffll));
>>>
>>> Which is only removing aligned page table for 2M. And I have tested 
>>> it at least on KFD tests without anything broken.
>>>
>>> By the way, I am not familiar with memory staff. This patch is the 
>>> best I can do for now. Could you take a look at the Jira ticket 
>>> SWDEV-201443 ? and find the better solution. Thanks!
>>>
>>> Regards,
>>>
>>> Eric
>>>
>>> On 2019-10-30 1:57 p.m., Christian König wrote:
>>>> One thing I've forgotten:
>>>>
>>>> What you could maybe do to improve the situation is to join 
>>>> adjacent ranges in amdgpu_vm_clear_freed(), but I'm not sure how 
>>>> the chances are that the ranges are freed all together.
>>>>
>>>> The only other alternative I can see would be to check the mappings 
>>>> of a range in amdgpu_update_ptes() and see if you could walk the 
>>>> tree up if the valid flag is not set and there are no mappings left 
>>>> for a page table.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>> Am 30.10.19 um 18:42 schrieb Koenig, Christian:
>>>>>> The vaild flag doesn't take effect in this function.
>>>>> That's irrelevant.
>>>>>
>>>>> See what amdgpu_vm_update_ptes() does is to first determine the 
>>>>> fragment size:
>>>>>> amdgpu_vm_fragment(params, frag_start, end, flags, &frag, &frag_end);
>>>>>
>>>>> Then we walk down the tree:
>>>>>>         amdgpu_vm_pt_start(adev, params->vm, start, &cursor);
>>>>>>         while (cursor.pfn < end) {
>>>>>
>>>>> And make sure that the page tables covering the address range are 
>>>>> actually allocated:
>>>>>>                 r = amdgpu_vm_alloc_pts(params->adev, params->vm, 
>>>>>> &cursor);
>>>>>
>>>>> Then we update the tables with the flags and addresses and free up 
>>>>> subsequent tables in the case of huge pages or freed up areas:
>>>>>>                         /* Free all child entries */
>>>>>>                         while (cursor.pfn < frag_start) {
>>>>>> amdgpu_vm_free_pts(adev, params->vm, &cursor);
>>>>>> amdgpu_vm_pt_next(adev, &cursor);
>>>>>>                         }
>>>>>
>>>>> This is the maximum you can free, cause all other page tables are 
>>>>> not completely covered by the range and so potentially still in use.
>>>>>
>>>>> And I have the strong suspicion that this is what your patch is 
>>>>> actually doing wrong. In other words you are also freeing page 
>>>>> tables which are only partially covered by the range and so 
>>>>> potentially still in use.
>>>>>
>>>>> Since we don't have any tracking how many entries in a page table 
>>>>> are currently valid and how many are invalid we actually can't 
>>>>> implement what you are trying to do here. So the patch is 
>>>>> definitely somehow broken.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>> Am 30.10.19 um 17:55 schrieb Huang, JinHuiEric:
>>>>>>
>>>>>> The vaild flag doesn't take effect in this function. 
>>>>>> amdgpu_vm_alloc_pts() is always executed that only depended on 
>>>>>> "cursor.pfn < end". The valid flag has only been checked on here 
>>>>>> for asic below GMC v9:
>>>>>>
>>>>>> if (adev->asic_type < CHIP_VEGA10 &&
>>>>>>             (flags & AMDGPU_PTE_VALID))...
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Eric
>>>>>>
>>>>>> On 2019-10-30 12:30 p.m., Koenig, Christian wrote:
>>>>>>>
>>>>>>>
>>>>>>> Am 30.10.2019 17:19 schrieb "Huang, JinHuiEric" 
>>>>>>> <JinHuiEric.Huang@amd.com>:
>>>>>>>
>>>>>>>     I tested it that it saves a lot of vram on KFD big buffer
>>>>>>>     stress test. I think there are two reasons:
>>>>>>>
>>>>>>>     1. Calling amdgpu_vm_update_ptes() during unmapping will
>>>>>>>     allocate unnecessary pts, because there is no flag to
>>>>>>>     determine if the VA is mapping or unmapping in function
>>>>>>>     amdgpu_vm_update_ptes(). It saves the most of memory.
>>>>>>>
>>>>>>> That's not correct. The valid flag is used for this.
>>>>>>>
>>>>>>>     2. Intentionally removing those unmapping pts is logical
>>>>>>>     expectation, although it is not removing so much pts.
>>>>>>>
>>>>>>> Well I actually don't see a change to what update_ptes is doing 
>>>>>>> and have the strong suspicion that the patch is simply broken.
>>>>>>>
>>>>>>> You either free page tables which are potentially still in use 
>>>>>>> or update_pte doesn't free page tables when the valid but is not 
>>>>>>> set.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Christian.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>     Regards,
>>>>>>>
>>>>>>>     Eric
>>>>>>>
>>>>>>>     On 2019-10-30 11:57 a.m., Koenig, Christian wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>         Am 30.10.2019 16:47 schrieb "Kuehling, Felix"
>>>>>>>         <Felix.Kuehling@amd.com> <mailto:Felix.Kuehling@amd.com>:
>>>>>>>
>>>>>>>             On 2019-10-30 9:52 a.m., Christian König wrote:
>>>>>>>             > Am 29.10.19 um 21:06 schrieb Huang, JinHuiEric:
>>>>>>>             >> The issue is PT BOs are not freed when unmapping VA,
>>>>>>>             >> which causes vram usage accumulated is huge in some
>>>>>>>             >> memory stress test, such as kfd big buffer stress
>>>>>>>             test.
>>>>>>>             >> Function amdgpu_vm_bo_update_mapping() is called
>>>>>>>             by both
>>>>>>>             >> amdgpu_vm_bo_update() and
>>>>>>>             amdgpu_vm_clear_freed(). The
>>>>>>>             >> solution is replacing
>>>>>>>             amdgpu_vm_bo_update_mapping() in
>>>>>>>             >> amdgpu_vm_clear_freed() with removing PT BOs function
>>>>>>>             >> to save vram usage.
>>>>>>>             >
>>>>>>>             > NAK, that is intentional behavior.
>>>>>>>             >
>>>>>>>             > Otherwise we can run into out of memory situations
>>>>>>>             when page tables
>>>>>>>             > need to be allocated again under stress.
>>>>>>>
>>>>>>>             That's a bit arbitrary and inconsistent. We are
>>>>>>>             freeing page tables in
>>>>>>>             other situations, when a mapping uses huge pages in
>>>>>>>             amdgpu_vm_update_ptes. Why not when a mapping is
>>>>>>>             destroyed completely?
>>>>>>>
>>>>>>>             I'm actually a bit surprised that the huge-page
>>>>>>>             handling in
>>>>>>>             amdgpu_vm_update_ptes isn't kicking in to free up
>>>>>>>             lower-level page
>>>>>>>             tables when a BO is unmapped.
>>>>>>>
>>>>>>>
>>>>>>>         Well it does free the lower level, and that is already
>>>>>>>         causing problems (that's why I added the reserved space).
>>>>>>>
>>>>>>>         What we don't do is freeing the higher levels.
>>>>>>>
>>>>>>>         E.g. when you free a 2MB BO we free the lowest level, if
>>>>>>>         we free a 1GB BO we free the two lowest levels etc...
>>>>>>>
>>>>>>>         The problem with freeing the higher levels is that you
>>>>>>>         don't know who is also using this. E.g. we would need to
>>>>>>>         check all entries when we unmap one.
>>>>>>>
>>>>>>>         It's simply not worth it for a maximum saving of 2MB per VM.
>>>>>>>
>>>>>>>         Writing this I'm actually wondering how you ended up in
>>>>>>>         this issue? There shouldn't be much savings from this.
>>>>>>>
>>>>>>>         Regards,
>>>>>>>         Christian.
>>>>>>>
>>>>>>>
>>>>>>>             Regards,
>>>>>>>                Felix
>>>>>>>
>>>>>>>
>>>>>>>             >
>>>>>>>             > Regards,
>>>>>>>             > Christian.
>>>>>>>             >
>>>>>>>             >>
>>>>>>>             >> Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
>>>>>>>             >> Signed-off-by: Eric Huang
>>>>>>>             <JinhuiEric.Huang@amd.com>
>>>>>>>             <mailto:JinhuiEric.Huang@amd.com>
>>>>>>>             >> ---
>>>>>>>             >> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56
>>>>>>>             >> +++++++++++++++++++++++++++++-----
>>>>>>>             >>   1 file changed, 48 insertions(+), 8 deletions(-)
>>>>>>>             >>
>>>>>>>             >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>             >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>             >> index 0f4c3b2..8a480c7 100644
>>>>>>>             >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>             >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>             >> @@ -1930,6 +1930,51 @@ static void
>>>>>>>             amdgpu_vm_prt_fini(struct
>>>>>>>             >> amdgpu_device *adev, struct amdgpu_vm *vm)
>>>>>>>             >>   }
>>>>>>>             >>     /**
>>>>>>>             >> + * amdgpu_vm_remove_ptes - free PT BOs
>>>>>>>             >> + *
>>>>>>>             >> + * @adev: amdgpu device structure
>>>>>>>             >> + * @vm: amdgpu vm structure
>>>>>>>             >> + * @start: start of mapped range
>>>>>>>             >> + * @end: end of mapped entry
>>>>>>>             >> + *
>>>>>>>             >> + * Free the page table level.
>>>>>>>             >> + */
>>>>>>>             >> +static int amdgpu_vm_remove_ptes(struct
>>>>>>>             amdgpu_device *adev,
>>>>>>>             >> + struct amdgpu_vm *vm, uint64_t start, uint64_t end)
>>>>>>>             >> +{
>>>>>>>             >> +    struct amdgpu_vm_pt_cursor cursor;
>>>>>>>             >> +    unsigned shift, num_entries;
>>>>>>>             >> +
>>>>>>>             >> + amdgpu_vm_pt_start(adev, vm, start, &cursor);
>>>>>>>             >> +    while (cursor.level < AMDGPU_VM_PTB) {
>>>>>>>             >> +        if (!amdgpu_vm_pt_descendant(adev, &cursor))
>>>>>>>             >> + return -ENOENT;
>>>>>>>             >> +    }
>>>>>>>             >> +
>>>>>>>             >> +    while (cursor.pfn < end) {
>>>>>>>             >> + amdgpu_vm_free_table(cursor.entry);
>>>>>>>             >> + num_entries = amdgpu_vm_num_entries(adev,
>>>>>>>             cursor.level - 1);
>>>>>>>             >> +
>>>>>>>             >> +        if (cursor.entry !=
>>>>>>>             &cursor.parent->entries[num_entries - 1]) {
>>>>>>>             >> + /* Next ptb entry */
>>>>>>>             >> + shift = amdgpu_vm_level_shift(adev,
>>>>>>>             cursor.level - 1);
>>>>>>>             >> + cursor.pfn += 1ULL << shift;
>>>>>>>             >> + cursor.pfn &= ~((1ULL << shift) - 1);
>>>>>>>             >> + cursor.entry++;
>>>>>>>             >> +        } else {
>>>>>>>             >> + /* Next ptb entry in next pd0 entry */
>>>>>>>             >> + amdgpu_vm_pt_ancestor(&cursor);
>>>>>>>             >> + shift = amdgpu_vm_level_shift(adev,
>>>>>>>             cursor.level - 1);
>>>>>>>             >> + cursor.pfn += 1ULL << shift;
>>>>>>>             >> + cursor.pfn &= ~((1ULL << shift) - 1);
>>>>>>>             >> + amdgpu_vm_pt_descendant(adev, &cursor);
>>>>>>>             >> +        }
>>>>>>>             >> +    }
>>>>>>>             >> +
>>>>>>>             >> +    return 0;
>>>>>>>             >> +}
>>>>>>>             >> +
>>>>>>>             >> +/**
>>>>>>>             >>    * amdgpu_vm_clear_freed - clear freed BOs in
>>>>>>>             the PT
>>>>>>>             >>    *
>>>>>>>             >>    * @adev: amdgpu_device pointer
>>>>>>>             >> @@ -1949,7 +1994,6 @@ int
>>>>>>>             amdgpu_vm_clear_freed(struct amdgpu_device
>>>>>>>             >> *adev,
>>>>>>>             >>                 struct dma_fence **fence)
>>>>>>>             >>   {
>>>>>>>             >>       struct amdgpu_bo_va_mapping *mapping;
>>>>>>>             >> -    uint64_t init_pte_value = 0;
>>>>>>>             >>       struct dma_fence *f = NULL;
>>>>>>>             >>       int r;
>>>>>>>             >>   @@ -1958,13 +2002,10 @@ int
>>>>>>>             amdgpu_vm_clear_freed(struct
>>>>>>>             >> amdgpu_device *adev,
>>>>>>>             >> struct amdgpu_bo_va_mapping, list);
>>>>>>>             >> list_del(&mapping->list);
>>>>>>>             >>   -        if (vm->pte_support_ats &&
>>>>>>>             >> - mapping->start < AMDGPU_GMC_HOLE_START)
>>>>>>>             >> - init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
>>>>>>>             >> +        r = amdgpu_vm_remove_ptes(adev, vm,
>>>>>>>             >> + (mapping->start + 0x1ff) & (~0x1ffll),
>>>>>>>             >> + (mapping->last + 1) & (~0x1ffll));
>>>>>>>             >>   -        r = amdgpu_vm_bo_update_mapping(adev,
>>>>>>>             vm, false, NULL,
>>>>>>>             >> - mapping->start, mapping->last,
>>>>>>>             >> - init_pte_value, 0, NULL, &f);
>>>>>>>             >> amdgpu_vm_free_mapping(adev, vm, mapping, f);
>>>>>>>             >>           if (r) {
>>>>>>>             >> dma_fence_put(f);
>>>>>>>             >> @@ -1980,7 +2021,6 @@ int
>>>>>>>             amdgpu_vm_clear_freed(struct amdgpu_device
>>>>>>>             >> *adev,
>>>>>>>             >>       }
>>>>>>>             >> return 0;
>>>>>>>             >> -
>>>>>>>             >>   }
>>>>>>>             >>     /**
>>>>>>>             >
>>>>>>>             > _______________________________________________
>>>>>>>             > amd-gfx mailing list
>>>>>>>             > amd-gfx@lists.freedesktop.org
>>>>>>>             <mailto:amd-gfx@lists.freedesktop.org>
>>>>>>>             > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> amd-gfx mailing list
>>>>> amd-gfx@lists.freedesktop.org
>>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>>
>>
>
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] drm/amdgpu: remove PT BOs when unmapping
@ 2019-10-31 14:08                             ` StDenis, Tom
  0 siblings, 0 replies; 32+ messages in thread
From: StDenis, Tom @ 2019-10-31 14:08 UTC (permalink / raw)
  To: amd-gfx; +Cc: Koenig, Christian

I could try it on my carrizo/polaris setup.  Is there a test procedure I 
could follow to trigger the changed code paths?


Tom

On 2019-10-31 6:41 a.m., Koenig, Christian wrote:
> Just tested this and amdgpu_vm_update_ptes() indeed works as expected.
>
> When you free at least a 2MB the lowest level of page tables is freed 
> up again.
>
> BTW: What hardware have you tested this on? On gfx8 and older it is 
> expected that page tables are never freed.
>
> Regards,
> Christian.
>
> Am 30.10.19 um 19:11 schrieb Christian König:
>> Then I don't see how this patch actually changes anything.
>>
>> Could only be a bug in amdgpu_vm_update_ptes(). Going to investigate 
>> this, but I won't have time to look into the ticket in detail.
>>
>> Regards,
>> Christian.
>>
>> Am 30.10.19 um 19:00 schrieb Huang, JinHuiEric:
>>>
>>> Actually I do prevent to remove in-use pts by this:
>>>
>>> +               r = amdgpu_vm_remove_ptes(adev, vm,
>>> +                               (mapping->start + 0x1ff) & (~0x1ffll),
>>> +                               (mapping->last + 1) & (~0x1ffll));
>>>
>>> Which is only removing aligned page table for 2M. And I have tested 
>>> it at least on KFD tests without anything broken.
>>>
>>> By the way, I am not familiar with memory staff. This patch is the 
>>> best I can do for now. Could you take a look at the Jira ticket 
>>> SWDEV-201443 ? and find the better solution. Thanks!
>>>
>>> Regards,
>>>
>>> Eric
>>>
>>> On 2019-10-30 1:57 p.m., Christian König wrote:
>>>> One thing I've forgotten:
>>>>
>>>> What you could maybe do to improve the situation is to join 
>>>> adjacent ranges in amdgpu_vm_clear_freed(), but I'm not sure how 
>>>> the chances are that the ranges are freed all together.
>>>>
>>>> The only other alternative I can see would be to check the mappings 
>>>> of a range in amdgpu_update_ptes() and see if you could walk the 
>>>> tree up if the valid flag is not set and there are no mappings left 
>>>> for a page table.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>> Am 30.10.19 um 18:42 schrieb Koenig, Christian:
>>>>>> The vaild flag doesn't take effect in this function.
>>>>> That's irrelevant.
>>>>>
>>>>> See what amdgpu_vm_update_ptes() does is to first determine the 
>>>>> fragment size:
>>>>>> amdgpu_vm_fragment(params, frag_start, end, flags, &frag, &frag_end);
>>>>>
>>>>> Then we walk down the tree:
>>>>>>         amdgpu_vm_pt_start(adev, params->vm, start, &cursor);
>>>>>>         while (cursor.pfn < end) {
>>>>>
>>>>> And make sure that the page tables covering the address range are 
>>>>> actually allocated:
>>>>>>                 r = amdgpu_vm_alloc_pts(params->adev, params->vm, 
>>>>>> &cursor);
>>>>>
>>>>> Then we update the tables with the flags and addresses and free up 
>>>>> subsequent tables in the case of huge pages or freed up areas:
>>>>>>                         /* Free all child entries */
>>>>>>                         while (cursor.pfn < frag_start) {
>>>>>> amdgpu_vm_free_pts(adev, params->vm, &cursor);
>>>>>> amdgpu_vm_pt_next(adev, &cursor);
>>>>>>                         }
>>>>>
>>>>> This is the maximum you can free, cause all other page tables are 
>>>>> not completely covered by the range and so potentially still in use.
>>>>>
>>>>> And I have the strong suspicion that this is what your patch is 
>>>>> actually doing wrong. In other words you are also freeing page 
>>>>> tables which are only partially covered by the range and so 
>>>>> potentially still in use.
>>>>>
>>>>> Since we don't have any tracking how many entries in a page table 
>>>>> are currently valid and how many are invalid we actually can't 
>>>>> implement what you are trying to do here. So the patch is 
>>>>> definitely somehow broken.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>> Am 30.10.19 um 17:55 schrieb Huang, JinHuiEric:
>>>>>>
>>>>>> The vaild flag doesn't take effect in this function. 
>>>>>> amdgpu_vm_alloc_pts() is always executed that only depended on 
>>>>>> "cursor.pfn < end". The valid flag has only been checked on here 
>>>>>> for asic below GMC v9:
>>>>>>
>>>>>> if (adev->asic_type < CHIP_VEGA10 &&
>>>>>>             (flags & AMDGPU_PTE_VALID))...
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Eric
>>>>>>
>>>>>> On 2019-10-30 12:30 p.m., Koenig, Christian wrote:
>>>>>>>
>>>>>>>
>>>>>>> Am 30.10.2019 17:19 schrieb "Huang, JinHuiEric" 
>>>>>>> <JinHuiEric.Huang@amd.com>:
>>>>>>>
>>>>>>>     I tested it that it saves a lot of vram on KFD big buffer
>>>>>>>     stress test. I think there are two reasons:
>>>>>>>
>>>>>>>     1. Calling amdgpu_vm_update_ptes() during unmapping will
>>>>>>>     allocate unnecessary pts, because there is no flag to
>>>>>>>     determine if the VA is mapping or unmapping in function
>>>>>>>     amdgpu_vm_update_ptes(). It saves the most of memory.
>>>>>>>
>>>>>>> That's not correct. The valid flag is used for this.
>>>>>>>
>>>>>>>     2. Intentionally removing those unmapping pts is logical
>>>>>>>     expectation, although it is not removing so much pts.
>>>>>>>
>>>>>>> Well I actually don't see a change to what update_ptes is doing 
>>>>>>> and have the strong suspicion that the patch is simply broken.
>>>>>>>
>>>>>>> You either free page tables which are potentially still in use 
>>>>>>> or update_pte doesn't free page tables when the valid but is not 
>>>>>>> set.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Christian.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>     Regards,
>>>>>>>
>>>>>>>     Eric
>>>>>>>
>>>>>>>     On 2019-10-30 11:57 a.m., Koenig, Christian wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>         Am 30.10.2019 16:47 schrieb "Kuehling, Felix"
>>>>>>>         <Felix.Kuehling@amd.com> <mailto:Felix.Kuehling@amd.com>:
>>>>>>>
>>>>>>>             On 2019-10-30 9:52 a.m., Christian König wrote:
>>>>>>>             > Am 29.10.19 um 21:06 schrieb Huang, JinHuiEric:
>>>>>>>             >> The issue is PT BOs are not freed when unmapping VA,
>>>>>>>             >> which causes vram usage accumulated is huge in some
>>>>>>>             >> memory stress test, such as kfd big buffer stress
>>>>>>>             test.
>>>>>>>             >> Function amdgpu_vm_bo_update_mapping() is called
>>>>>>>             by both
>>>>>>>             >> amdgpu_vm_bo_update() and
>>>>>>>             amdgpu_vm_clear_freed(). The
>>>>>>>             >> solution is replacing
>>>>>>>             amdgpu_vm_bo_update_mapping() in
>>>>>>>             >> amdgpu_vm_clear_freed() with removing PT BOs function
>>>>>>>             >> to save vram usage.
>>>>>>>             >
>>>>>>>             > NAK, that is intentional behavior.
>>>>>>>             >
>>>>>>>             > Otherwise we can run into out of memory situations
>>>>>>>             when page tables
>>>>>>>             > need to be allocated again under stress.
>>>>>>>
>>>>>>>             That's a bit arbitrary and inconsistent. We are
>>>>>>>             freeing page tables in
>>>>>>>             other situations, when a mapping uses huge pages in
>>>>>>>             amdgpu_vm_update_ptes. Why not when a mapping is
>>>>>>>             destroyed completely?
>>>>>>>
>>>>>>>             I'm actually a bit surprised that the huge-page
>>>>>>>             handling in
>>>>>>>             amdgpu_vm_update_ptes isn't kicking in to free up
>>>>>>>             lower-level page
>>>>>>>             tables when a BO is unmapped.
>>>>>>>
>>>>>>>
>>>>>>>         Well it does free the lower level, and that is already
>>>>>>>         causing problems (that's why I added the reserved space).
>>>>>>>
>>>>>>>         What we don't do is freeing the higher levels.
>>>>>>>
>>>>>>>         E.g. when you free a 2MB BO we free the lowest level, if
>>>>>>>         we free a 1GB BO we free the two lowest levels etc...
>>>>>>>
>>>>>>>         The problem with freeing the higher levels is that you
>>>>>>>         don't know who is also using this. E.g. we would need to
>>>>>>>         check all entries when we unmap one.
>>>>>>>
>>>>>>>         It's simply not worth it for a maximum saving of 2MB per VM.
>>>>>>>
>>>>>>>         Writing this I'm actually wondering how you ended up in
>>>>>>>         this issue? There shouldn't be much savings from this.
>>>>>>>
>>>>>>>         Regards,
>>>>>>>         Christian.
>>>>>>>
>>>>>>>
>>>>>>>             Regards,
>>>>>>>                Felix
>>>>>>>
>>>>>>>
>>>>>>>             >
>>>>>>>             > Regards,
>>>>>>>             > Christian.
>>>>>>>             >
>>>>>>>             >>
>>>>>>>             >> Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
>>>>>>>             >> Signed-off-by: Eric Huang
>>>>>>>             <JinhuiEric.Huang@amd.com>
>>>>>>>             <mailto:JinhuiEric.Huang@amd.com>
>>>>>>>             >> ---
>>>>>>>             >> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56
>>>>>>>             >> +++++++++++++++++++++++++++++-----
>>>>>>>             >>   1 file changed, 48 insertions(+), 8 deletions(-)
>>>>>>>             >>
>>>>>>>             >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>             >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>             >> index 0f4c3b2..8a480c7 100644
>>>>>>>             >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>             >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>             >> @@ -1930,6 +1930,51 @@ static void
>>>>>>>             amdgpu_vm_prt_fini(struct
>>>>>>>             >> amdgpu_device *adev, struct amdgpu_vm *vm)
>>>>>>>             >>   }
>>>>>>>             >>     /**
>>>>>>>             >> + * amdgpu_vm_remove_ptes - free PT BOs
>>>>>>>             >> + *
>>>>>>>             >> + * @adev: amdgpu device structure
>>>>>>>             >> + * @vm: amdgpu vm structure
>>>>>>>             >> + * @start: start of mapped range
>>>>>>>             >> + * @end: end of mapped entry
>>>>>>>             >> + *
>>>>>>>             >> + * Free the page table level.
>>>>>>>             >> + */
>>>>>>>             >> +static int amdgpu_vm_remove_ptes(struct
>>>>>>>             amdgpu_device *adev,
>>>>>>>             >> + struct amdgpu_vm *vm, uint64_t start, uint64_t end)
>>>>>>>             >> +{
>>>>>>>             >> +    struct amdgpu_vm_pt_cursor cursor;
>>>>>>>             >> +    unsigned shift, num_entries;
>>>>>>>             >> +
>>>>>>>             >> + amdgpu_vm_pt_start(adev, vm, start, &cursor);
>>>>>>>             >> +    while (cursor.level < AMDGPU_VM_PTB) {
>>>>>>>             >> +        if (!amdgpu_vm_pt_descendant(adev, &cursor))
>>>>>>>             >> + return -ENOENT;
>>>>>>>             >> +    }
>>>>>>>             >> +
>>>>>>>             >> +    while (cursor.pfn < end) {
>>>>>>>             >> + amdgpu_vm_free_table(cursor.entry);
>>>>>>>             >> + num_entries = amdgpu_vm_num_entries(adev,
>>>>>>>             cursor.level - 1);
>>>>>>>             >> +
>>>>>>>             >> +        if (cursor.entry !=
>>>>>>>             &cursor.parent->entries[num_entries - 1]) {
>>>>>>>             >> + /* Next ptb entry */
>>>>>>>             >> + shift = amdgpu_vm_level_shift(adev,
>>>>>>>             cursor.level - 1);
>>>>>>>             >> + cursor.pfn += 1ULL << shift;
>>>>>>>             >> + cursor.pfn &= ~((1ULL << shift) - 1);
>>>>>>>             >> + cursor.entry++;
>>>>>>>             >> +        } else {
>>>>>>>             >> + /* Next ptb entry in next pd0 entry */
>>>>>>>             >> + amdgpu_vm_pt_ancestor(&cursor);
>>>>>>>             >> + shift = amdgpu_vm_level_shift(adev,
>>>>>>>             cursor.level - 1);
>>>>>>>             >> + cursor.pfn += 1ULL << shift;
>>>>>>>             >> + cursor.pfn &= ~((1ULL << shift) - 1);
>>>>>>>             >> + amdgpu_vm_pt_descendant(adev, &cursor);
>>>>>>>             >> +        }
>>>>>>>             >> +    }
>>>>>>>             >> +
>>>>>>>             >> +    return 0;
>>>>>>>             >> +}
>>>>>>>             >> +
>>>>>>>             >> +/**
>>>>>>>             >>    * amdgpu_vm_clear_freed - clear freed BOs in
>>>>>>>             the PT
>>>>>>>             >>    *
>>>>>>>             >>    * @adev: amdgpu_device pointer
>>>>>>>             >> @@ -1949,7 +1994,6 @@ int
>>>>>>>             amdgpu_vm_clear_freed(struct amdgpu_device
>>>>>>>             >> *adev,
>>>>>>>             >>                 struct dma_fence **fence)
>>>>>>>             >>   {
>>>>>>>             >>       struct amdgpu_bo_va_mapping *mapping;
>>>>>>>             >> -    uint64_t init_pte_value = 0;
>>>>>>>             >>       struct dma_fence *f = NULL;
>>>>>>>             >>       int r;
>>>>>>>             >>   @@ -1958,13 +2002,10 @@ int
>>>>>>>             amdgpu_vm_clear_freed(struct
>>>>>>>             >> amdgpu_device *adev,
>>>>>>>             >> struct amdgpu_bo_va_mapping, list);
>>>>>>>             >> list_del(&mapping->list);
>>>>>>>             >>   -        if (vm->pte_support_ats &&
>>>>>>>             >> - mapping->start < AMDGPU_GMC_HOLE_START)
>>>>>>>             >> - init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
>>>>>>>             >> +        r = amdgpu_vm_remove_ptes(adev, vm,
>>>>>>>             >> + (mapping->start + 0x1ff) & (~0x1ffll),
>>>>>>>             >> + (mapping->last + 1) & (~0x1ffll));
>>>>>>>             >>   -        r = amdgpu_vm_bo_update_mapping(adev,
>>>>>>>             vm, false, NULL,
>>>>>>>             >> - mapping->start, mapping->last,
>>>>>>>             >> - init_pte_value, 0, NULL, &f);
>>>>>>>             >> amdgpu_vm_free_mapping(adev, vm, mapping, f);
>>>>>>>             >>           if (r) {
>>>>>>>             >> dma_fence_put(f);
>>>>>>>             >> @@ -1980,7 +2021,6 @@ int
>>>>>>>             amdgpu_vm_clear_freed(struct amdgpu_device
>>>>>>>             >> *adev,
>>>>>>>             >>       }
>>>>>>>             >> return 0;
>>>>>>>             >> -
>>>>>>>             >>   }
>>>>>>>             >>     /**
>>>>>>>             >
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>>
>>
>
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
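
The "2MB BO frees the lowest level, 1GB BO frees the two lowest levels" rule quoted above can be checked with a little arithmetic. The following is only a rough sketch, not driver code: it assumes 4 KiB GPU pages and 512 entries per table (so one PTB spans 2 MiB and one PD0 table spans 1 GiB), and the helper names table_span() and levels_fully_covered() are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define GPU_PAGE_SHIFT 12 /* assumption: 4 KiB GPU pages */
#define LEVEL_BITS      9 /* assumption: 512 entries per table */
#define MAX_LEVELS      4 /* assumption: PTB, PD0, PD1, PD2 */

/* Bytes of address space covered by one table, counted from the
 * bottom of the tree (0 = PTB, 1 = PD0, ...). */
static uint64_t table_span(unsigned depth)
{
    return 1ULL << (GPU_PAGE_SHIFT + LEVEL_BITS * (depth + 1));
}

/* Number of bottom-up levels for which at least one table lies
 * entirely inside the byte range [start, end). Only such tables can
 * be freed without looking at what else is mapped. */
static unsigned levels_fully_covered(uint64_t start, uint64_t end)
{
    unsigned depth;

    assert(start <= end);
    for (depth = 0; depth < MAX_LEVELS; depth++) {
        uint64_t span = table_span(depth);
        uint64_t first = (start + span - 1) & ~(span - 1); /* round up */
        uint64_t last = end & ~(span - 1);                 /* round down */

        if (first >= last)
            break; /* no table of this size fits inside the range */
    }
    return depth;
}
```

Under these assumptions an aligned 2 MiB range covers one level, an aligned 1 GiB range covers two, and a range that starts one page past a 2 MiB boundary and ends at the next boundary covers none, since its only PTB is shared with the page before it.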

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] drm/amdgpu: remove PT BOs when unmapping
@ 2019-10-31 14:33                                 ` Huang, JinHuiEric
  0 siblings, 0 replies; 32+ messages in thread
From: Huang, JinHuiEric @ 2019-10-31 14:33 UTC (permalink / raw)
  To: Elder, Christina; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

The hardware is Vega10 and the test is KFDMemoryTest.BigBufferStressTest.
More details are in Jira ticket SWDEV-201443.

Regards,

Eric

On 2019-10-31 10:08 a.m., StDenis, Tom wrote:
> I could try it on my carrizo/polaris setup.  Is there a test procedure I
> could follow to trigger the changed code paths?
>
>
> Tom
>
> On 2019-10-31 6:41 a.m., Koenig, Christian wrote:
>> Just tested this and amdgpu_vm_update_ptes() indeed works as expected.
>>
>> When you free at least 2MB, the lowest level of page tables is freed
>> up again.
>>
>> BTW: What hardware have you tested this on? On gfx8 and older it is
>> expected that page tables are never freed.
>>
>> Regards,
>> Christian.
>>
>> Am 30.10.19 um 19:11 schrieb Christian König:
>>> Then I don't see how this patch actually changes anything.
>>>
>>> Could only be a bug in amdgpu_vm_update_ptes(). Going to investigate
>>> this, but I won't have time to look into the ticket in detail.
>>>
>>> Regards,
>>> Christian.
>>>
>>> Am 30.10.19 um 19:00 schrieb Huang, JinHuiEric:
>>>> Actually, I do prevent in-use PTs from being removed with this:
>>>>
>>>> +               r = amdgpu_vm_remove_ptes(adev, vm,
>>>> +                               (mapping->start + 0x1ff) & (~0x1ffll),
>>>> +                               (mapping->last + 1) & (~0x1ffll));
>>>>
>>>> This only removes page tables that are 2MB aligned. And I have tested
>>>> it at least on the KFD tests without anything breaking.
>>>>
>>>> By the way, I am not familiar with the memory stuff. This patch is the
>>>> best I can do for now. Could you take a look at the Jira ticket
>>>> SWDEV-201443 and find a better solution? Thanks!
>>>>
>>>> Regards,
>>>>
>>>> Eric
>>>>
>>>> On 2019-10-30 1:57 p.m., Christian König wrote:
>>>>> One thing I've forgotten:
>>>>>
>>>>> What you could maybe do to improve the situation is to join
>>>>> adjacent ranges in amdgpu_vm_clear_freed(), but I'm not sure what
>>>>> the chances are that the ranges are freed all together.
>>>>>
>>>>> The only other alternative I can see would be to check the mappings
>>>>> of a range in amdgpu_update_ptes() and see if you could walk the
>>>>> tree up if the valid flag is not set and there are no mappings left
>>>>> for a page table.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>> Am 30.10.19 um 18:42 schrieb Koenig, Christian:
>>>>>>> The valid flag doesn't take effect in this function.
>>>>>> That's irrelevant.
>>>>>>
>>>>>> See what amdgpu_vm_update_ptes() does is to first determine the
>>>>>> fragment size:
>>>>>>> amdgpu_vm_fragment(params, frag_start, end, flags, &frag, &frag_end);
>>>>>> Then we walk down the tree:
>>>>>>>          amdgpu_vm_pt_start(adev, params->vm, start, &cursor);
>>>>>>>          while (cursor.pfn < end) {
>>>>>> And make sure that the page tables covering the address range are
>>>>>> actually allocated:
>>>>>>>                  r = amdgpu_vm_alloc_pts(params->adev, params->vm,
>>>>>>> &cursor);
>>>>>> Then we update the tables with the flags and addresses and free up
>>>>>> subsequent tables in the case of huge pages or freed up areas:
>>>>>>>                          /* Free all child entries */
>>>>>>>                          while (cursor.pfn < frag_start) {
>>>>>>> amdgpu_vm_free_pts(adev, params->vm, &cursor);
>>>>>>> amdgpu_vm_pt_next(adev, &cursor);
>>>>>>>                          }
>>>>>> This is the maximum you can free, because all other page tables are
>>>>>> not completely covered by the range and so potentially still in use.
>>>>>>
>>>>>> And I have the strong suspicion that this is what your patch is
>>>>>> actually doing wrong. In other words you are also freeing page
>>>>>> tables which are only partially covered by the range and so
>>>>>> potentially still in use.
>>>>>>
>>>>>> Since we don't have any tracking of how many entries in a page table
>>>>>> are currently valid and how many are invalid we actually can't
>>>>>> implement what you are trying to do here. So the patch is
>>>>>> definitely somehow broken.
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>> Am 30.10.19 um 17:55 schrieb Huang, JinHuiEric:
>>>>>>> The valid flag doesn't take effect in this function.
>>>>>>> amdgpu_vm_alloc_pts() is always executed; it depends only on
>>>>>>> "cursor.pfn < end". The valid flag is only checked here,
>>>>>>> for ASICs below GMC v9:
>>>>>>>
>>>>>>> if (adev->asic_type < CHIP_VEGA10 &&
>>>>>>>              (flags & AMDGPU_PTE_VALID))...
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Eric
>>>>>>>
>>>>>>> On 2019-10-30 12:30 p.m., Koenig, Christian wrote:
>>>>>>>>
>>>>>>>> Am 30.10.2019 17:19 schrieb "Huang, JinHuiEric"
>>>>>>>> <JinHuiEric.Huang@amd.com>:
>>>>>>>>
>>>>>>>>      I tested it that it saves a lot of vram on KFD big buffer
>>>>>>>>      stress test. I think there are two reasons:
>>>>>>>>
>>>>>>>>      1. Calling amdgpu_vm_update_ptes() during unmapping will
>>>>>>>>      allocate unnecessary pts, because there is no flag to
>>>>>>>>      determine if the VA is mapping or unmapping in function
>>>>>>>>      amdgpu_vm_update_ptes(). It saves the most of memory.
>>>>>>>>
>>>>>>>> That's not correct. The valid flag is used for this.
>>>>>>>>
>>>>>>>>      2. Intentionally removing those unmapping pts is logical
>>>>>>>>      expectation, although it is not removing so much pts.
>>>>>>>>
>>>>>>>> Well I actually don't see a change to what update_ptes is doing
>>>>>>>> and have the strong suspicion that the patch is simply broken.
>>>>>>>>
>>>>>>>> You either free page tables which are potentially still in use
>>>>>>>> or update_pte doesn't free page tables when the valid bit is not
>>>>>>>> set.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Christian.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>      Regards,
>>>>>>>>
>>>>>>>>      Eric
>>>>>>>>
>>>>>>>>      On 2019-10-30 11:57 a.m., Koenig, Christian wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>          Am 30.10.2019 16:47 schrieb "Kuehling, Felix"
>>>>>>>>          <Felix.Kuehling@amd.com> <mailto:Felix.Kuehling@amd.com>:
>>>>>>>>
>>>>>>>>              On 2019-10-30 9:52 a.m., Christian König wrote:
>>>>>>>>              > Am 29.10.19 um 21:06 schrieb Huang, JinHuiEric:
>>>>>>>>              >> The issue is PT BOs are not freed when unmapping VA,
>>>>>>>>              >> which causes vram usage accumulated is huge in some
>>>>>>>>              >> memory stress test, such as kfd big buffer stress
>>>>>>>>              test.
>>>>>>>>              >> Function amdgpu_vm_bo_update_mapping() is called
>>>>>>>>              by both
>>>>>>>>              >> amdgpu_vm_bo_update() and
>>>>>>>>              amdgpu_vm_clear_freed(). The
>>>>>>>>              >> solution is replacing
>>>>>>>>              amdgpu_vm_bo_update_mapping() in
>>>>>>>>              >> amdgpu_vm_clear_freed() with removing PT BOs function
>>>>>>>>              >> to save vram usage.
>>>>>>>>              >
>>>>>>>>              > NAK, that is intentional behavior.
>>>>>>>>              >
>>>>>>>>              > Otherwise we can run into out of memory situations
>>>>>>>>              when page tables
>>>>>>>>              > need to be allocated again under stress.
>>>>>>>>
>>>>>>>>              That's a bit arbitrary and inconsistent. We are
>>>>>>>>              freeing page tables in
>>>>>>>>              other situations, when a mapping uses huge pages in
>>>>>>>>              amdgpu_vm_update_ptes. Why not when a mapping is
>>>>>>>>              destroyed completely?
>>>>>>>>
>>>>>>>>              I'm actually a bit surprised that the huge-page
>>>>>>>>              handling in
>>>>>>>>              amdgpu_vm_update_ptes isn't kicking in to free up
>>>>>>>>              lower-level page
>>>>>>>>              tables when a BO is unmapped.
>>>>>>>>
>>>>>>>>
>>>>>>>>          Well it does free the lower level, and that is already
>>>>>>>>          causing problems (that's why I added the reserved space).
>>>>>>>>
>>>>>>>>          What we don't do is freeing the higher levels.
>>>>>>>>
>>>>>>>>          E.g. when you free a 2MB BO we free the lowest level, if
>>>>>>>>          we free a 1GB BO we free the two lowest levels etc...
>>>>>>>>
>>>>>>>>          The problem with freeing the higher levels is that you
>>>>>>>>          don't know who is also using this. E.g. we would need to
>>>>>>>>          check all entries when we unmap one.
>>>>>>>>
>>>>>>>>          It's simply not worth it for a maximum saving of 2MB per VM.
>>>>>>>>
>>>>>>>>          Writing this I'm actually wondering how you ended up in
>>>>>>>>          this issue? There shouldn't be much savings from this.
>>>>>>>>
>>>>>>>>          Regards,
>>>>>>>>          Christian.
>>>>>>>>
>>>>>>>>
>>>>>>>>              Regards,
>>>>>>>>                 Felix
>>>>>>>>
>>>>>>>>
>>>>>>>>              >
>>>>>>>>              > Regards,
>>>>>>>>              > Christian.
>>>>>>>>              >
>>>>>>>>              >>
>>>>>>>>              >> Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
>>>>>>>>              >> Signed-off-by: Eric Huang
>>>>>>>>              <JinhuiEric.Huang@amd.com>
>>>>>>>>              <mailto:JinhuiEric.Huang@amd.com>
>>>>>>>>              >> ---
>>>>>>>>              >> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56
>>>>>>>>              >> +++++++++++++++++++++++++++++-----
>>>>>>>>              >>   1 file changed, 48 insertions(+), 8 deletions(-)
>>>>>>>>              >>
>>>>>>>>              >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>              >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>              >> index 0f4c3b2..8a480c7 100644
>>>>>>>>              >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>              >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>              >> @@ -1930,6 +1930,51 @@ static void
>>>>>>>>              amdgpu_vm_prt_fini(struct
>>>>>>>>              >> amdgpu_device *adev, struct amdgpu_vm *vm)
>>>>>>>>              >>   }
>>>>>>>>              >>     /**
>>>>>>>>              >> + * amdgpu_vm_remove_ptes - free PT BOs
>>>>>>>>              >> + *
>>>>>>>>              >> + * @adev: amdgpu device structure
>>>>>>>>              >> + * @vm: amdgpu vm structure
>>>>>>>>              >> + * @start: start of mapped range
>>>>>>>>              >> + * @end: end of mapped entry
>>>>>>>>              >> + *
>>>>>>>>              >> + * Free the page table level.
>>>>>>>>              >> + */
>>>>>>>>              >> +static int amdgpu_vm_remove_ptes(struct
>>>>>>>>              amdgpu_device *adev,
>>>>>>>>              >> + struct amdgpu_vm *vm, uint64_t start, uint64_t end)
>>>>>>>>              >> +{
>>>>>>>>              >> +    struct amdgpu_vm_pt_cursor cursor;
>>>>>>>>              >> +    unsigned shift, num_entries;
>>>>>>>>              >> +
>>>>>>>>              >> + amdgpu_vm_pt_start(adev, vm, start, &cursor);
>>>>>>>>              >> +    while (cursor.level < AMDGPU_VM_PTB) {
>>>>>>>>              >> +        if (!amdgpu_vm_pt_descendant(adev, &cursor))
>>>>>>>>              >> + return -ENOENT;
>>>>>>>>              >> +    }
>>>>>>>>              >> +
>>>>>>>>              >> +    while (cursor.pfn < end) {
>>>>>>>>              >> + amdgpu_vm_free_table(cursor.entry);
>>>>>>>>              >> + num_entries = amdgpu_vm_num_entries(adev,
>>>>>>>>              cursor.level - 1);
>>>>>>>>              >> +
>>>>>>>>              >> +        if (cursor.entry !=
>>>>>>>>              &cursor.parent->entries[num_entries - 1]) {
>>>>>>>>              >> + /* Next ptb entry */
>>>>>>>>              >> + shift = amdgpu_vm_level_shift(adev,
>>>>>>>>              cursor.level - 1);
>>>>>>>>              >> + cursor.pfn += 1ULL << shift;
>>>>>>>>              >> + cursor.pfn &= ~((1ULL << shift) - 1);
>>>>>>>>              >> + cursor.entry++;
>>>>>>>>              >> +        } else {
>>>>>>>>              >> + /* Next ptb entry in next pd0 entry */
>>>>>>>>              >> + amdgpu_vm_pt_ancestor(&cursor);
>>>>>>>>              >> + shift = amdgpu_vm_level_shift(adev,
>>>>>>>>              cursor.level - 1);
>>>>>>>>              >> + cursor.pfn += 1ULL << shift;
>>>>>>>>              >> + cursor.pfn &= ~((1ULL << shift) - 1);
>>>>>>>>              >> + amdgpu_vm_pt_descendant(adev, &cursor);
>>>>>>>>              >> +        }
>>>>>>>>              >> +    }
>>>>>>>>              >> +
>>>>>>>>              >> +    return 0;
>>>>>>>>              >> +}
>>>>>>>>              >> +
>>>>>>>>              >> +/**
>>>>>>>>              >>    * amdgpu_vm_clear_freed - clear freed BOs in
>>>>>>>>              the PT
>>>>>>>>              >>    *
>>>>>>>>              >>    * @adev: amdgpu_device pointer
>>>>>>>>              >> @@ -1949,7 +1994,6 @@ int
>>>>>>>>              amdgpu_vm_clear_freed(struct amdgpu_device
>>>>>>>>              >> *adev,
>>>>>>>>              >>                 struct dma_fence **fence)
>>>>>>>>              >>   {
>>>>>>>>              >>       struct amdgpu_bo_va_mapping *mapping;
>>>>>>>>              >> -    uint64_t init_pte_value = 0;
>>>>>>>>              >>       struct dma_fence *f = NULL;
>>>>>>>>              >>       int r;
>>>>>>>>              >>   @@ -1958,13 +2002,10 @@ int
>>>>>>>>              amdgpu_vm_clear_freed(struct
>>>>>>>>              >> amdgpu_device *adev,
>>>>>>>>              >> struct amdgpu_bo_va_mapping, list);
>>>>>>>>              >> list_del(&mapping->list);
>>>>>>>>              >>   -        if (vm->pte_support_ats &&
>>>>>>>>              >> - mapping->start < AMDGPU_GMC_HOLE_START)
>>>>>>>>              >> - init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
>>>>>>>>              >> +        r = amdgpu_vm_remove_ptes(adev, vm,
>>>>>>>>              >> + (mapping->start + 0x1ff) & (~0x1ffll),
>>>>>>>>              >> + (mapping->last + 1) & (~0x1ffll));
>>>>>>>>              >>   -        r = amdgpu_vm_bo_update_mapping(adev,
>>>>>>>>              vm, false, NULL,
>>>>>>>>              >> - mapping->start, mapping->last,
>>>>>>>>              >> - init_pte_value, 0, NULL, &f);
>>>>>>>>              >> amdgpu_vm_free_mapping(adev, vm, mapping, f);
>>>>>>>>              >>           if (r) {
>>>>>>>>              >> dma_fence_put(f);
>>>>>>>>              >> @@ -1980,7 +2021,6 @@ int
>>>>>>>>              amdgpu_vm_clear_freed(struct amdgpu_device
>>>>>>>>              >> *adev,
>>>>>>>>              >>       }
>>>>>>>>              >> return 0;
>>>>>>>>              >> -
>>>>>>>>              >>   }
>>>>>>>>              >>     /**
>>>>>>>>              >
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
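
The rounding used in the quoted amdgpu_vm_remove_ptes() call can be tried in isolation. This is a minimal sketch of the arithmetic only, not of the patch itself: it assumes the mapping is expressed in GPU page numbers with an inclusive last page, as mapping->start and mapping->last are in the quoted hunk, and struct pfn_range, ptb_aligned_range() and ptb_count() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define PTB_MASK 0x1ffULL /* 512 pages per PTB, the mask used in the quoted patch */

/* Range of GPU page numbers; end is exclusive. */
struct pfn_range {
    uint64_t start;
    uint64_t end;
};

/* Shrink [first_pfn, last_pfn] (last inclusive) to the 512-page blocks
 * that lie entirely inside it: round the start up, round the end down. */
static struct pfn_range ptb_aligned_range(uint64_t first_pfn, uint64_t last_pfn)
{
    struct pfn_range r;

    assert(first_pfn <= last_pfn);
    r.start = (first_pfn + PTB_MASK) & ~PTB_MASK;
    r.end = (last_pfn + 1) & ~PTB_MASK;
    return r;
}

/* Number of whole 512-page blocks in the range; 0 if it is empty. */
static uint64_t ptb_count(struct pfn_range r)
{
    return r.start < r.end ? (r.end - r.start) >> 9 : 0;
}
```

A mapping of pages 0x100 to 0x5ff shrinks to [0x200, 0x600), which is two whole blocks, while a mapping of pages 0x10 to 0x1f0 shrinks to an empty range because it never covers a full block. The sketch says nothing about where the cursor walk in the patch actually starts, which is the part questioned in the thread.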

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] drm/amdgpu: remove PT BOs when unmapping
@ 2019-10-31 14:33                                 ` Huang, JinHuiEric
  0 siblings, 0 replies; 32+ messages in thread
From: Huang, JinHuiEric @ 2019-10-31 14:33 UTC (permalink / raw)
  To: Elder, Christina; +Cc: amd-gfx

The hardware is Vega10 and the test is KFDMemoryTest.BigBufferStressTest.
More details are in Jira ticket SWDEV-201443.

Regards,

Eric

On 2019-10-31 10:08 a.m., StDenis, Tom wrote:
> I could try it on my carrizo/polaris setup.  Is there a test procedure I
> could follow to trigger the changed code paths?
>
>
> Tom
>
> On 2019-10-31 6:41 a.m., Koenig, Christian wrote:
>> Just tested this and amdgpu_vm_update_ptes() indeed works as expected.
>>
>> When you free at least 2MB, the lowest level of page tables is freed
>> up again.
>>
>> BTW: What hardware have you tested this on? On gfx8 and older it is
>> expected that page tables are never freed.
>>
>> Regards,
>> Christian.
>>
>> Am 30.10.19 um 19:11 schrieb Christian König:
>>> Then I don't see how this patch actually changes anything.
>>>
>>> Could only be a bug in amdgpu_vm_update_ptes(). Going to investigate
>>> this, but I won't have time to look into the ticket in detail.
>>>
>>> Regards,
>>> Christian.
>>>
>>> Am 30.10.19 um 19:00 schrieb Huang, JinHuiEric:
>>>> Actually, I do prevent in-use PTs from being removed with this:
>>>>
>>>> +               r = amdgpu_vm_remove_ptes(adev, vm,
>>>> +                               (mapping->start + 0x1ff) & (~0x1ffll),
>>>> +                               (mapping->last + 1) & (~0x1ffll));
>>>>
>>>> This only removes page tables that are 2MB aligned. And I have tested
>>>> it at least on the KFD tests without anything breaking.
>>>>
>>>> By the way, I am not familiar with the memory stuff. This patch is the
>>>> best I can do for now. Could you take a look at the Jira ticket
>>>> SWDEV-201443 and find a better solution? Thanks!
>>>>
>>>> Regards,
>>>>
>>>> Eric
>>>>
>>>> On 2019-10-30 1:57 p.m., Christian König wrote:
>>>>> One thing I've forgotten:
>>>>>
>>>>> What you could maybe do to improve the situation is to join
>>>>> adjacent ranges in amdgpu_vm_clear_freed(), but I'm not sure what
>>>>> the chances are that the ranges are freed all together.
>>>>>
>>>>> The only other alternative I can see would be to check the mappings
>>>>> of a range in amdgpu_update_ptes() and see if you could walk the
>>>>> tree up if the valid flag is not set and there are no mappings left
>>>>> for a page table.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>> Am 30.10.19 um 18:42 schrieb Koenig, Christian:
>>>>>>> The valid flag doesn't take effect in this function.
>>>>>> That's irrelevant.
>>>>>>
>>>>>> See what amdgpu_vm_update_ptes() does is to first determine the
>>>>>> fragment size:
>>>>>>> amdgpu_vm_fragment(params, frag_start, end, flags, &frag, &frag_end);
>>>>>> Then we walk down the tree:
>>>>>>>          amdgpu_vm_pt_start(adev, params->vm, start, &cursor);
>>>>>>>          while (cursor.pfn < end) {
>>>>>> And make sure that the page tables covering the address range are
>>>>>> actually allocated:
>>>>>>>                  r = amdgpu_vm_alloc_pts(params->adev, params->vm,
>>>>>>> &cursor);
>>>>>> Then we update the tables with the flags and addresses and free up
>>>>>> subsequent tables in the case of huge pages or freed up areas:
>>>>>>>                          /* Free all child entries */
>>>>>>>                          while (cursor.pfn < frag_start) {
>>>>>>> amdgpu_vm_free_pts(adev, params->vm, &cursor);
>>>>>>> amdgpu_vm_pt_next(adev, &cursor);
>>>>>>>                          }
>>>>>> This is the maximum you can free, because all other page tables are
>>>>>> not completely covered by the range and so potentially still in use.
>>>>>>
>>>>>> And I have the strong suspicion that this is what your patch is
>>>>>> actually doing wrong. In other words you are also freeing page
>>>>>> tables which are only partially covered by the range and so
>>>>>> potentially still in use.
>>>>>>
>>>>>> Since we don't have any tracking of how many entries in a page table
>>>>>> are currently valid and how many are invalid we actually can't
>>>>>> implement what you are trying to do here. So the patch is
>>>>>> definitely somehow broken.
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>> Am 30.10.19 um 17:55 schrieb Huang, JinHuiEric:
>>>>>>> The valid flag doesn't take effect in this function.
>>>>>>> amdgpu_vm_alloc_pts() is always executed; it depends only on
>>>>>>> "cursor.pfn < end". The valid flag is only checked here,
>>>>>>> for ASICs below GMC v9:
>>>>>>>
>>>>>>> if (adev->asic_type < CHIP_VEGA10 &&
>>>>>>>              (flags & AMDGPU_PTE_VALID))...
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Eric
>>>>>>>
>>>>>>> On 2019-10-30 12:30 p.m., Koenig, Christian wrote:
>>>>>>>>
>>>>>>>> Am 30.10.2019 17:19 schrieb "Huang, JinHuiEric"
>>>>>>>> <JinHuiEric.Huang@amd.com>:
>>>>>>>>
>>>>>>>>      I tested it that it saves a lot of vram on KFD big buffer
>>>>>>>>      stress test. I think there are two reasons:
>>>>>>>>
>>>>>>>>      1. Calling amdgpu_vm_update_ptes() during unmapping will
>>>>>>>>      allocate unnecessary pts, because there is no flag to
>>>>>>>>      determine if the VA is mapping or unmapping in function
>>>>>>>>      amdgpu_vm_update_ptes(). It saves the most of memory.
>>>>>>>>
>>>>>>>> That's not correct. The valid flag is used for this.
>>>>>>>>
>>>>>>>>      2. Intentionally removing those unmapping pts is logical
>>>>>>>>      expectation, although it is not removing so much pts.
>>>>>>>>
>>>>>>>> Well I actually don't see a change to what update_ptes is doing
>>>>>>>> and have the strong suspicion that the patch is simply broken.
>>>>>>>>
>>>>>>>> You either free page tables which are potentially still in use
>>>>>>>> or update_pte doesn't free page tables when the valid bit is not
>>>>>>>> set.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Christian.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>      Regards,
>>>>>>>>
>>>>>>>>      Eric
>>>>>>>>
>>>>>>>>      On 2019-10-30 11:57 a.m., Koenig, Christian wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>          Am 30.10.2019 16:47 schrieb "Kuehling, Felix"
>>>>>>>>          <Felix.Kuehling@amd.com> <mailto:Felix.Kuehling@amd.com>:
>>>>>>>>
>>>>>>>>              On 2019-10-30 9:52 a.m., Christian König wrote:
>>>>>>>>              > Am 29.10.19 um 21:06 schrieb Huang, JinHuiEric:
>>>>>>>>              >> The issue is PT BOs are not freed when unmapping VA,
>>>>>>>>              >> which causes vram usage accumulated is huge in some
>>>>>>>>              >> memory stress test, such as kfd big buffer stress
>>>>>>>>              test.
>>>>>>>>              >> Function amdgpu_vm_bo_update_mapping() is called
>>>>>>>>              by both
>>>>>>>>              >> amdgpu_vm_bo_update() and
>>>>>>>>              amdgpu_vm_clear_freed(). The
>>>>>>>>              >> solution is replacing
>>>>>>>>              amdgpu_vm_bo_update_mapping() in
>>>>>>>>              >> amdgpu_vm_clear_freed() with removing PT BOs function
>>>>>>>>              >> to save vram usage.
>>>>>>>>              >
>>>>>>>>              > NAK, that is intentional behavior.
>>>>>>>>              >
>>>>>>>>              > Otherwise we can run into out of memory situations
>>>>>>>>              when page tables
>>>>>>>>              > need to be allocated again under stress.
>>>>>>>>
>>>>>>>>              That's a bit arbitrary and inconsistent. We are
>>>>>>>>              freeing page tables in
>>>>>>>>              other situations, when a mapping uses huge pages in
>>>>>>>>              amdgpu_vm_update_ptes. Why not when a mapping is
>>>>>>>>              destroyed completely?
>>>>>>>>
>>>>>>>>              I'm actually a bit surprised that the huge-page
>>>>>>>>              handling in
>>>>>>>>              amdgpu_vm_update_ptes isn't kicking in to free up
>>>>>>>>              lower-level page
>>>>>>>>              tables when a BO is unmapped.
>>>>>>>>
>>>>>>>>
>>>>>>>>          Well it does free the lower level, and that is already
>>>>>>>>          causing problems (that's why I added the reserved space).
>>>>>>>>
>>>>>>>>          What we don't do is freeing the higher levels.
>>>>>>>>
>>>>>>>>          E.g. when you free a 2MB BO we free the lowest level, if
>>>>>>>>          we free a 1GB BO we free the two lowest levels etc...
>>>>>>>>
>>>>>>>>          The problem with freeing the higher levels is that you
>>>>>>>>          don't know who is also using this. E.g. we would need to
>>>>>>>>          check all entries when we unmap one.
>>>>>>>>
>>>>>>>>          It's simply not worth it for a maximum saving of 2MB per VM.
>>>>>>>>
>>>>>>>>          Writing this I'm actually wondering how you ended up in
>>>>>>>>          this issue? There shouldn't be much savings from this.
>>>>>>>>
>>>>>>>>          Regards,
>>>>>>>>          Christian.
>>>>>>>>
>>>>>>>>
>>>>>>>>              Regards,
>>>>>>>>                 Felix
>>>>>>>>
>>>>>>>>
>>>>>>>>              >
>>>>>>>>              > Regards,
>>>>>>>>              > Christian.
>>>>>>>>              >
>>>>>>>>              >>
>>>>>>>>              >> Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
>>>>>>>>              >> Signed-off-by: Eric Huang
>>>>>>>>              <JinhuiEric.Huang@amd.com>
>>>>>>>>              <mailto:JinhuiEric.Huang@amd.com>
>>>>>>>>              >> ---
>>>>>>>>              >> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56
>>>>>>>>              >> +++++++++++++++++++++++++++++-----
>>>>>>>>              >>   1 file changed, 48 insertions(+), 8 deletions(-)
>>>>>>>>              >>
>>>>>>>>              >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>              >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>              >> index 0f4c3b2..8a480c7 100644
>>>>>>>>              >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>              >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>              >> @@ -1930,6 +1930,51 @@ static void
>>>>>>>>              amdgpu_vm_prt_fini(struct
>>>>>>>>              >> amdgpu_device *adev, struct amdgpu_vm *vm)
>>>>>>>>              >>   }
>>>>>>>>              >>     /**
>>>>>>>>              >> + * amdgpu_vm_remove_ptes - free PT BOs
>>>>>>>>              >> + *
>>>>>>>>              >> + * @adev: amdgpu device structure
>>>>>>>>              >> + * @vm: amdgpu vm structure
>>>>>>>>              >> + * @start: start of mapped range
>>>>>>>>              >> + * @end: end of mapped entry
>>>>>>>>              >> + *
>>>>>>>>              >> + * Free the page table level.
>>>>>>>>              >> + */
>>>>>>>>              >> +static int amdgpu_vm_remove_ptes(struct
>>>>>>>>              amdgpu_device *adev,
>>>>>>>>              >> + struct amdgpu_vm *vm, uint64_t start, uint64_t end)
>>>>>>>>              >> +{
>>>>>>>>              >> +    struct amdgpu_vm_pt_cursor cursor;
>>>>>>>>              >> +    unsigned shift, num_entries;
>>>>>>>>              >> +
>>>>>>>>              >> + amdgpu_vm_pt_start(adev, vm, start, &cursor);
>>>>>>>>              >> +    while (cursor.level < AMDGPU_VM_PTB) {
>>>>>>>>              >> +        if (!amdgpu_vm_pt_descendant(adev, &cursor))
>>>>>>>>              >> + return -ENOENT;
>>>>>>>>              >> +    }
>>>>>>>>              >> +
>>>>>>>>              >> +    while (cursor.pfn < end) {
>>>>>>>>              >> + amdgpu_vm_free_table(cursor.entry);
>>>>>>>>              >> + num_entries = amdgpu_vm_num_entries(adev,
>>>>>>>>              cursor.level - 1);
>>>>>>>>              >> +
>>>>>>>>              >> +        if (cursor.entry !=
>>>>>>>>              &cursor.parent->entries[num_entries - 1]) {
>>>>>>>>              >> + /* Next ptb entry */
>>>>>>>>              >> + shift = amdgpu_vm_level_shift(adev,
>>>>>>>>              cursor.level - 1);
>>>>>>>>              >> + cursor.pfn += 1ULL << shift;
>>>>>>>>              >> + cursor.pfn &= ~((1ULL << shift) - 1);
>>>>>>>>              >> + cursor.entry++;
>>>>>>>>              >> +        } else {
>>>>>>>>              >> + /* Next ptb entry in next pd0 entry */
>>>>>>>>              >> + amdgpu_vm_pt_ancestor(&cursor);
>>>>>>>>              >> + shift = amdgpu_vm_level_shift(adev,
>>>>>>>>              cursor.level - 1);
>>>>>>>>              >> + cursor.pfn += 1ULL << shift;
>>>>>>>>              >> + cursor.pfn &= ~((1ULL << shift) - 1);
>>>>>>>>              >> + amdgpu_vm_pt_descendant(adev, &cursor);
>>>>>>>>              >> +        }
>>>>>>>>              >> +    }
>>>>>>>>              >> +
>>>>>>>>              >> +    return 0;
>>>>>>>>              >> +}
>>>>>>>>              >> +
>>>>>>>>              >> +/**
>>>>>>>>              >>    * amdgpu_vm_clear_freed - clear freed BOs in
>>>>>>>>              the PT
>>>>>>>>              >>    *
>>>>>>>>              >>    * @adev: amdgpu_device pointer
>>>>>>>>              >> @@ -1949,7 +1994,6 @@ int
>>>>>>>>              amdgpu_vm_clear_freed(struct amdgpu_device
>>>>>>>>              >> *adev,
>>>>>>>>              >>                 struct dma_fence **fence)
>>>>>>>>              >>   {
>>>>>>>>              >>       struct amdgpu_bo_va_mapping *mapping;
>>>>>>>>              >> -    uint64_t init_pte_value = 0;
>>>>>>>>              >>       struct dma_fence *f = NULL;
>>>>>>>>              >>       int r;
>>>>>>>>              >>   @@ -1958,13 +2002,10 @@ int
>>>>>>>>              amdgpu_vm_clear_freed(struct
>>>>>>>>              >> amdgpu_device *adev,
>>>>>>>>              >> struct amdgpu_bo_va_mapping, list);
>>>>>>>>              >> list_del(&mapping->list);
>>>>>>>>              >>   -        if (vm->pte_support_ats &&
>>>>>>>>              >> - mapping->start < AMDGPU_GMC_HOLE_START)
>>>>>>>>              >> - init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
>>>>>>>>              >> +        r = amdgpu_vm_remove_ptes(adev, vm,
>>>>>>>>              >> + (mapping->start + 0x1ff) & (~0x1ffll),
>>>>>>>>              >> + (mapping->last + 1) & (~0x1ffll));
>>>>>>>>              >>   -        r = amdgpu_vm_bo_update_mapping(adev,
>>>>>>>>              vm, false, NULL,
>>>>>>>>              >> - mapping->start, mapping->last,
>>>>>>>>              >> - init_pte_value, 0, NULL, &f);
>>>>>>>>              >> amdgpu_vm_free_mapping(adev, vm, mapping, f);
>>>>>>>>              >>           if (r) {
>>>>>>>>              >> dma_fence_put(f);
>>>>>>>>              >> @@ -1980,7 +2021,6 @@ int
>>>>>>>>              amdgpu_vm_clear_freed(struct amdgpu_device
>>>>>>>>              >> *adev,
>>>>>>>>              >>       }
>>>>>>>>              >> return 0;
>>>>>>>>              >> -
>>>>>>>>              >>   }
>>>>>>>>              >>     /**
>>>>>>>>              >
>>>>>>>>              > _______________________________________________
>>>>>>>>              > amd-gfx mailing list
>>>>>>>>              > amd-gfx@lists.freedesktop.org
>>>>>>>>              <mailto:amd-gfx@lists.freedesktop.org>
>>>>>>>>              > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>>>>>>
>>>>>>>>
>>>>>>>>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] drm/amdgpu: remove PT BOs when unmapping
@ 2019-11-05 16:27                                     ` Huang, JinHuiEric
  0 siblings, 0 replies; 32+ messages in thread
From: Huang, JinHuiEric @ 2019-11-05 16:27 UTC (permalink / raw)
  To: Koenig, Christian; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Hi Christian,

I found the reason why page tables are not freed when unmapping. All the
pts are reserved, so they are not freed until vm fini. As a consequence,
the old pts and the new pts for the same VAs both exist until vm fini. In
the KFD big buffer stress test, repeatedly mapping and unmapping a big
range of system memory causes a huge accumulation of vram used by pts.

I tried to avoid generating duplicated pts during unmapping in
amdgpu_vm_update_ptes() by skipping amdgpu_vm_free_pts() and by not
reserving the lowest pts, but neither worked; both ended in VM faults. The
only approach that works is skipping amdgpu_vm_update_ptes() entirely, but
that seems wrong, because we still have to update the GPU VM MMU.

So there is no bug in amdgpu_vm_update_ptes(), but the accumulated vram
usage of the pts is an overhead. What do you think we can do to get a
better solution?
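
For anyone following the thread, the 2M alignment that the patch quoted
below applies to a mapping before freeing page tables can be sketched in
plain C. This is only a standalone illustration, not driver code: the
helper names are made up for the example, and it assumes 512 entries per
lowest-level page table (the 0x1ff mask in the patch) with 4KB pages, so
that one table covers 2MB.

```c
#include <stdint.h>

/* Assumption: 512 PTEs per lowest-level page table (the 0x1ff mask in the patch). */
#define PTB_ENTRIES 512ULL
#define PTB_MASK    (PTB_ENTRIES - 1)

/* Hypothetical helper: round a page frame number up to a page-table boundary. */
static uint64_t ptb_align_up(uint64_t pfn)
{
	return (pfn + PTB_MASK) & ~PTB_MASK;
}

/* Hypothetical helper: round a page frame number down to a page-table boundary. */
static uint64_t ptb_align_down(uint64_t pfn)
{
	return pfn & ~PTB_MASK;
}

/*
 * Number of lowest-level page tables fully covered by the inclusive
 * page range [start, last]. Partially covered tables are not counted,
 * because other mappings may still use their remaining entries.
 */
static uint64_t full_ptbs_in_range(uint64_t start, uint64_t last)
{
	uint64_t first = ptb_align_up(start);
	uint64_t end = ptb_align_down(last + 1);

	return end > first ? (end - first) / PTB_ENTRIES : 0;
}
```

With these helpers, a mapping covering pages 0x100 through 0x5ff fully
covers two tables (0x200-0x3ff and 0x400-0x5ff), while the partially
covered table at 0x000-0x1ff is left alone. A mapping smaller than one
table, such as pages 0x10 through 0x20, covers none.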

Regards,

Eric

On 2019-10-31 10:33 a.m., Huang, JinHuiEric wrote:
> The hardware is vega10 and test is KFDMemoryTest.BigBufferStressTest.
> More detail is on Jira SWDEV-201443.
>
> Regards,
>
> Eric
>
> On 2019-10-31 10:08 a.m., StDenis, Tom wrote:
>> I could try it on my carrizo/polaris setup.  Is there a test procedure I
>> could follow to trigger the changed code paths?
>>
>>
>> Tom
>>
>> On 2019-10-31 6:41 a.m., Koenig, Christian wrote:
>>> Just tested this and amdgpu_vm_update_ptes() indeed works as expected.
>>>
>>> When you free at least 2MB, the lowest level of page tables is freed
>>> up again.
>>>
>>> BTW: What hardware have you tested this on? On gfx8 and older it is
>>> expected that page tables are never freed.
>>>
>>> Regards,
>>> Christian.
>>>
>>> Am 30.10.19 um 19:11 schrieb Christian König:
>>>> Then I don't see how this patch actually changes anything.
>>>>
>>>> Could only be a bug in amdgpu_vm_update_ptes(). Going to investigate
>>>> this, but I won't have time to look into the ticket in detail.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>> Am 30.10.19 um 19:00 schrieb Huang, JinHuiEric:
>>>>> Actually I do prevent the removal of in-use pts with this:
>>>>>
>>>>> +               r = amdgpu_vm_remove_ptes(adev, vm,
>>>>> +                               (mapping->start + 0x1ff) & (~0x1ffll),
>>>>> +                               (mapping->last + 1) & (~0x1ffll));
>>>>>
>>>>> This only removes page tables that are aligned to 2M. And I have tested
>>>>> it at least on the KFD tests without anything breaking.
>>>>>
>>>>> By the way, I am not familiar with the memory stuff. This patch is the
>>>>> best I can do for now. Could you take a look at the Jira ticket
>>>>> SWDEV-201443 and find a better solution? Thanks!
>>>>>
>>>>> Regards,
>>>>>
>>>>> Eric
>>>>>
>>>>> On 2019-10-30 1:57 p.m., Christian König wrote:
>>>>>> One thing I've forgotten:
>>>>>>
>>>>>> What you could maybe do to improve the situation is to join
>>>>>> adjacent ranges in amdgpu_vm_clear_freed(), but I'm not sure how
>>>>>> likely it is that the ranges are freed all together.
>>>>>>
>>>>>> The only other alternative I can see would be to check the mappings
>>>>>> of a range in amdgpu_update_ptes() and see if you could walk the
>>>>>> tree up if the valid flag is not set and there are no mappings left
>>>>>> for a page table.
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>> Am 30.10.19 um 18:42 schrieb Koenig, Christian:
>>>>>>>> The valid flag doesn't take effect in this function.
>>>>>>> That's irrelevant.
>>>>>>>
>>>>>>> See, what amdgpu_vm_update_ptes() does is first determine the
>>>>>>> fragment size:
>>>>>>>> amdgpu_vm_fragment(params, frag_start, end, flags, &frag, &frag_end);
>>>>>>> Then we walk down the tree:
>>>>>>>>           amdgpu_vm_pt_start(adev, params->vm, start, &cursor);
>>>>>>>>           while (cursor.pfn < end) {
>>>>>>> And make sure that the page tables covering the address range are
>>>>>>> actually allocated:
>>>>>>>>                   r = amdgpu_vm_alloc_pts(params->adev, params->vm,
>>>>>>>> &cursor);
>>>>>>> Then we update the tables with the flags and addresses and free up
>>>>>>> subsequent tables in the case of huge pages or freed up areas:
>>>>>>>>                           /* Free all child entries */
>>>>>>>>                           while (cursor.pfn < frag_start) {
>>>>>>>> amdgpu_vm_free_pts(adev, params->vm, &cursor);
>>>>>>>> amdgpu_vm_pt_next(adev, &cursor);
>>>>>>>>                           }
>>>>>>> This is the maximum you can free, because all other page tables are
>>>>>>> not completely covered by the range and so potentially still in use.
>>>>>>>
>>>>>>> And I have the strong suspicion that this is what your patch is
>>>>>>> actually doing wrong. In other words you are also freeing page
>>>>>>> tables which are only partially covered by the range and so
>>>>>>> potentially still in use.
>>>>>>>
>>>>>>> Since we don't have any tracking of how many entries in a page table
>>>>>>> are currently valid and how many are invalid, we actually can't
>>>>>>> implement what you are trying to do here. So the patch is
>>>>>>> definitely somehow broken.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Christian.
>>>>>>>
>>>>>>> Am 30.10.19 um 17:55 schrieb Huang, JinHuiEric:
>>>>>>>> The valid flag doesn't take effect in this function.
>>>>>>>> amdgpu_vm_alloc_pts() is always executed; it depends only on
>>>>>>>> "cursor.pfn < end". The valid flag is only checked here,
>>>>>>>> for ASICs below GMC v9:
>>>>>>>>
>>>>>>>> if (adev->asic_type < CHIP_VEGA10 &&
>>>>>>>>               (flags & AMDGPU_PTE_VALID))...
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Eric
>>>>>>>>
>>>>>>>> On 2019-10-30 12:30 p.m., Koenig, Christian wrote:
>>>>>>>>> Am 30.10.2019 17:19 schrieb "Huang, JinHuiEric"
>>>>>>>>> <JinHuiEric.Huang@amd.com>:
>>>>>>>>>
>>>>>>>>>       I tested it, and it saves a lot of vram on the KFD big buffer
>>>>>>>>>       stress test. I think there are two reasons:
>>>>>>>>>
>>>>>>>>>       1. Calling amdgpu_vm_update_ptes() during unmapping will
>>>>>>>>>       allocate unnecessary pts, because there is no flag to
>>>>>>>>>       determine whether the VA is being mapped or unmapped in
>>>>>>>>>       amdgpu_vm_update_ptes(). This accounts for most of the
>>>>>>>>>       memory saved.
>>>>>>>>>
>>>>>>>>> That's not correct. The valid flag is used for this.
>>>>>>>>>
>>>>>>>>>       2. Intentionally removing the pts of an unmapped range is the
>>>>>>>>>       logical expectation, although it does not remove that many pts.
>>>>>>>>>
>>>>>>>>> Well I actually don't see a change to what update_ptes is doing
>>>>>>>>> and have the strong suspicion that the patch is simply broken.
>>>>>>>>>
>>>>>>>>> You either free page tables which are potentially still in use
>>>>>>>>> or update_pte doesn't free page tables when the valid bit is not
>>>>>>>>> set.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Christian.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>       Regards,
>>>>>>>>>
>>>>>>>>>       Eric
>>>>>>>>>
>>>>>>>>>       On 2019-10-30 11:57 a.m., Koenig, Christian wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>           Am 30.10.2019 16:47 schrieb "Kuehling, Felix"
>>>>>>>>>           <Felix.Kuehling@amd.com> <mailto:Felix.Kuehling@amd.com>:
>>>>>>>>>
>>>>>>>>>               On 2019-10-30 9:52 a.m., Christian König wrote:
>>>>>>>>>               > Am 29.10.19 um 21:06 schrieb Huang, JinHuiEric:
>>>>>>>>>               >> The issue is PT BOs are not freed when unmapping VA,
>>>>>>>>>               >> which causes vram usage accumulated is huge in some
>>>>>>>>>               >> memory stress test, such as kfd big buffer stress
>>>>>>>>>               test.
>>>>>>>>>               >> Function amdgpu_vm_bo_update_mapping() is called
>>>>>>>>>               by both
>>>>>>>>>               >> amdgpu_vm_bo_update() and
>>>>>>>>>               amdgpu_vm_clear_freed(). The
>>>>>>>>>               >> solution is replacing
>>>>>>>>>               amdgpu_vm_bo_update_mapping() in
>>>>>>>>>               >> amdgpu_vm_clear_freed() with removing PT BOs function
>>>>>>>>>               >> to save vram usage.
>>>>>>>>>               >
>>>>>>>>>               > NAK, that is intentional behavior.
>>>>>>>>>               >
>>>>>>>>>               > Otherwise we can run into out of memory situations
>>>>>>>>>               when page tables
>>>>>>>>>               > need to be allocated again under stress.
>>>>>>>>>
>>>>>>>>>               That's a bit arbitrary and inconsistent. We are
>>>>>>>>>               freeing page tables in
>>>>>>>>>               other situations, when a mapping uses huge pages in
>>>>>>>>>               amdgpu_vm_update_ptes. Why not when a mapping is
>>>>>>>>>               destroyed completely?
>>>>>>>>>
>>>>>>>>>               I'm actually a bit surprised that the huge-page
>>>>>>>>>               handling in
>>>>>>>>>               amdgpu_vm_update_ptes isn't kicking in to free up
>>>>>>>>>               lower-level page
>>>>>>>>>               tables when a BO is unmapped.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>           Well it does free the lower level, and that is already
>>>>>>>>>           causing problems (that's why I added the reserved space).
>>>>>>>>>
>>>>>>>>>           What we don't do is freeing the higher levels.
>>>>>>>>>
>>>>>>>>>           E.g. when you free a 2MB BO we free the lowest level, if
>>>>>>>>>           we free a 1GB BO we free the two lowest levels etc...
>>>>>>>>>
>>>>>>>>>           The problem with freeing the higher levels is that you
>>>>>>>>>           don't know who is also using this. E.g. we would need to
>>>>>>>>>           check all entries when we unmap one.
>>>>>>>>>
>>>>>>>>>           It's simply not worth it for a maximum saving of 2MB per VM.
>>>>>>>>>
>>>>>>>>>           Writing this I'm actually wondering how you ended up in
>>>>>>>>>           this issue? There shouldn't be much savings from this.
>>>>>>>>>
>>>>>>>>>           Regards,
>>>>>>>>>           Christian.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>               Regards,
>>>>>>>>>                  Felix
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>               >
>>>>>>>>>               > Regards,
>>>>>>>>>               > Christian.
>>>>>>>>>               >
>>>>>>>>>               >>
>>>>>>>>>               >> Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
>>>>>>>>>               >> Signed-off-by: Eric Huang
>>>>>>>>>               <JinhuiEric.Huang@amd.com>
>>>>>>>>>               <mailto:JinhuiEric.Huang@amd.com>
>>>>>>>>>               >> ---
>>>>>>>>>               >> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56
>>>>>>>>>               >> +++++++++++++++++++++++++++++-----
>>>>>>>>>               >>   1 file changed, 48 insertions(+), 8 deletions(-)
>>>>>>>>>               >>
>>>>>>>>>               >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>>               >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>>               >> index 0f4c3b2..8a480c7 100644
>>>>>>>>>               >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>>               >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>>               >> @@ -1930,6 +1930,51 @@ static void
>>>>>>>>>               amdgpu_vm_prt_fini(struct
>>>>>>>>>               >> amdgpu_device *adev, struct amdgpu_vm *vm)
>>>>>>>>>               >>   }
>>>>>>>>>               >>     /**
>>>>>>>>>               >> + * amdgpu_vm_remove_ptes - free PT BOs
>>>>>>>>>               >> + *
>>>>>>>>>               >> + * @adev: amdgpu device structure
>>>>>>>>>               >> + * @vm: amdgpu vm structure
>>>>>>>>>               >> + * @start: start of mapped range
>>>>>>>>>               >> + * @end: end of mapped entry
>>>>>>>>>               >> + *
>>>>>>>>>               >> + * Free the page table level.
>>>>>>>>>               >> + */
>>>>>>>>>               >> +static int amdgpu_vm_remove_ptes(struct
>>>>>>>>>               amdgpu_device *adev,
>>>>>>>>>               >> + struct amdgpu_vm *vm, uint64_t start, uint64_t end)
>>>>>>>>>               >> +{
>>>>>>>>>               >> +    struct amdgpu_vm_pt_cursor cursor;
>>>>>>>>>               >> +    unsigned shift, num_entries;
>>>>>>>>>               >> +
>>>>>>>>>               >> + amdgpu_vm_pt_start(adev, vm, start, &cursor);
>>>>>>>>>               >> +    while (cursor.level < AMDGPU_VM_PTB) {
>>>>>>>>>               >> +        if (!amdgpu_vm_pt_descendant(adev, &cursor))
>>>>>>>>>               >> + return -ENOENT;
>>>>>>>>>               >> +    }
>>>>>>>>>               >> +
>>>>>>>>>               >> +    while (cursor.pfn < end) {
>>>>>>>>>               >> + amdgpu_vm_free_table(cursor.entry);
>>>>>>>>>               >> + num_entries = amdgpu_vm_num_entries(adev,
>>>>>>>>>               cursor.level - 1);
>>>>>>>>>               >> +
>>>>>>>>>               >> +        if (cursor.entry !=
>>>>>>>>>               &cursor.parent->entries[num_entries - 1]) {
>>>>>>>>>               >> + /* Next ptb entry */
>>>>>>>>>               >> + shift = amdgpu_vm_level_shift(adev,
>>>>>>>>>               cursor.level - 1);
>>>>>>>>>               >> + cursor.pfn += 1ULL << shift;
>>>>>>>>>               >> + cursor.pfn &= ~((1ULL << shift) - 1);
>>>>>>>>>               >> + cursor.entry++;
>>>>>>>>>               >> +        } else {
>>>>>>>>>               >> + /* Next ptb entry in next pd0 entry */
>>>>>>>>>               >> + amdgpu_vm_pt_ancestor(&cursor);
>>>>>>>>>               >> + shift = amdgpu_vm_level_shift(adev,
>>>>>>>>>               cursor.level - 1);
>>>>>>>>>               >> + cursor.pfn += 1ULL << shift;
>>>>>>>>>               >> + cursor.pfn &= ~((1ULL << shift) - 1);
>>>>>>>>>               >> + amdgpu_vm_pt_descendant(adev, &cursor);
>>>>>>>>>               >> +        }
>>>>>>>>>               >> +    }
>>>>>>>>>               >> +
>>>>>>>>>               >> +    return 0;
>>>>>>>>>               >> +}
>>>>>>>>>               >> +
>>>>>>>>>               >> +/**
>>>>>>>>>               >>    * amdgpu_vm_clear_freed - clear freed BOs in
>>>>>>>>>               the PT
>>>>>>>>>               >>    *
>>>>>>>>>               >>    * @adev: amdgpu_device pointer
>>>>>>>>>               >> @@ -1949,7 +1994,6 @@ int
>>>>>>>>>               amdgpu_vm_clear_freed(struct amdgpu_device
>>>>>>>>>               >> *adev,
>>>>>>>>>               >>                 struct dma_fence **fence)
>>>>>>>>>               >>   {
>>>>>>>>>               >>       struct amdgpu_bo_va_mapping *mapping;
>>>>>>>>>               >> -    uint64_t init_pte_value = 0;
>>>>>>>>>               >>       struct dma_fence *f = NULL;
>>>>>>>>>               >>       int r;
>>>>>>>>>               >>   @@ -1958,13 +2002,10 @@ int
>>>>>>>>>               amdgpu_vm_clear_freed(struct
>>>>>>>>>               >> amdgpu_device *adev,
>>>>>>>>>               >> struct amdgpu_bo_va_mapping, list);
>>>>>>>>>               >> list_del(&mapping->list);
>>>>>>>>>               >>   -        if (vm->pte_support_ats &&
>>>>>>>>>               >> - mapping->start < AMDGPU_GMC_HOLE_START)
>>>>>>>>>               >> - init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
>>>>>>>>>               >> +        r = amdgpu_vm_remove_ptes(adev, vm,
>>>>>>>>>               >> + (mapping->start + 0x1ff) & (~0x1ffll),
>>>>>>>>>               >> + (mapping->last + 1) & (~0x1ffll));
>>>>>>>>>               >>   -        r = amdgpu_vm_bo_update_mapping(adev,
>>>>>>>>>               vm, false, NULL,
>>>>>>>>>               >> - mapping->start, mapping->last,
>>>>>>>>>               >> - init_pte_value, 0, NULL, &f);
>>>>>>>>>               >> amdgpu_vm_free_mapping(adev, vm, mapping, f);
>>>>>>>>>               >>           if (r) {
>>>>>>>>>               >> dma_fence_put(f);
>>>>>>>>>               >> @@ -1980,7 +2021,6 @@ int
>>>>>>>>>               amdgpu_vm_clear_freed(struct amdgpu_device
>>>>>>>>>               >> *adev,
>>>>>>>>>               >>       }
>>>>>>>>>               >> return 0;
>>>>>>>>>               >> -
>>>>>>>>>               >>   }
>>>>>>>>>               >>     /**
>>>>>>>>>               >
>>>>>>>>>               > _______________________________________________
>>>>>>>>>               > amd-gfx mailing list
>>>>>>>>>               > amd-gfx@lists.freedesktop.org
>>>>>>>>>               <mailto:amd-gfx@lists.freedesktop.org>
>>>>>>>>>               > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] drm/amdgpu: remove PT BOs when unmapping
@ 2019-11-05 16:27                                     ` Huang, JinHuiEric
  0 siblings, 0 replies; 32+ messages in thread
From: Huang, JinHuiEric @ 2019-11-05 16:27 UTC (permalink / raw)
  To: Koenig, Christian; +Cc: amd-gfx

Hi Christian,

I found the reason why page tables are not freed when unmapping. All the
pts are reserved, so they are not freed until vm fini. As a consequence,
the old pts and the new pts for the same VAs both exist until vm fini. In
the KFD big buffer stress test, repeatedly mapping and unmapping a big
range of system memory causes a huge accumulation of vram used by pts.

I tried to avoid generating duplicated pts during unmapping in
amdgpu_vm_update_ptes() by skipping amdgpu_vm_free_pts() and by not
reserving the lowest pts, but neither worked; both ended in VM faults. The
only approach that works is skipping amdgpu_vm_update_ptes() entirely, but
that seems wrong, because we still have to update the GPU VM MMU.

So there is no bug in amdgpu_vm_update_ptes(), but the accumulated vram
usage of the pts is an overhead. What do you think we can do to get a
better solution?

Regards,

Eric

On 2019-10-31 10:33 a.m., Huang, JinHuiEric wrote:
> The hardware is vega10 and test is KFDMemoryTest.BigBufferStressTest.
> More detail is on Jira SWDEV-201443.
>
> Regards,
>
> Eric
>
> On 2019-10-31 10:08 a.m., StDenis, Tom wrote:
>> I could try it on my carrizo/polaris setup.  Is there a test procedure I
>> could follow to trigger the changed code paths?
>>
>>
>> Tom
>>
>> On 2019-10-31 6:41 a.m., Koenig, Christian wrote:
>>> Just tested this and amdgpu_vm_update_ptes() indeed works as expected.
>>>
>>> When you free at least 2MB, the lowest level of page tables is freed
>>> up again.
>>>
>>> BTW: What hardware have you tested this on? On gfx8 and older it is
>>> expected that page tables are never freed.
>>>
>>> Regards,
>>> Christian.
>>>
>>> Am 30.10.19 um 19:11 schrieb Christian König:
>>>> Then I don't see how this patch actually changes anything.
>>>>
>>>> Could only be a bug in amdgpu_vm_update_ptes(). Going to investigate
>>>> this, but I won't have time to look into the ticket in detail.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>> Am 30.10.19 um 19:00 schrieb Huang, JinHuiEric:
>>>>> Actually I do prevent the removal of in-use pts with this:
>>>>>
>>>>> +               r = amdgpu_vm_remove_ptes(adev, vm,
>>>>> +                               (mapping->start + 0x1ff) & (~0x1ffll),
>>>>> +                               (mapping->last + 1) & (~0x1ffll));
>>>>>
>>>>> This only removes page tables that are aligned to 2M. And I have tested
>>>>> it at least on the KFD tests without anything breaking.
>>>>>
>>>>> By the way, I am not familiar with the memory stuff. This patch is the
>>>>> best I can do for now. Could you take a look at the Jira ticket
>>>>> SWDEV-201443 and find a better solution? Thanks!
>>>>>
>>>>> Regards,
>>>>>
>>>>> Eric
>>>>>
>>>>> On 2019-10-30 1:57 p.m., Christian König wrote:
>>>>>> One thing I've forgotten:
>>>>>>
>>>>>> What you could maybe do to improve the situation is to join
>>>>>> adjacent ranges in amdgpu_vm_clear_freed(), but I'm not sure how
>>>>>> likely it is that the ranges are freed all together.
>>>>>>
>>>>>> The only other alternative I can see would be to check the mappings
>>>>>> of a range in amdgpu_update_ptes() and see if you could walk the
>>>>>> tree up if the valid flag is not set and there are no mappings left
>>>>>> for a page table.
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>> Am 30.10.19 um 18:42 schrieb Koenig, Christian:
>>>>>>>> The valid flag doesn't take effect in this function.
>>>>>>> That's irrelevant.
>>>>>>>
>>>>>>> See, what amdgpu_vm_update_ptes() does is first determine the
>>>>>>> fragment size:
>>>>>>>> amdgpu_vm_fragment(params, frag_start, end, flags, &frag, &frag_end);
>>>>>>> Then we walk down the tree:
>>>>>>>>           amdgpu_vm_pt_start(adev, params->vm, start, &cursor);
>>>>>>>>           while (cursor.pfn < end) {
>>>>>>> And make sure that the page tables covering the address range are
>>>>>>> actually allocated:
>>>>>>>>                   r = amdgpu_vm_alloc_pts(params->adev, params->vm,
>>>>>>>> &cursor);
>>>>>>> Then we update the tables with the flags and addresses and free up
>>>>>>> subsequent tables in the case of huge pages or freed up areas:
>>>>>>>>                           /* Free all child entries */
>>>>>>>>                           while (cursor.pfn < frag_start) {
>>>>>>>> amdgpu_vm_free_pts(adev, params->vm, &cursor);
>>>>>>>> amdgpu_vm_pt_next(adev, &cursor);
>>>>>>>>                           }
>>>>>>> This is the maximum you can free, because all other page tables are
>>>>>>> not completely covered by the range and so potentially still in use.
>>>>>>>
>>>>>>> And I have the strong suspicion that this is what your patch is
>>>>>>> actually doing wrong. In other words you are also freeing page
>>>>>>> tables which are only partially covered by the range and so
>>>>>>> potentially still in use.
>>>>>>>
>>>>>>> Since we don't have any tracking of how many entries in a page table
>>>>>>> are currently valid and how many are invalid, we actually can't
>>>>>>> implement what you are trying to do here. So the patch is
>>>>>>> definitely somehow broken.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Christian.
>>>>>>>
>>>>>>> Am 30.10.19 um 17:55 schrieb Huang, JinHuiEric:
>>>>>>>> The valid flag doesn't take effect in this function.
>>>>>>>> amdgpu_vm_alloc_pts() is always executed; it depends only on
>>>>>>>> "cursor.pfn < end". The valid flag is only checked here,
>>>>>>>> for ASICs below GMC v9:
>>>>>>>>
>>>>>>>> if (adev->asic_type < CHIP_VEGA10 &&
>>>>>>>>               (flags & AMDGPU_PTE_VALID))...
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Eric
>>>>>>>>
>>>>>>>> On 2019-10-30 12:30 p.m., Koenig, Christian wrote:
>>>>>>>>> Am 30.10.2019 17:19 schrieb "Huang, JinHuiEric"
>>>>>>>>> <JinHuiEric.Huang@amd.com>:
>>>>>>>>>
>>>>>>>>>       I tested it, and it saves a lot of vram on the KFD big buffer
>>>>>>>>>       stress test. I think there are two reasons:
>>>>>>>>>
>>>>>>>>>       1. Calling amdgpu_vm_update_ptes() during unmapping will
>>>>>>>>>       allocate unnecessary pts, because there is no flag to
>>>>>>>>>       determine whether the VA is being mapped or unmapped in
>>>>>>>>>       amdgpu_vm_update_ptes(). This accounts for most of the
>>>>>>>>>       memory saved.
>>>>>>>>>
>>>>>>>>> That's not correct. The valid flag is used for this.
>>>>>>>>>
>>>>>>>>>       2. Intentionally removing the pts of an unmapped range is the
>>>>>>>>>       logical expectation, although it does not remove that many pts.
>>>>>>>>>
>>>>>>>>> Well I actually don't see a change to what update_ptes is doing
>>>>>>>>> and have the strong suspicion that the patch is simply broken.
>>>>>>>>>
>>>>>>>>> You either free page tables which are potentially still in use
>>>>>>>>> or update_pte doesn't free page tables when the valid bit is not
>>>>>>>>> set.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Christian.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>       Regards,
>>>>>>>>>
>>>>>>>>>       Eric
>>>>>>>>>
>>>>>>>>>       On 2019-10-30 11:57 a.m., Koenig, Christian wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>           Am 30.10.2019 16:47 schrieb "Kuehling, Felix"
>>>>>>>>>           <Felix.Kuehling@amd.com> <mailto:Felix.Kuehling@amd.com>:
>>>>>>>>>
>>>>>>>>>               On 2019-10-30 9:52 a.m., Christian König wrote:
>>>>>>>>>               > Am 29.10.19 um 21:06 schrieb Huang, JinHuiEric:
>>>>>>>>>               >> The issue is PT BOs are not freed when unmapping VA,
>>>>>>>>>               >> which causes vram usage accumulated is huge in some
>>>>>>>>>               >> memory stress test, such as kfd big buffer stress test.
>>>>>>>>>               >> Function amdgpu_vm_bo_update_mapping() is called by both
>>>>>>>>>               >> amdgpu_vm_bo_update() and amdgpu_vm_clear_freed(). The
>>>>>>>>>               >> solution is replacing amdgpu_vm_bo_update_mapping() in
>>>>>>>>>               >> amdgpu_vm_clear_freed() with removing PT BOs function
>>>>>>>>>               >> to save vram usage.
>>>>>>>>>               >
>>>>>>>>>               > NAK, that is intentional behavior.
>>>>>>>>>               >
>>>>>>>>>               > Otherwise we can run into out of memory situations when
>>>>>>>>>               > page tables need to be allocated again under stress.
>>>>>>>>>
>>>>>>>>>               That's a bit arbitrary and inconsistent. We are freeing
>>>>>>>>>               page tables in other situations, when a mapping uses huge
>>>>>>>>>               pages in amdgpu_vm_update_ptes. Why not when a mapping is
>>>>>>>>>               destroyed completely?
>>>>>>>>>
>>>>>>>>>               I'm actually a bit surprised that the huge-page handling
>>>>>>>>>               in amdgpu_vm_update_ptes isn't kicking in to free up
>>>>>>>>>               lower-level page tables when a BO is unmapped.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>           Well it does free the lower level, and that is already
>>>>>>>>>           causing problems (that's why I added the reserved space).
>>>>>>>>>
>>>>>>>>>           What we don't do is freeing the higher levels.
>>>>>>>>>
>>>>>>>>>           E.g. when you free a 2MB BO we free the lowest level, if
>>>>>>>>>           we free a 1GB BO we free the two lowest levels etc...
>>>>>>>>>
>>>>>>>>>           The problem with freeing the higher levels is that you
>>>>>>>>>           don't know who is also using this. E.g. we would need to
>>>>>>>>>           check all entries when we unmap one.
>>>>>>>>>
>>>>>>>>>           It's simply not worth it for a maximum saving of 2MB per VM.
>>>>>>>>>
>>>>>>>>>           Writing this I'm actually wondering how you ended up in
>>>>>>>>>           this issue? There shouldn't be much savings from this.
>>>>>>>>>
>>>>>>>>>           Regards,
>>>>>>>>>           Christian.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>               Regards,
>>>>>>>>>                  Felix
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>               >
>>>>>>>>>               > Regards,
>>>>>>>>>               > Christian.
>>>>>>>>>               >
>>>>>>>>>               >>
>>>>>>>>>               >> Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
>>>>>>>>>               >> Signed-off-by: Eric Huang
>>>>>>>>>               <JinhuiEric.Huang@amd.com>
>>>>>>>>>               <mailto:JinhuiEric.Huang@amd.com>
>>>>>>>>>               >> ---
>>>>>>>>>               >> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56
>>>>>>>>>               >> +++++++++++++++++++++++++++++-----
>>>>>>>>>               >>   1 file changed, 48 insertions(+), 8 deletions(-)
>>>>>>>>>               >>
>>>>>>>>>               >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>>               >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>>               >> index 0f4c3b2..8a480c7 100644
>>>>>>>>>               >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>>               >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>>               >> @@ -1930,6 +1930,51 @@ static void
>>>>>>>>>               amdgpu_vm_prt_fini(struct
>>>>>>>>>               >> amdgpu_device *adev, struct amdgpu_vm *vm)
>>>>>>>>>               >>   }
>>>>>>>>>               >>     /**
>>>>>>>>>               >> + * amdgpu_vm_remove_ptes - free PT BOs
>>>>>>>>>               >> + *
>>>>>>>>>               >> + * @adev: amdgpu device structure
>>>>>>>>>               >> + * @vm: amdgpu vm structure
>>>>>>>>>               >> + * @start: start of mapped range
>>>>>>>>>               >> + * @end: end of mapped entry
>>>>>>>>>               >> + *
>>>>>>>>>               >> + * Free the page table level.
>>>>>>>>>               >> + */
>>>>>>>>>               >> +static int amdgpu_vm_remove_ptes(struct
>>>>>>>>>               amdgpu_device *adev,
>>>>>>>>>               >> + struct amdgpu_vm *vm, uint64_t start, uint64_t end)
>>>>>>>>>               >> +{
>>>>>>>>>               >> +    struct amdgpu_vm_pt_cursor cursor;
>>>>>>>>>               >> +    unsigned shift, num_entries;
>>>>>>>>>               >> +
>>>>>>>>>               >> + amdgpu_vm_pt_start(adev, vm, start, &cursor);
>>>>>>>>>               >> +    while (cursor.level < AMDGPU_VM_PTB) {
>>>>>>>>>               >> +        if (!amdgpu_vm_pt_descendant(adev, &cursor))
>>>>>>>>>               >> + return -ENOENT;
>>>>>>>>>               >> +    }
>>>>>>>>>               >> +
>>>>>>>>>               >> +    while (cursor.pfn < end) {
>>>>>>>>>               >> + amdgpu_vm_free_table(cursor.entry);
>>>>>>>>>               >> + num_entries = amdgpu_vm_num_entries(adev,
>>>>>>>>>               cursor.level - 1);
>>>>>>>>>               >> +
>>>>>>>>>               >> +        if (cursor.entry !=
>>>>>>>>>               &cursor.parent->entries[num_entries - 1]) {
>>>>>>>>>               >> + /* Next ptb entry */
>>>>>>>>>               >> + shift = amdgpu_vm_level_shift(adev,
>>>>>>>>>               cursor.level - 1);
>>>>>>>>>               >> + cursor.pfn += 1ULL << shift;
>>>>>>>>>               >> + cursor.pfn &= ~((1ULL << shift) - 1);
>>>>>>>>>               >> + cursor.entry++;
>>>>>>>>>               >> +        } else {
>>>>>>>>>               >> + /* Next ptb entry in next pd0 entry */
>>>>>>>>>               >> + amdgpu_vm_pt_ancestor(&cursor);
>>>>>>>>>               >> + shift = amdgpu_vm_level_shift(adev,
>>>>>>>>>               cursor.level - 1);
>>>>>>>>>               >> + cursor.pfn += 1ULL << shift;
>>>>>>>>>               >> + cursor.pfn &= ~((1ULL << shift) - 1);
>>>>>>>>>               >> + amdgpu_vm_pt_descendant(adev, &cursor);
>>>>>>>>>               >> +        }
>>>>>>>>>               >> +    }
>>>>>>>>>               >> +
>>>>>>>>>               >> +    return 0;
>>>>>>>>>               >> +}
>>>>>>>>>               >> +
>>>>>>>>>               >> +/**
>>>>>>>>>               >>    * amdgpu_vm_clear_freed - clear freed BOs in
>>>>>>>>>               the PT
>>>>>>>>>               >>    *
>>>>>>>>>               >>    * @adev: amdgpu_device pointer
>>>>>>>>>               >> @@ -1949,7 +1994,6 @@ int
>>>>>>>>>               amdgpu_vm_clear_freed(struct amdgpu_device
>>>>>>>>>               >> *adev,
>>>>>>>>>               >>                 struct dma_fence **fence)
>>>>>>>>>               >>   {
>>>>>>>>>               >>       struct amdgpu_bo_va_mapping *mapping;
>>>>>>>>>               >> -    uint64_t init_pte_value = 0;
>>>>>>>>>               >>       struct dma_fence *f = NULL;
>>>>>>>>>               >>       int r;
>>>>>>>>>               >>   @@ -1958,13 +2002,10 @@ int
>>>>>>>>>               amdgpu_vm_clear_freed(struct
>>>>>>>>>               >> amdgpu_device *adev,
>>>>>>>>>               >> struct amdgpu_bo_va_mapping, list);
>>>>>>>>>               >> list_del(&mapping->list);
>>>>>>>>>               >>   -        if (vm->pte_support_ats &&
>>>>>>>>>               >> - mapping->start < AMDGPU_GMC_HOLE_START)
>>>>>>>>>               >> - init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
>>>>>>>>>               >> +        r = amdgpu_vm_remove_ptes(adev, vm,
>>>>>>>>>               >> + (mapping->start + 0x1ff) & (~0x1ffll),
>>>>>>>>>               >> + (mapping->last + 1) & (~0x1ffll));
>>>>>>>>>               >>   -        r = amdgpu_vm_bo_update_mapping(adev,
>>>>>>>>>               vm, false, NULL,
>>>>>>>>>               >> - mapping->start, mapping->last,
>>>>>>>>>               >> - init_pte_value, 0, NULL, &f);
>>>>>>>>>               >> amdgpu_vm_free_mapping(adev, vm, mapping, f);
>>>>>>>>>               >>           if (r) {
>>>>>>>>>               >> dma_fence_put(f);
>>>>>>>>>               >> @@ -1980,7 +2021,6 @@ int
>>>>>>>>>               amdgpu_vm_clear_freed(struct amdgpu_device
>>>>>>>>>               >> *adev,
>>>>>>>>>               >>       }
>>>>>>>>>               >> return 0;
>>>>>>>>>               >> -
>>>>>>>>>               >>   }
>>>>>>>>>               >>     /**
>>>>>>>>>               >
>>>>>>>>>               > _______________________________________________
>>>>>>>>>               > amd-gfx mailing list
>>>>>>>>>               > amd-gfx@lists.freedesktop.org
>>>>>>>>>               <mailto:amd-gfx@lists.freedesktop.org>
>>>>>>>>>               > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> amd-gfx mailing list
>>>>>>> amd-gfx@lists.freedesktop.org
>>>>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] drm/amdgpu: remove PT BOs when unmapping
@ 2019-11-05 18:51                                         ` Koenig, Christian
  0 siblings, 0 replies; 32+ messages in thread
From: Koenig, Christian @ 2019-11-05 18:51 UTC (permalink / raw)
  To: Huang, JinHuiEric; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Hi Eric,

Ah! Yeah, that is a well-known issue.

The basic problem is that for releasing the BOs we need to reserve them to
check if they are idle or not.

I've got a branch with a TTM change to avoid that, but essentially that
is a huge problem which needs a rather big change in memory management
to fix.

Regards,
Christian.

Am 05.11.19 um 17:27 schrieb Huang, JinHuiEric:
> Hi Christian,
>
> I found the reason why page tables are not freed when unmapping. All the
> pts are reserved, so they are not freed until vm fini. As a
> consequence, old pts and new pts for the same VAs exist until vm
> fini. In the KFD big buffer stress test, repeatedly mapping and
> unmapping a big range of system memory causes a huge accumulation of
> vram pts usage.
>
> I tried to avoid generating duplicated pts during unmapping in
> amdgpu_vm_update_ptes() by skipping amdgpu_vm_free_pts() and by not
> reserving the lowest pts, but neither worked: both ended in VM faults.
> The only way that works is skipping the whole function
> amdgpu_vm_update_ptes(), but that seems wrong, because we have to
> update the GPU VM MMU.
>
> So there is no bug in amdgpu_vm_update_ptes(), but the accumulated
> vram usage of pts is an overhead. What do you think we can do to get
> a better solution?
>
> Regards,
>
> Eric
>
> On 2019-10-31 10:33 a.m., Huang, JinHuiEric wrote:
>> The hardware is vega10 and test is KFDMemoryTest.BigBufferStressTest.
>> More detail is on Jira SWDEV-201443.
>>
>> Regards,
>>
>> Eric
>>
>> On 2019-10-31 10:08 a.m., StDenis, Tom wrote:
>>> I could try it on my carrizo/polaris setup.  Is there a test procedure I
>>> could follow to trigger the changed code paths?
>>>
>>>
>>> Tom
>>>
>>> On 2019-10-31 6:41 a.m., Koenig, Christian wrote:
>>>> Just tested this and amdgpu_vm_update_ptes() indeed works as expected.
>>>>
>>>> When you free at least 2MB, the lowest level of page tables is freed
>>>> up again.
>>>>
>>>> BTW: What hardware have you tested this on? On gfx8 and older it is
>>>> expected that page tables are never freed.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>> Am 30.10.19 um 19:11 schrieb Christian König:
>>>>> Then I don't see how this patch actually changes anything.
>>>>>
>>>>> Could only be a bug in amdgpu_vm_update_ptes(). Going to investigate
>>>>> this, but I won't have time to look into the ticket in detail.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>> Am 30.10.19 um 19:00 schrieb Huang, JinHuiEric:
>>>>>> Actually I do prevent the removal of in-use pts with this:
>>>>>>
>>>>>> +               r = amdgpu_vm_remove_ptes(adev, vm,
>>>>>> +                               (mapping->start + 0x1ff) & (~0x1ffll),
>>>>>> +                               (mapping->last + 1) & (~0x1ffll));
>>>>>>
>>>>>> This only removes page tables that are aligned to 2M. And I have
>>>>>> tested it at least on the KFD tests without anything breaking.
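
The rounding in the quoted call can be illustrated with a small sketch. It
assumes that mapping->start and mapping->last are page frame numbers in
4 KiB units and that one page table block (PTB) holds 512 entries, so one
PTB spans 2 MiB. The helper names below are hypothetical and do not exist
in amdgpu_vm.c.

```c
#include <stdint.h>

/* Assumption: 512 entries per PTB, 4 KiB pages, so one PTB spans 2 MiB. */
#define PTB_PAGES 512ULL

/* Round a page frame number up to the next PTB boundary,
 * the same arithmetic as (start + 0x1ff) & ~0x1ff. */
static uint64_t ptb_align_up(uint64_t pfn)
{
	return (pfn + (PTB_PAGES - 1)) & ~(PTB_PAGES - 1);
}

/* Round a page frame number down to a PTB boundary,
 * the same arithmetic as (last + 1) & ~0x1ff when passed last + 1. */
static uint64_t ptb_align_down(uint64_t pfn)
{
	return pfn & ~(PTB_PAGES - 1);
}

/* Number of PTBs that lie entirely inside the inclusive range
 * [start, last]. Partially covered PTBs are not counted. */
static uint64_t fully_covered_ptbs(uint64_t start, uint64_t last)
{
	uint64_t first = ptb_align_up(start);
	uint64_t end = ptb_align_down(last + 1);	/* exclusive */

	return end > first ? (end - first) / PTB_PAGES : 0;
}
```

With these assumptions a 2 MiB mapping that starts at page 0x100 covers
pages 0x100 to 0x2ff, straddles two PTBs and fully covers none of them, so
the rounded range is empty.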
>>>>>>
>>>>>> By the way, I am not familiar with the memory stuff. This patch is
>>>>>> the best I can do for now. Could you take a look at the Jira ticket
>>>>>> SWDEV-201443 and find a better solution? Thanks!
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Eric
>>>>>>
>>>>>> On 2019-10-30 1:57 p.m., Christian König wrote:
>>>>>>> One thing I've forgotten:
>>>>>>>
>>>>>>> What you could maybe do to improve the situation is to join
>>>>>>> adjacent ranges in amdgpu_vm_clear_freed(), but I'm not sure how
>>>>>>> likely it is that the ranges are freed all together.
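
A rough sketch of what joining adjacent ranges could look like, outside of
the driver. The struct and the function are hypothetical stand-ins, not the
real amdgpu_bo_va_mapping list handling, and the sketch assumes the freed
ranges are already sorted by start address.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a freed mapping: an inclusive page range. */
struct freed_range {
	uint64_t start;
	uint64_t last;
};

/* Merge touching or overlapping ranges in place.
 * The array must be sorted by start. Returns the new number of ranges. */
static size_t join_adjacent(struct freed_range *r, size_t n)
{
	size_t out = 0;
	size_t i;

	if (n == 0)
		return 0;

	for (i = 1; i < n; i++) {
		/* The first test avoids overflow in "last + 1". */
		if (r[out].last == UINT64_MAX ||
		    r[i].start <= r[out].last + 1) {
			/* Touching or overlapping: extend the current range. */
			if (r[i].last > r[out].last)
				r[out].last = r[i].last;
		} else {
			/* Gap: start a new output range. */
			r[++out] = r[i];
		}
	}

	return out + 1;
}
```

The point of joining is that two neighbouring 1 MiB ranges are each too
small to cover a whole 2 MiB page table, while the joined range may cover
one completely.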
>>>>>>>
>>>>>>> The only other alternative I can see would be to check the mappings
>>>>>>> of a range in amdgpu_vm_update_ptes() and see if you could walk up
>>>>>>> the tree if the valid flag is not set and there are no mappings
>>>>>>> left for a page table.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Christian.
>>>>>>>
>>>>>>> Am 30.10.19 um 18:42 schrieb Koenig, Christian:
>>>>>>>>> The valid flag doesn't take effect in this function.
>>>>>>>> That's irrelevant.
>>>>>>>>
>>>>>>>> See, what amdgpu_vm_update_ptes() does first is determine the
>>>>>>>> fragment size:
>>>>>>>>> amdgpu_vm_fragment(params, frag_start, end, flags, &frag, &frag_end);
>>>>>>>> Then we walk down the tree:
>>>>>>>>>            amdgpu_vm_pt_start(adev, params->vm, start, &cursor);
>>>>>>>>>            while (cursor.pfn < end) {
>>>>>>>> And make sure that the page tables covering the address range are
>>>>>>>> actually allocated:
>>>>>>>>>                    r = amdgpu_vm_alloc_pts(params->adev, params->vm,
>>>>>>>>>                                            &cursor);
>>>>>>>> Then we update the tables with the flags and addresses and free up
>>>>>>>> subsequent tables in the case of huge pages or freed up areas:
>>>>>>>>>                            /* Free all child entries */
>>>>>>>>>                            while (cursor.pfn < frag_start) {
>>>>>>>>>                                    amdgpu_vm_free_pts(adev, params->vm, &cursor);
>>>>>>>>>                                    amdgpu_vm_pt_next(adev, &cursor);
>>>>>>>>>                            }
>>>>>>>> This is the maximum you can free, because all other page tables are
>>>>>>>> not completely covered by the range and so potentially still in use.
>>>>>>>>
>>>>>>>> And I have the strong suspicion that this is what your patch is
>>>>>>>> actually doing wrong. In other words you are also freeing page
>>>>>>>> tables which are only partially covered by the range and so
>>>>>>>> potentially still in use.
>>>>>>>>
>>>>>>>> Since we don't have any tracking of how many entries in a page
>>>>>>>> table are currently valid and how many are invalid, we actually
>>>>>>>> can't implement what you are trying to do here. So the patch is
>>>>>>>> definitely somehow broken.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Christian.
>>>>>>>>
>>>>>>>> Am 30.10.19 um 17:55 schrieb Huang, JinHuiEric:
>>>>>>>>> The valid flag doesn't take effect in this function.
>>>>>>>>> amdgpu_vm_alloc_pts() is always executed, depending only on
>>>>>>>>> "cursor.pfn < end". The valid flag is only checked here, for
>>>>>>>>> ASICs below GMC v9:
>>>>>>>>>
>>>>>>>>> if (adev->asic_type < CHIP_VEGA10 &&
>>>>>>>>>                (flags & AMDGPU_PTE_VALID))...
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>>> Eric
>>>>>>>>>
>>>>>>>>> On 2019-10-30 12:30 p.m., Koenig, Christian wrote:
>>>>>>>>>> Am 30.10.2019 17:19 schrieb "Huang, JinHuiEric"
>>>>>>>>>> <JinHuiEric.Huang@amd.com>:
>>>>>>>>>>
>>>>>>>>>>        I tested it and it saves a lot of vram on the KFD big buffer
>>>>>>>>>>        stress test. I think there are two reasons:
>>>>>>>>>>
>>>>>>>>>>        1. Calling amdgpu_vm_update_ptes() during unmapping will
>>>>>>>>>>        allocate unnecessary pts, because there is no flag in
>>>>>>>>>>        amdgpu_vm_update_ptes() to determine whether the VA is being
>>>>>>>>>>        mapped or unmapped. This accounts for most of the memory saved.
>>>>>>>>>>
>>>>>>>>>> That's not correct. The valid flag is used for this.
>>>>>>>>>>
>>>>>>>>>>        2. Intentionally removing the pts that are being unmapped is
>>>>>>>>>>        the logically expected behavior, although it does not remove
>>>>>>>>>>        that many pts.
>>>>>>>>>>
>>>>>>>>>> Well I actually don't see a change to what update_ptes is doing
>>>>>>>>>> and have the strong suspicion that the patch is simply broken.
>>>>>>>>>>
>>>>>>>>>> You either free page tables which are potentially still in use
>>>>>>>>>> or update_pte doesn't free page tables when the valid bit is not
>>>>>>>>>> set.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Christian.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>        Regards,
>>>>>>>>>>
>>>>>>>>>>        Eric
>>>>>>>>>>
>>>>>>>>>>        On 2019-10-30 11:57 a.m., Koenig, Christian wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>            Am 30.10.2019 16:47 schrieb "Kuehling, Felix"
>>>>>>>>>>            <Felix.Kuehling@amd.com> <mailto:Felix.Kuehling@amd.com>:
>>>>>>>>>>
>>>>>>>>>>                On 2019-10-30 9:52 a.m., Christian König wrote:
>>>>>>>>>>                > Am 29.10.19 um 21:06 schrieb Huang, JinHuiEric:
>>>>>>>>>>                >> The issue is PT BOs are not freed when unmapping VA,
>>>>>>>>>>                >> which causes vram usage accumulated is huge in some
>>>>>>>>>>                >> memory stress test, such as kfd big buffer stress test.
>>>>>>>>>>                >> Function amdgpu_vm_bo_update_mapping() is called by both
>>>>>>>>>>                >> amdgpu_vm_bo_update() and amdgpu_vm_clear_freed(). The
>>>>>>>>>>                >> solution is replacing amdgpu_vm_bo_update_mapping() in
>>>>>>>>>>                >> amdgpu_vm_clear_freed() with removing PT BOs function
>>>>>>>>>>                >> to save vram usage.
>>>>>>>>>>                >
>>>>>>>>>>                > NAK, that is intentional behavior.
>>>>>>>>>>                >
>>>>>>>>>>                > Otherwise we can run into out of memory situations when
>>>>>>>>>>                > page tables need to be allocated again under stress.
>>>>>>>>>>
>>>>>>>>>>                That's a bit arbitrary and inconsistent. We are freeing
>>>>>>>>>>                page tables in other situations, when a mapping uses huge
>>>>>>>>>>                pages in amdgpu_vm_update_ptes. Why not when a mapping is
>>>>>>>>>>                destroyed completely?
>>>>>>>>>>
>>>>>>>>>>                I'm actually a bit surprised that the huge-page handling
>>>>>>>>>>                in amdgpu_vm_update_ptes isn't kicking in to free up
>>>>>>>>>>                lower-level page tables when a BO is unmapped.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>            Well it does free the lower level, and that is already
>>>>>>>>>>            causing problems (that's why I added the reserved space).
>>>>>>>>>>
>>>>>>>>>>            What we don't do is freeing the higher levels.
>>>>>>>>>>
>>>>>>>>>>            E.g. when you free a 2MB BO we free the lowest level, if
>>>>>>>>>>            we free a 1GB BO we free the two lowest levels etc...
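
The 2MB and 1GB figures follow from the table geometry. A minimal sketch,
assuming 4 KiB pages and 9 address bits (512 entries) per page table level.
Both values are assumptions here, since the real block size is configurable,
and the function names are made up for illustration.

```c
#include <stdint.h>

#define PAGE_SHIFT_4K	12u	/* assumption: 4 KiB pages */
#define BITS_PER_LEVEL	9u	/* assumption: 512 entries per table */

/* Bytes of address space covered by one table at the given level,
 * where level 0 is the lowest one (the PTB). */
static uint64_t table_span(unsigned level)
{
	return 1ULL << (PAGE_SHIFT_4K + BITS_PER_LEVEL * (level + 1));
}

/* How many of the lowest levels hold at least one table that a
 * naturally aligned range of the given size covers completely. */
static unsigned freeable_levels(uint64_t size, unsigned max_levels)
{
	unsigned n = 0;

	while (n < max_levels && table_span(n) <= size)
		n++;

	return n;
}
```

Under these assumptions table_span(0) is 2 MiB and table_span(1) is 1 GiB,
so an aligned 2 MiB range covers one lowest-level table and an aligned
1 GiB range covers tables on the two lowest levels.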
>>>>>>>>>>
>>>>>>>>>>            The problem with freeing the higher levels is that you
>>>>>>>>>>            don't know who is also using this. E.g. we would need to
>>>>>>>>>>            check all entries when we unmap one.
>>>>>>>>>>
>>>>>>>>>>            It's simply not worth it for a maximum saving of 2MB per VM.
>>>>>>>>>>
>>>>>>>>>>            Writing this I'm actually wondering how you ended up in
>>>>>>>>>>            this issue? There shouldn't be much savings from this.
>>>>>>>>>>
>>>>>>>>>>            Regards,
>>>>>>>>>>            Christian.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>                Regards,
>>>>>>>>>>                   Felix
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>                >
>>>>>>>>>>                > Regards,
>>>>>>>>>>                > Christian.
>>>>>>>>>>                >
>>>>>>>>>>                >>
>>>>>>>>>>                >> Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
>>>>>>>>>>                >> Signed-off-by: Eric Huang
>>>>>>>>>>                <JinhuiEric.Huang@amd.com>
>>>>>>>>>>                <mailto:JinhuiEric.Huang@amd.com>
>>>>>>>>>>                >> ---
>>>>>>>>>>                >> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56
>>>>>>>>>>                >> +++++++++++++++++++++++++++++-----
>>>>>>>>>>                >>   1 file changed, 48 insertions(+), 8 deletions(-)
>>>>>>>>>>                >>
>>>>>>>>>>                >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>>>                >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>>>                >> index 0f4c3b2..8a480c7 100644
>>>>>>>>>>                >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>>>                >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>>>                >> @@ -1930,6 +1930,51 @@ static void
>>>>>>>>>>                amdgpu_vm_prt_fini(struct
>>>>>>>>>>                >> amdgpu_device *adev, struct amdgpu_vm *vm)
>>>>>>>>>>                >>   }
>>>>>>>>>>                >>     /**
>>>>>>>>>>                >> + * amdgpu_vm_remove_ptes - free PT BOs
>>>>>>>>>>                >> + *
>>>>>>>>>>                >> + * @adev: amdgpu device structure
>>>>>>>>>>                >> + * @vm: amdgpu vm structure
>>>>>>>>>>                >> + * @start: start of mapped range
>>>>>>>>>>                >> + * @end: end of mapped entry
>>>>>>>>>>                >> + *
>>>>>>>>>>                >> + * Free the page table level.
>>>>>>>>>>                >> + */
>>>>>>>>>>                >> +static int amdgpu_vm_remove_ptes(struct
>>>>>>>>>>                amdgpu_device *adev,
>>>>>>>>>>                >> + struct amdgpu_vm *vm, uint64_t start, uint64_t end)
>>>>>>>>>>                >> +{
>>>>>>>>>>                >> +    struct amdgpu_vm_pt_cursor cursor;
>>>>>>>>>>                >> +    unsigned shift, num_entries;
>>>>>>>>>>                >> +
>>>>>>>>>>                >> + amdgpu_vm_pt_start(adev, vm, start, &cursor);
>>>>>>>>>>                >> +    while (cursor.level < AMDGPU_VM_PTB) {
>>>>>>>>>>                >> +        if (!amdgpu_vm_pt_descendant(adev, &cursor))
>>>>>>>>>>                >> + return -ENOENT;
>>>>>>>>>>                >> +    }
>>>>>>>>>>                >> +
>>>>>>>>>>                >> +    while (cursor.pfn < end) {
>>>>>>>>>>                >> + amdgpu_vm_free_table(cursor.entry);
>>>>>>>>>>                >> + num_entries = amdgpu_vm_num_entries(adev,
>>>>>>>>>>                cursor.level - 1);
>>>>>>>>>>                >> +
>>>>>>>>>>                >> +        if (cursor.entry !=
>>>>>>>>>>                &cursor.parent->entries[num_entries - 1]) {
>>>>>>>>>>                >> + /* Next ptb entry */
>>>>>>>>>>                >> + shift = amdgpu_vm_level_shift(adev,
>>>>>>>>>>                cursor.level - 1);
>>>>>>>>>>                >> + cursor.pfn += 1ULL << shift;
>>>>>>>>>>                >> + cursor.pfn &= ~((1ULL << shift) - 1);
>>>>>>>>>>                >> + cursor.entry++;
>>>>>>>>>>                >> +        } else {
>>>>>>>>>>                >> + /* Next ptb entry in next pd0 entry */
>>>>>>>>>>                >> + amdgpu_vm_pt_ancestor(&cursor);
>>>>>>>>>>                >> + shift = amdgpu_vm_level_shift(adev,
>>>>>>>>>>                cursor.level - 1);
>>>>>>>>>>                >> + cursor.pfn += 1ULL << shift;
>>>>>>>>>>                >> + cursor.pfn &= ~((1ULL << shift) - 1);
>>>>>>>>>>                >> + amdgpu_vm_pt_descendant(adev, &cursor);
>>>>>>>>>>                >> +        }
>>>>>>>>>>                >> +    }
>>>>>>>>>>                >> +
>>>>>>>>>>                >> +    return 0;
>>>>>>>>>>                >> +}
>>>>>>>>>>                >> +
>>>>>>>>>>                >> +/**
>>>>>>>>>>                >>    * amdgpu_vm_clear_freed - clear freed BOs in
>>>>>>>>>>                the PT
>>>>>>>>>>                >>    *
>>>>>>>>>>                >>    * @adev: amdgpu_device pointer
>>>>>>>>>>                >> @@ -1949,7 +1994,6 @@ int
>>>>>>>>>>                amdgpu_vm_clear_freed(struct amdgpu_device
>>>>>>>>>>                >> *adev,
>>>>>>>>>>                >>                 struct dma_fence **fence)
>>>>>>>>>>                >>   {
>>>>>>>>>>                >>       struct amdgpu_bo_va_mapping *mapping;
>>>>>>>>>>                >> -    uint64_t init_pte_value = 0;
>>>>>>>>>>                >>       struct dma_fence *f = NULL;
>>>>>>>>>>                >>       int r;
>>>>>>>>>>                >>   @@ -1958,13 +2002,10 @@ int
>>>>>>>>>>                amdgpu_vm_clear_freed(struct
>>>>>>>>>>                >> amdgpu_device *adev,
>>>>>>>>>>                >> struct amdgpu_bo_va_mapping, list);
>>>>>>>>>>                >> list_del(&mapping->list);
>>>>>>>>>>                >>   -        if (vm->pte_support_ats &&
>>>>>>>>>>                >> - mapping->start < AMDGPU_GMC_HOLE_START)
>>>>>>>>>>                >> - init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
>>>>>>>>>>                >> +        r = amdgpu_vm_remove_ptes(adev, vm,
>>>>>>>>>>                >> + (mapping->start + 0x1ff) & (~0x1ffll),
>>>>>>>>>>                >> + (mapping->last + 1) & (~0x1ffll));
>>>>>>>>>>                >>   -        r = amdgpu_vm_bo_update_mapping(adev,
>>>>>>>>>>                vm, false, NULL,
>>>>>>>>>>                >> - mapping->start, mapping->last,
>>>>>>>>>>                >> - init_pte_value, 0, NULL, &f);
>>>>>>>>>>                >> amdgpu_vm_free_mapping(adev, vm, mapping, f);
>>>>>>>>>>                >>           if (r) {
>>>>>>>>>>                >> dma_fence_put(f);
>>>>>>>>>>                >> @@ -1980,7 +2021,6 @@ int
>>>>>>>>>>                amdgpu_vm_clear_freed(struct amdgpu_device
>>>>>>>>>>                >> *adev,
>>>>>>>>>>                >>       }
>>>>>>>>>>                >> return 0;
>>>>>>>>>>                >> -
>>>>>>>>>>                >>   }
>>>>>>>>>>                >>     /**
>>>>>>>>>>                >
>>>>>>>>>>                > _______________________________________________
>>>>>>>>>>                > amd-gfx mailing list
>>>>>>>>>>                > amd-gfx@lists.freedesktop.org
>>>>>>>>>>                <mailto:amd-gfx@lists.freedesktop.org>
>>>>>>>>>>                > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] drm/amdgpu: remove PT BOs when unmapping
@ 2019-11-05 18:51                                         ` Koenig, Christian
  0 siblings, 0 replies; 32+ messages in thread
From: Koenig, Christian @ 2019-11-05 18:51 UTC (permalink / raw)
  To: Huang, JinHuiEric; +Cc: amd-gfx

Hi Eric,

Ah! Yeah, that is a well-known issue.

The basic problem is that for releasing the BOs we need to reserve them to
check if they are idle or not.

I've got a branch with a TTM change to avoid that, but essentially that
is a huge problem which needs a rather big change in memory management
to fix.

Regards,
Christian.

Am 05.11.19 um 17:27 schrieb Huang, JinHuiEric:
> Hi Christian,
>
> I found the reason why page tables are not freed when unmapping. All the
> pts are reserved, so they are not freed until vm fini. As a
> consequence, old pts and new pts for the same VAs exist until vm
> fini. In the KFD big buffer stress test, repeatedly mapping and
> unmapping a big range of system memory causes a huge accumulation of
> vram pts usage.
>
> I tried to avoid generating duplicated pts during unmapping in
> amdgpu_vm_update_ptes() by skipping amdgpu_vm_free_pts() and by not
> reserving the lowest pts, but neither worked: both ended in VM faults.
> The only way that works is skipping the whole function
> amdgpu_vm_update_ptes(), but that seems wrong, because we have to
> update the GPU VM MMU.
>
> So there is no bug in amdgpu_vm_update_ptes(), but the accumulated
> vram usage of pts is an overhead. What do you think we can do to get
> a better solution?
>
> Regards,
>
> Eric
>
> On 2019-10-31 10:33 a.m., Huang, JinHuiEric wrote:
>> The hardware is vega10 and test is KFDMemoryTest.BigBufferStressTest.
>> More detail is on Jira SWDEV-201443.
>>
>> Regards,
>>
>> Eric
>>
>> On 2019-10-31 10:08 a.m., StDenis, Tom wrote:
>>> I could try it on my carrizo/polaris setup.  Is there a test procedure I
>>> could follow to trigger the changed code paths?
>>>
>>>
>>> Tom
>>>
>>> On 2019-10-31 6:41 a.m., Koenig, Christian wrote:
>>>> Just tested this and amdgpu_vm_update_ptes() indeed works as expected.
>>>>
>>>> When you free at least 2MB, the lowest level of page tables is freed
>>>> up again.
>>>>
>>>> BTW: What hardware have you tested this on? On gfx8 and older it is
>>>> expected that page tables are never freed.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>> Am 30.10.19 um 19:11 schrieb Christian König:
>>>>> Then I don't see how this patch actually changes anything.
>>>>>
>>>>> Could only be a bug in amdgpu_vm_update_ptes(). Going to investigate
>>>>> this, but I won't have time to look into the ticket in detail.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>> Am 30.10.19 um 19:00 schrieb Huang, JinHuiEric:
>>>>>> Actually I do prevent the removal of in-use pts with this:
>>>>>>
>>>>>> +               r = amdgpu_vm_remove_ptes(adev, vm,
>>>>>> +                               (mapping->start + 0x1ff) & (~0x1ffll),
>>>>>> +                               (mapping->last + 1) & (~0x1ffll));
>>>>>>
>>>>>> This only removes page tables that are aligned to 2M. And I have
>>>>>> tested it at least on the KFD tests without anything breaking.
>>>>>>
>>>>>> By the way, I am not familiar with the memory stuff. This patch is
>>>>>> the best I can do for now. Could you take a look at the Jira ticket
>>>>>> SWDEV-201443 and find a better solution? Thanks!
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Eric
>>>>>>
>>>>>> On 2019-10-30 1:57 p.m., Christian König wrote:
>>>>>>> One thing I've forgotten:
>>>>>>>
>>>>>>> What you could maybe do to improve the situation is to join
>>>>>>> adjacent ranges in amdgpu_vm_clear_freed(), but I'm not sure how
>>>>>>> likely it is that the ranges are freed all together.
>>>>>>>
>>>>>>> The only other alternative I can see would be to check the mappings
>>>>>>> of a range in amdgpu_vm_update_ptes() and see if you could walk up
>>>>>>> the tree if the valid flag is not set and there are no mappings
>>>>>>> left for a page table.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Christian.
>>>>>>>
>>>>>>> Am 30.10.19 um 18:42 schrieb Koenig, Christian:
>>>>>>>>> The valid flag doesn't take effect in this function.
>>>>>>>> That's irrelevant.
>>>>>>>>
>>>>>>>> See, what amdgpu_vm_update_ptes() does first is determine the
>>>>>>>> fragment size:
>>>>>>>>> amdgpu_vm_fragment(params, frag_start, end, flags, &frag, &frag_end);
>>>>>>>> Then we walk down the tree:
>>>>>>>>>            amdgpu_vm_pt_start(adev, params->vm, start, &cursor);
>>>>>>>>>            while (cursor.pfn < end) {
>>>>>>>> And make sure that the page tables covering the address range are
>>>>>>>> actually allocated:
>>>>>>>>>                    r = amdgpu_vm_alloc_pts(params->adev, params->vm,
>>>>>>>>>                                            &cursor);
>>>>>>>> Then we update the tables with the flags and addresses and free up
>>>>>>>> subsequent tables in the case of huge pages or freed up areas:
>>>>>>>>>                            /* Free all child entries */
>>>>>>>>>                            while (cursor.pfn < frag_start) {
>>>>>>>>>                                    amdgpu_vm_free_pts(adev, params->vm, &cursor);
>>>>>>>>>                                    amdgpu_vm_pt_next(adev, &cursor);
>>>>>>>>>                            }
>>>>>>>> This is the maximum you can free, because all other page tables
>>>>>>>> are not completely covered by the range and so are potentially
>>>>>>>> still in use.
>>>>>>>>
>>>>>>>> And I have the strong suspicion that this is what your patch is
>>>>>>>> actually doing wrong. In other words you are also freeing page
>>>>>>>> tables which are only partially covered by the range and so
>>>>>>>> potentially still in use.
>>>>>>>>
>>>>>>>> Since we don't track how many entries in a page table are
>>>>>>>> currently valid and how many are invalid, we actually can't
>>>>>>>> implement what you are trying to do here. So the patch is
>>>>>>>> definitely broken somehow.
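
To make the missing bookkeeping concrete, here is a toy model of the per-table tracking that the quoted paragraph says does not exist. Everything in it (struct toy_pt, the valid counter, the valid bit and the helper names) is hypothetical; it only illustrates why a table could be released safely once it is known to hold no valid entries, and it is not how amdgpu_vm.c works.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_ENTRIES   512u
#define TOY_PTE_VALID 1ULL	/* hypothetical valid bit */

struct toy_pt {
	uint64_t pte[TOY_ENTRIES];
	unsigned int valid;	/* number of entries with the valid bit set */
};

static void toy_pt_init(struct toy_pt *pt)
{
	memset(pt, 0, sizeof(*pt));
}

/* Write one entry and keep the counter in sync with the valid bit. */
static void toy_pt_set(struct toy_pt *pt, unsigned int idx, uint64_t value)
{
	assert(idx < TOY_ENTRIES);
	if (pt->pte[idx] & TOY_PTE_VALID)
		pt->valid--;
	if (value & TOY_PTE_VALID)
		pt->valid++;
	pt->pte[idx] = value;
}

/* A table may only be freed when no entry in it is still valid. */
static bool toy_pt_can_free(const struct toy_pt *pt)
{
	return pt->valid == 0;
}
```

In this model, unmapping one of two ranges that share a table leaves the counter non-zero, so the table stays; only the last unmap brings it back to zero.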
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Christian.
>>>>>>>>
>>>>>>>> Am 30.10.19 um 17:55 schrieb Huang, JinHuiEric:
>>>>>>>>> The valid flag doesn't take effect in this function.
>>>>>>>>> amdgpu_vm_alloc_pts() is always executed; it depends only on
>>>>>>>>> "cursor.pfn < end". The valid flag is only checked here, for
>>>>>>>>> ASICs below GMC v9:
>>>>>>>>>
>>>>>>>>> if (adev->asic_type < CHIP_VEGA10 &&
>>>>>>>>>                (flags & AMDGPU_PTE_VALID))...
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>>> Eric
>>>>>>>>>
>>>>>>>>> On 2019-10-30 12:30 p.m., Koenig, Christian wrote:
>>>>>>>>>> Am 30.10.2019 17:19 schrieb "Huang, JinHuiEric"
>>>>>>>>>> <JinHuiEric.Huang@amd.com>:
>>>>>>>>>>
>>>>>>>>>>        I tested it, and it saves a lot of VRAM in the KFD big
>>>>>>>>>>        buffer stress test. I think there are two reasons:
>>>>>>>>>>
>>>>>>>>>>        1. Calling amdgpu_vm_update_ptes() during unmapping
>>>>>>>>>>        allocates unnecessary PTs, because the function has no flag
>>>>>>>>>>        to tell whether the VA is being mapped or unmapped.
>>>>>>>>>>        Avoiding this saves most of the memory.
>>>>>>>>>>
>>>>>>>>>> That's not correct. The valid flag is used for this.
>>>>>>>>>>
>>>>>>>>>>        2. Intentionally removing the PTs of the unmapped range is
>>>>>>>>>>        the logically expected behavior, although it does not
>>>>>>>>>>        remove that many PTs.
>>>>>>>>>>
>>>>>>>>>> Well I actually don't see a change to what update_ptes is doing
>>>>>>>>>> and have the strong suspicion that the patch is simply broken.
>>>>>>>>>>
>>>>>>>>>> You either free page tables which are potentially still in use,
>>>>>>>>>> or update_pte doesn't free page tables when the valid bit is not
>>>>>>>>>> set.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Christian.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>        Regards,
>>>>>>>>>>
>>>>>>>>>>        Eric
>>>>>>>>>>
>>>>>>>>>>        On 2019-10-30 11:57 a.m., Koenig, Christian wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>            Am 30.10.2019 16:47 schrieb "Kuehling, Felix"
>>>>>>>>>>            <Felix.Kuehling@amd.com>:
>>>>>>>>>>
>>>>>>>>>>                On 2019-10-30 9:52 a.m., Christian König wrote:
>>>>>>>>>>                > Am 29.10.19 um 21:06 schrieb Huang, JinHuiEric:
>>>>>>>>>>                >> The issue is PT BOs are not freed when unmapping VA,
>>>>>>>>>>                >> which causes vram usage accumulated is huge in some
>>>>>>>>>>                >> memory stress test, such as kfd big buffer stress
>>>>>>>>>>                test.
>>>>>>>>>>                >> Function amdgpu_vm_bo_update_mapping() is called
>>>>>>>>>>                by both
>>>>>>>>>>                >> amdgpu_vm_bo_update() and
>>>>>>>>>>                amdgpu_vm_clear_freed(). The
>>>>>>>>>>                >> solution is replacing
>>>>>>>>>>                amdgpu_vm_bo_update_mapping() in
>>>>>>>>>>                >> amdgpu_vm_clear_freed() with removing PT BOs function
>>>>>>>>>>                >> to save vram usage.
>>>>>>>>>>                >
>>>>>>>>>>                > NAK, that is intentional behavior.
>>>>>>>>>>                >
>>>>>>>>>>                > Otherwise we can run into out of memory situations
>>>>>>>>>>                when page tables
>>>>>>>>>>                > need to be allocated again under stress.
>>>>>>>>>>
>>>>>>>>>>                That's a bit arbitrary and inconsistent. We are
>>>>>>>>>>                freeing page tables in
>>>>>>>>>>                other situations, when a mapping uses huge pages in
>>>>>>>>>>                amdgpu_vm_update_ptes. Why not when a mapping is
>>>>>>>>>>                destroyed completely?
>>>>>>>>>>
>>>>>>>>>>                I'm actually a bit surprised that the huge-page
>>>>>>>>>>                handling in
>>>>>>>>>>                amdgpu_vm_update_ptes isn't kicking in to free up
>>>>>>>>>>                lower-level page
>>>>>>>>>>                tables when a BO is unmapped.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>            Well it does free the lower level, and that is already
>>>>>>>>>>            causing problems (that's why I added the reserved space).
>>>>>>>>>>
>>>>>>>>>>            What we don't do is free the higher levels.
>>>>>>>>>>
>>>>>>>>>>            E.g. when you free a 2MB BO we free the lowest level, if
>>>>>>>>>>            we free a 1GB BO we free the two lowest levels etc...
>>>>>>>>>>
>>>>>>>>>>            The problem with freeing the higher levels is that you
>>>>>>>>>>            don't know who else is using them. E.g. we would need to
>>>>>>>>>>            check all entries when we unmap one.
>>>>>>>>>>
>>>>>>>>>>            It's simply not worth it for a maximum saving of 2MB per VM.
>>>>>>>>>>
>>>>>>>>>>            Writing this, I'm actually wondering how you ran into
>>>>>>>>>>            this issue. There shouldn't be much savings from this.
>>>>>>>>>>
>>>>>>>>>>            Regards,
>>>>>>>>>>            Christian.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>                Regards,
>>>>>>>>>>                   Felix
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>                >
>>>>>>>>>>                > Regards,
>>>>>>>>>>                > Christian.
>>>>>>>>>>                >
>>>>>>>>>>                >>
>>>>>>>>>>                >> Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
>>>>>>>>>>                >> Signed-off-by: Eric Huang
>>>>>>>>>>                <JinhuiEric.Huang@amd.com>
>>>>>>>>>>                >> ---
>>>>>>>>>>                >> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56
>>>>>>>>>>                >> +++++++++++++++++++++++++++++-----
>>>>>>>>>>                >>   1 file changed, 48 insertions(+), 8 deletions(-)
>>>>>>>>>>                >>
>>>>>>>>>>                >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>>>                >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>>>                >> index 0f4c3b2..8a480c7 100644
>>>>>>>>>>                >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>>>                >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>>>                >> @@ -1930,6 +1930,51 @@ static void
>>>>>>>>>>                amdgpu_vm_prt_fini(struct
>>>>>>>>>>                >> amdgpu_device *adev, struct amdgpu_vm *vm)
>>>>>>>>>>                >>   }
>>>>>>>>>>                >>     /**
>>>>>>>>>>                >> + * amdgpu_vm_remove_ptes - free PT BOs
>>>>>>>>>>                >> + *
>>>>>>>>>>                >> + * @adev: amdgpu device structure
>>>>>>>>>>                >> + * @vm: amdgpu vm structure
>>>>>>>>>>                >> + * @start: start of mapped range
>>>>>>>>>>                >> + * @end: end of mapped entry
>>>>>>>>>>                >> + *
>>>>>>>>>>                >> + * Free the page table level.
>>>>>>>>>>                >> + */
>>>>>>>>>>                >> +static int amdgpu_vm_remove_ptes(struct
>>>>>>>>>>                amdgpu_device *adev,
>>>>>>>>>>                >> + struct amdgpu_vm *vm, uint64_t start, uint64_t end)
>>>>>>>>>>                >> +{
>>>>>>>>>>                >> +    struct amdgpu_vm_pt_cursor cursor;
>>>>>>>>>>                >> +    unsigned shift, num_entries;
>>>>>>>>>>                >> +
>>>>>>>>>>                >> + amdgpu_vm_pt_start(adev, vm, start, &cursor);
>>>>>>>>>>                >> +    while (cursor.level < AMDGPU_VM_PTB) {
>>>>>>>>>>                >> +        if (!amdgpu_vm_pt_descendant(adev, &cursor))
>>>>>>>>>>                >> + return -ENOENT;
>>>>>>>>>>                >> +    }
>>>>>>>>>>                >> +
>>>>>>>>>>                >> +    while (cursor.pfn < end) {
>>>>>>>>>>                >> + amdgpu_vm_free_table(cursor.entry);
>>>>>>>>>>                >> + num_entries = amdgpu_vm_num_entries(adev,
>>>>>>>>>>                cursor.level - 1);
>>>>>>>>>>                >> +
>>>>>>>>>>                >> +        if (cursor.entry !=
>>>>>>>>>>                &cursor.parent->entries[num_entries - 1]) {
>>>>>>>>>>                >> + /* Next ptb entry */
>>>>>>>>>>                >> + shift = amdgpu_vm_level_shift(adev,
>>>>>>>>>>                cursor.level - 1);
>>>>>>>>>>                >> + cursor.pfn += 1ULL << shift;
>>>>>>>>>>                >> + cursor.pfn &= ~((1ULL << shift) - 1);
>>>>>>>>>>                >> + cursor.entry++;
>>>>>>>>>>                >> +        } else {
>>>>>>>>>>                >> + /* Next ptb entry in next pd0 entry */
>>>>>>>>>>                >> + amdgpu_vm_pt_ancestor(&cursor);
>>>>>>>>>>                >> + shift = amdgpu_vm_level_shift(adev,
>>>>>>>>>>                cursor.level - 1);
>>>>>>>>>>                >> + cursor.pfn += 1ULL << shift;
>>>>>>>>>>                >> + cursor.pfn &= ~((1ULL << shift) - 1);
>>>>>>>>>>                >> + amdgpu_vm_pt_descendant(adev, &cursor);
>>>>>>>>>>                >> +        }
>>>>>>>>>>                >> +    }
>>>>>>>>>>                >> +
>>>>>>>>>>                >> +    return 0;
>>>>>>>>>>                >> +}
>>>>>>>>>>                >> +
>>>>>>>>>>                >> +/**
>>>>>>>>>>                >>    * amdgpu_vm_clear_freed - clear freed BOs in
>>>>>>>>>>                the PT
>>>>>>>>>>                >>    *
>>>>>>>>>>                >>    * @adev: amdgpu_device pointer
>>>>>>>>>>                >> @@ -1949,7 +1994,6 @@ int
>>>>>>>>>>                amdgpu_vm_clear_freed(struct amdgpu_device
>>>>>>>>>>                >> *adev,
>>>>>>>>>>                >>                 struct dma_fence **fence)
>>>>>>>>>>                >>   {
>>>>>>>>>>                >>       struct amdgpu_bo_va_mapping *mapping;
>>>>>>>>>>                >> -    uint64_t init_pte_value = 0;
>>>>>>>>>>                >>       struct dma_fence *f = NULL;
>>>>>>>>>>                >>       int r;
>>>>>>>>>>                >>   @@ -1958,13 +2002,10 @@ int
>>>>>>>>>>                amdgpu_vm_clear_freed(struct
>>>>>>>>>>                >> amdgpu_device *adev,
>>>>>>>>>>                >> struct amdgpu_bo_va_mapping, list);
>>>>>>>>>>                >> list_del(&mapping->list);
>>>>>>>>>>                >>   -        if (vm->pte_support_ats &&
>>>>>>>>>>                >> - mapping->start < AMDGPU_GMC_HOLE_START)
>>>>>>>>>>                >> - init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
>>>>>>>>>>                >> +        r = amdgpu_vm_remove_ptes(adev, vm,
>>>>>>>>>>                >> + (mapping->start + 0x1ff) & (~0x1ffll),
>>>>>>>>>>                >> + (mapping->last + 1) & (~0x1ffll));
>>>>>>>>>>                >>   -        r = amdgpu_vm_bo_update_mapping(adev,
>>>>>>>>>>                vm, false, NULL,
>>>>>>>>>>                >> - mapping->start, mapping->last,
>>>>>>>>>>                >> - init_pte_value, 0, NULL, &f);
>>>>>>>>>>                >> amdgpu_vm_free_mapping(adev, vm, mapping, f);
>>>>>>>>>>                >>           if (r) {
>>>>>>>>>>                >> dma_fence_put(f);
>>>>>>>>>>                >> @@ -1980,7 +2021,6 @@ int
>>>>>>>>>>                amdgpu_vm_clear_freed(struct amdgpu_device
>>>>>>>>>>                >> *adev,
>>>>>>>>>>                >>       }
>>>>>>>>>>                >> return 0;
>>>>>>>>>>                >> -
>>>>>>>>>>                >>   }
>>>>>>>>>>                >>     /**
>>>>>>>>>>                >
>>>>>>>>>>                > _______________________________________________
>>>>>>>>>>                > amd-gfx mailing list
>>>>>>>>>>                > amd-gfx@lists.freedesktop.org
>>>>>>>>>>                > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> amd-gfx mailing list
>>>>>>>> amd-gfx@lists.freedesktop.org
>>>>>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] drm/amdgpu: remove PT BOs when unmapping
@ 2019-10-30 16:19     ` Huang, JinHuiEric
  0 siblings, 0 replies; 32+ messages in thread
From: Huang, JinHuiEric @ 2019-10-30 16:19 UTC (permalink / raw)
  To: Koenig, Christian, Kuehling, Felix
  Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW


[-- Attachment #1.1: Type: text/plain, Size: 6196 bytes --]

I tested it, and it saves a lot of VRAM in the KFD big buffer stress test. I think there are two reasons:

1. Calling amdgpu_vm_update_ptes() during unmapping allocates unnecessary PTs, because the function has no flag to tell whether the VA is being mapped or unmapped. Avoiding this saves most of the memory.

2. Intentionally removing the PTs of the unmapped range is the logically expected behavior, although it does not remove that many PTs.

Regards,

Eric

On 2019-10-30 11:57 a.m., Koenig, Christian wrote:


Am 30.10.2019 16:47 schrieb "Kuehling, Felix" <Felix.Kuehling@amd.com>:
On 2019-10-30 9:52 a.m., Christian König wrote:
> Am 29.10.19 um 21:06 schrieb Huang, JinHuiEric:
>> The issue is that PT BOs are not freed when unmapping a VA,
>> which causes accumulated VRAM usage to become huge in some
>> memory stress tests, such as the KFD big buffer stress test.
>> Function amdgpu_vm_bo_update_mapping() is called by both
>> amdgpu_vm_bo_update() and amdgpu_vm_clear_freed(). The
>> solution is to replace amdgpu_vm_bo_update_mapping() in
>> amdgpu_vm_clear_freed() with a function that removes PT BOs
>> to save VRAM.
>
> NAK, that is intentional behavior.
>
> Otherwise we can run into out of memory situations when page tables
> need to be allocated again under stress.

That's a bit arbitrary and inconsistent. We are freeing page tables in
other situations, when a mapping uses huge pages in
amdgpu_vm_update_ptes. Why not when a mapping is destroyed completely?

I'm actually a bit surprised that the huge-page handling in
amdgpu_vm_update_ptes isn't kicking in to free up lower-level page
tables when a BO is unmapped.

Well it does free the lower level, and that is already causing problems (that's why I added the reserved space).

What we don't do is free the higher levels.

E.g. when you free a 2MB BO we free the lowest level, if we free a 1GB BO we free the two lowest levels etc...

The problem with freeing the higher levels is that you don't know who else is using them. E.g. we would need to check all entries when we unmap one.

It's simply not worth it for a maximum saving of 2MB per VM.

Writing this, I'm actually wondering how you ran into this issue. There shouldn't be much savings from this.

Regards,
Christian.


Regards,
   Felix


>
> Regards,
> Christian.
>
>>
>> Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
>> Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56
>> +++++++++++++++++++++++++++++-----
>>   1 file changed, 48 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 0f4c3b2..8a480c7 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -1930,6 +1930,51 @@ static void amdgpu_vm_prt_fini(struct
>> amdgpu_device *adev, struct amdgpu_vm *vm)
>>   }
>>     /**
>> + * amdgpu_vm_remove_ptes - free PT BOs
>> + *
>> + * @adev: amdgpu device structure
>> + * @vm: amdgpu vm structure
>> + * @start: start of mapped range
>> + * @end: end of mapped entry
>> + *
>> + * Free the page table level.
>> + */
>> +static int amdgpu_vm_remove_ptes(struct amdgpu_device *adev,
>> +        struct amdgpu_vm *vm, uint64_t start, uint64_t end)
>> +{
>> +    struct amdgpu_vm_pt_cursor cursor;
>> +    unsigned shift, num_entries;
>> +
>> +    amdgpu_vm_pt_start(adev, vm, start, &cursor);
>> +    while (cursor.level < AMDGPU_VM_PTB) {
>> +        if (!amdgpu_vm_pt_descendant(adev, &cursor))
>> +            return -ENOENT;
>> +    }
>> +
>> +    while (cursor.pfn < end) {
>> +        amdgpu_vm_free_table(cursor.entry);
>> +        num_entries = amdgpu_vm_num_entries(adev, cursor.level - 1);
>> +
>> +        if (cursor.entry != &cursor.parent->entries[num_entries - 1]) {
>> +            /* Next ptb entry */
>> +            shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>> +            cursor.pfn += 1ULL << shift;
>> +            cursor.pfn &= ~((1ULL << shift) - 1);
>> +            cursor.entry++;
>> +        } else {
>> +            /* Next ptb entry in next pd0 entry */
>> +            amdgpu_vm_pt_ancestor(&cursor);
>> +            shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>> +            cursor.pfn += 1ULL << shift;
>> +            cursor.pfn &= ~((1ULL << shift) - 1);
>> +            amdgpu_vm_pt_descendant(adev, &cursor);
>> +        }
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +/**
>>    * amdgpu_vm_clear_freed - clear freed BOs in the PT
>>    *
>>    * @adev: amdgpu_device pointer
>> @@ -1949,7 +1994,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device
>> *adev,
>>                 struct dma_fence **fence)
>>   {
>>       struct amdgpu_bo_va_mapping *mapping;
>> -    uint64_t init_pte_value = 0;
>>       struct dma_fence *f = NULL;
>>       int r;
>>   @@ -1958,13 +2002,10 @@ int amdgpu_vm_clear_freed(struct
>> amdgpu_device *adev,
>>               struct amdgpu_bo_va_mapping, list);
>>           list_del(&mapping->list);
>>   -        if (vm->pte_support_ats &&
>> -            mapping->start < AMDGPU_GMC_HOLE_START)
>> -            init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
>> +        r = amdgpu_vm_remove_ptes(adev, vm,
>> +                (mapping->start + 0x1ff) & (~0x1ffll),
>> +                (mapping->last + 1) & (~0x1ffll));
>>   -        r = amdgpu_vm_bo_update_mapping(adev, vm, false, NULL,
>> -                        mapping->start, mapping->last,
>> -                        init_pte_value, 0, NULL, &f);
>>           amdgpu_vm_free_mapping(adev, vm, mapping, f);
>>           if (r) {
>>               dma_fence_put(f);
>> @@ -1980,7 +2021,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device
>> *adev,
>>       }
>>         return 0;
>> -
>>   }
>>     /**
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] drm/amdgpu: remove PT BOs when unmapping
@ 2019-10-30 15:57 ` Koenig, Christian
  0 siblings, 0 replies; 32+ messages in thread
From: Koenig, Christian @ 2019-10-30 15:57 UTC (permalink / raw)
  To: Kuehling, Felix
  Cc: Huang, JinHuiEric, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW


[-- Attachment #1.1: Type: text/plain, Size: 5578 bytes --]



Am 30.10.2019 16:47 schrieb "Kuehling, Felix" <Felix.Kuehling@amd.com>:
On 2019-10-30 9:52 a.m., Christian König wrote:
> Am 29.10.19 um 21:06 schrieb Huang, JinHuiEric:
>> The issue is that PT BOs are not freed when unmapping a VA,
>> which causes accumulated VRAM usage to become huge in some
>> memory stress tests, such as the KFD big buffer stress test.
>> Function amdgpu_vm_bo_update_mapping() is called by both
>> amdgpu_vm_bo_update() and amdgpu_vm_clear_freed(). The
>> solution is to replace amdgpu_vm_bo_update_mapping() in
>> amdgpu_vm_clear_freed() with a function that removes PT BOs
>> to save VRAM.
>
> NAK, that is intentional behavior.
>
> Otherwise we can run into out of memory situations when page tables
> need to be allocated again under stress.

That's a bit arbitrary and inconsistent. We are freeing page tables in
other situations, when a mapping uses huge pages in
amdgpu_vm_update_ptes. Why not when a mapping is destroyed completely?

I'm actually a bit surprised that the huge-page handling in
amdgpu_vm_update_ptes isn't kicking in to free up lower-level page
tables when a BO is unmapped.

Well it does free the lower level, and that is already causing problems (that's why I added the reserved space).

What we don't do is free the higher levels.

E.g. when you free a 2MB BO we free the lowest level, if we free a 1GB BO we free the two lowest levels etc...
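
As a rough illustration of the 2MB and 1GB cases, the sketch below counts how many of the lowest page-table levels a naturally aligned BO spans completely. It assumes 4 KiB pages and 512 entries per table at every level, which matches the sizes quoted here but is an assumption about the configuration; the function and macro names are invented.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_ASSUMED 12u	/* assumption: 4 KiB pages */
#define LEVEL_BITS_ASSUMED 9u	/* assumption: 512 entries per table */

/*
 * Return how many of the lowest levels are spanned completely by a
 * naturally aligned BO of the given size: 1 once 2 MiB is covered,
 * 2 once 1 GiB is covered, and so on, capped at max_levels.
 */
static unsigned int levels_fully_covered(uint64_t size_bytes,
					 unsigned int max_levels)
{
	unsigned int levels = 0;
	unsigned int shift = PAGE_SHIFT_ASSUMED + LEVEL_BITS_ASSUMED;

	assert(max_levels > 0);
	while (levels < max_levels && shift < 64 &&
	       size_bytes >= (1ULL << shift)) {
		levels++;
		shift += LEVEL_BITS_ASSUMED;
	}
	return levels;
}
```

Under these assumptions a 2 MiB BO gives 1 and a 1 GiB BO gives 2, while anything smaller than 2 MiB gives 0 because no table is covered in full.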

The problem with freeing the higher levels is that you don't know who else is using them. E.g. we would need to check all entries when we unmap one.
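
The check mentioned here, looking at every entry before a higher-level table may go away, could look roughly like the following. This is a simplified model with an invented entry layout, valid bit and function name, not driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PDE_VALID 1ULL	/* hypothetical valid bit */

/* Return true only if none of the n entries still has the valid bit set. */
static bool model_dir_is_unused(const uint64_t *entries, size_t n)
{
	size_t i;

	assert(entries != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (entries[i] & MODEL_PDE_VALID)
			return false;
	}
	return true;
}
```

The cost in this model is one pass over all entries of the directory on every unmap, which is the overhead being weighed against the memory saved.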

It's simply not worth it for a maximum saving of 2MB per VM.

Writing this, I'm actually wondering how you ran into this issue. There shouldn't be much savings from this.

Regards,
Christian.


Regards,
   Felix


>
> Regards,
> Christian.
>
>>
>> Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
>> Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56
>> +++++++++++++++++++++++++++++-----
>>   1 file changed, 48 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 0f4c3b2..8a480c7 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -1930,6 +1930,51 @@ static void amdgpu_vm_prt_fini(struct
>> amdgpu_device *adev, struct amdgpu_vm *vm)
>>   }
>>     /**
>> + * amdgpu_vm_remove_ptes - free PT BOs
>> + *
>> + * @adev: amdgpu device structure
>> + * @vm: amdgpu vm structure
>> + * @start: start of mapped range
>> + * @end: end of mapped entry
>> + *
>> + * Free the page table level.
>> + */
>> +static int amdgpu_vm_remove_ptes(struct amdgpu_device *adev,
>> +        struct amdgpu_vm *vm, uint64_t start, uint64_t end)
>> +{
>> +    struct amdgpu_vm_pt_cursor cursor;
>> +    unsigned shift, num_entries;
>> +
>> +    amdgpu_vm_pt_start(adev, vm, start, &cursor);
>> +    while (cursor.level < AMDGPU_VM_PTB) {
>> +        if (!amdgpu_vm_pt_descendant(adev, &cursor))
>> +            return -ENOENT;
>> +    }
>> +
>> +    while (cursor.pfn < end) {
>> +        amdgpu_vm_free_table(cursor.entry);
>> +        num_entries = amdgpu_vm_num_entries(adev, cursor.level - 1);
>> +
>> +        if (cursor.entry != &cursor.parent->entries[num_entries - 1]) {
>> +            /* Next ptb entry */
>> +            shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>> +            cursor.pfn += 1ULL << shift;
>> +            cursor.pfn &= ~((1ULL << shift) - 1);
>> +            cursor.entry++;
>> +        } else {
>> +            /* Next ptb entry in next pd0 entry */
>> +            amdgpu_vm_pt_ancestor(&cursor);
>> +            shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>> +            cursor.pfn += 1ULL << shift;
>> +            cursor.pfn &= ~((1ULL << shift) - 1);
>> +            amdgpu_vm_pt_descendant(adev, &cursor);
>> +        }
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +/**
>>    * amdgpu_vm_clear_freed - clear freed BOs in the PT
>>    *
>>    * @adev: amdgpu_device pointer
>> @@ -1949,7 +1994,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device
>> *adev,
>>                 struct dma_fence **fence)
>>   {
>>       struct amdgpu_bo_va_mapping *mapping;
>> -    uint64_t init_pte_value = 0;
>>       struct dma_fence *f = NULL;
>>       int r;
>>   @@ -1958,13 +2002,10 @@ int amdgpu_vm_clear_freed(struct
>> amdgpu_device *adev,
>>               struct amdgpu_bo_va_mapping, list);
>>           list_del(&mapping->list);
>>   -        if (vm->pte_support_ats &&
>> -            mapping->start < AMDGPU_GMC_HOLE_START)
>> -            init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
>> +        r = amdgpu_vm_remove_ptes(adev, vm,
>> +                (mapping->start + 0x1ff) & (~0x1ffll),
>> +                (mapping->last + 1) & (~0x1ffll));
>>   -        r = amdgpu_vm_bo_update_mapping(adev, vm, false, NULL,
>> -                        mapping->start, mapping->last,
>> -                        init_pte_value, 0, NULL, &f);
>>           amdgpu_vm_free_mapping(adev, vm, mapping, f);
>>           if (r) {
>>               dma_fence_put(f);
>> @@ -1980,7 +2021,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device
>> *adev,
>>       }
>>         return 0;
>> -
>>   }
>>     /**
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[-- Attachment #1.2: Type: text/html, Size: 10769 bytes --]

[-- Attachment #2: Type: text/plain, Size: 153 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] drm/amdgpu: remove PT BOs when unmapping
@ 2019-10-30 15:57 ` Koenig, Christian
  0 siblings, 0 replies; 32+ messages in thread
From: Koenig, Christian @ 2019-10-30 15:57 UTC (permalink / raw)
  To: Kuehling, Felix; +Cc: Huang, JinHuiEric, amd-gfx


[-- Attachment #1.1: Type: text/plain, Size: 5578 bytes --]



On 30.10.2019 at 16:47, "Kuehling, Felix" <Felix.Kuehling@amd.com> wrote:
On 2019-10-30 9:52 a.m., Christian König wrote:
> On 29.10.19 at 21:06, Huang, JinHuiEric wrote:
>> The issue is that PT BOs are not freed when a VA is unmapped,
>> which lets VRAM usage accumulate to a huge amount in some
>> memory stress tests, such as the KFD big buffer stress test.
>> amdgpu_vm_bo_update_mapping() is called by both
>> amdgpu_vm_bo_update() and amdgpu_vm_clear_freed(). The
>> solution is to replace the amdgpu_vm_bo_update_mapping() call
>> in amdgpu_vm_clear_freed() with a function that removes the
>> PT BOs, to save VRAM.
>
> NAK, that is intentional behavior.
>
> Otherwise we can run into out of memory situations when page tables
> need to be allocated again under stress.

That's a bit arbitrary and inconsistent. We are freeing page tables in
other situations, when a mapping uses huge pages in
amdgpu_vm_update_ptes. Why not when a mapping is destroyed completely?

I'm actually a bit surprised that the huge-page handling in
amdgpu_vm_update_ptes isn't kicking in to free up lower-level page
tables when a BO is unmapped.

Well, it does free the lower level, and that is already causing problems (that's why I added the reserved space).

What we don't do is free the higher levels.

E.g. when you free a 2MB BO we free the lowest level; if you free a 1GB BO we free the two lowest levels, and so on.

The problem with freeing the higher levels is that you don't know who else is using them. E.g. we would need to check all entries when we unmap one.

It's simply not worth it for a maximum saving of 2MB per VM.

Writing this, I'm actually wondering how you ended up with this issue. There shouldn't be much savings from this.

Regards,
Christian.


Regards,
   Felix


>
> Regards,
> Christian.
>
>>
>> Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
>> Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56
>> +++++++++++++++++++++++++++++-----
>>   1 file changed, 48 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 0f4c3b2..8a480c7 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -1930,6 +1930,51 @@ static void amdgpu_vm_prt_fini(struct
>> amdgpu_device *adev, struct amdgpu_vm *vm)
>>   }
>>     /**
>> + * amdgpu_vm_remove_ptes - free PT BOs
>> + *
>> + * @adev: amdgpu device structure
>> + * @vm: amdgpu vm structure
>> + * @start: start of mapped range
>> + * @end: end of mapped entry
>> + *
>> + * Free the page table level.
>> + */
>> +static int amdgpu_vm_remove_ptes(struct amdgpu_device *adev,
>> +        struct amdgpu_vm *vm, uint64_t start, uint64_t end)
>> +{
>> +    struct amdgpu_vm_pt_cursor cursor;
>> +    unsigned shift, num_entries;
>> +
>> +    amdgpu_vm_pt_start(adev, vm, start, &cursor);
>> +    while (cursor.level < AMDGPU_VM_PTB) {
>> +        if (!amdgpu_vm_pt_descendant(adev, &cursor))
>> +            return -ENOENT;
>> +    }
>> +
>> +    while (cursor.pfn < end) {
>> +        amdgpu_vm_free_table(cursor.entry);
>> +        num_entries = amdgpu_vm_num_entries(adev, cursor.level - 1);
>> +
>> +        if (cursor.entry != &cursor.parent->entries[num_entries - 1]) {
>> +            /* Next ptb entry */
>> +            shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>> +            cursor.pfn += 1ULL << shift;
>> +            cursor.pfn &= ~((1ULL << shift) - 1);
>> +            cursor.entry++;
>> +        } else {
>> +            /* Next ptb entry in next pd0 entry */
>> +            amdgpu_vm_pt_ancestor(&cursor);
>> +            shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>> +            cursor.pfn += 1ULL << shift;
>> +            cursor.pfn &= ~((1ULL << shift) - 1);
>> +            amdgpu_vm_pt_descendant(adev, &cursor);
>> +        }
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +/**
>>    * amdgpu_vm_clear_freed - clear freed BOs in the PT
>>    *
>>    * @adev: amdgpu_device pointer
>> @@ -1949,7 +1994,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device
>> *adev,
>>                 struct dma_fence **fence)
>>   {
>>       struct amdgpu_bo_va_mapping *mapping;
>> -    uint64_t init_pte_value = 0;
>>       struct dma_fence *f = NULL;
>>       int r;
>>   @@ -1958,13 +2002,10 @@ int amdgpu_vm_clear_freed(struct
>> amdgpu_device *adev,
>>               struct amdgpu_bo_va_mapping, list);
>>           list_del(&mapping->list);
>>   -        if (vm->pte_support_ats &&
>> -            mapping->start < AMDGPU_GMC_HOLE_START)
>> -            init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
>> +        r = amdgpu_vm_remove_ptes(adev, vm,
>> +                (mapping->start + 0x1ff) & (~0x1ffll),
>> +                (mapping->last + 1) & (~0x1ffll));
>>   -        r = amdgpu_vm_bo_update_mapping(adev, vm, false, NULL,
>> -                        mapping->start, mapping->last,
>> -                        init_pte_value, 0, NULL, &f);
>>           amdgpu_vm_free_mapping(adev, vm, mapping, f);
>>           if (r) {
>>               dma_fence_put(f);
>> @@ -1980,7 +2021,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device
>> *adev,
>>       }
>>         return 0;
>> -
>>   }
>>     /**
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] drm/amdgpu: remove PT BOs when unmapping
@ 2019-10-30 15:47         ` Kuehling, Felix
  0 siblings, 0 replies; 32+ messages in thread
From: Kuehling, Felix @ 2019-10-30 15:47 UTC (permalink / raw)
  To: Koenig, Christian, Huang, JinHuiEric, amd-gfx

On 2019-10-30 9:52 a.m., Christian König wrote:
> On 29.10.19 at 21:06, Huang, JinHuiEric wrote:
>> The issue is that PT BOs are not freed when a VA is unmapped,
>> which lets VRAM usage accumulate to a huge amount in some
>> memory stress tests, such as the KFD big buffer stress test.
>> amdgpu_vm_bo_update_mapping() is called by both
>> amdgpu_vm_bo_update() and amdgpu_vm_clear_freed(). The
>> solution is to replace the amdgpu_vm_bo_update_mapping() call
>> in amdgpu_vm_clear_freed() with a function that removes the
>> PT BOs, to save VRAM.
>
> NAK, that is intentional behavior.
>
> Otherwise we can run into out of memory situations when page tables 
> need to be allocated again under stress.

That's a bit arbitrary and inconsistent. We are freeing page tables in 
other situations, when a mapping uses huge pages in 
amdgpu_vm_update_ptes. Why not when a mapping is destroyed completely?

I'm actually a bit surprised that the huge-page handling in 
amdgpu_vm_update_ptes isn't kicking in to free up lower-level page 
tables when a BO is unmapped.

Regards,
   Felix


>
> Regards,
> Christian.
>
>>
>> Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
>> Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56 
>> +++++++++++++++++++++++++++++-----
>>   1 file changed, 48 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 0f4c3b2..8a480c7 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -1930,6 +1930,51 @@ static void amdgpu_vm_prt_fini(struct 
>> amdgpu_device *adev, struct amdgpu_vm *vm)
>>   }
>>     /**
>> + * amdgpu_vm_remove_ptes - free PT BOs
>> + *
>> + * @adev: amdgpu device structure
>> + * @vm: amdgpu vm structure
>> + * @start: start of mapped range
>> + * @end: end of mapped entry
>> + *
>> + * Free the page table level.
>> + */
>> +static int amdgpu_vm_remove_ptes(struct amdgpu_device *adev,
>> +        struct amdgpu_vm *vm, uint64_t start, uint64_t end)
>> +{
>> +    struct amdgpu_vm_pt_cursor cursor;
>> +    unsigned shift, num_entries;
>> +
>> +    amdgpu_vm_pt_start(adev, vm, start, &cursor);
>> +    while (cursor.level < AMDGPU_VM_PTB) {
>> +        if (!amdgpu_vm_pt_descendant(adev, &cursor))
>> +            return -ENOENT;
>> +    }
>> +
>> +    while (cursor.pfn < end) {
>> +        amdgpu_vm_free_table(cursor.entry);
>> +        num_entries = amdgpu_vm_num_entries(adev, cursor.level - 1);
>> +
>> +        if (cursor.entry != &cursor.parent->entries[num_entries - 1]) {
>> +            /* Next ptb entry */
>> +            shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>> +            cursor.pfn += 1ULL << shift;
>> +            cursor.pfn &= ~((1ULL << shift) - 1);
>> +            cursor.entry++;
>> +        } else {
>> +            /* Next ptb entry in next pd0 entry */
>> +            amdgpu_vm_pt_ancestor(&cursor);
>> +            shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>> +            cursor.pfn += 1ULL << shift;
>> +            cursor.pfn &= ~((1ULL << shift) - 1);
>> +            amdgpu_vm_pt_descendant(adev, &cursor);
>> +        }
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +/**
>>    * amdgpu_vm_clear_freed - clear freed BOs in the PT
>>    *
>>    * @adev: amdgpu_device pointer
>> @@ -1949,7 +1994,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device 
>> *adev,
>>                 struct dma_fence **fence)
>>   {
>>       struct amdgpu_bo_va_mapping *mapping;
>> -    uint64_t init_pte_value = 0;
>>       struct dma_fence *f = NULL;
>>       int r;
>>   @@ -1958,13 +2002,10 @@ int amdgpu_vm_clear_freed(struct 
>> amdgpu_device *adev,
>>               struct amdgpu_bo_va_mapping, list);
>>           list_del(&mapping->list);
>>   -        if (vm->pte_support_ats &&
>> -            mapping->start < AMDGPU_GMC_HOLE_START)
>> -            init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
>> +        r = amdgpu_vm_remove_ptes(adev, vm,
>> +                (mapping->start + 0x1ff) & (~0x1ffll),
>> +                (mapping->last + 1) & (~0x1ffll));
>>   -        r = amdgpu_vm_bo_update_mapping(adev, vm, false, NULL,
>> -                        mapping->start, mapping->last,
>> -                        init_pte_value, 0, NULL, &f);
>>           amdgpu_vm_free_mapping(adev, vm, mapping, f);
>>           if (r) {
>>               dma_fence_put(f);
>> @@ -1980,7 +2021,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device 
>> *adev,
>>       }
>>         return 0;
>> -
>>   }
>>     /**
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] drm/amdgpu: remove PT BOs when unmapping
@ 2019-10-30 13:52     ` Christian König
  0 siblings, 0 replies; 32+ messages in thread
From: Christian König @ 2019-10-30 13:52 UTC (permalink / raw)
  To: Huang, JinHuiEric, amd-gfx

On 29.10.19 at 21:06, Huang, JinHuiEric wrote:
> The issue is that PT BOs are not freed when a VA is unmapped,
> which lets VRAM usage accumulate to a huge amount in some
> memory stress tests, such as the KFD big buffer stress test.
> amdgpu_vm_bo_update_mapping() is called by both
> amdgpu_vm_bo_update() and amdgpu_vm_clear_freed(). The
> solution is to replace the amdgpu_vm_bo_update_mapping() call
> in amdgpu_vm_clear_freed() with a function that removes the
> PT BOs, to save VRAM.

NAK, that is intentional behavior.

Otherwise we can run into out of memory situations when page tables need 
to be allocated again under stress.

Regards,
Christian.

>
> Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
> Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56 +++++++++++++++++++++++++++++-----
>   1 file changed, 48 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 0f4c3b2..8a480c7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -1930,6 +1930,51 @@ static void amdgpu_vm_prt_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
>   }
>   
>   /**
> + * amdgpu_vm_remove_ptes - free PT BOs
> + *
> + * @adev: amdgpu device structure
> + * @vm: amdgpu vm structure
> + * @start: start of mapped range
> + * @end: end of mapped entry
> + *
> + * Free the page table level.
> + */
> +static int amdgpu_vm_remove_ptes(struct amdgpu_device *adev,
> +		struct amdgpu_vm *vm, uint64_t start, uint64_t end)
> +{
> +	struct amdgpu_vm_pt_cursor cursor;
> +	unsigned shift, num_entries;
> +
> +	amdgpu_vm_pt_start(adev, vm, start, &cursor);
> +	while (cursor.level < AMDGPU_VM_PTB) {
> +		if (!amdgpu_vm_pt_descendant(adev, &cursor))
> +			return -ENOENT;
> +	}
> +
> +	while (cursor.pfn < end) {
> +		amdgpu_vm_free_table(cursor.entry);
> +		num_entries = amdgpu_vm_num_entries(adev, cursor.level - 1);
> +
> +		if (cursor.entry != &cursor.parent->entries[num_entries - 1]) {
> +			/* Next ptb entry */
> +			shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
> +			cursor.pfn += 1ULL << shift;
> +			cursor.pfn &= ~((1ULL << shift) - 1);
> +			cursor.entry++;
> +		} else {
> +			/* Next ptb entry in next pd0 entry */
> +			amdgpu_vm_pt_ancestor(&cursor);
> +			shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
> +			cursor.pfn += 1ULL << shift;
> +			cursor.pfn &= ~((1ULL << shift) - 1);
> +			amdgpu_vm_pt_descendant(adev, &cursor);
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +/**
>    * amdgpu_vm_clear_freed - clear freed BOs in the PT
>    *
>    * @adev: amdgpu_device pointer
> @@ -1949,7 +1994,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
>   			  struct dma_fence **fence)
>   {
>   	struct amdgpu_bo_va_mapping *mapping;
> -	uint64_t init_pte_value = 0;
>   	struct dma_fence *f = NULL;
>   	int r;
>   
> @@ -1958,13 +2002,10 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
>   			struct amdgpu_bo_va_mapping, list);
>   		list_del(&mapping->list);
>   
> -		if (vm->pte_support_ats &&
> -		    mapping->start < AMDGPU_GMC_HOLE_START)
> -			init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
> +		r = amdgpu_vm_remove_ptes(adev, vm,
> +				(mapping->start + 0x1ff) & (~0x1ffll),
> +				(mapping->last + 1) & (~0x1ffll));
>   
> -		r = amdgpu_vm_bo_update_mapping(adev, vm, false, NULL,
> -						mapping->start, mapping->last,
> -						init_pte_value, 0, NULL, &f);
>   		amdgpu_vm_free_mapping(adev, vm, mapping, f);
>   		if (r) {
>   			dma_fence_put(f);
> @@ -1980,7 +2021,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
>   	}
>   
>   	return 0;
> -
>   }
>   
>   /**

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH] drm/amdgpu: remove PT BOs when unmapping
@ 2019-10-29 20:06 ` Huang, JinHuiEric
  0 siblings, 0 replies; 32+ messages in thread
From: Huang, JinHuiEric @ 2019-10-29 20:06 UTC (permalink / raw)
  To: amd-gfx; +Cc: Huang, JinHuiEric

The issue is that PT BOs are not freed when a VA is unmapped,
which lets VRAM usage accumulate to a huge amount in some
memory stress tests, such as the KFD big buffer stress test.
amdgpu_vm_bo_update_mapping() is called by both
amdgpu_vm_bo_update() and amdgpu_vm_clear_freed(). The
solution is to replace the amdgpu_vm_bo_update_mapping() call
in amdgpu_vm_clear_freed() with a function that removes the
PT BOs, to save VRAM.

Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56 +++++++++++++++++++++++++++++-----
 1 file changed, 48 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 0f4c3b2..8a480c7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1930,6 +1930,51 @@ static void amdgpu_vm_prt_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
 }
 
 /**
+ * amdgpu_vm_remove_ptes - free PT BOs
+ *
+ * @adev: amdgpu device structure
+ * @vm: amdgpu vm structure
+ * @start: start of mapped range
+ * @end: end of mapped entry
+ *
+ * Free the page table level.
+ */
+static int amdgpu_vm_remove_ptes(struct amdgpu_device *adev,
+		struct amdgpu_vm *vm, uint64_t start, uint64_t end)
+{
+	struct amdgpu_vm_pt_cursor cursor;
+	unsigned shift, num_entries;
+
+	amdgpu_vm_pt_start(adev, vm, start, &cursor);
+	while (cursor.level < AMDGPU_VM_PTB) {
+		if (!amdgpu_vm_pt_descendant(adev, &cursor))
+			return -ENOENT;
+	}
+
+	while (cursor.pfn < end) {
+		amdgpu_vm_free_table(cursor.entry);
+		num_entries = amdgpu_vm_num_entries(adev, cursor.level - 1);
+
+		if (cursor.entry != &cursor.parent->entries[num_entries - 1]) {
+			/* Next ptb entry */
+			shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
+			cursor.pfn += 1ULL << shift;
+			cursor.pfn &= ~((1ULL << shift) - 1);
+			cursor.entry++;
+		} else {
+			/* Next ptb entry in next pd0 entry */
+			amdgpu_vm_pt_ancestor(&cursor);
+			shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
+			cursor.pfn += 1ULL << shift;
+			cursor.pfn &= ~((1ULL << shift) - 1);
+			amdgpu_vm_pt_descendant(adev, &cursor);
+		}
+	}
+
+	return 0;
+}
+
+/**
  * amdgpu_vm_clear_freed - clear freed BOs in the PT
  *
  * @adev: amdgpu_device pointer
@@ -1949,7 +1994,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
 			  struct dma_fence **fence)
 {
 	struct amdgpu_bo_va_mapping *mapping;
-	uint64_t init_pte_value = 0;
 	struct dma_fence *f = NULL;
 	int r;
 
@@ -1958,13 +2002,10 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
 			struct amdgpu_bo_va_mapping, list);
 		list_del(&mapping->list);
 
-		if (vm->pte_support_ats &&
-		    mapping->start < AMDGPU_GMC_HOLE_START)
-			init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
+		r = amdgpu_vm_remove_ptes(adev, vm,
+				(mapping->start + 0x1ff) & (~0x1ffll),
+				(mapping->last + 1) & (~0x1ffll));
 
-		r = amdgpu_vm_bo_update_mapping(adev, vm, false, NULL,
-						mapping->start, mapping->last,
-						init_pte_value, 0, NULL, &f);
 		amdgpu_vm_free_mapping(adev, vm, mapping, f);
 		if (r) {
 			dma_fence_put(f);
@@ -1980,7 +2021,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
 	}
 
 	return 0;
-
 }
 
 /**
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH] drm/amdgpu: remove PT BOs when unmapping
@ 2019-10-29 20:06 ` Huang, JinHuiEric
  0 siblings, 0 replies; 32+ messages in thread
From: Huang, JinHuiEric @ 2019-10-29 20:06 UTC (permalink / raw)
  To: amd-gfx; +Cc: Huang, JinHuiEric

The issue is PT BOs are not freed when unmapping VA,
which causes vram usage accumulated is huge in some
memory stress test, such as kfd big buffer stress test.
Function amdgpu_vm_bo_update_mapping() is called by both
amdgpu_vm_bo_update() and amdgpu_vm_clear_freed(). The
solution is replacing amdgpu_vm_bo_update_mapping() in
amdgpu_vm_clear_freed() with removing PT BOs function
to save vram usage.

Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56 +++++++++++++++++++++++++++++-----
 1 file changed, 48 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 0f4c3b2..8a480c7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1930,6 +1930,51 @@ static void amdgpu_vm_prt_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
 }
 
 /**
+ * amdgpu_vm_remove_ptes - free PT BOs
+ *
+ * @adev: amdgpu device structure
+ * @vm: amdgpu vm structure
+ * @start: start of mapped range
+ * @end: end of mapped entry
+ *
+ * Free the page table level.
+ */
+static int amdgpu_vm_remove_ptes(struct amdgpu_device *adev,
+		struct amdgpu_vm *vm, uint64_t start, uint64_t end)
+{
+	struct amdgpu_vm_pt_cursor cursor;
+	unsigned shift, num_entries;
+
+	amdgpu_vm_pt_start(adev, vm, start, &cursor);
+	while (cursor.level < AMDGPU_VM_PTB) {
+		if (!amdgpu_vm_pt_descendant(adev, &cursor))
+			return -ENOENT;
+	}
+
+	while (cursor.pfn < end) {
+		amdgpu_vm_free_table(cursor.entry);
+		num_entries = amdgpu_vm_num_entries(adev, cursor.level - 1);
+
+		if (cursor.entry != &cursor.parent->entries[num_entries - 1]) {
+			/* Next ptb entry */
+			shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
+			cursor.pfn += 1ULL << shift;
+			cursor.pfn &= ~((1ULL << shift) - 1);
+			cursor.entry++;
+		} else {
+			/* Next ptb entry in next pd0 entry */
+			amdgpu_vm_pt_ancestor(&cursor);
+			shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
+			cursor.pfn += 1ULL << shift;
+			cursor.pfn &= ~((1ULL << shift) - 1);
+			amdgpu_vm_pt_descendant(adev, &cursor);
+		}
+	}
+
+	return 0;
+}
+
+/**
  * amdgpu_vm_clear_freed - clear freed BOs in the PT
  *
  * @adev: amdgpu_device pointer
@@ -1949,7 +1994,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
 			  struct dma_fence **fence)
 {
 	struct amdgpu_bo_va_mapping *mapping;
-	uint64_t init_pte_value = 0;
 	struct dma_fence *f = NULL;
 	int r;
 
@@ -1958,13 +2002,10 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
 			struct amdgpu_bo_va_mapping, list);
 		list_del(&mapping->list);
 
-		if (vm->pte_support_ats &&
-		    mapping->start < AMDGPU_GMC_HOLE_START)
-			init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
+		r = amdgpu_vm_remove_ptes(adev, vm,
+				(mapping->start + 0x1ff) & (~0x1ffll),
+				(mapping->last + 1) & (~0x1ffll));
 
-		r = amdgpu_vm_bo_update_mapping(adev, vm, false, NULL,
-						mapping->start, mapping->last,
-						init_pte_value, 0, NULL, &f);
 		amdgpu_vm_free_mapping(adev, vm, mapping, f);
 		if (r) {
 			dma_fence_put(f);
@@ -1980,7 +2021,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
 	}
 
 	return 0;
-
 }
 
 /**
-- 
2.7.4


