Hi Christian

 

I believe Wentao can fix the issue with the steps below:

  1. Return virtual_address_max (the UMD uses it) as HOLE_START - RESERVED_SIZE
  2. [optional] Still keep virtual_address_offset at RESERVED_SIZE (the current way; I think that is because we previously put the CSA in the 0 -> RESERVED_SIZE space)
  3. Put the CSA in HOLE_START - RESERVED_SIZE -> HOLE_START (this is the current design)
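
The three steps above can be sketched in standalone C to check that the resulting ranges do not overlap. This is only an illustration, not driver code: the struct and helper names are hypothetical, HOLE_START is assumed to be 0x0000 8000 0000 0000, and RESERVED_SIZE is assumed to be 0x10 0000 as quoted further down in this thread.

```c
#include <stdint.h>

/* Assumed values; the names below are hypothetical, not driver symbols. */
#define HOLE_START    0x0000800000000000ULL
#define RESERVED_SIZE 0x100000ULL

struct va_layout {
	uint64_t user_start; /* virtual_address_offset reported to the UMD */
	uint64_t user_end;   /* virtual_address_max reported to the UMD */
	uint64_t csa_start;  /* first byte of the CSA range */
	uint64_t csa_end;    /* one past the last byte of the CSA range */
};

static inline struct va_layout proposed_layout(void)
{
	struct va_layout l;

	l.user_start = RESERVED_SIZE;             /* step 2 */
	l.user_end = HOLE_START - RESERVED_SIZE;  /* step 1 */
	l.csa_start = HOLE_START - RESERVED_SIZE; /* step 3 */
	l.csa_end = HOLE_START;
	return l;
}

/* Returns 1 when the CSA sits above the user range and below the hole. */
static inline int layout_is_consistent(struct va_layout l)
{
	return l.user_start < l.user_end &&
	       l.user_end <= l.csa_start &&
	       l.csa_start < l.csa_end &&
	       l.csa_end <= HOLE_START;
}
```

Under these assumptions the CSA would start at 0x7FFF FFF0 0000 and end exactly at HOLE_START, so it never touches the hole.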

 

I don't see where the above scheme is incorrect ... can you explain GMC_HOLE_START in more detail?

 

e.g.

  1. Why did you set GMC_HOLE_START to 0x8000'0000'0000 (half of the maximum 48-bit address space)? Is it for HSA purposes, to make sure a GPU address can also be used as a CPU address?
  2. Now that MAX_PFN is 0x10'0000'0000 (0x1'0000'0000'0000 in bytes), do you need to change GMC_HOLE_START?

 

thanks

We need to catch up on this.

 

/Monk

 

From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of Koenig, Christian
Sent: Thursday, January 17, 2019 3:39 PM
To: Lou, Wentao <Wentao.Lou@amd.com>; Liu, Monk <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org; Zhu, Rex <Rex.Zhu@amd.com>
Cc: Deng, Emily <Emily.Deng@amd.com>
Subject: Re: [PATCH] drm/amdgpu: csa_vaddr should not larger than AMDGPU_GMC_HOLE_START

 

Am 17.01.19 um 04:17 schrieb Lou, Wentao:

Hi Christian,

 

Your solution was:

addr = (max_pfn - (AMDGPU_VA_RESERVED_SIZE >> AMDGPU_PAGE_SHIFT)) << AMDGPU_PAGE_SHIFT;

now max_pfn = 0x10 0000 0000, AMDGPU_VA_RESERVED_SIZE = 0x10 0000, AMDGPU_PAGE_SHIFT = 12

I still get addr = 0xFFFF FFF0 0000, which causes a gfx ring timeout.
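
The arithmetic above can be reproduced with a small standalone C sketch. The macro and function names are local to this example (hypothetical), and the values are the ones quoted in this mail: a reserved size of 0x10 0000 and a page shift of 12.

```c
#include <stdint.h>

/* Values as quoted in this thread; names are local to this sketch. */
#define VA_RESERVED_SIZE 0x100000ULL
#define PAGE_SHIFT       12

/* Subtract the reserved size in units of pages first, then shift to bytes. */
static inline uint64_t csa_addr_subtract_then_shift(uint64_t max_pfn)
{
	return (max_pfn - (VA_RESERVED_SIZE >> PAGE_SHIFT)) << PAGE_SHIFT;
}
```

With max_pfn = 0x10 0000 0000 this gives (0x10 0000 0000 - 0x100) << 12 = 0xFFFF FFF0 0000, matching the number above.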


But 0xFFFF FFF0 0000 is the correct address, if that is causing a problem then there is a bug somewhere else.

Please try to use AMDGPU_GMC_HOLE_START-AMDGPU_VA_RESERVED_SIZE as well. Does that work?


 

Before commit 1bf621c42137926ac249af761c0190a9258aa0db, vm_size was 32GB, and csa_addr was under AMDGPU_GMC_HOLE_START.


Wait a second, why was the vm_size 32GB? This is on a Vega10, isn't it?


I didn't understand why csa_addr needs to be above AMDGPU_GMC_HOLE_START now.


On Vega10 the lower range, i.e. everything below AMDGPU_GMC_HOLE_START, is reserved for SVA.

Regards,
Christian.


Thanks.

 

BR,

Wentao

 

 

 

From: Koenig, Christian <Christian.Koenig@amd.com>
Sent: Wednesday, January 16, 2019 5:48 PM
To: Lou, Wentao <Wentao.Lou@amd.com>; Liu, Monk <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org; Zhu, Rex <Rex.Zhu@amd.com>
Cc: Deng, Emily <Emily.Deng@amd.com>
Subject: Re: [PATCH] drm/amdgpu: csa_vaddr should not larger than AMDGPU_GMC_HOLE_START

 

Hi Wentao,

well the problem is you don't seem to understand how the hardware works.

The engines see an MC address space with a hole in the middle, similar to how the x86 64-bit CPU address space works. But the page tables are programmed linearly.

So the calculation in amdgpu_driver_open_kms() is correct because it takes the MC address and makes a linear page table index from it again.

The only thing we might need to fix here is shifting max_pfn before the subtraction and I doubt that even that is necessary.
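
To illustrate the difference between the MC address space with its hole and the linear page table index, here is a minimal standalone sketch. The constants are assumed to mirror AMDGPU_GMC_HOLE_START, AMDGPU_GMC_HOLE_END and AMDGPU_GMC_HOLE_MASK for a 48-bit address space, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Assumed to mirror the AMDGPU_GMC_HOLE_* values for a 48-bit space. */
#define GMC_HOLE_START 0x0000800000000000ULL
#define GMC_HOLE_END   0xffff800000000000ULL
#define GMC_HOLE_MASK  0x0000ffffffffffffULL
#define GPU_PAGE_SHIFT 12

/* Returns 1 if an MC address lies inside the non-addressable hole. */
static inline int mc_addr_in_hole(uint64_t mc_addr)
{
	return mc_addr >= GMC_HOLE_START && mc_addr < GMC_HOLE_END;
}

/*
 * Page tables are indexed linearly, so the sign-extension bits are
 * dropped before the address is turned into a page frame number.
 */
static inline uint64_t mc_to_linear_pfn(uint64_t mc_addr)
{
	return (mc_addr & GMC_HOLE_MASK) >> GPU_PAGE_SHIFT;
}
```

For the sign-extended MC address 0xFFFF FFFF FFF0 0000 the linear index is 0xF FFFF FF00, which is below a max_pfn of 0x10 0000 0000.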

Regards,
Christian.

Am 16.01.19 um 10:34 schrieb Lou, Wentao:

Hi Christian,

 

vm_size is now set to 0x4 0000 GB by the commit below:

1bf621c42137926ac249af761c0190a9258aa0db drm/amdgpu: Remove unnecessary VM size calculations

 

So max_pfn would be 0x10 0000 0000.

amdgpu_csa_vaddr shifts max_pfn << 12 to get 0x1 0000 0000 0000, and then subtracts AMDGPU_VA_RESERVED_SIZE to get 0xFFFF FFF0 0000.

Unfortunately this number is between AMDGPU_GMC_HOLE_START and AMDGPU_GMC_HOLE_END, so amdgpu_gmc_sign_extend is called and turns it into 0xFFFF FFFF FFF0 0000.

 

In amdgpu_driver_open_kms, the sign-extended csa_addr cannot be passed to amdgpu_map_static_csa directly, because it would be above the max_pfn limit.

So csa_addr was masked with AMDGPU_GMC_HOLE_MASK to make it usable for amdgpu_vm_alloc_pts.

But masking with AMDGPU_GMC_HOLE_MASK makes the address fall back into AMDGPU_GMC_HOLE again, which causes a GPU reset.
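
The sequence of values described above can be traced step by step in standalone C. This sketch only reproduces the numbers; it says nothing about which step triggers the reset. The struct and function names are hypothetical, and the constants are assumed to match the AMDGPU_* values discussed in this thread.

```c
#include <stdint.h>

/* Assumed to match the AMDGPU_* values discussed in this thread. */
#define GMC_HOLE_START   0x0000800000000000ULL
#define GMC_HOLE_END     0xffff800000000000ULL
#define GMC_HOLE_MASK    0x0000ffffffffffffULL
#define VA_RESERVED_SIZE 0x100000ULL
#define GPU_PAGE_SHIFT   12

struct csa_trace {
	uint64_t shifted;        /* max_pfn << GPU_PAGE_SHIFT */
	uint64_t minus_reserved; /* after subtracting the reserved size */
	uint64_t sign_extended;  /* after sign extension over the hole */
	uint64_t masked;         /* after applying the hole mask again */
};

static inline struct csa_trace trace_csa_vaddr(uint64_t max_pfn)
{
	struct csa_trace t;

	t.shifted = max_pfn << GPU_PAGE_SHIFT;
	t.minus_reserved = t.shifted - VA_RESERVED_SIZE;
	/* Addresses at or above the hole start get the upper bits set. */
	t.sign_extended = t.minus_reserved >= GMC_HOLE_START ?
			  (t.minus_reserved | GMC_HOLE_END) : t.minus_reserved;
	t.masked = t.sign_extended & GMC_HOLE_MASK;
	return t;
}
```

For max_pfn = 0x10 0000 0000 the masked value is again 0xFFFF FFF0 0000, which numerically lies between the hole start and the hole end.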

We just put amdgpu_csa_vaddr back to AMDGPU_GMC_HOLE_START, to keep the address from touching AMDGPU_GMC_HOLE.

By the way, if max_pfn were shifted too far to the left, the result would always be zero, with or without min(*,*).

 

 

BR,

Wentao

 

 

 

-----Original Message-----
From: Koenig, Christian <Christian.Koenig@amd.com>
Sent: Tuesday, January 15, 2019 4:02 PM
To: Liu, Monk <Monk.Liu@amd.com>; Lou, Wentao <Wentao.Lou@amd.com>; amd-gfx@lists.freedesktop.org; Zhu, Rex <Rex.Zhu@amd.com>
Subject: Re: [PATCH] drm/amdgpu: csa_vaddr should not larger than AMDGPU_GMC_HOLE_START

 

Am 15.01.19 um 07:19 schrieb Liu, Monk:

> max_pfn is now 0x1'0000'0000'0000 (in bytes), which is above 48 bits, and masking it with AMDGPU_GMC_HOLE_MASK makes it zero ....

> And in amdgpu_driver_open_kms() I saw that @Zhu, Rex wrote the code as:

> "csa_addr = amdgpu_csa_vaddr(adev) & AMDGPU_GMC_HOLE_MASK". I think this is wrong since you intentionally place the CSA above the GMC hole, right?

 

The fix is just completely incorrect since min(adev->vm_manager.max_pfn << AMDGPU_GPU_PAGE_SHIFT, AMDGPU_GMC_HOLE_START) still gives you 0 when we shift max_pfn too far to the left.

 

The correct solution is to subtract the reserved size first and then shift. E.g.:

 

addr = (max_pfn - (AMDGPU_VA_RESERVED_SIZE >> AMDGPU_PAGE_SHIFT)) << AMDGPU_PAGE_SHIFT;
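
The difference between the two orderings can be seen in a small standalone sketch. It assumes a 64-bit max_pfn, a page shift of 12 and a reserved size of 0x10 0000; the function names are hypothetical and only serve the comparison.

```c
#include <stdint.h>

/* Assumed values; the function names are local to this sketch. */
#define GMC_HOLE_START   0x0000800000000000ULL
#define VA_RESERVED_SIZE 0x100000ULL
#define GPU_PAGE_SHIFT   12

static inline uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/* Shift first, then clamp: a wrapped shift result of 0 passes through min(). */
static inline uint64_t top_shift_then_clamp(uint64_t max_pfn)
{
	return min_u64(max_pfn << GPU_PAGE_SHIFT, GMC_HOLE_START);
}

/* Subtract in page units first, so the later shift cannot wrap to 0. */
static inline uint64_t csa_subtract_then_shift(uint64_t max_pfn)
{
	return (max_pfn - (VA_RESERVED_SIZE >> GPU_PAGE_SHIFT)) << GPU_PAGE_SHIFT;
}
```

In a 64-bit type the shift only wraps to 0 once max_pfn reaches 1 << 52; in that case the first variant returns 0 while the second returns 0xFFFF FFFF FFF0 0000. With the max_pfn of 0x10 0000 0000 from this thread, the first variant returns the hole start and the second returns 0xFFFF FFF0 0000.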

 

Regards,

Christian.

 

> It looks like we should modify this place.

> /Monk

> -----Original Message-----

> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of

> Christian König

> Sent: Monday, January 14, 2019 9:05 PM

> To: Lou, Wentao <Wentao.Lou@amd.com>; amd-gfx@lists.freedesktop.org

> Subject: Re: [PATCH] drm/amdgpu: csa_vaddr should not larger than

> AMDGPU_GMC_HOLE_START

> Am 14.01.19 um 09:40 schrieb wentalou:

>> After removing unnecessary VM size calculations, vm_manager.max_pfn

>> would reach 0x10,0000,0000. max_pfn << AMDGPU_GPU_PAGE_SHIFT exceeding

>> AMDGPU_GMC_HOLE_START would cause a GPU reset.

>> 

>> Change-Id: I47ad0be2b0bd9fb7490c4e1d7bb7bdacf71132cb

>> Signed-off-by: wentalou <Wentao.Lou@amd.com>

> NAK, that is incorrect. We intentionally place the csa above the GMC hole.

> Regards,

> Christian.

>> ---

>>    drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 3 ++-

>>    1 file changed, 2 insertions(+), 1 deletion(-)

>> 

>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c

>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c

>> index 7e22be7..dd3bd01 100644

>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c

>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c

>> @@ -26,7 +26,8 @@

>>   

>>    uint64_t amdgpu_csa_vaddr(struct amdgpu_device *adev)

>>    {

>> -        uint64_t addr = adev->vm_manager.max_pfn << AMDGPU_GPU_PAGE_SHIFT;

>> +       uint64_t addr = min(adev->vm_manager.max_pfn << AMDGPU_GPU_PAGE_SHIFT,

>> +                                                    AMDGPU_GMC_HOLE_START);

>>   

>>          addr -= AMDGPU_VA_RESERVED_SIZE;

>>          addr = amdgpu_gmc_sign_extend(addr);
