If you could get me a copy of the vbios image from a problematic board,
that would be helpful. In the meantime, I've applied the patch.

Alex

On Thu, Dec 16, 2021 at 9:38 PM 周宗敏 wrote:

> Dear Alex:
>
> > Is the issue reproducible with the same board in bare metal on x86? Or
> > does it only happen with passthrough on ARM?
>
> Unfortunately, my current environment is not convenient for testing this
> GPU board on an x86 platform.
>
> But I can tell you the problem still occurs on ARM without passthrough to
> a virtual machine.
>
> In addition, at the end of 2020, my colleagues also found similar problems
> on MIPS platforms with Radeon R7 340 graphics chips.
>
> So I think it can happen regardless of whether the platform is x86, ARM
> or MIPS.
>
> I hope the above information is helpful to you. I also think it would be
> better for users if this issue could be root-caused.
>
> Best regards.
>
> ----
>
> *Subject:* Re: Re: Reply: Re: [PATCH] drm/amdgpu: fixup bad vram size on gmc v8
> *Date:* 2021-12-16 23:28
> *From:* Alex Deucher
> *To:* 周宗敏
>
> Is the issue reproducible with the same board in bare metal on x86? Or
> does it only happen with passthrough on ARM? Looking through the archives,
> the SI patch I made was for an x86 laptop. It would be nice to root
> cause this, but there weren't any gfx8 boards with more than 64G of vram,
> so I think it's safe. That said, if you see similar issues with newer gfx
> IPs then we have an issue since the upper bit will be meaningful, so it
> would be nice to root cause this.
>
> Alex
>
> On Thu, Dec 16, 2021 at 4:36 AM 周宗敏 wrote:
>
>> Hi Christian,
>>
>> I'm testing the GPU passthrough feature, so I pass this GPU through to a
>> virtual machine. It is based on an arm64 system.
>>
>> As far as I know, Alex dealt with a similar problem in
>> dri/radeon/si.c. Maybe they have the same cause?
>>
>> The commit message for that change is below:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0ca223b029a261e82fb2f50c52eb85d510f4260e
>>
>> [image: image.png]
>>
>> Thanks very much.
>>
>> ----
>>
>> *Subject:* Re: Reply: Re: [PATCH] drm/amdgpu: fixup bad vram size on gmc v8
>> *Date:* 2021-12-16 16:15
>> *From:* Christian König
>> *To:* 周宗敏, Alex Deucher
>>
>> Hi Zongmin,
>>
>> that strongly sounds like the ASIC is not correctly initialized when
>> trying to read the register.
>>
>> What board and environment are you using this GPU with? Is that a
>> normal x86 system?
>>
>> Regards,
>> Christian.
>>
>> On 16.12.21 at 04:11, 周宗敏 wrote:
>>
>> 1. The problematic board that I have tested is [AMD/ATI] Lexa PRO
>>    [Radeon RX 550/550X], with vbios version 113-RXF9310-C09-BT.
>>
>> 2. When the problem occurs, I can see the following changes in the VRAM
>>    size value read from RREG32(mmCONFIG_MEMSIZE); it seems to have
>>    garbage in the upper 16 bits:
>>
>>    [image: image.png]
>>
>> 3. I can also see dmesg output like the following. When the VRAM size
>>    register has garbage, the message looks like this:
>>
>>    amdgpu 0000:09:00.0: VRAM: 4286582784M 0x000000F400000000 - 0x000FF8F4FFFFFFFF (4286582784M used)
>>
>>    The correct message should look like this:
>>
>>    amdgpu 0000:09:00.0: VRAM: 4096M 0x000000F400000000 - 0x000000F4FFFFFFFF (4096M used)
>>
>> If you have any questions, please send me mail.
>>
>> Thanks very much.
>>
>> ----
>>
>> *Subject:* Re: [PATCH] drm/amdgpu: fixup bad vram size on gmc v8
>> *Date:* 2021-12-16 04:23
>> *From:* Alex Deucher
>> *To:* Zongmin Zhou
>>
>> On Wed, Dec 15, 2021 at 10:31 AM Zongmin Zhou wrote:
>> >
>> > Some boards (like RX550) seem to have garbage in the upper
>> > 16 bits of the vram size register. Check for
>> > this and clamp the size properly. Fixes
>> > boards reporting bogus amounts of vram.
>> >
>> > After applying this patch, the maximum GPU VRAM size is 64GB;
>> > a larger reported size is clamped to the lower 16 bits, so at
>> > most 64GB of VRAM will be used.
>>
>> Can you provide some examples of problematic boards and possibly a
>> vbios image from the problematic board? What values are you seeing?
>> It would be nice to see what the boards are reporting and whether the
>> lower 16 bits are actually correct or if it is some other issue. This
>> register is undefined until the asic has been initialized. The vbios
>> programs it as part of its asic init sequence (either via vesa/gop or
>> the OS driver).
>>
>> Alex
>>
>> >
>> > Signed-off-by: Zongmin Zhou
>> > ---
>> >  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 13 ++++++++++---
>> >  1 file changed, 10 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>> > index 492ebed2915b..63b890f1e8af 100644
>> > --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>> > +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>> > @@ -515,10 +515,10 @@ static void gmc_v8_0_mc_program(struct amdgpu_device *adev)
>> >  static int gmc_v8_0_mc_init(struct amdgpu_device *adev)
>> >  {
>> >  	int r;
>> > +	u32 tmp;
>> >
>> >  	adev->gmc.vram_width = amdgpu_atombios_get_vram_width(adev);
>> >  	if (!adev->gmc.vram_width) {
>> > -		u32 tmp;
>> >  		int chansize, numchan;
>> >
>> >  		/* Get VRAM informations */
>> > @@ -562,8 +562,15 @@ static int gmc_v8_0_mc_init(struct amdgpu_device *adev)
>> >  		adev->gmc.vram_width = numchan * chansize;
>> >  	}
>> >  	/* size in MB on si */
>> > -	adev->gmc.mc_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
>> > -	adev->gmc.real_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
>> > +	tmp = RREG32(mmCONFIG_MEMSIZE);
>> > +	/* some boards may have garbage in the upper 16 bits */
>> > +	if (tmp & 0xffff0000) {
>> > +		DRM_INFO("Probable bad vram size: 0x%08x\n", tmp);
>> > +		if (tmp & 0xffff)
>> > +			tmp &= 0xffff;
>> > +	}
>> > +	adev->gmc.mc_vram_size = tmp * 1024ULL * 1024ULL;
>> > +	adev->gmc.real_vram_size = adev->gmc.mc_vram_size;
>> >
>> >  	if (!(adev->flags & AMD_IS_APU)) {
>> >  		r = amdgpu_device_resize_fb_bar(adev);
>> > --
>> > 2.25.1
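
For reference, the numbers in the dmesg output quoted in this thread can be checked with a small standalone sketch of the clamp. This is only an illustration in userspace C, not kernel code: the helper names clamp_vram_size_mb() and vram_size_bytes() are hypothetical, and the raw value is assumed to be what RREG32(mmCONFIG_MEMSIZE) returned, taken from the "4286582784M" figure in the bad message.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function) that mirrors the check in
 * the patch: if the upper 16 bits are set and the lower 16 bits are
 * nonzero, keep only the lower 16 bits. The value is a size in MB.
 */
uint32_t clamp_vram_size_mb(uint32_t reg)
{
	if ((reg & 0xffff0000u) && (reg & 0xffffu))
		reg &= 0xffffu;
	return reg;
}

/* Hypothetical helper: convert a size in MB to bytes without overflow. */
uint64_t vram_size_bytes(uint32_t size_mb)
{
	return (uint64_t)size_mb * 1024ULL * 1024ULL;
}
```

With the assumed raw value 4286582784 (0xFF801000), clamp_vram_size_mb() gives 0x1000, i.e. 4096 MB. vram_size_bytes(4096) is 0x100000000, and adding that to the start address 0x000000F400000000 and subtracting 1 gives the end address 0x000000F4FFFFFFFF shown in the correct message. A value such as 0x00010000, whose lower 16 bits are zero, is left unchanged by this check.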