* [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
@ 2021-05-14 8:19 changfeng.zhu
  2021-05-14 14:13 ` Alex Deucher
  0 siblings, 1 reply; 13+ messages in thread
From: changfeng.zhu @ 2021-05-14 8:19 UTC (permalink / raw)
  To: amd-gfx, Ray.Huang; +Cc: changzhu

From: changzhu <Changfeng.Zhu@amd.com>

From: Changfeng <Changfeng.Zhu@amd.com>

There is a problem with the 3DCGCG firmware that causes compute tests to
hang on picasso/raven1. Disable 3DCGCG in the driver to avoid the
compute hang.

Change-Id: Ic7d3c7922b2b32f7ac5193d6a4869cbc5b3baa87
Signed-off-by: Changfeng <Changfeng.Zhu@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 10 +++++++---
 drivers/gpu/drm/amd/amdgpu/soc15.c    |  2 --
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 22608c45f07c..feaa5e4a5538 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -4947,7 +4947,7 @@ static void gfx_v9_0_update_3d_clock_gating(struct amdgpu_device *adev,
 	amdgpu_gfx_rlc_enter_safe_mode(adev);
 
 	/* Enable 3D CGCG/CGLS */
-	if (enable && (adev->cg_flags & AMD_CG_SUPPORT_GFX_3D_CGCG)) {
+	if (enable) {
 		/* write cmd to clear cgcg/cgls ov */
 		def = data = RREG32_SOC15(GC, 0, mmRLC_CGTT_MGCG_OVERRIDE);
 		/* unset CGCG override */
@@ -4959,8 +4959,12 @@ static void gfx_v9_0_update_3d_clock_gating(struct amdgpu_device *adev,
 		/* enable 3Dcgcg FSM(0x0000363f) */
 		def = RREG32_SOC15(GC, 0, mmRLC_CGCG_CGLS_CTRL_3D);
 
-		data = (0x36 << RLC_CGCG_CGLS_CTRL_3D__CGCG_GFX_IDLE_THRESHOLD__SHIFT) |
-			RLC_CGCG_CGLS_CTRL_3D__CGCG_EN_MASK;
+		if (adev->cg_flags & AMD_CG_SUPPORT_GFX_3D_CGCG)
+			data = (0x36 << RLC_CGCG_CGLS_CTRL_3D__CGCG_GFX_IDLE_THRESHOLD__SHIFT) |
+				RLC_CGCG_CGLS_CTRL_3D__CGCG_EN_MASK;
+		else
+			data = 0x0 << RLC_CGCG_CGLS_CTRL_3D__CGCG_GFX_IDLE_THRESHOLD__SHIFT;
+
 		if (adev->cg_flags & AMD_CG_SUPPORT_GFX_3D_CGLS)
 			data |= (0x000F << RLC_CGCG_CGLS_CTRL_3D__CGLS_REP_COMPANSAT_DELAY__SHIFT) |
 				RLC_CGCG_CGLS_CTRL_3D__CGLS_EN_MASK;
diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
index 4b660b2d1c22..080e715799d4 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -1393,7 +1393,6 @@ static int soc15_common_early_init(void *handle)
 			adev->cg_flags = AMD_CG_SUPPORT_GFX_MGCG |
 				AMD_CG_SUPPORT_GFX_MGLS |
 				AMD_CG_SUPPORT_GFX_CP_LS |
-				AMD_CG_SUPPORT_GFX_3D_CGCG |
 				AMD_CG_SUPPORT_GFX_3D_CGLS |
 				AMD_CG_SUPPORT_GFX_CGCG |
 				AMD_CG_SUPPORT_GFX_CGLS |
@@ -1413,7 +1412,6 @@ static int soc15_common_early_init(void *handle)
 				AMD_CG_SUPPORT_GFX_MGLS |
 				AMD_CG_SUPPORT_GFX_RLC_LS |
 				AMD_CG_SUPPORT_GFX_CP_LS |
-				AMD_CG_SUPPORT_GFX_3D_CGCG |
 				AMD_CG_SUPPORT_GFX_3D_CGLS |
 				AMD_CG_SUPPORT_GFX_CGCG |
 				AMD_CG_SUPPORT_GFX_CGLS |
-- 
2.17.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related [flat|nested] 13+ messages in thread
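The effect of the gfx_v9_0.c hunk on the value written to mmRLC_CGCG_CGLS_CTRL_3D can be sketched as a small standalone C function. This is only an illustration, not driver code. The flag bits (CG_3D_CGCG, CG_3D_CGLS) are hypothetical stand-ins for the AMD_CG_SUPPORT_* flags, and the field masks and shifts are assumptions chosen so that the fully enabled value equals the 0x0000363f quoted in the patch comment; they are not copied from the real register headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only; the real
 * AMD_CG_SUPPORT_* values are not shown in this thread. */
#define CG_3D_CGCG (1u << 0)
#define CG_3D_CGLS (1u << 1)

/* Assumed field layout, picked only so that the fully enabled value
 * matches the 0x0000363f mentioned in the patch comment. */
#define CTRL_3D_CGCG_EN_MASK                   0x1u
#define CTRL_3D_CGLS_EN_MASK                   0x2u
#define CTRL_3D_CGLS_REP_COMPANSAT_DELAY_SHIFT 2
#define CTRL_3D_CGCG_GFX_IDLE_THRESHOLD_SHIFT  8

/* Mirrors the post-patch logic: the CGCG enable bit and idle threshold
 * are set only when the 3D CGCG flag is present, while CGLS is still
 * configured independently of it. */
uint32_t ctrl_3d_value(uint32_t cg_flags)
{
	uint32_t data = 0;

	if (cg_flags & CG_3D_CGCG)
		data = (0x36u << CTRL_3D_CGCG_GFX_IDLE_THRESHOLD_SHIFT) |
			CTRL_3D_CGCG_EN_MASK;

	if (cg_flags & CG_3D_CGLS)
		data |= (0xFu << CTRL_3D_CGLS_REP_COMPANSAT_DELAY_SHIFT) |
			CTRL_3D_CGLS_EN_MASK;

	return data;
}

/* Checks the two cases relevant to this patch under the assumed layout. */
void ctrl_3d_selfcheck(void)
{
	/* Both flags set (before the soc15.c change): full FSM value. */
	assert(ctrl_3d_value(CG_3D_CGCG | CG_3D_CGLS) == 0x0000363fu);
	/* 3D CGCG flag removed (after the change): only CGLS fields remain. */
	assert(ctrl_3d_value(CG_3D_CGLS) == 0x0000003eu);
}
```

Under these assumptions, dropping the 3D CGCG flag in soc15.c leaves the CGCG enable bit clear while the CGLS fields are still programmed, which is presumably why the `enable` check no longer tests that flag.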
* Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
  2021-05-14 8:19 [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang changfeng.zhu
@ 2021-05-14 14:13 ` Alex Deucher
  2021-05-17 6:27   ` Huang Rui
  2021-05-19 11:34   ` Nirmoy
  0 siblings, 2 replies; 13+ messages in thread
From: Alex Deucher @ 2021-05-14 14:13 UTC (permalink / raw)
  To: changzhu; +Cc: Huang Rui, amd-gfx list

On Fri, May 14, 2021 at 4:20 AM <changfeng.zhu@amd.com> wrote:
>
> From: changzhu <Changfeng.Zhu@amd.com>
>
> From: Changfeng <Changfeng.Zhu@amd.com>
>
> There is a problem with the 3DCGCG firmware that causes compute tests to
> hang on picasso/raven1. Disable 3DCGCG in the driver to avoid the
> compute hang.
>
> Change-Id: Ic7d3c7922b2b32f7ac5193d6a4869cbc5b3baa87
> Signed-off-by: Changfeng <Changfeng.Zhu@amd.com>

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

With this applied, can we re-enable the additional compute queues?

Alex
* Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
  2021-05-14 14:13 ` Alex Deucher
@ 2021-05-17 6:27   ` Huang Rui
  2021-05-17 8:09     ` Zhu, Changfeng
  2021-05-19 11:34   ` Nirmoy
  1 sibling, 1 reply; 13+ messages in thread
From: Huang Rui @ 2021-05-17 6:27 UTC (permalink / raw)
  To: Alex Deucher, Zhu, Changfeng; +Cc: amd-gfx list

On Fri, May 14, 2021 at 10:13:55PM +0800, Alex Deucher wrote:
> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
>
> With this applied, can we re-enable the additional compute queues?
>

I think so.

Changfeng, could you please confirm this on all raven series?

Patch is Reviewed-by: Huang Rui <ray.huang@amd.com>
* RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
  2021-05-17 6:27   ` Huang Rui
@ 2021-05-17 8:09     ` Zhu, Changfeng
  2021-05-19 2:19       ` Alex Deucher
  0 siblings, 1 reply; 13+ messages in thread
From: Zhu, Changfeng @ 2021-05-17 8:09 UTC (permalink / raw)
  To: Huang, Ray, Alex Deucher; +Cc: amd-gfx list

[AMD Official Use Only - Internal Distribution Only]

Hi Ray and Alex,

I have confirmed that the additional compute queues can be enabled with this patch:

[ 41.823013] This is ring mec 1, pipe 0, queue 0, value 1
[ 41.823028] This is ring mec 1, pipe 1, queue 0, value 1
[ 41.823042] This is ring mec 1, pipe 2, queue 0, value 1
[ 41.823057] This is ring mec 1, pipe 3, queue 0, value 1
[ 41.823071] This is ring mec 1, pipe 0, queue 1, value 1
[ 41.823086] This is ring mec 1, pipe 1, queue 1, value 1
[ 41.823101] This is ring mec 1, pipe 2, queue 1, value 1
[ 41.823115] This is ring mec 1, pipe 3, queue 1, value 1

BR,
Changfeng.
* Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
  2021-05-17 8:09     ` Zhu, Changfeng
@ 2021-05-19 2:19       ` Alex Deucher
  2021-05-19 2:28         ` Zhu, Changfeng
  0 siblings, 1 reply; 13+ messages in thread
From: Alex Deucher @ 2021-05-19 2:19 UTC (permalink / raw)
  To: Zhu, Changfeng; +Cc: Huang, Ray, amd-gfx list

Care to submit a patch to re-enable the extra compute queues?

Alex
* RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
  2021-05-19 2:19       ` Alex Deucher
@ 2021-05-19 2:28         ` Zhu, Changfeng
  2021-05-19 2:52           ` Deucher, Alexander
  0 siblings, 1 reply; 13+ messages in thread
From: Zhu, Changfeng @ 2021-05-19 2:28 UTC (permalink / raw)
  To: Alex Deucher; +Cc: Huang, Ray, amd-gfx list

[AMD Official Use Only - Internal Distribution Only]

Hi Alex,

I have submitted the patch:
drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

Do you mean there is something else we need to do to re-enable the extra compute queues?

BR,
Changfeng.
* Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang 2021-05-19 2:28 ` Zhu, Changfeng @ 2021-05-19 2:52 ` Deucher, Alexander 2021-05-19 2:55 ` Zhu, Changfeng 0 siblings, 1 reply; 13+ messages in thread From: Deucher, Alexander @ 2021-05-19 2:52 UTC (permalink / raw) To: Zhu, Changfeng, Alex Deucher, Das, Nirmoy; +Cc: Huang, Ray, amd-gfx list [-- Attachment #1.1: Type: text/plain, Size: 7865 bytes --] [Public] + Nirmoy I thought we disabled all but one of the compute queues on raven due to this issue. Maybe that patch never landed? Wasn't this the same issue that was exposed by Nirmoy's patch that provided better load balancing across queues? Alex ________________________________ From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of Zhu, Changfeng <Changfeng.Zhu@amd.com> Sent: Tuesday, May 18, 2021 10:28 PM To: Alex Deucher <alexdeucher@gmail.com> Cc: Huang, Ray <Ray.Huang@amd.com>; amd-gfx list <amd-gfx@lists.freedesktop.org> Subject: RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang [AMD Official Use Only - Internal Distribution Only] Hi Alex. I have submitted the patch: drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang Do you mean we have something else to do for re-enabling the extra compute queues? BR, Changfeng. -----Original Message----- From: Alex Deucher <alexdeucher@gmail.com> Sent: Wednesday, May 19, 2021 10:20 AM To: Zhu, Changfeng <Changfeng.Zhu@amd.com> Cc: Huang, Ray <Ray.Huang@amd.com>; amd-gfx list <amd-gfx@lists.freedesktop.org> Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang Care to submit a patch to re-enable the extra compute queues? 
Alex On Mon, May 17, 2021 at 4:09 AM Zhu, Changfeng <Changfeng.Zhu@amd.com> wrote: > > [AMD Official Use Only - Internal Distribution Only] > > Hi Ray and Alex, > > I have confirmed it can enable the additional compute queues with this patch: > > [ 41.823013] This is ring mec 1, pipe 0, queue 0, value 1 > [ 41.823028] This is ring mec 1, pipe 1, queue 0, value 1 > [ 41.823042] This is ring mec 1, pipe 2, queue 0, value 1 > [ 41.823057] This is ring mec 1, pipe 3, queue 0, value 1 > [ 41.823071] This is ring mec 1, pipe 0, queue 1, value 1 > [ 41.823086] This is ring mec 1, pipe 1, queue 1, value 1 > [ 41.823101] This is ring mec 1, pipe 2, queue 1, value 1 > [ 41.823115] This is ring mec 1, pipe 3, queue 1, value 1 > > BR, > Changfeng. > > > -----Original Message----- > From: Huang, Ray <Ray.Huang@amd.com> > Sent: Monday, May 17, 2021 2:27 PM > To: Alex Deucher <alexdeucher@gmail.com>; Zhu, Changfeng > <Changfeng.Zhu@amd.com> > Cc: amd-gfx list <amd-gfx@lists.freedesktop.org> > Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to > avoid compute hang > > On Fri, May 14, 2021 at 10:13:55PM +0800, Alex Deucher wrote: > > On Fri, May 14, 2021 at 4:20 AM <changfeng.zhu@amd.com> wrote: > > > > > > From: changzhu <Changfeng.Zhu@amd.com> > > > > > > From: Changfeng <Changfeng.Zhu@amd.com> > > > > > > There is problem with 3DCGCG firmware and it will cause compute > > > test hang on picasso/raven1. It needs to disable 3DCGCG in driver > > > to avoid compute hang. > > > > > > Change-Id: Ic7d3c7922b2b32f7ac5193d6a4869cbc5b3baa87 > > > Signed-off-by: Changfeng <Changfeng.Zhu@amd.com> > > > > Reviewed-by: Alex Deucher <alexander.deucher@amd.com> > > > > WIth this applied, can we re-enable the additional compute queues? > > > > I think so. > > Changfeng, could you please confirm this on all raven series? 
> > Patch is Reviewed-by: Huang Rui <ray.huang@amd.com> > > > Alex > > > > > --- > > > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 10 +++++++--- > > > drivers/gpu/drm/amd/amdgpu/soc15.c | 2 -- > > > 2 files changed, 7 insertions(+), 5 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > > index 22608c45f07c..feaa5e4a5538 100644 > > > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > > @@ -4947,7 +4947,7 @@ static void gfx_v9_0_update_3d_clock_gating(struct amdgpu_device *adev, > > > amdgpu_gfx_rlc_enter_safe_mode(adev); > > > > > > /* Enable 3D CGCG/CGLS */ > > > - if (enable && (adev->cg_flags & AMD_CG_SUPPORT_GFX_3D_CGCG)) { > > > + if (enable) { > > > /* write cmd to clear cgcg/cgls ov */ > > > def = data = RREG32_SOC15(GC, 0, mmRLC_CGTT_MGCG_OVERRIDE); > > > /* unset CGCG override */ @@ -4959,8 +4959,12 @@ > > > static void gfx_v9_0_update_3d_clock_gating(struct amdgpu_device *adev, > > > /* enable 3Dcgcg FSM(0x0000363f) */ > > > def = RREG32_SOC15(GC, 0, > > > mmRLC_CGCG_CGLS_CTRL_3D); > > > > > > - data = (0x36 << RLC_CGCG_CGLS_CTRL_3D__CGCG_GFX_IDLE_THRESHOLD__SHIFT) | > > > - RLC_CGCG_CGLS_CTRL_3D__CGCG_EN_MASK; > > > + if (adev->cg_flags & AMD_CG_SUPPORT_GFX_3D_CGCG) > > > + data = (0x36 << RLC_CGCG_CGLS_CTRL_3D__CGCG_GFX_IDLE_THRESHOLD__SHIFT) | > > > + RLC_CGCG_CGLS_CTRL_3D__CGCG_EN_MASK; > > > + else > > > + data = 0x0 << > > > + RLC_CGCG_CGLS_CTRL_3D__CGCG_GFX_IDLE_THRESHOLD__SHIFT; > > > + > > > if (adev->cg_flags & AMD_CG_SUPPORT_GFX_3D_CGLS) > > > data |= (0x000F << RLC_CGCG_CGLS_CTRL_3D__CGLS_REP_COMPANSAT_DELAY__SHIFT) | > > > > > > RLC_CGCG_CGLS_CTRL_3D__CGLS_EN_MASK; > > > diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c > > > b/drivers/gpu/drm/amd/amdgpu/soc15.c > > > index 4b660b2d1c22..080e715799d4 100644 > > > --- a/drivers/gpu/drm/amd/amdgpu/soc15.c > > > +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c > > > @@ -1393,7 +1393,6 @@ 
static int soc15_common_early_init(void *handle) > > > adev->cg_flags = AMD_CG_SUPPORT_GFX_MGCG | > > > AMD_CG_SUPPORT_GFX_MGLS | > > > AMD_CG_SUPPORT_GFX_CP_LS | > > > - AMD_CG_SUPPORT_GFX_3D_CGCG | > > > AMD_CG_SUPPORT_GFX_3D_CGLS | > > > AMD_CG_SUPPORT_GFX_CGCG | > > > AMD_CG_SUPPORT_GFX_CGLS | > > > @@ -1413,7 +1412,6 @@ static int soc15_common_early_init(void *handle) > > > AMD_CG_SUPPORT_GFX_MGLS | > > > AMD_CG_SUPPORT_GFX_RLC_LS | > > > AMD_CG_SUPPORT_GFX_CP_LS | > > > - AMD_CG_SUPPORT_GFX_3D_CGCG | > > > AMD_CG_SUPPORT_GFX_3D_CGLS | > > > AMD_CG_SUPPORT_GFX_CGCG | > > > AMD_CG_SUPPORT_GFX_CGLS | > > > -- > > > 2.17.1 > > > > > > _______________________________________________ > > > amd-gfx mailing list > > > amd-gfx@lists.freedesktop.org > > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 13+ messages in thread
* RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang 2021-05-19 2:52 ` Deucher, Alexander @ 2021-05-19 2:55 ` Zhu, Changfeng 2021-05-19 3:00 ` Chen, Guchun 0 siblings, 1 reply; 13+ messages in thread From: Zhu, Changfeng @ 2021-05-19 2:55 UTC (permalink / raw) To: Deucher, Alexander, Alex Deucher, Das, Nirmoy; +Cc: Huang, Ray, amd-gfx list
[Public]
Hi Alex,
This is the issue exposed by Nirmoy's patch that provided better load balancing across queues.
BR, Changfeng.
_______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 13+ messages in thread
* RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang 2021-05-19 2:55 ` Zhu, Changfeng @ 2021-05-19 3:00 ` Chen, Guchun 2021-05-19 3:03 ` Deucher, Alexander 0 siblings, 1 reply; 13+ messages in thread From: Chen, Guchun @ 2021-05-19 3:00 UTC (permalink / raw) To: Zhu, Changfeng, Deucher, Alexander, Alex Deucher, Das, Nirmoy Cc: Huang, Ray, amd-gfx list
[Public]
Nirmoy's patch landed already if I understand correctly.
d41a39dda140 drm/scheduler: improve job distribution with multiple queues
Regards, Guchun
_______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang 2021-05-19 3:00 ` Chen, Guchun @ 2021-05-19 3:03 ` Deucher, Alexander 2021-05-19 3:14 ` Huang, Ray 0 siblings, 1 reply; 13+ messages in thread From: Deucher, Alexander @ 2021-05-19 3:03 UTC (permalink / raw) To: Chen, Guchun, Zhu, Changfeng, Alex Deucher, Das, Nirmoy Cc: Huang, Ray, amd-gfx list
[Public]
I thought we had disabled all but one of the compute queues on raven due to this issue or at least disabled the schedulers for the additional queues, but maybe I'm misremembering.
Alex
_______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 13+ messages in thread
* RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang 2021-05-19 3:03 ` Deucher, Alexander @ 2021-05-19 3:14 ` Huang, Ray 2021-05-19 11:27 ` Nirmoy 0 siblings, 1 reply; 13+ messages in thread From: Huang, Ray @ 2021-05-19 3:14 UTC (permalink / raw) To: Deucher, Alexander, Chen, Guchun, Zhu, Changfeng, Alex Deucher, Das, Nirmoy Cc: Huang, Shimmer, amd-gfx list
[Public]
I checked, and the patch below, which limits raven to a single compute queue, has not landed in drm-next. So all queues are actually enabled at this moment. Nirmoy, can we get your confirmation?

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 97a8f786cf85..9352fcb77fe9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -812,6 +812,13 @@ void amdgpu_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v)
 int amdgpu_gfx_get_num_kcq(struct amdgpu_device *adev)
 {
 	if (amdgpu_num_kcq == -1) {
+		/* raven firmware currently can not load balance jobs
+		 * among multiple compute queues. Enable only one
+		 * compute queue till we have a firmware fix.
+		 */
+		if (adev->asic_type == CHIP_RAVEN)
+			return 1;
+
 		return 8;
 	} else if (amdgpu_num_kcq > 8 || amdgpu_num_kcq < 0) {
 		dev_warn(adev->dev, "set kernel compute queue number to 8 due to invalid parameter provided by user\n");

And I am glad to see that we now have a solution to this issue. Nice work, Changfeng!
Best Regards,
Ray

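For readers following the discussion, the queue-count selection in the amdgpu_gfx.c hunk quoted by Ray can be sketched on its own. This is only an illustration under stated assumptions, not the driver code: the enum, the `sketch_` names and the `SKETCH_MAX_KCQ` constant are hypothetical stand-ins, and the two return paths after the `-1` case are assumed from the warning text, since the quoted hunk ends before them.

```c
/* Hypothetical stand-ins for the driver's asic_type values. */
enum sketch_asic { SKETCH_ASIC_RAVEN, SKETCH_ASIC_OTHER };

/* The quoted hunk uses the literal 8 for the default queue count. */
#define SKETCH_MAX_KCQ 8

/*
 * Mirrors the shape of the quoted amdgpu_gfx_get_num_kcq() hunk.
 * requested == -1 means "no user override, use the driver default".
 */
static int sketch_get_num_kcq(int requested, enum sketch_asic asic)
{
	if (requested == -1) {
		/* The unmerged workaround: a single compute queue on raven. */
		if (asic == SKETCH_ASIC_RAVEN)
			return 1;
		return SKETCH_MAX_KCQ;
	}

	/*
	 * Assumption: an out-of-range request falls back to 8, as the
	 * dev_warn() text in the quoted hunk suggests.
	 */
	if (requested > SKETCH_MAX_KCQ || requested < 0)
		return SKETCH_MAX_KCQ;

	/* Assumption: a valid explicit request is honoured as given. */
	return requested;
}
```

Since this workaround never landed, the behaviour in drm-next corresponds to the sketch without the raven branch, which is why all eight queues were enabled when the hang was observed.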
* Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
  2021-05-19  3:14 ` Huang, Ray
@ 2021-05-19 11:27 ` Nirmoy
  0 siblings, 0 replies; 13+ messages in thread
From: Nirmoy @ 2021-05-19 11:27 UTC (permalink / raw)
  To: Huang, Ray, Deucher, Alexander, Chen, Guchun, Zhu, Changfeng, Alex Deucher, Das, Nirmoy
  Cc: Huang, Shimmer, amd-gfx list

On 5/19/21 5:14 AM, Huang, Ray wrote:
> [Public]
>
> I checked, and the patch below, which limits raven to a single compute
> queue, has not landed in drm-next. So all queues are actually enabled
> at the moment. Nirmoy, can you confirm?

Indeed, I didn't push the commit that disables all but one compute queue for raven. I was supposed to check with the KFD side, as Felix wanted to know whether that bug affects KFD. I think I got distracted by something else.

Regards,
Nirmoy

> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index 97a8f786cf85..9352fcb77fe9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -812,6 +812,13 @@ void amdgpu_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v)
>  int amdgpu_gfx_get_num_kcq(struct amdgpu_device *adev)
>  {
>  	if (amdgpu_num_kcq == -1) {
> +		/* raven firmware currently can not load balance jobs
> +		 * among multiple compute queues. Enable only one
> +		 * compute queue till we have a firmware fix.
> +		 */
> +		if (adev->asic_type == CHIP_RAVEN)
> +			return 1;
> +
>  		return 8;
>  	} else if (amdgpu_num_kcq > 8 || amdgpu_num_kcq < 0) {
>  		dev_warn(adev->dev, "set kernel compute queue number to 8 due to invalid parameter provided by user\n");
>
> I am glad to see that we now have a solution for this issue.
> Nice work, Changfeng!

* Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
  2021-05-14 14:13 ` Alex Deucher
  2021-05-17  6:27   ` Huang Rui
@ 2021-05-19 11:34   ` Nirmoy
  1 sibling, 0 replies; 13+ messages in thread
From: Nirmoy @ 2021-05-19 11:34 UTC (permalink / raw)
  To: Alex Deucher, changzhu; +Cc: Huang Rui, amd-gfx list

On 5/14/21 4:13 PM, Alex Deucher wrote:
> On Fri, May 14, 2021 at 4:20 AM <changfeng.zhu@amd.com> wrote:
>> From: changzhu <Changfeng.Zhu@amd.com>
>>
>> From: Changfeng <Changfeng.Zhu@amd.com>
>>
>> There is a problem with the 3DCGCG firmware that causes the compute
>> test to hang on picasso/raven1. 3DCGCG needs to be disabled in the
>> driver to avoid the compute hang.
>>
>> Change-Id: Ic7d3c7922b2b32f7ac5193d6a4869cbc5b3baa87
>> Signed-off-by: Changfeng <Changfeng.Zhu@amd.com>
> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
>
> With this applied, can we re-enable the additional compute queues?

I didn't push that change because I was supposed to do more tests with KFD, and I probably got distracted by some other activity. Sorry for causing this confusion!

Acked-by: Nirmoy Das <nirmoy.das@amd.com>

Regards,
Nirmoy

>
> Alex
>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 10 +++++++---
>>  drivers/gpu/drm/amd/amdgpu/soc15.c    |  2 --
>>  2 files changed, 7 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>> index 22608c45f07c..feaa5e4a5538 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>> @@ -4947,7 +4947,7 @@ static void gfx_v9_0_update_3d_clock_gating(struct amdgpu_device *adev,
>>  	amdgpu_gfx_rlc_enter_safe_mode(adev);
>>
>>  	/* Enable 3D CGCG/CGLS */
>> -	if (enable && (adev->cg_flags & AMD_CG_SUPPORT_GFX_3D_CGCG)) {
>> +	if (enable) {
>>  		/* write cmd to clear cgcg/cgls ov */
>>  		def = data = RREG32_SOC15(GC, 0, mmRLC_CGTT_MGCG_OVERRIDE);
>>  		/* unset CGCG override */
>> @@ -4959,8 +4959,12 @@ static void gfx_v9_0_update_3d_clock_gating(struct amdgpu_device *adev,
>>  		/* enable 3Dcgcg FSM(0x0000363f) */
>>  		def = RREG32_SOC15(GC, 0, mmRLC_CGCG_CGLS_CTRL_3D);
>>
>> -		data = (0x36 << RLC_CGCG_CGLS_CTRL_3D__CGCG_GFX_IDLE_THRESHOLD__SHIFT) |
>> -			RLC_CGCG_CGLS_CTRL_3D__CGCG_EN_MASK;
>> +		if (adev->cg_flags & AMD_CG_SUPPORT_GFX_3D_CGCG)
>> +			data = (0x36 << RLC_CGCG_CGLS_CTRL_3D__CGCG_GFX_IDLE_THRESHOLD__SHIFT) |
>> +				RLC_CGCG_CGLS_CTRL_3D__CGCG_EN_MASK;
>> +		else
>> +			data = 0x0 << RLC_CGCG_CGLS_CTRL_3D__CGCG_GFX_IDLE_THRESHOLD__SHIFT;
>> +
>>  		if (adev->cg_flags & AMD_CG_SUPPORT_GFX_3D_CGLS)
>>  			data |= (0x000F << RLC_CGCG_CGLS_CTRL_3D__CGLS_REP_COMPANSAT_DELAY__SHIFT) |
>>  				RLC_CGCG_CGLS_CTRL_3D__CGLS_EN_MASK;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
>> index 4b660b2d1c22..080e715799d4 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
>> @@ -1393,7 +1393,6 @@ static int soc15_common_early_init(void *handle)
>>  			adev->cg_flags = AMD_CG_SUPPORT_GFX_MGCG |
>>  				AMD_CG_SUPPORT_GFX_MGLS |
>>  				AMD_CG_SUPPORT_GFX_CP_LS |
>> -				AMD_CG_SUPPORT_GFX_3D_CGCG |
>>  				AMD_CG_SUPPORT_GFX_3D_CGLS |
>>  				AMD_CG_SUPPORT_GFX_CGCG |
>>  				AMD_CG_SUPPORT_GFX_CGLS |
>> @@ -1413,7 +1412,6 @@ static int soc15_common_early_init(void *handle)
>>  				AMD_CG_SUPPORT_GFX_MGLS |
>>  				AMD_CG_SUPPORT_GFX_RLC_LS |
>>  				AMD_CG_SUPPORT_GFX_CP_LS |
>> -				AMD_CG_SUPPORT_GFX_3D_CGCG |
>>  				AMD_CG_SUPPORT_GFX_3D_CGLS |
>>  				AMD_CG_SUPPORT_GFX_CGCG |
>>  				AMD_CG_SUPPORT_GFX_CGLS |
>> --
>> 2.17.1

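As a rough illustration of what the gfx_v9_0.c hunk in this thread changes, the sketch below computes the value that would be written to RLC_CGCG_CGLS_CTRL_3D for a given set of clock-gating flags. It is a standalone approximation, not the driver function: the `SKETCH_` flag bits are hypothetical, and the shift and mask values are assumptions chosen so that enabling both features reproduces the 0x0000363f mentioned in the patch comment.

```c
#include <stdint.h>

/* Hypothetical flag bits; the driver's AMD_CG_SUPPORT_* values differ. */
#define SKETCH_CG_3D_CGCG (1u << 0)
#define SKETCH_CG_3D_CGLS (1u << 1)

/* Assumed field layout, consistent with the 0x0000363f patch comment. */
#define SKETCH_IDLE_THRESHOLD_SHIFT 8
#define SKETCH_CGCG_EN_MASK         0x1u
#define SKETCH_CGLS_DELAY_SHIFT     2
#define SKETCH_CGLS_EN_MASK         0x2u

/* Value for the 3D control register when clock gating is being enabled. */
static uint32_t sketch_ctrl_3d_value(uint32_t cg_flags)
{
	uint32_t data;

	if (cg_flags & SKETCH_CG_3D_CGCG)
		data = (0x36u << SKETCH_IDLE_THRESHOLD_SHIFT) | SKETCH_CGCG_EN_MASK;
	else
		data = 0; /* the patch writes 0x0 << SHIFT, which is 0 */

	/* CGLS is still applied even when CGCG is not supported. */
	if (cg_flags & SKETCH_CG_3D_CGLS)
		data |= (0xFu << SKETCH_CGLS_DELAY_SHIFT) | SKETCH_CGLS_EN_MASK;

	return data;
}
```

Under these assumptions, a picasso/raven1 configuration (CGLS flag set, CGCG flag removed in soc15.c) yields a value with the CGCG enable bit and idle threshold cleared while the CGLS fields stay set, which appears to be the intent of moving the flag check inside the `if (enable)` block.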
end of thread, other threads:[~2021-05-19 11:34 UTC | newest]

Thread overview: 13+ messages
2021-05-14  8:19 [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang changfeng.zhu
2021-05-14 14:13 ` Alex Deucher
2021-05-17  6:27   ` Huang Rui
2021-05-17  8:09     ` Zhu, Changfeng
2021-05-19  2:19       ` Alex Deucher
2021-05-19  2:28         ` Zhu, Changfeng
2021-05-19  2:52           ` Deucher, Alexander
2021-05-19  2:55             ` Zhu, Changfeng
2021-05-19  3:00               ` Chen, Guchun
2021-05-19  3:03                 ` Deucher, Alexander
2021-05-19  3:14                   ` Huang, Ray
2021-05-19 11:27                     ` Nirmoy
2021-05-19 11:34   ` Nirmoy