* [PATCH 1/2] drm/amdkfd: Fix DQM asserts on Hawaii
@ 2021-12-08  8:25 Felix Kuehling
  2021-12-08  8:25 ` [PATCH 2/2] drm/amdkfd: Make KFD support on Hawaii experimental Felix Kuehling
  2022-01-11  0:13 ` [PATCH 1/2] drm/amdkfd: Fix DQM asserts on Hawaii Felix Kuehling
  0 siblings, 2 replies; 5+ messages in thread
From: Felix Kuehling @ 2021-12-08  8:25 UTC (permalink / raw)
  To: amd-gfx

start_nocpsch would never set dqm->sched_running on Hawaii due to an
early return statement. This would trigger asserts in other functions
and end up in inconsistent states.

Bug: https://github.com/RadeonOpenCompute/ROCm/issues/1624
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index dd0b952f0173..104b70e61ba0 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -1004,14 +1004,17 @@ static void uninitialize(struct device_queue_manager *dqm)
 
 static int start_nocpsch(struct device_queue_manager *dqm)
 {
+	int r = 0;
+
 	pr_info("SW scheduler is used");
 	init_interrupts(dqm);
 	
 	if (dqm->dev->adev->asic_type == CHIP_HAWAII)
-		return pm_init(&dqm->packet_mgr, dqm);
-	dqm->sched_running = true;
+		r = pm_init(&dqm->packet_mgr, dqm);
+	if (!r)
+		dqm->sched_running = true;
 
-	return 0;
+	return r;
 }
 
 static int stop_nocpsch(struct device_queue_manager *dqm)
-- 
2.32.0
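
Editorial note: the control-flow change in this patch can be illustrated outside the kernel with a minimal, self-contained sketch. Everything below is a hypothetical stand-in (struct dqm_model, fake_pm_init, start_old, start_new); none of it is the real amdkfd code. It only mirrors the shape of the hunk above: the old flow returns early on Hawaii and never sets sched_running, while the new flow sets the flag whenever initialization succeeds.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel types; not the real amdkfd structures. */
enum asic_model { CHIP_OTHER_MODEL, CHIP_HAWAII_MODEL };

struct dqm_model {
	enum asic_model asic_type;
	bool sched_running;
	int pm_init_result;	/* value the fake pm_init() should return */
};

/* Fake packet-manager init: returns whatever the model was told to return. */
int fake_pm_init(struct dqm_model *dqm)
{
	return dqm->pm_init_result;
}

/* Shape of the code before the patch: Hawaii returns early, flag never set. */
int start_old(struct dqm_model *dqm)
{
	if (dqm->asic_type == CHIP_HAWAII_MODEL)
		return fake_pm_init(dqm);
	dqm->sched_running = true;

	return 0;
}

/* Shape of the code after the patch: the flag is set whenever r is 0. */
int start_new(struct dqm_model *dqm)
{
	int r = 0;

	if (dqm->asic_type == CHIP_HAWAII_MODEL)
		r = fake_pm_init(dqm);
	if (!r)
		dqm->sched_running = true;

	return r;
}
```

With a Hawaii model whose fake_pm_init() succeeds, start_old() returns 0 but leaves sched_running false, which is the condition the commit message says trips asserts elsewhere. start_new() sets the flag in that case and still propagates a nonzero error without setting it.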



* [PATCH 2/2] drm/amdkfd: Make KFD support on Hawaii experimental
  2021-12-08  8:25 [PATCH 1/2] drm/amdkfd: Fix DQM asserts on Hawaii Felix Kuehling
@ 2021-12-08  8:25 ` Felix Kuehling
  2021-12-08 16:34   ` Russell, Kent
  2022-01-11  0:13 ` [PATCH 1/2] drm/amdkfd: Fix DQM asserts on Hawaii Felix Kuehling
  1 sibling, 1 reply; 5+ messages in thread
From: Felix Kuehling @ 2021-12-08  8:25 UTC (permalink / raw)
  To: amd-gfx

Hawaii support is mostly untested these days. ROCm user mode also
depends on custom firmware for AQL packet processing, which was never
pushed upstream due to quality regressions in graphics driver testing.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 267668b96456..facc28f58c1f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -147,7 +147,11 @@ struct kfd_dev *kgd2kfd_probe(struct amdgpu_device *adev, bool vf)
 #ifdef CONFIG_DRM_AMDGPU_CIK
 	case CHIP_HAWAII:
 		gfx_target_version = 70001;
-		if (!vf)
+		if (!amdgpu_exp_hw_support)
+			pr_info(
+	"KFD support on Hawaii is experimental. See modparam exp_hw_support\n"
+				);
+		else if (!vf)
 			f2g = &gfx_v7_kfd2kgd;
 		break;
 #endif
-- 
2.32.0
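
Editorial note: the selection logic added for CHIP_HAWAII can be summarized as a small decision function. This is a sketch under stated assumptions, not the driver code: struct kfd2kgd_model, gfx_v7_model and hawaii_select_f2g are hypothetical names, and the two booleans stand in for the amdgpu_exp_hw_support module parameter and the vf argument seen in the hunk above. It does not model what kgd2kfd_probe does afterwards when no function table was selected.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical placeholder for the kfd2kgd function table type. */
struct kfd2kgd_model {
	int placeholder;
};

/* Hypothetical stand-in for gfx_v7_kfd2kgd. */
static const struct kfd2kgd_model gfx_v7_model = { 0 };

/*
 * Mirrors the if / else-if chain from the patch:
 *  - without experimental hardware support, nothing is selected
 *    (the patch prints an informational message in this branch);
 *  - with it, the table is selected only for non-virtual functions.
 */
const struct kfd2kgd_model *hawaii_select_f2g(bool exp_hw_support, bool vf)
{
	if (!exp_hw_support)
		return NULL;
	if (!vf)
		return &gfx_v7_model;

	return NULL;
}
```

Only the combination exp_hw_support = true, vf = false yields a table. The other three combinations leave it unset, as in the patched switch case.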



* RE: [PATCH 2/2] drm/amdkfd: Make KFD support on Hawaii experimental
  2021-12-08  8:25 ` [PATCH 2/2] drm/amdkfd: Make KFD support on Hawaii experimental Felix Kuehling
@ 2021-12-08 16:34   ` Russell, Kent
  0 siblings, 0 replies; 5+ messages in thread
From: Russell, Kent @ 2021-12-08 16:34 UTC (permalink / raw)
  To: Kuehling, Felix, amd-gfx

[AMD Official Use Only]

Reviewed-by: Kent Russell <kent.russell@amd.com>



> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of Felix Kuehling
> Sent: Wednesday, December 8, 2021 3:26 AM
> To: amd-gfx@lists.freedesktop.org
> Subject: [PATCH 2/2] drm/amdkfd: Make KFD support on Hawaii experimental
>
> Hawaii support is mostly untested these days. ROCm user mode also
> depends on custom firmware for AQL packet processing, which was never
> pushed upstream due to quality regressions in graphics driver testing.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index 267668b96456..facc28f58c1f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -147,7 +147,11 @@ struct kfd_dev *kgd2kfd_probe(struct amdgpu_device *adev, bool vf)
>  #ifdef CONFIG_DRM_AMDGPU_CIK
>       case CHIP_HAWAII:
>               gfx_target_version = 70001;
> -             if (!vf)
> +             if (!amdgpu_exp_hw_support)
> +                     pr_info(
> +     "KFD support on Hawaii is experimental. See modparam exp_hw_support\n"
> +                             );
> +             else if (!vf)
>                       f2g = &gfx_v7_kfd2kgd;
>               break;
>  #endif
> --
> 2.32.0



* Re: [PATCH 1/2] drm/amdkfd: Fix DQM asserts on Hawaii
  2021-12-08  8:25 [PATCH 1/2] drm/amdkfd: Fix DQM asserts on Hawaii Felix Kuehling
  2021-12-08  8:25 ` [PATCH 2/2] drm/amdkfd: Make KFD support on Hawaii experimental Felix Kuehling
@ 2022-01-11  0:13 ` Felix Kuehling
  2022-01-11 14:41   ` Russell, Kent
  1 sibling, 1 reply; 5+ messages in thread
From: Felix Kuehling @ 2022-01-11  0:13 UTC (permalink / raw)
  To: amd-gfx

Ping.

On 2021-12-08 3:25 a.m., Felix Kuehling wrote:

> start_nocpsch would never set dqm->sched_running on Hawaii due to an
> early return statement. This would trigger asserts in other functions
> and end up in inconsistent states.
>
> Bug: https://github.com/RadeonOpenCompute/ROCm/issues/1624
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>   drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 9 ++++++---
>   1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index dd0b952f0173..104b70e61ba0 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -1004,14 +1004,17 @@ static void uninitialize(struct device_queue_manager *dqm)
>   
>   static int start_nocpsch(struct device_queue_manager *dqm)
>   {
> +	int r = 0;
> +
>   	pr_info("SW scheduler is used");
>   	init_interrupts(dqm);
>   	
>   	if (dqm->dev->adev->asic_type == CHIP_HAWAII)
> -		return pm_init(&dqm->packet_mgr, dqm);
> -	dqm->sched_running = true;
> +		r = pm_init(&dqm->packet_mgr, dqm);
> +	if (!r)
> +		dqm->sched_running = true;
>   
> -	return 0;
> +	return r;
>   }
>   
>   static int stop_nocpsch(struct device_queue_manager *dqm)


* RE: [PATCH 1/2] drm/amdkfd: Fix DQM asserts on Hawaii
  2022-01-11  0:13 ` [PATCH 1/2] drm/amdkfd: Fix DQM asserts on Hawaii Felix Kuehling
@ 2022-01-11 14:41   ` Russell, Kent
  0 siblings, 0 replies; 5+ messages in thread
From: Russell, Kent @ 2022-01-11 14:41 UTC (permalink / raw)
  To: Kuehling, Felix, amd-gfx

[AMD Official Use Only]

Reviewed-by: Kent Russell <kent.russell@amd.com>


> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of Felix Kuehling
> Sent: Monday, January 10, 2022 7:13 PM
> To: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 1/2] drm/amdkfd: Fix DQM asserts on Hawaii
>
> Ping.
>
> On 2021-12-08 3:25 a.m., Felix Kuehling wrote:
>
> > start_nocpsch would never set dqm->sched_running on Hawaii due to an
> > early return statement. This would trigger asserts in other functions
> > and end up in inconsistent states.
> >
> > Bug: https://github.com/RadeonOpenCompute/ROCm/issues/1624
> > Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> > ---
> >   drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 9 ++++++---
> >   1 file changed, 6 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> > index dd0b952f0173..104b70e61ba0 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> > @@ -1004,14 +1004,17 @@ static void uninitialize(struct device_queue_manager *dqm)
> >
> >   static int start_nocpsch(struct device_queue_manager *dqm)
> >   {
> > +   int r = 0;
> > +
> >     pr_info("SW scheduler is used");
> >     init_interrupts(dqm);
> >
> >     if (dqm->dev->adev->asic_type == CHIP_HAWAII)
> > -           return pm_init(&dqm->packet_mgr, dqm);
> > -   dqm->sched_running = true;
> > +           r = pm_init(&dqm->packet_mgr, dqm);
> > +   if (!r)
> > +           dqm->sched_running = true;
> >
> > -   return 0;
> > +   return r;
> >   }
> >
> >   static int stop_nocpsch(struct device_queue_manager *dqm)

