[PATCH] drm/amdgpu: reset asic after system-wide suspend aborted

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted
@ 2021-11-18  7:02 Prike Liang
  2021-11-18  8:00 ` Lazar, Lijo
  0 siblings, 1 reply; 15+ messages in thread
From: Prike Liang @ 2021-11-18  7:02 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alexander.Deucher, Prike Liang, ray.huang

Do ASIC reset at the moment Sx suspend aborted behind of amdgpu suspend
to keep AMDGPU in a clean reset state and that can avoid re-initialize
device improperly error.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  4 ++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    | 19 +++++++++++++++++++
 3 files changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index b85b67a..8bd9833 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1053,6 +1053,7 @@ struct amdgpu_device {
 	bool				in_s3;
 	bool				in_s4;
 	bool				in_s0ix;
+	bool				pm_completed;
 
 	atomic_t 			in_gpu_reset;
 	enum pp_mp1_state               mp1_state;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index ec42a6f..a12ed54 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3983,6 +3983,10 @@ int amdgpu_device_resume(struct drm_device *dev, bool fbcon)
 	if (adev->in_s0ix)
 		amdgpu_gfx_state_change_set(adev, sGpuChangeState_D0Entry);
 
+	if (!adev->pm_completed) {
+		dev_warn(adev->dev, "suspend aborted will do asic reset\n");
+		amdgpu_asic_reset(adev);
+	}
 	/* post card */
 	if (amdgpu_device_need_post(adev)) {
 		r = amdgpu_device_asic_init(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index eee3cf8..9f90017 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2168,6 +2168,23 @@ static int amdgpu_pmops_suspend(struct device *dev)
 	return r;
 }
 
+/*
+ * Actually the PM suspend whether is completed should be confirmed
+ * by checking the sysfs sys/power/suspend_stats/failed_suspend.However,
+ * in this function only check the AMDGPU device whether is suspended
+ * completely in the system-wide suspend process.
+ */
+static int amdgpu_pmops_noirq_suspend(struct device *dev)
+{
+	struct drm_device *drm_dev = dev_get_drvdata(dev);
+	struct amdgpu_device *adev = drm_to_adev(drm_dev);
+
+	dev_dbg(dev, "amdgpu suspend completely.\n");
+	adev->pm_completed = true;
+
+	return 0;
+}
+
 static int amdgpu_pmops_resume(struct device *dev)
 {
 	struct drm_device *drm_dev = dev_get_drvdata(dev);
@@ -2181,6 +2198,7 @@ static int amdgpu_pmops_resume(struct device *dev)
 	r = amdgpu_device_resume(drm_dev, true);
 	if (amdgpu_acpi_is_s0ix_active(adev))
 		adev->in_s0ix = false;
+	adev->pm_completed = false;
 	return r;
 }
 
@@ -2397,6 +2415,7 @@ static const struct dev_pm_ops amdgpu_pm_ops = {
 	.runtime_suspend = amdgpu_pmops_runtime_suspend,
 	.runtime_resume = amdgpu_pmops_runtime_resume,
 	.runtime_idle = amdgpu_pmops_runtime_idle,
+	.suspend_noirq = amdgpu_pmops_noirq_suspend,
 };
 
 static int amdgpu_flush(struct file *f, fl_owner_t id)
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted
  2021-11-18  7:02 [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted Prike Liang
@ 2021-11-18  8:00 ` Lazar, Lijo
  2021-11-18 13:11   ` Liang, Prike
  0 siblings, 1 reply; 15+ messages in thread
From: Lazar, Lijo @ 2021-11-18  8:00 UTC (permalink / raw)
  To: Prike Liang, amd-gfx; +Cc: Alexander.Deucher, ray.huang



On 11/18/2021 12:32 PM, Prike Liang wrote:
> Do ASIC reset at the moment Sx suspend aborted behind of amdgpu suspend
> to keep AMDGPU in a clean reset state and that can avoid re-initialize
> device improperly error.
> 
> Signed-off-by: Prike Liang <Prike.Liang@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  4 ++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    | 19 +++++++++++++++++++
>   3 files changed, 24 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index b85b67a..8bd9833 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -1053,6 +1053,7 @@ struct amdgpu_device {
>   	bool				in_s3;
>   	bool				in_s4;
>   	bool				in_s0ix;
> +	bool				pm_completed;

PM framework maintains separate flags, why not use the same?

         dev->power.is_suspended = false;
         dev->power.is_noirq_suspended = false;
         dev->power.is_late_suspended = false;

BTW, Alex posted a patch which does similar thing, though it tries reset 
if suspend fails.

https://lore.kernel.org/all/DM6PR12MB26195F8E099407B4B6966FEBE4999@DM6PR12MB2619.namprd12.prod.outlook.com/

If that reset also failed, then no point in another reset, or keep it as 
part of resume.

Thanks,
Lijo

>   
>   	atomic_t 			in_gpu_reset;
>   	enum pp_mp1_state               mp1_state;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index ec42a6f..a12ed54 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3983,6 +3983,10 @@ int amdgpu_device_resume(struct drm_device *dev, bool fbcon)
>   	if (adev->in_s0ix)
>   		amdgpu_gfx_state_change_set(adev, sGpuChangeState_D0Entry);
>   
> +	if (!adev->pm_completed) {
> +		dev_warn(adev->dev, "suspend aborted will do asic reset\n");
> +		amdgpu_asic_reset(adev);
> +	}
>   	/* post card */
>   	if (amdgpu_device_need_post(adev)) {
>   		r = amdgpu_device_asic_init(adev);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index eee3cf8..9f90017 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -2168,6 +2168,23 @@ static int amdgpu_pmops_suspend(struct device *dev)
>   	return r;
>   }
>   
> +/*
> + * Actually the PM suspend whether is completed should be confirmed
> + * by checking the sysfs sys/power/suspend_stats/failed_suspend.However,
> + * in this function only check the AMDGPU device whether is suspended
> + * completely in the system-wide suspend process.
> + */
> +static int amdgpu_pmops_noirq_suspend(struct device *dev)
> +{
> +	struct drm_device *drm_dev = dev_get_drvdata(dev);
> +	struct amdgpu_device *adev = drm_to_adev(drm_dev);
> +
> +	dev_dbg(dev, "amdgpu suspend completely.\n");
> +	adev->pm_completed = true;
> +
> +	return 0;
> +}
> +
>   static int amdgpu_pmops_resume(struct device *dev)
>   {
>   	struct drm_device *drm_dev = dev_get_drvdata(dev);
> @@ -2181,6 +2198,7 @@ static int amdgpu_pmops_resume(struct device *dev)
>   	r = amdgpu_device_resume(drm_dev, true);
>   	if (amdgpu_acpi_is_s0ix_active(adev))
>   		adev->in_s0ix = false;
> +	adev->pm_completed = false;
>   	return r;
>   }
>   
> @@ -2397,6 +2415,7 @@ static const struct dev_pm_ops amdgpu_pm_ops = {
>   	.runtime_suspend = amdgpu_pmops_runtime_suspend,
>   	.runtime_resume = amdgpu_pmops_runtime_resume,
>   	.runtime_idle = amdgpu_pmops_runtime_idle,
> +	.suspend_noirq = amdgpu_pmops_noirq_suspend,
>   };
>   
>   static int amdgpu_flush(struct file *f, fl_owner_t id)
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted
  2021-11-18  8:00 ` Lazar, Lijo
@ 2021-11-18 13:11   ` Liang, Prike
  2021-11-18 14:06     ` Alex Deucher
  0 siblings, 1 reply; 15+ messages in thread
From: Liang, Prike @ 2021-11-18 13:11 UTC (permalink / raw)
  To: Lazar, Lijo, amd-gfx; +Cc: Deucher, Alexander, Huang, Ray

[Public]

> -----Original Message-----
> From: Lazar, Lijo <Lijo.Lazar@amd.com>
> Sent: Thursday, November 18, 2021 4:01 PM
> To: Liang, Prike <Prike.Liang@amd.com>; amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Huang, Ray
> <Ray.Huang@amd.com>
> Subject: Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend
> aborted
>
>
>
> On 11/18/2021 12:32 PM, Prike Liang wrote:
> > Do ASIC reset at the moment Sx suspend aborted behind of amdgpu
> > suspend to keep AMDGPU in a clean reset state and that can avoid
> > re-initialize device improperly error.
> >
> > Signed-off-by: Prike Liang <Prike.Liang@amd.com>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  4 ++++
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    | 19
> +++++++++++++++++++
> >   3 files changed, 24 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > index b85b67a..8bd9833 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > @@ -1053,6 +1053,7 @@ struct amdgpu_device {
> >     bool                            in_s3;
> >     bool                            in_s4;
> >     bool                            in_s0ix;
> > +   bool                            pm_completed;
>
> PM framework maintains separate flags, why not use the same?
>
>          dev->power.is_suspended = false;
>          dev->power.is_noirq_suspended = false;
>          dev->power.is_late_suspended = false;
>

Thanks for pointing it out and I miss that flag. In this case we can use the PM flag is_noirq_suspended for checking AMDGPU device whether is finished suspend.

> BTW, Alex posted a patch which does similar thing, though it tries reset if
> suspend fails.
>
> https://lore.kernel.org/all/DM6PR12MB26195F8E099407B4B6966FEBE4999@
> DM6PR12MB2619.namprd12.prod.outlook.com/
>
> If that reset also failed, then no point in another reset, or keep it as part of
> resume.
>

Alex's patch seems always do the ASIC reset at the end of AMDGPU device, but that may needn't on the normal AMDGPU device suspend. However, this patch shows that can handle the system suspend aborted after AMDGPU suspend case well, so now seems we only need take care suspend abort case here.

> Thanks,
> Lijo
>
> >
> >     atomic_t                        in_gpu_reset;
> >     enum pp_mp1_state               mp1_state;
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index ec42a6f..a12ed54 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -3983,6 +3983,10 @@ int amdgpu_device_resume(struct drm_device
> *dev, bool fbcon)
> >     if (adev->in_s0ix)
> >             amdgpu_gfx_state_change_set(adev,
> sGpuChangeState_D0Entry);
> >
> > +   if (!adev->pm_completed) {
> > +           dev_warn(adev->dev, "suspend aborted will do asic reset\n");
> > +           amdgpu_asic_reset(adev);
> > +   }
> >     /* post card */
> >     if (amdgpu_device_need_post(adev)) {
> >             r = amdgpu_device_asic_init(adev); diff --git
> > a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > index eee3cf8..9f90017 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > @@ -2168,6 +2168,23 @@ static int amdgpu_pmops_suspend(struct
> device *dev)
> >     return r;
> >   }
> >
> > +/*
> > + * Actually the PM suspend whether is completed should be confirmed
> > + * by checking the sysfs
> > +sys/power/suspend_stats/failed_suspend.However,
> > + * in this function only check the AMDGPU device whether is suspended
> > + * completely in the system-wide suspend process.
> > + */
> > +static int amdgpu_pmops_noirq_suspend(struct device *dev) {
> > +   struct drm_device *drm_dev = dev_get_drvdata(dev);
> > +   struct amdgpu_device *adev = drm_to_adev(drm_dev);
> > +
> > +   dev_dbg(dev, "amdgpu suspend completely.\n");
> > +   adev->pm_completed = true;
> > +
> > +   return 0;
> > +}
> > +
> >   static int amdgpu_pmops_resume(struct device *dev)
> >   {
> >     struct drm_device *drm_dev = dev_get_drvdata(dev); @@ -2181,6
> > +2198,7 @@ static int amdgpu_pmops_resume(struct device *dev)
> >     r = amdgpu_device_resume(drm_dev, true);
> >     if (amdgpu_acpi_is_s0ix_active(adev))
> >             adev->in_s0ix = false;
> > +   adev->pm_completed = false;
> >     return r;
> >   }
> >
> > @@ -2397,6 +2415,7 @@ static const struct dev_pm_ops amdgpu_pm_ops
> = {
> >     .runtime_suspend = amdgpu_pmops_runtime_suspend,
> >     .runtime_resume = amdgpu_pmops_runtime_resume,
> >     .runtime_idle = amdgpu_pmops_runtime_idle,
> > +   .suspend_noirq = amdgpu_pmops_noirq_suspend,
> >   };
> >
> >   static int amdgpu_flush(struct file *f, fl_owner_t id)
> >

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted
  2021-11-18 13:11   ` Liang, Prike
@ 2021-11-18 14:06     ` Alex Deucher
  2021-11-18 14:09       ` Lazar, Lijo
  0 siblings, 1 reply; 15+ messages in thread
From: Alex Deucher @ 2021-11-18 14:06 UTC (permalink / raw)
  To: Liang, Prike; +Cc: Deucher, Alexander, Lazar, Lijo, Huang, Ray, amd-gfx

On Thu, Nov 18, 2021 at 8:11 AM Liang, Prike <Prike.Liang@amd.com> wrote:
>
> [Public]
>
> > -----Original Message-----
> > From: Lazar, Lijo <Lijo.Lazar@amd.com>
> > Sent: Thursday, November 18, 2021 4:01 PM
> > To: Liang, Prike <Prike.Liang@amd.com>; amd-gfx@lists.freedesktop.org
> > Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Huang, Ray
> > <Ray.Huang@amd.com>
> > Subject: Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend
> > aborted
> >
> >
> >
> > On 11/18/2021 12:32 PM, Prike Liang wrote:
> > > Do ASIC reset at the moment Sx suspend aborted behind of amdgpu
> > > suspend to keep AMDGPU in a clean reset state and that can avoid
> > > re-initialize device improperly error.
> > >
> > > Signed-off-by: Prike Liang <Prike.Liang@amd.com>
> > > ---
> > >   drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
> > >   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  4 ++++
> > >   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    | 19
> > +++++++++++++++++++
> > >   3 files changed, 24 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > > index b85b67a..8bd9833 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > > @@ -1053,6 +1053,7 @@ struct amdgpu_device {
> > >     bool                            in_s3;
> > >     bool                            in_s4;
> > >     bool                            in_s0ix;
> > > +   bool                            pm_completed;
> >
> > PM framework maintains separate flags, why not use the same?
> >
> >          dev->power.is_suspended = false;
> >          dev->power.is_noirq_suspended = false;
> >          dev->power.is_late_suspended = false;
> >
>
> Thanks for pointing it out and I miss that flag. In this case we can use the PM flag is_noirq_suspended for checking AMDGPU device whether is finished suspend.
>
> > BTW, Alex posted a patch which does similar thing, though it tries reset if
> > suspend fails.
> >
> > https://lore.kernel.org/all/DM6PR12MB26195F8E099407B4B6966FEBE4999@
> > DM6PR12MB2619.namprd12.prod.outlook.com/
> >
> > If that reset also failed, then no point in another reset, or keep it as part of
> > resume.
> >
>
> Alex's patch seems always do the ASIC reset at the end of AMDGPU device, but that may needn't on the normal AMDGPU device suspend. However, this patch shows that can handle the system suspend aborted after AMDGPU suspend case well, so now seems we only need take care suspend abort case here.
>

Yeah, I was thinking we'd take this patch rather than mine to minimize
the number of resets.

Alex


> > Thanks,
> > Lijo
> >
> > >
> > >     atomic_t                        in_gpu_reset;
> > >     enum pp_mp1_state               mp1_state;
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > index ec42a6f..a12ed54 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > @@ -3983,6 +3983,10 @@ int amdgpu_device_resume(struct drm_device
> > *dev, bool fbcon)
> > >     if (adev->in_s0ix)
> > >             amdgpu_gfx_state_change_set(adev,
> > sGpuChangeState_D0Entry);
> > >
> > > +   if (!adev->pm_completed) {
> > > +           dev_warn(adev->dev, "suspend aborted will do asic reset\n");
> > > +           amdgpu_asic_reset(adev);
> > > +   }
> > >     /* post card */
> > >     if (amdgpu_device_need_post(adev)) {
> > >             r = amdgpu_device_asic_init(adev); diff --git
> > > a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > > index eee3cf8..9f90017 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > > @@ -2168,6 +2168,23 @@ static int amdgpu_pmops_suspend(struct
> > device *dev)
> > >     return r;
> > >   }
> > >
> > > +/*
> > > + * Actually the PM suspend whether is completed should be confirmed
> > > + * by checking the sysfs
> > > +sys/power/suspend_stats/failed_suspend.However,
> > > + * in this function only check the AMDGPU device whether is suspended
> > > + * completely in the system-wide suspend process.
> > > + */
> > > +static int amdgpu_pmops_noirq_suspend(struct device *dev) {
> > > +   struct drm_device *drm_dev = dev_get_drvdata(dev);
> > > +   struct amdgpu_device *adev = drm_to_adev(drm_dev);
> > > +
> > > +   dev_dbg(dev, "amdgpu suspend completely.\n");
> > > +   adev->pm_completed = true;
> > > +
> > > +   return 0;
> > > +}
> > > +
> > >   static int amdgpu_pmops_resume(struct device *dev)
> > >   {
> > >     struct drm_device *drm_dev = dev_get_drvdata(dev); @@ -2181,6
> > > +2198,7 @@ static int amdgpu_pmops_resume(struct device *dev)
> > >     r = amdgpu_device_resume(drm_dev, true);
> > >     if (amdgpu_acpi_is_s0ix_active(adev))
> > >             adev->in_s0ix = false;
> > > +   adev->pm_completed = false;
> > >     return r;
> > >   }
> > >
> > > @@ -2397,6 +2415,7 @@ static const struct dev_pm_ops amdgpu_pm_ops
> > = {
> > >     .runtime_suspend = amdgpu_pmops_runtime_suspend,
> > >     .runtime_resume = amdgpu_pmops_runtime_resume,
> > >     .runtime_idle = amdgpu_pmops_runtime_idle,
> > > +   .suspend_noirq = amdgpu_pmops_noirq_suspend,
> > >   };
> > >
> > >   static int amdgpu_flush(struct file *f, fl_owner_t id)
> > >

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted
  2021-11-18 14:06     ` Alex Deucher
@ 2021-11-18 14:09       ` Lazar, Lijo
  2021-11-18 14:11         ` Christian König
  0 siblings, 1 reply; 15+ messages in thread
From: Lazar, Lijo @ 2021-11-18 14:09 UTC (permalink / raw)
  To: Alex Deucher, Liang, Prike; +Cc: Deucher, Alexander, Huang, Ray, amd-gfx



On 11/18/2021 7:36 PM, Alex Deucher wrote:
> On Thu, Nov 18, 2021 at 8:11 AM Liang, Prike <Prike.Liang@amd.com> wrote:
>>
>> [Public]
>>
>>> -----Original Message-----
>>> From: Lazar, Lijo <Lijo.Lazar@amd.com>
>>> Sent: Thursday, November 18, 2021 4:01 PM
>>> To: Liang, Prike <Prike.Liang@amd.com>; amd-gfx@lists.freedesktop.org
>>> Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Huang, Ray
>>> <Ray.Huang@amd.com>
>>> Subject: Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend
>>> aborted
>>>
>>>
>>>
>>> On 11/18/2021 12:32 PM, Prike Liang wrote:
>>>> Do ASIC reset at the moment Sx suspend aborted behind of amdgpu
>>>> suspend to keep AMDGPU in a clean reset state and that can avoid
>>>> re-initialize device improperly error.
>>>>
>>>> Signed-off-by: Prike Liang <Prike.Liang@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  4 ++++
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    | 19
>>> +++++++++++++++++++
>>>>    3 files changed, 24 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> index b85b67a..8bd9833 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> @@ -1053,6 +1053,7 @@ struct amdgpu_device {
>>>>      bool                            in_s3;
>>>>      bool                            in_s4;
>>>>      bool                            in_s0ix;
>>>> +   bool                            pm_completed;
>>>
>>> PM framework maintains separate flags, why not use the same?
>>>
>>>           dev->power.is_suspended = false;
>>>           dev->power.is_noirq_suspended = false;
>>>           dev->power.is_late_suspended = false;
>>>
>>
>> Thanks for pointing it out and I miss that flag. In this case we can use the PM flag is_noirq_suspended for checking AMDGPU device whether is finished suspend.
>>
>>> BTW, Alex posted a patch which does similar thing, though it tries reset if
>>> suspend fails.
>>>
>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fall%2FDM6PR12MB26195F8E099407B4B6966FEBE4999%40&amp;data=04%7C01%7CLijo.Lazar%40amd.com%7C6401dce9411b4c134b0208d9aa9ca644%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637728412055353107%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=Z4dgNvHISHVlR4grHm1RU3%2FJMJVdRe7Ukq1DnjCgG0o%3D&amp;reserved=0
>>> DM6PR12MB2619.namprd12.prod.outlook.com/
>>>
>>> If that reset also failed, then no point in another reset, or keep it as part of
>>> resume.
>>>
>>
>> Alex's patch seems always do the ASIC reset at the end of AMDGPU device, but that may needn't on the normal AMDGPU device suspend. However, this patch shows that can handle the system suspend aborted after AMDGPU suspend case well, so now seems we only need take care suspend abort case here.
>>
> 
> Yeah, I was thinking we'd take this patch rather than mine to minimize
> the number of resets.
> 

Wondering if suspend fails whether there is a need to call resume. It 
may not get to resume() to do a reset, that needs to be checked.

Thanks,
Lijo


> Alex
> 
> 
>>> Thanks,
>>> Lijo
>>>
>>>>
>>>>      atomic_t                        in_gpu_reset;
>>>>      enum pp_mp1_state               mp1_state;
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> index ec42a6f..a12ed54 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> @@ -3983,6 +3983,10 @@ int amdgpu_device_resume(struct drm_device
>>> *dev, bool fbcon)
>>>>      if (adev->in_s0ix)
>>>>              amdgpu_gfx_state_change_set(adev,
>>> sGpuChangeState_D0Entry);
>>>>
>>>> +   if (!adev->pm_completed) {
>>>> +           dev_warn(adev->dev, "suspend aborted will do asic reset\n");
>>>> +           amdgpu_asic_reset(adev);
>>>> +   }
>>>>      /* post card */
>>>>      if (amdgpu_device_need_post(adev)) {
>>>>              r = amdgpu_device_asic_init(adev); diff --git
>>>> a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> index eee3cf8..9f90017 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> @@ -2168,6 +2168,23 @@ static int amdgpu_pmops_suspend(struct
>>> device *dev)
>>>>      return r;
>>>>    }
>>>>
>>>> +/*
>>>> + * Actually the PM suspend whether is completed should be confirmed
>>>> + * by checking the sysfs
>>>> +sys/power/suspend_stats/failed_suspend.However,
>>>> + * in this function only check the AMDGPU device whether is suspended
>>>> + * completely in the system-wide suspend process.
>>>> + */
>>>> +static int amdgpu_pmops_noirq_suspend(struct device *dev) {
>>>> +   struct drm_device *drm_dev = dev_get_drvdata(dev);
>>>> +   struct amdgpu_device *adev = drm_to_adev(drm_dev);
>>>> +
>>>> +   dev_dbg(dev, "amdgpu suspend completely.\n");
>>>> +   adev->pm_completed = true;
>>>> +
>>>> +   return 0;
>>>> +}
>>>> +
>>>>    static int amdgpu_pmops_resume(struct device *dev)
>>>>    {
>>>>      struct drm_device *drm_dev = dev_get_drvdata(dev); @@ -2181,6
>>>> +2198,7 @@ static int amdgpu_pmops_resume(struct device *dev)
>>>>      r = amdgpu_device_resume(drm_dev, true);
>>>>      if (amdgpu_acpi_is_s0ix_active(adev))
>>>>              adev->in_s0ix = false;
>>>> +   adev->pm_completed = false;
>>>>      return r;
>>>>    }
>>>>
>>>> @@ -2397,6 +2415,7 @@ static const struct dev_pm_ops amdgpu_pm_ops
>>> = {
>>>>      .runtime_suspend = amdgpu_pmops_runtime_suspend,
>>>>      .runtime_resume = amdgpu_pmops_runtime_resume,
>>>>      .runtime_idle = amdgpu_pmops_runtime_idle,
>>>> +   .suspend_noirq = amdgpu_pmops_noirq_suspend,
>>>>    };
>>>>
>>>>    static int amdgpu_flush(struct file *f, fl_owner_t id)
>>>>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted
  2021-11-18 14:09       ` Lazar, Lijo
@ 2021-11-18 14:11         ` Christian König
  2021-11-18 14:14           ` Lazar, Lijo
  0 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2021-11-18 14:11 UTC (permalink / raw)
  To: Lazar, Lijo, Alex Deucher, Liang, Prike
  Cc: Deucher, Alexander, Huang, Ray, amd-gfx

Am 18.11.21 um 15:09 schrieb Lazar, Lijo:
> On 11/18/2021 7:36 PM, Alex Deucher wrote:
>> On Thu, Nov 18, 2021 at 8:11 AM Liang, Prike <Prike.Liang@amd.com> 
>> wrote:
>>>
>>> [Public]
>>>
>>>> -----Original Message-----
>>>> From: Lazar, Lijo <Lijo.Lazar@amd.com>
>>>> Sent: Thursday, November 18, 2021 4:01 PM
>>>> To: Liang, Prike <Prike.Liang@amd.com>; amd-gfx@lists.freedesktop.org
>>>> Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Huang, Ray
>>>> <Ray.Huang@amd.com>
>>>> Subject: Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend
>>>> aborted
>>>>
>>>>
>>>>
>>>> On 11/18/2021 12:32 PM, Prike Liang wrote:
>>>>> Do ASIC reset at the moment Sx suspend aborted behind of amdgpu
>>>>> suspend to keep AMDGPU in a clean reset state and that can avoid
>>>>> re-initialize device improperly error.
>>>>>
>>>>> Signed-off-by: Prike Liang <Prike.Liang@amd.com>
>>>>> ---
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  4 ++++
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    | 19
>>>> +++++++++++++++++++
>>>>>    3 files changed, 24 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> index b85b67a..8bd9833 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> @@ -1053,6 +1053,7 @@ struct amdgpu_device {
>>>>>      bool                            in_s3;
>>>>>      bool                            in_s4;
>>>>>      bool                            in_s0ix;
>>>>> +   bool                            pm_completed;
>>>>
>>>> PM framework maintains separate flags, why not use the same?
>>>>
>>>>           dev->power.is_suspended = false;
>>>>           dev->power.is_noirq_suspended = false;
>>>>           dev->power.is_late_suspended = false;
>>>>
>>>
>>> Thanks for pointing it out and I miss that flag. In this case we can 
>>> use the PM flag is_noirq_suspended for checking AMDGPU device 
>>> whether is finished suspend.
>>>
>>>> BTW, Alex posted a patch which does similar thing, though it tries 
>>>> reset if
>>>> suspend fails.
>>>>
>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fall%2FDM6PR12MB26195F8E099407B4B6966FEBE4999%40&amp;data=04%7C01%7CLijo.Lazar%40amd.com%7C6401dce9411b4c134b0208d9aa9ca644%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637728412055353107%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=Z4dgNvHISHVlR4grHm1RU3%2FJMJVdRe7Ukq1DnjCgG0o%3D&amp;reserved=0 
>>>>
>>>> DM6PR12MB2619.namprd12.prod.outlook.com/
>>>>
>>>> If that reset also failed, then no point in another reset, or keep 
>>>> it as part of
>>>> resume.
>>>>
>>>
>>> Alex's patch seems always do the ASIC reset at the end of AMDGPU 
>>> device, but that may needn't on the normal AMDGPU device suspend. 
>>> However, this patch shows that can handle the system suspend aborted 
>>> after AMDGPU suspend case well, so now seems we only need take care 
>>> suspend abort case here.
>>>
>>
>> Yeah, I was thinking we'd take this patch rather than mine to minimize
>> the number of resets.
>>
>
> Wondering if suspend fails whether there is a need to call resume. It 
> may not get to resume() to do a reset, that needs to be checked.

I would rather do it so that we reset the ASIC during resume when we 
detect that the hardware is in a bad state.

This way it should also work with things like kexec and virtualization.

Regards,
Christian.

>
> Thanks,
> Lijo
>
>
>> Alex
>>
>>
>>>> Thanks,
>>>> Lijo
>>>>
>>>>>
>>>>>      atomic_t                        in_gpu_reset;
>>>>>      enum pp_mp1_state               mp1_state;
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> index ec42a6f..a12ed54 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> @@ -3983,6 +3983,10 @@ int amdgpu_device_resume(struct drm_device
>>>> *dev, bool fbcon)
>>>>>      if (adev->in_s0ix)
>>>>>              amdgpu_gfx_state_change_set(adev,
>>>> sGpuChangeState_D0Entry);
>>>>>
>>>>> +   if (!adev->pm_completed) {
>>>>> +           dev_warn(adev->dev, "suspend aborted will do asic 
>>>>> reset\n");
>>>>> +           amdgpu_asic_reset(adev);
>>>>> +   }
>>>>>      /* post card */
>>>>>      if (amdgpu_device_need_post(adev)) {
>>>>>              r = amdgpu_device_asic_init(adev); diff --git
>>>>> a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>> index eee3cf8..9f90017 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>> @@ -2168,6 +2168,23 @@ static int amdgpu_pmops_suspend(struct
>>>> device *dev)
>>>>>      return r;
>>>>>    }
>>>>>
>>>>> +/*
>>>>> + * Actually the PM suspend whether is completed should be confirmed
>>>>> + * by checking the sysfs
>>>>> +sys/power/suspend_stats/failed_suspend.However,
>>>>> + * in this function only check the AMDGPU device whether is 
>>>>> suspended
>>>>> + * completely in the system-wide suspend process.
>>>>> + */
>>>>> +static int amdgpu_pmops_noirq_suspend(struct device *dev) {
>>>>> +   struct drm_device *drm_dev = dev_get_drvdata(dev);
>>>>> +   struct amdgpu_device *adev = drm_to_adev(drm_dev);
>>>>> +
>>>>> +   dev_dbg(dev, "amdgpu suspend completely.\n");
>>>>> +   adev->pm_completed = true;
>>>>> +
>>>>> +   return 0;
>>>>> +}
>>>>> +
>>>>>    static int amdgpu_pmops_resume(struct device *dev)
>>>>>    {
>>>>>      struct drm_device *drm_dev = dev_get_drvdata(dev); @@ -2181,6
>>>>> +2198,7 @@ static int amdgpu_pmops_resume(struct device *dev)
>>>>>      r = amdgpu_device_resume(drm_dev, true);
>>>>>      if (amdgpu_acpi_is_s0ix_active(adev))
>>>>>              adev->in_s0ix = false;
>>>>> +   adev->pm_completed = false;
>>>>>      return r;
>>>>>    }
>>>>>
>>>>> @@ -2397,6 +2415,7 @@ static const struct dev_pm_ops amdgpu_pm_ops
>>>> = {
>>>>>      .runtime_suspend = amdgpu_pmops_runtime_suspend,
>>>>>      .runtime_resume = amdgpu_pmops_runtime_resume,
>>>>>      .runtime_idle = amdgpu_pmops_runtime_idle,
>>>>> +   .suspend_noirq = amdgpu_pmops_noirq_suspend,
>>>>>    };
>>>>>
>>>>>    static int amdgpu_flush(struct file *f, fl_owner_t id)
>>>>>


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted
  2021-11-18 14:11         ` Christian König
@ 2021-11-18 14:14           ` Lazar, Lijo
  2021-11-18 14:25             ` Alex Deucher
  0 siblings, 1 reply; 15+ messages in thread
From: Lazar, Lijo @ 2021-11-18 14:14 UTC (permalink / raw)
  To: Christian König, Alex Deucher, Liang, Prike
  Cc: Deucher, Alexander, Huang, Ray, amd-gfx



On 11/18/2021 7:41 PM, Christian König wrote:
> Am 18.11.21 um 15:09 schrieb Lazar, Lijo:
>> On 11/18/2021 7:36 PM, Alex Deucher wrote:
>>> On Thu, Nov 18, 2021 at 8:11 AM Liang, Prike <Prike.Liang@amd.com> 
>>> wrote:
>>>>
>>>> [Public]
>>>>
>>>>> -----Original Message-----
>>>>> From: Lazar, Lijo <Lijo.Lazar@amd.com>
>>>>> Sent: Thursday, November 18, 2021 4:01 PM
>>>>> To: Liang, Prike <Prike.Liang@amd.com>; amd-gfx@lists.freedesktop.org
>>>>> Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Huang, Ray
>>>>> <Ray.Huang@amd.com>
>>>>> Subject: Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend
>>>>> aborted
>>>>>
>>>>>
>>>>>
>>>>> On 11/18/2021 12:32 PM, Prike Liang wrote:
>>>>>> Do ASIC reset at the moment Sx suspend aborted behind of amdgpu
>>>>>> suspend to keep AMDGPU in a clean reset state and that can avoid
>>>>>> re-initialize device improperly error.
>>>>>>
>>>>>> Signed-off-by: Prike Liang <Prike.Liang@amd.com>
>>>>>> ---
>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  4 ++++
>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    | 19
>>>>> +++++++++++++++++++
>>>>>>    3 files changed, 24 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>> index b85b67a..8bd9833 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>> @@ -1053,6 +1053,7 @@ struct amdgpu_device {
>>>>>>      bool                            in_s3;
>>>>>>      bool                            in_s4;
>>>>>>      bool                            in_s0ix;
>>>>>> +   bool                            pm_completed;
>>>>>
>>>>> PM framework maintains separate flags, why not use the same?
>>>>>
>>>>>           dev->power.is_suspended = false;
>>>>>           dev->power.is_noirq_suspended = false;
>>>>>           dev->power.is_late_suspended = false;
>>>>>
>>>>
>>>> Thanks for pointing it out and I miss that flag. In this case we can 
>>>> use the PM flag is_noirq_suspended for checking AMDGPU device 
>>>> whether is finished suspend.
>>>>
>>>>> BTW, Alex posted a patch which does similar thing, though it tries 
>>>>> reset if
>>>>> suspend fails.
>>>>>
>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fall%2FDM6PR12MB26195F8E099407B4B6966FEBE4999%40&amp;data=04%7C01%7Clijo.lazar%40amd.com%7C79c861fcfccc4f2cc45d08d9aa9d61f0%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637728415203834017%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=gGnctl8rlNTj6o02hlE06og3RCA%2BQP37B3ejsZSxPdM%3D&amp;reserved=0 
>>>>>
>>>>> DM6PR12MB2619.namprd12.prod.outlook.com/
>>>>>
>>>>> If that reset also failed, then no point in another reset, or keep 
>>>>> it as part of
>>>>> resume.
>>>>>
>>>>
>>>> Alex's patch seems always do the ASIC reset at the end of AMDGPU 
>>>> device, but that may needn't on the normal AMDGPU device suspend. 
>>>> However, this patch shows that can handle the system suspend aborted 
>>>> after AMDGPU suspend case well, so now seems we only need take care 
>>>> suspend abort case here.
>>>>
>>>
>>> Yeah, I was thinking we'd take this patch rather than mine to minimize
>>> the number of resets.
>>>
>>
>> Wondering if suspend fails whether there is a need to call resume. It 
>> may not get to resume() to do a reset, that needs to be checked.
> 
> I would rather do it so that we reset the ASIC during resume when we 
> detect that the hardware is in a bad state.
> 
> This way it should also work with things like kexec and virtualization.

I was thinking from the power framework perspective. If the device's 
suspend() callback returns failure, why would the framework needs to 
call a resume() on that device.

Thanks,
Lijo

> 
> Regards,
> Christian.
> 
>>
>> Thanks,
>> Lijo
>>
>>
>>> Alex
>>>
>>>
>>>>> Thanks,
>>>>> Lijo
>>>>>
>>>>>>
>>>>>>      atomic_t                        in_gpu_reset;
>>>>>>      enum pp_mp1_state               mp1_state;
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> index ec42a6f..a12ed54 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> @@ -3983,6 +3983,10 @@ int amdgpu_device_resume(struct drm_device
>>>>> *dev, bool fbcon)
>>>>>>      if (adev->in_s0ix)
>>>>>>              amdgpu_gfx_state_change_set(adev,
>>>>> sGpuChangeState_D0Entry);
>>>>>>
>>>>>> +   if (!adev->pm_completed) {
>>>>>> +           dev_warn(adev->dev, "suspend aborted will do asic 
>>>>>> reset\n");
>>>>>> +           amdgpu_asic_reset(adev);
>>>>>> +   }
>>>>>>      /* post card */
>>>>>>      if (amdgpu_device_need_post(adev)) {
>>>>>>              r = amdgpu_device_asic_init(adev); diff --git
>>>>>> a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>> index eee3cf8..9f90017 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>> @@ -2168,6 +2168,23 @@ static int amdgpu_pmops_suspend(struct
>>>>> device *dev)
>>>>>>      return r;
>>>>>>    }
>>>>>>
>>>>>> +/*
>>>>>> + * Actually the PM suspend whether is completed should be confirmed
>>>>>> + * by checking the sysfs
>>>>>> +sys/power/suspend_stats/failed_suspend.However,
>>>>>> + * in this function only check the AMDGPU device whether is 
>>>>>> suspended
>>>>>> + * completely in the system-wide suspend process.
>>>>>> + */
>>>>>> +static int amdgpu_pmops_noirq_suspend(struct device *dev) {
>>>>>> +   struct drm_device *drm_dev = dev_get_drvdata(dev);
>>>>>> +   struct amdgpu_device *adev = drm_to_adev(drm_dev);
>>>>>> +
>>>>>> +   dev_dbg(dev, "amdgpu suspend completely.\n");
>>>>>> +   adev->pm_completed = true;
>>>>>> +
>>>>>> +   return 0;
>>>>>> +}
>>>>>> +
>>>>>>    static int amdgpu_pmops_resume(struct device *dev)
>>>>>>    {
>>>>>>      struct drm_device *drm_dev = dev_get_drvdata(dev); @@ -2181,6
>>>>>> +2198,7 @@ static int amdgpu_pmops_resume(struct device *dev)
>>>>>>      r = amdgpu_device_resume(drm_dev, true);
>>>>>>      if (amdgpu_acpi_is_s0ix_active(adev))
>>>>>>              adev->in_s0ix = false;
>>>>>> +   adev->pm_completed = false;
>>>>>>      return r;
>>>>>>    }
>>>>>>
>>>>>> @@ -2397,6 +2415,7 @@ static const struct dev_pm_ops amdgpu_pm_ops
>>>>> = {
>>>>>>      .runtime_suspend = amdgpu_pmops_runtime_suspend,
>>>>>>      .runtime_resume = amdgpu_pmops_runtime_resume,
>>>>>>      .runtime_idle = amdgpu_pmops_runtime_idle,
>>>>>> +   .suspend_noirq = amdgpu_pmops_noirq_suspend,
>>>>>>    };
>>>>>>
>>>>>>    static int amdgpu_flush(struct file *f, fl_owner_t id)
>>>>>>
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted
  2021-11-18 14:14           ` Lazar, Lijo
@ 2021-11-18 14:25             ` Alex Deucher
  2021-11-18 14:28               ` Lazar, Lijo
  0 siblings, 1 reply; 15+ messages in thread
From: Alex Deucher @ 2021-11-18 14:25 UTC (permalink / raw)
  To: Lazar, Lijo
  Cc: Deucher, Alexander, Christian König, Liang, Prike, Huang,
	Ray, amd-gfx

On Thu, Nov 18, 2021 at 9:15 AM Lazar, Lijo <lijo.lazar@amd.com> wrote:
>
>
>
> On 11/18/2021 7:41 PM, Christian König wrote:
> > Am 18.11.21 um 15:09 schrieb Lazar, Lijo:
> >> On 11/18/2021 7:36 PM, Alex Deucher wrote:
> >>> On Thu, Nov 18, 2021 at 8:11 AM Liang, Prike <Prike.Liang@amd.com>
> >>> wrote:
> >>>>
> >>>> [Public]
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Lazar, Lijo <Lijo.Lazar@amd.com>
> >>>>> Sent: Thursday, November 18, 2021 4:01 PM
> >>>>> To: Liang, Prike <Prike.Liang@amd.com>; amd-gfx@lists.freedesktop.org
> >>>>> Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Huang, Ray
> >>>>> <Ray.Huang@amd.com>
> >>>>> Subject: Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend
> >>>>> aborted
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 11/18/2021 12:32 PM, Prike Liang wrote:
> >>>>>> Do ASIC reset at the moment Sx suspend aborted behind of amdgpu
> >>>>>> suspend to keep AMDGPU in a clean reset state and that can avoid
> >>>>>> re-initialize device improperly error.
> >>>>>>
> >>>>>> Signed-off-by: Prike Liang <Prike.Liang@amd.com>
> >>>>>> ---
> >>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
> >>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  4 ++++
> >>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    | 19
> >>>>> +++++++++++++++++++
> >>>>>>    3 files changed, 24 insertions(+)
> >>>>>>
> >>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >>>>>> index b85b67a..8bd9833 100644
> >>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >>>>>> @@ -1053,6 +1053,7 @@ struct amdgpu_device {
> >>>>>>      bool                            in_s3;
> >>>>>>      bool                            in_s4;
> >>>>>>      bool                            in_s0ix;
> >>>>>> +   bool                            pm_completed;
> >>>>>
> >>>>> PM framework maintains separate flags, why not use the same?
> >>>>>
> >>>>>           dev->power.is_suspended = false;
> >>>>>           dev->power.is_noirq_suspended = false;
> >>>>>           dev->power.is_late_suspended = false;
> >>>>>
> >>>>
> >>>> Thanks for pointing it out and I miss that flag. In this case we can
> >>>> use the PM flag is_noirq_suspended for checking AMDGPU device
> >>>> whether is finished suspend.
> >>>>
> >>>>> BTW, Alex posted a patch which does similar thing, though it tries
> >>>>> reset if
> >>>>> suspend fails.
> >>>>>
> >>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fall%2FDM6PR12MB26195F8E099407B4B6966FEBE4999%40&amp;data=04%7C01%7Clijo.lazar%40amd.com%7C79c861fcfccc4f2cc45d08d9aa9d61f0%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637728415203834017%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=gGnctl8rlNTj6o02hlE06og3RCA%2BQP37B3ejsZSxPdM%3D&amp;reserved=0
> >>>>>
> >>>>> DM6PR12MB2619.namprd12.prod.outlook.com/
> >>>>>
> >>>>> If that reset also failed, then no point in another reset, or keep
> >>>>> it as part of
> >>>>> resume.
> >>>>>
> >>>>
> >>>> Alex's patch seems always do the ASIC reset at the end of AMDGPU
> >>>> device, but that may needn't on the normal AMDGPU device suspend.
> >>>> However, this patch shows that can handle the system suspend aborted
> >>>> after AMDGPU suspend case well, so now seems we only need take care
> >>>> suspend abort case here.
> >>>>
> >>>
> >>> Yeah, I was thinking we'd take this patch rather than mine to minimize
> >>> the number of resets.
> >>>
> >>
> >> Wondering if suspend fails whether there is a need to call resume. It
> >> may not get to resume() to do a reset, that needs to be checked.
> >
> > I would rather do it so that we reset the ASIC during resume when we
> > detect that the hardware is in a bad state.
> >
> > This way it should also work with things like kexec and virtualization.
>
> I was thinking from the power framework perspective. If the device's
> suspend() callback returns failure, why would the framework needs to
> call a resume() on that device.

The device's suspend callback succeeds, but some other device or event
in the system causes the overall suspend to abort.  As such the GPU is
never powered off by the platform so it's left in a partially
initialized state.  The system then calls the resume() callbacks for
all of the devices that were previously suspended to bring them back
to a working system.  GPU reset is just a convenient way to get the
device back into a known good state.  Unfortunately, I'm not sure if
there is a good way to detect whether the GPU is in a known good state
or not until we try and re-init the IPs and at that point the IPs are
not fully initialized so it's complex to try and unwind that, reset,
and try again.  It's probably easiest to just reset the GPU on resume
all the time.  If the GPU was powered down, the reset should work fine
since we are resetting a GPU that just came out of reset.  If the GPU
was not powered down, the reset will put the GPU into a known good
state.

Alex

>
> Thanks,
> Lijo
>
> >
> > Regards,
> > Christian.
> >
> >>
> >> Thanks,
> >> Lijo
> >>
> >>
> >>> Alex
> >>>
> >>>
> >>>>> Thanks,
> >>>>> Lijo
> >>>>>
> >>>>>>
> >>>>>>      atomic_t                        in_gpu_reset;
> >>>>>>      enum pp_mp1_state               mp1_state;
> >>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>>>>> index ec42a6f..a12ed54 100644
> >>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>>>>> @@ -3983,6 +3983,10 @@ int amdgpu_device_resume(struct drm_device
> >>>>> *dev, bool fbcon)
> >>>>>>      if (adev->in_s0ix)
> >>>>>>              amdgpu_gfx_state_change_set(adev,
> >>>>> sGpuChangeState_D0Entry);
> >>>>>>
> >>>>>> +   if (!adev->pm_completed) {
> >>>>>> +           dev_warn(adev->dev, "suspend aborted will do asic
> >>>>>> reset\n");
> >>>>>> +           amdgpu_asic_reset(adev);
> >>>>>> +   }
> >>>>>>      /* post card */
> >>>>>>      if (amdgpu_device_need_post(adev)) {
> >>>>>>              r = amdgpu_device_asic_init(adev); diff --git
> >>>>>> a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>>>>> index eee3cf8..9f90017 100644
> >>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>>>>> @@ -2168,6 +2168,23 @@ static int amdgpu_pmops_suspend(struct
> >>>>> device *dev)
> >>>>>>      return r;
> >>>>>>    }
> >>>>>>
> >>>>>> +/*
> >>>>>> + * Actually the PM suspend whether is completed should be confirmed
> >>>>>> + * by checking the sysfs
> >>>>>> +sys/power/suspend_stats/failed_suspend.However,
> >>>>>> + * in this function only check the AMDGPU device whether is
> >>>>>> suspended
> >>>>>> + * completely in the system-wide suspend process.
> >>>>>> + */
> >>>>>> +static int amdgpu_pmops_noirq_suspend(struct device *dev) {
> >>>>>> +   struct drm_device *drm_dev = dev_get_drvdata(dev);
> >>>>>> +   struct amdgpu_device *adev = drm_to_adev(drm_dev);
> >>>>>> +
> >>>>>> +   dev_dbg(dev, "amdgpu suspend completely.\n");
> >>>>>> +   adev->pm_completed = true;
> >>>>>> +
> >>>>>> +   return 0;
> >>>>>> +}
> >>>>>> +
> >>>>>>    static int amdgpu_pmops_resume(struct device *dev)
> >>>>>>    {
> >>>>>>      struct drm_device *drm_dev = dev_get_drvdata(dev); @@ -2181,6
> >>>>>> +2198,7 @@ static int amdgpu_pmops_resume(struct device *dev)
> >>>>>>      r = amdgpu_device_resume(drm_dev, true);
> >>>>>>      if (amdgpu_acpi_is_s0ix_active(adev))
> >>>>>>              adev->in_s0ix = false;
> >>>>>> +   adev->pm_completed = false;
> >>>>>>      return r;
> >>>>>>    }
> >>>>>>
> >>>>>> @@ -2397,6 +2415,7 @@ static const struct dev_pm_ops amdgpu_pm_ops
> >>>>> = {
> >>>>>>      .runtime_suspend = amdgpu_pmops_runtime_suspend,
> >>>>>>      .runtime_resume = amdgpu_pmops_runtime_resume,
> >>>>>>      .runtime_idle = amdgpu_pmops_runtime_idle,
> >>>>>> +   .suspend_noirq = amdgpu_pmops_noirq_suspend,
> >>>>>>    };
> >>>>>>
> >>>>>>    static int amdgpu_flush(struct file *f, fl_owner_t id)
> >>>>>>
> >

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted
  2021-11-18 14:25             ` Alex Deucher
@ 2021-11-18 14:28               ` Lazar, Lijo
  2021-11-18 15:01                 ` Lazar, Lijo
  2021-11-22 14:10                 ` Liang, Prike
  0 siblings, 2 replies; 15+ messages in thread
From: Lazar, Lijo @ 2021-11-18 14:28 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Deucher, Alexander, Christian König, Liang, Prike, Huang,
	Ray, amd-gfx



On 11/18/2021 7:55 PM, Alex Deucher wrote:
> On Thu, Nov 18, 2021 at 9:15 AM Lazar, Lijo <lijo.lazar@amd.com> wrote:
>>
>>
>>
>> On 11/18/2021 7:41 PM, Christian König wrote:
>>> Am 18.11.21 um 15:09 schrieb Lazar, Lijo:
>>>> On 11/18/2021 7:36 PM, Alex Deucher wrote:
>>>>> On Thu, Nov 18, 2021 at 8:11 AM Liang, Prike <Prike.Liang@amd.com>
>>>>> wrote:
>>>>>>
>>>>>> [Public]
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Lazar, Lijo <Lijo.Lazar@amd.com>
>>>>>>> Sent: Thursday, November 18, 2021 4:01 PM
>>>>>>> To: Liang, Prike <Prike.Liang@amd.com>; amd-gfx@lists.freedesktop.org
>>>>>>> Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Huang, Ray
>>>>>>> <Ray.Huang@amd.com>
>>>>>>> Subject: Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend
>>>>>>> aborted
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 11/18/2021 12:32 PM, Prike Liang wrote:
>>>>>>>> Do ASIC reset at the moment Sx suspend aborted behind of amdgpu
>>>>>>>> suspend to keep AMDGPU in a clean reset state and that can avoid
>>>>>>>> re-initialize device improperly error.
>>>>>>>>
>>>>>>>> Signed-off-by: Prike Liang <Prike.Liang@amd.com>
>>>>>>>> ---
>>>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
>>>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  4 ++++
>>>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    | 19
>>>>>>> +++++++++++++++++++
>>>>>>>>     3 files changed, 24 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>>>> index b85b67a..8bd9833 100644
>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>>>> @@ -1053,6 +1053,7 @@ struct amdgpu_device {
>>>>>>>>       bool                            in_s3;
>>>>>>>>       bool                            in_s4;
>>>>>>>>       bool                            in_s0ix;
>>>>>>>> +   bool                            pm_completed;
>>>>>>>
>>>>>>> PM framework maintains separate flags, why not use the same?
>>>>>>>
>>>>>>>            dev->power.is_suspended = false;
>>>>>>>            dev->power.is_noirq_suspended = false;
>>>>>>>            dev->power.is_late_suspended = false;
>>>>>>>
>>>>>>
>>>>>> Thanks for pointing it out and I miss that flag. In this case we can
>>>>>> use the PM flag is_noirq_suspended for checking AMDGPU device
>>>>>> whether is finished suspend.
>>>>>>
>>>>>>> BTW, Alex posted a patch which does similar thing, though it tries
>>>>>>> reset if
>>>>>>> suspend fails.
>>>>>>>
>>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fall%2FDM6PR12MB26195F8E099407B4B6966FEBE4999%40&amp;data=04%7C01%7Clijo.lazar%40amd.com%7C2ce211aeeeb448f6cb2c08d9aa9f4741%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637728423343483218%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=nyzhGwTJV83YZkit34Bb%2B5tBxGEMvFzCyZPjz8eSDgc%3D&amp;reserved=0
>>>>>>>
>>>>>>> DM6PR12MB2619.namprd12.prod.outlook.com/
>>>>>>>
>>>>>>> If that reset also failed, then no point in another reset, or keep
>>>>>>> it as part of
>>>>>>> resume.
>>>>>>>
>>>>>>
>>>>>> Alex's patch seems always do the ASIC reset at the end of AMDGPU
>>>>>> device, but that may needn't on the normal AMDGPU device suspend.
>>>>>> However, this patch shows that can handle the system suspend aborted
>>>>>> after AMDGPU suspend case well, so now seems we only need take care
>>>>>> suspend abort case here.
>>>>>>
>>>>>
>>>>> Yeah, I was thinking we'd take this patch rather than mine to minimize
>>>>> the number of resets.
>>>>>
>>>>
>>>> Wondering if suspend fails whether there is a need to call resume. It
>>>> may not get to resume() to do a reset, that needs to be checked.
>>>
>>> I would rather do it so that we reset the ASIC during resume when we
>>> detect that the hardware is in a bad state.
>>>
>>> This way it should also work with things like kexec and virtualization.
>>
>> I was thinking from the power framework perspective. If the device's
>> suspend() callback returns failure, why would the framework needs to
>> call a resume() on that device.
> 
> The device's suspend callback succeeds, but some other device or event
> in the system causes the overall suspend to abort.  As such the GPU is
> never powered off by the platform so it's left in a partially
> initialized state.  The system then calls the resume() callbacks for
> all of the devices that were previously suspended to bring them back
> to a working system.  GPU reset is just a convenient way to get the
> device back into a known good state.  Unfortunately, I'm not sure if
> there is a good way to detect whether the GPU is in a known good state
> or not until we try and re-init the IPs and at that point the IPs are
> not fully initialized so it's complex to try and unwind that, reset,
> and try again.  It's probably easiest to just reset the GPU on resume
> all the time.  If the GPU was powered down, the reset should work fine
> since we are resetting a GPU that just came out of reset.  If the GPU
> was not powered down, the reset will put the GPU into a known good
> state.
> 

https://elixir.bootlin.com/linux/latest/source/drivers/base/power/main.c#L925

I don't have a machine to trace the path. The flag is set only if 
suspend is succesful.

Thanks,
Lijo

> Alex
> 
>>
>> Thanks,
>> Lijo
>>
>>>
>>> Regards,
>>> Christian.
>>>
>>>>
>>>> Thanks,
>>>> Lijo
>>>>
>>>>
>>>>> Alex
>>>>>
>>>>>
>>>>>>> Thanks,
>>>>>>> Lijo
>>>>>>>
>>>>>>>>
>>>>>>>>       atomic_t                        in_gpu_reset;
>>>>>>>>       enum pp_mp1_state               mp1_state;
>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>>> index ec42a6f..a12ed54 100644
>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>>> @@ -3983,6 +3983,10 @@ int amdgpu_device_resume(struct drm_device
>>>>>>> *dev, bool fbcon)
>>>>>>>>       if (adev->in_s0ix)
>>>>>>>>               amdgpu_gfx_state_change_set(adev,
>>>>>>> sGpuChangeState_D0Entry);
>>>>>>>>
>>>>>>>> +   if (!adev->pm_completed) {
>>>>>>>> +           dev_warn(adev->dev, "suspend aborted will do asic
>>>>>>>> reset\n");
>>>>>>>> +           amdgpu_asic_reset(adev);
>>>>>>>> +   }
>>>>>>>>       /* post card */
>>>>>>>>       if (amdgpu_device_need_post(adev)) {
>>>>>>>>               r = amdgpu_device_asic_init(adev); diff --git
>>>>>>>> a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>>>> index eee3cf8..9f90017 100644
>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>>>> @@ -2168,6 +2168,23 @@ static int amdgpu_pmops_suspend(struct
>>>>>>> device *dev)
>>>>>>>>       return r;
>>>>>>>>     }
>>>>>>>>
>>>>>>>> +/*
>>>>>>>> + * Actually the PM suspend whether is completed should be confirmed
>>>>>>>> + * by checking the sysfs
>>>>>>>> +sys/power/suspend_stats/failed_suspend.However,
>>>>>>>> + * in this function only check the AMDGPU device whether is
>>>>>>>> suspended
>>>>>>>> + * completely in the system-wide suspend process.
>>>>>>>> + */
>>>>>>>> +static int amdgpu_pmops_noirq_suspend(struct device *dev) {
>>>>>>>> +   struct drm_device *drm_dev = dev_get_drvdata(dev);
>>>>>>>> +   struct amdgpu_device *adev = drm_to_adev(drm_dev);
>>>>>>>> +
>>>>>>>> +   dev_dbg(dev, "amdgpu suspend completely.\n");
>>>>>>>> +   adev->pm_completed = true;
>>>>>>>> +
>>>>>>>> +   return 0;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>     static int amdgpu_pmops_resume(struct device *dev)
>>>>>>>>     {
>>>>>>>>       struct drm_device *drm_dev = dev_get_drvdata(dev); @@ -2181,6
>>>>>>>> +2198,7 @@ static int amdgpu_pmops_resume(struct device *dev)
>>>>>>>>       r = amdgpu_device_resume(drm_dev, true);
>>>>>>>>       if (amdgpu_acpi_is_s0ix_active(adev))
>>>>>>>>               adev->in_s0ix = false;
>>>>>>>> +   adev->pm_completed = false;
>>>>>>>>       return r;
>>>>>>>>     }
>>>>>>>>
>>>>>>>> @@ -2397,6 +2415,7 @@ static const struct dev_pm_ops amdgpu_pm_ops
>>>>>>> = {
>>>>>>>>       .runtime_suspend = amdgpu_pmops_runtime_suspend,
>>>>>>>>       .runtime_resume = amdgpu_pmops_runtime_resume,
>>>>>>>>       .runtime_idle = amdgpu_pmops_runtime_idle,
>>>>>>>> +   .suspend_noirq = amdgpu_pmops_noirq_suspend,
>>>>>>>>     };
>>>>>>>>
>>>>>>>>     static int amdgpu_flush(struct file *f, fl_owner_t id)
>>>>>>>>
>>>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted
  2021-11-18 14:28               ` Lazar, Lijo
@ 2021-11-18 15:01                 ` Lazar, Lijo
  2021-11-18 16:18                   ` Alex Deucher
  2021-11-22 14:10                 ` Liang, Prike
  1 sibling, 1 reply; 15+ messages in thread
From: Lazar, Lijo @ 2021-11-18 15:01 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Deucher, Alexander, Christian König, Liang, Prike, Huang,
	Ray, amd-gfx

[-- Attachment #1: Type: text/plain, Size: 216 bytes --]

[Public]

BTW, I'm not sure if 'reset always' on resume is a good idea  for GPUs in a hive (assuming those systems also get suspended and get hiccups). At this point the hive isn't reinitialized.

Thanks,
Lijo

[-- Attachment #2: Type: text/html, Size: 665 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted
  2021-11-18 15:01                 ` Lazar, Lijo
@ 2021-11-18 16:18                   ` Alex Deucher
  2021-11-22 14:23                     ` Liang, Prike
  0 siblings, 1 reply; 15+ messages in thread
From: Alex Deucher @ 2021-11-18 16:18 UTC (permalink / raw)
  To: Lazar, Lijo
  Cc: Deucher, Alexander, Christian König, Liang, Prike, Huang,
	Ray, amd-gfx

On Thu, Nov 18, 2021 at 10:01 AM Lazar, Lijo <Lijo.Lazar@amd.com> wrote:
>
> [Public]
>
>
> BTW, I'm not sure if 'reset always' on resume is a good idea  for GPUs in a hive (assuming those systems also get suspended and get hiccups). At this point the hive isn't reinitialized.

Yeah, we should probably not reset if we are part of a hive.

Alex


>
> Thanks,
> Lijo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted
  2021-11-18 14:28               ` Lazar, Lijo
  2021-11-18 15:01                 ` Lazar, Lijo
@ 2021-11-22 14:10                 ` Liang, Prike
  1 sibling, 0 replies; 15+ messages in thread
From: Liang, Prike @ 2021-11-22 14:10 UTC (permalink / raw)
  To: Lazar, Lijo, Alex Deucher
  Cc: Deucher, Alexander, Christian König, Huang, Ray, amd-gfx

[Public]

> -----Original Message-----
> From: Lazar, Lijo <Lijo.Lazar@amd.com>
> Sent: Thursday, November 18, 2021 10:28 PM
> To: Alex Deucher <alexdeucher@gmail.com>
> Cc: Christian König <ckoenig.leichtzumerken@gmail.com>; Liang, Prike
> <Prike.Liang@amd.com>; Deucher, Alexander
> <Alexander.Deucher@amd.com>; Huang, Ray <Ray.Huang@amd.com>;
> amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend
> aborted
>
>
>
> On 11/18/2021 7:55 PM, Alex Deucher wrote:
> > On Thu, Nov 18, 2021 at 9:15 AM Lazar, Lijo <lijo.lazar@amd.com> wrote:
> >>
> >>
> >>
> >> On 11/18/2021 7:41 PM, Christian König wrote:
> >>> Am 18.11.21 um 15:09 schrieb Lazar, Lijo:
> >>>> On 11/18/2021 7:36 PM, Alex Deucher wrote:
> >>>>> On Thu, Nov 18, 2021 at 8:11 AM Liang, Prike <Prike.Liang@amd.com>
> >>>>> wrote:
> >>>>>>
> >>>>>> [Public]
> >>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: Lazar, Lijo <Lijo.Lazar@amd.com>
> >>>>>>> Sent: Thursday, November 18, 2021 4:01 PM
> >>>>>>> To: Liang, Prike <Prike.Liang@amd.com>;
> >>>>>>> amd-gfx@lists.freedesktop.org
> >>>>>>> Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Huang,
> Ray
> >>>>>>> <Ray.Huang@amd.com>
> >>>>>>> Subject: Re: [PATCH] drm/amdgpu: reset asic after system-wide
> >>>>>>> suspend aborted
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On 11/18/2021 12:32 PM, Prike Liang wrote:
> >>>>>>>> Do ASIC reset at the moment Sx suspend aborted behind of
> amdgpu
> >>>>>>>> suspend to keep AMDGPU in a clean reset state and that can
> >>>>>>>> avoid re-initialize device improperly error.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Prike Liang <Prike.Liang@amd.com>
> >>>>>>>> ---
> >>>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
> >>>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  4 ++++
> >>>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    | 19
> >>>>>>> +++++++++++++++++++
> >>>>>>>>     3 files changed, 24 insertions(+)
> >>>>>>>>
> >>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >>>>>>>> index b85b67a..8bd9833 100644
> >>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >>>>>>>> @@ -1053,6 +1053,7 @@ struct amdgpu_device {
> >>>>>>>>       bool                            in_s3;
> >>>>>>>>       bool                            in_s4;
> >>>>>>>>       bool                            in_s0ix;
> >>>>>>>> +   bool                            pm_completed;
> >>>>>>>
> >>>>>>> PM framework maintains separate flags, why not use the same?
> >>>>>>>
> >>>>>>>            dev->power.is_suspended = false;
> >>>>>>>            dev->power.is_noirq_suspended = false;
> >>>>>>>            dev->power.is_late_suspended = false;
> >>>>>>>
> >>>>>>
> >>>>>> Thanks for pointing it out and I miss that flag. In this case we
> >>>>>> can use the PM flag is_noirq_suspended for checking AMDGPU
> device
> >>>>>> whether is finished suspend.
> >>>>>>
> >>>>>>> BTW, Alex posted a patch which does similar thing, though it
> >>>>>>> tries reset if suspend fails.
> >>>>>>>
> >>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%
> >>>>>>>
> 2Flore.kernel.org%2Fall%2FDM6PR12MB26195F8E099407B4B6966FEBE4999
> >>>>>>> %40&amp;data=04%7C01%7Clijo.lazar%40amd.com%7C2ce211aeee
> b448f6cb
> >>>>>>>
> 2c08d9aa9f4741%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C63
> 77
> >>>>>>>
> 28423343483218%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiL
> CJQI
> >>>>>>>
> joiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=nyzhG
> >>>>>>>
> wTJV83YZkit34Bb%2B5tBxGEMvFzCyZPjz8eSDgc%3D&amp;reserved=0
> >>>>>>>
> >>>>>>> DM6PR12MB2619.namprd12.prod.outlook.com/
> >>>>>>>
> >>>>>>> If that reset also failed, then no point in another reset, or
> >>>>>>> keep it as part of resume.
> >>>>>>>
> >>>>>>
> >>>>>> Alex's patch seems always do the ASIC reset at the end of AMDGPU
> >>>>>> device, but that may needn't on the normal AMDGPU device suspend.
> >>>>>> However, this patch shows that can handle the system suspend
> aborted
> >>>>>> after AMDGPU suspend case well, so now seems we only need take
> care
> >>>>>> suspend abort case here.
> >>>>>>
> >>>>>
> >>>>> Yeah, I was thinking we'd take this patch rather than mine to minimize
> >>>>> the number of resets.
> >>>>>
> >>>>
> >>>> Wondering if suspend fails whether there is a need to call resume. It
> >>>> may not get to resume() to do a reset, that needs to be checked.
> >>>
> >>> I would rather do it so that we reset the ASIC during resume when we
> >>> detect that the hardware is in a bad state.
> >>>
> >>> This way it should also work with things like kexec and virtualization.
> >>
> >> I was thinking from the power framework perspective. If the device's
> >> suspend() callback returns failure, why would the framework needs to
> >> call a resume() on that device.
> >
> > The device's suspend callback succeeds, but some other device or event
> > in the system causes the overall suspend to abort.  As such the GPU is
> > never powered off by the platform so it's left in a partially
> > initialized state.  The system then calls the resume() callbacks for
> > all of the devices that were previously suspended to bring them back
> > to a working system.  GPU reset is just a convenient way to get the
> > device back into a known good state.  Unfortunately, I'm not sure if
> > there is a good way to detect whether the GPU is in a known good state
> > or not until we try and re-init the IPs and at that point the IPs are
> > not fully initialized so it's complex to try and unwind that, reset,
> > and try again.  It's probably easiest to just reset the GPU on resume
> > all the time.  If the GPU was powered down, the reset should work fine
> > since we are resetting a GPU that just came out of reset.  If the GPU
> > was not powered down, the reset will put the GPU into a known good
> > state.
> >
>
> https://elixir.bootlin.com/linux/latest/source/drivers/base/power/main.c#L
> 925
>
> I don't have a machine to trace the path. The flag is set only if
> suspend is succesful.

The PM flags are set on the PM opt callback processed successfully and then reset after relevant PM resume callback complete undoing process. Regards to sort out the system suspend abort after AMDGPU suspend seems not easy only use PM flags(is_suspended, .is_late_suspended ,is_noirq_suspended) since in the AMDGPU device resume phase only leave the flags is_suspended =1 and is_late_suspended/ is_noirq_suspended =0 no matter normal system suspend or system suspend process abort after AMDGPU suspend. In this patch the flag pm_completed only can handle the case which abort system suspend before call AMDGPU noirq suspend, and can't well cover all the system abort corner case such as system abort after AMDGPU noirq phase. So, how about check the PM suspend state by using the sys file  /sys/power/suspend_stats/failed_suspend  instead of tracing the flags of PM framework or driver callback defined and only do the GPU reset in the amdgpu_device_resume() when detect failed suspend?

If we give up checking suspend state and always do the GPU reset in the amdgpu_device_resume(), then that may have a side effect on the system resume latency?

> Thanks,
> Lijo
>
> > Alex
> >
> >>
> >> Thanks,
> >> Lijo
> >>
> >>>
> >>> Regards,
> >>> Christian.
> >>>
> >>>>
> >>>> Thanks,
> >>>> Lijo
> >>>>
> >>>>
> >>>>> Alex
> >>>>>
> >>>>>
> >>>>>>> Thanks,
> >>>>>>> Lijo
> >>>>>>>
> >>>>>>>>
> >>>>>>>>       atomic_t                        in_gpu_reset;
> >>>>>>>>       enum pp_mp1_state               mp1_state;
> >>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>>>>>>> index ec42a6f..a12ed54 100644
> >>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>>>>>>> @@ -3983,6 +3983,10 @@ int amdgpu_device_resume(struct
> drm_device
> >>>>>>> *dev, bool fbcon)
> >>>>>>>>       if (adev->in_s0ix)
> >>>>>>>>               amdgpu_gfx_state_change_set(adev,
> >>>>>>> sGpuChangeState_D0Entry);
> >>>>>>>>
> >>>>>>>> +   if (!adev->pm_completed) {
> >>>>>>>> +           dev_warn(adev->dev, "suspend aborted will do asic
> >>>>>>>> reset\n");
> >>>>>>>> +           amdgpu_asic_reset(adev);
> >>>>>>>> +   }
> >>>>>>>>       /* post card */
> >>>>>>>>       if (amdgpu_device_need_post(adev)) {
> >>>>>>>>               r = amdgpu_device_asic_init(adev); diff --git
> >>>>>>>> a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>>>>>>> index eee3cf8..9f90017 100644
> >>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>>>>>>> @@ -2168,6 +2168,23 @@ static int
> amdgpu_pmops_suspend(struct
> >>>>>>> device *dev)
> >>>>>>>>       return r;
> >>>>>>>>     }
> >>>>>>>>
> >>>>>>>> +/*
> >>>>>>>> + * Actually the PM suspend whether is completed should be
> confirmed
> >>>>>>>> + * by checking the sysfs
> >>>>>>>> +sys/power/suspend_stats/failed_suspend.However,
> >>>>>>>> + * in this function only check the AMDGPU device whether is
> >>>>>>>> suspended
> >>>>>>>> + * completely in the system-wide suspend process.
> >>>>>>>> + */
> >>>>>>>> +static int amdgpu_pmops_noirq_suspend(struct device *dev) {
> >>>>>>>> +   struct drm_device *drm_dev = dev_get_drvdata(dev);
> >>>>>>>> +   struct amdgpu_device *adev = drm_to_adev(drm_dev);
> >>>>>>>> +
> >>>>>>>> +   dev_dbg(dev, "amdgpu suspend completely.\n");
> >>>>>>>> +   adev->pm_completed = true;
> >>>>>>>> +
> >>>>>>>> +   return 0;
> >>>>>>>> +}
> >>>>>>>> +
> >>>>>>>>     static int amdgpu_pmops_resume(struct device *dev)
> >>>>>>>>     {
> >>>>>>>>       struct drm_device *drm_dev = dev_get_drvdata(dev); @@ -
> 2181,6
> >>>>>>>> +2198,7 @@ static int amdgpu_pmops_resume(struct device *dev)
> >>>>>>>>       r = amdgpu_device_resume(drm_dev, true);
> >>>>>>>>       if (amdgpu_acpi_is_s0ix_active(adev))
> >>>>>>>>               adev->in_s0ix = false;
> >>>>>>>> +   adev->pm_completed = false;
> >>>>>>>>       return r;
> >>>>>>>>     }
> >>>>>>>>
> >>>>>>>> @@ -2397,6 +2415,7 @@ static const struct dev_pm_ops
> amdgpu_pm_ops
> >>>>>>> = {
> >>>>>>>>       .runtime_suspend = amdgpu_pmops_runtime_suspend,
> >>>>>>>>       .runtime_resume = amdgpu_pmops_runtime_resume,
> >>>>>>>>       .runtime_idle = amdgpu_pmops_runtime_idle,
> >>>>>>>> +   .suspend_noirq = amdgpu_pmops_noirq_suspend,
> >>>>>>>>     };
> >>>>>>>>
> >>>>>>>>     static int amdgpu_flush(struct file *f, fl_owner_t id)
> >>>>>>>>
> >>>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted
  2021-11-18 16:18                   ` Alex Deucher
@ 2021-11-22 14:23                     ` Liang, Prike
  2021-11-22 15:47                       ` Alex Deucher
  0 siblings, 1 reply; 15+ messages in thread
From: Liang, Prike @ 2021-11-22 14:23 UTC (permalink / raw)
  To: Alex Deucher, Lazar, Lijo
  Cc: Deucher, Alexander, Christian König, Huang, Ray, amd-gfx

[Public]

> -----Original Message-----
> From: Alex Deucher <alexdeucher@gmail.com>
> Sent: Friday, November 19, 2021 12:18 AM
> To: Lazar, Lijo <Lijo.Lazar@amd.com>
> Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Christian König
> <ckoenig.leichtzumerken@gmail.com>; Liang, Prike <Prike.Liang@amd.com>;
> Huang, Ray <Ray.Huang@amd.com>; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend
> aborted
>
> On Thu, Nov 18, 2021 at 10:01 AM Lazar, Lijo <Lijo.Lazar@amd.com> wrote:
> >
> > [Public]
> >
> >
> > BTW, I'm not sure if 'reset always' on resume is a good idea  for GPUs in a
> hive (assuming those systems also get suspended and get hiccups). At this
> point the hive isn't reinitialized.
>
> Yeah, we should probably not reset if we are part of a hive.
>
> Alex
>
For the GPU hive reset in this suspend abort case need treat specially, does that because of GPU hive need take care each node reset dependence and synchronous reset? For this purpose, can we skip the hive reset case and only do GPU reset under adev->gmc.xgmi.num_physical_nodes == 0 ?

> >
> > Thanks,
> > Lijo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted
  2021-11-22 14:23                     ` Liang, Prike
@ 2021-11-22 15:47                       ` Alex Deucher
  2021-11-23  5:40                         ` Liang, Prike
  0 siblings, 1 reply; 15+ messages in thread
From: Alex Deucher @ 2021-11-22 15:47 UTC (permalink / raw)
  To: Liang, Prike
  Cc: Deucher, Alexander, Christian König, Lazar, Lijo, Huang,
	Ray, amd-gfx

On Mon, Nov 22, 2021 at 9:23 AM Liang, Prike <Prike.Liang@amd.com> wrote:
>
> [Public]
>
> > -----Original Message-----
> > From: Alex Deucher <alexdeucher@gmail.com>
> > Sent: Friday, November 19, 2021 12:18 AM
> > To: Lazar, Lijo <Lijo.Lazar@amd.com>
> > Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Christian König
> > <ckoenig.leichtzumerken@gmail.com>; Liang, Prike <Prike.Liang@amd.com>;
> > Huang, Ray <Ray.Huang@amd.com>; amd-gfx@lists.freedesktop.org
> > Subject: Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend
> > aborted
> >
> > On Thu, Nov 18, 2021 at 10:01 AM Lazar, Lijo <Lijo.Lazar@amd.com> wrote:
> > >
> > > [Public]
> > >
> > >
> > > BTW, I'm not sure if 'reset always' on resume is a good idea  for GPUs in a
> > hive (assuming those systems also get suspended and get hiccups). At this
> > point the hive isn't reinitialized.
> >
> > Yeah, we should probably not reset if we are part of a hive.
> >
> > Alex
> >
> For the GPU hive reset in this suspend abort case need treat specially, does that because of GPU hive need take care each node reset dependence and synchronous reset? For this purpose, can we skip the hive reset case and only do GPU reset under adev->gmc.xgmi.num_physical_nodes == 0 ?

Yes, exactly.  For the aborted suspend reset, we can check the value
before doing a reset.  I think you want to check if
adev->gmc.xgmi.num_physical_nodes <= 1.

Alex

>
> > >
> > > Thanks,
> > > Lijo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted
  2021-11-22 15:47                       ` Alex Deucher
@ 2021-11-23  5:40                         ` Liang, Prike
  0 siblings, 0 replies; 15+ messages in thread
From: Liang, Prike @ 2021-11-23  5:40 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Deucher, Alexander, Christian König, Lazar, Lijo, Huang,
	Ray, amd-gfx

[Public]

> -----Original Message-----
> From: Alex Deucher <alexdeucher@gmail.com>
> Sent: Monday, November 22, 2021 11:48 PM
> To: Liang, Prike <Prike.Liang@amd.com>
> Cc: Lazar, Lijo <Lijo.Lazar@amd.com>; Deucher, Alexander
> <Alexander.Deucher@amd.com>; Christian König
> <ckoenig.leichtzumerken@gmail.com>; Huang, Ray <Ray.Huang@amd.com>;
> amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend
> aborted
>
> On Mon, Nov 22, 2021 at 9:23 AM Liang, Prike <Prike.Liang@amd.com>
> wrote:
> >
> > [Public]
> >
> > > -----Original Message-----
> > > From: Alex Deucher <alexdeucher@gmail.com>
> > > Sent: Friday, November 19, 2021 12:18 AM
> > > To: Lazar, Lijo <Lijo.Lazar@amd.com>
> > > Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Christian König
> > > <ckoenig.leichtzumerken@gmail.com>; Liang, Prike
> > > <Prike.Liang@amd.com>; Huang, Ray <Ray.Huang@amd.com>;
> > > amd-gfx@lists.freedesktop.org
> > > Subject: Re: [PATCH] drm/amdgpu: reset asic after system-wide
> > > suspend aborted
> > >
> > > On Thu, Nov 18, 2021 at 10:01 AM Lazar, Lijo <Lijo.Lazar@amd.com>
> wrote:
> > > >
> > > > [Public]
> > > >
> > > >
> > > > BTW, I'm not sure if 'reset always' on resume is a good idea  for
> > > > GPUs in a
> > > hive (assuming those systems also get suspended and get hiccups). At
> > > this point the hive isn't reinitialized.
> > >
> > > Yeah, we should probably not reset if we are part of a hive.
> > >
> > > Alex
> > >
> > For the GPU hive reset in this suspend abort case need treat specially, does
> that because of GPU hive need take care each node reset dependence and
> synchronous reset? For this purpose, can we skip the hive reset case and
> only do GPU reset under adev->gmc.xgmi.num_physical_nodes == 0 ?
>
> Yes, exactly.  For the aborted suspend reset, we can check the value before
> doing a reset.  I think you want to check if
> adev->gmc.xgmi.num_physical_nodes <= 1.
>
> Alex
>
Thanks for the clarification and will add this checking for GPU reset in the amdgpu_device_resume().
> >
> > > >
> > > > Thanks,
> > > > Lijo

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2021-11-23  5:40 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-18  7:02 [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted Prike Liang
2021-11-18  8:00 ` Lazar, Lijo
2021-11-18 13:11   ` Liang, Prike
2021-11-18 14:06     ` Alex Deucher
2021-11-18 14:09       ` Lazar, Lijo
2021-11-18 14:11         ` Christian König
2021-11-18 14:14           ` Lazar, Lijo
2021-11-18 14:25             ` Alex Deucher
2021-11-18 14:28               ` Lazar, Lijo
2021-11-18 15:01                 ` Lazar, Lijo
2021-11-18 16:18                   ` Alex Deucher
2021-11-22 14:23                     ` Liang, Prike
2021-11-22 15:47                       ` Alex Deucher
2021-11-23  5:40                         ` Liang, Prike
2021-11-22 14:10                 ` Liang, Prike

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.