linux-kernel.vger.kernel.org archive mirror
* [PATCH] drm/amdgpu: Disable RPM helpers while reprobing connectors on resume
@ 2016-07-08 15:37 Lyude
  2016-07-15 18:44 ` Alex Deucher
  0 siblings, 1 reply; 2+ messages in thread
From: Lyude @ 2016-07-08 15:37 UTC (permalink / raw)
  To: amd-gfx
  Cc: Lyude, stable, Alex Deucher, Christian König, David Airlie,
	Jammy Zhou, Chunming Zhou, Monk Liu, Flora Cui, Tom St Denis,
	open list:RADEON and AMDGPU DRM DRIVERS, open list

Just about all of amdgpu's connector probing functions try to acquire
runtime PM refs. If we try to do this in the context of
amdgpu_resume_kms by calling drm_helper_hpd_irq_event(), we end up
deadlocking the system.

Since we're guaranteed to be holding the spinlock for RPM in
amdgpu_resume_kms, and we already know the GPU is in working order, we
need to prevent the RPM helpers from trying to run during the initial
connector reprobe on resume.

There are a couple of solutions I explored for fixing this, but this
one is by far the simplest and most reliable (plus I'm fairly sure
that's what disable_depth is there for anyway).

Reproduction recipe:
  - Get any laptop with dual GPUs using PRIME
  - Make sure runtime PM is enabled for amdgpu
  - Boot the machine
  - If the machine manages to boot without hanging, switch out of X to
    another VT. This should reliably cause X to hang indefinitely.

Cc: stable@vger.kernel.org
Signed-off-by: Lyude <cpaul@redhat.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 6e92008..46c1fee 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1841,7 +1841,19 @@ int amdgpu_resume_kms(struct drm_device *dev, bool resume, bool fbcon)
 	}
 
 	drm_kms_helper_poll_enable(dev);
+
+	/*
+	 * Most of the connector probing functions try to acquire runtime pm
+	 * refs to ensure that the GPU is powered on when connector polling is
+	 * performed. Since we're calling this from a runtime PM callback,
+	 * trying to acquire rpm refs will cause us to deadlock.
+	 *
+	 * Since we're guaranteed to be holding the rpm lock, it's safe to
+	 * temporarily disable the rpm helpers so this doesn't deadlock us.
+	 */
+	dev->dev->power.disable_depth++;
 	drm_helper_hpd_irq_event(dev);
+	dev->dev->power.disable_depth--;
 
 	if (fbcon) {
 		amdgpu_fbdev_set_suspend(adev, 0);
-- 
2.7.4


* Re: [PATCH] drm/amdgpu: Disable RPM helpers while reprobing connectors on resume
  2016-07-08 15:37 [PATCH] drm/amdgpu: Disable RPM helpers while reprobing connectors on resume Lyude
@ 2016-07-15 18:44 ` Alex Deucher
  0 siblings, 0 replies; 2+ messages in thread
From: Alex Deucher @ 2016-07-15 18:44 UTC (permalink / raw)
  To: Lyude
  Cc: amd-gfx list, Tom St Denis, Jammy Zhou, open list, for 3.8,
	open list:RADEON and AMDGPU DRM DRIVERS, Alex Deucher, Flora Cui,
	Christian König, Monk Liu

On Fri, Jul 8, 2016 at 11:37 AM, Lyude <cpaul@redhat.com> wrote:
> Just about all of amdgpu's connector probing functions try to acquire
> runtime PM refs. If we try to do this in the context of
> amdgpu_resume_kms by calling drm_helper_hpd_irq_event(), we end up
> deadlocking the system.
>
> Since we're guaranteed to be holding the spinlock for RPM in
> amdgpu_resume_kms, and we already know the GPU is in working order, we
> need to prevent the RPM helpers from trying to run during the initial
> connector reprobe on resume.
>
> There are a couple of solutions I explored for fixing this, but this
> one is by far the simplest and most reliable (plus I'm fairly sure
> that's what disable_depth is there for anyway).
>
> Reproduction recipe:
>   - Get any laptop with dual GPUs using PRIME
>   - Make sure runtime PM is enabled for amdgpu
>   - Boot the machine
>   - If the machine manages to boot without hanging, switch out of X to
>     another VT. This should reliably cause X to hang indefinitely.
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Lyude <cpaul@redhat.com>

Applied, thanks!

Alex

> [diff snipped]

