* [PATCH 0/9] Fix runtime pm ref leaks
From: Lukas Wunner @ 2016-05-24 16:03 UTC
  To: dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Alex Deucher, Dave Airlie

In preparation for runtime pm on muxed dual GPU laptops, I've fixed all
runtime pm ref leaks I could find in nouveau, radeon and amdgpu.

To ease reviewing, I've pushed this series to GitHub:
https://github.com/l1k/linux/commits/drm_runpm_fixes_v1

@Alex Deucher: I do not have an AMD GPU so couldn't test this beyond
verifying that it compiles. Please double-check the patches and test
them internally at AMD.

By the way, I've noticed that nouveau takes a runtime pm ref in ->preclose
and releases it in ->postclose. This is missing in radeon and amdgpu.
Please check if it is needed.

Thanks,

Lukas


Lukas Wunner (9):
  drm/nouveau: Don't leak runtime pm ref on driver unload
  drm/nouveau: Forbid runtime pm on driver unload
  drm/radeon: Don't leak runtime pm ref on driver unload
  drm/radeon: Don't leak runtime pm ref on driver load
  drm/radeon: Forbid runtime pm on driver unload
  drm/amdgpu: Don't leak runtime pm ref on driver unload
  drm/amdgpu: Don't leak runtime pm ref on driver load
  drm/amdgpu: Forbid runtime pm on driver unload
  drm: Turn off crtc before tearing down its data structure

 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 12 +++++++++---
 drivers/gpu/drm/drm_crtc.c              | 13 ++++++++++++-
 drivers/gpu/drm/nouveau/nouveau_drm.c   |  6 +++++-
 drivers/gpu/drm/radeon/radeon_device.c  |  4 ++++
 drivers/gpu/drm/radeon/radeon_kms.c     |  5 ++++-
 5 files changed, 34 insertions(+), 6 deletions(-)

-- 
2.8.1

_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau

* [PATCH 5/9] drm/radeon: Forbid runtime pm on driver unload
From: Lukas Wunner @ 2016-05-24 16:03 UTC
  To: dri-devel; +Cc: Alex Deucher, Dave Airlie

The PCI core calls pm_runtime_forbid() on device probe in pci_pm_init(),
making this the default state when radeon is loaded.
radeon_driver_load_kms() therefore calls pm_runtime_allow(), but there's
no pm_runtime_forbid() in radeon_driver_unload_kms() to balance it. Add
it so that we leave the device in the same state that we found it.

This isn't a bug, it's just good housekeeping. When radeon is first
loaded with runpm=1, then unloaded and loaded again with runpm=0,
pm_runtime_forbid() will be called from radeon_pmops_runtime_idle() or
radeon_pmops_runtime_suspend(), so the behaviour is correct. If there
ever is a third party driver for AMD cards, this commit avoids that it
has to clean up behind radeon.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
 drivers/gpu/drm/radeon/radeon_kms.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/radeon/radeon_kms.c b/drivers/gpu/drm/radeon/radeon_kms.c
index 51998a4..835563c 100644
--- a/drivers/gpu/drm/radeon/radeon_kms.c
+++ b/drivers/gpu/drm/radeon/radeon_kms.c
@@ -65,6 +65,7 @@ int radeon_driver_unload_kms(struct drm_device *dev)

 	if (radeon_is_px(dev)) {
 		pm_runtime_get_sync(dev->dev);
+		pm_runtime_forbid(dev->dev);
 	}

 	radeon_kfd_device_fini(rdev);
-- 
2.8.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

* [PATCH 3/9] drm/radeon: Don't leak runtime pm ref on driver unload
From: Lukas Wunner @ 2016-05-24 16:03 UTC
  To: dri-devel; +Cc: Alex Deucher, Dave Airlie

radeon_driver_load_kms() calls pm_runtime_put_autosuspend() if
radeon_is_px(dev), but radeon_driver_unload_kms() calls
pm_runtime_get_sync() unconditionally. We therefore leak a runtime pm
ref whenever radeon is unloaded on a non-PX machine or if runpm=0.
The GPU will subsequently never runtime suspend after loading radeon
again.

Fix by taking the runtime pm ref under the same condition that it was
released on driver load.

Fixes: 10ebc0bc0934 ("drm/radeon: add runtime PM support (v2)")
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
 drivers/gpu/drm/radeon/radeon_kms.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_kms.c b/drivers/gpu/drm/radeon/radeon_kms.c
index 414953c..51998a4 100644
--- a/drivers/gpu/drm/radeon/radeon_kms.c
+++ b/drivers/gpu/drm/radeon/radeon_kms.c
@@ -63,7 +63,9 @@ int radeon_driver_unload_kms(struct drm_device *dev)
 	if (rdev->rmmio == NULL)
 		goto done_free;

-	pm_runtime_get_sync(dev->dev);
+	if (radeon_is_px(dev)) {
+		pm_runtime_get_sync(dev->dev);
+	}

 	radeon_kfd_device_fini(rdev);

-- 
2.8.1
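
Editorial sketch: the imbalance described in patch 3/9 can be reproduced
in a small userspace model. This is not kernel code. The struct and the
unload_model_* functions are hypothetical names invented for
illustration, the starting count of 1 is an assumption, and the comments
only indicate which real call each step stands in for.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct unload_model {
	int usage_count;	/* stands in for the runtime pm usage count */
	bool is_px;		/* stands in for the radeon_is_px(dev) condition */
};

/* Driver load: the ref is dropped only on PX machines
 * (stands in for pm_runtime_put_autosuspend()). */
void unload_model_load(struct unload_model *m)
{
	if (m->is_px)
		m->usage_count--;
}

/* Driver unload: takes a ref (stands in for pm_runtime_get_sync()).
 * Unpatched: unconditionally. Patched: only if is_px. */
void unload_model_unload(struct unload_model *m, bool patched)
{
	if (!patched || m->is_px)
		m->usage_count++;
}

/* Net change of the usage count over one load/unload cycle.
 * 0 means balanced, a positive value means a leaked ref. */
int unload_model_cycle_delta(bool is_px, bool patched)
{
	struct unload_model m = { .usage_count = 1, .is_px = is_px };

	unload_model_load(&m);
	unload_model_unload(&m, patched);
	return m.usage_count - 1;
}

/* Small self-check covering the case the commit message describes. */
void unload_model_selfcheck(void)
{
	assert(unload_model_cycle_delta(false, false) == 1);	/* leak */
	assert(unload_model_cycle_delta(false, true) == 0);	/* fixed */
}
```

In this model only the non-PX, unpatched combination ends the cycle with
a surplus ref, which matches the "never runtime suspend after loading
radeon again" symptom in the commit message.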

* [PATCH 7/9] drm/amdgpu: Don't leak runtime pm ref on driver load
From: Lukas Wunner @ 2016-05-24 16:03 UTC
  To: dri-devel; +Cc: Alex Deucher, Dave Airlie

If an error occurs in amdgpu_device_init() after adev->rmmio has been
set, its caller amdgpu_driver_load_kms() will skip runtime pm
initialization and call amdgpu_driver_unload_kms(), which acquires a
runtime pm ref that is leaked.

Balance by releasing a runtime pm ref in the error path of
amdgpu_driver_load_kms().

Fixes: d38ceaf99ed0 ("drm/amdgpu: add core driver (v4)")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 9b1f979..0db692e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -137,9 +137,12 @@ int amdgpu_driver_load_kms(struct drm_device *dev, unsigned long flags)
 	}

 out:
-	if (r)
+	if (r) {
+		/* balance pm_runtime_get_sync in amdgpu_driver_unload_kms */
+		if (adev->rmmio && amdgpu_device_is_px(dev))
+			pm_runtime_put_noidle(dev->dev);
 		amdgpu_driver_unload_kms(dev);
-
+	}
 	return r;
 }

-- 
2.8.1

* [PATCH 6/9] drm/amdgpu: Don't leak runtime pm ref on driver unload
From: Lukas Wunner @ 2016-05-24 16:03 UTC
  To: dri-devel; +Cc: Alex Deucher, Dave Airlie

amdgpu_driver_load_kms() calls pm_runtime_put_autosuspend() if
amdgpu_device_is_px(dev), but amdgpu_driver_unload_kms() calls
pm_runtime_get_sync() unconditionally. We therefore leak a runtime pm
ref whenever amdgpu is unloaded on a non-PX machine or if runpm=0.
The GPU will subsequently never runtime suspend after loading amdgpu
again.

Fix by taking the runtime pm ref under the same condition that it was
released on driver load.

Fixes: d38ceaf99ed0 ("drm/amdgpu: add core driver (v4)")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 40a2370..9b1f979 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -60,7 +60,9 @@ int amdgpu_driver_unload_kms(struct drm_device *dev)
 	if (adev->rmmio == NULL)
 		goto done_free;

-	pm_runtime_get_sync(dev->dev);
+	if (amdgpu_device_is_px(dev)) {
+		pm_runtime_get_sync(dev->dev);
+	}

 	amdgpu_amdkfd_device_fini(adev);

-- 
2.8.1

* [PATCH 4/9] drm/radeon: Don't leak runtime pm ref on driver load
From: Lukas Wunner @ 2016-05-24 16:03 UTC
  To: dri-devel; +Cc: Alex Deucher, Dave Airlie

radeon_device_init() returns an error if either of the two calls to
radeon_init() fail. One level up in the call stack,
radeon_driver_load_kms() will then skip runtime pm initialization and
call radeon_driver_unload_kms(), which acquires a runtime pm ref that
is leaked.

Balance by releasing a runtime pm ref in the error path of
radeon_device_init().

Fixes: 10ebc0bc0934 ("drm/radeon: add runtime PM support (v2)")
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
 drivers/gpu/drm/radeon/radeon_device.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index e721e6b..e0bf778 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -30,6 +30,7 @@
 #include <drm/drmP.h>
 #include <drm/drm_crtc_helper.h>
 #include <drm/radeon_drm.h>
+#include <linux/pm_runtime.h>
 #include <linux/vgaarb.h>
 #include <linux/vga_switcheroo.h>
 #include <linux/efi.h>
@@ -1505,6 +1506,9 @@ int radeon_device_init(struct radeon_device *rdev,
 	return 0;

 failed:
+	/* balance pm_runtime_get_sync() in radeon_driver_unload_kms() */
+	if (radeon_is_px(ddev))
+		pm_runtime_put_noidle(ddev->dev);
 	if (runtime)
 		vga_switcheroo_fini_domain_pm_ops(rdev->dev);
 	return r;
-- 
2.8.1
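
Editorial sketch: the error-path leak from patches 4/9 and 7/9 can be
modelled the same way. Again this is a hypothetical userspace model, not
driver code. It assumes that MMIO has already been mapped when device
init fails and that unload takes its ref only on PX machines (i.e. the
unload fix is already applied). All errpath_model_* names are invented
for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct errpath_model {
	int usage_count;	/* stands in for the runtime pm usage count */
	bool is_px;		/* stands in for radeon_is_px(dev) */
	bool mmio_mapped;	/* stands in for rdev->rmmio != NULL */
};

/* Unload: bails out early without MMIO, otherwise takes a ref on PX
 * machines (stands in for pm_runtime_get_sync()). */
void errpath_model_unload(struct errpath_model *m)
{
	if (!m->mmio_mapped)
		return;
	if (m->is_px)
		m->usage_count++;
	m->mmio_mapped = false;
}

/* Load whose device init fails after MMIO has been mapped. Runtime pm
 * setup is skipped, so nothing has released a ref before unload runs. */
void errpath_model_failed_load(struct errpath_model *m, bool patched)
{
	m->mmio_mapped = true;
	if (patched && m->is_px)
		m->usage_count--;	/* stands in for pm_runtime_put_noidle() */
	errpath_model_unload(m);
}

/* Net change of the usage count across a failed load; 0 means balanced. */
int errpath_model_delta(bool is_px, bool patched)
{
	struct errpath_model m = { .usage_count = 1, .is_px = is_px };

	errpath_model_failed_load(&m, patched);
	return m.usage_count - 1;
}

/* Small self-check: the extra put in the error path restores balance. */
void errpath_model_selfcheck(void)
{
	assert(errpath_model_delta(true, false) == 1);	/* leak */
	assert(errpath_model_delta(true, true) == 0);	/* balanced */
}
```

The model puts the compensating put directly in the failed-load helper.
The radeon patch places it in radeon_device_init() and the amdgpu patch
in amdgpu_driver_load_kms(), but the counting argument is the same.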

* [PATCH 2/9] drm/nouveau: Forbid runtime pm on driver unload
From: Lukas Wunner @ 2016-05-24 16:03 UTC
  To: dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Dave Airlie

The PCI core calls pm_runtime_forbid() on device probe in pci_pm_init(),
making this the default state when nouveau is loaded. nouveau_drm_load()
therefore calls pm_runtime_allow(), but there's no pm_runtime_forbid()
in nouveau_drm_unload() to balance it. Add it so that we leave the
device in the same state that we found it.

This isn't a bug, it's just good housekeeping. When nouveau is first
loaded with runpm=1, then unloaded and loaded again with runpm=0,
pm_runtime_forbid() will be called from nouveau_pmops_runtime_idle() or
nouveau_pmops_runtime_suspend(), so the behaviour is correct. The nvidia
blob doesn't use runtime pm, but if it ever does, this commit avoids
that it has to clean up behind nouveau.

Tested-by: Karol Herbst <nouveau@karolherbst.de>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
 drivers/gpu/drm/nouveau/nouveau_drm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index faf7438..ef784b7 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -500,6 +500,7 @@ nouveau_drm_unload(struct drm_device *dev)

 	if (nouveau_runtime_pm != 0) {
 		pm_runtime_get_sync(dev->dev);
+		pm_runtime_forbid(dev->dev);
 	}

 	nouveau_fbcon_fini(dev);
-- 
2.8.1

* [PATCH 9/9] drm: Turn off crtc before tearing down its data structure
From: Lukas Wunner @ 2016-05-24 16:03 UTC
  To: dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Alex Deucher, Dave Airlie

When a drm_crtc structure is destroyed with drm_crtc_cleanup(), the DRM
core does not turn off the crtc first and neither do the drivers. With
nouveau, radeon and amdgpu, this causes a runtime pm ref to be leaked on
driver unload if at least one crtc was enabled.

(See usage of have_disp_power_ref in nouveau_crtc_set_config(),
radeon_crtc_set_config() and amdgpu_crtc_set_config()).

Fixes: 5addcf0a5f0f ("nouveau: add runtime PM support (v0.9)")
Cc: Dave Airlie <airlied@redhat.com>
Tested-by: Karol Herbst <nouveau@karolherbst.de>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
 drivers/gpu/drm/drm_crtc.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c
index d2a6d95..0cd6f00 100644
--- a/drivers/gpu/drm/drm_crtc.c
+++ b/drivers/gpu/drm/drm_crtc.c
@@ -716,12 +716,23 @@ EXPORT_SYMBOL(drm_crtc_init_with_planes);
  *
  * This function cleans up @crtc and removes it from the DRM mode setting
  * core. Note that the function does *not* free the crtc structure itself,
- * this is the responsibility of the caller.
+ * this is the responsibility of the caller. If @crtc is currently enabled,
+ * it is turned off first.
  */
 void drm_crtc_cleanup(struct drm_crtc *crtc)
 {
 	struct drm_device *dev = crtc->dev;

+	if (crtc->enabled) {
+		struct drm_mode_set modeset = {
+			.crtc = crtc,
+		};
+
+		drm_modeset_lock_all(dev);
+		drm_mode_set_config_internal(&modeset);
+		drm_modeset_unlock_all(dev);
+	}
+
 	kfree(crtc->gamma_store);
 	crtc->gamma_store = NULL;

-- 
2.8.1

* Re: [Nouveau] [PATCH 9/9] drm: Turn off crtc before tearing down its data structure
From: Daniel Vetter @ 2016-05-24 21:30 UTC
  To: Lukas Wunner; +Cc: Alex Deucher, nouveau, dri-devel, Dave Airlie

On Tue, May 24, 2016 at 06:03:27PM +0200, Lukas Wunner wrote:
> When a drm_crtc structure is destroyed with drm_crtc_cleanup(), the DRM
> core does not turn off the crtc first and neither do the drivers. With
> nouveau, radeon and amdgpu, this causes a runtime pm ref to be leaked on
> driver unload if at least one crtc was enabled.
>
> (See usage of have_disp_power_ref in nouveau_crtc_set_config(),
> radeon_crtc_set_config() and amdgpu_crtc_set_config()).
>
> Fixes: 5addcf0a5f0f ("nouveau: add runtime PM support (v0.9)")
> Cc: Dave Airlie <airlied@redhat.com>
> Tested-by: Karol Herbst <nouveau@karolherbst.de>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>

This is a core regression, we fixed it again. Previously when unreferencing
drm_planes the core made sure that it's no longer in use, which had the
side effect of shutting everything off in module unload.

For a bunch of reasons we've stopped doing that, but that turned out to be
a mistake. It's fixed since

commit f2d580b9a8149735cbc4b59c4a8df60173658140
Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Date:   Wed May 4 14:38:26 2016 +0200

    drm/core: Do not preserve framebuffer on rmfb, v4.

Your patch shouldn't be needed with that any more. If it still is it's
most likely the fbdev cleanup done too late, but you /should/ get a big
WARNING splat in that case from drm_mode_config_cleanup().
-Daniel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [Nouveau] [PATCH 9/9] drm: Turn off crtc before tearing down its data structure
From: Lukas Wunner @ 2016-05-24 22:07 UTC
  To: Daniel Vetter; +Cc: Alex Deucher, nouveau, dri-devel, Dave Airlie

Good evening Daniel,

On Tue, May 24, 2016 at 11:30:42PM +0200, Daniel Vetter wrote:
> On Tue, May 24, 2016 at 06:03:27PM +0200, Lukas Wunner wrote:
> > When a drm_crtc structure is destroyed with drm_crtc_cleanup(), the DRM
> > core does not turn off the crtc first and neither do the drivers. With
> > nouveau, radeon and amdgpu, this causes a runtime pm ref to be leaked on
> > driver unload if at least one crtc was enabled.
> >
> > (See usage of have_disp_power_ref in nouveau_crtc_set_config(),
> > radeon_crtc_set_config() and amdgpu_crtc_set_config()).
> >
> > Fixes: 5addcf0a5f0f ("nouveau: add runtime PM support (v0.9)")
> > Cc: Dave Airlie <airlied@redhat.com>
> > Tested-by: Karol Herbst <nouveau@karolherbst.de>
> > Signed-off-by: Lukas Wunner <lukas@wunner.de>
>
> This is a core regression, we fixed it again. Previously when unreferencing
> drm_planes the core made sure that it's no longer in use, which had the
> side effect of shutting everything off in module unload.
>
> For a bunch of reasons we've stopped doing that, but that turned out to be
> a mistake. It's fixed since
>
> commit f2d580b9a8149735cbc4b59c4a8df60173658140
> Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Date:   Wed May 4 14:38:26 2016 +0200
>
>     drm/core: Do not preserve framebuffer on rmfb, v4.

Okay, I will test it.

May I ask you a question while you have this topic swapped into your
brain: nouveau, radeon and amdgpu currently hold one runtime pm ref
if any crtc is turned on. I'm pondering how to make this work for
muxed dual GPU laptops. When switching GPUs, the now inactive GPU
should turn off the crtc used by the panel to save power (if it's
*only* used by the panel) and release that runtime pm ref. Likewise,
the now active GPU needs to turn on the crtc used by the panel and
take a runtime pm ref.

The whole thing becomes a bit complicated because MacBook Pros with
Thunderbolt can only drive external displays with their discrete GPU.
So when switching to the integrated GPU, the discrete GPU may turn
off the crtc for the panel but the crtc for the external display needs
to stay alive if it's in use and the GPU may not suspend.

What I have in mind is to change the scheme nouveau/radeon/amdgpu are
currently using by taking a runtime pm ref when enabling a crtc and
releasing it when disabling the crtc. This is different from the status
quo where only a *single* runtime pm ref is taken if *any* crtc is enabled.

Upon switching, the runtime pm ref for the crtc previously used by the
panel is released and the crtc should be turned off. If it was the only
active crtc the GPU automatically goes to sleep.

I'm thinking about putting the pm_runtime_get() and pm_runtime_put()
in the DRM core. (Actually I already have a commit to do just that in
my local repo.) That way we're using less code in nouveau/radeon/amdgpu
because we're doing the runtime pm handling centrally in the core.
I'm wondering if that would impact other DRM drivers negatively.

Basically the idea is to harmonize runtime pm handling among DRM drivers.
What do you think about that?

Thanks,

Lukas
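
Editorial sketch: the per-crtc scheme proposed in the message above (one
runtime pm ref per enabled crtc instead of a single ref while any crtc
is enabled) can be illustrated with a small userspace model. Nothing
here is DRM or runtime pm API. The crtc_model_* names, the two-crtc
layout and the "may suspend when the count is zero" rule are assumptions
made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

#define CRTC_MODEL_NUM_CRTCS	2
#define CRTC_MODEL_PANEL	0	/* crtc driving the internal panel */
#define CRTC_MODEL_EXTERNAL	1	/* crtc driving an external display */

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct crtc_model {
	int usage_count;			/* one ref per enabled crtc */
	bool enabled[CRTC_MODEL_NUM_CRTCS];
};

/* Enable or disable one crtc. A ref is taken or released only when the
 * state actually changes, so repeated calls stay balanced. */
void crtc_model_set(struct crtc_model *m, int idx, bool enable)
{
	if (idx < 0 || idx >= CRTC_MODEL_NUM_CRTCS)
		return;
	if (m->enabled[idx] == enable)
		return;
	m->enabled[idx] = enable;
	m->usage_count += enable ? 1 : -1;
}

/* In this model the GPU may suspend once no crtc holds a ref. */
bool crtc_model_may_suspend(const struct crtc_model *m)
{
	return m->usage_count == 0;
}

/* Mux switch away from this GPU: the panel crtc is turned off, while an
 * external display (if in use) keeps its crtc and its ref. Returns the
 * number of refs still held afterwards. */
int crtc_model_refs_after_switch(bool external_in_use)
{
	struct crtc_model m = { 0 };

	crtc_model_set(&m, CRTC_MODEL_PANEL, true);
	if (external_in_use)
		crtc_model_set(&m, CRTC_MODEL_EXTERNAL, true);
	crtc_model_set(&m, CRTC_MODEL_PANEL, false);
	return m.usage_count;
}

/* Small self-check of the two scenarios from the message. */
void crtc_model_selfcheck(void)
{
	assert(crtc_model_refs_after_switch(true) == 1);	/* stays awake */
	assert(crtc_model_refs_after_switch(false) == 0);	/* may suspend */
}
```

With an external display in use, one ref remains after the switch and
the model GPU stays awake. Without one, the count drops to zero, which
corresponds to the "GPU automatically goes to sleep" case.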

* Re: [PATCH 9/9] drm: Turn off crtc before tearing down its data structure
From: Daniel Vetter @ 2016-05-24 22:30 UTC
  To: Lukas Wunner; +Cc: Alex Deucher, Nouveau Dev, dri-devel, Dave Airlie

On Wed, May 25, 2016 at 12:07 AM, Lukas Wunner <lukas@wunner.de> wrote:
> Good evening Daniel,
>
> On Tue, May 24, 2016 at 11:30:42PM +0200, Daniel Vetter wrote:
>> On Tue, May 24, 2016 at 06:03:27PM +0200, Lukas Wunner wrote:
>> > When a drm_crtc structure is destroyed with drm_crtc_cleanup(), the DRM
>> > core does not turn off the crtc first and neither do the drivers. With
>> > nouveau, radeon and amdgpu, this causes a runtime pm ref to be leaked on
>> > driver unload if at least one crtc was enabled.
>> >
>> > (See usage of have_disp_power_ref in nouveau_crtc_set_config(),
>> > radeon_crtc_set_config() and amdgpu_crtc_set_config()).
>> >
>> > Fixes: 5addcf0a5f0f ("nouveau: add runtime PM support (v0.9)")
>> > Cc: Dave Airlie <airlied@redhat.com>
>> > Tested-by: Karol Herbst <nouveau@karolherbst.de>
>> > Signed-off-by: Lukas Wunner <lukas@wunner.de>
>>
>> This is a core regression, we fixed it again. Previously when unreferencing
>> drm_planes the core made sure that it's no longer in use, which had the
>> side effect of shutting everything off in module unload.
>>
>> For a bunch of reasons we've stopped doing that, but that turned out to be
>> a mistake. It's fixed since
>>
>> commit f2d580b9a8149735cbc4b59c4a8df60173658140
>> Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>> Date:   Wed May 4 14:38:26 2016 +0200
>>
>>     drm/core: Do not preserve framebuffer on rmfb, v4.
>
> Okay, I will test it.
>
> May I ask you a question while you have this topic swapped into your
> brain: nouveau, radeon and amdgpu currently hold one runtime pm ref
> if any crtc is turned on. I'm pondering how to make this work for
> muxed dual GPU laptops. When switching GPUs, the now inactive GPU
> should turn off the crtc used by the panel to save power (if it's
> *only* used by the panel) and release that runtime pm ref. Likewise,
> the now active GPU needs to turn on the crtc used by the panel and
> take a runtime pm ref.
>
> The whole thing becomes a bit complicated because MacBook Pros with
> Thunderbolt can only drive external displays with their discrete GPU.
> So when switching to the integrated GPU, the discrete GPU may turn
> off the crtc for the panel but the crtc for the external display needs
> to stay alive if it's in use and the GPU may not suspend.
>
> What I have in mind is to change the scheme nouveau/radeon/amdgpu are
> currently using by taking a runtime pm ref when enabling a crtc and
> releasing it when disabling the crtc. This is different from the status
> quo where only a *single* runtime pm ref is taken if *any* crtc is enabled.
>
> Upon switching, the runtime pm ref for the crtc previously used by the
> panel is released and the crtc should be turned off. If it was the only
> active crtc the GPU automatically goes to sleep.
>
> I'm thinking about putting the pm_runtime_get() and pm_runtime_put()
> in the DRM core. (Actually I already have a commit to do just that in
> my local repo.) That way we're using less code in nouveau/radeon/amdgpu
> because we're doing the runtime pm handling centrally in the core.
> I'm wondering if that would impact other DRM drivers negatively.
>
> Basically the idea is to harmonize runtime pm handling among DRM drivers.
> What do you think about that?

Great idea and should work well with atomic helpers. Kerneldoc even
explains the suggested way to do it, but doesn't go into all details
since e.g. on arm-soc you might have a platform device for each crtc
and each encoder. Lots of the arm drivers do full-blown runtime pm
with atomic, so there's plenty of examples.

It will be a fireworks show with legacy drivers (and hence amdgpu &
nouveau) unfortunately, because with legacy crtc helpers the ordering of
crtc enable/disable isn't as well-defined, and you might end up accessing
hw without an rpm reference. Maybe possible with a lot of swearing and
some hacks, but "make rpm easy" was a big reason for an entirely
revamped helper design for atomic (among a few other reasons ofc).

Putting rpm get/put calls into the drm core otoh is a complete no-go,
that's a perfect example of the midlayer mistake. The long-term goal
is to entirely decouple the drm core from underlying devices. All the
new arm drivers don't even use the drm_platform.c stuff, it's just
that there's a big pile of existing drivers which will be somewhat
painful to convert. And for pci it's probably impossible due to old
crap like dri1 and agp :(

Cheers, Daniel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
[parent not found: <20160524213042.GC27098-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>]
* Re: [PATCH 9/9] drm: Turn off crtc before tearing down its data structure
       [not found]     ` <20160524213042.GC27098-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
@ 2016-05-25 10:51       ` Lukas Wunner
  2016-05-25 13:43         ` [Nouveau] " Daniel Vetter
  0 siblings, 1 reply; 25+ messages in thread
From: Lukas Wunner @ 2016-05-25 10:51 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Alex Deucher, nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Dave Airlie

Hi Daniel,

On Tue, May 24, 2016 at 11:30:42PM +0200, Daniel Vetter wrote:
> On Tue, May 24, 2016 at 06:03:27PM +0200, Lukas Wunner wrote:
> > When a drm_crtc structure is destroyed with drm_crtc_cleanup(), the DRM
> > core does not turn off the crtc first and neither do the drivers. With
> > nouveau, radeon and amdgpu, this causes a runtime pm ref to be leaked on
> > driver unload if at least one crtc was enabled.
> >
> > (See usage of have_disp_power_ref in nouveau_crtc_set_config(),
> > radeon_crtc_set_config() and amdgpu_crtc_set_config()).
> >
> > Fixes: 5addcf0a5f0f ("nouveau: add runtime PM support (v0.9)")
> > Cc: Dave Airlie <airlied@redhat.com>
> > Tested-by: Karol Herbst <nouveau@karolherbst.de>
> > Signed-off-by: Lukas Wunner <lukas@wunner.de>
>
> This is a core regression, we fixed it again. Previously when unreference
> drm_planes the core made sure that it's not longer in use, which had the
> side effect of shutting everything off in module unload.
>
> For a bunch of reasons we've stopped doing that, but that turned out to be
> a mistake. It's fixed since
>
> commit f2d580b9a8149735cbc4b59c4a8df60173658140
> Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Date:   Wed May 4 14:38:26 2016 +0200
>
>     drm/core: Do not preserve framebuffer on rmfb, v4.
>
> Your patch shouldn't be needed with that any more. If it still is it's
> most likely the fbdev cleanup done too late, but you /should/ get a big
> WARNING splat in that case from drm_mode_config_cleanup().

I tested it and at least with nouveau, the above-mentioned commit does
*not* solve the issue, so patch [9/9] of this series is still needed.
I do not get a WARN splat when unloading nouveau.

Best regards,

Lukas

^ permalink raw reply	[flat|nested] 25+ messages in thread
* Re: [Nouveau] [PATCH 9/9] drm: Turn off crtc before tearing down its data structure
  2016-05-25 10:51       ` Lukas Wunner
@ 2016-05-25 13:43         ` Daniel Vetter
       [not found]           ` <CAKMK7uGFb9ihRtjeK7s0ezPPv-C6S9GKbE4h9MLoPyHyN=9W5Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Daniel Vetter @ 2016-05-25 13:43 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: Alex Deucher, Nouveau Dev, dri-devel, Dave Airlie

On Wed, May 25, 2016 at 12:51 PM, Lukas Wunner <lukas@wunner.de> wrote:
> On Tue, May 24, 2016 at 11:30:42PM +0200, Daniel Vetter wrote:
>> Your patch shouldn't be needed with that any more. If it still is it's
>> most likely the fbdev cleanup done too late, but you /should/ get a big
>> WARNING splat in that case from drm_mode_config_cleanup().
>
> I tested it and at least with nouveau, the above-mentioned commit does *not*
> solve the issue, so patch [9/9] of this series is still needed. I do not get
> a WARN splat when unloading nouveau.

With legacy kms the only way to keep a crtc enabled is to display a
drm_framebuffer on it. And drm_mode_config_cleanup has a WARN_ON if
framebuffers are left behind. There's a bunch of options:
- nouveau somehow manages to keep the crtc on without a framebuffer
- nouveau somehow leaks a drm_framebuffer, but removes it from the fb_list
- something else

There's still no need to forcefully shut down crtc at cleanup time in
the core, this is still a driver bug. So yes your patch might be
needed, but it's not the right fix.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread
[parent not found: <CAKMK7uGFb9ihRtjeK7s0ezPPv-C6S9GKbE4h9MLoPyHyN=9W5Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH 9/9] drm: Turn off crtc before tearing down its data structure
       [not found]           ` <CAKMK7uGFb9ihRtjeK7s0ezPPv-C6S9GKbE4h9MLoPyHyN=9W5Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-06-01 12:36             ` Lukas Wunner
       [not found]               ` <20160601123641.GA15243-JFq808J9C/izQB+pC5nmwQ@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Lukas Wunner @ 2016-06-01 12:36 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Alex Deucher, Nouveau Dev, intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel, Dave Airlie

On Wed, May 25, 2016 at 03:43:42PM +0200, Daniel Vetter wrote:
> On Wed, May 25, 2016 at 12:51 PM, Lukas Wunner <lukas@wunner.de> wrote:
> > I tested it and at least with nouveau, the above-mentioned commit does
> > *not* solve the issue, so patch [9/9] of this series is still needed.
> > I do not get a WARN splat when unloading nouveau.
>
> With legacy kms the only way to keep a crtc enabled is to display a
> drm_framebuffer on it. And drm_mode_config_cleanup has a WARN_ON if
> framebuffers are left behind. There's a bunch of options:
> - nouveau somehow manages to keep the crtc on without a framebuffer
> - nouveau somehow leaks a drm_framebuffer, but removes it from the fb_list
> - something else

Found it. nouveau_fbcon_destroy() doesn't call drm_framebuffer_remove().
If I add that, the crtc gets properly disabled on unload.

It does call drm_framebuffer_cleanup(). That's why there was no WARN,
drm_mode_config_cleanup() only WARNs if a framebuffer was left on the
mode_config.fb_list.

radeon and amdgpu have the same problem. In fact there are very few
drivers that call drm_framebuffer_remove(): tegra, msm, exynos, omapdrm
and i915 (since Imre Deak's 9d6612516da0).

Should we add a WARN to prevent this? How about WARN_ON(crtc->enabled)
in drm_crtc_cleanup()?

Also, i915 calls drm_framebuffer_unregister_private() before it calls
drm_framebuffer_remove(). This ordering has the unfortunate side effect
that the drm_framebuffer has ID 0 in log messages emitted by
drm_framebuffer_remove():

[ 39.680874] [drm:drm_mode_object_unreference] OBJ ID: 0 (3)
[ 39.680878] [drm:drm_mode_object_unreference] OBJ ID: 0 (2)
[ 39.680884] [drm:drm_mode_object_unreference] OBJ ID: 0 (1)

Best regards,

Lukas

> There's still no need to forcefully shut down crtc at cleanup time in
> the core, this is still a driver bug. So yes your patch might be
> needed, but it's not the right fix.
> -Daniel

^ permalink raw reply	[flat|nested] 25+ messages in thread
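The reason no WARN fired can be illustrated with a toy userspace model. This is only a sketch, not DRM code: every toy_* name is hypothetical, and the model assumes just what the message above states, namely that drm_framebuffer_cleanup() takes the framebuffer off the fb_list without touching the crtc, that drm_framebuffer_remove() additionally turns off crtcs using the framebuffer, and that drm_mode_config_cleanup() only checks the list. The last helper models the WARN_ON(crtc->enabled) proposed for drm_crtc_cleanup().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy framebuffer: only tracks whether it is still on the fb_list. */
struct toy_fb {
	bool on_fb_list;
};

/* Toy crtc: enabled as long as it scans out a framebuffer. */
struct toy_crtc {
	struct toy_fb *fb;
	bool enabled;
};

/* Models the "cleanup" path: drop from the list, leave crtcs alone. */
void toy_fb_cleanup(struct toy_fb *fb)
{
	fb->on_fb_list = false;
}

/* Models the "remove" path: turn off users first, then drop from the list. */
void toy_fb_remove(struct toy_fb *fb, struct toy_crtc *crtcs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (crtcs[i].fb == fb) {
			crtcs[i].enabled = false;
			crtcs[i].fb = NULL;
		}
	}
	fb->on_fb_list = false;
}

/* Existing check (modelled): complains only about fbs left on the list. */
bool toy_config_cleanup_would_warn(const struct toy_fb *fb)
{
	return fb->on_fb_list;
}

/* Proposed check (modelled): complains about a crtc that is still on. */
bool toy_crtc_cleanup_would_warn(const struct toy_crtc *crtc)
{
	return crtc->enabled;
}
```

In this model the cleanup path passes the list-based check while leaving the crtc enabled, which is the case a check on crtc->enabled would catch.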
[parent not found: <20160601123641.GA15243-JFq808J9C/izQB+pC5nmwQ@public.gmane.org>]
* Re: [PATCH 9/9] drm: Turn off crtc before tearing down its data structure
       [not found]               ` <20160601123641.GA15243-JFq808J9C/izQB+pC5nmwQ@public.gmane.org>
@ 2016-06-01 14:40                 ` Daniel Vetter
  2016-06-03  7:30                   ` [Nouveau] " Lukas Wunner
  0 siblings, 1 reply; 25+ messages in thread
From: Daniel Vetter @ 2016-06-01 14:40 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Nouveau Dev, intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, dri-devel,
	Daniel Vetter, Alex Deucher, Dave Airlie

On Wed, Jun 01, 2016 at 02:36:41PM +0200, Lukas Wunner wrote:
> Found it. nouveau_fbcon_destroy() doesn't call drm_framebuffer_remove().
> If I add that, the crtc gets properly disabled on unload.
>
> It does call drm_framebuffer_cleanup(). That's why there was no WARN,
> drm_mode_config_cleanup() only WARNs if a framebuffer was left on the
> mode_config.fb_list.
>
> radeon and amdgpu have the same problem. In fact there are very few
> drivers that call drm_framebuffer_remove(): tegra, msm, exynos, omapdrm
> and i915 (since Imre Deak's 9d6612516da0).
>
> Should we add a WARN to prevent this? How about WARN_ON(crtc->enabled)
> in drm_crtc_cleanup()?
>
> Also, i915 calls drm_framebuffer_unregister_private() before it calls
> drm_framebuffer_remove(). This ordering has the unfortunate side effect
> that the drm_framebuffer has ID 0 in log messages emitted by
> drm_framebuffer_remove():
>
> [ 39.680874] [drm:drm_mode_object_unreference] OBJ ID: 0 (3)
> [ 39.680878] [drm:drm_mode_object_unreference] OBJ ID: 0 (2)
> [ 39.680884] [drm:drm_mode_object_unreference] OBJ ID: 0 (1)

Well we must first unregister it before we can remove it, so this is
unavoidable.

Wrt switching from _cleanup to _remove, iirc there were troubles with the
latter calling into the fb->funcs->destroy hook. But many drivers have
their fbdev fb embedded into some struct (instead of a pointer like i915),
and then things go sideways badly. That's why you can't just blindly
replace them.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 25+ messages in thread
* Re: [Nouveau] [PATCH 9/9] drm: Turn off crtc before tearing down its data structure
  2016-06-01 14:40                 ` Daniel Vetter
@ 2016-06-03  7:30                   ` Lukas Wunner
  2016-06-03 18:21                     ` Daniel Vetter
  0 siblings, 1 reply; 25+ messages in thread
From: Lukas Wunner @ 2016-06-03 7:30 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Alex Deucher, Nouveau Dev, intel-gfx, dri-devel, Dave Airlie

On Wed, Jun 01, 2016 at 04:40:12PM +0200, Daniel Vetter wrote:
> On Wed, Jun 01, 2016 at 02:36:41PM +0200, Lukas Wunner wrote:
> > Also, i915 calls drm_framebuffer_unregister_private() before it calls
> > drm_framebuffer_remove(). This ordering has the unfortunate side effect
> > that the drm_framebuffer has ID 0 in log messages emitted by
> > drm_framebuffer_remove():
> >
> > [ 39.680874] [drm:drm_mode_object_unreference] OBJ ID: 0 (3)
> > [ 39.680878] [drm:drm_mode_object_unreference] OBJ ID: 0 (2)
> > [ 39.680884] [drm:drm_mode_object_unreference] OBJ ID: 0 (1)
>
> Well we must first unregister it before we can remove it, so this is
> unavoidable.

Yes but drm_framebuffer_free() calls drm_mode_object_unregister()
and is invoked by drm_framebuffer_remove(), so the additional call to
drm_framebuffer_unregister_private() in intel_fbdev_destroy() seems
superfluous. Or is there some reason I'm missing that this needs to
be called before intel_unpin_fb_obj()?

> Wrt switching from _cleanup to _remove, iirc there were troubles with the
> latter calling into the fb->funcs->destroy hook. But many drivers have
> their fbdev fb embedded into some struct (instead of a pointer like i915),
> and then things go sideways badly. That's why you can't just blindly
> replace them.

So the options seem to be:

(1) Refactor nouveau, radeon and amdgpu to not embed their framebuffer
    struct in their fbdev struct, so that drm_framebuffer_remove() can
    be used.

(2) Amend each of them to turn off crtcs which are using the fbdev
    framebuffer, duplicating the code in drm_framebuffer_remove().

(3) Split drm_framebuffer_remove(), move the portion to turn off crtcs
    into a separate helper, say, drm_framebuffer_deactivate(), call that
    from nouveau, radeon and amdgpu.

(4) Go back to square one and use patch [9/9] of this series.

Which one would be most preferred? Is there another solution I've missed?

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 25+ messages in thread
* Re: [Nouveau] [PATCH 9/9] drm: Turn off crtc before tearing down its data structure
  2016-06-03  7:30                   ` [Nouveau] " Lukas Wunner
@ 2016-06-03 18:21                     ` Daniel Vetter
  2016-06-08 16:55                       ` Lukas Wunner
  0 siblings, 1 reply; 25+ messages in thread
From: Daniel Vetter @ 2016-06-03 18:21 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: Nouveau Dev, intel-gfx, dri-devel, Alex Deucher, Dave Airlie

On Fri, Jun 03, 2016 at 09:30:06AM +0200, Lukas Wunner wrote:
> So the options seem to be:
>
> (1) Refactor nouveau, radeon and amdgpu to not embed their framebuffer
>     struct in their fbdev struct, so that drm_framebuffer_remove() can
>     be used.
>
> (2) Amend each of them to turn off crtcs which are using the fbdev
>     framebuffer, duplicating the code in drm_framebuffer_remove().
>
> (3) Split drm_framebuffer_remove(), move the portion to turn off crtcs
>     into a separate helper, say, drm_framebuffer_deactivate(), call that
>     from nouveau, radeon and amdgpu.
>
> (4) Go back to square one and use patch [9/9] of this series.
>
> Which one would be most preferred? Is there another solution I've missed?

I think a dedicated turn_off_everything helper would be best. We'd need an
atomic and a legacy version (because hooray), but that would work in all
cases. Relying on the implicit behaviour to turn off everything (strictly
speaking you only need to turn off all the planes, you can leave crtcs on,
and that's what most atomic drivers want really under normal
circumstances) is a bit fragile, and it's also possible to disable fbdev
emulation. If your driver needs everything to be off in module unload, then
it's imo best to explicitly enforce that.

So "(5) Write dedicated helper to turn off everything" is imo the right
fix.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 25+ messages in thread
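What option (5) buys can be sketched with a toy userspace model. This is not the helper that was eventually posted and not a DRM API: all toy_* names are hypothetical. The model assumes, based on the commit message's pointer to have_disp_power_ref, that the driver holds one power reference for as long as any crtc is enabled; the exact bookkeeping in the real drivers may differ.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NUM_CRTCS 2

/* Toy device: per-crtc state plus a single display power reference. */
struct toy_device {
	bool crtc_enabled[TOY_NUM_CRTCS];
	bool have_disp_power_ref;	/* assumed: one ref while any crtc is on */
	int usage_count;		/* stand-in for the runtime pm usage count */
};

/* Enable or disable one crtc and rebalance the display power reference. */
void toy_crtc_set(struct toy_device *dev, int idx, bool enable)
{
	bool active = false;

	dev->crtc_enabled[idx] = enable;
	for (int i = 0; i < TOY_NUM_CRTCS; i++)
		active = active || dev->crtc_enabled[i];

	if (active && !dev->have_disp_power_ref) {
		dev->usage_count++;
		dev->have_disp_power_ref = true;
	} else if (!active && dev->have_disp_power_ref) {
		dev->usage_count--;
		dev->have_disp_power_ref = false;
	}
}

/* Hypothetical "turn off everything" helper, called explicitly on unload. */
void toy_disable_all(struct toy_device *dev)
{
	for (int i = 0; i < TOY_NUM_CRTCS; i++)
		if (dev->crtc_enabled[i])
			toy_crtc_set(dev, i, false);
}

/* Returns the references still held after a simulated unload. */
int toy_refs_after_unload(bool disable_all_first)
{
	struct toy_device dev = { .usage_count = 0 };

	toy_crtc_set(&dev, 0, true);
	toy_crtc_set(&dev, 1, true);
	if (disable_all_first)
		toy_disable_all(&dev);
	return dev.usage_count;
}
```

Because the helper goes through the same path that took the reference, the count is rebalanced regardless of whether a framebuffer teardown happened to switch the crtcs off earlier.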
* Re: [Nouveau] [PATCH 9/9] drm: Turn off crtc before tearing down its data structure
  2016-06-03 18:21                     ` Daniel Vetter
@ 2016-06-08 16:55                       ` Lukas Wunner
  0 siblings, 0 replies; 25+ messages in thread
From: Lukas Wunner @ 2016-06-08 16:55 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Alex Deucher, Nouveau Dev, intel-gfx, dri-devel, Dave Airlie

On Fri, Jun 03, 2016 at 08:21:50PM +0200, Daniel Vetter wrote:
> On Fri, Jun 03, 2016 at 09:30:06AM +0200, Lukas Wunner wrote:
> > Which one would be most preferred? Is there another solution I've missed?
>
> I think a dedicated turn_off_everything helper would be best. We'd need an
> atomic and a legacy version (because hooray), but that would work in all
> cases. Relying on the implicit behaviour to turn off everything (strictly
> speaking you only need to turn off all the planes, you can leave crtcs on,
> and that's what most atomic drivers want really under normal
> circumstances) is a bit fragile, and it's also possible to disable fbdev
> emulation. If you driver needs everything to be off in module unload, then
> it's imo best to explicitly enforce that.
>
> So "(5) Write dedicated helper to turn off everything" is imo the right
> fix.

Okay I did that and just posted it as v2. Hope I've understood correctly
what you suggested, if not please let me know and I'll rectify in a v3.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 25+ messages in thread
* [PATCH 1/9] drm/nouveau: Don't leak runtime pm ref on driver unload
       [not found] ` <cover.1464103767.git.lukas-JFq808J9C/izQB+pC5nmwQ@public.gmane.org>
  2016-05-24 16:03   ` [PATCH 2/9] drm/nouveau: Forbid runtime pm on driver unload Lukas Wunner
  2016-05-24 16:03   ` [PATCH 9/9] drm: Turn off crtc before tearing down its data structure Lukas Wunner
@ 2016-05-24 16:03   ` Lukas Wunner
       [not found]     ` <dd120a30cb769c93af8973cae41f61831d17e04b.1464103767.git.lukas-JFq808J9C/izQB+pC5nmwQ@public.gmane.org>
  2 siblings, 1 reply; 25+ messages in thread
From: Lukas Wunner @ 2016-05-24 16:03 UTC (permalink / raw)
  To: dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Dave Airlie

nouveau_drm_load() calls pm_runtime_put() if nouveau_runtime_pm != 0,
but nouveau_drm_unload() calls pm_runtime_get_sync() unconditionally.
We therefore leak a runtime pm ref whenever nouveau is loaded with
runpm=0 and then unloaded. The GPU will subsequently never runtime
suspend even if nouveau is loaded again with runpm=1.

Fix by taking the runtime pm ref under the same condition that it was
released on driver load.

Fixes: 5addcf0a5f0f ("nouveau: add runtime PM support (v0.9)")
Cc: Dave Airlie <airlied@redhat.com>
Reported-by: Karol Herbst <nouveau@karolherbst.de>
Tested-by: Karol Herbst <nouveau@karolherbst.de>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
 drivers/gpu/drm/nouveau/nouveau_drm.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index 11f8dd9..faf7438 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -498,7 +498,10 @@ nouveau_drm_unload(struct drm_device *dev)
 {
 	struct nouveau_drm *drm = nouveau_drm(dev);
 
-	pm_runtime_get_sync(dev->dev);
+	if (nouveau_runtime_pm != 0) {
+		pm_runtime_get_sync(dev->dev);
+	}
+
 	nouveau_fbcon_fini(dev);
 	nouveau_accel_fini(drm);
 	nouveau_hwmon_fini(dev);
-- 
2.8.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread
* Re: [PATCH 1/9] drm/nouveau: Don't leak runtime pm ref on driver unload
       [not found] ` <dd120a30cb769c93af8973cae41f61831d17e04b.1464103767.git.lukas-JFq808J9C/izQB+pC5nmwQ@public.gmane.org>
@ 2016-05-27 1:07 ` Peter Wu
  2016-05-29 15:50 ` Lukas Wunner
  0 siblings, 1 reply; 25+ messages in thread
From: Peter Wu @ 2016-05-27 1:07 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
      dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Dave Airlie

On Tue, May 24, 2016 at 06:03:27PM +0200, Lukas Wunner wrote:
> nouveau_drm_load() calls pm_runtime_put() if nouveau_runtime_pm != 0,
> but nouveau_drm_unload() calls pm_runtime_get_sync() unconditionally.
> We therefore leak a runtime pm ref whenever nouveau is loaded with
> runpm=0 and then unloaded. The GPU will subsequently never runtime
> suspend even if nouveau is loaded again with runpm=1.
>
> Fix by taking the runtime pm ref under the same condition that it was
> released on driver load.
>
> Fixes: 5addcf0a5f0f ("nouveau: add runtime PM support (v0.9)")
> Cc: Dave Airlie <airlied@redhat.com>
> Reported-by: Karol Herbst <nouveau@karolherbst.de>
> Tested-by: Karol Herbst <nouveau@karolherbst.de>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>

Looks good, I tested this scenario:

    ru(){ cat /sys/bus/pci/devices/0000\:01:00.0/power/runtime_usage;}
    ru # reports 1
    modprobe nouveau runpm=0
    ru # reports 2
    rmmod nouveau
    ru # reports 1

Without runpm=0 the count drops to 0 in the second step and stays 0 in
the third step. After applying patch 2/9, this correctly reports 1 as
expected (this is the same as manually setting power/control to on).

Peter
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl
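The usage counts Peter reports can be reproduced with a small userspace model. This is only a sketch, not kernel code: the toy_* names are invented for illustration, and the two references attributed to the bus core around probe and remove, as well as the decrement standing in for pm_runtime_allow(), are assumptions chosen because they match the numbers reported above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the counter behind power/runtime_usage. */
struct toy_dev {
	int usage;
};

void toy_get(struct toy_dev *d)
{
	d->usage++;
}

void toy_put(struct toy_dev *d)
{
	assert(d->usage > 0);	/* an unbalanced put would be a bug as well */
	d->usage--;
}

/* modprobe: the bus core is assumed to hold one ref across probe. */
void toy_modprobe(struct toy_dev *d, bool runpm)
{
	toy_get(d);		/* bus core, before ->probe (assumption) */
	if (runpm) {
		toy_put(d);	/* stands in for pm_runtime_allow() (assumption) */
		toy_put(d);	/* pm_runtime_put() in nouveau_drm_load() */
	}
}

/* rmmod: 'patched' selects the conditional get introduced by patch 1/9. */
void toy_rmmod(struct toy_dev *d, bool runpm, bool patched)
{
	toy_get(d);		/* bus core, before ->remove (assumption) */
	if (runpm || !patched)
		toy_get(d);	/* pm_runtime_get_sync() in nouveau_drm_unload() */
	toy_put(d);		/* bus core drops its two refs (assumption) */
	toy_put(d);
}
```

Starting from a count of 1, toy_modprobe(&d, false) followed by toy_rmmod(&d, false, true) walks through 2 and back to 1, as in the test above. With patched set to false the count ends at 2, which is the leaked reference the patch removes.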
* Re: [PATCH 1/9] drm/nouveau: Don't leak runtime pm ref on driver unload
  2016-05-27 1:07 ` Peter Wu
@ 2016-05-29 15:50 ` Lukas Wunner
       [not found] ` <20160529155006.GA12909-JFq808J9C/izQB+pC5nmwQ@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Lukas Wunner @ 2016-05-29 15:50 UTC (permalink / raw)
  To: Peter Wu
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
      dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Dave Airlie

Hi Peter,

On Fri, May 27, 2016 at 03:07:33AM +0200, Peter Wu wrote:
> On Tue, May 24, 2016 at 06:03:27PM +0200, Lukas Wunner wrote:
> > nouveau_drm_load() calls pm_runtime_put() if nouveau_runtime_pm != 0,
> > but nouveau_drm_unload() calls pm_runtime_get_sync() unconditionally.
> > We therefore leak a runtime pm ref whenever nouveau is loaded with
> > runpm=0 and then unloaded. The GPU will subsequently never runtime
> > suspend even if nouveau is loaded again with runpm=1.
> >
> > Fix by taking the runtime pm ref under the same condition that it was
> > released on driver load.
> >
> > Fixes: 5addcf0a5f0f ("nouveau: add runtime PM support (v0.9)")
> > Cc: Dave Airlie <airlied@redhat.com>
> > Reported-by: Karol Herbst <nouveau@karolherbst.de>
> > Tested-by: Karol Herbst <nouveau@karolherbst.de>
> > Signed-off-by: Lukas Wunner <lukas@wunner.de>
>
> Looks good, I tested this scenario:
>
>     ru(){ cat /sys/bus/pci/devices/0000\:01:00.0/power/runtime_usage;}
>     ru # reports 1
>     modprobe nouveau runpm=0
>     ru # reports 2
>     rmmod nouveau
>     ru # reports 1
>
> Without runpm=0 the count drops to 0 in the second step and stays 0 in
> the third step. After applying patch 2/9, this correctly reports 1 as
> expected (this is the same as manually setting power/control to on).

How exactly did you reach the situation where the root port didn't wake
up when you tried to load nouveau again? (IRC conversation this week.)

What's happening is, the PCI core will keep unbound devices (i.e.,
without driver) in D0 but the runtime status is allowed to change
to "suspended". So it'll appear to the kernel as if it was suspended
but in reality it stays in D0.

Once runtime pm for PCIe ports gets merged, the root port above the
GPU will indeed go to D3 in such a situation because the check
pm_children_suspended() (called from rpm_check_suspend_allowed())
returns true.

I'm not sure if this is desirable or not. If we keep unbound devices
in D0, should we allow ports above them to go to D3?

In any case, when nouveau is loaded again, local_pci_probe() will
call pm_runtime_get_sync(), which will implicitly set the runtime
status to "active" and which should also wake parents. So how did
you ever reach a point where you loaded nouveau and the root port
stayed asleep? Clearly we have a bug there, question is where.
This shouldn't work only if pm_runtime_forbid() was called on
driver unload.

Thanks for the extensive testing!

Lukas
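The mismatch Lukas describes, between the runtime status the PM core tracks and the power state the hardware is actually in, can be sketched in a few lines of userspace C. This is a deliberately simplified model under stated assumptions: all toy_* names are hypothetical, and the parent check below merely imitates the idea behind pm_children_suspended() rather than reproducing the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_rpm_status { TOY_RPM_ACTIVE, TOY_RPM_SUSPENDED };
enum toy_pci_state { TOY_D0, TOY_D3 };

struct toy_child {
	bool bound;			/* true if a driver is attached */
	enum toy_rpm_status status;	/* what the PM core believes */
	enum toy_pci_state power;	/* what the hardware is really in */
};

/* An unbound device gets marked suspended but physically stays in D0. */
void toy_runtime_suspend(struct toy_child *c)
{
	c->status = TOY_RPM_SUSPENDED;
	if (c->bound)
		c->power = TOY_D3;
}

/* The parent check looks only at the bookkeeping, never at c->power. */
bool toy_parent_may_suspend(const struct toy_child *kids, int n)
{
	assert(n >= 0);
	for (int i = 0; i < n; i++)
		if (kids[i].status != TOY_RPM_SUSPENDED)
			return false;
	return true;
}
```

In this model an unbound GPU ends up with status TOY_RPM_SUSPENDED while its power field still reads TOY_D0, and toy_parent_may_suspend() nevertheless returns true. That is the situation in which the root port would be allowed to go to D3 above a device that was never powered down by its own driver.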
* Re: [PATCH 1/9] drm/nouveau: Don't leak runtime pm ref on driver unload
       [not found] ` <20160529155006.GA12909-JFq808J9C/izQB+pC5nmwQ@public.gmane.org>
@ 2016-05-30 17:03 ` Peter Wu
  2016-05-31 11:34 ` Lukas Wunner
  0 siblings, 1 reply; 25+ messages in thread
From: Peter Wu @ 2016-05-30 17:03 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
      dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Dave Airlie

On Sun, May 29, 2016 at 05:50:06PM +0200, Lukas Wunner wrote:
> Hi Peter,
>
> On Fri, May 27, 2016 at 03:07:33AM +0200, Peter Wu wrote:
> > On Tue, May 24, 2016 at 06:03:27PM +0200, Lukas Wunner wrote:
> > > nouveau_drm_load() calls pm_runtime_put() if nouveau_runtime_pm != 0,
> > > but nouveau_drm_unload() calls pm_runtime_get_sync() unconditionally.
> > > We therefore leak a runtime pm ref whenever nouveau is loaded with
> > > runpm=0 and then unloaded. The GPU will subsequently never runtime
> > > suspend even if nouveau is loaded again with runpm=1.
> > >
> > > Fix by taking the runtime pm ref under the same condition that it was
> > > released on driver load.
> > >
> > > Fixes: 5addcf0a5f0f ("nouveau: add runtime PM support (v0.9)")
> > > Cc: Dave Airlie <airlied@redhat.com>
> > > Reported-by: Karol Herbst <nouveau@karolherbst.de>
> > > Tested-by: Karol Herbst <nouveau@karolherbst.de>
> > > Signed-off-by: Lukas Wunner <lukas@wunner.de>
> >
> > Looks good, I tested this scenario:
> >
> >     ru(){ cat /sys/bus/pci/devices/0000\:01:00.0/power/runtime_usage;}
> >     ru # reports 1
> >     modprobe nouveau runpm=0
> >     ru # reports 2
> >     rmmod nouveau
> >     ru # reports 1
> >
> > Without runpm=0 the count drops to 0 in the second step and stays 0 in
> > the third step. After applying patch 2/9, this correctly reports 1 as
> > expected (this is the same as manually setting power/control to on).
>
> How exactly did you reach the situation where the root port didn't wake
> up when you tried to load nouveau again? (IRC conversation this week.)

Ensure that the pci/pm patches are applied, then:

 0. Unload nouveau (I have blacklisted it for testing).
 1. Enable rpm for the root port and children (control = auto).
 2. Verify in the kernel logs that the devices are sleeping:
    pcieport 0000:00:01.0: power state changed by ACPI to D3cold
 3. (Optional, to rule out issues with delays:) Disable rpm for the
    Nvidia device (control = on).
 4. modprobe nouveau.

The above test with v4.6 + 4 pci/pm patches (8b71f565) gives:

50.245795 MXM: GUID detected in BIOS
50.245948 nseval-0227 ns_evaluate : **** Execute method [\_SB.PCI0.GFX0._DSM] at AML address ffffc90000013b11 length 492
50.246016 ACPI Warning: \_SB.PCI0.GFX0._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20160108/nsarguments-95)
50.246044 nseval-0227 ns_evaluate : **** Execute method [\_SB.PCI0.GFX0._DSM] at AML address ffffc90000013b11 length 492
50.246110 nseval-0227 ns_evaluate : **** Execute method [\_SB.PCI0.PEG0.PEGP._DSM] at AML address ffffc90000018297 length 1F
50.246256 ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20160108/nsarguments-95)
50.246289 nseval-0227 ns_evaluate : **** Execute method [\_SB.PCI0.PEG0.PEGP._DSM] at AML address ffffc90000018297 length 1F
50.246443 ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20160108/nsarguments-95)
50.246457 nseval-0227 ns_evaluate : **** Execute method [\_SB.PCI0.PEG0.PEGP._DSM] at AML address ffffc90000018297 length 1F
50.246932 pci 0000:01:00.0: optimus capabilities: enabled, status dynamic power, hda bios codec supported
50.247005 VGA switcheroo: detected Optimus DSM method \_SB_.PCI0.PEG0.PEGP handle
50.247084 nseval-0227 ns_evaluate : **** Execute method [\_SB.PCI0.PEG0.PG00._ON] at AML address ffffc9000001086e length 11D
50.390140 pcieport 0000:00:01.0: power state changed by ACPI to D0
50.491893 nseval-0227 ns_evaluate : **** Execute method [\_SB.PCI0.PEG0._DSW] at AML address ffffc90000010a2d length 1D
50.492285 pcieport 0000:00:01.0: PME# disabled
50.492583 nouveau 0000:01:00.0: unknown chipset (ffffffff)
50.492687 nouveau: probe of 0000:01:00.0 failed with error -12
50.501990 nseval-0227 ns_evaluate : **** Execute method [\_SB.PCI0.PEG0._S0W] at AML address ffffc90000010a8e length 2
50.502403 pcieport 0000:00:01.0: PME# enabled
50.502601 nseval-0227 ns_evaluate : **** Execute method [\_SB.PCI0.PEG0._DSW] at AML address ffffc90000010a2d length 1D
50.513005 nseval-0227 ns_evaluate : **** Execute method [\_SB.PCI0.PEG0.PG00._OFF] at AML address ffffc90000010994 length 6D
50.533258 pcieport 0000:00:01.0: power state changed by ACPI to D3cold

(Note that this patch is not included.) When nouveau is operating
normally, I see that _PS0 is also called (which does not happen above).

If you think that mixing power resources with DSM causes this issue, I
also tried to apply my power resources work for nouveau but it gives the
same problem:

20.183306 MXM: GUID detected in BIOS
20.183606 ACPI Warning: \_SB.PCI0.GFX0._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20160108/nsarguments-95)
20.184158 ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20160108/nsarguments-95)
20.184547 ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20160108/nsarguments-95)
20.185152 pci 0000:01:00.0: optimus capabilities: enabled, status dynamic power, hda bios codec supported
20.185351 VGA switcheroo: detected Optimus DSM method \_SB_.PCI0.PEG0.PEGP handle
20.185384 nouveau: detected PR support, will not use DSM
20.185552 nouveau 0000:01:00.0: enabling device (0000 -> 0003)
20.185873 nouveau 0000:01:00.0: unknown chipset (ffffffff)
20.185946 nouveau: probe of 0000:01:00.0 failed with error -12

> What's happening is, the PCI core will keep unbound devices (i.e.,
> without driver) in D0 but the runtime status is allowed to change
> to "suspended". So it'll appear to the kernel as if it was suspended
> but in reality it stays in D0.
>
> Once runtime pm for PCIe ports gets merged, the root port above the
> GPU will indeed go to D3 in such a situation because the check
> pm_children_suspended() (called from rpm_check_suspend_allowed())
> returns true.
>
> I'm not sure if this is desirable or not. If we keep unbound devices
> in D0, should we allow ports above them to go to D3?

Maybe Rafael (linux-pm / linux-pci) can answer this question better? The
comments in local_pci_probe, pci_pm_runtime_suspend and
pci_pm_runtime_resume suggest that unbound devices are assumed in D0
which is apparently not the case when runtime PM is enabled.

> In any case, when nouveau is loaded again, local_pci_probe() will
> call pm_runtime_get_sync(), which will implicitly set the runtime
> status to "active" and which should also wake parents. So how did
> you ever reach a point where you loaded nouveau and the root port
> stayed asleep? Clearly we have a bug there, question is where.
> This shouldn't work only if pm_runtime_forbid() was called on
> driver unload.
>
> Thanks for the extensive testing!
> Lukas

Both devices (root port and Nvidia) were resumed, but somehow the Nvidia
card was not fully initialized/ready (as you can see in the above logs).

Peter
* Re: [PATCH 1/9] drm/nouveau: Don't leak runtime pm ref on driver unload
  2016-05-30 17:03 ` Peter Wu
@ 2016-05-31 11:34 ` Lukas Wunner
       [not found] ` <20160531113443.GA14098-JFq808J9C/izQB+pC5nmwQ@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Lukas Wunner @ 2016-05-31 11:34 UTC (permalink / raw)
  To: Peter Wu
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
      dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Dave Airlie

On Mon, May 30, 2016 at 07:03:46PM +0200, Peter Wu wrote:
> On Sun, May 29, 2016 at 05:50:06PM +0200, Lukas Wunner wrote:
> > How exactly did you reach the situation where the root port didn't wake
> > up when you tried to load nouveau again? (IRC conversation this week.)
>
> Ensure that the pci/pm patches are applied, then:
>
>  0. Unload nouveau (I have blacklisted it for testing).
>  1. Enable rpm for the root port and children (control = auto).
>  2. Verify in the kernel logs that the devices are sleeping:
>     pcieport 0000:00:01.0: power state changed by ACPI to D3cold
>  3. (Optional, to rule out issues with delays:) Disable rpm for the
>     Nvidia device (control = on).
>  4. modprobe nouveau.
>
> The above test with v4.6 + 4 pci/pm patches (8b71f565) gives:
>
> 50.245795 MXM: GUID detected in BIOS
> 50.245948 nseval-0227 ns_evaluate : **** Execute method [\_SB.PCI0.GFX0._DSM] at AML address ffffc90000013b11 length 492
> 50.246016 ACPI Warning: \_SB.PCI0.GFX0._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20160108/nsarguments-95)
> 50.246044 nseval-0227 ns_evaluate : **** Execute method [\_SB.PCI0.GFX0._DSM] at AML address ffffc90000013b11 length 492
> 50.246110 nseval-0227 ns_evaluate : **** Execute method [\_SB.PCI0.PEG0.PEGP._DSM] at AML address ffffc90000018297 length 1F
> 50.246256 ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20160108/nsarguments-95)
> 50.246289 nseval-0227 ns_evaluate : **** Execute method [\_SB.PCI0.PEG0.PEGP._DSM] at AML address ffffc90000018297 length 1F
> 50.246443 ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20160108/nsarguments-95)
> 50.246457 nseval-0227 ns_evaluate : **** Execute method [\_SB.PCI0.PEG0.PEGP._DSM] at AML address ffffc90000018297 length 1F
> 50.246932 pci 0000:01:00.0: optimus capabilities: enabled, status dynamic power, hda bios codec supported
> 50.247005 VGA switcheroo: detected Optimus DSM method \_SB_.PCI0.PEG0.PEGP handle
> 50.247084 nseval-0227 ns_evaluate : **** Execute method [\_SB.PCI0.PEG0.PG00._ON] at AML address ffffc9000001086e length 11D
> 50.390140 pcieport 0000:00:01.0: power state changed by ACPI to D0
> 50.491893 nseval-0227 ns_evaluate : **** Execute method [\_SB.PCI0.PEG0._DSW] at AML address ffffc90000010a2d length 1D
> 50.492285 pcieport 0000:00:01.0: PME# disabled
> 50.492583 nouveau 0000:01:00.0: unknown chipset (ffffffff)
> 50.492687 nouveau: probe of 0000:01:00.0 failed with error -12

I've tested this on a MacBook Pro, which does not have ACPI _PR3
methods for the root port to which the discrete GPU is attached.
The port can thus only suspend to D3hot, not D3cold.

Even without patch [2/9], when unloading nouveau and letting the
root port go to D3hot, the port is subsequently correctly resumed
to D0 when reloading nouveau.

So the issue that you're seeing without patch [2/9] seems to be
specific to Optimus/_PR3 machines. If possible you should try to
get it working without patch [2/9] because that patch is really
optional (as I've written in the commit message). I'm totally
unfamiliar with Optimus but maybe lspci could help to debug this?

Lukas
* Re: [PATCH 1/9] drm/nouveau: Don't leak runtime pm ref on driver unload
       [not found] ` <20160531113443.GA14098-JFq808J9C/izQB+pC5nmwQ@public.gmane.org>
@ 2016-05-31 11:41 ` Peter Wu
  0 siblings, 0 replies; 25+ messages in thread
From: Peter Wu @ 2016-05-31 11:41 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
      dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Dave Airlie

On Tue, May 31, 2016 at 01:34:43PM +0200, Lukas Wunner wrote:
> On Mon, May 30, 2016 at 07:03:46PM +0200, Peter Wu wrote:
> > On Sun, May 29, 2016 at 05:50:06PM +0200, Lukas Wunner wrote:
> > > How exactly did you reach the situation where the root port didn't wake
> > > up when you tried to load nouveau again? (IRC conversation this week.)
> >
> > Ensure that the pci/pm patches are applied, then:
> >
> >  0. Unload nouveau (I have blacklisted it for testing).
> >  1. Enable rpm for the root port and children (control = auto).
> >  2. Verify in the kernel logs that the devices are sleeping:
> >     pcieport 0000:00:01.0: power state changed by ACPI to D3cold
> >  3. (Optional, to rule out issues with delays:) Disable rpm for the
> >     Nvidia device (control = on).
> >  4. modprobe nouveau.
> >
> > The above test with v4.6 + 4 pci/pm patches (8b71f565) gives:
> >
> > 50.245795 MXM: GUID detected in BIOS
> > 50.245948 nseval-0227 ns_evaluate : **** Execute method [\_SB.PCI0.GFX0._DSM] at AML address ffffc90000013b11 length 492
> > 50.246016 ACPI Warning: \_SB.PCI0.GFX0._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20160108/nsarguments-95)
> > 50.246044 nseval-0227 ns_evaluate : **** Execute method [\_SB.PCI0.GFX0._DSM] at AML address ffffc90000013b11 length 492
> > 50.246110 nseval-0227 ns_evaluate : **** Execute method [\_SB.PCI0.PEG0.PEGP._DSM] at AML address ffffc90000018297 length 1F
> > 50.246256 ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20160108/nsarguments-95)
> > 50.246289 nseval-0227 ns_evaluate : **** Execute method [\_SB.PCI0.PEG0.PEGP._DSM] at AML address ffffc90000018297 length 1F
> > 50.246443 ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20160108/nsarguments-95)
> > 50.246457 nseval-0227 ns_evaluate : **** Execute method [\_SB.PCI0.PEG0.PEGP._DSM] at AML address ffffc90000018297 length 1F
> > 50.246932 pci 0000:01:00.0: optimus capabilities: enabled, status dynamic power, hda bios codec supported
> > 50.247005 VGA switcheroo: detected Optimus DSM method \_SB_.PCI0.PEG0.PEGP handle
> > 50.247084 nseval-0227 ns_evaluate : **** Execute method [\_SB.PCI0.PEG0.PG00._ON] at AML address ffffc9000001086e length 11D
> > 50.390140 pcieport 0000:00:01.0: power state changed by ACPI to D0
> > 50.491893 nseval-0227 ns_evaluate : **** Execute method [\_SB.PCI0.PEG0._DSW] at AML address ffffc90000010a2d length 1D
> > 50.492285 pcieport 0000:00:01.0: PME# disabled
> > 50.492583 nouveau 0000:01:00.0: unknown chipset (ffffffff)
> > 50.492687 nouveau: probe of 0000:01:00.0 failed with error -12
>
> I've tested this on a MacBook Pro, which does not have ACPI _PR3
> methods for the root port to which the discrete GPU is attached.
> The port can thus only suspend to D3hot, not D3cold.
>
> Even without patch [2/9], when unloading nouveau and letting the
> root port go to D3hot, the port is subsequently correctly resumed
> to D0 when reloading nouveau.
>
> So the issue that you're seeing without patch [2/9] seems to be
> specific to Optimus/_PR3 machines. If possible you should try to
> get it working without patch [2/9] because that patch is really
> optional (as I've written in the commit message). I'm totally
> unfamiliar with Optimus but maybe lspci could help to debug this?

Without 2/9 I can prevent the issue by writing "on" to
/sys/bus/pci/devices/0000:00:01.0/power/control (the PCIe port), but
that effectively gives the same result as applying 2/9.

The problem occurs when the power is lost (by putting the PCIe port in
D3cold). Maybe it is a bug in the PCI core that does not re-initialize
devices under the port, but since a workaround is available (2/9), I
will focus on other issues first. Maybe it is worth mentioning this
issue in the commit message for 2/9 though.
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl
* [PATCH 8/9] drm/amdgpu: Forbid runtime pm on driver unload
  2016-05-24 16:03 [PATCH 0/9] Fix runtime pm ref leaks Lukas Wunner
  ` (5 preceding siblings ...)
       [not found] ` <cover.1464103767.git.lukas-JFq808J9C/izQB+pC5nmwQ@public.gmane.org>
@ 2016-05-24 16:03 ` Lukas Wunner
  6 siblings, 0 replies; 25+ messages in thread
From: Lukas Wunner @ 2016-05-24 16:03 UTC (permalink / raw)
  To: dri-devel; +Cc: Alex Deucher, Dave Airlie

The PCI core calls pm_runtime_forbid() on device probe in pci_pm_init(),
making this the default state when amdgpu is loaded.
amdgpu_driver_load_kms() therefore calls pm_runtime_allow(), but there's
no pm_runtime_forbid() in amdgpu_driver_unload_kms() to balance it. Add
it so that we leave the device in the same state that we found it.

This isn't a bug, it's just good housekeeping. When amdgpu is first
loaded with runpm=1, then unloaded and loaded again with runpm=0,
pm_runtime_forbid() will be called from amdgpu_pmops_runtime_idle() or
amdgpu_pmops_runtime_suspend(), so the behaviour is correct. If there
ever is a third party driver for AMD cards, this commit avoids that it
has to clean up behind amdgpu.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 0db692e..38a28d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -62,6 +62,7 @@ int amdgpu_driver_unload_kms(struct drm_device *dev)
 
 	if (amdgpu_device_is_px(dev)) {
 		pm_runtime_get_sync(dev->dev);
+		pm_runtime_forbid(dev->dev);
 	}
 
 	amdgpu_amdkfd_device_fini(adev);
-- 
2.8.1
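The allow/forbid balance this patch restores can be illustrated with a toy model. It is a sketch under assumptions, not the kernel implementation: the toy_* names are hypothetical, the model ignores the get/put calls in the real load and unload paths, and it assumes that forbidding holds one usage reference which allowing releases again.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_rpm {
	bool runtime_auto;	/* false corresponds to power/control = "on" */
	int usage;
};

/* Assumed semantics: forbidding pins the device with one usage ref. */
void toy_forbid(struct toy_rpm *p)
{
	if (!p->runtime_auto)
		return;		/* already forbidden, nothing to do */
	p->runtime_auto = false;
	p->usage++;
}

/* Assumed semantics: allowing drops the ref taken by toy_forbid(). */
void toy_allow(struct toy_rpm *p)
{
	if (p->runtime_auto)
		return;		/* already allowed, nothing to do */
	assert(p->usage > 0);
	p->runtime_auto = true;
	p->usage--;
}

/* Driver load: allow runtime pm on PX (hybrid graphics) systems only. */
void toy_load_kms(struct toy_rpm *p, bool px)
{
	if (px)
		toy_allow(p);
}

/* Driver unload: 'patched' adds the forbid that balances the allow. */
void toy_unload_kms(struct toy_rpm *p, bool px, bool patched)
{
	if (px && patched)
		toy_forbid(p);
}
```

Starting from the forbidden default (runtime_auto false, usage 1), a load followed by a patched unload returns the model to exactly that state. Without the patch, runtime_auto stays true and usage stays 0, so the device is not left as it was found.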