* [PATCH 1/3] drm/atomic-helper: Fix leak in disable_all @ 2017-07-14 22:46 Daniel Vetter 2017-07-14 22:46 ` [PATCH 2/3] drm/i915: Fix fbdev unload sequence Daniel Vetter ` (3 more replies) 0 siblings, 4 replies; 9+ messages in thread From: Daniel Vetter @ 2017-07-14 22:46 UTC (permalink / raw) To: DRI Development Cc: Daniel Vetter, Daniel Vetter, Intel Graphics Development, martin.peres The legacy plane->fb pointer is refcounted by calling drm_atomic_clean_old_fb(). In practice this isn't a real problem because: - The caller in the i915 gpu reset code restores the original state again, which means the plane->fb pointer won't change, hence can't leak. - Drivers using drm_atomic_helper_shutdown call the fbdev cleanup first, and that usually cleans up the fb through drm_remove_framebuffer, which does this correctly. - Without fbdev the only framebuffers are from userspace, and those get cleaned up (again using drm_remove_framebuffer) befor the driver can even be unloaded. But in i915 I've switched the cleanup sequence around so that the _shutdown() calls happens after the drm_remove_framebuffer(), which is how I discovered this issue. v2: My analysis why the current code was ok for gpu reset and suspend/resume was correct, but then I totally failed to realize that we better keep this symmetric. Thanksfully CI noticed that for balance, a refcounting bug must exist at 2 places if previously there was no issue ... Cc: martin.peres@free.fr Cc: chris@chris-wilson.co.uk Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> --- drivers/gpu/drm/drm_atomic_helper.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c index b07fc30372d3..3a04619d3bec 100644 --- a/drivers/gpu/drm/drm_atomic_helper.c +++ b/drivers/gpu/drm/drm_atomic_helper.c @@ -2726,6 +2726,7 @@ int drm_atomic_helper_disable_all(struct drm_device *dev, struct drm_plane *plane; struct drm_crtc_state *crtc_state; struct drm_crtc *crtc; + unsigned plane_mask = 0; int ret, i; state = drm_atomic_state_alloc(dev); @@ -2768,10 +2769,14 @@ int drm_atomic_helper_disable_all(struct drm_device *dev, goto free; drm_atomic_set_fb_for_plane(plane_state, NULL); + plane_mask |= BIT(drm_plane_index(plane)); + plane->old_fb = plane->fb; } ret = drm_atomic_commit(state); free: + if (plane_mask) + drm_atomic_clean_old_fb(dev, plane_mask, ret); drm_atomic_state_put(state); return ret; } @@ -2902,6 +2907,8 @@ int drm_atomic_helper_commit_duplicated_state(struct drm_atomic_state *state, struct drm_connector_state *new_conn_state; struct drm_crtc *crtc; struct drm_crtc_state *new_crtc_state; + struct drm_device *dev = state->dev; + int ret; state->acquire_ctx = ctx; @@ -2914,7 +2921,10 @@ int drm_atomic_helper_commit_duplicated_state(struct drm_atomic_state *state, for_each_new_connector_in_state(state, connector, new_conn_state, i) state->connectors[i].old_state = connector->state; - return drm_atomic_commit(state); + ret = drm_atomic_commit(state); + drm_atomic_clean_old_fb(dev, ~0U, ret); + + return ret; } EXPORT_SYMBOL(drm_atomic_helper_commit_duplicated_state); -- 2.13.2 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 2/3] drm/i915: Fix fbdev unload sequence 2017-07-14 22:46 [PATCH 1/3] drm/atomic-helper: Fix leak in disable_all Daniel Vetter @ 2017-07-14 22:46 ` Daniel Vetter 2017-07-14 22:46 ` [PATCH 3/3] drm/i915: unregister interfaces first in unload Daniel Vetter ` (2 subsequent siblings) 3 siblings, 0 replies; 9+ messages in thread From: Daniel Vetter @ 2017-07-14 22:46 UTC (permalink / raw) To: DRI Development Cc: Daniel Vetter, Daniel Vetter, Intel Graphics Development, martin.peres First thing we need to do is unregister the fbdev instance, but we can't just go ahead and kfree it. That must wait until the hotplug and polling work are stopped, since they can race with the with the teardown. That means we need to split up the fbdev teardown into the unregister part and the cleanup part. I originally suspected that this was broken in one of the unload shuffles, but on closer inspection the oldest sequence I've dug out also gets this wrong. Just not quite so badly. I've run drv_module_reload a few hundred times and it's rock solid compared to insta-death beforehand. This bug seems to have been uncovered by commit 88be58be886f1215cc73dc8c273c985eecd7385c Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Thu Jul 6 15:00:19 2017 +0200 drm/i915/fbdev: Always forward hotplug events But the effect of that seems to only be to increase the race window enough to make it blow up easier. I'm not exactly clear on what's going on there ... v2: Fix whitespace and use fetch_and_zero (Chris). Testcase: igt/drv_module_reload Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101791 Cc: martin.peres@free.fr Cc: chris@chris-wilson.co.uk Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> --- drivers/gpu/drm/i915/i915_drv.c | 3 +-- drivers/gpu/drm/i915/intel_display.c | 3 +++ drivers/gpu/drm/i915/intel_drv.h | 9 +++++++-- drivers/gpu/drm/i915/intel_fbdev.c | 22 +++++++++++++++------- 4 files changed, 26 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index d310d8245dca..2c2afb19975d 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1240,6 +1240,7 @@ static void i915_driver_register(struct drm_i915_private *dev_priv) */ static void i915_driver_unregister(struct drm_i915_private *dev_priv) { + intel_fbdev_unregister(dev_priv); intel_audio_deinit(dev_priv); intel_gpu_ips_teardown(); @@ -1371,8 +1372,6 @@ void i915_driver_unload(struct drm_device *dev) struct drm_i915_private *dev_priv = to_i915(dev); struct pci_dev *pdev = dev_priv->drm.pdev; - intel_fbdev_fini(dev); - if (i915_gem_suspend(dev_priv)) DRM_ERROR("failed to idle hardware; continuing to unload!\n"); diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index ecb4e30673fc..e10fae2faa88 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -15111,6 +15111,9 @@ void intel_modeset_cleanup(struct drm_device *dev) */ drm_kms_helper_poll_fini(dev); + /* poll work can call into fbdev, hence clean that up afterwards */ + intel_fbdev_fini(dev_priv); + intel_unregister_dsm_handler(); intel_fbc_global_disable(dev_priv); diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h index 3132260f18ce..7361e983d953 100644 --- a/drivers/gpu/drm/i915/intel_drv.h +++ b/drivers/gpu/drm/i915/intel_drv.h @@ -1597,7 +1597,8 @@ void intel_hpd_poll_init(struct drm_i915_private *dev_priv); #ifdef CONFIG_DRM_FBDEV_EMULATION extern int intel_fbdev_init(struct drm_device *dev); extern void intel_fbdev_initial_config_async(struct drm_device *dev); -extern void intel_fbdev_fini(struct drm_device *dev); +extern void intel_fbdev_unregister(struct drm_i915_private *dev_priv); +extern void intel_fbdev_fini(struct drm_i915_private *dev_priv); extern void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous); extern void intel_fbdev_output_poll_changed(struct drm_device *dev); extern void intel_fbdev_restore_mode(struct drm_device *dev); @@ -1611,7 +1612,11 @@ static inline void intel_fbdev_initial_config_async(struct drm_device *dev) { } -static inline void intel_fbdev_fini(struct drm_device *dev) +static inline void intel_fbdev_unregister(struct drm_i915_private *dev_priv) +{ +} + +static inline void intel_fbdev_fini(struct drm_i915_private *dev_priv) { } diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c index b953365a3eec..391992373d27 100644 --- a/drivers/gpu/drm/i915/intel_fbdev.c +++ b/drivers/gpu/drm/i915/intel_fbdev.c @@ -538,8 +538,6 @@ static void intel_fbdev_destroy(struct intel_fbdev *ifbdev) * trying to rectify all the possible error paths leading here. */ - drm_fb_helper_unregister_fbi(&ifbdev->helper); - drm_fb_helper_fini(&ifbdev->helper); if (ifbdev->vma) { @@ -727,8 +725,10 @@ static void intel_fbdev_initial_config(void *data, async_cookie_t cookie) /* Due to peculiar init order wrt to hpd handling this is separate. */ if (drm_fb_helper_initial_config(&ifbdev->helper, - ifbdev->preferred_bpp)) - intel_fbdev_fini(ifbdev->helper.dev); + ifbdev->preferred_bpp)) { + intel_fbdev_unregister(to_i915(ifbdev->helper.dev)); + intel_fbdev_fini(to_i915(ifbdev->helper.dev)); + } } void intel_fbdev_initial_config_async(struct drm_device *dev) @@ -751,9 +751,8 @@ static void intel_fbdev_sync(struct intel_fbdev *ifbdev) ifbdev->cookie = 0; } -void intel_fbdev_fini(struct drm_device *dev) +void intel_fbdev_unregister(struct drm_i915_private *dev_priv) { - struct drm_i915_private *dev_priv = to_i915(dev); struct intel_fbdev *ifbdev = dev_priv->fbdev; if (!ifbdev) @@ -763,8 +762,17 @@ void intel_fbdev_fini(struct drm_device *dev) if (!current_is_async()) intel_fbdev_sync(ifbdev); + drm_fb_helper_unregister_fbi(&ifbdev->helper); +} + +void intel_fbdev_fini(struct drm_i915_private *dev_priv) +{ + struct intel_fbdev *ifbdev = fetch_and_zero(&dev_priv->fbdev); + + if (!ifbdev) + return; + intel_fbdev_destroy(ifbdev); - dev_priv->fbdev = NULL; } void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous) -- 2.13.2 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 3/3] drm/i915: unregister interfaces first in unload 2017-07-14 22:46 [PATCH 1/3] drm/atomic-helper: Fix leak in disable_all Daniel Vetter 2017-07-14 22:46 ` [PATCH 2/3] drm/i915: Fix fbdev unload sequence Daniel Vetter @ 2017-07-14 22:46 ` Daniel Vetter 2017-07-15 9:31 ` [PATCH] drm/atomic-helper: Fix leak in disable_all Daniel Vetter 2017-07-15 10:10 ` [PATCH 1/3] " Chris Wilson 3 siblings, 0 replies; 9+ messages in thread From: Daniel Vetter @ 2017-07-14 22:46 UTC (permalink / raw) To: DRI Development; +Cc: Daniel Vetter, Intel Graphics Development We first need to make sure no one else can get at us anymore, before we can proceed to tear down all the datastructures. Just a small step towards eventually the perfect unload code ... Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> --- drivers/gpu/drm/i915/i915_drv.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 2c2afb19975d..38990b264b97 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1372,6 +1372,8 @@ void i915_driver_unload(struct drm_device *dev) struct drm_i915_private *dev_priv = to_i915(dev); struct pci_dev *pdev = dev_priv->drm.pdev; + i915_driver_unregister(dev_priv); + if (i915_gem_suspend(dev_priv)) DRM_ERROR("failed to idle hardware; continuing to unload!\n"); @@ -1381,8 +1383,6 @@ void i915_driver_unload(struct drm_device *dev) intel_gvt_cleanup(dev_priv); - i915_driver_unregister(dev_priv); - intel_modeset_cleanup(dev); /* -- 2.13.2 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH] drm/atomic-helper: Fix leak in disable_all 2017-07-14 22:46 [PATCH 1/3] drm/atomic-helper: Fix leak in disable_all Daniel Vetter 2017-07-14 22:46 ` [PATCH 2/3] drm/i915: Fix fbdev unload sequence Daniel Vetter 2017-07-14 22:46 ` [PATCH 3/3] drm/i915: unregister interfaces first in unload Daniel Vetter @ 2017-07-15 9:31 ` Daniel Vetter 2017-07-17 9:39 ` [Intel-gfx] " Maarten Lankhorst 2017-07-15 10:10 ` [PATCH 1/3] " Chris Wilson 3 siblings, 1 reply; 9+ messages in thread From: Daniel Vetter @ 2017-07-15 9:31 UTC (permalink / raw) To: DRI Development Cc: Daniel Vetter, Daniel Vetter, Intel Graphics Development, martin.peres The legacy plane->fb pointer is refcounted by calling drm_atomic_clean_old_fb(). In practice this isn't a real problem because: - The caller in the i915 gpu reset code restores the original state again, which means the plane->fb pointer won't change, hence can't leak. - Drivers using drm_atomic_helper_shutdown call the fbdev cleanup first, and that usually cleans up the fb through drm_remove_framebuffer, which does this correctly. - Without fbdev the only framebuffers are from userspace, and those get cleaned up (again using drm_remove_framebuffer) befor the driver can even be unloaded. But in i915 I've switched the cleanup sequence around so that the _shutdown() calls happens after the drm_remove_framebuffer(), which is how I discovered this issue. v2: My analysis why the current code was ok for gpu reset and suspend/resume was correct, but then I totally failed to realize that we better keep this symmetric. Thanksfully CI noticed that for balance, a refcounting bug must exist at 2 places if previously there was no issue ... v3: Don't be lazy and compute the plane_mask in commit_duplicated_state properly too, instead of just using ~0U. Cc: martin.peres@free.fr Cc: chris@chris-wilson.co.uk Acked-by: Dave Airlie <airlied@gmail.com> (v1) Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> --- drivers/gpu/drm/drm_atomic_helper.c | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c index b07fc30372d3..545328a9a769 100644 --- a/drivers/gpu/drm/drm_atomic_helper.c +++ b/drivers/gpu/drm/drm_atomic_helper.c @@ -2726,6 +2726,7 @@ int drm_atomic_helper_disable_all(struct drm_device *dev, struct drm_plane *plane; struct drm_crtc_state *crtc_state; struct drm_crtc *crtc; + unsigned plane_mask = 0; int ret, i; state = drm_atomic_state_alloc(dev); @@ -2768,10 +2769,14 @@ int drm_atomic_helper_disable_all(struct drm_device *dev, goto free; drm_atomic_set_fb_for_plane(plane_state, NULL); + plane_mask |= BIT(drm_plane_index(plane)); + plane->old_fb = plane->fb; } ret = drm_atomic_commit(state); free: + if (plane_mask) + drm_atomic_clean_old_fb(dev, plane_mask, ret); drm_atomic_state_put(state); return ret; } @@ -2902,11 +2907,16 @@ int drm_atomic_helper_commit_duplicated_state(struct drm_atomic_state *state, struct drm_connector_state *new_conn_state; struct drm_crtc *crtc; struct drm_crtc_state *new_crtc_state; + unsigned plane_mask = 0; + struct drm_device *dev = state->dev; + int ret; state->acquire_ctx = ctx; - for_each_new_plane_in_state(state, plane, new_plane_state, i) + for_each_new_plane_in_state(state, plane, new_plane_state, i) { + plane_mask |= BIT(drm_plane_index(plane)); state->planes[i].old_state = plane->state; + } for_each_new_crtc_in_state(state, crtc, new_crtc_state, i) state->crtcs[i].old_state = crtc->state; @@ -2914,7 +2924,11 @@ int drm_atomic_helper_commit_duplicated_state(struct drm_atomic_state *state, for_each_new_connector_in_state(state, connector, new_conn_state, i) state->connectors[i].old_state = connector->state; - return drm_atomic_commit(state); + ret = drm_atomic_commit(state); + if (plane_mask) + drm_atomic_clean_old_fb(dev, plane_mask, ret); + + return ret; } EXPORT_SYMBOL(drm_atomic_helper_commit_duplicated_state); -- 2.13.2 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [Intel-gfx] [PATCH] drm/atomic-helper: Fix leak in disable_all 2017-07-15 9:31 ` [PATCH] drm/atomic-helper: Fix leak in disable_all Daniel Vetter @ 2017-07-17 9:39 ` Maarten Lankhorst 2017-07-17 15:21 ` Daniel Vetter 0 siblings, 1 reply; 9+ messages in thread From: Maarten Lankhorst @ 2017-07-17 9:39 UTC (permalink / raw) To: Daniel Vetter, DRI Development; +Cc: Daniel Vetter, Intel Graphics Development Op 15-07-17 om 11:31 schreef Daniel Vetter: > The legacy plane->fb pointer is refcounted by calling > drm_atomic_clean_old_fb(). > > In practice this isn't a real problem because: > - The caller in the i915 gpu reset code restores the original state > again, which means the plane->fb pointer won't change, hence can't > leak. > - Drivers using drm_atomic_helper_shutdown call the fbdev cleanup > first, and that usually cleans up the fb through > drm_remove_framebuffer, which does this correctly. > - Without fbdev the only framebuffers are from userspace, and those > get cleaned up (again using drm_remove_framebuffer) befor the driver > can even be unloaded. > > But in i915 I've switched the cleanup sequence around so that the > _shutdown() calls happens after the drm_remove_framebuffer(), which is > how I discovered this issue. > > v2: My analysis why the current code was ok for gpu reset and > suspend/resume was correct, but then I totally failed to realize that > we better keep this symmetric. Thanksfully CI noticed that for > balance, a refcounting bug must exist at 2 places if previously there > was no issue ... > > v3: Don't be lazy and compute the plane_mask in > commit_duplicated_state properly too, instead of just using ~0U. > > Cc: martin.peres@free.fr > Cc: chris@chris-wilson.co.uk > Acked-by: Dave Airlie <airlied@gmail.com> (v1) > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> > --- > drivers/gpu/drm/drm_atomic_helper.c | 18 ++++++++++++++++-- > 1 file changed, 16 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c > index b07fc30372d3..545328a9a769 100644 > --- a/drivers/gpu/drm/drm_atomic_helper.c > +++ b/drivers/gpu/drm/drm_atomic_helper.c > @@ -2726,6 +2726,7 @@ int drm_atomic_helper_disable_all(struct drm_device *dev, > struct drm_plane *plane; > struct drm_crtc_state *crtc_state; > struct drm_crtc *crtc; > + unsigned plane_mask = 0; > int ret, i; > > state = drm_atomic_state_alloc(dev); > @@ -2768,10 +2769,14 @@ int drm_atomic_helper_disable_all(struct drm_device *dev, > goto free; > > drm_atomic_set_fb_for_plane(plane_state, NULL); > + plane_mask |= BIT(drm_plane_index(plane)); > + plane->old_fb = plane->fb; > } > > ret = drm_atomic_commit(state); > free: > + if (plane_mask) > + drm_atomic_clean_old_fb(dev, plane_mask, ret); > drm_atomic_state_put(state); > return ret; > } > @@ -2902,11 +2907,16 @@ int drm_atomic_helper_commit_duplicated_state(struct drm_atomic_state *state, > struct drm_connector_state *new_conn_state; > struct drm_crtc *crtc; > struct drm_crtc_state *new_crtc_state; > + unsigned plane_mask = 0; > + struct drm_device *dev = state->dev; > + int ret; > > state->acquire_ctx = ctx; > > - for_each_new_plane_in_state(state, plane, new_plane_state, i) > + for_each_new_plane_in_state(state, plane, new_plane_state, i) { > + plane_mask |= BIT(drm_plane_index(plane)); We should really add a drm_plane_mask/drm_connector_mask at some point, to complement drm_crtc_mask. > state->planes[i].old_state = plane->state; > + } > > for_each_new_crtc_in_state(state, crtc, new_crtc_state, i) > state->crtcs[i].old_state = crtc->state; > @@ -2914,7 +2924,11 @@ int drm_atomic_helper_commit_duplicated_state(struct drm_atomic_state *state, > for_each_new_connector_in_state(state, connector, new_conn_state, i) > state->connectors[i].old_state = connector->state; > > - return drm_atomic_commit(state); > + ret = drm_atomic_commit(state); > + if (plane_mask) > + drm_atomic_clean_old_fb(dev, plane_mask, ret); Kill the if () part, and make it unconditional? On second thought, I should have done the same in drm_framebuffer.c > + return ret; > } > EXPORT_SYMBOL(drm_atomic_helper_commit_duplicated_state); > _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Intel-gfx] [PATCH] drm/atomic-helper: Fix leak in disable_all 2017-07-17 9:39 ` [Intel-gfx] " Maarten Lankhorst @ 2017-07-17 15:21 ` Daniel Vetter 2017-07-19 10:15 ` Maarten Lankhorst 0 siblings, 1 reply; 9+ messages in thread From: Daniel Vetter @ 2017-07-17 15:21 UTC (permalink / raw) To: Maarten Lankhorst Cc: Daniel Vetter, Intel Graphics Development, DRI Development, Daniel Vetter On Mon, Jul 17, 2017 at 11:39:56AM +0200, Maarten Lankhorst wrote: > Op 15-07-17 om 11:31 schreef Daniel Vetter: > > The legacy plane->fb pointer is refcounted by calling > > drm_atomic_clean_old_fb(). > > > > In practice this isn't a real problem because: > > - The caller in the i915 gpu reset code restores the original state > > again, which means the plane->fb pointer won't change, hence can't > > leak. > > - Drivers using drm_atomic_helper_shutdown call the fbdev cleanup > > first, and that usually cleans up the fb through > > drm_remove_framebuffer, which does this correctly. > > - Without fbdev the only framebuffers are from userspace, and those > > get cleaned up (again using drm_remove_framebuffer) befor the driver > > can even be unloaded. > > > > But in i915 I've switched the cleanup sequence around so that the > > _shutdown() calls happens after the drm_remove_framebuffer(), which is > > how I discovered this issue. > > > > v2: My analysis why the current code was ok for gpu reset and > > suspend/resume was correct, but then I totally failed to realize that > > we better keep this symmetric. Thanksfully CI noticed that for > > balance, a refcounting bug must exist at 2 places if previously there > > was no issue ... > > > > v3: Don't be lazy and compute the plane_mask in > > commit_duplicated_state properly too, instead of just using ~0U. > > > > Cc: martin.peres@free.fr > > Cc: chris@chris-wilson.co.uk > > Acked-by: Dave Airlie <airlied@gmail.com> (v1) > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> > > --- > > drivers/gpu/drm/drm_atomic_helper.c | 18 ++++++++++++++++-- > > 1 file changed, 16 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c > > index b07fc30372d3..545328a9a769 100644 > > --- a/drivers/gpu/drm/drm_atomic_helper.c > > +++ b/drivers/gpu/drm/drm_atomic_helper.c > > @@ -2726,6 +2726,7 @@ int drm_atomic_helper_disable_all(struct drm_device *dev, > > struct drm_plane *plane; > > struct drm_crtc_state *crtc_state; > > struct drm_crtc *crtc; > > + unsigned plane_mask = 0; > > int ret, i; > > > > state = drm_atomic_state_alloc(dev); > > @@ -2768,10 +2769,14 @@ int drm_atomic_helper_disable_all(struct drm_device *dev, > > goto free; > > > > drm_atomic_set_fb_for_plane(plane_state, NULL); > > + plane_mask |= BIT(drm_plane_index(plane)); > > + plane->old_fb = plane->fb; > > } > > > > ret = drm_atomic_commit(state); > > free: > > + if (plane_mask) > > + drm_atomic_clean_old_fb(dev, plane_mask, ret); > > drm_atomic_state_put(state); > > return ret; > > } > > @@ -2902,11 +2907,16 @@ int drm_atomic_helper_commit_duplicated_state(struct drm_atomic_state *state, > > struct drm_connector_state *new_conn_state; > > struct drm_crtc *crtc; > > struct drm_crtc_state *new_crtc_state; > > + unsigned plane_mask = 0; > > + struct drm_device *dev = state->dev; > > + int ret; > > > > state->acquire_ctx = ctx; > > > > - for_each_new_plane_in_state(state, plane, new_plane_state, i) > > + for_each_new_plane_in_state(state, plane, new_plane_state, i) { > > + plane_mask |= BIT(drm_plane_index(plane)); > We should really add a drm_plane_mask/drm_connector_mask at some point, to complement drm_crtc_mask. > > state->planes[i].old_state = plane->state; > > + } > > > > for_each_new_crtc_in_state(state, crtc, new_crtc_state, i) > > state->crtcs[i].old_state = crtc->state; > > @@ -2914,7 +2924,11 @@ int drm_atomic_helper_commit_duplicated_state(struct drm_atomic_state *state, > > for_each_new_connector_in_state(state, connector, new_conn_state, i) > > state->connectors[i].old_state = connector->state; > > > > - return drm_atomic_commit(state); > > + ret = drm_atomic_commit(state); > > + if (plane_mask) > > + drm_atomic_clean_old_fb(dev, plane_mask, ret); > Kill the if () part, and make it unconditional? On second thought, I should have done the same in drm_framebuffer.c I'd prefer to not bikeshed this stuff too much and just go with what we do everywhere else (i.e. rmfb code and atomic commit) for consistency. clean_old_fb is definitely a horrible function with misleading kerneldoc on top, and I think the best way to clean that up would be to: - Move the plane_mask computation in drm_atmic_commit and also put the clean_old_fb call in there. Maybe give it a more descriptive name like update_legacy_fb or whatever. Unexport them. - This would break the compat helpers, where drm_atomic_commit must not update the legacy refcounts, because for set_config, page_flip and the plane hooks the core does that already. Create a drm_atomic_commit_legacy for these. But since one of my patches caused an existing issue to pop up as a regression, and this series fixes it, I'd like to first get the minimal fix in through drm-intel. And then bikeshed it properly in drm-misc. Typing the patches is also a bit annoying right now since I have a few other atomic_commit patches in-flight ... Thanks, Daniel > > + return ret; > > } > > EXPORT_SYMBOL(drm_atomic_helper_commit_duplicated_state); > > > > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Intel-gfx] [PATCH] drm/atomic-helper: Fix leak in disable_all 2017-07-17 15:21 ` Daniel Vetter @ 2017-07-19 10:15 ` Maarten Lankhorst 0 siblings, 0 replies; 9+ messages in thread From: Maarten Lankhorst @ 2017-07-19 10:15 UTC (permalink / raw) To: Daniel Vetter Cc: Daniel Vetter, Intel Graphics Development, DRI Development, Daniel Vetter Op 17-07-17 om 17:21 schreef Daniel Vetter: > On Mon, Jul 17, 2017 at 11:39:56AM +0200, Maarten Lankhorst wrote: >> Op 15-07-17 om 11:31 schreef Daniel Vetter: >>> The legacy plane->fb pointer is refcounted by calling >>> drm_atomic_clean_old_fb(). >>> >>> In practice this isn't a real problem because: >>> - The caller in the i915 gpu reset code restores the original state >>> again, which means the plane->fb pointer won't change, hence can't >>> leak. >>> - Drivers using drm_atomic_helper_shutdown call the fbdev cleanup >>> first, and that usually cleans up the fb through >>> drm_remove_framebuffer, which does this correctly. >>> - Without fbdev the only framebuffers are from userspace, and those >>> get cleaned up (again using drm_remove_framebuffer) befor the driver >>> can even be unloaded. >>> >>> But in i915 I've switched the cleanup sequence around so that the >>> _shutdown() calls happens after the drm_remove_framebuffer(), which is >>> how I discovered this issue. >>> >>> v2: My analysis why the current code was ok for gpu reset and >>> suspend/resume was correct, but then I totally failed to realize that >>> we better keep this symmetric. Thanksfully CI noticed that for >>> balance, a refcounting bug must exist at 2 places if previously there >>> was no issue ... >>> >>> v3: Don't be lazy and compute the plane_mask in >>> commit_duplicated_state properly too, instead of just using ~0U. >>> >>> Cc: martin.peres@free.fr >>> Cc: chris@chris-wilson.co.uk >>> Acked-by: Dave Airlie <airlied@gmail.com> (v1) >>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> >>> --- >>> drivers/gpu/drm/drm_atomic_helper.c | 18 ++++++++++++++++-- >>> 1 file changed, 16 insertions(+), 2 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c >>> index b07fc30372d3..545328a9a769 100644 >>> --- a/drivers/gpu/drm/drm_atomic_helper.c >>> +++ b/drivers/gpu/drm/drm_atomic_helper.c >>> @@ -2726,6 +2726,7 @@ int drm_atomic_helper_disable_all(struct drm_device *dev, >>> struct drm_plane *plane; >>> struct drm_crtc_state *crtc_state; >>> struct drm_crtc *crtc; >>> + unsigned plane_mask = 0; >>> int ret, i; >>> >>> state = drm_atomic_state_alloc(dev); >>> @@ -2768,10 +2769,14 @@ int drm_atomic_helper_disable_all(struct drm_device *dev, >>> goto free; >>> >>> drm_atomic_set_fb_for_plane(plane_state, NULL); >>> + plane_mask |= BIT(drm_plane_index(plane)); >>> + plane->old_fb = plane->fb; >>> } >>> >>> ret = drm_atomic_commit(state); >>> free: >>> + if (plane_mask) >>> + drm_atomic_clean_old_fb(dev, plane_mask, ret); >>> drm_atomic_state_put(state); >>> return ret; >>> } >>> @@ -2902,11 +2907,16 @@ int drm_atomic_helper_commit_duplicated_state(struct drm_atomic_state *state, >>> struct drm_connector_state *new_conn_state; >>> struct drm_crtc *crtc; >>> struct drm_crtc_state *new_crtc_state; >>> + unsigned plane_mask = 0; >>> + struct drm_device *dev = state->dev; >>> + int ret; >>> >>> state->acquire_ctx = ctx; >>> >>> - for_each_new_plane_in_state(state, plane, new_plane_state, i) >>> + for_each_new_plane_in_state(state, plane, new_plane_state, i) { >>> + plane_mask |= BIT(drm_plane_index(plane)); >> We should really add a drm_plane_mask/drm_connector_mask at some point, to complement drm_crtc_mask. >>> state->planes[i].old_state = plane->state; >>> + } >>> >>> for_each_new_crtc_in_state(state, crtc, new_crtc_state, i) >>> state->crtcs[i].old_state = crtc->state; >>> @@ -2914,7 +2924,11 @@ int drm_atomic_helper_commit_duplicated_state(struct drm_atomic_state *state, >>> for_each_new_connector_in_state(state, connector, new_conn_state, i) >>> state->connectors[i].old_state = connector->state; >>> >>> - return drm_atomic_commit(state); >>> + ret = drm_atomic_commit(state); >>> + if (plane_mask) >>> + drm_atomic_clean_old_fb(dev, plane_mask, ret); >> Kill the if () part, and make it unconditional? On second thought, I should have done the same in drm_framebuffer.c > I'd prefer to not bikeshed this stuff too much and just go with what we do > everywhere else (i.e. rmfb code and atomic commit) for consistency. > clean_old_fb is definitely a horrible function with misleading kerneldoc > on top, and I think the best way to clean that up would be to: > > - Move the plane_mask computation in drm_atmic_commit and also put the > clean_old_fb call in there. Maybe give it a more descriptive name like > update_legacy_fb or whatever. Unexport them. > > - This would break the compat helpers, where drm_atomic_commit must not > update the legacy refcounts, because for set_config, page_flip and the > plane hooks the core does that already. Create a > drm_atomic_commit_legacy for these. > > But since one of my patches caused an existing issue to pop up as a > regression, and this series fixes it, I'd like to first get the minimal > fix in through drm-intel. And then bikeshed it properly in drm-misc. > > Typing the patches is also a bit annoying right now since I have a few > other atomic_commit patches in-flight ... I'd prefer a bright gray bikeshed, but hey.. Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/3] drm/atomic-helper: Fix leak in disable_all 2017-07-14 22:46 [PATCH 1/3] drm/atomic-helper: Fix leak in disable_all Daniel Vetter ` (2 preceding siblings ...) 2017-07-15 9:31 ` [PATCH] drm/atomic-helper: Fix leak in disable_all Daniel Vetter @ 2017-07-15 10:10 ` Chris Wilson 2017-07-15 11:03 ` Daniel Vetter 3 siblings, 1 reply; 9+ messages in thread From: Chris Wilson @ 2017-07-15 10:10 UTC (permalink / raw) To: DRI Development; +Cc: Daniel Vetter, Intel Graphics Development, Daniel Vetter Quoting Daniel Vetter (2017-07-14 23:46:54) > @@ -2902,6 +2907,8 @@ int drm_atomic_helper_commit_duplicated_state(struct drm_atomic_state *state, > struct drm_connector_state *new_conn_state; > struct drm_crtc *crtc; > struct drm_crtc_state *new_crtc_state; > + struct drm_device *dev = state->dev; > + int ret; > > state->acquire_ctx = ctx; > > @@ -2914,7 +2921,10 @@ int drm_atomic_helper_commit_duplicated_state(struct drm_atomic_state *state, > for_each_new_connector_in_state(state, connector, new_conn_state, i) > state->connectors[i].old_state = connector->state; > > - return drm_atomic_commit(state); > + ret = drm_atomic_commit(state); > + drm_atomic_clean_old_fb(dev, ~0U, ret); I have no idea what the rules should be here. Or how it should interact with error. Should we just try the "drm: Track framebuffer references at the point of assignment" approach to simplify the rules (at least from my perspective)? The problem with that patch is sorting out the state duplication done in a couple of drivers and figuring out if they are transferring ownership or not. -Chris _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/3] drm/atomic-helper: Fix leak in disable_all 2017-07-15 10:10 ` [PATCH 1/3] " Chris Wilson @ 2017-07-15 11:03 ` Daniel Vetter 0 siblings, 0 replies; 9+ messages in thread From: Daniel Vetter @ 2017-07-15 11:03 UTC (permalink / raw) To: Chris Wilson Cc: Daniel Vetter, Intel Graphics Development, DRI Development, Daniel Vetter On Sat, Jul 15, 2017 at 11:10:07AM +0100, Chris Wilson wrote: > Quoting Daniel Vetter (2017-07-14 23:46:54) > > @@ -2902,6 +2907,8 @@ int drm_atomic_helper_commit_duplicated_state(struct drm_atomic_state *state, > > struct drm_connector_state *new_conn_state; > > struct drm_crtc *crtc; > > struct drm_crtc_state *new_crtc_state; > > + struct drm_device *dev = state->dev; > > + int ret; > > > > state->acquire_ctx = ctx; > > > > @@ -2914,7 +2921,10 @@ int drm_atomic_helper_commit_duplicated_state(struct drm_atomic_state *state, > > for_each_new_connector_in_state(state, connector, new_conn_state, i) > > state->connectors[i].old_state = connector->state; > > > > - return drm_atomic_commit(state); > > + ret = drm_atomic_commit(state); > > + drm_atomic_clean_old_fb(dev, ~0U, ret); > > I have no idea what the rules should be here. Or how it should interact > with error. Should we just try the "drm: Track framebuffer references at > the point of assignment" approach to simplify the rules (at least from > my perspective)? The problem with that patch is sorting out the state > duplication done in a couple of drivers and figuring out if they are > transferring ownership or not. Yeah this is a decent mess. I think a first incremental step would be to move the ->old_fb and ->fb refcount updating into drm_atomic_commit - clean_old_fb is a superbly misnamed function, what it really does is update the legacy framebuffer pointer refcounts. The kernel-doc it has also just serves to increase the confusion :-/ The only problem is that for an drm_atomic_commit in one of the legacy callbacks we must _not_ do that, because the core already takes care fo that. But having a drm_atomic_commit_legacy for that would be a lot cleaner I think. Once we have that I guess we could try and figure out what to do with the refcounting of the legacy pointers at large. Meanwhile the rules are: - If you do an atomic commit, you must call clean_old_fb. Everywhere. We got away in a few cases because we've made nice symmetric commits (like suspend/resume) or commits where we know the fb doesn't get updated. - Exception: When called from ->plane_update/disable, ->set_config or ->page_flip, the core does it for you. It's a mess, but I'd like to decouple fixing that mess from fixing the unload bug we have. I'll try to type up the drm_atomic_commit/_legacy patch next week. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2017-07-19 10:15 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-07-14 22:46 [PATCH 1/3] drm/atomic-helper: Fix leak in disable_all Daniel Vetter 2017-07-14 22:46 ` [PATCH 2/3] drm/i915: Fix fbdev unload sequence Daniel Vetter 2017-07-14 22:46 ` [PATCH 3/3] drm/i915: unregister interfaces first in unload Daniel Vetter 2017-07-15 9:31 ` [PATCH] drm/atomic-helper: Fix leak in disable_all Daniel Vetter 2017-07-17 9:39 ` [Intel-gfx] " Maarten Lankhorst 2017-07-17 15:21 ` Daniel Vetter 2017-07-19 10:15 ` Maarten Lankhorst 2017-07-15 10:10 ` [PATCH 1/3] " Chris Wilson 2017-07-15 11:03 ` Daniel Vetter
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).