* [PATCH 0/2] drm: Fix atomic helper dirtyfb stalls
@ 2021-05-08 19:56 Rob Clark
  2021-05-08 19:56 ` [PATCH 1/2] drm: Fix dirtyfb stalls Rob Clark
  2021-05-08 19:56 ` [PATCH 2/2] drm/msm/dpu: Wire up needs_dirtyfb Rob Clark
  0 siblings, 2 replies; 23+ messages in thread
From: Rob Clark @ 2021-05-08 19:56 UTC (permalink / raw)
  To: dri-devel
  Cc: Rob Clark, Abhinav Kumar, Bernard, open list:DRM DRIVER FOR MSM ADRENO GPU,
      Hongbo Yao, Kalyan Thota, open list:DRM DRIVER FOR MSM ADRENO GPU,
      open list, Maxime Ripard, Qinglang Miao, Stephen Boyd

From: Rob Clark <robdclark@chromium.org>

Someone on IRC once asked an innocent enough sounding question: why is
es2gears limited to 120 fps with xf86-video-modesetting?  So I broke out
the perfetto tracing mesa MR and took a look.

It turns out the problem was drm_atomic_helper_dirtyfb(), which would
end up waiting for vblank.  es2gears would rapidly push two frames to
Xorg, which would blit them to the screen and, in an idle hook (I
assume), call the DIRTYFB ioctl.  That in turn would do an atomic update
to flush the dirty rects, which would stall until the next vblank.  And
then the whole process would repeat (two frames per vblank period, which
on a 60 Hz display is presumably where the 120 fps cap comes from).

But this is a bit silly: we only need dirtyfb for command-mode DSI
panels.  So let's just skip it otherwise.

Rob Clark (2):
  drm: Fix dirtyfb stalls
  drm/msm/dpu: Wire up needs_dirtyfb

 drivers/gpu/drm/drm_damage_helper.c      |  8 ++++++++
 drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 14 ++++++++++++++
 include/drm/drm_modeset_helper_vtables.h | 14 ++++++++++++++
 3 files changed, 36 insertions(+)

-- 
2.30.2

^ permalink raw reply	[flat|nested] 23+ messages in thread
* [PATCH 1/2] drm: Fix dirtyfb stalls
  2021-05-08 19:56 [PATCH 0/2] drm: Fix atomic helper dirtyfb stalls Rob Clark
@ 2021-05-08 19:56 ` Rob Clark
  2021-05-10 16:14   ` Daniel Vetter
  2021-05-08 19:56 ` [PATCH 2/2] drm/msm/dpu: Wire up needs_dirtyfb Rob Clark
  1 sibling, 1 reply; 23+ messages in thread
From: Rob Clark @ 2021-05-08 19:56 UTC (permalink / raw)
  To: dri-devel
  Cc: Rob Clark, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
      David Airlie, Daniel Vetter, open list

From: Rob Clark <robdclark@chromium.org>

drm_atomic_helper_dirtyfb() will end up stalling for vblank on "video
mode" type displays, which is pointless and unnecessary.  Add an
optional helper vfunc to determine if a plane is attached to a CRTC
that actually needs dirtyfb, and skip over them.

Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 drivers/gpu/drm/drm_damage_helper.c      |  8 ++++++++
 include/drm/drm_modeset_helper_vtables.h | 14 ++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/drm_damage_helper.c b/drivers/gpu/drm/drm_damage_helper.c
index 3a4126dc2520..a0bed1a2c2dc 100644
--- a/drivers/gpu/drm/drm_damage_helper.c
+++ b/drivers/gpu/drm/drm_damage_helper.c
@@ -211,6 +211,7 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer *fb,
 retry:
 	drm_for_each_plane(plane, fb->dev) {
 		struct drm_plane_state *plane_state;
+		struct drm_crtc *crtc;
 
 		ret = drm_modeset_lock(&plane->mutex, state->acquire_ctx);
 		if (ret)
@@ -221,6 +222,13 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer *fb,
 			continue;
 		}
 
+		crtc = plane->state->crtc;
+		if (crtc->helper_private->needs_dirtyfb &&
+		    !crtc->helper_private->needs_dirtyfb(crtc)) {
+			drm_modeset_unlock(&plane->mutex);
+			continue;
+		}
+
 		plane_state = drm_atomic_get_plane_state(state, plane);
 		if (IS_ERR(plane_state)) {
 			ret = PTR_ERR(plane_state);
diff --git a/include/drm/drm_modeset_helper_vtables.h b/include/drm/drm_modeset_helper_vtables.h
index eb706342861d..afa8ec5754e7 100644
--- a/include/drm/drm_modeset_helper_vtables.h
+++ b/include/drm/drm_modeset_helper_vtables.h
@@ -487,6 +487,20 @@ struct drm_crtc_helper_funcs {
 			      bool in_vblank_irq, int *vpos, int *hpos,
 			      ktime_t *stime, ktime_t *etime,
 			      const struct drm_display_mode *mode);
+
+	/**
+	 * @needs_dirtyfb
+	 *
+	 * Optional callback used by damage helpers to determine if
+	 * fb_damage_clips update is needed.
+	 *
+	 * Returns:
+	 *
+	 * True if fb_damage_clips update is needed to handle DIRTYFB, False
+	 * otherwise.  If this callback is not implemented, then True is
+	 * assumed.
+	 */
+	bool (*needs_dirtyfb)(struct drm_crtc *crtc);
 };
 
 /**
-- 
2.30.2
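For illustration, the semantics of the hook this patch adds can be exercised outside the kernel with stub types. The following is a hypothetical, self-contained sketch, not actual kernel code: the structs are minimal stand-ins for the real DRM types, and `example_crtc_needs_dirtyfb` plus its `cmd_mode` flag are invented for the example (the real wiring is what patch 2/2 does for dpu). Only a command-mode (push-style) CRTC reports that it needs DIRTYFB handling, and an absent callback defaults to true, matching the documented behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-ins for the real DRM types. */
struct drm_crtc;

struct drm_crtc_helper_funcs {
	/* Optional: report whether this CRTC needs fb_damage_clips updates. */
	bool (*needs_dirtyfb)(struct drm_crtc *crtc);
};

struct drm_crtc {
	const struct drm_crtc_helper_funcs *helper_private;
	bool cmd_mode;	/* invented flag: true for DSI command-mode panels */
};

/* Hypothetical driver implementation: only command-mode (push) panels
 * need the dirty rects flushed; video-mode panels scan out continuously. */
static bool example_crtc_needs_dirtyfb(struct drm_crtc *crtc)
{
	return crtc->cmd_mode;
}

/* Helper-side check, mirroring the logic the patch adds to
 * drm_atomic_helper_dirtyfb(): a missing callback means "assume true". */
static bool crtc_needs_dirtyfb(struct drm_crtc *crtc)
{
	const struct drm_crtc_helper_funcs *funcs = crtc->helper_private;

	if (funcs && funcs->needs_dirtyfb)
		return funcs->needs_dirtyfb(crtc);
	return true;
}
```

With this shape, a video-mode CRTC is skipped in the dirtyfb loop while drivers that never implement the hook keep today's behavior unchanged.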
* Re: [PATCH 1/2] drm: Fix dirtyfb stalls
  2021-05-08 19:56 ` [PATCH 1/2] drm: Fix dirtyfb stalls Rob Clark
@ 2021-05-10 16:14   ` Daniel Vetter
  2021-05-10 16:16     ` Daniel Vetter
  2021-05-10 16:55     ` Rob Clark
  0 siblings, 2 replies; 23+ messages in thread
From: Daniel Vetter @ 2021-05-10 16:14 UTC (permalink / raw)
  To: Rob Clark
  Cc: dri-devel, Rob Clark, Maarten Lankhorst, Maxime Ripard,
      Thomas Zimmermann, David Airlie, Daniel Vetter, open list

On Sat, May 08, 2021 at 12:56:38PM -0700, Rob Clark wrote:
> From: Rob Clark <robdclark@chromium.org>
>
> drm_atomic_helper_dirtyfb() will end up stalling for vblank on "video
> mode" type displays, which is pointless and unnecessary.  Add an
> optional helper vfunc to determine if a plane is attached to a CRTC
> that actually needs dirtyfb, and skip over them.
>
> Signed-off-by: Rob Clark <robdclark@chromium.org>

So this is a bit annoying because the idea of all these "remap legacy
uapi to atomic constructs" helpers is that they shouldn't need/use
anything beyond what userspace also has available.  So adding hacks for
them feels really bad.

Also I feel like it's not entirely the right thing to do here either.
We've had this problem already on the fbcon emulation side (which also
shouldn't be able to peek behind the atomic kms uapi curtain), and the
fix there was to have a worker which batches up all the updates and
avoids any stalls in bad places.

Since this is for frontbuffer rendering userspace only we can probably
get away with assuming there's only a single fb, so the implementation
becomes pretty simple:

- 1 worker, and we keep track of a single pending fb
- if there's already a dirty fb pending on a different fb, we stall for
  the worker to start processing that one already (i.e. the fb we track
  is reset to NULL)
- if it's pending on the same fb we just toss away all the updates and
  go with a full update, since merging the clip rects is too much work
  :-) I think there's helpers so you could be slightly more clever and
  just have an overall bounding box

Could probably steal most of the implementation.

This approach here feels a tad too much in the hacky area ...

Thoughts?
-Daniel

> ---
>  drivers/gpu/drm/drm_damage_helper.c      |  8 ++++++++
>  include/drm/drm_modeset_helper_vtables.h | 14 ++++++++++++++
>  2 files changed, 22 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_damage_helper.c b/drivers/gpu/drm/drm_damage_helper.c
> index 3a4126dc2520..a0bed1a2c2dc 100644
> --- a/drivers/gpu/drm/drm_damage_helper.c
> +++ b/drivers/gpu/drm/drm_damage_helper.c
> @@ -211,6 +211,7 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer *fb,
>  retry:
>  	drm_for_each_plane(plane, fb->dev) {
>  		struct drm_plane_state *plane_state;
> +		struct drm_crtc *crtc;
>
>  		ret = drm_modeset_lock(&plane->mutex, state->acquire_ctx);
>  		if (ret)
> @@ -221,6 +222,13 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer *fb,
>  			continue;
>  		}
>
> +		crtc = plane->state->crtc;
> +		if (crtc->helper_private->needs_dirtyfb &&
> +		    !crtc->helper_private->needs_dirtyfb(crtc)) {
> +			drm_modeset_unlock(&plane->mutex);
> +			continue;
> +		}
> +
>  		plane_state = drm_atomic_get_plane_state(state, plane);
>  		if (IS_ERR(plane_state)) {
>  			ret = PTR_ERR(plane_state);
> diff --git a/include/drm/drm_modeset_helper_vtables.h b/include/drm/drm_modeset_helper_vtables.h
> index eb706342861d..afa8ec5754e7 100644
> --- a/include/drm/drm_modeset_helper_vtables.h
> +++ b/include/drm/drm_modeset_helper_vtables.h
> @@ -487,6 +487,20 @@ struct drm_crtc_helper_funcs {
>  			      bool in_vblank_irq, int *vpos, int *hpos,
>  			      ktime_t *stime, ktime_t *etime,
>  			      const struct drm_display_mode *mode);
> +
> +	/**
> +	 * @needs_dirtyfb
> +	 *
> +	 * Optional callback used by damage helpers to determine if
> +	 * fb_damage_clips update is needed.
> +	 *
> +	 * Returns:
> +	 *
> +	 * True if fb_damage_clips update is needed to handle DIRTYFB, False
> +	 * otherwise.  If this callback is not implemented, then True is
> +	 * assumed.
> +	 */
> +	bool (*needs_dirtyfb)(struct drm_crtc *crtc);
> };
>
> /**
> --
> 2.30.2
>

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
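The batching scheme Daniel sketches above can be modeled in a few lines. The following is a simplified, self-contained sketch of just the bookkeeping, under stated assumptions: there is no real worker thread and no DRM types; `queue_dirtyfb()` stands in for the dirtyfb entry point and `flush_worker()` for the worker picking up the pending update, and all the names are invented for the example. It tracks a single pending fb, merges same-fb damage into one overall bounding box, and hands off a pending different fb before tracking a new one.

```c
#include <assert.h>
#include <stddef.h>

struct rect {
	int x1, y1, x2, y2;
};

/* A single pending update, as in the scheme described above: at most one
 * fb with one dirty bounding box is ever queued for the worker. */
static const void *pending_fb;	/* NULL: nothing queued */
static struct rect pending_box;
static int flush_count;		/* how many times the "worker" ran */

/* Stands in for the worker processing the queued update (this is where
 * the real code would do the atomic commit). */
static void flush_worker(void)
{
	if (!pending_fb)
		return;
	pending_fb = NULL;
	flush_count++;
}

static int min_i(int a, int b) { return a < b ? a : b; }
static int max_i(int a, int b) { return a > b ? a : b; }

/* Stands in for the dirtyfb entry point.  It never waits for vblank:
 * it either merges into the pending bounding box or hands the pending
 * update to the worker first. */
static void queue_dirtyfb(const void *fb, struct rect clip)
{
	if (pending_fb && pending_fb != fb) {
		/* Different fb already queued: let the worker take it
		 * before tracking the new one (the "stall" case). */
		flush_worker();
	}
	if (pending_fb == fb) {
		/* Same fb: grow one bounding box instead of keeping a
		 * list of clip rects. */
		pending_box.x1 = min_i(pending_box.x1, clip.x1);
		pending_box.y1 = min_i(pending_box.y1, clip.y1);
		pending_box.x2 = max_i(pending_box.x2, clip.x2);
		pending_box.y2 = max_i(pending_box.y2, clip.y2);
	} else {
		pending_fb = fb;
		pending_box = clip;
	}
}
```

The point of the shape is that the ioctl path itself never blocks on vblank; only the worker ever touches display timing, which is the property the fbcon fix relied on.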
* Re: [PATCH 1/2] drm: Fix dirtyfb stalls 2021-05-10 16:14 ` Daniel Vetter @ 2021-05-10 16:16 ` Daniel Vetter 2021-05-10 16:55 ` Rob Clark 1 sibling, 0 replies; 23+ messages in thread From: Daniel Vetter @ 2021-05-10 16:16 UTC (permalink / raw) To: Rob Clark, dri-devel, Rob Clark, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie, open list On Mon, May 10, 2021 at 06:14:20PM +0200, Daniel Vetter wrote: > On Sat, May 08, 2021 at 12:56:38PM -0700, Rob Clark wrote: > > From: Rob Clark <robdclark@chromium.org> > > > > drm_atomic_helper_dirtyfb() will end up stalling for vblank on "video > > mode" type displays, which is pointless and unnecessary. Add an > > optional helper vfunc to determine if a plane is attached to a CRTC > > that actually needs dirtyfb, and skip over them. > > > > Signed-off-by: Rob Clark <robdclark@chromium.org> > > So this is a bit annoying because the idea of all these "remap legacy uapi > to atomic constructs" helpers is that they shouldn't need/use anything > beyond what userspace also has available. So adding hacks for them feels > really bad. > > Also I feel like it's not entirely the right thing to do here either. > We've had this problem already on the fbcon emulation side (which also > shouldn't be able to peek behind the atomic kms uapi curtain), and the fix > there was to have a worker which batches up all the updates and avoids any > stalls in bad places. > > Since this is for frontbuffer rendering userspace only we can probably get > away with assuming there's only a single fb, so the implementation becomes > pretty simple: > > - 1 worker, and we keep track of a single pending fb > - if there's already a dirty fb pending on a different fb, we stall for > the worker to start processing that one already (i.e. 
the fb we track is > reset to NULL) > - if it's pending on the same fb we just toss away all the updates and go > with a full update, since merging the clip rects is too much work :-) I > think there's helpers so you could be slightly more clever and just have > an overall bounding box > > Could probably steal most of the implementation. Maybe we should even do all this in the common dirtyfb code, before we call into the driver hook. Gives more symmetry in how it works between fbcon and direct rendering userspace. -Daniel > > This approach here feels a tad too much in the hacky area ... > > Thoughts? > -Daniel > > > --- > > drivers/gpu/drm/drm_damage_helper.c | 8 ++++++++ > > include/drm/drm_modeset_helper_vtables.h | 14 ++++++++++++++ > > 2 files changed, 22 insertions(+) > > > > diff --git a/drivers/gpu/drm/drm_damage_helper.c b/drivers/gpu/drm/drm_damage_helper.c > > index 3a4126dc2520..a0bed1a2c2dc 100644 > > --- a/drivers/gpu/drm/drm_damage_helper.c > > +++ b/drivers/gpu/drm/drm_damage_helper.c > > @@ -211,6 +211,7 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer *fb, > > retry: > > drm_for_each_plane(plane, fb->dev) { > > struct drm_plane_state *plane_state; > > + struct drm_crtc *crtc; > > > > ret = drm_modeset_lock(&plane->mutex, state->acquire_ctx); > > if (ret) > > @@ -221,6 +222,13 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer *fb, > > continue; > > } > > > > + crtc = plane->state->crtc; > > + if (crtc->helper_private->needs_dirtyfb && > > + !crtc->helper_private->needs_dirtyfb(crtc)) { > > + drm_modeset_unlock(&plane->mutex); > > + continue; > > + } > > + > > plane_state = drm_atomic_get_plane_state(state, plane); > > if (IS_ERR(plane_state)) { > > ret = PTR_ERR(plane_state); > > diff --git a/include/drm/drm_modeset_helper_vtables.h b/include/drm/drm_modeset_helper_vtables.h > > index eb706342861d..afa8ec5754e7 100644 > > --- a/include/drm/drm_modeset_helper_vtables.h > > +++ b/include/drm/drm_modeset_helper_vtables.h > > @@ -487,6 
+487,20 @@ struct drm_crtc_helper_funcs { > > bool in_vblank_irq, int *vpos, int *hpos, > > ktime_t *stime, ktime_t *etime, > > const struct drm_display_mode *mode); > > + > > + /** > > + * @needs_dirtyfb > > + * > > + * Optional callback used by damage helpers to determine if fb_damage_clips > > + * update is needed. > > + * > > + * Returns: > > + * > > + * True if fb_damage_clips update is needed to handle DIRTYFB, False > > + * otherwise. If this callback is not implemented, then True is > > + * assumed. > > + */ > > + bool (*needs_dirtyfb)(struct drm_crtc *crtc); > > }; > > > > /** > > -- > > 2.30.2 > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/2] drm: Fix dirtyfb stalls 2021-05-10 16:14 ` Daniel Vetter 2021-05-10 16:16 ` Daniel Vetter @ 2021-05-10 16:55 ` Rob Clark 2021-05-10 17:43 ` Daniel Vetter 1 sibling, 1 reply; 23+ messages in thread From: Rob Clark @ 2021-05-10 16:55 UTC (permalink / raw) To: Rob Clark, dri-devel, Rob Clark, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie, open list Cc: Daniel Vetter On Mon, May 10, 2021 at 9:14 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > On Sat, May 08, 2021 at 12:56:38PM -0700, Rob Clark wrote: > > From: Rob Clark <robdclark@chromium.org> > > > > drm_atomic_helper_dirtyfb() will end up stalling for vblank on "video > > mode" type displays, which is pointless and unnecessary. Add an > > optional helper vfunc to determine if a plane is attached to a CRTC > > that actually needs dirtyfb, and skip over them. > > > > Signed-off-by: Rob Clark <robdclark@chromium.org> > > So this is a bit annoying because the idea of all these "remap legacy uapi > to atomic constructs" helpers is that they shouldn't need/use anything > beyond what userspace also has available. So adding hacks for them feels > really bad. I suppose the root problem is that userspace doesn't know if dirtyfb (or similar) is actually required or is a no-op. But it is perhaps less of a problem because this essentially boils down to "x11 vs wayland", and it seems like wayland compositors for non-vsync'd rendering just pageflips and throws away extra frames from the app? > Also I feel like it's not entirely the right thing to do here either. > We've had this problem already on the fbcon emulation side (which also > shouldn't be able to peek behind the atomic kms uapi curtain), and the fix > there was to have a worker which batches up all the updates and avoids any > stalls in bad places. I'm not too worried about fbcon not being able to render faster than vblank. 
OTOH it is a pretty big problem for x11 > Since this is for frontbuffer rendering userspace only we can probably get > away with assuming there's only a single fb, so the implementation becomes > pretty simple: > > - 1 worker, and we keep track of a single pending fb > - if there's already a dirty fb pending on a different fb, we stall for > the worker to start processing that one already (i.e. the fb we track is > reset to NULL) > - if it's pending on the same fb we just toss away all the updates and go > with a full update, since merging the clip rects is too much work :-) I > think there's helpers so you could be slightly more clever and just have > an overall bounding box This doesn't really fix the problem, you still end up delaying sending the next back-buffer to mesa But we could re-work drm_framebuffer_funcs::dirty to operate on a per-crtc basis and hoist the loop and check if dirtyfb is needed out of drm_atomic_helper_dirtyfb() BR, -R > > Could probably steal most of the implementation. > > This approach here feels a tad too much in the hacky area ... > > Thoughts? 
> -Daniel > > > --- > > drivers/gpu/drm/drm_damage_helper.c | 8 ++++++++ > > include/drm/drm_modeset_helper_vtables.h | 14 ++++++++++++++ > > 2 files changed, 22 insertions(+) > > > > diff --git a/drivers/gpu/drm/drm_damage_helper.c b/drivers/gpu/drm/drm_damage_helper.c > > index 3a4126dc2520..a0bed1a2c2dc 100644 > > --- a/drivers/gpu/drm/drm_damage_helper.c > > +++ b/drivers/gpu/drm/drm_damage_helper.c > > @@ -211,6 +211,7 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer *fb, > > retry: > > drm_for_each_plane(plane, fb->dev) { > > struct drm_plane_state *plane_state; > > + struct drm_crtc *crtc; > > > > ret = drm_modeset_lock(&plane->mutex, state->acquire_ctx); > > if (ret) > > @@ -221,6 +222,13 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer *fb, > > continue; > > } > > > > + crtc = plane->state->crtc; > > + if (crtc->helper_private->needs_dirtyfb && > > + !crtc->helper_private->needs_dirtyfb(crtc)) { > > + drm_modeset_unlock(&plane->mutex); > > + continue; > > + } > > + > > plane_state = drm_atomic_get_plane_state(state, plane); > > if (IS_ERR(plane_state)) { > > ret = PTR_ERR(plane_state); > > diff --git a/include/drm/drm_modeset_helper_vtables.h b/include/drm/drm_modeset_helper_vtables.h > > index eb706342861d..afa8ec5754e7 100644 > > --- a/include/drm/drm_modeset_helper_vtables.h > > +++ b/include/drm/drm_modeset_helper_vtables.h > > @@ -487,6 +487,20 @@ struct drm_crtc_helper_funcs { > > bool in_vblank_irq, int *vpos, int *hpos, > > ktime_t *stime, ktime_t *etime, > > const struct drm_display_mode *mode); > > + > > + /** > > + * @needs_dirtyfb > > + * > > + * Optional callback used by damage helpers to determine if fb_damage_clips > > + * update is needed. > > + * > > + * Returns: > > + * > > + * True if fb_damage_clips update is needed to handle DIRTYFB, False > > + * otherwise. If this callback is not implemented, then True is > > + * assumed. 
> > + */ > > + bool (*needs_dirtyfb)(struct drm_crtc *crtc); > > }; > > > > /** > > -- > > 2.30.2 > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch ^ permalink raw reply [flat|nested] 23+ messages in thread
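Rob's suggestion above, hoisting the per-plane loop and the needs-dirtyfb check out of drm_atomic_helper_dirtyfb(), could be shaped roughly as follows. This is a purely hypothetical restructuring for illustration (none of these names exist in the kernel, and the types are stubs): walk the planes once, keep only the CRTCs that scan out the dirtied fb and actually need damage handling, and only then build the per-CRTC update. A real version would also de-duplicate CRTCs shared by several planes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PLANES 8

struct crtc {
	bool needs_dirtyfb;	/* result of the driver hook; true if no hook */
};

struct plane {
	const void *fb;		/* framebuffer currently scanned out, or NULL */
	struct crtc *crtc;	/* NULL when the plane is disabled */
};

/* Hoisted filter: collect the CRTCs that actually need a damage update
 * for this fb.  Returns how many were collected into out[]. */
static int collect_dirty_crtcs(const struct plane *planes, int n,
			       const void *fb, struct crtc **out)
{
	int count = 0;

	for (int i = 0; i < n; i++) {
		if (planes[i].fb != fb || !planes[i].crtc)
			continue;
		if (!planes[i].crtc->needs_dirtyfb)
			continue;
		out[count++] = planes[i].crtc;
	}
	return count;
}
```

If the filter returns zero CRTCs, the helper could return before taking any modeset locks or building an atomic state at all, which is the case the x11 frontbuffer path cares about.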
* Re: [PATCH 1/2] drm: Fix dirtyfb stalls 2021-05-10 16:55 ` Rob Clark @ 2021-05-10 17:43 ` Daniel Vetter 2021-05-10 19:06 ` Rob Clark 0 siblings, 1 reply; 23+ messages in thread From: Daniel Vetter @ 2021-05-10 17:43 UTC (permalink / raw) To: Rob Clark Cc: dri-devel, Rob Clark, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie, open list On Mon, May 10, 2021 at 6:51 PM Rob Clark <robdclark@gmail.com> wrote: > > On Mon, May 10, 2021 at 9:14 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > On Sat, May 08, 2021 at 12:56:38PM -0700, Rob Clark wrote: > > > From: Rob Clark <robdclark@chromium.org> > > > > > > drm_atomic_helper_dirtyfb() will end up stalling for vblank on "video > > > mode" type displays, which is pointless and unnecessary. Add an > > > optional helper vfunc to determine if a plane is attached to a CRTC > > > that actually needs dirtyfb, and skip over them. > > > > > > Signed-off-by: Rob Clark <robdclark@chromium.org> > > > > So this is a bit annoying because the idea of all these "remap legacy uapi > > to atomic constructs" helpers is that they shouldn't need/use anything > > beyond what userspace also has available. So adding hacks for them feels > > really bad. > > I suppose the root problem is that userspace doesn't know if dirtyfb > (or similar) is actually required or is a no-op. > > But it is perhaps less of a problem because this essentially boils > down to "x11 vs wayland", and it seems like wayland compositors for > non-vsync'd rendering just pageflips and throws away extra frames from > the app? Yeah it's about not adequately batching up rendering and syncing with hw. bare metal x11 is just especially stupid about it :-) > > Also I feel like it's not entirely the right thing to do here either. 
> > We've had this problem already on the fbcon emulation side (which also > > shouldn't be able to peek behind the atomic kms uapi curtain), and the fix > > there was to have a worker which batches up all the updates and avoids any > > stalls in bad places. > > I'm not too worried about fbcon not being able to render faster than > vblank. OTOH it is a pretty big problem for x11 That's why we'd let the worker get ahead at most one dirtyfb. We do the same with fbcon, which trivially can get ahead of vblank otherwise (if sometimes flushes each character, so you have to pile them up into a single update if that's still pending). > > Since this is for frontbuffer rendering userspace only we can probably get > > away with assuming there's only a single fb, so the implementation becomes > > pretty simple: > > > > - 1 worker, and we keep track of a single pending fb > > - if there's already a dirty fb pending on a different fb, we stall for > > the worker to start processing that one already (i.e. the fb we track is > > reset to NULL) > > - if it's pending on the same fb we just toss away all the updates and go > > with a full update, since merging the clip rects is too much work :-) I > > think there's helpers so you could be slightly more clever and just have > > an overall bounding box > > This doesn't really fix the problem, you still end up delaying sending > the next back-buffer to mesa With this the dirtyfb would never block. Also glorious frontbuffer tracking corruption is possible, but that's not the kernel's problem. So how would anything get held up in userspace. > But we could re-work drm_framebuffer_funcs::dirty to operate on a > per-crtc basis and hoist the loop and check if dirtyfb is needed out > of drm_atomic_helper_dirtyfb() That's still using information that userspace doesn't have, which is a bit irky. We might as well go with your thing here then. -Daniel > BR, > -R > > > > > Could probably steal most of the implementation. 
> > > > This approach here feels a tad too much in the hacky area ... > > > > Thoughts? > > -Daniel > > > > > --- > > > drivers/gpu/drm/drm_damage_helper.c | 8 ++++++++ > > > include/drm/drm_modeset_helper_vtables.h | 14 ++++++++++++++ > > > 2 files changed, 22 insertions(+) > > > > > > diff --git a/drivers/gpu/drm/drm_damage_helper.c b/drivers/gpu/drm/drm_damage_helper.c > > > index 3a4126dc2520..a0bed1a2c2dc 100644 > > > --- a/drivers/gpu/drm/drm_damage_helper.c > > > +++ b/drivers/gpu/drm/drm_damage_helper.c > > > @@ -211,6 +211,7 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer *fb, > > > retry: > > > drm_for_each_plane(plane, fb->dev) { > > > struct drm_plane_state *plane_state; > > > + struct drm_crtc *crtc; > > > > > > ret = drm_modeset_lock(&plane->mutex, state->acquire_ctx); > > > if (ret) > > > @@ -221,6 +222,13 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer *fb, > > > continue; > > > } > > > > > > + crtc = plane->state->crtc; > > > + if (crtc->helper_private->needs_dirtyfb && > > > + !crtc->helper_private->needs_dirtyfb(crtc)) { > > > + drm_modeset_unlock(&plane->mutex); > > > + continue; > > > + } > > > + > > > plane_state = drm_atomic_get_plane_state(state, plane); > > > if (IS_ERR(plane_state)) { > > > ret = PTR_ERR(plane_state); > > > diff --git a/include/drm/drm_modeset_helper_vtables.h b/include/drm/drm_modeset_helper_vtables.h > > > index eb706342861d..afa8ec5754e7 100644 > > > --- a/include/drm/drm_modeset_helper_vtables.h > > > +++ b/include/drm/drm_modeset_helper_vtables.h > > > @@ -487,6 +487,20 @@ struct drm_crtc_helper_funcs { > > > bool in_vblank_irq, int *vpos, int *hpos, > > > ktime_t *stime, ktime_t *etime, > > > const struct drm_display_mode *mode); > > > + > > > + /** > > > + * @needs_dirtyfb > > > + * > > > + * Optional callback used by damage helpers to determine if fb_damage_clips > > > + * update is needed. 
> > > + * > > > + * Returns: > > > + * > > > + * True if fb_damage_clips update is needed to handle DIRTYFB, False > > > + * otherwise. If this callback is not implemented, then True is > > > + * assumed. > > > + */ > > > + bool (*needs_dirtyfb)(struct drm_crtc *crtc); > > > }; > > > > > > /** > > > -- > > > 2.30.2 > > > > > > > -- > > Daniel Vetter > > Software Engineer, Intel Corporation > > http://blog.ffwll.ch -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/2] drm: Fix dirtyfb stalls 2021-05-10 17:43 ` Daniel Vetter @ 2021-05-10 19:06 ` Rob Clark 2021-05-11 16:44 ` Daniel Vetter 0 siblings, 1 reply; 23+ messages in thread From: Rob Clark @ 2021-05-10 19:06 UTC (permalink / raw) To: Daniel Vetter Cc: dri-devel, Rob Clark, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie, open list On Mon, May 10, 2021 at 10:44 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > On Mon, May 10, 2021 at 6:51 PM Rob Clark <robdclark@gmail.com> wrote: > > > > On Mon, May 10, 2021 at 9:14 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > On Sat, May 08, 2021 at 12:56:38PM -0700, Rob Clark wrote: > > > > From: Rob Clark <robdclark@chromium.org> > > > > > > > > drm_atomic_helper_dirtyfb() will end up stalling for vblank on "video > > > > mode" type displays, which is pointless and unnecessary. Add an > > > > optional helper vfunc to determine if a plane is attached to a CRTC > > > > that actually needs dirtyfb, and skip over them. > > > > > > > > Signed-off-by: Rob Clark <robdclark@chromium.org> > > > > > > So this is a bit annoying because the idea of all these "remap legacy uapi > > > to atomic constructs" helpers is that they shouldn't need/use anything > > > beyond what userspace also has available. So adding hacks for them feels > > > really bad. > > > > I suppose the root problem is that userspace doesn't know if dirtyfb > > (or similar) is actually required or is a no-op. > > > > But it is perhaps less of a problem because this essentially boils > > down to "x11 vs wayland", and it seems like wayland compositors for > > non-vsync'd rendering just pageflips and throws away extra frames from > > the app? > > Yeah it's about not adequately batching up rendering and syncing with > hw. bare metal x11 is just especially stupid about it :-) > > > > Also I feel like it's not entirely the right thing to do here either. 
> > > We've had this problem already on the fbcon emulation side (which also > > > shouldn't be able to peek behind the atomic kms uapi curtain), and the fix > > > there was to have a worker which batches up all the updates and avoids any > > > stalls in bad places. > > > > I'm not too worried about fbcon not being able to render faster than > > vblank. OTOH it is a pretty big problem for x11 > > That's why we'd let the worker get ahead at most one dirtyfb. We do > the same with fbcon, which trivially can get ahead of vblank otherwise > (if sometimes flushes each character, so you have to pile them up into > a single update if that's still pending). > > > > Since this is for frontbuffer rendering userspace only we can probably get > > > away with assuming there's only a single fb, so the implementation becomes > > > pretty simple: > > > > > > - 1 worker, and we keep track of a single pending fb > > > - if there's already a dirty fb pending on a different fb, we stall for > > > the worker to start processing that one already (i.e. the fb we track is > > > reset to NULL) > > > - if it's pending on the same fb we just toss away all the updates and go > > > with a full update, since merging the clip rects is too much work :-) I > > > think there's helpers so you could be slightly more clever and just have > > > an overall bounding box > > > > This doesn't really fix the problem, you still end up delaying sending > > the next back-buffer to mesa > > With this the dirtyfb would never block. Also glorious frontbuffer > tracking corruption is possible, but that's not the kernel's problem. > So how would anything get held up in userspace. the part about stalling if a dirtyfb is pending was what I was worried about.. but I suppose you meant the worker stalling, rather than userspace stalling (where I had interpreted it the other way around). As soon as userspace needs to stall, you're losing again. 
> > But we could re-work drm_framebuffer_funcs::dirty to operate on a > > per-crtc basis and hoist the loop and check if dirtyfb is needed out > > of drm_atomic_helper_dirtyfb() > > That's still using information that userspace doesn't have, which is a > bit irky. We might as well go with your thing here then. arguably, this is something we should expose to userspace.. for DSI command-mode panels, you probably want to make a different decision with regard to how many buffers in your flip-chain.. Possibly we should add/remove the fb_damage_clips property depending on the display type (ie. video/pull vs cmd/push mode)? BR, -R > -Daniel > > > BR, > > -R > > > > > > > > Could probably steal most of the implementation. > > > > > > This approach here feels a tad too much in the hacky area ... > > > > > > Thoughts? > > > -Daniel > > > > > > > --- > > > > drivers/gpu/drm/drm_damage_helper.c | 8 ++++++++ > > > > include/drm/drm_modeset_helper_vtables.h | 14 ++++++++++++++ > > > > 2 files changed, 22 insertions(+) > > > > > > > > diff --git a/drivers/gpu/drm/drm_damage_helper.c b/drivers/gpu/drm/drm_damage_helper.c > > > > index 3a4126dc2520..a0bed1a2c2dc 100644 > > > > --- a/drivers/gpu/drm/drm_damage_helper.c > > > > +++ b/drivers/gpu/drm/drm_damage_helper.c > > > > @@ -211,6 +211,7 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer *fb, > > > > retry: > > > > drm_for_each_plane(plane, fb->dev) { > > > > struct drm_plane_state *plane_state; > > > > + struct drm_crtc *crtc; > > > > > > > > ret = drm_modeset_lock(&plane->mutex, state->acquire_ctx); > > > > if (ret) > > > > @@ -221,6 +222,13 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer *fb, > > > > continue; > > > > } > > > > > > > > + crtc = plane->state->crtc; > > > > + if (crtc->helper_private->needs_dirtyfb && > > > > + !crtc->helper_private->needs_dirtyfb(crtc)) { > > > > + drm_modeset_unlock(&plane->mutex); > > > > + continue; > > > > + } > > > > + > > > > plane_state = drm_atomic_get_plane_state(state, 
plane); > > > > if (IS_ERR(plane_state)) { > > > > ret = PTR_ERR(plane_state); > > > > diff --git a/include/drm/drm_modeset_helper_vtables.h b/include/drm/drm_modeset_helper_vtables.h > > > > index eb706342861d..afa8ec5754e7 100644 > > > > --- a/include/drm/drm_modeset_helper_vtables.h > > > > +++ b/include/drm/drm_modeset_helper_vtables.h > > > > @@ -487,6 +487,20 @@ struct drm_crtc_helper_funcs { > > > > bool in_vblank_irq, int *vpos, int *hpos, > > > > ktime_t *stime, ktime_t *etime, > > > > const struct drm_display_mode *mode); > > > > + > > > > + /** > > > > + * @needs_dirtyfb > > > > + * > > > > + * Optional callback used by damage helpers to determine if fb_damage_clips > > > > + * update is needed. > > > > + * > > > > + * Returns: > > > > + * > > > > + * True if fb_damage_clips update is needed to handle DIRTYFB, False > > > > + * otherwise. If this callback is not implemented, then True is > > > > + * assumed. > > > > + */ > > > > + bool (*needs_dirtyfb)(struct drm_crtc *crtc); > > > > }; > > > > > > > > /** > > > > -- > > > > 2.30.2 > > > > > > > > > > -- > > > Daniel Vetter > > > Software Engineer, Intel Corporation > > > http://blog.ffwll.ch > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/2] drm: Fix dirtyfb stalls 2021-05-10 19:06 ` Rob Clark @ 2021-05-11 16:44 ` Daniel Vetter 2021-05-11 17:19 ` Rob Clark 2021-05-12 8:23 ` Pekka Paalanen 0 siblings, 2 replies; 23+ messages in thread From: Daniel Vetter @ 2021-05-11 16:44 UTC (permalink / raw) To: Rob Clark Cc: Daniel Vetter, dri-devel, Rob Clark, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie, open list On Mon, May 10, 2021 at 12:06:05PM -0700, Rob Clark wrote: > On Mon, May 10, 2021 at 10:44 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > On Mon, May 10, 2021 at 6:51 PM Rob Clark <robdclark@gmail.com> wrote: > > > > > > On Mon, May 10, 2021 at 9:14 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > On Sat, May 08, 2021 at 12:56:38PM -0700, Rob Clark wrote: > > > > > From: Rob Clark <robdclark@chromium.org> > > > > > > > > > > drm_atomic_helper_dirtyfb() will end up stalling for vblank on "video > > > > > mode" type displays, which is pointless and unnecessary. Add an > > > > > optional helper vfunc to determine if a plane is attached to a CRTC > > > > > that actually needs dirtyfb, and skip over them. > > > > > > > > > > Signed-off-by: Rob Clark <robdclark@chromium.org> > > > > > > > > So this is a bit annoying because the idea of all these "remap legacy uapi > > > > to atomic constructs" helpers is that they shouldn't need/use anything > > > > beyond what userspace also has available. So adding hacks for them feels > > > > really bad. > > > > > > I suppose the root problem is that userspace doesn't know if dirtyfb > > > (or similar) is actually required or is a no-op. > > > > > > But it is perhaps less of a problem because this essentially boils > > > down to "x11 vs wayland", and it seems like wayland compositors for > > > non-vsync'd rendering just pageflips and throws away extra frames from > > > the app? > > > > Yeah it's about not adequately batching up rendering and syncing with > > hw. 
bare metal x11 is just especially stupid about it :-) > > > > > > Also I feel like it's not entirely the right thing to do here either. > > > > We've had this problem already on the fbcon emulation side (which also > > > > shouldn't be able to peek behind the atomic kms uapi curtain), and the fix > > > > there was to have a worker which batches up all the updates and avoids any > > > > stalls in bad places. > > > > > > I'm not too worried about fbcon not being able to render faster than > > > vblank. OTOH it is a pretty big problem for x11 > > > > That's why we'd let the worker get ahead at most one dirtyfb. We do > > the same with fbcon, which trivially can get ahead of vblank otherwise > > (if sometimes flushes each character, so you have to pile them up into > > a single update if that's still pending). > > > > > > Since this is for frontbuffer rendering userspace only we can probably get > > > > away with assuming there's only a single fb, so the implementation becomes > > > > pretty simple: > > > > > > > > - 1 worker, and we keep track of a single pending fb > > > > - if there's already a dirty fb pending on a different fb, we stall for > > > > the worker to start processing that one already (i.e. the fb we track is > > > > reset to NULL) > > > > - if it's pending on the same fb we just toss away all the updates and go > > > > with a full update, since merging the clip rects is too much work :-) I > > > > think there's helpers so you could be slightly more clever and just have > > > > an overall bounding box > > > > > > This doesn't really fix the problem, you still end up delaying sending > > > the next back-buffer to mesa > > > > With this the dirtyfb would never block. Also glorious frontbuffer > > tracking corruption is possible, but that's not the kernel's problem. > > So how would anything get held up in userspace. > > the part about stalling if a dirtyfb is pending was what I was worried > about.. 
but I suppose you meant the worker stalling, rather than > userspace stalling (where I had interpreted it the other way around). > As soon as userspace needs to stall, you're losing again. Nah, I did mean userspace stalling, so we can't pile up unlimited amounts of dirtyfb requests in the kernel. But also I never expect userspace that uses dirtyfb to actually hit this stall point (otherwise we'd need to look at this again). It would really be only there as defense against abuse. > > > But we could re-work drm_framebuffer_funcs::dirty to operate on a > > > per-crtc basis and hoist the loop and check if dirtyfb is needed out > > > of drm_atomic_helper_dirtyfb() > > > > That's still using information that userspace doesn't have, which is a > > bit irky. We might as well go with your thing here then. > > arguably, this is something we should expose to userspace.. for DSI > command-mode panels, you probably want to make a different decision > with regard to how many buffers in your flip-chain.. > > Possibly we should add/remove the fb_damage_clips property depending > on the display type (ie. video/pull vs cmd/push mode)? I'm not sure whether atomic actually needs this exposed: - clients will do full flips for every frame anyway, I've not heard of anyone seriously doing frontbuffer rendering. - transporting the cliprects around and then tossing them if the driver doesn't need them in their flip is probably not a measurable win But yeah if I'm wrong and we have a need here and it's useful, then exposing this to userspace should be done. Meanwhile I think an "offload to worker like fbcon" trick for this legacy interface is probably the best option. Plus it will fix things not just for the case where you don't need dirty uploading, it will also fix things for the case where you _do_ need dirty uploading (since right now we stall in a few bad places for that I think). 
-Daniel > > BR, > -R > > > -Daniel > > > > > BR, > > > -R > > > > > > > > > > > Could probably steal most of the implementation. > > > > > > > > This approach here feels a tad too much in the hacky area ... > > > > > > > > Thoughts? > > > > -Daniel > > > > > > > > > --- > > > > > drivers/gpu/drm/drm_damage_helper.c | 8 ++++++++ > > > > > include/drm/drm_modeset_helper_vtables.h | 14 ++++++++++++++ > > > > > 2 files changed, 22 insertions(+) > > > > > > > > > > diff --git a/drivers/gpu/drm/drm_damage_helper.c b/drivers/gpu/drm/drm_damage_helper.c > > > > > index 3a4126dc2520..a0bed1a2c2dc 100644 > > > > > --- a/drivers/gpu/drm/drm_damage_helper.c > > > > > +++ b/drivers/gpu/drm/drm_damage_helper.c > > > > > @@ -211,6 +211,7 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer *fb, > > > > > retry: > > > > > drm_for_each_plane(plane, fb->dev) { > > > > > struct drm_plane_state *plane_state; > > > > > + struct drm_crtc *crtc; > > > > > > > > > > ret = drm_modeset_lock(&plane->mutex, state->acquire_ctx); > > > > > if (ret) > > > > > @@ -221,6 +222,13 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer *fb, > > > > > continue; > > > > > } > > > > > > > > > > + crtc = plane->state->crtc; > > > > > + if (crtc->helper_private->needs_dirtyfb && > > > > > + !crtc->helper_private->needs_dirtyfb(crtc)) { > > > > > + drm_modeset_unlock(&plane->mutex); > > > > > + continue; > > > > > + } > > > > > + > > > > > plane_state = drm_atomic_get_plane_state(state, plane); > > > > > if (IS_ERR(plane_state)) { > > > > > ret = PTR_ERR(plane_state); > > > > > diff --git a/include/drm/drm_modeset_helper_vtables.h b/include/drm/drm_modeset_helper_vtables.h > > > > > index eb706342861d..afa8ec5754e7 100644 > > > > > --- a/include/drm/drm_modeset_helper_vtables.h > > > > > +++ b/include/drm/drm_modeset_helper_vtables.h > > > > > @@ -487,6 +487,20 @@ struct drm_crtc_helper_funcs { > > > > > bool in_vblank_irq, int *vpos, int *hpos, > > > > > ktime_t *stime, ktime_t *etime, > > > > > const 
struct drm_display_mode *mode); > > > > > + > > > > > + /** > > > > > + * @needs_dirtyfb > > > > > + * > > > > > + * Optional callback used by damage helpers to determine if fb_damage_clips > > > > > + * update is needed. > > > > > + * > > > > > + * Returns: > > > > > + * > > > > > + * True if fb_damage_clips update is needed to handle DIRTYFB, False > > > > > + * otherwise. If this callback is not implemented, then True is > > > > > + * assumed. > > > > > + */ > > > > > + bool (*needs_dirtyfb)(struct drm_crtc *crtc); > > > > > }; > > > > > > > > > > /** > > > > > -- > > > > > 2.30.2 > > > > > > > > > > > > > -- > > > > Daniel Vetter > > > > Software Engineer, Intel Corporation > > > > http://blog.ffwll.ch > > > > > > > > -- > > Daniel Vetter > > Software Engineer, Intel Corporation > > http://blog.ffwll.ch -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/2] drm: Fix dirtyfb stalls 2021-05-11 16:44 ` Daniel Vetter @ 2021-05-11 17:19 ` Rob Clark 2021-05-11 17:21 ` Daniel Vetter 2021-05-12 8:23 ` Pekka Paalanen 1 sibling, 1 reply; 23+ messages in thread From: Rob Clark @ 2021-05-11 17:19 UTC (permalink / raw) To: Rob Clark, dri-devel, Rob Clark, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie, open list Cc: Daniel Vetter On Tue, May 11, 2021 at 9:44 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > On Mon, May 10, 2021 at 12:06:05PM -0700, Rob Clark wrote: > > On Mon, May 10, 2021 at 10:44 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > On Mon, May 10, 2021 at 6:51 PM Rob Clark <robdclark@gmail.com> wrote: > > > > > > > > On Mon, May 10, 2021 at 9:14 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > > > On Sat, May 08, 2021 at 12:56:38PM -0700, Rob Clark wrote: > > > > > > From: Rob Clark <robdclark@chromium.org> > > > > > > > > > > > > drm_atomic_helper_dirtyfb() will end up stalling for vblank on "video > > > > > > mode" type displays, which is pointless and unnecessary. Add an > > > > > > optional helper vfunc to determine if a plane is attached to a CRTC > > > > > > that actually needs dirtyfb, and skip over them. > > > > > > > > > > > > Signed-off-by: Rob Clark <robdclark@chromium.org> > > > > > > > > > > So this is a bit annoying because the idea of all these "remap legacy uapi > > > > > to atomic constructs" helpers is that they shouldn't need/use anything > > > > > beyond what userspace also has available. So adding hacks for them feels > > > > > really bad. > > > > > > > > I suppose the root problem is that userspace doesn't know if dirtyfb > > > > (or similar) is actually required or is a no-op. > > > > > > > > But it is perhaps less of a problem because this essentially boils > > > > down to "x11 vs wayland", and it seems like wayland compositors for > > > > non-vsync'd rendering just pageflips and throws away extra frames from > > > > the app? 
> > > > > > Yeah it's about not adequately batching up rendering and syncing with > > > hw. bare metal x11 is just especially stupid about it :-) > > > > > > > > Also I feel like it's not entirely the right thing to do here either. > > > > > We've had this problem already on the fbcon emulation side (which also > > > > > shouldn't be able to peek behind the atomic kms uapi curtain), and the fix > > > > > there was to have a worker which batches up all the updates and avoids any > > > > > stalls in bad places. > > > > > > > > I'm not too worried about fbcon not being able to render faster than > > > > vblank. OTOH it is a pretty big problem for x11 > > > > > > That's why we'd let the worker get ahead at most one dirtyfb. We do > > > the same with fbcon, which trivially can get ahead of vblank otherwise > > > (if sometimes flushes each character, so you have to pile them up into > > > a single update if that's still pending). > > > > > > > > Since this is for frontbuffer rendering userspace only we can probably get > > > > > away with assuming there's only a single fb, so the implementation becomes > > > > > pretty simple: > > > > > > > > > > - 1 worker, and we keep track of a single pending fb > > > > > - if there's already a dirty fb pending on a different fb, we stall for > > > > > the worker to start processing that one already (i.e. the fb we track is > > > > > reset to NULL) > > > > > - if it's pending on the same fb we just toss away all the updates and go > > > > > with a full update, since merging the clip rects is too much work :-) I > > > > > think there's helpers so you could be slightly more clever and just have > > > > > an overall bounding box > > > > > > > > This doesn't really fix the problem, you still end up delaying sending > > > > the next back-buffer to mesa > > > > > > With this the dirtyfb would never block. Also glorious frontbuffer > > > tracking corruption is possible, but that's not the kernel's problem. 
> > > So how would anything get held up in userspace. > > > > the part about stalling if a dirtyfb is pending was what I was worried > > about.. but I suppose you meant the worker stalling, rather than > > userspace stalling (where I had interpreted it the other way around). > > As soon as userspace needs to stall, you're losing again. > > Nah, I did mean userspace stalling, so we can't pile up unlimited amounts > of dirtyfb request in the kernel. > > But also I never expect userspace that uses dirtyfb to actually hit this > stall point (otherwise we'd need to look at this again). It would really > be only there as defense against abuse. I don't believe modesetting ddx throttles dirtyfb, it (indirectly) calls this from its BlockHandler.. so if you do end up blocking after the N'th dirtyfb, you are still going to end up stalling for vblank, you are just deferring that for a frame or two.. The thing is, for a push style panel, you don't necessarily have to wait for "vblank" (because "vblank" isn't necessarily a real thing), so in that scenario dirtyfb could in theory be fast. What you want to do is fundamentally different for push vs pull style displays. > > > > But we could re-work drm_framebuffer_funcs::dirty to operate on a > > > > per-crtc basis and hoist the loop and check if dirtyfb is needed out > > > > of drm_atomic_helper_dirtyfb() > > > > > > That's still using information that userspace doesn't have, which is a > > > bit irky. We might as well go with your thing here then. > > > > arguably, this is something we should expose to userspace.. for DSI > > command-mode panels, you probably want to make a different decision > > with regard to how many buffers in your flip-chain.. > > > > Possibly we should add/remove the fb_damage_clips property depending > > on the display type (ie. video/pull vs cmd/push mode)? 
> > I'm not sure whether atomic actually needs this exposed: > - clients will do full flips for every frame anyway, I've not heard of > anyone seriously doing frontbuffer rendering. Frontbuffer rendering is actually a thing, for ex. to reduce latency for stylus (android and CrOS do this.. fortunately AFAICT CrOS never uses the dirtyfb ioctl.. but as soon as someone has the nice idea to add that we'd be running into the same problem) Possibly one idea is to treat dirty-clip updates similarly to cursor updates, and let the driver accumulate the updates and then wait until vblank to apply them BR, -R > - transporting the cliprects around and then tossing them if the driver > doesn't need them in their flip is probably not a measurable win > > But yeah if I'm wrong and we have a need here and it's useful, then > exposing this to userspace should be done. Meanwhile I think a "offload to > worker like fbcon" trick for this legacy interface is probabyl the best > option. Plus it will fix things not just for the case where you don't need > dirty uploading, it will also fix things for the case where you _do_ need > dirty uploading (since right now we stall in a few bad places for that I > think). > -Daniel > > > > > BR, > > -R > > > > > -Daniel > > > > > > > BR, > > > > -R > > > > > > > > > > > > > > Could probably steal most of the implementation. > > > > > > > > > > This approach here feels a tad too much in the hacky area ... > > > > > > > > > > Thoughts? 
> > > > > -Daniel > > > > > > > > > > > --- > > > > > > drivers/gpu/drm/drm_damage_helper.c | 8 ++++++++ > > > > > > include/drm/drm_modeset_helper_vtables.h | 14 ++++++++++++++ > > > > > > 2 files changed, 22 insertions(+) > > > > > > > > > > > > diff --git a/drivers/gpu/drm/drm_damage_helper.c b/drivers/gpu/drm/drm_damage_helper.c > > > > > > index 3a4126dc2520..a0bed1a2c2dc 100644 > > > > > > --- a/drivers/gpu/drm/drm_damage_helper.c > > > > > > +++ b/drivers/gpu/drm/drm_damage_helper.c > > > > > > @@ -211,6 +211,7 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer *fb, > > > > > > retry: > > > > > > drm_for_each_plane(plane, fb->dev) { > > > > > > struct drm_plane_state *plane_state; > > > > > > + struct drm_crtc *crtc; > > > > > > > > > > > > ret = drm_modeset_lock(&plane->mutex, state->acquire_ctx); > > > > > > if (ret) > > > > > > @@ -221,6 +222,13 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer *fb, > > > > > > continue; > > > > > > } > > > > > > > > > > > > + crtc = plane->state->crtc; > > > > > > + if (crtc->helper_private->needs_dirtyfb && > > > > > > + !crtc->helper_private->needs_dirtyfb(crtc)) { > > > > > > + drm_modeset_unlock(&plane->mutex); > > > > > > + continue; > > > > > > + } > > > > > > + > > > > > > plane_state = drm_atomic_get_plane_state(state, plane); > > > > > > if (IS_ERR(plane_state)) { > > > > > > ret = PTR_ERR(plane_state); > > > > > > diff --git a/include/drm/drm_modeset_helper_vtables.h b/include/drm/drm_modeset_helper_vtables.h > > > > > > index eb706342861d..afa8ec5754e7 100644 > > > > > > --- a/include/drm/drm_modeset_helper_vtables.h > > > > > > +++ b/include/drm/drm_modeset_helper_vtables.h > > > > > > @@ -487,6 +487,20 @@ struct drm_crtc_helper_funcs { > > > > > > bool in_vblank_irq, int *vpos, int *hpos, > > > > > > ktime_t *stime, ktime_t *etime, > > > > > > const struct drm_display_mode *mode); > > > > > > + > > > > > > + /** > > > > > > + * @needs_dirtyfb > > > > > > + * > > > > > > + * Optional callback used 
by damage helpers to determine if fb_damage_clips > > > > > > + * update is needed. > > > > > > + * > > > > > > + * Returns: > > > > > > + * > > > > > > + * True if fb_damage_clips update is needed to handle DIRTYFB, False > > > > > > + * otherwise. If this callback is not implemented, then True is > > > > > > + * assumed. > > > > > > + */ > > > > > > + bool (*needs_dirtyfb)(struct drm_crtc *crtc); > > > > > > }; > > > > > > > > > > > > /** > > > > > > -- > > > > > > 2.30.2 > > > > > > > > > > > > > > > > -- > > > > > Daniel Vetter > > > > > Software Engineer, Intel Corporation > > > > > http://blog.ffwll.ch > > > > > > > > > > > > -- > > > Daniel Vetter > > > Software Engineer, Intel Corporation > > > http://blog.ffwll.ch > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/2] drm: Fix dirtyfb stalls 2021-05-11 17:19 ` Rob Clark @ 2021-05-11 17:21 ` Daniel Vetter 2021-05-11 17:42 ` Rob Clark 0 siblings, 1 reply; 23+ messages in thread From: Daniel Vetter @ 2021-05-11 17:21 UTC (permalink / raw) To: Rob Clark Cc: dri-devel, Rob Clark, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie, open list, Daniel Vetter On Tue, May 11, 2021 at 10:19:57AM -0700, Rob Clark wrote: > On Tue, May 11, 2021 at 9:44 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > On Mon, May 10, 2021 at 12:06:05PM -0700, Rob Clark wrote: > > > On Mon, May 10, 2021 at 10:44 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > On Mon, May 10, 2021 at 6:51 PM Rob Clark <robdclark@gmail.com> wrote: > > > > > > > > > > On Mon, May 10, 2021 at 9:14 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > > > > > On Sat, May 08, 2021 at 12:56:38PM -0700, Rob Clark wrote: > > > > > > > From: Rob Clark <robdclark@chromium.org> > > > > > > > > > > > > > > drm_atomic_helper_dirtyfb() will end up stalling for vblank on "video > > > > > > > mode" type displays, which is pointless and unnecessary. Add an > > > > > > > optional helper vfunc to determine if a plane is attached to a CRTC > > > > > > > that actually needs dirtyfb, and skip over them. > > > > > > > > > > > > > > Signed-off-by: Rob Clark <robdclark@chromium.org> > > > > > > > > > > > > So this is a bit annoying because the idea of all these "remap legacy uapi > > > > > > to atomic constructs" helpers is that they shouldn't need/use anything > > > > > > beyond what userspace also has available. So adding hacks for them feels > > > > > > really bad. > > > > > > > > > > I suppose the root problem is that userspace doesn't know if dirtyfb > > > > > (or similar) is actually required or is a no-op. 
> > > > > > > > > > But it is perhaps less of a problem because this essentially boils > > > > > down to "x11 vs wayland", and it seems like wayland compositors for > > > > > non-vsync'd rendering just pageflips and throws away extra frames from > > > > > the app? > > > > > > > > Yeah it's about not adequately batching up rendering and syncing with > > > > hw. bare metal x11 is just especially stupid about it :-) > > > > > > > > > > Also I feel like it's not entirely the right thing to do here either. > > > > > > We've had this problem already on the fbcon emulation side (which also > > > > > > shouldn't be able to peek behind the atomic kms uapi curtain), and the fix > > > > > > there was to have a worker which batches up all the updates and avoids any > > > > > > stalls in bad places. > > > > > > > > > > I'm not too worried about fbcon not being able to render faster than > > > > > vblank. OTOH it is a pretty big problem for x11 > > > > > > > > That's why we'd let the worker get ahead at most one dirtyfb. We do > > > > the same with fbcon, which trivially can get ahead of vblank otherwise > > > > (if sometimes flushes each character, so you have to pile them up into > > > > a single update if that's still pending). > > > > > > > > > > Since this is for frontbuffer rendering userspace only we can probably get > > > > > > away with assuming there's only a single fb, so the implementation becomes > > > > > > pretty simple: > > > > > > > > > > > > - 1 worker, and we keep track of a single pending fb > > > > > > - if there's already a dirty fb pending on a different fb, we stall for > > > > > > the worker to start processing that one already (i.e. 
the fb we track is > > > > > > reset to NULL) > > > > > > - if it's pending on the same fb we just toss away all the updates and go > > > > > > with a full update, since merging the clip rects is too much work :-) I > > > > > > think there's helpers so you could be slightly more clever and just have > > > > > > an overall bounding box > > > > > > > > > > This doesn't really fix the problem, you still end up delaying sending > > > > > the next back-buffer to mesa > > > > > > > > With this the dirtyfb would never block. Also glorious frontbuffer > > > > tracking corruption is possible, but that's not the kernel's problem. > > > > So how would anything get held up in userspace. > > > > > > the part about stalling if a dirtyfb is pending was what I was worried > > > about.. but I suppose you meant the worker stalling, rather than > > > userspace stalling (where I had interpreted it the other way around). > > > As soon as userspace needs to stall, you're losing again. > > > > Nah, I did mean userspace stalling, so we can't pile up unlimited amounts > > of dirtyfb request in the kernel. > > > > But also I never expect userspace that uses dirtyfb to actually hit this > > stall point (otherwise we'd need to look at this again). It would really > > be only there as defense against abuse. > > I don't believe modesetting ddx throttles dirtyfb, it (indirectly) > calls this from it's BlockHandler.. so if you do end up blocking after > the N'th dirtyfb, you are still going to end up stalling for vblank, > you are just deferring that for a frame or two.. Nope, that's not what I mean. By default we pile up the updates, so you _never_ stall. The worker then takes the entire update every time it runs and batches them up. We _only_ stall when we get a dirtyfb with a different fb. Because that's much harder to pile up, plus frontbuffer rendering userspace uses a single fb across all screens anyway. So really I don't expect X to ever stall in it's BlockHandler with this. 
> The thing is, for a push style panel, you don't necessarily have to > wait for "vblank" (because "vblank" isn't necessarily a real thing), > so in that scenario dirtyfb could in theory be fast. What you want to > do is fundamentally different for push vs pull style displays. Yeah, but we'd only stall if userspace does a modeset (which means different fb) and at that point you'll stall anyway a bit. So shouldn't hurt. Well you can do frontbuffer rendering even with atomic ioctl. Just don't use dirtyfb. But also you really shouldn't use frontbuffer rendering right now, since we don't have the interfaces right now to tell userspace whether it's cmd-mode or something else and what kind of corruption (if any) to expect when they do that. > > > > > But we could re-work drm_framebuffer_funcs::dirty to operate on a > > > > > per-crtc basis and hoist the loop and check if dirtyfb is needed out > > > > > of drm_atomic_helper_dirtyfb() > > > > > > > > That's still using information that userspace doesn't have, which is a > > > > bit irky. We might as well go with your thing here then. > > > > > > arguably, this is something we should expose to userspace.. for DSI > > > command-mode panels, you probably want to make a different decision > > > with regard to how many buffers in your flip-chain.. > > > > > > Possibly we should add/remove the fb_damage_clips property depending > > > on the display type (ie. video/pull vs cmd/push mode)? > > > > I'm not sure whether atomic actually needs this exposed: > > - clients will do full flips for every frame anyway, I've not heard of > > anyone seriously doing frontbuffer rendering. > > Frontbuffer rendering is actually a thing, for ex. to reduce latency > for stylus (android and CrOS do this.. fortunately AFAICT CrOS never > uses the dirtyfb ioctl.. 
but as soon as someone has the nice idea to > add that we'd be running into the same problem) > > Possibly one idea is to treat dirty-clip updates similarly to cursor > updates, and let the driver accumulate the updates and then wait until > vblank to apply them Yeah that's what I mean. Except implemented cheaper. fbcon code already does it. I think we're seriously talking past each another. -Daniel > > BR, > -R > > > - transporting the cliprects around and then tossing them if the driver > > doesn't need them in their flip is probably not a measurable win > > > > But yeah if I'm wrong and we have a need here and it's useful, then > > exposing this to userspace should be done. Meanwhile I think a "offload to > > worker like fbcon" trick for this legacy interface is probabyl the best > > option. Plus it will fix things not just for the case where you don't need > > dirty uploading, it will also fix things for the case where you _do_ need > > dirty uploading (since right now we stall in a few bad places for that I > > think). > > -Daniel > > > > > > > > BR, > > > -R > > > > > > > -Daniel > > > > > > > > > BR, > > > > > -R > > > > > > > > > > > > > > > > > Could probably steal most of the implementation. > > > > > > > > > > > > This approach here feels a tad too much in the hacky area ... > > > > > > > > > > > > Thoughts? 
> > > > > > -Daniel > > > > > > > > > > > > > --- > > > > > > > drivers/gpu/drm/drm_damage_helper.c | 8 ++++++++ > > > > > > > include/drm/drm_modeset_helper_vtables.h | 14 ++++++++++++++ > > > > > > > 2 files changed, 22 insertions(+) > > > > > > > > > > > > > > diff --git a/drivers/gpu/drm/drm_damage_helper.c b/drivers/gpu/drm/drm_damage_helper.c > > > > > > > index 3a4126dc2520..a0bed1a2c2dc 100644 > > > > > > > --- a/drivers/gpu/drm/drm_damage_helper.c > > > > > > > +++ b/drivers/gpu/drm/drm_damage_helper.c > > > > > > > @@ -211,6 +211,7 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer *fb, > > > > > > > retry: > > > > > > > drm_for_each_plane(plane, fb->dev) { > > > > > > > struct drm_plane_state *plane_state; > > > > > > > + struct drm_crtc *crtc; > > > > > > > > > > > > > > ret = drm_modeset_lock(&plane->mutex, state->acquire_ctx); > > > > > > > if (ret) > > > > > > > @@ -221,6 +222,13 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer *fb, > > > > > > > continue; > > > > > > > } > > > > > > > > > > > > > > + crtc = plane->state->crtc; > > > > > > > + if (crtc->helper_private->needs_dirtyfb && > > > > > > > + !crtc->helper_private->needs_dirtyfb(crtc)) { > > > > > > > + drm_modeset_unlock(&plane->mutex); > > > > > > > + continue; > > > > > > > + } > > > > > > > + > > > > > > > plane_state = drm_atomic_get_plane_state(state, plane); > > > > > > > if (IS_ERR(plane_state)) { > > > > > > > ret = PTR_ERR(plane_state); > > > > > > > diff --git a/include/drm/drm_modeset_helper_vtables.h b/include/drm/drm_modeset_helper_vtables.h > > > > > > > index eb706342861d..afa8ec5754e7 100644 > > > > > > > --- a/include/drm/drm_modeset_helper_vtables.h > > > > > > > +++ b/include/drm/drm_modeset_helper_vtables.h > > > > > > > @@ -487,6 +487,20 @@ struct drm_crtc_helper_funcs { > > > > > > > bool in_vblank_irq, int *vpos, int *hpos, > > > > > > > ktime_t *stime, ktime_t *etime, > > > > > > > const struct drm_display_mode *mode); > > > > > > > + > > > > > > > + /** 
> > > > > > > + * @needs_dirtyfb > > > > > > > + * > > > > > > > + * Optional callback used by damage helpers to determine if fb_damage_clips > > > > > > > + * update is needed. > > > > > > > + * > > > > > > > + * Returns: > > > > > > > + * > > > > > > > + * True if fb_damage_clips update is needed to handle DIRTYFB, False > > > > > > > + * otherwise. If this callback is not implemented, then True is > > > > > > > + * assumed. > > > > > > > + */ > > > > > > > + bool (*needs_dirtyfb)(struct drm_crtc *crtc); > > > > > > > }; > > > > > > > > > > > > > > /** > > > > > > > -- > > > > > > > 2.30.2 > > > > > > > > > > > > > > > > > > > -- > > > > > > Daniel Vetter > > > > > > Software Engineer, Intel Corporation > > > > > > http://blog.ffwll.ch > > > > > > > > > > > > > > > > -- > > > > Daniel Vetter > > > > Software Engineer, Intel Corporation > > > > http://blog.ffwll.ch > > > > -- > > Daniel Vetter > > Software Engineer, Intel Corporation > > http://blog.ffwll.ch -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/2] drm: Fix dirtyfb stalls 2021-05-11 17:21 ` Daniel Vetter @ 2021-05-11 17:42 ` Rob Clark 2021-05-11 17:50 ` Daniel Vetter 0 siblings, 1 reply; 23+ messages in thread From: Rob Clark @ 2021-05-11 17:42 UTC (permalink / raw) To: Rob Clark, dri-devel, Rob Clark, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie, open list Cc: Daniel Vetter On Tue, May 11, 2021 at 10:21 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > On Tue, May 11, 2021 at 10:19:57AM -0700, Rob Clark wrote: > > On Tue, May 11, 2021 at 9:44 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > On Mon, May 10, 2021 at 12:06:05PM -0700, Rob Clark wrote: > > > > On Mon, May 10, 2021 at 10:44 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > > > On Mon, May 10, 2021 at 6:51 PM Rob Clark <robdclark@gmail.com> wrote: > > > > > > > > > > > > On Mon, May 10, 2021 at 9:14 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > > > > > > > On Sat, May 08, 2021 at 12:56:38PM -0700, Rob Clark wrote: > > > > > > > > From: Rob Clark <robdclark@chromium.org> > > > > > > > > > > > > > > > > drm_atomic_helper_dirtyfb() will end up stalling for vblank on "video > > > > > > > > mode" type displays, which is pointless and unnecessary. Add an > > > > > > > > optional helper vfunc to determine if a plane is attached to a CRTC > > > > > > > > that actually needs dirtyfb, and skip over them. > > > > > > > > > > > > > > > > Signed-off-by: Rob Clark <robdclark@chromium.org> > > > > > > > > > > > > > > So this is a bit annoying because the idea of all these "remap legacy uapi > > > > > > > to atomic constructs" helpers is that they shouldn't need/use anything > > > > > > > beyond what userspace also has available. So adding hacks for them feels > > > > > > > really bad. > > > > > > > > > > > > I suppose the root problem is that userspace doesn't know if dirtyfb > > > > > > (or similar) is actually required or is a no-op. 
> > > > > > > > > > > > But it is perhaps less of a problem because this essentially boils > > > > > > down to "x11 vs wayland", and it seems like wayland compositors for > > > > > > non-vsync'd rendering just pageflips and throws away extra frames from > > > > > > the app? > > > > > > > > > > Yeah it's about not adequately batching up rendering and syncing with > > > > > hw. bare metal x11 is just especially stupid about it :-) > > > > > > > > > > > > Also I feel like it's not entirely the right thing to do here either. > > > > > > > We've had this problem already on the fbcon emulation side (which also > > > > > > > shouldn't be able to peek behind the atomic kms uapi curtain), and the fix > > > > > > > there was to have a worker which batches up all the updates and avoids any > > > > > > > stalls in bad places. > > > > > > > > > > > > I'm not too worried about fbcon not being able to render faster than > > > > > > vblank. OTOH it is a pretty big problem for x11 > > > > > > > > > > That's why we'd let the worker get ahead at most one dirtyfb. We do > > > > > the same with fbcon, which trivially can get ahead of vblank otherwise > > > > > (if sometimes flushes each character, so you have to pile them up into > > > > > a single update if that's still pending). > > > > > > > > > > > > Since this is for frontbuffer rendering userspace only we can probably get > > > > > > > away with assuming there's only a single fb, so the implementation becomes > > > > > > > pretty simple: > > > > > > > > > > > > > > - 1 worker, and we keep track of a single pending fb > > > > > > > - if there's already a dirty fb pending on a different fb, we stall for > > > > > > > the worker to start processing that one already (i.e. 
the fb we track is > > > > > > > reset to NULL) > > > > > > > - if it's pending on the same fb we just toss away all the updates and go > > > > > > > with a full update, since merging the clip rects is too much work :-) I > > > > > > > think there's helpers so you could be slightly more clever and just have > > > > > > > an overall bounding box > > > > > > > > > > > > This doesn't really fix the problem, you still end up delaying sending > > > > > > the next back-buffer to mesa > > > > > > > > > > With this the dirtyfb would never block. Also glorious frontbuffer > > > > > tracking corruption is possible, but that's not the kernel's problem. > > > > > So how would anything get held up in userspace. > > > > > > > > the part about stalling if a dirtyfb is pending was what I was worried > > > > about.. but I suppose you meant the worker stalling, rather than > > > > userspace stalling (where I had interpreted it the other way around). > > > > As soon as userspace needs to stall, you're losing again. > > > > > > Nah, I did mean userspace stalling, so we can't pile up unlimited amounts > > > of dirtyfb request in the kernel. > > > > > > But also I never expect userspace that uses dirtyfb to actually hit this > > > stall point (otherwise we'd need to look at this again). It would really > > > be only there as defense against abuse. > > > > I don't believe modesetting ddx throttles dirtyfb, it (indirectly) > > calls this from it's BlockHandler.. so if you do end up blocking after > > the N'th dirtyfb, you are still going to end up stalling for vblank, > > you are just deferring that for a frame or two.. > > Nope, that's not what I mean. > > By default we pile up the updates, so you _never_ stall. The worker then > takes the entire update every time it runs and batches them up. > > We _only_ stall when we get a dirtyfb with a different fb. Because that's > much harder to pile up, plus frontbuffer rendering userspace uses a single > fb across all screens anyway. 
> > So really I don't expect X to ever stall in it's BlockHandler with this.

ok, sorry, I missed the "different fb" part..

but I could see a userspace that uses multiple fb's wanting to do
front buffer rendering.. although they are probably only going to do
it on a single display at a time, so maybe that is a bit of an edge
case

> > The thing is, for a push style panel, you don't necessarily have to
> > wait for "vblank" (because "vblank" isn't necessarily a real thing),
> > so in that scenario dirtyfb could in theory be fast. What you want to
> > do is fundamentally different for push vs pull style displays.
> > Yeah, but we'd only stall if userspace does a modeset (which means
> different fb) and at that point you'll stall anyway a bit. So shouldn't
> hurt.
> > Well you can do frontbuffer rendering even with atomic ioctl. Just don't
> use dirtyfb.
> > But also you really shouldn't use frontbuffer rendering right now, since
> we don't have the interfaces right now to tell userspace whether it's
> cmd-mode or something else and what kind of corruption (if any) to expect
> when they do that.

Compressed formats and front-buffer rendering don't really work out in
a pleasant way.. minigbm has a usage flag to indicate that the surface
will be used for front-buffer rendering (and it is a thing we should
probably port to real gbm). I think this aspect of it is better
solved in userspace.

> > > > > > But we could re-work drm_framebuffer_funcs::dirty to operate on a
> > > > > > per-crtc basis and hoist the loop and check if dirtyfb is needed out
> > > > > > of drm_atomic_helper_dirtyfb()
> > > > > That's still using information that userspace doesn't have, which is a
> > > > > bit irky. We might as well go with your thing here then.
> > > > arguably, this is something we should expose to userspace.. for DSI
> > > > command-mode panels, you probably want to make a different decision
> > > > with regard to how many buffers in your flip-chain..
> > > > > > > > Possibly we should add/remove the fb_damage_clips property depending > > > > on the display type (ie. video/pull vs cmd/push mode)? > > > > > > I'm not sure whether atomic actually needs this exposed: > > > - clients will do full flips for every frame anyway, I've not heard of > > > anyone seriously doing frontbuffer rendering. > > > > Frontbuffer rendering is actually a thing, for ex. to reduce latency > > for stylus (android and CrOS do this.. fortunately AFAICT CrOS never > > uses the dirtyfb ioctl.. but as soon as someone has the nice idea to > > add that we'd be running into the same problem) > > > > Possibly one idea is to treat dirty-clip updates similarly to cursor > > updates, and let the driver accumulate the updates and then wait until > > vblank to apply them > > Yeah that's what I mean. Except implemented cheaper. fbcon code already > does it. I think we're seriously talking past each another. Hmm, well 'state->async_update = true' is a pretty cheap implementation.. BR, -R > -Daniel > > > > > BR, > > -R > > > > > - transporting the cliprects around and then tossing them if the driver > > > doesn't need them in their flip is probably not a measurable win > > > > > > But yeah if I'm wrong and we have a need here and it's useful, then > > > exposing this to userspace should be done. Meanwhile I think a "offload to > > > worker like fbcon" trick for this legacy interface is probabyl the best > > > option. Plus it will fix things not just for the case where you don't need > > > dirty uploading, it will also fix things for the case where you _do_ need > > > dirty uploading (since right now we stall in a few bad places for that I > > > think). > > > -Daniel > > > > > > > > > > > BR, > > > > -R > > > > > > > > > -Daniel > > > > > > > > > > > BR, > > > > > > -R > > > > > > > > > > > > > > > > > > > > Could probably steal most of the implementation. > > > > > > > > > > > > > > This approach here feels a tad too much in the hacky area ... 
> > > > > > > > > > > > > > Thoughts? > > > > > > > -Daniel > > > > > > > > > > > > > > > --- > > > > > > > > drivers/gpu/drm/drm_damage_helper.c | 8 ++++++++ > > > > > > > > include/drm/drm_modeset_helper_vtables.h | 14 ++++++++++++++ > > > > > > > > 2 files changed, 22 insertions(+) > > > > > > > > > > > > > > > > diff --git a/drivers/gpu/drm/drm_damage_helper.c b/drivers/gpu/drm/drm_damage_helper.c > > > > > > > > index 3a4126dc2520..a0bed1a2c2dc 100644 > > > > > > > > --- a/drivers/gpu/drm/drm_damage_helper.c > > > > > > > > +++ b/drivers/gpu/drm/drm_damage_helper.c > > > > > > > > @@ -211,6 +211,7 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer *fb, > > > > > > > > retry: > > > > > > > > drm_for_each_plane(plane, fb->dev) { > > > > > > > > struct drm_plane_state *plane_state; > > > > > > > > + struct drm_crtc *crtc; > > > > > > > > > > > > > > > > ret = drm_modeset_lock(&plane->mutex, state->acquire_ctx); > > > > > > > > if (ret) > > > > > > > > @@ -221,6 +222,13 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer *fb, > > > > > > > > continue; > > > > > > > > } > > > > > > > > > > > > > > > > + crtc = plane->state->crtc; > > > > > > > > + if (crtc->helper_private->needs_dirtyfb && > > > > > > > > + !crtc->helper_private->needs_dirtyfb(crtc)) { > > > > > > > > + drm_modeset_unlock(&plane->mutex); > > > > > > > > + continue; > > > > > > > > + } > > > > > > > > + > > > > > > > > plane_state = drm_atomic_get_plane_state(state, plane); > > > > > > > > if (IS_ERR(plane_state)) { > > > > > > > > ret = PTR_ERR(plane_state); > > > > > > > > diff --git a/include/drm/drm_modeset_helper_vtables.h b/include/drm/drm_modeset_helper_vtables.h > > > > > > > > index eb706342861d..afa8ec5754e7 100644 > > > > > > > > --- a/include/drm/drm_modeset_helper_vtables.h > > > > > > > > +++ b/include/drm/drm_modeset_helper_vtables.h > > > > > > > > @@ -487,6 +487,20 @@ struct drm_crtc_helper_funcs { > > > > > > > > bool in_vblank_irq, int *vpos, int *hpos, > > > > > > > > 
ktime_t *stime, ktime_t *etime, > > > > > > > > const struct drm_display_mode *mode); > > > > > > > > + > > > > > > > > + /** > > > > > > > > + * @needs_dirtyfb > > > > > > > > + * > > > > > > > > + * Optional callback used by damage helpers to determine if fb_damage_clips > > > > > > > > + * update is needed. > > > > > > > > + * > > > > > > > > + * Returns: > > > > > > > > + * > > > > > > > > + * True if fb_damage_clips update is needed to handle DIRTYFB, False > > > > > > > > + * otherwise. If this callback is not implemented, then True is > > > > > > > > + * assumed. > > > > > > > > + */ > > > > > > > > + bool (*needs_dirtyfb)(struct drm_crtc *crtc); > > > > > > > > }; > > > > > > > > > > > > > > > > /** > > > > > > > > -- > > > > > > > > 2.30.2 > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > Daniel Vetter > > > > > > > Software Engineer, Intel Corporation > > > > > > > http://blog.ffwll.ch > > > > > > > > > > > > > > > > > > > > -- > > > > > Daniel Vetter > > > > > Software Engineer, Intel Corporation > > > > > http://blog.ffwll.ch > > > > > > -- > > > Daniel Vetter > > > Software Engineer, Intel Corporation > > > http://blog.ffwll.ch > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch ^ permalink raw reply [flat|nested] 23+ messages in thread
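[Editor's note: the patch quoted above boils down to an optional vtable hook with a default-true fallback. The following is a standalone userspace sketch of that dispatch convention, not kernel code — the struct and function names only mirror the patch, and the `video_mode_crtc`/`cmd_mode_crtc` implementations are invented for illustration.]

```c
/*
 * Standalone model of the optional needs_dirtyfb callback: if a CRTC's
 * helper vtable does not implement the hook, the damage helper assumes
 * a dirtyfb flush IS needed; a "video mode" CRTC can implement it to
 * return false and be skipped, avoiding the vblank stall.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct crtc;

struct crtc_helper_funcs {
	/* Optional: true if fb_damage_clips must be flushed for DIRTYFB. */
	bool (*needs_dirtyfb)(struct crtc *crtc);
};

struct crtc {
	const struct crtc_helper_funcs *helper_private;
};

/* Mirrors the check the patch inserts into drm_atomic_helper_dirtyfb(). */
static bool crtc_needs_dirtyfb(struct crtc *crtc)
{
	const struct crtc_helper_funcs *funcs = crtc->helper_private;

	if (funcs->needs_dirtyfb)
		return funcs->needs_dirtyfb(crtc);
	return true;	/* no callback: assume an update is needed */
}

/* Hypothetical implementations a driver might provide. */
static bool video_mode_crtc(struct crtc *crtc) { (void)crtc; return false; }
static bool cmd_mode_crtc(struct crtc *crtc)   { (void)crtc; return true; }

static const struct crtc_helper_funcs no_hook   = { .needs_dirtyfb = NULL };
static const struct crtc_helper_funcs video_fns = { .needs_dirtyfb = video_mode_crtc };
static const struct crtc_helper_funcs cmd_fns   = { .needs_dirtyfb = cmd_mode_crtc };
```

In the real helper the skipped plane also drops its modeset lock and `continue`s the plane loop, so only planes on command-mode CRTCs end up in the atomic commit.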
* Re: [PATCH 1/2] drm: Fix dirtyfb stalls 2021-05-11 17:42 ` Rob Clark @ 2021-05-11 17:50 ` Daniel Vetter 0 siblings, 0 replies; 23+ messages in thread From: Daniel Vetter @ 2021-05-11 17:50 UTC (permalink / raw) To: Rob Clark Cc: dri-devel, Rob Clark, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie, open list, Daniel Vetter On Tue, May 11, 2021 at 10:42:58AM -0700, Rob Clark wrote: > On Tue, May 11, 2021 at 10:21 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > On Tue, May 11, 2021 at 10:19:57AM -0700, Rob Clark wrote: > > > On Tue, May 11, 2021 at 9:44 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > On Mon, May 10, 2021 at 12:06:05PM -0700, Rob Clark wrote: > > > > > On Mon, May 10, 2021 at 10:44 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > > > > > On Mon, May 10, 2021 at 6:51 PM Rob Clark <robdclark@gmail.com> wrote: > > > > > > > > > > > > > > On Mon, May 10, 2021 at 9:14 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > > > > > > > > > On Sat, May 08, 2021 at 12:56:38PM -0700, Rob Clark wrote: > > > > > > > > > From: Rob Clark <robdclark@chromium.org> > > > > > > > > > > > > > > > > > > drm_atomic_helper_dirtyfb() will end up stalling for vblank on "video > > > > > > > > > mode" type displays, which is pointless and unnecessary. Add an > > > > > > > > > optional helper vfunc to determine if a plane is attached to a CRTC > > > > > > > > > that actually needs dirtyfb, and skip over them. > > > > > > > > > > > > > > > > > > Signed-off-by: Rob Clark <robdclark@chromium.org> > > > > > > > > > > > > > > > > So this is a bit annoying because the idea of all these "remap legacy uapi > > > > > > > > to atomic constructs" helpers is that they shouldn't need/use anything > > > > > > > > beyond what userspace also has available. So adding hacks for them feels > > > > > > > > really bad. 
> > > > > > > > > > > > > > I suppose the root problem is that userspace doesn't know if dirtyfb > > > > > > > (or similar) is actually required or is a no-op. > > > > > > > > > > > > > > But it is perhaps less of a problem because this essentially boils > > > > > > > down to "x11 vs wayland", and it seems like wayland compositors for > > > > > > > non-vsync'd rendering just pageflips and throws away extra frames from > > > > > > > the app? > > > > > > > > > > > > Yeah it's about not adequately batching up rendering and syncing with > > > > > > hw. bare metal x11 is just especially stupid about it :-) > > > > > > > > > > > > > > Also I feel like it's not entirely the right thing to do here either. > > > > > > > > We've had this problem already on the fbcon emulation side (which also > > > > > > > > shouldn't be able to peek behind the atomic kms uapi curtain), and the fix > > > > > > > > there was to have a worker which batches up all the updates and avoids any > > > > > > > > stalls in bad places. > > > > > > > > > > > > > > I'm not too worried about fbcon not being able to render faster than > > > > > > > vblank. OTOH it is a pretty big problem for x11 > > > > > > > > > > > > That's why we'd let the worker get ahead at most one dirtyfb. We do > > > > > > the same with fbcon, which trivially can get ahead of vblank otherwise > > > > > > (if sometimes flushes each character, so you have to pile them up into > > > > > > a single update if that's still pending). > > > > > > > > > > > > > > Since this is for frontbuffer rendering userspace only we can probably get > > > > > > > > away with assuming there's only a single fb, so the implementation becomes > > > > > > > > pretty simple: > > > > > > > > > > > > > > > > - 1 worker, and we keep track of a single pending fb > > > > > > > > - if there's already a dirty fb pending on a different fb, we stall for > > > > > > > > the worker to start processing that one already (i.e. 
the fb we track is > > > > > > > > reset to NULL) > > > > > > > > - if it's pending on the same fb we just toss away all the updates and go > > > > > > > > with a full update, since merging the clip rects is too much work :-) I > > > > > > > > think there's helpers so you could be slightly more clever and just have > > > > > > > > an overall bounding box > > > > > > > > > > > > > > This doesn't really fix the problem, you still end up delaying sending > > > > > > > the next back-buffer to mesa > > > > > > > > > > > > With this the dirtyfb would never block. Also glorious frontbuffer > > > > > > tracking corruption is possible, but that's not the kernel's problem. > > > > > > So how would anything get held up in userspace. > > > > > > > > > > the part about stalling if a dirtyfb is pending was what I was worried > > > > > about.. but I suppose you meant the worker stalling, rather than > > > > > userspace stalling (where I had interpreted it the other way around). > > > > > As soon as userspace needs to stall, you're losing again. > > > > > > > > Nah, I did mean userspace stalling, so we can't pile up unlimited amounts > > > > of dirtyfb request in the kernel. > > > > > > > > But also I never expect userspace that uses dirtyfb to actually hit this > > > > stall point (otherwise we'd need to look at this again). It would really > > > > be only there as defense against abuse. > > > > > > I don't believe modesetting ddx throttles dirtyfb, it (indirectly) > > > calls this from it's BlockHandler.. so if you do end up blocking after > > > the N'th dirtyfb, you are still going to end up stalling for vblank, > > > you are just deferring that for a frame or two.. > > > > Nope, that's not what I mean. > > > > By default we pile up the updates, so you _never_ stall. The worker then > > takes the entire update every time it runs and batches them up. > > > > We _only_ stall when we get a dirtyfb with a different fb. 
Because that's > > much harder to pile up, plus frontbuffer rendering userspace uses a single > > fb across all screens anyway. > > > > So really I don't expect X to ever stall in it's BlockHandler with this. > > ok, sorry, I missed the "different fb" part.. > > but I could see a userspace that uses multiple fb's wanting to do > front buffer rendering.. although they are probably only going to do > it on a single display at a time, so maybe that is a bit of an edge > case Yeah at that point we either tell them "pls dont" (if it's new userspace). Or we quietly sigh and make the stall avoidance/pile up logic a bit more fancy to take another case into account. > > > The thing is, for a push style panel, you don't necessarily have to > > > wait for "vblank" (because "vblank" isn't necessarily a real thing), > > > so in that scenario dirtyfb could in theory be fast. What you want to > > > do is fundamentally different for push vs pull style displays. > > > > Yeah, but we'd only stall if userspace does a modeset (which means > > different fb) and at that point you'll stall anyway a bit. So shouldn't > > hurt. > > > > Well you can do frontbuffer rendering even with atomic ioctl. Just don't > > use dirtyfb. > > > > But also you really shouldn't use frontbuffer rendering right now, since > > we don't have the interfaces right now to tell userspace whether it's > > cmd-mode or something else and what kind of corruption (if any) to expect > > when they do that. > > Compressed formats and front-buffer rendering don't really work out in > a pleasant way.. minigbm has a usage flag to indicate that the surface > will be used for front-buffer rendering (and it is a thing we should > probably port to real gbm). I think this aspect of it is better > solved in userspace. Yeah, I'm thinking more of cmd/scanout panels and stuff like that. 
Although even with cmd-mode we currently reserve the right to rescan the
buffer whenever we feel like in the kernel, so right now you can't rely
on anything to avoid corruption for frontbuffer rendering.

> > > > > > > > But we could re-work drm_framebuffer_funcs::dirty to operate on a
> > > > > > > per-crtc basis and hoist the loop and check if dirtyfb is needed out
> > > > > > > of drm_atomic_helper_dirtyfb()
> > > > > > That's still using information that userspace doesn't have, which is a
> > > > > > bit irky. We might as well go with your thing here then.
> > > > > arguably, this is something we should expose to userspace.. for DSI
> > > > > command-mode panels, you probably want to make a different decision
> > > > > with regard to how many buffers in your flip-chain..
> > > > > Possibly we should add/remove the fb_damage_clips property depending
> > > > > on the display type (ie. video/pull vs cmd/push mode)?
> > > > I'm not sure whether atomic actually needs this exposed:
> > > > - clients will do full flips for every frame anyway, I've not heard of
> > > > anyone seriously doing frontbuffer rendering.
> > > Frontbuffer rendering is actually a thing, for ex. to reduce latency
> > > for stylus (android and CrOS do this.. fortunately AFAICT CrOS never
> > > uses the dirtyfb ioctl.. but as soon as someone has the nice idea to
> > > add that we'd be running into the same problem)
> > > Possibly one idea is to treat dirty-clip updates similarly to cursor
> > > updates, and let the driver accumulate the updates and then wait until
> > > vblank to apply them
> > Yeah that's what I mean. Except implemented cheaper. fbcon code already
> > does it. I think we're seriously talking past each another.
> Hmm, well 'state->async_update = true' is a pretty cheap implementation..
It's also very broken thus far :-/ It's broken enough that I've essentially given up trying to make cursor work reasonably well across drivers, much less extend this to plane updates in general, or more. One can dream still, but for legacy ioctl or functionality like fbcon it's much easier to hack over the problem with some kernel threads before you call drm_atomic_commit. Cheers, Daniel > > BR, > -R > > > -Daniel > > > > > > > > BR, > > > -R > > > > > > > - transporting the cliprects around and then tossing them if the driver > > > > doesn't need them in their flip is probably not a measurable win > > > > > > > > But yeah if I'm wrong and we have a need here and it's useful, then > > > > exposing this to userspace should be done. Meanwhile I think a "offload to > > > > worker like fbcon" trick for this legacy interface is probabyl the best > > > > option. Plus it will fix things not just for the case where you don't need > > > > dirty uploading, it will also fix things for the case where you _do_ need > > > > dirty uploading (since right now we stall in a few bad places for that I > > > > think). > > > > -Daniel > > > > > > > > > > > > > > BR, > > > > > -R > > > > > > > > > > > -Daniel > > > > > > > > > > > > > BR, > > > > > > > -R > > > > > > > > > > > > > > > > > > > > > > > Could probably steal most of the implementation. > > > > > > > > > > > > > > > > This approach here feels a tad too much in the hacky area ... > > > > > > > > > > > > > > > > Thoughts? 
> > > > > > > > -Daniel > > > > > > > > > > > > > > > > > --- > > > > > > > > > drivers/gpu/drm/drm_damage_helper.c | 8 ++++++++ > > > > > > > > > include/drm/drm_modeset_helper_vtables.h | 14 ++++++++++++++ > > > > > > > > > 2 files changed, 22 insertions(+) > > > > > > > > > > > > > > > > > > diff --git a/drivers/gpu/drm/drm_damage_helper.c b/drivers/gpu/drm/drm_damage_helper.c > > > > > > > > > index 3a4126dc2520..a0bed1a2c2dc 100644 > > > > > > > > > --- a/drivers/gpu/drm/drm_damage_helper.c > > > > > > > > > +++ b/drivers/gpu/drm/drm_damage_helper.c > > > > > > > > > @@ -211,6 +211,7 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer *fb, > > > > > > > > > retry: > > > > > > > > > drm_for_each_plane(plane, fb->dev) { > > > > > > > > > struct drm_plane_state *plane_state; > > > > > > > > > + struct drm_crtc *crtc; > > > > > > > > > > > > > > > > > > ret = drm_modeset_lock(&plane->mutex, state->acquire_ctx); > > > > > > > > > if (ret) > > > > > > > > > @@ -221,6 +222,13 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer *fb, > > > > > > > > > continue; > > > > > > > > > } > > > > > > > > > > > > > > > > > > + crtc = plane->state->crtc; > > > > > > > > > + if (crtc->helper_private->needs_dirtyfb && > > > > > > > > > + !crtc->helper_private->needs_dirtyfb(crtc)) { > > > > > > > > > + drm_modeset_unlock(&plane->mutex); > > > > > > > > > + continue; > > > > > > > > > + } > > > > > > > > > + > > > > > > > > > plane_state = drm_atomic_get_plane_state(state, plane); > > > > > > > > > if (IS_ERR(plane_state)) { > > > > > > > > > ret = PTR_ERR(plane_state); > > > > > > > > > diff --git a/include/drm/drm_modeset_helper_vtables.h b/include/drm/drm_modeset_helper_vtables.h > > > > > > > > > index eb706342861d..afa8ec5754e7 100644 > > > > > > > > > --- a/include/drm/drm_modeset_helper_vtables.h > > > > > > > > > +++ b/include/drm/drm_modeset_helper_vtables.h > > > > > > > > > @@ -487,6 +487,20 @@ struct drm_crtc_helper_funcs { > > > > > > > > > bool in_vblank_irq, 
int *vpos, int *hpos, > > > > > > > > > ktime_t *stime, ktime_t *etime, > > > > > > > > > const struct drm_display_mode *mode); > > > > > > > > > + > > > > > > > > > + /** > > > > > > > > > + * @needs_dirtyfb > > > > > > > > > + * > > > > > > > > > + * Optional callback used by damage helpers to determine if fb_damage_clips > > > > > > > > > + * update is needed. > > > > > > > > > + * > > > > > > > > > + * Returns: > > > > > > > > > + * > > > > > > > > > + * True if fb_damage_clips update is needed to handle DIRTYFB, False > > > > > > > > > + * otherwise. If this callback is not implemented, then True is > > > > > > > > > + * assumed. > > > > > > > > > + */ > > > > > > > > > + bool (*needs_dirtyfb)(struct drm_crtc *crtc); > > > > > > > > > }; > > > > > > > > > > > > > > > > > > /** > > > > > > > > > -- > > > > > > > > > 2.30.2 > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > Daniel Vetter > > > > > > > > Software Engineer, Intel Corporation > > > > > > > > http://blog.ffwll.ch > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > Daniel Vetter > > > > > > Software Engineer, Intel Corporation > > > > > > http://blog.ffwll.ch > > > > > > > > -- > > > > Daniel Vetter > > > > Software Engineer, Intel Corporation > > > > http://blog.ffwll.ch > > > > -- > > Daniel Vetter > > Software Engineer, Intel Corporation > > http://blog.ffwll.ch -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ^ permalink raw reply [flat|nested] 23+ messages in thread
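[Editor's note: the alternative Daniel keeps proposing in this message — a worker that batches dirtyfb updates, coalescing same-fb damage into a single bounding box and only ever stalling when a different fb shows up — can be sketched in a few lines. This is a toy standalone model with invented names and no locking or worker plumbing, not a real implementation.]

```c
/*
 * Toy model of the batching scheme: dirtyfb calls for the currently
 * pending fb never block - their clip rects are merged into one bounding
 * box that a deferred worker would flush at a convenient time.  Only a
 * dirtyfb for a *different* fb asks the caller to wait for the pending
 * flush first (the defense-against-abuse case, expected to be rare).
 */
#include <assert.h>
#include <stdbool.h>

struct rect { int x1, y1, x2, y2; };

struct dirty_batch {
	int pending_fb;		/* 0 means nothing queued */
	struct rect bbox;	/* accumulated damage for pending_fb */
};

/* Grow *a to the bounding box of a and b (cheaper than a clip list). */
static void rect_union(struct rect *a, const struct rect *b)
{
	if (b->x1 < a->x1) a->x1 = b->x1;
	if (b->y1 < a->y1) a->y1 = b->y1;
	if (b->x2 > a->x2) a->x2 = b->x2;
	if (b->y2 > a->y2) a->y2 = b->y2;
}

/*
 * Queue damage; returns true if the caller must first wait for the
 * worker to flush the currently pending (different) fb.
 */
static bool batch_dirty(struct dirty_batch *b, int fb, struct rect clip)
{
	if (b->pending_fb && b->pending_fb != fb)
		return true;			/* different fb: flush first */
	if (!b->pending_fb) {
		b->pending_fb = fb;
		b->bbox = clip;
	} else {
		rect_union(&b->bbox, &clip);	/* same fb: coalesce, no stall */
	}
	return false;
}
```

A worker would then take `bbox`, reset `pending_fb` to 0, and do one atomic commit with the merged damage — so userspace like the modesetting ddx never waits for vblank in its BlockHandler.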
* Re: [PATCH 1/2] drm: Fix dirtyfb stalls 2021-05-11 16:44 ` Daniel Vetter 2021-05-11 17:19 ` Rob Clark @ 2021-05-12 8:23 ` Pekka Paalanen 2021-05-12 8:44 ` Daniel Vetter 2021-05-12 14:57 ` Rob Clark 1 sibling, 2 replies; 23+ messages in thread From: Pekka Paalanen @ 2021-05-12 8:23 UTC (permalink / raw) To: Daniel Vetter Cc: Rob Clark, Rob Clark, David Airlie, open list, dri-devel, Thomas Zimmermann [-- Attachment #1: Type: text/plain, Size: 6471 bytes --] On Tue, 11 May 2021 18:44:17 +0200 Daniel Vetter <daniel@ffwll.ch> wrote: > On Mon, May 10, 2021 at 12:06:05PM -0700, Rob Clark wrote: > > On Mon, May 10, 2021 at 10:44 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > On Mon, May 10, 2021 at 6:51 PM Rob Clark <robdclark@gmail.com> wrote: > > > > > > > > On Mon, May 10, 2021 at 9:14 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > > > On Sat, May 08, 2021 at 12:56:38PM -0700, Rob Clark wrote: > > > > > > From: Rob Clark <robdclark@chromium.org> > > > > > > > > > > > > drm_atomic_helper_dirtyfb() will end up stalling for vblank on "video > > > > > > mode" type displays, which is pointless and unnecessary. Add an > > > > > > optional helper vfunc to determine if a plane is attached to a CRTC > > > > > > that actually needs dirtyfb, and skip over them. > > > > > > > > > > > > Signed-off-by: Rob Clark <robdclark@chromium.org> > > > > > > > > > > So this is a bit annoying because the idea of all these "remap legacy uapi > > > > > to atomic constructs" helpers is that they shouldn't need/use anything > > > > > beyond what userspace also has available. So adding hacks for them feels > > > > > really bad. > > > > > > > > I suppose the root problem is that userspace doesn't know if dirtyfb > > > > (or similar) is actually required or is a no-op. 
> > > > > > > > But it is perhaps less of a problem because this essentially boils > > > > down to "x11 vs wayland", and it seems like wayland compositors for > > > > non-vsync'd rendering just pageflips and throws away extra frames from > > > > the app? > > > > > > Yeah it's about not adequately batching up rendering and syncing with > > > hw. bare metal x11 is just especially stupid about it :-) > > > > > > > > Also I feel like it's not entirely the right thing to do here either. > > > > > We've had this problem already on the fbcon emulation side (which also > > > > > shouldn't be able to peek behind the atomic kms uapi curtain), and the fix > > > > > there was to have a worker which batches up all the updates and avoids any > > > > > stalls in bad places. > > > > > > > > I'm not too worried about fbcon not being able to render faster than > > > > vblank. OTOH it is a pretty big problem for x11 > > > > > > That's why we'd let the worker get ahead at most one dirtyfb. We do > > > the same with fbcon, which trivially can get ahead of vblank otherwise > > > (if sometimes flushes each character, so you have to pile them up into > > > a single update if that's still pending). > > > > > > > > Since this is for frontbuffer rendering userspace only we can probably get > > > > > away with assuming there's only a single fb, so the implementation becomes > > > > > pretty simple: > > > > > > > > > > - 1 worker, and we keep track of a single pending fb > > > > > - if there's already a dirty fb pending on a different fb, we stall for > > > > > the worker to start processing that one already (i.e. 
the fb we track is > > > > > reset to NULL) > > > > > - if it's pending on the same fb we just toss away all the updates and go > > > > > with a full update, since merging the clip rects is too much work :-) I > > > > > think there's helpers so you could be slightly more clever and just have > > > > > an overall bounding box > > > > > > > > This doesn't really fix the problem, you still end up delaying sending > > > > the next back-buffer to mesa > > > > > > With this the dirtyfb would never block. Also glorious frontbuffer > > > tracking corruption is possible, but that's not the kernel's problem. > > > So how would anything get held up in userspace. > > > > the part about stalling if a dirtyfb is pending was what I was worried > > about.. but I suppose you meant the worker stalling, rather than > > userspace stalling (where I had interpreted it the other way around). > > As soon as userspace needs to stall, you're losing again. > > Nah, I did mean userspace stalling, so we can't pile up unlimited amounts > of dirtyfb request in the kernel. > > But also I never expect userspace that uses dirtyfb to actually hit this > stall point (otherwise we'd need to look at this again). It would really > be only there as defense against abuse. > > > > > But we could re-work drm_framebuffer_funcs::dirty to operate on a > > > > per-crtc basis and hoist the loop and check if dirtyfb is needed out > > > > of drm_atomic_helper_dirtyfb() > > > > > > That's still using information that userspace doesn't have, which is a > > > bit irky. We might as well go with your thing here then. > > > > arguably, this is something we should expose to userspace.. for DSI > > command-mode panels, you probably want to make a different decision > > with regard to how many buffers in your flip-chain.. > > > > Possibly we should add/remove the fb_damage_clips property depending > > on the display type (ie. video/pull vs cmd/push mode)? 
> > I'm not sure whether atomic actually needs this exposed:
> - clients will do full flips for every frame anyway, I've not heard of
> anyone seriously doing frontbuffer rendering.

That may or may not be changing, depending on whether the DRM drivers
will actually support tearing flips. There has been a huge amount of
debate about needing tearing for Wayland [1], and while I haven't really
joined that discussion, using front-buffer rendering (blits) to work
around the driver inability to flip-tear might be something some people
will want.

Personally, what I do agree with is that "tear if late from intended
vblank" is a feature that will be needed when VRR cannot be used.
However, I would also argue that multiple tearing updates per refresh
cycle is not a good idea, and I know people disagree with this because
practically all relevant games are using a naive main loop that makes
multi-tearing necessary for good input response.

I'm not quite sure where this leaves the KMS UAPI usage patterns. Maybe
this matters, maybe not? Does it make a difference between using legacy
DirtyFB vs. atomic FB_DAMAGE_CLIPS property?

Also mind that Wayland compositors would be dynamically switching
between "normal flips" and "tearing updates" depending on the
scenegraph. This switch should not be considered a "mode set".

[1] https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/65

Thanks,
pq
* Re: [PATCH 1/2] drm: Fix dirtyfb stalls 2021-05-12 8:23 ` Pekka Paalanen @ 2021-05-12 8:44 ` Daniel Vetter 2021-05-12 9:46 ` Pekka Paalanen 2021-05-12 14:57 ` Rob Clark 1 sibling, 1 reply; 23+ messages in thread From: Daniel Vetter @ 2021-05-12 8:44 UTC (permalink / raw) To: Pekka Paalanen Cc: Daniel Vetter, Rob Clark, Rob Clark, David Airlie, open list, dri-devel, Thomas Zimmermann On Wed, May 12, 2021 at 11:23:30AM +0300, Pekka Paalanen wrote: > On Tue, 11 May 2021 18:44:17 +0200 > Daniel Vetter <daniel@ffwll.ch> wrote: > > > On Mon, May 10, 2021 at 12:06:05PM -0700, Rob Clark wrote: > > > On Mon, May 10, 2021 at 10:44 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > On Mon, May 10, 2021 at 6:51 PM Rob Clark <robdclark@gmail.com> wrote: > > > > > > > > > > On Mon, May 10, 2021 at 9:14 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > > > > > On Sat, May 08, 2021 at 12:56:38PM -0700, Rob Clark wrote: > > > > > > > From: Rob Clark <robdclark@chromium.org> > > > > > > > > > > > > > > drm_atomic_helper_dirtyfb() will end up stalling for vblank on "video > > > > > > > mode" type displays, which is pointless and unnecessary. Add an > > > > > > > optional helper vfunc to determine if a plane is attached to a CRTC > > > > > > > that actually needs dirtyfb, and skip over them. > > > > > > > > > > > > > > Signed-off-by: Rob Clark <robdclark@chromium.org> > > > > > > > > > > > > So this is a bit annoying because the idea of all these "remap legacy uapi > > > > > > to atomic constructs" helpers is that they shouldn't need/use anything > > > > > > beyond what userspace also has available. So adding hacks for them feels > > > > > > really bad. > > > > > > > > > > I suppose the root problem is that userspace doesn't know if dirtyfb > > > > > (or similar) is actually required or is a no-op. 
> > > > > > > > > > But it is perhaps less of a problem because this essentially boils > > > > > down to "x11 vs wayland", and it seems like wayland compositors for > > > > > non-vsync'd rendering just pageflips and throws away extra frames from > > > > > the app? > > > > > > > > Yeah it's about not adequately batching up rendering and syncing with > > > > hw. bare metal x11 is just especially stupid about it :-) > > > > > > > > > > Also I feel like it's not entirely the right thing to do here either. > > > > > > We've had this problem already on the fbcon emulation side (which also > > > > > > shouldn't be able to peek behind the atomic kms uapi curtain), and the fix > > > > > > there was to have a worker which batches up all the updates and avoids any > > > > > > stalls in bad places. > > > > > > > > > > I'm not too worried about fbcon not being able to render faster than > > > > > vblank. OTOH it is a pretty big problem for x11 > > > > > > > > That's why we'd let the worker get ahead at most one dirtyfb. We do > > > > the same with fbcon, which trivially can get ahead of vblank otherwise > > > > (if sometimes flushes each character, so you have to pile them up into > > > > a single update if that's still pending). > > > > > > > > > > Since this is for frontbuffer rendering userspace only we can probably get > > > > > > away with assuming there's only a single fb, so the implementation becomes > > > > > > pretty simple: > > > > > > > > > > > > - 1 worker, and we keep track of a single pending fb > > > > > > - if there's already a dirty fb pending on a different fb, we stall for > > > > > > the worker to start processing that one already (i.e. 
the fb we track is > > > > > > reset to NULL) > > > > > > - if it's pending on the same fb we just toss away all the updates and go > > > > > > with a full update, since merging the clip rects is too much work :-) I > > > > > > think there's helpers so you could be slightly more clever and just have > > > > > > an overall bounding box > > > > > > > > > > This doesn't really fix the problem, you still end up delaying sending > > > > > the next back-buffer to mesa > > > > > > > > With this the dirtyfb would never block. Also glorious frontbuffer > > > > tracking corruption is possible, but that's not the kernel's problem. > > > > So how would anything get held up in userspace. > > > > > > the part about stalling if a dirtyfb is pending was what I was worried > > > about.. but I suppose you meant the worker stalling, rather than > > > userspace stalling (where I had interpreted it the other way around). > > > As soon as userspace needs to stall, you're losing again. > > > > Nah, I did mean userspace stalling, so we can't pile up unlimited amounts > > of dirtyfb request in the kernel. > > > > But also I never expect userspace that uses dirtyfb to actually hit this > > stall point (otherwise we'd need to look at this again). It would really > > be only there as defense against abuse. > > > > > > > But we could re-work drm_framebuffer_funcs::dirty to operate on a > > > > > per-crtc basis and hoist the loop and check if dirtyfb is needed out > > > > > of drm_atomic_helper_dirtyfb() > > > > > > > > That's still using information that userspace doesn't have, which is a > > > > bit irky. We might as well go with your thing here then. > > > > > > arguably, this is something we should expose to userspace.. for DSI > > > command-mode panels, you probably want to make a different decision > > > with regard to how many buffers in your flip-chain.. > > > > > > Possibly we should add/remove the fb_damage_clips property depending > > > on the display type (ie. 
video/pull vs cmd/push mode)? > > > > I'm not sure whether atomic actually needs this exposed: > > - clients will do full flips for every frame anyway, I've not heard of > > anyone seriously doing frontbuffer rendering. > > That may or may not be changing, depending on whether the DRM drivers > will actually support tearing flips. There has been a huge amount of > debate for needing tearing for Wayland [1], and while I haven't really > joined that discussion, using front-buffer rendering (blits) to work > around the driver inability to flip-tear might be something some people > will want. Uh pls dont, dirtyfb does a full atomic commit on atomic drivers underneath it. > Personally, what I do agree with is that "tear if late from intended > vblank" is a feature that will be needed when VRR cannot be used. > However, I would also argue that multiple tearing updates per refresh > cycle is not a good idea, and I know people disagree with this because > practically all relevant games are using a naive main loop that makes > multi-tearing necessary for good input response. > > I'm not quite sure where this leaves the KMS UAPI usage patterns. Maybe > this matters, maybe not? > > Does it make a difference between using legacy DirtyFB vs. atomic > FB_DAMAGE_CLIPS property? > > Also mind that Wayland compositors would be dynamically switching > between "normal flips" and "tearing updates" depending on the > scenegraph. This switch should not be considered a "mode set". > > [1] https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/65 I think what you want is two things: - some indication that frontbuffer rendering "works", for some value of that (which should probably be "doesn't require dirtyfb") - tearing flips support. This needs driver support If you don't have either, pls don't try to emulate something using frontbuffer rendering and dirtyfb, because that will make it really, really awkward for the kernel to know what exactly userspace wants to do. 
Overloading existing interfaces with new meaning just because we can, and because it happens to work on the one platform we tested, is really not a good idea. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ^ permalink raw reply [flat|nested] 23+ messages in thread
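[Editor's note: the batching worker Daniel sketches above — one worker, a single pending fb, merge repeated damage into a bounding box, and only stall a caller that dirties a *different* fb while one is pending — can be illustrated with the following standalone C sketch. This is not the kernel implementation; all names and the `struct clip` type are illustrative.]

```c
/* Hedged sketch of the proposed dirtyfb batching scheme: one worker,
 * one pending fb.  A second dirtyfb on the same fb collapses the clip
 * rects into a single bounding box (merging individual rects is "too
 * much work"); a dirtyfb on a different fb would make the caller wait
 * until the worker has taken the pending job. */
#include <stdbool.h>
#include <stddef.h>

struct clip { int x1, y1, x2, y2; };

struct dirtyfb_worker {
	void *pending_fb;	/* reset to NULL once the worker takes the job */
	struct clip bbox;	/* merged damage for pending_fb */
	bool have_bbox;
};

/* Returns true if the update was queued/merged; false means the caller
 * must stall until the worker picks up the currently pending fb. */
static bool queue_dirtyfb(struct dirtyfb_worker *w, void *fb,
			  const struct clip *c)
{
	if (w->pending_fb && w->pending_fb != fb)
		return false;	/* different fb pending: defend against abuse */

	w->pending_fb = fb;
	if (!w->have_bbox) {
		w->bbox = *c;
		w->have_bbox = true;
	} else {
		/* keep an overall bounding box instead of a rect list */
		if (c->x1 < w->bbox.x1) w->bbox.x1 = c->x1;
		if (c->y1 < w->bbox.y1) w->bbox.y1 = c->y1;
		if (c->x2 > w->bbox.x2) w->bbox.x2 = c->x2;
		if (c->y2 > w->bbox.y2) w->bbox.y2 = c->y2;
	}
	return true;
}
```

With this shape, userspace that uses dirtyfb normally never blocks; only piling up updates on a second fb would hit the stall path, matching the "defense against abuse" point above.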
* Re: [PATCH 1/2] drm: Fix dirtyfb stalls 2021-05-12 8:44 ` Daniel Vetter @ 2021-05-12 9:46 ` Pekka Paalanen 2021-05-12 10:35 ` Daniel Vetter 0 siblings, 1 reply; 23+ messages in thread From: Pekka Paalanen @ 2021-05-12 9:46 UTC (permalink / raw) To: Daniel Vetter Cc: Rob Clark, Rob Clark, David Airlie, open list, dri-devel, Thomas Zimmermann [-- Attachment #1: Type: text/plain, Size: 8683 bytes --] On Wed, 12 May 2021 10:44:29 +0200 Daniel Vetter <daniel@ffwll.ch> wrote: > On Wed, May 12, 2021 at 11:23:30AM +0300, Pekka Paalanen wrote: > > On Tue, 11 May 2021 18:44:17 +0200 > > Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > On Mon, May 10, 2021 at 12:06:05PM -0700, Rob Clark wrote: > > > > On Mon, May 10, 2021 at 10:44 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > > > On Mon, May 10, 2021 at 6:51 PM Rob Clark <robdclark@gmail.com> wrote: > > > > > > > > > > > > On Mon, May 10, 2021 at 9:14 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > > > > > > > On Sat, May 08, 2021 at 12:56:38PM -0700, Rob Clark wrote: > > > > > > > > From: Rob Clark <robdclark@chromium.org> > > > > > > > > > > > > > > > > drm_atomic_helper_dirtyfb() will end up stalling for vblank on "video > > > > > > > > mode" type displays, which is pointless and unnecessary. Add an > > > > > > > > optional helper vfunc to determine if a plane is attached to a CRTC > > > > > > > > that actually needs dirtyfb, and skip over them. > > > > > > > > > > > > > > > > Signed-off-by: Rob Clark <robdclark@chromium.org> > > > > > > > > > > > > > > So this is a bit annoying because the idea of all these "remap legacy uapi > > > > > > > to atomic constructs" helpers is that they shouldn't need/use anything > > > > > > > beyond what userspace also has available. So adding hacks for them feels > > > > > > > really bad. > > > > > > > > > > > > I suppose the root problem is that userspace doesn't know if dirtyfb > > > > > > (or similar) is actually required or is a no-op. 
> > > > > > > > > > > > But it is perhaps less of a problem because this essentially boils > > > > > > down to "x11 vs wayland", and it seems like wayland compositors for > > > > > > non-vsync'd rendering just pageflips and throws away extra frames from > > > > > > the app? > > > > > > > > > > Yeah it's about not adequately batching up rendering and syncing with > > > > > hw. bare metal x11 is just especially stupid about it :-) > > > > > > > > > > > > Also I feel like it's not entirely the right thing to do here either. > > > > > > > We've had this problem already on the fbcon emulation side (which also > > > > > > > shouldn't be able to peek behind the atomic kms uapi curtain), and the fix > > > > > > > there was to have a worker which batches up all the updates and avoids any > > > > > > > stalls in bad places. > > > > > > > > > > > > I'm not too worried about fbcon not being able to render faster than > > > > > > vblank. OTOH it is a pretty big problem for x11 > > > > > > > > > > That's why we'd let the worker get ahead at most one dirtyfb. We do > > > > > the same with fbcon, which trivially can get ahead of vblank otherwise > > > > > (if sometimes flushes each character, so you have to pile them up into > > > > > a single update if that's still pending). > > > > > > > > > > > > Since this is for frontbuffer rendering userspace only we can probably get > > > > > > > away with assuming there's only a single fb, so the implementation becomes > > > > > > > pretty simple: > > > > > > > > > > > > > > - 1 worker, and we keep track of a single pending fb > > > > > > > - if there's already a dirty fb pending on a different fb, we stall for > > > > > > > the worker to start processing that one already (i.e. 
the fb we track is > > > > > > > reset to NULL) > > > > > > > - if it's pending on the same fb we just toss away all the updates and go > > > > > > > with a full update, since merging the clip rects is too much work :-) I > > > > > > > think there's helpers so you could be slightly more clever and just have > > > > > > > an overall bounding box > > > > > > > > > > > > This doesn't really fix the problem, you still end up delaying sending > > > > > > the next back-buffer to mesa > > > > > > > > > > With this the dirtyfb would never block. Also glorious frontbuffer > > > > > tracking corruption is possible, but that's not the kernel's problem. > > > > > So how would anything get held up in userspace. > > > > > > > > the part about stalling if a dirtyfb is pending was what I was worried > > > > about.. but I suppose you meant the worker stalling, rather than > > > > userspace stalling (where I had interpreted it the other way around). > > > > As soon as userspace needs to stall, you're losing again. > > > > > > Nah, I did mean userspace stalling, so we can't pile up unlimited amounts > > > of dirtyfb request in the kernel. > > > > > > But also I never expect userspace that uses dirtyfb to actually hit this > > > stall point (otherwise we'd need to look at this again). It would really > > > be only there as defense against abuse. > > > > > > > > > But we could re-work drm_framebuffer_funcs::dirty to operate on a > > > > > > per-crtc basis and hoist the loop and check if dirtyfb is needed out > > > > > > of drm_atomic_helper_dirtyfb() > > > > > > > > > > That's still using information that userspace doesn't have, which is a > > > > > bit irky. We might as well go with your thing here then. > > > > > > > > arguably, this is something we should expose to userspace.. for DSI > > > > command-mode panels, you probably want to make a different decision > > > > with regard to how many buffers in your flip-chain.. 
> > > > > > > > Possibly we should add/remove the fb_damage_clips property depending > > > > on the display type (ie. video/pull vs cmd/push mode)? > > > > > > I'm not sure whether atomic actually needs this exposed: > > > - clients will do full flips for every frame anyway, I've not heard of > > > anyone seriously doing frontbuffer rendering. > > > > That may or may not be changing, depending on whether the DRM drivers > > will actually support tearing flips. There has been a huge amount of > > debate for needing tearing for Wayland [1], and while I haven't really > > joined that discussion, using front-buffer rendering (blits) to work > > around the driver inability to flip-tear might be something some people > > will want. > > Uh pls dont, dirtyfb does a full atomic commit on atomic drivers > underneath it. You keep saying dirtyfb, but I still didn't understand if you mean literally *only* the legacy DirtyFB ioctl, or does it include FB_DAMAGE_CLIPS in atomic too? I suppose you mean only the legacy ioctl. > > Personally, what I do agree with is that "tear if late from intended > > vblank" is a feature that will be needed when VRR cannot be used. > > However, I would also argue that multiple tearing updates per refresh > > cycle is not a good idea, and I know people disagree with this because > > practically all relevant games are using a naive main loop that makes > > multi-tearing necessary for good input response. > > > > I'm not quite sure where this leaves the KMS UAPI usage patterns. Maybe > > this matters, maybe not? > > > > Does it make a difference between using legacy DirtyFB vs. atomic > > FB_DAMAGE_CLIPS property? > > > > Also mind that Wayland compositors would be dynamically switching > > between "normal flips" and "tearing updates" depending on the > > scenegraph. This switch should not be considered a "mode set". 
> > > > [1] https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/65 > > I think what you want is two things: > - some indication that frontbuffer rendering "works", for some value of > that (which should probably be "doesn't require dirtyfb") > > - tearing flips support. This needs driver support A "tear if late" functionality in the kernel would be really nice too, but can probably be worked around with high resolution timers in userspace and just-in-time atomic tearing flips. Although those flips would need to be tearing always, because timers that close to vblank are going to race with vblank. > If you don't have either, pls don't try to emulate something using > frontbuffer rendering and dirtyfb, because that will make it really, > really awkward for the kernel to know what exactly userspace wants to do. > Overloading existing interfaces with new meaning just we can really > and it happens to work on the one platform we tested is really not a good > idea. Alright, I'll spread the word if I catch people trying that. I didn't even understand that using DirtyFB at all would put "new meaning" to it. I mean, if you do front-buffer rendering, you must use DirtyFB or FB_DAMAGE_CLIPS on atomic to make sure it actually goes anywhere, right? Thanks, pq [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 23+ messages in thread
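[Editor's note: Pekka's "tear if late from intended vblank" fallback, worked around in userspace with high-resolution timers as he suggests, boils down to a deadline check before each commit. The sketch below is purely illustrative userspace policy — the kernel offers no such mode today — and the decision to flag the flip as tearing (e.g. via `DRM_MODE_PAGE_FLIP_ASYNC`, on drivers that support it) is an assumption of this example.]

```c
/* Hedged sketch: decide just-in-time whether a flip must tear.  If the
 * commit can no longer land before the intended vblank, a vsynced flip
 * would slip a whole refresh cycle, so fall back to a tearing update.
 * All timing values are illustrative nanoseconds. */
#include <stdint.h>
#include <stdbool.h>

/* Returns true if the flip should be requested as tearing/async. */
static bool should_tear(uint64_t now_ns, uint64_t target_vblank_ns,
			uint64_t commit_latency_ns)
{
	return now_ns + commit_latency_ns > target_vblank_ns;
}
```

As noted above, a timer firing this close to vblank races with vblank itself, which is why such just-in-time flips would need to be requested as tearing unconditionally.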
* Re: [PATCH 1/2] drm: Fix dirtyfb stalls 2021-05-12 9:46 ` Pekka Paalanen @ 2021-05-12 10:35 ` Daniel Vetter 0 siblings, 0 replies; 23+ messages in thread From: Daniel Vetter @ 2021-05-12 10:35 UTC (permalink / raw) To: Pekka Paalanen Cc: Rob Clark, Rob Clark, David Airlie, open list, dri-devel, Thomas Zimmermann On Wed, May 12, 2021 at 11:46 AM Pekka Paalanen <ppaalanen@gmail.com> wrote: > > On Wed, 12 May 2021 10:44:29 +0200 > Daniel Vetter <daniel@ffwll.ch> wrote: > > > On Wed, May 12, 2021 at 11:23:30AM +0300, Pekka Paalanen wrote: > > > On Tue, 11 May 2021 18:44:17 +0200 > > > Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > On Mon, May 10, 2021 at 12:06:05PM -0700, Rob Clark wrote: > > > > > On Mon, May 10, 2021 at 10:44 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > > > > > On Mon, May 10, 2021 at 6:51 PM Rob Clark <robdclark@gmail.com> wrote: > > > > > > > > > > > > > > On Mon, May 10, 2021 at 9:14 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > > > > > > > > > On Sat, May 08, 2021 at 12:56:38PM -0700, Rob Clark wrote: > > > > > > > > > From: Rob Clark <robdclark@chromium.org> > > > > > > > > > > > > > > > > > > drm_atomic_helper_dirtyfb() will end up stalling for vblank on "video > > > > > > > > > mode" type displays, which is pointless and unnecessary. Add an > > > > > > > > > optional helper vfunc to determine if a plane is attached to a CRTC > > > > > > > > > that actually needs dirtyfb, and skip over them. > > > > > > > > > > > > > > > > > > Signed-off-by: Rob Clark <robdclark@chromium.org> > > > > > > > > > > > > > > > > So this is a bit annoying because the idea of all these "remap legacy uapi > > > > > > > > to atomic constructs" helpers is that they shouldn't need/use anything > > > > > > > > beyond what userspace also has available. So adding hacks for them feels > > > > > > > > really bad. 
> > > > > > > > > > > > > > I suppose the root problem is that userspace doesn't know if dirtyfb > > > > > > > (or similar) is actually required or is a no-op. > > > > > > > > > > > > > > But it is perhaps less of a problem because this essentially boils > > > > > > > down to "x11 vs wayland", and it seems like wayland compositors for > > > > > > > non-vsync'd rendering just pageflips and throws away extra frames from > > > > > > > the app? > > > > > > > > > > > > Yeah it's about not adequately batching up rendering and syncing with > > > > > > hw. bare metal x11 is just especially stupid about it :-) > > > > > > > > > > > > > > Also I feel like it's not entirely the right thing to do here either. > > > > > > > > We've had this problem already on the fbcon emulation side (which also > > > > > > > > shouldn't be able to peek behind the atomic kms uapi curtain), and the fix > > > > > > > > there was to have a worker which batches up all the updates and avoids any > > > > > > > > stalls in bad places. > > > > > > > > > > > > > > I'm not too worried about fbcon not being able to render faster than > > > > > > > vblank. OTOH it is a pretty big problem for x11 > > > > > > > > > > > > That's why we'd let the worker get ahead at most one dirtyfb. We do > > > > > > the same with fbcon, which trivially can get ahead of vblank otherwise > > > > > > (if sometimes flushes each character, so you have to pile them up into > > > > > > a single update if that's still pending). > > > > > > > > > > > > > > Since this is for frontbuffer rendering userspace only we can probably get > > > > > > > > away with assuming there's only a single fb, so the implementation becomes > > > > > > > > pretty simple: > > > > > > > > > > > > > > > > - 1 worker, and we keep track of a single pending fb > > > > > > > > - if there's already a dirty fb pending on a different fb, we stall for > > > > > > > > the worker to start processing that one already (i.e. 
the fb we track is > > > > > > > > reset to NULL) > > > > > > > > - if it's pending on the same fb we just toss away all the updates and go > > > > > > > > with a full update, since merging the clip rects is too much work :-) I > > > > > > > > think there's helpers so you could be slightly more clever and just have > > > > > > > > an overall bounding box > > > > > > > > > > > > > > This doesn't really fix the problem, you still end up delaying sending > > > > > > > the next back-buffer to mesa > > > > > > > > > > > > With this the dirtyfb would never block. Also glorious frontbuffer > > > > > > tracking corruption is possible, but that's not the kernel's problem. > > > > > > So how would anything get held up in userspace. > > > > > > > > > > the part about stalling if a dirtyfb is pending was what I was worried > > > > > about.. but I suppose you meant the worker stalling, rather than > > > > > userspace stalling (where I had interpreted it the other way around). > > > > > As soon as userspace needs to stall, you're losing again. > > > > > > > > Nah, I did mean userspace stalling, so we can't pile up unlimited amounts > > > > of dirtyfb request in the kernel. > > > > > > > > But also I never expect userspace that uses dirtyfb to actually hit this > > > > stall point (otherwise we'd need to look at this again). It would really > > > > be only there as defense against abuse. > > > > > > > > > > > But we could re-work drm_framebuffer_funcs::dirty to operate on a > > > > > > > per-crtc basis and hoist the loop and check if dirtyfb is needed out > > > > > > > of drm_atomic_helper_dirtyfb() > > > > > > > > > > > > That's still using information that userspace doesn't have, which is a > > > > > > bit irky. We might as well go with your thing here then. > > > > > > > > > > arguably, this is something we should expose to userspace.. 
for DSI > > > > > command-mode panels, you probably want to make a different decision > > > > > with regard to how many buffers in your flip-chain.. > > > > > > > > > > Possibly we should add/remove the fb_damage_clips property depending > > > > > on the display type (ie. video/pull vs cmd/push mode)? > > > > > > > > I'm not sure whether atomic actually needs this exposed: > > > > - clients will do full flips for every frame anyway, I've not heard of > > > > anyone seriously doing frontbuffer rendering. > > > > > > That may or may not be changing, depending on whether the DRM drivers > > > will actually support tearing flips. There has been a huge amount of > > > debate for needing tearing for Wayland [1], and while I haven't really > > > joined that discussion, using front-buffer rendering (blits) to work > > > around the driver inability to flip-tear might be something some people > > > will want. > > > > Uh pls dont, dirtyfb does a full atomic commit on atomic drivers > > underneath it. > > You keep saying dirtyfb, but I still didn't understand if you mean > literally *only* the legacy DirtyFB ioctl, or does it include > FB_DAMAGE_CLIPS in atomic too? > > I suppose you mean only the legacy ioctl. Only the legacy DIRTYFB ioctl. FB_DAMAGE_CLIPS is all solid I think. > > > Personally, what I do agree with is that "tear if late from intended > > > vblank" is a feature that will be needed when VRR cannot be used. > > > However, I would also argue that multiple tearing updates per refresh > > > cycle is not a good idea, and I know people disagree with this because > > > practically all relevant games are using a naive main loop that makes > > > multi-tearing necessary for good input response. > > > > > > I'm not quite sure where this leaves the KMS UAPI usage patterns. Maybe > > > this matters, maybe not? > > > > > > Does it make a difference between using legacy DirtyFB vs. atomic > > > FB_DAMAGE_CLIPS property? 
> > > > > > Also mind that Wayland compositors would be dynamically switching > > > between "normal flips" and "tearing updates" depending on the > > > scenegraph. This switch should not be considered a "mode set". > > > > > > [1] https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/65 > > > > I think what you want is two things: > > - some indication that frontbuffer rendering "works", for some value of > > that (which should probably be "doesn't require dirtyfb") > > > > - tearing flips support. This needs driver support > > A "tear if late" functionality in the kernel would be really nice too, > but can probably be worked around with high resolution timers in > userspace and just-in-time atomic tearing flips. Although those flips > would need to be tearing always, because timers that close to vblank are > going to race with vblank. > > > If you don't have either, pls don't try to emulate something using > > frontbuffer rendering and dirtyfb, because that will make it really, > > really awkward for the kernel to know what exactly userspace wants to do. > > Overloading existing interfaces with new meaning just we can really > > and it happens to work on the one platform we tested is really not a good > > idea. > > Alright, I'll spread the word if I catch people trying that. > > I didn't even understand that using DirtyFB at all would put "new > meaning" to it. I mean, if you do front-buffer rendering, you must use > DirtyFB or FB_DAMAGE_CLIPS on atomic to make sure it actually goes > anywhere, right? TBH I'd do FB_DAMAGE_CLIPS with atomic ioctl and the same fb. Also maybe userspace wants to better understand what exactly happens for frontbuffer tracking in this case too. The issue with DIRTYFB ioctl like with all the legacy ioctls is that it's very undefined how nonblocking and how async/tearing they are, and there's no completion event userspace could use to properly stall when it gets ahead too much. 
Any additional use we pile on top of them just makes this even more awkward for the kernel to do in a way that doesn't upset some userspace somewhere, while still trying to be as consistent across drivers as possible (ideally using one code path to remap to an atomic op in the same way for all drivers). Properly definied atomic properties and the exact semantics userspace expects is imo much better than "hey calling this ioctl gets the job done on my driver, let's just use that". If there's something missing in the atomic kms uapi, we need to add it properly. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ^ permalink raw reply [flat|nested] 23+ messages in thread
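[Editor's note: Daniel's recommendation — re-commit the *same* fb through the atomic ioctl with a FB_DAMAGE_CLIPS blob, instead of the legacy DIRTYFB ioctl — can be sketched as below. `struct drm_mode_rect` mirrors the uapi layout in `drm_mode.h`; the libdrm calls named in the comments (`drmModeCreatePropertyBlob`, `drmModeAtomicAddProperty`, `drmModeAtomicCommit`) are shown for orientation only and are not exercised here, since they need a real device.]

```c
/* Hedged sketch of building FB_DAMAGE_CLIPS data for an atomic commit.
 * A frontbuffer-rendering client would pass the filled array to
 * drmModeCreatePropertyBlob(), set the resulting blob id on the plane's
 * FB_DAMAGE_CLIPS property with drmModeAtomicAddProperty(), and flush
 * with drmModeAtomicCommit() while keeping the same fb attached. */
#include <stdint.h>
#include <stddef.h>

struct drm_mode_rect {		/* matches the uapi layout: 4 x int32 */
	int32_t x1, y1, x2, y2;
};

/* Copy up to 'max' damage rects (given as {x1,y1,x2,y2} quads) into the
 * blob buffer; returns the byte size to pass to the blob-create ioctl. */
static size_t fill_damage(struct drm_mode_rect *out, size_t max,
			  int rects[][4], size_t n)
{
	size_t i;

	for (i = 0; i < n && i < max; i++) {
		out[i].x1 = rects[i][0];
		out[i].y1 = rects[i][1];
		out[i].x2 = rects[i][2];
		out[i].y2 = rects[i][3];
	}
	return i * sizeof(*out);
}
```

Unlike the legacy ioctl, the atomic path gives userspace a well-defined completion event to stall on, which is exactly the property Daniel's argument above hinges on.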
* Re: [PATCH 1/2] drm: Fix dirtyfb stalls 2021-05-12 8:23 ` Pekka Paalanen 2021-05-12 8:44 ` Daniel Vetter @ 2021-05-12 14:57 ` Rob Clark 2021-05-14 7:54 ` Pekka Paalanen 1 sibling, 1 reply; 23+ messages in thread From: Rob Clark @ 2021-05-12 14:57 UTC (permalink / raw) To: Pekka Paalanen Cc: Daniel Vetter, Rob Clark, David Airlie, open list, dri-devel, Thomas Zimmermann On Wed, May 12, 2021 at 1:23 AM Pekka Paalanen <ppaalanen@gmail.com> wrote: > > On Tue, 11 May 2021 18:44:17 +0200 > Daniel Vetter <daniel@ffwll.ch> wrote: > > > On Mon, May 10, 2021 at 12:06:05PM -0700, Rob Clark wrote: > > > On Mon, May 10, 2021 at 10:44 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > On Mon, May 10, 2021 at 6:51 PM Rob Clark <robdclark@gmail.com> wrote: > > > > > > > > > > On Mon, May 10, 2021 at 9:14 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > > > > > On Sat, May 08, 2021 at 12:56:38PM -0700, Rob Clark wrote: > > > > > > > From: Rob Clark <robdclark@chromium.org> > > > > > > > > > > > > > > drm_atomic_helper_dirtyfb() will end up stalling for vblank on "video > > > > > > > mode" type displays, which is pointless and unnecessary. Add an > > > > > > > optional helper vfunc to determine if a plane is attached to a CRTC > > > > > > > that actually needs dirtyfb, and skip over them. > > > > > > > > > > > > > > Signed-off-by: Rob Clark <robdclark@chromium.org> > > > > > > > > > > > > So this is a bit annoying because the idea of all these "remap legacy uapi > > > > > > to atomic constructs" helpers is that they shouldn't need/use anything > > > > > > beyond what userspace also has available. So adding hacks for them feels > > > > > > really bad. > > > > > > > > > > I suppose the root problem is that userspace doesn't know if dirtyfb > > > > > (or similar) is actually required or is a no-op. 
> > > > > > > > > > But it is perhaps less of a problem because this essentially boils > > > > > down to "x11 vs wayland", and it seems like wayland compositors for > > > > > non-vsync'd rendering just pageflips and throws away extra frames from > > > > > the app? > > > > > > > > Yeah it's about not adequately batching up rendering and syncing with > > > > hw. bare metal x11 is just especially stupid about it :-) > > > > > > > > > > Also I feel like it's not entirely the right thing to do here either. > > > > > > We've had this problem already on the fbcon emulation side (which also > > > > > > shouldn't be able to peek behind the atomic kms uapi curtain), and the fix > > > > > > there was to have a worker which batches up all the updates and avoids any > > > > > > stalls in bad places. > > > > > > > > > > I'm not too worried about fbcon not being able to render faster than > > > > > vblank. OTOH it is a pretty big problem for x11 > > > > > > > > That's why we'd let the worker get ahead at most one dirtyfb. We do > > > > the same with fbcon, which trivially can get ahead of vblank otherwise > > > > (if sometimes flushes each character, so you have to pile them up into > > > > a single update if that's still pending). > > > > > > > > > > Since this is for frontbuffer rendering userspace only we can probably get > > > > > > away with assuming there's only a single fb, so the implementation becomes > > > > > > pretty simple: > > > > > > > > > > > > - 1 worker, and we keep track of a single pending fb > > > > > > - if there's already a dirty fb pending on a different fb, we stall for > > > > > > the worker to start processing that one already (i.e. 
the fb we track is > > > > > > reset to NULL) > > > > > > - if it's pending on the same fb we just toss away all the updates and go > > > > > > with a full update, since merging the clip rects is too much work :-) I > > > > > > think there's helpers so you could be slightly more clever and just have > > > > > > an overall bounding box > > > > > > > > > > This doesn't really fix the problem, you still end up delaying sending > > > > > the next back-buffer to mesa > > > > > > > > With this the dirtyfb would never block. Also glorious frontbuffer > > > > tracking corruption is possible, but that's not the kernel's problem. > > > > So how would anything get held up in userspace. > > > > > > the part about stalling if a dirtyfb is pending was what I was worried > > > about.. but I suppose you meant the worker stalling, rather than > > > userspace stalling (where I had interpreted it the other way around). > > > As soon as userspace needs to stall, you're losing again. > > > > Nah, I did mean userspace stalling, so we can't pile up unlimited amounts > > of dirtyfb request in the kernel. > > > > But also I never expect userspace that uses dirtyfb to actually hit this > > stall point (otherwise we'd need to look at this again). It would really > > be only there as defense against abuse. > > > > > > > But we could re-work drm_framebuffer_funcs::dirty to operate on a > > > > > per-crtc basis and hoist the loop and check if dirtyfb is needed out > > > > > of drm_atomic_helper_dirtyfb() > > > > > > > > That's still using information that userspace doesn't have, which is a > > > > bit irky. We might as well go with your thing here then. > > > > > > arguably, this is something we should expose to userspace.. for DSI > > > command-mode panels, you probably want to make a different decision > > > with regard to how many buffers in your flip-chain.. > > > > > > Possibly we should add/remove the fb_damage_clips property depending > > > on the display type (ie. 
video/pull vs cmd/push mode)? > > > > I'm not sure whether atomic actually needs this exposed: > > - clients will do full flips for every frame anyway, I've not heard of > > anyone seriously doing frontbuffer rendering. > > That may or may not be changing, depending on whether the DRM drivers > will actually support tearing flips. There has been a huge amount of > debate for needing tearing for Wayland [1], and while I haven't really > joined that discussion, using front-buffer rendering (blits) to work > around the driver inability to flip-tear might be something some people > will want. jfwiw, there is a lot of hw that just can't do tearing pageflips.. I think this probably includes most arm hw. What is done instead is to skip the pageflip and render directly to the front-buffer. EGL_KHR_mutable_render_buffer is a thing you might be interested in.. it is wired up for android on i965 and there is a WIP MR[1] for mesa/st (gallium): Possibly it could be useful to add support for platform_wayland? [1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10685 BR, -R > Personally, what I do agree with is that "tear if late from intended > vblank" is a feature that will be needed when VRR cannot be used. > However, I would also argue that multiple tearing updates per refresh > cycle is not a good idea, and I know people disagree with this because > practically all relevant games are using a naive main loop that makes > multi-tearing necessary for good input response. > > I'm not quite sure where this leaves the KMS UAPI usage patterns. Maybe > this matters, maybe not? > > Does it make a difference between using legacy DirtyFB vs. atomic > FB_DAMAGE_CLIPS property? > > Also mind that Wayland compositors would be dynamically switching > between "normal flips" and "tearing updates" depending on the > scenegraph. This switch should not be considered a "mode set". 
> > [1] https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/65 > > > Thanks, > pq ^ permalink raw reply [flat|nested] 23+ messages in thread
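[Editor's note: the bounding-box fallback Daniel suggests earlier in this exchange — when merging individual clip rects is too much work, collapse all pending damage into one rectangle — can be sketched roughly as below. These are hypothetical stand-in types, not the real struct drm_rect or any actual DRM helper.]

```c
#include <assert.h>

/* Hypothetical stand-in for a damage clip rectangle:
 * x1/y1 inclusive, x2/y2 exclusive, as DRM damage clips use. */
struct rect {
	int x1, y1, x2, y2;
};

/* Merge an array of pending damage clips into a single bounding box,
 * as a cheap alternative to tracking every clip individually. */
static struct rect bounding_box(const struct rect *clips, int n)
{
	struct rect bb = clips[0];

	for (int i = 1; i < n; i++) {
		if (clips[i].x1 < bb.x1) bb.x1 = clips[i].x1;
		if (clips[i].y1 < bb.y1) bb.y1 = clips[i].y1;
		if (clips[i].x2 > bb.x2) bb.x2 = clips[i].x2;
		if (clips[i].y2 > bb.y2) bb.y2 = clips[i].y2;
	}
	return bb;
}
```

The trade-off is over-invalidation: the box may cover pixels no clip touched, but a pending-damage tracker only ever has to remember one rectangle per fb.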
* Re: [PATCH 1/2] drm: Fix dirtyfb stalls 2021-05-12 14:57 ` Rob Clark @ 2021-05-14 7:54 ` Pekka Paalanen 2021-05-14 14:43 ` Rob Clark 0 siblings, 1 reply; 23+ messages in thread From: Pekka Paalanen @ 2021-05-14 7:54 UTC (permalink / raw) To: Rob Clark Cc: Daniel Vetter, Rob Clark, David Airlie, open list, dri-devel, Thomas Zimmermann [-- Attachment #1: Type: text/plain, Size: 6161 bytes --] On Wed, 12 May 2021 07:57:26 -0700 Rob Clark <robdclark@gmail.com> wrote: > On Wed, May 12, 2021 at 1:23 AM Pekka Paalanen <ppaalanen@gmail.com> wrote: > > > > On Tue, 11 May 2021 18:44:17 +0200 > > Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > On Mon, May 10, 2021 at 12:06:05PM -0700, Rob Clark wrote: > > > > On Mon, May 10, 2021 at 10:44 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > > > On Mon, May 10, 2021 at 6:51 PM Rob Clark <robdclark@gmail.com> wrote: > > > > > > > > > > > > On Mon, May 10, 2021 at 9:14 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > > > > > > > On Sat, May 08, 2021 at 12:56:38PM -0700, Rob Clark wrote: > > > > > > > > From: Rob Clark <robdclark@chromium.org> > > > > > > > > > > > > > > > > drm_atomic_helper_dirtyfb() will end up stalling for vblank on "video > > > > > > > > mode" type displays, which is pointless and unnecessary. Add an > > > > > > > > optional helper vfunc to determine if a plane is attached to a CRTC > > > > > > > > that actually needs dirtyfb, and skip over them. > > > > > > > > > > > > > > > > Signed-off-by: Rob Clark <robdclark@chromium.org> ... > > > > > > But we could re-work drm_framebuffer_funcs::dirty to operate on a > > > > > > per-crtc basis and hoist the loop and check if dirtyfb is needed out > > > > > > of drm_atomic_helper_dirtyfb() > > > > > > > > > > That's still using information that userspace doesn't have, which is a > > > > > bit irky. We might as well go with your thing here then. > > > > > > > > arguably, this is something we should expose to userspace.. 
for DSI > > > > command-mode panels, you probably want to make a different decision > > > > with regard to how many buffers in your flip-chain.. > > > > > > > > Possibly we should add/remove the fb_damage_clips property depending > > > > on the display type (ie. video/pull vs cmd/push mode)? > > > > > > I'm not sure whether atomic actually needs this exposed: > > > - clients will do full flips for every frame anyway, I've not heard of > > > anyone seriously doing frontbuffer rendering. > > > > That may or may not be changing, depending on whether the DRM drivers > > will actually support tearing flips. There has been a huge amount of > > debate for needing tearing for Wayland [1], and while I haven't really > > joined that discussion, using front-buffer rendering (blits) to work > > around the driver inability to flip-tear might be something some people > > will want. > > jfwiw, there is a lot of hw that just can't do tearing pageflips.. I > think this probably includes most arm hw. What is done instead is to > skip the pageflip and render directly to the front-buffer. > > EGL_KHR_mutable_render_buffer is a thing you might be interested in.. > it is wired up for android on i965 and there is a WIP MR[1] for > mesa/st (gallium): > > Possibly it could be useful to add support for platform_wayland? > > [1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10685 Thanks Rob, that's interesting. I would like to say that EGL Wayland platform cannot and has no reason to support frontbuffer rendering in Wayland clients, because the compositor may be reading the current client frontbuffer at any time, including *not reading it again* until another update is posted via Wayland. So if a Wayland client is doing frontbuffer rendering, then I would expect it to be very likely that the window might almost never show a "good" picture, meaning that it is literally just the half-rendered frame after the current one with continuously updating clients. 
That is because a Wayland client doing frontbuffer rendering is completely unrelated to the Wayland compositor putting the client buffer on scanout. Frontbuffer rendering used by a Wayland compositor would be used for fallback tearing updates, where the rendering is roughly just a blit from a client buffer. By definition, it means blit instead of scanout from client buffers, so the performance/power hit is going to be... let's say observable. If a Wayland client did frontbuffer rendering, and then it used a shadow buffer of its own to minimise the level of garbage on screen by doing only blits into the frontbuffer, that would again be a blit. And then the compositor might be doing another blit because it doesn't know the client is doing frontbuffer rendering which would theoretically allow the compositor to scan out the client buffer. There emerges the need for a Wayland extension for clients to be telling the compositor explicitly that they are going to be doing frontbuffer rendering now, it is ok to put the client buffer on scanout even if you want to do tearing updates on hardware that cannot tear-flip. However, knowing that a client buffer is good for scanout is not sufficient for scanout in a compositor, so it might still not be scanned out. If the compositor is instead render-compositing, you have the first problem of "likely never a good picture". I'm sure there can be specialised use cases (e.g. a game console Wayland compositor) where scanout can be guaranteed, but generally for desktops it's not so. I believe there will be people wanting EGL Wayland platform frontbuffer rendering for very special cases, and I also believe it will just break horribly everywhere else. Would it be worth it to implement? No idea. Maybe there would need to be a Wayland extension so that compositors can control the availability of frontbuffer rendering in EGL Wayland platform? 
There is the dmabuf-hints Wayland addition that is aimed at dynamically providing information to help optimise for scanout and render-compositing. If a compositor could control frontbuffer rendering in a client, it could tell the client to use frontbuffer rendering when it does hit scanout, and tell the client to stop frontbuffer rendering when scanout is no longer possible. The problem with the latter is a glitch, but since frontbuffer rendering is by definition glitchy (when done in clients), maybe that is acceptable to some? Thanks, pq [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/2] drm: Fix dirtyfb stalls 2021-05-14 7:54 ` Pekka Paalanen @ 2021-05-14 14:43 ` Rob Clark 0 siblings, 0 replies; 23+ messages in thread From: Rob Clark @ 2021-05-14 14:43 UTC (permalink / raw) To: Pekka Paalanen Cc: Daniel Vetter, Rob Clark, David Airlie, open list, dri-devel, Thomas Zimmermann On Fri, May 14, 2021 at 12:54 AM Pekka Paalanen <ppaalanen@gmail.com> wrote: > > On Wed, 12 May 2021 07:57:26 -0700 > Rob Clark <robdclark@gmail.com> wrote: > > > On Wed, May 12, 2021 at 1:23 AM Pekka Paalanen <ppaalanen@gmail.com> wrote: > > > > > > On Tue, 11 May 2021 18:44:17 +0200 > > > Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > On Mon, May 10, 2021 at 12:06:05PM -0700, Rob Clark wrote: > > > > > On Mon, May 10, 2021 at 10:44 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > > > > > On Mon, May 10, 2021 at 6:51 PM Rob Clark <robdclark@gmail.com> wrote: > > > > > > > > > > > > > > On Mon, May 10, 2021 at 9:14 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > > > > > > > > > On Sat, May 08, 2021 at 12:56:38PM -0700, Rob Clark wrote: > > > > > > > > > From: Rob Clark <robdclark@chromium.org> > > > > > > > > > > > > > > > > > > drm_atomic_helper_dirtyfb() will end up stalling for vblank on "video > > > > > > > > > mode" type displays, which is pointless and unnecessary. Add an > > > > > > > > > optional helper vfunc to determine if a plane is attached to a CRTC > > > > > > > > > that actually needs dirtyfb, and skip over them. > > > > > > > > > > > > > > > > > > Signed-off-by: Rob Clark <robdclark@chromium.org> > > ... > > > > > > > > But we could re-work drm_framebuffer_funcs::dirty to operate on a > > > > > > > per-crtc basis and hoist the loop and check if dirtyfb is needed out > > > > > > > of drm_atomic_helper_dirtyfb() > > > > > > > > > > > > That's still using information that userspace doesn't have, which is a > > > > > > bit irky. We might as well go with your thing here then. 
> > > > > > > > > > arguably, this is something we should expose to userspace.. for DSI > > > > > command-mode panels, you probably want to make a different decision > > > > > with regard to how many buffers in your flip-chain.. > > > > > > > > > > Possibly we should add/remove the fb_damage_clips property depending > > > > > on the display type (ie. video/pull vs cmd/push mode)? > > > > > > > > I'm not sure whether atomic actually needs this exposed: > > > > - clients will do full flips for every frame anyway, I've not heard of > > > > anyone seriously doing frontbuffer rendering. > > > > > > That may or may not be changing, depending on whether the DRM drivers > > > will actually support tearing flips. There has been a huge amount of > > > debate for needing tearing for Wayland [1], and while I haven't really > > > joined that discussion, using front-buffer rendering (blits) to work > > > around the driver inability to flip-tear might be something some people > > > will want. > > > > jfwiw, there is a lot of hw that just can't do tearing pageflips.. I > > think this probably includes most arm hw. What is done instead is to > > skip the pageflip and render directly to the front-buffer. > > > > EGL_KHR_mutable_render_buffer is a thing you might be interested in.. > > it is wired up for android on i965 and there is a WIP MR[1] for > > mesa/st (gallium): > > > > Possibly it could be useful to add support for platform_wayland? > > > > [1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10685 > > Thanks Rob, that's interesting. > > I would like to say that EGL Wayland platform cannot and has no reason > to support frontbuffer rendering in Wayland clients, because the > compositor may be reading the current client frontbuffer at any time, > including *not reading it again* until another update is posted via > Wayland. 
So if a Wayland client is doing frontbuffer rendering, then I > would expect it to be very likely that the window might almost never > show a "good" picture, meaning that it is literally just the > half-rendered frame after the current one with continuously updating > clients. > > That is because a Wayland client doing frontbuffer rendering is > completely unrelated to the Wayland compositor putting the client > buffer on scanout. > > Frontbuffer rendering used by a Wayland compositor would be used for > fallback tearing updates, where the rendering is roughly just a blit > from a client buffer. By definition, it means blit instead of scanout > from client buffers, so the performance/power hit is going to be... > let's say observable. > > If a Wayland client did frontbuffer rendering, and then it used a > shadow buffer of its own to minimise the level of garbage on screen by > doing only blits into the frontbuffer, that would again be a blit. And > then the compositor might be doing another blit because it doesn't know > the client is doing frontbuffer rendering which would theoretically > allow the compositor to scan out the client buffer. > > There emerges the need for a Wayland extension for clients to be > telling the compositor explicitly that they are going to be doing > frontbuffer rendering now, it is ok to put the client buffer on scanout > even if you want to do tearing updates on hardware that cannot > tear-flip. > > However, knowing that a client buffer is good for scanout is not > sufficient for scanout in a compositor, so it might still not be > scanned out. If the compositor is instead render-compositing, you have > the first problem of "likely never a good picture". I think if a client is doing front-buffer rendering, then "tearing" is the clients problem. The super-big-deal use-case for this is stylus, because you want to minimize the latency of pen-to-pixel.. 
tearing isn't really a problem because things aren't changing much from-by-frame I'm going to predict there will be at least one wayland compositor supporting this, maybe via custom protocol, idk. ;-) BR, -R > I'm sure there can be specialised use cases (e.g. a game console > Wayland compositor) where scanout can be guaranteed, but generally > for desktops it's not so. > > I believe there will be people wanting EGL Wayland platform frontbuffer > rendering for very special cases, and I also believe it will just break > horribly everywhere else. Would it be worth it to implement? No idea. > > Maybe there would need to be a Wayland extension so that compositors > can control the availability of frontbuffer rendering in EGL Wayland > platform? > > There is the dmabuf-hints Wayland addition that is aimed at dynamically > providing information to help optimise for scanout and > render-compositing. If a compositor could control frontbuffer rendering > in a client, it could tell the client to use frontbuffer rendering when > it does hit scanout, and tell the client to stop frontbuffer rendering > when scanout is no longer possible. The problem with the latter is a > glitch, but since frontbuffer rendering is by definition glitchy (when > done in clients), maybe that is acceptable to some? > > > Thanks, > pq ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 2/2] drm/msm/dpu: Wire up needs_dirtyfb 2021-05-08 19:56 [PATCH 0/2] drm: Fix atomic helper dirtyfb stalls Rob Clark 2021-05-08 19:56 ` [PATCH 1/2] drm: Fix " Rob Clark @ 2021-05-08 19:56 ` Rob Clark 2021-05-09 15:38 ` Tested houdek.ryan 2021-05-09 16:28 ` [PATCH 2/2] drm/msm/dpu: Wire up needs_dirtyfb Rob Clark 1 sibling, 2 replies; 23+ messages in thread From: Rob Clark @ 2021-05-08 19:56 UTC (permalink / raw) To: dri-devel Cc: Rob Clark, Rob Clark, Sean Paul, David Airlie, Daniel Vetter, Abhinav Kumar, Stephen Boyd, Maxime Ripard, Kalyan Thota, Qinglang Miao, Hongbo Yao, open list:DRM DRIVER FOR MSM ADRENO GPU, open list:DRM DRIVER FOR MSM ADRENO GPU, open list From: Rob Clark <robdclark@chromium.org> Signed-off-by: Rob Clark <robdclark@chromium.org> --- drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c index 5a74f93e29da..868ee6136438 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c @@ -143,6 +143,19 @@ static bool dpu_crtc_get_scanout_position(struct drm_crtc *crtc, return true; } +static bool dpu_crtc_needs_dirtyfb(struct drm_crtc *crtc) +{ + struct drm_encoder *encoder; + + drm_for_each_encoder_mask (encoder, crtc->dev, crtc->state->encoder_mask) { + if (dpu_encoder_get_intf_mode(encoder) == INTF_MODE_CMD) { + return true; + } + } + + return false; +} + static void _dpu_crtc_setup_blend_cfg(struct dpu_crtc_mixer *mixer, struct dpu_plane_state *pstate, struct dpu_format *format) { @@ -1343,6 +1356,7 @@ static const struct drm_crtc_helper_funcs dpu_crtc_helper_funcs = { .atomic_begin = dpu_crtc_atomic_begin, .atomic_flush = dpu_crtc_atomic_flush, .get_scanout_position = dpu_crtc_get_scanout_position, + .needs_dirtyfb = dpu_crtc_needs_dirtyfb, }; /* initialize crtc */ -- 2.30.2 ^ permalink raw reply related [flat|nested] 23+ messages in thread
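[Editor's note: the two patches fit together as sketched below — plain stand-in structs, not the real DRM types, just to show the series' control flow in one place: patch 1's optional helper vfunc and skip logic, and patch 2's per-CRTC answer based on whether any encoder is in command ("push") mode.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a CRTC; only what the sketch needs. */
struct crtc {
	bool has_cmd_mode_encoder;                 /* e.g. a DSI command-mode panel */
	bool (*needs_dirtyfb)(struct crtc *crtc);  /* optional helper vfunc (patch 1) */
};

/* Patch 2 in miniature: dirtyfb flushes are only needed when some
 * encoder on the CRTC runs in command mode. */
static bool crtc_needs_dirtyfb(struct crtc *crtc)
{
	return crtc->has_cmd_mode_encoder;
}

/* Patch 1 in miniature: the dirtyfb helper skips planes whose CRTC
 * opts out, so video-mode displays never stall on vblank for a
 * DIRTYFB ioctl.  Drivers that don't implement the hook keep the old
 * always-flush behavior. */
static bool should_flush_plane(struct crtc *crtc)
{
	if (crtc->needs_dirtyfb && !crtc->needs_dirtyfb(crtc))
		return false;	/* skip: nothing to push, no stall */
	return true;
}
```

Usage-wise, only command-mode ("push") CRTCs take the flush path; video-mode ("pull") CRTCs and drivers without the hook behave as before.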
* Tested 2021-05-08 19:56 ` [PATCH 2/2] drm/msm/dpu: Wire up needs_dirtyfb Rob Clark @ 2021-05-09 15:38 ` houdek.ryan 2021-05-10 15:26 ` Tested Alex Deucher 2021-05-09 16:28 ` [PATCH 2/2] drm/msm/dpu: Wire up needs_dirtyfb Rob Clark 1 sibling, 1 reply; 23+ messages in thread From: houdek.ryan @ 2021-05-09 15:38 UTC (permalink / raw) To: robdclark Cc: abhinavk, airlied, dri-devel, freedreno, kalyan_t, linux-arm-msm, linux-kernel, maxime, miaoqinglang, robdclark, sean, swboyd, yaohongbo, Ryan Houdek I have tested this on my end and it resolves the 120hz problem. Tested-By: Ryan Houdek <Houdek.Ryan@fex-emu.org> ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Tested 2021-05-09 15:38 ` Tested houdek.ryan @ 2021-05-10 15:26 ` Alex Deucher 0 siblings, 0 replies; 23+ messages in thread From: Alex Deucher @ 2021-05-10 15:26 UTC (permalink / raw) To: houdek.ryan Cc: Rob Clark, robdclark, Dave Airlie, linux-arm-msm, yaohongbo, LKML, Maling list - DRI developers, swboyd, Sean Paul, Qinglang Miao, abhinavk, kalyan_t, freedreno, Maxime Ripard Sorry, what patch are you referring to? Alex On Mon, May 10, 2021 at 4:04 AM <houdek.ryan@fex-emu.org> wrote: > > I have tested this on my end and it resolves the 120hz problem. > > Tested-By: Ryan Houdek <Houdek.Ryan@fex-emu.org> ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 2/2] drm/msm/dpu: Wire up needs_dirtyfb 2021-05-08 19:56 ` [PATCH 2/2] drm/msm/dpu: Wire up needs_dirtyfb Rob Clark 2021-05-09 15:38 ` Tested houdek.ryan @ 2021-05-09 16:28 ` Rob Clark 1 sibling, 0 replies; 23+ messages in thread From: Rob Clark @ 2021-05-09 16:28 UTC (permalink / raw) To: dri-devel Cc: Rob Clark, Sean Paul, David Airlie, Daniel Vetter, Abhinav Kumar, Stephen Boyd, Maxime Ripard, Kalyan Thota, Qinglang Miao, Hongbo Yao, open list:DRM DRIVER FOR MSM ADRENO GPU, open list:DRM DRIVER FOR MSM ADRENO GPU, open list, Ryan Houdek On Sat, May 8, 2021 at 12:53 PM Rob Clark <robdclark@gmail.com> wrote: > > From: Rob Clark <robdclark@chromium.org> > > Signed-off-by: Rob Clark <robdclark@chromium.org> From Ryan (sending this for him because mailing list workflow is lame): I have tested this on my end and it resolves the 120hz problem. Tested-By: Ryan Houdek <Houdek.Ryan@fex-emu.org> > --- > drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 14 ++++++++++++++ > 1 file changed, 14 insertions(+) > > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c > index 5a74f93e29da..868ee6136438 100644 > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c > @@ -143,6 +143,19 @@ static bool dpu_crtc_get_scanout_position(struct drm_crtc *crtc, > return true; > } > > +static bool dpu_crtc_needs_dirtyfb(struct drm_crtc *crtc) > +{ > + struct drm_encoder *encoder; > + > + drm_for_each_encoder_mask (encoder, crtc->dev, crtc->state->encoder_mask) { > + if (dpu_encoder_get_intf_mode(encoder) == INTF_MODE_CMD) { > + return true; > + } > + } > + > + return false; > +} > + > static void _dpu_crtc_setup_blend_cfg(struct dpu_crtc_mixer *mixer, > struct dpu_plane_state *pstate, struct dpu_format *format) > { > @@ -1343,6 +1356,7 @@ static const struct drm_crtc_helper_funcs dpu_crtc_helper_funcs = { > .atomic_begin = dpu_crtc_atomic_begin, > .atomic_flush = dpu_crtc_atomic_flush, > 
.get_scanout_position = dpu_crtc_get_scanout_position, > + .needs_dirtyfb = dpu_crtc_needs_dirtyfb, > }; > > /* initialize crtc */ > -- > 2.30.2 > ^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads: [~2021-05-14 14:39 UTC | newest]

Thread overview: 23+ messages
2021-05-08 19:56 [PATCH 0/2] drm: Fix atomic helper dirtyfb stalls Rob Clark
2021-05-08 19:56 ` [PATCH 1/2] drm: Fix " Rob Clark
2021-05-10 16:14 ` Daniel Vetter
2021-05-10 16:16 ` Daniel Vetter
2021-05-10 16:55 ` Rob Clark
2021-05-10 17:43 ` Daniel Vetter
2021-05-10 19:06 ` Rob Clark
2021-05-11 16:44 ` Daniel Vetter
2021-05-11 17:19 ` Rob Clark
2021-05-11 17:21 ` Daniel Vetter
2021-05-11 17:42 ` Rob Clark
2021-05-11 17:50 ` Daniel Vetter
2021-05-12 8:23 ` Pekka Paalanen
2021-05-12 8:44 ` Daniel Vetter
2021-05-12 9:46 ` Pekka Paalanen
2021-05-12 10:35 ` Daniel Vetter
2021-05-12 14:57 ` Rob Clark
2021-05-14 7:54 ` Pekka Paalanen
2021-05-14 14:43 ` Rob Clark
2021-05-08 19:56 ` [PATCH 2/2] drm/msm/dpu: Wire up needs_dirtyfb Rob Clark
2021-05-09 15:38 ` Tested houdek.ryan
2021-05-10 15:26 ` Tested Alex Deucher
2021-05-09 16:28 ` [PATCH 2/2] drm/msm/dpu: Wire up needs_dirtyfb Rob Clark