stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] drm/amd/display: Use vrr friendly pageflip throttling in DC.
@ 2019-02-09  6:52 Mario Kleiner
  2019-02-11 14:38 ` Kazlauskas, Nicholas
  0 siblings, 1 reply; 2+ messages in thread
From: Mario Kleiner @ 2019-02-09  6:52 UTC (permalink / raw)
  To: amd-gfx
  Cc: dri-devel, mario.kleiner.de, stable, Nicholas Kazlauskas,
	Harry Wentland, Alex Deucher, Michel Dänzer

In VRR mode, keep track of the vblank count of the last
completed pageflip in amdgpu_crtc->last_flip_vblank, as
recorded in the pageflip completion handler after each
completed flip.

Use that count to prevent mmio programming a new pageflip
within the same vblank in which the last pageflip completed,
iow. to throttle pageflips to at most one flip per video
frame, while at the same time allowing to request a flip
not only before start of vblank, but also anywhere within
vblank.

The old logic did the same, and made sense for regular fixed
refresh rate flipping, but in vrr mode it prevents requesting
a flip anywhere inside the possibly huge vblank, thereby
reducing framerate in vrr mode instead of improving it, by
delaying a slightly delayed flip requests up to a maximum
vblank duration + 1 scanout duration. This would limit VRR
usefulness to only help applications with a very high GPU
demand, which can submit the flip request before start of
vblank, but then have to wait long for fences to complete.

With this method a flip can be both requested and - after
fences have completed - executed, ie. it doesn't matter if
the request (amdgpu_dm_do_flip()) gets delayed until deep
into the extended vblank due to cpu execution delays. This
also allows clients which want to regulate framerate within
the vrr range a much more fine-grained control of flip timing,
a feature that might be useful for video playback, and is
very useful for neuroscience/vision research applications.

In regular non-VRR mode, retain the old flip submission
behavior. This to keep flip scheduling for fullscreen X11/GLX
OpenGL clients intact, if they use the GLX_OML_sync_control
extensions glXSwapBufferMscOML(, ..., target_msc,...) function
with a specific target_msc target vblank count.

glXSwapBuffersMscOML() or DRI3/Present PresentPixmap() will
not flip at the proper target_msc for a non-zero target_msc
if VRR mode is active with this patch. They'd often flip one
frame too early. However, this limitation should not matter
much in VRR mode, as scheduling based on vblank counts is
pretty futile/unusable under variable refresh duration
anyway, so no real extra harm is done.

According to some testing already done with this patch by
Nicholas on top of my tests, IGT tests didn't report any
problems. If fixes stuttering and flickering when flipping
at rates below the minimum vrr refresh rate.

Fixes: bb47de736661 ("drm/amdgpu: Set FreeSync state using drm VRR
properties")
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: <stable@vger.kernel.org>
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Michel Dänzer <michel@daenzer.net>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h      |  1 +
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 33 ++++++++++++++++---
 2 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
index bfa394ffd6d2..87ca5746f861 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
@@ -411,6 +411,7 @@ struct amdgpu_crtc {
 	struct amdgpu_flip_work *pflip_works;
 	enum amdgpu_flip_status pflip_status;
 	int deferred_flip_completion;
+	u64 last_flip_vblank;
 	/* pll sharing */
 	struct amdgpu_atom_ss ss;
 	bool ss_enabled;
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index d59bafc84475..d4da331aa349 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -303,12 +303,11 @@ static void dm_pflip_high_irq(void *interrupt_params)
 		return;
 	}
 
+	/* Update to correct count(s) if racing with vblank irq */
+	amdgpu_crtc->last_flip_vblank = drm_crtc_accurate_vblank_count(&amdgpu_crtc->base);
 
 	/* wake up userspace */
 	if (amdgpu_crtc->event) {
-		/* Update to correct count(s) if racing with vblank irq */
-		drm_crtc_accurate_vblank_count(&amdgpu_crtc->base);
-
 		drm_crtc_send_vblank_event(&amdgpu_crtc->base, amdgpu_crtc->event);
 
 		/* page flip completed. clean up */
@@ -4736,6 +4735,8 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
 	struct amdgpu_bo *abo;
 	uint64_t tiling_flags, dcc_address;
 	uint32_t target, target_vblank;
+	uint64_t last_flip_vblank;
+	bool vrr_active = acrtc_state->freesync_config.state == VRR_STATE_ACTIVE_VARIABLE;
 
 	struct {
 		struct dc_surface_update surface_updates[MAX_SURFACES];
@@ -4889,7 +4890,31 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
 	 * hopefully eliminating dc_*_update structs in their entirety.
 	 */
 	if (flip_count) {
-		target = (uint32_t)drm_crtc_vblank_count(pcrtc) + *wait_for_vblank;
+		if (!vrr_active) {
+			/* Use old throttling in non-vrr fixed refresh rate mode
+			 * to keep flip scheduling based on target vblank counts
+			 * working in a backwards compatible way, e.g., for
+			 * clients using the GLX_OML_sync_control extension or
+			 * DRI3/Present extension with defined target_msc.
+			 */
+			last_flip_vblank = drm_crtc_vblank_count(pcrtc);
+		}
+		else {
+			/* For variable refresh rate mode only:
+			 * Get vblank of last completed flip to avoid > 1 vrr
+			 * flips per video frame by use of throttling, but allow
+			 * flip programming anywhere in the possibly large
+			 * variable vrr vblank interval for fine-grained flip
+			 * timing control and more opportunity to avoid stutter
+			 * on late submission of flips.
+			 */
+			spin_lock_irqsave(&pcrtc->dev->event_lock, flags);
+			last_flip_vblank = acrtc_attach->last_flip_vblank;
+			spin_unlock_irqrestore(&pcrtc->dev->event_lock, flags);
+		}
+
+		target = (uint32_t)last_flip_vblank + *wait_for_vblank;
+
 		/* Prepare wait for target vblank early - before the fence-waits */
 		target_vblank = target - (uint32_t)drm_crtc_vblank_count(pcrtc) +
 				amdgpu_get_vblank_counter_kms(pcrtc->dev, acrtc_attach->crtc_id);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH] drm/amd/display: Use vrr friendly pageflip throttling in DC.
  2019-02-09  6:52 [PATCH] drm/amd/display: Use vrr friendly pageflip throttling in DC Mario Kleiner
@ 2019-02-11 14:38 ` Kazlauskas, Nicholas
  0 siblings, 0 replies; 2+ messages in thread
From: Kazlauskas, Nicholas @ 2019-02-11 14:38 UTC (permalink / raw)
  To: Mario Kleiner, amd-gfx
  Cc: dri-devel, stable, Wentland, Harry, Deucher, Alexander,
	Michel Dänzer

On 2/9/19 1:52 AM, Mario Kleiner wrote:
> In VRR mode, keep track of the vblank count of the last
> completed pageflip in amdgpu_crtc->last_flip_vblank, as
> recorded in the pageflip completion handler after each
> completed flip.
> 
> Use that count to prevent mmio programming a new pageflip
> within the same vblank in which the last pageflip completed,
> iow. to throttle pageflips to at most one flip per video
> frame, while at the same time allowing to request a flip
> not only before start of vblank, but also anywhere within
> vblank.
> 
> The old logic did the same, and made sense for regular fixed
> refresh rate flipping, but in vrr mode it prevents requesting
> a flip anywhere inside the possibly huge vblank, thereby
> reducing framerate in vrr mode instead of improving it, by
> delaying a slightly delayed flip requests up to a maximum
> vblank duration + 1 scanout duration. This would limit VRR
> usefulness to only help applications with a very high GPU
> demand, which can submit the flip request before start of
> vblank, but then have to wait long for fences to complete.
> 
> With this method a flip can be both requested and - after
> fences have completed - executed, ie. it doesn't matter if
> the request (amdgpu_dm_do_flip()) gets delayed until deep
> into the extended vblank due to cpu execution delays. This
> also allows clients which want to regulate framerate within
> the vrr range a much more fine-grained control of flip timing,
> a feature that might be useful for video playback, and is
> very useful for neuroscience/vision research applications.
> 
> In regular non-VRR mode, retain the old flip submission
> behavior. This to keep flip scheduling for fullscreen X11/GLX
> OpenGL clients intact, if they use the GLX_OML_sync_control
> extensions glXSwapBufferMscOML(, ..., target_msc,...) function
> with a specific target_msc target vblank count.
> 
> glXSwapBuffersMscOML() or DRI3/Present PresentPixmap() will
> not flip at the proper target_msc for a non-zero target_msc
> if VRR mode is active with this patch. They'd often flip one
> frame too early. However, this limitation should not matter
> much in VRR mode, as scheduling based on vblank counts is
> pretty futile/unusable under variable refresh duration
> anyway, so no real extra harm is done.
> 
> According to some testing already done with this patch by
> Nicholas on top of my tests, IGT tests didn't report any
> problems. If fixes stuttering and flickering when flipping
> at rates below the minimum vrr refresh rate.
> 
> Fixes: bb47de736661 ("drm/amdgpu: Set FreeSync state using drm VRR
> properties")
> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
> Cc: <stable@vger.kernel.org>
> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
> Cc: Harry Wentland <harry.wentland@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Michel Dänzer <michel@daenzer.net>

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>

I can confirm I've tested this a bit as well and it looks reasonable to 
me. This patch really helps avoid large or frequent changes in timeout 
duration in the case where we take too long from the last and flip and 
we're already in the next vblank interval by the time we get into commit 
tail.

It's unfortunate that this changes the behavior of how waiting for the 
next vblank interval works when VRR is active but I can't really think 
of a case where we would want anything else but this.

For applications where timing and scheduling like this really matters I 
think we'd need to look into adding the target presentation timestamp to 
have full control over this kind of behavior from userspace in the first 
place. Along with some sort of control over async page-flipping in the 
atomic commit IOCTL as well.

Nicholas Kazlauskas

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h      |  1 +
>   .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 33 ++++++++++++++++---
>   2 files changed, 30 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
> index bfa394ffd6d2..87ca5746f861 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
> @@ -411,6 +411,7 @@ struct amdgpu_crtc {
>   	struct amdgpu_flip_work *pflip_works;
>   	enum amdgpu_flip_status pflip_status;
>   	int deferred_flip_completion;
> +	u64 last_flip_vblank;
>   	/* pll sharing */
>   	struct amdgpu_atom_ss ss;
>   	bool ss_enabled;
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index d59bafc84475..d4da331aa349 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -303,12 +303,11 @@ static void dm_pflip_high_irq(void *interrupt_params)
>   		return;
>   	}
>   
> +	/* Update to correct count(s) if racing with vblank irq */
> +	amdgpu_crtc->last_flip_vblank = drm_crtc_accurate_vblank_count(&amdgpu_crtc->base);
>   
>   	/* wake up userspace */
>   	if (amdgpu_crtc->event) {
> -		/* Update to correct count(s) if racing with vblank irq */
> -		drm_crtc_accurate_vblank_count(&amdgpu_crtc->base);
> -
>   		drm_crtc_send_vblank_event(&amdgpu_crtc->base, amdgpu_crtc->event);
>   
>   		/* page flip completed. clean up */
> @@ -4736,6 +4735,8 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
>   	struct amdgpu_bo *abo;
>   	uint64_t tiling_flags, dcc_address;
>   	uint32_t target, target_vblank;
> +	uint64_t last_flip_vblank;
> +	bool vrr_active = acrtc_state->freesync_config.state == VRR_STATE_ACTIVE_VARIABLE;
>   
>   	struct {
>   		struct dc_surface_update surface_updates[MAX_SURFACES];
> @@ -4889,7 +4890,31 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
>   	 * hopefully eliminating dc_*_update structs in their entirety.
>   	 */
>   	if (flip_count) {
> -		target = (uint32_t)drm_crtc_vblank_count(pcrtc) + *wait_for_vblank;
> +		if (!vrr_active) {
> +			/* Use old throttling in non-vrr fixed refresh rate mode
> +			 * to keep flip scheduling based on target vblank counts
> +			 * working in a backwards compatible way, e.g., for
> +			 * clients using the GLX_OML_sync_control extension or
> +			 * DRI3/Present extension with defined target_msc.
> +			 */
> +			last_flip_vblank = drm_crtc_vblank_count(pcrtc);
> +		}
> +		else {
> +			/* For variable refresh rate mode only:
> +			 * Get vblank of last completed flip to avoid > 1 vrr
> +			 * flips per video frame by use of throttling, but allow
> +			 * flip programming anywhere in the possibly large
> +			 * variable vrr vblank interval for fine-grained flip
> +			 * timing control and more opportunity to avoid stutter
> +			 * on late submission of flips.
> +			 */
> +			spin_lock_irqsave(&pcrtc->dev->event_lock, flags);
> +			last_flip_vblank = acrtc_attach->last_flip_vblank;
> +			spin_unlock_irqrestore(&pcrtc->dev->event_lock, flags);
> +		}
> +
> +		target = (uint32_t)last_flip_vblank + *wait_for_vblank;
> +
>   		/* Prepare wait for target vblank early - before the fence-waits */
>   		target_vblank = target - (uint32_t)drm_crtc_vblank_count(pcrtc) +
>   				amdgpu_get_vblank_counter_kms(pcrtc->dev, acrtc_attach->crtc_id);
> 


^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2019-02-11 15:51 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-09  6:52 [PATCH] drm/amd/display: Use vrr friendly pageflip throttling in DC Mario Kleiner
2019-02-11 14:38 ` Kazlauskas, Nicholas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).