* Patches to go on top of Michels page_flip_target patches.
From: Mario Kleiner @ 2016-09-17 12:25 UTC
  To: dri-devel; +Cc: alexander.deucher, michel.daenzer

Hi,

testing on "simulated" pre-DCE4 radeon hw (via setting the module
parameter radeon.use_pflipirq=0 to avoid use of pflip irqs) showed
we need some adjustments on top of Michel's page_flip_target patches
on these old asics.

Patch 2/2:

Essentially disables the ability to flip anywhere in vblank on
pre-DCE4, either via hw programming or, on pre-AVIVO, via software.

Patch 1/2:

A small refinement I noticed was possible for the old pageflip
completion path, using functionality we added a couple of
releases ago.

So these should go into drm-next for Linux 4.9.

thanks,
-mario

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

* [PATCH 1/2] drm/radeon: Slightly more robust flip completion handling for < DCE-4
From: Mario Kleiner @ 2016-09-17 12:25 UTC
  To: dri-devel; +Cc: alexander.deucher, michel.daenzer

Pre DCE4 hardware doesn't have (reliable) pageflip completion
irqs, therefore we have to use the old polling method for flip
completion handling in vblank irq.

As vblank irqs fire a bit before start of vblank (when the
linebuffer fifo read position reaches end of scanout), we
have some fudge for flip completion handling in the last
lines of active scanout. Old code assumed the threshold to
be 99% of active scanout height, a ballpark estimate which
worked ok. Since we have known for a while how to calculate the
actual threshold from linebuffer size, let's make use of it
to get a more accurate threshold.

This completion path is still prone to some races in corner
cases, especially on pre-AVIVO hardware, so document them
a bit better in the code comments.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
---
 drivers/gpu/drm/radeon/radeon_display.c | 30 ++++++++++++++++++++++--------
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_display.c b/drivers/gpu/drm/radeon/radeon_display.c
index 890171f..5070646 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -321,16 +321,30 @@ void radeon_crtc_handle_vblank(struct radeon_device *rdev, int crtc_id)
 	update_pending = radeon_page_flip_pending(rdev, crtc_id);
 
 	/* Has the pageflip already completed in crtc, or is it certain
-	 * to complete in this vblank?
+	 * to complete in this vblank? GET_DISTANCE_TO_VBLANKSTART provides
+	 * distance to start of "fudged earlier" vblank in vpos, distance to
+	 * start of real vblank in hpos. vpos >= 0 && hpos < 0 means we are in
+	 * the last few scanlines before start of real vblank, where the vblank
+	 * irq can fire, so we have sampled update_pending a bit too early and
+	 * know the flip will complete at leading edge of the upcoming real
+	 * vblank. On pre-AVIVO hardware, flips also complete inside the real
+	 * vblank, not only at leading edge, so if update_pending for hpos >= 0
+	 *  == inside real vblank, the flip will complete almost immediately.
+	 * Note that this method of completion handling is still not 100% race
+	 * free, as we could execute before the radeon_flip_work_func managed
+	 * to run and set the RADEON_FLIP_SUBMITTED status, thereby we no-op,
+	 * but the flip still gets programmed into hw and completed during
+	 * vblank, leading to a delayed emission of the flip completion event.
+	 * This applies at least to pre-AVIVO hardware, where flips are always
+	 * completing inside vblank, not only at leading edge of vblank.
 	 */
 	if (update_pending &&
-	    (DRM_SCANOUTPOS_VALID & radeon_get_crtc_scanoutpos(rdev->ddev,
-							       crtc_id,
-							       USE_REAL_VBLANKSTART,
-							       &vpos, &hpos, NULL, NULL,
-							       &rdev->mode_info.crtcs[crtc_id]->base.hwmode)) &&
-	    ((vpos >= (99 * rdev->mode_info.crtcs[crtc_id]->base.hwmode.crtc_vdisplay)/100) ||
-	     (vpos < 0 && !ASIC_IS_AVIVO(rdev)))) {
+	    (DRM_SCANOUTPOS_VALID &
+	     radeon_get_crtc_scanoutpos(rdev->ddev, crtc_id,
+					GET_DISTANCE_TO_VBLANKSTART,
+					&vpos, &hpos, NULL, NULL,
+					&rdev->mode_info.crtcs[crtc_id]->base.hwmode)) &&
+	    ((vpos >= 0 && hpos < 0) || (hpos >= 0 && !ASIC_IS_AVIVO(rdev)))) {
 		/* crtc didn't flip in this target vblank interval,
 		 * but flip is pending in crtc. Based on the current
 		 * scanout position we know that the current frame is
-- 
2.7.4
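
For readers following the thread, the completion test changed by patch 1 can be isolated as a pure predicate. This is only a minimal sketch and not the driver code: the function names and the plain bool/int parameters are hypothetical stand-ins for update_pending, the DRM_SCANOUTPOS_VALID check, ASIC_IS_AVIVO(rdev) and the vpos/hpos values described in the patch comment. The old 99% heuristic is included for comparison.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-alone model of the condition added in patch 1.
 * With GET_DISTANCE_TO_VBLANKSTART (per the patch comment):
 *   vpos = distance to start of the "fudged earlier" vblank,
 *   hpos = distance to start of the real vblank.
 */
static bool flip_completes_this_vblank(bool update_pending, bool pos_valid,
				       int vpos, int hpos, bool is_avivo)
{
	if (!update_pending || !pos_valid)
		return false;

	/* Last few scanlines before real vblank: update_pending was
	 * sampled slightly too early, the flip completes at the
	 * leading edge of the upcoming vblank.
	 */
	if (vpos >= 0 && hpos < 0)
		return true;

	/* Inside real vblank: only pre-AVIVO hw still completes here. */
	return hpos >= 0 && !is_avivo;
}

/*
 * Model of the removed condition, for comparison. Here vpos is the
 * scanout line as reported with USE_REAL_VBLANKSTART (negative inside
 * vblank) and vdisplay is the active scanout height.
 */
static bool old_flip_completes_this_vblank(bool update_pending, bool pos_valid,
					   int vpos, int vdisplay,
					   bool is_avivo)
{
	if (!update_pending || !pos_valid)
		return false;

	return vpos >= (99 * vdisplay) / 100 || (vpos < 0 && !is_avivo);
}
```

The new form needs no mode-dependent percentage: the fudge window is whatever lies between the two vblank start positions reported by the scanout query.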

* [PATCH 2/2] drm/radeon: Prevent races on pre DCE4 between flip submission and completion.
From: Mario Kleiner @ 2016-09-17 12:25 UTC
  To: dri-devel; +Cc: alexander.deucher, michel.daenzer

Pre DCE4 hw doesn't have reliable pageflip completion
interrupts, so instead polling for flip completion is
used from within the vblank irq handler to complete
page flips.

This causes a race if pageflip ioctl is called close to
vblank:

1. pageflip ioctl queues execution of radeon_flip_work_func.

2. vblank irq fires, radeon_crtc_handle_vblank checks for
   flip_status == FLIP_SUBMITTED, finds none, no-ops.

3. radeon_flip_work_func runs inside vblank, decides to
   set flip_status = FLIP_SUBMITTED and programs the
   flip into hw.

4. hw executes flip immediately (because in vblank), but
   as 2 already happened, the flip completion routine only
   emits the flip completion event one refresh later ->
   wrong vblank count/timestamp for completion and no
   performance gain, as instead of delaying the flip until
   next vblank, we now delay the next flip by 1 refresh
   while waiting for the delayed flip completion event.

Given we often don't gain anything due to this race, but
lose precision, prevent the programmed flip from executing
in vblank on pre DCE4 asics to avoid this race.

On pre-AVIVO hw we can't program the hw for edge-triggered
flips, they always execute anywhere in vblank. Therefore delay
the actual flip programming until after vblank on pre-AVIVO.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
---
 drivers/gpu/drm/radeon/atombios_crtc.c  |  4 ++--
 drivers/gpu/drm/radeon/radeon_display.c | 17 ++++++++++-------
 drivers/gpu/drm/radeon/rv515.c          |  3 ++-
 3 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/radeon/atombios_crtc.c b/drivers/gpu/drm/radeon/atombios_crtc.c
index a4e9f35..74f99ba 100644
--- a/drivers/gpu/drm/radeon/atombios_crtc.c
+++ b/drivers/gpu/drm/radeon/atombios_crtc.c
@@ -1638,8 +1638,8 @@ static int avivo_crtc_do_set_base(struct drm_crtc *crtc,
 	WREG32(AVIVO_D1MODE_VIEWPORT_SIZE + radeon_crtc->crtc_offset,
 	       (viewport_w << 16) | viewport_h);
 
-	/* set pageflip to happen anywhere in vblank interval */
-	WREG32(AVIVO_D1MODE_MASTER_UPDATE_MODE + radeon_crtc->crtc_offset, 0);
+	/* set pageflip to happen only at start of vblank interval (front porch) */
+	WREG32(AVIVO_D1MODE_MASTER_UPDATE_MODE + radeon_crtc->crtc_offset, 3);
 
 	if (!atomic && fb && fb != crtc->primary->fb) {
 		radeon_fb = to_radeon_framebuffer(fb);
diff --git a/drivers/gpu/drm/radeon/radeon_display.c b/drivers/gpu/drm/radeon/radeon_display.c
index 5070646..b8ab30a 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -452,16 +452,19 @@ static void radeon_flip_work_func(struct work_struct *__work)
 	}
 
 	/* Wait until we're out of the vertical blank period before the one
-	 * targeted by the flip
+	 * targeted by the flip. Always wait on pre DCE4 to avoid races with
+	 * flip completion handling from vblank irq, as these old asics don't
+	 * have reliable pageflip completion interrupts.
 	 */
 	while (radeon_crtc->enabled &&
-	       (radeon_get_crtc_scanoutpos(dev, work->crtc_id, 0,
-					   &vpos, &hpos, NULL, NULL,
-					   &crtc->hwmode)
+		(radeon_get_crtc_scanoutpos(dev, work->crtc_id, 0,
+					    &vpos, &hpos, NULL, NULL,
+					    &crtc->hwmode)
 		& (DRM_SCANOUTPOS_VALID | DRM_SCANOUTPOS_IN_VBLANK)) ==
-	       (DRM_SCANOUTPOS_VALID | DRM_SCANOUTPOS_IN_VBLANK) &&
-	       (int)(work->target_vblank -
-		     dev->driver->get_vblank_counter(dev, work->crtc_id)) > 0)
+		(DRM_SCANOUTPOS_VALID | DRM_SCANOUTPOS_IN_VBLANK) &&
+		(!ASIC_IS_AVIVO(rdev) ||
+		((int) (work->target_vblank -
+		dev->driver->get_vblank_counter(dev, work->crtc_id)) > 0)))
 		usleep_range(1000, 2000);
 
 	/* We borrow the event spin lock for protecting flip_status */
diff --git a/drivers/gpu/drm/radeon/rv515.c b/drivers/gpu/drm/radeon/rv515.c
index 76c55c5..c55d653 100644
--- a/drivers/gpu/drm/radeon/rv515.c
+++ b/drivers/gpu/drm/radeon/rv515.c
@@ -406,8 +406,9 @@ void rv515_mc_resume(struct radeon_device *rdev, struct rv515_mc_save *save)
 	for (i = 0; i < rdev->num_crtc; i++) {
 		if (save->crtc_enabled[i]) {
 			tmp = RREG32(AVIVO_D1MODE_MASTER_UPDATE_MODE + crtc_offsets[i]);
-			if ((tmp & 0x7) != 0) {
+			if ((tmp & 0x7) != 3) {
 				tmp &= ~0x7;
+				tmp |= 0x3;
 				WREG32(AVIVO_D1MODE_MASTER_UPDATE_MODE + crtc_offsets[i], tmp);
 			}
 			tmp = RREG32(AVIVO_D1GRPH_UPDATE + crtc_offsets[i]);
-- 
2.7.4
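
The two mechanisms in patch 2 can be sketched outside the driver as follows. This is an illustrative model under stated assumptions, not the radeon code: the function names and the bool/uint32_t parameters are hypothetical, standing in for radeon_crtc->enabled, the DRM_SCANOUTPOS_VALID | DRM_SCANOUTPOS_IN_VBLANK check, ASIC_IS_AVIVO(rdev), work->target_vblank and the hardware vblank counter. The cast assumes the usual two's complement conversion, as the original (int) cast does.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the wait-loop condition in
 * radeon_flip_work_func after patch 2: returns true while the worker
 * should keep sleeping instead of programming the flip.
 */
static bool must_keep_waiting(bool crtc_enabled, bool in_vblank_valid,
			      bool is_avivo, uint32_t target_vblank,
			      uint32_t current_vblank)
{
	/* Outside vblank (or crtc off) there is nothing to wait for. */
	if (!crtc_enabled || !in_vblank_valid)
		return false;

	/* pre-AVIVO: flips execute anywhere in vblank, so never
	 * program one while still inside vblank.
	 */
	if (!is_avivo)
		return true;

	/* Otherwise wait only if the target vblank is still ahead.
	 * The signed difference stays correct across counter wraparound.
	 */
	return (int32_t)(target_vblank - current_vblank) > 0;
}

/*
 * Model of the rv515_mc_resume fixup: force the low three bits of the
 * update-mode register value to 3 and leave all other bits untouched.
 */
static uint32_t fix_master_update_mode(uint32_t tmp)
{
	if ((tmp & 0x7) != 3) {
		tmp &= ~0x7u;
		tmp |= 0x3;
	}
	return tmp;
}
```

Note that the first predicate keeps pre-AVIVO hardware waiting for the whole vblank regardless of the target count, which is the software counterpart of the edge-triggered update mode programmed on AVIVO parts.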

* Re: [PATCH 2/2] drm/radeon: Prevent races on pre DCE4 between flip submission and completion.
From: Michel Dänzer @ 2016-09-20  2:56 UTC
  To: Mario Kleiner; +Cc: alexander.deucher, dri-devel

On 17/09/16 09:25 PM, Mario Kleiner wrote:
> Pre DCE4 hw doesn't have reliable pageflip completion
> interrupts, so instead polling for flip completion is
> used from within the vblank irq handler to complete
> page flips.
> 
> This causes a race if pageflip ioctl is called close to
> vblank:
> 
> 1. pageflip ioctl queues execution of radeon_flip_work_func.
> 
> 2. vblank irq fires, radeon_crtc_handle_vblank checks for
>    flip_status == FLIP_SUBMITTED, finds none, no-ops.
> 
> 3. radeon_flip_work_func runs inside vblank, decides to
>    set flip_status = FLIP_SUBMITTED and programs the
>    flip into hw.
> 
> 4. hw executes flip immediately (because in vblank), but
>    as 2 already happened, the flip completion routine only
>    emits the flip completion event one refresh later ->
>    wrong vblank count/timestamp for completion and no
>    performance gain, as instead of delaying the flip until
>    next vblank, we now delay the next flip by 1 refresh
>    while waiting for the delayed flip completion event.
> 
> Given we often don't gain anything due to this race, but
> lose precision, prevent the programmed flip from executing
> in vblank on pre DCE4 asics to avoid this race.
> 
> On pre-AVIVO hw we can't program the hw for edge-triggered
> flips, they always execute anywhere in vblank. Therefore delay
> the actual flip programming until after vblank on pre-AVIVO.
> 
> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>


P.S. Please send radeon (and amdgpu) driver patches to the amd-gfx
mailing list (as well).

-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

* Re: [PATCH 1/2] drm/radeon: Slightly more robust flip completion handling for < DCE-4
From: Michel Dänzer @ 2016-09-20  2:58 UTC
  To: Mario Kleiner; +Cc: alexander.deucher, dri-devel

On 17/09/16 09:25 PM, Mario Kleiner wrote:
> Pre DCE4 hardware doesn't have (reliable) pageflip completion
> irqs, therefore we have to use the old polling method for flip
> completion handling in vblank irq.
> 
> As vblank irqs fire a bit before start of vblank (when the
> linebuffer fifo read position reaches end of scanout), we
> have some fudge for flip completion handling in the last
> lines of active scanout. Old code assumed the threshold to
> be 99% of active scanout height, a ballpark estimate which
> worked ok. Since we have known for a while how to calculate the
> actual threshold from linebuffer size, let's make use of it
> to get a more accurate threshold.
> 
> This completion path is still prone to some races in corner
> cases, especially on pre-AVIVO hardware, so document them
> a bit better in the code comments.
> 
> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>

[...]

> +	 * Note that this method of completion handling is still not 100% race
> +	 * free, as we could execute before the radeon_flip_work_func managed
> +	 * to run and set the RADEON_FLIP_SUBMITTED status, thereby we no-op,
> +	 * but the flip still gets programmed into hw and completed during
> +	 * vblank, leading to a delayed emission of the flip completion event.
> +	 * This applies at least to pre-AVIVO hardware, where flips are always
> +	 * completing inside vblank, not only at leading edge of vblank.

Does this part of the comment still apply with patch 2?

Anyway,

Acked-by: Michel Dänzer <michel.daenzer@amd.com>


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

* Re: [PATCH 2/2] drm/radeon: Prevent races on pre DCE4 between flip submission and completion.
From: Alex Deucher @ 2016-09-29 13:36 UTC
  To: Michel Dänzer; +Cc: Deucher, Alexander, Mailing list - DRI developers

On Mon, Sep 19, 2016 at 10:56 PM, Michel Dänzer <michel@daenzer.net> wrote:
> On 17/09/16 09:25 PM, Mario Kleiner wrote:
>> Pre DCE4 hw doesn't have reliable pageflip completion
>> interrupts, so instead polling for flip completion is
>> used from within the vblank irq handler to complete
>> page flips.
>>
>> This causes a race if pageflip ioctl is called close to
>> vblank:
>>
>> 1. pageflip ioctl queues execution of radeon_flip_work_func.
>>
>> 2. vblank irq fires, radeon_crtc_handle_vblank checks for
>>    flip_status == FLIP_SUBMITTED, finds none, no-ops.
>>
>> 3. radeon_flip_work_func runs inside vblank, decides to
>>    set flip_status = FLIP_SUBMITTED and programs the
>>    flip into hw.
>>
>> 4. hw executes flip immediately (because in vblank), but
>>    as 2 already happened, the flip completion routine only
>>    emits the flip completion event one refresh later ->
>>    wrong vblank count/timestamp for completion and no
>>    performance gain, as instead of delaying the flip until
>>    next vblank, we now delay the next flip by 1 refresh
>>    while waiting for the delayed flip completion event.
>>
>> Given we often don't gain anything due to this race, but
>> lose precision, prevent the programmed flip from executing
>> in vblank on pre DCE4 asics to avoid this race.
>>
>> On pre-AVIVO hw we can't program the hw for edge-triggered
>> flips, they always execute anywhere in vblank. Therefore delay
>> the actual flip programming until after vblank on pre-AVIVO.
>>
>> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
>
> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

Applied this series.  Thanks!

Alex
