Improvements and fixes for Intel ddx swap scheduling/timestamping.

All of lore.kernel.org
 help / color / mirror / Atom feed

* Improvements and fixes for Intel ddx swap scheduling/timestamping.
@ 2012-10-07  6:38 Mario Kleiner
  2012-10-07  6:38 ` [PATCH 1/3] ddx/dri2: Make triple-buffering compatible with timestamping on Xorg 1.12+ Mario Kleiner
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Mario Kleiner @ 2012-10-07  6:38 UTC (permalink / raw)
  To: intel-gfx; +Cc: daniel.vetter

Hi,

a series of three patches to improve the dri2 swap scheduling
and timestamping for the current intel ddx.

The first one enables proper OML_sync_control timestamping
while triple-buffering is enabled and XOrg 1.12+ with DRI2SwapLimit
support is in use. So far, timestamping was unuseable with
triple-buffering, only worked with double-buffering.

The second one repairs the broken pageflip swap scheduling, which
is apparently in a frightening state for timing sensitive apps
since a year, due to a tiny but really ugly bug. In a perfect
implementation of Murphy's law, the same commit that broke the
scheduling also disabled the builtin correctness checks that
were supposed to catch such bugs.

The third one proposes to revert 'SwapBuffersWait' to its old
behaviour where it didn't affect pageflipping. I just can't
think of a case where the current behaviour makes any sense, not
even for benchmarking? But maybe i'm overlooking something.

All patches were tested against an Intel 945-GME gpu.

I don't really care about the 'SwapBuffersWait' patch one way
or the other, but the first two are crucial to make the intel
ddx useable in a painless and safe way again for users of
timing sensitive applications.

Thanks,
-mario

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/3] ddx/dri2: Make triple-buffering compatible with timestamping on Xorg 1.12+
  2012-10-07  6:38 Improvements and fixes for Intel ddx swap scheduling/timestamping Mario Kleiner
@ 2012-10-07  6:38 ` Mario Kleiner
  2012-10-07  6:38 ` [PATCH 2/3] ddx/dri2: Repair broken pageflip swap scheduling Mario Kleiner
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Mario Kleiner @ 2012-10-07  6:38 UTC (permalink / raw)
  To: intel-gfx; +Cc: daniel.vetter, Mario Kleiner

This patch adds support for OML_sync_control compliant pageflip
completion timestamping while triple-buffering is enabled and
the ddx is running on Xorg server 1.12 or later.

It makes use of the DRI2SwapLimit api to allow up to two
pending flips without throttling of the client, then defers
invocation of DRI2SwapComplete() until the corresponding
pageflips actually complete, just as in the double-buffering case,
allowing for proper timestamping and glXWaitForSbcOML() behaviour.
This should not impact triple-buffering performance.

On server versions <= 1.11 without swaplimit api, triple-buffering
is achieved by sacrificing OML_sync_control compliance, as in the
old implementation.

Tested against server without swaplimit api and 1.13 with swaplimit
api, both with and without triplebuffering enabled.

Signed-off-by: Mario Kleiner <mario.kleiner@tuebingen.mpg.de>
---
 src/intel_dri.c |   67 ++++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 59 insertions(+), 8 deletions(-)

diff --git a/src/intel_dri.c b/src/intel_dri.c
index 64cb567..126acb2 100644
--- a/src/intel_dri.c
+++ b/src/intel_dri.c
@@ -310,6 +310,15 @@ I830DRI2CreateBuffer(DrawablePtr drawable, unsigned int attachment,
 			drawable = &(get_drawable_pixmap(drawable)->drawable);
 			is_glamor_pixmap = TRUE;
 		}
+
+#if DRI2INFOREC_VERSION >= 6
+		/* If swaplimit api supported, use it to tell server we are
+		 * triple-buffering capable. This allows triple-buffering
+		 * without need for hacks which compromise time-stamping.
+		 */
+		if (drawable->type == DRAWABLE_WINDOW)
+			DRI2SwapLimit(drawable, intel->use_triple_buffer ? 2 : 1);
+#endif
 	}
 
 	if (pixmap == NULL) {
@@ -815,6 +824,20 @@ static drm_intel_bo *get_pixmap_bo(I830DRI2BufferPrivatePtr priv)
 	return bo;
 }
 
+#if DRI2INFOREC_VERSION >= 6
+static Bool
+I830DRI2SwapLimitValidate(DrawablePtr draw, int swap_limit)
+{
+	ScrnInfoPtr pScrn = xf86ScreenToScrn(draw->pScreen);
+	struct intel_screen_private *intel = intel_get_screen_private(pScrn);
+
+	if ((swap_limit < 1 ) || (swap_limit > (intel->use_triple_buffer ? 2 : 1)))
+		return FALSE;
+
+	return TRUE;
+}
+#endif
+
 /*
  * Our internal swap routine takes care of actually exchanging, blitting, or
  * flipping buffers as necessary.
@@ -914,10 +937,17 @@ I830DRI2ScheduleFlip(struct intel_screen_private *intel,
 
 	/* Then flip DRI2 pointers and update the screen pixmap */
 	I830DRI2ExchangeBuffers(intel, info->front, info->back);
+
+#if DRI2INFOREC_VERSION < 6
+	/* Only needed on Xorg <= 1.11 server, which doesn't have swaplimit
+	 * api to do this cleanly. Breaks OML_sync_control timestamping.
+	 */
 	DRI2SwapComplete(info->client, draw, 0, 0, 0,
 			 DRI2_EXCHANGE_COMPLETE,
 			 info->event_complete,
 			 info->event_data);
+#endif
+
 	return TRUE;
 }
 
@@ -1041,14 +1071,22 @@ void I830DRI2FlipEventHandler(unsigned int frame, unsigned int tv_sec,
 		dixLookupDrawable(&drawable, flip_info->drawable_id, serverClient,
 				  M_ANY, DixWriteAccess);
 
-
-	/* We assume our flips arrive in order, so we don't check the frame */
-	switch (flip_info->type) {
-	case DRI2_SWAP:
-		if (!drawable)
-			break;
-
-		/* Check for too small vblank count of pageflip completion, taking wraparound
+	/* Perform consistency check, final timestamping and swap completion here iff:
+	 * - This is a pageflip completion for a classic double-buffered swap.
+	 * - This is a pageflip completion for a triple-buffered swap and the XOrg 1.12+
+	 *   server supports the swap limit api, so we were able to defer swap completion
+	 *   until this point without negative impact on performance.
+	 *
+	 * -> This allows OML_sync_control spec compliant timestamping.
+	 *
+	 * On older servers we already mark the swap as completed ahead of its completion in
+	 * I830DRI2ScheduleFlip to allow triple-buffering at the cost of broken timestamping.
+	 */
+	if (drawable && ((flip_info->type == DRI2_SWAP) ||
+			 ((DRI2INFOREC_VERSION >= 6) && (flip_info->type == DRI2_SWAP_CHAIN)))) {
+		/* We assume our flips arrive in order, so we don't check the frame.
+		 *
+		 * Check for too small vblank count of pageflip completion, taking wraparound
 		 * into account. This usually means some defective kms pageflip completion,
 		 * causing wrong (msc, ust) return values and possible visual corruption.
 		 */
@@ -1072,6 +1110,11 @@ void I830DRI2FlipEventHandler(unsigned int frame, unsigned int tv_sec,
 		DRI2SwapComplete(flip_info->client, drawable, frame, tv_sec, tv_usec,
 				 DRI2_FLIP_COMPLETE, flip_info->client ? flip_info->event_complete : NULL,
 				 flip_info->event_data);
+	}
+
+	switch (flip_info->type) {
+	case DRI2_SWAP:
+		/* All completion work for double-buffered swaps already done above. */
 		break;
 
 	case DRI2_SWAP_CHAIN:
@@ -1596,6 +1639,14 @@ Bool I830DRI2ScreenInit(ScreenPtr screen)
 	driverNames[0] = info.driverName;
 #endif
 
+#if DRI2INFOREC_VERSION >= 6
+	info.version = 6;
+	info.SwapLimitValidate = I830DRI2SwapLimitValidate;
+	/* Server has fallbacks for these undefined ones for v5 and v6: */
+	info.AuthMagic = NULL;
+	info.ReuseBufferNotify = NULL;
+#endif
+
 	return DRI2ScreenInit(screen, &info);
 }
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/3] ddx/dri2: Repair broken pageflip swap scheduling.
  2012-10-07  6:38 Improvements and fixes for Intel ddx swap scheduling/timestamping Mario Kleiner
  2012-10-07  6:38 ` [PATCH 1/3] ddx/dri2: Make triple-buffering compatible with timestamping on Xorg 1.12+ Mario Kleiner
@ 2012-10-07  6:38 ` Mario Kleiner
       [not found]   ` <509365D3.3080203@tuebingen.mpg.de>
  2012-10-07  6:38 ` [PATCH 3/3] ddx/dri2: Prevent option 'SwapBuffersWait' from disabling pageflip Mario Kleiner
  2012-10-08  8:00 ` Improvements and fixes for Intel ddx swap scheduling/timestamping Daniel Vetter
  3 siblings, 1 reply; 11+ messages in thread
From: Mario Kleiner @ 2012-10-07  6:38 UTC (permalink / raw)
  To: intel-gfx; +Cc: daniel.vetter, Mario Kleiner

Commit 7538be3315b8683b05e8f6b22023baadcc0bc4da together
with DRI2/OpenGL triple-buffering support added an
optimization which breaks kms pageflip swap scheduling for most
meaningful use cases and thereby violates the OML_sync_control,
GLX_SGI_swap_control, GLX_swap_control and MESA_swap_control
specs, except for the trivial case of a glXSwapBuffers call
with swap_interval == 1.

This small modification allows to keep the optimization in
the intended way, while removing the hilarious side effects
for timing sensitive applications.

Signed-off-by: Mario Kleiner <mario.kleiner@tuebingen.mpg.de>
---
 src/intel_dri.c |   22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/src/intel_dri.c b/src/intel_dri.c
index 126acb2..8786bf4 100644
--- a/src/intel_dri.c
+++ b/src/intel_dri.c
@@ -1275,9 +1275,6 @@ I830DRI2ScheduleSwap(ClientPtr client, DrawablePtr draw, DRI2BufferPtr front,
 	 * the swap.
 	 */
 	if (divisor == 0 || current_msc < *target_msc) {
-		if (flip && I830DRI2ScheduleFlip(intel, draw, swap_info))
-			return TRUE;
-
 		vbl.request.type =
 			DRM_VBLANK_ABSOLUTE | DRM_VBLANK_EVENT | pipe_select(pipe);
 
@@ -1295,6 +1292,25 @@ I830DRI2ScheduleSwap(ClientPtr client, DrawablePtr draw, DRI2BufferPtr front,
 		if (current_msc >= *target_msc)
 			*target_msc = current_msc;
 
+		/* If pageflip is requested for the next possible vblank,
+		 * then avoid the roundtrip to the kernels vblank event
+	         * delivery and schedule the pageflip for next vblank
+		 * directly. This can't be done for any other case, as
+		 * it would violate the OML_sync_control spec and
+		 * SGI/MESA/GLX_swap_control spec!
+		 */
+		if (flip && (current_msc == *target_msc) && 
+		    I830DRI2ScheduleFlip(intel, draw, swap_info)) {
+			/* Create approximate target vblank of flip-completion,
+			 * so basic consistency checks and swap_interval still
+			 * work correctly.
+			 */
+			*target_msc += flip;
+			swap_info->frame = *target_msc;
+
+			return TRUE;
+		}
+
 		vbl.request.sequence = *target_msc;
 		vbl.request.signal = (unsigned long)swap_info;
 		ret = drmWaitVBlank(intel->drmSubFD, &vbl);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/3] ddx/dri2: Prevent option 'SwapBuffersWait' from disabling pageflip.
  2012-10-07  6:38 Improvements and fixes for Intel ddx swap scheduling/timestamping Mario Kleiner
  2012-10-07  6:38 ` [PATCH 1/3] ddx/dri2: Make triple-buffering compatible with timestamping on Xorg 1.12+ Mario Kleiner
  2012-10-07  6:38 ` [PATCH 2/3] ddx/dri2: Repair broken pageflip swap scheduling Mario Kleiner
@ 2012-10-07  6:38 ` Mario Kleiner
  2012-10-08  8:00 ` Improvements and fixes for Intel ddx swap scheduling/timestamping Daniel Vetter
  3 siblings, 0 replies; 11+ messages in thread
From: Mario Kleiner @ 2012-10-07  6:38 UTC (permalink / raw)
  To: intel-gfx; +Cc: daniel.vetter, Mario Kleiner

Commit 1d102cc6ed21d1c4afa47800eecd24b9d663f689 introduced a
change, where setting option 'SwapBuffersWait' to 'off' disables
kms pageflipping of fullscreen windows.

Afaics this change had no benefit, but only downsides:

1. Apps which don't want to throttle their swaps to video refresh
   rate and therefore set swap_limit to zero, will run through
   the non-vsynced, tearing copyswap path, as before the change.

2. Apps which set their swap_limit > 0, because they want throttled
   and vsynced swaps, will still get throttled to video refresh rate,
   regardless of the 'SwapBuffersWait' 'off', because their swaps
   are still scheduled via the normal kernel vblank events
   path, as before the change. The only differences are that
   more inefficient copyswaps instead of kms pageflips are used
   for fullscreen drawables, timestamping gets unreliable and
   tearing is present at the top of the screen due to lack of vsync.

Therefore no longer disable pageflipping if SwapBuffersWait is off.

Signed-off-by: Mario Kleiner <mario.kleiner@tuebingen.mpg.de>
---
 src/intel_display.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel_display.c b/src/intel_display.c
index b2a5904..ac3dd8f 100644
--- a/src/intel_display.c
+++ b/src/intel_display.c
@@ -1731,7 +1731,7 @@ Bool intel_mode_pre_init(ScrnInfoPtr scrn, int fd, int cpp)
 	gp.value = &has_flipping;
 	(void)drmCommandWriteRead(intel->drmSubFD, DRM_I915_GETPARAM, &gp,
 				  sizeof(gp));
-	if (has_flipping && intel->swapbuffers_wait) {
+	if (has_flipping) {
 		xf86DrvMsg(scrn->scrnIndex, X_INFO,
 			   "Kernel page flipping support detected, enabling\n");
 		intel->use_pageflipping = TRUE;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: Improvements and fixes for Intel ddx swap scheduling/timestamping.
  2012-10-07  6:38 Improvements and fixes for Intel ddx swap scheduling/timestamping Mario Kleiner
                   ` (2 preceding siblings ...)
  2012-10-07  6:38 ` [PATCH 3/3] ddx/dri2: Prevent option 'SwapBuffersWait' from disabling pageflip Mario Kleiner
@ 2012-10-08  8:00 ` Daniel Vetter
  2012-10-08 16:08   ` Eric Anholt
  3 siblings, 1 reply; 11+ messages in thread
From: Daniel Vetter @ 2012-10-08  8:00 UTC (permalink / raw)
  To: Mario Kleiner; +Cc: daniel.vetter, intel-gfx

On Sun, Oct 07, 2012 at 08:38:07AM +0200, Mario Kleiner wrote:
> Hi,
> 
> a series of three patches to improve the dri2 swap scheduling
> and timestamping for the current intel ddx.
> 
> The first one enables proper OML_sync_control timestamping
> while triple-buffering is enabled and XOrg 1.12+ with DRI2SwapLimit
> support is in use. So far, timestamping was unuseable with
> triple-buffering, only worked with double-buffering.
> 
> The second one repairs the broken pageflip swap scheduling, which
> is apparently in a frightening state for timing sensitive apps
> since a year, due to a tiny but really ugly bug. In a perfect
> implementation of Murphy's law, the same commit that broke the
> scheduling also disabled the builtin correctness checks that
> were supposed to catch such bugs.
> 
> The third one proposes to revert 'SwapBuffersWait' to its old
> behaviour where it didn't affect pageflipping. I just can't
> think of a case where the current behaviour makes any sense, not
> even for benchmarking? But maybe i'm overlooking something.
> 
> All patches were tested against an Intel 945-GME gpu.
> 
> I don't really care about the 'SwapBuffersWait' patch one way
> or the other, but the first two are crucial to make the intel
> ddx useable in a painless and safe way again for users of
> timing sensitive applications.

This kind of regressions suck. Are there no tests (in piglit) that check
OML buffer swap and timestamping behaviour at least internally? Or could I
volunteer you to create that? Toghether with the new kernel flip tests
this should catch any further such regressions ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Improvements and fixes for Intel ddx swap scheduling/timestamping.
  2012-10-08  8:00 ` Improvements and fixes for Intel ddx swap scheduling/timestamping Daniel Vetter
@ 2012-10-08 16:08   ` Eric Anholt
  2012-10-08 22:49     ` Mario Kleiner
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Anholt @ 2012-10-08 16:08 UTC (permalink / raw)
  To: Daniel Vetter, Mario Kleiner; +Cc: daniel.vetter, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 1865 bytes --]

Daniel Vetter <daniel@ffwll.ch> writes:

> On Sun, Oct 07, 2012 at 08:38:07AM +0200, Mario Kleiner wrote:
>> Hi,
>> 
>> a series of three patches to improve the dri2 swap scheduling
>> and timestamping for the current intel ddx.
>> 
>> The first one enables proper OML_sync_control timestamping
>> while triple-buffering is enabled and XOrg 1.12+ with DRI2SwapLimit
>> support is in use. So far, timestamping was unuseable with
>> triple-buffering, only worked with double-buffering.
>> 
>> The second one repairs the broken pageflip swap scheduling, which
>> is apparently in a frightening state for timing sensitive apps
>> since a year, due to a tiny but really ugly bug. In a perfect
>> implementation of Murphy's law, the same commit that broke the
>> scheduling also disabled the builtin correctness checks that
>> were supposed to catch such bugs.
>> 
>> The third one proposes to revert 'SwapBuffersWait' to its old
>> behaviour where it didn't affect pageflipping. I just can't
>> think of a case where the current behaviour makes any sense, not
>> even for benchmarking? But maybe i'm overlooking something.
>> 
>> All patches were tested against an Intel 945-GME gpu.
>> 
>> I don't really care about the 'SwapBuffersWait' patch one way
>> or the other, but the first two are crucial to make the intel
>> ddx useable in a painless and safe way again for users of
>> timing sensitive applications.
>
> This kind of regressions suck. Are there no tests (in piglit) that check
> OML buffer swap and timestamping behaviour at least internally? Or could I
> volunteer you to create that? Toghether with the new kernel flip tests
> this should catch any further such regressions ...

I posted a first couple of tests that try to just hit the API for the
most part, but I don't test the MSCs or wall-time timing behavior.

[-- Attachment #1.2: Type: application/pgp-signature, Size: 197 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Improvements and fixes for Intel ddx swap scheduling/timestamping.
  2012-10-08 16:08   ` Eric Anholt
@ 2012-10-08 22:49     ` Mario Kleiner
  0 siblings, 0 replies; 11+ messages in thread
From: Mario Kleiner @ 2012-10-08 22:49 UTC (permalink / raw)
  To: Eric Anholt; +Cc: Daniel Vetter, intel-gfx

On 08.10.12 18:08, Eric Anholt wrote:
> Daniel Vetter <daniel@ffwll.ch> writes:
>
>> On Sun, Oct 07, 2012 at 08:38:07AM +0200, Mario Kleiner wrote:
>>> Hi,
>>>
>>> a series of three patches to improve the dri2 swap scheduling
>>> and timestamping for the current intel ddx.
>>>
>>> The first one enables proper OML_sync_control timestamping
>>> while triple-buffering is enabled and XOrg 1.12+ with DRI2SwapLimit
>>> support is in use. So far, timestamping was unuseable with
>>> triple-buffering, only worked with double-buffering.
>>>
>>> The second one repairs the broken pageflip swap scheduling, which
>>> is apparently in a frightening state for timing sensitive apps
>>> since a year, due to a tiny but really ugly bug. In a perfect
>>> implementation of Murphy's law, the same commit that broke the
>>> scheduling also disabled the builtin correctness checks that
>>> were supposed to catch such bugs.
>>>
>>> The third one proposes to revert 'SwapBuffersWait' to its old
>>> behaviour where it didn't affect pageflipping. I just can't
>>> think of a case where the current behaviour makes any sense, not
>>> even for benchmarking? But maybe i'm overlooking something.
>>>
>>> All patches were tested against an Intel 945-GME gpu.
>>>
>>> I don't really care about the 'SwapBuffersWait' patch one way
>>> or the other, but the first two are crucial to make the intel
>>> ddx useable in a painless and safe way again for users of
>>> timing sensitive applications.
>>
>> This kind of regressions suck. Are there no tests (in piglit) that check
>> OML buffer swap and timestamping behaviour at least internally? Or could I
>> volunteer you to create that? Toghether with the new kernel flip tests
>> this should catch any further such regressions ...
>
> I posted a first couple of tests that try to just hit the API for the
> most part, but I don't test the MSCs or wall-time timing behavior.
>

I will also have a look at piglit at one of the next weekends and see 
what i can do. I haven't tried piglit yet, but that would make sense if 
it can catch bugs before a driver release. My own toolkit has some 
builtin online tests for the bugs or limitations that are so far known, 
but that can only catch driver bugs after a broken driver is released. 
Here is what i currently test for, starting at the line in the link 
(paranoia at work):

<https://github.com/kleinerm/Psychtoolbox-3/blob/master/PsychSourceGL/Source/Linux/Screen/PsychWindowGlue.c#L1098>

-mario

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] ddx/dri2: Repair broken pageflip swap scheduling.
       [not found]   ` <509365D3.3080203@tuebingen.mpg.de>
@ 2012-12-10 18:48     ` Jesse Barnes
  2012-12-10 23:00       ` Jesse Barnes
  2012-12-12 20:15       ` Mario Kleiner
  0 siblings, 2 replies; 11+ messages in thread
From: Jesse Barnes @ 2012-12-10 18:48 UTC (permalink / raw)
  To: Mario Kleiner, chris, intel-gfx

> On 15.10.12 16:52, Chris Wilson wrote:
>  > On Mon, 15 Oct 2012 16:46:52 +0200, Mario Kleiner 
> <mario.kleiner@tuebingen.mpg.de> wrote:
>  >> Hi Chris,
>  >>
>  >> can you please check & merge at least the first two patches of the
>  >> series into the intel ddx?
> 
> Thanks for the quick reply.
> 
>  >
>  > The first is along the right path, but I think you want to base the
>  > split on divisor==0.
> 
> I don't think so, unless i misunderstand what you mean? The optimization 
> of I830DRI2ScheduleFlip()'ing immediately should only happen in the case 
> where current_msc is >= target_msc, ie., if the client really requests 
> swap at next possible vblank, regardless what divisor is, once we've 
> entered the whole ...
> 
> if (divisor == 0 || current_msc < *target_msc) {
> 
> ... block. Checking for divisor == 0 would be a nice little cleanup if 
> only either (divisor == 0) or (current_msc < *target_msc) could be true. 
> But it can happen that both (divisor == 0) and (current_msc < 
> *target_msc) and then a check for (divisor == 0) wouldn't tell you if 
> target_msc is in the future and the kernel vblank mechanism should 
> schedule swap in the future, or if it is time for an immediate flip.
> 
> Also i tested with various distances between successive swap and with 
> divisor 0 vs. non-zero, so at least it works as advertised with the 
> current patch.
> 
> So that patch should be ok.

Yeah I don't understand the flip schedule at the top there either; if
target_msc is out in the future, why would we schedule a page flip
immediately just because divisor == 0?

Maybe it should look like this instead?

if (divisor == 0 || current_msc < *target_msc) {
	if (divisor == 0 && current_msc >= *target_msc)
		if (flip && I830DRI2ScheduleFlip(intel,	draw, swap_info))
			return TRUE;

(could clean that up a little but it captures the optimization I think
and avoids an extra vblank and potential frame delay.)

>   The second is broken in the DRI2 layer, as the
>  > SwapLimit mechanism exposes the multi-client race to a single client
>  > (i.e. there is no serialisation between the InvalidateEvent and the
>  > GetBuffers, and the InvalidateEvent is always broadcast too early.)
> 
> Can you explain in more detail? I know there were many problems with the 
> invalidate events, but thought that's ok now? Multi-client race case 
> happens when a OpenGL app and a compositor is at work? So basically, the 
> whole DRI2 setup is fundamentally broken for any swaplimit other than 1 
> and we have to wait for a future DRI3 to fix this properly?
> 
> I didn't see any weird behaviour during testing under both compiz+unity 
> or without compositor with fullscreen OpenGL + pageflipping. Is this 
> race common or was i just somehow lucky?

I'm not sure I understand Chris's comment either, however in the
redirected case the compositor is ultimately responsible for when
things show up on the screen, so timestamping is meaningless in that
case.  However for fullscreen, unredirected windows (sounds like that's
what you tested), we should be able to properly queue and stamp things
as they complete AIUI.

>   The
>  > third is just plain wrong, as pageflip may be a blocking ioctl (and will
>  > impose client blocking) so anybody who wants to override SwapBuffersWait
>  > will also want to prevent the inherent wait due to the flip. In any
>  > event that option is to only paper over the loose DRI2 specification and
>  > the lack of client control...
> 
> I don't think so: If you want to run non-vsynced/tearing with max. 
> swaprate for benchmarks etc., then you must set "SwapBuffersWait" "off" 
> and set the drawables swapinterval to zero via some .drmrc setting or 
> the app calling glXSwapInterval etc. But once you've set the 
> swapinterval to zero, the x-server *always* does an immediate copy blit 
> and bypasses the whole vblank events / kms-pageflip mechanism. See

I think this comes down to what people expect of the "SwapbuffersWait"
option.  Our man page indicates it's simply an anti-tearing feature, so
if you disable it I think users would expect swaps to occur as soon as
possible or when scheduled, regardless of potential tearing.

If we leave flipping enabled, I think we'll often hit cases where a
swap will be delayed until the next vblank (when the flip is latched
into the display engine) rather than occurring immediately or when the
vblank event for it fires, with likely tearing.  Or do you think the
current code handles this case well enough that we can guarantee we
won't be delaying things by a frame sometimes?

If so, you can argue that this isn't a performance advantage, but then
it's a rather lame option anyway...  So if/until we have async page
flips, I think the current behavior is probably fine.

Mario, do we have some small tests we can add to piglit for the
timestamping cases?  Looking at simple client code always helps me
understand the complex DDX and server interaction better...

Also, I'm assuming all this works ok with triple buffering disabled?
What about with SNA?

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] ddx/dri2: Repair broken pageflip swap scheduling.
  2012-12-10 18:48     ` Jesse Barnes
@ 2012-12-10 23:00       ` Jesse Barnes
  2012-12-12 19:12         ` Mario Kleiner
  2012-12-12 20:15       ` Mario Kleiner
  1 sibling, 1 reply; 11+ messages in thread
From: Jesse Barnes @ 2012-12-10 23:00 UTC (permalink / raw)
  To: Jesse Barnes; +Cc: intel-gfx

On Mon, 10 Dec 2012 10:48:29 -0800
Jesse Barnes <jbarnes@virtuousgeek.org> wrote:

> > On 15.10.12 16:52, Chris Wilson wrote:
> >  > On Mon, 15 Oct 2012 16:46:52 +0200, Mario Kleiner 
> > <mario.kleiner@tuebingen.mpg.de> wrote:
> >  >> Hi Chris,
> >  >>
> >  >> can you please check & merge at least the first two patches of the
> >  >> series into the intel ddx?
> > 
> > Thanks for the quick reply.
> > 
> >  >
> >  > The first is along the right path, but I think you want to base the
> >  > split on divisor==0.
> > 
> > I don't think so, unless i misunderstand what you mean? The optimization 
> > of I830DRI2ScheduleFlip()'ing immediately should only happen in the case 
> > where current_msc is >= target_msc, ie., if the client really requests 
> > swap at next possible vblank, regardless what divisor is, once we've 
> > entered the whole ...
> > 
> > if (divisor == 0 || current_msc < *target_msc) {
> > 
> > ... block. Checking for divisor == 0 would be a nice little cleanup if 
> > only either (divisor == 0) or (current_msc < *target_msc) could be true. 
> > But it can happen that both (divisor == 0) and (current_msc < 
> > *target_msc) and then a check for (divisor == 0) wouldn't tell you if 
> > target_msc is in the future and the kernel vblank mechanism should 
> > schedule swap in the future, or if it is time for an immediate flip.
> > 
> > Also i tested with various distances between successive swap and with 
> > divisor 0 vs. non-zero, so at least it works as advertised with the 
> > current patch.
> > 
> > So that patch should be ok.
> 
> Yeah I don't understand the flip schedule at the top there either; if
> target_msc is out in the future, why would we schedule a page flip
> immediately just because divisor == 0?
> 
> Maybe it should look like this instead?
> 
> if (divisor == 0 || current_msc < *target_msc) {
> 	if (divisor == 0 && current_msc >= *target_msc)
> 		if (flip && I830DRI2ScheduleFlip(intel,	draw, swap_info))
> 			return TRUE;

commit 2ab29a1688cd313768d928e87e145570f35b4a70
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Mon Dec 10 14:55:32 2012 -0800

    dri2: don't schedule a flip prematurely at ScheduleSwap time

Mario, can you make sure this works for you?

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] ddx/dri2: Repair broken pageflip swap scheduling.
  2012-12-10 23:00       ` Jesse Barnes
@ 2012-12-12 19:12         ` Mario Kleiner
  0 siblings, 0 replies; 11+ messages in thread
From: Mario Kleiner @ 2012-12-12 19:12 UTC (permalink / raw)
  To: Jesse Barnes; +Cc: intel-gfx

On 11.12.12 00:00, Jesse Barnes wrote:
> On Mon, 10 Dec 2012 10:48:29 -0800
> Jesse Barnes <jbarnes@virtuousgeek.org> wrote:
>
>>> On 15.10.12 16:52, Chris Wilson wrote:
>>>   > On Mon, 15 Oct 2012 16:46:52 +0200, Mario Kleiner
>>> <mario.kleiner@tuebingen.mpg.de> wrote:
>>>   >> Hi Chris,
>>>   >>
>>>   >> can you please check & merge at least the first two patches of the
>>>   >> series into the intel ddx?
>>>
>>> Thanks for the quick reply.
>>>
>>>   >
>>>   > The first is along the right path, but I think you want to base the
>>>   > split on divisor==0.
>>>
>>> I don't think so, unless i misunderstand what you mean? The optimization
>>> of I830DRI2ScheduleFlip()'ing immediately should only happen in the case
>>> where current_msc is >= target_msc, ie., if the client really requests
>>> swap at next possible vblank, regardless what divisor is, once we've
>>> entered the whole ...
>>>
>>> if (divisor == 0 || current_msc < *target_msc) {
>>>
>>> ... block. Checking for divisor == 0 would be a nice little cleanup if
>>> only either (divisor == 0) or (current_msc < *target_msc) could be true.
>>> But it can happen that both (divisor == 0) and (current_msc <
>>> *target_msc) and then a check for (divisor == 0) wouldn't tell you if
>>> target_msc is in the future and the kernel vblank mechanism should
>>> schedule swap in the future, or if it is time for an immediate flip.
>>>
>>> Also i tested with various distances between successive swap and with
>>> divisor 0 vs. non-zero, so at least it works as advertised with the
>>> current patch.
>>>
>>> So that patch should be ok.
>>
>> Yeah I don't understand the flip schedule at the top there either; if
>> target_msc is out in the future, why would we schedule a page flip
>> immediately just because divisor == 0?
>>
>> Maybe it should look like this instead?
>>
>> if (divisor == 0 || current_msc < *target_msc) {
>> 	if (divisor == 0 && current_msc >= *target_msc)
>> 		if (flip && I830DRI2ScheduleFlip(intel,	draw, swap_info))
>> 			return TRUE;
>
> commit 2ab29a1688cd313768d928e87e145570f35b4a70
> Author: Jesse Barnes <jbarnes@virtuousgeek.org>
> Date:   Mon Dec 10 14:55:32 2012 -0800
>
>      dri2: don't schedule a flip prematurely at ScheduleSwap time
>
> Mario, can you make sure this works for you?
>

Looks good for my purposes, ie., should fix glXSwapBuffersMscOML(), 
which was my main point of grief, thanks a lot! I'll retest soonish for 
extra paranoia. The divisor == 0 check is not really needed for the 
logic to work, but doesn't hurt and maybe still nice to leave there for 
readability to clarify the condition when the optimization should work.

However, it doesn't fix some cases for normal glXSwapBuffers() with a 
swapinterval > 1. That requires always updating *target_msc properly, 
and the early exit if the optimization is taken prevents that. Also the 
swap_info->frame doesn't get updated in this case, which effectively 
disables/skips some correctness test in I830DRI2FlipEventHandler().

If you want to fix those as well you'll basically end up with my 
original patch, which moves the optimization a few lines down in the 
function and adds updating of those two variables, minus lots of 
comments in the code and commit message.

But i'm happy, i mostly need glXSwapBuffersMscOML() and the pageflip 
timestamping to work properly.
-mario

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] ddx/dri2: Repair broken pageflip swap scheduling.
  2012-12-10 18:48     ` Jesse Barnes
  2012-12-10 23:00       ` Jesse Barnes
@ 2012-12-12 20:15       ` Mario Kleiner
  1 sibling, 0 replies; 11+ messages in thread
From: Mario Kleiner @ 2012-12-12 20:15 UTC (permalink / raw)
  To: Jesse Barnes; +Cc: intel-gfx

On 10.12.12 19:48, Jesse Barnes wrote:
>> On 15.10.12 16:52, Chris Wilson wrote:
>>   > On Mon, 15 Oct 2012 16:46:52 +0200, Mario Kleiner
>> <mario.kleiner@tuebingen.mpg.de> wrote:
>>   >> Hi Chris,
>>   >>
>>   >> can you please check & merge at least the first two patches of the
>>   >> series into the intel ddx?
>>
>> Thanks for the quick reply.
>>
>>   >
>>   > The first is along the right path, but I think you want to base the
>>   > split on divisor==0.
>>
>> I don't think so, unless i misunderstand what you mean? The optimization
>> of I830DRI2ScheduleFlip()'ing immediately should only happen in the case
>> where current_msc is >= target_msc, ie., if the client really requests
>> swap at next possible vblank, regardless what divisor is, once we've
>> entered the whole ...
>>
>> if (divisor == 0 || current_msc < *target_msc) {
>>
>> ... block. Checking for divisor == 0 would be a nice little cleanup if
>> only either (divisor == 0) or (current_msc < *target_msc) could be true.
>> But it can happen that both (divisor == 0) and (current_msc <
>> *target_msc) and then a check for (divisor == 0) wouldn't tell you if
>> target_msc is in the future and the kernel vblank mechanism should
>> schedule swap in the future, or if it is time for an immediate flip.
>>
>> Also i tested with various distances between successive swap and with
>> divisor 0 vs. non-zero, so at least it works as advertised with the
>> current patch.
>>
>> So that patch should be ok.
>
> Yeah I don't understand the flip schedule at the top there either; if
> target_msc is out in the future, why would we schedule a page flip
> immediately just because divisor == 0?
>
> Maybe it should look like this instead?
>
> if (divisor == 0 || current_msc < *target_msc) {
> 	if (divisor == 0 && current_msc >= *target_msc)
> 		if (flip && I830DRI2ScheduleFlip(intel,	draw, swap_info))
> 			return TRUE;
>
> (could clean that up a little but it captures the optimization I think
> and avoids an extra vblank and potential frame delay.)
>
>>    The second is broken in the DRI2 layer, as the
>>   > SwapLimit mechanism exposes the multi-client race to a single client
>>   > (i.e. there is no serialisation between the InvalidateEvent and the
>>   > GetBuffers, and the InvalidateEvent is always broadcast too early.)
>>
>> Can you explain in more detail? I know there were many problems with the
>> invalidate events, but thought that's ok now? Multi-client race case
>> happens when a OpenGL app and a compositor is at work? So basically, the
>> whole DRI2 setup is fundamentally broken for any swaplimit other than 1
>> and we have to wait for a future DRI3 to fix this properly?
>>
>> I didn't see any weird behaviour during testing under both compiz+unity
>> or without compositor with fullscreen OpenGL + pageflipping. Is this
>> race common or was i just somehow lucky?
>
> I'm not sure I understand Chris's comment either, however in the
> redirected case the compositor is ultimately responsible for when
> things show up on the screen, so timestamping is meaningless in that
> case.  However for fullscreen, unredirected windows (sounds like that's
> what you tested), we should be able to properly queue and stamp things
> as they complete AIUI.
>

Yes. I need it to work for fullscreen, unredirected windows. This is the 
only case where timestamping works reliably atm. My patch only applies 
to page-flipped swaps. In this cases (no compositor, fullscreen or 
unredirected with compositor) i didn't see problems on my system. But if 
it is a race it could be that my app is just not triggering it in 
typical use.

It would be good to know when and when not exactly these races with 
invalidate events can happen?

>>    The
>>   > third is just plain wrong, as pageflip may be a blocking ioctl (and will
>>   > impose client blocking) so anybody who wants to override SwapBuffersWait
>>   > will also want to prevent the inherent wait due to the flip. In any
>>   > event that option is to only paper over the loose DRI2 specification and
>>   > the lack of client control...
>>
>> I don't think so: If you want to run non-vsynced/tearing with max.
>> swaprate for benchmarks etc., then you must set "SwapBuffersWait" "off"
>> and set the drawables swapinterval to zero via some .drmrc setting or
>> the app calling glXSwapInterval etc. But once you've set the
>> swapinterval to zero, the x-server *always* does an immediate copy blit
>> and bypasses the whole vblank events / kms-pageflip mechanism. See
>
> I think this comes down to what people expect of the "SwapbuffersWait"
> option.  Our man page indicates it's simply an anti-tearing feature, so
> if you disable it I think users would expect swaps to occur as soon as
> possible or when scheduled, regardless of potential tearing.
>
> If we leave flipping enabled, I think we'll often hit cases where a
> swap will be delayed until the next vblank (when the flip is latched
> into the display engine) rather than occurring immediately or when the
> vblank event for it fires, with likely tearing.  Or do you think the
> current code handles this case well enough that we can guarantee we
> won't be delaying things by a frame sometimes?
>

My assumption was that the use case is getting maximum framerate for 
benchmarking, or for gamers who happily trade tearing for performance. 
In both cases your app probably just calls glXSwapBuffers asap and 
doesn't use deferred swaps. Also to avoid being throttled to refresh 
rate by use of vblank events, the app/user would set the swapinterval to 
zero -- But in this case, the x-server will skip the whole vblank event 
swap path anyway and always call DRI2CopyRegion() to do a blit copy, and 
those are not vsync'ed if "SwapBuffersWait" "off". My argument is that 
in the case where you don't want vsync the current code doesn't make a 
difference because vsync'ed pageflips weren't used anyway, but in the 
case you want vsync'ed swaps via non-zero swapinterval, you'll lose 
pageflipping and introduce uneccessary tearing and lower performance.

> If so, you can argue that this isn't a performance advantage, but then
> it's a rather lame option anyway...  So if/until we have async page
> flips, I think the current behavior is probably fine.
>

I don't really care about this case one way or the other for my purpose, 
i just thought the old behavior made more sense, assuming people use it 
the way i think they'd use it.

But didn't you submit some async-flip ioctl() patches recently to allow 
non-vsync'ed kms flips? I also remember Chris posting a patch set for 
the x-server that allowed to communicate this properly to the ddx? Those 
together would solve it properly without "SwapBuffersWait" hacks.

> Mario, do we have some small tests we can add to piglit for the
> timestamping cases?  Looking at simple client code always helps me
> understand the complex DDX and server interaction better...
>

I'll write some, hopefully i'll find some time under the christmas tree...

> Also, I'm assuming all this works ok with triple buffering disabled?

Yes, works with double-buffering, that path is unchanged. Atm. that's 
what works correctly with timestamping. All my users have to disable 
triple-buffering to get a useable setup. My hope was to spare them the 
editing of xorg.conf in the default triple-buffering case, but we can 
live with that inconvenience as long as the xorg.conf option works.

> What about with SNA?

I think i don't understand the relationship between OpenGL and SNA? I'm 
somewhat lost there. As far as i can see there are completely separate 
implementations for SNA and OpenGL use? As far as i understood from 
following bits on the mailing list and on Phoronix, SNA is all about 
accelerating 2D operations, not 3D operations? But then i can see that 
quite some of the swap scheduling etc. is replicated in the SNA code, 
even the use of target_msc, divisor, remainder stuff for deferred swaps, 
and it is not clear how this OpenGL concepts are related to a 2D 
acceleration architecture?

Enlight me a bit if you can,

thanks,
-mario

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2012-12-12 20:15 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-10-07  6:38 Improvements and fixes for Intel ddx swap scheduling/timestamping Mario Kleiner
2012-10-07  6:38 ` [PATCH 1/3] ddx/dri2: Make triple-buffering compatible with timestamping on Xorg 1.12+ Mario Kleiner
2012-10-07  6:38 ` [PATCH 2/3] ddx/dri2: Repair broken pageflip swap scheduling Mario Kleiner
     [not found]   ` <509365D3.3080203@tuebingen.mpg.de>
2012-12-10 18:48     ` Jesse Barnes
2012-12-10 23:00       ` Jesse Barnes
2012-12-12 19:12         ` Mario Kleiner
2012-12-12 20:15       ` Mario Kleiner
2012-10-07  6:38 ` [PATCH 3/3] ddx/dri2: Prevent option 'SwapBuffersWait' from disabling pageflip Mario Kleiner
2012-10-08  8:00 ` Improvements and fixes for Intel ddx swap scheduling/timestamping Daniel Vetter
2012-10-08 16:08   ` Eric Anholt
2012-10-08 22:49     ` Mario Kleiner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.