* [PATCH] drm/i915: Asynchronously perform the set-base for a simple modeset
From: Chris Wilson @ 2013-08-09 14:13 UTC (permalink / raw)
  To: intel-gfx

A simple modeset, where we only wish to switch over to a new framebuffer
such as the transition from fbcon to X, takes around 30-60ms. This is
due to three factors:

1. We need to make sure the fb->obj is in the display domain, which
incurs a cache flush to ensure no dirt is left on the scanout.

2. We need to flush any pending rendering before performing the mmio
so that the frame is complete before it is shown.

3. We currently wait for the vblank after the mmio to be sure that the
old fb is no longer being shown before releasing it.

(1) can only be eliminated by userspace preparing the fb->obj in advance
to already be in the display domain. This can be done through use of the
create2 ioctl, or by reusing an existing fb->obj.

However, (2) and (3) are already solved by the existing page flip
mechanism, and it is surprisingly trivial to wire them up for use in the
set-base fast path. Arguably, though, this represents a subtle ABI break
in that the set_config ioctl now returns before the old framebuffer is
unpinned. The danger is that userspace will start to modify it while it
is still being shown; however, we should be able to prevent that through
proper domain tracking.

By combining all of the above, we can achieve an instantaneous set_config:

[     6.601] (II) intel(0): switch to mode 2560x1440@60.0 on pipe 0 using DP2, position (0, 0), rotation normal
[     6.601] (II) intel(0): Setting screen physical size to 677 x 381

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_display.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 809b968..c6eea51 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -9077,10 +9077,13 @@ static int intel_crtc_set_config(struct drm_mode_set *set)
 		ret = intel_set_mode(set->crtc, set->mode,
 				     set->x, set->y, set->fb);
 	} else if (config->fb_changed) {
-		intel_crtc_wait_for_pending_flips(set->crtc);
-
-		ret = intel_pipe_set_base(set->crtc,
-					  set->x, set->y, set->fb);
+		if (to_intel_framebuffer(set->fb)->obj->ring == NULL ||
+		    save_set.x != set->x || save_set.y != set->y ||
+		    intel_crtc_page_flip(set->crtc, set->fb, NULL)) {
+			intel_crtc_wait_for_pending_flips(set->crtc);
+			ret = intel_pipe_set_base(set->crtc,
+						  set->x, set->y, set->fb);
+		}
 	}
 
 	if (ret) {
-- 
1.8.1.2
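
For readers skimming the diff, the fallback condition it adds can be modelled in plain userspace C. This is only a sketch of how I read the hunk, not driver code: struct fb_model, struct set_model, enum base_path and choose_base_path() are hypothetical names, and has_pending_render stands in for fb->obj->ring != NULL on the assumption that a non-NULL ring means the object is still in use by the GPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver state consulted by the patch. */
struct fb_model {
	bool has_pending_render;	/* models fb->obj->ring != NULL (assumed meaning) */
};

struct set_model {
	int x, y;			/* requested scanout offset */
	const struct fb_model *fb;	/* framebuffer to show */
};

enum base_path { PATH_ASYNC_FLIP, PATH_SYNC_SET_BASE };

/*
 * Models the condition added to intel_crtc_set_config(): take the
 * synchronous wait + set-base path if the object is idle, if the
 * offset moved, or if queueing the page flip failed.
 *
 * flip_ret models the return value of intel_crtc_page_flip(). In the
 * patch the || chain short-circuits, so the flip is only attempted
 * when the first two checks pass; flip_ret is ignored otherwise.
 */
enum base_path choose_base_path(const struct set_model *saved,
				const struct set_model *set,
				int flip_ret)
{
	assert(saved && set && set->fb);

	if (!set->fb->has_pending_render)
		return PATH_SYNC_SET_BASE;
	if (saved->x != set->x || saved->y != set->y)
		return PATH_SYNC_SET_BASE;
	if (flip_ret != 0)
		return PATH_SYNC_SET_BASE;

	return PATH_ASYNC_FLIP;
}
```

Under this model, only a busy framebuffer at an unchanged offset whose flip was queued successfully takes the asynchronous path; every other case keeps the pre-patch behaviour.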


* Re: [PATCH] drm/i915: Asynchronously perform the set-base for a simple modeset
From: Daniel Vetter @ 2013-08-09 19:17 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Fri, Aug 09, 2013 at 03:13:22PM +0100, Chris Wilson wrote:
> A simple modeset, where we only wish to switch over to a new framebuffer
> such as the transition from fbcon to X, takes around 30-60ms. This is
> due to three factors:
> 
> 1. We need to make sure the fb->obj is in the display domain, which
> incurs a cache flush to ensure no dirt is left on the scanout.
> 
> 2. We need to flush any pending rendering before performing the mmio
> so that the frame is complete before it is shown.
> 
> 3. We currently wait for the vblank after the mmio to be sure that the
> old fb is no longer being shown before releasing it.
> 
> (1) can only be eliminated by userspace preparing the fb->obj in advance
> to already be in the display domain. This can be done through use of the
> create2 ioctl, or by reusing an existing fb->obj.
> 
> However, (2) and (3) are already solved by the existing page flip
> mechanism, and it is surprisingly trivial to wire them up for use in the
> set-base fast path. Arguably, though, this represents a subtle ABI break
> in that the set_config ioctl now returns before the old framebuffer is
> unpinned. The danger is that userspace will start to modify it while it
> is still being shown; however, we should be able to prevent that through
> proper domain tracking.

Hm, right now we don't prevent anyone from rendering into a
to-be-flipped-out buffer. There was once code for that, using
MI_WAIT_EVENT, but we've ripped it out. I guess we could just throw in a
synchronous stall on the flip queue, though; that should always work.

Testing would be easy if we have the crtc CRC stuff, but that's atm stuck
due to lack of volunteers ...

Overall I really like the idea and I think doing most of the plane
enabling (including psr, fbc, ips, and all that stuff which potentially
blows through a vblank wait) should be done in async work queues. That
should then also help resume time a lot.

Cheers, Daniel

> By combining all of the above, we can achieve an instantaneous set_config:
> 
> [     6.601] (II) intel(0): switch to mode 2560x1440@60.0 on pipe 0 using DP2, position (0, 0), rotation normal
> [     6.601] (II) intel(0): Setting screen physical size to 677 x 381
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/intel_display.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 809b968..c6eea51 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -9077,10 +9077,13 @@ static int intel_crtc_set_config(struct drm_mode_set *set)
>  		ret = intel_set_mode(set->crtc, set->mode,
>  				     set->x, set->y, set->fb);
>  	} else if (config->fb_changed) {
> -		intel_crtc_wait_for_pending_flips(set->crtc);
> -
> -		ret = intel_pipe_set_base(set->crtc,
> -					  set->x, set->y, set->fb);
> +		if (to_intel_framebuffer(set->fb)->obj->ring == NULL ||
> +		    save_set.x != set->x || save_set.y != set->y ||
> +		    intel_crtc_page_flip(set->crtc, set->fb, NULL)) {
> +			intel_crtc_wait_for_pending_flips(set->crtc);
> +			ret = intel_pipe_set_base(set->crtc,
> +						  set->x, set->y, set->fb);
> +		}
>  	}
>  
>  	if (ret) {
> -- 
> 1.8.1.2
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


* Re: [PATCH] drm/i915: Asynchronously perform the set-base for a simple modeset
From: Chris Wilson @ 2013-08-09 20:06 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Fri, Aug 09, 2013 at 09:17:11PM +0200, Daniel Vetter wrote:
> On Fri, Aug 09, 2013 at 03:13:22PM +0100, Chris Wilson wrote:
> > A simple modeset, where we only wish to switch over to a new framebuffer
> > such as the transition from fbcon to X, takes around 30-60ms. This is
> > due to three factors:
> > 
> > 1. We need to make sure the fb->obj is in the display domain, which
> > incurs a cache flush to ensure no dirt is left on the scanout.
> > 
> > 2. We need to flush any pending rendering before performing the mmio
> > so that the frame is complete before it is shown.
> > 
> > 3. We currently wait for the vblank after the mmio to be sure that the
> > old fb is no longer being shown before releasing it.
> > 
> > (1) can only be eliminated by userspace preparing the fb->obj in advance
> > to already be in the display domain. This can be done through use of the
> > create2 ioctl, or by reusing an existing fb->obj.
> > 
> > However, (2) and (3) are already solved by the existing page flip
> > mechanism, and it is surprisingly trivial to wire them up for use in the
> > set-base fast path. Arguably, though, this represents a subtle ABI break
> > in that the set_config ioctl now returns before the old framebuffer is
> > unpinned. The danger is that userspace will start to modify it while it
> > is still being shown; however, we should be able to prevent that through
> > proper domain tracking.
> 
> Hm, right now we don't prevent anyone from rendering into a
> to-be-flipped-out buffer. There was once code for that, using
> MI_WAIT_EVENT, but we've ripped it out. I guess we could just throw in a
> synchronous stall on the flip queue, though; that should always work.

I'm glad we did. I'd rather put that into userspace than have the
kernel impose that policy on everybody, as for X that is exactly the
behaviour we want (i.e. not blocking rendering on the next scanout).

> Testing would be easy if we have the crtc CRC stuff, but that's atm stuck
> due to lack of volunteers ...
> 
> Overall I really like the idea and I think doing most of the plane
> enabling (including psr, fbc, ips, and all that stuff which potentially
> blows through a vblank wait) should be done in async work queues. That
> should then also help resume time a lot.

I'd also like to hear Ville's opinion since with his atomic modesetting
I hope we will be able to achieve something very similar.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: [PATCH] drm/i915: Asynchronously perform the set-base for a simple modeset
From: Ville Syrjälä @ 2013-08-12  8:03 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter, intel-gfx

On Fri, Aug 09, 2013 at 09:06:36PM +0100, Chris Wilson wrote:
> On Fri, Aug 09, 2013 at 09:17:11PM +0200, Daniel Vetter wrote:
> > On Fri, Aug 09, 2013 at 03:13:22PM +0100, Chris Wilson wrote:
> > > A simple modeset, where we only wish to switch over to a new framebuffer
> > > such as the transition from fbcon to X, takes around 30-60ms. This is
> > > due to three factors:
> > > 
> > > 1. We need to make sure the fb->obj is in the display domain, which
> > > incurs a cache flush to ensure no dirt is left on the scanout.
> > > 
> > > 2. We need to flush any pending rendering before performing the mmio
> > > so that the frame is complete before it is shown.
> > > 
> > > 3. We currently wait for the vblank after the mmio to be sure that the
> > > old fb is no longer being shown before releasing it.
> > > 
> > > (1) can only be eliminated by userspace preparing the fb->obj in advance
> > > to already be in the display domain. This can be done through use of the
> > > create2 ioctl, or by reusing an existing fb->obj.
> > > 
> > > However, (2) and (3) are already solved by the existing page flip
> > > mechanism, and it is surprisingly trivial to wire them up for use in the
> > > set-base fast path. Arguably, though, this represents a subtle ABI break
> > > in that the set_config ioctl now returns before the old framebuffer is
> > > unpinned. The danger is that userspace will start to modify it while it
> > > is still being shown; however, we should be able to prevent that through
> > > proper domain tracking.
> > 
> > Hm, right now we don't prevent anyone from rendering into a
> > to-be-flipped-out buffer. There was once code for that, using
> > MI_WAIT_EVENT, but we've ripped it out. I guess we could just throw in a
> > synchronous stall on the flip queue, though; that should always work.
> 
> I'm glad we did. I'd rather put that into userspace than have the
> kernel impose that policy on everybody, as for X that is exactly the
> behaviour we want (i.e. not blocking rendering on the next scanout).
> 
> > Testing would be easy if we have the crtc CRC stuff, but that's atm stuck
> > due to lack of volunteers ...
> > 
> > Overall I really like the idea and I think doing most of the plane
> > enabling (including psr, fbc, ips, and all that stuff which potentially
> > blows through a vblank wait) should be done in async work queues. That
> > should then also help resume time a lot.
> 
> I'd also like to hear Ville's opinion since with his atomic modesetting
> I hope we will be able to achieve something very similar.

Async is nice. Like everyone else I suppose, my only concern has to do
with writes to the old scanout buffer. One option would be to add an
event also to setcrtc, and maybe a new flag to request the
event+nonblocking operation. That's what I have in my atomic page flip
code (actually I think I had separate flags for each). Unfortunately
we don't seem to have flags in setcrtc, unless we commandeer some bits
from mode_valid.
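
To make the flag idea concrete, here is a rough userspace sketch of how two separate opt-in bits could select the behaviour. Everything in it is hypothetical: SKETCH_SETCRTC_EVENT, SKETCH_SETCRTC_NONBLOCK, struct sketch_setcrtc_result and sketch_setcrtc_plan() are illustration-only names, no such flags exist in setcrtc today, and where the bits would actually live (for example, borrowed from mode_valid) is left open.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits; setcrtc has no flags field at present. */
#define SKETCH_SETCRTC_EVENT	(1u << 0)	/* send a completion event */
#define SKETCH_SETCRTC_NONBLOCK	(1u << 1)	/* return before the flip lands */

/* What the ioctl would do for a given combination of flags. */
struct sketch_setcrtc_result {
	bool waits_for_vblank;	/* block until the old fb is off-screen */
	bool sends_event;	/* tell userspace when the old fb is reusable */
};

/*
 * Decode the flags into a plan. With flags == 0 the result is the
 * existing blocking behaviour, so callers that never opt in would
 * see no change.
 */
int sketch_setcrtc_plan(uint32_t flags, struct sketch_setcrtc_result *out)
{
	assert(out);

	/* Reject unknown bits so they stay available for later use. */
	if (flags & ~(SKETCH_SETCRTC_EVENT | SKETCH_SETCRTC_NONBLOCK))
		return -EINVAL;

	out->waits_for_vblank = !(flags & SKETCH_SETCRTC_NONBLOCK);
	out->sends_event = (flags & SKETCH_SETCRTC_EVENT) != 0;
	return 0;
}
```

Keeping the two bits separate lets userspace ask for the event without giving up the blocking semantics, or go nonblocking and rely on its own tracking of the old scanout buffer.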

-- 
Ville Syrjälä
Intel OTC


