All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drm/i915: Use usleep_range() in wait_for()
@ 2015-03-20 19:28 ville.syrjala
  2015-03-20 19:31 ` Jesse Barnes
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: ville.syrjala @ 2015-03-20 19:28 UTC (permalink / raw)
  To: intel-gfx

From: Ville Syrjälä <ville.syrjala@linux.intel.com>

msleep() can sleep for way too long, so switch wait_for() to use
usleep_range() instead. Following a totally unscientific method
I just picked the range as W-2W.

This cuts the i915 init time on my BSW to almost half:
- initcall i915_init+0x0/0xa8 [i915] returned 0 after 419977 usecs
+ initcall i915_init+0x0/0xa8 [i915] returned 0 after 238419 usecs

Note that I didn't perform any other benchmarks on this so far.

Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_drv.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 6f20f3a..d2a4de0 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -56,8 +56,8 @@
 				ret__ = -ETIMEDOUT;			\
 			break;						\
 		}							\
-		if (W && drm_can_sleep())  {				\
-			msleep(W);					\
+		if ((W) && drm_can_sleep()) {				\
+			usleep_range((W)*1000, (W)*2000);		\
 		} else {						\
 			cpu_relax();					\
 		}							\
-- 
2.0.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] drm/i915: Use usleep_range() in wait_for()
  2015-03-20 19:28 [PATCH] drm/i915: Use usleep_range() in wait_for() ville.syrjala
@ 2015-03-20 19:31 ` Jesse Barnes
  2015-03-23  9:30   ` Daniel Vetter
  2015-03-20 21:09 ` Chris Wilson
  2015-03-20 23:38 ` shuang.he
  2 siblings, 1 reply; 7+ messages in thread
From: Jesse Barnes @ 2015-03-20 19:31 UTC (permalink / raw)
  To: ville.syrjala, intel-gfx

On 03/20/2015 12:28 PM, ville.syrjala@linux.intel.com wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> msleep() can sleep for way too long, so switch wait_for() to use
> usleep_range() instead. Following a totally unscientific method
> I just picked the range as W-2W.
> 
> This cuts the i915 init time on my BSW to almost half:
> - initcall i915_init+0x0/0xa8 [i915] returned 0 after 419977 usecs
> + initcall i915_init+0x0/0xa8 [i915] returned 0 after 238419 usecs
> 
> Note that I didn't perform any other benchmarks on this so far.
> 
> Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/intel_drv.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index 6f20f3a..d2a4de0 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -56,8 +56,8 @@
>  				ret__ = -ETIMEDOUT;			\
>  			break;						\
>  		}							\
> -		if (W && drm_can_sleep())  {				\
> -			msleep(W);					\
> +		if ((W) && drm_can_sleep()) {				\
> +			usleep_range((W)*1000, (W)*2000);		\
>  		} else {						\
>  			cpu_relax();					\
>  		}							\
> 

Nice improvement!  I guess it's just because usleep_range() uses an
hrtimeout, but since we have the added slop of the range it may actually
be an improvement, power-wise too.

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] drm/i915: Use usleep_range() in wait_for()
  2015-03-20 19:28 [PATCH] drm/i915: Use usleep_range() in wait_for() ville.syrjala
  2015-03-20 19:31 ` Jesse Barnes
@ 2015-03-20 21:09 ` Chris Wilson
  2015-03-23  9:39   ` Ville Syrjälä
  2015-03-20 23:38 ` shuang.he
  2 siblings, 1 reply; 7+ messages in thread
From: Chris Wilson @ 2015-03-20 21:09 UTC (permalink / raw)
  To: ville.syrjala; +Cc: intel-gfx

On Fri, Mar 20, 2015 at 09:28:08PM +0200, ville.syrjala@linux.intel.com wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> msleep() can sleep for way too long, so switch wait_for() to use
> usleep_range() instead. Following a totally unscientific method
> I just picked the range as W-2W.
> 
> This cuts the i915 init time on my BSW to almost half:
> - initcall i915_init+0x0/0xa8 [i915] returned 0 after 419977 usecs
> + initcall i915_init+0x0/0xa8 [i915] returned 0 after 238419 usecs
> 
> Note that I didn't perform any other benchmarks on this so far.
> 
> Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Hmm, I think we can improve further with a more variable sleep. The
maximum we pass to wait_for() is usually plucked from somewhere in the
spec (or is just a safety factor). Either way, it is a good guide as to
how to actually sleep for - if say we try to only sample 1000 times up
to the maximum:

if (do_sleep && drm_can_sleep()) {
	usleep_range((MS), 10*(MS));
}

So whilst you have a situation where we clearly sleep too long between
sampling (a register), it would be beneficial to start adding some debug
infrastructure. Or even better if usleep_range already does it for us...
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] drm/i915: Use usleep_range() in wait_for()
  2015-03-20 19:28 [PATCH] drm/i915: Use usleep_range() in wait_for() ville.syrjala
  2015-03-20 19:31 ` Jesse Barnes
  2015-03-20 21:09 ` Chris Wilson
@ 2015-03-20 23:38 ` shuang.he
  2 siblings, 0 replies; 7+ messages in thread
From: shuang.he @ 2015-03-20 23:38 UTC (permalink / raw)
  To: shuang.he, ethan.gao, intel-gfx, ville.syrjala

Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
Task id: 6020
-------------------------------------Summary-------------------------------------
Platform          Delta          drm-intel-nightly          Series Applied
PNV                 -2              274/274              272/274
ILK                                  303/303              303/303
SNB                                  303/303              303/303
IVB                 -2              342/342              340/342
BYT                                  287/287              287/287
HSW                                  362/362              362/362
BDW                                  308/308              308/308
-------------------------------------Detailed-------------------------------------
Platform  Test                                drm-intel-nightly          Series Applied
 PNV  igt_gem_userptr_blits_minor-sync-interruptible      DMESG_WARN(1)PASS(3)      DMESG_WARN(1)PASS(1)
*PNV  igt_gen3_render_linear_blits      FAIL(1)PASS(6)      CRASH(1)PASS(1)
 IVB  igt_gem_pwrite_pread_snooped-copy-performance      DMESG_WARN(2)PASS(3)      DMESG_WARN(1)PASS(1)
*IVB  igt_gem_storedw_batches_loop_secure-dispatch      PASS(4)      DMESG_WARN(1)PASS(1)
Note: You need to pay more attention to line start with '*'
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] drm/i915: Use usleep_range() in wait_for()
  2015-03-20 19:31 ` Jesse Barnes
@ 2015-03-23  9:30   ` Daniel Vetter
  0 siblings, 0 replies; 7+ messages in thread
From: Daniel Vetter @ 2015-03-23  9:30 UTC (permalink / raw)
  To: Jesse Barnes; +Cc: intel-gfx

On Fri, Mar 20, 2015 at 12:31:16PM -0700, Jesse Barnes wrote:
> On 03/20/2015 12:28 PM, ville.syrjala@linux.intel.com wrote:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > 
> > msleep() can sleep for way too long, so switch wait_for() to use
> > usleep_range() instead. Following a totally unscientific method
> > I just picked the range as W-2W.
> > 
> > This cuts the i915 init time on my BSW to almost half:
> > - initcall i915_init+0x0/0xa8 [i915] returned 0 after 419977 usecs
> > + initcall i915_init+0x0/0xa8 [i915] returned 0 after 238419 usecs
> > 
> > Note that I didn't perform any other benchmarks on this so far.
> > 
> > Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_drv.h | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> > index 6f20f3a..d2a4de0 100644
> > --- a/drivers/gpu/drm/i915/intel_drv.h
> > +++ b/drivers/gpu/drm/i915/intel_drv.h
> > @@ -56,8 +56,8 @@
> >  				ret__ = -ETIMEDOUT;			\
> >  			break;						\
> >  		}							\
> > -		if (W && drm_can_sleep())  {				\
> > -			msleep(W);					\
> > +		if ((W) && drm_can_sleep()) {				\
> > +			usleep_range((W)*1000, (W)*2000);		\
> >  		} else {						\
> >  			cpu_relax();					\
> >  		}							\
> > 
> 
> Nice improvement!  I guess it's just because usleep_range() uses an
> hrtimeout, but since we have the added slop of the range it may actually
> be an improvement, power-wise too.
> 
> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>

Nice indeed. Queued for -next, thanks for the patch.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] drm/i915: Use usleep_range() in wait_for()
  2015-03-20 21:09 ` Chris Wilson
@ 2015-03-23  9:39   ` Ville Syrjälä
  2015-03-23  9:46     ` Daniel Vetter
  0 siblings, 1 reply; 7+ messages in thread
From: Ville Syrjälä @ 2015-03-23  9:39 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On Fri, Mar 20, 2015 at 09:09:51PM +0000, Chris Wilson wrote:
> On Fri, Mar 20, 2015 at 09:28:08PM +0200, ville.syrjala@linux.intel.com wrote:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > 
> > msleep() can sleep for way too long, so switch wait_for() to use
> > usleep_range() instead. Following a totally unscientific method
> > I just picked the range as W-2W.
> > 
> > This cuts the i915 init time on my BSW to almost half:
> > - initcall i915_init+0x0/0xa8 [i915] returned 0 after 419977 usecs
> > + initcall i915_init+0x0/0xa8 [i915] returned 0 after 238419 usecs
> > 
> > Note that I didn't perform any other benchmarks on this so far.
> > 
> > Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Hmm, I think we can improve further with a more variable sleep. The
> maximum we pass to wait_for() is usually plucked from somewhere in the
> spec (or is just a safety factor). Either way, it is a good guide as to
> how to actually sleep for - if say we try to only sample 1000 times up
> to the maximum:
> 
> if (do_sleep && drm_can_sleep()) {
> 	usleep_range((MS), 10*(MS));
> }
> 
> So whilst you have a situation where we clearly sleep too long between
> sampling (a register), it would be beneficial to start adding some debug
> infrastructure. Or even better if usleep_range already does it for us...

Yeah, it would be nice to have some more information about how long we
sleep typically in each case.

We could perhaps then even micro optimize the 'set knob -> poll for knob
to become active' pattern by doing a preemptive sleep between the set
and poll steps in the hopes that we don't have to check the condition
more than once. Not sure that would be worth the effort though.

-- 
Ville Syrjälä
Intel OTC
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] drm/i915: Use usleep_range() in wait_for()
  2015-03-23  9:39   ` Ville Syrjälä
@ 2015-03-23  9:46     ` Daniel Vetter
  0 siblings, 0 replies; 7+ messages in thread
From: Daniel Vetter @ 2015-03-23  9:46 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx

On Mon, Mar 23, 2015 at 11:39:12AM +0200, Ville Syrjälä wrote:
> On Fri, Mar 20, 2015 at 09:09:51PM +0000, Chris Wilson wrote:
> > On Fri, Mar 20, 2015 at 09:28:08PM +0200, ville.syrjala@linux.intel.com wrote:
> > > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > 
> > > msleep() can sleep for way too long, so switch wait_for() to use
> > > usleep_range() instead. Following a totally unscientific method
> > > I just picked the range as W-2W.
> > > 
> > > This cuts the i915 init time on my BSW to almost half:
> > > - initcall i915_init+0x0/0xa8 [i915] returned 0 after 419977 usecs
> > > + initcall i915_init+0x0/0xa8 [i915] returned 0 after 238419 usecs
> > > 
> > > Note that I didn't perform any other benchmarks on this so far.
> > > 
> > > Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
> > > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > 
> > Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> > 
> > Hmm, I think we can improve further with a more variable sleep. The
> > maximum we pass to wait_for() is usually plucked from somewhere in the
> > spec (or is just a safety factor). Either way, it is a good guide as to
> > how to actually sleep for - if say we try to only sample 1000 times up
> > to the maximum:
> > 
> > if (do_sleep && drm_can_sleep()) {
> > 	usleep_range((MS), 10*(MS));
> > }
> > 
> > So whilst you have a situation where we clearly sleep too long between
> > sampling (a register), it would be beneficial to start adding some debug
> > infrastructure. Or even better if usleep_range already does it for us...
> 
> Yeah, it would be nice to have some more information about how long we
> sleep typically in each case.
> 
> We could perhaps then even micro optimize the 'set knob -> poll for knob
> to become active' pattern by doing a preemptive sleep between the set
> and poll steps in the hopes that we don't have to check the condition
> more than once. Not sure that would be worth the effort though.

Something like the below might be worth a quick shot. Maybe with 3 instead
of 2, but then we need to make sure we don't drop down to 0 ever.
-Daniel

diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index d2a4de0e4f4a..b796e4c47da5 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -58,6 +58,7 @@
 		}							\
 		if ((W) && drm_can_sleep()) {				\
 			usleep_range((W)*1000, (W)*2000);		\
+			W = ROUND_UP(W, 2);				\
 		} else {						\
 			cpu_relax();					\
 		}							\
@@ -65,7 +66,7 @@
 	ret__;								\
 })
 
-#define wait_for(COND, MS) _wait_for(COND, MS, 1)
+#define wait_for(COND, MS) _wait_for(COND, MS, ROUND_UP(COND, 2))
 #define wait_for_atomic(COND, MS) _wait_for(COND, MS, 0)
 #define wait_for_atomic_us(COND, US) _wait_for((COND), \
 					       DIV_ROUND_UP((US), 1000), 0)
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2015-03-23  9:44 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-20 19:28 [PATCH] drm/i915: Use usleep_range() in wait_for() ville.syrjala
2015-03-20 19:31 ` Jesse Barnes
2015-03-23  9:30   ` Daniel Vetter
2015-03-20 21:09 ` Chris Wilson
2015-03-23  9:39   ` Ville Syrjälä
2015-03-23  9:46     ` Daniel Vetter
2015-03-20 23:38 ` shuang.he

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.