From: George Dunlap
Subject: Re: [PATCH v2] xen/credit scheduler; Use delay to control scheduling frequency
Date: Mon, 19 Dec 2011 10:54:21 +0000
To: "Lv, Hui"
Cc: "Tian, Kevin", "xen-devel@lists.xensource.com", "keir@xen.org", "Dong, Eddie", "Duan, Jiangang", "Yu, Zhidong"
List-Id: xen-devel@lists.xenproject.org

Hui,

Unfortunately your mailer is mangling the patch:

static int __read_mostly sched_credit_tslice_ms =3D CSCHED_DEFAULT_TSLICE_=
MS;

Try using "hg email", or sending it as an attachment.

 -George

On Sat, Dec 17, 2011 at 3:24 AM, Lv, Hui wrote:
> The delay method for credit scheduler can do as well as SRC patch (previous
> one) to gain significant performance boost without obvious drawbacks.
>
> 1. Basically, the "delay method" can achieve nearly the same benefits as my
> previous SRC patch, 11% overall performance boost for SPECvirt than original
> credit scheduler.
> 2. We have tried 1ms delay and 10ms delay, there is no big difference
> between these two configurations. (1ms is enough to achieve a good
> performance)
> 3. We have compared different load level response time/latency (low, high,
> peak), "delay method" didn't bring very much response time increase.
> 4. 1ms delay can reduce 30% context switch at peak performance, where
> produces the benefits.
> (“int sched_ratelimit_us = 1000” is the recommended
> setting)
>
>
> Signed-off-by: Hui Lv
> Signed-off-by: George Dunlap
>
> diff -r 1c58bb664d8d xen/common/sched_credit.c
> --- a/xen/common/sched_credit.c Thu Dec 08 17:15:16 2011 +0000
> +++ b/xen/common/sched_credit.c Fri Dec 16 15:08:09 2011 -0500
> @@ -110,6 +110,9 @@ boolean_param("sched_credit_default_yiel
> static int __read_mostly sched_credit_tslice_ms = CSCHED_DEFAULT_TSLICE_MS;
> integer_param("sched_credit_tslice_ms", sched_credit_tslice_ms);
>
> +/* Scheduler generic parameters
> +*/
> +extern int sched_ratelimit_us;
> /*
>   * Physical CPU
>   */
> @@ -1297,10 +1300,15 @@ csched_schedule(
>      struct csched_private *prv = CSCHED_PRIV(ops);
>      struct csched_vcpu *snext;
>      struct task_slice ret;
> +    s_time_t runtime, tslice;
>
>      CSCHED_STAT_CRANK(schedule);
>      CSCHED_VCPU_CHECK(current);
>
> +    runtime = now - current->runstate.state_entry_time;
> +    if ( runtime < 0 ) /* Does this ever happen? */
> +        runtime = 0;
> +
>      if ( !is_idle_vcpu(scurr->vcpu) )
>      {
>          /* Update credits of a non-idle VCPU. */
> @@ -1313,6 +1321,41 @@ csched_schedule(
>          scurr->pri = CSCHED_PRI_IDLE;
>      }
>
> +    /* Choices, choices:
> +     * - If we have a tasklet, we need to run the idle vcpu no matter what.
> +     * - If sched rate limiting is in effect, and the current vcpu has
> +     *   run for less than that amount of time, continue the current one,
> +     *   but with a shorter timeslice and return it immediately
> +     * - Otherwise, chose the one with the highest priority (which may
> +     *   be the one currently running)
> +     * - If the currently running one is TS_OVER, see if there
> +     *   is a higher priority one waiting on the runqueue of another
> +     *   cpu and steal it.
> +     */
> +
> +    /* If we have schedule rate limiting enabled, check to see
> +     * how long we've run for. */
> +    if ( sched_ratelimit_us
> +         && !tasklet_work_scheduled
> +         && vcpu_runnable(current)
> +         && !is_idle_vcpu(current)
> +         && runtime < MICROSECS(sched_ratelimit_us) )
> +    {
> +        snext = scurr;
> +        snext->start_time += now;
> +        perfc_incr(delay_ms);
> +        tslice = MICROSECS(sched_ratelimit_us);
> +        ret.migrated = 0;
> +        goto out;
> +    }
> +    else
> +    {
> +        /*
> +         * Select next runnable local VCPU (ie top of local runq)
> +        */
> +        tslice = MILLISECS(prv->tslice_ms);
> +    }
> +
>      /*
>       * Select next runnable local VCPU (ie top of local runq)
>       */
> @@ -1367,11 +1410,12 @@ csched_schedule(
>      if ( !is_idle_vcpu(snext->vcpu) )
>          snext->start_time += now;
>
> +out:
>      /*
>       * Return task to run next...
>       */
>      ret.time = (is_idle_vcpu(snext->vcpu) ?
> -                -1 : MILLISECS(prv->tslice_ms));
> +                -1 : tslice);
>      ret.task = snext->vcpu;
>
>      CSCHED_VCPU_CHECK(ret.task);
> diff -r 1c58bb664d8d xen/common/schedule.c
> --- a/xen/common/schedule.c     Thu Dec 08 17:15:16 2011 +0000
> +++ b/xen/common/schedule.c     Fri Dec 16 15:08:09 2011 -0500
> @@ -47,6 +47,10 @@ string_param("sched", opt_sched);
> bool_t sched_smt_power_savings = 0;
> boolean_param("sched_smt_power_savings", sched_smt_power_savings);
>
> +/* Default scheduling rate limit: 1ms */
> +int sched_ratelimit_us = 1000;
> +integer_param("sched_ratelimit_us", sched_ratelimit_us);
> +
> /* Various timer handlers. */
> static void s_timer_fn(void *unused);
> static void vcpu_periodic_timer_fn(void *data);
> diff -r 1c58bb664d8d xen/include/xen/perfc_defn.h
> --- a/xen/include/xen/perfc_defn.h      Thu Dec 08 17:15:16 2011 +0000
> +++ b/xen/include/xen/perfc_defn.h      Fri Dec 16 15:08:09 2011 -0500
> @@ -16,6 +16,7 @@ PERFCOUNTER(sched_irq,              "sch
> PERFCOUNTER(sched_run,              "sched: runs through scheduler")
> PERFCOUNTER(sched_ctx,              "sched: context switches")
>
> +PERFCOUNTER(delay_ms,              "csched: delay")
> PERFCOUNTER(vcpu_check,             "csched: vcpu_check")
> PERFCOUNTER(schedule,               "csched: schedule")
> PERFCOUNTER(acct_run,               "csched: acct_run")
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
>
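
For readers following the quoted patch, the rate-limit test it adds to csched_schedule() can be sketched as a standalone C predicate. This is a minimal sketch, not the Xen code: the sketch_* names, the struct, and the nanosecond time type are assumptions made for illustration. Only the five conditions and the clamping of a negative runtime are taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for Xen's s_time_t; assumed to hold signed nanoseconds. */
typedef int64_t sketch_time_t;

/* Microseconds to nanoseconds, playing the role of MICROSECS() in the patch. */
static inline sketch_time_t sketch_microsecs(int64_t us)
{
    return us * 1000;
}

/* Hypothetical summary of the state the patch reads from the current vcpu. */
struct sketch_vcpu_state {
    sketch_time_t state_entry_time; /* when the vcpu entered its current run state */
    bool runnable;                  /* stands in for vcpu_runnable(current) */
    bool is_idle;                   /* stands in for is_idle_vcpu(current) */
};

/*
 * Returns true when rate limiting would keep the current vcpu running,
 * i.e. when the patch would take the "goto out" path with a short timeslice.
 */
static bool sketch_ratelimit_keeps_current(const struct sketch_vcpu_state *cur,
                                           sketch_time_t now,
                                           int ratelimit_us,
                                           bool tasklet_work_scheduled)
{
    sketch_time_t runtime = now - cur->state_entry_time;

    /* The patch clamps a negative runtime to zero. */
    if (runtime < 0)
        runtime = 0;

    return ratelimit_us != 0
        && !tasklet_work_scheduled
        && cur->runnable
        && !cur->is_idle
        && runtime < sketch_microsecs(ratelimit_us);
}
```

With ratelimit_us set to 1000, a runnable non-idle vcpu that has run for 0.5ms would be kept, while one that has already run for a full 1ms would fall through to the normal runqueue selection.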