From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dario Faggioli Subject: Re: [PATCH 05/24] xen: credit2: make tickling more deterministic Date: Mon, 5 Sep 2016 15:47:41 +0200 Message-ID: <1473083261.19612.54.camel@citrix.com> References: <147145358844.25877.7490417583264534196.stgit@Solace.fritz.box> <147145428827.25877.8612560340138019986.stgit@Solace.fritz.box> <57C70F80.80802@citrix.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============3599722885390570982==" Return-path: Received: from mail6.bemta6.messagelabs.com ([193.109.254.103]) by lists.xenproject.org with esmtp (Exim 4.84_2) (envelope-from ) id 1bguFd-00035J-RZ for xen-devel@lists.xenproject.org; Mon, 05 Sep 2016 13:47:53 +0000 In-Reply-To: <57C70F80.80802@citrix.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" To: anshul makkar , xen-devel@lists.xenproject.org Cc: Andrew Cooper , George Dunlap , Jan Beulich List-Id: xen-devel@lists.xenproject.org --===============3599722885390570982== Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="=-jNgkwxxax4siTHsIcjAK" --=-jNgkwxxax4siTHsIcjAK Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Wed, 2016-08-31 at 18:10 +0100, anshul makkar wrote: > On 17/08/16 18:18, Dario Faggioli wrote: > >=20 > > Right now, the following scenario can occurr: > > =C2=A0 - upon vcpu v wakeup, v itself is put in the runqueue, > > =C2=A0=C2=A0=C2=A0=C2=A0and pcpu X is tickled; > > =C2=A0 - pcpu Y schedules (for whatever reason), sees v in > > =C2=A0=C2=A0=C2=A0=C2=A0the runqueue and picks it up. > >=20 > > This may seem ok (or even a good thing), but it's not. > > In fact, if runq_tickle() decided X is where v should > > run, it did it for a reason (load distribution, SMT > > support, cache hotness, affinity, etc), and we really > > should try as hard as possible to stick to that. > >=20 > > Of course, we can't be too strict, or we risk leaving > > vcpus in the runqueue while there is available CPU > > capacity. So, we only leave v in runqueue --for X to > > pick it up-- if we see that X has been tickled and > > has not scheduled yet, i.e., it will have a real chance > > of actually select and schedule v. > >=20 > > If that is not the case, we schedule it on Y (or, at > > least, we consider that), as running somewhere non-ideal > > is better than not running at all. > >=20 > > The commit also adds performance counters for each of > > the possible situations. > >=20 > > Signed-off-by: Dario Faggioli > > --- > > Cc: George Dunlap > > Cc: Anshul Makkar > > Cc: Jan Beulich > > Cc: Andrew Cooper > > --- > > =C2=A0 xen/common/sched_credit2.c=C2=A0=C2=A0=C2=A0|=C2=A0=C2=A0=C2=A06= 5 > > +++++++++++++++++++++++++++++++++++++++--- > > =C2=A0 xen/include/xen/perfc_defn.h |=C2=A0=C2=A0=C2=A0=C2=A03 ++ > > =C2=A0 2 files changed, 64 insertions(+), 4 deletions(-) > >=20 > > diff --git a/xen/common/sched_credit2.c > > b/xen/common/sched_credit2.c > > index 12dfd20..a3d7beb 100644 > > --- a/xen/common/sched_credit2.c > > +++ b/xen/common/sched_credit2.c > > @@ -54,6 +54,7 @@ > > =C2=A0 #define TRC_CSCHED2_LOAD_CHECK=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0TRC_SCHED_CLASS_EVT(CSCHED2, > > 16) > > =C2=A0 #define TRC_CSCHED2_LOAD_BALANCE=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0TR= C_SCHED_CLASS_EVT(CSCHED2, > > 17) > > =C2=A0 #define TRC_CSCHED2_PICKED_CPU=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0TRC_SCHED_CLASS_EVT(CSCHED2, > > 19) > > +#define TRC_CSCHED2_RUNQ_CANDIDATE=C2=A0=C2=A0=C2=A0TRC_SCHED_CLASS_EV= T(CSCHED2, > > 20) > >=20 > > =C2=A0 /* > > =C2=A0=C2=A0=C2=A0* WARNING: This is still in an experimental phase.=C2= =A0=C2=A0Status and > > work can be found at the > > @@ -398,6 +399,7 @@ struct csched2_vcpu { > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0int credit; > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0s_time_t start_time; /* When we wer= e scheduled (used for > > credit) */ > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0unsigned flags;=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0/* 16 bits doesn't seem to play well > > with clear_bit() */ > > +=C2=A0=C2=A0=C2=A0=C2=A0int tickled_cpu;=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= /* cpu tickled for picking us up (-1 if > > none) */ > >=20 > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0/* Individual contribution to load = */ > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0s_time_t load_last_update;=C2=A0=C2= =A0/* Last time average was updated > > */ > > @@ -1049,6 +1051,10 @@ runq_tickle(const struct scheduler *ops, > > struct csched2_vcpu *new, s_time_t now) > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0__cpumask_set_cpu(ipid, &rqd->tickl= ed); > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0smt_idle_mask_clear(ipid, &rqd->smt= _idle); > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0cpu_raise_softirq(ipid, SCHEDULE_SO= FTIRQ); > > + > > +=C2=A0=C2=A0=C2=A0=C2=A0if ( unlikely(new->tickled_cpu !=3D -1) ) > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0SCHED_STAT_CRANK(tickl= ed_cpu_overwritten); > > +=C2=A0=C2=A0=C2=A0=C2=A0new->tickled_cpu =3D ipid; > > =C2=A0 } > >=20 > > =C2=A0 /* > > @@ -1266,6 +1272,7 @@ csched2_alloc_vdata(const struct scheduler > > *ops, struct vcpu *vc, void *dd) > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0ASSERT(svc-= >sdom !=3D NULL); > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0svc->credit= =3D CSCHED2_CREDIT_INIT; > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0svc->weight= =3D svc->sdom->weight; > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0svc->tickled_cpu =3D -= 1; > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0/* Starting= load of 50% */ > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0svc->avgloa= d =3D 1ULL << (CSCHED2_PRIV(ops)- > > >load_precision_shift - 1); > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0svc->load_l= ast_update =3D NOW() >> > > LOADAVG_GRANULARITY_SHIFT; > > @@ -1273,6 +1280,7 @@ csched2_alloc_vdata(const struct scheduler > > *ops, struct vcpu *vc, void *dd) > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0else > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0{ > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0ASSERT(svc-= >sdom =3D=3D NULL); > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0svc->tickled_cpu =3D s= vc->vcpu->vcpu_id; > If I understood correctly, tickled_cpu refers to pcpu and not a > vcpu.=C2=A0 > Saving vcpu_id in tickled_cpu looks wrong. >=20 Yes, and in fact, as you can see in the previous hunk, for pretty much all vcpus, tickled_cpu is initialized to -1. Here, we are dealing with the vcpus of the idle domain. And for vcpus of the idle domain, their vcpu id is the same as the id of the pcpu they're associated to. I agree it looks a little bit weird, but it's both correct, and the easiest and cleanest way for initializing this. I guess I can add a comment... Thanks and Regards, Dario --=20 <> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) --=-jNgkwxxax4siTHsIcjAK Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJXzXd/AAoJEBZCeImluHPuyHMQAJb+skSLPgoX3bQxfwALNpkI +4QDEzluzgMWI4OjM0MU7ewo5e6GNL6sGCxpAKtyh+UCqZFYy1/nNWCRPGXbZ1OG efBDUa4x+ldzt3UBoL8L0HeEwTy0E+kg1FAf+uvCdWz2MiXH2yyF/XycRHpS5wVz 89u9YHXcZaSVt+pvlyLYqt74a+GxEiXaKDL2TGjKflJ6uQ51bqsohjLXNGSZIUP+ GBRK9hPRCeDMK54GKLWwQMt6URYpfkaqess1gsGuZ3O8DLATaw2DQKlYuk4DHE36 sq5RlIKSwH+Hcx5mqtgG1FkWIR0d46VLyqNj6rHTszwLBQb3FImjS/6DUlMtgjUU Dh9JxTfULroEkqUSWO6o26pM1UHR7qPiEwos8uzQaVNCHjBxRnX39+lksSjo6pr5 U7EJ6CCy6rgaIsxLlb0qXZO8kRHm4KTJYGF9YDIbJOzGvKakEkVwfCBABJkFkI8y NIlvYY1jUnoconcgRDsErgF8+S9xb1oT5Ov1lvavq8IEJuXMHzg4JK7X3oQEofln 0xOFWs87cgkbFi3egMOLM/A0Y0YiNPmnaOBkFb9w3kYiuAqXuuIv4G/DwYIh0CiQ YGuttMJFojDsoMTIhGxUxAnWROaw9S/PYThHnTYF91IfBlFIEu/AxxZf5xluDkPx FIYw/ttbQdjiAAM6HzTE =oNvK -----END PGP SIGNATURE----- --=-jNgkwxxax4siTHsIcjAK-- --===============3599722885390570982== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KWGVuLWRldmVs IG1haWxpbmcgbGlzdApYZW4tZGV2ZWxAbGlzdHMueGVuLm9yZwpodHRwczovL2xpc3RzLnhlbi5v cmcveGVuLWRldmVsCg== --===============3599722885390570982==--