From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dario Faggioli Subject: Re: [PATCH 2/3] xen: Have schedulers revise initial placement Date: Fri, 12 Aug 2016 01:35:21 +0200 Message-ID: <1470958521.6250.37.camel@citrix.com> References: <1468605722-24239-1-git-send-email-george.dunlap@citrix.com> <1468605722-24239-2-git-send-email-george.dunlap@citrix.com> <579F434B0200007800101346@prv-mh.provo.novell.com> <1470054737.3311.0.camel@citrix.com> <57A4AFA70200007800103367@prv-mh.provo.novell.com> <1470927584.6250.26.camel@citrix.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============8637833087600628913==" Return-path: Received: from mail6.bemta6.messagelabs.com ([85.158.143.247]) by lists.xenproject.org with esmtp (Exim 4.84_2) (envelope-from ) id 1bXzVj-0003pe-IN for xen-devel@lists.xenproject.org; Thu, 11 Aug 2016 23:35:39 +0000 In-Reply-To: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" To: Andrew Cooper , Jan Beulich , George Dunlap Cc: xen-devel@lists.xenproject.org, Anshul Makkar , MengXu List-Id: xen-devel@lists.xenproject.org --===============8637833087600628913== Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="=-td0leEg/8T1mluy5Qry/" --=-td0leEg/8T1mluy5Qry/ Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Thu, 2016-08-11 at 16:51 +0100, Andrew Cooper wrote: > On 11/08/16 15:59, Dario Faggioli wrote: > >=C2=A0 > > Which, I think needs at least this hunk (from 6b53bb4ab3c9=C2=A0=C2=A0"= sched: > > better handle (not) inserting idle vCPUs in runqueues"): > >=20 > > diff --git a/xen/common/schedule.c b/xen/common/schedule.c > > index 2beebe8..fddcd52 100644 > > --- a/xen/common/schedule.c > > +++ b/xen/common/schedule.c > > @@ -240,20 +240,22 @@ int sched_init_vcpu(struct vcpu *v, unsigned > > int processor) > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0init_timer(&v->poll_timer, poll_timer_fn, > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0v, v->processor); > > =C2=A0 > > -=C2=A0=C2=A0=C2=A0=C2=A0/* Idle VCPUs are scheduled immediately. */ > > +=C2=A0=C2=A0=C2=A0=C2=A0v->sched_priv =3D SCHED_OP(DOM2OP(d), alloc_vd= ata, v, d- > > >sched_priv); > > +=C2=A0=C2=A0=C2=A0=C2=A0if ( v->sched_priv =3D=3D NULL ) > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0return 1; > > + > > +=C2=A0=C2=A0=C2=A0=C2=A0TRACE_2D(TRC_SCHED_DOM_ADD, v->domain->domain_= id, v->vcpu_id); > > + > > +=C2=A0=C2=A0=C2=A0=C2=A0/* Idle VCPUs are scheduled immediately, so do= n't put them in > > runqueue. */ > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if ( is_idle_domain(d) ) > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0{ > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0per_cpu(schedule_= data, v->processor).curr =3D v; > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0v->is_running =3D= 1; > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0} > > - > > -=C2=A0=C2=A0=C2=A0=C2=A0TRACE_2D(TRC_SCHED_DOM_ADD, v->domain->domain_= id, v->vcpu_id); > > - > > -=C2=A0=C2=A0=C2=A0=C2=A0v->sched_priv =3D SCHED_OP(DOM2OP(d), alloc_vd= ata, v, d- > > >sched_priv); > > -=C2=A0=C2=A0=C2=A0=C2=A0if ( v->sched_priv =3D=3D NULL ) > > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0return 1; > > - > > -=C2=A0=C2=A0=C2=A0=C2=A0SCHED_OP(DOM2OP(d), insert_vcpu, v); > > +=C2=A0=C2=A0=C2=A0=C2=A0else > > +=C2=A0=C2=A0=C2=A0=C2=A0{ > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0SCHED_OP(DOM2OP(d), in= sert_vcpu, v); > > +=C2=A0=C2=A0=C2=A0=C2=A0} > > =C2=A0 > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0return 0; > > =C2=A0} > >=20 > > So, yeah, it's proving a little more complicated than how I thought > > it > > would have, just by looking at the patches. :-/ > >=20 > > Will let know. > FWIW, this looks very similar to the regression I just raised against > Xen 4.7 "[Xen-devel] Scheduler regression in 4.7".=C2=A0=C2=A0The stack t= races > are > suspiciously similar.=C2=A0=C2=A0 > I thought the same at the beginning, but they actually may not be the same or even related. This happens at early boot, and reason is we try to call csched_cpu_pick() on the idle vcpus, which does not make any sense, and in fact one of the ASSERTS triggers. In your case, system has booted fine already. And the reason for that is you're looking at 4.7, and 4.7 is no longer calling insert_vcpu(), which then calls csched_cpu_pick(), on idle vcpus at boot, thanks to the patch I'm mentioning above. And in fact, I confirm that, on 4.6, with "just" the hunk above of said patch, I can boot, create a (large) VM, play a bit with it, shutdown or reboot it, and shutdown the host as well. Also, yours seems to _explode_ because of a race on the runq (in IS_RUNQ_IDLE()), this one _asserts_ here: =C2=A0 =C2=A0 =C2=A0 =C2=A0 /* Pick an online CPU from the proper affinity = mask */ =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0csched_balance_cpumask(vc, = balance_step, &cpus); =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0cpumask_and(&cpus, &cpus, o= nline); =C2=A0 =C2=A0 =C2=A0 =C2=A0 /* If present, prefer vc's current processor */ =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0cpu =3D cpumask_test_cpu(vc= ->processor, &cpus) =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0? vc->processor =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0: cpumask_cycle(vc->processor, &cpus); =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0ASSERT(cpumask_test_cpu(cpu= , &cpus)); Because, as I said, we're on early boot, and most likely, there's almost no one in online! > I expect they have the same root cause. >=20 No, I think they're two different things. Regards, Dario --=20 <> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) --=-td0leEg/8T1mluy5Qry/ Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJXrQu6AAoJEBZCeImluHPuq/cQAJbuYzA+4LlOueCKEyWZTxeF y5O2QpLJh07y3GrKFTj3nEemR1HN7YvaW0yXPxyjish19NUYPvWRQ05YGkMQlu+M qPznGTGwhPQxwD83lr3B+4lVrPV5a1tafqd6GBVqPSUEe2X6Vt6gGEvEQDxqLuvl XQ3+gw1NQl6XTmgEPeyf54Qzc0pArXf3c9SK1aBXQuQPEKNQd3u6zoXr7lLKvNdb QnpXyqpa+kbDQfo96zOxP+/Es4/S5dWQVLDwB0RY5ey4JlDCW4NsPIS3smMvLJI4 MJfTIF2Wv94ZphJR6IMtTlOzXX0NH/pbDWkUM0vF2aWrC1rcCjaaGAg55VXDZlUJ ANpXg1SzoV2TFlhaoxCzha76wX0KZ0KMwJCBAOoW4+MxhU3QY4bOjzXIDEQik2Ul dExemDGbsoucPHWieviTlIHCtv8yQ31UWb3uodnxDP6lSgwSrANFUo8b57Rb3uxL GwgDBELHz8N3LVZh4ClfJzAqCSLVgFUlQS6MoXlxm5p6WBqmb+m8npaFen2ZsQQx GxhKk6pLwo9uvB8kP+oVnXNJCBUptJH/KLh32uB8Kr6yqVFqdIwLxjLVdPOm35L0 lbtgR0suZnOGi9AjNTGyec6XUjtnt+d+XrLpfw0lgbKmug1P8qC5WiywuV9XBu9L enHq+dFoDPf7DPmF2lb2 =AzlK -----END PGP SIGNATURE----- --=-td0leEg/8T1mluy5Qry/-- --===============8637833087600628913== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KWGVuLWRldmVs IG1haWxpbmcgbGlzdApYZW4tZGV2ZWxAbGlzdHMueGVuLm9yZwpodHRwczovL2xpc3RzLnhlbi5v cmcveGVuLWRldmVsCg== --===============8637833087600628913==--