From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dario Faggioli
Subject: Re: [PATCH 2/3] xen: Have schedulers revise initial placement
Date: Thu, 11 Aug 2016 16:59:44 +0200
Message-ID: <1470927584.6250.26.camel@citrix.com>
In-Reply-To: <57A4AFA70200007800103367@prv-mh.provo.novell.com>
References: <1468605722-24239-1-git-send-email-george.dunlap@citrix.com>
 <1468605722-24239-2-git-send-email-george.dunlap@citrix.com>
 <579F434B0200007800101346@prv-mh.provo.novell.com>
 <1470054737.3311.0.camel@citrix.com>
 <57A4AFA70200007800103367@prv-mh.provo.novell.com>
To: Jan Beulich, George Dunlap
Cc: xen-devel@lists.xenproject.org, Anshul Makkar, MengXu

On Fri, 2016-08-05 at 07:24 -0600, Jan Beulich wrote:
> I'd really like to have those backported, but I have to ask one
> of you to identify which prereq-s are needed on 4.6 and 4.5
> (I'll revert them from 4.5 right away, but I'll wait for an osstest
> flight to confirm the same issue exists on 4.6).
>
Hey,

I could only start working on this this morning (sorry for the delay),
and I'll continue tomorrow but, at least here, staging-4.6 (plus the
patches!) crashes like this:

(XEN) [    0.000000] ----[ Xen-4.6.4-pre  x86_64  debug=y  Not tainted ]----
(XEN) [    0.000000] CPU:    0
(XEN) [    0.000000] RIP:    e008:[] _csched_cpu_pick+0x156/0x612
(XEN) [    0.000000] RFLAGS: 0000000000010046   CONTEXT: hypervisor
(XEN) [    0.000000] rax: 0000000000000040   rbx: 0000000000000040   rcx: 0000000000000000
(XEN) [    0.000000] rdx: 000000000000003f   rsi: 0000000000000040   rdi: 0000000000000040
(XEN) [    0.000000] rbp: ffff82d0802f7d68   rsp: ffff82d0802f7c78   r8:  0000000000000000
(XEN) [    0.000000] r9:  0000000000000001   r10: 0000000000000001   r11: 00000000000000b4
(XEN) [    0.000000] r12: 0000000000000040   r13: ffff83032152bf40   r14: ffff82d08033ae40
(XEN) [    0.000000] r15: ffff8300dbdf4000   cr0: 0000000080050033   cr4: 00000000000006e0
(XEN) [    0.000000] cr3: 00000000dba9f000   cr2: 0000000000000000
(XEN) [    0.000000] ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
(XEN) [    0.000000] Xen stack trace from rsp=ffff82d0802f7c78:
(XEN) [    0.000000]    0000000000000046 0000000200000092 ffff82d08033ae48 ffff82d08028b020
(XEN) [    0.000000]    0000000000000000 ffff82d0802ff040 0000000100000001 ffff82d08033ae40
(XEN) [    0.000000]    0000000000000097 ffff82d0802f7cd8 0000000000000006 ffff82d0802f7ce8
(XEN) [    0.000000]    ffff82d08012d027 ffff830321532000 ffff82d0802f7d28 ffff82d08013c779
(XEN) [    0.000000]    0000000000000000 0000000000000010 0000000000000048 0000000000000048
(XEN) [    0.000000]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [    0.000000]    ffff82d0802f7d78 ffff8300dbdf4000 ffff83032152a000 ffff83032152bf40
(XEN) [    0.000000]    0000000000000000 ffff82d08028e020 ffff82d0802f7d78 ffff82d080123d8e
(XEN) [    0.000000]    ffff82d0802f7db8 ffff82d080123db0 ffff83032152a000 ffff8300dbdf4000
(XEN) [    0.000000]    ffff83032152a000 0000000000000000 0000000000000000 ffff82d08028e020
(XEN) [    0.000000]    ffff82d0802f7de8 ffff82d080129e8d ffff82d0802f7de8 ffff82d08013cda0
(XEN) [    0.000000]    ffff8300dbdf4000 ffff83032152a000 ffff82d0802f7e18 ffff82d080105f59
(XEN) [    0.000000]    ffff82d0802a1720 ffff82d0802a1718 ffff82d0802a1720 ffff82d0802daa60
(XEN) [    0.000000]    ffff82d0802f7e48 ffff82d0802a8145 0000000000000002 ffff83032152b610
(XEN) [    0.000000]    0000000000000002 0000000000000001 ffff82d0802f7f08 ffff82d0802c5c6f
(XEN) [    0.000000]    0000000000000000 0000000000100000 00000000014ae000 0000000000324000
(XEN) [    0.000000]    0080016300000000 0000000300000015 0000000000000000 0000000000000010
(XEN) [    0.000000]    ffff83000008dd50 ffff83000008df20 0000000000000002 ffff83000008dfb0
(XEN) [    0.000000]    0000000800000000 000000010000006e 0000000000000003 00000000000002f8
(XEN) [    0.000000]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [    0.000000] Xen call trace:
(XEN) [    0.000000]    [] _csched_cpu_pick+0x156/0x612
(XEN) [    0.000000]    [] csched_cpu_pick+0xe/0x10
(XEN) [    0.000000]    [] csched_vcpu_insert+0x20/0x145
(XEN) [    0.000000]    [] sched_init_vcpu+0x1d1/0x218
(XEN) [    0.000000]    [] alloc_vcpu+0x1ba/0x2a4
(XEN) [    0.000000]    [] scheduler_init+0x1b0/0x271
(XEN) [    0.000000]    [] __start_xen+0x1f85/0x2550
(XEN) [    0.000000]    [] __high_start+0x53/0x55
(XEN) [    0.000000]
(XEN) [    0.000000]
(XEN) [    0.000000] ****************************************
(XEN) [    0.000000] Panic on CPU 0:
(XEN) [    0.000000] Assertion 'cpu < nr_cpu_ids' failed at ...e/SOURCES/xen/xen.git/xen/include/xen/cpumask.h:97

Which, I think, needs at least this hunk (from 6b53bb4ab3c9
"sched: better handle (not) inserting idle vCPUs in runqueues"):

diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 2beebe8..fddcd52 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -240,20 +240,22 @@ int sched_init_vcpu(struct vcpu *v, unsigned int processor)
     init_timer(&v->poll_timer, poll_timer_fn,
                v, v->processor);
 
-    /* Idle VCPUs are scheduled immediately. */
+    v->sched_priv = SCHED_OP(DOM2OP(d), alloc_vdata, v, d->sched_priv);
+    if ( v->sched_priv == NULL )
+        return 1;
+
+    TRACE_2D(TRC_SCHED_DOM_ADD, v->domain->domain_id, v->vcpu_id);
+
+    /* Idle VCPUs are scheduled immediately, so don't put them in runqueue. */
     if ( is_idle_domain(d) )
     {
         per_cpu(schedule_data, v->processor).curr = v;
         v->is_running = 1;
     }
-
-    TRACE_2D(TRC_SCHED_DOM_ADD, v->domain->domain_id, v->vcpu_id);
-
-    v->sched_priv = SCHED_OP(DOM2OP(d), alloc_vdata, v, d->sched_priv);
-    if ( v->sched_priv == NULL )
-        return 1;
-
-    SCHED_OP(DOM2OP(d), insert_vcpu, v);
+    else
+    {
+        SCHED_OP(DOM2OP(d), insert_vcpu, v);
+    }
 
     return 0;
 }

So, yeah, it's proving a little more complicated than I thought it
would be, just by looking at the patches. :-/

Will let you know.

Regards,
Dario
-- 
<> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
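A note on the assertion in the panic above, for anyone reading the
trace without the source at hand. What follows is only a toy sketch in
plain C, not Xen code: all the toy_* names and the value 64 are made up
for illustration, and it does not claim to show what _csched_cpu_pick
actually does at +0x156. It just shows one general way a check like
'cpu < nr_cpu_ids' can trip: a mask search that finds nothing hands
back the CPU count as its "none found" marker, and that value then gets
used as if it were a CPU index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for nr_cpu_ids; 64 is chosen only for illustration. */
enum { TOY_NR_CPU_IDS = 64 };

/* Toy 64-bit CPU mask: bit n set means CPU n is a candidate. */
typedef uint64_t toy_cpumask_t;

/*
 * Return the lowest set bit, or TOY_NR_CPU_IDS if the mask is empty.
 * Using the CPU count as the "none found" marker is the convention
 * this sketch assumes.
 */
static unsigned int toy_cpumask_first(toy_cpumask_t mask)
{
    for ( unsigned int cpu = 0; cpu < TOY_NR_CPU_IDS; cpu++ )
        if ( mask & (UINT64_C(1) << cpu) )
            return cpu;
    return TOY_NR_CPU_IDS;
}

/*
 * Same shape as the failed check: abort if cpu is not a valid index.
 * Feeding it the "none found" marker is what would trigger the assert.
 */
static unsigned int toy_cpumask_check(unsigned int cpu)
{
    assert(cpu < TOY_NR_CPU_IDS);
    return cpu;
}

/*
 * Pick a CPU from the mask. Writes the pick to *cpu and returns true on
 * success; returns false (leaving *cpu untouched) when the mask is
 * empty, so the marker value never reaches toy_cpumask_check().
 */
static bool toy_pick_cpu(toy_cpumask_t mask, unsigned int *cpu)
{
    unsigned int first = toy_cpumask_first(mask);

    if ( first >= TOY_NR_CPU_IDS )
        return false;

    *cpu = toy_cpumask_check(first);
    return true;
}
```

With an empty mask, toy_cpumask_first() returns 64 and toy_pick_cpu()
reports failure rather than asserting; with bits 4 and 5 set it picks
CPU 4. Whether the real crash follows this pattern is something the
trace alone does not settle.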