From: Dario Faggioli
Subject: Re: [RFC 0/5] xen/arm: support big.little SoC
Date: Tue, 20 Sep 2016 02:11:04 +0200
Message-ID: <1474330264.4393.129.camel@citrix.com>
References: <1474250936-27962-1-git-send-email-peng.fan@nxp.com>
 <10152e13-bccb-0794-44e4-556845875e33@arm.com>
 <20160919083619.GA16854@linux-7smt.suse>
 <5ddefbc1-3bd4-c990-b615-0039761535d8@arm.com>
 <170e2787-a410-37c5-a675-6fc7cf31ad6f@citrix.com>
 <20160919133259.GC7407@linux-u7w5.ap.freescale.net>
In-Reply-To: <20160919133259.GC7407@linux-u7w5.ap.freescale.net>
To: Peng Fan, George Dunlap
Cc: Jürgen Groß, Peng Fan, Stefano Stabellini, George Dunlap,
 Andrew Cooper, "xen-devel@lists.xen.org", Julien Grall, Jan Beulich
List-Id: xen-devel@lists.xenproject.org

On Mon, 2016-09-19 at 21:33 +0800, Peng Fan wrote:
> On Mon, Sep 19, 2016 at 11:33:58AM +0100, George Dunlap wrote:
> >
> > No, I think it would be a lot simpler to just teach the scheduler
> > about different classes of cpus.  credit1 would probably need to be
> > modified so that its credit algorithm would be per-class rather than
> > pool-wide; but credit2 shouldn't need much modification at all, other
> > than to make sure that a given runqueue doesn't include more than one
> > class; and to do load-balancing only with runqueues of the same class.
>
> I try to follow.
>  - scheduler needs to be aware of different classes of cpus. ARM
> big.Little cpus.
>
Yes, I think this is essential.

>  - scheduler schedules vcpus on different physical cpus in one
> cpupool.
>
Yep, that's what the scheduler does. And personally, I'd start
implementing big.LITTLE support for a situation where both big and
LITTLE cpus coexist in the same pool.

>  - different cpu classes needs to be in different runqueue.
>
Yes. So, basically, imagine using vcpu pinning to support big.LITTLE.
I've spoken briefly about this in my reply to Juergen. You can probably
even get something like this up and running by writing very little or
no code (you'll need --for now-- dom0_max_vcpus, dom0_vcpus_pin, and
then, in domain config files, "cpus='...'").

Then, the real goal would be to achieve the same behavior
automatically, by acting on the runqueues' arrangement and on the load
balancing logic in the scheduler(s).

Anyway, sorry for my ignorance on big.LITTLE, but there's something I'm
missing: _when_ is it that it is (or needs to be) decided whether a
vcpu will run on a big or LITTLE core?

Thinking of a bare metal system, I think that cpu X is, for instance,
big, and will always be like that; similarly, cpu Y is LITTLE.

This makes me think that, for a virtual machine, it is ok to
choose/specify at _domain_creation_ time which vcpus are big and which
vcpus are LITTLE. Is this correct?

If yes, this also means that --whatever way we find to make this
happen, cpupools, scheduler, etc-- the vcpus that we decided are big
must only be scheduled on actual big pcpus, and the vcpus that we
decided are LITTLE must only be scheduled on actual LITTLE pcpus.
Correct again?

> Then for implementation.
>  - When create a guest, specific physical cpus that the guest will be
> run on.
>
I'd actually do that the other way round. I'd ask the user to specify
how many --and, if that's important, which-- vcpus are big and how
many/which are LITTLE.

Knowing that, we also know whether the domain is a big only, LITTLE
only or big.LITTLE one.
And we also know on which set of pcpus each set of vcpus should be
restricted to.

So, basically (but it's just an example) something like this, in the xl
config file of a guest:

1) big.LITTLE guest, with 2 big and 2 LITTLE vcpus. User doesn't care
   which is which, so a default could be 0,1 big and 2,3 LITTLE:

 vcpus = 4
 vcpus.big = 2

2) big.LITTLE guest, with 8 vcpus, of which 0,2,4 and 6 are big:

 vcpus = 8
 vcpus.big = [0, 2, 4, 6]

   Which would be the same as:

 vcpus = 8
 vcpus.little = [1, 3, 5, 7]

3) guest with 4 vcpus, all big:

 vcpus = 4
 vcpus.big = "all"

   Which would be the same as:

 vcpus = 4
 vcpus.little = "none"

   And also the same as just:

 vcpus = 4

Or something like this.

>  - If the physical cpus are different cpus, indicate the guest would
> like to be a big.little guest.
>    And have big vcpus and little vcpus.
>
Not liking this as _the_ way of specifying the guest topology, wrt
big.LITTLE-ness (see alternative proposal right above. :-))

However, right now we support pinning/affinity already. We certainly
need to decide what to do if, e.g., no vcpus.big or vcpus.little are
present, but the vcpus have hard or soft affinity to some specific
pcpus.

So, right now, this, in the xl config file:

 cpus = [2, 8, 12, 13, 15, 17]

means that we want to pin 1-to-1 vcpu 0 to pcpu 2, vcpu 1 to pcpu 8,
vcpu 2 to pcpu 12, vcpu 3 to pcpu 13, vcpu 4 to pcpu 15 and vcpu 5 to
pcpu 17. Now, if cores 2, 8 and 12 are big (and 13, 15 and 17 are
LITTLE), and no vcpus.big or vcpus.little is specified, I'd put forward
the assumption that the user wants vcpus 0, 1 and 2 to be big, and
vcpus 3, 4 and 5 to be LITTLE.

If, instead, there are vcpus.big or vcpus.little specified, and there's
disagreement, I'd either error out or decide which one overrides the
other (and print a WARNING about that happening).

Still right now, this:

 cpus = "2-12"

means that all the vcpus of the domain have hard affinity (i.e., are
pinned) to pcpus 2-12.
And in this case I'd conclude that the user wants all the vcpus to be
big.

I'm less sure what to do if _only_ soft-affinity is specified (via
"cpus_soft="), or if hard-affinity contains both big and LITTLE pcpus,
like, e.g.:

 cpus = "2-15"

>  - If no physical cpus specificed, then the guest may runs on big
> cpus or on little cpus. But not both.
>
Yes. If nothing (or something contradictory) is specified, we "just"
have to decide what's the sanest default.

>    How to decide runs on big or little physical cpus?
>
I'd default to big.

>  - For Dom0, I am still not sure,default big.little or else?
>
Again, if nothing is specified, I'd probably default to:
 - give dom0 as many vcpus as there are big cores
 - restrict them to big cores

But, of course, I think we should add boot time parameters like these
ones:

 dom0_vcpus_big = 4
 dom0_vcpus_little = 2

which would mean the user wants dom0 to have 4 big and 2 LITTLE
vcpus... and then we act accordingly, as described above, and in other
emails.

> If use scheduler to handle the different classes cpu, we do not need
> to use cpupool
> to block vcpus be scheduled onto different physical cpus. And using
> scheudler to handle this
> gives an opportunity to support big.little guest.
>
Exactly, this is one strong point in favour of this solution, IMO!
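
To make the rules discussed in this mail a bit more concrete, here is a
rough Python sketch of how a toolstack could turn the settings into a
per-vcpu class. It is only an illustration under assumptions:
vcpus.big and vcpus.little are the proposed (not existing) config keys,
and resolve_classes(), _to_set(), their arguments and the choice to
error out on any disagreement (rather than warn and override) are all
hypothetical, not existing xl or libxl code.

```python
def _to_set(spec, nr_vcpus, take_first):
    """Turn an int, a list of vcpu ids, "all" or "none" into a set of ids."""
    if spec == "all":
        return set(range(nr_vcpus))
    if spec == "none":
        return set()
    if isinstance(spec, int):
        if not 0 <= spec <= nr_vcpus:
            raise ValueError("vcpu count out of range")
        # Only a count was given: take the first ones (big) or the
        # last ones (LITTLE), so that "vcpus.big = 2" means vcpus 0,1.
        ids = range(spec) if take_first else range(nr_vcpus - spec, nr_vcpus)
        return set(ids)
    ids = set(spec)
    if not ids <= set(range(nr_vcpus)):
        raise ValueError("vcpu id out of range")
    return ids


def resolve_classes(nr_vcpus, big=None, little=None, pin=None, big_pcpus=()):
    """Return a list holding "big" or "LITTLE" for each vcpu.

    big, little: hypothetical vcpus.big / vcpus.little values.
    pin:         optional list, pin[i] is the pcpu vcpu i is pinned to.
    big_pcpus:   ids of the host's big pcpus (all others are LITTLE).
    """
    everyone = set(range(nr_vcpus))

    # 1) What the user said explicitly, if anything.
    explicit = None
    if big is not None:
        explicit = _to_set(big, nr_vcpus, take_first=True)
        if little is not None:
            if _to_set(little, nr_vcpus, take_first=False) != everyone - explicit:
                raise ValueError("vcpus.big and vcpus.little disagree")
    elif little is not None:
        explicit = everyone - _to_set(little, nr_vcpus, take_first=False)

    # 2) What 1-to-1 pinning implies, if it is present.
    pinned = None
    if pin is not None:
        if len(pin) != nr_vcpus:
            raise ValueError("need exactly one pcpu per vcpu")
        pinned = {v for v, p in enumerate(pin) if p in big_pcpus}

    # 3) Combine: error out on disagreement, default to all big.
    if explicit is not None and pinned is not None and explicit != pinned:
        raise ValueError("pinning contradicts vcpus.big/vcpus.little")
    if explicit is not None:
        big_set = explicit
    elif pinned is not None:
        big_set = pinned
    else:
        big_set = everyone
    return ["big" if v in big_set else "LITTLE" for v in range(nr_vcpus)]
```

With this, resolve_classes(8, big=[0, 2, 4, 6]) and
resolve_classes(8, little=[1, 3, 5, 7]) give the same answer, and a
guest pinned with pin=[2, 8, 12, 13, 15, 17] on a host where
big_pcpus={2, 8, 12} gets vcpus 0-2 big and 3-5 LITTLE. The
soft-affinity and mixed hard-affinity cases are deliberately left out,
since what to do there is still an open question.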

Regards,
Dario
-- 
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)