From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dario Faggioli Subject: Re: [PATCH 19/19] xen: credit2: use cpumask_first instead of cpumask_any when choosing cpu Date: Mon, 20 Jun 2016 13:28:57 +0200 Message-ID: <1466422137.19253.34.camel@citrix.com> References: <146620492155.29766.10321123657058307698.stgit@Solace.fritz.box> <146620521750.29766.12008171749820205436.stgit@Solace.fritz.box> <5767C5A902000078000F6838@prv-mh.provo.novell.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============1987943505769600965==" Return-path: Received: from mail6.bemta14.messagelabs.com ([193.109.254.103]) by lists.xenproject.org with esmtp (Exim 4.84_2) (envelope-from ) id 1bExOC-0006dN-GO for xen-devel@lists.xenproject.org; Mon, 20 Jun 2016 11:29:12 +0000 In-Reply-To: <5767C5A902000078000F6838@prv-mh.provo.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" To: Jan Beulich Cc: xen-devel@lists.xenproject.org, Anshul Makkar , David Vrabel , George Dunlap List-Id: xen-devel@lists.xenproject.org --===============1987943505769600965== Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="=-fXlhLiyvi3yfc988aaGR" --=-fXlhLiyvi3yfc988aaGR Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Mon, 2016-06-20 at 02:30 -0600, Jan Beulich wrote: > >=20 > > >=20 > > > >=20 > > > > On 18.06.16 at 01:13, wrote: > > because it is cheaper, and there is no much point in > > randomizing which cpu gets selected anyway, as such > > choice will be overridden shortly after, in runq_tickle(). > If it will always be overridden, why fill it in the first place? And > if there > are cases where it won't get overridden, you're re-introducing a > preference towards lower CPU numbers, which I think is not a good > idea.=20 > It will never be used directly as the actual target CPU --at least according to my analysis of the code. runq_tickle() will consider it, but only as an hint, and will actually use it only if it satisfies all the other load balancing conditions (being part of a fully idle core, being idle, being in hard affinity, being in preemptable, etc). As I said in the rest of the changelog, if we really fear, or start to observe, that lower CPU numbers are being preferred, we can add countermeasures (stashing the CPU we chose last time and use cpumask_cycle(), as we do in Credit1, for another thing). My feeling is that they won't, as the load balancing logic in runq_tickle() will make that unlikely enough. > Can the code perhaps be rearranged to avoid the cpumask_any() > when another value will subsequently get stored anyway? >=20 I thought about it, and although for sure there are alternatives, none of the ones I could come up with were looking better than the present situation. Fact is, when the pick_cpu() hook is called in vcpu_migrate(), what vcpu_migrate() wants back from it is indeed a CPU number. Then (through vcpu_move_locked()) it either just sets v->processor equal to such CPU, or call the migrate() hook. On Credit1, the CPU returned by pick_cpu() is indeed the CPU where we want the vcpu to run, and setting v->processor to that is all we need to do for migrating a vcpu (and in fact, migrate() is not even defined). On Credit2, we (ab?)use pick_cpu() to actually select not really a CPU, but a runqueue. The fact that we return a random CPU from the runqueue we decided we want is the (pretty clever, IMO) way with which we avoid having to teach schedule.c about runqueues. Then, in migrate() (which is defined for Credit2), we do the other way round: we hand a CPU to Credit2 and it will translate that back in a runqueue (the runqueue where that CPU sits). Archaeology confirms that the migrate() hook was introduced (in ff38d3faa7d "credit2: add a callback to migrate to a new cpu") specifically for Credit2. The main difference, wrt all the above, between Credit1 and Credit2 is that in Credit1 there is one runqueue per each CPU, in Credit2, more CPUs use the same runqueue. The current pick_cpu()/migrate() approach lets both the schedulers, despite this difference, achieve what they want. Note also how such an approach targets the simplest case (<>), which is good when reading and wanting to understand schedule.c. It's then responsibility of any scheduler that wants to play fancy tricks --like Credit2 does with runqueues-- to take care of that, without making anyone else paying the price in terms of complexity. Every alternative I thought to, always involved making things less straightforward in schedule.c, which is something I'd rather avoid. If anyone has better alternatives, I'm all ears. :-) I certainly can add more comments, in sched_credit2.c, for explaining the situation. Thanks and Regards, Dario --=20 <> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) --=-fXlhLiyvi3yfc988aaGR Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJXZ9N5AAoJEBZCeImluHPuuIYP/RLJz8RH8NbF7j+s6wrhkC0V a3uklYs276Yp6vfpVjL49KuViV4RP+tmWFE5dLm9IaPt2sOAVygRrdPqgoT1qowf rJr0+W2iCqSefkkym99hJiOIv5c1OLPWuWX4Pe5VXbG/eCClHrQl3U6pOg0DUQXP +ebpgt4DqgYFV28r+d7LdGWHkW4C8TV7o+7xbRlE2v6f2QryeZNAKdLsTM1yvxky sxgDMvcrDln1f7qTrDGDzUtKYHr9mRRgxQy45Ho7UoHeuT1RaTrWi2UVBmJjzKV9 uKd8em2hiVslXjAv20fBKCEyZHY9p84m16o2O2sePD8ErJ+P3PxTushm/CwwCZBB C8cw+sGMdO70jnpOlo6QDlZ43hug+wVRC5mPpvv5oJ415g96eZwizRjZjxRXIggY /RvMQSNhdT087g54eEY3sIfTyd+obdfKSq+F2oqtONMrU2GJVWgSCGHHwuCPkimS 5fB4/bLLfzf9GBWoSpnPkVXp0i96HcxY0WLKDmzn2ww76iFpNUCli5PuHQmCXWkm H/eqkZPoV0LTnv/qoAzeUmpkJg8oQVlCfdYY6PRbVZW2bTF0HjpavlrmfJt+Hhv4 zLg+2LrqUuhbDZwRy248WuVmRVoJLf4kvLq9a9CmaOv/eb6eKTXIhYHCRdTIHKYg 9A77goWhD722HqMlP3tr =3fWn -----END PGP SIGNATURE----- --=-fXlhLiyvi3yfc988aaGR-- --===============1987943505769600965== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KWGVuLWRldmVs IG1haWxpbmcgbGlzdApYZW4tZGV2ZWxAbGlzdHMueGVuLm9yZwpodHRwOi8vbGlzdHMueGVuLm9y Zy94ZW4tZGV2ZWwK --===============1987943505769600965==--