From: George Dunlap <george.dunlap@citrix.com>
To: xen-devel@lists.xenproject.org
Cc: Stefano Stabellini <sstabellini@kernel.org>,
	Wei Liu <wei.liu2@citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Dario Faggioli <dario.faggioli@citrix.com>,
	Ian Jackson <ian.jackson@eu.citrix.com>,
	George Dunlap <george.dunlap@citrix.com>,
	Tim Deegan <tim@xen.org>, Meng Xu <mengxu@cis.upenn.edu>,
	Jan Beulich <jbeulich@suse.com>,
	Anshul Makkar <anshul.makkar@citrix.com>
Subject: [PATCH 3/3] xen: Remove buggy initial placement algorithm
Date: Fri, 15 Jul 2016 19:02:02 +0100
Message-ID: <1468605722-24239-3-git-send-email-george.dunlap@citrix.com>
In-Reply-To: <1468605722-24239-1-git-send-email-george.dunlap@citrix.com>

The initial placement algorithm sometimes picks cpus outside of the
mask it's given, does a lot of unnecessary bitmasking, does its own
separate load calculation, and completely ignores vcpu hard and soft
affinities.  Just get rid of it and rely on the schedulers to do
initial placement.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
Since many of the schedulers' cpu_pick functions have a strong
preference for leaving the cpu where it is (credit1 and rt in
particular), this may cause some cpus to be overloaded when creating a
lot of domains.  Arguably this should be fixed in the schedulers
themselves.

The core problem with default_vcpu0_location() is that it chooses its
initial cpu based on the sibling of pcpu 0, not the first available
sibling in the online mask; so if pcpu 1 ends up being less "busy"
than all the cpus in the pool, then it ends up being chosen even
though it's not in the pool.
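
Concretely, the problem is these lines (quoted from the hunk below):

    cpumask_copy(&cpu_exclude_map, per_cpu(cpu_sibling_mask, 0));
    cpu = cpumask_first(&cpu_exclude_map);
    i = cpumask_next(cpu, &cpu_exclude_map);
    if ( i < nr_cpu_ids )
        cpu = i;

None of which checks that the chosen cpu is actually in 'online'.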

Fixing the algorithm would involve starting with the sibling map of
cpumask_first(online) rather than that of pcpu 0, and then having all
the sibling checks test not only that the result of cpumask_next() is
< nr_cpu_ids, but also that the result is in online, roughly as
sketched below.
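
Roughly, the start of the placement would become something like this
(an untested sketch; 'j' is a scratch variable not in the current
code):

    cpumask_copy(&cpu_exclude_map,
                 per_cpu(cpu_sibling_mask, cpumask_first(online)));
    cpu = cpumask_first(&cpu_exclude_map);
    i = cpumask_next(cpu, &cpu_exclude_map);
    if ( i < nr_cpu_ids && cpumask_test_cpu(i, online) )
        cpu = i;

and the sibling check inside the loop would become:

    /* Skip a primary thread only if one of its siblings is online. */
    j = cpumask_next(i, per_cpu(cpu_sibling_mask, i));
    if ( i == cpumask_first(per_cpu(cpu_sibling_mask, i)) &&
         j < nr_cpu_ids && cpumask_test_cpu(j, online) )
        continue;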

Additionally, as far as I can tell, the cpumask_test_cpu(i,
&cpu_exclude_map) at the top of the for_each_cpu() loop can never
return true; so both this test and the cpumask_or() are unnecessary
and should be removed.
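
With those gone, the loop would reduce to something like this (again
only a sketch, and moot anyway given that this patch deletes the whole
function):

    for_each_cpu(i, online)
    {
        if ( (i == cpumask_first(per_cpu(cpu_sibling_mask, i))) &&
             (cpumask_next(i, per_cpu(cpu_sibling_mask, i)) < nr_cpu_ids) )
            continue;
        if ( !cnt || cnt[i] <= cnt[cpu] )
            cpu = i;
    }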

CC: Dario Faggioli <dario.faggioli@citrix.com>
CC: Anshul Makkar <anshul.makkar@citrix.com>
CC: Meng Xu <mengxu@cis.upenn.edu>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Tim Deegan <tim@xen.org>
CC: Wei Liu <wei.liu2@citrix.com>
---
 xen/common/domctl.c | 50 +-------------------------------------------------
 1 file changed, 1 insertion(+), 49 deletions(-)

diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index b784e6c..cc85e27 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -217,54 +217,6 @@ void getdomaininfo(struct domain *d, struct xen_domctl_getdomaininfo *info)
     memcpy(info->handle, d->handle, sizeof(xen_domain_handle_t));
 }
 
-static unsigned int default_vcpu0_location(cpumask_t *online)
-{
-    struct domain *d;
-    struct vcpu   *v;
-    unsigned int   i, cpu, nr_cpus, *cnt;
-    cpumask_t      cpu_exclude_map;
-
-    /* Do an initial CPU placement. Pick the least-populated CPU. */
-    nr_cpus = cpumask_last(&cpu_online_map) + 1;
-    cnt = xzalloc_array(unsigned int, nr_cpus);
-    if ( cnt )
-    {
-        rcu_read_lock(&domlist_read_lock);
-        for_each_domain ( d )
-            for_each_vcpu ( d, v )
-                if ( !(v->pause_flags & VPF_down)
-                     && ((cpu = v->processor) < nr_cpus) )
-                    cnt[cpu]++;
-        rcu_read_unlock(&domlist_read_lock);
-    }
-
-    /*
-     * If we're on a HT system, we only auto-allocate to a non-primary HT. We
-     * favour high numbered CPUs in the event of a tie.
-     */
-    cpumask_copy(&cpu_exclude_map, per_cpu(cpu_sibling_mask, 0));
-    cpu = cpumask_first(&cpu_exclude_map);
-    i = cpumask_next(cpu, &cpu_exclude_map);
-    if ( i < nr_cpu_ids )
-        cpu = i;
-    for_each_cpu(i, online)
-    {
-        if ( cpumask_test_cpu(i, &cpu_exclude_map) )
-            continue;
-        if ( (i == cpumask_first(per_cpu(cpu_sibling_mask, i))) &&
-             (cpumask_next(i, per_cpu(cpu_sibling_mask, i)) < nr_cpu_ids) )
-            continue;
-        cpumask_or(&cpu_exclude_map, &cpu_exclude_map,
-                   per_cpu(cpu_sibling_mask, i));
-        if ( !cnt || cnt[i] <= cnt[cpu] )
-            cpu = i;
-    }
-
-    xfree(cnt);
-
-    return cpu;
-}
-
 bool_t domctl_lock_acquire(void)
 {
     /*
@@ -691,7 +643,7 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
                 continue;
 
             cpu = (i == 0) ?
-                default_vcpu0_location(online) :
+                cpumask_first(online) :
                 cpumask_cycle(d->vcpu[i-1]->processor, online);
 
             if ( alloc_vcpu(d, i, cpu) == NULL )
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
