* [PATCH v2 0/2] xen: sched/cpupool: more fixing of (corner?) cases
@ 2015-07-17 13:35 Dario Faggioli
  2015-07-17 13:35 ` [PATCH v2 1/2] xen: sched: reorganize cpu_disable_scheduler() Dario Faggioli
  2015-07-17 13:36 ` [PATCH v2 2/2] xen: sched/cpupool: properly update affinity when removing a cpu from a cpupool Dario Faggioli
  0 siblings, 2 replies; 6+ messages in thread
From: Dario Faggioli @ 2015-07-17 13:35 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Juergen Gross

Hi all,

Here's v2 of this series. v1 is here:
 http://lists.xen.org/archives/html/xen-devel/2015-07/msg00647.html

As a quick recap, this series is aimed at avoiding a host crash during
shutdown or suspend, in the following cases:
 - when the boot cpu (i.e., most of the time, cpu 0) is not assigned to any
   cpupool;
 - when a (non default) cpupool only has one cpu (and that is not the boot
   cpu).

v1 was bigger, because of some preparatory/mechanical changes, which have
been applied already; here's what remains.

Patch 2 is fully acked. Patch 1 earned Juergen's ack during v1, but I'm not
carrying that over, as the patch changed a little bit while addressing
George's comments.

Best Regards,
Dario
---
Dario Faggioli (2):
      xen: sched: reorganize cpu_disable_scheduler()
      xen: sched/cpupool: properly update affinity when removing a cpu from a cpupool

 xen/common/cpupool.c  |   18 ++++++++
 xen/common/schedule.c |  109 +++++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 110 insertions(+), 17 deletions(-)

--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


* [PATCH v2 1/2] xen: sched: reorganize cpu_disable_scheduler()
  2015-07-17 13:35 [PATCH v2 0/2] xen: sched/cpupool: more fixing of (corner?) cases Dario Faggioli
@ 2015-07-17 13:35 ` Dario Faggioli
  2015-07-20 11:41   ` Juergen Gross
  2015-07-17 13:36 ` [PATCH v2 2/2] xen: sched/cpupool: properly update affinity when removing a cpu from a cpupool Dario Faggioli
  1 sibling, 1 reply; 6+ messages in thread
From: Dario Faggioli @ 2015-07-17 13:35 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Juergen Gross

The function is called both when we want to remove a cpu
from a cpupool, and during cpu teardown, for suspend or
shutdown. If, however, the boot cpu (cpu 0, most of the
time) is not present in the default cpupool, during
suspend or shutdown, Xen crashes like this:

  root@Zhaman:~# xl cpupool-cpu-remove Pool-0 0
  root@Zhaman:~# shutdown -h now
  (XEN) ----[ Xen-4.6-unstable  x86_64  debug=y  Tainted:    C ]----
  ...
  (XEN) Xen call trace:
  (XEN)    [<ffff82d0801238de>] _csched_cpu_pick+0x156/0x61f
  (XEN)    [<ffff82d080123db5>] csched_cpu_pick+0xe/0x10
  (XEN)    [<ffff82d08012de3c>] vcpu_migrate+0x18e/0x321
  (XEN)    [<ffff82d08012e4f8>] cpu_disable_scheduler+0x1cf/0x2ac
  (XEN)    [<ffff82d08018bb8d>] __cpu_disable+0x313/0x36e
  (XEN)    [<ffff82d080101424>] take_cpu_down+0x34/0x3b
  (XEN)    [<ffff82d08013097a>] stopmachine_action+0x70/0x99
  (XEN)    [<ffff82d0801325f0>] do_tasklet_work+0x78/0xab
  (XEN)    [<ffff82d080132926>] do_tasklet+0x5e/0x8a
  (XEN)    [<ffff82d08016478c>] idle_loop+0x56/0x6b
  (XEN)
  (XEN)
  (XEN) ****************************************
  (XEN) Panic on CPU 15:
  (XEN) Assertion 'cpu < nr_cpu_ids' failed at ...URCES/xen/xen/xen.git/xen/include/xen/cpumask.h:97
  (XEN) ****************************************

There are also problems when we try to suspend or shut down
with a cpupool configured with just one cpu (no matter, in
this case, whether that is the boot cpu or not):

  root@Zhaman:~# xl create /etc/xen/test.cfg
  root@Zhaman:~# xl cpupool-migrate test Pool-1
  root@Zhaman:~# xl cpupool-list -c
  Name               CPU list
  Pool-0             0,1,2,3,4,5,6,7,8,9,10,11,13,14,15
  Pool-1             12
  root@Zhaman:~# shutdown -h now
  (XEN) ----[ Xen-4.6-unstable  x86_64  debug=y  Tainted:    C ]----
  (XEN) CPU:    12
  ...
  (XEN) Xen call trace:
  (XEN)    [<ffff82d08018bb91>] __cpu_disable+0x317/0x36e
  (XEN)    [<ffff82d080101424>] take_cpu_down+0x34/0x3b
  (XEN)    [<ffff82d08013097a>] stopmachine_action+0x70/0x99
  (XEN)    [<ffff82d0801325f0>] do_tasklet_work+0x78/0xab
  (XEN)    [<ffff82d080132926>] do_tasklet+0x5e/0x8a
  (XEN)    [<ffff82d08016478c>] idle_loop+0x56/0x6b
  (XEN)
  (XEN)
  (XEN) ****************************************
  (XEN) Panic on CPU 12:
  (XEN) Xen BUG at smpboot.c:895
  (XEN) ****************************************

In both cases, the problem is the scheduler not being able
to:
 - move all the vcpus to the boot cpu (as the boot cpu is
   not in the cpupool), in the former;
 - move the vcpus away from the cpu at all (as that is the
   only cpu in the cpupool), in the latter.

The solution is to distinguish, inside cpu_disable_scheduler(),
the two cases of cpupool manipulation and teardown. For
cpupool manipulation, it is correct to ask the scheduler to
take action, as pathological situations (like there not
being any cpu in the pool to send vcpus to) are taken
care of (i.e., forbidden!) already. For suspend and shutdown,
we don't want the scheduler to be involved at all, as the
final goal is pretty simple: "send all the vcpus to the
boot cpu ASAP", so we just go for it.
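
In pseudo-C, the resulting flow inside cpu_disable_scheduler() looks
roughly like this (a simplified sketch of the diff below; locking and
affinity handling omitted):

  for_each_domain_in_cpupool ( d, c )
      for_each_vcpu ( d, v )
      {
          if ( v->processor != cpu )
              continue;                 /* nothing to do for this vcpu */

          if ( system_state == SYS_STATE_suspend )
          {
              /* Teardown: don't involve the scheduler at all, just
               * send the vcpu directly to some other online pcpu. */
              new_cpu = cpumask_first(<online pcpus, except cpu>);
              vcpu_move_nosched(v, new_cpu);
          }
          else
          {
              /* Cpupool manipulation: let the scheduler re-evaluate
               * the placement (-EAGAIN if the vcpu can't be moved). */
              vcpu_migrate(v);
          }
      }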

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
---
Changes from v1:
 * BUG_ON() if, in the suspend/shutdown case, the mask of
   online pCPUs will ever get empty, as suggested during
   review;
 * reorganize and improve comments inside cpu_disable_scheduler()
   as suggested during review;
 * make it more clear that vcpu_move_nosched() (name
   changed, as suggested during review), should only be
   called from "quite contextes", such us, during suspend
   or shutdown. Do that via both comments and asserts,
   as requested during review;
 * reorganize cpu_disable_scheduler() and vcpu_move_nosched()
   so that the sleep and wakeup functions are only called
   when necessary (i.e., *not* in case we are
   suspending/shutting down), as requested during review.
---
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Juergen Gross <jgross@suse.com>
---
 xen/common/schedule.c |  102 +++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 86 insertions(+), 16 deletions(-)

diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index df8c1d0..ed0f356 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -454,9 +454,10 @@ void vcpu_unblock(struct vcpu *v)
  * Do the actual movement of a vcpu from old to new CPU. Locks for *both*
  * CPUs needs to have been taken already when calling this!
  */
-static void vcpu_move(struct vcpu *v, unsigned int old_cpu,
-                      unsigned int new_cpu)
+static void vcpu_move_locked(struct vcpu *v, unsigned int new_cpu)
 {
+    unsigned int old_cpu = v->processor;
+
     /*
      * Transfer urgency status to new CPU before switching CPUs, as
      * once the switch occurs, v->is_urgent is no longer protected by
@@ -478,6 +479,33 @@ static void vcpu_move(struct vcpu *v, unsigned int old_cpu,
         v->processor = new_cpu;
 }
 
+/*
+ * Move a vcpu from its current processor to a new target processor,
+ * without asking the scheduler to do any placement. This is intended
+ * to be called from special contexts, where things are quiet
+ * enough that no contention is supposed to happen (i.e., during
+ * shutdown or software suspend, like ACPI S3).
+ */
+static void vcpu_move_nosched(struct vcpu *v, unsigned int new_cpu)
+{
+    unsigned long flags;
+    spinlock_t *lock, *new_lock;
+
+    ASSERT(system_state == SYS_STATE_suspend);
+    ASSERT(!vcpu_runnable(v) && (atomic_read(&v->pause_count) ||
+                                 atomic_read(&v->domain->pause_count)));
+
+    lock = per_cpu(schedule_data, v->processor).schedule_lock;
+    new_lock = per_cpu(schedule_data, new_cpu).schedule_lock;
+
+    sched_spin_lock_double(lock, new_lock, &flags);
+    ASSERT(new_cpu != v->processor);
+    vcpu_move_locked(v, new_cpu);
+    sched_spin_unlock_double(lock, new_lock, flags);
+
+    sched_move_irqs(v);
+}
+
 static void vcpu_migrate(struct vcpu *v)
 {
     unsigned long flags;
@@ -542,7 +570,7 @@ static void vcpu_migrate(struct vcpu *v)
         return;
     }
 
-    vcpu_move(v, old_cpu, new_cpu);
+    vcpu_move_locked(v, new_cpu);
 
     sched_spin_unlock_double(old_lock, new_lock, flags);
 
@@ -615,7 +643,8 @@ int cpu_disable_scheduler(unsigned int cpu)
     struct vcpu *v;
     struct cpupool *c;
     cpumask_t online_affinity;
-    int    ret = 0;
+    unsigned int new_cpu;
+    int ret = 0;
 
     c = per_cpu(cpupool, cpu);
     if ( c == NULL )
@@ -644,25 +673,66 @@ int cpu_disable_scheduler(unsigned int cpu)
                 cpumask_setall(v->cpu_hard_affinity);
             }
 
-            if ( v->processor == cpu )
+            if ( v->processor != cpu )
             {
-                set_bit(_VPF_migrating, &v->pause_flags);
+                /* This vcpu is not on cpu, so we can move on. */
                 vcpu_schedule_unlock_irqrestore(lock, flags, v);
-                vcpu_sleep_nosync(v);
-                vcpu_migrate(v);
+                continue;
+            }
+
+            /* If it is on cpu, we must send it away. */
+            if ( unlikely(system_state == SYS_STATE_suspend) )
+            {
+                /*
+                 * If we are doing a shutdown/suspend, it is not necessary to
+                 * ask the scheduler to chime in. In fact:
+                 *  * there is no reason for it: the end result we are after
+                 *    is just 'all the vcpus on the boot pcpu, and no vcpu
+                 *    anywhere else', so let's just go for it;
+                 *  * it's wrong, for cpupools with only non-boot pcpus, as
+                 *    the scheduler would always fail to send the vcpus away
+                 *    from the last online (non boot) pcpu!
+                 *
+                 * Therefore, in the shutdown/suspend case, we just pick up
+                 * one (still) online pcpu. Note that, at this stage, all
+                 * domains (including dom0) have been paused already, so we
+                 * do not expect any vcpu activity at all.
+                 */
+                cpumask_andnot(&online_affinity, &cpu_online_map,
+                               cpumask_of(cpu));
+                BUG_ON(cpumask_empty(&online_affinity));
+                /*
+                 * As boot cpu is, usually, pcpu #0, using cpumask_first()
+                 * will make us converge quicker.
+                 */
+                new_cpu = cpumask_first(&online_affinity);
+                vcpu_move_nosched(v, new_cpu);
             }
             else
+            {
+                /*
+                 * OTOH, if the system is still live, and we are here because
+                 * we are doing some cpupool manipulations:
+                 *  * we want to call the scheduler, and let it re-evaluate
+                 *    the placement of the vcpu, taking into account the new
+                 *    cpupool configuration;
+                 *  * the scheduler will always find a suitable solution, or
+                 *    things would have failed before getting in here.
+                 */
+                set_bit(_VPF_migrating, &v->pause_flags);
                 vcpu_schedule_unlock_irqrestore(lock, flags, v);
+                vcpu_sleep_nosync(v);
+                vcpu_migrate(v);
 
-            /*
-             * A vcpu active in the hypervisor will not be migratable.
-             * The caller should try again after releasing and reaquiring
-             * all locks.
-             */
-            if ( v->processor == cpu )
-                ret = -EAGAIN;
+                /*
+                 * The only caveat, in this case, is that a vcpu active in
+                 * the hypervisor won't be migratable. In that case, the caller
+                 * should try again after releasing and reacquiring all locks.
+                 */
+                if ( v->processor == cpu )
+                    ret = -EAGAIN;
+            }
         }
-
         domain_update_node_affinity(d);
     }


* [PATCH v2 2/2] xen: sched/cpupool: properly update affinity when removing a cpu from a cpupool
  2015-07-17 13:35 [PATCH v2 0/2] xen: sched/cpupool: more fixing of (corner?) cases Dario Faggioli
  2015-07-17 13:35 ` [PATCH v2 1/2] xen: sched: reorganize cpu_disable_scheduler() Dario Faggioli
@ 2015-07-17 13:36 ` Dario Faggioli
  1 sibling, 0 replies; 6+ messages in thread
From: Dario Faggioli @ 2015-07-17 13:36 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Juergen Gross

And this time, do it right. In fact, a similar change was
attempted in 93be8285a79c6 ("cpupools: update domU's node-affinity
on the cpupool_unassign_cpu() path"). But that was buggy, and got
reverted with 8395b67ab0b8a86.

However, even though reverting was the right thing to do, it
remains true that:
 - calling the function is better done in the cpupool cpu removal
   code, even if just for symmetry with the cpupool cpu adding path;
 - it is not necessary to call it from the cpu teardown (suspend or
   shutdown) code, as we are either going down and will never come
   back up (shutdown) or, when coming back up, we want everything to
   be as it was before the tearing down process started, so we would
   just be undoing any update made during the process;
 - calling it from the teardown path is not only unnecessary, but it
   can also trigger an ASSERT(), if, during the process, we happen to
   remove the last online pcpu of a domain's node affinity:

  (XEN) Assertion '!cpumask_empty(dom_cpumask)' failed at domain.c:466
  (XEN) ----[ Xen-4.6-unstable  x86_64  debug=y  Tainted:    C ]----
  ... ... ...
  (XEN) Xen call trace:
  (XEN)    [<ffff82d0801055b9>] domain_update_node_affinity+0x113/0x240
  (XEN)    [<ffff82d08012e676>] cpu_disable_scheduler+0x334/0x3f2
  (XEN)    [<ffff82d08018bb8d>] __cpu_disable+0x313/0x36e
  (XEN)    [<ffff82d080101424>] take_cpu_down+0x34/0x3b
  (XEN)    [<ffff82d080130ad9>] stopmachine_action+0x70/0x99
  (XEN)    [<ffff82d08013274f>] do_tasklet_work+0x78/0xab
  (XEN)    [<ffff82d080132a85>] do_tasklet+0x5e/0x8a
  (XEN)    [<ffff82d08016478c>] idle_loop+0x56/0x6b
  (XEN)
  (XEN)
  (XEN) ****************************************
  (XEN) Panic on CPU 12:
  (XEN) Assertion '!cpumask_empty(dom_cpumask)' failed at domain.c:466
  (XEN) ****************************************

For all these reasons, move the call from
cpu_disable_scheduler() to cpupool_unassign_cpu_helper().

While there, add some sanity checking (in the latter function), and
make sure that scanning the domain list is done with domlist_read_lock
held, at least when the system is 'live'.
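
Schematically, the change amounts to the following (simplified; see the
diff below for the actual code and locking):

  /* xen/common/schedule.c: cpu_disable_scheduler() */
  for_each_domain_in_cpupool ( d, c )
  {
      for_each_vcpu ( d, v )
          ...
      /* domain_update_node_affinity(d) no longer called from here */
  }

  /* xen/common/cpupool.c: cpupool_unassign_cpu_helper() */
  rcu_read_lock(&domlist_read_lock);
  ret = cpu_disable_scheduler(cpu);
  ...
  for_each_domain_in_cpupool ( d, c )
      domain_update_node_affinity(d);   /* now done here */
  rcu_read_unlock(&domlist_read_lock);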

I re-tested the scenario described here:
 http://permalink.gmane.org/gmane.comp.emulators.xen.devel/235310

which is what led to the revert of 93be8285a79c6, and that is
working ok after this commit.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Juergen Gross <jgross@suse.com>
---
 xen/common/cpupool.c  |   18 ++++++++++++++++++
 xen/common/schedule.c |    7 ++++++-
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/xen/common/cpupool.c b/xen/common/cpupool.c
index b48ae17..69b984c 100644
--- a/xen/common/cpupool.c
+++ b/xen/common/cpupool.c
@@ -297,12 +297,25 @@ static int cpupool_assign_cpu_locked(struct cpupool *c, unsigned int cpu)
 static long cpupool_unassign_cpu_helper(void *info)
 {
     int cpu = cpupool_moving_cpu;
+    struct cpupool *c = info;
+    struct domain *d;
     long ret;
 
     cpupool_dprintk("cpupool_unassign_cpu(pool=%d,cpu=%d)\n",
                     cpupool_cpu_moving->cpupool_id, cpu);
 
     spin_lock(&cpupool_lock);
+    if ( c != cpupool_cpu_moving )
+    {
+        ret = -EBUSY;
+        goto out;
+    }
+
+    /*
+     * We need this for scanning the domain list, both in
+     * cpu_disable_scheduler(), and at the bottom of this function.
+     */
+    rcu_read_lock(&domlist_read_lock);
     ret = cpu_disable_scheduler(cpu);
     cpumask_set_cpu(cpu, &cpupool_free_cpus);
     if ( !ret )
@@ -319,6 +332,11 @@ static long cpupool_unassign_cpu_helper(void *info)
         cpupool_cpu_moving = NULL;
     }
 
+    for_each_domain_in_cpupool(d, c)
+    {
+        domain_update_node_affinity(d);
+    }
+    rcu_read_unlock(&domlist_read_lock);
 out:
     spin_unlock(&cpupool_lock);
     cpupool_dprintk("cpupool_unassign_cpu ret=%ld\n", ret);
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index ed0f356..576b5fa 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -650,6 +650,12 @@ int cpu_disable_scheduler(unsigned int cpu)
     if ( c == NULL )
         return ret;
 
+    /*
+     * We'd need the domain RCU lock, but:
+     *  - when we are called from cpupool code, it's acquired there already;
+     *  - when we are called for CPU teardown, we're in stop-machine context,
+     *    so that's not a problem.
+     */
     for_each_domain_in_cpupool ( d, c )
     {
         for_each_vcpu ( d, v )
@@ -733,7 +739,6 @@ int cpu_disable_scheduler(unsigned int cpu)
                     ret = -EAGAIN;
             }
         }
-        domain_update_node_affinity(d);
     }
 
     return ret;


* Re: [PATCH v2 1/2] xen: sched: reorganize cpu_disable_scheduler()
  2015-07-17 13:35 ` [PATCH v2 1/2] xen: sched: reorganize cpu_disable_scheduler() Dario Faggioli
@ 2015-07-20 11:41   ` Juergen Gross
  2015-07-20 11:59     ` Dario Faggioli
  0 siblings, 1 reply; 6+ messages in thread
From: Juergen Gross @ 2015-07-20 11:41 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel; +Cc: George Dunlap

On 07/17/2015 03:35 PM, Dario Faggioli wrote:
> The function is called both when we want to remove a cpu
> from a cpupool, and during cpu teardown, for suspend or
> shutdown. If, however, the boot cpu (cpu 0, most of the
> time) is not present in the default cpupool, during
> suspend or shutdown, Xen crashes like this:
>
>    root@Zhaman:~# xl cpupool-cpu-remove Pool-0 0
>    root@Zhaman:~# shutdown -h now
>    (XEN) ----[ Xen-4.6-unstable  x86_64  debug=y  Tainted:    C ]----
>    ...
>    (XEN) Xen call trace:
>    (XEN)    [<ffff82d0801238de>] _csched_cpu_pick+0x156/0x61f
>    (XEN)    [<ffff82d080123db5>] csched_cpu_pick+0xe/0x10
>    (XEN)    [<ffff82d08012de3c>] vcpu_migrate+0x18e/0x321
>    (XEN)    [<ffff82d08012e4f8>] cpu_disable_scheduler+0x1cf/0x2ac
>    (XEN)    [<ffff82d08018bb8d>] __cpu_disable+0x313/0x36e
>    (XEN)    [<ffff82d080101424>] take_cpu_down+0x34/0x3b
>    (XEN)    [<ffff82d08013097a>] stopmachine_action+0x70/0x99
>    (XEN)    [<ffff82d0801325f0>] do_tasklet_work+0x78/0xab
>    (XEN)    [<ffff82d080132926>] do_tasklet+0x5e/0x8a
>    (XEN)    [<ffff82d08016478c>] idle_loop+0x56/0x6b
>    (XEN)
>    (XEN)
>    (XEN) ****************************************
>    (XEN) Panic on CPU 15:
>    (XEN) Assertion 'cpu < nr_cpu_ids' failed at ...URCES/xen/xen/xen.git/xen/include/xen/cpumask.h:97
>    (XEN) ****************************************
>
> There are also problems when we try to suspend or shut down
> with a cpupool configured with just one cpu (no matter, in
> this case, whether that is the boot cpu or not):
>
>    root@Zhaman:~# xl create /etc/xen/test.cfg
>    root@Zhaman:~# xl cpupool-migrate test Pool-1
>    root@Zhaman:~# xl cpupool-list -c
>    Name               CPU list
>    Pool-0             0,1,2,3,4,5,6,7,8,9,10,11,13,14,15
>    Pool-1             12
>    root@Zhaman:~# shutdown -h now
>    (XEN) ----[ Xen-4.6-unstable  x86_64  debug=y  Tainted:    C ]----
>    (XEN) CPU:    12
>    ...
>    (XEN) Xen call trace:
>    (XEN)    [<ffff82d08018bb91>] __cpu_disable+0x317/0x36e
>    (XEN)    [<ffff82d080101424>] take_cpu_down+0x34/0x3b
>    (XEN)    [<ffff82d08013097a>] stopmachine_action+0x70/0x99
>    (XEN)    [<ffff82d0801325f0>] do_tasklet_work+0x78/0xab
>    (XEN)    [<ffff82d080132926>] do_tasklet+0x5e/0x8a
>    (XEN)    [<ffff82d08016478c>] idle_loop+0x56/0x6b
>    (XEN)
>    (XEN)
>    (XEN) ****************************************
>    (XEN) Panic on CPU 12:
>    (XEN) Xen BUG at smpboot.c:895
>    (XEN) ****************************************
>
> In both cases, the problem is the scheduler not being able
> to:
>   - move all the vcpus to the boot cpu (as the boot cpu is
>     not in the cpupool), in the former;
>   - move the vcpus away from the cpu at all (as that is the
>     only cpu in the cpupool), in the latter.
>
> The solution is to distinguish, inside cpu_disable_scheduler(),
> the two cases of cpupool manipulation and teardown. For
> cpupool manipulation, it is correct to ask the scheduler to
> take action, as pathological situations (like there not
> being any cpu in the pool to send vcpus to) are taken
> care of (i.e., forbidden!) already. For suspend and shutdown,
> we don't want the scheduler to be involved at all, as the
> final goal is pretty simple: "send all the vcpus to the
> boot cpu ASAP", so we just go for it.
>
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> ---
> Changes from v1:
>   * BUG_ON() if, in the suspend/shutdown case, the mask of
>     online pCPUs will ever get empty, as suggested during
>     review;
>   * reorganize and improve comments inside cpu_disable_scheduler()
>     as suggested during review;
>   * make it more clear that vcpu_move_nosched() (name
>     changed, as suggested during review), should only be
>     called from "quite contextes", such us, during suspend

s/quite/quiet/

>     or shutdown. Do that via both comments and asserts,
>     as requested during review;
>   * reorganize cpu_disable_scheduler() and vcpu_move_nosched()
>     so that the sleep and wakeup functions are only called
>     when necessary (i.e., *not* in case we are
>     suspending/shutting down), as requested during review.
> ---
> Cc: George Dunlap <george.dunlap@eu.citrix.com>
> Cc: Juergen Gross <jgross@suse.com>
> ---
>   xen/common/schedule.c |  102 +++++++++++++++++++++++++++++++++++++++++--------
>   1 file changed, 86 insertions(+), 16 deletions(-)
>
> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> index df8c1d0..ed0f356 100644
> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c

...

> @@ -644,25 +673,66 @@ int cpu_disable_scheduler(unsigned int cpu)
>                   cpumask_setall(v->cpu_hard_affinity);
>               }
>
> -            if ( v->processor == cpu )
> +            if ( v->processor != cpu )
>               {
> -                set_bit(_VPF_migrating, &v->pause_flags);
> +                /* This vcpu is not on cpu, so we can move on. */
>                   vcpu_schedule_unlock_irqrestore(lock, flags, v);
> -                vcpu_sleep_nosync(v);
> -                vcpu_migrate(v);
> +                continue;
> +            }
> +
> +            /* If it is on cpu, we must send it away. */
> +            if ( unlikely(system_state == SYS_STATE_suspend) )
> +            {
> +                /*
> +                 * If we are doing a shutdown/suspend, it is not necessary to
> +                 * ask the scheduler to chime in. In fact:
> +                 *  * there is no reason for it: the end result we are after
> +                 *    is just 'all the vcpus on the boot pcpu, and no vcpu
> +                 *    anywhere else', so let's just go for it;
> +                 *  * it's wrong, for cpupools with only non-boot pcpus, as
> +                 *    the scheduler would always fail to send the vcpus away
> +                 *    from the last online (non boot) pcpu!
> +                 *
> +                 * Therefore, in the shutdown/suspend case, we just pick up
> +                 * one (still) online pcpu. Note that, at this stage, all
> +                 * domains (including dom0) have been paused already, so we
> +                 * do not expect any vcpu activity at all.
> +                 */
> +                cpumask_andnot(&online_affinity, &cpu_online_map,
> +                               cpumask_of(cpu));
> +                BUG_ON(cpumask_empty(&online_affinity));
> +                /*
> +                 * As boot cpu is, usually, pcpu #0, using cpumask_first()
> +                 * will make us converge quicker.
> +                 */
> +                new_cpu = cpumask_first(&online_affinity);
> +                vcpu_move_nosched(v, new_cpu);

Shouldn't there be a vcpu_schedule_unlock_irqrestore() ?


Juergen


* Re: [PATCH v2 1/2] xen: sched: reorganize cpu_disable_scheduler()
  2015-07-20 11:41   ` Juergen Gross
@ 2015-07-20 11:59     ` Dario Faggioli
  2015-07-20 12:06       ` Juergen Gross
  0 siblings, 1 reply; 6+ messages in thread
From: Dario Faggioli @ 2015-07-20 11:59 UTC (permalink / raw)
  To: Juergen Gross; +Cc: George Dunlap, xen-devel



On Mon, 2015-07-20 at 13:41 +0200, Juergen Gross wrote:
> On 07/17/2015 03:35 PM, Dario Faggioli wrote:

> > @@ -644,25 +673,66 @@ int cpu_disable_scheduler(unsigned int cpu)
> >                   cpumask_setall(v->cpu_hard_affinity);
> >               }
> >
> > -            if ( v->processor == cpu )
> > +            if ( v->processor != cpu )
> >               {
> > -                set_bit(_VPF_migrating, &v->pause_flags);
> > +                /* This vcpu is not on cpu, so we can move on. */
> >                   vcpu_schedule_unlock_irqrestore(lock, flags, v);
> > -                vcpu_sleep_nosync(v);
> > -                vcpu_migrate(v);
> > +                continue;
> > +            }
> > +
> > +            /* If it is on cpu, we must send it away. */
> > +            if ( unlikely(system_state == SYS_STATE_suspend) )
> > +            {
> > +                /*
> > +                 * If we are doing a shutdown/suspend, it is not necessary to
> > +                 * ask the scheduler to chime in. In fact:
> > +                 *  * there is no reason for it: the end result we are after
> > +                 *    is just 'all the vcpus on the boot pcpu, and no vcpu
> > +                 *    anywhere else', so let's just go for it;
> > +                 *  * it's wrong, for cpupools with only non-boot pcpus, as
> > +                 *    the scheduler would always fail to send the vcpus away
> > +                 *    from the last online (non boot) pcpu!
> > +                 *
> > +                 * Therefore, in the shutdown/suspend case, we just pick up
> > +                 * one (still) online pcpu. Note that, at this stage, all
> > +                 * domains (including dom0) have been paused already, so we
> > +                 * do not expect any vcpu activity at all.
> > +                 */
> > +                cpumask_andnot(&online_affinity, &cpu_online_map,
> > +                               cpumask_of(cpu));
> > +                BUG_ON(cpumask_empty(&online_affinity));
> > +                /*
> > +                 * As boot cpu is, usually, pcpu #0, using cpumask_first()
> > +                 * will make us converge quicker.
> > +                 */
> > +                new_cpu = cpumask_first(&online_affinity);
> > +                vcpu_move_nosched(v, new_cpu);
> 
> Shouldn't there be a vcpu_schedule_unlock_irqrestore() ?
> 
I'm sure I put one there, as I was sure that it was there the last time
I inspected the patch before hitting send.

But I see that it's not there now, so I must have messed up when
formatting the patch, or something like that. :-(

It's really really weird, as I forgot it during development, and then
the system was hanging, and then I added it, and that's why I'm sure I
did have it in place... but perhaps I fat-fingered some stgit command
which made it disappear.

In any case, sorry for this. I will re-test (just to be sure) and
re-send (and this time I'll triple check!!)
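
For reference, the suspend/shutdown branch was meant to look roughly
like this (modulo where exactly the unlock ends up in v3):

  /* If it is on cpu, we must send it away. */
  if ( unlikely(system_state == SYS_STATE_suspend) )
  {
      /*
       * vcpu_move_nosched() grabs both schedule locks itself, so the
       * lock we are holding must be dropped before calling it.
       */
      vcpu_schedule_unlock_irqrestore(lock, flags, v);

      cpumask_andnot(&online_affinity, &cpu_online_map, cpumask_of(cpu));
      BUG_ON(cpumask_empty(&online_affinity));
      new_cpu = cpumask_first(&online_affinity);
      vcpu_move_nosched(v, new_cpu);
  }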

Thanks and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



* Re: [PATCH v2 1/2] xen: sched: reorganize cpu_disable_scheduler()
  2015-07-20 11:59     ` Dario Faggioli
@ 2015-07-20 12:06       ` Juergen Gross
  0 siblings, 0 replies; 6+ messages in thread
From: Juergen Gross @ 2015-07-20 12:06 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: George Dunlap, xen-devel

On 07/20/2015 01:59 PM, Dario Faggioli wrote:
> On Mon, 2015-07-20 at 13:41 +0200, Juergen Gross wrote:
>> On 07/17/2015 03:35 PM, Dario Faggioli wrote:
>
>>> @@ -644,25 +673,66 @@ int cpu_disable_scheduler(unsigned int cpu)
>>>                    cpumask_setall(v->cpu_hard_affinity);
>>>                }
>>>
>>> -            if ( v->processor == cpu )
>>> +            if ( v->processor != cpu )
>>>                {
>>> -                set_bit(_VPF_migrating, &v->pause_flags);
>>> +                /* This vcpu is not on cpu, so we can move on. */
>>>                    vcpu_schedule_unlock_irqrestore(lock, flags, v);
>>> -                vcpu_sleep_nosync(v);
>>> -                vcpu_migrate(v);
>>> +                continue;
>>> +            }
>>> +
>>> +            /* If it is on cpu, we must send it away. */
>>> +            if ( unlikely(system_state == SYS_STATE_suspend) )
>>> +            {
>>> +                /*
>>> +                 * If we are doing a shutdown/suspend, it is not necessary to
>>> +                 * ask the scheduler to chime in. In fact:
>>> +                 *  * there is no reason for it: the end result we are after
>>> +                 *    is just 'all the vcpus on the boot pcpu, and no vcpu
>>> +                 *    anywhere else', so let's just go for it;
>>> +                 *  * it's wrong, for cpupools with only non-boot pcpus, as
>>> +                 *    the scheduler would always fail to send the vcpus away
>>> +                 *    from the last online (non boot) pcpu!
>>> +                 *
>>> +                 * Therefore, in the shutdown/suspend case, we just pick up
>>> +                 * one (still) online pcpu. Note that, at this stage, all
>>> +                 * domains (including dom0) have been paused already, so we
>>> +                 * do not expect any vcpu activity at all.
>>> +                 */
>>> +                cpumask_andnot(&online_affinity, &cpu_online_map,
>>> +                               cpumask_of(cpu));
>>> +                BUG_ON(cpumask_empty(&online_affinity));
>>> +                /*
>>> +                 * As boot cpu is, usually, pcpu #0, using cpumask_first()
>>> +                 * will make us converge quicker.
>>> +                 */
>>> +                new_cpu = cpumask_first(&online_affinity);
>>> +                vcpu_move_nosched(v, new_cpu);
>>
>> Shouldn't there be a vcpu_schedule_unlock_irqrestore() ?
>>
> I'm sure I put one there, as I was sure that it was there the last time
> I inspected the patch before hitting send.
>
> But I see that it's not there now, so I must have messed up when
> formatting the patch, or something like that. :-(
>
> It's really really weird, as I forgot it during development, and then
> the system was hanging, and then I added it, and that's why I'm sure I
> did have it in place... but perhaps I fat fingered some stgit command
> which made it disappear.

Or you forgot stg refresh? I just managed to do so. :-(


Juergen

