* [PATCH v1 0/5] Series short description
@ 2018-08-25  0:21 Dario Faggioli
  2018-08-25  0:21 ` [PATCH v1 1/5] xen: sched: null: refactor core around vcpu_deassign() Dario Faggioli
                   ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: Dario Faggioli @ 2018-08-25  0:21 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Stefano Stabellini, Roger Pau Monne

xen: sched: support vcpu hotplug/hotunplug in the 'null scheduler'

Hello,

It turned out, while discussing this patch:
https://lists.xenproject.org/archives/html/xen-devel/2018-01/msg00249.html

that the 'null scheduler' does not really support vcpu
hotplug/hotunplug. In fact, under some circumstances, vcpus which are
actually offline can get assigned to a pcpu, while one or more online
vcpus may be left in the wait list, and stay there forever.

As a matter of fact, one of these circumstances showed up when the 'null
scheduler' was used within the PV-SHIM, but the problem is more general.

The cited patch proposed a solution which, IMO, was the wrong one, as
it caused unnecessary overhead in regular use cases of the 'null
scheduler' (i.e., when one has vcpus==pcpus, does not online/offline
them, etc.).

This series is what I think is the proper implementation of
offlining/onlining vcpus within 'null'. It is all a little bit tricky,
as we need to figure out, from within the scheduler's sleep and wakeup
functions, whether the vcpu is coming online or going offline, but it
works.

I did some testing in different setups and scenarios, and the system
survived. However:
- Stefano, if you're interested in hotplug/hotunplug when using 'null',
and already have a use case to test it, please go ahead, and tell me if
something crashes! :-)
- Roger, if you fancy trying again to use 'null' as the scheduler of the
SHIM, and telling me whether it now works with this series, that would
be awesome. I can do that myself, of course, and I will, after
vacations, if you don't get round to it by then. :-D

Thanks and Regards,
Dario
---
Dario Faggioli (5):
      xen: sched: null: refactor core around vcpu_deassign()
      xen: sched: null: don't assign down vcpus to pcpus
      xen: sched: null: deassign offline vcpus from pcpus
      xen: sched: null: reassign vcpus to pcpus when they come back online
      xen: sched: null: refactor the ASSERTs around vcpu_deassing()

 xen/common/sched_null.c |  196 +++++++++++++++++++++++++++++++++++------------
 1 file changed, 145 insertions(+), 51 deletions(-)
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* [PATCH v1 1/5] xen: sched: null: refactor core around vcpu_deassign()
  2018-08-25  0:21 [PATCH v1 0/5] Series short description Dario Faggioli
@ 2018-08-25  0:21 ` Dario Faggioli
  2019-07-15 15:46   ` [Xen-devel] " George Dunlap
  2018-08-25  0:21 ` [PATCH v1 2/5] xen: sched: null: don't assign down vcpus to pcpus Dario Faggioli
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: Dario Faggioli @ 2018-08-25  0:21 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap

vcpu_deassign() has only one caller: _vcpu_remove().
Let's consolidate the two functions into one.

No functional change intended.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
---
Cc: George Dunlap <george.dunlap@eu.citrix.com>
---
 xen/common/sched_null.c |   76 ++++++++++++++++++++++-------------------------
 1 file changed, 35 insertions(+), 41 deletions(-)

diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c
index 784db71027..f372172c32 100644
--- a/xen/common/sched_null.c
+++ b/xen/common/sched_null.c
@@ -359,9 +359,14 @@ static void vcpu_assign(struct null_private *prv, struct vcpu *v,
     }
 }
 
-static void vcpu_deassign(struct null_private *prv, struct vcpu *v,
-                          unsigned int cpu)
+static void vcpu_deassign(struct null_private *prv, struct vcpu *v)
 {
+    unsigned int bs;
+    unsigned int cpu = v->processor;
+    struct null_vcpu *wvc;
+
+    ASSERT(list_empty(&null_vcpu(v)->waitq_elem));
+
     per_cpu(npc, cpu).vcpu = NULL;
     cpumask_set_cpu(cpu, &prv->cpus_free);
 
@@ -378,6 +383,32 @@ static void vcpu_deassign(struct null_private *prv, struct vcpu *v,
         d.cpu = cpu;
         __trace_var(TRC_SNULL_VCPU_DEASSIGN, 1, sizeof(d), &d);
     }
+
+    spin_lock(&prv->waitq_lock);
+
+    /*
+     * If v is assigned to a pCPU, let's see if there is someone waiting,
+     * suitable to be assigned to it (prioritizing vcpus that have
+     * soft-affinity with cpu).
+     */
+    for_each_affinity_balance_step( bs )
+    {
+        list_for_each_entry( wvc, &prv->waitq, waitq_elem )
+        {
+            if ( bs == BALANCE_SOFT_AFFINITY && !has_soft_affinity(wvc->vcpu) )
+                continue;
+
+            if ( vcpu_check_affinity(wvc->vcpu, cpu, bs) )
+            {
+                list_del_init(&wvc->waitq_elem);
+                vcpu_assign(prv, wvc->vcpu, cpu);
+                cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
+                spin_unlock(&prv->waitq_lock);
+                return;
+            }
+        }
+    }
+    spin_unlock(&prv->waitq_lock);
 }
 
 /* Change the scheduler of cpu to us (null). */
@@ -469,43 +500,6 @@ static void null_vcpu_insert(const struct scheduler *ops, struct vcpu *v)
     SCHED_STAT_CRANK(vcpu_insert);
 }
 
-static void _vcpu_remove(struct null_private *prv, struct vcpu *v)
-{
-    unsigned int bs;
-    unsigned int cpu = v->processor;
-    struct null_vcpu *wvc;
-
-    ASSERT(list_empty(&null_vcpu(v)->waitq_elem));
-
-    vcpu_deassign(prv, v, cpu);
-
-    spin_lock(&prv->waitq_lock);
-
-    /*
-     * If v is assigned to a pCPU, let's see if there is someone waiting,
-     * suitable to be assigned to it (prioritizing vcpus that have
-     * soft-affinity with cpu).
-     */
-    for_each_affinity_balance_step( bs )
-    {
-        list_for_each_entry( wvc, &prv->waitq, waitq_elem )
-        {
-            if ( bs == BALANCE_SOFT_AFFINITY && !has_soft_affinity(wvc->vcpu) )
-                continue;
-
-            if ( vcpu_check_affinity(wvc->vcpu, cpu, bs) )
-            {
-                list_del_init(&wvc->waitq_elem);
-                vcpu_assign(prv, wvc->vcpu, cpu);
-                cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
-                spin_unlock(&prv->waitq_lock);
-                return;
-            }
-        }
-    }
-    spin_unlock(&prv->waitq_lock);
-}
-
 static void null_vcpu_remove(const struct scheduler *ops, struct vcpu *v)
 {
     struct null_private *prv = null_priv(ops);
@@ -529,7 +523,7 @@ static void null_vcpu_remove(const struct scheduler *ops, struct vcpu *v)
     ASSERT(per_cpu(npc, v->processor).vcpu == v);
     ASSERT(!cpumask_test_cpu(v->processor, &prv->cpus_free));
 
-    _vcpu_remove(prv, v);
+    vcpu_deassign(prv, v);
 
  out:
     vcpu_schedule_unlock_irq(lock, v);
@@ -615,7 +609,7 @@ static void null_vcpu_migrate(const struct scheduler *ops, struct vcpu *v,
      */
     if ( likely(list_empty(&nvc->waitq_elem)) )
     {
-        _vcpu_remove(prv, v);
+        vcpu_deassign(prv, v);
         SCHED_STAT_CRANK(migrate_running);
     }
     else



* [PATCH v1 2/5] xen: sched: null: don't assign down vcpus to pcpus
  2018-08-25  0:21 [PATCH v1 0/5] Series short description Dario Faggioli
  2018-08-25  0:21 ` [PATCH v1 1/5] xen: sched: null: refactor core around vcpu_deassign() Dario Faggioli
@ 2018-08-25  0:21 ` Dario Faggioli
  2019-07-15 16:06   ` [Xen-devel] " George Dunlap
  2018-08-25  0:21 ` [PATCH v1 3/5] xen: sched: null: deassign offline vcpus from pcpus Dario Faggioli
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: Dario Faggioli @ 2018-08-25  0:21 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Stefano Stabellini, Roger Pau Monne

If a pCPU has been/is being offlined, we don't want it to be neither
assigned to any pCPU, nor in the wait list.

So, avoid doing that while inserting or migrating vcpus.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
---
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Roger Pau Monne <roger.pau@citrix.com>
---
 xen/common/sched_null.c |   26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c
index f372172c32..1426124525 100644
--- a/xen/common/sched_null.c
+++ b/xen/common/sched_null.c
@@ -340,6 +340,8 @@ static unsigned int pick_cpu(struct null_private *prv, struct vcpu *v)
 static void vcpu_assign(struct null_private *prv, struct vcpu *v,
                         unsigned int cpu)
 {
+    ASSERT(is_vcpu_online(v));
+
     per_cpu(npc, cpu).vcpu = v;
     v->processor = cpu;
     cpumask_clear_cpu(cpu, &prv->cpus_free);
@@ -454,8 +456,14 @@ static void null_vcpu_insert(const struct scheduler *ops, struct vcpu *v)
     ASSERT(!is_idle_vcpu(v));
 
     lock = vcpu_schedule_lock_irq(v);
- retry:
 
+    if ( unlikely(!is_vcpu_online(v)) )
+    {
+        vcpu_schedule_unlock_irq(lock, v);
+        return;
+    }
+
+ retry:
     cpu = v->processor = pick_cpu(prv, v);
 
     spin_unlock(lock);
@@ -617,6 +625,21 @@ static void null_vcpu_migrate(const struct scheduler *ops, struct vcpu *v,
 
     SCHED_STAT_CRANK(migrated);
 
+    /*
+     * If it was/it's going offline, we don't want it neither assigned to
+     * a vcpu, nor in the waitqueue.
+     *
+     * If it was on a cpu, we've removed it from there above. If it is in the
+     * waitqueue, we remove it from there now. And then we bail.
+     */
+    if ( unlikely(!is_vcpu_online(v)) )
+    {
+        spin_lock(&prv->waitq_lock);
+        list_del_init(&nvc->waitq_elem);
+        spin_unlock(&prv->waitq_lock);
+        goto out;
+    }
+
     /*
      * Let's now consider new_cpu, which is where v is being sent. It can be
      * either free, or have a vCPU already assigned to it.
@@ -657,6 +680,7 @@ static void null_vcpu_migrate(const struct scheduler *ops, struct vcpu *v,
      * at least. In case of suspend, any temporary inconsistency caused
      * by this, will be fixed-up during resume.
      */
+ out:
     v->processor = new_cpu;
 }
 



* [PATCH v1 3/5] xen: sched: null: deassign offline vcpus from pcpus
  2018-08-25  0:21 [PATCH v1 0/5] Series short description Dario Faggioli
  2018-08-25  0:21 ` [PATCH v1 1/5] xen: sched: null: refactor core around vcpu_deassign() Dario Faggioli
  2018-08-25  0:21 ` [PATCH v1 2/5] xen: sched: null: don't assign down vcpus to pcpus Dario Faggioli
@ 2018-08-25  0:21 ` Dario Faggioli
  2019-07-17 16:04   ` [Xen-devel] " George Dunlap
  2018-08-25  0:22 ` [PATCH v1 4/5] xen: sched: null: reassign vcpus to pcpus when they come back online Dario Faggioli
  2018-08-25  0:22 ` [PATCH v1 5/5] xen: sched: null: refactor the ASSERTs around vcpu_deassing() Dario Faggioli
  4 siblings, 1 reply; 19+ messages in thread
From: Dario Faggioli @ 2018-08-25  0:21 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Stefano Stabellini, Roger Pau Monne

If a pCPU has been/is being offlined, we don't want it to be neither
assigned to any pCPU, nor in the wait list.

Therefore, when we detect that a vcpu is going offline, remove it from
both places.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
---
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Roger Pau Monne <roger.pau@citrix.com>
---
 xen/common/sched_null.c |   43 +++++++++++++++++++++++++++++++++++++------
 1 file changed, 37 insertions(+), 6 deletions(-)

diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c
index 1426124525..6259f4643e 100644
--- a/xen/common/sched_null.c
+++ b/xen/common/sched_null.c
@@ -361,7 +361,8 @@ static void vcpu_assign(struct null_private *prv, struct vcpu *v,
     }
 }
 
-static void vcpu_deassign(struct null_private *prv, struct vcpu *v)
+/* Returns true if a cpu was tickled */
+static bool vcpu_deassign(struct null_private *prv, struct vcpu *v)
 {
     unsigned int bs;
     unsigned int cpu = v->processor;
@@ -406,11 +407,13 @@ static void vcpu_deassign(struct null_private *prv, struct vcpu *v)
                 vcpu_assign(prv, wvc->vcpu, cpu);
                 cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
                 spin_unlock(&prv->waitq_lock);
-                return;
+                return true;
             }
         }
     }
     spin_unlock(&prv->waitq_lock);
+
+    return false;
 }
 
 /* Change the scheduler of cpu to us (null). */
@@ -518,6 +521,14 @@ static void null_vcpu_remove(const struct scheduler *ops, struct vcpu *v)
 
     lock = vcpu_schedule_lock_irq(v);
 
+    /* If offline, the vcpu shouldn't be assigned, nor in the waitqueue */
+    if ( unlikely(!is_vcpu_online(v)) )
+    {
+        ASSERT(per_cpu(npc, v->processor).vcpu != v);
+        ASSERT(list_empty(&nvc->waitq_elem));
+        goto out;
+    }
+
     /* If v is in waitqueue, just get it out of there and bail */
     if ( unlikely(!list_empty(&nvc->waitq_elem)) )
     {
@@ -567,11 +578,31 @@ static void null_vcpu_wake(const struct scheduler *ops, struct vcpu *v)
 
 static void null_vcpu_sleep(const struct scheduler *ops, struct vcpu *v)
 {
+    struct null_private *prv = null_priv(ops);
+    unsigned int cpu = v->processor;
+    bool tickled = false;
+
     ASSERT(!is_idle_vcpu(v));
 
+    /* We need to special case the handling of the vcpu being offlined */
+    if ( unlikely(!is_vcpu_online(v)) )
+    {
+        struct null_vcpu *nvc = null_vcpu(v);
+
+        printk("YYY d%dv%d going down?\n", v->domain->domain_id, v->vcpu_id);
+        if ( unlikely(!list_empty(&nvc->waitq_elem)) )
+        {
+            spin_lock(&prv->waitq_lock);
+            list_del_init(&nvc->waitq_elem);
+            spin_unlock(&prv->waitq_lock);
+        }
+        else if ( per_cpu(npc, cpu).vcpu == v )
+            tickled = vcpu_deassign(prv, v);
+    }
+
     /* If v is not assigned to a pCPU, or is not running, no need to bother */
-    if ( curr_on_cpu(v->processor) == v )
-        cpu_raise_softirq(v->processor, SCHEDULE_SOFTIRQ);
+    if ( likely(!tickled && curr_on_cpu(cpu) == v) )
+        cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
 
     SCHED_STAT_CRANK(vcpu_sleep);
 }
@@ -615,12 +646,12 @@ static void null_vcpu_migrate(const struct scheduler *ops, struct vcpu *v,
      *
      * In the latter, there is just nothing to do.
      */
-    if ( likely(list_empty(&nvc->waitq_elem)) )
+    if ( likely(per_cpu(npc, v->processor).vcpu == v) )
     {
         vcpu_deassign(prv, v);
         SCHED_STAT_CRANK(migrate_running);
     }
-    else
+    else if ( !list_empty(&nvc->waitq_elem) )
         SCHED_STAT_CRANK(migrate_on_runq);
 
     SCHED_STAT_CRANK(migrated);



* [PATCH v1 4/5] xen: sched: null: reassign vcpus to pcpus when they come back online
  2018-08-25  0:21 [PATCH v1 0/5] Series short description Dario Faggioli
                   ` (2 preceding siblings ...)
  2018-08-25  0:21 ` [PATCH v1 3/5] xen: sched: null: deassign offline vcpus from pcpus Dario Faggioli
@ 2018-08-25  0:22 ` Dario Faggioli
  2019-07-19 16:08   ` [Xen-devel] " George Dunlap
  2018-08-25  0:22 ` [PATCH v1 5/5] xen: sched: null: refactor the ASSERTs around vcpu_deassing() Dario Faggioli
  4 siblings, 1 reply; 19+ messages in thread
From: Dario Faggioli @ 2018-08-25  0:22 UTC (permalink / raw)
  To: xen-devel
  Cc: George Dunlap, Stefano Stabellini, Dario Faggioli, Roger Pau Monne

When a vcpu that was offline comes back online, we want it either to be
assigned to a pCPU, or to go into the wait list.

So let's do exactly that. Detecting that a vcpu is coming back online is
a bit tricky. Basically, if the vcpu is waking up and is neither
assigned to a pCPU nor in the wait list, it must be coming back from
offline.

When this happens, we put it in the waitqueue, and we "tickle" an idle
pCPU (if any) to go pick it up.

Looking at the patch, it seems that the vcpu wakeup code is getting
complex, and hence that it could potentially introduce latencies.
However, all this new logic is triggered only by the case of a vcpu
coming online, so, basically, the overhead during normal operations is
just an additional 'if()'.

Signed-off-by: Dario Faggioli <dario.faggioli@suse.com>
---
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Roger Pau Monne <roger.pau@citrix.com>
---
 xen/common/sched_null.c |   50 +++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 48 insertions(+), 2 deletions(-)

diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c
index 6259f4643e..3dc2f0738b 100644
--- a/xen/common/sched_null.c
+++ b/xen/common/sched_null.c
@@ -552,15 +552,19 @@ static void null_vcpu_remove(const struct scheduler *ops, struct vcpu *v)
 
 static void null_vcpu_wake(const struct scheduler *ops, struct vcpu *v)
 {
+    struct null_private *prv = null_priv(ops);
+    struct null_vcpu *nvc = null_vcpu(v);
+    unsigned int cpu = v->processor;
+
     ASSERT(!is_idle_vcpu(v));
 
-    if ( unlikely(curr_on_cpu(v->processor) == v) )
+    if ( unlikely(curr_on_cpu(cpu) == v) )
     {
         SCHED_STAT_CRANK(vcpu_wake_running);
         return;
     }
 
-    if ( unlikely(!list_empty(&null_vcpu(v)->waitq_elem)) )
+    if ( unlikely(!list_empty(&nvc->waitq_elem)) )
     {
         /* Not exactly "on runq", but close enough for reusing the counter */
         SCHED_STAT_CRANK(vcpu_wake_onrunq);
@@ -572,6 +576,45 @@ static void null_vcpu_wake(const struct scheduler *ops, struct vcpu *v)
     else
         SCHED_STAT_CRANK(vcpu_wake_not_runnable);
 
+    /* We need to special case the handling of the vcpu being onlined */
+    if ( unlikely(per_cpu(npc, cpu).vcpu != v && list_empty(&nvc->waitq_elem)) )
+    {
+        spin_lock(&prv->waitq_lock);
+        list_add_tail(&nvc->waitq_elem, &prv->waitq);
+        spin_unlock(&prv->waitq_lock);
+
+        cpumask_and(cpumask_scratch_cpu(cpu), v->cpu_hard_affinity,
+                    cpupool_domain_cpumask(v->domain));
+
+        if ( !cpumask_intersects(&prv->cpus_free, cpumask_scratch_cpu(cpu)) )
+        {
+            dprintk(XENLOG_G_WARNING, "WARNING: d%dv%d not assigned to any CPU!\n",
+                    v->domain->domain_id, v->vcpu_id);
+            return;
+        }
+
+        /*
+         * Now we would want to assign the vcpu to cpu, but we can't, because
+         * we don't have the lock. So, let's do the following:
+         * - try to remove cpu from the list of free cpus, to avoid races with
+         *   other onlining, inserting or migrating operations;
+         * - tickle the cpu, which will pickup work from the waitqueue, and
+         *   assign it to itself;
+         * - if we're racing already, and if there still are free cpus, try
+         *   again.
+         */
+        while ( cpumask_intersects(&prv->cpus_free, cpumask_scratch_cpu(cpu)) )
+        {
+            unsigned int new_cpu = pick_cpu(prv, v);
+
+            if ( test_and_clear_bit(new_cpu, &prv->cpus_free) )
+            {
+                cpu_raise_softirq(new_cpu, SCHEDULE_SOFTIRQ);
+                return;
+            }
+        }
+    }
+
     /* Note that we get here only for vCPUs assigned to a pCPU */
     cpu_raise_softirq(v->processor, SCHEDULE_SOFTIRQ);
 }
@@ -822,6 +865,9 @@ static struct task_slice null_schedule(const struct scheduler *ops,
         }
  unlock:
         spin_unlock(&prv->waitq_lock);
+
+        if ( ret.task == NULL && !cpumask_test_cpu(cpu, &prv->cpus_free) )
+            cpumask_set_cpu(cpu, &prv->cpus_free);
     }
 
     if ( unlikely(ret.task == NULL || !vcpu_runnable(ret.task)) )



* [PATCH v1 5/5] xen: sched: null: refactor the ASSERTs around vcpu_deassing()
  2018-08-25  0:21 [PATCH v1 0/5] Series short description Dario Faggioli
                   ` (3 preceding siblings ...)
  2018-08-25  0:22 ` [PATCH v1 4/5] xen: sched: null: reassign vcpus to pcpus when they come back online Dario Faggioli
@ 2018-08-25  0:22 ` Dario Faggioli
  2019-07-19 16:20   ` [Xen-devel] " George Dunlap
  4 siblings, 1 reply; 19+ messages in thread
From: Dario Faggioli @ 2018-08-25  0:22 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap

Whenever we call vcpu_deassign(), the vcpu _must_ be assigned to a
pCPU, and hence that pCPU can't be free.

Therefore, move the ASSERTs which check for these properties into that
function, where they belong.
---
Cc: George Dunlap <george.dunlap@eu.citrix.com>
---
 xen/common/sched_null.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c
index 3dc2f0738b..b3764de657 100644
--- a/xen/common/sched_null.c
+++ b/xen/common/sched_null.c
@@ -369,6 +369,8 @@ static bool vcpu_deassign(struct null_private *prv, struct vcpu *v)
     struct null_vcpu *wvc;
 
     ASSERT(list_empty(&null_vcpu(v)->waitq_elem));
+    ASSERT(per_cpu(npc, v->processor).vcpu == v);
+    ASSERT(!cpumask_test_cpu(v->processor, &prv->cpus_free));
 
     per_cpu(npc, cpu).vcpu = NULL;
     cpumask_set_cpu(cpu, &prv->cpus_free);
@@ -539,9 +541,6 @@ static void null_vcpu_remove(const struct scheduler *ops, struct vcpu *v)
         goto out;
     }
 
-    ASSERT(per_cpu(npc, v->processor).vcpu == v);
-    ASSERT(!cpumask_test_cpu(v->processor, &prv->cpus_free));
-
     vcpu_deassign(prv, v);
 
  out:



* Re: [Xen-devel] [PATCH v1 1/5] xen: sched: null: refactor core around vcpu_deassign()
  2018-08-25  0:21 ` [PATCH v1 1/5] xen: sched: null: refactor core around vcpu_deassign() Dario Faggioli
@ 2019-07-15 15:46   ` George Dunlap
  2019-07-24 10:32     ` Jan Beulich
  0 siblings, 1 reply; 19+ messages in thread
From: George Dunlap @ 2019-07-15 15:46 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel; +Cc: George Dunlap

On 8/25/18 1:21 AM, Dario Faggioli wrote:
> vcpu_deassign() has only one caller: _vcpu_remove().
> Let's consolidate the two functions into one.
> 
> No functional change intended.
> 
> Signed-off-by: Dario Faggioli <dfaggioli@suse.com>

Acked-by: George Dunlap <george.dunlap@citrix.com>


* Re: [Xen-devel] [PATCH v1 2/5] xen: sched: null: don't assign down vcpus to pcpus
  2018-08-25  0:21 ` [PATCH v1 2/5] xen: sched: null: don't assign down vcpus to pcpus Dario Faggioli
@ 2019-07-15 16:06   ` George Dunlap
  2019-07-16 10:50     ` Dario Faggioli
  0 siblings, 1 reply; 19+ messages in thread
From: George Dunlap @ 2019-07-15 16:06 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel
  Cc: George Dunlap, Stefano Stabellini, Roger Pau Monne

On 8/25/18 1:21 AM, Dario Faggioli wrote:
> If a pCPU has been/is being offlined, we don't want it to be neither
> assigned to any pCPU, nor in the wait list.

I take it the first `pCPU` should be `vCPU`?

Also, English grammar agreement is funny: `neither` needs to agree with
`nor`, but the two do *not* agree with the original verb.  That is, the
sentence should say:

"...we want it to be neither assigned to pCPU, nor in the wait list".

Both here and in the comment.

The other thing is, from a "developmental purity" point of view, I think
this series technically has a regression in the middle: cpu offline /
online stops working between patch 2 and patch 4.  But I'm inclined in
this case not to worry too much. :-)

Everything else looks good.

 -George


* Re: [Xen-devel] [PATCH v1 2/5] xen: sched: null: don't assign down vcpus to pcpus
  2019-07-15 16:06   ` [Xen-devel] " George Dunlap
@ 2019-07-16 10:50     ` Dario Faggioli
  2019-07-16 12:02       ` George Dunlap
  0 siblings, 1 reply; 19+ messages in thread
From: Dario Faggioli @ 2019-07-16 10:50 UTC (permalink / raw)
  To: george.dunlap, xen-devel; +Cc: george.dunlap, sstabellini, roger.pau


On Mon, 2019-07-15 at 17:06 +0100, George Dunlap wrote:
> On 8/25/18 1:21 AM, Dario Faggioli wrote:
> > If a pCPU has been/is being offlined, we don't want it to be
> > neither
> > assigned to any pCPU, nor in the wait list.
> 
> I take it the first `pCPU` should be `vCPU`?
> 
Indeed. :-)

> Also, English grammar agreement is funny: `neither` needs to agree
> with
> `nor`, but the two do *not* agree with the original verb.  That is,
> the
> sentence should say:
> 
> "...we want it to be neither assigned to pCPU, nor in the wait list".
> 
Yep, now that I see it, this rings a bell back from my high-school
English class! :-O

Sorry... and thanks. :-)

> Both here and in the comment.
>
And in patch 3 changelog too, I'm afraid. :-P

> The other thing is, from a "developmental purity" point of view, I
> think
> this series technically has a regression in the middle: cpu offline /
> online stops working between patch 2 and patch 4.  But I'm inclined
> in
> this case not to worry too much. :-)
> 
Well, the point is that offlining/onlining does not work before this
series. System does not crash, but behavior is wrong, as offline vCPUs
stay assigned to pCPUs (keeping them idle) while online vCPUs are
"trapped" in the wait list, which is wrong.

So that's why I don't think there's much value in being consistent with
such behavior throughout the series... which I guess is why you said
you "won't worry too much in this case" ?

Thanks and Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)



* Re: [Xen-devel] [PATCH v1 2/5] xen: sched: null: don't assign down vcpus to pcpus
  2019-07-16 10:50     ` Dario Faggioli
@ 2019-07-16 12:02       ` George Dunlap
  2019-07-16 15:00         ` Dario Faggioli
  0 siblings, 1 reply; 19+ messages in thread
From: George Dunlap @ 2019-07-16 12:02 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, sstabellini, George Dunlap, Roger Pau Monne



> On Jul 16, 2019, at 11:50 AM, Dario Faggioli <dfaggioli@suse.com> wrote:
> 
> On Mon, 2019-07-15 at 17:06 +0100, George Dunlap wrote:
>> On 8/25/18 1:21 AM, Dario Faggioli wrote:
>>> If a pCPU has been/is being offlined, we don't want it to be
>>> neither
>>> assigned to any pCPU, nor in the wait list.
>> 
>> I take it the first `pCPU` should be `vCPU`?
>> 
> Indeed. :-)
> 
>> Also, English grammar agreement is funny: `neither` needs to agree
>> with
>> `nor`, but the two do *not* agree with the original verb.  That is,
>> the
>> sentence should say:
>> 
>> "...we want it to be neither assigned to pCPU, nor in the wait list".
>> 
> Yep, now that I see it, this rings a bell back from my high-school
> English class! :-O
> 
> Sorry... and thanks. :-)
> 
>> Both here and in the comment.
>> 
> And in patch 3 changelog too, I'm afraid. :-P
> 
>> The other thing is, from a "developmental purity" point of view, I
>> think
>> this series technically has a regression in the middle: cpu offline /
>> online stops working between patch 2 and patch 4.  But I'm inclined
>> in
>> this case not to worry too much. :-)
>> 
> Well, the point is that offlining/onlining does not work before this
> series. System does not crash, but behavior is wrong, as offline vCPUs
> stay assigned to pCPUs (keeping them idle) while online vCPUs are
> "trapped" in the wait list, which is wrong.
> 
> So that's why I don't think there's much value in being consistent with
> such behavior throughout the series... which I guess is why you said
> you "won't worry too much in this case” ?

It’s definitely sub-optimal from a system point of view; but from a guest point of view, it does (or should) function.  Before this series, if a guest offlined and then onlined vcpus, they should come back.  In the middle of this series, once a vcpu is offlined, it can’t be brought back up.  (That is, if I’m reading it right.)

But I’m not expecting people to be doing bisections of that particular functionality in this particular scheduler very much.  I think the “benefit” of avoiding a complicated re-write is well worth the “cost” of having that particular bisection fail, on the off chance that anyone tries it.

 -George


* Re: [Xen-devel] [PATCH v1 2/5] xen: sched: null: don't assign down vcpus to pcpus
  2019-07-16 12:02       ` George Dunlap
@ 2019-07-16 15:00         ` Dario Faggioli
  0 siblings, 0 replies; 19+ messages in thread
From: Dario Faggioli @ 2019-07-16 15:00 UTC (permalink / raw)
  To: George.Dunlap; +Cc: xen-devel, sstabellini, roger.pau


On Tue, 2019-07-16 at 12:02 +0000, George Dunlap wrote:
> > On Jul 16, 2019, at 11:50 AM, Dario Faggioli <faggioli@suse.com>
> > On Mon, 2019-07-15 at 17:06 +0100, George Dunlap wrote:
> > > On 8/25/18 1:21 AM, Dario Faggioli wrote:
> > > > 
> > > The other thing is, from a "developmental purity" point of view,
> > > I
> > > think
> > > this series technically has a regression in the middle: cpu
> > > offline /
> > > online stops working between patch 2 and patch 4.  But I'm
> > > inclined
> > > in
> > > this case not to worry too much. :-)
> > > 
> > Well, the point is that offlining/onlining does not work before this
> > series. The system does not crash, but the behavior is wrong: offline
> > vCPUs stay assigned to pCPUs (keeping them idle), while online vCPUs
> > are "trapped" in the wait list.
> > 
> > So that's why I don't think there's much value in being consistent
> > with
> > such behavior throughout the series... which I guess is why you
> > said
> > you "won't worry too much in this case” ?
> 
> It’s definitely sub-optimal from a system point of view; but from a
> guest point of view, it does (or should) function.  Before this
> series, if a guest offlines and then onlines vcpus, they should come
> back.
>
Well, yes, I guess they should. IAC, one of the main backing usecases
for these fixes is when null is used as the scheduler of the Xen PV-
SHIM. In that case, if I remember correctly, the L1 and L2 vCPUs are
created offline, and then onlined dynamically. And they don't come up.
:-)

Anyway...

> In the middle of this series, once a vcpu is offlined, it can’t be
> brought back up.  (That is, if I’m reading it right.)
>
...Yes, at that stage, things are working even less. But, more
importantly...

> But I’m not expecting people to be doing bisections of that
> particular functionality in this particular scheduler very much.  I
> think the “benefit” of avoiding a complicated re-write is well worth
> the “cost” of having that particular bisection fail, on the off
> chance that anyone tries it.
> 
...As we fully agree on this, let's move on.

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)



* Re: [Xen-devel] [PATCH v1 3/5] xen: sched: null: deassign offline vcpus from pcpus
  2018-08-25  0:21 ` [PATCH v1 3/5] xen: sched: null: deassign offline vcpus from pcpus Dario Faggioli
@ 2019-07-17 16:04   ` George Dunlap
  2019-07-17 18:39     ` Dario Faggioli
  0 siblings, 1 reply; 19+ messages in thread
From: George Dunlap @ 2019-07-17 16:04 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel
  Cc: George Dunlap, Stefano Stabellini, Roger Pau Monne

On 8/25/18 1:21 AM, Dario Faggioli wrote:
> If a vCPU has been/is being offlined, we don't want it to be either
> assigned to any pCPU, or in the wait list.
> 
> Therefore, when we detect that a vcpu is going offline, remove it from
> both places.

Hmm, this commit message wasn't very informative.

It looks like what you really mean to do is:

> Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
> ---
> Cc: George Dunlap <george.dunlap@eu.citrix.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Roger Pau Monne <roger.pau@citrix.com>
> ---
>  xen/common/sched_null.c |   43 +++++++++++++++++++++++++++++++++++++------
>  1 file changed, 37 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c
> index 1426124525..6259f4643e 100644
> --- a/xen/common/sched_null.c
> +++ b/xen/common/sched_null.c
> @@ -361,7 +361,8 @@ static void vcpu_assign(struct null_private *prv, struct vcpu *v,
>      }
>  }
>  
> -static void vcpu_deassign(struct null_private *prv, struct vcpu *v)
> +/* Returns true if a cpu was tickled */
> +static bool vcpu_deassign(struct null_private *prv, struct vcpu *v)
>  {
>      unsigned int bs;
>      unsigned int cpu = v->processor;
> @@ -406,11 +407,13 @@ static void vcpu_deassign(struct null_private *prv, struct vcpu *v)
>                  vcpu_assign(prv, wvc->vcpu, cpu);
>                  cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
>                  spin_unlock(&prv->waitq_lock);
> -                return;
> +                return true;
>              }
>          }
>      }
>      spin_unlock(&prv->waitq_lock);
> +
> +    return false;
>  }
>  
>  /* Change the scheduler of cpu to us (null). */
> @@ -518,6 +521,14 @@ static void null_vcpu_remove(const struct scheduler *ops, struct vcpu *v)
>  
>      lock = vcpu_schedule_lock_irq(v);
>  
> +    /* If offline, the vcpu shouldn't be assigned, nor in the waitqueue */
> +    if ( unlikely(!is_vcpu_online(v)) )
> +    {
> +        ASSERT(per_cpu(npc, v->processor).vcpu != v);
> +        ASSERT(list_empty(&nvc->waitq_elem));
> +        goto out;
> +    }

* Handle the case of an offline vcpu being removed (ASSERTing that it's
neither on a processor nor on the waitqueue)

But wait, isn't this fixing an important regression in patch 2?  If after
patch 2 but before patch 3, a VM is created with offline vcpus, and then
destroyed, won't the offline vcpus reach here neither on the waitlist
nor assigned to a pcpu?

Offlining/onlining vcpus is one thing; but creating and destroying
guests is something different.

>      /* If v is in waitqueue, just get it out of there and bail */
>      if ( unlikely(!list_empty(&nvc->waitq_elem)) )
>      {
> @@ -567,11 +578,31 @@ static void null_vcpu_wake(const struct scheduler *ops, struct vcpu *v)
>  
>  static void null_vcpu_sleep(const struct scheduler *ops, struct vcpu *v)
>  {
> +    struct null_private *prv = null_priv(ops);
> +    unsigned int cpu = v->processor;
> +    bool tickled = false;
> +
>      ASSERT(!is_idle_vcpu(v));
>  
> +    /* We need to special case the handling of the vcpu being offlined */
> +    if ( unlikely(!is_vcpu_online(v)) )
> +    {
> +        struct null_vcpu *nvc = null_vcpu(v);
> +
> +        printk("YYY d%dv%d going down?\n", v->domain->domain_id, v->vcpu_id);
> +        if ( unlikely(!list_empty(&nvc->waitq_elem)) )
> +        {
> +            spin_lock(&prv->waitq_lock);
> +            list_del_init(&nvc->waitq_elem);
> +            spin_unlock(&prv->waitq_lock);
> +        }
> +        else if ( per_cpu(npc, cpu).vcpu == v )
> +            tickled = vcpu_deassign(prv, v);
> +    }

* Handle the unexpected(?) case of a vcpu being put to sleep as(?) it's
offlined

If it's not unexpected, then why the printk?

And if it is unexpected, what is the expected path for a vcpu going
offline to be de-assigned from a pcpu (which is what the title seems to
imply this patch is about)?

> +
>      /* If v is not assigned to a pCPU, or is not running, no need to bother */
> -    if ( curr_on_cpu(v->processor) == v )
> -        cpu_raise_softirq(v->processor, SCHEDULE_SOFTIRQ);
> +    if ( likely(!tickled && curr_on_cpu(cpu) == v) )
> +        cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
>  
>      SCHED_STAT_CRANK(vcpu_sleep);
>  }
> @@ -615,12 +646,12 @@ static void null_vcpu_migrate(const struct scheduler *ops, struct vcpu *v,
>       *
>       * In the latter, there is just nothing to do.
>       */
> -    if ( likely(list_empty(&nvc->waitq_elem)) )
> +    if ( likely(per_cpu(npc, v->processor).vcpu == v) )
>      {
>          vcpu_deassign(prv, v);
>          SCHED_STAT_CRANK(migrate_running);
>      }
> -    else
> +    else if ( !list_empty(&nvc->waitq_elem) )
>          SCHED_STAT_CRANK(migrate_on_runq);

* Teach null_vcpu_migrate() that !on_waitqueue != on_vcpu.

It looks like the comment just above this hunk is now out of date:

"v is either assigned to a pCPU, or in the waitqueue."

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* Re: [Xen-devel] [PATCH v1 3/5] xen: sched: null: deassign offline vcpus from pcpus
  2019-07-17 16:04   ` [Xen-devel] " George Dunlap
@ 2019-07-17 18:39     ` Dario Faggioli
  2019-07-18 13:09       ` George Dunlap
  0 siblings, 1 reply; 19+ messages in thread
From: Dario Faggioli @ 2019-07-17 18:39 UTC (permalink / raw)
  To: george.dunlap, xen-devel; +Cc: george.dunlap, sstabellini, roger.pau



On Wed, 2019-07-17 at 17:04 +0100, George Dunlap wrote:
> On 8/25/18 1:21 AM, Dario Faggioli wrote:
> > If a vCPU has been/is being offlined, we don't want it to be either
> > assigned to any pCPU, or in the wait list.
> > 
> > Therefore, when we detect that a vcpu is going offline, remove it
> > from
> > both places.
> 
> Hmm, this commit message wasn't very informative.
> 
Ok, we can certainly improve that. :-)

> It looks like what you really mean to do is:
> 

> > @@ -518,6 +521,14 @@ static void null_vcpu_remove(const struct
> > scheduler *ops, struct vcpu *v)
> >  
> >      lock = vcpu_schedule_lock_irq(v);
> >  
> > +    /* If offline, the vcpu shouldn't be assigned, nor in the
> > waitqueue */
> > +    if ( unlikely(!is_vcpu_online(v)) )
> > +    {
> > +        ASSERT(per_cpu(npc, v->processor).vcpu != v);
> > +        ASSERT(list_empty(&nvc->waitq_elem));
> > +        goto out;
> > +    }
> 
> * Handle the case of an offline vcpu being removed (ASSERTing that
> it's
> neither on a processor nor on the waitqueue)
> 
So, IIRC (sorry, it's been a while :-D ), this is for dealing with
remove_vcpu() being called on a vcpu which is offline. So, yes,
basically what you said. :-)

The point is that the work of removing such a vCPU from any pCPU and from
the wait list has been done already, in null_vcpu_sleep(), while the vCPU
was going offline. So, here, we only need to make sure that we don't do
anything, i.e., that we don't call _vcpu_remove().

> But wait, isn't this fixing a important regression in patch 2?  If
> after
> patch 2 but before patch 3, a VM is created with offline vcpus, and
> then
> destroyed, won't the offline vcpus reach here neither on the waitlist
> nor on a vcpu?
> 
I'm not sure I understand the point you're trying to make here, sorry.

In general, considering what we've said in other mails, if you think
that patch 2 and 3 should be merged, we can do that.

My reasoning, when putting together the series, was the one I already
stated: this is broken already, so no big deal breaking it "more", and
I continue to see it that way.

But I appreciate that you see it differently and, since I don't have
too strong an opinion, I'd be fine merging the patches (or doing other
series rearrangements, if you feel strongly that they're necessary).

Or is it something completely different that you meant?

> > @@ -567,11 +578,31 @@ static void null_vcpu_wake(const struct
> > scheduler *ops, struct vcpu *v)
> >  
> >  static void null_vcpu_sleep(const struct scheduler *ops, struct
> > vcpu *v)
> >  {
> > +    struct null_private *prv = null_priv(ops);
> > +    unsigned int cpu = v->processor;
> > +    bool tickled = false;
> > +
> >      ASSERT(!is_idle_vcpu(v));
> >  
> > +    /* We need to special case the handling of the vcpu being
> > offlined */
> > +    if ( unlikely(!is_vcpu_online(v)) )
> > +    {
> > +        struct null_vcpu *nvc = null_vcpu(v);
> > +
> > +        printk("YYY d%dv%d going down?\n", v->domain->domain_id,
> > v->vcpu_id);
> > +        if ( unlikely(!list_empty(&nvc->waitq_elem)) )
> > +        {
> > +            spin_lock(&prv->waitq_lock);
> > +            list_del_init(&nvc->waitq_elem);
> > +            spin_unlock(&prv->waitq_lock);
> > +        }
> > +        else if ( per_cpu(npc, cpu).vcpu == v )
> > +            tickled = vcpu_deassign(prv, v);
> > +    }
> 
> * Handle the unexpected(?) case of a vcpu being put to sleep as(?)
> it's
> offlined
> 
Well, that printk, really shouldn't be there! :-(

> If it's not unexpected, then why the printk?
> 
Because it was a debugging aid, for while I was working on the series,
and I apparently failed at killing it before sending.

Sorry. :-(

> And if it is unexpected, what is the expected path for a cpu going
> offline to be de-assigned from a vcpu (which is what the title seems
> to
> imply this patch is about)?
> 
This is the vCPU going down: when do_vcpu_op(VCPU_down) is invoked on
it, it then calls vcpu_sleep_nosync() with _VPF_down set in
pause_flags (which means vcpu_is_online() would be false).

So it is indeed the _expected_ path, and sorry again if the stray
debugging printk made you think otherwise.

> > @@ -615,12 +646,12 @@ static void null_vcpu_migrate(const struct
> > scheduler *ops, struct vcpu *v,
> >       *
> >       * In the latter, there is just nothing to do.
> >       */
> > -    if ( likely(list_empty(&nvc->waitq_elem)) )
> > +    if ( likely(per_cpu(npc, v->processor).vcpu == v) )
> >      {
> >          vcpu_deassign(prv, v);
> >          SCHED_STAT_CRANK(migrate_running);
> >      }
> > -    else
> > +    else if ( !list_empty(&nvc->waitq_elem) )
> >          SCHED_STAT_CRANK(migrate_on_runq);
> 
> * Teach null_vcpu_migrate() that !on_waitqueue != on_vcpu.
> 
> It looks like the comment just above this hunk is now out of date:
> 
> "v is either assigned to a pCPU, or in the waitqueue."
> 
Yes it does.

Thanks and Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)



* Re: [Xen-devel] [PATCH v1 3/5] xen: sched: null: deassign offline vcpus from pcpus
  2019-07-17 18:39     ` Dario Faggioli
@ 2019-07-18 13:09       ` George Dunlap
  2019-07-19 15:46         ` Dario Faggioli
  0 siblings, 1 reply; 19+ messages in thread
From: George Dunlap @ 2019-07-18 13:09 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel; +Cc: george.dunlap, sstabellini, roger.pau

On 7/17/19 7:39 PM, Dario Faggioli wrote:
>>> @@ -518,6 +521,14 @@ static void null_vcpu_remove(const struct
>>> scheduler *ops, struct vcpu *v)
>>>  
>>>      lock = vcpu_schedule_lock_irq(v);
>>>  
>>> +    /* If offline, the vcpu shouldn't be assigned, nor in the
>>> waitqueue */
>>> +    if ( unlikely(!is_vcpu_online(v)) )
>>> +    {
>>> +        ASSERT(per_cpu(npc, v->processor).vcpu != v);
>>> +        ASSERT(list_empty(&nvc->waitq_elem));
>>> +        goto out;
>>> +    }
>>
>> * Handle the case of an offline vcpu being removed (ASSERTing that
>> it's
>> neither on a processor nor on the waitqueue)
>>
> So, IIRC (sorry, it's been a while :-D ), this is for dealing with
> remove_vcpu() being called on a vcpu which is offline. So, yes,
> basically what you said. :-)
> 
> Point is the work of removing such vCPU from any CPU and from the wait
> list has been done already, in null_vcpu_sleep(), while the vCPU was
> going offline. So, here, we only need to make sure that we don't do
> anything, i.e., that we don't call _vcpu_remove().

Right; I'm mainly saying, if the commit message had said what I wrote
above, then I would have immediately been able to see what this hunk
was doing and understand why it was needed.

>> But wait, isn't this fixing an important regression in patch 2?  If
>> after patch 2 but before patch 3, a VM is created with offline vcpus,
>> and then destroyed, won't the offline vcpus reach here neither on the
>> waitlist nor assigned to a pcpu?
>>
> I'm not sure I understand the point you're trying to make here, sorry.
> 
> In general, considering what we've said in other mails, if you think
> that patch 2 and 3 should be merged, we can do that.
> 
> My reasoning, when putting together the series, was the one I already
> stated: this is broken already, so no big deal breaking it "more", and
> I continue to see it that way.
> 
> But I appreciate you seeing it differently, while I don't have a too
> strong opinion, so I'd be fine merging the patches (or doing other
> series rearrangements, if you feel strongly that they're necessary).
> 
> Or is it something completely different that you meant?

Merging the patches would be one way to avoid the regression, yes.

Sorry to be picky, but I've recently spent a lot of time doing
archaeology, and wishing people in the distant past had been more
careful / informative in their commit hygiene.  A little bit of effort
now may save someone in the future -- possibly you or me -- hours of
frustration. :-)

 -George


* Re: [Xen-devel] [PATCH v1 3/5] xen: sched: null: deassign offline vcpus from pcpus
  2019-07-18 13:09       ` George Dunlap
@ 2019-07-19 15:46         ` Dario Faggioli
  0 siblings, 0 replies; 19+ messages in thread
From: Dario Faggioli @ 2019-07-19 15:46 UTC (permalink / raw)
  To: george.dunlap, xen-devel; +Cc: george.dunlap, sstabellini, roger.pau



On Thu, 2019-07-18 at 14:09 +0100, George Dunlap wrote:
> On 7/17/19 7:39 PM, Dario Faggioli wrote:
> > Point is the work of removing such vCPU from any CPU and from the
> > wait
> > list has been done already, in null_vcpu_sleep(), while the vCPU
> > was
> > going offline. So, here, we only need to make sure that we don't do
> > anything, i.e., that we don't call _vcpu_remove().
> 
> Right; I'm mainly saying, if the commit message had said what I wrote
> above, then I would  have immediately been able to see what this hunk
> was doing and understand why it was needed.
> 
Ok then, I'll improve the commit message and...

> > But I appreciate you seeing it differently, while I don't have a
> > too
> > strong opinion, so I'd be fine merging the patches (or doing other
> > series rearrangements, if you feel strongly that they're
> > necessary).
> > 
> Merging the patches would be one way to avoid the regression, yes.
> 
... I'll merge the patches.

> Sorry to be picky, but I've recently spent a lot of time doing
> archaeology, and wishing people in the distant past had been more
> careful / informative in their commit hygiene.
>
No problem at all, I see and agree on the fact that changelogs are
really important. :-)

I guess I'll wait a little, to see if you have any comments on patch 4,
and then resend.

Thanks and Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)



* Re: [Xen-devel] [PATCH v1 4/5] xen: sched: null: reassign vcpus to pcpus when they come back online
  2018-08-25  0:22 ` [PATCH v1 4/5] xen: sched: null: reassign vcpus to pcpus when they come back online Dario Faggioli
@ 2019-07-19 16:08   ` George Dunlap
  0 siblings, 0 replies; 19+ messages in thread
From: George Dunlap @ 2019-07-19 16:08 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel
  Cc: George Dunlap, Stefano Stabellini, Dario Faggioli, Roger Pau Monne

On 8/25/18 1:22 AM, Dario Faggioli wrote:
> When a vcpu that was offline comes back online, we do want it either to
> be assigned to a pCPU, or to go into the wait list.
> 
> So let's do exactly that. Detecting that a vcpu is coming back online is
> a bit tricky. Basically, if the vcpu is waking up, and is neither
> assigned to a pCPU, nor in the wait list, it must be coming back from
> offline.
> 
> When this happens, we put it in the waitqueue, and we "tickle" an idle
> pCPU (if any), to go pick it up.
> 
> Looking at the patch, it seems that the vcpu wakeup code is getting
> complex, and hence that it could potentially introduce latencies.
> However, all this new logic is triggered only by the case of a vcpu
> coming online, so, basically, the overhead during normal operations is
> just an additional 'if()'.
> 
> Signed-off-by: Dario Faggioli <dario.faggioli@suse.com>

Reviewed-by: George Dunlap <george.dunlap@citrix.com>


* Re: [Xen-devel] [PATCH v1 5/5] xen: sched: null: refactor the ASSERTs around vcpu_deassing()
  2018-08-25  0:22 ` [PATCH v1 5/5] xen: sched: null: refactor the ASSERTs around vcpu_deassing() Dario Faggioli
@ 2019-07-19 16:20   ` George Dunlap
  0 siblings, 0 replies; 19+ messages in thread
From: George Dunlap @ 2019-07-19 16:20 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel; +Cc: George Dunlap

On 8/25/18 1:22 AM, Dario Faggioli wrote:
> Whenever we call vcpu_deassign(), the vcpu _must_ be assigned to a
> pCPU, and hence such pCPU can't be free.
> 
> Therefore, move the ASSERTs which check for these properties into that
> function, where they belong.
> ---
> Cc: George Dunlap <george.dunlap@eu.citrix.com>

This seems to be missing an SoB.

With that fixed:

Reviewed-by: George Dunlap <george.dunlap@citrix.com>



* Re: [Xen-devel] [PATCH v1 1/5] xen: sched: null: refactor core around vcpu_deassign()
  2019-07-15 15:46   ` [Xen-devel] " George Dunlap
@ 2019-07-24 10:32     ` Jan Beulich
  2019-07-24 13:20       ` Dario Faggioli
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2019-07-24 10:32 UTC (permalink / raw)
  To: George Dunlap, Dario Faggioli; +Cc: George Dunlap, xen-devel

On 15.07.2019 17:46, George Dunlap wrote:
> On 8/25/18 1:21 AM, Dario Faggioli wrote:
>> vcpu_deassign() has only one caller: _vcpu_remove().
>> Let's consolidate the two functions into one.
>>
>> No functional change intended.
>>
>> Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
> 
> Acked-by: George Dunlap <george.dunlap@citrix.com>

I thought I'd apply this, but I can't find the mail in my mailbox
anymore. And I'm not surprised, seeing the date of the original
posting. So unless George wants to apply it, could I ask you,
Dario, to resend?

Jan

* Re: [Xen-devel] [PATCH v1 1/5] xen: sched: null: refactor core around vcpu_deassign()
  2019-07-24 10:32     ` Jan Beulich
@ 2019-07-24 13:20       ` Dario Faggioli
  0 siblings, 0 replies; 19+ messages in thread
From: Dario Faggioli @ 2019-07-24 13:20 UTC (permalink / raw)
  To: Jan Beulich, George Dunlap; +Cc: George Dunlap, xen-devel



On Wed, 2019-07-24 at 10:32 +0000, Jan Beulich wrote:
> On 15.07.2019 17:46, George Dunlap wrote:
> > On 8/25/18 1:21 AM, Dario Faggioli wrote:
> > > vcpu_deassign() has only one caller: _vcpu_remove().
> > > Let's consolidate the two functions into one.
> > > 
> > > No functional change intended.
> > > 
> > > Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
> > 
> > Acked-by: George Dunlap <george.dunlap@citrix.com>
> 
> I thought I'd apply this, but I can't find the mail in my mailbox
> anymore. And I'm not surprised, seeing the date of the original
> posting. So unless George wants to apply it, could I ask you,
> Dario, to resend?
> 
I will resend, sure.

A couple of the patches need tweaking, and I was planning to resend the
entire series at once, if this is not a problem.

Thanks and Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)



end of thread, other threads:[~2019-07-24 13:21 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-25  0:21 [PATCH v1 0/5] Series short description Dario Faggioli
2018-08-25  0:21 ` [PATCH v1 1/5] xen: sched: null: refactor core around vcpu_deassign() Dario Faggioli
2019-07-15 15:46   ` [Xen-devel] " George Dunlap
2019-07-24 10:32     ` Jan Beulich
2019-07-24 13:20       ` Dario Faggioli
2018-08-25  0:21 ` [PATCH v1 2/5] xen: sched: null: don't assign down vcpus to pcpus Dario Faggioli
2019-07-15 16:06   ` [Xen-devel] " George Dunlap
2019-07-16 10:50     ` Dario Faggioli
2019-07-16 12:02       ` George Dunlap
2019-07-16 15:00         ` Dario Faggioli
2018-08-25  0:21 ` [PATCH v1 3/5] xen: sched: null: deassign offline vcpus from pcpus Dario Faggioli
2019-07-17 16:04   ` [Xen-devel] " George Dunlap
2019-07-17 18:39     ` Dario Faggioli
2019-07-18 13:09       ` George Dunlap
2019-07-19 15:46         ` Dario Faggioli
2018-08-25  0:22 ` [PATCH v1 4/5] xen: sched: null: reassign vcpus to pcpus when they come back online Dario Faggioli
2019-07-19 16:08   ` [Xen-devel] " George Dunlap
2018-08-25  0:22 ` [PATCH v1 5/5] xen: sched: null: refactor the ASSERTs around vcpu_deassing() Dario Faggioli
2019-07-19 16:20   ` [Xen-devel] " George Dunlap
