All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH for-4.11 1/3] xen/schedule.c: Fix up whitespace
@ 2018-05-02 10:16 George Dunlap
  2018-05-02 10:16 ` [PATCH for-4.11 2/3] xen: Introduce vcpu_sleep_nosync_locked() George Dunlap
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: George Dunlap @ 2018-05-02 10:16 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Tim Deegan, George Dunlap, Dario Faggioli, Julien Grall,
	Jan Beulich, Ian Jackson

Delete tabs and trailing whitespace.

No functional change.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
Changes since RFC: Introduced

CC: Dario Faggioli <dfaggioli@suse.com>
CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Tim Deegan <tim@xen.org>
CC: Konrad Wilk <konrad.wilk@oracle.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@arm.com>
CC: Juergen Gross <jgross@suse.com>
---
 xen/common/schedule.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 343ab6306e..69d255389e 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -7,7 +7,7 @@
  *        File: common/schedule.c
  *      Author: Rolf Neugebauer & Keir Fraser
  *              Updated for generic API by Mark Williamson
- * 
+ *
  * Description: Generic CPU scheduling code
  *              implements support functionality for the Xen scheduler API.
  *
@@ -49,7 +49,7 @@ string_param("sched", opt_sched);
 bool_t sched_smt_power_savings = 0;
 boolean_param("sched_smt_power_savings", sched_smt_power_savings);
 
-/* Default scheduling rate limit: 1ms 
+/* Default scheduling rate limit: 1ms
  * The behavior when sched_ratelimit_us is greater than sched_credit_tslice_ms is undefined
  * */
 int sched_ratelimit_us = SCHED_DEFAULT_RATELIMIT_US;
@@ -253,7 +253,7 @@ static void sched_spin_unlock_double(spinlock_t *lock1, spinlock_t *lock2,
     spin_unlock_irqrestore(lock1, flags);
 }
 
-int sched_init_vcpu(struct vcpu *v, unsigned int processor) 
+int sched_init_vcpu(struct vcpu *v, unsigned int processor)
 {
     struct domain *d = v->domain;
     cpumask_t allcpus;
@@ -271,7 +271,7 @@ int sched_init_vcpu(struct vcpu *v, unsigned int processor)
                v, v->processor);
 
     v->sched_priv = SCHED_OP(dom_scheduler(d), alloc_vdata, v,
-		             d->sched_priv);
+                     d->sched_priv);
     if ( v->sched_priv == NULL )
         return 1;
 
@@ -1319,11 +1319,11 @@ long do_set_timer_op(s_time_t timeout)
               unlikely((offset > 0) && ((uint32_t)(offset >> 50) != 0)) )
     {
         /*
-         * Linux workaround: occasionally we will see timeouts a long way in 
-         * the future due to wrapping in Linux's jiffy time handling. We check 
-         * for timeouts wrapped negative, and for positive timeouts more than 
-         * about 13 days in the future (2^50ns). The correct fix is to trigger 
-         * an interrupt immediately (since Linux in fact has pending work to 
+         * Linux workaround: occasionally we will see timeouts a long way in
+         * the future due to wrapping in Linux's jiffy time handling. We check
+         * for timeouts wrapped negative, and for positive timeouts more than
+         * about 13 days in the future (2^50ns). The correct fix is to trigger
+         * an interrupt immediately (since Linux in fact has pending work to
          * do in this situation). However, older guests also set a long timeout
          * when they have *no* pending timers at all: setting an immediate
          * timeout in this case can burn a lot of CPU. We therefore go for a
@@ -1425,7 +1425,7 @@ static void vcpu_periodic_timer_work(struct vcpu *v)
     set_timer(&v->periodic_timer, periodic_next_event);
 }
 
-/* 
+/*
  * The main function
  * - deschedule the current domain (scheduler independent).
  * - pick a new domain (scheduler dependent).
@@ -1471,7 +1471,7 @@ static void schedule(void)
     now = NOW();
 
     stop_timer(&sd->s_timer);
-    
+
     /* get policy-specific decision on scheduling... */
     sched = this_cpu(scheduler);
     next_slice = sched->do_schedule(sched, now, tasklet_work_scheduled);
-- 
2.17.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH for-4.11 2/3] xen: Introduce vcpu_sleep_nosync_locked()
  2018-05-02 10:16 [PATCH for-4.11 1/3] xen/schedule.c: Fix up whitespace George Dunlap
@ 2018-05-02 10:16 ` George Dunlap
  2018-05-02 10:16 ` [PATCH for-4.11 3/3] xen/schedule: Fix races in vcpu migration George Dunlap
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: George Dunlap @ 2018-05-02 10:16 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap

There are a lot of places which release a lock before calling
vcpu_sleep_nosync(), which then just grabs the lock again.  This is
not only a waste of time, but leads to more code duplication (since
you have to copy-and-paste recipes rather than calling a unified
function), which in turn leads to an increased chance of bugs.

Introduce vcpu_sleep_nosync_locked(), which can be called if you
already hold the schedule lock.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
Release-acked-by: Juergen Gross <jgross@suse.com>
---
 xen/common/schedule.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 69d255389e..7489833361 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -455,14 +455,9 @@ void sched_destroy_domain(struct domain *d)
     }
 }
 
-void vcpu_sleep_nosync(struct vcpu *v)
+void vcpu_sleep_nosync_locked(struct vcpu *v)
 {
-    unsigned long flags;
-    spinlock_t *lock;
-
-    TRACE_2D(TRC_SCHED_SLEEP, v->domain->domain_id, v->vcpu_id);
-
-    lock = vcpu_schedule_lock_irqsave(v, &flags);
+    ASSERT(spin_is_locked(per_cpu(schedule_data,v->processor).schedule_lock));
 
     if ( likely(!vcpu_runnable(v)) )
     {
@@ -471,6 +466,18 @@ void vcpu_sleep_nosync(struct vcpu *v)
 
         SCHED_OP(vcpu_scheduler(v), sleep, v);
     }
+}
+
+void vcpu_sleep_nosync(struct vcpu *v)
+{
+    unsigned long flags;
+    spinlock_t *lock;
+
+    TRACE_2D(TRC_SCHED_SLEEP, v->domain->domain_id, v->vcpu_id);
+
+    lock = vcpu_schedule_lock_irqsave(v, &flags);
+
+    vcpu_sleep_nosync_locked(v);
 
     vcpu_schedule_unlock_irqrestore(lock, flags, v);
 }
-- 
2.17.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH for-4.11 3/3] xen/schedule: Fix races in vcpu migration
  2018-05-02 10:16 [PATCH for-4.11 1/3] xen/schedule.c: Fix up whitespace George Dunlap
  2018-05-02 10:16 ` [PATCH for-4.11 2/3] xen: Introduce vcpu_sleep_nosync_locked() George Dunlap
@ 2018-05-02 10:16 ` George Dunlap
  2018-05-02 10:24 ` [PATCH for-4.11 1/3] xen/schedule.c: Fix up whitespace Juergen Gross
  2018-05-02 10:38 ` Wei Liu
  3 siblings, 0 replies; 5+ messages in thread
From: George Dunlap @ 2018-05-02 10:16 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap

The current sequence to initiate vcpu migration is inefficent and error-prone:

- The initiator sets VPF_migraging with the lock held, then drops the
  lock and calls vcpu_sleep_nosync(), which immediately grabs the lock
  again

- A number of places unnecessarily check for v->pause_flags in between
  those two

- Every call to vcpu_migrate() must be prefaced with
  vcpu_sleep_nosync() or introduce a race condition; this code
  duplication is error-prone

- In the event that v->is_running is true at the beginning of
  vcpu_migrate(), it's almost certain that vcpu_migrate() will end up
  being called in context_switch() as well; we might as well simply
  let it run there and save the duplicated effort (which will be
  non-negligible).

The result is that Credit1 has several races which result in runqueue
<-> v->processor invariants being violated (triggering ASSERTs in
debug builds and strange bugs in production builds).

Instead, introduce vcpu_migrate_start() to initiate the process.
vcpu_migrate_start() is called with the scheduling lock held.  It not
only sets VPF_migrating, but also calls vcpu_sleep_nosync_locked()
(which will automatically do nothing if there's nothing to do).

Rename vcpu_migrate() to vcpu_migrate_finish().  Check for v->is_running and
pause_flags & VPF_migrating at the top and return if appropriate.

Then the way to initiate migration is consistently:

* Grab lock
* vcpu_migrate_start()
* Release lock
* vcpu_migrate_finish()

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
Tested-by: Olaf Hering <olaf@aepfle.de>
Release-acked-by: Juergen Gross <jgross@suse.com>
---
This is a candidate for backporting.

Changes since v1:
- Finished comment in code about how to do vcpu migration
- Updated changelog and
- Fix up trailing whitespace
---
 xen/common/schedule.c | 80 +++++++++++++++++++++++++++++--------------
 1 file changed, 55 insertions(+), 25 deletions(-)

diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 7489833361..049f93f7aa 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -593,13 +593,54 @@ static void vcpu_move_nosched(struct vcpu *v, unsigned int new_cpu)
     sched_move_irqs(v);
 }
 
-static void vcpu_migrate(struct vcpu *v)
+/*
+ * Initiating migration
+ *
+ * In order to migrate, we need the vcpu in question to have stopped
+ * running and had SCHED_OP(sleep) called (to take it off any
+ * runqueues, for instance); and if it is currently running, it needs
+ * to be scheduled out.  Finally, we need to hold the scheduling locks
+ * for both the processor we're migrating from, and the processor
+ * we're migrating to.
+ *
+ * In order to avoid deadlock while satisfying the final requirement,
+ * we must release any scheduling lock we hold, then try to grab both
+ * locks we want, then double-check to make sure that what we started
+ * to do hasn't been changed in the mean time.
+ *
+ * These steps are encapsulated in the following two functions; they
+ * should be called like this:
+ *
+ *     lock = vcpu_schedule_lock_irq(v);
+ *     vcpu_migrate_start(v);
+ *     vcpu_schedule_unlock_irq(lock, v)
+ *     vcpu_migrate_finish(v);
+ *
+ * vcpu_migrate_finish() will do the work now if it can, or simply
+ * return if it can't (because v is still running); in that case
+ * vcpu_migrate_finish() will be called by context_saved().
+ */
+void vcpu_migrate_start(struct vcpu *v)
+{
+    set_bit(_VPF_migrating, &v->pause_flags);
+    vcpu_sleep_nosync_locked(v);
+}
+
+static void vcpu_migrate_finish(struct vcpu *v)
 {
     unsigned long flags;
     unsigned int old_cpu, new_cpu;
     spinlock_t *old_lock, *new_lock;
     bool_t pick_called = 0;
 
+    /*
+     * If the vcpu is currently running, this will be handled by
+     * context_saved(); and in any case, if the bit is cleared, then
+     * someone else has already done the work so we don't need to.
+     */
+    if ( v->is_running || !test_bit(_VPF_migrating, &v->pause_flags) )
+        return;
+
     old_cpu = new_cpu = v->processor;
     for ( ; ; )
     {
@@ -679,14 +720,11 @@ void vcpu_force_reschedule(struct vcpu *v)
     spinlock_t *lock = vcpu_schedule_lock_irq(v);
 
     if ( v->is_running )
-        set_bit(_VPF_migrating, &v->pause_flags);
+        vcpu_migrate_start(v);
+
     vcpu_schedule_unlock_irq(lock, v);
 
-    if ( v->pause_flags & VPF_migrating )
-    {
-        vcpu_sleep_nosync(v);
-        vcpu_migrate(v);
-    }
+    vcpu_migrate_finish(v);
 }
 
 void restore_vcpu_affinity(struct domain *d)
@@ -841,10 +879,10 @@ int cpu_disable_scheduler(unsigned int cpu)
                  *  * the scheduler will always fine a suitable solution, or
                  *    things would have failed before getting in here.
                  */
-                set_bit(_VPF_migrating, &v->pause_flags);
+                vcpu_migrate_start(v);
                 vcpu_schedule_unlock_irqrestore(lock, flags, v);
-                vcpu_sleep_nosync(v);
-                vcpu_migrate(v);
+
+                vcpu_migrate_finish(v);
 
                 /*
                  * The only caveat, in this case, is that if a vcpu active in
@@ -908,18 +946,14 @@ static int vcpu_set_affinity(
             ASSERT(which == v->cpu_soft_affinity);
             sched_set_affinity(v, NULL, affinity);
         }
-        set_bit(_VPF_migrating, &v->pause_flags);
+        vcpu_migrate_start(v);
     }
 
     vcpu_schedule_unlock_irq(lock, v);
 
     domain_update_node_affinity(v->domain);
 
-    if ( v->pause_flags & VPF_migrating )
-    {
-        vcpu_sleep_nosync(v);
-        vcpu_migrate(v);
-    }
+    vcpu_migrate_finish(v);
 
     return ret;
 }
@@ -1147,7 +1181,6 @@ int vcpu_pin_override(struct vcpu *v, int cpu)
         {
             sched_set_affinity(v, v->cpu_hard_affinity_saved, NULL);
             v->affinity_broken = 0;
-            set_bit(_VPF_migrating, &v->pause_flags);
             ret = 0;
         }
     }
@@ -1160,20 +1193,18 @@ int vcpu_pin_override(struct vcpu *v, int cpu)
             cpumask_copy(v->cpu_hard_affinity_saved, v->cpu_hard_affinity);
             v->affinity_broken = 1;
             sched_set_affinity(v, cpumask_of(cpu), NULL);
-            set_bit(_VPF_migrating, &v->pause_flags);
             ret = 0;
         }
     }
 
+    if ( ret == 0 )
+        vcpu_migrate_start(v);
+
     vcpu_schedule_unlock_irq(lock, v);
 
     domain_update_node_affinity(v->domain);
 
-    if ( v->pause_flags & VPF_migrating )
-    {
-        vcpu_sleep_nosync(v);
-        vcpu_migrate(v);
-    }
+    vcpu_migrate_finish(v);
 
     return ret;
 }
@@ -1560,8 +1591,7 @@ void context_saved(struct vcpu *prev)
 
     SCHED_OP(vcpu_scheduler(prev), context_saved, prev);
 
-    if ( unlikely(prev->pause_flags & VPF_migrating) )
-        vcpu_migrate(prev);
+    vcpu_migrate_finish(prev);
 }
 
 /* The scheduler timer: force a run through the scheduler */
-- 
2.17.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH for-4.11 1/3] xen/schedule.c: Fix up whitespace
  2018-05-02 10:16 [PATCH for-4.11 1/3] xen/schedule.c: Fix up whitespace George Dunlap
  2018-05-02 10:16 ` [PATCH for-4.11 2/3] xen: Introduce vcpu_sleep_nosync_locked() George Dunlap
  2018-05-02 10:16 ` [PATCH for-4.11 3/3] xen/schedule: Fix races in vcpu migration George Dunlap
@ 2018-05-02 10:24 ` Juergen Gross
  2018-05-02 10:38 ` Wei Liu
  3 siblings, 0 replies; 5+ messages in thread
From: Juergen Gross @ 2018-05-02 10:24 UTC (permalink / raw)
  To: George Dunlap, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Tim Deegan,
	Dario Faggioli, Julien Grall, Jan Beulich, Ian Jackson

On 02/05/18 12:16, George Dunlap wrote:
> Delete tabs and trailing whitespace.
> 
> No functional change.
> 
> Signed-off-by: George Dunlap <george.dunlap@citrix.com>

Release-acked-by: Juergen Gross <jgross@suse.com>


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH for-4.11 1/3] xen/schedule.c: Fix up whitespace
  2018-05-02 10:16 [PATCH for-4.11 1/3] xen/schedule.c: Fix up whitespace George Dunlap
                   ` (2 preceding siblings ...)
  2018-05-02 10:24 ` [PATCH for-4.11 1/3] xen/schedule.c: Fix up whitespace Juergen Gross
@ 2018-05-02 10:38 ` Wei Liu
  3 siblings, 0 replies; 5+ messages in thread
From: Wei Liu @ 2018-05-02 10:38 UTC (permalink / raw)
  To: George Dunlap
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Tim Deegan, Dario Faggioli, Julien Grall, Jan Beulich,
	Ian Jackson, xen-devel

On Wed, May 02, 2018 at 11:16:40AM +0100, George Dunlap wrote:
> Delete tabs and trailing whitespace.
> 
> No functional change.
> 
> Signed-off-by: George Dunlap <george.dunlap@citrix.com>

Acked-by: Wei Liu <wei.liu2@citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-05-02 10:38 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-02 10:16 [PATCH for-4.11 1/3] xen/schedule.c: Fix up whitespace George Dunlap
2018-05-02 10:16 ` [PATCH for-4.11 2/3] xen: Introduce vcpu_sleep_nosync_locked() George Dunlap
2018-05-02 10:16 ` [PATCH for-4.11 3/3] xen/schedule: Fix races in vcpu migration George Dunlap
2018-05-02 10:24 ` [PATCH for-4.11 1/3] xen/schedule.c: Fix up whitespace Juergen Gross
2018-05-02 10:38 ` Wei Liu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.