* [PATCH RESEND 0/4] xen: sched: optimize exclusive pinning and soft-affinity checking
@ 2018-03-15 18:35 Dario Faggioli
  2018-03-15 18:35 ` [PATCH RESEND 1/4] xen: sched: introduce 'adjust_affinity' hook Dario Faggioli
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Dario Faggioli @ 2018-03-15 18:35 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Anshul Makkar

Hello,

Here is another rather old series of mine. In this case, George has
Reviewed-by on most of it, but it needed rebasing on top of staging.

https://lists.xenproject.org/archives/html/xen-devel/2017-09/msg01850.html

And that is exactly what I am doing with this RESEND.

George:
- I did not apply your R-b to patch 1, because I had to slightly change it (not
  by much, but still...),
- I did not find a R-b email for patch 3... If it's my fault, sorry for that.

The series is available in a branch:
 git://xenbits.xen.org/people/dariof/xen.git  rel/sched/misc-credit1-credit2-optim
 http://xenbits.xen.org/gitweb/?p=people/dariof/xen.git;a=shortlog;h=refs/heads/rel/sched/misc-credit1-credit2-optim

Travis is running right now:
 https://travis-ci.org/fdario/xen/builds/353972175

Regards,
Dario
---
Dario Faggioli (4):
      xen: sched: introduce 'adjust_affinity' hook.
      xen: sched: optimize exclusive pinning case (Credit1 & 2)
      xen: sched: improve checking soft-affinity
      xen: sched: simplify (and speedup) checking soft-affinity

 xen/arch/x86/dom0_build.c    |    7 +--
 xen/common/sched_credit.c    |  112 ++++++++++++++++++++++++++----------------
 xen/common/sched_credit2.c   |   53 ++++++++++++++++++--
 xen/common/sched_null.c      |    8 +--
 xen/common/schedule.c        |   80 ++++++++++++++++++++++--------
 xen/include/xen/perfc_defn.h |    1 +
 xen/include/xen/sched-if.h   |   16 +++---
 xen/include/xen/sched.h      |    6 ++
 8 files changed, 197 insertions(+), 86 deletions(-)
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/

* [PATCH RESEND 1/4] xen: sched: introduce 'adjust_affinity' hook.
  2018-03-15 18:35 [PATCH RESEND 0/4] xen: sched: optimize exclusive pinning and soft-affinity checking Dario Faggioli
@ 2018-03-15 18:35 ` Dario Faggioli
  2018-03-15 19:10   ` Dario Faggioli
  2018-03-21 15:24   ` George Dunlap
  2018-03-15 18:35 ` [PATCH RESEND 2/4] xen: sched: optimize exclusive pinning case (Credit1 & 2) Dario Faggioli
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 9+ messages in thread
From: Dario Faggioli @ 2018-03-15 18:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Dario Faggioli, George Dunlap, Anshul Makkar

From: Dario Faggioli <raistlin@linux.it>

For now, just as a way to give a scheduler a
"heads up" about the fact that the affinity changed.

This enables some optimizations, such as pre-computing
and storing (e.g., in flags) facts like a vcpu being
exclusively pinned to a pcpu, or having (or not) a
soft affinity. I.e., conditions that, despite the fact
that they rarely change, are right now checked very
frequently, even in hot paths.
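
As a concrete illustration, here is a minimal standalone sketch of
the idea (a model only: a plain uint64_t bitmask stands in for Xen's
cpumask_t, and example_adjust_affinity() plays the role of a
scheduler's adjust_affinity implementation; the real ones come with
the next patches):

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Simplified model of a scheduler's per-vCPU private data. */
    struct example_vcpu {
        uint64_t hard;     /* bit i set == vcpu may run on pcpu i */
        bool     pinned;   /* pre-computed: exactly one bit set */
    };

    /* What an adjust_affinity hook can do: cache the result once,
     * at affinity-change time, instead of re-deriving it in the
     * scheduler's hot paths. */
    static void example_adjust_affinity(struct example_vcpu *v,
                                        uint64_t hard)
    {
        v->hard = hard;
        v->pinned = hard && !(hard & (hard - 1)); /* power of two? */
    }

    int main(void)
    {
        struct example_vcpu v;

        example_adjust_affinity(&v, 1ULL << 3);  /* only pcpu 3 */
        printf("pinned: %d\n", v.pinned);        /* prints 1 */

        example_adjust_affinity(&v, 0xffULL);    /* pcpus 0-7 */
        printf("pinned: %d\n", v.pinned);        /* prints 0 */
        return 0;
    }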

Note that, as we expect many scheduler specific
implementations of the adjust_affinity hook to do
something with the per-scheduler vCPU private data,
this commit moves the calls to sched_set_affinity()
after that is allocated (in sched_init_vcpu()).

Note also that this, in future, may turn out as a useful
means for, e.g., having the schedulers vet, ack or nack
the changes themselves.

Signed-off-by: Dario Faggioli <raistlin@linux.it>
---
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Anshul Makkar <anshulmakkar@gmail.com>
---
Changes from last posting:
- rebased on staging.

Changes from v2:
- fix sched_set_affinity() not to use always hard_affinity;
- move calls to sched_set_affinity() below per-scheduler vCPU data allocation.
---
 xen/arch/x86/dom0_build.c  |    7 ++--
 xen/common/schedule.c      |   75 ++++++++++++++++++++++++++++++++------------
 xen/include/xen/sched-if.h |    3 ++
 xen/include/xen/sched.h    |    3 ++
 4 files changed, 63 insertions(+), 25 deletions(-)

diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
index 555660b853..b744791c38 100644
--- a/xen/arch/x86/dom0_build.c
+++ b/xen/arch/x86/dom0_build.c
@@ -140,14 +140,13 @@ struct vcpu *__init dom0_setup_vcpu(struct domain *d,
     {
         if ( pv_shim )
         {
-            __cpumask_set_cpu(vcpu_id, v->cpu_hard_affinity);
-            __cpumask_set_cpu(vcpu_id, v->cpu_soft_affinity);
+            sched_set_affinity(v, cpumask_of(vcpu_id), cpumask_of(vcpu_id));
         }
         else
         {
             if ( !d->is_pinned && !dom0_affinity_relaxed )
-                cpumask_copy(v->cpu_hard_affinity, &dom0_cpus);
-            cpumask_copy(v->cpu_soft_affinity, &dom0_cpus);
+                sched_set_affinity(v, &dom0_cpus, NULL);
+            sched_set_affinity(v, NULL, &dom0_cpus);
         }
     }
 
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 64524f4da7..f43d943765 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -256,18 +256,11 @@ static void sched_spin_unlock_double(spinlock_t *lock1, spinlock_t *lock2,
 int sched_init_vcpu(struct vcpu *v, unsigned int processor) 
 {
     struct domain *d = v->domain;
+    cpumask_t allcpus;
 
-    /*
-     * Initialize processor and affinity settings. The idler, and potentially
-     * domain-0 VCPUs, are pinned onto their respective physical CPUs.
-     */
-    v->processor = processor;
-    if ( is_idle_domain(d) || d->is_pinned )
-        cpumask_copy(v->cpu_hard_affinity, cpumask_of(processor));
-    else
-        cpumask_setall(v->cpu_hard_affinity);
+    cpumask_setall(&allcpus);
 
-    cpumask_setall(v->cpu_soft_affinity);
+    v->processor = processor;
 
     /* Initialise the per-vcpu timers. */
     init_timer(&v->periodic_timer, vcpu_periodic_timer_fn,
@@ -282,6 +275,15 @@ int sched_init_vcpu(struct vcpu *v, unsigned int processor)
     if ( v->sched_priv == NULL )
         return 1;
 
+    /*
+     * Initialize affinity settings. The idler, and potentially
+     * domain-0 VCPUs, are pinned onto their respective physical CPUs.
+     */
+    if ( is_idle_domain(d) || d->is_pinned )
+        sched_set_affinity(v, cpumask_of(processor), &allcpus);
+    else
+        sched_set_affinity(v, &allcpus, &allcpus);
+
     /* Idle VCPUs are scheduled immediately, so don't put them in runqueue. */
     if ( is_idle_domain(d) )
     {
@@ -359,6 +361,7 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
     for_each_vcpu ( d, v )
     {
         spinlock_t *lock;
+        cpumask_t allcpus;
 
         vcpudata = v->sched_priv;
 
@@ -366,10 +369,12 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
         migrate_timer(&v->singleshot_timer, new_p);
         migrate_timer(&v->poll_timer, new_p);
 
-        cpumask_setall(v->cpu_hard_affinity);
-        cpumask_setall(v->cpu_soft_affinity);
+        cpumask_setall(&allcpus);
 
         lock = vcpu_schedule_lock_irq(v);
+
+        sched_set_affinity(v, &allcpus, &allcpus);
+
         v->processor = new_p;
         /*
          * With v->processor modified we must not
@@ -694,7 +699,7 @@ void restore_vcpu_affinity(struct domain *d)
 
         if ( v->affinity_broken )
         {
-            cpumask_copy(v->cpu_hard_affinity, v->cpu_hard_affinity_saved);
+            sched_set_affinity(v, v->cpu_hard_affinity_saved, NULL);
             v->affinity_broken = 0;
 
         }
@@ -758,6 +763,8 @@ int cpu_disable_scheduler(unsigned int cpu)
             if ( cpumask_empty(&online_affinity) &&
                  cpumask_test_cpu(cpu, v->cpu_hard_affinity) )
             {
+                cpumask_t allcpus;
+
                 if ( v->affinity_broken )
                 {
                     /* The vcpu is temporarily pinned, can't move it. */
@@ -775,7 +782,8 @@ int cpu_disable_scheduler(unsigned int cpu)
                 else
                     printk(XENLOG_DEBUG "Breaking affinity for %pv\n", v);
 
-                cpumask_setall(v->cpu_hard_affinity);
+                cpumask_setall(&allcpus);
+                sched_set_affinity(v, &allcpus, NULL);
             }
 
             if ( v->processor != cpu )
@@ -845,8 +853,26 @@ int cpu_disable_scheduler(unsigned int cpu)
     return ret;
 }
 
+/*
+ * In general, this must be called with the scheduler lock held, because the
+ * adjust_affinity hook may want to modify the vCPU state. However, when the
+ * vCPU is being initialized (either for dom0 or domU) there is no risk of
+ * races, and it's fine to not take the lock (we're talking about
+ * dom0_setup_vcpu() and sched_init_vcpu()).
+ */
+void sched_set_affinity(
+    struct vcpu *v, const cpumask_t *hard, const cpumask_t *soft)
+{
+    SCHED_OP(dom_scheduler(v->domain), adjust_affinity, v, hard, soft);
+
+    if ( hard )
+        cpumask_copy(v->cpu_hard_affinity, hard);
+    if ( soft )
+        cpumask_copy(v->cpu_soft_affinity, soft);
+}
+
 static int vcpu_set_affinity(
-    struct vcpu *v, const cpumask_t *affinity, cpumask_t *which)
+    struct vcpu *v, const cpumask_t *affinity, const cpumask_t *which)
 {
     spinlock_t *lock;
     int ret = 0;
@@ -857,12 +883,19 @@ static int vcpu_set_affinity(
         ret = -EBUSY;
     else
     {
-        cpumask_copy(which, affinity);
-
         /*
-         * Always ask the scheduler to re-evaluate placement
-         * when changing the affinity.
+         * Tell the scheduler we changed something about affinity,
+         * and ask to re-evaluate vcpu placement.
          */
+        if ( which == v->cpu_hard_affinity )
+        {
+            sched_set_affinity(v, affinity, NULL);
+        }
+        else
+        {
+            ASSERT(which == v->cpu_soft_affinity);
+            sched_set_affinity(v, NULL, affinity);
+        }
         set_bit(_VPF_migrating, &v->pause_flags);
     }
 
@@ -1100,7 +1133,7 @@ int vcpu_pin_override(struct vcpu *v, int cpu)
     {
         if ( v->affinity_broken )
         {
-            cpumask_copy(v->cpu_hard_affinity, v->cpu_hard_affinity_saved);
+            sched_set_affinity(v, v->cpu_hard_affinity_saved, NULL);
             v->affinity_broken = 0;
             set_bit(_VPF_migrating, &v->pause_flags);
             ret = 0;
@@ -1114,7 +1147,7 @@ int vcpu_pin_override(struct vcpu *v, int cpu)
         {
             cpumask_copy(v->cpu_hard_affinity_saved, v->cpu_hard_affinity);
             v->affinity_broken = 1;
-            cpumask_copy(v->cpu_hard_affinity, cpumask_of(cpu));
+            sched_set_affinity(v, cpumask_of(cpu), NULL);
             set_bit(_VPF_migrating, &v->pause_flags);
             ret = 0;
         }
diff --git a/xen/include/xen/sched-if.h b/xen/include/xen/sched-if.h
index c5dd43ed9c..926d063ccf 100644
--- a/xen/include/xen/sched-if.h
+++ b/xen/include/xen/sched-if.h
@@ -173,6 +173,9 @@ struct scheduler {
                                     unsigned int);
     int          (*adjust)         (const struct scheduler *, struct domain *,
                                     struct xen_domctl_scheduler_op *);
+    void         (*adjust_affinity)(const struct scheduler *, struct vcpu *,
+                                    const struct cpumask *,
+                                    const struct cpumask *);
     int          (*adjust_global)  (const struct scheduler *,
                                     struct xen_sysctl_scheduler_op *);
     void         (*dump_settings)  (const struct scheduler *);
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 39f938644a..ade4d7b9aa 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -846,6 +846,9 @@ void scheduler_free(struct scheduler *sched);
 int schedule_cpu_switch(unsigned int cpu, struct cpupool *c);
 void vcpu_force_reschedule(struct vcpu *v);
 int cpu_disable_scheduler(unsigned int cpu);
+/* We need it in dom0_setup_vcpu */
+void sched_set_affinity(struct vcpu *v, const cpumask_t *hard,
+                        const cpumask_t *soft);
 int vcpu_set_hard_affinity(struct vcpu *v, const cpumask_t *affinity);
 int vcpu_set_soft_affinity(struct vcpu *v, const cpumask_t *affinity);
 void restore_vcpu_affinity(struct domain *d);


* [PATCH RESEND 2/4] xen: sched: optimize exclusive pinning case (Credit1 & 2)
  2018-03-15 18:35 [PATCH RESEND 0/4] xen: sched: optimize exclusive pinning and soft-affinity checking Dario Faggioli
  2018-03-15 18:35 ` [PATCH RESEND 1/4] xen: sched: introduce 'adjust_affinity' hook Dario Faggioli
@ 2018-03-15 18:35 ` Dario Faggioli
  2018-03-15 18:35 ` [PATCH RESEND 3/4] xen: sched: improve checking soft-affinity Dario Faggioli
  2018-03-15 18:35 ` [PATCH RESEND 4/4] xen: sched: simplify (and speedup) " Dario Faggioli
  3 siblings, 0 replies; 9+ messages in thread
From: Dario Faggioli @ 2018-03-15 18:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Dario Faggioli, George Dunlap, Anshul Makkar

From: Dario Faggioli <raistlin@linux.it>

Exclusive pinning of vCPUs is used, sometimes, for
achieving the highest level of determinism, and the
least possible overhead, for the vCPUs in question.

Although static 1:1 pinning is not recommended for
general use cases, optimizing the tickling code (of
Credit1 and Credit2) is easy and cheap enough, so go
for it.
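
To make the benefit of the fast path concrete, here is a small
self-contained model (assumptions: uint64_t bitmasks instead of
cpumask_t, and made-up helper names; in practice, exclusive pinning
would be established with something like
"xl vcpu-pin <domain> <vcpu> <pcpu>"):

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Simplified model: pcpus are the bits of a uint64_t mask. */
    struct vcpu_model {
        uint64_t hard;     /* hard affinity mask */
        bool     pinned;   /* pre-computed: exactly one bit set */
    };

    static int lowest_bit(uint64_t m)
    {
        for ( int i = 0; i < 64; i++ )
            if ( m & (1ULL << i) )
                return i;
        return -1;
    }

    /* Stand-in for the full tickling logic: lowest suitable idler. */
    static int slow_path_pick(const struct vcpu_model *v, uint64_t idlers)
    {
        return lowest_bit(v->hard & idlers);
    }

    /* The fast path this patch adds, modeled: if the waking vcpu is
     * exclusively pinned and its one pcpu is idle, tickle it directly,
     * skipping the balancing logic entirely. */
    static int pick_cpu_to_tickle(const struct vcpu_model *v, uint64_t idlers)
    {
        if ( v->pinned && (v->hard & idlers) )
            return lowest_bit(v->hard);   /* the single pinned pcpu */
        return slow_path_pick(v, idlers);
    }

    int main(void)
    {
        struct vcpu_model v = { .hard = 1ULL << 3, .pinned = true };

        printf("%d\n", pick_cpu_to_tickle(&v, ~0ULL));   /* 3: fast path */

        v.hard = 0x0f; v.pinned = false;
        printf("%d\n", pick_cpu_to_tickle(&v, 0x30ULL)); /* -1: no idler */
        return 0;
    }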

Signed-off-by: Dario Faggioli <raistlin@linux.it>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
---
Cc: Anshul Makkar <anshulmakkar@gmail.com>
---
Changes from last posting:
- none!

Changes from v2:
- deal with NULL being passed as the value of 'hard' to sched_set_affinity().

Changes from v1:
- use a flag during runtime, as suggested during review;
- make use of the affinity-change hook, introduced in the previous patch.
---
 xen/common/sched_credit.c    |   38 +++++++++++++++++++++++++++++++++++++
 xen/common/sched_credit2.c   |   43 ++++++++++++++++++++++++++++++++++++++++++
 xen/include/xen/perfc_defn.h |    1 +
 3 files changed, 82 insertions(+)

diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index 7c40ee2d00..02fea872bd 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -73,6 +73,7 @@
 #define CSCHED_FLAG_VCPU_PARKED    0x0  /* VCPU over capped credits */
 #define CSCHED_FLAG_VCPU_YIELD     0x1  /* VCPU yielding */
 #define CSCHED_FLAG_VCPU_MIGRATING 0x2  /* VCPU may have moved to a new pcpu */
+#define CSCHED_FLAG_VCPU_PINNED    0x4  /* VCPU can run only on 1 pcpu */
 
 
 /*
@@ -361,6 +362,25 @@ static inline void __runq_tickle(struct csched_vcpu *new)
     cpumask_and(&idle_mask, prv->idlers, online);
     idlers_empty = cpumask_empty(&idle_mask);
 
+    /*
+     * Exclusive pinning is when a vcpu has hard-affinity with only one
+     * cpu, and there is no other vcpu that has hard-affinity with that
+     * same cpu. This is infrequent, but when it happens, it is for
+     * achieving the highest possible determinism, and the least
+     * possible overhead, for the vcpus in question.
+     *
+     * Try to identify the vast majority of these situations, and deal
+     * with them quickly.
+     */
+    if ( unlikely(test_bit(CSCHED_FLAG_VCPU_PINNED, &new->flags) &&
+                  cpumask_test_cpu(cpu, &idle_mask)) )
+    {
+        ASSERT(cpumask_cycle(cpu, new->vcpu->cpu_hard_affinity) == cpu);
+        SCHED_STAT_CRANK(tickled_idle_cpu_excl);
+        __cpumask_set_cpu(cpu, &mask);
+        goto tickle;
+    }
+
     /*
      * If the pcpu is idle, or there are no idlers and the new
      * vcpu is a higher priority than the old vcpu, run it here.
@@ -457,6 +477,7 @@ static inline void __runq_tickle(struct csched_vcpu *new)
         }
     }
 
+ tickle:
     if ( !cpumask_empty(&mask) )
     {
         if ( unlikely(tb_init_done) )
@@ -1222,6 +1243,22 @@ csched_dom_cntl(
     return rc;
 }
 
+static void
+csched_aff_cntl(const struct scheduler *ops, struct vcpu *v,
+                const cpumask_t *hard, const cpumask_t *soft)
+{
+    struct csched_vcpu *svc = CSCHED_VCPU(v);
+
+    if ( !hard )
+        return;
+
+    /* Are we becoming exclusively pinned? */
+    if ( cpumask_weight(hard) == 1 )
+        set_bit(CSCHED_FLAG_VCPU_PINNED, &svc->flags);
+    else
+        clear_bit(CSCHED_FLAG_VCPU_PINNED, &svc->flags);
+}
+
 static inline void
 __csched_set_tslice(struct csched_private *prv, unsigned int timeslice_ms)
 {
@@ -2242,6 +2279,7 @@ static const struct scheduler sched_credit_def = {
     .yield          = csched_vcpu_yield,
 
     .adjust         = csched_dom_cntl,
+    .adjust_affinity= csched_aff_cntl,
     .adjust_global  = csched_sys_cntl,
 
     .pick_cpu       = csched_cpu_pick,
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 5a635e8c4c..0e5a8fc541 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -299,6 +299,12 @@
  */
 #define __CSFLAG_vcpu_yield 4
 #define CSFLAG_vcpu_yield (1U<<__CSFLAG_vcpu_yield)
+/*
+ * CSFLAG_pinned: this vcpu is currently 'pinned', i.e., has its hard
+ * affinity set to one and only one cpu (and, hence, can only run there).
+ */
+#define __CSFLAG_pinned 5
+#define CSFLAG_pinned (1U<<__CSFLAG_pinned)
 
 static unsigned int __read_mostly opt_migrate_resist = 500;
 integer_param("sched_credit2_migrate_resist", opt_migrate_resist);
@@ -1455,6 +1461,26 @@ runq_tickle(const struct scheduler *ops, struct csched2_vcpu *new, s_time_t now)
                     (unsigned char *)&d);
     }
 
+    /*
+     * Exclusive pinning is when a vcpu has hard-affinity with only one
+     * cpu, and there is no other vcpu that has hard-affinity with that
+     * same cpu. This is infrequent, but when it happens, it is for
+     * achieving the highest possible determinism, and the least
+     * possible overhead, for the vcpus in question.
+     *
+     * Try to identify the vast majority of these situations, and deal
+     * with them quickly.
+     */
+    if ( unlikely((new->flags & CSFLAG_pinned) &&
+                  cpumask_test_cpu(cpu, &rqd->idle) &&
+                  !cpumask_test_cpu(cpu, &rqd->tickled)) )
+    {
+        ASSERT(cpumask_cycle(cpu, new->vcpu->cpu_hard_affinity) == cpu);
+        SCHED_STAT_CRANK(tickled_idle_cpu_excl);
+        ipid = cpu;
+        goto tickle;
+    }
+
     for_each_affinity_balance_step( bs )
     {
         /* Just skip first step, if we don't have a soft affinity */
@@ -2971,6 +2997,22 @@ csched2_dom_cntl(
     return rc;
 }
 
+static void
+csched2_aff_cntl(const struct scheduler *ops, struct vcpu *v,
+                 const cpumask_t *hard, const cpumask_t *soft)
+{
+    struct csched2_vcpu *svc = csched2_vcpu(v);
+
+    if ( !hard )
+        return;
+
+    /* Are we becoming exclusively pinned? */
+    if ( cpumask_weight(hard) == 1 )
+        __set_bit(__CSFLAG_pinned, &svc->flags);
+    else
+        __clear_bit(__CSFLAG_pinned, &svc->flags);
+}
+
 static int csched2_sys_cntl(const struct scheduler *ops,
                             struct xen_sysctl_scheduler_op *sc)
 {
@@ -3999,6 +4041,7 @@ static const struct scheduler sched_credit2_def = {
     .yield          = csched2_vcpu_yield,
 
     .adjust         = csched2_dom_cntl,
+    .adjust_affinity= csched2_aff_cntl,
     .adjust_global  = csched2_sys_cntl,
 
     .pick_cpu       = csched2_cpu_pick,
diff --git a/xen/include/xen/perfc_defn.h b/xen/include/xen/perfc_defn.h
index e8ff565321..ef6f86b91e 100644
--- a/xen/include/xen/perfc_defn.h
+++ b/xen/include/xen/perfc_defn.h
@@ -32,6 +32,7 @@ PERFCOUNTER(vcpu_wake_runnable,     "sched: vcpu_wake_runnable")
 PERFCOUNTER(vcpu_wake_not_runnable, "sched: vcpu_wake_not_runnable")
 PERFCOUNTER(tickled_no_cpu,         "sched: tickled_no_cpu")
 PERFCOUNTER(tickled_idle_cpu,       "sched: tickled_idle_cpu")
+PERFCOUNTER(tickled_idle_cpu_excl,  "sched: tickled_idle_cpu_exclusive")
 PERFCOUNTER(tickled_busy_cpu,       "sched: tickled_busy_cpu")
 PERFCOUNTER(vcpu_check,             "sched: vcpu_check")
 


* [PATCH RESEND 3/4] xen: sched: improve checking soft-affinity
  2018-03-15 18:35 [PATCH RESEND 0/4] xen: sched: optimize exclusive pinning and soft-affinity checking Dario Faggioli
  2018-03-15 18:35 ` [PATCH RESEND 1/4] xen: sched: introduce 'adjust_affinity' hook Dario Faggioli
  2018-03-15 18:35 ` [PATCH RESEND 2/4] xen: sched: optimize exclusive pinning case (Credit1 & 2) Dario Faggioli
@ 2018-03-15 18:35 ` Dario Faggioli
  2018-03-21 15:24   ` George Dunlap
  2018-03-15 18:35 ` [PATCH RESEND 4/4] xen: sched: simplify (and speedup) " Dario Faggioli
  3 siblings, 1 reply; 9+ messages in thread
From: Dario Faggioli @ 2018-03-15 18:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Dario Faggioli, George Dunlap, Anshul Makkar

From: Dario Faggioli <raistlin@linux.it>

Whether or not a vCPU has a soft-affinity which is
effective, i.e., with the power of actually affecting
the scheduling of the vCPU itself, happens in an
helper function, called has_soft_affinity().

Such function takes a custom cpumask as its third
parameter, for better flexibility, but that mask is
different from the vCPU's hard-affinity only in one
case. Getting rid of that parameter, not only simplify
the function, but enables for optimizing the soft
affinity check (which will happen, in a subsequent
commit).

This commit, therefore, does that. It's mostly
mechanical, with the only exception _csched_cpu_pick()
(in Credit1 code).

Signed-off-by: Dario Faggioli <raistlin@linux.it>
---
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Anshul Makkar <anshulmakkar@gmail.com>
---
Changes from v2:
- fix potential spurious usage of the scratch space of the wrong cpu.
---
 xen/common/sched_credit.c  |   74 +++++++++++++++++++-------------------------
 xen/common/sched_credit2.c |   10 ++----
 xen/common/sched_null.c    |    8 ++---
 xen/include/xen/sched-if.h |    8 ++---
 4 files changed, 42 insertions(+), 58 deletions(-)

diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index 02fea872bd..6b8ce1ec3c 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -410,8 +410,7 @@ static inline void __runq_tickle(struct csched_vcpu *new)
             int new_idlers_empty;
 
             if ( balance_step == BALANCE_SOFT_AFFINITY
-                 && !has_soft_affinity(new->vcpu,
-                                       new->vcpu->cpu_hard_affinity) )
+                 && !has_soft_affinity(new->vcpu) )
                 continue;
 
             /* Are there idlers suitable for new (for this balance step)? */
@@ -742,50 +741,42 @@ __csched_vcpu_is_migrateable(struct vcpu *vc, int dest_cpu, cpumask_t *mask)
 static int
 _csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc, bool_t commit)
 {
-    cpumask_t cpus;
+    /* We must always use vc->processor's scratch space */
+    cpumask_t *cpus = cpumask_scratch_cpu(vc->processor);
     cpumask_t idlers;
-    cpumask_t *online;
+    cpumask_t *online = cpupool_domain_cpumask(vc->domain);
     struct csched_pcpu *spc = NULL;
     int cpu = vc->processor;
     int balance_step;
 
-    /* Store in cpus the mask of online cpus on which the domain can run */
-    online = cpupool_domain_cpumask(vc->domain);
-    cpumask_and(&cpus, vc->cpu_hard_affinity, online);
-
     for_each_affinity_balance_step( balance_step )
     {
+        affinity_balance_cpumask(vc, balance_step, cpus);
+        cpumask_and(cpus, online, cpus);
         /*
          * We want to pick up a pcpu among the ones that are online and
-         * can accommodate vc, which is basically what we computed above
-         * and stored in cpus. As far as hard affinity is concerned,
-         * there always will be at least one of these pcpus, hence cpus
-         * is never empty and the calls to cpumask_cycle() and
-         * cpumask_test_cpu() below are ok.
+         * can accommodate vc. As far as hard affinity is concerned, there
+         * always will be at least one of these pcpus in the scratch cpumask,
+         * hence, the calls to cpumask_cycle() and cpumask_test_cpu() below
+         * are ok.
          *
-         * On the other hand, when considering soft affinity too, it
-         * is possible for the mask to become empty (for instance, if the
-         * domain has been put in a cpupool that does not contain any of the
-         * pcpus in its soft affinity), which would result in the ASSERT()-s
-         * inside cpumask_*() operations triggering (in debug builds).
+         * On the other hand, when considering soft affinity, it is possible
+         * that the mask is empty (for instance, if the domain has been put
+         * in a cpupool that does not contain any of the pcpus in its soft
+         * affinity), which would result in the ASSERT()-s inside cpumask_*()
+         * operations triggering (in debug builds).
          *
-         * Therefore, in this case, we filter the soft affinity mask against
-         * cpus and, if the result is empty, we just skip the soft affinity
+         * Therefore, if that is the case, we just skip the soft affinity
          * balancing step all together.
          */
-        if ( balance_step == BALANCE_SOFT_AFFINITY
-             && !has_soft_affinity(vc, &cpus) )
+        if ( balance_step == BALANCE_SOFT_AFFINITY &&
+             (!has_soft_affinity(vc) || cpumask_empty(cpus)) )
             continue;
 
-        /* Pick an online CPU from the proper affinity mask */
-        affinity_balance_cpumask(vc, balance_step, &cpus);
-        cpumask_and(&cpus, &cpus, online);
-
         /* If present, prefer vc's current processor */
-        cpu = cpumask_test_cpu(vc->processor, &cpus)
-                ? vc->processor
-                : cpumask_cycle(vc->processor, &cpus);
-        ASSERT(cpumask_test_cpu(cpu, &cpus));
+        cpu = cpumask_test_cpu(vc->processor, cpus)
+                ? vc->processor : cpumask_cycle(vc->processor, cpus);
+        ASSERT(cpumask_test_cpu(cpu, cpus));
 
         /*
          * Try to find an idle processor within the above constraints.
@@ -806,7 +797,7 @@ _csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc, bool_t commit)
         cpumask_and(&idlers, &cpu_online_map, CSCHED_PRIV(ops)->idlers);
         if ( vc->processor == cpu && is_runq_idle(cpu) )
             __cpumask_set_cpu(cpu, &idlers);
-        cpumask_and(&cpus, &cpus, &idlers);
+        cpumask_and(cpus, &idlers, cpus);
 
         /*
          * It is important that cpu points to an idle processor, if a suitable
@@ -820,18 +811,18 @@ _csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc, bool_t commit)
          * Notice that cpumask_test_cpu() is quicker than cpumask_empty(), so
          * we check for it first.
          */
-        if ( !cpumask_test_cpu(cpu, &cpus) && !cpumask_empty(&cpus) )
-            cpu = cpumask_cycle(cpu, &cpus);
-        __cpumask_clear_cpu(cpu, &cpus);
+        if ( !cpumask_test_cpu(cpu, cpus) && !cpumask_empty(cpus) )
+            cpu = cpumask_cycle(cpu, cpus);
+        __cpumask_clear_cpu(cpu, cpus);
 
-        while ( !cpumask_empty(&cpus) )
+        while ( !cpumask_empty(cpus) )
         {
             cpumask_t cpu_idlers;
             cpumask_t nxt_idlers;
             int nxt, weight_cpu, weight_nxt;
             int migrate_factor;
 
-            nxt = cpumask_cycle(cpu, &cpus);
+            nxt = cpumask_cycle(cpu, cpus);
 
             if ( cpumask_test_cpu(cpu, per_cpu(cpu_core_mask, nxt)) )
             {
@@ -861,14 +852,14 @@ _csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc, bool_t commit)
                  weight_cpu > weight_nxt :
                  weight_cpu * migrate_factor < weight_nxt )
             {
-                cpumask_and(&nxt_idlers, &cpus, &nxt_idlers);
+                cpumask_and(&nxt_idlers, &nxt_idlers, cpus);
                 spc = CSCHED_PCPU(nxt);
                 cpu = cpumask_cycle(spc->idle_bias, &nxt_idlers);
-                cpumask_andnot(&cpus, &cpus, per_cpu(cpu_sibling_mask, cpu));
+                cpumask_andnot(cpus, cpus, per_cpu(cpu_sibling_mask, cpu));
             }
             else
             {
-                cpumask_andnot(&cpus, &cpus, &nxt_idlers);
+                cpumask_andnot(cpus, cpus, &nxt_idlers);
             }
         }
 
@@ -1665,9 +1656,8 @@ csched_runq_steal(int peer_cpu, int cpu, int pri, int balance_step)
          * vCPUs with useful soft affinities in some sort of bitmap
          * or counter.
          */
-        if ( vc->is_running ||
-             (balance_step == BALANCE_SOFT_AFFINITY
-              && !has_soft_affinity(vc, vc->cpu_hard_affinity)) )
+        if ( vc->is_running || (balance_step == BALANCE_SOFT_AFFINITY &&
+                                !has_soft_affinity(vc)) )
             continue;
 
         affinity_balance_cpumask(vc, balance_step, cpumask_scratch);
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 0e5a8fc541..9a3e71f1c8 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -700,8 +700,7 @@ static int get_fallback_cpu(struct csched2_vcpu *svc)
     {
         int cpu = v->processor;
 
-        if ( bs == BALANCE_SOFT_AFFINITY &&
-             !has_soft_affinity(v, v->cpu_hard_affinity) )
+        if ( bs == BALANCE_SOFT_AFFINITY && !has_soft_affinity(v) )
             continue;
 
         affinity_balance_cpumask(v, bs, cpumask_scratch_cpu(cpu));
@@ -1484,8 +1483,7 @@ runq_tickle(const struct scheduler *ops, struct csched2_vcpu *new, s_time_t now)
     for_each_affinity_balance_step( bs )
     {
         /* Just skip first step, if we don't have a soft affinity */
-        if ( bs == BALANCE_SOFT_AFFINITY &&
-             !has_soft_affinity(new->vcpu, new->vcpu->cpu_hard_affinity) )
+        if ( bs == BALANCE_SOFT_AFFINITY && !has_soft_affinity(new->vcpu) )
             continue;
 
         affinity_balance_cpumask(new->vcpu, bs, cpumask_scratch_cpu(cpu));
@@ -2285,7 +2283,7 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
      *
      * Find both runqueues in one pass.
      */
-    has_soft = has_soft_affinity(vc, vc->cpu_hard_affinity);
+    has_soft = has_soft_affinity(vc);
     for_each_cpu(i, &prv->active_queues)
     {
         struct csched2_runqueue_data *rqd;
@@ -3307,7 +3305,7 @@ runq_candidate(struct csched2_runqueue_data *rqd,
     }
 
     /* If scurr has a soft-affinity, let's check whether cpu is part of it */
-    if ( has_soft_affinity(scurr->vcpu, scurr->vcpu->cpu_hard_affinity) )
+    if ( has_soft_affinity(scurr->vcpu) )
     {
         affinity_balance_cpumask(scurr->vcpu, BALANCE_SOFT_AFFINITY,
                                  cpumask_scratch);
diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c
index 58e306a7ea..6b0f9d44a2 100644
--- a/xen/common/sched_null.c
+++ b/xen/common/sched_null.c
@@ -283,8 +283,7 @@ static unsigned int pick_cpu(struct null_private *prv, struct vcpu *v)
 
     for_each_affinity_balance_step( bs )
     {
-        if ( bs == BALANCE_SOFT_AFFINITY &&
-             !has_soft_affinity(v, v->cpu_hard_affinity) )
+        if ( bs == BALANCE_SOFT_AFFINITY && !has_soft_affinity(v) )
             continue;
 
         affinity_balance_cpumask(v, bs, cpumask_scratch_cpu(cpu));
@@ -496,8 +495,7 @@ static void _vcpu_remove(struct null_private *prv, struct vcpu *v)
     {
         list_for_each_entry( wvc, &prv->waitq, waitq_elem )
         {
-            if ( bs == BALANCE_SOFT_AFFINITY &&
-                 !has_soft_affinity(wvc->vcpu, wvc->vcpu->cpu_hard_affinity) )
+            if ( bs == BALANCE_SOFT_AFFINITY && !has_soft_affinity(wvc->vcpu) )
                 continue;
 
             if ( vcpu_check_affinity(wvc->vcpu, cpu, bs) )
@@ -766,7 +764,7 @@ static struct task_slice null_schedule(const struct scheduler *ops,
             list_for_each_entry( wvc, &prv->waitq, waitq_elem )
             {
                 if ( bs == BALANCE_SOFT_AFFINITY &&
-                     !has_soft_affinity(wvc->vcpu, wvc->vcpu->cpu_hard_affinity) )
+                     !has_soft_affinity(wvc->vcpu) )
                     continue;
 
                 if ( vcpu_check_affinity(wvc->vcpu, cpu, bs) )
diff --git a/xen/include/xen/sched-if.h b/xen/include/xen/sched-if.h
index 926d063ccf..65b4538114 100644
--- a/xen/include/xen/sched-if.h
+++ b/xen/include/xen/sched-if.h
@@ -266,16 +266,14 @@ static inline cpumask_t* cpupool_domain_cpumask(struct domain *d)
  * Soft affinity only needs to be considered if:
  * * The cpus in the cpupool are not a subset of soft affinity
  * * The hard affinity is not a subset of soft affinity
- * * There is an overlap between the soft affinity and the mask which is
- *   currently being considered.
+ * * There is an overlap between the soft and hard affinity masks
  */
-static inline int has_soft_affinity(const struct vcpu *v,
-                                    const cpumask_t *mask)
+static inline int has_soft_affinity(const struct vcpu *v)
 {
     return !cpumask_subset(cpupool_domain_cpumask(v->domain),
                            v->cpu_soft_affinity) &&
            !cpumask_subset(v->cpu_hard_affinity, v->cpu_soft_affinity) &&
-           cpumask_intersects(v->cpu_soft_affinity, mask);
+           cpumask_intersects(v->cpu_soft_affinity, v->cpu_hard_affinity);
 }
 
 /*


* [PATCH RESEND 4/4] xen: sched: simplify (and speedup) checking soft-affinity
  2018-03-15 18:35 [PATCH RESEND 0/4] xen: sched: optimize exclusive pinning and soft-affinity checking Dario Faggioli
                   ` (2 preceding siblings ...)
  2018-03-15 18:35 ` [PATCH RESEND 3/4] xen: sched: improve checking soft-affinity Dario Faggioli
@ 2018-03-15 18:35 ` Dario Faggioli
  3 siblings, 0 replies; 9+ messages in thread
From: Dario Faggioli @ 2018-03-15 18:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Dario Faggioli, George Dunlap, Anshul Makkar

From: Dario Faggioli <raistlin@linux.it>

Whether or not a vCPU has a soft-affinity which is
effective, i.e., with the power of actually affecting
the scheduling of the vCPU itself, rarely changes.
Very, very rarely, as compared to how often we need
to check for the same thing (basically, at every
scheduling decision!).

That can be improved by storing in a per-vCPU flag
(it's actually a boolean field in struct vcpu) whether
or not, considering what the hard-affinity and
soft-affinity look like, soft-affinity should be taken
into account during scheduling decisions.

This saves some cpumask manipulations, which is nice,
considering how frequently they were being done. Note
that we can't get rid of 100% of the cpumask operations
involved in the check, because whether soft-affinity is
effective depends not only on the relationship between
the hard and soft-affinity masks of a vCPU, but also on
the online pCPUs and/or on what pCPUs are part of the
cpupool where the vCPU lives, and that's rather
impractical to store in a per-vCPU flag. Still, the
overhead is reduced to "just" one cpumask_subset() (and
only if the newly introduced flag is 'true')!
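
For concreteness, here is a standalone model of the predicate (again
assuming plain uint64_t bitmasks in place of cpumask_t; the
cpupool-related part of the check is deliberately left out, since that
is exactly the cpumask_subset() which remains at runtime):

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* What sched_set_affinity() pre-computes: soft-affinity matters
     * iff hard is not a subset of soft, and soft and hard intersect. */
    static bool soft_aff_effective(uint64_t hard, uint64_t soft)
    {
        return (hard & ~soft) && (hard & soft);
    }

    int main(void)
    {
        /* hard = pcpus 0-3, soft = pcpus 0-1: soft affinity matters. */
        printf("%d\n", soft_aff_effective(0x0f, 0x03)); /* 1 */

        /* hard = pcpus 0-1, soft = pcpus 0-7: hard already in soft. */
        printf("%d\n", soft_aff_effective(0x03, 0xff)); /* 0 */

        /* disjoint masks: soft affinity can never be honoured. */
        printf("%d\n", soft_aff_effective(0x03, 0x0c)); /* 0 */
        return 0;
    }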

Signed-off-by: Dario Faggioli <raistlin@linux.it>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
---
Cc: Anshul Makkar <anshulmakkar@gmail.com>
---
 xen/common/schedule.c      |    5 +++++
 xen/include/xen/sched-if.h |    7 +++----
 xen/include/xen/sched.h    |    3 +++
 3 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index f43d943765..ba4ab877e0 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -869,6 +869,11 @@ void sched_set_affinity(
         cpumask_copy(v->cpu_hard_affinity, hard);
     if ( soft )
         cpumask_copy(v->cpu_soft_affinity, soft);
+
+    v->soft_aff_effective = !cpumask_subset(v->cpu_hard_affinity,
+                                            v->cpu_soft_affinity) &&
+                            cpumask_intersects(v->cpu_soft_affinity,
+                                               v->cpu_hard_affinity);
 }
 
 static int vcpu_set_affinity(
diff --git a/xen/include/xen/sched-if.h b/xen/include/xen/sched-if.h
index 65b4538114..9596eae1e2 100644
--- a/xen/include/xen/sched-if.h
+++ b/xen/include/xen/sched-if.h
@@ -270,10 +270,9 @@ static inline cpumask_t* cpupool_domain_cpumask(struct domain *d)
  */
 static inline int has_soft_affinity(const struct vcpu *v)
 {
-    return !cpumask_subset(cpupool_domain_cpumask(v->domain),
-                           v->cpu_soft_affinity) &&
-           !cpumask_subset(v->cpu_hard_affinity, v->cpu_soft_affinity) &&
-           cpumask_intersects(v->cpu_soft_affinity, v->cpu_hard_affinity);
+    return v->soft_aff_effective &&
+           !cpumask_subset(cpupool_domain_cpumask(v->domain),
+                           v->cpu_soft_affinity);
 }
 
 /*
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index ade4d7b9aa..73376a969f 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -210,6 +210,9 @@ struct vcpu
     bool             hcall_compat;
 #endif
 
+    /* Does soft affinity actually play a role (given hard affinity)? */
+    bool             soft_aff_effective;
+
     /* The CPU, if any, which is holding onto this VCPU's state. */
 #define VCPU_CPU_CLEAN (~0u)
     unsigned int     dirty_cpu;


* Re: [PATCH RESEND 1/4] xen: sched: introduce 'adjust_affinity' hook.
  2018-03-15 18:35 ` [PATCH RESEND 1/4] xen: sched: introduce 'adjust_affinity' hook Dario Faggioli
@ 2018-03-15 19:10   ` Dario Faggioli
  2018-03-21 15:24   ` George Dunlap
  1 sibling, 0 replies; 9+ messages in thread
From: Dario Faggioli @ 2018-03-15 19:10 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, anshul makkar


On Thu, 15 Mar 2018 at 19:35, Dario Faggioli <dfaggioli@suse.com> wrote:

> From: Dario Faggioli <raistlin@linux.it>
>
> For now, just as a way to give a scheduler a "heads up"
> about the fact that the affinity changed.
>
> This enables some optimizations, such as pre-computing
> and storing (e.g., in flags) facts like a vcpu being
> exclusively pinned to a pcpu, or having (or not) a
> soft affinity. I.e., conditions that, despite the fact
> that they rarely change, are right now checked very
> frequently, even in hot paths.
>
> Note that, as we expect many scheduler specific
> implementations of the adjust_affinity hook to do
> something with the per-scheduler vCPU private data,
> this commit moves the calls to sched_set_affinity()
> after that is allocated (in sched_init_vcpu()).
>
> Note also that this, in future, may turn out as a useful
> means for, e.g., having the schedulers vet, ack or nack
> the changes themselves.
>
> Signed-off-by: Dario Faggioli <raistlin@linux.it>
>

And I did forget to change this to dfaggioli@suse.com... :-/

In case the series can go in, would it be possible to fix this upon commit?

Alternatively, I'm certainly up for re-resending as well, just let me know
what's better.

Sorry for the mess and Regards,
Dario


* Re: [PATCH RESEND 3/4] xen: sched: improve checking soft-affinity
  2018-03-15 18:35 ` [PATCH RESEND 3/4] xen: sched: improve checking soft-affinity Dario Faggioli
@ 2018-03-21 15:24   ` George Dunlap
  2018-03-21 16:57     ` Dario Faggioli
  0 siblings, 1 reply; 9+ messages in thread
From: George Dunlap @ 2018-03-21 15:24 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel; +Cc: Dario Faggioli, Anshul Makkar

On 03/15/2018 06:35 PM, Dario Faggioli wrote:
> From: Dario Faggioli <raistlin@linux.it>
> 
> Whether or not a vCPU has a soft-affinity which is
> effective, i.e., with the power of actually affecting
> the scheduling of the vCPU itself, happens in an
> helper function, called has_soft_affinity().

This is a bit confusing; what about something like this:

"has_soft_affinity() determines whether the soft-affinity of a vcpu will
have any effect -- that is, whether the affinity will have any
difference, scheduling-wise, from an empty soft-affinity mask."

> Such function takes a custom cpumask as its third
> parameter, for better flexibility, but that mask is
> different from the vCPU's hard-affinity only in one
> case. Getting rid of that parameter, not only simplify
> the function, but enables for optimizing the soft
> affinity check (which will happen, in a subsequent
> commit).

There are a number of extraneous commas and one grammar mistake here:

"Such a function takes a custom cpumask as its third parameter[] for
better flexibility; but that mask is different from the vCPU's
hard-affinity in only one case.  Getting rid of that parameter[] not
only simplif[ies] the function, but enables [] optimizing the soft
affinity check."

> This commit, therefore, does that. It's mostly
> mechanical, with the only exception _csched_cpu_pick()
> (in Credit1 code).

The first sentence is unnecessary; I'd just say:

"It's mostly mechanical, with the exception of
sched_credit.c:_csched_cpu_pick(), which was the one case where we passed
in something other than the existing hard-affinity."

> Signed-off-by: Dario Faggioli <raistlin@linux.it>

If you're OK with those changes, I can make them on check-in (as well as
changing your email address)

Reviewed-by: George Dunlap <george.dunlap@citrix.com>

* Re: [PATCH RESEND 1/4] xen: sched: introduce 'adjust_affinity' hook.
  2018-03-15 18:35 ` [PATCH RESEND 1/4] xen: sched: introduce 'adjust_affinity' hook Dario Faggioli
  2018-03-15 19:10   ` Dario Faggioli
@ 2018-03-21 15:24   ` George Dunlap
  1 sibling, 0 replies; 9+ messages in thread
From: George Dunlap @ 2018-03-21 15:24 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel; +Cc: Dario Faggioli, Anshul Makkar

On 03/15/2018 06:35 PM, Dario Faggioli wrote:
> From: Dario Faggioli <raistlin@linux.it>
> 
> For now, just as a way to give a scheduler a "heads up"
> about the fact that the affinity changed.
> 
> This enables some optimizations, such as pre-computing
> and storing (e.g., in flags) facts like a vcpu being
> exclusively pinned to a pcpu, or having (or not) a
> soft affinity. I.e., conditions that, despite the fact
> that they rarely change, are right now checked very
> frequently, even in hot paths.
> 
> Note that, as we expect many scheduler specific
> implementations of the adjust_affinity hook to do
> something with the per-scheduler vCPU private data,
> this commit moves the calls to sched_set_affinity()
> after that is allocated (in sched_init_vcpu()).
> 
> Note also that this, in future, may turn out as a useful
> means for, e.g., having the schedulers vet, ack or nack
> the changes themselves.
> 
> Signed-off-by: Dario Faggioli <raistlin@linux.it>

Reviewed-by: George Dunlap <george.dunlap@citrix.com>

As I said, I can change all these on check-in.

 -George

> ---
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Anshul Makkar <anshulmakkar@gmail.com>
> ---
> Changes from last posting:
> - rebased on staging.
> 
> Changes from v2:
> - fix sched_set_affinity() not to use always hard_affinity;
> - move calls to sched_set_affinity() below per-scheduler vCPU data allocation.
> ---
>  xen/arch/x86/dom0_build.c  |    7 ++--
>  xen/common/schedule.c      |   75 ++++++++++++++++++++++++++++++++------------
>  xen/include/xen/sched-if.h |    3 ++
>  xen/include/xen/sched.h    |    3 ++
>  4 files changed, 63 insertions(+), 25 deletions(-)
> 
> diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
> index 555660b853..b744791c38 100644
> --- a/xen/arch/x86/dom0_build.c
> +++ b/xen/arch/x86/dom0_build.c
> @@ -140,14 +140,13 @@ struct vcpu *__init dom0_setup_vcpu(struct domain *d,
>      {
>          if ( pv_shim )
>          {
> -            __cpumask_set_cpu(vcpu_id, v->cpu_hard_affinity);
> -            __cpumask_set_cpu(vcpu_id, v->cpu_soft_affinity);
> +            sched_set_affinity(v, cpumask_of(vcpu_id), cpumask_of(vcpu_id));
>          }
>          else
>          {
>              if ( !d->is_pinned && !dom0_affinity_relaxed )
> -                cpumask_copy(v->cpu_hard_affinity, &dom0_cpus);
> -            cpumask_copy(v->cpu_soft_affinity, &dom0_cpus);
> +                sched_set_affinity(v, &dom0_cpus, NULL);
> +            sched_set_affinity(v, NULL, &dom0_cpus);
>          }
>      }
>  
> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> index 64524f4da7..f43d943765 100644
> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -256,18 +256,11 @@ static void sched_spin_unlock_double(spinlock_t *lock1, spinlock_t *lock2,
>  int sched_init_vcpu(struct vcpu *v, unsigned int processor) 
>  {
>      struct domain *d = v->domain;
> +    cpumask_t allcpus;
>  
> -    /*
> -     * Initialize processor and affinity settings. The idler, and potentially
> -     * domain-0 VCPUs, are pinned onto their respective physical CPUs.
> -     */
> -    v->processor = processor;
> -    if ( is_idle_domain(d) || d->is_pinned )
> -        cpumask_copy(v->cpu_hard_affinity, cpumask_of(processor));
> -    else
> -        cpumask_setall(v->cpu_hard_affinity);
> +    cpumask_setall(&allcpus);
>  
> -    cpumask_setall(v->cpu_soft_affinity);
> +    v->processor = processor;
>  
>      /* Initialise the per-vcpu timers. */
>      init_timer(&v->periodic_timer, vcpu_periodic_timer_fn,
> @@ -282,6 +275,15 @@ int sched_init_vcpu(struct vcpu *v, unsigned int processor)
>      if ( v->sched_priv == NULL )
>          return 1;
>  
> +    /*
> +     * Initialize affinity settings. The idler, and potentially
> +     * domain-0 VCPUs, are pinned onto their respective physical CPUs.
> +     */
> +    if ( is_idle_domain(d) || d->is_pinned )
> +        sched_set_affinity(v, cpumask_of(processor), &allcpus);
> +    else
> +        sched_set_affinity(v, &allcpus, &allcpus);
> +
>      /* Idle VCPUs are scheduled immediately, so don't put them in runqueue. */
>      if ( is_idle_domain(d) )
>      {
> @@ -359,6 +361,7 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
>      for_each_vcpu ( d, v )
>      {
>          spinlock_t *lock;
> +        cpumask_t allcpus;
>  
>          vcpudata = v->sched_priv;
>  
> @@ -366,10 +369,12 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
>          migrate_timer(&v->singleshot_timer, new_p);
>          migrate_timer(&v->poll_timer, new_p);
>  
> -        cpumask_setall(v->cpu_hard_affinity);
> -        cpumask_setall(v->cpu_soft_affinity);
> +        cpumask_setall(&allcpus);
>  
>          lock = vcpu_schedule_lock_irq(v);
> +
> +        sched_set_affinity(v, &allcpus, &allcpus);
> +
>          v->processor = new_p;
>          /*
>           * With v->processor modified we must not
> @@ -694,7 +699,7 @@ void restore_vcpu_affinity(struct domain *d)
>  
>          if ( v->affinity_broken )
>          {
> -            cpumask_copy(v->cpu_hard_affinity, v->cpu_hard_affinity_saved);
> +            sched_set_affinity(v, v->cpu_hard_affinity_saved, NULL);
>              v->affinity_broken = 0;
>  
>          }
> @@ -758,6 +763,8 @@ int cpu_disable_scheduler(unsigned int cpu)
>              if ( cpumask_empty(&online_affinity) &&
>                   cpumask_test_cpu(cpu, v->cpu_hard_affinity) )
>              {
> +                cpumask_t allcpus;
> +
>                  if ( v->affinity_broken )
>                  {
>                      /* The vcpu is temporarily pinned, can't move it. */
> @@ -775,7 +782,8 @@ int cpu_disable_scheduler(unsigned int cpu)
>                  else
>                      printk(XENLOG_DEBUG "Breaking affinity for %pv\n", v);
>  
> -                cpumask_setall(v->cpu_hard_affinity);
> +                cpumask_setall(&allcpus);
> +                sched_set_affinity(v, &allcpus, NULL);
>              }
>  
>              if ( v->processor != cpu )
> @@ -845,8 +853,26 @@ int cpu_disable_scheduler(unsigned int cpu)
>      return ret;
>  }
>  
> +/*
> + * In general, this must be called with the scheduler lock held, because the
> + * adjust_affinity hook may want to modify the vCPU state. However, when the
> + * vCPU is being initialized (either for dom0 or domU) there is no risk of
> + * races, and it's fine not to take the lock (we're talking about
> + * dom0_setup_vcpu() and sched_init_vcpu()).
> + */
> +void sched_set_affinity(
> +    struct vcpu *v, const cpumask_t *hard, const cpumask_t *soft)
> +{
> +    SCHED_OP(dom_scheduler(v->domain), adjust_affinity, v, hard, soft);
> +
> +    if ( hard )
> +        cpumask_copy(v->cpu_hard_affinity, hard);
> +    if ( soft )
> +        cpumask_copy(v->cpu_soft_affinity, soft);
> +}
> +
>  static int vcpu_set_affinity(
> -    struct vcpu *v, const cpumask_t *affinity, cpumask_t *which)
> +    struct vcpu *v, const cpumask_t *affinity, const cpumask_t *which)
>  {
>      spinlock_t *lock;
>      int ret = 0;
> @@ -857,12 +883,19 @@ static int vcpu_set_affinity(
>          ret = -EBUSY;
>      else
>      {
> -        cpumask_copy(which, affinity);
> -
>          /*
> -         * Always ask the scheduler to re-evaluate placement
> -         * when changing the affinity.
> +         * Tell the scheduler we changed something about affinity,
> +         * and ask to re-evaluate vcpu placement.
>           */
> +        if ( which == v->cpu_hard_affinity )
> +        {
> +            sched_set_affinity(v, affinity, NULL);
> +        }
> +        else
> +        {
> +            ASSERT(which == v->cpu_soft_affinity);
> +            sched_set_affinity(v, NULL, affinity);
> +        }
>          set_bit(_VPF_migrating, &v->pause_flags);
>      }
>  
> @@ -1100,7 +1133,7 @@ int vcpu_pin_override(struct vcpu *v, int cpu)
>      {
>          if ( v->affinity_broken )
>          {
> -            cpumask_copy(v->cpu_hard_affinity, v->cpu_hard_affinity_saved);
> +            sched_set_affinity(v, v->cpu_hard_affinity_saved, NULL);
>              v->affinity_broken = 0;
>              set_bit(_VPF_migrating, &v->pause_flags);
>              ret = 0;
> @@ -1114,7 +1147,7 @@ int vcpu_pin_override(struct vcpu *v, int cpu)
>          {
>              cpumask_copy(v->cpu_hard_affinity_saved, v->cpu_hard_affinity);
>              v->affinity_broken = 1;
> -            cpumask_copy(v->cpu_hard_affinity, cpumask_of(cpu));
> +            sched_set_affinity(v, cpumask_of(cpu), NULL);
>              set_bit(_VPF_migrating, &v->pause_flags);
>              ret = 0;
>          }
> diff --git a/xen/include/xen/sched-if.h b/xen/include/xen/sched-if.h
> index c5dd43ed9c..926d063ccf 100644
> --- a/xen/include/xen/sched-if.h
> +++ b/xen/include/xen/sched-if.h
> @@ -173,6 +173,9 @@ struct scheduler {
>                                      unsigned int);
>      int          (*adjust)         (const struct scheduler *, struct domain *,
>                                      struct xen_domctl_scheduler_op *);
> +    void         (*adjust_affinity)(const struct scheduler *, struct vcpu *,
> +                                    const struct cpumask *,
> +                                    const struct cpumask *);
>      int          (*adjust_global)  (const struct scheduler *,
>                                      struct xen_sysctl_scheduler_op *);
>      void         (*dump_settings)  (const struct scheduler *);
> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> index 39f938644a..ade4d7b9aa 100644
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -846,6 +846,9 @@ void scheduler_free(struct scheduler *sched);
>  int schedule_cpu_switch(unsigned int cpu, struct cpupool *c);
>  void vcpu_force_reschedule(struct vcpu *v);
>  int cpu_disable_scheduler(unsigned int cpu);
> +/* We need it in dom0_setup_vcpu */
> +void sched_set_affinity(struct vcpu *v, const cpumask_t *hard,
> +                        const cpumask_t *soft);
>  int vcpu_set_hard_affinity(struct vcpu *v, const cpumask_t *affinity);
>  int vcpu_set_soft_affinity(struct vcpu *v, const cpumask_t *affinity);
>  void restore_vcpu_affinity(struct domain *d);
> 

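As the comment above the new sched_set_affinity() states, every caller
outside of vCPU initialization must hold the scheduler lock, since the
adjust_affinity hook may modify scheduler-private vCPU state. A minimal
sketch of that calling pattern (the helper below is hypothetical, not
part of the patch):

    /*
     * Hypothetical caller: pin v to a single pCPU at runtime. Taking
     * the vCPU's scheduler lock first serializes the adjust_affinity
     * hook against the scheduler, per sched_set_affinity()'s rule.
     */
    static void pin_vcpu_sketch(struct vcpu *v, unsigned int cpu)
    {
        spinlock_t *lock = vcpu_schedule_lock_irq(v);

        sched_set_affinity(v, cpumask_of(cpu), NULL);

        /*
         * A real caller would likely also set _VPF_migrating, as
         * vcpu_set_affinity() does, so placement gets re-evaluated.
         */
        vcpu_schedule_unlock_irq(lock, v);
    }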

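As for consuming the hook: the point of the series is that a scheduler
can pre-compute rarely-changing facts (e.g., exclusive pinning) when
affinity changes, instead of scanning cpumasks in hot paths. A sketch of
what such an implementation could look like for Credit1 (csched_aff_cntl
and CSCHED_FLAG_VCPU_PINNED are illustrative names, not necessarily what
patch 2 actually adds):

    /*
     * Illustrative adjust_affinity implementation: cache whether the
     * vCPU is exclusively pinned, i.e., its hard affinity contains
     * exactly one pCPU, so hot paths can test one flag instead of
     * counting mask bits. CSCHED_FLAG_VCPU_PINNED is a hypothetical
     * bit number within svc->flags.
     */
    static void
    csched_aff_cntl(const struct scheduler *ops, struct vcpu *v,
                    const cpumask_t *hard, const cpumask_t *soft)
    {
        struct csched_vcpu *svc = v->sched_priv;

        /* Only a soft-affinity change: nothing to recompute here. */
        if ( !hard )
            return;

        if ( cpumask_weight(hard) == 1 )
            __set_bit(CSCHED_FLAG_VCPU_PINNED, &svc->flags);
        else
            __clear_bit(CSCHED_FLAG_VCPU_PINNED, &svc->flags);
    }

The hook would then be wired up in the scheduler's ops table, i.e.,
.adjust_affinity = csched_aff_cntl in the struct scheduler definition.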

* Re: [PATCH RESEND 3/4] xen: sched: improve checking soft-affinity
  2018-03-21 15:24   ` George Dunlap
@ 2018-03-21 16:57     ` Dario Faggioli
  0 siblings, 0 replies; 9+ messages in thread
From: Dario Faggioli @ 2018-03-21 16:57 UTC (permalink / raw)
  To: George Dunlap, xen-devel; +Cc: Anshul Makkar


On Wed, 2018-03-21 at 15:24 +0000, George Dunlap wrote:
> 
> If you're OK with those changes, 
>
I am cool with every one of them.

> I can make them on check-in (as well as
> changing your email address)
> 
And thanks for this as well,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/


end of thread

Thread overview: 9+ messages
2018-03-15 18:35 [PATCH RESEND 0/4] xen: sched: optimize exclusive pinning and soft-affinity checking Dario Faggioli
2018-03-15 18:35 ` [PATCH RESEND 1/4] xen: sched: introduce 'adjust_affinity' hook Dario Faggioli
2018-03-15 19:10   ` Dario Faggioli
2018-03-21 15:24   ` George Dunlap
2018-03-15 18:35 ` [PATCH RESEND 2/4] xen: sched: optimize exclusive pinning case (Credit1 & 2) Dario Faggioli
2018-03-15 18:35 ` [PATCH RESEND 3/4] xen: sched: improve checking soft-affinity Dario Faggioli
2018-03-21 15:24   ` George Dunlap
2018-03-21 16:57     ` Dario Faggioli
2018-03-15 18:35 ` [PATCH RESEND 4/4] xen: sched: simplify (and speedup) " Dario Faggioli
