xen-devel.lists.xenproject.org archive mirror
* [PATCH v4 0/5] sched: credit2: introduce per-vcpu hard and soft affinity
@ 2015-07-13  8:13 Justin T. Weaver
  2015-07-13  8:13 ` [PATCH v4 1/5] sched: factor out VCPU2ONLINE to common header file Justin T. Weaver
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Justin T. Weaver @ 2015-07-13  8:13 UTC (permalink / raw)
  To: xen-devel; +Cc: george.dunlap, dario.faggioli, henric

Hello,

The credit2 vcpu scheduler currently ignores per-vcpu hard and soft affinity
masks.

In the v3 review George Dunlap asked me to split the soft affinity patch into
multiple patches, one per function, and to resend only the changes to two of
them, get_fallback_cpu and runq_tickle. This series therefore does not include
any soft affinity changes to the credit2 functions balance_load or choose_cpu.

The first patch is new to the series. It just moves macro VCPU2ONLINE from 
schedule.c to sched.h so other schedulers can use it.

The second patch updates the credit2 scheduler to ensure that vcpus only run
on pcpus on which they are allowed to run (hard affinity). I tested it using
xl vcpu-pin and xl vcpu-list: I changed the affinity in different ways with
scripted calls to vcpu-pin and observed the results with vcpu-list. Each vcpu
ran where it was supposed to.

Patch three factors out code from the credit scheduler (sched_credit.c) related
to soft affinity load balancing and places it in a common header (sched-if.h).
This allows credit2 to reuse the functions and defines in the soft affinity
patches. The only change here from v3 is an update to the commit message
adding that no functional changes are intended with the patch. I carried over
the reviewed-by line from Dario Faggioli.

In the v3 series there was a patch that only included indents in credit2
function runq_tickle in order to make the soft affinity patch easier to review.
Based on the review, that patch has been dropped and the indents are included in
the separate credit2 soft affinity runq_tickle patch.

The fourth and fifth patches add per-vcpu soft affinity awareness to functions
get_fallback_cpu and runq_tickle, respectively. 

I look forward to the review comments, thanks!

Justin Weaver

---
[1/5] sched: factor out VCPU2ONLINE to common header file
[2/5] sched: credit2: respect per-vcpu hard affinity
[3/5] sched: factor out per-vcpu affinity related code to common header file
[4/5] sched: credit2: add soft affinity awareness to function get_fallback_cpu
[5/5] sched: credit2: add soft affinity awareness to function runq_tickle

xen/common/sched_credit.c  |   87 ++------------
xen/common/sched_credit2.c |  268 +++++++++++++++++++++++++++++++++-----------
xen/common/schedule.c      |    1 -
xen/include/xen/sched-if.h |   65 +++++++++++
xen/include/xen/sched.h    |    2 +
5 files changed, 282 insertions(+), 141 deletions(-)


* [PATCH v4 1/5] sched: factor out VCPU2ONLINE to common header file
  2015-07-13  8:13 [PATCH v4 0/5] sched: credit2: introduce per-vcpu hard and soft affinity Justin T. Weaver
@ 2015-07-13  8:13 ` Justin T. Weaver
  2015-09-17 15:26   ` Dario Faggioli
  2015-07-13  8:13 ` [PATCH v4 2/5] sched: credit2: respect per-vcpu hard affinity Justin T. Weaver
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 10+ messages in thread
From: Justin T. Weaver @ 2015-07-13  8:13 UTC (permalink / raw)
  To: xen-devel; +Cc: george.dunlap, dario.faggioli, Justin T. Weaver, henric

Move macro VCPU2ONLINE from schedule.c to sched.h so it can be used by other
source files.

Signed-off-by: Justin T. Weaver <jtweaver@hawaii.edu>
---
 xen/common/schedule.c   |    1 -
 xen/include/xen/sched.h |    2 ++
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index ecf1545..c43b733 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -80,7 +80,6 @@ static struct scheduler __read_mostly ops;
 
 #define DOM2OP(_d)    (((_d)->cpupool == NULL) ? &ops : ((_d)->cpupool->sched))
 #define VCPU2OP(_v)   (DOM2OP((_v)->domain))
-#define VCPU2ONLINE(_v) cpupool_online_cpumask((_v)->domain->cpupool)
 
 static inline void trace_runstate_change(struct vcpu *v, int new_state)
 {
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index b29d9e7..e5dd040 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -891,6 +891,8 @@ extern void dump_runq(unsigned char key);
 
 void arch_do_physinfo(xen_sysctl_physinfo_t *pi);
 
+#define VCPU2ONLINE(_v) cpupool_online_cpumask((_v)->domain->cpupool)
+
 #endif /* __SCHED_H__ */
 
 /*
-- 
1.7.10.4


* [PATCH v4 2/5] sched: credit2: respect per-vcpu hard affinity
  2015-07-13  8:13 [PATCH v4 0/5] sched: credit2: introduce per-vcpu hard and soft affinity Justin T. Weaver
  2015-07-13  8:13 ` [PATCH v4 1/5] sched: factor out VCPU2ONLINE to common header file Justin T. Weaver
@ 2015-07-13  8:13 ` Justin T. Weaver
  2015-09-18 22:12   ` Dario Faggioli
  2015-07-13  8:13 ` [PATCH v4 3/5] sched: factor out per-vcpu affinity related code to common header file Justin T. Weaver
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 10+ messages in thread
From: Justin T. Weaver @ 2015-07-13  8:13 UTC (permalink / raw)
  To: xen-devel; +Cc: george.dunlap, dario.faggioli, Justin T. Weaver, henric

by making sure that vcpus only run on the pcpu(s) they are allowed to
run on based on their hard affinity cpu masks.
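
In essence, the recurring check this patch adds throughout credit2 takes one
of the following two forms (a simplified sketch, not code lifted verbatim from
a single hunk; the real call sites are in the diff below, rqd stands for the
runqueue being considered, and scratch_mask is the per-pcpu scratch cpumask
this patch introduces):

    /* Only consider pcpus that the vcpu's hard affinity allows. */
    if ( !cpumask_test_cpu(cpu, svc->vcpu->cpu_hard_affinity) )
        continue; /* this candidate pcpu is forbidden, try the next one */

    /* Or, when picking from a set of pcpus, restrict the set first. */
    cpumask_and(scratch_mask, svc->vcpu->cpu_hard_affinity, &rqd->active);
    cpu = cpumask_any(scratch_mask);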

Signed-off-by: Justin T. Weaver <jtweaver@hawaii.edu>
---
Changes in v4:
 * Renamed scratch_mask to _scratch_mask
 * Renamed csched2_cpumask to scratch_mask
 * Removed "else continue" in function choose_cpu's for_each_cpu loop to make
   the code less confusing
 * Added an ASSERT that triggers if _scratch_mask[cpu] is NULL after
   allocation in function csched2_alloc_pdata
 * Added assignment to NULL for _scratch_mask[cpu] after call to
   free_cpumask_var in function csched2_alloc_pdata
 * Changed allocation of _scratch_mask from using xmalloc_array back to using
   xzalloc_array
 * Moved allocation of _scratch_mask from function csched2_init to function
   csched2_global_init
 * Added comment to function csched2_vcpu_migrate explaining the need for the
   vc->processor assignment after the else
 * Modified comment before function get_fallback_cpu; reworded into bulleted
   list
 * Changed cpumask_any to cpumask_first at the end of function get_fallback_cpu
 * Fixed indentation in function get_fallback_cpu to align with opening parens
 * Changed function get_fallback_cpu to variant suggested in the v3 review
 * Changed comment before function vcpu_is_migrateable; vcpu svc to just svc
 * Changed "run queue" in several comments to "runqueue"
 * Renamed function valid_vcpu_migration to vcpu_is_migrateable
 * Made condition check in function vcpu_is_migrateable "positive"
Changes in v3:
(all changes are based on v2 review comments unless noted)
 * Renamed cpumask to scratch_mask
 * Renamed function get_safe_pcpu to get_fallback_cpu
 * Improved comment for function get_fallback_cpu
 * Replaced cpupool_online_cpumask with VCPU2ONLINE in function
   get_fallback_cpu to shorten the line
 * Added #define for VCPU2ONLINE (probably should be factored out of
   schedule.c and here, and put into a common header)
 * Modified code in function get_fallback_cpu: moved check for current
   processor to the top; added an ASSERT because the mask should not be empty
 * Modified code and comment in function choose_cpu in migrate request section
 * Added comment to function choose_cpu explaining why the vcpu passed to the
   function might not have hard affinity with any of the pcpus in its assigned
   run queue
 * Modified code in function choose_cpu to make it more readable
 * Moved/changed "We didn't find ..." comment in function choose_cpu
 * Combined migration flag check and hard affinity check into valid migration
   check helper function; replaced code in three places in function
   balance_load with call to the helper function
 * Changed a BUG_ON to an ASSERT in function csched2_vcpu_migrate
 * Moved vc->processor assignment in function csched2_vcpu_migrate to an else
   block to execute only if current and destination run queues are the same;
   Note: without the processor assignment here the vcpu might be assigned to a
   processor it no longer is allowed to run on. In that case, function
   runq_candidate may only get called for the vcpu's old processor, and
   runq_candidate will no longer let a vcpu run on a processor that it's not
   allowed to run on (because of the hard affinity check first introduced in
   v1 of this patch).
 * csched2_init: changed xzalloc_bytes to xmalloc_array for allocation of
   scratch_mask
 * csched2_deinit: removed scratch_mask freeing loop; it wasn't needed
Changes in v2:
 * Added dynamically allocated cpu masks to avoid putting them on the stack;
   replaced temp masks from v1 throughout
 * Added helper function for code suggested in v1 review and called it in two
   locations in function choose_cpu
 * Removed v1 change to comment in the beginning of choose_cpu
 * Replaced two instances of cpumask_and/cpumask_empty with cpumask_intersects
 * Removed v1 re-work of code in function migrate; only change in migrate in
   v2 is the assignment of a valid pcpu from the destination run queue to
   vc->processor
 * In function csched2_vcpu_migrate: removed change from v1 that called
   function migrate even if cur and dest run queues were the same in order
   to get a runq_tickle call; added processor assignment to new_cpu to fix
   the real underlying issue which was the vcpu not getting a call to
   sched_move_irqs
 * Removed the looping added in v1 in function balance_load; may be added back
   later because it would help to have balance_load be more aware of hard
   affinity, but adding it does not affect credit2's current inability to
   respect hard affinity.
 * Removed coding style fix in function balance_load
 * Improved comment in function runq_candidate
---
 xen/common/sched_credit2.c |  153 ++++++++++++++++++++++++++++++++++++--------
 1 file changed, 125 insertions(+), 28 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 75e0321..42a1097 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -194,6 +194,12 @@ int opt_overload_balance_tolerance=-3;
 integer_param("credit2_balance_over", opt_overload_balance_tolerance);
 
 /*
+ * Use this to avoid having too many cpumask_t structs on the stack
+ */
+static cpumask_t **_scratch_mask = NULL;
+#define scratch_mask _scratch_mask[smp_processor_id()]
+
+/*
  * Per-runqueue data
  */
 struct csched2_runqueue_data {
@@ -268,6 +274,38 @@ struct csched2_dom {
     uint16_t nr_vcpus;
 };
 
+/*
+ * When a hard affinity change occurs, we may not be able to check some or
+ * all of the other runqueues for a valid new processor for the given vcpu
+ * because (in function choose_cpu) either the trylock on the private data
+ * failed or the trylock on each runqueue with valid processor(s) for the
+ * vcpu failed. In these cases, this function is used to pick a pcpu that svc
+ * can run on.
+ *
+ * Function returns a valid pcpu for svc, in order of preference:
+ * - svc's current pcpu;
+ * - another pcpu from svc's current runq;
+ * - an online pcpu in svc's domain's cpupool, and in svc's hard affinity;
+ */
+static int get_fallback_cpu(struct csched2_vcpu *svc)
+{
+    int cpu;
+
+    if ( likely(cpumask_test_cpu(svc->vcpu->processor,
+                                 svc->vcpu->cpu_hard_affinity)) )
+        return svc->vcpu->processor;
+
+    cpumask_and(scratch_mask, svc->vcpu->cpu_hard_affinity,
+                &svc->rqd->active);
+    cpu = cpumask_first(scratch_mask);
+    if ( likely(cpu < nr_cpu_ids) )
+        return cpu;
+
+    cpumask_and(scratch_mask, svc->vcpu->cpu_hard_affinity,
+                VCPU2ONLINE(svc->vcpu));
+    ASSERT( !cpumask_empty(scratch_mask) );
+    return cpumask_first(scratch_mask);
+}
 
 /*
  * Time-to-credit, credit-to-time.
@@ -501,8 +539,9 @@ runq_tickle(const struct scheduler *ops, unsigned int cpu, struct csched2_vcpu *
         goto tickle;
     }
     
-    /* Get a mask of idle, but not tickled */
+    /* Get a mask of idle, but not tickled, that new is allowed to run on. */
     cpumask_andnot(&mask, &rqd->idle, &rqd->tickled);
+    cpumask_and(&mask, &mask, new->vcpu->cpu_hard_affinity);
     
     /* If it's not empty, choose one */
     i = cpumask_cycle(cpu, &mask);
@@ -513,9 +552,11 @@ runq_tickle(const struct scheduler *ops, unsigned int cpu, struct csched2_vcpu *
     }
 
     /* Otherwise, look for the non-idle cpu with the lowest credit,
-     * skipping cpus which have been tickled but not scheduled yet */
+     * skipping cpus which have been tickled but not scheduled yet,
+     * that new is allowed to run on. */
     cpumask_andnot(&mask, &rqd->active, &rqd->idle);
     cpumask_andnot(&mask, &mask, &rqd->tickled);
+    cpumask_and(&mask, &mask, new->vcpu->cpu_hard_affinity);
 
     for_each_cpu(i, &mask)
     {
@@ -1078,9 +1119,8 @@ choose_cpu(const struct scheduler *ops, struct vcpu *vc)
             d2printk("%pv -\n", svc->vcpu);
             clear_bit(__CSFLAG_runq_migrate_request, &svc->flags);
         }
-        /* Leave it where it is for now.  When we actually pay attention
-         * to affinity we'll have to figure something out... */
-        return vc->processor;
+
+        return get_fallback_cpu(svc);
     }
 
     /* First check to see if we're here because someone else suggested a place
@@ -1091,45 +1131,55 @@ choose_cpu(const struct scheduler *ops, struct vcpu *vc)
         {
             printk("%s: Runqueue migrate aborted because target runqueue disappeared!\n",
                    __func__);
-            /* Fall-through to normal cpu pick */
         }
         else
         {
-            d2printk("%pv +\n", svc->vcpu);
-            new_cpu = cpumask_cycle(vc->processor, &svc->migrate_rqd->active);
-            goto out_up;
+            cpumask_and(scratch_mask, vc->cpu_hard_affinity,
+                &svc->migrate_rqd->active);
+            new_cpu = cpumask_any(scratch_mask);
+            if ( new_cpu < nr_cpu_ids )
+            {
+                d2printk("%pv +\n", svc->vcpu);
+                goto out_up;
+            }
         }
+        /* Fall-through to normal cpu pick */
     }
 
-    /* FIXME: Pay attention to cpu affinity */                                                                                      
-
     min_avgload = MAX_LOAD;
 
     /* Find the runqueue with the lowest instantaneous load */
     for_each_cpu(i, &prv->active_queues)
     {
         struct csched2_runqueue_data *rqd;
-        s_time_t rqd_avgload;
+        s_time_t rqd_avgload = MAX_LOAD;
 
         rqd = prv->rqd + i;
 
         /* If checking a different runqueue, grab the lock,
-         * read the avg, and then release the lock.
+         * check hard affinity, read the avg, and then release the lock.
          *
          * If on our own runqueue, don't grab or release the lock;
          * but subtract our own load from the runqueue load to simulate
-         * impartiality */
+         * impartiality.
+         *
+         * svc's hard affinity may have changed; this function is the
+         * credit 2 scheduler's first opportunity to react to the change,
+         * so it is possible here that svc does not have hard affinity
+         * with any of the pcpus of svc's currently assigned runqueue.
+         */
         if ( rqd == svc->rqd )
         {
-            rqd_avgload = rqd->b_avgload - svc->avgload;
+            if ( cpumask_intersects(vc->cpu_hard_affinity, &rqd->active) )
+                rqd_avgload = rqd->b_avgload - svc->avgload;
         }
         else if ( spin_trylock(&rqd->lock) )
         {
-            rqd_avgload = rqd->b_avgload;
+            if ( cpumask_intersects(vc->cpu_hard_affinity, &rqd->active) )
+                rqd_avgload = rqd->b_avgload;
+
             spin_unlock(&rqd->lock);
         }
-        else
-            continue;
 
         if ( rqd_avgload < min_avgload )
         {
@@ -1138,12 +1188,14 @@ choose_cpu(const struct scheduler *ops, struct vcpu *vc)
         }
     }
 
-    /* We didn't find anyone (most likely because of spinlock contention); leave it where it is */
+    /* We didn't find anyone (most likely because of spinlock contention). */
     if ( min_rqi == -1 )
-        new_cpu = vc->processor;
+        new_cpu = get_fallback_cpu(svc);
     else
     {
-        new_cpu = cpumask_cycle(vc->processor, &prv->rqd[min_rqi].active);
+        cpumask_and(scratch_mask, vc->cpu_hard_affinity,
+            &prv->rqd[min_rqi].active);
+        new_cpu = cpumask_any(scratch_mask);
         BUG_ON(new_cpu >= nr_cpu_ids);
     }
 
@@ -1223,7 +1275,12 @@ static void migrate(const struct scheduler *ops,
             on_runq=1;
         }
         __runq_deassign(svc);
-        svc->vcpu->processor = cpumask_any(&trqd->active);
+
+        cpumask_and(scratch_mask, svc->vcpu->cpu_hard_affinity,
+            &trqd->active);
+        svc->vcpu->processor = cpumask_any(scratch_mask);
+        BUG_ON(svc->vcpu->processor >= nr_cpu_ids);
+
         __runq_assign(svc, trqd);
         if ( on_runq )
         {
@@ -1237,6 +1294,17 @@ static void migrate(const struct scheduler *ops,
     }
 }
 
+/*
+ * Migration of svc to runqueue rqd is a valid option if svc is not already
+ * flagged to migrate and if svc is allowed to run on at least one of the
+ * pcpus assigned to rqd based on svc's hard affinity mask.
+ */
+static bool_t vcpu_is_migrateable(struct csched2_vcpu *svc,
+                                  struct csched2_runqueue_data *rqd)
+{
+    return !test_bit(__CSFLAG_runq_migrate_request, &svc->flags)
+        && cpumask_intersects(svc->vcpu->cpu_hard_affinity, &rqd->active);
+}
 
 static void balance_load(const struct scheduler *ops, int cpu, s_time_t now)
 {
@@ -1345,8 +1413,7 @@ retry:
 
         __update_svc_load(ops, push_svc, 0, now);
 
-        /* Skip this one if it's already been flagged to migrate */
-        if ( test_bit(__CSFLAG_runq_migrate_request, &push_svc->flags) )
+        if ( !vcpu_is_migrateable(push_svc, st.orqd) )
             continue;
 
         list_for_each( pull_iter, &st.orqd->svc )
@@ -1358,8 +1425,7 @@ retry:
                 __update_svc_load(ops, pull_svc, 0, now);
             }
         
-            /* Skip this one if it's already been flagged to migrate */
-            if ( test_bit(__CSFLAG_runq_migrate_request, &pull_svc->flags) )
+            if ( !vcpu_is_migrateable(pull_svc, st.lrqd) )
                 continue;
 
             consider(&st, push_svc, pull_svc);
@@ -1375,8 +1441,7 @@ retry:
     {
         struct csched2_vcpu * pull_svc = list_entry(pull_iter, struct csched2_vcpu, rqd_elem);
         
-        /* Skip this one if it's already been flagged to migrate */
-        if ( test_bit(__CSFLAG_runq_migrate_request, &pull_svc->flags) )
+        if ( !vcpu_is_migrateable(pull_svc, st.lrqd) )
             continue;
 
         /* Consider pull only */
@@ -1415,11 +1480,20 @@ csched2_vcpu_migrate(
 
     /* Check if new_cpu is valid */
     BUG_ON(!cpumask_test_cpu(new_cpu, &CSCHED2_PRIV(ops)->initialized));
+    ASSERT(cpumask_test_cpu(new_cpu, vc->cpu_hard_affinity));
 
     trqd = RQD(ops, new_cpu);
 
+    /*
+     * Without the processor assignment after the else, vc may be assigned to
+     * a processor it is not allowed to run on. In that case, runq_candidate
+     * might only get called for the old cpu, and vc will not get to run due
+     * to the hard affinity check.
+     */
     if ( trqd != svc->rqd )
         migrate(ops, svc, trqd, NOW());
+    else
+        vc->processor = new_cpu;
 }
 
 static int
@@ -1638,6 +1712,10 @@ runq_candidate(struct csched2_runqueue_data *rqd,
     {
         struct csched2_vcpu * svc = list_entry(iter, struct csched2_vcpu, runq_elem);
 
+        /* Only consider vcpus that are allowed to run on this processor. */
+        if ( !cpumask_test_cpu(cpu, svc->vcpu->cpu_hard_affinity) )
+            continue;
+
         /* If this is on a different processor, don't pull it unless
          * its credit is at least CSCHED2_MIGRATE_RESIST higher. */
         if ( svc->vcpu->processor != cpu
@@ -2047,6 +2125,9 @@ static void init_pcpu(const struct scheduler *ops, int cpu)
 
     spin_unlock_irqrestore(&prv->lock, flags);
 
+    free_cpumask_var(_scratch_mask[cpu]);
+    _scratch_mask[cpu] = NULL;
+
     return;
 }
 
@@ -2061,6 +2142,16 @@ csched2_alloc_pdata(const struct scheduler *ops, int cpu)
         printk("%s: cpu %d not online yet, deferring initializatgion\n",
                __func__, cpu);
 
+    /*
+     * For each new pcpu, allocate a cpumask_t for use throughout the
+     * scheduler to avoid putting any cpumask_t structs on the stack.
+     */
+    if ( !zalloc_cpumask_var(&_scratch_mask[cpu]) )
+    {
+        ASSERT(_scratch_mask[cpu] == NULL);
+        return NULL;
+    }
+
     return (void *)1;
 }
 
@@ -2151,6 +2242,10 @@ static struct notifier_block cpu_credit2_nfb = {
 static int
 csched2_global_init(void)
 {
+    _scratch_mask = xzalloc_array(cpumask_t *, nr_cpu_ids);
+    if ( _scratch_mask == NULL )
+        return -ENOMEM;
+
     register_cpu_notifier(&cpu_credit2_nfb);
     return 0;
 }
@@ -2206,6 +2301,8 @@ csched2_deinit(const struct scheduler *ops)
 
     prv = CSCHED2_PRIV(ops);
     xfree(prv);
+
+    xfree(_scratch_mask);
 }
 
 
-- 
1.7.10.4


* [PATCH v4 3/5] sched: factor out per-vcpu affinity related code to common header file
  2015-07-13  8:13 [PATCH v4 0/5] sched: credit2: introduce per-vcpu hard and soft affinity Justin T. Weaver
  2015-07-13  8:13 ` [PATCH v4 1/5] sched: factor out VCPU2ONLINE to common header file Justin T. Weaver
  2015-07-13  8:13 ` [PATCH v4 2/5] sched: credit2: respect per-vcpu hard affinity Justin T. Weaver
@ 2015-07-13  8:13 ` Justin T. Weaver
  2015-07-13  8:13 ` [PATCH v4 4/5] sched: credit2: add soft affinity awareness to function get_fallback_cpu Justin T. Weaver
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Justin T. Weaver @ 2015-07-13  8:13 UTC (permalink / raw)
  To: xen-devel; +Cc: george.dunlap, dario.faggioli, Justin T. Weaver, henric

Move affinity balancing related functions and defines from sched_credit.c to
sched-if.h so other schedulers can use them. Change name prefixes from csched
to sched since they are no longer specific to the credit scheduler. No
functional changes intended.
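
For reference, the intended usage pattern of the factored out helpers looks
roughly like this (a simplified sketch, not code from this patch; vc, scratch
and candidates are placeholder names for a vcpu, a scratch cpumask and the set
of candidate pcpus):

    int balance_step, cpu = nr_cpu_ids;

    for_each_sched_balance_step( balance_step )
    {
        /* Skip the soft affinity step when it would make no difference. */
        if ( balance_step == SCHED_BALANCE_SOFT_AFFINITY
             && !__vcpu_has_soft_affinity(vc, vc->cpu_hard_affinity) )
            continue;

        /* Step 1: soft && hard affinity; step 2: hard affinity only. */
        sched_balance_cpumask(vc, balance_step, scratch);
        cpumask_and(scratch, scratch, candidates);
        cpu = cpumask_first(scratch);
        if ( cpu < nr_cpu_ids )
            break; /* found a suitable pcpu in this balance step */
    }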

Signed-off-by: Justin T. Weaver <jtweaver@hawaii.edu>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
---
Changes in v4:
 * only the commit message was modified to indicate no functional changes
---
 xen/common/sched_credit.c  |   87 ++++++--------------------------------------
 xen/include/xen/sched-if.h |   65 +++++++++++++++++++++++++++++++++
 2 files changed, 76 insertions(+), 76 deletions(-)

diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index 953ecb0..22252aa 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -129,26 +129,6 @@
 
 
 /*
- * Hard and soft affinity load balancing.
- *
- * Idea is each vcpu has some pcpus that it prefers, some that it does not
- * prefer but is OK with, and some that it cannot run on at all. The first
- * set of pcpus are the ones that are both in the soft affinity *and* in the
- * hard affinity; the second set of pcpus are the ones that are in the hard
- * affinity but *not* in the soft affinity; the third set of pcpus are the
- * ones that are not in the hard affinity.
- *
- * We implement a two step balancing logic. Basically, every time there is
- * the need to decide where to run a vcpu, we first check the soft affinity
- * (well, actually, the && between soft and hard affinity), to see if we can
- * send it where it prefers to (and can) run on. However, if the first step
- * does not find any suitable and free pcpu, we fall back checking the hard
- * affinity.
- */
-#define CSCHED_BALANCE_SOFT_AFFINITY    0
-#define CSCHED_BALANCE_HARD_AFFINITY    1
-
-/*
  * Boot parameters
  */
 static int __read_mostly sched_credit_tslice_ms = CSCHED_DEFAULT_TSLICE_MS;
@@ -290,51 +270,6 @@ __runq_remove(struct csched_vcpu *svc)
 }
 
 
-#define for_each_csched_balance_step(step) \
-    for ( (step) = 0; (step) <= CSCHED_BALANCE_HARD_AFFINITY; (step)++ )
-
-
-/*
- * Hard affinity balancing is always necessary and must never be skipped.
- * But soft affinity need only be considered when it has a functionally
- * different effect than other constraints (such as hard affinity, cpus
- * online, or cpupools).
- *
- * Soft affinity only needs to be considered if:
- * * The cpus in the cpupool are not a subset of soft affinity
- * * The hard affinity is not a subset of soft affinity
- * * There is an overlap between the soft affinity and the mask which is
- *   currently being considered.
- */
-static inline int __vcpu_has_soft_affinity(const struct vcpu *vc,
-                                           const cpumask_t *mask)
-{
-    return !cpumask_subset(cpupool_online_cpumask(vc->domain->cpupool),
-                           vc->cpu_soft_affinity) &&
-           !cpumask_subset(vc->cpu_hard_affinity, vc->cpu_soft_affinity) &&
-           cpumask_intersects(vc->cpu_soft_affinity, mask);
-}
-
-/*
- * Each csched-balance step uses its own cpumask. This function determines
- * which one (given the step) and copies it in mask. For the soft affinity
- * balancing step, the pcpus that are not part of vc's hard affinity are
- * filtered out from the result, to avoid running a vcpu where it would
- * like, but is not allowed to!
- */
-static void
-csched_balance_cpumask(const struct vcpu *vc, int step, cpumask_t *mask)
-{
-    if ( step == CSCHED_BALANCE_SOFT_AFFINITY )
-    {
-        cpumask_and(mask, vc->cpu_soft_affinity, vc->cpu_hard_affinity);
-
-        if ( unlikely(cpumask_empty(mask)) )
-            cpumask_copy(mask, vc->cpu_hard_affinity);
-    }
-    else /* step == CSCHED_BALANCE_HARD_AFFINITY */
-        cpumask_copy(mask, vc->cpu_hard_affinity);
-}
 
 static void burn_credits(struct csched_vcpu *svc, s_time_t now)
 {
@@ -396,18 +331,18 @@ __runq_tickle(unsigned int cpu, struct csched_vcpu *new)
          * Soft and hard affinity balancing loop. For vcpus without
          * a useful soft affinity, consider hard affinity only.
          */
-        for_each_csched_balance_step( balance_step )
+        for_each_sched_balance_step( balance_step )
         {
             int new_idlers_empty;
 
-            if ( balance_step == CSCHED_BALANCE_SOFT_AFFINITY
+            if ( balance_step == SCHED_BALANCE_SOFT_AFFINITY
                  && !__vcpu_has_soft_affinity(new->vcpu,
                                               new->vcpu->cpu_hard_affinity) )
                 continue;
 
             /* Are there idlers suitable for new (for this balance step)? */
-            csched_balance_cpumask(new->vcpu, balance_step,
-                                   csched_balance_mask);
+            sched_balance_cpumask(new->vcpu, balance_step,
+                                  csched_balance_mask);
             cpumask_and(&idle_mask, prv->idlers, csched_balance_mask);
             new_idlers_empty = cpumask_empty(&idle_mask);
 
@@ -417,7 +352,7 @@ __runq_tickle(unsigned int cpu, struct csched_vcpu *new)
              * hard affinity as well, before taking final decisions.
              */
             if ( new_idlers_empty
-                 && balance_step == CSCHED_BALANCE_SOFT_AFFINITY )
+                 && balance_step == SCHED_BALANCE_SOFT_AFFINITY )
                 continue;
 
             /*
@@ -639,7 +574,7 @@ _csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc, bool_t commit)
     online = cpupool_scheduler_cpumask(vc->domain->cpupool);
     cpumask_and(&cpus, vc->cpu_hard_affinity, online);
 
-    for_each_csched_balance_step( balance_step )
+    for_each_sched_balance_step( balance_step )
     {
         /*
          * We want to pick up a pcpu among the ones that are online and
@@ -659,12 +594,12 @@ _csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc, bool_t commit)
          * cpus and, if the result is empty, we just skip the soft affinity
          * balancing step all together.
          */
-        if ( balance_step == CSCHED_BALANCE_SOFT_AFFINITY
+        if ( balance_step == SCHED_BALANCE_SOFT_AFFINITY
              && !__vcpu_has_soft_affinity(vc, &cpus) )
             continue;
 
         /* Pick an online CPU from the proper affinity mask */
-        csched_balance_cpumask(vc, balance_step, &cpus);
+        sched_balance_cpumask(vc, balance_step, &cpus);
         cpumask_and(&cpus, &cpus, online);
 
         /* If present, prefer vc's current processor */
@@ -1482,11 +1417,11 @@ csched_runq_steal(int peer_cpu, int cpu, int pri, int balance_step)
              * vCPUs with useful soft affinities in some sort of bitmap
              * or counter.
              */
-            if ( balance_step == CSCHED_BALANCE_SOFT_AFFINITY
+            if ( balance_step == SCHED_BALANCE_SOFT_AFFINITY
                  && !__vcpu_has_soft_affinity(vc, vc->cpu_hard_affinity) )
                 continue;
 
-            csched_balance_cpumask(vc, balance_step, csched_balance_mask);
+            sched_balance_cpumask(vc, balance_step, csched_balance_mask);
             if ( __csched_vcpu_is_migrateable(vc, cpu, csched_balance_mask) )
             {
                 /* We got a candidate. Grab it! */
@@ -1537,7 +1472,7 @@ csched_load_balance(struct csched_private *prv, int cpu,
      *  1. any "soft-affine work" to steal first,
      *  2. if not finding anything, any "hard-affine work" to steal.
      */
-    for_each_csched_balance_step( bstep )
+    for_each_sched_balance_step( bstep )
     {
         /*
          * We peek at the non-idling CPUs in a node-wise fashion. In fact,
diff --git a/xen/include/xen/sched-if.h b/xen/include/xen/sched-if.h
index 7cc25c6..3a118da 100644
--- a/xen/include/xen/sched-if.h
+++ b/xen/include/xen/sched-if.h
@@ -188,4 +188,69 @@ struct cpupool
 #define cpupool_online_cpumask(_pool) \
     (((_pool) == NULL) ? &cpu_online_map : (_pool)->cpu_valid)
 
+/*
+ * Hard and soft affinity load balancing.
+ *
+ * Idea is each vcpu has some pcpus that it prefers, some that it does not
+ * prefer but is OK with, and some that it cannot run on at all. The first
+ * set of pcpus are the ones that are both in the soft affinity *and* in the
+ * hard affinity; the second set of pcpus are the ones that are in the hard
+ * affinity but *not* in the soft affinity; the third set of pcpus are the
+ * ones that are not in the hard affinity.
+ *
+ * We implement a two step balancing logic. Basically, every time there is
+ * the need to decide where to run a vcpu, we first check the soft affinity
+ * (well, actually, the && between soft and hard affinity), to see if we can
+ * send it where it prefers to (and can) run on. However, if the first step
+ * does not find any suitable and free pcpu, we fall back checking the hard
+ * affinity.
+ */
+#define SCHED_BALANCE_SOFT_AFFINITY    0
+#define SCHED_BALANCE_HARD_AFFINITY    1
+
+#define for_each_sched_balance_step(step) \
+    for ( (step) = 0; (step) <= SCHED_BALANCE_HARD_AFFINITY; (step)++ )
+
+/*
+ * Hard affinity balancing is always necessary and must never be skipped.
+ * But soft affinity need only be considered when it has a functionally
+ * different effect than other constraints (such as hard affinity, cpus
+ * online, or cpupools).
+ *
+ * Soft affinity only needs to be considered if:
+ * * The cpus in the cpupool are not a subset of soft affinity
+ * * The hard affinity is not a subset of soft affinity
+ * * There is an overlap between the soft affinity and the mask which is
+ *   currently being considered.
+ */
+static inline int __vcpu_has_soft_affinity(const struct vcpu *vc,
+                                           const cpumask_t *mask)
+{
+    return !cpumask_subset(cpupool_online_cpumask(vc->domain->cpupool),
+                           vc->cpu_soft_affinity) &&
+           !cpumask_subset(vc->cpu_hard_affinity, vc->cpu_soft_affinity) &&
+           cpumask_intersects(vc->cpu_soft_affinity, mask);
+}
+
+/*
+ * Each sched-balance step uses its own cpumask. This function determines
+ * which one (given the step) and copies it in mask. For the soft affinity
+ * balancing step, the pcpus that are not part of vc's hard affinity are
+ * filtered out from the result, to avoid running a vcpu where it would
+ * like, but is not allowed to!
+ */
+static inline void
+sched_balance_cpumask(const struct vcpu *vc, int step, cpumask_t *mask)
+{
+    if ( step == SCHED_BALANCE_SOFT_AFFINITY )
+    {
+        cpumask_and(mask, vc->cpu_soft_affinity, vc->cpu_hard_affinity);
+
+        if ( unlikely(cpumask_empty(mask)) )
+            cpumask_copy(mask, vc->cpu_hard_affinity);
+    }
+    else /* step == SCHED_BALANCE_HARD_AFFINITY */
+        cpumask_copy(mask, vc->cpu_hard_affinity);
+}
+
 #endif /* __XEN_SCHED_IF_H__ */
-- 
1.7.10.4


* [PATCH v4 4/5] sched: credit2: add soft affinity awareness to function get_fallback_cpu
  2015-07-13  8:13 [PATCH v4 0/5] sched: credit2: introduce per-vcpu hard and soft affinity Justin T. Weaver
                   ` (2 preceding siblings ...)
  2015-07-13  8:13 ` [PATCH v4 3/5] sched: factor out per-vcpu affinity related code to common header file Justin T. Weaver
@ 2015-07-13  8:13 ` Justin T. Weaver
  2015-07-13  8:13 ` [PATCH v4 5/5] sched: credit2: add soft affinity awareness to function runq_tickle Justin T. Weaver
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Justin T. Weaver @ 2015-07-13  8:13 UTC (permalink / raw)
  To: xen-devel; +Cc: george.dunlap, dario.faggioli, Justin T. Weaver, henric

by adding a two-step loop. The function now finds a fallback cpu for a given
vcpu using the following precedence...
1) the vcpu's current pcpu
soft affinity step...
2) another pcpu from the vcpu's current runq in the vcpu's soft affinity
3) an online pcpu in the vcpu's domain's cpupool, and in the vcpu's soft
   affinity
hard affinity step...
4) another pcpu from the vcpu's current runq in the vcpu's hard affinity
5) an online pcpu in the vcpu's domain's cpupool, and in the vcpu's hard
   affinity

Signed-off-by: Justin T. Weaver <jtweaver@hawaii.edu>
---
Changes in v4:
 * renamed all uses of csched2_cpumask to scratch_mask
 * updated the comment before the function describing the added soft affinity
   aware functionality
 * updated the function to match the flow of the rewrite in the hard affinity
   patch based on the v3 hard affinity review
 * moved the VCPU2ONLINE section outside of the else block; removed the else
   block
Changes in v3:
 * added balance loop to try to find a soft affinity cpu
Changes in v2:
 * Not submitted in version 2; focus was on the hard affinity patch
---
 xen/common/sched_credit2.c |   31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 42a1097..66f0a20 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -284,25 +284,38 @@ struct csched2_dom {
  *
  * Function returns a valid pcpu for svc, in order of preference:
  * - svc's current pcpu;
- * - another pcpu from svc's current runq;
+ * - another pcpu from svc's current runq in svc's soft affinity;
+ * - an online pcpu in svc's domain's cpupool, and in svc's soft affinity;
+ * - another pcpu from svc's current runq in svc's hard affinity;
  * - an online pcpu in svc's domain's cpupool, and in svc's hard affinity;
  */
 static int get_fallback_cpu(struct csched2_vcpu *svc)
 {
-    int cpu;
+    int cpu, balance_step;
 
     if ( likely(cpumask_test_cpu(svc->vcpu->processor,
                                  svc->vcpu->cpu_hard_affinity)) )
         return svc->vcpu->processor;
 
-    cpumask_and(scratch_mask, svc->vcpu->cpu_hard_affinity,
-                &svc->rqd->active);
-    cpu = cpumask_first(scratch_mask);
-    if ( likely(cpu < nr_cpu_ids) )
-        return cpu;
+    for_each_sched_balance_step( balance_step )
+    {
+        if ( balance_step == SCHED_BALANCE_SOFT_AFFINITY
+            && !__vcpu_has_soft_affinity(svc->vcpu,
+                svc->vcpu->cpu_hard_affinity) )
+            continue;
+
+        sched_balance_cpumask(svc->vcpu, balance_step, scratch_mask);
+        cpumask_and(scratch_mask, scratch_mask, &svc->rqd->active);
+        cpu = cpumask_first(scratch_mask);
+        if ( likely(cpu < nr_cpu_ids) )
+            return cpu;
+
+        sched_balance_cpumask(svc->vcpu, balance_step, scratch_mask);
+        cpumask_and(scratch_mask, scratch_mask, VCPU2ONLINE(svc->vcpu));
+        if ( !cpumask_empty(scratch_mask) )
+            break;
+    }
 
-    cpumask_and(scratch_mask, svc->vcpu->cpu_hard_affinity,
-                VCPU2ONLINE(svc->vcpu));
     ASSERT( !cpumask_empty(scratch_mask) );
     return cpumask_first(scratch_mask);
 }
-- 
1.7.10.4


* [PATCH v4 5/5] sched: credit2: add soft affinity awareness to function runq_tickle
  2015-07-13  8:13 [PATCH v4 0/5] sched: credit2: introduce per-vcpu hard and soft affinity Justin T. Weaver
                   ` (3 preceding siblings ...)
  2015-07-13  8:13 ` [PATCH v4 4/5] sched: credit2: add soft affinity awareness to function get_fallback_cpu Justin T. Weaver
@ 2015-07-13  8:13 ` Justin T. Weaver
  2015-07-13 15:43 ` [PATCH v4 0/5] sched: credit2: introduce per-vcpu hard and soft affinity Dario Faggioli
  2015-09-14  9:03 ` Dario Faggioli
  6 siblings, 0 replies; 10+ messages in thread
From: Justin T. Weaver @ 2015-07-13  8:13 UTC (permalink / raw)
  To: xen-devel; +Cc: george.dunlap, dario.faggioli, Justin T. Weaver, henric

by adding two two-step affinity loops.

The first looks for an idle, non-tickled cpu in the given vcpu's soft
affinity, and then in its hard affinity.

If no cpu was found, the second two-step loop first looks for the non-idle,
non-tickled cpu with the lowest credit in the vcpu's soft affinity. If the
vcpu on the found cpu has less credit than the given vcpu, then that cpu is
chosen. Finally, if no cpu was picked yet, the second step looks for the
non-idle, non-tickled cpu with the lowest credit in the vcpu's hard affinity.

Signed-off-by: Justin T. Weaver <jtweaver@hawaii.edu>
---
Changes in v4:
 * removed "indent only" patch and integrated its changes into this patch
 * renamed all uses of csched2_cpumask to scratch_mask
 * moved comment outside of for_each_sched_balance_step loop and updated the
   comment for soft affinity in the "idle, not tickled" section
 * updated the functionality of the "not idle, not tickled" section; it now
   breaks out of the for_each_sched_balance_step loop if the vcpu on the cpu
   found during the soft affinity step has less credit than vcpu new
 * updated the comment above the "not idle, not tickled" section explaining
   the new functionality
Changes in v3:
 * replaced use of the on-stack cpumask_t with the per-vcpu scratch_mask
 * added two balance loops, one for finding idle, but not tickled, and other
   for finding non-idle with lowest credit
Changes in v2:
 * Not submitted in version 2; focus was on the hard affinity patch
---
 xen/common/sched_credit2.c |  112 ++++++++++++++++++++++++++++----------------
 1 file changed, 71 insertions(+), 41 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 66f0a20..cd44ac3 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -534,8 +534,8 @@ runq_tickle(const struct scheduler *ops, unsigned int cpu, struct csched2_vcpu *
     int i, ipid=-1;
     s_time_t lowest=(1<<30);
     struct csched2_runqueue_data *rqd = RQD(ops, cpu);
-    cpumask_t mask;
     struct csched2_vcpu * cur;
+    int balance_step;
 
     d2printk("rqt %pv curr %pv\n", new->vcpu, current);
 
@@ -552,57 +552,87 @@ runq_tickle(const struct scheduler *ops, unsigned int cpu, struct csched2_vcpu *
         goto tickle;
     }
     
-    /* Get a mask of idle, but not tickled, that new is allowed to run on. */
-    cpumask_andnot(&mask, &rqd->idle, &rqd->tickled);
-    cpumask_and(&mask, &mask, new->vcpu->cpu_hard_affinity);
-    
-    /* If it's not empty, choose one */
-    i = cpumask_cycle(cpu, &mask);
-    if ( i < nr_cpu_ids )
+    /*
+     * Look for an idle, untickled cpu in the vcpu's soft affinity, then in
+     * its hard affinity.
+     */
+    for_each_sched_balance_step ( balance_step )
     {
-        ipid = i;
-        goto tickle;
-    }
+        if ( balance_step == SCHED_BALANCE_SOFT_AFFINITY
+            && !__vcpu_has_soft_affinity(new->vcpu,
+                new->vcpu->cpu_hard_affinity) )
+            continue;
+
+        sched_balance_cpumask(new->vcpu, balance_step, scratch_mask);
+        cpumask_and(scratch_mask, scratch_mask, &rqd->idle);
+        cpumask_andnot(scratch_mask, scratch_mask, &rqd->tickled);
 
-    /* Otherwise, look for the non-idle cpu with the lowest credit,
-     * skipping cpus which have been tickled but not scheduled yet,
-     * that new is allowed to run on. */
-    cpumask_andnot(&mask, &rqd->active, &rqd->idle);
-    cpumask_andnot(&mask, &mask, &rqd->tickled);
-    cpumask_and(&mask, &mask, new->vcpu->cpu_hard_affinity);
+        /* If it's not empty, choose one */
+        i = cpumask_cycle(cpu, scratch_mask);
+        if ( i < nr_cpu_ids )
+        {
+            ipid = i;
+            goto tickle;
+        }
+    }
 
-    for_each_cpu(i, &mask)
+    /*
+     * Otherwise, look for the non-idle cpu whose vcpu has the lowest credit,
+     * skipping cpus which have been tickled but not scheduled yet.
+     * First look in new's soft affinity, and choose the cpu if its currently
+     * running vcpu's credit is lower than new's credit.
+     * If a cpu was not found using new's soft affinity, choose the cpu in
+     * new's hard affinity with the lowest credit.
+     */
+    for_each_sched_balance_step ( balance_step )
     {
-        struct csched2_vcpu * cur;
+        if ( balance_step == SCHED_BALANCE_HARD_AFFINITY
+            && lowest < new->credit )
+            goto tickle;
 
-        /* Already looked at this one above */
-        if ( i == cpu )
+        if ( balance_step == SCHED_BALANCE_SOFT_AFFINITY
+            && !__vcpu_has_soft_affinity(new->vcpu,
+                                         new->vcpu->cpu_hard_affinity) )
             continue;
 
-        cur = CSCHED2_VCPU(curr_on_cpu(i));
+        sched_balance_cpumask(new->vcpu, balance_step, scratch_mask);
+        cpumask_and(scratch_mask, scratch_mask, &rqd->active);
+        cpumask_andnot(scratch_mask, scratch_mask, &rqd->idle);
+        cpumask_andnot(scratch_mask, scratch_mask, &rqd->tickled);
+
+        for_each_cpu(i, scratch_mask)
+        {
+            struct csched2_vcpu * cur;
 
-        BUG_ON(is_idle_vcpu(cur->vcpu));
+            /* Already looked at this one above */
+            if ( i == cpu )
+                continue;
 
-        /* Update credits for current to see if we want to preempt */
-        burn_credits(rqd, cur, now);
+            cur = CSCHED2_VCPU(curr_on_cpu(i));
 
-        if ( cur->credit < lowest )
-        {
-            ipid = i;
-            lowest = cur->credit;
-        }
+            BUG_ON(is_idle_vcpu(cur->vcpu));
 
-        /* TRACE */ {
-            struct {
-                unsigned dom:16,vcpu:16;
-                unsigned credit;
-            } d;
-            d.dom = cur->vcpu->domain->domain_id;
-            d.vcpu = cur->vcpu->vcpu_id;
-            d.credit = cur->credit;
-            trace_var(TRC_CSCHED2_TICKLE_CHECK, 1,
-                      sizeof(d),
-                      (unsigned char *)&d);
+            /* Update credits for current to see if we want to preempt */
+            burn_credits(rqd, cur, now);
+
+            if ( cur->credit < lowest )
+            {
+                ipid = i;
+                lowest = cur->credit;
+            }
+
+            /* TRACE */ {
+                struct {
+                    unsigned dom:16,vcpu:16;
+                    unsigned credit;
+                } d;
+                d.dom = cur->vcpu->domain->domain_id;
+                d.vcpu = cur->vcpu->vcpu_id;
+                d.credit = cur->credit;
+                trace_var(TRC_CSCHED2_TICKLE_CHECK, 1,
+                          sizeof(d),
+                          (unsigned char *)&d);
+            }
         }
     }
 
-- 
1.7.10.4


* Re: [PATCH v4 0/5] sched: credit2: introduce per-vcpu hard and soft affinity
  2015-07-13  8:13 [PATCH v4 0/5] sched: credit2: introduce per-vcpu hard and soft affinity Justin T. Weaver
                   ` (4 preceding siblings ...)
  2015-07-13  8:13 ` [PATCH v4 5/5] sched: credit2: add soft affinity awareness to function runq_tickle Justin T. Weaver
@ 2015-07-13 15:43 ` Dario Faggioli
  2015-09-14  9:03 ` Dario Faggioli
  6 siblings, 0 replies; 10+ messages in thread
From: Dario Faggioli @ 2015-07-13 15:43 UTC (permalink / raw)
  To: Justin T. Weaver; +Cc: george.dunlap, henric, xen-devel



On Sun, 2015-07-12 at 22:13 -1000, Justin T. Weaver wrote:
> Hello,
> 
Hey Justin!

Thanks a lot for having kept up with your great work up to this point,
and for sending this new series!

I, for now, just wanted to say one thing: we are right at the beginning
of the freeze period, for the upcoming release of Xen 4.6. This means
that a lot of developers' bandwidth will be devoted to actual 4.6
features, testing and bugfixing.

We'll (well, at least I will :-D) definitely review your work ASAP, but
expect some delays due to the above! :-/

Thanks again and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



* Re: [PATCH v4 0/5] sched: credit2: introduce per-vcpu hard and soft affinity
  2015-07-13  8:13 [PATCH v4 0/5] sched: credit2: introduce per-vcpu hard and soft affinity Justin T. Weaver
                   ` (5 preceding siblings ...)
  2015-07-13 15:43 ` [PATCH v4 0/5] sched: credit2: introduce per-vcpu hard and soft affinity Dario Faggioli
@ 2015-09-14  9:03 ` Dario Faggioli
  6 siblings, 0 replies; 10+ messages in thread
From: Dario Faggioli @ 2015-09-14  9:03 UTC (permalink / raw)
  To: Justin T. Weaver; +Cc: george.dunlap, henric, xen-devel



On Sun, 2015-07-12 at 22:13 -1000, Justin T. Weaver wrote:
> Hello,
> 
Hey Justin!

Sorry a ton for being so late again.

I'm now back at work after travelling and some vacations.

I'll be reviewing this ASAP, right after having done a bit of catch up
with the email backlog accumulated during this period... I should be
able to start the review today or, at most, tomorrow.

Thanks, sorry again, and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



* Re: [PATCH v4 1/5] sched: factor out VCPU2ONLINE to common header file
  2015-07-13  8:13 ` [PATCH v4 1/5] sched: factor out VCPU2ONLINE to common header file Justin T. Weaver
@ 2015-09-17 15:26   ` Dario Faggioli
  0 siblings, 0 replies; 10+ messages in thread
From: Dario Faggioli @ 2015-09-17 15:26 UTC (permalink / raw)
  To: Justin T. Weaver; +Cc: george.dunlap, henric, xen-devel



On Sun, 2015-07-12 at 22:13 -1000, Justin T. Weaver wrote:
> Move macro VCPU2ONLINE from schedule.c to sched.h so it can be used by other
> source files.
> 
"No functional changes intended."

> Signed-off-by: Justin T. Weaver <jtweaver@hawaii.edu>
>
(with the above)

Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



* Re: [PATCH v4 2/5] sched: credit2: respect per-vcpu hard affinity
  2015-07-13  8:13 ` [PATCH v4 2/5] sched: credit2: respect per-vcpu hard affinity Justin T. Weaver
@ 2015-09-18 22:12   ` Dario Faggioli
  0 siblings, 0 replies; 10+ messages in thread
From: Dario Faggioli @ 2015-09-18 22:12 UTC (permalink / raw)
  To: Justin T. Weaver, xen-devel; +Cc: george.dunlap, henric



And here we are, finally... :-/

On Sun, 2015-07-12 at 22:13 -1000, Justin T. Weaver wrote:
> by making sure that vcpus only run on the pcpu(s) they are allowed to
> run on based on their hard affinity cpu masks.
> 
> Signed-off-by: Justin T. Weaver <jtweaver@hawaii.edu>
>
So, this patch looks now good to me. I've found:
 - a few style issues (indentation)
 - a comment that I would reword
 - a trivial code issue (details below)

With all these addressed, this patch is:

Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>

> ---
> Changes in v4:
>  * Renamed scratch_mask to _scratch_mask
>  * Renamed csched2_cpumask to scratch_mask
>  * Removed "else continue" in function choose_cpu's for_each_cpu loop
> to make
>    the code less confusing
>  * Added an ASSERT that triggers if _scratch_mask[cpu] is NULL after
>    allocation in function csched2_alloc_pdata
>
> [...]
>
>  * Changed "run queue" in several comments to "runqueue"
>  * Renamed function valid_vcpu_migration to vcpu_is_migrateable
>  * Made condition check in function vcpu_is_migrateable "positive"
>
Ok, thanks for this really detailed list of what's changed in v4. It's
very thorough and, AFAICT, it covers all the review comments I can find
when looking back at the v3 submission.

> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c

> @@ -1091,45 +1131,55 @@ choose_cpu(const struct scheduler *ops,
> struct vcpu *vc)
>          {
>              printk("%s: Runqueue migrate aborted because target
> runqueue disappeared!\n",
>                     __func__);
> -            /* Fall-through to normal cpu pick */
>          }
>          else
>          {
> -            d2printk("%pv +\n", svc->vcpu);
> -            new_cpu = cpumask_cycle(vc->processor, &svc->migrate_rqd
> ->active);
> -            goto out_up;
> +            cpumask_and(scratch_mask, vc->cpu_hard_affinity,
> +                &svc->migrate_rqd->active);
>
Indentation.

          cpumask_and(scratch_mask, vc->cpu_hard_affinity,
                      &svc->migrate_rqd->active);

> @@ -1138,12 +1188,14 @@ choose_cpu(const struct scheduler *ops,
> struct vcpu *vc)
>          }
>      }
>  
> -    /* We didn't find anyone (most likely because of spinlock
> contention); leave it where it is */
> +    /* We didn't find anyone (most likely because of spinlock
> contention). */
>      if ( min_rqi == -1 )
> -        new_cpu = vc->processor;
> +        new_cpu = get_fallback_cpu(svc);
>      else
>      {
> -        new_cpu = cpumask_cycle(vc->processor, &prv
> ->rqd[min_rqi].active);
> +        cpumask_and(scratch_mask, vc->cpu_hard_affinity,
> +            &prv->rqd[min_rqi].active);
>
Here as well.
 
> @@ -1223,7 +1275,12 @@ static void migrate(const struct scheduler
> *ops,
>              on_runq=1;
>          }
>          __runq_deassign(svc);
> -        svc->vcpu->processor = cpumask_any(&trqd->active);
> +
> +        cpumask_and(scratch_mask, svc->vcpu->cpu_hard_affinity,
> +            &trqd->active);
>
And here.

> @@ -1237,6 +1294,17 @@ static void migrate(const struct scheduler
> *ops,
>      }
>  }
>  
> +/*
> + * Migration of svc to runqueue rqd is a valid option if svc is not
> already
> + * flagged to migrate and if svc is allowed to run on at least one
> of the
> + * pcpus assigned to rqd based on svc's hard affinity mask.
> + */
> +static bool_t vcpu_is_migrateable(struct csched2_vcpu *svc,
> +                                  struct csched2_runqueue_data *rqd)
> +{
> +    return !test_bit(__CSFLAG_runq_migrate_request, &svc->flags)
> +        && cpumask_intersects(svc->vcpu->cpu_hard_affinity, &rqd
> ->active);
>
This one, I'd make it look like this:

    return !test_bit(__CSFLAG_runq_migrate_request, &svc->flags) &&
           cpumask_intersects(svc->vcpu->cpu_hard_affinity, &rqd->active);


> @@ -1415,11 +1480,20 @@ csched2_vcpu_migrate(
>  
>      /* Check if new_cpu is valid */
>      BUG_ON(!cpumask_test_cpu(new_cpu, &CSCHED2_PRIV(ops)
> ->initialized));
> +    ASSERT(cpumask_test_cpu(new_cpu, vc->cpu_hard_affinity));
>  
>      trqd = RQD(ops, new_cpu);
>  
> +    /*
> +     * Without the processor assignment after the else, vc may be
> assigned to
> +     * a processor it is not allowed to run on. In that case,
> runq_candidate
> +     * might only get called for the old cpu, and vc will not get to
> run due
> +     * to the hard affinity check.
> +     */
>      if ( trqd != svc->rqd )
>          migrate(ops, svc, trqd, NOW());
> +    else
> +        vc->processor = new_cpu;
>  }
>  
About the comment. I don't like it describing what would happen if a
specific piece of code, which is there, were missing. I'd go for
something like this:

"Do the actual movement toward new_cpu, and update vc->processor. If we
are changing runqueue, migrate() takes care of everything. If we are
not changing runqueue, we need to update vc->processor here. In fact,
if, for instance, we are here because the vcpu's hard affinity is being
changed, we don't want to risk leaving vc->processor pointing to a pcpu
where the vcpu can't run any longer"

> @@ -2047,6 +2125,9 @@ static void init_pcpu(const struct scheduler
> *ops, int cpu)
>  
>      spin_unlock_irqrestore(&prv->lock, flags);
>  
> +    free_cpumask_var(_scratch_mask[cpu]);
> +    _scratch_mask[cpu] = NULL;
> +
>      return;
>  }
> 
And here we are: this is the only issue in the code (i.e., not about
comment wording or style issues) that I've found.

How come this hunk is in init_pcpu()? Shouldn't it live in
csched2_free_pdata()? I think it should...
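
Something along these lines, I mean (just sketching where I'd expect the two
lines to end up; the hook name is the one mentioned above):

    /* In csched2_free_pdata(), rather than in init_pcpu(): */
    free_cpumask_var(_scratch_mask[cpu]);
    _scratch_mask[cpu] = NULL;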

From the look of things, it is probably the result of a rebase of the
patch which went slightly wrong, without you noticing it. :-)

> @@ -2061,6 +2142,16 @@ csched2_alloc_pdata(const struct scheduler
> *ops, int cpu)
>          printk("%s: cpu %d not online yet, deferring
> initializatgion\n",
>                 __func__, cpu);
>  
> +    /*
> +     * For each new pcpu, allocate a cpumask_t for use throughout
> the
> +     * scheduler to avoid putting any cpumask_t structs on the
> stack.
>
"to avoid putting too many cpumask_t structs on the stack"

In fact, there are some already, and we're not removing them, we're
just avoiding adding more of them.

Thanks and Regards,
Dario 
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)




