All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dario Faggioli <dario.faggioli@citrix.com>
To: xen-devel@lists.xenproject.org
Cc: Anshul Makkar <anshul.makkar@citrix.com>,
	"Justin T. Weaver" <jtweaver@hawaii.edu>,
	George Dunlap <george.dunlap@citrix.com>
Subject: [PATCH 18/24] xen: credit2: soft-affinity awareness fallback_cpu() and cpu_pick()
Date: Wed, 17 Aug 2016 19:19:40 +0200	[thread overview]
Message-ID: <147145438023.25877.4564307455358470316.stgit@Solace.fritz.box> (raw)
In-Reply-To: <147145358844.25877.7490417583264534196.stgit@Solace.fritz.box>

For get_fallback_cpu(), by putting in place the "usual"
two steps (soft affinity step and hard affinity step)
loop. We just move the core logic of the function inside
the body of the loop itself.

For csched2_cpu_pick(), what is important is to find
the runqueue with the least average load. Currently,
we do that by looping on all runqueues and checking,
well, their load. For soft affinity, we want to know
which one is the runqueue with the least load, among
the ones where the vcpu would prefer to be assigned.

We find both the least loaded runqueue among the soft
affinity "friendly" ones, and the overall least loaded
one, in the same pass.

(Also, kill a spurious ';' when defining MAX_LOAD.)

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Signed-off-by: Justin T. Weaver <jtweaver@hawaii.edu>
---
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Anshul Makkar <anshul.makkar@citrix.com>
---
 xen/common/sched_credit2.c |  136 ++++++++++++++++++++++++++++++++++++--------
 1 file changed, 111 insertions(+), 25 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 3aef1b4..2d7228a 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -506,34 +506,68 @@ void smt_idle_mask_clear(unsigned int cpu, cpumask_t *mask)
 }
 
 /*
- * When a hard affinity change occurs, we may not be able to check some
- * (any!) of the other runqueues, when looking for the best new processor
- * for svc (as trylock-s in csched2_cpu_pick() can fail). If that happens, we
- * pick, in order of decreasing preference:
- *  - svc's current pcpu;
- *  - another pcpu from svc's current runq;
- *  - any cpu.
+ * In csched2_cpu_pick(), it may not be possible to actually look at remote
+ * runqueues (the trylock-s on their spinlocks can fail!). If that happens,
+ * we pick, in order of decreasing preference:
+ *  1) svc's current pcpu, if it is part of svc's soft affinity;
+ *  2) a pcpu in svc's current runqueue that is also in svc's soft affinity;
+ *  3) just one valid pcpu from svc's soft affinity;
+ *  4) svc's current pcpu, if it is part of svc's hard affinity;
+ *  5) a pcpu in svc's current runqueue that is also in svc's hard affinity;
+ *  6) just one valid pcpu from svc's hard affinity
+ *
+ * Of course, 1, 2 and 3 makes sense only if svc has a soft affinity. Also
+ * note that at least 6 is guaranteed to _always_ return at least one pcpu.
  */
 static int get_fallback_cpu(struct csched2_vcpu *svc)
 {
     int cpu;
+    unsigned int bs;
 
-    if ( likely(cpumask_test_cpu(svc->vcpu->processor,
-                                 svc->vcpu->cpu_hard_affinity)) )
-        return svc->vcpu->processor;
+    for_each_affinity_balance_step( bs )
+    {
+        if ( bs == BALANCE_SOFT_AFFINITY &&
+             !has_soft_affinity(svc->vcpu, svc->vcpu->cpu_hard_affinity) )
+            continue;
 
-    cpumask_and(cpumask_scratch, svc->vcpu->cpu_hard_affinity,
-                &svc->rqd->active);
-    cpu = cpumask_first(cpumask_scratch);
-    if ( likely(cpu < nr_cpu_ids) )
-        return cpu;
+        affinity_balance_cpumask(svc->vcpu, bs, cpumask_scratch);
 
-    cpumask_and(cpumask_scratch, svc->vcpu->cpu_hard_affinity,
-                cpupool_domain_cpumask(svc->vcpu->domain));
+        /*
+         * This is cases 1 or 4 (depending on bs): if v->processor is (still)
+         * in our affinity, go for it, for cache betterness.
+         */
+        if ( likely(cpumask_test_cpu(svc->vcpu->processor,
+                                     cpumask_scratch)) )
+            return svc->vcpu->processor;
 
-    ASSERT(!cpumask_empty(cpumask_scratch));
+        /*
+         * This is cases 2 or 5 (depending on bsp): v->processor isn't there
+         * any longer, check if we at least can stay in our current runq.
+         */
+        cpumask_and(cpumask_scratch, cpumask_scratch,
+                    &svc->rqd->active);
+        cpu = cpumask_first(cpumask_scratch);
+        if ( likely(cpu < nr_cpu_ids) )
+            return cpu;
 
-    return cpumask_first(cpumask_scratch);
+        /*
+         * This is cases 3 or 6 (depending on bs): last stand, just one valid
+         * pcpu from our soft affinity, if we have one and if there's any. In
+         * fact, if we are doing soft-affinity, it is possible that we fail,
+         * which means we stay in the loop and look for hard affinity. OTOH,
+         * if we are at the hard-affinity balancing step, it's guaranteed that
+         * there is at least one valid cpu, and therefore we are sure that we
+         * return it, and never really exit the loop.
+         */
+        cpumask_and(cpumask_scratch, cpumask_scratch,
+                    cpupool_domain_cpumask(svc->vcpu->domain));
+        ASSERT(!cpumask_empty(cpumask_scratch) || bs == BALANCE_SOFT_AFFINITY);
+        cpu = cpumask_first(cpumask_scratch);
+        if ( likely(cpu < nr_cpu_ids) )
+            return cpu;
+    }
+    BUG_ON(1);
+    return -1;
 }
 
 /*
@@ -1561,14 +1595,15 @@ csched2_context_saved(const struct scheduler *ops, struct vcpu *vc)
     vcpu_schedule_unlock_irq(lock, vc);
 }
 
-#define MAX_LOAD (STIME_MAX);
+#define MAX_LOAD (STIME_MAX)
 static int
 csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
 {
     struct csched2_private *prv = CSCHED2_PRIV(ops);
-    int i, min_rqi = -1, new_cpu;
+    int i, min_rqi = -1, min_s_rqi = -1, new_cpu;
     struct csched2_vcpu *svc = CSCHED2_VCPU(vc);
-    s_time_t min_avgload = MAX_LOAD;
+    s_time_t min_avgload = MAX_LOAD, min_s_avgload = MAX_LOAD;
+    bool_t has_soft;
 
     ASSERT(!cpumask_empty(&prv->active_queues));
 
@@ -1613,6 +1648,12 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
         }
         else
         {
+            /*
+             * If we've been asked to move to migrate_rqd, we should just do
+             * that, which we actually do by returning one cpu from that runq.
+             * There is no need to take care of soft affinity, as that will
+             * happen in runq_tickle().
+             */
             cpumask_and(cpumask_scratch, vc->cpu_hard_affinity,
                         &svc->migrate_rqd->active);
             new_cpu = cpumask_any(cpumask_scratch);
@@ -1622,7 +1663,21 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
         /* Fall-through to normal cpu pick */
     }
 
-    /* Find the runqueue with the lowest average load. */
+    has_soft = has_soft_affinity(vc, vc->cpu_hard_affinity);
+    if ( has_soft )
+        affinity_balance_cpumask(vc, BALANCE_SOFT_AFFINITY, cpumask_scratch);
+
+    /*
+     * What we want is:
+     *  - if we have soft affinity, the runqueue with the lowest average
+     *    load, among the ones that contain cpus in our soft affinity; this
+     *    represents the best runq on which we would want to run.
+     *  - the runqueue with the lowest average load among the ones that
+     *    contains cpus in our hard affinity; this represent the best runq
+     *    on which we can run.
+     *
+     * Find both runqueues in one pass.
+     */
     for_each_cpu(i, &prv->active_queues)
     {
         struct csched2_runqueue_data *rqd;
@@ -1656,6 +1711,13 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
             spin_unlock(&rqd->lock);
         }
 
+        if ( has_soft &&
+             rqd_avgload < min_s_avgload &&
+             cpumask_intersects(cpumask_scratch, &rqd->active) )
+        {
+            min_s_avgload = rqd_avgload;
+            min_s_rqi = i;
+        }
         if ( rqd_avgload < min_avgload )
         {
             min_avgload = rqd_avgload;
@@ -1663,9 +1725,33 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
         }
     }
 
-    /* We didn't find anyone (most likely because of spinlock contention). */
-    if ( min_rqi == -1 )
+    if ( has_soft && min_s_rqi != -1 )
+    {
+        /*
+         * We have soft affinity, and we have a candidate runq, so go for it.
+         *
+         * Note that, since has_soft is true, cpumask_scratch holds the proper
+         * soft-affinity mask.
+         */
+        cpumask_and(cpumask_scratch, cpumask_scratch,
+                    &prv->rqd[min_s_rqi].active);
+    }
+    else if ( min_rqi != -1 )
     {
+        /*
+         * Either we don't have soft affinity, or we do, but we did not find
+         * any suitable runq. But we did find one when considering hard
+         * affinity, so go for it.
+         */
+        cpumask_and(cpumask_scratch, vc->cpu_hard_affinity,
+                    &prv->rqd[min_rqi].active);
+    }
+    else
+    {
+        /*
+         * We didn't find anyone at all (most likely because of spinlock
+         * contention).
+         */
         new_cpu = get_fallback_cpu(svc);
         min_rqi = c2r(ops, new_cpu);
         min_avgload = prv->rqd[min_rqi].b_avgload;


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

  parent reply	other threads:[~2016-08-17 17:19 UTC|newest]

Thread overview: 84+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-17 17:17 [PATCH 00/24] sched: Credit1 and Credit2 improvements... and soft-affinity for Credit2! Dario Faggioli
2016-08-17 17:17 ` [PATCH 01/24] xen: credit1: small optimization in Credit1's tickling logic Dario Faggioli
2016-09-12 15:01   ` George Dunlap
2016-08-17 17:17 ` [PATCH 02/24] xen: credit1: fix mask to be used for tickling in Credit1 Dario Faggioli
2016-08-17 23:42   ` Dario Faggioli
2016-09-12 15:04     ` George Dunlap
2016-08-17 17:17 ` [PATCH 03/24] xen: credit1: return the 'time remaining to the limit' as next timeslice Dario Faggioli
2016-09-12 15:14   ` George Dunlap
2016-09-12 17:00     ` Dario Faggioli
2016-09-14  9:34       ` George Dunlap
2016-09-14 13:54         ` Dario Faggioli
2016-08-17 17:18 ` [PATCH 04/24] xen: credit2: properly schedule migration of a running vcpu Dario Faggioli
2016-09-12 17:11   ` George Dunlap
2016-08-17 17:18 ` [PATCH 05/24] xen: credit2: make tickling more deterministic Dario Faggioli
2016-08-31 17:10   ` anshul makkar
2016-09-05 13:47     ` Dario Faggioli
2016-09-07 12:25       ` anshul makkar
2016-09-13 11:13       ` George Dunlap
2016-09-29 15:24         ` Dario Faggioli
2016-09-13 11:28   ` George Dunlap
2016-09-30  2:22     ` Dario Faggioli
2016-08-17 17:18 ` [PATCH 06/24] xen: credit2: implement yield() Dario Faggioli
2016-09-13 13:33   ` George Dunlap
2016-09-29 16:05     ` Dario Faggioli
2016-09-20 13:25   ` George Dunlap
2016-09-20 13:37     ` George Dunlap
2016-08-17 17:18 ` [PATCH 07/24] xen: sched: don't rate limit context switches in case of yields Dario Faggioli
2016-09-20 13:32   ` George Dunlap
2016-09-29 16:46     ` Dario Faggioli
2016-08-17 17:18 ` [PATCH 08/24] xen: tracing: add trace records for schedule and rate-limiting Dario Faggioli
2016-08-18  0:57   ` Meng Xu
2016-08-18  9:41     ` Dario Faggioli
2016-09-20 13:50   ` George Dunlap
2016-08-17 17:18 ` [PATCH 09/24] xen/tools: tracing: improve tracing of context switches Dario Faggioli
2016-09-20 14:08   ` George Dunlap
2016-08-17 17:18 ` [PATCH 10/24] xen: tracing: improve Credit2's tickle_check and burn_credits records Dario Faggioli
2016-09-20 14:35   ` George Dunlap
2016-09-29 17:23     ` Dario Faggioli
2016-09-29 17:28       ` George Dunlap
2016-09-29 20:53         ` Dario Faggioli
2016-08-17 17:18 ` [PATCH 11/24] tools: tracing: handle more scheduling related events Dario Faggioli
2016-09-20 14:37   ` George Dunlap
2016-08-17 17:18 ` [PATCH 12/24] xen: libxc: allow to set the ratelimit value online Dario Faggioli
2016-09-20 14:43   ` George Dunlap
2016-09-20 14:45     ` Wei Liu
2016-09-28 15:44   ` George Dunlap
2016-08-17 17:19 ` [PATCH 13/24] libxc: improve error handling of xc Credit1 and Credit2 helpers Dario Faggioli
2016-09-20 15:10   ` Wei Liu
2016-08-17 17:19 ` [PATCH 14/24] libxl: allow to set the ratelimit value online for Credit2 Dario Faggioli
2016-08-22  9:21   ` Ian Jackson
2016-09-05 14:02     ` Dario Faggioli
2016-08-22  9:28   ` Ian Jackson
2016-09-28 15:37     ` George Dunlap
2016-09-30  1:03     ` Dario Faggioli
2016-09-28 15:39   ` George Dunlap
2016-08-17 17:19 ` [PATCH 15/24] xl: " Dario Faggioli
2016-09-28 15:46   ` George Dunlap
2016-08-17 17:19 ` [PATCH 16/24] xen: sched: factor affinity helpers out of sched_credit.c Dario Faggioli
2016-09-28 15:49   ` George Dunlap
2016-08-17 17:19 ` [PATCH 17/24] xen: credit2: soft-affinity awareness in runq_tickle() Dario Faggioli
2016-09-01 10:52   ` anshul makkar
2016-09-05 14:55     ` Dario Faggioli
2016-09-07 13:24       ` anshul makkar
2016-09-07 13:31         ` Dario Faggioli
2016-09-28 20:44   ` George Dunlap
2016-08-17 17:19 ` Dario Faggioli [this message]
2016-09-01 11:08   ` [PATCH 18/24] xen: credit2: soft-affinity awareness fallback_cpu() and cpu_pick() anshul makkar
2016-09-05 13:26     ` Dario Faggioli
2016-09-07 12:52       ` anshul makkar
2016-09-29 11:11   ` George Dunlap
2016-08-17 17:19 ` [PATCH 19/24] xen: credit2: soft-affinity awareness in load balancing Dario Faggioli
2016-09-02 11:46   ` anshul makkar
2016-09-05 12:49     ` Dario Faggioli
2016-08-17 17:19 ` [PATCH 20/24] xen: credit2: kick away vcpus not running within their soft-affinity Dario Faggioli
2016-08-17 17:20 ` [PATCH 21/24] xen: credit2: optimize runq_candidate() a little bit Dario Faggioli
2016-08-17 17:20 ` [PATCH 22/24] xen: credit2: "relax" CSCHED2_MAX_TIMER Dario Faggioli
2016-09-30 15:30   ` George Dunlap
2016-08-17 17:20 ` [PATCH 23/24] xen: credit2: optimize runq_tickle() a little bit Dario Faggioli
2016-09-02 12:38   ` anshul makkar
2016-09-05 12:52     ` Dario Faggioli
2016-08-17 17:20 ` [PATCH 24/24] xen: credit2: try to avoid tickling cpus subject to ratelimiting Dario Faggioli
2016-08-18  0:11 ` [PATCH 00/24] sched: Credit1 and Credit2 improvements... and soft-affinity for Credit2! Dario Faggioli
2016-08-18 11:49 ` Dario Faggioli
2016-08-18 11:53 ` Dario Faggioli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=147145438023.25877.4564307455358470316.stgit@Solace.fritz.box \
    --to=dario.faggioli@citrix.com \
    --cc=anshul.makkar@citrix.com \
    --cc=george.dunlap@citrix.com \
    --cc=jtweaver@hawaii.edu \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.