From: George Dunlap <george.dunlap@citrix.com>
To: Dario Faggioli <dario.faggioli@citrix.com>,
	xen-devel@lists.xenproject.org
Cc: Anshul Makkar <anshul.makkar@citrix.com>,
	"Justin T. Weaver" <jtweaver@hawaii.edu>
Subject: Re: [PATCH 17/24] xen: credit2: soft-affinity awareness in runq_tickle()
Date: Wed, 28 Sep 2016 21:44:33 +0100
Message-ID: <325815e7-a1cf-5292-a496-04cd552b3f40@citrix.com>
In-Reply-To: <147145437291.25877.11396888641547651914.stgit@Solace.fritz.box>

[-- Attachment #1: Type: text/plain, Size: 3674 bytes --]

On 17/08/16 18:19, Dario Faggioli wrote:
> This is done by means of the "usual" two steps loop:
>  - soft affinity balance step;
>  - hard affinity balance step.
> 
> The entire logic implemented in runq_tickle() is
> applied, during the first step, considering only the
> CPUs in the vcpu's soft affinity. In the second step,
> we fall back to use all the CPUs from its hard
> affinity (as it is doing now, without this patch).
> 
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> Signed-off-by: Justin T. Weaver <jtweaver@hawaii.edu>
> ---
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Anshul Makkar <anshul.makkar@citrix.com>
> ---
>  xen/common/sched_credit2.c |  243 ++++++++++++++++++++++++++++----------------
>  1 file changed, 157 insertions(+), 86 deletions(-)
> 
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index 0d83bd7..3aef1b4 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -902,6 +902,42 @@ __runq_remove(struct csched2_vcpu *svc)
>      list_del_init(&svc->runq_elem);
>  }
>  
> +/*
> + * During the soft-affinity step, only actually preempt someone if
> + * he does not have soft-affinity with cpu (while we have).
> + *
> + * BEWARE that this uses cpumask_scratch, throwing away what's in there!
> + */
> +static inline bool_t soft_aff_check_preempt(unsigned int bs, unsigned int cpu)
> +{
> +    struct csched2_vcpu * cur = CSCHED2_VCPU(curr_on_cpu(cpu));
> +
> +    /*
> +     * If we're doing hard-affinity, always check whether to preempt cur.
> +     * If we're doing soft-affinity, but cur doesn't have one, check as well.
> +     */
> +    if ( bs == BALANCE_HARD_AFFINITY ||
> +         !has_soft_affinity(cur->vcpu, cur->vcpu->cpu_hard_affinity) )
> +        return 1;
> +
> +    /*
> +     * We're doing soft-affinity, and we know that the current vcpu on cpu
> +     * has a soft affinity. We now want to know whether cpu itself is in
> +     * such affinity. In fact, since we know that new (in runq_tickle()) is:

This is a bit confusing.  I think you mean, "We know that the vcpu we
want to place has soft affinity with the target cpu; now we want to know
whether the vcpu running on the target cpu has soft affinity with that
cpu or not."

> +     *  - if cpu is not in cur's soft-affinity, we should indeed check to
> +     *    see whether new should preempt cur. If that will be the case, that
> +     *    would be an improvement wrt respecting soft affinity;
> +     *  - if cpu is in cur's soft-affinity, we leave it alone and (in
> +     *    runq_tickle()) move on to another cpu. In fact, we don't want to
> +     *    be too harsh with someone who is running within its soft-affinity.
> +     *    This is safe because later, if we don't find anyone else during the
> +     *    soft-affinity step, we will check cpu for preemption anyway, when
> +     *    doing hard-affinity.

But by doing this, isn't it actually more likely that we'll end up
somewhere outside our soft affinity, even though there are cpus inside
our soft affinity where we have higher credit?

It would be nice if we could pre-empt somebody outside their soft
affinity before pre-empting somebody inside their soft affinity.

It occurs to me -- in the normal case, the number of cpus involved here
should be fewer than 8, and often fewer still.  Rather than loop
around twice, with potentially two "inner" loops, would it make sense
just to sweep through the hard affinity once, calculating a "score" that
factors in everything we care about, and then choosing the highest
score (if any)?

Something like the attached (compile-tested only)?
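
For illustration only, here is a standalone sketch of the single-sweep
idea, separate from the attached patch and not Xen code.  The struct cand
type, the 64-bit masks standing in for cpumasks, and the MIGRATE_RESIST
and AFFINITY_BONUS values are all assumptions made up for the example;
it also assumes at most 64 cpus.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the scheduler's constants (values made up). */
#define MIGRATE_RESIST  500
#define AFFINITY_BONUS  10000000

/* Hypothetical per-cpu snapshot of the vcpu currently running there. */
struct cand {
    int64_t  credit;     /* credit of the vcpu running on this cpu */
    uint64_t soft_mask;  /* that vcpu's soft affinity, one bit per cpu */
};

/* Score for preempting 'cpu'; a result <= 0 means "do not preempt". */
static int64_t score_cpu(const struct cand *cur, unsigned int cpu,
                         int64_t new_credit, uint64_t new_soft,
                         unsigned int new_cpu)
{
    int64_t score = new_credit - cur->credit;

    /* Moving to a different cpu has to be worth the migration. */
    if (cpu != new_cpu)
        score -= MIGRATE_RESIST;

    /* Bonuses apply only if the credit check already passed. */
    if (score > 0 && (new_soft & (UINT64_C(1) << cpu))) {
        score += AFFINITY_BONUS;          /* cpu is in new's soft affinity */
        if (!(cur->soft_mask & (UINT64_C(1) << cpu)))
            score += AFFINITY_BONUS;      /* ...and not in cur's */
    }
    return score;
}

/* One sweep over the cpus in hard_mask; returns the best cpu, or -1. */
static int pick_cpu(const struct cand *cands, unsigned int ncpus,
                    uint64_t hard_mask, int64_t new_credit,
                    uint64_t new_soft, unsigned int new_cpu)
{
    int best = -1;
    int64_t max = 0;

    for (unsigned int i = 0; i < ncpus; i++) {
        if (!(hard_mask & (UINT64_C(1) << i)))
            continue;
        int64_t s = score_cpu(&cands[i], i, new_credit, new_soft, new_cpu);
        if (s > max) {
            max = s;
            best = (int)i;
        }
    }
    return best;
}
```

As long as credit differences stay below AFFINITY_BONUS, this prefers a
cpu whose current vcpu is outside its own soft affinity over one whose
vcpu is inside it, and either of those over a cpu outside new's soft
affinity.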

 -George



[-- Attachment #2: 0001-xen-credit2-soft-affinity-awareness-in-runq_tickle.patch --]
[-- Type: text/x-diff; name="0001-xen-credit2-soft-affinity-awareness-in-runq_tickle.patch", Size: 6835 bytes --]

From 93346e02da5def9c1bca502e0e47aa8be9b3f2a6 Mon Sep 17 00:00:00 2001
From: Dario Faggioli <dario.faggioli@citrix.com>
Date: Thu, 15 Sep 2016 12:35:05 +0100
Subject: [PATCH] xen: credit2: soft-affinity awareness in runq_tickle()

Rather than the usual two-step loop, after first checking for idlers,
we scan through each cpu in the runqueue and find a "score" for the
utility of tickling each cpu.

FIXME - needs filling out. :-)

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
 xen/common/sched_credit2.c | 126 ++++++++++++++++++++++++++++++---------------
 1 file changed, 84 insertions(+), 42 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 0d83bd7..36acf82 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -904,6 +904,64 @@ __runq_remove(struct csched2_vcpu *svc)
 
 void burn_credits(struct csched2_runqueue_data *rqd, struct csched2_vcpu *, s_time_t);
 
+/* 
+ * Score to preempt the target cpu.  Return a non-positive number if
+ * the credit isn't high enough; if it is, favor preemptions in this
+ * order:
+ * - cpu is in new's soft affinity, not in cur's soft affinity
+ * - cpu is in new's soft affinity and cur's soft affinity
+ * - cpu is not in new's soft affinity
+ * - Within the same class, the highest difference of credit
+ */
+static s_time_t tickle_score(struct csched2_runqueue_data *rqd, s_time_t now,
+                             struct csched2_vcpu *new, unsigned int cpu)
+{
+    struct csched2_vcpu * cur;
+    s_time_t score;
+    
+    cur = CSCHED2_VCPU(curr_on_cpu(cpu));
+
+    burn_credits(rqd, cur, now);
+
+    score = new->credit - cur->credit;
+
+    if ( new->vcpu->processor != cpu )
+        score -= CSCHED2_MIGRATE_RESIST;
+
+    /* 
+     * At this point, if cur->credit + RESISTANCE >= new->credit,
+     * score will be negative (or zero), so it cannot beat the caller's
+     * initial max of 0 and ipid keeps its default of -1.
+     *
+     * Otherwise, add bonuses for soft affinities.
+     */
+    
+    if ( score > 0 && cpumask_test_cpu(cpu, new->vcpu->cpu_soft_affinity) )
+    {
+        score += CSCHED2_CREDIT_INIT;
+        if ( !cpumask_test_cpu(cpu, cur->vcpu->cpu_soft_affinity) )
+            score += CSCHED2_CREDIT_INIT;
+    }
+
+    if ( unlikely(tb_init_done) )
+    {
+        struct {
+            unsigned vcpu:16, dom:16;
+            unsigned cpu, credit, score;
+        } d;
+        d.dom = cur->vcpu->domain->domain_id;
+        d.vcpu = cur->vcpu->vcpu_id;
+        d.credit = cur->credit;
+        d.score = score;
+        d.cpu = cpu;
+        __trace_var(TRC_CSCHED2_TICKLE_CHECK, 1,
+                    sizeof(d),
+                    (unsigned char *)&d);
+    }
+        
+    return score;
+}
+
 /*
  * Check what processor it is best to 'wake', for picking up a vcpu that has
  * just been put (back) in the runqueue. Logic is as follows:
@@ -924,11 +982,10 @@ static void
 runq_tickle(const struct scheduler *ops, struct csched2_vcpu *new, s_time_t now)
 {
     int i, ipid = -1;
-    s_time_t lowest = (1<<30);
+    s_time_t max = 0;
     unsigned int cpu = new->vcpu->processor;
     struct csched2_runqueue_data *rqd = RQD(ops, cpu);
     cpumask_t mask;
-    struct csched2_vcpu * cur;
 
     ASSERT(new->rqd == rqd);
 
@@ -959,7 +1016,7 @@ runq_tickle(const struct scheduler *ops, struct csched2_vcpu *new, s_time_t now)
         cpumask_andnot(&mask, &rqd->idle, &rqd->smt_idle);
     else
         cpumask_copy(&mask, &rqd->smt_idle);
-    cpumask_and(&mask, &mask, new->vcpu->cpu_hard_affinity);
+    cpumask_and(&mask, &mask, new->vcpu->cpu_soft_affinity);
     i = cpumask_test_or_cycle(cpu, &mask);
     if ( i < nr_cpu_ids )
     {
@@ -974,7 +1031,7 @@ runq_tickle(const struct scheduler *ops, struct csched2_vcpu *new, s_time_t now)
      * gone through the scheduler yet.
      */
     cpumask_andnot(&mask, &rqd->idle, &rqd->tickled);
-    cpumask_and(&mask, &mask, new->vcpu->cpu_hard_affinity);
+    cpumask_and(&mask, &mask, new->vcpu->cpu_soft_affinity);
     i = cpumask_test_or_cycle(cpu, &mask);
     if ( i < nr_cpu_ids )
     {
@@ -993,63 +1050,48 @@ runq_tickle(const struct scheduler *ops, struct csched2_vcpu *new, s_time_t now)
     cpumask_and(&mask, &mask, new->vcpu->cpu_hard_affinity);
     if ( cpumask_test_cpu(cpu, &mask) )
     {
-        cur = CSCHED2_VCPU(curr_on_cpu(cpu));
-        burn_credits(rqd, cur, now);
+        s_time_t score = tickle_score(rqd, now, new, cpu);
 
-        if ( cur->credit < new->credit )
+        if ( score > max )
         {
-            SCHED_STAT_CRANK(tickled_busy_cpu);
+            max = score;
             ipid = cpu;
-            goto tickle;
+            
+            /* If this is in the vcpu's soft affinity, just take it */
+            if ( cpumask_test_cpu(cpu, new->vcpu->cpu_soft_affinity) )
+                goto tickle;
         }
     }
 
     for_each_cpu(i, &mask)
     {
+        s_time_t score;
+        
         /* Already looked at this one above */
         if ( i == cpu )
             continue;
 
-        cur = CSCHED2_VCPU(curr_on_cpu(i));
-
-        ASSERT(!is_idle_vcpu(cur->vcpu));
-
-        /* Update credits for current to see if we want to preempt. */
-        burn_credits(rqd, cur, now);
-
-        if ( cur->credit < lowest )
-        {
-            ipid = i;
-            lowest = cur->credit;
-        }
+        /* 
+         * This will factor in both our soft affinity and the soft
+         * affinity of the vcpu currently running on i.
+         */
+        score = tickle_score(rqd, now, new, i);
 
-        if ( unlikely(tb_init_done) )
+        if ( score > max )
         {
-            struct {
-                unsigned vcpu:16, dom:16;
-                unsigned cpu, credit;
-            } d;
-            d.dom = cur->vcpu->domain->domain_id;
-            d.vcpu = cur->vcpu->vcpu_id;
-            d.credit = cur->credit;
-            d.cpu = i;
-            __trace_var(TRC_CSCHED2_TICKLE_CHECK, 1,
-                        sizeof(d),
-                        (unsigned char *)&d);
+            max = score;
+            ipid = i;
         }
     }
-
-    /*
-     * Only switch to another processor if the credit difference is
-     * greater than the migrate resistance.
-     */
-    if ( ipid == -1 || lowest + CSCHED2_MIGRATE_RESIST > new->credit )
+        
+    if ( ipid != -1 )
     {
-        SCHED_STAT_CRANK(tickled_no_cpu);
-        return;
+        SCHED_STAT_CRANK(tickled_busy_cpu);
+        goto tickle;
     }
 
-    SCHED_STAT_CRANK(tickled_busy_cpu);
+    SCHED_STAT_CRANK(tickled_no_cpu);
+    return;
  tickle:
     BUG_ON(ipid == -1);
 
-- 
2.1.4

