All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dario Faggioli <dario.faggioli@citrix.com>
To: xen-devel@lists.xenproject.org
Cc: George Dunlap <george.dunlap@eu.citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Anshul Makkar <anshul.makkar@citrix.com>,
	Jan Beulich <jbeulich@suse.com>
Subject: [PATCH 06/24] xen: credit2: implement yield()
Date: Wed, 17 Aug 2016 19:18:15 +0200	[thread overview]
Message-ID: <147145429544.25877.17160453823324324872.stgit@Solace.fritz.box> (raw)
In-Reply-To: <147145358844.25877.7490417583264534196.stgit@Solace.fritz.box>

When a vcpu explicitly yields it is usually giving
us an advice of "let someone else run and come back
to me in a bit."

Credit2 isn't, so far, doing anything when a vcpu
yields, which means an yield is basically a NOP (well,
actually, it's pure overhead, as it causes the scheduler
kick in, but the result is --at least 99% of the time--
that the very same vcpu that yielded continues to run).

Implement a "preempt bias", to be applied to yielding
vcpus. Basically when evaluating what vcpu to run next,
if a vcpu that has just yielded is encountered, we give
it a credit penalty, and check whether there is anyone
else that would better take over the cpu (of course,
if there isn't the yielding vcpu will continue).

The value of this bias can be configured with a boot
time parameter, and the default is set to 1 ms.

Also, add an yield performance counter, and fix the
style of a couple of comments.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
---
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Anshul Makkar <anshul.makkar@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
Note that this *only* consider the bias during the very scheduling decision
that retults from the vcpu calling yield. After that, the __CSFLAG_vcpu_yield
flag is reset, and during all furute scheduling decisions, the vcpu will
compete with the other ones with its own amount of credits.

Alternatively, we can actually _subtract_ some credits to a yielding vcpu.
That will sort of make the effect of a call to yield last in time.

I'm not sure which path is best. Personally, I like the subtract approach
(perhaps, with a smaller bias than 1ms), but I think the "one shot" behavior
implemented here is a good starting point. It is _something_, which is better
than nothing, which is what we have without this patch! :-) It's lightweight
(in its impact on the crediting algorithm, I mean), and benchmarks looks nice,
so I propose we go for this one, and explore the "permanent" --subtraction
based-- solution a bit more.
---
 docs/misc/xen-command-line.markdown |   10 ++++++
 xen/common/sched_credit2.c          |   62 +++++++++++++++++++++++++++++++----
 xen/common/schedule.c               |    2 +
 xen/include/xen/perfc_defn.h        |    1 +
 4 files changed, 68 insertions(+), 7 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 3a250cb..5f469b1 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1389,6 +1389,16 @@ Choose the default scheduler.
 ### sched\_credit2\_migrate\_resist
 > `= <integer>`
 
+### sched\_credit2\_yield\_bias
+> `= <integer>`
+
+> Default: `1000`
+
+Set how much a yielding vcpu will be penalized, in order to actually
+give a chance to run to some other vcpu. This is basically a bias, in
+favour of the non-yielding vcpus, expressed in microseconds (default
+is 1ms).
+
 ### sched\_credit\_tslice\_ms
 > `= <integer>`
 
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index a3d7beb..569174b 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -144,6 +144,9 @@
 #define CSCHED2_MIGRATE_RESIST       ((opt_migrate_resist)*MICROSECS(1))
 /* How much to "compensate" a vcpu for L2 migration */
 #define CSCHED2_MIGRATE_COMPENSATION MICROSECS(50)
+/* How big of a bias we should have against a yielding vcpu */
+#define CSCHED2_YIELD_BIAS           ((opt_yield_bias)*MICROSECS(1))
+#define CSCHED2_YIELD_BIAS_MIN       CSCHED2_MIN_TIMER
 /* Reset: Value below which credit will be reset. */
 #define CSCHED2_CREDIT_RESET         0
 /* Max timer: Maximum time a guest can be run for. */
@@ -181,11 +184,20 @@
  */
 #define __CSFLAG_runq_migrate_request 3
 #define CSFLAG_runq_migrate_request (1<<__CSFLAG_runq_migrate_request)
-
+/*
+ * CSFLAG_vcpu_yield: this vcpu was running, and has called vcpu_yield(). The
+ * scheduler is invoked to see if we can give the cpu to someone else, and
+ * get back to the yielding vcpu in a while.
+ */
+#define __CSFLAG_vcpu_yield 4
+#define CSFLAG_vcpu_yield (1<<__CSFLAG_vcpu_yield)
 
 static unsigned int __read_mostly opt_migrate_resist = 500;
 integer_param("sched_credit2_migrate_resist", opt_migrate_resist);
 
+static unsigned int __read_mostly opt_yield_bias = 1000;
+integer_param("sched_credit2_yield_bias", opt_yield_bias);
+
 /*
  * Useful macros
  */
@@ -1432,6 +1444,14 @@ out:
 }
 
 static void
+csched2_vcpu_yield(const struct scheduler *ops, struct vcpu *v)
+{
+    struct csched2_vcpu * const svc = CSCHED2_VCPU(v);
+
+    __set_bit(__CSFLAG_vcpu_yield, &svc->flags);
+}
+
+static void
 csched2_context_saved(const struct scheduler *ops, struct vcpu *vc)
 {
     struct csched2_vcpu * const svc = CSCHED2_VCPU(vc);
@@ -2247,10 +2267,22 @@ runq_candidate(struct csched2_runqueue_data *rqd,
     struct list_head *iter;
     struct csched2_vcpu *snext = NULL;
     struct csched2_private *prv = CSCHED2_PRIV(per_cpu(scheduler, cpu));
+    int yield_bias = 0;
 
     /* Default to current if runnable, idle otherwise */
     if ( vcpu_runnable(scurr->vcpu) )
+    {
+        /*
+         * The way we actually take yields into account is like this:
+         * if scurr is yielding, when comparing its credits with other
+         * vcpus in the runqueue, act like those other vcpus had yield_bias
+         * more credits.
+         */
+        if ( unlikely(scurr->flags & CSFLAG_vcpu_yield) )
+            yield_bias = CSCHED2_YIELD_BIAS;
+
         snext = scurr;
+    }
     else
         snext = CSCHED2_VCPU(idle_vcpu[cpu]);
 
@@ -2268,6 +2300,7 @@ runq_candidate(struct csched2_runqueue_data *rqd,
     list_for_each( iter, &rqd->runq )
     {
         struct csched2_vcpu * svc = list_entry(iter, struct csched2_vcpu, runq_elem);
+        int svc_credit = svc->credit + yield_bias;
 
         /* Only consider vcpus that are allowed to run on this processor. */
         if ( !cpumask_test_cpu(cpu, svc->vcpu->cpu_hard_affinity) )
@@ -2288,19 +2321,23 @@ runq_candidate(struct csched2_runqueue_data *rqd,
             continue;
         }
 
-        /* If this is on a different processor, don't pull it unless
-         * its credit is at least CSCHED2_MIGRATE_RESIST higher. */
+        /*
+         * If this is on a different processor, don't pull it unless
+         * its credit is at least CSCHED2_MIGRATE_RESIST higher.
+         */
         if ( svc->vcpu->processor != cpu
-             && snext->credit + CSCHED2_MIGRATE_RESIST > svc->credit )
+             && snext->credit + CSCHED2_MIGRATE_RESIST > svc_credit )
         {
             (*pos)++;
             SCHED_STAT_CRANK(migrate_resisted);
             continue;
         }
 
-        /* If the next one on the list has more credit than current
-         * (or idle, if current is not runnable), choose it. */
-        if ( svc->credit > snext->credit )
+        /*
+         * If the next one on the list has more credit than current
+         * (or idle, if current is not runnable), choose it.
+         */
+        if ( svc_credit > snext->credit )
             snext = svc;
 
         /* In any case, if we got this far, break. */
@@ -2399,6 +2436,8 @@ csched2_schedule(
          && vcpu_runnable(current) )
         __set_bit(__CSFLAG_delayed_runq_add, &scurr->flags);
 
+    __clear_bit(__CSFLAG_vcpu_yield, &scurr->flags);
+
     ret.migrated = 0;
 
     /* Accounting for non-idle tasks */
@@ -2918,6 +2957,14 @@ csched2_init(struct scheduler *ops)
     printk(XENLOG_INFO "load tracking window lenght %llu ns\n",
            1ULL << opt_load_window_shift);
 
+    if ( opt_yield_bias < CSCHED2_YIELD_BIAS_MIN )
+    {
+        printk("WARNING: %s: opt_yield_bias %d too small, resetting\n",
+               __func__, opt_yield_bias);
+        opt_yield_bias = 1000; /* 1 ms */
+    }
+    printk(XENLOG_INFO "yield bias value %d us\n", opt_yield_bias);
+
     /* Basically no CPU information is available at this point; just
      * set up basic structures, and a callback when the CPU info is
      * available. */
@@ -2970,6 +3017,7 @@ static const struct scheduler sched_credit2_def = {
 
     .sleep          = csched2_vcpu_sleep,
     .wake           = csched2_vcpu_wake,
+    .yield          = csched2_vcpu_yield,
 
     .adjust         = csched2_dom_cntl,
     .adjust_global  = csched2_sys_cntl,
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 32a300f..abe063d 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -947,6 +947,8 @@ long vcpu_yield(void)
     SCHED_OP(VCPU2OP(v), yield, v);
     vcpu_schedule_unlock_irq(lock, v);
 
+    SCHED_STAT_CRANK(vcpu_yield);
+
     TRACE_2D(TRC_SCHED_YIELD, current->domain->domain_id, current->vcpu_id);
     raise_softirq(SCHEDULE_SOFTIRQ);
     return 0;
diff --git a/xen/include/xen/perfc_defn.h b/xen/include/xen/perfc_defn.h
index 4a835b8..900fddd 100644
--- a/xen/include/xen/perfc_defn.h
+++ b/xen/include/xen/perfc_defn.h
@@ -23,6 +23,7 @@ PERFCOUNTER(vcpu_alloc,             "sched: vcpu_alloc")
 PERFCOUNTER(vcpu_insert,            "sched: vcpu_insert")
 PERFCOUNTER(vcpu_remove,            "sched: vcpu_remove")
 PERFCOUNTER(vcpu_sleep,             "sched: vcpu_sleep")
+PERFCOUNTER(vcpu_yield,             "sched: vcpu_yield")
 PERFCOUNTER(vcpu_wake_running,      "sched: vcpu_wake_running")
 PERFCOUNTER(vcpu_wake_onrunq,       "sched: vcpu_wake_onrunq")
 PERFCOUNTER(vcpu_wake_runnable,     "sched: vcpu_wake_runnable")


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

  parent reply	other threads:[~2016-08-17 17:18 UTC|newest]

Thread overview: 84+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-17 17:17 [PATCH 00/24] sched: Credit1 and Credit2 improvements... and soft-affinity for Credit2! Dario Faggioli
2016-08-17 17:17 ` [PATCH 01/24] xen: credit1: small optimization in Credit1's tickling logic Dario Faggioli
2016-09-12 15:01   ` George Dunlap
2016-08-17 17:17 ` [PATCH 02/24] xen: credit1: fix mask to be used for tickling in Credit1 Dario Faggioli
2016-08-17 23:42   ` Dario Faggioli
2016-09-12 15:04     ` George Dunlap
2016-08-17 17:17 ` [PATCH 03/24] xen: credit1: return the 'time remaining to the limit' as next timeslice Dario Faggioli
2016-09-12 15:14   ` George Dunlap
2016-09-12 17:00     ` Dario Faggioli
2016-09-14  9:34       ` George Dunlap
2016-09-14 13:54         ` Dario Faggioli
2016-08-17 17:18 ` [PATCH 04/24] xen: credit2: properly schedule migration of a running vcpu Dario Faggioli
2016-09-12 17:11   ` George Dunlap
2016-08-17 17:18 ` [PATCH 05/24] xen: credit2: make tickling more deterministic Dario Faggioli
2016-08-31 17:10   ` anshul makkar
2016-09-05 13:47     ` Dario Faggioli
2016-09-07 12:25       ` anshul makkar
2016-09-13 11:13       ` George Dunlap
2016-09-29 15:24         ` Dario Faggioli
2016-09-13 11:28   ` George Dunlap
2016-09-30  2:22     ` Dario Faggioli
2016-08-17 17:18 ` Dario Faggioli [this message]
2016-09-13 13:33   ` [PATCH 06/24] xen: credit2: implement yield() George Dunlap
2016-09-29 16:05     ` Dario Faggioli
2016-09-20 13:25   ` George Dunlap
2016-09-20 13:37     ` George Dunlap
2016-08-17 17:18 ` [PATCH 07/24] xen: sched: don't rate limit context switches in case of yields Dario Faggioli
2016-09-20 13:32   ` George Dunlap
2016-09-29 16:46     ` Dario Faggioli
2016-08-17 17:18 ` [PATCH 08/24] xen: tracing: add trace records for schedule and rate-limiting Dario Faggioli
2016-08-18  0:57   ` Meng Xu
2016-08-18  9:41     ` Dario Faggioli
2016-09-20 13:50   ` George Dunlap
2016-08-17 17:18 ` [PATCH 09/24] xen/tools: tracing: improve tracing of context switches Dario Faggioli
2016-09-20 14:08   ` George Dunlap
2016-08-17 17:18 ` [PATCH 10/24] xen: tracing: improve Credit2's tickle_check and burn_credits records Dario Faggioli
2016-09-20 14:35   ` George Dunlap
2016-09-29 17:23     ` Dario Faggioli
2016-09-29 17:28       ` George Dunlap
2016-09-29 20:53         ` Dario Faggioli
2016-08-17 17:18 ` [PATCH 11/24] tools: tracing: handle more scheduling related events Dario Faggioli
2016-09-20 14:37   ` George Dunlap
2016-08-17 17:18 ` [PATCH 12/24] xen: libxc: allow to set the ratelimit value online Dario Faggioli
2016-09-20 14:43   ` George Dunlap
2016-09-20 14:45     ` Wei Liu
2016-09-28 15:44   ` George Dunlap
2016-08-17 17:19 ` [PATCH 13/24] libxc: improve error handling of xc Credit1 and Credit2 helpers Dario Faggioli
2016-09-20 15:10   ` Wei Liu
2016-08-17 17:19 ` [PATCH 14/24] libxl: allow to set the ratelimit value online for Credit2 Dario Faggioli
2016-08-22  9:21   ` Ian Jackson
2016-09-05 14:02     ` Dario Faggioli
2016-08-22  9:28   ` Ian Jackson
2016-09-28 15:37     ` George Dunlap
2016-09-30  1:03     ` Dario Faggioli
2016-09-28 15:39   ` George Dunlap
2016-08-17 17:19 ` [PATCH 15/24] xl: " Dario Faggioli
2016-09-28 15:46   ` George Dunlap
2016-08-17 17:19 ` [PATCH 16/24] xen: sched: factor affinity helpers out of sched_credit.c Dario Faggioli
2016-09-28 15:49   ` George Dunlap
2016-08-17 17:19 ` [PATCH 17/24] xen: credit2: soft-affinity awareness in runq_tickle() Dario Faggioli
2016-09-01 10:52   ` anshul makkar
2016-09-05 14:55     ` Dario Faggioli
2016-09-07 13:24       ` anshul makkar
2016-09-07 13:31         ` Dario Faggioli
2016-09-28 20:44   ` George Dunlap
2016-08-17 17:19 ` [PATCH 18/24] xen: credit2: soft-affinity awareness fallback_cpu() and cpu_pick() Dario Faggioli
2016-09-01 11:08   ` anshul makkar
2016-09-05 13:26     ` Dario Faggioli
2016-09-07 12:52       ` anshul makkar
2016-09-29 11:11   ` George Dunlap
2016-08-17 17:19 ` [PATCH 19/24] xen: credit2: soft-affinity awareness in load balancing Dario Faggioli
2016-09-02 11:46   ` anshul makkar
2016-09-05 12:49     ` Dario Faggioli
2016-08-17 17:19 ` [PATCH 20/24] xen: credit2: kick away vcpus not running within their soft-affinity Dario Faggioli
2016-08-17 17:20 ` [PATCH 21/24] xen: credit2: optimize runq_candidate() a little bit Dario Faggioli
2016-08-17 17:20 ` [PATCH 22/24] xen: credit2: "relax" CSCHED2_MAX_TIMER Dario Faggioli
2016-09-30 15:30   ` George Dunlap
2016-08-17 17:20 ` [PATCH 23/24] xen: credit2: optimize runq_tickle() a little bit Dario Faggioli
2016-09-02 12:38   ` anshul makkar
2016-09-05 12:52     ` Dario Faggioli
2016-08-17 17:20 ` [PATCH 24/24] xen: credit2: try to avoid tickling cpus subject to ratelimiting Dario Faggioli
2016-08-18  0:11 ` [PATCH 00/24] sched: Credit1 and Credit2 improvements... and soft-affinity for Credit2! Dario Faggioli
2016-08-18 11:49 ` Dario Faggioli
2016-08-18 11:53 ` Dario Faggioli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=147145429544.25877.17160453823324324872.stgit@Solace.fritz.box \
    --to=dario.faggioli@citrix.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=anshul.makkar@citrix.com \
    --cc=george.dunlap@eu.citrix.com \
    --cc=jbeulich@suse.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.