From: anshul makkar <anshul.makkar@citrix.com>
To: Dario Faggioli <dario.faggioli@citrix.com>,
	xen-devel@lists.xenproject.org
Cc: George Dunlap <george.dunlap@citrix.com>
Subject: Re: [PATCH 19/24] xen: credit2: soft-affinity awareness in load balancing
Date: Fri, 2 Sep 2016 12:46:01 +0100
Message-ID: <57C96679.3000902@citrix.com>
In-Reply-To: <147145438726.25877.12520091608250776214.stgit@Solace.fritz.box>

On 17/08/16 18:19, Dario Faggioli wrote:
> We want soft-affinity to play a role in load
> balancing, i.e., when deciding whether or not to

> something like that at some point.
>
> (Oh, and while there, just a couple of style fixes
> are also done.)
>
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> ---
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Anshul Makkar <anshul.makkar@citrix.com>
> ---
>   xen/common/sched_credit2.c |  359 ++++++++++++++++++++++++++++++++++++++++----
>   1 file changed, 326 insertions(+), 33 deletions(-)
>
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index 2d7228a..3722f46 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -1786,19 +1786,21 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
>       return new_cpu;
>   }
>
> -/* Working state of the load-balancing algorithm */
> +/* Working state of the load-balancing algorithm. */
>   typedef struct {
> -    /* NB: Modified by consider() */
> +    /* NB: Modified by consider(). */
>       s_time_t load_delta;
>       struct csched2_vcpu * best_push_svc, *best_pull_svc;
> -    /* NB: Read by consider() */
> +    /* NB: Read by consider() (and the various consider_foo() functions). */
>       struct csched2_runqueue_data *lrqd;
> -    struct csched2_runqueue_data *orqd;
> +    struct csched2_runqueue_data *orqd;
> +    bool_t push_has_soft_aff, pull_has_soft_aff;
> +    s_time_t push_soft_aff_load, pull_soft_aff_load;
>   } balance_state_t;
>
> -static void consider(balance_state_t *st,
> -                     struct csched2_vcpu *push_svc,
> -                     struct csched2_vcpu *pull_svc)
> +static inline s_time_t consider_load(balance_state_t *st,
> +                                     struct csched2_vcpu *push_svc,
> +                                     struct csched2_vcpu *pull_svc)
>   {
>       s_time_t l_load, o_load, delta;
>
> @@ -1821,11 +1823,166 @@ static void consider(balance_state_t *st,
>       if ( delta < 0 )
>           delta = -delta;
>
> +    return delta;
> +}
> +
> +/*
> + * Load balancing and soft-affinity.
> + *
> + * When trying to figure out whether or not it's best to move a vcpu from
> + * one runqueue to another, we must keep soft-affinity in mind. Intuitively
> + * we would want to know the following:
> + *  - 'how much' affinity does the vcpu have with its current runq?
> + *  - 'how much' affinity will it have with its new runq?
> + *
> + * But we certainly need to be more precise about how much it is that 'how
> + * much'! Let's start with some definitions:
> + *
> + *  - let v be a vcpu, running in runq I, with soft-affinity to vi
> + *    pcpus of runq I, and soft affinity with vj pcpus of runq J;
> + *  - let k be another vcpu, running in runq J, with soft-affinity to kj
> + *    pcpus of runq J, and with ki pcpus of runq I;
> + *  - let runq I have Ci pcpus, and runq J Cj pcpus;
> + *  - let vcpu v have an average load of lv, and k an average load of lk;
> + *  - let runq I have an average load of Li, and J an average load of Lj.
> + *
> + * We also define the following:
> + *
> + *  - lvi = lv * (vi / Ci)  as the 'perceived load' of v, when running
> + *                          in runq i;
> + *  - lvj = lv * (vj / Cj)  as the 'perceived load' of v, when running
> + *                          in runq j;
> + *  - the same for k, mutatis mutandis.
> + *
> + * Idea is that vi/Ci (i.e., the ratio of the number of cpus of a runq that
> + * a vcpu has soft-affinity with, over the total number of cpus of the runq
> + * itself) can be seen as the 'degree of soft-affinity' of v to runq I (and
> + * vj/Cj the one of v to J). In other words, we define the degree of soft
> + * affinity of a vcpu to a runq as what fraction of pcpus of the runq itself
> + * the vcpu has soft-affinity with. Then, we multiply this 'degree of
> + * soft-affinity' by the vcpu load, and call the result the 'perceived load'.
> + *
> + * Basically, if a soft-affinity is defined, the work done by a vcpu on a
> + * runq to which it has higher degree of soft-affinity, is considered
> + * 'lighter' than the same work done by the same vcpu on a runq to which it
> + * has smaller degree of soft-affinity (degree of soft affinity is <= 1). In
> + * fact, if soft-affinity is used to achieve NUMA-aware scheduling, the higher
> + * the degree of soft-affinity of the vcpu to a runq, the greater the probability
> + * of accessing local memory, when running on such runq. And that is certainly
> + * 'lighter' than having to fetch memory from remote NUMA nodes.
Do we ensure that the NUMA architecture is taken into account when
soft-affinity is defined for a vcpu? If not, this whole calculation can
go wrong and have a negative impact on performance.

The degree of affinity to a runq will only give good results if the
affinity to pcpus has been chosen after due consideration.
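
To make the definition above concrete, here is a minimal C sketch of the
'perceived load' computation, lv * (vi / Ci). The helper name
perceived_load() and the plain integer types are assumptions made for
illustration; this is not code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): perceived load of a vcpu on a
 * runqueue, i.e. load * (soft_cpus / runq_cpus), where soft_cpus is the
 * number of pcpus of the runqueue the vcpu has soft-affinity with.
 */
static int64_t perceived_load(int64_t load, unsigned int soft_cpus,
                              unsigned int runq_cpus)
{
    /* The degree of soft-affinity is a fraction in [0, 1]. */
    assert(runq_cpus > 0 && soft_cpus <= runq_cpus);

    /* Multiply first, so the integer division loses less precision. */
    return load * soft_cpus / runq_cpus;
}
```

For a vcpu with load 1000 and soft-affinity to 4 of the 8 pcpus of a
runq, this returns 500; with soft-affinity to all 8 pcpus it returns the
full 1000.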
> + *
> + * So, evaluating pushing v from I to J would mean removing (from I) a
> + * perceived load of lv*(vi/Ci) and adding (to J) a perceived load of
> + * lv*(vj/Cj), which we (looking at things from the point of view of I,
> + * which is what balance_load() does) can call D_push:
> + *
> + *  - D_push = -lv * (vi / Ci) + lv * (vj / Cj) =
> + *           = lv * (vj/Cj - vi/Ci)
> + *
> + * On the other hand, pulling k from J to I would entail a D_pull:
> + *
> + *  - D_pull = lk * (ki / Ci) - lk * (kj / Cj) =
> + *           = lk * (ki/Ci - kj/Cj)
> + *
> + * Note that if v (k) has soft-affinity with all the cpus of both I and J,
> + * D_push (D_pull) will be 0, and the same is true in case it has no soft
> + * affinity at all with any of the cpus of I and J. Note also that both
> + * D_push and D_pull can be positive or negative (there's no abs() around
> + * in this case!) depending on the relationship between the degrees of soft
> + * affinity of the vcpu to I and J.
> + *
> + * If there is no soft-affinity, balance_load() (actually, consider()) acts
> + * as follows:
> + *
> + *  - D = abs(Li - Lj)
If we only consider the absolute value of Li - Lj, how will we know
which runq has the lower load? I think that is an essential parameter
for load balancing. Am I missing something here?
> + *  - consider pushing v from I to J:
> + *     - D' = abs(Li - lv - (Lj + lv))   (from now, abs(x) == |x|)
> + *     - if (D' < D) { push }
> + *  - consider pulling k from J to I:
> + *     - D' = |Li + lk - (Lj - lk)|
> + *     - if (D' < D) { pull }
For both push and pull, we are checking (D' < D)?
> + *  - consider both push and pull:
> + *     - D' = |Li - lv + lk - (Lj + lv - lk)|
> + *     - if (D' < D) { push; pull }
> + *
> + * In order to make soft-affinity part of the process, we use D_push and
> + * D_pull, so that, the final behavior will look like this:
> + *
> + *  - D = abs(Li - Lj)
> + *  - consider pushing v from I to J:
> + *     - D' = |Li - lv - (Lj + lv)|
> + *     - D_push = lv * (vj/Cj - vi/Ci)
> + *     - if (D' + D_push < D) { push }
> + *  - consider pulling k from J to I:
> + *     - D' = |Li + lk - (Lj - lk)|
> + *       D_pull = lk * (ki/Ci - kj/Cj)
> + *     - if (D' + D_pull < D) { pull }
> + *  - consider both push and pull:
> + *     - D' = |Li - lv + lk - (Lj + lv - lk)|
> + *     - D_push = lv * (vj/Cj - vi/Ci)
> + *       D_pull = lk * (ki/Ci - kj/Cj)
> + *     - if (D' + D_push + D_pull < D) { push; pull }
> + *
> + * So, for instance, the complete formula, in case of a push, with soft
> + * affinity being considered looks like this:
> + *
> + *  - D'' = D' + D_push =
> + *        = |Li - lv - (Lj + lv)| + lv*(vj/Cj - vi/Ci)
> + *
> + * which highlights how soft-affinity being considered acts as a *modifier*
> + * of the "normal" results obtained by just using the actual vcpus loads.
> + * This approach is modular, in the sense that it only takes implementing
> + * another function that returns another modifier, to make the load balancer
> + * consider some other factor or characteristics of the vcpus.
> + *
> + * Finally there is the scope for actually using a scaling factor, to limit
> + * the influence that soft-affinity will actually have on baseline results
> + * from consider_load(). Basically, that means that instead of D_push and/or
> + * D_pull, we'll be adding D_push/S and/or D_pull/S (with S the scaling
> + * factor). Check prep_soft_aff_load() for details on this.
> + */
> +
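
The push and pull checks described in the comment can be sketched in C
as below. This is only an illustration under stated assumptions: the
struct and function names are hypothetical, loads are plain integers,
and the scaling factor S is passed in as a parameter (the comment defers
the details of S to prep_soft_aff_load(), which is not quoted here).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical models of the symbols used in the comment. */
struct runq_model {
    int64_t load;            /* Li or Lj */
    unsigned int cpus;       /* Ci or Cj */
};

struct vcpu_model {
    int64_t load;            /* lv or lk */
    unsigned int soft_here;  /* soft-affinity pcpus in the current runq */
    unsigned int soft_there; /* soft-affinity pcpus in the other runq */
};

static int64_t abs64(int64_t x)
{
    return x < 0 ? -x : x;
}

/* load * (soft_there/c_to - soft_here/c_from); signed, no abs() here. */
static int64_t soft_aff_delta(const struct vcpu_model *v,
                              unsigned int c_from, unsigned int c_to)
{
    assert(c_from > 0 && c_to > 0);
    return v->load * v->soft_there / c_to - v->load * v->soft_here / c_from;
}

/* Consider pushing v from I to J: push if D' + D_push/S < D. */
static bool should_push(const struct runq_model *I,
                        const struct runq_model *J,
                        const struct vcpu_model *v, int64_t scale)
{
    assert(scale > 0);
    int64_t d = abs64(I->load - J->load);
    int64_t d_new = abs64((I->load - v->load) - (J->load + v->load));
    int64_t d_push = soft_aff_delta(v, I->cpus, J->cpus) / scale;

    return d_new + d_push < d;
}

/* Consider pulling k from J to I: pull if D' + D_pull/S < D. */
static bool should_pull(const struct runq_model *I,
                        const struct runq_model *J,
                        const struct vcpu_model *k, int64_t scale)
{
    assert(scale > 0);
    int64_t d = abs64(I->load - J->load);
    int64_t d_new = abs64((I->load + k->load) - (J->load - k->load));
    /* k currently runs in J, so its "from" runq is J and its "to" is I. */
    int64_t d_pull = soft_aff_delta(k, J->cpus, I->cpus) / scale;

    return d_new + d_pull < d;
}
```

As a worked case, take Li = 1000, Lj = 600, lv = 300 and four pcpus per
runq, so D = 400 and D' = 200. With vi = vj = 4, D_push is 0 and the
push goes ahead. With vi = 0 and vj = 4, D_push is 300, D' + D_push is
500, and the push is rejected when S = 1; with S = 4 the modifier drops
to 75 and the push is accepted again.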
Anshul

