* [PATCH 0/6] xen: sched: control structure memory layout optimizations
@ 2017-06-23 10:54 Dario Faggioli
  2017-06-23 10:54 ` [PATCH 1/6] xen: credit2: allocate runqueue data structure dynamically Dario Faggioli
                   ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: Dario Faggioli @ 2017-06-23 10:54 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Meng Xu, Anshul Makkar

Hi,

This series contains some (micro)optimization patches that had been lying
around in my local branches for some time.

In more detail:

- patches 1 and 2 get rid of (potentially) big static arrays inside Credit2's
  data structure, by making them dynamic, or per-CPU variables;

- patches 3, 4 and 5 reorder the fields in the schedulers' control
  structures. The main goal is to optimize the memory layout (e.g., reduce or
  avoid padding) and the cache layout (i.e., keep related fields, which are
  accessed close to each other, in the same cacheline). While there, I'm also
  trying to improve code readability and comment wording, style and alignment;

- patch 6 speeds up tickling in Credit1 and Credit2, in the presence of 1:1
  vCPU-to-pCPU pinning.

For finding holes, and visualizing the cache layout, I've used pahole.
The individual changelogs of patches 3-5 have the details about the actual
improvements.
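
[ For readers unfamiliar with pahole: a minimal, made-up illustration
  (not from this series) of the holes and padding it reports. The
  struct and the sizes assume an LP64 build with debug info, inspected
  with e.g. `pahole -C bad_layout xen-syms`: ]

    #include <stdint.h>

    struct bad_layout {
        uint32_t a;     /* bytes 0-3, then a 4 byte hole ...         */
        void *p;        /* ... so that p lands 8-byte aligned (8-15) */
        uint32_t b;     /* bytes 16-19, then 4 bytes of tail padding */
    };                  /* sizeof == 24 */

    struct good_layout {
        void *p;        /* bytes 0-7                                 */
        uint32_t a;     /* bytes 8-11                                */
        uint32_t b;     /* bytes 12-15; no hole, no tail padding     */
    };                  /* sizeof == 16 */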

There's a git branch available here:
 git://xenbits.xen.org/people/dariof/xen.git  rel/sched/datas-mem-cache-optim
 http://xenbits.xen.org/gitweb/?p=people/dariof/xen.git;a=shortlog;h=refs/heads/rel/sched/datas-mem-cache-optim

Thanks and Regards,
Dario
---
Dario Faggioli (6):
      xen: credit2: allocate runqueue data structure dynamically
      xen: credit2: make the cpu to runqueue map per-cpu
      xen: credit: rearrange members of control structures
      xen: credit2: rearrange members of control structures
      xen: RTDS: rearrange members of control structures
      xen: sched: optimize exclusive pinning case (Credit1 & 2)

 xen/common/sched_credit.c    |   60 +++++++++++----
 xen/common/sched_credit2.c   |  170 +++++++++++++++++++++++++-----------------
 xen/common/sched_rt.c        |   13 ++-
 xen/include/xen/perfc_defn.h |    1 +
 4 files changed, 156 insertions(+), 88 deletions(-)
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/6] xen: credit2: allocate runqueue data structure dynamically
  2017-06-23 10:54 [PATCH 0/6] xen: sched: control structure memory layout optimizations Dario Faggioli
@ 2017-06-23 10:54 ` Dario Faggioli
  2017-07-21 16:50   ` George Dunlap
  2017-06-23 10:54 ` [PATCH 2/6] xen: credit2: make the cpu to runqueue map per-cpu Dario Faggioli
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Dario Faggioli @ 2017-06-23 10:54 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Anshul Makkar

Instead of keeping an NR_CPUS-sized array of csched2_runqueue_data
elements directly inside the csched2_private structure, allocate
it dynamically.

This has two positive effects:
- it considerably reduces the size of csched2_private, which is
  especially good in case there are multiple instances of Credit2
  (in different cpupools), and is also good from the point
  of view of fitting the struct into CPU caches;
- we can use nr_cpu_ids as the array size, which may be significantly
  smaller than NR_CPUS.
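
[ Back-of-envelope figures for the first point. These are assumptions
  for illustration: NR_CPUS == 4096, and the 216-byte size of
  csched2_runqueue_data before rearrangement, as reported by pahole
  in patch 4: ]

    /*
     *   struct csched2_runqueue_data rqd[NR_CPUS];
     *       216 bytes * 4096 = ~864 KiB inside *each* csched2_private;
     *
     *   struct csched2_runqueue_data *rqd;
     *       8 bytes in the struct, plus 216 * nr_cpu_ids bytes
     *       xzalloc_array()'d at csched2_init() time (e.g. ~13.5 KiB
     *       on a host with 64 pCPUs).
     */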

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
---
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Anshul Makkar <anshulmakkar@gmail.com>
---
 xen/common/sched_credit2.c |   16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 126417c..10d9488 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -385,7 +385,7 @@ struct csched2_private {
 
     int runq_map[NR_CPUS];
     cpumask_t active_queues; /* Queues which may have active cpus */
-    struct csched2_runqueue_data rqd[NR_CPUS];
+    struct csched2_runqueue_data *rqd;
 
     unsigned int load_precision_shift;
     unsigned int load_window_shift;
@@ -3099,9 +3099,11 @@ csched2_init(struct scheduler *ops)
     printk(XENLOG_INFO "load tracking window length %llu ns\n",
            1ULL << opt_load_window_shift);
 
-    /* Basically no CPU information is available at this point; just
+    /*
+     * Basically no CPU information is available at this point; just
      * set up basic structures, and a callback when the CPU info is
-     * available. */
+     * available.
+     */
 
     prv = xzalloc(struct csched2_private);
     if ( prv == NULL )
@@ -3111,7 +3113,13 @@ csched2_init(struct scheduler *ops)
     rwlock_init(&prv->lock);
     INIT_LIST_HEAD(&prv->sdom);
 
-    /* But un-initialize all runqueues */
+    /* Allocate all runqueues and mark them as un-initialized */
+    prv->rqd = xzalloc_array(struct csched2_runqueue_data, nr_cpu_ids);
+    if ( !prv->rqd )
+    {
+        xfree(prv);
+        return -ENOMEM;
+    }
     for ( i = 0; i < nr_cpu_ids; i++ )
     {
         prv->runq_map[i] = -1;


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/6] xen: credit2: make the cpu to runqueue map per-cpu
  2017-06-23 10:54 [PATCH 0/6] xen: sched: control structure memory layout optimizations Dario Faggioli
  2017-06-23 10:54 ` [PATCH 1/6] xen: credit2: allocate runqueue data structure dynamically Dario Faggioli
@ 2017-06-23 10:54 ` Dario Faggioli
  2017-07-21 16:56   ` George Dunlap
  2017-06-23 10:55 ` [PATCH 3/6] xen: credit: rearrange members of control structures Dario Faggioli
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Dario Faggioli @ 2017-06-23 10:54 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Anshul Makkar

Instead of keeping an NR_CPUS-sized array of int-s
directly inside csched2_private, use a per-cpu
variable.

That's especially beneficial (in terms of saved
memory) when there are multiple instances of Credit2
(in different cpupools), and it also helps fitting
csched2_private itself into CPU caches.
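
[ A rough sketch of the saving, with illustrative numbers: assume
  NR_CPUS == 4096, sizeof(int) == 4, and a 64-pCPU host running 4
  Credit2 cpupools: ]

    /*
     *   int runq_map[NR_CPUS] in each csched2_private:
     *       4 pools * 4096 * 4 bytes = 64 KiB, nearly all unused;
     *
     *   static DEFINE_PER_CPU(int, runq_map):
     *       one int per pCPU, shared by all pools: 64 * 4 = 256 bytes.
     */
    static DEFINE_PER_CPU(int, runq_map);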

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
---
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Anshul Makkar <anshulmakkar@gmail.com>
---
 xen/common/sched_credit2.c |   33 ++++++++++++++++++++-------------
 1 file changed, 20 insertions(+), 13 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 10d9488..15862f2 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -383,7 +383,6 @@ struct csched2_private {
     
     struct list_head sdom; /* Used mostly for dump keyhandler. */
 
-    int runq_map[NR_CPUS];
     cpumask_t active_queues; /* Queues which may have active cpus */
     struct csched2_runqueue_data *rqd;
 
@@ -393,6 +392,14 @@ struct csched2_private {
 };
 
 /*
+ * Physical CPU
+ *
+ * The only per-pCPU information we need to maintain is which runqueue
+ * each CPU is part of.
+ */
+static DEFINE_PER_CPU(int, runq_map);
+
+/*
  * Virtual CPU
  */
 struct csched2_vcpu {
@@ -448,16 +455,16 @@ static inline struct csched2_dom *csched2_dom(const struct domain *d)
 }
 
 /* CPU to runq_id macro */
-static inline int c2r(const struct scheduler *ops, unsigned int cpu)
+static inline int c2r(unsigned int cpu)
 {
-    return csched2_priv(ops)->runq_map[(cpu)];
+    return per_cpu(runq_map, cpu);
 }
 
 /* CPU to runqueue struct macro */
 static inline struct csched2_runqueue_data *c2rqd(const struct scheduler *ops,
                                                   unsigned int cpu)
 {
-    return &csched2_priv(ops)->rqd[c2r(ops, cpu)];
+    return &csched2_priv(ops)->rqd[c2r(cpu)];
 }
 
 /*
@@ -1082,7 +1089,7 @@ runq_insert(const struct scheduler *ops, struct csched2_vcpu *svc)
     ASSERT(spin_is_locked(per_cpu(schedule_data, cpu).schedule_lock));
 
     ASSERT(!vcpu_on_runq(svc));
-    ASSERT(c2r(ops, cpu) == c2r(ops, svc->vcpu->processor));
+    ASSERT(c2r(cpu) == c2r(svc->vcpu->processor));
 
     ASSERT(&svc->rqd->runq == runq);
     ASSERT(!is_idle_vcpu(svc->vcpu));
@@ -1733,7 +1740,7 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
     if ( min_rqi == -1 )
     {
         new_cpu = get_fallback_cpu(svc);
-        min_rqi = c2r(ops, new_cpu);
+        min_rqi = c2r(new_cpu);
         min_avgload = prv->rqd[min_rqi].b_avgload;
         goto out_up;
     }
@@ -2622,7 +2629,7 @@ csched2_schedule(
             unsigned tasklet:8, idle:8, smt_idle:8, tickled:8;
         } d;
         d.cpu = cpu;
-        d.rq_id = c2r(ops, cpu);
+        d.rq_id = c2r(cpu);
         d.tasklet = tasklet_work_scheduled;
         d.idle = is_idle_vcpu(current);
         d.smt_idle = cpumask_test_cpu(cpu, &rqd->smt_idle);
@@ -2783,7 +2790,7 @@ dump_pcpu(const struct scheduler *ops, int cpu)
 #define cpustr keyhandler_scratch
 
     cpumask_scnprintf(cpustr, sizeof(cpustr), per_cpu(cpu_sibling_mask, cpu));
-    printk("CPU[%02d] runq=%d, sibling=%s, ", cpu, c2r(ops, cpu), cpustr);
+    printk("CPU[%02d] runq=%d, sibling=%s, ", cpu, c2r(cpu), cpustr);
     cpumask_scnprintf(cpustr, sizeof(cpustr), per_cpu(cpu_core_mask, cpu));
     printk("core=%s\n", cpustr);
 
@@ -2930,7 +2937,7 @@ init_pdata(struct csched2_private *prv, unsigned int cpu)
     }
     
     /* Set the runqueue map */
-    prv->runq_map[cpu] = rqi;
+    per_cpu(runq_map, cpu) = rqi;
     
     __cpumask_set_cpu(cpu, &rqd->idle);
     __cpumask_set_cpu(cpu, &rqd->active);
@@ -3034,7 +3041,7 @@ csched2_deinit_pdata(const struct scheduler *ops, void *pcpu, int cpu)
     ASSERT(!pcpu && cpumask_test_cpu(cpu, &prv->initialized));
     
     /* Find the old runqueue and remove this cpu from it */
-    rqi = prv->runq_map[cpu];
+    rqi = per_cpu(runq_map, cpu);
 
     rqd = prv->rqd + rqi;
 
@@ -3055,6 +3062,8 @@ csched2_deinit_pdata(const struct scheduler *ops, void *pcpu, int cpu)
     else if ( rqd->pick_bias == cpu )
         rqd->pick_bias = cpumask_first(&rqd->active);
 
+    per_cpu(runq_map, cpu) = -1;
+
     spin_unlock(&rqd->lock);
 
     __cpumask_clear_cpu(cpu, &prv->initialized);
@@ -3121,10 +3130,8 @@ csched2_init(struct scheduler *ops)
         return -ENOMEM;
     }
     for ( i = 0; i < nr_cpu_ids; i++ )
-    {
-        prv->runq_map[i] = -1;
         prv->rqd[i].id = -1;
-    }
+
     /* initialize ratelimit */
     prv->ratelimit_us = sched_ratelimit_us;
 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/6] xen: credit: rearrange members of control structures
  2017-06-23 10:54 [PATCH 0/6] xen: sched: control structure memory layout optimizations Dario Faggioli
  2017-06-23 10:54 ` [PATCH 1/6] xen: credit2: allocate runqueue data structure dynamically Dario Faggioli
  2017-06-23 10:54 ` [PATCH 2/6] xen: credit2: make the cpu to runqueue map per-cpu Dario Faggioli
@ 2017-06-23 10:55 ` Dario Faggioli
  2017-07-21 17:02   ` George Dunlap
  2017-06-23 10:55 ` [PATCH 4/6] xen: credit2: " Dario Faggioli
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Dario Faggioli @ 2017-06-23 10:55 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Anshul Makkar

Rearrange the members with the aim of improving memory
size and layout, and, at the same time, of making related
fields reside in the same cacheline.

Here's a summary of the output of `pahole`, with and
without this patch, for the affected data structures.

csched_pcpu:
 * Before:
    size: 88, cachelines: 2, members: 6
    sum members: 80, holes: 1, sum holes: 4
    padding: 4
    paddings: 1, sum paddings: 5
    last cacheline: 24 bytes
 * After:
    size: 80, cachelines: 2, members: 6
    paddings: 1, sum paddings: 5
    last cacheline: 16 bytes

csched_vcpu:
 * Before:
    size: 72, cachelines: 2, members: 9
    padding: 2
    last cacheline: 8 bytes
 * After:
    same numbers, but some fields have been moved
    to put related fields in the same cache line.

csched_private:
 * Before:
    size: 152, cachelines: 3, members: 17
    sum members: 140, holes: 2, sum holes: 8
    padding: 4
    paddings: 1, sum paddings: 5
    last cacheline: 24 bytes
 * After:
    same numbers, but some fields have been moved
    to put related fields in the same cache line.
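
[ To read these summaries: pahole's `size` is `sum members` plus any
  inter-member holes plus tail padding. For csched_pcpu above: ]

    before: 80 (sum members) + 4 (hole) + 4 (tail padding) = 88 bytes
    after:  80 (sum members) + 0        + 0                = 80 bytes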

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
---
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Anshul Makkar <anshulmakkar@gmail.com>
---
 xen/common/sched_credit.c |   41 ++++++++++++++++++++++++++---------------
 1 file changed, 26 insertions(+), 15 deletions(-)

diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index efdf6bf..4f6330e 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -169,10 +169,12 @@ integer_param("sched_credit_tslice_ms", sched_credit_tslice_ms);
 struct csched_pcpu {
     struct list_head runq;
     uint32_t runq_sort_last;
-    struct timer ticker;
-    unsigned int tick;
+
     unsigned int idle_bias;
     unsigned int nr_runnable;
+
+    unsigned int tick;
+    struct timer ticker;
 };
 
 /*
@@ -181,13 +183,18 @@ struct csched_pcpu {
 struct csched_vcpu {
     struct list_head runq_elem;
     struct list_head active_vcpu_elem;
+
+    /* Up-pointers */
     struct csched_dom *sdom;
     struct vcpu *vcpu;
-    atomic_t credit;
-    unsigned int residual;
+
     s_time_t start_time;   /* When we were scheduled (used for credit) */
     unsigned flags;
-    int16_t pri;
+    int pri;
+
+    atomic_t credit;
+    unsigned int residual;
+
 #ifdef CSCHED_STATS
     struct {
         int credit_last;
@@ -219,21 +226,25 @@ struct csched_dom {
 struct csched_private {
     /* lock for the whole pluggable scheduler, nests inside cpupool_lock */
     spinlock_t lock;
-    struct list_head active_sdom;
-    uint32_t ncpus;
-    struct timer  master_ticker;
-    unsigned int master;
+
     cpumask_var_t idlers;
     cpumask_var_t cpus;
+    uint32_t *balance_bias;
+    uint32_t runq_sort;
+    unsigned int ratelimit_us;
+
+    /* Period of master and tick in milliseconds */
+    unsigned int tslice_ms, tick_period_us, ticks_per_tslice;
+    uint32_t ncpus;
+
+    struct list_head active_sdom;
     uint32_t weight;
     uint32_t credit;
     int credit_balance;
-    uint32_t runq_sort;
-    uint32_t *balance_bias;
-    unsigned ratelimit_us;
-    /* Period of master and tick in milliseconds */
-    unsigned tslice_ms, tick_period_us, ticks_per_tslice;
-    unsigned credits_per_tslice;
+    unsigned int credits_per_tslice;
+
+    unsigned int master;
+    struct timer master_ticker;
 };
 
 static void csched_tick(void *_cpu);


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 4/6] xen: credit2: rearrange members of control structures
  2017-06-23 10:54 [PATCH 0/6] xen: sched: control structure memory layout optimizations Dario Faggioli
                   ` (2 preceding siblings ...)
  2017-06-23 10:55 ` [PATCH 3/6] xen: credit: rearrange members of control structures Dario Faggioli
@ 2017-06-23 10:55 ` Dario Faggioli
  2017-07-21 17:05   ` George Dunlap
  2017-06-23 10:55 ` [PATCH 5/6] xen: RTDS: " Dario Faggioli
  2017-06-23 10:55 ` [PATCH 6/6] xen: sched: optimize exclusive pinning case (Credit1 & 2) Dario Faggioli
  5 siblings, 1 reply; 18+ messages in thread
From: Dario Faggioli @ 2017-06-23 10:55 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Anshul Makkar

Rearrange the members with the aim of improving memory
size and layout, and, at the same time, of making related
fields reside in the same cacheline.

Here's a summary of the output of `pahole`, with and
without this patch, for the affected data structures.

csched2_runqueue_data:
 * Before:
    size: 216, cachelines: 4, members: 14
    sum members: 208, holes: 2, sum holes: 8
    last cacheline: 24 bytes
 * After:
    size: 208, cachelines: 4, members: 14
    last cacheline: 16 bytes

csched2_private:
 * Before:
    size: 120, cachelines: 2, members: 8
    sum members: 112, holes: 1, sum holes: 4
    padding: 4
    last cacheline: 56 bytes
 * After:
    size: 112, cachelines: 2, members: 8
    last cacheline: 48 bytes

csched2_vcpu:
 * Before:
    size: 112, cachelines: 2, members: 14
    sum members: 108, holes: 1, sum holes: 4
    last cacheline: 48 bytes
 * After:
    size: 112, cachelines: 2, members: 14
    padding: 4
    last cacheline: 48 bytes

While there, improve the wording, style and alignment
of comments too.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
---
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Anshul Makkar <anshulmakkar@gmail.com>
---
 xen/common/sched_credit2.c |  102 ++++++++++++++++++++++----------------------
 1 file changed, 51 insertions(+), 51 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 15862f2..9814072 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -355,40 +355,41 @@ custom_param("credit2_runqueue", parse_credit2_runqueue);
  * Per-runqueue data
  */
 struct csched2_runqueue_data {
-    int id;
-
-    spinlock_t lock;      /* Lock for this runqueue. */
-    cpumask_t active;      /* CPUs enabled for this runqueue */
-
-    struct list_head runq; /* Ordered list of runnable vms */
-    struct list_head svc;  /* List of all vcpus assigned to this runqueue */
-    unsigned int max_weight;
-    unsigned int pick_bias;/* Last CPU we picked. Start from it next time */
-
-    cpumask_t idle,        /* Currently idle pcpus */
-        smt_idle,          /* Fully idle-and-untickled cores (see below) */
-        tickled;           /* Have been asked to go through schedule */
-    int load;              /* Instantaneous load: Length of queue  + num non-idle threads */
-    s_time_t load_last_update;  /* Last time average was updated */
-    s_time_t avgload;           /* Decaying queue load */
-    s_time_t b_avgload;         /* Decaying queue load modified by balancing */
+    spinlock_t lock;           /* Lock for this runqueue                     */
+
+    struct list_head runq;     /* Ordered list of runnable vms               */
+    int id;                    /* ID of this runqueue (-1 if invalid)        */
+
+    int load;                  /* Instantaneous load (num of non-idle vcpus) */
+    s_time_t load_last_update; /* Last time average was updated              */
+    s_time_t avgload;          /* Decaying queue load                        */
+    s_time_t b_avgload;        /* Decaying queue load modified by balancing  */
+
+    cpumask_t active,          /* CPUs enabled for this runqueue             */
+        smt_idle,              /* Fully idle-and-untickled cores (see below) */
+        tickled,               /* Have been asked to go through schedule     */
+        idle;                  /* Currently idle pcpus                       */
+
+    struct list_head svc;      /* List of all vcpus assigned to the runqueue */
+    unsigned int max_weight;   /* Max weight of the vcpus in this runqueue   */
+    unsigned int pick_bias;    /* Last picked pcpu. Start from it next time  */
 };
 
 /*
  * System-wide private data
  */
 struct csched2_private {
-    rwlock_t lock;
-    cpumask_t initialized; /* CPU is initialized for this pool */
-    
-    struct list_head sdom; /* Used mostly for dump keyhandler. */
+    rwlock_t lock;                     /* Private scheduler lock             */
 
-    cpumask_t active_queues; /* Queues which may have active cpus */
-    struct csched2_runqueue_data *rqd;
+    unsigned int load_precision_shift; /* Precision of load calculations     */
+    unsigned int load_window_shift;    /* Length of load decaying window     */
+    unsigned int ratelimit_us;         /* Rate limiting for this scheduler   */
+
+    cpumask_t active_queues;           /* Runqueues with (maybe) active cpus */
+    struct csched2_runqueue_data *rqd; /* Data of the various runqueues      */
 
-    unsigned int load_precision_shift;
-    unsigned int load_window_shift;
-    unsigned ratelimit_us; /* each cpupool can have its own ratelimit */
+    cpumask_t initialized;             /* CPUs part of this scheduler        */
+    struct list_head sdom;             /* List of domains (for debug key)    */
 };
 
 /*
@@ -403,37 +404,36 @@ static DEFINE_PER_CPU(int, runq_map);
  * Virtual CPU
  */
 struct csched2_vcpu {
-    struct list_head rqd_elem;         /* On the runqueue data list  */
-    struct list_head runq_elem;        /* On the runqueue            */
-    struct csched2_runqueue_data *rqd; /* Up-pointer to the runqueue */
-
-    /* Up-pointers */
-    struct csched2_dom *sdom;
-    struct vcpu *vcpu;
-
-    unsigned int weight;
-    unsigned int residual;
-
-    int credit;
-    s_time_t start_time; /* When we were scheduled (used for credit) */
-    unsigned flags;      /* 16 bits doesn't seem to play well with clear_bit() */
-    int tickled_cpu;     /* cpu tickled for picking us up (-1 if none) */
-
-    /* Individual contribution to load */
-    s_time_t load_last_update;  /* Last time average was updated */
-    s_time_t avgload;           /* Decaying queue load */
-
-    struct csched2_runqueue_data *migrate_rqd; /* Pre-determined rqd to which to migrate */
+    struct list_head rqd_elem;         /* On csched2_runqueue_data's svc list */
+    struct csched2_runqueue_data *rqd; /* Up-pointer to the runqueue          */
+
+    int credit;                        /* Current amount of credit            */
+    unsigned int weight;               /* Weight of this vcpu                 */
+    unsigned int residual;             /* Remainder of div(max_weight/weight) */
+    unsigned flags;                    /* Status flags (16 bits would be ok,  */
+                                       /* but clear_bit() does not like that) */
+    s_time_t start_time;               /* Time we were scheduled (for credit) */
+
+    /* Individual contribution to load                                        */
+    s_time_t load_last_update;         /* Last time average was updated       */
+    s_time_t avgload;                  /* Decaying queue load                 */
+
+    struct list_head runq_elem;        /* On the runqueue (rqd->runq)         */
+    struct csched2_dom *sdom;          /* Up-pointer to domain                */
+    struct vcpu *vcpu;                 /* Up-pointer to vcpu                  */
+
+    struct csched2_runqueue_data *migrate_rqd; /* Pre-determined migr. target */
+    int tickled_cpu;                   /* Cpu that will pick us (-1 if none)  */
 };
 
 /*
  * Domain
  */
 struct csched2_dom {
-    struct list_head sdom_elem;
-    struct domain *dom;
-    uint16_t weight;
-    uint16_t nr_vcpus;
+    struct list_head sdom_elem; /* On csched2_private's sdom list             */
+    struct domain *dom;         /* Up-pointer to domain                       */
+    uint16_t weight;            /* User specified weight                      */
+    uint16_t nr_vcpus;          /* Number of vcpus of this domain             */
 };
 
 /*


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 5/6] xen: RTDS: rearrange members of control structures
  2017-06-23 10:54 [PATCH 0/6] xen: sched: control structure memory layout optimizations Dario Faggioli
                   ` (3 preceding siblings ...)
  2017-06-23 10:55 ` [PATCH 4/6] xen: credit2: " Dario Faggioli
@ 2017-06-23 10:55 ` Dario Faggioli
  2017-07-21 17:06   ` George Dunlap
  2017-07-21 17:51   ` Meng Xu
  2017-06-23 10:55 ` [PATCH 6/6] xen: sched: optimize exclusive pinning case (Credit1 & 2) Dario Faggioli
  5 siblings, 2 replies; 18+ messages in thread
From: Dario Faggioli @ 2017-06-23 10:55 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Meng Xu

Nothing changed in `pahole` output, in terms of holes
and padding, but some fields have been moved, to put
related members in the same cache line.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
---
Cc: Meng Xu <mengxu@cis.upenn.edu>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
---
 xen/common/sched_rt.c |   13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
index 1b30014..39f6bee 100644
--- a/xen/common/sched_rt.c
+++ b/xen/common/sched_rt.c
@@ -171,11 +171,14 @@ static void repl_timer_handler(void *data);
 struct rt_private {
     spinlock_t lock;            /* the global coarse-grained lock */
     struct list_head sdom;      /* list of availalbe domains, used for dump */
+
     struct list_head runq;      /* ordered list of runnable vcpus */
     struct list_head depletedq; /* unordered list of depleted vcpus */
+
+    struct timer *repl_timer;   /* replenishment timer */
     struct list_head replq;     /* ordered list of vcpus that need replenishment */
+
     cpumask_t tickled;          /* cpus been tickled */
-    struct timer *repl_timer;   /* replenishment timer */
 };
 
 /*
@@ -185,10 +188,6 @@ struct rt_vcpu {
     struct list_head q_elem;     /* on the runq/depletedq list */
     struct list_head replq_elem; /* on the replenishment events list */
 
-    /* Up-pointers */
-    struct rt_dom *sdom;
-    struct vcpu *vcpu;
-
     /* VCPU parameters, in nanoseconds */
     s_time_t period;
     s_time_t budget;
@@ -198,6 +197,10 @@ struct rt_vcpu {
     s_time_t last_start;         /* last start time */
     s_time_t cur_deadline;       /* current deadline for EDF */
 
+    /* Up-pointers */
+    struct rt_dom *sdom;
+    struct vcpu *vcpu;
+
     unsigned flags;              /* mark __RTDS_scheduled, etc.. */
 };
 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 6/6] xen: sched: optimize exclusive pinning case (Credit1 & 2)
  2017-06-23 10:54 [PATCH 0/6] xen: sched: control structure memory layout optimizations Dario Faggioli
                   ` (4 preceding siblings ...)
  2017-06-23 10:55 ` [PATCH 5/6] xen: RTDS: " Dario Faggioli
@ 2017-06-23 10:55 ` Dario Faggioli
  2017-07-21 17:19   ` George Dunlap
  5 siblings, 1 reply; 18+ messages in thread
From: Dario Faggioli @ 2017-06-23 10:55 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Anshul Makkar

Exclusive pinning of vCPUs is sometimes used for
achieving the highest level of determinism, and the
least possible overhead, for the vCPUs in question.

Although static 1:1 pinning is not recommended for
general use cases, optimizing the tickling code (of
Credit1 and Credit2) is easy and cheap enough, so go
for it.
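
[ The single-line check added below hinges on a cpumask_cycle()
  property; the helper here is illustrative only, not part of the
  patch: ]

    /*
     * cpumask_cycle(cpu, m) returns the next CPU set in m after cpu,
     * wrapping around to the beginning. It returns cpu itself only
     * when cpu is the sole CPU set in m, which is exactly the
     * "exclusively pinned to this cpu" condition the tickling fast
     * path wants to detect.
     */
    static inline bool excl_pinned_here(unsigned int cpu,
                                        const cpumask_t *hard_affinity)
    {
        return cpumask_cycle(cpu, hard_affinity) == cpu;
    }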

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
---
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Anshul Makkar <anshulmakkar@gmail.com>
---
 xen/common/sched_credit.c    |   19 +++++++++++++++++++
 xen/common/sched_credit2.c   |   21 ++++++++++++++++++++-
 xen/include/xen/perfc_defn.h |    1 +
 3 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index 4f6330e..85e014d 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -429,6 +429,24 @@ static inline void __runq_tickle(struct csched_vcpu *new)
     idlers_empty = cpumask_empty(&idle_mask);
 
     /*
+     * Exclusive pinning is when a vcpu has hard-affinity with only one
+     * cpu, and there is no other vcpu that has hard-affinity with that
+     * same cpu. This is infrequent, but when it happens, it is for
+     * achieving the highest possible determinism, and the least
+     * possible overhead, for the vcpus in question.
+     *
+     * Try to identify the vast majority of these situations, and deal
+     * with them quickly.
+     */
+    if ( unlikely(cpumask_cycle(cpu, new->vcpu->cpu_hard_affinity) == cpu &&
+                  cpumask_test_cpu(cpu, &idle_mask)) )
+    {
+        SCHED_STAT_CRANK(tickled_idle_cpu_excl);
+        __cpumask_set_cpu(cpu, &mask);
+        goto tickle;
+    }
+
+    /*
      * If the pcpu is idle, or there are no idlers and the new
      * vcpu is a higher priority than the old vcpu, run it here.
      *
@@ -524,6 +542,7 @@ static inline void __runq_tickle(struct csched_vcpu *new)
         }
     }
 
+ tickle:
     if ( !cpumask_empty(&mask) )
     {
         if ( unlikely(tb_init_done) )
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 9814072..3a1ecbb 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -1186,7 +1186,26 @@ runq_tickle(const struct scheduler *ops, struct csched2_vcpu *new, s_time_t now)
                 cpupool_domain_cpumask(new->vcpu->domain));
 
     /*
-     * First of all, consider idle cpus, checking if we can just
+     * Exclusive pinning is when a vcpu has hard-affinity with only one
+     * cpu, and there is no other vcpu that has hard-affinity with that
+     * same cpu. This is infrequent, but when it happens, it is for
+     * achieving the highest possible determinism, and the least
+     * possible overhead, for the vcpus in question.
+     *
+     * Try to identify the vast majority of these situations, and deal
+     * with them quickly.
+     */
+    if ( unlikely(cpumask_cycle(cpu, cpumask_scratch_cpu(cpu)) == cpu &&
+                  cpumask_test_cpu(cpu, &rqd->idle) &&
+                  !cpumask_test_cpu(cpu, &rqd->tickled)) )
+    {
+        SCHED_STAT_CRANK(tickled_idle_cpu_excl);
+        ipid = cpu;
+        goto tickle;
+    }
+
+    /*
+     * Afterwards, let's consider idle cpus, checking if we can just
      * re-use the pcpu where we were running before.
      *
      * If there are cores where all the siblings are idle, consider
diff --git a/xen/include/xen/perfc_defn.h b/xen/include/xen/perfc_defn.h
index 53849af..ad914dc 100644
--- a/xen/include/xen/perfc_defn.h
+++ b/xen/include/xen/perfc_defn.h
@@ -30,6 +30,7 @@ PERFCOUNTER(vcpu_wake_runnable,     "sched: vcpu_wake_runnable")
 PERFCOUNTER(vcpu_wake_not_runnable, "sched: vcpu_wake_not_runnable")
 PERFCOUNTER(tickled_no_cpu,         "sched: tickled_no_cpu")
 PERFCOUNTER(tickled_idle_cpu,       "sched: tickled_idle_cpu")
+PERFCOUNTER(tickled_idle_cpu_excl,  "sched: tickled_idle_cpu_exclusive")
 PERFCOUNTER(tickled_busy_cpu,       "sched: tickled_busy_cpu")
 PERFCOUNTER(vcpu_check,             "sched: vcpu_check")
 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/6] xen: credit2: allocate runqueue data structure dynamically
  2017-06-23 10:54 ` [PATCH 1/6] xen: credit2: allocate runqueue data structure dynamically Dario Faggioli
@ 2017-07-21 16:50   ` George Dunlap
  0 siblings, 0 replies; 18+ messages in thread
From: George Dunlap @ 2017-07-21 16:50 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel; +Cc: Anshul Makkar

On 06/23/2017 11:54 AM, Dario Faggioli wrote:
> Instead of keeping an NR_CPUS-sized array of csched2_runqueue_data
> elements directly inside the csched2_private structure, allocate
> it dynamically.
> 
> This has two positive effects:
> - it considerably reduces the size of csched2_private, which is
>   especially good in case there are multiple instances of Credit2
>   (in different cpupools), and is also good from the point
>   of view of fitting the struct into CPU caches;
> - we can use nr_cpu_ids as the array size, which may be significantly
>   smaller than NR_CPUS.
> 
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

Looks good, thanks:

Acked-by: George Dunlap <george.dunlap@citrix.com>



> ---
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Anshul Makkar <anshulmakkar@gmail.com>
> ---
>  xen/common/sched_credit2.c |   16 ++++++++++++----
>  1 file changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index 126417c..10d9488 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -385,7 +385,7 @@ struct csched2_private {
>  
>      int runq_map[NR_CPUS];
>      cpumask_t active_queues; /* Queues which may have active cpus */
> -    struct csched2_runqueue_data rqd[NR_CPUS];
> +    struct csched2_runqueue_data *rqd;
>  
>      unsigned int load_precision_shift;
>      unsigned int load_window_shift;
> @@ -3099,9 +3099,11 @@ csched2_init(struct scheduler *ops)
>      printk(XENLOG_INFO "load tracking window length %llu ns\n",
>             1ULL << opt_load_window_shift);
>  
> -    /* Basically no CPU information is available at this point; just
> +    /*
> +     * Basically no CPU information is available at this point; just
>       * set up basic structures, and a callback when the CPU info is
> -     * available. */
> +     * available.
> +     */
>  
>      prv = xzalloc(struct csched2_private);
>      if ( prv == NULL )
> @@ -3111,7 +3113,13 @@ csched2_init(struct scheduler *ops)
>      rwlock_init(&prv->lock);
>      INIT_LIST_HEAD(&prv->sdom);
>  
> -    /* But un-initialize all runqueues */
> +    /* Allocate all runqueues and mark them as un-initialized */
> +    prv->rqd = xzalloc_array(struct csched2_runqueue_data, nr_cpu_ids);
> +    if ( !prv->rqd )
> +    {
> +        xfree(prv);
> +        return -ENOMEM;
> +    }
>      for ( i = 0; i < nr_cpu_ids; i++ )
>      {
>          prv->runq_map[i] = -1;
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/6] xen: credit2: make the cpu to runqueue map per-cpu
  2017-06-23 10:54 ` [PATCH 2/6] xen: credit2: make the cpu to runqueue map per-cpu Dario Faggioli
@ 2017-07-21 16:56   ` George Dunlap
  0 siblings, 0 replies; 18+ messages in thread
From: George Dunlap @ 2017-07-21 16:56 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel; +Cc: Anshul Makkar

On 06/23/2017 11:54 AM, Dario Faggioli wrote:
> Instead of keeping an NR_CPUS-sized array of int-s
> directly inside csched2_private, use a per-cpu
> variable.
> 
> That's especially beneficial (in terms of saved
> memory) when there are multiple instances of Credit2
> (in different cpupools), and it also helps fitting
> csched2_private itself into CPU caches.
> 
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

Sounds good:

Acked-by: George Dunlap <george.dunlap@citrix.com>

> ---
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Anshul Makkar <anshulmakkar@gmail.com>
> ---
>  xen/common/sched_credit2.c |   33 ++++++++++++++++++++-------------
>  1 file changed, 20 insertions(+), 13 deletions(-)
> 
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index 10d9488..15862f2 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -383,7 +383,6 @@ struct csched2_private {
>      
>      struct list_head sdom; /* Used mostly for dump keyhandler. */
>  
> -    int runq_map[NR_CPUS];
>      cpumask_t active_queues; /* Queues which may have active cpus */
>      struct csched2_runqueue_data *rqd;
>  
> @@ -393,6 +392,14 @@ struct csched2_private {
>  };
>  
>  /*
> + * Physical CPU
> + *
> + * The only per-pCPU information we need to maintain is which runqueue
> + * each CPU is part of.
> + */
> +static DEFINE_PER_CPU(int, runq_map);
> +
> +/*
>   * Virtual CPU
>   */
>  struct csched2_vcpu {
> @@ -448,16 +455,16 @@ static inline struct csched2_dom *csched2_dom(const struct domain *d)
>  }
>  
>  /* CPU to runq_id macro */
> -static inline int c2r(const struct scheduler *ops, unsigned int cpu)
> +static inline int c2r(unsigned int cpu)
>  {
> -    return csched2_priv(ops)->runq_map[(cpu)];
> +    return per_cpu(runq_map, cpu);
>  }
>  
>  /* CPU to runqueue struct macro */
>  static inline struct csched2_runqueue_data *c2rqd(const struct scheduler *ops,
>                                                    unsigned int cpu)
>  {
> -    return &csched2_priv(ops)->rqd[c2r(ops, cpu)];
> +    return &csched2_priv(ops)->rqd[c2r(cpu)];
>  }
>  
>  /*
> @@ -1082,7 +1089,7 @@ runq_insert(const struct scheduler *ops, struct csched2_vcpu *svc)
>      ASSERT(spin_is_locked(per_cpu(schedule_data, cpu).schedule_lock));
>  
>      ASSERT(!vcpu_on_runq(svc));
> -    ASSERT(c2r(ops, cpu) == c2r(ops, svc->vcpu->processor));
> +    ASSERT(c2r(cpu) == c2r(svc->vcpu->processor));
>  
>      ASSERT(&svc->rqd->runq == runq);
>      ASSERT(!is_idle_vcpu(svc->vcpu));
> @@ -1733,7 +1740,7 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
>      if ( min_rqi == -1 )
>      {
>          new_cpu = get_fallback_cpu(svc);
> -        min_rqi = c2r(ops, new_cpu);
> +        min_rqi = c2r(new_cpu);
>          min_avgload = prv->rqd[min_rqi].b_avgload;
>          goto out_up;
>      }
> @@ -2622,7 +2629,7 @@ csched2_schedule(
>              unsigned tasklet:8, idle:8, smt_idle:8, tickled:8;
>          } d;
>          d.cpu = cpu;
> -        d.rq_id = c2r(ops, cpu);
> +        d.rq_id = c2r(cpu);
>          d.tasklet = tasklet_work_scheduled;
>          d.idle = is_idle_vcpu(current);
>          d.smt_idle = cpumask_test_cpu(cpu, &rqd->smt_idle);
> @@ -2783,7 +2790,7 @@ dump_pcpu(const struct scheduler *ops, int cpu)
>  #define cpustr keyhandler_scratch
>  
>      cpumask_scnprintf(cpustr, sizeof(cpustr), per_cpu(cpu_sibling_mask, cpu));
> -    printk("CPU[%02d] runq=%d, sibling=%s, ", cpu, c2r(ops, cpu), cpustr);
> +    printk("CPU[%02d] runq=%d, sibling=%s, ", cpu, c2r(cpu), cpustr);
>      cpumask_scnprintf(cpustr, sizeof(cpustr), per_cpu(cpu_core_mask, cpu));
>      printk("core=%s\n", cpustr);
>  
> @@ -2930,7 +2937,7 @@ init_pdata(struct csched2_private *prv, unsigned int cpu)
>      }
>      
>      /* Set the runqueue map */
> -    prv->runq_map[cpu] = rqi;
> +    per_cpu(runq_map, cpu) = rqi;
>      
>      __cpumask_set_cpu(cpu, &rqd->idle);
>      __cpumask_set_cpu(cpu, &rqd->active);
> @@ -3034,7 +3041,7 @@ csched2_deinit_pdata(const struct scheduler *ops, void *pcpu, int cpu)
>      ASSERT(!pcpu && cpumask_test_cpu(cpu, &prv->initialized));
>      
>      /* Find the old runqueue and remove this cpu from it */
> -    rqi = prv->runq_map[cpu];
> +    rqi = per_cpu(runq_map, cpu);
>  
>      rqd = prv->rqd + rqi;
>  
> @@ -3055,6 +3062,8 @@ csched2_deinit_pdata(const struct scheduler *ops, void *pcpu, int cpu)
>      else if ( rqd->pick_bias == cpu )
>          rqd->pick_bias = cpumask_first(&rqd->active);
>  
> +    per_cpu(runq_map, cpu) = -1;
> +
>      spin_unlock(&rqd->lock);
>  
>      __cpumask_clear_cpu(cpu, &prv->initialized);
> @@ -3121,10 +3130,8 @@ csched2_init(struct scheduler *ops)
>          return -ENOMEM;
>      }
>      for ( i = 0; i < nr_cpu_ids; i++ )
> -    {
> -        prv->runq_map[i] = -1;
>          prv->rqd[i].id = -1;
> -    }
> +
>      /* initialize ratelimit */
>      prv->ratelimit_us = sched_ratelimit_us;
>  
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/6] xen: credit: rearrange members of control structures
  2017-06-23 10:55 ` [PATCH 3/6] xen: credit: rearrange members of control structures Dario Faggioli
@ 2017-07-21 17:02   ` George Dunlap
  0 siblings, 0 replies; 18+ messages in thread
From: George Dunlap @ 2017-07-21 17:02 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel; +Cc: Anshul Makkar

On 06/23/2017 11:55 AM, Dario Faggioli wrote:
> Rearrange the members with the aim of improving memory
> size and layout, and, at the same time, of making related
> fields reside in the same cacheline.
> 
> Here's a summary of the output of `pahole`, with and
> without this patch, for the affected data structures.
> 
> csched_pcpu:
>  * Before:
>     size: 88, cachelines: 2, members: 6
>     sum members: 80, holes: 1, sum holes: 4
>     padding: 4
>     paddings: 1, sum paddings: 5
>     last cacheline: 24 bytes
>  * After:
>     size: 80, cachelines: 2, members: 6
>     paddings: 1, sum paddings: 5
>     last cacheline: 16 bytes
> 
> csched_vcpu:
>  * Before:
>     size: 72, cachelines: 2, members: 9
>     padding: 2
>     last cacheline: 8 bytes
>  * After:
>     same numbers, but some fields have been moved
>     to put related fields in the same cache line.
> 
> csched_private:
>  * Before:
>     size: 152, cachelines: 3, members: 17
>     sum members: 140, holes: 2, sum holes: 8
>     padding: 4
>     paddings: 1, sum paddings: 5
>     last cacheline: 24 bytes
>  * After:
>     same numbers, but some fields have been moved
>     to put related fields in the same cache line.
> 
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

Acked-by: George Dunlap <george.dunlap@citrix.com>

> ---
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Anshul Makkar <anshulmakkar@gmail.com>
> ---
>  xen/common/sched_credit.c |   41 ++++++++++++++++++++++++++---------------
>  1 file changed, 26 insertions(+), 15 deletions(-)
> 
> diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
> index efdf6bf..4f6330e 100644
> --- a/xen/common/sched_credit.c
> +++ b/xen/common/sched_credit.c
> @@ -169,10 +169,12 @@ integer_param("sched_credit_tslice_ms", sched_credit_tslice_ms);
>  struct csched_pcpu {
>      struct list_head runq;
>      uint32_t runq_sort_last;
> -    struct timer ticker;
> -    unsigned int tick;
> +
>      unsigned int idle_bias;
>      unsigned int nr_runnable;
> +
> +    unsigned int tick;
> +    struct timer ticker;
>  };
>  
>  /*
> @@ -181,13 +183,18 @@ struct csched_pcpu {
>  struct csched_vcpu {
>      struct list_head runq_elem;
>      struct list_head active_vcpu_elem;
> +
> +    /* Up-pointers */
>      struct csched_dom *sdom;
>      struct vcpu *vcpu;
> -    atomic_t credit;
> -    unsigned int residual;
> +
>      s_time_t start_time;   /* When we were scheduled (used for credit) */
>      unsigned flags;
> -    int16_t pri;
> +    int pri;
> +
> +    atomic_t credit;
> +    unsigned int residual;
> +
>  #ifdef CSCHED_STATS
>      struct {
>          int credit_last;
> @@ -219,21 +226,25 @@ struct csched_dom {
>  struct csched_private {
>      /* lock for the whole pluggable scheduler, nests inside cpupool_lock */
>      spinlock_t lock;
> -    struct list_head active_sdom;
> -    uint32_t ncpus;
> -    struct timer  master_ticker;
> -    unsigned int master;
> +
>      cpumask_var_t idlers;
>      cpumask_var_t cpus;
> +    uint32_t *balance_bias;
> +    uint32_t runq_sort;
> +    unsigned int ratelimit_us;
> +
> +    /* Period of master and tick in milliseconds */
> +    unsigned int tslice_ms, tick_period_us, ticks_per_tslice;
> +    uint32_t ncpus;
> +
> +    struct list_head active_sdom;
>      uint32_t weight;
>      uint32_t credit;
>      int credit_balance;
> -    uint32_t runq_sort;
> -    uint32_t *balance_bias;
> -    unsigned ratelimit_us;
> -    /* Period of master and tick in milliseconds */
> -    unsigned tslice_ms, tick_period_us, ticks_per_tslice;
> -    unsigned credits_per_tslice;
> +    unsigned int credits_per_tslice;
> +
> +    unsigned int master;
> +    struct timer master_ticker;
>  };
>  
>  static void csched_tick(void *_cpu);
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/6] xen: credit2: rearrange members of control structures
  2017-06-23 10:55 ` [PATCH 4/6] xen: credit2: " Dario Faggioli
@ 2017-07-21 17:05   ` George Dunlap
  2017-07-21 19:53     ` Dario Faggioli
  0 siblings, 1 reply; 18+ messages in thread
From: George Dunlap @ 2017-07-21 17:05 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel; +Cc: Anshul Makkar

On 06/23/2017 11:55 AM, Dario Faggioli wrote:
> Rearrange the members with the aim of improving memory
> size and layout, and, at the same time, of making related
> fields reside in the same cacheline.
> 
> Here's a summary of the output of `pahole`, with and
> without this patch, for the affected data structures.
> 
> csched2_runqueue_data:
>  * Before:
>     size: 216, cachelines: 4, members: 14
>     sum members: 208, holes: 2, sum holes: 8
>     last cacheline: 24 bytes
>  * After:
>     size: 208, cachelines: 4, members: 14
>     last cacheline: 16 bytes
> 
> csched2_private:
>  * Before:
>     size: 120, cachelines: 2, members: 8
>     sum members: 112, holes: 1, sum holes: 4
>     padding: 4
>     last cacheline: 56 bytes
>  * After:
>     size: 112, cachelines: 2, members: 8
>     last cacheline: 48 bytes
> 
> csched2_vcpu:
>  * Before:
>     size: 112, cachelines: 2, members: 14
>     sum members: 108, holes: 1, sum holes: 4
>     last cacheline: 48 bytes
>  * After:
>     size: 112, cachelines: 2, members: 14
>     padding: 4
>     last cacheline: 48 bytes
> 
> While there, improve the wording, style and alignment
> of comments too.
> 
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

I haven't taken a careful look at these; the idea sounds good and I'll
trust that you've taken a careful look at them:

Acked-by: George Dunlap <george.dunlap@citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/6] xen: RTDS: rearrange members of control structures
  2017-06-23 10:55 ` [PATCH 5/6] xen: RTDS: " Dario Faggioli
@ 2017-07-21 17:06   ` George Dunlap
  2017-07-21 17:51   ` Meng Xu
  1 sibling, 0 replies; 18+ messages in thread
From: George Dunlap @ 2017-07-21 17:06 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel; +Cc: George Dunlap, Meng Xu

On 06/23/2017 11:55 AM, Dario Faggioli wrote:
> Nothing changed in `pahole` output, in terms of holes
> and padding, but some fields have been moved, to put
> related members in the same cache line.
> 
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

Acked-by: George Dunlap <george.dunlap@citrix.com>

> ---
> Cc: Meng Xu <mengxu@cis.upenn.edu>
> Cc: George Dunlap <george.dunlap@eu.citrix.com>
> ---
>  xen/common/sched_rt.c |   13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
> index 1b30014..39f6bee 100644
> --- a/xen/common/sched_rt.c
> +++ b/xen/common/sched_rt.c
> @@ -171,11 +171,14 @@ static void repl_timer_handler(void *data);
>  struct rt_private {
>      spinlock_t lock;            /* the global coarse-grained lock */
>      struct list_head sdom;      /* list of availalbe domains, used for dump */
> +
>      struct list_head runq;      /* ordered list of runnable vcpus */
>      struct list_head depletedq; /* unordered list of depleted vcpus */
> +
> +    struct timer *repl_timer;   /* replenishment timer */
>      struct list_head replq;     /* ordered list of vcpus that need replenishment */
> +
>      cpumask_t tickled;          /* cpus been tickled */
> -    struct timer *repl_timer;   /* replenishment timer */
>  };
>  
>  /*
> @@ -185,10 +188,6 @@ struct rt_vcpu {
>      struct list_head q_elem;     /* on the runq/depletedq list */
>      struct list_head replq_elem; /* on the replenishment events list */
>  
> -    /* Up-pointers */
> -    struct rt_dom *sdom;
> -    struct vcpu *vcpu;
> -
>      /* VCPU parameters, in nanoseconds */
>      s_time_t period;
>      s_time_t budget;
> @@ -198,6 +197,10 @@ struct rt_vcpu {
>      s_time_t last_start;         /* last start time */
>      s_time_t cur_deadline;       /* current deadline for EDF */
>  
> +    /* Up-pointers */
> +    struct rt_dom *sdom;
> +    struct vcpu *vcpu;
> +
>      unsigned flags;              /* mark __RTDS_scheduled, etc.. */
>  };
>  
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 6/6] xen: sched: optimize exclusive pinning case (Credit1 & 2)
  2017-06-23 10:55 ` [PATCH 6/6] xen: sched: optimize exclusive pinning case (Credit1 & 2) Dario Faggioli
@ 2017-07-21 17:19   ` George Dunlap
  2017-07-21 19:55     ` Dario Faggioli
  0 siblings, 1 reply; 18+ messages in thread
From: George Dunlap @ 2017-07-21 17:19 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel; +Cc: Anshul Makkar

On 06/23/2017 11:55 AM, Dario Faggioli wrote:
> Exclusive pinning of vCPUs is sometimes used for
> achieving the highest level of determinism, and the
> least possible overhead, for the vCPUs in question.
> 
> Although static 1:1 pinning is not recommended for
> general use cases, optimizing the tickling code (of
> Credit1 and Credit2) is easy and cheap enough, so go
> for it.
> 
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> ---
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Anshul Makkar <anshulmakkar@gmail.com>
> ---
>  xen/common/sched_credit.c    |   19 +++++++++++++++++++
>  xen/common/sched_credit2.c   |   21 ++++++++++++++++++++-
>  xen/include/xen/perfc_defn.h |    1 +
>  3 files changed, 40 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
> index 4f6330e..85e014d 100644
> --- a/xen/common/sched_credit.c
> +++ b/xen/common/sched_credit.c
> @@ -429,6 +429,24 @@ static inline void __runq_tickle(struct csched_vcpu *new)
>      idlers_empty = cpumask_empty(&idle_mask);
>  
>      /*
> +     * Exclusive pinning is when a vcpu has hard-affinity with only one
> +     * cpu, and there is no other vcpu that has hard-affinity with that
> +     * same cpu. This is infrequent, but when it happens, it is for
> +     * achieving the highest possible determinism, and the least
> +     * possible overhead, for the vcpus in question.
> +     *
> +     * Try to identify the vast majority of these situations, and deal
> +     * with them quickly.
> +     */
> +    if ( unlikely(cpumask_cycle(cpu, new->vcpu->cpu_hard_affinity) == cpu &&

Won't this check entail a full "loop" of the cpumask?  It's cheap enough
if nr_cpu_ids is small; but don't we support (theoretically) 4096
logical cpus?

It seems like having a vcpu flag that identifies a vcpu as being pinned
would be a more efficient way to do this.  That way we could run this
check once whenever the hard affinity changed, rather than every time we
want to think about where to run this vcpu.

What do you think?

 -George
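
[ A minimal sketch of the flag-based alternative suggested above; the
  structure, field and hook names are hypothetical, not existing Xen
  API: the mask is walked once, when hard affinity changes, so the hot
  tickling path only tests a cached bit. ]

    /* Hypothetical, for illustration only. */
    struct sched_item {
        cpumask_t hard_affinity;
        bool excl_pinned;    /* cached: affinity holds exactly 1 CPU */
    };

    static void item_set_hard_affinity(struct sched_item *it,
                                       const cpumask_t *mask)
    {
        cpumask_copy(&it->hard_affinity, mask);
        /* Evaluated here once, instead of at every tickle. */
        it->excl_pinned = (cpumask_weight(mask) == 1);
    }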

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/6] xen: RTDS: rearrange members of control structures
  2017-06-23 10:55 ` [PATCH 5/6] xen: RTDS: " Dario Faggioli
  2017-07-21 17:06   ` George Dunlap
@ 2017-07-21 17:51   ` Meng Xu
  2017-07-21 19:51     ` Dario Faggioli
  1 sibling, 1 reply; 18+ messages in thread
From: Meng Xu @ 2017-07-21 17:51 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: George Dunlap, xen-devel

On Fri, Jun 23, 2017 at 6:55 AM, Dario Faggioli
<dario.faggioli@citrix.com> wrote:
>
> Nothing changed in `pahole` output, in terms of holes
> and padding, but some fields have been moved, to put
> related members in the same cache line.
>
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> ---
> Cc: Meng Xu <mengxu@cis.upenn.edu>
> Cc: George Dunlap <george.dunlap@eu.citrix.com>
> ---
>  xen/common/sched_rt.c |   13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
> index 1b30014..39f6bee 100644
> --- a/xen/common/sched_rt.c
> +++ b/xen/common/sched_rt.c
> @@ -171,11 +171,14 @@ static void repl_timer_handler(void *data);
>  struct rt_private {
>      spinlock_t lock;            /* the global coarse-grained lock */
>      struct list_head sdom;      /* list of availalbe domains, used for dump */
> +
>      struct list_head runq;      /* ordered list of runnable vcpus */
>      struct list_head depletedq; /* unordered list of depleted vcpus */
> +
> +    struct timer *repl_timer;   /* replenishment timer */
>      struct list_head replq;     /* ordered list of vcpus that need replenishment */
> +
>      cpumask_t tickled;          /* cpus been tickled */
> -    struct timer *repl_timer;   /* replenishment timer */
>  };
>
>  /*
> @@ -185,10 +188,6 @@ struct rt_vcpu {
>      struct list_head q_elem;     /* on the runq/depletedq list */
>      struct list_head replq_elem; /* on the replenishment events list */
>
> -    /* Up-pointers */
> -    struct rt_dom *sdom;
> -    struct vcpu *vcpu;
> -
>      /* VCPU parameters, in nanoseconds */
>      s_time_t period;
>      s_time_t budget;
> @@ -198,6 +197,10 @@ struct rt_vcpu {
>      s_time_t last_start;         /* last start time */
>      s_time_t cur_deadline;       /* current deadline for EDF */
>
> +    /* Up-pointers */
> +    struct rt_dom *sdom;
> +    struct vcpu *vcpu;
> +
>      unsigned flags;              /* mark __RTDS_scheduled, etc.. */
>  };
>

Reviewed-by: Meng Xu <mengxu@cis.upenn.edu>

BTW, Dario, I'm wondering if you used any tool to give hints about how
to arrange the fields in a structure, or whether you just did it manually?

Thanks,

Meng

-----------
Meng Xu
PhD Candidate in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/6] xen: RTDS: rearrange members of control structures
  2017-07-21 17:51   ` Meng Xu
@ 2017-07-21 19:51     ` Dario Faggioli
  0 siblings, 0 replies; 18+ messages in thread
From: Dario Faggioli @ 2017-07-21 19:51 UTC (permalink / raw)
  To: Meng Xu; +Cc: George Dunlap, xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 3319 bytes --]

On Fri, 2017-07-21 at 13:51 -0400, Meng Xu wrote:
> On Fri, Jun 23, 2017 at 6:55 AM, Dario Faggioli
> <dario.faggioli@citrix.com> wrote:
> > 
> > Nothing changed in `pahole` output, in terms of holes
> > and padding, but some fields have been moved, to put
> > related members in the same cache line.
> > 
> > Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> > ---
> > Cc: Meng Xu <mengxu@cis.upenn.edu>
> > Cc: George Dunlap <george.dunlap@eu.citrix.com>
> > ---
> >  xen/common/sched_rt.c |   13 ++++++++-----
> >  1 file changed, 8 insertions(+), 5 deletions(-)
> > 
> > diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
> > index 1b30014..39f6bee 100644
> > --- a/xen/common/sched_rt.c
> > +++ b/xen/common/sched_rt.c
> > @@ -171,11 +171,14 @@ static void repl_timer_handler(void *data);
> >  struct rt_private {
> >      spinlock_t lock;            /* the global coarse-grained lock
> > */
> >      struct list_head sdom;      /* list of availalbe domains, used
> > for dump */
> > +
> >      struct list_head runq;      /* ordered list of runnable vcpus
> > */
> >      struct list_head depletedq; /* unordered list of depleted
> > vcpus */
> > +
> > +    struct timer *repl_timer;   /* replenishment timer */
> >      struct list_head replq;     /* ordered list of vcpus that need
> > replenishment */
> > +
> >      cpumask_t tickled;          /* cpus been tickled */
> > -    struct timer *repl_timer;   /* replenishment timer */
> >  };
> > 
> >  /*
> > @@ -185,10 +188,6 @@ struct rt_vcpu {
> >      struct list_head q_elem;     /* on the runq/depletedq list */
> >      struct list_head replq_elem; /* on the replenishment events
> > list */
> > 
> > -    /* Up-pointers */
> > -    struct rt_dom *sdom;
> > -    struct vcpu *vcpu;
> > -
> >      /* VCPU parameters, in nanoseconds */
> >      s_time_t period;
> >      s_time_t budget;
> > @@ -198,6 +197,10 @@ struct rt_vcpu {
> >      s_time_t last_start;         /* last start time */
> >      s_time_t cur_deadline;       /* current deadline for EDF */
> > 
> > +    /* Up-pointers */
> > +    struct rt_dom *sdom;
> > +    struct vcpu *vcpu;
> > +
> >      unsigned flags;              /* mark __RTDS_scheduled, etc..
> > */
> >  };
> > 
> 
> Reviewed-by: Meng Xu <mengxu@cis.upenn.edu>
> 
> BTW, Dario, I'm wondering whether you used any tool to give hints about
> how to arrange the fields in a structure, or whether you just did it
> manually?
> 
I used pahole for figuring out the cache layout, but just that. So,
basically, I --manually-- moved the fields around, and checked the
result with pahole (and then did it again, and again. :-D).

TBH, the improvement for RTDS is probably not even noticeable, as we
access almost all the fields anyway. But it still makes sense, IMO.
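
To give an idea of what pahole shows, here is a minimal, self-contained
sketch (a toy struct, not one of the actual Xen ones) of the kind of
padding it reports, and of how reordering the members gets rid of it:

    #include <stdio.h>

    /* Poor ordering: each 1-byte member forces the compiler to insert
     * padding before the following 8-byte member (typical LP64 ABI). */
    struct bad {
        char a;   /* 1 byte, then a 7-byte hole */
        long x;   /* 8 bytes */
        char b;   /* 1 byte, then 7 bytes of tail padding */
    };            /* sizeof(struct bad) == 24 */

    /* Better ordering: big members first, small ones packed together. */
    struct good {
        long x;   /* 8 bytes */
        char a;   /* 1 byte */
        char b;   /* 1 byte, then 6 bytes of tail padding */
    };            /* sizeof(struct good) == 16 */

    int main(void)
    {
        printf("bad: %zu, good: %zu\n",
               sizeof(struct bad), sizeof(struct good));
        return 0;
    }

Building that with -g and running `pahole -C bad <binary>` prints the
holes and the cacheline boundaries explicitly, which is the kind of
output I kept iterating on.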

Thanks for the review,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/6] xen: credit2: rearrange members of control structures
  2017-07-21 17:05   ` George Dunlap
@ 2017-07-21 19:53     ` Dario Faggioli
  0 siblings, 0 replies; 18+ messages in thread
From: Dario Faggioli @ 2017-07-21 19:53 UTC (permalink / raw)
  To: George Dunlap, xen-devel; +Cc: Anshul Makkar

On Fri, 2017-07-21 at 18:05 +0100, George Dunlap wrote:
> On 06/23/2017 11:55 AM, Dario Faggioli wrote:
> > 
> > While there, improve the wording, style and alignment
> > of comments too.
> > 
> > Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> 
> I haven't taken a careful look at these; the idea sounds good and I'll
> trust that you've taken a careful look at them:
> 
Hehe... thanks! :-)

I've even done the whole thing twice. In fact, I was about to submit
the series, when I discovered that I had optimized the cache layout of
a debug build, and hence had to redo everything from the beginning! :-P
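
(For the record, the difference comes from config-dependent fields: in
a debug build some types grow extra members, which shifts the offsets
of everything that follows them. A toy illustration of the effect, not
the actual Xen types:

    #include <stdio.h>
    #include <stddef.h>

    /* A lock type that carries a debug-only field, as a stand-in for
     * what a debug build can do to a real structure's layout. */
    typedef struct {
        unsigned int tickets;
    #ifndef NDEBUG
        void *owner;      /* present only when NDEBUG is not defined */
    #endif
    } toy_spinlock_t;

    struct toy_private {
        toy_spinlock_t lock;
        unsigned long counter;
    };

    int main(void)
    {
        printf("counter @ offset %zu, sizeof %zu\n",
               offsetof(struct toy_private, counter),
               sizeof(struct toy_private));
        return 0;
    }

Compile it with and without -DNDEBUG and both numbers change, so a
layout tuned on one build can be wrong on the other.)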

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 6/6] xen: sched: optimize exclusive pinning case (Credit1 & 2)
  2017-07-21 17:19   ` George Dunlap
@ 2017-07-21 19:55     ` Dario Faggioli
  2017-07-21 20:30       ` George Dunlap
  0 siblings, 1 reply; 18+ messages in thread
From: Dario Faggioli @ 2017-07-21 19:55 UTC (permalink / raw)
  To: George Dunlap, xen-devel; +Cc: Anshul Makkar

On Fri, 2017-07-21 at 18:19 +0100, George Dunlap wrote:
> On 06/23/2017 11:55 AM, Dario Faggioli wrote:
> > diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
> > index 4f6330e..85e014d 100644
> > --- a/xen/common/sched_credit.c
> > +++ b/xen/common/sched_credit.c
> > @@ -429,6 +429,24 @@ static inline void __runq_tickle(struct csched_vcpu *new)
> >      idlers_empty = cpumask_empty(&idle_mask);
> >  
> >      /*
> > +     * Exclusive pinning is when a vcpu has hard-affinity with only one
> > +     * cpu, and there is no other vcpu that has hard-affinity with that
> > +     * same cpu. This is infrequent, but if it happens, it is for achieving
> > +     * the most possible determinism, and the least possible overhead, for
> > +     * the vcpus in question.
> > +     *
> > +     * Try to identify the vast majority of these situations, and deal
> > +     * with them quickly.
> > +     */
> > +    if ( unlikely(cpumask_cycle(cpu, new->vcpu->cpu_hard_affinity) == cpu &&
> 
> Won't this check entail a full "loop" of the cpumask?  It's cheap enough
> if nr_cpu_ids is small; but don't we support (theoretically) 4096
> logical cpus?
> 
> It seems like having a vcpu flag that identifies a vcpu as being pinned
> would be a more efficient way to do this.  That way we could run this
> check once whenever the hard affinity changed, rather than every time we
> want to think about where to run this vcpu.
> 
> What do you think?
> 
Right. We actually should get some help from the hardware (ffs &
friends)... but I think you're right. Implementing this with a flag, as
you're suggesting, is most likely better, and easy enough.

I'll go for that!
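
Something along these lines, I guess. Just a self-contained sketch of
the idea, with toy types and made-up names (the real cpumask API is of
course different):

    #include <stdbool.h>
    #include <stdio.h>

    /* Toy stand-in for a cpumask: one bit per cpu. */
    typedef unsigned long toy_cpumask_t;

    struct toy_vcpu {
        toy_cpumask_t hard_affinity;
        bool is_pinned;  /* cached: hard_affinity has exactly one bit set */
    };

    /* Recompute the flag once, in the (rare) affinity-change path... */
    static void toy_set_hard_affinity(struct toy_vcpu *v, toy_cpumask_t m)
    {
        v->hard_affinity = m;
        /* exactly one bit set <=> non-zero, and clearing the lowest
         * set bit leaves nothing */
        v->is_pinned = (m != 0) && ((m & (m - 1)) == 0);
    }

    /* ...so the (hot) tickle path tests a bool, instead of walking
     * the whole mask on every invocation. */
    static bool toy_tickle_fast_path(const struct toy_vcpu *v)
    {
        return v->is_pinned;
    }

    int main(void)
    {
        struct toy_vcpu v;

        toy_set_hard_affinity(&v, 1UL << 5);              /* pinned on cpu 5 */
        printf("pinned: %d\n", toy_tickle_fast_path(&v)); /* 1 */

        toy_set_hard_affinity(&v, (1UL << 5) | (1UL << 6));
        printf("pinned: %d\n", toy_tickle_fast_path(&v)); /* 0 */

        return 0;
    }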

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 6/6] xen: sched: optimize exclusive pinning case (Credit1 & 2)
  2017-07-21 19:55     ` Dario Faggioli
@ 2017-07-21 20:30       ` George Dunlap
  0 siblings, 0 replies; 18+ messages in thread
From: George Dunlap @ 2017-07-21 20:30 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Anshul Makkar

On Fri, Jul 21, 2017 at 8:55 PM, Dario Faggioli
<dario.faggioli@citrix.com> wrote:
> On Fri, 2017-07-21 at 18:19 +0100, George Dunlap wrote:
>> On 06/23/2017 11:55 AM, Dario Faggioli wrote:
>> > diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
>> > index 4f6330e..85e014d 100644
>> > --- a/xen/common/sched_credit.c
>> > +++ b/xen/common/sched_credit.c
>> > @@ -429,6 +429,24 @@ static inline void __runq_tickle(struct csched_vcpu *new)
>> >      idlers_empty = cpumask_empty(&idle_mask);
>> >
>> >      /*
>> > +     * Exclusive pinning is when a vcpu has hard-affinity with only one
>> > +     * cpu, and there is no other vcpu that has hard-affinity with that
>> > +     * same cpu. This is infrequent, but if it happens, it is for achieving
>> > +     * the most possible determinism, and the least possible overhead, for
>> > +     * the vcpus in question.
>> > +     *
>> > +     * Try to identify the vast majority of these situations, and deal
>> > +     * with them quickly.
>> > +     */
>> > +    if ( unlikely(cpumask_cycle(cpu, new->vcpu->cpu_hard_affinity) == cpu &&
>>
>> Won't this check entail a full "loop" of the cpumask?  It's cheap enough
>> if nr_cpu_ids is small; but don't we support (theoretically) 4096
>> logical cpus?
>>
>> It seems like having a vcpu flag that identifies a vcpu as being pinned
>> would be a more efficient way to do this.  That way we could run this
>> check once whenever the hard affinity changed, rather than every time we
>> want to think about where to run this vcpu.
>>
>> What do you think?
>>
> Right. We actually should get some help from the hardware (ffs &
> friends)... but I think you're right. Implementing this with a flag, as
> you're suggesting, is most likely better, and easy enough.
>
> I'll go for that!

Cool.  BTW I checked the first 5 in.

 -George

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2017-07-21 20:30 UTC | newest]

Thread overview: 18+ messages
-- links below jump to the message on this page --
2017-06-23 10:54 [PATCH 0/6] xen: sched: control structure memory layout optimizations Dario Faggioli
2017-06-23 10:54 ` [PATCH 1/6] xen: credit2: allocate runqueue data structure dynamically Dario Faggioli
2017-07-21 16:50   ` George Dunlap
2017-06-23 10:54 ` [PATCH 2/6] xen: credit2: make the cpu to runqueue map per-cpu Dario Faggioli
2017-07-21 16:56   ` George Dunlap
2017-06-23 10:55 ` [PATCH 3/6] xen: credit: rearrange members of control structures Dario Faggioli
2017-07-21 17:02   ` George Dunlap
2017-06-23 10:55 ` [PATCH 4/6] xen: credit2: " Dario Faggioli
2017-07-21 17:05   ` George Dunlap
2017-07-21 19:53     ` Dario Faggioli
2017-06-23 10:55 ` [PATCH 5/6] xen: RTDS: " Dario Faggioli
2017-07-21 17:06   ` George Dunlap
2017-07-21 17:51   ` Meng Xu
2017-07-21 19:51     ` Dario Faggioli
2017-06-23 10:55 ` [PATCH 6/6] xen: sched: optimize exclusive pinning case (Credit1 & 2) Dario Faggioli
2017-07-21 17:19   ` George Dunlap
2017-07-21 19:55     ` Dario Faggioli
2017-07-21 20:30       ` George Dunlap
