All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] The 'null' Scheduler
@ 2017-03-17 18:42 Dario Faggioli
  2017-03-17 18:42 ` [PATCH 1/3] xen: sched: introduce the 'null' semi-static scheduler Dario Faggioli
                   ` (3 more replies)
  0 siblings, 4 replies; 22+ messages in thread
From: Dario Faggioli @ 2017-03-17 18:42 UTC (permalink / raw)
  To: xen-devel
  Cc: Jonathan Davies, Wei Liu, Ian Jackson, George Dunlap,
	Marcus Granado, Stefano Stabellini, Julien Grall

Hello,

This patch series implements what I call the 'null' scheduler.

It's a very simple, very static scheduling posicy that always schedules the same vCPU(s) on the same pCPU(s). That's it.

If there are less vCPUs than pCPUs, some of the pCPUs are _always_ idle. If there are more, some vCPUs _never_ run.
That is not entirely true, as there is some logic to make sure that waiting to run vCPUs are executed, for instance, on a new pCPU that enters the cpupool, and things like that.

The demand for this cames from Xen on ARM people and the embedded world in general (Hey, Stefano! :-P), where it is not uncommon to have super static systems that perceives an advanced general purpose scheduler just as pure overhead.

As a matter of fact, this may turn out useful in less embedded scenario, like High Performace Computing (where, again, scheduling is, often, unnecessary overhead), but even some of our much more classic Xen use case (like consolidation, Cc-ing Jonathan and Marcus who said they were interested in it).

The scheduler is really simple, and especially the hot paths --i.e., sleep, wakeup and schedule-- are super lean and quick, in 99% of the cases. All the slightly more complicated logic for dealing with pCPUs coming and going from a cpupool that uses this scheduler resides in functions that handle insertion, removal and migration of vCPUs, which are only called when such configuration changes happens (so, typically, "offline", in most of the embedded usecases).

I implemented support for hard affinity in order to provide at least a rudimental interface for interacting with the scheduler and affect the placement (it's called assignment within the code) of vCPUs on pCPUs.

I've tested the scheduler both inside a cpupool (using both Credit1 and Credit2 as boot schedulers) and as default, choosing it at boot and using it for Dom0 and a few other domains. In the latter case, you probably want to limit the number of Dom0's vCPUs too, or there will be very few to experiment with! :-P

I haven't done any performance or overhead measurements so far, but I will soon enough.

I also consider this to be experimental, and I'll also write a feature document ASAP.

Thanks and Regards,
Dario
---
Dario Faggioli (3):
      xen: sched: introduce the 'null' semi-static scheduler
      xen: sched_null: support for hard affinity
      tools: sched: add support for 'null' scheduler

 docs/misc/xen-command-line.markdown |    2 
 tools/libxl/libxl.h                 |    6 
 tools/libxl/libxl_sched.c           |   24 +
 tools/libxl/libxl_types.idl         |    1 
 xen/common/Kconfig                  |   11 
 xen/common/Makefile                 |    1 
 xen/common/sched_null.c             |  837 +++++++++++++++++++++++++++++++++++
 xen/common/schedule.c               |    2 
 xen/include/public/domctl.h         |    1 
 9 files changed, 884 insertions(+), 1 deletion(-)
 create mode 100644 xen/common/sched_null.c
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/3] xen: sched: introduce the 'null' semi-static scheduler
  2017-03-17 18:42 [PATCH 0/3] The 'null' Scheduler Dario Faggioli
@ 2017-03-17 18:42 ` Dario Faggioli
  2017-03-20 23:21   ` Stefano Stabellini
  2017-03-27 10:31   ` George Dunlap
  2017-03-17 18:43 ` [PATCH 2/3] xen: sched_null: support for hard affinity Dario Faggioli
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 22+ messages in thread
From: Dario Faggioli @ 2017-03-17 18:42 UTC (permalink / raw)
  To: xen-devel
  Cc: Jonathan Davies, Julien Grall, Stefano Stabellini, George Dunlap,
	Marcus Granado

In cases where one is absolutely sure that there will be
less vCPUs than pCPUs, having to pay the cose, mostly in
terms of overhead, of an advanced scheduler may be not
desirable.

The simple scheduler implemented here could be a solution.
Here how it works:
 - each vCPU is statically assigned to a pCPU;
 - if there are pCPUs without any vCPU assigned, they
   stay idle (as in, the run their idle vCPU);
 - if there are vCPUs which are not assigned to any
   pCPU (e.g., because there are more vCPUs than pCPUs)
   they *don't* run, until they get assigned;
 - if a vCPU assigned to a pCPU goes away, one of the
   waiting to be assigned vCPU, if any, gets assigned
   to the pCPU and can run there.

This scheduler, therefore, if used in configurations
where every vCPUs can be assigned to a pCPU, guarantees
low overhead, low latency, and consistent performance.

If used as default scheduler, at Xen boot, it is
recommended to limit the number of Dom0 vCPUs (e.g., with
'dom0_max_vcpus=x'). Otherwise, all the pCPUs will have
one Dom0's vCPU assigned, and there won't be room for
running efficiently (if at all) any guest.

Target use cases are embedded and HPC, but it may well
be interesting also in circumnstances.

Kconfig and documentation are update accordingly.

While there, also document the availability of sched=rtds
as boot parameter, which apparently had been forgotten.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
---
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Jonathan Davies <Jonathan.Davies@citrix.com>
Cc: Marcus Granado <marcus.granado@citrix.com>
---
 docs/misc/xen-command-line.markdown |    2 
 xen/common/Kconfig                  |   11 
 xen/common/Makefile                 |    1 
 xen/common/sched_null.c             |  816 +++++++++++++++++++++++++++++++++++
 xen/common/schedule.c               |    2 
 xen/include/public/domctl.h         |    1 
 6 files changed, 832 insertions(+), 1 deletion(-)
 create mode 100644 xen/common/sched_null.c

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 4daf5b5..ad6a5ca 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1434,7 +1434,7 @@ Map the HPET page as read only in Dom0. If disabled the page will be mapped
 with read and write permissions.
 
 ### sched
-> `= credit | credit2 | arinc653`
+> `= credit | credit2 | arinc653 | rtds | null`
 
 > Default: `sched=credit`
 
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index f2ecbc4..518520e 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -187,6 +187,14 @@ config SCHED_ARINC653
 	  The ARINC653 scheduler is a hard real-time scheduler for single
 	  cores, targeted for avionics, drones, and medical devices.
 
+config SCHED_NULL
+	bool "Null scheduler support (EXPERIMENTAL)"
+	default y
+	---help---
+	  The null scheduler is a static, zero overhead scheduler,
+	  for when there always are less vCPUs than pCPUs, typically
+	  in embedded or HPC scenarios.
+
 choice
 	prompt "Default Scheduler?"
 	default SCHED_CREDIT_DEFAULT
@@ -199,6 +207,8 @@ choice
 		bool "RT Scheduler" if SCHED_RTDS
 	config SCHED_ARINC653_DEFAULT
 		bool "ARINC653 Scheduler" if SCHED_ARINC653
+	config SCHED_NULL_DEFAULT
+		bool "Null Scheduler" if SCHED_NULL
 endchoice
 
 config SCHED_DEFAULT
@@ -207,6 +217,7 @@ config SCHED_DEFAULT
 	default "credit2" if SCHED_CREDIT2_DEFAULT
 	default "rtds" if SCHED_RTDS_DEFAULT
 	default "arinc653" if SCHED_ARINC653_DEFAULT
+	default "null" if SCHED_NULL_DEFAULT
 	default "credit"
 
 endmenu
diff --git a/xen/common/Makefile b/xen/common/Makefile
index 0fed30b..26c5a64 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -40,6 +40,7 @@ obj-$(CONFIG_SCHED_ARINC653) += sched_arinc653.o
 obj-$(CONFIG_SCHED_CREDIT) += sched_credit.o
 obj-$(CONFIG_SCHED_CREDIT2) += sched_credit2.o
 obj-$(CONFIG_SCHED_RTDS) += sched_rt.o
+obj-$(CONFIG_SCHED_NULL) += sched_null.o
 obj-y += schedule.o
 obj-y += shutdown.o
 obj-y += softirq.o
diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c
new file mode 100644
index 0000000..6a13308
--- /dev/null
+++ b/xen/common/sched_null.c
@@ -0,0 +1,816 @@
+/*
+ * xen/common/sched_null.c
+ *
+ *  Copyright (c) 2017, Dario Faggioli, Citrix Ltd
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * The 'null' scheduler always choose to run, on each pCPU, either nothing
+ * (i.e., the pCPU stays idle) or always the same vCPU.
+ *
+ * It is aimed at supporting static scenarios, where there always are
+ * less vCPUs than pCPUs (and the vCPUs don't need to move among pCPUs
+ * for any reason) with the least possible overhead.
+ *
+ * Typical usecase are embedded applications, but also HPC, especially
+ * if the scheduler is used inside a cpupool.
+ */
+
+#include <xen/sched.h>
+#include <xen/sched-if.h>
+#include <xen/softirq.h>
+#include <xen/keyhandler.h>
+
+
+/*
+ * Locking:
+ * - Scheduler-lock (a.k.a. runqueue lock):
+ *  + is per-pCPU;
+ *  + serializes assignment and deassignment of vCPUs to a pCPU.
+ * - Private data lock (a.k.a. private scheduler lock):
+ *  + is scheduler-wide;
+ *  + serializes accesses to the list of domains in this scheduler.
+ * - Waitqueue lock:
+ *  + is scheduler-wide;
+ *  + serialize accesses to the list of vCPUs waiting to be assigned
+ *    to pCPUs.
+ *
+ * Ordering is: private lock, runqueue lock, waitqueue lock. Or, OTOH,
+ * waitqueue lock nests inside runqueue lock which nests inside private
+ * lock. More specifically:
+ *  + if we need both runqueue and private locks, we must acquire the
+ *    private lock for first;
+ *  + if we need both runqueue and waitqueue locks, we must acquire
+ *    the runqueue lock for first;
+ *  + if we need both private and waitqueue locks, we must acquire
+ *    the private lock for first;
+ *  + if we already own a runqueue lock, we must never acquire
+ *    the private lock;
+ *  + if we already own the waitqueue lock, we must never acquire
+ *    the runqueue lock or the private lock.
+ */
+
+/*
+ * System-wide private data
+ */
+struct null_private {
+    spinlock_t lock;        /* scheduler lock; nests inside cpupool_lock */
+    struct list_head ndom;  /* Domains of this scheduler                 */
+    struct list_head waitq; /* vCPUs not assigned to any pCPU            */
+    spinlock_t waitq_lock;  /* serializes waitq; nests inside runq locks */
+    cpumask_t cpus_free;    /* CPUs without a vCPU associated to them    */
+};
+
+/*
+ * Physical CPU
+ */
+struct null_pcpu {
+    struct vcpu *vcpu;
+};
+DEFINE_PER_CPU(struct null_pcpu, npc);
+
+/*
+ * Virtual CPU
+ */
+struct null_vcpu {
+    struct list_head waitq_elem;
+    struct null_dom *ndom;
+    struct vcpu *vcpu;
+    int pcpu;          /* To what pCPU the vCPU is assigned (-1 if none) */
+};
+
+/*
+ * Domain
+ */
+struct null_dom {
+    struct list_head ndom_elem;
+    struct domain *dom;
+};
+
+/*
+ * Accessor helpers functions
+ */
+static inline struct null_private *null_priv(const struct scheduler *ops)
+{
+    return ops->sched_data;
+}
+
+static inline struct null_vcpu *null_vcpu(const struct vcpu *v)
+{
+    return v->sched_priv;
+}
+
+static inline struct null_dom *null_dom(const struct domain *d)
+{
+    return d->sched_priv;
+}
+
+static int null_init(struct scheduler *ops)
+{
+    struct null_private *prv;
+
+    printk("Initializing null scheduler\n"
+           "WARNING: This is experimental software in development.\n"
+           "Use at your own risk.\n");
+
+    prv = xzalloc(struct null_private);
+    if ( prv == NULL )
+        return -ENOMEM;
+
+    spin_lock_init(&prv->lock);
+    spin_lock_init(&prv->waitq_lock);
+    INIT_LIST_HEAD(&prv->ndom);
+    INIT_LIST_HEAD(&prv->waitq);
+
+    ops->sched_data = prv;
+
+    return 0;
+}
+
+static void null_deinit(struct scheduler *ops)
+{
+    xfree(ops->sched_data);
+    ops->sched_data = NULL;
+}
+
+static void init_pdata(struct null_private *prv, unsigned int cpu)
+{
+    /* Mark the pCPU as free, and with no vCPU assigned */
+    cpumask_set_cpu(cpu, &prv->cpus_free);
+    per_cpu(npc, cpu).vcpu = NULL;
+}
+
+static void null_init_pdata(const struct scheduler *ops, void *pdata, int cpu)
+{
+    struct null_private *prv = null_priv(ops);
+    struct schedule_data *sd = &per_cpu(schedule_data, cpu);
+
+    /* alloc_pdata is not implemented, so we want this to be NULL. */
+    ASSERT(!pdata);
+
+    /*
+     * The scheduler lock points already to the default per-cpu spinlock,
+     * so there is no remapping to be done.
+     */
+    ASSERT(sd->schedule_lock == &sd->_lock && !spin_is_locked(&sd->_lock));
+
+    init_pdata(prv, cpu);
+}
+
+static void null_deinit_pdata(const struct scheduler *ops, void *pcpu, int cpu)
+{
+    struct null_private *prv = null_priv(ops);
+
+    /* alloc_pdata not implemented, so this must have stayed NULL */
+    ASSERT(!pcpu);
+
+    cpumask_clear_cpu(cpu, &prv->cpus_free);
+    per_cpu(npc, cpu).vcpu = NULL;
+}
+
+static void *null_alloc_vdata(const struct scheduler *ops,
+                              struct vcpu *v, void *dd)
+{
+    struct null_vcpu *nvc;
+
+    nvc = xzalloc(struct null_vcpu);
+    if ( nvc == NULL )
+        return NULL;
+
+    INIT_LIST_HEAD(&nvc->waitq_elem);
+
+    /* Not assigned to any pCPU */
+    nvc->pcpu = -1;
+    /* Up pointers */
+    nvc->ndom = dd;
+    nvc->vcpu = v;
+
+    SCHED_STAT_CRANK(vcpu_alloc);
+
+    return nvc;
+}
+
+static void null_free_vdata(const struct scheduler *ops, void *priv)
+{
+    struct null_vcpu *nvc = priv;
+
+    xfree(nvc);
+}
+
+static void * null_alloc_domdata(const struct scheduler *ops,
+                                 struct domain *d)
+{
+    struct null_private *prv = null_priv(ops);
+    struct null_dom *ndom;
+    unsigned long flags;
+
+    ndom = xzalloc(struct null_dom);
+    if ( ndom == NULL )
+        return NULL;
+
+    INIT_LIST_HEAD(&ndom->ndom_elem);
+    ndom->dom = d;
+
+    spin_lock_irqsave(&prv->lock, flags);
+    list_add_tail(&ndom->ndom_elem, &null_priv(ops)->ndom);
+    spin_unlock_irqrestore(&prv->lock, flags);
+
+    return (void*)ndom;
+}
+
+static void null_free_domdata(const struct scheduler *ops, void *data)
+{
+    unsigned long flags;
+    struct null_dom *ndom = data;
+    struct null_private *prv = null_priv(ops);
+
+    spin_lock_irqsave(&prv->lock, flags);
+    list_del_init(&ndom->ndom_elem);
+    spin_unlock_irqrestore(&prv->lock, flags);
+
+    xfree(data);
+}
+
+static int null_dom_init(const struct scheduler *ops, struct domain *d)
+{
+    struct null_dom *ndom;
+
+    if ( is_idle_domain(d) )
+        return 0;
+
+    ndom = null_alloc_domdata(ops, d);
+    if ( ndom == NULL )
+        return -ENOMEM;
+
+    d->sched_priv = ndom;
+
+    return 0;
+}
+static void null_dom_destroy(const struct scheduler *ops, struct domain *d)
+{
+    null_free_domdata(ops, null_dom(d));
+}
+
+/*
+ * vCPU to pCPU assignment and placement. This _only_ happens:
+ *  - on insert,
+ *  - on migrate.
+ *
+ * Insert occurs when a vCPU joins this scheduler for the first time
+ * (e.g., when the domain it's part of is moved to the scheduler's
+ * cpupool).
+ *
+ * Migration may be necessary if a pCPU (with a vCPU assigned to it)
+ * is removed from the scheduler's cpupool.
+ *
+ * So this is not part of any hot path.
+ */
+static unsigned int pick_cpu(struct null_private *prv, struct vcpu *v)
+{
+    unsigned int cpu = v->processor;
+    cpumask_t *cpus = cpupool_domain_cpumask(v->domain);
+
+    ASSERT(spin_is_locked(per_cpu(schedule_data, cpu).schedule_lock));
+
+    /*
+     * If our processor is free, or we are assigned to it, and it is
+     * also still valid, just go for it.
+     */
+    if ( likely((per_cpu(npc, cpu).vcpu == NULL || per_cpu(npc, cpu).vcpu == v)
+                && cpumask_test_cpu(cpu, cpus)) )
+        return cpu;
+
+    /* If not, just go for a valid free pCPU, if any */
+    cpumask_and(cpumask_scratch_cpu(cpu), &prv->cpus_free, cpus);
+    cpu = cpumask_first(cpumask_scratch_cpu(cpu));
+
+    /*
+     * If we didn't find any free pCPU, just pick any valid pcpu, even if
+     * it has another vCPU assigned. This will happen during shutdown and
+     * suspend/resume, but it may also happen during "normal operation", if
+     * all the pCPUs are busy.
+     *
+     * In fact, there must always be something sane in v->processor, or
+     * vcpu_schedule_lock() and friends won't work. This is not a problem,
+     * as we will actually assign the vCPU to the pCPU we return from here,
+     * only if the pCPU is free.
+     */
+    if ( unlikely(cpu == nr_cpu_ids) )
+        cpu = cpumask_any(cpus);
+
+    return cpu;
+}
+
+static void vcpu_assign(struct null_private *prv, struct vcpu *v,
+                        unsigned int cpu)
+{
+    ASSERT(null_vcpu(v)->pcpu == -1);
+
+    per_cpu(npc, cpu).vcpu = v;
+    v->processor = null_vcpu(v)->pcpu = cpu;
+    cpumask_clear_cpu(cpu, &prv->cpus_free);
+
+    gdprintk(XENLOG_INFO, "%d <-- d%dv%d\n", cpu, v->domain->domain_id, v->vcpu_id);
+}
+
+static void vcpu_deassign(struct null_private *prv, struct vcpu *v,
+                          unsigned int cpu)
+{
+    ASSERT(null_vcpu(v)->pcpu == cpu);
+
+    null_vcpu(v)->pcpu = -1;
+    per_cpu(npc, cpu).vcpu = NULL;
+    cpumask_set_cpu(cpu, &prv->cpus_free);
+
+    gdprintk(XENLOG_INFO, "%d <-- NULL (d%dv%d)\n", cpu, v->domain->domain_id, v->vcpu_id);
+}
+
+/* Change the scheduler of cpu to us (null). */
+static void null_switch_sched(struct scheduler *new_ops, unsigned int cpu,
+                              void *pdata, void *vdata)
+{
+    struct schedule_data *sd = &per_cpu(schedule_data, cpu);
+    struct null_private *prv = null_priv(new_ops);
+    struct null_vcpu *nvc = vdata;
+
+    ASSERT(nvc && is_idle_vcpu(nvc->vcpu));
+
+    idle_vcpu[cpu]->sched_priv = vdata;
+
+    /*
+     * We are holding the runqueue lock already (it's been taken in
+     * schedule_cpu_switch()). It actually may or may not be the 'right'
+     * one for this cpu, but that is ok for preventing races.
+     */
+    ASSERT(!local_irq_is_enabled());
+
+    init_pdata(prv, cpu);
+
+    per_cpu(scheduler, cpu) = new_ops;
+    per_cpu(schedule_data, cpu).sched_priv = pdata;
+
+    /*
+     * (Re?)route the lock to the per pCPU lock as /last/ thing. In fact,
+     * if it is free (and it can be) we want that anyone that manages
+     * taking it, finds all the initializations we've done above in place.
+     */
+    smp_mb();
+    sd->schedule_lock = &sd->_lock;
+}
+
+static void null_vcpu_insert(const struct scheduler *ops, struct vcpu *v)
+{
+    struct null_private *prv = null_priv(ops);
+    struct null_vcpu *nvc = null_vcpu(v);
+    unsigned int cpu;
+    spinlock_t *lock;
+
+    ASSERT(!is_idle_vcpu(v));
+
+ retry:
+    lock = vcpu_schedule_lock_irq(v);
+
+    cpu = pick_cpu(prv, v);
+
+    /* We hold v->processor's runq lock, but we need cpu's one */
+    if ( cpu != v->processor )
+    {
+        spin_unlock(lock);
+        lock = pcpu_schedule_lock(cpu);
+    }
+
+    /*
+     * If the pCPU is free, we assign v to it.
+     *
+     * If it is not free (e.g., because we raced with another insert
+     * or migrate), but there are free pCPUs, we try to pick again.
+     *
+     * If the pCPU is not free, and there aren't any (valid) others,
+     * we have no alternatives than to go into the waitqueue.
+     */
+    if ( likely(per_cpu(npc, cpu).vcpu == NULL) )
+    {
+        /*
+         * Insert is followed by vcpu_wake(), so there's no need to poke
+         * the pcpu with the SCHEDULE_SOFTIRQ, as wake will do that.
+         */
+        vcpu_assign(prv, v, cpu);
+    }
+    else if ( cpumask_intersects(&prv->cpus_free,
+                                 cpupool_domain_cpumask(v->domain)) )
+    {
+        spin_unlock(lock);
+        goto retry;
+    }
+    else
+    {
+        spin_lock(&prv->waitq_lock);
+        list_add_tail(&nvc->waitq_elem, &prv->waitq);
+        spin_unlock(&prv->waitq_lock);
+    }
+    spin_unlock_irq(lock);
+
+    SCHED_STAT_CRANK(vcpu_insert);
+}
+
+static void null_vcpu_remove(const struct scheduler *ops, struct vcpu *v)
+{
+    struct null_private *prv = null_priv(ops);
+    struct null_vcpu *wvc, *nvc = null_vcpu(v);
+    unsigned int cpu;
+    spinlock_t *lock;
+
+    ASSERT(!is_idle_vcpu(v));
+
+    lock = vcpu_schedule_lock_irq(v);
+
+    cpu = v->processor;
+
+    /* If v is in waitqueue, just get it out of there and bail */
+    if ( unlikely(nvc->pcpu == -1) )
+    {
+        spin_lock(&prv->waitq_lock);
+
+        ASSERT(!list_empty(&null_vcpu(v)->waitq_elem));
+        list_del_init(&nvc->waitq_elem);
+
+        spin_unlock(&prv->waitq_lock);
+
+        goto out;
+    }
+
+    /*
+     * If v is assigned to a pCPU, let's see if there is someone waiting.
+     * If yes, we assign it to cpu, in spite of v. If no, we just set
+     * cpu free.
+     */
+
+    ASSERT(per_cpu(npc, cpu).vcpu == v);
+    ASSERT(!cpumask_test_cpu(cpu, &prv->cpus_free));
+
+    spin_lock(&prv->waitq_lock);
+    wvc = list_first_entry_or_null(&prv->waitq, struct null_vcpu, waitq_elem);
+    if ( wvc )
+    {
+        vcpu_assign(prv, wvc->vcpu, cpu);
+        list_del_init(&wvc->waitq_elem);
+        cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
+    }
+    else
+    {
+        vcpu_deassign(prv, v, cpu);
+    }
+    spin_unlock(&prv->waitq_lock);
+
+ out:
+    vcpu_schedule_unlock_irq(lock, v);
+
+    SCHED_STAT_CRANK(vcpu_remove);
+}
+
+static void null_vcpu_wake(const struct scheduler *ops, struct vcpu *v)
+{
+    ASSERT(!is_idle_vcpu(v));
+
+    if ( unlikely(curr_on_cpu(v->processor) == v) )
+    {
+        SCHED_STAT_CRANK(vcpu_wake_running);
+        return;
+    }
+
+    if ( null_vcpu(v)->pcpu == -1 )
+    {
+	/* Not exactly "on runq", but close enough for reusing the counter */
+        SCHED_STAT_CRANK(vcpu_wake_onrunq);
+	return;
+    }
+
+    if ( likely(vcpu_runnable(v)) )
+        SCHED_STAT_CRANK(vcpu_wake_runnable);
+    else
+        SCHED_STAT_CRANK(vcpu_wake_not_runnable);
+
+    /* Note that we get here only for vCPUs assigned to a pCPU */
+    cpu_raise_softirq(v->processor, SCHEDULE_SOFTIRQ);
+}
+
+static void null_vcpu_sleep(const struct scheduler *ops, struct vcpu *v)
+{
+    ASSERT(!is_idle_vcpu(v));
+
+    /* If v is not assigned to a pCPU, or is not running, no need to bother */
+    if ( curr_on_cpu(v->processor) == v )
+        cpu_raise_softirq(v->processor, SCHEDULE_SOFTIRQ);
+
+    SCHED_STAT_CRANK(vcpu_sleep);
+}
+
+static int null_cpu_pick(const struct scheduler *ops, struct vcpu *v)
+{
+    ASSERT(!is_idle_vcpu(v));
+    return pick_cpu(null_priv(ops), v);
+}
+
+static void null_vcpu_migrate(const struct scheduler *ops, struct vcpu *v,
+                              unsigned int new_cpu)
+{
+    struct null_private *prv = null_priv(ops);
+    struct null_vcpu *nvc = null_vcpu(v);
+    unsigned int cpu = v->processor;
+
+    ASSERT(!is_idle_vcpu(v));
+
+    if ( v->processor == new_cpu )
+        return;
+
+    /*
+     * v is either in the waitqueue, or assigned to a pCPU.
+     *
+     * In the former case, there is nothing to do.
+     *
+     * In the latter, the pCPU to which it was assigned would become free,
+     * and we, therefore, should check whether there is anyone in the
+     * waitqueue that can be assigned to it.
+     */
+    if ( likely(nvc->pcpu != -1) )
+    {
+        struct null_vcpu *wvc;
+
+        spin_lock(&prv->waitq_lock);
+        wvc = list_first_entry_or_null(&prv->waitq, struct null_vcpu, waitq_elem);
+        if ( wvc && cpumask_test_cpu(cpu, cpupool_domain_cpumask(v->domain)) )
+        {
+            vcpu_assign(prv, wvc->vcpu, cpu);
+            list_del_init(&wvc->waitq_elem);
+            cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
+        }
+        else
+        {
+            vcpu_deassign(prv, v, cpu);
+        }
+        spin_unlock(&prv->waitq_lock);
+
+	SCHED_STAT_CRANK(migrate_running);
+    }
+    else
+        SCHED_STAT_CRANK(migrate_on_runq);
+
+    SCHED_STAT_CRANK(migrated);
+
+    /*
+     * Let's now consider new_cpu, which is where v is being sent. It can be
+     * either free, or have a vCPU already assigned to it.
+     *
+     * In the former case, we should assign v to it, and try to get it to run.
+     *
+     * In latter, all we can do is to park v in the waitqueue.
+     */
+    if ( per_cpu(npc, new_cpu).vcpu == NULL )
+    {
+        /* We don't know whether v was in the waitqueue. If yes, remove it */
+        spin_lock(&prv->waitq_lock);
+        list_del_init(&nvc->waitq_elem);
+        spin_unlock(&prv->waitq_lock);
+
+        vcpu_assign(prv, v, new_cpu);
+    }
+    else
+    {
+        /* We don't know whether v was in the waitqueue. If no, put it there */
+        spin_lock(&prv->waitq_lock);
+        if ( list_empty(&nvc->waitq_elem) )
+        {
+            list_add_tail(&nvc->waitq_elem, &prv->waitq);
+            nvc->pcpu = -1;
+        }
+        else
+            ASSERT(nvc->pcpu == -1);
+        spin_unlock(&prv->waitq_lock);
+    }
+
+    /*
+     * Whatever all the above, we always at least override v->processor.
+     * This is especially important for shutdown or suspend/resume paths,
+     * when it is important to let our caller (cpu_disable_scheduler())
+     * know that the migration did happen, to the best of our possibilities,
+     * at least. In case of suspend, any temporary inconsistency caused
+     * by this, will be fixed-up during resume.
+     */
+    v->processor = new_cpu;
+}
+
+#ifndef NDEBUG
+static inline void null_vcpu_check(struct vcpu *v)
+{
+    struct null_vcpu * const nvc = null_vcpu(v);
+    struct null_dom * const ndom = nvc->ndom;
+
+    BUG_ON(nvc->vcpu != v);
+    BUG_ON(ndom != null_dom(v->domain));
+    if ( ndom )
+    {
+        BUG_ON(is_idle_vcpu(v));
+        BUG_ON(ndom->dom != v->domain);
+    }
+    else
+    {
+        BUG_ON(!is_idle_vcpu(v));
+    }
+    SCHED_STAT_CRANK(vcpu_check);
+}
+#define NULL_VCPU_CHECK(v)  (null_vcpu_check(v))
+#else
+#define NULL_VCPU_CHECK(v)
+#endif
+
+
+/*
+ * The most simple scheduling function of all times! We either return:
+ *  - the vCPU assigned to the pCPU, if there's one and it can run;
+ *  - the idle vCPU, otherwise.
+ */
+static struct task_slice null_schedule(const struct scheduler *ops,
+                                       s_time_t now,
+                                       bool_t tasklet_work_scheduled)
+{
+    const unsigned int cpu = smp_processor_id();
+    struct null_private *prv = null_priv(ops);
+    struct null_vcpu *wvc;
+    struct task_slice ret;
+
+    SCHED_STAT_CRANK(schedule);
+    NULL_VCPU_CHECK(current);
+
+    ret.task = per_cpu(npc, cpu).vcpu;
+    ret.migrated = 0;
+    ret.time = -1;
+
+    /*
+     * We may be new in the cpupool, or just coming back online. In which
+     * case, there may be vCPUs in the waitqueue that we can assign to us
+     * and run.
+     */
+    if ( unlikely(ret.task == NULL) )
+    {
+        spin_lock(&prv->waitq_lock);
+        wvc = list_first_entry_or_null(&prv->waitq, struct null_vcpu, waitq_elem);
+        if ( wvc )
+        {
+            vcpu_assign(prv, wvc->vcpu, cpu);
+            list_del_init(&wvc->waitq_elem);
+            ret.task = wvc->vcpu;
+        }
+        spin_unlock(&prv->waitq_lock);
+    }
+
+    if ( unlikely(tasklet_work_scheduled ||
+                  ret.task == NULL ||
+                  !vcpu_runnable(ret.task)) )
+        ret.task = idle_vcpu[cpu];
+
+    NULL_VCPU_CHECK(ret.task);
+    return ret;
+}
+
+static inline void dump_vcpu(struct null_private *prv, struct null_vcpu *nvc)
+{
+    printk("[%i.%i] pcpu=%d", nvc->vcpu->domain->domain_id,
+            nvc->vcpu->vcpu_id, nvc->pcpu);
+}
+
+static void null_dump_pcpu(const struct scheduler *ops, int cpu)
+{
+    struct null_private *prv = null_priv(ops);
+    struct null_vcpu *nvc;
+    spinlock_t *lock;
+    unsigned long flags;
+#define cpustr keyhandler_scratch
+
+    lock = pcpu_schedule_lock_irqsave(cpu, &flags);
+
+    cpumask_scnprintf(cpustr, sizeof(cpustr), per_cpu(cpu_sibling_mask, cpu));
+    printk("CPU[%02d] sibling=%s, ", cpu, cpustr);
+    cpumask_scnprintf(cpustr, sizeof(cpustr), per_cpu(cpu_core_mask, cpu));
+    printk("core=%s", cpustr);
+    if ( per_cpu(npc, cpu).vcpu != NULL )
+        printk(", vcpu=d%dv%d", per_cpu(npc, cpu).vcpu->domain->domain_id,
+               per_cpu(npc, cpu).vcpu->vcpu_id);
+    printk("\n");
+
+    /* current VCPU (nothing to say if that's the idle vcpu) */
+    nvc = null_vcpu(curr_on_cpu(cpu));
+    if ( nvc && !is_idle_vcpu(nvc->vcpu) )
+    {
+        printk("\trun: ");
+        dump_vcpu(prv, nvc);
+        printk("\n");
+    }
+
+    pcpu_schedule_unlock_irqrestore(lock, flags, cpu);
+#undef cpustr
+}
+
+static void null_dump(const struct scheduler *ops)
+{
+    struct null_private *prv = null_priv(ops);
+    struct list_head *iter;
+    unsigned long flags;
+    unsigned int loop;
+#define cpustr keyhandler_scratch
+
+    spin_lock_irqsave(&prv->lock, flags);
+
+    cpulist_scnprintf(cpustr, sizeof(cpustr), &prv->cpus_free);
+    printk("\tcpus_free = %s\n", cpustr);
+
+    printk("Domain info:\n");
+    loop = 0;
+    list_for_each( iter, &prv->ndom )
+    {
+        struct null_dom *ndom;
+        struct vcpu *v;
+
+        ndom = list_entry(iter, struct null_dom, ndom_elem);
+
+        printk("\tDomain: %d\n", ndom->dom->domain_id);
+        for_each_vcpu( ndom->dom, v )
+        {
+            struct null_vcpu * const nvc = null_vcpu(v);
+            spinlock_t *lock;
+
+            lock = vcpu_schedule_lock(nvc->vcpu);
+
+            printk("\t%3d: ", ++loop);
+            dump_vcpu(prv, nvc);
+            printk("\n");
+
+            vcpu_schedule_unlock(lock, nvc->vcpu);
+        }
+    }
+
+    printk("Waitqueue: ");
+    loop = 0;
+    spin_lock(&prv->waitq_lock);
+    list_for_each( iter, &prv->waitq )
+    {
+        struct null_vcpu *nvc = list_entry(iter, struct null_vcpu, waitq_elem);
+
+        if ( loop++ != 0 )
+            printk(", ");
+        if ( loop % 24 == 0 )
+            printk("\n\t");
+        printk("d%dv%d", nvc->vcpu->domain->domain_id, nvc->vcpu->vcpu_id);
+    }
+    printk("\n");
+    spin_unlock(&prv->waitq_lock);
+
+    spin_unlock_irqrestore(&prv->lock, flags);
+#undef cpustr
+}
+
+const struct scheduler sched_null_def = {
+    .name           = "null Scheduler",
+    .opt_name       = "null",
+    .sched_id       = XEN_SCHEDULER_NULL,
+    .sched_data     = NULL,
+
+    .init           = null_init,
+    .deinit         = null_deinit,
+    .init_pdata     = null_init_pdata,
+    .switch_sched   = null_switch_sched,
+    .deinit_pdata   = null_deinit_pdata,
+
+    .alloc_vdata    = null_alloc_vdata,
+    .free_vdata     = null_free_vdata,
+    .alloc_domdata  = null_alloc_domdata,
+    .free_domdata   = null_free_domdata,
+
+    .init_domain    = null_dom_init,
+    .destroy_domain = null_dom_destroy,
+
+    .insert_vcpu    = null_vcpu_insert,
+    .remove_vcpu    = null_vcpu_remove,
+
+    .wake           = null_vcpu_wake,
+    .sleep          = null_vcpu_sleep,
+    .pick_cpu       = null_cpu_pick,
+    .migrate        = null_vcpu_migrate,
+    .do_schedule    = null_schedule,
+
+    .dump_cpu_state = null_dump_pcpu,
+    .dump_settings  = null_dump,
+};
+
+REGISTER_SCHEDULER(sched_null_def);
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 223a120..b482037 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -1785,6 +1785,8 @@ int schedule_cpu_switch(unsigned int cpu, struct cpupool *c)
 
  out:
     per_cpu(cpupool, cpu) = c;
+    /* Trigger a reschedule so the CPU can pick up some work ASAP. */
+    cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
 
     return 0;
 }
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 85cbb7c..32b578d 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -352,6 +352,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_domctl_max_vcpus_t);
 #define XEN_SCHEDULER_CREDIT2  6
 #define XEN_SCHEDULER_ARINC653 7
 #define XEN_SCHEDULER_RTDS     8
+#define XEN_SCHEDULER_NULL     9
 
 typedef struct xen_domctl_sched_credit {
     uint16_t weight;


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 2/3] xen: sched_null: support for hard affinity
  2017-03-17 18:42 [PATCH 0/3] The 'null' Scheduler Dario Faggioli
  2017-03-17 18:42 ` [PATCH 1/3] xen: sched: introduce the 'null' semi-static scheduler Dario Faggioli
@ 2017-03-17 18:43 ` Dario Faggioli
  2017-03-20 23:46   ` Stefano Stabellini
  2017-03-17 18:43 ` [PATCH 3/3] tools: sched: add support for 'null' scheduler Dario Faggioli
  2017-03-20 22:23 ` [PATCH 0/3] The 'null' Scheduler Stefano Stabellini
  3 siblings, 1 reply; 22+ messages in thread
From: Dario Faggioli @ 2017-03-17 18:43 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Jonathan Davies, Julien Grall, George Dunlap,
	Marcus Granado

As a (rudimental) way of directing and affecting the
placement logic implemented by the scheduler, support
vCPU hard affinity.

Basically, a vCPU will now be assigned only to a pCPU
that is part of its own hard affinity. If such pCPU(s)
is (are) busy, the vCPU will wait, like it happens
when there are no free pCPUs.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
---
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Stefano Stabellini <stefano@aporeto.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Jonathan Davies <Jonathan.Davies@citrix.com>
Cc: Marcus Granado <marcus.granado@citrix.com>
---
 xen/common/sched_null.c |   43 ++++++++++++++++++++++++++++++++-----------
 1 file changed, 32 insertions(+), 11 deletions(-)

diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c
index 6a13308..ea055f1 100644
--- a/xen/common/sched_null.c
+++ b/xen/common/sched_null.c
@@ -117,6 +117,14 @@ static inline struct null_dom *null_dom(const struct domain *d)
     return d->sched_priv;
 }
 
+static inline bool check_nvc_affinity(struct null_vcpu *nvc, unsigned int cpu)
+{
+    cpumask_and(cpumask_scratch_cpu(cpu), nvc->vcpu->cpu_hard_affinity,
+                cpupool_domain_cpumask(nvc->vcpu->domain));
+
+    return cpumask_test_cpu(cpu, cpumask_scratch_cpu(cpu));
+}
+
 static int null_init(struct scheduler *ops)
 {
     struct null_private *prv;
@@ -284,16 +292,20 @@ static unsigned int pick_cpu(struct null_private *prv, struct vcpu *v)
 
     ASSERT(spin_is_locked(per_cpu(schedule_data, cpu).schedule_lock));
 
+    cpumask_and(cpumask_scratch_cpu(cpu), v->cpu_hard_affinity, cpus);
+
     /*
      * If our processor is free, or we are assigned to it, and it is
-     * also still valid, just go for it.
+     * also still valid and part of our affinity, just go for it.
      */
     if ( likely((per_cpu(npc, cpu).vcpu == NULL || per_cpu(npc, cpu).vcpu == v)
-                && cpumask_test_cpu(cpu, cpus)) )
+                && cpumask_test_cpu(cpu, cpumask_scratch_cpu(cpu))) )
         return cpu;
 
-    /* If not, just go for a valid free pCPU, if any */
+    /* If not, just go for a free pCPU, within our affinity, if any */
     cpumask_and(cpumask_scratch_cpu(cpu), &prv->cpus_free, cpus);
+    cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu),
+                v->cpu_hard_affinity);
     cpu = cpumask_first(cpumask_scratch_cpu(cpu));
 
     /*
@@ -308,7 +320,10 @@ static unsigned int pick_cpu(struct null_private *prv, struct vcpu *v)
      * only if the pCPU is free.
      */
     if ( unlikely(cpu == nr_cpu_ids) )
-        cpu = cpumask_any(cpus);
+    {
+        cpumask_and(cpumask_scratch_cpu(cpu), cpus, v->cpu_hard_affinity);
+        cpu = cpumask_any(cpumask_scratch_cpu(cpu));
+    }
 
     return cpu;
 }
@@ -391,6 +406,9 @@ static void null_vcpu_insert(const struct scheduler *ops, struct vcpu *v)
         lock = pcpu_schedule_lock(cpu);
     }
 
+    cpumask_and(cpumask_scratch_cpu(cpu), v->cpu_hard_affinity,
+		cpupool_domain_cpumask(v->domain));
+
     /*
      * If the pCPU is free, we assign v to it.
      *
@@ -408,8 +426,7 @@ static void null_vcpu_insert(const struct scheduler *ops, struct vcpu *v)
          */
         vcpu_assign(prv, v, cpu);
     }
-    else if ( cpumask_intersects(&prv->cpus_free,
-                                 cpupool_domain_cpumask(v->domain)) )
+    else if ( cpumask_intersects(&prv->cpus_free, cpumask_scratch_cpu(cpu)) )
     {
         spin_unlock(lock);
         goto retry;
@@ -462,7 +479,7 @@ static void null_vcpu_remove(const struct scheduler *ops, struct vcpu *v)
 
     spin_lock(&prv->waitq_lock);
     wvc = list_first_entry_or_null(&prv->waitq, struct null_vcpu, waitq_elem);
-    if ( wvc )
+    if ( wvc && cpumask_test_cpu(cpu, cpumask_scratch_cpu(cpu)) )
     {
         vcpu_assign(prv, wvc->vcpu, cpu);
         list_del_init(&wvc->waitq_elem);
@@ -550,7 +567,7 @@ static void null_vcpu_migrate(const struct scheduler *ops, struct vcpu *v,
 
         spin_lock(&prv->waitq_lock);
         wvc = list_first_entry_or_null(&prv->waitq, struct null_vcpu, waitq_elem);
-        if ( wvc && cpumask_test_cpu(cpu, cpupool_domain_cpumask(v->domain)) )
+        if ( wvc && check_nvc_affinity(wvc, cpu) )
         {
             vcpu_assign(prv, wvc->vcpu, cpu);
             list_del_init(&wvc->waitq_elem);
@@ -573,11 +590,15 @@ static void null_vcpu_migrate(const struct scheduler *ops, struct vcpu *v,
      * Let's now consider new_cpu, which is where v is being sent. It can be
      * either free, or have a vCPU already assigned to it.
      *
-     * In the former case, we should assign v to it, and try to get it to run.
+     * In the former case, we should assign v to it, and try to get it to run,
+     * if possible, according to affinity.
      *
      * In latter, all we can do is to park v in the waitqueue.
      */
-    if ( per_cpu(npc, new_cpu).vcpu == NULL )
+    cpumask_and(cpumask_scratch_cpu(cpu), cpupool_domain_cpumask(v->domain),
+                nvc->vcpu->cpu_hard_affinity);
+    if ( per_cpu(npc, new_cpu).vcpu == NULL &&
+         cpumask_test_cpu(new_cpu, cpumask_scratch_cpu(cpu)) )
     {
         /* We don't know whether v was in the waitqueue. If yes, remove it */
         spin_lock(&prv->waitq_lock);
@@ -666,7 +687,7 @@ static struct task_slice null_schedule(const struct scheduler *ops,
     {
         spin_lock(&prv->waitq_lock);
         wvc = list_first_entry_or_null(&prv->waitq, struct null_vcpu, waitq_elem);
-        if ( wvc )
+        if ( wvc && check_nvc_affinity(wvc, cpu) )
         {
             vcpu_assign(prv, wvc->vcpu, cpu);
             list_del_init(&wvc->waitq_elem);


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 3/3] tools: sched: add support for 'null' scheduler
  2017-03-17 18:42 [PATCH 0/3] The 'null' Scheduler Dario Faggioli
  2017-03-17 18:42 ` [PATCH 1/3] xen: sched: introduce the 'null' semi-static scheduler Dario Faggioli
  2017-03-17 18:43 ` [PATCH 2/3] xen: sched_null: support for hard affinity Dario Faggioli
@ 2017-03-17 18:43 ` Dario Faggioli
  2017-03-20 22:28   ` Stefano Stabellini
                     ` (2 more replies)
  2017-03-20 22:23 ` [PATCH 0/3] The 'null' Scheduler Stefano Stabellini
  3 siblings, 3 replies; 22+ messages in thread
From: Dario Faggioli @ 2017-03-17 18:43 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Ian Jackson, Julien Grall, Wei Liu, George Dunlap

It being very very basic, also means this scheduler does
not need much support at the tools level (for now).

Basically, just the definition of the symbol of the
scheduler itself and a couple of stubs.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
---
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Stefano Stabellini <stefano@aporeto.com>
Cc: Julien Grall <julien.grall@arm.com>
---
 tools/libxl/libxl.h         |    6 ++++++
 tools/libxl/libxl_sched.c   |   24 ++++++++++++++++++++++++
 tools/libxl/libxl_types.idl |    1 +
 3 files changed, 31 insertions(+)

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 4c60e8f..5adac2d 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -210,6 +210,12 @@
 #define LIBXL_HAVE_SCHED_RTDS 1
 
 /*
+ * LIBXL_HAVE_SCHED_NULL indicates that the 'null' static scheduler
+ * is available.
+ */
+#define LIBXL_HAVE_SCHED_NULL 1
+
+/*
  * libxl_domain_build_info has u.hvm.viridian_enable and _disable bitmaps
  * of the specified width.
  */
diff --git a/tools/libxl/libxl_sched.c b/tools/libxl/libxl_sched.c
index 84d3837..d44fbe1 100644
--- a/tools/libxl/libxl_sched.c
+++ b/tools/libxl/libxl_sched.c
@@ -178,6 +178,20 @@ static int sched_arinc653_domain_set(libxl__gc *gc, uint32_t domid,
     return 0;
 }
 
+static int sched_null_domain_set(libxl__gc *gc, uint32_t domid,
+                                 const libxl_domain_sched_params *scinfo)
+{
+    /* The null scheduler doesn't take any domain-specific parameters. */
+    return 0;
+}
+
+static int sched_null_domain_get(libxl__gc *gc, uint32_t domid,
+                               libxl_domain_sched_params *scinfo)
+{
+    /* The null scheduler doesn't have any domain-specific parameters. */
+    return ERROR_INVAL;
+}
+
 static int sched_credit_domain_get(libxl__gc *gc, uint32_t domid,
                                    libxl_domain_sched_params *scinfo)
 {
@@ -730,6 +744,9 @@ int libxl_domain_sched_params_set(libxl_ctx *ctx, uint32_t domid,
     case LIBXL_SCHEDULER_RTDS:
         ret=sched_rtds_domain_set(gc, domid, scinfo);
         break;
+    case LIBXL_SCHEDULER_NULL:
+        ret=sched_null_domain_set(gc, domid, scinfo);
+        break;
     default:
         LOGD(ERROR, domid, "Unknown scheduler");
         ret=ERROR_INVAL;
@@ -758,6 +775,7 @@ int libxl_vcpu_sched_params_set(libxl_ctx *ctx, uint32_t domid,
     case LIBXL_SCHEDULER_CREDIT:
     case LIBXL_SCHEDULER_CREDIT2:
     case LIBXL_SCHEDULER_ARINC653:
+    case LIBXL_SCHEDULER_NULL:
         LOGD(ERROR, domid, "per-VCPU parameter setting not supported for this scheduler");
         rc = ERROR_INVAL;
         break;
@@ -792,6 +810,7 @@ int libxl_vcpu_sched_params_set_all(libxl_ctx *ctx, uint32_t domid,
     case LIBXL_SCHEDULER_CREDIT:
     case LIBXL_SCHEDULER_CREDIT2:
     case LIBXL_SCHEDULER_ARINC653:
+    case LIBXL_SCHEDULER_NULL:
         LOGD(ERROR, domid, "per-VCPU parameter setting not supported for this scheduler");
         rc = ERROR_INVAL;
         break;
@@ -832,6 +851,9 @@ int libxl_domain_sched_params_get(libxl_ctx *ctx, uint32_t domid,
     case LIBXL_SCHEDULER_RTDS:
         ret=sched_rtds_domain_get(gc, domid, scinfo);
         break;
+    case LIBXL_SCHEDULER_NULL:
+        ret=sched_null_domain_get(gc, domid, scinfo);
+        break;
     default:
         LOGD(ERROR, domid, "Unknown scheduler");
         ret=ERROR_INVAL;
@@ -858,6 +880,7 @@ int libxl_vcpu_sched_params_get(libxl_ctx *ctx, uint32_t domid,
     case LIBXL_SCHEDULER_CREDIT:
     case LIBXL_SCHEDULER_CREDIT2:
     case LIBXL_SCHEDULER_ARINC653:
+    case LIBXL_SCHEDULER_NULL:
         LOGD(ERROR, domid, "per-VCPU parameter getting not supported for this scheduler");
         rc = ERROR_INVAL;
         break;
@@ -890,6 +913,7 @@ int libxl_vcpu_sched_params_get_all(libxl_ctx *ctx, uint32_t domid,
     case LIBXL_SCHEDULER_CREDIT:
     case LIBXL_SCHEDULER_CREDIT2:
     case LIBXL_SCHEDULER_ARINC653:
+    case LIBXL_SCHEDULER_NULL:
         LOGD(ERROR, domid, "per-VCPU parameter getting not supported for this scheduler");
         rc = ERROR_INVAL;
         break;
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 6d28dea..ce733c4 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -191,6 +191,7 @@ libxl_scheduler = Enumeration("scheduler", [
     (6, "credit2"),
     (7, "arinc653"),
     (8, "rtds"),
+    (9, "null"),
     ])
 
 # Consistent with SHUTDOWN_* in sched.h (apart from UNKNOWN)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/3] The 'null' Scheduler
  2017-03-17 18:42 [PATCH 0/3] The 'null' Scheduler Dario Faggioli
                   ` (2 preceding siblings ...)
  2017-03-17 18:43 ` [PATCH 3/3] tools: sched: add support for 'null' scheduler Dario Faggioli
@ 2017-03-20 22:23 ` Stefano Stabellini
  3 siblings, 0 replies; 22+ messages in thread
From: Stefano Stabellini @ 2017-03-20 22:23 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Jonathan Davies, Wei Liu, Ian Jackson, George Dunlap,
	Marcus Granado, Stefano Stabellini, Julien Grall, xen-devel

On Fri, 17 Mar 2017, Dario Faggioli wrote:
> Hello,
> 
> This patch series implements what I call the 'null' scheduler.
> 
> It's a very simple, very static scheduling posicy that always schedules the same vCPU(s) on the same pCPU(s). That's it.
> 
> If there are less vCPUs than pCPUs, some of the pCPUs are _always_ idle. If there are more, some vCPUs _never_ run.
> That is not entirely true, as there is some logic to make sure that waiting to run vCPUs are executed, for instance, on a new pCPU that enters the cpupool, and things like that.
> 
> The demand for this cames from Xen on ARM people and the embedded world in general (Hey, Stefano! :-P), where it is not uncommon to have super static systems that perceives an advanced general purpose scheduler just as pure overhead.

Yep, this scheduler is exactly what I want, thanks! :-)


> As a matter of fact, this may turn out useful in less embedded scenario, like High Performace Computing (where, again, scheduling is, often, unnecessary overhead), but even some of our much more classic Xen use case (like consolidation, Cc-ing Jonathan and Marcus who said they were interested in it).
> 
> The scheduler is really simple, and especially the hot paths --i.e., sleep, wakeup and schedule-- are super lean and quick, in 99% of the cases. All the slightly more complicated logic for dealing with pCPUs coming and going from a cpupool that uses this scheduler resides in functions that handle insertion, removal and migration of vCPUs, which are only called when such configuration changes happens (so, typically, "offline", in most of the embedded usecases).
> 
> I implemented support for hard affinity in order to provide at least a rudimental interface for interacting with the scheduler and affect the placement (it's called assignment within the code) of vCPUs on pCPUs.
> 
> I've tested the scheduler both inside a cpupool (using both Credit1 and Credit2 as boot schedulers) and as default, choosing it at boot and using it for Dom0 and a few other domains. In the latter case, you probably want to limit the number of Dom0's vCPUs too, or there will be very few to experiment with! :-P
> 
> I haven't done any performance or overhead measurements so far, but I will soon enough.
> 
> I also consider this to be experimental, and I'll also write a feature document ASAP.
> 
> Thanks and Regards,
> Dario
> ---
> Dario Faggioli (3):
>       xen: sched: introduce the 'null' semi-static scheduler
>       xen: sched_null: support for hard affinity
>       tools: sched: add support for 'null' scheduler
> 
>  docs/misc/xen-command-line.markdown |    2 
>  tools/libxl/libxl.h                 |    6 
>  tools/libxl/libxl_sched.c           |   24 +
>  tools/libxl/libxl_types.idl         |    1 
>  xen/common/Kconfig                  |   11 
>  xen/common/Makefile                 |    1 
>  xen/common/sched_null.c             |  837 +++++++++++++++++++++++++++++++++++
>  xen/common/schedule.c               |    2 
>  xen/include/public/domctl.h         |    1 
>  9 files changed, 884 insertions(+), 1 deletion(-)
>  create mode 100644 xen/common/sched_null.c
> --
> <<This happens because I choose it to happen!>> (Raistlin Majere)
> -----------------------------------------------------------------
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/3] tools: sched: add support for 'null' scheduler
  2017-03-17 18:43 ` [PATCH 3/3] tools: sched: add support for 'null' scheduler Dario Faggioli
@ 2017-03-20 22:28   ` Stefano Stabellini
  2017-03-21 17:09   ` Wei Liu
  2017-03-27 10:50   ` George Dunlap
  2 siblings, 0 replies; 22+ messages in thread
From: Stefano Stabellini @ 2017-03-20 22:28 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Wei Liu, Ian Jackson, George Dunlap, Stefano Stabellini,
	Julien Grall, xen-devel

On Fri, 17 Mar 2017, Dario Faggioli wrote:
> It being very very basic, also means this scheduler does
> not need much support at the tools level (for now).
> 
> Basically, just the definition of the symbol of the
> scheduler itself and a couple of stubs.
> 
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

Acked-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
> Cc: Wei Liu <wei.liu2@citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Stefano Stabellini <stefano@aporeto.com>
> Cc: Julien Grall <julien.grall@arm.com>
> ---
>  tools/libxl/libxl.h         |    6 ++++++
>  tools/libxl/libxl_sched.c   |   24 ++++++++++++++++++++++++
>  tools/libxl/libxl_types.idl |    1 +
>  3 files changed, 31 insertions(+)
> 
> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> index 4c60e8f..5adac2d 100644
> --- a/tools/libxl/libxl.h
> +++ b/tools/libxl/libxl.h
> @@ -210,6 +210,12 @@
>  #define LIBXL_HAVE_SCHED_RTDS 1
>  
>  /*
> + * LIBXL_HAVE_SCHED_NULL indicates that the 'null' static scheduler
> + * is available.
> + */
> +#define LIBXL_HAVE_SCHED_NULL 1
> +
> +/*
>   * libxl_domain_build_info has u.hvm.viridian_enable and _disable bitmaps
>   * of the specified width.
>   */
> diff --git a/tools/libxl/libxl_sched.c b/tools/libxl/libxl_sched.c
> index 84d3837..d44fbe1 100644
> --- a/tools/libxl/libxl_sched.c
> +++ b/tools/libxl/libxl_sched.c
> @@ -178,6 +178,20 @@ static int sched_arinc653_domain_set(libxl__gc *gc, uint32_t domid,
>      return 0;
>  }
>  
> +static int sched_null_domain_set(libxl__gc *gc, uint32_t domid,
> +                                 const libxl_domain_sched_params *scinfo)
> +{
> +    /* The null scheduler doesn't take any domain-specific parameters. */
> +    return 0;
> +}
> +
> +static int sched_null_domain_get(libxl__gc *gc, uint32_t domid,
> +                               libxl_domain_sched_params *scinfo)
> +{
> +    /* The null scheduler doesn't have any domain-specific parameters. */
> +    return ERROR_INVAL;
> +}
> +
>  static int sched_credit_domain_get(libxl__gc *gc, uint32_t domid,
>                                     libxl_domain_sched_params *scinfo)
>  {
> @@ -730,6 +744,9 @@ int libxl_domain_sched_params_set(libxl_ctx *ctx, uint32_t domid,
>      case LIBXL_SCHEDULER_RTDS:
>          ret=sched_rtds_domain_set(gc, domid, scinfo);
>          break;
> +    case LIBXL_SCHEDULER_NULL:
> +        ret=sched_null_domain_set(gc, domid, scinfo);
> +        break;
>      default:
>          LOGD(ERROR, domid, "Unknown scheduler");
>          ret=ERROR_INVAL;
> @@ -758,6 +775,7 @@ int libxl_vcpu_sched_params_set(libxl_ctx *ctx, uint32_t domid,
>      case LIBXL_SCHEDULER_CREDIT:
>      case LIBXL_SCHEDULER_CREDIT2:
>      case LIBXL_SCHEDULER_ARINC653:
> +    case LIBXL_SCHEDULER_NULL:
>          LOGD(ERROR, domid, "per-VCPU parameter setting not supported for this scheduler");
>          rc = ERROR_INVAL;
>          break;
> @@ -792,6 +810,7 @@ int libxl_vcpu_sched_params_set_all(libxl_ctx *ctx, uint32_t domid,
>      case LIBXL_SCHEDULER_CREDIT:
>      case LIBXL_SCHEDULER_CREDIT2:
>      case LIBXL_SCHEDULER_ARINC653:
> +    case LIBXL_SCHEDULER_NULL:
>          LOGD(ERROR, domid, "per-VCPU parameter setting not supported for this scheduler");
>          rc = ERROR_INVAL;
>          break;
> @@ -832,6 +851,9 @@ int libxl_domain_sched_params_get(libxl_ctx *ctx, uint32_t domid,
>      case LIBXL_SCHEDULER_RTDS:
>          ret=sched_rtds_domain_get(gc, domid, scinfo);
>          break;
> +    case LIBXL_SCHEDULER_NULL:
> +        ret=sched_null_domain_get(gc, domid, scinfo);
> +        break;
>      default:
>          LOGD(ERROR, domid, "Unknown scheduler");
>          ret=ERROR_INVAL;
> @@ -858,6 +880,7 @@ int libxl_vcpu_sched_params_get(libxl_ctx *ctx, uint32_t domid,
>      case LIBXL_SCHEDULER_CREDIT:
>      case LIBXL_SCHEDULER_CREDIT2:
>      case LIBXL_SCHEDULER_ARINC653:
> +    case LIBXL_SCHEDULER_NULL:
>          LOGD(ERROR, domid, "per-VCPU parameter getting not supported for this scheduler");
>          rc = ERROR_INVAL;
>          break;
> @@ -890,6 +913,7 @@ int libxl_vcpu_sched_params_get_all(libxl_ctx *ctx, uint32_t domid,
>      case LIBXL_SCHEDULER_CREDIT:
>      case LIBXL_SCHEDULER_CREDIT2:
>      case LIBXL_SCHEDULER_ARINC653:
> +    case LIBXL_SCHEDULER_NULL:
>          LOGD(ERROR, domid, "per-VCPU parameter getting not supported for this scheduler");
>          rc = ERROR_INVAL;
>          break;
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index 6d28dea..ce733c4 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -191,6 +191,7 @@ libxl_scheduler = Enumeration("scheduler", [
>      (6, "credit2"),
>      (7, "arinc653"),
>      (8, "rtds"),
> +    (9, "null"),
>      ])
>  
>  # Consistent with SHUTDOWN_* in sched.h (apart from UNKNOWN)
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] xen: sched: introduce the 'null' semi-static scheduler
  2017-03-17 18:42 ` [PATCH 1/3] xen: sched: introduce the 'null' semi-static scheduler Dario Faggioli
@ 2017-03-20 23:21   ` Stefano Stabellini
  2017-03-21  8:26     ` Dario Faggioli
  2017-03-27 10:31   ` George Dunlap
  1 sibling, 1 reply; 22+ messages in thread
From: Stefano Stabellini @ 2017-03-20 23:21 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Jonathan Davies, Stefano Stabellini, George Dunlap,
	Marcus Granado, Julien Grall, xen-devel

On Fri, 17 Mar 2017, Dario Faggioli wrote:
> In cases where one is absolutely sure that there will be
> less vCPUs than pCPUs, having to pay the cose, mostly in
> terms of overhead, of an advanced scheduler may be not
> desirable.
> 
> The simple scheduler implemented here could be a solution.
> Here how it works:
>  - each vCPU is statically assigned to a pCPU;
>  - if there are pCPUs without any vCPU assigned, they
>    stay idle (as in, the run their idle vCPU);
>  - if there are vCPUs which are not assigned to any
>    pCPU (e.g., because there are more vCPUs than pCPUs)
>    they *don't* run, until they get assigned;
>  - if a vCPU assigned to a pCPU goes away, one of the
>    waiting to be assigned vCPU, if any, gets assigned
>    to the pCPU and can run there.
> 
> This scheduler, therefore, if used in configurations
> where every vCPUs can be assigned to a pCPU, guarantees
> low overhead, low latency, and consistent performance.
> 
> If used as default scheduler, at Xen boot, it is
> recommended to limit the number of Dom0 vCPUs (e.g., with
> 'dom0_max_vcpus=x'). Otherwise, all the pCPUs will have
> one Dom0's vCPU assigned, and there won't be room for
> running efficiently (if at all) any guest.
> 
> Target use cases are embedded and HPC, but it may well
> be interesting also in circumnstances.
> 
> Kconfig and documentation are update accordingly.
> 
> While there, also document the availability of sched=rtds
> as boot parameter, which apparently had been forgotten.
> 
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> ---
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Julien Grall <julien.grall@arm.com>
> Cc: Jonathan Davies <Jonathan.Davies@citrix.com>
> Cc: Marcus Granado <marcus.granado@citrix.com>
> ---
>  docs/misc/xen-command-line.markdown |    2 
>  xen/common/Kconfig                  |   11 
>  xen/common/Makefile                 |    1 
>  xen/common/sched_null.c             |  816 +++++++++++++++++++++++++++++++++++
>  xen/common/schedule.c               |    2 
>  xen/include/public/domctl.h         |    1 
>  6 files changed, 832 insertions(+), 1 deletion(-)
>  create mode 100644 xen/common/sched_null.c
> 
> diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
> index 4daf5b5..ad6a5ca 100644
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -1434,7 +1434,7 @@ Map the HPET page as read only in Dom0. If disabled the page will be mapped
>  with read and write permissions.
>  
>  ### sched
> -> `= credit | credit2 | arinc653`
> +> `= credit | credit2 | arinc653 | rtds | null`
>  
>  > Default: `sched=credit`
>  
> diff --git a/xen/common/Kconfig b/xen/common/Kconfig
> index f2ecbc4..518520e 100644
> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -187,6 +187,14 @@ config SCHED_ARINC653
>  	  The ARINC653 scheduler is a hard real-time scheduler for single
>  	  cores, targeted for avionics, drones, and medical devices.
>  
> +config SCHED_NULL
> +	bool "Null scheduler support (EXPERIMENTAL)"
> +	default y
> +	---help---
> +	  The null scheduler is a static, zero overhead scheduler,
> +	  for when there always are less vCPUs than pCPUs, typically
> +	  in embedded or HPC scenarios.
> +
>  choice
>  	prompt "Default Scheduler?"
>  	default SCHED_CREDIT_DEFAULT
> @@ -199,6 +207,8 @@ choice
>  		bool "RT Scheduler" if SCHED_RTDS
>  	config SCHED_ARINC653_DEFAULT
>  		bool "ARINC653 Scheduler" if SCHED_ARINC653
> +	config SCHED_NULL_DEFAULT
> +		bool "Null Scheduler" if SCHED_NULL
>  endchoice
>  
>  config SCHED_DEFAULT
> @@ -207,6 +217,7 @@ config SCHED_DEFAULT
>  	default "credit2" if SCHED_CREDIT2_DEFAULT
>  	default "rtds" if SCHED_RTDS_DEFAULT
>  	default "arinc653" if SCHED_ARINC653_DEFAULT
> +	default "null" if SCHED_NULL_DEFAULT
>  	default "credit"
>  
>  endmenu
> diff --git a/xen/common/Makefile b/xen/common/Makefile
> index 0fed30b..26c5a64 100644
> --- a/xen/common/Makefile
> +++ b/xen/common/Makefile
> @@ -40,6 +40,7 @@ obj-$(CONFIG_SCHED_ARINC653) += sched_arinc653.o
>  obj-$(CONFIG_SCHED_CREDIT) += sched_credit.o
>  obj-$(CONFIG_SCHED_CREDIT2) += sched_credit2.o
>  obj-$(CONFIG_SCHED_RTDS) += sched_rt.o
> +obj-$(CONFIG_SCHED_NULL) += sched_null.o
>  obj-y += schedule.o
>  obj-y += shutdown.o
>  obj-y += softirq.o
> diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c
> new file mode 100644
> index 0000000..6a13308
> --- /dev/null
> +++ b/xen/common/sched_null.c
> @@ -0,0 +1,816 @@
> +/*
> + * xen/common/sched_null.c
> + *
> + *  Copyright (c) 2017, Dario Faggioli, Citrix Ltd
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public
> + * License v2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public
> + * License along with this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +/*
> + * The 'null' scheduler always choose to run, on each pCPU, either nothing
> + * (i.e., the pCPU stays idle) or always the same vCPU.
> + *
> + * It is aimed at supporting static scenarios, where there always are
> + * less vCPUs than pCPUs (and the vCPUs don't need to move among pCPUs
> + * for any reason) with the least possible overhead.
> + *
> + * Typical usecase are embedded applications, but also HPC, especially
> + * if the scheduler is used inside a cpupool.
> + */
> +
> +#include <xen/sched.h>
> +#include <xen/sched-if.h>
> +#include <xen/softirq.h>
> +#include <xen/keyhandler.h>
> +
> +
> +/*
> + * Locking:
> + * - Scheduler-lock (a.k.a. runqueue lock):
> + *  + is per-pCPU;
> + *  + serializes assignment and deassignment of vCPUs to a pCPU.
> + * - Private data lock (a.k.a. private scheduler lock):
> + *  + is scheduler-wide;
> + *  + serializes accesses to the list of domains in this scheduler.
> + * - Waitqueue lock:
> + *  + is scheduler-wide;
> + *  + serialize accesses to the list of vCPUs waiting to be assigned
> + *    to pCPUs.
> + *
> + * Ordering is: private lock, runqueue lock, waitqueue lock. Or, OTOH,
> + * waitqueue lock nests inside runqueue lock which nests inside private
> + * lock. More specifically:
> + *  + if we need both runqueue and private locks, we must acquire the
> + *    private lock for first;
> + *  + if we need both runqueue and waitqueue locks, we must acquire
> + *    the runqueue lock for first;
> + *  + if we need both private and waitqueue locks, we must acquire
> + *    the private lock for first;
> + *  + if we already own a runqueue lock, we must never acquire
> + *    the private lock;
> + *  + if we already own the waitqueue lock, we must never acquire
> + *    the runqueue lock or the private lock.
> + */
> +
> +/*
> + * System-wide private data
> + */
> +struct null_private {
> +    spinlock_t lock;        /* scheduler lock; nests inside cpupool_lock */
> +    struct list_head ndom;  /* Domains of this scheduler                 */
> +    struct list_head waitq; /* vCPUs not assigned to any pCPU            */
> +    spinlock_t waitq_lock;  /* serializes waitq; nests inside runq locks */
> +    cpumask_t cpus_free;    /* CPUs without a vCPU associated to them    */
> +};
> +
> +/*
> + * Physical CPU
> + */
> +struct null_pcpu {
> +    struct vcpu *vcpu;
> +};
> +DEFINE_PER_CPU(struct null_pcpu, npc);
> +
> +/*
> + * Virtual CPU
> + */
> +struct null_vcpu {
> +    struct list_head waitq_elem;
> +    struct null_dom *ndom;

This field is redundant, given that from struct vcpu you can get struct
domain and from struct domain you can get struct null_dom. I would
remove it.


> +    struct vcpu *vcpu;
> +    int pcpu;          /* To what pCPU the vCPU is assigned (-1 if none) */

Isn't this always the same as struct vcpu->processor?
If it's only different when the vcpu is waiting in the waitq, then you
can just remove this field and replace the pcpu == -1 test with
list_empty(waitq_elem).


> +};
> +
> +/*
> + * Domain
> + */
> +struct null_dom {
> +    struct list_head ndom_elem;
> +    struct domain *dom;
> +};
> +
> +/*
> + * Accessor helpers functions
> + */
> +static inline struct null_private *null_priv(const struct scheduler *ops)
> +{
> +    return ops->sched_data;
> +}
> +
> +static inline struct null_vcpu *null_vcpu(const struct vcpu *v)
> +{
> +    return v->sched_priv;
> +}
> +
> +static inline struct null_dom *null_dom(const struct domain *d)
> +{
> +    return d->sched_priv;
> +}
> +
> +static int null_init(struct scheduler *ops)
> +{
> +    struct null_private *prv;
> +
> +    printk("Initializing null scheduler\n"
> +           "WARNING: This is experimental software in development.\n"
> +           "Use at your own risk.\n");
> +
> +    prv = xzalloc(struct null_private);
> +    if ( prv == NULL )
> +        return -ENOMEM;
> +
> +    spin_lock_init(&prv->lock);
> +    spin_lock_init(&prv->waitq_lock);
> +    INIT_LIST_HEAD(&prv->ndom);
> +    INIT_LIST_HEAD(&prv->waitq);
> +
> +    ops->sched_data = prv;
> +
> +    return 0;
> +}
> +
> +static void null_deinit(struct scheduler *ops)
> +{
> +    xfree(ops->sched_data);
> +    ops->sched_data = NULL;
> +}
> +
> +static void init_pdata(struct null_private *prv, unsigned int cpu)
> +{
> +    /* Mark the pCPU as free, and with no vCPU assigned */
> +    cpumask_set_cpu(cpu, &prv->cpus_free);
> +    per_cpu(npc, cpu).vcpu = NULL;
> +}
> +
> +static void null_init_pdata(const struct scheduler *ops, void *pdata, int cpu)
> +{
> +    struct null_private *prv = null_priv(ops);
> +    struct schedule_data *sd = &per_cpu(schedule_data, cpu);
> +
> +    /* alloc_pdata is not implemented, so we want this to be NULL. */
> +    ASSERT(!pdata);
> +
> +    /*
> +     * The scheduler lock points already to the default per-cpu spinlock,
> +     * so there is no remapping to be done.
> +     */
> +    ASSERT(sd->schedule_lock == &sd->_lock && !spin_is_locked(&sd->_lock));
> +
> +    init_pdata(prv, cpu);
> +}
> +
> +static void null_deinit_pdata(const struct scheduler *ops, void *pcpu, int cpu)
> +{
> +    struct null_private *prv = null_priv(ops);
> +
> +    /* alloc_pdata not implemented, so this must have stayed NULL */
> +    ASSERT(!pcpu);
> +
> +    cpumask_clear_cpu(cpu, &prv->cpus_free);
> +    per_cpu(npc, cpu).vcpu = NULL;
> +}
> +
> +static void *null_alloc_vdata(const struct scheduler *ops,
> +                              struct vcpu *v, void *dd)
> +{
> +    struct null_vcpu *nvc;
> +
> +    nvc = xzalloc(struct null_vcpu);
> +    if ( nvc == NULL )
> +        return NULL;
> +
> +    INIT_LIST_HEAD(&nvc->waitq_elem);
> +
> +    /* Not assigned to any pCPU */
> +    nvc->pcpu = -1;
> +    /* Up pointers */
> +    nvc->ndom = dd;
> +    nvc->vcpu = v;
> +
> +    SCHED_STAT_CRANK(vcpu_alloc);
> +
> +    return nvc;
> +}
> +
> +static void null_free_vdata(const struct scheduler *ops, void *priv)
> +{
> +    struct null_vcpu *nvc = priv;
> +
> +    xfree(nvc);
> +}
> +
> +static void * null_alloc_domdata(const struct scheduler *ops,
> +                                 struct domain *d)
> +{
> +    struct null_private *prv = null_priv(ops);
> +    struct null_dom *ndom;
> +    unsigned long flags;
> +
> +    ndom = xzalloc(struct null_dom);
> +    if ( ndom == NULL )
> +        return NULL;
> +
> +    INIT_LIST_HEAD(&ndom->ndom_elem);

This is not need given the following list_add_tail


> +    ndom->dom = d;
> +
> +    spin_lock_irqsave(&prv->lock, flags);
> +    list_add_tail(&ndom->ndom_elem, &null_priv(ops)->ndom);
> +    spin_unlock_irqrestore(&prv->lock, flags);
> +
> +    return (void*)ndom;
> +}
> +
> +static void null_free_domdata(const struct scheduler *ops, void *data)
> +{
> +    unsigned long flags;
> +    struct null_dom *ndom = data;
> +    struct null_private *prv = null_priv(ops);
> +
> +    spin_lock_irqsave(&prv->lock, flags);
> +    list_del_init(&ndom->ndom_elem);
> +    spin_unlock_irqrestore(&prv->lock, flags);
> +
> +    xfree(data);
> +}
> +
> +static int null_dom_init(const struct scheduler *ops, struct domain *d)
> +{
> +    struct null_dom *ndom;
> +
> +    if ( is_idle_domain(d) )
> +        return 0;
> +
> +    ndom = null_alloc_domdata(ops, d);
> +    if ( ndom == NULL )
> +        return -ENOMEM;
> +
> +    d->sched_priv = ndom;
> +
> +    return 0;
> +}
> +static void null_dom_destroy(const struct scheduler *ops, struct domain *d)
> +{
> +    null_free_domdata(ops, null_dom(d));
> +}
> +
> +/*
> + * vCPU to pCPU assignment and placement. This _only_ happens:
> + *  - on insert,
> + *  - on migrate.
> + *
> + * Insert occurs when a vCPU joins this scheduler for the first time
> + * (e.g., when the domain it's part of is moved to the scheduler's
> + * cpupool).
> + *
> + * Migration may be necessary if a pCPU (with a vCPU assigned to it)
> + * is removed from the scheduler's cpupool.
> + *
> + * So this is not part of any hot path.
> + */
> +static unsigned int pick_cpu(struct null_private *prv, struct vcpu *v)
> +{
> +    unsigned int cpu = v->processor;
> +    cpumask_t *cpus = cpupool_domain_cpumask(v->domain);
> +
> +    ASSERT(spin_is_locked(per_cpu(schedule_data, cpu).schedule_lock));
> +
> +    /*
> +     * If our processor is free, or we are assigned to it, and it is
> +     * also still valid, just go for it.
> +     */
> +    if ( likely((per_cpu(npc, cpu).vcpu == NULL || per_cpu(npc, cpu).vcpu == v)
> +                && cpumask_test_cpu(cpu, cpus)) )
> +        return cpu;
> +
> +    /* If not, just go for a valid free pCPU, if any */
> +    cpumask_and(cpumask_scratch_cpu(cpu), &prv->cpus_free, cpus);
> +    cpu = cpumask_first(cpumask_scratch_cpu(cpu));
> +
> +    /*
> +     * If we didn't find any free pCPU, just pick any valid pcpu, even if
> +     * it has another vCPU assigned. This will happen during shutdown and
> +     * suspend/resume, but it may also happen during "normal operation", if
> +     * all the pCPUs are busy.
> +     *
> +     * In fact, there must always be something sane in v->processor, or
> +     * vcpu_schedule_lock() and friends won't work. This is not a problem,
> +     * as we will actually assign the vCPU to the pCPU we return from here,
> +     * only if the pCPU is free.
> +     */
> +    if ( unlikely(cpu == nr_cpu_ids) )
> +        cpu = cpumask_any(cpus);
> +
> +    return cpu;
> +}
> +
> +static void vcpu_assign(struct null_private *prv, struct vcpu *v,
> +                        unsigned int cpu)
> +{
> +    ASSERT(null_vcpu(v)->pcpu == -1);
> +
> +    per_cpu(npc, cpu).vcpu = v;
> +    v->processor = null_vcpu(v)->pcpu = cpu;
> +    cpumask_clear_cpu(cpu, &prv->cpus_free);
> +
> +    gdprintk(XENLOG_INFO, "%d <-- d%dv%d\n", cpu, v->domain->domain_id, v->vcpu_id);
> +}
> +
> +static void vcpu_deassign(struct null_private *prv, struct vcpu *v,
> +                          unsigned int cpu)
> +{
> +    ASSERT(null_vcpu(v)->pcpu == cpu);
> +
> +    null_vcpu(v)->pcpu = -1;
> +    per_cpu(npc, cpu).vcpu = NULL;
> +    cpumask_set_cpu(cpu, &prv->cpus_free);
> +
> +    gdprintk(XENLOG_INFO, "%d <-- NULL (d%dv%d)\n", cpu, v->domain->domain_id, v->vcpu_id);
> +}
> +
> +/* Change the scheduler of cpu to us (null). */
> +static void null_switch_sched(struct scheduler *new_ops, unsigned int cpu,
> +                              void *pdata, void *vdata)
> +{
> +    struct schedule_data *sd = &per_cpu(schedule_data, cpu);
> +    struct null_private *prv = null_priv(new_ops);
> +    struct null_vcpu *nvc = vdata;
> +
> +    ASSERT(nvc && is_idle_vcpu(nvc->vcpu));
> +
> +    idle_vcpu[cpu]->sched_priv = vdata;
> +
> +    /*
> +     * We are holding the runqueue lock already (it's been taken in
> +     * schedule_cpu_switch()). It actually may or may not be the 'right'
> +     * one for this cpu, but that is ok for preventing races.
> +     */
> +    ASSERT(!local_irq_is_enabled());
> +
> +    init_pdata(prv, cpu);
> +
> +    per_cpu(scheduler, cpu) = new_ops;
> +    per_cpu(schedule_data, cpu).sched_priv = pdata;
> +
> +    /*
> +     * (Re?)route the lock to the per pCPU lock as /last/ thing. In fact,
> +     * if it is free (and it can be) we want that anyone that manages
> +     * taking it, finds all the initializations we've done above in place.
> +     */
> +    smp_mb();
> +    sd->schedule_lock = &sd->_lock;
> +}
> +
> +static void null_vcpu_insert(const struct scheduler *ops, struct vcpu *v)
> +{
> +    struct null_private *prv = null_priv(ops);
> +    struct null_vcpu *nvc = null_vcpu(v);
> +    unsigned int cpu;
> +    spinlock_t *lock;
> +
> +    ASSERT(!is_idle_vcpu(v));
> +
> + retry:
> +    lock = vcpu_schedule_lock_irq(v);
> +
> +    cpu = pick_cpu(prv, v);
> +
> +    /* We hold v->processor's runq lock, but we need cpu's one */
> +    if ( cpu != v->processor )
> +    {
> +        spin_unlock(lock);
> +        lock = pcpu_schedule_lock(cpu);
> +    }
> +
> +    /*
> +     * If the pCPU is free, we assign v to it.
> +     *
> +     * If it is not free (e.g., because we raced with another insert
> +     * or migrate), but there are free pCPUs, we try to pick again.
> +     *
> +     * If the pCPU is not free, and there aren't any (valid) others,
> +     * we have no alternatives than to go into the waitqueue.
> +     */
> +    if ( likely(per_cpu(npc, cpu).vcpu == NULL) )
> +    {
> +        /*
> +         * Insert is followed by vcpu_wake(), so there's no need to poke
> +         * the pcpu with the SCHEDULE_SOFTIRQ, as wake will do that.
> +         */
> +        vcpu_assign(prv, v, cpu);
> +    }
> +    else if ( cpumask_intersects(&prv->cpus_free,
> +                                 cpupool_domain_cpumask(v->domain)) )
> +    {
> +        spin_unlock(lock);
> +        goto retry;
> +    }
> +    else
> +    {
> +        spin_lock(&prv->waitq_lock);
> +        list_add_tail(&nvc->waitq_elem, &prv->waitq);
> +        spin_unlock(&prv->waitq_lock);
> +    }
> +    spin_unlock_irq(lock);
> +
> +    SCHED_STAT_CRANK(vcpu_insert);
> +}
> +
> +static void null_vcpu_remove(const struct scheduler *ops, struct vcpu *v)
> +{
> +    struct null_private *prv = null_priv(ops);
> +    struct null_vcpu *wvc, *nvc = null_vcpu(v);
> +    unsigned int cpu;
> +    spinlock_t *lock;
> +
> +    ASSERT(!is_idle_vcpu(v));
> +
> +    lock = vcpu_schedule_lock_irq(v);
> +
> +    cpu = v->processor;
> +
> +    /* If v is in waitqueue, just get it out of there and bail */
> +    if ( unlikely(nvc->pcpu == -1) )
> +    {
> +        spin_lock(&prv->waitq_lock);
> +
> +        ASSERT(!list_empty(&null_vcpu(v)->waitq_elem));
> +        list_del_init(&nvc->waitq_elem);
> +
> +        spin_unlock(&prv->waitq_lock);
> +
> +        goto out;
> +    }
> +
> +    /*
> +     * If v is assigned to a pCPU, let's see if there is someone waiting.
> +     * If yes, we assign it to cpu, in spite of v. If no, we just set
> +     * cpu free.
> +     */
> +
> +    ASSERT(per_cpu(npc, cpu).vcpu == v);
> +    ASSERT(!cpumask_test_cpu(cpu, &prv->cpus_free));
> +
> +    spin_lock(&prv->waitq_lock);
> +    wvc = list_first_entry_or_null(&prv->waitq, struct null_vcpu, waitq_elem);
> +    if ( wvc )
> +    {
> +        vcpu_assign(prv, wvc->vcpu, cpu);

The vcpu_assign in null_vcpu_insert is protected by the pcpu runq lock,
while this call is protected by the waitq_lock lock. Is that safe?


> +        list_del_init(&wvc->waitq_elem);
> +        cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
> +    }
> +    else
> +    {
> +        vcpu_deassign(prv, v, cpu);
> +    }
> +    spin_unlock(&prv->waitq_lock);
> +
> + out:
> +    vcpu_schedule_unlock_irq(lock, v);
> +
> +    SCHED_STAT_CRANK(vcpu_remove);
> +}
> +
> +static void null_vcpu_wake(const struct scheduler *ops, struct vcpu *v)
> +{
> +    ASSERT(!is_idle_vcpu(v));
> +
> +    if ( unlikely(curr_on_cpu(v->processor) == v) )
> +    {
> +        SCHED_STAT_CRANK(vcpu_wake_running);
> +        return;
> +    }
> +
> +    if ( null_vcpu(v)->pcpu == -1 )
> +    {
> +	/* Not exactly "on runq", but close enough for reusing the counter */
> +        SCHED_STAT_CRANK(vcpu_wake_onrunq);
> +	return;

coding style


> +    }
> +
> +    if ( likely(vcpu_runnable(v)) )
> +        SCHED_STAT_CRANK(vcpu_wake_runnable);
> +    else
> +        SCHED_STAT_CRANK(vcpu_wake_not_runnable);
> +
> +    /* Note that we get here only for vCPUs assigned to a pCPU */
> +    cpu_raise_softirq(v->processor, SCHEDULE_SOFTIRQ);
> +}
> +
> +static void null_vcpu_sleep(const struct scheduler *ops, struct vcpu *v)
> +{
> +    ASSERT(!is_idle_vcpu(v));
> +
> +    /* If v is not assigned to a pCPU, or is not running, no need to bother */
> +    if ( curr_on_cpu(v->processor) == v )
> +        cpu_raise_softirq(v->processor, SCHEDULE_SOFTIRQ);
> +
> +    SCHED_STAT_CRANK(vcpu_sleep);
> +}
> +
> +static int null_cpu_pick(const struct scheduler *ops, struct vcpu *v)
> +{
> +    ASSERT(!is_idle_vcpu(v));
> +    return pick_cpu(null_priv(ops), v);
> +}
> +
> +static void null_vcpu_migrate(const struct scheduler *ops, struct vcpu *v,
> +                              unsigned int new_cpu)
> +{
> +    struct null_private *prv = null_priv(ops);
> +    struct null_vcpu *nvc = null_vcpu(v);
> +    unsigned int cpu = v->processor;
> +
> +    ASSERT(!is_idle_vcpu(v));
> +
> +    if ( v->processor == new_cpu )
> +        return;
> +
> +    /*
> +     * v is either in the waitqueue, or assigned to a pCPU.
> +     *
> +     * In the former case, there is nothing to do.
> +     *
> +     * In the latter, the pCPU to which it was assigned would become free,
> +     * and we, therefore, should check whether there is anyone in the
> +     * waitqueue that can be assigned to it.
> +     */
> +    if ( likely(nvc->pcpu != -1) )
> +    {
> +        struct null_vcpu *wvc;
> +
> +        spin_lock(&prv->waitq_lock);
> +        wvc = list_first_entry_or_null(&prv->waitq, struct null_vcpu, waitq_elem);
> +        if ( wvc && cpumask_test_cpu(cpu, cpupool_domain_cpumask(v->domain)) )
> +        {
> +            vcpu_assign(prv, wvc->vcpu, cpu);
> +            list_del_init(&wvc->waitq_elem);
> +            cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
> +        }
> +        else
> +        {
> +            vcpu_deassign(prv, v, cpu);
> +        }
> +        spin_unlock(&prv->waitq_lock);

This looks very similar to null_vcpu_remove, maybe you want to refactor
the code and call a single shared service function.


> +	SCHED_STAT_CRANK(migrate_running);

coding style


> +    }
> +    else
> +        SCHED_STAT_CRANK(migrate_on_runq);
> +
> +    SCHED_STAT_CRANK(migrated);
> +
> +    /*
> +     * Let's now consider new_cpu, which is where v is being sent. It can be
> +     * either free, or have a vCPU already assigned to it.
> +     *
> +     * In the former case, we should assign v to it, and try to get it to run.
> +     *
> +     * In latter, all we can do is to park v in the waitqueue.
> +     */
> +    if ( per_cpu(npc, new_cpu).vcpu == NULL )
> +    {
> +        /* We don't know whether v was in the waitqueue. If yes, remove it */
> +        spin_lock(&prv->waitq_lock);
> +        list_del_init(&nvc->waitq_elem);
> +        spin_unlock(&prv->waitq_lock);
> +
> +        vcpu_assign(prv, v, new_cpu);

This vcpu_assign call seems to be unprotected. Should it be within a
spin_lock'ed area?


> +    }
> +    else
> +    {
> +        /* We don't know whether v was in the waitqueue. If no, put it there */
> +        spin_lock(&prv->waitq_lock);
> +        if ( list_empty(&nvc->waitq_elem) )
> +        {
> +            list_add_tail(&nvc->waitq_elem, &prv->waitq);
> +            nvc->pcpu = -1;
> +        }
> +        else
> +            ASSERT(nvc->pcpu == -1);
> +        spin_unlock(&prv->waitq_lock);
> +    }
> +
> +    /*
> +     * Whatever all the above, we always at least override v->processor.
> +     * This is especially important for shutdown or suspend/resume paths,
> +     * when it is important to let our caller (cpu_disable_scheduler())
> +     * know that the migration did happen, to the best of our possibilities,
> +     * at least. In case of suspend, any temporary inconsistency caused
> +     * by this, will be fixed-up during resume.
> +     */
> +    v->processor = new_cpu;
> +}
> +
> +#ifndef NDEBUG
> +static inline void null_vcpu_check(struct vcpu *v)
> +{
> +    struct null_vcpu * const nvc = null_vcpu(v);
> +    struct null_dom * const ndom = nvc->ndom;
> +
> +    BUG_ON(nvc->vcpu != v);
> +    BUG_ON(ndom != null_dom(v->domain));
> +    if ( ndom )
> +    {
> +        BUG_ON(is_idle_vcpu(v));
> +        BUG_ON(ndom->dom != v->domain);
> +    }
> +    else
> +    {
> +        BUG_ON(!is_idle_vcpu(v));
> +    }
> +    SCHED_STAT_CRANK(vcpu_check);
> +}
> +#define NULL_VCPU_CHECK(v)  (null_vcpu_check(v))
> +#else
> +#define NULL_VCPU_CHECK(v)
> +#endif
> +
> +
> +/*
> + * The most simple scheduling function of all times! We either return:
> + *  - the vCPU assigned to the pCPU, if there's one and it can run;
> + *  - the idle vCPU, otherwise.
> + */
> +static struct task_slice null_schedule(const struct scheduler *ops,
> +                                       s_time_t now,
> +                                       bool_t tasklet_work_scheduled)
> +{
> +    const unsigned int cpu = smp_processor_id();
> +    struct null_private *prv = null_priv(ops);
> +    struct null_vcpu *wvc;
> +    struct task_slice ret;
> +
> +    SCHED_STAT_CRANK(schedule);
> +    NULL_VCPU_CHECK(current);
> +
> +    ret.task = per_cpu(npc, cpu).vcpu;
> +    ret.migrated = 0;
> +    ret.time = -1;
> +
> +    /*
> +     * We may be new in the cpupool, or just coming back online. In which
> +     * case, there may be vCPUs in the waitqueue that we can assign to us
> +     * and run.
> +     */
> +    if ( unlikely(ret.task == NULL) )
> +    {
> +        spin_lock(&prv->waitq_lock);
> +        wvc = list_first_entry_or_null(&prv->waitq, struct null_vcpu, waitq_elem);
> +        if ( wvc )
> +        {
> +            vcpu_assign(prv, wvc->vcpu, cpu);
> +            list_del_init(&wvc->waitq_elem);
> +            ret.task = wvc->vcpu;
> +        }
> +        spin_unlock(&prv->waitq_lock);
> +    }
> +
> +    if ( unlikely(tasklet_work_scheduled ||
> +                  ret.task == NULL ||
> +                  !vcpu_runnable(ret.task)) )
> +        ret.task = idle_vcpu[cpu];
> +
> +    NULL_VCPU_CHECK(ret.task);
> +    return ret;
> +}
> +
> +static inline void dump_vcpu(struct null_private *prv, struct null_vcpu *nvc)
> +{
> +    printk("[%i.%i] pcpu=%d", nvc->vcpu->domain->domain_id,
> +            nvc->vcpu->vcpu_id, nvc->pcpu);
> +}
> +
> +static void null_dump_pcpu(const struct scheduler *ops, int cpu)
> +{
> +    struct null_private *prv = null_priv(ops);
> +    struct null_vcpu *nvc;
> +    spinlock_t *lock;
> +    unsigned long flags;
> +#define cpustr keyhandler_scratch
> +
> +    lock = pcpu_schedule_lock_irqsave(cpu, &flags);
> +
> +    cpumask_scnprintf(cpustr, sizeof(cpustr), per_cpu(cpu_sibling_mask, cpu));
> +    printk("CPU[%02d] sibling=%s, ", cpu, cpustr);
> +    cpumask_scnprintf(cpustr, sizeof(cpustr), per_cpu(cpu_core_mask, cpu));
> +    printk("core=%s", cpustr);
> +    if ( per_cpu(npc, cpu).vcpu != NULL )
> +        printk(", vcpu=d%dv%d", per_cpu(npc, cpu).vcpu->domain->domain_id,
> +               per_cpu(npc, cpu).vcpu->vcpu_id);
> +    printk("\n");
> +
> +    /* current VCPU (nothing to say if that's the idle vcpu) */
> +    nvc = null_vcpu(curr_on_cpu(cpu));
> +    if ( nvc && !is_idle_vcpu(nvc->vcpu) )
> +    {
> +        printk("\trun: ");
> +        dump_vcpu(prv, nvc);
> +        printk("\n");
> +    }
> +
> +    pcpu_schedule_unlock_irqrestore(lock, flags, cpu);
> +#undef cpustr
> +}
> +
> +static void null_dump(const struct scheduler *ops)
> +{
> +    struct null_private *prv = null_priv(ops);
> +    struct list_head *iter;
> +    unsigned long flags;
> +    unsigned int loop;
> +#define cpustr keyhandler_scratch
> +
> +    spin_lock_irqsave(&prv->lock, flags);
> +
> +    cpulist_scnprintf(cpustr, sizeof(cpustr), &prv->cpus_free);
> +    printk("\tcpus_free = %s\n", cpustr);
> +
> +    printk("Domain info:\n");
> +    loop = 0;
> +    list_for_each( iter, &prv->ndom )
> +    {
> +        struct null_dom *ndom;
> +        struct vcpu *v;
> +
> +        ndom = list_entry(iter, struct null_dom, ndom_elem);
> +
> +        printk("\tDomain: %d\n", ndom->dom->domain_id);
> +        for_each_vcpu( ndom->dom, v )
> +        {
> +            struct null_vcpu * const nvc = null_vcpu(v);
> +            spinlock_t *lock;
> +
> +            lock = vcpu_schedule_lock(nvc->vcpu);
> +
> +            printk("\t%3d: ", ++loop);
> +            dump_vcpu(prv, nvc);
> +            printk("\n");
> +
> +            vcpu_schedule_unlock(lock, nvc->vcpu);
> +        }
> +    }
> +
> +    printk("Waitqueue: ");
> +    loop = 0;
> +    spin_lock(&prv->waitq_lock);
> +    list_for_each( iter, &prv->waitq )
> +    {
> +        struct null_vcpu *nvc = list_entry(iter, struct null_vcpu, waitq_elem);
> +
> +        if ( loop++ != 0 )
> +            printk(", ");
> +        if ( loop % 24 == 0 )
> +            printk("\n\t");
> +        printk("d%dv%d", nvc->vcpu->domain->domain_id, nvc->vcpu->vcpu_id);
> +    }
> +    printk("\n");
> +    spin_unlock(&prv->waitq_lock);
> +
> +    spin_unlock_irqrestore(&prv->lock, flags);
> +#undef cpustr
> +}
> +
> +const struct scheduler sched_null_def = {
> +    .name           = "null Scheduler",
> +    .opt_name       = "null",
> +    .sched_id       = XEN_SCHEDULER_NULL,
> +    .sched_data     = NULL,
> +
> +    .init           = null_init,
> +    .deinit         = null_deinit,
> +    .init_pdata     = null_init_pdata,
> +    .switch_sched   = null_switch_sched,
> +    .deinit_pdata   = null_deinit_pdata,
> +
> +    .alloc_vdata    = null_alloc_vdata,
> +    .free_vdata     = null_free_vdata,
> +    .alloc_domdata  = null_alloc_domdata,
> +    .free_domdata   = null_free_domdata,
> +
> +    .init_domain    = null_dom_init,
> +    .destroy_domain = null_dom_destroy,
> +
> +    .insert_vcpu    = null_vcpu_insert,
> +    .remove_vcpu    = null_vcpu_remove,
> +
> +    .wake           = null_vcpu_wake,
> +    .sleep          = null_vcpu_sleep,
> +    .pick_cpu       = null_cpu_pick,
> +    .migrate        = null_vcpu_migrate,
> +    .do_schedule    = null_schedule,
> +
> +    .dump_cpu_state = null_dump_pcpu,
> +    .dump_settings  = null_dump,
> +};
> +
> +REGISTER_SCHEDULER(sched_null_def);
> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> index 223a120..b482037 100644
> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -1785,6 +1785,8 @@ int schedule_cpu_switch(unsigned int cpu, struct cpupool *c)
>  
>   out:
>      per_cpu(cpupool, cpu) = c;
> +    /* Trigger a reschedule so the CPU can pick up some work ASAP. */
> +    cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
>  
>      return 0;
>  }

Why? This change is not explained in the commit message.


> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> index 85cbb7c..32b578d 100644
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -352,6 +352,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_domctl_max_vcpus_t);
>  #define XEN_SCHEDULER_CREDIT2  6
>  #define XEN_SCHEDULER_ARINC653 7
>  #define XEN_SCHEDULER_RTDS     8
> +#define XEN_SCHEDULER_NULL     9
>  
>  typedef struct xen_domctl_sched_credit {
>      uint16_t weight;
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] xen: sched_null: support for hard affinity
  2017-03-17 18:43 ` [PATCH 2/3] xen: sched_null: support for hard affinity Dario Faggioli
@ 2017-03-20 23:46   ` Stefano Stabellini
  2017-03-21  8:47     ` Dario Faggioli
  0 siblings, 1 reply; 22+ messages in thread
From: Stefano Stabellini @ 2017-03-20 23:46 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Jonathan Davies, George Dunlap, Marcus Granado,
	Stefano Stabellini, Julien Grall, xen-devel

On Fri, 17 Mar 2017, Dario Faggioli wrote:
> As a (rudimental) way of directing and affecting the
> placement logic implemented by the scheduler, support
> vCPU hard affinity.
> 
> Basically, a vCPU will now be assigned only to a pCPU
> that is part of its own hard affinity. If such pCPU(s)
> is (are) busy, the vCPU will wait, like it happens
> when there are no free pCPUs.
> 
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> ---
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Stefano Stabellini <stefano@aporeto.com>
> Cc: Julien Grall <julien.grall@arm.com>
> Cc: Jonathan Davies <Jonathan.Davies@citrix.com>
> Cc: Marcus Granado <marcus.granado@citrix.com>
> ---
>  xen/common/sched_null.c |   43 ++++++++++++++++++++++++++++++++-----------
>  1 file changed, 32 insertions(+), 11 deletions(-)
> 
> diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c
> index 6a13308..ea055f1 100644
> --- a/xen/common/sched_null.c
> +++ b/xen/common/sched_null.c
> @@ -117,6 +117,14 @@ static inline struct null_dom *null_dom(const struct domain *d)
>      return d->sched_priv;
>  }
>  
> +static inline bool check_nvc_affinity(struct null_vcpu *nvc, unsigned int cpu)
> +{
> +    cpumask_and(cpumask_scratch_cpu(cpu), nvc->vcpu->cpu_hard_affinity,
> +                cpupool_domain_cpumask(nvc->vcpu->domain));
> +
> +    return cpumask_test_cpu(cpu, cpumask_scratch_cpu(cpu));
> +}

If you make it take a struct vcpu* as first argument, it will be more
generally usable


>  static int null_init(struct scheduler *ops)
>  {
>      struct null_private *prv;
> @@ -284,16 +292,20 @@ static unsigned int pick_cpu(struct null_private *prv, struct vcpu *v)
>  
>      ASSERT(spin_is_locked(per_cpu(schedule_data, cpu).schedule_lock));
>  
> +    cpumask_and(cpumask_scratch_cpu(cpu), v->cpu_hard_affinity, cpus);
> +
>      /*
>       * If our processor is free, or we are assigned to it, and it is
> -     * also still valid, just go for it.
> +     * also still valid and part of our affinity, just go for it.
>       */
>      if ( likely((per_cpu(npc, cpu).vcpu == NULL || per_cpu(npc, cpu).vcpu == v)
> -                && cpumask_test_cpu(cpu, cpus)) )
> +                && cpumask_test_cpu(cpu, cpumask_scratch_cpu(cpu))) )

Then you can use it here:
     check_nvc_affinity(v, cpu);


>          return cpu;
>  
> -    /* If not, just go for a valid free pCPU, if any */
> +    /* If not, just go for a free pCPU, within our affinity, if any */
>      cpumask_and(cpumask_scratch_cpu(cpu), &prv->cpus_free, cpus);
> +    cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu),
> +                v->cpu_hard_affinity);

You can do this with one cpumask_and (in addition to the one above):

   cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu),
               &prv->cpus_free);


>      cpu = cpumask_first(cpumask_scratch_cpu(cpu));
>  
>      /*
> @@ -308,7 +320,10 @@ static unsigned int pick_cpu(struct null_private *prv, struct vcpu *v)
>       * only if the pCPU is free.
>       */
>      if ( unlikely(cpu == nr_cpu_ids) )
> -        cpu = cpumask_any(cpus);
> +    {
> +        cpumask_and(cpumask_scratch_cpu(cpu), cpus, v->cpu_hard_affinity);

Could the intersection be 0?


> +        cpu = cpumask_any(cpumask_scratch_cpu(cpu));
> +    }
>  
>      return cpu;
>  }
> @@ -391,6 +406,9 @@ static void null_vcpu_insert(const struct scheduler *ops, struct vcpu *v)
>          lock = pcpu_schedule_lock(cpu);
>      }
>  
> +    cpumask_and(cpumask_scratch_cpu(cpu), v->cpu_hard_affinity,
> +		cpupool_domain_cpumask(v->domain));
> +

coding style


>      /*
>       * If the pCPU is free, we assign v to it.
>       *
> @@ -408,8 +426,7 @@ static void null_vcpu_insert(const struct scheduler *ops, struct vcpu *v)
>           */
>          vcpu_assign(prv, v, cpu);
>      }
> -    else if ( cpumask_intersects(&prv->cpus_free,
> -                                 cpupool_domain_cpumask(v->domain)) )
> +    else if ( cpumask_intersects(&prv->cpus_free, cpumask_scratch_cpu(cpu)) )
>      {
>          spin_unlock(lock);
>          goto retry;
> @@ -462,7 +479,7 @@ static void null_vcpu_remove(const struct scheduler *ops, struct vcpu *v)
>  
>      spin_lock(&prv->waitq_lock);
>      wvc = list_first_entry_or_null(&prv->waitq, struct null_vcpu, waitq_elem);
> -    if ( wvc )
> +    if ( wvc && cpumask_test_cpu(cpu, cpumask_scratch_cpu(cpu)) )

shouldn't this be
    
    check_nvc_affinity(wvc, cpu)

?


>      {
>          vcpu_assign(prv, wvc->vcpu, cpu);
>          list_del_init(&wvc->waitq_elem);
> @@ -550,7 +567,7 @@ static void null_vcpu_migrate(const struct scheduler *ops, struct vcpu *v,
>  
>          spin_lock(&prv->waitq_lock);
>          wvc = list_first_entry_or_null(&prv->waitq, struct null_vcpu, waitq_elem);
> -        if ( wvc && cpumask_test_cpu(cpu, cpupool_domain_cpumask(v->domain)) )
> +        if ( wvc && check_nvc_affinity(wvc, cpu) )
>          {
>              vcpu_assign(prv, wvc->vcpu, cpu);
>              list_del_init(&wvc->waitq_elem);
> @@ -573,11 +590,15 @@ static void null_vcpu_migrate(const struct scheduler *ops, struct vcpu *v,
>       * Let's now consider new_cpu, which is where v is being sent. It can be
>       * either free, or have a vCPU already assigned to it.
>       *
> -     * In the former case, we should assign v to it, and try to get it to run.
> +     * In the former case, we should assign v to it, and try to get it to run,
> +     * if possible, according to affinity.
>       *
>       * In latter, all we can do is to park v in the waitqueue.
>       */
> -    if ( per_cpu(npc, new_cpu).vcpu == NULL )
> +    cpumask_and(cpumask_scratch_cpu(cpu), cpupool_domain_cpumask(v->domain),
> +                nvc->vcpu->cpu_hard_affinity);
> +    if ( per_cpu(npc, new_cpu).vcpu == NULL &&
> +         cpumask_test_cpu(new_cpu, cpumask_scratch_cpu(cpu)) )

could you do instead:
            check_nvc_affinity(nvc, new_cpu)
?


>      {
>          /* We don't know whether v was in the waitqueue. If yes, remove it */
>          spin_lock(&prv->waitq_lock);
> @@ -666,7 +687,7 @@ static struct task_slice null_schedule(const struct scheduler *ops,
>      {
>          spin_lock(&prv->waitq_lock);
>          wvc = list_first_entry_or_null(&prv->waitq, struct null_vcpu, waitq_elem);
> -        if ( wvc )
> +        if ( wvc && check_nvc_affinity(wvc, cpu) )
>          {
>              vcpu_assign(prv, wvc->vcpu, cpu);
>              list_del_init(&wvc->waitq_elem);
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] xen: sched: introduce the 'null' semi-static scheduler
  2017-03-20 23:21   ` Stefano Stabellini
@ 2017-03-21  8:26     ` Dario Faggioli
  0 siblings, 0 replies; 22+ messages in thread
From: Dario Faggioli @ 2017-03-21  8:26 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Jonathan Davies, Julien Grall, George Dunlap, Marcus Granado, xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 9015 bytes --]

On Mon, 2017-03-20 at 16:21 -0700, Stefano Stabellini wrote:
> On Fri, 17 Mar 2017, Dario Faggioli wrote:
> > 
> > --- /dev/null
> > +++ b/xen/common/sched_null.c

> > +/*
> > + * Virtual CPU
> > + */
> > +struct null_vcpu {
> > +    struct list_head waitq_elem;
> > +    struct null_dom *ndom;
> 
> This field is redundant, given that from struct vcpu you can get
> struct
> domain and from struct domain you can get struct null_dom. I would
> remove it.
> 
It's kind of a paradigm, followed by pretty much all schedulers. So
it's good to uniform to that, and it's often quite useful (or it may be
in future)... I'll have a look, though, at whether it is really
important to have it in this simple scheduler too, and do remove it if
it's not worth.

> > +    struct vcpu *vcpu;
> > +    int pcpu;          /* To what pCPU the vCPU is assigned (-1 if
> > none) */
> 
> Isn't this always the same as struct vcpu->processor?
> If it's only different when the vcpu is waiting in the waitq, then
> you
> can just remove this field and replace the pcpu == -1 test with
> list_empty(waitq_elem).
> 
I'll think about it. Right now, it's useful for ASSERTing and
consistency checking, which is something I want, at least in this
phase. It is also useful for reporting to what pcpu a vcpu is currently
assigned, for which thing you can't trust v->processor. That only
happens in `xl debug-key r' for now, but we may want to have less
tricky way of exporting such information in future.

Anyway, as I said, I'll ponder whether I can get rid of it.

> > +static void null_vcpu_remove(const struct scheduler *ops, struct
> > vcpu *v)
> > +{
> > +    struct null_private *prv = null_priv(ops);
> > +    struct null_vcpu *wvc, *nvc = null_vcpu(v);
> > +    unsigned int cpu;
> > +    spinlock_t *lock;
> > +
> > +    ASSERT(!is_idle_vcpu(v));
> > +
> > +    lock = vcpu_schedule_lock_irq(v);
> > +
> > +    cpu = v->processor;
> > +
> > +    /* If v is in waitqueue, just get it out of there and bail */
> > +    if ( unlikely(nvc->pcpu == -1) )
> > +    {
> > +        spin_lock(&prv->waitq_lock);
> > +
> > +        ASSERT(!list_empty(&null_vcpu(v)->waitq_elem));
> > +        list_del_init(&nvc->waitq_elem);
> > +
> > +        spin_unlock(&prv->waitq_lock);
> > +
> > +        goto out;
> > +    }
> > +
> > +    /*
> > +     * If v is assigned to a pCPU, let's see if there is someone
> > waiting.
> > +     * If yes, we assign it to cpu, in spite of v. If no, we just
> > set
> > +     * cpu free.
> > +     */
> > +
> > +    ASSERT(per_cpu(npc, cpu).vcpu == v);
> > +    ASSERT(!cpumask_test_cpu(cpu, &prv->cpus_free));
> > +
> > +    spin_lock(&prv->waitq_lock);
> > +    wvc = list_first_entry_or_null(&prv->waitq, struct null_vcpu,
> > waitq_elem);
> > +    if ( wvc )
> > +    {
> > +        vcpu_assign(prv, wvc->vcpu, cpu);
> 
> The vcpu_assign in null_vcpu_insert is protected by the pcpu runq
> lock,
> while this call is protected by the waitq_lock lock. Is that safe?
> 
vcpu assignment is always protected by the runqueue lock. Both in
null_vcpu_insert and() in here, we take it with:

 lock = vcpu_schedule_lock_irq(v);

at the beginning of the function (left in context, see above).

Taking the waitq_lock here serializes access to the waitqueue
(prv->waitqueue), i.e., the list_first_entry_or_null() call above, and
the list_del_init() call below.

> > +        list_del_init(&wvc->waitq_elem);
> > +        cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
> > +    }
> > +    else
> > +    {
> > +        vcpu_deassign(prv, v, cpu);
> > +    }
> > +    spin_unlock(&prv->waitq_lock);
> > +
> > + out:
> > +    vcpu_schedule_unlock_irq(lock, v);
> > +
> > +    SCHED_STAT_CRANK(vcpu_remove);
> > +}
> > +
> > +static void null_vcpu_wake(const struct scheduler *ops, struct
> > vcpu *v)
> > +{
> > +    ASSERT(!is_idle_vcpu(v));
> > +
> > +    if ( unlikely(curr_on_cpu(v->processor) == v) )
> > +    {
> > +        SCHED_STAT_CRANK(vcpu_wake_running);
> > +        return;
> > +    }
> > +
> > +    if ( null_vcpu(v)->pcpu == -1 )
> > +    {
> > +	/* Not exactly "on runq", but close enough for reusing the
> > counter */
> > +        SCHED_STAT_CRANK(vcpu_wake_onrunq);
> > +	return;
> 
> coding style
> 
Yeah... I need to double check the style of all the file (patch
series?). I mostly wrote this while travelling, with an editor set
differently from what I use when at home. I thought I had done that,
but I clearly missed a couple of sports. Sorry.

> > +static void null_vcpu_migrate(const struct scheduler *ops, struct
> > vcpu *v,
> > +                              unsigned int new_cpu)
> > +{
> > +    struct null_private *prv = null_priv(ops);
> > +    struct null_vcpu *nvc = null_vcpu(v);
> > +    unsigned int cpu = v->processor;
> > +
> > +    ASSERT(!is_idle_vcpu(v));
> > +
> > +    if ( v->processor == new_cpu )
> > +        return;
> > +
> > +    /*
> > +     * v is either in the waitqueue, or assigned to a pCPU.
> > +     *
> > +     * In the former case, there is nothing to do.
> > +     *
> > +     * In the latter, the pCPU to which it was assigned would
> > become free,
> > +     * and we, therefore, should check whether there is anyone in
> > the
> > +     * waitqueue that can be assigned to it.
> > +     */
> > +    if ( likely(nvc->pcpu != -1) )
> > +    {
> > +        struct null_vcpu *wvc;
> > +
> > +        spin_lock(&prv->waitq_lock);
> > +        wvc = list_first_entry_or_null(&prv->waitq, struct
> > null_vcpu, waitq_elem);
> > +        if ( wvc && cpumask_test_cpu(cpu,
> > cpupool_domain_cpumask(v->domain)) )
> > +        {
> > +            vcpu_assign(prv, wvc->vcpu, cpu);
> > +            list_del_init(&wvc->waitq_elem);
> > +            cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
> > +        }
> > +        else
> > +        {
> > +            vcpu_deassign(prv, v, cpu);
> > +        }
> > +        spin_unlock(&prv->waitq_lock);
> 
> This looks very similar to null_vcpu_remove, maybe you want to
> refactor
> the code and call a single shared service function.
> 
Yes, maybe I can.

> > +	SCHED_STAT_CRANK(migrate_running);
> > +    }
> > +    else
> > +        SCHED_STAT_CRANK(migrate_on_runq);
> > +
> > +    SCHED_STAT_CRANK(migrated);
> > +
> > +    /*
> > +     * Let's now consider new_cpu, which is where v is being sent.
> > It can be
> > +     * either free, or have a vCPU already assigned to it.
> > +     *
> > +     * In the former case, we should assign v to it, and try to
> > get it to run.
> > +     *
> > +     * In latter, all we can do is to park v in the waitqueue.
> > +     */
> > +    if ( per_cpu(npc, new_cpu).vcpu == NULL )
> > +    {
> > +        /* We don't know whether v was in the waitqueue. If yes,
> > remove it */
> > +        spin_lock(&prv->waitq_lock);
> > +        list_del_init(&nvc->waitq_elem);
> > +        spin_unlock(&prv->waitq_lock);
> > +
> > +        vcpu_assign(prv, v, new_cpu);
> 
> This vcpu_assign call seems to be unprotected. Should it be within a
> spin_lock'ed area?
> 
Lock is taken by the caller. Check calls to SCHED_OP(...,migrate) in
xen/common/schedule.c.

That's by design, and it's like that for most functions (with the sole
exceptions of debug dump and insert/remove, IIRC), for all schedulers.

> > diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> > index 223a120..b482037 100644
> > --- a/xen/common/schedule.c
> > +++ b/xen/common/schedule.c
> > @@ -1785,6 +1785,8 @@ int schedule_cpu_switch(unsigned int cpu,
> > struct cpupool *c)
> >  
> >   out:
> >      per_cpu(cpupool, cpu) = c;
> > +    /* Trigger a reschedule so the CPU can pick up some work ASAP.
> > */
> > +    cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
> >  
> >      return 0;
> >  }
> 
> Why? This change is not explained in the commit message.
> 
You're right. This may well go into it's own patch, actually. I'll see
how I like it better.

Thanks and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

[-- Attachment #2: Type: text/plain, Size: 127 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] xen: sched_null: support for hard affinity
  2017-03-20 23:46   ` Stefano Stabellini
@ 2017-03-21  8:47     ` Dario Faggioli
  0 siblings, 0 replies; 22+ messages in thread
From: Dario Faggioli @ 2017-03-21  8:47 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Jonathan Davies, George Dunlap, Marcus Granado,
	Stefano Stabellini, Julien Grall, xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 3876 bytes --]

On Mon, 2017-03-20 at 16:46 -0700, Stefano Stabellini wrote:
> On Fri, 17 Mar 2017, Dario Faggioli wrote:
> > 
> > --- a/xen/common/sched_null.c
> > +++ b/xen/common/sched_null.c
> > @@ -117,6 +117,14 @@ static inline struct null_dom *null_dom(const
> > struct domain *d)
> >      return d->sched_priv;
> >  }
> >  
> > +static inline bool check_nvc_affinity(struct null_vcpu *nvc,
> > unsigned int cpu)
> > +{
> > +    cpumask_and(cpumask_scratch_cpu(cpu), nvc->vcpu-
> > >cpu_hard_affinity,
> > +                cpupool_domain_cpumask(nvc->vcpu->domain));
> > +
> > +    return cpumask_test_cpu(cpu, cpumask_scratch_cpu(cpu));
> > +}
> 
> If you make it take a struct vcpu* as first argument, it will be more
> generally usable
> 
Yes, that's probably a good idea. Thanks.

> >          return cpu;
> >  
> > -    /* If not, just go for a valid free pCPU, if any */
> > +    /* If not, just go for a free pCPU, within our affinity, if
> > any */
> >      cpumask_and(cpumask_scratch_cpu(cpu), &prv->cpus_free, cpus);
> > +    cpumask_and(cpumask_scratch_cpu(cpu),
> > cpumask_scratch_cpu(cpu),
> > +                v->cpu_hard_affinity);
> 
> You can do this with one cpumask_and (in addition to the one above):
> 
>    cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu),
>                &prv->cpus_free);
> 
Mmm... right. Wow... Quite an overlook on my side! :-P

> > @@ -308,7 +320,10 @@ static unsigned int pick_cpu(struct
> > null_private *prv, struct vcpu *v)
> >       * only if the pCPU is free.
> >       */
> >      if ( unlikely(cpu == nr_cpu_ids) )
> > -        cpu = cpumask_any(cpus);
> > +    {
> > +        cpumask_and(cpumask_scratch_cpu(cpu), cpus, v-
> > >cpu_hard_affinity);
> 
> Could the intersection be 0?
> 
Not really. Because vcpu_set_hard_affinity() fails when trying to set
the affinity to something that has a zero intersection with the set of
cpus of the pool the domain is in.

Other schedulers relies on this too.

However, I need to re-check what happens if everything is ok when
changing the affinity, but then all the cpus in the affinity itself are
removed from the pool... I will do that as, as I said, this is
something general (and this area is really a can of worms... And I keep
finding weird corner cases! :-O)

> > @@ -408,8 +426,7 @@ static void null_vcpu_insert(const struct
> > scheduler *ops, struct vcpu *v)
> >           */
> >          vcpu_assign(prv, v, cpu);
> >      }
> > -    else if ( cpumask_intersects(&prv->cpus_free,
> > -                                 cpupool_domain_cpumask(v-
> > >domain)) )
> > +    else if ( cpumask_intersects(&prv->cpus_free,
> > cpumask_scratch_cpu(cpu)) )
> >      {
> >          spin_unlock(lock);
> >          goto retry;
> > @@ -462,7 +479,7 @@ static void null_vcpu_remove(const struct
> > scheduler *ops, struct vcpu *v)
> >  
> >      spin_lock(&prv->waitq_lock);
> >      wvc = list_first_entry_or_null(&prv->waitq, struct null_vcpu,
> > waitq_elem);
> > -    if ( wvc )
> > +    if ( wvc && cpumask_test_cpu(cpu, cpumask_scratch_cpu(cpu)) )
> 
> shouldn't this be
>     
>     check_nvc_affinity(wvc, cpu)
> 
> ?
> 
Mmm... well, considering that I never touch cpumask_scratch_cpu() in
this function, this is clearly a bug. Will fix. :-/

Thanks and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

[-- Attachment #2: Type: text/plain, Size: 127 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/3] tools: sched: add support for 'null' scheduler
  2017-03-17 18:43 ` [PATCH 3/3] tools: sched: add support for 'null' scheduler Dario Faggioli
  2017-03-20 22:28   ` Stefano Stabellini
@ 2017-03-21 17:09   ` Wei Liu
  2017-03-27 10:50   ` George Dunlap
  2 siblings, 0 replies; 22+ messages in thread
From: Wei Liu @ 2017-03-21 17:09 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Wei Liu, Ian Jackson, George Dunlap, Stefano Stabellini,
	Julien Grall, xen-devel

On Fri, Mar 17, 2017 at 07:43:13PM +0100, Dario Faggioli wrote:
> It being very very basic, also means this scheduler does
> not need much support at the tools level (for now).
> 
> Basically, just the definition of the symbol of the
> scheduler itself and a couple of stubs.
> 
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

Acked-by: Wei Liu <wei.liu2@citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] xen: sched: introduce the 'null' semi-static scheduler
  2017-03-17 18:42 ` [PATCH 1/3] xen: sched: introduce the 'null' semi-static scheduler Dario Faggioli
  2017-03-20 23:21   ` Stefano Stabellini
@ 2017-03-27 10:31   ` George Dunlap
  2017-03-27 10:48     ` George Dunlap
  2017-04-06 15:07     ` Dario Faggioli
  1 sibling, 2 replies; 22+ messages in thread
From: George Dunlap @ 2017-03-27 10:31 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel
  Cc: Jonathan Davies, Julien Grall, Stefano Stabellini, Marcus Granado

On 17/03/17 18:42, Dario Faggioli wrote:
> In cases where one is absolutely sure that there will be
> less vCPUs than pCPUs, having to pay the cose, mostly in
> terms of overhead, of an advanced scheduler may be not
> desirable.
> 
> The simple scheduler implemented here could be a solution.
> Here how it works:
>  - each vCPU is statically assigned to a pCPU;
>  - if there are pCPUs without any vCPU assigned, they
>    stay idle (as in, the run their idle vCPU);
>  - if there are vCPUs which are not assigned to any
>    pCPU (e.g., because there are more vCPUs than pCPUs)
>    they *don't* run, until they get assigned;
>  - if a vCPU assigned to a pCPU goes away, one of the
>    waiting to be assigned vCPU, if any, gets assigned
>    to the pCPU and can run there.

Hmm -- I'm not sure about this 'waitqueue' thing.  If you have a
multi-vcpu VM and one vcpu hangs, what normally happens is that the rest
of the VM ends up wedging itself in an unpredictable way, and if there's
a watchdog timer or sanity check of any sort then it will hit a
bugcheck.  As implemented, any number of mundane operations may cause
such a situation if you have one less pcpu or one more vcpu than you
thought.  This seems like a fairly "sharp edge" to have in the interface.

Would it be possible instead to have domain assignment, vcpu-add /
remove, pcpu remove, &c just fail (perhaps with -ENOSPC and/or -EBUSY)
if we ever reach a situation where |vcpus| > |pcpus|?

Or, to fail as many operations *as possible* which would bring us to
that state, use the `waitqueue` idea as a backup for situations where we
can't really avoid it?

Regarding the code, my brain doesn't seem to be at 100% this morning for
some reason, so just a couple of questions...

> +static void null_vcpu_insert(const struct scheduler *ops, struct vcpu *v)
> +{
> +    struct null_private *prv = null_priv(ops);
> +    struct null_vcpu *nvc = null_vcpu(v);
> +    unsigned int cpu;
> +    spinlock_t *lock;
> +
> +    ASSERT(!is_idle_vcpu(v));
> +
> + retry:
> +    lock = vcpu_schedule_lock_irq(v);
> +
> +    cpu = pick_cpu(prv, v);
> +
> +    /* We hold v->processor's runq lock, but we need cpu's one */
> +    if ( cpu != v->processor )
> +    {
> +        spin_unlock(lock);
> +        lock = pcpu_schedule_lock(cpu);

Don't we need to hold the lock for v->processor until we change
v->processor?  Otherwise someone might call vcpu_schedule_lock(v) at
this point and reasonably believe that is has the right to modify v.

Or does this not matter because we're just now calling insert (and so
nobody else is going to call vcpu_schedule_lock() on v?

> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> index 223a120..b482037 100644
> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -1785,6 +1785,8 @@ int schedule_cpu_switch(unsigned int cpu, struct cpupool *c)
>  
>   out:
>      per_cpu(cpupool, cpu) = c;
> +    /* Trigger a reschedule so the CPU can pick up some work ASAP. */
> +    cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);

Is this a more generic fix / improvement?

At first blush everything else looks good.

 -George


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] xen: sched: introduce the 'null' semi-static scheduler
  2017-03-27 10:31   ` George Dunlap
@ 2017-03-27 10:48     ` George Dunlap
  2017-04-06 14:43       ` Dario Faggioli
  2017-04-06 15:07     ` Dario Faggioli
  1 sibling, 1 reply; 22+ messages in thread
From: George Dunlap @ 2017-03-27 10:48 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel
  Cc: Jonathan Davies, Julien Grall, Stefano Stabellini, Marcus Granado

On Mon, Mar 27, 2017 at 11:31 AM, George Dunlap
<george.dunlap@citrix.com> wrote:
> On 17/03/17 18:42, Dario Faggioli wrote:
>> In cases where one is absolutely sure that there will be
>> less vCPUs than pCPUs, having to pay the cose, mostly in
>> terms of overhead, of an advanced scheduler may be not
>> desirable.
>>
>> The simple scheduler implemented here could be a solution.
>> Here how it works:
>>  - each vCPU is statically assigned to a pCPU;
>>  - if there are pCPUs without any vCPU assigned, they
>>    stay idle (as in, the run their idle vCPU);
>>  - if there are vCPUs which are not assigned to any
>>    pCPU (e.g., because there are more vCPUs than pCPUs)
>>    they *don't* run, until they get assigned;
>>  - if a vCPU assigned to a pCPU goes away, one of the
>>    waiting to be assigned vCPU, if any, gets assigned
>>    to the pCPU and can run there.
>
> Hmm -- I'm not sure about this 'waitqueue' thing.  If you have a
> multi-vcpu VM and one vcpu hangs, what normally happens is that the rest
> of the VM ends up wedging itself in an unpredictable way, and if there's
> a watchdog timer or sanity check of any sort then it will hit a
> bugcheck.  As implemented, any number of mundane operations may cause
> such a situation if you have one less pcpu or one more vcpu than you
> thought.  This seems like a fairly "sharp edge" to have in the interface.
>
> Would it be possible instead to have domain assignment, vcpu-add /
> remove, pcpu remove, &c just fail (perhaps with -ENOSPC and/or -EBUSY)
> if we ever reach a situation where |vcpus| > |pcpus|?
>
> Or, to fail as many operations *as possible* which would bring us to
> that state, use the `waitqueue` idea as a backup for situations where we
> can't really avoid it?

I suppose one reason is that it looks like a lot of the operations
can't really fail -- insert_vcpu and deinit_pdata both return void,
and the scheduler isn't even directly involved in setting the hard
affinity, so doesn't get a chance to object that with the new hard
affinity there is nowhere to run the vcpu.

I don't want to wait to re-write the interfaces to get this scheduler
in, so I suppose the waitqueue thing will have to do for now. :-)

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/3] tools: sched: add support for 'null' scheduler
  2017-03-17 18:43 ` [PATCH 3/3] tools: sched: add support for 'null' scheduler Dario Faggioli
  2017-03-20 22:28   ` Stefano Stabellini
  2017-03-21 17:09   ` Wei Liu
@ 2017-03-27 10:50   ` George Dunlap
  2017-04-06 10:49     ` Dario Faggioli
  2 siblings, 1 reply; 22+ messages in thread
From: George Dunlap @ 2017-03-27 10:50 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel
  Cc: Stefano Stabellini, Ian Jackson, Julien Grall, Wei Liu

On 17/03/17 18:43, Dario Faggioli wrote:
> It being very very basic, also means this scheduler does
> not need much support at the tools level (for now).
> 
> Basically, just the definition of the symbol of the
> scheduler itself and a couple of stubs.
> 
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> ---
> Cc: Wei Liu <wei.liu2@citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Stefano Stabellini <stefano@aporeto.com>
> Cc: Julien Grall <julien.grall@arm.com>
> ---
>  tools/libxl/libxl.h         |    6 ++++++
>  tools/libxl/libxl_sched.c   |   24 ++++++++++++++++++++++++
>  tools/libxl/libxl_types.idl |    1 +
>  3 files changed, 31 insertions(+)
> 
> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> index 4c60e8f..5adac2d 100644
> --- a/tools/libxl/libxl.h
> +++ b/tools/libxl/libxl.h
> @@ -210,6 +210,12 @@
>  #define LIBXL_HAVE_SCHED_RTDS 1
>  
>  /*
> + * LIBXL_HAVE_SCHED_NULL indicates that the 'null' static scheduler
> + * is available.
> + */
> +#define LIBXL_HAVE_SCHED_NULL 1
> +
> +/*
>   * libxl_domain_build_info has u.hvm.viridian_enable and _disable bitmaps
>   * of the specified width.
>   */
> diff --git a/tools/libxl/libxl_sched.c b/tools/libxl/libxl_sched.c
> index 84d3837..d44fbe1 100644
> --- a/tools/libxl/libxl_sched.c
> +++ b/tools/libxl/libxl_sched.c
> @@ -178,6 +178,20 @@ static int sched_arinc653_domain_set(libxl__gc *gc, uint32_t domid,
>      return 0;
>  }
>  
> +static int sched_null_domain_set(libxl__gc *gc, uint32_t domid,
> +                                 const libxl_domain_sched_params *scinfo)
> +{
> +    /* The null scheduler doesn't take any domain-specific parameters. */
> +    return 0;
> +}
> +
> +static int sched_null_domain_get(libxl__gc *gc, uint32_t domid,
> +                               libxl_domain_sched_params *scinfo)
> +{
> +    /* The null scheduler doesn't have any domain-specific parameters. */
> +    return ERROR_INVAL;
> +}

Why the different return value?  Why not return either INVAL or SUCCESS
for both?

 -George


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/3] tools: sched: add support for 'null' scheduler
  2017-03-27 10:50   ` George Dunlap
@ 2017-04-06 10:49     ` Dario Faggioli
  2017-04-06 13:59       ` George Dunlap
  0 siblings, 1 reply; 22+ messages in thread
From: Dario Faggioli @ 2017-04-06 10:49 UTC (permalink / raw)
  To: George Dunlap, xen-devel
  Cc: Stefano Stabellini, Ian Jackson, Julien Grall, Wei Liu


[-- Attachment #1.1: Type: text/plain, Size: 2585 bytes --]

On Mon, 2017-03-27 at 11:50 +0100, George Dunlap wrote:
> On 17/03/17 18:43, Dario Faggioli wrote:
> > --- a/tools/libxl/libxl_sched.c
> > +++ b/tools/libxl/libxl_sched.c
> > @@ -178,6 +178,20 @@ static int sched_arinc653_domain_set(libxl__gc
> > *gc, uint32_t domid,
> >      return 0;
> >  }
> >  
> > +static int sched_null_domain_set(libxl__gc *gc, uint32_t domid,
> > +                                 const libxl_domain_sched_params
> > *scinfo)
> > +{
> > +    /* The null scheduler doesn't take any domain-specific
> > parameters. */
> > +    return 0;
> > +}
> > +
> > +static int sched_null_domain_get(libxl__gc *gc, uint32_t domid,
> > +                               libxl_domain_sched_params *scinfo)
> > +{
> > +    /* The null scheduler doesn't have any domain-specific
> > parameters. */
> > +    return ERROR_INVAL;
> > +}
> 
> Why the different return value?  Why not return either INVAL or
> SUCCESS
> for both?
> 
Because domain_set() is called by libxl_domain_sched_params_set(),
which is in turn called unconditionally within libxl__build_post(),
with the purpose of setting the scheduling parameters chosen by the
user during domain creation.

If that fails (I've tried that), domain creation fails too. So either
it returns success, or we'd have to modify (at least)
liblx__build_post(), teaching it about acceptable failures.

OTOH, we indeed could return success for domain_get() too, for the sake
of having the two above functions return the same. But I really think
that call should fail, as an indication to the callers that they won't
get the value of any parameter for this scheduler.

The behavior achieved by this patch is the same one already in place
for ARINC653, with the difference that sched_arinc653_sched_get() is
not implemented, which means it is libxl_domain_sched_params_get() that
fails, if LIBXL_SCHED_ARINC653 is used. That's an option too, but then
libxl will also print "Unknown scheduler", which is not really correct
(not even in the ARINC case, actually!).

So, having evaluated all the choices, the asymmetric behavior above
seems the best one to me. And I'll add a comment to explain the
situation.

Thanks and regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

[-- Attachment #2: Type: text/plain, Size: 127 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/3] tools: sched: add support for 'null' scheduler
  2017-04-06 10:49     ` Dario Faggioli
@ 2017-04-06 13:59       ` George Dunlap
  2017-04-06 15:18         ` Dario Faggioli
  0 siblings, 1 reply; 22+ messages in thread
From: George Dunlap @ 2017-04-06 13:59 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel
  Cc: Stefano Stabellini, Ian Jackson, Julien Grall, Wei Liu

On 06/04/17 11:49, Dario Faggioli wrote:
> On Mon, 2017-03-27 at 11:50 +0100, George Dunlap wrote:
>> On 17/03/17 18:43, Dario Faggioli wrote:
>>> --- a/tools/libxl/libxl_sched.c
>>> +++ b/tools/libxl/libxl_sched.c
>>> @@ -178,6 +178,20 @@ static int sched_arinc653_domain_set(libxl__gc
>>> *gc, uint32_t domid,
>>>      return 0;
>>>  }
>>>  
>>> +static int sched_null_domain_set(libxl__gc *gc, uint32_t domid,
>>> +                                 const libxl_domain_sched_params
>>> *scinfo)
>>> +{
>>> +    /* The null scheduler doesn't take any domain-specific
>>> parameters. */
>>> +    return 0;
>>> +}
>>> +
>>> +static int sched_null_domain_get(libxl__gc *gc, uint32_t domid,
>>> +                               libxl_domain_sched_params *scinfo)
>>> +{
>>> +    /* The null scheduler doesn't have any domain-specific
>>> parameters. */
>>> +    return ERROR_INVAL;
>>> +}
>>
>> Why the different return value?  Why not return either INVAL or
>> SUCCESS
>> for both?
>>
> Because domain_set() is called by libxl_domain_sched_params_set(),
> which is in turn called unconditionally within libxl__build_post(),
> with the purpose of setting the scheduling parameters chosen by the
> user during domain creation.
> 
> If that fails (I've tried that), domain creation fails too. So either
> it returns success, or we'd have to modify (at least)
> liblx__build_post(), teaching it about acceptable failures.
> 
> OTOH, we indeed could return success for domain_get() too, for the sake
> of having the two above functions return the same. But I really think
> that call should fail, as an indication to the callers that they won't
> get the value of any parameter for this scheduler.

I see.  So if *our* code doesn't know that there aren't any parameters
to set, that's OK; but if *other people's code doesn't know that there
aren't any parameters to get, it needs to be changed to know that.  Got
it. ;-)

There is a sort of mathematical logic to the idea that setting a null
set of parameters should always succeed; and it's certainly convenient
for tools to be able to always just call libxl_domain_sched_params_set()
without having to check what scheduler is there.  But the same logic I
think applies to get(), so I would say to return 0 for both.

But Wei and Ian have the final say.

 -George


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] xen: sched: introduce the 'null' semi-static scheduler
  2017-03-27 10:48     ` George Dunlap
@ 2017-04-06 14:43       ` Dario Faggioli
  0 siblings, 0 replies; 22+ messages in thread
From: Dario Faggioli @ 2017-04-06 14:43 UTC (permalink / raw)
  To: George Dunlap, xen-devel
  Cc: Jonathan Davies, Julien Grall, Stefano Stabellini, Marcus Granado


[-- Attachment #1.1: Type: text/plain, Size: 2674 bytes --]

On Mon, 2017-03-27 at 11:48 +0100, George Dunlap wrote:
> On Mon, Mar 27, 2017 at 11:31 AM, George Dunlap
> <george.dunlap@citrix.com> wrote:
> > 
> > Would it be possible instead to have domain assignment, vcpu-add /
> > remove, pcpu remove, &c just fail (perhaps with -ENOSPC and/or
> > -EBUSY)
> > if we ever reach a situation where |vcpus| > |pcpus|?
> > 
> > Or, to fail as many operations *as possible* which would bring us
> > to
> > that state, use the `waitqueue` idea as a backup for situations
> > where we
> > can't really avoid it?
> 
> I suppose one reason is that it looks like a lot of the operations
> can't really fail -- insert_vcpu and deinit_pdata both return void,
> and the scheduler isn't even directly involved in setting the hard
> affinity, so doesn't get a chance to object that with the new hard
> affinity there is nowhere to run the vcpu.
> 
This is exactly how it is.

The waitqueue handling is the most complicated thing to deal with in
this scheduler, and I expect it to be completely useless, at least if
the scheduler is used in the way we think it should be actually used.

*But* I feel like assuming that this will happen 100% of the time is
unrealistic, and a waitqueue was the best I could come up with.

As you say, there are a whole bunch of operations that just can't be
forced to fail by the scheduler. E.g., I won't be able to forbid
removing a pCPU from a sched_null pool because it has a vCPU assigned
to it, nor adding a domain (and hence it's vCPUs) to such pool, if
there are not enough free pCPUs. :-/

> I don't want to wait to re-write the interfaces to get this scheduler
> in, so I suppose the waitqueue thing will have to do for now. :-)
> 
Yep. :-D

Let me add that, FWIW, I've tested situations where a (Linux) VM with 4
vCPUs was in a null pool with 4 pCPUs and then, with all the vCPUs
running, I removed and re-added 3 of the 4 pCPUs of the pool. And while
I agree that this should not be done, and that it's at high risk of
confusing, stalling or deadlocking the guest kernel, nothing exploded.

Doing the same thing to dom0, for instance, proved to be a lot less
safe. :-)

What I certainly can do is adding a warning when a vCPU hits the
waitqueue. Chatty indeed, but we _do_want_ to be a bit nasty to the
ones that misuse the scheduler... it's for their own good! :-P

Thanks and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

[-- Attachment #2: Type: text/plain, Size: 127 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] xen: sched: introduce the 'null' semi-static scheduler
  2017-03-27 10:31   ` George Dunlap
  2017-03-27 10:48     ` George Dunlap
@ 2017-04-06 15:07     ` Dario Faggioli
  1 sibling, 0 replies; 22+ messages in thread
From: Dario Faggioli @ 2017-04-06 15:07 UTC (permalink / raw)
  To: George Dunlap, xen-devel
  Cc: Jonathan Davies, Julien Grall, Stefano Stabellini, Marcus Granado


[-- Attachment #1.1: Type: text/plain, Size: 2236 bytes --]

On Mon, 2017-03-27 at 11:31 +0100, George Dunlap wrote:
> On 17/03/17 18:42, Dario Faggioli wrote:
> > +static void null_vcpu_insert(const struct scheduler *ops, struct
> > vcpu *v)
> > +{
> > +    struct null_private *prv = null_priv(ops);
> > +    struct null_vcpu *nvc = null_vcpu(v);
> > +    unsigned int cpu;
> > +    spinlock_t *lock;
> > +
> > +    ASSERT(!is_idle_vcpu(v));
> > +
> > + retry:
> > +    lock = vcpu_schedule_lock_irq(v);
> > +
> > +    cpu = pick_cpu(prv, v);
> > +
> > +    /* We hold v->processor's runq lock, but we need cpu's one */
> > +    if ( cpu != v->processor )
> > +    {
> > +        spin_unlock(lock);
> > +        lock = pcpu_schedule_lock(cpu);
> 
> Don't we need to hold the lock for v->processor until we change
> v->processor?  Otherwise someone might call vcpu_schedule_lock(v) at
> this point and reasonably believe that is has the right to modify v.
> 
Yes, this is actually the case.

> Or does this not matter because we're just now calling insert (and so
> nobody else is going to call vcpu_schedule_lock() on v?
> 
Indeed no one will. But I still prefer to turn this into something much
more similar to what other schedulers do, and what is the most correct
and safe to do, i.e., as you suggest, change v->processor while holding
the original lock.

> > diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> > index 223a120..b482037 100644
> > --- a/xen/common/schedule.c
> > +++ b/xen/common/schedule.c
> > @@ -1785,6 +1785,8 @@ int schedule_cpu_switch(unsigned int cpu,
> > struct cpupool *c)
> >  
> >   out:
> >      per_cpu(cpupool, cpu) = c;
> > +    /* Trigger a reschedule so the CPU can pick up some work ASAP.
> > */
> > +    cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
> 
> Is this a more generic fix / improvement?
> 
Yep, I'm moving it in its own patch.

Thanks and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

[-- Attachment #2: Type: text/plain, Size: 127 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/3] tools: sched: add support for 'null' scheduler
  2017-04-06 13:59       ` George Dunlap
@ 2017-04-06 15:18         ` Dario Faggioli
  2017-04-07  9:42           ` Wei Liu
  0 siblings, 1 reply; 22+ messages in thread
From: Dario Faggioli @ 2017-04-06 15:18 UTC (permalink / raw)
  To: George Dunlap, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Julien Grall, Ian Jackson


[-- Attachment #1.1: Type: text/plain, Size: 2202 bytes --]

On Thu, 2017-04-06 at 14:59 +0100, George Dunlap wrote:
> On 06/04/17 11:49, Dario Faggioli wrote:
> > 
> > If that fails (I've tried that), domain creation fails too. So
> > either
> > it returns success, or we'd have to modify (at least)
> > liblx__build_post(), teaching it about acceptable failures.
> > 
> > OTOH, we indeed could return success for domain_get() too, for the
> > sake
> > of having the two above functions return the same. But I really
> > think
> > that call should fail, as an indication to the callers that they
> > won't
> > get the value of any parameter for this scheduler.
> 
> I see.  So if *our* code doesn't know that there aren't any
> parameters
> to set, that's OK; but if *other people's code doesn't know that
> there
> aren't any parameters to get, it needs to be changed to know
> that.  Got
> it. ;-)
> 
Of course! I mean, there must be some advantages and benefits in being
*us*! :-D

Actually, jokes apart, this is indeed asymmetric, but looks fair to me.
As in, _any_ caller (either xl/libxl or external) will see
libxl_domain_sched_params_set() succeeding, as well as _any_ caller
(either xl/libxl or external) will see libxl_domain_sched_params_get()
failing. :-)

> There is a sort of mathematical logic to the idea that setting a null
> set of parameters should always succeed; and it's certainly
> convenient
> for tools to be able to always just call
> libxl_domain_sched_params_set()
> without having to check what scheduler is there.  But the same logic
> I
> think applies to get(), so I would say to return 0 for both.
> 
I understand your point, and I'm happy to do that.

> But Wei and Ian have the final say.
> 
Wei acked this patch in v1 (20170321170902.ndk6h5ylyfkk4coo@citrix.com)
but that was before you raised this, so I'm happy to resend with this
changed, and doing whatever he prefers with his ack.

Wei?

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

[-- Attachment #2: Type: text/plain, Size: 127 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/3] tools: sched: add support for 'null' scheduler
  2017-04-06 15:18         ` Dario Faggioli
@ 2017-04-07  9:42           ` Wei Liu
  2017-04-07 10:05             ` Dario Faggioli
  0 siblings, 1 reply; 22+ messages in thread
From: Wei Liu @ 2017-04-07  9:42 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Wei Liu, Ian Jackson, George Dunlap, Stefano Stabellini,
	Julien Grall, xen-devel

On Thu, Apr 06, 2017 at 05:18:33PM +0200, Dario Faggioli wrote:
> On Thu, 2017-04-06 at 14:59 +0100, George Dunlap wrote:
> > On 06/04/17 11:49, Dario Faggioli wrote:
> > > 
> > > If that fails (I've tried that), domain creation fails too. So
> > > either
> > > it returns success, or we'd have to modify (at least)
> > > liblx__build_post(), teaching it about acceptable failures.
> > > 
> > > OTOH, we indeed could return success for domain_get() too, for the
> > > sake
> > > of having the two above functions return the same. But I really
> > > think
> > > that call should fail, as an indication to the callers that they
> > > won't
> > > get the value of any parameter for this scheduler.
> > 
> > I see.  So if *our* code doesn't know that there aren't any
> > parameters
> > to set, that's OK; but if *other people's code doesn't know that
> > there
> > aren't any parameters to get, it needs to be changed to know
> > that.  Got
> > it. ;-)
> > 
> Of course! I mean, there must be some advantages and benefits in being
> *us*! :-D
> 
> Actually, jokes apart, this is indeed asymmetric, but looks fair to me.
> As in, _any_ caller (either xl/libxl or external) will see
> libxl_domain_sched_params_set() succeeding, as well as _any_ caller
> (either xl/libxl or external) will see libxl_domain_sched_params_get()
> failing. :-)
> 
> > There is a sort of mathematical logic to the idea that setting a null
> > set of parameters should always succeed; and it's certainly
> > convenient
> > for tools to be able to always just call
> > libxl_domain_sched_params_set()
> > without having to check what scheduler is there.  But the same logic
> > I
> > think applies to get(), so I would say to return 0 for both.
> > 
> I understand your point, and I'm happy to do that.
> 
> > But Wei and Ian have the final say.
> > 
> Wei acked this patch in v1 (20170321170902.ndk6h5ylyfkk4coo@citrix.com)
> but that was before you raised this, so I'm happy to resend with this
> changed, and doing whatever he prefers with his ack.
> 
> Wei?

I don't feel strongly about this. But if you want to return success in
the get function, please make sure you initialise output to a known
state, instead of returning random garbage.

Wei.

> 
> Regards,
> Dario
> -- 
> <<This happens because I choose it to happen!>> (Raistlin Majere)
> -----------------------------------------------------------------
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/3] tools: sched: add support for 'null' scheduler
  2017-04-07  9:42           ` Wei Liu
@ 2017-04-07 10:05             ` Dario Faggioli
  2017-04-07 10:13               ` Wei Liu
  0 siblings, 1 reply; 22+ messages in thread
From: Dario Faggioli @ 2017-04-07 10:05 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, xen-devel, Julien Grall, Ian Jackson, George Dunlap


[-- Attachment #1.1.1: Type: text/plain, Size: 5781 bytes --]

On Fri, 2017-04-07 at 10:42 +0100, Wei Liu wrote:
> On Thu, Apr 06, 2017 at 05:18:33PM +0200, Dario Faggioli wrote:
> > 
> > Wei acked this patch in v1 (20170321170902.ndk6h5ylyfkk4coo@citrix.
> > com)
> > but that was before you raised this, so I'm happy to resend with
> > this
> > changed, and doing whatever he prefers with his ack.
> > 
> > Wei?
> 
> I don't feel strongly about this. But if you want to return success
> in
> the get function, please make sure you initialise output to a known
> state, instead of returning random garbage.
> 
Sure. That should not be a problem, as libxl_domain_sched_params_get()
calls, as the first thing it does, libxl_domain_sched_params_init(),
which sets all scinfo fields to their default.

Whatever I return, I won't touch scinfo at all, which means I'm just
returning those defaults.

So, can the below patch --which would make George much happier :-D
--also have your Ack?

Thanks and Regards,
Dario
---
tools: sched: add support for 'null' scheduler

From: Dario Faggioli <dario.faggioli@citrix.com>

It being very very basic, also means this scheduler does
not need much support at the tools level (for now).

Basically, just the definition of the symbol of the
scheduler itself and a couple of stubs.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
---
 tools/libxl/libxl.h         |    6 ++++++
 tools/libxl/libxl_sched.c   |   24 ++++++++++++++++++++++++
 tools/libxl/libxl_types.idl |    1 +
 3 files changed, 31 insertions(+)

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index a402236..cf8687a 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -210,6 +210,12 @@
 #define LIBXL_HAVE_SCHED_RTDS 1
 
 /*
+ * LIBXL_HAVE_SCHED_NULL indicates that the 'null' static scheduler
+ * is available.
+ */
+#define LIBXL_HAVE_SCHED_NULL 1
+
+/*
  * libxl_domain_build_info has u.hvm.viridian_enable and _disable bitmaps
  * of the specified width.
  */
diff --git a/tools/libxl/libxl_sched.c b/tools/libxl/libxl_sched.c
index 84d3837..d763b9a 100644
--- a/tools/libxl/libxl_sched.c
+++ b/tools/libxl/libxl_sched.c
@@ -178,6 +178,20 @@ static int sched_arinc653_domain_set(libxl__gc *gc, uint32_t domid,
     return 0;
 }
 
+static int sched_null_domain_set(libxl__gc *gc, uint32_t domid,
+                                 const libxl_domain_sched_params *scinfo)
+{
+    /* There aren't any domain-specific parameters to be set. */
+    return 0;
+}
+
+static int sched_null_domain_get(libxl__gc *gc, uint32_t domid,
+                               libxl_domain_sched_params *scinfo)
+{
+    /* There aren't any domain-specific parameters to return. */
+    return 0;
+}
+
 static int sched_credit_domain_get(libxl__gc *gc, uint32_t domid,
                                    libxl_domain_sched_params *scinfo)
 {
@@ -730,6 +744,9 @@ int libxl_domain_sched_params_set(libxl_ctx *ctx, uint32_t domid,
     case LIBXL_SCHEDULER_RTDS:
         ret=sched_rtds_domain_set(gc, domid, scinfo);
         break;
+    case LIBXL_SCHEDULER_NULL:
+        ret=sched_null_domain_set(gc, domid, scinfo);
+        break;
     default:
         LOGD(ERROR, domid, "Unknown scheduler");
         ret=ERROR_INVAL;
@@ -758,6 +775,7 @@ int libxl_vcpu_sched_params_set(libxl_ctx *ctx, uint32_t domid,
     case LIBXL_SCHEDULER_CREDIT:
     case LIBXL_SCHEDULER_CREDIT2:
     case LIBXL_SCHEDULER_ARINC653:
+    case LIBXL_SCHEDULER_NULL:
         LOGD(ERROR, domid, "per-VCPU parameter setting not supported for this scheduler");
         rc = ERROR_INVAL;
         break;
@@ -792,6 +810,7 @@ int libxl_vcpu_sched_params_set_all(libxl_ctx *ctx, uint32_t domid,
     case LIBXL_SCHEDULER_CREDIT:
     case LIBXL_SCHEDULER_CREDIT2:
     case LIBXL_SCHEDULER_ARINC653:
+    case LIBXL_SCHEDULER_NULL:
         LOGD(ERROR, domid, "per-VCPU parameter setting not supported for this scheduler");
         rc = ERROR_INVAL;
         break;
@@ -832,6 +851,9 @@ int libxl_domain_sched_params_get(libxl_ctx *ctx, uint32_t domid,
     case LIBXL_SCHEDULER_RTDS:
         ret=sched_rtds_domain_get(gc, domid, scinfo);
         break;
+    case LIBXL_SCHEDULER_NULL:
+        ret=sched_null_domain_get(gc, domid, scinfo);
+        break;
     default:
         LOGD(ERROR, domid, "Unknown scheduler");
         ret=ERROR_INVAL;
@@ -858,6 +880,7 @@ int libxl_vcpu_sched_params_get(libxl_ctx *ctx, uint32_t domid,
     case LIBXL_SCHEDULER_CREDIT:
     case LIBXL_SCHEDULER_CREDIT2:
     case LIBXL_SCHEDULER_ARINC653:
+    case LIBXL_SCHEDULER_NULL:
         LOGD(ERROR, domid, "per-VCPU parameter getting not supported for this scheduler");
         rc = ERROR_INVAL;
         break;
@@ -890,6 +913,7 @@ int libxl_vcpu_sched_params_get_all(libxl_ctx *ctx, uint32_t domid,
     case LIBXL_SCHEDULER_CREDIT:
     case LIBXL_SCHEDULER_CREDIT2:
     case LIBXL_SCHEDULER_ARINC653:
+    case LIBXL_SCHEDULER_NULL:
         LOGD(ERROR, domid, "per-VCPU parameter getting not supported for this scheduler");
         rc = ERROR_INVAL;
         break;
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index d970284..d42f6a1 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -191,6 +191,7 @@ libxl_scheduler = Enumeration("scheduler", [
     (6, "credit2"),
     (7, "arinc653"),
     (8, "rtds"),
+    (9, "null"),
     ])
 
 # Consistent with SHUTDOWN_* in sched.h (apart from UNKNOWN)
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

[-- Attachment #1.1.2: tools-sched-add-support-for.patch --]
[-- Type: text/x-patch, Size: 4643 bytes --]

tools: sched: add support for 'null' scheduler

From: Dario Faggioli <dario.faggioli@citrix.com>

It being very very basic, also means this scheduler does
not need much support at the tools level (for now).

Basically, just the definition of the symbol of the
scheduler itself and a couple of stubs.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Stefano Stabellini <stefano@aporeto.com>
---
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Julien Grall <julien.grall@arm.com>
---
 tools/libxl/libxl.h         |    6 ++++++
 tools/libxl/libxl_sched.c   |   24 ++++++++++++++++++++++++
 tools/libxl/libxl_types.idl |    1 +
 3 files changed, 31 insertions(+)

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index a402236..cf8687a 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -210,6 +210,12 @@
 #define LIBXL_HAVE_SCHED_RTDS 1
 
 /*
+ * LIBXL_HAVE_SCHED_NULL indicates that the 'null' static scheduler
+ * is available.
+ */
+#define LIBXL_HAVE_SCHED_NULL 1
+
+/*
  * libxl_domain_build_info has u.hvm.viridian_enable and _disable bitmaps
  * of the specified width.
  */
diff --git a/tools/libxl/libxl_sched.c b/tools/libxl/libxl_sched.c
index 84d3837..d763b9a 100644
--- a/tools/libxl/libxl_sched.c
+++ b/tools/libxl/libxl_sched.c
@@ -178,6 +178,20 @@ static int sched_arinc653_domain_set(libxl__gc *gc, uint32_t domid,
     return 0;
 }
 
+static int sched_null_domain_set(libxl__gc *gc, uint32_t domid,
+                                 const libxl_domain_sched_params *scinfo)
+{
+    /* There aren't any domain-specific parameters to be set. */
+    return 0;
+}
+
+static int sched_null_domain_get(libxl__gc *gc, uint32_t domid,
+                               libxl_domain_sched_params *scinfo)
+{
+    /* There aren't any domain-specific parameters to return. */
+    return 0;
+}
+
 static int sched_credit_domain_get(libxl__gc *gc, uint32_t domid,
                                    libxl_domain_sched_params *scinfo)
 {
@@ -730,6 +744,9 @@ int libxl_domain_sched_params_set(libxl_ctx *ctx, uint32_t domid,
     case LIBXL_SCHEDULER_RTDS:
         ret=sched_rtds_domain_set(gc, domid, scinfo);
         break;
+    case LIBXL_SCHEDULER_NULL:
+        ret=sched_null_domain_set(gc, domid, scinfo);
+        break;
     default:
         LOGD(ERROR, domid, "Unknown scheduler");
         ret=ERROR_INVAL;
@@ -758,6 +775,7 @@ int libxl_vcpu_sched_params_set(libxl_ctx *ctx, uint32_t domid,
     case LIBXL_SCHEDULER_CREDIT:
     case LIBXL_SCHEDULER_CREDIT2:
     case LIBXL_SCHEDULER_ARINC653:
+    case LIBXL_SCHEDULER_NULL:
         LOGD(ERROR, domid, "per-VCPU parameter setting not supported for this scheduler");
         rc = ERROR_INVAL;
         break;
@@ -792,6 +810,7 @@ int libxl_vcpu_sched_params_set_all(libxl_ctx *ctx, uint32_t domid,
     case LIBXL_SCHEDULER_CREDIT:
     case LIBXL_SCHEDULER_CREDIT2:
     case LIBXL_SCHEDULER_ARINC653:
+    case LIBXL_SCHEDULER_NULL:
         LOGD(ERROR, domid, "per-VCPU parameter setting not supported for this scheduler");
         rc = ERROR_INVAL;
         break;
@@ -832,6 +851,9 @@ int libxl_domain_sched_params_get(libxl_ctx *ctx, uint32_t domid,
     case LIBXL_SCHEDULER_RTDS:
         ret=sched_rtds_domain_get(gc, domid, scinfo);
         break;
+    case LIBXL_SCHEDULER_NULL:
+        ret=sched_null_domain_get(gc, domid, scinfo);
+        break;
     default:
         LOGD(ERROR, domid, "Unknown scheduler");
         ret=ERROR_INVAL;
@@ -858,6 +880,7 @@ int libxl_vcpu_sched_params_get(libxl_ctx *ctx, uint32_t domid,
     case LIBXL_SCHEDULER_CREDIT:
     case LIBXL_SCHEDULER_CREDIT2:
     case LIBXL_SCHEDULER_ARINC653:
+    case LIBXL_SCHEDULER_NULL:
         LOGD(ERROR, domid, "per-VCPU parameter getting not supported for this scheduler");
         rc = ERROR_INVAL;
         break;
@@ -890,6 +913,7 @@ int libxl_vcpu_sched_params_get_all(libxl_ctx *ctx, uint32_t domid,
     case LIBXL_SCHEDULER_CREDIT:
     case LIBXL_SCHEDULER_CREDIT2:
     case LIBXL_SCHEDULER_ARINC653:
+    case LIBXL_SCHEDULER_NULL:
         LOGD(ERROR, domid, "per-VCPU parameter getting not supported for this scheduler");
         rc = ERROR_INVAL;
         break;
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index d970284..d42f6a1 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -191,6 +191,7 @@ libxl_scheduler = Enumeration("scheduler", [
     (6, "credit2"),
     (7, "arinc653"),
     (8, "rtds"),
+    (9, "null"),
     ])
 
 # Consistent with SHUTDOWN_* in sched.h (apart from UNKNOWN)

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

[-- Attachment #2: Type: text/plain, Size: 127 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/3] tools: sched: add support for 'null' scheduler
  2017-04-07 10:05             ` Dario Faggioli
@ 2017-04-07 10:13               ` Wei Liu
  0 siblings, 0 replies; 22+ messages in thread
From: Wei Liu @ 2017-04-07 10:13 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Wei Liu, Ian Jackson, George Dunlap, Stefano Stabellini,
	Julien Grall, xen-devel

On Fri, Apr 07, 2017 at 12:05:01PM +0200, Dario Faggioli wrote:
> On Fri, 2017-04-07 at 10:42 +0100, Wei Liu wrote:
> > On Thu, Apr 06, 2017 at 05:18:33PM +0200, Dario Faggioli wrote:
> > > 
> > > Wei acked this patch in v1 (20170321170902.ndk6h5ylyfkk4coo@citrix.
> > > com)
> > > but that was before you raised this, so I'm happy to resend with
> > > this
> > > changed, and doing whatever he prefers with his ack.
> > > 
> > > Wei?
> > 
> > I don't feel strongly about this. But if you want to return success
> > in
> > the get function, please make sure you initialise output to a known
> > state, instead of returning random garbage.
> > 
> Sure. That should not be a problem, as libxl_domain_sched_params_get()
> calls, as the first thing it does, libxl_domain_sched_params_init(),
> which sets all scinfo fields to their default.
> 
> Whatever I return, I won't touch scinfo at all, which means I'm just
> returning those defaults.
> 

That sounds fine.

> So, can the below patch --which would make George much happier :-D
> --also have your Ack?
> 

Yes.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2017-04-07 10:13 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-17 18:42 [PATCH 0/3] The 'null' Scheduler Dario Faggioli
2017-03-17 18:42 ` [PATCH 1/3] xen: sched: introduce the 'null' semi-static scheduler Dario Faggioli
2017-03-20 23:21   ` Stefano Stabellini
2017-03-21  8:26     ` Dario Faggioli
2017-03-27 10:31   ` George Dunlap
2017-03-27 10:48     ` George Dunlap
2017-04-06 14:43       ` Dario Faggioli
2017-04-06 15:07     ` Dario Faggioli
2017-03-17 18:43 ` [PATCH 2/3] xen: sched_null: support for hard affinity Dario Faggioli
2017-03-20 23:46   ` Stefano Stabellini
2017-03-21  8:47     ` Dario Faggioli
2017-03-17 18:43 ` [PATCH 3/3] tools: sched: add support for 'null' scheduler Dario Faggioli
2017-03-20 22:28   ` Stefano Stabellini
2017-03-21 17:09   ` Wei Liu
2017-03-27 10:50   ` George Dunlap
2017-04-06 10:49     ` Dario Faggioli
2017-04-06 13:59       ` George Dunlap
2017-04-06 15:18         ` Dario Faggioli
2017-04-07  9:42           ` Wei Liu
2017-04-07 10:05             ` Dario Faggioli
2017-04-07 10:13               ` Wei Liu
2017-03-20 22:23 ` [PATCH 0/3] The 'null' Scheduler Stefano Stabellini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.