All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH for 4.7 0/4] Assorted scheduling fixes
@ 2016-05-03 21:46 Dario Faggioli
  2016-05-03 21:46 ` [PATCH for 4.7 1/4] xen: sched: avoid spuriously re-enabling IRQs in csched2_switch_sched() Dario Faggioli
                   ` (6 more replies)
  0 siblings, 7 replies; 37+ messages in thread
From: Dario Faggioli @ 2016-05-03 21:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Tianyang Chen, George Dunlap, Julien Grall, Meng Xu,
	Varun.Swara, Steve Capper

Hi,

This small series contains some bugfixes for various schedulers. They're all
bugfixes, so I think all should be considered for 4.7. Here's some more
detailed analysis.

Patch 1 and 3 are for Credit2. Patch 1 is a lot more important, as we have an
ASSERT triggering without it. Patch 2 is behavioral fixing, which I believe it
is important, but at least does not make anything explode.

Patch 2 fixes another ASSERT, in case a pCPU fails to come up. This is what
Julien reported here:

 https://www.mail-archive.com/xen-devel@lists.xen.org/msg65918.html

Julien, the patch is very very similar to the one attached to one of my reply
in that thread, but I had to change some small bits... Can you please re-test
it?

Patch 4 makes the code of RTDS look consistent with what we state in patch 2,
so it's also important. Furthermore, it does fix a bug (although, again, not
one that would splat Xen) as, without it, we may have a timer used by the RTDS
scheduler bound to the pCPU of another cpupool with another scheduler. That
would introduce some unwanted and very difficult to recognize interference
between different schedulers in different pool, and should hence be avoided.

So this was awesomeness; about risks:
 - patch 1 is very small, super-self contained (zero impact outside of Credit2
   code) and it fixes an actual and 100% reproducible bug;
 - patch 2 is also totally self-contained and it can't possibly cause problems
   to anything else than to what it is trying to fix (Credit2's load balancer).
   It doesn't cure any ASSERT or Oops, so it's less interesting, but given the
   low risk --also considering that Credit2 will still be considered
   experimental in 4.7-- I think it can go in;
 - patch 3 is bigger, and a bit more complex. Note, however, that most of its
   content is code comments and ASSERT-s; it is self contained to scheduling
   (in the sense that it impacts all schedulers, but "just" them), and fixes
   a situation that, AFAIUI, is important for ARM;
 - patch 4 may again look not that critical. But, the fact that someone wanting
   to experiment with RTDS in a cpupool would face the kind of interference
   between independent cpupools that the patch cures is, I think, something
   worthwhile trying to avoid. Besides, it is again quite self contained, as
   it's indeed only relevant for RTDS (which is also going to be called
   experimental for 4.7).

Hope this helps. Thanks and Regards,
Dario
---
Dario Faggioli (4):
      xen: sched: avoid spuriously re-enabling IRQs in csched2_switch_sched()
      xen: sched: fix killing an uninitialized timer in free_pdata.
      xen: credit2: fix 2 (minor) issues in load tracking logic
      xen: adopt .deinit_pdata and improve timer handling

 xen/common/sched_credit.c  |   31 ++++++++++++++++--
 xen/common/sched_credit2.c |   26 +++++++++++----
 xen/common/sched_rt.c      |   74 +++++++++++++++++++++++++++++++++-----------
 xen/common/schedule.c      |   38 ++++++++++++++++++++++-
 xen/include/xen/sched-if.h |    1 +
 5 files changed, 138 insertions(+), 32 deletions(-)
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH for 4.7 1/4] xen: sched: avoid spuriously re-enabling IRQs in csched2_switch_sched()
  2016-05-03 21:46 [PATCH for 4.7 0/4] Assorted scheduling fixes Dario Faggioli
@ 2016-05-03 21:46 ` Dario Faggioli
  2016-05-04  8:48   ` Jan Beulich
  2016-05-04 15:11   ` George Dunlap
  2016-05-03 21:46 ` [PATCH for 4.7 2/4] xen: sched: fix killing an uninitialized timer in free_pdata Dario Faggioli
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 37+ messages in thread
From: Dario Faggioli @ 2016-05-03 21:46 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, George Dunlap

In fact, interrupts are already disabled when calling
the hook from schedule_cpu_switch(), and hence using
anything different than just spin_lock() is wrong (and
ASSERT()-s in debug builds) or unnecessary.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
---
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 xen/common/sched_credit2.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 46b9279..50c1b9d 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -2232,7 +2232,7 @@ csched2_switch_sched(struct scheduler *new_ops, unsigned int cpu,
      * And owning exactly that one (the lock of the old scheduler of this
      * cpu) is what is necessary to prevent races.
      */
-    spin_lock_irq(&prv->lock);
+    spin_lock(&prv->lock);
 
     idle_vcpu[cpu]->sched_priv = vdata;
 
@@ -2257,7 +2257,7 @@ csched2_switch_sched(struct scheduler *new_ops, unsigned int cpu,
     smp_mb();
     per_cpu(schedule_data, cpu).schedule_lock = &prv->rqd[rqi].lock;
 
-    spin_unlock_irq(&prv->lock);
+    spin_unlock(&prv->lock);
 }
 
 static void


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH for 4.7 2/4] xen: sched: fix killing an uninitialized timer in free_pdata.
  2016-05-03 21:46 [PATCH for 4.7 0/4] Assorted scheduling fixes Dario Faggioli
  2016-05-03 21:46 ` [PATCH for 4.7 1/4] xen: sched: avoid spuriously re-enabling IRQs in csched2_switch_sched() Dario Faggioli
@ 2016-05-03 21:46 ` Dario Faggioli
  2016-05-04 15:25   ` George Dunlap
  2016-05-03 21:46 ` [PATCH for 4.7 3/4] xen: credit2: fix 2 (minor) issues in load tracking logic Dario Faggioli
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 37+ messages in thread
From: Dario Faggioli @ 2016-05-03 21:46 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, Steve Capper, George Dunlap, Julien Grall, Varun.Swara

commit 64269d9365 "sched: implement .init_pdata in Credit,
Credit2 and RTDS" helped fixing Credit2 runqueues, and
the races we had in switching scheduler for pCPUs, but
introduced another issue. In fact, if CPU bringup fails
during __cpu_up() (and, more precisely, after CPU_UP_PREPARE,
but before CPU_STARTING) the CPU_UP_CANCELED notifier
would be executed, which calls the free_pdata hook.

Such hook does, right now, two things: (1) undo the
initialization done inside the init_pdata hook and (2)
free the memory allocated by the alloc_pdata hook.

However, in the failure path just described, it is possible
that only alloc_pdata were called, and this is potentially
an issue (depending on how actually free_pdata does).

In fact, for Credit1 (the only scheduler that actually
allocates per-pCPU data), this result in calling kill_timer()
on a timer that had not yet been initialized, which causes
the following:

(XEN) Xen call trace:
(XEN)    [<000000000022e304>] timer.c#active_timer+0x8/0x24 (PC)
(XEN)    [<000000000022f624>] kill_timer+0x108/0x2e0 (LR)
(XEN)    [<00000000002208c0>] sched_credit.c#csched_free_pdata+0xd8/0x114
(XEN)    [<0000000000227a18>] schedule.c#cpu_schedule_callback+0xc0/0x12c
(XEN)    [<0000000000219944>] notifier_call_chain+0x78/0x9c
(XEN)    [<00000000002015fc>] cpu_up+0x104/0x130
(XEN)    [<000000000028f7c0>] start_xen+0xaf8/0xce0
(XEN)    [<00000000810021d8>] 00000000810021d8
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Assertion 'timer->status >= TIMER_STATUS_inactive' failed at timer.c:279
(XEN) ****************************************

Solve this by making the scheduler hooks API symmetric again,
i.e., by adding a deinit_pdata hook and making it responsible
of undoing what init_pdata did, rather than asking to free_pdata
to do everything.

This is cleaner and, in the case at hand, makes it possible to
only call free_pdata (which is the right thing to do) as only
allocation and no initialization was performed.

Reported-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
---
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Varun.Swara@arm.com
Cc: Steve Capper <Steve.Capper@arm.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 xen/common/sched_credit.c  |   31 +++++++++++++++++++++++++++----
 xen/common/sched_credit2.c |   16 +++++++++++++---
 xen/common/schedule.c      |   38 +++++++++++++++++++++++++++++++++++++-
 xen/include/xen/sched-if.h |    1 +
 4 files changed, 78 insertions(+), 8 deletions(-)

diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index bc36837..db4d42a 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -485,11 +485,35 @@ static void
 csched_free_pdata(const struct scheduler *ops, void *pcpu, int cpu)
 {
     struct csched_private *prv = CSCHED_PRIV(ops);
+
+    /*
+     * pcpu either points to a valid struct csched_pcpu, or is NULL, if we're
+     * beeing called from CPU_UP_CANCELLED, because bringing up a pCPU failed
+     * very early. xfree() does not really mind, but we want to be sure that,
+     * when we get here, either init_pdata has never been called, or
+     * deinit_pdata has been called already.
+     */
+    ASSERT(!cpumask_test_cpu(cpu, prv->cpus));
+
+    xfree(pcpu);
+}
+
+static void
+csched_deinit_pdata(const struct scheduler *ops, void *pcpu, int cpu)
+{
+    struct csched_private *prv = CSCHED_PRIV(ops);
     struct csched_pcpu *spc = pcpu;
     unsigned long flags;
 
-    if ( spc == NULL )
-        return;
+    /*
+     * Scheduler specific data for this pCPU must still be there and and be
+     * valid. In fact, if we are here:
+     *  1. alloc_pdata must have been called for this cpu, and free_pdata
+     *     must not have been called on it before us,
+     *  2. init_pdata must have been called on this cpu, and deinit_pdata
+     *     (us!) must not have been called on it already.
+     */
+    ASSERT(spc && cpumask_test_cpu(cpu, prv->cpus));
 
     spin_lock_irqsave(&prv->lock, flags);
 
@@ -507,8 +531,6 @@ csched_free_pdata(const struct scheduler *ops, void *pcpu, int cpu)
         kill_timer(&prv->master_ticker);
 
     spin_unlock_irqrestore(&prv->lock, flags);
-
-    xfree(spc);
 }
 
 static void *
@@ -2091,6 +2113,7 @@ static const struct scheduler sched_credit_def = {
     .free_vdata     = csched_free_vdata,
     .alloc_pdata    = csched_alloc_pdata,
     .init_pdata     = csched_init_pdata,
+    .deinit_pdata   = csched_deinit_pdata,
     .free_pdata     = csched_free_pdata,
     .switch_sched   = csched_switch_sched,
     .alloc_domdata  = csched_alloc_domdata,
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 50c1b9d..803cc44 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -2201,6 +2201,12 @@ csched2_init_pdata(const struct scheduler *ops, void *pdata, int cpu)
     unsigned long flags;
     unsigned rqi;
 
+    /*
+     * pdata contains what alloc_pdata returned. But since we don't (need to)
+     * implement alloc_pdata, either that's NULL, or something is very wrong!
+     */
+    ASSERT(!pdata);
+
     spin_lock_irqsave(&prv->lock, flags);
     old_lock = pcpu_schedule_lock(cpu);
 
@@ -2261,7 +2267,7 @@ csched2_switch_sched(struct scheduler *new_ops, unsigned int cpu,
 }
 
 static void
-csched2_free_pdata(const struct scheduler *ops, void *pcpu, int cpu)
+csched2_deinit_pdata(const struct scheduler *ops, void *pcpu, int cpu)
 {
     unsigned long flags;
     struct csched2_private *prv = CSCHED2_PRIV(ops);
@@ -2270,7 +2276,11 @@ csched2_free_pdata(const struct scheduler *ops, void *pcpu, int cpu)
 
     spin_lock_irqsave(&prv->lock, flags);
 
-    ASSERT(cpumask_test_cpu(cpu, &prv->initialized));
+    /*
+     * alloc_pdata is not implemented, so pcpu must be NULL. On the other
+     * hand, init_pdata must have been called for this pCPU.
+     */
+    ASSERT(!pcpu && cpumask_test_cpu(cpu, &prv->initialized));
     
     /* Find the old runqueue and remove this cpu from it */
     rqi = prv->runq_map[cpu];
@@ -2387,7 +2397,7 @@ static const struct scheduler sched_credit2_def = {
     .alloc_vdata    = csched2_alloc_vdata,
     .free_vdata     = csched2_free_vdata,
     .init_pdata     = csched2_init_pdata,
-    .free_pdata     = csched2_free_pdata,
+    .deinit_pdata   = csched2_deinit_pdata,
     .switch_sched   = csched2_switch_sched,
     .alloc_domdata  = csched2_alloc_domdata,
     .free_domdata   = csched2_free_domdata,
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 5546999..80fea39 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -1546,6 +1546,38 @@ static int cpu_schedule_callback(
     struct schedule_data *sd = &per_cpu(schedule_data, cpu);
     int rc = 0;
 
+    /*
+     * From the scheduler perspective, bringing up a pCPU requires
+     * allocating and initializing the per-pCPU scheduler specific data,
+     * as well as "registering" this pCPU to the scheduler (which may
+     * involve modifying some scheduler wide data structures).
+     * This happens by calling the alloc_pdata and init_pdata hooks, in
+     * this order. A scheduler that does not need to allocate any per-pCPU
+     * data can avoid implementing alloc_pdata. init_pdata may, however, be
+     * necessary/useful in this case too (e.g., it can contain the "register
+     * the pCPU to the scheduler" part). alloc_pdata (if present) is called
+     * during CPU_UP_PREPARE. init_pdata (if present) is called during
+     * CPU_STARTING.
+     *
+     * On the other hand, at teardown, we need to reverse what has been done
+     * during initialization, and then free the per-pCPU specific data. This
+     * happens by calling the deinit_pdata and free_pdata hooks, in this
+     * order. If no per-pCPU memory was allocated, there is no need to
+     * provide an implementation of free_pdata. deinit_pdata may, however,
+     * be necessary/useful in this case too (e.g., it can undo something done
+     * on scheduler wide data structure during init_pdata). Both deinit_pdata
+     * and free_pdata are called during CPU_DEAD.
+     *
+     * If someting goes wrong during bringup, we go to CPU_UP_CANCELLED
+     * *before* having called init_pdata. In this case, as there is no
+     * initialization needing undoing, only free_pdata should be called.
+     * This means it is possible to call free_pdata just after alloc_pdata,
+     * without a init_pdata/deinit_pdata "cycle" in between the two.
+     *
+     * So, in summary, the usage pattern should look either
+     *  - alloc_pdata-->init_pdata-->deinit_pdata-->free_pdata, or
+     *  - alloc_pdata-->free_pdata.
+     */
     switch ( action )
     {
     case CPU_STARTING:
@@ -1554,8 +1586,10 @@ static int cpu_schedule_callback(
     case CPU_UP_PREPARE:
         rc = cpu_schedule_up(cpu);
         break;
-    case CPU_UP_CANCELED:
     case CPU_DEAD:
+        SCHED_OP(sched, deinit_pdata, sd->sched_priv, cpu);
+        /* Fallthrough */
+    case CPU_UP_CANCELED:
         cpu_schedule_down(cpu);
         break;
     default:
@@ -1713,6 +1747,8 @@ int schedule_cpu_switch(unsigned int cpu, struct cpupool *c)
 
     SCHED_OP(new_ops, tick_resume, cpu);
 
+    SCHED_OP(old_ops, deinit_pdata, ppriv_old, cpu);
+
     SCHED_OP(old_ops, free_vdata, vpriv_old);
     SCHED_OP(old_ops, free_pdata, ppriv_old, cpu);
 
diff --git a/xen/include/xen/sched-if.h b/xen/include/xen/sched-if.h
index 1db7c8d..bc0e794 100644
--- a/xen/include/xen/sched-if.h
+++ b/xen/include/xen/sched-if.h
@@ -138,6 +138,7 @@ struct scheduler {
     void         (*free_pdata)     (const struct scheduler *, void *, int);
     void *       (*alloc_pdata)    (const struct scheduler *, int);
     void         (*init_pdata)     (const struct scheduler *, void *, int);
+    void         (*deinit_pdata)   (const struct scheduler *, void *, int);
     void         (*free_domdata)   (const struct scheduler *, void *);
     void *       (*alloc_domdata)  (const struct scheduler *, struct domain *);
 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH for 4.7 3/4] xen: credit2: fix 2 (minor) issues in load tracking logic
  2016-05-03 21:46 [PATCH for 4.7 0/4] Assorted scheduling fixes Dario Faggioli
  2016-05-03 21:46 ` [PATCH for 4.7 1/4] xen: sched: avoid spuriously re-enabling IRQs in csched2_switch_sched() Dario Faggioli
  2016-05-03 21:46 ` [PATCH for 4.7 2/4] xen: sched: fix killing an uninitialized timer in free_pdata Dario Faggioli
@ 2016-05-03 21:46 ` Dario Faggioli
  2016-05-04 15:38   ` George Dunlap
  2016-05-03 21:46 ` [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling Dario Faggioli
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 37+ messages in thread
From: Dario Faggioli @ 2016-05-03 21:46 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, George Dunlap

All calculations that involve load_last_update uses quantities
shifted by LOADAVG_GRANULARITY_SHIFT, so make sure that this
is true even when the field is assigned a value for the first
time, during vcpu allocation.

Also, during migration, while the loads of both the source and
destination runqueues certainly need changing, the vcpu being
moved does not change its running/non-running status, and its
calculated load should hence not be affected.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
---
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 xen/common/sched_credit2.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 803cc44..393364a 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -867,7 +867,7 @@ csched2_alloc_vdata(const struct scheduler *ops, struct vcpu *vc, void *dd)
         svc->weight = svc->sdom->weight;
         /* Starting load of 50% */
         svc->avgload = 1ULL << (CSCHED2_PRIV(ops)->load_window_shift - 1);
-        svc->load_last_update = NOW();
+        svc->load_last_update = NOW() >> LOADAVG_GRANULARITY_SHIFT;
     }
     else
     {
@@ -1301,7 +1301,7 @@ static void migrate(const struct scheduler *ops,
         if ( __vcpu_on_runq(svc) )
         {
             __runq_remove(svc);
-            update_load(ops, svc->rqd, svc, -1, now);
+            update_load(ops, svc->rqd, NULL, -1, now);
             on_runq=1;
         }
         __runq_deassign(svc);
@@ -1314,7 +1314,7 @@ static void migrate(const struct scheduler *ops,
         __runq_assign(svc, trqd);
         if ( on_runq )
         {
-            update_load(ops, svc->rqd, svc, 1, now);
+            update_load(ops, svc->rqd, NULL, 1, now);
             runq_insert(ops, svc->vcpu->processor, svc);
             runq_tickle(ops, svc->vcpu->processor, svc, now);
             SCHED_STAT_CRANK(migrate_on_runq);


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling
  2016-05-03 21:46 [PATCH for 4.7 0/4] Assorted scheduling fixes Dario Faggioli
                   ` (2 preceding siblings ...)
  2016-05-03 21:46 ` [PATCH for 4.7 3/4] xen: credit2: fix 2 (minor) issues in load tracking logic Dario Faggioli
@ 2016-05-03 21:46 ` Dario Faggioli
  2016-05-04 15:51   ` George Dunlap
  2016-05-07 21:19   ` Meng Xu
  2016-05-04  1:26 ` [PATCH for 4.7 0/4] Assorted scheduling fixes Konrad Rzeszutek Wilk
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 37+ messages in thread
From: Dario Faggioli @ 2016-05-03 21:46 UTC (permalink / raw)
  To: xen-devel; +Cc: Tianyang Chen, Wei Liu, Meng Xu, George Dunlap

The scheduling hooks API is now used properly, and no
initialization or de-initialization happen in
alloc/free_pdata any longer.

In fact, just like it is for Credit2, there is no real
need for implementing alloc_pdata and free_pdata.

This also made it possible to improve the replenishment
timer handling logic, such that now the timer is always
kept on one of the pCPU of the scheduler it's servicing.
Before this commit, in fact, even if the pCPU where the
timer happened to be initialized at creation time was
moved to another cpupool, the timer stayed there,
potentially inferfearing with the new scheduler of the
pCPU itself.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
--
Cc: Meng Xu <mengxu@cis.upenn.edu>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Tianyang Chen <tiche@seas.upenn.edu>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 xen/common/sched_rt.c |   74 ++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 55 insertions(+), 19 deletions(-)

diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
index 673fc92..7f8f411 100644
--- a/xen/common/sched_rt.c
+++ b/xen/common/sched_rt.c
@@ -590,6 +590,10 @@ rt_init(struct scheduler *ops)
     if ( prv == NULL )
         return -ENOMEM;
 
+    prv->repl_timer = xzalloc(struct timer);
+    if ( prv->repl_timer == NULL )
+        return -ENOMEM;
+
     spin_lock_init(&prv->lock);
     INIT_LIST_HEAD(&prv->sdom);
     INIT_LIST_HEAD(&prv->runq);
@@ -600,12 +604,6 @@ rt_init(struct scheduler *ops)
 
     ops->sched_data = prv;
 
-    /*
-     * The timer initialization will happen later when
-     * the first pcpu is added to this pool in alloc_pdata.
-     */
-    prv->repl_timer = NULL;
-
     return 0;
 }
 
@@ -614,7 +612,8 @@ rt_deinit(struct scheduler *ops)
 {
     struct rt_private *prv = rt_priv(ops);
 
-    kill_timer(prv->repl_timer);
+    ASSERT(prv->repl_timer->status == TIMER_STATUS_invalid ||
+           prv->repl_timer->status == TIMER_STATUS_killed);
     xfree(prv->repl_timer);
 
     ops->sched_data = NULL;
@@ -632,9 +631,19 @@ rt_init_pdata(const struct scheduler *ops, void *pdata, int cpu)
     spinlock_t *old_lock;
     unsigned long flags;
 
-    /* Move the scheduler lock to our global runqueue lock.  */
     old_lock = pcpu_schedule_lock_irqsave(cpu, &flags);
 
+    /*
+     * TIMER_STATUS_invalid means we are the first cpu that sees the timer
+     * allocated but not initialized, and so it's up to us to initialize it.
+     */
+    if ( prv->repl_timer->status == TIMER_STATUS_invalid )
+    {
+        init_timer(prv->repl_timer, repl_timer_handler, (void*) ops, cpu);
+        dprintk(XENLOG_DEBUG, "RTDS: timer initialized on cpu %u\n", cpu);
+    }
+
+    /* Move the scheduler lock to our global runqueue lock.  */
     per_cpu(schedule_data, cpu).schedule_lock = &prv->lock;
 
     /* _Not_ pcpu_schedule_unlock(): per_cpu().schedule_lock changed! */
@@ -659,6 +668,20 @@ rt_switch_sched(struct scheduler *new_ops, unsigned int cpu,
      */
     ASSERT(per_cpu(schedule_data, cpu).schedule_lock != &prv->lock);
 
+    /*
+     * If we are the absolute first cpu being switched toward this
+     * scheduler (in which case we'll see TIMER_STATUS_invalid), or the
+     * first one that is added back to the cpupool that had all its cpus
+     * removed (in which case we'll see TIMER_STATUS_killed), it's our
+     * job to (re)initialize the timer.
+     */
+    if ( prv->repl_timer->status == TIMER_STATUS_invalid ||
+         prv->repl_timer->status == TIMER_STATUS_killed )
+    {
+        init_timer(prv->repl_timer, repl_timer_handler, (void*) new_ops, cpu);
+        dprintk(XENLOG_DEBUG, "RTDS: timer initialized on cpu %u\n", cpu);
+    }
+
     idle_vcpu[cpu]->sched_priv = vdata;
     per_cpu(scheduler, cpu) = new_ops;
     per_cpu(schedule_data, cpu).sched_priv = NULL; /* no pdata */
@@ -672,23 +695,36 @@ rt_switch_sched(struct scheduler *new_ops, unsigned int cpu,
     per_cpu(schedule_data, cpu).schedule_lock = &prv->lock;
 }
 
-static void *
-rt_alloc_pdata(const struct scheduler *ops, int cpu)
+static void
+rt_deinit_pdata(const struct scheduler *ops, void *pcpu, int cpu)
 {
+    unsigned long flags;
     struct rt_private *prv = rt_priv(ops);
 
-    if ( prv->repl_timer == NULL )
-    {
-        /* Allocate the timer on the first cpu of this pool. */
-        prv->repl_timer = xzalloc(struct timer);
+    spin_lock_irqsave(&prv->lock, flags);
 
-        if ( prv->repl_timer == NULL )
-            return ERR_PTR(-ENOMEM);
+    if ( prv->repl_timer->cpu == cpu )
+    {
+        struct cpupool *c = per_cpu(cpupool, cpu);
+        unsigned int new_cpu = cpumask_cycle(cpu, cpupool_online_cpumask(c));
 
-        init_timer(prv->repl_timer, repl_timer_handler, (void *)ops, cpu);
+        /*
+         * Make sure the timer run on one of the cpus that are still available
+         * to this scheduler. If there aren't any left, it means it's the time
+         * to just kill it.
+         */
+        if ( new_cpu >= nr_cpu_ids )
+        {
+            kill_timer(prv->repl_timer);
+            dprintk(XENLOG_DEBUG, "RTDS: timer killed on cpu %d\n", cpu);
+        }
+        else
+        {
+            migrate_timer(prv->repl_timer, new_cpu);
+        }
     }
 
-    return NULL;
+    spin_unlock_irqrestore(&prv->lock, flags);
 }
 
 static void *
@@ -1433,9 +1469,9 @@ static const struct scheduler sched_rtds_def = {
     .dump_settings  = rt_dump,
     .init           = rt_init,
     .deinit         = rt_deinit,
-    .alloc_pdata    = rt_alloc_pdata,
     .init_pdata     = rt_init_pdata,
     .switch_sched   = rt_switch_sched,
+    .deinit_pdata   = rt_deinit_pdata,
     .alloc_domdata  = rt_alloc_domdata,
     .free_domdata   = rt_free_domdata,
     .init_domain    = rt_dom_init,


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 0/4] Assorted scheduling fixes
  2016-05-03 21:46 [PATCH for 4.7 0/4] Assorted scheduling fixes Dario Faggioli
                   ` (3 preceding siblings ...)
  2016-05-03 21:46 ` [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling Dario Faggioli
@ 2016-05-04  1:26 ` Konrad Rzeszutek Wilk
  2016-05-04  9:06   ` Dario Faggioli
  2016-05-04 15:53 ` George Dunlap
  2016-05-04 16:04 ` Wei Liu
  6 siblings, 1 reply; 37+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-05-04  1:26 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Wei Liu, Steve Capper, Tianyang Chen, George Dunlap,
	Julien Grall, Meng Xu, Varun.Swara, xen-devel

On Tue, May 03, 2016 at 11:46:27PM +0200, Dario Faggioli wrote:
> Hi,
> 
> This small series contains some bugfixes for various schedulers. They're all
> bugfixes, so I think all should be considered for 4.7. Here's some more
> detailed analysis.
> 
> Patch 1 and 3 are for Credit2. Patch 1 is a lot more important, as we have an
> ASSERT triggering without it. Patch 2 is behavioral fixing, which I believe it
> is important, but at least does not make anything explode.
> 
> Patch 2 fixes another ASSERT, in case a pCPU fails to come up. This is what
> Julien reported here:
> 
>  https://www.mail-archive.com/xen-devel@lists.xen.org/msg65918.html
> 
> Julien, the patch is very very similar to the one attached to one of my reply
> in that thread, but I had to change some small bits... Can you please re-test
> it?

I tested it. It does not crash anymore. However I see this:

(XEN) CPU1 never came online
(XEN) Failed to bring up CPU 1 (error -5)
(XEN) Bringing up CPU2
(XEN) CPU2 never came online
(XEN) Failed to bring up CPU 2 (error -5)
(XEN) Bringing up CPU3
(XEN) CPU3 never came online
(XEN) Failed to bring up CPU 3 (error -5)
(XEN) Brought up 1 CPUs


Which is not very reassuring. I don't know if that is expected.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 1/4] xen: sched: avoid spuriously re-enabling IRQs in csched2_switch_sched()
  2016-05-03 21:46 ` [PATCH for 4.7 1/4] xen: sched: avoid spuriously re-enabling IRQs in csched2_switch_sched() Dario Faggioli
@ 2016-05-04  8:48   ` Jan Beulich
  2016-05-04  9:08     ` Dario Faggioli
  2016-05-04 15:11   ` George Dunlap
  1 sibling, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2016-05-04  8:48 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Wei Liu, George Dunlap

>>> On 03.05.16 at 23:46, <dario.faggioli@citrix.com> wrote:
> In fact, interrupts are already disabled when calling
> the hook from schedule_cpu_switch(), and hence using
> anything different than just spin_lock() is wrong (and
> ASSERT()-s in debug builds) or unnecessary.

Well, this is a little too broad a statement: spin_lock_irqsave()
would be quite fine in this situation.

> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

Actual code change
Reviewed-by: Jan Beulich <jbeulich@suse.com>


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 0/4] Assorted scheduling fixes
  2016-05-04  1:26 ` [PATCH for 4.7 0/4] Assorted scheduling fixes Konrad Rzeszutek Wilk
@ 2016-05-04  9:06   ` Dario Faggioli
  2016-05-05 12:00     ` Julien Grall
  0 siblings, 1 reply; 37+ messages in thread
From: Dario Faggioli @ 2016-05-04  9:06 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Wei Liu, Steve Capper, Tianyang Chen, George Dunlap,
	Julien Grall, Meng Xu, Varun.Swara, xen-devel

On Tue, 2016-05-03 at 21:26 -0400, Konrad Rzeszutek Wilk wrote:
> On Tue, May 03, 2016 at 11:46:27PM +0200, Dario Faggioli wrote:
> > 
> > Julien, the patch is very very similar to the one attached to one
> > of my reply
> > in that thread, but I had to change some small bits... Can you
> > please re-test
> > it?
> I tested it. It does not crash anymore. However I see this:
> 
On what hardware/emulator? The same Julien was using?

> (XEN) CPU1 never came online
> (XEN) Failed to bring up CPU 1 (error -5)
> (XEN) Bringing up CPU2
> (XEN) CPU2 never came online
> (XEN) Failed to bring up CPU 2 (error -5)
> (XEN) Bringing up CPU3
> (XEN) CPU3 never came online
> (XEN) Failed to bring up CPU 3 (error -5)
> (XEN) Brought up 1 CPUs
> 
> 
> Which is not very reassuring. I don't know if that is expected.
>
I don't know... I've never said this on any x86 box before, and I
wouldn't think it to be related to this series... do you see this only
with the series applied?

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 1/4] xen: sched: avoid spuriously re-enabling IRQs in csched2_switch_sched()
  2016-05-04  8:48   ` Jan Beulich
@ 2016-05-04  9:08     ` Dario Faggioli
  0 siblings, 0 replies; 37+ messages in thread
From: Dario Faggioli @ 2016-05-04  9:08 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, George Dunlap

On Wed, 2016-05-04 at 02:48 -0600, Jan Beulich wrote:
> > > > On 03.05.16 at 23:46, <dario.faggioli@citrix.com> wrote:
> > In fact, interrupts are already disabled when calling
> > the hook from schedule_cpu_switch(), and hence using
> > anything different than just spin_lock() is wrong (and
> > ASSERT()-s in debug builds) or unnecessary.
> Well, this is a little too broad a statement: spin_lock_irqsave()
> would be quite fine in this situation.
> 
That would be covered by the "or unnecessary" part.

As in "spin_lock_irq() is wrong and spin_lock_irqsave() is
unnecessary".
> > 
> > Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> Actual code change
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> 
Thanks and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 1/4] xen: sched: avoid spuriously re-enabling IRQs in csched2_switch_sched()
  2016-05-03 21:46 ` [PATCH for 4.7 1/4] xen: sched: avoid spuriously re-enabling IRQs in csched2_switch_sched() Dario Faggioli
  2016-05-04  8:48   ` Jan Beulich
@ 2016-05-04 15:11   ` George Dunlap
  2016-05-04 15:58     ` Dario Faggioli
  1 sibling, 1 reply; 37+ messages in thread
From: George Dunlap @ 2016-05-04 15:11 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel; +Cc: Wei Liu

On 03/05/16 22:46, Dario Faggioli wrote:
> In fact, interrupts are already disabled when calling
> the hook from schedule_cpu_switch(), and hence using
> anything different than just spin_lock() is wrong (and
> ASSERT()-s in debug builds) or unnecessary.
> 
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

Good catch.  But would it be better to either 1) add an assert that irqs
are disabled, or 2) use spin_lock_irqsave(), just to make sure the two
bits of code won't silently go out of sync?

Re Jan's comment -- you meant to say that spin_lock_irq() is wrong and
spin_lock_irqsave() is unnecessary?  That's probably a bit more
inference than we want changeset messages to have, ideally. :-)

I wouldn't require a re-spin just for that, but if you are going to
respin it, I would suggest just dropping the "and hence...".

Thanks,
 -George

> ---
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> ---
>  xen/common/sched_credit2.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index 46b9279..50c1b9d 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -2232,7 +2232,7 @@ csched2_switch_sched(struct scheduler *new_ops, unsigned int cpu,
>       * And owning exactly that one (the lock of the old scheduler of this
>       * cpu) is what is necessary to prevent races.
>       */
> -    spin_lock_irq(&prv->lock);
> +    spin_lock(&prv->lock);
>  
>      idle_vcpu[cpu]->sched_priv = vdata;
>  
> @@ -2257,7 +2257,7 @@ csched2_switch_sched(struct scheduler *new_ops, unsigned int cpu,
>      smp_mb();
>      per_cpu(schedule_data, cpu).schedule_lock = &prv->rqd[rqi].lock;
>  
> -    spin_unlock_irq(&prv->lock);
> +    spin_unlock(&prv->lock);
>  }
>  
>  static void
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 2/4] xen: sched: fix killing an uninitialized timer in free_pdata.
  2016-05-03 21:46 ` [PATCH for 4.7 2/4] xen: sched: fix killing an uninitialized timer in free_pdata Dario Faggioli
@ 2016-05-04 15:25   ` George Dunlap
  0 siblings, 0 replies; 37+ messages in thread
From: George Dunlap @ 2016-05-04 15:25 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel
  Cc: Varun.Swara, Julien Grall, Steve Capper, Wei Liu

On 03/05/16 22:46, Dario Faggioli wrote:
> commit 64269d9365 "sched: implement .init_pdata in Credit,
> Credit2 and RTDS" helped fixing Credit2 runqueues, and
> the races we had in switching scheduler for pCPUs, but
> introduced another issue. In fact, if CPU bringup fails
> during __cpu_up() (and, more precisely, after CPU_UP_PREPARE,
> but before CPU_STARTING) the CPU_UP_CANCELED notifier
> would be executed, which calls the free_pdata hook.
> 
> Such hook does, right now, two things: (1) undo the
> initialization done inside the init_pdata hook and (2)
> free the memory allocated by the alloc_pdata hook.
> 
> However, in the failure path just described, it is possible
> that only alloc_pdata were called, and this is potentially
> an issue (depending on how actually free_pdata does).
> 
> In fact, for Credit1 (the only scheduler that actually
> allocates per-pCPU data), this result in calling kill_timer()
> on a timer that had not yet been initialized, which causes
> the following:
> 
> (XEN) Xen call trace:
> (XEN)    [<000000000022e304>] timer.c#active_timer+0x8/0x24 (PC)
> (XEN)    [<000000000022f624>] kill_timer+0x108/0x2e0 (LR)
> (XEN)    [<00000000002208c0>] sched_credit.c#csched_free_pdata+0xd8/0x114
> (XEN)    [<0000000000227a18>] schedule.c#cpu_schedule_callback+0xc0/0x12c
> (XEN)    [<0000000000219944>] notifier_call_chain+0x78/0x9c
> (XEN)    [<00000000002015fc>] cpu_up+0x104/0x130
> (XEN)    [<000000000028f7c0>] start_xen+0xaf8/0xce0
> (XEN)    [<00000000810021d8>] 00000000810021d8
> (XEN)
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) Assertion 'timer->status >= TIMER_STATUS_inactive' failed at timer.c:279
> (XEN) ****************************************
> 
> Solve this by making the scheduler hooks API symmetric again,
> i.e., by adding a deinit_pdata hook and making it responsible
> of undoing what init_pdata did, rather than asking to free_pdata
> to do everything.
> 
> This is cleaner and, in the case at hand, makes it possible to
> only call free_pdata (which is the right thing to do) as only
> allocation and no initialization was performed.
> 
> Reported-by: Julien Grall <julien.grall@arm.com>
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

Looks good, thanks!

Reviewed-by: George Dunlap <george.dunlap@citrix.com>


> ---
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Varun.Swara@arm.com
> Cc: Steve Capper <Steve.Capper@arm.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> ---
>  xen/common/sched_credit.c  |   31 +++++++++++++++++++++++++++----
>  xen/common/sched_credit2.c |   16 +++++++++++++---
>  xen/common/schedule.c      |   38 +++++++++++++++++++++++++++++++++++++-
>  xen/include/xen/sched-if.h |    1 +
>  4 files changed, 78 insertions(+), 8 deletions(-)
> 
> diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
> index bc36837..db4d42a 100644
> --- a/xen/common/sched_credit.c
> +++ b/xen/common/sched_credit.c
> @@ -485,11 +485,35 @@ static void
>  csched_free_pdata(const struct scheduler *ops, void *pcpu, int cpu)
>  {
>      struct csched_private *prv = CSCHED_PRIV(ops);
> +
> +    /*
> +     * pcpu either points to a valid struct csched_pcpu, or is NULL, if we're
> +     * beeing called from CPU_UP_CANCELLED, because bringing up a pCPU failed
> +     * very early. xfree() does not really mind, but we want to be sure that,
> +     * when we get here, either init_pdata has never been called, or
> +     * deinit_pdata has been called already.
> +     */
> +    ASSERT(!cpumask_test_cpu(cpu, prv->cpus));
> +
> +    xfree(pcpu);
> +}
> +
> +static void
> +csched_deinit_pdata(const struct scheduler *ops, void *pcpu, int cpu)
> +{
> +    struct csched_private *prv = CSCHED_PRIV(ops);
>      struct csched_pcpu *spc = pcpu;
>      unsigned long flags;
>  
> -    if ( spc == NULL )
> -        return;
> +    /*
> +     * Scheduler specific data for this pCPU must still be there and and be
> +     * valid. In fact, if we are here:
> +     *  1. alloc_pdata must have been called for this cpu, and free_pdata
> +     *     must not have been called on it before us,
> +     *  2. init_pdata must have been called on this cpu, and deinit_pdata
> +     *     (us!) must not have been called on it already.
> +     */
> +    ASSERT(spc && cpumask_test_cpu(cpu, prv->cpus));
>  
>      spin_lock_irqsave(&prv->lock, flags);
>  
> @@ -507,8 +531,6 @@ csched_free_pdata(const struct scheduler *ops, void *pcpu, int cpu)
>          kill_timer(&prv->master_ticker);
>  
>      spin_unlock_irqrestore(&prv->lock, flags);
> -
> -    xfree(spc);
>  }
>  
>  static void *
> @@ -2091,6 +2113,7 @@ static const struct scheduler sched_credit_def = {
>      .free_vdata     = csched_free_vdata,
>      .alloc_pdata    = csched_alloc_pdata,
>      .init_pdata     = csched_init_pdata,
> +    .deinit_pdata   = csched_deinit_pdata,
>      .free_pdata     = csched_free_pdata,
>      .switch_sched   = csched_switch_sched,
>      .alloc_domdata  = csched_alloc_domdata,
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index 50c1b9d..803cc44 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -2201,6 +2201,12 @@ csched2_init_pdata(const struct scheduler *ops, void *pdata, int cpu)
>      unsigned long flags;
>      unsigned rqi;
>  
> +    /*
> +     * pdata contains what alloc_pdata returned. But since we don't (need to)
> +     * implement alloc_pdata, either that's NULL, or something is very wrong!
> +     */
> +    ASSERT(!pdata);
> +
>      spin_lock_irqsave(&prv->lock, flags);
>      old_lock = pcpu_schedule_lock(cpu);
>  
> @@ -2261,7 +2267,7 @@ csched2_switch_sched(struct scheduler *new_ops, unsigned int cpu,
>  }
>  
>  static void
> -csched2_free_pdata(const struct scheduler *ops, void *pcpu, int cpu)
> +csched2_deinit_pdata(const struct scheduler *ops, void *pcpu, int cpu)
>  {
>      unsigned long flags;
>      struct csched2_private *prv = CSCHED2_PRIV(ops);
> @@ -2270,7 +2276,11 @@ csched2_free_pdata(const struct scheduler *ops, void *pcpu, int cpu)
>  
>      spin_lock_irqsave(&prv->lock, flags);
>  
> -    ASSERT(cpumask_test_cpu(cpu, &prv->initialized));
> +    /*
> +     * alloc_pdata is not implemented, so pcpu must be NULL. On the other
> +     * hand, init_pdata must have been called for this pCPU.
> +     */
> +    ASSERT(!pcpu && cpumask_test_cpu(cpu, &prv->initialized));
>      
>      /* Find the old runqueue and remove this cpu from it */
>      rqi = prv->runq_map[cpu];
> @@ -2387,7 +2397,7 @@ static const struct scheduler sched_credit2_def = {
>      .alloc_vdata    = csched2_alloc_vdata,
>      .free_vdata     = csched2_free_vdata,
>      .init_pdata     = csched2_init_pdata,
> -    .free_pdata     = csched2_free_pdata,
> +    .deinit_pdata   = csched2_deinit_pdata,
>      .switch_sched   = csched2_switch_sched,
>      .alloc_domdata  = csched2_alloc_domdata,
>      .free_domdata   = csched2_free_domdata,
> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> index 5546999..80fea39 100644
> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -1546,6 +1546,38 @@ static int cpu_schedule_callback(
>      struct schedule_data *sd = &per_cpu(schedule_data, cpu);
>      int rc = 0;
>  
> +    /*
> +     * From the scheduler perspective, bringing up a pCPU requires
> +     * allocating and initializing the per-pCPU scheduler specific data,
> +     * as well as "registering" this pCPU to the scheduler (which may
> +     * involve modifying some scheduler wide data structures).
> +     * This happens by calling the alloc_pdata and init_pdata hooks, in
> +     * this order. A scheduler that does not need to allocate any per-pCPU
> +     * data can avoid implementing alloc_pdata. init_pdata may, however, be
> +     * necessary/useful in this case too (e.g., it can contain the "register
> +     * the pCPU to the scheduler" part). alloc_pdata (if present) is called
> +     * during CPU_UP_PREPARE. init_pdata (if present) is called during
> +     * CPU_STARTING.
> +     *
> +     * On the other hand, at teardown, we need to reverse what has been done
> +     * during initialization, and then free the per-pCPU specific data. This
> +     * happens by calling the deinit_pdata and free_pdata hooks, in this
> +     * order. If no per-pCPU memory was allocated, there is no need to
> +     * provide an implementation of free_pdata. deinit_pdata may, however,
> +     * be necessary/useful in this case too (e.g., it can undo something done
> +     * on scheduler wide data structure during init_pdata). Both deinit_pdata
> +     * and free_pdata are called during CPU_DEAD.
> +     *
> +     * If someting goes wrong during bringup, we go to CPU_UP_CANCELLED
> +     * *before* having called init_pdata. In this case, as there is no
> +     * initialization needing undoing, only free_pdata should be called.
> +     * This means it is possible to call free_pdata just after alloc_pdata,
> +     * without a init_pdata/deinit_pdata "cycle" in between the two.
> +     *
> +     * So, in summary, the usage pattern should look either
> +     *  - alloc_pdata-->init_pdata-->deinit_pdata-->free_pdata, or
> +     *  - alloc_pdata-->free_pdata.
> +     */
>      switch ( action )
>      {
>      case CPU_STARTING:
> @@ -1554,8 +1586,10 @@ static int cpu_schedule_callback(
>      case CPU_UP_PREPARE:
>          rc = cpu_schedule_up(cpu);
>          break;
> -    case CPU_UP_CANCELED:
>      case CPU_DEAD:
> +        SCHED_OP(sched, deinit_pdata, sd->sched_priv, cpu);
> +        /* Fallthrough */
> +    case CPU_UP_CANCELED:
>          cpu_schedule_down(cpu);
>          break;
>      default:
> @@ -1713,6 +1747,8 @@ int schedule_cpu_switch(unsigned int cpu, struct cpupool *c)
>  
>      SCHED_OP(new_ops, tick_resume, cpu);
>  
> +    SCHED_OP(old_ops, deinit_pdata, ppriv_old, cpu);
> +
>      SCHED_OP(old_ops, free_vdata, vpriv_old);
>      SCHED_OP(old_ops, free_pdata, ppriv_old, cpu);
>  
> diff --git a/xen/include/xen/sched-if.h b/xen/include/xen/sched-if.h
> index 1db7c8d..bc0e794 100644
> --- a/xen/include/xen/sched-if.h
> +++ b/xen/include/xen/sched-if.h
> @@ -138,6 +138,7 @@ struct scheduler {
>      void         (*free_pdata)     (const struct scheduler *, void *, int);
>      void *       (*alloc_pdata)    (const struct scheduler *, int);
>      void         (*init_pdata)     (const struct scheduler *, void *, int);
> +    void         (*deinit_pdata)   (const struct scheduler *, void *, int);
>      void         (*free_domdata)   (const struct scheduler *, void *);
>      void *       (*alloc_domdata)  (const struct scheduler *, struct domain *);
>  
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 3/4] xen: credit2: fix 2 (minor) issues in load tracking logic
  2016-05-03 21:46 ` [PATCH for 4.7 3/4] xen: credit2: fix 2 (minor) issues in load tracking logic Dario Faggioli
@ 2016-05-04 15:38   ` George Dunlap
  0 siblings, 0 replies; 37+ messages in thread
From: George Dunlap @ 2016-05-04 15:38 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel; +Cc: Wei Liu

On 03/05/16 22:46, Dario Faggioli wrote:
> All calculations that involve load_last_update uses quantities
> shifted by LOADAVG_GRANULARITY_SHIFT, so make sure that this
> is true even when the field is assigned a value for the first
> time, during vcpu allocation.
> 
> Also, during migration, while the loads of both the source and
> destination runqueues certainly need changing, the vcpu being
> moved does not change its running/non-running status, and its
> calculated load should hence not be affected.
> 
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

Looks good, thanks.

Reviewed-by: George Dunlap <george.dunlap@citrix.com>

I suppose the code that sets 'svc->load_last_update = now' should
probably have changed the variable name to make it clear that now !=
NOW().  Maybe next time I'm looking through that code I'll do a bit of
renaming.

> ---
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> ---
>  xen/common/sched_credit2.c |    6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index 803cc44..393364a 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -867,7 +867,7 @@ csched2_alloc_vdata(const struct scheduler *ops, struct vcpu *vc, void *dd)
>          svc->weight = svc->sdom->weight;
>          /* Starting load of 50% */
>          svc->avgload = 1ULL << (CSCHED2_PRIV(ops)->load_window_shift - 1);
> -        svc->load_last_update = NOW();
> +        svc->load_last_update = NOW() >> LOADAVG_GRANULARITY_SHIFT;
>      }
>      else
>      {
> @@ -1301,7 +1301,7 @@ static void migrate(const struct scheduler *ops,
>          if ( __vcpu_on_runq(svc) )
>          {
>              __runq_remove(svc);
> -            update_load(ops, svc->rqd, svc, -1, now);
> +            update_load(ops, svc->rqd, NULL, -1, now);
>              on_runq=1;
>          }
>          __runq_deassign(svc);
> @@ -1314,7 +1314,7 @@ static void migrate(const struct scheduler *ops,
>          __runq_assign(svc, trqd);
>          if ( on_runq )
>          {
> -            update_load(ops, svc->rqd, svc, 1, now);
> +            update_load(ops, svc->rqd, NULL, 1, now);
>              runq_insert(ops, svc->vcpu->processor, svc);
>              runq_tickle(ops, svc->vcpu->processor, svc, now);
>              SCHED_STAT_CRANK(migrate_on_runq);
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling
  2016-05-03 21:46 ` [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling Dario Faggioli
@ 2016-05-04 15:51   ` George Dunlap
  2016-05-04 15:53     ` Meng Xu
  2016-05-07 21:19   ` Meng Xu
  1 sibling, 1 reply; 37+ messages in thread
From: George Dunlap @ 2016-05-04 15:51 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel; +Cc: Tianyang Chen, Wei Liu, Meng Xu

On 03/05/16 22:46, Dario Faggioli wrote:
> The scheduling hooks API is now used properly, and no
> initialization or de-initialization happen in
> alloc/free_pdata any longer.
> 
> In fact, just like it is for Credit2, there is no real
> need for implementing alloc_pdata and free_pdata.
> 
> This also made it possible to improve the replenishment
> timer handling logic, such that now the timer is always
> kept on one of the pCPU of the scheduler it's servicing.
> Before this commit, in fact, even if the pCPU where the
> timer happened to be initialized at creation time was
> moved to another cpupool, the timer stayed there,
> potentially inferfearing with the new scheduler of the

* interfering

> pCPU itself.
> 
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

I don't know much about the logic, so I'll wait for Meng Xu to review it.

 -George

> --
> Cc: Meng Xu <mengxu@cis.upenn.edu>
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Tianyang Chen <tiche@seas.upenn.edu>
> Cc: Wei Liu <wei.liu2@citrix.com>
> ---
>  xen/common/sched_rt.c |   74 ++++++++++++++++++++++++++++++++++++-------------
>  1 file changed, 55 insertions(+), 19 deletions(-)
> 
> diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
> index 673fc92..7f8f411 100644
> --- a/xen/common/sched_rt.c
> +++ b/xen/common/sched_rt.c
> @@ -590,6 +590,10 @@ rt_init(struct scheduler *ops)
>      if ( prv == NULL )
>          return -ENOMEM;
>  
> +    prv->repl_timer = xzalloc(struct timer);
> +    if ( prv->repl_timer == NULL )
> +        return -ENOMEM;
> +
>      spin_lock_init(&prv->lock);
>      INIT_LIST_HEAD(&prv->sdom);
>      INIT_LIST_HEAD(&prv->runq);
> @@ -600,12 +604,6 @@ rt_init(struct scheduler *ops)
>  
>      ops->sched_data = prv;
>  
> -    /*
> -     * The timer initialization will happen later when
> -     * the first pcpu is added to this pool in alloc_pdata.
> -     */
> -    prv->repl_timer = NULL;
> -
>      return 0;
>  }
>  
> @@ -614,7 +612,8 @@ rt_deinit(struct scheduler *ops)
>  {
>      struct rt_private *prv = rt_priv(ops);
>  
> -    kill_timer(prv->repl_timer);
> +    ASSERT(prv->repl_timer->status == TIMER_STATUS_invalid ||
> +           prv->repl_timer->status == TIMER_STATUS_killed);
>      xfree(prv->repl_timer);
>  
>      ops->sched_data = NULL;
> @@ -632,9 +631,19 @@ rt_init_pdata(const struct scheduler *ops, void *pdata, int cpu)
>      spinlock_t *old_lock;
>      unsigned long flags;
>  
> -    /* Move the scheduler lock to our global runqueue lock.  */
>      old_lock = pcpu_schedule_lock_irqsave(cpu, &flags);
>  
> +    /*
> +     * TIMER_STATUS_invalid means we are the first cpu that sees the timer
> +     * allocated but not initialized, and so it's up to us to initialize it.
> +     */
> +    if ( prv->repl_timer->status == TIMER_STATUS_invalid )
> +    {
> +        init_timer(prv->repl_timer, repl_timer_handler, (void*) ops, cpu);
> +        dprintk(XENLOG_DEBUG, "RTDS: timer initialized on cpu %u\n", cpu);
> +    }
> +
> +    /* Move the scheduler lock to our global runqueue lock.  */
>      per_cpu(schedule_data, cpu).schedule_lock = &prv->lock;
>  
>      /* _Not_ pcpu_schedule_unlock(): per_cpu().schedule_lock changed! */
> @@ -659,6 +668,20 @@ rt_switch_sched(struct scheduler *new_ops, unsigned int cpu,
>       */
>      ASSERT(per_cpu(schedule_data, cpu).schedule_lock != &prv->lock);
>  
> +    /*
> +     * If we are the absolute first cpu being switched toward this
> +     * scheduler (in which case we'll see TIMER_STATUS_invalid), or the
> +     * first one that is added back to the cpupool that had all its cpus
> +     * removed (in which case we'll see TIMER_STATUS_killed), it's our
> +     * job to (re)initialize the timer.
> +     */
> +    if ( prv->repl_timer->status == TIMER_STATUS_invalid ||
> +         prv->repl_timer->status == TIMER_STATUS_killed )
> +    {
> +        init_timer(prv->repl_timer, repl_timer_handler, (void*) new_ops, cpu);
> +        dprintk(XENLOG_DEBUG, "RTDS: timer initialized on cpu %u\n", cpu);
> +    }
> +
>      idle_vcpu[cpu]->sched_priv = vdata;
>      per_cpu(scheduler, cpu) = new_ops;
>      per_cpu(schedule_data, cpu).sched_priv = NULL; /* no pdata */
> @@ -672,23 +695,36 @@ rt_switch_sched(struct scheduler *new_ops, unsigned int cpu,
>      per_cpu(schedule_data, cpu).schedule_lock = &prv->lock;
>  }
>  
> -static void *
> -rt_alloc_pdata(const struct scheduler *ops, int cpu)
> +static void
> +rt_deinit_pdata(const struct scheduler *ops, void *pcpu, int cpu)
>  {
> +    unsigned long flags;
>      struct rt_private *prv = rt_priv(ops);
>  
> -    if ( prv->repl_timer == NULL )
> -    {
> -        /* Allocate the timer on the first cpu of this pool. */
> -        prv->repl_timer = xzalloc(struct timer);
> +    spin_lock_irqsave(&prv->lock, flags);
>  
> -        if ( prv->repl_timer == NULL )
> -            return ERR_PTR(-ENOMEM);
> +    if ( prv->repl_timer->cpu == cpu )
> +    {
> +        struct cpupool *c = per_cpu(cpupool, cpu);
> +        unsigned int new_cpu = cpumask_cycle(cpu, cpupool_online_cpumask(c));
>  
> -        init_timer(prv->repl_timer, repl_timer_handler, (void *)ops, cpu);
> +        /*
> +         * Make sure the timer run on one of the cpus that are still available
> +         * to this scheduler. If there aren't any left, it means it's the time
> +         * to just kill it.
> +         */
> +        if ( new_cpu >= nr_cpu_ids )
> +        {
> +            kill_timer(prv->repl_timer);
> +            dprintk(XENLOG_DEBUG, "RTDS: timer killed on cpu %d\n", cpu);
> +        }
> +        else
> +        {
> +            migrate_timer(prv->repl_timer, new_cpu);
> +        }
>      }
>  
> -    return NULL;
> +    spin_unlock_irqrestore(&prv->lock, flags);
>  }
>  
>  static void *
> @@ -1433,9 +1469,9 @@ static const struct scheduler sched_rtds_def = {
>      .dump_settings  = rt_dump,
>      .init           = rt_init,
>      .deinit         = rt_deinit,
> -    .alloc_pdata    = rt_alloc_pdata,
>      .init_pdata     = rt_init_pdata,
>      .switch_sched   = rt_switch_sched,
> +    .deinit_pdata   = rt_deinit_pdata,
>      .alloc_domdata  = rt_alloc_domdata,
>      .free_domdata   = rt_free_domdata,
>      .init_domain    = rt_dom_init,
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling
  2016-05-04 15:51   ` George Dunlap
@ 2016-05-04 15:53     ` Meng Xu
  2016-05-06 23:05       ` Dario Faggioli
  0 siblings, 1 reply; 37+ messages in thread
From: Meng Xu @ 2016-05-04 15:53 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, Dario Faggioli, Tianyang Chen, Wei Liu

On Wed, May 4, 2016 at 11:51 AM, George Dunlap <george.dunlap@citrix.com> wrote:
> On 03/05/16 22:46, Dario Faggioli wrote:
>> The scheduling hooks API is now used properly, and no
>> initialization or de-initialization happen in
>> alloc/free_pdata any longer.
>>
>> In fact, just like it is for Credit2, there is no real
>> need for implementing alloc_pdata and free_pdata.
>>
>> This also made it possible to improve the replenishment
>> timer handling logic, such that now the timer is always
>> kept on one of the pCPU of the scheduler it's servicing.
>> Before this commit, in fact, even if the pCPU where the
>> timer happened to be initialized at creation time was
>> moved to another cpupool, the timer stayed there,
>> potentially inferfearing with the new scheduler of the
>
> * interfering
>
>> pCPU itself.
>>
>> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
>
> I don't know much about the logic, so I'll wait for Meng Xu to review it.
>

I will look at it this week...(I will try to do it asap...)

Meng

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 0/4] Assorted scheduling fixes
  2016-05-03 21:46 [PATCH for 4.7 0/4] Assorted scheduling fixes Dario Faggioli
                   ` (4 preceding siblings ...)
  2016-05-04  1:26 ` [PATCH for 4.7 0/4] Assorted scheduling fixes Konrad Rzeszutek Wilk
@ 2016-05-04 15:53 ` George Dunlap
  2016-05-04 16:04 ` Wei Liu
  6 siblings, 0 replies; 37+ messages in thread
From: George Dunlap @ 2016-05-04 15:53 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel
  Cc: Wei Liu, Tianyang Chen, Julien Grall, Meng Xu, Varun.Swara, Steve Capper

On 03/05/16 22:46, Dario Faggioli wrote:
> Hi,
> 
> This small series contains some bugfixes for various schedulers. They're all
> bugfixes, so I think all should be considered for 4.7. Here's some more
> detailed analysis.
> 
> Patch 1 and 3 are for Credit2. Patch 1 is a lot more important, as we have an
> ASSERT triggering without it. Patch 2 is behavioral fixing, which I believe it
> is important, but at least does not make anything explode.
> 
> Patch 2 fixes another ASSERT, in case a pCPU fails to come up. This is what
> Julien reported here:
> 
>  https://www.mail-archive.com/xen-devel@lists.xen.org/msg65918.html
> 
> Julien, the patch is very very similar to the one attached to one of my reply
> in that thread, but I had to change some small bits... Can you please re-test
> it?
> 
> Patch 4 makes the code of RTDS look consistent with what we state in patch 2,
> so it's also important. Furthermore, it does fix a bug (although, again, not
> one that would splat Xen) as, without it, we may have a timer used by the RTDS
> scheduler bound to the pCPU of another cpupool with another scheduler. That
> would introduce some unwanted and very difficult to recognize interference
> between different schedulers in different pool, and should hence be avoided.
> 
> So this was awesomeness; about risks:
>  - patch 1 is very small, super-self contained (zero impact outside of Credit2
>    code) and it fixes an actual and 100% reproducible bug;
>  - patch 2 is also totally self-contained and it can't possibly cause problems
>    to anything else than to what it is trying to fix (Credit2's load balancer).
>    It doesn't cure any ASSERT or Oops, so it's less interesting, but given the
>    low risk --also considering that Credit2 will still be considered
>    experimental in 4.7-- I think it can go in;
>  - patch 3 is bigger, and a bit more complex. Note, however, that most of its
>    content is code comments and ASSERT-s; it is self contained to scheduling
>    (in the sense that it impacts all schedulers, but "just" them), and fixes
>    a situation that, AFAIUI, is important for ARM;

I think you reordered patch 2 and 3 between the time you wrote this and
the time you posted it. :-)

>  - patch 4 may again look not that critical. But, the fact that someone wanting
>    to experiment with RTDS in a cpupool would face the kind of interference
>    between independent cpupools that the patch cures is, I think, something
>    worthwhile trying to avoid. Besides, it is again quite self contained, as
>    it's indeed only relevant for RTDS (which is also going to be called
>    experimental for 4.7).

FWIW I agree with the rationale here.

 -George


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 1/4] xen: sched: avoid spuriously re-enabling IRQs in csched2_switch_sched()
  2016-05-04 15:11   ` George Dunlap
@ 2016-05-04 15:58     ` Dario Faggioli
  2016-05-04 17:05       ` George Dunlap
  0 siblings, 1 reply; 37+ messages in thread
From: Dario Faggioli @ 2016-05-04 15:58 UTC (permalink / raw)
  To: George Dunlap, xen-devel; +Cc: Wei Liu

On Wed, 2016-05-04 at 16:11 +0100, George Dunlap wrote:
> On 03/05/16 22:46, Dario Faggioli wrote:
> > 
> > In fact, interrupts are already disabled when calling
> > the hook from schedule_cpu_switch(), and hence using
> > anything different than just spin_lock() is wrong (and
> > ASSERT()-s in debug builds) or unnecessary.
> > 
> > Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> Good catch.  
>
Well, much rather my bad introducing this! :-/

> But would it be better to either 1) add an assert that irqs
> are disabled, or 2) use spin_lock_irqsave(), just to make sure the
> two
> bits of code won't silently go out of sync?
> 
There's an ASSERT() already, in spin_lock_irq() (asking for IRQs to be
enabled), which is in fact the one that triggers.

I introduced the bug because of an oversight when applying the last
review comments to a previous patch series, and did not spot it because
--and this is the true mistake, IMO-- I tested the final result only
with debug=n. This is why I think an ASSERT() is after all not that
useful here... In fact, if this were tested with debug=y, what's
already in place would have been enough.

There are quite a few other similar cases all around scheduling code.
Some of them have comments, none has ASSERT()-s, and I think that is
fine.

I also don't like using spin_lock_irqsave() when it is not necessary,
even if this is not an hot path. In fact, apart from being slower, I
find it more confusing than helpful, especially when looking at the
code and trying to reason about whether we can be called with IRQ
enabled or not.

So, in summary, I'd be fine with adding a comment, and I'm ok
respinning for this. This patch is pretty independent from the rest of
the series anyway, so I can easily respin just this one.

Thoughts?

> Re Jan's comment -- you meant to say that spin_lock_irq() is wrong
> and
> spin_lock_irqsave() is unnecessary?  That's probably a bit more
> inference than we want changeset messages to have, ideally. :-)
> 
Yes, that's what I meant, as I replied myself to Jan.

> I wouldn't require a re-spin just for that, but if you are going to
> respin it, I would suggest just dropping the "and hence...".
> 
Ok, if I have to respin, I'll do this. If not, I'm fine with whoever
commits this has to drop that sentence.

Thanks and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 0/4] Assorted scheduling fixes
  2016-05-03 21:46 [PATCH for 4.7 0/4] Assorted scheduling fixes Dario Faggioli
                   ` (5 preceding siblings ...)
  2016-05-04 15:53 ` George Dunlap
@ 2016-05-04 16:04 ` Wei Liu
  2016-05-07 21:23   ` Meng Xu
  6 siblings, 1 reply; 37+ messages in thread
From: Wei Liu @ 2016-05-04 16:04 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Wei Liu, Tianyang Chen, George Dunlap, Julien Grall, Meng Xu,
	Varun.Swara, xen-devel, Steve Capper

On Tue, May 03, 2016 at 11:46:27PM +0200, Dario Faggioli wrote:
> Hi,
> 
> This small series contains some bugfixes for various schedulers. They're all
> bugfixes, so I think all should be considered for 4.7. Here's some more
> detailed analysis.
> 
> Patch 1 and 3 are for Credit2. Patch 1 is a lot more important, as we have an
> ASSERT triggering without it. Patch 2 is behavioral fixing, which I believe it
> is important, but at least does not make anything explode.
> 
> Patch 2 fixes another ASSERT, in case a pCPU fails to come up. This is what
> Julien reported here:
> 
>  https://www.mail-archive.com/xen-devel@lists.xen.org/msg65918.html
> 
> Julien, the patch is very very similar to the one attached to one of my reply
> in that thread, but I had to change some small bits... Can you please re-test
> it?
> 
> Patch 4 makes the code of RTDS look consistent with what we state in patch 2,
> so it's also important. Furthermore, it does fix a bug (although, again, not
> one that would splat Xen) as, without it, we may have a timer used by the RTDS
> scheduler bound to the pCPU of another cpupool with another scheduler. That
> would introduce some unwanted and very difficult to recognize interference
> between different schedulers in different pool, and should hence be avoided.
> 
> So this was awesomeness; about risks:
>  - patch 1 is very small, super-self contained (zero impact outside of Credit2
>    code) and it fixes an actual and 100% reproducible bug;
>  - patch 2 is also totally self-contained and it can't possibly cause problems
>    to anything else than to what it is trying to fix (Credit2's load balancer).
>    It doesn't cure any ASSERT or Oops, so it's less interesting, but given the
>    low risk --also considering that Credit2 will still be considered
>    experimental in 4.7-- I think it can go in;
>  - patch 3 is bigger, and a bit more complex. Note, however, that most of its
>    content is code comments and ASSERT-s; it is self contained to scheduling
>    (in the sense that it impacts all schedulers, but "just" them), and fixes
>    a situation that, AFAIUI, is important for ARM;

You meant patch 2 actually.

For the first three patches:

Release-acked-by: Wei Liu <wei.liu2@citrix.com>

>  - patch 4 may again look not that critical. But, the fact that someone wanting
>    to experiment with RTDS in a cpupool would face the kind of interference
>    between independent cpupools that the patch cures is, I think, something
>    worthwhile trying to avoid. Besides, it is again quite self contained, as
>    it's indeed only relevant for RTDS (which is also going to be called
>    experimental for 4.7).

I will wait for Meng to review this one.

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 1/4] xen: sched: avoid spuriously re-enabling IRQs in csched2_switch_sched()
  2016-05-04 15:58     ` Dario Faggioli
@ 2016-05-04 17:05       ` George Dunlap
  2016-05-04 17:21         ` Dario Faggioli
  0 siblings, 1 reply; 37+ messages in thread
From: George Dunlap @ 2016-05-04 17:05 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel; +Cc: Wei Liu

On 04/05/16 16:58, Dario Faggioli wrote:
> On Wed, 2016-05-04 at 16:11 +0100, George Dunlap wrote:
>> On 03/05/16 22:46, Dario Faggioli wrote:
>>>
>>> In fact, interrupts are already disabled when calling
>>> the hook from schedule_cpu_switch(), and hence using
>>> anything different than just spin_lock() is wrong (and
>>> ASSERT()-s in debug builds) or unnecessary.
>>>
>>> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
>> Good catch.  
>>
> Well, much rather my bad introducing this! :-/
> 
>> But would it be better to either 1) add an assert that irqs
>> are disabled, or 2) use spin_lock_irqsave(), just to make sure the
>> two
>> bits of code won't silently go out of sync?
>>
> There's an ASSERT() already, in spin_lock_irq() (asking for IRQs to be
> enabled), which is in fact the one that triggers.
> 
> I introduced the bug because of an oversight when applying the last
> review comments to a previous patch series, and did not spot it because
> --and this is the true mistake, IMO-- I tested the final result only
> with debug=n. This is why I think an ASSERT() is after all not that
> useful here... In fact, if this were tested with debug=y, what's
> already in place would have been enough.
> 
> There are quite a few other similar cases all around scheduling code.
> Some of them have comments, none has ASSERT()-s, and I think that is
> fine.

I'm a bit confused by these few paragraphs, and it makes me wonder if
maybe I wasn't very clear.  By #1, I meant, "Do exactly what is done in
this patch (replace spin_lock_irq() with spin_lock()), but also add to
it an assert to make sure that if irqs ever *stop* being disabled when
calling this, we'll find out".

Is that what you understood me to mean, or did you think I meant,
"Instead of changing to spin_lock(), [do something else]"?

> I also don't like using spin_lock_irqsave() when it is not necessary,
> even if this is not an hot path. In fact, apart from being slower, I
> find it more confusing than helpful, especially when looking at the
> code and trying to reason about whether we can be called with IRQ
> enabled or not.

I agree with this; I mainly included the suggestion as a potential
alternative to making the code robust.

I've checked in patches 2 and 3, btw, so no need to re-send them. :-)

 -George


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 1/4] xen: sched: avoid spuriously re-enabling IRQs in csched2_switch_sched()
  2016-05-04 17:05       ` George Dunlap
@ 2016-05-04 17:21         ` Dario Faggioli
  2016-05-04 17:34           ` George Dunlap
  0 siblings, 1 reply; 37+ messages in thread
From: Dario Faggioli @ 2016-05-04 17:21 UTC (permalink / raw)
  To: George Dunlap, xen-devel; +Cc: Wei Liu

On Wed, 2016-05-04 at 18:05 +0100, George Dunlap wrote:
> On 04/05/16 16:58, Dario Faggioli wrote:
> > On Wed, 2016-05-04 at 16:11 +0100, George Dunlap wrote:
> > There are quite a few other similar cases all around scheduling
> > code.
> > Some of them have comments, none has ASSERT()-s, and I think that
> > is
> > fine.
> I'm a bit confused by these few paragraphs, and it makes me wonder if
> maybe I wasn't very clear.  By #1, I meant, "Do exactly what is done
> in
> this patch (replace spin_lock_irq() with spin_lock()), but also add
> to
> it an assert to make sure that if irqs ever *stop* being disabled
> when
> calling this, we'll find out".
> 
No, you were clear. I wasn't, sorry. :-)

> Is that what you understood me to mean, or did you think I meant,
> "Instead of changing to spin_lock(), [do something else]"?
> 
I understood what you meant. What I meant was this:

instead of another ASSERT, I would prefer to add a comment, like we are
doing in other similar places already:

static void
csched2_deinit_pdata(...)
{
    ...
    /* No need to save IRQs here, they're already disabled */
    spin_lock(&rqd->lock);
   ...
}

And I'm now adding this:

After all, I'm fine with an ASSERT() too, but then I think we should
add one to the same effect to csched_switch_sched() too.

> > I also don't like using spin_lock_irqsave() when it is not
> > necessary,
> > even if this is not an hot path. In fact, apart from being slower,
> > I
> > find it more confusing than helpful, especially when looking at the
> > code and trying to reason about whether we can be called with IRQ
> > enabled or not.
> I agree with this; I mainly included the suggestion as a potential
> alternative to making the code robust.
> 
> I've checked in patches 2 and 3, btw, so no need to re-send them. :-)
> 
Great, thanks. :-)

Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 1/4] xen: sched: avoid spuriously re-enabling IRQs in csched2_switch_sched()
  2016-05-04 17:21         ` Dario Faggioli
@ 2016-05-04 17:34           ` George Dunlap
  2016-05-06 13:21             ` Dario Faggioli
  0 siblings, 1 reply; 37+ messages in thread
From: George Dunlap @ 2016-05-04 17:34 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel; +Cc: Wei Liu

On 04/05/16 18:21, Dario Faggioli wrote:
> On Wed, 2016-05-04 at 18:05 +0100, George Dunlap wrote:
>> On 04/05/16 16:58, Dario Faggioli wrote:
>>> On Wed, 2016-05-04 at 16:11 +0100, George Dunlap wrote:
>>> There are quite a few other similar cases all around scheduling
>>> code.
>>> Some of them have comments, none has ASSERT()-s, and I think that
>>> is
>>> fine.
>> I'm a bit confused by these few paragraphs, and it makes me wonder if
>> maybe I wasn't very clear.  By #1, I meant, "Do exactly what is done
>> in
>> this patch (replace spin_lock_irq() with spin_lock()), but also add
>> to
>> it an assert to make sure that if irqs ever *stop* being disabled
>> when
>> calling this, we'll find out".
>>
> No, you were clear. I wasn't, sorry. :-)
> 
>> Is that what you understood me to mean, or did you think I meant,
>> "Instead of changing to spin_lock(), [do something else]"?
>>
> I understood what you meant. What I meant was this:
> 
> instead of another ASSERT, I would prefer to add a comment, like we are
> doing in other similar places already:
> 
> static void
> csched2_deinit_pdata(...)
> {
>     ...
>     /* No need to save IRQs here, they're already disabled */
>     spin_lock(&rqd->lock);
>    ...
> }
> 
> And I'm now adding this:
> 
> After all, I'm fine with an ASSERT() too, but then I think we should
> add one to the same effect to csched_switch_sched() too.

Well an ASSERT() is sort of like a comment, in that if you see
ASSERT(irqs_disabled()), you know there's no need to save irqs because
they should already disabled.  But it has the advantage that osstest
will be able to "read" it once we get some proper cpupool tests for
osstest. :-)

If we weren't in the feature freeze, I'd definitely favor adding an
ASSERT to credit1.  As it is, I think either way (adding now or waiting
until the 4.8 development window) should be fine.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 0/4] Assorted scheduling fixes
  2016-05-04  9:06   ` Dario Faggioli
@ 2016-05-05 12:00     ` Julien Grall
  2016-05-05 12:38       ` Dario Faggioli
  0 siblings, 1 reply; 37+ messages in thread
From: Julien Grall @ 2016-05-05 12:00 UTC (permalink / raw)
  To: Dario Faggioli, Konrad Rzeszutek Wilk
  Cc: Wei Liu, Steve Capper, Tianyang Chen, George Dunlap, Meng Xu,
	Varun.Swara, xen-devel

Hi Dario,

On 04/05/16 10:06, Dario Faggioli wrote:
> On Tue, 2016-05-03 at 21:26 -0400, Konrad Rzeszutek Wilk wrote:
>> On Tue, May 03, 2016 at 11:46:27PM +0200, Dario Faggioli wrote:
>>>
>>> Julien, the patch is very very similar to the one attached to one
>>> of my reply
>>> in that thread, but I had to change some small bits... Can you
>>> please re-test
>>> it?
>> I tested it. It does not crash anymore. However I see this:
>>
> On what hardware/emulator? The same Julien was using?
>> (XEN) CPU1 never came online
>> (XEN) Failed to bring up CPU 1 (error -5)
>> (XEN) Bringing up CPU2
>> (XEN) CPU2 never came online
>> (XEN) Failed to bring up CPU 2 (error -5)
>> (XEN) Bringing up CPU3
>> (XEN) CPU3 never came online
>> (XEN) Failed to bring up CPU 3 (error -5)
>> (XEN) Brought up 1 CPUs
>>
>>
>> Which is not very reassuring. I don't know if that is expected.
>>
> I don't know... I've never said this on any x86 box before, and I
> wouldn't think it to be related to this series... do you see this only
> with the series applied?

Xen ARM doesn't wait indefinitely a secondary CPU to come online. 
Instead, if it isn't online after a 1s, it will be skipped.

This could happen on the Foundation Model if the number of cores given 
on the model command line, handled by the firmware and described in the 
device tree don't match each other.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 0/4] Assorted scheduling fixes
  2016-05-05 12:00     ` Julien Grall
@ 2016-05-05 12:38       ` Dario Faggioli
  0 siblings, 0 replies; 37+ messages in thread
From: Dario Faggioli @ 2016-05-05 12:38 UTC (permalink / raw)
  To: Julien Grall, Konrad Rzeszutek Wilk
  Cc: Wei Liu, Steve Capper, Tianyang Chen, George Dunlap, Meng Xu,
	Varun.Swara, xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 1679 bytes --]

On Thu, 2016-05-05 at 13:00 +0100, Julien Grall wrote:
> Hi Dario,
> On 04/05/16 10:06, Dario Faggioli wrote:
> > On what hardware/emulator? The same Julien was using?
> > > 
> > > (XEN) CPU1 never came online
> > > (XEN) Failed to bring up CPU 1 (error -5)
> > > (XEN) Bringing up CPU2
> > > (XEN) CPU2 never came online
> > > (XEN) Failed to bring up CPU 2 (error -5)
> > > (XEN) Bringing up CPU3
> > > (XEN) CPU3 never came online
> > > (XEN) Failed to bring up CPU 3 (error -5)
> > > (XEN) Brought up 1 CPUs
> > > 
> > > 
> > > Which is not very reassuring. I don't know if that is expected.
> > > 
> > I don't know... I've never said this on any x86 box before, and I
> > wouldn't think it to be related to this series... do you see this
> > only
> > with the series applied?
> Xen ARM doesn't wait indefinitely a secondary CPU to come online. 
> Instead, if it isn't online after a 1s, it will be skipped.
> 
Ok.

> This could happen on the Foundation Model if the number of cores
> given 
> on the model command line, handled by the firmware and described in
> the 
> device tree don't match each other.
> 
Right. Thanks for the explanation.

So, does this mean that the behavior reported above by Konrad, i.e.,
logging and not crashing, is indeed intentional (or at least expected)
and that therefore even this version of the patch cures the issue?

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 1/4] xen: sched: avoid spuriously re-enabling IRQs in csched2_switch_sched()
  2016-05-04 17:34           ` George Dunlap
@ 2016-05-06 13:21             ` Dario Faggioli
  2016-05-06 13:48               ` Wei Liu
  2016-05-09 14:42               ` George Dunlap
  0 siblings, 2 replies; 37+ messages in thread
From: Dario Faggioli @ 2016-05-06 13:21 UTC (permalink / raw)
  To: George Dunlap, xen-devel; +Cc: Wei Liu


[-- Attachment #1.1.1: Type: text/plain, Size: 4137 bytes --]

On Wed, 2016-05-04 at 18:34 +0100, George Dunlap wrote:
> On 04/05/16 18:21, Dario Faggioli wrote:
> > 
> > After all, I'm fine with an ASSERT() too, but then I think we
> > should
> > add one to the same effect to csched_switch_sched() too.
> Well an ASSERT() is sort of like a comment, in that if you see
> ASSERT(irqs_disabled()), you know there's no need to save irqs
> because
> they should already disabled.  But it has the advantage that osstest
> will be able to "read" it once we get some proper cpupool tests for
> osstest. :-)
> 
> If we weren't in the feature freeze, I'd definitely favor adding an
> ASSERT to credit1.  As it is, I think either way (adding now or
> waiting
> until the 4.8 development window) should be fine.
> 
Ok, here you go (inline and attached) the patch with ASSERT()-s both in
Credit2 and Credit1 (despite the freeze, I think it's the best thing to
do, see the changelog).

Thanks and Regards,
Dario
--
commit cbabd44e171d0bd2169f1c7100e69a9e48289980
Author: Dario Faggioli <dario.faggioli@citrix.com>
Date:   Tue Apr 26 18:56:56 2016 +0200

    xen: sched: avoid spuriously re-enabling IRQs in csched2_switch_sched()
    
    interrupts are already disabled when calling the hook
    (from schedule_cpu_switch()), so we must use spin_lock()
    and spin_unlock().
    
    Add an ASSERT(), so we will notice if this code and its
    caller get out of sync with respect to disabling interrupts
    (and add one at the same exact occurrence of this pattern
    in Credit1 too)
    
    Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
    ---
    Cc: George Dunlap <george.dunlap@citrix.com>
    Cc: Wei Liu <wei.liu2@citrix.com>
    --
    Changes from v1:
     * add the ASSERT(), as requested by George
     * add the ASSERT in Credit1 too
    --
    For Wei:
     - the Credit2 spin_lock_irq()-->spin_lock() change needs
       to go in, as it fixes a bug;
     - adding the ASSERT was requested during review;
     - adding the ASSERT in Credit1 is not strictly necessary,
       but imptoves code quality and consistency at zero cost
       and risk, so I think we should just go for it now, instead
       of waitign for 4.8 (it's basically like I'm adding a
       comment!).

diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index db4d42a..a38a63d 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -615,6 +615,7 @@ csched_switch_sched(struct scheduler *new_ops, unsigned int cpu,
      * schedule_cpu_switch()). It actually may or may not be the 'right'
      * one for this cpu, but that is ok for preventing races.
      */
+    ASSERT(!local_irq_is_enabled());
     spin_lock(&prv->lock);
     init_pdata(prv, pdata, cpu);
     spin_unlock(&prv->lock);
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index f3b62ac..f95e509 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -2238,7 +2238,8 @@ csched2_switch_sched(struct scheduler *new_ops, unsigned int cpu,
      * And owning exactly that one (the lock of the old scheduler of this
      * cpu) is what is necessary to prevent races.
      */
-    spin_lock_irq(&prv->lock);
+    ASSERT(!local_irq_is_enabled());
+    spin_lock(&prv->lock);
 
     idle_vcpu[cpu]->sched_priv = vdata;
 
@@ -2263,7 +2264,7 @@ csched2_switch_sched(struct scheduler *new_ops, unsigned int cpu,
     smp_mb();
     per_cpu(schedule_data, cpu).schedule_lock = &prv->rqd[rqi].lock;
 
-    spin_unlock_irq(&prv->lock);
+    spin_unlock(&prv->lock);
 }
 
 static void
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.1.2: xen-sched-avoid-spuriously-reenable-irqs.patch --]
[-- Type: text/x-patch, Size: 2601 bytes --]

commit cbabd44e171d0bd2169f1c7100e69a9e48289980
Author: Dario Faggioli <dario.faggioli@citrix.com>
Date:   Tue Apr 26 18:56:56 2016 +0200

    xen: sched: avoid spuriously re-enabling IRQs in csched2_switch_sched()
    
    interrupts are already disabled when calling the hook
    (from schedule_cpu_switch()), so we must use spin_lock()
    and spin_unlock().
    
    Add an ASSERT(), so we will notice if this code and its
    caller get out of sync with respect to disabling interrupts
    (and add one at the same exact occurrence of this pattern
    in Credit1 too)
    
    Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
    ---
    Cc: George Dunlap <george.dunlap@citrix.com>
    Cc: Wei Liu <wei.liu2@citrix.com>
    --
    Changes from v1:
     * add the ASSERT(), as requested by George
     * add the ASSERT in Credit1 too
    --
    For Wei:
     - the Credit2 spin_lock_irq()-->spin_lock() change needs
       to go in, as it fixes a bug;
     - adding the ASSERT was requested during review;
     - adding the ASSERT in Credit1 is not strictly necessary,
       but imptoves code quality and consistency at zero cost
       and risk, so I think we should just go for it now, instead
       of waitign for 4.8 (it's basically like I'm adding a
       comment!).

diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index db4d42a..a38a63d 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -615,6 +615,7 @@ csched_switch_sched(struct scheduler *new_ops, unsigned int cpu,
      * schedule_cpu_switch()). It actually may or may not be the 'right'
      * one for this cpu, but that is ok for preventing races.
      */
+    ASSERT(!local_irq_is_enabled());
     spin_lock(&prv->lock);
     init_pdata(prv, pdata, cpu);
     spin_unlock(&prv->lock);
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index f3b62ac..f95e509 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -2238,7 +2238,8 @@ csched2_switch_sched(struct scheduler *new_ops, unsigned int cpu,
      * And owning exactly that one (the lock of the old scheduler of this
      * cpu) is what is necessary to prevent races.
      */
-    spin_lock_irq(&prv->lock);
+    ASSERT(!local_irq_is_enabled());
+    spin_lock(&prv->lock);
 
     idle_vcpu[cpu]->sched_priv = vdata;
 
@@ -2263,7 +2264,7 @@ csched2_switch_sched(struct scheduler *new_ops, unsigned int cpu,
     smp_mb();
     per_cpu(schedule_data, cpu).schedule_lock = &prv->rqd[rqi].lock;
 
-    spin_unlock_irq(&prv->lock);
+    spin_unlock(&prv->lock);
 }
 
 static void

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 1/4] xen: sched: avoid spuriously re-enabling IRQs in csched2_switch_sched()
  2016-05-06 13:21             ` Dario Faggioli
@ 2016-05-06 13:48               ` Wei Liu
  2016-05-09 14:42               ` George Dunlap
  1 sibling, 0 replies; 37+ messages in thread
From: Wei Liu @ 2016-05-06 13:48 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Wei Liu, George Dunlap

On Fri, May 06, 2016 at 03:21:40PM +0200, Dario Faggioli wrote:
> On Wed, 2016-05-04 at 18:34 +0100, George Dunlap wrote:
> > On 04/05/16 18:21, Dario Faggioli wrote:
> > > 
> > > After all, I'm fine with an ASSERT() too, but then I think we
> > > should
> > > add one to the same effect to csched_switch_sched() too.
> > Well an ASSERT() is sort of like a comment, in that if you see
> > ASSERT(irqs_disabled()), you know there's no need to save irqs
> > because
> > they should already disabled.  But it has the advantage that osstest
> > will be able to "read" it once we get some proper cpupool tests for
> > osstest. :-)
> > 
> > If we weren't in the feature freeze, I'd definitely favor adding an
> > ASSERT to credit1.  As it is, I think either way (adding now or
> > waiting
> > until the 4.8 development window) should be fine.
> > 
> Ok, here you go (inline and attached) the patch with ASSERT()-s both in
> Credit2 and Credit1 (despite the freeze, I think it's the best thing to
> do, see the changelog).
> 
> Thanks and Regards,
> Dario
> --
> commit cbabd44e171d0bd2169f1c7100e69a9e48289980
> Author: Dario Faggioli <dario.faggioli@citrix.com>
> Date:   Tue Apr 26 18:56:56 2016 +0200
> 
>     xen: sched: avoid spuriously re-enabling IRQs in csched2_switch_sched()
>     
>     interrupts are already disabled when calling the hook
>     (from schedule_cpu_switch()), so we must use spin_lock()
>     and spin_unlock().
>     
>     Add an ASSERT(), so we will notice if this code and its
>     caller get out of sync with respect to disabling interrupts
>     (and add one at the same exact occurrence of this pattern
>     in Credit1 too)
>     
>     Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
>     ---
>     Cc: George Dunlap <george.dunlap@citrix.com>
>     Cc: Wei Liu <wei.liu2@citrix.com>
>     --
>     Changes from v1:
>      * add the ASSERT(), as requested by George
>      * add the ASSERT in Credit1 too
>     --
>     For Wei:
>      - the Credit2 spin_lock_irq()-->spin_lock() change needs
>        to go in, as it fixes a bug;
>      - adding the ASSERT was requested during review;
>      - adding the ASSERT in Credit1 is not strictly necessary,
>        but imptoves code quality and consistency at zero cost
>        and risk, so I think we should just go for it now, instead
>        of waitign for 4.8 (it's basically like I'm adding a
>        comment!).
> 

Release-acked-by: Wei Liu <wei.liu2@citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling
  2016-05-04 15:53     ` Meng Xu
@ 2016-05-06 23:05       ` Dario Faggioli
  0 siblings, 0 replies; 37+ messages in thread
From: Dario Faggioli @ 2016-05-06 23:05 UTC (permalink / raw)
  To: Meng Xu, George Dunlap; +Cc: xen-devel, Tianyang Chen, Wei Liu


[-- Attachment #1.1: Type: text/plain, Size: 1309 bytes --]

On Wed, 2016-05-04 at 11:53 -0400, Meng Xu wrote:
> On Wed, May 4, 2016 at 11:51 AM, George Dunlap <george.dunlap@citrix.
> com> wrote:
> > On 03/05/16 22:46, Dario Faggioli wrote:
> > > This also made it possible to improve the replenishment
> > > timer handling logic, such that now the timer is always
> > > kept on one of the pCPU of the scheduler it's servicing.
> > > Before this commit, in fact, even if the pCPU where the
> > > timer happened to be initialized at creation time was
> > > moved to another cpupool, the timer stayed there,
> > > potentially inferfearing with the new scheduler of the
> > * interfering
> > 
> > > 
> > > pCPU itself.
> > > 
> > > Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> > I don't know much about the logic, so I'll wait for Meng Xu to
> > review it.
> > 
> I will look at it this week...(I will try to do it asap...)
> 
Hey Meng... di d you have the chance? I hate to push, but we really
like to have the chance to get this into 4.7.

Thanks and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling
  2016-05-03 21:46 ` [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling Dario Faggioli
  2016-05-04 15:51   ` George Dunlap
@ 2016-05-07 21:19   ` Meng Xu
  2016-05-08  3:12     ` Meng Xu
                       ` (2 more replies)
  1 sibling, 3 replies; 37+ messages in thread
From: Meng Xu @ 2016-05-07 21:19 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Tianyang Chen, Wei Liu, George Dunlap

On Tue, May 3, 2016 at 5:46 PM, Dario Faggioli
<dario.faggioli@citrix.com> wrote:
>
> The scheduling hooks API is now used properly, and no
> initialization or de-initialization happen in
> alloc/free_pdata any longer.
>
> In fact, just like it is for Credit2, there is no real
> need for implementing alloc_pdata and free_pdata.
>
> This also made it possible to improve the replenishment
> timer handling logic, such that now the timer is always
> kept on one of the pCPU of the scheduler it's servicing.
> Before this commit, in fact, even if the pCPU where the
> timer happened to be initialized at creation time was
> moved to another cpupool, the timer stayed there,
> potentially inferfearing with the new scheduler of the
> pCPU itself.
>
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> --

Reviewed-and-Tested-by: Meng Xu <mengxu@cis.upenn.edu>

I do have a minor comment about the patch, although it is not
important at all and it is not really about this patch...

> @@ -614,7 +612,8 @@ rt_deinit(struct scheduler *ops)
>  {
>      struct rt_private *prv = rt_priv(ops);
>
> -    kill_timer(prv->repl_timer);
> +    ASSERT(prv->repl_timer->status == TIMER_STATUS_invalid ||
> +           prv->repl_timer->status == TIMER_STATUS_killed);

I found in xen/timer.h, the comment after the definition of the
TIMER_STATUS_invalid is

                #define TIMER_STATUS_invalid  0 /* Should never see
this.           */

This comment is a little contrary to how the status is used here.
Actually, what does it exactly means by "Should never see this"?

This _invalid status is used in timer.h and it is the status when a
timer is initialized by init_timer().

So I'm thinking maybe this comment can be better improved to avoid confusion?

Anyway, this is just a comment and should not be a blocker, IMO. I
just want to raise it up since I saw it... :-)


===About the testing I did===
---Below is how I tested it---
I booted up two vcpus, created one cpupool for each type of
schedulers, and migrated them around.
The scripts to run the test cases can be found at
https://github.com/PennPanda/scripts/tree/master/xen/xen-test

---Below is the testing scenarios---
echo "start test case 1..."
xl cpupool-list
xl cpupool-destroy cpupool-credit
xl cpupool-destroy cpupool-credit2
xl cpupool-destroy cpupool-rtds
xl cpupool-create ${cpupool_credit_file}
xl cpupool-create ${cpupool_credit2_file}
xl cpupool-create ${cpupool_rtds_file}
# Add cpus to each cpupool
echo "Add CPUs to each cpupool"
for ((i=0;i<5; i+=1));do
xl cpupool-cpu-remove Pool-0 ${i}
done
echo "xl cpupool-cpu-add cpupool-credit 0"
xl cpupool-cpu-add cpupool-credit 0
echo "xl cpupool-cpu-add cpupool-credit2 1,2"
xl cpupool-cpu-add cpupool-credit2 1
xl cpupool-cpu-add cpupool-credit2 2
echo "xl cpupool-cpu-add cpupool-rtds 3,4"
xl cpupool-cpu-add cpupool-rtds 3
xl cpupool-cpu-add cpupool-rtds 4
xl cpupool-list -c
xl cpupool-list
# Migrate vm1 among cpupools
echo "Migrate ${vm1_name} among cpupools"
xl cpupool-migrate ${vm1_name} cpupool-rtds
xl cpupool-migrate ${vm1_name} cpupool-credit2
xl cpupool-migrate ${vm1_name} cpupool-rtds
xl cpupool-migrate ${vm1_name} cpupool-credit
xl cpupool-migrate ${vm1_name} cpupool-rtds

Thank you very much and best regards,

Meng

-- 
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 0/4] Assorted scheduling fixes
  2016-05-04 16:04 ` Wei Liu
@ 2016-05-07 21:23   ` Meng Xu
  0 siblings, 0 replies; 37+ messages in thread
From: Meng Xu @ 2016-05-07 21:23 UTC (permalink / raw)
  To: Wei Liu
  Cc: Dario Faggioli, Tianyang Chen, George Dunlap, Julien Grall,
	Varun.Swara, xen-devel, Steve Capper

Hi Wei,

On Wed, May 4, 2016 at 12:04 PM, Wei Liu <wei.liu2@citrix.com> wrote:
>
> On Tue, May 03, 2016 at 11:46:27PM +0200, Dario Faggioli wrote:
> > Hi,
> >
> > This small series contains some bugfixes for various schedulers. They're all
> > bugfixes, so I think all should be considered for 4.7. Here's some more
> > detailed analysis.
> >
> > Patch 1 and 3 are for Credit2. Patch 1 is a lot more important, as we have an
> > ASSERT triggering without it. Patch 2 is behavioral fixing, which I believe it
> > is important, but at least does not make anything explode.
> >
> > Patch 2 fixes another ASSERT, in case a pCPU fails to come up. This is what
> > Julien reported here:
> >
> >  https://www.mail-archive.com/xen-devel@lists.xen.org/msg65918.html
> >
> > Julien, the patch is very very similar to the one attached to one of my reply
> > in that thread, but I had to change some small bits... Can you please re-test
> > it?
> >
> > Patch 4 makes the code of RTDS look consistent with what we state in patch 2,
> > so it's also important. Furthermore, it does fix a bug (although, again, not
> > one that would splat Xen) as, without it, we may have a timer used by the RTDS
> > scheduler bound to the pCPU of another cpupool with another scheduler. That
> > would introduce some unwanted and very difficult to recognize interference
> > between different schedulers in different pool, and should hence be avoided.
> >
> > So this was awesomeness; about risks:
> >  - patch 1 is very small, super-self contained (zero impact outside of Credit2
> >    code) and it fixes an actual and 100% reproducible bug;
> >  - patch 2 is also totally self-contained and it can't possibly cause problems
> >    to anything else than to what it is trying to fix (Credit2's load balancer).
> >    It doesn't cure any ASSERT or Oops, so it's less interesting, but given the
> >    low risk --also considering that Credit2 will still be considered
> >    experimental in 4.7-- I think it can go in;
> >  - patch 3 is bigger, and a bit more complex. Note, however, that most of its
> >    content is code comments and ASSERT-s; it is self contained to scheduling
> >    (in the sense that it impacts all schedulers, but "just" them), and fixes
> >    a situation that, AFAIUI, is important for ARM;
>
> You meant patch 2 actually.
>
> For the first three patches:
>
> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
>
> >  - patch 4 may again look not that critical. But, the fact that someone wanting
> >    to experiment with RTDS in a cpupool would face the kind of interference
> >    between independent cpupools that the patch cures is, I think, something
> >    worthwhile trying to avoid.


Yes. It's better to avoid this type of interference.


>
> Besides, it is again quite self contained, as
> >    it's indeed only relevant for RTDS (which is also going to be called
> >    experimental for 4.7).


Yes. It should not affect other schedulers or other parts of the system.
Actually, it does not affect the logic in RTDS either.


> I will wait for Meng to review this one.


I just reviewed and tested this patch on my computer.
Thank you very much!

Best regards,

Meng

-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling
  2016-05-07 21:19   ` Meng Xu
@ 2016-05-08  3:12     ` Meng Xu
  2016-05-09  8:07       ` Juergen Gross
  2016-05-09 13:22     ` Dario Faggioli
  2016-05-09 14:46     ` George Dunlap
  2 siblings, 1 reply; 37+ messages in thread
From: Meng Xu @ 2016-05-08  3:12 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: jgross, xen-devel, Tianyang Chen, Wei Liu, George Dunlap

On Sat, May 7, 2016 at 5:19 PM, Meng Xu <mengxu@cis.upenn.edu> wrote:
> On Tue, May 3, 2016 at 5:46 PM, Dario Faggioli
> <dario.faggioli@citrix.com> wrote:
>>
>> The scheduling hooks API is now used properly, and no
>> initialization or de-initialization happen in
>> alloc/free_pdata any longer.
>>
>> In fact, just like it is for Credit2, there is no real
>> need for implementing alloc_pdata and free_pdata.
>>
>> This also made it possible to improve the replenishment
>> timer handling logic, such that now the timer is always
>> kept on one of the pCPU of the scheduler it's servicing.
>> Before this commit, in fact, even if the pCPU where the
>> timer happened to be initialized at creation time was
>> moved to another cpupool, the timer stayed there,
>> potentially inferfearing with the new scheduler of the
>> pCPU itself.
>>
>> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
>> --
>
> Reviewed-and-Tested-by: Meng Xu <mengxu@cis.upenn.edu>

>
> ---Below is the testing scenarios---
> echo "start test case 1..."
> xl cpupool-list
> xl cpupool-destroy cpupool-credit
> xl cpupool-destroy cpupool-credit2
> xl cpupool-destroy cpupool-rtds
> xl cpupool-create ${cpupool_credit_file}
> xl cpupool-create ${cpupool_credit2_file}
> xl cpupool-create ${cpupool_rtds_file}
> # Add cpus to each cpupool
> echo "Add CPUs to each cpupool"
> for ((i=0;i<5; i+=1));do
> xl cpupool-cpu-remove Pool-0 ${i}
> done
> echo "xl cpupool-cpu-add cpupool-credit 0"
> xl cpupool-cpu-add cpupool-credit 0
> echo "xl cpupool-cpu-add cpupool-credit2 1,2"
> xl cpupool-cpu-add cpupool-credit2 1
> xl cpupool-cpu-add cpupool-credit2 2
> echo "xl cpupool-cpu-add cpupool-rtds 3,4"
> xl cpupool-cpu-add cpupool-rtds 3
> xl cpupool-cpu-add cpupool-rtds 4
> xl cpupool-list -c
> xl cpupool-list
> # Migrate vm1 among cpupools
> echo "Migrate ${vm1_name} among cpupools"
> xl cpupool-migrate ${vm1_name} cpupool-rtds
> xl cpupool-migrate ${vm1_name} cpupool-credit2
> xl cpupool-migrate ${vm1_name} cpupool-rtds
> xl cpupool-migrate ${vm1_name} cpupool-credit
> xl cpupool-migrate ${vm1_name} cpupool-rtds
>

I forgot one thing in the previous email.
When I tried to migrate Domain-0 from the Pool-0 (with rtds or credit
scheduler) to another newly created pool, say cpupool-credit, it
always fails.

This situation happens even when I boot into credit scheduler and try
to migrate Domain-0 to another cpupool.

I'm wondering if Domain-0 can be migrated among cpupools?
From the Xen wiki: http://wiki.xenproject.org/wiki/Cpupools_Howto, it
seems Domain-0 can be migrated....

Thanks,

Meng

-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling
  2016-05-08  3:12     ` Meng Xu
@ 2016-05-09  8:07       ` Juergen Gross
  0 siblings, 0 replies; 37+ messages in thread
From: Juergen Gross @ 2016-05-09  8:07 UTC (permalink / raw)
  To: Meng Xu, Dario Faggioli; +Cc: xen-devel, Tianyang Chen, Wei Liu, George Dunlap

On 08/05/16 05:12, Meng Xu wrote:
> On Sat, May 7, 2016 at 5:19 PM, Meng Xu <mengxu@cis.upenn.edu> wrote:
>> On Tue, May 3, 2016 at 5:46 PM, Dario Faggioli
>> <dario.faggioli@citrix.com> wrote:
>>>
>>> The scheduling hooks API is now used properly, and no
>>> initialization or de-initialization happen in
>>> alloc/free_pdata any longer.
>>>
>>> In fact, just like it is for Credit2, there is no real
>>> need for implementing alloc_pdata and free_pdata.
>>>
>>> This also made it possible to improve the replenishment
>>> timer handling logic, such that now the timer is always
>>> kept on one of the pCPU of the scheduler it's servicing.
>>> Before this commit, in fact, even if the pCPU where the
>>> timer happened to be initialized at creation time was
>>> moved to another cpupool, the timer stayed there,
>>> potentially inferfearing with the new scheduler of the
>>> pCPU itself.
>>>
>>> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
>>> --
>>
>> Reviewed-and-Tested-by: Meng Xu <mengxu@cis.upenn.edu>
> 
>>
>> ---Below is the testing scenarios---
>> echo "start test case 1..."
>> xl cpupool-list
>> xl cpupool-destroy cpupool-credit
>> xl cpupool-destroy cpupool-credit2
>> xl cpupool-destroy cpupool-rtds
>> xl cpupool-create ${cpupool_credit_file}
>> xl cpupool-create ${cpupool_credit2_file}
>> xl cpupool-create ${cpupool_rtds_file}
>> # Add cpus to each cpupool
>> echo "Add CPUs to each cpupool"
>> for ((i=0;i<5; i+=1));do
>> xl cpupool-cpu-remove Pool-0 ${i}
>> done
>> echo "xl cpupool-cpu-add cpupool-credit 0"
>> xl cpupool-cpu-add cpupool-credit 0
>> echo "xl cpupool-cpu-add cpupool-credit2 1,2"
>> xl cpupool-cpu-add cpupool-credit2 1
>> xl cpupool-cpu-add cpupool-credit2 2
>> echo "xl cpupool-cpu-add cpupool-rtds 3,4"
>> xl cpupool-cpu-add cpupool-rtds 3
>> xl cpupool-cpu-add cpupool-rtds 4
>> xl cpupool-list -c
>> xl cpupool-list
>> # Migrate vm1 among cpupools
>> echo "Migrate ${vm1_name} among cpupools"
>> xl cpupool-migrate ${vm1_name} cpupool-rtds
>> xl cpupool-migrate ${vm1_name} cpupool-credit2
>> xl cpupool-migrate ${vm1_name} cpupool-rtds
>> xl cpupool-migrate ${vm1_name} cpupool-credit
>> xl cpupool-migrate ${vm1_name} cpupool-rtds
>>
> 
> I forgot one thing in the previous email.
> When I tried to migrate Domain-0 from the Pool-0 (with rtds or credit
> scheduler) to another newly created pool, say cpupool-credit, it
> always fails.
> 
> This situation happens even when I boot into credit scheduler and try
> to migrate Domain-0 to another cpupool.
> 
> I'm wondering if Domain-0 can be migrated among cpupools?
> From the Xen wiki: http://wiki.xenproject.org/wiki/Cpupools_Howto, it
> seems Domain-0 can be migrated....

It can't. Domain-0 is always member of Pool-0.

I think at least an update of the xl man page would be a good idea.
I'll do a patch.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling
  2016-05-07 21:19   ` Meng Xu
  2016-05-08  3:12     ` Meng Xu
@ 2016-05-09 13:22     ` Dario Faggioli
  2016-05-09 14:08       ` Meng Xu
  2016-05-09 14:46     ` George Dunlap
  2 siblings, 1 reply; 37+ messages in thread
From: Dario Faggioli @ 2016-05-09 13:22 UTC (permalink / raw)
  To: Meng Xu; +Cc: xen-devel, Tianyang Chen, Wei Liu, George Dunlap


[-- Attachment #1.1: Type: text/plain, Size: 2852 bytes --]

On Sat, 2016-05-07 at 23:19 +0200, Meng Xu wrote:
> On Tue, May 3, 2016 at 5:46 PM, Dario Faggioli
> <dario.faggioli@citrix.com> wrote:
> > 
> > 
> > The scheduling hooks API is now used properly, and no
> > initialization or de-initialization happen in
> > alloc/free_pdata any longer.
> > 
> > In fact, just like it is for Credit2, there is no real
> > need for implementing alloc_pdata and free_pdata.
> > 
> > This also made it possible to improve the replenishment
> > timer handling logic, such that now the timer is always
> > kept on one of the pCPU of the scheduler it's servicing.
> > Before this commit, in fact, even if the pCPU where the
> > timer happened to be initialized at creation time was
> > moved to another cpupool, the timer stayed there,
> > potentially inferfearing with the new scheduler of the
> > pCPU itself.
> > 
> > Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> > --
> Reviewed-and-Tested-by: Meng Xu <mengxu@cis.upenn.edu>
> 
Thanks! :-)

> I do have a minor comment about the patch, although it is not
> important at all and it is not really about this patch...
> 
> > 
> > @@ -614,7 +612,8 @@ rt_deinit(struct scheduler *ops)
> >  {
> >      struct rt_private *prv = rt_priv(ops);
> > 
> > -    kill_timer(prv->repl_timer);
> > +    ASSERT(prv->repl_timer->status == TIMER_STATUS_invalid ||
> > +           prv->repl_timer->status == TIMER_STATUS_killed);
> I found in xen/timer.h, the comment after the definition of the
> TIMER_STATUS_invalid is
> 
>                 #define TIMER_STATUS_invalid  0 /* Should never see
> this.           */
> 
> This comment is a little contrary to how the status is used here.
> Actually, what does it exactly means by "Should never see this"?
> 
> This _invalid status is used in timer.h and it is the status when a
> timer is initialized by init_timer().
> 
As far as my understanding goes, this means that a timer, during its
operations, should never be found in this state.

In fact, this mark a situation where the timer has been allocated but
never initialized, and there are ASSERT()s around to enforce that.

However, if what one wants is _exactly_ to check whether the timer has
been allocated ut not initialized, I don't see why I can't use this.

> So I'm thinking maybe this comment can be better improved to avoid
> confusion?
> 
I don't think things are confusing, neither right now, nor after this
patch, but I'm open to others' opinion. :-)

Thanks and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling
  2016-05-09 13:22     ` Dario Faggioli
@ 2016-05-09 14:08       ` Meng Xu
  2016-05-09 14:52         ` Dario Faggioli
  0 siblings, 1 reply; 37+ messages in thread
From: Meng Xu @ 2016-05-09 14:08 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Tianyang Chen, Wei Liu, George Dunlap

>
>
>
> > I do have a minor comment about the patch, although it is not
> > important at all and it is not really about this patch...
> >
> > >
> > > @@ -614,7 +612,8 @@ rt_deinit(struct scheduler *ops)
> > >  {
> > >      struct rt_private *prv = rt_priv(ops);
> > >
> > > -    kill_timer(prv->repl_timer);
> > > +    ASSERT(prv->repl_timer->status == TIMER_STATUS_invalid ||
> > > +           prv->repl_timer->status == TIMER_STATUS_killed);
> > I found in xen/timer.h, the comment after the definition of the
> > TIMER_STATUS_invalid is
> >
> >                 #define TIMER_STATUS_invalid  0 /* Should never see
> > this.           */
> >
> > This comment is a little contrary to how the status is used here.
> > Actually, what does it exactly means by "Should never see this"?
> >
> > This _invalid status is used in timer.h and it is the status when a
> > timer is initialized by init_timer().
> >
> As far as my understanding goes, this means that a timer, during its
> operations, should never be found in this state.
>
> In fact, this mark a situation where the timer has been allocated but
> never initialized, and there are ASSERT()s around to enforce that.
>
> However, if what one wants is _exactly_ to check whether the timer has
> been allocated ut not initialized, I don't see why I can't use this.



You can use this. Actually, I agree with how you used this here.
Actually, this is also how the existing init_timer() uses it.


>
>
> > So I'm thinking maybe this comment can be better improved to avoid
> > confusion?
> >
> I don't think things are confusing, neither right now, nor after this
> patch, but I'm open to others' opinion. :-)


Hmm, I won't get confused with the comment from now on, but I'm unsure
if someone else will or not. The tricky thing is when I know it, I
won't feel weird. However, when I first read it, I feel a little
confusing if not reading the other parts of the code related to this
macro.

Anyway, I'm ok with either way: change the comment or not.

Best Regards,

Meng


-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 1/4] xen: sched: avoid spuriously re-enabling IRQs in csched2_switch_sched()
  2016-05-06 13:21             ` Dario Faggioli
  2016-05-06 13:48               ` Wei Liu
@ 2016-05-09 14:42               ` George Dunlap
  1 sibling, 0 replies; 37+ messages in thread
From: George Dunlap @ 2016-05-09 14:42 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Wei Liu

On Fri, May 6, 2016 at 2:21 PM, Dario Faggioli
<dario.faggioli@citrix.com> wrote:
> On Wed, 2016-05-04 at 18:34 +0100, George Dunlap wrote:
>> On 04/05/16 18:21, Dario Faggioli wrote:
>> >
>> > After all, I'm fine with an ASSERT() too, but then I think we
>> > should
>> > add one to the same effect to csched_switch_sched() too.
>> Well an ASSERT() is sort of like a comment, in that if you see
>> ASSERT(irqs_disabled()), you know there's no need to save irqs
>> because
>> they should already disabled.  But it has the advantage that osstest
>> will be able to "read" it once we get some proper cpupool tests for
>> osstest. :-)
>>
>> If we weren't in the feature freeze, I'd definitely favor adding an
>> ASSERT to credit1.  As it is, I think either way (adding now or
>> waiting
>> until the 4.8 development window) should be fine.
>>
> Ok, here you go (inline and attached) the patch with ASSERT()-s both in
> Credit2 and Credit1 (despite the freeze, I think it's the best thing to
> do, see the changelog).
>
> Thanks and Regards,
> Dario
> --
> commit cbabd44e171d0bd2169f1c7100e69a9e48289980
> Author: Dario Faggioli <dario.faggioli@citrix.com>
> Date:   Tue Apr 26 18:56:56 2016 +0200
>
>     xen: sched: avoid spuriously re-enabling IRQs in csched2_switch_sched()
>
>     interrupts are already disabled when calling the hook
>     (from schedule_cpu_switch()), so we must use spin_lock()
>     and spin_unlock().
>
>     Add an ASSERT(), so we will notice if this code and its
>     caller get out of sync with respect to disabling interrupts
>     (and add one at the same exact occurrence of this pattern
>     in Credit1 too)
>
>     Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

Reviewed-by: George Dunlap <george.dunlap@citrix.com>

And queued, thanks.

 -George

>     ---
>     Cc: George Dunlap <george.dunlap@citrix.com>
>     Cc: Wei Liu <wei.liu2@citrix.com>
>     --
>     Changes from v1:
>      * add the ASSERT(), as requested by George
>      * add the ASSERT in Credit1 too
>     --
>     For Wei:
>      - the Credit2 spin_lock_irq()-->spin_lock() change needs
>        to go in, as it fixes a bug;
>      - adding the ASSERT was requested during review;
>      - adding the ASSERT in Credit1 is not strictly necessary,
>        but imptoves code quality and consistency at zero cost
>        and risk, so I think we should just go for it now, instead
>        of waitign for 4.8 (it's basically like I'm adding a
>        comment!).
>
> diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
> index db4d42a..a38a63d 100644
> --- a/xen/common/sched_credit.c
> +++ b/xen/common/sched_credit.c
> @@ -615,6 +615,7 @@ csched_switch_sched(struct scheduler *new_ops, unsigned int cpu,
>       * schedule_cpu_switch()). It actually may or may not be the 'right'
>       * one for this cpu, but that is ok for preventing races.
>       */
> +    ASSERT(!local_irq_is_enabled());
>      spin_lock(&prv->lock);
>      init_pdata(prv, pdata, cpu);
>      spin_unlock(&prv->lock);
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index f3b62ac..f95e509 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -2238,7 +2238,8 @@ csched2_switch_sched(struct scheduler *new_ops, unsigned int cpu,
>       * And owning exactly that one (the lock of the old scheduler of this
>       * cpu) is what is necessary to prevent races.
>       */
> -    spin_lock_irq(&prv->lock);
> +    ASSERT(!local_irq_is_enabled());
> +    spin_lock(&prv->lock);
>
>      idle_vcpu[cpu]->sched_priv = vdata;
>
> @@ -2263,7 +2264,7 @@ csched2_switch_sched(struct scheduler *new_ops, unsigned int cpu,
>      smp_mb();
>      per_cpu(schedule_data, cpu).schedule_lock = &prv->rqd[rqi].lock;
>
> -    spin_unlock_irq(&prv->lock);
> +    spin_unlock(&prv->lock);
>  }
>
>  static void
> --
> <<This happens because I choose it to happen!>> (Raistlin Majere)
> -----------------------------------------------------------------
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling
  2016-05-07 21:19   ` Meng Xu
  2016-05-08  3:12     ` Meng Xu
  2016-05-09 13:22     ` Dario Faggioli
@ 2016-05-09 14:46     ` George Dunlap
  2016-05-09 14:58       ` Wei Liu
  2 siblings, 1 reply; 37+ messages in thread
From: George Dunlap @ 2016-05-09 14:46 UTC (permalink / raw)
  To: Meng Xu; +Cc: xen-devel, Dario Faggioli, Tianyang Chen, Wei Liu

On Sat, May 7, 2016 at 10:19 PM, Meng Xu <mengxu@cis.upenn.edu> wrote:
> On Tue, May 3, 2016 at 5:46 PM, Dario Faggioli
> <dario.faggioli@citrix.com> wrote:
>>
>> The scheduling hooks API is now used properly, and no
>> initialization or de-initialization happen in
>> alloc/free_pdata any longer.
>>
>> In fact, just like it is for Credit2, there is no real
>> need for implementing alloc_pdata and free_pdata.
>>
>> This also made it possible to improve the replenishment
>> timer handling logic, such that now the timer is always
>> kept on one of the pCPU of the scheduler it's servicing.
>> Before this commit, in fact, even if the pCPU where the
>> timer happened to be initialized at creation time was
>> moved to another cpupool, the timer stayed there,
>> potentially inferfearing with the new scheduler of the
>> pCPU itself.
>>
>> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
>> --
>
> Reviewed-and-Tested-by: Meng Xu <mengxu@cis.upenn.edu>

And on that basis:

Acked-by: George Dunlap <george.dunlap@citrix.com>

But it looks like it still needs a release ack?

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling
  2016-05-09 14:08       ` Meng Xu
@ 2016-05-09 14:52         ` Dario Faggioli
  2016-05-09 14:58           ` Meng Xu
  0 siblings, 1 reply; 37+ messages in thread
From: Dario Faggioli @ 2016-05-09 14:52 UTC (permalink / raw)
  To: Meng Xu; +Cc: xen-devel, Tianyang Chen, Wei Liu, George Dunlap


[-- Attachment #1.1: Type: text/plain, Size: 1164 bytes --]

On Mon, 2016-05-09 at 10:08 -0400, Meng Xu wrote:
> > I don't think things are confusing, neither right now, nor after
> > this
> > patch, but I'm open to others' opinion. :-)
> 
> Hmm, I won't get confused with the comment from now on, but I'm
> unsure
> if someone else will or not. The tricky thing is when I know it, I
> won't feel weird. However, when I first read it, I feel a little
> confusing if not reading the other parts of the code related to this
> macro.
> 
I don't feel the same, but I understand the concern.

I think we have two options here:
 1. we just do nothing;
 2. you send a patch that, according to your best judgement, improve 
    things (as we all do all the time! :-P).

:-D

> Anyway, I'm ok with either way: change the comment or not.
> 
Me too, and in fact, I'm not changing it, but I won't stop you tryingto
do so. :-)

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling
  2016-05-09 14:46     ` George Dunlap
@ 2016-05-09 14:58       ` Wei Liu
  2016-05-09 15:35         ` George Dunlap
  0 siblings, 1 reply; 37+ messages in thread
From: Wei Liu @ 2016-05-09 14:58 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, Dario Faggioli, Tianyang Chen, Wei Liu, Meng Xu

On Mon, May 09, 2016 at 03:46:23PM +0100, George Dunlap wrote:
> On Sat, May 7, 2016 at 10:19 PM, Meng Xu <mengxu@cis.upenn.edu> wrote:
> > On Tue, May 3, 2016 at 5:46 PM, Dario Faggioli
> > <dario.faggioli@citrix.com> wrote:
> >>
> >> The scheduling hooks API is now used properly, and no
> >> initialization or de-initialization happen in
> >> alloc/free_pdata any longer.
> >>
> >> In fact, just like it is for Credit2, there is no real
> >> need for implementing alloc_pdata and free_pdata.
> >>
> >> This also made it possible to improve the replenishment
> >> timer handling logic, such that now the timer is always
> >> kept on one of the pCPU of the scheduler it's servicing.
> >> Before this commit, in fact, even if the pCPU where the
> >> timer happened to be initialized at creation time was
> >> moved to another cpupool, the timer stayed there,
> >> potentially inferfearing with the new scheduler of the
> >> pCPU itself.
> >>
> >> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> >> --
> >
> > Reviewed-and-Tested-by: Meng Xu <mengxu@cis.upenn.edu>
> 
> And on that basis:
> 
> Acked-by: George Dunlap <george.dunlap@citrix.com>
> 
> But it looks like it still needs a release ack?
> 

Release-acked-by: Wei Liu <wei.liu2@citrix.com>

>  -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling
  2016-05-09 14:52         ` Dario Faggioli
@ 2016-05-09 14:58           ` Meng Xu
  0 siblings, 0 replies; 37+ messages in thread
From: Meng Xu @ 2016-05-09 14:58 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Tianyang Chen, Wei Liu, George Dunlap

On Mon, May 9, 2016 at 10:52 AM, Dario Faggioli
<dario.faggioli@citrix.com> wrote:
> On Mon, 2016-05-09 at 10:08 -0400, Meng Xu wrote:
>> > I don't think things are confusing, neither right now, nor after
>> > this
>> > patch, but I'm open to others' opinion. :-)
>>
>> Hmm, I won't get confused with the comment from now on, but I'm
>> unsure
>> if someone else will or not. The tricky thing is when I know it, I
>> won't feel weird. However, when I first read it, I feel a little
>> confusing if not reading the other parts of the code related to this
>> macro.
>>
> I don't feel the same, but I understand the concern.
>
> I think we have two options here:
>  1. we just do nothing;
>  2. you send a patch that, according to your best judgement, improve
>     things (as we all do all the time! :-P).
>
> :-D
>
>> Anyway, I'm ok with either way: change the comment or not.
>>
> Me too, and in fact, I'm not changing it, but I won't stop you tryingto
> do so. :-)
>

OK. I can do it... But is just one comment line change too small to be
a patch? ;-)

Thanks,

Meng

-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling
  2016-05-09 14:58       ` Wei Liu
@ 2016-05-09 15:35         ` George Dunlap
  0 siblings, 0 replies; 37+ messages in thread
From: George Dunlap @ 2016-05-09 15:35 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, Dario Faggioli, Tianyang Chen, Meng Xu

On Mon, May 9, 2016 at 3:58 PM, Wei Liu <wei.liu2@citrix.com> wrote:
> On Mon, May 09, 2016 at 03:46:23PM +0100, George Dunlap wrote:
>> On Sat, May 7, 2016 at 10:19 PM, Meng Xu <mengxu@cis.upenn.edu> wrote:
>> > On Tue, May 3, 2016 at 5:46 PM, Dario Faggioli
>> > <dario.faggioli@citrix.com> wrote:
>> >>
>> >> The scheduling hooks API is now used properly, and no
>> >> initialization or de-initialization happen in
>> >> alloc/free_pdata any longer.
>> >>
>> >> In fact, just like it is for Credit2, there is no real
>> >> need for implementing alloc_pdata and free_pdata.
>> >>
>> >> This also made it possible to improve the replenishment
>> >> timer handling logic, such that now the timer is always
>> >> kept on one of the pCPU of the scheduler it's servicing.
>> >> Before this commit, in fact, even if the pCPU where the
>> >> timer happened to be initialized at creation time was
>> >> moved to another cpupool, the timer stayed there,
>> >> potentially inferfearing with the new scheduler of the
>> >> pCPU itself.
>> >>
>> >> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
>> >> --
>> >
>> > Reviewed-and-Tested-by: Meng Xu <mengxu@cis.upenn.edu>
>>
>> And on that basis:
>>
>> Acked-by: George Dunlap <george.dunlap@citrix.com>
>>
>> But it looks like it still needs a release ack?
>>
>
> Release-acked-by: Wei Liu <wei.liu2@citrix.com>

And pushed.  Thanks.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2016-05-09 15:35 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-03 21:46 [PATCH for 4.7 0/4] Assorted scheduling fixes Dario Faggioli
2016-05-03 21:46 ` [PATCH for 4.7 1/4] xen: sched: avoid spuriously re-enabling IRQs in csched2_switch_sched() Dario Faggioli
2016-05-04  8:48   ` Jan Beulich
2016-05-04  9:08     ` Dario Faggioli
2016-05-04 15:11   ` George Dunlap
2016-05-04 15:58     ` Dario Faggioli
2016-05-04 17:05       ` George Dunlap
2016-05-04 17:21         ` Dario Faggioli
2016-05-04 17:34           ` George Dunlap
2016-05-06 13:21             ` Dario Faggioli
2016-05-06 13:48               ` Wei Liu
2016-05-09 14:42               ` George Dunlap
2016-05-03 21:46 ` [PATCH for 4.7 2/4] xen: sched: fix killing an uninitialized timer in free_pdata Dario Faggioli
2016-05-04 15:25   ` George Dunlap
2016-05-03 21:46 ` [PATCH for 4.7 3/4] xen: credit2: fix 2 (minor) issues in load tracking logic Dario Faggioli
2016-05-04 15:38   ` George Dunlap
2016-05-03 21:46 ` [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling Dario Faggioli
2016-05-04 15:51   ` George Dunlap
2016-05-04 15:53     ` Meng Xu
2016-05-06 23:05       ` Dario Faggioli
2016-05-07 21:19   ` Meng Xu
2016-05-08  3:12     ` Meng Xu
2016-05-09  8:07       ` Juergen Gross
2016-05-09 13:22     ` Dario Faggioli
2016-05-09 14:08       ` Meng Xu
2016-05-09 14:52         ` Dario Faggioli
2016-05-09 14:58           ` Meng Xu
2016-05-09 14:46     ` George Dunlap
2016-05-09 14:58       ` Wei Liu
2016-05-09 15:35         ` George Dunlap
2016-05-04  1:26 ` [PATCH for 4.7 0/4] Assorted scheduling fixes Konrad Rzeszutek Wilk
2016-05-04  9:06   ` Dario Faggioli
2016-05-05 12:00     ` Julien Grall
2016-05-05 12:38       ` Dario Faggioli
2016-05-04 15:53 ` George Dunlap
2016-05-04 16:04 ` Wei Liu
2016-05-07 21:23   ` Meng Xu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.