All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] xen/arm: Fix rtds scheduler for arm
@ 2015-01-30 12:30 Denis Drozdov
  2015-01-30 12:38 ` Jan Beulich
  0 siblings, 1 reply; 24+ messages in thread
From: Denis Drozdov @ 2015-01-30 12:30 UTC (permalink / raw)
  To: xen-devel; +Cc: george.dunlap, denys drozdov

From: denys drozdov <denys.drozdov@globallogic.com>

Change-Id: I9b315f213775b8410fe75cd96968dcb213ea287b
Signed-off-by: denys drozdov <denys.drozdov@globallogic.com>
---
 xen/common/sched_rt.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
index e70d6c7..1ab0a62 100644
--- a/xen/common/sched_rt.c
+++ b/xen/common/sched_rt.c
@@ -1010,8 +1010,9 @@ rt_context_saved(const struct scheduler *ops, struct vcpu *vc)
     struct rt_vcpu *snext = NULL;
     struct rt_dom *sdom = NULL;
     struct rt_private *prv = rt_priv(ops);
-    cpumask_t *online;
-    spinlock_t *lock = vcpu_schedule_lock_irq(vc);
+    cpumask_t *online; 
+    unsigned long flags;
+    spinlock_t *lock = vcpu_schedule_lock_irqsave(vc, &flags);
 
     clear_bit(__RTDS_scheduled, &svc->flags);
     /* not insert idle vcpu to runq */
@@ -1032,7 +1033,7 @@ rt_context_saved(const struct scheduler *ops, struct vcpu *vc)
         runq_tickle(ops, snext);
     }
 out:
-    vcpu_schedule_unlock_irq(lock, vc);
+    vcpu_schedule_unlock_irqrestore(lock, flags, vc);
 }
 
 /*
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] xen/arm: Fix rtds scheduler for arm
  2015-01-30 12:30 [PATCH] xen/arm: Fix rtds scheduler for arm Denis Drozdov
@ 2015-01-30 12:38 ` Jan Beulich
  2015-01-30 15:40   ` Denis Drozdov
  0 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2015-01-30 12:38 UTC (permalink / raw)
  To: denys drozdov; +Cc: george.dunlap, xen-devel

>>> On 30.01.15 at 13:30, <denys.drozdov@globallogic.com> wrote:
> From: denys drozdov <denys.drozdov@globallogic.com>

There's a description missing here of _what_ (case) you are fixing.

> Change-Id: I9b315f213775b8410fe75cd96968dcb213ea287b

And the purpose of this line is unclear.

> --- a/xen/common/sched_rt.c
> +++ b/xen/common/sched_rt.c
> @@ -1010,8 +1010,9 @@ rt_context_saved(const struct scheduler *ops, struct 
> vcpu *vc)
>      struct rt_vcpu *snext = NULL;
>      struct rt_dom *sdom = NULL;
>      struct rt_private *prv = rt_priv(ops);
> -    cpumask_t *online;
> -    spinlock_t *lock = vcpu_schedule_lock_irq(vc);
> +    cpumask_t *online; 

You're introducing a trailing blank here.

Jan

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH] xen/arm: Fix rtds scheduler for arm
  2015-01-30 12:38 ` Jan Beulich
@ 2015-01-30 15:40   ` Denis Drozdov
  2015-01-30 15:46     ` Julien Grall
  0 siblings, 1 reply; 24+ messages in thread
From: Denis Drozdov @ 2015-01-30 15:40 UTC (permalink / raw)
  To: JBeulich, xen-devel, george.dunlap; +Cc: denys drozdov

From: denys drozdov <denys.drozdov@globallogic.com>

Update RT scheduler to run on arm platform

Signed-off-by: denys drozdov <denys.drozdov@globallogic.com>
---
 xen/common/sched_rt.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
index e70d6c7..5fcc205 100644
--- a/xen/common/sched_rt.c
+++ b/xen/common/sched_rt.c
@@ -1011,7 +1011,8 @@ rt_context_saved(const struct scheduler *ops, struct vcpu *vc)
     struct rt_dom *sdom = NULL;
     struct rt_private *prv = rt_priv(ops);
     cpumask_t *online;
-    spinlock_t *lock = vcpu_schedule_lock_irq(vc);
+    unsigned long flags;
+    spinlock_t *lock = vcpu_schedule_lock_irqsave(vc, &flags);
 
     clear_bit(__RTDS_scheduled, &svc->flags);
     /* not insert idle vcpu to runq */
@@ -1032,7 +1033,7 @@ rt_context_saved(const struct scheduler *ops, struct vcpu *vc)
         runq_tickle(ops, snext);
     }
 out:
-    vcpu_schedule_unlock_irq(lock, vc);
+    vcpu_schedule_unlock_irqrestore(lock, flags, vc);
 }
 
 /*
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] xen/arm: Fix rtds scheduler for arm
  2015-01-30 15:40   ` Denis Drozdov
@ 2015-01-30 15:46     ` Julien Grall
  2015-01-30 15:56       ` Andrew Cooper
  0 siblings, 1 reply; 24+ messages in thread
From: Julien Grall @ 2015-01-30 15:46 UTC (permalink / raw)
  To: Denis Drozdov, JBeulich, xen-devel, george.dunlap

Hello Denys,

On 30/01/15 15:40, Denis Drozdov wrote:
> From: denys drozdov <denys.drozdov@globallogic.com>
> 
> Update RT scheduler to run on arm platform

You need to give more background of the problem (i.e why you have to
disable the IRQ on ARM).

As the scheduler is common with x86, I would expect the problem is also
happening on this architecture.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] xen/arm: Fix rtds scheduler for arm
  2015-01-30 15:46     ` Julien Grall
@ 2015-01-30 15:56       ` Andrew Cooper
  2015-01-30 16:10         ` Julien Grall
  0 siblings, 1 reply; 24+ messages in thread
From: Andrew Cooper @ 2015-01-30 15:56 UTC (permalink / raw)
  To: Julien Grall, Denis Drozdov, JBeulich, xen-devel, george.dunlap

On 30/01/15 15:46, Julien Grall wrote:
> Hello Denys,
>
> On 30/01/15 15:40, Denis Drozdov wrote:
>> From: denys drozdov <denys.drozdov@globallogic.com>
>>
>> Update RT scheduler to run on arm platform
> You need to give more background of the problem (i.e why you have to
> disable the IRQ on ARM).
>
> As the scheduler is common with x86, I would expect the problem is also
> happening on this architecture.

Changing a spinlock_irq into an irqsave is safe, functionally speaking,
but it is concerning that the scheduler appears to be called in
different interrupt states between architectures.

~Andrew

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] xen/arm: Fix rtds scheduler for arm
  2015-01-30 15:56       ` Andrew Cooper
@ 2015-01-30 16:10         ` Julien Grall
  2015-01-30 16:19           ` Denys Drozdov
  0 siblings, 1 reply; 24+ messages in thread
From: Julien Grall @ 2015-01-30 16:10 UTC (permalink / raw)
  To: Andrew Cooper, Denis Drozdov, JBeulich, xen-devel, george.dunlap

On 30/01/15 15:56, Andrew Cooper wrote:
> On 30/01/15 15:46, Julien Grall wrote:
>> Hello Denys,
>>
>> On 30/01/15 15:40, Denis Drozdov wrote:
>>> From: denys drozdov <denys.drozdov@globallogic.com>
>>>
>>> Update RT scheduler to run on arm platform
>> You need to give more background of the problem (i.e why you have to
>> disable the IRQ on ARM).
>>
>> As the scheduler is common with x86, I would expect the problem is also
>> happening on this architecture.
> 
> Changing a spinlock_irq into an irqsave is safe, functionally speaking,
> but it is concerning that the scheduler appears to be called in
> different interrupt states between architectures.

For instance credit2 is also using vcpu_schedule_lock_irq... Although,
IIRC, it's not used by default. So it has to be fixed too on ARM.

In any case, the commit message needs more background such as stack
trace and/or explaining why we have to take spinlock_irq.

This may also need to document context_saved to explain the IRQs may not
be enabled when this function is called.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] xen/arm: Fix rtds scheduler for arm
  2015-01-30 16:10         ` Julien Grall
@ 2015-01-30 16:19           ` Denys Drozdov
  2015-01-30 16:29             ` Julien Grall
  2015-01-31 10:50             ` Ian Campbell
  0 siblings, 2 replies; 24+ messages in thread
From: Denys Drozdov @ 2015-01-30 16:19 UTC (permalink / raw)
  To: Julien Grall; +Cc: George Dunlap, Andrew Cooper, JBeulich, xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 2032 bytes --]

Julien,

The details of this issue can be found in the current mailing list.
Please have a look at
http://lists.xen.org/archives/html/xen-devel/2014-09/msg04250.html.
This is exactly the same behaviour we are observing when running on our arm
setup.

The fix changes spin_lock_irq/spin_unlock_irq to
spin_lock_irq_save/spin_lock_irq_restore,
since context_save executed right from IRQ level. The arm interrupt
handling differs from x86. ARM is handling  context_saved with IRQ disabled
in CPSR, though x86 has IRQs on. Thus it is failing on ASSERT inside
spin_lock_irq when running on ARM. I guess it should work on x86, so this
issue is ARM platform specific.

On Fri, Jan 30, 2015 at 6:10 PM, Julien Grall <julien.grall@linaro.org>
wrote:

> On 30/01/15 15:56, Andrew Cooper wrote:
> > On 30/01/15 15:46, Julien Grall wrote:
> >> Hello Denys,
> >>
> >> On 30/01/15 15:40, Denis Drozdov wrote:
> >>> From: denys drozdov <denys.drozdov@globallogic.com>
> >>>
> >>> Update RT scheduler to run on arm platform
> >> You need to give more background of the problem (i.e why you have to
> >> disable the IRQ on ARM).
> >>
> >> As the scheduler is common with x86, I would expect the problem is also
> >> happening on this architecture.
> >
> > Changing a spinlock_irq into an irqsave is safe, functionally speaking,
> > but it is concerning that the scheduler appears to be called in
> > different interrupt states between architectures.
>
> For instance credit2 is also using vcpu_schedule_lock_irq... Although,
> IIRC, it's not used by default. So it has to be fixed too on ARM.
>
> In any case, the commit message needs more background such as stack
> trace and/or explaining why we have to take spinlock_irq.
>
> This may also need to document context_saved to explain the IRQs may not
> be enabled when this function is called.
>
> Regards,
>
> --
> Julien Grall
>



-- 

Denis Drozdov
GlobalLogic
M +(380)988969537  S denis.drozdov
www.globallogic.com
<http://www.globallogic.com/>
http://www.globallogic.com/email_disclaimer.txt

[-- Attachment #1.2: Type: text/html, Size: 4742 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] xen/arm: Fix rtds scheduler for arm
  2015-01-30 16:19           ` Denys Drozdov
@ 2015-01-30 16:29             ` Julien Grall
  2015-01-31 10:50             ` Ian Campbell
  1 sibling, 0 replies; 24+ messages in thread
From: Julien Grall @ 2015-01-30 16:29 UTC (permalink / raw)
  To: Denys Drozdov; +Cc: George Dunlap, Andrew Cooper, JBeulich, xen-devel

On 30/01/15 16:19, Denys Drozdov wrote:
> Julien,
> 
> The details of this issue can be found in the current mailing list.
> Please have a look
> athttp://lists.xen.org/archives/html/xen-devel/2014-09/msg04250.html
> <http://lists.xen.org/archives/html/xen-devel/2014-09/msg04250.html>.
> This is exactly the same behaviour we are observing when running on our
> arm setup.

I understood the problem... I just pointed out that the commit message
is not useful. Hence the request of more background.

If we have to read the log in a couple of years time. We should be able
to understand the problem/solution with the commit message.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] xen/arm: Fix rtds scheduler for arm
  2015-01-30 16:19           ` Denys Drozdov
  2015-01-30 16:29             ` Julien Grall
@ 2015-01-31 10:50             ` Ian Campbell
  2015-02-02 10:49               ` Denys Drozdov
  1 sibling, 1 reply; 24+ messages in thread
From: Ian Campbell @ 2015-01-31 10:50 UTC (permalink / raw)
  To: Denys Drozdov
  Cc: George Dunlap, Andrew Cooper, Julien Grall, JBeulich, xen-devel

On Fri, 2015-01-30 at 18:19 +0200, Denys Drozdov wrote:

> since context_save executed right from IRQ level. The arm interrupt
> handling differs from x86. ARM is handling  context_saved with IRQ
> disabled in CPSR, though x86 has IRQs on. Thus it is failing on ASSERT
> inside spin_lock_irq when running on ARM. I guess it should work on
> x86, so this issue is ARM platform specific.

FWIW I was waiting for it to happen to a xen-unstable run but the latest
osstest gate run at
http://www.chiark.greenend.org.uk/~xensrcts/logs/33915/ which included
Dario's patches to rationalize the schedulr tests vs. archs also
resulted in a similar sounding failure on credit2:
http://www.chiark.greenend.org.uk/~xensrcts/logs/33915/test-armhf-armhf-xl-credit2/info.html
http://www.chiark.greenend.org.uk/~xensrcts/logs/33915/test-armhf-armhf-xl-credit2/serial-marilith-n5.txt

        [Thu Jan 29 13:29:28 2015](XEN) Assertion 'local_irq_is_enabled()' failed at spinlock.c:137
        [Thu Jan 29 13:29:28 2015](XEN) ----[ Xen-4.6-unstable  arm32  debug=y  Not tainted ]----
        [Thu Jan 29 13:29:28 2015](XEN) CPU:    0
        [Thu Jan 29 13:29:28 2015](XEN) PC:     00229734 _spin_lock_irq+0x18/0x94
        [Thu Jan 29 13:29:28 2015](XEN) CPSR:   200000da MODE:Hypervisor
        [Thu Jan 29 13:29:28 2015](XEN)      R0: 4000823c R1: 00000000 R2: 02faf080 R3: 600000da
        [Thu Jan 29 13:29:28 2015](XEN)      R4: 4000823c R5: 4000d000 R6: 4000823c R7: 002ee020
        [Thu Jan 29 13:29:28 2015](XEN)      R8: 4000f218 R9: 00000000 R10:0026fe08 R11:7ffcfefc R12:00000002
        [Thu Jan 29 13:29:28 2015](XEN) HYP: SP: 7ffcfeec LR: 0021f34c
        [Thu Jan 29 13:29:28 2015](XEN) 
        [Thu Jan 29 13:29:28 2015](XEN)   VTCR_EL2: 80003558
        [Thu Jan 29 13:29:28 2015](XEN)  VTTBR_EL2: 00010002b9ffc000
        [Thu Jan 29 13:29:28 2015](XEN) 
        [Thu Jan 29 13:29:28 2015](XEN)  SCTLR_EL2: 30cd187f
        [Thu Jan 29 13:29:28 2015](XEN)    HCR_EL2: 000000000038643f
        [Thu Jan 29 13:29:28 2015](XEN)  TTBR0_EL2: 00000000ff6e8000
        [Thu Jan 29 13:29:28 2015](XEN) 
        [Thu Jan 29 13:29:28 2015](XEN)    ESR_EL2: 00000000
        [Thu Jan 29 13:29:28 2015](XEN)  HPFAR_EL2: 0000000000000000
        [Thu Jan 29 13:29:28 2015](XEN)      HDFAR: 00000000
        [Thu Jan 29 13:29:28 2015](XEN)      HIFAR: 00000000
        [Thu Jan 29 13:29:28 2015](XEN) 
        [Thu Jan 29 13:29:28 2015](XEN) Xen stack trace from sp=7ffcfeec:
        [Thu Jan 29 13:29:28 2015](XEN)    0024d068 00000000 002f0328 7ffcff2c 0021f34c 00000000 00000000 6591e5c1
        [Thu Jan 29 13:29:28 2015](XEN)    00000000 4000d000 4000d000 00000000 00000000 00000000 00000000 7ffcff3c
        [Thu Jan 29 13:29:28 2015](XEN)    002285dc 00007fff 00000000 7ffcff4c 00242614 00000000 00000000 7ffcff54
        [Thu Jan 29 13:29:28 2015](XEN)    002427c8 00000000 00242b6c 00000000 ffffffff 28000000 00000000 00000000
        [Thu Jan 29 13:29:28 2015](XEN)    00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
        [Thu Jan 29 13:29:28 2015](XEN)    00000000 00000000 27a00000 000001d3 00000000 00000000 00000000 00000000
        [Thu Jan 29 13:29:28 2015](XEN)    00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
        [Thu Jan 29 13:29:28 2015](XEN)    00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
        [Thu Jan 29 13:29:28 2015](XEN)    00000000 00000000 00000000 00000000 00000000
        [Thu Jan 29 13:29:28 2015](XEN) Xen call trace:
        [Thu Jan 29 13:29:28 2015](XEN)    [<00229734>] _spin_lock_irq+0x18/0x94 (PC)
        [Thu Jan 29 13:29:28 2015](XEN)    [<0021f34c>] csched2_context_saved+0x44/0x18c (LR)
        [Thu Jan 29 13:29:28 2015](XEN)    [<0021f34c>] csched2_context_saved+0x44/0x18c
        [Thu Jan 29 13:29:28 2015](XEN)    [<002285dc>] context_saved+0x58/0x80
        [Thu Jan 29 13:29:28 2015](XEN)    [<00242614>] schedule_tail+0x148/0x2f0
        [Thu Jan 29 13:29:28 2015](XEN)    [<002427c8>] continue_new_vcpu+0xc/0x70
        [Thu Jan 29 13:29:28 2015](XEN)    [<00242b6c>] context_switch+0x74/0x7c
        [Thu Jan 29 13:29:28 2015](XEN) 
        [Thu Jan 29 13:29:28 2015](XEN) 
        [Thu Jan 29 13:29:28 2015](XEN) ****************************************
        [Thu Jan 29 13:29:28 2015](XEN) Panic on CPU 0:
        [Thu Jan 29 13:29:28 2015](XEN) Assertion 'local_irq_is_enabled()' failed at spinlock.c:137
        [Thu Jan 29 13:29:28 2015](XEN) ****************************************
        
I didn't have a chance yet to think about whether the ARM ctxt switch or
the scheduler(s) are in the wrong here...

Ian.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] xen/arm: Fix rtds scheduler for arm
  2015-01-31 10:50             ` Ian Campbell
@ 2015-02-02 10:49               ` Denys Drozdov
  2015-02-02 11:14                 ` Ian Campbell
  0 siblings, 1 reply; 24+ messages in thread
From: Denys Drozdov @ 2015-02-02 10:49 UTC (permalink / raw)
  To: Ian Campbell
  Cc: George Dunlap, Andrew Cooper, Julien Grall, Jan Beulich, xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 5359 bytes --]

Hi Ian,

The issue observed on credit2 scheduler is similar to the rt scheduler on
arm platform. The root cause is that interrupts are disabled in the
beginning of arm context_switch, thus spin_lock_irq is failing in
ASSERT(local_irq_is_enabled()). I propose to change both credit2 and rt
scheduler to run on arm platform as well and re-run the regression with
scheduler patches.

On Sat, Jan 31, 2015 at 12:50 PM, Ian Campbell <ian.campbell@citrix.com>
wrote:

> On Fri, 2015-01-30 at 18:19 +0200, Denys Drozdov wrote:
>
> > since context_save executed right from IRQ level. The arm interrupt
> > handling differs from x86. ARM is handling  context_saved with IRQ
> > disabled in CPSR, though x86 has IRQs on. Thus it is failing on ASSERT
> > inside spin_lock_irq when running on ARM. I guess it should work on
> > x86, so this issue is ARM platform specific.
>
> FWIW I was waiting for it to happen to a xen-unstable run but the latest
> osstest gate run at
> http://www.chiark.greenend.org.uk/~xensrcts/logs/33915/ which included
> Dario's patches to rationalize the schedulr tests vs. archs also
> resulted in a similar sounding failure on credit2:
>
> http://www.chiark.greenend.org.uk/~xensrcts/logs/33915/test-armhf-armhf-xl-credit2/info.html
>
> http://www.chiark.greenend.org.uk/~xensrcts/logs/33915/test-armhf-armhf-xl-credit2/serial-marilith-n5.txt
>
>         [Thu Jan 29 13:29:28 2015](XEN) Assertion 'local_irq_is_enabled()'
> failed at spinlock.c:137
>         [Thu Jan 29 13:29:28 2015](XEN) ----[ Xen-4.6-unstable  arm32
> debug=y  Not tainted ]----
>         [Thu Jan 29 13:29:28 2015](XEN) CPU:    0
>         [Thu Jan 29 13:29:28 2015](XEN) PC:     00229734
> _spin_lock_irq+0x18/0x94
>         [Thu Jan 29 13:29:28 2015](XEN) CPSR:   200000da MODE:Hypervisor
>         [Thu Jan 29 13:29:28 2015](XEN)      R0: 4000823c R1: 00000000 R2:
> 02faf080 R3: 600000da
>         [Thu Jan 29 13:29:28 2015](XEN)      R4: 4000823c R5: 4000d000 R6:
> 4000823c R7: 002ee020
>         [Thu Jan 29 13:29:28 2015](XEN)      R8: 4000f218 R9: 00000000
> R10:0026fe08 R11:7ffcfefc R12:00000002
>         [Thu Jan 29 13:29:28 2015](XEN) HYP: SP: 7ffcfeec LR: 0021f34c
>         [Thu Jan 29 13:29:28 2015](XEN)
>         [Thu Jan 29 13:29:28 2015](XEN)   VTCR_EL2: 80003558
>         [Thu Jan 29 13:29:28 2015](XEN)  VTTBR_EL2: 00010002b9ffc000
>         [Thu Jan 29 13:29:28 2015](XEN)
>         [Thu Jan 29 13:29:28 2015](XEN)  SCTLR_EL2: 30cd187f
>         [Thu Jan 29 13:29:28 2015](XEN)    HCR_EL2: 000000000038643f
>         [Thu Jan 29 13:29:28 2015](XEN)  TTBR0_EL2: 00000000ff6e8000
>         [Thu Jan 29 13:29:28 2015](XEN)
>         [Thu Jan 29 13:29:28 2015](XEN)    ESR_EL2: 00000000
>         [Thu Jan 29 13:29:28 2015](XEN)  HPFAR_EL2: 0000000000000000
>         [Thu Jan 29 13:29:28 2015](XEN)      HDFAR: 00000000
>         [Thu Jan 29 13:29:28 2015](XEN)      HIFAR: 00000000
>         [Thu Jan 29 13:29:28 2015](XEN)
>         [Thu Jan 29 13:29:28 2015](XEN) Xen stack trace from sp=7ffcfeec:
>         [Thu Jan 29 13:29:28 2015](XEN)    0024d068 00000000 002f0328
> 7ffcff2c 0021f34c 00000000 00000000 6591e5c1
>         [Thu Jan 29 13:29:28 2015](XEN)    00000000 4000d000 4000d000
> 00000000 00000000 00000000 00000000 7ffcff3c
>         [Thu Jan 29 13:29:28 2015](XEN)    002285dc 00007fff 00000000
> 7ffcff4c 00242614 00000000 00000000 7ffcff54
>         [Thu Jan 29 13:29:28 2015](XEN)    002427c8 00000000 00242b6c
> 00000000 ffffffff 28000000 00000000 00000000
>         [Thu Jan 29 13:29:28 2015](XEN)    00000000 00000000 00000000
> 00000000 00000000 00000000 00000000 00000000
>         [Thu Jan 29 13:29:28 2015](XEN)    00000000 00000000 27a00000
> 000001d3 00000000 00000000 00000000 00000000
>         [Thu Jan 29 13:29:28 2015](XEN)    00000000 00000000 00000000
> 00000000 00000000 00000000 00000000 00000000
>         [Thu Jan 29 13:29:28 2015](XEN)    00000000 00000000 00000000
> 00000000 00000000 00000000 00000000 00000000
>         [Thu Jan 29 13:29:28 2015](XEN)    00000000 00000000 00000000
> 00000000 00000000
>         [Thu Jan 29 13:29:28 2015](XEN) Xen call trace:
>         [Thu Jan 29 13:29:28 2015](XEN)    [<00229734>]
> _spin_lock_irq+0x18/0x94 (PC)
>         [Thu Jan 29 13:29:28 2015](XEN)    [<0021f34c>]
> csched2_context_saved+0x44/0x18c (LR)
>         [Thu Jan 29 13:29:28 2015](XEN)    [<0021f34c>]
> csched2_context_saved+0x44/0x18c
>         [Thu Jan 29 13:29:28 2015](XEN)    [<002285dc>]
> context_saved+0x58/0x80
>         [Thu Jan 29 13:29:28 2015](XEN)    [<00242614>]
> schedule_tail+0x148/0x2f0
>         [Thu Jan 29 13:29:28 2015](XEN)    [<002427c8>]
> continue_new_vcpu+0xc/0x70
>         [Thu Jan 29 13:29:28 2015](XEN)    [<00242b6c>]
> context_switch+0x74/0x7c
>         [Thu Jan 29 13:29:28 2015](XEN)
>         [Thu Jan 29 13:29:28 2015](XEN)
>         [Thu Jan 29 13:29:28 2015](XEN)
> ****************************************
>         [Thu Jan 29 13:29:28 2015](XEN) Panic on CPU 0:
>         [Thu Jan 29 13:29:28 2015](XEN) Assertion 'local_irq_is_enabled()'
> failed at spinlock.c:137
>         [Thu Jan 29 13:29:28 2015](XEN)
> ****************************************
>
> I didn't have a chance yet to think about whether the ARM ctxt switch or
> the scheduler(s) are in the wrong here...
>
> Ian.
>
>
>

-- 

Denis Drozdov
<http://www.globallogic.com/email_disclaimer.txt>

[-- Attachment #1.2: Type: text/html, Size: 7375 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] xen/arm: Fix rtds scheduler for arm
  2015-02-02 10:49               ` Denys Drozdov
@ 2015-02-02 11:14                 ` Ian Campbell
  2015-02-02 11:40                   ` Jan Beulich
  0 siblings, 1 reply; 24+ messages in thread
From: Ian Campbell @ 2015-02-02 11:14 UTC (permalink / raw)
  To: Denys Drozdov
  Cc: George Dunlap, Andrew Cooper, Julien Grall, Jan Beulich, xen-devel

On Mon, 2015-02-02 at 12:49 +0200, Denys Drozdov wrote:

Please don't top post.

> The issue observed on credit2 scheduler is similar to the rt scheduler
> on arm platform. The root cause is that interrupts are disabled in the
> beginning of arm context_switch, thus spin_lock_irq is failing in
> ASSERT(local_irq_is_enabled()). I propose to change both credit2 and
> rt scheduler to run on arm platform as well and re-run the regression
> with scheduler patches.

I'd like to hear from the scheduler and other $arch folks regarding
whether we think the rtds and credit2 schedulers are wrong or whether it
is actually the ARM arch code which needs fixing before considering any
change.

What lead you to conclude that the schedulers were the ones at fault?

Ian

> 
> On Sat, Jan 31, 2015 at 12:50 PM, Ian Campbell
> <ian.campbell@citrix.com> wrote:
>         On Fri, 2015-01-30 at 18:19 +0200, Denys Drozdov wrote:
>         
>         > since context_save executed right from IRQ level. The arm
>         interrupt
>         > handling differs from x86. ARM is handling  context_saved
>         with IRQ
>         > disabled in CPSR, though x86 has IRQs on. Thus it is failing
>         on ASSERT
>         > inside spin_lock_irq when running on ARM. I guess it should
>         work on
>         > x86, so this issue is ARM platform specific.
>         
>         FWIW I was waiting for it to happen to a xen-unstable run but
>         the latest
>         osstest gate run at
>         http://www.chiark.greenend.org.uk/~xensrcts/logs/33915/ which
>         included
>         Dario's patches to rationalize the schedulr tests vs. archs
>         also
>         resulted in a similar sounding failure on credit2:
>         http://www.chiark.greenend.org.uk/~xensrcts/logs/33915/test-armhf-armhf-xl-credit2/info.html
>         http://www.chiark.greenend.org.uk/~xensrcts/logs/33915/test-armhf-armhf-xl-credit2/serial-marilith-n5.txt
>         
>                 [Thu Jan 29 13:29:28 2015](XEN) Assertion
>         'local_irq_is_enabled()' failed at spinlock.c:137
>                 [Thu Jan 29 13:29:28 2015](XEN) ----[ Xen-4.6-unstable
>         arm32  debug=y  Not tainted ]----
>                 [Thu Jan 29 13:29:28 2015](XEN) CPU:    0
>                 [Thu Jan 29 13:29:28 2015](XEN) PC:     00229734
>         _spin_lock_irq+0x18/0x94
>                 [Thu Jan 29 13:29:28 2015](XEN) CPSR:   200000da
>         MODE:Hypervisor
>                 [Thu Jan 29 13:29:28 2015](XEN)      R0: 4000823c R1:
>         00000000 R2: 02faf080 R3: 600000da
>                 [Thu Jan 29 13:29:28 2015](XEN)      R4: 4000823c R5:
>         4000d000 R6: 4000823c R7: 002ee020
>                 [Thu Jan 29 13:29:28 2015](XEN)      R8: 4000f218 R9:
>         00000000 R10:0026fe08 R11:7ffcfefc R12:00000002
>                 [Thu Jan 29 13:29:28 2015](XEN) HYP: SP: 7ffcfeec LR:
>         0021f34c
>                 [Thu Jan 29 13:29:28 2015](XEN)
>                 [Thu Jan 29 13:29:28 2015](XEN)   VTCR_EL2: 80003558
>                 [Thu Jan 29 13:29:28 2015](XEN)  VTTBR_EL2:
>         00010002b9ffc000
>                 [Thu Jan 29 13:29:28 2015](XEN)
>                 [Thu Jan 29 13:29:28 2015](XEN)  SCTLR_EL2: 30cd187f
>                 [Thu Jan 29 13:29:28 2015](XEN)    HCR_EL2:
>         000000000038643f
>                 [Thu Jan 29 13:29:28 2015](XEN)  TTBR0_EL2:
>         00000000ff6e8000
>                 [Thu Jan 29 13:29:28 2015](XEN)
>                 [Thu Jan 29 13:29:28 2015](XEN)    ESR_EL2: 00000000
>                 [Thu Jan 29 13:29:28 2015](XEN)  HPFAR_EL2:
>         0000000000000000
>                 [Thu Jan 29 13:29:28 2015](XEN)      HDFAR: 00000000
>                 [Thu Jan 29 13:29:28 2015](XEN)      HIFAR: 00000000
>                 [Thu Jan 29 13:29:28 2015](XEN)
>                 [Thu Jan 29 13:29:28 2015](XEN) Xen stack trace from
>         sp=7ffcfeec:
>                 [Thu Jan 29 13:29:28 2015](XEN)    0024d068 00000000
>         002f0328 7ffcff2c 0021f34c 00000000 00000000 6591e5c1
>                 [Thu Jan 29 13:29:28 2015](XEN)    00000000 4000d000
>         4000d000 00000000 00000000 00000000 00000000 7ffcff3c
>                 [Thu Jan 29 13:29:28 2015](XEN)    002285dc 00007fff
>         00000000 7ffcff4c 00242614 00000000 00000000 7ffcff54
>                 [Thu Jan 29 13:29:28 2015](XEN)    002427c8 00000000
>         00242b6c 00000000 ffffffff 28000000 00000000 00000000
>                 [Thu Jan 29 13:29:28 2015](XEN)    00000000 00000000
>         00000000 00000000 00000000 00000000 00000000 00000000
>                 [Thu Jan 29 13:29:28 2015](XEN)    00000000 00000000
>         27a00000 000001d3 00000000 00000000 00000000 00000000
>                 [Thu Jan 29 13:29:28 2015](XEN)    00000000 00000000
>         00000000 00000000 00000000 00000000 00000000 00000000
>                 [Thu Jan 29 13:29:28 2015](XEN)    00000000 00000000
>         00000000 00000000 00000000 00000000 00000000 00000000
>                 [Thu Jan 29 13:29:28 2015](XEN)    00000000 00000000
>         00000000 00000000 00000000
>                 [Thu Jan 29 13:29:28 2015](XEN) Xen call trace:
>                 [Thu Jan 29 13:29:28 2015](XEN)    [<00229734>]
>         _spin_lock_irq+0x18/0x94 (PC)
>                 [Thu Jan 29 13:29:28 2015](XEN)    [<0021f34c>]
>         csched2_context_saved+0x44/0x18c (LR)
>                 [Thu Jan 29 13:29:28 2015](XEN)    [<0021f34c>]
>         csched2_context_saved+0x44/0x18c
>                 [Thu Jan 29 13:29:28 2015](XEN)    [<002285dc>]
>         context_saved+0x58/0x80
>                 [Thu Jan 29 13:29:28 2015](XEN)    [<00242614>]
>         schedule_tail+0x148/0x2f0
>                 [Thu Jan 29 13:29:28 2015](XEN)    [<002427c8>]
>         continue_new_vcpu+0xc/0x70
>                 [Thu Jan 29 13:29:28 2015](XEN)    [<00242b6c>]
>         context_switch+0x74/0x7c
>                 [Thu Jan 29 13:29:28 2015](XEN)
>                 [Thu Jan 29 13:29:28 2015](XEN)
>                 [Thu Jan 29 13:29:28 2015](XEN)
>         ****************************************
>                 [Thu Jan 29 13:29:28 2015](XEN) Panic on CPU 0:
>                 [Thu Jan 29 13:29:28 2015](XEN) Assertion
>         'local_irq_is_enabled()' failed at spinlock.c:137
>                 [Thu Jan 29 13:29:28 2015](XEN)
>         ****************************************
>         
>         I didn't have a chance yet to think about whether the ARM ctxt
>         switch or
>         the scheduler(s) are in the wrong here...
>         
>         Ian.
>         
>         
> 
> 
> -- 
> 
> Denis Drozdov
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] xen/arm: Fix rtds scheduler for arm
  2015-02-02 11:14                 ` Ian Campbell
@ 2015-02-02 11:40                   ` Jan Beulich
  2015-02-02 12:16                     ` Ian Campbell
  0 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2015-02-02 11:40 UTC (permalink / raw)
  To: Ian Campbell
  Cc: George Dunlap, Andrew Cooper, Julien Grall, Denys Drozdov, xen-devel

>>> On 02.02.15 at 12:14, <Ian.Campbell@citrix.com> wrote:
> On Mon, 2015-02-02 at 12:49 +0200, Denys Drozdov wrote:
>> The issue observed on credit2 scheduler is similar to the rt scheduler
>> on arm platform. The root cause is that interrupts are disabled in the
>> beginning of arm context_switch, thus spin_lock_irq is failing in
>> ASSERT(local_irq_is_enabled()). I propose to change both credit2 and
>> rt scheduler to run on arm platform as well and re-run the regression
>> with scheduler patches.
> 
> I'd like to hear from the scheduler and other $arch folks regarding
> whether we think the rtds and credit2 schedulers are wrong or whether it
> is actually the ARM arch code which needs fixing before considering any
> change.

The aspect to be understood first is why ARM needs IRQs disabled
past __context_switch() into schedule_tail() (and there until after
ctxt_switch_from() and ctxt_switch_to()). If that's indeed necessary,
there's no question that the schedulers need to be adjusted to
accommodate for this. X86 calls context_saved() before
schedule_tail() and has no need for IRQs to be disabled after
__context_switch() returned. Otoh some roughly equivalent stuff
ARM does in ctxt_switch_{from,to}() is being done in
__context_switch() on x86 (and in the context here the restore
parts seem to be of most interest) - maybe the call to
context_saved() could be deferred on ARM until after IRQs got
re-enabled?

Jan

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] xen/arm: Fix rtds scheduler for arm
  2015-02-02 11:40                   ` Jan Beulich
@ 2015-02-02 12:16                     ` Ian Campbell
  2015-02-02 12:59                       ` Julien Grall
  2015-02-02 13:07                       ` Jan Beulich
  0 siblings, 2 replies; 24+ messages in thread
From: Ian Campbell @ 2015-02-02 12:16 UTC (permalink / raw)
  To: Jan Beulich
  Cc: George Dunlap, Andrew Cooper, Julien Grall, Denys Drozdov, xen-devel

On Mon, 2015-02-02 at 11:40 +0000, Jan Beulich wrote:
> >>> On 02.02.15 at 12:14, <Ian.Campbell@citrix.com> wrote:
> > On Mon, 2015-02-02 at 12:49 +0200, Denys Drozdov wrote:
> >> The issue observed on credit2 scheduler is similar to the rt scheduler
> >> on arm platform. The root cause is that interrupts are disabled in the
> >> beginning of arm context_switch, thus spin_lock_irq is failing in
> >> ASSERT(local_irq_is_enabled()). I propose to change both credit2 and
> >> rt scheduler to run on arm platform as well and re-run the regression
> >> with scheduler patches.
> > 
> > I'd like to hear from the scheduler and other $arch folks regarding
> > whether we think the rtds and credit2 schedulers are wrong or whether it
> > is actually the ARM arch code which needs fixing before considering any
> > change.
> 
> The aspect to be understood first is why ARM needs IRQs disabled
> past __context_switch() into schedule_tail() (and there until after
> ctxt_switch_from() and ctxt_switch_to()). If that's indeed necessary,
> there's no question that the schedulers need to be adjusted to
> accommodate for this.

I don't think it's *necessary*, but we do seem to have ended up with the
context switch having that requirement today (and relying on it in
several places in switch from/to (mostly to).

I'm pretty sure we could rework things more along the lines of how x86
does it if needed, but it would be a non-trivial refactoring I think.

>  X86 calls context_saved() before
> schedule_tail() and has no need for IRQs to be disabled after
> __context_switch() returned.

In fact from the comment ("...which may fault...") I think it requires
that they are not disabled?

>  Otoh some roughly equivalent stuff
> ARM does in ctxt_switch_{from,to}() is being done in
> __context_switch() on x86 (and in the context here the restore
> parts seem to be of most interest) - maybe the call to
> context_saved() could be deferred on ARM until after IRQs got
> re-enabled?

Is it allowable to have the context_saved of prev be after (most of) the
state of next has been restored?

Ian.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] xen/arm: Fix rtds scheduler for arm
  2015-02-02 12:16                     ` Ian Campbell
@ 2015-02-02 12:59                       ` Julien Grall
  2015-02-02 13:06                         ` Ian Campbell
  2015-02-04 15:04                         ` Stefano Stabellini
  2015-02-02 13:07                       ` Jan Beulich
  1 sibling, 2 replies; 24+ messages in thread
From: Julien Grall @ 2015-02-02 12:59 UTC (permalink / raw)
  To: Ian Campbell, Jan Beulich
  Cc: George Dunlap, Andrew Cooper, Denys Drozdov, xen-devel

On 02/02/15 12:16, Ian Campbell wrote:
> On Mon, 2015-02-02 at 11:40 +0000, Jan Beulich wrote:
>>>>> On 02.02.15 at 12:14, <Ian.Campbell@citrix.com> wrote:
>>> On Mon, 2015-02-02 at 12:49 +0200, Denys Drozdov wrote:
>>>> The issue observed on credit2 scheduler is similar to the rt scheduler
>>>> on arm platform. The root cause is that interrupts are disabled in the
>>>> beginning of arm context_switch, thus spin_lock_irq is failing in
>>>> ASSERT(local_irq_is_enabled()). I propose to change both credit2 and
>>>> rt scheduler to run on arm platform as well and re-run the regression
>>>> with scheduler patches.
>>>
>>> I'd like to hear from the scheduler and other $arch folks regarding
>>> whether we think the rtds and credit2 schedulers are wrong or whether it
>>> is actually the ARM arch code which needs fixing before considering any
>>> change.
>>
>> The aspect to be understood first is why ARM needs IRQs disabled
>> past __context_switch() into schedule_tail() (and there until after
>> ctxt_switch_from() and ctxt_switch_to()). If that's indeed necessary,
>> there's no question that the schedulers need to be adjusted to
>> accommodate for this.
> 
> I don't think it's *necessary*, but we do seem to have ended up with the
> context switch having that requirement today (and relying on it in
> several places in switch from/to (mostly to).

> I'm pretty sure we could rework things more along the lines of how x86
> does it if needed, but it would be a non-trivial refactoring I think.

We have some part of the code which may inject an interrupt during
context switch.
For instance the timer may inject an IRQ as long as it has not been
disabled. Same when the timer is restored.

The former may result to inject an interrupt to the wrong vCPU.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] xen/arm: Fix rtds scheduler for arm
  2015-02-02 12:59                       ` Julien Grall
@ 2015-02-02 13:06                         ` Ian Campbell
  2015-02-04 15:04                         ` Stefano Stabellini
  1 sibling, 0 replies; 24+ messages in thread
From: Ian Campbell @ 2015-02-02 13:06 UTC (permalink / raw)
  To: Julien Grall
  Cc: George Dunlap, Andrew Cooper, Denys Drozdov, Jan Beulich, xen-devel

On Mon, 2015-02-02 at 12:59 +0000, Julien Grall wrote:
> On 02/02/15 12:16, Ian Campbell wrote:
> > On Mon, 2015-02-02 at 11:40 +0000, Jan Beulich wrote:
> >>>>> On 02.02.15 at 12:14, <Ian.Campbell@citrix.com> wrote:
> >>> On Mon, 2015-02-02 at 12:49 +0200, Denys Drozdov wrote:
> >>>> The issue observed on credit2 scheduler is similar to the rt scheduler
> >>>> on arm platform. The root cause is that interrupts are disabled in the
> >>>> beginning of arm context_switch, thus spin_lock_irq is failing in
> >>>> ASSERT(local_irq_is_enabled()). I propose to change both credit2 and
> >>>> rt scheduler to run on arm platform as well and re-run the regression
> >>>> with scheduler patches.
> >>>
> >>> I'd like to hear from the scheduler and other $arch folks regarding
> >>> whether we think the rtds and credit2 schedulers are wrong or whether it
> >>> is actually the ARM arch code which needs fixing before considering any
> >>> change.
> >>
> >> The aspect to be understood first is why ARM needs IRQs disabled
> >> past __context_switch() into schedule_tail() (and there until after
> >> ctxt_switch_from() and ctxt_switch_to()). If that's indeed necessary,
> >> there's no question that the schedulers need to be adjusted to
> >> accommodate for this.
> > 
> > I don't think it's *necessary*, but we do seem to have ended up with the
> > context switch having that requirement today (and relying on it in
> > several places in switch from/to (mostly to).
> 
> > I'm pretty sure we could rework things more along the lines of how x86
> > does it if needed, but it would be a non-trivial refactoring I think.
> 
> We have some part of the code which may inject an interrupt during
> context switch.
> For instance the timer may inject an IRQ as long as it has not been
> disabled. Same when the timer is restored.
> 
> The former may result to inject an interrupt to the wrong vCPU.

I'm pretty sure we could structure things, with appropriate use of
locks, smaller critical sections with IRQs disabled, better quiescing of
subsystems on save and ordering of operations etc such that this would
work without the big hammer of disabling IRQs for the entire context
switch.

In fact I rather expect that eventually some embedded RTOS type person
will want to do exactly that to minimize IRQ latency.

Ian.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] xen/arm: Fix rtds scheduler for arm
  2015-02-02 12:16                     ` Ian Campbell
  2015-02-02 12:59                       ` Julien Grall
@ 2015-02-02 13:07                       ` Jan Beulich
  2015-02-02 18:03                         ` Denis Drozdov
  1 sibling, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2015-02-02 13:07 UTC (permalink / raw)
  To: Ian Campbell
  Cc: George Dunlap, Andrew Cooper, Julien Grall, Denys Drozdov, xen-devel

>>> On 02.02.15 at 13:16, <Ian.Campbell@citrix.com> wrote:
> On Mon, 2015-02-02 at 11:40 +0000, Jan Beulich wrote:
>>  X86 calls context_saved() before
>> schedule_tail() and has no need for IRQs to be disabled after
>> __context_switch() returned.
> 
> In fact from the comment ("...which may fault...") I think it requires
> that they are not disabled?

Correct, or else such a fault would be fatal.

>>  Otoh some roughly equivalent stuff
>> ARM does in ctxt_switch_{from,to}() is being done in
>> __context_switch() on x86 (and in the context here the restore
>> parts seem to be of most interest) - maybe the call to
>> context_saved() could be deferred on ARM until after IRQs got
>> re-enabled?
> 
> Is it allowable to have the context_saved of prev be after (most of) the
> state of next has been restored?

I think so, but I'd clearly like this to be confirmed by e.g. George,
not the least because it's only the newer schedulers that make use
of this.

Jan

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH] xen/arm: Fix rtds scheduler for arm
  2015-02-02 13:07                       ` Jan Beulich
@ 2015-02-02 18:03                         ` Denis Drozdov
  2015-02-04 15:13                           ` Ian Campbell
  0 siblings, 1 reply; 24+ messages in thread
From: Denis Drozdov @ 2015-02-02 18:03 UTC (permalink / raw)
  To: Ian.Campbell, JBeulich, julien.grall, andrew.cooper3,
	george.dunlap, xen-devel
  Cc: denys drozdov

From: denys drozdov <denys.drozdov@globallogic.com>

Make Credit2 and RT schedulers to run on arm platform
context_saved should be deferred on ARM after IRQs enabled

Signed-off-by: denys drozdov <denys.drozdov@globallogic.com>
---
 xen/arch/arm/domain.c |    7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 7221bc8..1626376 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -64,7 +64,7 @@ static void ctxt_switch_from(struct vcpu *p)
      * mode. Therefore we don't need to save the context of an idle VCPU.
      */
     if ( is_idle_vcpu(p) )
-        goto end_context;
+        return;
 
     p2m_save_state(p);
 
@@ -138,9 +138,6 @@ static void ctxt_switch_from(struct vcpu *p)
     gic_save_state(p);
 
     isb();
-
-end_context:
-    context_saved(p);
 }
 
 static void ctxt_switch_to(struct vcpu *n)
@@ -246,6 +243,8 @@ static void schedule_tail(struct vcpu *prev)
 
     local_irq_enable();
 
+    context_saved(prev);
+
     if ( prev != current )
         update_runstate_area(current);
 
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] xen/arm: Fix rtds scheduler for arm
  2015-02-02 12:59                       ` Julien Grall
  2015-02-02 13:06                         ` Ian Campbell
@ 2015-02-04 15:04                         ` Stefano Stabellini
  2015-02-04 15:09                           ` Ian Campbell
  1 sibling, 1 reply; 24+ messages in thread
From: Stefano Stabellini @ 2015-02-04 15:04 UTC (permalink / raw)
  To: Julien Grall
  Cc: Ian Campbell, George Dunlap, Andrew Cooper, Denys Drozdov,
	xen-devel, Jan Beulich

On Mon, 2 Feb 2015, Julien Grall wrote:
> On 02/02/15 12:16, Ian Campbell wrote:
> > On Mon, 2015-02-02 at 11:40 +0000, Jan Beulich wrote:
> >>>>> On 02.02.15 at 12:14, <Ian.Campbell@citrix.com> wrote:
> >>> On Mon, 2015-02-02 at 12:49 +0200, Denys Drozdov wrote:
> >>>> The issue observed on credit2 scheduler is similar to the rt scheduler
> >>>> on arm platform. The root cause is that interrupts are disabled in the
> >>>> beginning of arm context_switch, thus spin_lock_irq is failing in
> >>>> ASSERT(local_irq_is_enabled()). I propose to change both credit2 and
> >>>> rt scheduler to run on arm platform as well and re-run the regression
> >>>> with scheduler patches.
> >>>
> >>> I'd like to hear from the scheduler and other $arch folks regarding
> >>> whether we think the rtds and credit2 schedulers are wrong or whether it
> >>> is actually the ARM arch code which needs fixing before considering any
> >>> change.
> >>
> >> The aspect to be understood first is why ARM needs IRQs disabled
> >> past __context_switch() into schedule_tail() (and there until after
> >> ctxt_switch_from() and ctxt_switch_to()). If that's indeed necessary,
> >> there's no question that the schedulers need to be adjusted to
> >> accommodate for this.
> > 
> > I don't think it's *necessary*, but we do seem to have ended up with the
> > context switch having that requirement today (and relying on it in
> > several places in switch from/to (mostly to).
> 
> > I'm pretty sure we could rework things more along the lines of how x86
> > does it if needed, but it would be a non-trivial refactoring I think.
> 
> We have some part of the code which may inject an interrupt during
> context switch.
> For instance the timer may inject an IRQ as long as it has not been
> disabled. Same when the timer is restored.
> 
> The former may result to inject an interrupt to the wrong vCPU.

Also the gic save and restore functions need to be run with interrupt
disabled.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] xen/arm: Fix rtds scheduler for arm
  2015-02-04 15:04                         ` Stefano Stabellini
@ 2015-02-04 15:09                           ` Ian Campbell
  0 siblings, 0 replies; 24+ messages in thread
From: Ian Campbell @ 2015-02-04 15:09 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: George Dunlap, Andrew Cooper, Julien Grall, Denys Drozdov,
	xen-devel, Jan Beulich

On Wed, 2015-02-04 at 15:04 +0000, Stefano Stabellini wrote:
> On Mon, 2 Feb 2015, Julien Grall wrote:
> > On 02/02/15 12:16, Ian Campbell wrote:
> > > On Mon, 2015-02-02 at 11:40 +0000, Jan Beulich wrote:
> > >>>>> On 02.02.15 at 12:14, <Ian.Campbell@citrix.com> wrote:
> > >>> On Mon, 2015-02-02 at 12:49 +0200, Denys Drozdov wrote:
> > >>>> The issue observed on credit2 scheduler is similar to the rt scheduler
> > >>>> on arm platform. The root cause is that interrupts are disabled in the
> > >>>> beginning of arm context_switch, thus spin_lock_irq is failing in
> > >>>> ASSERT(local_irq_is_enabled()). I propose to change both credit2 and
> > >>>> rt scheduler to run on arm platform as well and re-run the regression
> > >>>> with scheduler patches.
> > >>>
> > >>> I'd like to hear from the scheduler and other $arch folks regarding
> > >>> whether we think the rtds and credit2 schedulers are wrong or whether it
> > >>> is actually the ARM arch code which needs fixing before considering any
> > >>> change.
> > >>
> > >> The aspect to be understood first is why ARM needs IRQs disabled
> > >> past __context_switch() into schedule_tail() (and there until after
> > >> ctxt_switch_from() and ctxt_switch_to()). If that's indeed necessary,
> > >> there's no question that the schedulers need to be adjusted to
> > >> accommodate for this.
> > > 
> > > I don't think it's *necessary*, but we do seem to have ended up with the
> > > context switch having that requirement today (and relying on it in
> > > several places in switch from/to (mostly to).
> > 
> > > I'm pretty sure we could rework things more along the lines of how x86
> > > does it if needed, but it would be a non-trivial refactoring I think.
> > 
> > We have some part of the code which may inject an interrupt during
> > context switch.
> > For instance the timer may inject an IRQ as long as it has not been
> > disabled. Same when the timer is restored.
> > 
> > The former may result to inject an interrupt to the wrong vCPU.
> 
> Also the gic save and restore functions need to be run with interrupt
> disabled.

I am aware that today all sorts of things rely on interrupts being
disabled during context switch on ARM.

My point was that we *could*, if required, rework things to not rely on
this.

Ian.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] xen/arm: Fix rtds scheduler for arm
  2015-02-02 18:03                         ` Denis Drozdov
@ 2015-02-04 15:13                           ` Ian Campbell
  2015-02-04 16:19                             ` Dario Faggioli
  2015-02-04 16:45                             ` Julien Grall
  0 siblings, 2 replies; 24+ messages in thread
From: Ian Campbell @ 2015-02-04 15:13 UTC (permalink / raw)
  To: Denis Drozdov
  Cc: george.dunlap, andrew.cooper3, julien.grall, JBeulich, xen-devel

On Mon, 2015-02-02 at 20:03 +0200, Denis Drozdov wrote:
> From: denys drozdov <denys.drozdov@globallogic.com>
> 
> Make Credit2 and RT schedulers to run on arm platform
> context_saved should be deferred on ARM after IRQs enabled

A better subject+message would be:

        xen/arm: Call context_saved() with interrupts enabled during context switch
        
        This is a requirement of the scheduler interface, violating this
        causes for example with the RT scheduler:
                <insert crash log here>

> Signed-off-by: denys drozdov <denys.drozdov@globallogic.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

However I would like an ack from a scheduler person (e.g. George) before
it gets applied.

> ---
>  xen/arch/arm/domain.c |    7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 7221bc8..1626376 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -64,7 +64,7 @@ static void ctxt_switch_from(struct vcpu *p)
>       * mode. Therefore we don't need to save the context of an idle VCPU.
>       */
>      if ( is_idle_vcpu(p) )
> -        goto end_context;
> +        return;
>  
>      p2m_save_state(p);
>  
> @@ -138,9 +138,6 @@ static void ctxt_switch_from(struct vcpu *p)
>      gic_save_state(p);
>  
>      isb();
> -
> -end_context:
> -    context_saved(p);
>  }
>  
>  static void ctxt_switch_to(struct vcpu *n)
> @@ -246,6 +243,8 @@ static void schedule_tail(struct vcpu *prev)
>  
>      local_irq_enable();
>  
> +    context_saved(prev);
> +
>      if ( prev != current )
>          update_runstate_area(current);
>  

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] xen/arm: Fix rtds scheduler for arm
  2015-02-04 15:13                           ` Ian Campbell
@ 2015-02-04 16:19                             ` Dario Faggioli
  2015-02-04 16:45                             ` Julien Grall
  1 sibling, 0 replies; 24+ messages in thread
From: Dario Faggioli @ 2015-02-04 16:19 UTC (permalink / raw)
  To: Ian Campbell
  Cc: george.dunlap, andrew.cooper3, julien.grall, Denis Drozdov,
	xen-devel, JBeulich


[-- Attachment #1.1: Type: text/plain, Size: 1537 bytes --]

On Wed, 2015-02-04 at 15:13 +0000, Ian Campbell wrote:
> On Mon, 2015-02-02 at 20:03 +0200, Denis Drozdov wrote:
> > From: denys drozdov <denys.drozdov@globallogic.com>
> > 
> > Make Credit2 and RT schedulers to run on arm platform
> > context_saved should be deferred on ARM after IRQs enabled
> 
> A better subject+message would be:
> 
>         xen/arm: Call context_saved() with interrupts enabled during context switch
>         
>         This is a requirement of the scheduler interface, violating this
>         causes for example with the RT scheduler:
>                 <insert crash log here>
> 
> > Signed-off-by: denys drozdov <denys.drozdov@globallogic.com>
> 
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
> 
> However I would like an ack from a scheduler person (e.g. George) before
> it gets applied.
> 
Deferring the call to context_saved() until below both
ctxt_switch_from() and ctxt_switch_to(), for ARM, like the patch is
doing seems ok to me.

As Jan noted already, it basically makes things look exactly as they
look in x86 already and, checking how booth credit2 and RTDS use
the .context_saved hook, all seems fine to me.

So, FWIW:

Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] xen/arm: Fix rtds scheduler for arm
  2015-02-04 15:13                           ` Ian Campbell
  2015-02-04 16:19                             ` Dario Faggioli
@ 2015-02-04 16:45                             ` Julien Grall
  2015-02-04 17:13                               ` [PATCH] xen/arm: Call context_saved() with interrupts enabled during context switch Denis Drozdov
  1 sibling, 1 reply; 24+ messages in thread
From: Julien Grall @ 2015-02-04 16:45 UTC (permalink / raw)
  To: Ian Campbell, Denis Drozdov
  Cc: george.dunlap, andrew.cooper3, JBeulich, xen-devel

Hello,

On 04/02/2015 15:13, Ian Campbell wrote:
> On Mon, 2015-02-02 at 20:03 +0200, Denis Drozdov wrote:
>> From: denys drozdov <denys.drozdov@globallogic.com>
>>
>> Make Credit2 and RT schedulers to run on arm platform
>> context_saved should be deferred on ARM after IRQs enabled
>
> A better subject+message would be:
>
>          xen/arm: Call context_saved() with interrupts enabled during context switch
>
>          This is a requirement of the scheduler interface, violating this
>          causes for example with the RT scheduler:
>                  <insert crash log here>
>
>> Signed-off-by: denys drozdov <denys.drozdov@globallogic.com>
>
> Acked-by: Ian Campbell <ian.campbell@citrix.com>

The commit message suggested by Ian looks more clear to me. With this 
change:

Reviewed-by: Julien Grall <julien.grall@linaro.org>

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH] xen/arm: Call context_saved() with interrupts enabled during context switch
  2015-02-04 16:45                             ` Julien Grall
@ 2015-02-04 17:13                               ` Denis Drozdov
  2015-02-05 12:48                                 ` Ian Campbell
  0 siblings, 1 reply; 24+ messages in thread
From: Denis Drozdov @ 2015-02-04 17:13 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, JBeulich, andrew.cooper3,
	george.dunlap, xen-devel
  Cc: denys drozdov

From: denys drozdov <denys.drozdov@globallogic.com>

This is a requirement of the scheduler interface, violating this
causes for example with the RT scheduler:

(XEN) Assertion 'local_irq_is_enabled()' failed at spinlock.c:137
(XEN) ----[ Xen-4.5.0  arm32  debug=y  Not tainted ]----
(XEN) CPU:    0
(XEN) PC:     0022a074 _spin_lock_irq+0x18/0x94

(XEN) Xen call trace:
(XEN)    [<0022a074>] _spin_lock_irq+0x18/0x94 (PC)
(XEN)    [<002256b8>] rt_context_saved+0x3c/0x144 (LR)
(XEN)    [<002256b8>] rt_context_saved+0x3c/0x144
(XEN)    [<00228ed8>] context_saved+0x4c/0x80
(XEN)    [<002524b4>] schedule_tail+0x148/0x2f0
(XEN)    [<00252668>] continue_new_vcpu+0xc/0x70
(XEN)    [<00252a18>] context_switch+0x74/0x7c


Signed-off-by: denys drozdov <denys.drozdov@globallogic.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: Julien Grall <julien.grall@linaro.org>
---
 xen/arch/arm/domain.c |    7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 7221bc8..1626376 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -64,7 +64,7 @@ static void ctxt_switch_from(struct vcpu *p)
      * mode. Therefore we don't need to save the context of an idle VCPU.
      */
     if ( is_idle_vcpu(p) )
-        goto end_context;
+        return;
 
     p2m_save_state(p);
 
@@ -138,9 +138,6 @@ static void ctxt_switch_from(struct vcpu *p)
     gic_save_state(p);
 
     isb();
-
-end_context:
-    context_saved(p);
 }
 
 static void ctxt_switch_to(struct vcpu *n)
@@ -246,6 +243,8 @@ static void schedule_tail(struct vcpu *prev)
 
     local_irq_enable();
 
+    context_saved(prev);
+
     if ( prev != current )
         update_runstate_area(current);
 
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] xen/arm: Call context_saved() with interrupts enabled during context switch
  2015-02-04 17:13                               ` [PATCH] xen/arm: Call context_saved() with interrupts enabled during context switch Denis Drozdov
@ 2015-02-05 12:48                                 ` Ian Campbell
  0 siblings, 0 replies; 24+ messages in thread
From: Ian Campbell @ 2015-02-05 12:48 UTC (permalink / raw)
  To: Denis Drozdov
  Cc: george.dunlap, andrew.cooper3, julien.grall, JBeulich, xen-devel

On Wed, 2015-02-04 at 19:13 +0200, Denis Drozdov wrote:
> From: denys drozdov <denys.drozdov@globallogic.com>
> 
> This is a requirement of the scheduler interface, violating this
> causes for example with the RT scheduler:
> 
> (XEN) Assertion 'local_irq_is_enabled()' failed at spinlock.c:137
> (XEN) ----[ Xen-4.5.0  arm32  debug=y  Not tainted ]----
> (XEN) CPU:    0
> (XEN) PC:     0022a074 _spin_lock_irq+0x18/0x94
> 
> (XEN) Xen call trace:
> (XEN)    [<0022a074>] _spin_lock_irq+0x18/0x94 (PC)
> (XEN)    [<002256b8>] rt_context_saved+0x3c/0x144 (LR)
> (XEN)    [<002256b8>] rt_context_saved+0x3c/0x144
> (XEN)    [<00228ed8>] context_saved+0x4c/0x80
> (XEN)    [<002524b4>] schedule_tail+0x148/0x2f0
> (XEN)    [<00252668>] continue_new_vcpu+0xc/0x70
> (XEN)    [<00252a18>] context_switch+0x74/0x7c
> 
> 
> Signed-off-by: denys drozdov <denys.drozdov@globallogic.com>
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
> Reviewed-by: Julien Grall <julien.grall@linaro.org>

Dario's R-by counts as the scheduled ack I was looking for, so applied
thanks.

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2015-02-05 12:48 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-30 12:30 [PATCH] xen/arm: Fix rtds scheduler for arm Denis Drozdov
2015-01-30 12:38 ` Jan Beulich
2015-01-30 15:40   ` Denis Drozdov
2015-01-30 15:46     ` Julien Grall
2015-01-30 15:56       ` Andrew Cooper
2015-01-30 16:10         ` Julien Grall
2015-01-30 16:19           ` Denys Drozdov
2015-01-30 16:29             ` Julien Grall
2015-01-31 10:50             ` Ian Campbell
2015-02-02 10:49               ` Denys Drozdov
2015-02-02 11:14                 ` Ian Campbell
2015-02-02 11:40                   ` Jan Beulich
2015-02-02 12:16                     ` Ian Campbell
2015-02-02 12:59                       ` Julien Grall
2015-02-02 13:06                         ` Ian Campbell
2015-02-04 15:04                         ` Stefano Stabellini
2015-02-04 15:09                           ` Ian Campbell
2015-02-02 13:07                       ` Jan Beulich
2015-02-02 18:03                         ` Denis Drozdov
2015-02-04 15:13                           ` Ian Campbell
2015-02-04 16:19                             ` Dario Faggioli
2015-02-04 16:45                             ` Julien Grall
2015-02-04 17:13                               ` [PATCH] xen/arm: Call context_saved() with interrupts enabled during context switch Denis Drozdov
2015-02-05 12:48                                 ` Ian Campbell

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.