* [PATCH] x86/xen: resume timer irqs early
From: David Vrabel @ 2014-08-07 17:16 UTC
  To: xen-devel; +Cc: Boris Ostrovsky, David Vrabel

If the timer irqs are resumed during device resume it is possible in
certain circumstances for the resume to hang early on, before device
interrupts are resumed.

It is not entirely clear what is occurring at the point of the hang but I
think a task necessary for the resume calls schedule_timeout(),
waiting for a timer interrupt (which never arrives).  This failure may
require specific tasks to be running on the other VCPUs to trigger
(processes are not frozen during a suspend/resume if PREEMPT is
disabled).

Add IRQF_EARLY_RESUME to the timer interrupts so they are resumed in
syscore_resume().

Also add IRQF_NO_SUSPEND as it is not necessary to suspend the timer
interrupts and IRQF_FORCE_RESUME was already set.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 arch/x86/xen/time.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
index 7b78f88..90dd311 100644
--- a/arch/x86/xen/time.c
+++ b/arch/x86/xen/time.c
@@ -443,8 +443,10 @@ void xen_setup_timer(int cpu)
 		name = "<timer kasprintf failed>";
 
 	irq = bind_virq_to_irqhandler(VIRQ_TIMER, cpu, xen_timer_interrupt,
-				      IRQF_PERCPU|IRQF_NOBALANCING|IRQF_TIMER|
-				      IRQF_FORCE_RESUME,
+				      IRQF_PERCPU | IRQF_NOBALANCING
+				      | IRQF_TIMER
+				      | IRQF_NO_SUSPEND | IRQF_FORCE_RESUME
+				      | IRQF_EARLY_RESUME,
 				      name, NULL);
 	(void)xen_set_irq_priority(irq, XEN_IRQ_PRIORITY_MAX);
 
-- 
1.7.10.4


* Re: [PATCH] x86/xen: resume timer irqs early
From: Boris Ostrovsky @ 2014-08-07 17:29 UTC
  To: David Vrabel, xen-devel

On 08/07/2014 01:16 PM, David Vrabel wrote:
> If the timer irqs are resumed during device resume it is possible in
> certain circumstances for the resume to hang early on, before device
> interrupts are resumed.
>
> It is not entirely clear what is occurring at the point of the hang but I
> think a task necessary for the resume calls schedule_timeout(),
> waiting for a timer interrupt (which never arrives).  This failure may
> require specific tasks to be running on the other VCPUs to trigger
> (processes are not frozen during a suspend/resume if PREEMPT is
> disabled).
>
> Add IRQF_EARLY_RESUME to the timer interrupts so they are resumed in
> syscore_resume().
>
> Also add IRQF_NO_SUSPEND as it is not necessary to suspend the timer
> interrupts and IRQF_FORCE_RESUME was already set.


IRQF_NO_SUSPEND is a component of IRQF_TIMER.
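
(An illustration of the point above, as a minimal user-space sketch.
The flag values are assumed to match include/linux/interrupt.h in
kernels of this period and are copied here only so the example builds
on its own; timer_flags_v1(), timer_flags_v2() and flags_equivalent()
are hypothetical helper names, not kernel functions.)

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Assumption: these values mirror include/linux/interrupt.h from
 * kernels of this era. They are reproduced here for illustration only.
 */
#define __IRQF_TIMER       0x00000200u
#define IRQF_PERCPU        0x00000400u
#define IRQF_NOBALANCING   0x00000800u
#define IRQF_NO_SUSPEND    0x00004000u
#define IRQF_FORCE_RESUME  0x00008000u
#define IRQF_NO_THREAD     0x00010000u
#define IRQF_EARLY_RESUME  0x00020000u

/* IRQF_TIMER is a composite flag that already includes IRQF_NO_SUSPEND. */
#define IRQF_TIMER (__IRQF_TIMER | IRQF_NO_SUSPEND | IRQF_NO_THREAD)

/* Flags as passed in the first version of the patch. */
unsigned int timer_flags_v1(void)
{
	return IRQF_PERCPU | IRQF_NOBALANCING
		| IRQF_TIMER
		| IRQF_NO_SUSPEND | IRQF_FORCE_RESUME
		| IRQF_EARLY_RESUME;
}

/* The same flags without the explicit IRQF_NO_SUSPEND. */
unsigned int timer_flags_v2(void)
{
	return IRQF_PERCPU | IRQF_NOBALANCING | IRQF_TIMER
		| IRQF_FORCE_RESUME | IRQF_EARLY_RESUME;
}

/* True when dropping the explicit IRQF_NO_SUSPEND changes nothing. */
bool flags_equivalent(void)
{
	return timer_flags_v1() == timer_flags_v2();
}
```

Under these assumed definitions flags_equivalent() returns true, so
naming IRQF_NO_SUSPEND explicitly adds nothing to the flag set.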


-boris


* Re: [PATCH] x86/xen: resume timer irqs early
From: David Vrabel @ 2014-08-08 10:41 UTC
  To: Boris Ostrovsky, xen-devel

On 07/08/14 18:29, Boris Ostrovsky wrote:
> On 08/07/2014 01:16 PM, David Vrabel wrote:
>> If the timer irqs are resumed during device resume it is possible in
>> certain circumstances for the resume to hang early on, before device
>> interrupts are resumed.
>>
>> It is not entirely clear what is occurring at the point of the hang but I
>> think a task necessary for the resume calls schedule_timeout(),
>> waiting for a timer interrupt (which never arrives).  This failure may
>> require specific tasks to be running on the other VCPUs to trigger
>> (processes are not frozen during a suspend/resume if PREEMPT is
>> disabled).
>>
>> Add IRQF_EARLY_RESUME to the timer interrupts so they are resumed in
>> syscore_resume().
>>
>> Also add IRQF_NO_SUSPEND as it is not necessary to suspend the timer
>> interrupts and IRQF_FORCE_RESUME was already set.
> 
> 
> IRQF_NO_SUSPEND is a component of IRQF_TIMER.

So it is.  How about this instead?

8<----------------------------
x86/xen: resume timer irqs early

If the timer irqs are resumed during device resume it is possible in
certain circumstances for the resume to hang early on, before device
interrupts are resumed.

It is not entirely clear what is occurring at the point of the hang but I
think a task necessary for the resume calls schedule_timeout(),
waiting for a timer interrupt (which never arrives).  This failure may
require specific tasks to be running on the other VCPUs to trigger
(processes are not frozen during a suspend/resume if PREEMPT is
disabled).

Add IRQF_EARLY_RESUME to the timer interrupts so they are resumed in
syscore_resume().

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 arch/x86/xen/time.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
index 7b78f88..5718b0b 100644
--- a/arch/x86/xen/time.c
+++ b/arch/x86/xen/time.c
@@ -444,7 +444,7 @@ void xen_setup_timer(int cpu)
 
 	irq = bind_virq_to_irqhandler(VIRQ_TIMER, cpu, xen_timer_interrupt,
 				      IRQF_PERCPU|IRQF_NOBALANCING|IRQF_TIMER|
-				      IRQF_FORCE_RESUME,
+				      IRQF_FORCE_RESUME|IRQF_EARLY_RESUME,
 				      name, NULL);
 	(void)xen_set_irq_priority(irq, XEN_IRQ_PRIORITY_MAX);
 
-- 
1.7.10.4


* Re: [PATCH] x86/xen: resume timer irqs early
From: Boris Ostrovsky @ 2014-08-08 14:04 UTC
  To: David Vrabel, xen-devel

On 08/08/2014 06:41 AM, David Vrabel wrote:
> On 07/08/14 18:29, Boris Ostrovsky wrote:
>> On 08/07/2014 01:16 PM, David Vrabel wrote:
>>> If the timer irqs are resumed during device resume it is possible in
>>> certain circumstances for the resume to hang early on, before device
>>> interrupts are resumed.
>>>
>>> It is not entirely clear what is occurring at the point of the hang but I
>>> think a task necessary for the resume calls schedule_timeout(),
>>> waiting for a timer interrupt (which never arrives).  This failure may
>>> require specific tasks to be running on the other VCPUs to trigger
>>> (processes are not frozen during a suspend/resume if PREEMPT is
>>> disabled).
>>>
>>> Add IRQF_EARLY_RESUME to the timer interrupts so they are resumed in
>>> syscore_resume().
>>>
>>> Also add IRQF_NO_SUSPEND as it is not necessary to suspend the timer
>>> interrupts and IRQF_FORCE_RESUME was already set.
>>
>> IRQF_NO_SUSPEND is a component of IRQF_TIMER.
> So it is.  How about this instead?

The change makes sense so

     Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

but I am curious whether you actually were able to prove that it in fact 
fixes the hang (the description doesn't make it clear).

-boris


* Re: [PATCH] x86/xen: resume timer irqs early
From: David Vrabel @ 2014-08-08 14:35 UTC
  To: Boris Ostrovsky, xen-devel

On 08/08/14 15:04, Boris Ostrovsky wrote:
> On 08/08/2014 06:41 AM, David Vrabel wrote:
>> On 07/08/14 18:29, Boris Ostrovsky wrote:
>>> On 08/07/2014 01:16 PM, David Vrabel wrote:
>>>> If the timer irqs are resumed during device resume it is possible in
>>>> certain circumstances for the resume to hang early on, before device
>>>> interrupts are resumed.
>>>>
>>>> It is not entirely clear what is occurring at the point of the hang but I
>>>> think a task necessary for the resume calls schedule_timeout(),
>>>> waiting for a timer interrupt (which never arrives).  This failure may
>>>> require specific tasks to be running on the other VCPUs to trigger
>>>> (processes are not frozen during a suspend/resume if PREEMPT is
>>>> disabled).
>>>>
>>>> Add IRQF_EARLY_RESUME to the timer interrupts so they are resumed in
>>>> syscore_resume().
>>>>
>>>> Also add IRQF_NO_SUSPEND as it is not necessary to suspend the timer
>>>> interrupts and IRQF_FORCE_RESUME was already set.
>>>
>>> IRQF_NO_SUSPEND is a component of IRQF_TIMER.
>> So it is.  How about this instead?
> 
> The change makes sense so
> 
>     Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> 
> but I am curious whether you actually were able to prove that it in fact
> fixes the hang (the description doesn't make it clear).

Without the patch repeatedly migrating a VM would hang during resume
after < 500 iterations.  With the patch the VM was migrated > 8000 times
without a problem.

David


* Re: [PATCH] x86/xen: resume timer irqs early
From: Konrad Rzeszutek Wilk @ 2014-08-08 17:15 UTC
  To: David Vrabel; +Cc: xen-devel, Boris Ostrovsky

On Fri, Aug 08, 2014 at 03:35:27PM +0100, David Vrabel wrote:
> On 08/08/14 15:04, Boris Ostrovsky wrote:
> > On 08/08/2014 06:41 AM, David Vrabel wrote:
> >> On 07/08/14 18:29, Boris Ostrovsky wrote:
> >>> On 08/07/2014 01:16 PM, David Vrabel wrote:
> >>>> If the timer irqs are resumed during device resume it is possible in
> >>>> certain circumstances for the resume to hang early on, before device
> >>>> interrupts are resumed.
> >>>>
> >>>> It is not entirely clear what is occurring at the point of the hang but I
> >>>> think a task necessary for the resume calls schedule_timeout(),
> >>>> waiting for a timer interrupt (which never arrives).  This failure may
> >>>> require specific tasks to be running on the other VCPUs to trigger
> >>>> (processes are not frozen during a suspend/resume if PREEMPT is
> >>>> disabled).
> >>>>
> >>>> Add IRQF_EARLY_RESUME to the timer interrupts so they are resumed in
> >>>> syscore_resume().
> >>>>
> >>>> Also add IRQF_NO_SUSPEND as it is not necessary to suspend the timer
> >>>> interrupts and IRQF_FORCE_RESUME was already set.
> >>>
> >>> IRQF_NO_SUSPEND is a component of IRQF_TIMER.
> >> So it is.  How about this instead?
> > 
> > The change makes sense so
> > 
> >     Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> > 
> > but I am curious whether you actually were able to prove that it in fact
> > fixes the hang (the description doesn't make it clear).
> 
> Without the patch repeatedly migrating a VM would hang during resume
> after < 500 iterations.  With the patch the VM was migrated > 8000 times
> without a problem.

Ah, should said patch have a Reported-by too then?

It would also be neat to have that in the description of the patch I think.


* Re: [PATCH] x86/xen: resume timer irqs early
From: David Vrabel @ 2014-08-08 17:38 UTC
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel, Boris Ostrovsky

On 08/08/14 18:15, Konrad Rzeszutek Wilk wrote:
> On Fri, Aug 08, 2014 at 03:35:27PM +0100, David Vrabel wrote:
>> On 08/08/14 15:04, Boris Ostrovsky wrote:
>>> On 08/08/2014 06:41 AM, David Vrabel wrote:
>>>> On 07/08/14 18:29, Boris Ostrovsky wrote:
>>>>> On 08/07/2014 01:16 PM, David Vrabel wrote:
>>>>>> If the timer irqs are resumed during device resume it is possible in
>>>>>> certain circumstances for the resume to hang early on, before device
>>>>>> interrupts are resumed.
>>>>>>
>>>>>> It is not entirely clear what is occurring at the point of the hang but I
>>>>>> think a task necessary for the resume calls schedule_timeout(),
>>>>>> waiting for a timer interrupt (which never arrives).  This failure may
>>>>>> require specific tasks to be running on the other VCPUs to trigger
>>>>>> (processes are not frozen during a suspend/resume if PREEMPT is
>>>>>> disabled).
>>>>>>
>>>>>> Add IRQF_EARLY_RESUME to the timer interrupts so they are resumed in
>>>>>> syscore_resume().
>>>>>>
>>>>>> Also add IRQF_NO_SUSPEND as it is not necessary to suspend the timer
>>>>>> interrupts and IRQF_FORCE_RESUME was already set.
>>>>>
>>>>> IRQF_NO_SUSPEND is a component of IRQF_TIMER.
>>>> So it is.  How about this instead?
>>>
>>> The change makes sense so
>>>
>>>     Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>>
>>> but I am curious whether you actually were able to prove that it in fact
>>> fixes the hang (the description doesn't make it clear).
>>
>> Without the patch repeatedly migrating a VM would hang during resume
>> after < 500 iterations.  With the patch the VM was migrated > 8000 times
>> without a problem.
> 
> Ah, should said patch have a Reported-by too then?

I don't think the XenServer automated test system really minds.

> It would also be neat to have that in the description of the patch I think.

I'm not really keen on system-specific numbers like this, but it would
probably be useful in this case since my analysis is so woolly.

David

