* xen: IPI interrupts not resumed early enough on suspend/resume
@ 2011-10-03 15:10 Ian Campbell
From: Ian Campbell @ 2011-10-03 15:10 UTC
  To: Thomas Gleixner, Jeremy Fitzhardinge, Konrad Rzeszutek Wilk
  Cc: xen-devel, linux-kernel

Hi Thomas,

Recently I've been chasing an issue where a Xen guest will fail to
resume about 1 time in 100. I eventually managed to bisect this back to
676dc3cf5bc3 "xen: Use IRQF_FORCE_RESUME".

The Xen suspend procedure (drivers/xen/manage.c:do_suspend()) is roughly
(I've omitted some uninteresting parts) as follows:
  dpm_suspend_start()
  dpm_suspend_noirq()
  stop_machine()
   -> xen_suspend()
        syscore_suspend()
        HYPERVISOR_suspend() /* Hypercall, returns on resume */
        xen_irq_resume() /* Re-establishes evtchn<->irq bindings */
        syscore_resume()
  dpm_resume_noirq()
  dpm_resume_end()

The resume process appears to be coming to a halt at the end of the
stop_machine invocation of xen_suspend(), i.e. after syscore_resume()
but before dpm_resume_noirq().

Looking at the stack traces of all VCPUs when this happens it appears
that they are all idle, which suggests we are missing an event to cause
a reschedule out of the stop_machine thread back into the suspending
thread.

One of the effects of 676dc3cf5bc3 was to move the unmasking of the
timer and IPI interrupts from xen_irq_resume() (i.e. within the
stop_machine region) to dpm_resume_noirq() (i.e. outside the
stop_machine region). Since the IPI interrupts include the reschedule
IPI, I rather suspect that is the reason for the problem. I added a hack
to unmask the resched* IPIs at xen_irq_resume() time and so far it seems
to fix things, which backs up my gut feeling.
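
A minimal user-space sketch of the suspected ordering problem, under
stated assumptions: it is a toy model, not kernel code. The names
model_resume_completes(), unmask_point and wake_needs_ipi are
hypothetical. The model assumes that the suspending thread can only be
rescheduled by a reschedule IPI in the failing case, which is one
possible explanation for the failure being intermittent.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model of the resume ordering; none of this is kernel API. */
enum unmask_point {
	UNMASK_IN_XEN_IRQ_RESUME,	/* inside the stop_machine region */
	UNMASK_IN_DPM_RESUME_NOIRQ	/* outside the stop_machine region */
};

/*
 * Returns true if the modelled resume reaches dpm_resume_noirq().
 *
 * wake_needs_ipi is an assumption of this model: it stands for the case
 * where getting back to the suspending thread requires a reschedule IPI.
 */
bool model_resume_completes(enum unmask_point where, bool wake_needs_ipi)
{
	/* State on return from HYPERVISOR_suspend(): IPIs still masked. */
	bool resched_ipi_masked = true;
	bool suspending_thread_runs;

	/* xen_irq_resume(), still inside stop_machine. */
	if (where == UNMASK_IN_XEN_IRQ_RESUME)
		resched_ipi_masked = false;

	/* stop_machine ends; the suspending thread must be scheduled again. */
	suspending_thread_runs = !wake_needs_ipi || !resched_ipi_masked;

	/*
	 * dpm_resume_noirq() is called by the suspending thread itself, so
	 * a late unmask there never happens if that thread never runs.
	 */
	return suspending_thread_runs;
}

/* Small self-check of the three cases the model distinguishes. */
void model_check(void)
{
	assert(model_resume_completes(UNMASK_IN_XEN_IRQ_RESUME, true));
	assert(!model_resume_completes(UNMASK_IN_DPM_RESUME_NOIRQ, true));
	assert(model_resume_completes(UNMASK_IN_DPM_RESUME_NOIRQ, false));
}
```

In this model only the combination "late unmask" plus "wake-up needs an
IPI" stalls, which matches the observation that all VCPUs end up idle.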

I can see a few options for how I might go about solving this in a
non-hacky way. Which approach do you think would be preferable?

      * Add "IRQF_RESUME_EARLY", driven from syscore_resume, and use it
        for these interrupts.
      * Register syscore ops for the Xen event channel subsystem to
        unmask the IPIs earlier (would probably look a lot like the code
        removed by 676dc3cf5bc3).
      * Add syscore_ops to the Xen SMP subsystem to unmask the specific
        IPIs (which it binds at start of day) earlier.
      * Push dpm_(suspend|resume)_noirq down into the stop_machine
        region.
      * Use something other than stop_machine to quiesce the system and
        move to cpu0 for suspend (doesn't seem sensible to reproduce
        that functionality).
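
A rough sketch of the shape of the syscore-based options, as a
user-space mock rather than kernel code. Everything prefixed mock_ is a
hypothetical stand-in: mock_syscore_ops only loosely mirrors the
kernel's struct syscore_ops (reduced to a resume callback), and the
ipi_masked[] array stands in for the real event channel mask state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_NR_IPIS 4	/* arbitrary count chosen for the illustration */

/* Local stand-in for struct syscore_ops, reduced to the resume hook. */
struct mock_syscore_ops {
	void (*resume)(void);
	struct mock_syscore_ops *next;
};

static struct mock_syscore_ops *mock_ops_head;
static bool ipi_masked[MOCK_NR_IPIS];

/* Add an ops structure to the list walked by mock_syscore_resume(). */
void mock_register_syscore_ops(struct mock_syscore_ops *ops)
{
	ops->next = mock_ops_head;
	mock_ops_head = ops;
}

/* Models syscore_resume(): runs inside the stop_machine region. */
void mock_syscore_resume(void)
{
	struct mock_syscore_ops *ops;

	for (ops = mock_ops_head; ops != NULL; ops = ops->next)
		if (ops->resume)
			ops->resume();
}

/* Models the state on return from the suspend hypercall. */
void mock_mask_all_ipis(void)
{
	for (size_t i = 0; i < MOCK_NR_IPIS; i++)
		ipi_masked[i] = true;
}

bool mock_ipi_is_masked(size_t ipi)
{
	return ipi_masked[ipi];
}

/* Resume hook: unmask the IPIs early, before dpm_resume_noirq(). */
static void mock_xen_smp_resume(void)
{
	for (size_t i = 0; i < MOCK_NR_IPIS; i++)
		ipi_masked[i] = false;
}

struct mock_syscore_ops mock_xen_smp_syscore_ops = {
	.resume = mock_xen_smp_resume,
};
```

The point of the design is only the ordering: once the hook is
registered, the IPIs are unmasked from the syscore resume path, so they
are usable again before the stop_machine region ends.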

Routing IPIs through the regular IRQ path seems a little bit unusual but
it looks like powerpc does something similar in smp_request_message_ipi
and mpic_request_ipis and that code uses the syscore approach. Does
applying that here too seem sane?

Any preference / advice?

Thanks,
Ian.

