* [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them @ 2011-05-06 6:43 Tian, Kevin 2011-05-06 9:59 ` Thomas Gleixner 2011-08-29 4:15 ` [regression] Ideapad S10-3 does not wake up from suspend (Re: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them) Jonathan Nieder 0 siblings, 2 replies; 41+ messages in thread From: Tian, Kevin @ 2011-05-06 6:43 UTC (permalink / raw) To: linux-kernel, mingo, hpa, tglx; +Cc: Ian Campbell, JBeulich, xen-devel x86: don't unmask disabled irqs when migrating them it doesn't make sense to mask/unmask a disabled irq when migrating it from offlined cpu to another, because it's not expected to handle any instance of it. Current mask/set_affinity/unmask steps may trigger unexpected instance on disabled irq which then simply bug on when there is no handler for it. One failing example is observed in Xen. Xen pvops guest marks a special type of irqs as disabled, which are simply used as a notification mechanism to wake up blocked guest in event polling state. Absolutely unmask them may cause the notification instance instead injected into the guest and then cause trouble. Signed-off-by: Fengzhe Zhang <fengzhe.zhang@intel.com> Signed-off-by: Kevin Tian <kevin.tian@intel.com> CC: Thomas Gleixner <tglx@linutronix.de> CC: Ingo Molnar <mingo@redhat.com> CC: H. Peter Anvin <hpa@zytor.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Jan Beulich <JBeulich@novell.com> --- linux-2.6.39-rc6.orig/arch/x86/kernel/irq.c 2011-05-04 10:59:13.000000000 +0800 +++ linux-2.6.39-rc6/arch/x86/kernel/irq.c 2011-05-06 09:20:25.563963000 +0800 @@ -237,6 +237,7 @@ void fixup_irqs(void) for_each_irq_desc(irq, desc) { int break_affinity = 0; int set_affinity = 1; + int do_mask; const struct cpumask *affinity; if (!desc) @@ -268,7 +269,9 @@ void fixup_irqs(void) } chip = irq_data_get_irq_chip(data); - if (!irqd_can_move_in_process_context(data) && chip->irq_mask) + do_mask = !irqd_irq_disabled(data) && + !irqd_can_move_in_process_context(data) && chip->irq_mask; + if (do_mask) chip->irq_mask(data); if (chip->irq_set_affinity) @@ -276,7 +279,7 @@ void fixup_irqs(void) else if (!(warned++)) set_affinity = 0; - if (!irqd_can_move_in_process_context(data) && chip->irq_unmask) + if (do_mask) chip->irq_unmask(data); raw_spin_unlock(&desc->lock); ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them 2011-05-06 6:43 [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them Tian, Kevin @ 2011-05-06 9:59 ` Thomas Gleixner 2011-05-06 12:54 ` Tian, Kevin 2011-08-29 4:15 ` [regression] Ideapad S10-3 does not wake up from suspend (Re: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them) Jonathan Nieder 1 sibling, 1 reply; 41+ messages in thread From: Thomas Gleixner @ 2011-05-06 9:59 UTC (permalink / raw) To: Tian, Kevin; +Cc: linux-kernel, mingo, hpa, Ian Campbell, JBeulich, xen-devel On Fri, 6 May 2011, Tian, Kevin wrote: > x86: don't unmask disabled irqs when migrating them > > it doesn't make sense to mask/unmask a disabled irq when migrating it > from offlined cpu to another, because it's not expected to handle > any instance of it. Current mask/set_affinity/unmask steps may trigger > unexpected instance on disabled irq which then simply bug on when there > is no handler for it. One failing example is observed in Xen. Xen pvops So there is no handler, why the heck is there an irq action? if (!irq_has_action(irq) .... continue; Should have caught an uninitialized interrupt. If Xen abuses interrupts that way, then it rightfully explodes. And we do not fix it by magic somewhere else. > guest marks a special type of irqs as disabled, which are simply used As I explained before several times, IRQF_DISABLED has absolutely nothing to do with it and pvops _CANNOT_ mark an interrupt disabled. > > chip = irq_data_get_irq_chip(data); > - if (!irqd_can_move_in_process_context(data) && chip->irq_mask) > + do_mask = !irqd_irq_disabled(data) && > + !irqd_can_move_in_process_context(data) && chip->irq_mask; > + if (do_mask) > chip->irq_mask(data); This is completely wrong. irqd_irq_disabled() is a status information which does not tell you whether the interrupt is actually masked at the hardware level because we do lazy interrupt hardware masking. So your change would keep the line unmasked at the hardware level for all interrupts which are in the lazy disabled state. The only conditional which is interesting is the unmask path and that's a simple optimization and not a correctness problem. Thanks, tglx ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them 2011-05-06 9:59 ` Thomas Gleixner @ 2011-05-06 12:54 ` Tian, Kevin 0 siblings, 0 replies; 41+ messages in thread From: Tian, Kevin @ 2011-05-06 12:54 UTC (permalink / raw) To: Thomas Gleixner Cc: linux-kernel, mingo, hpa, Ian Campbell, JBeulich, xen-devel > From: Thomas Gleixner > Sent: Friday, May 06, 2011 6:00 PM > > On Fri, 6 May 2011, Tian, Kevin wrote: > > x86: don't unmask disabled irqs when migrating them > > > > it doesn't make sense to mask/unmask a disabled irq when migrating it > > from offlined cpu to another, because it's not expected to handle any > > instance of it. Current mask/set_affinity/unmask steps may trigger > > unexpected instance on disabled irq which then simply bug on when > > there is no handler for it. One failing example is observed in Xen. > > Xen pvops > > So there is no handler, why the heck is there an irq action? > > if (!irq_has_action(irq) .... > continue; > > Should have caught an uninitialized interrupt. If Xen abuses interrupts that way, > then it rightfully explodes. And we do not fix it by magic somewhere else. sorry that my bad description here. there does be a dummy handler registered on such irqs which simply throws out a BUG_ON when hit. I should just say such injection is not expected instead of no handler. :-) > > > guest marks a special type of irqs as disabled, which are simply used > > As I explained before several times, IRQF_DISABLED has absolutely nothing to > do with it and pvops _CANNOT_ mark an interrupt disabled. I have to admit that I need more study about whole interrupt sub-system, to better understand your explanation here. Also here again my description is not accurate enough. I meant that Xen pvops request the special irq with below flags: IRQF_DISABLED|IRQF_PERCPU|IRQF_NOBALANCING and then later explicitly disable it with disable_irq(). As you said that IRQF_DISABLED itself has nothing to do with it, and it's the later disable_irq() which takes real effect because Xen event chip hooks this callback to mask the irq from the chip level. > > > > > chip = irq_data_get_irq_chip(data); > > - if (!irqd_can_move_in_process_context(data) && chip->irq_mask) > > + do_mask = !irqd_irq_disabled(data) && > > + !irqd_can_move_in_process_context(data) && chip->irq_mask; > > + if (do_mask) > > chip->irq_mask(data); > > This is completely wrong. irqd_irq_disabled() is a status information which does > not tell you whether the interrupt is actually masked at the hardware level > because we do lazy interrupt hardware masking. So your change would keep > the line unmasked at the hardware level for all interrupts which are in the lazy > disabled state. Got it. > > The only conditional which is interesting is the unmask path and that's a simple > optimization and not a correctness problem. > So what's your suggestion based on my updated information? Is there any interface I may take to differentiate above exception with normal case? Basically in Xen usage we want such irqs permanently disabled at the chip level. Or could we only do mask/unmask for irqs which are unmasked atm if as you said it's just an optimization step? :-) Thanks Kevin ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them @ 2011-05-06 12:54 ` Tian, Kevin 0 siblings, 0 replies; 41+ messages in thread From: Tian, Kevin @ 2011-05-06 12:54 UTC (permalink / raw) To: Thomas Gleixner Cc: Ian, linux-kernel, JBeulich, Campbell, mingo, hpa, xen-devel > From: Thomas Gleixner > Sent: Friday, May 06, 2011 6:00 PM > > On Fri, 6 May 2011, Tian, Kevin wrote: > > x86: don't unmask disabled irqs when migrating them > > > > it doesn't make sense to mask/unmask a disabled irq when migrating it > > from offlined cpu to another, because it's not expected to handle any > > instance of it. Current mask/set_affinity/unmask steps may trigger > > unexpected instance on disabled irq which then simply bug on when > > there is no handler for it. One failing example is observed in Xen. > > Xen pvops > > So there is no handler, why the heck is there an irq action? > > if (!irq_has_action(irq) .... > continue; > > Should have caught an uninitialized interrupt. If Xen abuses interrupts that way, > then it rightfully explodes. And we do not fix it by magic somewhere else. sorry that my bad description here. there does be a dummy handler registered on such irqs which simply throws out a BUG_ON when hit. I should just say such injection is not expected instead of no handler. :-) > > > guest marks a special type of irqs as disabled, which are simply used > > As I explained before several times, IRQF_DISABLED has absolutely nothing to > do with it and pvops _CANNOT_ mark an interrupt disabled. I have to admit that I need more study about whole interrupt sub-system, to better understand your explanation here. Also here again my description is not accurate enough. I meant that Xen pvops request the special irq with below flags: IRQF_DISABLED|IRQF_PERCPU|IRQF_NOBALANCING and then later explicitly disable it with disable_irq(). As you said that IRQF_DISABLED itself has nothing to do with it, and it's the later disable_irq() which takes real effect because Xen event chip hooks this callback to mask the irq from the chip level. > > > > > chip = irq_data_get_irq_chip(data); > > - if (!irqd_can_move_in_process_context(data) && chip->irq_mask) > > + do_mask = !irqd_irq_disabled(data) && > > + !irqd_can_move_in_process_context(data) && chip->irq_mask; > > + if (do_mask) > > chip->irq_mask(data); > > This is completely wrong. irqd_irq_disabled() is a status information which does > not tell you whether the interrupt is actually masked at the hardware level > because we do lazy interrupt hardware masking. So your change would keep > the line unmasked at the hardware level for all interrupts which are in the lazy > disabled state. Got it. > > The only conditional which is interesting is the unmask path and that's a simple > optimization and not a correctness problem. > So what's your suggestion based on my updated information? Is there any interface I may take to differentiate above exception with normal case? Basically in Xen usage we want such irqs permanently disabled at the chip level. Or could we only do mask/unmask for irqs which are unmasked atm if as you said it's just an optimization step? :-) Thanks Kevin ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [Xen-devel] RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them 2011-05-06 12:54 ` Tian, Kevin (?) @ 2011-05-06 13:14 ` Tian, Kevin -1 siblings, 0 replies; 41+ messages in thread From: Tian, Kevin @ 2011-05-06 13:14 UTC (permalink / raw) To: Tian, Kevin, Thomas Gleixner Cc: linux-kernel, JBeulich, Campbell, mingo, hpa, xen-devel > From: Tian, Kevin > Sent: Friday, May 06, 2011 8:55 PM > > > From: Thomas Gleixner > > Sent: Friday, May 06, 2011 6:00 PM > > > > On Fri, 6 May 2011, Tian, Kevin wrote: > > > x86: don't unmask disabled irqs when migrating them > > > > > > it doesn't make sense to mask/unmask a disabled irq when migrating > > > it from offlined cpu to another, because it's not expected to handle > > > any instance of it. Current mask/set_affinity/unmask steps may > > > trigger unexpected instance on disabled irq which then simply bug on > > > when there is no handler for it. One failing example is observed in Xen. > > > Xen pvops > > > > So there is no handler, why the heck is there an irq action? > > > > if (!irq_has_action(irq) .... > > continue; > > > > Should have caught an uninitialized interrupt. If Xen abuses > > interrupts that way, then it rightfully explodes. And we do not fix it by magic > somewhere else. > > sorry that my bad description here. there does be a dummy handler registered > on such irqs which simply throws out a BUG_ON when hit. I should just say such > injection is not expected instead of no handler. :-) > > > > > > guest marks a special type of irqs as disabled, which are simply > > > used > > > > As I explained before several times, IRQF_DISABLED has absolutely > > nothing to do with it and pvops _CANNOT_ mark an interrupt disabled. > > I have to admit that I need more study about whole interrupt sub-system, to > better understand your explanation here. Also here again my description is not > accurate enough. I meant that Xen pvops request the special irq with below > flags: > IRQF_DISABLED|IRQF_PERCPU|IRQF_NOBALANCING > and then later explicitly disable it with disable_irq(). As you said that > IRQF_DISABLED itself has nothing to do with it, and it's the later disable_irq() > which takes real effect because Xen event chip hooks this callback to mask the > irq from the chip level. > > > > > > > > > chip = irq_data_get_irq_chip(data); > > > - if (!irqd_can_move_in_process_context(data) && chip->irq_mask) > > > + do_mask = !irqd_irq_disabled(data) && > > > + !irqd_can_move_in_process_context(data) && > chip->irq_mask; > > > + if (do_mask) > > > chip->irq_mask(data); > > > > This is completely wrong. irqd_irq_disabled() is a status information > > which does not tell you whether the interrupt is actually masked at > > the hardware level because we do lazy interrupt hardware masking. So > > your change would keep the line unmasked at the hardware level for all > > interrupts which are in the lazy disabled state. > > Got it. > > > > > The only conditional which is interesting is the unmask path and > > that's a simple optimization and not a correctness problem. > > > > So what's your suggestion based on my updated information? Is there any > interface I may take to differentiate above exception with normal case? > Basically in Xen usage we want such irqs permanently disabled at the chip level. > Or could we only do mask/unmask for irqs which are unmasked atm if as you > said it's just an optimization step? :-) > After another thought, I just got myself messed. This is not required by Xen actually, as the 1st patch to handle IRQF_PERCPU has been enough to cover the special requirement. Could you ack the 1st one if you agree? This 2nd patch is cooked upon the question whether generally a irq disabled at chip level should be unmasked at migration. If there's no action registered, it won't be migrated. The only question is if there does be action registered, and this irq is disabled at chip level, will there be any issue to force unmask it at migration, or should we only mask/unmask for a irq which is unmasked at the migration? Thanks Kevin ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them 2011-05-06 12:54 ` Tian, Kevin (?) (?) @ 2011-05-06 13:24 ` Thomas Gleixner 2011-05-06 14:04 ` Ian Campbell -1 siblings, 1 reply; 41+ messages in thread From: Thomas Gleixner @ 2011-05-06 13:24 UTC (permalink / raw) To: Tian, Kevin; +Cc: linux-kernel, mingo, hpa, Ian Campbell, JBeulich, xen-devel On Fri, 6 May 2011, Tian, Kevin wrote: > > From: Thomas Gleixner > > Sent: Friday, May 06, 2011 6:00 PM > > > > On Fri, 6 May 2011, Tian, Kevin wrote: > > > x86: don't unmask disabled irqs when migrating them > > > > > > it doesn't make sense to mask/unmask a disabled irq when migrating it > > > from offlined cpu to another, because it's not expected to handle any > > > instance of it. Current mask/set_affinity/unmask steps may trigger > > > unexpected instance on disabled irq which then simply bug on when > > > there is no handler for it. One failing example is observed in Xen. > > > Xen pvops > > > > So there is no handler, why the heck is there an irq action? > > > > if (!irq_has_action(irq) .... > > continue; > > > > Should have caught an uninitialized interrupt. If Xen abuses interrupts that way, > > then it rightfully explodes. And we do not fix it by magic somewhere else. > > sorry that my bad description here. there does be a dummy handler registered > on such irqs which simply throws out a BUG_ON when hit. I should just say such > injection is not expected instead of no handler. :-) So can please someone point me to that particular incarnation of nonsense and provide a reasonable explanation for this abuse? What is the point of an interrupt, which is permanently disabled, has a handler with a BUG() inside and an irqaction assigned ? What's the purpose of this? Why is the irqaction there in the first place? To be called by some other weird means than by the irq handling code? > > The only conditional which is interesting is the unmask path and that's a simple > > optimization and not a correctness problem. > > > > So what's your suggestion based on my updated information? Is there any > interface I may take to differentiate above exception with normal case? Basically > in Xen usage we want such irqs permanently disabled at the chip level. Or > could we only do mask/unmask for irqs which are unmasked atm if as you said > it's just an optimization step? :-) No we can make the unmask conditional on !irqd_irq_disabled() because that's not violating any of the semantics. The interrupt would be masked anyway when it arrives and the handler code sees that it is lazy disabled. I mean real handler code, not the Xen abomination. The only valid reason why I'd apply that patch is that it avoids a potential extra interrupt, but not to prevent screwed up handlers from exploding. Thanks, tglx ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them 2011-05-06 13:24 ` Thomas Gleixner @ 2011-05-06 14:04 ` Ian Campbell 2011-05-08 1:44 ` Jeremy Fitzhardinge 0 siblings, 1 reply; 41+ messages in thread From: Ian Campbell @ 2011-05-06 14:04 UTC (permalink / raw) To: Thomas Gleixner Cc: Tian, Kevin, linux-kernel, mingo, hpa, JBeulich, xen-devel, Jeremy Fitzhardinge On Fri, 2011-05-06 at 14:24 +0100, Thomas Gleixner wrote: > On Fri, 6 May 2011, Tian, Kevin wrote: > > > From: Thomas Gleixner > > > Sent: Friday, May 06, 2011 6:00 PM > > > > > > On Fri, 6 May 2011, Tian, Kevin wrote: > > > > x86: don't unmask disabled irqs when migrating them > > > > > > > > it doesn't make sense to mask/unmask a disabled irq when migrating it > > > > from offlined cpu to another, because it's not expected to handle any > > > > instance of it. Current mask/set_affinity/unmask steps may trigger > > > > unexpected instance on disabled irq which then simply bug on when > > > > there is no handler for it. One failing example is observed in Xen. > > > > Xen pvops > > > > > > So there is no handler, why the heck is there an irq action? > > > > > > if (!irq_has_action(irq) .... > > > continue; > > > > > > Should have caught an uninitialized interrupt. If Xen abuses interrupts that way, > > > then it rightfully explodes. And we do not fix it by magic somewhere else. > > > > sorry that my bad description here. there does be a dummy handler registered > > on such irqs which simply throws out a BUG_ON when hit. I should just say such > > injection is not expected instead of no handler. :-) > > So can please someone point me to that particular incarnation of > nonsense and provide a reasonable explanation for this abuse? > > What is the point of an interrupt, which is permanently disabled, has > a handler with a BUG() inside and an irqaction assigned ? > > What's the purpose of this? Why is the irqaction there in the first > place? To be called by some other weird means than by the irq > handling code? The Xen PV spinlock code (arch/x86/xen/spinlock.c) allocates an IRQ (per-cpu lock_kicker_irq). I think it is there purely in order to have the associated underlying evtchn to use as the thing to poll (Xen has an evtchn poll hypercall, see xen_poll_irq()) on the slow path and kick on release. There is never any need to call a handler for that evtchn -- just notifying the evtchn is enough to wake the poller. The irq is setup using request_irq(). Is there a different API to register an IRQ without attaching a handler/action to it? (I can't think why such a thing would exist). I'm not really sure why these can't just be an evtchn without an associated IRQ since it doesn't really have any interrupt-like semantics. Perhaps just a general desire to keep event channels abstracted into the core Xen event subsystem with IRQs as the public facing API? Jeremy? Ian. > > > The only conditional which is interesting is the unmask path and that's a simple > > > optimization and not a correctness problem. > > > > > > > So what's your suggestion based on my updated information? Is there any > > interface I may take to differentiate above exception with normal case? Basically > > in Xen usage we want such irqs permanently disabled at the chip level. Or > > could we only do mask/unmask for irqs which are unmasked atm if as you said > > it's just an optimization step? :-) > > No we can make the unmask conditional on !irqd_irq_disabled() because > that's not violating any of the semantics. The interrupt would be > masked anyway when it arrives and the handler code sees that it is > lazy disabled. I mean real handler code, not the Xen abomination. > > The only valid reason why I'd apply that patch is that it avoids a > potential extra interrupt, but not to prevent screwed up handlers from > exploding. > > Thanks, > > tglx ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them 2011-05-06 14:04 ` Ian Campbell @ 2011-05-08 1:44 ` Jeremy Fitzhardinge 0 siblings, 0 replies; 41+ messages in thread From: Jeremy Fitzhardinge @ 2011-05-08 1:44 UTC (permalink / raw) To: Ian Campbell Cc: Thomas Gleixner, Tian, Kevin, linux-kernel, mingo, hpa, JBeulich, xen-devel On 05/07/2011 12:04 AM, Ian Campbell wrote: > I'm not really sure why these can't just be an evtchn without an > associated IRQ since it doesn't really have any interrupt-like > semantics. Perhaps just a general desire to keep event channels > abstracted into the core Xen event subsystem with IRQs as the public > facing API? Jeremy? It doesn't really need to be an irq. The main reason was so that it would appear in /proc/interrupts so I could use the counter as a "number of times a spinlock was kicked" counter. That could be exposed in some other way if being part of the interrupt infrastructure brings too much baggage with it. J ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them @ 2011-05-08 1:44 ` Jeremy Fitzhardinge 0 siblings, 0 replies; 41+ messages in thread From: Jeremy Fitzhardinge @ 2011-05-08 1:44 UTC (permalink / raw) To: Ian Campbell Cc: Tian, Kevin, xen-devel, linux-kernel, JBeulich, mingo, hpa, Thomas Gleixner On 05/07/2011 12:04 AM, Ian Campbell wrote: > I'm not really sure why these can't just be an evtchn without an > associated IRQ since it doesn't really have any interrupt-like > semantics. Perhaps just a general desire to keep event channels > abstracted into the core Xen event subsystem with IRQs as the public > facing API? Jeremy? It doesn't really need to be an irq. The main reason was so that it would appear in /proc/interrupts so I could use the counter as a "number of times a spinlock was kicked" counter. That could be exposed in some other way if being part of the interrupt infrastructure brings too much baggage with it. J ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them 2011-05-08 1:44 ` Jeremy Fitzhardinge @ 2011-05-09 0:44 ` Tian, Kevin -1 siblings, 0 replies; 41+ messages in thread From: Tian, Kevin @ 2011-05-09 0:44 UTC (permalink / raw) To: Jeremy Fitzhardinge, Ian Campbell Cc: Thomas Gleixner, linux-kernel, mingo, hpa, JBeulich, xen-devel [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1100 bytes --] > From: Jeremy Fitzhardinge [mailto:jeremy@goop.org] > Sent: Sunday, May 08, 2011 9:45 AM > > On 05/07/2011 12:04 AM, Ian Campbell wrote: > > I'm not really sure why these can't just be an evtchn without an > > associated IRQ since it doesn't really have any interrupt-like > > semantics. Perhaps just a general desire to keep event channels > > abstracted into the core Xen event subsystem with IRQs as the public > > facing API? Jeremy? > > It doesn't really need to be an irq. The main reason was so that it would > appear in /proc/interrupts so I could use the counter as a "number of times a > spinlock was kicked" counter. That could be exposed in some other way if > being part of the interrupt infrastructure brings too much baggage with it. > Perhaps we don't need an irq binding here. Just like a local APIC interrupt source which only needs vector. Somehow the virq or vipi concept in Xen context is similar. Thanks Kevin ÿôèº{.nÇ+·®+%Ëÿ±éݶ\x17¥wÿº{.nÇ+·¥{±þG«éÿ{ayº\x1dÊÚë,j\a¢f£¢·hïêÿêçz_è®\x03(éÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?¨èÚ&£ø§~á¶iOæ¬z·vØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?I¥ ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them @ 2011-05-09 0:44 ` Tian, Kevin 0 siblings, 0 replies; 41+ messages in thread From: Tian, Kevin @ 2011-05-09 0:44 UTC (permalink / raw) To: Jeremy Fitzhardinge, Ian Campbell Cc: xen-devel, linux-kernel, JBeulich, mingo, hpa, Thomas Gleixner [-- Attachment #1: Type: text/plain, Size: 962 bytes --] > From: Jeremy Fitzhardinge [mailto:jeremy@goop.org] > Sent: Sunday, May 08, 2011 9:45 AM > > On 05/07/2011 12:04 AM, Ian Campbell wrote: > > I'm not really sure why these can't just be an evtchn without an > > associated IRQ since it doesn't really have any interrupt-like > > semantics. Perhaps just a general desire to keep event channels > > abstracted into the core Xen event subsystem with IRQs as the public > > facing API? Jeremy? > > It doesn't really need to be an irq. The main reason was so that it would > appear in /proc/interrupts so I could use the counter as a "number of times a > spinlock was kicked" counter. That could be exposed in some other way if > being part of the interrupt infrastructure brings too much baggage with it. > Perhaps we don't need an irq binding here. Just like a local APIC interrupt source which only needs vector. Somehow the virq or vipi concept in Xen context is similar. Thanks Kevin [-- Attachment #2: Type: text/plain, Size: 138 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them 2011-05-09 0:44 ` Tian, Kevin @ 2011-05-09 1:45 ` Jeremy Fitzhardinge -1 siblings, 0 replies; 41+ messages in thread From: Jeremy Fitzhardinge @ 2011-05-09 1:45 UTC (permalink / raw) To: Tian, Kevin Cc: Ian Campbell, Thomas Gleixner, linux-kernel, mingo, hpa, JBeulich, xen-devel On 05/09/2011 10:44 AM, Tian, Kevin wrote: >> It doesn't really need to be an irq. The main reason was so that it would >> appear in /proc/interrupts so I could use the counter as a "number of times a >> spinlock was kicked" counter. That could be exposed in some other way if >> being part of the interrupt infrastructure brings too much baggage with it. >> > Perhaps we don't need an irq binding here. Just like a local APIC interrupt > source which only needs vector. Somehow the virq or vipi concept in Xen > context is similar. An event channel is logically equivalent to a vector, so that would make sense. We currently allocate irqs for cross-cpu call and reschedule event channels, whereas native x86 simply uses a naked vector for those. But they are real interrupts, so an irq at least makes some logical sense in those cases. For spinlocks, the event channel is more like a usermode-level signal which is always blocked and only ever tested with sigsuspend (or is it sigpoll? something like that). J ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them @ 2011-05-09 1:45 ` Jeremy Fitzhardinge 0 siblings, 0 replies; 41+ messages in thread From: Jeremy Fitzhardinge @ 2011-05-09 1:45 UTC (permalink / raw) To: Tian, Kevin Cc: xen-devel, linux-kernel, JBeulich, Ian Campbell, mingo, hpa, Thomas Gleixner On 05/09/2011 10:44 AM, Tian, Kevin wrote: >> It doesn't really need to be an irq. The main reason was so that it would >> appear in /proc/interrupts so I could use the counter as a "number of times a >> spinlock was kicked" counter. That could be exposed in some other way if >> being part of the interrupt infrastructure brings too much baggage with it. >> > Perhaps we don't need an irq binding here. Just like a local APIC interrupt > source which only needs vector. Somehow the virq or vipi concept in Xen > context is similar. An event channel is logically equivalent to a vector, so that would make sense. We currently allocate irqs for cross-cpu call and reschedule event channels, whereas native x86 simply uses a naked vector for those. But they are real interrupts, so an irq at least makes some logical sense in those cases. For spinlocks, the event channel is more like a usermode-level signal which is always blocked and only ever tested with sigsuspend (or is it sigpoll? something like that). J ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them 2011-05-06 12:54 ` Tian, Kevin @ 2011-05-06 14:28 ` Stefano Stabellini -1 siblings, 0 replies; 41+ messages in thread From: Stefano Stabellini @ 2011-05-06 14:28 UTC (permalink / raw) To: Tian, Kevin Cc: Thomas Gleixner, linux-kernel, mingo, hpa, Ian Campbell, JBeulich, xen-devel On Fri, 6 May 2011, Tian, Kevin wrote: > > From: Thomas Gleixner > > Sent: Friday, May 06, 2011 6:00 PM > > > > On Fri, 6 May 2011, Tian, Kevin wrote: > > > x86: don't unmask disabled irqs when migrating them > > > > > > it doesn't make sense to mask/unmask a disabled irq when migrating it > > > from offlined cpu to another, because it's not expected to handle any > > > instance of it. Current mask/set_affinity/unmask steps may trigger > > > unexpected instance on disabled irq which then simply bug on when > > > there is no handler for it. One failing example is observed in Xen. > > > Xen pvops > > > > So there is no handler, why the heck is there an irq action? > > > > if (!irq_has_action(irq) .... > > continue; > > > > Should have caught an uninitialized interrupt. If Xen abuses interrupts that way, > > then it rightfully explodes. And we do not fix it by magic somewhere else. > > sorry that my bad description here. there does be a dummy handler registered > on such irqs which simply throws out a BUG_ON when hit. I should just say such > injection is not expected instead of no handler. :-) I don't think this patch is necessary anymore after the event channel handling cleanup patches I have just sent to the list. Could you please try the following two patches: http://marc.info/?l=linux-kernel&m=130468120032172&w=2 http://marc.info/?l=linux-kernel&m=130468178200468&w=2 and let me know if you still need this patch? ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them @ 2011-05-06 14:28 ` Stefano Stabellini 0 siblings, 0 replies; 41+ messages in thread From: Stefano Stabellini @ 2011-05-06 14:28 UTC (permalink / raw) To: Tian, Kevin Cc: xen-devel, linux-kernel, JBeulich, Ian Campbell, mingo, hpa, Thomas Gleixner On Fri, 6 May 2011, Tian, Kevin wrote: > > From: Thomas Gleixner > > Sent: Friday, May 06, 2011 6:00 PM > > > > On Fri, 6 May 2011, Tian, Kevin wrote: > > > x86: don't unmask disabled irqs when migrating them > > > > > > it doesn't make sense to mask/unmask a disabled irq when migrating it > > > from offlined cpu to another, because it's not expected to handle any > > > instance of it. Current mask/set_affinity/unmask steps may trigger > > > unexpected instance on disabled irq which then simply bug on when > > > there is no handler for it. One failing example is observed in Xen. > > > Xen pvops > > > > So there is no handler, why the heck is there an irq action? > > > > if (!irq_has_action(irq) .... > > continue; > > > > Should have caught an uninitialized interrupt. If Xen abuses interrupts that way, > > then it rightfully explodes. And we do not fix it by magic somewhere else. > > sorry that my bad description here. there does be a dummy handler registered > on such irqs which simply throws out a BUG_ON when hit. I should just say such > injection is not expected instead of no handler. :-) I don't think this patch is necessary anymore after the event channel handling cleanup patches I have just sent to the list. Could you please try the following two patches: http://marc.info/?l=linux-kernel&m=130468120032172&w=2 http://marc.info/?l=linux-kernel&m=130468178200468&w=2 and let me know if you still need this patch? ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them 2011-05-06 14:28 ` Stefano Stabellini (?) @ 2011-05-06 21:43 ` Tian, Kevin -1 siblings, 0 replies; 41+ messages in thread From: Tian, Kevin @ 2011-05-06 21:43 UTC (permalink / raw) To: Stefano Stabellini Cc: Thomas Gleixner, linux-kernel, mingo, hpa, Ian Campbell, JBeulich, xen-devel > From: Stefano Stabellini [mailto:stefano.stabellini@eu.citrix.com] > Sent: Friday, May 06, 2011 10:29 PM > > On Fri, 6 May 2011, Tian, Kevin wrote: > > > From: Thomas Gleixner > > > Sent: Friday, May 06, 2011 6:00 PM > > > > > > On Fri, 6 May 2011, Tian, Kevin wrote: > > > > x86: don't unmask disabled irqs when migrating them > > > > > > > > it doesn't make sense to mask/unmask a disabled irq when migrating > > > > it from offlined cpu to another, because it's not expected to > > > > handle any instance of it. Current mask/set_affinity/unmask steps > > > > may trigger unexpected instance on disabled irq which then simply > > > > bug on when there is no handler for it. One failing example is observed in > Xen. > > > > Xen pvops > > > > > > So there is no handler, why the heck is there an irq action? > > > > > > if (!irq_has_action(irq) .... > > > continue; > > > > > > Should have caught an uninitialized interrupt. If Xen abuses > > > interrupts that way, then it rightfully explodes. And we do not fix it by magic > somewhere else. > > > > sorry that my bad description here. there does be a dummy handler > > registered on such irqs which simply throws out a BUG_ON when hit. I > > should just say such injection is not expected instead of no handler. > > :-) > > I don't think this patch is necessary anymore after the event channel handling > cleanup patches I have just sent to the list. > Could you please try the following two patches: > > http://marc.info/?l=linux-kernel&m=130468120032172&w=2 > http://marc.info/?l=linux-kernel&m=130468178200468&w=2 > > and let me know if you still need this patch? thanks, and I'll take a look at them. Thanks Kevin ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them 2011-05-06 14:28 ` Stefano Stabellini (?) (?) @ 2011-05-09 2:11 ` Tian, Kevin 2011-05-09 12:02 ` Stefano Stabellini -1 siblings, 1 reply; 41+ messages in thread From: Tian, Kevin @ 2011-05-09 2:11 UTC (permalink / raw) To: Stefano Stabellini Cc: Thomas Gleixner, linux-kernel, mingo, hpa, Ian Campbell, JBeulich, xen-devel > From: Stefano Stabellini [mailto:stefano.stabellini@eu.citrix.com] > Sent: Friday, May 06, 2011 10:29 PM > > On Fri, 6 May 2011, Tian, Kevin wrote: > > > From: Thomas Gleixner > > > Sent: Friday, May 06, 2011 6:00 PM > > > > > > On Fri, 6 May 2011, Tian, Kevin wrote: > > > > x86: don't unmask disabled irqs when migrating them > > > > > > > > it doesn't make sense to mask/unmask a disabled irq when migrating > > > > it from offlined cpu to another, because it's not expected to > > > > handle any instance of it. Current mask/set_affinity/unmask steps > > > > may trigger unexpected instance on disabled irq which then simply > > > > bug on when there is no handler for it. One failing example is observed in > Xen. > > > > Xen pvops > > > > > > So there is no handler, why the heck is there an irq action? > > > > > > if (!irq_has_action(irq) .... > > > continue; > > > > > > Should have caught an uninitialized interrupt. If Xen abuses > > > interrupts that way, then it rightfully explodes. And we do not fix it by magic > somewhere else. > > > > sorry that my bad description here. there does be a dummy handler > > registered on such irqs which simply throws out a BUG_ON when hit. I > > should just say such injection is not expected instead of no handler. > > :-) > > I don't think this patch is necessary anymore after the event channel handling > cleanup patches I have just sent to the list. > Could you please try the following two patches: > > http://marc.info/?l=linux-kernel&m=130468120032172&w=2 > http://marc.info/?l=linux-kernel&m=130468178200468&w=2 > > and let me know if you still need this patch? yes, with your patch this issue disappears, since you explicitly make mask/unmask as a nop for xen_percpu_chip, which effectively avoids them from undesired unmask when doing the migration. Though it works, it's not intuitive as to me it's an workaround to make Xen chip implementation adapting to specific fixup_irqs logic. If the logic within fixup_irqs is changed in the future, it's still possible to expose this issue again. If we explicitly avoid the percpu irqs in the 1st place in fixup_irqs like what my 1st patch does, it'd be clearer. But based on your referred patchesand Jeremy's info about possible unbinding xenn event from irq namespace, my patches are not that necessary now as before. So I'll leave to Thomas to decide whether he wants to take my 1st patch. the 2nd one talked in this thread could be dropped. Thanks Kevin ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them 2011-05-09 2:11 ` Tian, Kevin @ 2011-05-09 12:02 ` Stefano Stabellini 2011-05-09 12:36 ` Thomas Gleixner 0 siblings, 1 reply; 41+ messages in thread From: Stefano Stabellini @ 2011-05-09 12:02 UTC (permalink / raw) To: Tian, Kevin Cc: Stefano Stabellini, Thomas Gleixner, linux-kernel, mingo, hpa, Ian Campbell, JBeulich, xen-devel On Mon, 9 May 2011, Tian, Kevin wrote: > yes, with your patch this issue disappears, since you explicitly make mask/unmask as > a nop for xen_percpu_chip, which effectively avoids them from undesired unmask > when doing the migration. Though it works, it's not intuitive as to me it's an > workaround to make Xen chip implementation adapting to specific fixup_irqs logic. I have been tring to follow the example of existing supported drivers. The only x86 driver I could find that uses handle_percpu_irq is uv_irq that does exatly the same thing. ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them 2011-05-09 12:02 ` Stefano Stabellini @ 2011-05-09 12:36 ` Thomas Gleixner 2011-05-10 3:24 ` Tian, Kevin 0 siblings, 1 reply; 41+ messages in thread From: Thomas Gleixner @ 2011-05-09 12:36 UTC (permalink / raw) To: Stefano Stabellini Cc: Tian, Kevin, linux-kernel, mingo, hpa, Ian Campbell, JBeulich, xen-devel On Mon, 9 May 2011, Stefano Stabellini wrote: > On Mon, 9 May 2011, Tian, Kevin wrote: > > yes, with your patch this issue disappears, since you explicitly make mask/unmask as > > a nop for xen_percpu_chip, which effectively avoids them from undesired unmask > > when doing the migration. Though it works, it's not intuitive as to me it's an > > workaround to make Xen chip implementation adapting to specific fixup_irqs logic. > > I have been tring to follow the example of existing supported drivers. > The only x86 driver I could find that uses handle_percpu_irq is uv_irq > that does exatly the same thing. Which is a good enough argument to make that change at the common code level instead of having fancy workarounds here and there. Thanks, tglx ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them 2011-05-09 12:36 ` Thomas Gleixner @ 2011-05-10 3:24 ` Tian, Kevin 2011-05-18 23:49 ` Tian, Kevin 0 siblings, 1 reply; 41+ messages in thread From: Tian, Kevin @ 2011-05-10 3:24 UTC (permalink / raw) To: Thomas Gleixner, Stefano Stabellini Cc: linux-kernel, mingo, hpa, Ian Campbell, JBeulich, xen-devel > From: Thomas Gleixner [mailto:tglx@linutronix.de] > Sent: Monday, May 09, 2011 8:37 PM > > On Mon, 9 May 2011, Stefano Stabellini wrote: > > > On Mon, 9 May 2011, Tian, Kevin wrote: > > > yes, with your patch this issue disappears, since you explicitly > > > make mask/unmask as a nop for xen_percpu_chip, which effectively > > > avoids them from undesired unmask when doing the migration. Though > > > it works, it's not intuitive as to me it's an workaround to make Xen chip > implementation adapting to specific fixup_irqs logic. > > > > I have been tring to follow the example of existing supported drivers. > > The only x86 driver I could find that uses handle_percpu_irq is uv_irq > > that does exatly the same thing. > > Which is a good enough argument to make that change at the common code > level instead of having fancy workarounds here and there. > So Thomas, what's your suggestion to continue here? Is my original patch to skip percpu irq in common code a good option to go, or you want a cleaner code in other form? Once it's clear I'll discuss with Stefano e.g. possibly merge with his cleanup patch series. :-) Thanks Kevin ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them 2011-05-10 3:24 ` Tian, Kevin @ 2011-05-18 23:49 ` Tian, Kevin 0 siblings, 0 replies; 41+ messages in thread From: Tian, Kevin @ 2011-05-18 23:49 UTC (permalink / raw) To: Thomas Gleixner, Stefano Stabellini Cc: linux-kernel, mingo, hpa, Ian Campbell, JBeulich, xen-devel > From: Tian, Kevin > Sent: Tuesday, May 10, 2011 11:24 AM > > > From: Thomas Gleixner [mailto:tglx@linutronix.de] > > Sent: Monday, May 09, 2011 8:37 PM > > > > On Mon, 9 May 2011, Stefano Stabellini wrote: > > > > > On Mon, 9 May 2011, Tian, Kevin wrote: > > > > yes, with your patch this issue disappears, since you explicitly > > > > make mask/unmask as a nop for xen_percpu_chip, which effectively > > > > avoids them from undesired unmask when doing the migration. Though > > > > it works, it's not intuitive as to me it's an workaround to make > > > > Xen chip > > implementation adapting to specific fixup_irqs logic. > > > > > > I have been tring to follow the example of existing supported drivers. > > > The only x86 driver I could find that uses handle_percpu_irq is > > > uv_irq that does exatly the same thing. > > > > Which is a good enough argument to make that change at the common code > > level instead of having fancy workarounds here and there. > > > > So Thomas, what's your suggestion to continue here? Is my original patch to > skip percpu irq in common code a good option to go, or you want a cleaner code > in other form? Once it's clear I'll discuss with Stefano e.g. possibly merge with > his cleanup patch series. :-) > Hi, Thomas, any response on this? Sorry that you may explain your comments clearly in earlier thread, but I may not catch all of them in one place. I sent out two fixes here: [1/2] don't move irq when it's percpu type [2/2] don't unmask irq when it's disabled at irqchip level for [2/2], as you explained it's legitimate to mask/unmask a disabled irq since from irqchip level mask/disable actually means same thing. The only possible gain to do that is to avoid a potential extra interrupt, which is a different purpose as what I originally target. So for [2/2] I think it's not required now. for [1/2] I think it's still necessary as it's meaningless to migrate a percpu type irq. However Stefano has sent out a cleanup patch for Xen percpu irqchip which uses nop mask/unmask hack borrowed from uv machine to work around the issue. As you suggested it's better to consolidate into the common place instead of scattering in different places. My view on this common logic is what [1/2] tries to address, is it correct? If yes, would you consider taking this patch? Stefano told me that his patches will go in in next merge window. So I think either you can take [1/2] now and then I'll do cleanup after Stefano's patch is in, or I can rebase my [1/2] after Stefano's patch to clean both xen and uv parts. Let me know your thought. Thanks Kevin ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them @ 2011-05-18 23:49 ` Tian, Kevin 0 siblings, 0 replies; 41+ messages in thread From: Tian, Kevin @ 2011-05-18 23:49 UTC (permalink / raw) To: Thomas Gleixner, Stefano Stabellini Cc: Ian, linux-kernel, JBeulich, Campbell, mingo, hpa, xen-devel > From: Tian, Kevin > Sent: Tuesday, May 10, 2011 11:24 AM > > > From: Thomas Gleixner [mailto:tglx@linutronix.de] > > Sent: Monday, May 09, 2011 8:37 PM > > > > On Mon, 9 May 2011, Stefano Stabellini wrote: > > > > > On Mon, 9 May 2011, Tian, Kevin wrote: > > > > yes, with your patch this issue disappears, since you explicitly > > > > make mask/unmask as a nop for xen_percpu_chip, which effectively > > > > avoids them from undesired unmask when doing the migration. Though > > > > it works, it's not intuitive as to me it's an workaround to make > > > > Xen chip > > implementation adapting to specific fixup_irqs logic. > > > > > > I have been tring to follow the example of existing supported drivers. > > > The only x86 driver I could find that uses handle_percpu_irq is > > > uv_irq that does exatly the same thing. > > > > Which is a good enough argument to make that change at the common code > > level instead of having fancy workarounds here and there. > > > > So Thomas, what's your suggestion to continue here? Is my original patch to > skip percpu irq in common code a good option to go, or you want a cleaner code > in other form? Once it's clear I'll discuss with Stefano e.g. possibly merge with > his cleanup patch series. :-) > Hi, Thomas, any response on this? Sorry that you may explain your comments clearly in earlier thread, but I may not catch all of them in one place. I sent out two fixes here: [1/2] don't move irq when it's percpu type [2/2] don't unmask irq when it's disabled at irqchip level for [2/2], as you explained it's legitimate to mask/unmask a disabled irq since from irqchip level mask/disable actually means same thing. The only possible gain to do that is to avoid a potential extra interrupt, which is a different purpose as what I originally target. So for [2/2] I think it's not required now. for [1/2] I think it's still necessary as it's meaningless to migrate a percpu type irq. However Stefano has sent out a cleanup patch for Xen percpu irqchip which uses nop mask/unmask hack borrowed from uv machine to work around the issue. As you suggested it's better to consolidate into the common place instead of scattering in different places. My view on this common logic is what [1/2] tries to address, is it correct? If yes, would you consider taking this patch? Stefano told me that his patches will go in in next merge window. So I think either you can take [1/2] now and then I'll do cleanup after Stefano's patch is in, or I can rebase my [1/2] after Stefano's patch to clean both xen and uv parts. Let me know your thought. Thanks Kevin ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them 2011-05-18 23:49 ` Tian, Kevin (?) @ 2011-05-19 12:08 ` Stefano Stabellini 2011-05-19 16:18 ` Konrad Rzeszutek Wilk -1 siblings, 1 reply; 41+ messages in thread From: Stefano Stabellini @ 2011-05-19 12:08 UTC (permalink / raw) To: Tian, Kevin Cc: Thomas Gleixner, Stefano Stabellini, linux-kernel, mingo, hpa, Ian Campbell, JBeulich, xen-devel On Thu, 19 May 2011, Tian, Kevin wrote: > for [1/2] I think it's still necessary as it's meaningless to migrate a percpu type irq. > However Stefano has sent out a cleanup patch for Xen percpu irqchip which uses > nop mask/unmask hack borrowed from uv machine to work around the issue. As > you suggested it's better to consolidate into the common place instead of scattering > in different places. My view on this common logic is what [1/2] tries to address, is > it correct? If yes, would you consider taking this patch? Stefano told me that his > patches will go in in next merge window. So I think either you can take [1/2] now and > then I'll do cleanup after Stefano's patch is in, or I can rebase my [1/2] after Stefano's > patch to clean both xen and uv parts. Actually I think Kevin's generic patch is better too. If you ack it I'll remove my patch right away from the queue (maybe I should remove it anyway?). Kevin probably needs to write a cleanup patch to remove the equivalent hack from the uv_irq. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [Xen-devel] RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them 2011-05-19 12:08 ` Stefano Stabellini @ 2011-05-19 16:18 ` Konrad Rzeszutek Wilk 0 siblings, 0 replies; 41+ messages in thread From: Konrad Rzeszutek Wilk @ 2011-05-19 16:18 UTC (permalink / raw) To: Stefano Stabellini Cc: Tian, Kevin, xen-devel, linux-kernel, JBeulich, Ian Campbell, mingo, hpa, Thomas Gleixner On Thu, May 19, 2011 at 01:08:07PM +0100, Stefano Stabellini wrote: > On Thu, 19 May 2011, Tian, Kevin wrote: > > for [1/2] I think it's still necessary as it's meaningless to migrate a percpu type irq. > > However Stefano has sent out a cleanup patch for Xen percpu irqchip which uses > > nop mask/unmask hack borrowed from uv machine to work around the issue. As > > you suggested it's better to consolidate into the common place instead of scattering > > in different places. My view on this common logic is what [1/2] tries to address, is > > it correct? If yes, would you consider taking this patch? Stefano told me that his > > patches will go in in next merge window. So I think either you can take [1/2] now and > > then I'll do cleanup after Stefano's patch is in, or I can rebase my [1/2] after Stefano's > > patch to clean both xen and uv parts. > > Actually I think Kevin's generic patch is better too. > If you ack it I'll remove my patch right away from the queue (maybe I > should remove it anyway?). I dropped your patch. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them @ 2011-05-19 16:18 ` Konrad Rzeszutek Wilk 0 siblings, 0 replies; 41+ messages in thread From: Konrad Rzeszutek Wilk @ 2011-05-19 16:18 UTC (permalink / raw) To: Stefano Stabellini Cc: Tian, Kevin, xen-devel, linux-kernel, JBeulich, Ian Campbell, mingo, hpa, Thomas Gleixner On Thu, May 19, 2011 at 01:08:07PM +0100, Stefano Stabellini wrote: > On Thu, 19 May 2011, Tian, Kevin wrote: > > for [1/2] I think it's still necessary as it's meaningless to migrate a percpu type irq. > > However Stefano has sent out a cleanup patch for Xen percpu irqchip which uses > > nop mask/unmask hack borrowed from uv machine to work around the issue. As > > you suggested it's better to consolidate into the common place instead of scattering > > in different places. My view on this common logic is what [1/2] tries to address, is > > it correct? If yes, would you consider taking this patch? Stefano told me that his > > patches will go in in next merge window. So I think either you can take [1/2] now and > > then I'll do cleanup after Stefano's patch is in, or I can rebase my [1/2] after Stefano's > > patch to clean both xen and uv parts. > > Actually I think Kevin's generic patch is better too. > If you ack it I'll remove my patch right away from the queue (maybe I > should remove it anyway?). I dropped your patch. ^ permalink raw reply [flat|nested] 41+ messages in thread
* [regression] Ideapad S10-3 does not wake up from suspend (Re: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them) 2011-05-06 6:43 [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them Tian, Kevin 2011-05-06 9:59 ` Thomas Gleixner @ 2011-08-29 4:15 ` Jonathan Nieder 2011-08-31 1:04 ` Dave Hansen ` (2 more replies) 1 sibling, 3 replies; 41+ messages in thread From: Jonathan Nieder @ 2011-08-29 4:15 UTC (permalink / raw) To: Thomas Gleixner Cc: linux-kernel, Kevin Tian, Fengzhe Zhang, mingo, hpa, Ian Campbell, JBeulich, xen-devel, Lars Boegild Thomsen Hi, Lars Boegild Thomsen writes[1]: > After update from 2.6 kernel to 3.0 my Idepad S10-3 will not wake up after > sleep. Back to latest 2.6 kernel works fine. [...] > Upon wakeup, the power light go from slow flashing to on, the battery light > goes from off to on, the hdd light blink once and then everything is dead. > Nothing happens on the screen, all keys dead. The fan/hdd switch on > physically (very hard to hear on this model or I am getting deaf). > Ctrl+alt+del or the alt+sysreq is non-responsive. The only LED that show > keyboard status is CAPS lock and that is unresponsive too. Only way I have > found to get it rebooted is holding down the power button a few secs until it > switch physically off and then switch it on again. [...] > Here's the result of the final bisect: > > 983bbf1af0664b78689612b247acb514300f62c7 is the first bad commit [...] > I also tried to go back to HEAD and manually change arch/x86/irq.c revert this > particular commit and it works. For reference: > commit 983bbf1af0664b78689612b247acb514300f62c7 > Author: Tian, Kevin <kevin.tian@intel.com> > Date: Fri May 6 14:43:56 2011 +0800 > > x86: Don't unmask disabled irqs when migrating them > > It doesn't make sense to unconditionally unmask a disabled irq when > migrating it from offlined cpu to another. If the irq triggers then it > will be disabled in the interrupt handler anyway. So we can just avoid > unmasking it. > > [ tglx: Made masking unconditional again and fixed the changelog ] > > Signed-off-by: Fengzhe Zhang <fengzhe.zhang@intel.com> > Signed-off-by: Kevin Tian <kevin.tian@intel.com> > Cc: Ian Campbell <Ian.Campbell@citrix.com> > Cc: Jan Beulich <JBeulich@novell.com> > Cc: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com> > Link: http://lkml.kernel.org/r/%3C625BA99ED14B2D499DC4E29D8138F1505C8ED7F7E3%40shsmsx502.ccr.corp.intel.com%3 > Signed-off-by: Thomas Gleixner <tglx@linutronix.de> > > diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c > index 544efe2741be..6c0802eb2f7f 100644 > --- a/arch/x86/kernel/irq.c > +++ b/arch/x86/kernel/irq.c > @@ -276,7 +276,8 @@ void fixup_irqs(void) > else if (!(warned++)) > set_affinity = 0; > > - if (!irqd_can_move_in_process_context(data) && chip->irq_unmask) > + if (!irqd_can_move_in_process_context(data) && > + !irqd_irq_disabled(data) && chip->irq_unmask) > chip->irq_unmask(data); > > raw_spin_unlock(&desc->lock); Known problem? Ideas? Regards, Jonathan [1] http://bugs.debian.org/635575 ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [regression] Ideapad S10-3 does not wake up from suspend (Re: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them) 2011-08-29 4:15 ` [regression] Ideapad S10-3 does not wake up from suspend (Re: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them) Jonathan Nieder @ 2011-08-31 1:04 ` Dave Hansen 2011-08-31 8:22 ` Jonathan Nieder 2011-09-01 6:24 ` Tian, Kevin 2012-04-15 14:06 ` Jonathan Nieder 2 siblings, 1 reply; 41+ messages in thread From: Dave Hansen @ 2011-08-31 1:04 UTC (permalink / raw) To: Jonathan Nieder Cc: Thomas Gleixner, linux-kernel, Kevin Tian, Fengzhe Zhang, mingo, hpa, Ian Campbell, JBeulich, xen-devel, Lars Boegild Thomsen, Len Brown On Sun, 2011-08-28 at 23:15 -0500, Jonathan Nieder wrote: > Lars Boegild Thomsen writes[1]: > > After update from 2.6 kernel to 3.0 my Idepad S10-3 will not wake up after > > sleep. Back to latest 2.6 kernel works fine. > [...] > > Upon wakeup, the power light go from slow flashing to on, the battery light > > goes from off to on, the hdd light blink once and then everything is dead. > > Nothing happens on the screen, all keys dead. The fan/hdd switch on > > physically (very hard to hear on this model or I am getting deaf). > > Ctrl+alt+del or the alt+sysreq is non-responsive. The only LED that show > > keyboard status is CAPS lock and that is unresponsive too. Only way I have > > found to get it rebooted is holding down the power button a few secs until it > > switch physically off and then switch it on again. > [...] > > Here's the result of the final bisect: > > > > 983bbf1af0664b78689612b247acb514300f62c7 is the first bad commit > [...] > > I also tried to go back to HEAD and manually change arch/x86/irq.c revert this > > particular commit and it works. > > For reference: > > > commit 983bbf1af0664b78689612b247acb514300f62c7 > > Author: Tian, Kevin <kevin.tian@intel.com> > > Date: Fri May 6 14:43:56 2011 +0800 > > > > x86: Don't unmask disabled irqs when migrating them > > > > It doesn't make sense to unconditionally unmask a disabled irq when > > migrating it from offlined cpu to another. If the irq triggers then it > > will be disabled in the interrupt handler anyway. So we can just avoid > > unmasking it. > > > > [ tglx: Made masking unconditional again and fixed the changelog ] > > > > Signed-off-by: Fengzhe Zhang <fengzhe.zhang@intel.com> > > Signed-off-by: Kevin Tian <kevin.tian@intel.com> > > Cc: Ian Campbell <Ian.Campbell@citrix.com> > > Cc: Jan Beulich <JBeulich@novell.com> > > Cc: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com> > > Link: http://lkml.kernel.org/r/%3C625BA99ED14B2D499DC4E29D8138F1505C8ED7F7E3%40shsmsx502.ccr.corp.intel.com%3 > > Signed-off-by: Thomas Gleixner <tglx@linutronix.de> > > > > diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c > > index 544efe2741be..6c0802eb2f7f 100644 > > --- a/arch/x86/kernel/irq.c > > +++ b/arch/x86/kernel/irq.c > > @@ -276,7 +276,8 @@ void fixup_irqs(void) > > else if (!(warned++)) > > set_affinity = 0; > > > > - if (!irqd_can_move_in_process_context(data) && chip->irq_unmask) > > + if (!irqd_can_move_in_process_context(data) && > > + !irqd_irq_disabled(data) && chip->irq_unmask) > > chip->irq_unmask(data); > > > > raw_spin_unlock(&desc->lock); > > Known problem? Ideas? > [1] http://bugs.debian.org/635575 cc'ing Len Brown who tried to fix this, but in different code: http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=4731fdcf6f7bdab3e369a3f844d4ea4d4017284d I'm seeing the exact same symptoms on my S10-3, fwiw. They definitely don't happen when intel_idle is compiled out or when intel_idle.max_cstate=0 is specified on the kernel command-line. -- Dave ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [regression] Ideapad S10-3 does not wake up from suspend (Re: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them) 2011-08-31 1:04 ` Dave Hansen @ 2011-08-31 8:22 ` Jonathan Nieder 2011-09-02 3:01 ` Serge E. Hallyn 0 siblings, 1 reply; 41+ messages in thread From: Jonathan Nieder @ 2011-08-31 8:22 UTC (permalink / raw) To: Dave Hansen Cc: Thomas Gleixner, linux-kernel, Kevin Tian, mingo, hpa, Ian Campbell, JBeulich, xen-devel, Lars Boegild Thomsen, Len Brown Hi Dave, Dave Hansen wrote: > I'm seeing the exact same symptoms on my S10-3, fwiw. They definitely > don't happen when intel_idle is compiled out or when > intel_idle.max_cstate=0 is specified on the kernel command-line. Lars reminds me[1] that the kernel he was testing already had the intel_idle driver compiled out, so you're probably seeing a different (possibly related) bug. (By the way, Lars, it is fine to communicate with lkml directly. :) The people there don't bite.) Over and out, Jonathan [1] http://bugs.debian.org/635575#42 (.config attached to that bug in a later comment) ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [regression] Ideapad S10-3 does not wake up from suspend (Re: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them) 2011-08-31 8:22 ` Jonathan Nieder @ 2011-09-02 3:01 ` Serge E. Hallyn 0 siblings, 0 replies; 41+ messages in thread From: Serge E. Hallyn @ 2011-09-02 3:01 UTC (permalink / raw) To: Jonathan Nieder Cc: Dave Hansen, Thomas Gleixner, linux-kernel, Kevin Tian, mingo, hpa, Ian Campbell, JBeulich, xen-devel, Lars Boegild Thomsen, Len Brown Quoting Jonathan Nieder (jrnieder@gmail.com): > Hi Dave, > > Dave Hansen wrote: > > > I'm seeing the exact same symptoms on my S10-3, fwiw. They definitely > > don't happen when intel_idle is compiled out or when > > intel_idle.max_cstate=0 is specified on the kernel command-line. > > Lars reminds me[1] that the kernel he was testing already had the > intel_idle driver compiled out, so you're probably seeing a different > (possibly related) bug. (By the way, Lars, it is fine to communicate > with lkml directly. :) The people there don't bite.) Right, over the past year or so this has been hit or miss. Sometimes intel_idle.max_cstate=0 would fix it. Sometimes (like today) not. Ever since the problems have started, using intel_idle has not worked once, but disabling it is not 100% reliable. And when it doesn't work, it doesn't work at all until I get a new kernel. Sorry, not very helpful. -serge ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [regression] Ideapad S10-3 does not wake up from suspend (Re: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them) 2011-08-29 4:15 ` [regression] Ideapad S10-3 does not wake up from suspend (Re: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them) Jonathan Nieder @ 2011-09-01 6:24 ` Tian, Kevin 2011-09-01 6:24 ` Tian, Kevin 2012-04-15 14:06 ` Jonathan Nieder 2 siblings, 0 replies; 41+ messages in thread From: Tian, Kevin @ 2011-09-01 6:24 UTC (permalink / raw) To: Jonathan Nieder, Thomas Gleixner Cc: linux-kernel, Fengzhe Zhang, mingo, hpa, Ian Campbell, JBeulich, xen-devel, Lars Boegild Thomsen > From: Jonathan Nieder [mailto:jrnieder@gmail.com] > Sent: Monday, August 29, 2011 12:16 PM > > Hi, > > Lars Boegild Thomsen writes[1]: > > > After update from 2.6 kernel to 3.0 my Idepad S10-3 will not wake up after > > sleep. Back to latest 2.6 kernel works fine. > [...] > > Upon wakeup, the power light go from slow flashing to on, the battery light > > goes from off to on, the hdd light blink once and then everything is dead. > > Nothing happens on the screen, all keys dead. The fan/hdd switch on > > physically (very hard to hear on this model or I am getting deaf). > > Ctrl+alt+del or the alt+sysreq is non-responsive. The only LED that show > > keyboard status is CAPS lock and that is unresponsive too. Only way I have > > found to get it rebooted is holding down the power button a few secs until it > > switch physically off and then switch it on again. > [...] > > Here's the result of the final bisect: > > > > 983bbf1af0664b78689612b247acb514300f62c7 is the first bad commit > [...] > > I also tried to go back to HEAD and manually change arch/x86/irq.c revert this > > particular commit and it works. > > For reference: > > > commit 983bbf1af0664b78689612b247acb514300f62c7 > > Author: Tian, Kevin <kevin.tian@intel.com> > > Date: Fri May 6 14:43:56 2011 +0800 > > > > x86: Don't unmask disabled irqs when migrating them > > > > It doesn't make sense to unconditionally unmask a disabled irq when > > migrating it from offlined cpu to another. If the irq triggers then it > > will be disabled in the interrupt handler anyway. So we can just avoid > > unmasking it. > > > > [ tglx: Made masking unconditional again and fixed the changelog ] > > > > Signed-off-by: Fengzhe Zhang <fengzhe.zhang@intel.com> > > Signed-off-by: Kevin Tian <kevin.tian@intel.com> > > Cc: Ian Campbell <Ian.Campbell@citrix.com> > > Cc: Jan Beulich <JBeulich@novell.com> > > Cc: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com> > > Link: > http://lkml.kernel.org/r/%3C625BA99ED14B2D499DC4E29D8138F1505C8ED7F > 7E3%40shsmsx502.ccr.corp.intel.com%3 > > Signed-off-by: Thomas Gleixner <tglx@linutronix.de> > > > > diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c > > index 544efe2741be..6c0802eb2f7f 100644 > > --- a/arch/x86/kernel/irq.c > > +++ b/arch/x86/kernel/irq.c > > @@ -276,7 +276,8 @@ void fixup_irqs(void) > > else if (!(warned++)) > > set_affinity = 0; > > > > - if (!irqd_can_move_in_process_context(data) && chip->irq_unmask) > > + if (!irqd_can_move_in_process_context(data) && > > + !irqd_irq_disabled(data) && chip->irq_unmask) > > chip->irq_unmask(data); > > > > raw_spin_unlock(&desc->lock); > > Known problem? Ideas? > this is the 1st problem reported on this change. But honestly speaking I didn't see obvious problem with it. As the commit msg says, it simply completes a delayed irq disable action, by keeping interrupt line masked, instead of relying on a future interrupt handler to actually mask it. fixup_irqs is invoked in suspend path. The only impact this change may bring to resume path is the interrupt line state, which is saved later in suspend and then restored in resume. w/ above change after resume given interrupt line is always masked, while w/o it there may be at least one interrupt raising. If this does matter to make your ideapad working, I'd think there may have other bugs which are hidden originally, e.g. by triggering a reschedule from that interrupt though the handler itself does nothing except masking the interrupt line. So... above commit is not important which can be easily reverted. My only concern is whether other severe issues are just hidden. btw, any serial output you may capture? Thanks Kevin ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [regression] Ideapad S10-3 does not wake up from suspend (Re: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them) @ 2011-09-01 6:24 ` Tian, Kevin 0 siblings, 0 replies; 41+ messages in thread From: Tian, Kevin @ 2011-09-01 6:24 UTC (permalink / raw) To: Jonathan Nieder, Thomas Gleixner Cc: xen-devel, Zhang, linux-kernel, JBeulich, Ian Campbell, Fengzhe, mingo, hpa, Lars Boegild Thomsen > From: Jonathan Nieder [mailto:jrnieder@gmail.com] > Sent: Monday, August 29, 2011 12:16 PM > > Hi, > > Lars Boegild Thomsen writes[1]: > > > After update from 2.6 kernel to 3.0 my Idepad S10-3 will not wake up after > > sleep. Back to latest 2.6 kernel works fine. > [...] > > Upon wakeup, the power light go from slow flashing to on, the battery light > > goes from off to on, the hdd light blink once and then everything is dead. > > Nothing happens on the screen, all keys dead. The fan/hdd switch on > > physically (very hard to hear on this model or I am getting deaf). > > Ctrl+alt+del or the alt+sysreq is non-responsive. The only LED that show > > keyboard status is CAPS lock and that is unresponsive too. Only way I have > > found to get it rebooted is holding down the power button a few secs until it > > switch physically off and then switch it on again. > [...] > > Here's the result of the final bisect: > > > > 983bbf1af0664b78689612b247acb514300f62c7 is the first bad commit > [...] > > I also tried to go back to HEAD and manually change arch/x86/irq.c revert this > > particular commit and it works. > > For reference: > > > commit 983bbf1af0664b78689612b247acb514300f62c7 > > Author: Tian, Kevin <kevin.tian@intel.com> > > Date: Fri May 6 14:43:56 2011 +0800 > > > > x86: Don't unmask disabled irqs when migrating them > > > > It doesn't make sense to unconditionally unmask a disabled irq when > > migrating it from offlined cpu to another. If the irq triggers then it > > will be disabled in the interrupt handler anyway. So we can just avoid > > unmasking it. > > > > [ tglx: Made masking unconditional again and fixed the changelog ] > > > > Signed-off-by: Fengzhe Zhang <fengzhe.zhang@intel.com> > > Signed-off-by: Kevin Tian <kevin.tian@intel.com> > > Cc: Ian Campbell <Ian.Campbell@citrix.com> > > Cc: Jan Beulich <JBeulich@novell.com> > > Cc: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com> > > Link: > http://lkml.kernel.org/r/%3C625BA99ED14B2D499DC4E29D8138F1505C8ED7F > 7E3%40shsmsx502.ccr.corp.intel.com%3 > > Signed-off-by: Thomas Gleixner <tglx@linutronix.de> > > > > diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c > > index 544efe2741be..6c0802eb2f7f 100644 > > --- a/arch/x86/kernel/irq.c > > +++ b/arch/x86/kernel/irq.c > > @@ -276,7 +276,8 @@ void fixup_irqs(void) > > else if (!(warned++)) > > set_affinity = 0; > > > > - if (!irqd_can_move_in_process_context(data) && chip->irq_unmask) > > + if (!irqd_can_move_in_process_context(data) && > > + !irqd_irq_disabled(data) && chip->irq_unmask) > > chip->irq_unmask(data); > > > > raw_spin_unlock(&desc->lock); > > Known problem? Ideas? > this is the 1st problem reported on this change. But honestly speaking I didn't see obvious problem with it. As the commit msg says, it simply completes a delayed irq disable action, by keeping interrupt line masked, instead of relying on a future interrupt handler to actually mask it. fixup_irqs is invoked in suspend path. The only impact this change may bring to resume path is the interrupt line state, which is saved later in suspend and then restored in resume. w/ above change after resume given interrupt line is always masked, while w/o it there may be at least one interrupt raising. If this does matter to make your ideapad working, I'd think there may have other bugs which are hidden originally, e.g. by triggering a reschedule from that interrupt though the handler itself does nothing except masking the interrupt line. So... above commit is not important which can be easily reverted. My only concern is whether other severe issues are just hidden. btw, any serial output you may capture? Thanks Kevin ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [regression] Ideapad S10-3 does not wake up from suspend 2011-09-01 6:24 ` Tian, Kevin (?) @ 2012-05-12 23:13 ` Jonathan Nieder 2012-05-13 1:22 ` Lars Boegild Thomsen -1 siblings, 1 reply; 41+ messages in thread From: Jonathan Nieder @ 2012-05-12 23:13 UTC (permalink / raw) To: Tian, Kevin Cc: Thomas Gleixner, xen-devel, Fengzhe Zhang, linux-kernel, JBeulich, Ian Campbell, mingo, hpa, Lars Boegild Thomsen, Andreas Wallberg, Robert Scott Hi again, In September, Kevin Tian wrote: >> Lars Boegild Thomsen writes[1]: >>> After update from 2.6 kernel to 3.0 my Idepad S10-3 will not wake up after >>> sleep. Back to latest 2.6 kernel works fine. > > [...] >>> Upon wakeup, the power light go from slow flashing to on, the battery light >>> goes from off to on, the hdd light blink once and then everything is dead. [...] >>> 983bbf1af0664b78689612b247acb514300f62c7 is the first bad commit >> [...] >>> I also tried to go back to HEAD and manually change arch/x86/irq.c revert this >>> particular commit and it works. [...] > fixup_irqs is invoked in suspend path. The only impact this change may > bring to resume path is the interrupt line state, which is saved later > in suspend and then restored in resume. w/ above change after resume > given interrupt line is always masked, while w/o it there may be at least > one interrupt raising. If this does matter to make your ideapad working, > I'd think there may have other bugs which are hidden originally, e.g. by > triggering a reschedule from that interrupt though the handler itself > does nothing except masking the interrupt line. > > So... above commit is not important which can be easily reverted. My > only concern is whether other severe issues are just hidden. > > btw, any serial output you may capture? Sorry for the slow response. Result from reporters is: - 3.4-rc2 is affected as well - this only affects suspend-to-RAM --- suspend-to-disk works fine - all five pm_test modes for suspend-to-RAM work fine - this netbook doesn't have a serial port. Netconsole gives: | [ 745.161322] PM: Syncing filesystems ... done. | [ 747.088247] PM: Preparing system for mem sleep | [ 747.187932] Freezing user space processes ... (elapsed 0.01 seconds) done. | [ 747.204325] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done. | [ 747.220416] PM: Entering mem sleep | [ 747.221085] sd 1:0:0:0: [sda] Synchronizing SCSI cache | [ 747.222247] sd 1:0:0:0: [sda] Stopping disk | | (then nothing) Serial console via a USB-to-serial converter gives: | [ 814.016541] PM: Syncing filesystems ... done. | [ 814.018516] PM: Preparing system for mem sleep | [ 814.100393] Freezing user space processes ... (elapsed 0.01 se | | before it goes to sleep, and it doesn't output anything on (attempted) wakeup. - passing parameters "hpet=disable highres=off nohz=off" helps some people if I understand correctly, but I might have misunderstood.[2] I'd be interested to hear whether the same problem occurs when trying to suspend from the minimal initramfs environment. (On systems like Debian that use initramfs-tools, that means passing the kernel command line parameter "break=top", booting, loading some appropriate minimal collection of modules --- maybe none ---, and then running "echo mem >/sys/power/state". initramfs-tools(8) has details.) Hope that helps, Jonathan >> [1] http://bugs.debian.org/635575 [2] https://bugzilla.kernel.org/show_bug.cgi?id=41932 ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [regression] Ideapad S10-3 does not wake up from suspend 2012-05-12 23:13 ` [regression] Ideapad S10-3 does not wake up from suspend Jonathan Nieder @ 2012-05-13 1:22 ` Lars Boegild Thomsen 2012-07-15 23:24 ` Jonathan Nieder 0 siblings, 1 reply; 41+ messages in thread From: Lars Boegild Thomsen @ 2012-05-13 1:22 UTC (permalink / raw) To: Jonathan Nieder Cc: Tian, Kevin, Thomas Gleixner, xen-devel, Fengzhe Zhang, linux-kernel, JBeulich, Ian Campbell, mingo, hpa, Andreas Wallberg, Robert Scott On Sunday 13 May 2012 07:13:49 Jonathan Nieder wrote: > In September, Kevin Tian wrote: > >> Lars Boegild Thomsen writes[1]: > >>> After update from 2.6 kernel to 3.0 my Idepad S10-3 will not wake up > >>> after sleep. Back to latest 2.6 kernel works fine. > - passing parameters "hpet=disable highres=off nohz=off" helps some > people if I understand correctly, but I might have misunderstood.[2] I didn't notice this one before but that actually works for me. Adding those kernel params and sleep works again. I tried combinations thereof but no-go - all 3 required. > I'd be interested to hear whether the same problem occurs when trying > to suspend from the minimal initramfs environment. (On systems like > Debian that use initramfs-tools, that means passing the kernel command > line parameter "break=top", booting, loading some appropriate minimal > collection of modules --- maybe none ---, and then running > "echo mem >/sys/power/state". initramfs-tools(8) has details.) And I just tried this loading no modules manually and it's the same - never wakes up after sleep. //Lars... ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [regression] Ideapad S10-3 does not wake up from suspend 2012-05-13 1:22 ` Lars Boegild Thomsen @ 2012-07-15 23:24 ` Jonathan Nieder 0 siblings, 0 replies; 41+ messages in thread From: Jonathan Nieder @ 2012-07-15 23:24 UTC (permalink / raw) To: Lars Boegild Thomsen Cc: Feng Tang, Tian, Kevin, Thomas Gleixner, xen-devel, Fengzhe Zhang, linux-kernel, JBeulich, Ian Campbell, mingo, hpa, Andreas Wallberg, Robert Scott, Damjan Georgievski Hi again, In May, Lars Boegild Thomsen wrote: > On Sunday 13 May 2012 07:13:49 Jonathan Nieder wrote: >>>> Lars Boegild Thomsen writes: >>>>> After update from 2.6 kernel to 3.0 my Idepad S10-3 will not wake up >>>>> after sleep. Back to latest 2.6 kernel works fine. >> >> - passing parameters "hpet=disable highres=off nohz=off" helps some >> people if I understand correctly, [...] > I didn't notice this one before but that actually works for me. Adding those > kernel params and sleep works again. I tried combinations thereof but no-go - > all 3 required. > >> I'd be interested to hear whether the same problem occurs when trying >> to suspend from the minimal initramfs environment. [...] > And I just tried this loading no modules manually and it's the same - never > wakes up after sleep. Thanks. Please test with the debugging patch below from Feng Tang[1]. I don't know if it will get any useful information because e.g. serial console and netconsole are not very convenient on this machine[2], but it seems worth a try. If one of the people cc-ed needs help building a patched kernel to test, feel free to write privately and I can give more detailed instructions. Hope that helps, Jonathan [1] https://bugzilla.kernel.org/show_bug.cgi?id=41932#c18 [2] http://thread.gmane.org/gmane.linux.kernel/1136253/focus=1285790 diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c index f113755..d2e0c90 100644 --- a/kernel/time/tick-broadcast.c +++ b/kernel/time/tick-broadcast.c @@ -439,6 +439,14 @@ again: * in the event mask */ if (next_event.tv64 != KTIME_MAX) { + s64 delta; + + delta = next_event.tv64 - now.tv64; + if (delta >= 10000000000) { + printk("%s(): The delta is big: %lld\n", __func__, delta); + next_event.tv64 = now.tv64 + 3000000000; + } + /* * Rearm the broadcast device. If event expired, * repeat the above ^ permalink raw reply related [flat|nested] 41+ messages in thread
* Re: [regression] Ideapad S10-3 does not wake up from suspend 2011-08-29 4:15 ` [regression] Ideapad S10-3 does not wake up from suspend (Re: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them) Jonathan Nieder 2011-08-31 1:04 ` Dave Hansen 2011-09-01 6:24 ` Tian, Kevin @ 2012-04-15 14:06 ` Jonathan Nieder 2012-04-16 18:05 ` Robert Scott 2 siblings, 1 reply; 41+ messages in thread From: Jonathan Nieder @ 2012-04-15 14:06 UTC (permalink / raw) To: Thomas Gleixner Cc: linux-kernel, Kevin Tian, Fengzhe Zhang, mingo, hpa, Ian Campbell, JBeulich, xen-devel, Lars Boegild Thomsen, Robert Scott, e568b31a443d, Liu, Chuansheng Hi, Quick summary and update. > Lars Boegild Thomsen writes[1]: >> After update from 2.6 kernel to 3.0 my Idepad S10-3 will not wake up after >> sleep. [...] >> 983bbf1af0664b78689612b247acb514300f62c7 is the first bad commit 983bbf1af06 is "x86: Don't unmask disabled irqs when migrating them", 2011-05-06, and looks like this: >> --- a/arch/x86/kernel/irq.c >> +++ b/arch/x86/kernel/irq.c >> @@ -276,7 +276,8 @@ void fixup_irqs(void) >> else if (!(warned++)) >> set_affinity = 0; >> >> - if (!irqd_can_move_in_process_context(data) && chip->irq_unmask) >> + if (!irqd_can_move_in_process_context(data) && >> + !irqd_irq_disabled(data) && chip->irq_unmask) >> chip->irq_unmask(data); Robert Scott found[1], using 3.2.12: > I'm getting the same behaviour on my Lenovo Ideapad S10-3 An anonymous contributor[2] also reports the same problem in v3.3. Lars, Robert, anon: can you try 3.4-rc2 or newer and let us know how it goes? I suspect v3.4-rc2~24^2~4 ("x86: Preserve lazy irq disable semantics in fixup_irqs()") will fix this. Liu Chuansheng et al: do you think that commit would be a good candidate for inclusion in -stable kernels? Thanks and hope that helps, Jonathan > [1] http://bugs.debian.org/635575 [2] https://bugzilla.kernel.org/show_bug.cgi?id=41932 ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [regression] Ideapad S10-3 does not wake up from suspend 2012-04-15 14:06 ` Jonathan Nieder @ 2012-04-16 18:05 ` Robert Scott 2012-04-17 2:04 ` Jonathan Nieder 0 siblings, 1 reply; 41+ messages in thread From: Robert Scott @ 2012-04-16 18:05 UTC (permalink / raw) To: Jonathan Nieder Cc: Thomas Gleixner, linux-kernel, Kevin Tian, Fengzhe Zhang, mingo, hpa, Ian Campbell, JBeulich, xen-devel, Lars Boegild Thomsen, e568b31a443d, Liu, Chuansheng On Sunday 15 April 2012, Jonathan Nieder wrote: > Lars, Robert, anon: can you try 3.4-rc2 or newer and let us know how > it goes? I suspect v3.4-rc2~24^2~4 ("x86: Preserve lazy irq disable > semantics in fixup_irqs()") will fix this. Hmm, no, 3.4-rc2 seems to produce the same results I'm afraid. robert. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [regression] Ideapad S10-3 does not wake up from suspend 2012-04-16 18:05 ` Robert Scott @ 2012-04-17 2:04 ` Jonathan Nieder 2012-04-18 10:03 ` Lars Boegild Thomsen ` (2 more replies) 0 siblings, 3 replies; 41+ messages in thread From: Jonathan Nieder @ 2012-04-17 2:04 UTC (permalink / raw) To: Robert Scott Cc: Thomas Gleixner, linux-kernel, Kevin Tian, Fengzhe Zhang, mingo, hpa, Ian Campbell, JBeulich, xen-devel, Lars Boegild Thomsen, e568b31a443d, Liu, Chuansheng Robert Scott wrote: > On Sunday 15 April 2012, Jonathan Nieder wrote: >> Lars, Robert, anon: can you try 3.4-rc2 or newer and let us know how >> it goes? I suspect v3.4-rc2~24^2~4 ("x86: Preserve lazy irq disable >> semantics in fixup_irqs()") will fix this. > > Hmm, no, 3.4-rc2 seems to produce the same results I'm afraid. Alas. Does suspend-to-disk ("echo disk >/sys/power/state") have the same problem? Can you get a log of the failure with "no_console_suspend" appended to the kernel command line, for example with a serial console or netconsole? If someone has time to go through the steps in "Documentation/power/basic-pm-debugging.txt", that would also be useful. Thanks, and sorry to take so long to get to this. Sincerely, Jonathan ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [regression] Ideapad S10-3 does not wake up from suspend 2012-04-17 2:04 ` Jonathan Nieder @ 2012-04-18 10:03 ` Lars Boegild Thomsen 2012-04-22 16:34 ` [Xen-devel] " Pasi Kärkkäinen 2012-04-21 13:14 ` Robert Scott 2012-05-06 12:44 ` Robert Scott 2 siblings, 1 reply; 41+ messages in thread From: Lars Boegild Thomsen @ 2012-04-18 10:03 UTC (permalink / raw) To: Jonathan Nieder Cc: Robert Scott, Thomas Gleixner, linux-kernel, Kevin Tian, Fengzhe Zhang, mingo, hpa, Ian Campbell, JBeulich, xen-devel, e568b31a443d, Liu, Chuansheng On Tuesday 17 April 2012 10:04:33 Jonathan Nieder wrote: > > Hmm, no, 3.4-rc2 seems to produce the same results I'm afraid. > Alas. Does suspend-to-disk ("echo disk >/sys/power/state") have the > same problem? Can you get a log of the failure with > "no_console_suspend" appended to the kernel command line, for example > with a serial console or netconsole? Using a serial console is a bit of a problem on this netbook as it hasn't got a serial port. Not sure how the netconsole works - will read up on that. And no - suspend to disk works fine. I have by the way experimentally switched to 32bit version and it's the same thing (originally I was running 64bit version but that really doesn't make sense on a netbook - except doing it because it could be done). //Lars... ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [Xen-devel] [regression] Ideapad S10-3 does not wake up from suspend 2012-04-18 10:03 ` Lars Boegild Thomsen @ 2012-04-22 16:34 ` Pasi Kärkkäinen 0 siblings, 0 replies; 41+ messages in thread From: Pasi Kärkkäinen @ 2012-04-22 16:34 UTC (permalink / raw) To: Lars Boegild Thomsen Cc: Jonathan Nieder, Kevin Tian, xen-devel, Robert Scott, hpa, linux-kernel, Ian Campbell, mingo, Fengzhe Zhang, e568b31a443d, Thomas Gleixner, Liu, Chuansheng On Wed, Apr 18, 2012 at 06:03:00PM +0800, Lars Boegild Thomsen wrote: > On Tuesday 17 April 2012 10:04:33 Jonathan Nieder wrote: > > > Hmm, no, 3.4-rc2 seems to produce the same results I'm afraid. > > Alas. Does suspend-to-disk ("echo disk >/sys/power/state") have the > > same problem? Can you get a log of the failure with > > "no_console_suspend" appended to the kernel command line, for example > > with a serial console or netconsole? > > Using a serial console is a bit of a problem on this netbook as it hasn't got > a serial port. Not sure how the netconsole works - will read up on that. > You can get an ExpressCard Serial Card and use that.. or if the laptop has vPro then it has Serial Over LAN in the AMT chip. http://wiki.xen.org/wiki/Xen_Serial_Console -- Pasi > And no - suspend to disk works fine. > > I have by the way experimentally switched to 32bit version and it's the same > thing (originally I was running 64bit version but that really doesn't make > sense on a netbook - except doing it because it could be done). > > //Lars... > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [regression] Ideapad S10-3 does not wake up from suspend 2012-04-17 2:04 ` Jonathan Nieder 2012-04-18 10:03 ` Lars Boegild Thomsen @ 2012-04-21 13:14 ` Robert Scott 2012-05-06 12:44 ` Robert Scott 2 siblings, 0 replies; 41+ messages in thread From: Robert Scott @ 2012-04-21 13:14 UTC (permalink / raw) To: Jonathan Nieder Cc: Thomas Gleixner, linux-kernel, Kevin Tian, Fengzhe Zhang, mingo, hpa, Ian Campbell, JBeulich, xen-devel, Lars Boegild Thomsen, e568b31a443d, Liu, Chuansheng On Tuesday 17 April 2012, Jonathan Nieder wrote: > Robert Scott wrote: > > On Sunday 15 April 2012, Jonathan Nieder wrote: > > >> Lars, Robert, anon: can you try 3.4-rc2 or newer and let us know how > >> it goes? I suspect v3.4-rc2~24^2~4 ("x86: Preserve lazy irq disable > >> semantics in fixup_irqs()") will fix this. > > > > Hmm, no, 3.4-rc2 seems to produce the same results I'm afraid. > > Alas. Does suspend-to-disk ("echo disk >/sys/power/state") have the > same problem? Can you get a log of the failure with > "no_console_suspend" appended to the kernel command line, for example > with a serial console or netconsole? (back on 3.2.0-0.bpo.2-686-pae 3.2.12-1~bpo60+1*:) eth0 seems to go down/come up to early/late to get anything useful from netconsole: [ 745.161322] PM: Syncing filesystems ... done. [ 747.088247] PM: Preparing system for mem sleep [ 747.187932] Freezing user space processes ... (elapsed 0.01 seconds) done. [ 747.204325] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done. [ 747.220416] PM: Entering mem sleep [ 747.221085] sd 1:0:0:0: [sda] Synchronizing SCSI cache [ 747.222247] sd 1:0:0:0: [sda] Stopping disk (then nothing) I may be able to hook up a usb-serial adaptor to try and get more info but it'll take me a bit longer what with all the rewiring fun as I don't have a null modem cable lying around. > If someone has time to go through the steps in > "Documentation/power/basic-pm-debugging.txt", that would also be useful. "freezer", "devices", "platform", "processors", or "core" pm_test modes all work fine, naturally. robert. * I notice I accidentally copy/pasted the linux-modules version in a previous mail but you knew what I was talking about ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [regression] Ideapad S10-3 does not wake up from suspend 2012-04-17 2:04 ` Jonathan Nieder 2012-04-18 10:03 ` Lars Boegild Thomsen 2012-04-21 13:14 ` Robert Scott @ 2012-05-06 12:44 ` Robert Scott 2 siblings, 0 replies; 41+ messages in thread From: Robert Scott @ 2012-05-06 12:44 UTC (permalink / raw) To: Jonathan Nieder Cc: Thomas Gleixner, linux-kernel, Kevin Tian, Fengzhe Zhang, mingo, hpa, Ian Campbell, JBeulich, xen-devel, Lars Boegild Thomsen, e568b31a443d, Liu, Chuansheng On Tuesday 17 April 2012, Jonathan Nieder wrote: > Robert Scott wrote: > > On Sunday 15 April 2012, Jonathan Nieder wrote: > > >> Lars, Robert, anon: can you try 3.4-rc2 or newer and let us know how > >> it goes? I suspect v3.4-rc2~24^2~4 ("x86: Preserve lazy irq disable > >> semantics in fixup_irqs()") will fix this. > > > > Hmm, no, 3.4-rc2 seems to produce the same results I'm afraid. > > Alas. Does suspend-to-disk ("echo disk >/sys/power/state") have the > same problem? Can you get a log of the failure with > "no_console_suspend" appended to the kernel command line, for example > with a serial console or netconsole? Well, for what it's worth, using a USB serial console to capture output, all you get is: [ 814.016541] PM: Syncing filesystems ... done. [ 814.018516] PM: Preparing system for mem sleep [ 814.100393] Freezing user space processes ... (elapsed 0.01 se before it goes to sleep, and it doesn't output anything on (attempted) wakeup. Though I don't know if this is due to USB being woken up quite late or something. robert. ^ permalink raw reply [flat|nested] 41+ messages in thread
end of thread, other threads:[~2012-07-15 23:24 UTC | newest] Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2011-05-06 6:43 [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them Tian, Kevin 2011-05-06 9:59 ` Thomas Gleixner 2011-05-06 12:54 ` Tian, Kevin 2011-05-06 12:54 ` Tian, Kevin 2011-05-06 13:14 ` [Xen-devel] " Tian, Kevin 2011-05-06 13:24 ` Thomas Gleixner 2011-05-06 14:04 ` Ian Campbell 2011-05-08 1:44 ` Jeremy Fitzhardinge 2011-05-08 1:44 ` Jeremy Fitzhardinge 2011-05-09 0:44 ` Tian, Kevin 2011-05-09 0:44 ` Tian, Kevin 2011-05-09 1:45 ` Jeremy Fitzhardinge 2011-05-09 1:45 ` Jeremy Fitzhardinge 2011-05-06 14:28 ` Stefano Stabellini 2011-05-06 14:28 ` Stefano Stabellini 2011-05-06 21:43 ` Tian, Kevin 2011-05-09 2:11 ` Tian, Kevin 2011-05-09 12:02 ` Stefano Stabellini 2011-05-09 12:36 ` Thomas Gleixner 2011-05-10 3:24 ` Tian, Kevin 2011-05-18 23:49 ` Tian, Kevin 2011-05-18 23:49 ` Tian, Kevin 2011-05-19 12:08 ` Stefano Stabellini 2011-05-19 16:18 ` [Xen-devel] " Konrad Rzeszutek Wilk 2011-05-19 16:18 ` Konrad Rzeszutek Wilk 2011-08-29 4:15 ` [regression] Ideapad S10-3 does not wake up from suspend (Re: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them) Jonathan Nieder 2011-08-31 1:04 ` Dave Hansen 2011-08-31 8:22 ` Jonathan Nieder 2011-09-02 3:01 ` Serge E. Hallyn 2011-09-01 6:24 ` Tian, Kevin 2011-09-01 6:24 ` Tian, Kevin 2012-05-12 23:13 ` [regression] Ideapad S10-3 does not wake up from suspend Jonathan Nieder 2012-05-13 1:22 ` Lars Boegild Thomsen 2012-07-15 23:24 ` Jonathan Nieder 2012-04-15 14:06 ` Jonathan Nieder 2012-04-16 18:05 ` Robert Scott 2012-04-17 2:04 ` Jonathan Nieder 2012-04-18 10:03 ` Lars Boegild Thomsen 2012-04-22 16:34 ` [Xen-devel] " Pasi Kärkkäinen 2012-04-21 13:14 ` Robert Scott 2012-05-06 12:44 ` Robert Scott
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.