* [regression 2.6.39-rc2][bisected] "perf, x86: P4 PMU - Read proper MSR register to catch" and NMIs @ 2011-04-06 22:30 Shaun Ruffell 2011-04-07 0:16 ` Don Zickus 2011-04-13 19:33 ` Maciej Rutecki 0 siblings, 2 replies; 18+ messages in thread From: Shaun Ruffell @ 2011-04-06 22:30 UTC (permalink / raw) To: Don Zickus; +Cc: linux-kernel, Cyrill Gorcunov, Ingo Molnar Hello Don, With 2.6.39-rc2 I was seeing the following NMIs when building the kernel: [ 191.647131] Uhhuh. NMI received for unknown reason 21 on CPU 3. [ 191.650068] Do you have a strange power saving mode enabled? [ 191.650068] Dazed and confused, but trying to continue [ 676.020001] Uhhuh. NMI received for unknown reason 21 on CPU 1. [ 676.020001] Do you have a strange power saving mode enabled? [ 676.020001] Dazed and confused, but trying to continue [ 892.520335] Starting new kernel I'm running on a Dell PowerEdge 2600 with the following processor: processor : 0 vendor_id : GenuineIntel cpu family : 15 model : 2 model name : Intel(R) Xeon(TM) CPU 3.06GHz stepping : 7 ... I was able to bisect it down to commit 242214f9c1eeaae40, but I'm not certain where to go from here. Is this something that is already known or is there more information I should try to collect? Here is the commit for reference: commit 242214f9c1eeaae40eca11e3b4d37bfce960a7cd Author: Don Zickus <dzickus@redhat.com> Date: Thu Mar 24 23:36:25 2011 +0300 perf, x86: P4 PMU - Read proper MSR register to catch unflagged overflows The read of a proper MSR register was missed and instead of counter the configration register was tested (it has ARCH_P4_UNFLAGGED_BIT always cleared) leading to unknown NMI hitting the system. As result the user may obtain "Dazed and confused, but trying to continue" message. Fix it by reading a proper MSR register. When an NMI happens on a P4, the perf nmi handler checks the configuration register to see if the overflow bit is set or not before taking appropriate action. Unfortunately, various P4 machines had a broken overflow bit, so a backup mechanism was implemented. This mechanism checked to see if the counter rolled over or not. A previous commit that implemented this backup mechanism was broken. Instead of reading the counter register, it used the configuration register to determine if the counter rolled over or not. Reading that bit would give incorrect results. This would lead to 'Dazed and confused' messages for the end user when using the perf tool (or if the nmi watchdog is running). The fix is to read the counter register before determining if the counter rolled over or not. Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Lin Ming <ming.m.lin@intel.com> LKML-Reference: <4D8BAB49.3080701@openvz.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> diff --git a/arch/x86/kernel/cpu/perf_event_p4.c b/arch/x86/kernel/cpu/perf_event_p4.c index 3769ac8..d3d7b59 100644 --- a/arch/x86/kernel/cpu/perf_event_p4.c +++ b/arch/x86/kernel/cpu/perf_event_p4.c @@ -777,6 +777,7 @@ static inline int p4_pmu_clear_cccr_ovf(struct hw_perf_event *hwc) * the counter has reached zero value and continued counting before * real NMI signal was received: */ + rdmsrl(hwc->event_base, v); if (!(v & ARCH_P4_UNFLAGGED_BIT)) return 1; ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [regression 2.6.39-rc2][bisected] "perf, x86: P4 PMU - Read proper MSR register to catch" and NMIs 2011-04-06 22:30 [regression 2.6.39-rc2][bisected] "perf, x86: P4 PMU - Read proper MSR register to catch" and NMIs Shaun Ruffell @ 2011-04-07 0:16 ` Don Zickus 2011-04-07 3:18 ` Cyrill Gorcunov 2011-04-13 19:33 ` Maciej Rutecki 1 sibling, 1 reply; 18+ messages in thread From: Don Zickus @ 2011-04-07 0:16 UTC (permalink / raw) To: Shaun Ruffell; +Cc: linux-kernel, Cyrill Gorcunov, Ingo Molnar On Wed, Apr 06, 2011 at 05:30:36PM -0500, Shaun Ruffell wrote: > Hello Don, > > I was able to bisect it down to commit 242214f9c1eeaae40, but I'm not > certain where to go from here. Is this something that is already known > or is there more information I should try to collect? Nope, this is an ongoing issue. What happened was the perf P4 nmi handler was swallowing all the NMIs. My patch fixed that and exposed a double NMI problem. We have been chasing it for a couple of months. I think Cyril was finally able to duplicate it (as he wrote the P4 code). I have confidence that he will find a fix for it soon. :-) Thanks for the report though! Cheers, Don ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [regression 2.6.39-rc2][bisected] "perf, x86: P4 PMU - Read proper MSR register to catch" and NMIs 2011-04-07 0:16 ` Don Zickus @ 2011-04-07 3:18 ` Cyrill Gorcunov 2011-04-07 14:38 ` Shaun Ruffell 0 siblings, 1 reply; 18+ messages in thread From: Cyrill Gorcunov @ 2011-04-07 3:18 UTC (permalink / raw) To: Don Zickus; +Cc: Shaun Ruffell, linux-kernel, Cyrill Gorcunov, Ingo Molnar On Thursday, April 7, 2011, Don Zickus <dzickus@redhat.com> wrote: > On Wed, Apr 06, 2011 at 05:30:36PM -0500, Shaun Ruffell wrote: >> Hello Don, >> >> I was able to bisect it down to commit 242214f9c1eeaae40, but I'm not >> certain where to go from here. Is this something that is already known >> or is there more information I should try to collect? > > Nope, this is an ongoing issue. What happened was the perf P4 nmi handler > was swallowing all the NMIs. My patch fixed that and exposed a double NMI > problem. We have been chasing it for a couple of months. I think Cyril > was finally able to duplicate it (as he wrote the P4 code). I have > confidence that he will find a fix for it soon. :-) > > Thanks for the report though! > > Cheers, > Don > Hi, yeah, i got it too and i hope to fix this issue soon. Will ping as only get working fix. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [regression 2.6.39-rc2][bisected] "perf, x86: P4 PMU - Read proper MSR register to catch" and NMIs 2011-04-07 3:18 ` Cyrill Gorcunov @ 2011-04-07 14:38 ` Shaun Ruffell 2011-04-07 14:43 ` Cyrill Gorcunov 0 siblings, 1 reply; 18+ messages in thread From: Shaun Ruffell @ 2011-04-07 14:38 UTC (permalink / raw) To: Cyrill Gorcunov, Don Zickus; +Cc: linux-kernel On Thu, Apr 07, 2011 at 07:18:50AM +0400, Cyrill Gorcunov wrote: > On Thursday, April 7, 2011, Don Zickus <dzickus@redhat.com> wrote: >> On Wed, Apr 06, 2011 at 05:30:36PM -0500, Shaun Ruffell wrote: >>> >>> I was able to bisect it down to commit 242214f9c1eeaae40, but I'm not >>> certain where to go from here. Is this something that is already known >>> or is there more information I should try to collect? >> >> Nope, this is an ongoing issue. What happened was the perf P4 nmi handler >> was swallowing all the NMIs. My patch fixed that and exposed a double NMI >> problem. We have been chasing it for a couple of months. I think Cyril >> was finally able to duplicate it (as he wrote the P4 code). I have >> confidence that he will find a fix for it soon. :-) >> >> Thanks for the report though! > > Hi, yeah, i got it too and i hope to fix this issue soon. Will ping as > only get working fix. Don, Cyrill, Thanks for the explanation and my apologies for not relating the previous discussions about this to what I was seeing. This issue would be a blocker for any 2.6.39 final right? Cyrill, I would be more than happy to test any patches. It's relatively quick for me to reproduce. Thanks, Shaun ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [regression 2.6.39-rc2][bisected] "perf, x86: P4 PMU - Read proper MSR register to catch" and NMIs 2011-04-07 14:38 ` Shaun Ruffell @ 2011-04-07 14:43 ` Cyrill Gorcunov 0 siblings, 0 replies; 18+ messages in thread From: Cyrill Gorcunov @ 2011-04-07 14:43 UTC (permalink / raw) To: Shaun Ruffell; +Cc: Don Zickus, linux-kernel On 04/07/2011 06:38 PM, Shaun Ruffell wrote: > On Thu, Apr 07, 2011 at 07:18:50AM +0400, Cyrill Gorcunov wrote: >> On Thursday, April 7, 2011, Don Zickus <dzickus@redhat.com> wrote: >>> On Wed, Apr 06, 2011 at 05:30:36PM -0500, Shaun Ruffell wrote: >>>> >>>> I was able to bisect it down to commit 242214f9c1eeaae40, but I'm not >>>> certain where to go from here. Is this something that is already known >>>> or is there more information I should try to collect? >>> >>> Nope, this is an ongoing issue. What happened was the perf P4 nmi handler >>> was swallowing all the NMIs. My patch fixed that and exposed a double NMI >>> problem. We have been chasing it for a couple of months. I think Cyril >>> was finally able to duplicate it (as he wrote the P4 code). I have >>> confidence that he will find a fix for it soon. :-) >>> >>> Thanks for the report though! >> >> Hi, yeah, i got it too and i hope to fix this issue soon. Will ping as >> only get working fix. > > Don, Cyrill, > > Thanks for the explanation and my apologies for not relating the > previous discussions about this to what I was seeing. This issue would > be a blocker for any 2.6.39 final right? Well, could be, I didn't find the real reason for doubled nmi. Still investigating. As a workaround you could simply disable nmi-watchdog for a while in command line if it bothers you. > > Cyrill, I would be more than happy to test any patches. It's relatively > quick for me to reproduce. OK, I'll prepare some patches for you to test. Still think on where it could fails :( Stay tuned. > > Thanks, > Shaun -- Cyrill ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [regression 2.6.39-rc2][bisected] "perf, x86: P4 PMU - Read proper MSR register to catch" and NMIs 2011-04-06 22:30 [regression 2.6.39-rc2][bisected] "perf, x86: P4 PMU - Read proper MSR register to catch" and NMIs Shaun Ruffell 2011-04-07 0:16 ` Don Zickus @ 2011-04-13 19:33 ` Maciej Rutecki 2011-04-13 20:01 ` Cyrill Gorcunov 1 sibling, 1 reply; 18+ messages in thread From: Maciej Rutecki @ 2011-04-13 19:33 UTC (permalink / raw) To: Shaun Ruffell; +Cc: Don Zickus, linux-kernel, Cyrill Gorcunov, Ingo Molnar I created a Bugzilla entry at https://bugzilla.kernel.org/show_bug.cgi?id=33252 for your bug report, please add your address to the CC list in there, thanks! On czwartek, 7 kwietnia 2011 o 00:30:36 Shaun Ruffell wrote: > Hello Don, > > With 2.6.39-rc2 I was seeing the following NMIs when building the kernel: > > [ 191.647131] Uhhuh. NMI received for unknown reason 21 on CPU 3. > [ 191.650068] Do you have a strange power saving mode enabled? > [ 191.650068] Dazed and confused, but trying to continue > [ 676.020001] Uhhuh. NMI received for unknown reason 21 on CPU 1. > [ 676.020001] Do you have a strange power saving mode enabled? > [ 676.020001] Dazed and confused, but trying to continue > [ 892.520335] Starting new kernel > > I'm running on a Dell PowerEdge 2600 with the following processor: > > processor : 0 > vendor_id : GenuineIntel > cpu family : 15 > model : 2 > model name : Intel(R) Xeon(TM) CPU 3.06GHz > stepping : 7 > ... > > I was able to bisect it down to commit 242214f9c1eeaae40, but I'm not > certain where to go from here. Is this something that is already known > or is there more information I should try to collect? > > Here is the commit for reference: > > commit 242214f9c1eeaae40eca11e3b4d37bfce960a7cd > Author: Don Zickus <dzickus@redhat.com> > Date: Thu Mar 24 23:36:25 2011 +0300 > > perf, x86: P4 PMU - Read proper MSR register to catch unflagged > overflows > > The read of a proper MSR register was missed and instead of > counter the configration register was tested (it has > ARCH_P4_UNFLAGGED_BIT always cleared) leading to unknown NMI > hitting the system. As result the user may obtain "Dazed and > confused, but trying to continue" message. Fix it by reading a > proper MSR register. > > When an NMI happens on a P4, the perf nmi handler checks the > configuration register to see if the overflow bit is set or not > before taking appropriate action. Unfortunately, various P4 > machines had a broken overflow bit, so a backup mechanism was > implemented. This mechanism checked to see if the counter > rolled over or not. > > A previous commit that implemented this backup mechanism was > broken. Instead of reading the counter register, it used the > configuration register to determine if the counter rolled over > or not. Reading that bit would give incorrect results. > > This would lead to 'Dazed and confused' messages for the end > user when using the perf tool (or if the nmi watchdog is > running). > > The fix is to read the counter register before determining if > the counter rolled over or not. > > Signed-off-by: Don Zickus <dzickus@redhat.com> > Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> > Cc: Lin Ming <ming.m.lin@intel.com> > LKML-Reference: <4D8BAB49.3080701@openvz.org> > Signed-off-by: Ingo Molnar <mingo@elte.hu> > > diff --git a/arch/x86/kernel/cpu/perf_event_p4.c > b/arch/x86/kernel/cpu/perf_event_p4.c index 3769ac8..d3d7b59 100644 > --- a/arch/x86/kernel/cpu/perf_event_p4.c > +++ b/arch/x86/kernel/cpu/perf_event_p4.c > @@ -777,6 +777,7 @@ static inline int p4_pmu_clear_cccr_ovf(struct > hw_perf_event *hwc) * the counter has reached zero value and continued > counting before * real NMI signal was received: > */ > + rdmsrl(hwc->event_base, v); > if (!(v & ARCH_P4_UNFLAGGED_BIT)) > return 1; > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- Maciej Rutecki http://www.maciek.unixy.pl ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [regression 2.6.39-rc2][bisected] "perf, x86: P4 PMU - Read proper MSR register to catch" and NMIs 2011-04-13 19:33 ` Maciej Rutecki @ 2011-04-13 20:01 ` Cyrill Gorcunov 2011-04-13 20:35 ` Shaun Ruffell 2011-04-14 6:47 ` Ingo Molnar 0 siblings, 2 replies; 18+ messages in thread From: Cyrill Gorcunov @ 2011-04-13 20:01 UTC (permalink / raw) To: maciej.rutecki, Shaun Ruffell, Don Zickus Cc: linux-kernel, Ingo Molnar, Lin Ming On 04/13/2011 11:33 PM, Maciej Rutecki wrote: > I created a Bugzilla entry at > https://bugzilla.kernel.org/show_bug.cgi?id=33252 > for your bug report, please add your address to the CC list in there, thanks! > Here is a patch flying around which I tuned a bit, when all reporters confirm it works for them we could close the bug. Thanks. Cyrill -- From: Don Zickus <dzickus@redhat.com> Subject: [PATCH] perf, x86: fix unknown NMIs on a Pentium4 box v2 When using perf on a Pentium4 box, lots of unknown NMIs would be generated. This is the result of a P4 quirk that is subtle. The P4 generates an NMI when the counter overflow and unlike other arches where the NMI is a one time event, the P4 continues to assert its NMI until clear by the OS. As a side effect to this quirk, the NMI on the apic is masked off to prevent a stream of NMIs until the overflow flag is cleared. During the perf re-design, this subtle-ness was overlooked and the apic was unmasked _before_ the overflow flag was cleared. As a result, this generated an extra NMI on the P4 mchines. The fix is trivial, wait until the NMI is properly handled before un-masking the apic. Sadly, in the old nmi watchdog there was a note that explained this exact behaviour. v2: Unmask LVT entry iif IRQ being handled by perf subsystem and add a comment. Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> --- Don, Shaun, Ming, I've tested it on my non-HT machine, so if you have a chance to test it on HT machine -- this would be a great thing! Don, note the version v2 changes, thanks. I've tuned the former a bit but left your From field untouched, are you OK with that? arch/x86/kernel/cpu/perf_event.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) Index: linux-2.6.git/arch/x86/kernel/cpu/perf_event.c ===================================================================== --- linux-2.6.git.orig/arch/x86/kernel/cpu/perf_event.c +++ linux-2.6.git/arch/x86/kernel/cpu/perf_event.c @@ -1370,12 +1370,19 @@ perf_event_nmi_handler(struct notifier_b return NOTIFY_DONE; } - apic_write(APIC_LVTPC, APIC_DM_NMI); handled = x86_pmu.handle_irq(args->regs); if (!handled) return NOTIFY_DONE; + /* + * Unmasking should be done after IRQ handled, otherwise + * there is a race between clearing of counter overflow + * flag and LTV entry unmasking (which might lead to double + * NMIs generation). + */ + apic_write(APIC_LVTPC, APIC_DM_NMI); + this_nmi = percpu_read(irq_stat.__nmi_count); if ((handled > 1) || /* the next nmi could be a back-to-back nmi */ ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [regression 2.6.39-rc2][bisected] "perf, x86: P4 PMU - Read proper MSR register to catch" and NMIs 2011-04-13 20:01 ` Cyrill Gorcunov @ 2011-04-13 20:35 ` Shaun Ruffell 2011-04-13 20:43 ` Cyrill Gorcunov 2011-04-14 6:47 ` Ingo Molnar 1 sibling, 1 reply; 18+ messages in thread From: Shaun Ruffell @ 2011-04-13 20:35 UTC (permalink / raw) To: Cyrill Gorcunov Cc: maciej.rutecki, Don Zickus, linux-kernel, Ingo Molnar, Lin Ming On Thu, Apr 14, 2011 at 12:01:35AM +0400, Cyrill Gorcunov wrote: > On 04/13/2011 11:33 PM, Maciej Rutecki wrote: >> I created a Bugzilla entry at >> https://bugzilla.kernel.org/show_bug.cgi?id=33252 >> for your bug report, please add your address to the CC list in there, thanks! > > Here is a patch flying around which I tuned a bit, when all reporters confirm it > works for them we could close the bug. Thanks. > > Cyrill > -- > From: Don Zickus <dzickus@redhat.com> > Subject: [PATCH] perf, x86: fix unknown NMIs on a Pentium4 box v2 > > When using perf on a Pentium4 box, lots of unknown NMIs would be generated. > This is the result of a P4 quirk that is subtle. The P4 generates an NMI > when the counter overflow and unlike other arches where the NMI is a one time > event, the P4 continues to assert its NMI until clear by the OS. > > As a side effect to this quirk, the NMI on the apic is masked off to prevent > a stream of NMIs until the overflow flag is cleared. During the perf > re-design, this subtle-ness was overlooked and the apic was unmasked _before_ > the overflow flag was cleared. As a result, this generated an extra NMI on > the P4 mchines. > > The fix is trivial, wait until the NMI is properly handled before un-masking > the apic. > > Sadly, in the old nmi watchdog there was a note that explained this exact > behaviour. > > v2: Unmask LVT entry iif IRQ being handled by perf subsystem and add a comment. > > Signed-off-by: Don Zickus <dzickus@redhat.com> > Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> > --- > > Don, Shaun, Ming, I've tested it on my non-HT machine, so if you have a chance > to test it on HT machine -- this would be a great thing! > > Don, note the version v2 changes, thanks. I've tuned the former a bit > but left your From field untouched, are you OK with that? > > arch/x86/kernel/cpu/perf_event.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > Index: linux-2.6.git/arch/x86/kernel/cpu/perf_event.c > ===================================================================== > --- linux-2.6.git.orig/arch/x86/kernel/cpu/perf_event.c > +++ linux-2.6.git/arch/x86/kernel/cpu/perf_event.c > @@ -1370,12 +1370,19 @@ perf_event_nmi_handler(struct notifier_b > return NOTIFY_DONE; > } > > - apic_write(APIC_LVTPC, APIC_DM_NMI); > > handled = x86_pmu.handle_irq(args->regs); > if (!handled) > return NOTIFY_DONE; > > + /* > + * Unmasking should be done after IRQ handled, otherwise > + * there is a race between clearing of counter overflow > + * flag and LTV entry unmasking (which might lead to double > + * NMIs generation). > + */ > + apic_write(APIC_LVTPC, APIC_DM_NMI); > + > this_nmi = percpu_read(irq_stat.__nmi_count); > if ((handled > 1) || > /* the next nmi could be a back-to-back nmi */ I had the first version of the patch running the test builds all night without any NMIs. I installed this one and ran it through the case where I would reliably get early NMIs and it still no NMIs. So for v2: Tested-by: Shaun Ruffell <sruffell@digium.com> Thanks! ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [regression 2.6.39-rc2][bisected] "perf, x86: P4 PMU - Read proper MSR register to catch" and NMIs 2011-04-13 20:35 ` Shaun Ruffell @ 2011-04-13 20:43 ` Cyrill Gorcunov 2011-04-13 21:22 ` Don Zickus 0 siblings, 1 reply; 18+ messages in thread From: Cyrill Gorcunov @ 2011-04-13 20:43 UTC (permalink / raw) To: Shaun Ruffell Cc: maciej.rutecki, Don Zickus, linux-kernel, Ingo Molnar, Lin Ming On 04/14/2011 12:35 AM, Shaun Ruffell wrote: ... > > I had the first version of the patch running the test builds all night without > any NMIs. I installed this one and ran it through the case where I would > reliably get early NMIs and it still no NMIs. > > So for v2: > Tested-by: Shaun Ruffell <sruffell@digium.com> > > Thanks! Thanks a huge Shaun. The thing is (if only I don't miss something) at moment there is no much difference in which patch to pick up. But as only kgdb dives in or any other subsystem (which say would use same manner of nmi delivery) we might be unmasking lvt entry even if nothing were handled at all, so I bias to a second version. -- Cyrill ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [regression 2.6.39-rc2][bisected] "perf, x86: P4 PMU - Read proper MSR register to catch" and NMIs 2011-04-13 20:43 ` Cyrill Gorcunov @ 2011-04-13 21:22 ` Don Zickus 2011-04-13 21:25 ` Cyrill Gorcunov 0 siblings, 1 reply; 18+ messages in thread From: Don Zickus @ 2011-04-13 21:22 UTC (permalink / raw) To: Cyrill Gorcunov Cc: Shaun Ruffell, maciej.rutecki, linux-kernel, Ingo Molnar, Lin Ming On Thu, Apr 14, 2011 at 12:43:47AM +0400, Cyrill Gorcunov wrote: > On 04/14/2011 12:35 AM, Shaun Ruffell wrote: > ... > > > > I had the first version of the patch running the test builds all night without > > any NMIs. I installed this one and ran it through the case where I would > > reliably get early NMIs and it still no NMIs. > > > > So for v2: > > Tested-by: Shaun Ruffell <sruffell@digium.com> > > > > Thanks! > > Thanks a huge Shaun. The thing is (if only I don't miss something) at moment there is > no much difference in which patch to pick up. But as only kgdb dives in or any other > subsystem (which say would use same manner of nmi delivery) we might be unmasking > lvt entry even if nothing were handled at all, so I bias to a second version. I agree with the second version. Initially I wanted to enable it in the case of the !handled path. But your reasoning makes sense to me, don't enable it in the !handled case because you might accidentally do something bad. Cheers, Don ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [regression 2.6.39-rc2][bisected] "perf, x86: P4 PMU - Read proper MSR register to catch" and NMIs 2011-04-13 21:22 ` Don Zickus @ 2011-04-13 21:25 ` Cyrill Gorcunov 2011-04-13 21:53 ` Shaun Ruffell 0 siblings, 1 reply; 18+ messages in thread From: Cyrill Gorcunov @ 2011-04-13 21:25 UTC (permalink / raw) To: Don Zickus Cc: Shaun Ruffell, maciej.rutecki, linux-kernel, Ingo Molnar, Lin Ming, Peter Zijlstra, Stephane Eranian, Robert Richter On 04/14/2011 01:22 AM, Don Zickus wrote: > On Thu, Apr 14, 2011 at 12:43:47AM +0400, Cyrill Gorcunov wrote: >> On 04/14/2011 12:35 AM, Shaun Ruffell wrote: >> ... >>> >>> I had the first version of the patch running the test builds all night without >>> any NMIs. I installed this one and ran it through the case where I would >>> reliably get early NMIs and it still no NMIs. >>> >>> So for v2: >>> Tested-by: Shaun Ruffell <sruffell@digium.com> >>> >>> Thanks! >> >> Thanks a huge Shaun. The thing is (if only I don't miss something) at moment there is >> no much difference in which patch to pick up. But as only kgdb dives in or any other >> subsystem (which say would use same manner of nmi delivery) we might be unmasking >> lvt entry even if nothing were handled at all, so I bias to a second version. > > I agree with the second version. Initially I wanted to enable it in the > case of the !handled path. But your reasoning makes sense to me, don't > enable it in the !handled case because you might accidentally do something > bad. > > Cheers, > Don OK, thanks for review! Shaun please continue testing it, if all will be fine until tomorrow we will ask Ingo to pick it up then. Sounds OK for everyone? (CC'ed a couple of people which were involved into this code snippet) -- Cyrill ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [regression 2.6.39-rc2][bisected] "perf, x86: P4 PMU - Read proper MSR register to catch" and NMIs 2011-04-13 21:25 ` Cyrill Gorcunov @ 2011-04-13 21:53 ` Shaun Ruffell 2011-04-14 14:30 ` Shaun Ruffell 0 siblings, 1 reply; 18+ messages in thread From: Shaun Ruffell @ 2011-04-13 21:53 UTC (permalink / raw) To: Cyrill Gorcunov Cc: Don Zickus, maciej.rutecki, linux-kernel, Ingo Molnar, Lin Ming, Peter Zijlstra, Stephane Eranian, Robert Richter On Thu, Apr 14, 2011 at 01:25:51AM +0400, Cyrill Gorcunov wrote: ... > OK, thanks for review! Shaun please continue testing it, if all will be fine until tomorrow > we will ask Ingo to pick it up then. Sounds OK for everyone? Sounds good. I'll let it run tonight with v2 tonight. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [regression 2.6.39-rc2][bisected] "perf, x86: P4 PMU - Read proper MSR register to catch" and NMIs 2011-04-13 21:53 ` Shaun Ruffell @ 2011-04-14 14:30 ` Shaun Ruffell 2011-04-14 14:33 ` Cyrill Gorcunov 0 siblings, 1 reply; 18+ messages in thread From: Shaun Ruffell @ 2011-04-14 14:30 UTC (permalink / raw) To: Cyrill Gorcunov Cc: Don Zickus, maciej.rutecki, linux-kernel, Ingo Molnar, Lin Ming, Peter Zijlstra, Stephane Eranian, Robert Richter On Wed, Apr 13, 2011 at 04:53:22PM -0500, Shaun Ruffell wrote: > On Thu, Apr 14, 2011 at 01:25:51AM +0400, Cyrill Gorcunov wrote: > ... >> OK, thanks for review! Shaun please continue testing it, if all will be fine until tomorrow >> we will ask Ingo to pick it up then. Sounds OK for everyone? > > Sounds good. I'll let it run tonight with v2 tonight. I know Ingo asked to merge v1 now and something else in a later version. However, just to follow up on this thread, v2 ran all night without producing any NMIs on the PowerEdge 2600. Cheers, Shaun ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [regression 2.6.39-rc2][bisected] "perf, x86: P4 PMU - Read proper MSR register to catch" and NMIs 2011-04-14 14:30 ` Shaun Ruffell @ 2011-04-14 14:33 ` Cyrill Gorcunov 0 siblings, 0 replies; 18+ messages in thread From: Cyrill Gorcunov @ 2011-04-14 14:33 UTC (permalink / raw) To: Shaun Ruffell Cc: Don Zickus, maciej.rutecki, linux-kernel, Ingo Molnar, Lin Ming, Peter Zijlstra, Stephane Eranian, Robert Richter On 04/14/2011 06:30 PM, Shaun Ruffell wrote: > On Wed, Apr 13, 2011 at 04:53:22PM -0500, Shaun Ruffell wrote: >> On Thu, Apr 14, 2011 at 01:25:51AM +0400, Cyrill Gorcunov wrote: >> ... >>> OK, thanks for review! Shaun please continue testing it, if all will be fine until tomorrow >>> we will ask Ingo to pick it up then. Sounds OK for everyone? >> >> Sounds good. I'll let it run tonight with v2 tonight. > > I know Ingo asked to merge v1 now and something else in a later version. > However, just to follow up on this thread, v2 ran all night without producing > any NMIs on the PowerEdge 2600. > > Cheers, > Shaun Thanks Shaun, I'll send v1 shortly, just to be on safe side. Then we will update it for .40 as Ingo proposed. Thanks again for testing! -- Cyrill ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [regression 2.6.39-rc2][bisected] "perf, x86: P4 PMU - Read proper MSR register to catch" and NMIs 2011-04-13 20:01 ` Cyrill Gorcunov 2011-04-13 20:35 ` Shaun Ruffell @ 2011-04-14 6:47 ` Ingo Molnar 2011-04-14 7:51 ` Cyrill Gorcunov 1 sibling, 1 reply; 18+ messages in thread From: Ingo Molnar @ 2011-04-14 6:47 UTC (permalink / raw) To: Cyrill Gorcunov, Peter Zijlstra Cc: maciej.rutecki, Shaun Ruffell, Don Zickus, linux-kernel, Lin Ming * Cyrill Gorcunov <gorcunov@openvz.org> wrote: > - apic_write(APIC_LVTPC, APIC_DM_NMI); > > handled = x86_pmu.handle_irq(args->regs); > if (!handled) > return NOTIFY_DONE; > > + /* > + * Unmasking should be done after IRQ handled, otherwise > + * there is a race between clearing of counter overflow > + * flag and LTV entry unmasking (which might lead to double > + * NMIs generation). > + */ > + apic_write(APIC_LVTPC, APIC_DM_NMI); Here we could leak a masked IRQ through the !handled path. If we got a LVTPC irq we better handle it and unmask the LVTPC unconditionally - regardless of whether we consider it 'handled' or not from the kernel POV ... Thanks, Ingo ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [regression 2.6.39-rc2][bisected] "perf, x86: P4 PMU - Read proper MSR register to catch" and NMIs 2011-04-14 6:47 ` Ingo Molnar @ 2011-04-14 7:51 ` Cyrill Gorcunov 2011-04-14 8:05 ` Ingo Molnar 0 siblings, 1 reply; 18+ messages in thread From: Cyrill Gorcunov @ 2011-04-14 7:51 UTC (permalink / raw) To: Ingo Molnar Cc: Peter Zijlstra, maciej.rutecki, Shaun Ruffell, Don Zickus, linux-kernel, Lin Ming On Thu, Apr 14, 2011 at 10:47 AM, Ingo Molnar <mingo@elte.hu> wrote: > > * Cyrill Gorcunov <gorcunov@openvz.org> wrote: > >> - apic_write(APIC_LVTPC, APIC_DM_NMI); >> >> handled = x86_pmu.handle_irq(args->regs); >> if (!handled) >> return NOTIFY_DONE; >> >> + /* >> + * Unmasking should be done after IRQ handled, otherwise >> + * there is a race between clearing of counter overflow >> + * flag and LTV entry unmasking (which might lead to double >> + * NMIs generation). >> + */ >> + apic_write(APIC_LVTPC, APIC_DM_NMI); > > Here we could leak a masked IRQ through the !handled path. If we got a LVTPC > irq we better handle it and unmask the LVTPC unconditionally - regardless of > whether we consider it 'handled' or not from the kernel POV ... > > Thanks, > > Ingo If there is no counters overflowed I believe we should not poke LVTPC until we sure NMI comes from it (and counter overflow is the only sign that NMI came from LVTPC as far as I may say, and I see also a possibility for race if counter signal reaches LVTPC and it is being processed inside apic chip {which might take some time too before real NMI signal appears in cpu} and as result hard to tell what we get in output -- double nmi again or something else). I might be screwed of course :) ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [regression 2.6.39-rc2][bisected] "perf, x86: P4 PMU - Read proper MSR register to catch" and NMIs 2011-04-14 7:51 ` Cyrill Gorcunov @ 2011-04-14 8:05 ` Ingo Molnar 2011-04-14 9:27 ` Cyrill Gorcunov 0 siblings, 1 reply; 18+ messages in thread From: Ingo Molnar @ 2011-04-14 8:05 UTC (permalink / raw) To: Cyrill Gorcunov Cc: Peter Zijlstra, maciej.rutecki, Shaun Ruffell, Don Zickus, linux-kernel, Lin Ming * Cyrill Gorcunov <gorcunov@openvz.org> wrote: > On Thu, Apr 14, 2011 at 10:47 AM, Ingo Molnar <mingo@elte.hu> wrote: > > > > * Cyrill Gorcunov <gorcunov@openvz.org> wrote: > > > >> - apic_write(APIC_LVTPC, APIC_DM_NMI); > >> > >> handled = x86_pmu.handle_irq(args->regs); > >> if (!handled) > >> return NOTIFY_DONE; > >> > >> + /* > >> + * Unmasking should be done after IRQ handled, otherwise > >> + * there is a race between clearing of counter overflow > >> + * flag and LTV entry unmasking (which might lead to double > >> + * NMIs generation). > >> + */ > >> + apic_write(APIC_LVTPC, APIC_DM_NMI); > > > > Here we could leak a masked IRQ through the !handled path. If we got a LVTPC > > irq we better handle it and unmask the LVTPC unconditionally - regardless of > > whether we consider it 'handled' or not from the kernel POV ... > > > > Thanks, > > > > Ingo > > If there is no counters overflowed I believe we should not poke LVTPC until > we sure NMI comes from it (and counter overflow is the only sign that NMI > came from LVTPC as far as I may say, and I see also a possibility for race if > counter signal reaches LVTPC and it is being processed inside apic chip > {which might take some time too before real NMI signal appears in cpu} and as > result hard to tell what we get in output -- double nmi again or something > else). Well, we unmasked unconditionally before. If we unmask conditionally now, we risk not unmasking. We risk a completely stuck PMU (there wont ever come *any* NMI from it if we ever forget to unmask) versus spurious NMIs. Maybe we can do it - but it will need a lot of testing on a lot of CPU types to make sure there's no other CPU quirks in this area ... So unless the conditional unmasking fixes a real bug (in kgdb or elsewhere) lets unmask unconditionally now to fix the P4 regression in .39 - and queue up a *separate* patch that moves it even further down and makes it conditional - but queue that up for .40. Thanks, Ingo ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [regression 2.6.39-rc2][bisected] "perf, x86: P4 PMU - Read proper MSR register to catch" and NMIs 2011-04-14 8:05 ` Ingo Molnar @ 2011-04-14 9:27 ` Cyrill Gorcunov 0 siblings, 0 replies; 18+ messages in thread From: Cyrill Gorcunov @ 2011-04-14 9:27 UTC (permalink / raw) To: Ingo Molnar Cc: Peter Zijlstra, maciej.rutecki, Shaun Ruffell, Don Zickus, linux-kernel, Lin Ming On Thu, Apr 14, 2011 at 12:05 PM, Ingo Molnar <mingo@elte.hu> wrote: ... >> If there is no counters overflowed I believe we should not poke LVTPC until >> we sure NMI comes from it (and counter overflow is the only sign that NMI >> came from LVTPC as far as I may say, and I see also a possibility for race if >> counter signal reaches LVTPC and it is being processed inside apic chip >> {which might take some time too before real NMI signal appears in cpu} and as >> result hard to tell what we get in output -- double nmi again or something >> else). > > Well, we unmasked unconditionally before. If we unmask conditionally now, we > risk not unmasking. We risk a completely stuck PMU (there wont ever come *any* > NMI from it if we ever forget to unmask) versus spurious NMIs. > > Maybe we can do it - but it will need a lot of testing on a lot of CPU types to > make sure there's no other CPU quirks in this area ... > > So unless the conditional unmasking fixes a real bug (in kgdb or elsewhere) > lets unmask unconditionally now to fix the P4 regression in .39 - and queue up > a *separate* patch that moves it even further down and makes it conditional - > but queue that up for .40. > > Thanks, > > Ingo > OK. Ingo I'll send a patch from Don with all tested-by (including me) and my ack as only get back home. (I don't mind if Don beat me on this ;) ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2011-04-14 14:33 UTC | newest] Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2011-04-06 22:30 [regression 2.6.39-rc2][bisected] "perf, x86: P4 PMU - Read proper MSR register to catch" and NMIs Shaun Ruffell 2011-04-07 0:16 ` Don Zickus 2011-04-07 3:18 ` Cyrill Gorcunov 2011-04-07 14:38 ` Shaun Ruffell 2011-04-07 14:43 ` Cyrill Gorcunov 2011-04-13 19:33 ` Maciej Rutecki 2011-04-13 20:01 ` Cyrill Gorcunov 2011-04-13 20:35 ` Shaun Ruffell 2011-04-13 20:43 ` Cyrill Gorcunov 2011-04-13 21:22 ` Don Zickus 2011-04-13 21:25 ` Cyrill Gorcunov 2011-04-13 21:53 ` Shaun Ruffell 2011-04-14 14:30 ` Shaun Ruffell 2011-04-14 14:33 ` Cyrill Gorcunov 2011-04-14 6:47 ` Ingo Molnar 2011-04-14 7:51 ` Cyrill Gorcunov 2011-04-14 8:05 ` Ingo Molnar 2011-04-14 9:27 ` Cyrill Gorcunov
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.