From: "George Spelvin" <linux@horizon.com>
To: airlied@gmail.com, gorcunov@gmail.com
Cc: a.p.zijlstra@chello.nl, dzickus@redhat.com, eranian@google.com,
	linux-kernel@vger.kernel.org, linux@horizon.com,
	ming.m.lin@intel.com, mingo@elte.hu
Subject: Re: 2.6.38-rc2: Uhhuh. NMI received for unknown reason 2d on CPU 0.
Date: 16 Feb 2011 06:57:01 -0500
Message-ID: <20110216115701.3956.qmail@science.horizon.com>
In-Reply-To: <AANLkTikLVMNQzmugH-ORc89_ZouX6zWvG-tHSf1QoVf3@mail.gmail.com>

> Ping on this problem, still seeing
> 
> Uhhuh. NMI received for unknown reason 3c on CPU 0.
> Do you have a strange power saving mode enabled?
> Dazed and confused, but trying to continue
> 
> on my Pentium-D system here with latest Linus head.
> 
> It's sometimes 3c, sometimes 3d; I'm going to bisect and push for
> reverts if nobody still has any clue about how to fix this.

The second patch (not the one you quote) fixed it for me.  Almost 8 days
of uptime and no log spam.

It's appended below for your convenience.  Are you using this
unsuccessfully?


From: Cyrill Gorcunov <gorcunov@openvz.org>
Subject: [PATCH] perf, x86: P4 PMU -- Fix unflagged overflows test

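A couple of people have reported an unknown-NMI issue on the P4 PMU.
This patch should fix it: the unflagged-overflow test now checks the
counter's high bit instead of testing for a counter value of exactly zero.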

Reported-by: George Spelvin <linux@horizon.com>
Reported-by: Meelis Roos <mroos@linux.ee>
Reported-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Ingo Molnar <mingo@elte.hu>
CC: Lin Ming <ming.m.lin@intel.com>
CC: Don Zickus <dzickus@redhat.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 arch/x86/include/asm/perf_event_p4.h |    1 +
 arch/x86/kernel/cpu/perf_event_p4.c  |   11 ++++++++---
 2 files changed, 9 insertions(+), 3 deletions(-)

Index: linux-2.6.tip/arch/x86/include/asm/perf_event_p4.h
===================================================================
--- linux-2.6.tip.orig/arch/x86/include/asm/perf_event_p4.h
+++ linux-2.6.tip/arch/x86/include/asm/perf_event_p4.h
@@ -22,6 +22,7 @@
 
 #define ARCH_P4_CNTRVAL_BITS	(40)
 #define ARCH_P4_CNTRVAL_MASK	((1ULL << ARCH_P4_CNTRVAL_BITS) - 1)
+#define ARCH_P4_UNFLAGGED_BIT	((1ULL) << (ARCH_P4_CNTRVAL_BITS - 1))
 
 #define P4_ESCR_EVENT_MASK	0x7e000000U
 #define P4_ESCR_EVENT_SHIFT	25
Index: linux-2.6.tip/arch/x86/kernel/cpu/perf_event_p4.c
===================================================================
--- linux-2.6.tip.orig/arch/x86/kernel/cpu/perf_event_p4.c
+++ linux-2.6.tip/arch/x86/kernel/cpu/perf_event_p4.c
@@ -770,9 +770,14 @@ static inline int p4_pmu_clear_cccr_ovf(
 		return 1;
 	}
 
-	/* it might be unflagged overflow */
-	rdmsrl(hwc->event_base + hwc->idx, v);
-	if (!(v & ARCH_P4_CNTRVAL_MASK))
+	/*
+	 * In some circumstances the overflow might issue an NMI without
+	 * setting the P4_CCCR_OVF bit. Because a counter holds a negative
+	 * value, we simply check whether the high bit is set; if it is
+	 * cleared, the counter has reached zero and continued counting
+	 * before the real NMI signal was received.
+	 */
+	if (!(v & ARCH_P4_UNFLAGGED_BIT))
 		return 1;
 
 	return 0;
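
To illustrate why the high-bit test catches cases the old zero test
missed, here is a minimal user-space sketch. It is not kernel code: it
assumes the counter is programmed with a negated period truncated to
40 bits, and the helper names (counter_start, counter_advance,
old_test, new_test) are hypothetical, introduced only for this example.

```c
#include <stdint.h>

#define CNTRVAL_BITS   40
#define CNTRVAL_MASK   ((1ULL << CNTRVAL_BITS) - 1)
#define UNFLAGGED_BIT  (1ULL << (CNTRVAL_BITS - 1))

/* Hypothetical: initial counter value is -period, truncated to 40 bits. */
static uint64_t counter_start(uint64_t period)
{
	return (0 - period) & CNTRVAL_MASK;
}

/* Hypothetical: count 'events' more events, wrapping at 40 bits. */
static uint64_t counter_advance(uint64_t v, uint64_t events)
{
	return (v + events) & CNTRVAL_MASK;
}

/* Old test: treats it as an overflow only if the counter is exactly zero. */
static int old_test(uint64_t v)
{
	return !(v & CNTRVAL_MASK);
}

/* New test: overflow if the sign bit (bit 39) is clear, i.e. the counter
 * has crossed zero, even if it kept counting afterwards. */
static int new_test(uint64_t v)
{
	return !(v & UNFLAGGED_BIT);
}
```

With a period of 1000, both tests report no overflow at the start and
both report an overflow after exactly 1000 events. After 1003 events the
counter holds 3: old_test() returns 0, which is the case that would be
reported as an unknown NMI, while new_test() returns 1.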