All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3 v2] new nmi_watchdog using perf events
@ 2010-02-06  2:47 Don Zickus
  2010-02-06  2:47 ` [PATCH 1/3 v2] [x86] move notify_die from nmi.c to traps.c Don Zickus
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Don Zickus @ 2010-02-06  2:47 UTC (permalink / raw)
  To: mingo, peterz; +Cc: gorcunov, aris, linux-kernel, Don Zickus

This patch series tries to implement a new nmi_watchdog using the perf
events subsystem.  I am posting this series early for feedback on the
approach.  It isn't feature compatible with the old nmi_watchdog yet, nor
does it have all the old workarounds either.

The basic design is to create an in-kernel performance counter that goes off
every few seconds and checks for cpu lockups.  It is fairly straight
forward.  Some of the quirks are making sure the cpu lockup detection works
correctly.

It has been lightly tested for now.  Once people are ok with the approach,
I'll expand testing to more machines in our lab.

I tried taking a generic approach so all arches could use it if they want
and then implement some per arch specific hooks.  I believe this is what
Ingo was suggesting.  The initial work is based off of kernel/softlockup.c.

Any feedback would be great.

v2:
- moved a notify_die call into a #ifdef block
- used better default values for configuring the nmi_watchdog based on
  the old nmi_watchdog values

Cheers,
Don

--
damn it forgot to cc lkml

Don Zickus (3):
  [RFC][x86] move notify_die from nmi.c to traps.c
  [RFC] nmi_watchdog: new implementation using perf events
  [RFC] nmi_watchdog: config option to enable new nmi_watchdog

 arch/x86/kernel/apic/Makefile |    7 ++-
 arch/x86/kernel/apic/hw_nmi.c |  114 ++++++++++++++++++++++++
 arch/x86/kernel/apic/nmi.c    |    7 --
 arch/x86/kernel/traps.c       |    7 ++
 include/linux/nmi.h           |    4 +
 kernel/Makefile               |    1 +
 kernel/nmi_watchdog.c         |  196 +++++++++++++++++++++++++++++++++++++++++
 lib/Kconfig.debug             |   13 +++
 8 files changed, 341 insertions(+), 8 deletions(-)
 create mode 100644 arch/x86/kernel/apic/hw_nmi.c
 create mode 100644 kernel/nmi_watchdog.c


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/3 v2] [x86] move notify_die from nmi.c to traps.c
  2010-02-06  2:47 [PATCH 0/3 v2] new nmi_watchdog using perf events Don Zickus
@ 2010-02-06  2:47 ` Don Zickus
  2010-02-08  8:51   ` [tip:perf/nmi] x86: Move " tip-bot for Don Zickus
  2010-02-06  2:47 ` [PATCH 2/3 v2] nmi_watchdog: new implementation using perf events Don Zickus
  2010-02-06  2:47 ` [PATCH 3/3 v2] nmi_watchdog: config option to enable new nmi_watchdog Don Zickus
  2 siblings, 1 reply; 11+ messages in thread
From: Don Zickus @ 2010-02-06  2:47 UTC (permalink / raw)
  To: mingo, peterz; +Cc: gorcunov, aris, linux-kernel, Don Zickus

In order to handle a new nmi_watchdog approach, I need to move the notify_die()
routine out of nmi_watchdog_tick() and into default_do_nmi().  This lets me easily
swap out the old nmi_watchdog with the new one with just a config change.

The change probably makes sense from a high level perspective because the
nmi_watchdog shouldn't be handling notify_die routines anyway.  However, this
move does change the semantics a little bit.  Instead of checking on every nmi
interrupt if the cpus are stuck, only check them on the nmi_watchdog interrupts.

v2: move notify_die call into #idef block

Signed-off-by: Don Zickus <dzickus@redhat.com>
---
 arch/x86/kernel/apic/nmi.c |    7 -------
 arch/x86/kernel/traps.c    |    5 +++++
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/apic/nmi.c b/arch/x86/kernel/apic/nmi.c
index 0159a69..5d47682 100644
--- a/arch/x86/kernel/apic/nmi.c
+++ b/arch/x86/kernel/apic/nmi.c
@@ -400,13 +400,6 @@ nmi_watchdog_tick(struct pt_regs *regs, unsigned reason)
 	int cpu = smp_processor_id();
 	int rc = 0;
 
-	/* check for other users first */
-	if (notify_die(DIE_NMI, "nmi", regs, reason, 2, SIGINT)
-			== NOTIFY_STOP) {
-		rc = 1;
-		touched = 1;
-	}
-
 	sum = get_timer_irqs(cpu);
 
 	if (__get_cpu_var(nmi_touch)) {
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 3339917..3be4687 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -400,7 +400,12 @@ static notrace __kprobes void default_do_nmi(struct pt_regs *regs)
 		if (notify_die(DIE_NMI_IPI, "nmi_ipi", regs, reason, 2, SIGINT)
 								== NOTIFY_STOP)
 			return;
+
 #ifdef CONFIG_X86_LOCAL_APIC
+	        if (notify_die(DIE_NMI, "nmi", regs, reason, 2, SIGINT)
+	        			                == NOTIFY_STOP)
+	                return;
+
 		/*
 		 * Ok, so this is none of the documented NMI sources,
 		 * so it must be the NMI watchdog.
-- 
1.6.6.83.gc9a2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/3 v2] nmi_watchdog: new implementation using perf events
  2010-02-06  2:47 [PATCH 0/3 v2] new nmi_watchdog using perf events Don Zickus
  2010-02-06  2:47 ` [PATCH 1/3 v2] [x86] move notify_die from nmi.c to traps.c Don Zickus
@ 2010-02-06  2:47 ` Don Zickus
  2010-02-08  8:51   ` [tip:perf/nmi] nmi_watchdog: Add new, generic implementation, " tip-bot for Don Zickus
  2010-02-06  2:47 ` [PATCH 3/3 v2] nmi_watchdog: config option to enable new nmi_watchdog Don Zickus
  2 siblings, 1 reply; 11+ messages in thread
From: Don Zickus @ 2010-02-06  2:47 UTC (permalink / raw)
  To: mingo, peterz; +Cc: gorcunov, aris, linux-kernel, Don Zickus

This is a new generic nmi_watchdog implementation using the perf events
infrastructure as suggested by Ingo.

The implementation is simple, just create an in-kernel perf event and register
an overflow handler to check for cpu lockups.  I created a generic implementation
that lives in kernel/ and the hardware specific part that for now lives in arch/x86.

I have done light testing to make sure the framework works correctly and it does.

v2:  sets the correct timeout values based on the old nmi watchdog

Signed-off-by: Don Zickus <dzickus@redhat.com>
---
 arch/x86/kernel/apic/hw_nmi.c |  114 ++++++++++++++++++++++++
 kernel/nmi_watchdog.c         |  191 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 305 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/kernel/apic/hw_nmi.c
 create mode 100644 kernel/nmi_watchdog.c

diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
new file mode 100644
index 0000000..8c0e6a4
--- /dev/null
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -0,0 +1,114 @@
+/*
+ *  HW NMI watchdog support
+ *
+ *  started by Don Zickus, Copyright (C) 2010 Red Hat, Inc.
+ *
+ *  Arch specific calls to support NMI watchdog
+ *
+ *  Bits copied from original nmi.c file
+ *
+ */
+
+#include <asm/apic.h>
+#include <linux/smp.h>
+#include <linux/cpumask.h>
+#include <linux/sched.h>
+#include <linux/percpu.h>
+#include <linux/cpumask.h>
+#include <linux/kernel_stat.h>
+#include <asm/mce.h>
+
+#include <linux/nmi.h>
+#include <linux/module.h>
+
+/* For reliability, we're prepared to waste bits here. */
+static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
+
+static DEFINE_PER_CPU(unsigned, last_irq_sum);
+
+/*
+ * Take the local apic timer and PIT/HPET into account. We don't
+ * know which one is active, when we have highres/dyntick on
+ */
+static inline unsigned int get_timer_irqs(int cpu)
+{
+        return per_cpu(irq_stat, cpu).apic_timer_irqs +
+                per_cpu(irq_stat, cpu).irq0_irqs;
+}
+
+static inline int mce_in_progress(void)
+{
+#if defined(CONFIG_X86_MCE)
+        return atomic_read(&mce_entry) > 0;
+#endif
+        return 0;
+}
+
+int hw_nmi_is_cpu_stuck(struct pt_regs *regs)
+{
+	unsigned int sum;
+	int cpu = smp_processor_id();
+
+	/* FIXME: cheap hack for this check, probably should get its own
+	 * die_notifier handler
+	 */
+	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
+		static DEFINE_SPINLOCK(lock);	/* Serialise the printks */
+
+		spin_lock(&lock);
+		printk(KERN_WARNING "NMI backtrace for cpu %d\n", cpu);
+		show_regs(regs);
+		dump_stack();
+		spin_unlock(&lock);
+		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
+	}
+
+	/* if we are doing an mce, just assume the cpu is not stuck */
+        /* Could check oops_in_progress here too, but it's safer not to */
+        if (mce_in_progress())
+                return 0;
+
+	/* We determine if the cpu is stuck by checking whether any
+	 * interrupts have happened since we last checked.  Of course
+	 * an nmi storm could create false positives, but the higher
+	 * level logic should account for that
+	 */
+	sum = get_timer_irqs(cpu);
+	if (__get_cpu_var(last_irq_sum) == sum) {
+		return 1;
+	} else {
+		__get_cpu_var(last_irq_sum) = sum;
+		return 0;
+	}
+}
+
+void arch_trigger_all_cpu_backtrace(void)
+{
+	int i;
+
+	cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask);
+
+	printk(KERN_INFO "sending NMI to all CPUs:\n");
+	apic->send_IPI_all(NMI_VECTOR);
+
+	/* Wait for up to 10 seconds for all CPUs to do the backtrace */
+	for (i = 0; i < 10 * 1000; i++) {
+		if (cpumask_empty(to_cpumask(backtrace_mask)))
+			break;
+		mdelay(1);
+	}
+}
+
+/* STUB calls to mimic old nmi_watchdog behaviour */
+unsigned int nmi_watchdog = NMI_NONE;
+EXPORT_SYMBOL(nmi_watchdog);
+atomic_t nmi_active = ATOMIC_INIT(0);           /* oprofile uses this */
+EXPORT_SYMBOL(nmi_active);
+int nmi_watchdog_enabled;
+int unknown_nmi_panic;
+void cpu_nmi_set_wd_enabled(void) { return; }
+void acpi_nmi_enable(void) { return; }
+void acpi_nmi_disable(void) { return; }
+void stop_apic_nmi_watchdog(void *unused) { return; }
+void setup_apic_nmi_watchdog(void *unused) { return; }
+int __init check_nmi_watchdog(void) { return 0; }
diff --git a/kernel/nmi_watchdog.c b/kernel/nmi_watchdog.c
new file mode 100644
index 0000000..36817b2
--- /dev/null
+++ b/kernel/nmi_watchdog.c
@@ -0,0 +1,191 @@
+/*
+ * Detect Hard Lockups using the NMI
+ *
+ * started by Don Zickus, Copyright (C) 2010 Red Hat, Inc.
+ *
+ * this code detects hard lockups: incidents in where on a CPU
+ * the kernel does not respond to anything except NMI.
+ *
+ * Note: Most of this code is borrowed heavily from softlockup.c,
+ * so thanks to Ingo for the initial implementation.
+ * Some chunks also taken from arch/x86/kernel/apic/nmi.c, thanks
+ * to those contributors as well.
+ */
+
+#include <linux/mm.h>
+#include <linux/cpu.h>
+#include <linux/nmi.h>
+#include <linux/init.h>
+#include <linux/delay.h>
+#include <linux/freezer.h>
+#include <linux/lockdep.h>
+#include <linux/notifier.h>
+#include <linux/module.h>
+#include <linux/sysctl.h>
+
+#include <asm/irq_regs.h>
+#include <linux/perf_event.h>
+
+static DEFINE_PER_CPU(struct perf_event *, nmi_watchdog_ev);
+static DEFINE_PER_CPU(int, nmi_watchdog_touch);
+static DEFINE_PER_CPU(long, alert_counter);
+
+void touch_nmi_watchdog(void)
+{
+	__raw_get_cpu_var(nmi_watchdog_touch) = 1;
+	touch_softlockup_watchdog();
+}
+EXPORT_SYMBOL(touch_nmi_watchdog);
+
+void touch_all_nmi_watchdog(void)
+{
+	int cpu;
+
+	for_each_online_cpu(cpu)
+		per_cpu(nmi_watchdog_touch, cpu) = 1;
+	touch_softlockup_watchdog();
+}
+
+#ifdef CONFIG_SYSCTL
+/*
+ * proc handler for /proc/sys/kernel/nmi_watchdog
+ */
+int proc_nmi_enabled(struct ctl_table *table, int write,
+		     void __user *buffer, size_t *length, loff_t *ppos)
+{
+	int cpu;
+
+	if (per_cpu(nmi_watchdog_ev, smp_processor_id()) == NULL)
+		nmi_watchdog_enabled = 0;
+	else
+		nmi_watchdog_enabled = 1;
+
+	touch_all_nmi_watchdog();
+	proc_dointvec(table, write, buffer, length, ppos);
+	if (nmi_watchdog_enabled)
+		for_each_online_cpu(cpu)
+			perf_event_enable(per_cpu(nmi_watchdog_ev, cpu));
+	else
+		for_each_online_cpu(cpu)
+			perf_event_disable(per_cpu(nmi_watchdog_ev, cpu));
+	return 0;
+}
+
+#endif /* CONFIG_SYSCTL */
+
+struct perf_event_attr wd_attr = {
+	.type = PERF_TYPE_HARDWARE,
+	.config = PERF_COUNT_HW_CPU_CYCLES,
+	.size = sizeof(struct perf_event_attr),
+	.pinned = 1,
+	.disabled = 1,
+};
+
+static int panic_on_timeout;
+
+void wd_overflow(struct perf_event *event, int nmi,
+		 struct perf_sample_data *data,
+		 struct pt_regs *regs)
+{
+	int cpu = smp_processor_id();
+	int touched = 0;
+
+	if (__get_cpu_var(nmi_watchdog_touch)) {
+		per_cpu(nmi_watchdog_touch, cpu) = 0;
+		touched = 1;
+	}
+
+	/* check to see if the cpu is doing anything */
+	if (!touched && hw_nmi_is_cpu_stuck(regs)) {
+		/*
+		 * Ayiee, looks like this CPU is stuck ...
+		 * wait a few IRQs (5 seconds) before doing the oops ...
+		 */
+		per_cpu(alert_counter,cpu) += 1;
+		if (per_cpu(alert_counter,cpu) == 5) {
+			/*
+			 * die_nmi will return ONLY if NOTIFY_STOP happens..
+			 */
+			die_nmi("BUG: NMI Watchdog detected LOCKUP",
+				regs, panic_on_timeout);
+		}
+	} else {
+		per_cpu(alert_counter,cpu) = 0;
+	}
+
+	return;
+}
+
+/*
+ * Create/destroy watchdog threads as CPUs come and go:
+ */
+static int __cpuinit
+cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
+{
+	int hotcpu = (unsigned long)hcpu;
+	struct perf_event *event;
+
+	switch (action) {
+	case CPU_UP_PREPARE:
+	case CPU_UP_PREPARE_FROZEN:
+		per_cpu(nmi_watchdog_touch, hotcpu) = 0;
+		break;
+	case CPU_ONLINE:
+	case CPU_ONLINE_FROZEN:
+		/* originally wanted the below chunk to be in CPU_UP_PREPARE, but caps is unpriv for non-CPU0 */
+		wd_attr.sample_period = cpu_khz * 1000;
+		event = perf_event_create_kernel_counter(&wd_attr, hotcpu, -1, wd_overflow);
+		if (IS_ERR(event)) {
+			printk(KERN_ERR "nmi watchdog failed to create perf event on %i: %p\n", hotcpu, event);
+			return NOTIFY_BAD;
+		}
+		per_cpu(nmi_watchdog_ev, hotcpu) = event;
+		perf_event_enable(per_cpu(nmi_watchdog_ev, hotcpu));
+		break;
+#ifdef CONFIG_HOTPLUG_CPU
+	case CPU_UP_CANCELED:
+	case CPU_UP_CANCELED_FROZEN:
+		perf_event_disable(per_cpu(nmi_watchdog_ev, hotcpu));
+	case CPU_DEAD:
+	case CPU_DEAD_FROZEN:
+		event = per_cpu(nmi_watchdog_ev, hotcpu);
+		per_cpu(nmi_watchdog_ev, hotcpu) = NULL;
+		perf_event_release_kernel(event);
+		break;
+#endif /* CONFIG_HOTPLUG_CPU */
+	}
+	return NOTIFY_OK;
+}
+
+static struct notifier_block __cpuinitdata cpu_nfb = {
+	.notifier_call = cpu_callback
+};
+
+static int __initdata nonmi_watchdog;
+
+static int __init nonmi_watchdog_setup(char *str)
+{
+	nonmi_watchdog = 1;
+	return 1;
+}
+__setup("nonmi_watchdog", nonmi_watchdog_setup);
+
+static int __init spawn_nmi_watchdog_task(void)
+{
+	void *cpu = (void *)(long)smp_processor_id();
+	int err;
+
+	if (nonmi_watchdog)
+		return 0;
+
+	err = cpu_callback(&cpu_nfb, CPU_UP_PREPARE, cpu);
+	if (err == NOTIFY_BAD) {
+		BUG();
+		return 1;
+	}
+	cpu_callback(&cpu_nfb, CPU_ONLINE, cpu);
+	register_cpu_notifier(&cpu_nfb);
+
+	return 0;
+}
+early_initcall(spawn_nmi_watchdog_task);
-- 
1.6.6.83.gc9a2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/3 v2] nmi_watchdog: config option to enable new nmi_watchdog
  2010-02-06  2:47 [PATCH 0/3 v2] new nmi_watchdog using perf events Don Zickus
  2010-02-06  2:47 ` [PATCH 1/3 v2] [x86] move notify_die from nmi.c to traps.c Don Zickus
  2010-02-06  2:47 ` [PATCH 2/3 v2] nmi_watchdog: new implementation using perf events Don Zickus
@ 2010-02-06  2:47 ` Don Zickus
  2010-02-08  7:19   ` Ingo Molnar
                     ` (2 more replies)
  2 siblings, 3 replies; 11+ messages in thread
From: Don Zickus @ 2010-02-06  2:47 UTC (permalink / raw)
  To: mingo, peterz; +Cc: gorcunov, aris, linux-kernel, Don Zickus

These are the bits that enable the new nmi_watchdog and safely isolate the
old nmi_watchdog.  Only one or the other can run, not both at the same
time.

Signed-off-by: Don Zickus <dzickus@redhat.com>
---
 arch/x86/kernel/apic/Makefile |    7 ++++++-
 arch/x86/kernel/traps.c       |    2 ++
 include/linux/nmi.h           |    4 ++++
 kernel/Makefile               |    1 +
 lib/Kconfig.debug             |   13 +++++++++++++
 5 files changed, 26 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/apic/Makefile b/arch/x86/kernel/apic/Makefile
index 565c1bf..1a4512e 100644
--- a/arch/x86/kernel/apic/Makefile
+++ b/arch/x86/kernel/apic/Makefile
@@ -2,7 +2,12 @@
 # Makefile for local APIC drivers and for the IO-APIC code
 #
 
-obj-$(CONFIG_X86_LOCAL_APIC)	+= apic.o apic_noop.o probe_$(BITS).o ipi.o nmi.o
+obj-$(CONFIG_X86_LOCAL_APIC)	+= apic.o apic_noop.o probe_$(BITS).o ipi.o
+ifneq ($(CONFIG_NMI_WATCHDOG),y)
+obj-$(CONFIG_X86_LOCAL_APIC)	+= nmi.o
+endif
+obj-$(CONFIG_NMI_WATCHDOG)	+= hw_nmi.o
+
 obj-$(CONFIG_X86_IO_APIC)	+= io_apic.o
 obj-$(CONFIG_SMP)		+= ipi.o
 
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 3be4687..5b89638 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -406,6 +406,7 @@ static notrace __kprobes void default_do_nmi(struct pt_regs *regs)
 	        			                == NOTIFY_STOP)
 	                return;
 
+#ifndef CONFIG_NMI_WATCHDOG
 		/*
 		 * Ok, so this is none of the documented NMI sources,
 		 * so it must be the NMI watchdog.
@@ -413,6 +414,7 @@ static notrace __kprobes void default_do_nmi(struct pt_regs *regs)
 		if (nmi_watchdog_tick(regs, reason))
 			return;
 		if (!do_nmi_callback(regs, cpu))
+#endif /* !CONFIG_NMI_WATCHDOG */
 			unknown_nmi_error(reason, regs);
 #else
 		unknown_nmi_error(reason, regs);
diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index b752e80..a42ff0b 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -47,4 +47,8 @@ static inline bool trigger_all_cpu_backtrace(void)
 }
 #endif
 
+#ifdef CONFIG_NMI_WATCHDOG
+int hw_nmi_is_cpu_stuck(struct pt_regs *);
+#endif
+
 #endif
diff --git a/kernel/Makefile b/kernel/Makefile
index 864ff75..8a5abe5 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -76,6 +76,7 @@ obj-$(CONFIG_AUDIT_TREE) += audit_tree.o
 obj-$(CONFIG_KPROBES) += kprobes.o
 obj-$(CONFIG_KGDB) += kgdb.o
 obj-$(CONFIG_DETECT_SOFTLOCKUP) += softlockup.o
+obj-$(CONFIG_NMI_WATCHDOG) += nmi_watchdog.o
 obj-$(CONFIG_DETECT_HUNG_TASK) += hung_task.o
 obj-$(CONFIG_GENERIC_HARDIRQS) += irq/
 obj-$(CONFIG_SECCOMP) += seccomp.o
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 25c3ed5..04a43a2 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -170,6 +170,19 @@ config DETECT_SOFTLOCKUP
 	   can be detected via the NMI-watchdog, on platforms that
 	   support it.)
 
+config NMI_WATCHDOG
+	bool "Detect Hard Lockups with an NMI Watchdog"
+	depends on DEBUG_KERNEL && PERF_EVENTS
+	default y
+	help
+	  Say Y here to enable the kernel to use the NMI as a watchdog
+	  to detect hard lockups.  This is useful when a cpu hangs for no
+	  reason but can still respond to NMIs.  A backtrace is displayed
+	  for reviewing and reporting.
+
+	  The overhead should be minimal, just an extra NMI every few 
+	  seconds.
+
 config BOOTPARAM_SOFTLOCKUP_PANIC
 	bool "Panic (Reboot) On Soft Lockups"
 	depends on DETECT_SOFTLOCKUP
-- 
1.6.6.83.gc9a2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3 v2] nmi_watchdog: config option to enable new nmi_watchdog
  2010-02-06  2:47 ` [PATCH 3/3 v2] nmi_watchdog: config option to enable new nmi_watchdog Don Zickus
@ 2010-02-08  7:19   ` Ingo Molnar
  2010-02-08  9:39     ` Cyrill Gorcunov
  2010-02-08 14:58     ` Don Zickus
  2010-02-08  8:52   ` [tip:perf/nmi] nmi_watchdog: Config " tip-bot for Don Zickus
  2010-02-09 10:51   ` [tip:perf/nmi] nmi_watchdog: Only enable on x86 for now tip-bot for Ingo Molnar
  2 siblings, 2 replies; 11+ messages in thread
From: Ingo Molnar @ 2010-02-08  7:19 UTC (permalink / raw)
  To: Don Zickus; +Cc: peterz, gorcunov, aris, linux-kernel, Andrew Morton


* Don Zickus <dzickus@redhat.com> wrote:

> +config NMI_WATCHDOG
> +	bool "Detect Hard Lockups with an NMI Watchdog"
> +	depends on DEBUG_KERNEL && PERF_EVENTS
> +	default y
> +	help
> +	  Say Y here to enable the kernel to use the NMI as a watchdog
> +	  to detect hard lockups.  This is useful when a cpu hangs for no
> +	  reason but can still respond to NMIs.  A backtrace is displayed
> +	  for reviewing and reporting.
> +
> +	  The overhead should be minimal, just an extra NMI every few 
> +	  seconds.

Thought for later patches: I think an architecture should be able to express 
via a Kconfig switch that it actually _has_ NMI events. There's architectures 
which dont have a PMU driver and only have software events. There's also 
architectures that have a PMU driver but no NMIs.

Something like ARCH_HAS_NMI_PERF_EVENTS?

Also, i havent checked, but what is the practical effect of the new generic 
watchdog on x86 CPUs that does not have a native PMU driver yet - such as 
P4s?

Anyway, i'll create a tip:perf/nmi topic branch for these patches, it 
certainly looks like a useful generalization and a new architecture that has 
perf could easily enable it, without having to write its own NMI watchdog 
implementation. It's also useful for any new watchdog features that people 
might want to add. Plus it makes the x86 PMU code cleaner in the long run as 
well.

Thanks, 

	Ingo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [tip:perf/nmi] x86: Move notify_die from nmi.c to traps.c
  2010-02-06  2:47 ` [PATCH 1/3 v2] [x86] move notify_die from nmi.c to traps.c Don Zickus
@ 2010-02-08  8:51   ` tip-bot for Don Zickus
  0 siblings, 0 replies; 11+ messages in thread
From: tip-bot for Don Zickus @ 2010-02-08  8:51 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, torvalds, akpm, tglx, dzickus, mingo

Commit-ID:  e40b17208b6805be50ffe891878662b6076206b9
Gitweb:     http://git.kernel.org/tip/e40b17208b6805be50ffe891878662b6076206b9
Author:     Don Zickus <dzickus@redhat.com>
AuthorDate: Fri, 5 Feb 2010 21:47:03 -0500
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Mon, 8 Feb 2010 08:29:02 +0100

x86: Move notify_die from nmi.c to traps.c

In order to handle a new nmi_watchdog approach, I need to move
the notify_die() routine out of nmi_watchdog_tick() and into
default_do_nmi(). This lets me easily swap out the old
nmi_watchdog with the new one with just a config change.

The change probably makes sense from a high level perspective
because the nmi_watchdog shouldn't be handling notify_die
routines anyway.  However, this move does change the semantics a
little bit.  Instead of checking on every nmi interrupt if the
cpus are stuck, only check them on the nmi_watchdog interrupts.

 v2: Move notify_die call into #idef block

Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: gorcunov@gmail.com
Cc: aris@redhat.com
Cc: peterz@infradead.org
LKML-Reference: <1265424425-31562-2-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 arch/x86/kernel/apic/nmi.c |    7 -------
 arch/x86/kernel/traps.c    |    5 +++++
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/apic/nmi.c b/arch/x86/kernel/apic/nmi.c
index 0159a69..5d47682 100644
--- a/arch/x86/kernel/apic/nmi.c
+++ b/arch/x86/kernel/apic/nmi.c
@@ -400,13 +400,6 @@ nmi_watchdog_tick(struct pt_regs *regs, unsigned reason)
 	int cpu = smp_processor_id();
 	int rc = 0;
 
-	/* check for other users first */
-	if (notify_die(DIE_NMI, "nmi", regs, reason, 2, SIGINT)
-			== NOTIFY_STOP) {
-		rc = 1;
-		touched = 1;
-	}
-
 	sum = get_timer_irqs(cpu);
 
 	if (__get_cpu_var(nmi_touch)) {
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 1168e44..51ef893 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -400,7 +400,12 @@ static notrace __kprobes void default_do_nmi(struct pt_regs *regs)
 		if (notify_die(DIE_NMI_IPI, "nmi_ipi", regs, reason, 2, SIGINT)
 								== NOTIFY_STOP)
 			return;
+
 #ifdef CONFIG_X86_LOCAL_APIC
+	        if (notify_die(DIE_NMI, "nmi", regs, reason, 2, SIGINT)
+	        			                == NOTIFY_STOP)
+	                return;
+
 		/*
 		 * Ok, so this is none of the documented NMI sources,
 		 * so it must be the NMI watchdog.

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [tip:perf/nmi] nmi_watchdog: Add new, generic implementation, using perf events
  2010-02-06  2:47 ` [PATCH 2/3 v2] nmi_watchdog: new implementation using perf events Don Zickus
@ 2010-02-08  8:51   ` tip-bot for Don Zickus
  0 siblings, 0 replies; 11+ messages in thread
From: tip-bot for Don Zickus @ 2010-02-08  8:51 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, torvalds, akpm, tglx, dzickus, mingo

Commit-ID:  1fb9d6ad2766a1dd70d167552988375049a97f21
Gitweb:     http://git.kernel.org/tip/1fb9d6ad2766a1dd70d167552988375049a97f21
Author:     Don Zickus <dzickus@redhat.com>
AuthorDate: Fri, 5 Feb 2010 21:47:04 -0500
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Mon, 8 Feb 2010 08:29:02 +0100

nmi_watchdog: Add new, generic implementation, using perf events

This is a new generic nmi_watchdog implementation using the perf
events infrastructure as suggested by Ingo.

The implementation is simple, just create an in-kernel perf
event and register an overflow handler to check for cpu lockups.

I created a generic implementation that lives in kernel/ and
the hardware specific part that for now lives in arch/x86.

This approach has a number of advantages:

 - It simplifies the x86 PMU implementation in the long run,
   in that it removes the hardcoded low-level PMU implementation
   that was the NMI watchdog before.

 - It allows new NMI watchdog features to be added in a central
   place.

 - It allows other architectures to enable the NMI watchdog,
   as long as they have perf events (that provide NMIs)
   implemented.

 - It also allows for more graceful co-existence of existing
   perf events apps and the NMI watchdog - before these changes
   the relationship was exclusive. (The NMI watchdog will 'spend'
   a perf event when enabled. In later iterations we might be
   able to piggyback from an existing NMI event without having
   to allocate a hardware event for the NMI watchdog - turning
   this into a no-hardware-cost feature.)

As for compatibility, we'll keep the old NMI watchdog code as
well until the new one can 100% replace it on all CPUs, old and
new alike.  That might take some time as the NMI watchdog has
been ported to many CPU models.

I have done light testing to make sure the framework works
correctly and it does.

 v2: Set the correct timeout values based on the old nmi
     watchdog

Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: gorcunov@gmail.com
Cc: aris@redhat.com
Cc: peterz@infradead.org
LKML-Reference: <1265424425-31562-3-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 arch/x86/kernel/apic/hw_nmi.c |  114 ++++++++++++++++++++++++
 kernel/nmi_watchdog.c         |  191 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 305 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
new file mode 100644
index 0000000..8c0e6a4
--- /dev/null
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -0,0 +1,114 @@
+/*
+ *  HW NMI watchdog support
+ *
+ *  started by Don Zickus, Copyright (C) 2010 Red Hat, Inc.
+ *
+ *  Arch specific calls to support NMI watchdog
+ *
+ *  Bits copied from original nmi.c file
+ *
+ */
+
+#include <asm/apic.h>
+#include <linux/smp.h>
+#include <linux/cpumask.h>
+#include <linux/sched.h>
+#include <linux/percpu.h>
+#include <linux/cpumask.h>
+#include <linux/kernel_stat.h>
+#include <asm/mce.h>
+
+#include <linux/nmi.h>
+#include <linux/module.h>
+
+/* For reliability, we're prepared to waste bits here. */
+static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
+
+static DEFINE_PER_CPU(unsigned, last_irq_sum);
+
+/*
+ * Take the local apic timer and PIT/HPET into account. We don't
+ * know which one is active, when we have highres/dyntick on
+ */
+static inline unsigned int get_timer_irqs(int cpu)
+{
+        return per_cpu(irq_stat, cpu).apic_timer_irqs +
+                per_cpu(irq_stat, cpu).irq0_irqs;
+}
+
+static inline int mce_in_progress(void)
+{
+#if defined(CONFIG_X86_MCE)
+        return atomic_read(&mce_entry) > 0;
+#endif
+        return 0;
+}
+
+int hw_nmi_is_cpu_stuck(struct pt_regs *regs)
+{
+	unsigned int sum;
+	int cpu = smp_processor_id();
+
+	/* FIXME: cheap hack for this check, probably should get its own
+	 * die_notifier handler
+	 */
+	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
+		static DEFINE_SPINLOCK(lock);	/* Serialise the printks */
+
+		spin_lock(&lock);
+		printk(KERN_WARNING "NMI backtrace for cpu %d\n", cpu);
+		show_regs(regs);
+		dump_stack();
+		spin_unlock(&lock);
+		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
+	}
+
+	/* if we are doing an mce, just assume the cpu is not stuck */
+        /* Could check oops_in_progress here too, but it's safer not to */
+        if (mce_in_progress())
+                return 0;
+
+	/* We determine if the cpu is stuck by checking whether any
+	 * interrupts have happened since we last checked.  Of course
+	 * an nmi storm could create false positives, but the higher
+	 * level logic should account for that
+	 */
+	sum = get_timer_irqs(cpu);
+	if (__get_cpu_var(last_irq_sum) == sum) {
+		return 1;
+	} else {
+		__get_cpu_var(last_irq_sum) = sum;
+		return 0;
+	}
+}
+
+void arch_trigger_all_cpu_backtrace(void)
+{
+	int i;
+
+	cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask);
+
+	printk(KERN_INFO "sending NMI to all CPUs:\n");
+	apic->send_IPI_all(NMI_VECTOR);
+
+	/* Wait for up to 10 seconds for all CPUs to do the backtrace */
+	for (i = 0; i < 10 * 1000; i++) {
+		if (cpumask_empty(to_cpumask(backtrace_mask)))
+			break;
+		mdelay(1);
+	}
+}
+
+/* STUB calls to mimic old nmi_watchdog behaviour */
+unsigned int nmi_watchdog = NMI_NONE;
+EXPORT_SYMBOL(nmi_watchdog);
+atomic_t nmi_active = ATOMIC_INIT(0);           /* oprofile uses this */
+EXPORT_SYMBOL(nmi_active);
+int nmi_watchdog_enabled;
+int unknown_nmi_panic;
+void cpu_nmi_set_wd_enabled(void) { return; }
+void acpi_nmi_enable(void) { return; }
+void acpi_nmi_disable(void) { return; }
+void stop_apic_nmi_watchdog(void *unused) { return; }
+void setup_apic_nmi_watchdog(void *unused) { return; }
+int __init check_nmi_watchdog(void) { return 0; }
diff --git a/kernel/nmi_watchdog.c b/kernel/nmi_watchdog.c
new file mode 100644
index 0000000..36817b2
--- /dev/null
+++ b/kernel/nmi_watchdog.c
@@ -0,0 +1,191 @@
+/*
+ * Detect Hard Lockups using the NMI
+ *
+ * started by Don Zickus, Copyright (C) 2010 Red Hat, Inc.
+ *
+ * this code detects hard lockups: incidents in where on a CPU
+ * the kernel does not respond to anything except NMI.
+ *
+ * Note: Most of this code is borrowed heavily from softlockup.c,
+ * so thanks to Ingo for the initial implementation.
+ * Some chunks also taken from arch/x86/kernel/apic/nmi.c, thanks
+ * to those contributors as well.
+ */
+
+#include <linux/mm.h>
+#include <linux/cpu.h>
+#include <linux/nmi.h>
+#include <linux/init.h>
+#include <linux/delay.h>
+#include <linux/freezer.h>
+#include <linux/lockdep.h>
+#include <linux/notifier.h>
+#include <linux/module.h>
+#include <linux/sysctl.h>
+
+#include <asm/irq_regs.h>
+#include <linux/perf_event.h>
+
+static DEFINE_PER_CPU(struct perf_event *, nmi_watchdog_ev);
+static DEFINE_PER_CPU(int, nmi_watchdog_touch);
+static DEFINE_PER_CPU(long, alert_counter);
+
+void touch_nmi_watchdog(void)
+{
+	__raw_get_cpu_var(nmi_watchdog_touch) = 1;
+	touch_softlockup_watchdog();
+}
+EXPORT_SYMBOL(touch_nmi_watchdog);
+
+void touch_all_nmi_watchdog(void)
+{
+	int cpu;
+
+	for_each_online_cpu(cpu)
+		per_cpu(nmi_watchdog_touch, cpu) = 1;
+	touch_softlockup_watchdog();
+}
+
+#ifdef CONFIG_SYSCTL
+/*
+ * proc handler for /proc/sys/kernel/nmi_watchdog
+ */
+int proc_nmi_enabled(struct ctl_table *table, int write,
+		     void __user *buffer, size_t *length, loff_t *ppos)
+{
+	int cpu;
+
+	if (per_cpu(nmi_watchdog_ev, smp_processor_id()) == NULL)
+		nmi_watchdog_enabled = 0;
+	else
+		nmi_watchdog_enabled = 1;
+
+	touch_all_nmi_watchdog();
+	proc_dointvec(table, write, buffer, length, ppos);
+	if (nmi_watchdog_enabled)
+		for_each_online_cpu(cpu)
+			perf_event_enable(per_cpu(nmi_watchdog_ev, cpu));
+	else
+		for_each_online_cpu(cpu)
+			perf_event_disable(per_cpu(nmi_watchdog_ev, cpu));
+	return 0;
+}
+
+#endif /* CONFIG_SYSCTL */
+
+struct perf_event_attr wd_attr = {
+	.type = PERF_TYPE_HARDWARE,
+	.config = PERF_COUNT_HW_CPU_CYCLES,
+	.size = sizeof(struct perf_event_attr),
+	.pinned = 1,
+	.disabled = 1,
+};
+
+static int panic_on_timeout;
+
+void wd_overflow(struct perf_event *event, int nmi,
+		 struct perf_sample_data *data,
+		 struct pt_regs *regs)
+{
+	int cpu = smp_processor_id();
+	int touched = 0;
+
+	if (__get_cpu_var(nmi_watchdog_touch)) {
+		per_cpu(nmi_watchdog_touch, cpu) = 0;
+		touched = 1;
+	}
+
+	/* check to see if the cpu is doing anything */
+	if (!touched && hw_nmi_is_cpu_stuck(regs)) {
+		/*
+		 * Ayiee, looks like this CPU is stuck ...
+		 * wait a few IRQs (5 seconds) before doing the oops ...
+		 */
+		per_cpu(alert_counter,cpu) += 1;
+		if (per_cpu(alert_counter,cpu) == 5) {
+			/*
+			 * die_nmi will return ONLY if NOTIFY_STOP happens..
+			 */
+			die_nmi("BUG: NMI Watchdog detected LOCKUP",
+				regs, panic_on_timeout);
+		}
+	} else {
+		per_cpu(alert_counter,cpu) = 0;
+	}
+
+	return;
+}
+
+/*
+ * Create/destroy watchdog threads as CPUs come and go:
+ */
+static int __cpuinit
+cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
+{
+	int hotcpu = (unsigned long)hcpu;
+	struct perf_event *event;
+
+	switch (action) {
+	case CPU_UP_PREPARE:
+	case CPU_UP_PREPARE_FROZEN:
+		per_cpu(nmi_watchdog_touch, hotcpu) = 0;
+		break;
+	case CPU_ONLINE:
+	case CPU_ONLINE_FROZEN:
+		/* originally wanted the below chunk to be in CPU_UP_PREPARE, but caps is unpriv for non-CPU0 */
+		wd_attr.sample_period = cpu_khz * 1000;
+		event = perf_event_create_kernel_counter(&wd_attr, hotcpu, -1, wd_overflow);
+		if (IS_ERR(event)) {
+			printk(KERN_ERR "nmi watchdog failed to create perf event on %i: %p\n", hotcpu, event);
+			return NOTIFY_BAD;
+		}
+		per_cpu(nmi_watchdog_ev, hotcpu) = event;
+		perf_event_enable(per_cpu(nmi_watchdog_ev, hotcpu));
+		break;
+#ifdef CONFIG_HOTPLUG_CPU
+	case CPU_UP_CANCELED:
+	case CPU_UP_CANCELED_FROZEN:
+		perf_event_disable(per_cpu(nmi_watchdog_ev, hotcpu));
+	case CPU_DEAD:
+	case CPU_DEAD_FROZEN:
+		event = per_cpu(nmi_watchdog_ev, hotcpu);
+		per_cpu(nmi_watchdog_ev, hotcpu) = NULL;
+		perf_event_release_kernel(event);
+		break;
+#endif /* CONFIG_HOTPLUG_CPU */
+	}
+	return NOTIFY_OK;
+}
+
+static struct notifier_block __cpuinitdata cpu_nfb = {
+	.notifier_call = cpu_callback
+};
+
+static int __initdata nonmi_watchdog;
+
+static int __init nonmi_watchdog_setup(char *str)
+{
+	nonmi_watchdog = 1;
+	return 1;
+}
+__setup("nonmi_watchdog", nonmi_watchdog_setup);
+
+static int __init spawn_nmi_watchdog_task(void)
+{
+	void *cpu = (void *)(long)smp_processor_id();
+	int err;
+
+	if (nonmi_watchdog)
+		return 0;
+
+	err = cpu_callback(&cpu_nfb, CPU_UP_PREPARE, cpu);
+	if (err == NOTIFY_BAD) {
+		BUG();
+		return 1;
+	}
+	cpu_callback(&cpu_nfb, CPU_ONLINE, cpu);
+	register_cpu_notifier(&cpu_nfb);
+
+	return 0;
+}
+early_initcall(spawn_nmi_watchdog_task);

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [tip:perf/nmi] nmi_watchdog: Config option to enable new nmi_watchdog
  2010-02-06  2:47 ` [PATCH 3/3 v2] nmi_watchdog: config option to enable new nmi_watchdog Don Zickus
  2010-02-08  7:19   ` Ingo Molnar
@ 2010-02-08  8:52   ` tip-bot for Don Zickus
  2010-02-09 10:51   ` [tip:perf/nmi] nmi_watchdog: Only enable on x86 for now tip-bot for Ingo Molnar
  2 siblings, 0 replies; 11+ messages in thread
From: tip-bot for Don Zickus @ 2010-02-08  8:52 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, torvalds, akpm, tglx, dzickus, mingo

Commit-ID:  84e478c6f1eb9c4bfa1fff2f8108e9a061b46428
Gitweb:     http://git.kernel.org/tip/84e478c6f1eb9c4bfa1fff2f8108e9a061b46428
Author:     Don Zickus <dzickus@redhat.com>
AuthorDate: Fri, 5 Feb 2010 21:47:05 -0500
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Mon, 8 Feb 2010 08:29:03 +0100

nmi_watchdog: Config option to enable new nmi_watchdog

These are the bits that enable the new nmi_watchdog and safely
isolate the old nmi_watchdog.  Only one or the other can run,
not both at the same time.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: gorcunov@gmail.com
Cc: aris@redhat.com
Cc: peterz@infradead.org
LKML-Reference: <1265424425-31562-4-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 arch/x86/kernel/apic/Makefile |    7 ++++++-
 arch/x86/kernel/traps.c       |    2 ++
 include/linux/nmi.h           |    4 ++++
 kernel/Makefile               |    1 +
 lib/Kconfig.debug             |   13 +++++++++++++
 5 files changed, 26 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/apic/Makefile b/arch/x86/kernel/apic/Makefile
index 565c1bf..1a4512e 100644
--- a/arch/x86/kernel/apic/Makefile
+++ b/arch/x86/kernel/apic/Makefile
@@ -2,7 +2,12 @@
 # Makefile for local APIC drivers and for the IO-APIC code
 #
 
-obj-$(CONFIG_X86_LOCAL_APIC)	+= apic.o apic_noop.o probe_$(BITS).o ipi.o nmi.o
+obj-$(CONFIG_X86_LOCAL_APIC)	+= apic.o apic_noop.o probe_$(BITS).o ipi.o
+ifneq ($(CONFIG_NMI_WATCHDOG),y)
+obj-$(CONFIG_X86_LOCAL_APIC)	+= nmi.o
+endif
+obj-$(CONFIG_NMI_WATCHDOG)	+= hw_nmi.o
+
 obj-$(CONFIG_X86_IO_APIC)	+= io_apic.o
 obj-$(CONFIG_SMP)		+= ipi.o
 
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 51ef893..973cbc4 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -406,6 +406,7 @@ static notrace __kprobes void default_do_nmi(struct pt_regs *regs)
 	        			                == NOTIFY_STOP)
 	                return;
 
+#ifndef CONFIG_NMI_WATCHDOG
 		/*
 		 * Ok, so this is none of the documented NMI sources,
 		 * so it must be the NMI watchdog.
@@ -413,6 +414,7 @@ static notrace __kprobes void default_do_nmi(struct pt_regs *regs)
 		if (nmi_watchdog_tick(regs, reason))
 			return;
 		if (!do_nmi_callback(regs, cpu))
+#endif /* !CONFIG_NMI_WATCHDOG */
 			unknown_nmi_error(reason, regs);
 #else
 		unknown_nmi_error(reason, regs);
diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index b752e80..a42ff0b 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -47,4 +47,8 @@ static inline bool trigger_all_cpu_backtrace(void)
 }
 #endif
 
+#ifdef CONFIG_NMI_WATCHDOG
+int hw_nmi_is_cpu_stuck(struct pt_regs *);
+#endif
+
 #endif
diff --git a/kernel/Makefile b/kernel/Makefile
index 864ff75..8a5abe5 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -76,6 +76,7 @@ obj-$(CONFIG_AUDIT_TREE) += audit_tree.o
 obj-$(CONFIG_KPROBES) += kprobes.o
 obj-$(CONFIG_KGDB) += kgdb.o
 obj-$(CONFIG_DETECT_SOFTLOCKUP) += softlockup.o
+obj-$(CONFIG_NMI_WATCHDOG) += nmi_watchdog.o
 obj-$(CONFIG_DETECT_HUNG_TASK) += hung_task.o
 obj-$(CONFIG_GENERIC_HARDIRQS) += irq/
 obj-$(CONFIG_SECCOMP) += seccomp.o
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 25c3ed5..f80b67e 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -170,6 +170,19 @@ config DETECT_SOFTLOCKUP
 	   can be detected via the NMI-watchdog, on platforms that
 	   support it.)
 
+config NMI_WATCHDOG
+	bool "Detect Hard Lockups with an NMI Watchdog"
+	depends on DEBUG_KERNEL && PERF_EVENTS
+	default y
+	help
+	  Say Y here to enable the kernel to use the NMI as a watchdog
+	  to detect hard lockups.  This is useful when a cpu hangs for no
+	  reason but can still respond to NMIs.  A backtrace is displayed
+	  for reviewing and reporting.
+
+	  The overhead should be minimal, just an extra NMI every few
+	  seconds.
+
 config BOOTPARAM_SOFTLOCKUP_PANIC
 	bool "Panic (Reboot) On Soft Lockups"
 	depends on DETECT_SOFTLOCKUP

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3 v2] nmi_watchdog: config option to enable new  nmi_watchdog
  2010-02-08  7:19   ` Ingo Molnar
@ 2010-02-08  9:39     ` Cyrill Gorcunov
  2010-02-08 14:58     ` Don Zickus
  1 sibling, 0 replies; 11+ messages in thread
From: Cyrill Gorcunov @ 2010-02-08  9:39 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Don Zickus, peterz, aris, linux-kernel, Andrew Morton

On 2/8/10, Ingo Molnar <mingo@elte.hu> wrote:
>
> * Don Zickus <dzickus@redhat.com> wrote:
>
>> +config NMI_WATCHDOG
>> +	bool "Detect Hard Lockups with an NMI Watchdog"
>> +	depends on DEBUG_KERNEL && PERF_EVENTS
>> +	default y
>> +	help
>> +	  Say Y here to enable the kernel to use the NMI as a watchdog
>> +	  to detect hard lockups.  This is useful when a cpu hangs for no
>> +	  reason but can still respond to NMIs.  A backtrace is displayed
>> +	  for reviewing and reporting.
>> +
>> +	  The overhead should be minimal, just an extra NMI every few
>> +	  seconds.
>
> Thought for later patches: I think an architecture should be able to express
> via a Kconfig switch that it actually _has_ NMI events. There's
> architectures
> which dont have a PMU driver and only have software events. There's also
> architectures that have a PMU driver but no NMIs.
>
> Something like ARCH_HAS_NMI_PERF_EVENTS?
>
> Also, i havent checked, but what is the practical effect of the new generic
> watchdog on x86 CPUs that does not have a native PMU driver yet - such as
> P4s?
>

p4 pmu is not yet implemented. I'll try to post on lkml the thnigs
i've done for it today evening, though it's pretty ugly i would say.

> Anyway, i'll create a tip:perf/nmi topic branch for these patches, it
> certainly looks like a useful generalization and a new architecture that has
> perf could easily enable it, without having to write its own NMI watchdog
> implementation. It's also useful for any new watchdog features that people
> might want to add. Plus it makes the x86 PMU code cleaner in the long run as
> well.
>
> Thanks,
>
> 	Ingo
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3 v2] nmi_watchdog: config option to enable new nmi_watchdog
  2010-02-08  7:19   ` Ingo Molnar
  2010-02-08  9:39     ` Cyrill Gorcunov
@ 2010-02-08 14:58     ` Don Zickus
  1 sibling, 0 replies; 11+ messages in thread
From: Don Zickus @ 2010-02-08 14:58 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: peterz, gorcunov, aris, linux-kernel, Andrew Morton

On Mon, Feb 08, 2010 at 08:19:54AM +0100, Ingo Molnar wrote:
> 
> * Don Zickus <dzickus@redhat.com> wrote:
> 
> > +config NMI_WATCHDOG
> > +	bool "Detect Hard Lockups with an NMI Watchdog"
> > +	depends on DEBUG_KERNEL && PERF_EVENTS
> > +	default y
> > +	help
> > +	  Say Y here to enable the kernel to use the NMI as a watchdog
> > +	  to detect hard lockups.  This is useful when a cpu hangs for no
> > +	  reason but can still respond to NMIs.  A backtrace is displayed
> > +	  for reviewing and reporting.
> > +
> > +	  The overhead should be minimal, just an extra NMI every few 
> > +	  seconds.
> 
> Thought for later patches: I think an architecture should be able to express 
> via a Kconfig switch that it actually _has_ NMI events. There's architectures 
> which dont have a PMU driver and only have software events. There's also 
> architectures that have a PMU driver but no NMIs.
> 
> Something like ARCH_HAS_NMI_PERF_EVENTS?

I guess I assumed the perf event subsystem would take care of that which
is why I made the config option dependent on PERF_EVENTS.  I am open to
suggestions on enhance it.

> 
> Also, i havent checked, but what is the practical effect of the new generic 
> watchdog on x86 CPUs that does not have a native PMU driver yet - such as 
> P4s?

I believe the call to perf_event_create_kernel_counter would fail, which
then prevents the cpu from coming online.  Probably not the smartest thing
to do.  I was looking at adding code to fall back to trying PERF_TYPE_SOFTWARE.
Let me dig up a P4 box and see what happens.

> 
> Anyway, i'll create a tip:perf/nmi topic branch for these patches, it 
> certainly looks like a useful generalization and a new architecture that has 
> perf could easily enable it, without having to write its own NMI watchdog 
> implementation. It's also useful for any new watchdog features that people 
> might want to add. Plus it makes the x86 PMU code cleaner in the long run as 
> well.

Agreed.

Cheers,
Don

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [tip:perf/nmi] nmi_watchdog: Only enable on x86 for now
  2010-02-06  2:47 ` [PATCH 3/3 v2] nmi_watchdog: config option to enable new nmi_watchdog Don Zickus
  2010-02-08  7:19   ` Ingo Molnar
  2010-02-08  8:52   ` [tip:perf/nmi] nmi_watchdog: Config " tip-bot for Don Zickus
@ 2010-02-09 10:51   ` tip-bot for Ingo Molnar
  2 siblings, 0 replies; 11+ messages in thread
From: tip-bot for Ingo Molnar @ 2010-02-09 10:51 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, tglx, dzickus, mingo

Commit-ID:  8e7672cdb413af859086ffceaed68f7e1e8ea4c2
Gitweb:     http://git.kernel.org/tip/8e7672cdb413af859086ffceaed68f7e1e8ea4c2
Author:     Ingo Molnar <mingo@elte.hu>
AuthorDate: Tue, 9 Feb 2010 06:11:00 +0100
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Tue, 9 Feb 2010 06:11:00 +0100

nmi_watchdog: Only enable on x86 for now

It wont even build on other platforms just yet - so restrict it
to x86 for now.

Cc: Don Zickus <dzickus@redhat.com>
Cc: gorcunov@gmail.com
Cc: aris@redhat.com
Cc: peterz@infradead.org
LKML-Reference: <1265424425-31562-4-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 lib/Kconfig.debug |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index f80b67e..acef882 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -173,6 +173,7 @@ config DETECT_SOFTLOCKUP
 config NMI_WATCHDOG
 	bool "Detect Hard Lockups with an NMI Watchdog"
 	depends on DEBUG_KERNEL && PERF_EVENTS
+	depends on X86
 	default y
 	help
 	  Say Y here to enable the kernel to use the NMI as a watchdog

^ permalink raw reply related	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2010-02-09 10:52 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-02-06  2:47 [PATCH 0/3 v2] new nmi_watchdog using perf events Don Zickus
2010-02-06  2:47 ` [PATCH 1/3 v2] [x86] move notify_die from nmi.c to traps.c Don Zickus
2010-02-08  8:51   ` [tip:perf/nmi] x86: Move " tip-bot for Don Zickus
2010-02-06  2:47 ` [PATCH 2/3 v2] nmi_watchdog: new implementation using perf events Don Zickus
2010-02-08  8:51   ` [tip:perf/nmi] nmi_watchdog: Add new, generic implementation, " tip-bot for Don Zickus
2010-02-06  2:47 ` [PATCH 3/3 v2] nmi_watchdog: config option to enable new nmi_watchdog Don Zickus
2010-02-08  7:19   ` Ingo Molnar
2010-02-08  9:39     ` Cyrill Gorcunov
2010-02-08 14:58     ` Don Zickus
2010-02-08  8:52   ` [tip:perf/nmi] nmi_watchdog: Config " tip-bot for Don Zickus
2010-02-09 10:51   ` [tip:perf/nmi] nmi_watchdog: Only enable on x86 for now tip-bot for Ingo Molnar

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.