linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4][V2] Improve watchdog config for arch watchdogs
@ 2017-05-25  8:28 Nicholas Piggin
  2017-05-25  8:28 ` [PATCH 1/4] watchdog: remove unused declaration Nicholas Piggin
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Nicholas Piggin @ 2017-05-25  8:28 UTC (permalink / raw)
  To: Don Zickus; +Cc: Nicholas Piggin, linux-kernel, linux-arch

Hi Don,

The kbuild 0day caught a compile bug on some archs, I think I've
got that fixed.

I'm resending without the powerpc watchdog example. However if you
decide to accept these changes for the next merge window, I may send
the powerpc patch through your tree with appropriate acks, because
it's not very intrusive.

So the basic idea of these 4 patches is to allow arch specific
(non-perf) hardlockup watchdogs to properly hook into the existing
watchdog interfaces (cmdline, sysctl, kernel apis).

sparc for example implements its own detector, but it can't use all
the same sysctls, some cmdline options are a bit different, etc.
powerpc wants to move to its own detector like sparc, but without
causing interface functionality to change.

Thanks,
Nick

Nicholas Piggin (4):
  watchdog: remove unused declaration
  watchdog: introduce arch_touch_nmi_watchdog()
  watchdog: split up config options
  watchdog: provide watchdog_reconfigure() for arch watchdogs

 arch/blackfin/include/asm/nmi.h            |   2 +
 arch/blackfin/kernel/nmi.c                 |   2 +-
 arch/mn10300/include/asm/nmi.h             |   2 +
 arch/mn10300/kernel/mn10300-watchdog-low.S |   8 +-
 arch/mn10300/kernel/mn10300-watchdog.c     |   2 +-
 arch/powerpc/kernel/setup_64.c             |   2 +-
 arch/sparc/include/asm/nmi.h               |   1 +
 arch/sparc/kernel/nmi.c                    |   6 +-
 arch/x86/kernel/apic/hw_nmi.c              |   2 +-
 include/linux/nmi.h                        |  58 ++++---
 kernel/Makefile                            |   2 +-
 kernel/sysctl.c                            |  18 +-
 kernel/watchdog.c                          | 262 +++++++++++++++++++----------
 kernel/watchdog_hld.c                      |  37 +---
 lib/Kconfig.debug                          |  27 ++-
 15 files changed, 261 insertions(+), 170 deletions(-)

-- 
2.11.0

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/4] watchdog: remove unused declaration
  2017-05-25  8:28 [PATCH 0/4][V2] Improve watchdog config for arch watchdogs Nicholas Piggin
@ 2017-05-25  8:28 ` Nicholas Piggin
  2017-05-25  8:28 ` [PATCH 2/4] watchdog: introduce arch_touch_nmi_watchdog() Nicholas Piggin
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: Nicholas Piggin @ 2017-05-25  8:28 UTC (permalink / raw)
  To: Don Zickus; +Cc: Nicholas Piggin, linux-kernel, linux-arch

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 include/linux/nmi.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index aa3cd0878270..5e2e57536d98 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -12,9 +12,6 @@ extern void touch_softlockup_watchdog_sched(void);
 extern void touch_softlockup_watchdog(void);
 extern void touch_softlockup_watchdog_sync(void);
 extern void touch_all_softlockup_watchdogs(void);
-extern int proc_dowatchdog_thresh(struct ctl_table *table, int write,
-				  void __user *buffer,
-				  size_t *lenp, loff_t *ppos);
 extern unsigned int  softlockup_panic;
 extern unsigned int  hardlockup_panic;
 void lockup_detector_init(void);
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/4] watchdog: introduce arch_touch_nmi_watchdog()
  2017-05-25  8:28 [PATCH 0/4][V2] Improve watchdog config for arch watchdogs Nicholas Piggin
  2017-05-25  8:28 ` [PATCH 1/4] watchdog: remove unused declaration Nicholas Piggin
@ 2017-05-25  8:28 ` Nicholas Piggin
  2017-05-25 13:55   ` Don Zickus
  2017-05-25  8:28 ` [PATCH 3/4] watchdog: split out config options Nicholas Piggin
  2017-05-25  8:28 ` [PATCH 4/4] watchdog: provide watchdog_reconfigure() for arch watchdogs Nicholas Piggin
  3 siblings, 1 reply; 14+ messages in thread
From: Nicholas Piggin @ 2017-05-25  8:28 UTC (permalink / raw)
  To: Don Zickus; +Cc: Nicholas Piggin, linux-kernel, linux-arch

For architectures that define HAVE_NMI_WATCHDOG, instead of having
them provide the complete touch_nmi_watchdog() function, just have
them provide arch_touch_nmi_watchdog().

This gives the generic code more flexibility in implementing this
function, and arch implementations don't miss out on touching the
softlockup watchdog or other generic details.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/blackfin/include/asm/nmi.h            |  2 ++
 arch/blackfin/kernel/nmi.c                 |  2 +-
 arch/mn10300/include/asm/nmi.h             |  2 ++
 arch/mn10300/kernel/mn10300-watchdog-low.S |  8 ++++----
 arch/mn10300/kernel/mn10300-watchdog.c     |  2 +-
 arch/sparc/include/asm/nmi.h               |  1 +
 arch/sparc/kernel/nmi.c                    |  6 ++----
 include/linux/nmi.h                        | 26 +++++++++++++++-----------
 kernel/watchdog_hld.c                      |  5 ++---
 9 files changed, 30 insertions(+), 24 deletions(-)

diff --git a/arch/blackfin/include/asm/nmi.h b/arch/blackfin/include/asm/nmi.h
index b9caac4fcfd8..6e7047ceec79 100644
--- a/arch/blackfin/include/asm/nmi.h
+++ b/arch/blackfin/include/asm/nmi.h
@@ -7,6 +7,8 @@
 #ifndef _BFIN_NMI_H_
 #define _BFIN_NMI_H_
 
+void arch_touch_nmi_watchdog(void);
+
 #include <linux/nmi.h>
 
 #endif
diff --git a/arch/blackfin/kernel/nmi.c b/arch/blackfin/kernel/nmi.c
index 633c37083e87..1e714329fe8a 100644
--- a/arch/blackfin/kernel/nmi.c
+++ b/arch/blackfin/kernel/nmi.c
@@ -190,7 +190,7 @@ static int __init init_nmi_wdt(void)
 }
 device_initcall(init_nmi_wdt);
 
-void touch_nmi_watchdog(void)
+void arch_touch_nmi_watchdog(void)
 {
 	atomic_set(&nmi_touched[smp_processor_id()], 1);
 }
diff --git a/arch/mn10300/include/asm/nmi.h b/arch/mn10300/include/asm/nmi.h
index f3671cbbc117..aee8aa22d9ee 100644
--- a/arch/mn10300/include/asm/nmi.h
+++ b/arch/mn10300/include/asm/nmi.h
@@ -11,4 +11,6 @@
 #ifndef _ASM_NMI_H
 #define _ASM_NMI_H
 
+void arch_touch_nmi_watchdog(void);
+
 #endif /* _ASM_NMI_H */
diff --git a/arch/mn10300/kernel/mn10300-watchdog-low.S b/arch/mn10300/kernel/mn10300-watchdog-low.S
index f2f5c9cfaabd..34f8773de7d0 100644
--- a/arch/mn10300/kernel/mn10300-watchdog-low.S
+++ b/arch/mn10300/kernel/mn10300-watchdog-low.S
@@ -50,9 +50,9 @@ watchdog_handler:
 #   we can't inline it)
 #
 ###############################################################################
-	.globl	touch_nmi_watchdog
-	.type	touch_nmi_watchdog,@function
-touch_nmi_watchdog:
+	.globl	arch_touch_nmi_watchdog
+	.type	arch_touch_nmi_watchdog,@function
+arch_touch_nmi_watchdog:
 	clr	d0
 	clr	d1
 	mov	watchdog_alert_counter, a0
@@ -63,4 +63,4 @@ touch_nmi_watchdog:
 	lne
 	ret	[],0
 
-	.size	touch_nmi_watchdog,.-touch_nmi_watchdog
+	.size	arch_touch_nmi_watchdog,.-arch_touch_nmi_watchdog
diff --git a/arch/mn10300/kernel/mn10300-watchdog.c b/arch/mn10300/kernel/mn10300-watchdog.c
index a2d8e6938d67..0d5641beadf5 100644
--- a/arch/mn10300/kernel/mn10300-watchdog.c
+++ b/arch/mn10300/kernel/mn10300-watchdog.c
@@ -31,7 +31,7 @@ static unsigned int watchdog;
 static unsigned int watchdog_hz = 1;
 unsigned int watchdog_alert_counter[NR_CPUS];
 
-EXPORT_SYMBOL(touch_nmi_watchdog);
+EXPORT_SYMBOL(arch_touch_nmi_watchdog);
 
 /*
  * the best way to detect whether a CPU has a 'hard lockup' problem
diff --git a/arch/sparc/include/asm/nmi.h b/arch/sparc/include/asm/nmi.h
index 26ad2b2607c6..284eac3ffaf2 100644
--- a/arch/sparc/include/asm/nmi.h
+++ b/arch/sparc/include/asm/nmi.h
@@ -7,6 +7,7 @@ void nmi_adjust_hz(unsigned int new_hz);
 
 extern atomic_t nmi_active;
 
+void arch_touch_nmi_watchdog(void);
 void start_nmi_watchdog(void *unused);
 void stop_nmi_watchdog(void *unused);
 
diff --git a/arch/sparc/kernel/nmi.c b/arch/sparc/kernel/nmi.c
index 95e73c63c99d..048ad783ea3f 100644
--- a/arch/sparc/kernel/nmi.c
+++ b/arch/sparc/kernel/nmi.c
@@ -51,7 +51,7 @@ static DEFINE_PER_CPU(unsigned int, last_irq_sum);
 static DEFINE_PER_CPU(long, alert_counter);
 static DEFINE_PER_CPU(int, nmi_touch);
 
-void touch_nmi_watchdog(void)
+void arch_touch_nmi_watchdog(void)
 {
 	if (atomic_read(&nmi_active)) {
 		int cpu;
@@ -61,10 +61,8 @@ void touch_nmi_watchdog(void)
 				per_cpu(nmi_touch, cpu) = 1;
 		}
 	}
-
-	touch_softlockup_watchdog();
 }
-EXPORT_SYMBOL(touch_nmi_watchdog);
+EXPORT_SYMBOL(arch_touch_nmi_watchdog);
 
 static void die_nmi(const char *str, struct pt_regs *regs, int do_panic)
 {
diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index 5e2e57536d98..6ea465a842a1 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -6,6 +6,9 @@
 
 #include <linux/sched.h>
 #include <asm/irq.h>
+#if defined(CONFIG_HAVE_NMI_WATCHDOG)
+#include <asm/nmi.h>
+#endif
 
 #ifdef CONFIG_LOCKUP_DETECTOR
 extern void touch_softlockup_watchdog_sched(void);
@@ -58,6 +61,14 @@ static inline void reset_hung_task_detector(void)
 #define NMI_WATCHDOG_ENABLED      (1 << NMI_WATCHDOG_ENABLED_BIT)
 #define SOFT_WATCHDOG_ENABLED     (1 << SOFT_WATCHDOG_ENABLED_BIT)
 
+#if defined(CONFIG_HARDLOCKUP_DETECTOR)
+extern void hld_touch_nmi_watchdog(void);
+extern void hardlockup_detector_disable(void);
+#else
+static inline void hld_touch_nmi_watchdog(void) {}
+static inline void hardlockup_detector_disable(void) {}
+#endif
+
 /**
  * touch_nmi_watchdog - restart NMI watchdog timeout.
  * 
@@ -65,21 +76,14 @@ static inline void reset_hung_task_detector(void)
  * may be used to reset the timeout - for code which intentionally
  * disables interrupts for a long time. This call is stateless.
  */
-#if defined(CONFIG_HAVE_NMI_WATCHDOG) || defined(CONFIG_HARDLOCKUP_DETECTOR)
-#include <asm/nmi.h>
-extern void touch_nmi_watchdog(void);
-#else
 static inline void touch_nmi_watchdog(void)
 {
+#if defined(CONFIG_HAVE_NMI_WATCHDOG)
+	arch_touch_nmi_watchdog();
+#endif
+	hld_touch_nmi_watchdog();
 	touch_softlockup_watchdog();
 }
-#endif
-
-#if defined(CONFIG_HARDLOCKUP_DETECTOR)
-extern void hardlockup_detector_disable(void);
-#else
-static inline void hardlockup_detector_disable(void) {}
-#endif
 
 /*
  * Create trigger_all_cpu_backtrace() out of the arch-provided
diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index 54a427d1f344..e0d7a7c43fb5 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -56,7 +56,7 @@ static int __init hardlockup_panic_setup(char *str)
 }
 __setup("nmi_watchdog=", hardlockup_panic_setup);
 
-void touch_nmi_watchdog(void)
+void hld_touch_nmi_watchdog(void)
 {
 	/*
 	 * Using __raw here because some code paths have
@@ -66,9 +66,8 @@ void touch_nmi_watchdog(void)
 	 * going off.
 	 */
 	raw_cpu_write(watchdog_nmi_touch, true);
-	touch_softlockup_watchdog();
 }
-EXPORT_SYMBOL(touch_nmi_watchdog);
+EXPORT_SYMBOL(hld_touch_nmi_watchdog);
 
 static struct perf_event_attr wd_hw_attr = {
 	.type		= PERF_TYPE_HARDWARE,
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/4] watchdog: split out config options
  2017-05-25  8:28 [PATCH 0/4][V2] Improve watchdog config for arch watchdogs Nicholas Piggin
  2017-05-25  8:28 ` [PATCH 1/4] watchdog: remove unused declaration Nicholas Piggin
  2017-05-25  8:28 ` [PATCH 2/4] watchdog: introduce arch_touch_nmi_watchdog() Nicholas Piggin
@ 2017-05-25  8:28 ` Nicholas Piggin
  2017-05-25 11:30   ` kbuild test robot
  2017-05-25 12:09   ` kbuild test robot
  2017-05-25  8:28 ` [PATCH 4/4] watchdog: provide watchdog_reconfigure() for arch watchdogs Nicholas Piggin
  3 siblings, 2 replies; 14+ messages in thread
From: Nicholas Piggin @ 2017-05-25  8:28 UTC (permalink / raw)
  To: Don Zickus; +Cc: Nicholas Piggin, linux-kernel, linux-arch

Split SOFTLOCKUP_DETECTOR from LOCKUP_DETECTOR, and split
HARDLOCKUP_DETECTOR_PERF from HARDLOCKUP_DETECTOR.

LOCKUP_DETECTOR provides the boot, sysctl, and programming interfaces
for lockup detectors. An architecture that defines HAVE_NMI_WATCHDOG
need not use this this if it has a very basic watchdog or uses its own
options and interfaces (e.g., sparc). touch_nmi_watchdog() will
continue to call their arch_touch_nmi_watchdog().

HARDLOCKUP_DETECTOR_PERF is the perf-based lockup detector.

HARDLOCKUP_DETECTOR is the framework for arch NMI_WATCHDOG hard lockup
detectors that conform to the LOCKUP_DETECTOR interfaces.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/setup_64.c |   2 +-
 arch/x86/kernel/apic/hw_nmi.c  |   2 +-
 include/linux/nmi.h            |  35 ++++--
 kernel/Makefile                |   2 +-
 kernel/sysctl.c                |  18 ++--
 kernel/watchdog.c              | 238 ++++++++++++++++++++++++++---------------
 kernel/watchdog_hld.c          |  32 ------
 lib/Kconfig.debug              |  27 +++--
 8 files changed, 212 insertions(+), 144 deletions(-)

diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index f35ff9dea4fb..ab650905f75a 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -727,7 +727,7 @@ struct ppc_pci_io ppc_pci_io;
 EXPORT_SYMBOL(ppc_pci_io);
 #endif
 
-#ifdef CONFIG_HARDLOCKUP_DETECTOR
+#ifdef CONFIG_HARDLOCKUP_DETECTOR_PERF
 u64 hw_nmi_get_sample_period(int watchdog_thresh)
 {
 	return ppc_proc_freq * watchdog_thresh;
diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
index c73c9fb281e1..d6f387780849 100644
--- a/arch/x86/kernel/apic/hw_nmi.c
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -19,7 +19,7 @@
 #include <linux/init.h>
 #include <linux/delay.h>
 
-#ifdef CONFIG_HARDLOCKUP_DETECTOR
+#ifdef CONFIG_HARDLOCKUP_DETECTOR_PERF
 u64 hw_nmi_get_sample_period(int watchdog_thresh)
 {
 	return (u64)(cpu_khz) * 1000 * watchdog_thresh;
diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index 6ea465a842a1..c11dc695fb0b 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -11,13 +11,21 @@
 #endif
 
 #ifdef CONFIG_LOCKUP_DETECTOR
+void lockup_detector_init(void);
+#else
+static inline void lockup_detector_init(void)
+{
+}
+#endif
+
+#ifdef CONFIG_SOFTLOCKUP_DETECTOR
 extern void touch_softlockup_watchdog_sched(void);
 extern void touch_softlockup_watchdog(void);
 extern void touch_softlockup_watchdog_sync(void);
 extern void touch_all_softlockup_watchdogs(void);
 extern unsigned int  softlockup_panic;
-extern unsigned int  hardlockup_panic;
-void lockup_detector_init(void);
+extern int soft_watchdog_enabled;
+extern atomic_t watchdog_park_in_progress;
 #else
 static inline void touch_softlockup_watchdog_sched(void)
 {
@@ -31,9 +39,6 @@ static inline void touch_softlockup_watchdog_sync(void)
 static inline void touch_all_softlockup_watchdogs(void)
 {
 }
-static inline void lockup_detector_init(void)
-{
-}
 #endif
 
 #ifdef CONFIG_DETECT_HUNG_TASK
@@ -61,14 +66,19 @@ static inline void reset_hung_task_detector(void)
 #define NMI_WATCHDOG_ENABLED      (1 << NMI_WATCHDOG_ENABLED_BIT)
 #define SOFT_WATCHDOG_ENABLED     (1 << SOFT_WATCHDOG_ENABLED_BIT)
 
-#if defined(CONFIG_HARDLOCKUP_DETECTOR)
-extern void hld_touch_nmi_watchdog(void);
+#ifdef CONFIG_HARDLOCKUP_DETECTOR
 extern void hardlockup_detector_disable(void);
+extern unsigned int hardlockup_panic;
 #else
-static inline void hld_touch_nmi_watchdog(void) {}
 static inline void hardlockup_detector_disable(void) {}
 #endif
 
+#ifdef CONFIG_HARDLOCKUP_DETECTOR_PERF
+extern void hld_touch_nmi_watchdog(void);
+#else
+static inline void hld_touch_nmi_watchdog(void) {}
+#endif
+
 /**
  * touch_nmi_watchdog - restart NMI watchdog timeout.
  * 
@@ -140,15 +150,18 @@ static inline bool trigger_single_cpu_backtrace(int cpu)
 }
 #endif
 
-#ifdef CONFIG_LOCKUP_DETECTOR
+#ifdef CONFIG_HARDLOCKUP_DETECTOR_PERF
 u64 hw_nmi_get_sample_period(int watchdog_thresh);
+#endif
+
+#ifdef CONFIG_LOCKUP_DETECTOR
 extern int nmi_watchdog_enabled;
-extern int soft_watchdog_enabled;
 extern int watchdog_user_enabled;
 extern int watchdog_thresh;
 extern unsigned long watchdog_enabled;
+extern struct cpumask watchdog_cpumask;
 extern unsigned long *watchdog_cpumask_bits;
-extern atomic_t watchdog_park_in_progress;
+extern int __read_mostly watchdog_suspended;
 #ifdef CONFIG_SMP
 extern int sysctl_softlockup_all_cpu_backtrace;
 extern int sysctl_hardlockup_all_cpu_backtrace;
diff --git a/kernel/Makefile b/kernel/Makefile
index 72aa080f91f0..4cb8e8b23c6e 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -82,7 +82,7 @@ obj-$(CONFIG_KPROBES) += kprobes.o
 obj-$(CONFIG_KGDB) += debug/
 obj-$(CONFIG_DETECT_HUNG_TASK) += hung_task.o
 obj-$(CONFIG_LOCKUP_DETECTOR) += watchdog.o
-obj-$(CONFIG_HARDLOCKUP_DETECTOR) += watchdog_hld.o
+obj-$(CONFIG_HARDLOCKUP_DETECTOR_PERF) += watchdog_hld.o
 obj-$(CONFIG_SECCOMP) += seccomp.o
 obj-$(CONFIG_RELAY) += relay.o
 obj-$(CONFIG_SYSCTL) += utsname_sysctl.o
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 4dfba1a76cc3..fa37faa89143 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -873,13 +873,21 @@ static struct ctl_table kern_table[] = {
 		.mode           = 0644,
 		.proc_handler   = proc_nmi_watchdog,
 		.extra1		= &zero,
-#if defined(CONFIG_HAVE_NMI_WATCHDOG) || defined(CONFIG_HARDLOCKUP_DETECTOR)
+#ifdef CONFIG_HARDLOCKUP_DETECTOR
 		.extra2		= &one,
 #else
 		.extra2		= &zero,
 #endif
 	},
 	{
+		.procname	= "watchdog_cpumask",
+		.data		= &watchdog_cpumask_bits,
+		.maxlen		= NR_CPUS,
+		.mode		= 0644,
+		.proc_handler	= proc_watchdog_cpumask,
+	},
+#ifdef CONFIG_SOFTLOCKUP_DETECTOR
+	{
 		.procname       = "soft_watchdog",
 		.data           = &soft_watchdog_enabled,
 		.maxlen         = sizeof (int),
@@ -889,13 +897,6 @@ static struct ctl_table kern_table[] = {
 		.extra2		= &one,
 	},
 	{
-		.procname	= "watchdog_cpumask",
-		.data		= &watchdog_cpumask_bits,
-		.maxlen		= NR_CPUS,
-		.mode		= 0644,
-		.proc_handler	= proc_watchdog_cpumask,
-	},
-	{
 		.procname	= "softlockup_panic",
 		.data		= &softlockup_panic,
 		.maxlen		= sizeof(int),
@@ -904,6 +905,7 @@ static struct ctl_table kern_table[] = {
 		.extra1		= &zero,
 		.extra2		= &one,
 	},
+#endif
 #ifdef CONFIG_HARDLOCKUP_DETECTOR
 	{
 		.procname	= "hardlockup_panic",
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 03e0b69bb5bf..deb010505646 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -29,15 +29,55 @@
 #include <linux/kvm_para.h>
 #include <linux/kthread.h>
 
+/* Watchdog configuration */
 static DEFINE_MUTEX(watchdog_proc_mutex);
 
-#if defined(CONFIG_HAVE_NMI_WATCHDOG) || defined(CONFIG_HARDLOCKUP_DETECTOR)
+int __read_mostly nmi_watchdog_enabled;
+
+#ifdef CONFIG_HARDLOCKUP_DETECTOR
 unsigned long __read_mostly watchdog_enabled = SOFT_WATCHDOG_ENABLED|NMI_WATCHDOG_ENABLED;
+
+/* boot commands */
+/*
+ * Should we panic when a soft-lockup or hard-lockup occurs:
+ */
+unsigned int __read_mostly hardlockup_panic =
+			CONFIG_BOOTPARAM_HARDLOCKUP_PANIC_VALUE;
+/*
+ * We may not want to enable hard lockup detection by default in all cases,
+ * for example when running the kernel as a guest on a hypervisor. In these
+ * cases this function can be called to disable hard lockup detection. This
+ * function should only be executed once by the boot processor before the
+ * kernel command line parameters are parsed, because otherwise it is not
+ * possible to override this in hardlockup_panic_setup().
+ */
+void hardlockup_detector_disable(void)
+{
+	watchdog_enabled &= ~NMI_WATCHDOG_ENABLED;
+}
+
+static int __init hardlockup_panic_setup(char *str)
+{
+	if (!strncmp(str, "panic", 5))
+		hardlockup_panic = 1;
+	else if (!strncmp(str, "nopanic", 7))
+		hardlockup_panic = 0;
+	else if (!strncmp(str, "0", 1))
+		watchdog_enabled &= ~NMI_WATCHDOG_ENABLED;
+	else if (!strncmp(str, "1", 1))
+		watchdog_enabled |= NMI_WATCHDOG_ENABLED;
+	return 1;
+}
+__setup("nmi_watchdog=", hardlockup_panic_setup);
+
 #else
 unsigned long __read_mostly watchdog_enabled = SOFT_WATCHDOG_ENABLED;
 #endif
-int __read_mostly nmi_watchdog_enabled;
+
+#ifdef CONFIG_SOFTLOCKUP_DETECTOR
 int __read_mostly soft_watchdog_enabled;
+#endif
+
 int __read_mostly watchdog_user_enabled;
 int __read_mostly watchdog_thresh = 10;
 
@@ -45,15 +85,9 @@ int __read_mostly watchdog_thresh = 10;
 int __read_mostly sysctl_softlockup_all_cpu_backtrace;
 int __read_mostly sysctl_hardlockup_all_cpu_backtrace;
 #endif
-static struct cpumask watchdog_cpumask __read_mostly;
+struct cpumask watchdog_cpumask __read_mostly;
 unsigned long *watchdog_cpumask_bits = cpumask_bits(&watchdog_cpumask);
 
-/* Helper for online, unparked cpus. */
-#define for_each_watchdog_cpu(cpu) \
-	for_each_cpu_and((cpu), cpu_online_mask, &watchdog_cpumask)
-
-atomic_t watchdog_park_in_progress = ATOMIC_INIT(0);
-
 /*
  * The 'watchdog_running' variable is set to 1 when the watchdog threads
  * are registered/started and is set to 0 when the watchdog threads are
@@ -72,7 +106,27 @@ static int __read_mostly watchdog_running;
  * of 'watchdog_running' cannot change while the watchdog is deactivated
  * temporarily (see related code in 'proc' handlers).
  */
-static int __read_mostly watchdog_suspended;
+int __read_mostly watchdog_suspended;
+
+/*
+ * These functions can be overridden if an architecture implements its
+ * own hardlockup detector.
+ */
+int __weak watchdog_nmi_enable(unsigned int cpu)
+{
+	return 0;
+}
+void __weak watchdog_nmi_disable(unsigned int cpu)
+{
+}
+
+#ifdef CONFIG_SOFTLOCKUP_DETECTOR
+
+/* Helper for online, unparked cpus. */
+#define for_each_watchdog_cpu(cpu) \
+	for_each_cpu_and((cpu), cpu_online_mask, &watchdog_cpumask)
+
+atomic_t watchdog_park_in_progress = ATOMIC_INIT(0);
 
 static u64 __read_mostly sample_period;
 
@@ -120,6 +174,7 @@ static int __init softlockup_all_cpu_backtrace_setup(char *str)
 	return 1;
 }
 __setup("softlockup_all_cpu_backtrace=", softlockup_all_cpu_backtrace_setup);
+#ifdef CONFIG_HARDLOCKUP_DETECTOR
 static int __init hardlockup_all_cpu_backtrace_setup(char *str)
 {
 	sysctl_hardlockup_all_cpu_backtrace =
@@ -128,6 +183,7 @@ static int __init hardlockup_all_cpu_backtrace_setup(char *str)
 }
 __setup("hardlockup_all_cpu_backtrace=", hardlockup_all_cpu_backtrace_setup);
 #endif
+#endif
 
 /*
  * Hard-lockup warnings should be triggered after just a few seconds. Soft-
@@ -213,18 +269,6 @@ void touch_softlockup_watchdog_sync(void)
 	__this_cpu_write(watchdog_touch_ts, 0);
 }
 
-/* watchdog detector functions */
-bool is_hardlockup(void)
-{
-	unsigned long hrint = __this_cpu_read(hrtimer_interrupts);
-
-	if (__this_cpu_read(hrtimer_interrupts_saved) == hrint)
-		return true;
-
-	__this_cpu_write(hrtimer_interrupts_saved, hrint);
-	return false;
-}
-
 static int is_softlockup(unsigned long touch_ts)
 {
 	unsigned long now = get_timestamp();
@@ -237,21 +281,21 @@ static int is_softlockup(unsigned long touch_ts)
 	return 0;
 }
 
-static void watchdog_interrupt_count(void)
+/* watchdog detector functions */
+bool is_hardlockup(void)
 {
-	__this_cpu_inc(hrtimer_interrupts);
-}
+	unsigned long hrint = __this_cpu_read(hrtimer_interrupts);
 
-/*
- * These two functions are mostly architecture specific
- * defining them as weak here.
- */
-int __weak watchdog_nmi_enable(unsigned int cpu)
-{
-	return 0;
+	if (__this_cpu_read(hrtimer_interrupts_saved) == hrint)
+		return true;
+
+	__this_cpu_write(hrtimer_interrupts_saved, hrint);
+	return false;
 }
-void __weak watchdog_nmi_disable(unsigned int cpu)
+
+static void watchdog_interrupt_count(void)
 {
+	__this_cpu_inc(hrtimer_interrupts);
 }
 
 static int watchdog_enable_all_cpus(void);
@@ -502,57 +546,6 @@ static void watchdog_unpark_threads(void)
 		kthread_unpark(per_cpu(softlockup_watchdog, cpu));
 }
 
-/*
- * Suspend the hard and soft lockup detector by parking the watchdog threads.
- */
-int lockup_detector_suspend(void)
-{
-	int ret = 0;
-
-	get_online_cpus();
-	mutex_lock(&watchdog_proc_mutex);
-	/*
-	 * Multiple suspend requests can be active in parallel (counted by
-	 * the 'watchdog_suspended' variable). If the watchdog threads are
-	 * running, the first caller takes care that they will be parked.
-	 * The state of 'watchdog_running' cannot change while a suspend
-	 * request is active (see related code in 'proc' handlers).
-	 */
-	if (watchdog_running && !watchdog_suspended)
-		ret = watchdog_park_threads();
-
-	if (ret == 0)
-		watchdog_suspended++;
-	else {
-		watchdog_disable_all_cpus();
-		pr_err("Failed to suspend lockup detectors, disabled\n");
-		watchdog_enabled = 0;
-	}
-
-	mutex_unlock(&watchdog_proc_mutex);
-
-	return ret;
-}
-
-/*
- * Resume the hard and soft lockup detector by unparking the watchdog threads.
- */
-void lockup_detector_resume(void)
-{
-	mutex_lock(&watchdog_proc_mutex);
-
-	watchdog_suspended--;
-	/*
-	 * The watchdog threads are unparked if they were previously running
-	 * and if there is no more active suspend request.
-	 */
-	if (watchdog_running && !watchdog_suspended)
-		watchdog_unpark_threads();
-
-	mutex_unlock(&watchdog_proc_mutex);
-	put_online_cpus();
-}
-
 static int update_watchdog_all_cpus(void)
 {
 	int ret;
@@ -604,6 +597,81 @@ static void watchdog_disable_all_cpus(void)
 	}
 }
 
+#else /* SOFTLOCKUP */
+static int watchdog_park_threads(void)
+{
+	return 0;
+}
+
+static void watchdog_unpark_threads(void)
+{
+}
+
+static int watchdog_enable_all_cpus(void)
+{
+	return 0;
+}
+
+static void watchdog_disable_all_cpus(void)
+{
+}
+
+static void set_sample_period(void)
+{
+}
+#endif /* SOFTLOCKUP */
+
+/*
+ * Suspend the hard and soft lockup detector by parking the watchdog threads.
+ */
+int lockup_detector_suspend(void)
+{
+	int ret = 0;
+
+	get_online_cpus();
+	mutex_lock(&watchdog_proc_mutex);
+	/*
+	 * Multiple suspend requests can be active in parallel (counted by
+	 * the 'watchdog_suspended' variable). If the watchdog threads are
+	 * running, the first caller takes care that they will be parked.
+	 * The state of 'watchdog_running' cannot change while a suspend
+	 * request is active (see related code in 'proc' handlers).
+	 */
+	if (watchdog_running && !watchdog_suspended)
+		ret = watchdog_park_threads();
+
+	if (ret == 0)
+		watchdog_suspended++;
+	else {
+		watchdog_disable_all_cpus();
+		pr_err("Failed to suspend lockup detectors, disabled\n");
+		watchdog_enabled = 0;
+	}
+
+	mutex_unlock(&watchdog_proc_mutex);
+
+	return ret;
+}
+
+/*
+ * Resume the hard and soft lockup detector by unparking the watchdog threads.
+ */
+void lockup_detector_resume(void)
+{
+	mutex_lock(&watchdog_proc_mutex);
+
+	watchdog_suspended--;
+	/*
+	 * The watchdog threads are unparked if they were previously running
+	 * and if there is no more active suspend request.
+	 */
+	if (watchdog_running && !watchdog_suspended)
+		watchdog_unpark_threads();
+
+	mutex_unlock(&watchdog_proc_mutex);
+	put_online_cpus();
+}
+
 #ifdef CONFIG_SYSCTL
 
 /*
@@ -810,9 +878,11 @@ int proc_watchdog_cpumask(struct ctl_table *table, int write,
 			 * a temporary cpumask, so we are likely not in a
 			 * position to do much else to make things better.
 			 */
+#ifdef CONFIG_SOFTLOCKUP_DETECTOR
 			if (smpboot_update_cpumask_percpu_thread(
 				    &watchdog_threads, &watchdog_cpumask) != 0)
 				pr_err("cpumask update failed\n");
+#endif
 		}
 	}
 out:
diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index e0d7a7c43fb5..ffd4fa59fbde 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -22,39 +22,7 @@ static DEFINE_PER_CPU(bool, hard_watchdog_warn);
 static DEFINE_PER_CPU(bool, watchdog_nmi_touch);
 static DEFINE_PER_CPU(struct perf_event *, watchdog_ev);
 
-/* boot commands */
-/*
- * Should we panic when a soft-lockup or hard-lockup occurs:
- */
-unsigned int __read_mostly hardlockup_panic =
-			CONFIG_BOOTPARAM_HARDLOCKUP_PANIC_VALUE;
 static unsigned long hardlockup_allcpu_dumped;
-/*
- * We may not want to enable hard lockup detection by default in all cases,
- * for example when running the kernel as a guest on a hypervisor. In these
- * cases this function can be called to disable hard lockup detection. This
- * function should only be executed once by the boot processor before the
- * kernel command line parameters are parsed, because otherwise it is not
- * possible to override this in hardlockup_panic_setup().
- */
-void hardlockup_detector_disable(void)
-{
-	watchdog_enabled &= ~NMI_WATCHDOG_ENABLED;
-}
-
-static int __init hardlockup_panic_setup(char *str)
-{
-	if (!strncmp(str, "panic", 5))
-		hardlockup_panic = 1;
-	else if (!strncmp(str, "nopanic", 7))
-		hardlockup_panic = 0;
-	else if (!strncmp(str, "0", 1))
-		watchdog_enabled &= ~NMI_WATCHDOG_ENABLED;
-	else if (!strncmp(str, "1", 1))
-		watchdog_enabled |= NMI_WATCHDOG_ENABLED;
-	return 1;
-}
-__setup("nmi_watchdog=", hardlockup_panic_setup);
 
 void hld_touch_nmi_watchdog(void)
 {
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index e4587ebe52c7..74d5ad667426 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -801,10 +801,25 @@ config LOCKUP_DETECTOR
 	  The frequency of hrtimer and NMI events and the soft and hard lockup
 	  thresholds can be controlled through the sysctl watchdog_thresh.
 
+config SOFTLOCKUP_DETECTOR
+	bool "Detect Soft Lockups"
+	depends on LOCKUP_DETECTOR
+
+config HARDLOCKUP_DETECTOR_PERF
+	bool
+	select SOFTLOCKUP_DETECTOR
+
+#
+# arch/ can define HAVE_NMI_WATCHDOG to provide their own NMI watchdog
+# rather than the perf based detector.
+#
+# The arch may do its own thing, or select HARDLOCKUP_DETECTOR, which
+#
 config HARDLOCKUP_DETECTOR
-	def_bool y
-	depends on LOCKUP_DETECTOR && !HAVE_NMI_WATCHDOG
-	depends on PERF_EVENTS && HAVE_PERF_EVENTS_NMI
+	bool "Detect Hard Lockups"
+	depends on LOCKUP_DETECTOR
+	depends on !HAVE_NMI_WATCHDOG || (PERF_EVENTS && HAVE_PERF_EVENTS_NMI)
+	select HARDLOCKUP_DETECTOR_PERF if !HAVE_NMI_WATCHDOG
 
 config BOOTPARAM_HARDLOCKUP_PANIC
 	bool "Panic (Reboot) On Hard Lockups"
@@ -826,7 +841,7 @@ config BOOTPARAM_HARDLOCKUP_PANIC_VALUE
 
 config BOOTPARAM_SOFTLOCKUP_PANIC
 	bool "Panic (Reboot) On Soft Lockups"
-	depends on LOCKUP_DETECTOR
+	depends on SOFTLOCKUP_DETECTOR
 	help
 	  Say Y here to enable the kernel to panic on "soft lockups",
 	  which are bugs that cause the kernel to loop in kernel
@@ -843,7 +858,7 @@ config BOOTPARAM_SOFTLOCKUP_PANIC
 
 config BOOTPARAM_SOFTLOCKUP_PANIC_VALUE
 	int
-	depends on LOCKUP_DETECTOR
+	depends on SOFTLOCKUP_DETECTOR
 	range 0 1
 	default 0 if !BOOTPARAM_SOFTLOCKUP_PANIC
 	default 1 if BOOTPARAM_SOFTLOCKUP_PANIC
@@ -851,7 +866,7 @@ config BOOTPARAM_SOFTLOCKUP_PANIC_VALUE
 config DETECT_HUNG_TASK
 	bool "Detect Hung Tasks"
 	depends on DEBUG_KERNEL
-	default LOCKUP_DETECTOR
+	default SOFTLOCKUP_DETECTOR
 	help
 	  Say Y here to enable the kernel to detect "hung tasks",
 	  which are bugs that cause the task to be stuck in
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/4] watchdog: provide watchdog_reconfigure() for arch watchdogs
  2017-05-25  8:28 [PATCH 0/4][V2] Improve watchdog config for arch watchdogs Nicholas Piggin
                   ` (2 preceding siblings ...)
  2017-05-25  8:28 ` [PATCH 3/4] watchdog: split out config options Nicholas Piggin
@ 2017-05-25  8:28 ` Nicholas Piggin
  2017-05-25 14:08   ` Don Zickus
  3 siblings, 1 reply; 14+ messages in thread
From: Nicholas Piggin @ 2017-05-25  8:28 UTC (permalink / raw)
  To: Don Zickus; +Cc: Nicholas Piggin, linux-kernel, linux-arch

After reconfiguring watchdog sysctls etc., architecture specific
watchdogs may not get all their parameters updated.

watchdog_reconfigure() can be implemented to pull the new values
in and set the arch NMI watchdog.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 kernel/watchdog.c | 28 ++++++++++++++++++++++++----
 1 file changed, 24 insertions(+), 4 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index deb010505646..d2996f5bf551 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -120,6 +120,11 @@ void __weak watchdog_nmi_disable(unsigned int cpu)
 {
 }
 
+void __weak watchdog_nmi_reconfigure(void)
+{
+}
+
+
 #ifdef CONFIG_SOFTLOCKUP_DETECTOR
 
 /* Helper for online, unparked cpus. */
@@ -597,6 +602,12 @@ static void watchdog_disable_all_cpus(void)
 	}
 }
 
+static int watchdog_update_cpus(void)
+{
+	return smpboot_update_cpumask_percpu_thread(
+		    &watchdog_threads, &watchdog_cpumask);
+}
+
 #else /* SOFTLOCKUP */
 static int watchdog_park_threads(void)
 {
@@ -616,6 +627,10 @@ static void watchdog_disable_all_cpus(void)
 {
 }
 
+static int watchdog_update_cpus(void)
+{
+}
+
 static void set_sample_period(void)
 {
 }
@@ -648,6 +663,8 @@ int lockup_detector_suspend(void)
 		watchdog_enabled = 0;
 	}
 
+	watchdog_nmi_reconfigure();
+
 	mutex_unlock(&watchdog_proc_mutex);
 
 	return ret;
@@ -668,6 +685,8 @@ void lockup_detector_resume(void)
 	if (watchdog_running && !watchdog_suspended)
 		watchdog_unpark_threads();
 
+	watchdog_nmi_reconfigure();
+
 	mutex_unlock(&watchdog_proc_mutex);
 	put_online_cpus();
 }
@@ -693,6 +712,8 @@ static int proc_watchdog_update(void)
 	else
 		watchdog_disable_all_cpus();
 
+	watchdog_nmi_reconfigure();
+
 	return err;
 
 }
@@ -878,12 +899,11 @@ int proc_watchdog_cpumask(struct ctl_table *table, int write,
 			 * a temporary cpumask, so we are likely not in a
 			 * position to do much else to make things better.
 			 */
-#ifdef CONFIG_SOFTLOCKUP_DETECTOR
-			if (smpboot_update_cpumask_percpu_thread(
-				    &watchdog_threads, &watchdog_cpumask) != 0)
+			if (watchdog_update_cpus() != 0)
 				pr_err("cpumask update failed\n");
-#endif
 		}
+
+		watchdog_nmi_reconfigure();
 	}
 out:
 	mutex_unlock(&watchdog_proc_mutex);
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/4] watchdog: split out config options
  2017-05-25  8:28 ` [PATCH 3/4] watchdog: split out config options Nicholas Piggin
@ 2017-05-25 11:30   ` kbuild test robot
  2017-05-25 12:09   ` kbuild test robot
  1 sibling, 0 replies; 14+ messages in thread
From: kbuild test robot @ 2017-05-25 11:30 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: kbuild-all, Don Zickus, Nicholas Piggin, linux-kernel, linux-arch

[-- Attachment #1: Type: text/plain, Size: 8578 bytes --]

Hi Nicholas,

[auto build test ERROR on linus/master]
[also build test ERROR on v4.12-rc2 next-20170525]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Nicholas-Piggin/watchdog-remove-unused-declaration/20170525-164255
config: ia64-allmodconfig (attached as .config)
compiler: ia64-linux-gcc (GCC) 6.2.0
reproduce:
        wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=ia64 

All error/warnings (new ones prefixed by >>):

   kernel/watchdog_hld.c: In function 'watchdog_overflow_callback':
>> kernel/watchdog_hld.c:54:7: error: 'struct perf_event' has no member named 'hw'
     event->hw.interrupts = 0;
          ^~
   kernel/watchdog_hld.c: In function 'watchdog_nmi_enable':
>> kernel/watchdog_hld.c:123:20: error: 'struct perf_event' has no member named 'state'
     if (event && event->state > PERF_EVENT_STATE_OFF)
                       ^~
>> kernel/watchdog_hld.c:137:10: error: implicit declaration of function 'perf_event_create_kernel_counter' [-Werror=implicit-function-declaration]
     event = perf_event_create_kernel_counter(wd_attr, cpu, NULL, watchdog_overflow_callback, NULL);
             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> kernel/watchdog_hld.c:137:8: warning: assignment makes pointer from integer without a cast [-Wint-conversion]
     event = perf_event_create_kernel_counter(wd_attr, cpu, NULL, watchdog_overflow_callback, NULL);
           ^
   cc1: some warnings being treated as errors

vim +54 kernel/watchdog_hld.c

73ce0511 Babu Moger      2016-12-14   48  /* Callback function for perf event subsystem */
73ce0511 Babu Moger      2016-12-14   49  static void watchdog_overflow_callback(struct perf_event *event,
73ce0511 Babu Moger      2016-12-14   50  		 struct perf_sample_data *data,
73ce0511 Babu Moger      2016-12-14   51  		 struct pt_regs *regs)
73ce0511 Babu Moger      2016-12-14   52  {
73ce0511 Babu Moger      2016-12-14   53  	/* Ensure the watchdog never gets throttled */
73ce0511 Babu Moger      2016-12-14  @54  	event->hw.interrupts = 0;
73ce0511 Babu Moger      2016-12-14   55  
b94f5118 Don Zickus      2017-01-24   56  	if (atomic_read(&watchdog_park_in_progress) != 0)
b94f5118 Don Zickus      2017-01-24   57  		return;
b94f5118 Don Zickus      2017-01-24   58  
73ce0511 Babu Moger      2016-12-14   59  	if (__this_cpu_read(watchdog_nmi_touch) == true) {
73ce0511 Babu Moger      2016-12-14   60  		__this_cpu_write(watchdog_nmi_touch, false);
73ce0511 Babu Moger      2016-12-14   61  		return;
73ce0511 Babu Moger      2016-12-14   62  	}
73ce0511 Babu Moger      2016-12-14   63  
73ce0511 Babu Moger      2016-12-14   64  	/* check for a hardlockup
73ce0511 Babu Moger      2016-12-14   65  	 * This is done by making sure our timer interrupt
73ce0511 Babu Moger      2016-12-14   66  	 * is incrementing.  The timer interrupt should have
73ce0511 Babu Moger      2016-12-14   67  	 * fired multiple times before we overflow'd.  If it hasn't
73ce0511 Babu Moger      2016-12-14   68  	 * then this is a good indication the cpu is stuck
73ce0511 Babu Moger      2016-12-14   69  	 */
73ce0511 Babu Moger      2016-12-14   70  	if (is_hardlockup()) {
73ce0511 Babu Moger      2016-12-14   71  		int this_cpu = smp_processor_id();
73ce0511 Babu Moger      2016-12-14   72  
73ce0511 Babu Moger      2016-12-14   73  		/* only print hardlockups once */
73ce0511 Babu Moger      2016-12-14   74  		if (__this_cpu_read(hard_watchdog_warn) == true)
73ce0511 Babu Moger      2016-12-14   75  			return;
73ce0511 Babu Moger      2016-12-14   76  
73ce0511 Babu Moger      2016-12-14   77  		pr_emerg("Watchdog detected hard LOCKUP on cpu %d", this_cpu);
73ce0511 Babu Moger      2016-12-14   78  		print_modules();
73ce0511 Babu Moger      2016-12-14   79  		print_irqtrace_events(current);
73ce0511 Babu Moger      2016-12-14   80  		if (regs)
73ce0511 Babu Moger      2016-12-14   81  			show_regs(regs);
73ce0511 Babu Moger      2016-12-14   82  		else
73ce0511 Babu Moger      2016-12-14   83  			dump_stack();
73ce0511 Babu Moger      2016-12-14   84  
73ce0511 Babu Moger      2016-12-14   85  		/*
73ce0511 Babu Moger      2016-12-14   86  		 * Perform all-CPU dump only once to avoid multiple hardlockups
73ce0511 Babu Moger      2016-12-14   87  		 * generating interleaving traces
73ce0511 Babu Moger      2016-12-14   88  		 */
73ce0511 Babu Moger      2016-12-14   89  		if (sysctl_hardlockup_all_cpu_backtrace &&
73ce0511 Babu Moger      2016-12-14   90  				!test_and_set_bit(0, &hardlockup_allcpu_dumped))
73ce0511 Babu Moger      2016-12-14   91  			trigger_allbutself_cpu_backtrace();
73ce0511 Babu Moger      2016-12-14   92  
73ce0511 Babu Moger      2016-12-14   93  		if (hardlockup_panic)
73ce0511 Babu Moger      2016-12-14   94  			nmi_panic(regs, "Hard LOCKUP");
73ce0511 Babu Moger      2016-12-14   95  
73ce0511 Babu Moger      2016-12-14   96  		__this_cpu_write(hard_watchdog_warn, true);
73ce0511 Babu Moger      2016-12-14   97  		return;
73ce0511 Babu Moger      2016-12-14   98  	}
73ce0511 Babu Moger      2016-12-14   99  
73ce0511 Babu Moger      2016-12-14  100  	__this_cpu_write(hard_watchdog_warn, false);
73ce0511 Babu Moger      2016-12-14  101  	return;
73ce0511 Babu Moger      2016-12-14  102  }
73ce0511 Babu Moger      2016-12-14  103  
73ce0511 Babu Moger      2016-12-14  104  /*
73ce0511 Babu Moger      2016-12-14  105   * People like the simple clean cpu node info on boot.
73ce0511 Babu Moger      2016-12-14  106   * Reduce the watchdog noise by only printing messages
73ce0511 Babu Moger      2016-12-14  107   * that are different from what cpu0 displayed.
73ce0511 Babu Moger      2016-12-14  108   */
8dcde9de Prarit Bhargava 2017-02-22  109  static unsigned long firstcpu_err;
8dcde9de Prarit Bhargava 2017-02-22  110  static atomic_t watchdog_cpus;
73ce0511 Babu Moger      2016-12-14  111  
73ce0511 Babu Moger      2016-12-14  112  int watchdog_nmi_enable(unsigned int cpu)
73ce0511 Babu Moger      2016-12-14  113  {
73ce0511 Babu Moger      2016-12-14  114  	struct perf_event_attr *wd_attr;
73ce0511 Babu Moger      2016-12-14  115  	struct perf_event *event = per_cpu(watchdog_ev, cpu);
8dcde9de Prarit Bhargava 2017-02-22  116  	int firstcpu = 0;
73ce0511 Babu Moger      2016-12-14  117  
73ce0511 Babu Moger      2016-12-14  118  	/* nothing to do if the hard lockup detector is disabled */
73ce0511 Babu Moger      2016-12-14  119  	if (!(watchdog_enabled & NMI_WATCHDOG_ENABLED))
73ce0511 Babu Moger      2016-12-14  120  		goto out;
73ce0511 Babu Moger      2016-12-14  121  
73ce0511 Babu Moger      2016-12-14  122  	/* is it already setup and enabled? */
73ce0511 Babu Moger      2016-12-14 @123  	if (event && event->state > PERF_EVENT_STATE_OFF)
73ce0511 Babu Moger      2016-12-14  124  		goto out;
73ce0511 Babu Moger      2016-12-14  125  
73ce0511 Babu Moger      2016-12-14  126  	/* it is setup but not enabled */
73ce0511 Babu Moger      2016-12-14  127  	if (event != NULL)
73ce0511 Babu Moger      2016-12-14  128  		goto out_enable;
73ce0511 Babu Moger      2016-12-14  129  
8dcde9de Prarit Bhargava 2017-02-22  130  	if (atomic_inc_return(&watchdog_cpus) == 1)
8dcde9de Prarit Bhargava 2017-02-22  131  		firstcpu = 1;
8dcde9de Prarit Bhargava 2017-02-22  132  
73ce0511 Babu Moger      2016-12-14  133  	wd_attr = &wd_hw_attr;
73ce0511 Babu Moger      2016-12-14  134  	wd_attr->sample_period = hw_nmi_get_sample_period(watchdog_thresh);
73ce0511 Babu Moger      2016-12-14  135  
73ce0511 Babu Moger      2016-12-14  136  	/* Try to register using hardware perf events */
73ce0511 Babu Moger      2016-12-14 @137  	event = perf_event_create_kernel_counter(wd_attr, cpu, NULL, watchdog_overflow_callback, NULL);
73ce0511 Babu Moger      2016-12-14  138  
8dcde9de Prarit Bhargava 2017-02-22  139  	/* save the first cpu's error for future comparision */
8dcde9de Prarit Bhargava 2017-02-22  140  	if (firstcpu && IS_ERR(event))

:::::: The code at line 54 was first introduced by commit
:::::: 73ce0511c43686095efd2f65ef564aab952e07bc kernel/watchdog.c: move hardlockup detector to separate file

:::::: TO: Babu Moger <babu.moger@oracle.com>
:::::: CC: Linus Torvalds <torvalds@linux-foundation.org>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 47748 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/4] watchdog: split out config options
  2017-05-25  8:28 ` [PATCH 3/4] watchdog: split out config options Nicholas Piggin
  2017-05-25 11:30   ` kbuild test robot
@ 2017-05-25 12:09   ` kbuild test robot
  1 sibling, 0 replies; 14+ messages in thread
From: kbuild test robot @ 2017-05-25 12:09 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: kbuild-all, Don Zickus, Nicholas Piggin, linux-kernel, linux-arch

[-- Attachment #1: Type: text/plain, Size: 1465 bytes --]

Hi Nicholas,

[auto build test ERROR on linus/master]
[also build test ERROR on v4.12-rc2 next-20170525]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Nicholas-Piggin/watchdog-remove-unused-declaration/20170525-164255
config: parisc-allyesconfig (attached as .config)
compiler: hppa-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=parisc 

All errors (new ones prefixed by >>):

   kernel/built-in.o: In function `watchdog_nmi_enable':
>> (.text.watchdog_nmi_enable+0x298): undefined reference to `hw_nmi_get_sample_period'
   kernel/built-in.o: In function `watchdog_nmi_enable':
   (.text.watchdog_nmi_enable+0x2e8): undefined reference to `hw_nmi_get_sample_period'
   hppa-linux-gnu-ld: drivers/built-in.o(.init.text+0x3d4ac): cannot reach 0000b22c_simple_map_init+0, recompile with -ffunction-sections
   hppa-linux-gnu-ld: drivers/built-in.o(.init.text+0x3d4ac): cannot handle R_PARISC_PCREL17F for simple_map_init
   hppa-linux-gnu-ld: final link failed: Bad value

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 50215 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/4] watchdog: introduce arch_touch_nmi_watchdog()
  2017-05-25  8:28 ` [PATCH 2/4] watchdog: introduce arch_touch_nmi_watchdog() Nicholas Piggin
@ 2017-05-25 13:55   ` Don Zickus
  2017-05-26  0:31     ` Nicholas Piggin
  0 siblings, 1 reply; 14+ messages in thread
From: Don Zickus @ 2017-05-25 13:55 UTC (permalink / raw)
  To: Nicholas Piggin; +Cc: linux-kernel, linux-arch

On Thu, May 25, 2017 at 06:28:54PM +1000, Nicholas Piggin wrote:
> For architectures that define HAVE_NMI_WATCHDOG, instead of having
> them provide the complete touch_nmi_watchdog() function, just have
> them provide arch_touch_nmi_watchdog().
> 
> This gives the generic code more flexibility in implementing this
> function, and arch implementations don't miss out on touching the
> softlockup watchdog or other generic details.

The idea makes sense.  I don't think you can have hld_touch_nmi_watchdog
defined with arch_touch_nmi_watchdog, so I am wondering if it makes sense to
combine them somehow.  Though renaming hld_touch_nmi_watchdog to
arch_touch_nmi_watchdog sounds odd, I think it mimics the idea.

Cheers,
Don

> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  arch/blackfin/include/asm/nmi.h            |  2 ++
>  arch/blackfin/kernel/nmi.c                 |  2 +-
>  arch/mn10300/include/asm/nmi.h             |  2 ++
>  arch/mn10300/kernel/mn10300-watchdog-low.S |  8 ++++----
>  arch/mn10300/kernel/mn10300-watchdog.c     |  2 +-
>  arch/sparc/include/asm/nmi.h               |  1 +
>  arch/sparc/kernel/nmi.c                    |  6 ++----
>  include/linux/nmi.h                        | 26 +++++++++++++++-----------
>  kernel/watchdog_hld.c                      |  5 ++---
>  9 files changed, 30 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/blackfin/include/asm/nmi.h b/arch/blackfin/include/asm/nmi.h
> index b9caac4fcfd8..6e7047ceec79 100644
> --- a/arch/blackfin/include/asm/nmi.h
> +++ b/arch/blackfin/include/asm/nmi.h
> @@ -7,6 +7,8 @@
>  #ifndef _BFIN_NMI_H_
>  #define _BFIN_NMI_H_
>  
> +void arch_touch_nmi_watchdog(void);
> +
>  #include <linux/nmi.h>
>  
>  #endif
> diff --git a/arch/blackfin/kernel/nmi.c b/arch/blackfin/kernel/nmi.c
> index 633c37083e87..1e714329fe8a 100644
> --- a/arch/blackfin/kernel/nmi.c
> +++ b/arch/blackfin/kernel/nmi.c
> @@ -190,7 +190,7 @@ static int __init init_nmi_wdt(void)
>  }
>  device_initcall(init_nmi_wdt);
>  
> -void touch_nmi_watchdog(void)
> +void arch_touch_nmi_watchdog(void)
>  {
>  	atomic_set(&nmi_touched[smp_processor_id()], 1);
>  }
> diff --git a/arch/mn10300/include/asm/nmi.h b/arch/mn10300/include/asm/nmi.h
> index f3671cbbc117..aee8aa22d9ee 100644
> --- a/arch/mn10300/include/asm/nmi.h
> +++ b/arch/mn10300/include/asm/nmi.h
> @@ -11,4 +11,6 @@
>  #ifndef _ASM_NMI_H
>  #define _ASM_NMI_H
>  
> +void arch_touch_nmi_watchdog(void);
> +
>  #endif /* _ASM_NMI_H */
> diff --git a/arch/mn10300/kernel/mn10300-watchdog-low.S b/arch/mn10300/kernel/mn10300-watchdog-low.S
> index f2f5c9cfaabd..34f8773de7d0 100644
> --- a/arch/mn10300/kernel/mn10300-watchdog-low.S
> +++ b/arch/mn10300/kernel/mn10300-watchdog-low.S
> @@ -50,9 +50,9 @@ watchdog_handler:
>  #   we can't inline it)
>  #
>  ###############################################################################
> -	.globl	touch_nmi_watchdog
> -	.type	touch_nmi_watchdog,@function
> -touch_nmi_watchdog:
> +	.globl	arch_touch_nmi_watchdog
> +	.type	arch_touch_nmi_watchdog,@function
> +arch_touch_nmi_watchdog:
>  	clr	d0
>  	clr	d1
>  	mov	watchdog_alert_counter, a0
> @@ -63,4 +63,4 @@ touch_nmi_watchdog:
>  	lne
>  	ret	[],0
>  
> -	.size	touch_nmi_watchdog,.-touch_nmi_watchdog
> +	.size	arch_touch_nmi_watchdog,.-arch_touch_nmi_watchdog
> diff --git a/arch/mn10300/kernel/mn10300-watchdog.c b/arch/mn10300/kernel/mn10300-watchdog.c
> index a2d8e6938d67..0d5641beadf5 100644
> --- a/arch/mn10300/kernel/mn10300-watchdog.c
> +++ b/arch/mn10300/kernel/mn10300-watchdog.c
> @@ -31,7 +31,7 @@ static unsigned int watchdog;
>  static unsigned int watchdog_hz = 1;
>  unsigned int watchdog_alert_counter[NR_CPUS];
>  
> -EXPORT_SYMBOL(touch_nmi_watchdog);
> +EXPORT_SYMBOL(arch_touch_nmi_watchdog);
>  
>  /*
>   * the best way to detect whether a CPU has a 'hard lockup' problem
> diff --git a/arch/sparc/include/asm/nmi.h b/arch/sparc/include/asm/nmi.h
> index 26ad2b2607c6..284eac3ffaf2 100644
> --- a/arch/sparc/include/asm/nmi.h
> +++ b/arch/sparc/include/asm/nmi.h
> @@ -7,6 +7,7 @@ void nmi_adjust_hz(unsigned int new_hz);
>  
>  extern atomic_t nmi_active;
>  
> +void arch_touch_nmi_watchdog(void);
>  void start_nmi_watchdog(void *unused);
>  void stop_nmi_watchdog(void *unused);
>  
> diff --git a/arch/sparc/kernel/nmi.c b/arch/sparc/kernel/nmi.c
> index 95e73c63c99d..048ad783ea3f 100644
> --- a/arch/sparc/kernel/nmi.c
> +++ b/arch/sparc/kernel/nmi.c
> @@ -51,7 +51,7 @@ static DEFINE_PER_CPU(unsigned int, last_irq_sum);
>  static DEFINE_PER_CPU(long, alert_counter);
>  static DEFINE_PER_CPU(int, nmi_touch);
>  
> -void touch_nmi_watchdog(void)
> +void arch_touch_nmi_watchdog(void)
>  {
>  	if (atomic_read(&nmi_active)) {
>  		int cpu;
> @@ -61,10 +61,8 @@ void touch_nmi_watchdog(void)
>  				per_cpu(nmi_touch, cpu) = 1;
>  		}
>  	}
> -
> -	touch_softlockup_watchdog();
>  }
> -EXPORT_SYMBOL(touch_nmi_watchdog);
> +EXPORT_SYMBOL(arch_touch_nmi_watchdog);
>  
>  static void die_nmi(const char *str, struct pt_regs *regs, int do_panic)
>  {
> diff --git a/include/linux/nmi.h b/include/linux/nmi.h
> index 5e2e57536d98..6ea465a842a1 100644
> --- a/include/linux/nmi.h
> +++ b/include/linux/nmi.h
> @@ -6,6 +6,9 @@
>  
>  #include <linux/sched.h>
>  #include <asm/irq.h>
> +#if defined(CONFIG_HAVE_NMI_WATCHDOG)
> +#include <asm/nmi.h>
> +#endif
>  
>  #ifdef CONFIG_LOCKUP_DETECTOR
>  extern void touch_softlockup_watchdog_sched(void);
> @@ -58,6 +61,14 @@ static inline void reset_hung_task_detector(void)
>  #define NMI_WATCHDOG_ENABLED      (1 << NMI_WATCHDOG_ENABLED_BIT)
>  #define SOFT_WATCHDOG_ENABLED     (1 << SOFT_WATCHDOG_ENABLED_BIT)
>  
> +#if defined(CONFIG_HARDLOCKUP_DETECTOR)
> +extern void hld_touch_nmi_watchdog(void);
> +extern void hardlockup_detector_disable(void);
> +#else
> +static inline void hld_touch_nmi_watchdog(void) {}
> +static inline void hardlockup_detector_disable(void) {}
> +#endif
> +
>  /**
>   * touch_nmi_watchdog - restart NMI watchdog timeout.
>   * 
> @@ -65,21 +76,14 @@ static inline void reset_hung_task_detector(void)
>   * may be used to reset the timeout - for code which intentionally
>   * disables interrupts for a long time. This call is stateless.
>   */
> -#if defined(CONFIG_HAVE_NMI_WATCHDOG) || defined(CONFIG_HARDLOCKUP_DETECTOR)
> -#include <asm/nmi.h>
> -extern void touch_nmi_watchdog(void);
> -#else
>  static inline void touch_nmi_watchdog(void)
>  {
> +#if defined(CONFIG_HAVE_NMI_WATCHDOG)
> +	arch_touch_nmi_watchdog();
> +#endif
> +	hld_touch_nmi_watchdog();
>  	touch_softlockup_watchdog();
>  }
> -#endif
> -
> -#if defined(CONFIG_HARDLOCKUP_DETECTOR)
> -extern void hardlockup_detector_disable(void);
> -#else
> -static inline void hardlockup_detector_disable(void) {}
> -#endif
>  
>  /*
>   * Create trigger_all_cpu_backtrace() out of the arch-provided
> diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
> index 54a427d1f344..e0d7a7c43fb5 100644
> --- a/kernel/watchdog_hld.c
> +++ b/kernel/watchdog_hld.c
> @@ -56,7 +56,7 @@ static int __init hardlockup_panic_setup(char *str)
>  }
>  __setup("nmi_watchdog=", hardlockup_panic_setup);
>  
> -void touch_nmi_watchdog(void)
> +void hld_touch_nmi_watchdog(void)
>  {
>  	/*
>  	 * Using __raw here because some code paths have
> @@ -66,9 +66,8 @@ void touch_nmi_watchdog(void)
>  	 * going off.
>  	 */
>  	raw_cpu_write(watchdog_nmi_touch, true);
> -	touch_softlockup_watchdog();
>  }
> -EXPORT_SYMBOL(touch_nmi_watchdog);
> +EXPORT_SYMBOL(hld_touch_nmi_watchdog);
>  
>  static struct perf_event_attr wd_hw_attr = {
>  	.type		= PERF_TYPE_HARDWARE,
> -- 
> 2.11.0
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/4] watchdog: provide watchdog_reconfigure() for arch watchdogs
  2017-05-25  8:28 ` [PATCH 4/4] watchdog: provide watchdog_reconfigure() for arch watchdogs Nicholas Piggin
@ 2017-05-25 14:08   ` Don Zickus
  2017-05-26  0:39     ` Nicholas Piggin
  0 siblings, 1 reply; 14+ messages in thread
From: Don Zickus @ 2017-05-25 14:08 UTC (permalink / raw)
  To: Nicholas Piggin; +Cc: linux-kernel, linux-arch

On Thu, May 25, 2017 at 06:28:56PM +1000, Nicholas Piggin wrote:
> After reconfiguring watchdog sysctls etc., architecture specific
> watchdogs may not get all their parameters updated.
> 
> watchdog_reconfigure() can be implemented to pull the new values
> in and set the arch NMI watchdog.

I understand the reason for this patch and I don't have any real objection
on how it was implemented within the constraints of all the current logic.

I just wonder if the current logic should be adjusted to make the hardlockup
detector, namely the perf implementation more separate so it can handle what
you would like more cleanly.

The watchdog_nmi_reconfigure is sort of hackish, but it is hard to fault you
based on how things are designed.  I am going to poke at it a little bit,
but I will probably not find time to do much and accept what you have for
now.

Cheers,
Don

> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  kernel/watchdog.c | 28 ++++++++++++++++++++++++----
>  1 file changed, 24 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index deb010505646..d2996f5bf551 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -120,6 +120,11 @@ void __weak watchdog_nmi_disable(unsigned int cpu)
>  {
>  }
>  
> +void __weak watchdog_nmi_reconfigure(void)
> +{
> +}
> +
> +
>  #ifdef CONFIG_SOFTLOCKUP_DETECTOR
>  
>  /* Helper for online, unparked cpus. */
> @@ -597,6 +602,12 @@ static void watchdog_disable_all_cpus(void)
>  	}
>  }
>  
> +static int watchdog_update_cpus(void)
> +{
> +	return smpboot_update_cpumask_percpu_thread(
> +		    &watchdog_threads, &watchdog_cpumask);
> +}
> +
>  #else /* SOFTLOCKUP */
>  static int watchdog_park_threads(void)
>  {
> @@ -616,6 +627,10 @@ static void watchdog_disable_all_cpus(void)
>  {
>  }
>  
> +static int watchdog_update_cpus(void)
> +{
> +}
> +
>  static void set_sample_period(void)
>  {
>  }
> @@ -648,6 +663,8 @@ int lockup_detector_suspend(void)
>  		watchdog_enabled = 0;
>  	}
>  
> +	watchdog_nmi_reconfigure();
> +
>  	mutex_unlock(&watchdog_proc_mutex);
>  
>  	return ret;
> @@ -668,6 +685,8 @@ void lockup_detector_resume(void)
>  	if (watchdog_running && !watchdog_suspended)
>  		watchdog_unpark_threads();
>  
> +	watchdog_nmi_reconfigure();
> +
>  	mutex_unlock(&watchdog_proc_mutex);
>  	put_online_cpus();
>  }
> @@ -693,6 +712,8 @@ static int proc_watchdog_update(void)
>  	else
>  		watchdog_disable_all_cpus();
>  
> +	watchdog_nmi_reconfigure();
> +
>  	return err;
>  
>  }
> @@ -878,12 +899,11 @@ int proc_watchdog_cpumask(struct ctl_table *table, int write,
>  			 * a temporary cpumask, so we are likely not in a
>  			 * position to do much else to make things better.
>  			 */
> -#ifdef CONFIG_SOFTLOCKUP_DETECTOR
> -			if (smpboot_update_cpumask_percpu_thread(
> -				    &watchdog_threads, &watchdog_cpumask) != 0)
> +			if (watchdog_update_cpus() != 0)
>  				pr_err("cpumask update failed\n");
> -#endif
>  		}
> +
> +		watchdog_nmi_reconfigure();
>  	}
>  out:
>  	mutex_unlock(&watchdog_proc_mutex);
> -- 
> 2.11.0
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/4] watchdog: introduce arch_touch_nmi_watchdog()
  2017-05-25 13:55   ` Don Zickus
@ 2017-05-26  0:31     ` Nicholas Piggin
  2017-05-26 14:05       ` Don Zickus
  0 siblings, 1 reply; 14+ messages in thread
From: Nicholas Piggin @ 2017-05-26  0:31 UTC (permalink / raw)
  To: Don Zickus; +Cc: linux-kernel, linux-arch

On Thu, 25 May 2017 09:55:59 -0400
Don Zickus <dzickus@redhat.com> wrote:

> On Thu, May 25, 2017 at 06:28:54PM +1000, Nicholas Piggin wrote:
> > For architectures that define HAVE_NMI_WATCHDOG, instead of having
> > them provide the complete touch_nmi_watchdog() function, just have
> > them provide arch_touch_nmi_watchdog().
> > 
> > This gives the generic code more flexibility in implementing this
> > function, and arch implementations don't miss out on touching the
> > softlockup watchdog or other generic details.  
> 
> The idea makes sense.  I don't think you can have hld_touch_nmi_watchdog
> defined with arch_touch_nmi_watchdog, so I am wondering if it makes sense to
> combine them somehow.  Though renaming hld_touch_nmi_watchdog to
> arch_touch_nmi_watchdog sounds odd, I think it mimics the idea.

Yeah I agree it's not quite right, and I think using
arch_touch_nmi_watchdog would be fine for the hld, which makes sense
if you think of it as a utility or library function for architectures
that want a hardlockup watchdog and can use perf for it.

I can change that if you prefer. BTW the 0day picked up another
Kconfig compile bug, so I'll respin the series and include any changes
you like.

Thanks,
Nick

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/4] watchdog: provide watchdog_reconfigure() for arch watchdogs
  2017-05-25 14:08   ` Don Zickus
@ 2017-05-26  0:39     ` Nicholas Piggin
  2017-05-26 14:21       ` Don Zickus
  0 siblings, 1 reply; 14+ messages in thread
From: Nicholas Piggin @ 2017-05-26  0:39 UTC (permalink / raw)
  To: Don Zickus; +Cc: linux-kernel, linux-arch

On Thu, 25 May 2017 10:08:33 -0400
Don Zickus <dzickus@redhat.com> wrote:

> On Thu, May 25, 2017 at 06:28:56PM +1000, Nicholas Piggin wrote:
> > After reconfiguring watchdog sysctls etc., architecture specific
> > watchdogs may not get all their parameters updated.
> > 
> > watchdog_reconfigure() can be implemented to pull the new values
> > in and set the arch NMI watchdog.  
> 
> I understand the reason for this patch and I don't have any real objection
> on how it was implemented within the constraints of all the current logic.
> 
> I just wonder if the current logic should be adjusted to make the hardlockup
> detector, namely the perf implementation more separate so it can handle what
> you would like more cleanly.
> 
> The watchdog_nmi_reconfigure is sort of hackish, but it is hard to fault you
> based on how things are designed.  I am going to poke at it a little bit,
> but I will probably not find time to do much and accept what you have for
> now.

I actually agree with you. These patches are basically an initial bridge
to get us to decoupling hld-perf from hld-arch, but the code could
definitely use several more passes to clean things up.

One thing we want to be mindful of is some watchdogs are very light weight,
minimal, and some may not even want to call C code (at least from the NMI
and touch-watchdog paths). But having said that, it may not be a bad idea
to have implementations provide a watchdog driver struct with some of the
methods and reconfiguration they support. E.g., suspend/resume, stop/start
on CPUs, adjust timeouts, etc.).

I didn't want to go the whole hog and over-engineer something that doesn't
work though, so I'm hoping we can get the powerpc watchdog in, and then
keep working on the apis.

Let me know what you think after you poke at it though.

Thanks,
Nick

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/4] watchdog: introduce arch_touch_nmi_watchdog()
  2017-05-26  0:31     ` Nicholas Piggin
@ 2017-05-26 14:05       ` Don Zickus
  0 siblings, 0 replies; 14+ messages in thread
From: Don Zickus @ 2017-05-26 14:05 UTC (permalink / raw)
  To: Nicholas Piggin; +Cc: linux-kernel, linux-arch

On Fri, May 26, 2017 at 10:31:03AM +1000, Nicholas Piggin wrote:
> On Thu, 25 May 2017 09:55:59 -0400
> Don Zickus <dzickus@redhat.com> wrote:
> 
> > On Thu, May 25, 2017 at 06:28:54PM +1000, Nicholas Piggin wrote:
> > > For architectures that define HAVE_NMI_WATCHDOG, instead of having
> > > them provide the complete touch_nmi_watchdog() function, just have
> > > them provide arch_touch_nmi_watchdog().
> > > 
> > > This gives the generic code more flexibility in implementing this
> > > function, and arch implementations don't miss out on touching the
> > > softlockup watchdog or other generic details.  
> > 
> > The idea makes sense.  I don't think you can have hld_touch_nmi_watchdog
> > defined with arch_touch_nmi_watchdog, so I am wondering if it makes sense to
> > combine them somehow.  Though renaming hld_touch_nmi_watchdog to
> > arch_touch_nmi_watchdog sounds odd, I think it mimics the idea.
> 
> Yeah I agree it's not quite right, and I think using
> arch_touch_nmi_watchdog would be fine for the hld, which makes sense
> if you think of it as a utility or library function for architectures
> that want a hardlockup watchdog and can use perf for it.

Yeah, if you wouldn't mind trying that.  Over the last year it seems there
is a push to make the hld more of a separate thing if folks want to use
perf.  I have been trying to tweak it so it can be used in-place of the arch
solution or just use the arch solution.  And still retain the same function
calls.

Cheers,
Don

> 
> I can change that if you prefer. BTW the 0day picked up another
> Kconfig compile bug, so I'll respin the series and include any changes
> you like.
> 
> Thanks,
> Nick

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/4] watchdog: provide watchdog_reconfigure() for arch watchdogs
  2017-05-26  0:39     ` Nicholas Piggin
@ 2017-05-26 14:21       ` Don Zickus
  0 siblings, 0 replies; 14+ messages in thread
From: Don Zickus @ 2017-05-26 14:21 UTC (permalink / raw)
  To: Nicholas Piggin; +Cc: linux-kernel, linux-arch

On Fri, May 26, 2017 at 10:39:09AM +1000, Nicholas Piggin wrote:
> On Thu, 25 May 2017 10:08:33 -0400
> Don Zickus <dzickus@redhat.com> wrote:
> 
> > On Thu, May 25, 2017 at 06:28:56PM +1000, Nicholas Piggin wrote:
> > > After reconfiguring watchdog sysctls etc., architecture specific
> > > watchdogs may not get all their parameters updated.
> > > 
> > > watchdog_reconfigure() can be implemented to pull the new values
> > > in and set the arch NMI watchdog.  
> > 
> > I understand the reason for this patch and I don't have any real objection
> > on how it was implemented within the constraints of all the current logic.
> > 
> > I just wonder if the current logic should be adjusted to make the hardlockup
> > detector, namely the perf implementation more separate so it can handle what
> > you would like more cleanly.
> > 
> > The watchdog_nmi_reconfigure is sort of hackish, but it is hard to fault you
> > based on how things are designed.  I am going to poke at it a little bit,
> > but I will probably not find time to do much and accept what you have for
> > now.
> 
> I actually agree with you. These patches are basically an initial bridge
> to get us to decoupling hld-perf from hld-arch, but the code could
> definitely use several more passes to clean things up.

Makes sense. :-)

> 
> One thing we want to be mindful of is some watchdogs are very light weight,
> minimal, and some may not even want to call C code (at least from the NMI

Agreed.

> and touch-watchdog paths). But having said that, it may not be a bad idea
> to have implementations provide a watchdog driver struct with some of the
> methods and reconfiguration they support. E.g., suspend/resume, stop/start
> on CPUs, adjust timeouts, etc.).

Hehe.  I was hoping to avoid doing that, but it may lead there over time.

> 
> I didn't want to go the whole hog and over-engineer something that doesn't
> work though, so I'm hoping we can get the powerpc watchdog in, and then
> keep working on the apis.

Agreed. 

> 
> Let me know what you think after you poke at it though.

I will.

Cheers,
Don

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 4/4] watchdog: Provide watchdog_reconfigure() for arch watchdogs
  2017-05-30  1:26 [PATCH 0/4][V3] Improve watchdog config " Nicholas Piggin
@ 2017-05-30  1:26 ` Nicholas Piggin
  0 siblings, 0 replies; 14+ messages in thread
From: Nicholas Piggin @ 2017-05-30  1:26 UTC (permalink / raw)
  To: Don Zickus; +Cc: Nicholas Piggin, linux-kernel, linux-arch

After reconfiguring watchdog sysctls etc., architecture specific
watchdogs may not get all their parameters updated.

watchdog_reconfigure() can be implemented to pull the new values
in and set the arch NMI watchdog.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 kernel/watchdog.c | 29 +++++++++++++++++++++++++----
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index deb010505646..5397c637db2d 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -120,6 +120,11 @@ void __weak watchdog_nmi_disable(unsigned int cpu)
 {
 }
 
+void __weak watchdog_nmi_reconfigure(void)
+{
+}
+
+
 #ifdef CONFIG_SOFTLOCKUP_DETECTOR
 
 /* Helper for online, unparked cpus. */
@@ -597,6 +602,12 @@ static void watchdog_disable_all_cpus(void)
 	}
 }
 
+static int watchdog_update_cpus(void)
+{
+	return smpboot_update_cpumask_percpu_thread(
+		    &watchdog_threads, &watchdog_cpumask);
+}
+
 #else /* SOFTLOCKUP */
 static int watchdog_park_threads(void)
 {
@@ -616,6 +627,11 @@ static void watchdog_disable_all_cpus(void)
 {
 }
 
+static int watchdog_update_cpus(void)
+{
+	return 0;
+}
+
 static void set_sample_period(void)
 {
 }
@@ -648,6 +664,8 @@ int lockup_detector_suspend(void)
 		watchdog_enabled = 0;
 	}
 
+	watchdog_nmi_reconfigure();
+
 	mutex_unlock(&watchdog_proc_mutex);
 
 	return ret;
@@ -668,6 +686,8 @@ void lockup_detector_resume(void)
 	if (watchdog_running && !watchdog_suspended)
 		watchdog_unpark_threads();
 
+	watchdog_nmi_reconfigure();
+
 	mutex_unlock(&watchdog_proc_mutex);
 	put_online_cpus();
 }
@@ -693,6 +713,8 @@ static int proc_watchdog_update(void)
 	else
 		watchdog_disable_all_cpus();
 
+	watchdog_nmi_reconfigure();
+
 	return err;
 
 }
@@ -878,12 +900,11 @@ int proc_watchdog_cpumask(struct ctl_table *table, int write,
 			 * a temporary cpumask, so we are likely not in a
 			 * position to do much else to make things better.
 			 */
-#ifdef CONFIG_SOFTLOCKUP_DETECTOR
-			if (smpboot_update_cpumask_percpu_thread(
-				    &watchdog_threads, &watchdog_cpumask) != 0)
+			if (watchdog_update_cpus() != 0)
 				pr_err("cpumask update failed\n");
-#endif
 		}
+
+		watchdog_nmi_reconfigure();
 	}
 out:
 	mutex_unlock(&watchdog_proc_mutex);
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2017-05-30  1:27 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-25  8:28 [PATCH 0/4][V2] Improve watchdog config for arch watchdogs Nicholas Piggin
2017-05-25  8:28 ` [PATCH 1/4] watchdog: remove unused declaration Nicholas Piggin
2017-05-25  8:28 ` [PATCH 2/4] watchdog: introduce arch_touch_nmi_watchdog() Nicholas Piggin
2017-05-25 13:55   ` Don Zickus
2017-05-26  0:31     ` Nicholas Piggin
2017-05-26 14:05       ` Don Zickus
2017-05-25  8:28 ` [PATCH 3/4] watchdog: split out config options Nicholas Piggin
2017-05-25 11:30   ` kbuild test robot
2017-05-25 12:09   ` kbuild test robot
2017-05-25  8:28 ` [PATCH 4/4] watchdog: provide watchdog_reconfigure() for arch watchdogs Nicholas Piggin
2017-05-25 14:08   ` Don Zickus
2017-05-26  0:39     ` Nicholas Piggin
2017-05-26 14:21       ` Don Zickus
2017-05-30  1:26 [PATCH 0/4][V3] Improve watchdog config " Nicholas Piggin
2017-05-30  1:26 ` [PATCH 4/4] watchdog: Provide watchdog_reconfigure() " Nicholas Piggin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).