* [PATCH 0/4][V2] Improve watchdog config for arch watchdogs
@ 2017-05-25 8:28 Nicholas Piggin
2017-05-25 8:28 ` [PATCH 1/4] watchdog: remove unused declaration Nicholas Piggin
` (3 more replies)
0 siblings, 4 replies; 13+ messages in thread
From: Nicholas Piggin @ 2017-05-25 8:28 UTC (permalink / raw)
To: Don Zickus; +Cc: Nicholas Piggin, linux-kernel, linux-arch
Hi Don,
The kbuild 0day caught a compile bug on some archs, I think I've
got that fixed.
I'm resending without the powerpc watchdog example. However if you
decide to accept these changes for the next merge window, I may send
the powerpc patch through your tree with appropriate acks, because
it's not very intrusive.
So the basic idea of these 4 patches is to allow arch specific
(non-perf) hardlockup watchdogs to properly hook into the existing
watchdog interfaces (cmdline, sysctl, kernel apis).
sparc for example implements its own detector, but it can't use all
the same sysctls, some cmdline options are a bit different, etc.
powerpc wants to move to its own detector like sparc, but without
causing interface functionality to change.
Thanks,
Nick
Nicholas Piggin (4):
watchdog: remove unused declaration
watchdog: introduce arch_touch_nmi_watchdog()
watchdog: split up config options
watchdog: provide watchdog_reconfigure() for arch watchdogs
arch/blackfin/include/asm/nmi.h | 2 +
arch/blackfin/kernel/nmi.c | 2 +-
arch/mn10300/include/asm/nmi.h | 2 +
arch/mn10300/kernel/mn10300-watchdog-low.S | 8 +-
arch/mn10300/kernel/mn10300-watchdog.c | 2 +-
arch/powerpc/kernel/setup_64.c | 2 +-
arch/sparc/include/asm/nmi.h | 1 +
arch/sparc/kernel/nmi.c | 6 +-
arch/x86/kernel/apic/hw_nmi.c | 2 +-
include/linux/nmi.h | 58 ++++---
kernel/Makefile | 2 +-
kernel/sysctl.c | 18 +-
kernel/watchdog.c | 262 +++++++++++++++++++----------
kernel/watchdog_hld.c | 37 +---
lib/Kconfig.debug | 27 ++-
15 files changed, 261 insertions(+), 170 deletions(-)
--
2.11.0
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 1/4] watchdog: remove unused declaration
2017-05-25 8:28 [PATCH 0/4][V2] Improve watchdog config for arch watchdogs Nicholas Piggin
@ 2017-05-25 8:28 ` Nicholas Piggin
2017-05-25 8:28 ` [PATCH 2/4] watchdog: introduce arch_touch_nmi_watchdog() Nicholas Piggin
` (2 subsequent siblings)
3 siblings, 0 replies; 13+ messages in thread
From: Nicholas Piggin @ 2017-05-25 8:28 UTC (permalink / raw)
To: Don Zickus; +Cc: Nicholas Piggin, linux-kernel, linux-arch
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
include/linux/nmi.h | 3 ---
1 file changed, 3 deletions(-)
diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index aa3cd0878270..5e2e57536d98 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -12,9 +12,6 @@ extern void touch_softlockup_watchdog_sched(void);
extern void touch_softlockup_watchdog(void);
extern void touch_softlockup_watchdog_sync(void);
extern void touch_all_softlockup_watchdogs(void);
-extern int proc_dowatchdog_thresh(struct ctl_table *table, int write,
- void __user *buffer,
- size_t *lenp, loff_t *ppos);
extern unsigned int softlockup_panic;
extern unsigned int hardlockup_panic;
void lockup_detector_init(void);
--
2.11.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 2/4] watchdog: introduce arch_touch_nmi_watchdog()
2017-05-25 8:28 [PATCH 0/4][V2] Improve watchdog config for arch watchdogs Nicholas Piggin
2017-05-25 8:28 ` [PATCH 1/4] watchdog: remove unused declaration Nicholas Piggin
@ 2017-05-25 8:28 ` Nicholas Piggin
2017-05-25 13:55 ` Don Zickus
2017-05-25 8:28 ` [PATCH 3/4] watchdog: split out config options Nicholas Piggin
2017-05-25 8:28 ` [PATCH 4/4] watchdog: provide watchdog_reconfigure() for arch watchdogs Nicholas Piggin
3 siblings, 1 reply; 13+ messages in thread
From: Nicholas Piggin @ 2017-05-25 8:28 UTC (permalink / raw)
To: Don Zickus; +Cc: Nicholas Piggin, linux-kernel, linux-arch
For architectures that define HAVE_NMI_WATCHDOG, instead of having
them provide the complete touch_nmi_watchdog() function, just have
them provide arch_touch_nmi_watchdog().
This gives the generic code more flexibility in implementing this
function, and arch implementations don't miss out on touching the
softlockup watchdog or other generic details.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/blackfin/include/asm/nmi.h | 2 ++
arch/blackfin/kernel/nmi.c | 2 +-
arch/mn10300/include/asm/nmi.h | 2 ++
arch/mn10300/kernel/mn10300-watchdog-low.S | 8 ++++----
arch/mn10300/kernel/mn10300-watchdog.c | 2 +-
arch/sparc/include/asm/nmi.h | 1 +
arch/sparc/kernel/nmi.c | 6 ++----
include/linux/nmi.h | 26 +++++++++++++++-----------
kernel/watchdog_hld.c | 5 ++---
9 files changed, 30 insertions(+), 24 deletions(-)
diff --git a/arch/blackfin/include/asm/nmi.h b/arch/blackfin/include/asm/nmi.h
index b9caac4fcfd8..6e7047ceec79 100644
--- a/arch/blackfin/include/asm/nmi.h
+++ b/arch/blackfin/include/asm/nmi.h
@@ -7,6 +7,8 @@
#ifndef _BFIN_NMI_H_
#define _BFIN_NMI_H_
+void arch_touch_nmi_watchdog(void);
+
#include <linux/nmi.h>
#endif
diff --git a/arch/blackfin/kernel/nmi.c b/arch/blackfin/kernel/nmi.c
index 633c37083e87..1e714329fe8a 100644
--- a/arch/blackfin/kernel/nmi.c
+++ b/arch/blackfin/kernel/nmi.c
@@ -190,7 +190,7 @@ static int __init init_nmi_wdt(void)
}
device_initcall(init_nmi_wdt);
-void touch_nmi_watchdog(void)
+void arch_touch_nmi_watchdog(void)
{
atomic_set(&nmi_touched[smp_processor_id()], 1);
}
diff --git a/arch/mn10300/include/asm/nmi.h b/arch/mn10300/include/asm/nmi.h
index f3671cbbc117..aee8aa22d9ee 100644
--- a/arch/mn10300/include/asm/nmi.h
+++ b/arch/mn10300/include/asm/nmi.h
@@ -11,4 +11,6 @@
#ifndef _ASM_NMI_H
#define _ASM_NMI_H
+void arch_touch_nmi_watchdog(void);
+
#endif /* _ASM_NMI_H */
diff --git a/arch/mn10300/kernel/mn10300-watchdog-low.S b/arch/mn10300/kernel/mn10300-watchdog-low.S
index f2f5c9cfaabd..34f8773de7d0 100644
--- a/arch/mn10300/kernel/mn10300-watchdog-low.S
+++ b/arch/mn10300/kernel/mn10300-watchdog-low.S
@@ -50,9 +50,9 @@ watchdog_handler:
# we can't inline it)
#
###############################################################################
- .globl touch_nmi_watchdog
- .type touch_nmi_watchdog,@function
-touch_nmi_watchdog:
+ .globl arch_touch_nmi_watchdog
+ .type arch_touch_nmi_watchdog,@function
+arch_touch_nmi_watchdog:
clr d0
clr d1
mov watchdog_alert_counter, a0
@@ -63,4 +63,4 @@ touch_nmi_watchdog:
lne
ret [],0
- .size touch_nmi_watchdog,.-touch_nmi_watchdog
+ .size arch_touch_nmi_watchdog,.-arch_touch_nmi_watchdog
diff --git a/arch/mn10300/kernel/mn10300-watchdog.c b/arch/mn10300/kernel/mn10300-watchdog.c
index a2d8e6938d67..0d5641beadf5 100644
--- a/arch/mn10300/kernel/mn10300-watchdog.c
+++ b/arch/mn10300/kernel/mn10300-watchdog.c
@@ -31,7 +31,7 @@ static unsigned int watchdog;
static unsigned int watchdog_hz = 1;
unsigned int watchdog_alert_counter[NR_CPUS];
-EXPORT_SYMBOL(touch_nmi_watchdog);
+EXPORT_SYMBOL(arch_touch_nmi_watchdog);
/*
* the best way to detect whether a CPU has a 'hard lockup' problem
diff --git a/arch/sparc/include/asm/nmi.h b/arch/sparc/include/asm/nmi.h
index 26ad2b2607c6..284eac3ffaf2 100644
--- a/arch/sparc/include/asm/nmi.h
+++ b/arch/sparc/include/asm/nmi.h
@@ -7,6 +7,7 @@ void nmi_adjust_hz(unsigned int new_hz);
extern atomic_t nmi_active;
+void arch_touch_nmi_watchdog(void);
void start_nmi_watchdog(void *unused);
void stop_nmi_watchdog(void *unused);
diff --git a/arch/sparc/kernel/nmi.c b/arch/sparc/kernel/nmi.c
index 95e73c63c99d..048ad783ea3f 100644
--- a/arch/sparc/kernel/nmi.c
+++ b/arch/sparc/kernel/nmi.c
@@ -51,7 +51,7 @@ static DEFINE_PER_CPU(unsigned int, last_irq_sum);
static DEFINE_PER_CPU(long, alert_counter);
static DEFINE_PER_CPU(int, nmi_touch);
-void touch_nmi_watchdog(void)
+void arch_touch_nmi_watchdog(void)
{
if (atomic_read(&nmi_active)) {
int cpu;
@@ -61,10 +61,8 @@ void touch_nmi_watchdog(void)
per_cpu(nmi_touch, cpu) = 1;
}
}
-
- touch_softlockup_watchdog();
}
-EXPORT_SYMBOL(touch_nmi_watchdog);
+EXPORT_SYMBOL(arch_touch_nmi_watchdog);
static void die_nmi(const char *str, struct pt_regs *regs, int do_panic)
{
diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index 5e2e57536d98..6ea465a842a1 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -6,6 +6,9 @@
#include <linux/sched.h>
#include <asm/irq.h>
+#if defined(CONFIG_HAVE_NMI_WATCHDOG)
+#include <asm/nmi.h>
+#endif
#ifdef CONFIG_LOCKUP_DETECTOR
extern void touch_softlockup_watchdog_sched(void);
@@ -58,6 +61,14 @@ static inline void reset_hung_task_detector(void)
#define NMI_WATCHDOG_ENABLED (1 << NMI_WATCHDOG_ENABLED_BIT)
#define SOFT_WATCHDOG_ENABLED (1 << SOFT_WATCHDOG_ENABLED_BIT)
+#if defined(CONFIG_HARDLOCKUP_DETECTOR)
+extern void hld_touch_nmi_watchdog(void);
+extern void hardlockup_detector_disable(void);
+#else
+static inline void hld_touch_nmi_watchdog(void) {}
+static inline void hardlockup_detector_disable(void) {}
+#endif
+
/**
* touch_nmi_watchdog - restart NMI watchdog timeout.
*
@@ -65,21 +76,14 @@ static inline void reset_hung_task_detector(void)
* may be used to reset the timeout - for code which intentionally
* disables interrupts for a long time. This call is stateless.
*/
-#if defined(CONFIG_HAVE_NMI_WATCHDOG) || defined(CONFIG_HARDLOCKUP_DETECTOR)
-#include <asm/nmi.h>
-extern void touch_nmi_watchdog(void);
-#else
static inline void touch_nmi_watchdog(void)
{
+#if defined(CONFIG_HAVE_NMI_WATCHDOG)
+ arch_touch_nmi_watchdog();
+#endif
+ hld_touch_nmi_watchdog();
touch_softlockup_watchdog();
}
-#endif
-
-#if defined(CONFIG_HARDLOCKUP_DETECTOR)
-extern void hardlockup_detector_disable(void);
-#else
-static inline void hardlockup_detector_disable(void) {}
-#endif
/*
* Create trigger_all_cpu_backtrace() out of the arch-provided
diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index 54a427d1f344..e0d7a7c43fb5 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -56,7 +56,7 @@ static int __init hardlockup_panic_setup(char *str)
}
__setup("nmi_watchdog=", hardlockup_panic_setup);
-void touch_nmi_watchdog(void)
+void hld_touch_nmi_watchdog(void)
{
/*
* Using __raw here because some code paths have
@@ -66,9 +66,8 @@ void touch_nmi_watchdog(void)
* going off.
*/
raw_cpu_write(watchdog_nmi_touch, true);
- touch_softlockup_watchdog();
}
-EXPORT_SYMBOL(touch_nmi_watchdog);
+EXPORT_SYMBOL(hld_touch_nmi_watchdog);
static struct perf_event_attr wd_hw_attr = {
.type = PERF_TYPE_HARDWARE,
--
2.11.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 3/4] watchdog: split out config options
2017-05-25 8:28 [PATCH 0/4][V2] Improve watchdog config for arch watchdogs Nicholas Piggin
2017-05-25 8:28 ` [PATCH 1/4] watchdog: remove unused declaration Nicholas Piggin
2017-05-25 8:28 ` [PATCH 2/4] watchdog: introduce arch_touch_nmi_watchdog() Nicholas Piggin
@ 2017-05-25 8:28 ` Nicholas Piggin
2017-05-25 11:30 ` kbuild test robot
2017-05-25 12:09 ` kbuild test robot
2017-05-25 8:28 ` [PATCH 4/4] watchdog: provide watchdog_reconfigure() for arch watchdogs Nicholas Piggin
3 siblings, 2 replies; 13+ messages in thread
From: Nicholas Piggin @ 2017-05-25 8:28 UTC (permalink / raw)
To: Don Zickus; +Cc: Nicholas Piggin, linux-kernel, linux-arch
Split SOFTLOCKUP_DETECTOR from LOCKUP_DETECTOR, and split
HARDLOCKUP_DETECTOR_PERF from HARDLOCKUP_DETECTOR.
LOCKUP_DETECTOR provides the boot, sysctl, and programming interfaces
for lockup detectors. An architecture that defines HAVE_NMI_WATCHDOG
need not use this this if it has a very basic watchdog or uses its own
options and interfaces (e.g., sparc). touch_nmi_watchdog() will
continue to call their arch_touch_nmi_watchdog().
HARDLOCKUP_DETECTOR_PERF is the perf-based lockup detector.
HARDLOCKUP_DETECTOR is the framework for arch NMI_WATCHDOG hard lockup
detectors that conform to the LOCKUP_DETECTOR interfaces.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kernel/setup_64.c | 2 +-
arch/x86/kernel/apic/hw_nmi.c | 2 +-
include/linux/nmi.h | 35 ++++--
kernel/Makefile | 2 +-
kernel/sysctl.c | 18 ++--
kernel/watchdog.c | 238 ++++++++++++++++++++++++++---------------
kernel/watchdog_hld.c | 32 ------
lib/Kconfig.debug | 27 +++--
8 files changed, 212 insertions(+), 144 deletions(-)
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index f35ff9dea4fb..ab650905f75a 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -727,7 +727,7 @@ struct ppc_pci_io ppc_pci_io;
EXPORT_SYMBOL(ppc_pci_io);
#endif
-#ifdef CONFIG_HARDLOCKUP_DETECTOR
+#ifdef CONFIG_HARDLOCKUP_DETECTOR_PERF
u64 hw_nmi_get_sample_period(int watchdog_thresh)
{
return ppc_proc_freq * watchdog_thresh;
diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
index c73c9fb281e1..d6f387780849 100644
--- a/arch/x86/kernel/apic/hw_nmi.c
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -19,7 +19,7 @@
#include <linux/init.h>
#include <linux/delay.h>
-#ifdef CONFIG_HARDLOCKUP_DETECTOR
+#ifdef CONFIG_HARDLOCKUP_DETECTOR_PERF
u64 hw_nmi_get_sample_period(int watchdog_thresh)
{
return (u64)(cpu_khz) * 1000 * watchdog_thresh;
diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index 6ea465a842a1..c11dc695fb0b 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -11,13 +11,21 @@
#endif
#ifdef CONFIG_LOCKUP_DETECTOR
+void lockup_detector_init(void);
+#else
+static inline void lockup_detector_init(void)
+{
+}
+#endif
+
+#ifdef CONFIG_SOFTLOCKUP_DETECTOR
extern void touch_softlockup_watchdog_sched(void);
extern void touch_softlockup_watchdog(void);
extern void touch_softlockup_watchdog_sync(void);
extern void touch_all_softlockup_watchdogs(void);
extern unsigned int softlockup_panic;
-extern unsigned int hardlockup_panic;
-void lockup_detector_init(void);
+extern int soft_watchdog_enabled;
+extern atomic_t watchdog_park_in_progress;
#else
static inline void touch_softlockup_watchdog_sched(void)
{
@@ -31,9 +39,6 @@ static inline void touch_softlockup_watchdog_sync(void)
static inline void touch_all_softlockup_watchdogs(void)
{
}
-static inline void lockup_detector_init(void)
-{
-}
#endif
#ifdef CONFIG_DETECT_HUNG_TASK
@@ -61,14 +66,19 @@ static inline void reset_hung_task_detector(void)
#define NMI_WATCHDOG_ENABLED (1 << NMI_WATCHDOG_ENABLED_BIT)
#define SOFT_WATCHDOG_ENABLED (1 << SOFT_WATCHDOG_ENABLED_BIT)
-#if defined(CONFIG_HARDLOCKUP_DETECTOR)
-extern void hld_touch_nmi_watchdog(void);
+#ifdef CONFIG_HARDLOCKUP_DETECTOR
extern void hardlockup_detector_disable(void);
+extern unsigned int hardlockup_panic;
#else
-static inline void hld_touch_nmi_watchdog(void) {}
static inline void hardlockup_detector_disable(void) {}
#endif
+#ifdef CONFIG_HARDLOCKUP_DETECTOR_PERF
+extern void hld_touch_nmi_watchdog(void);
+#else
+static inline void hld_touch_nmi_watchdog(void) {}
+#endif
+
/**
* touch_nmi_watchdog - restart NMI watchdog timeout.
*
@@ -140,15 +150,18 @@ static inline bool trigger_single_cpu_backtrace(int cpu)
}
#endif
-#ifdef CONFIG_LOCKUP_DETECTOR
+#ifdef CONFIG_HARDLOCKUP_DETECTOR_PERF
u64 hw_nmi_get_sample_period(int watchdog_thresh);
+#endif
+
+#ifdef CONFIG_LOCKUP_DETECTOR
extern int nmi_watchdog_enabled;
-extern int soft_watchdog_enabled;
extern int watchdog_user_enabled;
extern int watchdog_thresh;
extern unsigned long watchdog_enabled;
+extern struct cpumask watchdog_cpumask;
extern unsigned long *watchdog_cpumask_bits;
-extern atomic_t watchdog_park_in_progress;
+extern int __read_mostly watchdog_suspended;
#ifdef CONFIG_SMP
extern int sysctl_softlockup_all_cpu_backtrace;
extern int sysctl_hardlockup_all_cpu_backtrace;
diff --git a/kernel/Makefile b/kernel/Makefile
index 72aa080f91f0..4cb8e8b23c6e 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -82,7 +82,7 @@ obj-$(CONFIG_KPROBES) += kprobes.o
obj-$(CONFIG_KGDB) += debug/
obj-$(CONFIG_DETECT_HUNG_TASK) += hung_task.o
obj-$(CONFIG_LOCKUP_DETECTOR) += watchdog.o
-obj-$(CONFIG_HARDLOCKUP_DETECTOR) += watchdog_hld.o
+obj-$(CONFIG_HARDLOCKUP_DETECTOR_PERF) += watchdog_hld.o
obj-$(CONFIG_SECCOMP) += seccomp.o
obj-$(CONFIG_RELAY) += relay.o
obj-$(CONFIG_SYSCTL) += utsname_sysctl.o
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 4dfba1a76cc3..fa37faa89143 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -873,13 +873,21 @@ static struct ctl_table kern_table[] = {
.mode = 0644,
.proc_handler = proc_nmi_watchdog,
.extra1 = &zero,
-#if defined(CONFIG_HAVE_NMI_WATCHDOG) || defined(CONFIG_HARDLOCKUP_DETECTOR)
+#ifdef CONFIG_HARDLOCKUP_DETECTOR
.extra2 = &one,
#else
.extra2 = &zero,
#endif
},
{
+ .procname = "watchdog_cpumask",
+ .data = &watchdog_cpumask_bits,
+ .maxlen = NR_CPUS,
+ .mode = 0644,
+ .proc_handler = proc_watchdog_cpumask,
+ },
+#ifdef CONFIG_SOFTLOCKUP_DETECTOR
+ {
.procname = "soft_watchdog",
.data = &soft_watchdog_enabled,
.maxlen = sizeof (int),
@@ -889,13 +897,6 @@ static struct ctl_table kern_table[] = {
.extra2 = &one,
},
{
- .procname = "watchdog_cpumask",
- .data = &watchdog_cpumask_bits,
- .maxlen = NR_CPUS,
- .mode = 0644,
- .proc_handler = proc_watchdog_cpumask,
- },
- {
.procname = "softlockup_panic",
.data = &softlockup_panic,
.maxlen = sizeof(int),
@@ -904,6 +905,7 @@ static struct ctl_table kern_table[] = {
.extra1 = &zero,
.extra2 = &one,
},
+#endif
#ifdef CONFIG_HARDLOCKUP_DETECTOR
{
.procname = "hardlockup_panic",
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 03e0b69bb5bf..deb010505646 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -29,15 +29,55 @@
#include <linux/kvm_para.h>
#include <linux/kthread.h>
+/* Watchdog configuration */
static DEFINE_MUTEX(watchdog_proc_mutex);
-#if defined(CONFIG_HAVE_NMI_WATCHDOG) || defined(CONFIG_HARDLOCKUP_DETECTOR)
+int __read_mostly nmi_watchdog_enabled;
+
+#ifdef CONFIG_HARDLOCKUP_DETECTOR
unsigned long __read_mostly watchdog_enabled = SOFT_WATCHDOG_ENABLED|NMI_WATCHDOG_ENABLED;
+
+/* boot commands */
+/*
+ * Should we panic when a soft-lockup or hard-lockup occurs:
+ */
+unsigned int __read_mostly hardlockup_panic =
+ CONFIG_BOOTPARAM_HARDLOCKUP_PANIC_VALUE;
+/*
+ * We may not want to enable hard lockup detection by default in all cases,
+ * for example when running the kernel as a guest on a hypervisor. In these
+ * cases this function can be called to disable hard lockup detection. This
+ * function should only be executed once by the boot processor before the
+ * kernel command line parameters are parsed, because otherwise it is not
+ * possible to override this in hardlockup_panic_setup().
+ */
+void hardlockup_detector_disable(void)
+{
+ watchdog_enabled &= ~NMI_WATCHDOG_ENABLED;
+}
+
+static int __init hardlockup_panic_setup(char *str)
+{
+ if (!strncmp(str, "panic", 5))
+ hardlockup_panic = 1;
+ else if (!strncmp(str, "nopanic", 7))
+ hardlockup_panic = 0;
+ else if (!strncmp(str, "0", 1))
+ watchdog_enabled &= ~NMI_WATCHDOG_ENABLED;
+ else if (!strncmp(str, "1", 1))
+ watchdog_enabled |= NMI_WATCHDOG_ENABLED;
+ return 1;
+}
+__setup("nmi_watchdog=", hardlockup_panic_setup);
+
#else
unsigned long __read_mostly watchdog_enabled = SOFT_WATCHDOG_ENABLED;
#endif
-int __read_mostly nmi_watchdog_enabled;
+
+#ifdef CONFIG_SOFTLOCKUP_DETECTOR
int __read_mostly soft_watchdog_enabled;
+#endif
+
int __read_mostly watchdog_user_enabled;
int __read_mostly watchdog_thresh = 10;
@@ -45,15 +85,9 @@ int __read_mostly watchdog_thresh = 10;
int __read_mostly sysctl_softlockup_all_cpu_backtrace;
int __read_mostly sysctl_hardlockup_all_cpu_backtrace;
#endif
-static struct cpumask watchdog_cpumask __read_mostly;
+struct cpumask watchdog_cpumask __read_mostly;
unsigned long *watchdog_cpumask_bits = cpumask_bits(&watchdog_cpumask);
-/* Helper for online, unparked cpus. */
-#define for_each_watchdog_cpu(cpu) \
- for_each_cpu_and((cpu), cpu_online_mask, &watchdog_cpumask)
-
-atomic_t watchdog_park_in_progress = ATOMIC_INIT(0);
-
/*
* The 'watchdog_running' variable is set to 1 when the watchdog threads
* are registered/started and is set to 0 when the watchdog threads are
@@ -72,7 +106,27 @@ static int __read_mostly watchdog_running;
* of 'watchdog_running' cannot change while the watchdog is deactivated
* temporarily (see related code in 'proc' handlers).
*/
-static int __read_mostly watchdog_suspended;
+int __read_mostly watchdog_suspended;
+
+/*
+ * These functions can be overridden if an architecture implements its
+ * own hardlockup detector.
+ */
+int __weak watchdog_nmi_enable(unsigned int cpu)
+{
+ return 0;
+}
+void __weak watchdog_nmi_disable(unsigned int cpu)
+{
+}
+
+#ifdef CONFIG_SOFTLOCKUP_DETECTOR
+
+/* Helper for online, unparked cpus. */
+#define for_each_watchdog_cpu(cpu) \
+ for_each_cpu_and((cpu), cpu_online_mask, &watchdog_cpumask)
+
+atomic_t watchdog_park_in_progress = ATOMIC_INIT(0);
static u64 __read_mostly sample_period;
@@ -120,6 +174,7 @@ static int __init softlockup_all_cpu_backtrace_setup(char *str)
return 1;
}
__setup("softlockup_all_cpu_backtrace=", softlockup_all_cpu_backtrace_setup);
+#ifdef CONFIG_HARDLOCKUP_DETECTOR
static int __init hardlockup_all_cpu_backtrace_setup(char *str)
{
sysctl_hardlockup_all_cpu_backtrace =
@@ -128,6 +183,7 @@ static int __init hardlockup_all_cpu_backtrace_setup(char *str)
}
__setup("hardlockup_all_cpu_backtrace=", hardlockup_all_cpu_backtrace_setup);
#endif
+#endif
/*
* Hard-lockup warnings should be triggered after just a few seconds. Soft-
@@ -213,18 +269,6 @@ void touch_softlockup_watchdog_sync(void)
__this_cpu_write(watchdog_touch_ts, 0);
}
-/* watchdog detector functions */
-bool is_hardlockup(void)
-{
- unsigned long hrint = __this_cpu_read(hrtimer_interrupts);
-
- if (__this_cpu_read(hrtimer_interrupts_saved) == hrint)
- return true;
-
- __this_cpu_write(hrtimer_interrupts_saved, hrint);
- return false;
-}
-
static int is_softlockup(unsigned long touch_ts)
{
unsigned long now = get_timestamp();
@@ -237,21 +281,21 @@ static int is_softlockup(unsigned long touch_ts)
return 0;
}
-static void watchdog_interrupt_count(void)
+/* watchdog detector functions */
+bool is_hardlockup(void)
{
- __this_cpu_inc(hrtimer_interrupts);
-}
+ unsigned long hrint = __this_cpu_read(hrtimer_interrupts);
-/*
- * These two functions are mostly architecture specific
- * defining them as weak here.
- */
-int __weak watchdog_nmi_enable(unsigned int cpu)
-{
- return 0;
+ if (__this_cpu_read(hrtimer_interrupts_saved) == hrint)
+ return true;
+
+ __this_cpu_write(hrtimer_interrupts_saved, hrint);
+ return false;
}
-void __weak watchdog_nmi_disable(unsigned int cpu)
+
+static void watchdog_interrupt_count(void)
{
+ __this_cpu_inc(hrtimer_interrupts);
}
static int watchdog_enable_all_cpus(void);
@@ -502,57 +546,6 @@ static void watchdog_unpark_threads(void)
kthread_unpark(per_cpu(softlockup_watchdog, cpu));
}
-/*
- * Suspend the hard and soft lockup detector by parking the watchdog threads.
- */
-int lockup_detector_suspend(void)
-{
- int ret = 0;
-
- get_online_cpus();
- mutex_lock(&watchdog_proc_mutex);
- /*
- * Multiple suspend requests can be active in parallel (counted by
- * the 'watchdog_suspended' variable). If the watchdog threads are
- * running, the first caller takes care that they will be parked.
- * The state of 'watchdog_running' cannot change while a suspend
- * request is active (see related code in 'proc' handlers).
- */
- if (watchdog_running && !watchdog_suspended)
- ret = watchdog_park_threads();
-
- if (ret == 0)
- watchdog_suspended++;
- else {
- watchdog_disable_all_cpus();
- pr_err("Failed to suspend lockup detectors, disabled\n");
- watchdog_enabled = 0;
- }
-
- mutex_unlock(&watchdog_proc_mutex);
-
- return ret;
-}
-
-/*
- * Resume the hard and soft lockup detector by unparking the watchdog threads.
- */
-void lockup_detector_resume(void)
-{
- mutex_lock(&watchdog_proc_mutex);
-
- watchdog_suspended--;
- /*
- * The watchdog threads are unparked if they were previously running
- * and if there is no more active suspend request.
- */
- if (watchdog_running && !watchdog_suspended)
- watchdog_unpark_threads();
-
- mutex_unlock(&watchdog_proc_mutex);
- put_online_cpus();
-}
-
static int update_watchdog_all_cpus(void)
{
int ret;
@@ -604,6 +597,81 @@ static void watchdog_disable_all_cpus(void)
}
}
+#else /* SOFTLOCKUP */
+static int watchdog_park_threads(void)
+{
+ return 0;
+}
+
+static void watchdog_unpark_threads(void)
+{
+}
+
+static int watchdog_enable_all_cpus(void)
+{
+ return 0;
+}
+
+static void watchdog_disable_all_cpus(void)
+{
+}
+
+static void set_sample_period(void)
+{
+}
+#endif /* SOFTLOCKUP */
+
+/*
+ * Suspend the hard and soft lockup detector by parking the watchdog threads.
+ */
+int lockup_detector_suspend(void)
+{
+ int ret = 0;
+
+ get_online_cpus();
+ mutex_lock(&watchdog_proc_mutex);
+ /*
+ * Multiple suspend requests can be active in parallel (counted by
+ * the 'watchdog_suspended' variable). If the watchdog threads are
+ * running, the first caller takes care that they will be parked.
+ * The state of 'watchdog_running' cannot change while a suspend
+ * request is active (see related code in 'proc' handlers).
+ */
+ if (watchdog_running && !watchdog_suspended)
+ ret = watchdog_park_threads();
+
+ if (ret == 0)
+ watchdog_suspended++;
+ else {
+ watchdog_disable_all_cpus();
+ pr_err("Failed to suspend lockup detectors, disabled\n");
+ watchdog_enabled = 0;
+ }
+
+ mutex_unlock(&watchdog_proc_mutex);
+
+ return ret;
+}
+
+/*
+ * Resume the hard and soft lockup detector by unparking the watchdog threads.
+ */
+void lockup_detector_resume(void)
+{
+ mutex_lock(&watchdog_proc_mutex);
+
+ watchdog_suspended--;
+ /*
+ * The watchdog threads are unparked if they were previously running
+ * and if there is no more active suspend request.
+ */
+ if (watchdog_running && !watchdog_suspended)
+ watchdog_unpark_threads();
+
+ mutex_unlock(&watchdog_proc_mutex);
+ put_online_cpus();
+}
+
#ifdef CONFIG_SYSCTL
/*
@@ -810,9 +878,11 @@ int proc_watchdog_cpumask(struct ctl_table *table, int write,
* a temporary cpumask, so we are likely not in a
* position to do much else to make things better.
*/
+#ifdef CONFIG_SOFTLOCKUP_DETECTOR
if (smpboot_update_cpumask_percpu_thread(
&watchdog_threads, &watchdog_cpumask) != 0)
pr_err("cpumask update failed\n");
+#endif
}
}
out:
diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index e0d7a7c43fb5..ffd4fa59fbde 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -22,39 +22,7 @@ static DEFINE_PER_CPU(bool, hard_watchdog_warn);
static DEFINE_PER_CPU(bool, watchdog_nmi_touch);
static DEFINE_PER_CPU(struct perf_event *, watchdog_ev);
-/* boot commands */
-/*
- * Should we panic when a soft-lockup or hard-lockup occurs:
- */
-unsigned int __read_mostly hardlockup_panic =
- CONFIG_BOOTPARAM_HARDLOCKUP_PANIC_VALUE;
static unsigned long hardlockup_allcpu_dumped;
-/*
- * We may not want to enable hard lockup detection by default in all cases,
- * for example when running the kernel as a guest on a hypervisor. In these
- * cases this function can be called to disable hard lockup detection. This
- * function should only be executed once by the boot processor before the
- * kernel command line parameters are parsed, because otherwise it is not
- * possible to override this in hardlockup_panic_setup().
- */
-void hardlockup_detector_disable(void)
-{
- watchdog_enabled &= ~NMI_WATCHDOG_ENABLED;
-}
-
-static int __init hardlockup_panic_setup(char *str)
-{
- if (!strncmp(str, "panic", 5))
- hardlockup_panic = 1;
- else if (!strncmp(str, "nopanic", 7))
- hardlockup_panic = 0;
- else if (!strncmp(str, "0", 1))
- watchdog_enabled &= ~NMI_WATCHDOG_ENABLED;
- else if (!strncmp(str, "1", 1))
- watchdog_enabled |= NMI_WATCHDOG_ENABLED;
- return 1;
-}
-__setup("nmi_watchdog=", hardlockup_panic_setup);
void hld_touch_nmi_watchdog(void)
{
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index e4587ebe52c7..74d5ad667426 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -801,10 +801,25 @@ config LOCKUP_DETECTOR
The frequency of hrtimer and NMI events and the soft and hard lockup
thresholds can be controlled through the sysctl watchdog_thresh.
+config SOFTLOCKUP_DETECTOR
+ bool "Detect Soft Lockups"
+ depends on LOCKUP_DETECTOR
+
+config HARDLOCKUP_DETECTOR_PERF
+ bool
+ select SOFTLOCKUP_DETECTOR
+
+#
+# arch/ can define HAVE_NMI_WATCHDOG to provide their own NMI watchdog
+# rather than the perf based detector.
+#
+# The arch may do its own thing, or select HARDLOCKUP_DETECTOR, which
+#
config HARDLOCKUP_DETECTOR
- def_bool y
- depends on LOCKUP_DETECTOR && !HAVE_NMI_WATCHDOG
- depends on PERF_EVENTS && HAVE_PERF_EVENTS_NMI
+ bool "Detect Hard Lockups"
+ depends on LOCKUP_DETECTOR
+ depends on !HAVE_NMI_WATCHDOG || (PERF_EVENTS && HAVE_PERF_EVENTS_NMI)
+ select HARDLOCKUP_DETECTOR_PERF if !HAVE_NMI_WATCHDOG
config BOOTPARAM_HARDLOCKUP_PANIC
bool "Panic (Reboot) On Hard Lockups"
@@ -826,7 +841,7 @@ config BOOTPARAM_HARDLOCKUP_PANIC_VALUE
config BOOTPARAM_SOFTLOCKUP_PANIC
bool "Panic (Reboot) On Soft Lockups"
- depends on LOCKUP_DETECTOR
+ depends on SOFTLOCKUP_DETECTOR
help
Say Y here to enable the kernel to panic on "soft lockups",
which are bugs that cause the kernel to loop in kernel
@@ -843,7 +858,7 @@ config BOOTPARAM_SOFTLOCKUP_PANIC
config BOOTPARAM_SOFTLOCKUP_PANIC_VALUE
int
- depends on LOCKUP_DETECTOR
+ depends on SOFTLOCKUP_DETECTOR
range 0 1
default 0 if !BOOTPARAM_SOFTLOCKUP_PANIC
default 1 if BOOTPARAM_SOFTLOCKUP_PANIC
@@ -851,7 +866,7 @@ config BOOTPARAM_SOFTLOCKUP_PANIC_VALUE
config DETECT_HUNG_TASK
bool "Detect Hung Tasks"
depends on DEBUG_KERNEL
- default LOCKUP_DETECTOR
+ default SOFTLOCKUP_DETECTOR
help
Say Y here to enable the kernel to detect "hung tasks",
which are bugs that cause the task to be stuck in
--
2.11.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 4/4] watchdog: provide watchdog_reconfigure() for arch watchdogs
2017-05-25 8:28 [PATCH 0/4][V2] Improve watchdog config for arch watchdogs Nicholas Piggin
` (2 preceding siblings ...)
2017-05-25 8:28 ` [PATCH 3/4] watchdog: split out config options Nicholas Piggin
@ 2017-05-25 8:28 ` Nicholas Piggin
2017-05-25 14:08 ` Don Zickus
3 siblings, 1 reply; 13+ messages in thread
From: Nicholas Piggin @ 2017-05-25 8:28 UTC (permalink / raw)
To: Don Zickus; +Cc: Nicholas Piggin, linux-kernel, linux-arch
After reconfiguring watchdog sysctls etc., architecture specific
watchdogs may not get all their parameters updated.
watchdog_reconfigure() can be implemented to pull the new values
in and set the arch NMI watchdog.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
kernel/watchdog.c | 28 ++++++++++++++++++++++++----
1 file changed, 24 insertions(+), 4 deletions(-)
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index deb010505646..d2996f5bf551 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -120,6 +120,11 @@ void __weak watchdog_nmi_disable(unsigned int cpu)
{
}
+void __weak watchdog_nmi_reconfigure(void)
+{
+}
+
+
#ifdef CONFIG_SOFTLOCKUP_DETECTOR
/* Helper for online, unparked cpus. */
@@ -597,6 +602,12 @@ static void watchdog_disable_all_cpus(void)
}
}
+static int watchdog_update_cpus(void)
+{
+ return smpboot_update_cpumask_percpu_thread(
+ &watchdog_threads, &watchdog_cpumask);
+}
+
#else /* SOFTLOCKUP */
static int watchdog_park_threads(void)
{
@@ -616,6 +627,10 @@ static void watchdog_disable_all_cpus(void)
{
}
+static int watchdog_update_cpus(void)
+{
+}
+
static void set_sample_period(void)
{
}
@@ -648,6 +663,8 @@ int lockup_detector_suspend(void)
watchdog_enabled = 0;
}
+ watchdog_nmi_reconfigure();
+
mutex_unlock(&watchdog_proc_mutex);
return ret;
@@ -668,6 +685,8 @@ void lockup_detector_resume(void)
if (watchdog_running && !watchdog_suspended)
watchdog_unpark_threads();
+ watchdog_nmi_reconfigure();
+
mutex_unlock(&watchdog_proc_mutex);
put_online_cpus();
}
@@ -693,6 +712,8 @@ static int proc_watchdog_update(void)
else
watchdog_disable_all_cpus();
+ watchdog_nmi_reconfigure();
+
return err;
}
@@ -878,12 +899,11 @@ int proc_watchdog_cpumask(struct ctl_table *table, int write,
* a temporary cpumask, so we are likely not in a
* position to do much else to make things better.
*/
-#ifdef CONFIG_SOFTLOCKUP_DETECTOR
- if (smpboot_update_cpumask_percpu_thread(
- &watchdog_threads, &watchdog_cpumask) != 0)
+ if (watchdog_update_cpus() != 0)
pr_err("cpumask update failed\n");
-#endif
}
+
+ watchdog_nmi_reconfigure();
}
out:
mutex_unlock(&watchdog_proc_mutex);
--
2.11.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 3/4] watchdog: split out config options
2017-05-25 8:28 ` [PATCH 3/4] watchdog: split out config options Nicholas Piggin
@ 2017-05-25 11:30 ` kbuild test robot
2017-05-25 12:09 ` kbuild test robot
1 sibling, 0 replies; 13+ messages in thread
From: kbuild test robot @ 2017-05-25 11:30 UTC (permalink / raw)
To: Nicholas Piggin
Cc: kbuild-all, Don Zickus, Nicholas Piggin, linux-kernel, linux-arch
[-- Attachment #1: Type: text/plain, Size: 8578 bytes --]
Hi Nicholas,
[auto build test ERROR on linus/master]
[also build test ERROR on v4.12-rc2 next-20170525]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Nicholas-Piggin/watchdog-remove-unused-declaration/20170525-164255
config: ia64-allmodconfig (attached as .config)
compiler: ia64-linux-gcc (GCC) 6.2.0
reproduce:
wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=ia64
All error/warnings (new ones prefixed by >>):
kernel/watchdog_hld.c: In function 'watchdog_overflow_callback':
>> kernel/watchdog_hld.c:54:7: error: 'struct perf_event' has no member named 'hw'
event->hw.interrupts = 0;
^~
kernel/watchdog_hld.c: In function 'watchdog_nmi_enable':
>> kernel/watchdog_hld.c:123:20: error: 'struct perf_event' has no member named 'state'
if (event && event->state > PERF_EVENT_STATE_OFF)
^~
>> kernel/watchdog_hld.c:137:10: error: implicit declaration of function 'perf_event_create_kernel_counter' [-Werror=implicit-function-declaration]
event = perf_event_create_kernel_counter(wd_attr, cpu, NULL, watchdog_overflow_callback, NULL);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> kernel/watchdog_hld.c:137:8: warning: assignment makes pointer from integer without a cast [-Wint-conversion]
event = perf_event_create_kernel_counter(wd_attr, cpu, NULL, watchdog_overflow_callback, NULL);
^
cc1: some warnings being treated as errors
vim +54 kernel/watchdog_hld.c
73ce0511 Babu Moger 2016-12-14 48 /* Callback function for perf event subsystem */
73ce0511 Babu Moger 2016-12-14 49 static void watchdog_overflow_callback(struct perf_event *event,
73ce0511 Babu Moger 2016-12-14 50 struct perf_sample_data *data,
73ce0511 Babu Moger 2016-12-14 51 struct pt_regs *regs)
73ce0511 Babu Moger 2016-12-14 52 {
73ce0511 Babu Moger 2016-12-14 53 /* Ensure the watchdog never gets throttled */
73ce0511 Babu Moger 2016-12-14 @54 event->hw.interrupts = 0;
73ce0511 Babu Moger 2016-12-14 55
b94f5118 Don Zickus 2017-01-24 56 if (atomic_read(&watchdog_park_in_progress) != 0)
b94f5118 Don Zickus 2017-01-24 57 return;
b94f5118 Don Zickus 2017-01-24 58
73ce0511 Babu Moger 2016-12-14 59 if (__this_cpu_read(watchdog_nmi_touch) == true) {
73ce0511 Babu Moger 2016-12-14 60 __this_cpu_write(watchdog_nmi_touch, false);
73ce0511 Babu Moger 2016-12-14 61 return;
73ce0511 Babu Moger 2016-12-14 62 }
73ce0511 Babu Moger 2016-12-14 63
73ce0511 Babu Moger 2016-12-14 64 /* check for a hardlockup
73ce0511 Babu Moger 2016-12-14 65 * This is done by making sure our timer interrupt
73ce0511 Babu Moger 2016-12-14 66 * is incrementing. The timer interrupt should have
73ce0511 Babu Moger 2016-12-14 67 * fired multiple times before we overflow'd. If it hasn't
73ce0511 Babu Moger 2016-12-14 68 * then this is a good indication the cpu is stuck
73ce0511 Babu Moger 2016-12-14 69 */
73ce0511 Babu Moger 2016-12-14 70 if (is_hardlockup()) {
73ce0511 Babu Moger 2016-12-14 71 int this_cpu = smp_processor_id();
73ce0511 Babu Moger 2016-12-14 72
73ce0511 Babu Moger 2016-12-14 73 /* only print hardlockups once */
73ce0511 Babu Moger 2016-12-14 74 if (__this_cpu_read(hard_watchdog_warn) == true)
73ce0511 Babu Moger 2016-12-14 75 return;
73ce0511 Babu Moger 2016-12-14 76
73ce0511 Babu Moger 2016-12-14 77 pr_emerg("Watchdog detected hard LOCKUP on cpu %d", this_cpu);
73ce0511 Babu Moger 2016-12-14 78 print_modules();
73ce0511 Babu Moger 2016-12-14 79 print_irqtrace_events(current);
73ce0511 Babu Moger 2016-12-14 80 if (regs)
73ce0511 Babu Moger 2016-12-14 81 show_regs(regs);
73ce0511 Babu Moger 2016-12-14 82 else
73ce0511 Babu Moger 2016-12-14 83 dump_stack();
73ce0511 Babu Moger 2016-12-14 84
73ce0511 Babu Moger 2016-12-14 85 /*
73ce0511 Babu Moger 2016-12-14 86 * Perform all-CPU dump only once to avoid multiple hardlockups
73ce0511 Babu Moger 2016-12-14 87 * generating interleaving traces
73ce0511 Babu Moger 2016-12-14 88 */
73ce0511 Babu Moger 2016-12-14 89 if (sysctl_hardlockup_all_cpu_backtrace &&
73ce0511 Babu Moger 2016-12-14 90 !test_and_set_bit(0, &hardlockup_allcpu_dumped))
73ce0511 Babu Moger 2016-12-14 91 trigger_allbutself_cpu_backtrace();
73ce0511 Babu Moger 2016-12-14 92
73ce0511 Babu Moger 2016-12-14 93 if (hardlockup_panic)
73ce0511 Babu Moger 2016-12-14 94 nmi_panic(regs, "Hard LOCKUP");
73ce0511 Babu Moger 2016-12-14 95
73ce0511 Babu Moger 2016-12-14 96 __this_cpu_write(hard_watchdog_warn, true);
73ce0511 Babu Moger 2016-12-14 97 return;
73ce0511 Babu Moger 2016-12-14 98 }
73ce0511 Babu Moger 2016-12-14 99
73ce0511 Babu Moger 2016-12-14 100 __this_cpu_write(hard_watchdog_warn, false);
73ce0511 Babu Moger 2016-12-14 101 return;
73ce0511 Babu Moger 2016-12-14 102 }
73ce0511 Babu Moger 2016-12-14 103
73ce0511 Babu Moger 2016-12-14 104 /*
73ce0511 Babu Moger 2016-12-14 105 * People like the simple clean cpu node info on boot.
73ce0511 Babu Moger 2016-12-14 106 * Reduce the watchdog noise by only printing messages
73ce0511 Babu Moger 2016-12-14 107 * that are different from what cpu0 displayed.
73ce0511 Babu Moger 2016-12-14 108 */
8dcde9de Prarit Bhargava 2017-02-22 109 static unsigned long firstcpu_err;
8dcde9de Prarit Bhargava 2017-02-22 110 static atomic_t watchdog_cpus;
73ce0511 Babu Moger 2016-12-14 111
73ce0511 Babu Moger 2016-12-14 112 int watchdog_nmi_enable(unsigned int cpu)
73ce0511 Babu Moger 2016-12-14 113 {
73ce0511 Babu Moger 2016-12-14 114 struct perf_event_attr *wd_attr;
73ce0511 Babu Moger 2016-12-14 115 struct perf_event *event = per_cpu(watchdog_ev, cpu);
8dcde9de Prarit Bhargava 2017-02-22 116 int firstcpu = 0;
73ce0511 Babu Moger 2016-12-14 117
73ce0511 Babu Moger 2016-12-14 118 /* nothing to do if the hard lockup detector is disabled */
73ce0511 Babu Moger 2016-12-14 119 if (!(watchdog_enabled & NMI_WATCHDOG_ENABLED))
73ce0511 Babu Moger 2016-12-14 120 goto out;
73ce0511 Babu Moger 2016-12-14 121
73ce0511 Babu Moger 2016-12-14 122 /* is it already setup and enabled? */
73ce0511 Babu Moger 2016-12-14 @123 if (event && event->state > PERF_EVENT_STATE_OFF)
73ce0511 Babu Moger 2016-12-14 124 goto out;
73ce0511 Babu Moger 2016-12-14 125
73ce0511 Babu Moger 2016-12-14 126 /* it is setup but not enabled */
73ce0511 Babu Moger 2016-12-14 127 if (event != NULL)
73ce0511 Babu Moger 2016-12-14 128 goto out_enable;
73ce0511 Babu Moger 2016-12-14 129
8dcde9de Prarit Bhargava 2017-02-22 130 if (atomic_inc_return(&watchdog_cpus) == 1)
8dcde9de Prarit Bhargava 2017-02-22 131 firstcpu = 1;
8dcde9de Prarit Bhargava 2017-02-22 132
73ce0511 Babu Moger 2016-12-14 133 wd_attr = &wd_hw_attr;
73ce0511 Babu Moger 2016-12-14 134 wd_attr->sample_period = hw_nmi_get_sample_period(watchdog_thresh);
73ce0511 Babu Moger 2016-12-14 135
73ce0511 Babu Moger 2016-12-14 136 /* Try to register using hardware perf events */
73ce0511 Babu Moger 2016-12-14 @137 event = perf_event_create_kernel_counter(wd_attr, cpu, NULL, watchdog_overflow_callback, NULL);
73ce0511 Babu Moger 2016-12-14 138
8dcde9de Prarit Bhargava 2017-02-22 139 /* save the first cpu's error for future comparision */
8dcde9de Prarit Bhargava 2017-02-22 140 if (firstcpu && IS_ERR(event))
:::::: The code at line 54 was first introduced by commit
:::::: 73ce0511c43686095efd2f65ef564aab952e07bc kernel/watchdog.c: move hardlockup detector to separate file
:::::: TO: Babu Moger <babu.moger@oracle.com>
:::::: CC: Linus Torvalds <torvalds@linux-foundation.org>
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 47748 bytes --]
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 3/4] watchdog: split out config options
2017-05-25 8:28 ` [PATCH 3/4] watchdog: split out config options Nicholas Piggin
2017-05-25 11:30 ` kbuild test robot
@ 2017-05-25 12:09 ` kbuild test robot
1 sibling, 0 replies; 13+ messages in thread
From: kbuild test robot @ 2017-05-25 12:09 UTC (permalink / raw)
To: Nicholas Piggin
Cc: kbuild-all, Don Zickus, Nicholas Piggin, linux-kernel, linux-arch
[-- Attachment #1: Type: text/plain, Size: 1465 bytes --]
Hi Nicholas,
[auto build test ERROR on linus/master]
[also build test ERROR on v4.12-rc2 next-20170525]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Nicholas-Piggin/watchdog-remove-unused-declaration/20170525-164255
config: parisc-allyesconfig (attached as .config)
compiler: hppa-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=parisc
All errors (new ones prefixed by >>):
kernel/built-in.o: In function `watchdog_nmi_enable':
>> (.text.watchdog_nmi_enable+0x298): undefined reference to `hw_nmi_get_sample_period'
kernel/built-in.o: In function `watchdog_nmi_enable':
(.text.watchdog_nmi_enable+0x2e8): undefined reference to `hw_nmi_get_sample_period'
hppa-linux-gnu-ld: drivers/built-in.o(.init.text+0x3d4ac): cannot reach 0000b22c_simple_map_init+0, recompile with -ffunction-sections
hppa-linux-gnu-ld: drivers/built-in.o(.init.text+0x3d4ac): cannot handle R_PARISC_PCREL17F for simple_map_init
hppa-linux-gnu-ld: final link failed: Bad value
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 50215 bytes --]
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/4] watchdog: introduce arch_touch_nmi_watchdog()
2017-05-25 8:28 ` [PATCH 2/4] watchdog: introduce arch_touch_nmi_watchdog() Nicholas Piggin
@ 2017-05-25 13:55 ` Don Zickus
2017-05-26 0:31 ` Nicholas Piggin
0 siblings, 1 reply; 13+ messages in thread
From: Don Zickus @ 2017-05-25 13:55 UTC (permalink / raw)
To: Nicholas Piggin; +Cc: linux-kernel, linux-arch
On Thu, May 25, 2017 at 06:28:54PM +1000, Nicholas Piggin wrote:
> For architectures that define HAVE_NMI_WATCHDOG, instead of having
> them provide the complete touch_nmi_watchdog() function, just have
> them provide arch_touch_nmi_watchdog().
>
> This gives the generic code more flexibility in implementing this
> function, and arch implementations don't miss out on touching the
> softlockup watchdog or other generic details.
The idea makes sense. I don't think you can have hld_touch_nmi_watchdog
defined with arch_touch_nmi_watchdog, so I am wondering if it makes sense to
combine them somehow. Though renaming hld_touch_nmi_watchdog to
arch_touch_nmi_watchdog sounds odd, I think it mimics the idea.
Cheers,
Don
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> arch/blackfin/include/asm/nmi.h | 2 ++
> arch/blackfin/kernel/nmi.c | 2 +-
> arch/mn10300/include/asm/nmi.h | 2 ++
> arch/mn10300/kernel/mn10300-watchdog-low.S | 8 ++++----
> arch/mn10300/kernel/mn10300-watchdog.c | 2 +-
> arch/sparc/include/asm/nmi.h | 1 +
> arch/sparc/kernel/nmi.c | 6 ++----
> include/linux/nmi.h | 26 +++++++++++++++-----------
> kernel/watchdog_hld.c | 5 ++---
> 9 files changed, 30 insertions(+), 24 deletions(-)
>
> diff --git a/arch/blackfin/include/asm/nmi.h b/arch/blackfin/include/asm/nmi.h
> index b9caac4fcfd8..6e7047ceec79 100644
> --- a/arch/blackfin/include/asm/nmi.h
> +++ b/arch/blackfin/include/asm/nmi.h
> @@ -7,6 +7,8 @@
> #ifndef _BFIN_NMI_H_
> #define _BFIN_NMI_H_
>
> +void arch_touch_nmi_watchdog(void);
> +
> #include <linux/nmi.h>
>
> #endif
> diff --git a/arch/blackfin/kernel/nmi.c b/arch/blackfin/kernel/nmi.c
> index 633c37083e87..1e714329fe8a 100644
> --- a/arch/blackfin/kernel/nmi.c
> +++ b/arch/blackfin/kernel/nmi.c
> @@ -190,7 +190,7 @@ static int __init init_nmi_wdt(void)
> }
> device_initcall(init_nmi_wdt);
>
> -void touch_nmi_watchdog(void)
> +void arch_touch_nmi_watchdog(void)
> {
> atomic_set(&nmi_touched[smp_processor_id()], 1);
> }
> diff --git a/arch/mn10300/include/asm/nmi.h b/arch/mn10300/include/asm/nmi.h
> index f3671cbbc117..aee8aa22d9ee 100644
> --- a/arch/mn10300/include/asm/nmi.h
> +++ b/arch/mn10300/include/asm/nmi.h
> @@ -11,4 +11,6 @@
> #ifndef _ASM_NMI_H
> #define _ASM_NMI_H
>
> +void arch_touch_nmi_watchdog(void);
> +
> #endif /* _ASM_NMI_H */
> diff --git a/arch/mn10300/kernel/mn10300-watchdog-low.S b/arch/mn10300/kernel/mn10300-watchdog-low.S
> index f2f5c9cfaabd..34f8773de7d0 100644
> --- a/arch/mn10300/kernel/mn10300-watchdog-low.S
> +++ b/arch/mn10300/kernel/mn10300-watchdog-low.S
> @@ -50,9 +50,9 @@ watchdog_handler:
> # we can't inline it)
> #
> ###############################################################################
> - .globl touch_nmi_watchdog
> - .type touch_nmi_watchdog,@function
> -touch_nmi_watchdog:
> + .globl arch_touch_nmi_watchdog
> + .type arch_touch_nmi_watchdog,@function
> +arch_touch_nmi_watchdog:
> clr d0
> clr d1
> mov watchdog_alert_counter, a0
> @@ -63,4 +63,4 @@ touch_nmi_watchdog:
> lne
> ret [],0
>
> - .size touch_nmi_watchdog,.-touch_nmi_watchdog
> + .size arch_touch_nmi_watchdog,.-arch_touch_nmi_watchdog
> diff --git a/arch/mn10300/kernel/mn10300-watchdog.c b/arch/mn10300/kernel/mn10300-watchdog.c
> index a2d8e6938d67..0d5641beadf5 100644
> --- a/arch/mn10300/kernel/mn10300-watchdog.c
> +++ b/arch/mn10300/kernel/mn10300-watchdog.c
> @@ -31,7 +31,7 @@ static unsigned int watchdog;
> static unsigned int watchdog_hz = 1;
> unsigned int watchdog_alert_counter[NR_CPUS];
>
> -EXPORT_SYMBOL(touch_nmi_watchdog);
> +EXPORT_SYMBOL(arch_touch_nmi_watchdog);
>
> /*
> * the best way to detect whether a CPU has a 'hard lockup' problem
> diff --git a/arch/sparc/include/asm/nmi.h b/arch/sparc/include/asm/nmi.h
> index 26ad2b2607c6..284eac3ffaf2 100644
> --- a/arch/sparc/include/asm/nmi.h
> +++ b/arch/sparc/include/asm/nmi.h
> @@ -7,6 +7,7 @@ void nmi_adjust_hz(unsigned int new_hz);
>
> extern atomic_t nmi_active;
>
> +void arch_touch_nmi_watchdog(void);
> void start_nmi_watchdog(void *unused);
> void stop_nmi_watchdog(void *unused);
>
> diff --git a/arch/sparc/kernel/nmi.c b/arch/sparc/kernel/nmi.c
> index 95e73c63c99d..048ad783ea3f 100644
> --- a/arch/sparc/kernel/nmi.c
> +++ b/arch/sparc/kernel/nmi.c
> @@ -51,7 +51,7 @@ static DEFINE_PER_CPU(unsigned int, last_irq_sum);
> static DEFINE_PER_CPU(long, alert_counter);
> static DEFINE_PER_CPU(int, nmi_touch);
>
> -void touch_nmi_watchdog(void)
> +void arch_touch_nmi_watchdog(void)
> {
> if (atomic_read(&nmi_active)) {
> int cpu;
> @@ -61,10 +61,8 @@ void touch_nmi_watchdog(void)
> per_cpu(nmi_touch, cpu) = 1;
> }
> }
> -
> - touch_softlockup_watchdog();
> }
> -EXPORT_SYMBOL(touch_nmi_watchdog);
> +EXPORT_SYMBOL(arch_touch_nmi_watchdog);
>
> static void die_nmi(const char *str, struct pt_regs *regs, int do_panic)
> {
> diff --git a/include/linux/nmi.h b/include/linux/nmi.h
> index 5e2e57536d98..6ea465a842a1 100644
> --- a/include/linux/nmi.h
> +++ b/include/linux/nmi.h
> @@ -6,6 +6,9 @@
>
> #include <linux/sched.h>
> #include <asm/irq.h>
> +#if defined(CONFIG_HAVE_NMI_WATCHDOG)
> +#include <asm/nmi.h>
> +#endif
>
> #ifdef CONFIG_LOCKUP_DETECTOR
> extern void touch_softlockup_watchdog_sched(void);
> @@ -58,6 +61,14 @@ static inline void reset_hung_task_detector(void)
> #define NMI_WATCHDOG_ENABLED (1 << NMI_WATCHDOG_ENABLED_BIT)
> #define SOFT_WATCHDOG_ENABLED (1 << SOFT_WATCHDOG_ENABLED_BIT)
>
> +#if defined(CONFIG_HARDLOCKUP_DETECTOR)
> +extern void hld_touch_nmi_watchdog(void);
> +extern void hardlockup_detector_disable(void);
> +#else
> +static inline void hld_touch_nmi_watchdog(void) {}
> +static inline void hardlockup_detector_disable(void) {}
> +#endif
> +
> /**
> * touch_nmi_watchdog - restart NMI watchdog timeout.
> *
> @@ -65,21 +76,14 @@ static inline void reset_hung_task_detector(void)
> * may be used to reset the timeout - for code which intentionally
> * disables interrupts for a long time. This call is stateless.
> */
> -#if defined(CONFIG_HAVE_NMI_WATCHDOG) || defined(CONFIG_HARDLOCKUP_DETECTOR)
> -#include <asm/nmi.h>
> -extern void touch_nmi_watchdog(void);
> -#else
> static inline void touch_nmi_watchdog(void)
> {
> +#if defined(CONFIG_HAVE_NMI_WATCHDOG)
> + arch_touch_nmi_watchdog();
> +#endif
> + hld_touch_nmi_watchdog();
> touch_softlockup_watchdog();
> }
> -#endif
> -
> -#if defined(CONFIG_HARDLOCKUP_DETECTOR)
> -extern void hardlockup_detector_disable(void);
> -#else
> -static inline void hardlockup_detector_disable(void) {}
> -#endif
>
> /*
> * Create trigger_all_cpu_backtrace() out of the arch-provided
> diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
> index 54a427d1f344..e0d7a7c43fb5 100644
> --- a/kernel/watchdog_hld.c
> +++ b/kernel/watchdog_hld.c
> @@ -56,7 +56,7 @@ static int __init hardlockup_panic_setup(char *str)
> }
> __setup("nmi_watchdog=", hardlockup_panic_setup);
>
> -void touch_nmi_watchdog(void)
> +void hld_touch_nmi_watchdog(void)
> {
> /*
> * Using __raw here because some code paths have
> @@ -66,9 +66,8 @@ void touch_nmi_watchdog(void)
> * going off.
> */
> raw_cpu_write(watchdog_nmi_touch, true);
> - touch_softlockup_watchdog();
> }
> -EXPORT_SYMBOL(touch_nmi_watchdog);
> +EXPORT_SYMBOL(hld_touch_nmi_watchdog);
>
> static struct perf_event_attr wd_hw_attr = {
> .type = PERF_TYPE_HARDWARE,
> --
> 2.11.0
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 4/4] watchdog: provide watchdog_reconfigure() for arch watchdogs
2017-05-25 8:28 ` [PATCH 4/4] watchdog: provide watchdog_reconfigure() for arch watchdogs Nicholas Piggin
@ 2017-05-25 14:08 ` Don Zickus
2017-05-26 0:39 ` Nicholas Piggin
0 siblings, 1 reply; 13+ messages in thread
From: Don Zickus @ 2017-05-25 14:08 UTC (permalink / raw)
To: Nicholas Piggin; +Cc: linux-kernel, linux-arch
On Thu, May 25, 2017 at 06:28:56PM +1000, Nicholas Piggin wrote:
> After reconfiguring watchdog sysctls etc., architecture specific
> watchdogs may not get all their parameters updated.
>
> watchdog_reconfigure() can be implemented to pull the new values
> in and set the arch NMI watchdog.
I understand the reason for this patch and I don't have any real objection
on how it was implemented within the constraints of all the current logic.
I just wonder if the current logic should be adjusted to make the hardlockup
detector, namely the perf implementation more separate so it can handle what
you would like more cleanly.
The watchdog_nmi_reconfigure is sort of hackish, but it is hard to fault you
based on how things are designed. I am going to poke at it a little bit,
but I will probably not find time to do much and accept what you have for
now.
Cheers,
Don
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> kernel/watchdog.c | 28 ++++++++++++++++++++++++----
> 1 file changed, 24 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index deb010505646..d2996f5bf551 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -120,6 +120,11 @@ void __weak watchdog_nmi_disable(unsigned int cpu)
> {
> }
>
> +void __weak watchdog_nmi_reconfigure(void)
> +{
> +}
> +
> +
> #ifdef CONFIG_SOFTLOCKUP_DETECTOR
>
> /* Helper for online, unparked cpus. */
> @@ -597,6 +602,12 @@ static void watchdog_disable_all_cpus(void)
> }
> }
>
> +static int watchdog_update_cpus(void)
> +{
> + return smpboot_update_cpumask_percpu_thread(
> + &watchdog_threads, &watchdog_cpumask);
> +}
> +
> #else /* SOFTLOCKUP */
> static int watchdog_park_threads(void)
> {
> @@ -616,6 +627,10 @@ static void watchdog_disable_all_cpus(void)
> {
> }
>
> +static int watchdog_update_cpus(void)
> +{
> +}
> +
> static void set_sample_period(void)
> {
> }
> @@ -648,6 +663,8 @@ int lockup_detector_suspend(void)
> watchdog_enabled = 0;
> }
>
> + watchdog_nmi_reconfigure();
> +
> mutex_unlock(&watchdog_proc_mutex);
>
> return ret;
> @@ -668,6 +685,8 @@ void lockup_detector_resume(void)
> if (watchdog_running && !watchdog_suspended)
> watchdog_unpark_threads();
>
> + watchdog_nmi_reconfigure();
> +
> mutex_unlock(&watchdog_proc_mutex);
> put_online_cpus();
> }
> @@ -693,6 +712,8 @@ static int proc_watchdog_update(void)
> else
> watchdog_disable_all_cpus();
>
> + watchdog_nmi_reconfigure();
> +
> return err;
>
> }
> @@ -878,12 +899,11 @@ int proc_watchdog_cpumask(struct ctl_table *table, int write,
> * a temporary cpumask, so we are likely not in a
> * position to do much else to make things better.
> */
> -#ifdef CONFIG_SOFTLOCKUP_DETECTOR
> - if (smpboot_update_cpumask_percpu_thread(
> - &watchdog_threads, &watchdog_cpumask) != 0)
> + if (watchdog_update_cpus() != 0)
> pr_err("cpumask update failed\n");
> -#endif
> }
> +
> + watchdog_nmi_reconfigure();
> }
> out:
> mutex_unlock(&watchdog_proc_mutex);
> --
> 2.11.0
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/4] watchdog: introduce arch_touch_nmi_watchdog()
2017-05-25 13:55 ` Don Zickus
@ 2017-05-26 0:31 ` Nicholas Piggin
2017-05-26 14:05 ` Don Zickus
0 siblings, 1 reply; 13+ messages in thread
From: Nicholas Piggin @ 2017-05-26 0:31 UTC (permalink / raw)
To: Don Zickus; +Cc: linux-kernel, linux-arch
On Thu, 25 May 2017 09:55:59 -0400
Don Zickus <dzickus@redhat.com> wrote:
> On Thu, May 25, 2017 at 06:28:54PM +1000, Nicholas Piggin wrote:
> > For architectures that define HAVE_NMI_WATCHDOG, instead of having
> > them provide the complete touch_nmi_watchdog() function, just have
> > them provide arch_touch_nmi_watchdog().
> >
> > This gives the generic code more flexibility in implementing this
> > function, and arch implementations don't miss out on touching the
> > softlockup watchdog or other generic details.
>
> The idea makes sense. I don't think you can have hld_touch_nmi_watchdog
> defined with arch_touch_nmi_watchdog, so I am wondering if it makes sense to
> combine them somehow. Though renaming hld_touch_nmi_watchdog to
> arch_touch_nmi_watchdog sounds odd, I think it mimics the idea.
Yeah I agree it's not quite right, and I think using
arch_touch_nmi_watchdog would be fine for the hld, which makes sense
if you think of it as a utility or library function for architectures
that want a hardlockup watchdog and can use perf for it.
I can change that if you prefer. BTW the 0day picked up another
Kconfig compile bug, so I'll respin the series and include any changes
you like.
Thanks,
Nick
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 4/4] watchdog: provide watchdog_reconfigure() for arch watchdogs
2017-05-25 14:08 ` Don Zickus
@ 2017-05-26 0:39 ` Nicholas Piggin
2017-05-26 14:21 ` Don Zickus
0 siblings, 1 reply; 13+ messages in thread
From: Nicholas Piggin @ 2017-05-26 0:39 UTC (permalink / raw)
To: Don Zickus; +Cc: linux-kernel, linux-arch
On Thu, 25 May 2017 10:08:33 -0400
Don Zickus <dzickus@redhat.com> wrote:
> On Thu, May 25, 2017 at 06:28:56PM +1000, Nicholas Piggin wrote:
> > After reconfiguring watchdog sysctls etc., architecture specific
> > watchdogs may not get all their parameters updated.
> >
> > watchdog_reconfigure() can be implemented to pull the new values
> > in and set the arch NMI watchdog.
>
> I understand the reason for this patch and I don't have any real objection
> on how it was implemented within the constraints of all the current logic.
>
> I just wonder if the current logic should be adjusted to make the hardlockup
> detector, namely the perf implementation more separate so it can handle what
> you would like more cleanly.
>
> The watchdog_nmi_reconfigure is sort of hackish, but it is hard to fault you
> based on how things are designed. I am going to poke at it a little bit,
> but I will probably not find time to do much and accept what you have for
> now.
I actually agree with you. These patches are basically an initial bridge
to get us to decoupling hld-perf from hld-arch, but the code could
definitely use several more passes to clean things up.
One thing we want to be mindful of is some watchdogs are very light weight,
minimal, and some may not even want to call C code (at least from the NMI
and touch-watchdog paths). But having said that, it may not be a bad idea
to have implementations provide a watchdog driver struct with some of the
methods and reconfiguration they support. E.g., suspend/resume, stop/start
on CPUs, adjust timeouts, etc.).
I didn't want to go the whole hog and over-engineer something that doesn't
work though, so I'm hoping we can get the powerpc watchdog in, and then
keep working on the apis.
Let me know what you think after you poke at it though.
Thanks,
Nick
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/4] watchdog: introduce arch_touch_nmi_watchdog()
2017-05-26 0:31 ` Nicholas Piggin
@ 2017-05-26 14:05 ` Don Zickus
0 siblings, 0 replies; 13+ messages in thread
From: Don Zickus @ 2017-05-26 14:05 UTC (permalink / raw)
To: Nicholas Piggin; +Cc: linux-kernel, linux-arch
On Fri, May 26, 2017 at 10:31:03AM +1000, Nicholas Piggin wrote:
> On Thu, 25 May 2017 09:55:59 -0400
> Don Zickus <dzickus@redhat.com> wrote:
>
> > On Thu, May 25, 2017 at 06:28:54PM +1000, Nicholas Piggin wrote:
> > > For architectures that define HAVE_NMI_WATCHDOG, instead of having
> > > them provide the complete touch_nmi_watchdog() function, just have
> > > them provide arch_touch_nmi_watchdog().
> > >
> > > This gives the generic code more flexibility in implementing this
> > > function, and arch implementations don't miss out on touching the
> > > softlockup watchdog or other generic details.
> >
> > The idea makes sense. I don't think you can have hld_touch_nmi_watchdog
> > defined with arch_touch_nmi_watchdog, so I am wondering if it makes sense to
> > combine them somehow. Though renaming hld_touch_nmi_watchdog to
> > arch_touch_nmi_watchdog sounds odd, I think it mimics the idea.
>
> Yeah I agree it's not quite right, and I think using
> arch_touch_nmi_watchdog would be fine for the hld, which makes sense
> if you think of it as a utility or library function for architectures
> that want a hardlockup watchdog and can use perf for it.
Yeah, if you wouldn't mind trying that. Over the last year it seems there
is a push to make the hld more of a separate thing if folks want to use
perf. I have been trying to tweak it so it can be used in-place of the arch
solution or just use the arch solution. And still retain the same function
calls.
Cheers,
Don
>
> I can change that if you prefer. BTW the 0day picked up another
> Kconfig compile bug, so I'll respin the series and include any changes
> you like.
>
> Thanks,
> Nick
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 4/4] watchdog: provide watchdog_reconfigure() for arch watchdogs
2017-05-26 0:39 ` Nicholas Piggin
@ 2017-05-26 14:21 ` Don Zickus
0 siblings, 0 replies; 13+ messages in thread
From: Don Zickus @ 2017-05-26 14:21 UTC (permalink / raw)
To: Nicholas Piggin; +Cc: linux-kernel, linux-arch
On Fri, May 26, 2017 at 10:39:09AM +1000, Nicholas Piggin wrote:
> On Thu, 25 May 2017 10:08:33 -0400
> Don Zickus <dzickus@redhat.com> wrote:
>
> > On Thu, May 25, 2017 at 06:28:56PM +1000, Nicholas Piggin wrote:
> > > After reconfiguring watchdog sysctls etc., architecture specific
> > > watchdogs may not get all their parameters updated.
> > >
> > > watchdog_reconfigure() can be implemented to pull the new values
> > > in and set the arch NMI watchdog.
> >
> > I understand the reason for this patch and I don't have any real objection
> > on how it was implemented within the constraints of all the current logic.
> >
> > I just wonder if the current logic should be adjusted to make the hardlockup
> > detector, namely the perf implementation more separate so it can handle what
> > you would like more cleanly.
> >
> > The watchdog_nmi_reconfigure is sort of hackish, but it is hard to fault you
> > based on how things are designed. I am going to poke at it a little bit,
> > but I will probably not find time to do much and accept what you have for
> > now.
>
> I actually agree with you. These patches are basically an initial bridge
> to get us to decoupling hld-perf from hld-arch, but the code could
> definitely use several more passes to clean things up.
Makes sense. :-)
>
> One thing we want to be mindful of is some watchdogs are very light weight,
> minimal, and some may not even want to call C code (at least from the NMI
Agreed.
> and touch-watchdog paths). But having said that, it may not be a bad idea
> to have implementations provide a watchdog driver struct with some of the
> methods and reconfiguration they support. E.g., suspend/resume, stop/start
> on CPUs, adjust timeouts, etc.).
Hehe. I was hoping to avoid doing that, but it may lead there over time.
>
> I didn't want to go the whole hog and over-engineer something that doesn't
> work though, so I'm hoping we can get the powerpc watchdog in, and then
> keep working on the apis.
Agreed.
>
> Let me know what you think after you poke at it though.
I will.
Cheers,
Don
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2017-05-26 14:21 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-25 8:28 [PATCH 0/4][V2] Improve watchdog config for arch watchdogs Nicholas Piggin
2017-05-25 8:28 ` [PATCH 1/4] watchdog: remove unused declaration Nicholas Piggin
2017-05-25 8:28 ` [PATCH 2/4] watchdog: introduce arch_touch_nmi_watchdog() Nicholas Piggin
2017-05-25 13:55 ` Don Zickus
2017-05-26 0:31 ` Nicholas Piggin
2017-05-26 14:05 ` Don Zickus
2017-05-25 8:28 ` [PATCH 3/4] watchdog: split out config options Nicholas Piggin
2017-05-25 11:30 ` kbuild test robot
2017-05-25 12:09 ` kbuild test robot
2017-05-25 8:28 ` [PATCH 4/4] watchdog: provide watchdog_reconfigure() for arch watchdogs Nicholas Piggin
2017-05-25 14:08 ` Don Zickus
2017-05-26 0:39 ` Nicholas Piggin
2017-05-26 14:21 ` Don Zickus
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).