linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/9] watchdog: internally split softlockup and hardlockup
@ 2014-11-04 16:20 Don Zickus
  2014-11-04 16:20 ` [PATCH 1/9] watchdog: new definitions and variables, initialization Don Zickus
                   ` (10 more replies)
  0 siblings, 11 replies; 16+ messages in thread
From: Don Zickus @ 2014-11-04 16:20 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ingo Molnar, LKML, uobergfe, Don Zickus

The hardlockup and softockup had always been tied together.  Due to the request
of KVM folks, they had a need to have one enabled but not the other.
Internally rework the code to split things apart more cleanly.

There is a bunch of churn here, but the end result should be code that should
be easier to maintain and fix without knowing the internals of what is going
on.

Tested by Uli and myself.

Ulrich Obergfell (9):
  watchdog: new definitions and variables, initialization
  watchdog: introduce the proc_watchdog_update() function
  watchdog: move definition of 'watchdog_proc_mutex' outside of
    proc_dowatchdog()
  watchdog: introduce the proc_watchdog_common() function
  watchdog: introduce separate handlers for parameters in
    /proc/sys/kernel
  watchdog: implement error handling for failure to set up hardware
    perf events
  watchdog: enable the new user interface of the watchdog mechanism
  watchdog: clean up some function names and arguments
  watchdog: introduce the hardlockup_detector_disable() function

 arch/x86/kernel/kvm.c |    2 +-
 include/linux/nmi.h   |   21 ++--
 kernel/sysctl.c       |   35 +++++--
 kernel/watchdog.c     |  277 ++++++++++++++++++++++++++++++++++++-------------
 4 files changed, 240 insertions(+), 95 deletions(-)


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/9] watchdog: new definitions and variables, initialization
  2014-11-04 16:20 [PATCH 0/9] watchdog: internally split softlockup and hardlockup Don Zickus
@ 2014-11-04 16:20 ` Don Zickus
  2014-11-04 16:20 ` [PATCH 2/9] watchdog: introduce the proc_watchdog_update() function Don Zickus
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Don Zickus @ 2014-11-04 16:20 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ingo Molnar, LKML, uobergfe, Don Zickus

From: Ulrich Obergfell <uobergfe@redhat.com>

Introduce new definitions and variables to separate the user interface
in /proc/sys/kernel from the internal run state of the lockup detectors.
The internal run state is represented by two bits in a new variable that
is named 'watchdog_enabled'. This helps simplify the code, for example:

- In order to check if any of the two lockup detectors is enabled,
  it is sufficient to check if 'watchdog_enabled' is not zero.

- In order to enable/disable one or both lockup detectors,
  it is sufficient to set/clear one or both bits in 'watchdog_enabled'.

- Concurrent updates of 'watchdog_enabled' need not be synchronized via
  a spinlock or a mutex. Updates can either be atomic or concurrency can
  be detected by using 'cmpxchg'.

Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
---
 include/linux/nmi.h |    2 ++
 kernel/watchdog.c   |   27 ++++++++++++++++++++++++++-
 2 files changed, 28 insertions(+), 1 deletions(-)

diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index 9b2022a..3885a7d 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -68,6 +68,8 @@ static inline bool trigger_allbutself_cpu_backtrace(void)
 #ifdef CONFIG_LOCKUP_DETECTOR
 int hw_nmi_is_cpu_stuck(struct pt_regs *);
 u64 hw_nmi_get_sample_period(int watchdog_thresh);
+extern int nmi_watchdog_enabled;
+extern int soft_watchdog_enabled;
 extern int watchdog_user_enabled;
 extern int watchdog_thresh;
 extern int sysctl_softlockup_all_cpu_backtrace;
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 70bf118..91e7bff 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -24,8 +24,33 @@
 #include <linux/kvm_para.h>
 #include <linux/perf_event.h>
 
-int watchdog_user_enabled = 1;
+/*
+ * The run state of the lockup detectors is controlled by the content of the
+ * 'watchdog_enabled' variable. Each lockup detector has its dedicated bit -
+ * bit 0 for the hard lockup detector and bit 1 for the soft lockup detector.
+ *
+ * 'watchdog_user_enabled', 'nmi_watchdog_enabled' and 'soft_watchdog_enabled'
+ * are variables that are only used as an 'interface' between the parameters
+ * in /proc/sys/kernel and the internal state bits in 'watchdog_enabled'. The
+ * 'watchdog_thresh' variable is handled differently because its value is not
+ * boolean, and the lockup detectors are 'suspended' while 'watchdog_thresh'
+ * is equal zero.
+ */
+#define NMI_WATCHDOG_ENABLED_BIT   0
+#define SOFT_WATCHDOG_ENABLED_BIT  1
+#define NMI_WATCHDOG_ENABLED      (1 << NMI_WATCHDOG_ENABLED_BIT)
+#define SOFT_WATCHDOG_ENABLED     (1 << SOFT_WATCHDOG_ENABLED_BIT)
+
+#ifdef CONFIG_HARDLOCKUP_DETECTOR
+static unsigned long __read_mostly watchdog_enabled = SOFT_WATCHDOG_ENABLED|NMI_WATCHDOG_ENABLED;
+#else
+static unsigned long __read_mostly watchdog_enabled = SOFT_WATCHDOG_ENABLED;
+#endif
+int __read_mostly nmi_watchdog_enabled;
+int __read_mostly soft_watchdog_enabled;
+int __read_mostly watchdog_user_enabled;
 int __read_mostly watchdog_thresh = 10;
+
 #ifdef CONFIG_SMP
 int __read_mostly sysctl_softlockup_all_cpu_backtrace;
 #else
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/9] watchdog: introduce the proc_watchdog_update() function
  2014-11-04 16:20 [PATCH 0/9] watchdog: internally split softlockup and hardlockup Don Zickus
  2014-11-04 16:20 ` [PATCH 1/9] watchdog: new definitions and variables, initialization Don Zickus
@ 2014-11-04 16:20 ` Don Zickus
  2014-11-04 16:20 ` [PATCH 3/9] watchdog: move definition of 'watchdog_proc_mutex' outside of proc_dowatchdog() Don Zickus
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Don Zickus @ 2014-11-04 16:20 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ingo Molnar, LKML, uobergfe, Don Zickus

From: Ulrich Obergfell <uobergfe@redhat.com>

This series introduces a separate handler for each watchdog parameter
in /proc/sys/kernel. The separate handlers need a common function that
they can call to update the run state of the lockup detectors, or to
have the lockup detectors use a new sample period.

Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
---
 kernel/watchdog.c |   23 +++++++++++++++++++++++
 1 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 91e7bff..dcf28b2 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -679,6 +679,29 @@ static void watchdog_disable_all_cpus(void)
 }
 
 /*
+ * Update the run state of the lockup detectors.
+ */
+static int proc_watchdog_update(void)
+{
+	int err = 0;
+
+	/*
+	 * Watchdog threads won't be started if they are already active.
+	 * The 'watchdog_running' variable in watchdog_*_all_cpus() takes
+	 * care of this. If those threads are already active, the sample
+	 * period will be updated and the lockup detectors will be enabled
+	 * or disabled 'on the fly'.
+	 */
+	if (watchdog_enabled && watchdog_thresh)
+		err = watchdog_enable_all_cpus(true);
+	else
+		watchdog_disable_all_cpus();
+
+	return err;
+
+}
+
+/*
  * proc handler for /proc/sys/kernel/nmi_watchdog,watchdog_thresh
  */
 
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/9] watchdog: move definition of 'watchdog_proc_mutex' outside of proc_dowatchdog()
  2014-11-04 16:20 [PATCH 0/9] watchdog: internally split softlockup and hardlockup Don Zickus
  2014-11-04 16:20 ` [PATCH 1/9] watchdog: new definitions and variables, initialization Don Zickus
  2014-11-04 16:20 ` [PATCH 2/9] watchdog: introduce the proc_watchdog_update() function Don Zickus
@ 2014-11-04 16:20 ` Don Zickus
  2014-11-04 16:20 ` [PATCH 4/9] watchdog: introduce the proc_watchdog_common() function Don Zickus
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Don Zickus @ 2014-11-04 16:20 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ingo Molnar, LKML, uobergfe, Don Zickus

From: Ulrich Obergfell <uobergfe@redhat.com>

This series removes the proc_dowatchdog() function. Since multiple
new functions need the 'watchdog_proc_mutex' to serialize access to
the watchdog parameters in /proc/sys/kernel, move the mutex outside
of any function.

Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
---
 kernel/watchdog.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index dcf28b2..df4ef38 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -701,6 +701,8 @@ static int proc_watchdog_update(void)
 
 }
 
+static DEFINE_MUTEX(watchdog_proc_mutex);
+
 /*
  * proc handler for /proc/sys/kernel/nmi_watchdog,watchdog_thresh
  */
@@ -710,7 +712,6 @@ int proc_dowatchdog(struct ctl_table *table, int write,
 {
 	int err, old_thresh, old_enabled;
 	bool old_hardlockup;
-	static DEFINE_MUTEX(watchdog_proc_mutex);
 
 	mutex_lock(&watchdog_proc_mutex);
 	old_thresh = ACCESS_ONCE(watchdog_thresh);
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 4/9] watchdog: introduce the proc_watchdog_common() function
  2014-11-04 16:20 [PATCH 0/9] watchdog: internally split softlockup and hardlockup Don Zickus
                   ` (2 preceding siblings ...)
  2014-11-04 16:20 ` [PATCH 3/9] watchdog: move definition of 'watchdog_proc_mutex' outside of proc_dowatchdog() Don Zickus
@ 2014-11-04 16:20 ` Don Zickus
  2014-11-04 16:20 ` [PATCH 5/9] watchdog: introduce separate handlers for parameters in /proc/sys/kernel Don Zickus
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Don Zickus @ 2014-11-04 16:20 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ingo Molnar, LKML, uobergfe, Don Zickus

From: Ulrich Obergfell <uobergfe@redhat.com>

Three of four handlers for the watchdog parameters in /proc/sys/kernel
essentially have to do the same thing.

  if the parameter is being read {
    return the state of the corresponding bit(s) in 'watchdog_enabled'
  } else {
    set/clear the state of the corresponding bit(s) in 'watchdog_enabled'
    update the run state of the lockup detector(s)
  }

Hence, introduce a common function that can be called by those handlers.
The callers pass a 'bit mask' to this function to indicate which bit(s)
should be set/cleared in 'watchdog_enabled'.

This function handles an uncommon race with watchdog_nmi_enable() where
a concurrent update of 'watchdog_enabled' is possible. We use 'cmpxchg'
to detect the concurrency. [This avoids introducing a new spinlock or a
mutex to synchronize updates of 'watchdog_enabled'. Using the same lock
or mutex in watchdog thread context and in system call context needs to
be considered carefully because it can make the code prone to deadlock
situations in connection with parking/unparking the watchdog threads.]

Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
---
 kernel/watchdog.c |   65 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 65 insertions(+), 0 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index df4ef38..7626208 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -704,6 +704,71 @@ static int proc_watchdog_update(void)
 static DEFINE_MUTEX(watchdog_proc_mutex);
 
 /*
+ * common function for watchdog, nmi_watchdog and soft_watchdog parameter
+ *
+ * caller             | table->data points to | 'which' contains the flag(s)
+ * -------------------|-----------------------|-----------------------------
+ * proc_watchdog      | watchdog_user_enabled | NMI_WATCHDOG_ENABLED or'ed
+ *                    |                       | with SOFT_WATCHDOG_ENABLED
+ * -------------------|-----------------------|-----------------------------
+ * proc_nmi_watchdog  | nmi_watchdog_enabled  | NMI_WATCHDOG_ENABLED
+ * -------------------|-----------------------|-----------------------------
+ * proc_soft_watchdog | soft_watchdog_enabled | SOFT_WATCHDOG_ENABLED
+ */
+static int proc_watchdog_common(int which, struct ctl_table *table, int write,
+				void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	int err, old, new;
+	int *watchdog_param = (int *)table->data;
+
+	mutex_lock(&watchdog_proc_mutex);
+
+	/*
+	 * If the parameter is being read return the state of the corresponding
+	 * bit(s) in 'watchdog_enabled', else update 'watchdog_enabled' and the
+	 * run state of the lockup detectors.
+	 */
+	if (!write) {
+		*watchdog_param = (watchdog_enabled & which) != 0;
+		err = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+	} else {
+		err = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+		if (err)
+			goto out;
+
+		/*
+		 * There is a race window between fetching the current value
+		 * from 'watchdog_enabled' and storing the new value. During
+		 * this race window, watchdog_nmi_enable() can sneak in and
+		 * clear the NMI_WATCHDOG_ENABLED bit in 'watchdog_enabled'.
+		 * The 'cmpxchg' detects this race and the loop retries.
+		 */
+		do {
+			old = watchdog_enabled;
+			/*
+			 * If the parameter value is not zero set the
+			 * corresponding bit(s), else clear it(them).
+			 */
+			if (*watchdog_param)
+				new = old | which;
+			else
+				new = old & ~which;
+		} while (cmpxchg(&watchdog_enabled, old, new) != old);
+
+		/*
+		 * Update the run state of the lockup detectors.
+		 * Restore 'watchdog_enabled' on failure.
+		 */
+		err = proc_watchdog_update();
+		if (err)
+			watchdog_enabled = old;
+	}
+out:
+	mutex_unlock(&watchdog_proc_mutex);
+	return err;
+}
+
+/*
  * proc handler for /proc/sys/kernel/nmi_watchdog,watchdog_thresh
  */
 
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 5/9] watchdog: introduce separate handlers for parameters in /proc/sys/kernel
  2014-11-04 16:20 [PATCH 0/9] watchdog: internally split softlockup and hardlockup Don Zickus
                   ` (3 preceding siblings ...)
  2014-11-04 16:20 ` [PATCH 4/9] watchdog: introduce the proc_watchdog_common() function Don Zickus
@ 2014-11-04 16:20 ` Don Zickus
  2014-11-04 16:20 ` [PATCH 6/9] watchdog: implement error handling for failure to set up hardware perf events Don Zickus
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Don Zickus @ 2014-11-04 16:20 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ingo Molnar, LKML, uobergfe, Don Zickus

From: Ulrich Obergfell <uobergfe@redhat.com>

Separate handlers for each watchdog parameter in /proc/sys/kernel
replace the proc_dowatchdog() function. Three of those handlers
merely call proc_watchdog_common() with one different argument.

Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
---
 include/linux/nmi.h |    8 +++++++
 kernel/watchdog.c   |   59 +++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 67 insertions(+), 0 deletions(-)

diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index 3885a7d..5b54505 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -74,6 +74,14 @@ extern int watchdog_user_enabled;
 extern int watchdog_thresh;
 extern int sysctl_softlockup_all_cpu_backtrace;
 struct ctl_table;
+extern int proc_watchdog(struct ctl_table *, int ,
+			 void __user *, size_t *, loff_t *);
+extern int proc_nmi_watchdog(struct ctl_table *, int ,
+			     void __user *, size_t *, loff_t *);
+extern int proc_soft_watchdog(struct ctl_table *, int ,
+			      void __user *, size_t *, loff_t *);
+extern int proc_watchdog_thresh(struct ctl_table *, int ,
+				void __user *, size_t *, loff_t *);
 extern int proc_dowatchdog(struct ctl_table *, int ,
 			   void __user *, size_t *, loff_t *);
 #endif
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 7626208..2897a5e 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -769,6 +769,65 @@ out:
 }
 
 /*
+ * /proc/sys/kernel/watchdog
+ */
+int proc_watchdog(struct ctl_table *table, int write,
+		  void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	return proc_watchdog_common(NMI_WATCHDOG_ENABLED|SOFT_WATCHDOG_ENABLED,
+				    table, write, buffer, lenp, ppos);
+}
+
+/*
+ * /proc/sys/kernel/nmi_watchdog
+ */
+int proc_nmi_watchdog(struct ctl_table *table, int write,
+		      void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	return proc_watchdog_common(NMI_WATCHDOG_ENABLED,
+				    table, write, buffer, lenp, ppos);
+}
+
+/*
+ * /proc/sys/kernel/soft_watchdog
+ */
+int proc_soft_watchdog(struct ctl_table *table, int write,
+			void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	return proc_watchdog_common(SOFT_WATCHDOG_ENABLED,
+				    table, write, buffer, lenp, ppos);
+}
+
+/*
+ * /proc/sys/kernel/watchdog_thresh
+ */
+int proc_watchdog_thresh(struct ctl_table *table, int write,
+			 void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	int err, old;
+
+	mutex_lock(&watchdog_proc_mutex);
+
+	old = ACCESS_ONCE(watchdog_thresh);
+	err = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+
+	if (err || !write)
+		goto out;
+
+	/*
+	 * Update the sample period.
+	 * Restore 'watchdog_thresh' on failure.
+	 */
+	set_sample_period();
+	err = proc_watchdog_update();
+	if (err)
+		watchdog_thresh = old;
+out:
+	mutex_unlock(&watchdog_proc_mutex);
+	return err;
+}
+
+/*
  * proc handler for /proc/sys/kernel/nmi_watchdog,watchdog_thresh
  */
 
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 6/9] watchdog: implement error handling for failure to set up hardware perf events
  2014-11-04 16:20 [PATCH 0/9] watchdog: internally split softlockup and hardlockup Don Zickus
                   ` (4 preceding siblings ...)
  2014-11-04 16:20 ` [PATCH 5/9] watchdog: introduce separate handlers for parameters in /proc/sys/kernel Don Zickus
@ 2014-11-04 16:20 ` Don Zickus
  2014-11-04 16:20 ` [PATCH 7/9] watchdog: enable the new user interface of the watchdog mechanism Don Zickus
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Don Zickus @ 2014-11-04 16:20 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ingo Molnar, LKML, uobergfe, Don Zickus

From: Ulrich Obergfell <uobergfe@redhat.com>

If watchdog_nmi_enable() fails to set up the hardware perf event
of one CPU, the entire hard lockup detector is deemed unreliable.
Hence, disable the hard lockup detector and shut down the hardware
perf events on all CPUs.

Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
---
 kernel/watchdog.c |   18 ++++++++++++++++++
 1 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 2897a5e..86ba5e5 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -502,6 +502,15 @@ static void watchdog(unsigned int cpu)
 	__this_cpu_write(soft_lockup_hrtimer_cnt,
 			 __this_cpu_read(hrtimer_interrupts));
 	__touch_watchdog();
+
+	/*
+	 * watchdog_nmi_enable() clears the NMI_WATCHDOG_ENABLED bit in the
+	 * failure path. Check for failures that can occur asynchronously -
+	 * for example, when CPUs are on-lined - and shut down the hardware
+	 * perf event on each CPU accordingly.
+	 */
+	if (!(watchdog_enabled & NMI_WATCHDOG_ENABLED))
+		watchdog_nmi_disable(cpu);
 }
 
 #ifdef CONFIG_HARDLOCKUP_DETECTOR
@@ -552,6 +561,15 @@ handle_err:
 		goto out_save;
 	}
 
+	/*
+	 * Disable the hard lockup detector if _any_ CPU fails to set up
+	 * set up the hardware perf event. The watchdog() function checks
+	 * the NMI_WATCHDOG_ENABLED bit periodically.
+	 */
+	smp_mb__before_atomic();
+	clear_bit(NMI_WATCHDOG_ENABLED_BIT, &watchdog_enabled);
+	smp_mb__after_atomic();
+
 	/* skip displaying the same error again */
 	if (cpu > 0 && (PTR_ERR(event) == cpu0_err))
 		return PTR_ERR(event);
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 7/9] watchdog: enable the new user interface of the watchdog mechanism
  2014-11-04 16:20 [PATCH 0/9] watchdog: internally split softlockup and hardlockup Don Zickus
                   ` (5 preceding siblings ...)
  2014-11-04 16:20 ` [PATCH 6/9] watchdog: implement error handling for failure to set up hardware perf events Don Zickus
@ 2014-11-04 16:20 ` Don Zickus
  2014-11-04 16:20 ` [PATCH 8/9] watchdog: clean up some function names and arguments Don Zickus
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Don Zickus @ 2014-11-04 16:20 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ingo Molnar, LKML, uobergfe, Don Zickus

From: Ulrich Obergfell <uobergfe@redhat.com>

With the current user interface of the watchdog mechanism it is only
possible to disable or enable both lockup detectors at the same time.
This series introduces new kernel parameters and changes the semantics
of some existing kernel parameters, so that the hard lockup detector
and the soft lockup detector can be disabled or enabled individually.
With this series applied, the user interface is as follows.

- parameters in /proc/sys/kernel

  . soft_watchdog
    This is a new parameter to control and examine the run state of
    the soft lockup detector.

  . nmi_watchdog
    The semantics of this parameter have changed. It can now be used
    to control and examine the run state of the hard lockup detector.

  . watchdog
    This parameter is still available to control the run state of both
    lockup detectors at the same time. If this parameter is examined,
    it shows the logical OR of soft_watchdog and nmi_watchdog.

  . watchdog_thresh
    The semantics of this parameter are not affected by the patch.

- kernel command line parameters

  . nosoftlockup
    The semantics of this parameter have changed. It can now be used
    to disable the soft lockup detector at boot time.

  . nmi_watchdog=0 or nmi_watchdog=1
    Disable or enable the hard lockup detector at boot time. The patch
    introduces '=1' as a new option.

  . nowatchdog
    The semantics of this parameter are not affected by the patch. It
    is still available to disable both lockup detectors at boot time.

Also, remove the proc_dowatchdog() function which is no longer needed.

Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
---
 include/linux/nmi.h |    2 -
 kernel/sysctl.c     |   35 +++++++++++++++-------
 kernel/watchdog.c   |   81 ++++++++++-----------------------------------------
 3 files changed, 40 insertions(+), 78 deletions(-)

diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index 5b54505..0426357 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -82,8 +82,6 @@ extern int proc_soft_watchdog(struct ctl_table *, int ,
 			      void __user *, size_t *, loff_t *);
 extern int proc_watchdog_thresh(struct ctl_table *, int ,
 				void __user *, size_t *, loff_t *);
-extern int proc_dowatchdog(struct ctl_table *, int ,
-			   void __user *, size_t *, loff_t *);
 #endif
 
 #ifdef CONFIG_HAVE_ACPI_APEI_NMI
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 4aada6d..1875cf0 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -838,7 +838,7 @@ static struct ctl_table kern_table[] = {
 		.data           = &watchdog_user_enabled,
 		.maxlen         = sizeof (int),
 		.mode           = 0644,
-		.proc_handler   = proc_dowatchdog,
+		.proc_handler   = proc_watchdog,
 		.extra1		= &zero,
 		.extra2		= &one,
 	},
@@ -847,11 +847,33 @@ static struct ctl_table kern_table[] = {
 		.data		= &watchdog_thresh,
 		.maxlen		= sizeof(int),
 		.mode		= 0644,
-		.proc_handler	= proc_dowatchdog,
+		.proc_handler	= proc_watchdog_thresh,
 		.extra1		= &zero,
 		.extra2		= &sixty,
 	},
 	{
+		.procname       = "nmi_watchdog",
+		.data           = &nmi_watchdog_enabled,
+		.maxlen         = sizeof (int),
+		.mode           = 0644,
+		.proc_handler   = proc_nmi_watchdog,
+		.extra1		= &zero,
+#if defined(CONFIG_HAVE_NMI_WATCHDOG) || defined(CONFIG_HARDLOCKUP_DETECTOR)
+		.extra2		= &one,
+#else
+		.extra2		= &zero,
+#endif
+	},
+	{
+		.procname       = "soft_watchdog",
+		.data           = &soft_watchdog_enabled,
+		.maxlen         = sizeof (int),
+		.mode           = 0644,
+		.proc_handler   = proc_soft_watchdog,
+		.extra1		= &zero,
+		.extra2		= &one,
+	},
+	{
 		.procname	= "softlockup_panic",
 		.data		= &softlockup_panic,
 		.maxlen		= sizeof(int),
@@ -871,15 +893,6 @@ static struct ctl_table kern_table[] = {
 		.extra2		= &one,
 	},
 #endif /* CONFIG_SMP */
-	{
-		.procname       = "nmi_watchdog",
-		.data           = &watchdog_user_enabled,
-		.maxlen         = sizeof (int),
-		.mode           = 0644,
-		.proc_handler   = proc_dowatchdog,
-		.extra1		= &zero,
-		.extra2		= &one,
-	},
 #endif
 #if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86)
 	{
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 86ba5e5..7043274 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -110,15 +110,9 @@ static int __init hardlockup_panic_setup(char *str)
 	else if (!strncmp(str, "nopanic", 7))
 		hardlockup_panic = 0;
 	else if (!strncmp(str, "0", 1))
-		watchdog_user_enabled = 0;
-	else if (!strncmp(str, "1", 1) || !strncmp(str, "2", 1)) {
-		/*
-		 * Setting 'nmi_watchdog=1' or 'nmi_watchdog=2' (legacy option)
-		 * has the same effect.
-		 */
-		watchdog_user_enabled = 1;
-		watchdog_enable_hardlockup_detector(true);
-	}
+		watchdog_enabled &= ~NMI_WATCHDOG_ENABLED;
+	else if (!strncmp(str, "1", 1))
+		watchdog_enabled |= NMI_WATCHDOG_ENABLED;
 	return 1;
 }
 __setup("nmi_watchdog=", hardlockup_panic_setup);
@@ -137,19 +131,18 @@ __setup("softlockup_panic=", softlockup_panic_setup);
 
 static int __init nowatchdog_setup(char *str)
 {
-	watchdog_user_enabled = 0;
+	watchdog_enabled = 0;
 	return 1;
 }
 __setup("nowatchdog", nowatchdog_setup);
 
-/* deprecated */
 static int __init nosoftlockup_setup(char *str)
 {
-	watchdog_user_enabled = 0;
+	watchdog_enabled &= ~SOFT_WATCHDOG_ENABLED;
 	return 1;
 }
 __setup("nosoftlockup", nosoftlockup_setup);
-/*  */
+
 #ifdef CONFIG_SMP
 static int __init softlockup_all_cpu_backtrace_setup(char *str)
 {
@@ -264,10 +257,11 @@ static int is_softlockup(unsigned long touch_ts)
 {
 	unsigned long now = get_timestamp();
 
-	/* Warn about unreasonable delays: */
-	if (time_after(now, touch_ts + get_softlockup_thresh()))
-		return now - touch_ts;
-
+	if (watchdog_enabled & SOFT_WATCHDOG_ENABLED) {
+		/* Warn about unreasonable delays. */
+		if (time_after(now, touch_ts + get_softlockup_thresh()))
+			return now - touch_ts;
+	}
 	return 0;
 }
 
@@ -526,6 +520,10 @@ static int watchdog_nmi_enable(unsigned int cpu)
 	struct perf_event_attr *wd_attr;
 	struct perf_event *event = per_cpu(watchdog_ev, cpu);
 
+	/* nothing to do if the hard lockup detector is disabled */
+	if (!(watchdog_enabled & NMI_WATCHDOG_ENABLED))
+		goto out;
+
 	/*
 	 * Some kernels need to default hard lockup detection to
 	 * 'disabled', for example a guest on a hypervisor.
@@ -844,59 +842,12 @@ out:
 	mutex_unlock(&watchdog_proc_mutex);
 	return err;
 }
-
-/*
- * proc handler for /proc/sys/kernel/nmi_watchdog,watchdog_thresh
- */
-
-int proc_dowatchdog(struct ctl_table *table, int write,
-		    void __user *buffer, size_t *lenp, loff_t *ppos)
-{
-	int err, old_thresh, old_enabled;
-	bool old_hardlockup;
-
-	mutex_lock(&watchdog_proc_mutex);
-	old_thresh = ACCESS_ONCE(watchdog_thresh);
-	old_enabled = ACCESS_ONCE(watchdog_user_enabled);
-	old_hardlockup = watchdog_hardlockup_detector_is_enabled();
-
-	err = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
-	if (err || !write)
-		goto out;
-
-	set_sample_period();
-	/*
-	 * Watchdog threads shouldn't be enabled if they are
-	 * disabled. The 'watchdog_running' variable check in
-	 * watchdog_*_all_cpus() function takes care of this.
-	 */
-	if (watchdog_user_enabled && watchdog_thresh) {
-		/*
-		 * Prevent a change in watchdog_thresh accidentally overriding
-		 * the enablement of the hardlockup detector.
-		 */
-		if (watchdog_user_enabled != old_enabled)
-			watchdog_enable_hardlockup_detector(true);
-		err = watchdog_enable_all_cpus(old_thresh != watchdog_thresh);
-	} else
-		watchdog_disable_all_cpus();
-
-	/* Restore old values on failure */
-	if (err) {
-		watchdog_thresh = old_thresh;
-		watchdog_user_enabled = old_enabled;
-		watchdog_enable_hardlockup_detector(old_hardlockup);
-	}
-out:
-	mutex_unlock(&watchdog_proc_mutex);
-	return err;
-}
 #endif /* CONFIG_SYSCTL */
 
 void __init lockup_detector_init(void)
 {
 	set_sample_period();
 
-	if (watchdog_user_enabled)
+	if (watchdog_enabled)
 		watchdog_enable_all_cpus(false);
 }
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 8/9] watchdog: clean up some function names and arguments
  2014-11-04 16:20 [PATCH 0/9] watchdog: internally split softlockup and hardlockup Don Zickus
                   ` (6 preceding siblings ...)
  2014-11-04 16:20 ` [PATCH 7/9] watchdog: enable the new user interface of the watchdog mechanism Don Zickus
@ 2014-11-04 16:20 ` Don Zickus
  2014-11-04 16:20 ` [PATCH 9/9] watchdog: introduce the hardlockup_detector_disable() function Don Zickus
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Don Zickus @ 2014-11-04 16:20 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ingo Molnar, LKML, uobergfe, Don Zickus

From: Ulrich Obergfell <uobergfe@redhat.com>

Rename the update_timers*() functions to update_watchdog*().

Remove the boolean argument from watchdog_enable_all_cpus()
because update_watchdog_all_cpus() is now a generic function
to change the run state of the lockup detectors and to have
the lockup detectors use a new sample period.

Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
---
 kernel/watchdog.c |   20 ++++++++++++--------
 1 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 7043274..e804ea4 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -641,7 +641,7 @@ static void restart_watchdog_hrtimer(void *info)
 				HRTIMER_MODE_REL_PINNED);
 }
 
-static void update_timers(int cpu)
+static void update_watchdog(int cpu)
 {
 	/*
 	 * Make sure that perf event counter will adopt to a new
@@ -656,17 +656,17 @@ static void update_timers(int cpu)
 	watchdog_nmi_enable(cpu);
 }
 
-static void update_timers_all_cpus(void)
+static void update_watchdog_all_cpus(void)
 {
 	int cpu;
 
 	get_online_cpus();
 	for_each_online_cpu(cpu)
-		update_timers(cpu);
+		update_watchdog(cpu);
 	put_online_cpus();
 }
 
-static int watchdog_enable_all_cpus(bool sample_period_changed)
+static int watchdog_enable_all_cpus(void)
 {
 	int err = 0;
 
@@ -676,8 +676,12 @@ static int watchdog_enable_all_cpus(bool sample_period_changed)
 			pr_err("Failed to create watchdog threads, disabled\n");
 		else
 			watchdog_running = 1;
-	} else if (sample_period_changed) {
-		update_timers_all_cpus();
+	} else {
+		/*
+		 * Enable/disable the lockup detectors or
+		 * change the sample period 'on the fly'.
+		 */
+		update_watchdog_all_cpus();
 	}
 
 	return err;
@@ -709,7 +713,7 @@ static int proc_watchdog_update(void)
 	 * or disabled 'on the fly'.
 	 */
 	if (watchdog_enabled && watchdog_thresh)
-		err = watchdog_enable_all_cpus(true);
+		err = watchdog_enable_all_cpus();
 	else
 		watchdog_disable_all_cpus();
 
@@ -849,5 +853,5 @@ void __init lockup_detector_init(void)
 	set_sample_period();
 
 	if (watchdog_enabled)
-		watchdog_enable_all_cpus(false);
+		watchdog_enable_all_cpus();
 }
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 9/9] watchdog: introduce the hardlockup_detector_disable() function
  2014-11-04 16:20 [PATCH 0/9] watchdog: internally split softlockup and hardlockup Don Zickus
                   ` (7 preceding siblings ...)
  2014-11-04 16:20 ` [PATCH 8/9] watchdog: clean up some function names and arguments Don Zickus
@ 2014-11-04 16:20 ` Don Zickus
  2014-12-17 21:25 ` [PATCH 0/9] watchdog: internally split softlockup and hardlockup Don Zickus
  2015-01-21 14:15 ` Don Zickus
  10 siblings, 0 replies; 16+ messages in thread
From: Don Zickus @ 2014-11-04 16:20 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ingo Molnar, LKML, uobergfe, Don Zickus

From: Ulrich Obergfell <uobergfe@redhat.com>

Have kvm_guest_init() use hardlockup_detector_disable()
instead of watchdog_enable_hardlockup_detector(false).

Remove the watchdog_hardlockup_detector_is_enabled() and
the watchdog_enable_hardlockup_detector() function which
are no longer needed.

Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
---
 arch/x86/kernel/kvm.c |    2 +-
 include/linux/nmi.h   |    9 ++-------
 kernel/watchdog.c     |   21 ++-------------------
 3 files changed, 5 insertions(+), 27 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index f6945be..3d310a4 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -506,7 +506,7 @@ void __init kvm_guest_init(void)
 	 * can get false positives too easily, for example if the host is
 	 * overcommitted.
 	 */
-	watchdog_enable_hardlockup_detector(false);
+	hardlockup_detector_disable();
 }
 
 static noinline uint32_t __kvm_cpuid_base(void)
diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index 0426357..3d46fb4 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -25,16 +25,11 @@ static inline void touch_nmi_watchdog(void)
 #endif
 
 #if defined(CONFIG_HARDLOCKUP_DETECTOR)
-extern void watchdog_enable_hardlockup_detector(bool val);
-extern bool watchdog_hardlockup_detector_is_enabled(void);
+extern void hardlockup_detector_disable(void);
 #else
-static inline void watchdog_enable_hardlockup_detector(bool val)
+static inline void hardlockup_detector_disable(void)
 {
 }
-static inline bool watchdog_hardlockup_detector_is_enabled(void)
-{
-	return true;
-}
 #endif
 
 /*
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index e804ea4..57d931a 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -83,8 +83,6 @@ static unsigned long soft_lockup_nmi_warn;
 #ifdef CONFIG_HARDLOCKUP_DETECTOR
 static int hardlockup_panic =
 			CONFIG_BOOTPARAM_HARDLOCKUP_PANIC_VALUE;
-
-static bool hardlockup_detector_enabled = true;
 /*
  * We may not want to enable hard lockup detection by default in all cases,
  * for example when running the kernel as a guest on a hypervisor. In these
@@ -93,14 +91,9 @@ static bool hardlockup_detector_enabled = true;
  * kernel command line parameters are parsed, because otherwise it is not
  * possible to override this in hardlockup_panic_setup().
  */
-void watchdog_enable_hardlockup_detector(bool val)
-{
-	hardlockup_detector_enabled = val;
-}
-
-bool watchdog_hardlockup_detector_is_enabled(void)
+void hardlockup_detector_disable(void)
 {
-	return hardlockup_detector_enabled;
+	watchdog_enabled &= ~NMI_WATCHDOG_ENABLED;
 }
 
 static int __init hardlockup_panic_setup(char *str)
@@ -524,15 +517,6 @@ static int watchdog_nmi_enable(unsigned int cpu)
 	if (!(watchdog_enabled & NMI_WATCHDOG_ENABLED))
 		goto out;
 
-	/*
-	 * Some kernels need to default hard lockup detection to
-	 * 'disabled', for example a guest on a hypervisor.
-	 */
-	if (!watchdog_hardlockup_detector_is_enabled()) {
-		event = ERR_PTR(-ENOENT);
-		goto handle_err;
-	}
-
 	/* is it already setup and enabled? */
 	if (event && event->state > PERF_EVENT_STATE_OFF)
 		goto out;
@@ -547,7 +531,6 @@ static int watchdog_nmi_enable(unsigned int cpu)
 	/* Try to register using hardware perf events */
 	event = perf_event_create_kernel_counter(wd_attr, cpu, NULL, watchdog_overflow_callback, NULL);
 
-handle_err:
 	/* save cpu0 error for future comparision */
 	if (cpu == 0 && IS_ERR(event))
 		cpu0_err = PTR_ERR(event);
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/9] watchdog: internally split softlockup and hardlockup
  2014-11-04 16:20 [PATCH 0/9] watchdog: internally split softlockup and hardlockup Don Zickus
                   ` (8 preceding siblings ...)
  2014-11-04 16:20 ` [PATCH 9/9] watchdog: introduce the hardlockup_detector_disable() function Don Zickus
@ 2014-12-17 21:25 ` Don Zickus
  2015-01-21 14:15 ` Don Zickus
  10 siblings, 0 replies; 16+ messages in thread
From: Don Zickus @ 2014-12-17 21:25 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ingo Molnar, LKML, uobergfe

Hi Andrew,

Poke.  What can I do to move this patch forward?

Cheers,
Don

On Tue, Nov 04, 2014 at 11:20:15AM -0500, Don Zickus wrote:
> The hardlockup and softockup had always been tied together.  Due to the request
> of KVM folks, they had a need to have one enabled but not the other.
> Internally rework the code to split things apart more cleanly.
> 
> There is a bunch of churn here, but the end result should be code that should
> be easier to maintain and fix without knowing the internals of what is going
> on.
> 
> Tested by Uli and myself.
> 
> Ulrich Obergfell (9):
>   watchdog: new definitions and variables, initialization
>   watchdog: introduce the proc_watchdog_update() function
>   watchdog: move definition of 'watchdog_proc_mutex' outside of
>     proc_dowatchdog()
>   watchdog: introduce the proc_watchdog_common() function
>   watchdog: introduce separate handlers for parameters in
>     /proc/sys/kernel
>   watchdog: implement error handling for failure to set up hardware
>     perf events
>   watchdog: enable the new user interface of the watchdog mechanism
>   watchdog: clean up some function names and arguments
>   watchdog: introduce the hardlockup_detector_disable() function
> 
>  arch/x86/kernel/kvm.c |    2 +-
>  include/linux/nmi.h   |   21 ++--
>  kernel/sysctl.c       |   35 +++++--
>  kernel/watchdog.c     |  277 ++++++++++++++++++++++++++++++++++++-------------
>  4 files changed, 240 insertions(+), 95 deletions(-)
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/9] watchdog: internally split softlockup and hardlockup
  2014-11-04 16:20 [PATCH 0/9] watchdog: internally split softlockup and hardlockup Don Zickus
                   ` (9 preceding siblings ...)
  2014-12-17 21:25 ` [PATCH 0/9] watchdog: internally split softlockup and hardlockup Don Zickus
@ 2015-01-21 14:15 ` Don Zickus
  2015-01-21 22:23   ` Andrew Morton
  10 siblings, 1 reply; 16+ messages in thread
From: Don Zickus @ 2015-01-21 14:15 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ingo Molnar, LKML, uobergfe

On Tue, Nov 04, 2014 at 11:20:15AM -0500, Don Zickus wrote:
> The hardlockup and softockup had always been tied together.  Due to the request
> of KVM folks, they had a need to have one enabled but not the other.
> Internally rework the code to split things apart more cleanly.
> 
> There is a bunch of churn here, but the end result should be code that should
> be easier to maintain and fix without knowing the internals of what is going
> on.
> 
> Tested by Uli and myself.

Hi Andrew,

Poking again.  What can I do to move this along?  Should I repost?

Cheers,
Don

> 
> Ulrich Obergfell (9):
>   watchdog: new definitions and variables, initialization
>   watchdog: introduce the proc_watchdog_update() function
>   watchdog: move definition of 'watchdog_proc_mutex' outside of
>     proc_dowatchdog()
>   watchdog: introduce the proc_watchdog_common() function
>   watchdog: introduce separate handlers for parameters in
>     /proc/sys/kernel
>   watchdog: implement error handling for failure to set up hardware
>     perf events
>   watchdog: enable the new user interface of the watchdog mechanism
>   watchdog: clean up some function names and arguments
>   watchdog: introduce the hardlockup_detector_disable() function
> 
>  arch/x86/kernel/kvm.c |    2 +-
>  include/linux/nmi.h   |   21 ++--
>  kernel/sysctl.c       |   35 +++++--
>  kernel/watchdog.c     |  277 ++++++++++++++++++++++++++++++++++++-------------
>  4 files changed, 240 insertions(+), 95 deletions(-)
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/9] watchdog: internally split softlockup and hardlockup
  2015-01-21 14:15 ` Don Zickus
@ 2015-01-21 22:23   ` Andrew Morton
  0 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2015-01-21 22:23 UTC (permalink / raw)
  To: Don Zickus; +Cc: Ingo Molnar, LKML, uobergfe

On Wed, 21 Jan 2015 09:15:52 -0500 Don Zickus <dzickus@redhat.com> wrote:

> On Tue, Nov 04, 2014 at 11:20:15AM -0500, Don Zickus wrote:
> > The hardlockup and softockup had always been tied together.  Due to the request
> > of KVM folks, they had a need to have one enabled but not the other.
> > Internally rework the code to split things apart more cleanly.
> > 
> > There is a bunch of churn here, but the end result should be code that should
> > be easier to maintain and fix without knowing the internals of what is going
> > on.
> > 
> > Tested by Uli and myself.
> 
> Hi Andrew,
> 
> Poking again.  What can I do to move this along?  Should I repost?

I don't actually remember seeing this, but November was soooo long ago ;)

Yes please - refresh, retest, resend.  Don't forget to gather up any
new acked/reviewed/tested-by's.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 7/9] watchdog: enable the new user interface of the watchdog mechanism
  2015-02-23 21:19   ` Andrew Morton
@ 2015-02-24 15:46     ` Don Zickus
  0 siblings, 0 replies; 16+ messages in thread
From: Don Zickus @ 2015-02-24 15:46 UTC (permalink / raw)
  To: Andrew Morton; +Cc: LKML, Ulrich Obergfell, Ingo Molnar

On Mon, Feb 23, 2015 at 01:19:46PM -0800, Andrew Morton wrote:
> On Thu,  5 Feb 2015 15:40:23 -0500 Don Zickus <dzickus@redhat.com> wrote:
> 
> > From: Ulrich Obergfell <uobergfe@redhat.com>
> > 
> > With the current user interface of the watchdog mechanism it is only
> > possible to disable or enable both lockup detectors at the same time.
> > This series introduces new kernel parameters and changes the semantics
> > of some existing kernel parameters, so that the hard lockup detector
> > and the soft lockup detector can be disabled or enabled individually.
> > With this series applied, the user interface is as follows.
> > 
> > - parameters in /proc/sys/kernel
> > 
> >   . soft_watchdog
> >     This is a new parameter to control and examine the run state of
> >     the soft lockup detector.
> > 
> >   . nmi_watchdog
> >     The semantics of this parameter have changed. It can now be used
> >     to control and examine the run state of the hard lockup detector.
> > 
> >   . watchdog
> >     This parameter is still available to control the run state of both
> >     lockup detectors at the same time. If this parameter is examined,
> >     it shows the logical OR of soft_watchdog and nmi_watchdog.
> > 
> >   . watchdog_thresh
> >     The semantics of this parameter are not affected by the patch.
> > 
> > - kernel command line parameters
> > 
> >   . nosoftlockup
> >     The semantics of this parameter have changed. It can now be used
> >     to disable the soft lockup detector at boot time.
> > 
> >   . nmi_watchdog=0 or nmi_watchdog=1
> >     Disable or enable the hard lockup detector at boot time. The patch
> >     introduces '=1' as a new option.
> > 
> >   . nowatchdog
> >     The semantics of this parameter are not affected by the patch. It
> >     is still available to disable both lockup detectors at boot time.
> 
> So we need a whole bunch of updates and additions to Documentation/?

Ok.

> 
> Are all these changes back-compatible with previous kernel versions?

I believe so.  The motivation for some of the change was the ambiguity of
the /proc/sys/kerne/watchdog variable.  This patchset clears that up.  So
if it breaks backward compatibility, it is because it was misinterpreted
from its original intention.

Old definition of the proc/sys/kernel variable:

watchdog  == nmi_watchdog_enabled

New definition:

watchdog == nmi_watchdog_enabled | softlockup_enabled


Cheers,
Don

> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 7/9] watchdog: enable the new user interface of the watchdog mechanism
  2015-02-05 20:40 ` [PATCH 7/9] watchdog: enable the new user interface of the watchdog mechanism Don Zickus
@ 2015-02-23 21:19   ` Andrew Morton
  2015-02-24 15:46     ` Don Zickus
  0 siblings, 1 reply; 16+ messages in thread
From: Andrew Morton @ 2015-02-23 21:19 UTC (permalink / raw)
  To: Don Zickus; +Cc: LKML, Ulrich Obergfell, Ingo Molnar

On Thu,  5 Feb 2015 15:40:23 -0500 Don Zickus <dzickus@redhat.com> wrote:

> From: Ulrich Obergfell <uobergfe@redhat.com>
> 
> With the current user interface of the watchdog mechanism it is only
> possible to disable or enable both lockup detectors at the same time.
> This series introduces new kernel parameters and changes the semantics
> of some existing kernel parameters, so that the hard lockup detector
> and the soft lockup detector can be disabled or enabled individually.
> With this series applied, the user interface is as follows.
> 
> - parameters in /proc/sys/kernel
> 
>   . soft_watchdog
>     This is a new parameter to control and examine the run state of
>     the soft lockup detector.
> 
>   . nmi_watchdog
>     The semantics of this parameter have changed. It can now be used
>     to control and examine the run state of the hard lockup detector.
> 
>   . watchdog
>     This parameter is still available to control the run state of both
>     lockup detectors at the same time. If this parameter is examined,
>     it shows the logical OR of soft_watchdog and nmi_watchdog.
> 
>   . watchdog_thresh
>     The semantics of this parameter are not affected by the patch.
> 
> - kernel command line parameters
> 
>   . nosoftlockup
>     The semantics of this parameter have changed. It can now be used
>     to disable the soft lockup detector at boot time.
> 
>   . nmi_watchdog=0 or nmi_watchdog=1
>     Disable or enable the hard lockup detector at boot time. The patch
>     introduces '=1' as a new option.
> 
>   . nowatchdog
>     The semantics of this parameter are not affected by the patch. It
>     is still available to disable both lockup detectors at boot time.

So we need a whole bunch of updates and additions to Documentation/?

Are all these changes back-compatible with previous kernel versions?



^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 7/9] watchdog: enable the new user interface of the watchdog mechanism
  2015-02-05 20:40 Don Zickus
@ 2015-02-05 20:40 ` Don Zickus
  2015-02-23 21:19   ` Andrew Morton
  0 siblings, 1 reply; 16+ messages in thread
From: Don Zickus @ 2015-02-05 20:40 UTC (permalink / raw)
  To: LKML; +Cc: Andrew Morton, Ulrich Obergfell, Don Zickus

From: Ulrich Obergfell <uobergfe@redhat.com>

With the current user interface of the watchdog mechanism it is only
possible to disable or enable both lockup detectors at the same time.
This series introduces new kernel parameters and changes the semantics
of some existing kernel parameters, so that the hard lockup detector
and the soft lockup detector can be disabled or enabled individually.
With this series applied, the user interface is as follows.

- parameters in /proc/sys/kernel

  . soft_watchdog
    This is a new parameter to control and examine the run state of
    the soft lockup detector.

  . nmi_watchdog
    The semantics of this parameter have changed. It can now be used
    to control and examine the run state of the hard lockup detector.

  . watchdog
    This parameter is still available to control the run state of both
    lockup detectors at the same time. If this parameter is examined,
    it shows the logical OR of soft_watchdog and nmi_watchdog.

  . watchdog_thresh
    The semantics of this parameter are not affected by the patch.

- kernel command line parameters

  . nosoftlockup
    The semantics of this parameter have changed. It can now be used
    to disable the soft lockup detector at boot time.

  . nmi_watchdog=0 or nmi_watchdog=1
    Disable or enable the hard lockup detector at boot time. The patch
    introduces '=1' as a new option.

  . nowatchdog
    The semantics of this parameter are not affected by the patch. It
    is still available to disable both lockup detectors at boot time.

Also, remove the proc_dowatchdog() function which is no longer needed.

Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
---
 include/linux/nmi.h |    2 -
 kernel/sysctl.c     |   35 +++++++++++++++-------
 kernel/watchdog.c   |   81 ++++++++++-----------------------------------------
 3 files changed, 40 insertions(+), 78 deletions(-)

diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index 5b54505..0426357 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -82,8 +82,6 @@ extern int proc_soft_watchdog(struct ctl_table *, int ,
 			      void __user *, size_t *, loff_t *);
 extern int proc_watchdog_thresh(struct ctl_table *, int ,
 				void __user *, size_t *, loff_t *);
-extern int proc_dowatchdog(struct ctl_table *, int ,
-			   void __user *, size_t *, loff_t *);
 #endif
 
 #ifdef CONFIG_HAVE_ACPI_APEI_NMI
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 88ea2d6..c155263 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -846,7 +846,7 @@ static struct ctl_table kern_table[] = {
 		.data           = &watchdog_user_enabled,
 		.maxlen         = sizeof (int),
 		.mode           = 0644,
-		.proc_handler   = proc_dowatchdog,
+		.proc_handler   = proc_watchdog,
 		.extra1		= &zero,
 		.extra2		= &one,
 	},
@@ -855,11 +855,33 @@ static struct ctl_table kern_table[] = {
 		.data		= &watchdog_thresh,
 		.maxlen		= sizeof(int),
 		.mode		= 0644,
-		.proc_handler	= proc_dowatchdog,
+		.proc_handler	= proc_watchdog_thresh,
 		.extra1		= &zero,
 		.extra2		= &sixty,
 	},
 	{
+		.procname       = "nmi_watchdog",
+		.data           = &nmi_watchdog_enabled,
+		.maxlen         = sizeof (int),
+		.mode           = 0644,
+		.proc_handler   = proc_nmi_watchdog,
+		.extra1		= &zero,
+#if defined(CONFIG_HAVE_NMI_WATCHDOG) || defined(CONFIG_HARDLOCKUP_DETECTOR)
+		.extra2		= &one,
+#else
+		.extra2		= &zero,
+#endif
+	},
+	{
+		.procname       = "soft_watchdog",
+		.data           = &soft_watchdog_enabled,
+		.maxlen         = sizeof (int),
+		.mode           = 0644,
+		.proc_handler   = proc_soft_watchdog,
+		.extra1		= &zero,
+		.extra2		= &one,
+	},
+	{
 		.procname	= "softlockup_panic",
 		.data		= &softlockup_panic,
 		.maxlen		= sizeof(int),
@@ -879,15 +901,6 @@ static struct ctl_table kern_table[] = {
 		.extra2		= &one,
 	},
 #endif /* CONFIG_SMP */
-	{
-		.procname       = "nmi_watchdog",
-		.data           = &watchdog_user_enabled,
-		.maxlen         = sizeof (int),
-		.mode           = 0644,
-		.proc_handler   = proc_dowatchdog,
-		.extra1		= &zero,
-		.extra2		= &one,
-	},
 #endif
 #if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86)
 	{
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 7ad8949..d29c945 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -110,15 +110,9 @@ static int __init hardlockup_panic_setup(char *str)
 	else if (!strncmp(str, "nopanic", 7))
 		hardlockup_panic = 0;
 	else if (!strncmp(str, "0", 1))
-		watchdog_user_enabled = 0;
-	else if (!strncmp(str, "1", 1) || !strncmp(str, "2", 1)) {
-		/*
-		 * Setting 'nmi_watchdog=1' or 'nmi_watchdog=2' (legacy option)
-		 * has the same effect.
-		 */
-		watchdog_user_enabled = 1;
-		watchdog_enable_hardlockup_detector(true);
-	}
+		watchdog_enabled &= ~NMI_WATCHDOG_ENABLED;
+	else if (!strncmp(str, "1", 1))
+		watchdog_enabled |= NMI_WATCHDOG_ENABLED;
 	return 1;
 }
 __setup("nmi_watchdog=", hardlockup_panic_setup);
@@ -137,19 +131,18 @@ __setup("softlockup_panic=", softlockup_panic_setup);
 
 static int __init nowatchdog_setup(char *str)
 {
-	watchdog_user_enabled = 0;
+	watchdog_enabled = 0;
 	return 1;
 }
 __setup("nowatchdog", nowatchdog_setup);
 
-/* deprecated */
 static int __init nosoftlockup_setup(char *str)
 {
-	watchdog_user_enabled = 0;
+	watchdog_enabled &= ~SOFT_WATCHDOG_ENABLED;
 	return 1;
 }
 __setup("nosoftlockup", nosoftlockup_setup);
-/*  */
+
 #ifdef CONFIG_SMP
 static int __init softlockup_all_cpu_backtrace_setup(char *str)
 {
@@ -264,10 +257,11 @@ static int is_softlockup(unsigned long touch_ts)
 {
 	unsigned long now = get_timestamp();
 
-	/* Warn about unreasonable delays: */
-	if (time_after(now, touch_ts + get_softlockup_thresh()))
-		return now - touch_ts;
-
+	if (watchdog_enabled & SOFT_WATCHDOG_ENABLED) {
+		/* Warn about unreasonable delays. */
+		if (time_after(now, touch_ts + get_softlockup_thresh()))
+			return now - touch_ts;
+	}
 	return 0;
 }
 
@@ -526,6 +520,10 @@ static int watchdog_nmi_enable(unsigned int cpu)
 	struct perf_event_attr *wd_attr;
 	struct perf_event *event = per_cpu(watchdog_ev, cpu);
 
+	/* nothing to do if the hard lockup detector is disabled */
+	if (!(watchdog_enabled & NMI_WATCHDOG_ENABLED))
+		goto out;
+
 	/*
 	 * Some kernels need to default hard lockup detection to
 	 * 'disabled', for example a guest on a hypervisor.
@@ -844,59 +842,12 @@ out:
 	mutex_unlock(&watchdog_proc_mutex);
 	return err;
 }
-
-/*
- * proc handler for /proc/sys/kernel/nmi_watchdog,watchdog_thresh
- */
-
-int proc_dowatchdog(struct ctl_table *table, int write,
-		    void __user *buffer, size_t *lenp, loff_t *ppos)
-{
-	int err, old_thresh, old_enabled;
-	bool old_hardlockup;
-
-	mutex_lock(&watchdog_proc_mutex);
-	old_thresh = ACCESS_ONCE(watchdog_thresh);
-	old_enabled = ACCESS_ONCE(watchdog_user_enabled);
-	old_hardlockup = watchdog_hardlockup_detector_is_enabled();
-
-	err = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
-	if (err || !write)
-		goto out;
-
-	set_sample_period();
-	/*
-	 * Watchdog threads shouldn't be enabled if they are
-	 * disabled. The 'watchdog_running' variable check in
-	 * watchdog_*_all_cpus() function takes care of this.
-	 */
-	if (watchdog_user_enabled && watchdog_thresh) {
-		/*
-		 * Prevent a change in watchdog_thresh accidentally overriding
-		 * the enablement of the hardlockup detector.
-		 */
-		if (watchdog_user_enabled != old_enabled)
-			watchdog_enable_hardlockup_detector(true);
-		err = watchdog_enable_all_cpus(old_thresh != watchdog_thresh);
-	} else
-		watchdog_disable_all_cpus();
-
-	/* Restore old values on failure */
-	if (err) {
-		watchdog_thresh = old_thresh;
-		watchdog_user_enabled = old_enabled;
-		watchdog_enable_hardlockup_detector(old_hardlockup);
-	}
-out:
-	mutex_unlock(&watchdog_proc_mutex);
-	return err;
-}
 #endif /* CONFIG_SYSCTL */
 
 void __init lockup_detector_init(void)
 {
 	set_sample_period();
 
-	if (watchdog_user_enabled)
+	if (watchdog_enabled)
 		watchdog_enable_all_cpus(false);
 }
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2015-02-24 15:46 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-11-04 16:20 [PATCH 0/9] watchdog: internally split softlockup and hardlockup Don Zickus
2014-11-04 16:20 ` [PATCH 1/9] watchdog: new definitions and variables, initialization Don Zickus
2014-11-04 16:20 ` [PATCH 2/9] watchdog: introduce the proc_watchdog_update() function Don Zickus
2014-11-04 16:20 ` [PATCH 3/9] watchdog: move definition of 'watchdog_proc_mutex' outside of proc_dowatchdog() Don Zickus
2014-11-04 16:20 ` [PATCH 4/9] watchdog: introduce the proc_watchdog_common() function Don Zickus
2014-11-04 16:20 ` [PATCH 5/9] watchdog: introduce separate handlers for parameters in /proc/sys/kernel Don Zickus
2014-11-04 16:20 ` [PATCH 6/9] watchdog: implement error handling for failure to set up hardware perf events Don Zickus
2014-11-04 16:20 ` [PATCH 7/9] watchdog: enable the new user interface of the watchdog mechanism Don Zickus
2014-11-04 16:20 ` [PATCH 8/9] watchdog: clean up some function names and arguments Don Zickus
2014-11-04 16:20 ` [PATCH 9/9] watchdog: introduce the hardlockup_detector_disable() function Don Zickus
2014-12-17 21:25 ` [PATCH 0/9] watchdog: internally split softlockup and hardlockup Don Zickus
2015-01-21 14:15 ` Don Zickus
2015-01-21 22:23   ` Andrew Morton
2015-02-05 20:40 Don Zickus
2015-02-05 20:40 ` [PATCH 7/9] watchdog: enable the new user interface of the watchdog mechanism Don Zickus
2015-02-23 21:19   ` Andrew Morton
2015-02-24 15:46     ` Don Zickus

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).