* [PATCH 0/5] watchdog: various fixes
@ 2014-08-11 14:49 Don Zickus
  2014-08-11 14:49 ` [PATCH 1/5] watchdog: remove unnecessary head files Don Zickus
                   ` (4 more replies)
  0 siblings, 5 replies; 34+ messages in thread
From: Don Zickus @ 2014-08-11 14:49 UTC
  To: akpm; +Cc: kvm, pbonzini, mingo, LKML, Don Zickus

Just respinning these patches with my sign-off.  I keep forgetting which is
easier for Andrew to digest (this way or just me replying with an ack).

Ulrich Obergfell (3):
  watchdog: fix print-once on enable
  watchdog: control hard lockup detection default
  kvm: ensure hard lockup detection is disabled by default

chai wen (2):
  watchdog: remove unnecessary head files
  softlockup: make detector be aware of task switch of processes
    hogging cpu

 arch/x86/kernel/kvm.c |    8 +++++
 include/linux/nmi.h   |    9 +++++
 kernel/watchdog.c     |   78 +++++++++++++++++++++++++++++++++++++++++++-----
 3 files changed, 86 insertions(+), 9 deletions(-)



* [PATCH 1/5] watchdog: remove unnecessary head files
  2014-08-11 14:49 [PATCH 0/5] watchdog: various fixes Don Zickus
@ 2014-08-11 14:49 ` Don Zickus
  2014-08-18 18:03   ` [tip:perf/watchdog] watchdog: Remove unnecessary header files tip-bot for chai wen
  2014-08-11 14:49 ` [PATCH 2/5] softlockup: make detector be aware of task switch of processes hogging cpu Don Zickus
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 34+ messages in thread
From: Don Zickus @ 2014-08-11 14:49 UTC
  To: akpm; +Cc: kvm, pbonzini, mingo, LKML, chai wen, Don Zickus

From: chai wen <chaiw.fnst@cn.fujitsu.com>

Signed-off-by: chai wen <chaiw.fnst@cn.fujitsu.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
---
 kernel/watchdog.c |    5 -----
 1 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index c3319bd..4c2e11c 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -15,11 +15,6 @@
 #include <linux/cpu.h>
 #include <linux/nmi.h>
 #include <linux/init.h>
-#include <linux/delay.h>
-#include <linux/freezer.h>
-#include <linux/kthread.h>
-#include <linux/lockdep.h>
-#include <linux/notifier.h>
 #include <linux/module.h>
 #include <linux/sysctl.h>
 #include <linux/smpboot.h>
-- 
1.7.1



* [PATCH 2/5] softlockup: make detector be aware of task switch of processes hogging cpu
  2014-08-11 14:49 [PATCH 0/5] watchdog: various fixes Don Zickus
  2014-08-11 14:49 ` [PATCH 1/5] watchdog: remove unnecessary head files Don Zickus
@ 2014-08-11 14:49 ` Don Zickus
  2014-08-18  9:03   ` Ingo Molnar
  2014-08-11 14:49 ` [PATCH 3/5] watchdog: fix print-once on enable Don Zickus
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 34+ messages in thread
From: Don Zickus @ 2014-08-11 14:49 UTC
  To: akpm; +Cc: kvm, pbonzini, mingo, LKML, chai wen, Don Zickus

From: chai wen <chaiw.fnst@cn.fujitsu.com>

Currently, the soft lockup detector warns only once for each case of process
softlockup.  But the 'watchdog/n' thread may not always get the cpu in the time
slot between the task switch of two processes hogging that cpu, so it cannot
reset soft_watchdog_warn.

An example would be two processes hogging the cpu.  Process A causes the
softlockup warning and is killed manually by a user.  Process B immediately
becomes the new process hogging the cpu preventing the softlockup code from
resetting the soft_watchdog_warn variable.

This case is a false negative of "warn only once for a process", as there may
be a different process that is going to hog the cpu.  Resolve this by
saving/checking the pid of the hogging process and using that to reset
soft_watchdog_warn too.

Signed-off-by: chai wen <chaiw.fnst@cn.fujitsu.com>
[modified the comment and changelog to be more specific]
Signed-off-by: Don Zickus <dzickus@redhat.com>
---
 kernel/watchdog.c |   20 ++++++++++++++++++--
 1 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 4c2e11c..6d0a891 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -42,6 +42,7 @@ static DEFINE_PER_CPU(bool, softlockup_touch_sync);
 static DEFINE_PER_CPU(bool, soft_watchdog_warn);
 static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts);
 static DEFINE_PER_CPU(unsigned long, soft_lockup_hrtimer_cnt);
+static DEFINE_PER_CPU(pid_t, softlockup_warn_pid_saved);
 #ifdef CONFIG_HARDLOCKUP_DETECTOR
 static DEFINE_PER_CPU(bool, hard_watchdog_warn);
 static DEFINE_PER_CPU(bool, watchdog_nmi_touch);
@@ -317,6 +318,8 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
 	 */
 	duration = is_softlockup(touch_ts);
 	if (unlikely(duration)) {
+		pid_t pid = task_pid_nr(current);
+
 		/*
 		 * If a virtual machine is stopped by the host it can look to
 		 * the watchdog like a soft lockup, check to see if the host
@@ -326,8 +329,20 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
 			return HRTIMER_RESTART;
 
 		/* only warn once */
-		if (__this_cpu_read(soft_watchdog_warn) == true)
+		if (__this_cpu_read(soft_watchdog_warn) == true) {
+
+			/*
+			 * Handle the case where multiple processes are
+			 * causing softlockups but the duration is small
+			 * enough, the softlockup detector can not reset
+			 * itself in time.  Use pids to detect this.
+			 */
+			if (__this_cpu_read(softlockup_warn_pid_saved) != pid) {
+				__this_cpu_write(soft_watchdog_warn, false);
+				__touch_watchdog();
+			}
 			return HRTIMER_RESTART;
+		}
 
 		if (softlockup_all_cpu_backtrace) {
 			/* Prevent multiple soft-lockup reports if one cpu is already
@@ -342,7 +357,8 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
 
 		printk(KERN_EMERG "BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n",
 			smp_processor_id(), duration,
-			current->comm, task_pid_nr(current));
+			current->comm, pid);
+		__this_cpu_write(softlockup_warn_pid_saved, pid);
 		print_modules();
 		print_irqtrace_events(current);
 		if (regs)
-- 
1.7.1
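
The warn-once behaviour changed by this patch can be modelled in userspace.
The following is a minimal sketch, not the kernel code: the struct and the
function name are hypothetical, the per-CPU variables are replaced by a plain
struct, and each call stands for one timer tick on which is_softlockup()
reported a lockup.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU state of one CPU. */
struct softlockup_state {
	bool warned;     /* models soft_watchdog_warn */
	int  warned_pid; /* models softlockup_warn_pid_saved */
};

/*
 * Models one timer tick on which a soft lockup was detected while the
 * task with 'pid' is running.  Returns true if a warning would be
 * printed on this tick.
 */
static bool softlockup_tick(struct softlockup_state *s, int pid)
{
	assert(s);

	if (s->warned) {
		/* A different task is hogging the cpu: re-arm the warning. */
		if (s->warned_pid != pid)
			s->warned = false;
		/* As in the patch, nothing is printed on this tick. */
		return false;
	}

	s->warned = true;
	s->warned_pid = pid;
	return true;
}
```

With a zero-initialised state, calling softlockup_tick() for pid 100 twice
and then for pid 200 twice would warn on the first and fourth call.  The
third call only re-arms the flag, mirroring the early return in the patch.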



* [PATCH 3/5] watchdog: fix print-once on enable
  2014-08-11 14:49 [PATCH 0/5] watchdog: various fixes Don Zickus
  2014-08-11 14:49 ` [PATCH 1/5] watchdog: remove unnecessary head files Don Zickus
  2014-08-11 14:49 ` [PATCH 2/5] softlockup: make detector be aware of task switch of processes hogging cpu Don Zickus
@ 2014-08-11 14:49 ` Don Zickus
  2014-08-18  9:05   ` Ingo Molnar
                     ` (2 more replies)
  2014-08-11 14:49 ` [PATCH 4/5] watchdog: control hard lockup detection default Don Zickus
  2014-08-11 14:49 ` [PATCH 5/5] kvm: ensure hard lockup detection is disabled by default Don Zickus
  4 siblings, 3 replies; 34+ messages in thread
From: Don Zickus @ 2014-08-11 14:49 UTC
  To: akpm
  Cc: kvm, pbonzini, mingo, LKML, Ulrich Obergfell, Andrew Jones, Don Zickus

From: Ulrich Obergfell <uobergfe@redhat.com>

This patch avoids printing the message 'enabled on all CPUs, ...'
multiple times. For example, the issue can occur in the following
scenario:

1) watchdog_nmi_enable() fails to enable PMU counters and sets
   cpu0_err.

2) 'echo [0|1] > /proc/sys/kernel/nmi_watchdog' is executed to
   disable and re-enable the watchdog mechanism 'on the fly'.

3) If watchdog_nmi_enable() then succeeds in enabling PMU counters,
   each CPU will print the message because step 1 left behind a
   non-zero cpu0_err.

   if (!IS_ERR(event)) {
       if (cpu == 0 || cpu0_err)
           pr_info("enabled on all CPUs, ...")

The patch avoids this by clearing cpu0_err in watchdog_nmi_disable().

Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
---
 kernel/watchdog.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 6d0a891..0838685 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -522,6 +522,9 @@ static void watchdog_nmi_disable(unsigned int cpu)
 		/* should be in cleanup, but blocks oprofile */
 		perf_event_release_kernel(event);
 	}
+	if (cpu == 0)
+		/* watchdog_nmi_enable() expects this to be zero initially. */
+		cpu0_err = 0;
 	return;
 }
 #else
-- 
1.7.1
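
The scenario from the changelog can be replayed in a small userspace model.
This is only a sketch under stated assumptions: all demo_* names, the CPU
count and the error value are hypothetical, and the functions model nothing
but the cpu0_err bookkeeping of watchdog_nmi_enable() and
watchdog_nmi_disable().

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_NR_CPUS 4      /* hypothetical number of online CPUs */
#define DEMO_ERR     (-2)   /* stand-in for a PMU setup error */

static int demo_cpu0_err;   /* models cpu0_err */

/*
 * Models watchdog_nmi_enable(): err == 0 means the PMU counter was set
 * up.  Returns true if "enabled on all CPUs, ..." would be printed.
 */
static bool demo_nmi_enable(unsigned int cpu, int err)
{
	assert(cpu < DEMO_NR_CPUS);

	if (cpu == 0 && err)
		demo_cpu0_err = err;    /* save cpu0 error */
	if (!err)
		return cpu == 0 || demo_cpu0_err;
	return false;
}

/* Models watchdog_nmi_disable(), with or without the fix of this patch. */
static void demo_nmi_disable(unsigned int cpu, bool apply_fix)
{
	if (apply_fix && cpu == 0)
		demo_cpu0_err = 0;
}

/* Replays steps 1) to 3) and counts how often the message is printed. */
static int demo_count_messages(bool apply_fix)
{
	unsigned int cpu;
	int printed = 0;

	demo_cpu0_err = 0;
	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		demo_nmi_enable(cpu, DEMO_ERR);         /* step 1: fails */
	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		demo_nmi_disable(cpu, apply_fix);       /* step 2: echo 0 */
	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		printed += demo_nmi_enable(cpu, 0);     /* step 3: echo 1 */
	return printed;
}
```

In this model demo_count_messages(false) counts one message per CPU, while
demo_count_messages(true) counts a single message from CPU 0.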



* [PATCH 4/5] watchdog: control hard lockup detection default
  2014-08-11 14:49 [PATCH 0/5] watchdog: various fixes Don Zickus
                   ` (2 preceding siblings ...)
  2014-08-11 14:49 ` [PATCH 3/5] watchdog: fix print-once on enable Don Zickus
@ 2014-08-11 14:49 ` Don Zickus
  2014-08-18  9:12   ` Ingo Molnar
  2014-08-18  9:16   ` Ingo Molnar
  2014-08-11 14:49 ` [PATCH 5/5] kvm: ensure hard lockup detection is disabled by default Don Zickus
  4 siblings, 2 replies; 34+ messages in thread
From: Don Zickus @ 2014-08-11 14:49 UTC
  To: akpm
  Cc: kvm, pbonzini, mingo, LKML, Ulrich Obergfell, Andrew Jones, Don Zickus

From: Ulrich Obergfell <uobergfe@redhat.com>

In some cases we don't want hard lockup detection enabled by default.
An example is when running as a guest. Introduce

  watchdog_enable_hardlockup_detector(bool)

allowing those cases to disable hard lockup detection. This must be
executed early by the boot processor from e.g. smp_prepare_boot_cpu,
in order to allow kernel command line arguments to override it, as
well as to avoid hard lockup detection being enabled before we've
had a chance to indicate that it's unwanted. In summary,

  initial boot:					default=enabled
  smp_prepare_boot_cpu
    watchdog_enable_hardlockup_detector(false):	default=disabled
  cmdline has 'nmi_watchdog=1':			default=enabled

The running kernel still has the ability to enable/disable at any
time with /proc/sys/kernel/nmi_watchdog as usual. However, even
when the default has been overridden /proc/sys/kernel/nmi_watchdog
will initially show '1'. To truly turn it on one must disable/enable
it, i.e.
  echo 0 > /proc/sys/kernel/nmi_watchdog
  echo 1 > /proc/sys/kernel/nmi_watchdog

This patch will be immediately useful for KVM with the next patch
of this series. Other hypervisor guest types may find it useful as
well.

Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
---
 include/linux/nmi.h |    9 +++++++++
 kernel/watchdog.c   |   50 ++++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 57 insertions(+), 2 deletions(-)

diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index 447775e..72aacf4 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -17,11 +17,20 @@
 #if defined(CONFIG_HAVE_NMI_WATCHDOG) || defined(CONFIG_HARDLOCKUP_DETECTOR)
 #include <asm/nmi.h>
 extern void touch_nmi_watchdog(void);
+extern void watchdog_enable_hardlockup_detector(bool val);
+extern bool watchdog_hardlockup_detector_is_enabled(void);
 #else
 static inline void touch_nmi_watchdog(void)
 {
 	touch_softlockup_watchdog();
 }
+static inline void watchdog_enable_hardlockup_detector(bool val)
+{
+}
+static inline bool watchdog_hardlockup_detector_is_enabled(void)
+{
+	return true;
+}
 #endif
 
 /*
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 0838685..8cb24dc 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -59,6 +59,25 @@ static unsigned long soft_lockup_nmi_warn;
 static int hardlockup_panic =
 			CONFIG_BOOTPARAM_HARDLOCKUP_PANIC_VALUE;
 
+static bool hardlockup_detector_enabled = true;
+/*
+ * We may not want to enable hard lockup detection by default in all cases,
+ * for example when running the kernel as a guest on a hypervisor. In these
+ * cases this function can be called to disable hard lockup detection. This
+ * function should only be executed once by the boot processor before the
+ * kernel command line parameters are parsed, because otherwise it is not
+ * possible to override this in hardlockup_panic_setup().
+ */
+void watchdog_enable_hardlockup_detector(bool val)
+{
+	hardlockup_detector_enabled = val;
+}
+
+bool watchdog_hardlockup_detector_is_enabled(void)
+{
+	return hardlockup_detector_enabled;
+}
+
 static int __init hardlockup_panic_setup(char *str)
 {
 	if (!strncmp(str, "panic", 5))
@@ -67,6 +86,14 @@ static int __init hardlockup_panic_setup(char *str)
 		hardlockup_panic = 0;
 	else if (!strncmp(str, "0", 1))
 		watchdog_user_enabled = 0;
+	else if (!strncmp(str, "1", 1) || !strncmp(str, "2", 1)) {
+		/*
+		 * Setting 'nmi_watchdog=1' or 'nmi_watchdog=2' (legacy option)
+		 * has the same effect.
+		 */
+		watchdog_user_enabled = 1;
+		watchdog_enable_hardlockup_detector(true);
+	}
 	return 1;
 }
 __setup("nmi_watchdog=", hardlockup_panic_setup);
@@ -462,6 +489,15 @@ static int watchdog_nmi_enable(unsigned int cpu)
 	struct perf_event_attr *wd_attr;
 	struct perf_event *event = per_cpu(watchdog_ev, cpu);
 
+	/*
+	 * Some kernels need to default hard lockup detection to
+	 * 'disabled', for example a guest on a hypervisor.
+	 */
+	if (!watchdog_hardlockup_detector_is_enabled()) {
+		event = ERR_PTR(-ENOENT);
+		goto handle_err;
+	}
+
 	/* is it already setup and enabled? */
 	if (event && event->state > PERF_EVENT_STATE_OFF)
 		goto out;
@@ -476,6 +512,7 @@ static int watchdog_nmi_enable(unsigned int cpu)
 	/* Try to register using hardware perf events */
 	event = perf_event_create_kernel_counter(wd_attr, cpu, NULL, watchdog_overflow_callback, NULL);
 
+handle_err:
 	/* save cpu0 error for future comparision */
 	if (cpu == 0 && IS_ERR(event))
 		cpu0_err = PTR_ERR(event);
@@ -621,11 +658,13 @@ int proc_dowatchdog(struct ctl_table *table, int write,
 		    void __user *buffer, size_t *lenp, loff_t *ppos)
 {
 	int err, old_thresh, old_enabled;
+	bool old_hardlockup;
 	static DEFINE_MUTEX(watchdog_proc_mutex);
 
 	mutex_lock(&watchdog_proc_mutex);
 	old_thresh = ACCESS_ONCE(watchdog_thresh);
 	old_enabled = ACCESS_ONCE(watchdog_user_enabled);
+	old_hardlockup = watchdog_hardlockup_detector_is_enabled();
 
 	err = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
 	if (err || !write)
@@ -637,15 +676,22 @@ int proc_dowatchdog(struct ctl_table *table, int write,
 	 * disabled. The 'watchdog_running' variable check in
 	 * watchdog_*_all_cpus() function takes care of this.
 	 */
-	if (watchdog_user_enabled && watchdog_thresh)
+	if (watchdog_user_enabled && watchdog_thresh) {
+		/*
+		 * Prevent a change in watchdog_thresh accidentally overriding
+		 * the enablement of the hardlockup detector.
+		 */
+		if (watchdog_user_enabled != old_enabled)
+			watchdog_enable_hardlockup_detector(true);
 		err = watchdog_enable_all_cpus(old_thresh != watchdog_thresh);
-	else
+	} else
 		watchdog_disable_all_cpus();
 
 	/* Restore old values on failure */
 	if (err) {
 		watchdog_thresh = old_thresh;
 		watchdog_user_enabled = old_enabled;
+		watchdog_enable_hardlockup_detector(old_hardlockup);
 	}
 out:
 	mutex_unlock(&watchdog_proc_mutex);
-- 
1.7.1
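
The ordering described in the changelog (initial default, early override by
the boot processor, later override from the command line) can be sketched in
userspace as follows.  The demo_* names and the is_guest flag are
hypothetical; only the hard lockup default is modelled, so "nmi_watchdog=0",
which clears watchdog_user_enabled in the real handler, leaves it untouched
here.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

static bool demo_hardlockup_enabled = true;  /* initial default */

/* Models watchdog_enable_hardlockup_detector(). */
static void demo_set_hardlockup_default(bool val)
{
	demo_hardlockup_enabled = val;
}

/* Models the "nmi_watchdog=" handler for the values "1" and "2". */
static void demo_parse_nmi_watchdog(const char *str)
{
	assert(str);

	if (!strncmp(str, "1", 1) || !strncmp(str, "2", 1))
		demo_set_hardlockup_default(true);
}

/*
 * Replays the boot sequence.  'nmi_watchdog_arg' is the value given on
 * the command line, or NULL if the parameter is absent.  Returns the
 * resulting default of the hard lockup detector.
 */
static bool demo_boot(bool is_guest, const char *nmi_watchdog_arg)
{
	demo_hardlockup_enabled = true;             /* initial boot */
	if (is_guest)
		demo_set_hardlockup_default(false); /* early guest init */
	if (nmi_watchdog_arg)                       /* parsed afterwards */
		demo_parse_nmi_watchdog(nmi_watchdog_arg);
	return demo_hardlockup_enabled;
}
```

In this model a guest boots with the detector disabled unless
"nmi_watchdog=1" or "nmi_watchdog=2" is passed, which is why the override has
to run before the command line is parsed.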



* [PATCH 5/5] kvm: ensure hard lockup detection is disabled by default
  2014-08-11 14:49 [PATCH 0/5] watchdog: various fixes Don Zickus
                   ` (3 preceding siblings ...)
  2014-08-11 14:49 ` [PATCH 4/5] watchdog: control hard lockup detection default Don Zickus
@ 2014-08-11 14:49 ` Don Zickus
  4 siblings, 0 replies; 34+ messages in thread
From: Don Zickus @ 2014-08-11 14:49 UTC
  To: akpm
  Cc: kvm, pbonzini, mingo, LKML, Ulrich Obergfell, Andrew Jones, Don Zickus

From: Ulrich Obergfell <uobergfe@redhat.com>

Use watchdog_enable_hardlockup_detector() to set hard lockup detection's
default value to false. It's risky to run this detection in a guest, as
false positives are easy to trigger, especially if the host is
overcommitted.

Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
---
 arch/x86/kernel/kvm.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 3dd8e2c..95c3cb1 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -35,6 +35,7 @@
 #include <linux/slab.h>
 #include <linux/kprobes.h>
 #include <linux/debugfs.h>
+#include <linux/nmi.h>
 #include <asm/timer.h>
 #include <asm/cpu.h>
 #include <asm/traps.h>
@@ -499,6 +500,13 @@ void __init kvm_guest_init(void)
 #else
 	kvm_guest_cpu_init();
 #endif
+
+	/*
+	 * Hard lockup detection is enabled by default. Disable it, as guests
+	 * can get false positives too easily, for example if the host is
+	 * overcommitted.
+	 */
+	watchdog_enable_hardlockup_detector(false);
 }
 
 static noinline uint32_t __kvm_cpuid_base(void)
-- 
1.7.1



* Re: [PATCH 2/5] softlockup: make detector be aware of task switch of processes hogging cpu
  2014-08-11 14:49 ` [PATCH 2/5] softlockup: make detector be aware of task switch of processes hogging cpu Don Zickus
@ 2014-08-18  9:03   ` Ingo Molnar
  2014-08-18 15:06     ` Don Zickus
  0 siblings, 1 reply; 34+ messages in thread
From: Ingo Molnar @ 2014-08-18  9:03 UTC
  To: Don Zickus; +Cc: akpm, kvm, pbonzini, mingo, LKML, chai wen

* Don Zickus <dzickus@redhat.com> wrote:

> From: chai wen <chaiw.fnst@cn.fujitsu.com>
> 
> For now, soft lockup detector warns once for each case of process softlockup.
> But the thread 'watchdog/n' may not always get the cpu at the time slot between
> the task switch of two processes hogging that cpu to reset soft_watchdog_warn.
> 
> An example would be two processes hogging the cpu.  Process A causes the
> softlockup warning and is killed manually by a user.  Process B immediately
> becomes the new process hogging the cpu preventing the softlockup code from
> resetting the soft_watchdog_warn variable.
> 
> This case is a false negative of "warn only once for a process", as there may
> be a different process that is going to hog the cpu.  Resolve this by
> saving/checking the pid of the hogging process and use that to reset
> soft_watchdog_warn too.
> 
> Signed-off-by: chai wen <chaiw.fnst@cn.fujitsu.com>
> [modified the comment and changelog to be more specific]
> Signed-off-by: Don Zickus <dzickus@redhat.com>
> ---
>  kernel/watchdog.c |   20 ++++++++++++++++++--
>  1 files changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index 4c2e11c..6d0a891 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -42,6 +42,7 @@ static DEFINE_PER_CPU(bool, softlockup_touch_sync);
>  static DEFINE_PER_CPU(bool, soft_watchdog_warn);
>  static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts);
>  static DEFINE_PER_CPU(unsigned long, soft_lockup_hrtimer_cnt);
> +static DEFINE_PER_CPU(pid_t, softlockup_warn_pid_saved);
>  #ifdef CONFIG_HARDLOCKUP_DETECTOR
>  static DEFINE_PER_CPU(bool, hard_watchdog_warn);
>  static DEFINE_PER_CPU(bool, watchdog_nmi_touch);
> @@ -317,6 +318,8 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>  	 */
>  	duration = is_softlockup(touch_ts);
>  	if (unlikely(duration)) {
> +		pid_t pid = task_pid_nr(current);
> +
>  		/*
>  		 * If a virtual machine is stopped by the host it can look to
>  		 * the watchdog like a soft lockup, check to see if the host
> @@ -326,8 +329,20 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>  			return HRTIMER_RESTART;
>  
>  		/* only warn once */
> -		if (__this_cpu_read(soft_watchdog_warn) == true)
> +		if (__this_cpu_read(soft_watchdog_warn) == true) {
> +
> +			/*
> +			 * Handle the case where multiple processes are
> +			 * causing softlockups but the duration is small
> +			 * enough, the softlockup detector can not reset
> +			 * itself in time.  Use pids to detect this.
> +			 */
> +			if (__this_cpu_read(softlockup_warn_pid_saved) != pid) {

So I agree with the motivation of this improvement, but is this 
implementation namespace-safe?

Thanks,

	Ingo


* Re: [PATCH 3/5] watchdog: fix print-once on enable
  2014-08-11 14:49 ` [PATCH 3/5] watchdog: fix print-once on enable Don Zickus
@ 2014-08-18  9:05   ` Ingo Molnar
  2014-08-18  9:07   ` Ingo Molnar
  2014-08-18 18:03   ` [tip:perf/watchdog] watchdog: Fix " tip-bot for Ulrich Obergfell
  2 siblings, 0 replies; 34+ messages in thread
From: Ingo Molnar @ 2014-08-18  9:05 UTC
  To: Don Zickus
  Cc: akpm, kvm, pbonzini, mingo, LKML, Ulrich Obergfell, Andrew Jones


* Don Zickus <dzickus@redhat.com> wrote:

> From: Ulrich Obergfell <uobergfe@redhat.com>
> 
> This patch avoids printing the message 'enabled on all CPUs, ...'
> multiple times. For example, the issue can occur in the following
> scenario:
> 
> 1) watchdog_nmi_enable() fails to enable PMU counters and sets
>    cpu0_err.
> 
> 2) 'echo [0|1] > /proc/sys/kernel/nmi_watchdog' is executed to
>    disable and re-enable the watchdog mechanism 'on the fly'.
> 
> 3) If watchdog_nmi_enable() succeeds to enable PMU counters, each
>    CPU will print the message because step1 left behind a non-zero
>    cpu0_err.
> 
>    if (!IS_ERR(event)) {
>        if (cpu == 0 || cpu0_err)
>        pr_info("enabled on all CPUs, ...")
> 
> The patch avoids this by clearing cpu0_err in watchdog_nmi_disable().
> 
> Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
> Signed-off-by: Andrew Jones <drjones@redhat.com>
> Signed-off-by: Don Zickus <dzickus@redhat.com>
> ---
>  kernel/watchdog.c |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
> 
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index 6d0a891..0838685 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -522,6 +522,9 @@ static void watchdog_nmi_disable(unsigned int cpu)
>  		/* should be in cleanup, but blocks oprofile */
>  		perf_event_release_kernel(event);
>  	}
> +	if (cpu == 0)
> +		/* watchdog_nmi_enable() expects this to be zero initially. */
> +		cpu0_err = 0;

Looks good except for a small detail: two-line blocks need curly 
braces too, even if it's just a single C statement. I've fixed 
this up in the commit.

thanks,

	Ingo



* Re: [PATCH 3/5] watchdog: fix print-once on enable
  2014-08-11 14:49 ` [PATCH 3/5] watchdog: fix print-once on enable Don Zickus
  2014-08-18  9:05   ` Ingo Molnar
@ 2014-08-18  9:07   ` Ingo Molnar
  2014-08-18 15:07     ` Don Zickus
  2014-08-18 18:03   ` [tip:perf/watchdog] watchdog: Fix " tip-bot for Ulrich Obergfell
  2 siblings, 1 reply; 34+ messages in thread
From: Ingo Molnar @ 2014-08-18  9:07 UTC
  To: Don Zickus
  Cc: akpm, kvm, pbonzini, mingo, LKML, Ulrich Obergfell, Andrew Jones


* Don Zickus <dzickus@redhat.com> wrote:

> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -522,6 +522,9 @@ static void watchdog_nmi_disable(unsigned int cpu)
>  		/* should be in cleanup, but blocks oprofile */
>  		perf_event_release_kernel(event);
>  	}
> +	if (cpu == 0)
> +		/* watchdog_nmi_enable() expects this to be zero initially. */
> +		cpu0_err = 0;
>  	return;
>  }

While at it I also removed the stray 'return;'.

Thanks,

	Ingo


* Re: [PATCH 4/5] watchdog: control hard lockup detection default
  2014-08-11 14:49 ` [PATCH 4/5] watchdog: control hard lockup detection default Don Zickus
@ 2014-08-18  9:12   ` Ingo Molnar
  2014-08-18 15:07     ` Don Zickus
  2014-08-18  9:16   ` Ingo Molnar
  1 sibling, 1 reply; 34+ messages in thread
From: Ingo Molnar @ 2014-08-18  9:12 UTC
  To: Don Zickus
  Cc: akpm, kvm, pbonzini, mingo, LKML, Ulrich Obergfell, Andrew Jones


* Don Zickus <dzickus@redhat.com> wrote:

> From: Ulrich Obergfell <uobergfe@redhat.com>
> 
> In some cases we don't want hard lockup detection enabled by default.
> An example is when running as a guest. Introduce
> 
>   watchdog_enable_hardlockup_detector(bool)

So, the name watchdog_enable_hardlockup_detector(false) is both 
too long and really confusing (because first it suggests 
enablement, then disables it), so I renamed it to 
hardlockup_detector_set(), which allows two natural variants:

	hardlockup_detector_set(false);
	...
	hardlockup_detector_set(true);

Thanks,

	Ingo


* Re: [PATCH 4/5] watchdog: control hard lockup detection default
  2014-08-11 14:49 ` [PATCH 4/5] watchdog: control hard lockup detection default Don Zickus
  2014-08-18  9:12   ` Ingo Molnar
@ 2014-08-18  9:16   ` Ingo Molnar
  2014-08-18 10:44     ` Ulrich Obergfell
  2014-08-18 15:17     ` Don Zickus
  1 sibling, 2 replies; 34+ messages in thread
From: Ingo Molnar @ 2014-08-18  9:16 UTC
  To: Don Zickus
  Cc: akpm, kvm, pbonzini, mingo, LKML, Ulrich Obergfell, Andrew Jones


* Don Zickus <dzickus@redhat.com> wrote:

> The running kernel still has the ability to enable/disable at any
> time with /proc/sys/kernel/nmi_watchdog as usual. However, even
> when the default has been overridden /proc/sys/kernel/nmi_watchdog
> will initially show '1'. To truly turn it on one must disable/enable
> it, i.e.
>   echo 0 > /proc/sys/kernel/nmi_watchdog
>   echo 1 > /proc/sys/kernel/nmi_watchdog

This looks like a bug, why is this so?

Thanks,

	Ingo


* Re: [PATCH 4/5] watchdog: control hard lockup detection default
  2014-08-18  9:16   ` Ingo Molnar
@ 2014-08-18 10:44     ` Ulrich Obergfell
  2014-08-18 15:17     ` Don Zickus
  1 sibling, 0 replies; 34+ messages in thread
From: Ulrich Obergfell @ 2014-08-18 10:44 UTC
  To: Ingo Molnar; +Cc: Don Zickus, akpm, kvm, pbonzini, mingo, LKML, Andrew Jones

>----- Original Message -----
>From: "Ingo Molnar" <mingo@kernel.org>
>To: "Don Zickus" <dzickus@redhat.com>
>Cc: akpm@linux-foundation.org, kvm@vger.kernel.org, pbonzini@redhat.com, mingo@redhat.com, "LKML" <linux-kernel@vger.kernel.org>, "Ulrich Obergfell" <uobergfe@redhat.com>, "Andrew Jones" <drjones@redhat.com>
>Sent: Monday, August 18, 2014 11:16:44 AM
>Subject: Re: [PATCH 4/5] watchdog: control hard lockup detection default
>
>
> * Don Zickus <dzickus@redhat.com> wrote:
>
>> The running kernel still has the ability to enable/disable at any
>> time with /proc/sys/kernel/nmi_watchdog as usual. However, even
>> when the default has been overridden /proc/sys/kernel/nmi_watchdog
>> will initially show '1'. To truly turn it on one must disable/enable
>> it, i.e.
>>   echo 0 > /proc/sys/kernel/nmi_watchdog
>>   echo 1 > /proc/sys/kernel/nmi_watchdog
>
> This looks like a bug, why is this so?
>
> Thanks,
>
>	Ingo


This is because the hard lockup detector and the soft lockup detector are
enabled and disabled at the same time - there isn't a separate 'knob' for
each of them. Both are controlled via the 'watchdog_user_enabled' variable
which is 1 by default.

  lockup_detector_init
    if (watchdog_user_enabled)
        watchdog_enable_all_cpus
          smpboot_register_percpu_thread(&watchdog_threads)

At boot time, the above code path launches a 'watchdog/N' thread for each
online CPU. The watchdog_enable() function is executed in the context of
these threads, and this attempts to enable the hard lockup detector and
the soft lockup detector. [Note: Soft lockup detection is implemented in
watchdog_timer_fn().]

  watchdog_enable
    hrtimer_init(hrtimer, ...)
    hrtimer->function = watchdog_timer_fn

    watchdog_nmi_enable
      perf_event_create_kernel_counter(..., watchdog_overflow_callback)

    hrtimer_start(hrtimer, ...)

On bare metal systems or in virtual environments where the hypervisor
does not emulate a PMU, watchdog_nmi_enable() can fail to allocate and
enable a PMU counter. This is reported by a console message:

  NMI watchdog: disabled (cpu0): hardware events not enabled

Hence, we can end up with a situation where the soft lockup detector is
enabled and the hard lockup detector is not enabled. However, the output
of 'cat /proc/sys/kernel/nmi_watchdog' is 1 because it merely shows the
state of the 'watchdog_user_enabled' variable.

The above is the behaviour even without the proposed patch. The patch
merely adds the following hunk in watchdog_nmi_enable() to 'fake' a
-ENOENT error return from perf_event_create_kernel_counter().

  +        if (!watchdog_hardlockup_detector_is_enabled()) {
  +                event = ERR_PTR(-ENOENT);
  +                goto handle_err;
  +        }

The patch does not break the output of 'cat /proc/sys/kernel/nmi_watchdog'
since the discrepancy between the output and the actual state of the hard
lockup detector is nothing new.


Regards,

Uli
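
The discrepancy described above, where /proc/sys/kernel/nmi_watchdog reads 1
although only the soft lockup detector is running, can be illustrated with a
small userspace model.  This is a sketch under stated assumptions: the struct,
its field names and the pmu_available flag are hypothetical and merely stand
in for watchdog_user_enabled and for the outcome of
perf_event_create_kernel_counter().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the watchdog state. */
struct demo_watchdog {
	int  user_enabled;  /* models watchdog_user_enabled */
	bool soft_active;   /* hrtimer-based soft lockup detector */
	bool hard_active;   /* perf/NMI-based hard lockup detector */
};

/* Models watchdog_enable(): one knob starts both detectors. */
static void demo_watchdog_enable(struct demo_watchdog *wd, bool pmu_available)
{
	assert(wd);

	if (!wd->user_enabled)
		return;
	wd->soft_active = true;           /* the hrtimer always starts */
	wd->hard_active = pmu_available;  /* the perf counter may fail */
}

/* Models 'cat /proc/sys/kernel/nmi_watchdog'. */
static int demo_proc_nmi_watchdog(const struct demo_watchdog *wd)
{
	return wd->user_enabled;
}
```

With user_enabled set to 1 and no PMU available, the model reports 1 through
demo_proc_nmi_watchdog() while hard_active stays false, which is the same
mismatch as in the kernel with or without the patch.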


* Re: [PATCH 2/5] softlockup: make detector be aware of task switch of processes hogging cpu
  2014-08-18  9:03   ` Ingo Molnar
@ 2014-08-18 15:06     ` Don Zickus
  2014-08-18 18:01       ` Ingo Molnar
  0 siblings, 1 reply; 34+ messages in thread
From: Don Zickus @ 2014-08-18 15:06 UTC
  To: Ingo Molnar; +Cc: akpm, kvm, pbonzini, mingo, LKML, chai wen

On Mon, Aug 18, 2014 at 11:03:19AM +0200, Ingo Molnar wrote:
> * Don Zickus <dzickus@redhat.com> wrote:
> 
> > From: chai wen <chaiw.fnst@cn.fujitsu.com>
> > 
> > For now, soft lockup detector warns once for each case of process softlockup.
> > But the thread 'watchdog/n' may not always get the cpu at the time slot between
> > the task switch of two processes hogging that cpu to reset soft_watchdog_warn.
> > 
> > An example would be two processes hogging the cpu.  Process A causes the
> > softlockup warning and is killed manually by a user.  Process B immediately
> > becomes the new process hogging the cpu preventing the softlockup code from
> > resetting the soft_watchdog_warn variable.
> > 
> > This case is a false negative of "warn only once for a process", as there may
> > be a different process that is going to hog the cpu.  Resolve this by
> > saving/checking the pid of the hogging process and use that to reset
> > soft_watchdog_warn too.
> > 
> > Signed-off-by: chai wen <chaiw.fnst@cn.fujitsu.com>
> > [modified the comment and changelog to be more specific]
> > Signed-off-by: Don Zickus <dzickus@redhat.com>
> > ---
> >  kernel/watchdog.c |   20 ++++++++++++++++++--
> >  1 files changed, 18 insertions(+), 2 deletions(-)
> > 
> > diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> > index 4c2e11c..6d0a891 100644
> > --- a/kernel/watchdog.c
> > +++ b/kernel/watchdog.c
> > @@ -42,6 +42,7 @@ static DEFINE_PER_CPU(bool, softlockup_touch_sync);
> >  static DEFINE_PER_CPU(bool, soft_watchdog_warn);
> >  static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts);
> >  static DEFINE_PER_CPU(unsigned long, soft_lockup_hrtimer_cnt);
> > +static DEFINE_PER_CPU(pid_t, softlockup_warn_pid_saved);
> >  #ifdef CONFIG_HARDLOCKUP_DETECTOR
> >  static DEFINE_PER_CPU(bool, hard_watchdog_warn);
> >  static DEFINE_PER_CPU(bool, watchdog_nmi_touch);
> > @@ -317,6 +318,8 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
> >  	 */
> >  	duration = is_softlockup(touch_ts);
> >  	if (unlikely(duration)) {
> > +		pid_t pid = task_pid_nr(current);
> > +
> >  		/*
> >  		 * If a virtual machine is stopped by the host it can look to
> >  		 * the watchdog like a soft lockup, check to see if the host
> > @@ -326,8 +329,20 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
> >  			return HRTIMER_RESTART;
> >  
> >  		/* only warn once */
> > -		if (__this_cpu_read(soft_watchdog_warn) == true)
> > +		if (__this_cpu_read(soft_watchdog_warn) == true) {
> > +
> > +			/*
> > +			 * Handle the case where multiple processes are
> > +			 * causing softlockups but the duration is small
> > +			 * enough, the softlockup detector can not reset
> > +			 * itself in time.  Use pids to detect this.
> > +			 */
> > +			if (__this_cpu_read(softlockup_warn_pid_saved) != pid) {
> 
> So I agree with the motivation of this improvement, but is this 
> implementation namespace-safe?

What namespace are you worried about colliding with?  I thought
softlockup_ would provide the safety??  Maybe I am missing something
obvious. :-(

Cheers,
Don


* Re: [PATCH 3/5] watchdog: fix print-once on enable
  2014-08-18  9:07   ` Ingo Molnar
@ 2014-08-18 15:07     ` Don Zickus
  0 siblings, 0 replies; 34+ messages in thread
From: Don Zickus @ 2014-08-18 15:07 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: akpm, kvm, pbonzini, mingo, LKML, Ulrich Obergfell, Andrew Jones

On Mon, Aug 18, 2014 at 11:07:57AM +0200, Ingo Molnar wrote:
> 
> * Don Zickus <dzickus@redhat.com> wrote:
> 
> > --- a/kernel/watchdog.c
> > +++ b/kernel/watchdog.c
> > @@ -522,6 +522,9 @@ static void watchdog_nmi_disable(unsigned int cpu)
> >  		/* should be in cleanup, but blocks oprofile */
> >  		perf_event_release_kernel(event);
> >  	}
> > +	if (cpu == 0)
> > +		/* watchdog_nmi_enable() expects this to be zero initially. */
> > +		cpu0_err = 0;
> >  	return;
> >  }
> 
> While at it I also removed the stray 'return;'.

Doh, sorry about that.  Thanks!

Cheers,
Don


* Re: [PATCH 4/5] watchdog: control hard lockup detection default
  2014-08-18  9:12   ` Ingo Molnar
@ 2014-08-18 15:07     ` Don Zickus
  0 siblings, 0 replies; 34+ messages in thread
From: Don Zickus @ 2014-08-18 15:07 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: akpm, kvm, pbonzini, mingo, LKML, Ulrich Obergfell, Andrew Jones

On Mon, Aug 18, 2014 at 11:12:39AM +0200, Ingo Molnar wrote:
> 
> * Don Zickus <dzickus@redhat.com> wrote:
> 
> > From: Ulrich Obergfell <uobergfe@redhat.com>
> > 
> > In some cases we don't want hard lockup detection enabled by default.
> > An example is when running as a guest. Introduce
> > 
> >   watchdog_enable_hardlockup_detector(bool)
> 
> So, the name watchdog_enable_hardlockup_detector(false) 
> is both too long and also really confusing (because first it 
> suggests enablement, then disables it), so I renamed it to 
> hardlockup_detector_set(), which allows two natural variants:
> 
> 	hardlockup_detector_set(false);
> 	...
> 	hardlockup_detector_set(true);

Fair enough.  Thanks!

Cheers,
Don


* Re: [PATCH 4/5] watchdog: control hard lockup detection default
  2014-08-18  9:16   ` Ingo Molnar
  2014-08-18 10:44     ` Ulrich Obergfell
@ 2014-08-18 15:17     ` Don Zickus
  2014-08-18 18:07       ` Ingo Molnar
  1 sibling, 1 reply; 34+ messages in thread
From: Don Zickus @ 2014-08-18 15:17 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: akpm, kvm, pbonzini, mingo, LKML, Ulrich Obergfell, Andrew Jones

On Mon, Aug 18, 2014 at 11:16:44AM +0200, Ingo Molnar wrote:
> 
> * Don Zickus <dzickus@redhat.com> wrote:
> 
> > The running kernel still has the ability to enable/disable at any
> > time with /proc/sys/kernel/nmi_watchdog as usual. However even
> > when the default has been overridden /proc/sys/kernel/nmi_watchdog
> > will initially show '1'. To truly turn it on one must disable/enable
> > it, i.e.
> >   echo 0 > /proc/sys/kernel/nmi_watchdog
> >   echo 1 > /proc/sys/kernel/nmi_watchdog
> 
> This looks like a bug, why is this so?

It is, but it always has been there in the case of the PMU not being able
to provide a resource for the hardlockup.  This change just exposes it
more.

Originally I wrote the code to keep the softlockup and hardlockup in sync.
Now this patch attempts to split it up because the guest PMU is still
flushing out bugs.

The above scenario only really applies to developers.  Their guest boots
up with the hardlockup disabled.  If they want to enable it to debug or
develop, they have to go with the above steps.  The idea is once the KVM
PMU is stable enough, hardlockup detection becomes enabled by default and
this problem kinda goes back to the one it is today.

I guess I was feeling lazy about modifying a bunch of code to separate the
hard and soft lockup for a temporarily broken feature.  :-/

I thought it would just be easier to put this code in to quickly stabilize
their PMU and switch the default later.

Thoughts?  I think Uli laid out a more detailed example in his email.

Cheers,
Don



* Re: [PATCH 2/5] softlockup: make detector be aware of task switch of processes hogging cpu
  2014-08-18 15:06     ` Don Zickus
@ 2014-08-18 18:01       ` Ingo Molnar
  2014-08-18 18:43         ` Don Zickus
  0 siblings, 1 reply; 34+ messages in thread
From: Ingo Molnar @ 2014-08-18 18:01 UTC (permalink / raw)
  To: Don Zickus; +Cc: akpm, kvm, pbonzini, mingo, LKML, chai wen


* Don Zickus <dzickus@redhat.com> wrote:

> On Mon, Aug 18, 2014 at 11:03:19AM +0200, Ingo Molnar wrote:
> > * Don Zickus <dzickus@redhat.com> wrote:
> > 
> > > From: chai wen <chaiw.fnst@cn.fujitsu.com>
> > > 
> > > For now, soft lockup detector warns once for each case of process softlockup.
> > > But the thread 'watchdog/n' may not always get the cpu at the time slot between
> > > the task switch of two processes hogging that cpu to reset soft_watchdog_warn.
> > > 
> > > An example would be two processes hogging the cpu.  Process A causes the
> > > softlockup warning and is killed manually by a user.  Process B immediately
> > > becomes the new process hogging the cpu preventing the softlockup code from
> > > resetting the soft_watchdog_warn variable.
> > > 
> > > This case is a false negative of "warn only once for a process", as there may
> > > be a different process that is going to hog the cpu.  Resolve this by
> > > saving/checking the pid of the hogging process and use that to reset
> > > soft_watchdog_warn too.
> > > 
> > > Signed-off-by: chai wen <chaiw.fnst@cn.fujitsu.com>
> > > [modified the comment and changelog to be more specific]
> > > Signed-off-by: Don Zickus <dzickus@redhat.com>
> > > ---
> > >  kernel/watchdog.c |   20 ++++++++++++++++++--
> > >  1 files changed, 18 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> > > index 4c2e11c..6d0a891 100644
> > > --- a/kernel/watchdog.c
> > > +++ b/kernel/watchdog.c
> > > @@ -42,6 +42,7 @@ static DEFINE_PER_CPU(bool, softlockup_touch_sync);
> > >  static DEFINE_PER_CPU(bool, soft_watchdog_warn);
> > >  static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts);
> > >  static DEFINE_PER_CPU(unsigned long, soft_lockup_hrtimer_cnt);
> > > +static DEFINE_PER_CPU(pid_t, softlockup_warn_pid_saved);
> > >  #ifdef CONFIG_HARDLOCKUP_DETECTOR
> > >  static DEFINE_PER_CPU(bool, hard_watchdog_warn);
> > >  static DEFINE_PER_CPU(bool, watchdog_nmi_touch);
> > > @@ -317,6 +318,8 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
> > >  	 */
> > >  	duration = is_softlockup(touch_ts);
> > >  	if (unlikely(duration)) {
> > > +		pid_t pid = task_pid_nr(current);
> > > +
> > >  		/*
> > >  		 * If a virtual machine is stopped by the host it can look to
> > >  		 * the watchdog like a soft lockup, check to see if the host
> > > @@ -326,8 +329,20 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
> > >  			return HRTIMER_RESTART;
> > >  
> > >  		/* only warn once */
> > > -		if (__this_cpu_read(soft_watchdog_warn) == true)
> > > +		if (__this_cpu_read(soft_watchdog_warn) == true) {
> > > +
> > > +			/*
> > > +			 * Handle the case where multiple processes are
> > > +			 * causing softlockups but the duration is small
> > > +			 * enough, the softlockup detector can not reset
> > > +			 * itself in time.  Use pids to detect this.
> > > +			 */
> > > +			if (__this_cpu_read(softlockup_warn_pid_saved) != pid) {
> > 
> > So I agree with the motivation of this improvement, but is this 
> > implementation namespace-safe?
> 
> What namespace are you worried about colliding with?  I thought
> softlockup_ would provide the safety??  Maybe I am missing something
> obvious. :-(

I meant PID namespaces - a PID in itself isn't guaranteed to be 
unique across the system.

Thanks,

	Ingo


* [tip:perf/watchdog] watchdog: Remove unnecessary header files
  2014-08-11 14:49 ` [PATCH 1/5] watchdog: remove unnecessary head files Don Zickus
@ 2014-08-18 18:03   ` tip-bot for chai wen
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot for chai wen @ 2014-08-18 18:03 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, chaiw.fnst, tglx, dzickus

Commit-ID:  f530504a063cfa028971e4b26ea8e0c32908de25
Gitweb:     http://git.kernel.org/tip/f530504a063cfa028971e4b26ea8e0c32908de25
Author:     chai wen <chaiw.fnst@cn.fujitsu.com>
AuthorDate: Mon, 11 Aug 2014 10:49:23 -0400
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 18 Aug 2014 11:17:46 +0200

watchdog: Remove unnecessary header files

Signed-off-by: chai wen <chaiw.fnst@cn.fujitsu.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: pbonzini@redhat.com
Link: http://lkml.kernel.org/r/1407768567-171794-2-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/watchdog.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index c3319bd..4c2e11c 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -15,11 +15,6 @@
 #include <linux/cpu.h>
 #include <linux/nmi.h>
 #include <linux/init.h>
-#include <linux/delay.h>
-#include <linux/freezer.h>
-#include <linux/kthread.h>
-#include <linux/lockdep.h>
-#include <linux/notifier.h>
 #include <linux/module.h>
 #include <linux/sysctl.h>
 #include <linux/smpboot.h>


* [tip:perf/watchdog] watchdog: Fix print-once on enable
  2014-08-11 14:49 ` [PATCH 3/5] watchdog: fix print-once on enable Don Zickus
  2014-08-18  9:05   ` Ingo Molnar
  2014-08-18  9:07   ` Ingo Molnar
@ 2014-08-18 18:03   ` tip-bot for Ulrich Obergfell
  2 siblings, 0 replies; 34+ messages in thread
From: tip-bot for Ulrich Obergfell @ 2014-08-18 18:03 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, drjones, uobergfe, tglx, dzickus

Commit-ID:  df577149594cefacd62740e86de080c6336d699e
Gitweb:     http://git.kernel.org/tip/df577149594cefacd62740e86de080c6336d699e
Author:     Ulrich Obergfell <uobergfe@redhat.com>
AuthorDate: Mon, 11 Aug 2014 10:49:25 -0400
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 18 Aug 2014 11:17:46 +0200

watchdog: Fix print-once on enable

This patch avoids printing the message 'enabled on all CPUs,
...' multiple times. For example, the issue can occur in the
following scenario:

1) watchdog_nmi_enable() fails to enable PMU counters and sets
   cpu0_err.

2) 'echo [0|1] > /proc/sys/kernel/nmi_watchdog' is executed to
   disable and re-enable the watchdog mechanism 'on the fly'.

3) If watchdog_nmi_enable() succeeds in enabling PMU counters,
   each CPU will print the message because step 1 left behind a
   non-zero cpu0_err.

   if (!IS_ERR(event)) {
       if (cpu == 0 || cpu0_err)
           pr_info("enabled on all CPUs, ...")

The patch avoids this by clearing cpu0_err in watchdog_nmi_disable().

Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: pbonzini@redhat.com
Link: http://lkml.kernel.org/r/1407768567-171794-4-git-send-email-dzickus@redhat.com
[ Applied small cleanups. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/watchdog.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 4c2e11c..df5494e 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -506,7 +506,10 @@ static void watchdog_nmi_disable(unsigned int cpu)
 		/* should be in cleanup, but blocks oprofile */
 		perf_event_release_kernel(event);
 	}
-	return;
+	if (cpu == 0) {
+		/* watchdog_nmi_enable() expects this to be zero initially. */
+		cpu0_err = 0;
+	}
 }
 #else
 static int watchdog_nmi_enable(unsigned int cpu) { return 0; }
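
The three-step scenario in the changelog can be replayed with a minimal
userspace sketch. This is not the kernel code: model_nmi_enable(),
model_nmi_disable(), messages_after_toggle(), the value of NR_CPUS and the
-1 error value are all assumptions made for illustration. Only the
'cpu == 0 || cpu0_err' test and the clearing of cpu0_err on disable are
taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4		/* assumed CPU count for the model */

static int cpu0_err;		/* models the file-scope cpu0_err */

/* Returns 1 if the "enabled on all CPUs" message would be printed. */
int model_nmi_enable(int cpu, bool pmu_ok)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (!pmu_ok) {
		if (cpu == 0)
			cpu0_err = -1;	/* step 1: remember CPU 0's failure */
		return 0;
	}
	/* same condition as quoted in the changelog */
	return (cpu == 0 || cpu0_err) ? 1 : 0;
}

/* with_fix selects the behaviour after the patch. */
void model_nmi_disable(int cpu, bool with_fix)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (with_fix && cpu == 0)
		cpu0_err = 0;	/* the patch: clear the stale error */
}

/* Fail once, toggle off and on, count messages from the second enable. */
int messages_after_toggle(bool with_fix)
{
	int cpu, printed = 0;

	cpu0_err = 0;
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		model_nmi_enable(cpu, false);
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		model_nmi_disable(cpu, with_fix);
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		printed += model_nmi_enable(cpu, true);
	return printed;
}
```

In this model messages_after_toggle(false) counts one message per CPU,
while messages_after_toggle(true) counts a single message from CPU 0.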


* Re: [PATCH 4/5] watchdog: control hard lockup detection default
  2014-08-18 15:17     ` Don Zickus
@ 2014-08-18 18:07       ` Ingo Molnar
  2014-08-18 18:53         ` Don Zickus
  0 siblings, 1 reply; 34+ messages in thread
From: Ingo Molnar @ 2014-08-18 18:07 UTC (permalink / raw)
  To: Don Zickus
  Cc: akpm, kvm, pbonzini, mingo, LKML, Ulrich Obergfell, Andrew Jones


* Don Zickus <dzickus@redhat.com> wrote:

> On Mon, Aug 18, 2014 at 11:16:44AM +0200, Ingo Molnar wrote:
> > 
> > * Don Zickus <dzickus@redhat.com> wrote:
> > 
> > > The running kernel still has the ability to enable/disable at any
> > > time with /proc/sys/kernel/nmi_watchdog as usual. However even
> > > when the default has been overridden /proc/sys/kernel/nmi_watchdog
> > > will initially show '1'. To truly turn it on one must disable/enable
> > > it, i.e.
> > >   echo 0 > /proc/sys/kernel/nmi_watchdog
> > >   echo 1 > /proc/sys/kernel/nmi_watchdog
> > 
> > This looks like a bug, why is this so?
> 
> It is, but it always has been there in the case of the PMU 
> not being able to provide a resource for the hardlockup.  
> This change just exposes it more.

There seem to be two issues:

1)

When it's impossible to enable the hardlockup detector, it 
should default to -1 or so, and attempts to set it should 
return a -EINVAL or so.

Bootup messages should also indicate when it's not possible to 
enable it but a user requests it.

2)

The softlockup and hardlockup detection control variables 
should be in separate flags, inside and outside the kernel - 
they (should) not relate to each other.

Thanks,

	Ingo
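
A rough userspace sketch of what the two points could look like together.
Everything here is hypothetical (struct lockup_knobs, knobs_init(),
knobs_set_hard(), knobs_set_soft()); it is not an existing kernel
interface, only one way to read "default to -1", "return -EINVAL" and
"separate flags".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state: one flag per detector, no shared knob. */
struct lockup_knobs {
	int soft_enabled;	/* 0 or 1 */
	int hard_enabled;	/* 0 or 1, or -1 when it cannot be enabled */
};

void knobs_init(struct lockup_knobs *k, bool pmu_available)
{
	assert(k != NULL);
	k->soft_enabled = 1;
	k->hard_enabled = pmu_available ? 1 : -1;
}

/* Point 1: setting an unavailable detector fails with -EINVAL. */
int knobs_set_hard(struct lockup_knobs *k, int val)
{
	assert(k != NULL);
	if (val != 0 && val != 1)
		return -EINVAL;
	if (k->hard_enabled < 0)
		return -EINVAL;
	k->hard_enabled = val;
	return 0;
}

/* Point 2: the soft knob never touches the hard one. */
int knobs_set_soft(struct lockup_knobs *k, int val)
{
	assert(k != NULL);
	if (val != 0 && val != 1)
		return -EINVAL;
	k->soft_enabled = val;
	return 0;
}
```

With this split, reading the hard knob on a system without a usable PMU
shows -1 instead of a misleading '1', and toggling one detector leaves
the other untouched.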


* Re: [PATCH 2/5] softlockup: make detector be aware of task switch of processes hogging cpu
  2014-08-18 18:01       ` Ingo Molnar
@ 2014-08-18 18:43         ` Don Zickus
  2014-08-18 19:02           ` Ingo Molnar
  0 siblings, 1 reply; 34+ messages in thread
From: Don Zickus @ 2014-08-18 18:43 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: akpm, kvm, pbonzini, mingo, LKML, chai wen

On Mon, Aug 18, 2014 at 08:01:58PM +0200, Ingo Molnar wrote:
> > > >  	duration = is_softlockup(touch_ts);
> > > >  	if (unlikely(duration)) {
> > > > +		pid_t pid = task_pid_nr(current);
> > > > +
> > > >  		/*
> > > >  		 * If a virtual machine is stopped by the host it can look to
> > > >  		 * the watchdog like a soft lockup, check to see if the host
> > > > @@ -326,8 +329,20 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
> > > >  			return HRTIMER_RESTART;
> > > >  
> > > >  		/* only warn once */
> > > > -		if (__this_cpu_read(soft_watchdog_warn) == true)
> > > > +		if (__this_cpu_read(soft_watchdog_warn) == true) {
> > > > +
> > > > +			/*
> > > > +			 * Handle the case where multiple processes are
> > > > +			 * causing softlockups but the duration is small
> > > > +			 * enough, the softlockup detector can not reset
> > > > +			 * itself in time.  Use pids to detect this.
> > > > +			 */
> > > > +			if (__this_cpu_read(softlockup_warn_pid_saved) != pid) {
> > > 
> > > So I agree with the motivation of this improvement, but is this 
> > > implementation namespace-safe?
> > 
> > What namespace are you worried about colliding with?  I thought
> > softlockup_ would provide the safety??  Maybe I am missing something
> > obvious. :-(
> 
> I meant PID namespaces - a PID in itself isn't guaranteed to be 
> unique across the system.

Ah,  I don't think we thought about that.  Is there a better way to do
this?  Is there a domain id or something that can be OR'd with the pid?

Cheers,
Don


* Re: [PATCH 4/5] watchdog: control hard lockup detection default
  2014-08-18 18:07       ` Ingo Molnar
@ 2014-08-18 18:53         ` Don Zickus
  2014-08-18 19:00           ` Ingo Molnar
  0 siblings, 1 reply; 34+ messages in thread
From: Don Zickus @ 2014-08-18 18:53 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: akpm, kvm, pbonzini, mingo, LKML, Ulrich Obergfell, Andrew Jones

On Mon, Aug 18, 2014 at 08:07:35PM +0200, Ingo Molnar wrote:
> 
> * Don Zickus <dzickus@redhat.com> wrote:
> 
> > On Mon, Aug 18, 2014 at 11:16:44AM +0200, Ingo Molnar wrote:
> > > 
> > > * Don Zickus <dzickus@redhat.com> wrote:
> > > 
> > > > The running kernel still has the ability to enable/disable at any
> > > > time with /proc/sys/kernel/nmi_watchdog as usual. However even
> > > > when the default has been overridden /proc/sys/kernel/nmi_watchdog
> > > > will initially show '1'. To truly turn it on one must disable/enable
> > > > it, i.e.
> > > >   echo 0 > /proc/sys/kernel/nmi_watchdog
> > > >   echo 1 > /proc/sys/kernel/nmi_watchdog
> > > 
> > > This looks like a bug, why is this so?
> > 
> > It is, but it always has been there in the case of the PMU 
> > not being able to provide a resource for the hardlockup.  
> > This change just exposes it more.
> 
> There seem to be two issues:
> 
> 1)
> 
> When it's impossible to enable the hardlockup detector, it 
> should default to -1 or so, and attempts to set it should 
> return a -EINVAL or so.

Ok, it didn't because I set the knob to mean both hard and soft lockup.
But the code knows the failures and can set to -1 if it had to.

> 
> Bootup messages should also indicate when it's not possible to 
> enable it but a user requests it.

It does today.

> 
> 2)
> 
> The softlockup and hardlockup detection control variables 
> should be in separate flags, inside and outside the kernel - 
> they (should) not relate to each other.

They did because years ago I thought we wanted to keep them as one entity
instead of two.  I would have to re-work the code to do this (and add more
knobs).

I presume you would want those changes done before taking this patchset?

Cheers,
Don


* Re: [PATCH 4/5] watchdog: control hard lockup detection default
  2014-08-18 18:53         ` Don Zickus
@ 2014-08-18 19:00           ` Ingo Molnar
  0 siblings, 0 replies; 34+ messages in thread
From: Ingo Molnar @ 2014-08-18 19:00 UTC (permalink / raw)
  To: Don Zickus
  Cc: akpm, kvm, pbonzini, mingo, LKML, Ulrich Obergfell, Andrew Jones

* Don Zickus <dzickus@redhat.com> wrote:

> > 2)
> > 
> > The softlockup and hardlockup detection control variables 
> > should be in separate flags, inside and outside the kernel 
> > - they (should) not relate to each other.
> 
> They did because years ago I thought we wanted to keep them 
> as one entity instead of two.  I would have to re-work the 
> code to do this (and add more knobs).
> 
> I presume you would want those changes done before taking 
> this patchset?

Yeah, fixing/cleaning up things would be nice before spreading 
the pain via new features/control mechanisms.

Thanks,

	Ingo


* Re: [PATCH 2/5] softlockup: make detector be aware of task switch of processes hogging cpu
  2014-08-18 18:43         ` Don Zickus
@ 2014-08-18 19:02           ` Ingo Molnar
  2014-08-18 20:38             ` Don Zickus
  0 siblings, 1 reply; 34+ messages in thread
From: Ingo Molnar @ 2014-08-18 19:02 UTC (permalink / raw)
  To: Don Zickus; +Cc: akpm, kvm, pbonzini, mingo, LKML, chai wen


* Don Zickus <dzickus@redhat.com> wrote:

> > > > So I agree with the motivation of this improvement, but 
> > > > is this implementation namespace-safe?
> > > 
> > > What namespace are you worried about colliding with?  I 
> > > thought softlockup_ would provide the safety??  Maybe I 
> > > am missing something obvious. :-(
> > 
> > I meant PID namespaces - a PID in itself isn't guaranteed 
> > to be unique across the system.
> 
> Ah, I don't think we thought about that.  Is there a better 
> way to do this?  Is there a domain id or something that can 
> be OR'd with the pid?

What is always unique is the task pointer itself. We use pids 
when we interface with user-space - but we don't really do that 
here, right?

Thanks,

	Ingo
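
The difference between the two kinds of identity can be shown with a toy
userspace model. struct toy_task and its 'ns' and 'nr' fields are made up
for illustration and are not kernel fields; the sketch only models a
number that is local to a PID namespace and makes no claim about what any
particular kernel helper returns.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy task: 'ns' and 'nr' are illustrative, not kernel fields. */
struct toy_task {
	int ns;		/* which PID namespace the task lives in */
	int nr;		/* its number as seen inside that namespace */
};

/* Identity by namespace-local number: can confuse distinct tasks. */
bool same_task_by_nr(const struct toy_task *a, const struct toy_task *b)
{
	assert(a && b);
	return a->nr == b->nr;
}

/* Identity by pointer: two distinct live tasks never compare equal. */
bool same_task_by_ptr(const struct toy_task *a, const struct toy_task *b)
{
	return a == b;
}
```

Two tasks numbered 1 in different namespaces look identical to
same_task_by_nr() but not to same_task_by_ptr(), which is why the pointer
is the safer key when nothing is reported back to user-space.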


* Re: [PATCH 2/5] softlockup: make detector be aware of task switch of processes hogging cpu
  2014-08-18 19:02           ` Ingo Molnar
@ 2014-08-18 20:38             ` Don Zickus
  2014-08-19  1:36               ` Chai Wen
  0 siblings, 1 reply; 34+ messages in thread
From: Don Zickus @ 2014-08-18 20:38 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: akpm, kvm, pbonzini, mingo, LKML, chai wen

On Mon, Aug 18, 2014 at 09:02:00PM +0200, Ingo Molnar wrote:
> 
> * Don Zickus <dzickus@redhat.com> wrote:
> 
> > > > > So I agree with the motivation of this improvement, but 
> > > > > is this implementation namespace-safe?
> > > > 
> > > > What namespace are you worried about colliding with?  I 
> > > > thought softlockup_ would provide the safety??  Maybe I 
> > > > am missing something obvious. :-(
> > > 
> > > I meant PID namespaces - a PID in itself isn't guaranteed 
> > > to be unique across the system.
> > 
> > Ah, I don't think we thought about that.  Is there a better 
> > way to do this?  Is there a domain id or something that can 
> > be OR'd with the pid?
> 
> What is always unique is the task pointer itself. We use pids 
> when we interface with user-space - but we don't really do that 
> here, right?

No, I don't believe so.  Ok, so saving 'current' and comparing that should
be enough, correct?

Cheers,
Don


* Re: [PATCH 2/5] softlockup: make detector be aware of task switch of processes hogging cpu
  2014-08-18 20:38             ` Don Zickus
@ 2014-08-19  1:36               ` Chai Wen
  2014-08-21  1:37                 ` Chai Wen
  0 siblings, 1 reply; 34+ messages in thread
From: Chai Wen @ 2014-08-19  1:36 UTC (permalink / raw)
  To: Don Zickus; +Cc: Ingo Molnar, akpm, kvm, pbonzini, mingo, LKML

On 08/19/2014 04:38 AM, Don Zickus wrote:

> On Mon, Aug 18, 2014 at 09:02:00PM +0200, Ingo Molnar wrote:
>>
>> * Don Zickus <dzickus@redhat.com> wrote:
>>
>>>>>> So I agree with the motivation of this improvement, but 
>>>>>> is this implementation namespace-safe?
>>>>>
>>>>> What namespace are you worried about colliding with?  I 
>>>>> thought softlockup_ would provide the safety??  Maybe I 
>>>>> am missing something obvious. :-(
>>>>
>>>> I meant PID namespaces - a PID in itself isn't guaranteed 
>>>> to be unique across the system.
>>>
>>> Ah, I don't think we thought about that.  Is there a better 
>>> way to do this?  Is there a domain id or something that can 
>>> be OR'd with the pid?
>>
>> What is always unique is the task pointer itself. We use pids 
>> when we interface with user-space - but we don't really do that 
>> here, right?
> 
> No, I don't believe so.  Ok, so saving 'current' and comparing that should
> be enough, correct?
> 


I am not sure whether using the pid here is safe with PID namespaces.
But as to the task pointer, is there a chance that a 'historical' address is
still saved in 'softlockup_warn_pid(or address)_saved' and the process
currently hogging the cpu happens to get the same task pointer address?
If that never happens, I think comparing addresses is ok.

thanks
chai wen

> Cheers,
> Don
> .
> 



-- 
Regards

Chai Wen
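
The mechanism behind the 'historical address' question can be reproduced
in userspace with a deliberately tiny allocator. The one-slot pool and
the names toy_alloc(), toy_free() and stale_pointer_matches() are
assumptions for illustration; the sketch shows only that a saved pointer
can compare equal to a later, different object, and says nothing about
how likely that is for real tasks.

```c
#include <assert.h>
#include <stddef.h>

struct toy_task { int id; };

static struct toy_task slot;	/* a pool with a single slot */
static int slot_busy;

struct toy_task *toy_alloc(int id)
{
	if (slot_busy)
		return NULL;
	slot_busy = 1;
	slot.id = id;
	return &slot;
}

void toy_free(struct toy_task *t)
{
	assert(t == &slot && slot_busy);
	slot_busy = 0;
}

/* Returns 1 if a stale saved pointer matches a brand-new task. */
int stale_pointer_matches(void)
{
	struct toy_task *saved = toy_alloc(1);	/* task A, already warned */
	struct toy_task *next;

	toy_free(saved);			/* A exits */
	next = toy_alloc(2);			/* B reuses the same slot */
	toy_free(next);
	return saved == next;
}
```

Here stale_pointer_matches() returns 1: the saved address of task A is
equal to the address of task B even though they are different tasks.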


* Re: [PATCH 2/5] softlockup: make detector be aware of task switch of processes hogging cpu
  2014-08-19  1:36               ` Chai Wen
@ 2014-08-21  1:37                 ` Chai Wen
  2014-08-21  2:30                   ` Don Zickus
  0 siblings, 1 reply; 34+ messages in thread
From: Chai Wen @ 2014-08-21  1:37 UTC (permalink / raw)
  To: Don Zickus, Ingo Molnar; +Cc: akpm, kvm, pbonzini, mingo, LKML

On 08/19/2014 09:36 AM, Chai Wen wrote:

> On 08/19/2014 04:38 AM, Don Zickus wrote:
> 
>> On Mon, Aug 18, 2014 at 09:02:00PM +0200, Ingo Molnar wrote:
>>>
>>> * Don Zickus <dzickus@redhat.com> wrote:
>>>
>>>>>>> So I agree with the motivation of this improvement, but 
>>>>>>> is this implementation namespace-safe?
>>>>>>
>>>>>> What namespace are you worried about colliding with?  I 
>>>>>> thought softlockup_ would provide the safety??  Maybe I 
>>>>>> am missing something obvious. :-(
>>>>>
>>>>> I meant PID namespaces - a PID in itself isn't guaranteed 
>>>>> to be unique across the system.
>>>>
>>>> Ah, I don't think we thought about that.  Is there a better 
>>>> way to do this?  Is there a domain id or something that can 
>>>> be OR'd with the pid?
>>>
>>> What is always unique is the task pointer itself. We use pids 
>>> when we interface with user-space - but we don't really do that 
>>> here, right?
>>
>> No, I don't believe so.  Ok, so saving 'current' and comparing that should
>> be enough, correct?
>>
> 
> 
> I am not sure whether using the pid here is safe with PID namespaces.
> But as to the task pointer, is there a chance that a 'historical' address is
> still saved in 'softlockup_warn_pid(or address)_saved' and the process
> currently hogging the cpu happens to get the same task pointer address?
> If that never happens, I think comparing addresses is ok.
> 


Hi Ingo

What do you think of Don's solution, comparing the task pointer?
Anyway, this is just an additional check for some very special cases,
so I think the issue I was concerned about above is not a problem at all.
And after learning some concepts about PID namespaces, I think comparing
the task pointer is reliable with regard to PID namespaces here.

And Don, if you want me to re-post this patch, please let me know.

thanks
chai wen

> thanks
> chai wen
> 
>> Cheers,
>> Don
>> .
>>
> 
> 
> 



-- 
Regards

Chai Wen


* Re: [PATCH 2/5] softlockup: make detector be aware of task switch of processes hogging cpu
  2014-08-21  1:37                 ` Chai Wen
@ 2014-08-21  2:30                   ` Don Zickus
  2014-08-21  5:42                     ` [PATCH] " chai wen
  0 siblings, 1 reply; 34+ messages in thread
From: Don Zickus @ 2014-08-21  2:30 UTC (permalink / raw)
  To: Chai Wen; +Cc: Ingo Molnar, akpm, kvm, pbonzini, mingo, LKML

On Thu, Aug 21, 2014 at 09:37:04AM +0800, Chai Wen wrote:
> On 08/19/2014 09:36 AM, Chai Wen wrote:
> 
> > On 08/19/2014 04:38 AM, Don Zickus wrote:
> > 
> >> On Mon, Aug 18, 2014 at 09:02:00PM +0200, Ingo Molnar wrote:
> >>>
> >>> * Don Zickus <dzickus@redhat.com> wrote:
> >>>
> >>>>>>> So I agree with the motivation of this improvement, but 
> >>>>>>> is this implementation namespace-safe?
> >>>>>>
> >>>>>> What namespace are you worried about colliding with?  I 
> >>>>>> thought softlockup_ would provide the safety??  Maybe I 
> >>>>>> am missing something obvious. :-(
> >>>>>
> >>>>> I meant PID namespaces - a PID in itself isn't guaranteed 
> >>>>> to be unique across the system.
> >>>>
> >>>> Ah, I don't think we thought about that.  Is there a better 
> >>>> way to do this?  Is there a domain id or something that can 
> >>>> be OR'd with the pid?
> >>>
> >>> What is always unique is the task pointer itself. We use pids 
> >>> when we interface with user-space - but we don't really do that 
> >>> here, right?
> >>
> >> No, I don't believe so.  Ok, so saving 'current' and comparing that should
> >> be enough, correct?
> >>
> > 
> > 
> > > I am not sure whether using the pid here is safe with PID namespaces.
> > > But as to the task pointer, is there a chance that a 'historical' address is
> > > still saved in 'softlockup_warn_pid(or address)_saved' and the process
> > > currently hogging the cpu happens to get the same task pointer address?
> > > If that never happens, I think comparing addresses is ok.
> > 
> 
> 
> Hi Ingo
> 
> What do you think of Don's solution, comparing the task pointer?
> Anyway, this is just an additional check for some very special cases,
> so I think the issue I was concerned about above is not a problem at all.
> And after learning some concepts about PID namespaces, I think comparing
> the task pointer is reliable with regard to PID namespaces here.
> 
> And Don, if you want me to re-post this patch, please let me know.

Sure, just quickly test with the task pointer to make sure it still works
and then re-post.

Cheers,
Don


* [PATCH] softlockup: make detector be aware of task switch of processes hogging cpu
  2014-08-21  2:30                   ` Don Zickus
@ 2014-08-21  5:42                     ` chai wen
  2014-08-22  1:12                       ` Chai Wen
  2014-08-22  1:58                       ` Don Zickus
  0 siblings, 2 replies; 34+ messages in thread
From: chai wen @ 2014-08-21  5:42 UTC (permalink / raw)
  Cc: mingo, linux-kernel, chai wen, Don Zickus

For now, soft lockup detector warns once for each case of process softlockup.
But the thread 'watchdog/n' may not always get the cpu at the time slot between
the task switch of two processes hogging that cpu to reset soft_watchdog_warn.

An example would be two processes hogging the cpu.  Process A causes the
softlockup warning and is killed manually by a user.  Process B immediately
becomes the new process hogging the cpu preventing the softlockup code from
resetting the soft_watchdog_warn variable.

This case is a false negative of "warn only once for a process", as there may
be a different process that is going to hog the cpu.  Resolve this by
saving/checking the task pointer of the hogging process and use that to reset
soft_watchdog_warn too.

Signed-off-by: chai wen <chaiw.fnst@cn.fujitsu.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
---
 kernel/watchdog.c |   16 +++++++++++++++-
 1 files changed, 15 insertions(+), 1 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 0037db6..2e55620 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -42,6 +42,7 @@ static DEFINE_PER_CPU(bool, softlockup_touch_sync);
 static DEFINE_PER_CPU(bool, soft_watchdog_warn);
 static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts);
 static DEFINE_PER_CPU(unsigned long, soft_lockup_hrtimer_cnt);
+static DEFINE_PER_CPU(struct task_struct *, softlockup_task_ptr_saved);
 #ifdef CONFIG_HARDLOCKUP_DETECTOR
 static DEFINE_PER_CPU(bool, hard_watchdog_warn);
 static DEFINE_PER_CPU(bool, watchdog_nmi_touch);
@@ -328,8 +329,20 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
 			return HRTIMER_RESTART;
 
 		/* only warn once */
-		if (__this_cpu_read(soft_watchdog_warn) == true)
+		if (__this_cpu_read(soft_watchdog_warn) == true) {
+			/*
+			 * Handle the case where multiple processes are
+			 * causing softlockups but the duration is small
+			 * enough, the softlockup detector can not reset
+			 * itself in time.  Use task pointers to detect this.
+			 */
+			if (__this_cpu_read(softlockup_task_ptr_saved) !=
+			    current) {
+				__this_cpu_write(soft_watchdog_warn, false);
+				__touch_watchdog();
+			}
 			return HRTIMER_RESTART;
+		}
 
 		if (softlockup_all_cpu_backtrace) {
 			/* Prevent multiple soft-lockup reports if one cpu is already
@@ -345,6 +358,7 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
 		pr_emerg("BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n",
 			smp_processor_id(), duration,
 			current->comm, task_pid_nr(current));
+		__this_cpu_write(softlockup_task_ptr_saved, current);
 		print_modules();
 		print_irqtrace_events(current);
 		if (regs)
-- 
1.7.1



* Re: [PATCH] softlockup: make detector be aware of task switch of processes hogging cpu
  2014-08-21  5:42                     ` [PATCH] " chai wen
@ 2014-08-22  1:12                       ` Chai Wen
  2014-08-22  1:58                       ` Don Zickus
  1 sibling, 0 replies; 34+ messages in thread
From: Chai Wen @ 2014-08-22  1:12 UTC (permalink / raw)
  To: mingo, Don Zickus; +Cc: chai wen, linux-kernel

On 08/21/2014 01:42 PM, chai wen wrote:

> For now, soft lockup detector warns once for each case of process softlockup.
> But the thread 'watchdog/n' may not always get the cpu at the time slot between
> the task switch of two processes hogging that cpu to reset soft_watchdog_warn.
> 
> An example would be two processes hogging the cpu.  Process A causes the
> softlockup warning and is killed manually by a user.  Process B immediately
> becomes the new process hogging the cpu preventing the softlockup code from
> resetting the soft_watchdog_warn variable.
> 
> This case is a false negative of "warn only once for a process", as there may
> be a different process that is going to hog the cpu.  Resolve this by
> saving/checking the task pointer of the hogging process and use that to reset
> soft_watchdog_warn too.
> 
> Signed-off-by: chai wen <chaiw.fnst@cn.fujitsu.com>
> Signed-off-by: Don Zickus <dzickus@redhat.com>


Hi Ingo & Don

Ping...

This patch uses the task pointer to detect cases where the softlockup
detector cannot reset itself, and it has been tested.
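
To illustrate the idea outside the kernel, here is a minimal userspace sketch
of the warn-once logic with the saved-task check.  It is only a model under
stated assumptions: struct softlockup_model, model_timer_tick() and
model_two_hogs() are hypothetical names, tasks are plain integer ids rather
than struct task_struct pointers, and time is a simple counter instead of the
timestamps used in kernel/watchdog.c.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace model of one CPU's soft lockup state. */
struct softlockup_model {
	bool warned;            /* stands in for soft_watchdog_warn */
	int saved_task;         /* stands in for softlockup_task_ptr_saved */
	unsigned long touch_ts; /* last time the watchdog was touched */
};

/*
 * Models one run of the timer callback.  Returns true when a warning
 * is reported on this tick.
 */
static bool model_timer_tick(struct softlockup_model *m, unsigned long now,
			     int current_task, unsigned long thresh)
{
	bool stuck = (now - m->touch_ts) > thresh;

	if (!stuck) {
		/* No lockup detected: re-arm the warning. */
		m->warned = false;
		return false;
	}

	/* only warn once */
	if (m->warned) {
		/*
		 * A different task is hogging the cpu now: re-arm the
		 * warning and restart the timeout for the new task.
		 */
		if (m->saved_task != current_task) {
			m->warned = false;
			m->touch_ts = now;
		}
		return false;
	}

	printf("soft lockup: task %d stuck for %lu time units\n",
	       current_task, now - m->touch_ts);
	m->saved_task = current_task;
	m->warned = true;
	return true;
}

/*
 * Task 1 hogs the cpu until t=30, then task 2 takes over immediately.
 * The watchdog is never touched in between.  Returns the number of
 * warnings reported.
 */
static int model_two_hogs(void)
{
	struct softlockup_model m = { false, 0, 0 };
	int warnings = 0;

	for (unsigned long now = 4; now <= 60; now += 4) {
		int task = (now < 30) ? 1 : 2;

		if (model_timer_tick(&m, now, task, 20))
			warnings++;
	}
	return warnings;
}
```

With a threshold of 20 and a tick every 4 time units, model_two_hogs() reports
one warning for task 1 and, after the switch, a second one for task 2.
Without the saved-task check the flag would stay set and the second hog would
go unreported.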

thanks
chai wen

> ---
>  kernel/watchdog.c |   16 +++++++++++++++-
>  1 files changed, 15 insertions(+), 1 deletions(-)
> 
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index 0037db6..2e55620 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -42,6 +42,7 @@ static DEFINE_PER_CPU(bool, softlockup_touch_sync);
>  static DEFINE_PER_CPU(bool, soft_watchdog_warn);
>  static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts);
>  static DEFINE_PER_CPU(unsigned long, soft_lockup_hrtimer_cnt);
> +static DEFINE_PER_CPU(struct task_struct *, softlockup_task_ptr_saved);
>  #ifdef CONFIG_HARDLOCKUP_DETECTOR
>  static DEFINE_PER_CPU(bool, hard_watchdog_warn);
>  static DEFINE_PER_CPU(bool, watchdog_nmi_touch);
> @@ -328,8 +329,20 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>  			return HRTIMER_RESTART;
>  
>  		/* only warn once */
> -		if (__this_cpu_read(soft_watchdog_warn) == true)
> +		if (__this_cpu_read(soft_watchdog_warn) == true) {
> +			/*
> +			 * Handle the case where multiple processes are
> +			 * causing softlockups but the duration is small
> +			 * enough, the softlockup detector can not reset
> +			 * itself in time.  Use task pointers to detect this.
> +			 */
> +			if (__this_cpu_read(softlockup_task_ptr_saved) !=
> +			    current) {
> +				__this_cpu_write(soft_watchdog_warn, false);
> +				__touch_watchdog();
> +			}
>  			return HRTIMER_RESTART;
> +		}
>  
>  		if (softlockup_all_cpu_backtrace) {
>  			/* Prevent multiple soft-lockup reports if one cpu is already
> @@ -345,6 +358,7 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>  		pr_emerg("BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n",
>  			smp_processor_id(), duration,
>  			current->comm, task_pid_nr(current));
> +		__this_cpu_write(softlockup_task_ptr_saved, current);
>  		print_modules();
>  		print_irqtrace_events(current);
>  		if (regs)



-- 
Regards

Chai Wen


* Re: [PATCH] softlockup: make detector be aware of task switch of processes hogging cpu
  2014-08-21  5:42                     ` [PATCH] " chai wen
  2014-08-22  1:12                       ` Chai Wen
@ 2014-08-22  1:58                       ` Don Zickus
  2014-08-26 12:51                         ` Chai Wen
  1 sibling, 1 reply; 34+ messages in thread
From: Don Zickus @ 2014-08-22  1:58 UTC (permalink / raw)
  To: chai wen; +Cc: mingo, linux-kernel

On Thu, Aug 21, 2014 at 01:42:22PM +0800, chai wen wrote:
> For now, soft lockup detector warns once for each case of process softlockup.
> But the thread 'watchdog/n' may not always get the cpu at the time slot between
> the task switch of two processes hogging that cpu to reset soft_watchdog_warn.
> 
> An example would be two processes hogging the cpu.  Process A causes the
> softlockup warning and is killed manually by a user.  Process B immediately
> becomes the new process hogging the cpu preventing the softlockup code from
> resetting the soft_watchdog_warn variable.
> 
> This case is a false negative of "warn only once for a process", as there may
> be a different process that is going to hog the cpu.  Resolve this by
> saving/checking the task pointer of the hogging process and use that to reset
> soft_watchdog_warn too.
> 
> Signed-off-by: chai wen <chaiw.fnst@cn.fujitsu.com>
> Signed-off-by: Don Zickus <dzickus@redhat.com>

Acked-by: Don Zickus <dzickus@redhat.com>

> ---
>  kernel/watchdog.c |   16 +++++++++++++++-
>  1 files changed, 15 insertions(+), 1 deletions(-)
> 
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index 0037db6..2e55620 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -42,6 +42,7 @@ static DEFINE_PER_CPU(bool, softlockup_touch_sync);
>  static DEFINE_PER_CPU(bool, soft_watchdog_warn);
>  static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts);
>  static DEFINE_PER_CPU(unsigned long, soft_lockup_hrtimer_cnt);
> +static DEFINE_PER_CPU(struct task_struct *, softlockup_task_ptr_saved);
>  #ifdef CONFIG_HARDLOCKUP_DETECTOR
>  static DEFINE_PER_CPU(bool, hard_watchdog_warn);
>  static DEFINE_PER_CPU(bool, watchdog_nmi_touch);
> @@ -328,8 +329,20 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>  			return HRTIMER_RESTART;
>  
>  		/* only warn once */
> -		if (__this_cpu_read(soft_watchdog_warn) == true)
> +		if (__this_cpu_read(soft_watchdog_warn) == true) {
> +			/*
> +			 * Handle the case where multiple processes are
> +			 * causing softlockups but the duration is small
> +			 * enough, the softlockup detector can not reset
> +			 * itself in time.  Use task pointers to detect this.
> +			 */
> +			if (__this_cpu_read(softlockup_task_ptr_saved) !=
> +			    current) {
> +				__this_cpu_write(soft_watchdog_warn, false);
> +				__touch_watchdog();
> +			}
>  			return HRTIMER_RESTART;
> +		}
>  
>  		if (softlockup_all_cpu_backtrace) {
>  			/* Prevent multiple soft-lockup reports if one cpu is already
> @@ -345,6 +358,7 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>  		pr_emerg("BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n",
>  			smp_processor_id(), duration,
>  			current->comm, task_pid_nr(current));
> +		__this_cpu_write(softlockup_task_ptr_saved, current);
>  		print_modules();
>  		print_irqtrace_events(current);
>  		if (regs)
> -- 
> 1.7.1
> 


* Re: [PATCH] softlockup: make detector be aware of task switch of processes hogging cpu
  2014-08-22  1:58                       ` Don Zickus
@ 2014-08-26 12:51                         ` Chai Wen
  2014-08-26 14:22                           ` Don Zickus
  0 siblings, 1 reply; 34+ messages in thread
From: Chai Wen @ 2014-08-26 12:51 UTC (permalink / raw)
  To: akpm; +Cc: Don Zickus, mingo, linux-kernel

On 08/22/2014 09:58 AM, Don Zickus wrote:

> On Thu, Aug 21, 2014 at 01:42:22PM +0800, chai wen wrote:
>> For now, soft lockup detector warns once for each case of process softlockup.
>> But the thread 'watchdog/n' may not always get the cpu at the time slot between
>> the task switch of two processes hogging that cpu to reset soft_watchdog_warn.
>>
>> An example would be two processes hogging the cpu.  Process A causes the
>> softlockup warning and is killed manually by a user.  Process B immediately
>> becomes the new process hogging the cpu preventing the softlockup code from
>> resetting the soft_watchdog_warn variable.
>>
>> This case is a false negative of "warn only once for a process", as there may
>> be a different process that is going to hog the cpu.  Resolve this by
>> saving/checking the task pointer of the hogging process and use that to reset
>> soft_watchdog_warn too.
>>
>> Signed-off-by: chai wen <chaiw.fnst@cn.fujitsu.com>
>> Signed-off-by: Don Zickus <dzickus@redhat.com>
> 
> Acked-by: Don Zickus <dzickus@redhat.com>
> 


Hi Andrew

Sorry to disturb you.
Could you check and pick up this small improvement patch?

I am not sure which maintainer I should talk to, but the original version of
this patch was queued to the -mm tree by you, so I assume it is in your charge.


thanks
chai wen

>> ---
>>  kernel/watchdog.c |   16 +++++++++++++++-
>>  1 files changed, 15 insertions(+), 1 deletions(-)
>>
>> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
>> index 0037db6..2e55620 100644
>> --- a/kernel/watchdog.c
>> +++ b/kernel/watchdog.c
>> @@ -42,6 +42,7 @@ static DEFINE_PER_CPU(bool, softlockup_touch_sync);
>>  static DEFINE_PER_CPU(bool, soft_watchdog_warn);
>>  static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts);
>>  static DEFINE_PER_CPU(unsigned long, soft_lockup_hrtimer_cnt);
>> +static DEFINE_PER_CPU(struct task_struct *, softlockup_task_ptr_saved);
>>  #ifdef CONFIG_HARDLOCKUP_DETECTOR
>>  static DEFINE_PER_CPU(bool, hard_watchdog_warn);
>>  static DEFINE_PER_CPU(bool, watchdog_nmi_touch);
>> @@ -328,8 +329,20 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>>  			return HRTIMER_RESTART;
>>  
>>  		/* only warn once */
>> -		if (__this_cpu_read(soft_watchdog_warn) == true)
>> +		if (__this_cpu_read(soft_watchdog_warn) == true) {
>> +			/*
>> +			 * Handle the case where multiple processes are
>> +			 * causing softlockups but the duration is small
>> +			 * enough, the softlockup detector can not reset
>> +			 * itself in time.  Use task pointers to detect this.
>> +			 */
>> +			if (__this_cpu_read(softlockup_task_ptr_saved) !=
>> +			    current) {
>> +				__this_cpu_write(soft_watchdog_warn, false);
>> +				__touch_watchdog();
>> +			}
>>  			return HRTIMER_RESTART;
>> +		}
>>  
>>  		if (softlockup_all_cpu_backtrace) {
>>  			/* Prevent multiple soft-lockup reports if one cpu is already
>> @@ -345,6 +358,7 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>>  		pr_emerg("BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n",
>>  			smp_processor_id(), duration,
>>  			current->comm, task_pid_nr(current));
>> +		__this_cpu_write(softlockup_task_ptr_saved, current);
>>  		print_modules();
>>  		print_irqtrace_events(current);
>>  		if (regs)
>> -- 
>> 1.7.1
>>
> .
> 



-- 
Regards

Chai Wen


* Re: [PATCH] softlockup: make detector be aware of task switch of processes hogging cpu
  2014-08-26 12:51                         ` Chai Wen
@ 2014-08-26 14:22                           ` Don Zickus
  2014-08-27  1:33                             ` Chai Wen
  0 siblings, 1 reply; 34+ messages in thread
From: Don Zickus @ 2014-08-26 14:22 UTC (permalink / raw)
  To: Chai Wen; +Cc: akpm, mingo, linux-kernel

On Tue, Aug 26, 2014 at 08:51:30PM +0800, Chai Wen wrote:
> On 08/22/2014 09:58 AM, Don Zickus wrote:
> 
> > On Thu, Aug 21, 2014 at 01:42:22PM +0800, chai wen wrote:
> >> For now, soft lockup detector warns once for each case of process softlockup.
> >> But the thread 'watchdog/n' may not always get the cpu at the time slot between
> >> the task switch of two processes hogging that cpu to reset soft_watchdog_warn.
> >>
> >> An example would be two processes hogging the cpu.  Process A causes the
> >> softlockup warning and is killed manually by a user.  Process B immediately
> >> becomes the new process hogging the cpu preventing the softlockup code from
> >> resetting the soft_watchdog_warn variable.
> >>
> >> This case is a false negative of "warn only once for a process", as there may
> >> be a different process that is going to hog the cpu.  Resolve this by
> >> saving/checking the task pointer of the hogging process and use that to reset
> >> soft_watchdog_warn too.
> >>
> >> Signed-off-by: chai wen <chaiw.fnst@cn.fujitsu.com>
> >> Signed-off-by: Don Zickus <dzickus@redhat.com>
> > 
> > Acked-by: Don Zickus <dzickus@redhat.com>
> > 
> 
> 
> Hi Andrew
> 
> Sorry to disturb you.
> Could you check and pick up this small improvement patch?
> 
> I am not sure which maintainer I should talk to, but the original version of
> this patch was queued to the -mm tree by you, so I assume it is in your charge.
> 
> 
> thanks
> chai wen

Hi Chai,

Sorry about that.  Ingo asked me privately to pick this up and re-post
with my signoff.  I was converting to a new test env and was going to use this
patch as an excuse to exercise it.  That is the delay.  Let me get this
out today.

Cheers,
Don

> 
> >> ---
> >>  kernel/watchdog.c |   16 +++++++++++++++-
> >>  1 files changed, 15 insertions(+), 1 deletions(-)
> >>
> >> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> >> index 0037db6..2e55620 100644
> >> --- a/kernel/watchdog.c
> >> +++ b/kernel/watchdog.c
> >> @@ -42,6 +42,7 @@ static DEFINE_PER_CPU(bool, softlockup_touch_sync);
> >>  static DEFINE_PER_CPU(bool, soft_watchdog_warn);
> >>  static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts);
> >>  static DEFINE_PER_CPU(unsigned long, soft_lockup_hrtimer_cnt);
> >> +static DEFINE_PER_CPU(struct task_struct *, softlockup_task_ptr_saved);
> >>  #ifdef CONFIG_HARDLOCKUP_DETECTOR
> >>  static DEFINE_PER_CPU(bool, hard_watchdog_warn);
> >>  static DEFINE_PER_CPU(bool, watchdog_nmi_touch);
> >> @@ -328,8 +329,20 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
> >>  			return HRTIMER_RESTART;
> >>  
> >>  		/* only warn once */
> >> -		if (__this_cpu_read(soft_watchdog_warn) == true)
> >> +		if (__this_cpu_read(soft_watchdog_warn) == true) {
> >> +			/*
> >> +			 * Handle the case where multiple processes are
> >> +			 * causing softlockups but the duration is small
> >> +			 * enough, the softlockup detector can not reset
> >> +			 * itself in time.  Use task pointers to detect this.
> >> +			 */
> >> +			if (__this_cpu_read(softlockup_task_ptr_saved) !=
> >> +			    current) {
> >> +				__this_cpu_write(soft_watchdog_warn, false);
> >> +				__touch_watchdog();
> >> +			}
> >>  			return HRTIMER_RESTART;
> >> +		}
> >>  
> >>  		if (softlockup_all_cpu_backtrace) {
> >>  			/* Prevent multiple soft-lockup reports if one cpu is already
> >> @@ -345,6 +358,7 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
> >>  		pr_emerg("BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n",
> >>  			smp_processor_id(), duration,
> >>  			current->comm, task_pid_nr(current));
> >> +		__this_cpu_write(softlockup_task_ptr_saved, current);
> >>  		print_modules();
> >>  		print_irqtrace_events(current);
> >>  		if (regs)
> >> -- 
> >> 1.7.1
> >>
> > .
> > 
> 
> 
> 
> -- 
> Regards
> 
> Chai Wen


* Re: [PATCH] softlockup: make detector be aware of task switch of processes hogging cpu
  2014-08-26 14:22                           ` Don Zickus
@ 2014-08-27  1:33                             ` Chai Wen
  0 siblings, 0 replies; 34+ messages in thread
From: Chai Wen @ 2014-08-27  1:33 UTC (permalink / raw)
  To: Don Zickus; +Cc: akpm, mingo, linux-kernel

On 08/26/2014 10:22 PM, Don Zickus wrote:

> On Tue, Aug 26, 2014 at 08:51:30PM +0800, Chai Wen wrote:
>> On 08/22/2014 09:58 AM, Don Zickus wrote:
>>
>>> On Thu, Aug 21, 2014 at 01:42:22PM +0800, chai wen wrote:
>>>> For now, soft lockup detector warns once for each case of process softlockup.
>>>> But the thread 'watchdog/n' may not always get the cpu at the time slot between
>>>> the task switch of two processes hogging that cpu to reset soft_watchdog_warn.
>>>>
>>>> An example would be two processes hogging the cpu.  Process A causes the
>>>> softlockup warning and is killed manually by a user.  Process B immediately
>>>> becomes the new process hogging the cpu preventing the softlockup code from
>>>> resetting the soft_watchdog_warn variable.
>>>>
>>>> This case is a false negative of "warn only once for a process", as there may
>>>> be a different process that is going to hog the cpu.  Resolve this by
>>>> saving/checking the task pointer of the hogging process and use that to reset
>>>> soft_watchdog_warn too.
>>>>
>>>> Signed-off-by: chai wen <chaiw.fnst@cn.fujitsu.com>
>>>> Signed-off-by: Don Zickus <dzickus@redhat.com>
>>>
>>> Acked-by: Don Zickus <dzickus@redhat.com>
>>>
>>
>>
>> Hi Andrew
>>
>> Sorry to disturb you.
>> Could you check and pick up this small improvement patch?
>>
>> I am not sure which maintainer I should talk to, but the original version of
>> this patch was queued to the -mm tree by you, so I assume it is in your charge.
>>
>>
>> thanks
>> chai wen
> 
> Hi Chai,
> 
> Sorry about that.  Ingo asked me privately to pick this up and re-post
> with my signoff.  I was converting to a new test env and was going to use this
> patch as an excuse to exercise it.  That is the delay.  Let me get this
> out today.
> 


OK, it is kind of you to do that.  Thanks for your work. :)

thanks
chai wen

> Cheers,
> Don
> 
>>
>>>> ---
>>>>  kernel/watchdog.c |   16 +++++++++++++++-
>>>>  1 files changed, 15 insertions(+), 1 deletions(-)
>>>>
>>>> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
>>>> index 0037db6..2e55620 100644
>>>> --- a/kernel/watchdog.c
>>>> +++ b/kernel/watchdog.c
>>>> @@ -42,6 +42,7 @@ static DEFINE_PER_CPU(bool, softlockup_touch_sync);
>>>>  static DEFINE_PER_CPU(bool, soft_watchdog_warn);
>>>>  static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts);
>>>>  static DEFINE_PER_CPU(unsigned long, soft_lockup_hrtimer_cnt);
>>>> +static DEFINE_PER_CPU(struct task_struct *, softlockup_task_ptr_saved);
>>>>  #ifdef CONFIG_HARDLOCKUP_DETECTOR
>>>>  static DEFINE_PER_CPU(bool, hard_watchdog_warn);
>>>>  static DEFINE_PER_CPU(bool, watchdog_nmi_touch);
>>>> @@ -328,8 +329,20 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>>>>  			return HRTIMER_RESTART;
>>>>  
>>>>  		/* only warn once */
>>>> -		if (__this_cpu_read(soft_watchdog_warn) == true)
>>>> +		if (__this_cpu_read(soft_watchdog_warn) == true) {
>>>> +			/*
>>>> +			 * Handle the case where multiple processes are
>>>> +			 * causing softlockups but the duration is small
>>>> +			 * enough, the softlockup detector can not reset
>>>> +			 * itself in time.  Use task pointers to detect this.
>>>> +			 */
>>>> +			if (__this_cpu_read(softlockup_task_ptr_saved) !=
>>>> +			    current) {
>>>> +				__this_cpu_write(soft_watchdog_warn, false);
>>>> +				__touch_watchdog();
>>>> +			}
>>>>  			return HRTIMER_RESTART;
>>>> +		}
>>>>  
>>>>  		if (softlockup_all_cpu_backtrace) {
>>>>  			/* Prevent multiple soft-lockup reports if one cpu is already
>>>> @@ -345,6 +358,7 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>>>>  		pr_emerg("BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n",
>>>>  			smp_processor_id(), duration,
>>>>  			current->comm, task_pid_nr(current));
>>>> +		__this_cpu_write(softlockup_task_ptr_saved, current);
>>>>  		print_modules();
>>>>  		print_irqtrace_events(current);
>>>>  		if (regs)
>>>> -- 
>>>> 1.7.1
>>>>
>>> .
>>>
>>
>>
>>
>> -- 
>> Regards
>>
>> Chai Wen
> .
> 



-- 
Regards

Chai Wen

Thread overview: 34+ messages
2014-08-11 14:49 [PATCH 0/5] watchdog: various fixes Don Zickus
2014-08-11 14:49 ` [PATCH 1/5] watchdog: remove unnecessary head files Don Zickus
2014-08-18 18:03   ` [tip:perf/watchdog] watchdog: Remove unnecessary header files tip-bot for chai wen
2014-08-11 14:49 ` [PATCH 2/5] softlockup: make detector be aware of task switch of processes hogging cpu Don Zickus
2014-08-18  9:03   ` Ingo Molnar
2014-08-18 15:06     ` Don Zickus
2014-08-18 18:01       ` Ingo Molnar
2014-08-18 18:43         ` Don Zickus
2014-08-18 19:02           ` Ingo Molnar
2014-08-18 20:38             ` Don Zickus
2014-08-19  1:36               ` Chai Wen
2014-08-21  1:37                 ` Chai Wen
2014-08-21  2:30                   ` Don Zickus
2014-08-21  5:42                     ` [PATCH] " chai wen
2014-08-22  1:12                       ` Chai Wen
2014-08-22  1:58                       ` Don Zickus
2014-08-26 12:51                         ` Chai Wen
2014-08-26 14:22                           ` Don Zickus
2014-08-27  1:33                             ` Chai Wen
2014-08-11 14:49 ` [PATCH 3/5] watchdog: fix print-once on enable Don Zickus
2014-08-18  9:05   ` Ingo Molnar
2014-08-18  9:07   ` Ingo Molnar
2014-08-18 15:07     ` Don Zickus
2014-08-18 18:03   ` [tip:perf/watchdog] watchdog: Fix " tip-bot for Ulrich Obergfell
2014-08-11 14:49 ` [PATCH 4/5] watchdog: control hard lockup detection default Don Zickus
2014-08-18  9:12   ` Ingo Molnar
2014-08-18 15:07     ` Don Zickus
2014-08-18  9:16   ` Ingo Molnar
2014-08-18 10:44     ` Ulrich Obergfell
2014-08-18 15:17     ` Don Zickus
2014-08-18 18:07       ` Ingo Molnar
2014-08-18 18:53         ` Don Zickus
2014-08-18 19:00           ` Ingo Molnar
2014-08-11 14:49 ` [PATCH 5/5] kvm: ensure hard lockup detection is disabled by default Don Zickus
