linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] watchdog: revert cleanup handling of false positives
@ 2021-05-17 14:06 Sergey Senozhatsky
  2021-05-18 15:36 ` Petr Mladek
  0 siblings, 1 reply; 7+ messages in thread
From: Sergey Senozhatsky @ 2021-05-17 14:06 UTC (permalink / raw)
  To: Andrew Morton, Petr Mladek
  Cc: Peter Zijlstra, linux-kernel, Sergey Senozhatsky

This reverts commit 9bf3bc949f8aeefeacea4b1198db833b722a8e27.

I can reproduce the case when resumed VCPU starts to execute
is_softlockup() with PVCLOCK_GUEST_STOPPED set on this VCPU:

	watchdog_timer_fn()
	{
		...

		kvm_check_and_clear_guest_paused();

		...

		duration = is_softlockup(touch_ts, period_ts);
		if (unlikely(duration)) {
			....
		}
	}

Which means that guest VCPU has been suspended between
kvm_check_and_clear_guest_paused() and is_softlockup(),
and jiffies (clock) thus shifted forward.

VCPU can be preempted anywhere, so we want to do
kvm_check_and_clear_guest_paused() at the very last
moment: when we are in the soft-lockup branch but
then we figure out that it's false positive and we
need to bail out.

Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
 kernel/watchdog.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 7c397907d0e9..8cf0678378d2 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -376,14 +376,7 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
 	/* .. and repeat */
 	hrtimer_forward_now(hrtimer, ns_to_ktime(sample_period));
 
-	/*
-	 * If a virtual machine is stopped by the host it can look to
-	 * the watchdog like a soft lockup. Check to see if the host
-	 * stopped the vm before we process the timestamps.
-	 */
-	kvm_check_and_clear_guest_paused();
-
-	/* Reset the interval when touched by known problematic code. */
+	/* Reset the interval when touched externally by a known slow code. */
 	if (period_ts == SOFTLOCKUP_DELAY_REPORT) {
 		if (unlikely(__this_cpu_read(softlockup_touch_sync))) {
 			/*
@@ -394,7 +387,10 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
 			sched_clock_tick();
 		}
 
+		/* Clear the guest paused flag on watchdog reset */
+		kvm_check_and_clear_guest_paused();
 		update_report_ts();
+
 		return HRTIMER_RESTART;
 	}
 
@@ -406,6 +402,14 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
 	 */
 	duration = is_softlockup(touch_ts, period_ts);
 	if (unlikely(duration)) {
+		/*
+		 * If a virtual machine is stopped by the host it can look to
+		 * the watchdog like a soft lockup, check to see if the host
+		 * stopped the vm before we issue the warning
+		 */
+		if (kvm_check_and_clear_guest_paused())
+			return HRTIMER_RESTART;
+
 		/*
 		 * Prevent multiple soft-lockup reports if one cpu is already
 		 * engaged in dumping all cpu back traces.
-- 
2.31.1.751.gd2f1c929bd-goog


^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-05-20  5:52 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-17 14:06 [PATCH] watchdog: revert cleanup handling of false positives Sergey Senozhatsky
2021-05-18 15:36 ` Petr Mladek
2021-05-19 11:43   ` [PATCH] watchdog: Reliable handling of timestamps Petr Mladek
2021-05-19 12:01     ` Petr Mladek
2021-05-20  5:52       ` Sergey Senozhatsky
2021-05-20  5:29     ` Sergey Senozhatsky
2021-05-20  5:13   ` [PATCH] watchdog: revert cleanup handling of false positives Sergey Senozhatsky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).