linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Petr Mladek <pmladek@suse.com>
To: Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Laurence Oberman <loberman@redhat.com>,
	Vincent Whitchurch <vincent.whitchurch@axis.com>,
	Michal Hocko <mhocko@suse.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 6/7] watchdog: Cleanup handling of false positives
Date: Mon, 17 May 2021 17:01:14 +0200	[thread overview]
Message-ID: <YKKFOoMVfAZtDWqE@alley> (raw)
In-Reply-To: <YKD3/RuL/2qUcRhL@google.com>

On Sun 2021-05-16 19:46:21, Sergey Senozhatsky wrote:
> Hi,
> 
> // This was never in my inbox, so sorry if I mess up the "Reply-to"
> // Original message:  https://lore.kernel.org/lkml/20210311122130.6788-7-pmladek@suse.com/
> 
> 
> >@@ -375,7 +375,14 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
> > 	/* .. and repeat */
> > 	hrtimer_forward_now(hrtimer, ns_to_ktime(sample_period));
> >
> > -	/* Reset the interval when touched externally by a known slow code. */
> > +	/*
> > +	 * If a virtual machine is stopped by the host it can look to
> > +	 * the watchdog like a soft lockup. Check to see if the host
> > +	 * stopped the vm before we process the timestamps.
> > +	 */
> > +	kvm_check_and_clear_guest_paused();
> > +
> [..]
> >@@ -401,14 +405,6 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
> > 	 */
> > 	duration = is_softlockup(touch_ts, period_ts);
> > 	if (unlikely(duration)) {
> > -		/*
> > -		 * If a virtual machine is stopped by the host it can look to
> > -		 * the watchdog like a soft lockup, check to see if the host
> > -		 * stopped the vm before we issue the warning
> > -		 */
> > -		if (kvm_check_and_clear_guest_paused())
> > -			return HRTIMER_RESTART;
> 
> This looks racy to me. I believe kvm_check_and_clear_guest_paused()
> was in the right place.
> 
> VCPU can be scheduled out/preepmpted any time at any point; and then
> guest VM (or even the entire system) can be suspended. When we resume
> the VM we continue from where we were preempted (from VCPU POW).
> 
> So what the old code did
> 
> watchdog_timer_fn()
> {
> 	...
> 	<<!!>>
> 
> 	// Suppose we are suspended here. When we are getting resumed
> 	// jiffies jump forward, which may look like a soft lockup.
> 	duration = is_softlockup(touch_ts, period_ts);
> 	if (unlikely(duration)) {
> 		// And this is where kvm_check_and_clear_guest_paused()
> 		// jumps in. We know already that jiffies have jumped,
> 		// we don't know if jiffies jumped because the VM was
> 		// suspended. And this is what we figure out here and
> 		// bail out
> 		if (kvm_check_and_clear_guest_paused())
> 			return HRTIMER_RESTART;
> 	}
> }
> 
> The new code does the following
> 
> watchdog_timer_fn()
> {
> 	...
> 	kvm_check_and_clear_guest_paused(); // PVCLOCK_GUEST_STOPPED is not set
> 
> 	<<!!>>
> 
> 	// Suppose the VM got suspended at this point. PVCLOCK_GUEST_STOPPED
> 	// is set, but we don't check it. jiffies will jump and this will look
> 	// like a lockup, but we don't check if jiffies jumped because the VM
> 	// was suspended
> 	duration = is_softlockup(touch_ts, period_ts);
> 	if (unlikely(duration)) {
> 		// report the lockup and perhaps panic the system,
> 		// depending on the configuration
> 	}
> }
> 
> What am I missing?

Great catch! You have a point.

Well, I think that the entire code is racy. touch_ts and period_ts are
set by:

	unsigned long touch_ts = __this_cpu_read(watchdog_touch_ts);
	unsigned long period_ts = __this_cpu_read(watchdog_report_ts);

They are neither volatile not there are any barriers. It means that
period_ts might be re-read in these two checks:

	/* Reset the interval when touched by known problematic code. */
	if (period_ts == SOFTLOCKUP_DELAY_REPORT) {
		update_report_ts();
		return HRTIMER_RESTART;
	}

and

	duration = is_softlockup(touch_ts, period_ts);


where:

static int is_softlockup(unsigned long touch_ts, unsigned long period_ts)
{
	unsigned long now = get_timestamp();

	if ((watchdog_enabled & SOFT_WATCHDOG_ENABLED) && watchdog_thresh){
		/* Warn about unreasonable delays. */
		if (time_after(now, period_ts + get_softlockup_thresh()))
			return now - touch_ts;
	}
	return 0;
}

Now, if the watchdog is touched from NMI. period_ts might be
SOFTLOCKUP_DELAY_REPORT. It is ULONG_MAX.

As a result period_ts + get_softlockup_thresh() would overflow and
we could report softlockup even when there is none.

I probably does not happen because the per-CPU variable is read only
once. And watchdogs are not typically touched from NMI. Except that
show_regs() actually touch the watchdog.

That said. This patch most likely made things worse and should be
reverted. But even better solution would be to remove this race.
I mean to make the code safe and sane at the same time.

Best Regards,
Petr

  reply	other threads:[~2021-05-17 15:10 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-16 10:46 [PATCH v2 6/7] watchdog: Cleanup handling of false positives Sergey Senozhatsky
2021-05-17 15:01 ` Petr Mladek [this message]
  -- strict thread matches above, loose matches on Subject: below --
2021-03-11 12:21 [PATCH v2 0/7] watchdog/softlockup: Report overall time and some cleanup Petr Mladek
2021-03-11 12:21 ` [PATCH v2 6/7] watchdog: Cleanup handling of false positives Petr Mladek
2020-12-10 16:00 [PATCH v2 0/7] watchdog/softlockup: Report overall time and some cleanup Petr Mladek
2020-12-10 16:00 ` [PATCH v2 6/7] watchdog: Cleanup handling of false positives Petr Mladek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YKKFOoMVfAZtDWqE@alley \
    --to=pmladek@suse.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=loberman@redhat.com \
    --cc=mhocko@suse.com \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=vincent.whitchurch@axis.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).