All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sergey Senozhatsky <senozhatsky@chromium.org>
To: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] watchdog: Reliable handling of timestamps
Date: Thu, 20 May 2021 14:29:31 +0900	[thread overview]
Message-ID: <YKXzuy336Fmrcz1s@google.com> (raw)
In-Reply-To: <YKT55gw+RZfyoFf7@alley>

On (21/05/19 13:43), Petr Mladek wrote:
> 
> The commit 9bf3bc949f8aeefeacea4b ("watchdog: cleanup handling of false
> positives") tried to handle a virtual host stopped by the host a more
> straightforward and cleaner way.
> 
> But it introduced a risk of false softlockup reports. The virtual host
> might be stopped at any time, for example between
> kvm_check_and_clear_guest_paused() and is_softlockup().
> As a result, is_softlockup() might read the updated jiffies
> are detects softlockup.
> 
> A solution might be to put back kvm_check_and_clear_guest_paused()
> after is_softlockup() and detect it. But it would put back
> the cycle that complicates the logic.
> 
> In fact, the handling of all the timestamps is not reliable.
> The code does not guarantee when and how many times the timestamps
> are read. For example, "period_ts" might be touched anytime also
> from NMI and re-read in is_softlockup(). It works just by chance.
> 
> Fix all the problems by making the code even more explicit.
> 
> 1. Make sure that "now" and "period_ts" timestamps are read only
>    once. They might be changed at anytime by NMI or when the virtual
>    guest is stopped by the host. Note that "now" timestamp does
>    this implicitly because "jiffies" is marked volatile.
> 
> 2. "now" time must be read first. The state of "period_ts" will decide
>    whether it will be used or the period will get restarted.
> 
> 3. kvm_check_and_clear_guest_paused() must be called before reading
>    "period_ts". It touches the variable when the guest was
>    stopped.
> 
> As a result, "now" timestamp is used only when the watchdog was
> not touched and the guest not stopped in the meantime. "period_ts"
> is restarted in all other situations.
> 
> Fixes: 9bf3bc949f8aeefeacea4b ("watchdog: cleanup handling of false positives")
> Reported-by: Sergey Senozhatsky <senozhatsky@chromium.org>
> Signed-off-by: Petr Mladek <pmladek@suse.com>

Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>

  parent reply	other threads:[~2021-05-20  5:29 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-17 14:06 [PATCH] watchdog: revert cleanup handling of false positives Sergey Senozhatsky
2021-05-18 15:36 ` Petr Mladek
2021-05-19 11:43   ` [PATCH] watchdog: Reliable handling of timestamps Petr Mladek
2021-05-19 12:01     ` Petr Mladek
2021-05-20  5:52       ` Sergey Senozhatsky
2021-05-20  5:29     ` Sergey Senozhatsky [this message]
2021-05-20  5:13   ` [PATCH] watchdog: revert cleanup handling of false positives Sergey Senozhatsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YKXzuy336Fmrcz1s@google.com \
    --to=senozhatsky@chromium.org \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=pmladek@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.