From: Waiman Long <longman@redhat.com>
To: John Stultz <john.stultz@linaro.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Stephen Boyd <sboyd@kernel.org>, Feng Tang <feng.tang@intel.com>,
	"Paul E. McKenney" <paulmck@kernel.org>
Cc: linux-kernel@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	Cassio Neri <cassio.neri@gmail.com>,
	Linus Walleij <linus.walleij@linaro.org>,
	Frederic Weisbecker <frederic@kernel.org>,
	Waiman Long <longman@redhat.com>
Subject: [PATCH v2 0/4] clocksource: Avoid incorrect hpet fallback
Date: Tue, 16 Nov 2021 18:44:22 -0500
Message-ID: <20211116234426.837799-1-longman@redhat.com>

When an x86 system is stressed by running various benchmark suites,
the clocksource watchdog may occasionally mark the TSC as unstable and
fall back to hpet, which has a significant impact on system
performance.

The current watchdog clocksource skew threshold of 50us is found to be
insufficient. Patch 1 restores it to 100us, the value used before commit
2e27e793e280 ("clocksource: Reduce clocksource-skew threshold"). This
patch also skips the current clock skew check if the consecutive watchdog
read-back delay contributes a major portion of the total delay. On a
1-socket 64-thread test system, one test sample showed an hpet-tsc-hpet
delay of 95263ns while the corresponding hpet-hpet delay was 94425ns.
So the majority of the delay was caused by the hpet read.
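
The skip decision described above can be sketched roughly as follows.
This is an illustration only, not the code from patch 1: the helper
name skip_skew_check() is hypothetical, and "major portion" is assumed
here to mean more than half of the total wd-cs-wd delay.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch.
 *
 * wd_cs_wd_delay_ns: time between the two watchdog reads that bracket
 *                    the clocksource read (e.g. hpet-tsc-hpet).
 * wd_wd_delay_ns:    time between two back-to-back watchdog reads
 *                    (e.g. hpet-hpet).
 *
 * Assumption: the skew check is skipped when the back-to-back watchdog
 * read alone accounts for more than half of the total delay, since the
 * watchdog read itself is then the dominant source of delay.
 */
static bool skip_skew_check(uint64_t wd_cs_wd_delay_ns,
			    uint64_t wd_wd_delay_ns)
{
	return wd_wd_delay_ns > wd_cs_wd_delay_ns / 2;
}
```

With the sample above, skip_skew_check(95263, 94425) returns true, so
that sample would not be used to judge the TSC.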

Patch 2 reduces the default clocksource_watchdog() retries to 2 as
suggested by Paul.

Patch 3 implements dynamic readjustment of the new internal
watchdog_max_skew variable in case the current value causes excessive
skipping of clock skew checks. The following reproducer provided by
Feng Tang was used to cause the test skipping:

  sudo stress-ng --timeout 30 --times --verify --metrics-brief --ioport <n>

where <n> is the number of cpus in the system.

A sample watchdog_max_skew readjustment output was:

[  197.771144] clocksource: timekeeping watchdog on CPU8: hpet wd-wd read-back delay of 92539ns
[  197.789589] clocksource: wd-tsc-wd read-back delay of 90933ns, clock-skew test skipped!
[  197.807145] clocksource: timekeeping watchdog on CPU8: watchdog_max_skew increased to 185078ns

To avoid an excessive increase of watchdog_max_skew, a limit of
10*WATCHDOG_MAX_SKEW is used. Above that limit, the watchdog itself
will be marked unstable and a new watchdog will be selected if possible.
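
A minimal sketch of this readjustment, under stated assumptions, is
shown below. It is not the code from patch 3. The struct wd_state type
and the readjust_max_skew() helper are hypothetical, WATCHDOG_MAX_SKEW
is taken to be the 100us default from patch 1, and the "twice the wd-wd
read-back delay" rule is only inferred from the sample output above
(2 * 92539ns = 185078ns).

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_USEC		1000ULL
/* Assumed default from patch 1: 100us. */
#define WATCHDOG_MAX_SKEW	(100 * NSEC_PER_USEC)
/* Upper limit described in the text: 10*WATCHDOG_MAX_SKEW. */
#define WATCHDOG_MAX_SKEW_LIMIT	(10 * WATCHDOG_MAX_SKEW)

/* Hypothetical state holder, for illustration only. */
struct wd_state {
	uint64_t max_skew_ns;	/* current watchdog_max_skew */
	bool unstable;		/* watchdog itself marked unstable */
};

/*
 * Hypothetical helper: raise max_skew_ns to twice the observed
 * back-to-back watchdog read delay (an assumption inferred from the
 * sample log). If that would exceed the limit, mark the watchdog
 * unstable and leave the threshold unchanged.
 */
static void readjust_max_skew(struct wd_state *wd, uint64_t wd_wd_delay_ns)
{
	uint64_t new_skew = 2 * wd_wd_delay_ns;

	if (new_skew <= wd->max_skew_ns)
		return;		/* current threshold is already large enough */

	if (new_skew > WATCHDOG_MAX_SKEW_LIMIT) {
		wd->unstable = true;
		return;
	}

	wd->max_skew_ns = new_skew;
}
```

Starting from max_skew_ns = WATCHDOG_MAX_SKEW, a wd-wd delay of 92539ns
raises the threshold to 185078ns in this sketch, matching the sample
output.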

To exercise the code, WATCHDOG_MAX_SKEW was reduced to 10us. After
skipping 10 checks, the watchdog then fell back to acpi_pm. However,
the corresponding consecutive watchdog delay was still about the same,
leading to ping-ponging between hpet and acpi_pm as the watchdog.

Patch 4 adds a Kconfig option to allow the kernel builder to control
the actual WATCHDOG_MAX_SKEW threshold to be used.

Waiman Long (4):
  clocksource: Avoid accidental unstable marking of clocksources
  clocksource: Reduce the default clocksource_watchdog() retries to 2
  clocksource: Dynamically increase watchdog_max_skew
  clocksource: Add a Kconfig option for WATCHDOG_MAX_SKEW

 .../admin-guide/kernel-parameters.txt         |   4 +-
 kernel/time/Kconfig                           |   9 ++
 kernel/time/clocksource.c                     | 121 +++++++++++++++---
 3 files changed, 114 insertions(+), 20 deletions(-)

-- 
2.27.0


Thread overview: 9+ messages
2021-11-16 23:44 Waiman Long [this message]
2021-11-16 23:44 ` [PATCH v2 1/4] clocksource: Avoid accidental unstable marking of clocksources Waiman Long
2021-11-16 23:44 ` [PATCH v2 2/4] clocksource: Reduce the default clocksource_watchdog() retries to 2 Waiman Long
2021-11-16 23:44 ` [PATCH v2 3/4] clocksource: Dynamically increase watchdog_max_skew Waiman Long
2021-11-16 23:44 ` [PATCH v2 4/4] clocksource: Add a Kconfig option for WATCHDOG_MAX_SKEW Waiman Long
2021-11-17 16:54 ` [PATCH v2 0/4] clocksource: Avoid incorrect hpet fallback Paul E. McKenney
2021-11-17 18:51   ` Waiman Long
2021-11-17 21:25     ` Paul E. McKenney
2021-11-17 21:55       ` Waiman Long
