All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: Andi Kleen <ak@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Feng Tang <feng.tang@intel.com>,
	kernel test robot <oliver.sang@intel.com>,
	0day robot <lkp@intel.com>, John Stultz <john.stultz@linaro.org>,
	Stephen Boyd <sboyd@kernel.org>, Jonathan Corbet <corbet@lwn.net>,
	Mark Rutland <Mark.Rutland@arm.com>,
	Marc Zyngier <maz@kernel.org>,
	Xing Zhengjun <zhengjun.xing@linux.intel.com>,
	LKML <linux-kernel@vger.kernel.org>,
	lkp@lists.01.org, kernel-team@fb.com, neeraju@codeaurora.org,
	zhengjun.xing@intel.com, x86@kernel.org,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [clocksource]  8c30ace35d: WARNING:at_kernel/time/clocksource.c:#clocksource_watchdog
Date: Thu, 29 Apr 2021 17:59:19 -0700	[thread overview]
Message-ID: <20210430005919.GA983840@paulmck-ThinkPad-P17-Gen-1> (raw)
In-Reply-To: <20210430002459.GB975577@paulmck-ThinkPad-P17-Gen-1>

On Thu, Apr 29, 2021 at 05:24:59PM -0700, Paul E. McKenney wrote:
> On Thu, Apr 29, 2021 at 04:04:11PM -0700, Andi Kleen wrote:
> > > > The idea is to leave the watchdog code in kernel/time/clocksource.c,
> > > > but to move the fault injection into kernel/time/clocksourcefault.c or
> > > > some such.  In this new file, new clocksource structures are created that
> > > > use some existing timebase/clocksource under the covers.  These then
> > > > inject delays based on module parameters (one senstive to CPU number,
> > > > the other unconditional).  They register these clocksources using the
> > > > normal interfaces, and verify that they are eventually marked unstable
> > > > when the fault-injection parameters warrant it.  This is combined with
> > > > the usual checking of the console log.
> > > >
> > > > Or am I missing your point?
> > > 
> > > That's what I meant.
> > 
> > I still think all this stuff should be in the fault injection framework,
> > like other fault injections, to have a consistent discoverable interface. 
> > 
> > I actually checked now and the standard fault injection supports boot arguments,
> > so needing it at boot time shouldn't be a barrier.
> 
> Per Thomas's feedback, I am in the midst of converting this to a unit
> test implemented as a kernel module, at which point the only fault
> injection will be in the unit test.
> 
> At the moment, the code just registers, reads, unregisters, and verifies
> that the bogus unit-test clocksources act normally.  Fault injection is
> next on the list for the fine-grained clocksource.  Which, as Thomas
> noted, is quite a bit simpler, as I just need to force a delay until
> the clocksource gets marked unstable with no need for fancy counting.

And this is what I currently get on the console from a successful test:

------------------------------------------------------------------------

clocksource_wdtest: --- holdoff=20
clocksource_wdtest: --- Verify jiffies-like uncertainty margin.
clocksource: wdtest-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
clocksource_wdtest: --- Verify tsc-like uncertainty margin.
clocksource: wdtest-ktime: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
clocksource_wdtest: --- tsc-like times: 1619743817068433427 - 1619743817068432303 = 1124.
clocksource_wdtest: --- Watchdog without error injection.
clocksource_wdtest: --- Watchdog with singleton error injection.
clocksource_wdtest: --- Watchdog with doublet error injection, expect console messages.
clocksource: timekeeping watchdog on CPU4: kvm-clock retried 2 times before success
clocksource_wdtest: --- Watchdog with quadruplet error injection, expect clock skew.
clocksource: timekeeping watchdog on CPU8: kvm-clock read-back delay of 401209ns, attempt 4, marking unstable
clocksource_wdtest: --- Marking wdtest-ktime unstable due to clocksource watchdog.
clocksource_wdtest: --- Done with test.

------------------------------------------------------------------------

The code currently looks like a dog's breakfast, so I will clean it
up before sending it out.  And of course add the time-readout error
injection to test the other clock-skew code path.

And yes, there are WARNs to verify that skew happens when it is supposed
to and so on.

							Thanx, Paul

WARNING: multiple messages have this Message-ID (diff)
From: Paul E. McKenney <paulmck@kernel.org>
To: lkp@lists.01.org
Subject: Re: [clocksource] 8c30ace35d: WARNING:at_kernel/time/clocksource.c:#clocksource_watchdog
Date: Thu, 29 Apr 2021 17:59:19 -0700	[thread overview]
Message-ID: <20210430005919.GA983840@paulmck-ThinkPad-P17-Gen-1> (raw)
In-Reply-To: <20210430002459.GB975577@paulmck-ThinkPad-P17-Gen-1>

[-- Attachment #1: Type: text/plain, Size: 3385 bytes --]

On Thu, Apr 29, 2021 at 05:24:59PM -0700, Paul E. McKenney wrote:
> On Thu, Apr 29, 2021 at 04:04:11PM -0700, Andi Kleen wrote:
> > > > The idea is to leave the watchdog code in kernel/time/clocksource.c,
> > > > but to move the fault injection into kernel/time/clocksourcefault.c or
> > > > some such.  In this new file, new clocksource structures are created that
> > > > use some existing timebase/clocksource under the covers.  These then
> > > > inject delays based on module parameters (one senstive to CPU number,
> > > > the other unconditional).  They register these clocksources using the
> > > > normal interfaces, and verify that they are eventually marked unstable
> > > > when the fault-injection parameters warrant it.  This is combined with
> > > > the usual checking of the console log.
> > > >
> > > > Or am I missing your point?
> > > 
> > > That's what I meant.
> > 
> > I still think all this stuff should be in the fault injection framework,
> > like other fault injections, to have a consistent discoverable interface. 
> > 
> > I actually checked now and the standard fault injection supports boot arguments,
> > so needing it at boot time shouldn't be a barrier.
> 
> Per Thomas's feedback, I am in the midst of converting this to a unit
> test implemented as a kernel module, at which point the only fault
> injection will be in the unit test.
> 
> At the moment, the code just registers, reads, unregisters, and verifies
> that the bogus unit-test clocksources act normally.  Fault injection is
> next on the list for the fine-grained clocksource.  Which, as Thomas
> noted, is quite a bit simpler, as I just need to force a delay until
> the clocksource gets marked unstable with no need for fancy counting.

And this is what I currently get on the console from a successful test:

------------------------------------------------------------------------

clocksource_wdtest: --- holdoff=20
clocksource_wdtest: --- Verify jiffies-like uncertainty margin.
clocksource: wdtest-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
clocksource_wdtest: --- Verify tsc-like uncertainty margin.
clocksource: wdtest-ktime: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
clocksource_wdtest: --- tsc-like times: 1619743817068433427 - 1619743817068432303 = 1124.
clocksource_wdtest: --- Watchdog without error injection.
clocksource_wdtest: --- Watchdog with singleton error injection.
clocksource_wdtest: --- Watchdog with doublet error injection, expect console messages.
clocksource: timekeeping watchdog on CPU4: kvm-clock retried 2 times before success
clocksource_wdtest: --- Watchdog with quadruplet error injection, expect clock skew.
clocksource: timekeeping watchdog on CPU8: kvm-clock read-back delay of 401209ns, attempt 4, marking unstable
clocksource_wdtest: --- Marking wdtest-ktime unstable due to clocksource watchdog.
clocksource_wdtest: --- Done with test.

------------------------------------------------------------------------

The code currently looks like a dog's breakfast, so I will clean it
up before sending it out.  And of course add the time-readout error
injection to test the other clock-skew code path.

And yes, there are WARNs to verify that skew happens when it is supposed
to and so on.

							Thanx, Paul

  reply	other threads:[~2021-04-30  0:59 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-25 22:45 [PATCH v10 clocksource 0/7] Do not mark clocks unstable due to delays for v5.13 Paul E. McKenney
2021-04-25 22:47 ` [PATCH v10 clocksource 1/7] clocksource: Provide module parameters to inject delays in watchdog Paul E. McKenney
2021-04-26  4:07   ` Andi Kleen
2021-04-26  7:13     ` Thomas Gleixner
2021-04-26 15:28     ` Paul E. McKenney
2021-04-26 16:00       ` Andi Kleen
2021-04-26 16:14         ` Paul E. McKenney
2021-04-26 17:56           ` Andi Kleen
2021-04-26 18:24             ` Paul E. McKenney
2021-04-28  4:49               ` Luming Yu
2021-04-28 13:57                 ` Paul E. McKenney
2021-04-28 14:24                   ` Luming Yu
2021-04-28 14:37                     ` Thomas Gleixner
2021-04-25 22:47 ` [PATCH v10 clocksource 2/7] clocksource: Retry clock read if long delays detected Paul E. McKenney
2021-04-27  1:44   ` Feng Tang
2021-04-25 22:47 ` [PATCH v10 clocksource 3/7] clocksource: Check per-CPU clock synchronization when marked unstable Paul E. McKenney
2021-04-26  4:12   ` Andi Kleen
2021-04-26  7:16     ` Thomas Gleixner
2021-04-25 22:47 ` [PATCH v10 clocksource 4/7] clocksource: Provide a module parameter to fuzz per-CPU clock checking Paul E. McKenney
2021-04-25 22:47 ` [PATCH v10 clocksource 5/7] clocksource: Limit number of CPUs checked for clock synchronization Paul E. McKenney
2021-04-25 22:47 ` [PATCH v10 clocksource 6/7] clocksource: Forgive tsc_early pre-calibration drift Paul E. McKenney
2021-04-26 15:01   ` Feng Tang
2021-04-26 15:25     ` Paul E. McKenney
2021-04-26 15:36       ` Feng Tang
2021-04-26 18:26         ` Paul E. McKenney
2021-04-27  1:13           ` Feng Tang
2021-04-27  3:46             ` Paul E. McKenney
2021-04-27  4:16               ` Feng Tang
2021-04-26 15:28     ` Thomas Gleixner
2021-04-27 21:03     ` Thomas Gleixner
2021-04-27  7:27   ` [clocksource] 8c30ace35d: WARNING:at_kernel/time/clocksource.c:#clocksource_watchdog kernel test robot
2021-04-27  7:27     ` kernel test robot
2021-04-27  8:45     ` Feng Tang
2021-04-27  8:45       ` Feng Tang
2021-04-27 13:37       ` Paul E. McKenney
2021-04-27 13:37         ` Paul E. McKenney
2021-04-27 17:50         ` Paul E. McKenney
2021-04-27 17:50           ` Paul E. McKenney
2021-04-27 21:09           ` Thomas Gleixner
2021-04-27 21:09             ` Thomas Gleixner
2021-04-28  1:48             ` Paul E. McKenney
2021-04-28  1:48               ` Paul E. McKenney
2021-04-28 10:14               ` Thomas Gleixner
2021-04-28 10:14                 ` Thomas Gleixner
2021-04-28 18:31                 ` Paul E. McKenney
2021-04-28 18:31                   ` Paul E. McKenney
2021-04-28 13:34             ` Thomas Gleixner
2021-04-28 13:34               ` Thomas Gleixner
2021-04-28 15:39               ` Peter Zijlstra
2021-04-28 15:39                 ` Peter Zijlstra
2021-04-28 17:00                 ` Thomas Gleixner
2021-04-28 17:00                   ` Thomas Gleixner
2021-04-29  7:38                   ` Feng Tang
2021-04-29  7:38                     ` Feng Tang
2021-04-28 18:31               ` Paul E. McKenney
2021-04-28 18:31                 ` Paul E. McKenney
2021-04-29  8:27                 ` Thomas Gleixner
2021-04-29  8:27                   ` Thomas Gleixner
2021-04-29 14:26                   ` Paul E. McKenney
2021-04-29 14:26                     ` Paul E. McKenney
2021-04-29 17:30                     ` Thomas Gleixner
2021-04-29 17:30                       ` Thomas Gleixner
2021-04-29 23:04                       ` Andi Kleen
2021-04-29 23:04                         ` Andi Kleen
2021-04-30  0:24                         ` Paul E. McKenney
2021-04-30  0:24                           ` Paul E. McKenney
2021-04-30  0:59                           ` Paul E. McKenney [this message]
2021-04-30  0:59                             ` Paul E. McKenney
2021-04-30  5:08                       ` Paul E. McKenney
2021-04-30  5:08                         ` Paul E. McKenney
2021-04-25 22:47 ` [PATCH v9 clocksource 6/6] clocksource: Reduce WATCHDOG_THRESHOLD Paul E. McKenney
2021-04-25 22:47 ` [PATCH v10 clocksource 7/7] " Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210430005919.GA983840@paulmck-ThinkPad-P17-Gen-1 \
    --to=paulmck@kernel.org \
    --cc=Mark.Rutland@arm.com \
    --cc=ak@linux.intel.com \
    --cc=corbet@lwn.net \
    --cc=feng.tang@intel.com \
    --cc=john.stultz@linaro.org \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lkp@intel.com \
    --cc=lkp@lists.01.org \
    --cc=maz@kernel.org \
    --cc=neeraju@codeaurora.org \
    --cc=oliver.sang@intel.com \
    --cc=pbonzini@redhat.com \
    --cc=sboyd@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    --cc=zhengjun.xing@intel.com \
    --cc=zhengjun.xing@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.