From: "Paul E. McKenney" <paulmck@kernel.org>
To: Feng Tang <feng.tang@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@intel.com>,
	"H . Peter Anvin" <hpa@zytor.com>,
	Peter Zijlstra <peterz@infradead.org>,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	rui.zhang@intel.com, andi.kleen@intel.com, len.brown@intel.com,
	tim.c.chen@intel.com
Subject: Re: [PATCH v3 2/2] x86/tsc: skip tsc watchdog checking for qualified platforms
Date: Tue, 30 Nov 2021 06:40:48 -0800
Message-ID: <20211130144048.GQ641268@paulmck-ThinkPad-P17-Gen-1>
In-Reply-To: <20211130064623.GB96474@shbuild999.sh.intel.com>

On Tue, Nov 30, 2021 at 02:46:23PM +0800, Feng Tang wrote:
> On Wed, Nov 17, 2021 at 10:37:51AM +0800, Feng Tang wrote:
> > There are cases where tsc clocksources are wrongly judged as unstable by
> > clocksource watchdogs like hpet, acpi_pm or 'refined-jiffies'. As
> > there is hardly a general reliable way to check the validity of a
> > watchdog, and to protect the innocent tsc, Thomas Gleixner proposed [1]:
> 
> Hi All,
> 
> Some more updates: last week we got a report from the validation team that
> the "tsc judged as unstable" problem happened on the latest desktop platform,
> which has serial earlyprintk enabled, and the watchdog there is
> 'refined-jiffies' while hpet is disabled during the PC10 check. I
> tried several other client platforms I could find: Kabylake, Icelake
> and Alderlake, and the misjudging can be easily reproduced on
> Icelake and Alderlake (not on Kabylake). It could be cured by
> this 2/2 patch.
> 
> Also, today we got the same report on a 2-socket Icelake server with
> a 5.5 kernel, where the watchdog is 'hpet' and the system is running
> a stressful big-data workload.

Were these tests run with Waiman's latest patch series?  The first
two of them are on RCU's "dev" branch.

							Thanx, Paul

> Thanks,
> Feng
> 
> 
> > "I'm inclined to lift that requirement when the CPU has:
> > 
> >     1) X86_FEATURE_CONSTANT_TSC
> >     2) X86_FEATURE_NONSTOP_TSC
> >     3) X86_FEATURE_NONSTOP_TSC_S3
> >     4) X86_FEATURE_TSC_ADJUST
> >     5) At max. 4 sockets
> > 
> >  After two decades of horrors we're finally at a point where TSC seems
> >  to be halfway reliable and less abused by BIOS tinkerers. TSC_ADJUST
> >  was really key as we can now detect even small modifications reliably
> >  and the important point is that we can cure them as well (not pretty
> >  but better than all other options)."
> > 
> > As feature #3, X86_FEATURE_NONSTOP_TSC_S3, only exists on several generations
> > of Atom processors, and is always coupled with X86_FEATURE_CONSTANT_TSC
> > and X86_FEATURE_NONSTOP_TSC, skip checking it, and also be more defensive
> > by using a maximum of 2 sockets.
> > 
> > The check is done inside tsc_init() before registering 'tsc-early' and
> > 'tsc' clocksources, as there were cases that both of them had been
> > wrongly judged as unreliable.
> > 
> > For more background of tsc/watchdog, there is a good summary in [2]
> > 
> > [1]. https://lore.kernel.org/lkml/87eekfk8bd.fsf@nanos.tec.linutronix.de/
> > [2]. https://lore.kernel.org/lkml/87a6pimt1f.ffs@nanos.tec.linutronix.de/
> > Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> > Signed-off-by: Feng Tang <feng.tang@intel.com>
> > ---
> > Change log:
> > 
> >   v3:
> >     * rebased against 5.16-rc1
> >     * refine commit log
> > 
> >   v2:
> >     * Directly skip watchdog check without messing flag
> >       'tsc_clocksource_reliable' (Thomas)
> > 
> >  arch/x86/kernel/tsc.c | 22 ++++++++++++++++++----
> >  1 file changed, 18 insertions(+), 4 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> > index 2e076a459a0c..389511f59101 100644
> > --- a/arch/x86/kernel/tsc.c
> > +++ b/arch/x86/kernel/tsc.c
> > @@ -1180,6 +1180,12 @@ void mark_tsc_unstable(char *reason)
> >  
> >  EXPORT_SYMBOL_GPL(mark_tsc_unstable);
> >  
> > +static void __init tsc_skip_watchdog_verify(void)
> > +{
> > +	clocksource_tsc_early.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
> > +	clocksource_tsc.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
> > +}
> > +
> >  static void __init check_system_tsc_reliable(void)
> >  {
> >  #if defined(CONFIG_MGEODEGX1) || defined(CONFIG_MGEODE_LX) || defined(CONFIG_X86_GENERIC)
> > @@ -1196,6 +1202,17 @@ static void __init check_system_tsc_reliable(void)
> >  #endif
> >  	if (boot_cpu_has(X86_FEATURE_TSC_RELIABLE))
> >  		tsc_clocksource_reliable = 1;
> > +
> > +	/*
> > +	 * Ideally the socket number should be checked, but this is called
> > +	 * by tsc_init() which is in early boot phase and the socket numbers
> > +	 * may not be available. Use 'nr_online_nodes' as a fallback solution
> > +	 */
> > +	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
> > +	    boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
> > +	    boot_cpu_has(X86_FEATURE_TSC_ADJUST) &&
> > +	    nr_online_nodes <= 2)
> > +		tsc_skip_watchdog_verify();
> >  }
> >  
> >  /*
> > @@ -1387,9 +1404,6 @@ static int __init init_tsc_clocksource(void)
> >  	if (tsc_unstable)
> >  		goto unreg;
> >  
> > -	if (tsc_clocksource_reliable || no_tsc_watchdog)
> > -		clocksource_tsc.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
> > -
> >  	if (boot_cpu_has(X86_FEATURE_NONSTOP_TSC_S3))
> >  		clocksource_tsc.flags |= CLOCK_SOURCE_SUSPEND_NONSTOP;
> >  
> > @@ -1527,7 +1541,7 @@ void __init tsc_init(void)
> >  	}
> >  
> >  	if (tsc_clocksource_reliable || no_tsc_watchdog)
> > -		clocksource_tsc_early.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
> > +		tsc_skip_watchdog_verify();
> >  
> >  	clocksource_register_khz(&clocksource_tsc_early, tsc_khz);
> >  	detect_art();
> > -- 
> > 2.27.0
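
The qualification rule in the quoted patch can be modeled outside the
kernel. What follows is only a user-space sketch under stated assumptions,
not the kernel implementation: every sketch_* name, the feature bit values
and the SKETCH_MUST_VERIFY value are hypothetical stand-ins for
boot_cpu_has(), the X86_FEATURE_* bits and CLOCK_SOURCE_MUST_VERIFY. It
shows the two steps of the patch: decide whether the platform qualifies
(all three features present and at most 2 nodes), then clear the
must-verify flag on both the early and the final TSC clocksource.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the CPU feature bits tested by the patch. */
#define SKETCH_CONSTANT_TSC (1u << 0)
#define SKETCH_NONSTOP_TSC  (1u << 1)
#define SKETCH_TSC_ADJUST   (1u << 2)

/* Hypothetical stand-in for CLOCK_SOURCE_MUST_VERIFY (value is assumed). */
#define SKETCH_MUST_VERIFY  0x02u

/* Minimal model of a clocksource: only the flags word matters here. */
struct sketch_clocksource {
	unsigned int flags;
};

/*
 * Mirrors the condition added to check_system_tsc_reliable(): all three
 * features must be present and the node count must not exceed 2.
 */
static bool sketch_tsc_qualifies(unsigned int features, int nr_nodes)
{
	const unsigned int required = SKETCH_CONSTANT_TSC |
				      SKETCH_NONSTOP_TSC |
				      SKETCH_TSC_ADJUST;

	return (features & required) == required && nr_nodes <= 2;
}

/*
 * Mirrors tsc_skip_watchdog_verify(): clear the must-verify flag on both
 * clocksources and leave every other flag untouched.
 */
static void sketch_skip_watchdog_verify(struct sketch_clocksource *early,
					struct sketch_clocksource *tsc)
{
	early->flags &= ~SKETCH_MUST_VERIFY;
	tsc->flags &= ~SKETCH_MUST_VERIFY;
}

/* Small self-check that a caller can run from its own main(). */
static void sketch_self_check(void)
{
	const unsigned int all = SKETCH_CONSTANT_TSC | SKETCH_NONSTOP_TSC |
				 SKETCH_TSC_ADJUST;
	struct sketch_clocksource early = { SKETCH_MUST_VERIFY | 0x01u };
	struct sketch_clocksource tsc = { SKETCH_MUST_VERIFY };

	assert(sketch_tsc_qualifies(all, 2));
	assert(!sketch_tsc_qualifies(all, 4));

	if (sketch_tsc_qualifies(all, 1))
		sketch_skip_watchdog_verify(&early, &tsc);

	assert(early.flags == 0x01u);
	assert(tsc.flags == 0u);
}
```

Keeping the decision (sketch_tsc_qualifies) separate from the action
(sketch_skip_watchdog_verify) follows the shape of the patch, where the
same helper is also called for the tsc_clocksource_reliable and
no_tsc_watchdog cases.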
