x86/tsc: mark tsc reliable for qualified platforms
diff mbox series

Message ID 1614653572-19941-1-git-send-email-feng.tang@intel.com
State New, archived
Headers show
Series
  • x86/tsc: mark tsc reliable for qualified platforms
Related show

Commit Message

Feng Tang March 2, 2021, 2:52 a.m. UTC
There are cases that tsc clocksource are wrongly judged as unstable by
clocksource watchdogs like hpet, acpi_pm or 'refined-jiffies'. While
there is hardly a general reliable way to check the validity of a
watchdog, and to protect the innocent tsc, Thomas Gleixner proposed [1]:

"I'm inclined to lift that requirement when the CPU has:

    1) X86_FEATURE_CONSTANT_TSC
    2) X86_FEATURE_NONSTOP_TSC
    3) X86_FEATURE_NONSTOP_TSC_S3
    4) X86_FEATURE_TSC_ADJUST
    5) At max. 4 sockets

 After two decades of horrors we're finally at a point where TSC seems
 to be halfway reliable and less abused by BIOS tinkerers. TSC_ADJUST
 was really key as we can now detect even small modifications reliably
 and the important point is that we can cure them as well (not pretty
 but better than all other options)."

So implement it with slight change as discussed in the thread, and be
more defensive to use maxim of 2 sockets.

The check is done inside tsc_init() before registering 'tsc-early' and
'tsc' clocksources, as there were cases that both of them have been
wrongly judged as unreliable.

[1]. https://lore.kernel.org/lkml/87eekfk8bd.fsf@nanos.tec.linutronix.de/
Signed-off-by: Feng Tang <feng.tang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
---
 arch/x86/kernel/tsc.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

Comments

Peter Zijlstra March 2, 2021, 9:14 a.m. UTC | #1
On Tue, Mar 02, 2021 at 10:52:52AM +0800, Feng Tang wrote:
> @@ -1193,6 +1193,17 @@ static void __init check_system_tsc_reliable(void)
>  #endif
>  	if (boot_cpu_has(X86_FEATURE_TSC_RELIABLE))
>  		tsc_clocksource_reliable = 1;
> +
> +	/*
> +	 * Ideally the socket number should be checked, but this is called
> +	 * by tsc_init() which is in early boot phase and the socket numbers
> +	 * may not be available. Use 'nr_online_nodes' as a fallback solution
> +	 */
> +	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC)
> +		&& boot_cpu_has(X86_FEATURE_NONSTOP_TSC)
> +		&& boot_cpu_has(X86_FEATURE_TSC_ADJUST)
> +		&& nr_online_nodes <= 2)
> +		tsc_clocksource_reliable = 1;

Logical operators go at the end of a line and alignment is with the (,
not the code block after it.
Feng Tang March 2, 2021, 11:55 a.m. UTC | #2
On Tue, Mar 02, 2021 at 10:14:01AM +0100, Peter Zijlstra wrote:
> On Tue, Mar 02, 2021 at 10:52:52AM +0800, Feng Tang wrote:
> > @@ -1193,6 +1193,17 @@ static void __init check_system_tsc_reliable(void)
> >  #endif
> >  	if (boot_cpu_has(X86_FEATURE_TSC_RELIABLE))
> >  		tsc_clocksource_reliable = 1;
> > +
> > +	/*
> > +	 * Ideally the socket number should be checked, but this is called
> > +	 * by tsc_init() which is in early boot phase and the socket numbers
> > +	 * may not be available. Use 'nr_online_nodes' as a fallback solution
> > +	 */
> > +	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC)
> > +		&& boot_cpu_has(X86_FEATURE_NONSTOP_TSC)
> > +		&& boot_cpu_has(X86_FEATURE_TSC_ADJUST)
> > +		&& nr_online_nodes <= 2)
> > +		tsc_clocksource_reliable = 1;
> 
> Logical operators go at the end of a line and alignment is with the (,
> not the code block after it.

Thanks for pointing out and the suggestion! Will change it to

	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
	    boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
	    boot_cpu_has(X86_FEATURE_TSC_ADJUST) &&
	    nr_online_nodes <= 2)
		tsc_clocksource_reliable = 1;

- Feng
Thomas Gleixner March 3, 2021, 2:51 p.m. UTC | #3
On Tue, Mar 02 2021 at 10:52, Feng Tang wrote:
> There are cases that tsc clocksource are wrongly judged as unstable by
> clocksource watchdogs like hpet, acpi_pm or 'refined-jiffies'. While
> there is hardly a general reliable way to check the validity of a
> watchdog, and to protect the innocent tsc, Thomas Gleixner proposed [1]:
>
> "I'm inclined to lift that requirement when the CPU has:
>
>     1) X86_FEATURE_CONSTANT_TSC
>     2) X86_FEATURE_NONSTOP_TSC
>     3) X86_FEATURE_NONSTOP_TSC_S3
>     4) X86_FEATURE_TSC_ADJUST
>     5) At max. 4 sockets
>
>  After two decades of horrors we're finally at a point where TSC seems
>  to be halfway reliable and less abused by BIOS tinkerers. TSC_ADJUST
>  was really key as we can now detect even small modifications reliably
>  and the important point is that we can cure them as well (not pretty
>  but better than all other options)."
>
> So implement it with slight change as discussed in the thread, and be
> more defensive to use maxim of 2 sockets.

Can you please explain the slight change in the changelog?

Thanks,

        tglx
Feng Tang March 8, 2021, 1:42 a.m. UTC | #4
On Wed, Mar 03, 2021 at 10:51:31PM +0800, Thomas Gleixner wrote:
> On Tue, Mar 02 2021 at 10:52, Feng Tang wrote:
> > There are cases that tsc clocksource are wrongly judged as unstable by
> > clocksource watchdogs like hpet, acpi_pm or 'refined-jiffies'. While
> > there is hardly a general reliable way to check the validity of a
> > watchdog, and to protect the innocent tsc, Thomas Gleixner proposed [1]:
> >
> > "I'm inclined to lift that requirement when the CPU has:
> >
> >     1) X86_FEATURE_CONSTANT_TSC
> >     2) X86_FEATURE_NONSTOP_TSC
> >     3) X86_FEATURE_NONSTOP_TSC_S3
> >     4) X86_FEATURE_TSC_ADJUST
> >     5) At max. 4 sockets
> >
> >  After two decades of horrors we're finally at a point where TSC seems
> >  to be halfway reliable and less abused by BIOS tinkerers. TSC_ADJUST
> >  was really key as we can now detect even small modifications reliably
> >  and the important point is that we can cure them as well (not pretty
> >  but better than all other options)."
> >
> > So implement it with slight change as discussed in the thread, and be
> > more defensive to use maxim of 2 sockets.
> 
> Can you please explain the slight change in the changelog?
 
Sorry for the late response. Just found this mail in my "Junk Mail"
folder with 3 copies, interesting mail sever filters!

I will add 
"As feature #3 X86_FEATURE_NONSTOP_TSC_S3 only exists on several
generations of Atom processor, and is always coupled with 
X86_FEATURE_CONSTANT_TSC and X86_FEATURE_NONSTOP_TSC, skip checking
it"
to the commit log.

Thanks,
Feng



> Thanks,
> 
>         tglx

Patch
diff mbox series

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index f70dffc..a7e3980 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1193,6 +1193,17 @@  static void __init check_system_tsc_reliable(void)
 #endif
 	if (boot_cpu_has(X86_FEATURE_TSC_RELIABLE))
 		tsc_clocksource_reliable = 1;
+
+	/*
+	 * Ideally the socket number should be checked, but this is called
+	 * by tsc_init() which is in early boot phase and the socket numbers
+	 * may not be available. Use 'nr_online_nodes' as a fallback solution
+	 */
+	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC)
+		&& boot_cpu_has(X86_FEATURE_NONSTOP_TSC)
+		&& boot_cpu_has(X86_FEATURE_TSC_ADJUST)
+		&& nr_online_nodes <= 2)
+		tsc_clocksource_reliable = 1;
 }
 
 /*