* [PATCH] x86/tsc: Add tsc_tuned_baseclk flag disabling CPUID.16h use for tsc calibration
@ 2020-01-17 15:13 Krzysztof Piecuch
  2020-01-17 16:37 ` Andy Lutomirski
  0 siblings, 1 reply; 5+ messages in thread
From: Krzysztof Piecuch @ 2020-01-17 15:13 UTC (permalink / raw)
  To: corbet, tglx, mingo, bp, hpa, x86, mchehab+samsung, jpoimboe,
	gregkh, pawan.kumar.gupta, paulmck, jgross, rafael.j.wysocki,
	viresh.kumar, drake, malat, mzhivich, piecuch, juri.lelli,
	linux-doc, linux-kernel

Changing base clock frequency directly impacts tsc hz but not CPUID.16h
values. An overclocked CPU supporting CPUID.16h and partial CPUID.15h
support will set tsc hz according to "best guess" given by CPUID.16h
relying on tsc_refine_calibration_work to give better numbers later.
tsc_refine_calibration_work will refuse to do its work when the outcome is
off the early tsc hz value by more than 1% which is certain to happen on an
overclocked system.

Fix this by adding tsc_tuned_baseclk command line parameter that makes
the kernel ignore CPUID.16h data during TSC calibration.

Signed-off-by: Krzysztof Piecuch <piecuch@protonmail.com>
---
 Documentation/admin-guide/kernel-parameters.txt | 11 +++++++++++
 arch/x86/kernel/tsc.c                           | 16 ++++++++++++++--
 2 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index ade4e6ec23e0..b251169692a8 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4905,6 +4905,17 @@
 			interruptions from clocksource watchdog are not
 			acceptable).

+	tsc_tuned_baseclk=
+			[X86,INTEL] Ignore data provided by CPUID.16h during
+			early tsc calibration. Useful when changing base clock
+			frequency (overclocking).
+			Warning: in case your system does not provide
+			alternatives to determine cpu speed (HPET, PIT, complete
+			CPUID.15h support, MSR) the kernel will fail to
+			calibrate the clocksource and local APIC.
+			Format: <bool> (1/Y/y=enabled, 0/N/n=disabled)
+			default: disabled
+
 	tsx=		[X86] Control Transactional Synchronization
 			Extensions (TSX) feature in Intel processors that
 			support TSX control.
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 7e322e2daaf5..c9b638dd8f4d 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -59,6 +59,17 @@ struct cyc2ns {

 static DEFINE_PER_CPU_ALIGNED(struct cyc2ns, cyc2ns);

+static bool __read_mostly tsc_tuned_baseclk;
+static int __init tsc_tuned_baseclk_setup(char *buf)
+{
+	int ret = strtobool(buf, &tsc_tuned_baseclk);
+
+	if (tsc_tuned_baseclk)
+		pr_warn("tsc_tuned_baseclk: This will allow your CPU to use TSC with an overclocked base clock but your system will require some means of TSC calibration other than CPUID 16h.\n");
+	return ret;
+}
+early_param("tsc_tuned_baseclk", tsc_tuned_baseclk_setup);
+
 __always_inline void cyc2ns_read_begin(struct cyc2ns_data *data)
 {
 	int seq, idx;
@@ -654,7 +665,8 @@ unsigned long native_calibrate_tsc(void)
 	 * clock, but we can easily calculate it to a high degree of accuracy
 	 * by considering the crystal ratio and the CPU speed.
 	 */
-	if (crystal_khz == 0 && boot_cpu_data.cpuid_level >= 0x16) {
+	if (crystal_khz == 0 && !tsc_tuned_baseclk &&
+		boot_cpu_data.cpuid_level >= 0x16) {
 		unsigned int eax_base_mhz, ebx, ecx, edx;

 		cpuid(0x16, &eax_base_mhz, &ebx, &ecx, &edx);
@@ -692,7 +704,7 @@ static unsigned long cpu_khz_from_cpuid(void)
 	if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
 		return 0;

-	if (boot_cpu_data.cpuid_level < 0x16)
+	if (boot_cpu_data.cpuid_level < 0x16 || tsc_tuned_baseclk)
 		return 0;

 	eax_base_mhz = ebx_max_mhz = ecx_bus_mhz = edx = 0;
--
2.20.1
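
For readers following the calibration path touched by this patch, here is a minimal user-space sketch of how the early estimate is derived when CPUID.15h reports the TSC/crystal ratio but not the crystal frequency. The function name early_tsc_khz() and the ignore_cpuid16 argument (standing in for tsc_tuned_baseclk) are hypothetical; this is a model under those assumptions, not the kernel code itself.

```c
#include <stdio.h>

/*
 * Hypothetical user-space model of the early TSC estimate.
 *
 *   denominator, numerator: CPUID.15h EAX and EBX (TSC/crystal ratio)
 *   crystal_hz:             CPUID.15h ECX (0 when not enumerated)
 *   base_mhz:               CPUID.16h EAX (nominal base frequency)
 *   ignore_cpuid16:         stands in for the tsc_tuned_baseclk flag
 *
 * Returns the TSC frequency in kHz, or 0 if it cannot be derived.
 */
static unsigned long early_tsc_khz(unsigned int denominator,
				   unsigned int numerator,
				   unsigned int crystal_hz,
				   unsigned int base_mhz,
				   int ignore_cpuid16)
{
	unsigned long crystal_khz = crystal_hz / 1000;

	/* Without a valid ratio nothing can be computed. */
	if (denominator == 0 || numerator == 0)
		return 0;

	/*
	 * Partial CPUID.15h support: the ratio is present but the crystal
	 * frequency is not, so back-compute it from the nominal base
	 * frequency. This value does not follow an overclocked base clock.
	 */
	if (crystal_khz == 0 && !ignore_cpuid16 && base_mhz != 0)
		crystal_khz = (unsigned long)base_mhz * 1000 *
			      denominator / numerator;

	if (crystal_khz == 0)
		return 0;

	return crystal_khz * numerator / denominator;
}
```

With illustrative inputs (ratio 300/2, no crystal frequency, 3600 MHz nominal base), early_tsc_khz(2, 300, 0, 3600, 0) yields 3600000 kHz regardless of the real base clock, while early_tsc_khz(2, 300, 0, 3600, 1) yields 0, which means another calibration method has to take over.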



* Re: [PATCH] x86/tsc: Add tsc_tuned_baseclk flag disabling CPUID.16h use for tsc calibration
  2020-01-17 15:13 [PATCH] x86/tsc: Add tsc_tuned_baseclk flag disabling CPUID.16h use for tsc calibration Krzysztof Piecuch
@ 2020-01-17 16:37 ` Andy Lutomirski
  2020-01-20 11:15   ` Krzysztof Piecuch
  0 siblings, 1 reply; 5+ messages in thread
From: Andy Lutomirski @ 2020-01-17 16:37 UTC (permalink / raw)
  To: Krzysztof Piecuch
  Cc: corbet, tglx, mingo, bp, hpa, x86, mchehab+samsung, jpoimboe,
	gregkh, pawan.kumar.gupta, paulmck, jgross, rafael.j.wysocki,
	viresh.kumar, drake, malat, mzhivich, juri.lelli, linux-doc,
	linux-kernel



> On Jan 17, 2020, at 7:21 AM, Krzysztof Piecuch <piecuch@protonmail.com> wrote:
> 
> Changing base clock frequency directly impacts tsc hz but not CPUID.16h
> values. An overclocked CPU supporting CPUID.16h and partial CPUID.15h
> support will set tsc hz according to "best guess" given by CPUID.16h
> relying on tsc_refine_calibration_work to give better numbers later.
> tsc_refine_calibration_work will refuse to do its work when the outcome is
> off the early tsc hz value by more than 1% which is certain to happen on an
> overclocked system.
> 

Wouldn’t it be better to have an option tsc_max_refinement= to increase the 1%?




* Re: [PATCH] x86/tsc: Add tsc_tuned_baseclk flag disabling CPUID.16h use for tsc calibration
  2020-01-17 16:37 ` Andy Lutomirski
@ 2020-01-20 11:15   ` Krzysztof Piecuch
  2020-01-20 13:42     ` Thomas Gleixner
  0 siblings, 1 reply; 5+ messages in thread
From: Krzysztof Piecuch @ 2020-01-20 11:15 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: corbet, tglx, mingo, bp, hpa, x86, mchehab+samsung, jpoimboe,
	gregkh, pawan.kumar.gupta, paulmck, jgross, rafael.j.wysocki,
	viresh.kumar, drake, malat, mzhivich, juri.lelli, linux-doc,
	linux-kernel

On Friday, January 17, 2020 4:37 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> Wouldn’t it be better to have an option tsc_max_refinement= to increase the 1%?

All that the comments say about it is:

 * If there are any calibration anomalies (too many SMIs, etc),
 * or the refined calibration is off by 1% of the fast early
 * calibration, we throw out the new calibration and use the
 * early calibration.

I still don't fully understand why the "1% rule" exists.

Ideally it would be better to get the early calibration right than risk getting
it wrong because of an "anomaly".
OTOH if your system doesn't support any of the early calibration methods other
than CPUID.16h (mine supports neither PIT nor MSR), "tsc_max_refinement"
would allow you to control the max tsc_hz error.

If you think that would be better please let me know.
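
For reference, the "1% rule" from the comment quoted above can be modelled in user space roughly as follows. The function name refinement_accepted() is hypothetical and both values are assumed to be in kHz; this is a sketch of the check, not the kernel implementation.

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical user-space model of the sanity check: the refined value
 * is accepted only if it differs from the early value by at most 1% of
 * the early value. Both arguments are in kHz.
 */
static bool refinement_accepted(long early_khz, long refined_khz)
{
	return labs(early_khz - refined_khz) <= early_khz / 100;
}
```

With an early value of 3600000 kHz taken from CPUID.16h, a base clock raised from 100 MHz to 103 MHz gives a real TSC of about 3708000 kHz. The difference (108000 kHz) exceeds the 36000 kHz window, so the refined value is thrown away and the wrong early value stays in use.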


* Re: [PATCH] x86/tsc: Add tsc_tuned_baseclk flag disabling CPUID.16h use for tsc calibration
  2020-01-20 11:15   ` Krzysztof Piecuch
@ 2020-01-20 13:42     ` Thomas Gleixner
  2020-01-20 14:20       ` Krzysztof Piecuch
  0 siblings, 1 reply; 5+ messages in thread
From: Thomas Gleixner @ 2020-01-20 13:42 UTC (permalink / raw)
  To: Krzysztof Piecuch, Andy Lutomirski
  Cc: corbet, mingo, bp, hpa, x86, mchehab+samsung, jpoimboe, gregkh,
	pawan.kumar.gupta, paulmck, jgross, rafael.j.wysocki,
	viresh.kumar, drake, malat, mzhivich, juri.lelli, linux-doc,
	linux-kernel

Krzysztof,

Krzysztof Piecuch <piecuch@protonmail.com> writes:
> On Friday, January 17, 2020 4:37 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>> Wouldn’t it be better to have an option tsc_max_refinement= to increase the 1%?
>
> All that the comments say about it is:
>
>  * If there are any calibration anomalies (too many SMIs, etc),
>  * or the refined calibration is off by 1% of the fast early
>  * calibration, we throw out the new calibration and use the
>  * early calibration.
>
> I still don't fully understand why the "1% rule" exists.

Simply because all of this is horribly fragile and if you put virt into
the picture it gets even worse.

The initial calibration via PIT/HPET is halfway accurate in most cases
and we use the 1% as a sanity check.

> Ideally it would be better to get the early calibration right than
> risk getting it wrong because of an "anomaly".

Ideally we would just have a way to read the stupid frequency from some
reliable place, but there is no such thing.

Guess why we have all this code, surely not because we have nothing
better to do than dreaming up a variety of weird ways to figure out that
frequency.

> OTOH if your system doesn't support any of the early calibration
> methods other than CPUID.16h (mine supports neither PIT nor MSR),
> "tsc_max_refinement" would allow you to control the max tsc_hz error.

Widening the error window here is clearly a hack. As you have to supply
a valid number there anyway, why not just provide the frequency itself
on the command line? That would at least make the most sense and would
avoid using completely wrong data in the early boot stage.

Thanks,

        tglx


* Re: [PATCH] x86/tsc: Add tsc_tuned_baseclk flag disabling CPUID.16h use for tsc calibration
  2020-01-20 13:42     ` Thomas Gleixner
@ 2020-01-20 14:20       ` Krzysztof Piecuch
  0 siblings, 0 replies; 5+ messages in thread
From: Krzysztof Piecuch @ 2020-01-20 14:20 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Andy Lutomirski, corbet@lwn.net, mingo@redhat.com,
	bp@alien8.de, hpa@zytor.com, x86@kernel.org,
	mchehab+samsung@kernel.org, jpoimboe@redhat.com,
	gregkh@linuxfoundation.org, pawan.kumar.gupta@linux.intel.com,
	paulmck@linux.ibm.com, jgross@suse.com,
	rafael.j.wysocki@intel.com, viresh.kumar@linaro.org,
	drake@endlessm.com, malat@debian.org, mzhivich@akamai.com,
	juri.lelli@redhat.com, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org

On Monday, January 20, 2020 1:42 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Simply because all of this is horribly fragile and if you put virt into
> the picture it gets even worse.
>
> The initial calibration via PIT/HPET is halfway accurate in most cases
> and we use the 1% as a sanity check.
>
> > Ideally it would be better to get the early calibration right than
> > risk getting it wrong because of an "anomaly".
>
> Ideally we would just have a way to read the stupid frequency from some
> reliable place, but there is no such thing.
>
> Guess why we have all this code, surely not because we have nothing
> better to do than dreaming up a variety of weird ways to figure out that
> frequency.

Thank you for the explanation.

> Widening the error window here is clearly a hack. As you have to supply
> a valid number there anyway, why not just provide the frequency itself
> on the command line? That would at least make the most sense and would
> avoid using completely wrong data in the early boot stage.

That sounds good.
I'll assume the user is expected to provide a tsc_early_hz= parameter
so that tsc_refine_calibration_work can get better numbers while still
doing the 1% sanity check.

I'll send a patch this week.
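
A rough user-space sketch of the idea, assuming the tsc_early_hz= value is a plain decimal frequency in Hz. The helper names parse_tsc_early_hz() and pick_early_khz() are hypothetical and only illustrate the intended behaviour; the actual patch may look different.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical parser for the value of a tsc_early_hz= parameter.
 * Accepts a plain decimal number of Hz and returns the frequency in
 * kHz, or 0 if the string is missing or malformed.
 */
static unsigned long long parse_tsc_early_hz(const char *arg)
{
	char *end;
	unsigned long long hz;

	/* Reject empty input, signs and leading whitespace up front. */
	if (arg == NULL || !isdigit((unsigned char)*arg))
		return 0;

	errno = 0;
	hz = strtoull(arg, &end, 10);
	if (errno != 0 || *end != '\0')
		return 0;

	return hz / 1000;
}

/*
 * Prefer the user-supplied early frequency; fall back to the value
 * derived from CPUID when none was given. Both values are in kHz.
 */
static unsigned long long pick_early_khz(unsigned long long user_khz,
					 unsigned long long cpuid_khz)
{
	return user_khz ? user_khz : cpuid_khz;
}
```

Booting the 103 MHz example with tsc_early_hz=3708000000 would make the early value 3708000 kHz, so a later refined measurement close to the real frequency falls inside the unchanged 1% window.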

