linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 1/3] x86/tsc: use CPUID.0x16 to calculate missing crystal frequency
@ 2019-05-09  5:54 Daniel Drake
  2019-05-09  5:54 ` [PATCH v2 2/3] x86/apic: rename lapic_timer_frequency to lapic_timer_period Daniel Drake
                   ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Daniel Drake @ 2019-05-09  5:54 UTC (permalink / raw)
  To: tglx, mingo, bp
  Cc: hpa, x86, linux-kernel, len.brown, rafael.j.wysocki, linux, peterz

native_calibrate_tsc() had a table mapping Intel CPU families to
crystal clock speeds, but hardcoded tables are not ideal, and this
approach was already problematic at least in the Skylake X case, as
seen in commit b51120309348 ("x86/tsc: Fix erroneous TSC rate on Skylake
Xeon").

By examining CPUID data from http://instlatx64.atw.hu/ and units
in the lab, we have found that 3 different scenarios need to be dealt
with, and we can eliminate most of the hardcoded data using an approach a
little more advanced than before:

 1. ApolloLake, GeminiLake, CannonLake (and presumably all new chipsets
    from this point) report the crystal frequency directly via CPUID.0x15.
    That's definitive data that we can rely upon.

 2. Skylake, Kabylake and all variants of those two chipsets report a
    crystal frequency of zero; however, we can calculate the crystal clock
    speed by considering data from CPUID.0x16.

    This method correctly distinguishes between the two crystal clock
    frequencies present on different Skylake X variants that caused
    headaches before.

    As the calculations do not quite match the previously-hardcoded values
    in some cases (e.g. 23913043Hz instead of 24MHz), TSC refinement is
    enabled on all platforms where we had to calculate the crystal
    frequency in this way.

 3. Denverton (GOLDMONT_X) reports a crystal frequency of zero and does
    not support CPUID.0x16, so we leave this entry hardcoded.

Link: https://lkml.kernel.org/r/20190419083533.32388-1-drake@endlessm.com
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Daniel Drake <drake@endlessm.com>
---

Notes:
    v2:
     - Clarify the situation around Skylake X better.
     - Enable TSC refinement when we had to calculate the crystal clock,
       in case slight differences in the calculation result cause problems
       similar to those reported earlier on Skylake X.
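
For illustration, the calculation described in scenario 2 of the commit
message can be sketched as a standalone helper. This is a minimal sketch,
not the kernel code: crystal_khz_from_base() is a hypothetical name, and the
sample inputs in the comments (a 2200 MHz base frequency with a 184/2 ratio,
and an 1800 MHz base frequency with a 150/2 ratio) are assumed values, the
first chosen because it reproduces the 23913043Hz figure quoted above.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the kernel): derive the crystal clock
 * in kHz from the CPUID.0x16 base frequency (MHz) and the CPUID.0x15
 * TSC/crystal ratio (numerator / denominator).
 *
 * Returns 0 when the ratio is unusable (numerator == 0).
 */
static uint32_t crystal_khz_from_base(uint32_t base_mhz,
				      uint32_t denominator,
				      uint32_t numerator)
{
	if (numerator == 0)
		return 0;

	/* Widen to 64 bits so the intermediate product cannot overflow. */
	return (uint32_t)((uint64_t)base_mhz * 1000 * denominator / numerator);
}

/*
 * Assumed sample inputs:
 *   crystal_khz_from_base(2200, 2, 184) gives 23913 (about 23.913 MHz)
 *   crystal_khz_from_base(1800, 2, 150) gives 24000 (24.0 MHz)
 */
```

The first sample shows why the result may differ slightly from a nominal
24MHz crystal: the base frequency is only reported in whole MHz, so the
division does not always land on the exact crystal value.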

 arch/x86/kernel/tsc.c | 47 +++++++++++++++++++++++++------------------
 1 file changed, 27 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 15b5e98a86f9..6e6d933fb99c 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -631,31 +631,38 @@ unsigned long native_calibrate_tsc(void)
 
 	crystal_khz = ecx_hz / 1000;
 
-	if (crystal_khz == 0) {
-		switch (boot_cpu_data.x86_model) {
-		case INTEL_FAM6_SKYLAKE_MOBILE:
-		case INTEL_FAM6_SKYLAKE_DESKTOP:
-		case INTEL_FAM6_KABYLAKE_MOBILE:
-		case INTEL_FAM6_KABYLAKE_DESKTOP:
-			crystal_khz = 24000;	/* 24.0 MHz */
-			break;
-		case INTEL_FAM6_ATOM_GOLDMONT_X:
-			crystal_khz = 25000;	/* 25.0 MHz */
-			break;
-		case INTEL_FAM6_ATOM_GOLDMONT:
-			crystal_khz = 19200;	/* 19.2 MHz */
-			break;
-		}
-	}
+	/*
+	 * Denverton SoCs don't report crystal clock, and also don't support
+	 * CPUID.0x16 for the calculation below, so hardcode the 25MHz crystal
+	 * clock.
+	 */
+	if (crystal_khz == 0 &&
+			boot_cpu_data.x86_model == INTEL_FAM6_ATOM_GOLDMONT_X)
+		crystal_khz = 25000;
 
-	if (crystal_khz == 0)
-		return 0;
 	/*
-	 * TSC frequency determined by CPUID is a "hardware reported"
+	 * TSC frequency reported directly by CPUID is a "hardware reported"
 	 * frequency and is the most accurate one so far we have. This
 	 * is considered a known frequency.
 	 */
-	setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
+	if (crystal_khz != 0)
+		setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
+
+	/*
+	 * Some Intel SoCs like Skylake and Kabylake don't report the crystal
+	 * clock, but we can easily calculate it to a high degree of accuracy
+	 * by considering the crystal ratio and the CPU speed.
+	 */
+	if (crystal_khz == 0 && boot_cpu_data.cpuid_level >= 0x16) {
+		unsigned int eax_base_mhz, ebx, ecx, edx;
+
+		cpuid(0x16, &eax_base_mhz, &ebx, &ecx, &edx);
+		crystal_khz = eax_base_mhz * 1000 *
+			eax_denominator / ebx_numerator;
+	}
+
+	if (crystal_khz == 0)
+		return 0;
 
 	/*
 	 * For Atom SoCs TSC is the only reliable clocksource.
-- 
2.20.1


* Detecting x86 LAPIC timer frequency from CPUID data
@ 2019-04-17  5:28 Daniel Drake
  2019-04-18 13:12 ` Thomas Gleixner
  0 siblings, 1 reply; 26+ messages in thread
From: Daniel Drake @ 2019-04-17  5:28 UTC (permalink / raw)
  To: tglx, lenb; +Cc: x86, linux-kernel, linux, rjw

The CPUID.0x16 leaf provides "Bus (Reference) Frequency (in MHz)".

In the thread "No 8254 PIT & no HPET on new Intel N3350 platforms
causes kernel panic during early boot" we are exploring ways to have
the kernel avoid using the PIT/HPET IRQ0 timer in more cases, and
Thomas Gleixner suggested that we could use this CPUID data to set
lapic_timer_frequency, avoiding the need for calibrate_APIC_clock()
to measure the APIC clock against the IRQ0 timer.

I'm thinking of the following code change; however, I get
unexpected results on Intel i7-8565U (Whiskey Lake). When
booting without this change, and with apic=notscdeadline (so that
APIC clock gets calibrated and used), the bus speed is detected as
roughly 24MHz (printed by the kernel as 23.0999 MHz):

 ... lapic delta = 149994
 ... PM-Timer delta = 357939
 ... PM-Timer result ok
 ..... delta 149994
 ..... mult: 6442193
 ..... calibration result: 23999
 ..... CPU clock speed is 1991.0916 MHz.
 ..... host bus clock speed is 23.0999 MHz.

However the CPUID.0x16 ECX reports a 100MHz bus speed on this device,
so this code change would produce a significantly different calibration.
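
To put the two numbers side by side, here is a minimal sketch of the formula
from the proposed change, evaluated outside the kernel.
lapic_ticks_per_jiffy() is a hypothetical helper, and HZ=1000 is an
assumption (the kernel configuration is not stated in this message).

```c
#include <stdint.h>

/*
 * Hypothetical helper: bus clock cycles per jiffy for a bus clock given
 * in MHz, using the same formula as the proposed change:
 *   (bus_mhz * 1000000) / HZ
 */
static uint32_t lapic_ticks_per_jiffy(uint32_t bus_mhz, uint32_t hz)
{
	if (hz == 0)
		return 0;

	/* 64-bit intermediate avoids overflow for large bus_mhz values. */
	return (uint32_t)(((uint64_t)bus_mhz * 1000000) / hz);
}

/*
 * Assuming HZ=1000:
 *   lapic_ticks_per_jiffy(100, 1000) gives 100000 (CPUID-reported 100MHz)
 *   lapic_ticks_per_jiffy(24, 1000)  gives 24000  (a 24MHz clock)
 */
```

Under that assumption, the CPUID-reported 100MHz yields 100000 cycles per
jiffy, roughly four times the measured calibration result of 23999 shown in
the log above.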

Am I doing anything obviously wrong?

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 3fae23834069..6c51ce842f86 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -679,6 +679,16 @@ static unsigned long cpu_khz_from_cpuid(void)
 
 	cpuid(0x16, &eax_base_mhz, &ebx_max_mhz, &ecx_bus_mhz, &edx);
 
+#ifdef CONFIG_X86_LOCAL_APIC
+	/*
+	 * If bus frequency is provided in CPUID data, set
+	 * global lapic_timer_frequency to bus_clock_cycles/jiffy.
+	 * This avoids having to calibrate the APIC timer later.
+	 */
+	if (ecx_bus_mhz)
+		lapic_timer_frequency = (ecx_bus_mhz * 1000000) / HZ;
+#endif
+
 	return eax_base_mhz * 1000;
 }
 
-- 
2.19.1


* No 8254 PIT & no HPET on new Intel N3350 platforms causes kernel panic during early boot
@ 2019-04-03  7:49 Daniel Drake
  2019-04-03 11:21 ` Thomas Gleixner
  0 siblings, 1 reply; 26+ messages in thread
From: Daniel Drake @ 2019-04-03  7:49 UTC (permalink / raw)
  To: Linux Kernel, Thomas Gleixner, Ingo Molnar, bp
  Cc: Hans de Goede, david.e.box, Endless Linux Upstreaming Team

Hi,

I already wrote about this problem in the thread "APIC timer checked
before it is set up, boot fails on Connex L1430"
https://lkml.org/lkml/2018/12/28/10
However my initial diagnosis was misguided, and I have some new
findings to share now, so I'm starting over in this new thread.

Also CCing Hans, who also often attracts this class of problem on low
cost hardware!

The problem is that on affected platforms, all Linux distros (and all
known kernel versions) fail to boot, hanging on a black screen. EFI
earlyprintk can be used to see the panic:

APIC: switch to symmetric I/O mode setup
x2apic: IRQ remapping doesn't support X2APIC mode
..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
..MP-BIOS bug: 8254 timer not connected to IO-APIC
...trying to set up timer (IRQ0) through the 8259A ...
..... (found apic 0 pin 2) ...
....... failed.
...trying to set up timer as Virtual Wire IRQ...
..... failed.
...trying to set up timer as ExtINT IRQ...
do_IRQ: 0.55 No irq handler for vector
..... failed :(.
Kernel panic - not syncing: IO-APIC + timer doesn't work! Boot with
apic=debug and send a report.

After encountering this on Connex L1430 last time, we have now
encountered another affected product, from a different vendor (SCOPE
SN116PYA). They both have Intel Apollo Lake N3350 and AMI BIOS.

The code in question is making sure that the IRQ0 timer works, by
waiting for an interrupt. In this case there is no interrupt.

The x86 platform code in hpet_time_init() tries to enable the HPET
timer for this, however that is not available on these affected
platforms (no HPET ACPI table). So it then falls back on the 8253/8254
legacy PIT. The i8253.c driver is invoked to program the PIT
accordingly, however in this case it does not result in any IRQ0
interrupts being generated --> panic.

I found a relevant setting in the BIOS: Chipset -> South Cluster
Configuration -> Miscellaneous Configuration -> 8254 Clock Gating
This option is set to Enabled by default. Setting it to Disabled makes
the PIT tick and Linux boot finally works.

It's nice to have a workaround but I would hope we could do better -
especially because it seems like this problem is spreading. In
addition to the two products we found here, searching around finds
several other product manuals and discussions that tell you to go into
the BIOS and change this option if you want Linux to boot, some
examples:
https://blog.csdn.net/qhtsm/article/details/88600316
https://www.manualslib.com/manual/1316475/Ecs-Ed20pa2.html?page=23
https://tools.exone.de/live/shop/img/produkte/fs_112124_2.pdf page 11

As another data point, Windows 10 boots fine in this no-PIT no-HPET
configuration.

Going deeper, I found the clock_gate_8254 option in the coreboot
source code. This pointed me to the ITSSPRC register, which is
documented on page 1694 of
https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/300-series-chipset-pch-datasheet-vol-2.pdf

"8254 Static Clock Gating Enable (CGE8254): When set, the 8254 timer
is disabled statically. This bit shall be set by BIOS if the 8254
feature is not needed in the system or before BIOS hands off the
system that supports C11. Normal operation of 8254 requires this bit
to 0."

(what's C11?)

I verified that the BIOS setting controls this specific bit value, and
I also created and verified a workaround that unsets this bit - now
Linux boots fine regardless of the BIOS setting:

#define INTEL_APL_PSR_BASE        0xd0000000
#define INTEL_APL_PID_ITSS        0xd0
#define INTEL_PCR_PORTID_SHIFT    16
#define INTEL_APL_PCR_ITSSPRC    0x3300
static void quirk_intel_apl_8254(void)
{
    u32 addr = INTEL_APL_PSR_BASE | \
        (INTEL_APL_PID_ITSS << INTEL_PCR_PORTID_SHIFT) | \
        INTEL_APL_PCR_ITSSPRC;
    u32 value;
    void __iomem *itssprc = ioremap_nocache(addr, 4);

    if (!itssprc)
        return;

    value = readl(itssprc);
    if (value & 4) {
        value &= ~4;
        writel(value, itssprc);
    }
    iounmap(itssprc);
}
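
As a cross-check of the address arithmetic in the workaround above, the
sketch below recomputes the register address and the bit manipulation in
plain C, without touching hardware. The helper names pcr_address() and
clear_cge8254() are hypothetical; the constants are the ones used in the
quirk, and treating mask 4 (bit 2) as CGE8254 follows the quirk code rather
than an independent reading of the datasheet.

```c
#include <stdint.h>

/* Same layout as the quirk: base | (port ID << 16) | register offset. */
static uint32_t pcr_address(uint32_t base, uint32_t port_id, uint32_t offset)
{
	return base | (port_id << 16) | offset;
}

/* Clear bit 2 (mask 4), the bit the quirk clears in ITSSPRC. */
static uint32_t clear_cge8254(uint32_t value)
{
	return value & ~(UINT32_C(1) << 2);
}

/*
 * With the constants from the quirk:
 *   pcr_address(0xd0000000, 0xd0, 0x3300) gives 0xd0d03300
 *   clear_cge8254(0x7) gives 0x3; clear_cge8254(0x3) stays 0x3
 */
```

So the 4-byte window mapped by the quirk starts at 0xd0d03300, and the write
only happens when the gating bit was actually set.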

I was hoping I could send a workaround patch here, but I'm not sure of
an appropriate way to detect that we are on an Intel Apollo Lake
platform. This timer setup happens during early boot, and the early quirks
in pci/quirks.c run too late for this. Suggestions appreciated.

Poking at other angles, I tried taking the HPET ACPI table from
another (working) Intel N3350 system and putting it in the initrd as
an override. This makes the HPET work fine, at which point Linux boots
OK without having to touch the (BIOS-crippled) PIT.

I also spotted that GRUB was previously affected by this BIOS-level
behaviour change.
http://git.savannah.gnu.org/cgit/grub.git/commit/?id=446794de8da4329ea532cbee4ca877bcafd0e534
Apparently GRUB used to rely on the 8254 PIT too, but it now uses the
pmtimer for TSC calibration instead. I guess the originally-affected
platforms only ran into GRUB freezing here (as opposed to both GRUB
and Linux freezing) because those platforms had a working HPET,
meaning that Linux was unaware/unaffected by the newly-gated PIT.

I'm at the limit of my current knowledge here, but there's an open
question of whether Linux could be made to work without a working PIT
and no HPET, in the same way that GRUB and Windows seem to manage.
Even though it is currently essential for boot, the PIT (or HPET) is
usually only needed to tick a few times before being replaced with the
APIC timer as a clocksource (when setup_APIC_timer() happens, the
clocksource layer disables the previous timer source). However, Thomas
Gleixner gave some hints at the importance of the PIT/HPET here:

> Well, [avoiding the PIT/HPET ticking requirement] would be trivial if we
> could rely on the APIC timer being functional on all CPUs and if we could
> figure out the APIC timer frequency without calibrating it against the
> PIT/HPET on older CPUs. Plus a gazillion of other issues (e.g. APIC stops
> in C states ....)
> [...]
> Under certain conditions we actually might avoid touching PIT/HPET and
> solely rely on the CPUID/MSR calibration values. Needs quite some thought
> though.

I'm not sure what the best way forward is on this issue, but hopefully
this investigation is useful somehow, and I'd be happy to act on any
suggestions.

Thanks
Daniel


end of thread, other threads:[~2019-06-28  5:07 UTC | newest]

Thread overview: 26+ messages
2019-05-09  5:54 [PATCH v2 1/3] x86/tsc: use CPUID.0x16 to calculate missing crystal frequency Daniel Drake
2019-05-09  5:54 ` [PATCH v2 2/3] x86/apic: rename lapic_timer_frequency to lapic_timer_period Daniel Drake
2019-05-09 10:34   ` [tip:x86/apic] x86/apic: Rename 'lapic_timer_frequency' to 'lapic_timer_period' tip-bot for Daniel Drake
2019-05-09  5:54 ` [PATCH v2 3/3] x86/tsc: set LAPIC timer period to crystal clock frequency Daniel Drake
2019-05-09  7:25 ` [PATCH v2 1/3] x86/tsc: use CPUID.0x16 to calculate missing crystal frequency Thomas Gleixner
2019-05-09  9:07   ` Ingo Molnar
  -- strict thread matches above, loose matches on Subject: below --
2019-04-17  5:28 Detecting x86 LAPIC timer frequency from CPUID data Daniel Drake
2019-04-18 13:12 ` Thomas Gleixner
2019-04-18 22:30   ` Thomas Gleixner
2019-04-19  8:35     ` Daniel Drake
2019-04-19  8:57       ` Thomas Gleixner
2019-04-19 20:50         ` Jacob Pan
2019-04-19 20:52           ` Thomas Gleixner
2019-04-19 23:09             ` Jacob Pan
2019-05-09 10:34       ` [tip:x86/apic] x86/tsc: Use CPUID.0x16 to calculate missing crystal frequency tip-bot for Daniel Drake
2019-04-03  7:49 No 8254 PIT & no HPET on new Intel N3350 platforms causes kernel panic during early boot Daniel Drake
2019-04-03 11:21 ` Thomas Gleixner
2019-04-03 12:01   ` Thomas Gleixner
2019-04-09  5:43   ` Daniel Drake
2019-04-10 12:54     ` Thomas Gleixner
2019-04-16  5:21       ` Daniel Drake
2019-05-09 10:35   ` [tip:x86/apic] x86/tsc: Set LAPIC timer period to crystal clock frequency tip-bot for Daniel Drake
2019-06-27  8:54   ` No 8254 PIT & no HPET on new Intel N3350 platforms causes kernel panic during early boot Daniel Drake
2019-06-27 14:06     ` Thomas Gleixner
2019-06-28  3:33       ` Daniel Drake
2019-06-28  5:07         ` Thomas Gleixner
