linux-kernel.vger.kernel.org archive mirror
* [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data
@ 2024-02-22 18:39 Dave Hansen
  2024-02-22 18:39 ` [RFC][PATCH 01/34] x86/xen: Explain why calling get_cpu_cap() so early is a hack Dave Hansen
                   ` (33 more replies)
  0 siblings, 34 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:39 UTC (permalink / raw)
  To: linux-kernel
  Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen, rafael,
	lenb, mpatocka, snitzer, daniel, jgross

tl;dr: This series takes a more regimented approach to initializing
system-wide x86 processor configuration data.  It does that by moving
fields out of the per-cpu 'struct cpuinfo_x86' and into a new
structure.  It also requires that the boot CPU set this data up *once*
and that it then be left alone.

This is a more comprehensive approach to avoid the random tinkering
in patches like these:

	https://lore.kernel.org/all/20240131230902.1867092-1-pbonzini@redhat.com/

--

'struct cpuinfo_x86' is a mess.  At a high level, it can be thought
of as a CPU enumeration cache that (among other things) avoids using
CPUID constantly.  There's a copy of it for each CPU, but often the
'boot_cpu_data' version is the only one that counts.  Its fields are
written in basically random order, including a wholesale memcpy() of
'boot_cpu_data' into the secondary processor copies.

This series focuses on a subset of the 'cpuinfo_x86' fields that must
be consistent across all CPUs (sketched in code after this list):
 * The virtual and physical sizes supported by the CPU
 * The number of those physical bits that the caches comprehend
 * The size of a cacheline that CLFLUSH works on
 * The actual alignment that matters to the caches (can be different
   from that CLFLUSH size)
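
In 'struct cpuinfo_x86' terms, that is roughly this set of fields
(paraphrased here; see arch/x86/include/asm/processor.h for the
authoritative declarations and types):

	u8	x86_virt_bits;		/* virtual address bits */
	u8	x86_phys_bits;		/* physical address bits */
	u8	x86_cache_bits;		/* physical bits the caches comprehend */
	u16	x86_clflush_size;	/* CLFLUSH granule, in bytes */
	int	x86_cache_alignment;	/* alignment the caches care about */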

There are a few random folks cc'd here because their subsystem reads
one or more of these things in an x86 #ifdef of some kind.

There's a mishmash of ways to obtain these from CPUID, plus fixups
for errata and general processor wonkiness.  There are also defaults
for processors that, for instance, don't even _have_ CPUID support.
I think virt/phys_bits can be set and overwritten no fewer than five
times!

Let's bring some order to the chaos.

First, create some abstractions so that there are no longer direct
explicit accesses to some 'boot_cpu_data' fields.  This also provides
a way to do sanity checking so nothing consumes garbage.

Second, separate out the address configuration inputs from the
resulting values.  The inputs are provided in early boot by the boot
processor and stashed in a new 'struct x86_addr_config':
'bsp_addr_config'.

Third, remove global, shared configuration from 'struct cpuinfo_x86'.
Put it in a new structure: 'struct x86_sys_config'.

This creates a simple set of rules (sketched in code below):
 1. The boot CPU populates 'bsp_addr_config' and nothing ever writes
    to it again
 2. get_cpu_address_sizes() reads 'bsp_addr_config' and writes
    'x86_config'
 3. 'bsp_addr_config' is never read again
 4. 'x86_config' is never written again
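
In heavily simplified code, the intended flow looks something like
this sketch ('enumerated_phys_bits' and the 'phys_bits' field name
are illustrative stand-ins, not the series' actual code):

	/* Written once by the boot CPU, then left alone: */
	struct x86_addr_config bsp_addr_config;

	/* Derived once from 'bsp_addr_config', then read-only: */
	struct x86_sys_config x86_config;

	void get_cpu_address_sizes(struct cpuinfo_x86 *c)
	{
		/* 'enumerated_phys_bits' stands in for the CPUID result: */
		x86_config.phys_bits = enumerated_phys_bits -
				       bsp_addr_config.phys_addr_reduction_bits;
		/* Nothing writes 'x86_config' after this returns. */
	}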

The centralized helpers also now provide a chance to enforce those
rules.  The series finishes up by doing exactly that and spitting out
a warning when the rules are violated.  This warning mechanism works
in *very* early boot and is a little unusual.  It could use some more
eyeballs (Subject: Enforce read-only x86_config state).

=== FAQ ===

== Why are both 'bsp_addr_config' and 'x86_config' needed? ==

Having defined, separate lifetimes helps enforce the new rules.

Once everything is in place, 'bsp_addr_config' is only used at boot
and can be marked __init.  Modpost can help find users of it that
leak to after boot.  Secondary CPU startup code can't be __init
(because of CPU hotplug), which helps ensure that 'bsp_addr_config'
can only be referenced from boot CPU code.

'x86_config' can also be __ro_after_init, which helps mitigate the
chance of it becoming tweaked after boot and growing into a little
mini version of 'boot_cpu_data' with all the same problems.
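
A sketch of what those lifetime annotations could look like (the
placement is illustrative; __initdata is the data-section spelling
of __init):

	/* Usable only during boot; discarded with other __init data: */
	struct x86_addr_config bsp_addr_config __initdata;

	/* Writable during boot, then write-protected: */
	struct x86_sys_config x86_config __ro_after_init;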

--

Ideas for future work (probably in another series):

 * Consolidate defaults in x86_config accessors and
   get_cpu_address_sizes() down to one copy
 * Can we do something more aggressive with x86_clflush_size()?  Is
   it ever anything other than 64 bytes on 64-bit?

Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Juergen Gross <jgross@suse.com>


* [RFC][PATCH 01/34] x86/xen: Explain why calling get_cpu_cap() so early is a hack
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
@ 2024-02-22 18:39 ` Dave Hansen
  2024-03-07 16:27   ` Borislav Petkov
  2024-02-22 18:39 ` [RFC][PATCH 02/34] x86/xen: Remove early "debug" physical address lookups Dave Hansen
                   ` (32 subsequent siblings)
  33 siblings, 1 reply; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:39 UTC (permalink / raw)
  To: linux-kernel
  Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen, jgross


From: Dave Hansen <dave.hansen@linux.intel.com>

xen_start_kernel() is some of the first C code run on "Xen PV"
systems.  It precedes normally very early things like setup_arch()
and the processor initialization code.

That means that 'boot_cpu_data' is garbage.  It has not even
established utter basics like whether the CPU supports the CPUID
instruction.  Unfortunately, get_cpu_cap() requires exactly that
information.

Nevertheless, xen_start_kernel() calls get_cpu_cap().  It works out
in practice because the caller is only looking for the NX bit, which
comes from an extended CPUID leaf that does not depend on
c->cpuid_level being set.  This also implicitly assumes that Xen PV
guests support CPUID.
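
For reference, NX is enumerated in CPUID leaf 0x80000001, EDX bit 20,
so a probe along these lines (a minimal sketch, not the kernel's
actual code) works without knowing c->cpuid_level:

	unsigned int eax, ebx, ecx, edx;
	bool nx;

	cpuid(0x80000001, &eax, &ebx, &ecx, &edx);
	nx = edx & (1 << 20);	/* NX / Execute Disable */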

Leave the hack in place, but at least explain some of what is
going on.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Juergen Gross <jgross@suse.com>
---

 b/arch/x86/xen/enlighten_pv.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff -puN arch/x86/xen/enlighten_pv.c~xen-explain1 arch/x86/xen/enlighten_pv.c
--- a/arch/x86/xen/enlighten_pv.c~xen-explain1	2024-02-22 10:08:48.404451146 -0800
+++ b/arch/x86/xen/enlighten_pv.c	2024-02-22 10:08:48.404451146 -0800
@@ -1372,7 +1372,11 @@ asmlinkage __visible void __init xen_sta
 	/* Get mfn list */
 	xen_build_dynamic_phys_to_machine();
 
-	/* Work out if we support NX */
+	/*
+	 * This is a hack.  'boot_cpu_data' is not filled out enough
+	 * for get_cpu_cap() to function fully.  But it _can_ fill out
+	 * the leaves where NX is.  Gross, but it works.
+	 */
 	get_cpu_cap(&boot_cpu_data);
 	x86_configure_nx();
 
_


* [RFC][PATCH 02/34] x86/xen: Remove early "debug" physical address lookups
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
  2024-02-22 18:39 ` [RFC][PATCH 01/34] x86/xen: Explain why calling get_cpu_cap() so early is a hack Dave Hansen
@ 2024-02-22 18:39 ` Dave Hansen
  2024-03-07 17:01   ` Borislav Petkov
  2024-03-11 13:16   ` Jürgen Groß
  2024-02-22 18:39 ` [RFC][PATCH 03/34] x86/pci: Assume that clflush size is always provided Dave Hansen
                   ` (31 subsequent siblings)
  33 siblings, 2 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:39 UTC (permalink / raw)
  To: linux-kernel
  Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen, jgross


From: Dave Hansen <dave.hansen@linux.intel.com>

The __pa() facility is subject to debugging checks if CONFIG_DEBUG_VIRTUAL=y.
One of those debugging checks is whether the physical address is valid
on the platform.  That information is normally available via CPUID.  But
the __pa() code currently looks it up in 'boot_cpu_data' which is not
fully set up in early Xen PV boot.

The Xen PV code currently tries to get this info with
get_cpu_address_sizes(), which also depends on 'boot_cpu_data' being
at least somewhat set up.  The result is that c->x86_phys_bits gets a
sane value, but not one that has anything to do with the hardware.  In
other words, the CONFIG_DEBUG_VIRTUAL checks are performed with what
amounts to garbage inputs.

Garbage checks are worse than no check at all.  Move over to the
"nodebug" variant to axe the checks.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Juergen Gross <jgross@suse.com>
---

 b/arch/x86/xen/enlighten_pv.c |    2 +-
 b/arch/x86/xen/mmu_pv.c       |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff -puN arch/x86/xen/enlighten_pv.c~xen-no-early-__pa arch/x86/xen/enlighten_pv.c
--- a/arch/x86/xen/enlighten_pv.c~xen-no-early-__pa	2024-02-22 10:08:48.868469363 -0800
+++ b/arch/x86/xen/enlighten_pv.c	2024-02-22 10:08:48.872469519 -0800
@@ -1452,7 +1452,7 @@ asmlinkage __visible void __init xen_sta
 	boot_params.hdr.type_of_loader = (9 << 4) | 0;
 	boot_params.hdr.ramdisk_image = initrd_start;
 	boot_params.hdr.ramdisk_size = xen_start_info->mod_len;
-	boot_params.hdr.cmd_line_ptr = __pa(xen_start_info->cmd_line);
+	boot_params.hdr.cmd_line_ptr = __pa_nodebug(xen_start_info->cmd_line);
 	boot_params.hdr.hardware_subarch = X86_SUBARCH_XEN;
 
 	if (!xen_initial_domain()) {
diff -puN arch/x86/xen/mmu_pv.c~xen-no-early-__pa arch/x86/xen/mmu_pv.c
--- a/arch/x86/xen/mmu_pv.c~xen-no-early-__pa	2024-02-22 10:08:48.872469519 -0800
+++ b/arch/x86/xen/mmu_pv.c	2024-02-22 10:08:48.872469519 -0800
@@ -2006,7 +2006,7 @@ void __init xen_reserve_special_pages(vo
 {
 	phys_addr_t paddr;
 
-	memblock_reserve(__pa(xen_start_info), PAGE_SIZE);
+	memblock_reserve(__pa_nodebug(xen_start_info), PAGE_SIZE);
 	if (xen_start_info->store_mfn) {
 		paddr = PFN_PHYS(mfn_to_pfn(xen_start_info->store_mfn));
 		memblock_reserve(paddr, PAGE_SIZE);
_


* [RFC][PATCH 03/34] x86/pci: Assume that clflush size is always provided
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
  2024-02-22 18:39 ` [RFC][PATCH 01/34] x86/xen: Explain why calling get_cpu_cap() so early is a hack Dave Hansen
  2024-02-22 18:39 ` [RFC][PATCH 02/34] x86/xen: Remove early "debug" physical address lookups Dave Hansen
@ 2024-02-22 18:39 ` Dave Hansen
  2024-03-08  8:57   ` Borislav Petkov
  2024-02-22 18:39 ` [RFC][PATCH 04/34] x86/mm: Introduce physical address limit helper Dave Hansen
                   ` (30 subsequent siblings)
  33 siblings, 1 reply; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

The early boot code always sets _some_ clflush size.  Use that fact
to avoid handling the case where it is not set.

There may have been a time when the Xen PV code called in here way
too early.  But it now calls get_cpu_address_sizes() before calling
here, so it should be safe as well.

Note: This series will eventually return sane defaults even very
early in boot.  I believe this is safe now, but it becomes *really*
safe later on.
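
For the common 64-byte CLFLUSH size, the math in the remaining code
works out as:

	pci_dfl_cache_line_size = 64 >> 2;	/* = 16 32-bit dwords */
	/* ... and the printk reports 16 << 2 = 64 bytes */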

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/pci/common.c |   19 +++----------------
 1 file changed, 3 insertions(+), 16 deletions(-)

diff -puN arch/x86/pci/common.c~x86-pci-clflush-size arch/x86/pci/common.c
--- a/arch/x86/pci/common.c~x86-pci-clflush-size	2024-02-22 10:08:49.356488521 -0800
+++ b/arch/x86/pci/common.c	2024-02-22 10:08:49.356488521 -0800
@@ -480,22 +480,9 @@ void pcibios_scan_root(int busnum)
 
 void __init pcibios_set_cache_line_size(void)
 {
-	struct cpuinfo_x86 *c = &boot_cpu_data;
-
-	/*
-	 * Set PCI cacheline size to that of the CPU if the CPU has reported it.
-	 * (For older CPUs that don't support cpuid, we se it to 32 bytes
-	 * It's also good for 386/486s (which actually have 16)
-	 * as quite a few PCI devices do not support smaller values.
-	 */
-	if (c->x86_clflush_size > 0) {
-		pci_dfl_cache_line_size = c->x86_clflush_size >> 2;
-		printk(KERN_DEBUG "PCI: pci_cache_line_size set to %d bytes\n",
-			pci_dfl_cache_line_size << 2);
-	} else {
- 		pci_dfl_cache_line_size = 32 >> 2;
-		printk(KERN_DEBUG "PCI: Unknown cacheline size. Setting to 32 bytes\n");
-	}
+	pci_dfl_cache_line_size = boot_cpu_data.x86_clflush_size >> 2;
+	printk(KERN_DEBUG "PCI: pci_cache_line_size set to %d bytes\n",
+		pci_dfl_cache_line_size << 2);
 }
 
 int __init pcibios_init(void)
_


* [RFC][PATCH 04/34] x86/mm: Introduce physical address limit helper
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (2 preceding siblings ...)
  2024-02-22 18:39 ` [RFC][PATCH 03/34] x86/pci: Assume that clflush size is always provided Dave Hansen
@ 2024-02-22 18:39 ` Dave Hansen
  2024-02-27 11:05   ` Huang, Kai
  2024-02-22 18:39 ` [RFC][PATCH 05/34] x86/cpu: Move /proc per-cpu ->x86_phys_bits reference to global one Dave Hansen
                   ` (29 subsequent siblings)
  33 siblings, 1 reply; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:39 UTC (permalink / raw)
  To: linux-kernel
  Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen, rafael, lenb


From: Dave Hansen <dave.hansen@linux.intel.com>

Modern processors enumerate the sizes of the physical and virtual address
spaces that they can handle.  The kernel stashes this information for each
CPU in 'cpuinfo_x86'.

Like a lot of system-wide configuration, in practice the kernel only
uses this information from the boot CPU.  That makes a lot of sense
because the kernel can't practically support two CPUs that differ in
something as fundamental as the size of their address spaces.

Having a per-cpu copy of this data is silly.  It is, at best, a waste
of space to keep it around for the non-boot CPUs.  At worst, it is yet
another bit of data that must be initialized in a particular order and
can be a source of bugs.

Introduce a helper to look up the number of supported physical address bits:

	x86_phys_bits()

Replace most open-coded references to boot_cpu_data.x86_phys_bits.

This form is more compact and also opens the door to adding
centralized checking and rule enforcement around this important
system-wide value (see the hypothetical sketch below).
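
For instance, a hypothetical (not in this patch) sanity check inside
the helper could catch reads that happen before the value is
established:

	static inline u8 x86_phys_bits(void)
	{
		/* Hypothetical check enabled by centralization: */
		WARN_ON_ONCE(!boot_cpu_data.x86_phys_bits);
		return boot_cpu_data.x86_phys_bits;
	}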

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Len Brown <lenb@kernel.org>
---

 b/arch/x86/include/asm/kmsan.h       |    2 +-
 b/arch/x86/include/asm/mce.h         |    2 +-
 b/arch/x86/include/asm/processor.h   |    5 +++++
 b/arch/x86/kernel/cpu/mtrr/cleanup.c |    2 +-
 b/arch/x86/kernel/cpu/mtrr/generic.c |    2 +-
 b/arch/x86/kernel/cpu/mtrr/mtrr.c    |    5 ++---
 b/arch/x86/kernel/setup.c            |    2 +-
 b/arch/x86/kvm/cpuid.c               |    2 +-
 b/arch/x86/kvm/mmu.h                 |   10 +++++-----
 b/arch/x86/kvm/mmu/spte.c            |    2 +-
 b/arch/x86/kvm/svm/svm.c             |    2 +-
 b/arch/x86/kvm/vmx/vmx.c             |   10 +++++-----
 b/arch/x86/kvm/vmx/vmx.h             |    2 +-
 b/arch/x86/mm/physaddr.h             |    2 +-
 b/drivers/acpi/acpi_fpdt.c           |    2 +-
 15 files changed, 28 insertions(+), 24 deletions(-)

diff -puN arch/x86/include/asm/kmsan.h~introduce-x86_phys_bits arch/x86/include/asm/kmsan.h
--- a/arch/x86/include/asm/kmsan.h~introduce-x86_phys_bits	2024-02-22 10:08:49.828507052 -0800
+++ b/arch/x86/include/asm/kmsan.h	2024-02-22 10:08:49.852507994 -0800
@@ -52,7 +52,7 @@ static inline void *arch_kmsan_get_meta_
 static inline bool kmsan_phys_addr_valid(unsigned long addr)
 {
 	if (IS_ENABLED(CONFIG_PHYS_ADDR_T_64BIT))
-		return !(addr >> boot_cpu_data.x86_phys_bits);
+		return !(addr >> x86_phys_bits());
 	else
 		return true;
 }
diff -puN arch/x86/include/asm/mce.h~introduce-x86_phys_bits arch/x86/include/asm/mce.h
--- a/arch/x86/include/asm/mce.h~introduce-x86_phys_bits	2024-02-22 10:08:49.828507052 -0800
+++ b/arch/x86/include/asm/mce.h	2024-02-22 10:08:49.852507994 -0800
@@ -89,7 +89,7 @@
 #define  MCI_MISC_ADDR_GENERIC	7	/* generic */
 
 /* MCi_ADDR register defines */
-#define MCI_ADDR_PHYSADDR	GENMASK_ULL(boot_cpu_data.x86_phys_bits - 1, 0)
+#define MCI_ADDR_PHYSADDR	GENMASK_ULL(x86_phys_bits() - 1, 0)
 
 /* CTL2 register defines */
 #define MCI_CTL2_CMCI_EN		BIT_ULL(30)
diff -puN arch/x86/include/asm/processor.h~introduce-x86_phys_bits arch/x86/include/asm/processor.h
--- a/arch/x86/include/asm/processor.h~introduce-x86_phys_bits	2024-02-22 10:08:49.828507052 -0800
+++ b/arch/x86/include/asm/processor.h	2024-02-22 10:08:49.852507994 -0800
@@ -767,4 +767,9 @@ static inline void weak_wrmsr_fence(void
 	alternative("mfence; lfence", "", ALT_NOT(X86_FEATURE_APIC_MSRS_FENCE));
 }
 
+static inline u8 x86_phys_bits(void)
+{
+	return boot_cpu_data.x86_phys_bits;
+}
+
 #endif /* _ASM_X86_PROCESSOR_H */
diff -puN arch/x86/kernel/cpu/mtrr/cleanup.c~introduce-x86_phys_bits arch/x86/kernel/cpu/mtrr/cleanup.c
--- a/arch/x86/kernel/cpu/mtrr/cleanup.c~introduce-x86_phys_bits	2024-02-22 10:08:49.832507209 -0800
+++ b/arch/x86/kernel/cpu/mtrr/cleanup.c	2024-02-22 10:08:49.852507994 -0800
@@ -170,7 +170,7 @@ set_var_mtrr(unsigned int reg, unsigned
 		return;
 	}
 
-	mask = (1ULL << boot_cpu_data.x86_phys_bits) - 1;
+	mask = (1ULL << x86_phys_bits()) - 1;
 	mask &= ~((((u64)sizek) << 10) - 1);
 
 	base = ((u64)basek) << 10;
diff -puN arch/x86/kernel/cpu/mtrr/generic.c~introduce-x86_phys_bits arch/x86/kernel/cpu/mtrr/generic.c
--- a/arch/x86/kernel/cpu/mtrr/generic.c~introduce-x86_phys_bits	2024-02-22 10:08:49.832507209 -0800
+++ b/arch/x86/kernel/cpu/mtrr/generic.c	2024-02-22 10:08:49.852507994 -0800
@@ -660,7 +660,7 @@ static void __init print_mtrr_state(void
 	}
 	pr_info("MTRR variable ranges %sabled:\n",
 		mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED ? "en" : "dis");
-	high_width = (boot_cpu_data.x86_phys_bits - (32 - PAGE_SHIFT) + 3) / 4;
+	high_width = (x86_phys_bits() - (32 - PAGE_SHIFT) + 3) / 4;
 
 	for (i = 0; i < num_var_ranges; ++i) {
 		if (mtrr_state.var_ranges[i].mask_lo & MTRR_PHYSMASK_V)
diff -puN arch/x86/kernel/cpu/mtrr/mtrr.c~introduce-x86_phys_bits arch/x86/kernel/cpu/mtrr/mtrr.c
--- a/arch/x86/kernel/cpu/mtrr/mtrr.c~introduce-x86_phys_bits	2024-02-22 10:08:49.836507366 -0800
+++ b/arch/x86/kernel/cpu/mtrr/mtrr.c	2024-02-22 10:08:49.852507994 -0800
@@ -252,8 +252,7 @@ int mtrr_add_page(unsigned long base, un
 		return -EINVAL;
 	}
 
-	if ((base | (base + size - 1)) >>
-	    (boot_cpu_data.x86_phys_bits - PAGE_SHIFT)) {
+	if ((base | (base + size - 1)) >> (x86_phys_bits() - PAGE_SHIFT)) {
 		pr_warn("base or size exceeds the MTRR width\n");
 		return -EINVAL;
 	}
@@ -556,7 +555,7 @@ void __init mtrr_bp_init(void)
 	const char *why = "(not available)";
 	unsigned long config, dummy;
 
-	phys_hi_rsvd = GENMASK(31, boot_cpu_data.x86_phys_bits - 32);
+	phys_hi_rsvd = GENMASK(31, x86_phys_bits() - 32);
 
 	if (!generic_mtrrs && mtrr_state.enabled) {
 		/*
diff -puN arch/x86/kernel/setup.c~introduce-x86_phys_bits arch/x86/kernel/setup.c
--- a/arch/x86/kernel/setup.c~introduce-x86_phys_bits	2024-02-22 10:08:49.836507366 -0800
+++ b/arch/x86/kernel/setup.c	2024-02-22 10:08:49.852507994 -0800
@@ -813,7 +813,7 @@ void __init setup_arch(char **cmdline_p)
 	 */
 	early_reserve_memory();
 
-	iomem_resource.end = (1ULL << boot_cpu_data.x86_phys_bits) - 1;
+	iomem_resource.end = (1ULL << x86_phys_bits()) - 1;
 	e820__memory_setup();
 	parse_setup_data();
 
diff -puN arch/x86/kvm/cpuid.c~introduce-x86_phys_bits arch/x86/kvm/cpuid.c
--- a/arch/x86/kvm/cpuid.c~introduce-x86_phys_bits	2024-02-22 10:08:49.836507366 -0800
+++ b/arch/x86/kvm/cpuid.c	2024-02-22 10:08:49.852507994 -0800
@@ -1236,7 +1236,7 @@ static inline int __do_cpuid_func(struct
 		 * the HPAs do not affect GPAs.
 		 */
 		if (!tdp_enabled)
-			g_phys_as = boot_cpu_data.x86_phys_bits;
+			g_phys_as = x86_phys_bits();
 		else if (!g_phys_as)
 			g_phys_as = phys_as;
 
diff -puN arch/x86/kvm/mmu.h~introduce-x86_phys_bits arch/x86/kvm/mmu.h
--- a/arch/x86/kvm/mmu.h~introduce-x86_phys_bits	2024-02-22 10:08:49.840507523 -0800
+++ b/arch/x86/kvm/mmu.h	2024-02-22 10:08:49.852507994 -0800
@@ -84,10 +84,10 @@ static inline gfn_t kvm_mmu_max_gfn(void
 static inline u8 kvm_get_shadow_phys_bits(void)
 {
 	/*
-	 * boot_cpu_data.x86_phys_bits is reduced when MKTME or SME are detected
-	 * in CPU detection code, but the processor treats those reduced bits as
-	 * 'keyID' thus they are not reserved bits. Therefore KVM needs to look at
-	 * the physical address bits reported by CPUID.
+	 * x86_phys_bits() is reduced when MKTME or SME are detected in CPU
+	 * detection code, but the processor treats those reduced bits as 'keyID'
+	 * thus they are not reserved bits. Therefore KVM needs to look at the
+	 * physical address bits reported by CPUID.
 	 */
 	if (likely(boot_cpu_data.extended_cpuid_level >= 0x80000008))
 		return cpuid_eax(0x80000008) & 0xff;
@@ -97,7 +97,7 @@ static inline u8 kvm_get_shadow_phys_bit
 	 * custom CPUID.  Proceed with whatever the kernel found since these features
 	 * aren't virtualizable (SME/SEV also require CPUIDs higher than 0x80000008).
 	 */
-	return boot_cpu_data.x86_phys_bits;
+	return x86_phys_bits();
 }
 
 void kvm_mmu_set_mmio_spte_mask(u64 mmio_value, u64 mmio_mask, u64 access_mask);
diff -puN arch/x86/kvm/mmu/spte.c~introduce-x86_phys_bits arch/x86/kvm/mmu/spte.c
--- a/arch/x86/kvm/mmu/spte.c~introduce-x86_phys_bits	2024-02-22 10:08:49.840507523 -0800
+++ b/arch/x86/kvm/mmu/spte.c	2024-02-22 10:08:49.852507994 -0800
@@ -468,7 +468,7 @@ void kvm_mmu_reset_all_pte_masks(void)
 	 * the most significant bits of legal physical address space.
 	 */
 	shadow_nonpresent_or_rsvd_mask = 0;
-	low_phys_bits = boot_cpu_data.x86_phys_bits;
+	low_phys_bits = x86_phys_bits();
 	if (boot_cpu_has_bug(X86_BUG_L1TF) &&
 	    !WARN_ON_ONCE(boot_cpu_data.x86_cache_bits >=
 			  52 - SHADOW_NONPRESENT_OR_RSVD_MASK_LEN)) {
diff -puN arch/x86/kvm/svm/svm.c~introduce-x86_phys_bits arch/x86/kvm/svm/svm.c
--- a/arch/x86/kvm/svm/svm.c~introduce-x86_phys_bits	2024-02-22 10:08:49.840507523 -0800
+++ b/arch/x86/kvm/svm/svm.c	2024-02-22 10:08:49.852507994 -0800
@@ -5054,7 +5054,7 @@ static __init void svm_adjust_mmio_mask(
 		return;
 
 	enc_bit = cpuid_ebx(0x8000001f) & 0x3f;
-	mask_bit = boot_cpu_data.x86_phys_bits;
+	mask_bit = x86_phys_bits();
 
 	/* Increment the mask bit if it is the same as the encryption bit */
 	if (enc_bit == mask_bit)
diff -puN arch/x86/kvm/vmx/vmx.c~introduce-x86_phys_bits arch/x86/kvm/vmx/vmx.c
--- a/arch/x86/kvm/vmx/vmx.c~introduce-x86_phys_bits	2024-02-22 10:08:49.844507680 -0800
+++ b/arch/x86/kvm/vmx/vmx.c	2024-02-22 10:08:49.852507994 -0800
@@ -8444,14 +8444,14 @@ static void __init vmx_setup_me_spte_mas
 	 * kvm_get_shadow_phys_bits() returns shadow_phys_bits.  Use
 	 * the former to avoid exposing shadow_phys_bits.
 	 *
-	 * On pre-MKTME system, boot_cpu_data.x86_phys_bits equals to
+	 * On pre-MKTME system, x86_phys_bits() equals to
 	 * shadow_phys_bits.  On MKTME and/or TDX capable systems,
-	 * boot_cpu_data.x86_phys_bits holds the actual physical address
-	 * w/o the KeyID bits, and shadow_phys_bits equals to MAXPHYADDR
+	 * x86_phys_bits() holds the actual physical address w/o the
+	 * KeyID bits, and shadow_phys_bits equals to MAXPHYADDR
 	 * reported by CPUID.  Those bits between are KeyID bits.
 	 */
-	if (boot_cpu_data.x86_phys_bits != kvm_get_shadow_phys_bits())
-		me_mask = rsvd_bits(boot_cpu_data.x86_phys_bits,
+	if (x86_phys_bits() != kvm_get_shadow_phys_bits())
+		me_mask = rsvd_bits(x86_phys_bits(),
 			kvm_get_shadow_phys_bits() - 1);
 	/*
 	 * Unlike SME, host kernel doesn't support setting up any
diff -puN arch/x86/kvm/vmx/vmx.h~introduce-x86_phys_bits arch/x86/kvm/vmx/vmx.h
--- a/arch/x86/kvm/vmx/vmx.h~introduce-x86_phys_bits	2024-02-22 10:08:49.844507680 -0800
+++ b/arch/x86/kvm/vmx/vmx.h	2024-02-22 10:08:49.852507994 -0800
@@ -721,7 +721,7 @@ static inline bool vmx_need_pf_intercept
 	if (!enable_ept)
 		return true;
 
-	return allow_smaller_maxphyaddr && cpuid_maxphyaddr(vcpu) < boot_cpu_data.x86_phys_bits;
+	return allow_smaller_maxphyaddr && cpuid_maxphyaddr(vcpu) < x86_phys_bits();
 }
 
 static inline bool is_unrestricted_guest(struct kvm_vcpu *vcpu)
diff -puN arch/x86/mm/physaddr.h~introduce-x86_phys_bits arch/x86/mm/physaddr.h
--- a/arch/x86/mm/physaddr.h~introduce-x86_phys_bits	2024-02-22 10:08:49.848507837 -0800
+++ b/arch/x86/mm/physaddr.h	2024-02-22 10:08:49.852507994 -0800
@@ -4,7 +4,7 @@
 static inline int phys_addr_valid(resource_size_t addr)
 {
 #ifdef CONFIG_PHYS_ADDR_T_64BIT
-	return !(addr >> boot_cpu_data.x86_phys_bits);
+	return !(addr >> x86_phys_bits());
 #else
 	return 1;
 #endif
diff -puN drivers/acpi/acpi_fpdt.c~introduce-x86_phys_bits drivers/acpi/acpi_fpdt.c
--- a/drivers/acpi/acpi_fpdt.c~introduce-x86_phys_bits	2024-02-22 10:08:49.848507837 -0800
+++ b/drivers/acpi/acpi_fpdt.c	2024-02-22 10:08:49.852507994 -0800
@@ -151,7 +151,7 @@ static bool fpdt_address_valid(u64 addre
 	 * On some systems the table contains invalid addresses
 	 * with unsuppored high address bits set, check for this.
 	 */
-	return !(address >> boot_cpu_data.x86_phys_bits);
+	return !(address >> x86_phys_bits());
 }
 #else
 static bool fpdt_address_valid(u64 address)
_


* [RFC][PATCH 05/34] x86/cpu: Move /proc per-cpu ->x86_phys_bits reference to global one
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (3 preceding siblings ...)
  2024-02-22 18:39 ` [RFC][PATCH 04/34] x86/mm: Introduce physical address limit helper Dave Hansen
@ 2024-02-22 18:39 ` Dave Hansen
  2024-02-27 11:05   ` Huang, Kai
  2024-02-22 18:39 ` [RFC][PATCH 06/34] x86/boot: Use consistent value for iomem_resource.end Dave Hansen
                   ` (28 subsequent siblings)
  33 siblings, 1 reply; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

This is one of the few references to the per-cpu version of
->x86_phys_bits.  It is theoretically possible to have this value vary
from CPU to CPU.  But in practice nobody builds systems that way.

Continue outputting the value in /proc, but read it from the global
configuration.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/kernel/cpu/proc.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN arch/x86/kernel/cpu/proc.c~proc-global-x86-phys-bits arch/x86/kernel/cpu/proc.c
--- a/arch/x86/kernel/cpu/proc.c~proc-global-x86-phys-bits	2024-02-22 10:08:50.584536732 -0800
+++ b/arch/x86/kernel/cpu/proc.c	2024-02-22 10:08:50.584536732 -0800
@@ -133,7 +133,7 @@ static int show_cpuinfo(struct seq_file
 	seq_printf(m, "clflush size\t: %u\n", c->x86_clflush_size);
 	seq_printf(m, "cache_alignment\t: %d\n", c->x86_cache_alignment);
 	seq_printf(m, "address sizes\t: %u bits physical, %u bits virtual\n",
-		   c->x86_phys_bits, c->x86_virt_bits);
+		   x86_phys_bits(), c->x86_virt_bits);
 
 	seq_puts(m, "power management:");
 	for (i = 0; i < 32; i++) {
_


* [RFC][PATCH 06/34] x86/boot: Use consistent value for iomem_resource.end
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (4 preceding siblings ...)
  2024-02-22 18:39 ` [RFC][PATCH 05/34] x86/cpu: Move /proc per-cpu ->x86_phys_bits reference to global one Dave Hansen
@ 2024-02-22 18:39 ` Dave Hansen
  2024-02-27 10:59   ` Huang, Kai
  2024-02-22 18:39 ` [RFC][PATCH 07/34] x86/mm: Introduce virtual address space limit helper Dave Hansen
                   ` (27 subsequent siblings)
  33 siblings, 1 reply; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

The 'struct cpuinfo_x86' values (including 'boot_cpu_data') get
written and overwritten rather randomly.  They are not stable
during early boot, and readers end up getting a random mishmash
of hard-coded defaults or CPUID-provided values based on when
the values are read.

iomem_resource.end is one of these readers.  Because of where the
assignment runs, it ended up seeing .x86_phys_bits==MAX_PHYSMEM_BITS,
which is (mostly) a compile-time default.  But iomem_resource.end is
never updated if the runtime CPUID .x86_phys_bits is lower.

Set iomem_resource.end to the compile-time value explicitly.
It does not need to be precise as this is mostly to ensure
that insane values can't be reserved in 'iomem_resource'.

Make MAX_PHYSMEM_BITS available outside of sparsemem
configurations by removing the #ifdef CONFIG_SPARSEMEM in the
header.
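
As the sparsemem.h hunk below shows, MAX_PHYSMEM_BITS on 64-bit is 46
without 5-level paging and 52 with it, so this pins iomem_resource.end
at one of:

	iomem_resource.end = (1ULL << 46) - 1;	/* 64 TiB - 1, 4-level */
	iomem_resource.end = (1ULL << 52) - 1;	/*  4 PiB - 1, 5-level */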

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/include/asm/sparsemem.h |    3 ---
 b/arch/x86/kernel/setup.c          |   10 +++++++++-
 2 files changed, 9 insertions(+), 4 deletions(-)

diff -puN arch/x86/kernel/setup.c~iomem_resource_end arch/x86/kernel/setup.c
--- a/arch/x86/kernel/setup.c~iomem_resource_end	2024-02-22 10:08:51.048554948 -0800
+++ b/arch/x86/kernel/setup.c	2024-02-22 10:21:04.485531464 -0800
@@ -51,6 +51,7 @@
 #include <asm/pci-direct.h>
 #include <asm/prom.h>
 #include <asm/proto.h>
+#include <asm/sparsemem.h>
 #include <asm/thermal.h>
 #include <asm/unwind.h>
 #include <asm/vsyscall.h>
@@ -813,7 +814,14 @@ void __init setup_arch(char **cmdline_p)
 	 */
 	early_reserve_memory();
 
-	iomem_resource.end = (1ULL << x86_phys_bits()) - 1;
+	/*
+	 * This was too big before.  It ended up getting MAX_PHYSMEM_BITS
+	 * even if .x86_phys_bits was eventually lowered below that.
+	 * But that was evidently harmless, so leave it too big, but
+	 * set it explicitly to MAX_PHYSMEM_BITS instead of taking a
+	 * trip through .x86_phys_bits.
+	 */
+	iomem_resource.end = (1ULL << MAX_PHYSMEM_BITS) - 1;
 	e820__memory_setup();
 	parse_setup_data();
 
diff -puN arch/x86/include/asm/sparsemem.h~iomem_resource_end arch/x86/include/asm/sparsemem.h
--- a/arch/x86/include/asm/sparsemem.h~iomem_resource_end	2024-02-22 10:19:56.842831828 -0800
+++ b/arch/x86/include/asm/sparsemem.h	2024-02-22 10:20:21.207804806 -0800
@@ -4,7 +4,6 @@
 
 #include <linux/types.h>
 
-#ifdef CONFIG_SPARSEMEM
 /*
  * generic non-linear memory support:
  *
@@ -29,8 +28,6 @@
 # define MAX_PHYSMEM_BITS	(pgtable_l5_enabled() ? 52 : 46)
 #endif
 
-#endif /* CONFIG_SPARSEMEM */
-
 #ifndef __ASSEMBLY__
 #ifdef CONFIG_NUMA_KEEP_MEMINFO
 extern int phys_to_target_node(phys_addr_t start);
_


* [RFC][PATCH 07/34] x86/mm: Introduce virtual address space limit helper
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (5 preceding siblings ...)
  2024-02-22 18:39 ` [RFC][PATCH 06/34] x86/boot: Use consistent value for iomem_resource.end Dave Hansen
@ 2024-02-22 18:39 ` Dave Hansen
  2024-02-27 11:09   ` Huang, Kai
  2024-02-22 18:39 ` [RFC][PATCH 08/34] x86/cpu: Add CLFLUSH size helper Dave Hansen
                   ` (26 subsequent siblings)
  33 siblings, 1 reply; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

This uses the same logic and approach that were used for the physical
address limit with x86_phys_bits() and extends them to the virtual
address space.

Introduce a system-wide helper for users to query the size of the
virtual address space: x86_virt_bits()
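
For example, with 4-level paging x86_virt_bits() is 48, so the
sign-extension helper in the AMD LBR hunk below reduces to:

	u32 shift = 64 - x86_virt_bits();	/* 64 - 48 = 16 */

	return (u64)(((s64)ip << 16) >> 16);	/* replicate bit 47 upward */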

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/events/amd/brs.c        |    2 +-
 b/arch/x86/events/amd/lbr.c        |    2 +-
 b/arch/x86/events/intel/pt.c       |    4 ++--
 b/arch/x86/include/asm/processor.h |    5 +++++
 b/arch/x86/kernel/cpu/proc.c       |    2 +-
 b/arch/x86/mm/maccess.c            |    4 ++--
 6 files changed, 12 insertions(+), 7 deletions(-)

diff -puN arch/x86/events/amd/brs.c~x86_virt_bits-func arch/x86/events/amd/brs.c
--- a/arch/x86/events/amd/brs.c~x86_virt_bits-func	2024-02-22 10:08:51.528573793 -0800
+++ b/arch/x86/events/amd/brs.c	2024-02-22 10:08:51.536574107 -0800
@@ -285,7 +285,7 @@ void amd_brs_drain(void)
 	struct perf_branch_entry *br = cpuc->lbr_entries;
 	union amd_debug_extn_cfg cfg;
 	u32 i, nr = 0, num, tos, start;
-	u32 shift = 64 - boot_cpu_data.x86_virt_bits;
+	u32 shift = 64 - x86_virt_bits();
 
 	/*
 	 * BRS event forced on PMC0,
diff -puN arch/x86/events/amd/lbr.c~x86_virt_bits-func arch/x86/events/amd/lbr.c
--- a/arch/x86/events/amd/lbr.c~x86_virt_bits-func	2024-02-22 10:08:51.528573793 -0800
+++ b/arch/x86/events/amd/lbr.c	2024-02-22 10:08:51.536574107 -0800
@@ -89,7 +89,7 @@ static __always_inline u64 amd_pmu_lbr_g
 
 static __always_inline u64 sign_ext_branch_ip(u64 ip)
 {
-	u32 shift = 64 - boot_cpu_data.x86_virt_bits;
+	u32 shift = 64 - x86_virt_bits();
 
 	return (u64)(((s64)ip << shift) >> shift);
 }
diff -puN arch/x86/events/intel/pt.c~x86_virt_bits-func arch/x86/events/intel/pt.c
--- a/arch/x86/events/intel/pt.c~x86_virt_bits-func	2024-02-22 10:08:51.528573793 -0800
+++ b/arch/x86/events/intel/pt.c	2024-02-22 10:08:51.536574107 -0800
@@ -1453,8 +1453,8 @@ static void pt_event_addr_filters_sync(s
 			 * canonical addresses does not affect the result of the
 			 * address filter.
 			 */
-			msr_a = clamp_to_ge_canonical_addr(a, boot_cpu_data.x86_virt_bits);
-			msr_b = clamp_to_le_canonical_addr(b, boot_cpu_data.x86_virt_bits);
+			msr_a = clamp_to_ge_canonical_addr(a, x86_virt_bits());
+			msr_b = clamp_to_le_canonical_addr(b, x86_virt_bits());
 			if (msr_b < msr_a)
 				msr_a = msr_b = 0;
 		}
diff -puN arch/x86/include/asm/processor.h~x86_virt_bits-func arch/x86/include/asm/processor.h
--- a/arch/x86/include/asm/processor.h~x86_virt_bits-func	2024-02-22 10:08:51.532573950 -0800
+++ b/arch/x86/include/asm/processor.h	2024-02-22 10:08:51.536574107 -0800
@@ -772,4 +772,9 @@ static inline u8 x86_phys_bits(void)
 	return boot_cpu_data.x86_phys_bits;
 }
 
+static inline u8 x86_virt_bits(void)
+{
+	return boot_cpu_data.x86_virt_bits;
+}
+
 #endif /* _ASM_X86_PROCESSOR_H */
diff -puN arch/x86/kernel/cpu/proc.c~x86_virt_bits-func arch/x86/kernel/cpu/proc.c
--- a/arch/x86/kernel/cpu/proc.c~x86_virt_bits-func	2024-02-22 10:08:51.532573950 -0800
+++ b/arch/x86/kernel/cpu/proc.c	2024-02-22 10:08:51.536574107 -0800
@@ -133,7 +133,7 @@ static int show_cpuinfo(struct seq_file
 	seq_printf(m, "clflush size\t: %u\n", c->x86_clflush_size);
 	seq_printf(m, "cache_alignment\t: %d\n", c->x86_cache_alignment);
 	seq_printf(m, "address sizes\t: %u bits physical, %u bits virtual\n",
-		   x86_phys_bits(), c->x86_virt_bits);
+		   x86_phys_bits(), x86_virt_bits());
 
 	seq_puts(m, "power management:");
 	for (i = 0; i < 32; i++) {
diff -puN arch/x86/mm/maccess.c~x86_virt_bits-func arch/x86/mm/maccess.c
--- a/arch/x86/mm/maccess.c~x86_virt_bits-func	2024-02-22 10:08:51.536574107 -0800
+++ b/arch/x86/mm/maccess.c	2024-02-22 10:08:51.536574107 -0800
@@ -20,10 +20,10 @@ bool copy_from_kernel_nofault_allowed(co
 	 * is initialized.  Needed for instruction decoding in early
 	 * exception handlers.
 	 */
-	if (!boot_cpu_data.x86_virt_bits)
+	if (!x86_virt_bits())
 		return true;
 
-	return __is_canonical_address(vaddr, boot_cpu_data.x86_virt_bits);
+	return __is_canonical_address(vaddr, x86_virt_bits());
 }
 #else
 bool copy_from_kernel_nofault_allowed(const void *unsafe_src, size_t size)
_


* [RFC][PATCH 08/34] x86/cpu: Add CLFLUSH size helper
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (6 preceding siblings ...)
  2024-02-22 18:39 ` [RFC][PATCH 07/34] x86/mm: Introduce virtual address space limit helper Dave Hansen
@ 2024-02-22 18:39 ` Dave Hansen
  2024-02-22 18:39 ` [RFC][PATCH 09/34] x86/cpu: Introduce address configuration structure Dave Hansen
                   ` (25 subsequent siblings)
  33 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:39 UTC (permalink / raw)
  To: linux-kernel
  Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen, mpatocka,
	snitzer, daniel


From: Dave Hansen <dave.hansen@linux.intel.com>

This is probably starting to look familiar:  The size of a cacheline is
fundamental system-wide information.  Keeping it per-cpu is just silly.

Introduce a system-wide helper for looking up the cacheline size and
use it.

This does one slightly odd-looking thing: it stops setting
c->x86_clflush_size on all but the boot CPU.  This is functionally OK
because all readers of the secondary CPU values also go away.  It also
makes it explicit that 'boot_cpu_data' holds the one true system-wide
value.
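
As a reminder of where the value comes from: CPUID leaf 1 reports the
CLFLUSH line size in EBX[15:8] in units of 8 bytes, so the common
encoding of 8 yields ((misc >> 8) & 0xff) * 8 = 64 bytes, as computed
in the cpu_detect() hunk below.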

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Daniel Vetter <daniel@ffwll.ch>

---

 b/arch/x86/include/asm/processor.h       |    5 +++++
 b/arch/x86/kernel/cpu/centaur.c          |    2 +-
 b/arch/x86/kernel/cpu/common.c           |    9 +++++----
 b/arch/x86/kernel/cpu/intel.c            |    2 +-
 b/arch/x86/kernel/cpu/proc.c             |    2 +-
 b/arch/x86/lib/usercopy_64.c             |    7 +++----
 b/arch/x86/mm/pat/set_memory.c           |    2 +-
 b/arch/x86/pci/common.c                  |    2 +-
 b/drivers/gpu/drm/drm_cache.c            |    4 ++--
 b/drivers/gpu/drm/i915/i915_cmd_parser.c |    3 +--
 b/drivers/gpu/drm/i915/i915_gem.c        |    2 +-
 b/drivers/md/dm-writecache.c             |    2 +-
 12 files changed, 23 insertions(+), 19 deletions(-)

diff -puN arch/x86/include/asm/processor.h~x86_clflush_size-func arch/x86/include/asm/processor.h
--- a/arch/x86/include/asm/processor.h~x86_clflush_size-func	2024-02-22 10:08:52.112596720 -0800
+++ b/arch/x86/include/asm/processor.h	2024-02-22 10:08:52.132597505 -0800
@@ -777,4 +777,9 @@ static inline u8 x86_virt_bits(void)
 	return boot_cpu_data.x86_virt_bits;
 }
 
+static inline u8 x86_clflush_size(void)
+{
+	return boot_cpu_data.x86_clflush_size;
+}
+
 #endif /* _ASM_X86_PROCESSOR_H */
diff -puN arch/x86/kernel/cpu/centaur.c~x86_clflush_size-func arch/x86/kernel/cpu/centaur.c
--- a/arch/x86/kernel/cpu/centaur.c~x86_clflush_size-func	2024-02-22 10:08:52.112596720 -0800
+++ b/arch/x86/kernel/cpu/centaur.c	2024-02-22 10:08:52.132597505 -0800
@@ -62,7 +62,7 @@ static void init_c3(struct cpuinfo_x86 *
 		set_cpu_cap(c, X86_FEATURE_3DNOW);
 #endif
 	if (c->x86 == 0x6 && c->x86_model >= 0xf) {
-		c->x86_cache_alignment = c->x86_clflush_size * 2;
+		c->x86_cache_alignment = x86_clflush_size() * 2;
 		set_cpu_cap(c, X86_FEATURE_REP_GOOD);
 	}
 
diff -puN arch/x86/kernel/cpu/common.c~x86_clflush_size-func arch/x86/kernel/cpu/common.c
--- a/arch/x86/kernel/cpu/common.c~x86_clflush_size-func	2024-02-22 10:08:52.112596720 -0800
+++ b/arch/x86/kernel/cpu/common.c	2024-02-22 10:08:52.132597505 -0800
@@ -954,8 +954,9 @@ void cpu_detect(struct cpuinfo_x86 *c)
 		c->x86_stepping	= x86_stepping(tfms);
 
 		if (cap0 & (1<<19)) {
-			c->x86_clflush_size = ((misc >> 8) & 0xff) * 8;
-			c->x86_cache_alignment = c->x86_clflush_size;
+			if (c == &boot_cpu_data)
+				c->x86_clflush_size = ((misc >> 8) & 0xff) * 8;
+			c->x86_cache_alignment = x86_clflush_size();
 		}
 	}
 }
@@ -1123,7 +1124,7 @@ void get_cpu_address_sizes(struct cpuinf
 		}
 	}
 	c->x86_cache_bits = c->x86_phys_bits;
-	c->x86_cache_alignment = c->x86_clflush_size;
+	c->x86_cache_alignment = x86_clflush_size();
 }
 
 static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
@@ -1831,7 +1832,7 @@ static void identify_cpu(struct cpuinfo_
 	c->x86_phys_bits = 32;
 	c->x86_virt_bits = 32;
 #endif
-	c->x86_cache_alignment = c->x86_clflush_size;
+	c->x86_cache_alignment = x86_clflush_size();
 	memset(&c->x86_capability, 0, sizeof(c->x86_capability));
 #ifdef CONFIG_X86_VMX_FEATURE_NAMES
 	memset(&c->vmx_capability, 0, sizeof(c->vmx_capability));
diff -puN arch/x86/kernel/cpu/intel.c~x86_clflush_size-func arch/x86/kernel/cpu/intel.c
--- a/arch/x86/kernel/cpu/intel.c~x86_clflush_size-func	2024-02-22 10:08:52.116596877 -0800
+++ b/arch/x86/kernel/cpu/intel.c	2024-02-22 10:08:52.132597505 -0800
@@ -653,7 +653,7 @@ static void init_intel(struct cpuinfo_x8
 
 #ifdef CONFIG_X86_64
 	if (c->x86 == 15)
-		c->x86_cache_alignment = c->x86_clflush_size * 2;
+		c->x86_cache_alignment = x86_clflush_size() * 2;
 	if (c->x86 == 6)
 		set_cpu_cap(c, X86_FEATURE_REP_GOOD);
 #else
diff -puN arch/x86/kernel/cpu/proc.c~x86_clflush_size-func arch/x86/kernel/cpu/proc.c
--- a/arch/x86/kernel/cpu/proc.c~x86_clflush_size-func	2024-02-22 10:08:52.116596877 -0800
+++ b/arch/x86/kernel/cpu/proc.c	2024-02-22 10:08:52.132597505 -0800
@@ -130,7 +130,7 @@ static int show_cpuinfo(struct seq_file
 	if (c->x86_tlbsize > 0)
 		seq_printf(m, "TLB size\t: %d 4K pages\n", c->x86_tlbsize);
 #endif
-	seq_printf(m, "clflush size\t: %u\n", c->x86_clflush_size);
+	seq_printf(m, "clflush size\t: %u\n", x86_clflush_size());
 	seq_printf(m, "cache_alignment\t: %d\n", c->x86_cache_alignment);
 	seq_printf(m, "address sizes\t: %u bits physical, %u bits virtual\n",
 		   x86_phys_bits(), x86_virt_bits());
diff -puN arch/x86/lib/usercopy_64.c~x86_clflush_size-func arch/x86/lib/usercopy_64.c
--- a/arch/x86/lib/usercopy_64.c~x86_clflush_size-func	2024-02-22 10:08:52.120597034 -0800
+++ b/arch/x86/lib/usercopy_64.c	2024-02-22 10:08:52.132597505 -0800
@@ -27,13 +27,12 @@
  */
 static void clean_cache_range(void *addr, size_t size)
 {
-	u16 x86_clflush_size = boot_cpu_data.x86_clflush_size;
-	unsigned long clflush_mask = x86_clflush_size - 1;
+	unsigned long clflush_mask = x86_clflush_size() - 1;
 	void *vend = addr + size;
 	void *p;
 
 	for (p = (void *)((unsigned long)addr & ~clflush_mask);
-	     p < vend; p += x86_clflush_size)
+	     p < vend; p += x86_clflush_size())
 		clwb(p);
 }
 
@@ -65,7 +64,7 @@ long __copy_user_flushcache(void *dst, c
 			clean_cache_range(dst, size);
 	} else {
 		if (!IS_ALIGNED(dest, 8)) {
-			dest = ALIGN(dest, boot_cpu_data.x86_clflush_size);
+			dest = ALIGN(dest, x86_clflush_size());
 			clean_cache_range(dst, 1);
 		}
 
diff -puN arch/x86/mm/pat/set_memory.c~x86_clflush_size-func arch/x86/mm/pat/set_memory.c
--- a/arch/x86/mm/pat/set_memory.c~x86_clflush_size-func	2024-02-22 10:08:52.120597034 -0800
+++ b/arch/x86/mm/pat/set_memory.c	2024-02-22 10:08:52.132597505 -0800
@@ -314,7 +314,7 @@ static unsigned long __cpa_addr(struct c
 
 static void clflush_cache_range_opt(void *vaddr, unsigned int size)
 {
-	const unsigned long clflush_size = boot_cpu_data.x86_clflush_size;
+	const unsigned long clflush_size = x86_clflush_size();
 	void *p = (void *)((unsigned long)vaddr & ~(clflush_size - 1));
 	void *vend = vaddr + size;
 
diff -puN arch/x86/pci/common.c~x86_clflush_size-func arch/x86/pci/common.c
--- a/arch/x86/pci/common.c~x86_clflush_size-func	2024-02-22 10:08:52.120597034 -0800
+++ b/arch/x86/pci/common.c	2024-02-22 10:08:52.132597505 -0800
@@ -480,7 +480,7 @@ void pcibios_scan_root(int busnum)
 
 void __init pcibios_set_cache_line_size(void)
 {
-	pci_dfl_cache_line_size = boot_cpu_data.x86_clflush_size >> 2;
+	pci_dfl_cache_line_size = x86_clflush_size() >> 2;
 	printk(KERN_DEBUG "PCI: pci_cache_line_size set to %d bytes\n",
 		pci_dfl_cache_line_size << 2);
 }
diff -puN drivers/gpu/drm/drm_cache.c~x86_clflush_size-func drivers/gpu/drm/drm_cache.c
--- a/drivers/gpu/drm/drm_cache.c~x86_clflush_size-func	2024-02-22 10:08:52.124597191 -0800
+++ b/drivers/gpu/drm/drm_cache.c	2024-02-22 10:08:52.132597505 -0800
@@ -52,7 +52,7 @@ drm_clflush_page(struct page *page)
 {
 	uint8_t *page_virtual;
 	unsigned int i;
-	const int size = boot_cpu_data.x86_clflush_size;
+	const int size = x86_clflush_size();
 
 	if (unlikely(page == NULL))
 		return;
@@ -160,7 +160,7 @@ drm_clflush_virt_range(void *addr, unsig
 {
 #if defined(CONFIG_X86)
 	if (static_cpu_has(X86_FEATURE_CLFLUSH)) {
-		const int size = boot_cpu_data.x86_clflush_size;
+		const int size = x86_clflush_size();
 		void *end = addr + length;
 
 		addr = (void *)(((unsigned long)addr) & -size);
diff -puN drivers/gpu/drm/i915/i915_cmd_parser.c~x86_clflush_size-func drivers/gpu/drm/i915/i915_cmd_parser.c
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c~x86_clflush_size-func	2024-02-22 10:08:52.124597191 -0800
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c	2024-02-22 10:08:52.132597505 -0800
@@ -1203,8 +1203,7 @@ static u32 *copy_batch(struct drm_i915_g
 		 */
 		remain = length;
 		if (dst_needs_clflush & CLFLUSH_BEFORE)
-			remain = round_up(remain,
-					  boot_cpu_data.x86_clflush_size);
+			remain = round_up(remain, x86_clflush_size());
 
 		ptr = dst;
 		x = offset_in_page(offset);
diff -puN drivers/gpu/drm/i915/i915_gem.c~x86_clflush_size-func drivers/gpu/drm/i915/i915_gem.c
--- a/drivers/gpu/drm/i915/i915_gem.c~x86_clflush_size-func	2024-02-22 10:08:52.124597191 -0800
+++ b/drivers/gpu/drm/i915/i915_gem.c	2024-02-22 10:08:52.132597505 -0800
@@ -696,7 +696,7 @@ i915_gem_shmem_pwrite(struct drm_i915_ge
 	 */
 	partial_cacheline_write = 0;
 	if (needs_clflush & CLFLUSH_BEFORE)
-		partial_cacheline_write = boot_cpu_data.x86_clflush_size - 1;
+		partial_cacheline_write = x86_clflush_size() - 1;
 
 	user_data = u64_to_user_ptr(args->data_ptr);
 	remain = args->size;
diff -puN drivers/md/dm-writecache.c~x86_clflush_size-func drivers/md/dm-writecache.c
--- a/drivers/md/dm-writecache.c~x86_clflush_size-func	2024-02-22 10:08:52.128597348 -0800
+++ b/drivers/md/dm-writecache.c	2024-02-22 10:08:52.132597505 -0800
@@ -1229,7 +1229,7 @@ static void memcpy_flushcache_optimized(
 	 */
 #ifdef CONFIG_X86
 	if (static_cpu_has(X86_FEATURE_CLFLUSHOPT) &&
-	    likely(boot_cpu_data.x86_clflush_size == 64) &&
+	    likely(x86_clflush_size() == 64) &&
 	    likely(size >= 768)) {
 		do {
 			memcpy((void *)dest, (void *)source, 64);
_


* [RFC][PATCH 09/34] x86/cpu: Introduce address configuration structure
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (7 preceding siblings ...)
  2024-02-22 18:39 ` [RFC][PATCH 08/34] x86/cpu: Add CLFLUSH size helper Dave Hansen
@ 2024-02-22 18:39 ` Dave Hansen
  2024-02-27 23:47   ` Huang, Kai
  2024-02-22 18:39 ` [RFC][PATCH 10/34] x86/cpu/amd: Use new "address configuration" infrastructure Dave Hansen
                   ` (24 subsequent siblings)
  33 siblings, 1 reply; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

Right now, 'boot_cpu_data' is established in a very ad hoc way.  Random
bits of the initialization code write random things to it, either
blowing away state or tweaking it as they see fit.  It's madness.

Add some more structure to the process.

Introduce an "address configuration" structure just for the boot CPU.
This will be used to establish system-wide address space configuration.
It is written while bringing up the boot CPU and then read *ONCE* to
establish the configuration.

Also introduce the first field: phys_addr_reduction_bits.  This field
will be used on platforms with memory encryption hardware that reduces
the actually usable address bits beneath what the hardware enumerates.
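
As a made-up-numbers example of the intended use:

	/* In the boot CPU's ->c_bsp_init() detection code: */
	bsp_addr_config.phys_addr_reduction_bits = 6;

	/* Later, once, in get_cpu_address_sizes(): */
	c->x86_phys_bits -= bsp_addr_config.phys_addr_reduction_bits;
	/* e.g. 46 enumerated bits - 6 KeyID bits = 40 usable bits */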

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/include/asm/processor.h |   14 ++++++++++++++
 b/arch/x86/kernel/cpu/common.c     |    3 +++
 b/arch/x86/kernel/setup.c          |    2 ++
 3 files changed, 19 insertions(+)

diff -puN arch/x86/include/asm/processor.h~bsp-addr-info arch/x86/include/asm/processor.h
--- a/arch/x86/include/asm/processor.h~bsp-addr-info	2024-02-22 10:08:52.824624673 -0800
+++ b/arch/x86/include/asm/processor.h	2024-02-22 10:08:52.828624830 -0800
@@ -163,6 +163,19 @@ struct cpuinfo_x86 {
 	unsigned		initialized : 1;
 } __randomize_layout;
 
+/*
+ * Must be written by the time ->c_bsp_init() completes.
+ * Consumed in get_cpu_address_sizes().
+ */
+struct x86_addr_config {
+	/*
+	 * How many bits of the expected or enumerated physical
+	 * address space are unavailable?  Typically set on
+	 * platforms that use memory encryption.
+	 */
+	u8 phys_addr_reduction_bits;
+};
+
 #define X86_VENDOR_INTEL	0
 #define X86_VENDOR_CYRIX	1
 #define X86_VENDOR_AMD		2
@@ -182,6 +195,7 @@ struct cpuinfo_x86 {
  */
 extern struct cpuinfo_x86	boot_cpu_data;
 extern struct cpuinfo_x86	new_cpu_data;
+extern struct x86_addr_config	bsp_addr_config;
 
 extern __u32			cpu_caps_cleared[NCAPINTS + NBUGINTS];
 extern __u32			cpu_caps_set[NCAPINTS + NBUGINTS];
diff -puN arch/x86/kernel/cpu/common.c~bsp-addr-info arch/x86/kernel/cpu/common.c
--- a/arch/x86/kernel/cpu/common.c~bsp-addr-info	2024-02-22 10:08:52.824624673 -0800
+++ b/arch/x86/kernel/cpu/common.c	2024-02-22 10:08:52.828624830 -0800
@@ -1125,6 +1125,9 @@ void get_cpu_address_sizes(struct cpuinf
 	}
 	c->x86_cache_bits = c->x86_phys_bits;
 	c->x86_cache_alignment = x86_clflush_size();
+
+	/* Do this last to avoid affecting ->x86_cache_bits. */
+	c->x86_phys_bits -= bsp_addr_config.phys_addr_reduction_bits;
 }
 
 static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
diff -puN arch/x86/kernel/setup.c~bsp-addr-info arch/x86/kernel/setup.c
--- a/arch/x86/kernel/setup.c~bsp-addr-info	2024-02-22 10:08:52.824624673 -0800
+++ b/arch/x86/kernel/setup.c	2024-02-22 10:08:52.828624830 -0800
@@ -131,6 +131,8 @@ struct ist_info ist_info;
 struct cpuinfo_x86 boot_cpu_data __read_mostly;
 EXPORT_SYMBOL(boot_cpu_data);
 
+struct x86_addr_config bsp_addr_config;
+
 #if !defined(CONFIG_X86_PAE) || defined(CONFIG_X86_64)
 __visible unsigned long mmu_cr4_features __ro_after_init;
 #else
_


* [RFC][PATCH 10/34] x86/cpu/amd: Use new "address configuration" infrastructure
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (8 preceding siblings ...)
  2024-02-22 18:39 ` [RFC][PATCH 09/34] x86/cpu: Introduce address configuration structure Dave Hansen
@ 2024-02-22 18:39 ` Dave Hansen
  2024-02-22 18:39 ` [RFC][PATCH 11/34] x86/cpu/intel: Prepare MKTME for " Dave Hansen
                   ` (23 subsequent siblings)
  33 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

The AMD memory encryption code is currently one of the 'boot_cpu_data'
field twiddlers.  This has led to all kinds of interesting ordering
bugs at boot.

Move it away from random fiddling and over to 'bsp_addr_config'.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/kernel/cpu/amd.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff -puN arch/x86/kernel/cpu/amd.c~amd-phys_addr_reduction_bits arch/x86/kernel/cpu/amd.c
--- a/arch/x86/kernel/cpu/amd.c~amd-phys_addr_reduction_bits	2024-02-22 10:08:53.340644930 -0800
+++ b/arch/x86/kernel/cpu/amd.c	2024-02-22 10:08:53.344645087 -0800
@@ -622,7 +622,8 @@ static void early_detect_mem_encrypt(str
 		 * will be a value above 32-bits this is still done for
 		 * CONFIG_X86_32 so that accurate values are reported.
 		 */
-		c->x86_phys_bits -= (cpuid_ebx(0x8000001f) >> 6) & 0x3f;
+		bsp_addr_config.phys_addr_reduction_bits =
+			(cpuid_ebx(0x8000001f) >> 6) & 0x3f;
 
 		if (IS_ENABLED(CONFIG_X86_32))
 			goto clear_all;
_


* [RFC][PATCH 11/34] x86/cpu/intel: Prepare MKTME for "address configuration" infrastructure
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (9 preceding siblings ...)
  2024-02-22 18:39 ` [RFC][PATCH 10/34] x86/cpu/amd: Use new "address configuration" infrastructure Dave Hansen
@ 2024-02-22 18:39 ` Dave Hansen
  2024-02-23 11:33   ` Kirill A. Shutemov
  2024-02-22 18:39 ` [RFC][PATCH 12/34] x86/cpu/intel: Actually use "address configuration" infrastructure for MKTME Dave Hansen
                   ` (22 subsequent siblings)
  33 siblings, 1 reply; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

Intel also does memory encryption and also fiddles with the physical
address bits.  The detection code is currently called for *each* CPU,
but practically only does anything on the boot CPU because of
'mktme_status'.

Move it from the per-CPU ->c_init() function to ->c_bsp_init(), which
only ever gets called once.  This also necessitates moving
detect_tme() and its entourage around in the file.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/kernel/cpu/intel.c |  174 +++++++++++++++++++++---------------------
 1 file changed, 87 insertions(+), 87 deletions(-)

diff -puN arch/x86/kernel/cpu/intel.c~intel-move-TME-detection arch/x86/kernel/cpu/intel.c
--- a/arch/x86/kernel/cpu/intel.c~intel-move-TME-detection	2024-02-22 10:08:53.820663775 -0800
+++ b/arch/x86/kernel/cpu/intel.c	2024-02-22 10:08:53.824663932 -0800
@@ -324,9 +324,96 @@ static void early_init_intel(struct cpui
 		detect_ht_early(c);
 }
 
+#define MSR_IA32_TME_ACTIVATE		0x982
+
+/* Helpers to access TME_ACTIVATE MSR */
+#define TME_ACTIVATE_LOCKED(x)		(x & 0x1)
+#define TME_ACTIVATE_ENABLED(x)		(x & 0x2)
+
+#define TME_ACTIVATE_POLICY(x)		((x >> 4) & 0xf)	/* Bits 7:4 */
+#define TME_ACTIVATE_POLICY_AES_XTS_128	0
+
+#define TME_ACTIVATE_KEYID_BITS(x)	((x >> 32) & 0xf)	/* Bits 35:32 */
+
+#define TME_ACTIVATE_CRYPTO_ALGS(x)	((x >> 48) & 0xffff)	/* Bits 63:48 */
+#define TME_ACTIVATE_CRYPTO_AES_XTS_128	1
+
+/* Values for mktme_status (SW only construct) */
+#define MKTME_ENABLED			0
+#define MKTME_DISABLED			1
+#define MKTME_UNINITIALIZED		2
+static int mktme_status = MKTME_UNINITIALIZED;
+
+static void detect_tme(struct cpuinfo_x86 *c)
+{
+	u64 tme_activate, tme_policy, tme_crypto_algs;
+	int keyid_bits = 0, nr_keyids = 0;
+	static u64 tme_activate_cpu0 = 0;
+
+	rdmsrl(MSR_IA32_TME_ACTIVATE, tme_activate);
+
+	if (mktme_status != MKTME_UNINITIALIZED) {
+		if (tme_activate != tme_activate_cpu0) {
+			/* Broken BIOS? */
+			pr_err_once("x86/tme: configuration is inconsistent between CPUs\n");
+			pr_err_once("x86/tme: MKTME is not usable\n");
+			mktme_status = MKTME_DISABLED;
+
+			/* Proceed. We may need to exclude bits from x86_phys_bits. */
+		}
+	} else {
+		tme_activate_cpu0 = tme_activate;
+	}
+
+	if (!TME_ACTIVATE_LOCKED(tme_activate) || !TME_ACTIVATE_ENABLED(tme_activate)) {
+		pr_info_once("x86/tme: not enabled by BIOS\n");
+		mktme_status = MKTME_DISABLED;
+		return;
+	}
+
+	if (mktme_status != MKTME_UNINITIALIZED)
+		goto detect_keyid_bits;
+
+	pr_info("x86/tme: enabled by BIOS\n");
+
+	tme_policy = TME_ACTIVATE_POLICY(tme_activate);
+	if (tme_policy != TME_ACTIVATE_POLICY_AES_XTS_128)
+		pr_warn("x86/tme: Unknown policy is active: %#llx\n", tme_policy);
+
+	tme_crypto_algs = TME_ACTIVATE_CRYPTO_ALGS(tme_activate);
+	if (!(tme_crypto_algs & TME_ACTIVATE_CRYPTO_AES_XTS_128)) {
+		pr_err("x86/mktme: No known encryption algorithm is supported: %#llx\n",
+				tme_crypto_algs);
+		mktme_status = MKTME_DISABLED;
+	}
+detect_keyid_bits:
+	keyid_bits = TME_ACTIVATE_KEYID_BITS(tme_activate);
+	nr_keyids = (1UL << keyid_bits) - 1;
+	if (nr_keyids) {
+		pr_info_once("x86/mktme: enabled by BIOS\n");
+		pr_info_once("x86/mktme: %d KeyIDs available\n", nr_keyids);
+	} else {
+		pr_info_once("x86/mktme: disabled by BIOS\n");
+	}
+
+	if (mktme_status == MKTME_UNINITIALIZED) {
+		/* MKTME is usable */
+		mktme_status = MKTME_ENABLED;
+	}
+
+	/*
+	 * KeyID bits effectively lower the number of physical address
+	 * bits.  Update cpuinfo_x86::x86_phys_bits accordingly.
+	 */
+	c->x86_phys_bits -= keyid_bits;
+}
+
 static void bsp_init_intel(struct cpuinfo_x86 *c)
 {
 	resctrl_cpu_detect(c);
+
+	if (cpu_has(c, X86_FEATURE_TME))
+		detect_tme(c);
 }
 
 #ifdef CONFIG_X86_32
@@ -482,90 +569,6 @@ static void srat_detect_node(struct cpui
 #endif
 }
 
-#define MSR_IA32_TME_ACTIVATE		0x982
-
-/* Helpers to access TME_ACTIVATE MSR */
-#define TME_ACTIVATE_LOCKED(x)		(x & 0x1)
-#define TME_ACTIVATE_ENABLED(x)		(x & 0x2)
-
-#define TME_ACTIVATE_POLICY(x)		((x >> 4) & 0xf)	/* Bits 7:4 */
-#define TME_ACTIVATE_POLICY_AES_XTS_128	0
-
-#define TME_ACTIVATE_KEYID_BITS(x)	((x >> 32) & 0xf)	/* Bits 35:32 */
-
-#define TME_ACTIVATE_CRYPTO_ALGS(x)	((x >> 48) & 0xffff)	/* Bits 63:48 */
-#define TME_ACTIVATE_CRYPTO_AES_XTS_128	1
-
-/* Values for mktme_status (SW only construct) */
-#define MKTME_ENABLED			0
-#define MKTME_DISABLED			1
-#define MKTME_UNINITIALIZED		2
-static int mktme_status = MKTME_UNINITIALIZED;
-
-static void detect_tme(struct cpuinfo_x86 *c)
-{
-	u64 tme_activate, tme_policy, tme_crypto_algs;
-	int keyid_bits = 0, nr_keyids = 0;
-	static u64 tme_activate_cpu0 = 0;
-
-	rdmsrl(MSR_IA32_TME_ACTIVATE, tme_activate);
-
-	if (mktme_status != MKTME_UNINITIALIZED) {
-		if (tme_activate != tme_activate_cpu0) {
-			/* Broken BIOS? */
-			pr_err_once("x86/tme: configuration is inconsistent between CPUs\n");
-			pr_err_once("x86/tme: MKTME is not usable\n");
-			mktme_status = MKTME_DISABLED;
-
-			/* Proceed. We may need to exclude bits from x86_phys_bits. */
-		}
-	} else {
-		tme_activate_cpu0 = tme_activate;
-	}
-
-	if (!TME_ACTIVATE_LOCKED(tme_activate) || !TME_ACTIVATE_ENABLED(tme_activate)) {
-		pr_info_once("x86/tme: not enabled by BIOS\n");
-		mktme_status = MKTME_DISABLED;
-		return;
-	}
-
-	if (mktme_status != MKTME_UNINITIALIZED)
-		goto detect_keyid_bits;
-
-	pr_info("x86/tme: enabled by BIOS\n");
-
-	tme_policy = TME_ACTIVATE_POLICY(tme_activate);
-	if (tme_policy != TME_ACTIVATE_POLICY_AES_XTS_128)
-		pr_warn("x86/tme: Unknown policy is active: %#llx\n", tme_policy);
-
-	tme_crypto_algs = TME_ACTIVATE_CRYPTO_ALGS(tme_activate);
-	if (!(tme_crypto_algs & TME_ACTIVATE_CRYPTO_AES_XTS_128)) {
-		pr_err("x86/mktme: No known encryption algorithm is supported: %#llx\n",
-				tme_crypto_algs);
-		mktme_status = MKTME_DISABLED;
-	}
-detect_keyid_bits:
-	keyid_bits = TME_ACTIVATE_KEYID_BITS(tme_activate);
-	nr_keyids = (1UL << keyid_bits) - 1;
-	if (nr_keyids) {
-		pr_info_once("x86/mktme: enabled by BIOS\n");
-		pr_info_once("x86/mktme: %d KeyIDs available\n", nr_keyids);
-	} else {
-		pr_info_once("x86/mktme: disabled by BIOS\n");
-	}
-
-	if (mktme_status == MKTME_UNINITIALIZED) {
-		/* MKTME is usable */
-		mktme_status = MKTME_ENABLED;
-	}
-
-	/*
-	 * KeyID bits effectively lower the number of physical address
-	 * bits.  Update cpuinfo_x86::x86_phys_bits accordingly.
-	 */
-	c->x86_phys_bits -= keyid_bits;
-}
-
 static void init_cpuid_fault(struct cpuinfo_x86 *c)
 {
 	u64 msr;
@@ -702,9 +705,6 @@ static void init_intel(struct cpuinfo_x8
 
 	init_ia32_feat_ctl(c);
 
-	if (cpu_has(c, X86_FEATURE_TME))
-		detect_tme(c);
-
 	init_intel_misc_features(c);
 
 	split_lock_init();
_

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [RFC][PATCH 12/34] x86/cpu/intel: Actually use "address configuration" infrastructure for MKTME
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (10 preceding siblings ...)
  2024-02-22 18:39 ` [RFC][PATCH 11/34] x86/cpu/intel: Prepare MKTME for " Dave Hansen
@ 2024-02-22 18:39 ` Dave Hansen
  2024-02-23 11:41   ` Kirill A. Shutemov
  2024-02-22 18:39 ` [RFC][PATCH 13/34] x86/boot: Use address reduction config to handle erratum Dave Hansen
                   ` (21 subsequent siblings)
  33 siblings, 1 reply; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

Now that the TME detection is only called once at boot, stop twiddling
'boot_cpu_data' directly and move over to 'bsp_addr_config'.
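
Condensed, the producer/consumer split this series is building toward:

	/* Producer (this patch), at the end of detect_tme(): */
	bsp_addr_config.phys_addr_reduction_bits = keyid_bits;

	/* Consumer, once on the boot CPU, in get_cpu_address_sizes(): */
	c->x86_phys_bits -= bsp_addr_config.phys_addr_reduction_bits;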

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/kernel/cpu/intel.c |    7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff -puN arch/x86/kernel/cpu/intel.c~intel-addr-reduce arch/x86/kernel/cpu/intel.c
--- a/arch/x86/kernel/cpu/intel.c~intel-addr-reduce	2024-02-22 10:08:54.296682462 -0800
+++ b/arch/x86/kernel/cpu/intel.c	2024-02-22 10:08:54.296682462 -0800
@@ -401,11 +401,8 @@ detect_keyid_bits:
 		mktme_status = MKTME_ENABLED;
 	}
 
-	/*
-	 * KeyID bits effectively lower the number of physical address
-	 * bits.  Update cpuinfo_x86::x86_phys_bits accordingly.
-	 */
-	c->x86_phys_bits -= keyid_bits;
+	/* KeyID bits effectively lower the number of physical address bits */
+	bsp_addr_config.phys_addr_reduction_bits = keyid_bits;
 }
 
 static void bsp_init_intel(struct cpuinfo_x86 *c)
_

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [RFC][PATCH 13/34] x86/boot: Use address reduction config to handle erratum
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (11 preceding siblings ...)
  2024-02-22 18:39 ` [RFC][PATCH 12/34] x86/cpu/intel: Actually use "address configuration" infrastructure for MKTME Dave Hansen
@ 2024-02-22 18:39 ` Dave Hansen
  2024-02-22 18:39 ` [RFC][PATCH 14/34] x86/cpu: Remove default physical address space settings Dave Hansen
                   ` (20 subsequent siblings)
  33 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

Continue to chip away at sites that muck with ->x86_phys_bits.  This
one is an oldie but a goodie:

	af9c142de94e ("[PATCH] x86_64: Force correct address space size for MTRR on some 64bit Intel Xeons")

Evidently, the CPUs in question "report 40bit, but only have 36bits
of physical address space."  Since there's now a handy way to reduce
the amount of physical address bits, use that to reduce from 40->36.

This means there are now two (Intel) users of the address bits
reduction feature.  There is no way a 2005-era CPU will ever have TME,
but it is still nice to be more explicit about the possibility of a
collision.
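
Condensed, the new erratum handling amounts to:

	/* CPUID claims 40 physical bits; the caches only handle 36: */
	bsp_addr_config.phys_addr_reduction_bits = 4;	/* 40 - 4 == 36 */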

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/kernel/cpu/intel.c |   27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)

diff -puN arch/x86/kernel/cpu/intel.c~intel-phys-addr-errata arch/x86/kernel/cpu/intel.c
--- a/arch/x86/kernel/cpu/intel.c~intel-phys-addr-errata	2024-02-22 10:08:54.772701150 -0800
+++ b/arch/x86/kernel/cpu/intel.c	2024-02-22 10:08:54.776701306 -0800
@@ -242,11 +242,6 @@ static void early_init_intel(struct cpui
 		c->x86_cache_alignment = 128;
 #endif
 
-	/* CPUID workaround for 0F33/0F34 CPU */
-	if (c->x86 == 0xF && c->x86_model == 0x3
-	    && (c->x86_stepping == 0x3 || c->x86_stepping == 0x4))
-		c->x86_phys_bits = 36;
-
 	/*
 	 * c->x86_power is 8000_0007 edx. Bit 8 is TSC runs at constant rate
 	 * with P/T states and does not stop in deep C-states.
@@ -344,7 +339,7 @@ static void early_init_intel(struct cpui
 #define MKTME_UNINITIALIZED		2
 static int mktme_status = MKTME_UNINITIALIZED;
 
-static void detect_tme(struct cpuinfo_x86 *c)
+static int detect_tme(struct cpuinfo_x86 *c)
 {
 	u64 tme_activate, tme_policy, tme_crypto_algs;
 	int keyid_bits = 0, nr_keyids = 0;
@@ -368,7 +363,7 @@ static void detect_tme(struct cpuinfo_x8
 	if (!TME_ACTIVATE_LOCKED(tme_activate) || !TME_ACTIVATE_ENABLED(tme_activate)) {
 		pr_info_once("x86/tme: not enabled by BIOS\n");
 		mktme_status = MKTME_DISABLED;
-		return;
+		return 0;
 	}
 
 	if (mktme_status != MKTME_UNINITIALIZED)
@@ -401,16 +396,28 @@ detect_keyid_bits:
 		mktme_status = MKTME_ENABLED;
 	}
 
-	/* KeyID bits effectively lower the number of physical address bits */
-	bsp_addr_config.phys_addr_reduction_bits = keyid_bits;
+	return keyid_bits;
 }
 
 static void bsp_init_intel(struct cpuinfo_x86 *c)
 {
+	int keyid_bits = 0;
+
 	resctrl_cpu_detect(c);
 
 	if (cpu_has(c, X86_FEATURE_TME))
-		detect_tme(c);
+		keyid_bits = detect_tme(c);
+
+	/* KeyID bits effectively lower the number of physical address bits */
+	bsp_addr_config.phys_addr_reduction_bits = keyid_bits;
+
+	/* CPUID workaround for 0F33/0F34 CPU */
+	if (c->x86 == 0xF && c->x86_model == 0x3
+	    && (c->x86_stepping == 0x3 || c->x86_stepping == 0x4)) {
+		/* Warn if MKTME and this workaround collide: */
+		WARN_ON_ONCE(keyid_bits);
+		bsp_addr_config.phys_addr_reduction_bits = 4;
+	}
 }
 
 #ifdef CONFIG_X86_32
_

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [RFC][PATCH 14/34] x86/cpu: Remove default physical address space settings
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (12 preceding siblings ...)
  2024-02-22 18:39 ` [RFC][PATCH 13/34] x86/boot: Use address reduction config to handle erratum Dave Hansen
@ 2024-02-22 18:39 ` Dave Hansen
  2024-02-22 18:39 ` [RFC][PATCH 15/34] x86/cpu: Remove default x86_phys_bits assignment Dave Hansen
                   ` (19 subsequent siblings)
  33 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

At this point, the secondary CPUs' c->x86_phys_bits should be unused.  The boot
CPU values were also established long before reaching here.  This can only
serve to sow chaos.  Remove the twiddling.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/kernel/cpu/common.c |    2 --
 1 file changed, 2 deletions(-)

diff -puN arch/x86/kernel/cpu/common.c~no-default-bit-setting-phys arch/x86/kernel/cpu/common.c
--- a/arch/x86/kernel/cpu/common.c~no-default-bit-setting-phys	2024-02-22 10:08:55.256720151 -0800
+++ b/arch/x86/kernel/cpu/common.c	2024-02-22 10:08:55.260720308 -0800
@@ -1827,12 +1827,10 @@ static void identify_cpu(struct cpuinfo_
 	c->topo.l2c_id = BAD_APICID;
 #ifdef CONFIG_X86_64
 	c->x86_clflush_size = 64;
-	c->x86_phys_bits = 36;
 	c->x86_virt_bits = 48;
 #else
 	c->cpuid_level = -1;	/* CPUID not detected */
 	c->x86_clflush_size = 32;
-	c->x86_phys_bits = 32;
 	c->x86_virt_bits = 32;
 #endif
 	c->x86_cache_alignment = x86_clflush_size();
_

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [RFC][PATCH 15/34] x86/cpu: Remove default x86_phys_bits assignment
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (13 preceding siblings ...)
  2024-02-22 18:39 ` [RFC][PATCH 14/34] x86/cpu: Remove default physical address space settings Dave Hansen
@ 2024-02-22 18:39 ` Dave Hansen
  2024-02-22 18:39 ` [RFC][PATCH 16/34] x86/cpu: Move physical address limit out of cpuinfo_x86 Dave Hansen
                   ` (18 subsequent siblings)
  33 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

This was presumably needed for *some* reason at some point.  It is
useless now.  Remove it.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/kernel/setup.c |    1 -
 1 file changed, 1 deletion(-)

diff -puN arch/x86/kernel/setup.c~no-early-max_phys_bits-assign arch/x86/kernel/setup.c
--- a/arch/x86/kernel/setup.c~no-early-max_phys_bits-assign	2024-02-22 10:08:55.736738995 -0800
+++ b/arch/x86/kernel/setup.c	2024-02-22 10:08:55.740739152 -0800
@@ -752,7 +752,6 @@ void __init setup_arch(char **cmdline_p)
 	__flush_tlb_all();
 #else
 	printk(KERN_INFO "Command line: %s\n", boot_command_line);
-	boot_cpu_data.x86_phys_bits = MAX_PHYSMEM_BITS;
 #endif
 
 	/*
_

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [RFC][PATCH 16/34] x86/cpu: Move physical address limit out of cpuinfo_x86
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (14 preceding siblings ...)
  2024-02-22 18:39 ` [RFC][PATCH 15/34] x86/cpu: Remove default x86_phys_bits assignment Dave Hansen
@ 2024-02-22 18:39 ` Dave Hansen
  2024-02-22 18:39 ` [RFC][PATCH 17/34] x86/cpu: Move virtual " Dave Hansen
                   ` (17 subsequent siblings)
  33 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

There are no longer any direct references to cpuinfo_x86->x86_phys_bits.
The only remaining references are to 'boot_cpu_data' via the
x86_phys_bits() helper.

This means the farce that x86_phys_bits is per-cpu data can end.  Remove
it from cpuinfo_x86 and add it to a new global data structure:
'x86_config'.  (Better names welcome)
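
To illustrate the intended calling convention, a hypothetical consumer
(the mask computation below is illustrative, not part of this series):

	/* Mask covering every physical address bit the CPUs support: */
	u64 phys_mask = BIT_ULL(x86_phys_bits()) - 1;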

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/include/asm/processor.h |   16 ++++++++++++++--
 b/arch/x86/kernel/cpu/common.c     |   12 ++++++------
 b/arch/x86/kernel/setup.c          |    1 +
 3 files changed, 21 insertions(+), 8 deletions(-)

diff -puN arch/x86/include/asm/processor.h~no-cpu-data-phys_bits arch/x86/include/asm/processor.h
--- a/arch/x86/include/asm/processor.h~no-cpu-data-phys_bits	2024-02-22 10:08:56.220757996 -0800
+++ b/arch/x86/include/asm/processor.h	2024-02-22 10:08:56.228758310 -0800
@@ -118,7 +118,6 @@ struct cpuinfo_x86 {
 	__u32			vmx_capability[NVMXINTS];
 #endif
 	__u8			x86_virt_bits;
-	__u8			x86_phys_bits;
 	/* CPUID returned core id bits: */
 	__u8			x86_coreid_bits;
 	/* Max extended CPUID function supported: */
@@ -176,6 +175,19 @@ struct x86_addr_config {
 	u8 phys_addr_reduction_bits;
 };
 
+/*
+ * System-wide configuration that is shared by all processors.
+ *
+ * Built in early_cpu_init() on the boot CPU and never
+ * modified after that.
+ */
+struct x86_sys_config {
+	/* Physical address bits supported by all processors */
+	u8	phys_bits;
+};
+
+extern struct x86_sys_config x86_config;
+
 #define X86_VENDOR_INTEL	0
 #define X86_VENDOR_CYRIX	1
 #define X86_VENDOR_AMD		2
@@ -783,7 +795,7 @@ static inline void weak_wrmsr_fence(void
 
 static inline u8 x86_phys_bits(void)
 {
-	return boot_cpu_data.x86_phys_bits;
+	return x86_config.phys_bits;
 }
 
 static inline u8 x86_virt_bits(void)
diff -puN arch/x86/kernel/cpu/common.c~no-cpu-data-phys_bits arch/x86/kernel/cpu/common.c
--- a/arch/x86/kernel/cpu/common.c~no-cpu-data-phys_bits	2024-02-22 10:08:56.224758153 -0800
+++ b/arch/x86/kernel/cpu/common.c	2024-02-22 10:08:56.228758310 -0800
@@ -1107,27 +1107,27 @@ void get_cpu_address_sizes(struct cpuinf
 		cpuid(0x80000008, &eax, &ebx, &ecx, &edx);
 
 		c->x86_virt_bits = (eax >> 8) & 0xff;
-		c->x86_phys_bits = eax & 0xff;
+		x86_config.phys_bits = eax & 0xff;
 	} else {
 		if (IS_ENABLED(CONFIG_X86_64)) {
+			x86_config.phys_bits = 36;
 			c->x86_clflush_size = 64;
-			c->x86_phys_bits = 36;
 			c->x86_virt_bits = 48;
 		} else {
+			x86_config.phys_bits = 32;
 			c->x86_clflush_size = 32;
 			c->x86_virt_bits = 32;
-			c->x86_phys_bits = 32;
 
 			if (cpu_has(c, X86_FEATURE_PAE) ||
 			    cpu_has(c, X86_FEATURE_PSE36))
-				c->x86_phys_bits = 36;
+				x86_config.phys_bits = 36;
 		}
 	}
-	c->x86_cache_bits = c->x86_phys_bits;
+	c->x86_cache_bits = x86_config.phys_bits;
 	c->x86_cache_alignment = x86_clflush_size();
 
 	/* Do this last to avoid affecting ->x86_cache_bits. */
-	c->x86_phys_bits -= bsp_addr_config.phys_addr_reduction_bits;
+	x86_config.phys_bits -= bsp_addr_config.phys_addr_reduction_bits;
 }
 
 static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
diff -puN arch/x86/kernel/setup.c~no-cpu-data-phys_bits arch/x86/kernel/setup.c
--- a/arch/x86/kernel/setup.c~no-cpu-data-phys_bits	2024-02-22 10:08:56.224758153 -0800
+++ b/arch/x86/kernel/setup.c	2024-02-22 10:08:56.228758310 -0800
@@ -132,6 +132,7 @@ struct cpuinfo_x86 boot_cpu_data __read_
 EXPORT_SYMBOL(boot_cpu_data);
 
 struct x86_addr_config bsp_addr_config;
+struct x86_sys_config x86_config __read_mostly;
 
 #if !defined(CONFIG_X86_PAE) || defined(CONFIG_X86_64)
 __visible unsigned long mmu_cr4_features __ro_after_init;
_

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [RFC][PATCH 17/34] x86/cpu: Move virtual address limit out of cpuinfo_x86
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (15 preceding siblings ...)
  2024-02-22 18:39 ` [RFC][PATCH 16/34] x86/cpu: Move physical address limit out of cpuinfo_x86 Dave Hansen
@ 2024-02-22 18:39 ` Dave Hansen
  2024-02-22 18:39 ` [RFC][PATCH 18/34] x86/cpu/centaur: Move cache alignment override to BSP init Dave Hansen
                   ` (16 subsequent siblings)
  33 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

There are no longer any direct references to
cpuinfo_x86->x86_virt_bits.  The only remaining references are to
'boot_cpu_data' via the x86_virt_bits() helper.

This means the farce that x86_virt_bits is per-cpu data can end.
Remove it from cpuinfo_x86 and add it to 'x86_config'.
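
The maccess.c hunk below leans on canonical address checking.  For
reference, a simplified sketch of what __is_canonical_address() does
with the 'virt_bits' value (the real helper lives in asm/page.h):

	static inline bool is_canonical(u64 vaddr, u8 vaddr_bits)
	{
		/* Sign-extend from bit (vaddr_bits - 1), then compare: */
		return ((s64)vaddr << (64 - vaddr_bits) >> (64 - vaddr_bits)) == vaddr;
	}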

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/include/asm/processor.h |    6 +++---
 b/arch/x86/kernel/cpu/common.c     |    8 +++-----
 b/arch/x86/mm/maccess.c            |    9 +++++----
 3 files changed, 11 insertions(+), 12 deletions(-)

diff -puN arch/x86/include/asm/processor.h~no-cpu-data-virt_bits arch/x86/include/asm/processor.h
--- a/arch/x86/include/asm/processor.h~no-cpu-data-virt_bits	2024-02-22 10:08:56.748778725 -0800
+++ b/arch/x86/include/asm/processor.h	2024-02-22 10:08:56.752778882 -0800
@@ -117,7 +117,6 @@ struct cpuinfo_x86 {
 #ifdef CONFIG_X86_VMX_FEATURE_NAMES
 	__u32			vmx_capability[NVMXINTS];
 #endif
-	__u8			x86_virt_bits;
 	/* CPUID returned core id bits: */
 	__u8			x86_coreid_bits;
 	/* Max extended CPUID function supported: */
@@ -182,8 +181,9 @@ struct x86_addr_config {
  * modified after that.
  */
 struct x86_sys_config {
-	/* Physical address bits supported by all processors */
+	/* Address bits supported by all processors */
 	u8	phys_bits;
+	u8	virt_bits;
 };
 
 extern struct x86_sys_config x86_config;
@@ -800,7 +800,7 @@ static inline u8 x86_phys_bits(void)
 
 static inline u8 x86_virt_bits(void)
 {
-	return boot_cpu_data.x86_virt_bits;
+	return x86_config.virt_bits;
 }
 
 static inline u8 x86_clflush_size(void)
diff -puN arch/x86/kernel/cpu/common.c~no-cpu-data-virt_bits arch/x86/kernel/cpu/common.c
--- a/arch/x86/kernel/cpu/common.c~no-cpu-data-virt_bits	2024-02-22 10:08:56.748778725 -0800
+++ b/arch/x86/kernel/cpu/common.c	2024-02-22 10:08:56.752778882 -0800
@@ -1106,17 +1106,17 @@ void get_cpu_address_sizes(struct cpuinf
 	if (vp_bits_from_cpuid) {
 		cpuid(0x80000008, &eax, &ebx, &ecx, &edx);
 
-		c->x86_virt_bits = (eax >> 8) & 0xff;
+		x86_config.virt_bits = (eax >> 8) & 0xff;
 		x86_config.phys_bits = eax & 0xff;
 	} else {
 		if (IS_ENABLED(CONFIG_X86_64)) {
 			x86_config.phys_bits = 36;
+			x86_config.virt_bits = 48;
 			c->x86_clflush_size = 64;
-			c->x86_virt_bits = 48;
 		} else {
 			x86_config.phys_bits = 32;
+			x86_config.virt_bits = 32;
 			c->x86_clflush_size = 32;
-			c->x86_virt_bits = 32;
 
 			if (cpu_has(c, X86_FEATURE_PAE) ||
 			    cpu_has(c, X86_FEATURE_PSE36))
@@ -1827,11 +1827,9 @@ static void identify_cpu(struct cpuinfo_
 	c->topo.l2c_id = BAD_APICID;
 #ifdef CONFIG_X86_64
 	c->x86_clflush_size = 64;
-	c->x86_virt_bits = 48;
 #else
 	c->cpuid_level = -1;	/* CPUID not detected */
 	c->x86_clflush_size = 32;
-	c->x86_virt_bits = 32;
 #endif
 	c->x86_cache_alignment = x86_clflush_size();
 	memset(&c->x86_capability, 0, sizeof(c->x86_capability));
diff -puN arch/x86/mm/maccess.c~no-cpu-data-virt_bits arch/x86/mm/maccess.c
--- a/arch/x86/mm/maccess.c~no-cpu-data-virt_bits	2024-02-22 10:08:56.752778882 -0800
+++ b/arch/x86/mm/maccess.c	2024-02-22 10:08:56.752778882 -0800
@@ -16,11 +16,12 @@ bool copy_from_kernel_nofault_allowed(co
 		return false;
 
 	/*
-	 * Allow everything during early boot before 'x86_virt_bits'
-	 * is initialized.  Needed for instruction decoding in early
-	 * exception handlers.
+	 * Allow everything during early boot before 'virt_bits' is
+	 * initialized.  Needed for instruction decoding in early
+	 * exception handlers.  Avoid the helper because it may do
+	 * error checking for being uninitialized.
 	 */
-	if (!x86_virt_bits())
+	if (!x86_config.virt_bits)
 		return true;
 
 	return __is_canonical_address(vaddr, x86_virt_bits());
_

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [RFC][PATCH 18/34] x86/cpu/centaur: Move cache alignment override to BSP init
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (16 preceding siblings ...)
  2024-02-22 18:39 ` [RFC][PATCH 17/34] x86/cpu: Move virtual " Dave Hansen
@ 2024-02-22 18:39 ` Dave Hansen
  2024-02-22 18:39 ` [RFC][PATCH 19/34] x86/cpu: Introduce cache alignment multiplier Dave Hansen
                   ` (15 subsequent siblings)
  33 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

The 'x86_cache_alignment' is yet another system-wide value which is stored
per-cpu.  Stop initializing it per-cpu on (some) Centaur CPUs and move it
to a new BSP init function.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/kernel/cpu/centaur.c |   15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff -puN arch/x86/kernel/cpu/centaur.c~centaur-c_early_init arch/x86/kernel/cpu/centaur.c
--- a/arch/x86/kernel/cpu/centaur.c~centaur-c_early_init	2024-02-22 10:08:57.268799139 -0800
+++ b/arch/x86/kernel/cpu/centaur.c	2024-02-22 10:08:57.268799139 -0800
@@ -61,12 +61,8 @@ static void init_c3(struct cpuinfo_x86 *
 	if (c->x86_model >= 6 && c->x86_model < 9)
 		set_cpu_cap(c, X86_FEATURE_3DNOW);
 #endif
-	if (c->x86 == 0x6 && c->x86_model >= 0xf) {
-		c->x86_cache_alignment = x86_clflush_size() * 2;
-		set_cpu_cap(c, X86_FEATURE_REP_GOOD);
-	}
-
-	if (c->x86 >= 7)
+	if ((c->x86 == 0x6 && c->x86_model >= 0xf) ||
+	    (c->x86 >= 7))
 		set_cpu_cap(c, X86_FEATURE_REP_GOOD);
 }
 
@@ -217,6 +213,12 @@ static void init_centaur(struct cpuinfo_
 	init_ia32_feat_ctl(c);
 }
 
+static void bsp_init_centaur(struct cpuinfo_x86 *c)
+{
+	if (c->x86 == 0x6 && c->x86_model >= 0xf)
+		c->x86_cache_alignment = x86_clflush_size() * 2;
+}
+
 #ifdef CONFIG_X86_32
 static unsigned int
 centaur_size_cache(struct cpuinfo_x86 *c, unsigned int size)
@@ -241,6 +243,7 @@ static const struct cpu_dev centaur_cpu_
 	.c_vendor	= "Centaur",
 	.c_ident	= { "CentaurHauls" },
 	.c_early_init	= early_init_centaur,
+	.c_bsp_init	= bsp_init_centaur,
 	.c_init		= init_centaur,
 #ifdef CONFIG_X86_32
 	.legacy_cache_size = centaur_size_cache,
_

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [RFC][PATCH 19/34] x86/cpu: Introduce cache alignment multiplier
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (17 preceding siblings ...)
  2024-02-22 18:39 ` [RFC][PATCH 18/34] x86/cpu/centaur: Move cache alignment override to BSP init Dave Hansen
@ 2024-02-22 18:39 ` Dave Hansen
  2024-02-22 18:39 ` [RFC][PATCH 20/34] x86/cpu: Remove superfluous cache alignment assignments Dave Hansen
                   ` (14 subsequent siblings)
  33 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

The kernel currently tracks two different but similar-sounding values:

	x86_clflush_size
and
	x86_cache_alignment

x86_clflush_size is literally the size of the cacheline which is also
the architectural granularity at which CLFLUSH operates.

But some weirdo CPUs like the Pentium 4 do some operations across
two cachelines.  Their CLFLUSH still works at 'x86_clflush_size'
but they need some operations aligned to that double cacheline to
work well, thus 'x86_cache_alignment'.

Introduce and use a 'bsp_addr_config' field for these systems.
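
For the Netburst case handled below, the arithmetic is simply:

	/* CLFLUSH granularity is 64 bytes, but alignment wants 128: */
	bsp_addr_config.cache_align_mult = 2;	/* 64 * 2 == 128 */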

Note that the Intel Family 15 code actually did this in two different
sites, one for 32-bit and one for 64-bit.  Unify them.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/include/asm/processor.h |    7 +++++++
 b/arch/x86/kernel/cpu/centaur.c    |    2 +-
 b/arch/x86/kernel/cpu/common.c     |    3 +++
 b/arch/x86/kernel/cpu/intel.c      |   10 ++++------
 4 files changed, 15 insertions(+), 7 deletions(-)

diff -puN arch/x86/include/asm/processor.h~x86_cache_alignment_mult arch/x86/include/asm/processor.h
--- a/arch/x86/include/asm/processor.h~x86_cache_alignment_mult	2024-02-22 10:08:57.732817356 -0800
+++ b/arch/x86/include/asm/processor.h	2024-02-22 10:08:57.740817669 -0800
@@ -172,6 +172,13 @@ struct x86_addr_config {
 	 * platforms that use memory encryption.
 	 */
 	u8 phys_addr_reduction_bits;
+
+	/*
+	 * "x86_clflush_size" is the size of an actual cacheline.
+	 * Allow systems to specify a multiplier where alignment
+	 * will take place at a more coarse granularity.
+	 */
+	u8 cache_align_mult;
 };
 
 /*
diff -puN arch/x86/kernel/cpu/centaur.c~x86_cache_alignment_mult arch/x86/kernel/cpu/centaur.c
--- a/arch/x86/kernel/cpu/centaur.c~x86_cache_alignment_mult	2024-02-22 10:08:57.732817356 -0800
+++ b/arch/x86/kernel/cpu/centaur.c	2024-02-22 10:08:57.740817669 -0800
@@ -216,7 +216,7 @@ static void init_centaur(struct cpuinfo_
 static void bsp_init_centaur(struct cpuinfo_x86 *c)
 {
 	if (c->x86 == 0x6 && c->x86_model >= 0xf)
-		c->x86_cache_alignment = x86_clflush_size() * 2;
+		bsp_addr_config.cache_align_mult = 2;
 }
 
 #ifdef CONFIG_X86_32
diff -puN arch/x86/kernel/cpu/common.c~x86_cache_alignment_mult arch/x86/kernel/cpu/common.c
--- a/arch/x86/kernel/cpu/common.c~x86_cache_alignment_mult	2024-02-22 10:08:57.736817513 -0800
+++ b/arch/x86/kernel/cpu/common.c	2024-02-22 10:08:57.740817669 -0800
@@ -1124,7 +1124,10 @@ void get_cpu_address_sizes(struct cpuinf
 		}
 	}
 	c->x86_cache_bits = x86_config.phys_bits;
+
 	c->x86_cache_alignment = x86_clflush_size();
+	if (bsp_addr_config.cache_align_mult)
+		c->x86_cache_alignment *= bsp_addr_config.cache_align_mult;
 
 	/* Do this last to avoid affecting ->x86_cache_bits. */
 	x86_config.phys_bits -= bsp_addr_config.phys_addr_reduction_bits;
diff -puN arch/x86/kernel/cpu/intel.c~x86_cache_alignment_mult arch/x86/kernel/cpu/intel.c
--- a/arch/x86/kernel/cpu/intel.c~x86_cache_alignment_mult	2024-02-22 10:08:57.736817513 -0800
+++ b/arch/x86/kernel/cpu/intel.c	2024-02-22 10:08:57.740817669 -0800
@@ -236,10 +236,6 @@ static void early_init_intel(struct cpui
 
 #ifdef CONFIG_X86_64
 	set_cpu_cap(c, X86_FEATURE_SYSENTER32);
-#else
-	/* Netburst reports 64 bytes clflush size, but does IO in 128 bytes */
-	if (c->x86 == 15 && c->x86_cache_alignment == 64)
-		c->x86_cache_alignment = 128;
 #endif
 
 	/*
@@ -418,6 +414,10 @@ static void bsp_init_intel(struct cpuinf
 		WARN_ON_ONCE(keyid_bits);
 		bsp_addr_config.phys_addr_reduction_bits = 4;
 	}
+
+	/* Netburst reports 64 bytes clflush size, but does IO in 128 bytes */
+	if (c->x86 == 15)
+		bsp_addr_config.cache_align_mult = 2;
 }
 
 #ifdef CONFIG_X86_32
@@ -659,8 +659,6 @@ static void init_intel(struct cpuinfo_x8
 		set_cpu_bug(c, X86_BUG_MONITOR);
 
 #ifdef CONFIG_X86_64
-	if (c->x86 == 15)
-		c->x86_cache_alignment = x86_clflush_size() * 2;
 	if (c->x86 == 6)
 		set_cpu_cap(c, X86_FEATURE_REP_GOOD);
 #else
_

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [RFC][PATCH 20/34] x86/cpu: Remove superfluous cache alignment assignments
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (18 preceding siblings ...)
  2024-02-22 18:39 ` [RFC][PATCH 19/34] x86/cpu: Introduce cache alignment multiplier Dave Hansen
@ 2024-02-22 18:39 ` Dave Hansen
  2024-02-22 18:39 ` [RFC][PATCH 21/34] x86/cpu: Consolidate CLFLUSH size setting Dave Hansen
                   ` (13 subsequent siblings)
  33 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

Now that the boot CPU's ->x86_cache_alignment is authoritative
and is consistently established, these other copies only serve to
clobber it needlessly.  Remove them.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/kernel/cpu/common.c |    2 --
 1 file changed, 2 deletions(-)

diff -puN arch/x86/kernel/cpu/common.c~dup-cache-alignment arch/x86/kernel/cpu/common.c
--- a/arch/x86/kernel/cpu/common.c~dup-cache-alignment	2024-02-22 10:08:58.260838084 -0800
+++ b/arch/x86/kernel/cpu/common.c	2024-02-22 10:08:58.260838084 -0800
@@ -956,7 +956,6 @@ void cpu_detect(struct cpuinfo_x86 *c)
 		if (cap0 & (1<<19)) {
 			if (c == &boot_cpu_data)
 				c->x86_clflush_size = ((misc >> 8) & 0xff) * 8;
-			c->x86_cache_alignment = x86_clflush_size();
 		}
 	}
 }
@@ -1834,7 +1833,6 @@ static void identify_cpu(struct cpuinfo_
 	c->cpuid_level = -1;	/* CPUID not detected */
 	c->x86_clflush_size = 32;
 #endif
-	c->x86_cache_alignment = x86_clflush_size();
 	memset(&c->x86_capability, 0, sizeof(c->x86_capability));
 #ifdef CONFIG_X86_VMX_FEATURE_NAMES
 	memset(&c->vmx_capability, 0, sizeof(c->vmx_capability));
_

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [RFC][PATCH 21/34] x86/cpu: Consolidate CLFLUSH size setting
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (19 preceding siblings ...)
  2024-02-22 18:39 ` [RFC][PATCH 20/34] x86/cpu: Remove superfluous cache alignment assignments Dave Hansen
@ 2024-02-22 18:39 ` Dave Hansen
  2024-02-22 18:39 ` [RFC][PATCH 22/34] x86/cpu: Move CLFLUSH size into global config Dave Hansen
                   ` (12 subsequent siblings)
  33 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

Right now, cpu_detect() establishes ->x86_clflush_size on each CPU
that supports CLFLUSH.  As you might have guessed, that per-cpu
value is no longer used.

Move the setting to get_cpu_address_sizes() which is only called
on the boot CPU.  Consolidate all of the ->x86_clflush_size logic
into a single helper.

This removes the (cap0 & (1<<19)) check (aka X86_FEATURE_CLFLUSH)
before reading the CLFLUSH size out of CPUID[1].  Those bits are
presumably either 0 or contain actual data about CLFLUSH.
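
For reference, the CPUID encoding that detect_clflush_size() parses:

	/*
	 * CPUID(1).EBX[15:8] is the CLFLUSH line size in 8-byte units,
	 * e.g. a value of 8 means 8 * 8 == 64-byte cachelines:
	 */
	clflush_size = ((ebx >> 8) & 0xff) * 8;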

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/kernel/cpu/common.c |   35 ++++++++++++++++++++++++-----------
 1 file changed, 24 insertions(+), 11 deletions(-)

diff -puN arch/x86/kernel/cpu/common.c~later-clflush_size-detect arch/x86/kernel/cpu/common.c
--- a/arch/x86/kernel/cpu/common.c~later-clflush_size-detect	2024-02-22 10:08:58.728856457 -0800
+++ b/arch/x86/kernel/cpu/common.c	2024-02-22 10:08:58.732856614 -0800
@@ -935,6 +935,27 @@ static void get_cpu_vendor(struct cpuinf
 	this_cpu = &default_cpu;
 }
 
+static u16 detect_clflush_size(struct cpuinfo_x86 *c)
+{
+	u16 clflush_size = 0;
+
+	if (c->cpuid_level >= 0x00000001) {
+		u32 eax, ebx, ecx, edx;
+
+		cpuid(0x00000001, &eax, &ebx, &ecx, &edx);
+
+		clflush_size = ((ebx >> 8) & 0xff) * 8;
+	}
+
+	if (clflush_size)
+		return clflush_size;
+
+	/* Return some mostly sane defaults: */
+	if (IS_ENABLED(CONFIG_X86_64))
+		return 64;
+	return 32;
+}
+
 void cpu_detect(struct cpuinfo_x86 *c)
 {
 	/* Get vendor name */
@@ -952,11 +973,6 @@ void cpu_detect(struct cpuinfo_x86 *c)
 		c->x86		= x86_family(tfms);
 		c->x86_model	= x86_model(tfms);
 		c->x86_stepping	= x86_stepping(tfms);
-
-		if (cap0 & (1<<19)) {
-			if (c == &boot_cpu_data)
-				c->x86_clflush_size = ((misc >> 8) & 0xff) * 8;
-		}
 	}
 }
 
@@ -1111,17 +1127,17 @@ void get_cpu_address_sizes(struct cpuinf
 		if (IS_ENABLED(CONFIG_X86_64)) {
 			x86_config.phys_bits = 36;
 			x86_config.virt_bits = 48;
-			c->x86_clflush_size = 64;
 		} else {
 			x86_config.phys_bits = 32;
 			x86_config.virt_bits = 32;
-			c->x86_clflush_size = 32;
 
 			if (cpu_has(c, X86_FEATURE_PAE) ||
 			    cpu_has(c, X86_FEATURE_PSE36))
 				x86_config.phys_bits = 36;
 		}
 	}
+	c->x86_clflush_size = detect_clflush_size(c);
+
 	c->x86_cache_bits = x86_config.phys_bits;
 
 	c->x86_cache_alignment = x86_clflush_size();
@@ -1827,11 +1843,8 @@ static void identify_cpu(struct cpuinfo_
 	c->topo.cu_id = 0xff;
 	c->topo.llc_id = BAD_APICID;
 	c->topo.l2c_id = BAD_APICID;
-#ifdef CONFIG_X86_64
-	c->x86_clflush_size = 64;
-#else
+#ifndef CONFIG_X86_64
 	c->cpuid_level = -1;	/* CPUID not detected */
-	c->x86_clflush_size = 32;
 #endif
 	memset(&c->x86_capability, 0, sizeof(c->x86_capability));
 #ifdef CONFIG_X86_VMX_FEATURE_NAMES
_

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [RFC][PATCH 22/34] x86/cpu: Move CLFLUSH size into global config
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (20 preceding siblings ...)
  2024-02-22 18:39 ` [RFC][PATCH 21/34] x86/cpu: Consolidate CLFLUSH size setting Dave Hansen
@ 2024-02-22 18:39 ` Dave Hansen
  2024-02-22 18:39 ` [RFC][PATCH 23/34] x86/cpu: Move cache alignment configuration to global struct Dave Hansen
                   ` (11 subsequent siblings)
  33 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

x86_clflush_size is maintained per-cpu despite being global configuration.

Move it to 'x86_config'.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/include/asm/processor.h |    4 ++--
 b/arch/x86/kernel/cpu/common.c     |    2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff -puN arch/x86/include/asm/processor.h~bsp-clflush_size arch/x86/include/asm/processor.h
--- a/arch/x86/include/asm/processor.h~bsp-clflush_size	2024-02-22 10:08:59.208875301 -0800
+++ b/arch/x86/include/asm/processor.h	2024-02-22 10:08:59.212875458 -0800
@@ -148,7 +148,6 @@ struct cpuinfo_x86 {
 	u64			ppin;
 	/* cpuid returned max cores value: */
 	u16			x86_max_cores;
-	u16			x86_clflush_size;
 	/* number of cores as seen by the OS: */
 	u16			booted_cores;
 	/* Index into per_cpu list: */
@@ -191,6 +190,7 @@ struct x86_sys_config {
 	/* Address bits supported by all processors */
 	u8	phys_bits;
 	u8	virt_bits;
+	u16	clflush_size;
 };
 
 extern struct x86_sys_config x86_config;
@@ -812,7 +812,7 @@ static inline u8 x86_virt_bits(void)
 
 static inline u8 x86_clflush_size(void)
 {
-	return boot_cpu_data.x86_clflush_size;
+	return x86_config.clflush_size;
 }
 
 #endif /* _ASM_X86_PROCESSOR_H */
diff -puN arch/x86/kernel/cpu/common.c~bsp-clflush_size arch/x86/kernel/cpu/common.c
--- a/arch/x86/kernel/cpu/common.c~bsp-clflush_size	2024-02-22 10:08:59.208875301 -0800
+++ b/arch/x86/kernel/cpu/common.c	2024-02-22 10:08:59.212875458 -0800
@@ -1136,7 +1136,7 @@ void get_cpu_address_sizes(struct cpuinf
 				x86_config.phys_bits = 36;
 		}
 	}
-	c->x86_clflush_size = detect_clflush_size(c);
+	x86_config.clflush_size = detect_clflush_size(c);
 
 	c->x86_cache_bits = x86_config.phys_bits;
 
_

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [RFC][PATCH 23/34] x86/cpu: Move cache alignment configuration to global struct
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (21 preceding siblings ...)
  2024-02-22 18:39 ` [RFC][PATCH 22/34] x86/cpu: Move CLFLUSH size into global config Dave Hansen
@ 2024-02-22 18:39 ` Dave Hansen
  2024-02-22 18:39 ` [RFC][PATCH 24/34] x86/cpu: Establish 'min_cache_bits' configuration Dave Hansen
                   ` (10 subsequent siblings)
  33 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

It is a tale as old as time: x86_cache_alignment is established and
stored per-cpu despite being system-wide.  Move it to the global
configuration structure.

The other values have received _new_ wrappers.  But this one already
had cache_line_size().  Just use that wrapper.
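
A typical consumer pattern elsewhere in the tree, for illustration
(not part of this patch):

	/* Round a buffer size up to whole cachelines: */
	buf = kmalloc(ALIGN(len, cache_line_size()), GFP_KERNEL);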

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/include/asm/processor.h |    4 ++--
 b/arch/x86/kernel/cpu/common.c     |    4 ++--
 b/arch/x86/kernel/cpu/proc.c       |    2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff -puN arch/x86/include/asm/processor.h~x86_config-x86_cache_alignment arch/x86/include/asm/processor.h
--- a/arch/x86/include/asm/processor.h~x86_config-x86_cache_alignment	2024-02-22 10:08:59.700894617 -0800
+++ b/arch/x86/include/asm/processor.h	2024-02-22 10:08:59.708894931 -0800
@@ -137,7 +137,6 @@ struct cpuinfo_x86 {
 	struct cpuinfo_topology	topo;
 	/* in KB - valid for CPUS which support this call: */
 	unsigned int		x86_cache_size;
-	int			x86_cache_alignment;	/* In bytes */
 	/* Cache QoS architectural values, valid only on the BSP: */
 	int			x86_cache_max_rmid;	/* max index */
 	int			x86_cache_occ_scale;	/* scale to bytes */
@@ -191,6 +190,7 @@ struct x86_sys_config {
 	u8	phys_bits;
 	u8	virt_bits;
 	u16	clflush_size;
+	int	cache_alignment; /* in bytes */
 };
 
 extern struct x86_sys_config x86_config;
@@ -229,7 +229,7 @@ DECLARE_PER_CPU_READ_MOSTLY(struct cpuin
 
 extern const struct seq_operations cpuinfo_op;
 
-#define cache_line_size()	(boot_cpu_data.x86_cache_alignment)
+#define cache_line_size()	(x86_config.cache_alignment)
 
 extern void cpu_detect(struct cpuinfo_x86 *c);
 
diff -puN arch/x86/kernel/cpu/common.c~x86_config-x86_cache_alignment arch/x86/kernel/cpu/common.c
--- a/arch/x86/kernel/cpu/common.c~x86_config-x86_cache_alignment	2024-02-22 10:08:59.704894774 -0800
+++ b/arch/x86/kernel/cpu/common.c	2024-02-22 10:08:59.708894931 -0800
@@ -1140,9 +1140,9 @@ void get_cpu_address_sizes(struct cpuinf
 
 	c->x86_cache_bits = x86_config.phys_bits;
 
-	c->x86_cache_alignment = x86_clflush_size();
+	x86_config.cache_alignment = x86_clflush_size();
 	if (bsp_addr_config.cache_align_mult)
-		c->x86_cache_alignment *= bsp_addr_config.cache_align_mult;
+		x86_config.cache_alignment *= bsp_addr_config.cache_align_mult;
 
 	/* Do this last to avoid affecting ->x86_cache_bits. */
 	x86_config.phys_bits -= bsp_addr_config.phys_addr_reduction_bits;
diff -puN arch/x86/kernel/cpu/proc.c~x86_config-x86_cache_alignment arch/x86/kernel/cpu/proc.c
--- a/arch/x86/kernel/cpu/proc.c~x86_config-x86_cache_alignment	2024-02-22 10:08:59.704894774 -0800
+++ b/arch/x86/kernel/cpu/proc.c	2024-02-22 10:08:59.708894931 -0800
@@ -131,7 +131,7 @@ static int show_cpuinfo(struct seq_file
 		seq_printf(m, "TLB size\t: %d 4K pages\n", c->x86_tlbsize);
 #endif
 	seq_printf(m, "clflush size\t: %u\n", x86_clflush_size());
-	seq_printf(m, "cache_alignment\t: %d\n", c->x86_cache_alignment);
+	seq_printf(m, "cache_alignment\t: %d\n", x86_config.cache_alignment);
 	seq_printf(m, "address sizes\t: %u bits physical, %u bits virtual\n",
 		   x86_phys_bits(), x86_virt_bits());
 
_

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [RFC][PATCH 24/34] x86/cpu: Establish 'min_cache_bits' configuration
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (22 preceding siblings ...)
  2024-02-22 18:39 ` [RFC][PATCH 23/34] x86/cpu: Move cache alignment configuration to global struct Dave Hansen
@ 2024-02-22 18:39 ` Dave Hansen
  2024-02-22 18:39 ` [RFC][PATCH 25/34] x86/cpu: Move cache bits to global config Dave Hansen
                   ` (9 subsequent siblings)
  33 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

Continue moving towards a setup where code never tweaks 'boot_cpu_data'.

Code must establish its intent in 'x86_addr_config'; later code then
uses that information to establish the system-wide config.

The L1TF mitigation wants to tweak x86_cache_bits.  Let it do this,
but move the code away from bugs.c so that it can easily be called
earlier in boot.
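
The consumer side in get_cpu_address_sizes() then becomes a simple
clamp (condensed from the common.c hunk below):

	c->x86_cache_bits = x86_config.phys_bits;
	if (c->x86_cache_bits < bsp_addr_config.min_cache_bits)
		c->x86_cache_bits = bsp_addr_config.min_cache_bits;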

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/include/asm/processor.h |    6 +++++
 b/arch/x86/kernel/cpu/bugs.c       |   41 -------------------------------------
 b/arch/x86/kernel/cpu/common.c     |    2 +
 b/arch/x86/kernel/cpu/intel.c      |   40 ++++++++++++++++++++++++++++++++++++
 4 files changed, 48 insertions(+), 41 deletions(-)

diff -puN arch/x86/include/asm/processor.h~bsp-min_cache_bits arch/x86/include/asm/processor.h
--- a/arch/x86/include/asm/processor.h~bsp-min_cache_bits	2024-02-22 10:09:00.220915031 -0800
+++ b/arch/x86/include/asm/processor.h	2024-02-22 10:09:00.224915188 -0800
@@ -177,6 +177,12 @@ struct x86_addr_config {
 	 * will take place at a more coarse granularity.
 	 */
 	u8 cache_align_mult;
+
+	/*
+	 * Specify a floor for the number of bits that the CPU
+	 * caches comprehend.  Used only for L1TF mitigation.
+	 */
+	u8 min_cache_bits;
 };
 
 /*
diff -puN arch/x86/kernel/cpu/bugs.c~bsp-min_cache_bits arch/x86/kernel/cpu/bugs.c
--- a/arch/x86/kernel/cpu/bugs.c~bsp-min_cache_bits	2024-02-22 10:09:00.220915031 -0800
+++ b/arch/x86/kernel/cpu/bugs.c	2024-02-22 10:09:00.224915188 -0800
@@ -2237,45 +2237,6 @@ EXPORT_SYMBOL_GPL(l1tf_mitigation);
 enum vmx_l1d_flush_state l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_AUTO;
 EXPORT_SYMBOL_GPL(l1tf_vmx_mitigation);
 
-/*
- * These CPUs all support 44bits physical address space internally in the
- * cache but CPUID can report a smaller number of physical address bits.
- *
- * The L1TF mitigation uses the top most address bit for the inversion of
- * non present PTEs. When the installed memory reaches into the top most
- * address bit due to memory holes, which has been observed on machines
- * which report 36bits physical address bits and have 32G RAM installed,
- * then the mitigation range check in l1tf_select_mitigation() triggers.
- * This is a false positive because the mitigation is still possible due to
- * the fact that the cache uses 44bit internally. Use the cache bits
- * instead of the reported physical bits and adjust them on the affected
- * machines to 44bit if the reported bits are less than 44.
- */
-static void override_cache_bits(struct cpuinfo_x86 *c)
-{
-	if (c->x86 != 6)
-		return;
-
-	switch (c->x86_model) {
-	case INTEL_FAM6_NEHALEM:
-	case INTEL_FAM6_WESTMERE:
-	case INTEL_FAM6_SANDYBRIDGE:
-	case INTEL_FAM6_IVYBRIDGE:
-	case INTEL_FAM6_HASWELL:
-	case INTEL_FAM6_HASWELL_L:
-	case INTEL_FAM6_HASWELL_G:
-	case INTEL_FAM6_BROADWELL:
-	case INTEL_FAM6_BROADWELL_G:
-	case INTEL_FAM6_SKYLAKE_L:
-	case INTEL_FAM6_SKYLAKE:
-	case INTEL_FAM6_KABYLAKE_L:
-	case INTEL_FAM6_KABYLAKE:
-		if (c->x86_cache_bits < 44)
-			c->x86_cache_bits = 44;
-		break;
-	}
-}
-
 static void __init l1tf_select_mitigation(void)
 {
 	u64 half_pa;
@@ -2288,8 +2249,6 @@ static void __init l1tf_select_mitigatio
 	else if (cpu_mitigations_auto_nosmt())
 		l1tf_mitigation = L1TF_MITIGATION_FLUSH_NOSMT;
 
-	override_cache_bits(&boot_cpu_data);
-
 	switch (l1tf_mitigation) {
 	case L1TF_MITIGATION_OFF:
 	case L1TF_MITIGATION_FLUSH_NOWARN:
diff -puN arch/x86/kernel/cpu/common.c~bsp-min_cache_bits arch/x86/kernel/cpu/common.c
--- a/arch/x86/kernel/cpu/common.c~bsp-min_cache_bits	2024-02-22 10:09:00.220915031 -0800
+++ b/arch/x86/kernel/cpu/common.c	2024-02-22 10:09:00.228915345 -0800
@@ -1139,6 +1139,8 @@ void get_cpu_address_sizes(struct cpuinf
 	x86_config.clflush_size = detect_clflush_size(c);
 
 	c->x86_cache_bits = x86_config.phys_bits;
+	if (c->x86_cache_bits < bsp_addr_config.min_cache_bits)
+		c->x86_cache_bits = bsp_addr_config.min_cache_bits;
 
 	x86_config.cache_alignment = x86_clflush_size();
 	if (bsp_addr_config.cache_align_mult)
diff -puN arch/x86/kernel/cpu/intel.c~bsp-min_cache_bits arch/x86/kernel/cpu/intel.c
--- a/arch/x86/kernel/cpu/intel.c~bsp-min_cache_bits	2024-02-22 10:09:00.224915188 -0800
+++ b/arch/x86/kernel/cpu/intel.c	2024-02-22 10:09:00.228915345 -0800
@@ -395,6 +395,44 @@ detect_keyid_bits:
 	return keyid_bits;
 }
 
+/*
+ * These CPUs all support 44bits physical address space internally in the
+ * cache but CPUID can report a smaller number of physical address bits.
+ *
+ * The L1TF mitigation uses the top most address bit for the inversion of
+ * non present PTEs. When the installed memory reaches into the top most
+ * address bit due to memory holes, which has been observed on machines
+ * which report 36bits physical address bits and have 32G RAM installed,
+ * then the mitigation range check in l1tf_select_mitigation() triggers.
+ * This is a false positive because the mitigation is still possible due to
+ * the fact that the cache uses 44bit internally. Use the cache bits
+ * instead of the reported physical bits and adjust them on the affected
+ * machines to 44bit if the reported bits are less than 44.
+ */
+static void set_min_cache_bits(struct cpuinfo_x86 *c)
+{
+	if (c->x86 != 6)
+		return;
+
+	switch (c->x86_model) {
+	case INTEL_FAM6_NEHALEM:
+	case INTEL_FAM6_WESTMERE:
+	case INTEL_FAM6_SANDYBRIDGE:
+	case INTEL_FAM6_IVYBRIDGE:
+	case INTEL_FAM6_HASWELL:
+	case INTEL_FAM6_HASWELL_L:
+	case INTEL_FAM6_HASWELL_G:
+	case INTEL_FAM6_BROADWELL:
+	case INTEL_FAM6_BROADWELL_G:
+	case INTEL_FAM6_SKYLAKE_L:
+	case INTEL_FAM6_SKYLAKE:
+	case INTEL_FAM6_KABYLAKE_L:
+	case INTEL_FAM6_KABYLAKE:
+		bsp_addr_config.min_cache_bits = 44;
+		break;
+	}
+}
+
 static void bsp_init_intel(struct cpuinfo_x86 *c)
 {
 	int keyid_bits = 0;
@@ -418,6 +456,8 @@ static void bsp_init_intel(struct cpuinf
 	/* Netburst reports 64 bytes clflush size, but does IO in 128 bytes */
 	if (c->x86 == 15)
 		bsp_addr_config.cache_align_mult = 2;
+
+	set_min_cache_bits(c);
 }
 
 #ifdef CONFIG_X86_32
_

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [RFC][PATCH 25/34] x86/cpu: Move cache bits to global config
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (23 preceding siblings ...)
  2024-02-22 18:39 ` [RFC][PATCH 24/34] x86/cpu: Establish 'min_cache_bits' configuration Dave Hansen
@ 2024-02-22 18:39 ` Dave Hansen
  2024-02-22 18:40 ` [RFC][PATCH 26/34] x86/cpu: Zap superfluous get_cpu_address_sizes() Dave Hansen
                   ` (8 subsequent siblings)
  33 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

x86_cache_bits is established and stored per-cpu despite being system-wide.
Move it from 'struct cpuinfo_x86' to 'x86_config' and give it a helper.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/include/asm/processor.h |   10 +++++++---
 b/arch/x86/kernel/cpu/common.c     |    8 ++++----
 b/arch/x86/kvm/mmu/spte.c          |    6 +++---
 3 files changed, 14 insertions(+), 10 deletions(-)

diff -puN arch/x86/include/asm/processor.h~config-cache_bits arch/x86/include/asm/processor.h
--- a/arch/x86/include/asm/processor.h~config-cache_bits	2024-02-22 10:09:00.768936544 -0800
+++ b/arch/x86/include/asm/processor.h	2024-02-22 10:09:00.772936701 -0800
@@ -154,8 +154,6 @@ struct cpuinfo_x86 {
 	/*  Is SMT active on this core? */
 	bool			smt_active;
 	u32			microcode;
-	/* Address space bits used by the cache internally */
-	u8			x86_cache_bits;
 	unsigned		initialized : 1;
 } __randomize_layout;
 
@@ -195,6 +193,7 @@ struct x86_sys_config {
 	/* Address bits supported by all processors */
 	u8	phys_bits;
 	u8	virt_bits;
+	u8	cache_bits;
 	u16	clflush_size;
 	int	cache_alignment; /* in bytes */
 };
@@ -241,7 +240,7 @@ extern void cpu_detect(struct cpuinfo_x8
 
 static inline unsigned long long l1tf_pfn_limit(void)
 {
-	return BIT_ULL(boot_cpu_data.x86_cache_bits - 1 - PAGE_SHIFT);
+	return BIT_ULL(x86_config.cache_bits - 1 - PAGE_SHIFT);
 }
 
 extern void early_cpu_init(void);
@@ -816,6 +815,11 @@ static inline u8 x86_virt_bits(void)
 	return x86_config.virt_bits;
 }
 
+static inline u8 x86_cache_bits(void)
+{
+	return x86_config.cache_bits;
+}
+
 static inline u8 x86_clflush_size(void)
 {
 	return x86_config.clflush_size;
diff -puN arch/x86/kernel/cpu/common.c~config-cache_bits arch/x86/kernel/cpu/common.c
--- a/arch/x86/kernel/cpu/common.c~config-cache_bits	2024-02-22 10:09:00.768936544 -0800
+++ b/arch/x86/kernel/cpu/common.c	2024-02-22 10:09:00.772936701 -0800
@@ -1138,15 +1138,15 @@ void get_cpu_address_sizes(struct cpuinf
 	}
 	x86_config.clflush_size = detect_clflush_size(c);
 
-	c->x86_cache_bits = x86_config.phys_bits;
-	if (c->x86_cache_bits < bsp_addr_config.min_cache_bits)
-		c->x86_cache_bits = bsp_addr_config.min_cache_bits;
+	x86_config.cache_bits = x86_config.phys_bits;
+	if (x86_config.cache_bits < bsp_addr_config.min_cache_bits)
+		x86_config.cache_bits = bsp_addr_config.min_cache_bits;
 
 	x86_config.cache_alignment = x86_clflush_size();
 	if (bsp_addr_config.cache_align_mult)
 		x86_config.cache_alignment *= bsp_addr_config.cache_align_mult;
 
-	/* Do this last to avoid affecting ->x86_cache_bits. */
+	/* Do this last to avoid affecting '.cache_bits'. */
 	x86_config.phys_bits -= bsp_addr_config.phys_addr_reduction_bits;
 }
 
diff -puN arch/x86/kvm/mmu/spte.c~config-cache_bits arch/x86/kvm/mmu/spte.c
--- a/arch/x86/kvm/mmu/spte.c~config-cache_bits	2024-02-22 10:09:00.772936701 -0800
+++ b/arch/x86/kvm/mmu/spte.c	2024-02-22 10:09:00.772936701 -0800
@@ -470,12 +470,12 @@ void kvm_mmu_reset_all_pte_masks(void)
 	shadow_nonpresent_or_rsvd_mask = 0;
 	low_phys_bits = x86_phys_bits();
 	if (boot_cpu_has_bug(X86_BUG_L1TF) &&
-	    !WARN_ON_ONCE(boot_cpu_data.x86_cache_bits >=
+	    !WARN_ON_ONCE(x86_cache_bits() >=
 			  52 - SHADOW_NONPRESENT_OR_RSVD_MASK_LEN)) {
-		low_phys_bits = boot_cpu_data.x86_cache_bits
+		low_phys_bits = x86_cache_bits()
 			- SHADOW_NONPRESENT_OR_RSVD_MASK_LEN;
 		shadow_nonpresent_or_rsvd_mask =
-			rsvd_bits(low_phys_bits, boot_cpu_data.x86_cache_bits - 1);
+			rsvd_bits(low_phys_bits, x86_cache_bits() - 1);
 	}
 
 	shadow_nonpresent_or_rsvd_lower_gfn_mask =
_

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [RFC][PATCH 26/34] x86/cpu: Zap superfluous get_cpu_address_sizes()
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (24 preceding siblings ...)
  2024-02-22 18:39 ` [RFC][PATCH 25/34] x86/cpu: Move cache bits to global config Dave Hansen
@ 2024-02-22 18:40 ` Dave Hansen
  2024-02-22 18:40 ` [RFC][PATCH 27/34] x86/cpu: Enforce read-only x86_config state (lightly) Dave Hansen
                   ` (7 subsequent siblings)
  33 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:40 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

All of the 'x86_config' state is established by the boot CPU long
before generic_identify().  There is no need to call
get_cpu_address_sizes() again.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/kernel/cpu/common.c |    2 --
 1 file changed, 2 deletions(-)

diff -puN arch/x86/kernel/cpu/common.c~zap-extra-get_cpu_address_sizes arch/x86/kernel/cpu/common.c
--- a/arch/x86/kernel/cpu/common.c~zap-extra-get_cpu_address_sizes	2024-02-22 10:09:01.292957116 -0800
+++ b/arch/x86/kernel/cpu/common.c	2024-02-22 10:09:01.296957273 -0800
@@ -1770,8 +1770,6 @@ static void generic_identify(struct cpui
 
 	get_cpu_cap(c);
 
-	get_cpu_address_sizes(c);
-
 	if (c->cpuid_level >= 0x00000001) {
 		c->topo.initial_apicid = (cpuid_ebx(1) >> 24) & 0xFF;
 #ifdef CONFIG_X86_32
_

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [RFC][PATCH 27/34] x86/cpu: Enforce read-only x86_config state (lightly)
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (25 preceding siblings ...)
  2024-02-22 18:40 ` [RFC][PATCH 26/34] x86/cpu: Zap superfluous get_cpu_address_sizes() Dave Hansen
@ 2024-02-22 18:40 ` Dave Hansen
  2024-02-22 18:40 ` [RFC][PATCH 28/34] x86/cpu: Return sane defaults for early x86_config reads Dave Hansen
                   ` (6 subsequent siblings)
  33 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:40 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

Part of the reason that this all turned into such a mess is that there
were no rules around when 'struct cpuinfo_x86' was written.  It was
(and is) just a free-for-all.

Establish that 'x86_config' has two phases in its lifetime: a
C_INITIALIZING phase where it can be written and a later C_FINALIZED
phase where it can only be read.  It is simple to audit that this
state transition happens just where the comment says it should.

Check that the config is C_FINALIZED in each of the wrappers that
read an 'x86_config' value.  If something reads too early, stash away
some information about the caller so that a warning can be spat out
later.

This goofy stash-then-warn construct is necessary here because any
hapless readers are likely to be in a spot where they cannot easily
WARN() themselves, like the early Xen PV boot that's caused so many
problems.
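
Condensed from the hunks below, the whole lifecycle looks like:

	/* get_cpu_address_sizes(), early on the boot CPU: */
	x86_config.conf_state = C_INITIALIZING;		/* writes allowed */

	/* setup_arch(), right after early_cpu_init(): */
	WARN_ONCE(x86_config.early_reader, "x86_config.early_reader: %pS\n",
		  x86_config.early_reader);
	x86_config.conf_state = C_FINALIZED;		/* read-only from here */

	/* read_x86_config(), called from each read wrapper: */
	if (!x86_config.early_reader)
		x86_config.early_reader = __builtin_return_address(0);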

This also moves one x86_clflush_size() reference over to a direct
x86_config.clflush_size access because the wrapper would otherwise
trip the !C_FINALIZED check.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/include/asm/processor.h |   22 ++++++++++++++++++++++
 b/arch/x86/kernel/cpu/common.c     |    5 ++++-
 b/arch/x86/kernel/setup.c          |    7 +++++++
 3 files changed, 33 insertions(+), 1 deletion(-)

diff -puN arch/x86/include/asm/processor.h~x86_config-finalize arch/x86/include/asm/processor.h
--- a/arch/x86/include/asm/processor.h~x86_config-finalize	2024-02-22 10:09:01.772975960 -0800
+++ b/arch/x86/include/asm/processor.h	2024-02-22 10:09:01.780976274 -0800
@@ -183,6 +183,11 @@ struct x86_addr_config {
 	u8 min_cache_bits;
 };
 
+enum x86_sys_config_state {
+	C_INITIALIZING,
+	C_FINALIZED
+};
+
 /*
  * System-wide configuration that is shared by all processors.
  *
@@ -190,6 +195,9 @@ struct x86_addr_config {
  * modified after that.
  */
 struct x86_sys_config {
+	enum x86_sys_config_state conf_state;
+	void *early_reader;
+
 	/* Address bits supported by all processors */
 	u8	phys_bits;
 	u8	virt_bits;
@@ -805,23 +813,37 @@ static inline void weak_wrmsr_fence(void
 	alternative("mfence; lfence", "", ALT_NOT(X86_FEATURE_APIC_MSRS_FENCE));
 }
 
+static inline void read_x86_config(void)
+{
+	if (x86_config.conf_state == C_FINALIZED)
+		return;
+
+	/* Only record the first one: */
+	if (!x86_config.early_reader)
+		x86_config.early_reader = __builtin_return_address(0);
+}
+
 static inline u8 x86_phys_bits(void)
 {
+	read_x86_config();
 	return x86_config.phys_bits;
 }
 
 static inline u8 x86_virt_bits(void)
 {
+	read_x86_config();
 	return x86_config.virt_bits;
 }
 
 static inline u8 x86_cache_bits(void)
 {
+	read_x86_config();
 	return x86_config.cache_bits;
 }
 
 static inline u8 x86_clflush_size(void)
 {
+	read_x86_config();
 	return x86_config.clflush_size;
 }
 
diff -puN arch/x86/kernel/cpu/common.c~x86_config-finalize arch/x86/kernel/cpu/common.c
--- a/arch/x86/kernel/cpu/common.c~x86_config-finalize	2024-02-22 10:09:01.776976117 -0800
+++ b/arch/x86/kernel/cpu/common.c	2024-02-22 10:09:01.780976274 -0800
@@ -1114,6 +1114,9 @@ void get_cpu_address_sizes(struct cpuinf
 	u32 eax, ebx, ecx, edx;
 	bool vp_bits_from_cpuid = true;
 
+	WARN_ON_ONCE(x86_config.conf_state > C_INITIALIZING);
+	x86_config.conf_state = C_INITIALIZING;
+
 	if (!cpu_has(c, X86_FEATURE_CPUID) ||
 	    (c->extended_cpuid_level < 0x80000008))
 		vp_bits_from_cpuid = false;
@@ -1142,7 +1145,7 @@ void get_cpu_address_sizes(struct cpuinf
 	if (x86_config.cache_bits < bsp_addr_config.min_cache_bits)
 		x86_config.cache_bits = bsp_addr_config.min_cache_bits;
 
-	x86_config.cache_alignment = x86_clflush_size();
+	x86_config.cache_alignment = x86_config.clflush_size;
 	if (bsp_addr_config.cache_align_mult)
 		x86_config.cache_alignment *= bsp_addr_config.cache_align_mult;
 
diff -puN arch/x86/kernel/setup.c~x86_config-finalize arch/x86/kernel/setup.c
--- a/arch/x86/kernel/setup.c~x86_config-finalize	2024-02-22 10:09:01.776976117 -0800
+++ b/arch/x86/kernel/setup.c	2024-02-22 10:09:01.780976274 -0800
@@ -762,7 +762,14 @@ void __init setup_arch(char **cmdline_p)
 	olpc_ofw_detect();
 
 	idt_setup_early_traps();
+
 	early_cpu_init();
+
+	/* Ensure no readers snuck in before the config was finished: */
+	WARN_ONCE(x86_config.early_reader, "x86_config.early_reader: %pS\n",
+		  x86_config.early_reader);
+	x86_config.conf_state = C_FINALIZED;
+
 	jump_label_init();
 	static_call_init();
 	early_ioremap_init();
_

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [RFC][PATCH 28/34] x86/cpu: Return sane defaults for early x86_config reads
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (26 preceding siblings ...)
  2024-02-22 18:40 ` [RFC][PATCH 27/34] x86/cpu: Enforce read-only x86_config state (lightly) Dave Hansen
@ 2024-02-22 18:40 ` Dave Hansen
  2024-02-22 18:40 ` [RFC][PATCH 29/34] x86/xen: Remove extra get_cpu_address_sizes() call Dave Hansen
                   ` (5 subsequent siblings)
  33 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:40 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

*If* an unexpected user tries to read 'x86_config' before it is
C_FINALIZED and valid, it will likely get zeros or other garbage.
Those values often cause the user to crash before it can get
a nice message out to the console.

Provide some mostly arbitrary but unlikely-to-crash-the-caller
default values.
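
The defaults are not pulled out of thin air: 52 and 57 are the
current architectural maximums for physical and (5-level paging)
virtual address bits, and 64 bytes is the CLFLUSH/cache-line size of
every remotely recent x86 CPU.  To see why zeros are fatal, consider
a typical consumer (a sketch, not from this patch) that builds a mask
out of the bit count:

	/*
	 * With phys_bits == 0 this mask is 0 and silently discards
	 * every address.  With the fallback of 52 it is merely too
	 * permissive -- good enough to limp along until the
	 * early_reader warning fires.
	 */
	u64 phys_mask = (1ULL << x86_phys_bits()) - 1;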

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/include/asm/processor.h |   32 ++++++++++++++++++++++++++------
 1 file changed, 26 insertions(+), 6 deletions(-)

diff -puN arch/x86/include/asm/processor.h~x86_config-defaults arch/x86/include/asm/processor.h
--- a/arch/x86/include/asm/processor.h~x86_config-defaults	2024-02-22 10:09:02.280995903 -0800
+++ b/arch/x86/include/asm/processor.h	2024-02-22 10:09:02.280995903 -0800
@@ -813,37 +813,57 @@ static inline void weak_wrmsr_fence(void
 	alternative("mfence; lfence", "", ALT_NOT(X86_FEATURE_APIC_MSRS_FENCE));
 }
 
-static inline void read_x86_config(void)
+static inline int read_x86_config(void)
 {
 	if (x86_config.conf_state == C_FINALIZED)
-		return;
+		return 0;
 
 	/* Only record the first one: */
 	if (!x86_config.early_reader)
 		x86_config.early_reader = __builtin_return_address(0);
+
+	return -EINVAL;
 }
 
+/*
+ * These _should_ not be called until x86_config is C_FINALIZED.
+ * In the case that they are, two things will happen:
+ *  1. The first caller will get the call site recorded in
+ *     'early_reader' which will trigger a warning later in
+ *     boot.
+ *  2. A moderately sane hard-coded default value will be provided.
+ *     The entire intent of this value is to let things limp along
+ *     until the warning can spit out.
+ */
 static inline u8 x86_phys_bits(void)
 {
-	read_x86_config();
+	if (read_x86_config())
+		return 52;
+
 	return x86_config.phys_bits;
 }
 
 static inline u8 x86_virt_bits(void)
 {
-	read_x86_config();
+	if (read_x86_config())
+		return IS_ENABLED(CONFIG_X86_64) ? 57 : 32;
+
 	return x86_config.virt_bits;
 }
 
 static inline u8 x86_cache_bits(void)
 {
-	read_x86_config();
+	if (read_x86_config())
+		return 52;
+
 	return x86_config.cache_bits;
 }
 
 static inline u8 x86_clflush_size(void)
 {
-	read_x86_config();
+	if (read_x86_config())
+		return 64;
+
 	return x86_config.clflush_size;
 }
 
_

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [RFC][PATCH 29/34] x86/xen: Remove extra get_cpu_address_sizes() call
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (27 preceding siblings ...)
  2024-02-22 18:40 ` [RFC][PATCH 28/34] x86/cpu: Return sane defaults for early x86_config reads Dave Hansen
@ 2024-02-22 18:40 ` Dave Hansen
  2024-02-22 18:40 ` [RFC][PATCH 30/34] x86/cpu/centaur: Mark BSP init function as __init Dave Hansen
                   ` (4 subsequent siblings)
  33 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:40 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

Thanks to the previous fixes, Xen PV should now avoid reading
'x86_config' at all.  Even if something was missed, it should now get sane
values *and* be able to survive long enough to get a warning out later
in boot.

Remove the belt-and-suspenders call to get_cpu_address_sizes().

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/xen/enlighten_pv.c |    3 ---
 1 file changed, 3 deletions(-)

diff -puN arch/x86/xen/enlighten_pv.c~xen-remove-early-addr_size-fun arch/x86/xen/enlighten_pv.c
--- a/arch/x86/xen/enlighten_pv.c~xen-remove-early-addr_size-fun	2024-02-22 10:09:02.753014433 -0800
+++ b/arch/x86/xen/enlighten_pv.c	2024-02-22 10:09:02.757014590 -0800
@@ -1386,9 +1386,6 @@ asmlinkage __visible void __init xen_sta
 	 */
 	xen_setup_gdt(0);
 
-	/* Determine virtual and physical address sizes */
-	get_cpu_address_sizes(&boot_cpu_data);
-
 	/* Let's presume PV guests always boot on vCPU with id 0. */
 	per_cpu(xen_vcpu_id, 0) = 0;
 
_

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [RFC][PATCH 30/34] x86/cpu/centaur: Mark BSP init function as __init
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (28 preceding siblings ...)
  2024-02-22 18:40 ` [RFC][PATCH 29/34] x86/xen: Remove extra get_cpu_address_sizes() call Dave Hansen
@ 2024-02-22 18:40 ` Dave Hansen
  2024-02-22 18:40 ` [RFC][PATCH 31/34] x86/cpu/intel: " Dave Hansen
                   ` (3 subsequent siblings)
  33 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:40 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

The boot CPU is obviously initialized only once, at boot, so its init
code can be __init.  This prevents a warning once 'bsp_addr_config'
gets marked __initdata.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/kernel/cpu/centaur.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN arch/x86/kernel/cpu/centaur.c~bsp_init_centaur-__init arch/x86/kernel/cpu/centaur.c
--- a/arch/x86/kernel/cpu/centaur.c~bsp_init_centaur-__init	2024-02-22 10:09:03.237033434 -0800
+++ b/arch/x86/kernel/cpu/centaur.c	2024-02-22 10:09:03.237033434 -0800
@@ -213,7 +213,7 @@ static void init_centaur(struct cpuinfo_
 	init_ia32_feat_ctl(c);
 }
 
-static void bsp_init_centaur(struct cpuinfo_x86 *c)
+static void __init bsp_init_centaur(struct cpuinfo_x86 *c)
 {
 	if (c->x86 == 0x6 && c->x86_model >= 0xf)
 		bsp_addr_config.cache_align_mult = 2;
_

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [RFC][PATCH 31/34] x86/cpu/intel: Mark BSP init function as __init
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (29 preceding siblings ...)
  2024-02-22 18:40 ` [RFC][PATCH 30/34] x86/cpu/centaur: Mark BSP init function as __init Dave Hansen
@ 2024-02-22 18:40 ` Dave Hansen
  2024-02-22 18:40 ` [RFC][PATCH 32/34] x86/cpu/amd: Move memory encryption detection Dave Hansen
                   ` (2 subsequent siblings)
  33 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:40 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

The boot CPU is obviously initialized only once, at boot, so its init
code can be __init.  This prevents a warning once 'bsp_addr_config'
gets marked __initdata.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/kernel/cpu/intel.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN arch/x86/kernel/cpu/intel.c~bsp_init_intel-__init arch/x86/kernel/cpu/intel.c
--- a/arch/x86/kernel/cpu/intel.c~bsp_init_intel-__init	2024-02-22 10:09:03.721052435 -0800
+++ b/arch/x86/kernel/cpu/intel.c	2024-02-22 10:09:03.725052592 -0800
@@ -433,7 +433,7 @@ static void set_min_cache_bits(struct cp
 	}
 }
 
-static void bsp_init_intel(struct cpuinfo_x86 *c)
+static void __init bsp_init_intel(struct cpuinfo_x86 *c)
 {
 	int keyid_bits = 0;
 
_

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [RFC][PATCH 32/34] x86/cpu/amd: Move memory encryption detection
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (30 preceding siblings ...)
  2024-02-22 18:40 ` [RFC][PATCH 31/34] x86/cpu/intel: " Dave Hansen
@ 2024-02-22 18:40 ` Dave Hansen
  2024-02-22 18:40 ` [RFC][PATCH 33/34] x86/cpu: Make get_cpu_address_sizes() static and __init Dave Hansen
  2024-02-22 18:40 ` [RFC][PATCH 34/34] x86/cpu: Mark new boot CPU and config structures appropriately Dave Hansen
  33 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:40 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

Right now, the AMD memory encryption detection code runs on every CPU
even though it operates on 'bsp_addr_config' and calls
setup_clear_cpu_cap(), which only affects 'boot_cpu_data'.

Move it to bsp_init_amd() where it belongs and change its name to match.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/kernel/cpu/amd.c |  110 ++++++++++++++++++++++----------------------
 1 file changed, 55 insertions(+), 55 deletions(-)

diff -puN arch/x86/kernel/cpu/amd.c~early_init_amd-__init arch/x86/kernel/cpu/amd.c
--- a/arch/x86/kernel/cpu/amd.c~early_init_amd-__init	2024-02-22 10:09:04.201071279 -0800
+++ b/arch/x86/kernel/cpu/amd.c	2024-02-22 10:09:04.201071279 -0800
@@ -468,8 +468,62 @@ static void early_init_amd_mc(struct cpu
 #endif
 }
 
-static void bsp_init_amd(struct cpuinfo_x86 *c)
+static void bsp_detect_mem_encrypt(struct cpuinfo_x86 *c)
 {
+	u64 msr;
+
+	/*
+	 * BIOS support is required for SME and SEV.
+	 *   For SME: If BIOS has enabled SME then adjust x86_phys_bits by
+	 *	      the SME physical address space reduction value.
+	 *	      If BIOS has not enabled SME then don't advertise the
+	 *	      SME feature (set in scattered.c).
+	 *	      If the kernel has not enabled SME via any means then
+	 *	      don't advertise the SME feature.
+	 *   For SEV: If BIOS has not enabled SEV then don't advertise the
+	 *            SEV and SEV_ES feature (set in scattered.c).
+	 *
+	 *   In all cases, since support for SME and SEV requires long mode,
+	 *   don't advertise the feature under CONFIG_X86_32.
+	 */
+	if (cpu_has(c, X86_FEATURE_SME) || cpu_has(c, X86_FEATURE_SEV)) {
+		/* Check if memory encryption is enabled */
+		rdmsrl(MSR_AMD64_SYSCFG, msr);
+		if (!(msr & MSR_AMD64_SYSCFG_MEM_ENCRYPT))
+			goto clear_all;
+
+		/*
+		 * Always adjust physical address bits. Even though this
+		 * will be a value above 32-bits this is still done for
+		 * CONFIG_X86_32 so that accurate values are reported.
+		 */
+		bsp_addr_config.phys_addr_reduction_bits =
+			(cpuid_ebx(0x8000001f) >> 6) & 0x3f;
+
+		if (IS_ENABLED(CONFIG_X86_32))
+			goto clear_all;
+
+		if (!sme_me_mask)
+			setup_clear_cpu_cap(X86_FEATURE_SME);
+
+		rdmsrl(MSR_K7_HWCR, msr);
+		if (!(msr & MSR_K7_HWCR_SMMLOCK))
+			goto clear_sev;
+
+		return;
+
+clear_all:
+		setup_clear_cpu_cap(X86_FEATURE_SME);
+clear_sev:
+		setup_clear_cpu_cap(X86_FEATURE_SEV);
+		setup_clear_cpu_cap(X86_FEATURE_SEV_ES);
+	}
+}
+
+static void __init bsp_init_amd(struct cpuinfo_x86 *c)
+{
+	bsp_detect_mem_encrypt(c);
+
 	if (cpu_has(c, X86_FEATURE_CONSTANT_TSC)) {
 
 		if (c->x86 > 0x10 ||
@@ -593,58 +647,6 @@ warn:
 	WARN_ONCE(1, "Family 0x%x, model: 0x%x??\n", c->x86, c->x86_model);
 }
 
-static void early_detect_mem_encrypt(struct cpuinfo_x86 *c)
-{
-	u64 msr;
-
-	/*
-	 * BIOS support is required for SME and SEV.
-	 *   For SME: If BIOS has enabled SME then adjust x86_phys_bits by
-	 *	      the SME physical address space reduction value.
-	 *	      If BIOS has not enabled SME then don't advertise the
-	 *	      SME feature (set in scattered.c).
-	 *	      If the kernel has not enabled SME via any means then
-	 *	      don't advertise the SME feature.
-	 *   For SEV: If BIOS has not enabled SEV then don't advertise the
-	 *            SEV and SEV_ES feature (set in scattered.c).
-	 *
-	 *   In all cases, since support for SME and SEV requires long mode,
-	 *   don't advertise the feature under CONFIG_X86_32.
-	 */
-	if (cpu_has(c, X86_FEATURE_SME) || cpu_has(c, X86_FEATURE_SEV)) {
-		/* Check if memory encryption is enabled */
-		rdmsrl(MSR_AMD64_SYSCFG, msr);
-		if (!(msr & MSR_AMD64_SYSCFG_MEM_ENCRYPT))
-			goto clear_all;
-
-		/*
-		 * Always adjust physical address bits. Even though this
-		 * will be a value above 32-bits this is still done for
-		 * CONFIG_X86_32 so that accurate values are reported.
-		 */
-		bsp_addr_config.phys_addr_reduction_bits =
-			(cpuid_ebx(0x8000001f) >> 6) & 0x3f;
-
-		if (IS_ENABLED(CONFIG_X86_32))
-			goto clear_all;
-
-		if (!sme_me_mask)
-			setup_clear_cpu_cap(X86_FEATURE_SME);
-
-		rdmsrl(MSR_K7_HWCR, msr);
-		if (!(msr & MSR_K7_HWCR_SMMLOCK))
-			goto clear_sev;
-
-		return;
-
-clear_all:
-		setup_clear_cpu_cap(X86_FEATURE_SME);
-clear_sev:
-		setup_clear_cpu_cap(X86_FEATURE_SEV);
-		setup_clear_cpu_cap(X86_FEATURE_SEV_ES);
-	}
-}
-
 static void early_init_amd(struct cpuinfo_x86 *c)
 {
 	u64 value;
@@ -715,8 +717,6 @@ static void early_init_amd(struct cpuinf
 	if (c->x86 == 0x16 && c->x86_model <= 0xf)
 		msr_set_bit(MSR_AMD64_LS_CFG, 15);
 
-	early_detect_mem_encrypt(c);
-
 	/* Re-enable TopologyExtensions if switched off by BIOS */
 	if (c->x86 == 0x15 &&
 	    (c->x86_model >= 0x10 && c->x86_model <= 0x6f) &&
_

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [RFC][PATCH 33/34] x86/cpu: Make get_cpu_address_sizes() static and __init
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (31 preceding siblings ...)
  2024-02-22 18:40 ` [RFC][PATCH 32/34] x86/cpu/amd: Move memory encryption detection Dave Hansen
@ 2024-02-22 18:40 ` Dave Hansen
  2024-02-22 18:40 ` [RFC][PATCH 34/34] x86/cpu: Mark new boot CPU and config structures appropriately Dave Hansen
  33 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:40 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

Now that all the other users are gone and the final user is in the same file
and __init, get_cpu_address_sizes() can be static and __init.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/kernel/cpu/common.c |    2 +-
 b/arch/x86/kernel/cpu/cpu.h    |    1 -
 2 files changed, 1 insertion(+), 2 deletions(-)

diff -puN arch/x86/kernel/cpu/common.c~get_cpu_address_sizes-__init arch/x86/kernel/cpu/common.c
--- a/arch/x86/kernel/cpu/common.c~get_cpu_address_sizes-__init	2024-02-22 10:09:04.677089966 -0800
+++ b/arch/x86/kernel/cpu/common.c	2024-02-22 10:09:04.681090123 -0800
@@ -1109,7 +1109,7 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
 	apply_forced_caps(c);
 }
 
-void get_cpu_address_sizes(struct cpuinfo_x86 *c)
+static void __init get_cpu_address_sizes(struct cpuinfo_x86 *c)
 {
 	u32 eax, ebx, ecx, edx;
 	bool vp_bits_from_cpuid = true;
diff -puN arch/x86/kernel/cpu/cpu.h~get_cpu_address_sizes-__init arch/x86/kernel/cpu/cpu.h
--- a/arch/x86/kernel/cpu/cpu.h~get_cpu_address_sizes-__init	2024-02-22 10:09:04.677089966 -0800
+++ b/arch/x86/kernel/cpu/cpu.h	2024-02-22 10:09:04.681090123 -0800
@@ -64,7 +64,6 @@ static inline void tsx_ap_init(void) { }
 extern void init_spectral_chicken(struct cpuinfo_x86 *c);
 
 extern void get_cpu_cap(struct cpuinfo_x86 *c);
-extern void get_cpu_address_sizes(struct cpuinfo_x86 *c);
 extern void cpu_detect_cache_sizes(struct cpuinfo_x86 *c);
 extern void init_scattered_cpuid_features(struct cpuinfo_x86 *c);
 extern void init_intel_cacheinfo(struct cpuinfo_x86 *c);
_

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [RFC][PATCH 34/34] x86/cpu: Mark new boot CPU and config structures appropriately
  2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
                   ` (32 preceding siblings ...)
  2024-02-22 18:40 ` [RFC][PATCH 33/34] x86/cpu: Make get_cpu_address_sizes() static and __init Dave Hansen
@ 2024-02-22 18:40 ` Dave Hansen
  33 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-22 18:40 UTC (permalink / raw)
  To: linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

Just to recap the rules:

 1. 'bsp_addr_config' is only used in early boot by the BSP.
 2. 'x86_config' is only written once by the BSP and can be
    used read-only during normal system operation.

Mark 'bsp_addr_config' __initdata so it goes away after boot *and* so
that non-init references to it can be detected and warned about.

Mark 'x86_config' as __ro_after_init so it does not turn into the same
mess that 'boot_cpu_data' is.
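
As an example of what the markings buy us, a hypothetical post-boot
writer like this one now gets flagged by the modpost section-mismatch
checks (and the memory backing 'bsp_addr_config' is freed outright):

	/* Not __init, so this reference to __initdata is bogus: */
	static void some_runtime_code(void)
	{
		bsp_addr_config.min_cache_bits = 6;
	}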

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/kernel/setup.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff -puN arch/x86/kernel/setup.c~mark-new-structs arch/x86/kernel/setup.c
--- a/arch/x86/kernel/setup.c~mark-new-structs	2024-02-22 10:09:05.173109438 -0800
+++ b/arch/x86/kernel/setup.c	2024-02-22 10:09:05.177109595 -0800
@@ -131,8 +131,8 @@ struct ist_info ist_info;
 struct cpuinfo_x86 boot_cpu_data __read_mostly;
 EXPORT_SYMBOL(boot_cpu_data);
 
-struct x86_addr_config bsp_addr_config;
-struct x86_sys_config x86_config __read_mostly;
+struct x86_addr_config bsp_addr_config __initdata;
+struct x86_sys_config x86_config __ro_after_init;
 
 #if !defined(CONFIG_X86_PAE) || defined(CONFIG_X86_64)
 __visible unsigned long mmu_cr4_features __ro_after_init;
_

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC][PATCH 11/34] x86/cpu/intel: Prepare MKTME for "address configuration" infrastructure
  2024-02-22 18:39 ` [RFC][PATCH 11/34] x86/cpu/intel: Prepare MKTME for " Dave Hansen
@ 2024-02-23 11:33   ` Kirill A. Shutemov
  2024-02-23 16:22     ` Dave Hansen
  0 siblings, 1 reply; 54+ messages in thread
From: Kirill A. Shutemov @ 2024-02-23 11:33 UTC (permalink / raw)
  To: Dave Hansen; +Cc: linux-kernel, pbonzini, tglx, x86, bp

On Thu, Feb 22, 2024 at 10:39:41AM -0800, Dave Hansen wrote:
> 
> From: Dave Hansen <dave.hansen@linux.intel.com>
> 
> Intel also does memory encryption and also fiddles with the physical
> address bits.  This is currently called for *each* CPU, but practically
> only done on the boot CPU because of 'mktme_status'.
> 
> Move it from the "each CPU" ->c_init() function to ->c_bsp_init() where
> the whole thing only gets called once ever.  This also necessitates moving
> detect_tme() and its entourage around in the file.

The state machine around mktme_state doesn't make sense if we only call it
on boot CPU, so detect_tme() can be drastically simplified. I can do it on
top of the patchset.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC][PATCH 12/34] x86/cpu/intel: Actually use "address configuration" infrastructure for MKTME
  2024-02-22 18:39 ` [RFC][PATCH 12/34] x86/cpu/intel: Actually use "address configuration" infrastructure for MKTME Dave Hansen
@ 2024-02-23 11:41   ` Kirill A. Shutemov
  2024-02-23 16:16     ` Dave Hansen
  0 siblings, 1 reply; 54+ messages in thread
From: Kirill A. Shutemov @ 2024-02-23 11:41 UTC (permalink / raw)
  To: Dave Hansen; +Cc: linux-kernel, pbonzini, tglx, x86, bp

On Thu, Feb 22, 2024 at 10:39:42AM -0800, Dave Hansen wrote:
> 
> From: Dave Hansen <dave.hansen@linux.intel.com>
> 
> Now that the TME detection is only called once at boot, stop twiddling
> 'boot_cpu_data' directly and move over to 'bsp_addr_config'.
> 
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> ---
> 
>  b/arch/x86/kernel/cpu/intel.c |    7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff -puN arch/x86/kernel/cpu/intel.c~intel-addr-reduce arch/x86/kernel/cpu/intel.c
> --- a/arch/x86/kernel/cpu/intel.c~intel-addr-reduce	2024-02-22 10:08:54.296682462 -0800
> +++ b/arch/x86/kernel/cpu/intel.c	2024-02-22 10:08:54.296682462 -0800
> @@ -401,11 +401,8 @@ detect_keyid_bits:
>  		mktme_status = MKTME_ENABLED;
>  	}
>  
> -	/*
> -	 * KeyID bits effectively lower the number of physical address
> -	 * bits.  Update cpuinfo_x86::x86_phys_bits accordingly.
> -	 */
> -	c->x86_phys_bits -= keyid_bits;
> +	/* KeyID bits effectively lower the number of physical address bits */
> +	bsp_addr_config.phys_addr_reduction_bits = keyid_bits;

Do we expect reduction_bits to stack? Like can multiple features steal
physical bits? Make use "+= keyid_bits" here?

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC][PATCH 12/34] x86/cpu/intel: Actually use "address configuration" infrastructure for MKTME
  2024-02-23 11:41   ` Kirill A. Shutemov
@ 2024-02-23 16:16     ` Dave Hansen
  0 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-23 16:16 UTC (permalink / raw)
  To: Kirill A. Shutemov, Dave Hansen; +Cc: linux-kernel, pbonzini, tglx, x86, bp

On 2/23/24 03:41, Kirill A. Shutemov wrote:
>> -	/*
>> -	 * KeyID bits effectively lower the number of physical address
>> -	 * bits.  Update cpuinfo_x86::x86_phys_bits accordingly.
>> -	 */
>> -	c->x86_phys_bits -= keyid_bits;
>> +	/* KeyID bits effectively lower the number of physical address bits */
>> +	bsp_addr_config.phys_addr_reduction_bits = keyid_bits;
> Do we expect reduction_bits to stack? Like can multiple features steal
> physical bits? Make use "+= keyid_bits" here?

Good question.

IMNHO, the idea that different "users" of these fields can be oblivious
to each other is the reason that this has gotten so bad.

I thought about interfering or stacking a bit when I was putting this
together.  It's one of the reasons I chose to add the specific
'phys_addr_reduction_bits' field _instead_ of continuing to just munge
'phys_addr_bits'.

I want 'bsp_addr_config' to represent the *inputs* that eventually end
up in x86_config, and have the inputs distilled down to the output in
one (or very few) places.

There are thankfully very few users of this: Intel and AMD memory
encryption and one Intel CPU erratum for very old CPUs.  So they can't
stack in practice.

If we ever need something to stack, I'd prefer that we add two fields,
maybe:

	bsp_addr_config.enc_reduction_bits

and

	bsp_addr_config.maxphyaddr_erratum_override

or something, then have any collision be resolved when 'bsp_addr_config'
is consulted, in *ONE* central place in the code.
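
Roughly like this (a sketch only; both fields and the helper are
hypothetical and not part of this series):

	static void __init distill_addr_config(u8 enumerated_phys_bits)
	{
		u8 phys_bits = enumerated_phys_bits;

		/* An erratum override beats the CPUID-enumerated value: */
		if (bsp_addr_config.maxphyaddr_erratum_override)
			phys_bits = bsp_addr_config.maxphyaddr_erratum_override;

		/* ...and encryption KeyID bits reduce whatever won: */
		x86_config.phys_bits = phys_bits -
				       bsp_addr_config.enc_reduction_bits;
	}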

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC][PATCH 11/34] x86/cpu/intel: Prepare MKTME for "address configuration" infrastructure
  2024-02-23 11:33   ` Kirill A. Shutemov
@ 2024-02-23 16:22     ` Dave Hansen
  2024-02-26 12:14       ` Kirill A. Shutemov
  0 siblings, 1 reply; 54+ messages in thread
From: Dave Hansen @ 2024-02-23 16:22 UTC (permalink / raw)
  To: Kirill A. Shutemov, Dave Hansen; +Cc: linux-kernel, pbonzini, tglx, x86, bp

On 2/23/24 03:33, Kirill A. Shutemov wrote:
> On Thu, Feb 22, 2024 at 10:39:41AM -0800, Dave Hansen wrote:
>> From: Dave Hansen <dave.hansen@linux.intel.com>
>>
>> Intel also does memory encryption and also fiddles with the physical
>> address bits.  This is currently called for *each* CPU, but practically
>> only done on the boot CPU because of 'mktme_status'.
>>
>> Move it from the "each CPU" ->c_init() function to ->c_bsp_init() where
>> the whole thing only gets called once ever.  This also necessitates moving
>> detect_tme() and its entourage around in the file.
> The state machine around mktme_state doesn't make sense if we only call it
> on boot CPU, so detect_tme() can be drastically simplified. I can do it on
> top of the patchset.

That would be great.  Looking at it again, the (tme_activate !=
tme_activate_cpu0) block is total cruft now.  It probably just needs to
get moved to secondary CPU startup.


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC][PATCH 11/34] x86/cpu/intel: Prepare MKTME for "address configuration" infrastructure
  2024-02-23 16:22     ` Dave Hansen
@ 2024-02-26 12:14       ` Kirill A. Shutemov
  2024-02-27 21:48         ` Huang, Kai
  0 siblings, 1 reply; 54+ messages in thread
From: Kirill A. Shutemov @ 2024-02-26 12:14 UTC (permalink / raw)
  To: Dave Hansen; +Cc: Dave Hansen, linux-kernel, pbonzini, tglx, x86, bp

On Fri, Feb 23, 2024 at 08:22:16AM -0800, Dave Hansen wrote:
> On 2/23/24 03:33, Kirill A. Shutemov wrote:
> > On Thu, Feb 22, 2024 at 10:39:41AM -0800, Dave Hansen wrote:
> >> From: Dave Hansen <dave.hansen@linux.intel.com>
> >>
> >> Intel also does memory encryption and also fiddles with the physical
> >> address bits.  This is currently called for *each* CPU, but practically
> >> only done on the boot CPU because of 'mktme_status'.
> >>
> >> Move it from the "each CPU" ->c_init() function to ->c_bsp_init() where
> >> the whole thing only gets called once ever.  This also necessitates moving
> >> detect_tme() and its entourage around in the file.
> > The state machine around mktme_state doesn't make sense if we only call it
> > on boot CPU, so detect_tme() can be drastically simplified. I can do it on
> > top of the patchset.
> 
> That would be great.  Looking at it again, the (tme_activate !=
> tme_activate_cpu0) block is total cruft now.  It probably just needs to
> get moved to secondary CPU startup.

I have never seen the check be useful. I think it can just be dropped.

The patch below makes detect_tme() only enumerate TME and MKTME. And
report number of keyid bits. Kernel doesn't care about anything else.

Any comments?

From 1080535093d21f025d46fd610de5ad788591f4b5 Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Date: Mon, 26 Feb 2024 14:01:01 +0200
Subject: [PATCH] x86/cpu/intel: Simplify detect_tme()

The detect_tme() function is now only called by the boot CPU. The logic
for cross-checking TME configuration between CPUs is no longer used. It
has never identified a real problem and can be safely removed.

The kernel does not use MKTME and is not concerned with MKTME policy or
encryption algorithms. There is no need to check them.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/kernel/cpu/intel.c | 44 ++-----------------------------------
 1 file changed, 2 insertions(+), 42 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 4192aa4576f4..60918b49344c 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -329,55 +329,20 @@ static void early_init_intel(struct cpuinfo_x86 *c)
 #define TME_ACTIVATE_CRYPTO_ALGS(x)	((x >> 48) & 0xffff)	/* Bits 63:48 */
 #define TME_ACTIVATE_CRYPTO_AES_XTS_128	1
 
-/* Values for mktme_status (SW only construct) */
-#define MKTME_ENABLED			0
-#define MKTME_DISABLED			1
-#define MKTME_UNINITIALIZED		2
-static int mktme_status = MKTME_UNINITIALIZED;
-
 static int detect_tme(struct cpuinfo_x86 *c)
 {
-	u64 tme_activate, tme_policy, tme_crypto_algs;
-	int keyid_bits = 0, nr_keyids = 0;
-	static u64 tme_activate_cpu0 = 0;
+	int keyid_bits, nr_keyids;
+	u64 tme_activate;
 
 	rdmsrl(MSR_IA32_TME_ACTIVATE, tme_activate);
 
-	if (mktme_status != MKTME_UNINITIALIZED) {
-		if (tme_activate != tme_activate_cpu0) {
-			/* Broken BIOS? */
-			pr_err_once("x86/tme: configuration is inconsistent between CPUs\n");
-			pr_err_once("x86/tme: MKTME is not usable\n");
-			mktme_status = MKTME_DISABLED;
-
-			/* Proceed. We may need to exclude bits from x86_phys_bits. */
-		}
-	} else {
-		tme_activate_cpu0 = tme_activate;
-	}
-
 	if (!TME_ACTIVATE_LOCKED(tme_activate) || !TME_ACTIVATE_ENABLED(tme_activate)) {
 		pr_info_once("x86/tme: not enabled by BIOS\n");
-		mktme_status = MKTME_DISABLED;
 		return 0;
 	}
 
-	if (mktme_status != MKTME_UNINITIALIZED)
-		goto detect_keyid_bits;
-
 	pr_info("x86/tme: enabled by BIOS\n");
 
-	tme_policy = TME_ACTIVATE_POLICY(tme_activate);
-	if (tme_policy != TME_ACTIVATE_POLICY_AES_XTS_128)
-		pr_warn("x86/tme: Unknown policy is active: %#llx\n", tme_policy);
-
-	tme_crypto_algs = TME_ACTIVATE_CRYPTO_ALGS(tme_activate);
-	if (!(tme_crypto_algs & TME_ACTIVATE_CRYPTO_AES_XTS_128)) {
-		pr_err("x86/mktme: No known encryption algorithm is supported: %#llx\n",
-				tme_crypto_algs);
-		mktme_status = MKTME_DISABLED;
-	}
-detect_keyid_bits:
 	keyid_bits = TME_ACTIVATE_KEYID_BITS(tme_activate);
 	nr_keyids = (1UL << keyid_bits) - 1;
 	if (nr_keyids) {
@@ -387,11 +352,6 @@ static int detect_tme(struct cpuinfo_x86 *c)
 		pr_info_once("x86/mktme: disabled by BIOS\n");
 	}
 
-	if (mktme_status == MKTME_UNINITIALIZED) {
-		/* MKTME is usable */
-		mktme_status = MKTME_ENABLED;
-	}
-
 	return keyid_bits;
 }
 
-- 
  Kiryl Shutsemau / Kirill A. Shutemov

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [RFC][PATCH 06/34] x86/boot: Use consistent value for iomem_resource.end
  2024-02-22 18:39 ` [RFC][PATCH 06/34] x86/boot: Use consistent value for iomem_resource.end Dave Hansen
@ 2024-02-27 10:59   ` Huang, Kai
  2024-02-28 14:22     ` Zhang, Rui
  0 siblings, 1 reply; 54+ messages in thread
From: Huang, Kai @ 2024-02-27 10:59 UTC (permalink / raw)
  To: linux-kernel, dave.hansen; +Cc: kirill.shutemov, tglx, pbonzini, x86, bp

On Thu, 2024-02-22 at 10:39 -0800, Dave Hansen wrote:
> From: Dave Hansen <dave.hansen@linux.intel.com>
> 
> The 'struct cpuinfo_x86' values (including 'boot_cpu_data') get
> written and overwritten rather randomly.  They are not stable
> during early boot and readers end up getting a random mishmash
> of hard-coded defaults or CPUID-provided values based on when
> the values are read.
> 
> iomem_resource.end is one of these users.  Because of where it
> is called, it ended up seeing .x86_phys_bits==MAX_PHYSMEM_BITS
> which is (mostly) a compile-time default.  But
> iomem_resource.end is never updated if the runtime CPUID
> x86_phys_bits is lower.
> 
> Set iomem_resource.end to the compile-time value explicitly.
> It does not need to be precise as this is mostly to ensure
> that insane values can't be reserved in 'iomem_resource'.
> 
> Make MAX_PHYSMEM_BITS available outside of sparsemem
> configurations by removing the #ifdef CONFIG_SPARSEMEM in the
> header.
> 
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> ---
> 
>  b/arch/x86/include/asm/sparsemem.h |    3 ---
>  b/arch/x86/kernel/setup.c          |   10 +++++++++-
>  2 files changed, 9 insertions(+), 4 deletions(-)
> 
> diff -puN arch/x86/kernel/setup.c~iomem_resource_end arch/x86/kernel/setup.c
> --- a/arch/x86/kernel/setup.c~iomem_resource_end	2024-02-22 10:08:51.048554948 -0800
> +++ b/arch/x86/kernel/setup.c	2024-02-22 10:21:04.485531464 -0800
> @@ -51,6 +51,7 @@
>  #include <asm/pci-direct.h>
>  #include <asm/prom.h>
>  #include <asm/proto.h>
> +#include <asm/sparsemem.h>
>  #include <asm/thermal.h>
>  #include <asm/unwind.h>
>  #include <asm/vsyscall.h>
> @@ -813,7 +814,14 @@ void __init setup_arch(char **cmdline_p)
>  	 */
>  	early_reserve_memory();
>  
> -	iomem_resource.end = (1ULL << x86_phys_bits()) - 1;
> +	/*
> +	 * This was too big before.  It ended up getting MAX_PHYSMEM_BITS
> +	 * even if .x86_phys_bits was eventually lowered below that.
> +	 * But that was evidently harmless, so leave it too big, but
> +	 * set it explicitly to MAX_PHYSMEM_BITS instead of taking a
> +	 * trip through .x86_phys_bits.
> +	 */
> +	iomem_resource.end = (1ULL << MAX_PHYSMEM_BITS) - 1;

Paolo's patchset to move MKTME keyid bits detection to early_cpu_init() was
merged to tip:x86/urgent, so it looks like it will land in Linus's tree
before this series:

https://lore.kernel.org/lkml/eff34df2-fdc1-4ee0-bb8d-90da386b7cb6@intel.com/T/

Paolo's series actually moves the reduction of x86_phys_bits to before
setting iomem_resource.end here, so after rebasing the changelog/comment
no longer seems to apply.

Perhaps we can get rid of this patch and just set iomem_resource.end based on
x86_phys_bits()?


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC][PATCH 04/34] x86/mm: Introduce physical address limit helper
  2024-02-22 18:39 ` [RFC][PATCH 04/34] x86/mm: Introduce physical address limit helper Dave Hansen
@ 2024-02-27 11:05   ` Huang, Kai
  0 siblings, 0 replies; 54+ messages in thread
From: Huang, Kai @ 2024-02-27 11:05 UTC (permalink / raw)
  To: linux-kernel, dave.hansen
  Cc: kirill.shutemov, tglx, pbonzini, rafael, lenb, x86, bp

On Thu, 2024-02-22 at 10:39 -0800, Dave Hansen wrote:
> From: Dave Hansen <dave.hansen@linux.intel.com>
> 
> Modern processors enumerate the sizes of the physical and virtual address
> spaces that they can handle.  The kernel stashes this information for each
> CPU in 'cpuinfo_x86'.
> 
> Like a lot of system-wide configuration, in practice the kernel only uses
> this information from the boot CPU.  That makes a lot of sense because
> the kernel can't practically support two CPUs with such different
> fundamental support as the size of the address spaces.
> 
> Having a per-cpu copy of this data is silly.  It is, at best, a waste
> of space to keep it around for the non-boot CPUs.  At worst, it is yet
> another bit of data that must be initialized in a particular order and
> can be a source of bugs.
> 
> Introduce a helper to look up the number of supported physical address bits:
> 
> 	x86_phys_bits()
> 
> Replace most open-coded references to boot_cpu_data.x86_phys_bits.
> 
> This form is more compact and also opens up the door to adding some
> centralized checking and enforcing rules around this important system-
> wide value.
> 
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Cc: Len Brown <lenb@kernel.org>
> ---
> 

Reviewed-by: Kai Huang <kai.huang@intel.com>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC][PATCH 05/34] x86/cpu: Move /proc per-cpu ->x86_phys_bits reference to global one
  2024-02-22 18:39 ` [RFC][PATCH 05/34] x86/cpu: Move /proc per-cpu ->x86_phys_bits reference to global one Dave Hansen
@ 2024-02-27 11:05   ` Huang, Kai
  0 siblings, 0 replies; 54+ messages in thread
From: Huang, Kai @ 2024-02-27 11:05 UTC (permalink / raw)
  To: linux-kernel, dave.hansen; +Cc: kirill.shutemov, tglx, pbonzini, x86, bp

On Thu, 2024-02-22 at 10:39 -0800, Dave Hansen wrote:
> From: Dave Hansen <dave.hansen@linux.intel.com>
> 
> This is one of the few references to the per-cpu version of
> ->x86_phys_bits.  It is theoretically possible to have this value vary
> from CPU to CPU.  But in practice nobody builds systems that way.
> 
> Continue outputting the value in /proc, but read it from the global
> configuration.
> 
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> ---
> 

Reviewed-by: Kai Huang <kai.huang@intel.com>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC][PATCH 07/34] x86/mm: Introduce virtual address space limit helper
  2024-02-22 18:39 ` [RFC][PATCH 07/34] x86/mm: Introduce virtual address space limit helper Dave Hansen
@ 2024-02-27 11:09   ` Huang, Kai
  0 siblings, 0 replies; 54+ messages in thread
From: Huang, Kai @ 2024-02-27 11:09 UTC (permalink / raw)
  To: linux-kernel, dave.hansen; +Cc: kirill.shutemov, tglx, pbonzini, x86, bp

On Thu, 2024-02-22 at 10:39 -0800, Dave Hansen wrote:
> From: Dave Hansen <dave.hansen@linux.intel.com>
> 
> This uses the same logic and approach which were used for the physical
> address limits with x86_phys_bits() and extends them to the virtual
> address space.
> 
> Introduce a system-wide helper for users to query the size of the
> virtual address space: x86_virt_bits()
> 
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> ---
> 

Reviewed-by: Kai Huang <kai.huang@intel.com>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC][PATCH 11/34] x86/cpu/intel: Prepare MKTME for "address configuration" infrastructure
  2024-02-26 12:14       ` Kirill A. Shutemov
@ 2024-02-27 21:48         ` Huang, Kai
  2024-02-28 15:20           ` Kirill A. Shutemov
  0 siblings, 1 reply; 54+ messages in thread
From: Huang, Kai @ 2024-02-27 21:48 UTC (permalink / raw)
  To: Kirill A. Shutemov, Dave Hansen
  Cc: Dave Hansen, linux-kernel, pbonzini, tglx, x86, bp



On 27/02/2024 1:14 am, Kirill A. Shutemov wrote:
> On Fri, Feb 23, 2024 at 08:22:16AM -0800, Dave Hansen wrote:
>> On 2/23/24 03:33, Kirill A. Shutemov wrote:
>>> On Thu, Feb 22, 2024 at 10:39:41AM -0800, Dave Hansen wrote:
>>>> From: Dave Hansen <dave.hansen@linux.intel.com>
>>>>
>>>> Intel also does memory encryption and also fiddles with the physical
>>>> address bits.  This is currently called for *each* CPU, but practically
>>>> only done on the boot CPU because of 'mktme_status'.
>>>>
>>>> Move it from the "each CPU" ->c_init() function to ->c_bsp_init() where
>>>> the whole thing only gets called once ever.  This also necessitates moving
>>>> detect_tme() and its entourage around in the file.
>>> The state machine around mktme_state doesn't make sense if we only call it
>>> on boot CPU, so detect_tme() can be drastically simplified. I can do it on
>>> top of the patchset.
>>
>> That would be great.  Looking at it again, the (tme_activate !=
>> tme_activate_cpu0) block is total cruft now.  It probably just needs to
>> get moved to secondary CPU startup.
> 
> I have never seen the check be useful. I think it can just be dropped.
> 
> The patch below makes detect_tme() only enumerate TME and MKTME. And
> report number of keyid bits. Kernel doesn't care about anything else.
> 
> Any comments?
> 
>  From 1080535093d21f025d46fd610de5ad788591f4b5 Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Date: Mon, 26 Feb 2024 14:01:01 +0200
> Subject: [PATCH] x86/cpu/intel: Simplify detect_tme()
> 
> The detect_tme() function is now only called by the boot CPU. The logic
> for cross-checking TME configuration between CPUs is no longer used. It
> has never identified a real problem and can be safely removed.
> 
> The kernel does not use MKTME and is not concerned with MKTME policy or
> encryption algorithms. There is no need to check them.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>   arch/x86/kernel/cpu/intel.c | 44 ++-----------------------------------
>   1 file changed, 2 insertions(+), 42 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
> index 4192aa4576f4..60918b49344c 100644
> --- a/arch/x86/kernel/cpu/intel.c
> +++ b/arch/x86/kernel/cpu/intel.c
> @@ -329,55 +329,20 @@ static void early_init_intel(struct cpuinfo_x86 *c)
>   #define TME_ACTIVATE_CRYPTO_ALGS(x)	((x >> 48) & 0xffff)	/* Bits 63:48 */
>   #define TME_ACTIVATE_CRYPTO_AES_XTS_128	1
>   
> -/* Values for mktme_status (SW only construct) */
> -#define MKTME_ENABLED			0
> -#define MKTME_DISABLED			1
> -#define MKTME_UNINITIALIZED		2
> -static int mktme_status = MKTME_UNINITIALIZED;
> -
>   static int detect_tme(struct cpuinfo_x86 *c)
>   {
> -	u64 tme_activate, tme_policy, tme_crypto_algs;
> -	int keyid_bits = 0, nr_keyids = 0;
> -	static u64 tme_activate_cpu0 = 0;
> +	int keyid_bits, nr_keyids;
> +	u64 tme_activate;
>   
>   	rdmsrl(MSR_IA32_TME_ACTIVATE, tme_activate);
>   
> -	if (mktme_status != MKTME_UNINITIALIZED) {
> -		if (tme_activate != tme_activate_cpu0) {
> -			/* Broken BIOS? */
> -			pr_err_once("x86/tme: configuration is inconsistent between CPUs\n");
> -			pr_err_once("x86/tme: MKTME is not usable\n");
> -			mktme_status = MKTME_DISABLED;
> -
> -			/* Proceed. We may need to exclude bits from x86_phys_bits. */
> -		}
> -	} else {
> -		tme_activate_cpu0 = tme_activate;
> -	}
> -
>   	if (!TME_ACTIVATE_LOCKED(tme_activate) || !TME_ACTIVATE_ENABLED(tme_activate)) {
>   		pr_info_once("x86/tme: not enabled by BIOS\n");

Since detect_tme() is now only called on the BSP, it seems we can
change this to pr_info(), which is used ...
> -		mktme_status = MKTME_DISABLED;
>   		return 0;
>   	}
>   
> -	if (mktme_status != MKTME_UNINITIALIZED)
> -		goto detect_keyid_bits;
> -
>   	pr_info("x86/tme: enabled by BIOS\n");

... right here anyway.

>   
> -	tme_policy = TME_ACTIVATE_POLICY(tme_activate);
> -	if (tme_policy != TME_ACTIVATE_POLICY_AES_XTS_128)
> -		pr_warn("x86/tme: Unknown policy is active: %#llx\n", tme_policy);
> -
> -	tme_crypto_algs = TME_ACTIVATE_CRYPTO_ALGS(tme_activate);
> -	if (!(tme_crypto_algs & TME_ACTIVATE_CRYPTO_AES_XTS_128)) {
> -		pr_err("x86/mktme: No known encryption algorithm is supported: %#llx\n",
> -				tme_crypto_algs);
> -		mktme_status = MKTME_DISABLED;
> -	}
> -detect_keyid_bits:
>   	keyid_bits = TME_ACTIVATE_KEYID_BITS(tme_activate);
>   	nr_keyids = (1UL << keyid_bits) - 1;
>   	if (nr_keyids) {
> @@ -387,11 +352,6 @@ static int detect_tme(struct cpuinfo_x86 *c)
>   		pr_info_once("x86/mktme: disabled by BIOS\n");
>   	}
>   
> -	if (mktme_status == MKTME_UNINITIALIZED) {
> -		/* MKTME is usable */
> -		mktme_status = MKTME_ENABLED;
> -	}
> -
>   	return keyid_bits;

Other than that the code change LGTM.

Reviewed-by: Kai Huang <kai.huang@intel.com>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC][PATCH 09/34] x86/cpu: Introduce address configuration structure
  2024-02-22 18:39 ` [RFC][PATCH 09/34] x86/cpu: Introduce address configuration structure Dave Hansen
@ 2024-02-27 23:47   ` Huang, Kai
  2024-02-28 17:29     ` Dave Hansen
  0 siblings, 1 reply; 54+ messages in thread
From: Huang, Kai @ 2024-02-27 23:47 UTC (permalink / raw)
  To: Dave Hansen, linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp



On 23/02/2024 7:39 am, Dave Hansen wrote:
> 
> From: Dave Hansen <dave.hansen@linux.intel.com>
> 
> Right now, 'boot_cpu_data' is established in a very ad hoc way.  Random
> bits of the initialization code write random things to it, either
> blowing away state or tweaking it as they see fit.  It's madness.
> 
> Add some more structure to the process.
> 
> Introduce an "address configuration" structure just for the boot CPU.
> This will be used to establish system-wide address space configuration.
> It is written while bringing up the boot CPU and then read *ONCE* to
> establish the configuration.
> 


 From the above description, ...

> +struct x86_addr_config bsp_addr_config;
> +

... this can be __ro_after_init?

Maybe I am missing something, but if it cannot be annotated with 
__ro_after_init here, IMHO it would be helpful to call out in the changelog.


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC][PATCH 06/34] x86/boot: Use consistent value for iomem_resource.end
  2024-02-27 10:59   ` Huang, Kai
@ 2024-02-28 14:22     ` Zhang, Rui
  0 siblings, 0 replies; 54+ messages in thread
From: Zhang, Rui @ 2024-02-28 14:22 UTC (permalink / raw)
  To: linux-kernel, Huang, Kai, dave.hansen
  Cc: kirill.shutemov, tglx, pbonzini, x86, bp

On Tue, 2024-02-27 at 10:59 +0000, Huang, Kai wrote:
> On Thu, 2024-02-22 at 10:39 -0800, Dave Hansen wrote:
> > From: Dave Hansen <dave.hansen@linux.intel.com>
> > 
> > The 'struct cpuinfo_x86' values (including 'boot_cpu_data') get
> > written and overwritten rather randomly.  They are not stable
> > during early boot and readers end up getting a random mishmash
> > of hard-coded defaults or CPUID-provided values based on when
> > the values are read.
> > 
> > iomem_resource.end is one of these users.  Because of where it
> > is called, it ended up seeing .x86_phys_bits==MAX_PHYSMEM_BITS
> > which is (mostly) a compile-time default.  But
> > iomem_resource.end is never updated if the runtime CPUID
> > x86_phys_bits is lower.
> > 
> > Set iomem_resource.end to the compile-time value explicitly.
> > It does not need to be precise as this is mostly to ensure
> > that insane values can't be reserved in 'iomem_resource'.
> > 
> > Make MAX_PHYSMEM_BITS available outside of sparsemem
> > configurations by removing the #ifdef CONFIG_SPARSEMEM in the
> > header.
> > 
> > Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> > ---
> > 
> >  b/arch/x86/include/asm/sparsemem.h |    3 ---
> >  b/arch/x86/kernel/setup.c          |   10 +++++++++-
> >  2 files changed, 9 insertions(+), 4 deletions(-)
> > 
> > diff -puN arch/x86/kernel/setup.c~iomem_resource_end arch/x86/kernel/setup.c
> > --- a/arch/x86/kernel/setup.c~iomem_resource_end        2024-02-22 10:08:51.048554948 -0800
> > +++ b/arch/x86/kernel/setup.c   2024-02-22 10:21:04.485531464 -0800
> > @@ -51,6 +51,7 @@
> >  #include <asm/pci-direct.h>
> >  #include <asm/prom.h>
> >  #include <asm/proto.h>
> > +#include <asm/sparsemem.h>
> >  #include <asm/thermal.h>
> >  #include <asm/unwind.h>
> >  #include <asm/vsyscall.h>
> > @@ -813,7 +814,14 @@ void __init setup_arch(char **cmdline_p)
> >          */
> >         early_reserve_memory();
> >  
> > -       iomem_resource.end = (1ULL << x86_phys_bits()) - 1;
> > +       /*
> > +        * This was too big before.  It ended up getting MAX_PHYSMEM_BITS
> > +        * even if .x86_phys_bits was eventually lowered below that.
> > +        * But that was evidently harmless, so leave it too big, but
> > +        * set it explicitly to MAX_PHYSMEM_BITS instead of taking a
> > +        * trip through .x86_phys_bits.
> > +        */
> > +       iomem_resource.end = (1ULL << MAX_PHYSMEM_BITS) - 1;
> 
> Paolo's patchset to move MKTME keyid bits detection to
> early_cpu_init() was merged to tip:x86/urgent, so it looks like it
> will land in Linus's tree before this series:
> 
> https://lore.kernel.org/lkml/eff34df2-fdc1-4ee0-bb8d-90da386b7cb6@intel.com/T/
> 
> Paolo's series actually moves the reduction of x86_phys_bits to
> before setting iomem_resource.end here, so after rebasing the
> changelog/comment no longer seems to apply.

My understanding is that the order below always holds,
setup_arch()
	early_cpu_init()
		get_cpu_address_sizes()
	iomem_resource.end = (1ULL << x86_phys_bits()) - 1;
with or without the above patch.

> 
> Perhaps we can get rid of this patch and just set iomem_resource.end
> based on
> x86_phys_bits()?
> 
Agreed.

thanks,
rui

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC][PATCH 11/34] x86/cpu/intel: Prepare MKTME for "address configuration" infrastructure
  2024-02-27 21:48         ` Huang, Kai
@ 2024-02-28 15:20           ` Kirill A. Shutemov
  2024-02-28 16:57             ` Dave Hansen
  0 siblings, 1 reply; 54+ messages in thread
From: Kirill A. Shutemov @ 2024-02-28 15:20 UTC (permalink / raw)
  To: Huang, Kai
  Cc: Dave Hansen, Dave Hansen, linux-kernel, pbonzini, tglx, x86, bp

On Wed, Feb 28, 2024 at 10:48:19AM +1300, Huang, Kai wrote:
> >   	if (!TME_ACTIVATE_LOCKED(tme_activate) || !TME_ACTIVATE_ENABLED(tme_activate)) {
> >   		pr_info_once("x86/tme: not enabled by BIOS\n");
> 
> Since detect_tme() is now only called on the BSP, it seems we can
> change this to pr_info(), which is used ...

Right.

Dave, do you want a new version? Or can you fix it up on apply?

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC][PATCH 11/34] x86/cpu/intel: Prepare MKTME for "address configuration" infrastructure
  2024-02-28 15:20           ` Kirill A. Shutemov
@ 2024-02-28 16:57             ` Dave Hansen
  0 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-28 16:57 UTC (permalink / raw)
  To: Kirill A. Shutemov, Huang, Kai
  Cc: Dave Hansen, linux-kernel, pbonzini, tglx, x86, bp

On 2/28/24 07:20, Kirill A. Shutemov wrote:
> On Wed, Feb 28, 2024 at 10:48:19AM +1300, Huang, Kai wrote:
>>>   	if (!TME_ACTIVATE_LOCKED(tme_activate) || !TME_ACTIVATE_ENABLED(tme_activate)) {
>>>   		pr_info_once("x86/tme: not enabled by BIOS\n");
>> Since now detect_tme() is only called on BSP, seems we can change to
>> pr_info(), which is used ...
> Right.
> 
> Dave, do you want a new version? Or can you fix it up on apply?

I've already got a version of your patch pulled into my tree.  I did the
conversion there.  Thanks for asking!

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC][PATCH 09/34] x86/cpu: Introduce address configuration structure
  2024-02-27 23:47   ` Huang, Kai
@ 2024-02-28 17:29     ` Dave Hansen
  0 siblings, 0 replies; 54+ messages in thread
From: Dave Hansen @ 2024-02-28 17:29 UTC (permalink / raw)
  To: Huang, Kai, Dave Hansen, linux-kernel
  Cc: kirill.shutemov, pbonzini, tglx, x86, bp

On 2/27/24 15:47, Huang, Kai wrote:
> From the above description, ...
> 
>> +struct x86_addr_config bsp_addr_config;
>> +
> 
> ... this can be __ro_after_init?
> 
> Maybe I am missing something, but if it cannot be annotated with
> __ro_after_init here, IMHO it would be helpful to call out in the
> changelog.

It can actually be full-on __init ... eventually.  It goes *AWAY* after boot.

I'll add a note in the changelog about why it can't be until later in the series.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC][PATCH 01/34] x86/xen: Explain why calling get_cpu_cap() so early is a hack
  2024-02-22 18:39 ` [RFC][PATCH 01/34] x86/xen: Explain why calling get_cpu_cap() so early is a hack Dave Hansen
@ 2024-03-07 16:27   ` Borislav Petkov
  0 siblings, 0 replies; 54+ messages in thread
From: Borislav Petkov @ 2024-03-07 16:27 UTC (permalink / raw)
  To: Dave Hansen; +Cc: linux-kernel, kirill.shutemov, pbonzini, tglx, x86, jgross

On Thu, Feb 22, 2024 at 10:39:28AM -0800, Dave Hansen wrote:
> 
> From: Dave Hansen <dave.hansen@linux.intel.com>
> 
> xen_start_kernel() some of the first C code run "Xen PV" systems.

That sentence reads weird.

> It precedes normally very early things like setup_arch() or the
> processor initialization code.
> 
> That means that 'boot_cpu_data' is garbage.  It has not even

s/is/contains/

> established the utter basics like if the CPU supports the CPUID
> instruction.  Unfortunately get_cpu_cap() requires this exact
> information.
> 
> Nevertheless xen_start_kernel() calls get_cpu_cap().  But it
> works out in practice because it's looking for the NX bit which
> comes from an extended CPUID leaf that doesn't depend on
> c->cpuid_level being set.  This also implicitly assumes that
> Xen PV guests support CPUID.
> 
> Leave the hack in place, but at least explain some of what is
> going on.
> 
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Juergen Gross <jgross@suse.com>
> ---
> 
>  b/arch/x86/xen/enlighten_pv.c |    6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff -puN arch/x86/xen/enlighten_pv.c~xen-explain1 arch/x86/xen/enlighten_pv.c
> --- a/arch/x86/xen/enlighten_pv.c~xen-explain1	2024-02-22 10:08:48.404451146 -0800
> +++ b/arch/x86/xen/enlighten_pv.c	2024-02-22 10:08:48.404451146 -0800
> @@ -1372,7 +1372,11 @@ asmlinkage __visible void __init xen_sta
>  	/* Get mfn list */
>  	xen_build_dynamic_phys_to_machine();
>  
> -	/* Work out if we support NX */
> +	/*
> +	 * This is a hack.  'boot_cpu_data' is not filled out enough
> +	 * for get_cpu_cap() to function fully.  But it _can_ fill out
> +	 * the leaves where NX is.  Gross, but it works.
> +	 */
>  	get_cpu_cap(&boot_cpu_data);
>  	x86_configure_nx();

If all it needs is to figure out whether it supports NX, just do that by
foot here and get rid of the get_cpu_cap(&boot_cpu_data) call
completely.

It won't be a hack anymore.
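
Something like this, perhaps (an untested sketch; it assumes the
extended CPUID leaf is present, which should hold for anything that
can run Xen PV):

	/* NX is enumerated in leaf 0x80000001, EDX bit 20: */
	if (cpuid_eax(0x80000000) >= 0x80000001 &&
	    (cpuid_edx(0x80000001) & BIT(20)))
		setup_force_cpu_cap(X86_FEATURE_NX);
	x86_configure_nx();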

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC][PATCH 02/34] x86/xen: Remove early "debug" physical address lookups
  2024-02-22 18:39 ` [RFC][PATCH 02/34] x86/xen: Remove early "debug" physical address lookups Dave Hansen
@ 2024-03-07 17:01   ` Borislav Petkov
  2024-03-11 13:16   ` Jürgen Groß
  1 sibling, 0 replies; 54+ messages in thread
From: Borislav Petkov @ 2024-03-07 17:01 UTC (permalink / raw)
  To: Dave Hansen; +Cc: linux-kernel, kirill.shutemov, pbonzini, tglx, x86, jgross

On Thu, Feb 22, 2024 at 10:39:29AM -0800, Dave Hansen wrote:
> 
> From: Dave Hansen <dave.hansen@linux.intel.com>
> 
> The __pa() facility is subject to debugging checks if CONFIG_DEBUG_VIRTUAL=y.
> One of those debugging checks is whether the physical address is valid
> on the platform.  That information is normally available via CPUID.  But
> the __pa() code currently looks it up in 'boot_cpu_data' which is not
> fully set up in early Xen PV boot.
> 
> The Xen PV code currently tries to get this info with
> get_cpu_address_sizes() which also depends on 'boot_cpu_data' to be at
> least somewhat set up.  The result is that the c->x86_phys_bits gets a
> sane value, but not one that has anything to do with the hardware.  In
> other words, the CONFIG_DEBUG_VIRTUAL checks are performed with what
> amounts to garbage inputs.
> 
> Garbage checks are worse than no check at all.  Move over to the
> "nodebug" variant to axe the checks.
> 
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Juergen Gross <jgross@suse.com>
> ---
> 
>  b/arch/x86/xen/enlighten_pv.c |    2 +-
>  b/arch/x86/xen/mmu_pv.c       |    2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>

At least as far as I can follow the Xen boot flow, yap, this makes
sense.

But let's have Jürgen confirm first.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC][PATCH 03/34] x86/pci: Assume that clflush size is always provided
  2024-02-22 18:39 ` [RFC][PATCH 03/34] x86/pci: Assume that clflush size is always provided Dave Hansen
@ 2024-03-08  8:57   ` Borislav Petkov
  0 siblings, 0 replies; 54+ messages in thread
From: Borislav Petkov @ 2024-03-08  8:57 UTC (permalink / raw)
  To: Dave Hansen; +Cc: linux-kernel, kirill.shutemov, pbonzini, tglx, x86

On Thu, Feb 22, 2024 at 10:39:30AM -0800, Dave Hansen wrote:
> 
> From: Dave Hansen <dave.hansen@linux.intel.com>
> 
> The early boot code always sets _some_ clflush size.  Use that fact

s/clflush/CLFLUSH/g

as it is an insn mnemonic.

> to avoid handling the case where it is not set.
> 
> There may have been a time when the Xen PV call in here way too
> early.  But it calls get_cpu_address_sizes() before calling here
> now.  It should also be safe.
> 
> Note: This series will eventually return sane defaults even very
> early in boot.  I believe this is safe now, but it becomes *really*
> safe later on.

This should probably be ...

> 
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> ---

<--- ... here as when applied, "this series" means nothing.

With that:

Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC][PATCH 02/34] x86/xen: Remove early "debug" physical address lookups
  2024-02-22 18:39 ` [RFC][PATCH 02/34] x86/xen: Remove early "debug" physical address lookups Dave Hansen
  2024-03-07 17:01   ` Borislav Petkov
@ 2024-03-11 13:16   ` Jürgen Groß
  1 sibling, 0 replies; 54+ messages in thread
From: Jürgen Groß @ 2024-03-11 13:16 UTC (permalink / raw)
  To: Dave Hansen, linux-kernel; +Cc: kirill.shutemov, pbonzini, tglx, x86, bp

On 22.02.24 19:39, Dave Hansen wrote:
> From: Dave Hansen <dave.hansen@linux.intel.com>
> 
> The __pa() facility is subject to debugging checks if CONFIG_DEBUG_VIRTUAL=y.
> One of those debugging checks is whether the physical address is valid
> on the platform.  That information is normally available via CPUID.  But
> the __pa() code currently looks it up in 'boot_cpu_data' which is not
> fully set up in early Xen PV boot.
> 
> The Xen PV code currently tries to get this info with
> get_cpu_address_sizes(), which also depends on 'boot_cpu_data' being at
> least somewhat set up.  The result is that c->x86_phys_bits gets a
> sane value, but not one that has anything to do with the hardware.  In
> other words, the CONFIG_DEBUG_VIRTUAL checks are performed with what
> amounts to garbage inputs.
> 
> Garbage checks are worse than no check at all.  Move over to the
> "nodebug" variant to axe the checks.
> 
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Juergen Gross <jgross@suse.com>

Reviewed-by: Juergen Gross <jgross@suse.com>


Juergen



end of thread

Thread overview: 54+ messages
2024-02-22 18:39 [RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data Dave Hansen
2024-02-22 18:39 ` [RFC][PATCH 01/34] x86/xen: Explain why calling get_cpu_cap() so early is a hack Dave Hansen
2024-03-07 16:27   ` Borislav Petkov
2024-02-22 18:39 ` [RFC][PATCH 02/34] x86/xen: Remove early "debug" physical address lookups Dave Hansen
2024-03-07 17:01   ` Borislav Petkov
2024-03-11 13:16   ` Jürgen Groß
2024-02-22 18:39 ` [RFC][PATCH 03/34] x86/pci: Assume that clflush size is always provided Dave Hansen
2024-03-08  8:57   ` Borislav Petkov
2024-02-22 18:39 ` [RFC][PATCH 04/34] x86/mm: Introduce physical address limit helper Dave Hansen
2024-02-27 11:05   ` Huang, Kai
2024-02-22 18:39 ` [RFC][PATCH 05/34] x86/cpu: Move /proc per-cpu ->x86_phys_bits reference to global one Dave Hansen
2024-02-27 11:05   ` Huang, Kai
2024-02-22 18:39 ` [RFC][PATCH 06/34] x86/boot: Use consistent value for iomem_resource.end Dave Hansen
2024-02-27 10:59   ` Huang, Kai
2024-02-28 14:22     ` Zhang, Rui
2024-02-22 18:39 ` [RFC][PATCH 07/34] x86/mm: Introduce virtual address space limit helper Dave Hansen
2024-02-27 11:09   ` Huang, Kai
2024-02-22 18:39 ` [RFC][PATCH 08/34] x86/cpu: Add CLFLUSH size helper Dave Hansen
2024-02-22 18:39 ` [RFC][PATCH 09/34] x86/cpu: Introduce address configuration structure Dave Hansen
2024-02-27 23:47   ` Huang, Kai
2024-02-28 17:29     ` Dave Hansen
2024-02-22 18:39 ` [RFC][PATCH 10/34] x86/cpu/amd: Use new "address configuration" infrastructure Dave Hansen
2024-02-22 18:39 ` [RFC][PATCH 11/34] x86/cpu/intel: Prepare MKTME for " Dave Hansen
2024-02-23 11:33   ` Kirill A. Shutemov
2024-02-23 16:22     ` Dave Hansen
2024-02-26 12:14       ` Kirill A. Shutemov
2024-02-27 21:48         ` Huang, Kai
2024-02-28 15:20           ` Kirill A. Shutemov
2024-02-28 16:57             ` Dave Hansen
2024-02-22 18:39 ` [RFC][PATCH 12/34] x86/cpu/intel: Actually use "address configuration" infrastructure for MKTME Dave Hansen
2024-02-23 11:41   ` Kirill A. Shutemov
2024-02-23 16:16     ` Dave Hansen
2024-02-22 18:39 ` [RFC][PATCH 13/34] x86/boot: Use address reduction config to handle erratum Dave Hansen
2024-02-22 18:39 ` [RFC][PATCH 14/34] x86/cpu: Remove default physical address space settings Dave Hansen
2024-02-22 18:39 ` [RFC][PATCH 15/34] x86/cpu: Remove default x86_phys_bits assignment Dave Hansen
2024-02-22 18:39 ` [RFC][PATCH 16/34] x86/cpu: Move physical address limit out of cpuinfo_x86 Dave Hansen
2024-02-22 18:39 ` [RFC][PATCH 17/34] x86/cpu: Move virtual " Dave Hansen
2024-02-22 18:39 ` [RFC][PATCH 18/34] x86/cpu/centaur: Move cache alignment override to BSP init Dave Hansen
2024-02-22 18:39 ` [RFC][PATCH 19/34] x86/cpu: Introduce cache alignment multiplier Dave Hansen
2024-02-22 18:39 ` [RFC][PATCH 20/34] x86/cpu: Remove superfluous cache alignment assignments Dave Hansen
2024-02-22 18:39 ` [RFC][PATCH 21/34] x86/cpu: Consolidate CLFLUSH size setting Dave Hansen
2024-02-22 18:39 ` [RFC][PATCH 22/34] x86/cpu: Move CLFLUSH size into global config Dave Hansen
2024-02-22 18:39 ` [RFC][PATCH 23/34] x86/cpu: Move cache alignment configuration to global struct Dave Hansen
2024-02-22 18:39 ` [RFC][PATCH 24/34] x86/cpu: Establish 'min_cache_bits' configuration Dave Hansen
2024-02-22 18:39 ` [RFC][PATCH 25/34] x86/cpu: Move cache bits to global config Dave Hansen
2024-02-22 18:40 ` [RFC][PATCH 26/34] x86/cpu: Zap superfluous get_cpu_address_sizes() Dave Hansen
2024-02-22 18:40 ` [RFC][PATCH 27/34] x86/cpu: Enforce read-only x86_config state (lightly) Dave Hansen
2024-02-22 18:40 ` [RFC][PATCH 28/34] x86/cpu: Return sane defaults for early x86_config reads Dave Hansen
2024-02-22 18:40 ` [RFC][PATCH 29/34] x86/xen: Remove extra get_cpu_address_sizes() call Dave Hansen
2024-02-22 18:40 ` [RFC][PATCH 30/34] x86/cpu/centaur: Mark BSP init function as __init Dave Hansen
2024-02-22 18:40 ` [RFC][PATCH 31/34] x86/cpu/intel: " Dave Hansen
2024-02-22 18:40 ` [RFC][PATCH 32/34] x86/cpu/amd: Move memory encryption detection Dave Hansen
2024-02-22 18:40 ` [RFC][PATCH 33/34] x86/cpu: Make get_cpu_address_sizes() static and __init Dave Hansen
2024-02-22 18:40 ` [RFC][PATCH 34/34] x86/cpu: Mark new boot CPU and config structures appropriately Dave Hansen
