linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [patch v2 00/38] x86/cpu: Rework the topology evaluation
@ 2023-07-28 12:12 Thomas Gleixner
  2023-07-28 12:12 ` [patch v2 01/38] x86/cpu: Encapsulate topology information in cpuinfo_x86 Thomas Gleixner
                   ` (39 more replies)
  0 siblings, 40 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:12 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

Hi!

This is the follow up to V1:

  https://lore.kernel.org/lkml/20230724155329.474037902@linutronix.de

which addresses the review feedback and some minor fallout I observed in my
testing of the work based on top.

TLDR:

This reworks the way how topology information is evaluated via CPUID
in preparation for a larger topology management overhaul to address
shortcomings of the current code vs. hybrid systems and systems which make
use of the extended topology domains in leaf 0x1f. Aside of that it's an
overdue spring cleaning to get rid of accumulated layers of duct tape and
haywire.

What changed vs. V1:

  - Fixed an issue vs. the logical die/pkg management as the current
    code (ab)uses cpuinfo for persistant storage.

  - Consolidated APIC ID usage on u32 and ditched the u16 limitation

  - Addressed the review feedback from Peter and Arjan

  - Added a new patch which gets rid of XENPV fiddling in the cpuinfo
    state. That needs some testing on XENPV obviously. The relevant
    patches are #22 and #37

I did not pick up any of the tested by tags yet. I hope people can run it
once more. Neither did I add the Ack from Peter.

The series is based on the APIC cleanup series:

  https://lore.kernel.org/lkml/20230724131206.500814398@linutronix.de

and also available on top of that from git:

 git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git topo-cpuid-v2

Thanks,

	tglx
---
 Documentation/arch/x86/topology.rst       |   12 -
 a/arch/x86/kernel/cpu/topology.c          |  168 ---------------------
 arch/x86/events/amd/core.c                |    2 
 arch/x86/events/amd/uncore.c              |    2 
 arch/x86/events/intel/uncore.c            |    2 
 arch/x86/hyperv/hv_vtl.c                  |    2 
 arch/x86/include/asm/apic.h               |   32 +---
 arch/x86/include/asm/cacheinfo.h          |    3 
 arch/x86/include/asm/cpuid.h              |   32 ++++
 arch/x86/include/asm/mpspec.h             |    2 
 arch/x86/include/asm/processor.h          |   60 +++++--
 arch/x86/include/asm/smp.h                |    4 
 arch/x86/include/asm/topology.h           |   51 +++++-
 arch/x86/include/asm/x86_init.h           |    2 
 arch/x86/kernel/acpi/boot.c               |    4 
 arch/x86/kernel/amd_nb.c                  |    8 -
 arch/x86/kernel/apic/apic.c               |   14 -
 arch/x86/kernel/apic/apic_common.c        |    4 
 arch/x86/kernel/apic/apic_flat_64.c       |   13 -
 arch/x86/kernel/apic/apic_noop.c          |    9 -
 arch/x86/kernel/apic/apic_numachip.c      |   21 --
 arch/x86/kernel/apic/bigsmp_32.c          |   10 -
 arch/x86/kernel/apic/local.h              |    6 
 arch/x86/kernel/apic/probe_32.c           |   10 -
 arch/x86/kernel/apic/x2apic_cluster.c     |    1 
 arch/x86/kernel/apic/x2apic_phys.c        |   10 -
 arch/x86/kernel/apic/x2apic_uv_x.c        |   67 +-------
 arch/x86/kernel/cpu/Makefile              |    5 
 arch/x86/kernel/cpu/amd.c                 |  156 --------------------
 arch/x86/kernel/cpu/cacheinfo.c           |   51 ++----
 arch/x86/kernel/cpu/centaur.c             |    4 
 arch/x86/kernel/cpu/common.c              |  111 +-------------
 arch/x86/kernel/cpu/cpu.h                 |   14 +
 arch/x86/kernel/cpu/hygon.c               |  133 -----------------
 arch/x86/kernel/cpu/intel.c               |   38 ----
 arch/x86/kernel/cpu/mce/amd.c             |    4 
 arch/x86/kernel/cpu/mce/apei.c            |    4 
 arch/x86/kernel/cpu/mce/core.c            |    4 
 arch/x86/kernel/cpu/mce/inject.c          |    7 
 arch/x86/kernel/cpu/proc.c                |    8 -
 arch/x86/kernel/cpu/zhaoxin.c             |   18 --
 arch/x86/kernel/kvm.c                     |    6 
 arch/x86/kernel/sev.c                     |    2 
 arch/x86/kernel/smpboot.c                 |   97 +++++++-----
 arch/x86/kernel/vsmp_64.c                 |   13 -
 arch/x86/mm/amdtopology.c                 |   35 ++--
 arch/x86/mm/numa.c                        |    4 
 arch/x86/xen/apic.c                       |   14 -
 arch/x86/xen/smp_pv.c                     |    3 
 b/arch/x86/kernel/cpu/debugfs.c           |   97 ++++++++++++
 b/arch/x86/kernel/cpu/topology.h          |   51 ++++++
 b/arch/x86/kernel/cpu/topology_amd.c      |  179 +++++++++++++++++++++++
 b/arch/x86/kernel/cpu/topology_common.c   |  233 ++++++++++++++++++++++++++++++
 b/arch/x86/kernel/cpu/topology_ext.c      |  136 +++++++++++++++++
 drivers/edac/amd64_edac.c                 |    4 
 drivers/edac/mce_amd.c                    |    4 
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c |    2 
 drivers/hwmon/fam15h_power.c              |    7 
 drivers/scsi/lpfc/lpfc_init.c             |    8 -
 drivers/virt/acrn/hsm.c                   |    2 
 60 files changed, 1049 insertions(+), 956 deletions(-)


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 01/38] x86/cpu: Encapsulate topology information in cpuinfo_x86
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
@ 2023-07-28 12:12 ` Thomas Gleixner
  2023-07-28 12:12 ` [patch v2 02/38] x86/cpu: Move phys_proc_id into topology info Thomas Gleixner
                   ` (38 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:12 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

The topology related information is randomly scattered across cpuinfo_x86.

Create a new structure cpuinfo_topo and move in a first step initial_apicid
and apicid into it.

Aside of being better readable this is in preparation for replacing the
horribly fragile CPU topology evaluation code further down the road.

Consolidate APIC ID fields to u32 as that represents the hardware type.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/processor.h          |   14 +++++++++-----
 arch/x86/kernel/cpu/amd.c                 |   10 +++++-----
 arch/x86/kernel/cpu/cacheinfo.c           |   20 ++++++++++----------
 arch/x86/kernel/cpu/common.c              |   18 +++++++++---------
 arch/x86/kernel/cpu/hygon.c               |   12 ++++++------
 arch/x86/kernel/cpu/mce/apei.c            |    2 +-
 arch/x86/kernel/cpu/mce/core.c            |    2 +-
 arch/x86/kernel/cpu/proc.c                |    4 ++--
 arch/x86/kernel/cpu/topology.c            |   12 ++++++------
 arch/x86/xen/apic.c                       |    2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c |    2 +-
 drivers/virt/acrn/hsm.c                   |    2 +-
 12 files changed, 52 insertions(+), 48 deletions(-)

--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -74,11 +74,16 @@ extern u16 __read_mostly tlb_lld_4m[NR_I
 extern u16 __read_mostly tlb_lld_1g[NR_INFO];
 
 /*
- *  CPU type and hardware bug flags. Kept separately for each CPU.
- *  Members of this structure are referenced in head_32.S, so think twice
- *  before touching them. [mj]
+ * CPU type and hardware bug flags. Kept separately for each CPU.
  */
 
+struct cpuinfo_topology {
+	// Real APIC ID read from the local APIC
+	u32			apicid;
+	// The initial APIC ID provided by CPUID
+	u32			initial_apicid;
+};
+
 struct cpuinfo_x86 {
 	__u8			x86;		/* CPU family */
 	__u8			x86_vendor;	/* CPU vendor */
@@ -111,6 +116,7 @@ struct cpuinfo_x86 {
 	};
 	char			x86_vendor_id[16];
 	char			x86_model_id[64];
+	struct cpuinfo_topology	topo;
 	/* in KB - valid for CPUS which support this call: */
 	unsigned int		x86_cache_size;
 	int			x86_cache_alignment;	/* In bytes */
@@ -124,8 +130,6 @@ struct cpuinfo_x86 {
 	u64			ppin;
 	/* cpuid returned max cores value: */
 	u16			x86_max_cores;
-	u16			apicid;
-	u16			initial_apicid;
 	u16			x86_clflush_size;
 	/* number of cores as seen by the OS: */
 	u16			booted_cores;
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -387,9 +387,9 @@ static void amd_detect_cmp(struct cpuinf
 
 	bits = c->x86_coreid_bits;
 	/* Low order bits define the core id (index of core in socket) */
-	c->cpu_core_id = c->initial_apicid & ((1 << bits)-1);
+	c->cpu_core_id = c->topo.initial_apicid & ((1 << bits)-1);
 	/* Convert the initial APIC ID into the socket ID */
-	c->phys_proc_id = c->initial_apicid >> bits;
+	c->phys_proc_id = c->topo.initial_apicid >> bits;
 	/* use socket ID also for last level cache */
 	per_cpu(cpu_llc_id, cpu) = c->cpu_die_id = c->phys_proc_id;
 }
@@ -405,7 +405,7 @@ static void srat_detect_node(struct cpui
 #ifdef CONFIG_NUMA
 	int cpu = smp_processor_id();
 	int node;
-	unsigned apicid = c->apicid;
+	unsigned apicid = c->topo.apicid;
 
 	node = numa_cpu_node(cpu);
 	if (node == NUMA_NO_NODE)
@@ -439,7 +439,7 @@ static void srat_detect_node(struct cpui
 		 * through CPU mapping may alter the outcome, directly
 		 * access __apicid_to_node[].
 		 */
-		int ht_nodeid = c->initial_apicid;
+		int ht_nodeid = c->topo.initial_apicid;
 
 		if (__apicid_to_node[ht_nodeid] != NUMA_NO_NODE)
 			node = __apicid_to_node[ht_nodeid];
@@ -934,7 +934,7 @@ static void init_amd(struct cpuinfo_x86
 		set_cpu_cap(c, X86_FEATURE_FSRS);
 
 	/* get apicid instead of initial apic id from cpuid */
-	c->apicid = read_apic_id();
+	c->topo.apicid = read_apic_id();
 
 	/* K6s reports MCEs but don't actually have all the MSRs */
 	if (c->x86 < 6)
--- a/arch/x86/kernel/cpu/cacheinfo.c
+++ b/arch/x86/kernel/cpu/cacheinfo.c
@@ -678,7 +678,7 @@ void cacheinfo_amd_init_llc_id(struct cp
 		 * LLC is at the core complex level.
 		 * Core complex ID is ApicId[3] for these processors.
 		 */
-		per_cpu(cpu_llc_id, cpu) = c->apicid >> 3;
+		per_cpu(cpu_llc_id, cpu) = c->topo.apicid >> 3;
 	} else {
 		/*
 		 * LLC ID is calculated from the number of threads sharing the
@@ -694,7 +694,7 @@ void cacheinfo_amd_init_llc_id(struct cp
 		if (num_sharing_cache) {
 			int bits = get_count_order(num_sharing_cache);
 
-			per_cpu(cpu_llc_id, cpu) = c->apicid >> bits;
+			per_cpu(cpu_llc_id, cpu) = c->topo.apicid >> bits;
 		}
 	}
 }
@@ -712,7 +712,7 @@ void cacheinfo_hygon_init_llc_id(struct
 	 * LLC is at the core complex level.
 	 * Core complex ID is ApicId[3] for these processors.
 	 */
-	per_cpu(cpu_llc_id, cpu) = c->apicid >> 3;
+	per_cpu(cpu_llc_id, cpu) = c->topo.apicid >> 3;
 }
 
 void init_amd_cacheinfo(struct cpuinfo_x86 *c)
@@ -776,13 +776,13 @@ void init_intel_cacheinfo(struct cpuinfo
 				new_l2 = this_leaf.size/1024;
 				num_threads_sharing = 1 + this_leaf.eax.split.num_threads_sharing;
 				index_msb = get_count_order(num_threads_sharing);
-				l2_id = c->apicid & ~((1 << index_msb) - 1);
+				l2_id = c->topo.apicid & ~((1 << index_msb) - 1);
 				break;
 			case 3:
 				new_l3 = this_leaf.size/1024;
 				num_threads_sharing = 1 + this_leaf.eax.split.num_threads_sharing;
 				index_msb = get_count_order(num_threads_sharing);
-				l3_id = c->apicid & ~((1 << index_msb) - 1);
+				l3_id = c->topo.apicid & ~((1 << index_msb) - 1);
 				break;
 			default:
 				break;
@@ -915,7 +915,7 @@ static int __cache_amd_cpumap_setup(unsi
 		unsigned int apicid, nshared, first, last;
 
 		nshared = base->eax.split.num_threads_sharing + 1;
-		apicid = cpu_data(cpu).apicid;
+		apicid = cpu_data(cpu).topo.apicid;
 		first = apicid - (apicid % nshared);
 		last = first + nshared - 1;
 
@@ -924,14 +924,14 @@ static int __cache_amd_cpumap_setup(unsi
 			if (!this_cpu_ci->info_list)
 				continue;
 
-			apicid = cpu_data(i).apicid;
+			apicid = cpu_data(i).topo.apicid;
 			if ((apicid < first) || (apicid > last))
 				continue;
 
 			this_leaf = this_cpu_ci->info_list + index;
 
 			for_each_online_cpu(sibling) {
-				apicid = cpu_data(sibling).apicid;
+				apicid = cpu_data(sibling).topo.apicid;
 				if ((apicid < first) || (apicid > last))
 					continue;
 				cpumask_set_cpu(sibling,
@@ -969,7 +969,7 @@ static void __cache_cpumap_setup(unsigne
 	index_msb = get_count_order(num_threads_sharing);
 
 	for_each_online_cpu(i)
-		if (cpu_data(i).apicid >> index_msb == c->apicid >> index_msb) {
+		if (cpu_data(i).topo.apicid >> index_msb == c->topo.apicid >> index_msb) {
 			struct cpu_cacheinfo *sib_cpu_ci = get_cpu_cacheinfo(i);
 
 			if (i == cpu || !sib_cpu_ci->info_list)
@@ -1024,7 +1024,7 @@ static void get_cache_id(int cpu, struct
 
 	num_threads_sharing = 1 + id4_regs->eax.split.num_threads_sharing;
 	index_msb = get_count_order(num_threads_sharing);
-	id4_regs->id = c->apicid >> index_msb;
+	id4_regs->id = c->topo.apicid >> index_msb;
 }
 
 int populate_cache_leaves(unsigned int cpu)
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -899,7 +899,7 @@ void detect_ht(struct cpuinfo_x86 *c)
 		return;
 
 	index_msb = get_count_order(smp_num_siblings);
-	c->phys_proc_id = apic->phys_pkg_id(c->initial_apicid, index_msb);
+	c->phys_proc_id = apic->phys_pkg_id(c->topo.initial_apicid, index_msb);
 
 	smp_num_siblings = smp_num_siblings / c->x86_max_cores;
 
@@ -907,7 +907,7 @@ void detect_ht(struct cpuinfo_x86 *c)
 
 	core_bits = get_count_order(c->x86_max_cores);
 
-	c->cpu_core_id = apic->phys_pkg_id(c->initial_apicid, index_msb) &
+	c->cpu_core_id = apic->phys_pkg_id(c->topo.initial_apicid, index_msb) &
 				       ((1 << core_bits) - 1);
 #endif
 }
@@ -1721,15 +1721,15 @@ static void generic_identify(struct cpui
 	get_cpu_address_sizes(c);
 
 	if (c->cpuid_level >= 0x00000001) {
-		c->initial_apicid = (cpuid_ebx(1) >> 24) & 0xFF;
+		c->topo.initial_apicid = (cpuid_ebx(1) >> 24) & 0xFF;
 #ifdef CONFIG_X86_32
 # ifdef CONFIG_SMP
-		c->apicid = apic->phys_pkg_id(c->initial_apicid, 0);
+		c->topo.apicid = apic->phys_pkg_id(c->topo.initial_apicid, 0);
 # else
-		c->apicid = c->initial_apicid;
+		c->topo.apicid = c->topo.initial_apicid;
 # endif
 #endif
-		c->phys_proc_id = c->initial_apicid;
+		c->phys_proc_id = c->topo.initial_apicid;
 	}
 
 	get_model_name(c); /* Default name */
@@ -1763,9 +1763,9 @@ static void validate_apic_and_package_id
 
 	apicid = apic->cpu_present_to_apicid(cpu);
 
-	if (apicid != c->apicid) {
+	if (apicid != c->topo.apicid) {
 		pr_err(FW_BUG "CPU%u: APIC id mismatch. Firmware: %x APIC: %x\n",
-		       cpu, apicid, c->initial_apicid);
+		       cpu, apicid, c->topo.initial_apicid);
 	}
 	BUG_ON(topology_update_package_map(c->phys_proc_id, cpu));
 	BUG_ON(topology_update_die_map(c->cpu_die_id, cpu));
@@ -1815,7 +1815,7 @@ static void identify_cpu(struct cpuinfo_
 	apply_forced_caps(c);
 
 #ifdef CONFIG_X86_64
-	c->apicid = apic->phys_pkg_id(c->initial_apicid, 0);
+	c->topo.apicid = apic->phys_pkg_id(c->topo.initial_apicid, 0);
 #endif
 
 	/*
--- a/arch/x86/kernel/cpu/hygon.c
+++ b/arch/x86/kernel/cpu/hygon.c
@@ -88,7 +88,7 @@ static void hygon_get_topology(struct cp
 			c->x86_coreid_bits = get_count_order(c->x86_max_cores);
 
 		/* Socket ID is ApicId[6] for these processors. */
-		c->phys_proc_id = c->apicid >> APICID_SOCKET_ID_BIT;
+		c->phys_proc_id = c->topo.apicid >> APICID_SOCKET_ID_BIT;
 
 		cacheinfo_hygon_init_llc_id(c, cpu);
 	} else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) {
@@ -116,9 +116,9 @@ static void hygon_detect_cmp(struct cpui
 
 	bits = c->x86_coreid_bits;
 	/* Low order bits define the core id (index of core in socket) */
-	c->cpu_core_id = c->initial_apicid & ((1 << bits)-1);
+	c->cpu_core_id = c->topo.initial_apicid & ((1 << bits)-1);
 	/* Convert the initial APIC ID into the socket ID */
-	c->phys_proc_id = c->initial_apicid >> bits;
+	c->phys_proc_id = c->topo.initial_apicid >> bits;
 	/* use socket ID also for last level cache */
 	per_cpu(cpu_llc_id, cpu) = c->cpu_die_id = c->phys_proc_id;
 }
@@ -128,7 +128,7 @@ static void srat_detect_node(struct cpui
 #ifdef CONFIG_NUMA
 	int cpu = smp_processor_id();
 	int node;
-	unsigned int apicid = c->apicid;
+	unsigned int apicid = c->topo.apicid;
 
 	node = numa_cpu_node(cpu);
 	if (node == NUMA_NO_NODE)
@@ -161,7 +161,7 @@ static void srat_detect_node(struct cpui
 		 * through CPU mapping may alter the outcome, directly
 		 * access __apicid_to_node[].
 		 */
-		int ht_nodeid = c->initial_apicid;
+		int ht_nodeid = c->topo.initial_apicid;
 
 		if (__apicid_to_node[ht_nodeid] != NUMA_NO_NODE)
 			node = __apicid_to_node[ht_nodeid];
@@ -301,7 +301,7 @@ static void init_hygon(struct cpuinfo_x8
 	set_cpu_cap(c, X86_FEATURE_REP_GOOD);
 
 	/* get apicid instead of initial apic id from cpuid */
-	c->apicid = read_apic_id();
+	c->topo.apicid = read_apic_id();
 
 	/*
 	 * XXX someone from Hygon needs to confirm this DTRT
--- a/arch/x86/kernel/cpu/mce/apei.c
+++ b/arch/x86/kernel/cpu/mce/apei.c
@@ -103,7 +103,7 @@ int apei_smca_report_x86_error(struct cp
 	m.socketid = -1;
 
 	for_each_possible_cpu(cpu) {
-		if (cpu_data(cpu).initial_apicid == lapic_id) {
+		if (cpu_data(cpu).topo.initial_apicid == lapic_id) {
 			m.extcpu = cpu;
 			m.socketid = cpu_data(m.extcpu).phys_proc_id;
 			break;
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -124,7 +124,7 @@ void mce_setup(struct mce *m)
 	m->cpuvendor = boot_cpu_data.x86_vendor;
 	m->cpuid = cpuid_eax(1);
 	m->socketid = cpu_data(m->extcpu).phys_proc_id;
-	m->apicid = cpu_data(m->extcpu).initial_apicid;
+	m->apicid = cpu_data(m->extcpu).topo.initial_apicid;
 	m->mcgcap = __rdmsr(MSR_IA32_MCG_CAP);
 	m->ppin = cpu_data(m->extcpu).ppin;
 	m->microcode = boot_cpu_data.microcode;
--- a/arch/x86/kernel/cpu/proc.c
+++ b/arch/x86/kernel/cpu/proc.c
@@ -23,8 +23,8 @@ static void show_cpuinfo_core(struct seq
 		   cpumask_weight(topology_core_cpumask(cpu)));
 	seq_printf(m, "core id\t\t: %d\n", c->cpu_core_id);
 	seq_printf(m, "cpu cores\t: %d\n", c->booted_cores);
-	seq_printf(m, "apicid\t\t: %d\n", c->apicid);
-	seq_printf(m, "initial apicid\t: %d\n", c->initial_apicid);
+	seq_printf(m, "apicid\t\t: %d\n", c->topo.apicid);
+	seq_printf(m, "initial apicid\t: %d\n", c->topo.initial_apicid);
 #endif
 }
 
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -78,7 +78,7 @@ int detect_extended_topology_early(struc
 	/*
 	 * initial apic id, which also represents 32-bit extended x2apic id.
 	 */
-	c->initial_apicid = edx;
+	c->topo.initial_apicid = edx;
 	smp_num_siblings = max_t(int, smp_num_siblings, LEVEL_MAX_SIBLINGS(ebx));
 #endif
 	return 0;
@@ -108,7 +108,7 @@ int detect_extended_topology(struct cpui
 	 * Populate HT related information from sub-leaf level 0.
 	 */
 	cpuid_count(leaf, SMT_LEVEL, &eax, &ebx, &ecx, &edx);
-	c->initial_apicid = edx;
+	c->topo.initial_apicid = edx;
 	core_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
 	smp_num_siblings = max_t(int, smp_num_siblings, LEVEL_MAX_SIBLINGS(ebx));
 	core_plus_mask_width = ht_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
@@ -146,20 +146,20 @@ int detect_extended_topology(struct cpui
 	die_select_mask = (~(-1 << die_plus_mask_width)) >>
 				core_plus_mask_width;
 
-	c->cpu_core_id = apic->phys_pkg_id(c->initial_apicid,
+	c->cpu_core_id = apic->phys_pkg_id(c->topo.initial_apicid,
 				ht_mask_width) & core_select_mask;
 
 	if (die_level_present) {
-		c->cpu_die_id = apic->phys_pkg_id(c->initial_apicid,
+		c->cpu_die_id = apic->phys_pkg_id(c->topo.initial_apicid,
 					core_plus_mask_width) & die_select_mask;
 	}
 
-	c->phys_proc_id = apic->phys_pkg_id(c->initial_apicid,
+	c->phys_proc_id = apic->phys_pkg_id(c->topo.initial_apicid,
 				pkg_mask_width);
 	/*
 	 * Reinit the apicid, now that we have extended initial_apicid.
 	 */
-	c->apicid = apic->phys_pkg_id(c->initial_apicid, 0);
+	c->topo.apicid = apic->phys_pkg_id(c->topo.initial_apicid, 0);
 
 	c->x86_max_cores = (core_level_siblings / smp_num_siblings);
 	__max_die_per_package = (die_level_siblings / core_level_siblings);
--- a/arch/x86/xen/apic.c
+++ b/arch/x86/xen/apic.c
@@ -118,7 +118,7 @@ static int xen_phys_pkg_id(int initial_a
 static int xen_cpu_present_to_apicid(int cpu)
 {
 	if (cpu_present(cpu))
-		return cpu_data(cpu).apicid;
+		return cpu_data(cpu).topo.apicid;
 	else
 		return BAD_APICID;
 }
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
@@ -2255,7 +2255,7 @@ static int kfd_cpumask_to_apic_id(const
 	if (first_cpu_of_numa_node >= nr_cpu_ids)
 		return -1;
 #ifdef CONFIG_X86_64
-	return cpu_data(first_cpu_of_numa_node).apicid;
+	return cpu_data(first_cpu_of_numa_node).topo.apicid;
 #else
 	return first_cpu_of_numa_node;
 #endif
--- a/drivers/virt/acrn/hsm.c
+++ b/drivers/virt/acrn/hsm.c
@@ -447,7 +447,7 @@ static ssize_t remove_cpu_store(struct d
 	if (cpu_online(cpu))
 		remove_cpu(cpu);
 
-	lapicid = cpu_data(cpu).apicid;
+	lapicid = cpu_data(cpu).topo.apicid;
 	dev_dbg(dev, "Try to remove cpu %lld with lapicid %lld\n", cpu, lapicid);
 	ret = hcall_sos_remove_cpu(lapicid);
 	if (ret < 0) {


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 02/38] x86/cpu: Move phys_proc_id into topology info
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
  2023-07-28 12:12 ` [patch v2 01/38] x86/cpu: Encapsulate topology information in cpuinfo_x86 Thomas Gleixner
@ 2023-07-28 12:12 ` Thomas Gleixner
  2023-07-28 12:12 ` [patch v2 03/38] x86/cpu: Move cpu_die_id " Thomas Gleixner
                   ` (37 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:12 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

Rename it to pkg_id which is the terminology used in the kernel.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 Documentation/arch/x86/topology.rst  |    2 +-
 arch/x86/include/asm/processor.h     |    5 +++--
 arch/x86/include/asm/topology.h      |    2 +-
 arch/x86/include/asm/x86_init.h      |    2 +-
 arch/x86/kernel/apic/apic_numachip.c |    2 +-
 arch/x86/kernel/cpu/amd.c            |    4 ++--
 arch/x86/kernel/cpu/cacheinfo.c      |    4 ++--
 arch/x86/kernel/cpu/common.c         |    6 +++---
 arch/x86/kernel/cpu/hygon.c          |    6 +++---
 arch/x86/kernel/cpu/mce/apei.c       |    2 +-
 arch/x86/kernel/cpu/mce/core.c       |    2 +-
 arch/x86/kernel/cpu/proc.c           |    2 +-
 arch/x86/kernel/cpu/topology.c       |    3 +--
 arch/x86/kernel/smpboot.c            |   16 ++++++++--------
 drivers/scsi/lpfc/lpfc_init.c        |    6 +-----
 15 files changed, 30 insertions(+), 34 deletions(-)

--- a/Documentation/arch/x86/topology.rst
+++ b/Documentation/arch/x86/topology.rst
@@ -59,7 +59,7 @@ AMD nomenclature for package is 'Node'.
 
     The physical ID of the die. This information is retrieved via CPUID.
 
-  - cpuinfo_x86.phys_proc_id:
+  - cpuinfo_x86.topo.pkg_id:
 
     The physical ID of the package. This information is retrieved via CPUID
     and deduced from the APIC IDs of the cores in the package.
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -82,6 +82,9 @@ struct cpuinfo_topology {
 	u32			apicid;
 	// The initial APIC ID provided by CPUID
 	u32			initial_apicid;
+
+	// Physical package ID
+	u32			pkg_id;
 };
 
 struct cpuinfo_x86 {
@@ -133,8 +136,6 @@ struct cpuinfo_x86 {
 	u16			x86_clflush_size;
 	/* number of cores as seen by the OS: */
 	u16			booted_cores;
-	/* Physical processor id: */
-	u16			phys_proc_id;
 	/* Logical processor id: */
 	u16			logical_proc_id;
 	/* Core id: */
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -106,7 +106,7 @@ extern const struct cpumask *cpu_coregro
 extern const struct cpumask *cpu_clustergroup_mask(int cpu);
 
 #define topology_logical_package_id(cpu)	(cpu_data(cpu).logical_proc_id)
-#define topology_physical_package_id(cpu)	(cpu_data(cpu).phys_proc_id)
+#define topology_physical_package_id(cpu)	(cpu_data(cpu).topo.pkg_id)
 #define topology_logical_die_id(cpu)		(cpu_data(cpu).logical_die_id)
 #define topology_die_id(cpu)			(cpu_data(cpu).cpu_die_id)
 #define topology_core_id(cpu)			(cpu_data(cpu).cpu_core_id)
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -177,7 +177,7 @@ struct x86_init_ops {
  * struct x86_cpuinit_ops - platform specific cpu hotplug setups
  * @setup_percpu_clockev:	set up the per cpu clock event device
  * @early_percpu_clock_init:	early init of the per cpu clock event device
- * @fixup_cpu_id:		fixup function for cpuinfo_x86::phys_proc_id
+ * @fixup_cpu_id:		fixup function for cpuinfo_x86::topo.pkg_id
  * @parallel_bringup:		Parallel bringup control
  */
 struct x86_cpuinit_ops {
--- a/arch/x86/kernel/apic/apic_numachip.c
+++ b/arch/x86/kernel/apic/apic_numachip.c
@@ -169,7 +169,7 @@ static void fixup_cpu_id(struct cpuinfo_
 		nodes = ((val >> 3) & 7) + 1;
 	}
 
-	c->phys_proc_id = node / nodes;
+	c->topo.pkg_id = node / nodes;
 }
 
 static int __init numachip_system_init(void)
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -389,9 +389,9 @@ static void amd_detect_cmp(struct cpuinf
 	/* Low order bits define the core id (index of core in socket) */
 	c->cpu_core_id = c->topo.initial_apicid & ((1 << bits)-1);
 	/* Convert the initial APIC ID into the socket ID */
-	c->phys_proc_id = c->topo.initial_apicid >> bits;
+	c->topo.pkg_id = c->topo.initial_apicid >> bits;
 	/* use socket ID also for last level cache */
-	per_cpu(cpu_llc_id, cpu) = c->cpu_die_id = c->phys_proc_id;
+	per_cpu(cpu_llc_id, cpu) = c->cpu_die_id = c->topo.pkg_id;
 }
 
 u32 amd_get_nodes_per_socket(void)
--- a/arch/x86/kernel/cpu/cacheinfo.c
+++ b/arch/x86/kernel/cpu/cacheinfo.c
@@ -875,10 +875,10 @@ void init_intel_cacheinfo(struct cpuinfo
 	 * turns means that the only possibility is SMT (as indicated in
 	 * cpuid1). Since cpuid2 doesn't specify shared caches, and we know
 	 * that SMT shares all caches, we can unconditionally set cpu_llc_id to
-	 * c->phys_proc_id.
+	 * c->topo.pkg_id.
 	 */
 	if (per_cpu(cpu_llc_id, cpu) == BAD_APICID)
-		per_cpu(cpu_llc_id, cpu) = c->phys_proc_id;
+		per_cpu(cpu_llc_id, cpu) = c->topo.pkg_id;
 #endif
 
 	c->x86_cache_size = l3 ? l3 : (l2 ? l2 : (l1i+l1d));
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -899,7 +899,7 @@ void detect_ht(struct cpuinfo_x86 *c)
 		return;
 
 	index_msb = get_count_order(smp_num_siblings);
-	c->phys_proc_id = apic->phys_pkg_id(c->topo.initial_apicid, index_msb);
+	c->topo.pkg_id = apic->phys_pkg_id(c->topo.initial_apicid, index_msb);
 
 	smp_num_siblings = smp_num_siblings / c->x86_max_cores;
 
@@ -1729,7 +1729,7 @@ static void generic_identify(struct cpui
 		c->topo.apicid = c->topo.initial_apicid;
 # endif
 #endif
-		c->phys_proc_id = c->topo.initial_apicid;
+		c->topo.pkg_id = c->topo.initial_apicid;
 	}
 
 	get_model_name(c); /* Default name */
@@ -1767,7 +1767,7 @@ static void validate_apic_and_package_id
 		pr_err(FW_BUG "CPU%u: APIC id mismatch. Firmware: %x APIC: %x\n",
 		       cpu, apicid, c->topo.initial_apicid);
 	}
-	BUG_ON(topology_update_package_map(c->phys_proc_id, cpu));
+	BUG_ON(topology_update_package_map(c->topo.pkg_id, cpu));
 	BUG_ON(topology_update_die_map(c->cpu_die_id, cpu));
 #else
 	c->logical_proc_id = 0;
--- a/arch/x86/kernel/cpu/hygon.c
+++ b/arch/x86/kernel/cpu/hygon.c
@@ -88,7 +88,7 @@ static void hygon_get_topology(struct cp
 			c->x86_coreid_bits = get_count_order(c->x86_max_cores);
 
 		/* Socket ID is ApicId[6] for these processors. */
-		c->phys_proc_id = c->topo.apicid >> APICID_SOCKET_ID_BIT;
+		c->topo.pkg_id = c->topo.apicid >> APICID_SOCKET_ID_BIT;
 
 		cacheinfo_hygon_init_llc_id(c, cpu);
 	} else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) {
@@ -118,9 +118,9 @@ static void hygon_detect_cmp(struct cpui
 	/* Low order bits define the core id (index of core in socket) */
 	c->cpu_core_id = c->topo.initial_apicid & ((1 << bits)-1);
 	/* Convert the initial APIC ID into the socket ID */
-	c->phys_proc_id = c->topo.initial_apicid >> bits;
+	c->topo.pkg_id = c->topo.initial_apicid >> bits;
 	/* use socket ID also for last level cache */
-	per_cpu(cpu_llc_id, cpu) = c->cpu_die_id = c->phys_proc_id;
+	per_cpu(cpu_llc_id, cpu) = c->cpu_die_id = c->topo.pkg_id;
 }
 
 static void srat_detect_node(struct cpuinfo_x86 *c)
--- a/arch/x86/kernel/cpu/mce/apei.c
+++ b/arch/x86/kernel/cpu/mce/apei.c
@@ -105,7 +105,7 @@ int apei_smca_report_x86_error(struct cp
 	for_each_possible_cpu(cpu) {
 		if (cpu_data(cpu).topo.initial_apicid == lapic_id) {
 			m.extcpu = cpu;
-			m.socketid = cpu_data(m.extcpu).phys_proc_id;
+			m.socketid = cpu_data(m.extcpu).topo.pkg_id;
 			break;
 		}
 	}
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -123,7 +123,7 @@ void mce_setup(struct mce *m)
 	m->time = __ktime_get_real_seconds();
 	m->cpuvendor = boot_cpu_data.x86_vendor;
 	m->cpuid = cpuid_eax(1);
-	m->socketid = cpu_data(m->extcpu).phys_proc_id;
+	m->socketid = cpu_data(m->extcpu).topo.pkg_id;
 	m->apicid = cpu_data(m->extcpu).topo.initial_apicid;
 	m->mcgcap = __rdmsr(MSR_IA32_MCG_CAP);
 	m->ppin = cpu_data(m->extcpu).ppin;
--- a/arch/x86/kernel/cpu/proc.c
+++ b/arch/x86/kernel/cpu/proc.c
@@ -18,7 +18,7 @@ static void show_cpuinfo_core(struct seq
 			      unsigned int cpu)
 {
 #ifdef CONFIG_SMP
-	seq_printf(m, "physical id\t: %d\n", c->phys_proc_id);
+	seq_printf(m, "physical id\t: %d\n", c->topo.pkg_id);
 	seq_printf(m, "siblings\t: %d\n",
 		   cpumask_weight(topology_core_cpumask(cpu)));
 	seq_printf(m, "core id\t\t: %d\n", c->cpu_core_id);
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -154,8 +154,7 @@ int detect_extended_topology(struct cpui
 					core_plus_mask_width) & die_select_mask;
 	}
 
-	c->phys_proc_id = apic->phys_pkg_id(c->topo.initial_apicid,
-				pkg_mask_width);
+	c->topo.pkg_id = apic->phys_pkg_id(c->topo.initial_apicid, pkg_mask_width);
 	/*
 	 * Reinit the apicid, now that we have extended initial_apicid.
 	 */
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -347,7 +347,7 @@ int topology_phys_to_logical_pkg(unsigne
 	for_each_possible_cpu(cpu) {
 		struct cpuinfo_x86 *c = &cpu_data(cpu);
 
-		if (c->initialized && c->phys_proc_id == phys_pkg)
+		if (c->initialized && c->topo.pkg_id == phys_pkg)
 			return c->logical_proc_id;
 	}
 	return -1;
@@ -363,13 +363,13 @@ EXPORT_SYMBOL(topology_phys_to_logical_p
  */
 static int topology_phys_to_logical_die(unsigned int die_id, unsigned int cur_cpu)
 {
-	int cpu, proc_id = cpu_data(cur_cpu).phys_proc_id;
+	int cpu, proc_id = cpu_data(cur_cpu).topo.pkg_id;
 
 	for_each_possible_cpu(cpu) {
 		struct cpuinfo_x86 *c = &cpu_data(cpu);
 
 		if (c->initialized && c->cpu_die_id == die_id &&
-		    c->phys_proc_id == proc_id)
+		    c->topo.pkg_id == proc_id)
 			return c->logical_die_id;
 	}
 	return -1;
@@ -429,7 +429,7 @@ void __init smp_store_boot_cpu_info(void
 
 	*c = boot_cpu_data;
 	c->cpu_index = id;
-	topology_update_package_map(c->phys_proc_id, id);
+	topology_update_package_map(c->topo.pkg_id, id);
 	topology_update_die_map(c->cpu_die_id, id);
 	c->initialized = true;
 }
@@ -484,7 +484,7 @@ static bool match_smt(struct cpuinfo_x86
 	if (boot_cpu_has(X86_FEATURE_TOPOEXT)) {
 		int cpu1 = c->cpu_index, cpu2 = o->cpu_index;
 
-		if (c->phys_proc_id == o->phys_proc_id &&
+		if (c->topo.pkg_id == o->topo.pkg_id &&
 		    c->cpu_die_id == o->cpu_die_id &&
 		    per_cpu(cpu_llc_id, cpu1) == per_cpu(cpu_llc_id, cpu2)) {
 			if (c->cpu_core_id == o->cpu_core_id)
@@ -496,7 +496,7 @@ static bool match_smt(struct cpuinfo_x86
 				return topology_sane(c, o, "smt");
 		}
 
-	} else if (c->phys_proc_id == o->phys_proc_id &&
+	} else if (c->topo.pkg_id == o->topo.pkg_id &&
 		   c->cpu_die_id == o->cpu_die_id &&
 		   c->cpu_core_id == o->cpu_core_id) {
 		return topology_sane(c, o, "smt");
@@ -507,7 +507,7 @@ static bool match_smt(struct cpuinfo_x86
 
 static bool match_die(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
 {
-	if (c->phys_proc_id == o->phys_proc_id &&
+	if (c->topo.pkg_id == o->topo.pkg_id &&
 	    c->cpu_die_id == o->cpu_die_id)
 		return true;
 	return false;
@@ -535,7 +535,7 @@ static bool match_l2c(struct cpuinfo_x86
  */
 static bool match_pkg(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
 {
-	if (c->phys_proc_id == o->phys_proc_id)
+	if (c->topo.pkg_id == o->topo.pkg_id)
 		return true;
 	return false;
 }
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -12428,9 +12428,6 @@ lpfc_cpu_affinity_check(struct lpfc_hba
 	int max_core_id, min_core_id;
 	struct lpfc_vector_map_info *cpup;
 	struct lpfc_vector_map_info *new_cpup;
-#ifdef CONFIG_X86
-	struct cpuinfo_x86 *cpuinfo;
-#endif
 #ifdef CONFIG_SCSI_LPFC_DEBUG_FS
 	struct lpfc_hdwq_stat *c_stat;
 #endif
@@ -12444,8 +12441,7 @@ lpfc_cpu_affinity_check(struct lpfc_hba
 	for_each_present_cpu(cpu) {
 		cpup = &phba->sli4_hba.cpu_map[cpu];
 #ifdef CONFIG_X86
-		cpuinfo = &cpu_data(cpu);
-		cpup->phys_id = cpuinfo->phys_proc_id;
+		cpup->phys_id = topology_physical_package_id(cpu);
 		cpup->core_id = cpuinfo->cpu_core_id;
 		if (lpfc_find_hyper(phba, cpu, cpup->phys_id, cpup->core_id))
 			cpup->flag |= LPFC_CPU_MAP_HYPER;


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 03/38] x86/cpu: Move cpu_die_id into topology info
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
  2023-07-28 12:12 ` [patch v2 01/38] x86/cpu: Encapsulate topology information in cpuinfo_x86 Thomas Gleixner
  2023-07-28 12:12 ` [patch v2 02/38] x86/cpu: Move phys_proc_id into topology info Thomas Gleixner
@ 2023-07-28 12:12 ` Thomas Gleixner
  2023-07-28 12:12 ` [patch v2 04/38] scsi: lpfc: Use topology_core_id() Thomas Gleixner
                   ` (36 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:12 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

Move the next member.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 Documentation/arch/x86/topology.rst |    4 ++--
 arch/x86/include/asm/processor.h    |    4 +++-
 arch/x86/include/asm/topology.h     |    2 +-
 arch/x86/kernel/cpu/amd.c           |    8 ++++----
 arch/x86/kernel/cpu/cacheinfo.c     |    2 +-
 arch/x86/kernel/cpu/common.c        |    2 +-
 arch/x86/kernel/cpu/hygon.c         |    8 ++++----
 arch/x86/kernel/cpu/topology.c      |    2 +-
 arch/x86/kernel/smpboot.c           |   10 +++++-----
 9 files changed, 22 insertions(+), 20 deletions(-)

--- a/Documentation/arch/x86/topology.rst
+++ b/Documentation/arch/x86/topology.rst
@@ -55,7 +55,7 @@ AMD nomenclature for package is 'Node'.
 
     The number of dies in a package. This information is retrieved via CPUID.
 
-  - cpuinfo_x86.cpu_die_id:
+  - cpuinfo_x86.topo_die_id:
 
     The physical ID of the die. This information is retrieved via CPUID.
 
@@ -65,7 +65,7 @@ AMD nomenclature for package is 'Node'.
     and deduced from the APIC IDs of the cores in the package.
 
     Modern systems use this value for the socket. There may be multiple
-    packages within a socket. This value may differ from cpu_die_id.
+    packages within a socket. This value may differ from topo.die_id.
 
   - cpuinfo_x86.logical_proc_id:
 
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -85,6 +85,9 @@ struct cpuinfo_topology {
 
 	// Physical package ID
 	u32			pkg_id;
+
+	// Physical die ID on AMD, Relative on Intel
+	u32			die_id;
 };
 
 struct cpuinfo_x86 {
@@ -140,7 +143,6 @@ struct cpuinfo_x86 {
 	u16			logical_proc_id;
 	/* Core id: */
 	u16			cpu_core_id;
-	u16			cpu_die_id;
 	u16			logical_die_id;
 	/* Index into per_cpu list: */
 	u16			cpu_index;
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -108,7 +108,7 @@ extern const struct cpumask *cpu_cluster
 #define topology_logical_package_id(cpu)	(cpu_data(cpu).logical_proc_id)
 #define topology_physical_package_id(cpu)	(cpu_data(cpu).topo.pkg_id)
 #define topology_logical_die_id(cpu)		(cpu_data(cpu).logical_die_id)
-#define topology_die_id(cpu)			(cpu_data(cpu).cpu_die_id)
+#define topology_die_id(cpu)			(cpu_data(cpu).topo.die_id)
 #define topology_core_id(cpu)			(cpu_data(cpu).cpu_core_id)
 #define topology_ppin(cpu)			(cpu_data(cpu).ppin)
 
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -338,7 +338,7 @@ static void amd_get_topology(struct cpui
 
 		cpuid(0x8000001e, &eax, &ebx, &ecx, &edx);
 
-		c->cpu_die_id  = ecx & 0xff;
+		c->topo.die_id  = ecx & 0xff;
 
 		if (c->x86 == 0x15)
 			c->cu_id = ebx & 0xff;
@@ -364,9 +364,9 @@ static void amd_get_topology(struct cpui
 		u64 value;
 
 		rdmsrl(MSR_FAM10H_NODE_ID, value);
-		c->cpu_die_id = value & 7;
+		c->topo.die_id = value & 7;
 
-		per_cpu(cpu_llc_id, cpu) = c->cpu_die_id;
+		per_cpu(cpu_llc_id, cpu) = c->topo.die_id;
 	} else
 		return;
 
@@ -391,7 +391,7 @@ static void amd_detect_cmp(struct cpuinf
 	/* Convert the initial APIC ID into the socket ID */
 	c->topo.pkg_id = c->topo.initial_apicid >> bits;
 	/* use socket ID also for last level cache */
-	per_cpu(cpu_llc_id, cpu) = c->cpu_die_id = c->topo.pkg_id;
+	per_cpu(cpu_llc_id, cpu) = c->topo.die_id = c->topo.pkg_id;
 }
 
 u32 amd_get_nodes_per_socket(void)
--- a/arch/x86/kernel/cpu/cacheinfo.c
+++ b/arch/x86/kernel/cpu/cacheinfo.c
@@ -672,7 +672,7 @@ void cacheinfo_amd_init_llc_id(struct cp
 
 	if (c->x86 < 0x17) {
 		/* LLC is at the node level. */
-		per_cpu(cpu_llc_id, cpu) = c->cpu_die_id;
+		per_cpu(cpu_llc_id, cpu) = c->topo.die_id;
 	} else if (c->x86 == 0x17 && c->x86_model <= 0x1F) {
 		/*
 		 * LLC is at the core complex level.
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1768,7 +1768,7 @@ static void validate_apic_and_package_id
 		       cpu, apicid, c->topo.initial_apicid);
 	}
 	BUG_ON(topology_update_package_map(c->topo.pkg_id, cpu));
-	BUG_ON(topology_update_die_map(c->cpu_die_id, cpu));
+	BUG_ON(topology_update_die_map(c->topo.die_id, cpu));
 #else
 	c->logical_proc_id = 0;
 #endif
--- a/arch/x86/kernel/cpu/hygon.c
+++ b/arch/x86/kernel/cpu/hygon.c
@@ -72,7 +72,7 @@ static void hygon_get_topology(struct cp
 
 		cpuid(0x8000001e, &eax, &ebx, &ecx, &edx);
 
-		c->cpu_die_id  = ecx & 0xff;
+		c->topo.die_id  = ecx & 0xff;
 
 		c->cpu_core_id = ebx & 0xff;
 
@@ -95,9 +95,9 @@ static void hygon_get_topology(struct cp
 		u64 value;
 
 		rdmsrl(MSR_FAM10H_NODE_ID, value);
-		c->cpu_die_id = value & 7;
+		c->topo.die_id = value & 7;
 
-		per_cpu(cpu_llc_id, cpu) = c->cpu_die_id;
+		per_cpu(cpu_llc_id, cpu) = c->topo.die_id;
 	} else
 		return;
 
@@ -120,7 +120,7 @@ static void hygon_detect_cmp(struct cpui
 	/* Convert the initial APIC ID into the socket ID */
 	c->topo.pkg_id = c->topo.initial_apicid >> bits;
 	/* use socket ID also for last level cache */
-	per_cpu(cpu_llc_id, cpu) = c->cpu_die_id = c->topo.pkg_id;
+	per_cpu(cpu_llc_id, cpu) = c->topo.die_id = c->topo.pkg_id;
 }
 
 static void srat_detect_node(struct cpuinfo_x86 *c)
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -150,7 +150,7 @@ int detect_extended_topology(struct cpui
 				ht_mask_width) & core_select_mask;
 
 	if (die_level_present) {
-		c->cpu_die_id = apic->phys_pkg_id(c->topo.initial_apicid,
+		c->topo.die_id = apic->phys_pkg_id(c->topo.initial_apicid,
 					core_plus_mask_width) & die_select_mask;
 	}
 
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -368,7 +368,7 @@ static int topology_phys_to_logical_die(
 	for_each_possible_cpu(cpu) {
 		struct cpuinfo_x86 *c = &cpu_data(cpu);
 
-		if (c->initialized && c->cpu_die_id == die_id &&
+		if (c->initialized && c->topo.die_id == die_id &&
 		    c->topo.pkg_id == proc_id)
 			return c->logical_die_id;
 	}
@@ -430,7 +430,7 @@ void __init smp_store_boot_cpu_info(void
 	*c = boot_cpu_data;
 	c->cpu_index = id;
 	topology_update_package_map(c->topo.pkg_id, id);
-	topology_update_die_map(c->cpu_die_id, id);
+	topology_update_die_map(c->topo.die_id, id);
 	c->initialized = true;
 }
 
@@ -485,7 +485,7 @@ static bool match_smt(struct cpuinfo_x86
 		int cpu1 = c->cpu_index, cpu2 = o->cpu_index;
 
 		if (c->topo.pkg_id == o->topo.pkg_id &&
-		    c->cpu_die_id == o->cpu_die_id &&
+		    c->topo.die_id == o->topo.die_id &&
 		    per_cpu(cpu_llc_id, cpu1) == per_cpu(cpu_llc_id, cpu2)) {
 			if (c->cpu_core_id == o->cpu_core_id)
 				return topology_sane(c, o, "smt");
@@ -497,7 +497,7 @@ static bool match_smt(struct cpuinfo_x86
 		}
 
 	} else if (c->topo.pkg_id == o->topo.pkg_id &&
-		   c->cpu_die_id == o->cpu_die_id &&
+		   c->topo.die_id == o->topo.die_id &&
 		   c->cpu_core_id == o->cpu_core_id) {
 		return topology_sane(c, o, "smt");
 	}
@@ -508,7 +508,7 @@ static bool match_smt(struct cpuinfo_x86
 static bool match_die(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
 {
 	if (c->topo.pkg_id == o->topo.pkg_id &&
-	    c->cpu_die_id == o->cpu_die_id)
+	    c->topo.die_id == o->topo.die_id)
 		return true;
 	return false;
 }


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 04/38] scsi: lpfc: Use topology_core_id()
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (2 preceding siblings ...)
  2023-07-28 12:12 ` [patch v2 03/38] x86/cpu: Move cpu_die_id " Thomas Gleixner
@ 2023-07-28 12:12 ` Thomas Gleixner
  2023-07-28 12:12 ` [patch v2 05/38] hwmon: (fam15h_power) " Thomas Gleixner
                   ` (35 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:12 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

Use the provided topology helper.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Cc: James Smart <james.smart@broadcom.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
---
 drivers/scsi/lpfc/lpfc_init.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -12442,7 +12442,7 @@ lpfc_cpu_affinity_check(struct lpfc_hba
 		cpup = &phba->sli4_hba.cpu_map[cpu];
 #ifdef CONFIG_X86
 		cpup->phys_id = topology_physical_package_id(cpu);
-		cpup->core_id = cpuinfo->cpu_core_id;
+		cpup->core_id = topology_core_id(cpu);
 		if (lpfc_find_hyper(phba, cpu, cpup->phys_id, cpup->core_id))
 			cpup->flag |= LPFC_CPU_MAP_HYPER;
 #else


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 05/38] hwmon: (fam15h_power) Use topology_core_id()
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (3 preceding siblings ...)
  2023-07-28 12:12 ` [patch v2 04/38] scsi: lpfc: Use topology_core_id() Thomas Gleixner
@ 2023-07-28 12:12 ` Thomas Gleixner
  2023-07-28 12:12 ` [patch v2 06/38] x86/cpu: Move cpu_core_id into topology info Thomas Gleixner
                   ` (34 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:12 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	Guenter Roeck, linux-hwmon, Jean Delvare, Huang Rui,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Juergen Gross, Steve Wahl,
	Mike Travis, Dimitri Sivanich, Russ Anderson

Use the provided topology helper function instead of fiddling in cpu_data.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Cc: linux-hwmon@vger.kernel.org
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Huang Rui <ray.huang@amd.com>
---
 drivers/hwmon/fam15h_power.c |    7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -17,6 +17,7 @@
 #include <linux/cpumask.h>
 #include <linux/time.h>
 #include <linux/sched.h>
+#include <linux/topology.h>
 #include <asm/processor.h>
 #include <asm/msr.h>
 
@@ -134,15 +135,13 @@ static DEVICE_ATTR_RO(power1_crit);
 static void do_read_registers_on_cu(void *_data)
 {
 	struct fam15h_power_data *data = _data;
-	int cpu, cu;
-
-	cpu = smp_processor_id();
+	int cu;
 
 	/*
 	 * With the new x86 topology modelling, cpu core id actually
 	 * is compute unit id.
 	 */
-	cu = cpu_data(cpu).cpu_core_id;
+	cu = topology_core_id(smp_processor_id());
 
 	rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, &data->cu_acc_power[cu]);
 	rdmsrl_safe(MSR_F15H_PTSC, &data->cpu_sw_pwr_ptsc[cu]);


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 06/38] x86/cpu: Move cpu_core_id into topology info
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (4 preceding siblings ...)
  2023-07-28 12:12 ` [patch v2 05/38] hwmon: (fam15h_power) " Thomas Gleixner
@ 2023-07-28 12:12 ` Thomas Gleixner
  2023-07-28 12:12 ` [patch v2 07/38] x86/cpu: Move cu_id " Thomas Gleixner
                   ` (33 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:12 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

Rename it to core_id and stick it to the other ID fields.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/processor.h |    4 +++-
 arch/x86/include/asm/topology.h  |    2 +-
 arch/x86/kernel/amd_nb.c         |    4 ++--
 arch/x86/kernel/cpu/amd.c        |    8 ++++----
 arch/x86/kernel/cpu/common.c     |    4 ++--
 arch/x86/kernel/cpu/hygon.c      |    4 ++--
 arch/x86/kernel/cpu/proc.c       |    2 +-
 arch/x86/kernel/cpu/topology.c   |    2 +-
 arch/x86/kernel/smpboot.c        |    6 +++---
 9 files changed, 19 insertions(+), 17 deletions(-)

--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -88,6 +88,9 @@ struct cpuinfo_topology {
 
 	// Physical die ID on AMD, Relative on Intel
 	u32			die_id;
+
+	// Core ID relative to the package
+	u32			core_id;
 };
 
 struct cpuinfo_x86 {
@@ -142,7 +145,6 @@ struct cpuinfo_x86 {
 	/* Logical processor id: */
 	u16			logical_proc_id;
 	/* Core id: */
-	u16			cpu_core_id;
 	u16			logical_die_id;
 	/* Index into per_cpu list: */
 	u16			cpu_index;
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -109,7 +109,7 @@ extern const struct cpumask *cpu_cluster
 #define topology_physical_package_id(cpu)	(cpu_data(cpu).topo.pkg_id)
 #define topology_logical_die_id(cpu)		(cpu_data(cpu).logical_die_id)
 #define topology_die_id(cpu)			(cpu_data(cpu).topo.die_id)
-#define topology_core_id(cpu)			(cpu_data(cpu).cpu_core_id)
+#define topology_core_id(cpu)			(cpu_data(cpu).topo.core_id)
 #define topology_ppin(cpu)			(cpu_data(cpu).ppin)
 
 extern unsigned int __max_die_per_package;
--- a/arch/x86/kernel/amd_nb.c
+++ b/arch/x86/kernel/amd_nb.c
@@ -378,7 +378,7 @@ int amd_get_subcaches(int cpu)
 
 	pci_read_config_dword(link, 0x1d4, &mask);
 
-	return (mask >> (4 * cpu_data(cpu).cpu_core_id)) & 0xf;
+	return (mask >> (4 * cpu_data(cpu).topo.core_id)) & 0xf;
 }
 
 int amd_set_subcaches(int cpu, unsigned long mask)
@@ -404,7 +404,7 @@ int amd_set_subcaches(int cpu, unsigned
 		pci_write_config_dword(nb->misc, 0x1b8, reg & ~0x180000);
 	}
 
-	cuid = cpu_data(cpu).cpu_core_id;
+	cuid = cpu_data(cpu).topo.core_id;
 	mask <<= 4 * cuid;
 	mask |= (0xf ^ (1 << cuid)) << 26;
 
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -306,7 +306,7 @@ static int nearby_node(int apicid)
 #endif
 
 /*
- * Fix up cpu_core_id for pre-F17h systems to be in the
+ * Fix up topo::core_id for pre-F17h systems to be in the
  * [0 .. cores_per_node - 1] range. Not really needed but
  * kept so as not to break existing setups.
  */
@@ -318,7 +318,7 @@ static void legacy_fixup_core_id(struct
 		return;
 
 	cus_per_node = c->x86_max_cores / nodes_per_socket;
-	c->cpu_core_id %= cus_per_node;
+	c->topo.core_id %= cus_per_node;
 }
 
 /*
@@ -344,7 +344,7 @@ static void amd_get_topology(struct cpui
 			c->cu_id = ebx & 0xff;
 
 		if (c->x86 >= 0x17) {
-			c->cpu_core_id = ebx & 0xff;
+			c->topo.core_id = ebx & 0xff;
 
 			if (smp_num_siblings > 1)
 				c->x86_max_cores /= smp_num_siblings;
@@ -387,7 +387,7 @@ static void amd_detect_cmp(struct cpuinf
 
 	bits = c->x86_coreid_bits;
 	/* Low order bits define the core id (index of core in socket) */
-	c->cpu_core_id = c->topo.initial_apicid & ((1 << bits)-1);
+	c->topo.core_id = c->topo.initial_apicid & ((1 << bits)-1);
 	/* Convert the initial APIC ID into the socket ID */
 	c->topo.pkg_id = c->topo.initial_apicid >> bits;
 	/* use socket ID also for last level cache */
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -907,8 +907,8 @@ void detect_ht(struct cpuinfo_x86 *c)
 
 	core_bits = get_count_order(c->x86_max_cores);
 
-	c->cpu_core_id = apic->phys_pkg_id(c->topo.initial_apicid, index_msb) &
-				       ((1 << core_bits) - 1);
+	c->topo.core_id = apic->phys_pkg_id(c->topo.initial_apicid, index_msb) &
+		((1 << core_bits) - 1);
 #endif
 }
 
--- a/arch/x86/kernel/cpu/hygon.c
+++ b/arch/x86/kernel/cpu/hygon.c
@@ -74,7 +74,7 @@ static void hygon_get_topology(struct cp
 
 		c->topo.die_id  = ecx & 0xff;
 
-		c->cpu_core_id = ebx & 0xff;
+		c->topo.core_id = ebx & 0xff;
 
 		if (smp_num_siblings > 1)
 			c->x86_max_cores /= smp_num_siblings;
@@ -116,7 +116,7 @@ static void hygon_detect_cmp(struct cpui
 
 	bits = c->x86_coreid_bits;
 	/* Low order bits define the core id (index of core in socket) */
-	c->cpu_core_id = c->topo.initial_apicid & ((1 << bits)-1);
+	c->topo.core_id = c->topo.initial_apicid & ((1 << bits)-1);
 	/* Convert the initial APIC ID into the socket ID */
 	c->topo.pkg_id = c->topo.initial_apicid >> bits;
 	/* use socket ID also for last level cache */
--- a/arch/x86/kernel/cpu/proc.c
+++ b/arch/x86/kernel/cpu/proc.c
@@ -21,7 +21,7 @@ static void show_cpuinfo_core(struct seq
 	seq_printf(m, "physical id\t: %d\n", c->topo.pkg_id);
 	seq_printf(m, "siblings\t: %d\n",
 		   cpumask_weight(topology_core_cpumask(cpu)));
-	seq_printf(m, "core id\t\t: %d\n", c->cpu_core_id);
+	seq_printf(m, "core id\t\t: %d\n", c->topo.core_id);
 	seq_printf(m, "cpu cores\t: %d\n", c->booted_cores);
 	seq_printf(m, "apicid\t\t: %d\n", c->topo.apicid);
 	seq_printf(m, "initial apicid\t: %d\n", c->topo.initial_apicid);
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -146,7 +146,7 @@ int detect_extended_topology(struct cpui
 	die_select_mask = (~(-1 << die_plus_mask_width)) >>
 				core_plus_mask_width;
 
-	c->cpu_core_id = apic->phys_pkg_id(c->topo.initial_apicid,
+	c->topo.core_id = apic->phys_pkg_id(c->topo.initial_apicid,
 				ht_mask_width) & core_select_mask;
 
 	if (die_level_present) {
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -487,7 +487,7 @@ static bool match_smt(struct cpuinfo_x86
 		if (c->topo.pkg_id == o->topo.pkg_id &&
 		    c->topo.die_id == o->topo.die_id &&
 		    per_cpu(cpu_llc_id, cpu1) == per_cpu(cpu_llc_id, cpu2)) {
-			if (c->cpu_core_id == o->cpu_core_id)
+			if (c->topo.core_id == o->topo.core_id)
 				return topology_sane(c, o, "smt");
 
 			if ((c->cu_id != 0xff) &&
@@ -498,7 +498,7 @@ static bool match_smt(struct cpuinfo_x86
 
 	} else if (c->topo.pkg_id == o->topo.pkg_id &&
 		   c->topo.die_id == o->topo.die_id &&
-		   c->cpu_core_id == o->cpu_core_id) {
+		   c->topo.core_id == o->topo.core_id) {
 		return topology_sane(c, o, "smt");
 	}
 
@@ -1439,7 +1439,7 @@ static void remove_siblinginfo(int cpu)
 	cpumask_clear(topology_sibling_cpumask(cpu));
 	cpumask_clear(topology_core_cpumask(cpu));
 	cpumask_clear(topology_die_cpumask(cpu));
-	c->cpu_core_id = 0;
+	c->topo.core_id = 0;
 	c->booted_cores = 0;
 	cpumask_clear_cpu(cpu, cpu_sibling_setup_mask);
 	recompute_smt_state();


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 07/38] x86/cpu: Move cu_id into topology info
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (5 preceding siblings ...)
  2023-07-28 12:12 ` [patch v2 06/38] x86/cpu: Move cpu_core_id into topology info Thomas Gleixner
@ 2023-07-28 12:12 ` Thomas Gleixner
  2023-07-28 12:12 ` [patch v2 08/38] x86/cpu: Remove pointless evaluation of x86_coreid_bits Thomas Gleixner
                   ` (32 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:12 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/processor.h |    4 +++-
 arch/x86/kernel/cpu/amd.c        |    2 +-
 arch/x86/kernel/cpu/common.c     |    2 +-
 arch/x86/kernel/smpboot.c        |    6 +++---
 4 files changed, 8 insertions(+), 6 deletions(-)

--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -89,6 +89,9 @@ struct cpuinfo_topology {
 	// Physical die ID on AMD, Relative on Intel
 	u32			die_id;
 
+	// Compute unit ID - AMD specific
+	u32			cu_id;
+
 	// Core ID relative to the package
 	u32			core_id;
 };
@@ -109,7 +112,6 @@ struct cpuinfo_x86 {
 	__u8			x86_phys_bits;
 	/* CPUID returned core id bits: */
 	__u8			x86_coreid_bits;
-	__u8			cu_id;
 	/* Max extended CPUID function supported: */
 	__u32			extended_cpuid_level;
 	/* Maximum supported CPUID level, -1=no CPUID: */
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -341,7 +341,7 @@ static void amd_get_topology(struct cpui
 		c->topo.die_id  = ecx & 0xff;
 
 		if (c->x86 == 0x15)
-			c->cu_id = ebx & 0xff;
+			c->topo.cu_id = ebx & 0xff;
 
 		if (c->x86 >= 0x17) {
 			c->topo.core_id = ebx & 0xff;
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1789,7 +1789,7 @@ static void identify_cpu(struct cpuinfo_
 	c->x86_model_id[0] = '\0';  /* Unset */
 	c->x86_max_cores = 1;
 	c->x86_coreid_bits = 0;
-	c->cu_id = 0xff;
+	c->topo.cu_id = 0xff;
 #ifdef CONFIG_X86_64
 	c->x86_clflush_size = 64;
 	c->x86_phys_bits = 36;
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -490,9 +490,9 @@ static bool match_smt(struct cpuinfo_x86
 			if (c->topo.core_id == o->topo.core_id)
 				return topology_sane(c, o, "smt");
 
-			if ((c->cu_id != 0xff) &&
-			    (o->cu_id != 0xff) &&
-			    (c->cu_id == o->cu_id))
+			if ((c->topo.cu_id != 0xff) &&
+			    (o->topo.cu_id != 0xff) &&
+			    (c->topo.cu_id == o->topo.cu_id))
 				return topology_sane(c, o, "smt");
 		}
 


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 08/38] x86/cpu: Remove pointless evaluation of x86_coreid_bits
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (6 preceding siblings ...)
  2023-07-28 12:12 ` [patch v2 07/38] x86/cpu: Move cu_id " Thomas Gleixner
@ 2023-07-28 12:12 ` Thomas Gleixner
  2023-07-28 12:12 ` [patch v2 09/38] x86/cpu: Move logical package and die IDs into topology info Thomas Gleixner
                   ` (31 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:12 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

cpuinfo_x86::x86_coreid_bits is only used by the AMD numa topology code. No
point in evaluating it on non AMD systems.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/intel.c   |   13 -------------
 arch/x86/kernel/cpu/zhaoxin.c |   14 --------------
 2 files changed, 27 deletions(-)

--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -488,19 +488,6 @@ static void early_init_intel(struct cpui
 		setup_clear_cpu_cap(X86_FEATURE_PGE);
 	}
 
-	if (c->cpuid_level >= 0x00000001) {
-		u32 eax, ebx, ecx, edx;
-
-		cpuid(0x00000001, &eax, &ebx, &ecx, &edx);
-		/*
-		 * If HTT (EDX[28]) is set EBX[16:23] contain the number of
-		 * apicids which are reserved per package. Store the resulting
-		 * shift value for the package management code.
-		 */
-		if (edx & (1U << 28))
-			c->x86_coreid_bits = get_count_order((ebx >> 16) & 0xff);
-	}
-
 	check_memory_type_self_snoop_errata(c);
 
 	/*
--- a/arch/x86/kernel/cpu/zhaoxin.c
+++ b/arch/x86/kernel/cpu/zhaoxin.c
@@ -65,20 +65,6 @@ static void early_init_zhaoxin(struct cp
 		set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);
 		set_cpu_cap(c, X86_FEATURE_NONSTOP_TSC);
 	}
-
-	if (c->cpuid_level >= 0x00000001) {
-		u32 eax, ebx, ecx, edx;
-
-		cpuid(0x00000001, &eax, &ebx, &ecx, &edx);
-		/*
-		 * If HTT (EDX[28]) is set EBX[16:23] contain the number of
-		 * apicids which are reserved per package. Store the resulting
-		 * shift value for the package management code.
-		 */
-		if (edx & (1U << 28))
-			c->x86_coreid_bits = get_count_order((ebx >> 16) & 0xff);
-	}
-
 }
 
 static void init_zhaoxin(struct cpuinfo_x86 *c)


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 09/38] x86/cpu: Move logical package and die IDs into topology info
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (7 preceding siblings ...)
  2023-07-28 12:12 ` [patch v2 08/38] x86/cpu: Remove pointless evaluation of x86_coreid_bits Thomas Gleixner
@ 2023-07-28 12:12 ` Thomas Gleixner
  2023-07-28 12:12 ` [patch v2 10/38] x86/cpu: Move cpu_l[l2]c_id " Thomas Gleixner
                   ` (30 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:12 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

Yet another topology related data pair. Rename logical_proc_id to
logical_pkg_id so it fits the common naming conventions.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 Documentation/arch/x86/topology.rst |    2 +-
 arch/x86/events/intel/uncore.c      |    2 +-
 arch/x86/include/asm/processor.h    |    8 ++++----
 arch/x86/include/asm/topology.h     |    4 ++--
 arch/x86/kernel/cpu/common.c        |    2 +-
 arch/x86/kernel/smpboot.c           |    8 ++++----
 6 files changed, 13 insertions(+), 13 deletions(-)

--- a/Documentation/arch/x86/topology.rst
+++ b/Documentation/arch/x86/topology.rst
@@ -67,7 +67,7 @@ AMD nomenclature for package is 'Node'.
     Modern systems use this value for the socket. There may be multiple
     packages within a socket. This value may differ from topo.die_id.
 
-  - cpuinfo_x86.logical_proc_id:
+  - cpuinfo_x86.topo.logical_pkg_id:
 
     The logical ID of the package. As we do not trust BIOSes to enumerate the
     packages in a consistent way, we introduced the concept of logical package
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -74,7 +74,7 @@ int uncore_device_to_die(struct pci_dev
 		struct cpuinfo_x86 *c = &cpu_data(cpu);
 
 		if (c->initialized && cpu_to_node(cpu) == node)
-			return c->logical_die_id;
+			return c->topo.logical_die_id;
 	}
 
 	return -1;
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -94,6 +94,10 @@ struct cpuinfo_topology {
 
 	// Core ID relative to the package
 	u32			core_id;
+
+	// Logical ID mappings
+	u32			logical_pkg_id;
+	u32			logical_die_id;
 };
 
 struct cpuinfo_x86 {
@@ -144,10 +148,6 @@ struct cpuinfo_x86 {
 	u16			x86_clflush_size;
 	/* number of cores as seen by the OS: */
 	u16			booted_cores;
-	/* Logical processor id: */
-	u16			logical_proc_id;
-	/* Core id: */
-	u16			logical_die_id;
 	/* Index into per_cpu list: */
 	u16			cpu_index;
 	/*  Is SMT active on this core? */
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -105,9 +105,9 @@ static inline void setup_node_to_cpumask
 extern const struct cpumask *cpu_coregroup_mask(int cpu);
 extern const struct cpumask *cpu_clustergroup_mask(int cpu);
 
-#define topology_logical_package_id(cpu)	(cpu_data(cpu).logical_proc_id)
+#define topology_logical_package_id(cpu)	(cpu_data(cpu).topo.logical_pkg_id)
 #define topology_physical_package_id(cpu)	(cpu_data(cpu).topo.pkg_id)
-#define topology_logical_die_id(cpu)		(cpu_data(cpu).logical_die_id)
+#define topology_logical_die_id(cpu)		(cpu_data(cpu).topo.logical_die_id)
 #define topology_die_id(cpu)			(cpu_data(cpu).topo.die_id)
 #define topology_core_id(cpu)			(cpu_data(cpu).topo.core_id)
 #define topology_ppin(cpu)			(cpu_data(cpu).ppin)
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1770,7 +1770,7 @@ static void validate_apic_and_package_id
 	BUG_ON(topology_update_package_map(c->topo.pkg_id, cpu));
 	BUG_ON(topology_update_die_map(c->topo.die_id, cpu));
 #else
-	c->logical_proc_id = 0;
+	c->topo.logical_pkg_id = 0;
 #endif
 }
 
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -348,7 +348,7 @@ int topology_phys_to_logical_pkg(unsigne
 		struct cpuinfo_x86 *c = &cpu_data(cpu);
 
 		if (c->initialized && c->topo.pkg_id == phys_pkg)
-			return c->logical_proc_id;
+			return c->topo.logical_pkg_id;
 	}
 	return -1;
 }
@@ -370,7 +370,7 @@ static int topology_phys_to_logical_die(
 
 		if (c->initialized && c->topo.die_id == die_id &&
 		    c->topo.pkg_id == proc_id)
-			return c->logical_die_id;
+			return c->topo.logical_die_id;
 	}
 	return -1;
 }
@@ -395,7 +395,7 @@ int topology_update_package_map(unsigned
 			cpu, pkg, new);
 	}
 found:
-	cpu_data(cpu).logical_proc_id = new;
+	cpu_data(cpu).topo.logical_pkg_id = new;
 	return 0;
 }
 /**
@@ -418,7 +418,7 @@ int topology_update_die_map(unsigned int
 			cpu, die, new);
 	}
 found:
-	cpu_data(cpu).logical_die_id = new;
+	cpu_data(cpu).topo.logical_die_id = new;
 	return 0;
 }
 


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 10/38] x86/cpu: Move cpu_l[l2]c_id into topology info
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (8 preceding siblings ...)
  2023-07-28 12:12 ` [patch v2 09/38] x86/cpu: Move logical package and die IDs into topology info Thomas Gleixner
@ 2023-07-28 12:12 ` Thomas Gleixner
  2023-07-28 12:12 ` [patch v2 11/38] x86/apic: Use BAD_APICID consistently; Thomas Gleixner
                   ` (29 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:12 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

The topology IDs which identify the LLC and L2 domains clearly belong to
the per CPU topology information.

Move them into cpuinfo_x86::cpuinfo_topo and get rid of the extra per CPU
data and the related exports.

This also paves the way to do proper topology evaluation during early boot
because it removes the only per CPU dependency for that.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 Documentation/arch/x86/topology.rst  |    4 +---
 arch/x86/events/amd/uncore.c         |    2 +-
 arch/x86/include/asm/cacheinfo.h     |    3 ---
 arch/x86/include/asm/processor.h     |   14 +++++++++++++-
 arch/x86/include/asm/smp.h           |    2 --
 arch/x86/include/asm/topology.h      |    2 +-
 arch/x86/kernel/apic/apic_numachip.c |    2 +-
 arch/x86/kernel/cpu/amd.c            |   12 ++++--------
 arch/x86/kernel/cpu/cacheinfo.c      |   33 ++++++++++++---------------------
 arch/x86/kernel/cpu/common.c         |   14 ++------------
 arch/x86/kernel/cpu/cpu.h            |    3 +++
 arch/x86/kernel/cpu/hygon.c          |   14 +++++---------
 arch/x86/kernel/smpboot.c            |   10 +++++-----
 13 files changed, 48 insertions(+), 67 deletions(-)

--- a/Documentation/arch/x86/topology.rst
+++ b/Documentation/arch/x86/topology.rst
@@ -79,9 +79,7 @@ AMD nomenclature for package is 'Node'.
     The maximum possible number of packages in the system. Helpful for per
     package facilities to preallocate per package information.
 
-  - cpu_llc_id:
-
-    A per-CPU variable containing:
+  - cpuinfo_x86.topo.llc_id:
 
       - On Intel, the first APIC ID of the list of CPUs sharing the Last Level
         Cache
--- a/arch/x86/events/amd/uncore.c
+++ b/arch/x86/events/amd/uncore.c
@@ -537,7 +537,7 @@ static int amd_uncore_cpu_starting(unsig
 
 	if (amd_uncore_llc) {
 		uncore = *per_cpu_ptr(amd_uncore_llc, cpu);
-		uncore->id = get_llc_id(cpu);
+		uncore->id = per_cpu_llc_id(cpu);
 
 		uncore = amd_uncore_find_online_sibling(uncore, amd_uncore_llc);
 		*per_cpu_ptr(amd_uncore_llc, cpu) = uncore;
--- a/arch/x86/include/asm/cacheinfo.h
+++ b/arch/x86/include/asm/cacheinfo.h
@@ -7,9 +7,6 @@ extern unsigned int memory_caching_contr
 #define CACHE_MTRR 0x01
 #define CACHE_PAT  0x02
 
-void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, int cpu);
-void cacheinfo_hygon_init_llc_id(struct cpuinfo_x86 *c, int cpu);
-
 void cache_disable(void);
 void cache_enable(void);
 void set_cache_aps_delayed_init(bool val);
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -98,6 +98,10 @@ struct cpuinfo_topology {
 	// Logical ID mappings
 	u32			logical_pkg_id;
 	u32			logical_die_id;
+
+	// Cache level topology IDs
+	u32			llc_id;
+	u32			l2c_id;
 };
 
 struct cpuinfo_x86 {
@@ -687,7 +691,15 @@ extern int set_tsc_mode(unsigned int val
 
 DECLARE_PER_CPU(u64, msr_misc_features_shadow);
 
-extern u16 get_llc_id(unsigned int cpu);
+static inline u16 per_cpu_llc_id(unsigned int cpu)
+{
+	return per_cpu(cpu_info.topo.llc_id, cpu);
+}
+
+static inline u16 per_cpu_l2c_id(unsigned int cpu)
+{
+	return per_cpu(cpu_info.topo.l2c_id, cpu);
+}
 
 #ifdef CONFIG_CPU_SUP_AMD
 extern u32 amd_get_nodes_per_socket(void);
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -17,8 +17,6 @@ DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_
 /* cpus sharing the last level cache: */
 DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_llc_shared_map);
 DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_l2c_shared_map);
-DECLARE_PER_CPU_READ_MOSTLY(u16, cpu_llc_id);
-DECLARE_PER_CPU_READ_MOSTLY(u16, cpu_l2c_id);
 
 DECLARE_EARLY_PER_CPU_READ_MOSTLY(u16, x86_cpu_to_apicid);
 DECLARE_EARLY_PER_CPU_READ_MOSTLY(u32, x86_cpu_to_acpiid);
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -115,7 +115,7 @@ extern const struct cpumask *cpu_cluster
 extern unsigned int __max_die_per_package;
 
 #ifdef CONFIG_SMP
-#define topology_cluster_id(cpu)		(per_cpu(cpu_l2c_id, cpu))
+#define topology_cluster_id(cpu)		(cpu_data(cpu).topo.l2c_id)
 #define topology_die_cpumask(cpu)		(per_cpu(cpu_die_map, cpu))
 #define topology_cluster_cpumask(cpu)		(cpu_clustergroup_mask(cpu))
 #define topology_core_cpumask(cpu)		(per_cpu(cpu_core_map, cpu))
--- a/arch/x86/kernel/apic/apic_numachip.c
+++ b/arch/x86/kernel/apic/apic_numachip.c
@@ -161,7 +161,7 @@ static void fixup_cpu_id(struct cpuinfo_
 	u64 val;
 	u32 nodes = 1;
 
-	this_cpu_write(cpu_llc_id, node);
+	c->topo.llc_id = node;
 
 	/* Account for nodes per socket in multi-core-module processors */
 	if (boot_cpu_has(X86_FEATURE_NODEID_MSR)) {
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -329,8 +329,6 @@ static void legacy_fixup_core_id(struct
  */
 static void amd_get_topology(struct cpuinfo_x86 *c)
 {
-	int cpu = smp_processor_id();
-
 	/* get information required for multi-node processors */
 	if (boot_cpu_has(X86_FEATURE_TOPOEXT)) {
 		int err;
@@ -358,15 +356,14 @@ static void amd_get_topology(struct cpui
 		if (!err)
 			c->x86_coreid_bits = get_count_order(c->x86_max_cores);
 
-		cacheinfo_amd_init_llc_id(c, cpu);
+		cacheinfo_amd_init_llc_id(c);
 
 	} else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) {
 		u64 value;
 
 		rdmsrl(MSR_FAM10H_NODE_ID, value);
 		c->topo.die_id = value & 7;
-
-		per_cpu(cpu_llc_id, cpu) = c->topo.die_id;
+		c->topo.llc_id = c->topo.die_id;
 	} else
 		return;
 
@@ -383,7 +380,6 @@ static void amd_get_topology(struct cpui
 static void amd_detect_cmp(struct cpuinfo_x86 *c)
 {
 	unsigned bits;
-	int cpu = smp_processor_id();
 
 	bits = c->x86_coreid_bits;
 	/* Low order bits define the core id (index of core in socket) */
@@ -391,7 +387,7 @@ static void amd_detect_cmp(struct cpuinf
 	/* Convert the initial APIC ID into the socket ID */
 	c->topo.pkg_id = c->topo.initial_apicid >> bits;
 	/* use socket ID also for last level cache */
-	per_cpu(cpu_llc_id, cpu) = c->topo.die_id = c->topo.pkg_id;
+	c->topo.llc_id = c->topo.die_id = c->topo.pkg_id;
 }
 
 u32 amd_get_nodes_per_socket(void)
@@ -409,7 +405,7 @@ static void srat_detect_node(struct cpui
 
 	node = numa_cpu_node(cpu);
 	if (node == NUMA_NO_NODE)
-		node = get_llc_id(cpu);
+		node = per_cpu_llc_id(cpu);
 
 	/*
 	 * On multi-fabric platform (e.g. Numascale NumaChip) a
--- a/arch/x86/kernel/cpu/cacheinfo.c
+++ b/arch/x86/kernel/cpu/cacheinfo.c
@@ -661,7 +661,7 @@ static int find_num_cache_leaves(struct
 	return i;
 }
 
-void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, int cpu)
+void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c)
 {
 	/*
 	 * We may have multiple LLCs if L3 caches exist, so check if we
@@ -672,13 +672,13 @@ void cacheinfo_amd_init_llc_id(struct cp
 
 	if (c->x86 < 0x17) {
 		/* LLC is at the node level. */
-		per_cpu(cpu_llc_id, cpu) = c->topo.die_id;
+		c->topo.llc_id = c->topo.die_id;
 	} else if (c->x86 == 0x17 && c->x86_model <= 0x1F) {
 		/*
 		 * LLC is at the core complex level.
 		 * Core complex ID is ApicId[3] for these processors.
 		 */
-		per_cpu(cpu_llc_id, cpu) = c->topo.apicid >> 3;
+		c->topo.llc_id = c->topo.apicid >> 3;
 	} else {
 		/*
 		 * LLC ID is calculated from the number of threads sharing the
@@ -694,12 +694,12 @@ void cacheinfo_amd_init_llc_id(struct cp
 		if (num_sharing_cache) {
 			int bits = get_count_order(num_sharing_cache);
 
-			per_cpu(cpu_llc_id, cpu) = c->topo.apicid >> bits;
+			c->topo.llc_id = c->topo.apicid >> bits;
 		}
 	}
 }
 
-void cacheinfo_hygon_init_llc_id(struct cpuinfo_x86 *c, int cpu)
+void cacheinfo_hygon_init_llc_id(struct cpuinfo_x86 *c)
 {
 	/*
 	 * We may have multiple LLCs if L3 caches exist, so check if we
@@ -712,7 +712,7 @@ void cacheinfo_hygon_init_llc_id(struct
 	 * LLC is at the core complex level.
 	 * Core complex ID is ApicId[3] for these processors.
 	 */
-	per_cpu(cpu_llc_id, cpu) = c->topo.apicid >> 3;
+	c->topo.llc_id = c->topo.apicid >> 3;
 }
 
 void init_amd_cacheinfo(struct cpuinfo_x86 *c)
@@ -740,9 +740,6 @@ void init_intel_cacheinfo(struct cpuinfo
 	unsigned int new_l1d = 0, new_l1i = 0; /* Cache sizes from cpuid(4) */
 	unsigned int new_l2 = 0, new_l3 = 0, i; /* Cache sizes from cpuid(4) */
 	unsigned int l2_id = 0, l3_id = 0, num_threads_sharing, index_msb;
-#ifdef CONFIG_SMP
-	unsigned int cpu = c->cpu_index;
-#endif
 
 	if (c->cpuid_level > 3) {
 		static int is_initialized;
@@ -856,30 +853,24 @@ void init_intel_cacheinfo(struct cpuinfo
 
 	if (new_l2) {
 		l2 = new_l2;
-#ifdef CONFIG_SMP
-		per_cpu(cpu_llc_id, cpu) = l2_id;
-		per_cpu(cpu_l2c_id, cpu) = l2_id;
-#endif
+		c->topo.llc_id = l2_id;
+		c->topo.l2c_id = l2_id;
 	}
 
 	if (new_l3) {
 		l3 = new_l3;
-#ifdef CONFIG_SMP
-		per_cpu(cpu_llc_id, cpu) = l3_id;
-#endif
+		c->topo.llc_id = l3_id;
 	}
 
-#ifdef CONFIG_SMP
 	/*
-	 * If cpu_llc_id is not yet set, this means cpuid_level < 4 which in
+	 * If llc_id is not yet set, this means cpuid_level < 4 which in
 	 * turns means that the only possibility is SMT (as indicated in
 	 * cpuid1). Since cpuid2 doesn't specify shared caches, and we know
 	 * that SMT shares all caches, we can unconditionally set cpu_llc_id to
 	 * c->topo.pkg_id.
 	 */
-	if (per_cpu(cpu_llc_id, cpu) == BAD_APICID)
-		per_cpu(cpu_llc_id, cpu) = c->topo.pkg_id;
-#endif
+	if (c->topo.llc_id == BAD_APICID)
+		c->topo.llc_id = c->topo.pkg_id;
 
 	c->x86_cache_size = l3 ? l3 : (l2 ? l2 : (l1i+l1d));
 
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -75,18 +75,6 @@ u32 elf_hwcap2 __read_mostly;
 int smp_num_siblings = 1;
 EXPORT_SYMBOL(smp_num_siblings);
 
-/* Last level cache ID of each logical CPU */
-DEFINE_PER_CPU_READ_MOSTLY(u16, cpu_llc_id) = BAD_APICID;
-
-u16 get_llc_id(unsigned int cpu)
-{
-	return per_cpu(cpu_llc_id, cpu);
-}
-EXPORT_SYMBOL_GPL(get_llc_id);
-
-/* L2 cache ID of each logical CPU */
-DEFINE_PER_CPU_READ_MOSTLY(u16, cpu_l2c_id) = BAD_APICID;
-
 static struct ppin_info {
 	int	feature;
 	int	msr_ppin_ctl;
@@ -1790,6 +1778,8 @@ static void identify_cpu(struct cpuinfo_
 	c->x86_max_cores = 1;
 	c->x86_coreid_bits = 0;
 	c->topo.cu_id = 0xff;
+	c->topo.llc_id = BAD_APICID;
+	c->topo.l2c_id = BAD_APICID;
 #ifdef CONFIG_X86_64
 	c->x86_clflush_size = 64;
 	c->x86_phys_bits = 36;
--- a/arch/x86/kernel/cpu/cpu.h
+++ b/arch/x86/kernel/cpu/cpu.h
@@ -78,6 +78,9 @@ extern int detect_ht_early(struct cpuinf
 extern void detect_ht(struct cpuinfo_x86 *c);
 extern void check_null_seg_clears_base(struct cpuinfo_x86 *c);
 
+void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c);
+void cacheinfo_hygon_init_llc_id(struct cpuinfo_x86 *c);
+
 unsigned int aperfmperf_get_khz(int cpu);
 void cpu_select_mitigations(void);
 
--- a/arch/x86/kernel/cpu/hygon.c
+++ b/arch/x86/kernel/cpu/hygon.c
@@ -63,8 +63,6 @@ static void hygon_get_topology_early(str
  */
 static void hygon_get_topology(struct cpuinfo_x86 *c)
 {
-	int cpu = smp_processor_id();
-
 	/* get information required for multi-node processors */
 	if (boot_cpu_has(X86_FEATURE_TOPOEXT)) {
 		int err;
@@ -90,14 +88,13 @@ static void hygon_get_topology(struct cp
 		/* Socket ID is ApicId[6] for these processors. */
 		c->topo.pkg_id = c->topo.apicid >> APICID_SOCKET_ID_BIT;
 
-		cacheinfo_hygon_init_llc_id(c, cpu);
+		cacheinfo_hygon_init_llc_id(c);
 	} else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) {
 		u64 value;
 
 		rdmsrl(MSR_FAM10H_NODE_ID, value);
 		c->topo.die_id = value & 7;
-
-		per_cpu(cpu_llc_id, cpu) = c->topo.die_id;
+		c->topo.llc_id = c->topo.die_id;
 	} else
 		return;
 
@@ -112,15 +109,14 @@ static void hygon_get_topology(struct cp
 static void hygon_detect_cmp(struct cpuinfo_x86 *c)
 {
 	unsigned int bits;
-	int cpu = smp_processor_id();
 
 	bits = c->x86_coreid_bits;
 	/* Low order bits define the core id (index of core in socket) */
 	c->topo.core_id = c->topo.initial_apicid & ((1 << bits)-1);
 	/* Convert the initial APIC ID into the socket ID */
 	c->topo.pkg_id = c->topo.initial_apicid >> bits;
-	/* use socket ID also for last level cache */
-	per_cpu(cpu_llc_id, cpu) = c->topo.die_id = c->topo.pkg_id;
+	/* Use package ID also for last level cache */
+	c->topo.llc_id = c->topo.die_id = c->topo.pkg_id;
 }
 
 static void srat_detect_node(struct cpuinfo_x86 *c)
@@ -132,7 +128,7 @@ static void srat_detect_node(struct cpui
 
 	node = numa_cpu_node(cpu);
 	if (node == NUMA_NO_NODE)
-		node = per_cpu(cpu_llc_id, cpu);
+		node = c->topo.llc_id;
 
 	/*
 	 * On multi-fabric platform (e.g. Numascale NumaChip) a
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -486,7 +486,7 @@ static bool match_smt(struct cpuinfo_x86
 
 		if (c->topo.pkg_id == o->topo.pkg_id &&
 		    c->topo.die_id == o->topo.die_id &&
-		    per_cpu(cpu_llc_id, cpu1) == per_cpu(cpu_llc_id, cpu2)) {
+		    per_cpu_llc_id(cpu1) == per_cpu_llc_id(cpu2)) {
 			if (c->topo.core_id == o->topo.core_id)
 				return topology_sane(c, o, "smt");
 
@@ -518,11 +518,11 @@ static bool match_l2c(struct cpuinfo_x86
 	int cpu1 = c->cpu_index, cpu2 = o->cpu_index;
 
 	/* If the arch didn't set up l2c_id, fall back to SMT */
-	if (per_cpu(cpu_l2c_id, cpu1) == BAD_APICID)
+	if (per_cpu_l2c_id(cpu1) == BAD_APICID)
 		return match_smt(c, o);
 
 	/* Do not match if L2 cache id does not match: */
-	if (per_cpu(cpu_l2c_id, cpu1) != per_cpu(cpu_l2c_id, cpu2))
+	if (per_cpu_l2c_id(cpu1) != per_cpu_l2c_id(cpu2))
 		return false;
 
 	return topology_sane(c, o, "l2c");
@@ -568,11 +568,11 @@ static bool match_llc(struct cpuinfo_x86
 	bool intel_snc = id && id->driver_data;
 
 	/* Do not match if we do not have a valid APICID for cpu: */
-	if (per_cpu(cpu_llc_id, cpu1) == BAD_APICID)
+	if (per_cpu_llc_id(cpu1) == BAD_APICID)
 		return false;
 
 	/* Do not match if LLC id does not match: */
-	if (per_cpu(cpu_llc_id, cpu1) != per_cpu(cpu_llc_id, cpu2))
+	if (per_cpu_llc_id(cpu1) != per_cpu_llc_id(cpu2))
 		return false;
 
 	/*


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 11/38] x86/apic: Use BAD_APICID consistently;
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (9 preceding siblings ...)
  2023-07-28 12:12 ` [patch v2 10/38] x86/cpu: Move cpu_l[l2]c_id " Thomas Gleixner
@ 2023-07-28 12:12 ` Thomas Gleixner
  2023-07-28 22:36   ` Sohil Mehta
  2023-07-28 12:12 ` [patch v2 12/38] x86/apic: Use u32 for APIC IDs in global data Thomas Gleixner
                   ` (28 subsequent siblings)
  39 siblings, 1 reply; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:12 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

APIC ID checks compare with BAD_APICID all over the place, but some
initializers and some code which fiddles with global data structure use
-1[U] instead. That simply cannot work at all.

Fix it up and use BAD_APICID consistenly all over the place.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/acpi/boot.c |    2 +-
 arch/x86/kernel/apic/apic.c |    6 ++----
 2 files changed, 3 insertions(+), 5 deletions(-)

--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -852,7 +852,7 @@ int acpi_unmap_cpu(int cpu)
 	set_apicid_to_node(per_cpu(x86_cpu_to_apicid, cpu), NUMA_NO_NODE);
 #endif
 
-	per_cpu(x86_cpu_to_apicid, cpu) = -1;
+	per_cpu(x86_cpu_to_apicid, cpu) = BAD_APICID;
 	set_cpu_present(cpu, false);
 	num_processors--;
 
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -70,7 +70,7 @@ unsigned int num_processors;
 unsigned disabled_cpus;
 
 /* Processor that is doing the boot up */
-unsigned int boot_cpu_physical_apicid __ro_after_init = -1U;
+unsigned int boot_cpu_physical_apicid __ro_after_init = BAD_APICID;
 EXPORT_SYMBOL_GPL(boot_cpu_physical_apicid);
 
 u8 boot_cpu_apic_version __ro_after_init;
@@ -2316,9 +2316,7 @@ static int nr_logical_cpuids = 1;
 /*
  * Used to store mapping between logical CPU IDs and APIC IDs.
  */
-int cpuid_to_apicid[] = {
-	[0 ... NR_CPUS - 1] = -1,
-};
+int cpuid_to_apicid[] = { [0 ... NR_CPUS - 1] = BAD_APICID, };
 
 bool arch_match_cpu_phys_id(int cpu, u64 phys_id)
 {


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 12/38] x86/apic: Use u32 for APIC IDs in global data
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (10 preceding siblings ...)
  2023-07-28 12:12 ` [patch v2 11/38] x86/apic: Use BAD_APICID consistently; Thomas Gleixner
@ 2023-07-28 12:12 ` Thomas Gleixner
  2023-07-28 12:12 ` [patch v2 13/38] x86/apic: Use u32 for check_apicid_used() Thomas Gleixner
                   ` (27 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:12 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

APIC IDs are used with random data types u16, u32, int, unsigned int,
unsigned long.

Make it all consistently use u32 because that reflects the hardware
register width and fixup the most obvious usage sites of that.

The APIC callbacks will be addressed separately.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/apic.h      |    2 +-
 arch/x86/include/asm/mpspec.h    |    2 +-
 arch/x86/include/asm/processor.h |    4 ++--
 arch/x86/include/asm/smp.h       |    2 +-
 arch/x86/kernel/apic/apic.c      |   12 ++++++------
 arch/x86/kernel/kvm.c            |    6 +++---
 arch/x86/mm/numa.c               |    4 ++--
 7 files changed, 16 insertions(+), 16 deletions(-)

--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -54,7 +54,7 @@ extern int local_apic_timer_c2_ok;
 extern bool apic_is_disabled;
 extern unsigned int lapic_timer_period;
 
-extern int cpuid_to_apicid[];
+extern u32 cpuid_to_apicid[];
 
 extern enum apic_intr_mode_id apic_intr_mode;
 enum apic_intr_mode_id {
--- a/arch/x86/include/asm/mpspec.h
+++ b/arch/x86/include/asm/mpspec.h
@@ -37,7 +37,7 @@ extern int mp_bus_id_to_type[MAX_MP_BUSS
 
 extern DECLARE_BITMAP(mp_bus_not_pci, MAX_MP_BUSSES);
 
-extern unsigned int boot_cpu_physical_apicid;
+extern u32 boot_cpu_physical_apicid;
 extern u8 boot_cpu_apic_version;
 
 #ifdef CONFIG_X86_LOCAL_APIC
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -691,12 +691,12 @@ extern int set_tsc_mode(unsigned int val
 
 DECLARE_PER_CPU(u64, msr_misc_features_shadow);
 
-static inline u16 per_cpu_llc_id(unsigned int cpu)
+static inline u32 per_cpu_llc_id(unsigned int cpu)
 {
 	return per_cpu(cpu_info.topo.llc_id, cpu);
 }
 
-static inline u16 per_cpu_l2c_id(unsigned int cpu)
+static inline u32 per_cpu_l2c_id(unsigned int cpu)
 {
 	return per_cpu(cpu_info.topo.l2c_id, cpu);
 }
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -18,7 +18,7 @@ DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_
 DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_llc_shared_map);
 DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_l2c_shared_map);
 
-DECLARE_EARLY_PER_CPU_READ_MOSTLY(u16, x86_cpu_to_apicid);
+DECLARE_EARLY_PER_CPU_READ_MOSTLY(u32, x86_cpu_to_apicid);
 DECLARE_EARLY_PER_CPU_READ_MOSTLY(u32, x86_cpu_to_acpiid);
 
 struct task_struct;
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -70,7 +70,7 @@ unsigned int num_processors;
 unsigned disabled_cpus;
 
 /* Processor that is doing the boot up */
-unsigned int boot_cpu_physical_apicid __ro_after_init = BAD_APICID;
+u32 boot_cpu_physical_apicid __ro_after_init = BAD_APICID;
 EXPORT_SYMBOL_GPL(boot_cpu_physical_apicid);
 
 u8 boot_cpu_apic_version __ro_after_init;
@@ -85,7 +85,7 @@ physid_mask_t phys_cpu_present_map;
  * disable_cpu_apicid=<int>, mostly used for the kdump 2nd kernel to
  * avoid undefined behaviour caused by sending INIT from AP to BSP.
  */
-static unsigned int disabled_cpu_apicid __ro_after_init = BAD_APICID;
+static u32 disabled_cpu_apicid __ro_after_init = BAD_APICID;
 
 /*
  * This variable controls which CPUs receive external NMIs.  By default,
@@ -109,7 +109,7 @@ static inline bool apic_accessible(void)
 /*
  * Map cpu index to physical APIC ID
  */
-DEFINE_EARLY_PER_CPU_READ_MOSTLY(u16, x86_cpu_to_apicid, BAD_APICID);
+DEFINE_EARLY_PER_CPU_READ_MOSTLY(u32, x86_cpu_to_apicid, BAD_APICID);
 DEFINE_EARLY_PER_CPU_READ_MOSTLY(u32, x86_cpu_to_acpiid, U32_MAX);
 EXPORT_EARLY_PER_CPU_SYMBOL(x86_cpu_to_apicid);
 EXPORT_EARLY_PER_CPU_SYMBOL(x86_cpu_to_acpiid);
@@ -2316,11 +2316,11 @@ static int nr_logical_cpuids = 1;
 /*
  * Used to store mapping between logical CPU IDs and APIC IDs.
  */
-int cpuid_to_apicid[] = { [0 ... NR_CPUS - 1] = BAD_APICID, };
+u32 cpuid_to_apicid[] = { [0 ... NR_CPUS - 1] = BAD_APICID, };
 
 bool arch_match_cpu_phys_id(int cpu, u64 phys_id)
 {
-	return phys_id == cpuid_to_apicid[cpu];
+	return phys_id == (u64)cpuid_to_apicid[cpu];
 }
 
 #ifdef CONFIG_SMP
@@ -2380,7 +2380,7 @@ static int allocate_logical_cpuid(int ap
 	return nr_logical_cpuids++;
 }
 
-static void cpu_update_apic(int cpu, int apicid)
+static void cpu_update_apic(int cpu, u32 apicid)
 {
 #if defined(CONFIG_SMP) || defined(CONFIG_X86_64)
 	early_per_cpu(x86_cpu_to_apicid, cpu) = apicid;
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -500,13 +500,13 @@ static bool pv_sched_yield_supported(voi
 static void __send_ipi_mask(const struct cpumask *mask, int vector)
 {
 	unsigned long flags;
-	int cpu, apic_id, icr;
-	int min = 0, max = 0;
+	int cpu, min = 0, max = 0;
 #ifdef CONFIG_X86_64
 	__uint128_t ipi_bitmap = 0;
 #else
 	u64 ipi_bitmap = 0;
 #endif
+	u32 apic_id, icr;
 	long ret;
 
 	if (cpumask_empty(mask))
@@ -1030,8 +1030,8 @@ arch_initcall(activate_jump_labels);
 /* Kick a cpu by its apicid. Used to wake up a halted vcpu */
 static void kvm_kick_cpu(int cpu)
 {
-	int apicid;
 	unsigned long flags = 0;
+	u32 apicid;
 
 	apicid = per_cpu(x86_cpu_to_apicid, cpu);
 	kvm_hypercall2(KVM_HC_KICK_CPU, flags, apicid);
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -56,7 +56,7 @@ s16 __apicid_to_node[MAX_LOCAL_APIC] = {
 
 int numa_cpu_node(int cpu)
 {
-	int apicid = early_per_cpu(x86_cpu_to_apicid, cpu);
+	u32 apicid = early_per_cpu(x86_cpu_to_apicid, cpu);
 
 	if (apicid != BAD_APICID)
 		return __apicid_to_node[apicid];
@@ -786,7 +786,7 @@ void __init init_gi_nodes(void)
 void __init init_cpu_to_node(void)
 {
 	int cpu;
-	u16 *cpu_to_apicid = early_per_cpu_ptr(x86_cpu_to_apicid);
+	u32 *cpu_to_apicid = early_per_cpu_ptr(x86_cpu_to_apicid);
 
 	BUG_ON(cpu_to_apicid == NULL);
 


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 13/38] x86/apic: Use u32 for check_apicid_used()
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (11 preceding siblings ...)
  2023-07-28 12:12 ` [patch v2 12/38] x86/apic: Use u32 for APIC IDs in global data Thomas Gleixner
@ 2023-07-28 12:12 ` Thomas Gleixner
  2023-07-28 12:13 ` [patch v2 14/38] x86/apic: Use u32 for cpu_present_to_apicid() Thomas Gleixner
                   ` (26 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:12 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

APIC IDs are used with random data types u16, u32, int, unsigned int,
unsigned long.

Make it all consistently use u32 because that reflects the hardware
register width and move the default implementation to local.h as there are
no users outside the apic directory.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/apic.h         |    3 +--
 arch/x86/kernel/apic/apic_common.c  |    2 +-
 arch/x86/kernel/apic/apic_flat_64.c |    2 --
 arch/x86/kernel/apic/apic_noop.c    |    2 ++
 arch/x86/kernel/apic/bigsmp_32.c    |    2 +-
 arch/x86/kernel/apic/local.h        |    1 +
 6 files changed, 6 insertions(+), 6 deletions(-)

--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -292,7 +292,7 @@ struct apic {
 	int	(*acpi_madt_oem_check)(char *oem_id, char *oem_table_id);
 	bool	(*apic_id_registered)(void);
 
-	bool	(*check_apicid_used)(physid_mask_t *map, int apicid);
+	bool	(*check_apicid_used)(physid_mask_t *map, u32 apicid);
 	void	(*init_apic_ldr)(void);
 	void	(*ioapic_phys_id_map)(physid_mask_t *phys_map, physid_mask_t *retmap);
 	int	(*cpu_present_to_apicid)(int mps_cpu);
@@ -538,7 +538,6 @@ extern int default_apic_id_valid(u32 api
 extern u32 apic_default_calc_apicid(unsigned int cpu);
 extern u32 apic_flat_calc_apicid(unsigned int cpu);
 
-extern bool default_check_apicid_used(physid_mask_t *map, int apicid);
 extern void default_ioapic_phys_id_map(physid_mask_t *phys_map, physid_mask_t *retmap);
 extern int default_cpu_present_to_apicid(int mps_cpu);
 
--- a/arch/x86/kernel/apic/apic_common.c
+++ b/arch/x86/kernel/apic/apic_common.c
@@ -18,7 +18,7 @@ u32 apic_flat_calc_apicid(unsigned int c
 	return 1U << cpu;
 }
 
-bool default_check_apicid_used(physid_mask_t *map, int apicid)
+bool default_check_apicid_used(physid_mask_t *map, u32 apicid)
 {
 	return physid_isset(apicid, *map);
 }
--- a/arch/x86/kernel/apic/apic_flat_64.c
+++ b/arch/x86/kernel/apic/apic_flat_64.c
@@ -158,8 +158,6 @@ static struct apic apic_physflat __ro_af
 
 	.disable_esr			= 0,
 
-	.check_apicid_used		= NULL,
-	.ioapic_phys_id_map		= NULL,
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
 	.phys_pkg_id			= flat_phys_pkg_id,
 
--- a/arch/x86/kernel/apic/apic_noop.c
+++ b/arch/x86/kernel/apic/apic_noop.c
@@ -18,6 +18,8 @@
 
 #include <asm/apic.h>
 
+#include "local.h"
+
 static void noop_send_IPI(int cpu, int vector) { }
 static void noop_send_IPI_mask(const struct cpumask *cpumask, int vector) { }
 static void noop_send_IPI_mask_allbutself(const struct cpumask *cpumask, int vector) { }
--- a/arch/x86/kernel/apic/bigsmp_32.c
+++ b/arch/x86/kernel/apic/bigsmp_32.c
@@ -18,7 +18,7 @@ static unsigned bigsmp_get_apic_id(unsig
 	return (x >> 24) & 0xFF;
 }
 
-static bool bigsmp_check_apicid_used(physid_mask_t *map, int apicid)
+static bool bigsmp_check_apicid_used(physid_mask_t *map, u32 apicid)
 {
 	return false;
 }
--- a/arch/x86/kernel/apic/local.h
+++ b/arch/x86/kernel/apic/local.h
@@ -64,6 +64,7 @@ void default_send_IPI_all(int vector);
 void default_send_IPI_self(int vector);
 
 bool default_apic_id_registered(void);
+bool default_check_apicid_used(physid_mask_t *map, u32 apicid);
 
 #ifdef CONFIG_X86_32
 void default_send_IPI_mask_sequence_logical(const struct cpumask *mask, int vector);


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 14/38] x86/apic: Use u32 for cpu_present_to_apicid()
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (12 preceding siblings ...)
  2023-07-28 12:12 ` [patch v2 13/38] x86/apic: Use u32 for check_apicid_used() Thomas Gleixner
@ 2023-07-28 12:13 ` Thomas Gleixner
  2023-07-28 12:13 ` [patch v2 15/38] x86/apic: Use u32 for phys_pkg_id() Thomas Gleixner
                   ` (25 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:13 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

APIC IDs are used with random data types u16, u32, int, unsigned int,
unsigned long.

Make it all consistently use u32 because that reflects the hardware
register width and fixup a few related usage sites for consistency sake.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/apic.h        |    4 ++--
 arch/x86/kernel/apic/apic_common.c |    2 +-
 arch/x86/kernel/cpu/common.c       |    3 ++-
 arch/x86/kernel/smpboot.c          |   10 +++++-----
 arch/x86/xen/apic.c                |    2 +-
 5 files changed, 11 insertions(+), 10 deletions(-)

--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -295,7 +295,7 @@ struct apic {
 	bool	(*check_apicid_used)(physid_mask_t *map, u32 apicid);
 	void	(*init_apic_ldr)(void);
 	void	(*ioapic_phys_id_map)(physid_mask_t *phys_map, physid_mask_t *retmap);
-	int	(*cpu_present_to_apicid)(int mps_cpu);
+	u32	(*cpu_present_to_apicid)(int mps_cpu);
 	int	(*phys_pkg_id)(int cpuid_apic, int index_msb);
 
 	u32	(*get_apic_id)(unsigned long x);
@@ -539,7 +539,7 @@ extern u32 apic_default_calc_apicid(unsi
 extern u32 apic_flat_calc_apicid(unsigned int cpu);
 
 extern void default_ioapic_phys_id_map(physid_mask_t *phys_map, physid_mask_t *retmap);
-extern int default_cpu_present_to_apicid(int mps_cpu);
+extern u32 default_cpu_present_to_apicid(int mps_cpu);
 
 #else /* CONFIG_X86_LOCAL_APIC */
 
--- a/arch/x86/kernel/apic/apic_common.c
+++ b/arch/x86/kernel/apic/apic_common.c
@@ -28,7 +28,7 @@ void default_ioapic_phys_id_map(physid_m
 	*retmap = *phys_map;
 }
 
-int default_cpu_present_to_apicid(int mps_cpu)
+u32 default_cpu_present_to_apicid(int mps_cpu)
 {
 	if (mps_cpu < nr_cpu_ids && cpu_present(mps_cpu))
 		return (int)per_cpu(x86_cpu_to_apicid, mps_cpu);
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1747,7 +1747,8 @@ static void generic_identify(struct cpui
 static void validate_apic_and_package_id(struct cpuinfo_x86 *c)
 {
 #ifdef CONFIG_SMP
-	unsigned int apicid, cpu = smp_processor_id();
+	unsigned int cpu = smp_processor_id();
+	u32 apicid;
 
 	apicid = apic->cpu_present_to_apicid(cpu);
 
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -816,7 +816,7 @@ static void __init smp_quirk_init_udelay
 /*
  * Wake up AP by INIT, INIT, STARTUP sequence.
  */
-static void send_init_sequence(int phys_apicid)
+static void send_init_sequence(u32 phys_apicid)
 {
 	int maxlvt = lapic_get_maxlvt();
 
@@ -842,7 +842,7 @@ static void send_init_sequence(int phys_
 /*
  * Wake up AP by INIT, INIT, STARTUP sequence.
  */
-static int wakeup_secondary_cpu_via_init(int phys_apicid, unsigned long start_eip)
+static int wakeup_secondary_cpu_via_init(u32 phys_apicid, unsigned long start_eip)
 {
 	unsigned long send_status = 0, accept_status = 0;
 	int num_starts, j, maxlvt;
@@ -989,7 +989,7 @@ int common_cpu_up(unsigned int cpu, stru
  * Returns zero if startup was successfully sent, else error code from
  * ->wakeup_secondary_cpu.
  */
-static int do_boot_cpu(int apicid, int cpu, struct task_struct *idle)
+static int do_boot_cpu(u32 apicid, int cpu, struct task_struct *idle)
 {
 	unsigned long start_ip = real_mode_header->trampoline_start;
 	int ret;
@@ -1057,7 +1057,7 @@ static int do_boot_cpu(int apicid, int c
 
 int native_kick_ap(unsigned int cpu, struct task_struct *tidle)
 {
-	int apicid = apic->cpu_present_to_apicid(cpu);
+	u32 apicid = apic->cpu_present_to_apicid(cpu);
 	int err;
 
 	lockdep_assert_irqs_enabled();
@@ -1250,7 +1250,7 @@ void arch_thaw_secondary_cpus_end(void)
 bool smp_park_other_cpus_in_init(void)
 {
 	unsigned int cpu, this_cpu = smp_processor_id();
-	unsigned int apicid;
+	u32 apicid;
 
 	if (apic->wakeup_secondary_cpu_64 || apic->wakeup_secondary_cpu)
 		return false;
--- a/arch/x86/xen/apic.c
+++ b/arch/x86/xen/apic.c
@@ -115,7 +115,7 @@ static int xen_phys_pkg_id(int initial_a
 	return initial_apic_id >> index_msb;
 }
 
-static int xen_cpu_present_to_apicid(int cpu)
+static u32 xen_cpu_present_to_apicid(int cpu)
 {
 	if (cpu_present(cpu))
 		return cpu_data(cpu).topo.apicid;


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 15/38] x86/apic: Use u32 for phys_pkg_id()
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (13 preceding siblings ...)
  2023-07-28 12:13 ` [patch v2 14/38] x86/apic: Use u32 for cpu_present_to_apicid() Thomas Gleixner
@ 2023-07-28 12:13 ` Thomas Gleixner
  2023-08-08 20:14   ` Steve Wahl
  2023-07-28 12:13 ` [patch v2 16/38] x86/apic: Use u32 for [gs]et_apic_id() Thomas Gleixner
                   ` (24 subsequent siblings)
  39 siblings, 1 reply; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:13 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

APIC IDs are used with random data types u16, u32, int, unsigned int,
unsigned long.

Make it all consistently use u32 because that reflects the hardware
register width even if that callback going to be removed soonish.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/apic.h          |    2 +-
 arch/x86/kernel/apic/apic_flat_64.c  |    2 +-
 arch/x86/kernel/apic/apic_noop.c     |    2 +-
 arch/x86/kernel/apic/apic_numachip.c |    2 +-
 arch/x86/kernel/apic/bigsmp_32.c     |    2 +-
 arch/x86/kernel/apic/local.h         |    2 +-
 arch/x86/kernel/apic/probe_32.c      |    2 +-
 arch/x86/kernel/apic/x2apic_phys.c   |    2 +-
 arch/x86/kernel/apic/x2apic_uv_x.c   |    2 +-
 arch/x86/kernel/vsmp_64.c            |    2 +-
 arch/x86/xen/apic.c                  |    2 +-
 11 files changed, 11 insertions(+), 11 deletions(-)

--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -296,7 +296,7 @@ struct apic {
 	void	(*init_apic_ldr)(void);
 	void	(*ioapic_phys_id_map)(physid_mask_t *phys_map, physid_mask_t *retmap);
 	u32	(*cpu_present_to_apicid)(int mps_cpu);
-	int	(*phys_pkg_id)(int cpuid_apic, int index_msb);
+	u32	(*phys_pkg_id)(u32 cpuid_apic, int index_msb);
 
 	u32	(*get_apic_id)(unsigned long x);
 	u32	(*set_apic_id)(unsigned int id);
--- a/arch/x86/kernel/apic/apic_flat_64.c
+++ b/arch/x86/kernel/apic/apic_flat_64.c
@@ -66,7 +66,7 @@ static u32 set_apic_id(unsigned int id)
 	return (id & 0xFF) << 24;
 }
 
-static int flat_phys_pkg_id(int initial_apic_id, int index_msb)
+static u32 flat_phys_pkg_id(u32 initial_apic_id, int index_msb)
 {
 	return initial_apic_id >> index_msb;
 }
--- a/arch/x86/kernel/apic/apic_noop.c
+++ b/arch/x86/kernel/apic/apic_noop.c
@@ -29,7 +29,7 @@ static void noop_send_IPI_self(int vecto
 static void noop_apic_icr_write(u32 low, u32 id) { }
 static int noop_wakeup_secondary_cpu(int apicid, unsigned long start_eip) { return -1; }
 static u64 noop_apic_icr_read(void) { return 0; }
-static int noop_phys_pkg_id(int cpuid_apic, int index_msb) { return 0; }
+static u32 noop_phys_pkg_id(u32 cpuid_apic, int index_msb) { return 0; }
 static unsigned int noop_get_apic_id(unsigned long x) { return 0; }
 static void noop_apic_eoi(void) { }
 
--- a/arch/x86/kernel/apic/apic_numachip.c
+++ b/arch/x86/kernel/apic/apic_numachip.c
@@ -56,7 +56,7 @@ static u32 numachip2_set_apic_id(unsigne
 	return id << 24;
 }
 
-static int numachip_phys_pkg_id(int initial_apic_id, int index_msb)
+static u32 numachip_phys_pkg_id(u32 initial_apic_id, int index_msb)
 {
 	return initial_apic_id >> index_msb;
 }
--- a/arch/x86/kernel/apic/bigsmp_32.c
+++ b/arch/x86/kernel/apic/bigsmp_32.c
@@ -29,7 +29,7 @@ static void bigsmp_ioapic_phys_id_map(ph
 	physids_promote(0xFFL, retmap);
 }
 
-static int bigsmp_phys_pkg_id(int cpuid_apic, int index_msb)
+static u32 bigsmp_phys_pkg_id(u32 cpuid_apic, int index_msb)
 {
 	return cpuid_apic >> index_msb;
 }
--- a/arch/x86/kernel/apic/local.h
+++ b/arch/x86/kernel/apic/local.h
@@ -17,7 +17,7 @@
 void __x2apic_send_IPI_dest(unsigned int apicid, int vector, unsigned int dest);
 unsigned int x2apic_get_apic_id(unsigned long id);
 u32 x2apic_set_apic_id(unsigned int id);
-int x2apic_phys_pkg_id(int initial_apicid, int index_msb);
+u32 x2apic_phys_pkg_id(u32 initial_apicid, int index_msb);
 
 void x2apic_send_IPI_all(int vector);
 void x2apic_send_IPI_allbutself(int vector);
--- a/arch/x86/kernel/apic/probe_32.c
+++ b/arch/x86/kernel/apic/probe_32.c
@@ -18,7 +18,7 @@
 
 #include "local.h"
 
-static int default_phys_pkg_id(int cpuid_apic, int index_msb)
+static u32 default_phys_pkg_id(u32 cpuid_apic, int index_msb)
 {
 	return cpuid_apic >> index_msb;
 }
--- a/arch/x86/kernel/apic/x2apic_phys.c
+++ b/arch/x86/kernel/apic/x2apic_phys.c
@@ -134,7 +134,7 @@ u32 x2apic_set_apic_id(unsigned int id)
 	return id;
 }
 
-int x2apic_phys_pkg_id(int initial_apicid, int index_msb)
+u32 x2apic_phys_pkg_id(u32 initial_apicid, int index_msb)
 {
 	return initial_apicid >> index_msb;
 }
--- a/arch/x86/kernel/apic/x2apic_uv_x.c
+++ b/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -790,7 +790,7 @@ static unsigned int uv_read_apic_id(void
 	return x2apic_get_apic_id(apic_read(APIC_ID));
 }
 
-static int uv_phys_pkg_id(int initial_apicid, int index_msb)
+static u32 uv_phys_pkg_id(u32 initial_apicid, int index_msb)
 {
 	return uv_read_apic_id() >> index_msb;
 }
--- a/arch/x86/kernel/vsmp_64.c
+++ b/arch/x86/kernel/vsmp_64.c
@@ -127,7 +127,7 @@ static void __init vsmp_cap_cpus(void)
 #endif
 }
 
-static int apicid_phys_pkg_id(int initial_apic_id, int index_msb)
+static u32 apicid_phys_pkg_id(u32 initial_apic_id, int index_msb)
 {
 	return read_apic_id() >> index_msb;
 }
--- a/arch/x86/xen/apic.c
+++ b/arch/x86/xen/apic.c
@@ -110,7 +110,7 @@ static int xen_madt_oem_check(char *oem_
 	return xen_pv_domain();
 }
 
-static int xen_phys_pkg_id(int initial_apic_id, int index_msb)
+static u32 xen_phys_pkg_id(u32 initial_apic_id, int index_msb)
 {
 	return initial_apic_id >> index_msb;
 }


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 16/38] x86/apic: Use u32 for [gs]et_apic_id()
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (14 preceding siblings ...)
  2023-07-28 12:13 ` [patch v2 15/38] x86/apic: Use u32 for phys_pkg_id() Thomas Gleixner
@ 2023-07-28 12:13 ` Thomas Gleixner
  2023-08-08 20:15   ` Steve Wahl
  2023-07-28 12:13 ` [patch v2 17/38] x86/apic: Use u32 for wakeup_secondary_cpu[_64]() Thomas Gleixner
                   ` (23 subsequent siblings)
  39 siblings, 1 reply; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:13 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

APIC IDs are used with random data types u16, u32, int, unsigned int,
unsigned long.

Make it all consistently use u32 because that reflects the hardware
register width.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/apic.h          |   14 ++------------
 arch/x86/kernel/apic/apic_flat_64.c  |    4 ++--
 arch/x86/kernel/apic/apic_noop.c     |    2 +-
 arch/x86/kernel/apic/apic_numachip.c |    8 ++++----
 arch/x86/kernel/apic/bigsmp_32.c     |    2 +-
 arch/x86/kernel/apic/local.h         |    4 ++--
 arch/x86/kernel/apic/probe_32.c      |   10 ++++++++++
 arch/x86/kernel/apic/x2apic_phys.c   |    4 ++--
 arch/x86/kernel/apic/x2apic_uv_x.c   |    2 +-
 arch/x86/xen/apic.c                  |    4 ++--
 10 files changed, 27 insertions(+), 27 deletions(-)

--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -298,8 +298,8 @@ struct apic {
 	u32	(*cpu_present_to_apicid)(int mps_cpu);
 	u32	(*phys_pkg_id)(u32 cpuid_apic, int index_msb);
 
-	u32	(*get_apic_id)(unsigned long x);
-	u32	(*set_apic_id)(unsigned int id);
+	u32	(*get_apic_id)(u32 id);
+	u32	(*set_apic_id)(u32 apicid);
 
 	/* wakeup_secondary_cpu */
 	int	(*wakeup_secondary_cpu)(int apicid, unsigned long start_eip);
@@ -493,16 +493,6 @@ static inline bool lapic_vector_set_in_i
 	return !!(irr & (1U << (vector % 32)));
 }
 
-static inline unsigned default_get_apic_id(unsigned long x)
-{
-	unsigned int ver = GET_APIC_VERSION(apic_read(APIC_LVR));
-
-	if (APIC_XAPIC(ver) || boot_cpu_has(X86_FEATURE_EXTD_APICID))
-		return (x >> 24) & 0xFF;
-	else
-		return (x >> 24) & 0x0F;
-}
-
 /*
  * Warm reset vector position:
  */
--- a/arch/x86/kernel/apic/apic_flat_64.c
+++ b/arch/x86/kernel/apic/apic_flat_64.c
@@ -56,12 +56,12 @@ flat_send_IPI_mask_allbutself(const stru
 	_flat_send_IPI_mask(mask, vector);
 }
 
-static unsigned int flat_get_apic_id(unsigned long x)
+static u32 flat_get_apic_id(u32 x)
 {
 	return (x >> 24) & 0xFF;
 }
 
-static u32 set_apic_id(unsigned int id)
+static u32 set_apic_id(u32 id)
 {
 	return (id & 0xFF) << 24;
 }
--- a/arch/x86/kernel/apic/apic_noop.c
+++ b/arch/x86/kernel/apic/apic_noop.c
@@ -30,7 +30,7 @@ static void noop_apic_icr_write(u32 low,
 static int noop_wakeup_secondary_cpu(int apicid, unsigned long start_eip) { return -1; }
 static u64 noop_apic_icr_read(void) { return 0; }
 static u32 noop_phys_pkg_id(u32 cpuid_apic, int index_msb) { return 0; }
-static unsigned int noop_get_apic_id(unsigned long x) { return 0; }
+static u32 noop_get_apic_id(u32 apicid) { return 0; }
 static void noop_apic_eoi(void) { }
 
 static u32 noop_apic_read(u32 reg)
--- a/arch/x86/kernel/apic/apic_numachip.c
+++ b/arch/x86/kernel/apic/apic_numachip.c
@@ -25,7 +25,7 @@ static const struct apic apic_numachip1;
 static const struct apic apic_numachip2;
 static void (*numachip_apic_icr_write)(int apicid, unsigned int val) __read_mostly;
 
-static unsigned int numachip1_get_apic_id(unsigned long x)
+static u32 numachip1_get_apic_id(u32 x)
 {
 	unsigned long value;
 	unsigned int id = (x >> 24) & 0xff;
@@ -38,12 +38,12 @@ static unsigned int numachip1_get_apic_i
 	return id;
 }
 
-static u32 numachip1_set_apic_id(unsigned int id)
+static u32 numachip1_set_apic_id(u32 id)
 {
 	return (id & 0xff) << 24;
 }
 
-static unsigned int numachip2_get_apic_id(unsigned long x)
+static u32 numachip2_get_apic_id(u32 x)
 {
 	u64 mcfg;
 
@@ -51,7 +51,7 @@ static unsigned int numachip2_get_apic_i
 	return ((mcfg >> (28 - 8)) & 0xfff00) | (x >> 24);
 }
 
-static u32 numachip2_set_apic_id(unsigned int id)
+static u32 numachip2_set_apic_id(u32 id)
 {
 	return id << 24;
 }
--- a/arch/x86/kernel/apic/bigsmp_32.c
+++ b/arch/x86/kernel/apic/bigsmp_32.c
@@ -13,7 +13,7 @@
 
 #include "local.h"
 
-static unsigned bigsmp_get_apic_id(unsigned long x)
+static u32 bigsmp_get_apic_id(u32 x)
 {
 	return (x >> 24) & 0xFF;
 }
--- a/arch/x86/kernel/apic/local.h
+++ b/arch/x86/kernel/apic/local.h
@@ -15,8 +15,8 @@
 
 /* X2APIC */
 void __x2apic_send_IPI_dest(unsigned int apicid, int vector, unsigned int dest);
-unsigned int x2apic_get_apic_id(unsigned long id);
-u32 x2apic_set_apic_id(unsigned int id);
+u32 x2apic_get_apic_id(u32 id);
+u32 x2apic_set_apic_id(u32 id);
 u32 x2apic_phys_pkg_id(u32 initial_apicid, int index_msb);
 
 void x2apic_send_IPI_all(int vector);
--- a/arch/x86/kernel/apic/probe_32.c
+++ b/arch/x86/kernel/apic/probe_32.c
@@ -23,6 +23,16 @@ static u32 default_phys_pkg_id(u32 cpuid
 	return cpuid_apic >> index_msb;
 }
 
+static u32 default_get_apic_id(u32 x)
+{
+	unsigned int ver = GET_APIC_VERSION(apic_read(APIC_LVR));
+
+	if (APIC_XAPIC(ver) || boot_cpu_has(X86_FEATURE_EXTD_APICID))
+		return (x >> 24) & 0xFF;
+	else
+		return (x >> 24) & 0x0F;
+}
+
 /* should be called last. */
 static int probe_default(void)
 {
--- a/arch/x86/kernel/apic/x2apic_phys.c
+++ b/arch/x86/kernel/apic/x2apic_phys.c
@@ -124,12 +124,12 @@ static int x2apic_phys_probe(void)
 	return apic == &apic_x2apic_phys;
 }
 
-unsigned int x2apic_get_apic_id(unsigned long id)
+u32 x2apic_get_apic_id(u32 id)
 {
 	return id;
 }
 
-u32 x2apic_set_apic_id(unsigned int id)
+u32 x2apic_set_apic_id(u32 id)
 {
 	return id;
 }
--- a/arch/x86/kernel/apic/x2apic_uv_x.c
+++ b/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -780,7 +780,7 @@ static void uv_send_IPI_all(int vector)
 	uv_send_IPI_mask(cpu_online_mask, vector);
 }
 
-static u32 set_apic_id(unsigned int id)
+static u32 set_apic_id(u32 id)
 {
 	return id;
 }
--- a/arch/x86/xen/apic.c
+++ b/arch/x86/xen/apic.c
@@ -33,13 +33,13 @@ static unsigned int xen_io_apic_read(uns
 	return 0xfd;
 }
 
-static u32 xen_set_apic_id(unsigned int x)
+static u32 xen_set_apic_id(u32 x)
 {
 	WARN_ON(1);
 	return x;
 }
 
-static unsigned int xen_get_apic_id(unsigned long x)
+static u32 xen_get_apic_id(u32 x)
 {
 	return ((x)>>24) & 0xFFu;
 }


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 17/38] x86/apic: Use u32 for wakeup_secondary_cpu[_64]()
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (15 preceding siblings ...)
  2023-07-28 12:13 ` [patch v2 16/38] x86/apic: Use u32 for [gs]et_apic_id() Thomas Gleixner
@ 2023-07-28 12:13 ` Thomas Gleixner
  2023-08-08 20:15   ` Steve Wahl
  2023-07-28 12:13 ` [patch v2 18/38] x86/cpu/topology: Cure the abuse of cpuinfo for persisting logical ids Thomas Gleixner
                   ` (22 subsequent siblings)
  39 siblings, 1 reply; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:13 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

APIC IDs are used with random data types u16, u32, int, unsigned int,
unsigned long.

Make it all consistently use u32 because that reflects the hardware
register width.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/hyperv/hv_vtl.c             |    2 +-
 arch/x86/include/asm/apic.h          |    8 ++++----
 arch/x86/kernel/acpi/boot.c          |    2 +-
 arch/x86/kernel/apic/apic_noop.c     |    2 +-
 arch/x86/kernel/apic/apic_numachip.c |    2 +-
 arch/x86/kernel/apic/x2apic_uv_x.c   |    2 +-
 arch/x86/kernel/sev.c                |    2 +-
 7 files changed, 10 insertions(+), 10 deletions(-)

--- a/arch/x86/hyperv/hv_vtl.c
+++ b/arch/x86/hyperv/hv_vtl.c
@@ -192,7 +192,7 @@ static int hv_vtl_apicid_to_vp_id(u32 ap
 	return ret;
 }
 
-static int hv_vtl_wakeup_secondary_cpu(int apicid, unsigned long start_eip)
+static int hv_vtl_wakeup_secondary_cpu(u32 apicid, unsigned long start_eip)
 {
 	int vp_id;
 
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -302,9 +302,9 @@ struct apic {
 	u32	(*set_apic_id)(u32 apicid);
 
 	/* wakeup_secondary_cpu */
-	int	(*wakeup_secondary_cpu)(int apicid, unsigned long start_eip);
+	int	(*wakeup_secondary_cpu)(u32 apicid, unsigned long start_eip);
 	/* wakeup secondary CPU using 64-bit wakeup point */
-	int	(*wakeup_secondary_cpu_64)(int apicid, unsigned long start_eip);
+	int	(*wakeup_secondary_cpu_64)(u32 apicid, unsigned long start_eip);
 
 	char	*name;
 };
@@ -322,8 +322,8 @@ struct apic_override {
 	void	(*send_IPI_self)(int vector);
 	u64	(*icr_read)(void);
 	void	(*icr_write)(u32 low, u32 high);
-	int	(*wakeup_secondary_cpu)(int apicid, unsigned long start_eip);
-	int	(*wakeup_secondary_cpu_64)(int apicid, unsigned long start_eip);
+	int	(*wakeup_secondary_cpu)(u32 apicid, unsigned long start_eip);
+	int	(*wakeup_secondary_cpu_64)(u32 apicid, unsigned long start_eip);
 };
 
 /*
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -358,7 +358,7 @@ acpi_parse_lapic_nmi(union acpi_subtable
 }
 
 #ifdef CONFIG_X86_64
-static int acpi_wakeup_cpu(int apicid, unsigned long start_ip)
+static int acpi_wakeup_cpu(u32 apicid, unsigned long start_ip)
 {
 	/*
 	 * Remap mailbox memory only for the first call to acpi_wakeup_cpu().
--- a/arch/x86/kernel/apic/apic_noop.c
+++ b/arch/x86/kernel/apic/apic_noop.c
@@ -27,7 +27,7 @@ static void noop_send_IPI_allbutself(int
 static void noop_send_IPI_all(int vector) { }
 static void noop_send_IPI_self(int vector) { }
 static void noop_apic_icr_write(u32 low, u32 id) { }
-static int noop_wakeup_secondary_cpu(int apicid, unsigned long start_eip) { return -1; }
+static int noop_wakeup_secondary_cpu(u32 apicid, unsigned long start_eip) { return -1; }
 static u64 noop_apic_icr_read(void) { return 0; }
 static u32 noop_phys_pkg_id(u32 cpuid_apic, int index_msb) { return 0; }
 static u32 noop_get_apic_id(u32 apicid) { return 0; }
--- a/arch/x86/kernel/apic/apic_numachip.c
+++ b/arch/x86/kernel/apic/apic_numachip.c
@@ -71,7 +71,7 @@ static void numachip2_apic_icr_write(int
 	numachip2_write32_lcsr(NUMACHIP2_APIC_ICR, (apicid << 12) | val);
 }
 
-static int numachip_wakeup_secondary(int phys_apicid, unsigned long start_rip)
+static int numachip_wakeup_secondary(u32 phys_apicid, unsigned long start_rip)
 {
 	numachip_apic_icr_write(phys_apicid, APIC_DM_INIT);
 	numachip_apic_icr_write(phys_apicid, APIC_DM_STARTUP |
--- a/arch/x86/kernel/apic/x2apic_uv_x.c
+++ b/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -702,7 +702,7 @@ static __init void build_uv_gr_table(voi
 	}
 }
 
-static int uv_wakeup_secondary(int phys_apicid, unsigned long start_rip)
+static int uv_wakeup_secondary(u32 phys_apicid, unsigned long start_rip)
 {
 	unsigned long val;
 	int pnode;
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -940,7 +940,7 @@ static void snp_cleanup_vmsa(struct sev_
 		free_page((unsigned long)vmsa);
 }
 
-static int wakeup_cpu_via_vmgexit(int apic_id, unsigned long start_ip)
+static int wakeup_cpu_via_vmgexit(u32 apic_id, unsigned long start_ip)
 {
 	struct sev_es_save_area *cur_vmsa, *vmsa;
 	struct ghcb_state state;


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 18/38] x86/cpu/topology: Cure the abuse of cpuinfo for persisting logical ids
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (16 preceding siblings ...)
  2023-07-28 12:13 ` [patch v2 17/38] x86/apic: Use u32 for wakeup_secondary_cpu[_64]() Thomas Gleixner
@ 2023-07-28 12:13 ` Thomas Gleixner
  2023-07-28 12:13 ` [patch v2 19/38] x86/cpu: Provide debug interface Thomas Gleixner
                   ` (21 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:13 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

Per CPU cpuinfo is used to persist the logical package and die IDs. That's
really not the right place simply because cpuinfo is subject to be
reinitialized when a CPU goes through an offline/online cycle.

This works by chance today, but that's far from correct and neither obvious
nor documented.

Add a per cpu datastructure which persists those logical IDs, which allows
to cleanup the CPUID evaluation code.

This is a temporary workaround until the larger topology management is in
place, which makes all of this logical management mechanics obsolete.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/smpboot.c |   33 +++++++++++++++++++++++----------
 1 file changed, 23 insertions(+), 10 deletions(-)

--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -124,7 +124,20 @@ struct mwait_cpu_dead {
  */
 static DEFINE_PER_CPU_ALIGNED(struct mwait_cpu_dead, mwait_cpu_dead);
 
-/* Logical package management. We might want to allocate that dynamically */
+/* Logical package management. */
+struct logical_maps {
+	u32	phys_pkg_id;
+	u32	phys_die_id;
+	u32	logical_pkg_id;
+	u32	logical_die_id;
+};
+
+/* Temporary workaround until the full topology mechanics is in place */
+static DEFINE_PER_CPU_READ_MOSTLY(struct logical_maps, logical_maps) = {
+	.phys_pkg_id	= U32_MAX,
+	.phys_die_id	= U32_MAX,
+};
+
 unsigned int __max_logical_packages __read_mostly;
 EXPORT_SYMBOL(__max_logical_packages);
 static unsigned int logical_packages __read_mostly;
@@ -345,10 +358,8 @@ int topology_phys_to_logical_pkg(unsigne
 	int cpu;
 
 	for_each_possible_cpu(cpu) {
-		struct cpuinfo_x86 *c = &cpu_data(cpu);
-
-		if (c->initialized && c->topo.pkg_id == phys_pkg)
-			return c->topo.logical_pkg_id;
+		if (per_cpu(logical_maps.phys_pkg_id, cpu) == phys_pkg)
+			return per_cpu(logical_maps.logical_pkg_id, cpu);
 	}
 	return -1;
 }
@@ -366,11 +377,9 @@ static int topology_phys_to_logical_die(
 	int cpu, proc_id = cpu_data(cur_cpu).topo.pkg_id;
 
 	for_each_possible_cpu(cpu) {
-		struct cpuinfo_x86 *c = &cpu_data(cpu);
-
-		if (c->initialized && c->topo.die_id == die_id &&
-		    c->topo.pkg_id == proc_id)
-			return c->topo.logical_die_id;
+		if (per_cpu(logical_maps.phys_pkg_id, cpu) == proc_id &&
+		    per_cpu(logical_maps.phys_die_id, cpu) == die_id)
+			return per_cpu(logical_maps.logical_die_id, cpu);
 	}
 	return -1;
 }
@@ -395,6 +404,8 @@ int topology_update_package_map(unsigned
 			cpu, pkg, new);
 	}
 found:
+	per_cpu(logical_maps.phys_pkg_id, cpu) = pkg;
+	per_cpu(logical_maps.logical_pkg_id, cpu) = new;
 	cpu_data(cpu).topo.logical_pkg_id = new;
 	return 0;
 }
@@ -418,6 +429,8 @@ int topology_update_die_map(unsigned int
 			cpu, die, new);
 	}
 found:
+	per_cpu(logical_maps.phys_die_id, cpu) = die;
+	per_cpu(logical_maps.logical_die_id, cpu) = new;
 	cpu_data(cpu).topo.logical_die_id = new;
 	return 0;
 }


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 19/38] x86/cpu: Provide debug interface
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (17 preceding siblings ...)
  2023-07-28 12:13 ` [patch v2 18/38] x86/cpu/topology: Cure the abuse of cpuinfo for persisting logical ids Thomas Gleixner
@ 2023-07-28 12:13 ` Thomas Gleixner
  2023-07-28 23:57   ` Sohil Mehta
  2023-07-28 12:13 ` [patch v2 20/38] x86/cpu: Provide cpuid_read() et al Thomas Gleixner
                   ` (20 subsequent siblings)
  39 siblings, 1 reply; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:13 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

Provide debug files which dump the topology related information of
cpuinfo_x86. This is useful to validate the upcoming conversion of the
topology evaluation for correctness or bug compatibility.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: Don't return ENODEV when offline and make online a field.
---
 arch/x86/kernel/cpu/Makefile  |    2 +
 arch/x86/kernel/cpu/debugfs.c |   58 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 60 insertions(+)

--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -54,6 +54,8 @@ obj-$(CONFIG_X86_LOCAL_APIC)		+= perfctr
 obj-$(CONFIG_HYPERVISOR_GUEST)		+= vmware.o hypervisor.o mshyperv.o
 obj-$(CONFIG_ACRN_GUEST)		+= acrn.o
 
+obj-$(CONFIG_DEBUG_FS)			+= debugfs.o
+
 quiet_cmd_mkcapflags = MKCAP   $@
       cmd_mkcapflags = $(CONFIG_SHELL) $(srctree)/$(src)/mkcapflags.sh $@ $^
 
--- /dev/null
+++ b/arch/x86/kernel/cpu/debugfs.c
@@ -0,0 +1,58 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/debugfs.h>
+
+#include <asm/apic.h>
+#include <asm/processor.h>
+
+static int cpu_debug_show(struct seq_file *m, void *p)
+{
+	unsigned long cpu = (unsigned long)m->private;
+	struct cpuinfo_x86 *c = per_cpu_ptr(&cpu_info, cpu);
+
+	seq_printf(m, "online:              %d\n", cpu_online(cpu));
+	if (!c->initialized)
+		return 0;
+
+	seq_printf(m, "initial_apicid:      %x\n", c->topo.initial_apicid);
+	seq_printf(m, "apicid:              %x\n", c->topo.apicid);
+	seq_printf(m, "pkg_id:              %u\n", c->topo.pkg_id);
+	seq_printf(m, "die_id:              %u\n", c->topo.die_id);
+	seq_printf(m, "cu_id:               %u\n", c->topo.cu_id);
+	seq_printf(m, "core_id:             %u\n", c->topo.core_id);
+	seq_printf(m, "logical_pkg_id:      %u\n", c->topo.logical_pkg_id);
+	seq_printf(m, "logical_die_id:      %u\n", c->topo.logical_die_id);
+	seq_printf(m, "llc_id:              %u\n", c->topo.llc_id);
+	seq_printf(m, "l2c_id:              %u\n", c->topo.l2c_id);
+	seq_printf(m, "max_cores:           %u\n", c->x86_max_cores);
+	seq_printf(m, "max_die_per_pkg:     %u\n", __max_die_per_package);
+	seq_printf(m, "smp_num_siblings:    %u\n", smp_num_siblings);
+	return 0;
+}
+
+static int cpu_debug_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, cpu_debug_show, inode->i_private);
+}
+
+static const struct file_operations dfs_cpu_ops = {
+	.open		= cpu_debug_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= single_release,
+};
+
+static __init int cpu_init_debugfs(void)
+{
+	struct dentry *dir, *base = debugfs_create_dir("topo", arch_debugfs_dir);
+	unsigned long id;
+	char name [10];
+
+	dir = debugfs_create_dir("cpus", base);
+	for_each_possible_cpu(id) {
+		sprintf(name, "%lu", id);
+		debugfs_create_file(name, 0444, dir, (void *)id, &dfs_cpu_ops);
+	}
+	return 0;
+}
+late_initcall(cpu_init_debugfs);


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 20/38] x86/cpu: Provide cpuid_read() et al.
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (18 preceding siblings ...)
  2023-07-28 12:13 ` [patch v2 19/38] x86/cpu: Provide debug interface Thomas Gleixner
@ 2023-07-28 12:13 ` Thomas Gleixner
  2023-07-28 12:13 ` [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology() Thomas Gleixner
                   ` (19 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:13 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

Provide a few helper functions to read CPUID leafs or individual registers
into a data structure without requiring unions.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/cpuid.h |   32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

--- a/arch/x86/include/asm/cpuid.h
+++ b/arch/x86/include/asm/cpuid.h
@@ -127,6 +127,38 @@ static inline unsigned int cpuid_edx(uns
 	return edx;
 }
 
+static inline void __cpuid_read(unsigned int leaf, unsigned int subleaf, u32 *regs)
+{
+	regs[CPUID_EAX] = leaf;
+	regs[CPUID_ECX] = subleaf;
+	__cpuid(regs, regs + 1, regs + 2, regs + 3);
+}
+
+#define cpuid_subleaf(leaf, subleaf, regs)		\
+	BUILD_BUG_ON(sizeof(*(regs)) != 16);		\
+	__cpuid_read(leaf, subleaf, (u32 *)(regs))
+
+#define cpuid_leaf(leaf, regs)				\
+	BUILD_BUG_ON(sizeof(*(regs)) != 16);		\
+	__cpuid_read(leaf, 0, (u32 *)(regs))
+
+static inline void __cpuid_read_reg(unsigned int leaf, unsigned int subleaf,
+				    enum cpuid_regs_idx regidx, u32 *reg)
+{
+	u32 regs[4];
+
+	__cpuid_read(leaf, subleaf, regs);
+	*reg = regs[regidx];
+}
+
+#define cpuid_subleaf_reg(leaf, subleaf, regidx, reg)		\
+	BUILD_BUG_ON(sizeof(*(reg)) != 4);			\
+	__cpuid_read_reg(leaf, subleaf, regidx, (u32 *)(reg))
+
+#define cpuid_leaf_reg(leaf, regidx, reg)			\
+	BUILD_BUG_ON(sizeof(*(reg)) != 4);			\
+	__cpuid_read_reg(leaf, 0, regidx, (u32 *)(reg))
+
 static __always_inline bool cpuid_function_is_indexed(u32 function)
 {
 	switch (function) {


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology()
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (19 preceding siblings ...)
  2023-07-28 12:13 ` [patch v2 20/38] x86/cpu: Provide cpuid_read() et al Thomas Gleixner
@ 2023-07-28 12:13 ` Thomas Gleixner
  2023-07-29  0:02   ` Sohil Mehta
                     ` (2 more replies)
  2023-07-28 12:13 ` [patch v2 22/38] x86/cpu: Add legacy topology parser Thomas Gleixner
                   ` (18 subsequent siblings)
  39 siblings, 3 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:13 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

Topology evaluation is a complete disaster and impenetrable mess. It's
scattered all over the place with some vendor implementatins doing early
evaluation and some not. The most horrific part is the permanent
overwriting of smt_max_siblings and __max_die_per_package, instead of
establishing them once on the boot CPU and validating the result on the
APs.

The goals are:

  - One topology evaluation entry point

  - Proper sharing of pointlessly duplicated code

  - Proper structuring of the evaluation logic and preferences.

  - Evaluating important system wide information only once on the boot CPU

  - Making the 0xb/0x1f leaf parsing less convoluted and actually fixing
    the short comings of leaf 0x1f evaluation.

Start to consolidate the topology evaluation code by providing the entry
points for the early boot CPU evaluation and for the final parsing on the
boot CPU and the APs.

Move the trivial pieces into that new code:

   - The initialization of cpuinfo_x86::topo

   - The evaluation of CPUID leaf 1, which presets topo::initial_apicid

   - topo_apicid is set to topo::initial_apicid when invoked from early
     boot. When invoked for the final evaluation on the boot CPU it reads
     the actual APIC ID, which makes apic_get_initial_apicid() obsolete
     once everything is converted over.

Provide a temporary helper function topo_converted() which shields off the
not yet converted CPU vendors from invoking code which would break them.
This shielding covers all vendor CPUs which support SMP, but not the
historical pure UP ones as they only need the topology info init and
eventually the initial APIC initialization.

Provide two new members in cpuinfo_x86::topo to store the maximum number of
SMT siblings and the number of dies per package and add them to the debugfs
readout. These two members will be used to populate this information on the
boot CPU and to validate the APs against it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/topology.h       |   19 +++
 arch/x86/kernel/cpu/Makefile          |    3 
 arch/x86/kernel/cpu/common.c          |   23 +---
 arch/x86/kernel/cpu/cpu.h             |    6 +
 arch/x86/kernel/cpu/debugfs.c         |   37 ++++++
 arch/x86/kernel/cpu/topology.h        |   32 +++++
 arch/x86/kernel/cpu/topology_common.c |  187 ++++++++++++++++++++++++++++++++++
 7 files changed, 290 insertions(+), 17 deletions(-)

--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -102,6 +102,25 @@ static inline void setup_node_to_cpumask
 
 #include <asm-generic/topology.h>
 
+/* Topology information */
+enum x86_topology_domains {
+	TOPO_SMT_DOMAIN,
+	TOPO_CORE_DOMAIN,
+	TOPO_MODULE_DOMAIN,
+	TOPO_TILE_DOMAIN,
+	TOPO_DIE_DOMAIN,
+	TOPO_PKG_DOMAIN,
+	TOPO_ROOT_DOMAIN,
+	TOPO_MAX_DOMAIN,
+};
+
+struct x86_topology_system {
+	unsigned int	dom_shifts[TOPO_MAX_DOMAIN];
+	unsigned int	dom_size[TOPO_MAX_DOMAIN];
+};
+
+extern struct x86_topology_system x86_topo_system;
+
 extern const struct cpumask *cpu_coregroup_mask(int cpu);
 extern const struct cpumask *cpu_clustergroup_mask(int cpu);
 
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -17,7 +17,8 @@ KMSAN_SANITIZE_common.o := n
 # As above, instrumenting secondary CPU boot code causes boot hangs.
 KCSAN_SANITIZE_common.o := n
 
-obj-y			:= cacheinfo.o scattered.o topology.o
+obj-y			:= cacheinfo.o scattered.o
+obj-y			+= topology_common.o topology.o
 obj-y			+= common.o
 obj-y			+= rdrand.o
 obj-y			+= match.o
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1553,6 +1553,8 @@ static void __init early_identify_cpu(st
 		setup_force_cpu_cap(X86_FEATURE_CPUID);
 		cpu_parse_early_param();
 
+		cpu_init_topology(c);
+
 		if (this_cpu->c_early_init)
 			this_cpu->c_early_init(c);
 
@@ -1563,6 +1565,7 @@ static void __init early_identify_cpu(st
 			this_cpu->c_bsp_init(c);
 	} else {
 		setup_clear_cpu_cap(X86_FEATURE_CPUID);
+		cpu_init_topology(c);
 	}
 
 	setup_force_cpu_cap(X86_FEATURE_ALWAYS);
@@ -1708,18 +1711,6 @@ static void generic_identify(struct cpui
 
 	get_cpu_address_sizes(c);
 
-	if (c->cpuid_level >= 0x00000001) {
-		c->topo.initial_apicid = (cpuid_ebx(1) >> 24) & 0xFF;
-#ifdef CONFIG_X86_32
-# ifdef CONFIG_SMP
-		c->topo.apicid = apic->phys_pkg_id(c->topo.initial_apicid, 0);
-# else
-		c->topo.apicid = c->topo.initial_apicid;
-# endif
-#endif
-		c->topo.pkg_id = c->topo.initial_apicid;
-	}
-
 	get_model_name(c); /* Default name */
 
 	/*
@@ -1778,9 +1769,6 @@ static void identify_cpu(struct cpuinfo_
 	c->x86_model_id[0] = '\0';  /* Unset */
 	c->x86_max_cores = 1;
 	c->x86_coreid_bits = 0;
-	c->topo.cu_id = 0xff;
-	c->topo.llc_id = BAD_APICID;
-	c->topo.l2c_id = BAD_APICID;
 #ifdef CONFIG_X86_64
 	c->x86_clflush_size = 64;
 	c->x86_phys_bits = 36;
@@ -1799,6 +1787,8 @@ static void identify_cpu(struct cpuinfo_
 
 	generic_identify(c);
 
+	cpu_parse_topology(c);
+
 	if (this_cpu->c_identify)
 		this_cpu->c_identify(c);
 
@@ -1806,7 +1796,8 @@ static void identify_cpu(struct cpuinfo_
 	apply_forced_caps(c);
 
 #ifdef CONFIG_X86_64
-	c->topo.apicid = apic->phys_pkg_id(c->topo.initial_apicid, 0);
+	if (!topo_is_converted(c))
+		c->topo.apicid = apic->phys_pkg_id(c->topo.initial_apicid, 0);
 #endif
 
 	/*
--- a/arch/x86/kernel/cpu/cpu.h
+++ b/arch/x86/kernel/cpu/cpu.h
@@ -2,6 +2,11 @@
 #ifndef ARCH_X86_CPU_H
 #define ARCH_X86_CPU_H
 
+#include <asm/cpu.h>
+#include <asm/topology.h>
+
+#include "topology.h"
+
 /* attempt to consolidate cpu attributes */
 struct cpu_dev {
 	const char	*c_vendor;
@@ -95,4 +100,5 @@ static inline bool spectre_v2_in_eibrs_m
 	       mode == SPECTRE_V2_EIBRS_RETPOLINE ||
 	       mode == SPECTRE_V2_EIBRS_LFENCE;
 }
+
 #endif /* ARCH_X86_CPU_H */
--- a/arch/x86/kernel/cpu/debugfs.c
+++ b/arch/x86/kernel/cpu/debugfs.c
@@ -5,6 +5,8 @@
 #include <asm/apic.h>
 #include <asm/processor.h>
 
+#include "cpu.h"
+
 static int cpu_debug_show(struct seq_file *m, void *p)
 {
 	unsigned long cpu = (unsigned long)m->private;
@@ -43,12 +45,47 @@ static const struct file_operations dfs_
 	.release	= single_release,
 };
 
+static int dom_debug_show(struct seq_file *m, void *p)
+{
+	static const char *domain_names[TOPO_ROOT_DOMAIN] = {
+		[TOPO_SMT_DOMAIN]	= "Thread",
+		[TOPO_CORE_DOMAIN]	= "Core",
+		[TOPO_MODULE_DOMAIN]	= "Module",
+		[TOPO_TILE_DOMAIN]	= "Tile",
+		[TOPO_DIE_DOMAIN]	= "Die",
+		[TOPO_PKG_DOMAIN]	= "Package",
+	};
+	unsigned int dom, nthreads = 1;
+
+	for (dom = 0; dom < TOPO_ROOT_DOMAIN; dom++) {
+		nthreads *= x86_topo_system.dom_size[dom];
+		seq_printf(m, "domain: %-10s shift: %u dom_size: %5u max_threads: %5u\n",
+			   domain_names[dom], x86_topo_system.dom_shifts[dom],
+			   x86_topo_system.dom_size[dom], nthreads);
+	}
+	return 0;
+}
+
+static int dom_debug_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, dom_debug_show, inode->i_private);
+}
+
+static const struct file_operations dfs_dom_ops = {
+	.open		= dom_debug_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= single_release,
+};
+
 static __init int cpu_init_debugfs(void)
 {
 	struct dentry *dir, *base = debugfs_create_dir("topo", arch_debugfs_dir);
 	unsigned long id;
 	char name [10];
 
+	debugfs_create_file("domains", 0444, base, NULL, &dfs_dom_ops);
+
 	dir = debugfs_create_dir("cpus", base);
 	for_each_possible_cpu(id) {
 		sprintf(name, "%lu", id);
--- /dev/null
+++ b/arch/x86/kernel/cpu/topology.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef ARCH_X86_TOPOLOGY_H
+#define ARCH_X86_TOPOLOGY_H
+
+struct topo_scan {
+	struct cpuinfo_x86	*c;
+	unsigned int		dom_shifts[TOPO_MAX_DOMAIN];
+	unsigned int		dom_ncpus[TOPO_MAX_DOMAIN];
+
+};
+
+bool topo_is_converted(struct cpuinfo_x86 *c);
+void cpu_init_topology(struct cpuinfo_x86 *c);
+void cpu_parse_topology(struct cpuinfo_x86 *c);
+void topology_set_dom(struct topo_scan *tscan, enum x86_topology_domains dom,
+		      unsigned int shift, unsigned int ncpus);
+
+static inline u32 topo_shift_apicid(u32 apicid, enum x86_topology_domains dom)
+{
+	if (dom == TOPO_SMT_DOMAIN)
+		return apicid;
+	return apicid >> x86_topo_system.dom_shifts[dom - 1];
+}
+
+static inline u32 topo_relative_domain_id(u32 apicid, enum x86_topology_domains dom)
+{
+	if (dom != TOPO_SMT_DOMAIN)
+		apicid >>= x86_topo_system.dom_shifts[dom - 1];
+	return apicid & (x86_topo_system.dom_size[dom] - 1);
+}
+
+#endif /* ARCH_X86_TOPOLOGY_H */
--- /dev/null
+++ b/arch/x86/kernel/cpu/topology_common.c
@@ -0,0 +1,187 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/cpu.h>
+
+#include <xen/xen.h>
+
+#include <asm/apic.h>
+#include <asm/processor.h>
+#include <asm/smp.h>
+
+#include "cpu.h"
+
+struct x86_topology_system x86_topo_system __ro_after_init;
+
+void topology_set_dom(struct topo_scan *tscan, enum x86_topology_domains dom,
+		      unsigned int shift, unsigned int ncpus)
+{
+	tscan->dom_shifts[dom] = shift;
+	tscan->dom_ncpus[dom] = ncpus;
+
+	/* Propagate to the upper levels */
+	for (dom++; dom < TOPO_MAX_DOMAIN; dom++) {
+		tscan->dom_shifts[dom] = tscan->dom_shifts[dom - 1];
+		tscan->dom_ncpus[dom] = tscan->dom_ncpus[dom - 1];
+	}
+}
+
+bool topo_is_converted(struct cpuinfo_x86 *c)
+{
+	/* Temporary until everything is converted over. */
+	switch (boot_cpu_data.x86_vendor) {
+	case X86_VENDOR_AMD:
+	case X86_VENDOR_CENTAUR:
+	case X86_VENDOR_INTEL:
+	case X86_VENDOR_HYGON:
+	case X86_VENDOR_ZHAOXIN:
+		return false;
+	default:
+		/* Let all UP systems use the below */
+		return true;
+	}
+}
+
+static bool fake_topology(struct topo_scan *tscan)
+{
+	/*
+	 * Preset the CORE level shift for CPUID less systems and XEN_PV,
+	 * which has useless CPUID information.
+	 */
+	topology_set_dom(tscan, TOPO_SMT_DOMAIN, 0, 1);
+	topology_set_dom(tscan, TOPO_CORE_DOMAIN, 1, 1);
+
+	return tscan->c->cpuid_level < 1 || xen_pv_domain();
+}
+
+static void parse_topology(struct topo_scan *tscan, bool early)
+{
+	const struct cpuinfo_topology topo_defaults = {
+		.cu_id			= 0xff,
+		.llc_id			= BAD_APICID,
+		.l2c_id			= BAD_APICID,
+	};
+	struct cpuinfo_x86 *c = tscan->c;
+	struct {
+		u32	unused0		: 16,
+			nproc		:  8,
+			apicid		:  8;
+	} ebx;
+
+	c->topo = topo_defaults;
+
+	if (fake_topology(tscan))
+	    return;
+
+	/* Preset Initial APIC ID from CPUID leaf 1 */
+	cpuid_leaf_reg(1, CPUID_EBX, &ebx);
+	c->topo.initial_apicid = ebx.apicid;
+
+	/*
+	 * The initial invocation from early_identify_cpu() happens before
+	 * the APIC is mapped or X2APIC enabled. For establishing the
+	 * topology, that's not required. Use the initial APIC ID.
+	 */
+	if (early)
+		c->topo.apicid = c->topo.initial_apicid;
+	else
+		c->topo.apicid = read_apic_id();
+
+	/* The above is sufficient for UP */
+	if (!IS_ENABLED(CONFIG_SMP))
+		return;
+}
+
+static void topo_set_ids(struct topo_scan *tscan)
+{
+	struct cpuinfo_x86 *c = tscan->c;
+	u32 apicid = c->topo.apicid;
+
+	c->topo.pkg_id = topo_shift_apicid(apicid, TOPO_ROOT_DOMAIN);
+	c->topo.die_id = topo_shift_apicid(apicid, TOPO_DIE_DOMAIN);
+
+	/* Relative core ID */
+	c->topo.core_id = topo_relative_domain_id(apicid, TOPO_CORE_DOMAIN);
+}
+
+static void topo_set_max_cores(struct topo_scan *tscan)
+{
+	/*
+	 * Bug compatible for now. This is broken on hybrid systems:
+	 * 8 cores SMT + 8 cores w/o SMT
+	 * tscan.dom_ncpus[TOPO_CORE_DOMAIN] = 24; 24 / 2 = 12 !!
+	 *
+	 * Cannot be fixed without further topology enumeration changes.
+	 */
+	tscan->c->x86_max_cores = tscan->dom_ncpus[TOPO_CORE_DOMAIN] >>
+		x86_topo_system.dom_shifts[TOPO_SMT_DOMAIN];
+}
+
+void cpu_parse_topology(struct cpuinfo_x86 *c)
+{
+	unsigned int dom, cpu = smp_processor_id();
+	struct topo_scan tscan = { .c = c, };
+
+	parse_topology(&tscan, false);
+
+	if (!topo_is_converted(c))
+		return;
+
+	for (dom = TOPO_SMT_DOMAIN; dom < TOPO_MAX_DOMAIN; dom++) {
+		if (tscan.dom_shifts[dom] == x86_topo_system.dom_shifts[dom])
+			continue;
+		pr_err(FW_BUG "CPU%d: Topology domain %u shift %u != %u\n", cpu, dom,
+		       tscan.dom_shifts[dom], x86_topo_system.dom_shifts[dom]);
+	}
+
+	/* Bug compatible with the existing parsers */
+	if (tscan.dom_ncpus[TOPO_SMT_DOMAIN] > smp_num_siblings) {
+		if (system_state == SYSTEM_BOOTING) {
+			pr_warn_once("CPU%d: SMT detected and enabled late\n", cpu);
+			smp_num_siblings = tscan.dom_ncpus[TOPO_SMT_DOMAIN];
+		} else {
+			pr_warn_once("CPU%d: SMT detected after init. Too late!\n", cpu);
+		}
+	}
+
+	topo_set_ids(&tscan);
+	topo_set_max_cores(&tscan);
+}
+
+void __init cpu_init_topology(struct cpuinfo_x86 *c)
+{
+	struct topo_scan tscan = { .c = c, };
+	unsigned int dom, sft;
+
+	parse_topology(&tscan, true);
+
+	if (!topo_is_converted(c))
+		return;
+
+	/* Copy the shift values and calculate the unit sizes. */
+	memcpy(x86_topo_system.dom_shifts, tscan.dom_shifts, sizeof(x86_topo_system.dom_shifts));
+
+	dom = TOPO_SMT_DOMAIN;
+	x86_topo_system.dom_size[dom] = 1U << x86_topo_system.dom_shifts[dom];
+
+	for (dom++; dom < TOPO_MAX_DOMAIN; dom++) {
+		sft = x86_topo_system.dom_shifts[dom] - x86_topo_system.dom_shifts[dom - 1];
+		x86_topo_system.dom_size[dom] = 1U << sft;
+	}
+
+	topo_set_ids(&tscan);
+	topo_set_max_cores(&tscan);
+
+	/*
+	 * Bug compatible with the existing code. If the boot CPU does not
+	 * have SMT this ends up with one sibling. This needs way deeper
+	 * changes further down the road to get it right during early boot.
+	 */
+	smp_num_siblings = tscan.dom_ncpus[TOPO_SMT_DOMAIN];
+
+	/*
+	 * Neither it's clear whether there are as many dies as the APIC
+	 * space indicating die level is. But assume that the actual number
+	 * of CPUs gives a proper indication for now to stay bug compatible.
+	 */
+	__max_die_per_package = tscan.dom_ncpus[TOPO_DIE_DOMAIN] /
+		tscan.dom_ncpus[TOPO_DIE_DOMAIN - 1];
+}


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 22/38] x86/cpu: Add legacy topology parser
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (20 preceding siblings ...)
  2023-07-28 12:13 ` [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology() Thomas Gleixner
@ 2023-07-28 12:13 ` Thomas Gleixner
  2023-07-28 21:33   ` [patch v2A " Thomas Gleixner
  2023-07-28 12:13 ` [patch v2 23/38] x86/cpu: Use common topology code for Centaur and Zhaoxin Thomas Gleixner
                   ` (17 subsequent siblings)
  39 siblings, 1 reply; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:13 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

The legacy topology detection via CPUID leaf 4, which provides the number
of cores in the package and CPUID leaf 1 which provides the number of
logical CPUs in case that FEATURE_HT is enabled and the CMP_LEGACY feature
is not set, is shared for Intel, Centaur amd Zhaoxin CPUs.

Lift the code from common.c without the early detection hack and provide it
as common fallback mechanism.

Will be utilized in later changes.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/common.c          |    3 ++
 arch/x86/kernel/cpu/topology.h        |    2 +
 arch/x86/kernel/cpu/topology_common.c |   37 ++++++++++++++++++++++++++++++++++
 3 files changed, 42 insertions(+)

--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -883,6 +883,9 @@ void detect_ht(struct cpuinfo_x86 *c)
 #ifdef CONFIG_SMP
 	int index_msb, core_bits;
 
+	if (topo_is_converted(c))
+		return;
+
 	if (detect_ht_early(c) < 0)
 		return;
 
--- a/arch/x86/kernel/cpu/topology.h
+++ b/arch/x86/kernel/cpu/topology.h
@@ -7,6 +7,8 @@ struct topo_scan {
 	unsigned int		dom_shifts[TOPO_MAX_DOMAIN];
 	unsigned int		dom_ncpus[TOPO_MAX_DOMAIN];
 
+	// Legacy CPUID[1]:EBX[23:16] number of logical processors
+	unsigned int		ebx1_nproc_shift;
 };
 
 bool topo_is_converted(struct cpuinfo_x86 *c);
--- a/arch/x86/kernel/cpu/topology_common.c
+++ b/arch/x86/kernel/cpu/topology_common.c
@@ -24,6 +24,41 @@ void topology_set_dom(struct topo_scan *
 	}
 }
 
+static unsigned int parse_num_cores(struct cpuinfo_x86 *c)
+{
+	struct {
+		u32	cache_type	:  5,
+			unused		: 21,
+			ncores		:  6;
+	} eax;
+
+	if (c->cpuid_level < 4)
+		return 1;
+
+	cpuid_subleaf_reg(4, 0, CPUID_EAX, &eax);
+	if (!eax.cache_type)
+		return 1;
+
+	return eax.ncores + 1;
+}
+
+static void __maybe_unused parse_legacy(struct topo_scan *tscan)
+{
+	unsigned int cores, core_shift, smt_shift = 0;
+	struct cpuinfo_x86 *c = tscan->c;
+
+	cores = parse_num_cores(c);
+	core_shift = get_count_order(cores);
+
+	if (cpu_has(c, X86_FEATURE_HT)) {
+		if (!WARN_ON_ONCE(tscan->ebx1_nproc_shift < core_shift))
+			smt_shift = tscan->ebx1_nproc_shift - core_shift;
+	}
+
+	topology_set_dom(tscan, TOPO_SMT_DOMAIN, smt_shift, 1U << smt_shift);
+	topology_set_dom(tscan, TOPO_CORE_DOMAIN, core_shift, cores >> smt_shift);
+}
+
 bool topo_is_converted(struct cpuinfo_x86 *c)
 {
 	/* Temporary until everything is converted over. */
@@ -88,6 +123,8 @@ static void parse_topology(struct topo_s
 	/* The above is sufficient for UP */
 	if (!IS_ENABLED(CONFIG_SMP))
 		return;
+
+	tscan->ebx1_nproc_shift = get_count_order(ebx.nproc);
 }
 
 static void topo_set_ids(struct topo_scan *tscan)


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 23/38] x86/cpu: Use common topology code for Centaur and Zhaoxin
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (21 preceding siblings ...)
  2023-07-28 12:13 ` [patch v2 22/38] x86/cpu: Add legacy topology parser Thomas Gleixner
@ 2023-07-28 12:13 ` Thomas Gleixner
  2023-07-28 12:13 ` [patch v2 24/38] x86/cpu: Move __max_die_per_package to common.c Thomas Gleixner
                   ` (16 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:13 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

Centaur and Zhaoxin CPUs use only the legacy SMP detection. Remove the
invocations from their 32bit path and exempt them from the call 64bit.

No functional change intended.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/centaur.c         |    4 ----
 arch/x86/kernel/cpu/topology_common.c |   11 ++++++++---
 arch/x86/kernel/cpu/zhaoxin.c         |    4 ----
 3 files changed, 8 insertions(+), 11 deletions(-)

--- a/arch/x86/kernel/cpu/centaur.c
+++ b/arch/x86/kernel/cpu/centaur.c
@@ -128,10 +128,6 @@ static void init_centaur(struct cpuinfo_
 #endif
 	early_init_centaur(c);
 	init_intel_cacheinfo(c);
-	detect_num_cpu_cores(c);
-#ifdef CONFIG_X86_32
-	detect_ht(c);
-#endif
 
 	if (c->cpuid_level > 9) {
 		unsigned int eax = cpuid_eax(10);
--- a/arch/x86/kernel/cpu/topology_common.c
+++ b/arch/x86/kernel/cpu/topology_common.c
@@ -42,7 +42,7 @@ static unsigned int parse_num_cores(stru
 	return eax.ncores + 1;
 }
 
-static void __maybe_unused parse_legacy(struct topo_scan *tscan)
+static void parse_legacy(struct topo_scan *tscan)
 {
 	unsigned int cores, core_shift, smt_shift = 0;
 	struct cpuinfo_x86 *c = tscan->c;
@@ -64,10 +64,8 @@ bool topo_is_converted(struct cpuinfo_x8
 	/* Temporary until everything is converted over. */
 	switch (boot_cpu_data.x86_vendor) {
 	case X86_VENDOR_AMD:
-	case X86_VENDOR_CENTAUR:
 	case X86_VENDOR_INTEL:
 	case X86_VENDOR_HYGON:
-	case X86_VENDOR_ZHAOXIN:
 		return false;
 	default:
 		/* Let all UP systems use the below */
@@ -125,6 +123,13 @@ static void parse_topology(struct topo_s
 		return;
 
 	tscan->ebx1_nproc_shift = get_count_order(ebx.nproc);
+
+	switch (c->x86_vendor) {
+	case X86_VENDOR_CENTAUR:
+	case X86_VENDOR_ZHAOXIN:
+		parse_legacy(tscan);
+		break;
+	}
 }
 
 static void topo_set_ids(struct topo_scan *tscan)
--- a/arch/x86/kernel/cpu/zhaoxin.c
+++ b/arch/x86/kernel/cpu/zhaoxin.c
@@ -71,10 +71,6 @@ static void init_zhaoxin(struct cpuinfo_
 {
 	early_init_zhaoxin(c);
 	init_intel_cacheinfo(c);
-	detect_num_cpu_cores(c);
-#ifdef CONFIG_X86_32
-	detect_ht(c);
-#endif
 
 	if (c->cpuid_level > 9) {
 		unsigned int eax = cpuid_eax(10);


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 24/38] x86/cpu: Move __max_die_per_package to common.c
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (22 preceding siblings ...)
  2023-07-28 12:13 ` [patch v2 23/38] x86/cpu: Use common topology code for Centaur and Zhaoxin Thomas Gleixner
@ 2023-07-28 12:13 ` Thomas Gleixner
  2023-07-28 12:13 ` [patch v2 25/38] x86/cpu: Provide a sane leaf 0xb/0x1f parser Thomas Gleixner
                   ` (15 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:13 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

In preparation of a complete replacement for the topology leaf 0xb/0x1f
evaluation, move __max_die_per_package into the common code.

Will be removed once everything is converted over.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/common.c   |    3 +++
 arch/x86/kernel/cpu/topology.c |    3 ---
 2 files changed, 3 insertions(+), 3 deletions(-)

--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -75,6 +75,9 @@ u32 elf_hwcap2 __read_mostly;
 int smp_num_siblings = 1;
 EXPORT_SYMBOL(smp_num_siblings);
 
+unsigned int __max_die_per_package __read_mostly = 1;
+EXPORT_SYMBOL(__max_die_per_package);
+
 static struct ppin_info {
 	int	feature;
 	int	msr_ppin_ctl;
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -25,9 +25,6 @@
 #define BITS_SHIFT_NEXT_LEVEL(eax)	((eax) & 0x1f)
 #define LEVEL_MAX_SIBLINGS(ebx)		((ebx) & 0xffff)
 
-unsigned int __max_die_per_package __read_mostly = 1;
-EXPORT_SYMBOL(__max_die_per_package);
-
 #ifdef CONFIG_SMP
 /*
  * Check if given CPUID extended topology "leaf" is implemented


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 25/38] x86/cpu: Provide a sane leaf 0xb/0x1f parser
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (23 preceding siblings ...)
  2023-07-28 12:13 ` [patch v2 24/38] x86/cpu: Move __max_die_per_package to common.c Thomas Gleixner
@ 2023-07-28 12:13 ` Thomas Gleixner
  2023-07-28 12:13 ` [patch v2 26/38] x86/cpu: Use common topology code for Intel Thomas Gleixner
                   ` (14 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:13 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

detect_extended_topology() along with it's early() variant is a classic
example for duct tape engineering:

  - It evaluates an array of subleafs with a boatload of local variables
    for the relevant topology levels instead of using an array to save the
    enumerated information and propagate it to the right level

  - It has no boundary checks for subleafs

  - It prevents updating the die_id with a crude workaround instead of
    checking for leaf 0xb which does not provide die information.

  - It's broken vs. the number of dies evaluation as it uses:

      num_processors[DIE_LEVEL] / num_processors[CORE_LEVEL]

    which "works" only correctly if there is none of the intermediate
    topology levels (MODULE/TILE) enumerated.

There is zero value in trying to "fix" that code as the only proper fix is
to rewrite it from scratch.

Implement a sane parser with proper code documentation, which will be used
for the consolidated topology evaluation in the next step.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: Fixed up the comment alignment for registers - Peterz
---
 arch/x86/kernel/cpu/Makefile       |    2 
 arch/x86/kernel/cpu/topology.h     |   12 +++
 arch/x86/kernel/cpu/topology_ext.c |  136 +++++++++++++++++++++++++++++++++++++
 3 files changed, 149 insertions(+), 1 deletion(-)

--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -18,7 +18,7 @@ KMSAN_SANITIZE_common.o := n
 KCSAN_SANITIZE_common.o := n
 
 obj-y			:= cacheinfo.o scattered.o
-obj-y			+= topology_common.o topology.o
+obj-y			+= topology_common.o topology_ext.o topology.o
 obj-y			+= common.o
 obj-y			+= rdrand.o
 obj-y			+= match.o
--- a/arch/x86/kernel/cpu/topology.h
+++ b/arch/x86/kernel/cpu/topology.h
@@ -16,6 +16,7 @@ void cpu_init_topology(struct cpuinfo_x8
 void cpu_parse_topology(struct cpuinfo_x86 *c);
 void topology_set_dom(struct topo_scan *tscan, enum x86_topology_domains dom,
 		      unsigned int shift, unsigned int ncpus);
+bool cpu_parse_topology_ext(struct topo_scan *tscan);
 
 static inline u32 topo_shift_apicid(u32 apicid, enum x86_topology_domains dom)
 {
@@ -31,4 +32,15 @@ static inline u32 topo_relative_domain_i
 	return apicid & (x86_topo_system.dom_size[dom] - 1);
 }
 
+/*
+ * Update a domain level after the fact without propagating. Used to fixup
+ * broken CPUID enumerations.
+ */
+static inline void topology_update_dom(struct topo_scan *tscan, enum x86_topology_domains dom,
+				       unsigned int shift, unsigned int ncpus)
+{
+	tscan->dom_shifts[dom] = shift;
+	tscan->dom_ncpus[dom] = ncpus;
+}
+
 #endif /* ARCH_X86_TOPOLOGY_H */
--- /dev/null
+++ b/arch/x86/kernel/cpu/topology_ext.c
@@ -0,0 +1,136 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/cpu.h>
+
+#include <asm/apic.h>
+#include <asm/memtype.h>
+#include <asm/processor.h>
+
+#include "cpu.h"
+
+enum topo_types {
+	INVALID_TYPE	= 0,
+	SMT_TYPE	= 1,
+	CORE_TYPE	= 2,
+	MODULE_TYPE	= 3,
+	TILE_TYPE	= 4,
+	DIE_TYPE	= 5,
+	DIEGRP_TYPE	= 6,
+	MAX_TYPE	= 7,
+};
+
+/*
+ * Use a lookup table for the case that there are future types > 6 which
+ * describe an intermediate domain level which does not exist today.
+ *
+ * A table will also be handy to parse the new AMD 0x80000026 leaf which
+ * has defined different domain types, but otherwise uses the same layout
+ * with some of the reserved bits used for new information.
+ */
+static const unsigned int topo_domain_map[MAX_TYPE] = {
+	[SMT_TYPE]	= TOPO_SMT_DOMAIN,
+	[CORE_TYPE]	= TOPO_CORE_DOMAIN,
+	[MODULE_TYPE]	= TOPO_MODULE_DOMAIN,
+	[TILE_TYPE]	= TOPO_TILE_DOMAIN,
+	[DIE_TYPE]	= TOPO_DIE_DOMAIN,
+	[DIEGRP_TYPE]	= TOPO_PKG_DOMAIN,
+};
+
+static inline bool topo_subleaf(struct topo_scan *tscan, u32 leaf, u32 subleaf)
+{
+	unsigned int dom, maxtype = leaf == 0xb ? CORE_TYPE + 1 : MAX_TYPE;
+	struct {
+		// eax
+		u32	x2apic_shift	:  5, // Number of bits to shift APIC ID right
+					      // for the topology ID at the next level
+			__rsvd0		: 27; // Reserved
+		// ebx
+		u32	num_processors	: 16, // Number of processors at current level
+			__rsvd1		: 16; // Reserved
+		// ecx
+		u32	level		:  8, // Current topology level. Same as sub leaf number
+			type		:  8, // Level type. If 0, invalid
+			__rsvd2		: 16; // Reserved
+		// edx
+		u32	x2apic_id	: 32; // X2APIC ID of the current logical processor
+	} sl;
+
+	cpuid_subleaf(leaf, subleaf, &sl);
+
+	if (!sl.num_processors || sl.type == INVALID_TYPE)
+		return false;
+
+	if (sl.type >= maxtype) {
+		/*
+		 * As the subleafs are ordered in domain level order, this
+		 * could be recovered in theory by propagating the
+		 * information at the last parsed level.
+		 *
+		 * But if the infinite wisdom of hardware folks decides to
+		 * create a new domain type between CORE and MODULE or DIE
+		 * and DIEGRP, then that would overwrite the CORE or DIE
+		 * information.
+		 *
+		 * It really would have been too obvious to make the domain
+		 * type space sparse and leave a few reserved types between
+		 * the points which might change instead of forcing
+		 * software to either create a monstrosity of workarounds
+		 * or just being up the creek without a paddle.
+		 *
+		 * Refuse to implement monstrosity, emit an error and try
+		 * to survive.
+		 */
+		pr_err_once("Topology: leaf 0x%x:%d Unknown domain type %u\n",
+			    leaf, subleaf, sl.type);
+		return true;
+	}
+
+	dom = topo_domain_map[sl.type];
+	if (!dom) {
+		tscan->c->topo.initial_apicid = sl.x2apic_id;
+	} else if (tscan->c->topo.initial_apicid != sl.x2apic_id) {
+		pr_warn_once(FW_BUG "CPUID leaf 0x%x subleaf %d APIC ID mismatch %x != %x\n",
+			     leaf, subleaf, tscan->c->topo.initial_apicid, sl.x2apic_id);
+	}
+
+	topology_set_dom(tscan, dom, sl.x2apic_shift, sl.num_processors);
+	return true;
+}
+
+static bool parse_topology_leaf(struct topo_scan *tscan, u32 leaf)
+{
+	u32 subleaf;
+
+	if (tscan->c->cpuid_level < leaf)
+		return false;
+
+	/* Read all available subleafs and populate the levels */
+	for (subleaf = 0; topo_subleaf(tscan, leaf, subleaf); subleaf++);
+
+	/* If subleaf 0 failed to parse, give up */
+	if (!subleaf)
+		return false;
+
+	/*
+	 * There are machines in the wild which have shift 0 in the subleaf
+	 * 0, but advertise 2 logical processors at that level. They are
+	 * truly SMT.
+	 */
+	if (!tscan->dom_shifts[TOPO_SMT_DOMAIN] && tscan->dom_ncpus[TOPO_SMT_DOMAIN] > 1) {
+		unsigned int sft = get_count_order(tscan->dom_ncpus[TOPO_SMT_DOMAIN]);
+
+		pr_warn_once(FW_BUG "CPUID leaf 0x%x subleaf 0 has shift level 0 but %u CPUs\n",
+			     leaf, tscan->dom_ncpus[TOPO_SMT_DOMAIN]);
+		topology_update_dom(tscan, TOPO_SMT_DOMAIN, sft, tscan->dom_ncpus[TOPO_SMT_DOMAIN]);
+	}
+
+	set_cpu_cap(tscan->c, X86_FEATURE_XTOPOLOGY);
+	return true;
+}
+
+bool cpu_parse_topology_ext(struct topo_scan *tscan)
+{
+	/* Try lead 0x1F first. If not available try leaf 0x0b */
+	if (parse_topology_leaf(tscan, 0x1f))
+		return true;
+	return parse_topology_leaf(tscan, 0x0b);
+}


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 26/38] x86/cpu: Use common topology code for Intel
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (24 preceding siblings ...)
  2023-07-28 12:13 ` [patch v2 25/38] x86/cpu: Provide a sane leaf 0xb/0x1f parser Thomas Gleixner
@ 2023-07-28 12:13 ` Thomas Gleixner
  2023-07-28 12:13 ` [patch v2 27/38] x86/cpu/amd: Provide a separate acessor for Node ID Thomas Gleixner
                   ` (13 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:13 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

Intel CPUs use either topology leaf 0xb/0x1f evaluation or the legacy
SMP/HT evaluation based on CPUID leaf 0x1/0x4.

Move it over to the consolidated topology code and remove the random
topology hacks which are sprinkled into the Intel and the common code.

No functional change intended.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/common.c          |   65 ----------------------------------
 arch/x86/kernel/cpu/cpu.h             |    4 --
 arch/x86/kernel/cpu/intel.c           |   25 -------------
 arch/x86/kernel/cpu/topology_common.c |    5 ++
 4 files changed, 4 insertions(+), 95 deletions(-)

--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -784,19 +784,6 @@ static void get_model_name(struct cpuinf
 	*(s + 1) = '\0';
 }
 
-void detect_num_cpu_cores(struct cpuinfo_x86 *c)
-{
-	unsigned int eax, ebx, ecx, edx;
-
-	c->x86_max_cores = 1;
-	if (!IS_ENABLED(CONFIG_SMP) || c->cpuid_level < 4)
-		return;
-
-	cpuid_count(4, 0, &eax, &ebx, &ecx, &edx);
-	if (eax & 0x1f)
-		c->x86_max_cores = (eax >> 26) + 1;
-}
-
 void cpu_detect_cache_sizes(struct cpuinfo_x86 *c)
 {
 	unsigned int n, dummy, ebx, ecx, edx, l2size;
@@ -858,54 +845,6 @@ static void cpu_detect_tlb(struct cpuinf
 		tlb_lld_4m[ENTRIES], tlb_lld_1g[ENTRIES]);
 }
 
-int detect_ht_early(struct cpuinfo_x86 *c)
-{
-#ifdef CONFIG_SMP
-	u32 eax, ebx, ecx, edx;
-
-	if (!cpu_has(c, X86_FEATURE_HT))
-		return -1;
-
-	if (cpu_has(c, X86_FEATURE_CMP_LEGACY))
-		return -1;
-
-	if (cpu_has(c, X86_FEATURE_XTOPOLOGY))
-		return -1;
-
-	cpuid(1, &eax, &ebx, &ecx, &edx);
-
-	smp_num_siblings = (ebx & 0xff0000) >> 16;
-	if (smp_num_siblings == 1)
-		pr_info_once("CPU0: Hyper-Threading is disabled\n");
-#endif
-	return 0;
-}
-
-void detect_ht(struct cpuinfo_x86 *c)
-{
-#ifdef CONFIG_SMP
-	int index_msb, core_bits;
-
-	if (topo_is_converted(c))
-		return;
-
-	if (detect_ht_early(c) < 0)
-		return;
-
-	index_msb = get_count_order(smp_num_siblings);
-	c->topo.pkg_id = apic->phys_pkg_id(c->topo.initial_apicid, index_msb);
-
-	smp_num_siblings = smp_num_siblings / c->x86_max_cores;
-
-	index_msb = get_count_order(smp_num_siblings);
-
-	core_bits = get_count_order(c->x86_max_cores);
-
-	c->topo.core_id = apic->phys_pkg_id(c->topo.initial_apicid, index_msb) &
-		((1 << core_bits) - 1);
-#endif
-}
-
 static void get_cpu_vendor(struct cpuinfo_x86 *c)
 {
 	char *v = c->x86_vendor_id;
@@ -1853,10 +1792,6 @@ static void identify_cpu(struct cpuinfo_
 				c->x86, c->x86_model);
 	}
 
-#ifdef CONFIG_X86_64
-	detect_ht(c);
-#endif
-
 	x86_init_rdrand(c);
 	setup_pku(c);
 	setup_cet(c);
--- a/arch/x86/kernel/cpu/cpu.h
+++ b/arch/x86/kernel/cpu/cpu.h
@@ -76,11 +76,7 @@ extern void init_intel_cacheinfo(struct
 extern void init_amd_cacheinfo(struct cpuinfo_x86 *c);
 extern void init_hygon_cacheinfo(struct cpuinfo_x86 *c);
 
-extern void detect_num_cpu_cores(struct cpuinfo_x86 *c);
-extern int detect_extended_topology_early(struct cpuinfo_x86 *c);
 extern int detect_extended_topology(struct cpuinfo_x86 *c);
-extern int detect_ht_early(struct cpuinfo_x86 *c);
-extern void detect_ht(struct cpuinfo_x86 *c);
 extern void check_null_seg_clears_base(struct cpuinfo_x86 *c);
 
 void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c);
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -489,13 +489,6 @@ static void early_init_intel(struct cpui
 	}
 
 	check_memory_type_self_snoop_errata(c);
-
-	/*
-	 * Get the number of SMT siblings early from the extended topology
-	 * leaf, if available. Otherwise try the legacy SMT detection.
-	 */
-	if (detect_extended_topology_early(c) < 0)
-		detect_ht_early(c);
 }
 
 static void bsp_init_intel(struct cpuinfo_x86 *c)
@@ -777,24 +770,6 @@ static void init_intel(struct cpuinfo_x8
 
 	intel_workarounds(c);
 
-	/*
-	 * Detect the extended topology information if available. This
-	 * will reinitialise the initial_apicid which will be used
-	 * in init_intel_cacheinfo()
-	 */
-	detect_extended_topology(c);
-
-	if (!cpu_has(c, X86_FEATURE_XTOPOLOGY)) {
-		/*
-		 * let's use the legacy cpuid vector 0x1 and 0x4 for topology
-		 * detection.
-		 */
-		detect_num_cpu_cores(c);
-#ifdef CONFIG_X86_32
-		detect_ht(c);
-#endif
-	}
-
 	init_intel_cacheinfo(c);
 
 	if (c->cpuid_level > 9) {
--- a/arch/x86/kernel/cpu/topology_common.c
+++ b/arch/x86/kernel/cpu/topology_common.c
@@ -64,7 +64,6 @@ bool topo_is_converted(struct cpuinfo_x8
 	/* Temporary until everything is converted over. */
 	switch (boot_cpu_data.x86_vendor) {
 	case X86_VENDOR_AMD:
-	case X86_VENDOR_INTEL:
 	case X86_VENDOR_HYGON:
 		return false;
 	default:
@@ -129,6 +128,10 @@ static void parse_topology(struct topo_s
 	case X86_VENDOR_ZHAOXIN:
 		parse_legacy(tscan);
 		break;
+	case X86_VENDOR_INTEL:
+		if (!IS_ENABLED(CONFIG_CPU_SUP_INTEL) || !cpu_parse_topology_ext(tscan))
+			parse_legacy(tscan);
+		break;
 	}
 }
 


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 27/38] x86/cpu/amd: Provide a separate acessor for Node ID
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (25 preceding siblings ...)
  2023-07-28 12:13 ` [patch v2 26/38] x86/cpu: Use common topology code for Intel Thomas Gleixner
@ 2023-07-28 12:13 ` Thomas Gleixner
  2023-07-29  0:07   ` Sohil Mehta
  2023-07-28 12:13 ` [patch v2 28/38] x86/cpu: Provide an AMD/HYGON specific topology parser Thomas Gleixner
                   ` (12 subsequent siblings)
  39 siblings, 1 reply; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:13 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

AMD (ab)uses topology_die_id() to store the Node ID information and
topology_max_dies_per_pkg to store the number of nodes per package.

This collides with the proper processor die level enumeration which is
coming on AMD with CPUID 8000_0026, unless there is a correlation between
the two. There is zero documentation about that.

So provide new storage and new accessors which for now still access die_id
and topology_max_dies_per_pkg. Will be mopped up after AMD and HYGON are
converted over.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/events/amd/core.c       |    2 +-
 arch/x86/include/asm/processor.h |    3 +++
 arch/x86/include/asm/topology.h  |    8 ++++++++
 arch/x86/kernel/amd_nb.c         |    4 ++--
 arch/x86/kernel/cpu/cacheinfo.c  |    2 +-
 arch/x86/kernel/cpu/mce/amd.c    |    4 ++--
 arch/x86/kernel/cpu/mce/inject.c |    4 ++--
 drivers/edac/amd64_edac.c        |    4 ++--
 drivers/edac/mce_amd.c           |    4 ++--
 9 files changed, 23 insertions(+), 12 deletions(-)

--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -574,7 +574,7 @@ static void amd_pmu_cpu_starting(int cpu
 	if (!x86_pmu.amd_nb_constraints)
 		return;
 
-	nb_id = topology_die_id(cpu);
+	nb_id = topology_amd_node_id(cpu);
 	WARN_ON_ONCE(nb_id == BAD_APICID);
 
 	for_each_online_cpu(i) {
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -99,6 +99,9 @@ struct cpuinfo_topology {
 	u32			logical_pkg_id;
 	u32			logical_die_id;
 
+	// AMD Node ID and Nodes per Package info
+	u32			amd_node_id;
+
 	// Cache level topology IDs
 	u32			llc_id;
 	u32			l2c_id;
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -131,6 +131,8 @@ extern const struct cpumask *cpu_cluster
 #define topology_core_id(cpu)			(cpu_data(cpu).topo.core_id)
 #define topology_ppin(cpu)			(cpu_data(cpu).ppin)
 
+#define topology_amd_node_id(cpu)		(cpu_data(cpu).topo.die_id)
+
 extern unsigned int __max_die_per_package;
 
 #ifdef CONFIG_SMP
@@ -160,6 +162,11 @@ int topology_update_die_map(unsigned int
 int topology_phys_to_logical_pkg(unsigned int pkg);
 bool topology_smt_supported(void);
 
+static inline unsigned int topology_amd_nodes_per_pkg(void)
+{
+	return __max_die_per_package;
+}
+
 extern struct cpumask __cpu_primary_thread_mask;
 #define cpu_primary_thread_mask ((const struct cpumask *)&__cpu_primary_thread_mask)
 
@@ -182,6 +189,7 @@ static inline int topology_max_die_per_p
 static inline int topology_max_smt_threads(void) { return 1; }
 static inline bool topology_is_primary_thread(unsigned int cpu) { return true; }
 static inline bool topology_smt_supported(void) { return false; }
+static inline unsigned int topology_amd_nodes_per_pkg(void) { return 0; };
 #endif /* !CONFIG_SMP */
 
 static inline void arch_fix_phys_package_id(int num, u32 slot)
--- a/arch/x86/kernel/amd_nb.c
+++ b/arch/x86/kernel/amd_nb.c
@@ -370,7 +370,7 @@ struct resource *amd_get_mmconfig_range(
 
 int amd_get_subcaches(int cpu)
 {
-	struct pci_dev *link = node_to_amd_nb(topology_die_id(cpu))->link;
+	struct pci_dev *link = node_to_amd_nb(topology_amd_node_id(cpu))->link;
 	unsigned int mask;
 
 	if (!amd_nb_has_feature(AMD_NB_L3_PARTITIONING))
@@ -384,7 +384,7 @@ int amd_get_subcaches(int cpu)
 int amd_set_subcaches(int cpu, unsigned long mask)
 {
 	static unsigned int reset, ban;
-	struct amd_northbridge *nb = node_to_amd_nb(topology_die_id(cpu));
+	struct amd_northbridge *nb = node_to_amd_nb(topology_amd_node_id(cpu));
 	unsigned int reg;
 	int cuid;
 
--- a/arch/x86/kernel/cpu/cacheinfo.c
+++ b/arch/x86/kernel/cpu/cacheinfo.c
@@ -595,7 +595,7 @@ static void amd_init_l3_cache(struct _cp
 	if (index < 3)
 		return;
 
-	node = topology_die_id(smp_processor_id());
+	node = topology_amd_node_id(smp_processor_id());
 	this_leaf->nb = node_to_amd_nb(node);
 	if (this_leaf->nb && !this_leaf->nb->l3_cache.indices)
 		amd_calc_l3_indices(this_leaf->nb);
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -1181,7 +1181,7 @@ static int threshold_create_bank(struct
 		return -ENODEV;
 
 	if (is_shared_bank(bank)) {
-		nb = node_to_amd_nb(topology_die_id(cpu));
+		nb = node_to_amd_nb(topology_amd_node_id(cpu));
 
 		/* threshold descriptor already initialized on this node? */
 		if (nb && nb->bank4) {
@@ -1285,7 +1285,7 @@ static void threshold_remove_bank(struct
 		 * The last CPU on this node using the shared bank is going
 		 * away, remove that bank now.
 		 */
-		nb = node_to_amd_nb(topology_die_id(smp_processor_id()));
+		nb = node_to_amd_nb(topology_amd_node_id(smp_processor_id()));
 		nb->bank4 = NULL;
 	}
 
--- a/arch/x86/kernel/cpu/mce/inject.c
+++ b/arch/x86/kernel/cpu/mce/inject.c
@@ -543,8 +543,8 @@ static void do_inject(void)
 	if (boot_cpu_has(X86_FEATURE_AMD_DCM) &&
 	    b == 4 &&
 	    boot_cpu_data.x86 < 0x17) {
-		toggle_nb_mca_mst_cpu(topology_die_id(cpu));
-		cpu = get_nbc_for_node(topology_die_id(cpu));
+		toggle_nb_mca_mst_cpu(topology_amd_node_id(cpu));
+		cpu = get_nbc_for_node(topology_amd_node_id(cpu));
 	}
 
 	cpus_read_lock();
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -1907,7 +1907,7 @@ static void dct_determine_memory_type(st
 /* On F10h and later ErrAddr is MC4_ADDR[47:1] */
 static u64 get_error_address(struct amd64_pvt *pvt, struct mce *m)
 {
-	u16 mce_nid = topology_die_id(m->extcpu);
+	u16 mce_nid = topology_amd_node_id(m->extcpu);
 	struct mem_ctl_info *mci;
 	u8 start_bit = 1;
 	u8 end_bit   = 47;
@@ -3438,7 +3438,7 @@ static void get_cpus_on_this_dct_cpumask
 	int cpu;
 
 	for_each_online_cpu(cpu)
-		if (topology_die_id(cpu) == nid)
+		if (topology_amd_node_id(cpu) == nid)
 			cpumask_set_cpu(cpu, mask);
 }
 
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -1060,7 +1060,7 @@ static void decode_mc3_mce(struct mce *m
 static void decode_mc4_mce(struct mce *m)
 {
 	unsigned int fam = x86_family(m->cpuid);
-	int node_id = topology_die_id(m->extcpu);
+	int node_id = topology_amd_node_id(m->extcpu);
 	u16 ec = EC(m->status);
 	u8 xec = XEC(m->status, 0x1f);
 	u8 offset = 0;
@@ -1188,7 +1188,7 @@ static void decode_smca_error(struct mce
 
 	if ((bank_type == SMCA_UMC || bank_type == SMCA_UMC_V2) &&
 	    xec == 0 && decode_dram_ecc)
-		decode_dram_ecc(topology_die_id(m->extcpu), m);
+		decode_dram_ecc(topology_amd_node_id(m->extcpu), m);
 }
 
 static inline void amd_decode_err_code(u16 ec)


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 28/38] x86/cpu: Provide an AMD/HYGON specific topology parser
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (26 preceding siblings ...)
  2023-07-28 12:13 ` [patch v2 27/38] x86/cpu/amd: Provide a separate acessor for Node ID Thomas Gleixner
@ 2023-07-28 12:13 ` Thomas Gleixner
  2023-07-29  0:05   ` Sohil Mehta
  2023-07-30  5:20   ` Michael Kelley (LINUX)
  2023-07-28 12:13 ` [patch v2 29/38] x86/smpboot: Teach it about topo.amd_node_id Thomas Gleixner
                   ` (11 subsequent siblings)
  39 siblings, 2 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:13 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

AMD/HYGON uses various methods for topology evaluation:

  - Leaf 0x80000008 and 0x8000001e based with an optional leaf 0xb,
    which is the preferred variant for modern CPUs.

    Leaf 0xb will be superseeded by leaf 0x80000026 soon, which is just
    another variant of the Intel 0x1f leaf for whatever reasons.
    
  - Subleaf 0x80000008 and NODEID_MSR base

  - Legacy fallback

That code is following the principle of random bits and pieces all over the
place which results in multiple evaluations and impenetrable code flows in
the same way as the Intel parsing did.

Provide a sane implementation by clearly separating the three variants and
bringing them in the proper preference order in one place.

This provides the parsing for both AMD and HYGON because there is no point
in having a separate HYGON parser which only differs by 3 lines of
code. Any further divergence between AMD and HYGON can be handled in
different functions, while still sharing the existing parsers.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/topology.h       |    2 
 arch/x86/kernel/cpu/Makefile          |    2 
 arch/x86/kernel/cpu/amd.c             |    2 
 arch/x86/kernel/cpu/cacheinfo.c       |    4 
 arch/x86/kernel/cpu/cpu.h             |    2 
 arch/x86/kernel/cpu/debugfs.c         |    2 
 arch/x86/kernel/cpu/topology.h        |    6 +
 arch/x86/kernel/cpu/topology_amd.c    |  179 ++++++++++++++++++++++++++++++++++
 arch/x86/kernel/cpu/topology_common.c |   19 +++
 9 files changed, 211 insertions(+), 7 deletions(-)

--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -162,6 +162,8 @@ int topology_update_die_map(unsigned int
 int topology_phys_to_logical_pkg(unsigned int pkg);
 bool topology_smt_supported(void);
 
+extern unsigned int __amd_nodes_per_pkg;
+
 static inline unsigned int topology_amd_nodes_per_pkg(void)
 {
 	return __max_die_per_package;
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -18,7 +18,7 @@ KMSAN_SANITIZE_common.o := n
 KCSAN_SANITIZE_common.o := n
 
 obj-y			:= cacheinfo.o scattered.o
-obj-y			+= topology_common.o topology_ext.o topology.o
+obj-y			+= topology_common.o topology_ext.o topology_amd.o topology.o
 obj-y			+= common.o
 obj-y			+= rdrand.o
 obj-y			+= match.o
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -356,7 +356,7 @@ static void amd_get_topology(struct cpui
 		if (!err)
 			c->x86_coreid_bits = get_count_order(c->x86_max_cores);
 
-		cacheinfo_amd_init_llc_id(c);
+		cacheinfo_amd_init_llc_id(c, c->topo.die_id);
 
 	} else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) {
 		u64 value;
--- a/arch/x86/kernel/cpu/cacheinfo.c
+++ b/arch/x86/kernel/cpu/cacheinfo.c
@@ -661,7 +661,7 @@ static int find_num_cache_leaves(struct
 	return i;
 }
 
-void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c)
+void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, u16 die_id)
 {
 	/*
 	 * We may have multiple LLCs if L3 caches exist, so check if we
@@ -672,7 +672,7 @@ void cacheinfo_amd_init_llc_id(struct cp
 
 	if (c->x86 < 0x17) {
 		/* LLC is at the node level. */
-		c->topo.llc_id = c->topo.die_id;
+		c->topo.llc_id = die_id;
 	} else if (c->x86 == 0x17 && c->x86_model <= 0x1F) {
 		/*
 		 * LLC is at the core complex level.
--- a/arch/x86/kernel/cpu/cpu.h
+++ b/arch/x86/kernel/cpu/cpu.h
@@ -79,7 +79,7 @@ extern void init_hygon_cacheinfo(struct
 extern int detect_extended_topology(struct cpuinfo_x86 *c);
 extern void check_null_seg_clears_base(struct cpuinfo_x86 *c);
 
-void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c);
+void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, u16 die_id);
 void cacheinfo_hygon_init_llc_id(struct cpuinfo_x86 *c);
 
 unsigned int aperfmperf_get_khz(int cpu);
--- a/arch/x86/kernel/cpu/debugfs.c
+++ b/arch/x86/kernel/cpu/debugfs.c
@@ -27,6 +27,8 @@ static int cpu_debug_show(struct seq_fil
 	seq_printf(m, "logical_die_id:      %u\n", c->topo.logical_die_id);
 	seq_printf(m, "llc_id:              %u\n", c->topo.llc_id);
 	seq_printf(m, "l2c_id:              %u\n", c->topo.l2c_id);
+	seq_printf(m, "amd_node_id:         %u\n", c->topo.amd_node_id);
+	seq_printf(m, "amd_nodes_per_pkg:   %u\n", topology_amd_nodes_per_pkg());
 	seq_printf(m, "max_cores:           %u\n", c->x86_max_cores);
 	seq_printf(m, "max_die_per_pkg:     %u\n", __max_die_per_package);
 	seq_printf(m, "smp_num_siblings:    %u\n", smp_num_siblings);
--- a/arch/x86/kernel/cpu/topology.h
+++ b/arch/x86/kernel/cpu/topology.h
@@ -9,6 +9,10 @@ struct topo_scan {
 
 	// Legacy CPUID[1]:EBX[23:16] number of logical processors
 	unsigned int		ebx1_nproc_shift;
+
+	// AMD specific node ID which cannot be mapped into APIC space.
+	u16			amd_nodes_per_pkg;
+	u16			amd_node_id;
 };
 
 bool topo_is_converted(struct cpuinfo_x86 *c);
@@ -17,6 +21,8 @@ void cpu_parse_topology(struct cpuinfo_x
 void topology_set_dom(struct topo_scan *tscan, enum x86_topology_domains dom,
 		      unsigned int shift, unsigned int ncpus);
 bool cpu_parse_topology_ext(struct topo_scan *tscan);
+void cpu_parse_topology_amd(struct topo_scan *tscan);
+void cpu_topology_fixup_amd(struct topo_scan *tscan);
 
 static inline u32 topo_shift_apicid(u32 apicid, enum x86_topology_domains dom)
 {
--- /dev/null
+++ b/arch/x86/kernel/cpu/topology_amd.c
@@ -0,0 +1,179 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/cpu.h>
+
+#include <asm/apic.h>
+#include <asm/memtype.h>
+#include <asm/processor.h>
+
+#include "cpu.h"
+
+static bool parse_8000_0008(struct topo_scan *tscan)
+{
+	struct {
+		u32	ncores		:  8,
+			__rsvd0		:  4,
+			apicidsize	:  4,
+			perftscsize	:  2,
+			__rsvd1		: 14;
+	} ecx;
+	unsigned int sft;
+
+	if (tscan->c->extended_cpuid_level < 0x80000008)
+		return false;
+
+	cpuid_leaf_reg(0x80000008, CPUID_ECX, &ecx);
+
+	/* If the APIC ID size is 0, then get the shift value from ecx.ncores */
+	sft = ecx.apicidsize;
+	if (!sft)
+		sft = get_count_order(ecx.ncores + 1);
+
+	topology_set_dom(tscan, TOPO_CORE_DOMAIN, sft, ecx.ncores + 1);
+	return true;
+}
+
+static void store_node(struct topo_scan *tscan, unsigned int nr_nodes, u16 node_id)
+{
+	/*
+	 * Starting with Fam 17h the DIE domain could probably be used to
+	 * retrieve the node info on AMD/HYGON. Analysis of CPUID dumps
+	 * suggests its the topmost bit(s) of the CPU cores area, but
+	 * that's guess work and neither enumerated nor documented.
+	 *
+	 * Up to Fam 16h this does not work at all and the legacy node ID
+	 * has to be used.
+	 */
+	tscan->amd_nodes_per_pkg = nr_nodes;
+	tscan->amd_node_id = node_id;
+}
+
+static bool parse_8000_001e(struct topo_scan *tscan, bool has_0xb)
+{
+	struct {
+		// eax
+		u32	x2apic_id	: 32;
+		// ebx
+		u32	cuid		:  8,
+			threads_per_cu	:  8,
+			__rsvd0		: 16;
+		// ecx
+		u32	nodeid		:  8,
+			nodes_per_pkg	:  3,
+			__rsvd1		: 21;
+		// edx
+		u32	__rsvd2		: 32;
+	} leaf;
+
+	if (!boot_cpu_has(X86_FEATURE_TOPOEXT))
+		return false;
+
+	cpuid_leaf(0x8000001e, &leaf);
+
+	tscan->c->topo.initial_apicid = leaf.x2apic_id;
+
+	/*
+	 * If leaf 0xb is available, then SMT shift is set already. If not
+	 * take it from ecx.threads_per_cpu and use topo_update_dom() as
+	 * topology_set_dom() would propagate and overwrite the already
+	 * propagated CORE level.
+	 */
+	if (!has_0xb) {
+		topology_update_dom(tscan, TOPO_SMT_DOMAIN, get_count_order(leaf.threads_per_cu),
+				    leaf.threads_per_cu);
+	}
+
+	store_node(tscan, leaf.nodes_per_pkg + 1, leaf.nodeid);
+
+	if (tscan->c->x86_vendor == X86_VENDOR_AMD) {
+		if (tscan->c->x86 == 0x15)
+			tscan->c->topo.cu_id = leaf.cuid;
+
+		cacheinfo_amd_init_llc_id(tscan->c, leaf.nodeid);
+	} else {
+		/*
+		 * Pacakge ID is ApicId[6..] on Hygon CPUs. See commit
+		 * e0ceeae708ce for explanation. The topology info is
+		 * screwed up: The package shift is always 6 and the node
+		 * ID is bit [4:5]. Don't touch the latter without
+		 * confirmation from the Hygon developers.
+		 */
+		topology_set_dom(tscan, TOPO_CORE_DOMAIN, 6, tscan->dom_ncpus[TOPO_CORE_DOMAIN]);
+		cacheinfo_hygon_init_llc_id(tscan->c);
+	}
+	return true;
+}
+
+static bool parse_fam10h_node_id(struct topo_scan *tscan)
+{
+	struct {
+		union {
+			u64	node_id		:  3,
+				nodes_per_pkg	:  3,
+				unused		: 58;
+			u64	msr;
+		};
+	} nid;
+
+	if (!boot_cpu_has(X86_FEATURE_NODEID_MSR))
+		return false;
+
+	rdmsrl(MSR_FAM10H_NODE_ID, nid.msr);
+	store_node(tscan, nid.nodes_per_pkg + 1, nid.node_id);
+	tscan->c->topo.llc_id = nid.node_id;
+	return true;
+}
+
+static void legacy_set_llc(struct topo_scan *tscan)
+{
+	unsigned int apicid = tscan->c->topo.initial_apicid;
+
+	/* parse_8000_0008() set everything up except llc_id */
+	tscan->c->topo.llc_id = apicid >> tscan->dom_shifts[TOPO_CORE_DOMAIN];
+}
+
+static void parse_topology_amd(struct topo_scan *tscan)
+{
+	bool has_0xb = false;
+
+	/*
+	 * If the extended topology leaf 0x8000_001e is available
+	 * try to get SMT and CORE shift from leaf 0xb first, then
+	 * try to get the CORE shift from leaf 0x8000_0008.
+	 */
+	if (boot_cpu_has(X86_FEATURE_TOPOEXT))
+		has_0xb = cpu_parse_topology_ext(tscan);
+
+	if (!has_0xb && !parse_8000_0008(tscan))
+		return;
+
+	/* Prefer leaf 0x8000001e if available */
+	if (parse_8000_001e(tscan, has_0xb))
+		return;
+
+	/* Try the NODEID MSR */
+	if (parse_fam10h_node_id(tscan))
+		return;
+
+	legacy_set_llc(tscan);
+}
+
+void cpu_parse_topology_amd(struct topo_scan *tscan)
+{
+	tscan->amd_nodes_per_pkg = 1;
+	parse_topology_amd(tscan);
+
+	if (tscan->amd_nodes_per_pkg > 1)
+		set_cpu_cap(tscan->c, X86_FEATURE_AMD_DCM);
+}
+
+void cpu_topology_fixup_amd(struct topo_scan *tscan)
+{
+	struct cpuinfo_x86 *c = tscan->c;
+
+	/*
+	 * Adjust the core_id relative to the node when there is more than
+	 * one node.
+	 */
+	if (tscan->c->x86 < 0x17 && tscan->amd_nodes_per_pkg > 1)
+		c->topo.core_id %= tscan->dom_ncpus[TOPO_CORE_DOMAIN] / tscan->amd_nodes_per_pkg;
+}
--- a/arch/x86/kernel/cpu/topology_common.c
+++ b/arch/x86/kernel/cpu/topology_common.c
@@ -11,11 +11,13 @@
 
 struct x86_topology_system x86_topo_system __ro_after_init;
 
+unsigned int __amd_nodes_per_pkg __ro_after_init;
+EXPORT_SYMBOL_GPL(__amd_nodes_per_pkg);
+
 void topology_set_dom(struct topo_scan *tscan, enum x86_topology_domains dom,
 		      unsigned int shift, unsigned int ncpus)
 {
-	tscan->dom_shifts[dom] = shift;
-	tscan->dom_ncpus[dom] = ncpus;
+	topology_update_dom(tscan, dom, shift, ncpus);
 
 	/* Propagate to the upper levels */
 	for (dom++; dom < TOPO_MAX_DOMAIN; dom++) {
@@ -145,6 +147,13 @@ static void topo_set_ids(struct topo_sca
 
 	/* Relative core ID */
 	c->topo.core_id = topo_relative_domain_id(apicid, TOPO_CORE_DOMAIN);
+
+	/* Temporary workaround */
+	if (tscan->amd_nodes_per_pkg)
+		c->topo.amd_node_id = c->topo.die_id = tscan->amd_node_id;
+
+	if (c->x86_vendor == X86_VENDOR_AMD)
+		cpu_topology_fixup_amd(tscan);
 }
 
 static void topo_set_max_cores(struct topo_scan *tscan)
@@ -229,4 +238,10 @@ void __init cpu_init_topology(struct cpu
 	 */
 	__max_die_per_package = tscan.dom_ncpus[TOPO_DIE_DOMAIN] /
 		tscan.dom_ncpus[TOPO_DIE_DOMAIN - 1];
+	/*
+	 * AMD systems have Nodes per package which cannot be mapped to
+	 * APIC ID (yet).
+	 */
+	if (c->x86_vendor == X86_VENDOR_AMD || c->x86_vendor == X86_VENDOR_HYGON)
+		__amd_nodes_per_pkg = __max_die_per_package = tscan.amd_nodes_per_pkg;
 }


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 29/38] x86/smpboot: Teach it about topo.amd_node_id
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (27 preceding siblings ...)
  2023-07-28 12:13 ` [patch v2 28/38] x86/cpu: Provide an AMD/HYGON specific topology parser Thomas Gleixner
@ 2023-07-28 12:13 ` Thomas Gleixner
  2023-07-28 12:13 ` [patch v2 30/38] x86/cpu: Use common topology code for AMD Thomas Gleixner
                   ` (10 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:13 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

When switching AMD over to the new topology parser then the match functions
need to look for AMD systems with the extended topology feature at the new
topo.amd_node_id member which is then holding the node id information.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/smpboot.c |   12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -486,6 +486,7 @@ static bool match_smt(struct cpuinfo_x86
 
 		if (c->topo.pkg_id == o->topo.pkg_id &&
 		    c->topo.die_id == o->topo.die_id &&
+		    c->topo.amd_node_id == o->topo.amd_node_id &&
 		    per_cpu_llc_id(cpu1) == per_cpu_llc_id(cpu2)) {
 			if (c->topo.core_id == o->topo.core_id)
 				return topology_sane(c, o, "smt");
@@ -507,10 +508,13 @@ static bool match_smt(struct cpuinfo_x86
 
 static bool match_die(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
 {
-	if (c->topo.pkg_id == o->topo.pkg_id &&
-	    c->topo.die_id == o->topo.die_id)
-		return true;
-	return false;
+	if (c->topo.pkg_id != o->topo.pkg_id || c->topo.die_id != o->topo.die_id)
+		return false;
+
+	if (boot_cpu_has(X86_FEATURE_TOPOEXT) && topology_amd_nodes_per_pkg() > 1)
+		return c->topo.amd_node_id == o->topo.amd_node_id;
+
+	return true;
 }
 
 static bool match_l2c(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 30/38] x86/cpu: Use common topology code for AMD
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (28 preceding siblings ...)
  2023-07-28 12:13 ` [patch v2 29/38] x86/smpboot: Teach it about topo.amd_node_id Thomas Gleixner
@ 2023-07-28 12:13 ` Thomas Gleixner
  2023-07-28 12:13 ` [patch v2 31/38] x86/cpu: Use common topology code for HYGON Thomas Gleixner
                   ` (9 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:13 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

Switch it over to the new topology evaluation mechanism and remove the
random bits and pieces which are sprinkled all over the place.

No functional change intended.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/processor.h      |    2 
 arch/x86/include/asm/topology.h       |    5 +
 arch/x86/kernel/cpu/amd.c             |  146 ----------------------------------
 arch/x86/kernel/cpu/mce/inject.c      |    3 
 arch/x86/kernel/cpu/topology_common.c |    5 -
 5 files changed, 10 insertions(+), 151 deletions(-)

--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -703,10 +703,8 @@ static inline u16 per_cpu_l2c_id(unsigne
 }
 
 #ifdef CONFIG_CPU_SUP_AMD
-extern u32 amd_get_nodes_per_socket(void);
 extern u32 amd_get_highest_perf(void);
 #else
-static inline u32 amd_get_nodes_per_socket(void)	{ return 0; }
 static inline u32 amd_get_highest_perf(void)		{ return 0; }
 #endif
 
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -121,6 +121,11 @@ struct x86_topology_system {
 
 extern struct x86_topology_system x86_topo_system;
 
+static inline unsigned int topology_get_domain_size(enum x86_topology_domains dom)
+{
+	return x86_topo_system.dom_size[dom];
+}
+
 extern const struct cpumask *cpu_coregroup_mask(int cpu);
 extern const struct cpumask *cpu_clustergroup_mask(int cpu);
 
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -32,13 +32,6 @@ static const int amd_erratum_400[];
 static const int amd_erratum_1054[];
 static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum);
 
-/*
- * nodes_per_socket: Stores the number of nodes per socket.
- * Refer to Fam15h Models 00-0fh BKDG - CPUID Fn8000_001E_ECX
- * Node Identifiers[10:8]
- */
-static u32 nodes_per_socket = 1;
-
 static inline int rdmsrl_amd_safe(unsigned msr, unsigned long long *p)
 {
 	u32 gprs[8] = { 0 };
@@ -305,97 +298,6 @@ static int nearby_node(int apicid)
 }
 #endif
 
-/*
- * Fix up topo::core_id for pre-F17h systems to be in the
- * [0 .. cores_per_node - 1] range. Not really needed but
- * kept so as not to break existing setups.
- */
-static void legacy_fixup_core_id(struct cpuinfo_x86 *c)
-{
-	u32 cus_per_node;
-
-	if (c->x86 >= 0x17)
-		return;
-
-	cus_per_node = c->x86_max_cores / nodes_per_socket;
-	c->topo.core_id %= cus_per_node;
-}
-
-/*
- * Fixup core topology information for
- * (1) AMD multi-node processors
- *     Assumption: Number of cores in each internal node is the same.
- * (2) AMD processors supporting compute units
- */
-static void amd_get_topology(struct cpuinfo_x86 *c)
-{
-	/* get information required for multi-node processors */
-	if (boot_cpu_has(X86_FEATURE_TOPOEXT)) {
-		int err;
-		u32 eax, ebx, ecx, edx;
-
-		cpuid(0x8000001e, &eax, &ebx, &ecx, &edx);
-
-		c->topo.die_id  = ecx & 0xff;
-
-		if (c->x86 == 0x15)
-			c->topo.cu_id = ebx & 0xff;
-
-		if (c->x86 >= 0x17) {
-			c->topo.core_id = ebx & 0xff;
-
-			if (smp_num_siblings > 1)
-				c->x86_max_cores /= smp_num_siblings;
-		}
-
-		/*
-		 * In case leaf B is available, use it to derive
-		 * topology information.
-		 */
-		err = detect_extended_topology(c);
-		if (!err)
-			c->x86_coreid_bits = get_count_order(c->x86_max_cores);
-
-		cacheinfo_amd_init_llc_id(c, c->topo.die_id);
-
-	} else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) {
-		u64 value;
-
-		rdmsrl(MSR_FAM10H_NODE_ID, value);
-		c->topo.die_id = value & 7;
-		c->topo.llc_id = c->topo.die_id;
-	} else
-		return;
-
-	if (nodes_per_socket > 1) {
-		set_cpu_cap(c, X86_FEATURE_AMD_DCM);
-		legacy_fixup_core_id(c);
-	}
-}
-
-/*
- * On a AMD dual core setup the lower bits of the APIC id distinguish the cores.
- * Assumes number of cores is a power of two.
- */
-static void amd_detect_cmp(struct cpuinfo_x86 *c)
-{
-	unsigned bits;
-
-	bits = c->x86_coreid_bits;
-	/* Low order bits define the core id (index of core in socket) */
-	c->topo.core_id = c->topo.initial_apicid & ((1 << bits)-1);
-	/* Convert the initial APIC ID into the socket ID */
-	c->topo.pkg_id = c->topo.initial_apicid >> bits;
-	/* use socket ID also for last level cache */
-	c->topo.llc_id = c->topo.die_id = c->topo.pkg_id;
-}
-
-u32 amd_get_nodes_per_socket(void)
-{
-	return nodes_per_socket;
-}
-EXPORT_SYMBOL_GPL(amd_get_nodes_per_socket);
-
 static void srat_detect_node(struct cpuinfo_x86 *c)
 {
 #ifdef CONFIG_NUMA
@@ -447,32 +349,6 @@ static void srat_detect_node(struct cpui
 #endif
 }
 
-static void early_init_amd_mc(struct cpuinfo_x86 *c)
-{
-#ifdef CONFIG_SMP
-	unsigned bits, ecx;
-
-	/* Multi core CPU? */
-	if (c->extended_cpuid_level < 0x80000008)
-		return;
-
-	ecx = cpuid_ecx(0x80000008);
-
-	c->x86_max_cores = (ecx & 0xff) + 1;
-
-	/* CPU telling us the core id bits shift? */
-	bits = (ecx >> 12) & 0xF;
-
-	/* Otherwise recompute */
-	if (bits == 0) {
-		while ((1 << bits) < c->x86_max_cores)
-			bits++;
-	}
-
-	c->x86_coreid_bits = bits;
-#endif
-}
-
 static void bsp_init_amd(struct cpuinfo_x86 *c)
 {
 	if (cpu_has(c, X86_FEATURE_CONSTANT_TSC)) {
@@ -505,18 +381,6 @@ static void bsp_init_amd(struct cpuinfo_
 	if (cpu_has(c, X86_FEATURE_MWAITX))
 		use_mwaitx_delay();
 
-	if (boot_cpu_has(X86_FEATURE_TOPOEXT)) {
-		u32 ecx;
-
-		ecx = cpuid_ecx(0x8000001e);
-		__max_die_per_package = nodes_per_socket = ((ecx >> 8) & 7) + 1;
-	} else if (boot_cpu_has(X86_FEATURE_NODEID_MSR)) {
-		u64 value;
-
-		rdmsrl(MSR_FAM10H_NODE_ID, value);
-		__max_die_per_package = nodes_per_socket = ((value >> 3) & 7) + 1;
-	}
-
 	if (!boot_cpu_has(X86_FEATURE_AMD_SSBD) &&
 	    !boot_cpu_has(X86_FEATURE_VIRT_SSBD) &&
 	    c->x86 >= 0x15 && c->x86 <= 0x17) {
@@ -598,8 +462,6 @@ static void early_init_amd(struct cpuinf
 	u64 value;
 	u32 dummy;
 
-	early_init_amd_mc(c);
-
 	if (c->x86 >= 0xf)
 		set_cpu_cap(c, X86_FEATURE_K8);
 
@@ -687,9 +549,6 @@ static void early_init_amd(struct cpuinf
 			}
 		}
 	}
-
-	if (cpu_has(c, X86_FEATURE_TOPOEXT))
-		smp_num_siblings = ((cpuid_ebx(0x8000001e) >> 8) & 0xff) + 1;
 }
 
 static void init_amd_k8(struct cpuinfo_x86 *c)
@@ -929,9 +788,6 @@ static void init_amd(struct cpuinfo_x86
 	if (cpu_has(c, X86_FEATURE_FSRM))
 		set_cpu_cap(c, X86_FEATURE_FSRS);
 
-	/* get apicid instead of initial apic id from cpuid */
-	c->topo.apicid = read_apic_id();
-
 	/* K6s reports MCEs but don't actually have all the MSRs */
 	if (c->x86 < 6)
 		clear_cpu_cap(c, X86_FEATURE_MCE);
@@ -959,8 +815,6 @@ static void init_amd(struct cpuinfo_x86
 
 	cpu_detect_cache_sizes(c);
 
-	amd_detect_cmp(c);
-	amd_get_topology(c);
 	srat_detect_node(c);
 
 	init_amd_cacheinfo(c);
--- a/arch/x86/kernel/cpu/mce/inject.c
+++ b/arch/x86/kernel/cpu/mce/inject.c
@@ -433,8 +433,7 @@ static u32 get_nbc_for_node(int node_id)
 	struct cpuinfo_x86 *c = &boot_cpu_data;
 	u32 cores_per_node;
 
-	cores_per_node = (c->x86_max_cores * smp_num_siblings) / amd_get_nodes_per_socket();
-
+	cores_per_node = (c->x86_max_cores * smp_num_siblings) / topology_amd_nodes_per_pkg();
 	return cores_per_node * node_id;
 }
 
--- a/arch/x86/kernel/cpu/topology_common.c
+++ b/arch/x86/kernel/cpu/topology_common.c
@@ -65,7 +65,6 @@ bool topo_is_converted(struct cpuinfo_x8
 {
 	/* Temporary until everything is converted over. */
 	switch (boot_cpu_data.x86_vendor) {
-	case X86_VENDOR_AMD:
 	case X86_VENDOR_HYGON:
 		return false;
 	default:
@@ -126,6 +125,10 @@ static void parse_topology(struct topo_s
 	tscan->ebx1_nproc_shift = get_count_order(ebx.nproc);
 
 	switch (c->x86_vendor) {
+	case X86_VENDOR_AMD:
+		if (IS_ENABLED(CONFIG_CPU_SUP_AMD))
+			cpu_parse_topology_amd(tscan);
+		break;
 	case X86_VENDOR_CENTAUR:
 	case X86_VENDOR_ZHAOXIN:
 		parse_legacy(tscan);


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 31/38] x86/cpu: Use common topology code for HYGON
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (29 preceding siblings ...)
  2023-07-28 12:13 ` [patch v2 30/38] x86/cpu: Use common topology code for AMD Thomas Gleixner
@ 2023-07-28 12:13 ` Thomas Gleixner
  2023-07-28 12:13 ` [patch v2 32/38] x86/mm/numa: Use core domain size on AMD Thomas Gleixner
                   ` (8 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:13 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

Switch it over to use the consolidated topology evaluation and remove the
temporary safe guards which are not longer needed.

No functional change intended.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/common.c          |    5 -
 arch/x86/kernel/cpu/cpu.h             |    1 
 arch/x86/kernel/cpu/hygon.c           |  123 ----------------------------------
 arch/x86/kernel/cpu/topology.h        |    1 
 arch/x86/kernel/cpu/topology_common.c |   22 +-----
 5 files changed, 4 insertions(+), 148 deletions(-)

--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1739,11 +1739,6 @@ static void identify_cpu(struct cpuinfo_
 	/* Clear/Set all flags overridden by options, after probe */
 	apply_forced_caps(c);
 
-#ifdef CONFIG_X86_64
-	if (!topo_is_converted(c))
-		c->topo.apicid = apic->phys_pkg_id(c->topo.initial_apicid, 0);
-#endif
-
 	/*
 	 * Vendor-specific initialization.  In this section we
 	 * canonicalize the feature flags, meaning if there are
--- a/arch/x86/kernel/cpu/cpu.h
+++ b/arch/x86/kernel/cpu/cpu.h
@@ -76,7 +76,6 @@ extern void init_intel_cacheinfo(struct
 extern void init_amd_cacheinfo(struct cpuinfo_x86 *c);
 extern void init_hygon_cacheinfo(struct cpuinfo_x86 *c);
 
-extern int detect_extended_topology(struct cpuinfo_x86 *c);
 extern void check_null_seg_clears_base(struct cpuinfo_x86 *c);
 
 void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, u16 die_id);
--- a/arch/x86/kernel/cpu/hygon.c
+++ b/arch/x86/kernel/cpu/hygon.c
@@ -20,12 +20,6 @@
 
 #define APICID_SOCKET_ID_BIT 6
 
-/*
- * nodes_per_socket: Stores the number of nodes per socket.
- * Refer to CPUID Fn8000_001E_ECX Node Identifiers[10:8]
- */
-static u32 nodes_per_socket = 1;
-
 #ifdef CONFIG_NUMA
 /*
  * To workaround broken NUMA config.  Read the comment in
@@ -49,76 +43,6 @@ static int nearby_node(int apicid)
 }
 #endif
 
-static void hygon_get_topology_early(struct cpuinfo_x86 *c)
-{
-	if (cpu_has(c, X86_FEATURE_TOPOEXT))
-		smp_num_siblings = ((cpuid_ebx(0x8000001e) >> 8) & 0xff) + 1;
-}
-
-/*
- * Fixup core topology information for
- * (1) Hygon multi-node processors
- *     Assumption: Number of cores in each internal node is the same.
- * (2) Hygon processors supporting compute units
- */
-static void hygon_get_topology(struct cpuinfo_x86 *c)
-{
-	/* get information required for multi-node processors */
-	if (boot_cpu_has(X86_FEATURE_TOPOEXT)) {
-		int err;
-		u32 eax, ebx, ecx, edx;
-
-		cpuid(0x8000001e, &eax, &ebx, &ecx, &edx);
-
-		c->topo.die_id  = ecx & 0xff;
-
-		c->topo.core_id = ebx & 0xff;
-
-		if (smp_num_siblings > 1)
-			c->x86_max_cores /= smp_num_siblings;
-
-		/*
-		 * In case leaf B is available, use it to derive
-		 * topology information.
-		 */
-		err = detect_extended_topology(c);
-		if (!err)
-			c->x86_coreid_bits = get_count_order(c->x86_max_cores);
-
-		/* Socket ID is ApicId[6] for these processors. */
-		c->topo.pkg_id = c->topo.apicid >> APICID_SOCKET_ID_BIT;
-
-		cacheinfo_hygon_init_llc_id(c);
-	} else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) {
-		u64 value;
-
-		rdmsrl(MSR_FAM10H_NODE_ID, value);
-		c->topo.die_id = value & 7;
-		c->topo.llc_id = c->topo.die_id;
-	} else
-		return;
-
-	if (nodes_per_socket > 1)
-		set_cpu_cap(c, X86_FEATURE_AMD_DCM);
-}
-
-/*
- * On Hygon setup the lower bits of the APIC id distinguish the cores.
- * Assumes number of cores is a power of two.
- */
-static void hygon_detect_cmp(struct cpuinfo_x86 *c)
-{
-	unsigned int bits;
-
-	bits = c->x86_coreid_bits;
-	/* Low order bits define the core id (index of core in socket) */
-	c->topo.core_id = c->topo.initial_apicid & ((1 << bits)-1);
-	/* Convert the initial APIC ID into the socket ID */
-	c->topo.pkg_id = c->topo.initial_apicid >> bits;
-	/* Use package ID also for last level cache */
-	c->topo.llc_id = c->topo.die_id = c->topo.pkg_id;
-}
-
 static void srat_detect_node(struct cpuinfo_x86 *c)
 {
 #ifdef CONFIG_NUMA
@@ -169,32 +93,6 @@ static void srat_detect_node(struct cpui
 #endif
 }
 
-static void early_init_hygon_mc(struct cpuinfo_x86 *c)
-{
-#ifdef CONFIG_SMP
-	unsigned int bits, ecx;
-
-	/* Multi core CPU? */
-	if (c->extended_cpuid_level < 0x80000008)
-		return;
-
-	ecx = cpuid_ecx(0x80000008);
-
-	c->x86_max_cores = (ecx & 0xff) + 1;
-
-	/* CPU telling us the core id bits shift? */
-	bits = (ecx >> 12) & 0xF;
-
-	/* Otherwise recompute */
-	if (bits == 0) {
-		while ((1 << bits) < c->x86_max_cores)
-			bits++;
-	}
-
-	c->x86_coreid_bits = bits;
-#endif
-}
-
 static void bsp_init_hygon(struct cpuinfo_x86 *c)
 {
 	if (cpu_has(c, X86_FEATURE_CONSTANT_TSC)) {
@@ -208,18 +106,6 @@ static void bsp_init_hygon(struct cpuinf
 	if (cpu_has(c, X86_FEATURE_MWAITX))
 		use_mwaitx_delay();
 
-	if (boot_cpu_has(X86_FEATURE_TOPOEXT)) {
-		u32 ecx;
-
-		ecx = cpuid_ecx(0x8000001e);
-		__max_die_per_package = nodes_per_socket = ((ecx >> 8) & 7) + 1;
-	} else if (boot_cpu_has(X86_FEATURE_NODEID_MSR)) {
-		u64 value;
-
-		rdmsrl(MSR_FAM10H_NODE_ID, value);
-		__max_die_per_package = nodes_per_socket = ((value >> 3) & 7) + 1;
-	}
-
 	if (!boot_cpu_has(X86_FEATURE_AMD_SSBD) &&
 	    !boot_cpu_has(X86_FEATURE_VIRT_SSBD)) {
 		/*
@@ -238,8 +124,6 @@ static void early_init_hygon(struct cpui
 {
 	u32 dummy;
 
-	early_init_hygon_mc(c);
-
 	set_cpu_cap(c, X86_FEATURE_K8);
 
 	rdmsr_safe(MSR_AMD64_PATCH_LEVEL, &c->microcode, &dummy);
@@ -280,8 +164,6 @@ static void early_init_hygon(struct cpui
 	 * we can set it unconditionally.
 	 */
 	set_cpu_cap(c, X86_FEATURE_VMMCALL);
-
-	hygon_get_topology_early(c);
 }
 
 static void init_hygon(struct cpuinfo_x86 *c)
@@ -296,9 +178,6 @@ static void init_hygon(struct cpuinfo_x8
 
 	set_cpu_cap(c, X86_FEATURE_REP_GOOD);
 
-	/* get apicid instead of initial apic id from cpuid */
-	c->topo.apicid = read_apic_id();
-
 	/*
 	 * XXX someone from Hygon needs to confirm this DTRT
 	 *
@@ -310,8 +189,6 @@ static void init_hygon(struct cpuinfo_x8
 
 	cpu_detect_cache_sizes(c);
 
-	hygon_detect_cmp(c);
-	hygon_get_topology(c);
 	srat_detect_node(c);
 
 	init_hygon_cacheinfo(c);
--- a/arch/x86/kernel/cpu/topology.h
+++ b/arch/x86/kernel/cpu/topology.h
@@ -15,7 +15,6 @@ struct topo_scan {
 	u16			amd_node_id;
 };
 
-bool topo_is_converted(struct cpuinfo_x86 *c);
 void cpu_init_topology(struct cpuinfo_x86 *c);
 void cpu_parse_topology(struct cpuinfo_x86 *c);
 void topology_set_dom(struct topo_scan *tscan, enum x86_topology_domains dom,
--- a/arch/x86/kernel/cpu/topology_common.c
+++ b/arch/x86/kernel/cpu/topology_common.c
@@ -61,18 +61,6 @@ static void parse_legacy(struct topo_sca
 	topology_set_dom(tscan, TOPO_CORE_DOMAIN, core_shift, cores >> smt_shift);
 }
 
-bool topo_is_converted(struct cpuinfo_x86 *c)
-{
-	/* Temporary until everything is converted over. */
-	switch (boot_cpu_data.x86_vendor) {
-	case X86_VENDOR_HYGON:
-		return false;
-	default:
-		/* Let all UP systems use the below */
-		return true;
-	}
-}
-
 static bool fake_topology(struct topo_scan *tscan)
 {
 	/*
@@ -137,6 +125,10 @@ static void parse_topology(struct topo_s
 		if (!IS_ENABLED(CONFIG_CPU_SUP_INTEL) || !cpu_parse_topology_ext(tscan))
 			parse_legacy(tscan);
 		break;
+	case X86_VENDOR_HYGON:
+		if (IS_ENABLED(CONFIG_CPU_SUP_HYGON))
+			cpu_parse_topology_amd(tscan);
+		break;
 	}
 }
 
@@ -179,9 +171,6 @@ void cpu_parse_topology(struct cpuinfo_x
 
 	parse_topology(&tscan, false);
 
-	if (!topo_is_converted(c))
-		return;
-
 	for (dom = TOPO_SMT_DOMAIN; dom < TOPO_MAX_DOMAIN; dom++) {
 		if (tscan.dom_shifts[dom] == x86_topo_system.dom_shifts[dom])
 			continue;
@@ -210,9 +199,6 @@ void __init cpu_init_topology(struct cpu
 
 	parse_topology(&tscan, true);
 
-	if (!topo_is_converted(c))
-		return;
-
 	/* Copy the shift values and calculate the unit sizes. */
 	memcpy(x86_topo_system.dom_shifts, tscan.dom_shifts, sizeof(x86_topo_system.dom_shifts));
 


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 32/38] x86/mm/numa: Use core domain size on AMD
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (30 preceding siblings ...)
  2023-07-28 12:13 ` [patch v2 31/38] x86/cpu: Use common topology code for HYGON Thomas Gleixner
@ 2023-07-28 12:13 ` Thomas Gleixner
  2023-07-28 12:13 ` [patch v2 33/38] x86/cpu: Make topology_amd_node_id() use the actual node info Thomas Gleixner
                   ` (7 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:13 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

cpuinfo::topo::x86_coreid_bits is about to be phased out. Use the core
domain size from the topology information.

Add a comment why the early MPTABLE parsing is required and decrapify the
loop which sets the APIC ID to node map.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/mm/amdtopology.c |   35 ++++++++++++++++-------------------
 1 file changed, 16 insertions(+), 19 deletions(-)

--- a/arch/x86/mm/amdtopology.c
+++ b/arch/x86/mm/amdtopology.c
@@ -54,13 +54,11 @@ static __init int find_northbridge(void)
 
 int __init amd_numa_init(void)
 {
-	u64 start = PFN_PHYS(0);
+	unsigned int numnodes, cores, apicid;
+	u64 prevbase, start = PFN_PHYS(0);
 	u64 end = PFN_PHYS(max_pfn);
-	unsigned numnodes;
-	u64 prevbase;
-	int i, j, nb;
 	u32 nodeid, reg;
-	unsigned int bits, cores, apicid_base;
+	int i, j, nb;
 
 	if (!early_pci_allowed())
 		return -EINVAL;
@@ -158,26 +156,25 @@ int __init amd_numa_init(void)
 		return -ENOENT;
 
 	/*
-	 * We seem to have valid NUMA configuration.  Map apicids to nodes
-	 * using the coreid bits from early_identify_cpu.
+	 * We seem to have valid NUMA configuration. Map apicids to nodes
+	 * using the size of the core domain in the APIC space.
 	 */
-	bits = boot_cpu_data.x86_coreid_bits;
-	cores = 1 << bits;
-	apicid_base = 0;
+	cores = topology_get_domain_size(TOPO_CORE_DOMAIN);
 
 	/*
-	 * get boot-time SMP configuration:
+	 * Scan MPTABLE to map the local APIC and ensure that the boot CPU
+	 * APIC ID is valid. This is required because on pre ACPI/SRAT
+	 * systems IO-APICs are mapped before the boot CPU.
 	 */
 	early_get_smp_config();
 
-	if (boot_cpu_physical_apicid > 0) {
-		pr_info("BSP APIC ID: %02x\n", boot_cpu_physical_apicid);
-		apicid_base = boot_cpu_physical_apicid;
+	apicid = boot_cpu_physical_apicid;
+	if (apicid > 0)
+		pr_info("BSP APIC ID: %02x\n", apicid);
+
+	for_each_node_mask(i, numa_nodes_parsed) {
+		for (j = 0; j < cores; j++, apicid++)
+			set_apicid_to_node(apicid, i);
 	}
-
-	for_each_node_mask(i, numa_nodes_parsed)
-		for (j = apicid_base; j < cores + apicid_base; j++)
-			set_apicid_to_node((i << bits) + j, i);
-
 	return 0;
 }


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 33/38] x86/cpu: Make topology_amd_node_id() use the actual node info
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (31 preceding siblings ...)
  2023-07-28 12:13 ` [patch v2 32/38] x86/mm/numa: Use core domain size on AMD Thomas Gleixner
@ 2023-07-28 12:13 ` Thomas Gleixner
  2023-07-28 12:13 ` [patch v2 34/38] x86/cpu: Remove topology.c Thomas Gleixner
                   ` (6 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:13 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

Now that everything is converted switch it over and remove the intermediate
operation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/topology.h       |    4 ++--
 arch/x86/kernel/cpu/topology_common.c |    7 ++-----
 2 files changed, 4 insertions(+), 7 deletions(-)

--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -136,7 +136,7 @@ extern const struct cpumask *cpu_cluster
 #define topology_core_id(cpu)			(cpu_data(cpu).topo.core_id)
 #define topology_ppin(cpu)			(cpu_data(cpu).ppin)
 
-#define topology_amd_node_id(cpu)		(cpu_data(cpu).topo.die_id)
+#define topology_amd_node_id(cpu)		(cpu_data(cpu).topo.amd_node_id)
 
 extern unsigned int __max_die_per_package;
 
@@ -171,7 +171,7 @@ extern unsigned int __amd_nodes_per_pkg;
 
 static inline unsigned int topology_amd_nodes_per_pkg(void)
 {
-	return __max_die_per_package;
+	return __amd_nodes_per_pkg;
 }
 
 extern struct cpumask __cpu_primary_thread_mask;
--- a/arch/x86/kernel/cpu/topology_common.c
+++ b/arch/x86/kernel/cpu/topology_common.c
@@ -143,9 +143,7 @@ static void topo_set_ids(struct topo_sca
 	/* Relative core ID */
 	c->topo.core_id = topo_relative_domain_id(apicid, TOPO_CORE_DOMAIN);
 
-	/* Temporary workaround */
-	if (tscan->amd_nodes_per_pkg)
-		c->topo.amd_node_id = c->topo.die_id = tscan->amd_node_id;
+	c->topo.amd_node_id = tscan->amd_node_id;
 
 	if (c->x86_vendor == X86_VENDOR_AMD)
 		cpu_topology_fixup_amd(tscan);
@@ -231,6 +229,5 @@ void __init cpu_init_topology(struct cpu
 	 * AMD systems have Nodes per package which cannot be mapped to
 	 * APIC ID (yet).
 	 */
-	if (c->x86_vendor == X86_VENDOR_AMD || c->x86_vendor == X86_VENDOR_HYGON)
-		__amd_nodes_per_pkg = __max_die_per_package = tscan.amd_nodes_per_pkg;
+	__amd_nodes_per_pkg = tscan.amd_nodes_per_pkg;
 }


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 34/38] x86/cpu: Remove topology.c
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (32 preceding siblings ...)
  2023-07-28 12:13 ` [patch v2 33/38] x86/cpu: Make topology_amd_node_id() use the actual node info Thomas Gleixner
@ 2023-07-28 12:13 ` Thomas Gleixner
  2023-07-28 12:13 ` [patch v2 35/38] x86/cpu: Remove x86_coreid_bits Thomas Gleixner
                   ` (5 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:13 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

No more users. Stick it into the ugly code museum.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/Makefile   |    2 
 arch/x86/kernel/cpu/topology.c |  164 -----------------------------------------
 2 files changed, 1 insertion(+), 165 deletions(-)

--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -18,7 +18,7 @@ KMSAN_SANITIZE_common.o := n
 KCSAN_SANITIZE_common.o := n
 
 obj-y			:= cacheinfo.o scattered.o
-obj-y			+= topology_common.o topology_ext.o topology_amd.o topology.o
+obj-y			+= topology_common.o topology_ext.o topology_amd.o
 obj-y			+= common.o
 obj-y			+= rdrand.o
 obj-y			+= match.o
--- a/arch/x86/kernel/cpu/topology.c
+++ /dev/null
@@ -1,164 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Check for extended topology enumeration cpuid leaf 0xb and if it
- * exists, use it for populating initial_apicid and cpu topology
- * detection.
- */
-
-#include <linux/cpu.h>
-#include <asm/apic.h>
-#include <asm/memtype.h>
-#include <asm/processor.h>
-
-#include "cpu.h"
-
-/* leaf 0xb SMT level */
-#define SMT_LEVEL	0
-
-/* extended topology sub-leaf types */
-#define INVALID_TYPE	0
-#define SMT_TYPE	1
-#define CORE_TYPE	2
-#define DIE_TYPE	5
-
-#define LEAFB_SUBTYPE(ecx)		(((ecx) >> 8) & 0xff)
-#define BITS_SHIFT_NEXT_LEVEL(eax)	((eax) & 0x1f)
-#define LEVEL_MAX_SIBLINGS(ebx)		((ebx) & 0xffff)
-
-#ifdef CONFIG_SMP
-/*
- * Check if given CPUID extended topology "leaf" is implemented
- */
-static int check_extended_topology_leaf(int leaf)
-{
-	unsigned int eax, ebx, ecx, edx;
-
-	cpuid_count(leaf, SMT_LEVEL, &eax, &ebx, &ecx, &edx);
-
-	if (ebx == 0 || (LEAFB_SUBTYPE(ecx) != SMT_TYPE))
-		return -1;
-
-	return 0;
-}
-/*
- * Return best CPUID Extended Topology Leaf supported
- */
-static int detect_extended_topology_leaf(struct cpuinfo_x86 *c)
-{
-	if (c->cpuid_level >= 0x1f) {
-		if (check_extended_topology_leaf(0x1f) == 0)
-			return 0x1f;
-	}
-
-	if (c->cpuid_level >= 0xb) {
-		if (check_extended_topology_leaf(0xb) == 0)
-			return 0xb;
-	}
-
-	return -1;
-}
-#endif
-
-int detect_extended_topology_early(struct cpuinfo_x86 *c)
-{
-#ifdef CONFIG_SMP
-	unsigned int eax, ebx, ecx, edx;
-	int leaf;
-
-	leaf = detect_extended_topology_leaf(c);
-	if (leaf < 0)
-		return -1;
-
-	set_cpu_cap(c, X86_FEATURE_XTOPOLOGY);
-
-	cpuid_count(leaf, SMT_LEVEL, &eax, &ebx, &ecx, &edx);
-	/*
-	 * initial apic id, which also represents 32-bit extended x2apic id.
-	 */
-	c->topo.initial_apicid = edx;
-	smp_num_siblings = max_t(int, smp_num_siblings, LEVEL_MAX_SIBLINGS(ebx));
-#endif
-	return 0;
-}
-
-/*
- * Check for extended topology enumeration cpuid leaf, and if it
- * exists, use it for populating initial_apicid and cpu topology
- * detection.
- */
-int detect_extended_topology(struct cpuinfo_x86 *c)
-{
-#ifdef CONFIG_SMP
-	unsigned int eax, ebx, ecx, edx, sub_index;
-	unsigned int ht_mask_width, core_plus_mask_width, die_plus_mask_width;
-	unsigned int core_select_mask, core_level_siblings;
-	unsigned int die_select_mask, die_level_siblings;
-	unsigned int pkg_mask_width;
-	bool die_level_present = false;
-	int leaf;
-
-	leaf = detect_extended_topology_leaf(c);
-	if (leaf < 0)
-		return -1;
-
-	/*
-	 * Populate HT related information from sub-leaf level 0.
-	 */
-	cpuid_count(leaf, SMT_LEVEL, &eax, &ebx, &ecx, &edx);
-	c->topo.initial_apicid = edx;
-	core_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
-	smp_num_siblings = max_t(int, smp_num_siblings, LEVEL_MAX_SIBLINGS(ebx));
-	core_plus_mask_width = ht_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
-	die_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
-	pkg_mask_width = die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
-
-	sub_index = 1;
-	while (true) {
-		cpuid_count(leaf, sub_index, &eax, &ebx, &ecx, &edx);
-
-		/*
-		 * Check for the Core type in the implemented sub leaves.
-		 */
-		if (LEAFB_SUBTYPE(ecx) == CORE_TYPE) {
-			core_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
-			core_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
-			die_level_siblings = core_level_siblings;
-			die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
-		}
-		if (LEAFB_SUBTYPE(ecx) == DIE_TYPE) {
-			die_level_present = true;
-			die_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
-			die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
-		}
-
-		if (LEAFB_SUBTYPE(ecx) != INVALID_TYPE)
-			pkg_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
-		else
-			break;
-
-		sub_index++;
-	}
-
-	core_select_mask = (~(-1 << pkg_mask_width)) >> ht_mask_width;
-	die_select_mask = (~(-1 << die_plus_mask_width)) >>
-				core_plus_mask_width;
-
-	c->topo.core_id = apic->phys_pkg_id(c->topo.initial_apicid,
-				ht_mask_width) & core_select_mask;
-
-	if (die_level_present) {
-		c->topo.die_id = apic->phys_pkg_id(c->topo.initial_apicid,
-					core_plus_mask_width) & die_select_mask;
-	}
-
-	c->topo.pkg_id = apic->phys_pkg_id(c->topo.initial_apicid, pkg_mask_width);
-	/*
-	 * Reinit the apicid, now that we have extended initial_apicid.
-	 */
-	c->topo.apicid = apic->phys_pkg_id(c->topo.initial_apicid, 0);
-
-	c->x86_max_cores = (core_level_siblings / smp_num_siblings);
-	__max_die_per_package = (die_level_siblings / core_level_siblings);
-#endif
-	return 0;
-}


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 35/38] x86/cpu: Remove x86_coreid_bits
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (33 preceding siblings ...)
  2023-07-28 12:13 ` [patch v2 34/38] x86/cpu: Remove topology.c Thomas Gleixner
@ 2023-07-28 12:13 ` Thomas Gleixner
  2023-07-28 12:13 ` [patch v2 36/38] x86/apic: Remove unused phys_pkg_id() callback Thomas Gleixner
                   ` (4 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:13 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

No more users.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/processor.h |    2 --
 arch/x86/kernel/cpu/common.c     |    1 -
 2 files changed, 3 deletions(-)

--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -119,8 +119,6 @@ struct cpuinfo_x86 {
 #endif
 	__u8			x86_virt_bits;
 	__u8			x86_phys_bits;
-	/* CPUID returned core id bits: */
-	__u8			x86_coreid_bits;
 	/* Max extended CPUID function supported: */
 	__u32			extended_cpuid_level;
 	/* Maximum supported CPUID level, -1=no CPUID: */
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1712,7 +1712,6 @@ static void identify_cpu(struct cpuinfo_
 	c->x86_vendor_id[0] = '\0'; /* Unset */
 	c->x86_model_id[0] = '\0';  /* Unset */
 	c->x86_max_cores = 1;
-	c->x86_coreid_bits = 0;
 #ifdef CONFIG_X86_64
 	c->x86_clflush_size = 64;
 	c->x86_phys_bits = 36;


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 36/38] x86/apic: Remove unused phys_pkg_id() callback
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (34 preceding siblings ...)
  2023-07-28 12:13 ` [patch v2 35/38] x86/cpu: Remove x86_coreid_bits Thomas Gleixner
@ 2023-07-28 12:13 ` Thomas Gleixner
  2023-08-08 20:15   ` Steve Wahl
  2023-07-28 12:13 ` [patch v2 37/38] x86/xen/smp_pv: Remove cpudata fiddling Thomas Gleixner
                   ` (3 subsequent siblings)
  39 siblings, 1 reply; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:13 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

Now that the core code does not use this monstrosity anymore, it's time to
put it to rest.

The only real purpose was to read the APIC ID on UV and VSMP systems for
the actual evaluation. That's what the core code does now.

For doing the actual shift operation there is truly no APIC callback
required.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/apic.h           |    1 -
 arch/x86/kernel/apic/apic_flat_64.c   |    7 -------
 arch/x86/kernel/apic/apic_noop.c      |    3 ---
 arch/x86/kernel/apic/apic_numachip.c  |    7 -------
 arch/x86/kernel/apic/bigsmp_32.c      |    6 ------
 arch/x86/kernel/apic/local.h          |    1 -
 arch/x86/kernel/apic/probe_32.c       |    6 ------
 arch/x86/kernel/apic/x2apic_cluster.c |    1 -
 arch/x86/kernel/apic/x2apic_phys.c    |    6 ------
 arch/x86/kernel/apic/x2apic_uv_x.c    |   11 -----------
 arch/x86/kernel/vsmp_64.c             |   13 -------------
 arch/x86/xen/apic.c                   |    6 ------
 12 files changed, 68 deletions(-)

--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -296,7 +296,6 @@ struct apic {
 	void	(*init_apic_ldr)(void);
 	void	(*ioapic_phys_id_map)(physid_mask_t *phys_map, physid_mask_t *retmap);
 	u32	(*cpu_present_to_apicid)(int mps_cpu);
-	u32	(*phys_pkg_id)(u32 cpuid_apic, int index_msb);
 
 	u32	(*get_apic_id)(u32 id);
 	u32	(*set_apic_id)(u32 apicid);
--- a/arch/x86/kernel/apic/apic_flat_64.c
+++ b/arch/x86/kernel/apic/apic_flat_64.c
@@ -66,11 +66,6 @@ static u32 set_apic_id(u32 id)
 	return (id & 0xFF) << 24;
 }
 
-static u32 flat_phys_pkg_id(u32 initial_apic_id, int index_msb)
-{
-	return initial_apic_id >> index_msb;
-}
-
 static int flat_probe(void)
 {
 	return 1;
@@ -89,7 +84,6 @@ static struct apic apic_flat __ro_after_
 
 	.init_apic_ldr			= default_init_apic_ldr,
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
-	.phys_pkg_id			= flat_phys_pkg_id,
 
 	.max_apic_id			= 0xFE,
 	.get_apic_id			= flat_get_apic_id,
@@ -159,7 +153,6 @@ static struct apic apic_physflat __ro_af
 	.disable_esr			= 0,
 
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
-	.phys_pkg_id			= flat_phys_pkg_id,
 
 	.max_apic_id			= 0xFE,
 	.get_apic_id			= flat_get_apic_id,
--- a/arch/x86/kernel/apic/apic_noop.c
+++ b/arch/x86/kernel/apic/apic_noop.c
@@ -29,7 +29,6 @@ static void noop_send_IPI_self(int vecto
 static void noop_apic_icr_write(u32 low, u32 id) { }
 static int noop_wakeup_secondary_cpu(u32 apicid, unsigned long start_eip) { return -1; }
 static u64 noop_apic_icr_read(void) { return 0; }
-static u32 noop_phys_pkg_id(u32 cpuid_apic, int index_msb) { return 0; }
 static u32 noop_get_apic_id(u32 apicid) { return 0; }
 static void noop_apic_eoi(void) { }
 
@@ -56,8 +55,6 @@ struct apic apic_noop __ro_after_init =
 	.ioapic_phys_id_map		= default_ioapic_phys_id_map,
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
 
-	.phys_pkg_id			= noop_phys_pkg_id,
-
 	.max_apic_id			= 0xFE,
 	.get_apic_id			= noop_get_apic_id,
 
--- a/arch/x86/kernel/apic/apic_numachip.c
+++ b/arch/x86/kernel/apic/apic_numachip.c
@@ -56,11 +56,6 @@ static u32 numachip2_set_apic_id(u32 id)
 	return id << 24;
 }
 
-static u32 numachip_phys_pkg_id(u32 initial_apic_id, int index_msb)
-{
-	return initial_apic_id >> index_msb;
-}
-
 static void numachip1_apic_icr_write(int apicid, unsigned int val)
 {
 	write_lcsr(CSR_G3_EXT_IRQ_GEN, (apicid << 16) | val);
@@ -228,7 +223,6 @@ static const struct apic apic_numachip1
 	.disable_esr			= 0,
 
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
-	.phys_pkg_id			= numachip_phys_pkg_id,
 
 	.max_apic_id			= UINT_MAX,
 	.get_apic_id			= numachip1_get_apic_id,
@@ -265,7 +259,6 @@ static const struct apic apic_numachip2
 	.disable_esr			= 0,
 
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
-	.phys_pkg_id			= numachip_phys_pkg_id,
 
 	.max_apic_id			= UINT_MAX,
 	.get_apic_id			= numachip2_get_apic_id,
--- a/arch/x86/kernel/apic/bigsmp_32.c
+++ b/arch/x86/kernel/apic/bigsmp_32.c
@@ -29,11 +29,6 @@ static void bigsmp_ioapic_phys_id_map(ph
 	physids_promote(0xFFL, retmap);
 }
 
-static u32 bigsmp_phys_pkg_id(u32 cpuid_apic, int index_msb)
-{
-	return cpuid_apic >> index_msb;
-}
-
 static void bigsmp_send_IPI_allbutself(int vector)
 {
 	default_send_IPI_mask_allbutself_phys(cpu_online_mask, vector);
@@ -88,7 +83,6 @@ static struct apic apic_bigsmp __ro_afte
 	.check_apicid_used		= bigsmp_check_apicid_used,
 	.ioapic_phys_id_map		= bigsmp_ioapic_phys_id_map,
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
-	.phys_pkg_id			= bigsmp_phys_pkg_id,
 
 	.max_apic_id			= 0xFE,
 	.get_apic_id			= bigsmp_get_apic_id,
--- a/arch/x86/kernel/apic/local.h
+++ b/arch/x86/kernel/apic/local.h
@@ -17,7 +17,6 @@
 void __x2apic_send_IPI_dest(unsigned int apicid, int vector, unsigned int dest);
 u32 x2apic_get_apic_id(u32 id);
 u32 x2apic_set_apic_id(u32 id);
-u32 x2apic_phys_pkg_id(u32 initial_apicid, int index_msb);
 
 void x2apic_send_IPI_all(int vector);
 void x2apic_send_IPI_allbutself(int vector);
--- a/arch/x86/kernel/apic/probe_32.c
+++ b/arch/x86/kernel/apic/probe_32.c
@@ -18,11 +18,6 @@
 
 #include "local.h"
 
-static u32 default_phys_pkg_id(u32 cpuid_apic, int index_msb)
-{
-	return cpuid_apic >> index_msb;
-}
-
 static u32 default_get_apic_id(u32 x)
 {
 	unsigned int ver = GET_APIC_VERSION(apic_read(APIC_LVR));
@@ -54,7 +49,6 @@ static struct apic apic_default __ro_aft
 	.init_apic_ldr			= default_init_apic_ldr,
 	.ioapic_phys_id_map		= default_ioapic_phys_id_map,
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
-	.phys_pkg_id			= default_phys_pkg_id,
 
 	.max_apic_id			= 0xFE,
 	.get_apic_id			= default_get_apic_id,
--- a/arch/x86/kernel/apic/x2apic_cluster.c
+++ b/arch/x86/kernel/apic/x2apic_cluster.c
@@ -236,7 +236,6 @@ static struct apic apic_x2apic_cluster _
 	.init_apic_ldr			= init_x2apic_ldr,
 	.ioapic_phys_id_map		= NULL,
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
-	.phys_pkg_id			= x2apic_phys_pkg_id,
 
 	.max_apic_id			= UINT_MAX,
 	.x2apic_set_max_apicid		= true,
--- a/arch/x86/kernel/apic/x2apic_phys.c
+++ b/arch/x86/kernel/apic/x2apic_phys.c
@@ -134,11 +134,6 @@ u32 x2apic_set_apic_id(u32 id)
 	return id;
 }
 
-u32 x2apic_phys_pkg_id(u32 initial_apicid, int index_msb)
-{
-	return initial_apicid >> index_msb;
-}
-
 static struct apic apic_x2apic_phys __ro_after_init = {
 
 	.name				= "physical x2apic",
@@ -151,7 +146,6 @@ static struct apic apic_x2apic_phys __ro
 	.disable_esr			= 0,
 
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
-	.phys_pkg_id			= x2apic_phys_pkg_id,
 
 	.max_apic_id			= UINT_MAX,
 	.x2apic_set_max_apicid		= true,
--- a/arch/x86/kernel/apic/x2apic_uv_x.c
+++ b/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -785,16 +785,6 @@ static u32 set_apic_id(u32 id)
 	return id;
 }
 
-static unsigned int uv_read_apic_id(void)
-{
-	return x2apic_get_apic_id(apic_read(APIC_ID));
-}
-
-static u32 uv_phys_pkg_id(u32 initial_apicid, int index_msb)
-{
-	return uv_read_apic_id() >> index_msb;
-}
-
 static int uv_probe(void)
 {
 	return apic == &apic_x2apic_uv_x;
@@ -812,7 +802,6 @@ static struct apic apic_x2apic_uv_x __ro
 	.disable_esr			= 0,
 
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
-	.phys_pkg_id			= uv_phys_pkg_id,
 
 	.max_apic_id			= UINT_MAX,
 	.get_apic_id			= x2apic_get_apic_id,
--- a/arch/x86/kernel/vsmp_64.c
+++ b/arch/x86/kernel/vsmp_64.c
@@ -127,25 +127,12 @@ static void __init vsmp_cap_cpus(void)
 #endif
 }
 
-static u32 apicid_phys_pkg_id(u32 initial_apic_id, int index_msb)
-{
-	return read_apic_id() >> index_msb;
-}
-
-static void vsmp_apic_post_init(void)
-{
-	/* need to update phys_pkg_id */
-	apic->phys_pkg_id = apicid_phys_pkg_id;
-}
-
 void __init vsmp_init(void)
 {
 	detect_vsmp_box();
 	if (!is_vsmp_box())
 		return;
 
-	x86_platform.apic_post_init = vsmp_apic_post_init;
-
 	vsmp_cap_cpus();
 
 	set_vsmp_ctl();
--- a/arch/x86/xen/apic.c
+++ b/arch/x86/xen/apic.c
@@ -110,11 +110,6 @@ static int xen_madt_oem_check(char *oem_
 	return xen_pv_domain();
 }
 
-static u32 xen_phys_pkg_id(u32 initial_apic_id, int index_msb)
-{
-	return initial_apic_id >> index_msb;
-}
-
 static u32 xen_cpu_present_to_apicid(int cpu)
 {
 	if (cpu_present(cpu))
@@ -133,7 +128,6 @@ static struct apic xen_pv_apic __ro_afte
 	.disable_esr			= 0,
 
 	.cpu_present_to_apicid		= xen_cpu_present_to_apicid,
-	.phys_pkg_id			= xen_phys_pkg_id, /* detect_ht */
 
 	.max_apic_id			= UINT_MAX,
 	.get_apic_id			= xen_get_apic_id,


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 37/38] x86/xen/smp_pv: Remove cpudata fiddling
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (35 preceding siblings ...)
  2023-07-28 12:13 ` [patch v2 36/38] x86/apic: Remove unused phys_pkg_id() callback Thomas Gleixner
@ 2023-07-28 12:13 ` Thomas Gleixner
  2023-07-28 12:13 ` [patch v2 38/38] x86/apic/uv: Remove the private leaf 0xb parser Thomas Gleixner
                   ` (2 subsequent siblings)
  39 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:13 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	Juergen Gross, James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

The new topology CPUID parser installs already fake topology for XEN/PV,
which ends up with cpuinfo::max_cores = 1.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Juergen Gross <jgross@suse.com>
---
V2: New patch
---
 arch/x86/xen/smp_pv.c |    3 ---
 1 file changed, 3 deletions(-)

--- a/arch/x86/xen/smp_pv.c
+++ b/arch/x86/xen/smp_pv.c
@@ -73,7 +73,6 @@ static void cpu_bringup(void)
 	}
 	cpu = smp_processor_id();
 	smp_store_cpu_info(cpu);
-	cpu_data(cpu).x86_max_cores = 1;
 	set_cpu_sibling_map(cpu);
 
 	speculative_store_bypass_ht_init();
@@ -223,8 +222,6 @@ static void __init xen_pv_smp_prepare_cp
 
 	smp_prepare_cpus_common();
 
-	cpu_data(0).x86_max_cores = 1;
-
 	speculative_store_bypass_ht_init();
 
 	xen_pmu_init(0);


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2 38/38] x86/apic/uv: Remove the private leaf 0xb parser
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (36 preceding siblings ...)
  2023-07-28 12:13 ` [patch v2 37/38] x86/xen/smp_pv: Remove cpudata fiddling Thomas Gleixner
@ 2023-07-28 12:13 ` Thomas Gleixner
  2023-08-08 20:16   ` Steve Wahl
  2023-07-28 15:41 ` [patch v2 00/38] x86/cpu: Rework the topology evaluation Dimitri Sivanich
  2023-07-28 19:01 ` Sohil Mehta
  39 siblings, 1 reply; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 12:13 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Steve Wahl,
	Mike Travis, Dimitri Sivanich, Russ Anderson,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross

The package shift has been already evaluated by the early CPU init.

Put the mindless copy right next to the original leaf 0xb parser.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Steve Wahl <steve.wahl@hpe.com>
Cc: Mike Travis <mike.travis@hpe.com>
Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Cc: Russ Anderson <russ.anderson@hpe.com>
---
 arch/x86/include/asm/topology.h    |    5 +++
 arch/x86/kernel/apic/x2apic_uv_x.c |   52 ++++++-------------------------------
 2 files changed, 14 insertions(+), 43 deletions(-)

--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -126,6 +126,11 @@ static inline unsigned int topology_get_
 	return x86_topo_system.dom_size[dom];
 }
 
+static inline unsigned int topology_get_domain_shift(enum x86_topology_domains dom)
+{
+	return dom == TOPO_SMT_DOMAIN ? 0 : x86_topo_system.dom_shifts[dom - 1];
+}
+
 extern const struct cpumask *cpu_coregroup_mask(int cpu);
 extern const struct cpumask *cpu_clustergroup_mask(int cpu);
 
--- a/arch/x86/kernel/apic/x2apic_uv_x.c
+++ b/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -241,54 +241,20 @@ static void __init uv_tsc_check_sync(voi
 	is_uv(UV3) ? sname.s3.field :		\
 	undef)
 
-/* [Copied from arch/x86/kernel/cpu/topology.c:detect_extended_topology()] */
-
-#define SMT_LEVEL			0	/* Leaf 0xb SMT level */
-#define INVALID_TYPE			0	/* Leaf 0xb sub-leaf types */
-#define SMT_TYPE			1
-#define CORE_TYPE			2
-#define LEAFB_SUBTYPE(ecx)		(((ecx) >> 8) & 0xff)
-#define BITS_SHIFT_NEXT_LEVEL(eax)	((eax) & 0x1f)
-
-static void set_x2apic_bits(void)
-{
-	unsigned int eax, ebx, ecx, edx, sub_index;
-	unsigned int sid_shift;
-
-	cpuid(0, &eax, &ebx, &ecx, &edx);
-	if (eax < 0xb) {
-		pr_info("UV: CPU does not have CPUID.11\n");
-		return;
-	}
-
-	cpuid_count(0xb, SMT_LEVEL, &eax, &ebx, &ecx, &edx);
-	if (ebx == 0 || (LEAFB_SUBTYPE(ecx) != SMT_TYPE)) {
-		pr_info("UV: CPUID.11 not implemented\n");
-		return;
-	}
-
-	sid_shift = BITS_SHIFT_NEXT_LEVEL(eax);
-	sub_index = 1;
-	do {
-		cpuid_count(0xb, sub_index, &eax, &ebx, &ecx, &edx);
-		if (LEAFB_SUBTYPE(ecx) == CORE_TYPE) {
-			sid_shift = BITS_SHIFT_NEXT_LEVEL(eax);
-			break;
-		}
-		sub_index++;
-	} while (LEAFB_SUBTYPE(ecx) != INVALID_TYPE);
-
-	uv_cpuid.apicid_shift	= 0;
-	uv_cpuid.apicid_mask	= (~(-1 << sid_shift));
-	uv_cpuid.socketid_shift = sid_shift;
-}
-
 static void __init early_get_apic_socketid_shift(void)
 {
+	unsigned int sid_shift = topology_get_domain_shift(TOPO_ROOT_DOMAIN);
+
 	if (is_uv2_hub() || is_uv3_hub())
 		uvh_apicid.v = uv_early_read_mmr(UVH_APICID);
 
-	set_x2apic_bits();
+	if (sid_shift) {
+		uv_cpuid.apicid_shift	= 0;
+		uv_cpuid.apicid_mask	= (~(-1 << sid_shift));
+		uv_cpuid.socketid_shift = sid_shift;
+	} else {
+		pr_info("UV: CPU does not have valid CPUID.11\n");
+	}
 
 	pr_info("UV: apicid_shift:%d apicid_mask:0x%x\n", uv_cpuid.apicid_shift, uv_cpuid.apicid_mask);
 	pr_info("UV: socketid_shift:%d pnode_mask:0x%x\n", uv_cpuid.socketid_shift, uv_cpuid.pnode_mask);


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [patch v2 00/38] x86/cpu: Rework the topology evaluation
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (37 preceding siblings ...)
  2023-07-28 12:13 ` [patch v2 38/38] x86/apic/uv: Remove the private leaf 0xb parser Thomas Gleixner
@ 2023-07-28 15:41 ` Dimitri Sivanich
  2023-07-28 19:01 ` Sohil Mehta
  39 siblings, 0 replies; 72+ messages in thread
From: Dimitri Sivanich @ 2023-07-28 15:41 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl,
	Dimitri Sivanich, Russ Anderson

I successfully booted the same 32-socket, 3840 cpu HPE Sapphire Rapids system
with the V2 update found here:
  git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git topo-cpuid-v2

# cat /sys/kernel/debug/x86/topo/cpus/3839
initial_apicid:      ff7
apicid:              ff7
pkg_id:              31
die_id:              31
cu_id:               255
core_id:             59
logical_pkg_id:      31
logical_die_id:      31
llc_id:              3968
l2c_id:              4086
amd_node_id:         0
amd_nodes_per_pkg:   0
max_cores:           60
max_die_per_pkg:     1
smp_num_siblings:    2

On Fri, Jul 28, 2023 at 02:12:42PM +0200, Thomas Gleixner wrote:
> Hi!
> 
> This is the follow up to V1:
> 
>   https://lore.kernel.org/lkml/20230724155329.474037902@linutronix.de
> 
> which addresses the review feedback and some minor fallout I observed in my
> testing of the work based on top.
> 
> TLDR:
> 
> This reworks the way how topology information is evaluated via CPUID
> in preparation for a larger topology management overhaul to address
> shortcomings of the current code vs. hybrid systems and systems which make
> use of the extended topology domains in leaf 0x1f. Aside of that it's an
> overdue spring cleaning to get rid of accumulated layers of duct tape and
> haywire.
> 
> What changed vs. V1:
> 
>   - Fixed an issue vs. the logical die/pkg management as the current
>     code (ab)uses cpuinfo for persistant storage.
> 
>   - Consolidated APIC ID usage on u32 and ditched the u16 limitation
> 
>   - Addressed the review feedback from Peter and Arjan
> 
>   - Added a new patch which gets rid of XENPV fiddling in the cpuinfo
>     state. That needs some testing on XENPV obviously. The relevant
>     patches are #22 and #37
> 
> I did not pick up any of the tested by tags yet. I hope people can run it
> once more. Neither did I add the Ack from Peter.
> 
> The series is based on the APIC cleanup series:
> 
>   https://lore.kernel.org/lkml/20230724131206.500814398@linutronix.de
> 
> and also available on top of that from git:
> 
>  git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git topo-cpuid-v2
> 
> Thanks,
> 
> 	tglx
> ---
>  Documentation/arch/x86/topology.rst       |   12 -
>  a/arch/x86/kernel/cpu/topology.c          |  168 ---------------------
>  arch/x86/events/amd/core.c                |    2 
>  arch/x86/events/amd/uncore.c              |    2 
>  arch/x86/events/intel/uncore.c            |    2 
>  arch/x86/hyperv/hv_vtl.c                  |    2 
>  arch/x86/include/asm/apic.h               |   32 +---
>  arch/x86/include/asm/cacheinfo.h          |    3 
>  arch/x86/include/asm/cpuid.h              |   32 ++++
>  arch/x86/include/asm/mpspec.h             |    2 
>  arch/x86/include/asm/processor.h          |   60 +++++--
>  arch/x86/include/asm/smp.h                |    4 
>  arch/x86/include/asm/topology.h           |   51 +++++-
>  arch/x86/include/asm/x86_init.h           |    2 
>  arch/x86/kernel/acpi/boot.c               |    4 
>  arch/x86/kernel/amd_nb.c                  |    8 -
>  arch/x86/kernel/apic/apic.c               |   14 -
>  arch/x86/kernel/apic/apic_common.c        |    4 
>  arch/x86/kernel/apic/apic_flat_64.c       |   13 -
>  arch/x86/kernel/apic/apic_noop.c          |    9 -
>  arch/x86/kernel/apic/apic_numachip.c      |   21 --
>  arch/x86/kernel/apic/bigsmp_32.c          |   10 -
>  arch/x86/kernel/apic/local.h              |    6 
>  arch/x86/kernel/apic/probe_32.c           |   10 -
>  arch/x86/kernel/apic/x2apic_cluster.c     |    1 
>  arch/x86/kernel/apic/x2apic_phys.c        |   10 -
>  arch/x86/kernel/apic/x2apic_uv_x.c        |   67 +-------
>  arch/x86/kernel/cpu/Makefile              |    5 
>  arch/x86/kernel/cpu/amd.c                 |  156 --------------------
>  arch/x86/kernel/cpu/cacheinfo.c           |   51 ++----
>  arch/x86/kernel/cpu/centaur.c             |    4 
>  arch/x86/kernel/cpu/common.c              |  111 +-------------
>  arch/x86/kernel/cpu/cpu.h                 |   14 +
>  arch/x86/kernel/cpu/hygon.c               |  133 -----------------
>  arch/x86/kernel/cpu/intel.c               |   38 ----
>  arch/x86/kernel/cpu/mce/amd.c             |    4 
>  arch/x86/kernel/cpu/mce/apei.c            |    4 
>  arch/x86/kernel/cpu/mce/core.c            |    4 
>  arch/x86/kernel/cpu/mce/inject.c          |    7 
>  arch/x86/kernel/cpu/proc.c                |    8 -
>  arch/x86/kernel/cpu/zhaoxin.c             |   18 --
>  arch/x86/kernel/kvm.c                     |    6 
>  arch/x86/kernel/sev.c                     |    2 
>  arch/x86/kernel/smpboot.c                 |   97 +++++++-----
>  arch/x86/kernel/vsmp_64.c                 |   13 -
>  arch/x86/mm/amdtopology.c                 |   35 ++--
>  arch/x86/mm/numa.c                        |    4 
>  arch/x86/xen/apic.c                       |   14 -
>  arch/x86/xen/smp_pv.c                     |    3 
>  b/arch/x86/kernel/cpu/debugfs.c           |   97 ++++++++++++
>  b/arch/x86/kernel/cpu/topology.h          |   51 ++++++
>  b/arch/x86/kernel/cpu/topology_amd.c      |  179 +++++++++++++++++++++++
>  b/arch/x86/kernel/cpu/topology_common.c   |  233 ++++++++++++++++++++++++++++++
>  b/arch/x86/kernel/cpu/topology_ext.c      |  136 +++++++++++++++++
>  drivers/edac/amd64_edac.c                 |    4 
>  drivers/edac/mce_amd.c                    |    4 
>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c |    2 
>  drivers/hwmon/fam15h_power.c              |    7 
>  drivers/scsi/lpfc/lpfc_init.c             |    8 -
>  drivers/virt/acrn/hsm.c                   |    2 
>  60 files changed, 1049 insertions(+), 956 deletions(-)

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [patch v2 00/38] x86/cpu: Rework the topology evaluation
  2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
                   ` (38 preceding siblings ...)
  2023-07-28 15:41 ` [patch v2 00/38] x86/cpu: Rework the topology evaluation Dimitri Sivanich
@ 2023-07-28 19:01 ` Sohil Mehta
  39 siblings, 0 replies; 72+ messages in thread
From: Sohil Mehta @ 2023-07-28 19:01 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

On 7/28/2023 5:12 AM, Thomas Gleixner wrote:
> I did not pick up any of the tested by tags yet. I hope people can run it
> once more. Neither did I add the Ack from Peter.
> 
> The series is based on the APIC cleanup series:
> 
>   https://lore.kernel.org/lkml/20230724131206.500814398@linutronix.de
> 
> and also available on top of that from git:
> 
>  git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git topo-cpuid-v2
> 

The series boots fine on a 2S Sandy bridge system. I didn't see any
issues with cpu-hotplug either.

Tested-by: Sohil Mehta <sohil.mehta@intel.com>


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [patch v2A 22/38] x86/cpu: Add legacy topology parser
  2023-07-28 12:13 ` [patch v2 22/38] x86/cpu: Add legacy topology parser Thomas Gleixner
@ 2023-07-28 21:33   ` Thomas Gleixner
  0 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 21:33 UTC (permalink / raw)
  To: LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

On Fri, Jul 28 2023 at 14:13, Thomas Gleixner wrote:
> The legacy topology detection via CPUID leaf 4, which provides the number
> of cores in the package and CPUID leaf 1 which provides the number of
> logical CPUs in case that FEATURE_HT is enabled and the CMP_LEGACY feature
> is not set, is shared for Intel, Centaur amd Zhaoxin CPUs.
>
> Lift the code from common.c without the early detection hack and provide it
> as common fallback mechanism.

Here I completely failed to get it right. Why?

My mind was so focussed on the leaf 0xb/0x1f representation that I
completely missed that the legacy parser does not fit into that picture
at all. I did some tests on 32bit in a VM as I really do not have
functional 32bit hardware anymore and they all worked. Not that I tried
hard.

Unfortunately Borislav decided to give it a ride on a real 32bit ATOM
system and unearthed my snafu.

Replacement patch below and also pushed out to the:

 git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git x86/topology

branch w/o a tag attached.

Thanks,

        tglx
---
Subject: x86/cpu: Add legacy topology parser
From: Thomas Gleixner <tglx@linutronix.de>
Date: Sun, 02 Jul 2023 13:20:08 +0200

The legacy topology detection via CPUID leaf 4, which provides the number
of cores in the package and CPUID leaf 1 which provides the number of
logical CPUs in case that FEATURE_HT is enabled and the CMP_LEGACY feature
is not set, is shared for Intel, Centaur amd Zhaoxin CPUs.

Lift the code from common.c without the early detection hack and provide it
as common fallback mechanism.

Will be utilized in later changes.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2A: Fix the 32bit snafu.
---
 arch/x86/kernel/cpu/common.c          |    3 ++
 arch/x86/kernel/cpu/topology.h        |    2 +
 arch/x86/kernel/cpu/topology_common.c |   44 ++++++++++++++++++++++++++++++++++
 3 files changed, 49 insertions(+)

--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -883,6 +883,9 @@ void detect_ht(struct cpuinfo_x86 *c)
 #ifdef CONFIG_SMP
 	int index_msb, core_bits;
 
+	if (topo_is_converted(c))
+		return;
+
 	if (detect_ht_early(c) < 0)
 		return;
 
--- a/arch/x86/kernel/cpu/topology.h
+++ b/arch/x86/kernel/cpu/topology.h
@@ -7,6 +7,8 @@ struct topo_scan {
 	unsigned int		dom_shifts[TOPO_MAX_DOMAIN];
 	unsigned int		dom_ncpus[TOPO_MAX_DOMAIN];
 
+	// Legacy CPUID[1]:EBX[23:16] number of logical processors
+	unsigned int		ebx1_nproc_shift;
 };
 
 bool topo_is_converted(struct cpuinfo_x86 *c);
--- a/arch/x86/kernel/cpu/topology_common.c
+++ b/arch/x86/kernel/cpu/topology_common.c
@@ -24,6 +24,48 @@ void topology_set_dom(struct topo_scan *
 	}
 }
 
+static unsigned int parse_num_cores(struct cpuinfo_x86 *c)
+{
+	struct {
+		u32	cache_type	:  5,
+			unused		: 21,
+			ncores		:  6;
+	} eax;
+
+	if (c->cpuid_level < 4)
+		return 1;
+
+	cpuid_subleaf_reg(4, 0, CPUID_EAX, &eax);
+	if (!eax.cache_type)
+		return 1;
+
+	return eax.ncores + 1;
+}
+
+static void __maybe_unused parse_legacy(struct topo_scan *tscan)
+{
+	unsigned int cores, core_shift, smt_shift = 0;
+	struct cpuinfo_x86 *c = tscan->c;
+
+	cores = parse_num_cores(c);
+	core_shift = get_count_order(cores);
+
+	if (cpu_has(c, X86_FEATURE_HT)) {
+		if (!WARN_ON_ONCE(tscan->ebx1_nproc_shift < core_shift))
+			smt_shift = tscan->ebx1_nproc_shift - core_shift;
+		/*
+		 * The parser expects leaf 0xb/0x1f format, which means
+		 * the number of logical processors at core level is
+		 * counting threads.
+		 */
+		core_shift += smt_shift;
+		cores <<= smt_shift;
+	}
+
+	topology_set_dom(tscan, TOPO_SMT_DOMAIN, smt_shift, 1U << smt_shift);
+	topology_set_dom(tscan, TOPO_CORE_DOMAIN, core_shift, cores);
+}
+
 bool topo_is_converted(struct cpuinfo_x86 *c)
 {
 	/* Temporary until everything is converted over. */
@@ -88,6 +130,8 @@ static void parse_topology(struct topo_s
 	/* The above is sufficient for UP */
 	if (!IS_ENABLED(CONFIG_SMP))
 		return;
+
+	tscan->ebx1_nproc_shift = get_count_order(ebx.nproc);
 }
 
 static void topo_set_ids(struct topo_scan *tscan)


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [patch v2 11/38] x86/apic: Use BAD_APICID consistently;
  2023-07-28 12:12 ` [patch v2 11/38] x86/apic: Use BAD_APICID consistently; Thomas Gleixner
@ 2023-07-28 22:36   ` Sohil Mehta
  2023-07-28 22:50     ` Thomas Gleixner
  0 siblings, 1 reply; 72+ messages in thread
From: Sohil Mehta @ 2023-07-28 22:36 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

There is an extra semicolon at the end of the patch subject.

On 7/28/2023 5:12 AM, Thomas Gleixner wrote:
> APIC ID checks compare with BAD_APICID all over the place, but some
> initializers and some code which fiddles with global data structure use
> -1[U] instead. That simply cannot work at all.
> 
> Fix it up and use BAD_APICID consistenly all over the place.
> 

s/consistenly/consistently

-Sohil


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [patch v2 11/38] x86/apic: Use BAD_APICID consistently;
  2023-07-28 22:36   ` Sohil Mehta
@ 2023-07-28 22:50     ` Thomas Gleixner
  0 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-28 22:50 UTC (permalink / raw)
  To: Sohil Mehta, LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

On Fri, Jul 28 2023 at 15:36, Sohil Mehta wrote:

> There is an extra semicolon at the end of the patch subject.

Oops

> On 7/28/2023 5:12 AM, Thomas Gleixner wrote:
>> APIC ID checks compare with BAD_APICID all over the place, but some
>> initializers and some code which fiddles with global data structure use
>> -1[U] instead. That simply cannot work at all.
>> 
>> Fix it up and use BAD_APICID consistenly all over the place.
>> 
>
> s/consistenly/consistently

Indeed.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [patch v2 19/38] x86/cpu: Provide debug interface
  2023-07-28 12:13 ` [patch v2 19/38] x86/cpu: Provide debug interface Thomas Gleixner
@ 2023-07-28 23:57   ` Sohil Mehta
  0 siblings, 0 replies; 72+ messages in thread
From: Sohil Mehta @ 2023-07-28 23:57 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

> +++ b/arch/x86/kernel/cpu/debugfs.c
> @@ -0,0 +1,58 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <linux/debugfs.h>
> +
> +#include <asm/apic.h>
> +#include <asm/processor.h>
> +

I ran checkpatch through the patches and it lists a bunch of errors and
warnings. Most of them are due to issues in the existing code which is
probably not worth fixing.

However, I found a couple of recommendations which might be of interest.

1) Using linux headers instead of the asm ones. Namely,

> Consider using #include <linux/processor.h> instead of <asm/processor.h>
> Consider using #include <linux/topology.h> instead of <asm/topology.h>
> Consider using #include <linux/smp.h> instead of <asm/smp.h>

2) Macros with multiple statements should be enclosed in a do - while loop

From patch 20/38:
> +#define cpuid_subleaf(leaf, subleaf, regs)		\
> +	BUILD_BUG_ON(sizeof(*(regs)) != 16);		\
> +	__cpuid_read(leaf, subleaf, (u32 *)(regs))


I am wondering if either of them matter?


> +static __init int cpu_init_debugfs(void)
> +{
> +	struct dentry *dir, *base = debugfs_create_dir("topo", arch_debugfs_dir);
> +	unsigned long id;
> +	char name [10];
> +

Excess space after name and before the square bracket can be avoided.


Sohil



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology()
  2023-07-28 12:13 ` [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology() Thomas Gleixner
@ 2023-07-29  0:02   ` Sohil Mehta
  2023-07-31  4:05   ` Michael Kelley (LINUX)
  2023-08-01  7:05   ` Gautham R. Shenoy
  2 siblings, 0 replies; 72+ messages in thread
From: Sohil Mehta @ 2023-07-29  0:02 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

On 7/28/2023 5:13 AM, Thomas Gleixner wrote:
> Topology evaluation is a complete disaster and impenetrable mess. It's
> scattered all over the place with some vendor implementatins doing early

s/implementatins/implementations

> +static void parse_topology(struct topo_scan *tscan, bool early)
> +{
> +	const struct cpuinfo_topology topo_defaults = {
> +		.cu_id			= 0xff,
> +		.llc_id			= BAD_APICID,
> +		.l2c_id			= BAD_APICID,
> +	};
> +	struct cpuinfo_x86 *c = tscan->c;
> +	struct {
> +		u32	unused0		: 16,
> +			nproc		:  8,
> +			apicid		:  8;
> +	} ebx;
> +
> +	c->topo = topo_defaults;
> +
> +	if (fake_topology(tscan))
> +	    return;
> +

Spaces used for indenting "return" instead of a tab.



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [patch v2 28/38] x86/cpu: Provide an AMD/HYGON specific topology parser
  2023-07-28 12:13 ` [patch v2 28/38] x86/cpu: Provide an AMD/HYGON specific topology parser Thomas Gleixner
@ 2023-07-29  0:05   ` Sohil Mehta
  2023-07-30  5:20   ` Michael Kelley (LINUX)
  1 sibling, 0 replies; 72+ messages in thread
From: Sohil Mehta @ 2023-07-29  0:05 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

On 7/28/2023 5:13 AM, Thomas Gleixner wrote:
> AMD/HYGON uses various methods for topology evaluation:
> 
>   - Leaf 0x80000008 and 0x8000001e based with an optional leaf 0xb,
>     which is the preferred variant for modern CPUs.
> 
>     Leaf 0xb will be superseeded by leaf 0x80000026 soon, which is just

s/superseeded/superseded


> +
> +	if (tscan->c->x86_vendor == X86_VENDOR_AMD) {
> +		if (tscan->c->x86 == 0x15)
> +			tscan->c->topo.cu_id = leaf.cuid;
> +
> +		cacheinfo_amd_init_llc_id(tscan->c, leaf.nodeid);
> +	} else {
> +		/*
> +		 * Pacakge ID is ApicId[6..] on Hygon CPUs. See commit

s/Pacakge/Package



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [patch v2 27/38] x86/cpu/amd: Provide a separate acessor for Node ID
  2023-07-28 12:13 ` [patch v2 27/38] x86/cpu/amd: Provide a separate acessor for Node ID Thomas Gleixner
@ 2023-07-29  0:07   ` Sohil Mehta
  0 siblings, 0 replies; 72+ messages in thread
From: Sohil Mehta @ 2023-07-29  0:07 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

s/acessor/accessor


^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [patch v2 28/38] x86/cpu: Provide an AMD/HYGON specific topology parser
  2023-07-28 12:13 ` [patch v2 28/38] x86/cpu: Provide an AMD/HYGON specific topology parser Thomas Gleixner
  2023-07-29  0:05   ` Sohil Mehta
@ 2023-07-30  5:20   ` Michael Kelley (LINUX)
  2023-07-30  7:41     ` Thomas Gleixner
  1 sibling, 1 reply; 72+ messages in thread
From: Michael Kelley (LINUX) @ 2023-07-30  5:20 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

From: Thomas Gleixner <tglx@linutronix.de> Sent: Friday, July 28, 2023 5:13 AM
> 
> AMD/HYGON uses various methods for topology evaluation:
> 
>   - Leaf 0x80000008 and 0x8000001e based with an optional leaf 0xb,
>     which is the preferred variant for modern CPUs.
> 
>     Leaf 0xb will be superseeded by leaf 0x80000026 soon, which is just
>     another variant of the Intel 0x1f leaf for whatever reasons.
> 
>   - Subleaf 0x80000008 and NODEID_MSR base
> 
>   - Legacy fallback
> 
> That code is following the principle of random bits and pieces all over the
> place which results in multiple evaluations and impenetrable code flows in
> the same way as the Intel parsing did.
> 
> Provide a sane implementation by clearly separating the three variants and
> bringing them in the proper preference order in one place.
> 
> This provides the parsing for both AMD and HYGON because there is no point
> in having a separate HYGON parser which only differs by 3 lines of
> code. Any further divergence between AMD and HYGON can be handled in
> different functions, while still sharing the existing parsers.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  arch/x86/include/asm/topology.h       |    2
>  arch/x86/kernel/cpu/Makefile          |    2
>  arch/x86/kernel/cpu/amd.c             |    2
>  arch/x86/kernel/cpu/cacheinfo.c       |    4
>  arch/x86/kernel/cpu/cpu.h             |    2
>  arch/x86/kernel/cpu/debugfs.c         |    2
>  arch/x86/kernel/cpu/topology.h        |    6 +
>  arch/x86/kernel/cpu/topology_amd.c    |  179
> ++++++++++++++++++++++++++++++++++
>  arch/x86/kernel/cpu/topology_common.c |   19 +++
>  9 files changed, 211 insertions(+), 7 deletions(-)
> 
> --- a/arch/x86/include/asm/topology.h
> +++ b/arch/x86/include/asm/topology.h
> @@ -162,6 +162,8 @@ int topology_update_die_map(unsigned int
>  int topology_phys_to_logical_pkg(unsigned int pkg);
>  bool topology_smt_supported(void);
> 
> +extern unsigned int __amd_nodes_per_pkg;
> +
>  static inline unsigned int topology_amd_nodes_per_pkg(void)
>  {
>  	return __max_die_per_package;
> --- a/arch/x86/kernel/cpu/Makefile
> +++ b/arch/x86/kernel/cpu/Makefile
> @@ -18,7 +18,7 @@ KMSAN_SANITIZE_common.o := n
>  KCSAN_SANITIZE_common.o := n
> 
>  obj-y			:= cacheinfo.o scattered.o
> -obj-y			+= topology_common.o topology_ext.o topology.o
> +obj-y			+= topology_common.o topology_ext.o topology_amd.o topology.o
>  obj-y			+= common.o
>  obj-y			+= rdrand.o
>  obj-y			+= match.o
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -356,7 +356,7 @@ static void amd_get_topology(struct cpui
>  		if (!err)
>  			c->x86_coreid_bits = get_count_order(c->x86_max_cores);
> 
> -		cacheinfo_amd_init_llc_id(c);
> +		cacheinfo_amd_init_llc_id(c, c->topo.die_id);
> 
>  	} else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) {
>  		u64 value;
> --- a/arch/x86/kernel/cpu/cacheinfo.c
> +++ b/arch/x86/kernel/cpu/cacheinfo.c
> @@ -661,7 +661,7 @@ static int find_num_cache_leaves(struct
>  	return i;
>  }
> 
> -void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c)
> +void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, u16 die_id)
>  {
>  	/*
>  	 * We may have multiple LLCs if L3 caches exist, so check if we
> @@ -672,7 +672,7 @@ void cacheinfo_amd_init_llc_id(struct cp
> 
>  	if (c->x86 < 0x17) {
>  		/* LLC is at the node level. */
> -		c->topo.llc_id = c->topo.die_id;
> +		c->topo.llc_id = die_id;
>  	} else if (c->x86 == 0x17 && c->x86_model <= 0x1F) {
>  		/*
>  		 * LLC is at the core complex level.
> --- a/arch/x86/kernel/cpu/cpu.h
> +++ b/arch/x86/kernel/cpu/cpu.h
> @@ -79,7 +79,7 @@ extern void init_hygon_cacheinfo(struct
>  extern int detect_extended_topology(struct cpuinfo_x86 *c);
>  extern void check_null_seg_clears_base(struct cpuinfo_x86 *c);
> 
> -void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c);
> +void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, u16 die_id);
>  void cacheinfo_hygon_init_llc_id(struct cpuinfo_x86 *c);
> 
>  unsigned int aperfmperf_get_khz(int cpu);
> --- a/arch/x86/kernel/cpu/debugfs.c
> +++ b/arch/x86/kernel/cpu/debugfs.c
> @@ -27,6 +27,8 @@ static int cpu_debug_show(struct seq_fil
>  	seq_printf(m, "logical_die_id:      %u\n", c->topo.logical_die_id);
>  	seq_printf(m, "llc_id:              %u\n", c->topo.llc_id);
>  	seq_printf(m, "l2c_id:              %u\n", c->topo.l2c_id);
> +	seq_printf(m, "amd_node_id:         %u\n", c->topo.amd_node_id);
> +	seq_printf(m, "amd_nodes_per_pkg:   %u\n", topology_amd_nodes_per_pkg());
>  	seq_printf(m, "max_cores:           %u\n", c->x86_max_cores);
>  	seq_printf(m, "max_die_per_pkg:     %u\n", __max_die_per_package);
>  	seq_printf(m, "smp_num_siblings:    %u\n", smp_num_siblings);
> --- a/arch/x86/kernel/cpu/topology.h
> +++ b/arch/x86/kernel/cpu/topology.h
> @@ -9,6 +9,10 @@ struct topo_scan {
> 
>  	// Legacy CPUID[1]:EBX[23:16] number of logical processors
>  	unsigned int		ebx1_nproc_shift;
> +
> +	// AMD specific node ID which cannot be mapped into APIC space.
> +	u16			amd_nodes_per_pkg;
> +	u16			amd_node_id;
>  };
> 
>  bool topo_is_converted(struct cpuinfo_x86 *c);
> @@ -17,6 +21,8 @@ void cpu_parse_topology(struct cpuinfo_x
>  void topology_set_dom(struct topo_scan *tscan, enum x86_topology_domains dom,
>  		      unsigned int shift, unsigned int ncpus);
>  bool cpu_parse_topology_ext(struct topo_scan *tscan);
> +void cpu_parse_topology_amd(struct topo_scan *tscan);
> +void cpu_topology_fixup_amd(struct topo_scan *tscan);
> 
>  static inline u32 topo_shift_apicid(u32 apicid, enum x86_topology_domains dom)
>  {
> --- /dev/null
> +++ b/arch/x86/kernel/cpu/topology_amd.c
> @@ -0,0 +1,179 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <linux/cpu.h>
> +
> +#include <asm/apic.h>
> +#include <asm/memtype.h>
> +#include <asm/processor.h>
> +
> +#include "cpu.h"
> +
> +static bool parse_8000_0008(struct topo_scan *tscan)
> +{
> +	struct {
> +		u32	ncores		:  8,
> +			__rsvd0		:  4,
> +			apicidsize	:  4,
> +			perftscsize	:  2,
> +			__rsvd1		: 14;
> +	} ecx;
> +	unsigned int sft;
> +
> +	if (tscan->c->extended_cpuid_level < 0x80000008)
> +		return false;
> +
> +	cpuid_leaf_reg(0x80000008, CPUID_ECX, &ecx);
> +
> +	/* If the APIC ID size is 0, then get the shift value from ecx.ncores */
> +	sft = ecx.apicidsize;
> +	if (!sft)
> +		sft = get_count_order(ecx.ncores + 1);
> +
> +	topology_set_dom(tscan, TOPO_CORE_DOMAIN, sft, ecx.ncores + 1);
> +	return true;
> +}
> +
> +static void store_node(struct topo_scan *tscan, unsigned int nr_nodes, u16 node_id)
> +{
> +	/*
> +	 * Starting with Fam 17h the DIE domain could probably be used to
> +	 * retrieve the node info on AMD/HYGON. Analysis of CPUID dumps
> +	 * suggests its the topmost bit(s) of the CPU cores area, but

s/its/it's/

> +	 * that's guess work and neither enumerated nor documented.
> +	 *
> +	 * Up to Fam 16h this does not work at all and the legacy node ID
> +	 * has to be used.
> +	 */
> +	tscan->amd_nodes_per_pkg = nr_nodes;
> +	tscan->amd_node_id = node_id;
> +}
> +
> +static bool parse_8000_001e(struct topo_scan *tscan, bool has_0xb)
> +{
> +	struct {
> +		// eax
> +		u32	x2apic_id	: 32;
> +		// ebx
> +		u32	cuid		:  8,
> +			threads_per_cu	:  8,
> +			__rsvd0		: 16;
> +		// ecx
> +		u32	nodeid		:  8,
> +			nodes_per_pkg	:  3,
> +			__rsvd1		: 21;
> +		// edx
> +		u32	__rsvd2		: 32;
> +	} leaf;
> +
> +	if (!boot_cpu_has(X86_FEATURE_TOPOEXT))
> +		return false;
> +
> +	cpuid_leaf(0x8000001e, &leaf);
> +
> +	tscan->c->topo.initial_apicid = leaf.x2apic_id;
> +
> +	/*
> +	 * If leaf 0xb is available, then SMT shift is set already. If not
> +	 * take it from ecx.threads_per_cpu and use topo_update_dom() as

"take it from ebx.threads_per_cu and use topology_update_dom() as"

> +	 * topology_set_dom() would propagate and overwrite the already
> +	 * propagated CORE level.
> +	 */
> +	if (!has_0xb) {
> +		topology_update_dom(tscan, TOPO_SMT_DOMAIN, get_count_order(leaf.threads_per_cu),
> +				    leaf.threads_per_cu);

leaf.threads_per_cu needs to be (leaf.threads_per_cu + 1) above.  If
the core has two hyper-threads, the value of the threads_per_cu
field returned by CPUID is "1".

All my Hyper-V VMs on AMD processors were coming up with only
one thread per core.   The change fixes the problem.

Michael

> +	}
> +
> +	store_node(tscan, leaf.nodes_per_pkg + 1, leaf.nodeid);
> +
> +	if (tscan->c->x86_vendor == X86_VENDOR_AMD) {
> +		if (tscan->c->x86 == 0x15)
> +			tscan->c->topo.cu_id = leaf.cuid;
> +
> +		cacheinfo_amd_init_llc_id(tscan->c, leaf.nodeid);
> +	} else {
> +		/*
> +		 * Pacakge ID is ApicId[6..] on Hygon CPUs. See commit
> +		 * e0ceeae708ce for explanation. The topology info is
> +		 * screwed up: The package shift is always 6 and the node
> +		 * ID is bit [4:5]. Don't touch the latter without
> +		 * confirmation from the Hygon developers.
> +		 */
> +		topology_set_dom(tscan, TOPO_CORE_DOMAIN, 6, tscan->dom_ncpus[TOPO_CORE_DOMAIN]);
> +		cacheinfo_hygon_init_llc_id(tscan->c);
> +	}
> +	return true;
> +}
> +
> +static bool parse_fam10h_node_id(struct topo_scan *tscan)
> +{
> +	struct {
> +		union {
> +			u64	node_id		:  3,
> +				nodes_per_pkg	:  3,
> +				unused		: 58;
> +			u64	msr;
> +		};
> +	} nid;
> +
> +	if (!boot_cpu_has(X86_FEATURE_NODEID_MSR))
> +		return false;
> +
> +	rdmsrl(MSR_FAM10H_NODE_ID, nid.msr);
> +	store_node(tscan, nid.nodes_per_pkg + 1, nid.node_id);
> +	tscan->c->topo.llc_id = nid.node_id;
> +	return true;
> +}
> +
> +static void legacy_set_llc(struct topo_scan *tscan)
> +{
> +	unsigned int apicid = tscan->c->topo.initial_apicid;
> +
> +	/* parse_8000_0008() set everything up except llc_id */
> +	tscan->c->topo.llc_id = apicid >> tscan->dom_shifts[TOPO_CORE_DOMAIN];
> +}
> +
> +static void parse_topology_amd(struct topo_scan *tscan)
> +{
> +	bool has_0xb = false;
> +
> +	/*
> +	 * If the extended topology leaf 0x8000_001e is available
> +	 * try to get SMT and CORE shift from leaf 0xb first, then
> +	 * try to get the CORE shift from leaf 0x8000_0008.
> +	 */
> +	if (boot_cpu_has(X86_FEATURE_TOPOEXT))
> +		has_0xb = cpu_parse_topology_ext(tscan);
> +
> +	if (!has_0xb && !parse_8000_0008(tscan))
> +		return;
> +
> +	/* Prefer leaf 0x8000001e if available */
> +	if (parse_8000_001e(tscan, has_0xb))
> +		return;
> +
> +	/* Try the NODEID MSR */
> +	if (parse_fam10h_node_id(tscan))
> +		return;
> +
> +	legacy_set_llc(tscan);
> +}
> +
> +void cpu_parse_topology_amd(struct topo_scan *tscan)
> +{
> +	tscan->amd_nodes_per_pkg = 1;
> +	parse_topology_amd(tscan);
> +
> +	if (tscan->amd_nodes_per_pkg > 1)
> +		set_cpu_cap(tscan->c, X86_FEATURE_AMD_DCM);
> +}
> +
> +void cpu_topology_fixup_amd(struct topo_scan *tscan)
> +{
> +	struct cpuinfo_x86 *c = tscan->c;
> +
> +	/*
> +	 * Adjust the core_id relative to the node when there is more than
> +	 * one node.
> +	 */
> +	if (tscan->c->x86 < 0x17 && tscan->amd_nodes_per_pkg > 1)
> +		c->topo.core_id %= tscan->dom_ncpus[TOPO_CORE_DOMAIN] / tscan->amd_nodes_per_pkg;
> +}
> --- a/arch/x86/kernel/cpu/topology_common.c
> +++ b/arch/x86/kernel/cpu/topology_common.c
> @@ -11,11 +11,13 @@
> 
>  struct x86_topology_system x86_topo_system __ro_after_init;
> 
> +unsigned int __amd_nodes_per_pkg __ro_after_init;
> +EXPORT_SYMBOL_GPL(__amd_nodes_per_pkg);
> +
>  void topology_set_dom(struct topo_scan *tscan, enum x86_topology_domains dom,
>  		      unsigned int shift, unsigned int ncpus)
>  {
> -	tscan->dom_shifts[dom] = shift;
> -	tscan->dom_ncpus[dom] = ncpus;
> +	topology_update_dom(tscan, dom, shift, ncpus);
> 
>  	/* Propagate to the upper levels */
>  	for (dom++; dom < TOPO_MAX_DOMAIN; dom++) {
> @@ -145,6 +147,13 @@ static void topo_set_ids(struct topo_sca
> 
>  	/* Relative core ID */
>  	c->topo.core_id = topo_relative_domain_id(apicid, TOPO_CORE_DOMAIN);
> +
> +	/* Temporary workaround */
> +	if (tscan->amd_nodes_per_pkg)
> +		c->topo.amd_node_id = c->topo.die_id = tscan->amd_node_id;
> +
> +	if (c->x86_vendor == X86_VENDOR_AMD)
> +		cpu_topology_fixup_amd(tscan);
>  }
> 
>  static void topo_set_max_cores(struct topo_scan *tscan)
> @@ -229,4 +238,10 @@ void __init cpu_init_topology(struct cpu
>  	 */
>  	__max_die_per_package = tscan.dom_ncpus[TOPO_DIE_DOMAIN] /
>  		tscan.dom_ncpus[TOPO_DIE_DOMAIN - 1];
> +	/*
> +	 * AMD systems have Nodes per package which cannot be mapped to
> +	 * APIC ID (yet).
> +	 */
> +	if (c->x86_vendor == X86_VENDOR_AMD || c->x86_vendor == X86_VENDOR_HYGON)
> +		__amd_nodes_per_pkg = __max_die_per_package = tscan.amd_nodes_per_pkg;
>  }


^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [patch v2 28/38] x86/cpu: Provide an AMD/HYGON specific topology parser
  2023-07-30  5:20   ` Michael Kelley (LINUX)
@ 2023-07-30  7:41     ` Thomas Gleixner
  0 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-30  7:41 UTC (permalink / raw)
  To: Michael Kelley (LINUX), LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

On Sun, Jul 30 2023 at 05:20, Michael Kelley wrote:
> From: Thomas Gleixner <tglx@linutronix.de> Sent: Friday, July 28, 2023 5:13 AM
>> +	/*
>> +	 * If leaf 0xb is available, then SMT shift is set already. If not
>> +	 * take it from ecx.threads_per_cpu and use topo_update_dom() as
>
> "take it from ebx.threads_per_cu and use topology_update_dom() as"
>
>> +	 * topology_set_dom() would propagate and overwrite the already
>> +	 * propagated CORE level.
>> +	 */
>> +	if (!has_0xb) {
>> +		topology_update_dom(tscan, TOPO_SMT_DOMAIN, get_count_order(leaf.threads_per_cu),
>> +				    leaf.threads_per_cu);
>
> leaf.threads_per_cu needs to be (leaf.threads_per_cu + 1) above.  If
> the core has two hyper-threads, the value of the threads_per_cu
> field returned by CPUID is "1".
>
> All my Hyper-V VMs on AMD processors were coming up with only
> one thread per core.   The change fixes the problem.

You are right. Thanks for tracking that down!

    tglx

^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology()
  2023-07-28 12:13 ` [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology() Thomas Gleixner
  2023-07-29  0:02   ` Sohil Mehta
@ 2023-07-31  4:05   ` Michael Kelley (LINUX)
  2023-07-31 12:34     ` Thomas Gleixner
  2023-07-31 13:47     ` Arjan van de Ven
  2023-08-01  7:05   ` Gautham R. Shenoy
  2 siblings, 2 replies; 72+ messages in thread
From: Michael Kelley (LINUX) @ 2023-07-31  4:05 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

From: Thomas Gleixner <tglx@linutronix.de> Sent: Friday, July 28, 2023 5:13 AM
> 
> Topology evaluation is a complete disaster and impenetrable mess. It's
> scattered all over the place with some vendor implementatins doing early
> evaluation and some not. The most horrific part is the permanent
> overwriting of smt_max_siblings and __max_die_per_package, instead of
> establishing them once on the boot CPU and validating the result on the
> APs.
> 
> The goals are:
> 
>   - One topology evaluation entry point
> 
>   - Proper sharing of pointlessly duplicated code
> 
>   - Proper structuring of the evaluation logic and preferences.
> 
>   - Evaluating important system wide information only once on the boot CPU
> 
>   - Making the 0xb/0x1f leaf parsing less convoluted and actually fixing
>     the short comings of leaf 0x1f evaluation.
> 
> Start to consolidate the topology evaluation code by providing the entry
> points for the early boot CPU evaluation and for the final parsing on the
> boot CPU and the APs.
> 
> Move the trivial pieces into that new code:
> 
>    - The initialization of cpuinfo_x86::topo
> 
>    - The evaluation of CPUID leaf 1, which presets topo::initial_apicid
> 
>    - topo_apicid is set to topo::initial_apicid when invoked from early
>      boot. When invoked for the final evaluation on the boot CPU it reads
>      the actual APIC ID,  which makes apic_get_initial_apicid() obsolete
>      once everything is converted over.

> 
> Provide a temporary helper function topo_converted() which shields off the
> not yet converted CPU vendors from invoking code which would break them.
> This shielding covers all vendor CPUs which support SMP, but not the
> historical pure UP ones as they only need the topology info init and
> eventually the initial APIC initialization.
> 
> Provide two new members in cpuinfo_x86::topo to store the maximum number of
> SMT siblings and the number of dies per package and add them to the debugfs
> readout. These two members will be used to populate this information on the
> boot CPU and to validate the APs against it.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  arch/x86/include/asm/topology.h       |   19 +++
>  arch/x86/kernel/cpu/Makefile          |    3
>  arch/x86/kernel/cpu/common.c          |   23 +---
>  arch/x86/kernel/cpu/cpu.h             |    6 +
>  arch/x86/kernel/cpu/debugfs.c         |   37 ++++++
>  arch/x86/kernel/cpu/topology.h        |   32 +++++
>  arch/x86/kernel/cpu/topology_common.c |  187
> ++++++++++++++++++++++++++++++++++
>  7 files changed, 290 insertions(+), 17 deletions(-)
> 

[snip]

> +
> +static void parse_topology(struct topo_scan *tscan, bool early)
> +{
> +	const struct cpuinfo_topology topo_defaults = {
> +		.cu_id			= 0xff,
> +		.llc_id			= BAD_APICID,
> +		.l2c_id			= BAD_APICID,
> +	};
> +	struct cpuinfo_x86 *c = tscan->c;
> +	struct {
> +		u32	unused0		: 16,
> +			nproc		:  8,
> +			apicid		:  8;
> +	} ebx;
> +
> +	c->topo = topo_defaults;
> +
> +	if (fake_topology(tscan))
> +	    return;
> +
> +	/* Preset Initial APIC ID from CPUID leaf 1 */
> +	cpuid_leaf_reg(1, CPUID_EBX, &ebx);
> +	c->topo.initial_apicid = ebx.apicid;
> +
> +	/*
> +	 * The initial invocation from early_identify_cpu() happens before
> +	 * the APIC is mapped or X2APIC enabled. For establishing the
> +	 * topology, that's not required. Use the initial APIC ID.
> +	 */
> +	if (early)
> +		c->topo.apicid = c->topo.initial_apicid;
> +	else
> +		c->topo.apicid = read_apic_id();

Using the value from the local APIC ID reg turns out to cause a problem in
some Hyper-V VM configurations.  If a VM has multiple L3 caches (probably
due to multiple NUMA nodes) and the # of CPUs in the span of the L3 cache
is not a power of 2, the APIC IDs for the CPUs in the span of the 1st L3 cache
are sequential starting with 0.  But then there is a gap before starting the
APIC IDs for the CPUs in the span of the 2nd L3 cache.  The gap is
repeated if there are additional L3 caches.

The CPUID instruction executed on a guest vCPU correctly reports the APIC
IDs.  However, the ACPI MADT assigns the APIC IDs sequentially with no
gaps, and the guest firmware sets the APIC_ID register for each local APIC
to match the MADT.  When parse_topology() sets the apicid field based on
reading the local APIC ID register, the value it sets is different from the
initial_apicid value for CPUs in the span of the 2nd and subsequent L3
caches, because there's no gap in the APIC IDs read from the local APIC.
Linux boots and runs, but the topology is set up with the wrong span for
the L3 cache and for the associated scheduling domains. 

The old code derives the apicid from the initial_apicid via the
phys_pkg_id() callback, so these bad Hyper-V VM configs skate by.  The
wrong value in the local APIC ID register and MADT does not affect
anything, except that the check in validate_apic_and_package_id() fails
during boot, and a set of "Firmware bug:" messages is correctly output.

Three thoughts:

1)  Are Hyper-V VMs the only place where the local APIC ID register might
have a bogus value?  Probably so, but you never know what might crawl out.

2) The natural response is "Well, fix Hyper-V!"  I first had this conversation
with the Hyper-V team about 5 years ago.  Some cases of the problem were
fixed, but some cases remain unfixed.  It's a long story.

3)  Since Hyper-V code in Linux already has an override for the apic->read()
function, it's possible to do a hack in that override so that apicid gets set to
the same value as initial_apicid, which matches the old code.  Here's the diff:

diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c
index 72d9931da3a2..2e7b18557186 100644
--- a/arch/x86/hyperv/hv_apic.c
+++ b/arch/x86/hyperv/hv_apic.c
@@ -58,6 +58,8 @@ static u32 hv_apic_read(u32 reg)
        u32 reg_val, hi;

        switch (reg) {
+       case APIC_ID:
+               return __this_cpu_read(cpu_info.topo.initial_apicid) << 24;
        case APIC_EOI:
                rdmsr(HV_X64_MSR_EOI, reg_val, hi);
                (void)hi;
@@ -311,6 +313,7 @@ void __init hv_apic_init(void)
                 * both xapic and x2apic because the field layout is the same.
                 */
                apic_update_callback(eoi, hv_apic_eoi_write);
+               apic->apic_id_registered = NULL;
                if (!x2apic_enabled()) {
                        apic_update_callback(read, hv_apic_read);
                        apic_update_callback(write, hv_apic_write);

Setting apic->apic_id_registered to NULL is necessary because it does
read_apic_id() and checks that the value matches an APIC ID that was
registered when the MADT was parsed.  This test fails for some vCPUs
in the VM because the APIC IDs from the MADT are also sequential
with no gaps as mentioned above. I don't see any big hazard in
bypassing the check.

The hv_apic_read() override is used only in VMs with an xapic.
I still need to check a few things, but I believe Hyper-V gets
MADT and local APIC ID reg numbering correct when an x2apic
is used, so I don't think any hacks are needed for that path.

Does anyone have suggestions on a different way to handle
this that's better than the above diff?  Other thoughts?

Michael

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* RE: [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology()
  2023-07-31  4:05   ` Michael Kelley (LINUX)
@ 2023-07-31 12:34     ` Thomas Gleixner
  2023-07-31 13:27       ` Peter Zijlstra
  2023-07-31 16:25       ` Michael Kelley (LINUX)
  2023-07-31 13:47     ` Arjan van de Ven
  1 sibling, 2 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-31 12:34 UTC (permalink / raw)
  To: Michael Kelley (LINUX), LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

On Mon, Jul 31 2023 at 04:05, Michael Kelley wrote:
>> +	/*
>> +	 * The initial invocation from early_identify_cpu() happens before
>> +	 * the APIC is mapped or X2APIC enabled. For establishing the
>> +	 * topology, that's not required. Use the initial APIC ID.
>> +	 */
>> +	if (early)
>> +		c->topo.apicid = c->topo.initial_apicid;
>> +	else
>> +		c->topo.apicid = read_apic_id();
>
> Using the value from the local APIC ID reg turns out to cause a problem in
> some Hyper-V VM configurations.  If a VM has multiple L3 caches (probably
> due to multiple NUMA nodes) and the # of CPUs in the span of the L3 cache
> is not a power of 2, the APIC IDs for the CPUs in the span of the 1st L3 cache
> are sequential starting with 0.  But then there is a gap before starting the
> APIC IDs for the CPUs in the span of the 2nd L3 cache.  The gap is
> repeated if there are additional L3 caches.
>
> The CPUID instruction executed on a guest vCPU correctly reports the APIC
> IDs.  However, the ACPI MADT assigns the APIC IDs sequentially with no
> gaps, and the guest firmware sets the APIC_ID register for each local APIC
> to match the MADT.  When parse_topology() sets the apicid field based on
> reading the local APIC ID register, the value it sets is different from the
> initial_apicid value for CPUs in the span of the 2nd and subsequent L3
> caches, because there's no gap in the APIC IDs read from the local APIC.
> Linux boots and runs, but the topology is set up with the wrong span for
> the L3 cache and for the associated scheduling domains.

TBH. That's an insanity. MADT and the actual APIC ID determine the
topology. So the gaps should be reflected in MADT and the actual APIC
IDs should be set correctly if the intent is to provide topology
information.

Just for the record. This hack works only on Intel today, because AMD
init sets topo.apicid = read_apic_id() unconditionally. So this is
inconsistent already, no?

> The old code derives the apicid from the initial_apicid via the
> phys_pkg_id() callback, so these bad Hyper-V VM configs skate by.  The
> wrong value in the local APIC ID register and MADT does not affect
> anything, except that the check in validate_apic_and_package_id() fails
> during boot, and a set of "Firmware bug:" messages is correctly
> output.

So instead of fixing the firmware bugs, hyper-v just moves on and
pretends that everything works fine, right?

> Three thoughts:
>
> 1)  Are Hyper-V VMs the only place where the local APIC ID register might
> have a bogus value?  Probably so, but you never know what might crawl
> out.

Define bogus. MADT is the primary source of information because that's
how we know how many CPUs (APICs) are there and what their APIC ID is
which we can use to wake them up. So there is a reasonable expectation
that this information is consistent with the rest of the system.

The Intel SDM clearly says in Vol 3A section 9.4.5 Identifying Logical
Processors in an MP System:

  "After the BIOS has completed the MP initialization protocol, each
   logical processor can be uniquely identified by its local APIC
   ID. Software can access these APIC IDs in either of the following
   ways:"

These ways include read from APIC, read MADT, read CPUID and implies
that this must be consistent. For X2APIC it's actually written out:

  "If the local APIC unit supports x2APIC and is operating in x2APIC
   mode, 32-bit APIC ID can be read by executing a RDMSR instruction to
   read the processor’s x2APIC ID register. This method is equivalent to
   executing CPUID leaf 0BH described below."

AMD has not been following that in the early 64bit systems as they moved
the APIC ID space to start at 32 for the first CPU in the first socket
for whatever reasons. But since then the kernel reads back the APIC ID
on AMD systems into topo.apicid. But that was long ago and can easily be
dealt with because at least the real APIC ID and the MADT/MPTABLE
entries are consistent.

Hypervisors have their own CPUID space to override functionality with
their own magic stuff, but imposing their nutbolt ideas on the
architectural part of the system is not only wrong, it's disrespectful
against the OS developers who try to keep their system sane.

> 2) The natural response is "Well, fix Hyper-V!"  I first had this conversation
> with the Hyper-V team about 5 years ago.  Some cases of the problem were
> fixed, but some cases remain unfixed.  It's a long story.
>
> 3)  Since Hyper-V code in Linux already has an override for the apic->read()
> function, it's possible to do a hack in that override so that apicid gets set to
> the same value as initial_apicid, which matches the old code.  Here's the diff:

This collides massively with the other work I'm doing, which uses the
MADT provided information to actually evaluate various topology related
things upfront and later during bringup. Thats badly needed because lots
of todays infrastructure is based on heuristics and guesswork.

But it seems I wasted a month on reworking all of this just to be
stopped cold in the tracks by completely undocumented and unnecessary
hyper-v abuse.

So if Hyper-V insists on abusing the initial APIC ID as read from CPUID
for topology information related to L3, then hyper-v should override the
cache topology mechanism and not impose this insanity on the basic
topology evaluation infrastructure.

Yours seriously grumpy

      tglx

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology()
  2023-07-31 12:34     ` Thomas Gleixner
@ 2023-07-31 13:27       ` Peter Zijlstra
  2023-07-31 15:38         ` Thomas Gleixner
  2023-07-31 16:25       ` Michael Kelley (LINUX)
  1 sibling, 1 reply; 72+ messages in thread
From: Peter Zijlstra @ 2023-07-31 13:27 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Michael Kelley (LINUX),
	LKML, x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

On Mon, Jul 31, 2023 at 02:34:39PM +0200, Thomas Gleixner wrote:

> This collides massively with the other work I'm doing, which uses the
> MADT provided information to actually evaluate various topology related
> things upfront and later during bringup. Thats badly needed because lots
> of todays infrastructure is based on heuristics and guesswork.
> 
> But it seems I wasted a month on reworking all of this just to be
> stopped cold in the tracks by completely undocumented and unnecessary
> hyper-v abuse.
> 
> So if Hyper-V insists on abusing the initial APIC ID as read from CPUID
> for topology information related to L3, then hyper-v should override the
> cache topology mechanism and not impose this insanity on the basic
> topology evaluation infrastructure.

So I'm very tempted to suggest you continue with the topology rewrite
and let Hyper-V keep the pieces. They're very clearly violating the SDM.

Thing as they stand are untenable, the whole topology thing as it exists
today is an untenable shitshow.

Michael, is there anything you can do early (as in MADT parse early) to
fix up the APIC-IDs?

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology()
  2023-07-31  4:05   ` Michael Kelley (LINUX)
  2023-07-31 12:34     ` Thomas Gleixner
@ 2023-07-31 13:47     ` Arjan van de Ven
  2023-07-31 14:08       ` Andrew Cooper
  1 sibling, 1 reply; 72+ messages in thread
From: Arjan van de Ven @ 2023-07-31 13:47 UTC (permalink / raw)
  To: Michael Kelley (LINUX), Thomas Gleixner, LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, James E.J. Bottomley,
	Dick Kennedy, James Smart, Martin K. Petersen, linux-scsi,
	Guenter Roeck, linux-hwmon, Jean Delvare, Huang Rui,
	Juergen Gross, Steve Wahl, Mike Travis, Dimitri Sivanich,
	Russ Anderson

On 7/30/2023 9:05 PM, Michael Kelley (LINUX) wrote:
> Does anyone have suggestions on a different way to handle
> this that's better than the above diff?  Other thoughts?

how badly do you need xapic ? Meaning, can x2apic just be used instead always


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology()
  2023-07-31 13:47     ` Arjan van de Ven
@ 2023-07-31 14:08       ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2023-07-31 14:08 UTC (permalink / raw)
  To: Arjan van de Ven, Michael Kelley (LINUX), Thomas Gleixner, LKML
  Cc: x86, Tom Lendacky, James E.J. Bottomley, Dick Kennedy,
	James Smart, Martin K. Petersen, linux-scsi, Guenter Roeck,
	linux-hwmon, Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl,
	Mike Travis, Dimitri Sivanich, Russ Anderson

On 31/07/2023 2:47 pm, Arjan van de Ven wrote:
> On 7/30/2023 9:05 PM, Michael Kelley (LINUX) wrote:
>> Does anyone have suggestions on a different way to handle
>> this that's better than the above diff?  Other thoughts?
>
> how badly do you need xapic ? Meaning, can x2apic just be used instead
> always

x2APIC under virt is a problem if you don't want to fully emulate an
IOMMU just for int-remapping purposes.

You don't know a-priori whether a particular guest kernel knows about
e.g. the rsvd bit trick in IO-APIC RTEs to allow a 32k destination id.

The only generally compatible way is to start in xAPIC mode, leave all
the enumeration hints around which say "really really please switch into
x2APIC mode", and hope the kernel does.

~Andrew

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology()
  2023-07-31 13:27       ` Peter Zijlstra
@ 2023-07-31 15:38         ` Thomas Gleixner
  2023-07-31 16:10           ` Michael Kelley (LINUX)
  0 siblings, 1 reply; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-31 15:38 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Kelley (LINUX),
	LKML, x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

On Mon, Jul 31 2023 at 15:27, Peter Zijlstra wrote:
> On Mon, Jul 31, 2023 at 02:34:39PM +0200, Thomas Gleixner wrote:
>> This collides massively with the other work I'm doing, which uses the
>> MADT provided information to actually evaluate various topology related
>> things upfront and later during bringup. Thats badly needed because lots
>> of todays infrastructure is based on heuristics and guesswork.
>> 
>> But it seems I wasted a month on reworking all of this just to be
>> stopped cold in the tracks by completely undocumented and unnecessary
>> hyper-v abuse.
>> 
>> So if Hyper-V insists on abusing the initial APIC ID as read from CPUID
>> for topology information related to L3, then hyper-v should override the
>> cache topology mechanism and not impose this insanity on the basic
>> topology evaluation infrastructure.
>
> So I'm very tempted to suggest you continue with the topology rewrite
> and let Hyper-V keep the pieces. They're very clearly violating the SDM.
>
> Thing as they stand are untenable, the whole topology thing as it exists
> today is an untenable shitshow.
>
> Michael, is there anything you can do early (as in MADT parse early) to
> fix up the APIC-IDs?

I don't think so.

Michael, can you please provide me a table of:

   APICID (real/MADT)		APICID (CPUID)

from one of the tinker VMs please?

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology()
  2023-07-31 15:38         ` Thomas Gleixner
@ 2023-07-31 16:10           ` Michael Kelley (LINUX)
  2023-07-31 20:48             ` Thomas Gleixner
  0 siblings, 1 reply; 72+ messages in thread
From: Michael Kelley (LINUX) @ 2023-07-31 16:10 UTC (permalink / raw)
  To: Thomas Gleixner, Peter Zijlstra
  Cc: LKML, x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

From: Thomas Gleixner <tglx@linutronix.de> Sent: Monday, July 31, 2023 8:38 AM
> 
> On Mon, Jul 31 2023 at 15:27, Peter Zijlstra wrote:
> > On Mon, Jul 31, 2023 at 02:34:39PM +0200, Thomas Gleixner wrote:
> >> This collides massively with the other work I'm doing, which uses the
> >> MADT provided information to actually evaluate various topology related
> >> things upfront and later during bringup. Thats badly needed because lots
> >> of todays infrastructure is based on heuristics and guesswork.
> >>
> >> But it seems I wasted a month on reworking all of this just to be
> >> stopped cold in the tracks by completely undocumented and unnecessary
> >> hyper-v abuse.
> >>
> >> So if Hyper-V insists on abusing the initial APIC ID as read from CPUID
> >> for topology information related to L3, then hyper-v should override the
> >> cache topology mechanism and not impose this insanity on the basic
> >> topology evaluation infrastructure.
> >
> > So I'm very tempted to suggest you continue with the topology rewrite
> > and let Hyper-V keep the pieces. They're very clearly violating the SDM.
> >
> > Thing as they stand are untenable, the whole topology thing as it exists
> > today is an untenable shitshow.
> >
> > Michael, is there anything you can do early (as in MADT parse early) to
> > fix up the APIC-IDs?
> 
> I don't think so.
> 
> Michael, can you please provide me a table of:
> 
>    APICID (real/MADT)		APICID (CPUID)
> 
> from one of the tinker VMs please?
> 

The VM is an F72s_v2 in Azure running your patch set.  The VM has
72 vCPUs in two NUMA nodes across two physical Intel processors, with
36 vCPUs in each NUMA node.

The output is from /sys/kernel/debug/x86/topo/cpus, so the initial_apicid
is from CPUID, while the apicid is from read_apic_id() and matches the
MADT.  As expected, the two values match for the first 36 vCPUs, but differ
by 28 (decimal) for the remaining 36.

initial_apicid:      0 apicid:              0
initial_apicid:      1 apicid:              1
initial_apicid:      2 apicid:              2
initial_apicid:      3 apicid:              3
initial_apicid:      4 apicid:              4
initial_apicid:      5 apicid:              5
initial_apicid:      6 apicid:              6
initial_apicid:      7 apicid:              7
initial_apicid:      8 apicid:              8
initial_apicid:      9 apicid:              9
initial_apicid:      a apicid:              a
initial_apicid:      b apicid:              b
initial_apicid:      c apicid:              c
initial_apicid:      d apicid:              d
initial_apicid:      e apicid:              e
initial_apicid:      f apicid:              f
initial_apicid:      10 apicid:              10
initial_apicid:      11 apicid:              11
initial_apicid:      12 apicid:              12
initial_apicid:      13 apicid:              13
initial_apicid:      14 apicid:              14
initial_apicid:      15 apicid:              15
initial_apicid:      16 apicid:              16
initial_apicid:      17 apicid:              17
initial_apicid:      18 apicid:              18
initial_apicid:      19 apicid:              19
initial_apicid:      1a apicid:              1a
initial_apicid:      1b apicid:              1b
initial_apicid:      1c apicid:              1c
initial_apicid:      1d apicid:              1d
initial_apicid:      1e apicid:              1e
initial_apicid:      1f apicid:              1f
initial_apicid:      20 apicid:              20
initial_apicid:      21 apicid:              21
initial_apicid:      22 apicid:              22
initial_apicid:      23 apicid:              23
initial_apicid:      40 apicid:              24
initial_apicid:      41 apicid:              25
initial_apicid:      42 apicid:              26
initial_apicid:      43 apicid:              27
initial_apicid:      44 apicid:              28
initial_apicid:      45 apicid:              29
initial_apicid:      46 apicid:              2a
initial_apicid:      47 apicid:              2b
initial_apicid:      48 apicid:              2c
initial_apicid:      49 apicid:              2d
initial_apicid:      4a apicid:              2e
initial_apicid:      4b apicid:              2f
initial_apicid:      4c apicid:              30
initial_apicid:      4d apicid:              31
initial_apicid:      4e apicid:              32
initial_apicid:      4f apicid:              33
initial_apicid:      50 apicid:              34
initial_apicid:      51 apicid:              35
initial_apicid:      52 apicid:              36
initial_apicid:      53 apicid:              37
initial_apicid:      54 apicid:              38
initial_apicid:      55 apicid:              39
initial_apicid:      56 apicid:              3a
initial_apicid:      57 apicid:              3b
initial_apicid:      58 apicid:              3c
initial_apicid:      59 apicid:              3d
initial_apicid:      5a apicid:              3e
initial_apicid:      5b apicid:              3f
initial_apicid:      5c apicid:              40
initial_apicid:      5d apicid:              41
initial_apicid:      5e apicid:              42
initial_apicid:      5f apicid:              43
initial_apicid:      60 apicid:              44
initial_apicid:      61 apicid:              45
initial_apicid:      62 apicid:              46
initial_apicid:      63 apicid:              47

Michael

^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology()
  2023-07-31 12:34     ` Thomas Gleixner
  2023-07-31 13:27       ` Peter Zijlstra
@ 2023-07-31 16:25       ` Michael Kelley (LINUX)
  2023-07-31 20:41         ` Thomas Gleixner
  1 sibling, 1 reply; 72+ messages in thread
From: Michael Kelley (LINUX) @ 2023-07-31 16:25 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

From: Thomas Gleixner <tglx@linutronix.de> Sent: Monday, July 31, 2023 5:35 AM
> 
> On Mon, Jul 31 2023 at 04:05, Michael Kelley wrote:
> >> +	/*
> >> +	 * The initial invocation from early_identify_cpu() happens before
> >> +	 * the APIC is mapped or X2APIC enabled. For establishing the
> >> +	 * topology, that's not required. Use the initial APIC ID.
> >> +	 */
> >> +	if (early)
> >> +		c->topo.apicid = c->topo.initial_apicid;
> >> +	else
> >> +		c->topo.apicid = read_apic_id();
> >
> > Using the value from the local APIC ID reg turns out to cause a problem in
> > some Hyper-V VM configurations.  If a VM has multiple L3 caches (probably
> > due to multiple NUMA nodes) and the # of CPUs in the span of the L3 cache
> > is not a power of 2, the APIC IDs for the CPUs in the span of the 1st L3 cache
> > are sequential starting with 0.  But then there is a gap before starting the
> > APIC IDs for the CPUs in the span of the 2nd L3 cache.  The gap is
> > repeated if there are additional L3 caches.
> >
> > The CPUID instruction executed on a guest vCPU correctly reports the APIC
> > IDs.  However, the ACPI MADT assigns the APIC IDs sequentially with no
> > gaps, and the guest firmware sets the APIC_ID register for each local APIC
> > to match the MADT.  When parse_topology() sets the apicid field based on
> > reading the local APIC ID register, the value it sets is different from the
> > initial_apicid value for CPUs in the span of the 2nd and subsequent L3
> > caches, because there's no gap in the APIC IDs read from the local APIC.
> > Linux boots and runs, but the topology is set up with the wrong span for
> > the L3 cache and for the associated scheduling domains.
> 
> TBH. That's an insanity. MADT and the actual APIC ID determine the
> topology. So the gaps should be reflected in MADT and the actual APIC
> IDs should be set correctly if the intent is to provide topology
> information.
> 
> Just for the record. This hack works only on Intel today, because AMD
> init sets topo.apicid = read_apic_id() unconditionally. So this is
> inconsistent already, no?
> 

Correct.  But given that the L3 cache span in the AMD Zen1 and Zen2
processors is only 8 CPUs, there's much less reason to configure a VM
that only uses some of the CPUs in an L3 cache span.  Hyper-V does
the APIC ID numbering correctly for Zen3 with its 16 CPUs in the L3
cache span.

> > The old code derives the apicid from the initial_apicid via the
> > phys_pkg_id() callback, so these bad Hyper-V VM configs skate by.  The
> > wrong value in the local APIC ID register and MADT does not affect
> > anything, except that the check in validate_apic_and_package_id() fails
> > during boot, and a set of "Firmware bug:" messages is correctly
> > output.
> 
> So instead of fixing the firmware bugs, hyper-v just moves on and
> pretends that everything works fine, right?

What can I say. :-(

> 
> > Three thoughts:
> >
> > 1)  Are Hyper-V VMs the only place where the local APIC ID register might
> > have a bogus value?  Probably so, but you never know what might crawl
> > out.
> 
> Define bogus. MADT is the primary source of information because that's
> how we know how many CPUs (APICs) are there and what their APIC ID is
> which we can use to wake them up. So there is a reasonable expectation
> that this information is consistent with the rest of the system.

Commit d49597fd3bc7 "x86/cpu: Deal with broken firmware (VMWare/Xen)"
mentions VMware and XEN implementations that violate the spec.  The
commit is from late 2016.  Have these bad systems aged out and no longer
need accommodation?

> 
> The Intel SDM clearly says in Vol 3A section 9.4.5 Identifying Logical
> Processors in an MP System:
> 
>   "After the BIOS has completed the MP initialization protocol, each
>    logical processor can be uniquely identified by its local APIC
>    ID. Software can access these APIC IDs in either of the following
>    ways:"
> 
> These ways include read from APIC, read MADT, read CPUID and implies
> that this must be consistent. For X2APIC it's actually written out:
> 
>   "If the local APIC unit supports x2APIC and is operating in x2APIC
>    mode, 32-bit APIC ID can be read by executing a RDMSR instruction to
>    read the processor’s x2APIC ID register. This method is equivalent to
>    executing CPUID leaf 0BH described below."
> 
> AMD has not been following that in the early 64bit systems as they moved
> the APIC ID space to start at 32 for the first CPU in the first socket
> for whatever reasons. But since then the kernel reads back the APIC ID
> on AMD systems into topo.apicid. But that was long ago and can easily be
> dealt with because at least the real APIC ID and the MADT/MPTABLE
> entries are consistent.
> 
> Hypervisors have their own CPUID space to override functionality with
> their own magic stuff, but imposing their nutbolt ideas on the
> architectural part of the system is not only wrong, it's disrespectful
> against the OS developers who try to keep their system sane.
> 
> > 2) The natural response is "Well, fix Hyper-V!"  I first had this conversation
> > with the Hyper-V team about 5 years ago.  Some cases of the problem were
> > fixed, but some cases remain unfixed.  It's a long story.
> >
> > 3)  Since Hyper-V code in Linux already has an override for the apic->read()
> > function, it's possible to do a hack in that override so that apicid gets set to
> > the same value as initial_apicid, which matches the old code.  Here's the diff:
> 
> This collides massively with the other work I'm doing, which uses the
> MADT provided information to actually evaluate various topology related
> things upfront and later during bringup. Thats badly needed because lots
> of todays infrastructure is based on heuristics and guesswork.

Fair enough.  And I've re-raised the issue with the Hyper-V team.

> 
> But it seems I wasted a month on reworking all of this just to be
> stopped cold in the tracks by completely undocumented and unnecessary
> hyper-v abuse.
> 
> So if Hyper-V insists on abusing the initial APIC ID as read from CPUID
> for topology information related to L3, then hyper-v should override the
> cache topology mechanism and not impose this insanity on the basic
> topology evaluation infrastructure.
> 
> Yours seriously grumpy
> 
>       tglx

^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology()
  2023-07-31 16:25       ` Michael Kelley (LINUX)
@ 2023-07-31 20:41         ` Thomas Gleixner
  0 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-31 20:41 UTC (permalink / raw)
  To: Michael Kelley (LINUX), LKML
  Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

On Mon, Jul 31 2023 at 16:25, Michael Kelley wrote:
> From: Thomas Gleixner <tglx@linutronix.de> Sent: Monday, July 31, 2023 5:35 AM
>> Define bogus. MADT is the primary source of information because that's
>> how we know how many CPUs (APICs) are there and what their APIC ID is
>> which we can use to wake them up. So there is a reasonable expectation
>> that this information is consistent with the rest of the system.
>
> Commit d49597fd3bc7 "x86/cpu: Deal with broken firmware (VMWare/Xen)"
> mentions VMware and XEN implementations that violate the spec.  The
> commit is from late 2016.  Have these bad systems aged out and no longer
> need accommodation?

They do, but this commit explicitely uses the MADT/real APIC ID value:

     c->initial_apicid = apicid;

So the new mechanics are accomodating for those, right?

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology()
  2023-07-31 16:10           ` Michael Kelley (LINUX)
@ 2023-07-31 20:48             ` Thomas Gleixner
  2023-07-31 21:27               ` Michael Kelley (LINUX)
  0 siblings, 1 reply; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-31 20:48 UTC (permalink / raw)
  To: Michael Kelley (LINUX), Peter Zijlstra
  Cc: LKML, x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

On Mon, Jul 31 2023 at 16:10, Michael Kelley wrote:
> From: Thomas Gleixner <tglx@linutronix.de> Sent: Monday, July 31, 2023 8:38 AM
> The VM is an F72s_v2 in Azure running your patch set.  The VM has
> 72 vCPUs in two NUMA nodes across two physical Intel processors, with
> 36 vCPUs in each NUMA node.
>
> The output is from /sys/kernel/debug/x86/topo/cpus, so the initial_apicid
> is from CPUID, while the apicid is from read_apic_id() and matches the
> MADT.  As expected, the two values match for the first 36 vCPUs, but differ
> by 28 (decimal) for the remaining 36.
>
> initial_apicid:      0 apicid:              0
...
> initial_apicid:      23 apicid:              23

> initial_apicid:      40 apicid:              24
...
> initial_apicid:      63 apicid:              47

Is there any indication in some other CPUID leaf which lets us deduce this
wreckage?

I don't think the hypervisor space (0x40000xx) has anything helpful, but
staring at the architectural ones provided by hyper-V to the guest might
give us an hint. Can you provide a cpuid dump for the boot CPU please?

Thanks,

        tglx



^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology()
  2023-07-31 20:48             ` Thomas Gleixner
@ 2023-07-31 21:27               ` Michael Kelley (LINUX)
  2023-07-31 22:12                 ` Thomas Gleixner
  0 siblings, 1 reply; 72+ messages in thread
From: Michael Kelley (LINUX) @ 2023-07-31 21:27 UTC (permalink / raw)
  To: Thomas Gleixner, Peter Zijlstra
  Cc: LKML, x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

From: Thomas Gleixner <tglx@linutronix.de> Sent: Monday, July 31, 2023 1:49 PM
> 
> On Mon, Jul 31 2023 at 16:10, Michael Kelley wrote:
> > From: Thomas Gleixner <tglx@linutronix.de> Sent: Monday, July 31, 2023 8:38 AM
> > The VM is an F72s_v2 in Azure running your patch set.  The VM has
> > 72 vCPUs in two NUMA nodes across two physical Intel processors, with
> > 36 vCPUs in each NUMA node.
> >
> > The output is from /sys/kernel/debug/x86/topo/cpus, so the initial_apicid
> > is from CPUID, while the apicid is from read_apic_id() and matches the
> > MADT.  As expected, the two values match for the first 36 vCPUs, but differ
> > by 28 (decimal) for the remaining 36.
> >
> > initial_apicid:      0 apicid:              0
> ...
> > initial_apicid:      23 apicid:              23
> 
> > initial_apicid:      40 apicid:              24
> ...
> > initial_apicid:      63 apicid:              47
> 
> Is there any indication in some other CPUID leaf which lets us deduce this
> wreckage?

You can detect being a Hyper-V guest with leaf 0x40000000.  See Linux
kernel function ms_hyperv_platform().  But I'm not aware of anything
to indicate that a specific Hyper-V VM has the APIC numbering problem
vs. doesn't have the problem.

> 
> I don't think the hypervisor space (0x40000xx) has anything helpful, but
> staring at the architectural ones provided by hyper-V to the guest might
> give us an hint. Can you provide a cpuid dump for the boot CPU please?
> 

I'm not sure if you want the raw or decoded output.  Here's both.

Michael

# taskset -c 0 cpuid -r -1
CPU:
   0x00000000 0x00: eax=0x00000015 ebx=0x756e6547 ecx=0x6c65746e edx=0x49656e69
   0x00000001 0x00: eax=0x000606a6 ebx=0x00400800 ecx=0xfeda3223 edx=0x1f8bfbff
   0x00000002 0x00: eax=0x00feff01 ebx=0x000000f0 ecx=0x00000000 edx=0x00000000
   0x00000003 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000004 0x00: eax=0x7c004121 ebx=0x02c0003f ecx=0x0000003f edx=0x00000000
   0x00000004 0x01: eax=0x7c004122 ebx=0x01c0003f ecx=0x0000003f edx=0x00000000
   0x00000004 0x02: eax=0x7c004143 ebx=0x04c0003f ecx=0x000003ff edx=0x00000000
   0x00000004 0x03: eax=0x7c0fc163 ebx=0x02c0003f ecx=0x0000ffff edx=0x00000000
   0x00000005 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000006 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000007 0x00: eax=0x00000000 ebx=0xd09f2fb9 ecx=0x00000000 edx=0x00000400
   0x00000008 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000009 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x0000000a 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x0000000b 0x00: eax=0x00000001 ebx=0x00000002 ecx=0x00000100 edx=0x00000000
   0x0000000b 0x01: eax=0x00000006 ebx=0x00000040 ecx=0x00000201 edx=0x00000000
   0x0000000c 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x0000000d 0x00: eax=0x000000e7 ebx=0x00000a80 ecx=0x00000a80 edx=0x00000000
   0x0000000d 0x01: eax=0x0000000b ebx=0x00000980 ecx=0x00000000 edx=0x00000000
   0x0000000d 0x02: eax=0x00000100 ebx=0x00000240 ecx=0x00000000 edx=0x00000000
   0x0000000d 0x05: eax=0x00000040 ebx=0x00000440 ecx=0x00000000 edx=0x00000000
   0x0000000d 0x06: eax=0x00000200 ebx=0x00000480 ecx=0x00000000 edx=0x00000000
   0x0000000d 0x07: eax=0x00000400 ebx=0x00000680 ecx=0x00000000 edx=0x00000000
   0x0000000e 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x0000000f 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000010 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000011 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000012 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000013 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000014 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000015 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x40000000 0x00: eax=0x4000000a ebx=0x7263694d ecx=0x666f736f edx=0x76482074
   0x40000001 0x00: eax=0x31237648 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x40000002 0x00: eax=0x00004f7c ebx=0x000a0000 ecx=0x00000001 edx=0x000005b6
   0x40000003 0x00: eax=0x00002e7f ebx=0x003b8030 ecx=0x00000002 edx=0x000ed7b2
   0x40000004 0x00: eax=0x00064e24 ebx=0x00000fff ecx=0x0000002e edx=0x00000000
   0x40000005 0x00: eax=0x000000f0 ebx=0x00000400 ecx=0x00005d00 edx=0x00000000
   0x40000006 0x00: eax=0x0000000f ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x40000007 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x40000008 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x40000009 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x4000000a 0x00: eax=0x000e0101 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x20000000 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x80000000 0x00: eax=0x80000008 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x80000001 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000121 edx=0x2c100800
   0x80000002 0x00: eax=0x65746e49 ebx=0x2952286c ecx=0x6f655820 edx=0x2952286e
   0x80000003 0x00: eax=0x616c5020 ebx=0x756e6974 ecx=0x3338206d edx=0x20433037
   0x80000004 0x00: eax=0x20555043 ebx=0x2e322040 ecx=0x48473038 edx=0x0000007a
   0x80000005 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x80000006 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x01006040 edx=0x00000000
   0x80000007 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x80000008 0x00: eax=0x0000302e ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x80860000 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0xc0000000 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000

CPU:
   vendor_id = "GenuineIntel"
   version information (1/eax):
      processor type  = primary processor (0)
      family          = 0x6 (6)
      model           = 0xa (10)
      stepping id     = 0x6 (6)
      extended family = 0x0 (0)
      extended model  = 0x6 (6)
      (family synth)  = 0x6 (6)
      (model synth)   = 0x6a (106)
      (simple synth)  = Intel Core (Ice Lake) [Sunny Cove] {Sunny Cove}, 10nm
   miscellaneous (1/ebx):
      process local APIC physical ID = 0x0 (0)
      cpu count                      = 0x40 (64)
      CLFLUSH line size              = 0x8 (8)
      brand index                    = 0x0 (0)
   brand id = 0x00 (0): unknown
   feature information (1/edx):
      x87 FPU on chip                        = true
      VME: virtual-8086 mode enhancement     = true
      DE: debugging extensions               = true
      PSE: page size extensions              = true
      TSC: time stamp counter                = true
      RDMSR and WRMSR support                = true
      PAE: physical address extensions       = true
      MCE: machine check exception           = true
      CMPXCHG8B inst.                        = true
      APIC on chip                           = true
      SYSENTER and SYSEXIT                   = true
      MTRR: memory type range registers      = true
      PTE global bit                         = true
      MCA: machine check architecture        = true
      CMOV: conditional move/compare instr   = true
      PAT: page attribute table              = true
      PSE-36: page size extension            = true
      PSN: processor serial number           = false
      CLFLUSH instruction                    = true
      DS: debug store                        = false
      ACPI: thermal monitor and clock ctrl   = false
      MMX Technology                         = true
      FXSAVE/FXRSTOR                         = true
      SSE extensions                         = true
      SSE2 extensions                        = true
      SS: self snoop                         = true
      hyper-threading / multi-core supported = true
      TM: therm. monitor                     = false
      IA64                                   = false
      PBE: pending break event               = false
   feature information (1/ecx):
      PNI/SSE3: Prescott New Instructions     = true
      PCLMULDQ instruction                    = true
      DTES64: 64-bit debug store              = false
      MONITOR/MWAIT                           = false
      CPL-qualified debug store               = false
      VMX: virtual machine extensions         = true
      SMX: safer mode extensions              = false
      Enhanced Intel SpeedStep Technology     = false
      TM2: thermal monitor 2                  = false
      SSSE3 extensions                        = true
      context ID: adaptive or shared L1 data  = false
      SDBG: IA32_DEBUG_INTERFACE              = false
      FMA instruction                         = true
      CMPXCHG16B instruction                  = true
      xTPR disable                            = false
      PDCM: perfmon and debug                 = false
      PCID: process context identifiers       = true
      DCA: direct cache access                = false
      SSE4.1 extensions                       = true
      SSE4.2 extensions                       = true
      x2APIC: extended xAPIC support          = false
      MOVBE instruction                       = true
      POPCNT instruction                      = true
      time stamp counter deadline             = false
      AES instruction                         = true
      XSAVE/XSTOR states                      = true
      OS-enabled XSAVE/XSTOR                  = true
      AVX: advanced vector extensions         = true
      F16C half-precision convert instruction = true
      RDRAND instruction                      = true
      hypervisor guest status                 = true
   cache and TLB information (2):
      0xff: cache data is in CPUID leaf 4
      0xfe: TLB data is in CPUID leaf 0x18
      0xf0: 64 byte prefetching
   processor serial number = 0006-06A6-0000-0000-0000-0000
   deterministic cache parameters (4):
      --- cache 0 ---
      cache type                           = data cache (1)
      cache level                          = 0x1 (1)
      self-initializing cache level        = true
      fully associative cache              = false
      extra threads sharing this cache     = 0x1 (1)
      extra processor cores on this die    = 0x1f (31)
      system coherency line size           = 0x40 (64)
      physical line partitions             = 0x1 (1)
      ways of associativity                = 0xc (12)
      number of sets                       = 0x40 (64)
      WBINVD/INVD acts on lower caches     = false
      inclusive to lower caches            = false
      complex cache indexing               = false
      number of sets (s)                   = 64
      (size synth)                         = 49152 (48 KB)
      --- cache 1 ---
      cache type                           = instruction cache (2)
      cache level                          = 0x1 (1)
      self-initializing cache level        = true
      fully associative cache              = false
      extra threads sharing this cache     = 0x1 (1)
      extra processor cores on this die    = 0x1f (31)
      system coherency line size           = 0x40 (64)
      physical line partitions             = 0x1 (1)
      ways of associativity                = 0x8 (8)
      number of sets                       = 0x40 (64)
      WBINVD/INVD acts on lower caches     = false
      inclusive to lower caches            = false
      complex cache indexing               = false
      number of sets (s)                   = 64
      (size synth)                         = 32768 (32 KB)
      --- cache 2 ---
      cache type                           = unified cache (3)
      cache level                          = 0x2 (2)
      self-initializing cache level        = true
      fully associative cache              = false
      extra threads sharing this cache     = 0x1 (1)
      extra processor cores on this die    = 0x1f (31)
      system coherency line size           = 0x40 (64)
      physical line partitions             = 0x1 (1)
      ways of associativity                = 0x14 (20)
      number of sets                       = 0x400 (1024)
      WBINVD/INVD acts on lower caches     = false
      inclusive to lower caches            = false
      complex cache indexing               = false
      number of sets (s)                   = 1024
      (size synth)                         = 1310720 (1.2 MB)
      --- cache 3 ---
      cache type                           = unified cache (3)
      cache level                          = 0x3 (3)
      self-initializing cache level        = true
      fully associative cache              = false
      extra threads sharing this cache     = 0x3f (63)
      extra processor cores on this die    = 0x1f (31)
      system coherency line size           = 0x40 (64)
      physical line partitions             = 0x1 (1)
      ways of associativity                = 0xc (12)
      number of sets                       = 0x10000 (65536)
      WBINVD/INVD acts on lower caches     = false
      inclusive to lower caches            = false
      complex cache indexing               = false
      number of sets (s)                   = 65536
      (size synth)                         = 50331648 (48 MB)
   MONITOR/MWAIT (5):
      smallest monitor-line size (bytes)       = 0x0 (0)
      largest monitor-line size (bytes)        = 0x0 (0)
      enum of Monitor-MWAIT exts supported     = false
      supports intrs as break-event for MWAIT  = false
      number of C0 sub C-states using MWAIT    = 0x0 (0)
      number of C1 sub C-states using MWAIT    = 0x0 (0)
      number of C2 sub C-states using MWAIT    = 0x0 (0)
      number of C3 sub C-states using MWAIT    = 0x0 (0)
      number of C4 sub C-states using MWAIT    = 0x0 (0)
      number of C5 sub C-states using MWAIT    = 0x0 (0)
      number of C6 sub C-states using MWAIT    = 0x0 (0)
      number of C7 sub C-states using MWAIT    = 0x0 (0)
   Thermal and Power Management Features (6):
      digital thermometer                     = false
      Intel Turbo Boost Technology            = false
      ARAT always running APIC timer          = false
      PLN power limit notification            = false
      ECMD extended clock modulation duty     = false
      PTM package thermal management          = false
      HWP base registers                      = false
      HWP notification                        = false
      HWP activity window                     = false
      HWP energy performance preference       = false
      HWP package level request               = false
      HDC base registers                      = false
      Intel Turbo Boost Max Technology 3.0    = false
      HWP capabilities                        = false
      HWP PECI override                       = false
      flexible HWP                            = false
      IA32_HWP_REQUEST MSR fast access mode   = false
      HW_FEEDBACK                             = false
      ignoring idle logical processor HWP req = false
      digital thermometer thresholds          = 0x0 (0)
      hardware coordination feedback          = false
      ACNT2 available                         = false
      performance-energy bias capability      = false
      performance capability reporting        = false
      energy efficiency capability reporting  = false
      size of feedback struct (4KB pages)     = 0x0 (0)
      index of CPU's row in feedback struct   = 0x0 (0)
   extended feature flags (7):
      FSGSBASE instructions                    = true
      IA32_TSC_ADJUST MSR supported            = false
      SGX: Software Guard Extensions supported = false
      BMI1 instructions                        = true
      HLE hardware lock elision                = true
      AVX2: advanced vector extensions 2       = true
      FDP_EXCPTN_ONLY                          = false
      SMEP supervisor mode exec protection     = true
      BMI2 instructions                        = true
      enhanced REP MOVSB/STOSB                 = true
      INVPCID instruction                      = true
      RTM: restricted transactional memory     = true
      RDT-CMT/PQoS cache monitoring            = false
      deprecated FPU CS/DS                     = true
      MPX: intel memory protection extensions  = false
      RDT-CAT/PQE cache allocation             = false
      AVX512F: AVX-512 foundation instructions = true
      AVX512DQ: double & quadword instructions = true
      RDSEED instruction                       = true
      ADX instructions                         = true
      SMAP: supervisor mode access prevention  = true
      AVX512IFMA: fused multiply add           = false
      PCOMMIT instruction                      = false
      CLFLUSHOPT instruction                   = true
      CLWB instruction                         = false
      Intel processor trace                    = false
      AVX512PF: prefetch instructions          = false
      AVX512ER: exponent & reciprocal instrs   = false
      AVX512CD: conflict detection instrs      = true
      SHA instructions                         = false
      AVX512BW: byte & word instructions       = true
      AVX512VL: vector length                  = true
      PREFETCHWT1                              = false
      AVX512VBMI: vector byte manipulation     = false
      UMIP: user-mode instruction prevention   = false
      PKU protection keys for user-mode        = false
      OSPKE CR4.PKE and RDPKRU/WRPKRU          = false
      WAITPKG instructions                     = false
      AVX512_VBMI2: byte VPCOMPRESS, VPEXPAND  = false
      CET_SS: CET shadow stack                 = false
      GFNI: Galois Field New Instructions      = false
      VAES instructions                        = false
      VPCLMULQDQ instruction                   = false
      AVX512_VNNI: neural network instructions = false
      AVX512_BITALG: bit count/shiffle         = false
      TME: Total Memory Encryption             = false
      AVX512: VPOPCNTDQ instruction            = false
      5-level paging                           = false
      BNDLDX/BNDSTX MAWAU value in 64-bit mode = 0x0 (0)
      RDPID: read processor D supported        = false
      CLDEMOTE supports cache line demote      = false
      MOVDIRI instruction                      = false
      MOVDIR64B instruction                    = false
      ENQCMD instruction                       = false
      SGX_LC: SGX launch config supported      = false
      AVX512_4VNNIW: neural network instrs     = false
      AVX512_4FMAPS: multiply acc single prec  = false
      fast short REP MOV                       = false
      AVX512_VP2INTERSECT: intersect mask regs = false
      VERW md-clear microcode support          = true
      hybrid part                              = false
      PCONFIG instruction                      = false
      CET_IBT: CET indirect branch tracking    = false
      IBRS/IBPB: indirect branch restrictions  = false
      STIBP: 1 thr indirect branch predictor   = false
      L1D_FLUSH: IA32_FLUSH_CMD MSR            = false
      IA32_ARCH_CAPABILITIES MSR               = false
      IA32_CORE_CAPABILITIES MSR               = false
      SSBD: speculative store bypass disable   = false
   Direct Cache Access Parameters (9):
      PLATFORM_DCA_CAP MSR bits = 0
   Architecture Performance Monitoring Features (0xa/eax):
      version ID                               = 0x0 (0)
      number of counters per logical processor = 0x0 (0)
      bit width of counter                     = 0x0 (0)
      length of EBX bit vector                 = 0x0 (0)
   Architecture Performance Monitoring Features (0xa/ebx):
      core cycle event not available           = false
      instruction retired event not available  = false
      reference cycles event not available     = false
      last-level cache ref event not available = false
      last-level cache miss event not avail    = false
      branch inst retired event not available  = false
      branch mispred retired event not avail   = false
   Architecture Performance Monitoring Features (0xa/edx):
      number of fixed counters    = 0x0 (0)
      bit width of fixed counters = 0x0 (0)
      anythread deprecation       = false
   x2APIC features / processor topology (0xb):
      extended APIC ID                      = 0
      --- level 0 ---
      level number                          = 0x0 (0)
      level type                            = thread (1)
      bit width of level                    = 0x1 (1)
      number of logical processors at level = 0x2 (2)
      --- level 1 ---
      level number                          = 0x1 (1)
      level type                            = core (2)
      bit width of level                    = 0x6 (6)
      number of logical processors at level = 0x40 (64)
   XSAVE features (0xd/0):
      XCR0 lower 32 bits valid bit field mask = 0x000000e7
      XCR0 upper 32 bits valid bit field mask = 0x00000000
         XCR0 supported: x87 state            = true
         XCR0 supported: SSE state            = true
         XCR0 supported: AVX state            = true
         XCR0 supported: MPX BNDREGS          = false
         XCR0 supported: MPX BNDCSR           = false
         XCR0 supported: AVX-512 opmask       = true
         XCR0 supported: AVX-512 ZMM_Hi256    = true
         XCR0 supported: AVX-512 Hi16_ZMM     = true
         IA32_XSS supported: PT state         = false
         XCR0 supported: PKRU state           = false
         XCR0 supported: CET_U state          = false
         XCR0 supported: CET_S state          = false
         IA32_XSS supported: HDC state        = false
      bytes required by fields in XCR0        = 0x00000a80 (2688)
      bytes required by XSAVE/XRSTOR area     = 0x00000a80 (2688)
   XSAVE features (0xd/1):
      XSAVEOPT instruction                        = true
      XSAVEC instruction                          = true
      XGETBV instruction                          = false
      XSAVES/XRSTORS instructions                 = true
      SAVE area size in bytes                     = 0x00000980 (2432)
      IA32_XSS lower 32 bits valid bit field mask = 0x00000000
      IA32_XSS upper 32 bits valid bit field mask = 0x00000000
   AVX/YMM features (0xd/2):
      AVX/YMM save state byte size             = 0x00000100 (256)
      AVX/YMM save state byte offset           = 0x00000240 (576)
      supported in IA32_XSS or XCR0            = XCR0 (user state)
      64-byte alignment in compacted XSAVE     = false
   AVX-512 opmask features (0xd/5):
      AVX-512 opmask save state byte size      = 0x00000040 (64)
      AVX-512 opmask save state byte offset    = 0x00000440 (1088)
      supported in IA32_XSS or XCR0            = XCR0 (user state)
      64-byte alignment in compacted XSAVE     = false
   AVX-512 ZMM_Hi256 features (0xd/6):
      AVX-512 ZMM_Hi256 save state byte size   = 0x00000200 (512)
      AVX-512 ZMM_Hi256 save state byte offset = 0x00000480 (1152)
      supported in IA32_XSS or XCR0            = XCR0 (user state)
      64-byte alignment in compacted XSAVE     = false
   AVX-512 Hi16_ZMM features (0xd/7):
      AVX-512 Hi16_ZMM save state byte size    = 0x00000400 (1024)
      AVX-512 Hi16_ZMM save state byte offset  = 0x00000680 (1664)
      supported in IA32_XSS or XCR0            = XCR0 (user state)
      64-byte alignment in compacted XSAVE     = false
   Quality of Service Monitoring Resource Type (0xf/0):
      Maximum range of RMID = 0
      supports L3 cache QoS monitoring = false
   Resource Director Technology Allocation (0x10/0):
      L3 cache allocation technology supported = false
      L2 cache allocation technology supported = false
      memory bandwidth allocation supported    = false
   0x00000011 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   Software Guard Extensions (SGX) capability (0x12/0):
      SGX1 supported                         = false
      SGX2 supported                         = false
      SGX ENCLV E*VIRTCHILD, ESETCONTEXT     = false
      SGX ENCLS ETRACKC, ERDINFO, ELDBC, ELDUC = false
      MISCSELECT.EXINFO supported: #PF & #GP = false
      MISCSELECT.CPINFO supported: #CP       = false
      MaxEnclaveSize_Not64 (log2)            = 0x0 (0)
      MaxEnclaveSize_64 (log2)               = 0x0 (0)
   0x00000013 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   Intel Processor Trace (0x14):
      IA32_RTIT_CR3_MATCH is accessible      = false
      configurable PSB & cycle-accurate      = false
      IP & TraceStop filtering; PT preserve  = false
      MTC timing packet; suppress COFI-based = false
      PTWRITE support                        = false
      power event trace support              = false
      ToPA output scheme support         = false
      ToPA can hold many output entries  = false
      single-range output scheme support = false
      output to trace transport          = false
      IP payloads have LIP values & CS   = false
   Time Stamp Counter/Core Crystal Clock Information (0x15):
      TSC/clock ratio = 0/0
      nominal core crystal clock = 0 Hz
   hypervisor_id = "Microsoft Hv"
   hypervisor interface identification (0x40000001/eax):
      version = "Hv#1"
   hypervisor system identity (0x40000002):
      build          = 20348
      version        = 10.0
      service pack   = 1
      service branch = 0
      service number = 1462
   hypervisor feature identification (0x40000003/eax):
      VP run time                      = true
      partition reference counter      = true
      basic synIC MSRs                 = true
      synthetic timer MSRs             = true
      APIC access MSRs                 = true
      hypercall MSRs                   = true
      access virtual process index MSR = true
      virtual system reset MSR         = false
      map/unmap statistics pages MSR   = false
      reference TSC access             = true
      guest idle state MSR             = true
      TSC/APIC frequency MSRs          = true
      guest debugging MSRs             = false
   hypervisor partition creation flags (0x40000003/ebx):
      CreatePartitions         = false
      AccessPartitionId        = false
      AccessMemoryPool         = false
      AdjustMessageBuffers     = false
      PostMessages             = true
      SignalEvents             = true
      CreatePort               = false
      ConnectPort              = false
      AccessStats              = false
      Debugging                = false
      CPUManagement            = false
      ConfigureProfiler        = false
      AccessVSM                = true
      AccessVpRegisters        = true
      EnableExtendedHypercalls = true
      StartVirtualProcessor    = true
   hypervisor power management features (0x40000003/ecx):
      maximum process power state = 0x2 (2)
   hypervisor feature identification (0x40000003/edx):
      MWAIT available                          = false
      guest debugging support available        = true
      performance monitor support available    = false
      CPU dynamic partitioning events avail    = false
      hypercall XMM input parameters available = true
      virtual guest idle state available       = true
      hypervisor sleep state available         = false
      query NUMA distance available            = true
      determine timer frequency available      = true
      inject synthetic machine check available = true
      guest crash MSRs available               = true
      debug MSRs available                     = false
      NPIEP available                          = true
      disable hypervisor available             = false
      extended gva ranges for flush virt addrs = true
      hypercall XMM register return available  = true
      sint polling mode available              = true
      hypercall MSR lock available             = true
      use direct synthetic timers              = true
   hypervisor recommendations (0x40000004/eax):
      use hypercalls for AS switches        = false
      use hypercalls for local TLB flushes  = false
      use hypercalls for remote TLB flushes = true
      use MSRs to access EOI, ICR, TPR      = false
      use MSRs to initiate system RESET     = false
      use relaxed timing                    = true
      use DMA remapping                     = false
      use interrupt remapping               = false
      use x2APIC MSRs                       = false
      deprecate AutoEOI                     = true
      use SyntheticClusterIpi hypercall     = true
      use ExProcessorMasks                  = true
      hypervisor is nested with Hyper-V     = false
      use INT for MBEC system calls         = false
      use enlightened VMCS interface        = true
      maximum number of spinlock retry attempts = 0xfff (4095)
   hypervisor implementation limits (0x40000005):
      maximum number of virtual processors                       = 0xf0 (240)
      maximum number of logical processors                       = 0x400 (1024)
      maximum number of physical interrupt vectors for remapping = 0x5d00 (23808)
   hypervisor hardware features used (0x40000006/eax):
      APIC overlay assist              = true
      MSR bitmaps                      = true
      performance counters             = true
      second-level address translation = true
      DMA remapping                    = false
      interrupt remapping              = false
      memory patrol scrubber           = false
      DMA protection                   = false
      HPET requested                   = false
      synthetic timers are volatile    = false
   hypervisor root partition enlightenments (0x40000007):
      StartLogicalProcessor      = false
      CreateRootvirtualProcessor = false
      ProcessorPowerManagement = false
      MwaitIdleStates          = false
      LogicalProcessorIdling   = false
   hypervisor shared virtual memory (0x40000008):
      SvmSupported            = false
      MaxPasidSpacePasidCount = 0x0 (0)
   hypervisor nested hypervisor features (0x40000009):
      AccessSynicRegs               = false
      AccessIntrCtrlRegs            = false
      AccessHypercallMsrs           = false
      AccessVpIndex                 = false
      AccessReenlightenmentControls = false
      XmmRegistersForFastHypercallAvailable = false
      FastHypercallOutputAvailable          = false
      SintPoillingModeAvailable             = false
   hypervisor nested virtualization features (0x4000000a):
      enlightened VMCS version (low)          = 0x1 (1)
      enlightened VMCS version (high)         = 0x1 (1)
      direct virtual flush hypercalls support = true
      HvFlushGuestPhysicalAddress* hypercalls = true
      enlightened MSR bitmap support          = true
   extended feature flags (0x80000001/edx):
      SYSCALL and SYSRET instructions        = true
      execution disable                      = true
      1-GB large page support                = true
      RDTSCP                                 = true
      64-bit extensions technology available = true
   Intel feature flags (0x80000001/ecx):
      LAHF/SAHF supported in 64-bit mode     = true
      LZCNT advanced bit manipulation        = true
      3DNow! PREFETCH/PREFETCHW instructions = true
   brand = "Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz"
   L1 TLB/cache information: 2M/4M pages & L1 TLB (0x80000005/eax):
      instruction # entries     = 0x0 (0)
      instruction associativity = 0x0 (0)
      data # entries            = 0x0 (0)
      data associativity        = 0x0 (0)
   L1 TLB/cache information: 4K pages & L1 TLB (0x80000005/ebx):
      instruction # entries     = 0x0 (0)
      instruction associativity = 0x0 (0)
      data # entries            = 0x0 (0)
      data associativity        = 0x0 (0)
   L1 data cache information (0x80000005/ecx):
      line size (bytes) = 0x0 (0)
      lines per tag     = 0x0 (0)
      associativity     = 0x0 (0)
      size (KB)         = 0x0 (0)
   L1 instruction cache information (0x80000005/edx):
      line size (bytes) = 0x0 (0)
      lines per tag     = 0x0 (0)
      associativity     = 0x0 (0)
      size (KB)         = 0x0 (0)
   L2 TLB/cache information: 2M/4M pages & L2 TLB (0x80000006/eax):
      instruction # entries     = 0x0 (0)
      instruction associativity = L2 off (0)
      data # entries            = 0x0 (0)
      data associativity        = L2 off (0)
   L2 TLB/cache information: 4K pages & L2 TLB (0x80000006/ebx):
      instruction # entries     = 0x0 (0)
      instruction associativity = L2 off (0)
      data # entries            = 0x0 (0)
      data associativity        = L2 off (0)
   L2 unified cache information (0x80000006/ecx):
      line size (bytes) = 0x40 (64)
      lines per tag     = 0x0 (0)
      associativity     = 8-way (6)
      size (KB)         = 0x100 (256)
   L3 cache information (0x80000006/edx):
      line size (bytes)     = 0x0 (0)
      lines per tag         = 0x0 (0)
      associativity         = L2 off (0)
      size (in 512KB units) = 0x0 (0)
   RAS Capability (0x80000007/ebx):
      MCA overflow recovery support = false
      SUCCOR support                = false
      HWA: hardware assert support  = false
      scalable MCA support          = false
   Advanced Power Management Features (0x80000007/ecx):
      CmpUnitPwrSampleTimeRatio = 0x0 (0)
   Advanced Power Management Features (0x80000007/edx):
      TS: temperature sensing diode           = false
      FID: frequency ID control               = false
      VID: voltage ID control                 = false
      TTP: thermal trip                       = false
      TM: thermal monitor                     = false
      STC: software thermal control           = false
      100 MHz multiplier control              = false
      hardware P-State control                = false
      TscInvariant                            = false
      CPB: core performance boost             = false
      read-only effective frequency interface = false
      processor feedback interface            = false
      APM power reporting                     = false
      connected standby                       = false
      RAPL: running average power limit       = false
   Physical Address and Linear Address Size (0x80000008/eax):
      maximum physical address bits         = 0x2e (46)
      maximum linear (virtual) address bits = 0x30 (48)
      maximum guest physical address bits   = 0x0 (0)
   Extended Feature Extensions ID (0x80000008/ebx):
      CLZERO instruction                       = false
      instructions retired count support       = false
      always save/restore error pointers       = false
      RDPRU instruction                        = false
      memory bandwidth enforcement             = false
      WBNOINVD instruction                     = false
      IBPB: indirect branch prediction barrier = false
      IBRS: indirect branch restr speculation  = false
      STIBP: 1 thr indirect branch predictor   = false
      STIBP always on preferred mode           = false
      ppin processor id number supported       = false
      SSBD: speculative store bypass disable   = false
      virtualized SSBD                         = false
      SSBD fixed in hardware                   = false
   Size Identifiers (0x80000008/ecx):
      number of CPU cores                 = 0x1 (1)
      ApicIdCoreIdSize                    = 0x0 (0)
      performance time-stamp counter size = 0x0 (0)
   Feature Extended Size (0x80000008/edx):
      RDPRU instruction max input support = 0x0 (0)
   (multi-processing synth) = multi-core (c=32), hyper-threaded (t=2)
   (multi-processing method) = Intel leaf 0xb
   (APIC widths synth): CORE_width=6 SMT_width=1
   (APIC synth): PKG_ID=0 CORE_ID=0 SMT_ID=0
   (uarch synth) = Intel Sunny Cove {Sunny Cove}, 10nm
   (synth) = Intel Core (Ice Lake) [Sunny Cove] {Sunny Cove}, 10nm

^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology()
  2023-07-31 21:27               ` Michael Kelley (LINUX)
@ 2023-07-31 22:12                 ` Thomas Gleixner
  2023-08-01 22:25                   ` Thomas Gleixner
  0 siblings, 1 reply; 72+ messages in thread
From: Thomas Gleixner @ 2023-07-31 22:12 UTC (permalink / raw)
  To: Michael Kelley (LINUX), Peter Zijlstra
  Cc: LKML, x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

On Mon, Jul 31 2023 at 21:27, Michael Kelley wrote:
> From: Thomas Gleixner <tglx@linutronix.de> Sent: Monday, July 31, 2023 1:49 PM
>> Is there any indication in some other CPUID leaf which lets us deduce this
>> wreckage?
>
> You can detect being a Hyper-V guest with leaf 0x40000000.  See Linux
> kernel function ms_hyperv_platform().  But I'm not aware of anything
> to indicate that a specific Hyper-V VM has the APIC numbering problem
> vs. doesn't have the problem.

That's what I said :) here:

>> I don't think the hypervisor space (0x40000xx) has anything helpful, but
>> staring at the architectural ones provided by hyper-V to the guest might
>> give us an hint. Can you provide a cpuid dump for the boot CPU please?
>> 
>
> I'm not sure if you want the raw or decoded output.  Here's both.

Either way is fine.

Clearly the hyper-v BIOS people put a lot of thoughts into this:

>    x2APIC features / processor topology (0xb):
>       extended APIC ID                      = 0
>       --- level 0 ---
>       level number                          = 0x0 (0)
>       level type                            = thread (1)
>       bit width of level                    = 0x1 (1)
>       number of logical processors at level = 0x2 (2)
>       --- level 1 ---
>       level number                          = 0x1 (1)
>       level type                            = core (2)
>       bit width of level                    = 0x6 (6)
>       number of logical processors at level = 0x40 (64)

FAIL:                                           ^^^^^

While that field is not meant for topology evaluation it is at least
expected to tell the actual number of logical processors at that level
which are actually available. 

The CPUID APIC ID aka initial_apicid clearly tells that the topology has
36 logical CPUs in package 0 and 36 in package 1 according to your
table.

On real hardware this looks like this:

      --- level 1 ---
      level number                          = 0x1 (1)
      level type                            = core (2)
      bit width of level                    = 0x6 (6)
      number of logical processors at level = 0x38 (56)

Which corresponds to reality and is consistent. But sure, consistency is
overrated.

Thanks,

        tglx





^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology()
  2023-07-28 12:13 ` [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology() Thomas Gleixner
  2023-07-29  0:02   ` Sohil Mehta
  2023-07-31  4:05   ` Michael Kelley (LINUX)
@ 2023-08-01  7:05   ` Gautham R. Shenoy
  2023-08-01  7:34     ` Thomas Gleixner
  2 siblings, 1 reply; 72+ messages in thread
From: Gautham R. Shenoy @ 2023-08-01  7:05 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

Hello Thomas,

On Fri, Jul 28, 2023 at 02:13:08PM +0200, Thomas Gleixner wrote:
> Topology evaluation is a complete disaster and impenetrable mess. It's
> scattered all over the place with some vendor implementatins doing early
> evaluation and some not. The most horrific part is the permanent
> overwriting of smt_max_siblings and __max_die_per_package, instead of
> establishing them once on the boot CPU and validating the result on the
> APs.
> 
> The goals are:
> 
>   - One topology evaluation entry point
> 
>   - Proper sharing of pointlessly duplicated code
> 
>   - Proper structuring of the evaluation logic and preferences.
> 
>   - Evaluating important system wide information only once on the boot CPU
> 
>   - Making the 0xb/0x1f leaf parsing less convoluted and actually fixing
>     the short comings of leaf 0x1f evaluation.
> 
> Start to consolidate the topology evaluation code by providing the entry
> points for the early boot CPU evaluation and for the final parsing on the
> boot CPU and the APs.
> 
> Move the trivial pieces into that new code:
> 
>    - The initialization of cpuinfo_x86::topo
> 
>    - The evaluation of CPUID leaf 1, which presets topo::initial_apicid
> 
>    - topo_apicid is set to topo::initial_apicid when invoked from early
>      boot. When invoked for the final evaluation on the boot CPU it reads
>      the actual APIC ID, which makes apic_get_initial_apicid() obsolete
>      once everything is converted over.
> 
> Provide a temporary helper function topo_converted() which shields off the
> not yet converted CPU vendors from invoking code which would break them.
> This shielding covers all vendor CPUs which support SMP, but not the
> historical pure UP ones as they only need the topology info init and
> eventually the initial APIC initialization.
> 
> Provide two new members in cpuinfo_x86::topo to store the maximum number of
> SMT siblings and the number of dies per package and add them to the debugfs
> readout. These two members will be used to populate this information on the
> boot CPU and to validate the APs against it.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  arch/x86/include/asm/topology.h       |   19 +++
>  arch/x86/kernel/cpu/Makefile          |    3 
>  arch/x86/kernel/cpu/common.c          |   23 +---
>  arch/x86/kernel/cpu/cpu.h             |    6 +
>  arch/x86/kernel/cpu/debugfs.c         |   37 ++++++
>  arch/x86/kernel/cpu/topology.h        |   32 +++++
>  arch/x86/kernel/cpu/topology_common.c |  187 ++++++++++++++++++++++++++++++++++
>  7 files changed, 290 insertions(+), 17 deletions(-)
> 
> --- a/arch/x86/include/asm/topology.h
> +++ b/arch/x86/include/asm/topology.h
> @@ -102,6 +102,25 @@ static inline void setup_node_to_cpumask
>  
>  #include <asm-generic/topology.h>
>  
> +/* Topology information */
> +enum x86_topology_domains {
> +	TOPO_SMT_DOMAIN,
> +	TOPO_CORE_DOMAIN,
> +	TOPO_MODULE_DOMAIN,
> +	TOPO_TILE_DOMAIN,
> +	TOPO_DIE_DOMAIN,
> +	TOPO_PKG_DOMAIN,
> +	TOPO_ROOT_DOMAIN,
> +	TOPO_MAX_DOMAIN,
> +};
> +

[..snip..]

> +static void topo_set_ids(struct topo_scan *tscan)
> +{
> +	struct cpuinfo_x86 *c = tscan->c;
> +	u32 apicid = c->topo.apicid;
> +
> +	c->topo.pkg_id = topo_shift_apicid(apicid, TOPO_ROOT_DOMAIN);

Shouldn't this use TOPO_PKG_DOMAIN instead of TOPO_ROOT_DOMAIN ?


> +	c->topo.die_id = topo_shift_apicid(apicid, TOPO_DIE_DOMAIN);
> +
> +	/* Relative core ID */
> +	c->topo.core_id = topo_relative_domain_id(apicid, TOPO_CORE_DOMAIN);
> +}
> +

--
Thanks and Regards
gautham.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology()
  2023-08-01  7:05   ` Gautham R. Shenoy
@ 2023-08-01  7:34     ` Thomas Gleixner
  0 siblings, 0 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-08-01  7:34 UTC (permalink / raw)
  To: Gautham R. Shenoy
  Cc: LKML, x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson

On Tue, Aug 01 2023 at 12:35, Gautham R. Shenoy wrote:
> On Fri, Jul 28, 2023 at 02:13:08PM +0200, Thomas Gleixner wrote:
>> +static void topo_set_ids(struct topo_scan *tscan)
>> +{
>> +	struct cpuinfo_x86 *c = tscan->c;
>> +	u32 apicid = c->topo.apicid;
>> +
>> +	c->topo.pkg_id = topo_shift_apicid(apicid, TOPO_ROOT_DOMAIN);
>
> Shouldn't this use TOPO_PKG_DOMAIN instead of TOPO_ROOT_DOMAIN ?

Yup. It does not make a difference in that case. That's why I didn't
notice, but let me fix this for conistency sake.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology()
  2023-07-31 22:12                 ` Thomas Gleixner
@ 2023-08-01 22:25                   ` Thomas Gleixner
  2023-08-01 22:35                     ` Andrew Cooper
  2023-08-02 14:43                     ` Michael Kelley (LINUX)
  0 siblings, 2 replies; 72+ messages in thread
From: Thomas Gleixner @ 2023-08-01 22:25 UTC (permalink / raw)
  To: Michael Kelley (LINUX), Peter Zijlstra
  Cc: LKML, x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson, linux-hyperv, Linus Torvalds,
	Greg Kroah-Hartman

Michael!

On Tue, Aug 01 2023 at 00:12, Thomas Gleixner wrote:
> On Mon, Jul 31 2023 at 21:27, Michael Kelley wrote:
> Clearly the hyper-v BIOS people put a lot of thoughts into this:
>
>>    x2APIC features / processor topology (0xb):
>>       extended APIC ID                      = 0
>>       --- level 0 ---
>>       level number                          = 0x0 (0)
>>       level type                            = thread (1)
>>       bit width of level                    = 0x1 (1)
>>       number of logical processors at level = 0x2 (2)
>>       --- level 1 ---
>>       level number                          = 0x1 (1)
>>       level type                            = core (2)
>>       bit width of level                    = 0x6 (6)
>>       number of logical processors at level = 0x40 (64)
>
> FAIL:                                           ^^^^^
>
> While that field is not meant for topology evaluation it is at least
> expected to tell the actual number of logical processors at that level
> which are actually available. 
>
> The CPUID APIC ID aka initial_apicid clearly tells that the topology has
> 36 logical CPUs in package 0 and 36 in package 1 according to your
> table.
>
> On real hardware this looks like this:
>
>       --- level 1 ---
>       level number                          = 0x1 (1)
>       level type                            = core (2)
>       bit width of level                    = 0x6 (6)
>       number of logical processors at level = 0x38 (56)
>
> Which corresponds to reality and is consistent. But sure, consistency is
> overrated.

So I looked really hard to find some hint how to detect this situation
on the boot CPU, which allows us to mitigate it, but there is none at
all.

So we are caught between a rock and a hard place, which provides us two
mutually exclusive options to chose from:

  1) Have a sane topology evaluation mechanism which solves the known
     problems of hybrid systems, wrong sizing estimates and other
     unpleasantries.

  2) Support the Hyper-V BIOS trainwreck forever.

Unsurprisingly #2 is not really an option as #1 is a crucial issue for
the kernel and we need it resolved urgently as of yesterday.

So while I'm definitely a strong supporter of no-regression policy, I
have to make an argument here why this particular issue is _not_
covered:

 1) Hyper-V BIOS/firmware violates the firmware specification and
    requirements which are clearly spelled out in the SDM.

 2) This violatation is reported on every boot with one promiment
    message per brought up AP where the initial APIC ID as provided by
    CPUID leaf 0xB deviates from the APIC ID read from "hardware", which is
    also provided by MADT starting with CPU 36 in the provided example:

    "[FIRMWARE BUG] CPU36: APIC id mismatch. Firmware: 40 APIC: 24"

    repeating itself up to CPU71 with the relevant diverging APIC IDs.

    At least that's what the upstream kernel produces according to
    validate_apic_and_package_id() in such an situation.

 3) This is known for years and the Hyper-V Linux team tried to get this
    resolved, but obviously their arguments fell on deaf ears.

    IOW, the firmware BUG message has been ignored willfully for years
    due to "works for me, why should I care?" attitude.

Seriously, kernel development cannot be held hostage forever by the
wilful ignorance of a BIOS team, which refuses to adhere to
specifications and defines their own world order.

The x86 maintainer team is chosing the lesser of two evils and lets
those who created the problem and refused to resolve it deal with the
outcome.

Just to clarify. This is not preventing affected guests from booting.
The worst consequence is a slight performance regression because the
firmware provided topology information is not matching reality and
therefore the scheduler placement vs. L3 affinity sucks. That's clearly
not a kernel problem.

I'm happy to aid accelerating this thought process by elevating the
existing pr_err(FW_BUG....) to a solid WARN_ON_ONCE(). See below.

Thanks,

        tglx
---
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1688,7 +1688,7 @@ static void validate_apic_and_package_id
 
 	apicid = apic->cpu_present_to_apicid(cpu);
 
-	if (apicid != c->topo.apicid) {
+	if (WARN_ON_ONCE(apicid != c->topo.apicid)) {
 		pr_err(FW_BUG "CPU%u: APIC id mismatch. Firmware: %x APIC: %x\n",
 		       cpu, apicid, c->topo.initial_apicid);
 	}

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology()
  2023-08-01 22:25                   ` Thomas Gleixner
@ 2023-08-01 22:35                     ` Andrew Cooper
  2023-08-02 14:43                     ` Michael Kelley (LINUX)
  1 sibling, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2023-08-01 22:35 UTC (permalink / raw)
  To: Thomas Gleixner, Michael Kelley (LINUX), Peter Zijlstra
  Cc: LKML, x86, Tom Lendacky, Arjan van de Ven, James E.J. Bottomley,
	Dick Kennedy, James Smart, Martin K. Petersen, linux-scsi,
	Guenter Roeck, linux-hwmon, Jean Delvare, Huang Rui,
	Juergen Gross, Steve Wahl, Mike Travis, Dimitri Sivanich,
	Russ Anderson, linux-hyperv, Linus Torvalds, Greg Kroah-Hartman

On 01/08/2023 11:25 pm, Thomas Gleixner wrote:
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -1688,7 +1688,7 @@ static void validate_apic_and_package_id
>  
>  	apicid = apic->cpu_present_to_apicid(cpu);
>  
> -	if (apicid != c->topo.apicid) {
> +	if (WARN_ON_ONCE(apicid != c->topo.apicid)) {
>  		pr_err(FW_BUG "CPU%u: APIC id mismatch. Firmware: %x APIC: %x\n",

While you're fixing this, care to remove the chaotic-evil path of mixing
%u and %x with no 0x prefix?

~Andrew

^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology()
  2023-08-01 22:25                   ` Thomas Gleixner
  2023-08-01 22:35                     ` Andrew Cooper
@ 2023-08-02 14:43                     ` Michael Kelley (LINUX)
  1 sibling, 0 replies; 72+ messages in thread
From: Michael Kelley (LINUX) @ 2023-08-02 14:43 UTC (permalink / raw)
  To: Thomas Gleixner, Peter Zijlstra
  Cc: LKML, x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
	Dimitri Sivanich, Russ Anderson, linux-hyperv, Linus Torvalds,
	Greg Kroah-Hartman

From: Thomas Gleixner <tglx@linutronix.de> Sent: Tuesday, August 1, 2023 3:25 PM
> 
> Michael!
> 
> On Tue, Aug 01 2023 at 00:12, Thomas Gleixner wrote:
> > On Mon, Jul 31 2023 at 21:27, Michael Kelley wrote:
> > Clearly the hyper-v BIOS people put a lot of thoughts into this:
> >
> >>    x2APIC features / processor topology (0xb):
> >>       extended APIC ID                      = 0
> >>       --- level 0 ---
> >>       level number                          = 0x0 (0)
> >>       level type                            = thread (1)
> >>       bit width of level                    = 0x1 (1)
> >>       number of logical processors at level = 0x2 (2)
> >>       --- level 1 ---
> >>       level number                          = 0x1 (1)
> >>       level type                            = core (2)
> >>       bit width of level                    = 0x6 (6)
> >>       number of logical processors at level = 0x40 (64)
> >
> > FAIL:                                           ^^^^^
> >
> > While that field is not meant for topology evaluation it is at least
> > expected to tell the actual number of logical processors at that level
> > which are actually available.
> >
> > The CPUID APIC ID aka initial_apicid clearly tells that the topology has
> > 36 logical CPUs in package 0 and 36 in package 1 according to your
> > table.
> >
> > On real hardware this looks like this:
> >
> >       --- level 1 ---
> >       level number                          = 0x1 (1)
> >       level type                            = core (2)
> >       bit width of level                    = 0x6 (6)
> >       number of logical processors at level = 0x38 (56)
> >
> > Which corresponds to reality and is consistent. But sure, consistency is
> > overrated.
> 
> So I looked really hard to find some hint how to detect this situation
> on the boot CPU, which allows us to mitigate it, but there is none at
> all.
> 
> So we are caught between a rock and a hard place, which provides us two
> mutually exclusive options to chose from:
> 
>   1) Have a sane topology evaluation mechanism which solves the known
>      problems of hybrid systems, wrong sizing estimates and other
>      unpleasantries.
> 
>   2) Support the Hyper-V BIOS trainwreck forever.
> 
> Unsurprisingly #2 is not really an option as #1 is a crucial issue for
> the kernel and we need it resolved urgently as of yesterday.
> 
> So while I'm definitely a strong supporter of no-regression policy, I
> have to make an argument here why this particular issue is _not_
> covered:
> 
>  1) Hyper-V BIOS/firmware violates the firmware specification and
>     requirements which are clearly spelled out in the SDM.
> 
>  2) This violatation is reported on every boot with one promiment
>     message per brought up AP where the initial APIC ID as provided by
>     CPUID leaf 0xB deviates from the APIC ID read from "hardware", which is
>     also provided by MADT starting with CPU 36 in the provided example:
> 
>     "[FIRMWARE BUG] CPU36: APIC id mismatch. Firmware: 40 APIC: 24"
> 
>     repeating itself up to CPU71 with the relevant diverging APIC IDs.
> 
>     At least that's what the upstream kernel produces according to
>     validate_apic_and_package_id() in such an situation.
> 
>  3) This is known for years and the Hyper-V Linux team tried to get this
>     resolved, but obviously their arguments fell on deaf ears.
> 
>     IOW, the firmware BUG message has been ignored willfully for years
>     due to "works for me, why should I care?" attitude.
> 
> Seriously, kernel development cannot be held hostage forever by the
> wilful ignorance of a BIOS team, which refuses to adhere to
> specifications and defines their own world order.
> 
> The x86 maintainer team is chosing the lesser of two evils and lets
> those who created the problem and refused to resolve it deal with the
> outcome.

Fair enough.  I don't have any basis to argue otherwise.  I'm in
discussions with the Hyper-V team about getting it fully fixed in
Hyper-V, and it looks like there's some movement to make it happen.

> 
> Just to clarify. This is not preventing affected guests from booting.
> The worst consequence is a slight performance regression because the
> firmware provided topology information is not matching reality and
> therefore the scheduler placement vs. L3 affinity sucks. That's clearly
> not a kernel problem.

Yes, if Linux will still boots and runs, that helps.  Then it really is up the
(virtual) firmware in Hyper-V to provide the correct topology information
so performance is as expected.

> 
> I'm happy to aid accelerating this thought process by elevating the
> existing pr_err(FW_BUG....) to a solid WARN_ON_ONCE(). See below.

Your choice.  In this particular case, it won't make a difference either
way.

Michael

> 
> Thanks,
> 
>         tglx
> ---
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -1688,7 +1688,7 @@ static void validate_apic_and_package_id
> 
>  	apicid = apic->cpu_present_to_apicid(cpu);
> 
> -	if (apicid != c->topo.apicid) {
> +	if (WARN_ON_ONCE(apicid != c->topo.apicid)) {
>  		pr_err(FW_BUG "CPU%u: APIC id mismatch. Firmware: %x APIC: %x\n",
>  		       cpu, apicid, c->topo.initial_apicid);
>  	}

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [patch v2 15/38] x86/apic: Use u32 for phys_pkg_id()
  2023-07-28 12:13 ` [patch v2 15/38] x86/apic: Use u32 for phys_pkg_id() Thomas Gleixner
@ 2023-08-08 20:14   ` Steve Wahl
  0 siblings, 0 replies; 72+ messages in thread
From: Steve Wahl @ 2023-08-08 20:14 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl,
	Dimitri Sivanich, Russ Anderson

Reviewed-by: Steve Wahl <steve.wahl@hpe.com>

On Fri, Jul 28, 2023 at 02:13:01PM +0200, Thomas Gleixner wrote:
> APIC IDs are used with random data types u16, u32, int, unsigned int,
> unsigned long.
> 
> Make it all consistently use u32 because that reflects the hardware
> register width even if that callback going to be removed soonish.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  arch/x86/include/asm/apic.h          |    2 +-
>  arch/x86/kernel/apic/apic_flat_64.c  |    2 +-
>  arch/x86/kernel/apic/apic_noop.c     |    2 +-
>  arch/x86/kernel/apic/apic_numachip.c |    2 +-
>  arch/x86/kernel/apic/bigsmp_32.c     |    2 +-
>  arch/x86/kernel/apic/local.h         |    2 +-
>  arch/x86/kernel/apic/probe_32.c      |    2 +-
>  arch/x86/kernel/apic/x2apic_phys.c   |    2 +-
>  arch/x86/kernel/apic/x2apic_uv_x.c   |    2 +-
>  arch/x86/kernel/vsmp_64.c            |    2 +-
>  arch/x86/xen/apic.c                  |    2 +-
>  11 files changed, 11 insertions(+), 11 deletions(-)
> 
> --- a/arch/x86/include/asm/apic.h
> +++ b/arch/x86/include/asm/apic.h
> @@ -296,7 +296,7 @@ struct apic {
>  	void	(*init_apic_ldr)(void);
>  	void	(*ioapic_phys_id_map)(physid_mask_t *phys_map, physid_mask_t *retmap);
>  	u32	(*cpu_present_to_apicid)(int mps_cpu);
> -	int	(*phys_pkg_id)(int cpuid_apic, int index_msb);
> +	u32	(*phys_pkg_id)(u32 cpuid_apic, int index_msb);
>  
>  	u32	(*get_apic_id)(unsigned long x);
>  	u32	(*set_apic_id)(unsigned int id);
> --- a/arch/x86/kernel/apic/apic_flat_64.c
> +++ b/arch/x86/kernel/apic/apic_flat_64.c
> @@ -66,7 +66,7 @@ static u32 set_apic_id(unsigned int id)
>  	return (id & 0xFF) << 24;
>  }
>  
> -static int flat_phys_pkg_id(int initial_apic_id, int index_msb)
> +static u32 flat_phys_pkg_id(u32 initial_apic_id, int index_msb)
>  {
>  	return initial_apic_id >> index_msb;
>  }
> --- a/arch/x86/kernel/apic/apic_noop.c
> +++ b/arch/x86/kernel/apic/apic_noop.c
> @@ -29,7 +29,7 @@ static void noop_send_IPI_self(int vecto
>  static void noop_apic_icr_write(u32 low, u32 id) { }
>  static int noop_wakeup_secondary_cpu(int apicid, unsigned long start_eip) { return -1; }
>  static u64 noop_apic_icr_read(void) { return 0; }
> -static int noop_phys_pkg_id(int cpuid_apic, int index_msb) { return 0; }
> +static u32 noop_phys_pkg_id(u32 cpuid_apic, int index_msb) { return 0; }
>  static unsigned int noop_get_apic_id(unsigned long x) { return 0; }
>  static void noop_apic_eoi(void) { }
>  
> --- a/arch/x86/kernel/apic/apic_numachip.c
> +++ b/arch/x86/kernel/apic/apic_numachip.c
> @@ -56,7 +56,7 @@ static u32 numachip2_set_apic_id(unsigne
>  	return id << 24;
>  }
>  
> -static int numachip_phys_pkg_id(int initial_apic_id, int index_msb)
> +static u32 numachip_phys_pkg_id(u32 initial_apic_id, int index_msb)
>  {
>  	return initial_apic_id >> index_msb;
>  }
> --- a/arch/x86/kernel/apic/bigsmp_32.c
> +++ b/arch/x86/kernel/apic/bigsmp_32.c
> @@ -29,7 +29,7 @@ static void bigsmp_ioapic_phys_id_map(ph
>  	physids_promote(0xFFL, retmap);
>  }
>  
> -static int bigsmp_phys_pkg_id(int cpuid_apic, int index_msb)
> +static u32 bigsmp_phys_pkg_id(u32 cpuid_apic, int index_msb)
>  {
>  	return cpuid_apic >> index_msb;
>  }
> --- a/arch/x86/kernel/apic/local.h
> +++ b/arch/x86/kernel/apic/local.h
> @@ -17,7 +17,7 @@
>  void __x2apic_send_IPI_dest(unsigned int apicid, int vector, unsigned int dest);
>  unsigned int x2apic_get_apic_id(unsigned long id);
>  u32 x2apic_set_apic_id(unsigned int id);
> -int x2apic_phys_pkg_id(int initial_apicid, int index_msb);
> +u32 x2apic_phys_pkg_id(u32 initial_apicid, int index_msb);
>  
>  void x2apic_send_IPI_all(int vector);
>  void x2apic_send_IPI_allbutself(int vector);
> --- a/arch/x86/kernel/apic/probe_32.c
> +++ b/arch/x86/kernel/apic/probe_32.c
> @@ -18,7 +18,7 @@
>  
>  #include "local.h"
>  
> -static int default_phys_pkg_id(int cpuid_apic, int index_msb)
> +static u32 default_phys_pkg_id(u32 cpuid_apic, int index_msb)
>  {
>  	return cpuid_apic >> index_msb;
>  }
> --- a/arch/x86/kernel/apic/x2apic_phys.c
> +++ b/arch/x86/kernel/apic/x2apic_phys.c
> @@ -134,7 +134,7 @@ u32 x2apic_set_apic_id(unsigned int id)
>  	return id;
>  }
>  
> -int x2apic_phys_pkg_id(int initial_apicid, int index_msb)
> +u32 x2apic_phys_pkg_id(u32 initial_apicid, int index_msb)
>  {
>  	return initial_apicid >> index_msb;
>  }
> --- a/arch/x86/kernel/apic/x2apic_uv_x.c
> +++ b/arch/x86/kernel/apic/x2apic_uv_x.c
> @@ -790,7 +790,7 @@ static unsigned int uv_read_apic_id(void
>  	return x2apic_get_apic_id(apic_read(APIC_ID));
>  }
>  
> -static int uv_phys_pkg_id(int initial_apicid, int index_msb)
> +static u32 uv_phys_pkg_id(u32 initial_apicid, int index_msb)
>  {
>  	return uv_read_apic_id() >> index_msb;
>  }
> --- a/arch/x86/kernel/vsmp_64.c
> +++ b/arch/x86/kernel/vsmp_64.c
> @@ -127,7 +127,7 @@ static void __init vsmp_cap_cpus(void)
>  #endif
>  }
>  
> -static int apicid_phys_pkg_id(int initial_apic_id, int index_msb)
> +static u32 apicid_phys_pkg_id(u32 initial_apic_id, int index_msb)
>  {
>  	return read_apic_id() >> index_msb;
>  }
> --- a/arch/x86/xen/apic.c
> +++ b/arch/x86/xen/apic.c
> @@ -110,7 +110,7 @@ static int xen_madt_oem_check(char *oem_
>  	return xen_pv_domain();
>  }
>  
> -static int xen_phys_pkg_id(int initial_apic_id, int index_msb)
> +static u32 xen_phys_pkg_id(u32 initial_apic_id, int index_msb)
>  {
>  	return initial_apic_id >> index_msb;
>  }
> 

-- 
Steve Wahl, Hewlett Packard Enterprise

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [patch v2 16/38] x86/apic: Use u32 for [gs]et_apic_id()
  2023-07-28 12:13 ` [patch v2 16/38] x86/apic: Use u32 for [gs]et_apic_id() Thomas Gleixner
@ 2023-08-08 20:15   ` Steve Wahl
  0 siblings, 0 replies; 72+ messages in thread
From: Steve Wahl @ 2023-08-08 20:15 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl,
	Dimitri Sivanich, Russ Anderson

Reviewed-by: Steve Wahl <steve.wahl@hpe.com>

On Fri, Jul 28, 2023 at 02:13:02PM +0200, Thomas Gleixner wrote:
> APIC IDs are used with random data types u16, u32, int, unsigned int,
> unsigned long.
> 
> Make it all consistently use u32 because that reflects the hardware
> register width.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  arch/x86/include/asm/apic.h          |   14 ++------------
>  arch/x86/kernel/apic/apic_flat_64.c  |    4 ++--
>  arch/x86/kernel/apic/apic_noop.c     |    2 +-
>  arch/x86/kernel/apic/apic_numachip.c |    8 ++++----
>  arch/x86/kernel/apic/bigsmp_32.c     |    2 +-
>  arch/x86/kernel/apic/local.h         |    4 ++--
>  arch/x86/kernel/apic/probe_32.c      |   10 ++++++++++
>  arch/x86/kernel/apic/x2apic_phys.c   |    4 ++--
>  arch/x86/kernel/apic/x2apic_uv_x.c   |    2 +-
>  arch/x86/xen/apic.c                  |    4 ++--
>  10 files changed, 27 insertions(+), 27 deletions(-)
> 
> --- a/arch/x86/include/asm/apic.h
> +++ b/arch/x86/include/asm/apic.h
> @@ -298,8 +298,8 @@ struct apic {
>  	u32	(*cpu_present_to_apicid)(int mps_cpu);
>  	u32	(*phys_pkg_id)(u32 cpuid_apic, int index_msb);
>  
> -	u32	(*get_apic_id)(unsigned long x);
> -	u32	(*set_apic_id)(unsigned int id);
> +	u32	(*get_apic_id)(u32 id);
> +	u32	(*set_apic_id)(u32 apicid);
>  
>  	/* wakeup_secondary_cpu */
>  	int	(*wakeup_secondary_cpu)(int apicid, unsigned long start_eip);
> @@ -493,16 +493,6 @@ static inline bool lapic_vector_set_in_i
>  	return !!(irr & (1U << (vector % 32)));
>  }
>  
> -static inline unsigned default_get_apic_id(unsigned long x)
> -{
> -	unsigned int ver = GET_APIC_VERSION(apic_read(APIC_LVR));
> -
> -	if (APIC_XAPIC(ver) || boot_cpu_has(X86_FEATURE_EXTD_APICID))
> -		return (x >> 24) & 0xFF;
> -	else
> -		return (x >> 24) & 0x0F;
> -}
> -
>  /*
>   * Warm reset vector position:
>   */
> --- a/arch/x86/kernel/apic/apic_flat_64.c
> +++ b/arch/x86/kernel/apic/apic_flat_64.c
> @@ -56,12 +56,12 @@ flat_send_IPI_mask_allbutself(const stru
>  	_flat_send_IPI_mask(mask, vector);
>  }
>  
> -static unsigned int flat_get_apic_id(unsigned long x)
> +static u32 flat_get_apic_id(u32 x)
>  {
>  	return (x >> 24) & 0xFF;
>  }
>  
> -static u32 set_apic_id(unsigned int id)
> +static u32 set_apic_id(u32 id)
>  {
>  	return (id & 0xFF) << 24;
>  }
> --- a/arch/x86/kernel/apic/apic_noop.c
> +++ b/arch/x86/kernel/apic/apic_noop.c
> @@ -30,7 +30,7 @@ static void noop_apic_icr_write(u32 low,
>  static int noop_wakeup_secondary_cpu(int apicid, unsigned long start_eip) { return -1; }
>  static u64 noop_apic_icr_read(void) { return 0; }
>  static u32 noop_phys_pkg_id(u32 cpuid_apic, int index_msb) { return 0; }
> -static unsigned int noop_get_apic_id(unsigned long x) { return 0; }
> +static u32 noop_get_apic_id(u32 apicid) { return 0; }
>  static void noop_apic_eoi(void) { }
>  
>  static u32 noop_apic_read(u32 reg)
> --- a/arch/x86/kernel/apic/apic_numachip.c
> +++ b/arch/x86/kernel/apic/apic_numachip.c
> @@ -25,7 +25,7 @@ static const struct apic apic_numachip1;
>  static const struct apic apic_numachip2;
>  static void (*numachip_apic_icr_write)(int apicid, unsigned int val) __read_mostly;
>  
> -static unsigned int numachip1_get_apic_id(unsigned long x)
> +static u32 numachip1_get_apic_id(u32 x)
>  {
>  	unsigned long value;
>  	unsigned int id = (x >> 24) & 0xff;
> @@ -38,12 +38,12 @@ static unsigned int numachip1_get_apic_i
>  	return id;
>  }
>  
> -static u32 numachip1_set_apic_id(unsigned int id)
> +static u32 numachip1_set_apic_id(u32 id)
>  {
>  	return (id & 0xff) << 24;
>  }
>  
> -static unsigned int numachip2_get_apic_id(unsigned long x)
> +static u32 numachip2_get_apic_id(u32 x)
>  {
>  	u64 mcfg;
>  
> @@ -51,7 +51,7 @@ static unsigned int numachip2_get_apic_i
>  	return ((mcfg >> (28 - 8)) & 0xfff00) | (x >> 24);
>  }
>  
> -static u32 numachip2_set_apic_id(unsigned int id)
> +static u32 numachip2_set_apic_id(u32 id)
>  {
>  	return id << 24;
>  }
> --- a/arch/x86/kernel/apic/bigsmp_32.c
> +++ b/arch/x86/kernel/apic/bigsmp_32.c
> @@ -13,7 +13,7 @@
>  
>  #include "local.h"
>  
> -static unsigned bigsmp_get_apic_id(unsigned long x)
> +static u32 bigsmp_get_apic_id(u32 x)
>  {
>  	return (x >> 24) & 0xFF;
>  }
> --- a/arch/x86/kernel/apic/local.h
> +++ b/arch/x86/kernel/apic/local.h
> @@ -15,8 +15,8 @@
>  
>  /* X2APIC */
>  void __x2apic_send_IPI_dest(unsigned int apicid, int vector, unsigned int dest);
> -unsigned int x2apic_get_apic_id(unsigned long id);
> -u32 x2apic_set_apic_id(unsigned int id);
> +u32 x2apic_get_apic_id(u32 id);
> +u32 x2apic_set_apic_id(u32 id);
>  u32 x2apic_phys_pkg_id(u32 initial_apicid, int index_msb);
>  
>  void x2apic_send_IPI_all(int vector);
> --- a/arch/x86/kernel/apic/probe_32.c
> +++ b/arch/x86/kernel/apic/probe_32.c
> @@ -23,6 +23,16 @@ static u32 default_phys_pkg_id(u32 cpuid
>  	return cpuid_apic >> index_msb;
>  }
>  
> +static u32 default_get_apic_id(u32 x)
> +{
> +	unsigned int ver = GET_APIC_VERSION(apic_read(APIC_LVR));
> +
> +	if (APIC_XAPIC(ver) || boot_cpu_has(X86_FEATURE_EXTD_APICID))
> +		return (x >> 24) & 0xFF;
> +	else
> +		return (x >> 24) & 0x0F;
> +}
> +
>  /* should be called last. */
>  static int probe_default(void)
>  {
> --- a/arch/x86/kernel/apic/x2apic_phys.c
> +++ b/arch/x86/kernel/apic/x2apic_phys.c
> @@ -124,12 +124,12 @@ static int x2apic_phys_probe(void)
>  	return apic == &apic_x2apic_phys;
>  }
>  
> -unsigned int x2apic_get_apic_id(unsigned long id)
> +u32 x2apic_get_apic_id(u32 id)
>  {
>  	return id;
>  }
>  
> -u32 x2apic_set_apic_id(unsigned int id)
> +u32 x2apic_set_apic_id(u32 id)
>  {
>  	return id;
>  }
> --- a/arch/x86/kernel/apic/x2apic_uv_x.c
> +++ b/arch/x86/kernel/apic/x2apic_uv_x.c
> @@ -780,7 +780,7 @@ static void uv_send_IPI_all(int vector)
>  	uv_send_IPI_mask(cpu_online_mask, vector);
>  }
>  
> -static u32 set_apic_id(unsigned int id)
> +static u32 set_apic_id(u32 id)
>  {
>  	return id;
>  }
> --- a/arch/x86/xen/apic.c
> +++ b/arch/x86/xen/apic.c
> @@ -33,13 +33,13 @@ static unsigned int xen_io_apic_read(uns
>  	return 0xfd;
>  }
>  
> -static u32 xen_set_apic_id(unsigned int x)
> +static u32 xen_set_apic_id(u32 x)
>  {
>  	WARN_ON(1);
>  	return x;
>  }
>  
> -static unsigned int xen_get_apic_id(unsigned long x)
> +static u32 xen_get_apic_id(u32 x)
>  {
>  	return ((x)>>24) & 0xFFu;
>  }
> 

-- 
Steve Wahl, Hewlett Packard Enterprise

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [patch v2 17/38] x86/apic: Use u32 for wakeup_secondary_cpu[_64]()
  2023-07-28 12:13 ` [patch v2 17/38] x86/apic: Use u32 for wakeup_secondary_cpu[_64]() Thomas Gleixner
@ 2023-08-08 20:15   ` Steve Wahl
  0 siblings, 0 replies; 72+ messages in thread
From: Steve Wahl @ 2023-08-08 20:15 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl,
	Dimitri Sivanich, Russ Anderson

Reviewed-by: Steve Wahl <steve.wahl@hpe.com>

On Fri, Jul 28, 2023 at 02:13:03PM +0200, Thomas Gleixner wrote:
> APIC IDs are used with random data types u16, u32, int, unsigned int,
> unsigned long.
> 
> Make it all consistently use u32 because that reflects the hardware
> register width.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  arch/x86/hyperv/hv_vtl.c             |    2 +-
>  arch/x86/include/asm/apic.h          |    8 ++++----
>  arch/x86/kernel/acpi/boot.c          |    2 +-
>  arch/x86/kernel/apic/apic_noop.c     |    2 +-
>  arch/x86/kernel/apic/apic_numachip.c |    2 +-
>  arch/x86/kernel/apic/x2apic_uv_x.c   |    2 +-
>  arch/x86/kernel/sev.c                |    2 +-
>  7 files changed, 10 insertions(+), 10 deletions(-)
> 
> --- a/arch/x86/hyperv/hv_vtl.c
> +++ b/arch/x86/hyperv/hv_vtl.c
> @@ -192,7 +192,7 @@ static int hv_vtl_apicid_to_vp_id(u32 ap
>  	return ret;
>  }
>  
> -static int hv_vtl_wakeup_secondary_cpu(int apicid, unsigned long start_eip)
> +static int hv_vtl_wakeup_secondary_cpu(u32 apicid, unsigned long start_eip)
>  {
>  	int vp_id;
>  
> --- a/arch/x86/include/asm/apic.h
> +++ b/arch/x86/include/asm/apic.h
> @@ -302,9 +302,9 @@ struct apic {
>  	u32	(*set_apic_id)(u32 apicid);
>  
>  	/* wakeup_secondary_cpu */
> -	int	(*wakeup_secondary_cpu)(int apicid, unsigned long start_eip);
> +	int	(*wakeup_secondary_cpu)(u32 apicid, unsigned long start_eip);
>  	/* wakeup secondary CPU using 64-bit wakeup point */
> -	int	(*wakeup_secondary_cpu_64)(int apicid, unsigned long start_eip);
> +	int	(*wakeup_secondary_cpu_64)(u32 apicid, unsigned long start_eip);
>  
>  	char	*name;
>  };
> @@ -322,8 +322,8 @@ struct apic_override {
>  	void	(*send_IPI_self)(int vector);
>  	u64	(*icr_read)(void);
>  	void	(*icr_write)(u32 low, u32 high);
> -	int	(*wakeup_secondary_cpu)(int apicid, unsigned long start_eip);
> -	int	(*wakeup_secondary_cpu_64)(int apicid, unsigned long start_eip);
> +	int	(*wakeup_secondary_cpu)(u32 apicid, unsigned long start_eip);
> +	int	(*wakeup_secondary_cpu_64)(u32 apicid, unsigned long start_eip);
>  };
>  
>  /*
> --- a/arch/x86/kernel/acpi/boot.c
> +++ b/arch/x86/kernel/acpi/boot.c
> @@ -358,7 +358,7 @@ acpi_parse_lapic_nmi(union acpi_subtable
>  }
>  
>  #ifdef CONFIG_X86_64
> -static int acpi_wakeup_cpu(int apicid, unsigned long start_ip)
> +static int acpi_wakeup_cpu(u32 apicid, unsigned long start_ip)
>  {
>  	/*
>  	 * Remap mailbox memory only for the first call to acpi_wakeup_cpu().
> --- a/arch/x86/kernel/apic/apic_noop.c
> +++ b/arch/x86/kernel/apic/apic_noop.c
> @@ -27,7 +27,7 @@ static void noop_send_IPI_allbutself(int
>  static void noop_send_IPI_all(int vector) { }
>  static void noop_send_IPI_self(int vector) { }
>  static void noop_apic_icr_write(u32 low, u32 id) { }
> -static int noop_wakeup_secondary_cpu(int apicid, unsigned long start_eip) { return -1; }
> +static int noop_wakeup_secondary_cpu(u32 apicid, unsigned long start_eip) { return -1; }
>  static u64 noop_apic_icr_read(void) { return 0; }
>  static u32 noop_phys_pkg_id(u32 cpuid_apic, int index_msb) { return 0; }
>  static u32 noop_get_apic_id(u32 apicid) { return 0; }
> --- a/arch/x86/kernel/apic/apic_numachip.c
> +++ b/arch/x86/kernel/apic/apic_numachip.c
> @@ -71,7 +71,7 @@ static void numachip2_apic_icr_write(int
>  	numachip2_write32_lcsr(NUMACHIP2_APIC_ICR, (apicid << 12) | val);
>  }
>  
> -static int numachip_wakeup_secondary(int phys_apicid, unsigned long start_rip)
> +static int numachip_wakeup_secondary(u32 phys_apicid, unsigned long start_rip)
>  {
>  	numachip_apic_icr_write(phys_apicid, APIC_DM_INIT);
>  	numachip_apic_icr_write(phys_apicid, APIC_DM_STARTUP |
> --- a/arch/x86/kernel/apic/x2apic_uv_x.c
> +++ b/arch/x86/kernel/apic/x2apic_uv_x.c
> @@ -702,7 +702,7 @@ static __init void build_uv_gr_table(voi
>  	}
>  }
>  
> -static int uv_wakeup_secondary(int phys_apicid, unsigned long start_rip)
> +static int uv_wakeup_secondary(u32 phys_apicid, unsigned long start_rip)
>  {
>  	unsigned long val;
>  	int pnode;
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -940,7 +940,7 @@ static void snp_cleanup_vmsa(struct sev_
>  		free_page((unsigned long)vmsa);
>  }
>  
> -static int wakeup_cpu_via_vmgexit(int apic_id, unsigned long start_ip)
> +static int wakeup_cpu_via_vmgexit(u32 apic_id, unsigned long start_ip)
>  {
>  	struct sev_es_save_area *cur_vmsa, *vmsa;
>  	struct ghcb_state state;
> 

-- 
Steve Wahl, Hewlett Packard Enterprise

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [patch v2 36/38] x86/apic: Remove unused phys_pkg_id() callback
  2023-07-28 12:13 ` [patch v2 36/38] x86/apic: Remove unused phys_pkg_id() callback Thomas Gleixner
@ 2023-08-08 20:15   ` Steve Wahl
  0 siblings, 0 replies; 72+ messages in thread
From: Steve Wahl @ 2023-08-08 20:15 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl,
	Dimitri Sivanich, Russ Anderson

Reviewed-by: Steve Wahl <steve.wahl@hpe.com>

On Fri, Jul 28, 2023 at 02:13:27PM +0200, Thomas Gleixner wrote:
> Now that the core code does not use this monstrosity anymore, it's time to
> put it to rest.
> 
> The only real purpose was to read the APIC ID on UV and VSMP systems for
> the actual evaluation. That's what the core code does now.
> 
> For doing the actual shift operation there is truly no APIC callback
> required.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  arch/x86/include/asm/apic.h           |    1 -
>  arch/x86/kernel/apic/apic_flat_64.c   |    7 -------
>  arch/x86/kernel/apic/apic_noop.c      |    3 ---
>  arch/x86/kernel/apic/apic_numachip.c  |    7 -------
>  arch/x86/kernel/apic/bigsmp_32.c      |    6 ------
>  arch/x86/kernel/apic/local.h          |    1 -
>  arch/x86/kernel/apic/probe_32.c       |    6 ------
>  arch/x86/kernel/apic/x2apic_cluster.c |    1 -
>  arch/x86/kernel/apic/x2apic_phys.c    |    6 ------
>  arch/x86/kernel/apic/x2apic_uv_x.c    |   11 -----------
>  arch/x86/kernel/vsmp_64.c             |   13 -------------
>  arch/x86/xen/apic.c                   |    6 ------
>  12 files changed, 68 deletions(-)
> 
> --- a/arch/x86/include/asm/apic.h
> +++ b/arch/x86/include/asm/apic.h
> @@ -296,7 +296,6 @@ struct apic {
>  	void	(*init_apic_ldr)(void);
>  	void	(*ioapic_phys_id_map)(physid_mask_t *phys_map, physid_mask_t *retmap);
>  	u32	(*cpu_present_to_apicid)(int mps_cpu);
> -	u32	(*phys_pkg_id)(u32 cpuid_apic, int index_msb);
>  
>  	u32	(*get_apic_id)(u32 id);
>  	u32	(*set_apic_id)(u32 apicid);
> --- a/arch/x86/kernel/apic/apic_flat_64.c
> +++ b/arch/x86/kernel/apic/apic_flat_64.c
> @@ -66,11 +66,6 @@ static u32 set_apic_id(u32 id)
>  	return (id & 0xFF) << 24;
>  }
>  
> -static u32 flat_phys_pkg_id(u32 initial_apic_id, int index_msb)
> -{
> -	return initial_apic_id >> index_msb;
> -}
> -
>  static int flat_probe(void)
>  {
>  	return 1;
> @@ -89,7 +84,6 @@ static struct apic apic_flat __ro_after_
>  
>  	.init_apic_ldr			= default_init_apic_ldr,
>  	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
> -	.phys_pkg_id			= flat_phys_pkg_id,
>  
>  	.max_apic_id			= 0xFE,
>  	.get_apic_id			= flat_get_apic_id,
> @@ -159,7 +153,6 @@ static struct apic apic_physflat __ro_af
>  	.disable_esr			= 0,
>  
>  	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
> -	.phys_pkg_id			= flat_phys_pkg_id,
>  
>  	.max_apic_id			= 0xFE,
>  	.get_apic_id			= flat_get_apic_id,
> --- a/arch/x86/kernel/apic/apic_noop.c
> +++ b/arch/x86/kernel/apic/apic_noop.c
> @@ -29,7 +29,6 @@ static void noop_send_IPI_self(int vecto
>  static void noop_apic_icr_write(u32 low, u32 id) { }
>  static int noop_wakeup_secondary_cpu(u32 apicid, unsigned long start_eip) { return -1; }
>  static u64 noop_apic_icr_read(void) { return 0; }
> -static u32 noop_phys_pkg_id(u32 cpuid_apic, int index_msb) { return 0; }
>  static u32 noop_get_apic_id(u32 apicid) { return 0; }
>  static void noop_apic_eoi(void) { }
>  
> @@ -56,8 +55,6 @@ struct apic apic_noop __ro_after_init =
>  	.ioapic_phys_id_map		= default_ioapic_phys_id_map,
>  	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
>  
> -	.phys_pkg_id			= noop_phys_pkg_id,
> -
>  	.max_apic_id			= 0xFE,
>  	.get_apic_id			= noop_get_apic_id,
>  
> --- a/arch/x86/kernel/apic/apic_numachip.c
> +++ b/arch/x86/kernel/apic/apic_numachip.c
> @@ -56,11 +56,6 @@ static u32 numachip2_set_apic_id(u32 id)
>  	return id << 24;
>  }
>  
> -static u32 numachip_phys_pkg_id(u32 initial_apic_id, int index_msb)
> -{
> -	return initial_apic_id >> index_msb;
> -}
> -
>  static void numachip1_apic_icr_write(int apicid, unsigned int val)
>  {
>  	write_lcsr(CSR_G3_EXT_IRQ_GEN, (apicid << 16) | val);
> @@ -228,7 +223,6 @@ static const struct apic apic_numachip1
>  	.disable_esr			= 0,
>  
>  	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
> -	.phys_pkg_id			= numachip_phys_pkg_id,
>  
>  	.max_apic_id			= UINT_MAX,
>  	.get_apic_id			= numachip1_get_apic_id,
> @@ -265,7 +259,6 @@ static const struct apic apic_numachip2
>  	.disable_esr			= 0,
>  
>  	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
> -	.phys_pkg_id			= numachip_phys_pkg_id,
>  
>  	.max_apic_id			= UINT_MAX,
>  	.get_apic_id			= numachip2_get_apic_id,
> --- a/arch/x86/kernel/apic/bigsmp_32.c
> +++ b/arch/x86/kernel/apic/bigsmp_32.c
> @@ -29,11 +29,6 @@ static void bigsmp_ioapic_phys_id_map(ph
>  	physids_promote(0xFFL, retmap);
>  }
>  
> -static u32 bigsmp_phys_pkg_id(u32 cpuid_apic, int index_msb)
> -{
> -	return cpuid_apic >> index_msb;
> -}
> -
>  static void bigsmp_send_IPI_allbutself(int vector)
>  {
>  	default_send_IPI_mask_allbutself_phys(cpu_online_mask, vector);
> @@ -88,7 +83,6 @@ static struct apic apic_bigsmp __ro_afte
>  	.check_apicid_used		= bigsmp_check_apicid_used,
>  	.ioapic_phys_id_map		= bigsmp_ioapic_phys_id_map,
>  	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
> -	.phys_pkg_id			= bigsmp_phys_pkg_id,
>  
>  	.max_apic_id			= 0xFE,
>  	.get_apic_id			= bigsmp_get_apic_id,
> --- a/arch/x86/kernel/apic/local.h
> +++ b/arch/x86/kernel/apic/local.h
> @@ -17,7 +17,6 @@
>  void __x2apic_send_IPI_dest(unsigned int apicid, int vector, unsigned int dest);
>  u32 x2apic_get_apic_id(u32 id);
>  u32 x2apic_set_apic_id(u32 id);
> -u32 x2apic_phys_pkg_id(u32 initial_apicid, int index_msb);
>  
>  void x2apic_send_IPI_all(int vector);
>  void x2apic_send_IPI_allbutself(int vector);
> --- a/arch/x86/kernel/apic/probe_32.c
> +++ b/arch/x86/kernel/apic/probe_32.c
> @@ -18,11 +18,6 @@
>  
>  #include "local.h"
>  
> -static u32 default_phys_pkg_id(u32 cpuid_apic, int index_msb)
> -{
> -	return cpuid_apic >> index_msb;
> -}
> -
>  static u32 default_get_apic_id(u32 x)
>  {
>  	unsigned int ver = GET_APIC_VERSION(apic_read(APIC_LVR));
> @@ -54,7 +49,6 @@ static struct apic apic_default __ro_aft
>  	.init_apic_ldr			= default_init_apic_ldr,
>  	.ioapic_phys_id_map		= default_ioapic_phys_id_map,
>  	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
> -	.phys_pkg_id			= default_phys_pkg_id,
>  
>  	.max_apic_id			= 0xFE,
>  	.get_apic_id			= default_get_apic_id,
> --- a/arch/x86/kernel/apic/x2apic_cluster.c
> +++ b/arch/x86/kernel/apic/x2apic_cluster.c
> @@ -236,7 +236,6 @@ static struct apic apic_x2apic_cluster _
>  	.init_apic_ldr			= init_x2apic_ldr,
>  	.ioapic_phys_id_map		= NULL,
>  	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
> -	.phys_pkg_id			= x2apic_phys_pkg_id,
>  
>  	.max_apic_id			= UINT_MAX,
>  	.x2apic_set_max_apicid		= true,
> --- a/arch/x86/kernel/apic/x2apic_phys.c
> +++ b/arch/x86/kernel/apic/x2apic_phys.c
> @@ -134,11 +134,6 @@ u32 x2apic_set_apic_id(u32 id)
>  	return id;
>  }
>  
> -u32 x2apic_phys_pkg_id(u32 initial_apicid, int index_msb)
> -{
> -	return initial_apicid >> index_msb;
> -}
> -
>  static struct apic apic_x2apic_phys __ro_after_init = {
>  
>  	.name				= "physical x2apic",
> @@ -151,7 +146,6 @@ static struct apic apic_x2apic_phys __ro
>  	.disable_esr			= 0,
>  
>  	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
> -	.phys_pkg_id			= x2apic_phys_pkg_id,
>  
>  	.max_apic_id			= UINT_MAX,
>  	.x2apic_set_max_apicid		= true,
> --- a/arch/x86/kernel/apic/x2apic_uv_x.c
> +++ b/arch/x86/kernel/apic/x2apic_uv_x.c
> @@ -785,16 +785,6 @@ static u32 set_apic_id(u32 id)
>  	return id;
>  }
>  
> -static unsigned int uv_read_apic_id(void)
> -{
> -	return x2apic_get_apic_id(apic_read(APIC_ID));
> -}
> -
> -static u32 uv_phys_pkg_id(u32 initial_apicid, int index_msb)
> -{
> -	return uv_read_apic_id() >> index_msb;
> -}
> -
>  static int uv_probe(void)
>  {
>  	return apic == &apic_x2apic_uv_x;
> @@ -812,7 +802,6 @@ static struct apic apic_x2apic_uv_x __ro
>  	.disable_esr			= 0,
>  
>  	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
> -	.phys_pkg_id			= uv_phys_pkg_id,
>  
>  	.max_apic_id			= UINT_MAX,
>  	.get_apic_id			= x2apic_get_apic_id,
> --- a/arch/x86/kernel/vsmp_64.c
> +++ b/arch/x86/kernel/vsmp_64.c
> @@ -127,25 +127,12 @@ static void __init vsmp_cap_cpus(void)
>  #endif
>  }
>  
> -static u32 apicid_phys_pkg_id(u32 initial_apic_id, int index_msb)
> -{
> -	return read_apic_id() >> index_msb;
> -}
> -
> -static void vsmp_apic_post_init(void)
> -{
> -	/* need to update phys_pkg_id */
> -	apic->phys_pkg_id = apicid_phys_pkg_id;
> -}
> -
>  void __init vsmp_init(void)
>  {
>  	detect_vsmp_box();
>  	if (!is_vsmp_box())
>  		return;
>  
> -	x86_platform.apic_post_init = vsmp_apic_post_init;
> -
>  	vsmp_cap_cpus();
>  
>  	set_vsmp_ctl();
> --- a/arch/x86/xen/apic.c
> +++ b/arch/x86/xen/apic.c
> @@ -110,11 +110,6 @@ static int xen_madt_oem_check(char *oem_
>  	return xen_pv_domain();
>  }
>  
> -static u32 xen_phys_pkg_id(u32 initial_apic_id, int index_msb)
> -{
> -	return initial_apic_id >> index_msb;
> -}
> -
>  static u32 xen_cpu_present_to_apicid(int cpu)
>  {
>  	if (cpu_present(cpu))
> @@ -133,7 +128,6 @@ static struct apic xen_pv_apic __ro_afte
>  	.disable_esr			= 0,
>  
>  	.cpu_present_to_apicid		= xen_cpu_present_to_apicid,
> -	.phys_pkg_id			= xen_phys_pkg_id, /* detect_ht */
>  
>  	.max_apic_id			= UINT_MAX,
>  	.get_apic_id			= xen_get_apic_id,
> 

-- 
Steve Wahl, Hewlett Packard Enterprise

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [patch v2 38/38] x86/apic/uv: Remove the private leaf 0xb parser
  2023-07-28 12:13 ` [patch v2 38/38] x86/apic/uv: Remove the private leaf 0xb parser Thomas Gleixner
@ 2023-08-08 20:16   ` Steve Wahl
  0 siblings, 0 replies; 72+ messages in thread
From: Steve Wahl @ 2023-08-08 20:16 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven,
	Steve Wahl, Dimitri Sivanich, Russ Anderson,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi, Guenter Roeck, linux-hwmon,
	Jean Delvare, Huang Rui, Juergen Gross

Looked closely at this one, was wondering about using TOPO_ROOT_DOMAIN
when the original looked at the leaf of TYPE_CORE.  But I think you
probably have it right, and I checked both a newer and an older
system, and the UV info line that prints "apic_pns" gets the same
value as before.

Reviewed-by: Steve Wahl <steve.wahl@hpe.com>

On Fri, Jul 28, 2023 at 02:13:30PM +0200, Thomas Gleixner wrote:
> The package shift has been already evaluated by the early CPU init.
> 
> Put the mindless copy right next to the original leaf 0xb parser.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Steve Wahl <steve.wahl@hpe.com>
> Cc: Mike Travis <mike.travis@hpe.com>
> Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com>
> Cc: Russ Anderson <russ.anderson@hpe.com>
> ---
>  arch/x86/include/asm/topology.h    |    5 +++
>  arch/x86/kernel/apic/x2apic_uv_x.c |   52 ++++++-------------------------------
>  2 files changed, 14 insertions(+), 43 deletions(-)
> 
> --- a/arch/x86/include/asm/topology.h
> +++ b/arch/x86/include/asm/topology.h
> @@ -126,6 +126,11 @@ static inline unsigned int topology_get_
>  	return x86_topo_system.dom_size[dom];
>  }
>  
> +static inline unsigned int topology_get_domain_shift(enum x86_topology_domains dom)
> +{
> +	return dom == TOPO_SMT_DOMAIN ? 0 : x86_topo_system.dom_shifts[dom - 1];
> +}
> +
>  extern const struct cpumask *cpu_coregroup_mask(int cpu);
>  extern const struct cpumask *cpu_clustergroup_mask(int cpu);
>  
> --- a/arch/x86/kernel/apic/x2apic_uv_x.c
> +++ b/arch/x86/kernel/apic/x2apic_uv_x.c
> @@ -241,54 +241,20 @@ static void __init uv_tsc_check_sync(voi
>  	is_uv(UV3) ? sname.s3.field :		\
>  	undef)
>  
> -/* [Copied from arch/x86/kernel/cpu/topology.c:detect_extended_topology()] */
> -
> -#define SMT_LEVEL			0	/* Leaf 0xb SMT level */
> -#define INVALID_TYPE			0	/* Leaf 0xb sub-leaf types */
> -#define SMT_TYPE			1
> -#define CORE_TYPE			2
> -#define LEAFB_SUBTYPE(ecx)		(((ecx) >> 8) & 0xff)
> -#define BITS_SHIFT_NEXT_LEVEL(eax)	((eax) & 0x1f)
> -
> -static void set_x2apic_bits(void)
> -{
> -	unsigned int eax, ebx, ecx, edx, sub_index;
> -	unsigned int sid_shift;
> -
> -	cpuid(0, &eax, &ebx, &ecx, &edx);
> -	if (eax < 0xb) {
> -		pr_info("UV: CPU does not have CPUID.11\n");
> -		return;
> -	}
> -
> -	cpuid_count(0xb, SMT_LEVEL, &eax, &ebx, &ecx, &edx);
> -	if (ebx == 0 || (LEAFB_SUBTYPE(ecx) != SMT_TYPE)) {
> -		pr_info("UV: CPUID.11 not implemented\n");
> -		return;
> -	}
> -
> -	sid_shift = BITS_SHIFT_NEXT_LEVEL(eax);
> -	sub_index = 1;
> -	do {
> -		cpuid_count(0xb, sub_index, &eax, &ebx, &ecx, &edx);
> -		if (LEAFB_SUBTYPE(ecx) == CORE_TYPE) {
> -			sid_shift = BITS_SHIFT_NEXT_LEVEL(eax);
> -			break;
> -		}
> -		sub_index++;
> -	} while (LEAFB_SUBTYPE(ecx) != INVALID_TYPE);
> -
> -	uv_cpuid.apicid_shift	= 0;
> -	uv_cpuid.apicid_mask	= (~(-1 << sid_shift));
> -	uv_cpuid.socketid_shift = sid_shift;
> -}
> -
>  static void __init early_get_apic_socketid_shift(void)
>  {
> +	unsigned int sid_shift = topology_get_domain_shift(TOPO_ROOT_DOMAIN);
> +
>  	if (is_uv2_hub() || is_uv3_hub())
>  		uvh_apicid.v = uv_early_read_mmr(UVH_APICID);
>  
> -	set_x2apic_bits();
> +	if (sid_shift) {
> +		uv_cpuid.apicid_shift	= 0;
> +		uv_cpuid.apicid_mask	= (~(-1 << sid_shift));
> +		uv_cpuid.socketid_shift = sid_shift;
> +	} else {
> +		pr_info("UV: CPU does not have valid CPUID.11\n");
> +	}
>  
>  	pr_info("UV: apicid_shift:%d apicid_mask:0x%x\n", uv_cpuid.apicid_shift, uv_cpuid.apicid_mask);
>  	pr_info("UV: socketid_shift:%d pnode_mask:0x%x\n", uv_cpuid.socketid_shift, uv_cpuid.pnode_mask);
> 

-- 
Steve Wahl, Hewlett Packard Enterprise

^ permalink raw reply	[flat|nested] 72+ messages in thread

end of thread, other threads:[~2023-08-08 21:16 UTC | newest]

Thread overview: 72+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
2023-07-28 12:12 ` [patch v2 01/38] x86/cpu: Encapsulate topology information in cpuinfo_x86 Thomas Gleixner
2023-07-28 12:12 ` [patch v2 02/38] x86/cpu: Move phys_proc_id into topology info Thomas Gleixner
2023-07-28 12:12 ` [patch v2 03/38] x86/cpu: Move cpu_die_id " Thomas Gleixner
2023-07-28 12:12 ` [patch v2 04/38] scsi: lpfc: Use topology_core_id() Thomas Gleixner
2023-07-28 12:12 ` [patch v2 05/38] hwmon: (fam15h_power) " Thomas Gleixner
2023-07-28 12:12 ` [patch v2 06/38] x86/cpu: Move cpu_core_id into topology info Thomas Gleixner
2023-07-28 12:12 ` [patch v2 07/38] x86/cpu: Move cu_id " Thomas Gleixner
2023-07-28 12:12 ` [patch v2 08/38] x86/cpu: Remove pointless evaluation of x86_coreid_bits Thomas Gleixner
2023-07-28 12:12 ` [patch v2 09/38] x86/cpu: Move logical package and die IDs into topology info Thomas Gleixner
2023-07-28 12:12 ` [patch v2 10/38] x86/cpu: Move cpu_l[l2]c_id " Thomas Gleixner
2023-07-28 12:12 ` [patch v2 11/38] x86/apic: Use BAD_APICID consistently; Thomas Gleixner
2023-07-28 22:36   ` Sohil Mehta
2023-07-28 22:50     ` Thomas Gleixner
2023-07-28 12:12 ` [patch v2 12/38] x86/apic: Use u32 for APIC IDs in global data Thomas Gleixner
2023-07-28 12:12 ` [patch v2 13/38] x86/apic: Use u32 for check_apicid_used() Thomas Gleixner
2023-07-28 12:13 ` [patch v2 14/38] x86/apic: Use u32 for cpu_present_to_apicid() Thomas Gleixner
2023-07-28 12:13 ` [patch v2 15/38] x86/apic: Use u32 for phys_pkg_id() Thomas Gleixner
2023-08-08 20:14   ` Steve Wahl
2023-07-28 12:13 ` [patch v2 16/38] x86/apic: Use u32 for [gs]et_apic_id() Thomas Gleixner
2023-08-08 20:15   ` Steve Wahl
2023-07-28 12:13 ` [patch v2 17/38] x86/apic: Use u32 for wakeup_secondary_cpu[_64]() Thomas Gleixner
2023-08-08 20:15   ` Steve Wahl
2023-07-28 12:13 ` [patch v2 18/38] x86/cpu/topology: Cure the abuse of cpuinfo for persisting logical ids Thomas Gleixner
2023-07-28 12:13 ` [patch v2 19/38] x86/cpu: Provide debug interface Thomas Gleixner
2023-07-28 23:57   ` Sohil Mehta
2023-07-28 12:13 ` [patch v2 20/38] x86/cpu: Provide cpuid_read() et al Thomas Gleixner
2023-07-28 12:13 ` [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology() Thomas Gleixner
2023-07-29  0:02   ` Sohil Mehta
2023-07-31  4:05   ` Michael Kelley (LINUX)
2023-07-31 12:34     ` Thomas Gleixner
2023-07-31 13:27       ` Peter Zijlstra
2023-07-31 15:38         ` Thomas Gleixner
2023-07-31 16:10           ` Michael Kelley (LINUX)
2023-07-31 20:48             ` Thomas Gleixner
2023-07-31 21:27               ` Michael Kelley (LINUX)
2023-07-31 22:12                 ` Thomas Gleixner
2023-08-01 22:25                   ` Thomas Gleixner
2023-08-01 22:35                     ` Andrew Cooper
2023-08-02 14:43                     ` Michael Kelley (LINUX)
2023-07-31 16:25       ` Michael Kelley (LINUX)
2023-07-31 20:41         ` Thomas Gleixner
2023-07-31 13:47     ` Arjan van de Ven
2023-07-31 14:08       ` Andrew Cooper
2023-08-01  7:05   ` Gautham R. Shenoy
2023-08-01  7:34     ` Thomas Gleixner
2023-07-28 12:13 ` [patch v2 22/38] x86/cpu: Add legacy topology parser Thomas Gleixner
2023-07-28 21:33   ` [patch v2A " Thomas Gleixner
2023-07-28 12:13 ` [patch v2 23/38] x86/cpu: Use common topology code for Centaur and Zhaoxin Thomas Gleixner
2023-07-28 12:13 ` [patch v2 24/38] x86/cpu: Move __max_die_per_package to common.c Thomas Gleixner
2023-07-28 12:13 ` [patch v2 25/38] x86/cpu: Provide a sane leaf 0xb/0x1f parser Thomas Gleixner
2023-07-28 12:13 ` [patch v2 26/38] x86/cpu: Use common topology code for Intel Thomas Gleixner
2023-07-28 12:13 ` [patch v2 27/38] x86/cpu/amd: Provide a separate acessor for Node ID Thomas Gleixner
2023-07-29  0:07   ` Sohil Mehta
2023-07-28 12:13 ` [patch v2 28/38] x86/cpu: Provide an AMD/HYGON specific topology parser Thomas Gleixner
2023-07-29  0:05   ` Sohil Mehta
2023-07-30  5:20   ` Michael Kelley (LINUX)
2023-07-30  7:41     ` Thomas Gleixner
2023-07-28 12:13 ` [patch v2 29/38] x86/smpboot: Teach it about topo.amd_node_id Thomas Gleixner
2023-07-28 12:13 ` [patch v2 30/38] x86/cpu: Use common topology code for AMD Thomas Gleixner
2023-07-28 12:13 ` [patch v2 31/38] x86/cpu: Use common topology code for HYGON Thomas Gleixner
2023-07-28 12:13 ` [patch v2 32/38] x86/mm/numa: Use core domain size on AMD Thomas Gleixner
2023-07-28 12:13 ` [patch v2 33/38] x86/cpu: Make topology_amd_node_id() use the actual node info Thomas Gleixner
2023-07-28 12:13 ` [patch v2 34/38] x86/cpu: Remove topology.c Thomas Gleixner
2023-07-28 12:13 ` [patch v2 35/38] x86/cpu: Remove x86_coreid_bits Thomas Gleixner
2023-07-28 12:13 ` [patch v2 36/38] x86/apic: Remove unused phys_pkg_id() callback Thomas Gleixner
2023-08-08 20:15   ` Steve Wahl
2023-07-28 12:13 ` [patch v2 37/38] x86/xen/smp_pv: Remove cpudata fiddling Thomas Gleixner
2023-07-28 12:13 ` [patch v2 38/38] x86/apic/uv: Remove the private leaf 0xb parser Thomas Gleixner
2023-08-08 20:16   ` Steve Wahl
2023-07-28 15:41 ` [patch v2 00/38] x86/cpu: Rework the topology evaluation Dimitri Sivanich
2023-07-28 19:01 ` Sohil Mehta

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).