All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v6 0/3] Fix panic in logical packages calculation
@ 2017-11-14 12:42 Prarit Bhargava
  2017-11-14 12:42 ` [PATCH v6 1/3] perf/x86/intel/uncore: Cache logical pkg id in uncore driver Prarit Bhargava
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Prarit Bhargava @ 2017-11-14 12:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Prarit Bhargava, Andi Kleen, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, Peter Zijlstra, Dave Hansen, Piotr Luc,
	Kan Liang, Borislav Petkov, Stephane Eranian, Arvind Yadav,
	Andy Lutomirski, Christian Borntraeger, Kirill A. Shutemov,
	Tom Lendacky, He Chen, Mathias Krause, Tim Chen,
	Vitaly Kuznetsov

A system booted with a small number of cores enabled per package
panics because the estimate of __max_logical_packages is too low.
This occurs when the total number of active cores across all packages
is less than the maximum core count for a single package.

Andi noted that the logical_packages calculation wastes 128k on 64-bit
x86 systems and posted patches to fix the allocation.  His patches
result in significant memory size savings, and I have put my
fix on top of those patches.

Here's an example of the panic:

     smpboot: Booting Node   1, Processors  #1 OK
     smpboot: Package 1 of CPU 1 exceeds BIOS package data 1.
     ------------[ cut here ]------------
     kernel BUG at arch/x86/kernel/cpu/common.c:1087!
     invalid opcode: 0000 [#1] SMP
     Modules linked in:
     CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.14.0-rc2+ #4

This was tested by dropping the number of active cores across
all packages to reproduce the bug above.  Additional testing included
2 socket and 4 socket systems and hotplugging entire sockets in different
order.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Piotr Luc <piotr.luc@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Stephane Eranian <eranian@google.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: He Chen <he.chen@linux.intel.com>
Cc: Mathias Krause <minipli@googlemail.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>

Andi Kleen (2):
  perf/x86/intel/uncore: Cache logical pkg id in uncore driver
  x86/topology: Avoid wasting 128k for package id array

Prarit Bhargava (1):
  x86/smpboot: Fix __max_logical_packages estimate

 arch/x86/events/intel/uncore.c       |   4 +-
 arch/x86/events/intel/uncore.h       |   2 +-
 arch/x86/events/intel/uncore_snbep.c |   2 +-
 arch/x86/include/asm/processor.h     |   1 +
 arch/x86/kernel/smpboot.c            | 130 +++++++++++------------------------
 5 files changed, 44 insertions(+), 95 deletions(-)

-- 
2.15.0.rc0.39.g2f0e14e64

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v6 1/3] perf/x86/intel/uncore: Cache logical pkg id in uncore driver
  2017-11-14 12:42 [PATCH v6 0/3] Fix panic in logical packages calculation Prarit Bhargava
@ 2017-11-14 12:42 ` Prarit Bhargava
  2017-11-17 15:27   ` [tip:x86/urgent] " tip-bot for Andi Kleen
  2017-11-14 12:42 ` [PATCH v6 2/3] x86/topology: Avoid wasting 128k for package id array Prarit Bhargava
  2017-11-14 12:42 ` [PATCH v6 3/3] x86/smpboot: Fix __max_logical_packages estimate Prarit Bhargava
  2 siblings, 1 reply; 16+ messages in thread
From: Prarit Bhargava @ 2017-11-14 12:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Peter Zijlstra, Dave Hansen, Piotr Luc, Kan Liang,
	Borislav Petkov, Stephane Eranian, Prarit Bhargava, Arvind Yadav,
	Andy Lutomirski, Christian Borntraeger, Kirill A. Shutemov,
	Tom Lendacky, He Chen, Mathias Krause, Tim Chen,
	Vitaly Kuznetsov

From: Andi Kleen <ak@linux.intel.com>

The SNB-EP uncore driver is the only user of topology_phys_to_logical_pkg
in a performance critical path. Change it query the logical pkg ID
only once at initialization time and then cache it in box structure.

[v2]: Used the existing pkgid, Logical package ID, which was already stored.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Piotr Luc <piotr.luc@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Stephane Eranian <eranian@google.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: He Chen <he.chen@linux.intel.com>
Cc: Mathias Krause <minipli@googlemail.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/events/intel/uncore.c       | 4 ++--
 arch/x86/events/intel/uncore.h       | 2 +-
 arch/x86/events/intel/uncore_snbep.c | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index d45e06346f14..7874c980d569 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -975,10 +975,10 @@ static void uncore_pci_remove(struct pci_dev *pdev)
 	int i, phys_id, pkg;
 
 	phys_id = uncore_pcibus_to_physid(pdev->bus);
-	pkg = topology_phys_to_logical_pkg(phys_id);
 
 	box = pci_get_drvdata(pdev);
 	if (!box) {
+		pkg = topology_phys_to_logical_pkg(phys_id);
 		for (i = 0; i < UNCORE_EXTRA_PCI_DEV_MAX; i++) {
 			if (uncore_extra_pci_dev[pkg].dev[i] == pdev) {
 				uncore_extra_pci_dev[pkg].dev[i] = NULL;
@@ -994,7 +994,7 @@ static void uncore_pci_remove(struct pci_dev *pdev)
 		return;
 
 	pci_set_drvdata(pdev, NULL);
-	pmu->boxes[pkg] = NULL;
+	pmu->boxes[box->pkgid] = NULL;
 	if (atomic_dec_return(&pmu->activeboxes) == 0)
 		uncore_pmu_unregister(pmu);
 	uncore_box_exit(box);
diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h
index 4364191e7c6b..414dc7e7c950 100644
--- a/arch/x86/events/intel/uncore.h
+++ b/arch/x86/events/intel/uncore.h
@@ -100,7 +100,7 @@ struct intel_uncore_extra_reg {
 
 struct intel_uncore_box {
 	int pci_phys_id;
-	int pkgid;
+	int pkgid;	/* Logical package ID */
 	int n_active;	/* number of active events */
 	int n_events;
 	int cpu;	/* cpu to collect events */
diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index 95cb19f4e06f..de8f8625213c 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -1057,7 +1057,7 @@ static void snbep_qpi_enable_event(struct intel_uncore_box *box, struct perf_eve
 
 	if (reg1->idx != EXTRA_REG_NONE) {
 		int idx = box->pmu->pmu_idx + SNBEP_PCI_QPI_PORT0_FILTER;
-		int pkg = topology_phys_to_logical_pkg(box->pci_phys_id);
+		int pkg = box->pkgid;
 		struct pci_dev *filter_pdev = uncore_extra_pci_dev[pkg].dev[idx];
 
 		if (filter_pdev) {
-- 
2.15.0.rc0.39.g2f0e14e64

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v6 2/3] x86/topology: Avoid wasting 128k for package id array
  2017-11-14 12:42 [PATCH v6 0/3] Fix panic in logical packages calculation Prarit Bhargava
  2017-11-14 12:42 ` [PATCH v6 1/3] perf/x86/intel/uncore: Cache logical pkg id in uncore driver Prarit Bhargava
@ 2017-11-14 12:42 ` Prarit Bhargava
  2017-11-17 14:59   ` Thomas Gleixner
  2017-11-17 15:27   ` [tip:x86/urgent] " tip-bot for Andi Kleen
  2017-11-14 12:42 ` [PATCH v6 3/3] x86/smpboot: Fix __max_logical_packages estimate Prarit Bhargava
  2 siblings, 2 replies; 16+ messages in thread
From: Prarit Bhargava @ 2017-11-14 12:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Prarit Bhargava, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, Peter Zijlstra, Dave Hansen, Piotr Luc,
	Kan Liang, Borislav Petkov, Stephane Eranian, Arvind Yadav,
	Andy Lutomirski, Christian Borntraeger, Kirill A. Shutemov,
	Tom Lendacky, He Chen, Mathias Krause, Tim Chen,
	Vitaly Kuznetsov

From: Andi Kleen <ak@linux.intel.com>

I was looking at large early boot allocations and noticed that
since (1f12e32f x86/topology: Create logical package id)
every 64bit system allocates a 128k array to convert logical
package ids.

This happens because the array is sized for
MAX_LOCAL_APIC and that is always 32k on 64bit systems,
and it needs 4 bytes for each entry.

This is fairly wasteful, especially for the common case
of having only one socket, where we need 128K just to store
a single 4 byte value.

Store the logical processor id in cpudata.  Add a set flag that indicates
that the cpu's cpudata entry has been written.

[v2]: Decrease logical_packages when the last thread in a socket is
removed.
[v3]: Add more logic to keep logical and physical package IDs
in synch.
[v4]: Keep logical mapping static by using hybrid approach of a small
logical
to physical array and keeping logical cpu information in cpu_data.
[v5]: Change kmalloc to GFP_ATOMIC to fix "sleeping function" warning on
virtual machines.  Remove phys_pkg_id.  Add spinlock to avoid concurrency
issues.
[v6]: Revert back to storing logical processor ID in cpudata, and
add set flag to indicate that boot_cpu_data was written to a cpu's
cpu_data.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Piotr Luc <piotr.luc@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Stephane Eranian <eranian@google.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: He Chen <he.chen@linux.intel.com>
Cc: Mathias Krause <minipli@googlemail.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/asm/processor.h |  1 +
 arch/x86/kernel/smpboot.c        | 75 ++++++++++++++++------------------------
 2 files changed, 30 insertions(+), 46 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index bdac19ab2488..44fd3512442f 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -132,6 +132,7 @@ struct cpuinfo_x86 {
 	/* Index into per_cpu list: */
 	u16			cpu_index;
 	u32			microcode;
+	unsigned		set : 1;
 } __randomize_layout;
 
 struct cpuid_regs {
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index ad59edd84de7..838d36ff7ba6 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -100,9 +100,6 @@ DEFINE_PER_CPU_READ_MOSTLY(struct cpuinfo_x86, cpu_info);
 EXPORT_PER_CPU_SYMBOL(cpu_info);
 
 /* Logical package management. We might want to allocate that dynamically */
-static int *physical_to_logical_pkg __read_mostly;
-static unsigned long *physical_package_map __read_mostly;;
-static unsigned int max_physical_pkg_id __read_mostly;
 unsigned int __max_logical_packages __read_mostly;
 EXPORT_SYMBOL(__max_logical_packages);
 static unsigned int logical_packages __read_mostly;
@@ -278,6 +275,25 @@ static void notrace start_secondary(void *unused)
 	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
 }
 
+/**
+ * topology_phys_to_logical_pkg - Map a physical package id to a logical
+ *
+ * Returns logical package id or -1 if not found
+ */
+int topology_phys_to_logical_pkg(unsigned int phys_pkg)
+{
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+		if (cpu_data(cpu).phys_proc_id == phys_pkg &&
+		    cpu_data(cpu).set) {
+			return cpu_data(cpu).logical_proc_id;
+		}
+	}
+	return -1;
+}
+EXPORT_SYMBOL(topology_phys_to_logical_pkg);
+
 /**
  * topology_update_package_map - Update the physical to logical package map
  * @pkg:	The physical package id as retrieved via CPUID
@@ -285,17 +301,11 @@ static void notrace start_secondary(void *unused)
  */
 int topology_update_package_map(unsigned int pkg, unsigned int cpu)
 {
-	unsigned int new;
-
-	/* Called from early boot ? */
-	if (!physical_package_map)
-		return 0;
-
-	if (pkg >= max_physical_pkg_id)
-		return -EINVAL;
+	int new;
 
-	/* Set the logical package id */
-	if (test_and_set_bit(pkg, physical_package_map))
+	/* Already available somewhere? */
+	new = topology_phys_to_logical_pkg(pkg);
+	if (new >= 0)
 		goto found;
 
 	if (logical_packages >= __max_logical_packages) {
@@ -305,34 +315,17 @@ int topology_update_package_map(unsigned int pkg, unsigned int cpu)
 	}
 
 	new = logical_packages++;
-	if (new != pkg) {
+	if (new != pkg)
 		pr_info("CPU %u Converting physical %u to logical package %u\n",
 			cpu, pkg, new);
-	}
-	physical_to_logical_pkg[pkg] = new;
-
 found:
-	cpu_data(cpu).logical_proc_id = physical_to_logical_pkg[pkg];
+	cpu_data(cpu).logical_proc_id = new;
 	return 0;
 }
 
-/**
- * topology_phys_to_logical_pkg - Map a physical package id to a logical
- *
- * Returns logical package id or -1 if not found
- */
-int topology_phys_to_logical_pkg(unsigned int phys_pkg)
-{
-	if (phys_pkg >= max_physical_pkg_id)
-		return -1;
-	return physical_to_logical_pkg[phys_pkg];
-}
-EXPORT_SYMBOL(topology_phys_to_logical_pkg);
-
 static void __init smp_init_package_map(struct cpuinfo_x86 *c, unsigned int cpu)
 {
 	unsigned int ncpus;
-	size_t size;
 
 	/*
 	 * Today neither Intel nor AMD support heterogenous systems. That
@@ -363,19 +356,6 @@ static void __init smp_init_package_map(struct cpuinfo_x86 *c, unsigned int cpu)
 	}
 
 	__max_logical_packages = DIV_ROUND_UP(total_cpus, ncpus);
-	logical_packages = 0;
-
-	/*
-	 * Possibly larger than what we need as the number of apic ids per
-	 * package can be smaller than the actual used apic ids.
-	 */
-	max_physical_pkg_id = DIV_ROUND_UP(MAX_LOCAL_APIC, ncpus);
-	size = max_physical_pkg_id * sizeof(unsigned int);
-	physical_to_logical_pkg = kmalloc(size, GFP_KERNEL);
-	memset(physical_to_logical_pkg, 0xff, size);
-	size = BITS_TO_LONGS(max_physical_pkg_id) * sizeof(unsigned long);
-	physical_package_map = kzalloc(size, GFP_KERNEL);
-
 	pr_info("Max logical packages: %u\n", __max_logical_packages);
 
 	topology_update_package_map(c->phys_proc_id, cpu);
@@ -389,6 +369,7 @@ void __init smp_store_boot_cpu_info(void)
 	*c = boot_cpu_data;
 	c->cpu_index = id;
 	smp_init_package_map(c, id);
+	cpu_data(id).set = 1;
 }
 
 /*
@@ -399,13 +380,15 @@ void smp_store_cpu_info(int id)
 {
 	struct cpuinfo_x86 *c = &cpu_data(id);
 
-	*c = boot_cpu_data;
+	if (cpu_data(id).set == 0)
+		*c = boot_cpu_data;
 	c->cpu_index = id;
 	/*
 	 * During boot time, CPU0 has this setup already. Save the info when
 	 * bringing up AP or offlined CPU0.
 	 */
 	identify_secondary_cpu(c);
+	cpu_data(id).set = 1;
 }
 
 static bool
-- 
2.15.0.rc0.39.g2f0e14e64

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v6 3/3] x86/smpboot: Fix __max_logical_packages estimate
  2017-11-14 12:42 [PATCH v6 0/3] Fix panic in logical packages calculation Prarit Bhargava
  2017-11-14 12:42 ` [PATCH v6 1/3] perf/x86/intel/uncore: Cache logical pkg id in uncore driver Prarit Bhargava
  2017-11-14 12:42 ` [PATCH v6 2/3] x86/topology: Avoid wasting 128k for package id array Prarit Bhargava
@ 2017-11-14 12:42 ` Prarit Bhargava
  2017-11-17 15:27   ` [tip:x86/urgent] " tip-bot for Prarit Bhargava
                     ` (2 more replies)
  2 siblings, 3 replies; 16+ messages in thread
From: Prarit Bhargava @ 2017-11-14 12:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Prarit Bhargava, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	x86, Peter Zijlstra, Andi Kleen, Dave Hansen, Piotr Luc,
	Kan Liang, Borislav Petkov, Stephane Eranian, Arvind Yadav,
	Andy Lutomirski, Christian Borntraeger, Kirill A. Shutemov,
	Tom Lendacky, He Chen, Mathias Krause, Tim Chen,
	Vitaly Kuznetsov

A system booted with a small number of cores enabled per package
panics because the estimate of __max_logical_packages is too low.
This occurs when the total number of active cores across all packages
is less than the maximum core count for a single package.

ie) On a 4 package system with 20 cores/package where only 4 cores
are enabled on each package, the value of __max_logical_packages is
calculated as DIV_ROUND_UP(16 / 20) = 1 and not 4.

Calculate __max_logical_packages after the cpu enumeration has completed.
Use the boot cpu's data to extrapolate the number of packages.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Piotr Luc <piotr.luc@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Stephane Eranian <eranian@google.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: He Chen <he.chen@linux.intel.com>
Cc: Mathias Krause <minipli@googlemail.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kernel/smpboot.c | 55 +++++++++--------------------------------------
 1 file changed, 10 insertions(+), 45 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 838d36ff7ba6..2e3c5a394e79 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -308,12 +308,6 @@ int topology_update_package_map(unsigned int pkg, unsigned int cpu)
 	if (new >= 0)
 		goto found;
 
-	if (logical_packages >= __max_logical_packages) {
-		pr_warn("Package %u of CPU %u exceeds BIOS package data %u.\n",
-			logical_packages, cpu, __max_logical_packages);
-		return -ENOSPC;
-	}
-
 	new = logical_packages++;
 	if (new != pkg)
 		pr_info("CPU %u Converting physical %u to logical package %u\n",
@@ -323,44 +317,6 @@ int topology_update_package_map(unsigned int pkg, unsigned int cpu)
 	return 0;
 }
 
-static void __init smp_init_package_map(struct cpuinfo_x86 *c, unsigned int cpu)
-{
-	unsigned int ncpus;
-
-	/*
-	 * Today neither Intel nor AMD support heterogenous systems. That
-	 * might change in the future....
-	 *
-	 * While ideally we'd want '* smp_num_siblings' in the below @ncpus
-	 * computation, this won't actually work since some Intel BIOSes
-	 * report inconsistent HT data when they disable HT.
-	 *
-	 * In particular, they reduce the APIC-IDs to only include the cores,
-	 * but leave the CPUID topology to say there are (2) siblings.
-	 * This means we don't know how many threads there will be until
-	 * after the APIC enumeration.
-	 *
-	 * By not including this we'll sometimes over-estimate the number of
-	 * logical packages by the amount of !present siblings, but this is
-	 * still better than MAX_LOCAL_APIC.
-	 *
-	 * We use total_cpus not nr_cpu_ids because nr_cpu_ids can be limited
-	 * on the command line leading to a similar issue as the HT disable
-	 * problem because the hyperthreads are usually enumerated after the
-	 * primary cores.
-	 */
-	ncpus = boot_cpu_data.x86_max_cores;
-	if (!ncpus) {
-		pr_warn("x86_max_cores == zero !?!?");
-		ncpus = 1;
-	}
-
-	__max_logical_packages = DIV_ROUND_UP(total_cpus, ncpus);
-	pr_info("Max logical packages: %u\n", __max_logical_packages);
-
-	topology_update_package_map(c->phys_proc_id, cpu);
-}
-
 void __init smp_store_boot_cpu_info(void)
 {
 	int id = 0; /* CPU 0 */
@@ -368,7 +324,7 @@ void __init smp_store_boot_cpu_info(void)
 
 	*c = boot_cpu_data;
 	c->cpu_index = id;
-	smp_init_package_map(c, id);
+	topology_update_package_map(c->phys_proc_id, id);
 	cpu_data(id).set = 1;
 }
 
@@ -1371,7 +1327,16 @@ void __init native_smp_prepare_boot_cpu(void)
 
 void __init native_smp_cpus_done(unsigned int max_cpus)
 {
+	int ncpus;
+
 	pr_debug("Boot done\n");
+	/*
+	 * Today neither Intel nor AMD support heterogenous systems so
+	 * extrapolate the boot cpu's data to all packages.
+	 */
+	ncpus = cpu_data(0).booted_cores * smp_num_siblings;
+	__max_logical_packages = DIV_ROUND_UP(nr_cpu_ids, ncpus);
+	pr_info("Max logical packages: %u\n", __max_logical_packages);
 
 	if (x86_has_numa_in_package)
 		set_sched_topology(x86_numa_in_package_topology);
-- 
2.15.0.rc0.39.g2f0e14e64

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v6 2/3] x86/topology: Avoid wasting 128k for package id array
  2017-11-14 12:42 ` [PATCH v6 2/3] x86/topology: Avoid wasting 128k for package id array Prarit Bhargava
@ 2017-11-17 14:59   ` Thomas Gleixner
  2017-11-17 15:27   ` [tip:x86/urgent] " tip-bot for Andi Kleen
  1 sibling, 0 replies; 16+ messages in thread
From: Thomas Gleixner @ 2017-11-17 14:59 UTC (permalink / raw)
  To: Prarit Bhargava
  Cc: linux-kernel, Andi Kleen, Ingo Molnar, H. Peter Anvin, x86,
	Peter Zijlstra, Dave Hansen, Piotr Luc, Kan Liang,
	Borislav Petkov, Stephane Eranian, Arvind Yadav, Andy Lutomirski,
	Christian Borntraeger, Kirill A. Shutemov, Tom Lendacky, He Chen,
	Mathias Krause, Tim Chen, Vitaly Kuznetsov

On Tue, 14 Nov 2017, Prarit Bhargava wrote:
> @@ -389,6 +369,7 @@ void __init smp_store_boot_cpu_info(void)
>  	*c = boot_cpu_data;
>  	c->cpu_index = id;
>  	smp_init_package_map(c, id);
> +	cpu_data(id).set = 1;

What's wrong with

       c->set ?

Aside of that 'set' is not really descriptive. initialized might be better
suited.

>  }
>  
>  /*
> @@ -399,13 +380,15 @@ void smp_store_cpu_info(int id)
>  {
>  	struct cpuinfo_x86 *c = &cpu_data(id);
>  
> -	*c = boot_cpu_data;
> +	if (cpu_data(id).set == 0)
> +		*c = boot_cpu_data;
>  	c->cpu_index = id;
>  	/*
>  	 * During boot time, CPU0 has this setup already. Save the info when
>  	 * bringing up AP or offlined CPU0.
>  	 */
>  	identify_secondary_cpu(c);
> +	cpu_data(id).set = 1;

See above.

No need to resend. I'll fix it up.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [tip:x86/urgent] perf/x86/intel/uncore: Cache logical pkg id in uncore driver
  2017-11-14 12:42 ` [PATCH v6 1/3] perf/x86/intel/uncore: Cache logical pkg id in uncore driver Prarit Bhargava
@ 2017-11-17 15:27   ` tip-bot for Andi Kleen
  0 siblings, 0 replies; 16+ messages in thread
From: tip-bot for Andi Kleen @ 2017-11-17 15:27 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, he.chen, dave.hansen, minipli, luto, eranian, mingo, prarit,
	kan.liang, tim.c.chen, arvind.yadav.cs, kirill.shutemov,
	borntraeger, thomas.lendacky, peterz, ak, tglx, bp, linux-kernel,
	vkuznets, piotr.luc

Commit-ID:  d46b4c1ce5f0d9a13fb2318763076442669a2bdc
Gitweb:     https://git.kernel.org/tip/d46b4c1ce5f0d9a13fb2318763076442669a2bdc
Author:     Andi Kleen <ak@linux.intel.com>
AuthorDate: Tue, 14 Nov 2017 07:42:55 -0500
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Fri, 17 Nov 2017 16:22:30 +0100

perf/x86/intel/uncore: Cache logical pkg id in uncore driver

The SNB-EP uncore driver is the only user of topology_phys_to_logical_pkg
in a performance critical path.

Change it query the logical pkg ID only once at initialization time and
then cache it in box structure. This allows to change the logical package
management without affecting the performance critical path.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: He Chen <he.chen@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Piotr Luc <piotr.luc@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Mathias Krause <minipli@googlemail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Link: https://lkml.kernel.org/r/20171114124257.22013-2-prarit@redhat.com

---
 arch/x86/events/intel/uncore.c       | 4 ++--
 arch/x86/events/intel/uncore.h       | 2 +-
 arch/x86/events/intel/uncore_snbep.c | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index d45e063..7874c98 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -975,10 +975,10 @@ static void uncore_pci_remove(struct pci_dev *pdev)
 	int i, phys_id, pkg;
 
 	phys_id = uncore_pcibus_to_physid(pdev->bus);
-	pkg = topology_phys_to_logical_pkg(phys_id);
 
 	box = pci_get_drvdata(pdev);
 	if (!box) {
+		pkg = topology_phys_to_logical_pkg(phys_id);
 		for (i = 0; i < UNCORE_EXTRA_PCI_DEV_MAX; i++) {
 			if (uncore_extra_pci_dev[pkg].dev[i] == pdev) {
 				uncore_extra_pci_dev[pkg].dev[i] = NULL;
@@ -994,7 +994,7 @@ static void uncore_pci_remove(struct pci_dev *pdev)
 		return;
 
 	pci_set_drvdata(pdev, NULL);
-	pmu->boxes[pkg] = NULL;
+	pmu->boxes[box->pkgid] = NULL;
 	if (atomic_dec_return(&pmu->activeboxes) == 0)
 		uncore_pmu_unregister(pmu);
 	uncore_box_exit(box);
diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h
index 4364191..414dc7e 100644
--- a/arch/x86/events/intel/uncore.h
+++ b/arch/x86/events/intel/uncore.h
@@ -100,7 +100,7 @@ struct intel_uncore_extra_reg {
 
 struct intel_uncore_box {
 	int pci_phys_id;
-	int pkgid;
+	int pkgid;	/* Logical package ID */
 	int n_active;	/* number of active events */
 	int n_events;
 	int cpu;	/* cpu to collect events */
diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index 95cb19f..de8f862 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -1057,7 +1057,7 @@ static void snbep_qpi_enable_event(struct intel_uncore_box *box, struct perf_eve
 
 	if (reg1->idx != EXTRA_REG_NONE) {
 		int idx = box->pmu->pmu_idx + SNBEP_PCI_QPI_PORT0_FILTER;
-		int pkg = topology_phys_to_logical_pkg(box->pci_phys_id);
+		int pkg = box->pkgid;
 		struct pci_dev *filter_pdev = uncore_extra_pci_dev[pkg].dev[idx];
 
 		if (filter_pdev) {

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [tip:x86/urgent] x86/topology: Avoid wasting 128k for package id array
  2017-11-14 12:42 ` [PATCH v6 2/3] x86/topology: Avoid wasting 128k for package id array Prarit Bhargava
  2017-11-17 14:59   ` Thomas Gleixner
@ 2017-11-17 15:27   ` tip-bot for Andi Kleen
  1 sibling, 0 replies; 16+ messages in thread
From: tip-bot for Andi Kleen @ 2017-11-17 15:27 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: dave.hansen, linux-kernel, mingo, prarit, vkuznets, ak,
	kirill.shutemov, kan.liang, luto, tim.c.chen, eranian, bp,
	thomas.lendacky, borntraeger, he.chen, peterz, arvind.yadav.cs,
	tglx, hpa, minipli, piotr.luc

Commit-ID:  30bb9811856f667042e746d8033883b1091a46ce
Gitweb:     https://git.kernel.org/tip/30bb9811856f667042e746d8033883b1091a46ce
Author:     Andi Kleen <ak@linux.intel.com>
AuthorDate: Tue, 14 Nov 2017 07:42:56 -0500
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Fri, 17 Nov 2017 16:22:30 +0100

x86/topology: Avoid wasting 128k for package id array

Analyzing large early boot allocations unveiled the logical package id
storage as a prominent memory waste. Since commit 1f12e32f4cd5
("x86/topology: Create logical package id") every 64-bit system allocates a
128k array to convert logical package ids.

This happens because the array is sized for MAX_LOCAL_APIC which is always
32k on 64bit systems, and it needs 4 bytes for each entry.

This is fairly wasteful, especially for the common case of having only one
socket, which uses exactly 4 byte out of 128K.

There is no user of the package id map which is performance critical, so
the lookup is not required to be O(1). Store the logical processor id in
cpu_data and use a loop based lookup.

To keep the mapping stable accross cpu hotplug operations, add a flag to
cpu_data which is set when the CPU is brought up the first time. When the
flag is set, then cpu_data is not reinitialized by copying boot_cpu_data on
subsequent bringups.

[ tglx: Rename the flag to 'initialized', use proper pointers instead of
  	repeated cpu_data(x) evaluation and massage changelog. ]

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: He Chen <he.chen@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Piotr Luc <piotr.luc@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Mathias Krause <minipli@googlemail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Link: https://lkml.kernel.org/r/20171114124257.22013-3-prarit@redhat.com

---
 arch/x86/include/asm/processor.h |  1 +
 arch/x86/kernel/smpboot.c        | 73 ++++++++++++++++------------------------
 2 files changed, 30 insertions(+), 44 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 2db7cf7..cc16fa8 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -132,6 +132,7 @@ struct cpuinfo_x86 {
 	/* Index into per_cpu list: */
 	u16			cpu_index;
 	u32			microcode;
+	unsigned		initialized : 1;
 } __randomize_layout;
 
 struct cpuid_regs {
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 5f59e6b..da5e162 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -101,9 +101,6 @@ DEFINE_PER_CPU_READ_MOSTLY(struct cpuinfo_x86, cpu_info);
 EXPORT_PER_CPU_SYMBOL(cpu_info);
 
 /* Logical package management. We might want to allocate that dynamically */
-static int *physical_to_logical_pkg __read_mostly;
-static unsigned long *physical_package_map __read_mostly;;
-static unsigned int max_physical_pkg_id __read_mostly;
 unsigned int __max_logical_packages __read_mostly;
 EXPORT_SYMBOL(__max_logical_packages);
 static unsigned int logical_packages __read_mostly;
@@ -281,23 +278,36 @@ static void notrace start_secondary(void *unused)
 }
 
 /**
+ * topology_phys_to_logical_pkg - Map a physical package id to a logical
+ *
+ * Returns logical package id or -1 if not found
+ */
+int topology_phys_to_logical_pkg(unsigned int phys_pkg)
+{
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+		struct cpuinfo_x86 *c = &cpu_data(cpu);
+
+		if (c->initialized && c->phys_proc_id == phys_pkg)
+			return c->logical_proc_id;
+	}
+	return -1;
+}
+EXPORT_SYMBOL(topology_phys_to_logical_pkg);
+
+/**
  * topology_update_package_map - Update the physical to logical package map
  * @pkg:	The physical package id as retrieved via CPUID
  * @cpu:	The cpu for which this is updated
  */
 int topology_update_package_map(unsigned int pkg, unsigned int cpu)
 {
-	unsigned int new;
-
-	/* Called from early boot ? */
-	if (!physical_package_map)
-		return 0;
-
-	if (pkg >= max_physical_pkg_id)
-		return -EINVAL;
+	int new;
 
-	/* Set the logical package id */
-	if (test_and_set_bit(pkg, physical_package_map))
+	/* Already available somewhere? */
+	new = topology_phys_to_logical_pkg(pkg);
+	if (new >= 0)
 		goto found;
 
 	if (logical_packages >= __max_logical_packages) {
@@ -311,30 +321,14 @@ int topology_update_package_map(unsigned int pkg, unsigned int cpu)
 		pr_info("CPU %u Converting physical %u to logical package %u\n",
 			cpu, pkg, new);
 	}
-	physical_to_logical_pkg[pkg] = new;
-
 found:
-	cpu_data(cpu).logical_proc_id = physical_to_logical_pkg[pkg];
+	cpu_data(cpu).logical_proc_id = new;
 	return 0;
 }
 
-/**
- * topology_phys_to_logical_pkg - Map a physical package id to a logical
- *
- * Returns logical package id or -1 if not found
- */
-int topology_phys_to_logical_pkg(unsigned int phys_pkg)
-{
-	if (phys_pkg >= max_physical_pkg_id)
-		return -1;
-	return physical_to_logical_pkg[phys_pkg];
-}
-EXPORT_SYMBOL(topology_phys_to_logical_pkg);
-
 static void __init smp_init_package_map(struct cpuinfo_x86 *c, unsigned int cpu)
 {
 	unsigned int ncpus;
-	size_t size;
 
 	/*
 	 * Today neither Intel nor AMD support heterogenous systems. That
@@ -365,19 +359,6 @@ static void __init smp_init_package_map(struct cpuinfo_x86 *c, unsigned int cpu)
 	}
 
 	__max_logical_packages = DIV_ROUND_UP(total_cpus, ncpus);
-	logical_packages = 0;
-
-	/*
-	 * Possibly larger than what we need as the number of apic ids per
-	 * package can be smaller than the actual used apic ids.
-	 */
-	max_physical_pkg_id = DIV_ROUND_UP(MAX_LOCAL_APIC, ncpus);
-	size = max_physical_pkg_id * sizeof(unsigned int);
-	physical_to_logical_pkg = kmalloc(size, GFP_KERNEL);
-	memset(physical_to_logical_pkg, 0xff, size);
-	size = BITS_TO_LONGS(max_physical_pkg_id) * sizeof(unsigned long);
-	physical_package_map = kzalloc(size, GFP_KERNEL);
-
 	pr_info("Max logical packages: %u\n", __max_logical_packages);
 
 	topology_update_package_map(c->phys_proc_id, cpu);
@@ -391,6 +372,7 @@ void __init smp_store_boot_cpu_info(void)
 	*c = boot_cpu_data;
 	c->cpu_index = id;
 	smp_init_package_map(c, id);
+	c->initialized = true;
 }
 
 /*
@@ -401,13 +383,16 @@ void smp_store_cpu_info(int id)
 {
 	struct cpuinfo_x86 *c = &cpu_data(id);
 
-	*c = boot_cpu_data;
+	/* Copy boot_cpu_data only on the first bringup */
+	if (!c->initialized)
+		*c = boot_cpu_data;
 	c->cpu_index = id;
 	/*
 	 * During boot time, CPU0 has this setup already. Save the info when
 	 * bringing up AP or offlined CPU0.
 	 */
 	identify_secondary_cpu(c);
+	c->initialized = true;
 }
 
 static bool

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [tip:x86/urgent] x86/smpboot: Fix __max_logical_packages estimate
  2017-11-14 12:42 ` [PATCH v6 3/3] x86/smpboot: Fix __max_logical_packages estimate Prarit Bhargava
@ 2017-11-17 15:27   ` tip-bot for Prarit Bhargava
  2018-02-07 18:44   ` [v6,3/3] " Simon Gaiser
  2018-02-07 18:44   ` [v6, 3/3] " Simon Gaiser
  2 siblings, 0 replies; 16+ messages in thread
From: tip-bot for Prarit Bhargava @ 2017-11-17 15:27 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: eranian, tglx, vkuznets, ak, bp, piotr.luc, minipli, borntraeger,
	dave.hansen, arvind.yadav.cs, linux-kernel, thomas.lendacky,
	peterz, kirill.shutemov, luto, hpa, kan.liang, prarit, mingo,
	he.chen, tim.c.chen

Commit-ID:  b4c0a7326f5dc0ef7a64128b0ae7d081f4b2cbd1
Gitweb:     https://git.kernel.org/tip/b4c0a7326f5dc0ef7a64128b0ae7d081f4b2cbd1
Author:     Prarit Bhargava <prarit@redhat.com>
AuthorDate: Tue, 14 Nov 2017 07:42:57 -0500
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Fri, 17 Nov 2017 16:22:31 +0100

x86/smpboot: Fix __max_logical_packages estimate

A system booted with a small number of cores enabled per package
panics because the estimate of __max_logical_packages is too low.

This occurs when the total number of active cores across all packages is
less than the maximum core count for a single package. e.g.:

  On a 4 package system with 20 cores/package where only 4 cores are
  enabled on each package, the value of __max_logical_packages is
  calculated as DIV_ROUND_UP(16 / 20) = 1 and not 4.

Calculate __max_logical_packages after the cpu enumeration has completed.
Use the boot cpu's data to extrapolate the number of packages.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: He Chen <he.chen@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Piotr Luc <piotr.luc@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Mathias Krause <minipli@googlemail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Link: https://lkml.kernel.org/r/20171114124257.22013-4-prarit@redhat.com

---
 arch/x86/kernel/smpboot.c | 55 +++++++++--------------------------------------
 1 file changed, 10 insertions(+), 45 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index da5e162..3d01df7 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -310,12 +310,6 @@ int topology_update_package_map(unsigned int pkg, unsigned int cpu)
 	if (new >= 0)
 		goto found;
 
-	if (logical_packages >= __max_logical_packages) {
-		pr_warn("Package %u of CPU %u exceeds BIOS package data %u.\n",
-			logical_packages, cpu, __max_logical_packages);
-		return -ENOSPC;
-	}
-
 	new = logical_packages++;
 	if (new != pkg) {
 		pr_info("CPU %u Converting physical %u to logical package %u\n",
@@ -326,44 +320,6 @@ found:
 	return 0;
 }
 
-static void __init smp_init_package_map(struct cpuinfo_x86 *c, unsigned int cpu)
-{
-	unsigned int ncpus;
-
-	/*
-	 * Today neither Intel nor AMD support heterogenous systems. That
-	 * might change in the future....
-	 *
-	 * While ideally we'd want '* smp_num_siblings' in the below @ncpus
-	 * computation, this won't actually work since some Intel BIOSes
-	 * report inconsistent HT data when they disable HT.
-	 *
-	 * In particular, they reduce the APIC-IDs to only include the cores,
-	 * but leave the CPUID topology to say there are (2) siblings.
-	 * This means we don't know how many threads there will be until
-	 * after the APIC enumeration.
-	 *
-	 * By not including this we'll sometimes over-estimate the number of
-	 * logical packages by the amount of !present siblings, but this is
-	 * still better than MAX_LOCAL_APIC.
-	 *
-	 * We use total_cpus not nr_cpu_ids because nr_cpu_ids can be limited
-	 * on the command line leading to a similar issue as the HT disable
-	 * problem because the hyperthreads are usually enumerated after the
-	 * primary cores.
-	 */
-	ncpus = boot_cpu_data.x86_max_cores;
-	if (!ncpus) {
-		pr_warn("x86_max_cores == zero !?!?");
-		ncpus = 1;
-	}
-
-	__max_logical_packages = DIV_ROUND_UP(total_cpus, ncpus);
-	pr_info("Max logical packages: %u\n", __max_logical_packages);
-
-	topology_update_package_map(c->phys_proc_id, cpu);
-}
-
 void __init smp_store_boot_cpu_info(void)
 {
 	int id = 0; /* CPU 0 */
@@ -371,7 +327,7 @@ void __init smp_store_boot_cpu_info(void)
 
 	*c = boot_cpu_data;
 	c->cpu_index = id;
-	smp_init_package_map(c, id);
+	topology_update_package_map(c->phys_proc_id, id);
 	c->initialized = true;
 }
 
@@ -1341,7 +1297,16 @@ void __init native_smp_prepare_boot_cpu(void)
 
 void __init native_smp_cpus_done(unsigned int max_cpus)
 {
+	int ncpus;
+
 	pr_debug("Boot done\n");
+	/*
+	 * Today neither Intel nor AMD support heterogenous systems so
+	 * extrapolate the boot cpu's data to all packages.
+	 */
+	ncpus = cpu_data(0).booted_cores * smp_num_siblings;
+	__max_logical_packages = DIV_ROUND_UP(nr_cpu_ids, ncpus);
+	pr_info("Max logical packages: %u\n", __max_logical_packages);
 
 	if (x86_has_numa_in_package)
 		set_sched_topology(x86_numa_in_package_topology);

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [v6,3/3] x86/smpboot: Fix __max_logical_packages estimate
  2017-11-14 12:42 ` [PATCH v6 3/3] x86/smpboot: Fix __max_logical_packages estimate Prarit Bhargava
  2017-11-17 15:27   ` [tip:x86/urgent] " tip-bot for Prarit Bhargava
@ 2018-02-07 18:44   ` Simon Gaiser
  2018-02-07 19:04     ` [v6, 3/3] " Prarit Bhargava
  2018-02-07 19:04     ` [v6,3/3] " Prarit Bhargava
  2018-02-07 18:44   ` [v6, 3/3] " Simon Gaiser
  2 siblings, 2 replies; 16+ messages in thread
From: Simon Gaiser @ 2018-02-07 18:44 UTC (permalink / raw)
  To: Prarit Bhargava, xen-devel
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Peter Zijlstra, Andi Kleen, Dave Hansen, Piotr Luc, Kan Liang,
	Borislav Petkov, Stephane Eranian, Arvind Yadav, Andy Lutomirski,
	Christian Borntraeger, Kirill A. Shutemov, Tom Lendacky, He Chen,
	Mathias Krause, Tim Chen, Vitaly Kuznetsov


[-- Attachment #1.1: Type: text/plain, Size: 5203 bytes --]

Prarit Bhargava:
> A system booted with a small number of cores enabled per package
> panics because the estimate of __max_logical_packages is too low.
> This occurs when the total number of active cores across all packages
> is less than the maximum core count for a single package.
> 
> ie) On a 4 package system with 20 cores/package where only 4 cores
> are enabled on each package, the value of __max_logical_packages is
> calculated as DIV_ROUND_UP(16 / 20) = 1 and not 4.
> 
> Calculate __max_logical_packages after the cpu enumeration has completed.
> Use the boot cpu's data to extrapolate the number of packages.
> 
> Signed-off-by: Prarit Bhargava <prarit@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: x86@kernel.org
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Andi Kleen <ak@linux.intel.com>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Piotr Luc <piotr.luc@intel.com>
> Cc: Kan Liang <kan.liang@intel.com>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Stephane Eranian <eranian@google.com>
> Cc: Prarit Bhargava <prarit@redhat.com>
> Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: He Chen <he.chen@linux.intel.com>
> Cc: Mathias Krause <minipli@googlemail.com>
> Cc: Tim Chen <tim.c.chen@linux.intel.com>
> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kernel/smpboot.c | 55 +++++++++--------------------------------------
>  1 file changed, 10 insertions(+), 45 deletions(-)
> 
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index 838d36ff7ba6..2e3c5a394e79 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -308,12 +308,6 @@ int topology_update_package_map(unsigned int pkg, unsigned int cpu)
>  	if (new >= 0)
>  		goto found;
>  
> -	if (logical_packages >= __max_logical_packages) {
> -		pr_warn("Package %u of CPU %u exceeds BIOS package data %u.\n",
> -			logical_packages, cpu, __max_logical_packages);
> -		return -ENOSPC;
> -	}
> -
>  	new = logical_packages++;
>  	if (new != pkg)
>  		pr_info("CPU %u Converting physical %u to logical package %u\n",
> @@ -323,44 +317,6 @@ int topology_update_package_map(unsigned int pkg, unsigned int cpu)
>  	return 0;
>  }
>  
> -static void __init smp_init_package_map(struct cpuinfo_x86 *c, unsigned int cpu)
> -{
> -	unsigned int ncpus;
> -
> -	/*
> -	 * Today neither Intel nor AMD support heterogenous systems. That
> -	 * might change in the future....
> -	 *
> -	 * While ideally we'd want '* smp_num_siblings' in the below @ncpus
> -	 * computation, this won't actually work since some Intel BIOSes
> -	 * report inconsistent HT data when they disable HT.
> -	 *
> -	 * In particular, they reduce the APIC-IDs to only include the cores,
> -	 * but leave the CPUID topology to say there are (2) siblings.
> -	 * This means we don't know how many threads there will be until
> -	 * after the APIC enumeration.
> -	 *
> -	 * By not including this we'll sometimes over-estimate the number of
> -	 * logical packages by the amount of !present siblings, but this is
> -	 * still better than MAX_LOCAL_APIC.
> -	 *
> -	 * We use total_cpus not nr_cpu_ids because nr_cpu_ids can be limited
> -	 * on the command line leading to a similar issue as the HT disable
> -	 * problem because the hyperthreads are usually enumerated after the
> -	 * primary cores.
> -	 */
> -	ncpus = boot_cpu_data.x86_max_cores;
> -	if (!ncpus) {
> -		pr_warn("x86_max_cores == zero !?!?");
> -		ncpus = 1;
> -	}
> -
> -	__max_logical_packages = DIV_ROUND_UP(total_cpus, ncpus);
> -	pr_info("Max logical packages: %u\n", __max_logical_packages);
> -
> -	topology_update_package_map(c->phys_proc_id, cpu);
> -}
> -
>  void __init smp_store_boot_cpu_info(void)
>  {
>  	int id = 0; /* CPU 0 */
> @@ -368,7 +324,7 @@ void __init smp_store_boot_cpu_info(void)
>  
>  	*c = boot_cpu_data;
>  	c->cpu_index = id;
> -	smp_init_package_map(c, id);
> +	topology_update_package_map(c->phys_proc_id, id);
>  	cpu_data(id).set = 1;
>  }
>  
> @@ -1371,7 +1327,16 @@ void __init native_smp_prepare_boot_cpu(void)
>  
>  void __init native_smp_cpus_done(unsigned int max_cpus)
>  {
> +	int ncpus;
> +
>  	pr_debug("Boot done\n");
> +	/*
> +	 * Today neither Intel nor AMD support heterogenous systems so
> +	 * extrapolate the boot cpu's data to all packages.
> +	 */
> +	ncpus = cpu_data(0).booted_cores * smp_num_siblings;
> +	__max_logical_packages = DIV_ROUND_UP(nr_cpu_ids, ncpus);
> +	pr_info("Max logical packages: %u\n", __max_logical_packages);
>  
>  	if (x86_has_numa_in_package)
>  		set_sched_topology(x86_numa_in_package_topology);

This breaks booting as Xen PV domain for me. The problem seems to be
that native_smp_cpus_done() is never called on a PV domain. So
__max_logical_packages is uninitialized and this leads to a NULL
pointer dereference in coretemp.


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [v6, 3/3] x86/smpboot: Fix __max_logical_packages estimate
  2017-11-14 12:42 ` [PATCH v6 3/3] x86/smpboot: Fix __max_logical_packages estimate Prarit Bhargava
  2017-11-17 15:27   ` [tip:x86/urgent] " tip-bot for Prarit Bhargava
  2018-02-07 18:44   ` [v6,3/3] " Simon Gaiser
@ 2018-02-07 18:44   ` Simon Gaiser
  2 siblings, 0 replies; 16+ messages in thread
From: Simon Gaiser @ 2018-02-07 18:44 UTC (permalink / raw)
  To: Prarit Bhargava, xen-devel
  Cc: Tom Lendacky, Vitaly Kuznetsov, Andi Kleen,
	Christian Borntraeger, Peter Zijlstra, Kan Liang, x86,
	linux-kernel, Stephane Eranian, He Chen, Dave Hansen, Piotr Luc,
	Ingo Molnar, Andy Lutomirski, H. Peter Anvin, Arvind Yadav,
	Thomas Gleixner, Borislav Petkov, Tim Chen, Mathias Krause,
	Kirill A. Shutemov


[-- Attachment #1.1.1: Type: text/plain, Size: 5203 bytes --]

Prarit Bhargava:
> A system booted with a small number of cores enabled per package
> panics because the estimate of __max_logical_packages is too low.
> This occurs when the total number of active cores across all packages
> is less than the maximum core count for a single package.
> 
> ie) On a 4 package system with 20 cores/package where only 4 cores
> are enabled on each package, the value of __max_logical_packages is
> calculated as DIV_ROUND_UP(16 / 20) = 1 and not 4.
> 
> Calculate __max_logical_packages after the cpu enumeration has completed.
> Use the boot cpu's data to extrapolate the number of packages.
> 
> Signed-off-by: Prarit Bhargava <prarit@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: x86@kernel.org
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Andi Kleen <ak@linux.intel.com>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Piotr Luc <piotr.luc@intel.com>
> Cc: Kan Liang <kan.liang@intel.com>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Stephane Eranian <eranian@google.com>
> Cc: Prarit Bhargava <prarit@redhat.com>
> Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: He Chen <he.chen@linux.intel.com>
> Cc: Mathias Krause <minipli@googlemail.com>
> Cc: Tim Chen <tim.c.chen@linux.intel.com>
> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kernel/smpboot.c | 55 +++++++++--------------------------------------
>  1 file changed, 10 insertions(+), 45 deletions(-)
> 
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index 838d36ff7ba6..2e3c5a394e79 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -308,12 +308,6 @@ int topology_update_package_map(unsigned int pkg, unsigned int cpu)
>  	if (new >= 0)
>  		goto found;
>  
> -	if (logical_packages >= __max_logical_packages) {
> -		pr_warn("Package %u of CPU %u exceeds BIOS package data %u.\n",
> -			logical_packages, cpu, __max_logical_packages);
> -		return -ENOSPC;
> -	}
> -
>  	new = logical_packages++;
>  	if (new != pkg)
>  		pr_info("CPU %u Converting physical %u to logical package %u\n",
> @@ -323,44 +317,6 @@ int topology_update_package_map(unsigned int pkg, unsigned int cpu)
>  	return 0;
>  }
>  
> -static void __init smp_init_package_map(struct cpuinfo_x86 *c, unsigned int cpu)
> -{
> -	unsigned int ncpus;
> -
> -	/*
> -	 * Today neither Intel nor AMD support heterogenous systems. That
> -	 * might change in the future....
> -	 *
> -	 * While ideally we'd want '* smp_num_siblings' in the below @ncpus
> -	 * computation, this won't actually work since some Intel BIOSes
> -	 * report inconsistent HT data when they disable HT.
> -	 *
> -	 * In particular, they reduce the APIC-IDs to only include the cores,
> -	 * but leave the CPUID topology to say there are (2) siblings.
> -	 * This means we don't know how many threads there will be until
> -	 * after the APIC enumeration.
> -	 *
> -	 * By not including this we'll sometimes over-estimate the number of
> -	 * logical packages by the amount of !present siblings, but this is
> -	 * still better than MAX_LOCAL_APIC.
> -	 *
> -	 * We use total_cpus not nr_cpu_ids because nr_cpu_ids can be limited
> -	 * on the command line leading to a similar issue as the HT disable
> -	 * problem because the hyperthreads are usually enumerated after the
> -	 * primary cores.
> -	 */
> -	ncpus = boot_cpu_data.x86_max_cores;
> -	if (!ncpus) {
> -		pr_warn("x86_max_cores == zero !?!?");
> -		ncpus = 1;
> -	}
> -
> -	__max_logical_packages = DIV_ROUND_UP(total_cpus, ncpus);
> -	pr_info("Max logical packages: %u\n", __max_logical_packages);
> -
> -	topology_update_package_map(c->phys_proc_id, cpu);
> -}
> -
>  void __init smp_store_boot_cpu_info(void)
>  {
>  	int id = 0; /* CPU 0 */
> @@ -368,7 +324,7 @@ void __init smp_store_boot_cpu_info(void)
>  
>  	*c = boot_cpu_data;
>  	c->cpu_index = id;
> -	smp_init_package_map(c, id);
> +	topology_update_package_map(c->phys_proc_id, id);
>  	cpu_data(id).set = 1;
>  }
>  
> @@ -1371,7 +1327,16 @@ void __init native_smp_prepare_boot_cpu(void)
>  
>  void __init native_smp_cpus_done(unsigned int max_cpus)
>  {
> +	int ncpus;
> +
>  	pr_debug("Boot done\n");
> +	/*
> +	 * Today neither Intel nor AMD support heterogenous systems so
> +	 * extrapolate the boot cpu's data to all packages.
> +	 */
> +	ncpus = cpu_data(0).booted_cores * smp_num_siblings;
> +	__max_logical_packages = DIV_ROUND_UP(nr_cpu_ids, ncpus);
> +	pr_info("Max logical packages: %u\n", __max_logical_packages);
>  
>  	if (x86_has_numa_in_package)
>  		set_sched_topology(x86_numa_in_package_topology);

This breaks booting as Xen PV domain for me. The problem seems to be
that native_smp_cpus_done() is never called on a PV domain. So
__max_logical_packages is uninitialized and this leads to a NULL
pointer dereference in coretemp.


[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [v6,3/3] x86/smpboot: Fix __max_logical_packages estimate
  2018-02-07 18:44   ` [v6,3/3] " Simon Gaiser
  2018-02-07 19:04     ` [v6, 3/3] " Prarit Bhargava
@ 2018-02-07 19:04     ` Prarit Bhargava
  2018-02-07 19:26       ` [v6, 3/3] " Simon Gaiser
  2018-02-07 19:26       ` [v6,3/3] " Simon Gaiser
  1 sibling, 2 replies; 16+ messages in thread
From: Prarit Bhargava @ 2018-02-07 19:04 UTC (permalink / raw)
  To: Simon Gaiser, xen-devel
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Peter Zijlstra, Andi Kleen, Dave Hansen, Piotr Luc, Kan Liang,
	Borislav Petkov, Stephane Eranian, Arvind Yadav, Andy Lutomirski,
	Christian Borntraeger, Kirill A. Shutemov, Tom Lendacky, He Chen,
	Mathias Krause, Tim Chen, Vitaly Kuznetsov



On 02/07/2018 01:44 PM, Simon Gaiser wrote:
> Prarit Bhargava:
>> A system booted with a small number of cores enabled per package
>> panics because the estimate of __max_logical_packages is too low.
>> This occurs when the total number of active cores across all packages
>> is less than the maximum core count for a single package.
>>
>> ie) On a 4 package system with 20 cores/package where only 4 cores
>> are enabled on each package, the value of __max_logical_packages is
>> calculated as DIV_ROUND_UP(16 / 20) = 1 and not 4.
>>
>> Calculate __max_logical_packages after the cpu enumeration has completed.
>> Use the boot cpu's data to extrapolate the number of packages.
>>
>> Signed-off-by: Prarit Bhargava <prarit@redhat.com>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: "H. Peter Anvin" <hpa@zytor.com>
>> Cc: x86@kernel.org
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Andi Kleen <ak@linux.intel.com>
>> Cc: Dave Hansen <dave.hansen@intel.com>
>> Cc: Piotr Luc <piotr.luc@intel.com>
>> Cc: Kan Liang <kan.liang@intel.com>
>> Cc: Borislav Petkov <bp@suse.de>
>> Cc: Stephane Eranian <eranian@google.com>
>> Cc: Prarit Bhargava <prarit@redhat.com>
>> Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
>> Cc: Andy Lutomirski <luto@kernel.org>
>> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
>> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
>> Cc: Tom Lendacky <thomas.lendacky@amd.com>
>> Cc: He Chen <he.chen@linux.intel.com>
>> Cc: Mathias Krause <minipli@googlemail.com>
>> Cc: Tim Chen <tim.c.chen@linux.intel.com>
>> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  arch/x86/kernel/smpboot.c | 55 +++++++++--------------------------------------
>>  1 file changed, 10 insertions(+), 45 deletions(-)
>>
>> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
>> index 838d36ff7ba6..2e3c5a394e79 100644
>> --- a/arch/x86/kernel/smpboot.c
>> +++ b/arch/x86/kernel/smpboot.c
>> @@ -308,12 +308,6 @@ int topology_update_package_map(unsigned int pkg, unsigned int cpu)
>>  	if (new >= 0)
>>  		goto found;
>>  
>> -	if (logical_packages >= __max_logical_packages) {
>> -		pr_warn("Package %u of CPU %u exceeds BIOS package data %u.\n",
>> -			logical_packages, cpu, __max_logical_packages);
>> -		return -ENOSPC;
>> -	}
>> -
>>  	new = logical_packages++;
>>  	if (new != pkg)
>>  		pr_info("CPU %u Converting physical %u to logical package %u\n",
>> @@ -323,44 +317,6 @@ int topology_update_package_map(unsigned int pkg, unsigned int cpu)
>>  	return 0;
>>  }
>>  
>> -static void __init smp_init_package_map(struct cpuinfo_x86 *c, unsigned int cpu)
>> -{
>> -	unsigned int ncpus;
>> -
>> -	/*
>> -	 * Today neither Intel nor AMD support heterogenous systems. That
>> -	 * might change in the future....
>> -	 *
>> -	 * While ideally we'd want '* smp_num_siblings' in the below @ncpus
>> -	 * computation, this won't actually work since some Intel BIOSes
>> -	 * report inconsistent HT data when they disable HT.
>> -	 *
>> -	 * In particular, they reduce the APIC-IDs to only include the cores,
>> -	 * but leave the CPUID topology to say there are (2) siblings.
>> -	 * This means we don't know how many threads there will be until
>> -	 * after the APIC enumeration.
>> -	 *
>> -	 * By not including this we'll sometimes over-estimate the number of
>> -	 * logical packages by the amount of !present siblings, but this is
>> -	 * still better than MAX_LOCAL_APIC.
>> -	 *
>> -	 * We use total_cpus not nr_cpu_ids because nr_cpu_ids can be limited
>> -	 * on the command line leading to a similar issue as the HT disable
>> -	 * problem because the hyperthreads are usually enumerated after the
>> -	 * primary cores.
>> -	 */
>> -	ncpus = boot_cpu_data.x86_max_cores;
>> -	if (!ncpus) {
>> -		pr_warn("x86_max_cores == zero !?!?");
>> -		ncpus = 1;
>> -	}
>> -
>> -	__max_logical_packages = DIV_ROUND_UP(total_cpus, ncpus);
>> -	pr_info("Max logical packages: %u\n", __max_logical_packages);
>> -
>> -	topology_update_package_map(c->phys_proc_id, cpu);
>> -}
>> -
>>  void __init smp_store_boot_cpu_info(void)
>>  {
>>  	int id = 0; /* CPU 0 */
>> @@ -368,7 +324,7 @@ void __init smp_store_boot_cpu_info(void)
>>  
>>  	*c = boot_cpu_data;
>>  	c->cpu_index = id;
>> -	smp_init_package_map(c, id);
>> +	topology_update_package_map(c->phys_proc_id, id);
>>  	cpu_data(id).set = 1;
>>  }
>>  
>> @@ -1371,7 +1327,16 @@ void __init native_smp_prepare_boot_cpu(void)
>>  
>>  void __init native_smp_cpus_done(unsigned int max_cpus)
>>  {
>> +	int ncpus;
>> +
>>  	pr_debug("Boot done\n");
>> +	/*
>> +	 * Today neither Intel nor AMD support heterogenous systems so
>> +	 * extrapolate the boot cpu's data to all packages.
>> +	 */
>> +	ncpus = cpu_data(0).booted_cores * smp_num_siblings;
>> +	__max_logical_packages = DIV_ROUND_UP(nr_cpu_ids, ncpus);
>> +	pr_info("Max logical packages: %u\n", __max_logical_packages);
>>  
>>  	if (x86_has_numa_in_package)
>>  		set_sched_topology(x86_numa_in_package_topology);
> 
> This breaks booting as Xen PV domain for me. The problem seems to be
> that native_smp_cpus_done() is never called on a PV domain. So
> __max_logical_packages is uninitialized and this leads to a NULL
> pointer dereference in coretemp.
> 

I'll see if I can figure out a way to test that.  Does 947134d9b00f
("x86/smpboot: Do not use smp_num_siblings in __max_logical_packages
calculation") help?

P.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [v6, 3/3] x86/smpboot: Fix __max_logical_packages estimate
  2018-02-07 18:44   ` [v6,3/3] " Simon Gaiser
@ 2018-02-07 19:04     ` Prarit Bhargava
  2018-02-07 19:04     ` [v6,3/3] " Prarit Bhargava
  1 sibling, 0 replies; 16+ messages in thread
From: Prarit Bhargava @ 2018-02-07 19:04 UTC (permalink / raw)
  To: Simon Gaiser, xen-devel
  Cc: Tom Lendacky, Vitaly Kuznetsov, Andi Kleen,
	Christian Borntraeger, Peter Zijlstra, Kan Liang, x86,
	linux-kernel, Stephane Eranian, He Chen, Dave Hansen, Piotr Luc,
	Ingo Molnar, Andy Lutomirski, H. Peter Anvin, Arvind Yadav,
	Thomas Gleixner, Borislav Petkov, Tim Chen, Mathias Krause,
	Kirill A. Shutemov



On 02/07/2018 01:44 PM, Simon Gaiser wrote:
> Prarit Bhargava:
>> A system booted with a small number of cores enabled per package
>> panics because the estimate of __max_logical_packages is too low.
>> This occurs when the total number of active cores across all packages
>> is less than the maximum core count for a single package.
>>
>> ie) On a 4 package system with 20 cores/package where only 4 cores
>> are enabled on each package, the value of __max_logical_packages is
>> calculated as DIV_ROUND_UP(16 / 20) = 1 and not 4.
>>
>> Calculate __max_logical_packages after the cpu enumeration has completed.
>> Use the boot cpu's data to extrapolate the number of packages.
>>
>> Signed-off-by: Prarit Bhargava <prarit@redhat.com>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: "H. Peter Anvin" <hpa@zytor.com>
>> Cc: x86@kernel.org
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Andi Kleen <ak@linux.intel.com>
>> Cc: Dave Hansen <dave.hansen@intel.com>
>> Cc: Piotr Luc <piotr.luc@intel.com>
>> Cc: Kan Liang <kan.liang@intel.com>
>> Cc: Borislav Petkov <bp@suse.de>
>> Cc: Stephane Eranian <eranian@google.com>
>> Cc: Prarit Bhargava <prarit@redhat.com>
>> Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
>> Cc: Andy Lutomirski <luto@kernel.org>
>> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
>> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
>> Cc: Tom Lendacky <thomas.lendacky@amd.com>
>> Cc: He Chen <he.chen@linux.intel.com>
>> Cc: Mathias Krause <minipli@googlemail.com>
>> Cc: Tim Chen <tim.c.chen@linux.intel.com>
>> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  arch/x86/kernel/smpboot.c | 55 +++++++++--------------------------------------
>>  1 file changed, 10 insertions(+), 45 deletions(-)
>>
>> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
>> index 838d36ff7ba6..2e3c5a394e79 100644
>> --- a/arch/x86/kernel/smpboot.c
>> +++ b/arch/x86/kernel/smpboot.c
>> @@ -308,12 +308,6 @@ int topology_update_package_map(unsigned int pkg, unsigned int cpu)
>>  	if (new >= 0)
>>  		goto found;
>>  
>> -	if (logical_packages >= __max_logical_packages) {
>> -		pr_warn("Package %u of CPU %u exceeds BIOS package data %u.\n",
>> -			logical_packages, cpu, __max_logical_packages);
>> -		return -ENOSPC;
>> -	}
>> -
>>  	new = logical_packages++;
>>  	if (new != pkg)
>>  		pr_info("CPU %u Converting physical %u to logical package %u\n",
>> @@ -323,44 +317,6 @@ int topology_update_package_map(unsigned int pkg, unsigned int cpu)
>>  	return 0;
>>  }
>>  
>> -static void __init smp_init_package_map(struct cpuinfo_x86 *c, unsigned int cpu)
>> -{
>> -	unsigned int ncpus;
>> -
>> -	/*
>> -	 * Today neither Intel nor AMD support heterogenous systems. That
>> -	 * might change in the future....
>> -	 *
>> -	 * While ideally we'd want '* smp_num_siblings' in the below @ncpus
>> -	 * computation, this won't actually work since some Intel BIOSes
>> -	 * report inconsistent HT data when they disable HT.
>> -	 *
>> -	 * In particular, they reduce the APIC-IDs to only include the cores,
>> -	 * but leave the CPUID topology to say there are (2) siblings.
>> -	 * This means we don't know how many threads there will be until
>> -	 * after the APIC enumeration.
>> -	 *
>> -	 * By not including this we'll sometimes over-estimate the number of
>> -	 * logical packages by the amount of !present siblings, but this is
>> -	 * still better than MAX_LOCAL_APIC.
>> -	 *
>> -	 * We use total_cpus not nr_cpu_ids because nr_cpu_ids can be limited
>> -	 * on the command line leading to a similar issue as the HT disable
>> -	 * problem because the hyperthreads are usually enumerated after the
>> -	 * primary cores.
>> -	 */
>> -	ncpus = boot_cpu_data.x86_max_cores;
>> -	if (!ncpus) {
>> -		pr_warn("x86_max_cores == zero !?!?");
>> -		ncpus = 1;
>> -	}
>> -
>> -	__max_logical_packages = DIV_ROUND_UP(total_cpus, ncpus);
>> -	pr_info("Max logical packages: %u\n", __max_logical_packages);
>> -
>> -	topology_update_package_map(c->phys_proc_id, cpu);
>> -}
>> -
>>  void __init smp_store_boot_cpu_info(void)
>>  {
>>  	int id = 0; /* CPU 0 */
>> @@ -368,7 +324,7 @@ void __init smp_store_boot_cpu_info(void)
>>  
>>  	*c = boot_cpu_data;
>>  	c->cpu_index = id;
>> -	smp_init_package_map(c, id);
>> +	topology_update_package_map(c->phys_proc_id, id);
>>  	cpu_data(id).set = 1;
>>  }
>>  
>> @@ -1371,7 +1327,16 @@ void __init native_smp_prepare_boot_cpu(void)
>>  
>>  void __init native_smp_cpus_done(unsigned int max_cpus)
>>  {
>> +	int ncpus;
>> +
>>  	pr_debug("Boot done\n");
>> +	/*
>> +	 * Today neither Intel nor AMD support heterogenous systems so
>> +	 * extrapolate the boot cpu's data to all packages.
>> +	 */
>> +	ncpus = cpu_data(0).booted_cores * smp_num_siblings;
>> +	__max_logical_packages = DIV_ROUND_UP(nr_cpu_ids, ncpus);
>> +	pr_info("Max logical packages: %u\n", __max_logical_packages);
>>  
>>  	if (x86_has_numa_in_package)
>>  		set_sched_topology(x86_numa_in_package_topology);
> 
> This breaks booting as Xen PV domain for me. The problem seems to be
> that native_smp_cpus_done() is never called on a PV domain. So
> __max_logical_packages is uninitialized and this leads to a NULL
> pointer dereference in coretemp.
> 

I'll see if I can figure out a way to test that.  Does 947134d9b00f
("x86/smpboot: Do not use smp_num_siblings in __max_logical_packages
calculation") help?

P.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [v6,3/3] x86/smpboot: Fix __max_logical_packages estimate
  2018-02-07 19:04     ` [v6,3/3] " Prarit Bhargava
  2018-02-07 19:26       ` [v6, 3/3] " Simon Gaiser
@ 2018-02-07 19:26       ` Simon Gaiser
  2018-02-07 19:31         ` [v6, 3/3] " Prarit Bhargava
  2018-02-07 19:31         ` [v6,3/3] " Prarit Bhargava
  1 sibling, 2 replies; 16+ messages in thread
From: Simon Gaiser @ 2018-02-07 19:26 UTC (permalink / raw)
  To: Prarit Bhargava, xen-devel
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Peter Zijlstra, Andi Kleen, Dave Hansen, Piotr Luc, Kan Liang,
	Borislav Petkov, Stephane Eranian, Arvind Yadav, Andy Lutomirski,
	Christian Borntraeger, Kirill A. Shutemov, Tom Lendacky, He Chen,
	Mathias Krause, Tim Chen, Vitaly Kuznetsov


[-- Attachment #1.1: Type: text/plain, Size: 5797 bytes --]

Prarit Bhargava:
> On 02/07/2018 01:44 PM, Simon Gaiser wrote:
>> Prarit Bhargava:
>>> A system booted with a small number of cores enabled per package
>>> panics because the estimate of __max_logical_packages is too low.
>>> This occurs when the total number of active cores across all packages
>>> is less than the maximum core count for a single package.
>>>
>>> ie) On a 4 package system with 20 cores/package where only 4 cores
>>> are enabled on each package, the value of __max_logical_packages is
>>> calculated as DIV_ROUND_UP(16 / 20) = 1 and not 4.
>>>
>>> Calculate __max_logical_packages after the cpu enumeration has completed.
>>> Use the boot cpu's data to extrapolate the number of packages.
>>>
>>> Signed-off-by: Prarit Bhargava <prarit@redhat.com>
>>> Cc: Thomas Gleixner <tglx@linutronix.de>
>>> Cc: Ingo Molnar <mingo@redhat.com>
>>> Cc: "H. Peter Anvin" <hpa@zytor.com>
>>> Cc: x86@kernel.org
>>> Cc: Peter Zijlstra <peterz@infradead.org>
>>> Cc: Andi Kleen <ak@linux.intel.com>
>>> Cc: Dave Hansen <dave.hansen@intel.com>
>>> Cc: Piotr Luc <piotr.luc@intel.com>
>>> Cc: Kan Liang <kan.liang@intel.com>
>>> Cc: Borislav Petkov <bp@suse.de>
>>> Cc: Stephane Eranian <eranian@google.com>
>>> Cc: Prarit Bhargava <prarit@redhat.com>
>>> Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
>>> Cc: Andy Lutomirski <luto@kernel.org>
>>> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
>>> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
>>> Cc: Tom Lendacky <thomas.lendacky@amd.com>
>>> Cc: He Chen <he.chen@linux.intel.com>
>>> Cc: Mathias Krause <minipli@googlemail.com>
>>> Cc: Tim Chen <tim.c.chen@linux.intel.com>
>>> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
>>> ---
>>>  arch/x86/kernel/smpboot.c | 55 +++++++++--------------------------------------
>>>  1 file changed, 10 insertions(+), 45 deletions(-)
>>>
>>> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
>>> index 838d36ff7ba6..2e3c5a394e79 100644
>>> --- a/arch/x86/kernel/smpboot.c
>>> +++ b/arch/x86/kernel/smpboot.c
>>> @@ -308,12 +308,6 @@ int topology_update_package_map(unsigned int pkg, unsigned int cpu)
>>>  	if (new >= 0)
>>>  		goto found;
>>>  
>>> -	if (logical_packages >= __max_logical_packages) {
>>> -		pr_warn("Package %u of CPU %u exceeds BIOS package data %u.\n",
>>> -			logical_packages, cpu, __max_logical_packages);
>>> -		return -ENOSPC;
>>> -	}
>>> -
>>>  	new = logical_packages++;
>>>  	if (new != pkg)
>>>  		pr_info("CPU %u Converting physical %u to logical package %u\n",
>>> @@ -323,44 +317,6 @@ int topology_update_package_map(unsigned int pkg, unsigned int cpu)
>>>  	return 0;
>>>  }
>>>  
>>> -static void __init smp_init_package_map(struct cpuinfo_x86 *c, unsigned int cpu)
>>> -{
>>> -	unsigned int ncpus;
>>> -
>>> -	/*
>>> -	 * Today neither Intel nor AMD support heterogenous systems. That
>>> -	 * might change in the future....
>>> -	 *
>>> -	 * While ideally we'd want '* smp_num_siblings' in the below @ncpus
>>> -	 * computation, this won't actually work since some Intel BIOSes
>>> -	 * report inconsistent HT data when they disable HT.
>>> -	 *
>>> -	 * In particular, they reduce the APIC-IDs to only include the cores,
>>> -	 * but leave the CPUID topology to say there are (2) siblings.
>>> -	 * This means we don't know how many threads there will be until
>>> -	 * after the APIC enumeration.
>>> -	 *
>>> -	 * By not including this we'll sometimes over-estimate the number of
>>> -	 * logical packages by the amount of !present siblings, but this is
>>> -	 * still better than MAX_LOCAL_APIC.
>>> -	 *
>>> -	 * We use total_cpus not nr_cpu_ids because nr_cpu_ids can be limited
>>> -	 * on the command line leading to a similar issue as the HT disable
>>> -	 * problem because the hyperthreads are usually enumerated after the
>>> -	 * primary cores.
>>> -	 */
>>> -	ncpus = boot_cpu_data.x86_max_cores;
>>> -	if (!ncpus) {
>>> -		pr_warn("x86_max_cores == zero !?!?");
>>> -		ncpus = 1;
>>> -	}
>>> -
>>> -	__max_logical_packages = DIV_ROUND_UP(total_cpus, ncpus);
>>> -	pr_info("Max logical packages: %u\n", __max_logical_packages);
>>> -
>>> -	topology_update_package_map(c->phys_proc_id, cpu);
>>> -}
>>> -
>>>  void __init smp_store_boot_cpu_info(void)
>>>  {
>>>  	int id = 0; /* CPU 0 */
>>> @@ -368,7 +324,7 @@ void __init smp_store_boot_cpu_info(void)
>>>  
>>>  	*c = boot_cpu_data;
>>>  	c->cpu_index = id;
>>> -	smp_init_package_map(c, id);
>>> +	topology_update_package_map(c->phys_proc_id, id);
>>>  	cpu_data(id).set = 1;
>>>  }
>>>  
>>> @@ -1371,7 +1327,16 @@ void __init native_smp_prepare_boot_cpu(void)
>>>  
>>>  void __init native_smp_cpus_done(unsigned int max_cpus)
>>>  {
>>> +	int ncpus;
>>> +
>>>  	pr_debug("Boot done\n");
>>> +	/*
>>> +	 * Today neither Intel nor AMD support heterogenous systems so
>>> +	 * extrapolate the boot cpu's data to all packages.
>>> +	 */
>>> +	ncpus = cpu_data(0).booted_cores * smp_num_siblings;
>>> +	__max_logical_packages = DIV_ROUND_UP(nr_cpu_ids, ncpus);
>>> +	pr_info("Max logical packages: %u\n", __max_logical_packages);
>>>  
>>>  	if (x86_has_numa_in_package)
>>>  		set_sched_topology(x86_numa_in_package_topology);
>>
>> This breaks booting as Xen PV domain for me. The problem seems to be
>> that native_smp_cpus_done() is never called on a PV domain. So
>> __max_logical_packages is uninitialized and this leads to a NULL
>> pointer dereference in coretemp.
>>
> 
> I'll see if I can figure out a way to test that.

Ok, thanks. If you need some logs, or if I should test something just
ask.

> Does 947134d9b00f
> ("x86/smpboot: Do not use smp_num_siblings in __max_logical_packages
> calculation") help?

No.


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [v6, 3/3] x86/smpboot: Fix __max_logical_packages estimate
  2018-02-07 19:04     ` [v6,3/3] " Prarit Bhargava
@ 2018-02-07 19:26       ` Simon Gaiser
  2018-02-07 19:26       ` [v6,3/3] " Simon Gaiser
  1 sibling, 0 replies; 16+ messages in thread
From: Simon Gaiser @ 2018-02-07 19:26 UTC (permalink / raw)
  To: Prarit Bhargava, xen-devel
  Cc: Tom Lendacky, Vitaly Kuznetsov, Andi Kleen,
	Christian Borntraeger, Peter Zijlstra, Kan Liang, x86,
	linux-kernel, Stephane Eranian, He Chen, Dave Hansen, Piotr Luc,
	Ingo Molnar, Andy Lutomirski, H. Peter Anvin, Arvind Yadav,
	Thomas Gleixner, Borislav Petkov, Tim Chen, Mathias Krause,
	Kirill A. Shutemov


[-- Attachment #1.1.1: Type: text/plain, Size: 5797 bytes --]

Prarit Bhargava:
> On 02/07/2018 01:44 PM, Simon Gaiser wrote:
>> Prarit Bhargava:
>>> A system booted with a small number of cores enabled per package
>>> panics because the estimate of __max_logical_packages is too low.
>>> This occurs when the total number of active cores across all packages
>>> is less than the maximum core count for a single package.
>>>
>>> ie) On a 4 package system with 20 cores/package where only 4 cores
>>> are enabled on each package, the value of __max_logical_packages is
>>> calculated as DIV_ROUND_UP(16 / 20) = 1 and not 4.
>>>
>>> Calculate __max_logical_packages after the cpu enumeration has completed.
>>> Use the boot cpu's data to extrapolate the number of packages.
>>>
>>> Signed-off-by: Prarit Bhargava <prarit@redhat.com>
>>> Cc: Thomas Gleixner <tglx@linutronix.de>
>>> Cc: Ingo Molnar <mingo@redhat.com>
>>> Cc: "H. Peter Anvin" <hpa@zytor.com>
>>> Cc: x86@kernel.org
>>> Cc: Peter Zijlstra <peterz@infradead.org>
>>> Cc: Andi Kleen <ak@linux.intel.com>
>>> Cc: Dave Hansen <dave.hansen@intel.com>
>>> Cc: Piotr Luc <piotr.luc@intel.com>
>>> Cc: Kan Liang <kan.liang@intel.com>
>>> Cc: Borislav Petkov <bp@suse.de>
>>> Cc: Stephane Eranian <eranian@google.com>
>>> Cc: Prarit Bhargava <prarit@redhat.com>
>>> Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
>>> Cc: Andy Lutomirski <luto@kernel.org>
>>> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
>>> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
>>> Cc: Tom Lendacky <thomas.lendacky@amd.com>
>>> Cc: He Chen <he.chen@linux.intel.com>
>>> Cc: Mathias Krause <minipli@googlemail.com>
>>> Cc: Tim Chen <tim.c.chen@linux.intel.com>
>>> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
>>> ---
>>>  arch/x86/kernel/smpboot.c | 55 +++++++++--------------------------------------
>>>  1 file changed, 10 insertions(+), 45 deletions(-)
>>>
>>> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
>>> index 838d36ff7ba6..2e3c5a394e79 100644
>>> --- a/arch/x86/kernel/smpboot.c
>>> +++ b/arch/x86/kernel/smpboot.c
>>> @@ -308,12 +308,6 @@ int topology_update_package_map(unsigned int pkg, unsigned int cpu)
>>>  	if (new >= 0)
>>>  		goto found;
>>>  
>>> -	if (logical_packages >= __max_logical_packages) {
>>> -		pr_warn("Package %u of CPU %u exceeds BIOS package data %u.\n",
>>> -			logical_packages, cpu, __max_logical_packages);
>>> -		return -ENOSPC;
>>> -	}
>>> -
>>>  	new = logical_packages++;
>>>  	if (new != pkg)
>>>  		pr_info("CPU %u Converting physical %u to logical package %u\n",
>>> @@ -323,44 +317,6 @@ int topology_update_package_map(unsigned int pkg, unsigned int cpu)
>>>  	return 0;
>>>  }
>>>  
>>> -static void __init smp_init_package_map(struct cpuinfo_x86 *c, unsigned int cpu)
>>> -{
>>> -	unsigned int ncpus;
>>> -
>>> -	/*
>>> -	 * Today neither Intel nor AMD support heterogenous systems. That
>>> -	 * might change in the future....
>>> -	 *
>>> -	 * While ideally we'd want '* smp_num_siblings' in the below @ncpus
>>> -	 * computation, this won't actually work since some Intel BIOSes
>>> -	 * report inconsistent HT data when they disable HT.
>>> -	 *
>>> -	 * In particular, they reduce the APIC-IDs to only include the cores,
>>> -	 * but leave the CPUID topology to say there are (2) siblings.
>>> -	 * This means we don't know how many threads there will be until
>>> -	 * after the APIC enumeration.
>>> -	 *
>>> -	 * By not including this we'll sometimes over-estimate the number of
>>> -	 * logical packages by the amount of !present siblings, but this is
>>> -	 * still better than MAX_LOCAL_APIC.
>>> -	 *
>>> -	 * We use total_cpus not nr_cpu_ids because nr_cpu_ids can be limited
>>> -	 * on the command line leading to a similar issue as the HT disable
>>> -	 * problem because the hyperthreads are usually enumerated after the
>>> -	 * primary cores.
>>> -	 */
>>> -	ncpus = boot_cpu_data.x86_max_cores;
>>> -	if (!ncpus) {
>>> -		pr_warn("x86_max_cores == zero !?!?");
>>> -		ncpus = 1;
>>> -	}
>>> -
>>> -	__max_logical_packages = DIV_ROUND_UP(total_cpus, ncpus);
>>> -	pr_info("Max logical packages: %u\n", __max_logical_packages);
>>> -
>>> -	topology_update_package_map(c->phys_proc_id, cpu);
>>> -}
>>> -
>>>  void __init smp_store_boot_cpu_info(void)
>>>  {
>>>  	int id = 0; /* CPU 0 */
>>> @@ -368,7 +324,7 @@ void __init smp_store_boot_cpu_info(void)
>>>  
>>>  	*c = boot_cpu_data;
>>>  	c->cpu_index = id;
>>> -	smp_init_package_map(c, id);
>>> +	topology_update_package_map(c->phys_proc_id, id);
>>>  	cpu_data(id).set = 1;
>>>  }
>>>  
>>> @@ -1371,7 +1327,16 @@ void __init native_smp_prepare_boot_cpu(void)
>>>  
>>>  void __init native_smp_cpus_done(unsigned int max_cpus)
>>>  {
>>> +	int ncpus;
>>> +
>>>  	pr_debug("Boot done\n");
>>> +	/*
>>> +	 * Today neither Intel nor AMD support heterogenous systems so
>>> +	 * extrapolate the boot cpu's data to all packages.
>>> +	 */
>>> +	ncpus = cpu_data(0).booted_cores * smp_num_siblings;
>>> +	__max_logical_packages = DIV_ROUND_UP(nr_cpu_ids, ncpus);
>>> +	pr_info("Max logical packages: %u\n", __max_logical_packages);
>>>  
>>>  	if (x86_has_numa_in_package)
>>>  		set_sched_topology(x86_numa_in_package_topology);
>>
>> This breaks booting as Xen PV domain for me. The problem seems to be
>> that native_smp_cpus_done() is never called on a PV domain. So
>> __max_logical_packages is uninitialized and this leads to a NULL
>> pointer dereference in coretemp.
>>
> 
> I'll see if I can figure out a way to test that.

Ok, thanks. If you need some logs, or if I should test something just
ask.

> Does 947134d9b00f
> ("x86/smpboot: Do not use smp_num_siblings in __max_logical_packages
> calculation") help?

No.


[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [v6,3/3] x86/smpboot: Fix __max_logical_packages estimate
  2018-02-07 19:26       ` [v6,3/3] " Simon Gaiser
  2018-02-07 19:31         ` [v6, 3/3] " Prarit Bhargava
@ 2018-02-07 19:31         ` Prarit Bhargava
  1 sibling, 0 replies; 16+ messages in thread
From: Prarit Bhargava @ 2018-02-07 19:31 UTC (permalink / raw)
  To: Simon Gaiser, xen-devel
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Peter Zijlstra, Andi Kleen, Dave Hansen, Piotr Luc, Kan Liang,
	Borislav Petkov, Stephane Eranian, Arvind Yadav, Andy Lutomirski,
	Christian Borntraeger, Kirill A. Shutemov, Tom Lendacky, He Chen,
	Mathias Krause, Tim Chen, Vitaly Kuznetsov



On 02/07/2018 02:26 PM, Simon Gaiser wrote:
> Prarit Bhargava:
>> On 02/07/2018 01:44 PM, Simon Gaiser wrote:

<snip>

>>> This breaks booting as Xen PV domain for me. The problem seems to be
>>> that native_smp_cpus_done() is never called on a PV domain. So
>>> __max_logical_packages is uninitialized and this leads to a NULL
>>> pointer dereference in coretemp.
>>>
>>
>> I'll see if I can figure out a way to test that.
> 
> Ok, thanks. If you need some logs, or if I should test something just
> ask.
> 

Thanks, I may send you a patch off-list to test.

>> Does 947134d9b00f
>> ("x86/smpboot: Do not use smp_num_siblings in __max_logical_packages
>> calculation") help?
> 
> No.
> 

Yeah, your description was absolutely correct.  native_smp_cpus_done() is only
called for HVM guests and not PV.  I'm going to think about that and send you a
patch in a bit.

P.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [v6, 3/3] x86/smpboot: Fix __max_logical_packages estimate
  2018-02-07 19:26       ` [v6,3/3] " Simon Gaiser
@ 2018-02-07 19:31         ` Prarit Bhargava
  2018-02-07 19:31         ` [v6,3/3] " Prarit Bhargava
  1 sibling, 0 replies; 16+ messages in thread
From: Prarit Bhargava @ 2018-02-07 19:31 UTC (permalink / raw)
  To: Simon Gaiser, xen-devel
  Cc: Tom Lendacky, Vitaly Kuznetsov, Andi Kleen,
	Christian Borntraeger, Peter Zijlstra, Kan Liang, x86,
	linux-kernel, Stephane Eranian, He Chen, Dave Hansen, Piotr Luc,
	Ingo Molnar, Andy Lutomirski, H. Peter Anvin, Arvind Yadav,
	Thomas Gleixner, Borislav Petkov, Tim Chen, Mathias Krause,
	Kirill A. Shutemov



On 02/07/2018 02:26 PM, Simon Gaiser wrote:
> Prarit Bhargava:
>> On 02/07/2018 01:44 PM, Simon Gaiser wrote:

<snip>

>>> This breaks booting as Xen PV domain for me. The problem seems to be
>>> that native_smp_cpus_done() is never called on a PV domain. So
>>> __max_logical_packages is uninitialized and this leads to a NULL
>>> pointer dereference in coretemp.
>>>
>>
>> I'll see if I can figure out a way to test that.
> 
> Ok, thanks. If you need some logs, or if I should test something just
> ask.
> 

Thanks, I may send you a patch off-list to test.

>> Does 947134d9b00f
>> ("x86/smpboot: Do not use smp_num_siblings in __max_logical_packages
>> calculation") help?
> 
> No.
> 

Yeah, your description was absolutely correct.  native_smp_cpus_done() is only
called for HVM guests and not PV.  I'm going to think about that and send you a
patch in a bit.

P.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2018-02-07 19:31 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-14 12:42 [PATCH v6 0/3] Fix panic in logical packages calculation Prarit Bhargava
2017-11-14 12:42 ` [PATCH v6 1/3] perf/x86/intel/uncore: Cache logical pkg id in uncore driver Prarit Bhargava
2017-11-17 15:27   ` [tip:x86/urgent] " tip-bot for Andi Kleen
2017-11-14 12:42 ` [PATCH v6 2/3] x86/topology: Avoid wasting 128k for package id array Prarit Bhargava
2017-11-17 14:59   ` Thomas Gleixner
2017-11-17 15:27   ` [tip:x86/urgent] " tip-bot for Andi Kleen
2017-11-14 12:42 ` [PATCH v6 3/3] x86/smpboot: Fix __max_logical_packages estimate Prarit Bhargava
2017-11-17 15:27   ` [tip:x86/urgent] " tip-bot for Prarit Bhargava
2018-02-07 18:44   ` [v6,3/3] " Simon Gaiser
2018-02-07 19:04     ` [v6, 3/3] " Prarit Bhargava
2018-02-07 19:04     ` [v6,3/3] " Prarit Bhargava
2018-02-07 19:26       ` [v6, 3/3] " Simon Gaiser
2018-02-07 19:26       ` [v6,3/3] " Simon Gaiser
2018-02-07 19:31         ` [v6, 3/3] " Prarit Bhargava
2018-02-07 19:31         ` [v6,3/3] " Prarit Bhargava
2018-02-07 18:44   ` [v6, 3/3] " Simon Gaiser

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.