linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHSET RFC] x86: unify x86_32 and 64 NUMA init paths
@ 2010-11-11 11:02 Tejun Heo
  2010-11-11 11:02 ` [PATCH 1/9] x86: Kill unused static boot_cpu_logical_apicid in smpboot.c Tejun Heo
                   ` (9 more replies)
  0 siblings, 10 replies; 31+ messages in thread
From: Tejun Heo @ 2010-11-11 11:02 UTC (permalink / raw)
  To: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai

Hello,

This patchset started as an attempt to fix percpu setup problem on
x86_32 NUMA configurations reported by Eric Dumazet.  On x86_32, NUMA
configuration is initialized during SMP bringup which happens way
later than percpu setup.  As percpu setup code doesn't have access to
NUMA configuration, it gets set up as if the system is UMA.

The NUMA init paths differ subtly yet significantly between x86_32 and
64 making it quite difficult to follow.  Custom apic configurations on
x86_32 worsens the problem.

There is no fundamental reason why x86_32 can't be initialized in the
same steps as x86_64.  They just were written differently and just
weren't unified during x86_32/64 code merge.  This patchset unifies
them and in the process fixes percpu setup problem on 32bit NUMA.

This patchset contains the following nine patches.

 0001-x86-Kill-unused-static-boot_cpu_logical_apicid-in-sm.patch
 0002-x86-Rename-x86_32-MAX_APICID-to-MAX_LOCAL_APIC.patch
 0003-x86-Make-default_send_IPI_mask_sequence-allbutself_l.patch
 0004-x86-Initialize-32bit-logical-apicid-mapping-early-du.patch
 0005-x86-Replace-apic-apicid_to_node-with-numa_cpu_node.patch
 0006-x86-Unify-cpu-apicid-NUMA-node-mapping-between-32-an.patch
 0007-x86-Unify-CPU-NUMA-node-mapping-between-32-and-64bit.patch
 0008-x86-Unify-node_to_cpumask_map-handling-between-32-an.patch
 0009-x86-Unify-NUMA-initialization-between-32-and-64bit.patch

0001-0003 are small preparation patches.

0004 changes the role of apic->cpu_to_logical_apicid().  It's now
called once for each cpu early during boot and the output is recorded.
apic works fine but I couldn't test other apic configurations.  If you
know and can test one of bigsmp, numaq, es7000 and summit, please
review and test this one.

Please note that I couldn't figure out how to determine logical_apicid
early for numaq and just made it return BAD_APICID which would result
in suboptimal percpu configuration.  NUMA configuration is still
updated during SMP bringup, so the situation isn't worse than the
patch, but if someone knows how to convert this one, please go ahead.

0005 replaces apic->apicid_to_node() with ->numa_cpu_node() as some
need physical apicid while others logical and both can be easily found
from cpu.

0006-0009 unify different parts of NUMA initialization/handling step
by step.  Afterwards, x86_32 and 64 behave mostly the same regarding
NUMA initialization making the whole thing much easier to understand
and percpu setup works correctly on x86_32 too.

This patchset is on top of the current mainline
(f6614b7bb405a9b35dd28baea989a749492c46b2) and also avaialbe in the
following git branch,

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git x86_32-numa

and contains the following changes.

 arch/x86/Kconfig                      |    2 
 arch/x86/include/asm/apic.h           |   22 ++-
 arch/x86/include/asm/apicdef.h        |    1 
 arch/x86/include/asm/mpspec.h         |    3 
 arch/x86/include/asm/numa.h           |   61 ++++++++++
 arch/x86/include/asm/numa_32.h        |    7 -
 arch/x86/include/asm/numa_64.h        |   16 --
 arch/x86/include/asm/smp.h            |    3 
 arch/x86/include/asm/topology.h       |   17 --
 arch/x86/kernel/acpi/boot.c           |    8 -
 arch/x86/kernel/apic/apic.c           |   46 +++++--
 arch/x86/kernel/apic/apic_flat_64.c   |    4 
 arch/x86/kernel/apic/apic_noop.c      |    4 
 arch/x86/kernel/apic/bigsmp_32.c      |   33 ++---
 arch/x86/kernel/apic/es7000_32.c      |   34 +----
 arch/x86/kernel/apic/ipi.c            |   12 -
 arch/x86/kernel/apic/numaq_32.c       |   26 ++--
 arch/x86/kernel/apic/probe_32.c       |    2 
 arch/x86/kernel/apic/summit_32.c      |   47 ++-----
 arch/x86/kernel/apic/x2apic_cluster.c |    2 
 arch/x86/kernel/apic/x2apic_phys.c    |    2 
 arch/x86/kernel/apic/x2apic_uv_x.c    |    2 
 arch/x86/kernel/cpu/amd.c             |   18 +-
 arch/x86/kernel/cpu/common.c          |    2 
 arch/x86/kernel/cpu/intel.c           |    5 
 arch/x86/kernel/setup.c               |    2 
 arch/x86/kernel/setup_percpu.c        |    4 
 arch/x86/kernel/smpboot.c             |   68 -----------
 arch/x86/mm/k8topology_64.c           |    2 
 arch/x86/mm/numa.c                    |  179 +++++++++++++++++++++++++++++
 arch/x86/mm/numa_32.c                 |    7 +
 arch/x86/mm/numa_64.c                 |  205 ----------------------------------
 arch/x86/mm/srat_32.c                 |    6 
 arch/x86/mm/srat_64.c                 |   10 -
 34 files changed, 413 insertions(+), 449 deletions(-)

Thanks.

--
tejun

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 1/9] x86: Kill unused static boot_cpu_logical_apicid in smpboot.c
  2010-11-11 11:02 [PATCHSET RFC] x86: unify x86_32 and 64 NUMA init paths Tejun Heo
@ 2010-11-11 11:02 ` Tejun Heo
  2010-11-11 11:13   ` Pekka Enberg
  2010-11-11 11:02 ` [PATCH 2/9] x86: Rename x86_32 MAX_APICID to MAX_LOCAL_APIC Tejun Heo
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 31+ messages in thread
From: Tejun Heo @ 2010-11-11 11:02 UTC (permalink / raw)
  To: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai; +Cc: Tejun Heo

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 arch/x86/kernel/smpboot.c |    6 +-----
 1 files changed, 1 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 083e99d..4de6a00 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -165,8 +165,6 @@ static void unmap_cpu_to_node(int cpu)
 #endif
 
 #ifdef CONFIG_X86_32
-static int boot_cpu_logical_apicid;
-
 u8 cpu_2_logical_apicid[NR_CPUS] __read_mostly =
 					{ [0 ... NR_CPUS-1] = BAD_APICID };
 
@@ -1101,9 +1099,7 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
 	 * Setup boot CPU information
 	 */
 	smp_store_cpu_info(0); /* Final full version of the data */
-#ifdef CONFIG_X86_32
-	boot_cpu_logical_apicid = logical_smp_processor_id();
-#endif
+
 	current_thread_info()->cpu = 0;  /* needed? */
 	for_each_possible_cpu(i) {
 		zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 2/9] x86: Rename x86_32 MAX_APICID to MAX_LOCAL_APIC
  2010-11-11 11:02 [PATCHSET RFC] x86: unify x86_32 and 64 NUMA init paths Tejun Heo
  2010-11-11 11:02 ` [PATCH 1/9] x86: Kill unused static boot_cpu_logical_apicid in smpboot.c Tejun Heo
@ 2010-11-11 11:02 ` Tejun Heo
  2010-11-11 11:15   ` Pekka Enberg
  2010-11-11 11:02 ` [PATCH 3/9] x86: Make default_send_IPI_mask_sequence/allbutself_logical() 32bit only Tejun Heo
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 31+ messages in thread
From: Tejun Heo @ 2010-11-11 11:02 UTC (permalink / raw)
  To: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai; +Cc: Tejun Heo

Replace x86_32 MAX_APICID in include/asm/mpspec.h with MAX_LOCAL_APIC
in include/asm/apicdef.h to make it consistent with x86_64.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 arch/x86/include/asm/apicdef.h |    1 +
 arch/x86/include/asm/mpspec.h  |    2 --
 arch/x86/kernel/smpboot.c      |    2 +-
 arch/x86/mm/srat_32.c          |    4 ++--
 4 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/apicdef.h b/arch/x86/include/asm/apicdef.h
index a859ca4..47a30ff 100644
--- a/arch/x86/include/asm/apicdef.h
+++ b/arch/x86/include/asm/apicdef.h
@@ -145,6 +145,7 @@
 
 #ifdef CONFIG_X86_32
 # define MAX_IO_APICS 64
+# define MAX_LOCAL_APIC 256
 #else
 # define MAX_IO_APICS 128
 # define MAX_LOCAL_APIC 32768
diff --git a/arch/x86/include/asm/mpspec.h b/arch/x86/include/asm/mpspec.h
index c82868e..018ffc1 100644
--- a/arch/x86/include/asm/mpspec.h
+++ b/arch/x86/include/asm/mpspec.h
@@ -32,8 +32,6 @@ extern int mp_bus_id_to_local[MAX_MP_BUSSES];
 extern int quad_local_to_mp_bus_id [NR_CPUS/4][4];
 #endif
 
-#define MAX_APICID		256
-
 #else /* CONFIG_X86_64: */
 
 #define MAX_MP_BUSSES		256
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 4de6a00..0b32f17 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -72,7 +72,7 @@
 #include <asm/i8259.h>
 
 #ifdef CONFIG_X86_32
-u8 apicid_2_node[MAX_APICID];
+u8 apicid_2_node[MAX_LOCAL_APIC];
 #endif
 
 /* State of each CPU */
diff --git a/arch/x86/mm/srat_32.c b/arch/x86/mm/srat_32.c
index a17dffd..e55e748 100644
--- a/arch/x86/mm/srat_32.c
+++ b/arch/x86/mm/srat_32.c
@@ -57,7 +57,7 @@ struct node_memory_chunk_s {
 static struct node_memory_chunk_s __initdata node_memory_chunk[MAXCHUNKS];
 
 static int __initdata num_memory_chunks; /* total number of memory chunks */
-static u8 __initdata apicid_to_pxm[MAX_APICID];
+static u8 __initdata apicid_to_pxm[MAX_LOCAL_APIC];
 
 int numa_off __initdata;
 int acpi_numa __initdata;
@@ -254,7 +254,7 @@ int __init get_memcfg_from_srat(void)
 	printk(KERN_DEBUG "Number of memory chunks in system = %d\n",
 			 num_memory_chunks);
 
-	for (i = 0; i < MAX_APICID; i++)
+	for (i = 0; i < MAX_LOCAL_APIC; i++)
 		apicid_2_node[i] = pxm_to_node(apicid_to_pxm[i]);
 
 	for (j = 0; j < num_memory_chunks; j++){
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 3/9] x86: Make default_send_IPI_mask_sequence/allbutself_logical() 32bit only
  2010-11-11 11:02 [PATCHSET RFC] x86: unify x86_32 and 64 NUMA init paths Tejun Heo
  2010-11-11 11:02 ` [PATCH 1/9] x86: Kill unused static boot_cpu_logical_apicid in smpboot.c Tejun Heo
  2010-11-11 11:02 ` [PATCH 2/9] x86: Rename x86_32 MAX_APICID to MAX_LOCAL_APIC Tejun Heo
@ 2010-11-11 11:02 ` Tejun Heo
  2010-11-11 12:02   ` Pekka Enberg
                     ` (2 more replies)
  2010-11-11 11:02 ` [PATCH 4/9] x86: Initialize 32bit logical apicid mapping early during boot Tejun Heo
                   ` (6 subsequent siblings)
  9 siblings, 3 replies; 31+ messages in thread
From: Tejun Heo @ 2010-11-11 11:02 UTC (permalink / raw)
  To: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai; +Cc: Tejun Heo

Both functions are used only in 32bit.  Put them inside CONFIG_X86_32.
This is to prepare for logical apicid handling update.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 arch/x86/kernel/apic/ipi.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/apic/ipi.c b/arch/x86/kernel/apic/ipi.c
index 08385e0..5037736 100644
--- a/arch/x86/kernel/apic/ipi.c
+++ b/arch/x86/kernel/apic/ipi.c
@@ -56,6 +56,8 @@ void default_send_IPI_mask_allbutself_phys(const struct cpumask *mask,
 	local_irq_restore(flags);
 }
 
+#ifdef CONFIG_X86_32
+
 void default_send_IPI_mask_sequence_logical(const struct cpumask *mask,
 						 int vector)
 {
@@ -96,8 +98,6 @@ void default_send_IPI_mask_allbutself_logical(const struct cpumask *mask,
 	local_irq_restore(flags);
 }
 
-#ifdef CONFIG_X86_32
-
 /*
  * This is only used on smaller machines.
  */
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 4/9] x86: Initialize 32bit logical apicid mapping early during boot
  2010-11-11 11:02 [PATCHSET RFC] x86: unify x86_32 and 64 NUMA init paths Tejun Heo
                   ` (2 preceding siblings ...)
  2010-11-11 11:02 ` [PATCH 3/9] x86: Make default_send_IPI_mask_sequence/allbutself_logical() 32bit only Tejun Heo
@ 2010-11-11 11:02 ` Tejun Heo
  2010-11-12  1:54   ` Brian Gerst
                     ` (2 more replies)
  2010-11-11 11:02 ` [PATCH 5/9] x86: Replace apic->apicid_to_node() with ->numa_cpu_node() Tejun Heo
                   ` (5 subsequent siblings)
  9 siblings, 3 replies; 31+ messages in thread
From: Tejun Heo @ 2010-11-11 11:02 UTC (permalink / raw)
  To: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai; +Cc: Tejun Heo

On x86_32, non-standard logical apicid mapping can be used by
different NUMA setups and the mapping is queried while bringing up
each CPU using apic->cpu_to_logical_apicid() to build
cpu_2_logical_apicid[] array.  The logical apicid is then used to
deliver IPIs and determine NUMA configuration.

Unfortunately, initializing at SMP bring up is too late for percpu
setup making static percpu variables setup w/o considering NUMA.  This
also is different from how x86_64 is configured making the code
difficult to follow and maintain.

This patch updates logical apicid mapping handling such that,

* early_percpu variable x86_cpu_to_logical_apicid replaces
  cpu_2_logical_apicid[].

* apic->cpu_to_logical_apicid() is called once during get_smp_config()
  and the output is recorded in x86_cpu_to_logical_apicid.

* apic->cpu_to_logical_apicid() is allowed to return BAD_APICID if it
  can't determine the value that early during boot.  In this case, the
  mapping will be initialized during SMP bring up by reading APIC LDR
  as before.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 arch/x86/include/asm/apic.h      |   18 +++++++++++++-----
 arch/x86/include/asm/smp.h       |    3 +++
 arch/x86/kernel/apic/apic.c      |   36 ++++++++++++++++++++++++++++--------
 arch/x86/kernel/apic/bigsmp_32.c |   26 ++++++++++++++------------
 arch/x86/kernel/apic/es7000_32.c |   27 +++++++++------------------
 arch/x86/kernel/apic/ipi.c       |    8 ++++----
 arch/x86/kernel/apic/numaq_32.c  |   15 +++++++--------
 arch/x86/kernel/apic/summit_32.c |   36 ++++++++++++++++--------------------
 arch/x86/kernel/smpboot.c        |   10 +++-------
 9 files changed, 97 insertions(+), 82 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 286de34..a45e657 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -299,6 +299,19 @@ struct apic {
 	unsigned long (*check_apicid_present)(int apicid);
 
 	void (*vector_allocation_domain)(int cpu, struct cpumask *retmask);
+	/*
+	 * x86_32 specific method called very early during boot from
+	 * get_smp_config().  It should return the logical apicid.
+	 * x86_[bios]_cpu_to_apicid is initialized before this
+	 * function is called.
+	 *
+	 * If logical apicid can't be determined that early, the
+	 * function may return BAD_APICID.  Logical apicid will be
+	 * automatically configured after init_apic_ldr() while
+	 * bringing up CPUs.  Note that NUMA affinity won't work
+	 * properly during early boot in this case.
+	 */
+	int (*cpu_to_logical_apicid)(int cpu);
 	void (*init_apic_ldr)(void);
 
 	void (*ioapic_phys_id_map)(physid_mask_t *phys_map, physid_mask_t *retmap);
@@ -306,7 +319,6 @@ struct apic {
 	void (*setup_apic_routing)(void);
 	int (*multi_timer_check)(int apic, int irq);
 	int (*apicid_to_node)(int logical_apicid);
-	int (*cpu_to_logical_apicid)(int cpu);
 	int (*cpu_present_to_apicid)(int mps_cpu);
 	void (*apicid_to_cpu_present)(int phys_apicid, physid_mask_t *retmap);
 	void (*setup_portio_remap)(void);
@@ -594,8 +606,4 @@ extern int default_check_phys_apicid_present(int phys_apicid);
 
 #endif /* CONFIG_X86_LOCAL_APIC */
 
-#ifdef CONFIG_X86_32
-extern u8 cpu_2_logical_apicid[NR_CPUS];
-#endif
-
 #endif /* _ASM_X86_APIC_H */
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 4c2f63c..dc7c46a 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -38,6 +38,9 @@ static inline struct cpumask *cpu_core_mask(int cpu)
 
 DECLARE_EARLY_PER_CPU(u16, x86_cpu_to_apicid);
 DECLARE_EARLY_PER_CPU(u16, x86_bios_cpu_apicid);
+#if defined(CONFIG_SMP) && defined(CONFIG_X86_32)
+DECLARE_EARLY_PER_CPU(int, x86_cpu_to_logical_apicid);
+#endif
 
 /* Static state in head.S used to set up a CPU */
 extern struct {
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 850657d..481bce1 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -80,6 +80,11 @@ EXPORT_EARLY_PER_CPU_SYMBOL(x86_cpu_to_apicid);
 EXPORT_EARLY_PER_CPU_SYMBOL(x86_bios_cpu_apicid);
 
 #ifdef CONFIG_X86_32
+
+#ifdef CONFIG_SMP
+DEFINE_EARLY_PER_CPU(int, x86_cpu_to_logical_apicid, BAD_APICID);
+#endif
+
 /*
  * Knob to control our willingness to enable the local APIC.
  *
@@ -1202,6 +1207,7 @@ static void __cpuinit lapic_setup_esr(void)
  */
 void __cpuinit setup_local_APIC(void)
 {
+	int cpu = smp_processor_id();
 	unsigned int value, queued;
 	int i, j, acked = 0;
 	unsigned long long tsc = 0, ntsc;
@@ -1241,6 +1247,19 @@ void __cpuinit setup_local_APIC(void)
 	 */
 	apic->init_apic_ldr();
 
+#ifdef CONFIG_X86_32
+	/*
+	 * APIC LDR is initialized.  If logical_apicid mapping was
+	 * initialized during get_smp_config(), make sure it matches
+	 * the actual value.
+	 */
+	i = early_per_cpu(x86_cpu_to_logical_apicid, cpu);
+	WARN_ON(i != BAD_APICID && i != logical_smp_processor_id());
+	/* always use the value from LDR */
+	early_per_cpu(x86_cpu_to_logical_apicid, cpu) =
+		logical_smp_processor_id();
+#endif
+
 	/*
 	 * Set Task Priority to 'accept all'. We never change this
 	 * later on.
@@ -1343,21 +1362,19 @@ void __cpuinit setup_local_APIC(void)
 	 * TODO: set up through-local-APIC from through-I/O-APIC? --macro
 	 */
 	value = apic_read(APIC_LVT0) & APIC_LVT_MASKED;
-	if (!smp_processor_id() && (pic_mode || !value)) {
+	if (!cpu && (pic_mode || !value)) {
 		value = APIC_DM_EXTINT;
-		apic_printk(APIC_VERBOSE, "enabled ExtINT on CPU#%d\n",
-				smp_processor_id());
+		apic_printk(APIC_VERBOSE, "enabled ExtINT on CPU#%d\n", cpu);
 	} else {
 		value = APIC_DM_EXTINT | APIC_LVT_MASKED;
-		apic_printk(APIC_VERBOSE, "masked ExtINT on CPU#%d\n",
-				smp_processor_id());
+		apic_printk(APIC_VERBOSE, "masked ExtINT on CPU#%d\n", cpu);
 	}
 	apic_write(APIC_LVT0, value);
 
 	/*
 	 * only the BP should see the LINT1 NMI signal, obviously.
 	 */
-	if (!smp_processor_id())
+	if (!cpu)
 		value = APIC_DM_NMI;
 	else
 		value = APIC_DM_NMI | APIC_LVT_MASKED;
@@ -1369,7 +1386,7 @@ void __cpuinit setup_local_APIC(void)
 
 #ifdef CONFIG_X86_MCE_INTEL
 	/* Recheck CMCI information after local APIC is up on CPU #0 */
-	if (smp_processor_id() == 0)
+	if (cpu == 0)
 		cmci_recheck();
 #endif
 }
@@ -1967,7 +1984,10 @@ void __cpuinit generic_processor_info(int apicid, int version)
 	early_per_cpu(x86_cpu_to_apicid, cpu) = apicid;
 	early_per_cpu(x86_bios_cpu_apicid, cpu) = apicid;
 #endif
-
+#ifdef CONFIG_X86_32
+	early_per_cpu(x86_cpu_to_logical_apicid, cpu) =
+		apic->cpu_to_logical_apicid(cpu);
+#endif
 	set_cpu_possible(cpu, true);
 	set_cpu_present(cpu, true);
 }
diff --git a/arch/x86/kernel/apic/bigsmp_32.c b/arch/x86/kernel/apic/bigsmp_32.c
index cb804c5..df83be7 100644
--- a/arch/x86/kernel/apic/bigsmp_32.c
+++ b/arch/x86/kernel/apic/bigsmp_32.c
@@ -45,6 +45,12 @@ static unsigned long bigsmp_check_apicid_present(int bit)
 	return 1;
 }
 
+static int bigsmp_cpu_to_logical_apicid(int cpu)
+{
+	/* on bigsmp, logical apicid is the same as physical */
+	return early_per_cpu(x86_cpu_to_apicid, cpu);
+}
+
 static inline unsigned long calculate_ldr(int cpu)
 {
 	unsigned long val, id;
@@ -93,14 +99,6 @@ static int bigsmp_cpu_present_to_apicid(int mps_cpu)
 	return BAD_APICID;
 }
 
-/* Mapping from cpu number to logical apicid */
-static inline int bigsmp_cpu_to_logical_apicid(int cpu)
-{
-	if (cpu >= nr_cpu_ids)
-		return BAD_APICID;
-	return cpu_physical_id(cpu);
-}
-
 static void bigsmp_ioapic_phys_id_map(physid_mask_t *phys_map, physid_mask_t *retmap)
 {
 	/* For clustered we don't have a good way to do this yet - hack */
@@ -115,7 +113,11 @@ static int bigsmp_check_phys_apicid_present(int phys_apicid)
 /* As we are using single CPU as destination, pick only one CPU here */
 static unsigned int bigsmp_cpu_mask_to_apicid(const struct cpumask *cpumask)
 {
-	return bigsmp_cpu_to_logical_apicid(cpumask_first(cpumask));
+	int cpu = cpumask_first(cpumask);
+
+	if (cpu < nr_cpu_ids)
+		return cpu_physical_id(cpu);
+	return BAD_APICID;
 }
 
 static unsigned int bigsmp_cpu_mask_to_apicid_and(const struct cpumask *cpumask,
@@ -129,9 +131,9 @@ static unsigned int bigsmp_cpu_mask_to_apicid_and(const struct cpumask *cpumask,
 	 */
 	for_each_cpu_and(cpu, cpumask, andmask) {
 		if (cpumask_test_cpu(cpu, cpu_online_mask))
-			break;
+			return cpu_physical_id(cpu);
 	}
-	return bigsmp_cpu_to_logical_apicid(cpu);
+	return BAD_APICID;
 }
 
 static int bigsmp_phys_pkg_id(int cpuid_apic, int index_msb)
@@ -214,13 +216,13 @@ struct apic apic_bigsmp = {
 	.check_apicid_present		= bigsmp_check_apicid_present,
 
 	.vector_allocation_domain	= bigsmp_vector_allocation_domain,
+	.cpu_to_logical_apicid		= bigsmp_cpu_to_logical_apicid,
 	.init_apic_ldr			= bigsmp_init_apic_ldr,
 
 	.ioapic_phys_id_map		= bigsmp_ioapic_phys_id_map,
 	.setup_apic_routing		= bigsmp_setup_apic_routing,
 	.multi_timer_check		= NULL,
 	.apicid_to_node			= bigsmp_apicid_to_node,
-	.cpu_to_logical_apicid		= bigsmp_cpu_to_logical_apicid,
 	.cpu_present_to_apicid		= bigsmp_cpu_present_to_apicid,
 	.apicid_to_cpu_present		= physid_set_mask_of_physid,
 	.setup_portio_remap		= NULL,
diff --git a/arch/x86/kernel/apic/es7000_32.c b/arch/x86/kernel/apic/es7000_32.c
index 8593582..b964d2a 100644
--- a/arch/x86/kernel/apic/es7000_32.c
+++ b/arch/x86/kernel/apic/es7000_32.c
@@ -460,11 +460,14 @@ static unsigned long es7000_check_apicid_present(int bit)
 	return physid_isset(bit, phys_cpu_present_map);
 }
 
-static unsigned long calculate_ldr(int cpu)
+static int es7000_cpu_to_logical_apicid(int cpu)
 {
-	unsigned long id = per_cpu(x86_bios_cpu_apicid, cpu);
+	return early_per_cpu(x86_bios_cpu_apicid, cpu);
+}
 
-	return SET_APIC_LOGICAL_ID(id);
+static unsigned long calculate_ldr(int cpu)
+{
+	return SET_APIC_LOGICAL_ID(per_cpu(x86_cpu_to_logical_apicid, cpu));
 }
 
 /*
@@ -528,18 +531,6 @@ static void es7000_apicid_to_cpu_present(int phys_apicid, physid_mask_t *retmap)
 	++cpu_id;
 }
 
-/* Mapping from cpu number to logical apicid */
-static int es7000_cpu_to_logical_apicid(int cpu)
-{
-#ifdef CONFIG_SMP
-	if (cpu >= nr_cpu_ids)
-		return BAD_APICID;
-	return cpu_2_logical_apicid[cpu];
-#else
-	return logical_smp_processor_id();
-#endif
-}
-
 static void es7000_ioapic_phys_id_map(physid_mask_t *phys_map, physid_mask_t *retmap)
 {
 	/* For clustered we don't have a good way to do this yet - hack */
@@ -561,7 +552,7 @@ static unsigned int es7000_cpu_mask_to_apicid(const struct cpumask *cpumask)
 	 * The cpus in the mask must all be on the apic cluster.
 	 */
 	for_each_cpu(cpu, cpumask) {
-		int new_apicid = es7000_cpu_to_logical_apicid(cpu);
+		int new_apicid = per_cpu(x86_cpu_to_logical_apicid, cpu);
 
 		if (round && APIC_CLUSTER(apicid) != APIC_CLUSTER(new_apicid)) {
 			WARN(1, "Not a valid mask!");
@@ -578,7 +569,7 @@ static unsigned int
 es7000_cpu_mask_to_apicid_and(const struct cpumask *inmask,
 			      const struct cpumask *andmask)
 {
-	int apicid = es7000_cpu_to_logical_apicid(0);
+	int apicid = per_cpu(x86_cpu_to_logical_apicid, 0);
 	cpumask_var_t cpumask;
 
 	if (!alloc_cpumask_var(&cpumask, GFP_ATOMIC))
@@ -650,13 +641,13 @@ struct apic __refdata apic_es7000_cluster = {
 	.check_apicid_present		= es7000_check_apicid_present,
 
 	.vector_allocation_domain	= es7000_vector_allocation_domain,
+	.cpu_to_logical_apicid		= es7000_cpu_to_logical_apicid,
 	.init_apic_ldr			= es7000_init_apic_ldr_cluster,
 
 	.ioapic_phys_id_map		= es7000_ioapic_phys_id_map,
 	.setup_apic_routing		= es7000_setup_apic_routing,
 	.multi_timer_check		= NULL,
 	.apicid_to_node			= es7000_apicid_to_node,
-	.cpu_to_logical_apicid		= es7000_cpu_to_logical_apicid,
 	.cpu_present_to_apicid		= es7000_cpu_present_to_apicid,
 	.apicid_to_cpu_present		= es7000_apicid_to_cpu_present,
 	.setup_portio_remap		= NULL,
diff --git a/arch/x86/kernel/apic/ipi.c b/arch/x86/kernel/apic/ipi.c
index 5037736..cce91bf 100644
--- a/arch/x86/kernel/apic/ipi.c
+++ b/arch/x86/kernel/apic/ipi.c
@@ -73,8 +73,8 @@ void default_send_IPI_mask_sequence_logical(const struct cpumask *mask,
 	local_irq_save(flags);
 	for_each_cpu(query_cpu, mask)
 		__default_send_IPI_dest_field(
-			apic->cpu_to_logical_apicid(query_cpu), vector,
-			apic->dest_logical);
+			early_per_cpu(x86_cpu_to_logical_apicid, query_cpu),
+			vector, apic->dest_logical);
 	local_irq_restore(flags);
 }
 
@@ -92,8 +92,8 @@ void default_send_IPI_mask_allbutself_logical(const struct cpumask *mask,
 		if (query_cpu == this_cpu)
 			continue;
 		__default_send_IPI_dest_field(
-			apic->cpu_to_logical_apicid(query_cpu), vector,
-			apic->dest_logical);
+			early_per_cpu(x86_cpu_to_logical_apicid, query_cpu),
+			vector, apic->dest_logical);
 		}
 	local_irq_restore(flags);
 }
diff --git a/arch/x86/kernel/apic/numaq_32.c b/arch/x86/kernel/apic/numaq_32.c
index 960f26a..cd9683a 100644
--- a/arch/x86/kernel/apic/numaq_32.c
+++ b/arch/x86/kernel/apic/numaq_32.c
@@ -346,6 +346,12 @@ static inline int numaq_apic_id_registered(void)
 	return 1;
 }
 
+static int numaq_cpu_to_logical_apicid(int cpu)
+{
+	/* NUMA-Q firmware set this up but how do I read this from boot CPU? */
+	return BAD_APICID;
+}
+
 static inline void numaq_init_apic_ldr(void)
 {
 	/* Already done in NUMA-Q firmware */
@@ -373,13 +379,6 @@ static inline void numaq_ioapic_phys_id_map(physid_mask_t *phys_map, physid_mask
 	return physids_promote(0xFUL, retmap);
 }
 
-static inline int numaq_cpu_to_logical_apicid(int cpu)
-{
-	if (cpu >= nr_cpu_ids)
-		return BAD_APICID;
-	return cpu_2_logical_apicid[cpu];
-}
-
 /*
  * Supporting over 60 cpus on NUMA-Q requires a locality-dependent
  * cpu to APIC ID relation to properly interact with the intelligent
@@ -503,13 +502,13 @@ struct apic __refdata apic_numaq = {
 	.check_apicid_present		= numaq_check_apicid_present,
 
 	.vector_allocation_domain	= numaq_vector_allocation_domain,
+	.cpu_to_logical_apicid		= numaq_cpu_to_logical_apicid,
 	.init_apic_ldr			= numaq_init_apic_ldr,
 
 	.ioapic_phys_id_map		= numaq_ioapic_phys_id_map,
 	.setup_apic_routing		= numaq_setup_apic_routing,
 	.multi_timer_check		= numaq_multi_timer_check,
 	.apicid_to_node			= numaq_apicid_to_node,
-	.cpu_to_logical_apicid		= numaq_cpu_to_logical_apicid,
 	.cpu_present_to_apicid		= numaq_cpu_present_to_apicid,
 	.apicid_to_cpu_present		= numaq_apicid_to_cpu_present,
 	.setup_portio_remap		= numaq_setup_portio_remap,
diff --git a/arch/x86/kernel/apic/summit_32.c b/arch/x86/kernel/apic/summit_32.c
index 9b41926..a904d28 100644
--- a/arch/x86/kernel/apic/summit_32.c
+++ b/arch/x86/kernel/apic/summit_32.c
@@ -194,11 +194,11 @@ static unsigned long summit_check_apicid_present(int bit)
 	return 1;
 }
 
-static void summit_init_apic_ldr(void)
+/* Mapping from cpu number to logical apicid */
+static int summit_cpu_to_logical_apicid(int cpu)
 {
-	unsigned long val, id;
 	int count = 0;
-	u8 my_id = (u8)hard_smp_processor_id();
+	u8 my_id = early_per_cpu(x86_cpu_to_apicid, cpu);
 	u8 my_cluster = APIC_CLUSTER(my_id);
 #ifdef CONFIG_SMP
 	u8 lid;
@@ -206,7 +206,7 @@ static void summit_init_apic_ldr(void)
 
 	/* Create logical APIC IDs by counting CPUs already in cluster. */
 	for (count = 0, i = nr_cpu_ids; --i >= 0; ) {
-		lid = cpu_2_logical_apicid[i];
+		lid = early_per_cpu(x86_cpu_to_logical_apicid, i);
 		if (lid != BAD_APICID && APIC_CLUSTER(lid) == my_cluster)
 			++count;
 	}
@@ -214,7 +214,15 @@ static void summit_init_apic_ldr(void)
 	/* We only have a 4 wide bitmap in cluster mode.  If a deranged
 	 * BIOS puts 5 CPUs in one APIC cluster, we're hosed. */
 	BUG_ON(count >= XAPIC_DEST_CPUS_SHIFT);
-	id = my_cluster | (1UL << count);
+	return my_cluster | (1UL << count);
+}
+
+static void summit_init_apic_ldr(void)
+{
+	int cpu = smp_processor_id();
+	unsigned long id = early_per_cpu(x86_cpu_to_logical_apicid, cpu);
+	unsigned long val;
+
 	apic_write(APIC_DFR, SUMMIT_APIC_DFR_VALUE);
 	val = apic_read(APIC_LDR) & ~APIC_LDR_MASK;
 	val |= SET_APIC_LOGICAL_ID(id);
@@ -241,18 +249,6 @@ static int summit_apicid_to_node(int logical_apicid)
 #endif
 }
 
-/* Mapping from cpu number to logical apicid */
-static inline int summit_cpu_to_logical_apicid(int cpu)
-{
-#ifdef CONFIG_SMP
-	if (cpu >= nr_cpu_ids)
-		return BAD_APICID;
-	return cpu_2_logical_apicid[cpu];
-#else
-	return logical_smp_processor_id();
-#endif
-}
-
 static int summit_cpu_present_to_apicid(int mps_cpu)
 {
 	if (mps_cpu < nr_cpu_ids)
@@ -286,7 +282,7 @@ static unsigned int summit_cpu_mask_to_apicid(const struct cpumask *cpumask)
 	 * The cpus in the mask must all be on the apic cluster.
 	 */
 	for_each_cpu(cpu, cpumask) {
-		int new_apicid = summit_cpu_to_logical_apicid(cpu);
+		int new_apicid = early_per_cpu(x86_cpu_to_logical_apicid, cpu);
 
 		if (round && APIC_CLUSTER(apicid) != APIC_CLUSTER(new_apicid)) {
 			printk("%s: Not a valid mask!\n", __func__);
@@ -301,7 +297,7 @@ static unsigned int summit_cpu_mask_to_apicid(const struct cpumask *cpumask)
 static unsigned int summit_cpu_mask_to_apicid_and(const struct cpumask *inmask,
 			      const struct cpumask *andmask)
 {
-	int apicid = summit_cpu_to_logical_apicid(0);
+	int apicid = early_per_cpu(x86_cpu_to_logical_apicid, 0);
 	cpumask_var_t cpumask;
 
 	if (!alloc_cpumask_var(&cpumask, GFP_ATOMIC))
@@ -523,13 +519,13 @@ struct apic apic_summit = {
 	.check_apicid_present		= summit_check_apicid_present,
 
 	.vector_allocation_domain	= summit_vector_allocation_domain,
+	.cpu_to_logical_apicid		= summit_cpu_to_logical_apicid,
 	.init_apic_ldr			= summit_init_apic_ldr,
 
 	.ioapic_phys_id_map		= summit_ioapic_phys_id_map,
 	.setup_apic_routing		= summit_setup_apic_routing,
 	.multi_timer_check		= NULL,
 	.apicid_to_node			= summit_apicid_to_node,
-	.cpu_to_logical_apicid		= summit_cpu_to_logical_apicid,
 	.cpu_present_to_apicid		= summit_cpu_present_to_apicid,
 	.apicid_to_cpu_present		= summit_apicid_to_cpu_present,
 	.setup_portio_remap		= NULL,
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 0b32f17..0768761 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -165,25 +165,21 @@ static void unmap_cpu_to_node(int cpu)
 #endif
 
 #ifdef CONFIG_X86_32
-u8 cpu_2_logical_apicid[NR_CPUS] __read_mostly =
-					{ [0 ... NR_CPUS-1] = BAD_APICID };
-
 static void map_cpu_to_logical_apicid(void)
 {
 	int cpu = smp_processor_id();
-	int apicid = logical_smp_processor_id();
-	int node = apic->apicid_to_node(apicid);
+	int logical_apicid = early_per_cpu(x86_cpu_to_logical_apicid, cpu);
+	int node;
 
+	node = apic->apicid_to_node(logical_apicid);
 	if (!node_online(node))
 		node = first_online_node;
 
-	cpu_2_logical_apicid[cpu] = apicid;
 	map_cpu_to_node(cpu, node);
 }
 
 void numa_remove_cpu(int cpu)
 {
-	cpu_2_logical_apicid[cpu] = BAD_APICID;
 	unmap_cpu_to_node(cpu);
 }
 #else
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 5/9] x86: Replace apic->apicid_to_node() with ->numa_cpu_node()
  2010-11-11 11:02 [PATCHSET RFC] x86: unify x86_32 and 64 NUMA init paths Tejun Heo
                   ` (3 preceding siblings ...)
  2010-11-11 11:02 ` [PATCH 4/9] x86: Initialize 32bit logical apicid mapping early during boot Tejun Heo
@ 2010-11-11 11:02 ` Tejun Heo
  2010-11-11 11:37   ` Pekka Enberg
  2010-11-11 11:02 ` [PATCH 6/9] x86: Unify cpu/apicid <-> NUMA node mapping between 32 and 64bit Tejun Heo
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 31+ messages in thread
From: Tejun Heo @ 2010-11-11 11:02 UTC (permalink / raw)
  To: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai; +Cc: Tejun Heo

apic->apicid_to_node() is 32bit specific apic operation which
determines NUMA node for a CPU.  Depending on the APIC implementation,
it can be easier to determine NUMA node from either physical or
logical apicid.  Currently, ->apicid_to_node() takes @logical_apicid
and calls hard_smp_processor_id() if the physical apicid is needed.

This prevents NUMA mapping from being queried from a different CPU,
which in turn makes it impossible to initialize NUMA mapping before
SMP bringup.

This patch replaces apic->apicid_to_node() with ->numa_cpu_node()
which takes @cpu, from which both logical and physical apicids can
easily be determined.  While at it, drop duplicate implementations
from bigsmp_32 and summit_32, and use the default.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 arch/x86/include/asm/apic.h           |    4 ++--
 arch/x86/kernel/apic/apic.c           |    6 +++---
 arch/x86/kernel/apic/apic_flat_64.c   |    4 ++--
 arch/x86/kernel/apic/apic_noop.c      |    4 ++--
 arch/x86/kernel/apic/bigsmp_32.c      |    7 +------
 arch/x86/kernel/apic/es7000_32.c      |    7 +++----
 arch/x86/kernel/apic/numaq_32.c       |   11 ++++++++++-
 arch/x86/kernel/apic/probe_32.c       |    2 +-
 arch/x86/kernel/apic/summit_32.c      |   11 +----------
 arch/x86/kernel/apic/x2apic_cluster.c |    2 +-
 arch/x86/kernel/apic/x2apic_phys.c    |    2 +-
 arch/x86/kernel/apic/x2apic_uv_x.c    |    2 +-
 arch/x86/kernel/smpboot.c             |    3 +--
 13 files changed, 29 insertions(+), 36 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index a45e657..eddafd9 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -318,7 +318,7 @@ struct apic {
 
 	void (*setup_apic_routing)(void);
 	int (*multi_timer_check)(int apic, int irq);
-	int (*apicid_to_node)(int logical_apicid);
+	int (*numa_cpu_node)(int cpu);
 	int (*cpu_present_to_apicid)(int mps_cpu);
 	void (*apicid_to_cpu_present)(int phys_apicid, physid_mask_t *retmap);
 	void (*setup_portio_remap)(void);
@@ -532,7 +532,7 @@ static inline int default_phys_pkg_id(int cpuid_apic, int index_msb)
 	return cpuid_apic >> index_msb;
 }
 
-extern int default_apicid_to_node(int logical_apicid);
+extern int default_numa_cpu_node(int cpu);
 
 #endif
 
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 481bce1..0676454 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2008,10 +2008,10 @@ void default_init_apic_ldr(void)
 }
 
 #ifdef CONFIG_X86_32
-int default_apicid_to_node(int logical_apicid)
+int default_numa_cpu_node(int cpu)
 {
-#ifdef CONFIG_SMP
-	return apicid_2_node[hard_smp_processor_id()];
+#ifdef CONFIG_NUMA
+	return apicid_2_node[early_per_cpu(x86_cpu_to_apicid, cpu)];
 #else
 	return 0;
 #endif
diff --git a/arch/x86/kernel/apic/apic_flat_64.c b/arch/x86/kernel/apic/apic_flat_64.c
index 09d3b17..a6c2cf4 100644
--- a/arch/x86/kernel/apic/apic_flat_64.c
+++ b/arch/x86/kernel/apic/apic_flat_64.c
@@ -185,7 +185,7 @@ struct apic apic_flat =  {
 	.ioapic_phys_id_map		= NULL,
 	.setup_apic_routing		= NULL,
 	.multi_timer_check		= NULL,
-	.apicid_to_node			= NULL,
+	.numa_cpu_node			= NULL,
 	.cpu_to_logical_apicid		= NULL,
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
 	.apicid_to_cpu_present		= NULL,
@@ -337,7 +337,7 @@ struct apic apic_physflat =  {
 	.ioapic_phys_id_map		= NULL,
 	.setup_apic_routing		= NULL,
 	.multi_timer_check		= NULL,
-	.apicid_to_node			= NULL,
+	.numa_cpu_node			= NULL,
 	.cpu_to_logical_apicid		= NULL,
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
 	.apicid_to_cpu_present		= NULL,
diff --git a/arch/x86/kernel/apic/apic_noop.c b/arch/x86/kernel/apic/apic_noop.c
index e31b9ff..04f96d6 100644
--- a/arch/x86/kernel/apic/apic_noop.c
+++ b/arch/x86/kernel/apic/apic_noop.c
@@ -113,7 +113,7 @@ static void noop_vector_allocation_domain(int cpu, struct cpumask *retmask)
 	cpumask_set_cpu(cpu, retmask);
 }
 
-int noop_apicid_to_node(int logical_apicid)
+int noop_numa_cpu_node(int cpu)
 {
 	/* we're always on node 0 */
 	return 0;
@@ -153,7 +153,7 @@ struct apic apic_noop = {
 	.ioapic_phys_id_map		= default_ioapic_phys_id_map,
 	.setup_apic_routing		= NULL,
 	.multi_timer_check		= NULL,
-	.apicid_to_node			= noop_apicid_to_node,
+	.numa_cpu_node			= noop_numa_cpu_node,
 
 	.cpu_to_logical_apicid		= noop_cpu_to_logical_apicid,
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
diff --git a/arch/x86/kernel/apic/bigsmp_32.c b/arch/x86/kernel/apic/bigsmp_32.c
index df83be7..d473b3b 100644
--- a/arch/x86/kernel/apic/bigsmp_32.c
+++ b/arch/x86/kernel/apic/bigsmp_32.c
@@ -86,11 +86,6 @@ static void bigsmp_setup_apic_routing(void)
 		nr_ioapics);
 }
 
-static int bigsmp_apicid_to_node(int logical_apicid)
-{
-	return apicid_2_node[hard_smp_processor_id()];
-}
-
 static int bigsmp_cpu_present_to_apicid(int mps_cpu)
 {
 	if (mps_cpu < nr_cpu_ids)
@@ -222,7 +217,7 @@ struct apic apic_bigsmp = {
 	.ioapic_phys_id_map		= bigsmp_ioapic_phys_id_map,
 	.setup_apic_routing		= bigsmp_setup_apic_routing,
 	.multi_timer_check		= NULL,
-	.apicid_to_node			= bigsmp_apicid_to_node,
+	.numa_cpu_node			= default_numa_cpu_node,
 	.cpu_present_to_apicid		= bigsmp_cpu_present_to_apicid,
 	.apicid_to_cpu_present		= physid_set_mask_of_physid,
 	.setup_portio_remap		= NULL,
diff --git a/arch/x86/kernel/apic/es7000_32.c b/arch/x86/kernel/apic/es7000_32.c
index b964d2a..1a26175 100644
--- a/arch/x86/kernel/apic/es7000_32.c
+++ b/arch/x86/kernel/apic/es7000_32.c
@@ -507,12 +507,11 @@ static void es7000_setup_apic_routing(void)
 		nr_ioapics, cpumask_bits(es7000_target_cpus())[0]);
 }
 
-static int es7000_apicid_to_node(int logical_apicid)
+static int es7000_numa_cpu_node(int cpu)
 {
 	return 0;
 }
 
-
 static int es7000_cpu_present_to_apicid(int mps_cpu)
 {
 	if (!mps_cpu)
@@ -647,7 +646,7 @@ struct apic __refdata apic_es7000_cluster = {
 	.ioapic_phys_id_map		= es7000_ioapic_phys_id_map,
 	.setup_apic_routing		= es7000_setup_apic_routing,
 	.multi_timer_check		= NULL,
-	.apicid_to_node			= es7000_apicid_to_node,
+	.numa_cpu_node			= es7000_numa_cpu_node,
 	.cpu_present_to_apicid		= es7000_cpu_present_to_apicid,
 	.apicid_to_cpu_present		= es7000_apicid_to_cpu_present,
 	.setup_portio_remap		= NULL,
@@ -711,7 +710,7 @@ struct apic __refdata apic_es7000 = {
 	.ioapic_phys_id_map		= es7000_ioapic_phys_id_map,
 	.setup_apic_routing		= es7000_setup_apic_routing,
 	.multi_timer_check		= NULL,
-	.apicid_to_node			= es7000_apicid_to_node,
+	.numa_cpu_node			= es7000_numa_cpu_node,
 	.cpu_to_logical_apicid		= es7000_cpu_to_logical_apicid,
 	.cpu_present_to_apicid		= es7000_cpu_present_to_apicid,
 	.apicid_to_cpu_present		= es7000_apicid_to_cpu_present,
diff --git a/arch/x86/kernel/apic/numaq_32.c b/arch/x86/kernel/apic/numaq_32.c
index cd9683a..9cdbfb0 100644
--- a/arch/x86/kernel/apic/numaq_32.c
+++ b/arch/x86/kernel/apic/numaq_32.c
@@ -397,6 +397,15 @@ static inline int numaq_apicid_to_node(int logical_apicid)
 	return logical_apicid >> 4;
 }
 
+static int numaq_numa_cpu_node(int cpu)
+{
+	int logical_apicid = early_per_cpu(x86_cpu_to_logical_apicid, cpu);
+
+	if (logical_apicid != BAD_APICID)
+		return numaq_apicid_to_node(logical_apicid);
+	return NUMA_NO_NODE;
+}
+
 static void numaq_apicid_to_cpu_present(int logical_apicid, physid_mask_t *retmap)
 {
 	int node = numaq_apicid_to_node(logical_apicid);
@@ -508,7 +517,7 @@ struct apic __refdata apic_numaq = {
 	.ioapic_phys_id_map		= numaq_ioapic_phys_id_map,
 	.setup_apic_routing		= numaq_setup_apic_routing,
 	.multi_timer_check		= numaq_multi_timer_check,
-	.apicid_to_node			= numaq_apicid_to_node,
+	.numa_cpu_node			= numaq_numa_cpu_node,
 	.cpu_present_to_apicid		= numaq_cpu_present_to_apicid,
 	.apicid_to_cpu_present		= numaq_apicid_to_cpu_present,
 	.setup_portio_remap		= numaq_setup_portio_remap,
diff --git a/arch/x86/kernel/apic/probe_32.c b/arch/x86/kernel/apic/probe_32.c
index 99d2fe0..ae66389 100644
--- a/arch/x86/kernel/apic/probe_32.c
+++ b/arch/x86/kernel/apic/probe_32.c
@@ -130,7 +130,7 @@ struct apic apic_default = {
 	.ioapic_phys_id_map		= default_ioapic_phys_id_map,
 	.setup_apic_routing		= setup_apic_flat_routing,
 	.multi_timer_check		= NULL,
-	.apicid_to_node			= default_apicid_to_node,
+	.numa_cpu_node			= default_numa_cpu_node,
 	.cpu_to_logical_apicid		= default_cpu_to_logical_apicid,
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
 	.apicid_to_cpu_present		= physid_set_mask_of_physid,
diff --git a/arch/x86/kernel/apic/summit_32.c b/arch/x86/kernel/apic/summit_32.c
index a904d28..cc5c229 100644
--- a/arch/x86/kernel/apic/summit_32.c
+++ b/arch/x86/kernel/apic/summit_32.c
@@ -240,15 +240,6 @@ static void summit_setup_apic_routing(void)
 						nr_ioapics);
 }
 
-static int summit_apicid_to_node(int logical_apicid)
-{
-#ifdef CONFIG_SMP
-	return apicid_2_node[hard_smp_processor_id()];
-#else
-	return 0;
-#endif
-}
-
 static int summit_cpu_present_to_apicid(int mps_cpu)
 {
 	if (mps_cpu < nr_cpu_ids)
@@ -525,7 +516,7 @@ struct apic apic_summit = {
 	.ioapic_phys_id_map		= summit_ioapic_phys_id_map,
 	.setup_apic_routing		= summit_setup_apic_routing,
 	.multi_timer_check		= NULL,
-	.apicid_to_node			= summit_apicid_to_node,
+	.numa_cpu_node			= default_numa_cpu_node,
 	.cpu_present_to_apicid		= summit_cpu_present_to_apicid,
 	.apicid_to_cpu_present		= summit_apicid_to_cpu_present,
 	.setup_portio_remap		= NULL,
diff --git a/arch/x86/kernel/apic/x2apic_cluster.c b/arch/x86/kernel/apic/x2apic_cluster.c
index cf69c59..6f6daed 100644
--- a/arch/x86/kernel/apic/x2apic_cluster.c
+++ b/arch/x86/kernel/apic/x2apic_cluster.c
@@ -206,7 +206,7 @@ struct apic apic_x2apic_cluster = {
 	.ioapic_phys_id_map		= NULL,
 	.setup_apic_routing		= NULL,
 	.multi_timer_check		= NULL,
-	.apicid_to_node			= NULL,
+	.numa_cpu_node			= NULL,
 	.cpu_to_logical_apicid		= NULL,
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
 	.apicid_to_cpu_present		= NULL,
diff --git a/arch/x86/kernel/apic/x2apic_phys.c b/arch/x86/kernel/apic/x2apic_phys.c
index 8972f38..0528136 100644
--- a/arch/x86/kernel/apic/x2apic_phys.c
+++ b/arch/x86/kernel/apic/x2apic_phys.c
@@ -195,7 +195,7 @@ struct apic apic_x2apic_phys = {
 	.ioapic_phys_id_map		= NULL,
 	.setup_apic_routing		= NULL,
 	.multi_timer_check		= NULL,
-	.apicid_to_node			= NULL,
+	.numa_cpu_node			= NULL,
 	.cpu_to_logical_apicid		= NULL,
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
 	.apicid_to_cpu_present		= NULL,
diff --git a/arch/x86/kernel/apic/x2apic_uv_x.c b/arch/x86/kernel/apic/x2apic_uv_x.c
index ed4118d..bd7766a 100644
--- a/arch/x86/kernel/apic/x2apic_uv_x.c
+++ b/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -315,7 +315,7 @@ struct apic __refdata apic_x2apic_uv_x = {
 	.ioapic_phys_id_map		= NULL,
 	.setup_apic_routing		= NULL,
 	.multi_timer_check		= NULL,
-	.apicid_to_node			= NULL,
+	.numa_cpu_node			= NULL,
 	.cpu_to_logical_apicid		= NULL,
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
 	.apicid_to_cpu_present		= NULL,
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 0768761..963c44b 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -168,10 +168,9 @@ static void unmap_cpu_to_node(int cpu)
 static void map_cpu_to_logical_apicid(void)
 {
 	int cpu = smp_processor_id();
-	int logical_apicid = early_per_cpu(x86_cpu_to_logical_apicid, cpu);
 	int node;
 
-	node = apic->apicid_to_node(logical_apicid);
+	node = apic->numa_cpu_node(cpu);
 	if (!node_online(node))
 		node = first_online_node;
 
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 6/9] x86: Unify cpu/apicid <-> NUMA node mapping between 32 and 64bit
  2010-11-11 11:02 [PATCHSET RFC] x86: unify x86_32 and 64 NUMA init paths Tejun Heo
                   ` (4 preceding siblings ...)
  2010-11-11 11:02 ` [PATCH 5/9] x86: Replace apic->apicid_to_node() with ->numa_cpu_node() Tejun Heo
@ 2010-11-11 11:02 ` Tejun Heo
  2010-11-11 12:01   ` Pekka Enberg
  2010-11-11 11:02 ` [PATCH 7/9] x86: Unify CPU -> " Tejun Heo
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 31+ messages in thread
From: Tejun Heo @ 2010-11-11 11:02 UTC (permalink / raw)
  To: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai; +Cc: Tejun Heo

The mapping between cpu/apicid and node is done via apicid_to_node[]
on 64bit and apicid_2_node[] + apic->numa_cpu_node() on 32bit.  This
difference makes it difficult to further unify 32 and 64bit NUMA
hanlding.

This patch unifies it by replacing both apicid_to_node[] and
apicid_2_node[] with __apicid_to_node[] array, which is accessed by
two accessors - set_apicid_to_node() and numa_cpu_node().  On 64bit,
numa_cpu_node() always consults __apicid_to_node[] directly while
32bit goes through apic->numa_cpu_node() method to allow apic
implementation to override it.

There are several places where using numa_cpu_node() is awkward and
the override doesn't matter.  In those places, __apicid_to_node[] are
used directly.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 arch/x86/include/asm/mpspec.h  |    1 -
 arch/x86/include/asm/numa.h    |   31 +++++++++++++++++++++++++++++++
 arch/x86/include/asm/numa_32.h |    6 ++++++
 arch/x86/include/asm/numa_64.h |    5 ++---
 arch/x86/kernel/acpi/boot.c    |    3 +--
 arch/x86/kernel/apic/apic.c    |    6 +++++-
 arch/x86/kernel/cpu/amd.c      |   14 +++++++-------
 arch/x86/kernel/cpu/intel.c    |    3 +--
 arch/x86/kernel/smpboot.c      |    6 +-----
 arch/x86/mm/k8topology_64.c    |    2 +-
 arch/x86/mm/numa.c             |    6 +++++-
 arch/x86/mm/numa_32.c          |    6 ++++++
 arch/x86/mm/numa_64.c          |   18 +++++++++---------
 arch/x86/mm/srat_32.c          |    2 +-
 arch/x86/mm/srat_64.c          |   10 +++++-----
 15 files changed, 81 insertions(+), 38 deletions(-)

diff --git a/arch/x86/include/asm/mpspec.h b/arch/x86/include/asm/mpspec.h
index 018ffc1..ae78732 100644
--- a/arch/x86/include/asm/mpspec.h
+++ b/arch/x86/include/asm/mpspec.h
@@ -24,7 +24,6 @@ extern int pic_mode;
 #define MAX_IRQ_SOURCES		256
 
 extern unsigned int def_to_bigsmp;
-extern u8 apicid_2_node[];
 
 #ifdef CONFIG_X86_NUMAQ
 extern int mp_bus_id_to_node[MAX_MP_BUSSES];
diff --git a/arch/x86/include/asm/numa.h b/arch/x86/include/asm/numa.h
index 27da400..e40bf6f 100644
--- a/arch/x86/include/asm/numa.h
+++ b/arch/x86/include/asm/numa.h
@@ -1,5 +1,36 @@
+#ifndef _ASM_X86_NUMA_H
+#define _ASM_X86_NUMA_H
+
+#include <asm/apicdef.h>
+
+#ifdef CONFIG_NUMA
+/*
+ * __apicid_to_node[] stores the raw mapping between physical apicid
+ * and node and is used to initialize cpu_to_node mapping.
+ *
+ * The mapping may be overridden by apic->numa_cpu_node() on 32bit and
+ * thus should be accessed by the accessors - set_apicid_to_node() and
+ * numa_cpu_node().
+ *
+ * If the user knows that it doesn't care about 32bit APIC-specific
+ * overrides, __apicid_to_node[] may be used directly.
+ */
+extern s16 __apicid_to_node[MAX_LOCAL_APIC];
+
+static inline void set_apicid_to_node(int apicid, s16 node)
+{
+	__apicid_to_node[apicid] = node;
+}
+#else	/* CONFIG_NUMA */
+static inline void set_apicid_to_node(int apicid, s16 node)
+{
+}
+#endif	/* CONFIG_NUMA */
+
 #ifdef CONFIG_X86_32
 # include "numa_32.h"
 #else
 # include "numa_64.h"
 #endif
+
+#endif	/* _ASM_X86_NUMA_H */
diff --git a/arch/x86/include/asm/numa_32.h b/arch/x86/include/asm/numa_32.h
index a372290..d30eb6c 100644
--- a/arch/x86/include/asm/numa_32.h
+++ b/arch/x86/include/asm/numa_32.h
@@ -4,6 +4,12 @@
 extern int pxm_to_nid(int pxm);
 extern void numa_remove_cpu(int cpu);
 
+#ifdef CONFIG_NUMA
+extern int __cpuinit numa_cpu_node(int apicid);
+#else	/* CONFIG_NUMA */
+static inline int numa_cpu_node(int cpu)		{ return NUMA_NO_NODE; }
+#endif	/* CONFIG_NUMA */
+
 #ifdef CONFIG_HIGHMEM
 extern void set_highmem_pages_init(void);
 #else
diff --git a/arch/x86/include/asm/numa_64.h b/arch/x86/include/asm/numa_64.h
index 823e070..17171ee 100644
--- a/arch/x86/include/asm/numa_64.h
+++ b/arch/x86/include/asm/numa_64.h
@@ -2,7 +2,6 @@
 #define _ASM_X86_NUMA_64_H
 
 #include <linux/nodemask.h>
-#include <asm/apicdef.h>
 
 struct bootnode {
 	u64 start;
@@ -17,8 +16,6 @@ extern int compute_hash_shift(struct bootnode *nodes, int numblks,
 extern void numa_init_array(void);
 extern int numa_off;
 
-extern s16 apicid_to_node[MAX_LOCAL_APIC];
-
 extern unsigned long numa_free_all_bootmem(void);
 extern void setup_node_bootmem(int nodeid, unsigned long start,
 			       unsigned long end);
@@ -32,6 +29,7 @@ extern void setup_node_bootmem(int nodeid, unsigned long start,
 #define NODE_MIN_SIZE (4*1024*1024)
 
 extern void __init init_cpu_to_node(void);
+extern int __cpuinit numa_cpu_node(int cpu);
 extern void __cpuinit numa_set_node(int cpu, int node);
 extern void __cpuinit numa_clear_node(int cpu);
 extern void __cpuinit numa_add_cpu(int cpu);
@@ -43,6 +41,7 @@ extern void __cpuinit numa_remove_cpu(int cpu);
 #endif /* CONFIG_NUMA_EMU */
 #else
 static inline void init_cpu_to_node(void)		{ }
+static inline int numa_cpu_node(int cpu)		{ return NUMA_NO_NODE; }
 static inline void numa_set_node(int cpu, int node)	{ }
 static inline void numa_clear_node(int cpu)		{ }
 static inline void numa_add_cpu(int cpu, int node)	{ }
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 71232b9..edff4f5 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -583,11 +583,10 @@ static void acpi_map_cpu2node(acpi_handle handle, int cpu, int physid)
 	nid = acpi_get_node(handle);
 	if (nid == -1 || !node_online(nid))
 		return;
+	set_apicid_to_node(physid, nid);
 #ifdef CONFIG_X86_64
-	apicid_to_node[physid] = nid;
 	numa_set_node(cpu, nid);
 #else /* CONFIG_X86_32 */
-	apicid_2_node[physid] = nid;
 	cpu_to_node_map[cpu] = nid;
 #endif
 
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 0676454..3e20f4f 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2011,7 +2011,11 @@ void default_init_apic_ldr(void)
 int default_numa_cpu_node(int cpu)
 {
 #ifdef CONFIG_NUMA
-	return apicid_2_node[early_per_cpu(x86_cpu_to_apicid, cpu)];
+	int apicid = early_per_cpu(x86_cpu_to_apicid, cpu);
+
+	if (apicid != BAD_APICID)
+		return __apicid_to_node[apicid];
+	return NUMA_NO_NODE;
 #else
 	return 0;
 #endif
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 9e093f8..aa3c613 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -239,12 +239,12 @@ static int __cpuinit nearby_node(int apicid)
 	int i, node;
 
 	for (i = apicid - 1; i >= 0; i--) {
-		node = apicid_to_node[i];
+		node = __apicid_to_node[i];
 		if (node != NUMA_NO_NODE && node_online(node))
 			return node;
 	}
 	for (i = apicid + 1; i < MAX_LOCAL_APIC; i++) {
-		node = apicid_to_node[i];
+		node = __apicid_to_node[i];
 		if (node != NUMA_NO_NODE && node_online(node))
 			return node;
 	}
@@ -339,10 +339,10 @@ static void __cpuinit srat_detect_node(struct cpuinfo_x86 *c)
 	int node;
 	unsigned apicid = c->apicid;
 
-	node = per_cpu(cpu_llc_id, cpu);
+	node = numa_cpu_node(cpu);
+	if (node == NUMA_NO_NODE)
+		node = per_cpu(cpu_llc_id, cpu);
 
-	if (apicid_to_node[apicid] != NUMA_NO_NODE)
-		node = apicid_to_node[apicid];
 	if (!node_online(node)) {
 		/* Two possibilities here:
 		   - The CPU is missing memory and no node was created.
@@ -357,8 +357,8 @@ static void __cpuinit srat_detect_node(struct cpuinfo_x86 *c)
 		int ht_nodeid = c->initial_apicid;
 
 		if (ht_nodeid >= 0 &&
-		    apicid_to_node[ht_nodeid] != NUMA_NO_NODE)
-			node = apicid_to_node[ht_nodeid];
+		    __apicid_to_node[ht_nodeid] != NUMA_NO_NODE)
+			node = __apicid_to_node[ht_nodeid];
 		/* Pick a nearby node */
 		if (!node_online(node))
 			node = nearby_node(apicid);
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index d16c2c5..6052004 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -279,11 +279,10 @@ static void __cpuinit srat_detect_node(struct cpuinfo_x86 *c)
 #if defined(CONFIG_NUMA) && defined(CONFIG_X86_64)
 	unsigned node;
 	int cpu = smp_processor_id();
-	int apicid = cpu_has_apic ? hard_smp_processor_id() : c->apicid;
 
 	/* Don't do the funky fallback heuristics the AMD version employs
 	   for now. */
-	node = apicid_to_node[apicid];
+	node = numa_cpu_node(cpu);
 	if (node == NUMA_NO_NODE || !node_online(node)) {
 		/* reuse the value from init_cpu_to_node() */
 		node = cpu_to_node(cpu);
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 963c44b..4b8b72d 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -71,10 +71,6 @@
 #include <asm/smpboot_hooks.h>
 #include <asm/i8259.h>
 
-#ifdef CONFIG_X86_32
-u8 apicid_2_node[MAX_LOCAL_APIC];
-#endif
-
 /* State of each CPU */
 DEFINE_PER_CPU(int, cpu_state) = { 0 };
 
@@ -170,7 +166,7 @@ static void map_cpu_to_logical_apicid(void)
 	int cpu = smp_processor_id();
 	int node;
 
-	node = apic->numa_cpu_node(cpu);
+	node = numa_cpu_node(cpu);
 	if (!node_online(node))
 		node = first_online_node;
 
diff --git a/arch/x86/mm/k8topology_64.c b/arch/x86/mm/k8topology_64.c
index 804a3b6..484d80c 100644
--- a/arch/x86/mm/k8topology_64.c
+++ b/arch/x86/mm/k8topology_64.c
@@ -228,7 +228,7 @@ int __init k8_scan_nodes(void)
 				nodes[i].start >> PAGE_SHIFT,
 				nodes[i].end >> PAGE_SHIFT);
 		for (j = apicid_base; j < cores + apicid_base; j++)
-			apicid_to_node[(i << bits) + j] = i;
+			set_apicid_to_node((i << bits) + j, i);
 		setup_node_bootmem(i, nodes[i].start, nodes[i].end);
 	}
 
diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 787c52c..63db99c 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -4,8 +4,12 @@
 #include <linux/bootmem.h>
 
 /*
- * Which logical CPUs are on which nodes
+ * apicid, cpu, node mappings
  */
+s16 __apicid_to_node[MAX_LOCAL_APIC] __cpuinitdata = {
+	[0 ... MAX_LOCAL_APIC-1] = NUMA_NO_NODE
+};
+
 cpumask_var_t node_to_cpumask_map[MAX_NUMNODES];
 EXPORT_SYMBOL(node_to_cpumask_map);
 
diff --git a/arch/x86/mm/numa_32.c b/arch/x86/mm/numa_32.c
index 84a3e4c..9f27ae2 100644
--- a/arch/x86/mm/numa_32.c
+++ b/arch/x86/mm/numa_32.c
@@ -110,6 +110,12 @@ void set_pmd_pfn(unsigned long vaddr, unsigned long pfn, pgprot_t flags);
 
 static unsigned long kva_start_pfn;
 static unsigned long kva_pages;
+
+int __cpuinit numa_cpu_node(int cpu)
+{
+	return apic->numa_cpu_node(cpu);
+}
+
 /*
  * FLAT - support for basic PC memory model with discontig enabled, essentially
  *        a single node with all available processors in it with a flat
diff --git a/arch/x86/mm/numa_64.c b/arch/x86/mm/numa_64.c
index 7ffc9b7..47ca1b0 100644
--- a/arch/x86/mm/numa_64.c
+++ b/arch/x86/mm/numa_64.c
@@ -26,10 +26,6 @@ EXPORT_SYMBOL(node_data);
 
 struct memnode memnode;
 
-s16 apicid_to_node[MAX_LOCAL_APIC] __cpuinitdata = {
-	[0 ... MAX_LOCAL_APIC-1] = NUMA_NO_NODE
-};
-
 int numa_off __initdata;
 static unsigned long __initdata nodemap_addr;
 static unsigned long __initdata nodemap_size;
@@ -721,12 +717,8 @@ void __init init_cpu_to_node(void)
 	BUG_ON(cpu_to_apicid == NULL);
 
 	for_each_possible_cpu(cpu) {
-		int node;
-		u16 apicid = cpu_to_apicid[cpu];
+		int node = numa_cpu_node(cpu);
 
-		if (apicid == BAD_APICID)
-			continue;
-		node = apicid_to_node[apicid];
 		if (node == NUMA_NO_NODE)
 			continue;
 		if (!node_online(node))
@@ -736,6 +728,14 @@ void __init init_cpu_to_node(void)
 }
 #endif
 
+int __cpuinit numa_cpu_node(int cpu)
+{
+	int apicid = early_per_cpu(x86_cpu_to_apicid, cpu);
+
+	if (apicid != BAD_APICID)
+		return __apicid_to_node[apicid];
+	return NUMA_NO_NODE;
+}
 
 void __cpuinit numa_set_node(int cpu, int node)
 {
diff --git a/arch/x86/mm/srat_32.c b/arch/x86/mm/srat_32.c
index e55e748..7fcae55 100644
--- a/arch/x86/mm/srat_32.c
+++ b/arch/x86/mm/srat_32.c
@@ -255,7 +255,7 @@ int __init get_memcfg_from_srat(void)
 			 num_memory_chunks);
 
 	for (i = 0; i < MAX_LOCAL_APIC; i++)
-		apicid_2_node[i] = pxm_to_node(apicid_to_pxm[i]);
+		set_apicid_to_node(i, pxm_to_node(apicid_to_pxm[i]));
 
 	for (j = 0; j < num_memory_chunks; j++){
 		struct node_memory_chunk_s * chunk = &node_memory_chunk[j];
diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
index a35cb9d..1af9c6e 100644
--- a/arch/x86/mm/srat_64.c
+++ b/arch/x86/mm/srat_64.c
@@ -79,7 +79,7 @@ static __init void bad_srat(void)
 	printk(KERN_ERR "SRAT: SRAT not used.\n");
 	acpi_numa = -1;
 	for (i = 0; i < MAX_LOCAL_APIC; i++)
-		apicid_to_node[i] = NUMA_NO_NODE;
+		set_apicid_to_node(i, NUMA_NO_NODE);
 	for (i = 0; i < MAX_NUMNODES; i++) {
 		nodes[i].start = nodes[i].end = 0;
 		nodes_add[i].start = nodes_add[i].end = 0;
@@ -134,7 +134,7 @@ acpi_numa_x2apic_affinity_init(struct acpi_srat_x2apic_cpu_affinity *pa)
 	}
 
 	apic_id = pa->apic_id;
-	apicid_to_node[apic_id] = node;
+	set_apicid_to_node(apic_id, node);
 	node_set(node, cpu_nodes_parsed);
 	acpi_numa = 1;
 	printk(KERN_INFO "SRAT: PXM %u -> APIC 0x%04x -> Node %u\n",
@@ -168,7 +168,7 @@ acpi_numa_processor_affinity_init(struct acpi_srat_cpu_affinity *pa)
 		apic_id = (pa->apic_id << 8) | pa->local_sapic_eid;
 	else
 		apic_id = pa->apic_id;
-	apicid_to_node[apic_id] = node;
+	set_apicid_to_node(apic_id, node);
 	node_set(node, cpu_nodes_parsed);
 	acpi_numa = 1;
 	printk(KERN_INFO "SRAT: PXM %u -> APIC 0x%02x -> Node %u\n",
@@ -512,13 +512,13 @@ void __init acpi_fake_nodes(const struct bootnode *fake_nodes, int num_nodes)
 		 * node, it must now point to the fake node ID.
 		 */
 		for (j = 0; j < MAX_LOCAL_APIC; j++)
-			if (apicid_to_node[j] == nid &&
+			if (__apicid_to_node[j] == nid &&
 			    fake_apicid_to_node[j] == NUMA_NO_NODE)
 				fake_apicid_to_node[j] = i;
 	}
 	for (i = 0; i < num_nodes; i++)
 		__acpi_map_pxm_to_node(fake_node_to_pxm_map[i], i);
-	memcpy(apicid_to_node, fake_apicid_to_node, sizeof(apicid_to_node));
+	memcpy(__apicid_to_node, fake_apicid_to_node, sizeof(__apicid_to_node));
 
 	nodes_clear(nodes_parsed);
 	for (i = 0; i < num_nodes; i++)
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 7/9] x86: Unify CPU -> NUMA node mapping between 32 and 64bit
  2010-11-11 11:02 [PATCHSET RFC] x86: unify x86_32 and 64 NUMA init paths Tejun Heo
                   ` (5 preceding siblings ...)
  2010-11-11 11:02 ` [PATCH 6/9] x86: Unify cpu/apicid <-> NUMA node mapping between 32 and 64bit Tejun Heo
@ 2010-11-11 11:02 ` Tejun Heo
  2010-11-11 11:02 ` [PATCH 8/9] x86: Unify node_to_cpumask_map handling " Tejun Heo
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 31+ messages in thread
From: Tejun Heo @ 2010-11-11 11:02 UTC (permalink / raw)
  To: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai; +Cc: Tejun Heo

Unlike 64bit, 32bit has been using its own cpu_to_node_map[] for CPU
-> NUMA node mapping.  Replace it with early_percpu variable
x86_cpu_to_node_map and share the mapping code with 64bit.

* USE_PERCPU_NUMA_NODE_ID is now enabled for 32bit too.

* x86_cpu_to_node_map and numa_set/clear_node() are moved from numa_64
  to numa.  For now, x86_cpu_to_node_map is initialized with 0 instead
  of NUMA_NO_NODE.  This is to avoid introducing unexpected behavior
  change and will be updated once init path is unified.

* srat_detect_node() is now enabled for x86_32 too.  It calls
  numa_set_node() and initializes the mapping making explicit
  cpu_to_node_map[] updates from map/unmap_cpu_to_node() unnecessary.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 arch/x86/Kconfig                |    2 +-
 arch/x86/include/asm/numa.h     |    8 ++++
 arch/x86/include/asm/numa_64.h  |    4 --
 arch/x86/include/asm/topology.h |   17 ---------
 arch/x86/kernel/acpi/boot.c     |    5 ---
 arch/x86/kernel/cpu/amd.c       |    4 +-
 arch/x86/kernel/cpu/intel.c     |    2 +-
 arch/x86/kernel/setup_percpu.c  |    4 +-
 arch/x86/kernel/smpboot.c       |    6 ---
 arch/x86/mm/numa.c              |   72 ++++++++++++++++++++++++++++++++++++++-
 arch/x86/mm/numa_64.c           |   65 -----------------------------------
 11 files changed, 85 insertions(+), 104 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index e832768..d493c44 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1678,7 +1678,7 @@ config HAVE_ARCH_EARLY_PFN_TO_NID
 	depends on NUMA
 
 config USE_PERCPU_NUMA_NODE_ID
-	def_bool X86_64
+	def_bool y
 	depends on NUMA
 
 menu "Power management and ACPI options"
diff --git a/arch/x86/include/asm/numa.h b/arch/x86/include/asm/numa.h
index e40bf6f..97078b8 100644
--- a/arch/x86/include/asm/numa.h
+++ b/arch/x86/include/asm/numa.h
@@ -33,4 +33,12 @@ static inline void set_apicid_to_node(int apicid, s16 node)
 # include "numa_64.h"
 #endif
 
+#ifdef CONFIG_NUMA
+extern void __cpuinit numa_set_node(int cpu, int node);
+extern void __cpuinit numa_clear_node(int cpu);
+#else	/* CONFIG_NUMA */
+static inline void numa_set_node(int cpu, int node)	{ }
+static inline void numa_clear_node(int cpu)		{ }
+#endif	/* CONFIG_NUMA */
+
 #endif	/* _ASM_X86_NUMA_H */
diff --git a/arch/x86/include/asm/numa_64.h b/arch/x86/include/asm/numa_64.h
index 17171ee..96691a0 100644
--- a/arch/x86/include/asm/numa_64.h
+++ b/arch/x86/include/asm/numa_64.h
@@ -30,8 +30,6 @@ extern void setup_node_bootmem(int nodeid, unsigned long start,
 
 extern void __init init_cpu_to_node(void);
 extern int __cpuinit numa_cpu_node(int cpu);
-extern void __cpuinit numa_set_node(int cpu, int node);
-extern void __cpuinit numa_clear_node(int cpu);
 extern void __cpuinit numa_add_cpu(int cpu);
 extern void __cpuinit numa_remove_cpu(int cpu);
 
@@ -42,8 +40,6 @@ extern void __cpuinit numa_remove_cpu(int cpu);
 #else
 static inline void init_cpu_to_node(void)		{ }
 static inline int numa_cpu_node(int cpu)		{ return NUMA_NO_NODE; }
-static inline void numa_set_node(int cpu, int node)	{ }
-static inline void numa_clear_node(int cpu)		{ }
 static inline void numa_add_cpu(int cpu, int node)	{ }
 static inline void numa_remove_cpu(int cpu)		{ }
 #endif
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 21899cc..b101c17 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -47,21 +47,6 @@
 
 #include <asm/mpspec.h>
 
-#ifdef CONFIG_X86_32
-
-/* Mappings between logical cpu number and node number */
-extern int cpu_to_node_map[];
-
-/* Returns the number of the node containing CPU 'cpu' */
-static inline int __cpu_to_node(int cpu)
-{
-	return cpu_to_node_map[cpu];
-}
-#define early_cpu_to_node __cpu_to_node
-#define cpu_to_node __cpu_to_node
-
-#else /* CONFIG_X86_64 */
-
 /* Mappings between logical cpu number and node number */
 DECLARE_EARLY_PER_CPU(int, x86_cpu_to_node_map);
 
@@ -84,8 +69,6 @@ static inline int early_cpu_to_node(int cpu)
 
 #endif /* !CONFIG_DEBUG_PER_CPU_MAPS */
 
-#endif /* CONFIG_X86_64 */
-
 /* Mappings between node number and cpus on that node. */
 extern cpumask_var_t node_to_cpumask_map[MAX_NUMNODES];
 
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index edff4f5..0ce5755 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -584,12 +584,7 @@ static void acpi_map_cpu2node(acpi_handle handle, int cpu, int physid)
 	if (nid == -1 || !node_online(nid))
 		return;
 	set_apicid_to_node(physid, nid);
-#ifdef CONFIG_X86_64
 	numa_set_node(cpu, nid);
-#else /* CONFIG_X86_32 */
-	cpu_to_node_map[cpu] = nid;
-#endif
-
 #endif
 }
 
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index aa3c613..4259322 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -233,7 +233,7 @@ static void __cpuinit init_amd_k7(struct cpuinfo_x86 *c)
 }
 #endif
 
-#if defined(CONFIG_NUMA) && defined(CONFIG_X86_64)
+#ifdef CONFIG_NUMA
 static int __cpuinit nearby_node(int apicid)
 {
 	int i, node;
@@ -334,7 +334,7 @@ EXPORT_SYMBOL_GPL(amd_get_nb_id);
 
 static void __cpuinit srat_detect_node(struct cpuinfo_x86 *c)
 {
-#if defined(CONFIG_NUMA) && defined(CONFIG_X86_64)
+#ifdef CONFIG_NUMA
 	int cpu = smp_processor_id();
 	int node;
 	unsigned apicid = c->apicid;
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 6052004..df86bc8 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -276,7 +276,7 @@ static void __cpuinit intel_workarounds(struct cpuinfo_x86 *c)
 
 static void __cpuinit srat_detect_node(struct cpuinfo_x86 *c)
 {
-#if defined(CONFIG_NUMA) && defined(CONFIG_X86_64)
+#ifdef CONFIG_NUMA
 	unsigned node;
 	int cpu = smp_processor_id();
 
diff --git a/arch/x86/kernel/setup_percpu.c b/arch/x86/kernel/setup_percpu.c
index 002b796..024e7fc 100644
--- a/arch/x86/kernel/setup_percpu.c
+++ b/arch/x86/kernel/setup_percpu.c
@@ -229,6 +229,7 @@ void __init setup_per_cpu_areas(void)
 		per_cpu(irq_stack_ptr, cpu) =
 			per_cpu(irq_stack_union.irq_stack, cpu) +
 			IRQ_STACK_SIZE - 64;
+#endif
 #ifdef CONFIG_NUMA
 		per_cpu(x86_cpu_to_node_map, cpu) =
 			early_per_cpu_map(x86_cpu_to_node_map, cpu);
@@ -242,7 +243,6 @@ void __init setup_per_cpu_areas(void)
 		 */
 		set_cpu_numa_node(cpu, early_cpu_to_node(cpu));
 #endif
-#endif
 		/*
 		 * Up to this point, the boot CPU has been using .init.data
 		 * area.  Reload any changed state for the boot CPU.
@@ -256,7 +256,7 @@ void __init setup_per_cpu_areas(void)
 	early_per_cpu_ptr(x86_cpu_to_apicid) = NULL;
 	early_per_cpu_ptr(x86_bios_cpu_apicid) = NULL;
 #endif
-#if defined(CONFIG_X86_64) && defined(CONFIG_NUMA)
+#ifdef CONFIG_NUMA
 	early_per_cpu_ptr(x86_cpu_to_node_map) = NULL;
 #endif
 
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 4b8b72d..b78ce8c 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -133,16 +133,11 @@ EXPORT_PER_CPU_SYMBOL(cpu_info);
 atomic_t init_deasserted;
 
 #if defined(CONFIG_NUMA) && defined(CONFIG_X86_32)
-/* which node each logical CPU is on */
-int cpu_to_node_map[NR_CPUS] __read_mostly = { [0 ... NR_CPUS-1] = 0 };
-EXPORT_SYMBOL(cpu_to_node_map);
-
 /* set up a mapping between cpu and node. */
 static void map_cpu_to_node(int cpu, int node)
 {
 	printk(KERN_INFO "Mapping cpu %d to node %d\n", cpu, node);
 	cpumask_set_cpu(cpu, node_to_cpumask_map[node]);
-	cpu_to_node_map[cpu] = node;
 }
 
 /* undo a mapping between cpu and node. */
@@ -153,7 +148,6 @@ static void unmap_cpu_to_node(int cpu)
 	printk(KERN_INFO "Unmapping cpu %d from all nodes\n", cpu);
 	for (node = 0; node < MAX_NUMNODES; node++)
 		cpumask_clear_cpu(cpu, node_to_cpumask_map[node]);
-	cpu_to_node_map[cpu] = 0;
 }
 #else /* !(CONFIG_NUMA && CONFIG_X86_32) */
 #define map_cpu_to_node(cpu, node)	({})
diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 63db99c..b23b5fe 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -14,6 +14,44 @@ cpumask_var_t node_to_cpumask_map[MAX_NUMNODES];
 EXPORT_SYMBOL(node_to_cpumask_map);
 
 /*
+ * Map cpu index to node index
+ */
+#ifdef CONFIG_X86_32
+DEFINE_EARLY_PER_CPU(int, x86_cpu_to_node_map, 0);
+#else
+DEFINE_EARLY_PER_CPU(int, x86_cpu_to_node_map, NUMA_NO_NODE);
+#endif
+EXPORT_EARLY_PER_CPU_SYMBOL(x86_cpu_to_node_map);
+
+void __cpuinit numa_set_node(int cpu, int node)
+{
+	int *cpu_to_node_map = early_per_cpu_ptr(x86_cpu_to_node_map);
+
+	/* early setting, no percpu area yet */
+	if (cpu_to_node_map) {
+		cpu_to_node_map[cpu] = node;
+		return;
+	}
+
+#ifdef CONFIG_DEBUG_PER_CPU_MAPS
+	if (cpu >= nr_cpu_ids || !cpu_possible(cpu)) {
+		printk(KERN_ERR "numa_set_node: invalid cpu# (%d)\n", cpu);
+		dump_stack();
+		return;
+	}
+#endif
+	per_cpu(x86_cpu_to_node_map, cpu) = node;
+
+	if (node != NUMA_NO_NODE)
+		set_cpu_numa_node(cpu, node);
+}
+
+void __cpuinit numa_clear_node(int cpu)
+{
+	numa_set_node(cpu, NUMA_NO_NODE);
+}
+
+/*
  * Allocate node_to_cpumask_map based on number of available nodes
  * Requires node_possible_map to be valid.
  *
@@ -40,6 +78,37 @@ void __init setup_node_to_cpumask_map(void)
 }
 
 #ifdef CONFIG_DEBUG_PER_CPU_MAPS
+
+int __cpu_to_node(int cpu)
+{
+	if (early_per_cpu_ptr(x86_cpu_to_node_map)) {
+		printk(KERN_WARNING
+			"cpu_to_node(%d): usage too early!\n", cpu);
+		dump_stack();
+		return early_per_cpu_ptr(x86_cpu_to_node_map)[cpu];
+	}
+	return per_cpu(x86_cpu_to_node_map, cpu);
+}
+EXPORT_SYMBOL(__cpu_to_node);
+
+/*
+ * Same function as cpu_to_node() but used if called before the
+ * per_cpu areas are setup.
+ */
+int early_cpu_to_node(int cpu)
+{
+	if (early_per_cpu_ptr(x86_cpu_to_node_map))
+		return early_per_cpu_ptr(x86_cpu_to_node_map)[cpu];
+
+	if (!cpu_possible(cpu)) {
+		printk(KERN_WARNING
+			"early_cpu_to_node(%d): no per_cpu area!\n", cpu);
+		dump_stack();
+		return NUMA_NO_NODE;
+	}
+	return per_cpu(x86_cpu_to_node_map, cpu);
+}
+
 /*
  * Returns a pointer to the bitmask of CPUs on Node 'node'.
  */
@@ -62,4 +131,5 @@ const struct cpumask *cpumask_of_node(int node)
 	return node_to_cpumask_map[node];
 }
 EXPORT_SYMBOL(cpumask_of_node);
-#endif
+
+#endif	/* CONFIG_DEBUG_PER_CPU_MAPS */
diff --git a/arch/x86/mm/numa_64.c b/arch/x86/mm/numa_64.c
index 47ca1b0..121b26d 100644
--- a/arch/x86/mm/numa_64.c
+++ b/arch/x86/mm/numa_64.c
@@ -31,12 +31,6 @@ static unsigned long __initdata nodemap_addr;
 static unsigned long __initdata nodemap_size;
 
 /*
- * Map cpu index to node index
- */
-DEFINE_EARLY_PER_CPU(int, x86_cpu_to_node_map, NUMA_NO_NODE);
-EXPORT_EARLY_PER_CPU_SYMBOL(x86_cpu_to_node_map);
-
-/*
  * Given a shift value, try to populate memnodemap[]
  * Returns :
  * 1 if OK
@@ -737,34 +731,6 @@ int __cpuinit numa_cpu_node(int cpu)
 	return NUMA_NO_NODE;
 }
 
-void __cpuinit numa_set_node(int cpu, int node)
-{
-	int *cpu_to_node_map = early_per_cpu_ptr(x86_cpu_to_node_map);
-
-	/* early setting, no percpu area yet */
-	if (cpu_to_node_map) {
-		cpu_to_node_map[cpu] = node;
-		return;
-	}
-
-#ifdef CONFIG_DEBUG_PER_CPU_MAPS
-	if (cpu >= nr_cpu_ids || !cpu_possible(cpu)) {
-		printk(KERN_ERR "numa_set_node: invalid cpu# (%d)\n", cpu);
-		dump_stack();
-		return;
-	}
-#endif
-	per_cpu(x86_cpu_to_node_map, cpu) = node;
-
-	if (node != NUMA_NO_NODE)
-		set_cpu_numa_node(cpu, node);
-}
-
-void __cpuinit numa_clear_node(int cpu)
-{
-	numa_set_node(cpu, NUMA_NO_NODE);
-}
-
 #ifndef CONFIG_DEBUG_PER_CPU_MAPS
 
 void __cpuinit numa_add_cpu(int cpu)
@@ -814,37 +780,6 @@ void __cpuinit numa_remove_cpu(int cpu)
 {
 	numa_set_cpumask(cpu, 0);
 }
-
-int __cpu_to_node(int cpu)
-{
-	if (early_per_cpu_ptr(x86_cpu_to_node_map)) {
-		printk(KERN_WARNING
-			"cpu_to_node(%d): usage too early!\n", cpu);
-		dump_stack();
-		return early_per_cpu_ptr(x86_cpu_to_node_map)[cpu];
-	}
-	return per_cpu(x86_cpu_to_node_map, cpu);
-}
-EXPORT_SYMBOL(__cpu_to_node);
-
-/*
- * Same function as cpu_to_node() but used if called before the
- * per_cpu areas are setup.
- */
-int early_cpu_to_node(int cpu)
-{
-	if (early_per_cpu_ptr(x86_cpu_to_node_map))
-		return early_per_cpu_ptr(x86_cpu_to_node_map)[cpu];
-
-	if (!cpu_possible(cpu)) {
-		printk(KERN_WARNING
-			"early_cpu_to_node(%d): no per_cpu area!\n", cpu);
-		dump_stack();
-		return NUMA_NO_NODE;
-	}
-	return per_cpu(x86_cpu_to_node_map, cpu);
-}
-
 /*
  * --------- end of debug versions of the numa functions ---------
  */
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 8/9] x86: Unify node_to_cpumask_map handling between 32 and 64bit
  2010-11-11 11:02 [PATCHSET RFC] x86: unify x86_32 and 64 NUMA init paths Tejun Heo
                   ` (6 preceding siblings ...)
  2010-11-11 11:02 ` [PATCH 7/9] x86: Unify CPU -> " Tejun Heo
@ 2010-11-11 11:02 ` Tejun Heo
  2010-11-11 12:11   ` Pekka Enberg
  2010-11-11 11:02 ` [PATCH 9/9] x86: Unify NUMA initialization " Tejun Heo
  2010-11-12 10:32 ` [PATCH 3.5/9] x86: Use local variable to cache smp_processor_id() in setup_local_APIC() Tejun Heo
  9 siblings, 1 reply; 31+ messages in thread
From: Tejun Heo @ 2010-11-11 11:02 UTC (permalink / raw)
  To: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai; +Cc: Tejun Heo

x86_32 has been managing node_to_cpumask_map explicitly from
map_cpu_to_node() and friends in a rather ugly way.  With previous
changes, it's now possible to simply share the code with 64bit.

* numa_add/remove_cpu() are moved from numa_64 to numa.

* identify_cpu() now calls numa_add_cpu() for 32bit too.  This makes
  the explicit mask management from map_cpu_to_node() unnecessary.

* The whole x86_32 specific map_cpu_to_node() chunk is no longer
  necessary.  Dropped.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 arch/x86/include/asm/numa.h    |   18 +++++++++++++
 arch/x86/include/asm/numa_32.h |    1 -
 arch/x86/include/asm/numa_64.h |    4 ---
 arch/x86/kernel/cpu/common.c   |    2 +-
 arch/x86/kernel/smpboot.c      |   47 ----------------------------------
 arch/x86/mm/numa.c             |   33 ++++++++++++++++++++++++
 arch/x86/mm/numa_64.c          |   55 ----------------------------------------
 7 files changed, 52 insertions(+), 108 deletions(-)

diff --git a/arch/x86/include/asm/numa.h b/arch/x86/include/asm/numa.h
index 97078b8..77f8bcb 100644
--- a/arch/x86/include/asm/numa.h
+++ b/arch/x86/include/asm/numa.h
@@ -1,6 +1,7 @@
 #ifndef _ASM_X86_NUMA_H
 #define _ASM_X86_NUMA_H
 
+#include <asm/topology.h>
 #include <asm/apicdef.h>
 
 #ifdef CONFIG_NUMA
@@ -36,9 +37,26 @@ static inline void set_apicid_to_node(int apicid, s16 node)
 #ifdef CONFIG_NUMA
 extern void __cpuinit numa_set_node(int cpu, int node);
 extern void __cpuinit numa_clear_node(int cpu);
+
+# ifdef CONFIG_DEBUG_PER_CPU_MAPS
+extern void __cpuinit numa_add_cpu(int cpu);
+extern void __cpuinit numa_remove_cpu(int cpu);
+# else	/* CONFIG_DEBUG_PER_CPU_MAPS */
+static inline void __cpuinit numa_add_cpu(int cpu)
+{
+	cpumask_set_cpu(cpu, node_to_cpumask_map[early_cpu_to_node(cpu)]);
+}
+static inline void __cpuinit numa_remove_cpu(int cpu)
+{
+	cpumask_clear_cpu(cpu, node_to_cpumask_map[early_cpu_to_node(cpu)]);
+}
+# endif	/* CONFIG_DEBUG_PER_CPU_MAPS */
+
 #else	/* CONFIG_NUMA */
 static inline void numa_set_node(int cpu, int node)	{ }
 static inline void numa_clear_node(int cpu)		{ }
+static inline void numa_add_cpu(int cpu)		{ }
+static inline void numa_remove_cpu(int cpu)		{ }
 #endif	/* CONFIG_NUMA */
 
 #endif	/* _ASM_X86_NUMA_H */
diff --git a/arch/x86/include/asm/numa_32.h b/arch/x86/include/asm/numa_32.h
index d30eb6c..ba0ea62 100644
--- a/arch/x86/include/asm/numa_32.h
+++ b/arch/x86/include/asm/numa_32.h
@@ -2,7 +2,6 @@
 #define _ASM_X86_NUMA_32_H
 
 extern int pxm_to_nid(int pxm);
-extern void numa_remove_cpu(int cpu);
 
 #ifdef CONFIG_NUMA
 extern int __cpuinit numa_cpu_node(int apicid);
diff --git a/arch/x86/include/asm/numa_64.h b/arch/x86/include/asm/numa_64.h
index 96691a0..6331190 100644
--- a/arch/x86/include/asm/numa_64.h
+++ b/arch/x86/include/asm/numa_64.h
@@ -30,8 +30,6 @@ extern void setup_node_bootmem(int nodeid, unsigned long start,
 
 extern void __init init_cpu_to_node(void);
 extern int __cpuinit numa_cpu_node(int cpu);
-extern void __cpuinit numa_add_cpu(int cpu);
-extern void __cpuinit numa_remove_cpu(int cpu);
 
 #ifdef CONFIG_NUMA_EMU
 #define FAKE_NODE_MIN_SIZE	((u64)64 << 20)
@@ -40,8 +38,6 @@ extern void __cpuinit numa_remove_cpu(int cpu);
 #else
 static inline void init_cpu_to_node(void)		{ }
 static inline int numa_cpu_node(int cpu)		{ return NUMA_NO_NODE; }
-static inline void numa_add_cpu(int cpu, int node)	{ }
-static inline void numa_remove_cpu(int cpu)		{ }
 #endif
 
 #endif /* _ASM_X86_NUMA_64_H */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 4b68bda..0e9a1d7 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -869,7 +869,7 @@ static void __cpuinit identify_cpu(struct cpuinfo_x86 *c)
 
 	select_idle_routine(c);
 
-#if defined(CONFIG_NUMA) && defined(CONFIG_X86_64)
+#ifdef CONFIG_NUMA
 	numa_add_cpu(smp_processor_id());
 #endif
 }
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index b78ce8c..b152077 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -132,49 +132,6 @@ EXPORT_PER_CPU_SYMBOL(cpu_info);
 
 atomic_t init_deasserted;
 
-#if defined(CONFIG_NUMA) && defined(CONFIG_X86_32)
-/* set up a mapping between cpu and node. */
-static void map_cpu_to_node(int cpu, int node)
-{
-	printk(KERN_INFO "Mapping cpu %d to node %d\n", cpu, node);
-	cpumask_set_cpu(cpu, node_to_cpumask_map[node]);
-}
-
-/* undo a mapping between cpu and node. */
-static void unmap_cpu_to_node(int cpu)
-{
-	int node;
-
-	printk(KERN_INFO "Unmapping cpu %d from all nodes\n", cpu);
-	for (node = 0; node < MAX_NUMNODES; node++)
-		cpumask_clear_cpu(cpu, node_to_cpumask_map[node]);
-}
-#else /* !(CONFIG_NUMA && CONFIG_X86_32) */
-#define map_cpu_to_node(cpu, node)	({})
-#define unmap_cpu_to_node(cpu)	({})
-#endif
-
-#ifdef CONFIG_X86_32
-static void map_cpu_to_logical_apicid(void)
-{
-	int cpu = smp_processor_id();
-	int node;
-
-	node = numa_cpu_node(cpu);
-	if (!node_online(node))
-		node = first_online_node;
-
-	map_cpu_to_node(cpu, node);
-}
-
-void numa_remove_cpu(int cpu)
-{
-	unmap_cpu_to_node(cpu);
-}
-#else
-#define map_cpu_to_logical_apicid()  do {} while (0)
-#endif
-
 /*
  * Report back to the Boot Processor.
  * Running on AP.
@@ -242,7 +199,6 @@ static void __cpuinit smp_callin(void)
 		apic->smp_callin_clear_local_apic();
 	setup_local_APIC();
 	end_local_APIC_setup();
-	map_cpu_to_logical_apicid();
 
 	/*
 	 * Need to setup vector mappings before we enable interrupts.
@@ -946,7 +902,6 @@ static __init void disable_smp(void)
 		physid_set_mask_of_physid(boot_cpu_physical_apicid, &phys_cpu_present_map);
 	else
 		physid_set_mask_of_physid(0, &phys_cpu_present_map);
-	map_cpu_to_logical_apicid();
 	cpumask_set_cpu(0, cpu_sibling_mask(0));
 	cpumask_set_cpu(0, cpu_core_mask(0));
 }
@@ -1125,8 +1080,6 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
 
 	end_local_APIC_setup();
 
-	map_cpu_to_logical_apicid();
-
 	if (apic->setup_portio_remap)
 		apic->setup_portio_remap();
 
diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index b23b5fe..c52c267 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -109,6 +109,39 @@ int early_cpu_to_node(int cpu)
 	return per_cpu(x86_cpu_to_node_map, cpu);
 }
 
+static void __cpuinit numa_set_cpumask(int cpu, int enable)
+{
+	int node = early_cpu_to_node(cpu);
+	struct cpumask *mask;
+	char buf[64];
+
+	mask = node_to_cpumask_map[node];
+	if (mask == NULL) {
+		printk(KERN_ERR "node_to_cpumask_map[%i] NULL\n", node);
+		dump_stack();
+		return;
+	}
+
+	if (enable)
+		cpumask_set_cpu(cpu, mask);
+	else
+		cpumask_clear_cpu(cpu, mask);
+
+	cpulist_scnprintf(buf, sizeof(buf), mask);
+	printk(KERN_DEBUG "%s cpu %d node %d: mask now %s\n",
+		enable ? "numa_add_cpu" : "numa_remove_cpu", cpu, node, buf);
+}
+
+void __cpuinit numa_add_cpu(int cpu)
+{
+	numa_set_cpumask(cpu, 1);
+}
+
+void __cpuinit numa_remove_cpu(int cpu)
+{
+	numa_set_cpumask(cpu, 0);
+}
+
 /*
  * Returns a pointer to the bitmask of CPUs on Node 'node'.
  */
diff --git a/arch/x86/mm/numa_64.c b/arch/x86/mm/numa_64.c
index 121b26d..c8c2e29 100644
--- a/arch/x86/mm/numa_64.c
+++ b/arch/x86/mm/numa_64.c
@@ -730,58 +730,3 @@ int __cpuinit numa_cpu_node(int cpu)
 		return __apicid_to_node[apicid];
 	return NUMA_NO_NODE;
 }
-
-#ifndef CONFIG_DEBUG_PER_CPU_MAPS
-
-void __cpuinit numa_add_cpu(int cpu)
-{
-	cpumask_set_cpu(cpu, node_to_cpumask_map[early_cpu_to_node(cpu)]);
-}
-
-void __cpuinit numa_remove_cpu(int cpu)
-{
-	cpumask_clear_cpu(cpu, node_to_cpumask_map[early_cpu_to_node(cpu)]);
-}
-
-#else /* CONFIG_DEBUG_PER_CPU_MAPS */
-
-/*
- * --------- debug versions of the numa functions ---------
- */
-static void __cpuinit numa_set_cpumask(int cpu, int enable)
-{
-	int node = early_cpu_to_node(cpu);
-	struct cpumask *mask;
-	char buf[64];
-
-	mask = node_to_cpumask_map[node];
-	if (mask == NULL) {
-		printk(KERN_ERR "node_to_cpumask_map[%i] NULL\n", node);
-		dump_stack();
-		return;
-	}
-
-	if (enable)
-		cpumask_set_cpu(cpu, mask);
-	else
-		cpumask_clear_cpu(cpu, mask);
-
-	cpulist_scnprintf(buf, sizeof(buf), mask);
-	printk(KERN_DEBUG "%s cpu %d node %d: mask now %s\n",
-		enable ? "numa_add_cpu" : "numa_remove_cpu", cpu, node, buf);
-}
-
-void __cpuinit numa_add_cpu(int cpu)
-{
-	numa_set_cpumask(cpu, 1);
-}
-
-void __cpuinit numa_remove_cpu(int cpu)
-{
-	numa_set_cpumask(cpu, 0);
-}
-/*
- * --------- end of debug versions of the numa functions ---------
- */
-
-#endif /* CONFIG_DEBUG_PER_CPU_MAPS */
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 9/9] x86: Unify NUMA initialization between 32 and 64bit
  2010-11-11 11:02 [PATCHSET RFC] x86: unify x86_32 and 64 NUMA init paths Tejun Heo
                   ` (7 preceding siblings ...)
  2010-11-11 11:02 ` [PATCH 8/9] x86: Unify node_to_cpumask_map handling " Tejun Heo
@ 2010-11-11 11:02 ` Tejun Heo
  2010-11-11 12:09   ` Pekka Enberg
  2010-11-12 10:32 ` [PATCH 3.5/9] x86: Use local variable to cache smp_processor_id() in setup_local_APIC() Tejun Heo
  9 siblings, 1 reply; 31+ messages in thread
From: Tejun Heo @ 2010-11-11 11:02 UTC (permalink / raw)
  To: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai; +Cc: Tejun Heo

Now that everything else is unified, NUMA initialization can be
unified too.

* numa_init_array() and init_cpu_to_node() are moved from numa_64 to
  numa.

* numa_32::initmem_init() is updated to call numa_init_array() and
  setup_arch() to call init_cpu_to_node() on 32bit too.

* x86_cpu_to_node_map is now initialized to NUMA_NO_NODE on 32bit too.
  This is safe now as numa_init_array() will initialize it early
  during boot.

This makes NUMA mapping fully initialized before setup_per_cpu_areas()
on 32bit too and thus makes the first percpu chunk which contains all
the static variables and some of dynamic area allocated with NUMA
affinity correctly considered.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 arch/x86/include/asm/numa.h    |    4 ++
 arch/x86/include/asm/numa_64.h |    3 --
 arch/x86/kernel/setup.c        |    2 -
 arch/x86/mm/numa.c             |   76 +++++++++++++++++++++++++++++++++++++--
 arch/x86/mm/numa_32.c          |    1 +
 arch/x86/mm/numa_64.c          |   75 ---------------------------------------
 6 files changed, 77 insertions(+), 84 deletions(-)

diff --git a/arch/x86/include/asm/numa.h b/arch/x86/include/asm/numa.h
index 77f8bcb..b131348 100644
--- a/arch/x86/include/asm/numa.h
+++ b/arch/x86/include/asm/numa.h
@@ -37,6 +37,8 @@ static inline void set_apicid_to_node(int apicid, s16 node)
 #ifdef CONFIG_NUMA
 extern void __cpuinit numa_set_node(int cpu, int node);
 extern void __cpuinit numa_clear_node(int cpu);
+extern void __init numa_init_array(void);
+extern void __init init_cpu_to_node(void);
 
 # ifdef CONFIG_DEBUG_PER_CPU_MAPS
 extern void __cpuinit numa_add_cpu(int cpu);
@@ -55,6 +57,8 @@ static inline void __cpuinit numa_remove_cpu(int cpu)
 #else	/* CONFIG_NUMA */
 static inline void numa_set_node(int cpu, int node)	{ }
 static inline void numa_clear_node(int cpu)		{ }
+static inline void numa_init_array(void)		{ }
+static inline void init_cpu_to_node(void)		{ }
 static inline void numa_add_cpu(int cpu)		{ }
 static inline void numa_remove_cpu(int cpu)		{ }
 #endif	/* CONFIG_NUMA */
diff --git a/arch/x86/include/asm/numa_64.h b/arch/x86/include/asm/numa_64.h
index 6331190..db3baa2 100644
--- a/arch/x86/include/asm/numa_64.h
+++ b/arch/x86/include/asm/numa_64.h
@@ -13,7 +13,6 @@ extern int compute_hash_shift(struct bootnode *nodes, int numblks,
 
 #define ZONE_ALIGN (1UL << (MAX_ORDER+PAGE_SHIFT))
 
-extern void numa_init_array(void);
 extern int numa_off;
 
 extern unsigned long numa_free_all_bootmem(void);
@@ -28,7 +27,6 @@ extern void setup_node_bootmem(int nodeid, unsigned long start,
  */
 #define NODE_MIN_SIZE (4*1024*1024)
 
-extern void __init init_cpu_to_node(void);
 extern int __cpuinit numa_cpu_node(int cpu);
 
 #ifdef CONFIG_NUMA_EMU
@@ -36,7 +34,6 @@ extern int __cpuinit numa_cpu_node(int cpu);
 #define FAKE_NODE_MIN_HASH_MASK	(~(FAKE_NODE_MIN_SIZE - 1UL))
 #endif /* CONFIG_NUMA_EMU */
 #else
-static inline void init_cpu_to_node(void)		{ }
 static inline int numa_cpu_node(int cpu)		{ return NUMA_NO_NODE; }
 #endif
 
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 21c6746..fca6d54 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1030,9 +1030,7 @@ void __init setup_arch(char **cmdline_p)
 
 	prefill_possible_map();
 
-#ifdef CONFIG_X86_64
 	init_cpu_to_node();
-#endif
 
 	init_apic_mappings();
 	ioapic_init_mappings();
diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index c52c267..dd93a41 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -16,11 +16,7 @@ EXPORT_SYMBOL(node_to_cpumask_map);
 /*
  * Map cpu index to node index
  */
-#ifdef CONFIG_X86_32
-DEFINE_EARLY_PER_CPU(int, x86_cpu_to_node_map, 0);
-#else
 DEFINE_EARLY_PER_CPU(int, x86_cpu_to_node_map, NUMA_NO_NODE);
-#endif
 EXPORT_EARLY_PER_CPU_SYMBOL(x86_cpu_to_node_map);
 
 void __cpuinit numa_set_node(int cpu, int node)
@@ -77,6 +73,78 @@ void __init setup_node_to_cpumask_map(void)
 	pr_debug("Node to cpumask map for %d nodes\n", nr_node_ids);
 }
 
+/*
+ * There are unfortunately some poorly designed mainboards around that
+ * only connect memory to a single CPU. This breaks the 1:1 cpu->node
+ * mapping. To avoid this fill in the mapping for all possible CPUs,
+ * as the number of CPUs is not known yet. We round robin the existing
+ * nodes.
+ */
+void __init numa_init_array(void)
+{
+	int rr, i;
+
+	rr = first_node(node_online_map);
+	for (i = 0; i < nr_cpu_ids; i++) {
+		if (early_cpu_to_node(i) != NUMA_NO_NODE)
+			continue;
+		numa_set_node(i, rr);
+		rr = next_node(rr, node_online_map);
+		if (rr == MAX_NUMNODES)
+			rr = first_node(node_online_map);
+	}
+}
+
+static __init int find_near_online_node(int node)
+{
+	int n, val;
+	int min_val = INT_MAX;
+	int best_node = -1;
+
+	for_each_online_node(n) {
+		val = node_distance(node, n);
+
+		if (val < min_val) {
+			min_val = val;
+			best_node = n;
+		}
+	}
+
+	return best_node;
+}
+
+/*
+ * Setup early cpu_to_node.
+ *
+ * Populate cpu_to_node[] only if x86_cpu_to_apicid[],
+ * and apicid_to_node[] tables have valid entries for a CPU.
+ * This means we skip cpu_to_node[] initialisation for NUMA
+ * emulation and faking node case (when running a kernel compiled
+ * for NUMA on a non NUMA box), which is OK as cpu_to_node[]
+ * is already initialized in a round robin manner at numa_init_array,
+ * prior to this call, and this initialization is good enough
+ * for the fake NUMA cases.
+ *
+ * Called before the per_cpu areas are setup.
+ */
+void __init init_cpu_to_node(void)
+{
+	int cpu;
+	u16 *cpu_to_apicid = early_per_cpu_ptr(x86_cpu_to_apicid);
+
+	BUG_ON(cpu_to_apicid == NULL);
+
+	for_each_possible_cpu(cpu) {
+		int node = numa_cpu_node(cpu);
+
+		if (node == NUMA_NO_NODE)
+			continue;
+		if (!node_online(node))
+			node = find_near_online_node(node);
+		numa_set_node(cpu, node);
+	}
+}
+
 #ifdef CONFIG_DEBUG_PER_CPU_MAPS
 
 int __cpu_to_node(int cpu)
diff --git a/arch/x86/mm/numa_32.c b/arch/x86/mm/numa_32.c
index 9f27ae2..0bad0f8 100644
--- a/arch/x86/mm/numa_32.c
+++ b/arch/x86/mm/numa_32.c
@@ -367,6 +367,7 @@ void __init initmem_init(unsigned long start_pfn, unsigned long end_pfn,
 	 */
 
 	get_memcfg_numa();
+	numa_init_array();
 
 	kva_pages = roundup(calculate_numa_remap_pages(), PTRS_PER_PTE);
 
diff --git a/arch/x86/mm/numa_64.c b/arch/x86/mm/numa_64.c
index c8c2e29..e7b1ab5 100644
--- a/arch/x86/mm/numa_64.c
+++ b/arch/x86/mm/numa_64.c
@@ -225,28 +225,6 @@ setup_node_bootmem(int nodeid, unsigned long start, unsigned long end)
 	node_set_online(nodeid);
 }
 
-/*
- * There are unfortunately some poorly designed mainboards around that
- * only connect memory to a single CPU. This breaks the 1:1 cpu->node
- * mapping. To avoid this fill in the mapping for all possible CPUs,
- * as the number of CPUs is not known yet. We round robin the existing
- * nodes.
- */
-void __init numa_init_array(void)
-{
-	int rr, i;
-
-	rr = first_node(node_online_map);
-	for (i = 0; i < nr_cpu_ids; i++) {
-		if (early_cpu_to_node(i) != NUMA_NO_NODE)
-			continue;
-		numa_set_node(i, rr);
-		rr = next_node(rr, node_online_map);
-		if (rr == MAX_NUMNODES)
-			rr = first_node(node_online_map);
-	}
-}
-
 #ifdef CONFIG_NUMA_EMU
 /* Numa emulation */
 static struct bootnode nodes[MAX_NUMNODES] __initdata;
@@ -669,59 +647,6 @@ static __init int numa_setup(char *opt)
 }
 early_param("numa", numa_setup);
 
-#ifdef CONFIG_NUMA
-
-static __init int find_near_online_node(int node)
-{
-	int n, val;
-	int min_val = INT_MAX;
-	int best_node = -1;
-
-	for_each_online_node(n) {
-		val = node_distance(node, n);
-
-		if (val < min_val) {
-			min_val = val;
-			best_node = n;
-		}
-	}
-
-	return best_node;
-}
-
-/*
- * Setup early cpu_to_node.
- *
- * Populate cpu_to_node[] only if x86_cpu_to_apicid[],
- * and apicid_to_node[] tables have valid entries for a CPU.
- * This means we skip cpu_to_node[] initialisation for NUMA
- * emulation and faking node case (when running a kernel compiled
- * for NUMA on a non NUMA box), which is OK as cpu_to_node[]
- * is already initialized in a round robin manner at numa_init_array,
- * prior to this call, and this initialization is good enough
- * for the fake NUMA cases.
- *
- * Called before the per_cpu areas are setup.
- */
-void __init init_cpu_to_node(void)
-{
-	int cpu;
-	u16 *cpu_to_apicid = early_per_cpu_ptr(x86_cpu_to_apicid);
-
-	BUG_ON(cpu_to_apicid == NULL);
-
-	for_each_possible_cpu(cpu) {
-		int node = numa_cpu_node(cpu);
-
-		if (node == NUMA_NO_NODE)
-			continue;
-		if (!node_online(node))
-			node = find_near_online_node(node);
-		numa_set_node(cpu, node);
-	}
-}
-#endif
-
 int __cpuinit numa_cpu_node(int cpu)
 {
 	int apicid = early_per_cpu(x86_cpu_to_apicid, cpu);
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/9] x86: Kill unused static boot_cpu_logical_apicid in smpboot.c
  2010-11-11 11:02 ` [PATCH 1/9] x86: Kill unused static boot_cpu_logical_apicid in smpboot.c Tejun Heo
@ 2010-11-11 11:13   ` Pekka Enberg
  0 siblings, 0 replies; 31+ messages in thread
From: Pekka Enberg @ 2010-11-11 11:13 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai

On Thu, Nov 11, 2010 at 1:02 PM, Tejun Heo <tj@kernel.org> wrote:
> Signed-off-by: Tejun Heo <tj@kernel.org>

Reviewed-by: Pekka Enberg <penberg@kernel.org>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/9] x86: Rename x86_32 MAX_APICID to MAX_LOCAL_APIC
  2010-11-11 11:02 ` [PATCH 2/9] x86: Rename x86_32 MAX_APICID to MAX_LOCAL_APIC Tejun Heo
@ 2010-11-11 11:15   ` Pekka Enberg
  0 siblings, 0 replies; 31+ messages in thread
From: Pekka Enberg @ 2010-11-11 11:15 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai

On Thu, Nov 11, 2010 at 1:02 PM, Tejun Heo <tj@kernel.org> wrote:
> Replace x86_32 MAX_APICID in include/asm/mpspec.h with MAX_LOCAL_APIC
> in include/asm/apicdef.h to make it consistent with x86_64.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>

Reviewed-by: Pekka Enberg <penberg@kernel.org>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/9] x86: Replace apic->apicid_to_node() with ->numa_cpu_node()
  2010-11-11 11:02 ` [PATCH 5/9] x86: Replace apic->apicid_to_node() with ->numa_cpu_node() Tejun Heo
@ 2010-11-11 11:37   ` Pekka Enberg
  0 siblings, 0 replies; 31+ messages in thread
From: Pekka Enberg @ 2010-11-11 11:37 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai

On Thu, Nov 11, 2010 at 1:02 PM, Tejun Heo <tj@kernel.org> wrote:
> apic->apicid_to_node() is 32bit specific apic operation which
> determines NUMA node for a CPU.  Depending on the APIC implementation,
> it can be easier to determine NUMA node from either physical or
> logical apicid.  Currently, ->apicid_to_node() takes @logical_apicid
> and calls hard_smp_processor_id() if the physical apicid is needed.
>
> This prevents NUMA mapping from being queried from a different CPU,
> which in turn makes it impossible to initialize NUMA mapping before
> SMP bringup.
>
> This patch replaces apic->apicid_to_node() with ->numa_cpu_node()
> which takes @cpu, from which both logical and physical apicids can
> easily be determined.  While at it, drop duplicate implementations
> from bigsmp_32 and summit_32, and use the default.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>

Acked-by: Pekka Enberg <penberg@kernel.org>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 6/9] x86: Unify cpu/apicid <-> NUMA node mapping between 32 and 64bit
  2010-11-11 11:02 ` [PATCH 6/9] x86: Unify cpu/apicid <-> NUMA node mapping between 32 and 64bit Tejun Heo
@ 2010-11-11 12:01   ` Pekka Enberg
  0 siblings, 0 replies; 31+ messages in thread
From: Pekka Enberg @ 2010-11-11 12:01 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai

On Thu, Nov 11, 2010 at 1:02 PM, Tejun Heo <tj@kernel.org> wrote:
> The mapping between cpu/apicid and node is done via apicid_to_node[]
> on 64bit and apicid_2_node[] + apic->numa_cpu_node() on 32bit.  This
> difference makes it difficult to further unify 32 and 64bit NUMA
> hanlding.
>
> This patch unifies it by replacing both apicid_to_node[] and
> apicid_2_node[] with __apicid_to_node[] array, which is accessed by
> two accessors - set_apicid_to_node() and numa_cpu_node().  On 64bit,
> numa_cpu_node() always consults __apicid_to_node[] directly while
> 32bit goes through apic->numa_cpu_node() method to allow apic
> implementation to override it.
>
> There are several places where using numa_cpu_node() is awkward and
> the override doesn't matter.  In those places, __apicid_to_node[] are
> used directly.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>

Acked-by: Pekka Enberg <penberg@kernel.org>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/9] x86: Make default_send_IPI_mask_sequence/allbutself_logical() 32bit only
  2010-11-11 11:02 ` [PATCH 3/9] x86: Make default_send_IPI_mask_sequence/allbutself_logical() 32bit only Tejun Heo
@ 2010-11-11 12:02   ` Pekka Enberg
  2010-11-11 16:18   ` Cyrill Gorcunov
  2010-11-11 17:34   ` [PATCH UPDATED " Tejun Heo
  2 siblings, 0 replies; 31+ messages in thread
From: Pekka Enberg @ 2010-11-11 12:02 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai

On Thu, Nov 11, 2010 at 1:02 PM, Tejun Heo <tj@kernel.org> wrote:
> Both functions are used only in 32bit.  Put them inside CONFIG_X86_32.
> This is to prepare for logical apicid handling update.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>

Acked-by: Pekka Enberg <penberg@kernel.org>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 9/9] x86: Unify NUMA initialization between 32 and 64bit
  2010-11-11 11:02 ` [PATCH 9/9] x86: Unify NUMA initialization " Tejun Heo
@ 2010-11-11 12:09   ` Pekka Enberg
  0 siblings, 0 replies; 31+ messages in thread
From: Pekka Enberg @ 2010-11-11 12:09 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai

On Thu, Nov 11, 2010 at 1:02 PM, Tejun Heo <tj@kernel.org> wrote:
> Now that everything else is unified, NUMA initialization can be
> unified too.
>
> * numa_init_array() and init_cpu_to_node() are moved from numa_64 to
>  numa.
>
> * numa_32::initmem_init() is updated to call numa_init_array() and
>  setup_arch() to call init_cpu_to_node() on 32bit too.
>
> * x86_cpu_to_node_map is now initialized to NUMA_NO_NODE on 32bit too.
>  This is safe now as numa_init_array() will initialize it early
>  during boot.
>
> This makes NUMA mapping fully initialized before setup_per_cpu_areas()
> on 32bit too and thus makes the first percpu chunk which contains all
> the static variables and some of dynamic area allocated with NUMA
> affinity correctly considered.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Eric Dumazet <eric.dumazet@gmail.com>

Acked-by: Pekka Enberg <penberg@kernel.org>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 8/9] x86: Unify node_to_cpumask_map handling between 32 and 64bit
  2010-11-11 11:02 ` [PATCH 8/9] x86: Unify node_to_cpumask_map handling " Tejun Heo
@ 2010-11-11 12:11   ` Pekka Enberg
  0 siblings, 0 replies; 31+ messages in thread
From: Pekka Enberg @ 2010-11-11 12:11 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai

On Thu, Nov 11, 2010 at 1:02 PM, Tejun Heo <tj@kernel.org> wrote:
> x86_32 has been managing node_to_cpumask_map explicitly from
> map_cpu_to_node() and friends in a rather ugly way.  With previous
> changes, it's now possible to simply share the code with 64bit.
>
> * numa_add/remove_cpu() are moved from numa_64 to numa.
>
> * identify_cpu() now calls numa_add_cpu() for 32bit too.  This makes
>  the explicit mask management from map_cpu_to_node() unnecessary.
>
> * The whole x86_32 specific map_cpu_to_node() chunk is no longer
>  necessary.  Dropped.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>

Acked-by: Pekka Enberg <penberg@kernel.org>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/9] x86: Make default_send_IPI_mask_sequence/allbutself_logical() 32bit only
  2010-11-11 11:02 ` [PATCH 3/9] x86: Make default_send_IPI_mask_sequence/allbutself_logical() 32bit only Tejun Heo
  2010-11-11 12:02   ` Pekka Enberg
@ 2010-11-11 16:18   ` Cyrill Gorcunov
  2010-11-11 17:34   ` [PATCH UPDATED " Tejun Heo
  2 siblings, 0 replies; 31+ messages in thread
From: Cyrill Gorcunov @ 2010-11-11 16:18 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai

On Thu, Nov 11, 2010 at 12:02:37PM +0100, Tejun Heo wrote:
> Both functions are used only in 32bit.  Put them inside CONFIG_X86_32.
> This is to prepare for logical apicid handling update.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> ---
>  arch/x86/kernel/apic/ipi.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/apic/ipi.c b/arch/x86/kernel/apic/ipi.c
> index 08385e0..5037736 100644
> --- a/arch/x86/kernel/apic/ipi.c
> +++ b/arch/x86/kernel/apic/ipi.c
> @@ -56,6 +56,8 @@ void default_send_IPI_mask_allbutself_phys(const struct cpumask *mask,
>  	local_irq_restore(flags);
>  }
>  
> +#ifdef CONFIG_X86_32
> +
>  void default_send_IPI_mask_sequence_logical(const struct cpumask *mask,
>  						 int vector)

It still remains defined in ipi.h as external symbol without CONFIG_X86_32.

 Cyrill

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH UPDATED 3/9] x86: Make default_send_IPI_mask_sequence/allbutself_logical() 32bit only
  2010-11-11 11:02 ` [PATCH 3/9] x86: Make default_send_IPI_mask_sequence/allbutself_logical() 32bit only Tejun Heo
  2010-11-11 12:02   ` Pekka Enberg
  2010-11-11 16:18   ` Cyrill Gorcunov
@ 2010-11-11 17:34   ` Tejun Heo
  2010-11-11 17:52     ` Cyrill Gorcunov
  2 siblings, 1 reply; 31+ messages in thread
From: Tejun Heo @ 2010-11-11 17:34 UTC (permalink / raw)
  To: Tejun Heo, gorcunov
  Cc: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai

Both functions are used only in 32bit.  Put them inside CONFIG_X86_32.
This is to prepare for logical apicid handling update.

- Cyrill Gorcunov spotted that I forgot to move declarations in ipi.h
  under CONFIG_X86_32.  Fixed.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
---
Patch updated as per Cyrill's comment.  git tree has been updated
accordingly w/ Reviewed-by's added.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git x86_32-numa

Thanks.

 arch/x86/include/asm/ipi.h |    8 ++++----
 arch/x86/kernel/apic/ipi.c |    4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/ipi.h b/arch/x86/include/asm/ipi.h
index 0b72282..615fa90 100644
--- a/arch/x86/include/asm/ipi.h
+++ b/arch/x86/include/asm/ipi.h
@@ -123,10 +123,6 @@ extern void default_send_IPI_mask_sequence_phys(const struct cpumask *mask,
 						 int vector);
 extern void default_send_IPI_mask_allbutself_phys(const struct cpumask *mask,
 							 int vector);
-extern void default_send_IPI_mask_sequence_logical(const struct cpumask *mask,
-							 int vector);
-extern void default_send_IPI_mask_allbutself_logical(const struct cpumask *mask,
-							 int vector);

 /* Avoid include hell */
 #define NMI_VECTOR 0x02
@@ -150,6 +146,10 @@ static inline void __default_local_send_IPI_all(int vector)
 }

 #ifdef CONFIG_X86_32
+extern void default_send_IPI_mask_sequence_logical(const struct cpumask *mask,
+							 int vector);
+extern void default_send_IPI_mask_allbutself_logical(const struct cpumask *mask,
+							 int vector);
 extern void default_send_IPI_mask_logical(const struct cpumask *mask,
 						 int vector);
 extern void default_send_IPI_allbutself(int vector);
diff --git a/arch/x86/kernel/apic/ipi.c b/arch/x86/kernel/apic/ipi.c
index 08385e0..5037736 100644
--- a/arch/x86/kernel/apic/ipi.c
+++ b/arch/x86/kernel/apic/ipi.c
@@ -56,6 +56,8 @@ void default_send_IPI_mask_allbutself_phys(const struct cpumask *mask,
 	local_irq_restore(flags);
 }

+#ifdef CONFIG_X86_32
+
 void default_send_IPI_mask_sequence_logical(const struct cpumask *mask,
 						 int vector)
 {
@@ -96,8 +98,6 @@ void default_send_IPI_mask_allbutself_logical(const struct cpumask *mask,
 	local_irq_restore(flags);
 }

-#ifdef CONFIG_X86_32
-
 /*
  * This is only used on smaller machines.
  */
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH UPDATED 3/9] x86: Make default_send_IPI_mask_sequence/allbutself_logical() 32bit only
  2010-11-11 17:34   ` [PATCH UPDATED " Tejun Heo
@ 2010-11-11 17:52     ` Cyrill Gorcunov
  0 siblings, 0 replies; 31+ messages in thread
From: Cyrill Gorcunov @ 2010-11-11 17:52 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai

On Thu, Nov 11, 2010 at 06:34:12PM +0100, Tejun Heo wrote:
> Both functions are used only in 32bit.  Put them inside CONFIG_X86_32.
> This is to prepare for logical apicid handling update.
> 
> - Cyrill Gorcunov spotted that I forgot to move declarations in ipi.h
>   under CONFIG_X86_32.  Fixed.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reviewed-by: Pekka Enberg <penberg@kernel.org>
> Cc: Cyrill Gorcunov <gorcunov@gmail.com>
> ---

yup, thanks Tejun,
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/9] x86: Initialize 32bit logical apicid mapping early during boot
  2010-11-11 11:02 ` [PATCH 4/9] x86: Initialize 32bit logical apicid mapping early during boot Tejun Heo
@ 2010-11-12  1:54   ` Brian Gerst
  2010-11-12  9:38   ` [PATCH UPDATED " Tejun Heo
  2010-11-12 10:33   ` [PATCH 4/9 UPDATED-1] " Tejun Heo
  2 siblings, 0 replies; 31+ messages in thread
From: Brian Gerst @ 2010-11-12  1:54 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai

On Thu, Nov 11, 2010 at 6:02 AM, Tejun Heo <tj@kernel.org> wrote:
> On x86_32, non-standard logical apicid mapping can be used by
> different NUMA setups and the mapping is queried while bringing up
> each CPU using apic->cpu_to_logical_apicid() to build
> cpu_2_logical_apicid[] array.  The logical apicid is then used to
> deliver IPIs and determine NUMA configuration.
>
> Unfortunately, initializing at SMP bring up is too late for percpu
> setup making static percpu variables setup w/o considering NUMA.  This
> also is different from how x86_64 is configured making the code
> difficult to follow and maintain.
>
> This patch updates logical apicid mapping handling such that,
>
> * early_percpu variable x86_cpu_to_logical_apicid replaces
>  cpu_2_logical_apicid[].

You are missing code in setup_percpu.c to copy the values from the
early array to the percpu variable.  This isn't handled automatically.

Also make sure that any direct use of the percpu variable (ie. not via
early_per_cpu()) can happen only after percpu setup.

--
Brian Gerst

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH UPDATED 4/9] x86: Initialize 32bit logical apicid mapping early during boot
  2010-11-11 11:02 ` [PATCH 4/9] x86: Initialize 32bit logical apicid mapping early during boot Tejun Heo
  2010-11-12  1:54   ` Brian Gerst
@ 2010-11-12  9:38   ` Tejun Heo
  2010-11-12  9:58     ` Yinghai Lu
  2010-11-12 10:33   ` [PATCH 4/9 UPDATED-1] " Tejun Heo
  2 siblings, 1 reply; 31+ messages in thread
From: Tejun Heo @ 2010-11-12  9:38 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai

On x86_32, non-standard logical apicid mapping can be used by
different NUMA setups and the mapping is queried while bringing up
each CPU using apic->cpu_to_logical_apicid() to build
cpu_2_logical_apicid[] array.  The logical apicid is then used to
deliver IPIs and determine NUMA configuration.

Unfortunately, initializing at SMP bring up is too late for percpu
setup making static percpu variables setup w/o considering NUMA.  This
also is different from how x86_64 is configured making the code
difficult to follow and maintain.

This patch updates logical apicid mapping handling such that,

* early_percpu variable x86_cpu_to_logical_apicid replaces
  cpu_2_logical_apicid[].

* apic->cpu_to_logical_apicid() is called once during get_smp_config()
  and the output is recorded in x86_cpu_to_logical_apicid.

* apic->cpu_to_logical_apicid() is allowed to return BAD_APICID if it
  can't determine the value that early during boot.  In this case, the
  mapping will be initialized during SMP bring up by reading APIC LDR
  as before.

- Brian Gerst spotted that setup_per_cpu_areas() was not copying the
  early x86_cpu_to_logical_apicid to the permanent percpu area and
  es7000_32 is using per_cpu() instead of early_per_cpu(), which in
  itself is not incorrect as they're never used before setup_per_cpu()
  but still confusing.  Both updated.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
---
Updated as per Brian's comment.  The rest of the patchset apply okay
on top of this change.  The git tree is updated accordingly.

Thanks.

 arch/x86/include/asm/apic.h      |   18 +++++++++++++-----
 arch/x86/include/asm/smp.h       |    3 +++
 arch/x86/kernel/apic/apic.c      |   36 ++++++++++++++++++++++++++++--------
 arch/x86/kernel/apic/bigsmp_32.c |   26 ++++++++++++++------------
 arch/x86/kernel/apic/es7000_32.c |   27 ++++++++++-----------------
 arch/x86/kernel/apic/ipi.c       |    8 ++++----
 arch/x86/kernel/apic/numaq_32.c  |   15 +++++++--------
 arch/x86/kernel/apic/summit_32.c |   36 ++++++++++++++++--------------------
 arch/x86/kernel/setup_percpu.c   |    7 +++++++
 arch/x86/kernel/smpboot.c        |   10 +++-------
 10 files changed, 105 insertions(+), 81 deletions(-)

Index: work/arch/x86/kernel/smpboot.c
===================================================================
--- work.orig/arch/x86/kernel/smpboot.c
+++ work/arch/x86/kernel/smpboot.c
@@ -165,25 +165,21 @@ static void unmap_cpu_to_node(int cpu)
 #endif

 #ifdef CONFIG_X86_32
-u8 cpu_2_logical_apicid[NR_CPUS] __read_mostly =
-					{ [0 ... NR_CPUS-1] = BAD_APICID };
-
 static void map_cpu_to_logical_apicid(void)
 {
 	int cpu = smp_processor_id();
-	int apicid = logical_smp_processor_id();
-	int node = apic->apicid_to_node(apicid);
+	int logical_apicid = early_per_cpu(x86_cpu_to_logical_apicid, cpu);
+	int node;

+	node = apic->apicid_to_node(logical_apicid);
 	if (!node_online(node))
 		node = first_online_node;

-	cpu_2_logical_apicid[cpu] = apicid;
 	map_cpu_to_node(cpu, node);
 }

 void numa_remove_cpu(int cpu)
 {
-	cpu_2_logical_apicid[cpu] = BAD_APICID;
 	unmap_cpu_to_node(cpu);
 }
 #else
Index: work/arch/x86/include/asm/apic.h
===================================================================
--- work.orig/arch/x86/include/asm/apic.h
+++ work/arch/x86/include/asm/apic.h
@@ -299,6 +299,19 @@ struct apic {
 	unsigned long (*check_apicid_present)(int apicid);

 	void (*vector_allocation_domain)(int cpu, struct cpumask *retmask);
+	/*
+	 * x86_32 specific method called very early during boot from
+	 * get_smp_config().  It should return the logical apicid.
+	 * x86_[bios]_cpu_to_apicid is initialized before this
+	 * function is called.
+	 *
+	 * If logical apicid can't be determined that early, the
+	 * function may return BAD_APICID.  Logical apicid will be
+	 * automatically configured after init_apic_ldr() while
+	 * bringing up CPUs.  Note that NUMA affinity won't work
+	 * properly during early boot in this case.
+	 */
+	int (*cpu_to_logical_apicid)(int cpu);
 	void (*init_apic_ldr)(void);

 	void (*ioapic_phys_id_map)(physid_mask_t *phys_map, physid_mask_t *retmap);
@@ -306,7 +319,6 @@ struct apic {
 	void (*setup_apic_routing)(void);
 	int (*multi_timer_check)(int apic, int irq);
 	int (*apicid_to_node)(int logical_apicid);
-	int (*cpu_to_logical_apicid)(int cpu);
 	int (*cpu_present_to_apicid)(int mps_cpu);
 	void (*apicid_to_cpu_present)(int phys_apicid, physid_mask_t *retmap);
 	void (*setup_portio_remap)(void);
@@ -594,8 +606,4 @@ extern int default_check_phys_apicid_pre

 #endif /* CONFIG_X86_LOCAL_APIC */

-#ifdef CONFIG_X86_32
-extern u8 cpu_2_logical_apicid[NR_CPUS];
-#endif
-
 #endif /* _ASM_X86_APIC_H */
Index: work/arch/x86/kernel/apic/apic.c
===================================================================
--- work.orig/arch/x86/kernel/apic/apic.c
+++ work/arch/x86/kernel/apic/apic.c
@@ -80,6 +80,11 @@ EXPORT_EARLY_PER_CPU_SYMBOL(x86_cpu_to_a
 EXPORT_EARLY_PER_CPU_SYMBOL(x86_bios_cpu_apicid);

 #ifdef CONFIG_X86_32
+
+#ifdef CONFIG_SMP
+DEFINE_EARLY_PER_CPU(int, x86_cpu_to_logical_apicid, BAD_APICID);
+#endif
+
 /*
  * Knob to control our willingness to enable the local APIC.
  *
@@ -1202,6 +1207,7 @@ static void __cpuinit lapic_setup_esr(vo
  */
 void __cpuinit setup_local_APIC(void)
 {
+	int cpu = smp_processor_id();
 	unsigned int value, queued;
 	int i, j, acked = 0;
 	unsigned long long tsc = 0, ntsc;
@@ -1241,6 +1247,19 @@ void __cpuinit setup_local_APIC(void)
 	 */
 	apic->init_apic_ldr();

+#ifdef CONFIG_X86_32
+	/*
+	 * APIC LDR is initialized.  If logical_apicid mapping was
+	 * initialized during get_smp_config(), make sure it matches
+	 * the actual value.
+	 */
+	i = early_per_cpu(x86_cpu_to_logical_apicid, cpu);
+	WARN_ON(i != BAD_APICID && i != logical_smp_processor_id());
+	/* always use the value from LDR */
+	early_per_cpu(x86_cpu_to_logical_apicid, cpu) =
+		logical_smp_processor_id();
+#endif
+
 	/*
 	 * Set Task Priority to 'accept all'. We never change this
 	 * later on.
@@ -1343,21 +1362,19 @@ void __cpuinit setup_local_APIC(void)
 	 * TODO: set up through-local-APIC from through-I/O-APIC? --macro
 	 */
 	value = apic_read(APIC_LVT0) & APIC_LVT_MASKED;
-	if (!smp_processor_id() && (pic_mode || !value)) {
+	if (!cpu && (pic_mode || !value)) {
 		value = APIC_DM_EXTINT;
-		apic_printk(APIC_VERBOSE, "enabled ExtINT on CPU#%d\n",
-				smp_processor_id());
+		apic_printk(APIC_VERBOSE, "enabled ExtINT on CPU#%d\n", cpu);
 	} else {
 		value = APIC_DM_EXTINT | APIC_LVT_MASKED;
-		apic_printk(APIC_VERBOSE, "masked ExtINT on CPU#%d\n",
-				smp_processor_id());
+		apic_printk(APIC_VERBOSE, "masked ExtINT on CPU#%d\n", cpu);
 	}
 	apic_write(APIC_LVT0, value);

 	/*
 	 * only the BP should see the LINT1 NMI signal, obviously.
 	 */
-	if (!smp_processor_id())
+	if (!cpu)
 		value = APIC_DM_NMI;
 	else
 		value = APIC_DM_NMI | APIC_LVT_MASKED;
@@ -1369,7 +1386,7 @@ void __cpuinit setup_local_APIC(void)

 #ifdef CONFIG_X86_MCE_INTEL
 	/* Recheck CMCI information after local APIC is up on CPU #0 */
-	if (smp_processor_id() == 0)
+	if (cpu == 0)
 		cmci_recheck();
 #endif
 }
@@ -1967,7 +1984,10 @@ void __cpuinit generic_processor_info(in
 	early_per_cpu(x86_cpu_to_apicid, cpu) = apicid;
 	early_per_cpu(x86_bios_cpu_apicid, cpu) = apicid;
 #endif
-
+#ifdef CONFIG_X86_32
+	early_per_cpu(x86_cpu_to_logical_apicid, cpu) =
+		apic->cpu_to_logical_apicid(cpu);
+#endif
 	set_cpu_possible(cpu, true);
 	set_cpu_present(cpu, true);
 }
Index: work/arch/x86/kernel/apic/bigsmp_32.c
===================================================================
--- work.orig/arch/x86/kernel/apic/bigsmp_32.c
+++ work/arch/x86/kernel/apic/bigsmp_32.c
@@ -45,6 +45,12 @@ static unsigned long bigsmp_check_apicid
 	return 1;
 }

+static int bigsmp_cpu_to_logical_apicid(int cpu)
+{
+	/* on bigsmp, logical apicid is the same as physical */
+	return early_per_cpu(x86_cpu_to_apicid, cpu);
+}
+
 static inline unsigned long calculate_ldr(int cpu)
 {
 	unsigned long val, id;
@@ -93,14 +99,6 @@ static int bigsmp_cpu_present_to_apicid(
 	return BAD_APICID;
 }

-/* Mapping from cpu number to logical apicid */
-static inline int bigsmp_cpu_to_logical_apicid(int cpu)
-{
-	if (cpu >= nr_cpu_ids)
-		return BAD_APICID;
-	return cpu_physical_id(cpu);
-}
-
 static void bigsmp_ioapic_phys_id_map(physid_mask_t *phys_map, physid_mask_t *retmap)
 {
 	/* For clustered we don't have a good way to do this yet - hack */
@@ -115,7 +113,11 @@ static int bigsmp_check_phys_apicid_pres
 /* As we are using single CPU as destination, pick only one CPU here */
 static unsigned int bigsmp_cpu_mask_to_apicid(const struct cpumask *cpumask)
 {
-	return bigsmp_cpu_to_logical_apicid(cpumask_first(cpumask));
+	int cpu = cpumask_first(cpumask);
+
+	if (cpu < nr_cpu_ids)
+		return cpu_physical_id(cpu);
+	return BAD_APICID;
 }

 static unsigned int bigsmp_cpu_mask_to_apicid_and(const struct cpumask *cpumask,
@@ -129,9 +131,9 @@ static unsigned int bigsmp_cpu_mask_to_a
 	 */
 	for_each_cpu_and(cpu, cpumask, andmask) {
 		if (cpumask_test_cpu(cpu, cpu_online_mask))
-			break;
+			return cpu_physical_id(cpu);
 	}
-	return bigsmp_cpu_to_logical_apicid(cpu);
+	return BAD_APICID;
 }

 static int bigsmp_phys_pkg_id(int cpuid_apic, int index_msb)
@@ -214,13 +216,13 @@ struct apic apic_bigsmp = {
 	.check_apicid_present		= bigsmp_check_apicid_present,

 	.vector_allocation_domain	= bigsmp_vector_allocation_domain,
+	.cpu_to_logical_apicid		= bigsmp_cpu_to_logical_apicid,
 	.init_apic_ldr			= bigsmp_init_apic_ldr,

 	.ioapic_phys_id_map		= bigsmp_ioapic_phys_id_map,
 	.setup_apic_routing		= bigsmp_setup_apic_routing,
 	.multi_timer_check		= NULL,
 	.apicid_to_node			= bigsmp_apicid_to_node,
-	.cpu_to_logical_apicid		= bigsmp_cpu_to_logical_apicid,
 	.cpu_present_to_apicid		= bigsmp_cpu_present_to_apicid,
 	.apicid_to_cpu_present		= physid_set_mask_of_physid,
 	.setup_portio_remap		= NULL,
Index: work/arch/x86/kernel/apic/summit_32.c
===================================================================
--- work.orig/arch/x86/kernel/apic/summit_32.c
+++ work/arch/x86/kernel/apic/summit_32.c
@@ -194,11 +194,11 @@ static unsigned long summit_check_apicid
 	return 1;
 }

-static void summit_init_apic_ldr(void)
+/* Mapping from cpu number to logical apicid */
+static int summit_cpu_to_logical_apicid(int cpu)
 {
-	unsigned long val, id;
 	int count = 0;
-	u8 my_id = (u8)hard_smp_processor_id();
+	u8 my_id = early_per_cpu(x86_cpu_to_apicid, cpu);
 	u8 my_cluster = APIC_CLUSTER(my_id);
 #ifdef CONFIG_SMP
 	u8 lid;
@@ -206,7 +206,7 @@ static void summit_init_apic_ldr(void)

 	/* Create logical APIC IDs by counting CPUs already in cluster. */
 	for (count = 0, i = nr_cpu_ids; --i >= 0; ) {
-		lid = cpu_2_logical_apicid[i];
+		lid = early_per_cpu(x86_cpu_to_logical_apicid, i);
 		if (lid != BAD_APICID && APIC_CLUSTER(lid) == my_cluster)
 			++count;
 	}
@@ -214,7 +214,15 @@ static void summit_init_apic_ldr(void)
 	/* We only have a 4 wide bitmap in cluster mode.  If a deranged
 	 * BIOS puts 5 CPUs in one APIC cluster, we're hosed. */
 	BUG_ON(count >= XAPIC_DEST_CPUS_SHIFT);
-	id = my_cluster | (1UL << count);
+	return my_cluster | (1UL << count);
+}
+
+static void summit_init_apic_ldr(void)
+{
+	int cpu = smp_processor_id();
+	unsigned long id = early_per_cpu(x86_cpu_to_logical_apicid, cpu);
+	unsigned long val;
+
 	apic_write(APIC_DFR, SUMMIT_APIC_DFR_VALUE);
 	val = apic_read(APIC_LDR) & ~APIC_LDR_MASK;
 	val |= SET_APIC_LOGICAL_ID(id);
@@ -241,18 +249,6 @@ static int summit_apicid_to_node(int log
 #endif
 }

-/* Mapping from cpu number to logical apicid */
-static inline int summit_cpu_to_logical_apicid(int cpu)
-{
-#ifdef CONFIG_SMP
-	if (cpu >= nr_cpu_ids)
-		return BAD_APICID;
-	return cpu_2_logical_apicid[cpu];
-#else
-	return logical_smp_processor_id();
-#endif
-}
-
 static int summit_cpu_present_to_apicid(int mps_cpu)
 {
 	if (mps_cpu < nr_cpu_ids)
@@ -286,7 +282,7 @@ static unsigned int summit_cpu_mask_to_a
 	 * The cpus in the mask must all be on the apic cluster.
 	 */
 	for_each_cpu(cpu, cpumask) {
-		int new_apicid = summit_cpu_to_logical_apicid(cpu);
+		int new_apicid = early_per_cpu(x86_cpu_to_logical_apicid, cpu);

 		if (round && APIC_CLUSTER(apicid) != APIC_CLUSTER(new_apicid)) {
 			printk("%s: Not a valid mask!\n", __func__);
@@ -301,7 +297,7 @@ static unsigned int summit_cpu_mask_to_a
 static unsigned int summit_cpu_mask_to_apicid_and(const struct cpumask *inmask,
 			      const struct cpumask *andmask)
 {
-	int apicid = summit_cpu_to_logical_apicid(0);
+	int apicid = early_per_cpu(x86_cpu_to_logical_apicid, 0);
 	cpumask_var_t cpumask;

 	if (!alloc_cpumask_var(&cpumask, GFP_ATOMIC))
@@ -523,13 +519,13 @@ struct apic apic_summit = {
 	.check_apicid_present		= summit_check_apicid_present,

 	.vector_allocation_domain	= summit_vector_allocation_domain,
+	.cpu_to_logical_apicid		= summit_cpu_to_logical_apicid,
 	.init_apic_ldr			= summit_init_apic_ldr,

 	.ioapic_phys_id_map		= summit_ioapic_phys_id_map,
 	.setup_apic_routing		= summit_setup_apic_routing,
 	.multi_timer_check		= NULL,
 	.apicid_to_node			= summit_apicid_to_node,
-	.cpu_to_logical_apicid		= summit_cpu_to_logical_apicid,
 	.cpu_present_to_apicid		= summit_cpu_present_to_apicid,
 	.apicid_to_cpu_present		= summit_apicid_to_cpu_present,
 	.setup_portio_remap		= NULL,
Index: work/arch/x86/include/asm/smp.h
===================================================================
--- work.orig/arch/x86/include/asm/smp.h
+++ work/arch/x86/include/asm/smp.h
@@ -38,6 +38,9 @@ static inline struct cpumask *cpu_core_m

 DECLARE_EARLY_PER_CPU(u16, x86_cpu_to_apicid);
 DECLARE_EARLY_PER_CPU(u16, x86_bios_cpu_apicid);
+#if defined(CONFIG_SMP) && defined(CONFIG_X86_32)
+DECLARE_EARLY_PER_CPU(int, x86_cpu_to_logical_apicid);
+#endif

 /* Static state in head.S used to set up a CPU */
 extern struct {
Index: work/arch/x86/kernel/apic/numaq_32.c
===================================================================
--- work.orig/arch/x86/kernel/apic/numaq_32.c
+++ work/arch/x86/kernel/apic/numaq_32.c
@@ -346,6 +346,12 @@ static inline int numaq_apic_id_register
 	return 1;
 }

+static int numaq_cpu_to_logical_apicid(int cpu)
+{
+	/* NUMA-Q firmware set this up but how do I read this from boot CPU? */
+	return BAD_APICID;
+}
+
 static inline void numaq_init_apic_ldr(void)
 {
 	/* Already done in NUMA-Q firmware */
@@ -373,13 +379,6 @@ static inline void numaq_ioapic_phys_id_
 	return physids_promote(0xFUL, retmap);
 }

-static inline int numaq_cpu_to_logical_apicid(int cpu)
-{
-	if (cpu >= nr_cpu_ids)
-		return BAD_APICID;
-	return cpu_2_logical_apicid[cpu];
-}
-
 /*
  * Supporting over 60 cpus on NUMA-Q requires a locality-dependent
  * cpu to APIC ID relation to properly interact with the intelligent
@@ -503,13 +502,13 @@ struct apic __refdata apic_numaq = {
 	.check_apicid_present		= numaq_check_apicid_present,

 	.vector_allocation_domain	= numaq_vector_allocation_domain,
+	.cpu_to_logical_apicid		= numaq_cpu_to_logical_apicid,
 	.init_apic_ldr			= numaq_init_apic_ldr,

 	.ioapic_phys_id_map		= numaq_ioapic_phys_id_map,
 	.setup_apic_routing		= numaq_setup_apic_routing,
 	.multi_timer_check		= numaq_multi_timer_check,
 	.apicid_to_node			= numaq_apicid_to_node,
-	.cpu_to_logical_apicid		= numaq_cpu_to_logical_apicid,
 	.cpu_present_to_apicid		= numaq_cpu_present_to_apicid,
 	.apicid_to_cpu_present		= numaq_apicid_to_cpu_present,
 	.setup_portio_remap		= numaq_setup_portio_remap,
Index: work/arch/x86/kernel/apic/ipi.c
===================================================================
--- work.orig/arch/x86/kernel/apic/ipi.c
+++ work/arch/x86/kernel/apic/ipi.c
@@ -73,8 +73,8 @@ void default_send_IPI_mask_sequence_logi
 	local_irq_save(flags);
 	for_each_cpu(query_cpu, mask)
 		__default_send_IPI_dest_field(
-			apic->cpu_to_logical_apicid(query_cpu), vector,
-			apic->dest_logical);
+			early_per_cpu(x86_cpu_to_logical_apicid, query_cpu),
+			vector, apic->dest_logical);
 	local_irq_restore(flags);
 }

@@ -92,8 +92,8 @@ void default_send_IPI_mask_allbutself_lo
 		if (query_cpu == this_cpu)
 			continue;
 		__default_send_IPI_dest_field(
-			apic->cpu_to_logical_apicid(query_cpu), vector,
-			apic->dest_logical);
+			early_per_cpu(x86_cpu_to_logical_apicid, query_cpu),
+			vector, apic->dest_logical);
 		}
 	local_irq_restore(flags);
 }
Index: work/arch/x86/kernel/apic/es7000_32.c
===================================================================
--- work.orig/arch/x86/kernel/apic/es7000_32.c
+++ work/arch/x86/kernel/apic/es7000_32.c
@@ -460,11 +460,16 @@ static unsigned long es7000_check_apicid
 	return physid_isset(bit, phys_cpu_present_map);
 }

+static int es7000_cpu_to_logical_apicid(int cpu)
+{
+	return early_per_cpu(x86_bios_cpu_apicid, cpu);
+}
+
 static unsigned long calculate_ldr(int cpu)
 {
-	unsigned long id = per_cpu(x86_bios_cpu_apicid, cpu);
+	int logical_apicid = early_per_cpu(x86_cpu_to_logical_apicid, cpu);

-	return SET_APIC_LOGICAL_ID(id);
+	return SET_APIC_LOGICAL_ID(logical_apicid);
 }

 /*
@@ -528,18 +533,6 @@ static void es7000_apicid_to_cpu_present
 	++cpu_id;
 }

-/* Mapping from cpu number to logical apicid */
-static int es7000_cpu_to_logical_apicid(int cpu)
-{
-#ifdef CONFIG_SMP
-	if (cpu >= nr_cpu_ids)
-		return BAD_APICID;
-	return cpu_2_logical_apicid[cpu];
-#else
-	return logical_smp_processor_id();
-#endif
-}
-
 static void es7000_ioapic_phys_id_map(physid_mask_t *phys_map, physid_mask_t *retmap)
 {
 	/* For clustered we don't have a good way to do this yet - hack */
@@ -561,7 +554,7 @@ static unsigned int es7000_cpu_mask_to_a
 	 * The cpus in the mask must all be on the apic cluster.
 	 */
 	for_each_cpu(cpu, cpumask) {
-		int new_apicid = es7000_cpu_to_logical_apicid(cpu);
+		int new_apicid = early_per_cpu(x86_cpu_to_logical_apicid, cpu);

 		if (round && APIC_CLUSTER(apicid) != APIC_CLUSTER(new_apicid)) {
 			WARN(1, "Not a valid mask!");
@@ -578,7 +571,7 @@ static unsigned int
 es7000_cpu_mask_to_apicid_and(const struct cpumask *inmask,
 			      const struct cpumask *andmask)
 {
-	int apicid = es7000_cpu_to_logical_apicid(0);
+	int apicid = early_per_cpu(x86_cpu_to_logical_apicid, 0);
 	cpumask_var_t cpumask;

 	if (!alloc_cpumask_var(&cpumask, GFP_ATOMIC))
@@ -650,13 +643,13 @@ struct apic __refdata apic_es7000_cluste
 	.check_apicid_present		= es7000_check_apicid_present,

 	.vector_allocation_domain	= es7000_vector_allocation_domain,
+	.cpu_to_logical_apicid		= es7000_cpu_to_logical_apicid,
 	.init_apic_ldr			= es7000_init_apic_ldr_cluster,

 	.ioapic_phys_id_map		= es7000_ioapic_phys_id_map,
 	.setup_apic_routing		= es7000_setup_apic_routing,
 	.multi_timer_check		= NULL,
 	.apicid_to_node			= es7000_apicid_to_node,
-	.cpu_to_logical_apicid		= es7000_cpu_to_logical_apicid,
 	.cpu_present_to_apicid		= es7000_cpu_present_to_apicid,
 	.apicid_to_cpu_present		= es7000_apicid_to_cpu_present,
 	.setup_portio_remap		= NULL,
Index: work/arch/x86/kernel/setup_percpu.c
===================================================================
--- work.orig/arch/x86/kernel/setup_percpu.c
+++ work/arch/x86/kernel/setup_percpu.c
@@ -225,6 +225,10 @@ void __init setup_per_cpu_areas(void)
 		per_cpu(x86_bios_cpu_apicid, cpu) =
 			early_per_cpu_map(x86_bios_cpu_apicid, cpu);
 #endif
+#ifdef CONFIG_X86_32
+		per_cpu(x86_cpu_to_logical_apicid, cpu) =
+			early_per_cpu_map(x86_cpu_to_logical_apicid, cpu);
+#endif
 #ifdef CONFIG_X86_64
 		per_cpu(irq_stack_ptr, cpu) =
 			per_cpu(irq_stack_union.irq_stack, cpu) +
@@ -256,6 +260,9 @@ void __init setup_per_cpu_areas(void)
 	early_per_cpu_ptr(x86_cpu_to_apicid) = NULL;
 	early_per_cpu_ptr(x86_bios_cpu_apicid) = NULL;
 #endif
+#ifdef CONFIG_X86_32
+	early_per_cpu_ptr(x86_cpu_to_logical_apicid) = NULL;
+#endif
 #if defined(CONFIG_X86_64) && defined(CONFIG_NUMA)
 	early_per_cpu_ptr(x86_cpu_to_node_map) = NULL;
 #endif

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH UPDATED 4/9] x86: Initialize 32bit logical apicid mapping early during boot
  2010-11-12  9:38   ` [PATCH UPDATED " Tejun Heo
@ 2010-11-12  9:58     ` Yinghai Lu
  2010-11-12 10:22       ` Tejun Heo
  0 siblings, 1 reply; 31+ messages in thread
From: Yinghai Lu @ 2010-11-12  9:58 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet

On 11/12/2010 01:38 AM, Tejun Heo wrote:
> On x86_32, non-standard logical apicid mapping can be used by
> different NUMA setups and the mapping is queried while bringing up
> each CPU using apic->cpu_to_logical_apicid() to build
> cpu_2_logical_apicid[] array.  The logical apicid is then used to
> deliver IPIs and determine NUMA configuration.
> 
> Unfortunately, initializing at SMP bring up is too late for percpu
> setup making static percpu variables setup w/o considering NUMA.  This
> also is different from how x86_64 is configured making the code
> difficult to follow and maintain.
> 
> This patch updates logical apicid mapping handling such that,
> 
> * early_percpu variable x86_cpu_to_logical_apicid replaces
>   cpu_2_logical_apicid[].
> 
> * apic->cpu_to_logical_apicid() is called once during get_smp_config()
>   and the output is recorded in x86_cpu_to_logical_apicid.
> 
> * apic->cpu_to_logical_apicid() is allowed to return BAD_APICID if it
>   can't determine the value that early during boot.  In this case, the
>   mapping will be initialized during SMP bring up by reading APIC LDR
>   as before.
> 
> - Brian Gerst spotted that setup_per_cpu_areas() was not copying the
>   early x86_cpu_to_logical_apicid to the permanent percpu area and
>   es7000_32 is using per_cpu() instead of early_per_cpu(), which in
>   itself is not incorrect as they're never used before setup_per_cpu()
>   but still confusing.  Both updated.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Brian Gerst <brgerst@gmail.com>
> ---
> Updated as per Brian's comment.  The rest of the patchset apply okay
> on top of this change.  The git tree is updated accordingly.
> 
> Thanks.
> 
>  arch/x86/include/asm/apic.h      |   18 +++++++++++++-----
>  arch/x86/include/asm/smp.h       |    3 +++
>  arch/x86/kernel/apic/apic.c      |   36 ++++++++++++++++++++++++++++--------
>  arch/x86/kernel/apic/bigsmp_32.c |   26 ++++++++++++++------------
>  arch/x86/kernel/apic/es7000_32.c |   27 ++++++++++-----------------
>  arch/x86/kernel/apic/ipi.c       |    8 ++++----
>  arch/x86/kernel/apic/numaq_32.c  |   15 +++++++--------
>  arch/x86/kernel/apic/summit_32.c |   36 ++++++++++++++++--------------------
>  arch/x86/kernel/setup_percpu.c   |    7 +++++++
>  arch/x86/kernel/smpboot.c        |   10 +++-------
>  10 files changed, 105 insertions(+), 81 deletions(-)
> 
> Index: work/arch/x86/kernel/apic/apic.c
> ===================================================================
> --- work.orig/arch/x86/kernel/apic/apic.c
> +++ work/arch/x86/kernel/apic/apic.c
> @@ -1202,6 +1207,7 @@ static void __cpuinit lapic_setup_esr(vo
>   */
>  void __cpuinit setup_local_APIC(void)
>  {
> +	int cpu = smp_processor_id();
>  	unsigned int value, queued;
>  	int i, j, acked = 0;
>  	unsigned long long tsc = 0, ntsc;
> @@ -1343,21 +1362,19 @@ void __cpuinit setup_local_APIC(void)
>  	 * TODO: set up through-local-APIC from through-I/O-APIC? --macro
>  	 */
>  	value = apic_read(APIC_LVT0) & APIC_LVT_MASKED;
> -	if (!smp_processor_id() && (pic_mode || !value)) {
> +	if (!cpu && (pic_mode || !value)) {
>  		value = APIC_DM_EXTINT;
> -		apic_printk(APIC_VERBOSE, "enabled ExtINT on CPU#%d\n",
> -				smp_processor_id());
> +		apic_printk(APIC_VERBOSE, "enabled ExtINT on CPU#%d\n", cpu);
>  	} else {
>  		value = APIC_DM_EXTINT | APIC_LVT_MASKED;
> -		apic_printk(APIC_VERBOSE, "masked ExtINT on CPU#%d\n",
> -				smp_processor_id());
> +		apic_printk(APIC_VERBOSE, "masked ExtINT on CPU#%d\n", cpu);
>  	}
>  	apic_write(APIC_LVT0, value);
> 
>  	/*
>  	 * only the BP should see the LINT1 NMI signal, obviously.
>  	 */
> -	if (!smp_processor_id())
> +	if (!cpu)
>  		value = APIC_DM_NMI;
>  	else
>  		value = APIC_DM_NMI | APIC_LVT_MASKED;
> @@ -1369,7 +1386,7 @@ void __cpuinit setup_local_APIC(void)
> 
>  #ifdef CONFIG_X86_MCE_INTEL
>  	/* Recheck CMCI information after local APIC is up on CPU #0 */
> -	if (smp_processor_id() == 0)
> +	if (cpu == 0)
>  		cmci_recheck();
>  #endif
>  }

those replacing smp_processor_id() with cpu should be in another patch.

Thanks

	Yinghai

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH UPDATED 4/9] x86: Initialize 32bit logical apicid mapping early during boot
  2010-11-12  9:58     ` Yinghai Lu
@ 2010-11-12 10:22       ` Tejun Heo
  0 siblings, 0 replies; 31+ messages in thread
From: Tejun Heo @ 2010-11-12 10:22 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet

On 11/12/2010 10:58 AM, Yinghai Lu wrote:
> those replacing smp_processor_id() with cpu should be in another patch.

It's a mechanical cleanup.  I can add 'while at it, ...' to the patch
description.  Oh well, what the hell, I'll split it.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 3.5/9] x86: Use local variable to cache smp_processor_id() in setup_local_APIC()
  2010-11-11 11:02 [PATCHSET RFC] x86: unify x86_32 and 64 NUMA init paths Tejun Heo
                   ` (8 preceding siblings ...)
  2010-11-11 11:02 ` [PATCH 9/9] x86: Unify NUMA initialization " Tejun Heo
@ 2010-11-12 10:32 ` Tejun Heo
  9 siblings, 0 replies; 31+ messages in thread
From: Tejun Heo @ 2010-11-12 10:32 UTC (permalink / raw)
  To: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai

This is a trivial clean up.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Yinghai Lu <yinghai@kernel.org>
---
Separated out from patch 4 as suggested by Yinghai Lu.

Thanks.

 arch/x86/kernel/apic/apic.c |   13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

Index: work/arch/x86/kernel/apic/apic.c
===================================================================
--- work.orig/arch/x86/kernel/apic/apic.c
+++ work/arch/x86/kernel/apic/apic.c
@@ -1202,6 +1202,7 @@ static void __cpuinit lapic_setup_esr(vo
  */
 void __cpuinit setup_local_APIC(void)
 {
+	int cpu = smp_processor_id();
 	unsigned int value, queued;
 	int i, j, acked = 0;
 	unsigned long long tsc = 0, ntsc;
@@ -1343,21 +1344,19 @@ void __cpuinit setup_local_APIC(void)
 	 * TODO: set up through-local-APIC from through-I/O-APIC? --macro
 	 */
 	value = apic_read(APIC_LVT0) & APIC_LVT_MASKED;
-	if (!smp_processor_id() && (pic_mode || !value)) {
+	if (!cpu && (pic_mode || !value)) {
 		value = APIC_DM_EXTINT;
-		apic_printk(APIC_VERBOSE, "enabled ExtINT on CPU#%d\n",
-				smp_processor_id());
+		apic_printk(APIC_VERBOSE, "enabled ExtINT on CPU#%d\n", cpu);
 	} else {
 		value = APIC_DM_EXTINT | APIC_LVT_MASKED;
-		apic_printk(APIC_VERBOSE, "masked ExtINT on CPU#%d\n",
-				smp_processor_id());
+		apic_printk(APIC_VERBOSE, "masked ExtINT on CPU#%d\n", cpu);
 	}
 	apic_write(APIC_LVT0, value);

 	/*
 	 * only the BP should see the LINT1 NMI signal, obviously.
 	 */
-	if (!smp_processor_id())
+	if (!cpu)
 		value = APIC_DM_NMI;
 	else
 		value = APIC_DM_NMI | APIC_LVT_MASKED;
@@ -1369,7 +1368,7 @@ void __cpuinit setup_local_APIC(void)

 #ifdef CONFIG_X86_MCE_INTEL
 	/* Recheck CMCI information after local APIC is up on CPU #0 */
-	if (smp_processor_id() == 0)
+	if (cpu == 0)
 		cmci_recheck();
 #endif
 }

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 4/9 UPDATED-1] x86: Initialize 32bit logical apicid mapping early during boot
  2010-11-11 11:02 ` [PATCH 4/9] x86: Initialize 32bit logical apicid mapping early during boot Tejun Heo
  2010-11-12  1:54   ` Brian Gerst
  2010-11-12  9:38   ` [PATCH UPDATED " Tejun Heo
@ 2010-11-12 10:33   ` Tejun Heo
  2010-11-18  8:30     ` Ingo Molnar
  2 siblings, 1 reply; 31+ messages in thread
From: Tejun Heo @ 2010-11-12 10:33 UTC (permalink / raw)
  To: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai

On x86_32, non-standard logical apicid mapping can be used by
different NUMA setups and the mapping is queried while bringing up
each CPU using apic->cpu_to_logical_apicid() to build
cpu_2_logical_apicid[] array.  The logical apicid is then used to
deliver IPIs and determine NUMA configuration.

Unfortunately, initializing at SMP bring up is too late for percpu
setup making static percpu variables setup w/o considering NUMA.  This
also is different from how x86_64 is configured making the code
difficult to follow and maintain.

This patch updates logical apicid mapping handling such that,

* early_percpu variable x86_cpu_to_logical_apicid replaces
  cpu_2_logical_apicid[].

* apic->cpu_to_logical_apicid() is called once during get_smp_config()
  and the output is recorded in x86_cpu_to_logical_apicid.

* apic->cpu_to_logical_apicid() is allowed to return BAD_APICID if it
  can't determine the value that early during boot.  In this case, the
  mapping will be initialized during SMP bring up by reading APIC LDR
  as before.

- Brian Gerst spotted that setup_per_cpu_areas() was not copying the
  early x86_cpu_to_logical_apicid to the permanent percpu area and
  es7000_32 is using per_cpu() instead of early_per_cpu(), which in
  itself is not incorrect as they're never used before setup_per_cpu()
  but still confusing.  Both updated.

- Using local variable @cpu to cache smp_processor_id() in
  setup_local_APIC() separated out into a separate patch as suggested
  by Yinghai Lu.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
---
The git tree is again updated accordingly.  Scream if anyone wants the
whole series reposted.

Thanks.

 arch/x86/include/asm/apic.h      |   18 +++++++++++++-----
 arch/x86/include/asm/smp.h       |    3 +++
 arch/x86/kernel/apic/apic.c      |   23 ++++++++++++++++++++++-
 arch/x86/kernel/apic/bigsmp_32.c |   26 ++++++++++++++------------
 arch/x86/kernel/apic/es7000_32.c |   27 ++++++++++-----------------
 arch/x86/kernel/apic/ipi.c       |    8 ++++----
 arch/x86/kernel/apic/numaq_32.c  |   15 +++++++--------
 arch/x86/kernel/apic/summit_32.c |   36 ++++++++++++++++--------------------
 arch/x86/kernel/setup_percpu.c   |    7 +++++++
 arch/x86/kernel/smpboot.c        |   10 +++-------
 10 files changed, 99 insertions(+), 74 deletions(-)

Index: work/arch/x86/kernel/smpboot.c
===================================================================
--- work.orig/arch/x86/kernel/smpboot.c
+++ work/arch/x86/kernel/smpboot.c
@@ -165,25 +165,21 @@ static void unmap_cpu_to_node(int cpu)
 #endif

 #ifdef CONFIG_X86_32
-u8 cpu_2_logical_apicid[NR_CPUS] __read_mostly =
-					{ [0 ... NR_CPUS-1] = BAD_APICID };
-
 static void map_cpu_to_logical_apicid(void)
 {
 	int cpu = smp_processor_id();
-	int apicid = logical_smp_processor_id();
-	int node = apic->apicid_to_node(apicid);
+	int logical_apicid = early_per_cpu(x86_cpu_to_logical_apicid, cpu);
+	int node;

+	node = apic->apicid_to_node(logical_apicid);
 	if (!node_online(node))
 		node = first_online_node;

-	cpu_2_logical_apicid[cpu] = apicid;
 	map_cpu_to_node(cpu, node);
 }

 void numa_remove_cpu(int cpu)
 {
-	cpu_2_logical_apicid[cpu] = BAD_APICID;
 	unmap_cpu_to_node(cpu);
 }
 #else
Index: work/arch/x86/include/asm/apic.h
===================================================================
--- work.orig/arch/x86/include/asm/apic.h
+++ work/arch/x86/include/asm/apic.h
@@ -299,6 +299,19 @@ struct apic {
 	unsigned long (*check_apicid_present)(int apicid);

 	void (*vector_allocation_domain)(int cpu, struct cpumask *retmask);
+	/*
+	 * x86_32 specific method called very early during boot from
+	 * get_smp_config().  It should return the logical apicid.
+	 * x86_[bios]_cpu_to_apicid is initialized before this
+	 * function is called.
+	 *
+	 * If logical apicid can't be determined that early, the
+	 * function may return BAD_APICID.  Logical apicid will be
+	 * automatically configured after init_apic_ldr() while
+	 * bringing up CPUs.  Note that NUMA affinity won't work
+	 * properly during early boot in this case.
+	 */
+	int (*cpu_to_logical_apicid)(int cpu);
 	void (*init_apic_ldr)(void);

 	void (*ioapic_phys_id_map)(physid_mask_t *phys_map, physid_mask_t *retmap);
@@ -306,7 +319,6 @@ struct apic {
 	void (*setup_apic_routing)(void);
 	int (*multi_timer_check)(int apic, int irq);
 	int (*apicid_to_node)(int logical_apicid);
-	int (*cpu_to_logical_apicid)(int cpu);
 	int (*cpu_present_to_apicid)(int mps_cpu);
 	void (*apicid_to_cpu_present)(int phys_apicid, physid_mask_t *retmap);
 	void (*setup_portio_remap)(void);
@@ -594,8 +606,4 @@ extern int default_check_phys_apicid_pre

 #endif /* CONFIG_X86_LOCAL_APIC */

-#ifdef CONFIG_X86_32
-extern u8 cpu_2_logical_apicid[NR_CPUS];
-#endif
-
 #endif /* _ASM_X86_APIC_H */
Index: work/arch/x86/kernel/apic/apic.c
===================================================================
--- work.orig/arch/x86/kernel/apic/apic.c
+++ work/arch/x86/kernel/apic/apic.c
@@ -80,6 +80,11 @@ EXPORT_EARLY_PER_CPU_SYMBOL(x86_cpu_to_a
 EXPORT_EARLY_PER_CPU_SYMBOL(x86_bios_cpu_apicid);

 #ifdef CONFIG_X86_32
+
+#ifdef CONFIG_SMP
+DEFINE_EARLY_PER_CPU(int, x86_cpu_to_logical_apicid, BAD_APICID);
+#endif
+
 /*
  * Knob to control our willingness to enable the local APIC.
  *
@@ -1242,6 +1247,19 @@ void __cpuinit setup_local_APIC(void)
 	 */
 	apic->init_apic_ldr();

+#ifdef CONFIG_X86_32
+	/*
+	 * APIC LDR is initialized.  If logical_apicid mapping was
+	 * initialized during get_smp_config(), make sure it matches
+	 * the actual value.
+	 */
+	i = early_per_cpu(x86_cpu_to_logical_apicid, cpu);
+	WARN_ON(i != BAD_APICID && i != logical_smp_processor_id());
+	/* always use the value from LDR */
+	early_per_cpu(x86_cpu_to_logical_apicid, cpu) =
+		logical_smp_processor_id();
+#endif
+
 	/*
 	 * Set Task Priority to 'accept all'. We never change this
 	 * later on.
@@ -1966,7 +1984,10 @@ void __cpuinit generic_processor_info(in
 	early_per_cpu(x86_cpu_to_apicid, cpu) = apicid;
 	early_per_cpu(x86_bios_cpu_apicid, cpu) = apicid;
 #endif
-
+#ifdef CONFIG_X86_32
+	early_per_cpu(x86_cpu_to_logical_apicid, cpu) =
+		apic->cpu_to_logical_apicid(cpu);
+#endif
 	set_cpu_possible(cpu, true);
 	set_cpu_present(cpu, true);
 }
Index: work/arch/x86/kernel/apic/bigsmp_32.c
===================================================================
--- work.orig/arch/x86/kernel/apic/bigsmp_32.c
+++ work/arch/x86/kernel/apic/bigsmp_32.c
@@ -45,6 +45,12 @@ static unsigned long bigsmp_check_apicid
 	return 1;
 }

+static int bigsmp_cpu_to_logical_apicid(int cpu)
+{
+	/* on bigsmp, logical apicid is the same as physical */
+	return early_per_cpu(x86_cpu_to_apicid, cpu);
+}
+
 static inline unsigned long calculate_ldr(int cpu)
 {
 	unsigned long val, id;
@@ -93,14 +99,6 @@ static int bigsmp_cpu_present_to_apicid(
 	return BAD_APICID;
 }

-/* Mapping from cpu number to logical apicid */
-static inline int bigsmp_cpu_to_logical_apicid(int cpu)
-{
-	if (cpu >= nr_cpu_ids)
-		return BAD_APICID;
-	return cpu_physical_id(cpu);
-}
-
 static void bigsmp_ioapic_phys_id_map(physid_mask_t *phys_map, physid_mask_t *retmap)
 {
 	/* For clustered we don't have a good way to do this yet - hack */
@@ -115,7 +113,11 @@ static int bigsmp_check_phys_apicid_pres
 /* As we are using single CPU as destination, pick only one CPU here */
 static unsigned int bigsmp_cpu_mask_to_apicid(const struct cpumask *cpumask)
 {
-	return bigsmp_cpu_to_logical_apicid(cpumask_first(cpumask));
+	int cpu = cpumask_first(cpumask);
+
+	if (cpu < nr_cpu_ids)
+		return cpu_physical_id(cpu);
+	return BAD_APICID;
 }

 static unsigned int bigsmp_cpu_mask_to_apicid_and(const struct cpumask *cpumask,
@@ -129,9 +131,9 @@ static unsigned int bigsmp_cpu_mask_to_a
 	 */
 	for_each_cpu_and(cpu, cpumask, andmask) {
 		if (cpumask_test_cpu(cpu, cpu_online_mask))
-			break;
+			return cpu_physical_id(cpu);
 	}
-	return bigsmp_cpu_to_logical_apicid(cpu);
+	return BAD_APICID;
 }

 static int bigsmp_phys_pkg_id(int cpuid_apic, int index_msb)
@@ -214,13 +216,13 @@ struct apic apic_bigsmp = {
 	.check_apicid_present		= bigsmp_check_apicid_present,

 	.vector_allocation_domain	= bigsmp_vector_allocation_domain,
+	.cpu_to_logical_apicid		= bigsmp_cpu_to_logical_apicid,
 	.init_apic_ldr			= bigsmp_init_apic_ldr,

 	.ioapic_phys_id_map		= bigsmp_ioapic_phys_id_map,
 	.setup_apic_routing		= bigsmp_setup_apic_routing,
 	.multi_timer_check		= NULL,
 	.apicid_to_node			= bigsmp_apicid_to_node,
-	.cpu_to_logical_apicid		= bigsmp_cpu_to_logical_apicid,
 	.cpu_present_to_apicid		= bigsmp_cpu_present_to_apicid,
 	.apicid_to_cpu_present		= physid_set_mask_of_physid,
 	.setup_portio_remap		= NULL,
Index: work/arch/x86/kernel/apic/summit_32.c
===================================================================
--- work.orig/arch/x86/kernel/apic/summit_32.c
+++ work/arch/x86/kernel/apic/summit_32.c
@@ -194,11 +194,11 @@ static unsigned long summit_check_apicid
 	return 1;
 }

-static void summit_init_apic_ldr(void)
+/* Mapping from cpu number to logical apicid */
+static int summit_cpu_to_logical_apicid(int cpu)
 {
-	unsigned long val, id;
 	int count = 0;
-	u8 my_id = (u8)hard_smp_processor_id();
+	u8 my_id = early_per_cpu(x86_cpu_to_apicid, cpu);
 	u8 my_cluster = APIC_CLUSTER(my_id);
 #ifdef CONFIG_SMP
 	u8 lid;
@@ -206,7 +206,7 @@ static void summit_init_apic_ldr(void)

 	/* Create logical APIC IDs by counting CPUs already in cluster. */
 	for (count = 0, i = nr_cpu_ids; --i >= 0; ) {
-		lid = cpu_2_logical_apicid[i];
+		lid = early_per_cpu(x86_cpu_to_logical_apicid, i);
 		if (lid != BAD_APICID && APIC_CLUSTER(lid) == my_cluster)
 			++count;
 	}
@@ -214,7 +214,15 @@ static void summit_init_apic_ldr(void)
 	/* We only have a 4 wide bitmap in cluster mode.  If a deranged
 	 * BIOS puts 5 CPUs in one APIC cluster, we're hosed. */
 	BUG_ON(count >= XAPIC_DEST_CPUS_SHIFT);
-	id = my_cluster | (1UL << count);
+	return my_cluster | (1UL << count);
+}
+
+static void summit_init_apic_ldr(void)
+{
+	int cpu = smp_processor_id();
+	unsigned long id = early_per_cpu(x86_cpu_to_logical_apicid, cpu);
+	unsigned long val;
+
 	apic_write(APIC_DFR, SUMMIT_APIC_DFR_VALUE);
 	val = apic_read(APIC_LDR) & ~APIC_LDR_MASK;
 	val |= SET_APIC_LOGICAL_ID(id);
@@ -241,18 +249,6 @@ static int summit_apicid_to_node(int log
 #endif
 }

-/* Mapping from cpu number to logical apicid */
-static inline int summit_cpu_to_logical_apicid(int cpu)
-{
-#ifdef CONFIG_SMP
-	if (cpu >= nr_cpu_ids)
-		return BAD_APICID;
-	return cpu_2_logical_apicid[cpu];
-#else
-	return logical_smp_processor_id();
-#endif
-}
-
 static int summit_cpu_present_to_apicid(int mps_cpu)
 {
 	if (mps_cpu < nr_cpu_ids)
@@ -286,7 +282,7 @@ static unsigned int summit_cpu_mask_to_a
 	 * The cpus in the mask must all be on the apic cluster.
 	 */
 	for_each_cpu(cpu, cpumask) {
-		int new_apicid = summit_cpu_to_logical_apicid(cpu);
+		int new_apicid = early_per_cpu(x86_cpu_to_logical_apicid, cpu);

 		if (round && APIC_CLUSTER(apicid) != APIC_CLUSTER(new_apicid)) {
 			printk("%s: Not a valid mask!\n", __func__);
@@ -301,7 +297,7 @@ static unsigned int summit_cpu_mask_to_a
 static unsigned int summit_cpu_mask_to_apicid_and(const struct cpumask *inmask,
 			      const struct cpumask *andmask)
 {
-	int apicid = summit_cpu_to_logical_apicid(0);
+	int apicid = early_per_cpu(x86_cpu_to_logical_apicid, 0);
 	cpumask_var_t cpumask;

 	if (!alloc_cpumask_var(&cpumask, GFP_ATOMIC))
@@ -523,13 +519,13 @@ struct apic apic_summit = {
 	.check_apicid_present		= summit_check_apicid_present,

 	.vector_allocation_domain	= summit_vector_allocation_domain,
+	.cpu_to_logical_apicid		= summit_cpu_to_logical_apicid,
 	.init_apic_ldr			= summit_init_apic_ldr,

 	.ioapic_phys_id_map		= summit_ioapic_phys_id_map,
 	.setup_apic_routing		= summit_setup_apic_routing,
 	.multi_timer_check		= NULL,
 	.apicid_to_node			= summit_apicid_to_node,
-	.cpu_to_logical_apicid		= summit_cpu_to_logical_apicid,
 	.cpu_present_to_apicid		= summit_cpu_present_to_apicid,
 	.apicid_to_cpu_present		= summit_apicid_to_cpu_present,
 	.setup_portio_remap		= NULL,
Index: work/arch/x86/include/asm/smp.h
===================================================================
--- work.orig/arch/x86/include/asm/smp.h
+++ work/arch/x86/include/asm/smp.h
@@ -38,6 +38,9 @@ static inline struct cpumask *cpu_core_m

 DECLARE_EARLY_PER_CPU(u16, x86_cpu_to_apicid);
 DECLARE_EARLY_PER_CPU(u16, x86_bios_cpu_apicid);
+#if defined(CONFIG_SMP) && defined(CONFIG_X86_32)
+DECLARE_EARLY_PER_CPU(int, x86_cpu_to_logical_apicid);
+#endif

 /* Static state in head.S used to set up a CPU */
 extern struct {
Index: work/arch/x86/kernel/apic/numaq_32.c
===================================================================
--- work.orig/arch/x86/kernel/apic/numaq_32.c
+++ work/arch/x86/kernel/apic/numaq_32.c
@@ -346,6 +346,12 @@ static inline int numaq_apic_id_register
 	return 1;
 }

+static int numaq_cpu_to_logical_apicid(int cpu)
+{
+	/* NUMA-Q firmware set this up but how do I read this from boot CPU? */
+	return BAD_APICID;
+}
+
 static inline void numaq_init_apic_ldr(void)
 {
 	/* Already done in NUMA-Q firmware */
@@ -373,13 +379,6 @@ static inline void numaq_ioapic_phys_id_
 	return physids_promote(0xFUL, retmap);
 }

-static inline int numaq_cpu_to_logical_apicid(int cpu)
-{
-	if (cpu >= nr_cpu_ids)
-		return BAD_APICID;
-	return cpu_2_logical_apicid[cpu];
-}
-
 /*
  * Supporting over 60 cpus on NUMA-Q requires a locality-dependent
  * cpu to APIC ID relation to properly interact with the intelligent
@@ -503,13 +502,13 @@ struct apic __refdata apic_numaq = {
 	.check_apicid_present		= numaq_check_apicid_present,

 	.vector_allocation_domain	= numaq_vector_allocation_domain,
+	.cpu_to_logical_apicid		= numaq_cpu_to_logical_apicid,
 	.init_apic_ldr			= numaq_init_apic_ldr,

 	.ioapic_phys_id_map		= numaq_ioapic_phys_id_map,
 	.setup_apic_routing		= numaq_setup_apic_routing,
 	.multi_timer_check		= numaq_multi_timer_check,
 	.apicid_to_node			= numaq_apicid_to_node,
-	.cpu_to_logical_apicid		= numaq_cpu_to_logical_apicid,
 	.cpu_present_to_apicid		= numaq_cpu_present_to_apicid,
 	.apicid_to_cpu_present		= numaq_apicid_to_cpu_present,
 	.setup_portio_remap		= numaq_setup_portio_remap,
Index: work/arch/x86/kernel/apic/ipi.c
===================================================================
--- work.orig/arch/x86/kernel/apic/ipi.c
+++ work/arch/x86/kernel/apic/ipi.c
@@ -73,8 +73,8 @@ void default_send_IPI_mask_sequence_logi
 	local_irq_save(flags);
 	for_each_cpu(query_cpu, mask)
 		__default_send_IPI_dest_field(
-			apic->cpu_to_logical_apicid(query_cpu), vector,
-			apic->dest_logical);
+			early_per_cpu(x86_cpu_to_logical_apicid, query_cpu),
+			vector, apic->dest_logical);
 	local_irq_restore(flags);
 }

@@ -92,8 +92,8 @@ void default_send_IPI_mask_allbutself_lo
 		if (query_cpu == this_cpu)
 			continue;
 		__default_send_IPI_dest_field(
-			apic->cpu_to_logical_apicid(query_cpu), vector,
-			apic->dest_logical);
+			early_per_cpu(x86_cpu_to_logical_apicid, query_cpu),
+			vector, apic->dest_logical);
 		}
 	local_irq_restore(flags);
 }
Index: work/arch/x86/kernel/apic/es7000_32.c
===================================================================
--- work.orig/arch/x86/kernel/apic/es7000_32.c
+++ work/arch/x86/kernel/apic/es7000_32.c
@@ -460,11 +460,16 @@ static unsigned long es7000_check_apicid
 	return physid_isset(bit, phys_cpu_present_map);
 }

+static int es7000_cpu_to_logical_apicid(int cpu)
+{
+	return early_per_cpu(x86_bios_cpu_apicid, cpu);
+}
+
 static unsigned long calculate_ldr(int cpu)
 {
-	unsigned long id = per_cpu(x86_bios_cpu_apicid, cpu);
+	int logical_apicid = early_per_cpu(x86_cpu_to_logical_apicid, cpu);

-	return SET_APIC_LOGICAL_ID(id);
+	return SET_APIC_LOGICAL_ID(logical_apicid);
 }

 /*
@@ -528,18 +533,6 @@ static void es7000_apicid_to_cpu_present
 	++cpu_id;
 }

-/* Mapping from cpu number to logical apicid */
-static int es7000_cpu_to_logical_apicid(int cpu)
-{
-#ifdef CONFIG_SMP
-	if (cpu >= nr_cpu_ids)
-		return BAD_APICID;
-	return cpu_2_logical_apicid[cpu];
-#else
-	return logical_smp_processor_id();
-#endif
-}
-
 static void es7000_ioapic_phys_id_map(physid_mask_t *phys_map, physid_mask_t *retmap)
 {
 	/* For clustered we don't have a good way to do this yet - hack */
@@ -561,7 +554,7 @@ static unsigned int es7000_cpu_mask_to_a
 	 * The cpus in the mask must all be on the apic cluster.
 	 */
 	for_each_cpu(cpu, cpumask) {
-		int new_apicid = es7000_cpu_to_logical_apicid(cpu);
+		int new_apicid = early_per_cpu(x86_cpu_to_logical_apicid, cpu);

 		if (round && APIC_CLUSTER(apicid) != APIC_CLUSTER(new_apicid)) {
 			WARN(1, "Not a valid mask!");
@@ -578,7 +571,7 @@ static unsigned int
 es7000_cpu_mask_to_apicid_and(const struct cpumask *inmask,
 			      const struct cpumask *andmask)
 {
-	int apicid = es7000_cpu_to_logical_apicid(0);
+	int apicid = early_per_cpu(x86_cpu_to_logical_apicid, 0);
 	cpumask_var_t cpumask;

 	if (!alloc_cpumask_var(&cpumask, GFP_ATOMIC))
@@ -650,13 +643,13 @@ struct apic __refdata apic_es7000_cluste
 	.check_apicid_present		= es7000_check_apicid_present,

 	.vector_allocation_domain	= es7000_vector_allocation_domain,
+	.cpu_to_logical_apicid		= es7000_cpu_to_logical_apicid,
 	.init_apic_ldr			= es7000_init_apic_ldr_cluster,

 	.ioapic_phys_id_map		= es7000_ioapic_phys_id_map,
 	.setup_apic_routing		= es7000_setup_apic_routing,
 	.multi_timer_check		= NULL,
 	.apicid_to_node			= es7000_apicid_to_node,
-	.cpu_to_logical_apicid		= es7000_cpu_to_logical_apicid,
 	.cpu_present_to_apicid		= es7000_cpu_present_to_apicid,
 	.apicid_to_cpu_present		= es7000_apicid_to_cpu_present,
 	.setup_portio_remap		= NULL,
Index: work/arch/x86/kernel/setup_percpu.c
===================================================================
--- work.orig/arch/x86/kernel/setup_percpu.c
+++ work/arch/x86/kernel/setup_percpu.c
@@ -225,6 +225,10 @@ void __init setup_per_cpu_areas(void)
 		per_cpu(x86_bios_cpu_apicid, cpu) =
 			early_per_cpu_map(x86_bios_cpu_apicid, cpu);
 #endif
+#ifdef CONFIG_X86_32
+		per_cpu(x86_cpu_to_logical_apicid, cpu) =
+			early_per_cpu_map(x86_cpu_to_logical_apicid, cpu);
+#endif
 #ifdef CONFIG_X86_64
 		per_cpu(irq_stack_ptr, cpu) =
 			per_cpu(irq_stack_union.irq_stack, cpu) +
@@ -256,6 +260,9 @@ void __init setup_per_cpu_areas(void)
 	early_per_cpu_ptr(x86_cpu_to_apicid) = NULL;
 	early_per_cpu_ptr(x86_bios_cpu_apicid) = NULL;
 #endif
+#ifdef CONFIG_X86_32
+	early_per_cpu_ptr(x86_cpu_to_logical_apicid) = NULL;
+#endif
 #if defined(CONFIG_X86_64) && defined(CONFIG_NUMA)
 	early_per_cpu_ptr(x86_cpu_to_node_map) = NULL;
 #endif

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/9 UPDATED-1] x86: Initialize 32bit logical apicid mapping early during boot
  2010-11-12 10:33   ` [PATCH 4/9 UPDATED-1] " Tejun Heo
@ 2010-11-18  8:30     ` Ingo Molnar
  2010-11-18  8:39       ` Tejun Heo
  0 siblings, 1 reply; 31+ messages in thread
From: Ingo Molnar @ 2010-11-18  8:30 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai


* Tejun Heo <tj@kernel.org> wrote:

> On x86_32, non-standard logical apicid mapping can be used by
> different NUMA setups and the mapping is queried while bringing up
> each CPU using apic->cpu_to_logical_apicid() to build
> cpu_2_logical_apicid[] array.  The logical apicid is then used to
> deliver IPIs and determine NUMA configuration.
> 
> Unfortunately, initializing at SMP bring up is too late for percpu
> setup making static percpu variables setup w/o considering NUMA.  This
> also is different from how x86_64 is configured making the code
> difficult to follow and maintain.
> 
> This patch updates logical apicid mapping handling such that,
> 
> * early_percpu variable x86_cpu_to_logical_apicid replaces
>   cpu_2_logical_apicid[].
> 
> * apic->cpu_to_logical_apicid() is called once during get_smp_config()
>   and the output is recorded in x86_cpu_to_logical_apicid.
> 
> * apic->cpu_to_logical_apicid() is allowed to return BAD_APICID if it
>   can't determine the value that early during boot.  In this case, the
>   mapping will be initialized during SMP bring up by reading APIC LDR
>   as before.
> 
> - Brian Gerst spotted that setup_per_cpu_areas() was not copying the
>   early x86_cpu_to_logical_apicid to the permanent percpu area and
>   es7000_32 is using per_cpu() instead of early_per_cpu(), which in
>   itself is not incorrect as they're never used before setup_per_cpu()
>   but still confusing.  Both updated.
> 
> - Using local variable @cpu to cache smp_processor_id() in
>   setup_local_APIC() separated out into a separate patch as suggested
>   by Yinghai Lu.

This patch is still _WAY_ too large.

Also, these:

> +#ifdef CONFIG_X86_32
> +#endif

> +#ifdef CONFIG_X86_32
> +#endif

> +#ifdef CONFIG_X86_32
> +#endif

> +#ifdef CONFIG_X86_32
> +#endif

Are rather ugly.

	Ingo

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/9 UPDATED-1] x86: Initialize 32bit logical apicid mapping early during boot
  2010-11-18  8:30     ` Ingo Molnar
@ 2010-11-18  8:39       ` Tejun Heo
  2010-11-18  8:42         ` Ingo Molnar
  0 siblings, 1 reply; 31+ messages in thread
From: Tejun Heo @ 2010-11-18  8:39 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai

Hey, Ingo.

On 11/18/2010 09:30 AM, Ingo Molnar wrote:
> This patch is still _WAY_ too large.

Really?  Alright, I'll split it further.

> Also, these:
> 
>> +#ifdef CONFIG_X86_32
>> +#endif
> 
>> +#ifdef CONFIG_X86_32
>> +#endif
> 
>> +#ifdef CONFIG_X86_32
>> +#endif
> 
>> +#ifdef CONFIG_X86_32
>> +#endif
> 
> Are rather ugly.

Yeah, what they do is ugly.  It gets less uglier after the patchset.
I'll see if some of them can be dropped but I don't think putting them
inside ifdef'd inline functions necessarily improves things.  It often
just makes things more difficult to follow.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/9 UPDATED-1] x86: Initialize 32bit logical apicid mapping early during boot
  2010-11-18  8:39       ` Tejun Heo
@ 2010-11-18  8:42         ` Ingo Molnar
  2010-11-18  8:48           ` Tejun Heo
  0 siblings, 1 reply; 31+ messages in thread
From: Ingo Molnar @ 2010-11-18  8:42 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai


* Tejun Heo <tj@kernel.org> wrote:

> Hey, Ingo.
> 
> On 11/18/2010 09:30 AM, Ingo Molnar wrote:
> > This patch is still _WAY_ too large.
> 
> Really?  Alright, I'll split it further.

This is a very sensitive area of code that tends to blow up in nasty ways. So when 
we do changes here we want them super-finegrained. If you split it up into 30 
reasonable patches - no problem at all. (here up to 5 would suffice i think)

Yinghai used to have this too big patches illness too.

> > Also, these:
> > 
> >> +#ifdef CONFIG_X86_32
> >> +#endif
> > 
> >> +#ifdef CONFIG_X86_32
> >> +#endif
> > 
> >> +#ifdef CONFIG_X86_32
> >> +#endif
> > 
> >> +#ifdef CONFIG_X86_32
> >> +#endif
> > 
> > Are rather ugly.
> 
> Yeah, what they do is ugly.  It gets less uglier after the patchset.
> I'll see if some of them can be dropped but I don't think putting them
> inside ifdef'd inline functions necessarily improves things.  It often
> just makes things more difficult to follow.

Can we remove them? Or does it make any sense on the 64-bit side?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/9 UPDATED-1] x86: Initialize 32bit logical apicid mapping early during boot
  2010-11-18  8:42         ` Ingo Molnar
@ 2010-11-18  8:48           ` Tejun Heo
  2010-11-18  9:08             ` Ingo Molnar
  0 siblings, 1 reply; 31+ messages in thread
From: Tejun Heo @ 2010-11-18  8:48 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai

Hello,

On 11/18/2010 09:42 AM, Ingo Molnar wrote:
> This is a very sensitive area of code that tends to blow up in nasty
> ways. So when we do changes here we want them super-finegrained. If
> you split it up into 30 reasonable patches - no problem at
> all. (here up to 5 would suffice i think)

I'll probably split it to three or four.  That said, I don't really
think it would help bisection.  But, anyways, no problem.

>> Yeah, what they do is ugly.  It gets less uglier after the patchset.
>> I'll see if some of them can be dropped but I don't think putting them
>> inside ifdef'd inline functions necessarily improves things.  It often
>> just makes things more difficult to follow.
> 
> Can we remove them? Or does it make any sense on the 64-bit side?

Those are the parts where 32 and 64 actually differ.  On 64, the
mapping between physical and logical are fixed (at least in the
generic arch code) which isn't true on 32, so it doesn't make much
sense for 64.  For now, I would suggest leaving them ugly.  At least
they're now properly localized after the patchset.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/9 UPDATED-1] x86: Initialize 32bit logical apicid mapping early during boot
  2010-11-18  8:48           ` Tejun Heo
@ 2010-11-18  9:08             ` Ingo Molnar
  0 siblings, 0 replies; 31+ messages in thread
From: Ingo Molnar @ 2010-11-18  9:08 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, mingo, tglx, hpa, x86, eric.dumazet, yinghai


* Tejun Heo <tj@kernel.org> wrote:

> Hello,
> 
> On 11/18/2010 09:42 AM, Ingo Molnar wrote:
> > This is a very sensitive area of code that tends to blow up in nasty
> > ways. So when we do changes here we want them super-finegrained. If
> > you split it up into 30 reasonable patches - no problem at
> > all. (here up to 5 would suffice i think)
> 
> I'll probably split it to three or four.  That said, I don't really
> think it would help bisection.  But, anyways, no problem.

The most critical patch will of course be the most likely one to regress so a 
split-up does not help bisection per se.

The help this does for us is _fixing_ stuff: we will be looking at a 20 lines patch, 
not a 200 lines one ...

> [...] For now, I would suggest leaving them ugly.  At least
> they're now properly localized after the patchset.

ok. Please add a comment declaring the ugliness and explaining why they are ugly - 
so that future hackers see it as an exception to the rule, not as an example to 
follow ;-)

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2010-11-18  9:08 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-11-11 11:02 [PATCHSET RFC] x86: unify x86_32 and 64 NUMA init paths Tejun Heo
2010-11-11 11:02 ` [PATCH 1/9] x86: Kill unused static boot_cpu_logical_apicid in smpboot.c Tejun Heo
2010-11-11 11:13   ` Pekka Enberg
2010-11-11 11:02 ` [PATCH 2/9] x86: Rename x86_32 MAX_APICID to MAX_LOCAL_APIC Tejun Heo
2010-11-11 11:15   ` Pekka Enberg
2010-11-11 11:02 ` [PATCH 3/9] x86: Make default_send_IPI_mask_sequence/allbutself_logical() 32bit only Tejun Heo
2010-11-11 12:02   ` Pekka Enberg
2010-11-11 16:18   ` Cyrill Gorcunov
2010-11-11 17:34   ` [PATCH UPDATED " Tejun Heo
2010-11-11 17:52     ` Cyrill Gorcunov
2010-11-11 11:02 ` [PATCH 4/9] x86: Initialize 32bit logical apicid mapping early during boot Tejun Heo
2010-11-12  1:54   ` Brian Gerst
2010-11-12  9:38   ` [PATCH UPDATED " Tejun Heo
2010-11-12  9:58     ` Yinghai Lu
2010-11-12 10:22       ` Tejun Heo
2010-11-12 10:33   ` [PATCH 4/9 UPDATED-1] " Tejun Heo
2010-11-18  8:30     ` Ingo Molnar
2010-11-18  8:39       ` Tejun Heo
2010-11-18  8:42         ` Ingo Molnar
2010-11-18  8:48           ` Tejun Heo
2010-11-18  9:08             ` Ingo Molnar
2010-11-11 11:02 ` [PATCH 5/9] x86: Replace apic->apicid_to_node() with ->numa_cpu_node() Tejun Heo
2010-11-11 11:37   ` Pekka Enberg
2010-11-11 11:02 ` [PATCH 6/9] x86: Unify cpu/apicid <-> NUMA node mapping between 32 and 64bit Tejun Heo
2010-11-11 12:01   ` Pekka Enberg
2010-11-11 11:02 ` [PATCH 7/9] x86: Unify CPU -> " Tejun Heo
2010-11-11 11:02 ` [PATCH 8/9] x86: Unify node_to_cpumask_map handling " Tejun Heo
2010-11-11 12:11   ` Pekka Enberg
2010-11-11 11:02 ` [PATCH 9/9] x86: Unify NUMA initialization " Tejun Heo
2010-11-11 12:09   ` Pekka Enberg
2010-11-12 10:32 ` [PATCH 3.5/9] x86: Use local variable to cache smp_processor_id() in setup_local_APIC() Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).