* 32bit NUMA and fakeNUMA broken for AMD CPUs
@ 2011-06-21 15:41 Conny Seidel
  2011-06-26 10:22 ` Tejun Heo
  0 siblings, 1 reply; 28+ messages in thread
From: Conny Seidel @ 2011-06-21 15:41 UTC (permalink / raw)
  To: LKML, Tejun Heo


Hi,

commit 797390d8554b1e07aabea37d0140933b0412dba0 breaks 32-bit boot on
AMD with both native NUMA and fakeNUMA.

With native NUMA, the kernel still boots when the parameter numa=off is
added to the command line.

[    0.000000] BUG: unable to handle kernel paging request at 000012b0
[    0.000000] IP: [<c1aa13ce>] memmap_init_zone+0x6c/0xf2
[    0.000000] *pdpt = 0000000000000000 *pde = f000eef3f000ee00
[    0.000000] Oops: 0000 [#1] SMP
[    0.000000] last sysfs file:
[    0.000000] Modules linked in:
[    0.000000]
[    0.000000] Pid: 0, comm: swapper Not tainted 2.6.39-rc5-00164-g797390d #1 To Be Filled By O.E.M. To Be Filled By O.E.M./E350M1
[    0.000000] EIP: 0060:[<c1aa13ce>] EFLAGS: 00010012 CPU: 0
[    0.000000] EIP is at memmap_init_zone+0x6c/0xf2
[    0.000000] EAX: 00000000 EBX: 000a8000 ECX: 000a7fff EDX: f2c00b80
[    0.000000] ESI: 000a8000 EDI: f2c00800 EBP: c19ffe54 ESP: c19ffe34
[    0.000000]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[    0.000000] Process swapper (pid: 0, ti=c19fe000 task=c1a07f60 task.ti=c19fe000)
[    0.000000] Stack:
[    0.000000]  00000002 00000000 0023f000 00000000 10000000 00000a00 f2c00000 f2c00b58
[    0.000000]  c19ffeb0 c1a80f24 000375fe 00000000 f2c00800 00000800 00000100 00000030
[    0.000000]  c1abb768 0000003c 00000000 00000000 00000004 00207a02 f2c00800 000375fe
[    0.000000] Call Trace:
[    0.000000]  [<c1a80f24>] free_area_init_node+0x358/0x385
[    0.000000]  [<c1a81384>] free_area_init_nodes+0x420/0x487
[    0.000000]  [<c1637323>] ? printk+0x14/0x16
[    0.000000]  [<c102489e>] ? memory_present+0x66/0x6f
[    0.000000]  [<c1a79326>] paging_init+0x114/0x11b
[    0.000000]  [<c101742f>] ? native_apic_mem_read+0x8/0x19
[    0.000000]  [<c1a6cb13>] setup_arch+0xb37/0xc0a
[    0.000000]  [<c1638f6d>] ? _raw_spin_unlock_irqrestore+0x19/0x25
[    0.000000]  [<c1638f6d>] ? _raw_spin_unlock_irqrestore+0x19/0x25
[    0.000000]  [<c1637323>] ? printk+0x14/0x16
[    0.000000]  [<c1a69554>] start_kernel+0x76/0x316
[    0.000000]  [<c1a690a8>] i386_start_kernel+0xa8/0xb0
[    0.000000] Code: 0a c1 e0 1d 89 45 ec 8b 45 e4 03 3c 85 e8 5b a6 c1 e9 8a 00 00 00 89 f0 89 f3 c1 e8 0e 0f be 80 a8 57 a6 c1 8b 04 85 e8 5b a6 c1 <2b> 98 b0 12 00 00 c1 e3 05 03 98 ac 12 00 00 8b 03 25 ff ff ff
[    0.000000] EIP: [<c1aa13ce>] memmap_init_zone+0x6c/0xf2 SS:ESP 0068:c19ffe34
[    0.000000] CR2: 00000000000012b0
[    0.000000] ---[ end trace 4eaa2a86a8e2da22 ]---
[    0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[    0.000000] Pid: 0, comm: swapper Tainted: G      D     2.6.39-rc5-00164-g797390d #1
[    0.000000] Call Trace:
[    0.000000]  [<c1637213>] panic+0x55/0x151
[    0.000000]  [<c10507c9>] ? blocking_notifier_call_chain+0x11/0x13
[    0.000000]  [<c1038340>] do_exit+0x99/0x6fa
[    0.000000]  [<c1638f6d>] ? _raw_spin_unlock_irqrestore+0x19/0x25
[    0.000000]  [<c10356de>] ? kmsg_dump+0x3c/0xbe
[    0.000000]  [<c163a569>] oops_end+0x97/0x9f
[    0.000000]  [<c101e9a4>] no_context+0x144/0x14e
[    0.000000]  [<c101eada>] __bad_area_nosemaphore+0x12c/0x134
[    0.000000]  [<c1a83a75>] ? memblock_add_region+0xbf/0x4af
[    0.000000]  [<c101eaf4>] bad_area_nosemaphore+0x12/0x15
[    0.000000]  [<c163beb0>] do_page_fault+0x1e8/0x3c8
[    0.000000]  [<c1a82c5e>] ? __alloc_memory_core_early+0x86/0x94
[    0.000000]  [<c163bcc8>] ? spurious_fault+0xf2/0xf2
[    0.000000]  [<c1639c6b>] error_code+0x5f/0x64
[    0.000000]  [<c163bcc8>] ? spurious_fault+0xf2/0xf2
[    0.000000]  [<c1aa13ce>] ? memmap_init_zone+0x6c/0xf2
[    0.000000]  [<c1a80f24>] free_area_init_node+0x358/0x385
[    0.000000]  [<c1a81384>] free_area_init_nodes+0x420/0x487
[    0.000000]  [<c1637323>] ? printk+0x14/0x16
[    0.000000]  [<c102489e>] ? memory_present+0x66/0x6f
[    0.000000]  [<c1a79326>] paging_init+0x114/0x11b
[    0.000000]  [<c101742f>] ? native_apic_mem_read+0x8/0x19
[    0.000000]  [<c1a6cb13>] setup_arch+0xb37/0xc0a
[    0.000000]  [<c1638f6d>] ? _raw_spin_unlock_irqrestore+0x19/0x25
[    0.000000]  [<c1638f6d>] ? _raw_spin_unlock_irqrestore+0x19/0x25
[    0.000000]  [<c1637323>] ? printk+0x14/0x16
[    0.000000]  [<c1a69554>] start_kernel+0x76/0x316
[    0.000000]  [<c1a690a8>] i386_start_kernel+0xa8/0xb0



commit 797390d8554b1e07aabea37d0140933b0412dba0
Author: Tejun Heo <tj@kernel.org>
Date:   Mon May 2 14:18:52 2011 +0200

    x86-32, NUMA: use sparse_memory_present_with_active_regions()

    Instead of calling memory_present() for each region from NUMA init,
    call sparse_memory_present_with_active_regions() from paging_init()
    similarly to x86-64.

    For flat and numaq, this results in exactly the same memory_present()
    calls.  For srat, if there are multiple memory chunks for a node,
    after this change, memory_present() will be called separately for each
    chunk instead of being called once to encompass the whole range, which
    doesn't cause any harm and actually is the better behavior.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Yinghai Lu <yinghai@kernel.org>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: "H. Peter Anvin" <hpa@zytor.com>


##
##################################################################
# Email : conny.seidel@amd.com            GnuPG-Key : 0xA6AB055D #
# Fingerprint: 17C4 5DB2 7C4C C1C7 1452 8148 F139 7C09 A6AB 055D #
##################################################################
# Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach      #
# General Managers: Alberto Bozzoi                               #
# Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen #
#               HRB Nr. 43632                                    #
##################################################################


* [PATCH x86/mm 1/2] x86: s/PAGES_PER_ELEMENT/PAGES_PER_SECTION/
@ 2011-07-12  7:44 Tejun Heo
  2011-07-12  7:45 ` [PATCH x86/mm 2/2] x86: Implement pfn -> nid mapping granularity check Tejun Heo
  2011-07-13  5:33 ` [tip:x86/numa] x86, mm: s/PAGES_PER_ELEMENT/PAGES_PER_SECTION/ tip-bot for Tejun Heo
  0 siblings, 2 replies; 28+ messages in thread
From: Tejun Heo @ 2011-07-12  7:44 UTC (permalink / raw)
  To: Ingo Molnar, H. Peter Anvin, Thomas Gleixner
  Cc: Conny Seidel, x86, linux-kernel, Hans Rosenfeld

From 9f5e6296923d7cf47738dfcd38ab9e333d3fd356 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Fri, 1 Jul 2011 18:22:39 +0200

DISCONTIGMEM on x86-32 implements pfn -> nid mapping similarly to
SPARSEMEM; however, it calls each mapping unit ELEMENT instead of
SECTION.  This patch renames it to SECTION so that PAGES_PER_SECTION
is valid for both DISCONTIGMEM and SPARSEMEM.  This will be used by
the next patch to implement mapping granularity check.

This patch is a trivial constant rename.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Hans Rosenfeld <hans.rosenfeld@amd.com>
---
This one is identical to the original posting[1]; only the second
patch is updated.  Please schedule for 3.1-rc1.  Also available in the
following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git review-x86-mm-base

Thanks.

[1] http://thread.gmane.org/gmane.linux.kernel/1161279/focus=1162583

 arch/x86/include/asm/mmzone_32.h |    6 +++---
 arch/x86/mm/numa_32.c            |    6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/mmzone_32.h b/arch/x86/include/asm/mmzone_32.h
index ffa037f..55728e1 100644
--- a/arch/x86/include/asm/mmzone_32.h
+++ b/arch/x86/include/asm/mmzone_32.h
@@ -34,15 +34,15 @@ static inline void resume_map_numa_kva(pgd_t *pgd) {}
  *    64Gb / 4096bytes/page = 16777216 pages
  */
 #define MAX_NR_PAGES 16777216
-#define MAX_ELEMENTS 1024
-#define PAGES_PER_ELEMENT (MAX_NR_PAGES/MAX_ELEMENTS)
+#define MAX_SECTIONS 1024
+#define PAGES_PER_SECTION (MAX_NR_PAGES/MAX_SECTIONS)
 
 extern s8 physnode_map[];
 
 static inline int pfn_to_nid(unsigned long pfn)
 {
 #ifdef CONFIG_NUMA
-	return((int) physnode_map[(pfn) / PAGES_PER_ELEMENT]);
+	return((int) physnode_map[(pfn) / PAGES_PER_SECTION]);
 #else
 	return 0;
 #endif
diff --git a/arch/x86/mm/numa_32.c b/arch/x86/mm/numa_32.c
index 849a975..3adebe7 100644
--- a/arch/x86/mm/numa_32.c
+++ b/arch/x86/mm/numa_32.c
@@ -41,7 +41,7 @@
  *     physnode_map[16-31] = 1;
  *     physnode_map[32- ] = -1;
  */
-s8 physnode_map[MAX_ELEMENTS] __read_mostly = { [0 ... (MAX_ELEMENTS - 1)] = -1};
+s8 physnode_map[MAX_SECTIONS] __read_mostly = { [0 ... (MAX_SECTIONS - 1)] = -1};
 EXPORT_SYMBOL(physnode_map);
 
 void memory_present(int nid, unsigned long start, unsigned long end)
@@ -52,8 +52,8 @@ void memory_present(int nid, unsigned long start, unsigned long end)
 			nid, start, end);
 	printk(KERN_DEBUG "  Setting physnode_map array to node %d for pfns:\n", nid);
 	printk(KERN_DEBUG "  ");
-	for (pfn = start; pfn < end; pfn += PAGES_PER_ELEMENT) {
-		physnode_map[pfn / PAGES_PER_ELEMENT] = nid;
+	for (pfn = start; pfn < end; pfn += PAGES_PER_SECTION) {
+		physnode_map[pfn / PAGES_PER_SECTION] = nid;
 		printk(KERN_CONT "%lx ", pfn);
 	}
 	printk(KERN_CONT "\n");
-- 
1.7.6



Thread overview: 28+ messages (newest: 2011-07-13  5:34 UTC)
2011-06-21 15:41 32bit NUMA and fakeNUMA broken for AMD CPUs Conny Seidel
2011-06-26 10:22 ` Tejun Heo
     [not found]   ` <20110626223807.47cef5c6.conny.seidel_amd.com@marah.osrc.amd.com>
2011-06-28  9:41     ` [PATCH tip:x86/urgent] x86-32, NUMA: Fix boot regression caused by NUMA init unification on highmem machines Tejun Heo
2011-06-28 12:35       ` Conny Seidel
2011-07-01 15:26       ` [tip:x86/urgent] " tip-bot for Tejun Heo
     [not found]     ` <20110628174613.GP478@escobedo.osrc.amd.com>
2011-06-29  9:44       ` 32bit NUMA and fakeNUMA broken for AMD CPUs Tejun Heo
2011-06-29 10:51         ` Tejun Heo
2011-06-29 12:34         ` Tejun Heo
2011-06-29 12:55           ` Hans Rosenfeld
2011-06-29 13:03             ` Tejun Heo
2011-06-29 16:15               ` Tejun Heo
2011-06-30 13:13                 ` Hans Rosenfeld
2011-06-30 15:55                   ` Tejun Heo
2011-06-30 16:32                     ` Hans Rosenfeld
2011-06-30 16:42                       ` Tejun Heo
2011-06-30 17:04                         ` Hans Rosenfeld
2011-07-01 16:22         ` [PATCH x86/urgent 1/2] x86: s/PAGES_PER_ELEMENT/PAGES_PER_SECTION/ Tejun Heo
2011-07-01 16:23           ` [PATCH x86/urgent 2/2] x86: Implement pfn -> nid mapping granularity check Tejun Heo
2011-07-09  8:32             ` Tejun Heo
2011-07-09  8:42               ` H. Peter Anvin
2011-07-11  8:34                 ` [PATCH x86/urgent] x86: Disable AMD_NUMA for 32bit for now Tejun Heo
2011-07-11 14:01                   ` Tejun Heo
2011-07-11 18:58                   ` [tip:x86/urgent] " tip-bot for Tejun Heo
2011-07-11 14:20                 ` [PATCH x86/urgent 2/2] x86: Implement pfn -> nid mapping granularity check Hans Rosenfeld
2011-07-13  5:34       ` [tip:x86/numa] x86, numa: " tip-bot for Tejun Heo
2011-07-12  7:44 [PATCH x86/mm 1/2] x86: s/PAGES_PER_ELEMENT/PAGES_PER_SECTION/ Tejun Heo
2011-07-12  7:45 ` [PATCH x86/mm 2/2] x86: Implement pfn -> nid mapping granularity check Tejun Heo
2011-07-13  5:33 ` [tip:x86/numa] x86, mm: s/PAGES_PER_ELEMENT/PAGES_PER_SECTION/ tip-bot for Tejun Heo
