From: Barclay Jameson <almightybeeij@gmail.com>
To: linux-ia64@vger.kernel.org
Subject: Kernel oops
Date: Fri, 30 Aug 2013 03:13:27 +0000
Message-ID: <CAMzumdY89wpc26ca7Tzg-GMqyMr6xODiniSzg+4gRyNKEVtszA@mail.gmail.com>

I have retried compiling the 3.4 kernel, this time on Squeeze. The
kernel compiles fine and boots with up to 255 cores; however, it
fails with the following kernel oops when booting more than 255
cores (the kernel is compiled with support for 512). Here is the boot
log with the option bootmem_debug=1. I have shortened the boot log to
what I think are the important parts; if anyone needs the full 35M
boot log, I will gladly send it as an attachment.

[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Linux version 3.4.49 (beeij@debian-hpc) (gcc version
4.4.5 (Debian 4.4.5-8) ) #16 SMP Thu Aug 29 14:41:59 CDT 2013
[    0.000000] EFI v1.10 by INTEL: SALsystab=0x1802c2d990 ACPI 2.0=0x1802c2da80
[    0.000000] booting generic kernel on platform sn2
[    0.000000] console [sn_sal0] enabled
[    0.000000] ACPI: RSDP 0000001802c2da80 00024 (v02    SGI)
[    0.000000] ACPI: XSDT 0000001802c38df0 00044 (v01    SGI  XSDTSN2
00010001    ? 00000094)
[    0.000000] ACPI: APIC 0000001802c2f5a0 0152C (v01    SGI  APICSN2
00010001    ? 00000001)
[    0.000000] ACPI: SRAT 0000001802c30ae0 02DB0 (v01    SGI  SRATSN2
00010001    ? 00000001)
[    0.000000] ACPI: SLIT 0000001802c338a0 0312C (v01    SGI  SLITSN2
00010001    ? 00000001)
[    0.000000] ACPI: FACP 0000001802c369e0 000F4 (v03    SGI  FACPSN2
00030001    ? 00000001)
[    0.000000] ACPI Warning: 32/64X length mismatch in Pm1aEventBlock:
32/0 (20120320/tbfadt-548)
[    0.000000] ACPI Warning: 32/64X length mismatch in
Pm1aControlBlock: 16/0 (20120320/tbfadt-548)
[    0.000000] ACPI Warning: 32/64X length mismatch in PmTimerBlock:
32/0 (20120320/tbfadt-548)
[    0.000000] ACPI Warning: 32/64X length mismatch in Gpe0Block: 64/0
(20120320/tbfadt-548)
[    0.000000] ACPI Warning: Invalid length for Pm1aEventBlock: 0,
using default 32 (20120320/tbfadt-629)
[    0.000000] ACPI Warning: Invalid length for Pm1aControlBlock: 0,
using default 16 (20120320/tbfadt-629)
[    0.000000] ACPI Warning: Invalid length for PmTimerBlock: 0, using
default 32 (20120320/tbfadt-629)
[    0.000000] ACPI: DSDT 0000001802c3af20 00024 (v02    SGI  DSDTSN2
00020001    ? 00002483)
[    0.000000] ACPI: FACS 0000001802c2e1e0 00040
[    0.000000] ACPI: Local APIC address c0000000fee00000
[    0.000000] 448 CPUs available, 448 CPUs total
[    0.000000] Number of logical nodes in system = 112
[    0.000000] Number of memory chunks in system = 112
[    0.000000] SMP: Allowing 448 CPUs, 0 hotplug CPUs

[=========SNIP============]
[    0.000000] On node 63 totalpages: 504832
[    0.000000] free_area_init_node: node 63, pgdat e0000fd8040c1f80,
node_mem_map a0007ff57d62a000
[    0.000000]   DMA zone: 2650 pages used for memmap
[    0.000000]   DMA zone: 0 pages reserved
[    0.000000]   DMA zone: 502182 pages, LIFO batch:7
[    0.000000] bootmem::alloc_bootmem_core nid=63 size=18 [1 pages]
align=80 goal=4000000000000 limit=0
[    0.000000] bootmem::__reserve nid=63 start=3f601318 end=3f601319 flags=1
[    0.000000] bootmem::alloc_bootmem_core nid=63 size=18000 [6 pages]
align=80 goal=4000000000000 limit=0
[    0.000000] bootmem::__reserve nid=63 start=3f601319 end=3f60131f flags=1
[    0.000000] Could not find start_pfn for node 64
[    0.000000] On node 64 totalpages: 0
[    0.000000] free_area_init_node: node 64, pgdat e000101804102000,
node_mem_map a0007ff5b562a000
[    0.000000] Could not find start_pfn for node 65
[    0.000000] On node 65 totalpages: 0
[    0.000000] free_area_init_node: node 65, pgdat e000105804142080,
node_mem_map a0007ff5ed62a000
[    0.000000] Could not find start_pfn for node 66
[    0.000000] On node 66 totalpages: 0
[=========SNIP============]

[    0.000000] BUG: Bad page state in process swapper  pfn:40601318
[    0.000000] page:a0007ff5b5642d40 count:0 mapcount:1 mapping:
   (null) index:0x0
[    0.000000] page flags: 0x0()
[    0.000000] Modules linked in:
[    0.000000] Unable to handle kernel NULL pointer dereference
(address 0000000000000018)
[    0.000000] swapper[0]: Oops 11003706212352 [1]
[    0.000000] Modules linked in:
[    0.000000]
[    0.000000] Pid: 0, CPU 0, comm:              swapper
[    0.000000] psr : 00001210084a2018 ifs : 800000000000cc18 ip  :
[<a0000001003ea1b1>]    Not tainted (3.4.49)
[    0.000000] ip is at __copy_user+0x891/0x950
[    0.000000] unat: 0000000000000000 pfs : 0000000000000792 rsc :
0000000000000003
[    0.000000] rnat: 0000000000000000 bsps: 0000000000000000 pr  :
0bad0bad0baa55a9
[    0.000000] ldrs: 0000000000000000 ccv : 0000000000000000 fpsr:
0009804c8a70433f
[    0.000000] csd : 0000000000000000 ssd : 0000000000000000
[    0.000000] b0  : a000000100043430 b6  : a000000100043660 b7  :
a00000010000c3b0
[    0.000000] f6  : 000000000000000000000 f7  : 1003e9e3779b97f4a7c16
[    0.000000] f8  : 1003e0a00000010001577 f9  : 10006c7fffffffd73ea5c
[    0.000000] f10 : 1003e0000000000000000 f11 : 1003e0044b82fa09b5a53
[    0.000000] r1  : a000000100dfa9e0 r2  : a000000100ac75f0 r3  :
a000000100ac75f8
[    0.000000] r8  : 0000000000000298 r9  : 0000000000000013 r10 :
0000000000000000
[    0.000000] r11 : 0bad0bad0baa11e9 r12 : a000000100ac7550 r13 :
a000000100ac0000
[    0.000000] r14 : a000000100e44080 r15 : a000000100e44030 r16 :
0000000000000298
[    0.000000] r17 : 0000000000000010 r18 : 0000000000000018 r19 :
a000000100ac7850
[    0.000000] r20 : 0000000000000290 r21 : a000000100ac75b4 r22 :
a000000100c12f20
[    0.000000] r23 : a000000100ac75b0 r24 : 0000000000000000 r25 :
a000000100e44030
[    0.000000] r26 : a0000001007ea718 r27 : 0000000000018869 r28 :
a000000100ac4000
[    0.000000] r29 : 0000000000000014 r30 : 0000000000000000 r31 :
0000000000000792

It looks like starting at node 64, which would be cores 256 and up,
the kernel cannot find start_pfn and reports 0 total pages. The
instruction pointer is at __copy_user+0x891/0x950. In the meantime, I
have compiled the 2.6.35 kernel with support for 1024 CPUs, which
works as a holdover. Does anyone have any ideas as to why it is
failing at this point?
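
For anyone who wants to check the node-to-core arithmetic against the
full log, here is a rough sketch. It assumes 4 CPUs per node (448 CPUs
over 112 logical nodes, inferred from the log above, not printed by the
kernel) and that CPUs are numbered consecutively node by node. The
helper names are made up for illustration.

```python
import re

# Assumption: CPUs are spread evenly, 448 CPUs / 112 nodes = 4 per node.
CPUS_PER_NODE = 448 // 112

# Matches boot log lines such as "On node 64 totalpages: 0".
NODE_PAGES = re.compile(r"On node (\d+) totalpages: (\d+)")


def empty_nodes(log_text):
    """Return the ids of nodes whose totalpages line reports 0 pages."""
    return [int(node) for node, pages in NODE_PAGES.findall(log_text)
            if int(pages) == 0]


def cpus_of_node(node, cpus_per_node=CPUS_PER_NODE):
    """CPU index range for a node, assuming node-by-node CPU numbering."""
    first = node * cpus_per_node
    return range(first, first + cpus_per_node)


# A few lines copied from the boot log above.
sample = """\
[    0.000000] On node 63 totalpages: 504832
[    0.000000] Could not find start_pfn for node 64
[    0.000000] On node 64 totalpages: 0
[    0.000000] On node 65 totalpages: 0
"""

for node in empty_nodes(sample):
    cpus = cpus_of_node(node)
    print(f"node {node}: no pages, CPUs {cpus.start}-{cpus.stop - 1}")
```

Under those assumptions node 64 maps to CPUs 256-259, which lines up
with the boot failing once more than 255 cores are enabled.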

Thanks,

Beeij

