From: Qian Cai <cai@lca.pw>
To: Rong Chen <rong.a.chen@intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Matthew Wilcox <willy@infradead.org>,
	Michal Hocko <mhocko@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	lkp@lists.01.org
Subject: Re: [mm] c566586818: BUG:kernel_hang_in_early-boot_stage,last_printk:Probing_EDD(edd=off_to_disable)...ok
Date: Tue, 25 Aug 2020 23:02:40 -0400
Message-ID: <9D9FBD8D-DF19-4DA9-B0B1-260BA72D3712@lca.pw>
In-Reply-To: <34a960a0-ec0b-3c26-ec73-e415a8197757@intel.com>



> On Aug 25, 2020, at 8:44 PM, Rong Chen <rong.a.chen@intel.com> wrote:
> 
> I rebuilt the kernel on commit c566586818, but the error changed to "RIP: 0010:clear_page_orig+0x12/0x40",
> and the error can also be reproduced on the parent commit:

Catalin, any thoughts? It sounds like those early kmemleak allocations may be causing some sort of memory corruption.

> 
> [    0.539811] Memory: 12325340K/12680692K available (10243K kernel code, 2414K rwdata, 8188K rodata, 856K init, 14628K bss, 355352K reserved, 0K cma-reserved)
> [    4.133400] BUG: unable to handle page fault for address: ffff88833653e000
> [    4.134130] #PF: supervisor write access in kernel mode
> [    4.134694] #PF: error_code(0x0002) - not-present page
> [    4.135177] PGD 3800067 P4D 3800067 PUD f000e6f2f000d445 PMD 0
> [    4.135730] Thread overran stack, or stack corrupted
> [    4.136192] Oops: 0002 [#1] DEBUG_PAGEALLOC PTI
> [    4.136609] CPU: 0 PID: 0 Comm: swapper Not tainted 5.3.0-11792-gc5665868183fe #1
> [    4.137300] RIP: 0010:clear_page_orig+0x12/0x40
> [    4.137732] Code: 03 00 00 00 b0 01 5b c3 b9 00 02 00 00 31 c0 f3 48 ab c3 0f 1f 44 00 00 31 c0 b9 40 00 00 00 66 0f 1f 84 00 00 00 00 00 ff c9 <48> 89 07 48 89 47 08 48 89 47 10 48 89 47 18 48 89 47 20 48 89 47
> [    4.139453] RSP: 0000:ffffffff8239d8e8 EFLAGS: 00010016
> [    4.139939] RAX: 0000000000000000 RBX: 0000000000000101 RCX: 000000000000003f
> [    4.140602] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff88833653e000
> [    4.141261] RBP: ffffea000cd94f80 R08: ffffffff82427800 R09: ffffea000cd94f80
> [    4.141956] R10: 0000160000000000 R11: ffff888000000000 R12: 0000000000000000
> [    4.142642] R13: 0000000000000001 R14: 0000000000092000 R15: 0000000000000046
> [    4.143298] FS:  0000000000000000(0000) GS:ffffffff8243d000(0000) knlGS:0000000000000000
> [    4.144076] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    4.144661] CR2: ffff88833653e000 CR3: 0000000002420000 CR4: 00000000000006b0
> [    4.145382] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [    4.146121] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [    4.146829] Call Trace:
> [    4.147066] Modules linked in:
> [    4.147359] CR2: ffff88833653e000
> [    4.147757] random: get_random_bytes called from init_oops_id+0x1d/0x2c with crng_init=0
> 
> $ ./scripts/faddr2line vmlinux clear_page_orig+0x12/0x40
> clear_page_orig+0x12/0x40:
> clear_page_orig at arch/x86/lib/clear_page_64.S:31
> 
> 
> But I can also reproduce the lookup_address_in_pgd error in v5.9-rc2 with the attached config file:
> 
> [    0.382789] Memory: 12313044K/12680692K available (10242K kernel code, 2658K rwdata, 8916K rodata, 800K init, 24540K bss, 367392K reserved, 0K cma-reserved)
> [    4.027977] general protection fault, probably for non-canonical address 0xf0006f7280000d98: 0000 [#1] DEBUG_PAGEALLOC PTI
> [    4.029094] CPU: 0 PID: 0 Comm: swapper Not tainted 5.9.0-rc2 #1
> [    4.029741] RIP: 0010:lookup_address_in_pgd+0x7c/0xcc
> [    4.030341] Code: 00 00 48 3d 81 00 00 00 74 6c 4c 89 df e8 9d f2 ff ff 48 f7 d0 4c 21 d8 a8 01 74 5a 4c 89 d6 4c 89 df e8 fd f5 ff ff 49 89 c0 <48> f7 00 9f ff ff ff 74 93 41 c7 01 02 00 00 00 48 8b 08 48 89 cf
> [    4.032205] RSP: 0000:ffffffff82453a08 EFLAGS: 00010082
> [    4.032716] RAX: f0006f7280000d98 RBX: 0000000000000001 RCX: f000e6f280000000
> [    4.033569] RDX: ffff888000000000 RSI: ffff888000000d98 RDI: f000e6f2f000d400
> [    4.034474] RBP: ffffffff82453b28 R08: f0006f7280000d98 R09: ffffffff82453a48
> [    4.035125] R10: ffff88833664c000 R11: f000e6f2f000d445 R12: ffff88833664c000
> [    4.035836] R13: 0000000000000001 R14: ffff888000000000 R15: ffffffff827806b8
> [    4.036575] FS:  0000000000000000(0000) GS:ffffffff82641000(0000) knlGS:0000000000000000
> [    4.037389] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    4.037961] CR2: ffff8883447ff000 CR3: 0000000002622000 CR4: 00000000000006b0
> [    4.038677] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [    4.039388] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [    4.040243] Call Trace:
> [    4.040552] Modules linked in:
> [    4.041033] random: get_random_bytes called from init_oops_id+0x1d/0x2c with crng_init=0
> 
> $ ./scripts/faddr2line vmlinux lookup_address_in_pgd+0x7c/0xcc
> lookup_address_in_pgd+0x7c/0xcc:
> lookup_address_in_pgd at arch/x86/mm/pat/set_memory.c:604
> (inlined by) lookup_address_in_pgd at arch/x86/mm/pat/set_memory.c:575
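
As a quick sanity check on the "non-canonical address" wording in the
second oops, here is a rough sketch of the canonical-address test. It
assumes 48-bit virtual addresses (4-level paging), which is my assumption
about this setup, and is_canonical is a made-up helper name.

```python
def is_canonical(addr, va_bits=48):
    """Return True if bits 63..(va_bits - 1) of addr are all equal.

    Hypothetical helper; va_bits=48 assumes 4-level paging.
    """
    top = addr >> (va_bits - 1)               # bits 63..47 when va_bits=48
    all_ones = (1 << (64 - va_bits + 1)) - 1  # 17 one-bits when va_bits=48
    return top == 0 or top == all_ones


# Addresses copied from the two oopses quoted above.
for addr in (0xf0006f7280000d98,   # RAX / reported GP fault address
             0xffff88833653e000):  # CR2 from the clear_page_orig fault
    print(hex(addr), is_canonical(addr))
```

Under that assumption the GP fault address (also the value in RAX) is not
canonical, while the CR2 value from the first oops is.
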
