linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [BUG] WARNING: CPU: 3 PID: 1 at mm/debug_vm_pgtable.c:493
@ 2021-10-12 18:11 Steven Rostedt
  2021-10-12 18:34 ` Linus Torvalds
  0 siblings, 1 reply; 15+ messages in thread
From: Steven Rostedt @ 2021-10-12 18:11 UTC (permalink / raw)
  To: LKML
  Cc: Linus Torvalds, Gavin Shan, Anshuman Khandual, Christophe Leroy,
	Gerald Schaefer, Andrew Morton

[-- Attachment #1: Type: text/plain, Size: 3801 bytes --]

My tests have been triggering on this every so often. My code is based on
v5.15-rc3. Note, this is not fully reproducible.

I also get errors an mm/debug_vm_pgtable.c:438

And it also triggers the WARN_ON()s after that as well.

Always happens at the debug_vm_pgtable() during boot up.

Here's the last splat:

------------[ cut here ]------------
 WARNING: CPU: 3 PID: 1 at mm/debug_vm_pgtable.c:492 debug_vm_pgtable+0x1312/0x1693
 Modules linked in:
 CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.15.0-rc3-test+ #24
 Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
 RIP: 0010:debug_vm_pgtable+0x1312/0x1693
 Code: 38 ff ff ff 48 c7 00 00 00 00 00 48 8b 75 b8 48 8b 95 78 ff ff ff 48 8b bd 38 ff ff ff 48 c1 e6 0c e8 62 4e 62 fe 85 c0 75 02 <0f> 0b 48 8b bd 38 ff ff ff e8 e0 50 62 fe 85 c0 75 02 0f 0b 48 8b
 RSP: 0000:ffffa1eac0033da8 EFLAGS: 00010246
 RAX: 0000000000000000 RBX: bffffffffffffff7 RCX: 0000000000000000
 RDX: ffffffffffe00000 RSI: ffffffff8bb477d0 RDI: 0000000000000007
 RBP: ffffa1eac0033e98 R08: ffffa1eac0033d23 R09: 000000021ee00000
 R10: 000000021effffff R11: 0000000000000000 R12: ffff8cf1436da538
 R13: 00003e8b00000000 R14: 000fffffffe00000 R15: 0000000000000027
 FS:  0000000000000000(0000) GS:ffff8cf256800000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 000000020f612001 CR4: 00000000001706e0
 Call Trace:
  ? destroy_args+0x281/0x281
  do_one_initcall+0x68/0x2d0
  ? rcu_read_lock_sched_held+0x46/0x80
  kernel_init_freeable+0x1a5/0x1f4
  ? rest_init+0x280/0x280
  kernel_init+0x1a/0x120
  ret_from_fork+0x22/0x30
 irq event stamp: 880991
 hardirqs last  enabled at (881001): [<ffffffff8a1096ef>] __up_console_sem+0x6f/0x80
 hardirqs last disabled at (881016): [<ffffffff8a1096d4>] __up_console_sem+0x54/0x80
 softirqs last  enabled at (881014): [<ffffffff8a097033>] irq_exit_rcu+0xa3/0xd0
 softirqs last disabled at (881009): [<ffffffff8a097033>] irq_exit_rcu+0xa3/0xd0
 ---[ end trace 731534acace262b5 ]---
 ------------[ cut here ]------------
 WARNING: CPU: 3 PID: 1 at mm/debug_vm_pgtable.c:493 debug_vm_pgtable+0x1324/0x1693
 Modules linked in:
 CPU: 3 PID: 1 Comm: swapper/0 Tainted: G        W         5.15.0-rc3-test+ #24
 Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
 RIP: 0010:debug_vm_pgtable+0x1324/0x1693
 Code: 78 ff ff ff 48 8b bd 38 ff ff ff 48 c1 e6 0c e8 62 4e 62 fe 85 c0 75 02 0f 0b 48 8b bd 38 ff ff ff e8 e0 50 62 fe 85 c0 75 02 <0f> 0b 48 8b 85 38 ff ff ff 48 8b 00 48 a9 9f ff ff ff 74 02 0f 0b
 RSP: 0000:ffffa1eac0033da8 EFLAGS: 00010246
 RAX: 0000000000000000 RBX: bffffffffffffff7 RCX: 0000000000000000
 RDX: ffffffffffe00000 RSI: ffffffff8bb477d0 RDI: ffff8cf1436db160
 RBP: ffffa1eac0033e98 R08: 0000000000000000 R09: 000000021ee00000
 R10: 000000021effffff R11: 0000000000000000 R12: ffff8cf1436da538
 R13: 00003e8b00000000 R14: 000fffffffe00000 R15: 0000000000000027
 FS:  0000000000000000(0000) GS:ffff8cf256800000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 000000020f612001 CR4: 00000000001706e0
 Call Trace:
  ? destroy_args+0x281/0x281
  do_one_initcall+0x68/0x2d0
  ? rcu_read_lock_sched_held+0x46/0x80
  kernel_init_freeable+0x1a5/0x1f4
  ? rest_init+0x280/0x280
  kernel_init+0x1a/0x120
  ret_from_fork+0x22/0x30
 irq event stamp: 881327
 hardirqs last  enabled at (881337): [<ffffffff8a1096ef>] __up_console_sem+0x6f/0x80
 hardirqs last disabled at (881352): [<ffffffff8a1096d4>] __up_console_sem+0x54/0x80
 softirqs last  enabled at (881350): [<ffffffff8a097033>] irq_exit_rcu+0xa3/0xd0
 softirqs last disabled at (881345): [<ffffffff8a097033>] irq_exit_rcu+0xa3/0xd0
 ---[ end trace 731534acace262b6 ]---

Config attached (this is x86_64).

-- Steve

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 30048 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] WARNING: CPU: 3 PID: 1 at mm/debug_vm_pgtable.c:493
  2021-10-12 18:11 [BUG] WARNING: CPU: 3 PID: 1 at mm/debug_vm_pgtable.c:493 Steven Rostedt
@ 2021-10-12 18:34 ` Linus Torvalds
  2021-10-12 18:55   ` Steven Rostedt
  2021-10-13  4:10   ` Anshuman Khandual
  0 siblings, 2 replies; 15+ messages in thread
From: Linus Torvalds @ 2021-10-12 18:34 UTC (permalink / raw)
  To: Steven Rostedt, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
	Thomas Gleixner, Borislav Petkov
  Cc: LKML, Gavin Shan, Anshuman Khandual, Christophe Leroy,
	Gerald Schaefer, Andrew Morton, the arch/x86 maintainers

Adding x86 people (VM in particular). The tests were written primarily
for ARM, but should work on x86 too.

That warning seems to be due to pud_set_huge() failing (the "line 438"
one I don't know what it is - there's no warning on that line in my
tree, so I worry a bit that Steven has other things in his tree).

On x86, that can fail due to mtrr differences, or if there's already
an entry there. Neither _should_ be an issue during the boot-time
tests, but who knows..

Or there's something wrong with the tests. The fact that it's not
repeatable despite being a boot-time thing is a bit odd.

             Linus

On Tue, Oct 12, 2021 at 11:11 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> My tests have been triggering on this every so often. My code is based on
> v5.15-rc3. Note, this is not fully reproducible.
>
> I also get errors an mm/debug_vm_pgtable.c:438
>
> And it also triggers the WARN_ON()s after that as well.
>
> Always happens at the debug_vm_pgtable() during boot up.
>
> Here's the last splat:
>
> ------------[ cut here ]------------
>  WARNING: CPU: 3 PID: 1 at mm/debug_vm_pgtable.c:492 debug_vm_pgtable+0x1312/0x1693
>  Modules linked in:
>  CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.15.0-rc3-test+ #24
>  Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
>  RIP: 0010:debug_vm_pgtable+0x1312/0x1693
>  Code: 38 ff ff ff 48 c7 00 00 00 00 00 48 8b 75 b8 48 8b 95 78 ff ff ff 48 8b bd 38 ff ff ff 48 c1 e6 0c e8 62 4e 62 fe 85 c0 75 02 <0f> 0b 48 8b bd 38 ff ff ff e8 e0 50 62 fe 85 c0 75 02 0f 0b 48 8b
>  RSP: 0000:ffffa1eac0033da8 EFLAGS: 00010246
>  RAX: 0000000000000000 RBX: bffffffffffffff7 RCX: 0000000000000000
>  RDX: ffffffffffe00000 RSI: ffffffff8bb477d0 RDI: 0000000000000007
>  RBP: ffffa1eac0033e98 R08: ffffa1eac0033d23 R09: 000000021ee00000
>  R10: 000000021effffff R11: 0000000000000000 R12: ffff8cf1436da538
>  R13: 00003e8b00000000 R14: 000fffffffe00000 R15: 0000000000000027
>  FS:  0000000000000000(0000) GS:ffff8cf256800000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 0000000000000000 CR3: 000000020f612001 CR4: 00000000001706e0
>  Call Trace:
>   ? destroy_args+0x281/0x281
>   do_one_initcall+0x68/0x2d0
>   ? rcu_read_lock_sched_held+0x46/0x80
>   kernel_init_freeable+0x1a5/0x1f4
>   ? rest_init+0x280/0x280
>   kernel_init+0x1a/0x120
>   ret_from_fork+0x22/0x30
>  irq event stamp: 880991
>  hardirqs last  enabled at (881001): [<ffffffff8a1096ef>] __up_console_sem+0x6f/0x80
>  hardirqs last disabled at (881016): [<ffffffff8a1096d4>] __up_console_sem+0x54/0x80
>  softirqs last  enabled at (881014): [<ffffffff8a097033>] irq_exit_rcu+0xa3/0xd0
>  softirqs last disabled at (881009): [<ffffffff8a097033>] irq_exit_rcu+0xa3/0xd0
>  ---[ end trace 731534acace262b5 ]---
>  ------------[ cut here ]------------
>  WARNING: CPU: 3 PID: 1 at mm/debug_vm_pgtable.c:493 debug_vm_pgtable+0x1324/0x1693
>  Modules linked in:
>  CPU: 3 PID: 1 Comm: swapper/0 Tainted: G        W         5.15.0-rc3-test+ #24
>  Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
>  RIP: 0010:debug_vm_pgtable+0x1324/0x1693
>  Code: 78 ff ff ff 48 8b bd 38 ff ff ff 48 c1 e6 0c e8 62 4e 62 fe 85 c0 75 02 0f 0b 48 8b bd 38 ff ff ff e8 e0 50 62 fe 85 c0 75 02 <0f> 0b 48 8b 85 38 ff ff ff 48 8b 00 48 a9 9f ff ff ff 74 02 0f 0b
>  RSP: 0000:ffffa1eac0033da8 EFLAGS: 00010246
>  RAX: 0000000000000000 RBX: bffffffffffffff7 RCX: 0000000000000000
>  RDX: ffffffffffe00000 RSI: ffffffff8bb477d0 RDI: ffff8cf1436db160
>  RBP: ffffa1eac0033e98 R08: 0000000000000000 R09: 000000021ee00000
>  R10: 000000021effffff R11: 0000000000000000 R12: ffff8cf1436da538
>  R13: 00003e8b00000000 R14: 000fffffffe00000 R15: 0000000000000027
>  FS:  0000000000000000(0000) GS:ffff8cf256800000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 0000000000000000 CR3: 000000020f612001 CR4: 00000000001706e0
>  Call Trace:
>   ? destroy_args+0x281/0x281
>   do_one_initcall+0x68/0x2d0
>   ? rcu_read_lock_sched_held+0x46/0x80
>   kernel_init_freeable+0x1a5/0x1f4
>   ? rest_init+0x280/0x280
>   kernel_init+0x1a/0x120
>   ret_from_fork+0x22/0x30
>  irq event stamp: 881327
>  hardirqs last  enabled at (881337): [<ffffffff8a1096ef>] __up_console_sem+0x6f/0x80
>  hardirqs last disabled at (881352): [<ffffffff8a1096d4>] __up_console_sem+0x54/0x80
>  softirqs last  enabled at (881350): [<ffffffff8a097033>] irq_exit_rcu+0xa3/0xd0
>  softirqs last disabled at (881345): [<ffffffff8a097033>] irq_exit_rcu+0xa3/0xd0
>  ---[ end trace 731534acace262b6 ]---
>
> Config attached (this is x86_64).
>
> -- Steve

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] WARNING: CPU: 3 PID: 1 at mm/debug_vm_pgtable.c:493
  2021-10-12 18:34 ` Linus Torvalds
@ 2021-10-12 18:55   ` Steven Rostedt
  2021-10-12 19:15     ` Linus Torvalds
  2021-10-13  4:10   ` Anshuman Khandual
  1 sibling, 1 reply; 15+ messages in thread
From: Steven Rostedt @ 2021-10-12 18:55 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Borislav Petkov, LKML, Gavin Shan, Anshuman Khandual,
	Christophe Leroy, Gerald Schaefer, Andrew Morton,
	the arch/x86 maintainers

On Tue, 12 Oct 2021 11:34:19 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> That warning seems to be due to pud_set_huge() failing (the "line 438"
> one I don't know what it is - there's no warning on that line in my
> tree, so I worry a bit that Steven has other things in his tree).

I saved off that warning from a different branch, and looking at the dmesg,
it is:

  Linux version 5.14.0-rc2-test+

Which shows this happened on 5.14 as well (and explains why the line
numbers are different).

Here's the full splat:

 ------------[ cut here ]------------
 WARNING: CPU: 2 PID: 1 at mm/debug_vm_pgtable.c:438 pud_huge_tests+0x2f/0x50
 Modules linked in:
 CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc2-test+ #16
 Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
 RIP: 0010:pud_huge_tests+0x2f/0x50
 Code: 48 8b 05 c4 93 da ff 48 0f ba e0 3a 73 3c 55 48 c1 e6 0c 48 89 e5 53 48 89 fb 48 c7 07 00 00 00 00 e8 47 3d f5 fd 85 c0 75 02 <0f> 0b 48 89 df e8 0c 3e f5 fd 85 c0 75 02 0f 0b 48 8b 03 48 a9 9f
 RSP: 0000:ffffb728c0033dd0 EFLAGS: 00010246
 RAX: 0000000000000000 RBX: ffff8d5648f44570 RCX: 6421c8b694c43a00
 RDX: 00ffffffaf11c9ec RSI: ffffb728c0033d58 RDI: ffffb728c0033d59
 RBP: ffffb728c0033dd8 R08: 000000021f000000 R09: 0000000000040000
 R10: 0000000000000001 R11: 0000000000000000 R12: ffff8d5648f44570
 R13: ffff8d5640a808c0 R14: 000000000020713a R15: ffff8d5640a80938
 FS:  0000000000000000(0000) GS:ffff8d5756800000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 0000000206c2a001 CR4: 00000000001706e0
 Call Trace:
  debug_vm_pgtable+0x533/0x6f0
  ? pgd_clear_tests.isra.0+0x50/0x50
  do_one_initcall+0xe8/0x25c
  ? lock_is_held+0xc/0xe
  ? rcu_read_lock_sched_held+0x3b/0x72
  do_initcalls+0xcd/0xed
  kernel_init_freeable+0x18a/0x1c1
  ? rest_init+0x155/0x155
  kernel_init+0x1a/0x11a
  ret_from_fork+0x22/0x30
 irq event stamp: 28613811
 hardirqs last  enabled at (28613821): [<ffffffffaf12caf1>] console_unlock+0x323/0x36e
 hardirqs last disabled at (28613830): [<ffffffffaf12ca36>] console_unlock+0x268/0x36e
 softirqs last  enabled at (28613546): [<ffffffffb0200328>] __do_softirq+0x328/0x363
 softirqs last disabled at (28613475): [<ffffffffaf0cfea2>] __irq_exit_rcu+0x5f/0x9b
 ---[ end trace 216b0cab27d4c431 ]---
 ------------[ cut here ]------------
 WARNING: CPU: 2 PID: 1 at mm/debug_vm_pgtable.c:439 pud_huge_tests+0x3d/0x50
 Modules linked in:
 CPU: 2 PID: 1 Comm: swapper/0 Tainted: G        W         5.14.0-rc2-test+ #16
 Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
 RIP: 0010:pud_huge_tests+0x3d/0x50
 Code: 55 48 c1 e6 0c 48 89 e5 53 48 89 fb 48 c7 07 00 00 00 00 e8 47 3d f5 fd 85 c0 75 02 0f 0b 48 89 df e8 0c 3e f5 fd 85 c0 75 02 <0f> 0b 48 8b 03 48 a9 9f ff ff ff 74 02 0f 0b 5b 5d c3 c3 0f 1f 44
 RSP: 0000:ffffb728c0033dd0 EFLAGS: 00010246
 RAX: 0000000000000000 RBX: ffff8d5648f44570 RCX: 6421c8b694c43a00
 RDX: 0000000000000000 RSI: ffffb728c0033d58 RDI: ffff8d5648f44570
 RBP: ffffb728c0033dd8 R08: 000000021f000000 R09: 0000000000040000
 R10: 0000000000000001 R11: 0000000000000000 R12: ffff8d5648f44570
 R13: ffff8d5640a808c0 R14: 000000000020713a R15: ffff8d5640a80938
 FS:  0000000000000000(0000) GS:ffff8d5756800000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 0000000206c2a001 CR4: 00000000001706e0
 Call Trace:
  debug_vm_pgtable+0x533/0x6f0
  ? pgd_clear_tests.isra.0+0x50/0x50
  do_one_initcall+0xe8/0x25c
  ? lock_is_held+0xc/0xe
  ? rcu_read_lock_sched_held+0x3b/0x72
  do_initcalls+0xcd/0xed
  kernel_init_freeable+0x18a/0x1c1
  ? rest_init+0x155/0x155
  kernel_init+0x1a/0x11a
  ret_from_fork+0x22/0x30
 irq event stamp: 28614155
 hardirqs last  enabled at (28614165): [<ffffffffaf12caf1>] console_unlock+0x323/0x36e
 hardirqs last disabled at (28614174): [<ffffffffaf12ca36>] console_unlock+0x268/0x36e
 softirqs last  enabled at (28613546): [<ffffffffb0200328>] __do_softirq+0x328/0x363
 softirqs last disabled at (28613475): [<ffffffffaf0cfea2>] __irq_exit_rcu+0x5f/0x9b
 ---[ end trace 216b0cab27d4c432 ]---


May be just the same bug at a different line number for 5.14-rc2?

-- Steve

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] WARNING: CPU: 3 PID: 1 at mm/debug_vm_pgtable.c:493
  2021-10-12 18:55   ` Steven Rostedt
@ 2021-10-12 19:15     ` Linus Torvalds
  2021-10-12 19:19       ` Steven Rostedt
                         ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Linus Torvalds @ 2021-10-12 19:15 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Borislav Petkov, LKML, Gavin Shan, Anshuman Khandual,
	Christophe Leroy, Gerald Schaefer, Andrew Morton,
	the arch/x86 maintainers

[-- Attachment #1: Type: text/plain, Size: 746 bytes --]

On Tue, Oct 12, 2021 at 11:55 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
>
> I saved off that warning from a different branch, and looking at the dmesg,
> it is:
>
>   Linux version 5.14.0-rc2-test+
>
> Which shows this happened on 5.14 as well (and explains why the line
> numbers are different).

Ok, yes, that's the exact same "pud_set_huge()" failure.

So it's the same issue, and has been going on for a while, and
presumably very few people end up enabling DEBUG_VM_PGTABLE (together
with it being hard to trigger).

Or it's something specific to your setup, but that sounds unlikely.

You might add some debugging to the x86 pud_set_huge() function,
something like the attached (obviously entirely untested) patch.

           Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 904 bytes --]

 arch/x86/mm/pgtable.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 3481b35cb4ec..0a3d67ec73d4 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -706,12 +706,16 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
 
 	mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE, &uniform);
 	if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) &&
-	    (mtrr != MTRR_TYPE_WRBACK))
+	    (mtrr != MTRR_TYPE_WRBACK)) {
+		pr_debug("mtrr_type_lookup() returned %d (%d)\n", mtrr, uniform);
 		return 0;
+	}
 
 	/* Bail out if we are we on a populated non-leaf entry: */
-	if (pud_present(*pud) && !pud_huge(*pud))
+	if (pud_present(*pud) && !pud_huge(*pud)) {
+		pr_debug("pud is already present (%lx)\n", (unsigned long)pud_val(*pud));
 		return 0;
+	}
 
 	set_pte((pte_t *)pud, pfn_pte(
 		(u64)addr >> PAGE_SHIFT,

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [BUG] WARNING: CPU: 3 PID: 1 at mm/debug_vm_pgtable.c:493
  2021-10-12 19:15     ` Linus Torvalds
@ 2021-10-12 19:19       ` Steven Rostedt
  2021-10-13 21:37       ` Steven Rostedt
  2021-10-22 12:38       ` Steven Rostedt
  2 siblings, 0 replies; 15+ messages in thread
From: Steven Rostedt @ 2021-10-12 19:19 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Borislav Petkov, LKML, Gavin Shan, Anshuman Khandual,
	Christophe Leroy, Gerald Schaefer, Andrew Morton,
	the arch/x86 maintainers

On Tue, 12 Oct 2021 12:15:40 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> Or it's something specific to your setup, but that sounds unlikely.
> 
> You might add some debugging to the x86 pud_set_huge() function,
> something like the attached (obviously entirely untested) patch.

I'll add this to the patches that get applied before my tests run (which
mostly just disable things unrelated to my code, like i915 warnings :-p)

But my box is continuing its test suite, so I'll add it when it finishes,
and then run more tests to try to trigger it again.

-- Steve

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] WARNING: CPU: 3 PID: 1 at mm/debug_vm_pgtable.c:493
  2021-10-12 18:34 ` Linus Torvalds
  2021-10-12 18:55   ` Steven Rostedt
@ 2021-10-13  4:10   ` Anshuman Khandual
  2021-10-13 14:27     ` Steven Rostedt
  1 sibling, 1 reply; 15+ messages in thread
From: Anshuman Khandual @ 2021-10-13  4:10 UTC (permalink / raw)
  To: Linus Torvalds, Steven Rostedt, Dave Hansen, Andy Lutomirski,
	Peter Zijlstra, Thomas Gleixner, Borislav Petkov
  Cc: LKML, Gavin Shan, Christophe Leroy, Gerald Schaefer,
	Andrew Morton, the arch/x86 maintainers



On 10/13/21 12:04 AM, Linus Torvalds wrote:
> Adding x86 people (VM in particular). The tests were written primarily
> for ARM, but should work on x86 too.

Did a quick test on arm64 with 5.15-rc5 without any warning. There was a
recent rework from Gavin which does not seem to have changed anything on
this test in particular.

4cbde03bdb0b ("mm/debug_vm_pgtable: use struct pgtable_debug_args in PUD modifying tests")

===============
-static void __init pud_huge_tests(pud_t *pudp, unsigned long pfn, pgprot_t prot)
+static void __init pud_huge_tests(struct pgtable_debug_args *args)
 {
        pud_t pud;
 
-       if (!arch_vmap_pud_supported(prot))
+       if (!arch_vmap_pud_supported(args->page_prot))
                return;
 
        pr_debug("Validating PUD huge\n");
@@ -460,15 +451,15 @@ static void __init pud_huge_tests(pud_t *pudp, unsigned long pfn, pgprot_t prot)
         * X86 defined pud_set_huge() verifies that the given
         * PUD is not a populated non-leaf entry.
         */
-       WRITE_ONCE(*pudp, __pud(0));
-       WARN_ON(!pud_set_huge(pudp, __pfn_to_phys(pfn), prot));
-       WARN_ON(!pud_clear_huge(pudp));
-       pud = READ_ONCE(*pudp);
+       WRITE_ONCE(*args->pudp, __pud(0));
+       WARN_ON(!pud_set_huge(args->pudp, __pfn_to_phys(args->fixed_pud_pfn), args->page_prot));
+       WARN_ON(!pud_clear_huge(args->pudp));
+       pud = READ_ONCE(*args->pudp);
        WARN_ON(!pud_none(pud));
 }
===============

I have run this test on x86 platform many times before (although not recently)
without any problem and never had this bug report before. Wondering if there
are recent changes on x86 platform, causing these warning.

Gavin,

Although unlikely, was there something on the series which might have
changed pud_huge_tests() ?

> 
> That warning seems to be due to pud_set_huge() failing (the "line 438"
> one I don't know what it is - there's no warning on that line in my
> tree, so I worry a bit that Steven has other things in his tree).
> 
> On x86, that can fail due to mtrr differences, or if there's already
> an entry there. Neither _should_ be an issue during the boot-time
> tests, but who knows..
> 
> Or there's something wrong with the tests. The fact that it's not
> repeatable despite being a boot-time thing is a bit odd.
> 
>              Linus
> 
> On Tue, Oct 12, 2021 at 11:11 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>>
>> My tests have been triggering on this every so often. My code is based on
>> v5.15-rc3. Note, this is not fully reproducible.
>>
>> I also get errors an mm/debug_vm_pgtable.c:438
>>
>> And it also triggers the WARN_ON()s after that as well.
>>
>> Always happens at the debug_vm_pgtable() during boot up.
>>
>> Here's the last splat:
>>
>> ------------[ cut here ]------------
>>  WARNING: CPU: 3 PID: 1 at mm/debug_vm_pgtable.c:492 debug_vm_pgtable+0x1312/0x1693
>>  Modules linked in:
>>  CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.15.0-rc3-test+ #24
>>  Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
>>  RIP: 0010:debug_vm_pgtable+0x1312/0x1693
>>  Code: 38 ff ff ff 48 c7 00 00 00 00 00 48 8b 75 b8 48 8b 95 78 ff ff ff 48 8b bd 38 ff ff ff 48 c1 e6 0c e8 62 4e 62 fe 85 c0 75 02 <0f> 0b 48 8b bd 38 ff ff ff e8 e0 50 62 fe 85 c0 75 02 0f 0b 48 8b
>>  RSP: 0000:ffffa1eac0033da8 EFLAGS: 00010246
>>  RAX: 0000000000000000 RBX: bffffffffffffff7 RCX: 0000000000000000
>>  RDX: ffffffffffe00000 RSI: ffffffff8bb477d0 RDI: 0000000000000007
>>  RBP: ffffa1eac0033e98 R08: ffffa1eac0033d23 R09: 000000021ee00000
>>  R10: 000000021effffff R11: 0000000000000000 R12: ffff8cf1436da538
>>  R13: 00003e8b00000000 R14: 000fffffffe00000 R15: 0000000000000027
>>  FS:  0000000000000000(0000) GS:ffff8cf256800000(0000) knlGS:0000000000000000
>>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>  CR2: 0000000000000000 CR3: 000000020f612001 CR4: 00000000001706e0
>>  Call Trace:
>>   ? destroy_args+0x281/0x281
>>   do_one_initcall+0x68/0x2d0
>>   ? rcu_read_lock_sched_held+0x46/0x80
>>   kernel_init_freeable+0x1a5/0x1f4
>>   ? rest_init+0x280/0x280
>>   kernel_init+0x1a/0x120
>>   ret_from_fork+0x22/0x30
>>  irq event stamp: 880991
>>  hardirqs last  enabled at (881001): [<ffffffff8a1096ef>] __up_console_sem+0x6f/0x80
>>  hardirqs last disabled at (881016): [<ffffffff8a1096d4>] __up_console_sem+0x54/0x80
>>  softirqs last  enabled at (881014): [<ffffffff8a097033>] irq_exit_rcu+0xa3/0xd0
>>  softirqs last disabled at (881009): [<ffffffff8a097033>] irq_exit_rcu+0xa3/0xd0
>>  ---[ end trace 731534acace262b5 ]---
>>  ------------[ cut here ]------------
>>  WARNING: CPU: 3 PID: 1 at mm/debug_vm_pgtable.c:493 debug_vm_pgtable+0x1324/0x1693
>>  Modules linked in:
>>  CPU: 3 PID: 1 Comm: swapper/0 Tainted: G        W         5.15.0-rc3-test+ #24
>>  Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
>>  RIP: 0010:debug_vm_pgtable+0x1324/0x1693
>>  Code: 78 ff ff ff 48 8b bd 38 ff ff ff 48 c1 e6 0c e8 62 4e 62 fe 85 c0 75 02 0f 0b 48 8b bd 38 ff ff ff e8 e0 50 62 fe 85 c0 75 02 <0f> 0b 48 8b 85 38 ff ff ff 48 8b 00 48 a9 9f ff ff ff 74 02 0f 0b
>>  RSP: 0000:ffffa1eac0033da8 EFLAGS: 00010246
>>  RAX: 0000000000000000 RBX: bffffffffffffff7 RCX: 0000000000000000
>>  RDX: ffffffffffe00000 RSI: ffffffff8bb477d0 RDI: ffff8cf1436db160
>>  RBP: ffffa1eac0033e98 R08: 0000000000000000 R09: 000000021ee00000
>>  R10: 000000021effffff R11: 0000000000000000 R12: ffff8cf1436da538
>>  R13: 00003e8b00000000 R14: 000fffffffe00000 R15: 0000000000000027
>>  FS:  0000000000000000(0000) GS:ffff8cf256800000(0000) knlGS:0000000000000000
>>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>  CR2: 0000000000000000 CR3: 000000020f612001 CR4: 00000000001706e0
>>  Call Trace:
>>   ? destroy_args+0x281/0x281
>>   do_one_initcall+0x68/0x2d0
>>   ? rcu_read_lock_sched_held+0x46/0x80
>>   kernel_init_freeable+0x1a5/0x1f4
>>   ? rest_init+0x280/0x280
>>   kernel_init+0x1a/0x120
>>   ret_from_fork+0x22/0x30
>>  irq event stamp: 881327
>>  hardirqs last  enabled at (881337): [<ffffffff8a1096ef>] __up_console_sem+0x6f/0x80
>>  hardirqs last disabled at (881352): [<ffffffff8a1096d4>] __up_console_sem+0x54/0x80
>>  softirqs last  enabled at (881350): [<ffffffff8a097033>] irq_exit_rcu+0xa3/0xd0
>>  softirqs last disabled at (881345): [<ffffffff8a097033>] irq_exit_rcu+0xa3/0xd0
>>  ---[ end trace 731534acace262b6 ]---
>>
>> Config attached (this is x86_64).
>>
>> -- Steve

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] WARNING: CPU: 3 PID: 1 at mm/debug_vm_pgtable.c:493
  2021-10-13  4:10   ` Anshuman Khandual
@ 2021-10-13 14:27     ` Steven Rostedt
  0 siblings, 0 replies; 15+ messages in thread
From: Steven Rostedt @ 2021-10-13 14:27 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Linus Torvalds, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
	Thomas Gleixner, Borislav Petkov, LKML, Gavin Shan,
	Christophe Leroy, Gerald Schaefer, Andrew Morton,
	the arch/x86 maintainers

On Wed, 13 Oct 2021 09:40:11 +0530
Anshuman Khandual <anshuman.khandual@arm.com> wrote:

> I have run this test on x86 platform many times before (although not recently)
> without any problem and never had this bug report before. Wondering if there
> are recent changes on x86 platform, causing these warning.

As noticed in my conversation with Linus, this warning was there before
your updates, as it was in 5.14-rc2.

> 
> Gavin,
> 
> Although unlikely, was there something on the series which might have
> changed pud_huge_tests() ?

Not sure. But my test box that triggers this is open now, and I'll add
Linus's debug patch and see if I can retrigger it.

-- Steve

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] WARNING: CPU: 3 PID: 1 at mm/debug_vm_pgtable.c:493
  2021-10-12 19:15     ` Linus Torvalds
  2021-10-12 19:19       ` Steven Rostedt
@ 2021-10-13 21:37       ` Steven Rostedt
  2021-10-22 12:38       ` Steven Rostedt
  2 siblings, 0 replies; 15+ messages in thread
From: Steven Rostedt @ 2021-10-13 21:37 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Borislav Petkov, LKML, Gavin Shan, Anshuman Khandual,
	Christophe Leroy, Gerald Schaefer, Andrew Morton,
	the arch/x86 maintainers

On Tue, 12 Oct 2021 12:15:40 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> You might add some debugging to the x86 pud_set_huge() function,
> something like the attached (obviously entirely untested) patch.

Well, I've applied this patch, and have yet to reproduce it.

So, I added this to my "apply before testing" patch set, and if it triggers
again, I should have more info.

-- Steve

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] WARNING: CPU: 3 PID: 1 at mm/debug_vm_pgtable.c:493
  2021-10-12 19:15     ` Linus Torvalds
  2021-10-12 19:19       ` Steven Rostedt
  2021-10-13 21:37       ` Steven Rostedt
@ 2021-10-22 12:38       ` Steven Rostedt
  2021-10-22 19:34         ` Linus Torvalds
  2 siblings, 1 reply; 15+ messages in thread
From: Steven Rostedt @ 2021-10-22 12:38 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Borislav Petkov, LKML, Gavin Shan, Anshuman Khandual,
	Christophe Leroy, Gerald Schaefer, Andrew Morton,
	the arch/x86 maintainers

On Tue, 12 Oct 2021 12:15:40 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> --- a/arch/x86/mm/pgtable.c
> +++ b/arch/x86/mm/pgtable.c
> @@ -706,12 +706,16 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
>  
>  	mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE, &uniform);
>  	if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) &&
> -	    (mtrr != MTRR_TYPE_WRBACK))
> +	    (mtrr != MTRR_TYPE_WRBACK)) {
> +		pr_debug("mtrr_type_lookup() returned %d (%d)\n", mtrr, uniform);
>  		return 0;
> +	}
>  
>  	/* Bail out if we are we on a populated non-leaf entry: */
> -	if (pud_present(*pud) && !pud_huge(*pud))
> +	if (pud_present(*pud) && !pud_huge(*pud)) {
> +		pr_debug("pud is already present (%lx)\n", (unsigned long)pud_val(*pud));
>  		return 0;
> +	}
>  

It finally triggered again. And this time with this patch applied. But I
don't see the added printks anywhere in the dmesg.

Full dmesg is here:

  https://rostedt.org/private/dmesg-debug_vm_pgtable-20211022

Unfortunately I lost the config, but can recreated it when my tests finish.
(I kicked it off again so that I can post these patches to linux-next)
But I did share a config that triggered this in the past:

  https://lore.kernel.org/all/20211012141131.3c9a2eb1@gandalf.local.home/

The tree I'm testing is:

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
  branch: ftrace/core

But this is something that has been triggering since 5.14.

Now there's a lot of debugging that is happening.

Here's the first splat:

[  178.714431] debug_vm_pgtable: [debug_vm_pgtable         ]: Validating architecture page table helpers
[  178.723726] ------------[ cut here ]------------
[  178.728389] WARNING: CPU: 2 PID: 1 at mm/debug_vm_pgtable.c:492 pud_huge_tests+0x42/0x68
[  178.736494] Modules linked in:
[  178.739565] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.15.0-rc3-test+ #79
[  178.746452] Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
[  178.754112] RIP: 0010:pud_huge_tests+0x42/0x68
[  178.758567] Code: 48 8b 47 20 48 89 fb 48 c7 00 00 00 00 00 48 8b b7 a0 00 00 00 48 8b 57 60 48 8b 7f 20 48 c1 e6 0c e8 ca 2b d5 fd 85 c0 75 02 <0f> 0b 48 8b 7b 20 e8 92 2d d5 fd 85 c0 75 02 0f 0b 48 8b 43 20 48
[  178.777323] RSP: 0000:ffffaadf40033d70 EFLAGS: 00010246
[  178.782566] RAX: 0000000000000000 RBX: ffffaadf40033d88 RCX: 6fd4a5ea5b1e4400
[  178.789706] RDX: 00ffffff8411f9a3 RSI: ffffaadf40033cf8 RDI: ffffaadf40033cf9
[  178.796846] RBP: ffffaadf40033d78 R08: 00000000dc000000 R09: 0000000000040000
[  178.803992] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[  178.811129] R13: ffff947400a48938 R14: ffff9474009afde8 R15: 0000000000000000
[  178.818273] FS:  0000000000000000(0000) GS:ffff947516800000(0000) knlGS:0000000000000000
[  178.826367] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  178.832121] CR2: 0000000000000000 CR3: 00000000cae2a002 CR4: 00000000001706e0
[  178.839261] Call Trace:
[  178.841725]  debug_vm_pgtable+0x3a2/0x50b
[  178.845765]  ? pgd_clear_tests+0x61/0x61
[  178.849700]  do_one_initcall+0xe8/0x25c
[  178.853556]  ? lock_is_held+0xc/0xe
[  178.857055]  ? rcu_read_lock_sched_held+0x3b/0x72
[  178.861775]  do_initcalls+0xcd/0xed
[  178.865282]  kernel_init_freeable+0x183/0x1ba
[  178.869653]  ? rest_init+0x155/0x155
[  178.873248]  kernel_init+0x1a/0x11a
[  178.876754]  ret_from_fork+0x22/0x30
[  178.880360] irq event stamp: 29103539
[  178.884035] hardirqs last  enabled at (29103549): [<ffffffff8412e491>] __up_console_sem+0x4b/0x4f
[  178.892910] hardirqs last disabled at (29103558): [<ffffffff8412e471>] __up_console_sem+0x2b/0x4f
[  178.901782] softirqs last  enabled at (29103506): [<ffffffff85400328>] __do_softirq+0x328/0x363
[  178.910485] softirqs last disabled at (29103501): [<ffffffff840d211e>] __irq_exit_rcu+0x60/0x9c
[  178.919187] ---[ end trace 328fd4bcdb7a033d ]---

It did trigger right after the kprobe test. Could be a hint. As it does seem
to only happen on configs with a lot of debugging enabled. But it doesn't
always happen there. I remember it happening usually around testing the
tracers.

Maybe it's a race between the initcall debug_vm_pgtable happens while some
internal tests are going on?

The tests do trigger "text_poke" which will muck with the page tables. Not
sure if that has anything to do with this.

I need to update my tests to just save off all failures and configs, so I
can go back to them. Just that my tests fail so often I'd fill up my hard
drive ;-)

-- Steve

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] WARNING: CPU: 3 PID: 1 at mm/debug_vm_pgtable.c:493
  2021-10-22 12:38       ` Steven Rostedt
@ 2021-10-22 19:34         ` Linus Torvalds
  2021-10-22 21:19           ` Steven Rostedt
  2021-11-18 16:47           ` Steven Rostedt
  0 siblings, 2 replies; 15+ messages in thread
From: Linus Torvalds @ 2021-10-22 19:34 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Borislav Petkov, LKML, Gavin Shan, Anshuman Khandual,
	Christophe Leroy, Gerald Schaefer, Andrew Morton,
	the arch/x86 maintainers

On Fri, Oct 22, 2021 at 2:38 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> It finally triggered again. And this time with this patch applied. But I
> don't see the added printks anywhere in the dmesg.

That's strange. Those printk's were added in the only places that do a
"return 0".

Ok, there's also the dummy pud_set_huge() inline function that
unconditionally returns zero, but that's only if you don't have
CONFIG_HAVE_ARCH_HUGE_VMAP enabled. And then the testing code is
disabled too.

> [  178.714431] debug_vm_pgtable: [debug_vm_pgtable         ]: Validating architecture page table helpers
> [  178.723726] ------------[ cut here ]------------
> [  178.728389] WARNING: CPU: 2 PID: 1 at mm/debug_vm_pgtable.c:492 pud_huge_tests+0x42/0x68

That's literally that

    WARN_ON(!pud_set_huge(..));

and pud_set_huge() has two 'return 0' (and one 'return 1') and that
patch added debug-printing to both of them.

Oh, it shouldn't have been a pr_debug() that gets suppressed. It
should have been a pr_warn() or something.

My bad.

                   Linus

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] WARNING: CPU: 3 PID: 1 at mm/debug_vm_pgtable.c:493
  2021-10-22 19:34         ` Linus Torvalds
@ 2021-10-22 21:19           ` Steven Rostedt
  2021-11-18 16:47           ` Steven Rostedt
  1 sibling, 0 replies; 15+ messages in thread
From: Steven Rostedt @ 2021-10-22 21:19 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Borislav Petkov, LKML, Gavin Shan, Anshuman Khandual,
	Christophe Leroy, Gerald Schaefer, Andrew Morton,
	the arch/x86 maintainers

On Fri, 22 Oct 2021 09:34:15 -1000
Linus Torvalds <torvalds@linux-foundation.org> wrote:


> Oh, it shouldn't have been a pr_debug() that gets suppressed. It
> should have been a pr_warn() or something.

Yeah, pr_debug() usually doesn't get printed.

I'll update your patch.

-- Steve


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] WARNING: CPU: 3 PID: 1 at mm/debug_vm_pgtable.c:493
  2021-10-22 19:34         ` Linus Torvalds
  2021-10-22 21:19           ` Steven Rostedt
@ 2021-11-18 16:47           ` Steven Rostedt
  2021-11-18 18:33             ` Linus Torvalds
  1 sibling, 1 reply; 15+ messages in thread
From: Steven Rostedt @ 2021-11-18 16:47 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Borislav Petkov, LKML, Gavin Shan, Anshuman Khandual,
	Christophe Leroy, Gerald Schaefer, Andrew Morton,
	the arch/x86 maintainers

On Fri, 22 Oct 2021 09:34:15 -1000
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Fri, Oct 22, 2021 at 2:38 AM Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > It finally triggered again. And this time with this patch applied. But I
> > don't see the added printks anywhere in the dmesg.  
> 
> That's strange. Those printk's were added in the only places that do a
> "return 0".
> 
> Ok, there's also the dummy pud_set_huge() inline function that
> unconditionally returns zero, but that's only if you don't have
> CONFIG_HAVE_ARCH_HUGE_VMAP enabled. And then the testing code is
> disabled too.
> 
> > [  178.714431] debug_vm_pgtable: [debug_vm_pgtable         ]: Validating architecture page table helpers
> > [  178.723726] ------------[ cut here ]------------
> > [  178.728389] WARNING: CPU: 2 PID: 1 at mm/debug_vm_pgtable.c:492 pud_huge_tests+0x42/0x68  
> 
> That's literally that
> 
>     WARN_ON(!pud_set_huge(..));
> 
> and pud_set_huge() has two 'return 0' (and one 'return 1') and that
> patch added debug-printing to both of them.
> 
> Oh, it shouldn't have been a pr_debug() that gets suppressed. It
> should have been a pr_warn() or something.

Triggered it again with the new update:

[   24.751779] IPI shorthand broadcast: enabled
[   24.761177] sched_clock: Marking stable (23431856262, 1329270511)->(28163092341, -3401965568)
[   24.770495] device: 'cpu_dma_latency': device_add
[   24.775232] PM: Adding info for No Bus:cpu_dma_latency
[   24.780929] debug_vm_pgtable: [debug_vm_pgtable         ]: Validating architecture page table helpers
[   24.799490] mtrr_type_lookup() returned 0 (0)
[   24.803892] ------------[ cut here ]------------
[   24.808517] WARNING: CPU: 0 PID: 1 at mm/debug_vm_pgtable.c:492 debug_vm_pgtable+0x1315/0x1696
[   24.817131] Modules linked in:
[   24.820193] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.16.0-rc1-test+ #17
[   24.827069] Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
[   24.834724] RIP: 0010:debug_vm_pgtable+0x1315/0x1696
[   24.839692] Code: 38 ff ff ff 48 c7 00 00 00 00 00 48 8b 75 b8 48 8b 95 78 ff ff ff 48 8b bd 38 ff ff ff 48 c1 e6 0c e8 d3 f4 61 fe 85 c0 75 02 <0f> 0b 48 8b bd 38 ff ff ff e8 61 f7 61 fe 85 c0 75 02 0f 0b 48 8b
[   24.858438] RSP: 0000:ffffb59e80033da8 EFLAGS: 00010246
[   24.863677] RAX: 0000000000000000 RBX: bffffffffffffff7 RCX: 00000000ffffefff
[   24.870815] RDX: 0000000000000000 RSI: 00000000ffffffea RDI: 0000000000000001
[   24.877947] RBP: ffffb59e80033e98 R08: ffffffff9b6d3b68 R09: 0000000000004ffb
[   24.885078] R10: 00000000fffff000 R11: 3fffffffffffffff R12: ffff9abe83684078
[   24.892211] R13: 000038b500000000 R14: 000fffffffe00000 R15: 0000000000000027
[   24.899346] FS:  0000000000000000(0000) GS:ffff9abf9dc00000(0000) knlGS:0000000000000000
[   24.907430] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   24.913175] CR2: ffff9abf97a01000 CR3: 0000000216612001 CR4: 00000000001706f0
[   24.920309] Call Trace:
[   24.922763]  <TASK>
[   24.924880]  ? destroy_args+0x281/0x281
[   24.928727]  do_one_initcall+0x68/0x310
[   24.932574]  ? rcu_read_lock_sched_held+0x46/0x80
[   24.937290]  kernel_init_freeable+0x1a5/0x1f4
[   24.941654]  ? rest_init+0x270/0x270
[   24.945236]  kernel_init+0x1a/0x120
[   24.948736]  ret_from_fork+0x22/0x30
[   24.952323]  </TASK>
[   24.954517] irq event stamp: 902445
[   24.958008] hardirqs last  enabled at (902455): [<ffffffff9a10c33f>] __up_console_sem+0x6f/0x80
[   24.966700] hardirqs last disabled at (902464): [<ffffffff9a10c324>] __up_console_sem+0x54/0x80
[   24.975395] softirqs last  enabled at (902372): [<ffffffff9ae00276>] __do_softirq+0x276/0x43d
[   24.983912] softirqs last disabled at (902367): [<ffffffff9a098dd2>] irq_exit_rcu+0xa2/0xd0
[   24.992262] ---[ end trace 385def99126fe75e ]---
[   24.996891] ------------[ cut here ]------------

Full dmesg is here:  https://rostedt.org/private/dmesg-20211118.txt
config is here: https://rostedt.org/private/config-20211118

-- Steve


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] WARNING: CPU: 3 PID: 1 at mm/debug_vm_pgtable.c:493
  2021-11-18 16:47           ` Steven Rostedt
@ 2021-11-18 18:33             ` Linus Torvalds
  2021-11-18 18:52               ` Steven Rostedt
  2021-11-22  6:31               ` Anshuman Khandual
  0 siblings, 2 replies; 15+ messages in thread
From: Linus Torvalds @ 2021-11-18 18:33 UTC (permalink / raw)
  To: Steven Rostedt, Andrew Morton, Gavin Shan, Anshuman Khandual
  Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Borislav Petkov, LKML, Christophe Leroy, Gerald Schaefer,
	the arch/x86 maintainers

On Thu, Nov 18, 2021 at 8:47 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> Triggered it again with the new update:
>
> [   24.751779] IPI shorthand broadcast: enabled
> [   24.761177] sched_clock: Marking stable (23431856262, 1329270511)->(28163092341, -3401965568)
> [   24.770495] device: 'cpu_dma_latency': device_add
> [   24.775232] PM: Adding info for No Bus:cpu_dma_latency
> [   24.780929] debug_vm_pgtable: [debug_vm_pgtable         ]: Validating architecture page table helpers
> [   24.799490] mtrr_type_lookup() returned 0 (0)

Ok, so that's MTRR_TYPE_UNCACHABLE, and "uniform" is 0.

Anyway, either the mtrr code is confused, or more likely it just does
the right thing, and  pud_set_huge() is simply expected to return 0 in
this situation, and that WARN_ON() in pud_huge_tests() is simply wrong
to trigger at all.

I didn't look at what all the code in debug_vm_pgtable() is trying to
set up to test. Honestly, it's all very opaque.

But I do notice that the pfn that the test uses ends up basically
being something random, where the "fixed" pfn is

        phys = __pa_symbol(&start_kernel);
        ...
        args->fixed_pud_pfn = __phys_to_pfn(phys & PUD_MASK);

rather than being an allocated real PUD-sized page. That can be a
problem in itself.

So I think the problem is that depending on where the kernel is
allocated, the fixed_pud_pfn ends up being in an area with MTRR
settings. In fact, I'm surprised it's not *always* in that area, since
presumabl;y you have the normal fixed MTRR issues with the 640k-1M
range.

But I didn't look - probably the MTRR code doesn't actually check the
special fixed MTRR's.

Anyway, I think that the end result is simply that the tests in
mm/debug_vm_pgtable.c are simply buggy, and the WARN_ON() is not a
sign of anything wrong in the mm, but with the tests themselves.

So the fixed_pud_pfn is dodgy, but it looks like the non-fixed
'pud_pfn' allocation may be dodgy too:

  #ifdef CONFIG_CONTIG_ALLOC
        if (order >= MAX_ORDER) {
                page = alloc_contig_pages((1 << order), GFP_KERNEL,
                                          first_online_node, NULL);

because afaik, alloc_contig_pages() does allocate a contiguous region,
but it doesn't necessarily allocate a _aligned_ contiguous region.

So I think _all_ those PUD tests are likely broken, but honestly, I
don't know the code well enough to be entirely sure, I'm just seeing
code that looks dodgy to me.

I don't think the breakage is x86-specific. Quite the reverse. I think
the x86 code just happens to randomly show it when some MTRR ends up
being used.

Maybe pfn_pud() should verify that it's actually given an aligned argument?

Gavin, Anshuman? Feel free to tell me what I missed.

Otherwise, we should disable those PUD tests (or fix them, of course).

For now, I consider this WARN_ON() to be a failure of the testing
infrastructure, not of the VM code.

                Linus

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] WARNING: CPU: 3 PID: 1 at mm/debug_vm_pgtable.c:493
  2021-11-18 18:33             ` Linus Torvalds
@ 2021-11-18 18:52               ` Steven Rostedt
  2021-11-22  6:31               ` Anshuman Khandual
  1 sibling, 0 replies; 15+ messages in thread
From: Steven Rostedt @ 2021-11-18 18:52 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Gavin Shan, Anshuman Khandual, Dave Hansen,
	Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Borislav Petkov, LKML, Christophe Leroy, Gerald Schaefer,
	the arch/x86 maintainers

On Thu, 18 Nov 2021 10:33:39 -0800
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> Gavin, Anshuman? Feel free to tell me what I missed.

Let me know if you would like me to add more debugging. The failure happens
once every few weeks or so.

> 
> Otherwise, we should disable those PUD tests (or fix them, of course).
> 
> For now, I consider this WARN_ON() to be a failure of the testing
> infrastructure, not of the VM code.

For now, I'm just going to comment out those WARN_ON()s in my tests, as it
causes me to have to restart them :-p (I can modify the testing scripts to
restart where it last failed as it is a couple of hours into the tests when
it triggers, but it is still a pain).

-- Steve

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] WARNING: CPU: 3 PID: 1 at mm/debug_vm_pgtable.c:493
  2021-11-18 18:33             ` Linus Torvalds
  2021-11-18 18:52               ` Steven Rostedt
@ 2021-11-22  6:31               ` Anshuman Khandual
  1 sibling, 0 replies; 15+ messages in thread
From: Anshuman Khandual @ 2021-11-22  6:31 UTC (permalink / raw)
  To: Linus Torvalds, Steven Rostedt, Andrew Morton, Gavin Shan
  Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Borislav Petkov, LKML, Christophe Leroy, Gerald Schaefer,
	the arch/x86 maintainers



On 11/19/21 12:03 AM, Linus Torvalds wrote:
> On Thu, Nov 18, 2021 at 8:47 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>> Triggered it again with the new update:
>>
>> [   24.751779] IPI shorthand broadcast: enabled
>> [   24.761177] sched_clock: Marking stable (23431856262, 1329270511)->(28163092341, -3401965568)
>> [   24.770495] device: 'cpu_dma_latency': device_add
>> [   24.775232] PM: Adding info for No Bus:cpu_dma_latency
>> [   24.780929] debug_vm_pgtable: [debug_vm_pgtable         ]: Validating architecture page table helpers
>> [   24.799490] mtrr_type_lookup() returned 0 (0)
> Ok, so that's MTRR_TYPE_UNCACHABLE, and "uniform" is 0.
> 
> Anyway, either the mtrr code is confused, or more likely it just does
> the right thing, and  pud_set_huge() is simply expected to return 0 in
> this situation, and that WARN_ON() in pud_huge_tests() is simply wrong
> to trigger at all.
> 
> I didn't look at what all the code in debug_vm_pgtable() is trying to
> set up to test. Honestly, it's all very opaque.
> 
> But I do notice that the pfn that the test uses ends up basically
> being something random, where the "fixed" pfn is
> 
>         phys = __pa_symbol(&start_kernel);
>         ...
>         args->fixed_pud_pfn = __phys_to_pfn(phys & PUD_MASK);
> 
> rather than being an allocated real PUD-sized page. That can be a
> problem in itself.
> 
> So I think the problem is that depending on where the kernel is
> allocated, the fixed_pud_pfn ends up being in an area with MTRR
> settings. In fact, I'm surprised it's not *always* in that area, since
> presumabl;y you have the normal fixed MTRR issues with the 640k-1M
> range.
> 
> But I didn't look - probably the MTRR code doesn't actually check the
> special fixed MTRR's.
> 
> Anyway, I think that the end result is simply that the tests in
> mm/debug_vm_pgtable.c are simply buggy, and the WARN_ON() is not a
> sign of anything wrong in the mm, but with the tests themselves.
> 
> So the fixed_pud_pfn is dodgy, but it looks like the non-fixed
> 'pud_pfn' allocation may be dodgy too:
> 
>   #ifdef CONFIG_CONTIG_ALLOC
>         if (order >= MAX_ORDER) {
>                 page = alloc_contig_pages((1 << order), GFP_KERNEL,
>                                           first_online_node, NULL);
> 
> because afaik, alloc_contig_pages() does allocate a contiguous region,
> but it doesn't necessarily allocate a _aligned_ contiguous region.
> 
> So I think _all_ those PUD tests are likely broken, but honestly, I
> don't know the code well enough to be entirely sure, I'm just seeing
> code that looks dodgy to me.
> 
> I don't think the breakage is x86-specific. Quite the reverse. I think
> the x86 code just happens to randomly show it when some MTRR ends up
> being used.
> 
> Maybe pfn_pud() should verify that it's actually given an aligned argument?
> 
> Gavin, Anshuman? Feel free to tell me what I missed.

Hi Linus,

These PUD tests have been subtle (including their problems as seen here
in this report) on certain platforms. I will definitely take a detailed
look, but probably after an week (leave, travel etc). Thank you.

- Anshuman

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2021-11-22  6:31 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-12 18:11 [BUG] WARNING: CPU: 3 PID: 1 at mm/debug_vm_pgtable.c:493 Steven Rostedt
2021-10-12 18:34 ` Linus Torvalds
2021-10-12 18:55   ` Steven Rostedt
2021-10-12 19:15     ` Linus Torvalds
2021-10-12 19:19       ` Steven Rostedt
2021-10-13 21:37       ` Steven Rostedt
2021-10-22 12:38       ` Steven Rostedt
2021-10-22 19:34         ` Linus Torvalds
2021-10-22 21:19           ` Steven Rostedt
2021-11-18 16:47           ` Steven Rostedt
2021-11-18 18:33             ` Linus Torvalds
2021-11-18 18:52               ` Steven Rostedt
2021-11-22  6:31               ` Anshuman Khandual
2021-10-13  4:10   ` Anshuman Khandual
2021-10-13 14:27     ` Steven Rostedt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).