linux-mm.kvack.org archive mirror
* general protection fault in madvise_cold_or_pageout_pte_range
@ 2020-09-14  9:29 syzbot
  2020-09-14 20:38 ` Minchan Kim
  2020-09-15 16:33 ` Minchan Kim
  0 siblings, 2 replies; 4+ messages in thread
From: syzbot @ 2020-09-14  9:29 UTC (permalink / raw)
  To: akpm, andreyknvl, hannes, khalid.aziz, linux-kernel, linux-mm,
	mhocko, minchan, rppt, syzkaller-bugs, torvalds

Hello,

syzbot found the following issue on:

HEAD commit:    729e3d09 Merge tag 'ceph-for-5.9-rc5' of git://github.com/..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1482b99e900000
kernel config:  https://syzkaller.appspot.com/x/.config?x=8f5c353182ed6199
dashboard link: https://syzkaller.appspot.com/bug?extid=ecf80462cb7d5d552bc7
compiler:       clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16e2a255900000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=164afdb3900000

The issue was bisected to:

commit 1a4e58cce84ee88129d5d49c064bd2852b481357
Author: Minchan Kim <minchan@kernel.org>
Date:   Wed Sep 25 23:49:15 2019 +0000

    mm: introduce MADV_PAGEOUT

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=127f973e900000
final oops:     https://syzkaller.appspot.com/x/report.txt?x=117f973e900000
console output: https://syzkaller.appspot.com/x/log.txt?x=167f973e900000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+ecf80462cb7d5d552bc7@syzkaller.appspotmail.com
Fixes: 1a4e58cce84e ("mm: introduce MADV_PAGEOUT")

general protection fault, probably for non-canonical address 0xdffffc0000000003: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f]
CPU: 1 PID: 6826 Comm: syz-executor142 Not tainted 5.9.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__lock_acquire+0x84/0x2ae0 kernel/locking/lockdep.c:4296
Code: ff df 8a 04 30 84 c0 0f 85 e3 16 00 00 83 3d 56 58 35 08 00 0f 84 0e 17 00 00 83 3d 25 c7 f5 07 00 74 2c 4c 89 e8 48 c1 e8 03 <80> 3c 30 00 74 12 4c 89 ef e8 3e d1 5a 00 48 be 00 00 00 00 00 fc
RSP: 0018:ffffc90004b9f850 EFLAGS: 00010006
RAX: 0000000000000003 RBX: 0000000000000001 RCX: 0000000000000000
RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000018
RBP: ffffc90004b9f9a8 R08: 0000000000000001 R09: 0000000000000000
R10: fffffbfff131e2e6 R11: 0000000000000000 R12: ffff8880937161c0
R13: 0000000000000018 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000002638880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000002100003f CR3: 00000000a49a2000 CR4: 00000000001506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 lock_acquire+0x140/0x6f0 kernel/locking/lockdep.c:5006
 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
 _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
 spin_lock include/linux/spinlock.h:354 [inline]
 madvise_cold_or_pageout_pte_range+0x52f/0x25c0 mm/madvise.c:389
 walk_pmd_range mm/pagewalk.c:89 [inline]
 walk_pud_range mm/pagewalk.c:160 [inline]
 walk_p4d_range mm/pagewalk.c:193 [inline]
 walk_pgd_range mm/pagewalk.c:229 [inline]
 __walk_page_range+0xe7b/0x1da0 mm/pagewalk.c:331
 walk_page_range+0x2c3/0x5c0 mm/pagewalk.c:427
 madvise_pageout_page_range mm/madvise.c:521 [inline]
 madvise_pageout mm/madvise.c:557 [inline]
 madvise_vma mm/madvise.c:946 [inline]
 do_madvise+0x12d0/0x2090 mm/madvise.c:1145
 __do_sys_madvise mm/madvise.c:1171 [inline]
 __se_sys_madvise mm/madvise.c:1169 [inline]
 __x64_sys_madvise+0x76/0x80 mm/madvise.c:1169
 do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x4440e9
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 db d7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffed62d6668 EFLAGS: 00000246 ORIG_RAX: 000000000000001c
RAX: ffffffffffffffda RBX: 00000000004002e0 RCX: 00000000004440e9
RDX: 0000000000000015 RSI: 0000000000600003 RDI: 0000000020000000
RBP: 00000000006ce018 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000004 R11: 0000000000000246 R12: 0000000000401d50
R13: 0000000000401de0 R14: 0000000000000000 R15: 0000000000000000
Modules linked in:
---[ end trace 0453ba4a30f03f10 ]---
RIP: 0010:__lock_acquire+0x84/0x2ae0 kernel/locking/lockdep.c:4296
Code: ff df 8a 04 30 84 c0 0f 85 e3 16 00 00 83 3d 56 58 35 08 00 0f 84 0e 17 00 00 83 3d 25 c7 f5 07 00 74 2c 4c 89 e8 48 c1 e8 03 <80> 3c 30 00 74 12 4c 89 ef e8 3e d1 5a 00 48 be 00 00 00 00 00 fc
RSP: 0018:ffffc90004b9f850 EFLAGS: 00010006
RAX: 0000000000000003 RBX: 0000000000000001 RCX: 0000000000000000
RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000018
RBP: ffffc90004b9f9a8 R08: 0000000000000001 R09: 0000000000000000
R10: fffffbfff131e2e6 R11: 0000000000000000 R12: ffff8880937161c0
R13: 0000000000000018 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000002638880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000002100003f CR3: 00000000a49a2000 CR4: 00000000001506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches



* Re: general protection fault in madvise_cold_or_pageout_pte_range
  2020-09-14  9:29 general protection fault in madvise_cold_or_pageout_pte_range syzbot
@ 2020-09-14 20:38 ` Minchan Kim
  2020-09-15 16:33 ` Minchan Kim
  1 sibling, 0 replies; 4+ messages in thread
From: Minchan Kim @ 2020-09-14 20:38 UTC (permalink / raw)
  To: syzbot
  Cc: akpm, andreyknvl, hannes, khalid.aziz, linux-kernel, linux-mm,
	mhocko, rppt, syzkaller-bugs, torvalds

On Mon, Sep 14, 2020 at 02:29:15AM -0700, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    729e3d09 Merge tag 'ceph-for-5.9-rc5' of git://github.com/..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1482b99e900000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=8f5c353182ed6199
> dashboard link: https://syzkaller.appspot.com/bug?extid=ecf80462cb7d5d552bc7
> compiler:       clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16e2a255900000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=164afdb3900000
> 
> The issue was bisected to:
> 
> commit 1a4e58cce84ee88129d5d49c064bd2852b481357
> Author: Minchan Kim <minchan@kernel.org>
> Date:   Wed Sep 25 23:49:15 2019 +0000
> 
>     mm: introduce MADV_PAGEOUT
> 
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=127f973e900000
> final oops:     https://syzkaller.appspot.com/x/report.txt?x=117f973e900000
> console output: https://syzkaller.appspot.com/x/log.txt?x=167f973e900000
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+ecf80462cb7d5d552bc7@syzkaller.appspotmail.com
> Fixes: 1a4e58cce84e ("mm: introduce MADV_PAGEOUT")
> 
> general protection fault, probably for non-canonical address 0xdffffc0000000003: 0000 [#1] PREEMPT SMP KASAN
> KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f]
> CPU: 1 PID: 6826 Comm: syz-executor142 Not tainted 5.9.0-rc4-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> RIP: 0010:__lock_acquire+0x84/0x2ae0 kernel/locking/lockdep.c:4296
> Code: ff df 8a 04 30 84 c0 0f 85 e3 16 00 00 83 3d 56 58 35 08 00 0f 84 0e 17 00 00 83 3d 25 c7 f5 07 00 74 2c 4c 89 e8 48 c1 e8 03 <80> 3c 30 00 74 12 4c 89 ef e8 3e d1 5a 00 48 be 00 00 00 00 00 fc
> RSP: 0018:ffffc90004b9f850 EFLAGS: 00010006
> RAX: 0000000000000003 RBX: 0000000000000001 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000018
> RBP: ffffc90004b9f9a8 R08: 0000000000000001 R09: 0000000000000000
> R10: fffffbfff131e2e6 R11: 0000000000000000 R12: ffff8880937161c0
> R13: 0000000000000018 R14: 0000000000000000 R15: 0000000000000000
> FS:  0000000002638880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000000002100003f CR3: 00000000a49a2000 CR4: 00000000001506e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  lock_acquire+0x140/0x6f0 kernel/locking/lockdep.c:5006
>  __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
>  _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
>  spin_lock include/linux/spinlock.h:354 [inline]
>  madvise_cold_or_pageout_pte_range+0x52f/0x25c0 mm/madvise.c:389
>  walk_pmd_range mm/pagewalk.c:89 [inline]
>  walk_pud_range mm/pagewalk.c:160 [inline]
>  walk_p4d_range mm/pagewalk.c:193 [inline]
>  walk_pgd_range mm/pagewalk.c:229 [inline]
>  __walk_page_range+0xe7b/0x1da0 mm/pagewalk.c:331
>  walk_page_range+0x2c3/0x5c0 mm/pagewalk.c:427
>  madvise_pageout_page_range mm/madvise.c:521 [inline]
>  madvise_pageout mm/madvise.c:557 [inline]
>  madvise_vma mm/madvise.c:946 [inline]
>  do_madvise+0x12d0/0x2090 mm/madvise.c:1145
>  __do_sys_madvise mm/madvise.c:1171 [inline]
>  __se_sys_madvise mm/madvise.c:1169 [inline]
>  __x64_sys_madvise+0x76/0x80 mm/madvise.c:1169
>  do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9

The bug is that the pmd is accessed again after split_huge_page() has
split it, by which point the pmd may be NULL. Let me look at it.
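
As an aside for anyone decoding the oops: the faulting address
0xdffffc0000000003 is not the bad pointer itself but its KASAN shadow
address. The sketch below assumes the x86-64 generic KASAN layout (shadow
offset 0xdffffc0000000000, one shadow byte per 8 bytes of memory) and maps
the shadow address back to the range KASAN printed. The helper names are
made up for illustration; they are not the kernel's own functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86-64 generic KASAN constants; other configs may differ. */
#define TOY_KASAN_SHADOW_OFFSET      0xdffffc0000000000ULL
#define TOY_KASAN_SHADOW_SCALE_SHIFT 3

/* Shadow byte address that covers the 8-byte granule containing addr. */
static uint64_t toy_mem_to_shadow(uint64_t addr)
{
	return (addr >> TOY_KASAN_SHADOW_SCALE_SHIFT) + TOY_KASAN_SHADOW_OFFSET;
}

/* First memory address covered by the given shadow byte. */
static uint64_t toy_shadow_to_mem(uint64_t shadow)
{
	assert(shadow >= TOY_KASAN_SHADOW_OFFSET);
	return (shadow - TOY_KASAN_SHADOW_OFFSET) << TOY_KASAN_SHADOW_SCALE_SHIFT;
}

/* Last memory address covered by the given shadow byte. */
static uint64_t toy_shadow_to_mem_end(uint64_t shadow)
{
	return toy_shadow_to_mem(shadow) +
	       (1ULL << TOY_KASAN_SHADOW_SCALE_SHIFT) - 1;
}
```

With these helpers, toy_shadow_to_mem(0xdffffc0000000003) gives 0x18 and
toy_shadow_to_mem_end() gives 0x1f, matching the reported range. 0x18 is
also the value in RDI and R13, which is consistent with a small field
offset from a NULL base pointer.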

Thanks.



* Re: general protection fault in madvise_cold_or_pageout_pte_range
  2020-09-14  9:29 general protection fault in madvise_cold_or_pageout_pte_range syzbot
  2020-09-14 20:38 ` Minchan Kim
@ 2020-09-15 16:33 ` Minchan Kim
  2020-09-26  8:36   ` Kirill A. Shutemov
  1 sibling, 1 reply; 4+ messages in thread
From: Minchan Kim @ 2020-09-15 16:33 UTC (permalink / raw)
  To: syzbot
  Cc: akpm, andreyknvl, hannes, khalid.aziz, linux-kernel, linux-mm,
	mhocko, rppt, syzkaller-bugs, torvalds, Kirill A. Shutemov


The backing vma was shmem. Looking at the implementation of
__split_huge_pmd, it appears to simply zap the pmd when the vma is not
vma_is_anonymous, unlike the anon vma case, where the pmd page is
remapped to ptes.

commit d21b9e57c74c (HEAD)
Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Date:   Tue Jul 26 15:25:37 2016 -0700

    thp: handle file pages in split_huge_pmd()

    Splitting THP PMD is simple: just unmap it as in DAX case.  This way we
    can avoid memory overhead on page table allocation to deposit.

    It's probably a good idea to try to allocation page table with
    GFP_ATOMIC in __split_huge_pmd_locked() to avoid refaulting the area,
    but clearing pmd should be good enough for now.

    Unlike DAX, we also remove the page from rmap and drop reference.
    pmd_young() is transfered to PageReferenced().

If so, we need to check the validity of the pmd after splitting.
Cc'ing Kirill to double-check.
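
To make the difference concrete, here is a toy userspace model of the
behavior described above. It is only a sketch: the toy_* types and
functions are hypothetical stand-ins, not kernel code, and the model
assumes exactly what the analysis says (anon split leaves a pte table
behind, file-backed split leaves the pmd cleared).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the three pmd states that matter here. */
enum toy_pmd_state {
	TOY_PMD_NONE,	/* cleared: nothing mapped, no pte table */
	TOY_PMD_HUGE,	/* maps a transparent huge page */
	TOY_PMD_TABLE	/* points to a pte table */
};

struct toy_pmd {
	enum toy_pmd_state state;
};

struct toy_vma {
	bool anonymous;	/* models vma_is_anonymous() */
};

/*
 * Models the split as described in this thread: an anon vma gets the huge
 * page remapped through a pte table, any other vma just gets the pmd zapped.
 */
static void toy_split_huge_pmd(const struct toy_vma *vma, struct toy_pmd *pmd)
{
	assert(vma != NULL && pmd != NULL);

	if (pmd->state != TOY_PMD_HUGE)
		return;
	pmd->state = vma->anonymous ? TOY_PMD_TABLE : TOY_PMD_NONE;
}

/* A caller may only go on to lock and walk ptes when this returns true. */
static bool toy_pmd_has_pte_table(const struct toy_pmd *pmd)
{
	return pmd->state == TOY_PMD_TABLE;
}
```

In this model a caller that splits a shmem-backed huge pmd and then walks
ptes without calling toy_pmd_has_pte_table() is in the same position as
the crashing path.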

From 26e804a0723f92862aa1ee9cc2c9e5d4691cb11d Mon Sep 17 00:00:00 2001
From: Minchan Kim <minchan@kernel.org>
Date: Mon, 14 Sep 2020 23:32:15 -0700
Subject: [PATCH] mm: validate pmd after splitting

syzbot reported the following:

general protection fault, probably for non-canonical address 0xdffffc0000000003: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f]
CPU: 1 PID: 6826 Comm: syz-executor142 Not tainted 5.9.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__lock_acquire+0x84/0x2ae0 kernel/locking/lockdep.c:4296
Code: ff df 8a 04 30 84 c0 0f 85 e3 16 00 00 83 3d 56 58 35 08 00 0f 84 0e 17 00 00 83 3d 25 c7 f5 07 00 74 2c 4c 89 e8 48 c1 e8 03 <80> 3c 30 00 74 12 4c 89 ef e8 3e d1 5a 00 48 be 00 00 00 00 00 fc
RSP: 0018:ffffc90004b9f850 EFLAGS: 00010006
RAX: 0000000000000003 RBX: 0000000000000001 RCX: 0000000000000000
RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000018
RBP: ffffc90004b9f9a8 R08: 0000000000000001 R09: 0000000000000000
R10: fffffbfff131e2e6 R11: 0000000000000000 R12: ffff8880937161c0
R13: 0000000000000018 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000002638880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000002100003f CR3: 00000000a49a2000 CR4: 00000000001506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 lock_acquire+0x140/0x6f0 kernel/locking/lockdep.c:5006
 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
 _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
 spin_lock include/linux/spinlock.h:354 [inline]
 madvise_cold_or_pageout_pte_range+0x52f/0x25c0 mm/madvise.c:389
 walk_pmd_range mm/pagewalk.c:89 [inline]
 walk_pud_range mm/pagewalk.c:160 [inline]
 walk_p4d_range mm/pagewalk.c:193 [inline]
 walk_pgd_range mm/pagewalk.c:229 [inline]
 __walk_page_range+0xe7b/0x1da0 mm/pagewalk.c:331
 walk_page_range+0x2c3/0x5c0 mm/pagewalk.c:427
 madvise_pageout_page_range mm/madvise.c:521 [inline]
 madvise_pageout mm/madvise.c:557 [inline]
 madvise_vma mm/madvise.c:946 [inline]
 do_madvise+0x12d0/0x2090 mm/madvise.c:1145
 __do_sys_madvise mm/madvise.c:1171 [inline]
 __se_sys_madvise mm/madvise.c:1169 [inline]
 __x64_sys_madvise+0x76/0x80 mm/madvise.c:1169
 do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

When a file-backed THP is split, the pmd is zapped instead of the
sub-pages being remapped, so we need to check the validity of the pmd
after the split.

Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: syzbot+ecf80462cb7d5d552bc7@syzkaller.appspotmail.com
Fixes: 1a4e58cce84e ("mm: introduce MADV_PAGEOUT")
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 mm/madvise.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index d4aa5f776543..0e0d61003fc6 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -381,9 +381,9 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 		return 0;
 	}
 
+regular_page:
 	if (pmd_trans_unstable(pmd))
 		return 0;
-regular_page:
 #endif
 	tlb_change_page_size(tlb, PAGE_SIZE);
 	orig_pte = pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
-- 
2.28.0.618.gf4bc123cb7-goog
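
For illustration only, the effect of moving the label can be modeled in
plain C. This sketch assumes the huge-pmd branch jumps to regular_page
after a successful split, which is what the label name and the analysis
in this thread suggest; the hunk above does not show that branch. All
toy_* names are hypothetical and nothing here is kernel code.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_result { TOY_SKIPPED, TOY_WALKED, TOY_BAD_DEREF };

/* Taking the pte lock is only valid when a pte table is present. */
static enum toy_result toy_lock_and_walk(bool has_table)
{
	return has_table ? TOY_WALKED : TOY_BAD_DEREF;
}

/* Label after the check: the post-split path bypasses the check. */
static enum toy_result toy_before_fix(bool huge, bool table_after_split)
{
	bool has_table = !huge;	/* a non-huge pmd is assumed to hold a table */

	if (huge) {
		has_table = table_after_split;	/* outcome of the split */
		goto regular_page;
	}
	if (!has_table)		/* stands in for pmd_trans_unstable() */
		return TOY_SKIPPED;
regular_page:
	return toy_lock_and_walk(has_table);
}

/* Label before the check: the post-split path is re-validated too. */
static enum toy_result toy_after_fix(bool huge, bool table_after_split)
{
	bool has_table = !huge;

	if (huge) {
		has_table = table_after_split;
		goto regular_page;
	}
regular_page:
	if (!has_table)		/* stands in for pmd_trans_unstable() */
		return TOY_SKIPPED;
	return toy_lock_and_walk(has_table);
}

/* Small self-check covering the file-backed and anon split outcomes. */
static void toy_demo(void)
{
	/* File-backed split leaves no pte table behind. */
	assert(toy_before_fix(true, false) == TOY_BAD_DEREF);
	assert(toy_after_fix(true, false) == TOY_SKIPPED);
	/* Anon split leaves a pte table, so both variants walk it. */
	assert(toy_before_fix(true, true) == TOY_WALKED);
	assert(toy_after_fix(true, true) == TOY_WALKED);
}
```

The two functions differ only in where the label sits, mirroring the
one-line move in the patch.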




* Re: general protection fault in madvise_cold_or_pageout_pte_range
  2020-09-15 16:33 ` Minchan Kim
@ 2020-09-26  8:36   ` Kirill A. Shutemov
  0 siblings, 0 replies; 4+ messages in thread
From: Kirill A. Shutemov @ 2020-09-26  8:36 UTC (permalink / raw)
  To: Minchan Kim
  Cc: syzbot, akpm, andreyknvl, hannes, khalid.aziz, linux-kernel,
	linux-mm, mhocko, rppt, syzkaller-bugs, torvalds

On Tue, Sep 15, 2020 at 09:33:49AM -0700, Minchan Kim wrote:
> The backing vma was shmem. Looking at the implementation of
> __split_huge_pmd, it appears to simply zap the pmd when the vma is not
> vma_is_anonymous, unlike the anon vma case, where the pmd page is
> remapped to ptes.
> 
> commit d21b9e57c74c (HEAD)
> Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Date:   Tue Jul 26 15:25:37 2016 -0700
> 
>     thp: handle file pages in split_huge_pmd()
> 
>     Splitting THP PMD is simple: just unmap it as in DAX case.  This way we
>     can avoid memory overhead on page table allocation to deposit.
> 
>     It's probably a good idea to try to allocation page table with
>     GFP_ATOMIC in __split_huge_pmd_locked() to avoid refaulting the area,
>     but clearing pmd should be good enough for now.
> 
>     Unlike DAX, we also remove the page from rmap and drop reference.
>     pmd_young() is transfered to PageReferenced().
> 
> If so, we need to check the validity of the pmd after splitting.
> Cc'ing Kirill to double-check.
> 
> From 26e804a0723f92862aa1ee9cc2c9e5d4691cb11d Mon Sep 17 00:00:00 2001
> From: Minchan Kim <minchan@kernel.org>
> Date: Mon, 14 Sep 2020 23:32:15 -0700
> Subject: [PATCH] mm: validate pmd after splitting
> 
> syzbot reported the following:
> 
> general protection fault, probably for non-canonical address 0xdffffc0000000003: 0000 [#1] PREEMPT SMP KASAN
> KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f]
> CPU: 1 PID: 6826 Comm: syz-executor142 Not tainted 5.9.0-rc4-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> RIP: 0010:__lock_acquire+0x84/0x2ae0 kernel/locking/lockdep.c:4296
> Code: ff df 8a 04 30 84 c0 0f 85 e3 16 00 00 83 3d 56 58 35 08 00 0f 84 0e 17 00 00 83 3d 25 c7 f5 07 00 74 2c 4c 89 e8 48 c1 e8 03 <80> 3c 30 00 74 12 4c 89 ef e8 3e d1 5a 00 48 be 00 00 00 00 00 fc
> RSP: 0018:ffffc90004b9f850 EFLAGS: 00010006
> RAX: 0000000000000003 RBX: 0000000000000001 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000018
> RBP: ffffc90004b9f9a8 R08: 0000000000000001 R09: 0000000000000000
> R10: fffffbfff131e2e6 R11: 0000000000000000 R12: ffff8880937161c0
> R13: 0000000000000018 R14: 0000000000000000 R15: 0000000000000000
> FS:  0000000002638880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000000002100003f CR3: 00000000a49a2000 CR4: 00000000001506e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  lock_acquire+0x140/0x6f0 kernel/locking/lockdep.c:5006
>  __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
>  _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
>  spin_lock include/linux/spinlock.h:354 [inline]
>  madvise_cold_or_pageout_pte_range+0x52f/0x25c0 mm/madvise.c:389
>  walk_pmd_range mm/pagewalk.c:89 [inline]
>  walk_pud_range mm/pagewalk.c:160 [inline]
>  walk_p4d_range mm/pagewalk.c:193 [inline]
>  walk_pgd_range mm/pagewalk.c:229 [inline]
>  __walk_page_range+0xe7b/0x1da0 mm/pagewalk.c:331
>  walk_page_range+0x2c3/0x5c0 mm/pagewalk.c:427
>  madvise_pageout_page_range mm/madvise.c:521 [inline]
>  madvise_pageout mm/madvise.c:557 [inline]
>  madvise_vma mm/madvise.c:946 [inline]
>  do_madvise+0x12d0/0x2090 mm/madvise.c:1145
>  __do_sys_madvise mm/madvise.c:1171 [inline]
>  __se_sys_madvise mm/madvise.c:1169 [inline]
>  __x64_sys_madvise+0x76/0x80 mm/madvise.c:1169
>  do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> When a file-backed THP is split, the pmd is zapped instead of the
> sub-pages being remapped, so we need to check the validity of the pmd
> after the split.
> 
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Reported-by: syzbot+ecf80462cb7d5d552bc7@syzkaller.appspotmail.com
> Fixes: 1a4e58cce84e ("mm: introduce MADV_PAGEOUT")
> Signed-off-by: Minchan Kim <minchan@kernel.org>

That's the correct fix. I came up with the same one.

But it would be nice to trim the commit message. There's a lot of
unrelated info in the dump.

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

> ---
>  mm/madvise.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/madvise.c b/mm/madvise.c
> index d4aa5f776543..0e0d61003fc6 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -381,9 +381,9 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
>  		return 0;
>  	}
>  
> +regular_page:
>  	if (pmd_trans_unstable(pmd))
>  		return 0;
> -regular_page:
>  #endif
>  	tlb_change_page_size(tlb, PAGE_SIZE);
>  	orig_pte = pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
> -- 
> 2.28.0.618.gf4bc123cb7-goog
> 

-- 
 Kirill A. Shutemov

