* BUG: unable to handle kernel paging request in kernel_get_mempolicy @ 2020-04-06 18:16 syzbot 2020-04-07 0:47 ` Peter Xu 0 siblings, 1 reply; 13+ messages in thread From: syzbot @ 2020-04-06 18:16 UTC (permalink / raw) To: akpm, bgeffon, linux-kernel, linux-mm, peterx, syzkaller-bugs, torvalds Hello, syzbot found the following crash on: HEAD commit: bef7b2a7 Merge tag 'devicetree-for-5.7' of git://git.kerne.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=13966e8fe00000 kernel config: https://syzkaller.appspot.com/x/.config?x=91b674b8f0368e69 dashboard link: https://syzkaller.appspot.com/bug?extid=693dc11fcb53120b5559 compiler: gcc (GCC) 9.0.0 20181231 (experimental) syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1738b02be00000 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17d2c76de00000 The bug was bisected to: commit 4426e945df588f2878affddf88a51259200f7e29 Author: Peter Xu <peterx@redhat.com> Date: Thu Apr 2 04:08:49 2020 +0000 mm/gup: allow VM_FAULT_RETRY for multiple times bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=14ac4a5de00000 final crash: https://syzkaller.appspot.com/x/report.txt?x=16ac4a5de00000 console output: https://syzkaller.appspot.com/x/log.txt?x=12ac4a5de00000 IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+693dc11fcb53120b5559@syzkaller.appspotmail.com Fixes: 4426e945df58 ("mm/gup: allow VM_FAULT_RETRY for multiple times") BUG: unable to handle page fault for address: ffffffff00000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 987c067 P4D 987c067 PUD 0 Oops: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 7181 Comm: syz-executor616 Not tainted 5.6.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:page_to_nid include/linux/mm.h:1245 [inline] RIP: 0010:lookup_node mm/mempolicy.c:906 [inline] RIP: 0010:do_get_mempolicy 
mm/mempolicy.c:970 [inline] RIP: 0010:kernel_get_mempolicy+0x60e/0xfb0 mm/mempolicy.c:1615 Code: 88 00 07 00 00 e8 b2 35 c5 ff 4c 8b 7c 24 78 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 fb 08 00 00 <49> 8b 1f 48 c7 c7 ff ff ff ff 48 89 de e8 10 37 c5 ff 48 83 fb ff RSP: 0018:ffffc900018d7de8 EFLAGS: 00010246 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff81adaaf1 RDX: 1fffffffe0000000 RSI: ffffffff81adaafe RDI: 0000000000000005 RBP: 0000000000000000 R08: ffff88808de924c0 R09: ffffed1011bd2499 R10: ffff88808de924c7 R11: ffffed1011bd2498 R12: 0000000000000000 R13: 1ffff9200031afc4 R14: ffffffff89a6df60 R15: ffffffff00000000 FS: 00007f848cd4a700(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffff00000000 CR3: 00000000a7a8d000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __do_sys_get_mempolicy mm/mempolicy.c:1633 [inline] __se_sys_get_mempolicy mm/mempolicy.c:1629 [inline] __x64_sys_get_mempolicy+0xba/0x150 mm/mempolicy.c:1629 do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295 entry_SYSCALL_64_after_hwframe+0x49/0xb3 RIP: 0033:0x446719 Code: e8 5c b3 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 0b 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f848cd49db8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ef RAX: ffffffffffffffda RBX: 00000000006dbc28 RCX: 0000000000446719 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 00000000006dbc20 R08: 0000000000000003 R09: 0000000000000000 R10: 000000002073b000 R11: 0000000000000246 R12: 00000000006dbc2c R13: 00007ffcfe6ba66f R14: 00007f848cd4a9c0 R15: 20c49ba5e353f7cf Modules linked in: CR2: ffffffff00000000 ---[ end trace 0becf554e06291c3 ]--- RIP: 0010:page_to_nid include/linux/mm.h:1245 
[inline] RIP: 0010:lookup_node mm/mempolicy.c:906 [inline] RIP: 0010:do_get_mempolicy mm/mempolicy.c:970 [inline] RIP: 0010:kernel_get_mempolicy+0x60e/0xfb0 mm/mempolicy.c:1615 Code: 88 00 07 00 00 e8 b2 35 c5 ff 4c 8b 7c 24 78 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 fb 08 00 00 <49> 8b 1f 48 c7 c7 ff ff ff ff 48 89 de e8 10 37 c5 ff 48 83 fb ff RSP: 0018:ffffc900018d7de8 EFLAGS: 00010246 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff81adaaf1 RDX: 1fffffffe0000000 RSI: ffffffff81adaafe RDI: 0000000000000005 RBP: 0000000000000000 R08: ffff88808de924c0 R09: ffffed1011bd2499 R10: ffff88808de924c7 R11: ffffed1011bd2498 R12: 0000000000000000 R13: 1ffff9200031afc4 R14: ffffffff89a6df60 R15: ffffffff00000000 FS: 00007f848cd4a700(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffff00000000 CR3: 00000000a7a8d000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 --- This bug is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkaller@googlegroups.com. syzbot will keep track of this bug report. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. For information about bisection process see: https://goo.gl/tpsmEJ#bisection syzbot can test patches for this bug, for details see: https://goo.gl/tpsmEJ#testing-patches ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: BUG: unable to handle kernel paging request in kernel_get_mempolicy 2020-04-06 18:16 BUG: unable to handle kernel paging request in kernel_get_mempolicy syzbot @ 2020-04-07 0:47 ` Peter Xu 2020-04-07 1:05 ` Randy Dunlap 2020-04-07 1:39 ` Andrew Morton 0 siblings, 2 replies; 13+ messages in thread From: Peter Xu @ 2020-04-07 0:47 UTC (permalink / raw) To: syzbot; +Cc: akpm, bgeffon, linux-kernel, linux-mm, syzkaller-bugs, torvalds On Mon, Apr 06, 2020 at 11:16:13AM -0700, syzbot wrote: > Hello, > > syzbot found the following crash on: > > HEAD commit: bef7b2a7 Merge tag 'devicetree-for-5.7' of git://git.kerne.. > git tree: upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=13966e8fe00000 > kernel config: https://syzkaller.appspot.com/x/.config?x=91b674b8f0368e69 > dashboard link: https://syzkaller.appspot.com/bug?extid=693dc11fcb53120b5559 > compiler: gcc (GCC) 9.0.0 20181231 (experimental) > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1738b02be00000 > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17d2c76de00000 > > The bug was bisected to: > > commit 4426e945df588f2878affddf88a51259200f7e29 > Author: Peter Xu <peterx@redhat.com> > Date: Thu Apr 2 04:08:49 2020 +0000 > > mm/gup: allow VM_FAULT_RETRY for multiple times > > bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=14ac4a5de00000 > final crash: https://syzkaller.appspot.com/x/report.txt?x=16ac4a5de00000 > console output: https://syzkaller.appspot.com/x/log.txt?x=12ac4a5de00000 > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > Reported-by: syzbot+693dc11fcb53120b5559@syzkaller.appspotmail.com > Fixes: 4426e945df58 ("mm/gup: allow VM_FAULT_RETRY for multiple times") > > BUG: unable to handle page fault for address: ffffffff00000000 > #PF: supervisor read access in kernel mode > #PF: error_code(0x0000) - not-present page > PGD 987c067 P4D 987c067 PUD 0 > Oops: 0000 [#1] PREEMPT SMP KASAN > CPU: 1 PID: 7181 Comm: syz-executor616 
Not tainted 5.6.0-syzkaller #0 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 > RIP: 0010:page_to_nid include/linux/mm.h:1245 [inline] > RIP: 0010:lookup_node mm/mempolicy.c:906 [inline] > RIP: 0010:do_get_mempolicy mm/mempolicy.c:970 [inline] > RIP: 0010:kernel_get_mempolicy+0x60e/0xfb0 mm/mempolicy.c:1615 > Code: 88 00 07 00 00 e8 b2 35 c5 ff 4c 8b 7c 24 78 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 fb 08 00 00 <49> 8b 1f 48 c7 c7 ff ff ff ff 48 89 de e8 10 37 c5 ff 48 83 fb ff > RSP: 0018:ffffc900018d7de8 EFLAGS: 00010246 > RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff81adaaf1 > RDX: 1fffffffe0000000 RSI: ffffffff81adaafe RDI: 0000000000000005 > RBP: 0000000000000000 R08: ffff88808de924c0 R09: ffffed1011bd2499 > R10: ffff88808de924c7 R11: ffffed1011bd2498 R12: 0000000000000000 > R13: 1ffff9200031afc4 R14: ffffffff89a6df60 R15: ffffffff00000000 > FS: 00007f848cd4a700(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: ffffffff00000000 CR3: 00000000a7a8d000 CR4: 00000000001406e0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > Call Trace: > __do_sys_get_mempolicy mm/mempolicy.c:1633 [inline] > __se_sys_get_mempolicy mm/mempolicy.c:1629 [inline] > __x64_sys_get_mempolicy+0xba/0x150 mm/mempolicy.c:1629 > do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295 > entry_SYSCALL_64_after_hwframe+0x49/0xb3 > RIP: 0033:0x446719 > Code: e8 5c b3 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 0b 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00 > RSP: 002b:00007f848cd49db8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ef > RAX: ffffffffffffffda RBX: 00000000006dbc28 RCX: 0000000000446719 > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 > RBP: 
00000000006dbc20 R08: 0000000000000003 R09: 0000000000000000 > R10: 000000002073b000 R11: 0000000000000246 R12: 00000000006dbc2c > R13: 00007ffcfe6ba66f R14: 00007f848cd4a9c0 R15: 20c49ba5e353f7cf > Modules linked in: > CR2: ffffffff00000000 > ---[ end trace 0becf554e06291c3 ]--- > RIP: 0010:page_to_nid include/linux/mm.h:1245 [inline] > RIP: 0010:lookup_node mm/mempolicy.c:906 [inline] > RIP: 0010:do_get_mempolicy mm/mempolicy.c:970 [inline] > RIP: 0010:kernel_get_mempolicy+0x60e/0xfb0 mm/mempolicy.c:1615 > Code: 88 00 07 00 00 e8 b2 35 c5 ff 4c 8b 7c 24 78 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 fb 08 00 00 <49> 8b 1f 48 c7 c7 ff ff ff ff 48 89 de e8 10 37 c5 ff 48 83 fb ff > RSP: 0018:ffffc900018d7de8 EFLAGS: 00010246 > RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff81adaaf1 > RDX: 1fffffffe0000000 RSI: ffffffff81adaafe RDI: 0000000000000005 > RBP: 0000000000000000 R08: ffff88808de924c0 R09: ffffed1011bd2499 > R10: ffff88808de924c7 R11: ffffed1011bd2498 R12: 0000000000000000 > R13: 1ffff9200031afc4 R14: ffffffff89a6df60 R15: ffffffff00000000 > FS: 00007f848cd4a700(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: ffffffff00000000 CR3: 00000000a7a8d000 CR4: 00000000001406e0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Hi, Andrew & all, I can reproduce this locally right after I run the test program, and below patch fixed it for me - the test program can run with quite a few minutes without crashing again. Is there a way I can feed this to the syzbot to re-verify this? 
Thanks,

8<---------------------------------------------------------------
From 23800bff6fa346a4e9b3806dc0cfeb74498df757 Mon Sep 17 00:00:00 2001
From: Peter Xu <peterx@redhat.com>
Date: Mon, 6 Apr 2020 20:40:13 -0400
Subject: [PATCH] mm/mempolicy: Allow lookup_node() to handle fatal signal

lookup_node() uses gup to pin the page and get node information. It
checks against ret>=0 assuming the page will be filled in. However
it's also possible that gup will return zero, for example, when the
thread is quickly killed with a fatal signal. Teach lookup_node() to
gracefully return an error -EFAULT if it happens.

Reported-by: syzbot+693dc11fcb53120b5559@syzkaller.appspotmail.com
Fixes: 4426e945df58 ("mm/gup: allow VM_FAULT_RETRY for multiple times")
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 mm/mempolicy.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 5fb427aed612..1398578db025 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -902,7 +902,10 @@ static int lookup_node(struct mm_struct *mm, unsigned long addr)
 
 	int locked = 1;
 	err = get_user_pages_locked(addr & PAGE_MASK, 1, 0, &p, &locked);
-	if (err >= 0) {
+	if (err == 0) {
+		/* E.g. GUP interrupted by fatal signal */
+		err = -EFAULT;
+	} else if (err > 0) {
 		err = page_to_nid(p);
 		put_page(p);
 	}
-- 
2.24.1

-- 
Peter Xu

^ permalink raw reply related [flat|nested] 13+ messages in thread
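For anyone following the thread, the return-value contract that the patch above relies on can be sketched in user space. This is only an illustration under stated assumptions, not the kernel code: `struct fake_page`, `fake_gup()` and `lookup_node_sketch()` are hypothetical stand-ins, and `fake_gup()` merely mimics the three outcomes discussed in the commit message (a page pinned, zero pages pinned, or a negative errno).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; only the node id matters here. */
struct fake_page {
	int nid;
};

/*
 * Hypothetical stand-in for a GUP-style call. It returns the number of
 * pages pinned (> 0), 0 if it gave up before pinning anything, or a
 * negative errno. *pagep is written only when a page was pinned.
 */
static long fake_gup(long outcome, struct fake_page *backing,
		     struct fake_page **pagep)
{
	if (outcome > 0)
		*pagep = backing;
	return outcome;
}

/* Mirrors the three-way check in the patch: 0, > 0, and < 0. */
static int lookup_node_sketch(long outcome)
{
	struct fake_page backing = { .nid = 3 };
	struct fake_page *p = NULL;
	long err = fake_gup(outcome, &backing, &p);

	if (err == 0) {
		/* Nothing was pinned, so p was never filled in. */
		return -EFAULT;
	}
	if (err > 0) {
		/* A page was pinned, so p is valid to dereference. */
		assert(p != NULL);
		return p->nid;
	}
	/* Negative errno from the lookup itself is passed through. */
	return (int)err;
}
```

With the old `err >= 0` test, the zero case would fall into the branch that dereferences `p`, which is the kind of stray pointer read the oops above shows. In this sketch, `lookup_node_sketch(0)` returns `-EFAULT` instead, while `lookup_node_sketch(1)` returns the node id of the stand-in page.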
* Re: BUG: unable to handle kernel paging request in kernel_get_mempolicy 2020-04-07 0:47 ` Peter Xu @ 2020-04-07 1:05 ` Randy Dunlap 2020-04-07 1:05 ` syzbot ` (2 more replies) 2020-04-07 1:39 ` Andrew Morton 1 sibling, 3 replies; 13+ messages in thread From: Randy Dunlap @ 2020-04-07 1:05 UTC (permalink / raw) To: Peter Xu, syzbot Cc: akpm, bgeffon, linux-kernel, linux-mm, syzkaller-bugs, torvalds On 4/6/20 5:47 PM, Peter Xu wrote: > On Mon, Apr 06, 2020 at 11:16:13AM -0700, syzbot wrote: >> Hello, >> >> syzbot found the following crash on: >> >> HEAD commit: bef7b2a7 Merge tag 'devicetree-for-5.7' of git://git.kerne.. >> git tree: upstream >> console output: https://syzkaller.appspot.com/x/log.txt?x=13966e8fe00000 >> kernel config: https://syzkaller.appspot.com/x/.config?x=91b674b8f0368e69 >> dashboard link: https://syzkaller.appspot.com/bug?extid=693dc11fcb53120b5559 >> compiler: gcc (GCC) 9.0.0 20181231 (experimental) >> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1738b02be00000 >> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17d2c76de00000 >> >> The bug was bisected to: >> >> commit 4426e945df588f2878affddf88a51259200f7e29 >> Author: Peter Xu <peterx@redhat.com> >> Date: Thu Apr 2 04:08:49 2020 +0000 >> >> mm/gup: allow VM_FAULT_RETRY for multiple times >> >> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=14ac4a5de00000 >> final crash: https://syzkaller.appspot.com/x/report.txt?x=16ac4a5de00000 >> console output: https://syzkaller.appspot.com/x/log.txt?x=12ac4a5de00000 >> >> IMPORTANT: if you fix the bug, please add the following tag to the commit: >> Reported-by: syzbot+693dc11fcb53120b5559@syzkaller.appspotmail.com >> Fixes: 4426e945df58 ("mm/gup: allow VM_FAULT_RETRY for multiple times") >> >> BUG: unable to handle page fault for address: ffffffff00000000 >> #PF: supervisor read access in kernel mode >> #PF: error_code(0x0000) - not-present page >> PGD 987c067 P4D 987c067 PUD 0 >> Oops: 0000 [#1] PREEMPT SMP KASAN >> CPU: 
1 PID: 7181 Comm: syz-executor616 Not tainted 5.6.0-syzkaller #0 >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 >> RIP: 0010:page_to_nid include/linux/mm.h:1245 [inline] >> RIP: 0010:lookup_node mm/mempolicy.c:906 [inline] >> RIP: 0010:do_get_mempolicy mm/mempolicy.c:970 [inline] >> RIP: 0010:kernel_get_mempolicy+0x60e/0xfb0 mm/mempolicy.c:1615 >> Code: 88 00 07 00 00 e8 b2 35 c5 ff 4c 8b 7c 24 78 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 fb 08 00 00 <49> 8b 1f 48 c7 c7 ff ff ff ff 48 89 de e8 10 37 c5 ff 48 83 fb ff >> RSP: 0018:ffffc900018d7de8 EFLAGS: 00010246 >> RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff81adaaf1 >> RDX: 1fffffffe0000000 RSI: ffffffff81adaafe RDI: 0000000000000005 >> RBP: 0000000000000000 R08: ffff88808de924c0 R09: ffffed1011bd2499 >> R10: ffff88808de924c7 R11: ffffed1011bd2498 R12: 0000000000000000 >> R13: 1ffff9200031afc4 R14: ffffffff89a6df60 R15: ffffffff00000000 >> FS: 00007f848cd4a700(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000 >> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> CR2: ffffffff00000000 CR3: 00000000a7a8d000 CR4: 00000000001406e0 >> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >> Call Trace: >> __do_sys_get_mempolicy mm/mempolicy.c:1633 [inline] >> __se_sys_get_mempolicy mm/mempolicy.c:1629 [inline] >> __x64_sys_get_mempolicy+0xba/0x150 mm/mempolicy.c:1629 >> do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295 >> entry_SYSCALL_64_after_hwframe+0x49/0xb3 >> RIP: 0033:0x446719 >> Code: e8 5c b3 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 0b 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00 >> RSP: 002b:00007f848cd49db8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ef >> RAX: ffffffffffffffda RBX: 00000000006dbc28 RCX: 0000000000446719 >> RDX: 
0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 >> RBP: 00000000006dbc20 R08: 0000000000000003 R09: 0000000000000000 >> R10: 000000002073b000 R11: 0000000000000246 R12: 00000000006dbc2c >> R13: 00007ffcfe6ba66f R14: 00007f848cd4a9c0 R15: 20c49ba5e353f7cf >> Modules linked in: >> CR2: ffffffff00000000 >> ---[ end trace 0becf554e06291c3 ]--- >> RIP: 0010:page_to_nid include/linux/mm.h:1245 [inline] >> RIP: 0010:lookup_node mm/mempolicy.c:906 [inline] >> RIP: 0010:do_get_mempolicy mm/mempolicy.c:970 [inline] >> RIP: 0010:kernel_get_mempolicy+0x60e/0xfb0 mm/mempolicy.c:1615 >> Code: 88 00 07 00 00 e8 b2 35 c5 ff 4c 8b 7c 24 78 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 fb 08 00 00 <49> 8b 1f 48 c7 c7 ff ff ff ff 48 89 de e8 10 37 c5 ff 48 83 fb ff >> RSP: 0018:ffffc900018d7de8 EFLAGS: 00010246 >> RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff81adaaf1 >> RDX: 1fffffffe0000000 RSI: ffffffff81adaafe RDI: 0000000000000005 >> RBP: 0000000000000000 R08: ffff88808de924c0 R09: ffffed1011bd2499 >> R10: ffff88808de924c7 R11: ffffed1011bd2498 R12: 0000000000000000 >> R13: 1ffff9200031afc4 R14: ffffffff89a6df60 R15: ffffffff00000000 >> FS: 00007f848cd4a700(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000 >> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> CR2: ffffffff00000000 CR3: 00000000a7a8d000 CR4: 00000000001406e0 >> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > Hi, Andrew & all, > > I can reproduce this locally right after I run the test program, and > below patch fixed it for me - the test program can run with quite a > few minutes without crashing again. > > Is there a way I can feed this to the syzbot to re-verify this? Hi Peter, Send the patch. 
At the top of the email, put something like
#syz test <git repo> <branch>

It's documented here:
https://github.com/google/syzkaller/blob/master/docs/syzbot.md


> Thanks,
>
> 8<---------------------------------------------------------------
> From 23800bff6fa346a4e9b3806dc0cfeb74498df757 Mon Sep 17 00:00:00 2001
> From: Peter Xu <peterx@redhat.com>
> Date: Mon, 6 Apr 2020 20:40:13 -0400
> Subject: [PATCH] mm/mempolicy: Allow lookup_node() to handle fatal signal
>
> lookup_node() uses gup to pin the page and get node information. It
> checks against ret>=0 assuming the page will be filled in. However
> it's also possible that gup will return zero, for example, when the
> thread is quickly killed with a fatal signal. Teach lookup_node() to
> gracefully return an error -EFAULT if it happens.
>
> Reported-by: syzbot+693dc11fcb53120b5559@syzkaller.appspotmail.com
> Fixes: 4426e945df58 ("mm/gup: allow VM_FAULT_RETRY for multiple times")
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  mm/mempolicy.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 5fb427aed612..1398578db025 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -902,7 +902,10 @@ static int lookup_node(struct mm_struct *mm, unsigned long addr)
>  
>  	int locked = 1;
>  	err = get_user_pages_locked(addr & PAGE_MASK, 1, 0, &p, &locked);
> -	if (err >= 0) {
> +	if (err == 0) {
> +		/* E.g. GUP interrupted by fatal signal */
> +		err = -EFAULT;
> +	} else if (err > 0) {
>  		err = page_to_nid(p);
>  		put_page(p);
>  	}
>

-- 
~Randy
Reported-by: Randy Dunlap <rdunlap@infradead.org>

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Re: BUG: unable to handle kernel paging request in kernel_get_mempolicy 2020-04-07 1:05 ` Randy Dunlap @ 2020-04-07 1:05 ` syzbot 2020-04-07 1:06 ` syzbot 2020-04-07 1:27 ` Peter Xu 2 siblings, 0 replies; 13+ messages in thread From: syzbot @ 2020-04-07 1:05 UTC (permalink / raw) To: Randy Dunlap Cc: akpm, bgeffon, linux-kernel, linux-mm, peterx, rdunlap, syzkaller-bugs, torvalds > On 4/6/20 5:47 PM, Peter Xu wrote: >> On Mon, Apr 06, 2020 at 11:16:13AM -0700, syzbot wrote: >>> Hello, >>> >>> syzbot found the following crash on: >>> >>> HEAD commit: bef7b2a7 Merge tag 'devicetree-for-5.7' of git://git.kerne.. >>> git tree: upstream >>> console output: https://syzkaller.appspot.com/x/log.txt?x=13966e8fe00000 >>> kernel config: https://syzkaller.appspot.com/x/.config?x=91b674b8f0368e69 >>> dashboard link: https://syzkaller.appspot.com/bug?extid=693dc11fcb53120b5559 >>> compiler: gcc (GCC) 9.0.0 20181231 (experimental) >>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1738b02be00000 >>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17d2c76de00000 >>> >>> The bug was bisected to: >>> >>> commit 4426e945df588f2878affddf88a51259200f7e29 >>> Author: Peter Xu <peterx@redhat.com> >>> Date: Thu Apr 2 04:08:49 2020 +0000 >>> >>> mm/gup: allow VM_FAULT_RETRY for multiple times >>> >>> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=14ac4a5de00000 >>> final crash: https://syzkaller.appspot.com/x/report.txt?x=16ac4a5de00000 >>> console output: https://syzkaller.appspot.com/x/log.txt?x=12ac4a5de00000 >>> >>> IMPORTANT: if you fix the bug, please add the following tag to the commit: >>> Reported-by: syzbot+693dc11fcb53120b5559@syzkaller.appspotmail.com >>> Fixes: 4426e945df58 ("mm/gup: allow VM_FAULT_RETRY for multiple times") >>> >>> BUG: unable to handle page fault for address: ffffffff00000000 >>> #PF: supervisor read access in kernel mode >>> #PF: error_code(0x0000) - not-present page >>> PGD 987c067 P4D 987c067 PUD 0 >>> Oops: 0000 [#1] 
PREEMPT SMP KASAN >>> CPU: 1 PID: 7181 Comm: syz-executor616 Not tainted 5.6.0-syzkaller #0 >>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 >>> RIP: 0010:page_to_nid include/linux/mm.h:1245 [inline] >>> RIP: 0010:lookup_node mm/mempolicy.c:906 [inline] >>> RIP: 0010:do_get_mempolicy mm/mempolicy.c:970 [inline] >>> RIP: 0010:kernel_get_mempolicy+0x60e/0xfb0 mm/mempolicy.c:1615 >>> Code: 88 00 07 00 00 e8 b2 35 c5 ff 4c 8b 7c 24 78 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 fb 08 00 00 <49> 8b 1f 48 c7 c7 ff ff ff ff 48 89 de e8 10 37 c5 ff 48 83 fb ff >>> RSP: 0018:ffffc900018d7de8 EFLAGS: 00010246 >>> RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff81adaaf1 >>> RDX: 1fffffffe0000000 RSI: ffffffff81adaafe RDI: 0000000000000005 >>> RBP: 0000000000000000 R08: ffff88808de924c0 R09: ffffed1011bd2499 >>> R10: ffff88808de924c7 R11: ffffed1011bd2498 R12: 0000000000000000 >>> R13: 1ffff9200031afc4 R14: ffffffff89a6df60 R15: ffffffff00000000 >>> FS: 00007f848cd4a700(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000 >>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>> CR2: ffffffff00000000 CR3: 00000000a7a8d000 CR4: 00000000001406e0 >>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >>> Call Trace: >>> __do_sys_get_mempolicy mm/mempolicy.c:1633 [inline] >>> __se_sys_get_mempolicy mm/mempolicy.c:1629 [inline] >>> __x64_sys_get_mempolicy+0xba/0x150 mm/mempolicy.c:1629 >>> do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295 >>> entry_SYSCALL_64_after_hwframe+0x49/0xb3 >>> RIP: 0033:0x446719 >>> Code: e8 5c b3 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 0b 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00 >>> RSP: 002b:00007f848cd49db8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ef >>> RAX: ffffffffffffffda RBX: 
00000000006dbc28 RCX: 0000000000446719 >>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 >>> RBP: 00000000006dbc20 R08: 0000000000000003 R09: 0000000000000000 >>> R10: 000000002073b000 R11: 0000000000000246 R12: 00000000006dbc2c >>> R13: 00007ffcfe6ba66f R14: 00007f848cd4a9c0 R15: 20c49ba5e353f7cf >>> Modules linked in: >>> CR2: ffffffff00000000 >>> ---[ end trace 0becf554e06291c3 ]--- >>> RIP: 0010:page_to_nid include/linux/mm.h:1245 [inline] >>> RIP: 0010:lookup_node mm/mempolicy.c:906 [inline] >>> RIP: 0010:do_get_mempolicy mm/mempolicy.c:970 [inline] >>> RIP: 0010:kernel_get_mempolicy+0x60e/0xfb0 mm/mempolicy.c:1615 >>> Code: 88 00 07 00 00 e8 b2 35 c5 ff 4c 8b 7c 24 78 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 fb 08 00 00 <49> 8b 1f 48 c7 c7 ff ff ff ff 48 89 de e8 10 37 c5 ff 48 83 fb ff >>> RSP: 0018:ffffc900018d7de8 EFLAGS: 00010246 >>> RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff81adaaf1 >>> RDX: 1fffffffe0000000 RSI: ffffffff81adaafe RDI: 0000000000000005 >>> RBP: 0000000000000000 R08: ffff88808de924c0 R09: ffffed1011bd2499 >>> R10: ffff88808de924c7 R11: ffffed1011bd2498 R12: 0000000000000000 >>> R13: 1ffff9200031afc4 R14: ffffffff89a6df60 R15: ffffffff00000000 >>> FS: 00007f848cd4a700(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000 >>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>> CR2: ffffffff00000000 CR3: 00000000a7a8d000 CR4: 00000000001406e0 >>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >> >> Hi, Andrew & all, >> >> I can reproduce this locally right after I run the test program, and >> below patch fixed it for me - the test program can run with quite a >> few minutes without crashing again. >> >> Is there a way I can feed this to the syzbot to re-verify this? > > Hi Peter, > > Send the patch. 
At the top of the email, put something like
> #syz test <git repo> <branch>

want 2 args (repo, branch), got 3

>
> It's documented here:
> https://github.com/google/syzkaller/blob/master/docs/syzbot.md
>
>
>> Thanks,
>>
>> 8<---------------------------------------------------------------
>> From 23800bff6fa346a4e9b3806dc0cfeb74498df757 Mon Sep 17 00:00:00 2001
>> From: Peter Xu <peterx@redhat.com>
>> Date: Mon, 6 Apr 2020 20:40:13 -0400
>> Subject: [PATCH] mm/mempolicy: Allow lookup_node() to handle fatal signal
>>
>> lookup_node() uses gup to pin the page and get node information. It
>> checks against ret>=0 assuming the page will be filled in. However
>> it's also possible that gup will return zero, for example, when the
>> thread is quickly killed with a fatal signal. Teach lookup_node() to
>> gracefully return an error -EFAULT if it happens.
>>
>> Reported-by: syzbot+693dc11fcb53120b5559@syzkaller.appspotmail.com
>> Fixes: 4426e945df58 ("mm/gup: allow VM_FAULT_RETRY for multiple times")
>> Signed-off-by: Peter Xu <peterx@redhat.com>
>> ---
>>  mm/mempolicy.c | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
>> index 5fb427aed612..1398578db025 100644
>> --- a/mm/mempolicy.c
>> +++ b/mm/mempolicy.c
>> @@ -902,7 +902,10 @@ static int lookup_node(struct mm_struct *mm, unsigned long addr)
>>  
>>  	int locked = 1;
>>  	err = get_user_pages_locked(addr & PAGE_MASK, 1, 0, &p, &locked);
>> -	if (err >= 0) {
>> +	if (err == 0) {
>> +		/* E.g. GUP interrupted by fatal signal */
>> +		err = -EFAULT;
>> +	} else if (err > 0) {
>>  		err = page_to_nid(p);
>>  		put_page(p);
>>  	}
>>
>
>
>
> -- 
> ~Randy
> Reported-by: Randy Dunlap <rdunlap@infradead.org>

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Re: BUG: unable to handle kernel paging request in kernel_get_mempolicy 2020-04-07 1:05 ` Randy Dunlap 2020-04-07 1:05 ` syzbot @ 2020-04-07 1:06 ` syzbot 2020-04-07 1:27 ` Peter Xu 2 siblings, 0 replies; 13+ messages in thread From: syzbot @ 2020-04-07 1:06 UTC (permalink / raw) To: Randy Dunlap Cc: akpm, bgeffon, linux-kernel, linux-mm, peterx, rdunlap, syzkaller-bugs, torvalds > On 4/6/20 5:47 PM, Peter Xu wrote: >> On Mon, Apr 06, 2020 at 11:16:13AM -0700, syzbot wrote: >>> Hello, >>> >>> syzbot found the following crash on: >>> >>> HEAD commit: bef7b2a7 Merge tag 'devicetree-for-5.7' of git://git.kerne.. >>> git tree: upstream >>> console output: https://syzkaller.appspot.com/x/log.txt?x=13966e8fe00000 >>> kernel config: https://syzkaller.appspot.com/x/.config?x=91b674b8f0368e69 >>> dashboard link: https://syzkaller.appspot.com/bug?extid=693dc11fcb53120b5559 >>> compiler: gcc (GCC) 9.0.0 20181231 (experimental) >>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1738b02be00000 >>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17d2c76de00000 >>> >>> The bug was bisected to: >>> >>> commit 4426e945df588f2878affddf88a51259200f7e29 >>> Author: Peter Xu <peterx@redhat.com> >>> Date: Thu Apr 2 04:08:49 2020 +0000 >>> >>> mm/gup: allow VM_FAULT_RETRY for multiple times >>> >>> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=14ac4a5de00000 >>> final crash: https://syzkaller.appspot.com/x/report.txt?x=16ac4a5de00000 >>> console output: https://syzkaller.appspot.com/x/log.txt?x=12ac4a5de00000 >>> >>> IMPORTANT: if you fix the bug, please add the following tag to the commit: >>> Reported-by: syzbot+693dc11fcb53120b5559@syzkaller.appspotmail.com >>> Fixes: 4426e945df58 ("mm/gup: allow VM_FAULT_RETRY for multiple times") >>> >>> BUG: unable to handle page fault for address: ffffffff00000000 >>> #PF: supervisor read access in kernel mode >>> #PF: error_code(0x0000) - not-present page >>> PGD 987c067 P4D 987c067 PUD 0 >>> Oops: 0000 [#1] 
PREEMPT SMP KASAN >>> CPU: 1 PID: 7181 Comm: syz-executor616 Not tainted 5.6.0-syzkaller #0 >>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 >>> RIP: 0010:page_to_nid include/linux/mm.h:1245 [inline] >>> RIP: 0010:lookup_node mm/mempolicy.c:906 [inline] >>> RIP: 0010:do_get_mempolicy mm/mempolicy.c:970 [inline] >>> RIP: 0010:kernel_get_mempolicy+0x60e/0xfb0 mm/mempolicy.c:1615 >>> Code: 88 00 07 00 00 e8 b2 35 c5 ff 4c 8b 7c 24 78 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 fb 08 00 00 <49> 8b 1f 48 c7 c7 ff ff ff ff 48 89 de e8 10 37 c5 ff 48 83 fb ff >>> RSP: 0018:ffffc900018d7de8 EFLAGS: 00010246 >>> RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff81adaaf1 >>> RDX: 1fffffffe0000000 RSI: ffffffff81adaafe RDI: 0000000000000005 >>> RBP: 0000000000000000 R08: ffff88808de924c0 R09: ffffed1011bd2499 >>> R10: ffff88808de924c7 R11: ffffed1011bd2498 R12: 0000000000000000 >>> R13: 1ffff9200031afc4 R14: ffffffff89a6df60 R15: ffffffff00000000 >>> FS: 00007f848cd4a700(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000 >>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>> CR2: ffffffff00000000 CR3: 00000000a7a8d000 CR4: 00000000001406e0 >>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >>> Call Trace: >>> __do_sys_get_mempolicy mm/mempolicy.c:1633 [inline] >>> __se_sys_get_mempolicy mm/mempolicy.c:1629 [inline] >>> __x64_sys_get_mempolicy+0xba/0x150 mm/mempolicy.c:1629 >>> do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295 >>> entry_SYSCALL_64_after_hwframe+0x49/0xb3 >>> RIP: 0033:0x446719 >>> Code: e8 5c b3 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 0b 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00 >>> RSP: 002b:00007f848cd49db8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ef >>> RAX: ffffffffffffffda RBX: 
00000000006dbc28 RCX: 0000000000446719 >>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 >>> RBP: 00000000006dbc20 R08: 0000000000000003 R09: 0000000000000000 >>> R10: 000000002073b000 R11: 0000000000000246 R12: 00000000006dbc2c >>> R13: 00007ffcfe6ba66f R14: 00007f848cd4a9c0 R15: 20c49ba5e353f7cf >>> Modules linked in: >>> CR2: ffffffff00000000 >>> ---[ end trace 0becf554e06291c3 ]--- >>> RIP: 0010:page_to_nid include/linux/mm.h:1245 [inline] >>> RIP: 0010:lookup_node mm/mempolicy.c:906 [inline] >>> RIP: 0010:do_get_mempolicy mm/mempolicy.c:970 [inline] >>> RIP: 0010:kernel_get_mempolicy+0x60e/0xfb0 mm/mempolicy.c:1615 >>> Code: 88 00 07 00 00 e8 b2 35 c5 ff 4c 8b 7c 24 78 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 fb 08 00 00 <49> 8b 1f 48 c7 c7 ff ff ff ff 48 89 de e8 10 37 c5 ff 48 83 fb ff >>> RSP: 0018:ffffc900018d7de8 EFLAGS: 00010246 >>> RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff81adaaf1 >>> RDX: 1fffffffe0000000 RSI: ffffffff81adaafe RDI: 0000000000000005 >>> RBP: 0000000000000000 R08: ffff88808de924c0 R09: ffffed1011bd2499 >>> R10: ffff88808de924c7 R11: ffffed1011bd2498 R12: 0000000000000000 >>> R13: 1ffff9200031afc4 R14: ffffffff89a6df60 R15: ffffffff00000000 >>> FS: 00007f848cd4a700(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000 >>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>> CR2: ffffffff00000000 CR3: 00000000a7a8d000 CR4: 00000000001406e0 >>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >> >> Hi, Andrew & all, >> >> I can reproduce this locally right after I run the test program, and >> below patch fixed it for me - the test program can run with quite a >> few minutes without crashing again. >> >> Is there a way I can feed this to the syzbot to re-verify this? > > Hi Peter, > > Send the patch. 
At the top of the email, put something like > #syz test <git repo> <branch> want 2 args (repo, branch), got 3 > > It's documented here: > https://github.com/google/syzkaller/blob/master/docs/syzbot.md > > >> Thanks, >> >> 8<--------------------------------------------------------------- >> From 23800bff6fa346a4e9b3806dc0cfeb74498df757 Mon Sep 17 00:00:00 2001 >> From: Peter Xu <peterx@redhat.com> >> Date: Mon, 6 Apr 2020 20:40:13 -0400 >> Subject: [PATCH] mm/mempolicy: Allow lookup_node() to handle fatal signal >> >> lookup_node() uses gup to pin the page and get node information. It >> checks against ret>=0 assuming the page will be filled in. However >> it's also possible that gup will return zero, for example, when the >> thread is quickly killed with a fatal signal. Teach lookup_node() to >> gracefully return an error -EFAULT if it happens. >> >> Reported-by: syzbot+693dc11fcb53120b5559@syzkaller.appspotmail.com >> Fixes: 4426e945df58 ("mm/gup: allow VM_FAULT_RETRY for multiple times") >> Signed-off-by: Peter Xu <peterx@redhat.com> >> --- >> mm/mempolicy.c | 5 ++++- >> 1 file changed, 4 insertions(+), 1 deletion(-) >> >> diff --git a/mm/mempolicy.c b/mm/mempolicy.c >> index 5fb427aed612..1398578db025 100644 >> --- a/mm/mempolicy.c >> +++ b/mm/mempolicy.c >> @@ -902,7 +902,10 @@ static int lookup_node(struct mm_struct *mm, unsigned long addr) >> >> int locked = 1; >> err = get_user_pages_locked(addr & PAGE_MASK, 1, 0, &p, &locked); >> - if (err >= 0) { >> + if (err == 0) { >> + /* E.g. GUP interupted by fatal signal */ >> + err = -EFAULT; >> + } else if (err > 0) { >> err = page_to_nid(p); >> put_page(p); >> } >> > > > > -- > ~Randy > Reported-by: Randy Dunlap <rdunlap@infradead.org> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: BUG: unable to handle kernel paging request in kernel_get_mempolicy 2020-04-07 1:05 ` Randy Dunlap 2020-04-07 1:05 ` syzbot 2020-04-07 1:06 ` syzbot @ 2020-04-07 1:27 ` Peter Xu 2020-04-07 5:26 ` syzbot 2 siblings, 1 reply; 13+ messages in thread From: Peter Xu @ 2020-04-07 1:27 UTC (permalink / raw) To: Randy Dunlap Cc: syzbot, akpm, bgeffon, linux-kernel, linux-mm, syzkaller-bugs, torvalds On Mon, Apr 06, 2020 at 06:05:54PM -0700, Randy Dunlap wrote: > On 4/6/20 5:47 PM, Peter Xu wrote: > > On Mon, Apr 06, 2020 at 11:16:13AM -0700, syzbot wrote: > >> Hello, > >> > >> syzbot found the following crash on: > >> > >> HEAD commit: bef7b2a7 Merge tag 'devicetree-for-5.7' of git://git.kerne.. > >> git tree: upstream > >> console output: https://syzkaller.appspot.com/x/log.txt?x=13966e8fe00000 > >> kernel config: https://syzkaller.appspot.com/x/.config?x=91b674b8f0368e69 > >> dashboard link: https://syzkaller.appspot.com/bug?extid=693dc11fcb53120b5559 > >> compiler: gcc (GCC) 9.0.0 20181231 (experimental) > >> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1738b02be00000 > >> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17d2c76de00000 > >> > >> The bug was bisected to: > >> > >> commit 4426e945df588f2878affddf88a51259200f7e29 > >> Author: Peter Xu <peterx@redhat.com> > >> Date: Thu Apr 2 04:08:49 2020 +0000 > >> > >> mm/gup: allow VM_FAULT_RETRY for multiple times > >> > >> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=14ac4a5de00000 > >> final crash: https://syzkaller.appspot.com/x/report.txt?x=16ac4a5de00000 > >> console output: https://syzkaller.appspot.com/x/log.txt?x=12ac4a5de00000 > >> > >> IMPORTANT: if you fix the bug, please add the following tag to the commit: > >> Reported-by: syzbot+693dc11fcb53120b5559@syzkaller.appspotmail.com > >> Fixes: 4426e945df58 ("mm/gup: allow VM_FAULT_RETRY for multiple times") > >> > >> BUG: unable to handle page fault for address: ffffffff00000000 > >> #PF: supervisor read access in kernel 
mode > >> #PF: error_code(0x0000) - not-present page > >> PGD 987c067 P4D 987c067 PUD 0 > >> Oops: 0000 [#1] PREEMPT SMP KASAN > >> CPU: 1 PID: 7181 Comm: syz-executor616 Not tainted 5.6.0-syzkaller #0 > >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 > >> RIP: 0010:page_to_nid include/linux/mm.h:1245 [inline] > >> RIP: 0010:lookup_node mm/mempolicy.c:906 [inline] > >> RIP: 0010:do_get_mempolicy mm/mempolicy.c:970 [inline] > >> RIP: 0010:kernel_get_mempolicy+0x60e/0xfb0 mm/mempolicy.c:1615 > >> Code: 88 00 07 00 00 e8 b2 35 c5 ff 4c 8b 7c 24 78 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 fb 08 00 00 <49> 8b 1f 48 c7 c7 ff ff ff ff 48 89 de e8 10 37 c5 ff 48 83 fb ff > >> RSP: 0018:ffffc900018d7de8 EFLAGS: 00010246 > >> RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff81adaaf1 > >> RDX: 1fffffffe0000000 RSI: ffffffff81adaafe RDI: 0000000000000005 > >> RBP: 0000000000000000 R08: ffff88808de924c0 R09: ffffed1011bd2499 > >> R10: ffff88808de924c7 R11: ffffed1011bd2498 R12: 0000000000000000 > >> R13: 1ffff9200031afc4 R14: ffffffff89a6df60 R15: ffffffff00000000 > >> FS: 00007f848cd4a700(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000 > >> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > >> CR2: ffffffff00000000 CR3: 00000000a7a8d000 CR4: 00000000001406e0 > >> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > >> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > >> Call Trace: > >> __do_sys_get_mempolicy mm/mempolicy.c:1633 [inline] > >> __se_sys_get_mempolicy mm/mempolicy.c:1629 [inline] > >> __x64_sys_get_mempolicy+0xba/0x150 mm/mempolicy.c:1629 > >> do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295 > >> entry_SYSCALL_64_after_hwframe+0x49/0xb3 > >> RIP: 0033:0x446719 > >> Code: e8 5c b3 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 0b 08 fc ff c3 66 2e 
0f 1f 84 00 00 00 00 > >> RSP: 002b:00007f848cd49db8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ef > >> RAX: ffffffffffffffda RBX: 00000000006dbc28 RCX: 0000000000446719 > >> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 > >> RBP: 00000000006dbc20 R08: 0000000000000003 R09: 0000000000000000 > >> R10: 000000002073b000 R11: 0000000000000246 R12: 00000000006dbc2c > >> R13: 00007ffcfe6ba66f R14: 00007f848cd4a9c0 R15: 20c49ba5e353f7cf > >> Modules linked in: > >> CR2: ffffffff00000000 > >> ---[ end trace 0becf554e06291c3 ]--- > >> RIP: 0010:page_to_nid include/linux/mm.h:1245 [inline] > >> RIP: 0010:lookup_node mm/mempolicy.c:906 [inline] > >> RIP: 0010:do_get_mempolicy mm/mempolicy.c:970 [inline] > >> RIP: 0010:kernel_get_mempolicy+0x60e/0xfb0 mm/mempolicy.c:1615 > >> Code: 88 00 07 00 00 e8 b2 35 c5 ff 4c 8b 7c 24 78 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 fb 08 00 00 <49> 8b 1f 48 c7 c7 ff ff ff ff 48 89 de e8 10 37 c5 ff 48 83 fb ff > >> RSP: 0018:ffffc900018d7de8 EFLAGS: 00010246 > >> RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff81adaaf1 > >> RDX: 1fffffffe0000000 RSI: ffffffff81adaafe RDI: 0000000000000005 > >> RBP: 0000000000000000 R08: ffff88808de924c0 R09: ffffed1011bd2499 > >> R10: ffff88808de924c7 R11: ffffed1011bd2498 R12: 0000000000000000 > >> R13: 1ffff9200031afc4 R14: ffffffff89a6df60 R15: ffffffff00000000 > >> FS: 00007f848cd4a700(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000 > >> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > >> CR2: ffffffff00000000 CR3: 00000000a7a8d000 CR4: 00000000001406e0 > >> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > >> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > > > Hi, Andrew & all, > > > > I can reproduce this locally right after I run the test program, and > > below patch fixed it for me - the test program can run with quite a > > few minutes without crashing again. 
> > > > Is there a way I can feed this to the syzbot to re-verify this? > > Hi Peter, > > Send the patch. At the top of the email, put something like > #syz test <git repo> <branch> > > It's documented here: > https://github.com/google/syzkaller/blob/master/docs/syzbot.md Thanks Randy. #syz test https://github.com/xzpeter/linux.git 23800bff6fa346a4e9b3806dc0cfeb74498df757 > > > > Thanks, > > > > 8<--------------------------------------------------------------- > > From 23800bff6fa346a4e9b3806dc0cfeb74498df757 Mon Sep 17 00:00:00 2001 > > From: Peter Xu <peterx@redhat.com> > > Date: Mon, 6 Apr 2020 20:40:13 -0400 > > Subject: [PATCH] mm/mempolicy: Allow lookup_node() to handle fatal signal > > > > lookup_node() uses gup to pin the page and get node information. It > > checks against ret>=0 assuming the page will be filled in. However > > it's also possible that gup will return zero, for example, when the > > thread is quickly killed with a fatal signal. Teach lookup_node() to > > gracefully return an error -EFAULT if it happens. > > > > Reported-by: syzbot+693dc11fcb53120b5559@syzkaller.appspotmail.com > > Fixes: 4426e945df58 ("mm/gup: allow VM_FAULT_RETRY for multiple times") > > Signed-off-by: Peter Xu <peterx@redhat.com> > > --- > > mm/mempolicy.c | 5 ++++- > > 1 file changed, 4 insertions(+), 1 deletion(-) > > > > diff --git a/mm/mempolicy.c b/mm/mempolicy.c > > index 5fb427aed612..1398578db025 100644 > > --- a/mm/mempolicy.c > > +++ b/mm/mempolicy.c > > @@ -902,7 +902,10 @@ static int lookup_node(struct mm_struct *mm, unsigned long addr) > > > > int locked = 1; > > err = get_user_pages_locked(addr & PAGE_MASK, 1, 0, &p, &locked); > > - if (err >= 0) { > > + if (err == 0) { > > + /* E.g. GUP interupted by fatal signal */ > > + err = -EFAULT; > > + } else if (err > 0) { > > err = page_to_nid(p); > > put_page(p); > > } > > > > > > -- > ~Randy > Reported-by: Randy Dunlap <rdunlap@infradead.org> > -- Peter Xu ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: BUG: unable to handle kernel paging request in kernel_get_mempolicy 2020-04-07 1:27 ` Peter Xu @ 2020-04-07 5:26 ` syzbot 0 siblings, 0 replies; 13+ messages in thread From: syzbot @ 2020-04-07 5:26 UTC (permalink / raw) To: akpm, bgeffon, linux-kernel, linux-mm, peterx, rdunlap, syzkaller-bugs, torvalds Hello, syzbot has tested the proposed patch and the reproducer did not trigger crash: Reported-and-tested-by: syzbot+693dc11fcb53120b5559@syzkaller.appspotmail.com Tested on: commit: 23800bff mm/mempolicy: Allow lookup_node() to handle fatal.. git tree: https://github.com/xzpeter/linux.git kernel config: https://syzkaller.appspot.com/x/.config?x=288d637f7bebfd40 dashboard link: https://syzkaller.appspot.com/bug?extid=693dc11fcb53120b5559 compiler: gcc (GCC) 9.0.0 20181231 (experimental) Note: testing is done by a robot and is best-effort only. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: BUG: unable to handle kernel paging request in kernel_get_mempolicy 2020-04-07 0:47 ` Peter Xu 2020-04-07 1:05 ` Randy Dunlap @ 2020-04-07 1:39 ` Andrew Morton 2020-04-07 1:55 ` Peter Xu 1 sibling, 1 reply; 13+ messages in thread From: Andrew Morton @ 2020-04-07 1:39 UTC (permalink / raw) To: Peter Xu Cc: syzbot, bgeffon, linux-kernel, linux-mm, syzkaller-bugs, torvalds On Mon, 6 Apr 2020 20:47:45 -0400 Peter Xu <peterx@redhat.com> wrote: > >From 23800bff6fa346a4e9b3806dc0cfeb74498df757 Mon Sep 17 00:00:00 2001 > From: Peter Xu <peterx@redhat.com> > Date: Mon, 6 Apr 2020 20:40:13 -0400 > Subject: [PATCH] mm/mempolicy: Allow lookup_node() to handle fatal signal > > lookup_node() uses gup to pin the page and get node information. It > checks against ret>=0 assuming the page will be filled in. However > it's also possible that gup will return zero, for example, when the > thread is quickly killed with a fatal signal. Teach lookup_node() to > gracefully return an error -EFAULT if it happens. > > ... > > --- a/mm/mempolicy.c > +++ b/mm/mempolicy.c > @@ -902,7 +902,10 @@ static int lookup_node(struct mm_struct *mm, unsigned long addr) > > int locked = 1; > err = get_user_pages_locked(addr & PAGE_MASK, 1, 0, &p, &locked); > - if (err >= 0) { > + if (err == 0) { > + /* E.g. GUP interupted by fatal signal */ > + err = -EFAULT; > + } else if (err > 0) { > err = page_to_nid(p); > put_page(p); > } Doh. Thanks. Should it have been -EINTR? ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: BUG: unable to handle kernel paging request in kernel_get_mempolicy 2020-04-07 1:39 ` Andrew Morton @ 2020-04-07 1:55 ` Peter Xu 2020-04-07 2:15 ` Andrew Morton 0 siblings, 1 reply; 13+ messages in thread From: Peter Xu @ 2020-04-07 1:55 UTC (permalink / raw) To: Andrew Morton Cc: syzbot, bgeffon, linux-kernel, linux-mm, syzkaller-bugs, torvalds On Mon, Apr 06, 2020 at 06:39:41PM -0700, Andrew Morton wrote: > On Mon, 6 Apr 2020 20:47:45 -0400 Peter Xu <peterx@redhat.com> wrote: > > > >From 23800bff6fa346a4e9b3806dc0cfeb74498df757 Mon Sep 17 00:00:00 2001 > > From: Peter Xu <peterx@redhat.com> > > Date: Mon, 6 Apr 2020 20:40:13 -0400 > > Subject: [PATCH] mm/mempolicy: Allow lookup_node() to handle fatal signal > > > > lookup_node() uses gup to pin the page and get node information. It > > checks against ret>=0 assuming the page will be filled in. However > > it's also possible that gup will return zero, for example, when the > > thread is quickly killed with a fatal signal. Teach lookup_node() to > > gracefully return an error -EFAULT if it happens. > > > > ... > > > > --- a/mm/mempolicy.c > > +++ b/mm/mempolicy.c > > @@ -902,7 +902,10 @@ static int lookup_node(struct mm_struct *mm, unsigned long addr) > > > > int locked = 1; > > err = get_user_pages_locked(addr & PAGE_MASK, 1, 0, &p, &locked); > > - if (err >= 0) { > > + if (err == 0) { > > + /* E.g. GUP interupted by fatal signal */ > > + err = -EFAULT; > > + } else if (err > 0) { > > err = page_to_nid(p); > > put_page(p); > > } > > Doh. Thanks. > > Should it have been -EINTR? It looks ok to me too. I was returning -EFAULT to follow the same value as get_vaddr_frames() (which is the other caller of get_user_pages_locked()). So far the only path that I found can trigger this is when there's a fatal signal pending right after the gup. If so, the userspace won't have a chance to see the -EINTR (or whatever we return) anyways. Thanks, -- Peter Xu ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: BUG: unable to handle kernel paging request in kernel_get_mempolicy 2020-04-07 1:55 ` Peter Xu @ 2020-04-07 2:15 ` Andrew Morton 2020-04-07 2:42 ` Peter Xu 0 siblings, 1 reply; 13+ messages in thread From: Andrew Morton @ 2020-04-07 2:15 UTC (permalink / raw) To: Peter Xu Cc: syzbot, bgeffon, linux-kernel, linux-mm, syzkaller-bugs, torvalds On Mon, 6 Apr 2020 21:55:35 -0400 Peter Xu <peterx@redhat.com> wrote: > On Mon, Apr 06, 2020 at 06:39:41PM -0700, Andrew Morton wrote: > > On Mon, 6 Apr 2020 20:47:45 -0400 Peter Xu <peterx@redhat.com> wrote: > > > > > >From 23800bff6fa346a4e9b3806dc0cfeb74498df757 Mon Sep 17 00:00:00 2001 > > > From: Peter Xu <peterx@redhat.com> > > > Date: Mon, 6 Apr 2020 20:40:13 -0400 > > > Subject: [PATCH] mm/mempolicy: Allow lookup_node() to handle fatal signal > > > > > > lookup_node() uses gup to pin the page and get node information. It > > > checks against ret>=0 assuming the page will be filled in. However > > > it's also possible that gup will return zero, for example, when the > > > thread is quickly killed with a fatal signal. Teach lookup_node() to > > > gracefully return an error -EFAULT if it happens. > > > > > > ... > > > > > > --- a/mm/mempolicy.c > > > +++ b/mm/mempolicy.c > > > @@ -902,7 +902,10 @@ static int lookup_node(struct mm_struct *mm, unsigned long addr) > > > > > > int locked = 1; > > > err = get_user_pages_locked(addr & PAGE_MASK, 1, 0, &p, &locked); > > > - if (err >= 0) { > > > + if (err == 0) { > > > + /* E.g. GUP interupted by fatal signal */ > > > + err = -EFAULT; > > > + } else if (err > 0) { > > > err = page_to_nid(p); > > > put_page(p); > > > } > > > > Doh. Thanks. > > > > Should it have been -EINTR? > > It looks ok to me too. I was returning -EFAULT to follow the same > value as get_vaddr_frames() (which is the other caller of > get_user_pages_locked()). So far the only path that I found can > trigger this is when there's a fatal signal pending right after the > gup. 
If so, the userspace won't have a chance to see the -EINTR (or > whatever we return) anyways. Yup. I guess we're a victim of get_user_pages()'s screwy return value conventions - the caller cannot distinguish between invalid-addr and fatal-signal. Which makes one wonder why lookup_node() ever worked. What happens if get_mempolicy(MPOL_F_NODE) is passed a wild userspace address? ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: BUG: unable to handle kernel paging request in kernel_get_mempolicy 2020-04-07 2:15 ` Andrew Morton @ 2020-04-07 2:42 ` Peter Xu 2020-04-07 8:27 ` Dmitry Vyukov 0 siblings, 1 reply; 13+ messages in thread From: Peter Xu @ 2020-04-07 2:42 UTC (permalink / raw) To: Andrew Morton Cc: syzbot, bgeffon, linux-kernel, linux-mm, syzkaller-bugs, torvalds On Mon, Apr 06, 2020 at 07:15:34PM -0700, Andrew Morton wrote: > On Mon, 6 Apr 2020 21:55:35 -0400 Peter Xu <peterx@redhat.com> wrote: > > > On Mon, Apr 06, 2020 at 06:39:41PM -0700, Andrew Morton wrote: > > > On Mon, 6 Apr 2020 20:47:45 -0400 Peter Xu <peterx@redhat.com> wrote: > > > > > > > >From 23800bff6fa346a4e9b3806dc0cfeb74498df757 Mon Sep 17 00:00:00 2001 > > > > From: Peter Xu <peterx@redhat.com> > > > > Date: Mon, 6 Apr 2020 20:40:13 -0400 > > > > Subject: [PATCH] mm/mempolicy: Allow lookup_node() to handle fatal signal > > > > > > > > lookup_node() uses gup to pin the page and get node information. It > > > > checks against ret>=0 assuming the page will be filled in. However > > > > it's also possible that gup will return zero, for example, when the > > > > thread is quickly killed with a fatal signal. Teach lookup_node() to > > > > gracefully return an error -EFAULT if it happens. > > > > > > > > ... > > > > > > > > --- a/mm/mempolicy.c > > > > +++ b/mm/mempolicy.c > > > > @@ -902,7 +902,10 @@ static int lookup_node(struct mm_struct *mm, unsigned long addr) > > > > > > > > int locked = 1; > > > > err = get_user_pages_locked(addr & PAGE_MASK, 1, 0, &p, &locked); > > > > - if (err >= 0) { > > > > + if (err == 0) { > > > > + /* E.g. GUP interupted by fatal signal */ > > > > + err = -EFAULT; > > > > + } else if (err > 0) { > > > > err = page_to_nid(p); > > > > put_page(p); > > > > } > > > > > > Doh. Thanks. > > > > > > Should it have been -EINTR? > > > > It looks ok to me too. I was returning -EFAULT to follow the same > > value as get_vaddr_frames() (which is the other caller of > > get_user_pages_locked()). 
So far the only path that I found can > > trigger this is when there's a fatal signal pending right after the > > gup. If so, the userspace won't have a chance to see the -EINTR (or > > whatever we return) anyways. > > Yup. I guess we're a victim of get_user_pages()'s screwy return value > conventions - the caller cannot distinguish between invalid-addr and > fatal-signal. Indeed. > > Which makes one wonder why lookup_node() ever worked. What happens if > get_mempolicy(MPOL_F_NODE) is passed a wild userspace address? > I'm not familiar with mempolicy at all, but do you mean MPOL_F_NODE with MPOL_F_ADDR? Asked since iiuc if only MPOL_F_NODE is specified, the kernel should not use the userspace addr at all (which seems to be the thing we do now). get_mempolicy(MPOL_F_NODE|MPOL_F_ADDR) seems to return -EFAULT as expected, though I agree maybe it would still be nicer to differentiate the two cases. Thanks, -- Peter Xu ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: BUG: unable to handle kernel paging request in kernel_get_mempolicy 2020-04-07 2:42 ` Peter Xu @ 2020-04-07 8:27 ` Dmitry Vyukov 2020-04-07 15:59 ` Peter Xu 0 siblings, 1 reply; 13+ messages in thread From: Dmitry Vyukov @ 2020-04-07 8:27 UTC (permalink / raw) To: Peter Xu Cc: Andrew Morton, syzbot, Brian Geffon, LKML, Linux-MM, syzkaller-bugs, Linus Torvalds, Andrey Konovalov On Tue, Apr 7, 2020 at 4:43 AM Peter Xu <peterx@redhat.com> wrote: > > On Mon, Apr 06, 2020 at 07:15:34PM -0700, Andrew Morton wrote: > > On Mon, 6 Apr 2020 21:55:35 -0400 Peter Xu <peterx@redhat.com> wrote: > > > > > On Mon, Apr 06, 2020 at 06:39:41PM -0700, Andrew Morton wrote: > > > > On Mon, 6 Apr 2020 20:47:45 -0400 Peter Xu <peterx@redhat.com> wrote: > > > > > > > > > >From 23800bff6fa346a4e9b3806dc0cfeb74498df757 Mon Sep 17 00:00:00 2001 > > > > > From: Peter Xu <peterx@redhat.com> > > > > > Date: Mon, 6 Apr 2020 20:40:13 -0400 > > > > > Subject: [PATCH] mm/mempolicy: Allow lookup_node() to handle fatal signal > > > > > > > > > > lookup_node() uses gup to pin the page and get node information. It > > > > > checks against ret>=0 assuming the page will be filled in. However > > > > > it's also possible that gup will return zero, for example, when the > > > > > thread is quickly killed with a fatal signal. Teach lookup_node() to > > > > > gracefully return an error -EFAULT if it happens. > > > > > > > > > > ... > > > > > > > > > > --- a/mm/mempolicy.c > > > > > +++ b/mm/mempolicy.c > > > > > @@ -902,7 +902,10 @@ static int lookup_node(struct mm_struct *mm, unsigned long addr) > > > > > > > > > > int locked = 1; > > > > > err = get_user_pages_locked(addr & PAGE_MASK, 1, 0, &p, &locked); > > > > > - if (err >= 0) { > > > > > + if (err == 0) { > > > > > + /* E.g. GUP interupted by fatal signal */ > > > > > + err = -EFAULT; > > > > > + } else if (err > 0) { > > > > > err = page_to_nid(p); > > > > > put_page(p); > > > > > } > > > > > > > > Doh. Thanks. 
> > > > > > > > Should it have been -EINTR? > > > > > > It looks ok to me too. I was returning -EFAULT to follow the same > > > value as get_vaddr_frames() (which is the other caller of > > > get_user_pages_locked()). So far the only path that I found can > > > trigger this is when there's a fatal signal pending right after the > > > gup. If so, the userspace won't have a chance to see the -EINTR (or > > > whatever we return) anyways. > > > > Yup. I guess we're a victim of get_user_pages()'s screwy return value > > conventions - the caller cannot distinguish between invalid-addr and > > fatal-signal. > > Indeed. > > > > > Which makes one wonder why lookup_node() ever worked. What happens if > > get_mempolicy(MPOL_F_NODE) is passed a wild userspace address? > > > > I'm not familiar with mempolicy at all, but do you mean MPOL_F_NODE > with MPOL_F_ADDR? Asked since iiuc if only MPOL_F_NODE is specified, > the kernel should not use the userspace addr at all (which seems to be > the thing we do now). get_mempolicy(MPOL_F_NODE|MPOL_F_ADDR) seems to > return -EFAULT as expected, though I agree maybe it would still be > nicer to differentiate the two cases. Am I reading this correctly that we put an uninitialized struct page* in this case? If so, with stack spraying this looks like an "interesting" bug. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: BUG: unable to handle kernel paging request in kernel_get_mempolicy 2020-04-07 8:27 ` Dmitry Vyukov @ 2020-04-07 15:59 ` Peter Xu 0 siblings, 0 replies; 13+ messages in thread From: Peter Xu @ 2020-04-07 15:59 UTC (permalink / raw) To: Dmitry Vyukov Cc: Andrew Morton, syzbot, Brian Geffon, LKML, Linux-MM, syzkaller-bugs, Linus Torvalds, Andrey Konovalov On Tue, Apr 07, 2020 at 10:27:15AM +0200, Dmitry Vyukov wrote: > On Tue, Apr 7, 2020 at 4:43 AM Peter Xu <peterx@redhat.com> wrote: > > > > On Mon, Apr 06, 2020 at 07:15:34PM -0700, Andrew Morton wrote: > > > On Mon, 6 Apr 2020 21:55:35 -0400 Peter Xu <peterx@redhat.com> wrote: > > > > > > > On Mon, Apr 06, 2020 at 06:39:41PM -0700, Andrew Morton wrote: > > > > > On Mon, 6 Apr 2020 20:47:45 -0400 Peter Xu <peterx@redhat.com> wrote: > > > > > > > > > > > >From 23800bff6fa346a4e9b3806dc0cfeb74498df757 Mon Sep 17 00:00:00 2001 > > > > > > From: Peter Xu <peterx@redhat.com> > > > > > > Date: Mon, 6 Apr 2020 20:40:13 -0400 > > > > > > Subject: [PATCH] mm/mempolicy: Allow lookup_node() to handle fatal signal > > > > > > > > > > > > lookup_node() uses gup to pin the page and get node information. It > > > > > > checks against ret>=0 assuming the page will be filled in. However > > > > > > it's also possible that gup will return zero, for example, when the > > > > > > thread is quickly killed with a fatal signal. Teach lookup_node() to > > > > > > gracefully return an error -EFAULT if it happens. > > > > > > > > > > > > ... > > > > > > > > > > > > --- a/mm/mempolicy.c > > > > > > +++ b/mm/mempolicy.c > > > > > > @@ -902,7 +902,10 @@ static int lookup_node(struct mm_struct *mm, unsigned long addr) > > > > > > > > > > > > int locked = 1; > > > > > > err = get_user_pages_locked(addr & PAGE_MASK, 1, 0, &p, &locked); > > > > > > - if (err >= 0) { > > > > > > + if (err == 0) { > > > > > > + /* E.g. 
GUP interupted by fatal signal */ > > > > > > + err = -EFAULT; > > > > > > + } else if (err > 0) { > > > > > > err = page_to_nid(p); > > > > > > put_page(p); > > > > > > } > > > > > > > > > > Doh. Thanks. > > > > > > > > > > Should it have been -EINTR? > > > > > > > > It looks ok to me too. I was returning -EFAULT to follow the same > > > > value as get_vaddr_frames() (which is the other caller of > > > > get_user_pages_locked()). So far the only path that I found can > > > > trigger this is when there's a fatal signal pending right after the > > > > gup. If so, the userspace won't have a chance to see the -EINTR (or > > > > whatever we return) anyways. > > > > > > Yup. I guess we're a victim of get_user_pages()'s screwy return value > > > conventions - the caller cannot distinguish between invalid-addr and > > > fatal-signal. > > > > Indeed. > > > > > > > > Which makes one wonder why lookup_node() ever worked. What happens if > > > get_mempolicy(MPOL_F_NODE) is passed a wild userspace address? > > > > > > > I'm not familiar with mempolicy at all, but do you mean MPOL_F_NODE > > with MPOL_F_ADDR? Asked since iiuc if only MPOL_F_NODE is specified, > > the kernel should not use the userspace addr at all (which seems to be > > the thing we do now). get_mempolicy(MPOL_F_NODE|MPOL_F_ADDR) seems to > > return -EFAULT as expected, though I agree maybe it would still be > > nicer to differentiate the two cases. > > Am I reading this correctly that we put an uninitialized struct page* in > this case? If so, with stack spraying this looks like an "interesting" > bug. Yeah, so far it should be fine, but ideally I guess we should also initialize the page pointer to NULL in lookup_node(), to avoid any potential for exploitation. Maybe we could squash this into the fix if that is still possible. Thanks, -- Peter Xu ^ permalink raw reply [flat|nested] 13+ messages in thread
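The return-value handling discussed in this thread can be modelled in plain userspace C. What follows is only a sketch under stated assumptions, not the kernel code: `fake_page`, `fake_gup()` and `lookup_node_sketch()` are hypothetical names invented for illustration, and `fake_gup()` merely imitates the convention described above for get_user_pages_locked() (negative errno, zero when nothing was pinned, positive count on success). It also starts the page pointer at NULL, as suggested in the last message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: it only carries a node id. */
struct fake_page {
	int nid;
};

/*
 * Hypothetical stand-in for get_user_pages_locked(). It returns the
 * simulated result unchanged and fills in *out only when at least one
 * page was "pinned", which is the property the original bug tripped over.
 */
static long fake_gup(long simulated_ret, struct fake_page *backing,
		     struct fake_page **out)
{
	assert(out != NULL);
	if (simulated_ret > 0)
		*out = backing;
	return simulated_ret;
}

/* Mirrors the control flow of the patched lookup_node(), nothing more. */
int lookup_node_sketch(long simulated_ret)
{
	struct fake_page backing = { .nid = 3 };
	struct fake_page *p = NULL;	/* never left uninitialised */
	long err = fake_gup(simulated_ret, &backing, &p);

	if (err == 0)
		return -EFAULT;		/* e.g. interrupted by a fatal signal */
	if (err > 0)
		return p->nid;		/* page_to_nid(p); put_page(p) in the kernel */
	return (int)err;		/* pass the errno from GUP through */
}
```

With the pre-patch check (`err >= 0`), the zero case would fall into the branch that dereferences `p`, which is the faulting read in the report at the top of the thread. In this model, `lookup_node_sketch(0)` yields `-EFAULT`, `lookup_node_sketch(1)` yields the node id of the fake page, and a negative input is returned as is.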