linux-mm.kvack.org archive mirror
* BUG: unable to handle kernel paging request in kernel_get_mempolicy
@ 2020-04-06 18:16 syzbot
  2020-04-07  0:47 ` Peter Xu
  0 siblings, 1 reply; 13+ messages in thread
From: syzbot @ 2020-04-06 18:16 UTC
  To: akpm, bgeffon, linux-kernel, linux-mm, peterx, syzkaller-bugs, torvalds

Hello,

syzbot found the following crash on:

HEAD commit:    bef7b2a7 Merge tag 'devicetree-for-5.7' of git://git.kerne..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=13966e8fe00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=91b674b8f0368e69
dashboard link: https://syzkaller.appspot.com/bug?extid=693dc11fcb53120b5559
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1738b02be00000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=17d2c76de00000

The bug was bisected to:

commit 4426e945df588f2878affddf88a51259200f7e29
Author: Peter Xu <peterx@redhat.com>
Date:   Thu Apr 2 04:08:49 2020 +0000

    mm/gup: allow VM_FAULT_RETRY for multiple times

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=14ac4a5de00000
final crash:    https://syzkaller.appspot.com/x/report.txt?x=16ac4a5de00000
console output: https://syzkaller.appspot.com/x/log.txt?x=12ac4a5de00000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+693dc11fcb53120b5559@syzkaller.appspotmail.com
Fixes: 4426e945df58 ("mm/gup: allow VM_FAULT_RETRY for multiple times")

BUG: unable to handle page fault for address: ffffffff00000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 987c067 P4D 987c067 PUD 0 
Oops: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 7181 Comm: syz-executor616 Not tainted 5.6.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:page_to_nid include/linux/mm.h:1245 [inline]
RIP: 0010:lookup_node mm/mempolicy.c:906 [inline]
RIP: 0010:do_get_mempolicy mm/mempolicy.c:970 [inline]
RIP: 0010:kernel_get_mempolicy+0x60e/0xfb0 mm/mempolicy.c:1615
Code: 88 00 07 00 00 e8 b2 35 c5 ff 4c 8b 7c 24 78 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 fb 08 00 00 <49> 8b 1f 48 c7 c7 ff ff ff ff 48 89 de e8 10 37 c5 ff 48 83 fb ff
RSP: 0018:ffffc900018d7de8 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff81adaaf1
RDX: 1fffffffe0000000 RSI: ffffffff81adaafe RDI: 0000000000000005
RBP: 0000000000000000 R08: ffff88808de924c0 R09: ffffed1011bd2499
R10: ffff88808de924c7 R11: ffffed1011bd2498 R12: 0000000000000000
R13: 1ffff9200031afc4 R14: ffffffff89a6df60 R15: ffffffff00000000
FS:  00007f848cd4a700(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffff00000000 CR3: 00000000a7a8d000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 __do_sys_get_mempolicy mm/mempolicy.c:1633 [inline]
 __se_sys_get_mempolicy mm/mempolicy.c:1629 [inline]
 __x64_sys_get_mempolicy+0xba/0x150 mm/mempolicy.c:1629
 do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
 entry_SYSCALL_64_after_hwframe+0x49/0xb3
RIP: 0033:0x446719
Code: e8 5c b3 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 0b 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f848cd49db8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ef
RAX: ffffffffffffffda RBX: 00000000006dbc28 RCX: 0000000000446719
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 00000000006dbc20 R08: 0000000000000003 R09: 0000000000000000
R10: 000000002073b000 R11: 0000000000000246 R12: 00000000006dbc2c
R13: 00007ffcfe6ba66f R14: 00007f848cd4a9c0 R15: 20c49ba5e353f7cf
Modules linked in:
CR2: ffffffff00000000
---[ end trace 0becf554e06291c3 ]---
RIP: 0010:page_to_nid include/linux/mm.h:1245 [inline]
RIP: 0010:lookup_node mm/mempolicy.c:906 [inline]
RIP: 0010:do_get_mempolicy mm/mempolicy.c:970 [inline]
RIP: 0010:kernel_get_mempolicy+0x60e/0xfb0 mm/mempolicy.c:1615
Code: 88 00 07 00 00 e8 b2 35 c5 ff 4c 8b 7c 24 78 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 fb 08 00 00 <49> 8b 1f 48 c7 c7 ff ff ff ff 48 89 de e8 10 37 c5 ff 48 83 fb ff
RSP: 0018:ffffc900018d7de8 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff81adaaf1
RDX: 1fffffffe0000000 RSI: ffffffff81adaafe RDI: 0000000000000005
RBP: 0000000000000000 R08: ffff88808de924c0 R09: ffffed1011bd2499
R10: ffff88808de924c7 R11: ffffed1011bd2498 R12: 0000000000000000
R13: 1ffff9200031afc4 R14: ffffffff89a6df60 R15: ffffffff00000000
FS:  00007f848cd4a700(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffff00000000 CR3: 00000000a7a8d000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
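
A note on reading the register dump above: RAX holds dffffc0000000000 and RDX holds 1fffffffe0000000, which is R15 (ffffffff00000000, also the faulting address in CR2) shifted right by 3. That pattern is consistent with the shadow-memory check that KASAN-instrumented code performs before a load. The arithmetic can be reproduced with the minimal sketch below. It assumes the usual x86-64 generic KASAN mapping, shadow = (addr >> 3) + offset, with the offset taken from RAX; the macro and function names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Assumption: x86-64 generic KASAN shadow offset; it matches RAX in the dump. */
#define SHADOW_OFFSET_ASSUMED 0xdffffc0000000000ULL
/* Assumption: one shadow byte covers 8 bytes of memory (shift by 3). */
#define SHADOW_SCALE_SHIFT_ASSUMED 3

/* Index into shadow memory for addr; this is the value seen in RDX. */
static uint64_t shadow_index(uint64_t addr)
{
	return addr >> SHADOW_SCALE_SHIFT_ASSUMED;
}

/* Address of the shadow byte that the instrumented code inspects. */
static uint64_t shadow_addr(uint64_t addr)
{
	return shadow_index(addr) + SHADOW_OFFSET_ASSUMED;
}
```

Feeding the R15 value into shadow_index() gives the RDX value from the dump, which suggests that R15 is the pointer the faulting instruction read through.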


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches



* Re: BUG: unable to handle kernel paging request in kernel_get_mempolicy
  2020-04-06 18:16 BUG: unable to handle kernel paging request in kernel_get_mempolicy syzbot
@ 2020-04-07  0:47 ` Peter Xu
  2020-04-07  1:05   ` Randy Dunlap
  2020-04-07  1:39   ` Andrew Morton
  0 siblings, 2 replies; 13+ messages in thread
From: Peter Xu @ 2020-04-07  0:47 UTC
  To: syzbot; +Cc: akpm, bgeffon, linux-kernel, linux-mm, syzkaller-bugs, torvalds

On Mon, Apr 06, 2020 at 11:16:13AM -0700, syzbot wrote:
> [...]

Hi, Andrew & all,

I can reproduce this locally right after running the test program, and
the patch below fixes it for me: with it applied, the test program runs
for quite a few minutes without crashing.

Is there a way to feed this patch to syzbot so it can re-verify the fix?

Thanks,

8<---------------------------------------------------------------
From 23800bff6fa346a4e9b3806dc0cfeb74498df757 Mon Sep 17 00:00:00 2001
From: Peter Xu <peterx@redhat.com>
Date: Mon, 6 Apr 2020 20:40:13 -0400
Subject: [PATCH] mm/mempolicy: Allow lookup_node() to handle fatal signal

lookup_node() uses gup to pin the page and get node information.  It
checks for err >= 0, assuming the page will have been filled in.
However, gup can also return zero, for example when the thread is
quickly killed by a fatal signal.  Teach lookup_node() to gracefully
return -EFAULT when that happens.

Reported-by: syzbot+693dc11fcb53120b5559@syzkaller.appspotmail.com
Fixes: 4426e945df58 ("mm/gup: allow VM_FAULT_RETRY for multiple times")
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 mm/mempolicy.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 5fb427aed612..1398578db025 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -902,7 +902,10 @@ static int lookup_node(struct mm_struct *mm, unsigned long addr)
 
 	int locked = 1;
 	err = get_user_pages_locked(addr & PAGE_MASK, 1, 0, &p, &locked);
-	if (err >= 0) {
+	if (err == 0) {
+		/* E.g. GUP interrupted by fatal signal */
+		err = -EFAULT;
+	} else if (err > 0) {
 		err = page_to_nid(p);
 		put_page(p);
 	}
-- 
2.24.1
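
For readers following the patch, the three-way handling of the gup return value can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct fake_page`, `fake_gup()` and `lookup_node_sketch()` are hypothetical stand-ins invented here, not kernel APIs. The real code calls get_user_pages_locked(), page_to_nid() and put_page() as shown in the diff.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; it only carries a node id. */
struct fake_page {
	int nid;
};

/*
 * Hypothetical stand-in for a single-page gup call.
 * Returns 1 if a page was pinned, 0 if nothing was pinned
 * (e.g. interrupted by a fatal signal), or a negative errno.
 * *pagep is written only when a page was actually pinned.
 */
static long fake_gup(long outcome, struct fake_page *backing,
		     struct fake_page **pagep)
{
	if (outcome > 0)
		*pagep = backing;
	return outcome;
}

/* Mirrors the control flow of the patched lookup_node(). */
static int lookup_node_sketch(long outcome)
{
	struct fake_page backing = { .nid = 3 };
	struct fake_page *p = NULL;	/* stays unset when nothing is pinned */
	long err;

	err = fake_gup(outcome, &backing, &p);
	if (err == 0) {
		/* Nothing pinned: p must not be dereferenced. */
		return -EFAULT;
	} else if (err > 0) {
		/* One page pinned: reading its node id is safe. */
		return p->nid;
	}
	/* A negative errno from gup is passed through unchanged. */
	return (int)err;
}
```

With the pre-patch check (err >= 0), the zero case would fall into the branch that dereferences p even though gup never filled it in, which matches the failure mode the commit message describes.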


-- 
Peter Xu




* Re: BUG: unable to handle kernel paging request in kernel_get_mempolicy
  2020-04-07  0:47 ` Peter Xu
@ 2020-04-07  1:05   ` Randy Dunlap
  2020-04-07  1:05     ` syzbot
                       ` (2 more replies)
  2020-04-07  1:39   ` Andrew Morton
  1 sibling, 3 replies; 13+ messages in thread
From: Randy Dunlap @ 2020-04-07  1:05 UTC
  To: Peter Xu, syzbot
  Cc: akpm, bgeffon, linux-kernel, linux-mm, syzkaller-bugs, torvalds

On 4/6/20 5:47 PM, Peter Xu wrote:
> On Mon, Apr 06, 2020 at 11:16:13AM -0700, syzbot wrote:
>> [...]
> 
> Hi, Andrew & all,
> 
> I can reproduce this locally right after running the test program, and
> the patch below fixes it for me: with it applied, the test program runs
> for quite a few minutes without crashing.
> 
> Is there a way to feed this patch to syzbot so it can re-verify the fix?

Hi Peter,

Send the patch. At the top of the email, put something like
#syz test <git repo> <branch>

It's documented here:
https://github.com/google/syzkaller/blob/master/docs/syzbot.md


> Thanks,
> 
> 8<---------------------------------------------------------------
> From 23800bff6fa346a4e9b3806dc0cfeb74498df757 Mon Sep 17 00:00:00 2001
> From: Peter Xu <peterx@redhat.com>
> Date: Mon, 6 Apr 2020 20:40:13 -0400
> Subject: [PATCH] mm/mempolicy: Allow lookup_node() to handle fatal signal
> 
> lookup_node() uses gup to pin the page and get node information.  It
> checks for err >= 0, assuming the page will have been filled in.
> However, gup can also return zero, for example when the thread is
> quickly killed by a fatal signal.  Teach lookup_node() to gracefully
> return -EFAULT when that happens.
> 
> Reported-by: syzbot+693dc11fcb53120b5559@syzkaller.appspotmail.com
> Fixes: 4426e945df58 ("mm/gup: allow VM_FAULT_RETRY for multiple times")
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  mm/mempolicy.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 5fb427aed612..1398578db025 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -902,7 +902,10 @@ static int lookup_node(struct mm_struct *mm, unsigned long addr)
>  
>  	int locked = 1;
>  	err = get_user_pages_locked(addr & PAGE_MASK, 1, 0, &p, &locked);
> -	if (err >= 0) {
> +	if (err == 0) {
> +		/* E.g. GUP interrupted by fatal signal */
> +		err = -EFAULT;
> +	} else if (err > 0) {
>  		err = page_to_nid(p);
>  		put_page(p);
>  	}
> 



-- 
~Randy
Reported-by: Randy Dunlap <rdunlap@infradead.org>



* Re: Re: BUG: unable to handle kernel paging request in kernel_get_mempolicy
  2020-04-07  1:05   ` Randy Dunlap
@ 2020-04-07  1:05     ` syzbot
  2020-04-07  1:06     ` syzbot
  2020-04-07  1:27     ` Peter Xu
  2 siblings, 0 replies; 13+ messages in thread
From: syzbot @ 2020-04-07  1:05 UTC
  To: Randy Dunlap
  Cc: akpm, bgeffon, linux-kernel, linux-mm, peterx, rdunlap,
	syzkaller-bugs, torvalds

> On 4/6/20 5:47 PM, Peter Xu wrote:
>> On Mon, Apr 06, 2020 at 11:16:13AM -0700, syzbot wrote:
>>> [...]
>> 
>> Hi, Andrew & all,
>> 
>> I can reproduce this locally right after running the test program, and
>> the patch below fixes it for me: with it applied, the test program runs
>> for quite a few minutes without crashing.
>> 
>> Is there a way to feed this patch to syzbot so it can re-verify the fix?
>
> Hi Peter,
>
> Send the patch. At the top of the email, put something like
> #syz test <git repo> <branch>

want 2 args (repo, branch), got 3

>
> It's documented here:
> https://github.com/google/syzkaller/blob/master/docs/syzbot.md
>
>
>> Thanks,
>> 
>> 8<---------------------------------------------------------------
>> From 23800bff6fa346a4e9b3806dc0cfeb74498df757 Mon Sep 17 00:00:00 2001
>> From: Peter Xu <peterx@redhat.com>
>> Date: Mon, 6 Apr 2020 20:40:13 -0400
>> Subject: [PATCH] mm/mempolicy: Allow lookup_node() to handle fatal signal
>> 
>> lookup_node() uses gup to pin the page and get node information.  It
>> checks for err >= 0, assuming the page will have been filled in.
>> However, gup can also return zero, for example when the thread is
>> quickly killed by a fatal signal.  Teach lookup_node() to gracefully
>> return -EFAULT when that happens.
>> 
>> Reported-by: syzbot+693dc11fcb53120b5559@syzkaller.appspotmail.com
>> Fixes: 4426e945df58 ("mm/gup: allow VM_FAULT_RETRY for multiple times")
>> Signed-off-by: Peter Xu <peterx@redhat.com>
>> ---
>>  mm/mempolicy.c | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>> 
>> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
>> index 5fb427aed612..1398578db025 100644
>> --- a/mm/mempolicy.c
>> +++ b/mm/mempolicy.c
>> @@ -902,7 +902,10 @@ static int lookup_node(struct mm_struct *mm, unsigned long addr)
>>  
>>  	int locked = 1;
>>  	err = get_user_pages_locked(addr & PAGE_MASK, 1, 0, &p, &locked);
>> -	if (err >= 0) {
>> +	if (err == 0) {
>> +		/* E.g. GUP interrupted by fatal signal */
>> +		err = -EFAULT;
>> +	} else if (err > 0) {
>>  		err = page_to_nid(p);
>>  		put_page(p);
>>  	}
>> 
>
>
>
> -- 
> ~Randy
> Reported-by: Randy Dunlap <rdunlap@infradead.org>
>



* Re: Re: BUG: unable to handle kernel paging request in kernel_get_mempolicy
  2020-04-07  1:05   ` Randy Dunlap
  2020-04-07  1:05     ` syzbot
@ 2020-04-07  1:06     ` syzbot
  2020-04-07  1:27     ` Peter Xu
  2 siblings, 0 replies; 13+ messages in thread
From: syzbot @ 2020-04-07  1:06 UTC
  To: Randy Dunlap
  Cc: akpm, bgeffon, linux-kernel, linux-mm, peterx, rdunlap,
	syzkaller-bugs, torvalds

> On 4/6/20 5:47 PM, Peter Xu wrote:
>> On Mon, Apr 06, 2020 at 11:16:13AM -0700, syzbot wrote:
>>> [...]
>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> CR2: ffffffff00000000 CR3: 00000000a7a8d000 CR4: 00000000001406e0
>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> 
>> Hi, Andrew & all,
>> 
>> I can reproduce this locally right after I run the test program, and
>> the patch below fixed it for me - the test program can run for quite a
>> few minutes without crashing again.
>> 
>> Is there a way I can feed this to the syzbot to re-verify this?
>
> Hi Peter,
>
> Send the patch. At the top of the email, put something like
> #syz test <git repo> <branch>

want 2 args (repo, branch), got 3

>
> It's documented here:
> https://github.com/google/syzkaller/blob/master/docs/syzbot.md
>
>
>> Thanks,
>> 
>> 8<---------------------------------------------------------------
>> From 23800bff6fa346a4e9b3806dc0cfeb74498df757 Mon Sep 17 00:00:00 2001
>> From: Peter Xu <peterx@redhat.com>
>> Date: Mon, 6 Apr 2020 20:40:13 -0400
>> Subject: [PATCH] mm/mempolicy: Allow lookup_node() to handle fatal signal
>> 
>> lookup_node() uses gup to pin the page and get node information.  It
>> checks against ret>=0 assuming the page will be filled in.  However
>> it's also possible that gup will return zero, for example, when the
>> thread is quickly killed with a fatal signal.  Teach lookup_node() to
>> gracefully return an error -EFAULT if it happens.
>> 
>> Reported-by: syzbot+693dc11fcb53120b5559@syzkaller.appspotmail.com
>> Fixes: 4426e945df58 ("mm/gup: allow VM_FAULT_RETRY for multiple times")
>> Signed-off-by: Peter Xu <peterx@redhat.com>
>> ---
>>  mm/mempolicy.c | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>> 
>> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
>> index 5fb427aed612..1398578db025 100644
>> --- a/mm/mempolicy.c
>> +++ b/mm/mempolicy.c
>> @@ -902,7 +902,10 @@ static int lookup_node(struct mm_struct *mm, unsigned long addr)
>>  
>>  	int locked = 1;
>>  	err = get_user_pages_locked(addr & PAGE_MASK, 1, 0, &p, &locked);
>> -	if (err >= 0) {
>> +	if (err == 0) {
>> +		/* E.g. GUP interrupted by fatal signal */
>> +		err = -EFAULT;
>> +	} else if (err > 0) {
>>  		err = page_to_nid(p);
>>  		put_page(p);
>>  	}
>> 
>
>
>
> -- 
> ~Randy
> Reported-by: Randy Dunlap <rdunlap@infradead.org>



* Re: BUG: unable to handle kernel paging request in kernel_get_mempolicy
  2020-04-07  1:05   ` Randy Dunlap
  2020-04-07  1:05     ` syzbot
  2020-04-07  1:06     ` syzbot
@ 2020-04-07  1:27     ` Peter Xu
  2020-04-07  5:26       ` syzbot
  2 siblings, 1 reply; 13+ messages in thread
From: Peter Xu @ 2020-04-07  1:27 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: syzbot, akpm, bgeffon, linux-kernel, linux-mm, syzkaller-bugs, torvalds

On Mon, Apr 06, 2020 at 06:05:54PM -0700, Randy Dunlap wrote:
> On 4/6/20 5:47 PM, Peter Xu wrote:
> > On Mon, Apr 06, 2020 at 11:16:13AM -0700, syzbot wrote:
> >> Hello,
> >>
> >> syzbot found the following crash on:
> >>
> >> HEAD commit:    bef7b2a7 Merge tag 'devicetree-for-5.7' of git://git.kerne..
> >> git tree:       upstream
> >> console output: https://syzkaller.appspot.com/x/log.txt?x=13966e8fe00000
> >> kernel config:  https://syzkaller.appspot.com/x/.config?x=91b674b8f0368e69
> >> dashboard link: https://syzkaller.appspot.com/bug?extid=693dc11fcb53120b5559
> >> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> >> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1738b02be00000
> >> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=17d2c76de00000
> >>
> >> The bug was bisected to:
> >>
> >> commit 4426e945df588f2878affddf88a51259200f7e29
> >> Author: Peter Xu <peterx@redhat.com>
> >> Date:   Thu Apr 2 04:08:49 2020 +0000
> >>
> >>     mm/gup: allow VM_FAULT_RETRY for multiple times
> >>
> >> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=14ac4a5de00000
> >> final crash:    https://syzkaller.appspot.com/x/report.txt?x=16ac4a5de00000
> >> console output: https://syzkaller.appspot.com/x/log.txt?x=12ac4a5de00000
> >>
> >> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> >> Reported-by: syzbot+693dc11fcb53120b5559@syzkaller.appspotmail.com
> >> Fixes: 4426e945df58 ("mm/gup: allow VM_FAULT_RETRY for multiple times")
> >>
> >> BUG: unable to handle page fault for address: ffffffff00000000
> >> #PF: supervisor read access in kernel mode
> >> #PF: error_code(0x0000) - not-present page
> >> PGD 987c067 P4D 987c067 PUD 0 
> >> Oops: 0000 [#1] PREEMPT SMP KASAN
> >> CPU: 1 PID: 7181 Comm: syz-executor616 Not tainted 5.6.0-syzkaller #0
> >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> >> RIP: 0010:page_to_nid include/linux/mm.h:1245 [inline]
> >> RIP: 0010:lookup_node mm/mempolicy.c:906 [inline]
> >> RIP: 0010:do_get_mempolicy mm/mempolicy.c:970 [inline]
> >> RIP: 0010:kernel_get_mempolicy+0x60e/0xfb0 mm/mempolicy.c:1615
> >> Code: 88 00 07 00 00 e8 b2 35 c5 ff 4c 8b 7c 24 78 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 fb 08 00 00 <49> 8b 1f 48 c7 c7 ff ff ff ff 48 89 de e8 10 37 c5 ff 48 83 fb ff
> >> RSP: 0018:ffffc900018d7de8 EFLAGS: 00010246
> >> RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff81adaaf1
> >> RDX: 1fffffffe0000000 RSI: ffffffff81adaafe RDI: 0000000000000005
> >> RBP: 0000000000000000 R08: ffff88808de924c0 R09: ffffed1011bd2499
> >> R10: ffff88808de924c7 R11: ffffed1011bd2498 R12: 0000000000000000
> >> R13: 1ffff9200031afc4 R14: ffffffff89a6df60 R15: ffffffff00000000
> >> FS:  00007f848cd4a700(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000
> >> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> CR2: ffffffff00000000 CR3: 00000000a7a8d000 CR4: 00000000001406e0
> >> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >> Call Trace:
> >>  __do_sys_get_mempolicy mm/mempolicy.c:1633 [inline]
> >>  __se_sys_get_mempolicy mm/mempolicy.c:1629 [inline]
> >>  __x64_sys_get_mempolicy+0xba/0x150 mm/mempolicy.c:1629
> >>  do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
> >>  entry_SYSCALL_64_after_hwframe+0x49/0xb3
> >> RIP: 0033:0x446719
> >> Code: e8 5c b3 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 0b 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00
> >> RSP: 002b:00007f848cd49db8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ef
> >> RAX: ffffffffffffffda RBX: 00000000006dbc28 RCX: 0000000000446719
> >> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> >> RBP: 00000000006dbc20 R08: 0000000000000003 R09: 0000000000000000
> >> R10: 000000002073b000 R11: 0000000000000246 R12: 00000000006dbc2c
> >> R13: 00007ffcfe6ba66f R14: 00007f848cd4a9c0 R15: 20c49ba5e353f7cf
> >> Modules linked in:
> >> CR2: ffffffff00000000
> >> ---[ end trace 0becf554e06291c3 ]---
> >> RIP: 0010:page_to_nid include/linux/mm.h:1245 [inline]
> >> RIP: 0010:lookup_node mm/mempolicy.c:906 [inline]
> >> RIP: 0010:do_get_mempolicy mm/mempolicy.c:970 [inline]
> >> RIP: 0010:kernel_get_mempolicy+0x60e/0xfb0 mm/mempolicy.c:1615
> >> Code: 88 00 07 00 00 e8 b2 35 c5 ff 4c 8b 7c 24 78 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 fb 08 00 00 <49> 8b 1f 48 c7 c7 ff ff ff ff 48 89 de e8 10 37 c5 ff 48 83 fb ff
> >> RSP: 0018:ffffc900018d7de8 EFLAGS: 00010246
> >> RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff81adaaf1
> >> RDX: 1fffffffe0000000 RSI: ffffffff81adaafe RDI: 0000000000000005
> >> RBP: 0000000000000000 R08: ffff88808de924c0 R09: ffffed1011bd2499
> >> R10: ffff88808de924c7 R11: ffffed1011bd2498 R12: 0000000000000000
> >> R13: 1ffff9200031afc4 R14: ffffffff89a6df60 R15: ffffffff00000000
> >> FS:  00007f848cd4a700(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000
> >> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> CR2: ffffffff00000000 CR3: 00000000a7a8d000 CR4: 00000000001406e0
> >> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > 
> > Hi, Andrew & all,
> > 
> > I can reproduce this locally right after I run the test program, and
> > the patch below fixed it for me - the test program can run for quite a
> > few minutes without crashing again.
> > 
> > Is there a way I can feed this to the syzbot to re-verify this?
> 
> Hi Peter,
> 
> Send the patch. At the top of the email, put something like
> #syz test <git repo> <branch>
> 
> It's documented here:
> https://github.com/google/syzkaller/blob/master/docs/syzbot.md

Thanks Randy.

#syz test https://github.com/xzpeter/linux.git 23800bff6fa346a4e9b3806dc0cfeb74498df757

> 
> 
> > Thanks,
> > 
> > 8<---------------------------------------------------------------
> > From 23800bff6fa346a4e9b3806dc0cfeb74498df757 Mon Sep 17 00:00:00 2001
> > From: Peter Xu <peterx@redhat.com>
> > Date: Mon, 6 Apr 2020 20:40:13 -0400
> > Subject: [PATCH] mm/mempolicy: Allow lookup_node() to handle fatal signal
> > 
> > lookup_node() uses gup to pin the page and get node information.  It
> > checks against ret>=0 assuming the page will be filled in.  However
> > it's also possible that gup will return zero, for example, when the
> > thread is quickly killed with a fatal signal.  Teach lookup_node() to
> > gracefully return an error -EFAULT if it happens.
> > 
> > Reported-by: syzbot+693dc11fcb53120b5559@syzkaller.appspotmail.com
> > Fixes: 4426e945df58 ("mm/gup: allow VM_FAULT_RETRY for multiple times")
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  mm/mempolicy.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> > index 5fb427aed612..1398578db025 100644
> > --- a/mm/mempolicy.c
> > +++ b/mm/mempolicy.c
> > @@ -902,7 +902,10 @@ static int lookup_node(struct mm_struct *mm, unsigned long addr)
> >  
> >  	int locked = 1;
> >  	err = get_user_pages_locked(addr & PAGE_MASK, 1, 0, &p, &locked);
> > -	if (err >= 0) {
> > +	if (err == 0) {
> > +		/* E.g. GUP interrupted by fatal signal */
> > +		err = -EFAULT;
> > +	} else if (err > 0) {
> >  		err = page_to_nid(p);
> >  		put_page(p);
> >  	}
> > 
> 
> 
> 
> -- 
> ~Randy
> Reported-by: Randy Dunlap <rdunlap@infradead.org>
> 

-- 
Peter Xu




* Re: BUG: unable to handle kernel paging request in kernel_get_mempolicy
  2020-04-07  0:47 ` Peter Xu
  2020-04-07  1:05   ` Randy Dunlap
@ 2020-04-07  1:39   ` Andrew Morton
  2020-04-07  1:55     ` Peter Xu
  1 sibling, 1 reply; 13+ messages in thread
From: Andrew Morton @ 2020-04-07  1:39 UTC (permalink / raw)
  To: Peter Xu
  Cc: syzbot, bgeffon, linux-kernel, linux-mm, syzkaller-bugs, torvalds

On Mon, 6 Apr 2020 20:47:45 -0400 Peter Xu <peterx@redhat.com> wrote:

> >From 23800bff6fa346a4e9b3806dc0cfeb74498df757 Mon Sep 17 00:00:00 2001
> From: Peter Xu <peterx@redhat.com>
> Date: Mon, 6 Apr 2020 20:40:13 -0400
> Subject: [PATCH] mm/mempolicy: Allow lookup_node() to handle fatal signal
> 
> lookup_node() uses gup to pin the page and get node information.  It
> checks against ret>=0 assuming the page will be filled in.  However
> it's also possible that gup will return zero, for example, when the
> thread is quickly killed with a fatal signal.  Teach lookup_node() to
> gracefully return an error -EFAULT if it happens.
> 
> ...
>
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -902,7 +902,10 @@ static int lookup_node(struct mm_struct *mm, unsigned long addr)
>  
>  	int locked = 1;
>  	err = get_user_pages_locked(addr & PAGE_MASK, 1, 0, &p, &locked);
> -	if (err >= 0) {
> +	if (err == 0) {
> +		/* E.g. GUP interrupted by fatal signal */
> +		err = -EFAULT;
> +	} else if (err > 0) {
>  		err = page_to_nid(p);
>  		put_page(p);
>  	}

Doh.  Thanks.

Should it have been -EINTR?



* Re: BUG: unable to handle kernel paging request in kernel_get_mempolicy
  2020-04-07  1:39   ` Andrew Morton
@ 2020-04-07  1:55     ` Peter Xu
  2020-04-07  2:15       ` Andrew Morton
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Xu @ 2020-04-07  1:55 UTC (permalink / raw)
  To: Andrew Morton
  Cc: syzbot, bgeffon, linux-kernel, linux-mm, syzkaller-bugs, torvalds

On Mon, Apr 06, 2020 at 06:39:41PM -0700, Andrew Morton wrote:
> On Mon, 6 Apr 2020 20:47:45 -0400 Peter Xu <peterx@redhat.com> wrote:
> 
> > >From 23800bff6fa346a4e9b3806dc0cfeb74498df757 Mon Sep 17 00:00:00 2001
> > From: Peter Xu <peterx@redhat.com>
> > Date: Mon, 6 Apr 2020 20:40:13 -0400
> > Subject: [PATCH] mm/mempolicy: Allow lookup_node() to handle fatal signal
> > 
> > lookup_node() uses gup to pin the page and get node information.  It
> > checks against ret>=0 assuming the page will be filled in.  However
> > it's also possible that gup will return zero, for example, when the
> > thread is quickly killed with a fatal signal.  Teach lookup_node() to
> > gracefully return an error -EFAULT if it happens.
> > 
> > ...
> >
> > --- a/mm/mempolicy.c
> > +++ b/mm/mempolicy.c
> > @@ -902,7 +902,10 @@ static int lookup_node(struct mm_struct *mm, unsigned long addr)
> >  
> >  	int locked = 1;
> >  	err = get_user_pages_locked(addr & PAGE_MASK, 1, 0, &p, &locked);
> > -	if (err >= 0) {
> > +	if (err == 0) {
> > +		/* E.g. GUP interrupted by fatal signal */
> > +		err = -EFAULT;
> > +	} else if (err > 0) {
> >  		err = page_to_nid(p);
> >  		put_page(p);
> >  	}
> 
> Doh.  Thanks.
> 
> Should it have been -EINTR?

-EINTR looks ok to me too.  I returned -EFAULT to match
get_vaddr_frames() (which is the other caller of
get_user_pages_locked()).  So far the only path I have found that can
trigger this is a fatal signal pending right after the gup.  If so,
userspace won't have a chance to see the -EINTR (or whatever we
return) anyway.

Thanks,

-- 
Peter Xu




* Re: BUG: unable to handle kernel paging request in kernel_get_mempolicy
  2020-04-07  1:55     ` Peter Xu
@ 2020-04-07  2:15       ` Andrew Morton
  2020-04-07  2:42         ` Peter Xu
  0 siblings, 1 reply; 13+ messages in thread
From: Andrew Morton @ 2020-04-07  2:15 UTC (permalink / raw)
  To: Peter Xu
  Cc: syzbot, bgeffon, linux-kernel, linux-mm, syzkaller-bugs, torvalds

On Mon, 6 Apr 2020 21:55:35 -0400 Peter Xu <peterx@redhat.com> wrote:

> On Mon, Apr 06, 2020 at 06:39:41PM -0700, Andrew Morton wrote:
> > On Mon, 6 Apr 2020 20:47:45 -0400 Peter Xu <peterx@redhat.com> wrote:
> > 
> > > >From 23800bff6fa346a4e9b3806dc0cfeb74498df757 Mon Sep 17 00:00:00 2001
> > > From: Peter Xu <peterx@redhat.com>
> > > Date: Mon, 6 Apr 2020 20:40:13 -0400
> > > Subject: [PATCH] mm/mempolicy: Allow lookup_node() to handle fatal signal
> > > 
> > > lookup_node() uses gup to pin the page and get node information.  It
> > > checks against ret>=0 assuming the page will be filled in.  However
> > > it's also possible that gup will return zero, for example, when the
> > > thread is quickly killed with a fatal signal.  Teach lookup_node() to
> > > gracefully return an error -EFAULT if it happens.
> > > 
> > > ...
> > >
> > > --- a/mm/mempolicy.c
> > > +++ b/mm/mempolicy.c
> > > @@ -902,7 +902,10 @@ static int lookup_node(struct mm_struct *mm, unsigned long addr)
> > >  
> > >  	int locked = 1;
> > >  	err = get_user_pages_locked(addr & PAGE_MASK, 1, 0, &p, &locked);
> > > -	if (err >= 0) {
> > > +	if (err == 0) {
> > > +		/* E.g. GUP interrupted by fatal signal */
> > > +		err = -EFAULT;
> > > +	} else if (err > 0) {
> > >  		err = page_to_nid(p);
> > >  		put_page(p);
> > >  	}
> > 
> > Doh.  Thanks.
> > 
> > Should it have been -EINTR?
> 
> It looks ok to me too.  I was returning -EFAULT to follow the same
> value as get_vaddr_frames() (which is the other caller of
> get_user_pages_locked()).  So far the only path that I found can
> trigger this is when there's a fatal signal pending right after the
> gup.  If so, the userspace won't have a chance to see the -EINTR (or
> whatever we return) anyways.

Yup.  I guess we're a victim of get_user_pages()'s screwy return value
conventions - the caller cannot distinguish between invalid-addr and
fatal-signal.
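
To make the convention concrete, here is a rough userspace sketch (not
kernel code) of how the old and new checks treat a return value of
zero.  The helper names are made up for illustration; the only thing
taken from the patch is the ">= 0" versus "> 0" comparison and the
mapping of zero to -EFAULT.

```c
#include <errno.h>

/* Old check in lookup_node(): a return of 0 is treated like success. */
static int old_check_touches_page(long ret)
{
	return ret >= 0;
}

/* Check after the patch: only a positive count means a page was pinned. */
static int new_check_touches_page(long ret)
{
	return ret > 0;
}

/*
 * Hypothetical helper: the error the patched code reports when it does
 * not use the page.  Returns 0 when a page was pinned.
 */
static long new_check_error(long ret)
{
	if (ret == 0)
		return -EFAULT;	/* nothing pinned, e.g. fatal signal */
	return ret < 0 ? ret : 0;
}
```

With a return of zero the old check goes on to use the page pointer
even though nothing was written to it, while the new check reports
-EFAULT instead.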

Which makes one wonder why lookup_node() ever worked.  What happens if
get_mempolicy(MPOL_F_NODE) is passed a wild userspace address?




* Re: BUG: unable to handle kernel paging request in kernel_get_mempolicy
  2020-04-07  2:15       ` Andrew Morton
@ 2020-04-07  2:42         ` Peter Xu
  2020-04-07  8:27           ` Dmitry Vyukov
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Xu @ 2020-04-07  2:42 UTC (permalink / raw)
  To: Andrew Morton
  Cc: syzbot, bgeffon, linux-kernel, linux-mm, syzkaller-bugs, torvalds

On Mon, Apr 06, 2020 at 07:15:34PM -0700, Andrew Morton wrote:
> On Mon, 6 Apr 2020 21:55:35 -0400 Peter Xu <peterx@redhat.com> wrote:
> 
> > On Mon, Apr 06, 2020 at 06:39:41PM -0700, Andrew Morton wrote:
> > > On Mon, 6 Apr 2020 20:47:45 -0400 Peter Xu <peterx@redhat.com> wrote:
> > > 
> > > > >From 23800bff6fa346a4e9b3806dc0cfeb74498df757 Mon Sep 17 00:00:00 2001
> > > > From: Peter Xu <peterx@redhat.com>
> > > > Date: Mon, 6 Apr 2020 20:40:13 -0400
> > > > Subject: [PATCH] mm/mempolicy: Allow lookup_node() to handle fatal signal
> > > > 
> > > > lookup_node() uses gup to pin the page and get node information.  It
> > > > checks against ret>=0 assuming the page will be filled in.  However
> > > > it's also possible that gup will return zero, for example, when the
> > > > thread is quickly killed with a fatal signal.  Teach lookup_node() to
> > > > gracefully return an error -EFAULT if it happens.
> > > > 
> > > > ...
> > > >
> > > > --- a/mm/mempolicy.c
> > > > +++ b/mm/mempolicy.c
> > > > @@ -902,7 +902,10 @@ static int lookup_node(struct mm_struct *mm, unsigned long addr)
> > > >  
> > > >  	int locked = 1;
> > > >  	err = get_user_pages_locked(addr & PAGE_MASK, 1, 0, &p, &locked);
> > > > -	if (err >= 0) {
> > > > +	if (err == 0) {
> > > > +		/* E.g. GUP interrupted by fatal signal */
> > > > +		err = -EFAULT;
> > > > +	} else if (err > 0) {
> > > >  		err = page_to_nid(p);
> > > >  		put_page(p);
> > > >  	}
> > > 
> > > Doh.  Thanks.
> > > 
> > > Should it have been -EINTR?
> > 
> > It looks ok to me too.  I was returning -EFAULT to follow the same
> > value as get_vaddr_frames() (which is the other caller of
> > get_user_pages_locked()).  So far the only path that I found can
> > trigger this is when there's a fatal signal pending right after the
> > gup.  If so, the userspace won't have a chance to see the -EINTR (or
> > whatever we return) anyways.
> 
> Yup.  I guess we're a victim of get_user_pages()'s screwy return value
> conventions - the caller cannot distinguish between invalid-addr and
> fatal-signal.

Indeed.

> 
> Which makes one wonder why lookup_node() ever worked.  What happens if
> get_mempolicy(MPOL_F_NODE) is passed a wild userspace address?
> 

I'm not familiar with mempolicy at all, but do you mean MPOL_F_NODE
with MPOL_F_ADDR?  I ask because, iiuc, if only MPOL_F_NODE is specified
the kernel should not use the userspace addr at all (which seems to be
what we do now).  get_mempolicy(MPOL_F_NODE|MPOL_F_ADDR) seems to
return -EFAULT as expected, though I agree it might still be nicer to
differentiate the two cases.

Thanks,

-- 
Peter Xu




* Re: BUG: unable to handle kernel paging request in kernel_get_mempolicy
  2020-04-07  1:27     ` Peter Xu
@ 2020-04-07  5:26       ` syzbot
  0 siblings, 0 replies; 13+ messages in thread
From: syzbot @ 2020-04-07  5:26 UTC (permalink / raw)
  To: akpm, bgeffon, linux-kernel, linux-mm, peterx, rdunlap,
	syzkaller-bugs, torvalds

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger crash:

Reported-and-tested-by: syzbot+693dc11fcb53120b5559@syzkaller.appspotmail.com

Tested on:

commit:         23800bff mm/mempolicy: Allow lookup_node() to handle fatal..
git tree:       https://github.com/xzpeter/linux.git
kernel config:  https://syzkaller.appspot.com/x/.config?x=288d637f7bebfd40
dashboard link: https://syzkaller.appspot.com/bug?extid=693dc11fcb53120b5559
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)

Note: testing is done by a robot and is best-effort only.



* Re: BUG: unable to handle kernel paging request in kernel_get_mempolicy
  2020-04-07  2:42         ` Peter Xu
@ 2020-04-07  8:27           ` Dmitry Vyukov
  2020-04-07 15:59             ` Peter Xu
  0 siblings, 1 reply; 13+ messages in thread
From: Dmitry Vyukov @ 2020-04-07  8:27 UTC (permalink / raw)
  To: Peter Xu
  Cc: Andrew Morton, syzbot, Brian Geffon, LKML, Linux-MM,
	syzkaller-bugs, Linus Torvalds, Andrey Konovalov

On Tue, Apr 7, 2020 at 4:43 AM Peter Xu <peterx@redhat.com> wrote:
>
> On Mon, Apr 06, 2020 at 07:15:34PM -0700, Andrew Morton wrote:
> > On Mon, 6 Apr 2020 21:55:35 -0400 Peter Xu <peterx@redhat.com> wrote:
> >
> > > On Mon, Apr 06, 2020 at 06:39:41PM -0700, Andrew Morton wrote:
> > > > On Mon, 6 Apr 2020 20:47:45 -0400 Peter Xu <peterx@redhat.com> wrote:
> > > >
> > > > > >From 23800bff6fa346a4e9b3806dc0cfeb74498df757 Mon Sep 17 00:00:00 2001
> > > > > From: Peter Xu <peterx@redhat.com>
> > > > > Date: Mon, 6 Apr 2020 20:40:13 -0400
> > > > > Subject: [PATCH] mm/mempolicy: Allow lookup_node() to handle fatal signal
> > > > >
> > > > > lookup_node() uses gup to pin the page and get node information.  It
> > > > > checks against ret>=0 assuming the page will be filled in.  However
> > > > > it's also possible that gup will return zero, for example, when the
> > > > > thread is quickly killed with a fatal signal.  Teach lookup_node() to
> > > > > gracefully return an error -EFAULT if it happens.
> > > > >
> > > > > ...
> > > > >
> > > > > --- a/mm/mempolicy.c
> > > > > +++ b/mm/mempolicy.c
> > > > > @@ -902,7 +902,10 @@ static int lookup_node(struct mm_struct *mm, unsigned long addr)
> > > > >
> > > > >         int locked = 1;
> > > > >         err = get_user_pages_locked(addr & PAGE_MASK, 1, 0, &p, &locked);
> > > > > -       if (err >= 0) {
> > > > > +       if (err == 0) {
> > > > > +               /* E.g. GUP interupted by fatal signal */
> > > > > +               err = -EFAULT;
> > > > > +       } else if (err > 0) {
> > > > >                 err = page_to_nid(p);
> > > > >                 put_page(p);
> > > > >         }
> > > >
> > > > Doh.  Thanks.
> > > >
> > > > Should it have been -EINTR?
> > >
> > > It looks ok to me too.  I was returning -EFAULT to follow the same
> > > value as get_vaddr_frames() (which is the other caller of
> > > get_user_pages_locked()).  So far the only path that I found can
> > > trigger this is when there's a fatal signal pending right after the
> > > gup.  If so, the userspace won't have a chance to see the -EINTR (or
> > > whatever we return) anyways.
> >
> > Yup.  I guess we're a victim of get_user_pages()'s screwy return value
> > conventions - the caller cannot distinguish between invalid-addr and
> > fatal-signal.
>
> Indeed.
>
> >
> > Which makes one wonder why lookup_node() ever worked.  What happens if
> > get_mempolicy(MPOL_F_NODE) is passed a wild userspace address?
> >
>
> I'm not familiar with mempolicy at all, but do you mean MPOL_F_NODE
> with MPOL_F_ADDR?  Asked since iiuc if only MPOL_F_NODE is specified,
> the kernel should not use the userspace addr at all (which seems to be
> the thing we do now).  get_mempolicy(MPOL_F_NODE|MPOL_F_ADDR) seems to
> return -EFAULT as expected, though I agree maybe it would still be
> nicer to differentiate the two cases.

Am I reading this correctly that we put an uninitialized struct page* in
this case? If so, with stack spraying this looks like an "interesting"
bug.



* Re: BUG: unable to handle kernel paging request in kernel_get_mempolicy
  2020-04-07  8:27           ` Dmitry Vyukov
@ 2020-04-07 15:59             ` Peter Xu
  0 siblings, 0 replies; 13+ messages in thread
From: Peter Xu @ 2020-04-07 15:59 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Andrew Morton, syzbot, Brian Geffon, LKML, Linux-MM,
	syzkaller-bugs, Linus Torvalds, Andrey Konovalov

On Tue, Apr 07, 2020 at 10:27:15AM +0200, Dmitry Vyukov wrote:
> On Tue, Apr 7, 2020 at 4:43 AM Peter Xu <peterx@redhat.com> wrote:
> >
> > On Mon, Apr 06, 2020 at 07:15:34PM -0700, Andrew Morton wrote:
> > > On Mon, 6 Apr 2020 21:55:35 -0400 Peter Xu <peterx@redhat.com> wrote:
> > >
> > > > On Mon, Apr 06, 2020 at 06:39:41PM -0700, Andrew Morton wrote:
> > > > > On Mon, 6 Apr 2020 20:47:45 -0400 Peter Xu <peterx@redhat.com> wrote:
> > > > >
> > > > > > >From 23800bff6fa346a4e9b3806dc0cfeb74498df757 Mon Sep 17 00:00:00 2001
> > > > > > From: Peter Xu <peterx@redhat.com>
> > > > > > Date: Mon, 6 Apr 2020 20:40:13 -0400
> > > > > > Subject: [PATCH] mm/mempolicy: Allow lookup_node() to handle fatal signal
> > > > > >
> > > > > > lookup_node() uses gup to pin the page and get node information.  It
> > > > > > checks against ret>=0 assuming the page will be filled in.  However
> > > > > > it's also possible that gup will return zero, for example, when the
> > > > > > thread is quickly killed with a fatal signal.  Teach lookup_node() to
> > > > > > gracefully return an error -EFAULT if it happens.
> > > > > >
> > > > > > ...
> > > > > >
> > > > > > --- a/mm/mempolicy.c
> > > > > > +++ b/mm/mempolicy.c
> > > > > > @@ -902,7 +902,10 @@ static int lookup_node(struct mm_struct *mm, unsigned long addr)
> > > > > >
> > > > > >         int locked = 1;
> > > > > >         err = get_user_pages_locked(addr & PAGE_MASK, 1, 0, &p, &locked);
> > > > > > -       if (err >= 0) {
> > > > > > +       if (err == 0) {
> > > > > > +               /* E.g. GUP interupted by fatal signal */
> > > > > > +               err = -EFAULT;
> > > > > > +       } else if (err > 0) {
> > > > > >                 err = page_to_nid(p);
> > > > > >                 put_page(p);
> > > > > >         }
> > > > >
> > > > > Doh.  Thanks.
> > > > >
> > > > > Should it have been -EINTR?
> > > >
> > > > It looks ok to me too.  I was returning -EFAULT to follow the same
> > > > value as get_vaddr_frames() (which is the other caller of
> > > > get_user_pages_locked()).  So far the only path that I found can
> > > > trigger this is when there's a fatal signal pending right after the
> > > > gup.  If so, the userspace won't have a chance to see the -EINTR (or
> > > > whatever we return) anyways.
> > >
> > > Yup.  I guess we're a victim of get_user_pages()'s screwy return value
> > > conventions - the caller cannot distinguish between invalid-addr and
> > > fatal-signal.
> >
> > Indeed.
> >
> > >
> > > Which makes one wonder why lookup_node() ever worked.  What happens if
> > > get_mempolicy(MPOL_F_NODE) is passed a wild userspace address?
> > >
> >
> > I'm not familiar with mempolicy at all, but do you mean MPOL_F_NODE
> > with MPOL_F_ADDR?  Asked since iiuc if only MPOL_F_NODE is specified,
> > the kernel should not use the userspace addr at all (which seems to be
> > the thing we do now).  get_mempolicy(MPOL_F_NODE|MPOL_F_ADDR) seems to
> > return -EFAULT as expected, though I agree maybe it would still be
> > nicer to differentiate the two cases.
> 
> Am I reading this correctly that we put an uninitialized struct page* in
> this case? If so, with stack spraying this looks like an "interesting"
> bug.

Yeah, so far it should be fine, but... ideally I guess we should also
initialize the page pointer to NULL in lookup_node() to reduce the risk
of exploitation.  Maybe we could squash this into the fix if that is
still possible.
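
Roughly what I have in mind, as a userspace sketch only (not the
actual kernel change).  struct page_sketch, gup_stub() and
lookup_node_sketch() are hypothetical stand-ins; the stub only
assumes that the output pointer is left untouched when nothing is
pinned.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; only the node id matters here. */
struct page_sketch {
	int nid;
};

/*
 * Hypothetical stub for a GUP-style call.  "outcome" is what the call
 * returns: a negative errno, 0 (nothing pinned), or 1 (one page pinned).
 * *pagep is written only when a page is pinned.
 */
static long gup_stub(long outcome, struct page_sketch *backing,
		     struct page_sketch **pagep)
{
	if (outcome > 0)
		*pagep = backing;
	return outcome;
}

/* Returns the node id, or a negative errno if no page was pinned. */
static int lookup_node_sketch(long outcome, struct page_sketch *backing)
{
	struct page_sketch *p = NULL;	/* never left holding stack garbage */
	long ret = gup_stub(outcome, backing, &p);

	if (ret < 0)
		return (int)ret;
	if (ret == 0 || p == NULL)
		return -EFAULT;
	return p->nid;
}
```

With the pointer starting out as NULL, a missed check would fault on a
known address rather than on whatever happened to be on the stack.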

Thanks,

-- 
Peter Xu



