* [syzbot] [mm?] general protection fault in hpage_collapse_scan_file
@ 2024-04-09 10:16 syzbot
  2024-04-09 23:46 ` Andrew Morton
  0 siblings, 1 reply; 5+ messages in thread
From: syzbot @ 2024-04-09 10:16 UTC (permalink / raw)
  To: akpm, linux-kernel, linux-mm, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    8568bb2ccc27 Add linux-next specific files for 20240405
git tree:       linux-next
console+strace: https://syzkaller.appspot.com/x/log.txt?x=152f4805180000
kernel config:  https://syzkaller.appspot.com/x/.config?x=48ca5acf8d2eb3bc
dashboard link: https://syzkaller.appspot.com/bug?extid=57adb2a4b9d206521bc2
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1268258d180000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1256598d180000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/1d120b5e779c/disk-8568bb2c.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/a89e3589a585/vmlinux-8568bb2c.xz
kernel image: https://storage.googleapis.com/syzbot-assets/045e657c0e0d/bzImage-8568bb2c.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+57adb2a4b9d206521bc2@syzkaller.appspotmail.com

Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 1 PID: 5080 Comm: syz-executor931 Not tainted 6.9.0-rc2-next-20240405-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
RIP: 0010:PageTail include/linux/page-flags.h:284 [inline]
RIP: 0010:const_folio_flags include/linux/page-flags.h:312 [inline]
RIP: 0010:folio_test_locked include/linux/page-flags.h:512 [inline]
RIP: 0010:collapse_file mm/khugepaged.c:1907 [inline]
RIP: 0010:hpage_collapse_scan_file+0x1ea3/0x63e0 mm/khugepaged.c:2292
Code: 48 8d bc 24 30 02 00 00 e8 9a a1 f9 ff 4c 8b bc 24 30 02 00 00 49 8d 5f 08 48 89 d8 48 c1 e8 03 48 b9 00 00 00 00 00 fc ff df <80> 3c 08 00 74 08 48 89 df e8 6f a1 f9 ff 48 8b 1b 48 89 de 48 83
RSP: 0018:ffffc9000340f420 EFLAGS: 00010247
RAX: 0000000000000000 RBX: 0000000000000006 RCX: dffffc0000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffc9000340f7d0 R08: ffffffff81cb9212 R09: 1ffffffff1f5802d
R10: dffffc0000000000 R11: fffffbfff1f5802e R12: ffffc9000340f6b0
R13: 0000000000000000 R14: ffffc9000340f5f0 R15: fffffffffffffffe
FS:  000055558dc18380(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000240 CR3: 000000006ee40000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 madvise_collapse+0x561/0xcc0 mm/khugepaged.c:2736
 madvise_vma_behavior mm/madvise.c:1081 [inline]
 madvise_walk_vmas mm/madvise.c:1255 [inline]
 do_madvise+0xc3c/0x44a0 mm/madvise.c:1441
 __do_sys_madvise mm/madvise.c:1456 [inline]
 __se_sys_madvise mm/madvise.c:1454 [inline]
 __x64_sys_madvise+0xa6/0xc0 mm/madvise.c:1454
 do_syscall_64+0xfb/0x240
 entry_SYSCALL_64_after_hwframe+0x72/0x7a
RIP: 0033:0x7f4cdb4292e9
Code: 48 83 c4 28 c3 e8 37 17 00 00 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fff99038cd8 EFLAGS: 00000246 ORIG_RAX: 000000000000001c
RAX: ffffffffffffffda RBX: 00007fff99038eb8 RCX: 00007f4cdb4292e9
RDX: 0000000000000019 RSI: 0000000000600722 RDI: 0000000020000000
RBP: 00007f4cdb49c610 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000001
R13: 00007fff99038ea8 R14: 0000000000000001 R15: 0000000000000001
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:PageTail include/linux/page-flags.h:284 [inline]
RIP: 0010:const_folio_flags include/linux/page-flags.h:312 [inline]
RIP: 0010:folio_test_locked include/linux/page-flags.h:512 [inline]
RIP: 0010:collapse_file mm/khugepaged.c:1907 [inline]
RIP: 0010:hpage_collapse_scan_file+0x1ea3/0x63e0 mm/khugepaged.c:2292
Code: 48 8d bc 24 30 02 00 00 e8 9a a1 f9 ff 4c 8b bc 24 30 02 00 00 49 8d 5f 08 48 89 d8 48 c1 e8 03 48 b9 00 00 00 00 00 fc ff df <80> 3c 08 00 74 08 48 89 df e8 6f a1 f9 ff 48 8b 1b 48 89 de 48 83
RSP: 0018:ffffc9000340f420 EFLAGS: 00010247
RAX: 0000000000000000 RBX: 0000000000000006 RCX: dffffc0000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffc9000340f7d0 R08: ffffffff81cb9212 R09: 1ffffffff1f5802d
R10: dffffc0000000000 R11: fffffbfff1f5802e R12: ffffc9000340f6b0
R13: 0000000000000000 R14: ffffc9000340f5f0 R15: fffffffffffffffe
FS:  000055558dc18380(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000240 CR3: 000000006ee40000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
----------------
Code disassembly (best guess):
   0:	48 8d bc 24 30 02 00 	lea    0x230(%rsp),%rdi
   7:	00
   8:	e8 9a a1 f9 ff       	call   0xfff9a1a7
   d:	4c 8b bc 24 30 02 00 	mov    0x230(%rsp),%r15
  14:	00
  15:	49 8d 5f 08          	lea    0x8(%r15),%rbx
  19:	48 89 d8             	mov    %rbx,%rax
  1c:	48 c1 e8 03          	shr    $0x3,%rax
  20:	48 b9 00 00 00 00 00 	movabs $0xdffffc0000000000,%rcx
  27:	fc ff df
* 2a:	80 3c 08 00          	cmpb   $0x0,(%rax,%rcx,1) <-- trapping instruction
  2e:	74 08                	je     0x38
  30:	48 89 df             	mov    %rbx,%rdi
  33:	e8 6f a1 f9 ff       	call   0xfff9a1a7
  38:	48 8b 1b             	mov    (%rbx),%rbx
  3b:	48 89 de             	mov    %rbx,%rsi
  3e:	48                   	rex.W
  3f:	83                   	.byte 0x83
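
The trapping instruction above is a KASAN shadow-memory check rather than
the access itself. As a minimal sketch, assuming x86-64 generic KASAN with
the shadow offset visible in the movabs (0xdffffc0000000000), one shadow
byte per 8 bytes of memory, and 4-level paging, the arithmetic can be
replayed in userspace. The helper names below are hypothetical and are not
kernel functions.

```c
#include <stdint.h>

/* Values read off the disassembly: "shr $0x3" and the movabs constant. */
#define SHADOW_SCALE_SHIFT 3
#define SHADOW_OFFSET      0xdffffc0000000000ULL

/* Hypothetical helper: shadow byte address that guards "addr". */
static inline uint64_t kasan_shadow_addr(uint64_t addr)
{
	return (addr >> SHADOW_SCALE_SHIFT) + SHADOW_OFFSET;
}

/*
 * Hypothetical helper: with 4-level paging (assumed here), an address is
 * canonical when bits 63..47 are all zeros or all ones.
 */
static inline int is_canonical_48(uint64_t addr)
{
	uint64_t top = addr >> 47;	/* the upper 17 bits */

	return top == 0 || top == 0x1ffff;
}
```

With R15 = 0xfffffffffffffffe, "lea 0x8(%r15),%rbx" wraps around to RBX = 6,
and kasan_shadow_addr(6) is 0xdffffc0000000000, which is_canonical_48()
rejects. That matches both the non-canonical address in the Oops line and
the [0x0-0x7] range KASAN reports.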


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


* Re: [syzbot] [mm?] general protection fault in hpage_collapse_scan_file
  2024-04-09 10:16 [syzbot] [mm?] general protection fault in hpage_collapse_scan_file syzbot
@ 2024-04-09 23:46 ` Andrew Morton
  2024-04-10  0:32   ` Zach O'Keefe
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2024-04-09 23:46 UTC (permalink / raw)
  To: syzbot; +Cc: linux-kernel, linux-mm, syzkaller-bugs

On Tue, 09 Apr 2024 03:16:20 -0700 syzbot <syzbot+57adb2a4b9d206521bc2@syzkaller.appspotmail.com> wrote:

> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    8568bb2ccc27 Add linux-next specific files for 20240405
> git tree:       linux-next
> console+strace: https://syzkaller.appspot.com/x/log.txt?x=152f4805180000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=48ca5acf8d2eb3bc
> dashboard link: https://syzkaller.appspot.com/bug?extid=57adb2a4b9d206521bc2
> compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1268258d180000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1256598d180000

Help.  From a quick look this seems to be claiming that collapse_file()
got to 

	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);

with folio==NULL, but the code looks solid regarding this.

Given that we have a reproducer, can we expect the bot to perform a
bisection for us?



* Re: [syzbot] [mm?] general protection fault in hpage_collapse_scan_file
  2024-04-09 23:46 ` Andrew Morton
@ 2024-04-10  0:32   ` Zach O'Keefe
  2024-04-16 23:07     ` Zach O'Keefe
  0 siblings, 1 reply; 5+ messages in thread
From: Zach O'Keefe @ 2024-04-10  0:32 UTC (permalink / raw)
  To: Andrew Morton; +Cc: syzbot, linux-kernel, linux-mm, syzkaller-bugs

On Tue, Apr 9, 2024 at 4:46 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Tue, 09 Apr 2024 03:16:20 -0700 syzbot <syzbot+57adb2a4b9d206521bc2@syzkaller.appspotmail.com> wrote:
>
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit:    8568bb2ccc27 Add linux-next specific files for 20240405
> > git tree:       linux-next
> > console+strace: https://syzkaller.appspot.com/x/log.txt?x=152f4805180000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=48ca5acf8d2eb3bc
> > dashboard link: https://syzkaller.appspot.com/bug?extid=57adb2a4b9d206521bc2
> > compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1268258d180000
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1256598d180000
>
> Help.  From a quick look this seems to be claiming that collapse_file()
> got to
>
>         VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
>
> with folio==NULL, but the code looks solid regarding this.
>
> Given that we have a reproducer, can we expect the bot to perform a
> bisection for us?
>

I often don't see a successful automatic bisect, even with
reproducers. Hit or miss. I will take a closer look tomorrow -- the
reproducer doesn't look to be doing anything crazy.

Thanks,
Zach


* Re: [syzbot] [mm?] general protection fault in hpage_collapse_scan_file
  2024-04-10  0:32   ` Zach O'Keefe
@ 2024-04-16 23:07     ` Zach O'Keefe
  2024-04-17 18:56       ` Zach O'Keefe
  0 siblings, 1 reply; 5+ messages in thread
From: Zach O'Keefe @ 2024-04-16 23:07 UTC (permalink / raw)
  To: Andrew Morton; +Cc: syzbot, linux-kernel, linux-mm, syzkaller-bugs

On Tue, Apr 9, 2024 at 5:32 PM Zach O'Keefe <zokeefe@google.com> wrote:
>
> On Tue, Apr 9, 2024 at 4:46 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > On Tue, 09 Apr 2024 03:16:20 -0700 syzbot <syzbot+57adb2a4b9d206521bc2@syzkaller.appspotmail.com> wrote:
> >
> > > Hello,
> > >
> > > syzbot found the following issue on:
> > >
> > > HEAD commit:    8568bb2ccc27 Add linux-next specific files for 20240405
> > > git tree:       linux-next
> > > console+strace: https://syzkaller.appspot.com/x/log.txt?x=152f4805180000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=48ca5acf8d2eb3bc
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=57adb2a4b9d206521bc2
> > > compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1268258d180000
> > > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1256598d180000
> >
> > Help.  From a quick look this seems to be claiming that collapse_file()
> > got to
> >
> >         VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
> >
> > with folio==NULL, but the code looks solid regarding this.
> >
> > Given that we have a reproducer, can we expect the bot to perform a
> > bisection for us?
> >
>
> I often don't see a successful automatic bisect, even with
> reproducers. Hit or miss. I will take a closer look tomorrow -- the
> reproducer doesn't look to be doing anything crazy.

I've only been able to reproduce this using the disk image provided by syzbot.

What is happening is that we are calling MADV_COLLAPSE on an empty
mapping, which actually reaches collapse_file() -> filemap_lock_folio()
after a page_cache_sync_readahead() attempt. This of course fails
correctly, and I can see right before the GPF that the returned pointer
is 0xfffffffffffffffe, which is correctly ERR_PTR(-ENOENT). This should
cause us to take the if (IS_ERR(folio)) {..} path, but we don't, and I
don't know why. I haven't yet attempted to repro this against other
images. Will continue looking, but wanted to provide some type of
update, even if it is a disappointing one, so as not to appear like
I've disappeared.
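
For readers unfamiliar with the error-pointer encoding, here is a minimal
userspace sketch of the long-standing ERR_PTR()/IS_ERR() convention,
assuming a 64-bit (LP64) build. The lowercase names are hypothetical
stand-ins for the kernel macros and ENOENT_VALUE is the Linux errno number
for ENOENT. It does not model whatever the IS_ERR(folio) check looked like
in this particular linux-next snapshot.

```c
#include <stdint.h>

#define MAX_ERRNO    4095	/* top 4095 addresses are reserved for errors */
#define ENOENT_VALUE 2		/* ENOENT on Linux */

/* Encode a negative errno as a pointer value. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno from an error pointer. */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True only for the reserved range at the very top of the address space. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
```

Under these assumptions err_ptr(-ENOENT_VALUE) is 0xfffffffffffffffe, the
value in R15, and is_err() is true for it. NULL is not an error pointer in
this scheme, so a "folio == NULL" reading of the KASAN range and an
"ERR_PTR was dereferenced" reading are distinct cases.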

Thanks,
Zach

> Thanks,
> Zach


* Re: [syzbot] [mm?] general protection fault in hpage_collapse_scan_file
  2024-04-16 23:07     ` Zach O'Keefe
@ 2024-04-17 18:56       ` Zach O'Keefe
  0 siblings, 0 replies; 5+ messages in thread
From: Zach O'Keefe @ 2024-04-17 18:56 UTC (permalink / raw)
  To: Andrew Morton
  Cc: syzbot, linux-kernel, linux-mm, syzkaller-bugs, Matthew Wilcox,
	Hugh Dickins

On Tue, Apr 16, 2024 at 4:07 PM Zach O'Keefe <zokeefe@google.com> wrote:
>
> On Tue, Apr 9, 2024 at 5:32 PM Zach O'Keefe <zokeefe@google.com> wrote:
> >
> > On Tue, Apr 9, 2024 at 4:46 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> > >
> > > On Tue, 09 Apr 2024 03:16:20 -0700 syzbot <syzbot+57adb2a4b9d206521bc2@syzkaller.appspotmail.com> wrote:
> > >
> > > > Hello,
> > > >
> > > > syzbot found the following issue on:
> > > >
> > > > HEAD commit:    8568bb2ccc27 Add linux-next specific files for 20240405
> > > > git tree:       linux-next
> > > > console+strace: https://syzkaller.appspot.com/x/log.txt?x=152f4805180000
> > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=48ca5acf8d2eb3bc
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=57adb2a4b9d206521bc2
> > > > compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1268258d180000
> > > > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1256598d180000
> > >
> > > Help.  From a quick look this seems to be claiming that collapse_file()
> > > got to
> > >
> > >         VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
> > >
> > > with folio==NULL, but the code looks solid regarding this.
> > >
> > > Given that we have a reproducer, can we expect the bot to perform a
> > > bisection for us?
> > >
> >
> > I often don't see a successful automatic bisect, even with
> > reproducers. Hit or miss. I will take a closer look tomorrow -- the
> > reproducer doesn't look to be doing anything crazy.
>
> I've only been able to reproduce this using the disk image provided by syzbot.
>
> What is happening is we are calling MADV_COLLAPSE on an empty mapping
> -- which actually reaches collapse_file() -> filemap_lock_folio()
> after page_cache_sync_readahead() attempt. This of course fails
> correctly, and I can see right before GPF that the returned pointer is
> 0xfffffffffffffffe, which is correctly ERR_PTR(-ENOENT). This should
> be causing us to take the if (IS_ERR(folio)) {..} path .. but we
> don't, and I don't know why. I haven't yet attempted to repro this
> against other images. Will continue looking, but wanted to provide
> some type of update -- even if it is a disappointing one -- so as to
> not appear like I've disappeared.

Ugh. Was looking at the wrong source. Thanks hughd@ for mentioning
that IS_ERR(folio) changed recently, else I'd have spent more time on
it. Fixed by https://lore.kernel.org/all/ZhIWX8K0E2tSyMSr@casper.infradead.org/

> Thanks,
> Zach
>
> > Thanks,
> > Zach

