linux-mm.kvack.org archive mirror
* [syzbot] kernel BUG in __filemap_get_folio
From: syzbot @ 2022-04-20 15:54 UTC
  To: akpm, dhowells, hughd, kirill.shutemov, linux-kernel, linux-mm,
	syzkaller-bugs, vbabka, william.kucharski, willy

Hello,

syzbot found the following issue on:

HEAD commit:    559089e0a93d vmalloc: replace VM_NO_HUGE_VMAP with VM_ALLO..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1306a768f00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=dd7c9a79dfcfa205
dashboard link: https://syzkaller.appspot.com/bug?extid=cf4cf13056f85dec2c40
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16554fd0f00000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=147f41ecf00000

The issue was bisected to:

commit 6b24ca4a1a8d4ee3221d6d44ddbb99f542e4bda3
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Sun Jun 28 02:19:08 2020 +0000

    mm: Use multi-index entries in the page cache

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=173fc7d0f00000
final oops:     https://syzkaller.appspot.com/x/report.txt?x=14bfc7d0f00000
console output: https://syzkaller.appspot.com/x/log.txt?x=10bfc7d0f00000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+cf4cf13056f85dec2c40@syzkaller.appspotmail.com
Fixes: 6b24ca4a1a8d ("mm: Use multi-index entries in the page cache")

 do_initcall_level init/main.c:1371 [inline]
 do_initcalls init/main.c:1387 [inline]
 do_basic_setup init/main.c:1406 [inline]
 kernel_init_freeable+0x6b1/0x73a init/main.c:1613
 kernel_init+0x1a/0x1d0 init/main.c:1502
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298
------------[ cut here ]------------
kernel BUG at mm/filemap.c:1971!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 3867 Comm: syz-executor935 Not tainted 5.18.0-rc3-syzkaller-00007-g559089e0a93d #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__filemap_get_folio+0xc65/0xf00 mm/filemap.c:1971
Code: db 45 31 f6 e9 fd f5 ff ff 44 8b 6c 24 10 48 89 eb e9 f0 f5 ff ff e8 ba f5 d8 ff 48 c7 c6 80 d9 d5 89 48 89 df e8 6b 8d 0e 00 <0f> 0b e8 a4 f5 d8 ff 48 89 df 31 db e8 4a af 03 00 e9 78 f7 ff ff
RSP: 0018:ffffc900033d78b0 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffffea00014c8dc0 RCX: 0000000000000000
RDX: ffff88807bb560c0 RSI: ffffffff819f5865 RDI: 0000000000000003
RBP: 0000000000000000 R08: 0000000000000018 R09: 00000000ffffffff
R10: ffffffff891d5eec R11: 00000000ffffffff R12: 0000000000000180
R13: 0000000000000182 R14: 0000000000000000 R15: dffffc0000000000
FS:  00007f78863e0700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000002013f000 CR3: 0000000075648000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 pagecache_get_page+0x2e/0x290 mm/folio-compat.c:126
 shmem_getpage_gfp+0x471/0x2370 mm/shmem.c:1812
 shmem_getpage mm/shmem.c:149 [inline]
 shmem_write_begin+0xff/0x1e0 mm/shmem.c:2446
 generic_perform_write+0x249/0x560 mm/filemap.c:3787
 __generic_file_write_iter+0x2aa/0x4d0 mm/filemap.c:3915
 generic_file_write_iter+0xd7/0x220 mm/filemap.c:3947
 call_write_iter include/linux/fs.h:2050 [inline]
 new_sync_write+0x38a/0x560 fs/read_write.c:504
 vfs_write+0x7c0/0xac0 fs/read_write.c:591
 ksys_write+0x127/0x250 fs/read_write.c:644
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f78864331c9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 41 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f78863e0308 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007f78864b73e8 RCX: 00007f78864331c9
RDX: 000000000208e24b RSI: 0000000020000080 RDI: 0000000000000004
RBP: 00007f78864b73e0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f78864b73ec
R13: 00007f78864840ac R14: 776c613d65677568 R15: 0000000000022000
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:__filemap_get_folio+0xc65/0xf00 mm/filemap.c:1971
Code: db 45 31 f6 e9 fd f5 ff ff 44 8b 6c 24 10 48 89 eb e9 f0 f5 ff ff e8 ba f5 d8 ff 48 c7 c6 80 d9 d5 89 48 89 df e8 6b 8d 0e 00 <0f> 0b e8 a4 f5 d8 ff 48 89 df 31 db e8 4a af 03 00 e9 78 f7 ff ff
RSP: 0018:ffffc900033d78b0 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffffea00014c8dc0 RCX: 0000000000000000
RDX: ffff88807bb560c0 RSI: ffffffff819f5865 RDI: 0000000000000003
RBP: 0000000000000000 R08: 0000000000000018 R09: 00000000ffffffff
R10: ffffffff891d5eec R11: 00000000ffffffff R12: 0000000000000180
R13: 0000000000000182 R14: 0000000000000000 R15: dffffc0000000000
FS:  00007f78863e0700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020800000 CR3: 0000000075648000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches



* Re: [syzbot] kernel BUG in __filemap_get_folio
From: Matthew Wilcox @ 2022-04-21 20:21 UTC
  To: syzbot
  Cc: akpm, dhowells, hughd, kirill.shutemov, linux-kernel, linux-mm,
	syzkaller-bugs, vbabka, william.kucharski

On Wed, Apr 20, 2022 at 08:54:32AM -0700, syzbot wrote:
> syzbot found the following issue on:

The log attached here omits some of the interesting information.
From the full console log:

> page:ffffea0000b78d00 refcount:2 mapcount:0 mapping:ffff888071347c70 index:0x234 pfn:0x2de34
> memcg:ffff888073230000
> aops:shmem_aops ino:2 dentry name:"cgroup.controllers"
> flags: 0xfff0000008003f(locked|referenced|uptodate|dirty|lru|active|swapbacked|node=0|zone=1|lastcpupid=0x7ff)
> raw: 00fff0000008003f ffffea0000b78cc8 ffffea0000b78d48 ffff888071347c70
> raw: 0000000000000234 0000000000000000 00000002ffffffff ffff888073230000
> page dumped because: VM_BUG_ON_FOLIO(!folio_contains(folio, index))
> page_owner tracks the page as allocated
> page last allocated via order 0, migratetype Movable, gfp_mask 0x13d20ca(GFP_TRANSHUGE_LIGHT|__GFP_NORETRY|__GFP_THISNODE), pid 6314, ts 110712153176, free_ts 109293647371
>  get_page_from_freelist+0xa6f/0x2f10
>  __alloc_pages+0x1b2/0x500
>  alloc_pages_vma+0x545/0x650
>  shmem_alloc_hugepage+0x18c/0x270

This call-site only allocates order-9 pages.  So clearly this was
_allocated_ as an order-9 page and then split.
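
For reference, the failing assertion can be modelled outside the kernel.
The sketch below is a Python illustration, not the kernel source:
folio_contains() here is a simplified stand-in that ignores the hugetlb
special case, and the lookup indices are hypothetical, since the log does
not show which index was requested.

```python
# Simplified model of the check behind
# VM_BUG_ON_FOLIO(!folio_contains(folio, index)); not the kernel source.
# Assumptions: a folio covers [folio_index, folio_index + nr_pages), pgoff_t
# is a 64-bit unsigned type, and the hugetlb special case is ignored.
MASK64 = (1 << 64) - 1

def folio_contains(folio_index: int, nr_pages: int, index: int) -> bool:
    # Unsigned subtraction: an index below the folio wraps to a huge value.
    return ((index - folio_index) & MASK64) < nr_pages

# The dumped page sits at index 0x234 and is order 0 (one page) after the split.
for lookup in (0x234, 0x235, 0x200):          # hypothetical lookup indices
    print(hex(lookup), folio_contains(0x234, 1, lookup))

# Before the split, an order-9 folio at 0x200 would have covered all of them.
print(all(folio_contains(0x200, 512, i) for i in (0x234, 0x235, 0x200)))
```

Under these assumptions, only a lookup of exactly 0x234 passes once the
page is order 0, so any lookup that reaches this page through a stale
path trips the BUG.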

> ------------[ cut here ]------------
> kernel BUG at mm/filemap.c:1917!
> invalid opcode: 0000 [#1] PREEMPT SMP KASAN
> CPU: 1 PID: 6314 Comm: syz-executor.5 Not tainted 5.16.0-rc4-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> RIP: 0010:__filemap_get_folio+0x72f/0x9c0
> Code: 02 84 c0 74 09 3c 03 7f 05 e8 6d 13 1b 00 41 8b 46 58 48 39 c5 0f 82 68 fc ff ff 48 c7 c6 60 ec d3 88 4c 89 f7 e8 e1 ef 0a 00 <0f> 0b 4d 8d 6e 34 be 04 00 00 00 4c 89 ef e8 ae 16 1b 00 4c 89 e8
> RSP: 0018:ffffc90005ed78e0 EFLAGS: 00010282
> RAX: 0000000000000000 RBX: 0000000000000182 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffffffff8920c520 RDI: ffff88801191a9ca
> RBP: 0000000000000080 R08: 0000000000000019 R09: ffff8880b9f33fc7
> R10: ffffed10173e67f8 R11: 6f775f6b73617420 R12: dffffc0000000000
> R13: ffffea0000b78d00 R14: ffffea0000b78d00 R15: ffffea0000b78d00
> FS:  00007f0f22c1d700(0000) GS:ffff8880b9f00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f149ec93058 CR3: 00000000705cf000 CR4: 00000000003506e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  <TASK>
>  pagecache_get_page+0x10/0x100

I wish I knew which 'index' we were looking up.  I'll try reproducing it
locally so I can print that out too.

My suspicion is that there's a race where the folio is split during the
lookup, and the bug is really in mapping_get_entry().  The folio->index
is weird though; if this was the explanation, I'd expect it to find a
page at a multiple of 512 or at least a multiple of 64.
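
The alignment argument can be checked with a few lines of arithmetic.
This is only a sketch; the constants assume 4 KiB pages, order-9 huge
pages and 64-slot XArray nodes, which matches the x86-64 setup in the
report but is not stated in the log itself.

```python
# Assumptions: order-9 (512-page) huge folios and XArray nodes with 64 slots.
HPAGE_NR = 1 << 9          # pages in an order-9 folio
SLOTS_PER_NODE = 1 << 6    # entries per XArray node

def is_aligned(index: int, nr: int) -> bool:
    return index % nr == 0

folio_index = 0x234        # "index:0x234" from the page dump

print(is_aligned(folio_index, HPAGE_NR))        # head of an order-9 folio?
print(is_aligned(folio_index, SLOTS_PER_NODE))  # first page under a shift-6 slot?
print(hex(folio_index % SLOTS_PER_NODE))        # offset inside its 64-page slot
```

Both checks come out false, and the page sits at offset 0x34 within its
64-page slot, which is why the index looks odd for a simple "found the
head of a just-split folio" explanation.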



* Re: [syzbot] kernel BUG in __filemap_get_folio
From: Matthew Wilcox @ 2022-04-22 18:30 UTC
  To: syzbot
  Cc: akpm, dhowells, hughd, kirill.shutemov, linux-kernel, linux-mm,
	syzkaller-bugs, vbabka, william.kucharski

On Thu, Apr 21, 2022 at 09:21:34PM +0100, Matthew Wilcox wrote:
> I wish I knew which 'index' we were looking up.  I'll try reproducing it
> locally so I can print that out too.

I can't reproduce it locally because the OOM killer says I don't have
enough RAM.  That's with giving 4GB to the VM.  If I give more than 4GB
to the VM, my laptop is insufficiently studly, and the host OOM killer
takes out qemu instead ;-P

> My suspicion is that there's a race where the folio is split during the
> lookup, and the bug is really in mapping_get_entry().  The folio->index
> is weird though; if this was the explanation, I'd expect it to find a
> page at a multiple of 512 or at least a multiple of 64.

I think I have an explanation (from thinking really hard, rather than
testing).  Before we call xas_split(), the tree looks like this:

node (shift=6)
 -> page (index 0)
 -> sibling of 0
 -> sibling of 0
 -> sibling of 0
 -> sibling of 0
 -> sibling of 0
 -> sibling of 0
 -> sibling of 0
 -> page (index 0x200)
 -> sibling of 8
 -> sibling of 8
 -> sibling of 8
 -> sibling of 8
 -> sibling of 8
 -> sibling of 8
 -> sibling of 8

Then we split the page at index 0x200.  Simultaneously, we try to load
the page at index 0x274 (or 0x2b4 or 0x2f4 or ... 0x3f4).  The load picks
up the sibling entry at offset 9 (0x274 >> 6), which says to refer to
the entry at offset 8.  But by the time it gets the entry at offset 8,
the split has replaced the compound page at index 0x200 with a node that
points to pages at indices 0x200-0x23f.
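
The interleaving described above can be replayed in a small user-space
model.  This is a sketch under stated assumptions, not the XArray
implementation: Page, Sibling, Node, split() and racy_load() are all
hypothetical names, the split is applied in one step, and the race
window is simulated with a callback that runs between reading the
sibling entry and reading the slot it points to.

```python
from dataclasses import dataclass, field

XA_CHUNK_SHIFT = 6                      # assumed: 64 slots per node
XA_CHUNK_MASK = (1 << XA_CHUNK_SHIFT) - 1

@dataclass
class Page:
    index: int
    order: int

@dataclass
class Sibling:
    offset: int                         # "refer to the entry at this offset"

@dataclass
class Node:
    shift: int
    slots: list = field(default_factory=lambda: [None] * 64)

def build_tree() -> Node:
    """Two order-9 folios under one shift-6 node, as in the diagram."""
    node = Node(shift=6)
    for base, index in ((0, 0x000), (8, 0x200)):
        node.slots[base] = Page(index, order=9)
        for off in range(base + 1, base + 8):
            node.slots[off] = Sibling(base)
    return node

def split(node: Node, base: int) -> None:
    """Replace the order-9 folio at slot `base` with nodes of order-0 pages."""
    first = node.slots[base].index
    for i in range(8):
        start = first + (i << XA_CHUNK_SHIFT)
        child = Node(shift=0)
        child.slots = [Page(start + j, order=0) for j in range(64)]
        node.slots[base + i] = child

def racy_load(node: Node, index: int, hook=None) -> Page:
    """Lookup that follows a sibling entry without rechecking the tree."""
    entry = node.slots[(index >> node.shift) & XA_CHUNK_MASK]
    if isinstance(entry, Sibling):
        offset = entry.offset
        if hook:
            hook()                      # simulated concurrent split
        entry = node.slots[offset]      # may now be a node, not the folio
    while isinstance(entry, Node):      # descend using the original index
        entry = entry.slots[(index >> entry.shift) & XA_CHUNK_MASK]
    return entry

tree = build_tree()
page = racy_load(tree, 0x274, hook=lambda: split(tree, 8))
print(hex(page.index), page.order)      # 0x234 0
```

In this model the lookup of 0x274 lands in the node that replaced
offset 8 and then indexes it with 0x274 & 0x3f = 0x34, returning the
order-0 page at 0x234.  That matches the index in the page dump, and an
order-0 page at 0x234 does not contain 0x274.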

Solving it on the split side is possible, but I think it's easier to
solve on the load side.  I have a patch, it seems to work; let's see
what syzbot thinks of it:

#syz test: git://git.infradead.org/users/willy/xarray.git main
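
The patch itself is not included in this thread.  Purely as an
illustration of what "solve on the load side" could look like, the
sketch below makes the lookup restart when a sibling entry leads to a
node instead of a folio.  Everything here (RETRY, descend(), load(), the
tuple representation of pages) is a hypothetical user-space model and
should not be read as the actual change.

```python
XA_CHUNK_MASK = 63                      # assumed: 64 slots per node
RETRY = object()                        # stand-in for a "restart the walk" marker

class Sibling:
    def __init__(self, offset):
        self.offset = offset

class Node:
    def __init__(self, shift, slots):
        self.shift, self.slots = shift, slots

def build():
    """One order-9 folio at index 0x200 occupying slots 8..15."""
    slots = [None] * 64
    slots[8] = ("folio", 0x200, 9)
    for off in range(9, 16):
        slots[off] = Sibling(8)
    return Node(6, slots)

def split(root):
    """Replace slots 8..15 with nodes holding order-0 pages."""
    for i in range(8):
        start = 0x200 + (i << 6)
        root.slots[8 + i] = Node(0, [("page", start + j, 0) for j in range(64)])

def descend(node, index, hook=None):
    """One step of the walk; never follow a sibling entry into a node."""
    entry = node.slots[(index >> node.shift) & XA_CHUNK_MASK]
    if isinstance(entry, Sibling):
        if hook:
            hook()                      # simulated concurrent split
        entry = node.slots[entry.offset]
        if node.shift and isinstance(entry, Node):
            return RETRY                # the tree changed under the lookup
    return entry

def load(root, index, hook=None):
    while True:
        entry = descend(root, index, hook)
        hook = None                     # the simulated split happens only once
        while isinstance(entry, Node):
            entry = descend(entry, index)
        if entry is not RETRY:
            return entry

root = build()
print(load(root, 0x274, hook=lambda: split(root)))   # ('page', 628, 0)
```

With the retry in place, the second pass sees the node that the split
stored at offset 9 and returns the page at 0x274 (628), which is the
page that was asked for.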



* Re: [syzbot] kernel BUG in __filemap_get_folio
From: syzbot @ 2022-04-22 19:34 UTC
  To: akpm, dhowells, hughd, kirill.shutemov, linux-kernel, linux-mm,
	syzkaller-bugs, vbabka, william.kucharski, willy

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+cf4cf13056f85dec2c40@syzkaller.appspotmail.com

Tested on:

commit:         91070385 XArray: Disallow sibling entries of nodes
git tree:       git://git.infradead.org/users/willy/xarray.git main
kernel config:  https://syzkaller.appspot.com/x/.config?x=78d87160570beee3
dashboard link: https://syzkaller.appspot.com/bug?extid=cf4cf13056f85dec2c40
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2

Note: no patches were applied.
Note: testing is done by a robot and is best-effort only.

