* [syzbot] [mm?] WARNING in zswap_folio_swapin
@ 2024-02-03 20:37 syzbot
  2024-02-04  1:28 ` Nhat Pham
  0 siblings, 1 reply; 4+ messages in thread
From: syzbot @ 2024-02-03 20:37 UTC (permalink / raw)
  To: akpm, hannes, linux-kernel, linux-mm, nphamcs, syzkaller-bugs,
	yosryahmed

Hello,

syzbot found the following issue on:

HEAD commit:    861c0981648f Merge tag 'jfs-6.8-rc3' of github.com:kleikam..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=174537bbe80000
kernel config:  https://syzkaller.appspot.com/x/.config?x=b168fa511db3ca08
dashboard link: https://syzkaller.appspot.com/bug?extid=17a611d10af7d18a7092
compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
userspace arch: i386

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7bc7510fe41f/non_bootable_disk-861c0981.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/b2b204c7b4a0/vmlinux-861c0981.xz
kernel image: https://storage.googleapis.com/syzbot-assets/170ec316e557/bzImage-861c0981.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+17a611d10af7d18a7092@syzkaller.appspotmail.com

 kcov_ioctl+0x4f/0x720 kernel/kcov.c:704
 __do_compat_sys_ioctl+0x2bf/0x330 fs/ioctl.c:971
 do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline]
 __do_fast_syscall_32+0x79/0x110 arch/x86/entry/common.c:321
page has been migrated, last migrate reason: compaction
------------[ cut here ]------------
WARNING: CPU: 2 PID: 5104 at include/linux/memcontrol.h:775 folio_lruvec include/linux/memcontrol.h:775 [inline]
WARNING: CPU: 2 PID: 5104 at include/linux/memcontrol.h:775 zswap_folio_swapin+0x47d/0x5a0 mm/zswap.c:381
Modules linked in:
CPU: 2 PID: 5104 Comm: syz-fuzzer Not tainted 6.8.0-rc2-syzkaller-00031-g861c0981648f #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:folio_lruvec include/linux/memcontrol.h:775 [inline]
RIP: 0010:zswap_folio_swapin+0x47d/0x5a0 mm/zswap.c:381
Code: e8 d8 9f ae ff 45 84 e4 0f 85 e7 fc ff ff e8 9a a4 ae ff 48 c7 c6 20 9a da 8a 48 89 df e8 2b 1a ee ff c6 05 d1 8f 4b 0d 01 90 <0f> 0b 90 e9 c3 fc ff ff e8 76 a4 ae ff 48 c7 c6 60 99 da 8a 48 89
RSP: 0018:ffffc9000397f8c0 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffffea0000a74300 RCX: ffffc9000397f720
RDX: ffff88801a064800 RSI: ffffffff81d98145 RDI: ffffffff8b2fdc00
RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff1e76002
R10: ffffffff8f3b0017 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 00000000000069a1 R15: 0000000000000003
FS:  000000c000056490(0000) GS:ffff88802c800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000030623000 CR3: 000000001c68c000 CR4: 0000000000350ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 swap_cluster_readahead+0x4fb/0x710 mm/swap_state.c:685
 swapin_readahead+0x132/0xe60 mm/swap_state.c:886
 do_swap_page+0x4a6/0x30f0 mm/memory.c:3898
 handle_pte_fault mm/memory.c:5147 [inline]
 __handle_mm_fault+0x13a0/0x4900 mm/memory.c:5285
 handle_mm_fault+0x47a/0xa10 mm/memory.c:5450
 do_user_addr_fault+0x30b/0x1030 arch/x86/mm/fault.c:1364
 handle_page_fault arch/x86/mm/fault.c:1507 [inline]
 exc_page_fault+0x5d/0xc0 arch/x86/mm/fault.c:1563
 asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:570
RIP: 0033:0x46d3b9
Code: fe 7f 44 1f 80 c5 f8 77 c3 80 3d 84 7c c7 01 01 75 0d c5 f9 ef c0 48 81 fb 00 00 00 02 73 13 48 89 d9 48 c1 e9 03 48 83 e3 07 <f3> 48 ab e9 65 fe ff ff c5 fe 7f 07 48 89 fe 48 83 c7 20 48 83 e7
RSP: 002b:000000c00108b700 EFLAGS: 00010206
RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000000400
RDX: 000000c00258f002 RSI: 00000000222172b0 RDI: 000000c00258fffa
RBP: 000000c00108b758 R08: 0000000000000000 R09: 000000000000a000
R10: 000000c002588000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000040 R14: 000000c000508ea0 R15: 000000c000056400
 </TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


* Re: [syzbot] [mm?] WARNING in zswap_folio_swapin
  2024-02-03 20:37 [syzbot] [mm?] WARNING in zswap_folio_swapin syzbot
@ 2024-02-04  1:28 ` Nhat Pham
  2024-02-04  2:59   ` Chengming Zhou
  0 siblings, 1 reply; 4+ messages in thread
From: Nhat Pham @ 2024-02-04  1:28 UTC (permalink / raw)
  To: syzbot; +Cc: akpm, hannes, linux-kernel, linux-mm, syzkaller-bugs, yosryahmed

On Sat, Feb 3, 2024 at 12:37 PM syzbot
<syzbot+17a611d10af7d18a7092@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:    861c0981648f Merge tag 'jfs-6.8-rc3' of github.com:kleikam..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=174537bbe80000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=b168fa511db3ca08
> dashboard link: https://syzkaller.appspot.com/bug?extid=17a611d10af7d18a7092
> compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> userspace arch: i386
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> Downloadable assets:
> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7bc7510fe41f/non_bootable_disk-861c0981.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/b2b204c7b4a0/vmlinux-861c0981.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/170ec316e557/bzImage-861c0981.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+17a611d10af7d18a7092@syzkaller.appspotmail.com
>
>  kcov_ioctl+0x4f/0x720 kernel/kcov.c:704
>  __do_compat_sys_ioctl+0x2bf/0x330 fs/ioctl.c:971
>  do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline]
>  __do_fast_syscall_32+0x79/0x110 arch/x86/entry/common.c:321
> page has been migrated, last migrate reason: compaction
> ------------[ cut here ]------------
> WARNING: CPU: 2 PID: 5104 at include/linux/memcontrol.h:775 folio_lruvec include/linux/memcontrol.h:775 [inline]
> WARNING: CPU: 2 PID: 5104 at include/linux/memcontrol.h:775 zswap_folio_swapin+0x47d/0x5a0 mm/zswap.c:381
> Modules linked in:
> CPU: 2 PID: 5104 Comm: syz-fuzzer Not tainted 6.8.0-rc2-syzkaller-00031-g861c0981648f #0
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> RIP: 0010:folio_lruvec include/linux/memcontrol.h:775 [inline]

Hmm looks like it's this line:
VM_WARN_ON_ONCE_FOLIO(!memcg && !mem_cgroup_disabled(), folio);

Looks like memcg was cleared from the folio. Haven't looked too
closely yet, but this (and the "page has been migrated" line above)
suggests maybe there is some migration business going on -
mem_cgroup_migrate() clears the old folio's memcg_data (via
old->memcg_data = 0).

Here's my theory (which could be wrong - someone please fact-check
me): swap_read_folio(), which precedes zswap_folio_swapin(), unlocks
the folio. Could this be sufficient to allow for migration? If this is
the case, all we need to do is move the zswap_folio_swapin() call to
above swap_read_folio(), while the folio is still locked.
__read_swap_cache_async() already charges the folio to a memcg, so
there is no need to wait till after swap_read_folio() anyway.
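
To make the suspected ordering problem concrete, here is a minimal
userspace sketch. It is not kernel code: struct toy_folio and every
toy_* function are hypothetical stand-ins, and it assumes (as the
theory above does) that migration can only proceed once the folio is
unlocked, at which point it clears the old folio's memcg.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a folio; NOT the kernel's struct folio. */
struct toy_folio {
	bool locked;
	void *memcg;	/* models folio->memcg_data */
};

/*
 * Models migration: assumed to need the folio lock, so it backs off
 * while the folio is locked. On success the old folio loses its
 * memcg, mirroring "old->memcg_data = 0".
 */
bool toy_try_migrate(struct toy_folio *old, struct toy_folio *new)
{
	assert(old && new);
	if (old->locked)
		return false;
	new->memcg = old->memcg;
	old->memcg = NULL;
	return true;
}

/* Models the read completing, which unlocks the folio. */
void toy_read_done(struct toy_folio *folio)
{
	assert(folio);
	folio->locked = false;
}

/* Models the folio_lruvec() check: true means the warning would fire. */
bool toy_lruvec_would_warn(const struct toy_folio *folio)
{
	assert(folio);
	return folio->memcg == NULL;
}

/* Reported ordering: read (unlock) first, accounting afterwards. */
bool toy_swapin_after_read(struct toy_folio *folio, struct toy_folio *target)
{
	toy_read_done(folio);
	toy_try_migrate(folio, target);	/* race window: folio is unlocked */
	return toy_lruvec_would_warn(folio);
}

/* Proposed ordering: accounting while the folio is still locked. */
bool toy_swapin_before_read(struct toy_folio *folio, struct toy_folio *target)
{
	bool warned = toy_lruvec_would_warn(folio);

	toy_try_migrate(folio, target);	/* backs off: folio is locked */
	toy_read_done(folio);
	return warned;
}
```

Starting from a locked, charged folio, toy_swapin_after_read() reports
that the warning would fire, while toy_swapin_before_read() does not.
The model forces the migration attempt deterministically; in the real
kernel it is a race that only occasionally lands in that window.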

> RIP: 0010:zswap_folio_swapin+0x47d/0x5a0 mm/zswap.c:381
> Code: e8 d8 9f ae ff 45 84 e4 0f 85 e7 fc ff ff e8 9a a4 ae ff 48 c7 c6 20 9a da 8a 48 89 df e8 2b 1a ee ff c6 05 d1 8f 4b 0d 01 90 <0f> 0b 90 e9 c3 fc ff ff e8 76 a4 ae ff 48 c7 c6 60 99 da 8a 48 89
> RSP: 0018:ffffc9000397f8c0 EFLAGS: 00010293
> RAX: 0000000000000000 RBX: ffffea0000a74300 RCX: ffffc9000397f720
> RDX: ffff88801a064800 RSI: ffffffff81d98145 RDI: ffffffff8b2fdc00
> RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff1e76002
> R10: ffffffff8f3b0017 R11: 0000000000000000 R12: 0000000000000000
> R13: 0000000000000000 R14: 00000000000069a1 R15: 0000000000000003
> FS:  000000c000056490(0000) GS:ffff88802c800000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000030623000 CR3: 000000001c68c000 CR4: 0000000000350ef0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  <TASK>
>  swap_cluster_readahead+0x4fb/0x710 mm/swap_state.c:685
>  swapin_readahead+0x132/0xe60 mm/swap_state.c:886
>  do_swap_page+0x4a6/0x30f0 mm/memory.c:3898
>  handle_pte_fault mm/memory.c:5147 [inline]
>  __handle_mm_fault+0x13a0/0x4900 mm/memory.c:5285
>  handle_mm_fault+0x47a/0xa10 mm/memory.c:5450
>  do_user_addr_fault+0x30b/0x1030 arch/x86/mm/fault.c:1364
>  handle_page_fault arch/x86/mm/fault.c:1507 [inline]
>  exc_page_fault+0x5d/0xc0 arch/x86/mm/fault.c:1563
>  asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:570
> RIP: 0033:0x46d3b9
> Code: fe 7f 44 1f 80 c5 f8 77 c3 80 3d 84 7c c7 01 01 75 0d c5 f9 ef c0 48 81 fb 00 00 00 02 73 13 48 89 d9 48 c1 e9 03 48 83 e3 07 <f3> 48 ab e9 65 fe ff ff c5 fe 7f 07 48 89 fe 48 83 c7 20 48 83 e7
> RSP: 002b:000000c00108b700 EFLAGS: 00010206
> RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000000400
> RDX: 000000c00258f002 RSI: 00000000222172b0 RDI: 000000c00258fffa
> RBP: 000000c00108b758 R08: 0000000000000000 R09: 000000000000a000
> R10: 000000c002588000 R11: 0000000000000000 R12: 0000000000000000
> R13: 0000000000000040 R14: 000000c000508ea0 R15: 000000c000056400
>  </TASK>
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>
> If the report is already addressed, let syzbot know by replying with:
> #syz fix: exact-commit-title
>
> If you want to overwrite report's subsystems, reply with:
> #syz set subsystems: new-subsystem
> (See the list of subsystem names on the web dashboard)
>
> If the report is a duplicate of another one, reply with:
> #syz dup: exact-subject-of-another-report
>
> If you want to undo deduplication, reply with:
> #syz undup


* Re: [syzbot] [mm?] WARNING in zswap_folio_swapin
  2024-02-04  1:28 ` Nhat Pham
@ 2024-02-04  2:59   ` Chengming Zhou
  2024-02-05  3:48     ` Nhat Pham
  0 siblings, 1 reply; 4+ messages in thread
From: Chengming Zhou @ 2024-02-04  2:59 UTC (permalink / raw)
  To: Nhat Pham, syzbot
  Cc: akpm, hannes, linux-kernel, linux-mm, syzkaller-bugs, yosryahmed

On 2024/2/4 09:28, Nhat Pham wrote:
> On Sat, Feb 3, 2024 at 12:37 PM syzbot
> <syzbot+17a611d10af7d18a7092@syzkaller.appspotmail.com> wrote:
>>
>> Hello,
>>
>> syzbot found the following issue on:
>>
>> HEAD commit:    861c0981648f Merge tag 'jfs-6.8-rc3' of github.com:kleikam..
>> git tree:       upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=174537bbe80000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=b168fa511db3ca08
>> dashboard link: https://syzkaller.appspot.com/bug?extid=17a611d10af7d18a7092
>> compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
>> userspace arch: i386
>>
>> Unfortunately, I don't have any reproducer for this issue yet.
>>
>> Downloadable assets:
>> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7bc7510fe41f/non_bootable_disk-861c0981.raw.xz
>> vmlinux: https://storage.googleapis.com/syzbot-assets/b2b204c7b4a0/vmlinux-861c0981.xz
>> kernel image: https://storage.googleapis.com/syzbot-assets/170ec316e557/bzImage-861c0981.xz
>>
>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> Reported-by: syzbot+17a611d10af7d18a7092@syzkaller.appspotmail.com
>>
>>  kcov_ioctl+0x4f/0x720 kernel/kcov.c:704
>>  __do_compat_sys_ioctl+0x2bf/0x330 fs/ioctl.c:971
>>  do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline]
>>  __do_fast_syscall_32+0x79/0x110 arch/x86/entry/common.c:321
>> page has been migrated, last migrate reason: compaction
>> ------------[ cut here ]------------
>> WARNING: CPU: 2 PID: 5104 at include/linux/memcontrol.h:775 folio_lruvec include/linux/memcontrol.h:775 [inline]
>> WARNING: CPU: 2 PID: 5104 at include/linux/memcontrol.h:775 zswap_folio_swapin+0x47d/0x5a0 mm/zswap.c:381
>> Modules linked in:
>> CPU: 2 PID: 5104 Comm: syz-fuzzer Not tainted 6.8.0-rc2-syzkaller-00031-g861c0981648f #0
>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
>> RIP: 0010:folio_lruvec include/linux/memcontrol.h:775 [inline]
> 
> Hmm looks like it's this line:
> VM_WARN_ON_ONCE_FOLIO(!memcg && !mem_cgroup_disabled(), folio);
> 
> Looks like memcg was cleared from the folio. Haven't looked too
> closely yet, but this (and the "page has been migrated" line above)
> suggests maybe there is some migration business going on -
> mem_cgroup_migrate() clears the old folio's memcg_data (via
> old->memcg_data = 0).

Yeah, I think it's this case.

> 
> Here's my theory (which could be wrong - someone please fact-check
> me): swap_read_folio(), which precedes zswap_folio_swapin(), unlocks

And another case is !page_allocated, the returned folio is unlocked, right?

> the folio. Could this be sufficient to allow for migration? If this is

IMHO, folio locked is sufficient to avoid concurrent memcg migration.

> the case, all we need to do is move this to above swap_read_folio(),
> while the folio is still locked. __read_swap_cache_async() already
> charges the folio to a memcg, so no need to wait till after
> swap_read_folio() anyway.

Should we call zswap_folio_swapin() in the !page_allocated case?

Thanks.


* Re: [syzbot] [mm?] WARNING in zswap_folio_swapin
  2024-02-04  2:59   ` Chengming Zhou
@ 2024-02-05  3:48     ` Nhat Pham
  0 siblings, 0 replies; 4+ messages in thread
From: Nhat Pham @ 2024-02-05  3:48 UTC (permalink / raw)
  To: Chengming Zhou
  Cc: syzbot, akpm, hannes, linux-kernel, linux-mm, syzkaller-bugs, yosryahmed

On Sat, Feb 3, 2024 at 6:59 PM Chengming Zhou <chengming.zhou@linux.dev> wrote:
>
> On 2024/2/4 09:28, Nhat Pham wrote:
> > On Sat, Feb 3, 2024 at 12:37 PM syzbot
> > <syzbot+17a611d10af7d18a7092@syzkaller.appspotmail.com> wrote:
> >>
> >> Hello,
> >>
> >> syzbot found the following issue on:
> >>
> >> HEAD commit:    861c0981648f Merge tag 'jfs-6.8-rc3' of github.com:kleikam..
> >> git tree:       upstream
> >> console output: https://syzkaller.appspot.com/x/log.txt?x=174537bbe80000
> >> kernel config:  https://syzkaller.appspot.com/x/.config?x=b168fa511db3ca08
> >> dashboard link: https://syzkaller.appspot.com/bug?extid=17a611d10af7d18a7092
> >> compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> >> userspace arch: i386
> >>
> >> Unfortunately, I don't have any reproducer for this issue yet.
> >>
> >> Downloadable assets:
> >> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7bc7510fe41f/non_bootable_disk-861c0981.raw.xz
> >> vmlinux: https://storage.googleapis.com/syzbot-assets/b2b204c7b4a0/vmlinux-861c0981.xz
> >> kernel image: https://storage.googleapis.com/syzbot-assets/170ec316e557/bzImage-861c0981.xz
> >>
> >> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> >> Reported-by: syzbot+17a611d10af7d18a7092@syzkaller.appspotmail.com
> >>
> >>  kcov_ioctl+0x4f/0x720 kernel/kcov.c:704
> >>  __do_compat_sys_ioctl+0x2bf/0x330 fs/ioctl.c:971
> >>  do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline]
> >>  __do_fast_syscall_32+0x79/0x110 arch/x86/entry/common.c:321
> >> page has been migrated, last migrate reason: compaction
> >> ------------[ cut here ]------------
> >> WARNING: CPU: 2 PID: 5104 at include/linux/memcontrol.h:775 folio_lruvec include/linux/memcontrol.h:775 [inline]
> >> WARNING: CPU: 2 PID: 5104 at include/linux/memcontrol.h:775 zswap_folio_swapin+0x47d/0x5a0 mm/zswap.c:381
> >> Modules linked in:
> >> CPU: 2 PID: 5104 Comm: syz-fuzzer Not tainted 6.8.0-rc2-syzkaller-00031-g861c0981648f #0
> >> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> >> RIP: 0010:folio_lruvec include/linux/memcontrol.h:775 [inline]
> >
> > Hmm looks like it's this line:
> > VM_WARN_ON_ONCE_FOLIO(!memcg && !mem_cgroup_disabled(), folio);
> >
> > Looks like memcg was cleared from the folio. Haven't looked too
> > closely yet, but this (and the "page has been migrated" line above)
> > suggests maybe there is some migration business going on -
> > mem_cgroup_migrate() clears the old folio's memcg_data (via
> > old->memcg_data = 0).
>
> Yeah, I think it's this case.
>
> >
> > Here's my theory (which could be wrong - someone please fact-check
> > me): swap_read_folio(), which precedes zswap_folio_swapin(), unlocks
>
> And another case is !page_allocated, the returned folio is unlocked, right?

I think you're correct. That said, it's probably fine to leave the
protection size unchanged if we find the folio in the swapcache
anyway - IIUC, we are not performing a swapin in that case (since
!page_allocated means swap_read_folio() is not called), and an actual
swapin is the scenario that the heuristic cares about :)

IOW, something like this:

if (unlikely(page_allocated)) {
    zswap_folio_swapin(folio);
    swap_read_folio(folio, false, NULL);
}

makes sense to me, both from the correctness POV and the heuristics POV.
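
For illustration only, the two paths can be modelled in userspace as
below. Everything here is a hypothetical stand-in (struct toy_folio,
struct toy_stats and the toy_* helpers are not kernel APIs); the
sketch just assumes that the swapin hook bumps a protection counter
and that the read unlocks the folio.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; NOT the kernel's types. */
struct toy_folio {
	bool locked;
	bool charged;	/* models "folio has a memcg" */
};

struct toy_stats {
	unsigned long nr_protected;	/* models the protection size */
	unsigned long nr_reads;		/* reads actually issued */
};

/*
 * Models the swapin hook: bumps the protection counter. Returns
 * false where the kernel would warn (folio without a memcg).
 */
bool toy_zswap_swapin(const struct toy_folio *folio, struct toy_stats *stats)
{
	assert(folio && stats);
	if (!folio->charged)
		return false;
	stats->nr_protected++;
	return true;
}

/* Models the read: counts it and unlocks the folio. */
void toy_read(struct toy_folio *folio, struct toy_stats *stats)
{
	assert(folio && stats);
	stats->nr_reads++;
	folio->locked = false;
}

/*
 * One readahead step. Only a freshly allocated (still locked) folio
 * is accounted and read; a swapcache hit is left untouched.
 */
bool toy_readahead_one(struct toy_folio *folio, bool page_allocated,
		       struct toy_stats *stats)
{
	bool ok = true;

	if (page_allocated) {
		ok = toy_zswap_swapin(folio, stats);	/* folio still locked */
		toy_read(folio, stats);
	}
	return ok;
}
```

With page_allocated set, both counters go up by one and the folio ends
up unlocked; with a swapcache hit (page_allocated clear) neither
counter moves, which matches the "no swapin, no protection bump"
reasoning above.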


>
> > the folio. Could this be sufficient to allow for migration? If this is
>
> IMHO, folio locked is sufficient to avoid concurrent memcg migration.
>
> > the case, all we need to do is move this to above swap_read_folio(),
> > while the folio is still locked. __read_swap_cache_async() already
> > charges the folio to a memcg, so no need to wait till after
> > swap_read_folio() anyway.
>
> Should we call zswap_folio_swapin() in the !page_allocated case?
>
> Thanks.

