* [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one
From: syzbot @ 2022-12-05 9:21 UTC
To: djwong, linux-kernel, linux-xfs, syzkaller-bugs
Hello,
syzbot found the following issue on:
HEAD commit: 0ba09b173387 Revert "mm: align larger anonymous mappings o..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1736cf4b880000
kernel config: https://syzkaller.appspot.com/x/.config?x=2325e409a9a893e1
dashboard link: https://syzkaller.appspot.com/bug?extid=912776840162c13db1a3
compiler: Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2
Unfortunately, I don't have any reproducer for this issue yet.
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/9758ec2c06f4/disk-0ba09b17.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/06781dbfd581/vmlinux-0ba09b17.xz
kernel image: https://storage.googleapis.com/syzbot-assets/3d44a22d15fa/bzImage-0ba09b17.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+912776840162c13db1a3@syzkaller.appspotmail.com
==================================================================
BUG: KASAN: use-after-free in xfs_dquot_type fs/xfs/xfs_dquot.h:136 [inline]
BUG: KASAN: use-after-free in xfs_qm_dqfree_one+0x12f/0x170 fs/xfs/xfs_qm.c:1604
Read of size 1 at addr ffff88807ed63a98 by task syz-executor.2/22148
CPU: 1 PID: 22148 Comm: syz-executor.2 Not tainted 6.1.0-rc7-syzkaller-00211-g0ba09b173387 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1b1/0x28e lib/dump_stack.c:106
print_address_description+0x74/0x340 mm/kasan/report.c:284
print_report+0x107/0x1f0 mm/kasan/report.c:395
kasan_report+0xcd/0x100 mm/kasan/report.c:495
xfs_dquot_type fs/xfs/xfs_dquot.h:136 [inline]
xfs_qm_dqfree_one+0x12f/0x170 fs/xfs/xfs_qm.c:1604
xfs_qm_shrink_scan+0x351/0x410 fs/xfs/xfs_qm.c:523
do_shrink_slab+0x4e1/0xa00 mm/vmscan.c:842
shrink_slab+0x1e6/0x340 mm/vmscan.c:1002
drop_slab_node mm/vmscan.c:1037 [inline]
drop_slab+0x185/0x2c0 mm/vmscan.c:1047
drop_caches_sysctl_handler+0xb1/0x160 fs/drop_caches.c:66
proc_sys_call_handler+0x576/0x890 fs/proc/proc_sysctl.c:604
do_iter_write+0x6c2/0xc20 fs/read_write.c:861
iter_file_splice_write+0x7fc/0xfc0 fs/splice.c:686
do_splice_from fs/splice.c:764 [inline]
direct_splice_actor+0xe6/0x1c0 fs/splice.c:931
splice_direct_to_actor+0x4e4/0xc00 fs/splice.c:886
do_splice_direct+0x279/0x3d0 fs/splice.c:974
do_sendfile+0x5fb/0xf80 fs/read_write.c:1255
__do_sys_sendfile64 fs/read_write.c:1317 [inline]
__se_sys_sendfile64+0xd0/0x1b0 fs/read_write.c:1309
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7eff8be8c0d9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007eff8cbb7168 EFLAGS: 00000246 ORIG_RAX: 0000000000000028
RAX: ffffffffffffffda RBX: 00007eff8bfabf80 RCX: 00007eff8be8c0d9
RDX: 0000000020002080 RSI: 0000000000000004 RDI: 0000000000000006
RBP: 00007eff8bee7ae9 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000870 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffd5ac99a0f R14: 00007eff8cbb7300 R15: 0000000000022000
</TASK>
Allocated by task 22095:
kasan_save_stack mm/kasan/common.c:45 [inline]
kasan_set_track+0x3d/0x60 mm/kasan/common.c:52
__kasan_slab_alloc+0x65/0x70 mm/kasan/common.c:325
kasan_slab_alloc include/linux/kasan.h:201 [inline]
slab_post_alloc_hook mm/slab.h:737 [inline]
slab_alloc_node mm/slub.c:3398 [inline]
slab_alloc mm/slub.c:3406 [inline]
__kmem_cache_alloc_lru mm/slub.c:3413 [inline]
kmem_cache_alloc+0x1cc/0x300 mm/slub.c:3422
kmem_cache_zalloc include/linux/slab.h:679 [inline]
xfs_dquot_alloc+0x36/0x600 fs/xfs/xfs_dquot.c:475
xfs_qm_dqread+0x8a/0x1d0 fs/xfs/xfs_dquot.c:659
xfs_qm_dqget+0x27d/0x4f0 fs/xfs/xfs_dquot.c:870
xfs_qm_vop_dqalloc+0x9bf/0xca0 fs/xfs/xfs_qm.c:1704
xfs_setattr_nonsize+0x3c2/0xfd0 fs/xfs/xfs_iops.c:702
xfs_vn_setattr+0x2f5/0x340 fs/xfs/xfs_iops.c:1022
notify_change+0xe38/0x10f0 fs/attr.c:420
chown_common+0x586/0x8f0 fs/open.c:736
do_fchownat+0x165/0x240 fs/open.c:767
__do_sys_lchown fs/open.c:792 [inline]
__se_sys_lchown fs/open.c:790 [inline]
__x64_sys_lchown+0x81/0x90 fs/open.c:790
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Freed by task 3661:
kasan_save_stack mm/kasan/common.c:45 [inline]
kasan_set_track+0x3d/0x60 mm/kasan/common.c:52
kasan_save_free_info+0x27/0x40 mm/kasan/generic.c:511
____kasan_slab_free+0xd6/0x120 mm/kasan/common.c:236
kasan_slab_free include/linux/kasan.h:177 [inline]
slab_free_hook mm/slub.c:1724 [inline]
slab_free_freelist_hook+0x12e/0x1a0 mm/slub.c:1750
slab_free mm/slub.c:3661 [inline]
kmem_cache_free+0x94/0x1d0 mm/slub.c:3683
xfs_qm_dqpurge+0x4f7/0x660 fs/xfs/xfs_qm.c:177
xfs_qm_dquot_walk+0x249/0x490 fs/xfs/xfs_qm.c:87
xfs_qm_dqpurge_all fs/xfs/xfs_qm.c:193 [inline]
xfs_qm_unmount+0x71/0x100 fs/xfs/xfs_qm.c:205
xfs_unmountfs+0xc5/0x1e0 fs/xfs/xfs_mount.c:1059
xfs_fs_put_super+0x6e/0x2d0 fs/xfs/xfs_super.c:1115
generic_shutdown_super+0x130/0x310 fs/super.c:492
kill_block_super+0x79/0xd0 fs/super.c:1428
deactivate_locked_super+0xa7/0xf0 fs/super.c:332
cleanup_mnt+0x494/0x520 fs/namespace.c:1186
task_work_run+0x243/0x300 kernel/task_work.c:179
resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
exit_to_user_mode_loop+0x124/0x150 kernel/entry/common.c:171
exit_to_user_mode_prepare+0xb2/0x140 kernel/entry/common.c:203
__syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
syscall_exit_to_user_mode+0x26/0x60 kernel/entry/common.c:296
do_syscall_64+0x49/0xb0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x63/0xcd
The buggy address belongs to the object at ffff88807ed63a80
which belongs to the cache xfs_dquot of size 704
The buggy address is located 24 bytes inside of
704-byte region [ffff88807ed63a80, ffff88807ed63d40)
The buggy address belongs to the physical page:
page:ffffea0001fb5800 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7ed60
head:ffffea0001fb5800 order:2 compound_mapcount:0 compound_pincount:0
flags: 0xfff00000010200(slab|head|node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff00000010200 ffffea0001a74500 dead000000000003 ffff88801c6f3a00
raw: 0000000000000000 0000000080130013 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 2, migratetype Unmovable, gfp_mask 0x1d20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL), pid 18894, tgid 18893 (syz-executor.0), ts 751185457243, free_ts 748684720140
prep_new_page mm/page_alloc.c:2539 [inline]
get_page_from_freelist+0x742/0x7c0 mm/page_alloc.c:4291
__alloc_pages+0x259/0x560 mm/page_alloc.c:5558
alloc_slab_page+0xbd/0x190 mm/slub.c:1794
allocate_slab+0x5e/0x4b0 mm/slub.c:1939
new_slab mm/slub.c:1992 [inline]
___slab_alloc+0x782/0xe20 mm/slub.c:3180
__slab_alloc mm/slub.c:3279 [inline]
slab_alloc_node mm/slub.c:3364 [inline]
slab_alloc mm/slub.c:3406 [inline]
__kmem_cache_alloc_lru mm/slub.c:3413 [inline]
kmem_cache_alloc+0x24c/0x300 mm/slub.c:3422
kmem_cache_zalloc include/linux/slab.h:679 [inline]
xfs_dquot_alloc+0x36/0x600 fs/xfs/xfs_dquot.c:475
xfs_qm_dqread+0x8a/0x1d0 fs/xfs/xfs_dquot.c:659
xfs_qm_dqget_inode+0x430/0x960 fs/xfs/xfs_dquot.c:973
xfs_qm_dqattach_one+0xe8/0x1c0 fs/xfs/xfs_qm.c:277
xfs_qm_dqattach_locked+0x3ed/0x4a0 fs/xfs/xfs_qm.c:336
xfs_qm_vop_dqalloc+0x3f2/0xca0 fs/xfs/xfs_qm.c:1659
xfs_setattr_nonsize+0x3c2/0xfd0 fs/xfs/xfs_iops.c:702
xfs_vn_setattr+0x2f5/0x340 fs/xfs/xfs_iops.c:1022
notify_change+0xe38/0x10f0 fs/attr.c:420
chown_common+0x586/0x8f0 fs/open.c:736
page last free stack trace:
reset_page_owner include/linux/page_owner.h:24 [inline]
free_pages_prepare mm/page_alloc.c:1459 [inline]
free_pcp_prepare+0x80c/0x8f0 mm/page_alloc.c:1509
free_unref_page_prepare mm/page_alloc.c:3387 [inline]
free_unref_page_list+0xb4/0x7b0 mm/page_alloc.c:3529
release_pages+0x232a/0x25c0 mm/swap.c:1055
__pagevec_release+0x7d/0xf0 mm/swap.c:1075
pagevec_release include/linux/pagevec.h:71 [inline]
folio_batch_release include/linux/pagevec.h:135 [inline]
truncate_inode_pages_range+0x472/0x17f0 mm/truncate.c:373
kill_bdev block/bdev.c:76 [inline]
blkdev_flush_mapping+0x153/0x2c0 block/bdev.c:662
blkdev_put_whole block/bdev.c:693 [inline]
blkdev_put+0x4a5/0x730 block/bdev.c:953
deactivate_locked_super+0xa7/0xf0 fs/super.c:332
cleanup_mnt+0x494/0x520 fs/namespace.c:1186
task_work_run+0x243/0x300 kernel/task_work.c:179
resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
exit_to_user_mode_loop+0x124/0x150 kernel/entry/common.c:171
exit_to_user_mode_prepare+0xb2/0x140 kernel/entry/common.c:203
__syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
syscall_exit_to_user_mode+0x26/0x60 kernel/entry/common.c:296
do_syscall_64+0x49/0xb0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Memory state around the buggy address:
 ffff88807ed63980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88807ed63a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88807ed63a80: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                            ^
 ffff88807ed63b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88807ed63b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
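Editorial note: the "buggy address" lines in the report above can be cross-checked with a little address arithmetic. The sketch below is only an illustration, not kernel code. The helper names are hypothetical, and it assumes generic KASAN's layout of one shadow byte per 8 bytes of memory with 16 shadow bytes per row of the "Memory state" dump. As far as I recall from mm/kasan/kasan.h, fb marks freed slab memory and fc a slab redzone, but the report itself does not state that.

```c
#include <stdint.h>

/* Assumption: generic KASAN tracks 8 bytes of memory per shadow byte. */
#define SHADOW_GRANULE 8u

/* Offset of the faulting address from the start of its slab object. */
static uint64_t object_offset(uint64_t fault_addr, uint64_t object_start)
{
	return fault_addr - object_start;
}

/* Exclusive end address of an object of the given size. */
static uint64_t object_end(uint64_t object_start, uint64_t size)
{
	return object_start + size;
}

/*
 * Index of the shadow byte that covers fault_addr within the dump row
 * whose label is row_addr (index 0 is the first byte after the colon).
 */
static unsigned int shadow_index_in_row(uint64_t fault_addr, uint64_t row_addr)
{
	return (unsigned int)((fault_addr - row_addr) / SHADOW_GRANULE);
}
```

With the values from this report, object_offset(0xffff88807ed63a98, 0xffff88807ed63a80) gives 24 and object_end(0xffff88807ed63a80, 704) gives 0xffff88807ed63d40, matching the "24 bytes inside of 704-byte region" text. shadow_index_in_row() gives 3, which is an fb byte in the row marked with ">".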
* Re: [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one
From: syzbot @ 2022-12-05 10:35 UTC
To: djwong, linux-kernel, linux-xfs, syzkaller-bugs
syzbot has found a reproducer for the following issue on:
HEAD commit: 0ba09b173387 Revert "mm: align larger anonymous mappings o..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15550c47880000
kernel config: https://syzkaller.appspot.com/x/.config?x=2325e409a9a893e1
dashboard link: https://syzkaller.appspot.com/bug?extid=912776840162c13db1a3
compiler: Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=128c9e23880000
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/9758ec2c06f4/disk-0ba09b17.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/06781dbfd581/vmlinux-0ba09b17.xz
kernel image: https://storage.googleapis.com/syzbot-assets/3d44a22d15fa/bzImage-0ba09b17.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/335889b2d730/mount_0.gz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+912776840162c13db1a3@syzkaller.appspotmail.com
XFS (loop1): Quotacheck: Done.
syz-executor.1 (4657): drop_caches: 2
==================================================================
BUG: KASAN: use-after-free in xfs_dquot_type fs/xfs/xfs_dquot.h:136 [inline]
BUG: KASAN: use-after-free in xfs_qm_dqfree_one+0x12f/0x170 fs/xfs/xfs_qm.c:1604
Read of size 1 at addr ffff888079a6aa58 by task syz-executor.1/4657
CPU: 1 PID: 4657 Comm: syz-executor.1 Not tainted 6.1.0-rc7-syzkaller-00211-g0ba09b173387 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1b1/0x28e lib/dump_stack.c:106
print_address_description+0x74/0x340 mm/kasan/report.c:284
print_report+0x107/0x1f0 mm/kasan/report.c:395
kasan_report+0xcd/0x100 mm/kasan/report.c:495
xfs_dquot_type fs/xfs/xfs_dquot.h:136 [inline]
xfs_qm_dqfree_one+0x12f/0x170 fs/xfs/xfs_qm.c:1604
xfs_qm_shrink_scan+0x351/0x410 fs/xfs/xfs_qm.c:523
do_shrink_slab+0x4e1/0xa00 mm/vmscan.c:842
shrink_slab+0x1e6/0x340 mm/vmscan.c:1002
drop_slab_node mm/vmscan.c:1037 [inline]
drop_slab+0x185/0x2c0 mm/vmscan.c:1047
drop_caches_sysctl_handler+0xb1/0x160 fs/drop_caches.c:66
proc_sys_call_handler+0x576/0x890 fs/proc/proc_sysctl.c:604
do_iter_write+0x6c2/0xc20 fs/read_write.c:861
iter_file_splice_write+0x7fc/0xfc0 fs/splice.c:686
do_splice_from fs/splice.c:764 [inline]
direct_splice_actor+0xe6/0x1c0 fs/splice.c:931
splice_direct_to_actor+0x4e4/0xc00 fs/splice.c:886
do_splice_direct+0x279/0x3d0 fs/splice.c:974
do_sendfile+0x5fb/0xf80 fs/read_write.c:1255
__do_sys_sendfile64 fs/read_write.c:1317 [inline]
__se_sys_sendfile64+0xd0/0x1b0 fs/read_write.c:1309
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fc3e3c8c0d9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fc3e4a9a168 EFLAGS: 00000246 ORIG_RAX: 0000000000000028
RAX: ffffffffffffffda RBX: 00007fc3e3dabf80 RCX: 00007fc3e3c8c0d9
RDX: 0000000020002080 RSI: 0000000000000004 RDI: 0000000000000006
RBP: 00007fc3e3ce7ae9 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000870 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fffcb98dc7f R14: 00007fc3e4a9a300 R15: 0000000000022000
</TASK>
Allocated by task 4642:
kasan_save_stack mm/kasan/common.c:45 [inline]
kasan_set_track+0x3d/0x60 mm/kasan/common.c:52
__kasan_slab_alloc+0x65/0x70 mm/kasan/common.c:325
kasan_slab_alloc include/linux/kasan.h:201 [inline]
slab_post_alloc_hook mm/slab.h:737 [inline]
slab_alloc_node mm/slub.c:3398 [inline]
slab_alloc mm/slub.c:3406 [inline]
__kmem_cache_alloc_lru mm/slub.c:3413 [inline]
kmem_cache_alloc+0x1cc/0x300 mm/slub.c:3422
kmem_cache_zalloc include/linux/slab.h:679 [inline]
xfs_dquot_alloc+0x36/0x600 fs/xfs/xfs_dquot.c:475
xfs_qm_dqread+0x8a/0x1d0 fs/xfs/xfs_dquot.c:659
xfs_qm_dqget+0x27d/0x4f0 fs/xfs/xfs_dquot.c:870
xfs_qm_vop_dqalloc+0x9bf/0xca0 fs/xfs/xfs_qm.c:1704
xfs_setattr_nonsize+0x3c2/0xfd0 fs/xfs/xfs_iops.c:702
xfs_vn_setattr+0x2f5/0x340 fs/xfs/xfs_iops.c:1022
notify_change+0xe38/0x10f0 fs/attr.c:420
chown_common+0x586/0x8f0 fs/open.c:736
do_fchownat+0x165/0x240 fs/open.c:767
__do_sys_chown fs/open.c:787 [inline]
__se_sys_chown fs/open.c:785 [inline]
__x64_sys_chown+0x7e/0x90 fs/open.c:785
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Freed by task 3677:
kasan_save_stack mm/kasan/common.c:45 [inline]
kasan_set_track+0x3d/0x60 mm/kasan/common.c:52
kasan_save_free_info+0x27/0x40 mm/kasan/generic.c:511
____kasan_slab_free+0xd6/0x120 mm/kasan/common.c:236
kasan_slab_free include/linux/kasan.h:177 [inline]
slab_free_hook mm/slub.c:1724 [inline]
slab_free_freelist_hook+0x12e/0x1a0 mm/slub.c:1750
slab_free mm/slub.c:3661 [inline]
kmem_cache_free+0x94/0x1d0 mm/slub.c:3683
xfs_qm_dqpurge+0x4f7/0x660 fs/xfs/xfs_qm.c:177
xfs_qm_dquot_walk+0x249/0x490 fs/xfs/xfs_qm.c:87
xfs_qm_dqpurge_all fs/xfs/xfs_qm.c:193 [inline]
xfs_qm_unmount+0x71/0x100 fs/xfs/xfs_qm.c:205
xfs_unmountfs+0xc5/0x1e0 fs/xfs/xfs_mount.c:1059
xfs_fs_put_super+0x6e/0x2d0 fs/xfs/xfs_super.c:1115
generic_shutdown_super+0x130/0x310 fs/super.c:492
kill_block_super+0x79/0xd0 fs/super.c:1428
deactivate_locked_super+0xa7/0xf0 fs/super.c:332
cleanup_mnt+0x494/0x520 fs/namespace.c:1186
task_work_run+0x243/0x300 kernel/task_work.c:179
resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
exit_to_user_mode_loop+0x124/0x150 kernel/entry/common.c:171
exit_to_user_mode_prepare+0xb2/0x140 kernel/entry/common.c:203
__syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
syscall_exit_to_user_mode+0x26/0x60 kernel/entry/common.c:296
do_syscall_64+0x49/0xb0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x63/0xcd
The buggy address belongs to the object at ffff888079a6aa40
which belongs to the cache xfs_dquot of size 704
The buggy address is located 24 bytes inside of
704-byte region [ffff888079a6aa40, ffff888079a6ad00)
The buggy address belongs to the physical page:
page:ffffea0001e69a00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x79a68
head:ffffea0001e69a00 order:2 compound_mapcount:0 compound_pincount:0
flags: 0xfff00000010200(slab|head|node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff00000010200 0000000000000000 dead000000000122 ffff88814660f000
raw: 0000000000000000 0000000080130013 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 2, migratetype Unmovable, gfp_mask 0x1d20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL), pid 33, tgid 33 (kworker/u4:2), ts 336553992665, free_ts 335702258247
prep_new_page mm/page_alloc.c:2539 [inline]
get_page_from_freelist+0x742/0x7c0 mm/page_alloc.c:4291
__alloc_pages+0x259/0x560 mm/page_alloc.c:5558
alloc_slab_page+0xbd/0x190 mm/slub.c:1794
allocate_slab+0x5e/0x4b0 mm/slub.c:1939
new_slab mm/slub.c:1992 [inline]
___slab_alloc+0x782/0xe20 mm/slub.c:3180
__slab_alloc mm/slub.c:3279 [inline]
slab_alloc_node mm/slub.c:3364 [inline]
slab_alloc mm/slub.c:3406 [inline]
__kmem_cache_alloc_lru mm/slub.c:3413 [inline]
kmem_cache_alloc+0x24c/0x300 mm/slub.c:3422
kmem_cache_zalloc include/linux/slab.h:679 [inline]
xfs_dquot_alloc+0x36/0x600 fs/xfs/xfs_dquot.c:475
xfs_qm_dqread+0x8a/0x1d0 fs/xfs/xfs_dquot.c:659
xfs_qm_dqget+0x27d/0x4f0 fs/xfs/xfs_dquot.c:870
xfs_qm_quotacheck_dqadjust+0xb7/0x380 fs/xfs/xfs_qm.c:1077
xfs_qm_dqusage_adjust+0x4bd/0x630 fs/xfs/xfs_qm.c:1189
xfs_iwalk_ag_recs+0x425/0x620 fs/xfs/xfs_iwalk.c:220
xfs_iwalk_run_callbacks+0x20f/0x410 fs/xfs/xfs_iwalk.c:376
xfs_iwalk_ag+0xaa5/0xb80 fs/xfs/xfs_iwalk.c:482
xfs_iwalk_ag_work+0xf5/0x1a0 fs/xfs/xfs_iwalk.c:624
xfs_pwork_work+0x7f/0x180 fs/xfs/xfs_pwork.c:47
page last free stack trace:
reset_page_owner include/linux/page_owner.h:24 [inline]
free_pages_prepare mm/page_alloc.c:1459 [inline]
free_pcp_prepare+0x80c/0x8f0 mm/page_alloc.c:1509
free_unref_page_prepare mm/page_alloc.c:3387 [inline]
free_unref_page+0x7d/0x5f0 mm/page_alloc.c:3483
__stack_depot_save+0x430/0x4a0 lib/stackdepot.c:506
kasan_save_stack mm/kasan/common.c:46 [inline]
kasan_set_track+0x52/0x60 mm/kasan/common.c:52
kasan_save_free_info+0x27/0x40 mm/kasan/generic.c:511
____kasan_slab_free+0xd6/0x120 mm/kasan/common.c:236
kasan_slab_free include/linux/kasan.h:177 [inline]
slab_free_hook mm/slub.c:1724 [inline]
slab_free_freelist_hook+0x12e/0x1a0 mm/slub.c:1750
slab_free mm/slub.c:3661 [inline]
__kmem_cache_free+0x71/0x110 mm/slub.c:3674
memcg_free_slab_cgroups mm/slab.h:456 [inline]
unaccount_slab mm/slab.h:645 [inline]
__free_slab+0xf0/0x320 mm/slub.c:2015
qlist_free_all+0x2b/0x70 mm/kasan/quarantine.c:187
kasan_quarantine_reduce+0x169/0x180 mm/kasan/quarantine.c:294
__kasan_slab_alloc+0x1f/0x70 mm/kasan/common.c:302
kasan_slab_alloc include/linux/kasan.h:201 [inline]
slab_post_alloc_hook mm/slab.h:737 [inline]
slab_alloc_node mm/slub.c:3398 [inline]
__kmem_cache_alloc_node+0x1d7/0x310 mm/slub.c:3437
__do_kmalloc_node mm/slab_common.c:954 [inline]
__kmalloc+0x9e/0x1a0 mm/slab_common.c:968
kmalloc include/linux/slab.h:558 [inline]
tomoyo_realpath_from_path+0xcd/0x5f0 security/tomoyo/realpath.c:251
tomoyo_get_realpath security/tomoyo/file.c:151 [inline]
tomoyo_path_perm+0x227/0x670 security/tomoyo/file.c:822
Memory state around the buggy address:
 ffff888079a6a900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888079a6a980: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
>ffff888079a6aa00: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
                                                    ^
 ffff888079a6aa80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888079a6ab00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
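Editorial note: the "drop_caches: 2" console line and the drop_caches_sysctl_handler frame in the trace above show how the dquot shrinker was invoked. According to the kernel's sysctl documentation, writing 2 to /proc/sys/vm/drop_caches asks the kernel to free reclaimable slab objects, which runs the registered shrinkers. The trace suggests the reproducer reached that file through sendfile(). The sketch below uses a plain buffered write instead and is not the reproducer. write_drop_caches() and its path parameter are hypothetical names, introduced so the helper can be tried against an ordinary file.

```c
#include <stdio.h>

/*
 * Write a drop_caches mode to the control file at `path`.
 * Assumption: only the documented dropping modes 1, 2 and 3 are accepted
 * here (1 = page cache, 2 = reclaimable slab objects, 3 = both).
 * Returns 0 on success, -1 on any failure.
 */
static int write_drop_caches(const char *path, int mode)
{
	FILE *f;
	int ok;

	if (mode < 1 || mode > 3)
		return -1;

	f = fopen(path, "w");
	if (f == NULL)
		return -1;

	ok = fprintf(f, "%d\n", mode) > 0;
	/* fclose() flushes the buffer, so a failed write shows up here. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

On a test machine, and as root, write_drop_caches("/proc/sys/vm/drop_caches", 2) would request the slab shrink pass that appears in the call trace.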
* [PATCH] xfs: dquot shrinker doesn't check for XFS_DQFLAG_FREEING
From: Dave Chinner @ 2022-12-05 22:52 UTC
To: syzbot; +Cc: djwong, linux-kernel, linux-xfs, syzkaller-bugs

On Mon, Dec 05, 2022 at 02:35:39AM -0800, syzbot wrote:
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: 0ba09b173387 Revert "mm: align larger anonymous mappings o..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=15550c47880000
> kernel config: https://syzkaller.appspot.com/x/.config?x=2325e409a9a893e1
> dashboard link: https://syzkaller.appspot.com/bug?extid=912776840162c13db1a3
> compiler: Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=128c9e23880000
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/9758ec2c06f4/disk-0ba09b17.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/06781dbfd581/vmlinux-0ba09b17.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/3d44a22d15fa/bzImage-0ba09b17.xz
> mounted in repro: https://storage.googleapis.com/syzbot-assets/335889b2d730/mount_0.gz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+912776840162c13db1a3@syzkaller.appspotmail.com
>
> XFS (loop1): Quotacheck: Done.
> syz-executor.1 (4657): drop_caches: 2
> ==================================================================
> BUG: KASAN: use-after-free in xfs_dquot_type fs/xfs/xfs_dquot.h:136 [inline]
> BUG: KASAN: use-after-free in xfs_qm_dqfree_one+0x12f/0x170 fs/xfs/xfs_qm.c:1604
> Read of size 1 at addr ffff888079a6aa58 by task syz-executor.1/4657

Looks like we've missed an XFS_DQFLAG_FREEING check in
xfs_qm_shrink_scan(), and the dquot purge run by unmount has raced
with the shrinker. Patch below should fix it.

-Dave.
--
Dave Chinner
david@fromorbit.com

xfs: dquot shrinker doesn't check for XFS_DQFLAG_FREEING

From: Dave Chinner <dchinner@redhat.com>

Resulting in a UAF if the shrinker races with some other dquot
freeing mechanism that sets XFS_DQFLAG_FREEING before the dquot is
removed from the LRU. This can occur if a dquot purge races with
drop_caches.

Reported-by: syzbot+912776840162c13db1a3@syzkaller.appspotmail.com
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_qm.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index 18bb4ec4d7c9..ff53d40a2dae 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -422,6 +422,14 @@ xfs_qm_dquot_isolate(
 	if (!xfs_dqlock_nowait(dqp))
 		goto out_miss_busy;
 
+	/*
+	 * If something else is freeing this dquot and hasn't yet removed it
+	 * from the LRU, leave it for the freeing task to complete the freeing
+	 * process rather than risk it being freed from under us here.
+	 */
+	if (dqp->q_flags & XFS_DQFLAG_FREEING)
+		goto out_miss_unlock;
+
 	/*
 	 * This dquot has acquired a reference in the meantime remove it from
 	 * the freelist and try again.
@@ -441,10 +449,8 @@ xfs_qm_dquot_isolate(
 	 * skip it so there is time for the IO to complete before we try to
 	 * reclaim it again on the next LRU pass.
 	 */
-	if (!xfs_dqflock_nowait(dqp)) {
-		xfs_dqunlock(dqp);
-		goto out_miss_busy;
-	}
+	if (!xfs_dqflock_nowait(dqp))
+		goto out_miss_unlock;
 
 	if (XFS_DQ_IS_DIRTY(dqp)) {
 		struct xfs_buf *bp = NULL;
@@ -478,6 +484,8 @@ xfs_qm_dquot_isolate(
 	XFS_STATS_INC(dqp->q_mount, xs_qm_dqreclaims);
 	return LRU_REMOVED;
 
+out_miss_unlock:
+	xfs_dqunlock(dqp);
 out_miss_busy:
 	trace_xfs_dqreclaim_busy(dqp);
 	XFS_STATS_INC(dqp->q_mount, xs_qm_dqreclaim_misses);
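Editorial note: the control flow the patch introduces (try the lock, skip objects already marked as being freed, and funnel every "miss" through a shared unlock label) can be tried out in user space. The following is a minimal sketch under stated assumptions, not the XFS implementation. struct toy_dquot, TOY_FREEING, toy_trylock(), toy_unlock() and toy_isolate() are all hypothetical stand-ins, and a C11 atomic_flag replaces the kernel's dquot lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_FREEING 0x1u	/* hypothetical stand-in for XFS_DQFLAG_FREEING */

enum toy_result { TOY_REMOVED, TOY_SKIPPED };

struct toy_dquot {
	atomic_flag lock;	/* stand-in for the dquot lock */
	unsigned int flags;	/* only touched while holding lock */
	atomic_uint misses;	/* how often reclaim skipped this object */
};

/* Non-blocking lock attempt: true if we now hold the lock. */
static bool toy_trylock(struct toy_dquot *dq)
{
	return !atomic_flag_test_and_set_explicit(&dq->lock,
						  memory_order_acquire);
}

static void toy_unlock(struct toy_dquot *dq)
{
	atomic_flag_clear_explicit(&dq->lock, memory_order_release);
}

/* Reclaim-side callback: claim the object or leave it strictly alone. */
static enum toy_result toy_isolate(struct toy_dquot *dq)
{
	assert(dq != NULL);

	/* Reclaim must not block: a held lock counts as a miss. */
	if (!toy_trylock(dq))
		goto out_miss_busy;

	/* Someone else is already freeing this object; let them finish. */
	if (dq->flags & TOY_FREEING)
		goto out_miss_unlock;

	/* Claim the object so that no other path tries to free it too. */
	dq->flags |= TOY_FREEING;
	toy_unlock(dq);
	return TOY_REMOVED;

out_miss_unlock:
	toy_unlock(dq);
out_miss_busy:
	atomic_fetch_add(&dq->misses, 1);
	return TOY_SKIPPED;
}
```

The point of the second label is that every exit taken after the lock was acquired drops it in exactly one place. That is the same tidy-up the patch applies to the xfs_dqflock_nowait() failure path.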
* Re: [PATCH] xfs: dquot shrinker doesn't check for XFS_DQFLAG_FREEING
From: Darrick J. Wong @ 2022-12-07 16:17 UTC
To: Dave Chinner; +Cc: syzbot, linux-kernel, linux-xfs, syzkaller-bugs

On Tue, Dec 06, 2022 at 09:52:46AM +1100, Dave Chinner wrote:
> On Mon, Dec 05, 2022 at 02:35:39AM -0800, syzbot wrote:
> > syzbot has found a reproducer for the following issue on:
> >
> > HEAD commit: 0ba09b173387 Revert "mm: align larger anonymous mappings o..
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=15550c47880000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=2325e409a9a893e1
> > dashboard link: https://syzkaller.appspot.com/bug?extid=912776840162c13db1a3
> > compiler: Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2
> > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=128c9e23880000
> >
> > Downloadable assets:
> > disk image: https://storage.googleapis.com/syzbot-assets/9758ec2c06f4/disk-0ba09b17.raw.xz
> > vmlinux: https://storage.googleapis.com/syzbot-assets/06781dbfd581/vmlinux-0ba09b17.xz
> > kernel image: https://storage.googleapis.com/syzbot-assets/3d44a22d15fa/bzImage-0ba09b17.xz
> > mounted in repro: https://storage.googleapis.com/syzbot-assets/335889b2d730/mount_0.gz
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+912776840162c13db1a3@syzkaller.appspotmail.com
> >
> > XFS (loop1): Quotacheck: Done.
> > syz-executor.1 (4657): drop_caches: 2
> > ==================================================================
> > BUG: KASAN: use-after-free in xfs_dquot_type fs/xfs/xfs_dquot.h:136 [inline]
> > BUG: KASAN: use-after-free in xfs_qm_dqfree_one+0x12f/0x170 fs/xfs/xfs_qm.c:1604
> > Read of size 1 at addr ffff888079a6aa58 by task syz-executor.1/4657
>
> Looks like we've missed an XFS_DQFLAG_FREEING check in
> xfs_qm_shrink_scan(), and the dquot purge run by unmount has raced
> with the shrinker. Patch below should fix it.
>
> -Dave.
> --
> Dave Chinner
> david@fromorbit.com
>
> xfs: dquot shrinker doesn't check for XFS_DQFLAG_FREEING
>
> From: Dave Chinner <dchinner@redhat.com>
>
> Resulting in a UAF if the shrinker races with some other dquot
> freeing mechanism that sets XFS_DQFLAG_FREEING before the dquot is
> removed from the LRU. This can occur if a dquot purge races with
> drop_caches.
>
> Reported-by: syzbot+912776840162c13db1a3@syzkaller.appspotmail.com
> Signed-off-by: Dave Chinner <dchinner@redhat.com>

Please repost this as a toplevel thread so it doesn't get lost in the
depths.

Anyway, this looks correct so:
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> ---
>  fs/xfs/xfs_qm.c | 16 ++++++++++++----
>  1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
> index 18bb4ec4d7c9..ff53d40a2dae 100644
> --- a/fs/xfs/xfs_qm.c
> +++ b/fs/xfs/xfs_qm.c
> @@ -422,6 +422,14 @@ xfs_qm_dquot_isolate(
>  	if (!xfs_dqlock_nowait(dqp))
>  		goto out_miss_busy;
>
> +	/*
> +	 * If something else is freeing this dquot and hasn't yet removed it
> +	 * from the LRU, leave it for the freeing task to complete the freeing
> +	 * process rather than risk it being freed from under us here.
> +	 */
> +	if (dqp->q_flags & XFS_DQFLAG_FREEING)
> +		goto out_miss_unlock;
> +
>  	/*
>  	 * This dquot has acquired a reference in the meantime remove it from
>  	 * the freelist and try again.
> @@ -441,10 +449,8 @@ xfs_qm_dquot_isolate(
>  	 * skip it so there is time for the IO to complete before we try to
>  	 * reclaim it again on the next LRU pass.
>  	 */
> -	if (!xfs_dqflock_nowait(dqp)) {
> -		xfs_dqunlock(dqp);
> -		goto out_miss_busy;
> -	}
> +	if (!xfs_dqflock_nowait(dqp))
> +		goto out_miss_unlock;
>
>  	if (XFS_DQ_IS_DIRTY(dqp)) {
>  		struct xfs_buf *bp = NULL;
> @@ -478,6 +484,8 @@ xfs_qm_dquot_isolate(
>  	XFS_STATS_INC(dqp->q_mount, xs_qm_dqreclaims);
>  	return LRU_REMOVED;
>
> +out_miss_unlock:
> +	xfs_dqunlock(dqp);
>  out_miss_busy:
>  	trace_xfs_dqreclaim_busy(dqp);
>  	XFS_STATS_INC(dqp->q_mount, xs_qm_dqreclaim_misses);
* Re: [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one
From: Dave Chinner @ 2022-12-05 23:58 UTC
To: syzbot; +Cc: djwong, linux-kernel, linux-xfs, syzkaller-bugs

On Mon, Dec 05, 2022 at 02:35:39AM -0800, syzbot wrote:
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: 0ba09b173387 Revert "mm: align larger anonymous mappings o..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=15550c47880000
> kernel config: https://syzkaller.appspot.com/x/.config?x=2325e409a9a893e1
> dashboard link: https://syzkaller.appspot.com/bug?extid=912776840162c13db1a3
> compiler: Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=128c9e23880000
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/9758ec2c06f4/disk-0ba09b17.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/06781dbfd581/vmlinux-0ba09b17.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/3d44a22d15fa/bzImage-0ba09b17.xz
> mounted in repro: https://storage.googleapis.com/syzbot-assets/335889b2d730/mount_0.gz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+912776840162c13db1a3@syzkaller.appspotmail.com

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

xfs: dquot shrinker doesn't check for XFS_DQFLAG_FREEING

From: Dave Chinner <dchinner@redhat.com>

Resulting in a UAF if the shrinker races with some other dquot
freeing mechanism that sets XFS_DQFLAG_FREEING before the dquot is
removed from the LRU. This can occur if a dquot purge races with
drop_caches.

Reported-by: syzbot+912776840162c13db1a3@syzkaller.appspotmail.com
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_qm.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index 18bb4ec4d7c9..ff53d40a2dae 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -422,6 +422,14 @@ xfs_qm_dquot_isolate(
 	if (!xfs_dqlock_nowait(dqp))
 		goto out_miss_busy;
 
+	/*
+	 * If something else is freeing this dquot and hasn't yet removed it
+	 * from the LRU, leave it for the freeing task to complete the freeing
+	 * process rather than risk it being freed from under us here.
+	 */
+	if (dqp->q_flags & XFS_DQFLAG_FREEING)
+		goto out_miss_unlock;
+
 	/*
 	 * This dquot has acquired a reference in the meantime remove it from
 	 * the freelist and try again.
@@ -441,10 +449,8 @@ xfs_qm_dquot_isolate(
 	 * skip it so there is time for the IO to complete before we try to
 	 * reclaim it again on the next LRU pass.
 	 */
-	if (!xfs_dqflock_nowait(dqp)) {
-		xfs_dqunlock(dqp);
-		goto out_miss_busy;
-	}
+	if (!xfs_dqflock_nowait(dqp))
+		goto out_miss_unlock;
 
 	if (XFS_DQ_IS_DIRTY(dqp)) {
 		struct xfs_buf *bp = NULL;
@@ -478,6 +484,8 @@ xfs_qm_dquot_isolate(
 	XFS_STATS_INC(dqp->q_mount, xs_qm_dqreclaims);
 	return LRU_REMOVED;
 
+out_miss_unlock:
+	xfs_dqunlock(dqp);
 out_miss_busy:
 	trace_xfs_dqreclaim_busy(dqp);
 	XFS_STATS_INC(dqp->q_mount, xs_qm_dqreclaim_misses);
* Re: [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one
2022-12-05 23:58 ` [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one Dave Chinner
@ 2022-12-06 3:12 ` syzbot
2022-12-06 3:34 ` Dave Chinner
0 siblings, 1 reply; 15+ messages in thread
From: syzbot @ 2022-12-06 3:12 UTC (permalink / raw)
To: david, djwong, linux-kernel, linux-xfs, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
INFO: rcu detected stall in corrupted

rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P4122 } 2641 jiffies s: 2877 root: 0x0/T
rcu: blocking rcu_node structures (internal RCU debug):

Tested on:

commit: bce93322 proc: proc_skip_spaces() shouldn't think it i..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
console output: https://syzkaller.appspot.com/x/log.txt?x=1566216b880000
kernel config: https://syzkaller.appspot.com/x/.config?x=d58e7fe7f9cf5e24
dashboard link: https://syzkaller.appspot.com/bug?extid=912776840162c13db1a3
compiler: Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=164cad83880000

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one
2022-12-06 3:12 ` syzbot
@ 2022-12-06 3:34 ` Dave Chinner
2022-12-06 11:06 ` Dmitry Vyukov
0 siblings, 1 reply; 15+ messages in thread
From: Dave Chinner @ 2022-12-06 3:34 UTC (permalink / raw)
To: syzbot; +Cc: djwong, linux-kernel, linux-xfs, syzkaller-bugs

On Mon, Dec 05, 2022 at 07:12:15PM -0800, syzbot wrote:
> Hello,
>
> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> INFO: rcu detected stall in corrupted
>
> rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P4122 } 2641 jiffies s: 2877 root: 0x0/T
> rcu: blocking rcu_node structures (internal RCU debug):

I'm pretty sure this has nothing to do with the reproducer - the
console log here:

> Tested on:
>
> commit: bce93322 proc: proc_skip_spaces() shouldn't think it i..
> git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> console output: https://syzkaller.appspot.com/x/log.txt?x=1566216b880000

indicates that syzbot is screwing around with bluetooth, HCI,
netdevsim, bridging, bonding, etc.

There's no evidence that it actually ran the reproducer for the bug
reported in this thread - there's no record of a single XFS
filesystem being mounted in the log....

It looks like someone else also tried a private patch to fix this
problem (which was obviously broken) and it failed with exactly the
same RCU warnings. That was run from the same commit id as the
original reproducer, so this looks like either syzbot is broken or
there's some other completely unrelated problem that syzbot is
tripping over here.

Over to the syzbot people to debug the syzbot failure....

-Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one
2022-12-06 3:34 ` Dave Chinner
@ 2022-12-06 11:06 ` Dmitry Vyukov
2022-12-06 15:32 ` Paul E. McKenney
` (2 more replies)
0 siblings, 3 replies; 15+ messages in thread
From: Dmitry Vyukov @ 2022-12-06 11:06 UTC (permalink / raw)
To: Dave Chinner, Paul E. McKenney, frederic, quic_neeraju, Josh Triplett, RCU
Cc: syzbot, djwong, linux-kernel, linux-xfs, syzkaller-bugs, syzkaller

On Tue, 6 Dec 2022 at 04:34, Dave Chinner <david@fromorbit.com> wrote:
>
> On Mon, Dec 05, 2022 at 07:12:15PM -0800, syzbot wrote:
> > Hello,
> >
> > syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> > INFO: rcu detected stall in corrupted
> >
> > rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P4122 } 2641 jiffies s: 2877 root: 0x0/T
> > rcu: blocking rcu_node structures (internal RCU debug):
>
> I'm pretty sure this has nothing to do with the reproducer - the
> console log here:
>
> > Tested on:
> >
> > commit: bce93322 proc: proc_skip_spaces() shouldn't think it i..
> > git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1566216b880000
>
> indicates that syzbot is screwing around with bluetooth, HCI,
> netdevsim, bridging, bonding, etc.
>
> There's no evidence that it actually ran the reproducer for the bug
> reported in this thread - there's no record of a single XFS
> filesystem being mounted in the log....
>
> It looks like someone else also tried a private patch to fix this
> problem (which was obviously broken) and it failed with exactly the
> same RCU warnings. That was run from the same commit id as the
> original reproducer, so this looks like either syzbot is broken or
> there's some other completely unrelated problem that syzbot is
> tripping over here.
>
> Over to the syzbot people to debug the syzbot failure....

Hi Dave,

It's not uncommon for a single program to trigger multiple bugs.
That's what happens here. The rcu stall issue is reproducible with
this test program.
In such cases you can either submit more test requests, or test manually.

I think there is an RCU expedited stall detection.
For some reason CONFIG_RCU_EXP_CPU_STALL_TIMEOUT is limited to 21
seconds, and that's not enough for reliable flake-free stress testing.
We bump other timeouts to 100+ seconds.
+RCU maintainers, do you mind removing the overly restrictive limit on
CONFIG_RCU_EXP_CPU_STALL_TIMEOUT?
Or you think there is something to fix in the kernel to not stall? I
see the test writes to
/proc/sys/vm/drop_caches, maybe there is some issue in that code.

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one
2022-12-06 11:06 ` Dmitry Vyukov
@ 2022-12-06 15:32 ` Paul E. McKenney
2022-12-06 16:19 ` Dmitry Vyukov
2022-12-06 20:58 ` Dave Chinner
[not found] ` <20221209034605.1801-1-hdanton@sina.com>
2 siblings, 1 reply; 15+ messages in thread
From: Paul E. McKenney @ 2022-12-06 15:32 UTC (permalink / raw)
To: Dmitry Vyukov
Cc: Dave Chinner, frederic, quic_neeraju, Josh Triplett, RCU, syzbot, djwong, linux-kernel, linux-xfs, syzkaller-bugs, syzkaller

On Tue, Dec 06, 2022 at 12:06:10PM +0100, Dmitry Vyukov wrote:
> On Tue, 6 Dec 2022 at 04:34, Dave Chinner <david@fromorbit.com> wrote:
> >
> > On Mon, Dec 05, 2022 at 07:12:15PM -0800, syzbot wrote:
> > > Hello,
> > >
> > > syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> > > INFO: rcu detected stall in corrupted
> > >
> > > rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P4122 } 2641 jiffies s: 2877 root: 0x0/T
> > > rcu: blocking rcu_node structures (internal RCU debug):
> >
> > I'm pretty sure this has nothing to do with the reproducer - the
> > console log here:
> >
> > > Tested on:
> > >
> > > commit: bce93322 proc: proc_skip_spaces() shouldn't think it i..
> > > git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=1566216b880000
> >
> > indicates that syzbot is screwing around with bluetooth, HCI,
> > netdevsim, bridging, bonding, etc.
> >
> > There's no evidence that it actually ran the reproducer for the bug
> > reported in this thread - there's no record of a single XFS
> > filesystem being mounted in the log....
> >
> > It looks like someone else also tried a private patch to fix this
> > problem (which was obviously broken) and it failed with exactly the
> > same RCU warnings. That was run from the same commit id as the
> > original reproducer, so this looks like either syzbot is broken or
> > there's some other completely unrelated problem that syzbot is
> > tripping over here.
> >
> > Over to the syzbot people to debug the syzbot failure....
>
> Hi Dave,
>
> It's not uncommon for a single program to trigger multiple bugs.
> That's what happens here. The rcu stall issue is reproducible with
> this test program.
> In such cases you can either submit more test requests, or test manually.
>
> I think there is an RCU expedited stall detection.
> For some reason CONFIG_RCU_EXP_CPU_STALL_TIMEOUT is limited to 21
> seconds, and that's not enough for reliable flake-free stress testing.
> We bump other timeouts to 100+ seconds.
> +RCU maintainers, do you mind removing the overly restrictive limit on
> CONFIG_RCU_EXP_CPU_STALL_TIMEOUT?
> Or you think there is something to fix in the kernel to not stall? I
> see the test writes to
> /proc/sys/vm/drop_caches, maybe there is some issue in that code.

Like this?

If so, I don't see why not. And in that case, may I please have
your Tested-by or similar?

At the same time, I am sure that there are things in the kernel that
should be adjusted to avoid stalls, but I recognize that different
developers in different situations will have different issues that they
choose to focus on. ;-)

Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/rcu/Kconfig.debug b/kernel/rcu/Kconfig.debug
index 49da904df6aa6..2984de629f749 100644
--- a/kernel/rcu/Kconfig.debug
+++ b/kernel/rcu/Kconfig.debug
@@ -82,7 +82,7 @@ config RCU_CPU_STALL_TIMEOUT
 config RCU_EXP_CPU_STALL_TIMEOUT
 	int "Expedited RCU CPU stall timeout in milliseconds"
 	depends on RCU_STALL_COMMON
-	range 0 21000
+	range 0 300000
 	default 0
 	help
 	  If a given expedited RCU grace period extends more than the

^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one
2022-12-06 15:32 ` Paul E. McKenney
@ 2022-12-06 16:19 ` Dmitry Vyukov
2022-12-06 17:47 ` Paul E. McKenney
2022-12-06 21:03 ` Dave Chinner
0 siblings, 2 replies; 15+ messages in thread
From: Dmitry Vyukov @ 2022-12-06 16:19 UTC (permalink / raw)
To: paulmck
Cc: Dave Chinner, frederic, quic_neeraju, Josh Triplett, RCU, syzbot, djwong, linux-kernel, linux-xfs, syzkaller-bugs, syzkaller

On Tue, 6 Dec 2022 at 16:32, Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Tue, Dec 06, 2022 at 12:06:10PM +0100, Dmitry Vyukov wrote:
> > On Tue, 6 Dec 2022 at 04:34, Dave Chinner <david@fromorbit.com> wrote:
> > >
> > > On Mon, Dec 05, 2022 at 07:12:15PM -0800, syzbot wrote:
> > > > Hello,
> > > >
> > > > syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> > > > INFO: rcu detected stall in corrupted
> > > >
> > > > rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P4122 } 2641 jiffies s: 2877 root: 0x0/T
> > > > rcu: blocking rcu_node structures (internal RCU debug):
> > >
> > > I'm pretty sure this has nothing to do with the reproducer - the
> > > console log here:
> > >
> > > > Tested on:
> > > >
> > > > commit: bce93322 proc: proc_skip_spaces() shouldn't think it i..
> > > > git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=1566216b880000
> > >
> > > indicates that syzbot is screwing around with bluetooth, HCI,
> > > netdevsim, bridging, bonding, etc.
> > >
> > > There's no evidence that it actually ran the reproducer for the bug
> > > reported in this thread - there's no record of a single XFS
> > > filesystem being mounted in the log....
> > >
> > > It looks like someone else also tried a private patch to fix this
> > > problem (which was obviously broken) and it failed with exactly the
> > > same RCU warnings. That was run from the same commit id as the
> > > original reproducer, so this looks like either syzbot is broken or
> > > there's some other completely unrelated problem that syzbot is
> > > tripping over here.
> > >
> > > Over to the syzbot people to debug the syzbot failure....
> >
> > Hi Dave,
> >
> > It's not uncommon for a single program to trigger multiple bugs.
> > That's what happens here. The rcu stall issue is reproducible with
> > this test program.
> > In such cases you can either submit more test requests, or test manually.
> >
> > I think there is an RCU expedited stall detection.
> > For some reason CONFIG_RCU_EXP_CPU_STALL_TIMEOUT is limited to 21
> > seconds, and that's not enough for reliable flake-free stress testing.
> > We bump other timeouts to 100+ seconds.
> > +RCU maintainers, do you mind removing the overly restrictive limit on
> > CONFIG_RCU_EXP_CPU_STALL_TIMEOUT?
> > Or you think there is something to fix in the kernel to not stall? I
> > see the test writes to
> > /proc/sys/vm/drop_caches, maybe there is some issue in that code.
>
> Like this?
>
> If so, I don't see why not. And in that case, may I please have
> your Tested-by or similar?

I've tried with this patch and RCU_EXP_CPU_STALL_TIMEOUT=80000.
Running the test program I got some kernel BUG in XFS and no RCU
errors/warnings.

Tested-by: Dmitry Vyukov <dvyukov@google.com>

Thanks

> At the same time, I am sure that there are things in the kernel that
> should be adjusted to avoid stalls, but I recognize that different
> developers in different situations will have different issues that they
> choose to focus on. ;-)
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> diff --git a/kernel/rcu/Kconfig.debug b/kernel/rcu/Kconfig.debug
> index 49da904df6aa6..2984de629f749 100644
> --- a/kernel/rcu/Kconfig.debug
> +++ b/kernel/rcu/Kconfig.debug
> @@ -82,7 +82,7 @@ config RCU_CPU_STALL_TIMEOUT
>  config RCU_EXP_CPU_STALL_TIMEOUT
>  	int "Expedited RCU CPU stall timeout in milliseconds"
>  	depends on RCU_STALL_COMMON
> -	range 0 21000
> +	range 0 300000
>  	default 0
>  	help
>  	  If a given expedited RCU grace period extends more than the

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one
2022-12-06 16:19 ` Dmitry Vyukov
@ 2022-12-06 17:47 ` Paul E. McKenney
2022-12-06 21:03 ` Dave Chinner
1 sibling, 0 replies; 15+ messages in thread
From: Paul E. McKenney @ 2022-12-06 17:47 UTC (permalink / raw)
To: Dmitry Vyukov
Cc: Dave Chinner, frederic, quic_neeraju, Josh Triplett, RCU, syzbot, djwong, linux-kernel, linux-xfs, syzkaller-bugs, syzkaller

On Tue, Dec 06, 2022 at 05:19:10PM +0100, Dmitry Vyukov wrote:
> On Tue, 6 Dec 2022 at 16:32, Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Tue, Dec 06, 2022 at 12:06:10PM +0100, Dmitry Vyukov wrote:
> > > On Tue, 6 Dec 2022 at 04:34, Dave Chinner <david@fromorbit.com> wrote:
> > > >
> > > > On Mon, Dec 05, 2022 at 07:12:15PM -0800, syzbot wrote:
> > > > > Hello,
> > > > >
> > > > > syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> > > > > INFO: rcu detected stall in corrupted
> > > > >
> > > > > rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P4122 } 2641 jiffies s: 2877 root: 0x0/T
> > > > > rcu: blocking rcu_node structures (internal RCU debug):
> > > >
> > > > I'm pretty sure this has nothing to do with the reproducer - the
> > > > console log here:
> > > >
> > > > > Tested on:
> > > > >
> > > > > commit: bce93322 proc: proc_skip_spaces() shouldn't think it i..
> > > > > git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=1566216b880000
> > > >
> > > > indicates that syzbot is screwing around with bluetooth, HCI,
> > > > netdevsim, bridging, bonding, etc.
> > > >
> > > > There's no evidence that it actually ran the reproducer for the bug
> > > > reported in this thread - there's no record of a single XFS
> > > > filesystem being mounted in the log....
> > > >
> > > > It looks like someone else also tried a private patch to fix this
> > > > problem (which was obviously broken) and it failed with exactly the
> > > > same RCU warnings. That was run from the same commit id as the
> > > > original reproducer, so this looks like either syzbot is broken or
> > > > there's some other completely unrelated problem that syzbot is
> > > > tripping over here.
> > > >
> > > > Over to the syzbot people to debug the syzbot failure....
> > >
> > > Hi Dave,
> > >
> > > It's not uncommon for a single program to trigger multiple bugs.
> > > That's what happens here. The rcu stall issue is reproducible with
> > > this test program.
> > > In such cases you can either submit more test requests, or test manually.
> > >
> > > I think there is an RCU expedited stall detection.
> > > For some reason CONFIG_RCU_EXP_CPU_STALL_TIMEOUT is limited to 21
> > > seconds, and that's not enough for reliable flake-free stress testing.
> > > We bump other timeouts to 100+ seconds.
> > > +RCU maintainers, do you mind removing the overly restrictive limit on
> > > CONFIG_RCU_EXP_CPU_STALL_TIMEOUT?
> > > Or you think there is something to fix in the kernel to not stall? I
> > > see the test writes to
> > > /proc/sys/vm/drop_caches, maybe there is some issue in that code.
> >
> > Like this?
> >
> > If so, I don't see why not. And in that case, may I please have
> > your Tested-by or similar?
>
> I've tried with this patch and RCU_EXP_CPU_STALL_TIMEOUT=80000.
> Running the test program I got some kernel BUG in XFS and no RCU
> errors/warnings.
>
> Tested-by: Dmitry Vyukov <dvyukov@google.com>

Applied, thank you both!

I expect to push this into the v6.3 merge window, that is, not the
one coming up real soon now, but the one after that.

Thanx, Paul

> Thanks
>
> > At the same time, I am sure that there are things in the kernel that
> > should be adjusted to avoid stalls, but I recognize that different
> > developers in different situations will have different issues that they
> > choose to focus on. ;-)
> >
> > Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > diff --git a/kernel/rcu/Kconfig.debug b/kernel/rcu/Kconfig.debug
> > index 49da904df6aa6..2984de629f749 100644
> > --- a/kernel/rcu/Kconfig.debug
> > +++ b/kernel/rcu/Kconfig.debug
> > @@ -82,7 +82,7 @@ config RCU_CPU_STALL_TIMEOUT
> >  config RCU_EXP_CPU_STALL_TIMEOUT
> >  	int "Expedited RCU CPU stall timeout in milliseconds"
> >  	depends on RCU_STALL_COMMON
> > -	range 0 21000
> > +	range 0 300000
> >  	default 0
> >  	help
> >  	  If a given expedited RCU grace period extends more than the

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one
2022-12-06 16:19 ` Dmitry Vyukov
2022-12-06 17:47 ` Paul E. McKenney
@ 2022-12-06 21:03 ` Dave Chinner
1 sibling, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2022-12-06 21:03 UTC (permalink / raw)
To: Dmitry Vyukov
Cc: paulmck, frederic, quic_neeraju, Josh Triplett, RCU, syzbot, djwong, linux-kernel, linux-xfs, syzkaller-bugs, syzkaller

On Tue, Dec 06, 2022 at 05:19:10PM +0100, Dmitry Vyukov wrote:
> On Tue, 6 Dec 2022 at 16:32, Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Tue, Dec 06, 2022 at 12:06:10PM +0100, Dmitry Vyukov wrote:
> > > On Tue, 6 Dec 2022 at 04:34, Dave Chinner <david@fromorbit.com> wrote:
> > > >
> > > > On Mon, Dec 05, 2022 at 07:12:15PM -0800, syzbot wrote:
> > > > > Hello,
> > > > >
> > > > > syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> > > > > INFO: rcu detected stall in corrupted
> > > > >
> > > > > rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P4122 } 2641 jiffies s: 2877 root: 0x0/T
> > > > > rcu: blocking rcu_node structures (internal RCU debug):
> > > >
> > > > I'm pretty sure this has nothing to do with the reproducer - the
> > > > console log here:
> > > >
> > > > > Tested on:
> > > > >
> > > > > commit: bce93322 proc: proc_skip_spaces() shouldn't think it i..
> > > > > git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=1566216b880000
> > > >
> > > > indicates that syzbot is screwing around with bluetooth, HCI,
> > > > netdevsim, bridging, bonding, etc.
> > > >
> > > > There's no evidence that it actually ran the reproducer for the bug
> > > > reported in this thread - there's no record of a single XFS
> > > > filesystem being mounted in the log....
> > > >
> > > > It looks like someone else also tried a private patch to fix this
> > > > problem (which was obviously broken) and it failed with exactly the
> > > > same RCU warnings. That was run from the same commit id as the
> > > > original reproducer, so this looks like either syzbot is broken or
> > > > there's some other completely unrelated problem that syzbot is
> > > > tripping over here.
> > > >
> > > > Over to the syzbot people to debug the syzbot failure....
> > >
> > > Hi Dave,
> > >
> > > It's not uncommon for a single program to trigger multiple bugs.
> > > That's what happens here. The rcu stall issue is reproducible with
> > > this test program.
> > > In such cases you can either submit more test requests, or test manually.
> > >
> > > I think there is an RCU expedited stall detection.
> > > For some reason CONFIG_RCU_EXP_CPU_STALL_TIMEOUT is limited to 21
> > > seconds, and that's not enough for reliable flake-free stress testing.
> > > We bump other timeouts to 100+ seconds.
> > > +RCU maintainers, do you mind removing the overly restrictive limit on
> > > CONFIG_RCU_EXP_CPU_STALL_TIMEOUT?
> > > Or you think there is something to fix in the kernel to not stall? I
> > > see the test writes to
> > > /proc/sys/vm/drop_caches, maybe there is some issue in that code.
> >
> > Like this?
> >
> > If so, I don't see why not. And in that case, may I please have
> > your Tested-by or similar?
>
> I've tried with this patch and RCU_EXP_CPU_STALL_TIMEOUT=80000.
> Running the test program I got some kernel BUG in XFS and no RCU
> errors/warnings.

What BUG did it trigger? Where's the log?

-Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one
2022-12-06 11:06 ` Dmitry Vyukov
2022-12-06 15:32 ` Paul E. McKenney
@ 2022-12-06 20:58 ` Dave Chinner
[not found] ` <20221209034605.1801-1-hdanton@sina.com>
2 siblings, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2022-12-06 20:58 UTC (permalink / raw)
To: Dmitry Vyukov
Cc: Paul E. McKenney, frederic, quic_neeraju, Josh Triplett, RCU, syzbot, djwong, linux-kernel, linux-xfs, syzkaller-bugs, syzkaller

On Tue, Dec 06, 2022 at 12:06:10PM +0100, Dmitry Vyukov wrote:
> On Tue, 6 Dec 2022 at 04:34, Dave Chinner <david@fromorbit.com> wrote:
> >
> > On Mon, Dec 05, 2022 at 07:12:15PM -0800, syzbot wrote:
> > > Hello,
> > >
> > > syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> > > INFO: rcu detected stall in corrupted
> > >
> > > rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P4122 } 2641 jiffies s: 2877 root: 0x0/T
> > > rcu: blocking rcu_node structures (internal RCU debug):
> >
> > I'm pretty sure this has nothing to do with the reproducer - the
> > console log here:
> >
> > > Tested on:
> > >
> > > commit: bce93322 proc: proc_skip_spaces() shouldn't think it i..
> > > git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=1566216b880000
> >
> > indicates that syzbot is screwing around with bluetooth, HCI,
> > netdevsim, bridging, bonding, etc.
> >
> > There's no evidence that it actually ran the reproducer for the bug
> > reported in this thread - there's no record of a single XFS
> > filesystem being mounted in the log....
> >
> > It looks like someone else also tried a private patch to fix this
> > problem (which was obviously broken) and it failed with exactly the
> > same RCU warnings. That was run from the same commit id as the
> > original reproducer, so this looks like either syzbot is broken or
> > there's some other completely unrelated problem that syzbot is
> > tripping over here.
> >
> > Over to the syzbot people to debug the syzbot failure....
>
> Hi Dave,
>
> It's not uncommon for a single program to trigger multiple bugs.
> That's what happens here. The rcu stall issue is reproducible with
> this test program.
> In such cases you can either submit more test requests, or test manually.

So you're telling us syzbot reproducers are unreliable and we are
expected to play whack-a-mole with test resubmission until we get
the result we want?

How do I tell syzbot to resubmit the same patch for testing without
having to send the same patch to syzbot via email again? Can I
retrigger a new test run through the web interface?

-Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply [flat|nested] 15+ messages in thread
[parent not found: <20221209034605.1801-1-hdanton@sina.com>]
* Re: [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one
[not found] ` <20221209034605.1801-1-hdanton@sina.com>
@ 2022-12-09 4:14 ` Paul E. McKenney
0 siblings, 0 replies; 15+ messages in thread
From: Paul E. McKenney @ 2022-12-09 4:14 UTC (permalink / raw)
To: Hillf Danton
Cc: Dmitry Vyukov, Dave Chinner, linux-kernel, linux-xfs, syzkaller-bugs

On Fri, Dec 09, 2022 at 11:46:05AM +0800, Hillf Danton wrote:
> On 6 Dec 2022 07:32:11 -0800 "Paul E. McKenney" <paulmck@kernel.org>
> > On Tue, Dec 06, 2022 at 12:06:10PM +0100, Dmitry Vyukov wrote:
> > > On Tue, 6 Dec 2022 at 04:34, Dave Chinner <david@fromorbit.com> wrote:
> > > >
> > > > On Mon, Dec 05, 2022 at 07:12:15PM -0800, syzbot wrote:
> > > > > Hello,
> > > > >
> > > > > syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> > > > > INFO: rcu detected stall in corrupted
> > > > >
> > > > > rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P4122 } 2641 jiffies s: 2877 root: 0x0/T
> > > > > rcu: blocking rcu_node structures (internal RCU debug):
> > > >
> > > > I'm pretty sure this has nothing to do with the reproducer - the
> > > > console log here:
> > > >
> > > > > Tested on:
> > > > >
> > > > > commit: bce93322 proc: proc_skip_spaces() shouldn't think it i..
> > > > > git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=1566216b880000
> > > >
> > > > indicates that syzbot is screwing around with bluetooth, HCI,
> > > > netdevsim, bridging, bonding, etc.
> > > >
> > > > There's no evidence that it actually ran the reproducer for the bug
> > > > reported in this thread - there's no record of a single XFS
> > > > filesystem being mounted in the log....
> > > >
> > > > It looks like someone else also tried a private patch to fix this
> > > > problem (which was obviously broken) and it failed with exactly the
> > > > same RCU warnings. That was run from the same commit id as the
> > > > original reproducer, so this looks like either syzbot is broken or
> > > > there's some other completely unrelated problem that syzbot is
> > > > tripping over here.
> > > >
> > > > Over to the syzbot people to debug the syzbot failure....
> > >
> > > Hi Dave,
> > >
> > > It's not uncommon for a single program to trigger multiple bugs.
> > > That's what happens here. The rcu stall issue is reproducible with
> > > this test program.
> > > In such cases you can either submit more test requests, or test manually.
> > >
> > > I think there is an RCU expedited stall detection.
> > > For some reason CONFIG_RCU_EXP_CPU_STALL_TIMEOUT is limited to 21
> > > seconds, and that's not enough for reliable flake-free stress testing.
> > > We bump other timeouts to 100+ seconds.
> > > +RCU maintainers, do you mind removing the overly restrictive limit on
> > > CONFIG_RCU_EXP_CPU_STALL_TIMEOUT?
> > > Or you think there is something to fix in the kernel to not stall? I
> > > see the test writes to
> > > /proc/sys/vm/drop_caches, maybe there is some issue in that code.
> >
> > Like this?
> >
> > If so, I don't see why not. And in that case, may I please have
> > your Tested-by or similar?
> >
> > At the same time, I am sure that there are things in the kernel that
> > should be adjusted to avoid stalls, but I recognize that different
> > developers in different situations will have different issues that they
> > choose to focus on. ;-)
> >
> > Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > diff --git a/kernel/rcu/Kconfig.debug b/kernel/rcu/Kconfig.debug
> > index 49da904df6aa6..2984de629f749 100644
> > --- a/kernel/rcu/Kconfig.debug
> > +++ b/kernel/rcu/Kconfig.debug
> > @@ -82,7 +82,7 @@ config RCU_CPU_STALL_TIMEOUT
> >  config RCU_EXP_CPU_STALL_TIMEOUT
> >  	int "Expedited RCU CPU stall timeout in milliseconds"
> >  	depends on RCU_STALL_COMMON
> > -	range 0 21000
> > +	range 0 300000
> >  	default 0
> >  	help
> >  	  If a given expedited RCU grace period extends more than the
> >
> // Limit check must be consistent with the Kconfig limits for
> // CONFIG_RCU_EXP_CPU_STALL_TIMEOUT, so check the allowed range.
> // The minimum clamped value is "2UL", because at least one full
> // tick has to be guaranteed.
> till_stall_check = clamp(msecs_to_jiffies(cpu_stall_timeout), 2UL, 21UL * HZ);
>
> But with 21UL left behind intact?

Good catch, will fix, thank you!

Thanx, Paul

^ permalink raw reply [flat|nested] 15+ messages in thread
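The mismatch raised in the message above can be checked with plain arithmetic. The following is a minimal userspace sketch, not kernel code: `MODEL_HZ` is an assumed tick rate of 100, and `model_msecs_to_jiffies()`, `model_clamp()` and `till_stall_check()` are hypothetical helpers that only imitate the quoted `clamp(msecs_to_jiffies(...), 2UL, 21UL * HZ)` expression. Using 300 seconds as the alternative upper bound is likewise an assumption, taken from the new Kconfig range of 300000 ms.

```c
#include <assert.h>

/* Assumed tick rate for illustration; a real kernel uses CONFIG_HZ. */
#define MODEL_HZ 100UL

/* Simplified stand-in for msecs_to_jiffies(): converts and rounds up. */
static unsigned long model_msecs_to_jiffies(unsigned long ms)
{
	return (ms * MODEL_HZ + 999UL) / 1000UL;
}

/* Plain-function version of a clamp to the closed range [lo, hi]. */
static unsigned long model_clamp(unsigned long v, unsigned long lo,
				 unsigned long hi)
{
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}

/*
 * Effective expedited stall timeout in jiffies when the upper clamp
 * bound is max_secs seconds.
 */
static unsigned long till_stall_check(unsigned long timeout_ms,
				      unsigned long max_secs)
{
	assert(max_secs > 0);
	return model_clamp(model_msecs_to_jiffies(timeout_ms), 2UL,
			   max_secs * MODEL_HZ);
}
```

Under these assumptions, `till_stall_check(80000, 21)` yields 2100 jiffies, that is 21 seconds, so a configured 80000 ms would be silently cut back to the old limit. `till_stall_check(80000, 300)` yields 8000 jiffies, the full 80 seconds.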
[parent not found: <20221205140422.7412-1-hdanton@sina.com>]
* Re: [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one
[not found] <20221205140422.7412-1-hdanton@sina.com>
@ 2022-12-05 17:06 ` syzbot
0 siblings, 0 replies; 15+ messages in thread
From: syzbot @ 2022-12-05 17:06 UTC (permalink / raw)
To: hdanton, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
INFO: rcu detected stall in corrupted

rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P4111 } 2640 jiffies s: 2849 root: 0x0/T
rcu: blocking rcu_node structures (internal RCU debug):

Tested on:

commit: 0ba09b17 Revert "mm: align larger anonymous mappings o..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=135dc11d880000
kernel config: https://syzkaller.appspot.com/x/.config?x=2325e409a9a893e1
dashboard link: https://syzkaller.appspot.com/bug?extid=912776840162c13db1a3
compiler: Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=1551f50f880000

^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2022-12-09 4:14 UTC | newest]

Thread overview: 15+ messages
2022-12-05 9:21 [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one syzbot
2022-12-05 10:35 ` syzbot
2022-12-05 22:52 ` [PATCH] xfs: dquot shrinker doesn't check for XFS_DQFLAG_FREEING Dave Chinner
2022-12-07 16:17 ` Darrick J. Wong
2022-12-05 23:58 ` [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one Dave Chinner
2022-12-06 3:12 ` syzbot
2022-12-06 3:34 ` Dave Chinner
2022-12-06 11:06 ` Dmitry Vyukov
2022-12-06 15:32 ` Paul E. McKenney
2022-12-06 16:19 ` Dmitry Vyukov
2022-12-06 17:47 ` Paul E. McKenney
2022-12-06 21:03 ` Dave Chinner
2022-12-06 20:58 ` Dave Chinner
[not found] ` <20221209034605.1801-1-hdanton@sina.com>
2022-12-09 4:14 ` Paul E. McKenney
[not found] <20221205140422.7412-1-hdanton@sina.com>
2022-12-05 17:06 ` syzbot