* [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one
@ 2022-12-05  9:21 syzbot
  2022-12-05 10:35 ` syzbot
  0 siblings, 1 reply; 15+ messages in thread
From: syzbot @ 2022-12-05  9:21 UTC (permalink / raw)
  To: djwong, linux-kernel, linux-xfs, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    0ba09b173387 Revert "mm: align larger anonymous mappings o..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1736cf4b880000
kernel config:  https://syzkaller.appspot.com/x/.config?x=2325e409a9a893e1
dashboard link: https://syzkaller.appspot.com/bug?extid=912776840162c13db1a3
compiler:       Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/9758ec2c06f4/disk-0ba09b17.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/06781dbfd581/vmlinux-0ba09b17.xz
kernel image: https://storage.googleapis.com/syzbot-assets/3d44a22d15fa/bzImage-0ba09b17.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+912776840162c13db1a3@syzkaller.appspotmail.com

==================================================================
BUG: KASAN: use-after-free in xfs_dquot_type fs/xfs/xfs_dquot.h:136 [inline]
BUG: KASAN: use-after-free in xfs_qm_dqfree_one+0x12f/0x170 fs/xfs/xfs_qm.c:1604
Read of size 1 at addr ffff88807ed63a98 by task syz-executor.2/22148

CPU: 1 PID: 22148 Comm: syz-executor.2 Not tainted 6.1.0-rc7-syzkaller-00211-g0ba09b173387 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x1b1/0x28e lib/dump_stack.c:106
 print_address_description+0x74/0x340 mm/kasan/report.c:284
 print_report+0x107/0x1f0 mm/kasan/report.c:395
 kasan_report+0xcd/0x100 mm/kasan/report.c:495
 xfs_dquot_type fs/xfs/xfs_dquot.h:136 [inline]
 xfs_qm_dqfree_one+0x12f/0x170 fs/xfs/xfs_qm.c:1604
 xfs_qm_shrink_scan+0x351/0x410 fs/xfs/xfs_qm.c:523
 do_shrink_slab+0x4e1/0xa00 mm/vmscan.c:842
 shrink_slab+0x1e6/0x340 mm/vmscan.c:1002
 drop_slab_node mm/vmscan.c:1037 [inline]
 drop_slab+0x185/0x2c0 mm/vmscan.c:1047
 drop_caches_sysctl_handler+0xb1/0x160 fs/drop_caches.c:66
 proc_sys_call_handler+0x576/0x890 fs/proc/proc_sysctl.c:604
 do_iter_write+0x6c2/0xc20 fs/read_write.c:861
 iter_file_splice_write+0x7fc/0xfc0 fs/splice.c:686
 do_splice_from fs/splice.c:764 [inline]
 direct_splice_actor+0xe6/0x1c0 fs/splice.c:931
 splice_direct_to_actor+0x4e4/0xc00 fs/splice.c:886
 do_splice_direct+0x279/0x3d0 fs/splice.c:974
 do_sendfile+0x5fb/0xf80 fs/read_write.c:1255
 __do_sys_sendfile64 fs/read_write.c:1317 [inline]
 __se_sys_sendfile64+0xd0/0x1b0 fs/read_write.c:1309
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7eff8be8c0d9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007eff8cbb7168 EFLAGS: 00000246 ORIG_RAX: 0000000000000028
RAX: ffffffffffffffda RBX: 00007eff8bfabf80 RCX: 00007eff8be8c0d9
RDX: 0000000020002080 RSI: 0000000000000004 RDI: 0000000000000006
RBP: 00007eff8bee7ae9 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000870 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffd5ac99a0f R14: 00007eff8cbb7300 R15: 0000000000022000
 </TASK>

Allocated by task 22095:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x3d/0x60 mm/kasan/common.c:52
 __kasan_slab_alloc+0x65/0x70 mm/kasan/common.c:325
 kasan_slab_alloc include/linux/kasan.h:201 [inline]
 slab_post_alloc_hook mm/slab.h:737 [inline]
 slab_alloc_node mm/slub.c:3398 [inline]
 slab_alloc mm/slub.c:3406 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3413 [inline]
 kmem_cache_alloc+0x1cc/0x300 mm/slub.c:3422
 kmem_cache_zalloc include/linux/slab.h:679 [inline]
 xfs_dquot_alloc+0x36/0x600 fs/xfs/xfs_dquot.c:475
 xfs_qm_dqread+0x8a/0x1d0 fs/xfs/xfs_dquot.c:659
 xfs_qm_dqget+0x27d/0x4f0 fs/xfs/xfs_dquot.c:870
 xfs_qm_vop_dqalloc+0x9bf/0xca0 fs/xfs/xfs_qm.c:1704
 xfs_setattr_nonsize+0x3c2/0xfd0 fs/xfs/xfs_iops.c:702
 xfs_vn_setattr+0x2f5/0x340 fs/xfs/xfs_iops.c:1022
 notify_change+0xe38/0x10f0 fs/attr.c:420
 chown_common+0x586/0x8f0 fs/open.c:736
 do_fchownat+0x165/0x240 fs/open.c:767
 __do_sys_lchown fs/open.c:792 [inline]
 __se_sys_lchown fs/open.c:790 [inline]
 __x64_sys_lchown+0x81/0x90 fs/open.c:790
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Freed by task 3661:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x3d/0x60 mm/kasan/common.c:52
 kasan_save_free_info+0x27/0x40 mm/kasan/generic.c:511
 ____kasan_slab_free+0xd6/0x120 mm/kasan/common.c:236
 kasan_slab_free include/linux/kasan.h:177 [inline]
 slab_free_hook mm/slub.c:1724 [inline]
 slab_free_freelist_hook+0x12e/0x1a0 mm/slub.c:1750
 slab_free mm/slub.c:3661 [inline]
 kmem_cache_free+0x94/0x1d0 mm/slub.c:3683
 xfs_qm_dqpurge+0x4f7/0x660 fs/xfs/xfs_qm.c:177
 xfs_qm_dquot_walk+0x249/0x490 fs/xfs/xfs_qm.c:87
 xfs_qm_dqpurge_all fs/xfs/xfs_qm.c:193 [inline]
 xfs_qm_unmount+0x71/0x100 fs/xfs/xfs_qm.c:205
 xfs_unmountfs+0xc5/0x1e0 fs/xfs/xfs_mount.c:1059
 xfs_fs_put_super+0x6e/0x2d0 fs/xfs/xfs_super.c:1115
 generic_shutdown_super+0x130/0x310 fs/super.c:492
 kill_block_super+0x79/0xd0 fs/super.c:1428
 deactivate_locked_super+0xa7/0xf0 fs/super.c:332
 cleanup_mnt+0x494/0x520 fs/namespace.c:1186
 task_work_run+0x243/0x300 kernel/task_work.c:179
 resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
 exit_to_user_mode_loop+0x124/0x150 kernel/entry/common.c:171
 exit_to_user_mode_prepare+0xb2/0x140 kernel/entry/common.c:203
 __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
 syscall_exit_to_user_mode+0x26/0x60 kernel/entry/common.c:296
 do_syscall_64+0x49/0xb0 arch/x86/entry/common.c:86
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

The buggy address belongs to the object at ffff88807ed63a80
 which belongs to the cache xfs_dquot of size 704
The buggy address is located 24 bytes inside of
 704-byte region [ffff88807ed63a80, ffff88807ed63d40)

The buggy address belongs to the physical page:
page:ffffea0001fb5800 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7ed60
head:ffffea0001fb5800 order:2 compound_mapcount:0 compound_pincount:0
flags: 0xfff00000010200(slab|head|node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff00000010200 ffffea0001a74500 dead000000000003 ffff88801c6f3a00
raw: 0000000000000000 0000000080130013 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 2, migratetype Unmovable, gfp_mask 0x1d20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL), pid 18894, tgid 18893 (syz-executor.0), ts 751185457243, free_ts 748684720140
 prep_new_page mm/page_alloc.c:2539 [inline]
 get_page_from_freelist+0x742/0x7c0 mm/page_alloc.c:4291
 __alloc_pages+0x259/0x560 mm/page_alloc.c:5558
 alloc_slab_page+0xbd/0x190 mm/slub.c:1794
 allocate_slab+0x5e/0x4b0 mm/slub.c:1939
 new_slab mm/slub.c:1992 [inline]
 ___slab_alloc+0x782/0xe20 mm/slub.c:3180
 __slab_alloc mm/slub.c:3279 [inline]
 slab_alloc_node mm/slub.c:3364 [inline]
 slab_alloc mm/slub.c:3406 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3413 [inline]
 kmem_cache_alloc+0x24c/0x300 mm/slub.c:3422
 kmem_cache_zalloc include/linux/slab.h:679 [inline]
 xfs_dquot_alloc+0x36/0x600 fs/xfs/xfs_dquot.c:475
 xfs_qm_dqread+0x8a/0x1d0 fs/xfs/xfs_dquot.c:659
 xfs_qm_dqget_inode+0x430/0x960 fs/xfs/xfs_dquot.c:973
 xfs_qm_dqattach_one+0xe8/0x1c0 fs/xfs/xfs_qm.c:277
 xfs_qm_dqattach_locked+0x3ed/0x4a0 fs/xfs/xfs_qm.c:336
 xfs_qm_vop_dqalloc+0x3f2/0xca0 fs/xfs/xfs_qm.c:1659
 xfs_setattr_nonsize+0x3c2/0xfd0 fs/xfs/xfs_iops.c:702
 xfs_vn_setattr+0x2f5/0x340 fs/xfs/xfs_iops.c:1022
 notify_change+0xe38/0x10f0 fs/attr.c:420
 chown_common+0x586/0x8f0 fs/open.c:736
page last free stack trace:
 reset_page_owner include/linux/page_owner.h:24 [inline]
 free_pages_prepare mm/page_alloc.c:1459 [inline]
 free_pcp_prepare+0x80c/0x8f0 mm/page_alloc.c:1509
 free_unref_page_prepare mm/page_alloc.c:3387 [inline]
 free_unref_page_list+0xb4/0x7b0 mm/page_alloc.c:3529
 release_pages+0x232a/0x25c0 mm/swap.c:1055
 __pagevec_release+0x7d/0xf0 mm/swap.c:1075
 pagevec_release include/linux/pagevec.h:71 [inline]
 folio_batch_release include/linux/pagevec.h:135 [inline]
 truncate_inode_pages_range+0x472/0x17f0 mm/truncate.c:373
 kill_bdev block/bdev.c:76 [inline]
 blkdev_flush_mapping+0x153/0x2c0 block/bdev.c:662
 blkdev_put_whole block/bdev.c:693 [inline]
 blkdev_put+0x4a5/0x730 block/bdev.c:953
 deactivate_locked_super+0xa7/0xf0 fs/super.c:332
 cleanup_mnt+0x494/0x520 fs/namespace.c:1186
 task_work_run+0x243/0x300 kernel/task_work.c:179
 resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
 exit_to_user_mode_loop+0x124/0x150 kernel/entry/common.c:171
 exit_to_user_mode_prepare+0xb2/0x140 kernel/entry/common.c:203
 __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
 syscall_exit_to_user_mode+0x26/0x60 kernel/entry/common.c:296
 do_syscall_64+0x49/0xb0 arch/x86/entry/common.c:86
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Memory state around the buggy address:
 ffff88807ed63980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88807ed63a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88807ed63a80: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                            ^
 ffff88807ed63b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88807ed63b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
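
The object and offset figures in the report above can be re-derived from the printed addresses. The following is a minimal sketch for checking that arithmetic; the helper names are hypothetical and are not part of KASAN or XFS.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset of an access inside the object occupying [start, end). */
static uint64_t offset_in_object(uint64_t addr, uint64_t start, uint64_t end)
{
	assert(start <= addr && addr < end);
	return addr - start;
}

/* Re-derive the numbers printed in the report (addresses copied from it). */
static void check_report_numbers(void)
{
	const uint64_t start = 0xffff88807ed63a80ULL; /* object start */
	const uint64_t end   = 0xffff88807ed63d40ULL; /* end of the region */
	const uint64_t addr  = 0xffff88807ed63a98ULL; /* faulting read */

	assert(end - start == 704);                       /* "cache xfs_dquot of size 704" */
	assert(offset_in_object(addr, start, end) == 24); /* "24 bytes inside" */
}
```

Calling check_report_numbers() from a test program confirms that the one-byte read landed 24 bytes into a freed 704-byte xfs_dquot object.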


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.


* Re: [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one
  2022-12-05  9:21 [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one syzbot
@ 2022-12-05 10:35 ` syzbot
  2022-12-05 22:52   ` [PATCH] xfs: dquot shrinker doesn't check for XFS_DQFLAG_FREEING Dave Chinner
  2022-12-05 23:58   ` [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one Dave Chinner
  0 siblings, 2 replies; 15+ messages in thread
From: syzbot @ 2022-12-05 10:35 UTC (permalink / raw)
  To: djwong, linux-kernel, linux-xfs, syzkaller-bugs

syzbot has found a reproducer for the following issue on:

HEAD commit:    0ba09b173387 Revert "mm: align larger anonymous mappings o..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15550c47880000
kernel config:  https://syzkaller.appspot.com/x/.config?x=2325e409a9a893e1
dashboard link: https://syzkaller.appspot.com/bug?extid=912776840162c13db1a3
compiler:       Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=128c9e23880000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/9758ec2c06f4/disk-0ba09b17.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/06781dbfd581/vmlinux-0ba09b17.xz
kernel image: https://storage.googleapis.com/syzbot-assets/3d44a22d15fa/bzImage-0ba09b17.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/335889b2d730/mount_0.gz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+912776840162c13db1a3@syzkaller.appspotmail.com

XFS (loop1): Quotacheck: Done.
syz-executor.1 (4657): drop_caches: 2
==================================================================
BUG: KASAN: use-after-free in xfs_dquot_type fs/xfs/xfs_dquot.h:136 [inline]
BUG: KASAN: use-after-free in xfs_qm_dqfree_one+0x12f/0x170 fs/xfs/xfs_qm.c:1604
Read of size 1 at addr ffff888079a6aa58 by task syz-executor.1/4657

CPU: 1 PID: 4657 Comm: syz-executor.1 Not tainted 6.1.0-rc7-syzkaller-00211-g0ba09b173387 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x1b1/0x28e lib/dump_stack.c:106
 print_address_description+0x74/0x340 mm/kasan/report.c:284
 print_report+0x107/0x1f0 mm/kasan/report.c:395
 kasan_report+0xcd/0x100 mm/kasan/report.c:495
 xfs_dquot_type fs/xfs/xfs_dquot.h:136 [inline]
 xfs_qm_dqfree_one+0x12f/0x170 fs/xfs/xfs_qm.c:1604
 xfs_qm_shrink_scan+0x351/0x410 fs/xfs/xfs_qm.c:523
 do_shrink_slab+0x4e1/0xa00 mm/vmscan.c:842
 shrink_slab+0x1e6/0x340 mm/vmscan.c:1002
 drop_slab_node mm/vmscan.c:1037 [inline]
 drop_slab+0x185/0x2c0 mm/vmscan.c:1047
 drop_caches_sysctl_handler+0xb1/0x160 fs/drop_caches.c:66
 proc_sys_call_handler+0x576/0x890 fs/proc/proc_sysctl.c:604
 do_iter_write+0x6c2/0xc20 fs/read_write.c:861
 iter_file_splice_write+0x7fc/0xfc0 fs/splice.c:686
 do_splice_from fs/splice.c:764 [inline]
 direct_splice_actor+0xe6/0x1c0 fs/splice.c:931
 splice_direct_to_actor+0x4e4/0xc00 fs/splice.c:886
 do_splice_direct+0x279/0x3d0 fs/splice.c:974
 do_sendfile+0x5fb/0xf80 fs/read_write.c:1255
 __do_sys_sendfile64 fs/read_write.c:1317 [inline]
 __se_sys_sendfile64+0xd0/0x1b0 fs/read_write.c:1309
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fc3e3c8c0d9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fc3e4a9a168 EFLAGS: 00000246 ORIG_RAX: 0000000000000028
RAX: ffffffffffffffda RBX: 00007fc3e3dabf80 RCX: 00007fc3e3c8c0d9
RDX: 0000000020002080 RSI: 0000000000000004 RDI: 0000000000000006
RBP: 00007fc3e3ce7ae9 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000870 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fffcb98dc7f R14: 00007fc3e4a9a300 R15: 0000000000022000
 </TASK>

Allocated by task 4642:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x3d/0x60 mm/kasan/common.c:52
 __kasan_slab_alloc+0x65/0x70 mm/kasan/common.c:325
 kasan_slab_alloc include/linux/kasan.h:201 [inline]
 slab_post_alloc_hook mm/slab.h:737 [inline]
 slab_alloc_node mm/slub.c:3398 [inline]
 slab_alloc mm/slub.c:3406 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3413 [inline]
 kmem_cache_alloc+0x1cc/0x300 mm/slub.c:3422
 kmem_cache_zalloc include/linux/slab.h:679 [inline]
 xfs_dquot_alloc+0x36/0x600 fs/xfs/xfs_dquot.c:475
 xfs_qm_dqread+0x8a/0x1d0 fs/xfs/xfs_dquot.c:659
 xfs_qm_dqget+0x27d/0x4f0 fs/xfs/xfs_dquot.c:870
 xfs_qm_vop_dqalloc+0x9bf/0xca0 fs/xfs/xfs_qm.c:1704
 xfs_setattr_nonsize+0x3c2/0xfd0 fs/xfs/xfs_iops.c:702
 xfs_vn_setattr+0x2f5/0x340 fs/xfs/xfs_iops.c:1022
 notify_change+0xe38/0x10f0 fs/attr.c:420
 chown_common+0x586/0x8f0 fs/open.c:736
 do_fchownat+0x165/0x240 fs/open.c:767
 __do_sys_chown fs/open.c:787 [inline]
 __se_sys_chown fs/open.c:785 [inline]
 __x64_sys_chown+0x7e/0x90 fs/open.c:785
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Freed by task 3677:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x3d/0x60 mm/kasan/common.c:52
 kasan_save_free_info+0x27/0x40 mm/kasan/generic.c:511
 ____kasan_slab_free+0xd6/0x120 mm/kasan/common.c:236
 kasan_slab_free include/linux/kasan.h:177 [inline]
 slab_free_hook mm/slub.c:1724 [inline]
 slab_free_freelist_hook+0x12e/0x1a0 mm/slub.c:1750
 slab_free mm/slub.c:3661 [inline]
 kmem_cache_free+0x94/0x1d0 mm/slub.c:3683
 xfs_qm_dqpurge+0x4f7/0x660 fs/xfs/xfs_qm.c:177
 xfs_qm_dquot_walk+0x249/0x490 fs/xfs/xfs_qm.c:87
 xfs_qm_dqpurge_all fs/xfs/xfs_qm.c:193 [inline]
 xfs_qm_unmount+0x71/0x100 fs/xfs/xfs_qm.c:205
 xfs_unmountfs+0xc5/0x1e0 fs/xfs/xfs_mount.c:1059
 xfs_fs_put_super+0x6e/0x2d0 fs/xfs/xfs_super.c:1115
 generic_shutdown_super+0x130/0x310 fs/super.c:492
 kill_block_super+0x79/0xd0 fs/super.c:1428
 deactivate_locked_super+0xa7/0xf0 fs/super.c:332
 cleanup_mnt+0x494/0x520 fs/namespace.c:1186
 task_work_run+0x243/0x300 kernel/task_work.c:179
 resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
 exit_to_user_mode_loop+0x124/0x150 kernel/entry/common.c:171
 exit_to_user_mode_prepare+0xb2/0x140 kernel/entry/common.c:203
 __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
 syscall_exit_to_user_mode+0x26/0x60 kernel/entry/common.c:296
 do_syscall_64+0x49/0xb0 arch/x86/entry/common.c:86
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

The buggy address belongs to the object at ffff888079a6aa40
 which belongs to the cache xfs_dquot of size 704
The buggy address is located 24 bytes inside of
 704-byte region [ffff888079a6aa40, ffff888079a6ad00)

The buggy address belongs to the physical page:
page:ffffea0001e69a00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x79a68
head:ffffea0001e69a00 order:2 compound_mapcount:0 compound_pincount:0
flags: 0xfff00000010200(slab|head|node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff00000010200 0000000000000000 dead000000000122 ffff88814660f000
raw: 0000000000000000 0000000080130013 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 2, migratetype Unmovable, gfp_mask 0x1d20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL), pid 33, tgid 33 (kworker/u4:2), ts 336553992665, free_ts 335702258247
 prep_new_page mm/page_alloc.c:2539 [inline]
 get_page_from_freelist+0x742/0x7c0 mm/page_alloc.c:4291
 __alloc_pages+0x259/0x560 mm/page_alloc.c:5558
 alloc_slab_page+0xbd/0x190 mm/slub.c:1794
 allocate_slab+0x5e/0x4b0 mm/slub.c:1939
 new_slab mm/slub.c:1992 [inline]
 ___slab_alloc+0x782/0xe20 mm/slub.c:3180
 __slab_alloc mm/slub.c:3279 [inline]
 slab_alloc_node mm/slub.c:3364 [inline]
 slab_alloc mm/slub.c:3406 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3413 [inline]
 kmem_cache_alloc+0x24c/0x300 mm/slub.c:3422
 kmem_cache_zalloc include/linux/slab.h:679 [inline]
 xfs_dquot_alloc+0x36/0x600 fs/xfs/xfs_dquot.c:475
 xfs_qm_dqread+0x8a/0x1d0 fs/xfs/xfs_dquot.c:659
 xfs_qm_dqget+0x27d/0x4f0 fs/xfs/xfs_dquot.c:870
 xfs_qm_quotacheck_dqadjust+0xb7/0x380 fs/xfs/xfs_qm.c:1077
 xfs_qm_dqusage_adjust+0x4bd/0x630 fs/xfs/xfs_qm.c:1189
 xfs_iwalk_ag_recs+0x425/0x620 fs/xfs/xfs_iwalk.c:220
 xfs_iwalk_run_callbacks+0x20f/0x410 fs/xfs/xfs_iwalk.c:376
 xfs_iwalk_ag+0xaa5/0xb80 fs/xfs/xfs_iwalk.c:482
 xfs_iwalk_ag_work+0xf5/0x1a0 fs/xfs/xfs_iwalk.c:624
 xfs_pwork_work+0x7f/0x180 fs/xfs/xfs_pwork.c:47
page last free stack trace:
 reset_page_owner include/linux/page_owner.h:24 [inline]
 free_pages_prepare mm/page_alloc.c:1459 [inline]
 free_pcp_prepare+0x80c/0x8f0 mm/page_alloc.c:1509
 free_unref_page_prepare mm/page_alloc.c:3387 [inline]
 free_unref_page+0x7d/0x5f0 mm/page_alloc.c:3483
 __stack_depot_save+0x430/0x4a0 lib/stackdepot.c:506
 kasan_save_stack mm/kasan/common.c:46 [inline]
 kasan_set_track+0x52/0x60 mm/kasan/common.c:52
 kasan_save_free_info+0x27/0x40 mm/kasan/generic.c:511
 ____kasan_slab_free+0xd6/0x120 mm/kasan/common.c:236
 kasan_slab_free include/linux/kasan.h:177 [inline]
 slab_free_hook mm/slub.c:1724 [inline]
 slab_free_freelist_hook+0x12e/0x1a0 mm/slub.c:1750
 slab_free mm/slub.c:3661 [inline]
 __kmem_cache_free+0x71/0x110 mm/slub.c:3674
 memcg_free_slab_cgroups mm/slab.h:456 [inline]
 unaccount_slab mm/slab.h:645 [inline]
 __free_slab+0xf0/0x320 mm/slub.c:2015
 qlist_free_all+0x2b/0x70 mm/kasan/quarantine.c:187
 kasan_quarantine_reduce+0x169/0x180 mm/kasan/quarantine.c:294
 __kasan_slab_alloc+0x1f/0x70 mm/kasan/common.c:302
 kasan_slab_alloc include/linux/kasan.h:201 [inline]
 slab_post_alloc_hook mm/slab.h:737 [inline]
 slab_alloc_node mm/slub.c:3398 [inline]
 __kmem_cache_alloc_node+0x1d7/0x310 mm/slub.c:3437
 __do_kmalloc_node mm/slab_common.c:954 [inline]
 __kmalloc+0x9e/0x1a0 mm/slab_common.c:968
 kmalloc include/linux/slab.h:558 [inline]
 tomoyo_realpath_from_path+0xcd/0x5f0 security/tomoyo/realpath.c:251
 tomoyo_get_realpath security/tomoyo/file.c:151 [inline]
 tomoyo_path_perm+0x227/0x670 security/tomoyo/file.c:822

Memory state around the buggy address:
 ffff888079a6a900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888079a6a980: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
>ffff888079a6aa00: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
                                                    ^
 ffff888079a6aa80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888079a6ab00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
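
The "Memory state" dump above can be read by mapping an address to its shadow byte. The sketch below assumes generic KASAN's granule of one shadow byte per 8 bytes of memory, and a row width of 16 shadow bytes as read off the dump itself (rows advance by 0x80). The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define KASAN_GRANULE   8u   /* generic KASAN: one shadow byte covers 8 bytes */
#define SHADOW_PER_ROW  16u  /* assumption: shadow bytes per printed row */

/* Start address of the dump row that covers addr. */
static uint64_t shadow_row_base(uint64_t addr)
{
	return addr & ~(uint64_t)(KASAN_GRANULE * SHADOW_PER_ROW - 1);
}

/* Zero-based column of the shadow byte for addr within its row. */
static unsigned int shadow_column(uint64_t addr)
{
	return (unsigned int)((addr - shadow_row_base(addr)) / KASAN_GRANULE);
}

/* Cross-check against the row marked with '>' in the dump. */
static void check_against_dump(void)
{
	assert(shadow_row_base(0xffff888079a6aa58ULL) == 0xffff888079a6aa00ULL);
	assert(shadow_column(0xffff888079a6aa40ULL) == 8);  /* "fa": object start */
	assert(shadow_column(0xffff888079a6aa58ULL) == 11); /* "fb": the bad read */
}
```

Column 11 of the marked row holds fb, the value generic KASAN uses for freed slab memory, while the fc bytes in columns 0 to 7 are the redzone in front of the object.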



* [PATCH] xfs: dquot shrinker doesn't check for XFS_DQFLAG_FREEING
  2022-12-05 10:35 ` syzbot
@ 2022-12-05 22:52   ` Dave Chinner
  2022-12-07 16:17     ` Darrick J. Wong
  2022-12-05 23:58   ` [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one Dave Chinner
  1 sibling, 1 reply; 15+ messages in thread
From: Dave Chinner @ 2022-12-05 22:52 UTC (permalink / raw)
  To: syzbot; +Cc: djwong, linux-kernel, linux-xfs, syzkaller-bugs

On Mon, Dec 05, 2022 at 02:35:39AM -0800, syzbot wrote:
> syzbot has found a reproducer for the following issue on:
> 
> HEAD commit:    0ba09b173387 Revert "mm: align larger anonymous mappings o..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=15550c47880000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=2325e409a9a893e1
> dashboard link: https://syzkaller.appspot.com/bug?extid=912776840162c13db1a3
> compiler:       Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=128c9e23880000
> 
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/9758ec2c06f4/disk-0ba09b17.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/06781dbfd581/vmlinux-0ba09b17.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/3d44a22d15fa/bzImage-0ba09b17.xz
> mounted in repro: https://storage.googleapis.com/syzbot-assets/335889b2d730/mount_0.gz
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+912776840162c13db1a3@syzkaller.appspotmail.com
> 
> XFS (loop1): Quotacheck: Done.
> syz-executor.1 (4657): drop_caches: 2
> ==================================================================
> BUG: KASAN: use-after-free in xfs_dquot_type fs/xfs/xfs_dquot.h:136 [inline]
> BUG: KASAN: use-after-free in xfs_qm_dqfree_one+0x12f/0x170 fs/xfs/xfs_qm.c:1604
> Read of size 1 at addr ffff888079a6aa58 by task syz-executor.1/4657

Looks like we've missed an XFS_DQFLAG_FREEING check in
xfs_qm_dquot_isolate(), the LRU isolate callback used by
xfs_qm_shrink_scan(), and the dquot purge run by unmount has raced
with the shrinker. Patch below should fix it.

-Dave.
-- 
Dave Chinner
david@fromorbit.com

xfs: dquot shrinker doesn't check for XFS_DQFLAG_FREEING

From: Dave Chinner <dchinner@redhat.com>

Resulting in a UAF if the shrinker races with some other dquot
freeing mechanism that sets XFS_DQFLAG_FREEING before the dquot is
removed from the LRU. This can occur if a dquot purge races with
drop_caches.

Reported-by: syzbot+912776840162c13db1a3@syzkaller.appspotmail.com
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_qm.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index 18bb4ec4d7c9..ff53d40a2dae 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -422,6 +422,14 @@ xfs_qm_dquot_isolate(
 	if (!xfs_dqlock_nowait(dqp))
 		goto out_miss_busy;
 
+	/*
+	 * If something else is freeing this dquot and hasn't yet removed it
+	 * from the LRU, leave it for the freeing task to complete the freeing
+	 * process rather than risk it being freed from under us here.
+	 */
+	if (dqp->q_flags & XFS_DQFLAG_FREEING)
+		goto out_miss_unlock;
+
 	/*
 	 * This dquot has acquired a reference in the meantime remove it from
 	 * the freelist and try again.
@@ -441,10 +449,8 @@ xfs_qm_dquot_isolate(
 	 * skip it so there is time for the IO to complete before we try to
 	 * reclaim it again on the next LRU pass.
 	 */
-	if (!xfs_dqflock_nowait(dqp)) {
-		xfs_dqunlock(dqp);
-		goto out_miss_busy;
-	}
+	if (!xfs_dqflock_nowait(dqp))
+		goto out_miss_unlock;
 
 	if (XFS_DQ_IS_DIRTY(dqp)) {
 		struct xfs_buf	*bp = NULL;
@@ -478,6 +484,8 @@ xfs_qm_dquot_isolate(
 	XFS_STATS_INC(dqp->q_mount, xs_qm_dqreclaims);
 	return LRU_REMOVED;
 
+out_miss_unlock:
+	xfs_dqunlock(dqp);
 out_miss_busy:
 	trace_xfs_dqreclaim_busy(dqp);
 	XFS_STATS_INC(dqp->q_mount, xs_qm_dqreclaim_misses);
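
The control flow the patch introduces can be illustrated with a single-threaded userspace model. This is only a sketch: struct model_dquot, DQ_FREEING and the model_* helpers are hypothetical stand-ins, not the real XFS types or locking primitives, and the model does not reproduce the actual concurrency. It shows the check for a freeing flag taken under the object lock, and the stacked exit labels that drop the lock on the way out.

```c
#include <assert.h>
#include <stdbool.h>

#define DQ_FREEING 0x1u /* hypothetical stand-in for XFS_DQFLAG_FREEING */

enum isolate_result { ISOLATE_SKIP, ISOLATE_REMOVED };

struct model_dquot {
	unsigned int flags;
	bool locked;       /* models the dquot lock */
	bool flush_locked; /* models the flush lock being held elsewhere */
};

static bool model_trylock(struct model_dquot *dq)
{
	if (dq->locked)
		return false;
	dq->locked = true;
	return true;
}

static void model_unlock(struct model_dquot *dq)
{
	dq->locked = false;
}

static enum isolate_result model_isolate(struct model_dquot *dq)
{
	if (!model_trylock(dq))
		goto out_miss_busy;

	/* Another task already owns the teardown: leave the object alone. */
	if (dq->flags & DQ_FREEING)
		goto out_miss_unlock;

	if (dq->flush_locked)
		goto out_miss_unlock;

	dq->flags |= DQ_FREEING; /* in this model, the reclaimer claims the object */
	model_unlock(dq);
	return ISOLATE_REMOVED;

out_miss_unlock:
	model_unlock(dq);
out_miss_busy:
	return ISOLATE_SKIP;
}

/* An object already marked as freeing is skipped and left unlocked. */
static void model_selfcheck(void)
{
	struct model_dquot dq = { .flags = DQ_FREEING };

	assert(model_isolate(&dq) == ISOLATE_SKIP);
	assert(!dq.locked);
}
```

Falling through from out_miss_unlock to out_miss_busy keeps a single unlock site, which is what lets the patch replace the open-coded unlock in the flush-lock failure path.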


* Re: [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one
  2022-12-05 10:35 ` syzbot
  2022-12-05 22:52   ` [PATCH] xfs: dquot shrinker doesn't check for XFS_DQFLAG_FREEING Dave Chinner
@ 2022-12-05 23:58   ` Dave Chinner
  2022-12-06  3:12     ` syzbot
  1 sibling, 1 reply; 15+ messages in thread
From: Dave Chinner @ 2022-12-05 23:58 UTC (permalink / raw)
  To: syzbot; +Cc: djwong, linux-kernel, linux-xfs, syzkaller-bugs

On Mon, Dec 05, 2022 at 02:35:39AM -0800, syzbot wrote:
> syzbot has found a reproducer for the following issue on:
> 
> HEAD commit:    0ba09b173387 Revert "mm: align larger anonymous mappings o..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=15550c47880000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=2325e409a9a893e1
> dashboard link: https://syzkaller.appspot.com/bug?extid=912776840162c13db1a3
> compiler:       Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=128c9e23880000
> 
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/9758ec2c06f4/disk-0ba09b17.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/06781dbfd581/vmlinux-0ba09b17.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/3d44a22d15fa/bzImage-0ba09b17.xz
> mounted in repro: https://storage.googleapis.com/syzbot-assets/335889b2d730/mount_0.gz
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+912776840162c13db1a3@syzkaller.appspotmail.com

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master


xfs: dquot shrinker doesn't check for XFS_DQFLAG_FREEING

From: Dave Chinner <dchinner@redhat.com>

Resulting in a UAF if the shrinker races with some other dquot
freeing mechanism that sets XFS_DQFLAG_FREEING before the dquot is
removed from the LRU. This can occur if a dquot purge races with
drop_caches.

Reported-by: syzbot+912776840162c13db1a3@syzkaller.appspotmail.com
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_qm.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index 18bb4ec4d7c9..ff53d40a2dae 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -422,6 +422,14 @@ xfs_qm_dquot_isolate(
 	if (!xfs_dqlock_nowait(dqp))
 		goto out_miss_busy;
 
+	/*
+	 * If something else is freeing this dquot and hasn't yet removed it
+	 * from the LRU, leave it for the freeing task to complete the freeing
+	 * process rather than risk it being freed from under us here.
+	 */
+	if (dqp->q_flags & XFS_DQFLAG_FREEING)
+		goto out_miss_unlock;
+
 	/*
 	 * This dquot has acquired a reference in the meantime remove it from
 	 * the freelist and try again.
@@ -441,10 +449,8 @@ xfs_qm_dquot_isolate(
 	 * skip it so there is time for the IO to complete before we try to
 	 * reclaim it again on the next LRU pass.
 	 */
-	if (!xfs_dqflock_nowait(dqp)) {
-		xfs_dqunlock(dqp);
-		goto out_miss_busy;
-	}
+	if (!xfs_dqflock_nowait(dqp))
+		goto out_miss_unlock;
 
 	if (XFS_DQ_IS_DIRTY(dqp)) {
 		struct xfs_buf	*bp = NULL;
@@ -478,6 +484,8 @@ xfs_qm_dquot_isolate(
 	XFS_STATS_INC(dqp->q_mount, xs_qm_dqreclaims);
 	return LRU_REMOVED;
 
+out_miss_unlock:
+	xfs_dqunlock(dqp);
 out_miss_busy:
 	trace_xfs_dqreclaim_busy(dqp);
 	XFS_STATS_INC(dqp->q_mount, xs_qm_dqreclaim_misses);


* Re: [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one
  2022-12-05 23:58   ` [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one Dave Chinner
@ 2022-12-06  3:12     ` syzbot
  2022-12-06  3:34       ` Dave Chinner
  0 siblings, 1 reply; 15+ messages in thread
From: syzbot @ 2022-12-06  3:12 UTC (permalink / raw)
  To: david, djwong, linux-kernel, linux-xfs, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
INFO: rcu detected stall in corrupted

rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P4122 } 2641 jiffies s: 2877 root: 0x0/T
rcu: blocking rcu_node structures (internal RCU debug):


Tested on:

commit:         bce93322 proc: proc_skip_spaces() shouldn't think it i..
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
console output: https://syzkaller.appspot.com/x/log.txt?x=1566216b880000
kernel config:  https://syzkaller.appspot.com/x/.config?x=d58e7fe7f9cf5e24
dashboard link: https://syzkaller.appspot.com/bug?extid=912776840162c13db1a3
compiler:       Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2
patch:          https://syzkaller.appspot.com/x/patch.diff?x=164cad83880000



* Re: [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one
  2022-12-06  3:12     ` syzbot
@ 2022-12-06  3:34       ` Dave Chinner
  2022-12-06 11:06         ` Dmitry Vyukov
  0 siblings, 1 reply; 15+ messages in thread
From: Dave Chinner @ 2022-12-06  3:34 UTC (permalink / raw)
  To: syzbot; +Cc: djwong, linux-kernel, linux-xfs, syzkaller-bugs

On Mon, Dec 05, 2022 at 07:12:15PM -0800, syzbot wrote:
> Hello,
> 
> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> INFO: rcu detected stall in corrupted
> 
> rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P4122 } 2641 jiffies s: 2877 root: 0x0/T
> rcu: blocking rcu_node structures (internal RCU debug):

I'm pretty sure this has nothing to do with the reproducer - the
console log here:

> Tested on:
> 
> commit:         bce93322 proc: proc_skip_spaces() shouldn't think it i..
> git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> console output: https://syzkaller.appspot.com/x/log.txt?x=1566216b880000

indicates that syzbot is screwing around with bluetooth, HCI,
netdevsim, bridging, bonding, etc.

There's no evidence that it actually ran the reproducer for the bug
reported in this thread - there's no record of a single XFS
filesystem being mounted in the log....

It looks like someone else also tried a private patch to fix this
problem (which was obviously broken) and it failed with exactly the
same RCU warnings. That was run from the same commit id as the
original reproducer, so this looks like either syzbot is broken or
there's some other completely unrelated problem that syzbot is
tripping over here.

Over to the syzbot people to debug the syzbot failure....

-Dave.

-- 
Dave Chinner
david@fromorbit.com


* Re: [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one
  2022-12-06  3:34       ` Dave Chinner
@ 2022-12-06 11:06         ` Dmitry Vyukov
  2022-12-06 15:32           ` Paul E. McKenney
                             ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Dmitry Vyukov @ 2022-12-06 11:06 UTC (permalink / raw)
  To: Dave Chinner, Paul E. McKenney, frederic, quic_neeraju,
	Josh Triplett, RCU
  Cc: syzbot, djwong, linux-kernel, linux-xfs, syzkaller-bugs, syzkaller

On Tue, 6 Dec 2022 at 04:34, Dave Chinner <david@fromorbit.com> wrote:
>
> On Mon, Dec 05, 2022 at 07:12:15PM -0800, syzbot wrote:
> > Hello,
> >
> > syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> > INFO: rcu detected stall in corrupted
> >
> > rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P4122 } 2641 jiffies s: 2877 root: 0x0/T
> > rcu: blocking rcu_node structures (internal RCU debug):
>
> I'm pretty sure this has nothing to do with the reproducer - the
> console log here:
>
> > Tested on:
> >
> > commit:         bce93322 proc: proc_skip_spaces() shouldn't think it i..
> > git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1566216b880000
>
> indicates that syzbot is screwing around with bluetooth, HCI,
> netdevsim, bridging, bonding, etc.
>
> There's no evidence that it actually ran the reproducer for the bug
> reported in this thread - there's no record of a single XFS
> filesystem being mounted in the log....
>
> It looks like someone else also tried a private patch to fix this
> problem (which was obviously broken) and it failed with exactly the
> same RCU warnings. That was run from the same commit id as the
> original reproducer, so this looks like either syzbot is broken or
> there's some other completely unrelated problem that syzbot is
> tripping over here.
>
> Over to the syzbot people to debug the syzbot failure....

Hi Dave,

It's not uncommon for a single program to trigger multiple bugs.
That's what happens here. The rcu stall issue is reproducible with
this test program.
In such cases you can either submit more test requests, or test manually.

I think there is an RCU expedited stall detection.
For some reason CONFIG_RCU_EXP_CPU_STALL_TIMEOUT is limited to 21
seconds, and that's not enough for reliable flake-free stress testing.
We bump other timeouts to 100+ seconds.
+RCU maintainers, do you mind removing the overly restrictive limit on
CONFIG_RCU_EXP_CPU_STALL_TIMEOUT?
Or do you think there is something to fix in the kernel to not stall? I
see the test writes to
/proc/sys/vm/drop_caches, maybe there is some issue in that code.


* Re: [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one
  2022-12-06 11:06         ` Dmitry Vyukov
@ 2022-12-06 15:32           ` Paul E. McKenney
  2022-12-06 16:19             ` Dmitry Vyukov
  2022-12-06 20:58           ` Dave Chinner
       [not found]           ` <20221209034605.1801-1-hdanton@sina.com>
  2 siblings, 1 reply; 15+ messages in thread
From: Paul E. McKenney @ 2022-12-06 15:32 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Dave Chinner, frederic, quic_neeraju, Josh Triplett, RCU, syzbot,
	djwong, linux-kernel, linux-xfs, syzkaller-bugs, syzkaller

On Tue, Dec 06, 2022 at 12:06:10PM +0100, Dmitry Vyukov wrote:
> On Tue, 6 Dec 2022 at 04:34, Dave Chinner <david@fromorbit.com> wrote:
> >
> > On Mon, Dec 05, 2022 at 07:12:15PM -0800, syzbot wrote:
> > > Hello,
> > >
> > > syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> > > INFO: rcu detected stall in corrupted
> > >
> > > rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P4122 } 2641 jiffies s: 2877 root: 0x0/T
> > > rcu: blocking rcu_node structures (internal RCU debug):
> >
> > I'm pretty sure this has nothing to do with the reproducer - the
> > console log here:
> >
> > > Tested on:
> > >
> > > commit:         bce93322 proc: proc_skip_spaces() shouldn't think it i..
> > > git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=1566216b880000
> >
> > indicates that syzbot is screwing around with bluetooth, HCI,
> > netdevsim, bridging, bonding, etc.
> >
> > There's no evidence that it actually ran the reproducer for the bug
> > reported in this thread - there's no record of a single XFS
> > filesystem being mounted in the log....
> >
> > It looks like someone else also tried a private patch to fix this
> > problem (which was obviously broken) and it failed with exactly the
> > same RCU warnings. That was run from the same commit id as the
> > original reproducer, so this looks like either syzbot is broken or
> > there's some other completely unrelated problem that syzbot is
> > tripping over here.
> >
> > Over to the syzbot people to debug the syzbot failure....
> 
> Hi Dave,
> 
> It's not uncommon for a single program to trigger multiple bugs.
> That's what happens here. The rcu stall issue is reproducible with
> this test program.
> In such cases you can either submit more test requests, or test manually.
> 
> I think there is an RCU expedited stall detection.
> For some reason CONFIG_RCU_EXP_CPU_STALL_TIMEOUT is limited to 21
> seconds, and that's not enough for reliable flake-free stress testing.
> We bump other timeouts to 100+ seconds.
> +RCU maintainers, do you mind removing the overly restrictive limit on
> CONFIG_RCU_EXP_CPU_STALL_TIMEOUT?
> Or do you think there is something to fix in the kernel to not stall? I
> see the test writes to
> /proc/sys/vm/drop_caches, maybe there is some issue in that code.

Like this?

If so, I don't see why not.  And in that case, may I please have
your Tested-by or similar?

At the same time, I am sure that there are things in the kernel that
should be adjusted to avoid stalls, but I recognize that different
developers in different situations will have different issues that they
choose to focus on.  ;-)

							Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/rcu/Kconfig.debug b/kernel/rcu/Kconfig.debug
index 49da904df6aa6..2984de629f749 100644
--- a/kernel/rcu/Kconfig.debug
+++ b/kernel/rcu/Kconfig.debug
@@ -82,7 +82,7 @@ config RCU_CPU_STALL_TIMEOUT
 config RCU_EXP_CPU_STALL_TIMEOUT
 	int "Expedited RCU CPU stall timeout in milliseconds"
 	depends on RCU_STALL_COMMON
-	range 0 21000
+	range 0 300000
 	default 0
 	help
 	  If a given expedited RCU grace period extends more than the
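
For illustration only, the effect of the range change can be sketched in plain C. The helper name `in_range` and the two macros below are hypothetical and are not kernel or Kconfig code. The sketch only checks whether a requested timeout fits the old and the new `range` bounds; it does not model what Kconfig does with an out-of-range value.

```c
#include <stdbool.h>

/* Upper bounds taken from the diff above, in milliseconds. */
#define OLD_EXP_STALL_MAX_MS 21000L
#define NEW_EXP_STALL_MAX_MS 300000L

/* Hypothetical helper: true if value_ms lies within [0, max_ms]. */
static bool in_range(long value_ms, long max_ms)
{
	return value_ms >= 0 && value_ms <= max_ms;
}
```

Under these bounds, a 100-second timeout (`100000`) passes `in_range(100000, NEW_EXP_STALL_MAX_MS)` but fails `in_range(100000, OLD_EXP_STALL_MAX_MS)`.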


* Re: [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one
  2022-12-06 15:32           ` Paul E. McKenney
@ 2022-12-06 16:19             ` Dmitry Vyukov
  2022-12-06 17:47               ` Paul E. McKenney
  2022-12-06 21:03               ` Dave Chinner
  0 siblings, 2 replies; 15+ messages in thread
From: Dmitry Vyukov @ 2022-12-06 16:19 UTC (permalink / raw)
  To: paulmck
  Cc: Dave Chinner, frederic, quic_neeraju, Josh Triplett, RCU, syzbot,
	djwong, linux-kernel, linux-xfs, syzkaller-bugs, syzkaller

On Tue, 6 Dec 2022 at 16:32, Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Tue, Dec 06, 2022 at 12:06:10PM +0100, Dmitry Vyukov wrote:
> > On Tue, 6 Dec 2022 at 04:34, Dave Chinner <david@fromorbit.com> wrote:
> > >
> > > On Mon, Dec 05, 2022 at 07:12:15PM -0800, syzbot wrote:
> > > > Hello,
> > > >
> > > > syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> > > > INFO: rcu detected stall in corrupted
> > > >
> > > > rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P4122 } 2641 jiffies s: 2877 root: 0x0/T
> > > > rcu: blocking rcu_node structures (internal RCU debug):
> > >
> > > I'm pretty sure this has nothing to do with the reproducer - the
> > > console log here:
> > >
> > > > Tested on:
> > > >
> > > > commit:         bce93322 proc: proc_skip_spaces() shouldn't think it i..
> > > > git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=1566216b880000
> > >
> > > indicates that syzbot is screwing around with bluetooth, HCI,
> > > netdevsim, bridging, bonding, etc.
> > >
> > > There's no evidence that it actually ran the reproducer for the bug
> > > reported in this thread - there's no record of a single XFS
> > > filesystem being mounted in the log....
> > >
> > > It looks like someone else also tried a private patch to fix this
> > > problem (which was obviously broken) and it failed with exactly the
> > > same RCU warnings. That was run from the same commit id as the
> > > original reproducer, so this looks like either syzbot is broken or
> > > there's some other completely unrelated problem that syzbot is
> > > tripping over here.
> > >
> > > Over to the syzbot people to debug the syzbot failure....
> >
> > Hi Dave,
> >
> > It's not uncommon for a single program to trigger multiple bugs.
> > That's what happens here. The rcu stall issue is reproducible with
> > this test program.
> > In such cases you can either submit more test requests, or test manually.
> >
> > I think there is an RCU expedited stall detection.
> > For some reason CONFIG_RCU_EXP_CPU_STALL_TIMEOUT is limited to 21
> > seconds, and that's not enough for reliable flake-free stress testing.
> > We bump other timeouts to 100+ seconds.
> > +RCU maintainers, do you mind removing the overly restrictive limit on
> > CONFIG_RCU_EXP_CPU_STALL_TIMEOUT?
> > Or do you think there is something to fix in the kernel to not stall? I
> > see the test writes to
> > /proc/sys/vm/drop_caches, maybe there is some issue in that code.
>
> Like this?
>
> If so, I don't see why not.  And in that case, may I please have
> your Tested-by or similar?

I've tried with this patch and RCU_EXP_CPU_STALL_TIMEOUT=80000.
Running the test program I got some kernel BUG in XFS and no RCU
errors/warnings.

Tested-by: Dmitry Vyukov <dvyukov@google.com>

Thanks

> At the same time, I am sure that there are things in the kernel that
> should be adjusted to avoid stalls, but I recognize that different
> developers in different situations will have different issues that they
> choose to focus on.  ;-)
>
>                                                         Thanx, Paul
>
> ------------------------------------------------------------------------
>
> diff --git a/kernel/rcu/Kconfig.debug b/kernel/rcu/Kconfig.debug
> index 49da904df6aa6..2984de629f749 100644
> --- a/kernel/rcu/Kconfig.debug
> +++ b/kernel/rcu/Kconfig.debug
> @@ -82,7 +82,7 @@ config RCU_CPU_STALL_TIMEOUT
>  config RCU_EXP_CPU_STALL_TIMEOUT
>         int "Expedited RCU CPU stall timeout in milliseconds"
>         depends on RCU_STALL_COMMON
> -       range 0 21000
> +       range 0 300000
>         default 0
>         help
>           If a given expedited RCU grace period extends more than the


* Re: [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one
  2022-12-06 16:19             ` Dmitry Vyukov
@ 2022-12-06 17:47               ` Paul E. McKenney
  2022-12-06 21:03               ` Dave Chinner
  1 sibling, 0 replies; 15+ messages in thread
From: Paul E. McKenney @ 2022-12-06 17:47 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Dave Chinner, frederic, quic_neeraju, Josh Triplett, RCU, syzbot,
	djwong, linux-kernel, linux-xfs, syzkaller-bugs, syzkaller

On Tue, Dec 06, 2022 at 05:19:10PM +0100, Dmitry Vyukov wrote:
> On Tue, 6 Dec 2022 at 16:32, Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Tue, Dec 06, 2022 at 12:06:10PM +0100, Dmitry Vyukov wrote:
> > > On Tue, 6 Dec 2022 at 04:34, Dave Chinner <david@fromorbit.com> wrote:
> > > >
> > > > On Mon, Dec 05, 2022 at 07:12:15PM -0800, syzbot wrote:
> > > > > Hello,
> > > > >
> > > > > syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> > > > > INFO: rcu detected stall in corrupted
> > > > >
> > > > > rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P4122 } 2641 jiffies s: 2877 root: 0x0/T
> > > > > rcu: blocking rcu_node structures (internal RCU debug):
> > > >
> > > > I'm pretty sure this has nothing to do with the reproducer - the
> > > > console log here:
> > > >
> > > > > Tested on:
> > > > >
> > > > > commit:         bce93322 proc: proc_skip_spaces() shouldn't think it i..
> > > > > git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=1566216b880000
> > > >
> > > > indicates that syzbot is screwing around with bluetooth, HCI,
> > > > netdevsim, bridging, bonding, etc.
> > > >
> > > > There's no evidence that it actually ran the reproducer for the bug
> > > > reported in this thread - there's no record of a single XFS
> > > > filesystem being mounted in the log....
> > > >
> > > > It looks like someone else also tried a private patch to fix this
> > > > problem (which was obviously broken) and it failed with exactly the
> > > > same RCU warnings. That was run from the same commit id as the
> > > > original reproducer, so this looks like either syzbot is broken or
> > > > there's some other completely unrelated problem that syzbot is
> > > > tripping over here.
> > > >
> > > > Over to the syzbot people to debug the syzbot failure....
> > >
> > > Hi Dave,
> > >
> > > It's not uncommon for a single program to trigger multiple bugs.
> > > That's what happens here. The rcu stall issue is reproducible with
> > > this test program.
> > > In such cases you can either submit more test requests, or test manually.
> > >
> > > I think there is an RCU expedited stall detection.
> > > For some reason CONFIG_RCU_EXP_CPU_STALL_TIMEOUT is limited to 21
> > > seconds, and that's not enough for reliable flake-free stress testing.
> > > We bump other timeouts to 100+ seconds.
> > > +RCU maintainers, do you mind removing the overly restrictive limit on
> > > CONFIG_RCU_EXP_CPU_STALL_TIMEOUT?
> > > Or do you think there is something to fix in the kernel to not stall? I
> > > see the test writes to
> > > /proc/sys/vm/drop_caches, maybe there is some issue in that code.
> >
> > Like this?
> >
> > If so, I don't see why not.  And in that case, may I please have
> > your Tested-by or similar?
> 
> I've tried with this patch and RCU_EXP_CPU_STALL_TIMEOUT=80000.
> Running the test program I got some kernel BUG in XFS and no RCU
> errors/warnings.
> 
> Tested-by: Dmitry Vyukov <dvyukov@google.com>

Applied, thank you both!

I expect to push this into the v6.3 merge window, that is, not the
one coming up real soon now, but the one after that.

							Thanx, Paul

> Thanks
> 
> > At the same time, I am sure that there are things in the kernel that
> > should be adjusted to avoid stalls, but I recognize that different
> > developers in different situations will have different issues that they
> > choose to focus on.  ;-)
> >
> >                                                         Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > diff --git a/kernel/rcu/Kconfig.debug b/kernel/rcu/Kconfig.debug
> > index 49da904df6aa6..2984de629f749 100644
> > --- a/kernel/rcu/Kconfig.debug
> > +++ b/kernel/rcu/Kconfig.debug
> > @@ -82,7 +82,7 @@ config RCU_CPU_STALL_TIMEOUT
> >  config RCU_EXP_CPU_STALL_TIMEOUT
> >         int "Expedited RCU CPU stall timeout in milliseconds"
> >         depends on RCU_STALL_COMMON
> > -       range 0 21000
> > +       range 0 300000
> >         default 0
> >         help
> >           If a given expedited RCU grace period extends more than the


* Re: [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one
  2022-12-06 11:06         ` Dmitry Vyukov
  2022-12-06 15:32           ` Paul E. McKenney
@ 2022-12-06 20:58           ` Dave Chinner
       [not found]           ` <20221209034605.1801-1-hdanton@sina.com>
  2 siblings, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2022-12-06 20:58 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Paul E. McKenney, frederic, quic_neeraju, Josh Triplett, RCU,
	syzbot, djwong, linux-kernel, linux-xfs, syzkaller-bugs,
	syzkaller

On Tue, Dec 06, 2022 at 12:06:10PM +0100, Dmitry Vyukov wrote:
> On Tue, 6 Dec 2022 at 04:34, Dave Chinner <david@fromorbit.com> wrote:
> >
> > On Mon, Dec 05, 2022 at 07:12:15PM -0800, syzbot wrote:
> > > Hello,
> > >
> > > syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> > > INFO: rcu detected stall in corrupted
> > >
> > > rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P4122 } 2641 jiffies s: 2877 root: 0x0/T
> > > rcu: blocking rcu_node structures (internal RCU debug):
> >
> > I'm pretty sure this has nothing to do with the reproducer - the
> > console log here:
> >
> > > Tested on:
> > >
> > > commit:         bce93322 proc: proc_skip_spaces() shouldn't think it i..
> > > git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=1566216b880000
> >
> > indicates that syzbot is screwing around with bluetooth, HCI,
> > netdevsim, bridging, bonding, etc.
> >
> > There's no evidence that it actually ran the reproducer for the bug
> > reported in this thread - there's no record of a single XFS
> > filesystem being mounted in the log....
> >
> > It looks like someone else also tried a private patch to fix this
> > problem (which was obviously broken) and it failed with exactly the
> > same RCU warnings. That was run from the same commit id as the
> > original reproducer, so this looks like either syzbot is broken or
> > there's some other completely unrelated problem that syzbot is
> > tripping over here.
> >
> > Over to the syzbot people to debug the syzbot failure....
> 
> Hi Dave,
> 
> It's not uncommon for a single program to trigger multiple bugs.
> That's what happens here. The rcu stall issue is reproducible with
> this test program.
> In such cases you can either submit more test requests, or test manually.

So you're telling us syzbot reproducers are unreliable and we are
expected to play whack-a-mole with test resubmission until we get
the result we want?

How do I tell syzbot to resubmit the same patch for testing without
having to send the same patch to syzbot via email again? Can I
retrigger a new test run through the web interface?

-Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one
  2022-12-06 16:19             ` Dmitry Vyukov
  2022-12-06 17:47               ` Paul E. McKenney
@ 2022-12-06 21:03               ` Dave Chinner
  1 sibling, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2022-12-06 21:03 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: paulmck, frederic, quic_neeraju, Josh Triplett, RCU, syzbot,
	djwong, linux-kernel, linux-xfs, syzkaller-bugs, syzkaller

On Tue, Dec 06, 2022 at 05:19:10PM +0100, Dmitry Vyukov wrote:
> On Tue, 6 Dec 2022 at 16:32, Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Tue, Dec 06, 2022 at 12:06:10PM +0100, Dmitry Vyukov wrote:
> > > On Tue, 6 Dec 2022 at 04:34, Dave Chinner <david@fromorbit.com> wrote:
> > > >
> > > > On Mon, Dec 05, 2022 at 07:12:15PM -0800, syzbot wrote:
> > > > > Hello,
> > > > >
> > > > > syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> > > > > INFO: rcu detected stall in corrupted
> > > > >
> > > > > rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P4122 } 2641 jiffies s: 2877 root: 0x0/T
> > > > > rcu: blocking rcu_node structures (internal RCU debug):
> > > >
> > > > I'm pretty sure this has nothing to do with the reproducer - the
> > > > console log here:
> > > >
> > > > > Tested on:
> > > > >
> > > > > commit:         bce93322 proc: proc_skip_spaces() shouldn't think it i..
> > > > > git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=1566216b880000
> > > >
> > > > indicates that syzbot is screwing around with bluetooth, HCI,
> > > > netdevsim, bridging, bonding, etc.
> > > >
> > > > There's no evidence that it actually ran the reproducer for the bug
> > > > reported in this thread - there's no record of a single XFS
> > > > filesystem being mounted in the log....
> > > >
> > > > It looks like someone else also tried a private patch to fix this
> > > > problem (which was obviously broken) and it failed with exactly the
> > > > same RCU warnings. That was run from the same commit id as the
> > > > original reproducer, so this looks like either syzbot is broken or
> > > > there's some other completely unrelated problem that syzbot is
> > > > tripping over here.
> > > >
> > > > Over to the syzbot people to debug the syzbot failure....
> > >
> > > Hi Dave,
> > >
> > > It's not uncommon for a single program to trigger multiple bugs.
> > > That's what happens here. The rcu stall issue is reproducible with
> > > this test program.
> > > In such cases you can either submit more test requests, or test manually.
> > >
> > > I think there is an RCU expedited stall detection.
> > > For some reason CONFIG_RCU_EXP_CPU_STALL_TIMEOUT is limited to 21
> > > seconds, and that's not enough for reliable flake-free stress testing.
> > > We bump other timeouts to 100+ seconds.
> > > +RCU maintainers, do you mind removing the overly restrictive limit on
> > > CONFIG_RCU_EXP_CPU_STALL_TIMEOUT?
> > > Or do you think there is something to fix in the kernel to not stall? I
> > > see the test writes to
> > > /proc/sys/vm/drop_caches, maybe there is some issue in that code.
> >
> > Like this?
> >
> > If so, I don't see why not.  And in that case, may I please have
> > your Tested-by or similar?
> 
> I've tried with this patch and RCU_EXP_CPU_STALL_TIMEOUT=80000.
> Running the test program I got some kernel BUG in XFS and no RCU
> errors/warnings.

What BUG did it trigger? Where's the log?

-Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH] xfs: dquot shrinker doesn't check for XFS_DQFLAG_FREEING
  2022-12-05 22:52   ` [PATCH] xfs: dquot shrinker doesn't check for XFS_DQFLAG_FREEING Dave Chinner
@ 2022-12-07 16:17     ` Darrick J. Wong
  0 siblings, 0 replies; 15+ messages in thread
From: Darrick J. Wong @ 2022-12-07 16:17 UTC (permalink / raw)
  To: Dave Chinner; +Cc: syzbot, linux-kernel, linux-xfs, syzkaller-bugs

On Tue, Dec 06, 2022 at 09:52:46AM +1100, Dave Chinner wrote:
> On Mon, Dec 05, 2022 at 02:35:39AM -0800, syzbot wrote:
> > syzbot has found a reproducer for the following issue on:
> > 
> > HEAD commit:    0ba09b173387 Revert "mm: align larger anonymous mappings o..
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=15550c47880000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=2325e409a9a893e1
> > dashboard link: https://syzkaller.appspot.com/bug?extid=912776840162c13db1a3
> > compiler:       Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=128c9e23880000
> > 
> > Downloadable assets:
> > disk image: https://storage.googleapis.com/syzbot-assets/9758ec2c06f4/disk-0ba09b17.raw.xz
> > vmlinux: https://storage.googleapis.com/syzbot-assets/06781dbfd581/vmlinux-0ba09b17.xz
> > kernel image: https://storage.googleapis.com/syzbot-assets/3d44a22d15fa/bzImage-0ba09b17.xz
> > mounted in repro: https://storage.googleapis.com/syzbot-assets/335889b2d730/mount_0.gz
> > 
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+912776840162c13db1a3@syzkaller.appspotmail.com
> > 
> > XFS (loop1): Quotacheck: Done.
> > syz-executor.1 (4657): drop_caches: 2
> > ==================================================================
> > BUG: KASAN: use-after-free in xfs_dquot_type fs/xfs/xfs_dquot.h:136 [inline]
> > BUG: KASAN: use-after-free in xfs_qm_dqfree_one+0x12f/0x170 fs/xfs/xfs_qm.c:1604
> > Read of size 1 at addr ffff888079a6aa58 by task syz-executor.1/4657
> 
> Looks like we've missed an XFS_DQFLAG_FREEING check in
> xfs_qm_shrink_scan(), and the dquot purge run by unmount has raced
> with the shrinker. Patch below should fix it.
> 
> -Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 
> xfs: dquot shrinker doesn't check for XFS_DQFLAG_FREEING
> 
> From: Dave Chinner <dchinner@redhat.com>
> 
> Resulting in a UAF if the shrinker races with some other dquot
> freeing mechanism that sets XFS_DQFLAG_FREEING before the dquot is
> removed from the LRU. This can occur if a dquot purge races with
> drop_caches.
> 
> Reported-by: syzbot+912776840162c13db1a3@syzkaller.appspotmail.com
> Signed-off-by: Dave Chinner <dchinner@redhat.com>

Please repost this as a toplevel thread so it doesn't get lost in the
depths.  Anyway, this looks correct so:

Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> ---
>  fs/xfs/xfs_qm.c | 16 ++++++++++++----
>  1 file changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
> index 18bb4ec4d7c9..ff53d40a2dae 100644
> --- a/fs/xfs/xfs_qm.c
> +++ b/fs/xfs/xfs_qm.c
> @@ -422,6 +422,14 @@ xfs_qm_dquot_isolate(
>  	if (!xfs_dqlock_nowait(dqp))
>  		goto out_miss_busy;
>  
> +	/*
> +	 * If something else is freeing this dquot and hasn't yet removed it
> +	 * from the LRU, leave it for the freeing task to complete the freeing
> +	 * process rather than risk it being freed from under us here.
> +	 */
> +	if (dqp->q_flags & XFS_DQFLAG_FREEING)
> +		goto out_miss_unlock;
> +
>  	/*
>  	 * This dquot has acquired a reference in the meantime remove it from
>  	 * the freelist and try again.
> @@ -441,10 +449,8 @@ xfs_qm_dquot_isolate(
>  	 * skip it so there is time for the IO to complete before we try to
>  	 * reclaim it again on the next LRU pass.
>  	 */
> -	if (!xfs_dqflock_nowait(dqp)) {
> -		xfs_dqunlock(dqp);
> -		goto out_miss_busy;
> -	}
> +	if (!xfs_dqflock_nowait(dqp))
> +		goto out_miss_unlock;
>  
>  	if (XFS_DQ_IS_DIRTY(dqp)) {
>  		struct xfs_buf	*bp = NULL;
> @@ -478,6 +484,8 @@ xfs_qm_dquot_isolate(
>  	XFS_STATS_INC(dqp->q_mount, xs_qm_dqreclaims);
>  	return LRU_REMOVED;
>  
> +out_miss_unlock:
> +	xfs_dqunlock(dqp);
>  out_miss_busy:
>  	trace_xfs_dqreclaim_busy(dqp);
>  	XFS_STATS_INC(dqp->q_mount, xs_qm_dqreclaim_misses);
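
The control flow that the patch introduces (trylock, back off if someone else is already freeing the object, and unlock on every miss path through a shared label) can be sketched in userspace C. This is a toy model under stated assumptions: `struct toy_dquot`, `toy_trylock()`, `toy_unlock()`, `toy_isolate()` and `DQ_FLAG_FREEING` are hypothetical stand-ins, not the real XFS structures or locking primitives, and the handling of referenced and dirty dquots is deliberately simplified.

```c
#include <stdbool.h>

#define DQ_FLAG_FREEING 0x1u	/* stand-in for XFS_DQFLAG_FREEING */

enum isolate_result { ISOLATE_REMOVED, ISOLATE_SKIP };

/* Toy stand-in for a dquot; not the real struct xfs_dquot. */
struct toy_dquot {
	bool locked;		/* models the dquot lock */
	unsigned int flags;
	unsigned int nrefs;
};

/* Non-blocking lock attempt, loosely modelling xfs_dqlock_nowait(). */
static bool toy_trylock(struct toy_dquot *dq)
{
	if (dq->locked)
		return false;
	dq->locked = true;
	return true;
}

static void toy_unlock(struct toy_dquot *dq)
{
	dq->locked = false;
}

static enum isolate_result toy_isolate(struct toy_dquot *dq)
{
	if (!toy_trylock(dq))
		goto out_miss_busy;

	/* Another task is already freeing it: back off and let it finish. */
	if (dq->flags & DQ_FLAG_FREEING)
		goto out_miss_unlock;

	/* Still referenced: treated here as simply not reclaimable. */
	if (dq->nrefs)
		goto out_miss_unlock;

	/* Claim the object for freeing, then drop the lock. */
	dq->flags |= DQ_FLAG_FREEING;
	toy_unlock(dq);
	return ISOLATE_REMOVED;

out_miss_unlock:
	toy_unlock(dq);
out_miss_busy:
	return ISOLATE_SKIP;
}
```

The shared `out_miss_unlock` label keeps every post-lock miss path from leaking the lock, while the `out_miss_busy` path never touches a lock it did not take.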


* Re: [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one
       [not found]           ` <20221209034605.1801-1-hdanton@sina.com>
@ 2022-12-09  4:14             ` Paul E. McKenney
  0 siblings, 0 replies; 15+ messages in thread
From: Paul E. McKenney @ 2022-12-09  4:14 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Dmitry Vyukov, Dave Chinner, linux-kernel, linux-xfs, syzkaller-bugs

On Fri, Dec 09, 2022 at 11:46:05AM +0800, Hillf Danton wrote:
> On 6 Dec 2022 07:32:11 -0800 "Paul E. McKenney" <paulmck@kernel.org>
> > On Tue, Dec 06, 2022 at 12:06:10PM +0100, Dmitry Vyukov wrote:
> > > On Tue, 6 Dec 2022 at 04:34, Dave Chinner <david@fromorbit.com> wrote:
> > > >
> > > > On Mon, Dec 05, 2022 at 07:12:15PM -0800, syzbot wrote:
> > > > > Hello,
> > > > >
> > > > > syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> > > > > INFO: rcu detected stall in corrupted
> > > > >
> > > > > rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P4122 } 2641 jiffies s: 2877 root: 0x0/T
> > > > > rcu: blocking rcu_node structures (internal RCU debug):
> > > >
> > > > I'm pretty sure this has nothing to do with the reproducer - the
> > > > console log here:
> > > >
> > > > > Tested on:
> > > > >
> > > > > commit:         bce93322 proc: proc_skip_spaces() shouldn't think it i..
> > > > > git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=1566216b880000
> > > >
> > > > indicates that syzbot is screwing around with bluetooth, HCI,
> > > > netdevsim, bridging, bonding, etc.
> > > >
> > > > There's no evidence that it actually ran the reproducer for the bug
> > > > reported in this thread - there's no record of a single XFS
> > > > filesystem being mounted in the log....
> > > >
> > > > It looks like someone else also tried a private patch to fix this
> > > > problem (which was obviously broken) and it failed with exactly the
> > > > same RCU warnings. That was run from the same commit id as the
> > > > original reproducer, so this looks like either syzbot is broken or
> > > > there's some other completely unrelated problem that syzbot is
> > > > tripping over here.
> > > >
> > > > Over to the syzbot people to debug the syzbot failure....
> > > 
> > > Hi Dave,
> > > 
> > > It's not uncommon for a single program to trigger multiple bugs.
> > > That's what happens here. The rcu stall issue is reproducible with
> > > this test program.
> > > In such cases you can either submit more test requests, or test manually.
> > > 
> > > I think there is an RCU expedited stall detection.
> > > For some reason CONFIG_RCU_EXP_CPU_STALL_TIMEOUT is limited to 21
> > > seconds, and that's not enough for reliable flake-free stress testing.
> > > We bump other timeouts to 100+ seconds.
> > > +RCU maintainers, do you mind removing the overly restrictive limit on
> > > CONFIG_RCU_EXP_CPU_STALL_TIMEOUT?
> > > Or do you think there is something to fix in the kernel to not stall? I
> > > see the test writes to
> > > /proc/sys/vm/drop_caches, maybe there is some issue in that code.
> > 
> > Like this?
> > 
> > If so, I don't see why not.  And in that case, may I please have
> > your Tested-by or similar?
> > 
> > At the same time, I am sure that there are things in the kernel that
> > should be adjusted to avoid stalls, but I recognize that different
> > developers in different situations will have different issues that they
> > choose to focus on.  ;-)
> > 
> > 							Thanx, Paul
> > 
> > ------------------------------------------------------------------------
> > 
> > diff --git a/kernel/rcu/Kconfig.debug b/kernel/rcu/Kconfig.debug
> > index 49da904df6aa6..2984de629f749 100644
> > --- a/kernel/rcu/Kconfig.debug
> > +++ b/kernel/rcu/Kconfig.debug
> > @@ -82,7 +82,7 @@ config RCU_CPU_STALL_TIMEOUT
> >  config RCU_EXP_CPU_STALL_TIMEOUT
> >  	int "Expedited RCU CPU stall timeout in milliseconds"
> >  	depends on RCU_STALL_COMMON
> > -	range 0 21000
> > +	range 0 300000
> >  	default 0
> >  	help
> >  	  If a given expedited RCU grace period extends more than the
> >
> 	// Limit check must be consistent with the Kconfig limits for
> 	// CONFIG_RCU_EXP_CPU_STALL_TIMEOUT, so check the allowed range.
> 	// The minimum clamped value is "2UL", because at least one full
> 	// tick has to be guaranteed.
> 	till_stall_check = clamp(msecs_to_jiffies(cpu_stall_timeout), 2UL, 21UL * HZ); 
> 
> But with 21UL left behind intact?

Good catch, will fix, thank you!

							Thanx, Paul
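
The mismatch pointed out above can be reproduced with a small userspace sketch. Everything here is an assumption made for illustration: `TOY_HZ` is fixed at 1000, `toy_msecs_to_jiffies()` is a simplified rounding-up conversion rather than the kernel helper, and the 300-second bound merely mirrors the new Kconfig range rather than any actual follow-up fix.

```c
#define TOY_HZ 1000UL	/* assumed tick rate; real kernels vary */

/* Simplified conversion that rounds up; not the kernel implementation. */
static unsigned long toy_msecs_to_jiffies(unsigned long ms)
{
	return (ms * TOY_HZ + 999UL) / 1000UL;
}

static unsigned long toy_clamp(unsigned long v, unsigned long lo,
			       unsigned long hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/* Jiffies until the stall check, with the upper bound given in seconds. */
static unsigned long toy_till_stall_check(unsigned long timeout_ms,
					  unsigned long max_secs)
{
	return toy_clamp(toy_msecs_to_jiffies(timeout_ms), 2UL,
			 max_secs * TOY_HZ);
}
```

With the 21-second bound left in place, `toy_till_stall_check(80000, 21)` is capped at 21000 jiffies even though the Kconfig range now accepts 80000 ms; with a 300-second bound, `toy_till_stall_check(80000, 300)` yields the requested 80000 jiffies.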


* Re: [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one
       [not found] <20221205140422.7412-1-hdanton@sina.com>
@ 2022-12-05 17:06 ` syzbot
  0 siblings, 0 replies; 15+ messages in thread
From: syzbot @ 2022-12-05 17:06 UTC (permalink / raw)
  To: hdanton, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
INFO: rcu detected stall in corrupted

rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P4111 } 2640 jiffies s: 2849 root: 0x0/T
rcu: blocking rcu_node structures (internal RCU debug):


Tested on:

commit:         0ba09b17 Revert "mm: align larger anonymous mappings o..
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=135dc11d880000
kernel config:  https://syzkaller.appspot.com/x/.config?x=2325e409a9a893e1
dashboard link: https://syzkaller.appspot.com/bug?extid=912776840162c13db1a3
compiler:       Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2
patch:          https://syzkaller.appspot.com/x/patch.diff?x=1551f50f880000



Thread overview: 15+ messages
2022-12-05  9:21 [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one syzbot
2022-12-05 10:35 ` syzbot
2022-12-05 22:52   ` [PATCH] xfs: dquot shrinker doesn't check for XFS_DQFLAG_FREEING Dave Chinner
2022-12-07 16:17     ` Darrick J. Wong
2022-12-05 23:58   ` [syzbot] KASAN: use-after-free Read in xfs_qm_dqfree_one Dave Chinner
2022-12-06  3:12     ` syzbot
2022-12-06  3:34       ` Dave Chinner
2022-12-06 11:06         ` Dmitry Vyukov
2022-12-06 15:32           ` Paul E. McKenney
2022-12-06 16:19             ` Dmitry Vyukov
2022-12-06 17:47               ` Paul E. McKenney
2022-12-06 21:03               ` Dave Chinner
2022-12-06 20:58           ` Dave Chinner
     [not found]           ` <20221209034605.1801-1-hdanton@sina.com>
2022-12-09  4:14             ` Paul E. McKenney
     [not found] <20221205140422.7412-1-hdanton@sina.com>
2022-12-05 17:06 ` syzbot
