From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>, Will Deacon <will.deacon@arm.com>
Cc: syzbot <syzbot+7b2866454055e43c21e5@syzkaller.appspotmail.com>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
syzkaller-bugs@googlegroups.com, viro@zeniv.linux.org.uk
Subject: Re: INFO: task hung in __sb_start_write
Date: Sun, 10 Jun 2018 23:47:56 +0900 [thread overview]
Message-ID: <ee29ad1e-8bd0-3eef-f49e-793bfadcbea7@I-love.SAKURA.ne.jp> (raw)
In-Reply-To: <000000000000283c37056b4a81a5@google.com>
Hello.

Commits 401c636a0eeb0d51 ("kernel/hung_task.c: show all hung tasks before panic") and
8cc05c71ba5f7936 ("locking/lockdep: Move sanity check to inside lockdep_print_held_locks()")
have landed in linux.git, and syzbot has started giving us more hints.
Quoting from https://syzkaller.appspot.com/text?tag=CrashReport&x=1477e81f800000 :
----------------------------------------
2 locks held by rs:main Q:Reg/4416:
#0: 00000000dff3f899 (&f->f_pos_lock){+.+.}, at: __fdget_pos+0x1a9/0x1e0 fs/file.c:766
#1: 00000000c4a96cb8 (sb_writers#6){++++}, at: file_start_write include/linux/fs.h:2737 [inline]
#1: 00000000c4a96cb8 (sb_writers#6){++++}, at: vfs_write+0x452/0x560 fs/read_write.c:548
1 lock held by rsyslogd/4418:
#0: 000000007f0c215c (&f->f_pos_lock){+.+.}, at: __fdget_pos+0x1a9/0x1e0 fs/file.c:766
syz-executor4 D22224 4597 4588 0x00000000
Call Trace:
context_switch kernel/sched/core.c:2856 [inline]
__schedule+0x801/0x1e30 kernel/sched/core.c:3498
schedule+0xef/0x430 kernel/sched/core.c:3542
__rwsem_down_read_failed_common kernel/locking/rwsem-xadd.c:269 [inline]
rwsem_down_read_failed+0x350/0x5e0 kernel/locking/rwsem-xadd.c:286
call_rwsem_down_read_failed+0x18/0x30 arch/x86/lib/rwsem.S:94
__down_read arch/x86/include/asm/rwsem.h:83 [inline]
__percpu_down_read+0x15d/0x200 kernel/locking/percpu-rwsem.c:85
percpu_down_read_preempt_disable include/linux/percpu-rwsem.h:49 [inline]
percpu_down_read include/linux/percpu-rwsem.h:59 [inline]
__sb_start_write+0x2d7/0x300 fs/super.c:1403
sb_start_write include/linux/fs.h:1552 [inline]
mnt_want_write+0x3f/0xc0 fs/namespace.c:386
do_unlinkat+0x2a3/0xa10 fs/namei.c:4026
__do_sys_unlink fs/namei.c:4091 [inline]
__se_sys_unlink fs/namei.c:4089 [inline]
__x64_sys_unlink+0x42/0x50 fs/namei.c:4089
do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x49/0xbe
1 lock held by syz-executor4/4597:
#0: 00000000c4a96cb8 (sb_writers#6){++++}, at: sb_start_write include/linux/fs.h:1552 [inline]
#0: 00000000c4a96cb8 (sb_writers#6){++++}, at: mnt_want_write+0x3f/0xc0 fs/namespace.c:386
syz-executor6 D21536 4600 4591 0x00000000
Call Trace:
context_switch kernel/sched/core.c:2856 [inline]
__schedule+0x801/0x1e30 kernel/sched/core.c:3498
schedule+0xef/0x430 kernel/sched/core.c:3542
__rwsem_down_read_failed_common kernel/locking/rwsem-xadd.c:269 [inline]
rwsem_down_read_failed+0x350/0x5e0 kernel/locking/rwsem-xadd.c:286
call_rwsem_down_read_failed+0x18/0x30 arch/x86/lib/rwsem.S:94
__down_read arch/x86/include/asm/rwsem.h:83 [inline]
__percpu_down_read+0x15d/0x200 kernel/locking/percpu-rwsem.c:85
percpu_down_read_preempt_disable include/linux/percpu-rwsem.h:49 [inline]
percpu_down_read include/linux/percpu-rwsem.h:59 [inline]
__sb_start_write+0x2d7/0x300 fs/super.c:1403
sb_start_write include/linux/fs.h:1552 [inline]
mnt_want_write+0x3f/0xc0 fs/namespace.c:386
do_unlinkat+0x2a3/0xa10 fs/namei.c:4026
__do_sys_unlink fs/namei.c:4091 [inline]
__se_sys_unlink fs/namei.c:4089 [inline]
__x64_sys_unlink+0x42/0x50 fs/namei.c:4089
do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x49/0xbe
1 lock held by syz-executor6/4600:
#0: 00000000c4a96cb8 (sb_writers#6){++++}, at: sb_start_write include/linux/fs.h:1552 [inline]
#0: 00000000c4a96cb8 (sb_writers#6){++++}, at: mnt_want_write+0x3f/0xc0 fs/namespace.c:386
syz-executor3 D21536 7320 7261 0x00000000
Call Trace:
context_switch kernel/sched/core.c:2856 [inline]
__schedule+0x801/0x1e30 kernel/sched/core.c:3498
schedule+0xef/0x430 kernel/sched/core.c:3542
__rwsem_down_read_failed_common kernel/locking/rwsem-xadd.c:269 [inline]
rwsem_down_read_failed+0x350/0x5e0 kernel/locking/rwsem-xadd.c:286
call_rwsem_down_read_failed+0x18/0x30 arch/x86/lib/rwsem.S:94
__down_read arch/x86/include/asm/rwsem.h:83 [inline]
__percpu_down_read+0x15d/0x200 kernel/locking/percpu-rwsem.c:85
percpu_down_read_preempt_disable include/linux/percpu-rwsem.h:49 [inline]
percpu_down_read include/linux/percpu-rwsem.h:59 [inline]
__sb_start_write+0x2d7/0x300 fs/super.c:1403
sb_start_write include/linux/fs.h:1552 [inline]
mnt_want_write+0x3f/0xc0 fs/namespace.c:386
do_unlinkat+0x2a3/0xa10 fs/namei.c:4026
__do_sys_unlink fs/namei.c:4091 [inline]
__se_sys_unlink fs/namei.c:4089 [inline]
__x64_sys_unlink+0x42/0x50 fs/namei.c:4089
do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x49/0xbe
1 lock held by syz-executor3/7320:
#0: 00000000c4a96cb8 (sb_writers#6){++++}, at: sb_start_write include/linux/fs.h:1552 [inline]
#0: 00000000c4a96cb8 (sb_writers#6){++++}, at: mnt_want_write+0x3f/0xc0 fs/namespace.c:386
syz-executor2 D24032 13221 4593 0x00000004
Call Trace:
context_switch kernel/sched/core.c:2856 [inline]
__schedule+0x801/0x1e30 kernel/sched/core.c:3498
schedule+0xef/0x430 kernel/sched/core.c:3542
__rwsem_down_read_failed_common kernel/locking/rwsem-xadd.c:269 [inline]
rwsem_down_read_failed+0x350/0x5e0 kernel/locking/rwsem-xadd.c:286
call_rwsem_down_read_failed+0x18/0x30 arch/x86/lib/rwsem.S:94
__down_read arch/x86/include/asm/rwsem.h:83 [inline]
__percpu_down_read+0x15d/0x200 kernel/locking/percpu-rwsem.c:85
percpu_down_read_preempt_disable include/linux/percpu-rwsem.h:49 [inline]
percpu_down_read include/linux/percpu-rwsem.h:59 [inline]
__sb_start_write+0x2d7/0x300 fs/super.c:1403
sb_start_pagefault include/linux/fs.h:1581 [inline]
ext4_page_mkwrite+0x1c8/0x1420 fs/ext4/inode.c:6125
do_page_mkwrite+0x146/0x500 mm/memory.c:2380
do_shared_fault mm/memory.c:3706 [inline]
do_fault mm/memory.c:3745 [inline]
handle_pte_fault mm/memory.c:3972 [inline]
__handle_mm_fault+0x2acb/0x4390 mm/memory.c:4096
handle_mm_fault+0x53a/0xc70 mm/memory.c:4133
__do_page_fault+0x60b/0xe40 arch/x86/mm/fault.c:1403
do_page_fault+0xee/0x8a7 arch/x86/mm/fault.c:1478
page_fault+0x1e/0x30 arch/x86/entry/entry_64.S:1160
2 locks held by syz-executor2/13221:
#0: 000000002618cb0b (&mm->mmap_sem){++++}, at: __do_page_fault+0x381/0xe40 arch/x86/mm/fault.c:1332
#1: 00000000a50d0f8c (sb_pagefaults){++++}, at: sb_start_pagefault include/linux/fs.h:1581 [inline]
#1: 00000000a50d0f8c (sb_pagefaults){++++}, at: ext4_page_mkwrite+0x1c8/0x1420 fs/ext4/inode.c:6125
syz-executor5 D24696 13225 4598 0x00000004
Call Trace:
context_switch kernel/sched/core.c:2856 [inline]
__schedule+0x801/0x1e30 kernel/sched/core.c:3498
schedule+0xef/0x430 kernel/sched/core.c:3542
__rwsem_down_read_failed_common kernel/locking/rwsem-xadd.c:269 [inline]
rwsem_down_read_failed+0x350/0x5e0 kernel/locking/rwsem-xadd.c:286
call_rwsem_down_read_failed+0x18/0x30 arch/x86/lib/rwsem.S:94
__down_read arch/x86/include/asm/rwsem.h:83 [inline]
__percpu_down_read+0x15d/0x200 kernel/locking/percpu-rwsem.c:85
percpu_down_read_preempt_disable include/linux/percpu-rwsem.h:49 [inline]
percpu_down_read include/linux/percpu-rwsem.h:59 [inline]
__sb_start_write+0x2d7/0x300 fs/super.c:1403
sb_start_pagefault include/linux/fs.h:1581 [inline]
ext4_page_mkwrite+0x1c8/0x1420 fs/ext4/inode.c:6125
do_page_mkwrite+0x146/0x500 mm/memory.c:2380
wp_page_shared mm/memory.c:2676 [inline]
do_wp_page+0xf5d/0x1990 mm/memory.c:2776
handle_pte_fault mm/memory.c:3988 [inline]
__handle_mm_fault+0x29f5/0x4390 mm/memory.c:4096
handle_mm_fault+0x53a/0xc70 mm/memory.c:4133
__do_page_fault+0x60b/0xe40 arch/x86/mm/fault.c:1403
do_page_fault+0xee/0x8a7 arch/x86/mm/fault.c:1478
page_fault+0x1e/0x30 arch/x86/entry/entry_64.S:1160
2 locks held by syz-executor5/13225:
#0: 000000000aa505ed (&mm->mmap_sem){++++}, at: __do_page_fault+0x381/0xe40 arch/x86/mm/fault.c:1332
#1: 00000000a50d0f8c (sb_pagefaults){++++}, at: sb_start_pagefault include/linux/fs.h:1581 [inline]
#1: 00000000a50d0f8c (sb_pagefaults){++++}, at: ext4_page_mkwrite+0x1c8/0x1420 fs/ext4/inode.c:6125
syz-executor1 D23432 13236 4594 0x00000004
Call Trace:
context_switch kernel/sched/core.c:2856 [inline]
__schedule+0x801/0x1e30 kernel/sched/core.c:3498
schedule+0xef/0x430 kernel/sched/core.c:3542
__rwsem_down_read_failed_common kernel/locking/rwsem-xadd.c:269 [inline]
rwsem_down_read_failed+0x350/0x5e0 kernel/locking/rwsem-xadd.c:286
call_rwsem_down_read_failed+0x18/0x30 arch/x86/lib/rwsem.S:94
__down_read arch/x86/include/asm/rwsem.h:83 [inline]
__percpu_down_read+0x15d/0x200 kernel/locking/percpu-rwsem.c:85
percpu_down_read_preempt_disable include/linux/percpu-rwsem.h:49 [inline]
percpu_down_read include/linux/percpu-rwsem.h:59 [inline]
__sb_start_write+0x2d7/0x300 fs/super.c:1403
sb_start_pagefault include/linux/fs.h:1581 [inline]
ext4_page_mkwrite+0x1c8/0x1420 fs/ext4/inode.c:6125
do_page_mkwrite+0x146/0x500 mm/memory.c:2380
wp_page_shared mm/memory.c:2676 [inline]
do_wp_page+0xf5d/0x1990 mm/memory.c:2776
handle_pte_fault mm/memory.c:3988 [inline]
__handle_mm_fault+0x29f5/0x4390 mm/memory.c:4096
handle_mm_fault+0x53a/0xc70 mm/memory.c:4133
__do_page_fault+0x60b/0xe40 arch/x86/mm/fault.c:1403
do_page_fault+0xee/0x8a7 arch/x86/mm/fault.c:1478
page_fault+0x1e/0x30 arch/x86/entry/entry_64.S:1160
2 locks held by syz-executor1/13236:
#0: 0000000079279c9c (&mm->mmap_sem){++++}, at: __do_page_fault+0x381/0xe40 arch/x86/mm/fault.c:1332
#1: 00000000a50d0f8c (sb_pagefaults){++++}, at: sb_start_pagefault include/linux/fs.h:1581 [inline]
#1: 00000000a50d0f8c (sb_pagefaults){++++}, at: ext4_page_mkwrite+0x1c8/0x1420 fs/ext4/inode.c:6125
----------------------------------------
This looks quite strange: nobody is holding the percpu_rw_semaphore for
write, yet everybody is stuck trying to acquire it for read. (Since there
is no "X locks held by ..." line without a follow-up "#0:" line, there is
no possibility that somebody is in TASK_RUNNING state while holding the
percpu_rw_semaphore for write.)

I feel that either the API has a bug or the API usage is wrong.

Any ideas for debugging this?
Thread overview: 13+ messages
2018-05-03 10:17 INFO: task hung in __sb_start_write syzbot
2018-06-10 14:47 ` Tetsuo Handa [this message]
2018-06-11 7:30 ` Peter Zijlstra
2018-06-11 7:39 ` Dmitry Vyukov
2018-06-14 10:33 ` Tetsuo Handa
2018-06-14 10:33 ` Tetsuo Handa
2018-06-15 9:19 ` Dmitry Vyukov
2018-06-15 19:40 ` Tetsuo Handa
2018-06-19 11:10 ` Tetsuo Handa
2018-06-19 11:47 ` Dmitry Vyukov
2018-06-19 13:00 ` Tetsuo Handa
2018-07-11 11:13 ` Tetsuo Handa
2018-07-13 10:38 ` Tetsuo Handa