From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>, Will Deacon <will.deacon@arm.com>
Cc: syzbot <syzbot+7b2866454055e43c21e5@syzkaller.appspotmail.com>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	syzkaller-bugs@googlegroups.com, viro@zeniv.linux.org.uk
Subject: Re: INFO: task hung in __sb_start_write
Date: Sun, 10 Jun 2018 23:47:56 +0900
Message-ID: <ee29ad1e-8bd0-3eef-f49e-793bfadcbea7@I-love.SAKURA.ne.jp>
In-Reply-To: <000000000000283c37056b4a81a5@google.com>

Hello.

Commits

  401c636a0eeb0d51 "kernel/hung_task.c: show all hung tasks before panic"
  8cc05c71ba5f7936 "locking/lockdep: Move sanity check to inside lockdep_print_held_locks()"

arrived in linux.git, and syzbot started giving us more hints.

Quoting from https://syzkaller.appspot.com/text?tag=CrashReport&x=1477e81f800000 :
----------------------------------------
2 locks held by rs:main Q:Reg/4416:
 #0: 00000000dff3f899 (&f->f_pos_lock){+.+.}, at: __fdget_pos+0x1a9/0x1e0 fs/file.c:766
 #1: 00000000c4a96cb8 (sb_writers#6){++++}, at: file_start_write include/linux/fs.h:2737 [inline]
 #1: 00000000c4a96cb8 (sb_writers#6){++++}, at: vfs_write+0x452/0x560 fs/read_write.c:548

1 lock held by rsyslogd/4418:
 #0: 000000007f0c215c (&f->f_pos_lock){+.+.}, at: __fdget_pos+0x1a9/0x1e0 fs/file.c:766

syz-executor4   D22224  4597   4588 0x00000000
Call Trace:
 context_switch kernel/sched/core.c:2856 [inline]
 __schedule+0x801/0x1e30 kernel/sched/core.c:3498
 schedule+0xef/0x430 kernel/sched/core.c:3542
 __rwsem_down_read_failed_common kernel/locking/rwsem-xadd.c:269 [inline]
 rwsem_down_read_failed+0x350/0x5e0 kernel/locking/rwsem-xadd.c:286
 call_rwsem_down_read_failed+0x18/0x30 arch/x86/lib/rwsem.S:94
 __down_read arch/x86/include/asm/rwsem.h:83 [inline]
 __percpu_down_read+0x15d/0x200 kernel/locking/percpu-rwsem.c:85
 percpu_down_read_preempt_disable include/linux/percpu-rwsem.h:49 [inline]
 percpu_down_read include/linux/percpu-rwsem.h:59 [inline]
 __sb_start_write+0x2d7/0x300 fs/super.c:1403
 sb_start_write include/linux/fs.h:1552 [inline]
 mnt_want_write+0x3f/0xc0 fs/namespace.c:386
 do_unlinkat+0x2a3/0xa10 fs/namei.c:4026
 __do_sys_unlink fs/namei.c:4091 [inline]
 __se_sys_unlink fs/namei.c:4089 [inline]
 __x64_sys_unlink+0x42/0x50 fs/namei.c:4089
 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
1 lock held by syz-executor4/4597:
 #0: 00000000c4a96cb8 (sb_writers#6){++++}, at: sb_start_write include/linux/fs.h:1552 [inline]
 #0: 00000000c4a96cb8 (sb_writers#6){++++}, at: mnt_want_write+0x3f/0xc0 fs/namespace.c:386

syz-executor6   D21536  4600   4591 0x00000000
Call Trace:
 context_switch kernel/sched/core.c:2856 [inline]
 __schedule+0x801/0x1e30 kernel/sched/core.c:3498
 schedule+0xef/0x430 kernel/sched/core.c:3542
 __rwsem_down_read_failed_common kernel/locking/rwsem-xadd.c:269 [inline]
 rwsem_down_read_failed+0x350/0x5e0 kernel/locking/rwsem-xadd.c:286
 call_rwsem_down_read_failed+0x18/0x30 arch/x86/lib/rwsem.S:94
 __down_read arch/x86/include/asm/rwsem.h:83 [inline]
 __percpu_down_read+0x15d/0x200 kernel/locking/percpu-rwsem.c:85
 percpu_down_read_preempt_disable include/linux/percpu-rwsem.h:49 [inline]
 percpu_down_read include/linux/percpu-rwsem.h:59 [inline]
 __sb_start_write+0x2d7/0x300 fs/super.c:1403
 sb_start_write include/linux/fs.h:1552 [inline]
 mnt_want_write+0x3f/0xc0 fs/namespace.c:386
 do_unlinkat+0x2a3/0xa10 fs/namei.c:4026
 __do_sys_unlink fs/namei.c:4091 [inline]
 __se_sys_unlink fs/namei.c:4089 [inline]
 __x64_sys_unlink+0x42/0x50 fs/namei.c:4089
 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
1 lock held by syz-executor6/4600:
 #0: 00000000c4a96cb8 (sb_writers#6){++++}, at: sb_start_write include/linux/fs.h:1552 [inline]
 #0: 00000000c4a96cb8 (sb_writers#6){++++}, at: mnt_want_write+0x3f/0xc0 fs/namespace.c:386

syz-executor3   D21536  7320   7261 0x00000000
Call Trace:
 context_switch kernel/sched/core.c:2856 [inline]
 __schedule+0x801/0x1e30 kernel/sched/core.c:3498
 schedule+0xef/0x430 kernel/sched/core.c:3542
 __rwsem_down_read_failed_common kernel/locking/rwsem-xadd.c:269 [inline]
 rwsem_down_read_failed+0x350/0x5e0 kernel/locking/rwsem-xadd.c:286
 call_rwsem_down_read_failed+0x18/0x30 arch/x86/lib/rwsem.S:94
 __down_read arch/x86/include/asm/rwsem.h:83 [inline]
 __percpu_down_read+0x15d/0x200 kernel/locking/percpu-rwsem.c:85
 percpu_down_read_preempt_disable include/linux/percpu-rwsem.h:49 [inline]
 percpu_down_read include/linux/percpu-rwsem.h:59 [inline]
 __sb_start_write+0x2d7/0x300 fs/super.c:1403
 sb_start_write include/linux/fs.h:1552 [inline]
 mnt_want_write+0x3f/0xc0 fs/namespace.c:386
 do_unlinkat+0x2a3/0xa10 fs/namei.c:4026
 __do_sys_unlink fs/namei.c:4091 [inline]
 __se_sys_unlink fs/namei.c:4089 [inline]
 __x64_sys_unlink+0x42/0x50 fs/namei.c:4089
 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
1 lock held by syz-executor3/7320:
 #0: 00000000c4a96cb8 (sb_writers#6){++++}, at: sb_start_write include/linux/fs.h:1552 [inline]
 #0: 00000000c4a96cb8 (sb_writers#6){++++}, at: mnt_want_write+0x3f/0xc0 fs/namespace.c:386

syz-executor2   D24032 13221   4593 0x00000004
Call Trace:
 context_switch kernel/sched/core.c:2856 [inline]
 __schedule+0x801/0x1e30 kernel/sched/core.c:3498
 schedule+0xef/0x430 kernel/sched/core.c:3542
 __rwsem_down_read_failed_common kernel/locking/rwsem-xadd.c:269 [inline]
 rwsem_down_read_failed+0x350/0x5e0 kernel/locking/rwsem-xadd.c:286
 call_rwsem_down_read_failed+0x18/0x30 arch/x86/lib/rwsem.S:94
 __down_read arch/x86/include/asm/rwsem.h:83 [inline]
 __percpu_down_read+0x15d/0x200 kernel/locking/percpu-rwsem.c:85
 percpu_down_read_preempt_disable include/linux/percpu-rwsem.h:49 [inline]
 percpu_down_read include/linux/percpu-rwsem.h:59 [inline]
 __sb_start_write+0x2d7/0x300 fs/super.c:1403
 sb_start_pagefault include/linux/fs.h:1581 [inline]
 ext4_page_mkwrite+0x1c8/0x1420 fs/ext4/inode.c:6125
 do_page_mkwrite+0x146/0x500 mm/memory.c:2380
 do_shared_fault mm/memory.c:3706 [inline]
 do_fault mm/memory.c:3745 [inline]
 handle_pte_fault mm/memory.c:3972 [inline]
 __handle_mm_fault+0x2acb/0x4390 mm/memory.c:4096
 handle_mm_fault+0x53a/0xc70 mm/memory.c:4133
 __do_page_fault+0x60b/0xe40 arch/x86/mm/fault.c:1403
 do_page_fault+0xee/0x8a7 arch/x86/mm/fault.c:1478
 page_fault+0x1e/0x30 arch/x86/entry/entry_64.S:1160
2 locks held by syz-executor2/13221:
 #0: 000000002618cb0b (&mm->mmap_sem){++++}, at: __do_page_fault+0x381/0xe40 arch/x86/mm/fault.c:1332
 #1: 00000000a50d0f8c (sb_pagefaults){++++}, at: sb_start_pagefault include/linux/fs.h:1581 [inline]
 #1: 00000000a50d0f8c (sb_pagefaults){++++}, at: ext4_page_mkwrite+0x1c8/0x1420 fs/ext4/inode.c:6125

syz-executor5   D24696 13225   4598 0x00000004
Call Trace:
 context_switch kernel/sched/core.c:2856 [inline]
 __schedule+0x801/0x1e30 kernel/sched/core.c:3498
 schedule+0xef/0x430 kernel/sched/core.c:3542
 __rwsem_down_read_failed_common kernel/locking/rwsem-xadd.c:269 [inline]
 rwsem_down_read_failed+0x350/0x5e0 kernel/locking/rwsem-xadd.c:286
 call_rwsem_down_read_failed+0x18/0x30 arch/x86/lib/rwsem.S:94
 __down_read arch/x86/include/asm/rwsem.h:83 [inline]
 __percpu_down_read+0x15d/0x200 kernel/locking/percpu-rwsem.c:85
 percpu_down_read_preempt_disable include/linux/percpu-rwsem.h:49 [inline]
 percpu_down_read include/linux/percpu-rwsem.h:59 [inline]
 __sb_start_write+0x2d7/0x300 fs/super.c:1403
 sb_start_pagefault include/linux/fs.h:1581 [inline]
 ext4_page_mkwrite+0x1c8/0x1420 fs/ext4/inode.c:6125
 do_page_mkwrite+0x146/0x500 mm/memory.c:2380
 wp_page_shared mm/memory.c:2676 [inline]
 do_wp_page+0xf5d/0x1990 mm/memory.c:2776
 handle_pte_fault mm/memory.c:3988 [inline]
 __handle_mm_fault+0x29f5/0x4390 mm/memory.c:4096
 handle_mm_fault+0x53a/0xc70 mm/memory.c:4133
 __do_page_fault+0x60b/0xe40 arch/x86/mm/fault.c:1403
 do_page_fault+0xee/0x8a7 arch/x86/mm/fault.c:1478
 page_fault+0x1e/0x30 arch/x86/entry/entry_64.S:1160
2 locks held by syz-executor5/13225:
 #0: 000000000aa505ed (&mm->mmap_sem){++++}, at: __do_page_fault+0x381/0xe40 arch/x86/mm/fault.c:1332
 #1: 00000000a50d0f8c (sb_pagefaults){++++}, at: sb_start_pagefault include/linux/fs.h:1581 [inline]
 #1: 00000000a50d0f8c (sb_pagefaults){++++}, at: ext4_page_mkwrite+0x1c8/0x1420 fs/ext4/inode.c:6125

syz-executor1   D23432 13236   4594 0x00000004
Call Trace:
 context_switch kernel/sched/core.c:2856 [inline]
 __schedule+0x801/0x1e30 kernel/sched/core.c:3498
 schedule+0xef/0x430 kernel/sched/core.c:3542
 __rwsem_down_read_failed_common kernel/locking/rwsem-xadd.c:269 [inline]
 rwsem_down_read_failed+0x350/0x5e0 kernel/locking/rwsem-xadd.c:286
 call_rwsem_down_read_failed+0x18/0x30 arch/x86/lib/rwsem.S:94
 __down_read arch/x86/include/asm/rwsem.h:83 [inline]
 __percpu_down_read+0x15d/0x200 kernel/locking/percpu-rwsem.c:85
 percpu_down_read_preempt_disable include/linux/percpu-rwsem.h:49 [inline]
 percpu_down_read include/linux/percpu-rwsem.h:59 [inline]
 __sb_start_write+0x2d7/0x300 fs/super.c:1403
 sb_start_pagefault include/linux/fs.h:1581 [inline]
 ext4_page_mkwrite+0x1c8/0x1420 fs/ext4/inode.c:6125
 do_page_mkwrite+0x146/0x500 mm/memory.c:2380
 wp_page_shared mm/memory.c:2676 [inline]
 do_wp_page+0xf5d/0x1990 mm/memory.c:2776
 handle_pte_fault mm/memory.c:3988 [inline]
 __handle_mm_fault+0x29f5/0x4390 mm/memory.c:4096
 handle_mm_fault+0x53a/0xc70 mm/memory.c:4133
 __do_page_fault+0x60b/0xe40 arch/x86/mm/fault.c:1403
 do_page_fault+0xee/0x8a7 arch/x86/mm/fault.c:1478
 page_fault+0x1e/0x30 arch/x86/entry/entry_64.S:1160
2 locks held by syz-executor1/13236:
 #0: 0000000079279c9c (&mm->mmap_sem){++++}, at: __do_page_fault+0x381/0xe40 arch/x86/mm/fault.c:1332
 #1: 00000000a50d0f8c (sb_pagefaults){++++}, at: sb_start_pagefault include/linux/fs.h:1581 [inline]
 #1: 00000000a50d0f8c (sb_pagefaults){++++}, at: ext4_page_mkwrite+0x1c8/0x1420 fs/ext4/inode.c:6125
----------------------------------------

This looks quite strange: nobody is holding the percpu_rw_semaphore for
write, yet everybody is stuck trying to acquire it for read. (Since there
is no "X locks held by ..." line without a follow-up "#0:" line, there is
no possibility that somebody is holding the percpu_rw_semaphore for write
while in TASK_RUNNING state.)

I suspect that either the API has a bug or the API is being used incorrectly.
Any ideas for debugging this?

