* [PATCH] fix syzkaller task hung in exit_aio
@ 2019-03-06 13:53 zhengbin
2019-03-06 19:44 ` Al Viro
0 siblings, 1 reply; 3+ messages in thread
From: zhengbin @ 2019-03-06 13:53 UTC (permalink / raw)
To: viro, bcrl, linux-fsdevel, linux-aio; +Cc: houtao1, yi.zhang
When I test the kernel with syzkaller, a task hangs in exit_aio:
INFO: task syz-executor.2:22372 blocked for more than 140 seconds.
Not tainted 4.19.25 #5
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor.2 D27568 22372 2689 0x90000002
Call Trace:
schedule+0x7c/0x1a0 kernel/sched/core.c:3516
schedule_timeout+0x4cf/0x1140 kernel/time/timer.c:1780
do_wait_for_common kernel/sched/completion.c:83 [inline]
__wait_for_common kernel/sched/completion.c:104 [inline]
wait_for_common kernel/sched/completion.c:115 [inline]
wait_for_completion+0x27a/0x3d0 kernel/sched/completion.c:136
exit_aio+0x2ef/0x3c0 fs/aio.c:881
__mmput kernel/fork.c:1047 [inline]
mmput+0xb4/0x460 kernel/fork.c:1071
exit_mm kernel/exit.c:545 [inline]
do_exit+0x79c/0x2cb0 kernel/exit.c:862
do_group_exit+0x106/0x2f0 kernel/exit.c:978
get_signal+0x325/0x1c80 kernel/signal.c:2572
do_signal+0x94/0x16a0 arch/x86/kernel/signal.c:816
exit_to_usermode_loop+0x108/0x1d0 arch/x86/entry/common.c:162
prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
do_syscall_64+0x461/0x580 arch/x86/entry/common.c:293
The reason is as follows:
io_submit_one-->aio_get_req-->percpu_ref_get(&ctx->reqs)
                           -->req->ki_refcnt=0
             -->aio_poll-->req->ki_refcnt=2
                        -->aio_poll_complete-->aio_complete-->iocb_put
                        -->iocb_put
iocb_put decreases req->ki_refcnt, so the number of aio_poll_complete
calls must match the number of iocb_put calls. Unfortunately, in some
cases they do not match, as shown below:
CPU 0                               CPU 1
aio_poll-->vfs_poll
                                    eventfd_write-->spin_lock_irq(lock)
                                                 -->..-->aio_poll_wake
                                                 -->spin_unlock_irq(lock)
        -->spin_lock(lock)
        -->if (req->woken)
             mask = 0; --->did not call aio_poll_complete
        -->iocb_put

aio_poll_wake
	req->woken = true;
	if (mask) {
		if (!(mask & req->events))
			return 0; --->did not call aio_poll_complete too

vfs_poll-->eventfd_poll-->poll_wait-->aio_poll_queue_proc(add
aio_poll_wake to req->head)

eventfd_write-->wake_up_locked_poll-->__wake_up_common-->curr->func
-->aio_poll_wake

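The imbalance described above can be reproduced in a small userspace model. This is only a sketch, not kernel code: struct toy_req and every toy_* function are hypothetical stand-ins for the aio structures and helpers named in the trace, and the two CPUs are replayed sequentially in the order shown in the diagram.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the aio request; not the kernel structure. */
struct toy_req {
	int refcnt;		/* models req->ki_refcnt */
	unsigned int events;	/* models req->events */
	bool woken;		/* models req->woken */
	bool completed;		/* set once the completion path has run */
};

/* Models iocb_put: drop one reference. */
static void toy_iocb_put(struct toy_req *req)
{
	assert(req->refcnt > 0);
	req->refcnt--;
}

/* Models aio_poll_complete-->aio_complete-->iocb_put. */
static void toy_poll_complete(struct toy_req *req)
{
	req->completed = true;
	toy_iocb_put(req);
}

/* Models the unpatched wake callback: woken is set before the mask check. */
static int toy_poll_wake(struct toy_req *req, unsigned int mask)
{
	req->woken = true;
	if (mask && !(mask & req->events))
		return 0;	/* no event match, nothing is completed */
	toy_poll_complete(req);
	return 1;
}

/* Models the tail of aio_poll after vfs_poll has returned. */
static void toy_poll_tail(struct toy_req *req, unsigned int mask)
{
	if (req->woken)
		mask = 0;	/* assumes the wake path completes the request */
	if (mask)
		toy_poll_complete(req);
	toy_iocb_put(req);
}

/* Replays the interleaving from the diagram; returns the leftover refcount. */
int toy_race(void)
{
	struct toy_req req = { .refcnt = 2, .events = 0x1 };

	toy_poll_wake(&req, 0x4);	/* CPU 1: wakeup with a non-matching mask */
	toy_poll_tail(&req, 0);		/* CPU 0: sees woken, skips completion */
	return req.refcnt;
}
```

In this model toy_race() returns 1: one reference is never dropped, which corresponds to the request that exit_aio keeps waiting for.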
This patch fixes that. By the way, it also fixes a bug in the error handling path.
Signed-off-by: zhengbin <zhengbin13@huawei.com>
---
fs/aio.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c
index 38b741a..3bf8cdc 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1668,8 +1668,6 @@ static int aio_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
 	__poll_t mask = key_to_poll(key);
 	unsigned long flags;
-	req->woken = true;
-
 	/* for instances that support it check for an event match first: */
 	if (mask) {
 		if (!(mask & req->events))
@@ -1687,12 +1685,14 @@ static int aio_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
 			list_del_init(&req->wait.entry);
 			aio_poll_complete(iocb, mask);
+			req->woken = true;
 			return 1;
 		}
 	}
 	list_del_init(&req->wait.entry);
 	schedule_work(&req->work);
+	req->woken = true;
 	return 1;
 }
@@ -1777,8 +1777,10 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb)
 	spin_unlock_irq(&ctx->ctx_lock);
 out:
-	if (unlikely(apt.error))
+	if (unlikely(apt.error)) {
+		iocb_put(aiocb);
 		return apt.error;
+	}
 	if (mask)
 		aio_poll_complete(aiocb, mask);
--
2.7.4
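
For illustration, the ordering this patch proposes for req->woken can be modelled in userspace as below. It is a rough sketch under stated assumptions, not the kernel implementation: struct toy_req and the toy_* helpers are hypothetical, the "cancellable" flag merely stands in for the request being put on the list of requests that can be cancelled, and the concurrent paths are replayed one after another.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the aio request; not the kernel structure. */
struct toy_req {
	int refcnt;		/* models req->ki_refcnt */
	unsigned int events;	/* models req->events */
	bool woken;		/* models req->woken */
	bool cancellable;	/* assumption: models being on the cancel list */
	bool completed;		/* set once the completion path has run */
};

/* Models iocb_put: drop one reference. */
static void toy_put(struct toy_req *req)
{
	assert(req->refcnt > 0);
	req->refcnt--;
}

/* Models aio_poll_complete-->aio_complete-->iocb_put. */
static void toy_complete(struct toy_req *req)
{
	req->completed = true;
	toy_put(req);
}

/* Wake callback with the proposed ordering: woken is set only after the
 * wake path has really taken over completion of the request. */
static int toy_wake_patched(struct toy_req *req, unsigned int mask)
{
	if (mask && !(mask & req->events))
		return 0;	/* ignored wakeup leaves woken untouched */
	toy_complete(req);
	req->woken = true;
	return 1;
}

/* Models the tail of aio_poll after vfs_poll has returned. */
static void toy_poll_tail(struct toy_req *req, unsigned int mask)
{
	if (req->woken)
		mask = 0;			/* wake path handles the rest */
	else if (!mask)
		req->cancellable = true;	/* really waiting for an event */
	if (mask)
		toy_complete(req);
	toy_put(req);
}

/* Non-matching wakeup, then aio_poll's tail, then a matching wakeup. */
int toy_patched_sequence(void)
{
	struct toy_req req = { .refcnt = 2, .events = 0x1 };

	toy_wake_patched(&req, 0x4);	/* ignored: mask does not match */
	toy_poll_tail(&req, 0);		/* request keeps waiting, one ref dropped */
	toy_wake_patched(&req, 0x1);	/* matching event completes the request */
	return req.refcnt;
}
```

Here toy_patched_sequence() returns 0, i.e. both references are dropped once a matching event arrives. The error path change (the extra iocb_put) is not modelled.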
* Re: [PATCH] fix syzkaller task hung in exit_aio
2019-03-06 13:53 [PATCH] fix syzkaller task hung in exit_aio zhengbin
@ 2019-03-06 19:44 ` Al Viro
2019-03-07 0:07 ` Al Viro
0 siblings, 1 reply; 3+ messages in thread
From: Al Viro @ 2019-03-06 19:44 UTC (permalink / raw)
To: zhengbin; +Cc: bcrl, linux-fsdevel, linux-aio, houtao1, yi.zhang
On Wed, Mar 06, 2019 at 09:53:23PM +0800, zhengbin wrote:
> CPU 0                               CPU 1
> aio_poll-->vfs_poll
>                                     eventfd_write-->spin_lock_irq(lock)
>                                                  -->..-->aio_poll_wake
>                                                  -->spin_unlock_irq(lock)
>         -->spin_lock(lock)
>         -->if (req->woken)
>              mask = 0; --->did not call aio_poll_complete
>         -->iocb_put
>
> aio_poll_wake
> 	req->woken = true;
> 	if (mask) {
> 		if (!(mask & req->events))
> 			return 0; --->did not call aio_poll_complete too

... and it's still on waitqueue, so it shouldn't be different from
_not_ having had a wakeup yet. And yes, aio_poll() in mainline right
now ends up _not_ adding it to "can be cancelled" list, leading to
that bug.
> vfs_poll-->eventfd_poll-->poll_wait-->aio_poll_queue_proc(add
> aio_poll_wake to req->head)
>
> eventfd_write-->wake_up_locked_poll-->__wake_up_common-->curr->func
> -->aio_poll_wake
>
> This patch fixes that. By the way, it also fixes a bug in the error handling path.

Leak on error is real (see thread a few days ago), and overall logic for
"woken" should be similar to what you suggest, but I'd rather handle it
slightly differently (see the same thread).
I've a patch that ought to fix that and it seems to survive testing; I'll
post once I finish carving it up - too many cleanups mixed into it. Give
me a couple of hours; should be done (and posted) by then.
* Re: [PATCH] fix syzkaller task hung in exit_aio
2019-03-06 19:44 ` Al Viro
@ 2019-03-07 0:07 ` Al Viro
0 siblings, 0 replies; 3+ messages in thread
From: Al Viro @ 2019-03-07 0:07 UTC (permalink / raw)
To: zhengbin; +Cc: bcrl, linux-fsdevel, linux-aio, houtao1, yi.zhang
On Wed, Mar 06, 2019 at 07:44:55PM +0000, Al Viro wrote:
> Leak on error is real (see thread a few days ago), and overall logic for
> "woken" should be similar to what you suggest, but I'd rather handle it
> slightly differently (see the same thread).
>
> I've a patch that ought to fix that and it seems to survive testing; I'll
> post once I finish carving it up - too many cleanups mixed into it. Give
> me a couple of hours; should be done (and posted) by then.
Carved up and posted - sorry, took longer than I hoped ;-/