All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] 5.16 fixes for syz reports
@ 2021-11-26 14:38 Pavel Begunkov
  2021-11-26 14:38 ` [PATCH 1/2] io_uring: fail cancellation for EXITING tasks Pavel Begunkov
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Pavel Begunkov @ 2021-11-26 14:38 UTC (permalink / raw)
  To: io-uring; +Cc: Jens Axboe, asml.silence

The second patch fixes the last five reports, i.e. deadlocks and
inconsistent irq state, all of the caused by the same patch and
are dups of each other.

1/2 is for another report, where tw of cancellation requests don't
handle PF_EXITING.

Pavel Begunkov (2):
  io_uring: fail cancellation for EXITING tasks
  io_uring: fix link traversal locking

 fs/io_uring.c | 65 +++++++++++++++++++++++++++++++++++----------------
 1 file changed, 45 insertions(+), 20 deletions(-)

-- 
2.34.0


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/2] io_uring: fail cancellation for EXITING tasks
  2021-11-26 14:38 [PATCH 0/2] 5.16 fixes for syz reports Pavel Begunkov
@ 2021-11-26 14:38 ` Pavel Begunkov
  2021-11-26 15:32   ` Pavel Begunkov
  2021-11-26 14:38 ` [PATCH 2/2] io_uring: fix link traversal locking Pavel Begunkov
  2021-11-26 15:36 ` [PATCH 0/2] 5.16 fixes for syz reports Jens Axboe
  2 siblings, 1 reply; 6+ messages in thread
From: Pavel Begunkov @ 2021-11-26 14:38 UTC (permalink / raw)
  To: io-uring; +Cc: Jens Axboe, asml.silence

WARNING: CPU: 1 PID: 20 at fs/io_uring.c:6269 io_try_cancel_userdata+0x3c5/0x640 fs/io_uring.c:6269
CPU: 1 PID: 20 Comm: kworker/1:0 Not tainted 5.16.0-rc1-syzkaller #0
Workqueue: events io_fallback_req_func
RIP: 0010:io_try_cancel_userdata+0x3c5/0x640 fs/io_uring.c:6269
Call Trace:
 <TASK>
 io_req_task_link_timeout+0x6b/0x1e0 fs/io_uring.c:6886
 io_fallback_req_func+0xf9/0x1ae fs/io_uring.c:1334
 process_one_work+0x9b2/0x1690 kernel/workqueue.c:2298
 worker_thread+0x658/0x11f0 kernel/workqueue.c:2445
 kthread+0x405/0x4f0 kernel/kthread.c:327
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
 </TASK>

We need original task's context to do cancellations, so if it's dying
and the callback is executed in a fallback mode, fail the cancellation
attempt.

Reported-by: syzbot+ab0cfe96c2b3cd1c1153@syzkaller.appspotmail.com
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index a4c508a1e0cf..7dd112d44adf 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6882,10 +6882,11 @@ static inline struct file *io_file_get(struct io_ring_ctx *ctx,
 static void io_req_task_link_timeout(struct io_kiocb *req, bool *locked)
 {
 	struct io_kiocb *prev = req->timeout.prev;
-	int ret;
+	int ret = -ENOENT;
 
 	if (prev) {
-		ret = io_try_cancel_userdata(req, prev->user_data);
+		if (!(req->task->flags & PF_EXITING))
+			ret = io_try_cancel_userdata(req, prev->user_data);
 		io_req_complete_post(req, ret ?: -ETIME, 0);
 		io_put_req(prev);
 	} else {
-- 
2.34.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/2] io_uring: fix link traversal locking
  2021-11-26 14:38 [PATCH 0/2] 5.16 fixes for syz reports Pavel Begunkov
  2021-11-26 14:38 ` [PATCH 1/2] io_uring: fail cancellation for EXITING tasks Pavel Begunkov
@ 2021-11-26 14:38 ` Pavel Begunkov
  2021-11-26 15:29   ` Pavel Begunkov
  2021-11-26 15:36 ` [PATCH 0/2] 5.16 fixes for syz reports Jens Axboe
  2 siblings, 1 reply; 6+ messages in thread
From: Pavel Begunkov @ 2021-11-26 14:38 UTC (permalink / raw)
  To: io-uring; +Cc: Jens Axboe, asml.silence

WARNING: inconsistent lock state
5.16.0-rc2-syzkaller #0 Not tainted
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
ffff888078e11418 (&ctx->timeout_lock
){?.+.}-{2:2}
, at: io_timeout_fn+0x6f/0x360 fs/io_uring.c:5943
{HARDIRQ-ON-W} state was registered at:
  [...]
  spin_unlock_irq include/linux/spinlock.h:399 [inline]
  __io_poll_remove_one fs/io_uring.c:5669 [inline]
  __io_poll_remove_one fs/io_uring.c:5654 [inline]
  io_poll_remove_one+0x236/0x870 fs/io_uring.c:5680
  io_poll_remove_all+0x1af/0x235 fs/io_uring.c:5709
  io_ring_ctx_wait_and_kill+0x1cc/0x322 fs/io_uring.c:9534
  io_uring_release+0x42/0x46 fs/io_uring.c:9554
  __fput+0x286/0x9f0 fs/file_table.c:280
  task_work_run+0xdd/0x1a0 kernel/task_work.c:164
  exit_task_work include/linux/task_work.h:32 [inline]
  do_exit+0xc14/0x2b40 kernel/exit.c:832

674ee8e1b4a41 ("io_uring: correct link-list traversal locking") fixed a
data race but introduced a possible deadlock and inconsistentcy in irq
states. E.g.

io_poll_remove_all()
    spin_lock_irq(timeout_lock)
    io_poll_remove_one()
        spin_lock/unlock_irq(poll_lock);
    spin_unlock_irq(timeout_lock)

Another type of problem is freeing a request while holding
->timeout_lock, which may leads to a deadlock in
io_commit_cqring() -> io_flush_timeouts() and other places.

Having 3 nested locks is also too ugly. Add io_match_task_safe(), which
would briefly take and release timeout_lock for race prevention inside,
so the actuall request cancellation / free / etc. code doesn't have it
taken.

Reported-by: syzbot+ff49a3059d49b0ca0eec@syzkaller.appspotmail.com
Reported-by: syzbot+847f02ec20a6609a328b@syzkaller.appspotmail.com
Reported-by: syzbot+3368aadcd30425ceb53b@syzkaller.appspotmail.com
Reported-by: syzbot+51ce8887cdef77c9ac83@syzkaller.appspotmail.com
Reported-by: syzbot+3cb756a49d2f394a9ee3@syzkaller.appspotmail.com
Fixes: 674ee8e1b4a41 ("io_uring: correct link-list traversal locking")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 60 +++++++++++++++++++++++++++++++++++----------------
 1 file changed, 42 insertions(+), 18 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 7dd112d44adf..75841b919dce 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1278,6 +1278,7 @@ static void io_refs_resurrect(struct percpu_ref *ref, struct completion *compl)
 
 static bool io_match_task(struct io_kiocb *head, struct task_struct *task,
 			  bool cancel_all)
+	__must_hold(&req->ctx->timeout_lock)
 {
 	struct io_kiocb *req;
 
@@ -1293,6 +1294,44 @@ static bool io_match_task(struct io_kiocb *head, struct task_struct *task,
 	return false;
 }
 
+static bool io_match_linked(struct io_kiocb *head)
+{
+	struct io_kiocb *req;
+
+	io_for_each_link(req, head) {
+		if (req->flags & REQ_F_INFLIGHT)
+			return true;
+	}
+	return false;
+}
+
+/*
+ * As io_match_task() but protected against racing with linked timeouts.
+ * User must not hold timeout_lock.
+ */
+static bool io_match_task_safe(struct io_kiocb *head, struct task_struct *task,
+			       bool cancel_all)
+{
+	bool matched;
+
+	if (task && head->task != task)
+		return false;
+	if (cancel_all)
+		return true;
+
+	if (head->flags & REQ_F_LINK_TIMEOUT) {
+		struct io_ring_ctx *ctx = head->ctx;
+
+		/* protect against races with linked timeouts */
+		spin_lock_irq(&ctx->timeout_lock);
+		matched = io_match_linked(head);
+		spin_unlock_irq(&ctx->timeout_lock);
+	} else {
+		matched = io_match_linked(head);
+	}
+	return matched;
+}
+
 static inline bool req_has_async_data(struct io_kiocb *req)
 {
 	return req->flags & REQ_F_ASYNC_DATA;
@@ -5699,17 +5738,15 @@ static __cold bool io_poll_remove_all(struct io_ring_ctx *ctx,
 	int posted = 0, i;
 
 	spin_lock(&ctx->completion_lock);
-	spin_lock_irq(&ctx->timeout_lock);
 	for (i = 0; i < (1U << ctx->cancel_hash_bits); i++) {
 		struct hlist_head *list;
 
 		list = &ctx->cancel_hash[i];
 		hlist_for_each_entry_safe(req, tmp, list, hash_node) {
-			if (io_match_task(req, tsk, cancel_all))
+			if (io_match_task_safe(req, tsk, cancel_all))
 				posted += io_poll_remove_one(req);
 		}
 	}
-	spin_unlock_irq(&ctx->timeout_lock);
 	spin_unlock(&ctx->completion_lock);
 
 	if (posted)
@@ -9565,19 +9602,8 @@ static bool io_cancel_task_cb(struct io_wq_work *work, void *data)
 {
 	struct io_kiocb *req = container_of(work, struct io_kiocb, work);
 	struct io_task_cancel *cancel = data;
-	bool ret;
 
-	if (!cancel->all && (req->flags & REQ_F_LINK_TIMEOUT)) {
-		struct io_ring_ctx *ctx = req->ctx;
-
-		/* protect against races with linked timeouts */
-		spin_lock_irq(&ctx->timeout_lock);
-		ret = io_match_task(req, cancel->task, cancel->all);
-		spin_unlock_irq(&ctx->timeout_lock);
-	} else {
-		ret = io_match_task(req, cancel->task, cancel->all);
-	}
-	return ret;
+	return io_match_task_safe(req, cancel->task, cancel->all);
 }
 
 static __cold bool io_cancel_defer_files(struct io_ring_ctx *ctx,
@@ -9588,14 +9614,12 @@ static __cold bool io_cancel_defer_files(struct io_ring_ctx *ctx,
 	LIST_HEAD(list);
 
 	spin_lock(&ctx->completion_lock);
-	spin_lock_irq(&ctx->timeout_lock);
 	list_for_each_entry_reverse(de, &ctx->defer_list, list) {
-		if (io_match_task(de->req, task, cancel_all)) {
+		if (io_match_task_safe(de->req, task, cancel_all)) {
 			list_cut_position(&list, &ctx->defer_list, &de->list);
 			break;
 		}
 	}
-	spin_unlock_irq(&ctx->timeout_lock);
 	spin_unlock(&ctx->completion_lock);
 	if (list_empty(&list))
 		return false;
-- 
2.34.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] io_uring: fix link traversal locking
  2021-11-26 14:38 ` [PATCH 2/2] io_uring: fix link traversal locking Pavel Begunkov
@ 2021-11-26 15:29   ` Pavel Begunkov
  0 siblings, 0 replies; 6+ messages in thread
From: Pavel Begunkov @ 2021-11-26 15:29 UTC (permalink / raw)
  To: io-uring; +Cc: Jens Axboe

On 11/26/21 14:38, Pavel Begunkov wrote:
> 674ee8e1b4a41 ("io_uring: correct link-list traversal locking") fixed a

The problematic commit is marked stable, so

Cc: stable@kernel.org # 5.15+

> data race but introduced a possible deadlock and inconsistentcy in irq
> states. E.g.
> 
> io_poll_remove_all()
>      spin_lock_irq(timeout_lock)
>      io_poll_remove_one()
>          spin_lock/unlock_irq(poll_lock);
>      spin_unlock_irq(timeout_lock)
> 
> Another type of problem is freeing a request while holding
> ->timeout_lock, which may leads to a deadlock in
> io_commit_cqring() -> io_flush_timeouts() and other places.
> 
> Having 3 nested locks is also too ugly. Add io_match_task_safe(), which
> would briefly take and release timeout_lock for race prevention inside,
> so the actuall request cancellation / free / etc. code doesn't have it
> taken.
> 
> Reported-by: syzbot+ff49a3059d49b0ca0eec@syzkaller.appspotmail.com
> Reported-by: syzbot+847f02ec20a6609a328b@syzkaller.appspotmail.com
> Reported-by: syzbot+3368aadcd30425ceb53b@syzkaller.appspotmail.com
> Reported-by: syzbot+51ce8887cdef77c9ac83@syzkaller.appspotmail.com
> Reported-by: syzbot+3cb756a49d2f394a9ee3@syzkaller.appspotmail.com
> Fixes: 674ee8e1b4a41 ("io_uring: correct link-list traversal locking")
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
>   fs/io_uring.c | 60 +++++++++++++++++++++++++++++++++++----------------
>   1 file changed, 42 insertions(+), 18 deletions(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 7dd112d44adf..75841b919dce 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -1278,6 +1278,7 @@ static void io_refs_resurrect(struct percpu_ref *ref, struct completion *compl)
>   
>   static bool io_match_task(struct io_kiocb *head, struct task_struct *task,
>   			  bool cancel_all)
> +	__must_hold(&req->ctx->timeout_lock)
>   {
>   	struct io_kiocb *req;
>   
> @@ -1293,6 +1294,44 @@ static bool io_match_task(struct io_kiocb *head, struct task_struct *task,
>   	return false;
>   }
>   
> +static bool io_match_linked(struct io_kiocb *head)
> +{
> +	struct io_kiocb *req;
> +
> +	io_for_each_link(req, head) {
> +		if (req->flags & REQ_F_INFLIGHT)
> +			return true;
> +	}
> +	return false;
> +}
> +
> +/*
> + * As io_match_task() but protected against racing with linked timeouts.
> + * User must not hold timeout_lock.
> + */
> +static bool io_match_task_safe(struct io_kiocb *head, struct task_struct *task,
> +			       bool cancel_all)
> +{
> +	bool matched;
> +
> +	if (task && head->task != task)
> +		return false;
> +	if (cancel_all)
> +		return true;
> +
> +	if (head->flags & REQ_F_LINK_TIMEOUT) {
> +		struct io_ring_ctx *ctx = head->ctx;
> +
> +		/* protect against races with linked timeouts */
> +		spin_lock_irq(&ctx->timeout_lock);
> +		matched = io_match_linked(head);
> +		spin_unlock_irq(&ctx->timeout_lock);
> +	} else {
> +		matched = io_match_linked(head);
> +	}
> +	return matched;
> +}
> +
>   static inline bool req_has_async_data(struct io_kiocb *req)
>   {
>   	return req->flags & REQ_F_ASYNC_DATA;
> @@ -5699,17 +5738,15 @@ static __cold bool io_poll_remove_all(struct io_ring_ctx *ctx,
>   	int posted = 0, i;
>   
>   	spin_lock(&ctx->completion_lock);
> -	spin_lock_irq(&ctx->timeout_lock);
>   	for (i = 0; i < (1U << ctx->cancel_hash_bits); i++) {
>   		struct hlist_head *list;
>   
>   		list = &ctx->cancel_hash[i];
>   		hlist_for_each_entry_safe(req, tmp, list, hash_node) {
> -			if (io_match_task(req, tsk, cancel_all))
> +			if (io_match_task_safe(req, tsk, cancel_all))
>   				posted += io_poll_remove_one(req);
>   		}
>   	}
> -	spin_unlock_irq(&ctx->timeout_lock);
>   	spin_unlock(&ctx->completion_lock);
>   
>   	if (posted)
> @@ -9565,19 +9602,8 @@ static bool io_cancel_task_cb(struct io_wq_work *work, void *data)
>   {
>   	struct io_kiocb *req = container_of(work, struct io_kiocb, work);
>   	struct io_task_cancel *cancel = data;
> -	bool ret;
>   
> -	if (!cancel->all && (req->flags & REQ_F_LINK_TIMEOUT)) {
> -		struct io_ring_ctx *ctx = req->ctx;
> -
> -		/* protect against races with linked timeouts */
> -		spin_lock_irq(&ctx->timeout_lock);
> -		ret = io_match_task(req, cancel->task, cancel->all);
> -		spin_unlock_irq(&ctx->timeout_lock);
> -	} else {
> -		ret = io_match_task(req, cancel->task, cancel->all);
> -	}
> -	return ret;
> +	return io_match_task_safe(req, cancel->task, cancel->all);
>   }
>   
>   static __cold bool io_cancel_defer_files(struct io_ring_ctx *ctx,
> @@ -9588,14 +9614,12 @@ static __cold bool io_cancel_defer_files(struct io_ring_ctx *ctx,
>   	LIST_HEAD(list);
>   
>   	spin_lock(&ctx->completion_lock);
> -	spin_lock_irq(&ctx->timeout_lock);
>   	list_for_each_entry_reverse(de, &ctx->defer_list, list) {
> -		if (io_match_task(de->req, task, cancel_all)) {
> +		if (io_match_task_safe(de->req, task, cancel_all)) {
>   			list_cut_position(&list, &ctx->defer_list, &de->list);
>   			break;
>   		}
>   	}
> -	spin_unlock_irq(&ctx->timeout_lock);
>   	spin_unlock(&ctx->completion_lock);
>   	if (list_empty(&list))
>   		return false;
> 

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] io_uring: fail cancellation for EXITING tasks
  2021-11-26 14:38 ` [PATCH 1/2] io_uring: fail cancellation for EXITING tasks Pavel Begunkov
@ 2021-11-26 15:32   ` Pavel Begunkov
  0 siblings, 0 replies; 6+ messages in thread
From: Pavel Begunkov @ 2021-11-26 15:32 UTC (permalink / raw)
  To: io-uring; +Cc: Jens Axboe

On 11/26/21 14:38, Pavel Begunkov wrote:
> We need original task's context to do cancellations, so if it's dying
> and the callback is executed in a fallback mode, fail the cancellation
> attempt.

Fixes: 89b263f6d56e6 ("io_uring: run linked timeouts from task_work")
Cc: stable@kernel.org # 5.15+

> 
> Reported-by: syzbot+ab0cfe96c2b3cd1c1153@syzkaller.appspotmail.com
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
>   fs/io_uring.c | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index a4c508a1e0cf..7dd112d44adf 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -6882,10 +6882,11 @@ static inline struct file *io_file_get(struct io_ring_ctx *ctx,
>   static void io_req_task_link_timeout(struct io_kiocb *req, bool *locked)
>   {
>   	struct io_kiocb *prev = req->timeout.prev;
> -	int ret;
> +	int ret = -ENOENT;
>   
>   	if (prev) {
> -		ret = io_try_cancel_userdata(req, prev->user_data);
> +		if (!(req->task->flags & PF_EXITING))
> +			ret = io_try_cancel_userdata(req, prev->user_data);
>   		io_req_complete_post(req, ret ?: -ETIME, 0);
>   		io_put_req(prev);
>   	} else {
> 

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 0/2] 5.16 fixes for syz reports
  2021-11-26 14:38 [PATCH 0/2] 5.16 fixes for syz reports Pavel Begunkov
  2021-11-26 14:38 ` [PATCH 1/2] io_uring: fail cancellation for EXITING tasks Pavel Begunkov
  2021-11-26 14:38 ` [PATCH 2/2] io_uring: fix link traversal locking Pavel Begunkov
@ 2021-11-26 15:36 ` Jens Axboe
  2 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2021-11-26 15:36 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring

On Fri, 26 Nov 2021 14:38:13 +0000, Pavel Begunkov wrote:
> The second patch fixes the last five reports, i.e. deadlocks and
> inconsistent irq state, all of the caused by the same patch and
> are dups of each other.
> 
> 1/2 is for another report, where tw of cancellation requests don't
> handle PF_EXITING.
> 
> [...]

Applied, thanks!

[1/2] io_uring: fail cancellation for EXITING tasks
      commit: 617a89484debcd4e7999796d693cf0b77d2519de
[2/2] io_uring: fix link traversal locking
      commit: 6af3f48bf6156a7f02e91aca64e2927c4bebda03

Best regards,
-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-11-26 15:44 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-26 14:38 [PATCH 0/2] 5.16 fixes for syz reports Pavel Begunkov
2021-11-26 14:38 ` [PATCH 1/2] io_uring: fail cancellation for EXITING tasks Pavel Begunkov
2021-11-26 15:32   ` Pavel Begunkov
2021-11-26 14:38 ` [PATCH 2/2] io_uring: fix link traversal locking Pavel Begunkov
2021-11-26 15:29   ` Pavel Begunkov
2021-11-26 15:36 ` [PATCH 0/2] 5.16 fixes for syz reports Jens Axboe

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.