io-uring.vger.kernel.org archive mirror
* [PATCH] io_uring: prevent eventfd recursion on poll
@ 2020-01-31  5:24 Jens Axboe
  2020-02-04  9:38 ` Daurnimator
  0 siblings, 1 reply; 3+ messages in thread
From: Jens Axboe @ 2020-01-31  5:24 UTC (permalink / raw)
  To: io-uring

If we register an eventfd with io_uring for completion notification,
and then subsequently poll that very descriptor, we can trigger a
deadlock scenario. Once a request completes and signals the eventfd
context, that will in turn trigger the poll command to complete. When
that poll request completes, it'll try to trigger another event on the
eventfd, but this time off the path that led us to complete the poll
in the first place. The result is a deadlock in eventfd, as it tries
to take ctx->wqh.lock in a nested fashion.

Check if the file in question for the poll request is our eventfd
context, and if it is, don't trigger a nested event for the poll
completion.

Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>

---

We're holding the file here, so checking ->private_data is fine.
It's a bit iffy in that it implies knowledge that this is where
eventfd stores the ctx. We could potentially add a helper for that
in eventfd, if needed; a rough sketch of what such a helper could
look like follows the diff below.
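
For illustration, a minimal userspace sketch of the pattern this guards
against might look like the below (hypothetical reproducer, assumes
liburing; not part of the patch): register an eventfd for CQ
notifications, then poll that same eventfd through the ring.

#include <poll.h>
#include <sys/eventfd.h>
#include <liburing.h>

int main(void)
{
	struct io_uring ring;
	struct io_uring_sqe *sqe;
	int efd;

	if (io_uring_queue_init(8, &ring, 0) < 0)
		return 1;

	efd = eventfd(0, 0);
	if (efd < 0)
		return 1;

	/* completions on this ring now signal efd */
	io_uring_register_eventfd(&ring, efd);

	/* ...and we also poll that same efd through the ring */
	sqe = io_uring_get_sqe(&ring);
	io_uring_prep_poll_add(sqe, efd, POLLIN);
	io_uring_submit(&ring);

	/*
	 * Any completion posted from here on signals efd, which fires the
	 * poll request; without this patch, posting the poll completion
	 * would signal efd again from inside the eventfd wakeup path.
	 */
	return 0;
}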

diff --git a/fs/io_uring.c b/fs/io_uring.c
index ac5340fdcdfe..5788c2139c72 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1025,16 +1025,21 @@ static inline bool io_should_trigger_evfd(struct io_ring_ctx *ctx)
 	return io_wq_current_is_worker() || in_interrupt();
 }
 
-static void io_cqring_ev_posted(struct io_ring_ctx *ctx)
+static void __io_cqring_ev_posted(struct io_ring_ctx *ctx, bool no_evfd)
 {
 	if (waitqueue_active(&ctx->wait))
 		wake_up(&ctx->wait);
 	if (waitqueue_active(&ctx->sqo_wait))
 		wake_up(&ctx->sqo_wait);
-	if (ctx->cq_ev_fd && io_should_trigger_evfd(ctx))
+	if (!no_evfd && ctx->cq_ev_fd && io_should_trigger_evfd(ctx))
 		eventfd_signal(ctx->cq_ev_fd, 1);
 }
 
+static void io_cqring_ev_posted(struct io_ring_ctx *ctx)
+{
+	__io_cqring_ev_posted(ctx, false);
+}
+
 /* Returns true if there are no backlogged entries after the flush */
 static bool io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
 {
@@ -3586,13 +3591,27 @@ static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
 
 		if (llist_empty(&ctx->poll_llist) &&
 		    spin_trylock_irqsave(&ctx->completion_lock, flags)) {
+			bool no_evfd = false;
+
 			hash_del(&req->hash_node);
 			io_poll_complete(req, mask, 0);
 			req->flags |= REQ_F_COMP_LOCKED;
+
+			/*
+			 * If the file triggering the poll is our eventfd
+			 * context, then don't trigger another event for this
+			 * request. If we do, we can recurse, as this poll
+			 * request may be triggered by a completion that
+			 * signaled our eventfd.
+			 */
+			if (ctx->cq_ev_fd &&
+			    req->file->private_data == ctx->cq_ev_fd)
+				no_evfd = true;
+
 			io_put_req(req);
 			spin_unlock_irqrestore(&ctx->completion_lock, flags);
 
-			io_cqring_ev_posted(ctx);
+			__io_cqring_ev_posted(ctx, no_evfd);
 			req = NULL;
 		} else {
 			req->result = mask;

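If the ->private_data peek mentioned above ever needs to go away, a
helper in eventfd could hide it. Rough sketch only, with a made-up
name, and not part of this patch:

/* would live in fs/eventfd.c, next to eventfd_fops */
bool eventfd_file_is_ctx(struct file *file, struct eventfd_ctx *ctx)
{
	return file->f_op == &eventfd_fops && file->private_data == ctx;
}

The io_uring side could then do:

	if (ctx->cq_ev_fd && eventfd_file_is_ctx(req->file, ctx->cq_ev_fd))
		no_evfd = true;
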
-- 
Jens Axboe



* Re: [PATCH] io_uring: prevent eventfd recursion on poll
  2020-01-31  5:24 [PATCH] io_uring: prevent eventfd recursion on poll Jens Axboe
@ 2020-02-04  9:38 ` Daurnimator
  2020-02-04 18:57   ` Jens Axboe
  0 siblings, 1 reply; 3+ messages in thread
From: Daurnimator @ 2020-02-04  9:38 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring

On Fri, 31 Jan 2020 at 16:25, Jens Axboe <axboe@kernel.dk> wrote:
>
> If we register an eventfd with io_uring for completion notification,
> and then subsequently poll that very descriptor, we can trigger a
> deadlock scenario. Once a request completes and signals the eventfd
> context, that will in turn trigger the poll command to complete. When
> that poll request completes, it'll try to trigger another event on the
> eventfd, but this time off the path that led us to complete the poll
> in the first place. The result is a deadlock in eventfd, as it tries
> to take ctx->wqh.lock in a nested fashion.
>
> Check if the file in question for the poll request is our eventfd
> context, and if it is, don't trigger a nested event for the poll
> completion.

Could this deadlock/loop also happen via an epoll fd?


* Re: [PATCH] io_uring: prevent eventfd recursion on poll
  2020-02-04  9:38 ` Daurnimator
@ 2020-02-04 18:57   ` Jens Axboe
  0 siblings, 0 replies; 3+ messages in thread
From: Jens Axboe @ 2020-02-04 18:57 UTC (permalink / raw)
  To: Daurnimator; +Cc: io-uring

On 2/4/20 2:38 AM, Daurnimator wrote:
> On Fri, 31 Jan 2020 at 16:25, Jens Axboe <axboe@kernel.dk> wrote:
>>
>> If we register an eventfd with io_uring for completion notification,
>> and then subsequently poll that very descriptor, we can trigger a
>> deadlock scenario. Once a request completes and signals the eventfd
>> context, that will in turn trigger the poll command to complete. When
>> that poll request completes, it'll try to trigger another event on the
>> eventfd, but this time off the path that led us to complete the poll
>> in the first place. The result is a deadlock in eventfd, as it tries
>> to take ctx->wqh.lock in a nested fashion.
>>
>> Check if the file in question for the poll request is our eventfd
>> context, and if it is, don't trigger a nested event for the poll
>> completion.
> 
> Could this deadlock/loop also happen via an epoll fd?

No, only through io_uring and aio, given the way they handle the
notification through a (potentially nested) waitqueue wakeup
handler.
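
To spell out the io_uring case, the pre-patch recursion looks roughly
like this (simplified call chain, illustrative only):

eventfd_signal(ctx->cq_ev_fd)             /* some request completes      */
  spin_lock_irqsave(&evfd->wqh.lock, ...) /* eventfd grabs its wqh lock  */
  wake_up_locked_poll(&evfd->wqh, ...)    /* runs queued wakeup handlers */
    io_poll_wake()                        /* our poll on that eventfd    */
      io_cqring_ev_posted()
        eventfd_signal(ctx->cq_ev_fd)     /* wqh.lock again -> deadlock  */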

-- 
Jens Axboe



end of thread, other threads:[~2020-02-04 18:57 UTC | newest]

Thread overview: 3+ messages
-- links below jump to the message on this page --
2020-01-31  5:24 [PATCH] io_uring: prevent eventfd recursion on poll Jens Axboe
2020-02-04  9:38 ` Daurnimator
2020-02-04 18:57   ` Jens Axboe
