From: Eric Biggers <firstname.lastname@example.org>
To: Alexander Viro <email@example.com>,
Benjamin LaHaise <firstname.lastname@example.org>
Cc: email@example.com, firstname.lastname@example.org,
email@example.com, Ramji Jiyani <firstname.lastname@example.org>,
Christoph Hellwig <email@example.com>,
Linus Torvalds <firstname.lastname@example.org>,
Oleg Nesterov <email@example.com>, Jens Axboe <firstname.lastname@example.org>,
Martijn Coenen <email@example.com>,
Subject: [PATCH v3 1/5] wait: add wake_up_pollfree()
Date: Wed, 8 Dec 2021 17:04:51 -0800
Message-ID: <firstname.lastname@example.org>

From: Eric Biggers <email@example.com>

Several ->poll() implementations are special in that they use a
waitqueue whose lifetime is the current task, rather than the struct
file as is normally the case. This is okay for blocking polls, since a
blocking poll occurs within one task; however, non-blocking polls
require another solution. This solution is for the queue to be cleared
before it is freed, using 'wake_up_poll(wq, EPOLLHUP | POLLFREE);'.

However, that has a bug: wake_up_poll() calls __wake_up() with
nr_exclusive=1. Therefore, if there are multiple "exclusive" waiters,
and the wakeup function for the first one returns a positive value, only
that one will be called. That's *not* what's needed for POLLFREE;
POLLFREE is special in that it really needs to wake up everyone.

Considering the three non-blocking poll systems:

- io_uring poll doesn't handle POLLFREE at all, so it is broken anyway.

- aio poll is unaffected, since it doesn't support exclusive waits.
  However, that's fragile, as someone could add this feature later.

- epoll doesn't appear to be broken by this, since its wakeup function
  returns 0 when it sees POLLFREE.  But this is fragile.

Although there is a workaround (see epoll), it's better to define a
function which always sends POLLFREE to all waiters.  Add such a
function.  Also make it verify that the queue really becomes empty after
all waiters have been woken up.

Reported-by: Linus Torvalds <firstname.lastname@example.org>
Signed-off-by: Eric Biggers <email@example.com>
---
 include/linux/wait.h | 26 ++++++++++++++++++++++++++
 kernel/sched/wait.c  |  7 +++++++
 2 files changed, 33 insertions(+)

diff --git a/include/linux/wait.h b/include/linux/wait.h
index 2d0df57c99024..851e07da2583f 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -217,6 +217,7 @@ void __wake_up_sync_key(struct wait_queue_head *wq_head, unsigned int mode, void
 void __wake_up_locked_sync_key(struct wait_queue_head *wq_head, unsigned int mode, void *key);
 void __wake_up_locked(struct wait_queue_head *wq_head, unsigned int mode, int nr);
 void __wake_up_sync(struct wait_queue_head *wq_head, unsigned int mode);
+void __wake_up_pollfree(struct wait_queue_head *wq_head);
 
 #define wake_up(x)			__wake_up(x, TASK_NORMAL, 1, NULL)
 #define wake_up_nr(x, nr)		__wake_up(x, TASK_NORMAL, nr, NULL)
@@ -245,6 +246,31 @@ void __wake_up_sync(struct wait_queue_head *wq_head, unsigned int mode);
 #define wake_up_interruptible_sync_poll_locked(x, m)				\
 	__wake_up_locked_sync_key((x), TASK_INTERRUPTIBLE, poll_to_key(m))
 
+/**
+ * wake_up_pollfree - signal that a polled waitqueue is going away
+ * @wq_head: the wait queue head
+ *
+ * In the very rare cases where a ->poll() implementation uses a waitqueue whose
+ * lifetime is tied to a task rather than to the 'struct file' being polled,
+ * this function must be called before the waitqueue is freed so that
+ * non-blocking polls (e.g. epoll) are notified that the queue is going away.
+ *
+ * The caller must also RCU-delay the freeing of the wait_queue_head, e.g. via
+ * an explicit synchronize_rcu() or call_rcu(), or via SLAB_TYPESAFE_BY_RCU.
+ */
+static inline void wake_up_pollfree(struct wait_queue_head *wq_head)
+{
+	/*
+	 * For performance reasons, we don't always take the queue lock here.
+	 * Therefore, we might race with someone removing the last entry from
+	 * the queue, and proceed while they still hold the queue lock.
+	 * However, rcu_read_lock() is required to be held in such cases, so we
+	 * can safely proceed with an RCU-delayed free.
+	 */
+	if (waitqueue_active(wq_head))
+		__wake_up_pollfree(wq_head);
+}
+
 #define ___wait_cond_timeout(condition)						\
 ({										\
 	bool __cond = (condition);						\
diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index 76577d1642a5d..eca38107b32f1 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -238,6 +238,13 @@ void __wake_up_sync(struct wait_queue_head *wq_head, unsigned int mode)
 }
 EXPORT_SYMBOL_GPL(__wake_up_sync);	/* For internal use only */
 
+void __wake_up_pollfree(struct wait_queue_head *wq_head)
+{
+	__wake_up(wq_head, TASK_NORMAL, 0, poll_to_key(EPOLLHUP | POLLFREE));
+	/* POLLFREE must have cleared the queue. */
+	WARN_ON_ONCE(waitqueue_active(wq_head));
+}
+
 /*
  * Note: we use "set_current_state()" _after_ the wait-queue add,
  * because we need a memory barrier there on SMP, so that any

Thread overview: 15+ messages
2021-12-09 1:04 [PATCH v3 0/5] aio: fix use-after-free and missing wakeups Eric Biggers
2021-12-09 1:04 ` Eric Biggers [this message]
2021-12-09 1:04 ` [PATCH v3 2/5] binder: use wake_up_pollfree() Eric Biggers
2021-12-09 1:04 ` [PATCH v3 3/5] signalfd: " Eric Biggers
2021-12-09 1:04 ` [PATCH v3 4/5] aio: keep poll requests on waitqueue until completed Eric Biggers
2021-12-09 1:04 ` [PATCH v3 5/5] aio: fix use-after-free due to missing POLLFREE handling Eric Biggers
2021-12-09 18:00 ` [PATCH v3 0/5] aio: fix use-after-free and missing wakeups Linus Torvalds
2021-12-09 18:37 ` Eric Biggers
2021-12-13 7:23 ` Christoph Hellwig
2021-12-13 17:24 ` Eric Biggers
2021-12-09 21:46 ` Jens Axboe
2021-12-10 5:10 ` Eric Biggers
2021-12-10 8:07 ` Eric Biggers
2022-01-05 15:26 ` Eric Biggers
2022-01-05 16:11 ` Jens Axboe