From: Munehisa Kamata <kamatam@amazon.com>
To: <surenb@google.com>
Cc: <ebiggers@kernel.org>, <hannes@cmpxchg.org>, <hdanton@sina.com>,
<kamatam@amazon.com>, <linux-kernel@vger.kernel.org>,
<linux-mm@kvack.org>, <mengcc@amazon.com>,
<stable@vger.kernel.org>
Subject: [PATCH v3] sched/psi: fix use-after-free in ep_remove_wait_queue()
Date: Tue, 14 Feb 2023 10:13:35 -0800 [thread overview]
Message-ID: <20230214181335.3946674-1-kamatam@amazon.com> (raw)
In-Reply-To: <CAJuCfpGiktjjPZYPp8LNtbmvYhkxh_icEWXOVgsq9qeq+w6s+g@mail.gmail.com>
If a non-root cgroup gets removed when there is a thread that registered
trigger and is polling on a pressure file within the cgroup, the polling
waitqueue gets freed in the following path.
do_rmdir
cgroup_rmdir
kernfs_drain_open_files
cgroup_file_release
cgroup_pressure_release
psi_trigger_destroy
However, the polling thread still has a reference to the pressure file and
will access the freed waitqueue when the file is closed or upon exit.
fput
ep_eventpoll_release
ep_free
ep_remove_wait_queue
remove_wait_queue
This results in use-after-free as pasted below.
The fundamental problem here is that cgroup_file_release() (and
consequently waitqueue's lifetime) is not tied to the file's real lifetime.
Using wake_up_pollfree() here might be less than ideal, but it also is not
fully contradicting the comment at commit 42288cb44c4b ("wait: add
wake_up_pollfree()") since the waitqueue's lifetime is not tied to file's
one and can be considered as another special case. While this would be
fixable by somehow making cgroup_file_release() be tied to the fput(), it
would require sizable refactoring at cgroups or higher layer which might be
more justifiable if we identify more cases like this.
BUG: KASAN: use-after-free in _raw_spin_lock_irqsave+0x60/0xc0
Write of size 4 at addr ffff88810e625328 by task a.out/4404
CPU: 19 PID: 4404 Comm: a.out Not tainted 6.2.0-rc6 #38
Hardware name: Amazon EC2 c5a.8xlarge/, BIOS 1.0 10/16/2017
Call Trace:
<TASK>
dump_stack_lvl+0x73/0xa0
print_report+0x16c/0x4e0
? _printk+0x59/0x80
? __virt_addr_valid+0xb8/0x130
? _raw_spin_lock_irqsave+0x60/0xc0
kasan_report+0xc3/0xf0
? _raw_spin_lock_irqsave+0x60/0xc0
kasan_check_range+0x2d2/0x310
_raw_spin_lock_irqsave+0x60/0xc0
remove_wait_queue+0x1a/0xa0
ep_free+0x12c/0x170
ep_eventpoll_release+0x26/0x30
__fput+0x202/0x400
task_work_run+0x11d/0x170
do_exit+0x495/0x1130
? update_cfs_rq_load_avg+0x2c2/0x2e0
do_group_exit+0x100/0x100
get_signal+0xd67/0xde0
? finish_task_switch+0x15f/0x3a0
arch_do_signal_or_restart+0x2a/0x2b0
exit_to_user_mode_prepare+0x94/0x100
syscall_exit_to_user_mode+0x20/0x40
do_syscall_64+0x52/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f8e392bfb91
Code: Unable to access opcode bytes at 0x7f8e392bfb67.
RSP: 002b:00007fff261e08d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000022
RAX: fffffffffffffdfe RBX: 0000000000000000 RCX: 00007f8e392bfb91
RDX: 0000000000000001 RSI: 00007fff261e08e8 RDI: 0000000000000004
RBP: 00007fff261e0920 R08: 0000000000400780 R09: 00007f8e3960f240
R10: 00000000000003df R11: 0000000000000246 R12: 00000000004005a0
R13: 00007fff261e0a00 R14: 0000000000000000 R15: 0000000000000000
</TASK>
Allocated by task 4404:
kasan_set_track+0x3d/0x60
__kasan_kmalloc+0x85/0x90
psi_trigger_create+0x113/0x3e0
pressure_write+0x146/0x2e0
cgroup_file_write+0x11c/0x250
kernfs_fop_write_iter+0x186/0x220
vfs_write+0x3d8/0x5c0
ksys_write+0x90/0x110
do_syscall_64+0x43/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Freed by task 4407:
kasan_set_track+0x3d/0x60
kasan_save_free_info+0x27/0x40
____kasan_slab_free+0x11d/0x170
slab_free_freelist_hook+0x87/0x150
__kmem_cache_free+0xcb/0x180
psi_trigger_destroy+0x2e8/0x310
cgroup_file_release+0x4f/0xb0
kernfs_drain_open_files+0x165/0x1f0
kernfs_drain+0x162/0x1a0
__kernfs_remove+0x1fb/0x310
kernfs_remove_by_name_ns+0x95/0xe0
cgroup_addrm_files+0x67f/0x700
cgroup_destroy_locked+0x283/0x3c0
cgroup_rmdir+0x29/0x100
kernfs_iop_rmdir+0xd1/0x140
vfs_rmdir+0xfe/0x240
do_rmdir+0x13d/0x280
__x64_sys_rmdir+0x2c/0x30
do_syscall_64+0x43/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
v3: updated commit message and the comment in the code
v2: updated commit message
Link: https://lore.kernel.org/lkml/20230106224859.4123476-1-kamatam@amazon.com/
Fixes: 0e94682b73bf ("psi: introduce psi monitor")
Cc: stable@vger.kernel.org
Signed-off-by: Munehisa Kamata <kamatam@amazon.com>
Signed-off-by: Mengchi Cheng <mengcc@amazon.com>
---
kernel/sched/psi.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 8ac8b81bfee6..02e011cabe91 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -1343,10 +1343,11 @@ void psi_trigger_destroy(struct psi_trigger *t)
group = t->group;
/*
- * Wakeup waiters to stop polling. Can happen if cgroup is deleted
- * from under a polling process.
+ * Wakeup waiters to stop polling and clear the queue to prevent it from
+ * being accessed later. Can happen if cgroup is deleted from under a
+ * polling process.
*/
- wake_up_interruptible(&t->event_wait);
+ wake_up_pollfree(&t->event_wait);
mutex_lock(&group->trigger_lock);
--
2.38.1
next prev parent reply other threads:[~2023-02-14 18:14 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-01-06 22:48 another use-after-free in ep_remove_wait_queue() Munehisa Kamata
2023-01-06 23:34 ` Suren Baghdasaryan
2023-01-07 8:07 ` Hillf Danton
2023-01-08 22:25 ` Munehisa Kamata
2023-01-08 23:49 ` Hillf Danton
2023-01-10 1:33 ` Suren Baghdasaryan
2023-01-10 3:06 ` Suren Baghdasaryan
2023-01-12 22:01 ` Suren Baghdasaryan
2023-01-13 2:25 ` Munehisa Kamata
2023-01-13 17:52 ` Suren Baghdasaryan
2023-01-19 3:06 ` Suren Baghdasaryan
2023-01-19 21:01 ` Suren Baghdasaryan
2023-01-19 22:25 ` Johannes Weiner
2023-01-20 1:30 ` Hillf Danton
2023-01-20 1:37 ` Suren Baghdasaryan
2023-01-20 2:46 ` Munehisa Kamata
2023-01-20 2:52 ` Munehisa Kamata
2023-01-20 9:00 ` Hillf Danton
2023-01-20 16:28 ` Suren Baghdasaryan
2023-01-21 5:17 ` Hillf Danton
2023-01-22 3:01 ` Suren Baghdasaryan
2023-01-20 1:45 ` Munehisa Kamata
2023-02-02 3:00 ` [PATCH] sched/psi: fix " Munehisa Kamata
2023-02-02 4:56 ` Eric Biggers
2023-02-02 21:11 ` Suren Baghdasaryan
2023-02-09 17:09 ` Suren Baghdasaryan
2023-02-09 18:46 ` Eric Biggers
2023-02-09 19:13 ` Suren Baghdasaryan
2023-02-13 23:50 ` Suren Baghdasaryan
2023-02-14 7:04 ` [PATCH v2] " Munehisa Kamata
2023-02-14 17:10 ` Suren Baghdasaryan
2023-02-14 18:13 ` Munehisa Kamata [this message]
2023-02-14 18:28 ` [PATCH v3] " Suren Baghdasaryan
2023-02-14 18:29 ` Suren Baghdasaryan
2023-02-14 18:55 ` Eric Biggers
2023-02-14 19:13 ` Suren Baghdasaryan
2023-02-14 18:37 ` [PATCH v2] " Munehisa Kamata
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230214181335.3946674-1-kamatam@amazon.com \
--to=kamatam@amazon.com \
--cc=ebiggers@kernel.org \
--cc=hannes@cmpxchg.org \
--cc=hdanton@sina.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mengcc@amazon.com \
--cc=stable@vger.kernel.org \
--cc=surenb@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.