From: Hillf Danton
To: Suren Baghdasaryan, Munehisa Kamata, Tejun Heo
Cc: ebiggers@kernel.org, hannes@cmpxchg.org, hdanton@sina.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, mengcc@amazon.com
Subject: Re: another use-after-free in ep_remove_wait_queue()
Date: Fri, 20 Jan 2023 09:30:55 +0800
Message-Id: <20230120013055.3628-1-hdanton@sina.com>
References: <20230113022555.2467724-1-kamatam@amazon.com>

On Thu, 19 Jan 2023 13:01:42 -0800 Suren Baghdasaryan wrote:
>
> Hi Folks,
> I spent some more time digging into the details and this is what's
> happening.
> When we call rmdir to delete the cgroup with the pressure
> file being epoll'ed, roughly the following call chain happens in the
> context of the shell process:
>
> do_rmdir
>   cgroup_rmdir
>     kernfs_drain_open_files
>       cgroup_file_release
>         cgroup_pressure_release
>           psi_trigger_destroy
>
> Later on in the context of our reproducer, the last fput() is called
> causing wait queue removal:
>
> fput
>   ep_eventpoll_release
>     ep_free
>       ep_remove_wait_queue
>         remove_wait_queue
>
> By this time psi_trigger_destroy() already destroyed the trigger's
> waitqueue head and we hit UAF.
> I think the conceptual problem here (or maybe that's by design?) is
> that cgroup_file_release() is not really tied to the file's real
> lifetime (when the last fput() is issued). Otherwise fput() would call
> eventpoll_release() before f_op->release() and the order would be fine
> (we would remove the wait queue first in eventpoll_release() and then
> f_op->release() would cause trigger's destruction).

	eventpoll_release
	  eventpoll_release_file
	    ep_remove
	      ep_unregister_pollwait
	        ep_remove_wait_queue

Different roads lead to the same Rome: this path ends in
ep_remove_wait_queue() as well.

> Considering these findings, I think we can use the wake_up_pollfree()
> without contradicting the comment at
> https://elixir.bootlin.com/linux/latest/source/include/linux/wait.h#L253
> because indeed, cgroup_file_release() and therefore
> psi_trigger_destroy() are not tied to the file's lifetime.
>
> I'm CC'ing Tejun to check if this makes sense to him and
> cgroup_file_release() is working as expected in this case.
>
> Munehisa, if Tejun confirms this is all valid, could you please post
> a patch replacing wake_up_interruptible() with wake_up_pollfree()? We
> don't need to worry about wake_up_all() because we have a limitation
> of one trigger per file descriptor:
> https://elixir.bootlin.com/linux/latest/source/kernel/sched/psi.c#L1419,
> so there can be only one waiter.
> Thanks,
> Suren.
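
To make the ordering problem in the quoted analysis concrete, here is a
toy userspace model. It is a sketch only, not kernel code: every toy_*
name is made up for illustration, and the "alive" flag stands in for the
head's memory still being valid, so the model can report the bad access
instead of committing it. It assumes a single waiter per head, matching
the one-trigger-per-file-descriptor limit mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_waiter;

/* Hypothetical stand-in for the trigger's wait queue head. */
struct toy_wq_head {
	bool alive;                /* false once the owner tore it down */
	struct toy_waiter *waiter; /* at most one waiter in this model */
};

/* Hypothetical stand-in for the epoll side's wait entry. */
struct toy_waiter {
	struct toy_wq_head *whead; /* head we are queued on, NULL if detached */
};

static void toy_add_waiter(struct toy_wq_head *h, struct toy_waiter *w)
{
	h->waiter = w;
	w->whead = h;
}

/* Plain teardown: the head dies while the waiter still points at it. */
static void toy_destroy_plain(struct toy_wq_head *h)
{
	h->alive = false;
}

/* POLLFREE-style teardown: make the waiter detach first, then die. */
static void toy_destroy_pollfree(struct toy_wq_head *h)
{
	if (h->waiter) {
		h->waiter->whead = NULL;
		h->waiter = NULL;
	}
	h->alive = false;
}

/*
 * Waiter-side removal, as on the last fput(). Returns true when the
 * removal would touch a head that is already gone (the modelled UAF).
 */
static bool toy_remove_waiter(struct toy_waiter *w)
{
	struct toy_wq_head *h = w->whead;

	if (!h)
		return false; /* already detached, nothing to touch */
	if (!h->alive)
		return true;  /* stale pointer: this is the UAF case */
	h->waiter = NULL;
	w->whead = NULL;
	return false;
}
```

Calling toy_destroy_plain() and then toy_remove_waiter() reports the
stale access, which mirrors psi_trigger_destroy() running before the
last fput(). With toy_destroy_pollfree() the waiter is detached before
the head goes away, so the later removal is a no-op; that is the idea
behind using wake_up_pollfree() when the head's lifetime is not tied to
the file.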