Date: Sat, 20 Mar 2021 03:25:56 +0000
From: Matthew Wilcox
To: Hugh Dickins
Cc: Johannes Weiner, Andrew Morton, Michal Hocko, Zhou Guanghui, Zi Yan,
    Shakeel Butt, Roman Gushchin, linux-mm@kvack.org,
    cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH] mm: page_alloc: fix memcg accounting leak in speculative cache lookup
Message-ID: <20210320032556.GD3420@casper.infradead.org>
References: <20210319071547.60973-1-hannes@cmpxchg.org>

On Fri, Mar 19, 2021 at 06:52:58PM -0700, Hugh Dickins wrote:
> > +	/*
> > +	 * Drop the base reference from __alloc_pages and free. In
> > +	 * case there is an outstanding speculative reference, from
> > +	 * e.g. the page cache, it will put and free the page later.
> > +	 */
> > +	if (likely(put_page_testzero(page))) {
> >  		free_the_page(page, order);
> > -	else if (!PageHead(page))
> > +		return;
> > +	}
> > +
> > +	/*
> > +	 * The speculative reference will put and free the page.
> > +	 *
> > +	 * However, if the speculation was into a higher-order page
> > +	 * chunk that isn't marked compound, the other side will know
> > +	 * nothing about our buddy pages and only free the order-0
> > +	 * page at the start of our chunk! We must split off and free
> > +	 * the buddy pages here.
> > +	 *
> > +	 * The buddy pages aren't individually refcounted, so they
> > +	 * can't have any pending speculative references themselves.
> > +	 */
> > +	if (!PageHead(page) && order > 0) {
>
> The put_page_testzero() has released our reference to the first
> subpage of page: it's now under the control of the racing speculative
> lookup.  So it seems to me unsafe to be checking PageHead(page) here:
> if it was actually a compound page, PageHead might already be cleared
> by now, and we doubly free its tail pages below?  I think we need to
> use a "bool compound = PageHead(page)" on entry to __free_pages().
>
> Or alternatively, it's wrong to call __free_pages() on a compound
> page anyway, so we should not check PageHead at all, except in a
> WARN_ON_ONCE(PageCompound(page)) at the start?

Alas ...

$ git grep '__free_pages\>.*compound'
drivers/dma-buf/heaps/system_heap.c:	__free_pages(page, compound_order(page));
drivers/dma-buf/heaps/system_heap.c:	__free_pages(p, compound_order(p));
drivers/dma-buf/heaps/system_heap.c:	__free_pages(page, compound_order(page));
mm/huge_memory.c:	__free_pages(zero_page, compound_order(zero_page));
mm/huge_memory.c:	__free_pages(zero_page, compound_order(zero_page));
mm/slub.c:	__free_pages(page, compound_order(page));

Maybe we should disallow it!

There are a few other places to check:

$ grep -l __GFP_COMP $(git grep -lw __free_pages) | wc -l
24

(assuming the pages are allocated and freed in the same file, which is a
reasonable approximation, but not guaranteed to catch everything.  Many of
these 24 will be false positives, of course.)
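
For illustration only, here is a userspace sketch of the first suggestion
quoted above: read the compound state once on entry, while we still hold our
reference, and never look at the page again after the put. This is not kernel
code. struct toy_page, toy_put_testzero(), toy_free_pages(), the racer hook
and the counters are all hypothetical stand-ins, and the split-off loop is
assumed to free one chunk per order below the requested one, since the quoted
hunk stops before that loop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page; NOT the kernel's layout. */
struct toy_page {
	atomic_int refcount;
	bool head;		/* models PageHead() */
};

/* Counters standing in for the real free paths. */
struct toy_stats {
	unsigned int whole_frees;	/* models free_the_page(page, order) */
	unsigned int tail_pages;	/* pages released by the split-off loop */
};

/* Hypothetical hook: what a racing speculative holder does after our put. */
typedef void (*toy_racer_fn)(struct toy_page *page);

/* Models put_page_testzero(): drop one reference, true if it was the last. */
static bool toy_put_testzero(struct toy_page *page)
{
	return atomic_fetch_sub(&page->refcount, 1) == 1;
}

/* Example racer: tears down the compound state, as a final put would. */
static void toy_racer_clear_head(struct toy_page *page)
{
	page->head = false;
	atomic_fetch_sub(&page->refcount, 1);
}

static void toy_free_pages(struct toy_page *page, unsigned int order,
			   toy_racer_fn racer, struct toy_stats *st)
{
	/* Snapshot while we still hold our own reference. */
	bool compound = page->head;

	assert(order < 31);	/* keep the shift below well defined */

	if (toy_put_testzero(page)) {
		st->whole_frees++;
		return;
	}

	/* The page now belongs to the other holder; simulate it running. */
	if (racer != NULL)
		racer(page);

	/* Decide from the snapshot, never from page->head, past this point. */
	if (!compound) {
		/* Chunks of order-1, order-2, ..., 0: (1 << order) - 1 pages. */
		while (order-- > 0)
			st->tail_pages += 1u << order;
	}
}
```

With a compound toy page, a refcount of 2 and toy_racer_clear_head installed,
tail_pages stays at 0. Re-reading page->head after the put would instead have
counted 7 pages for order 3, which is the double free described above.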