Date: Thu, 7 Oct 2021 00:17:34 +0100
From: Matthew Wilcox
To: Kent Overstreet
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, hannes@cmpxchg.org, rientjes@google.com
Subject: Re: Compaction & folios

On Wed, Oct 06, 2021 at 06:53:41PM -0400, Kent Overstreet wrote:
> It may turn out that allocating hugepages still doesn't work as reliably as we'd
> like - but folios are still a big help even when we can't allocate a 2MB page,
> because we'll be able to fall back to an order 6 or 7 or 8 allocation, which is
> something we can't do now. And, since multiple CPU vendors now support
> coalescing contiguous PTE entries in the TLB, this will still get us most of the
> performance benefits of using hugepages.

I'd like to add two things:

1.
A lot of people talk about the performance improvements from using 2MB
pages, and there are the obvious hardware ones -- one fewer level to
dereference in the page table walk when there's a TLB miss; using a
single TLB entry to cache an entire 2MB page. But there are also the
software ones, which I believe Google have measured (perhaps it was the
ChromeOS team?). Allocating order-2/3/4 pages reduces the length of the
LRU list by a factor of 4/8/16. That means we get 4-16x memory reclaimed
per unit of time, which reduces the LRU lock contention. Not to mention
the advantage of being able to use a pagevec to describe 960KB of memory
rather than 60KB.

2. We can only measure what CPUs do today. If our behaviour changes, CPU
vendors will adapt. I talked to someone who dabbles in hardware design
who said that it really isn't that hard to design a TLB that can support
mapping 64KB entries at arbitrary 4KB offsets. There's no particular
incentive for CPU manufacturers to do that today, but if we start
allocating 64KB pages to cache files, that will change.
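
The figures in point 1 can be checked with a few lines of arithmetic.
This is only a sketch under stated assumptions: 4KB base pages and a
15-entry pagevec (the size that the 60KB figure implies). The function
names are made up for illustration and are not kernel interfaces.

```python
# Back-of-the-envelope check of the numbers in point 1.
# Assumptions (not stated in the mail): 4KB base pages, 15-slot pagevec.
BASE_PAGE_KB = 4
PAGEVEC_SLOTS = 15


def pages_per_allocation(order: int) -> int:
    """An order-N allocation covers 2**N contiguous base pages."""
    return 1 << order


def pagevec_coverage_kb(order: int) -> int:
    """Memory described by one full pagevec of order-N allocations."""
    return PAGEVEC_SLOTS * pages_per_allocation(order) * BASE_PAGE_KB


def lru_entries(total_kb: int, order: int) -> int:
    """LRU list length if total_kb is tracked in order-N units."""
    return total_kb // (pages_per_allocation(order) * BASE_PAGE_KB)


if __name__ == "__main__":
    one_gb_kb = 1 << 20  # 1GB of page cache, expressed in KB
    baseline = lru_entries(one_gb_kb, 0)
    for order in (0, 2, 3, 4):
        entries = lru_entries(one_gb_kb, order)
        print(
            f"order {order}: {entries} LRU entries "
            f"({baseline // entries}x shorter), "
            f"pagevec covers {pagevec_coverage_kb(order)}KB"
        )
```

With these assumptions, order-0 and order-4 give the 60KB and 960KB
pagevec figures, and the LRU list shrinks by 4x, 8x and 16x for orders
2, 3 and 4.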