From: Kent Overstreet
Date: Wed, 6 Oct 2021 18:53:41 -0400
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, willy@infradead.org, rientjes@google.com
Subject: Compaction & folios

So I have some observations on memory compaction & hugepages.

Right now, the working assumption in MM is that compaction is hard and
expensive, and right now it is - because most allocations are order 0,
with a small subset being hugepage order allocations. This means any time
we need a hugepage, compaction has to move a bunch of order 0 pages
around, and memory reclaim is no help here - when we reclaim memory, it's
coming back as fragmented order 0 pages.

But what if compaction wasn't such a difficult, expensive operation?

With folios, and then folios for anonymous pages, we won't see nearly so
many order 0 allocations anymore - we'll see a spread of allocation sizes
based on a mixture of application usage patterns - something much closer
to a Poisson distribution, vs. our current very bimodal distribution.
And since we won't be fragmenting all our allocations up front, memory
reclaim will be freeing allocations in this same distribution.

Which means that any time an order n allocation fails, it's likely that
we'll still have order n-1 pages free - and of those free order n-1
pages, one will likely have a buddy that's movable and hasn't been
fragmented - meaning the common case is that compaction will have to move
_one_ (higher order) page - we'll almost never have to move a bunch of 4k
pages.

Another way of thinking of this is that memory reclaim will be doing most
of the work that compaction has to do now to allocate a high order page.
Compaction will go from an expensive, somewhat unreliable operation to one
that mostly just works - it's going to be _much_ less of a pain point.

It may turn out that allocating hugepages still doesn't work as reliably
as we'd like - but folios are still a big help even when we can't allocate
a 2MB page, because we'll be able to fall back to an order 6 or 7 or 8
allocation, which is something we can't do now. And, since multiple CPU
vendors now support coalescing contiguous PTEs in the TLB, this will still
get us most of the performance benefits of using hugepages.
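
To make the buddy argument above concrete, here is a minimal Python
sketch, not kernel code. It assumes 4 KiB base pages and a deliberately
crude cost model in which every movable allocation costs one migration
regardless of its size. The helper names order_bytes, buddy_pfn and
migrations_to_free are hypothetical; the XOR in buddy_pfn is the usual
buddy-allocator rule for locating the other half of a block.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB base pages


def order_bytes(order):
    """Size in bytes of a block of 2**order base pages."""
    return PAGE_SIZE << order


def buddy_pfn(pfn, order):
    """Page frame number of the buddy of the order-sized block at pfn."""
    if pfn % (1 << order) != 0:
        raise ValueError("block must be aligned to its own size")
    return pfn ^ (1 << order)


def migrations_to_free(order, occupant_orders):
    """Migrations needed to empty a fully occupied block of the given order.

    occupant_orders lists the order of each movable allocation inside the
    block. Crude model (an assumption): one migration per allocation.
    """
    if sum(1 << o for o in occupant_orders) != 1 << order:
        raise ValueError("occupants must exactly fill the block")
    return len(occupant_orders)


# An order-9 (2 MiB) request fails, but an order-8 block is free at this PFN.
free_block = 0x1000
buddy = buddy_pfn(free_block, 8)

# Today: the buddy is typically filled with order-0 pages.
moves_order0 = migrations_to_free(8, [0] * 256)

# With folios: the buddy is often a single movable order-8 folio.
moves_folio = migrations_to_free(8, [8])

# Fallback sizes mentioned above, in KiB, for orders 6, 7 and 8.
fallback_kib = [order_bytes(o) // 1024 for o in (6, 7, 8)]
```

Under these assumptions moves_order0 is 256 and moves_folio is 1, and the
fallback orders 6, 7 and 8 correspond to 256 KiB, 512 KiB and 1 MiB,
against 2 MiB for order 9. Real migration cost also depends on how much
data is copied, so the model only illustrates the count of operations.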