From: Dan Williams
Date: Mon, 7 Jun 2021 13:17:33 -0700
Subject: Re: [PATCH v1 04/11] mm/memremap: add ZONE_DEVICE support for compound pages
To: Joao Martins
Cc: Linux MM, Ira Weiny, linux-nvdimm, Matthew Wilcox, Jason Gunthorpe, Jane Chu, Muchun Song, Mike Kravetz, Andrew Morton

On Tue, May 18, 2021 at 10:28 AM Joao Martins wrote:
>
> On 5/5/21 11:36 PM, Joao Martins wrote:
> > On 5/5/21 11:20 PM, Dan Williams wrote:
> >> On Wed, May 5, 2021 at 12:50 PM Joao Martins wrote:
> >>> On 5/5/21 7:44 PM, Dan Williams wrote:
> >>>> On Thu, Mar 25, 2021 at 4:10 PM Joao Martins wrote:
> >>>>> diff --git a/include/linux/memremap.h b/include/linux/memremap.h
> >>>>> index b46f63dcaed3..bb28d82dda5e 100644
> >>>>> --- a/include/linux/memremap.h
> >>>>> +++ b/include/linux/memremap.h
> >>>>> @@ -114,6 +114,7 @@ struct dev_pagemap {
> >>>>>  	struct completion done;
> >>>>>  	enum memory_type type;
> >>>>>  	unsigned int flags;
> >>>>> +	unsigned long align;
> >>>>
> >>>> I think this wants some kernel-doc above to indicate that non-zero
> >>>> means "use compound pages with tail-page dedup" and zero / PAGE_SIZE
> >>>> means "use non-compound base pages".
>
> [...]
>
> >>>> The non-zero value must be
> >>>> PAGE_SIZE, PMD_PAGE_SIZE or PUD_PAGE_SIZE.
> >>>> Hmm, maybe it should be an
> >>>> enum:
> >>>>
> >>>> enum devmap_geometry {
> >>>>     DEVMAP_PTE,
> >>>>     DEVMAP_PMD,
> >>>>     DEVMAP_PUD,
> >>>> }
> >>>>
> >>> I suppose a converter between devmap_geometry and page_size would be needed too? And maybe
> >>> the whole dax/nvdimm align values change meanwhile (as a followup improvement)?
> >>
> >> I think it is ok for dax/nvdimm to continue to maintain their align
> >> value because it should be ok to have 4MB align if the device really
> >> wanted. However, when it goes to map that alignment with
> >> memremap_pages() it can pick a mode. For example, it's already the
> >> case that dax->align == 1GB is mapped with DEVMAP_PTE today, so
> >> they're already separate concepts that can stay separate.
> >>
> > Gotcha.
>
> I am reconsidering part of the above. In general, yes, the meaning of devmap @align
> represents a slightly different variation of the device @align i.e. how the metadata is
> laid out **but** regardless of what kind of page table entries we use vmemmap.
>
> By using DEVMAP_PTE/PMD/PUD we might end up 1) duplicating what nvdimm/dax already
> validates in terms of allowed device @align values (i.e. PAGE_SIZE, PMD_SIZE and PUD_SIZE)
> 2) the geometry of metadata is very much tied to the value we pick to @align at namespace
> provisioning -- not the "align" we might use at mmap() perhaps that's what you referred
> above? -- and 3) the value of geometry actually derives from dax device @align because we
> will need to create compound pages representing a page size of @align value.
>
> Using your example above: you're saying that dax->align == 1G is mapped with DEVMAP_PTEs,
> in reality the vmemmap is populated with PMDs/PUDs page tables (depending on what archs
> decide to do at vmemmap_populate()) and uses base pages as its metadata regardless of what
> device @align. In reality what we want to convey in @geometry is not page table sizes, but
> just the page size used for the vmemmap of the dax device.

Good point, the names "PTE, PMD, PUD" imply the hardware mapping size,
not the software compound page size.

> Additionally, limiting its
> value might not be desirable... if tomorrow Linux for some arch supports dax/nvdimm
> devices with 4M align or 64K align, the value of @geometry will have to reflect the 4M to
> create compound pages of order 10 for the said vmemmap.
>
> I am going to wait until you finish reviewing the remaining four patches of this series,
> but maybe this is a simple misnomer (s/align/geometry/) with a comment but without
> DEVMAP_{PTE,PMD,PUD} enum part? Or perhaps its own struct with a value and enum a
> setter/getter to audit its value? Thoughts?

I do see what you mean about the confusion DEVMAP_{PTE,PMD,PUD}
introduces, but I still think the device-dax align and the
organization of the 'struct page' metadata are distinct concepts. So
I'm happy with any color of the bikeshed as long as the 2 concepts are
distinct. How about calling it "compound_page_order"? Open to other
ideas...
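
For reference, the arithmetic behind a name like "compound_page_order"
can be sketched in userspace C. This is only an illustration, not code
from the series: it assumes 4K base pages (a page shift of 12), and
align_to_compound_order() and SKETCH_PAGE_SHIFT are hypothetical names,
not existing kernel symbols.

```c
/* Assumption: 4K base pages, i.e. a page shift of 12. */
#define SKETCH_PAGE_SHIFT 12UL

/*
 * Hypothetical helper (not a kernel API): map a device alignment in
 * bytes to the order of the compound page that would describe it.
 * Returns -1 if align is not a power of two or is smaller than the
 * assumed base page size.
 */
static int align_to_compound_order(unsigned long align)
{
	int order = 0;

	/* Reject zero and anything that is not a power of two. */
	if (align == 0 || (align & (align - 1)) != 0)
		return -1;
	/* Reject alignments below one base page. */
	if (align < (1UL << SKETCH_PAGE_SHIFT))
		return -1;

	/* order is log2 of the number of base pages covered by align. */
	while ((align >> SKETCH_PAGE_SHIFT) > (1UL << order))
		order++;

	return order;
}
```

Under those assumptions a 4M align yields order 10, matching the figure
quoted above, while 64K yields order 4, 2M order 9 and 1G order 18. An
order expresses every power-of-two size directly, which is why it does
not carry the page-table connotation of the PTE/PMD/PUD names.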