Subject: Re: [PATCH v1 04/11] mm/memremap: add ZONE_DEVICE support for compound pages
To: Joao Martins, Dan Williams
Cc: Linux MM, Ira Weiny, linux-nvdimm, Matthew Wilcox, Jason Gunthorpe, Muchun Song, Mike Kravetz, Andrew Morton, Jane Chu
References: <20210325230938.30752-1-joao.m.martins@oracle.com> <20210325230938.30752-5-joao.m.martins@oracle.com> <56a3e271-4ef8-ba02-639e-fd7fe7de7e36@oracle.com> <8c922a58-c901-1ad9-5d19-1182bd6dea1e@oracle.com>
From: Jane Chu
Organization: Oracle Corporation
Date: Wed, 19 May 2021 11:36:02 -0700

On 5/19/2021 4:29 AM, Joao Martins wrote:
>
> On 5/18/21 8:56 PM, Jane Chu wrote:
>> On 5/18/2021 10:27 AM, Joao Martins wrote:
>>
>>> On 5/5/21 11:36 PM, Joao Martins wrote:
>>>> On 5/5/21 11:20 PM, Dan Williams wrote:
>>>>> On Wed, May 5, 2021 at 12:50 PM Joao Martins wrote:
>>>>>> On 5/5/21 7:44 PM, Dan Williams wrote:
>>>>>>> On Thu, Mar 25, 2021 at 4:10 PM Joao Martins wrote:
>>>>>>>> diff --git a/include/linux/memremap.h b/include/linux/memremap.h
>>>>>>>> index b46f63dcaed3..bb28d82dda5e 100644
>>>>>>>> --- a/include/linux/memremap.h
>>>>>>>> +++ b/include/linux/memremap.h
>>>>>>>> @@ -114,6 +114,7 @@ struct dev_pagemap {
>>>>>>>>         struct completion done;
>>>>>>>>         enum memory_type type;
>>>>>>>>         unsigned int flags;
>>>>>>>> +       unsigned long align;
>>>>>>> I think this wants some kernel-doc above to indicate that non-zero
>>>>>>> means "use compound pages with tail-page dedup" and zero / PAGE_SIZE
>>>>>>> means "use non-compound base pages".
>>> [...]
>>>
>>>>>>> The non-zero value must be
>>>>>>> PAGE_SIZE, PMD_PAGE_SIZE or PUD_PAGE_SIZE.
>>>>>>> Hmm, maybe it should be an
>>>>>>> enum:
>>>>>>>
>>>>>>> enum devmap_geometry {
>>>>>>>     DEVMAP_PTE,
>>>>>>>     DEVMAP_PMD,
>>>>>>>     DEVMAP_PUD,
>>>>>>> }
>>>>>>>
>>>>>> I suppose a converter between devmap_geometry and page_size would be needed too? And maybe
>>>>>> the whole dax/nvdimm align values change meanwhile (as a followup improvement)?
>>>>> I think it is ok for dax/nvdimm to continue to maintain their align
>>>>> value because it should be ok to have 4MB align if the device really
>>>>> wanted. However, when it goes to map that alignment with
>>>>> memremap_pages() it can pick a mode. For example, it's already the
>>>>> case that dax->align == 1GB is mapped with DEVMAP_PTE today, so
>>>>> they're already separate concepts that can stay separate.
>>>>>
>>>> Gotcha.
>>> I am reconsidering part of the above. In general, yes, the meaning of devmap @align
>>> represents a slightly different variation of the device @align i.e. how the metadata is
>>> laid out **but** regardless of what kind of page table entries we use vmemmap.
>>>
>>> By using DEVMAP_PTE/PMD/PUD we might end up 1) duplicating what nvdimm/dax already
>>> validates in terms of allowed device @align values (i.e. PAGE_SIZE, PMD_SIZE and PUD_SIZE)
>>> 2) the geometry of metadata is very much tied to the value we pick to @align at namespace
>>> provisioning -- not the "align" we might use at mmap() perhaps that's what you referred
>>> above? -- and 3) the value of geometry actually derives from dax device @align because we
>>> will need to create compound pages representing a page size of @align value.
>>>
>>> Using your example above: you're saying that dax->align == 1G is mapped with DEVMAP_PTEs,
>>> in reality the vmemmap is populated with PMDs/PUDs page tables (depending on what archs
>>> decide to do at vmemmap_populate()) and uses base pages as its metadata regardless of what
>>> device @align. In reality what we want to convey in @geometry is not page table sizes, but
>>> just the page size used for the vmemmap of the dax device. Additionally, limiting its
>>> value might not be desirable... if tomorrow Linux for some arch supports dax/nvdimm
>>> devices with 4M align or 64K align, the value of @geometry will have to reflect the 4M to
>>> create compound pages of order 10 for the said vmemmap.
>>>
>>> I am going to wait until you finish reviewing the remaining four patches of this series,
>>> but maybe this is a simple misnomer (s/align/geometry/) with a comment but without
>>> DEVMAP_{PTE,PMD,PUD} enum part? Or perhaps its own struct with a value and enum a
>>> setter/getter to audit its value? Thoughts?
>> Good points there.
>>
>> My understanding is that dax->align conveys granularity of size while
>> carving out a namespace it's a geometry attribute loosely akin to sector size of a spindle
>> disk. I tend to think that device pagesize has almost no relation to "align" in that, it's
>> possible to have 1G "align" and 4K pagesize, or verse versa. That is, with the advent of compound page
>> support, it is possible to totally separate the two concepts.
>>
>> How about adding a new option to "ndctl create-namespace" that describes
>> device creator's desired pagesize, and another parameter to describe whether the pagesize shall
>> be fixed or allowed to be split up, such that, if the intention is to never split up 2M pagesize, then it
>> would be possible to save a lot metadata space on the device?
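
As a side note on the "order 10" figure quoted above: with a 4K base page, a 4M geometry spans 1024 base pages, i.e. 2^10. A minimal sketch of that arithmetic follows. geometry_to_order() is a hypothetical helper written for illustration, not an existing kernel function, and the 4K base page size is an assumption.

```c
/*
 * Illustration only: geometry_to_order() is a made-up helper, not a
 * kernel API. It returns the compound page order needed to cover
 * `geometry` bytes with base pages of `base` bytes, or -1 when
 * geometry is not a power-of-two multiple of base.
 */
static inline int geometry_to_order(unsigned long geometry, unsigned long base)
{
	unsigned long pages;
	int order = 0;

	if (base == 0 || geometry < base || geometry % base != 0)
		return -1;

	pages = geometry / base;
	if (pages & (pages - 1))	/* not a power of two */
		return -1;

	while (pages > 1) {
		pages >>= 1;
		order++;
	}
	return order;
}
```

With an assumed base of 4096, geometry_to_order(4UL << 20, 4096UL) gives 10 and geometry_to_order(64UL << 10, 4096UL) gives 4, which is why an enum limited to PTE/PMD/PUD could not describe such values.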
> Maybe that can be selected by the driver too, but it's an interesting point you raise
> should we settle with the geometry (e.g. like a geometry sysfs entry IIUC your
> suggestion?). device-dax for example would use geometry == align and therefore save space
> (like what I propose in patch 10). But fsdax would retain the default that is geometry =
> PAGE_SIZE and align = PMD_SIZE should it want to split pages.

Let's see, I think this is what we have today:

       | align   hpagesize  geometry  hpage-splittable
=======================================================
devdax | 4K..1G  2M,1G      4K        artificially no
fsdax  | 4K..1G  2M         4K        yes

So a hard no-split means (hpagesize == geometry), and that does not apply
to fsdax for now. But is it not possible in the future? Some customers prefer an
optional guarantee that their DAX hpages are never split up, for
the sake of RDMA efficiency.

>
> Interestingly, devmap poisoning always occur at @align level regardless of @geometry.

Yeah, it's a simplification that's not ideal, because after all,
error-blast-radius != UserMapping-pagesize.

>
> What I am not sure is what value (vs added complexity) it brings to allow geometry *value*
> to be selecteable by user given that so far we seem to only ever initialize metadata as
> either sets of base pages [*] or sets of compound pages (of a size). And the difference
> between both can possibly be summarized to split-ability like you say.
>
> [*] that optionally can are morphed into compound pages by driver

Agreed. For this series, it's simpler not to make the
compound-page-size selectable.

thanks,
-jane
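
P.S. To make the "hard no-split means (hpagesize == geometry)" rule from the table concrete, here is a rough sketch. The struct dax_mode type and its field names are assumptions made up for this illustration; they only mirror the table columns and are not kernel definitions.

```c
#include <stdbool.h>

/*
 * Illustration only: this struct is hypothetical and simply mirrors
 * the columns of the table above.
 */
struct dax_mode {
	const char *name;
	unsigned long hpagesize;	/* huge page size used for user mappings */
	unsigned long geometry;		/* page size the vmemmap metadata is laid out in */
};

/*
 * A mode gives a hard no-split guarantee only when the metadata is
 * laid out at the same size as the huge page itself.
 */
static inline bool hard_no_split(const struct dax_mode *m)
{
	return m->hpagesize == m->geometry;
}
```

Under this model, devdax as it stands today (2M hpagesize, 4K geometry) is not a hard no-split, which matches the "artificially no" entry, while a devdax set up with geometry == align == 2M, as I understand patch 10 to propose, would be.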