From: Zi Yan
To: lsf-pc@lists.linux-foundation.org
Cc: linux-mm@kvack.org
Subject: [LSF/MM/BPF TOPIC] 1GB PUD THP support (gigantic page allocation, increasing MAX_ORDER, anti-fragmentation and more)
Date: Tue, 11 May 2021 17:18:12 -0400

I have been working on 1GB THP support [1][2][3] and would like to have a
discussion on the high-level design and some implementation details. The
topics I would like to discuss related to 1GB PUD THP include:

1. Gigantic page allocation. Since MAX_ORDER currently prevents us from
   allocating 1GB pages, we need to enable such allocations in one or more
   ways, such as using alloc_contig_range() or increasing MAX_ORDER.

2. The success rate of gigantic page allocation. The existing
   anti-fragmentation mechanism works at pageblock granularity, which is
   2MB on x86_64. What could be done to provide some guarantee on gigantic
   page allocation without being hurt by unmovable page fragmentation?
   Increasing the pageblock size, adding a memory zone/region for gigantic
   pages, or something else?

3. How to expose 1GB PUD THP to user space. Allocating a 1GB THP on every
   page fault is unrealistic: it can waste a lot of memory and add a lot of
   page fault handling time. Would additional MADV_ flags to specify the
   THP page size be a good choice? Or do we want to introduce an additional
   API to ask the kernel to create gigantic pages per user request [4]?

4. Code deduplication for THP handling and page table handling. When adding
   1GB THP support, I needed to mechanically replicate PMD THP code for PUD
   THP, so I am thinking about possible code deduplication. One thing I did
   is to have a common split_huge_page_to_list_to_order() for both
   split_huge_page() and split_huge_pud_page() [5] for THP handling. On the
   other hand, I am also thinking about reviving Kirill's idea [6] to
   consolidate the page table manipulation API using page table level
   numbers like level=1,2,3,... instead of PTE, PMD, PUD, and so on.

There might be other THP-specific topics, like how to handle PMD mappings
to a 1GB PUD THP in addition to the existing PTE mappings to a 2MB PMD THP,
but I think we have plenty to discuss already and we can continue if we
have time.
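
To make the size gap behind topics 1 and 2 concrete, here is a minimal
arithmetic sketch. It assumes x86_64 with 4KB base pages, a default
MAX_ORDER of 11 (so the largest buddy order is MAX_ORDER - 1), and
PMD-sized pageblocks. The SKETCH_* names and helper functions are
hypothetical and local to this example; they are not kernel definitions.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed x86_64 values with 4KB base pages; names are local to this sketch. */
enum {
    SKETCH_PAGE_SHIFT = 12, /* 4KB base page */
    SKETCH_PMD_SHIFT  = 21, /* 2MB PMD-level mapping, also the assumed pageblock size */
    SKETCH_PUD_SHIFT  = 30, /* 1GB PUD-level mapping */
    SKETCH_MAX_ORDER  = 11  /* assumed default; largest buddy order is MAX_ORDER - 1 */
};

/* Allocation order needed for one naturally aligned page of 2^shift bytes. */
unsigned int order_for_shift(unsigned int shift)
{
    assert(shift >= SKETCH_PAGE_SHIFT);
    return shift - SKETCH_PAGE_SHIFT;
}

/* Largest contiguous block the buddy allocator can hand out, in bytes. */
unsigned long long max_buddy_bytes(void)
{
    return 1ULL << (SKETCH_MAX_ORDER - 1 + SKETCH_PAGE_SHIFT);
}

/* Number of PMD-sized pageblocks covered by one PUD-sized page. */
unsigned long long pageblocks_per_pud(void)
{
    return 1ULL << (SKETCH_PUD_SHIFT - SKETCH_PMD_SHIFT);
}

/* Print the numbers side by side for a quick comparison. */
void print_summary(void)
{
    printf("order for 2MB page:   %u\n", order_for_shift(SKETCH_PMD_SHIFT));
    printf("order for 1GB page:   %u\n", order_for_shift(SKETCH_PUD_SHIFT));
    printf("largest buddy block:  %llu bytes\n", max_buddy_bytes());
    printf("pageblocks per 1GB:   %llu\n", pageblocks_per_pud());
}
```

Under these assumptions a 1GB page needs order 18, while the buddy
allocator tops out at order 10 (4MB), and a single 1GB page spans 512
pageblocks. Anti-fragmentation grouping at 2MB granularity therefore says
little about whether a whole 1GB-aligned range can be made free.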

[1] https://lore.kernel.org/linux-mm/20200902180628.4052244-1-zi.yan@sent.com/
[2] https://lore.kernel.org/linux-mm/20200928175428.4110504-1-zi.yan@sent.com/
[3] https://lore.kernel.org/linux-mm/20210224223536.803765-1-zi.yan@sent.com/
[4] https://lore.kernel.org/linux-mm/20200907072014.GD30144@dhcp22.suse.cz/
[5] https://lore.kernel.org/linux-mm/20201119160605.1272425-1-zi.yan@sent.com/
[6] https://lore.kernel.org/linux-mm/20180424154355.mfjgkf47kdp2by4e@black.fi.intel.com/

--
Best Regards,
Yan Zi