From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 19 Apr 2021 12:48:01 +0200 (CEST)
From: Christoph Lameter
To: Anshuman Khandual
cc: David Hildenbrand, linux-mm@kvack.org, akpm@linux-foundation.org,
    linux-kernel@vger.kernel.org, "linuxppc-dev @ lists . ozlabs . org",
    "linux-ia64@vger.kernel.org", Vlastimil Babka, Michal Hocko, Mel Gorman
Subject: Re: [PATCH V2] mm/page_alloc: Ensure that HUGETLB_PAGE_ORDER is less than MAX_ORDER
References: <1618199302-29335-1-git-send-email-anshuman.khandual@arm.com>
    <09284b9a-cfe1-fc49-e1f6-3cf0c1b74c76@arm.com>
    <162877dd-e6ba-d465-d301-2956bb034429@redhat.com>
User-Agent: Alpine 2.22 (DEB 394 2020-01-19)
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="8323328-1783256993-1618829281=:777076"
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

This message is in MIME format. The first part should be readable text,
while the remaining parts are likely unreadable without MIME-aware tools.

--8323328-1783256993-1618829281=:777076
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8BIT

On Mon, 19 Apr 2021, Anshuman Khandual wrote:

> >> Unfortunately the build test fails on both the platforms (powerpc and ia64)
> >> which subscribe HUGETLB_PAGE_SIZE_VARIABLE and where this check would make
> >> sense. I some how overlooked the cross compile build failure that actually
> >> detected this problem.
> >>
> >> But wondering why this assert is not holding true ? and how these platforms
> >> do not see the warning during boot (or do they ?) at mm/vmscan.c:1092 like
> >> arm64 did.
> >>
> >> static int __fragmentation_index(unsigned int order, struct contig_page_info *info)
> >> {
> >>          unsigned long requested = 1UL << order;
> >>
> >>          if (WARN_ON_ONCE(order >= MAX_ORDER))
> >>                  return 0;
> >> ....
> >>
> >> Can pageblock_order really exceed MAX_ORDER - 1 ?

You can have larger blocks but you would need to allocate multiple
contiguous max order blocks or do it at boot time before the buddy
allocator is active.

What IA64 did was to do this at boot time thereby avoiding the buddy
lists. And it had a separate virtual address range and page table for the
huge pages.

Looks like the current code does these allocations via CMA which should
also bypass the buddy allocator.

> >     }
> >
> >
> > But it's kind of weird, isn't it? Let's assume we have MAX_ORDER - 1 correspond to 4 MiB and pageblock_order correspond to 8 MiB.
> >
> > Sure, we'd be grouping pages in 8 MiB chunks, however, we cannot even
> > allocate 8 MiB chunks via the buddy. So only alloc_contig_range()
> > could really grab them (IOW: gigantic pages).
>
> Right.

But then you can avoid the buddy allocator.

> > Further, we have code like deferred_free_range(), where we end up
> > calling __free_pages_core()->...->__free_one_page() with
> > pageblock_order. Wouldn't we end up setting the buddy order to
> > something > MAX_ORDER -1 on that path?
>
> Agreed.

We would need to return the supersized block to the huge page pool and not
to the buddy allocator. There is a special callback in the compound page
so that you can call an alternate free function that is not the buddy
allocator.

>
> >
> > Having pageblock_order > MAX_ORDER feels wrong and looks shaky.
> >
> Agreed, definitely does not look right. Lets see what other folks
> might have to say on this.
>
> + Christoph Lameter
>

It was done for a long time successfully and is running in numerous
configurations.
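
To make the arithmetic and the free path discussed above concrete, here is
a minimal userspace sketch in C. It is not kernel code: PAGE_SHIFT = 12 and
MAX_ORDER = 11 are assumed values chosen so that MAX_ORDER - 1 corresponds
to 4 MiB as in the quoted example, and every name in it (struct block,
free_block(), the DTOR_* ids, the counters) is hypothetical. It only
illustrates the idea that a block at or above MAX_ORDER spans several
max-order buddy blocks and has to be freed through its own callback rather
than through the buddy lists.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed example values, not taken from any real configuration. */
#define PAGE_SHIFT 12   /* 4 KiB base pages */
#define MAX_ORDER  11   /* buddy serves orders 0 .. MAX_ORDER - 1 (4 MiB) */

/* Hypothetical destructor ids: which free routine owns a block. */
enum block_dtor { DTOR_BUDDY, DTOR_HUGE_POOL, NR_DTORS };

struct block {
    unsigned int order;    /* block spans 1 << order base pages */
    enum block_dtor dtor;  /* callback selected when the block was set up */
};

/* Counters stand in for the real free lists and the huge page pool. */
static unsigned long buddy_pages_freed;
static unsigned long pool_blocks_returned;

/* Size in bytes of a block of the given order. */
static unsigned long block_bytes(unsigned int order)
{
    return 1UL << (order + PAGE_SHIFT);
}

/* Contiguous max-order buddy blocks needed to cover one block of 'order'. */
static unsigned long max_order_blocks_needed(unsigned int order)
{
    if (order < MAX_ORDER)
        return 1;
    return 1UL << (order - (MAX_ORDER - 1));
}

static void free_to_buddy(struct block *b)
{
    /* The buddy lists cannot represent an order at or above MAX_ORDER. */
    assert(b->order < MAX_ORDER);
    buddy_pages_freed += 1UL << b->order;
}

static void free_to_huge_pool(struct block *b)
{
    (void)b;  /* a real pool would link the block back into its list */
    pool_blocks_returned++;
}

static void (*const dtors[NR_DTORS])(struct block *) = {
    [DTOR_BUDDY]     = free_to_buddy,
    [DTOR_HUGE_POOL] = free_to_huge_pool,
};

/* Free a block through its own callback instead of assuming the buddy. */
static void free_block(struct block *b)
{
    assert(b->dtor < NR_DTORS);
    dtors[b->dtor](b);
}
```

With these assumed values, an order-11 block is 8 MiB and covers two
order-10 blocks. Tagging it with DTOR_HUGE_POOL means free_block() hands it
back to the pool, so the assertion in free_to_buddy() is never reached for
it.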
--8323328-1783256993-1618829281=:777076--