* Re: + mm-hugetlb-make-alloc_gigantic_page-available-for-general-use.patch added to -mm tree

From: Michal Hocko @ 2019-10-14 12:17 UTC
To: akpm
Cc: anshuman.khandual, ard.biesheuvel, broonie, christophe.leroy,
    dan.j.williams, dave.hansen, davem, gerald.schaefer, gregkh,
    heiko.carstens, jgg, jhogan, keescook, kirill, linux, mark.rutland,
    mike.kravetz, mm-commits, mpe, paul.burton, paulus, penguin-kernel,
    peterz, ralf, rppt, schowdary, schwidefsky, Steven.Price, tglx,
    vbabka, vgupta, willy, yamada.masahiro, linux-mm

On Fri 11-10-19 13:29:32, Andrew Morton wrote:
> alloc_gigantic_page() implements an allocation method where it scans over
> various zones looking for a large contiguous memory block which could not
> have been allocated through the buddy allocator.  A subsequent patch which
> tests arch page table helpers needs such a method to allocate a PUD_SIZE
> sized memory block.  In the future such methods might have other use cases
> as well.  So alloc_gigantic_page() has been split, carving out the actual
> memory allocation method and making it available via the new
> alloc_gigantic_page_order().

You are exporting a helper used by hugetlb internally. Is this really
what is needed? I haven't followed this patchset, but don't you simply
need a generic 1GB allocator? If yes, then you should be looking at
alloc_contig_range(). Or did I miss the point and you really want a
hugetlb page?

> Link: http://lkml.kernel.org/r/1570775142-31425-2-git-send-email-anshuman.khandual@arm.com
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: Steven Price <Steven.Price@arm.com>
> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Sri Krishna chowdary <schowdary@nvidia.com>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Russell King - ARM Linux <linux@armlinux.org.uk>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Vineet Gupta <vgupta@synopsys.com>
> Cc: James Hogan <jhogan@kernel.org>
> Cc: Paul Burton <paul.burton@mips.com>
> Cc: Ralf Baechle <ralf@linux-mips.org>
> Cc: Kirill A. Shutemov <kirill@shutemov.name>
> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
> Cc: Christophe Leroy <christophe.leroy@c-s.fr>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
>
>  include/linux/hugetlb.h |    9 +++++++++
>  mm/hugetlb.c            |   24 ++++++++++++++++++++++--
>  2 files changed, 31 insertions(+), 2 deletions(-)
>
> --- a/include/linux/hugetlb.h~mm-hugetlb-make-alloc_gigantic_page-available-for-general-use
> +++ a/include/linux/hugetlb.h
> @@ -299,6 +299,9 @@ static inline bool is_file_hugepages(str
>  }
>
>
> +struct page *
> +alloc_gigantic_page_order(unsigned int order, gfp_t gfp_mask,
> +			int nid, nodemask_t *nodemask);
>  #else /* !CONFIG_HUGETLBFS */
>
>  #define is_file_hugepages(file)		false
> @@ -310,6 +313,12 @@ hugetlb_file_setup(const char *name, siz
>  	return ERR_PTR(-ENOSYS);
>  }
>
> +static inline struct page *
> +alloc_gigantic_page_order(unsigned int order, gfp_t gfp_mask,
> +			int nid, nodemask_t *nodemask)
> +{
> +	return NULL;
> +}
>  #endif /* !CONFIG_HUGETLBFS */
>
>  #ifdef HAVE_ARCH_HUGETLB_UNMAPPED_AREA
> --- a/mm/hugetlb.c~mm-hugetlb-make-alloc_gigantic_page-available-for-general-use
> +++ a/mm/hugetlb.c
> @@ -1112,10 +1112,9 @@ static bool zone_spans_last_pfn(const st
>  	return zone_spans_pfn(zone, last_pfn);
>  }
>
> -static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
> +struct page *alloc_gigantic_page_order(unsigned int order, gfp_t gfp_mask,
>  		int nid, nodemask_t *nodemask)
>  {
> -	unsigned int order = huge_page_order(h);
>  	unsigned long nr_pages = 1 << order;
>  	unsigned long ret, pfn, flags;
>  	struct zonelist *zonelist;
> @@ -1151,6 +1150,14 @@ static struct page *alloc_gigantic_page(
>  	return NULL;
>  }
>
> +static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
> +		int nid, nodemask_t *nodemask)
> +{
> +	unsigned int order = huge_page_order(h);
> +
> +	return alloc_gigantic_page_order(order, gfp_mask, nid, nodemask);
> +}
> +
>  static void prep_new_huge_page(struct hstate *h, struct page *page, int nid);
>  static void prep_compound_gigantic_page(struct page *page, unsigned int order);
>  #else /* !CONFIG_CONTIG_ALLOC */
> @@ -1159,6 +1166,12 @@ static struct page *alloc_gigantic_page(
>  {
>  	return NULL;
>  }
> +
> +struct page *alloc_gigantic_page_order(unsigned int order, gfp_t gfp_mask,
> +		int nid, nodemask_t *nodemask)
> +{
> +	return NULL;
> +}
>  #endif /* CONFIG_CONTIG_ALLOC */
>
>  #else /* !CONFIG_ARCH_HAS_GIGANTIC_PAGE */
> @@ -1167,6 +1180,13 @@ static struct page *alloc_gigantic_page(
>  {
>  	return NULL;
>  }
> +
> +struct page *alloc_gigantic_page_order(unsigned int order, gfp_t gfp_mask,
> +		int nid, nodemask_t *nodemask)
> +{
> +	return NULL;
> +}
> +
>  static inline void free_gigantic_page(struct page *page, unsigned int order) { }
>  static inline void destroy_compound_gigantic_page(struct page *page,
>  					unsigned int order) { }
> _
>
> Patches currently in -mm which might be from anshuman.khandual@arm.com are
>
> mm-hugetlb-make-alloc_gigantic_page-available-for-general-use.patch
> mm-debug-add-tests-validating-architecture-page-table-helpers.patch
> mm-hotplug-reorder-memblock_-calls-in-try_remove_memory.patch

-- 
Michal Hocko
SUSE Labs
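To make the use case in the changelog concrete, here is a hedged sketch of how a caller such as the page table helper test might use the exported helper. The wrapper function name, the gfp choice, and the nodemask are invented here for illustration; they are not part of the patch itself.

```c
/*
 * Illustrative only: obtain one PUD_SIZE-sized block via the helper this
 * patch exports. alloc_gigantic_page_order() scans zones for a suitable
 * pfn range and allocates it with alloc_contig_range() under the hood.
 */
static struct page *alloc_pud_sized_page(int nid)
{
	/* get_order(PUD_SIZE) is, e.g., order 18 (1GB) on x86_64 with 4K pages */
	unsigned int order = get_order(PUD_SIZE);
	gfp_t gfp = GFP_KERNEL | __GFP_NOWARN;

	return alloc_gigantic_page_order(order, gfp, nid,
					 &node_states[N_MEMORY]);
}
```

The returned pages would need to be freed with free_contig_range() (or the matching hugetlb free path) once the test is done with them.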
* Re: + mm-hugetlb-make-alloc_gigantic_page-available-for-general-use.patch added to -mm tree

From: Anshuman Khandual @ 2019-10-14 12:53 UTC
To: Michal Hocko, akpm

On 10/14/2019 05:47 PM, Michal Hocko wrote:
> On Fri 11-10-19 13:29:32, Andrew Morton wrote:
>> alloc_gigantic_page() implements an allocation method where it scans over
>> various zones looking for a large contiguous memory block which could not
>> have been allocated through the buddy allocator.
>> [...]
>
> You are exporting a helper used for hugetlb internally. Is this really

Right, because the helper, i.e. alloc_gigantic_page(), is generic enough:
it scans over various zones to allocate a page order which could not be
allocated through the buddy allocator. The only thing which is HugeTLB
specific in there is struct hstate, from which the order is derived with
huge_page_order(). Otherwise it is very generic.

> what is needed? I haven't followed this patchset but don't you simply

Originally I had just implemented a similar allocator inside the test
itself, but then figured that alloc_gigantic_page() could be factored out
to create a generic enough allocator helper.

> need a generic 1GB allocator? If yes then you should be looking at

The test needs a PUD_SIZE allocator.

> alloc_contig_range.

IIUC, alloc_contig_range() requires a (start pfn, end pfn) pair for the
region to be allocated. But before that, all applicable zones need to be
scanned to figure out an available and suitable pfn range for
alloc_contig_range() to try. In this case the pfn_range_valid_gigantic()
check seemed reasonable while scanning the zones.

If pfn_range_valid_gigantic() is good enough or could be made more generic,
then the newly factored alloc_gigantic_page_order() could be made a helper
in mm/page_alloc.c.

> Or did I miss the point you really want hugetlb page?
>
> [remainder of the quoted patch trimmed]
* Re: + mm-hugetlb-make-alloc_gigantic_page-available-for-general-use.patch added to -mm tree

From: Michal Hocko @ 2019-10-14 13:00 UTC
To: Anshuman Khandual

On Mon 14-10-19 18:23:00, Anshuman Khandual wrote:
> On 10/14/2019 05:47 PM, Michal Hocko wrote:
>> You are exporting a helper used for hugetlb internally. Is this really
>> what is needed?
>
> Right, because the helper i.e alloc_gigantic_page() is generic enough which
> scans over various zones to allocate a page order which could not be allocated
> through the buddy. Only thing which is HugeTLB specific in there, is struct
> hstate from where the order is derived with huge_page_order(). Otherwise it
> is very generic.
>
> [...]
>
> IIUC alloc_contig_range() requires (start pfn, end pfn) for the region to be
> allocated. But before that all applicable zones need to be scanned to figure
> out any available and suitable pfn range for alloc_contig_range() to try. In
> this case pfn_range_valid_gigantic() check seemed reasonable while scanning
> the zones.
>
> If pfn_range_valid_gigantic() is good enough or could be made more generic,
> then the new factored alloc_gigantic_page_order() could be made a helper in
> mm/page_alloc.c

OK, thanks for the clarification. This all means that this patch is not
the right approach. If you need a more generic alloc_contig_range then
add it to page_alloc.c and make it completely independent of the hugetlb
config and code. The hugetlb allocator can then reuse that helper.

-- 
Michal Hocko
SUSE Labs
* Re: + mm-hugetlb-make-alloc_gigantic_page-available-for-general-use.patch added to -mm tree

From: Anshuman Khandual @ 2019-10-14 13:08 UTC
To: Michal Hocko

On 10/14/2019 06:30 PM, Michal Hocko wrote:
> On Mon 14-10-19 18:23:00, Anshuman Khandual wrote:
>> [...]
>> If pfn_range_valid_gigantic() is good enough or could be made more generic,
>> then the new factored alloc_gigantic_page_order() could be made a helper in
>> mm/page_alloc.c
>
> OK, thanks for the clarification. This all means that this patch is not
> the right approach. If you need a more generic alloc_contig_range then
> add it to page_alloc.c and make it completely independent on the hugetlb
> config and the code. Hugetlb allocator can reuse that helper.

Sure. Just to confirm again, are the checks on pfns in
pfn_range_valid_gigantic() while looking for contiguous memory generic
enough in their current form?

static bool pfn_range_valid_gigantic(struct zone *z,
			unsigned long start_pfn, unsigned long nr_pages)
{
	unsigned long i, end_pfn = start_pfn + nr_pages;
	struct page *page;

	for (i = start_pfn; i < end_pfn; i++) {
		if (!pfn_valid(i))
			return false;

		page = pfn_to_page(i);

		if (page_zone(page) != z)
			return false;

		if (PageReserved(page))
			return false;

		if (page_count(page) > 0)
			return false;

		if (PageHuge(page))
			return false;
	}

	return true;
}
* Re: + mm-hugetlb-make-alloc_gigantic_page-available-for-general-use.patch added to -mm tree

From: Michal Hocko @ 2019-10-14 13:21 UTC
To: Anshuman Khandual

On Mon 14-10-19 18:38:52, Anshuman Khandual wrote:
> [...]
> Sure. Just to confirm again, the checks on pfns in pfn_range_valid_gigantic()
> while looking for contiguous memory is generic enough in it's current form ?
>
> [pfn_range_valid_gigantic() body trimmed]

Yes, I do not think we want to allow allocating across zones or other used
memory. This is more of a quick check than a correctness one; the allocator
has to make sure that all of that is properly done while isolating memory.

-- 
Michal Hocko
SUSE Labs
* Re: + mm-hugetlb-make-alloc_gigantic_page-available-for-general-use.patch added to -mm tree

From: Mike Kravetz @ 2019-10-14 16:52 UTC
To: Anshuman Khandual, Michal Hocko

On 10/14/19 6:08 AM, Anshuman Khandual wrote:
> On 10/14/2019 06:30 PM, Michal Hocko wrote:
>>
>> OK, thanks for the clarification. This all means that this patch is not
>> the right approach. If you need a more generic alloc_contig_range then
>> add it to page_alloc.c and make it completely independent on the hugetlb
>> config and the code. Hugetlb allocator can reuse that helper.

Should we revisit this previous attempt at such an interface?

https://lkml.org/lkml/2018/4/16/1072

This looks like another use case.

-- 
Mike Kravetz
* Re: + mm-hugetlb-make-alloc_gigantic_page-available-for-general-use.patch added to -mm tree

From: Anshuman Khandual @ 2019-10-15 9:57 UTC
To: Mike Kravetz, Michal Hocko

On 10/14/2019 10:22 PM, Mike Kravetz wrote:
> [...]
> Should we revisit this previous attempt at such an interface?
>
> https://lkml.org/lkml/2018/4/16/1072
>
> This looks like another use case.

The current proposal [v6] does not go as far as unifying all callers of
alloc_contig_range() looking for contiguous pages of a certain size;
instead, it just tries not to duplicate the HugeTLB gigantic allocation
code in the test case, for the test's own purpose.
* Re: + mm-hugetlb-make-alloc_gigantic_page-available-for-general-use.patch added to -mm tree

From: Michal Hocko @ 2019-10-15 10:31 UTC
To: Anshuman Khandual

On Tue 15-10-19 15:27:46, Anshuman Khandual wrote:
> [...]
> The current proposal [v6] does not go far enough to unify all callers
> of alloc_contig_range() looking for contiguous pages of certain size,
> but instead it just tries not to duplicate HugeTLB gigantic allocation
> code in the test case for it's purpose.

Please go and extract that functionality into a commonly usable helper.
The dependency on hugetlb is just ugly as hell.

-- 
Michal Hocko
SUSE Labs
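For context, the direction suggested here, a zone-scanning helper living in mm/page_alloc.c with no hugetlb dependency that the hugetlb allocator can then call, could look roughly like the following kernel-style sketch. This is illustrative only: the function name and the generalized validity check pfn_range_valid_contig() are hypothetical, not a merged interface, and error handling is elided.

```c
/*
 * Sketch only: a generic wrapper over alloc_contig_range(), modeled on the
 * scan loop currently inside hugetlb's alloc_gigantic_page(). All names
 * here are hypothetical.
 */
struct page *alloc_contig_pages_sketch(unsigned long nr_pages, gfp_t gfp_mask,
				       int nid, nodemask_t *nodemask)
{
	struct zonelist *zonelist = node_zonelist(nid, gfp_mask);
	struct zoneref *z;
	struct zone *zone;
	unsigned long pfn, flags;

	for_each_zone_zonelist_nodemask(zone, z, zonelist,
					gfp_zone(gfp_mask), nodemask) {
		spin_lock_irqsave(&zone->lock, flags);
		pfn = ALIGN(zone->zone_start_pfn, nr_pages);
		while (zone_spans_pfn(zone, pfn + nr_pages - 1)) {
			if (pfn_range_valid_contig(zone, pfn, nr_pages)) {
				/*
				 * The quick check can race, so drop the lock
				 * and let alloc_contig_range() do the real
				 * isolation and allocation work.
				 */
				spin_unlock_irqrestore(&zone->lock, flags);
				if (!alloc_contig_range(pfn, pfn + nr_pages,
							MIGRATE_MOVABLE,
							gfp_mask))
					return pfn_to_page(pfn);
				spin_lock_irqsave(&zone->lock, flags);
			}
			pfn += nr_pages;
		}
		spin_unlock_irqrestore(&zone->lock, flags);
	}
	return NULL;
}
```

With such a helper in page_alloc.c, hugetlb's alloc_gigantic_page() reduces to a thin wrapper passing 1 << huge_page_order(h), and the page table test can call the helper directly without any CONFIG_HUGETLBFS dependency.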
* Re: + mm-hugetlb-make-alloc_gigantic_page-available-for-general-use.patch added to -mm tree

From: Anshuman Khandual @ 2019-10-15 10:34 UTC
To: Michal Hocko

On 10/15/2019 04:01 PM, Michal Hocko wrote:
> [...]
> Please go and extract that functionality to a commonly usable helper.
> Dependency on hugetlb is just ugly as hell.

I did that in V6 and posted it some time back. Could you please review it
and let me know if anything needs to be improved?

- Anshuman
* Re: + mm-hugetlb-make-alloc_gigantic_page-available-for-general-use.patch added to -mm tree

From: Matthew Wilcox @ 2019-10-14 20:29 UTC
To: Michal Hocko

On Mon, Oct 14, 2019 at 02:17:30PM +0200, Michal Hocko wrote:
> On Fri 11-10-19 13:29:32, Andrew Morton wrote:
> > alloc_gigantic_page() implements an allocation method where it scans over
> > various zones looking for a large contiguous memory block which could not
> > have been allocated through the buddy allocator.
> > [...]
>
> You are exporting a helper used for hugetlb internally. Is this really
> what is needed? I haven't followed this patchset but don't you simply
> need a generic 1GB allocator? If yes then you should be looking at
> alloc_contig_range.

He actually doesn't need to allocate any memory at all. All he needs is
the address of a valid contiguous PUD-sized chunk of memory.
* Re: + mm-hugetlb-make-alloc_gigantic_page-available-for-general-use.patch added to -mm tree
  2019-10-14 20:29   ` Matthew Wilcox
@ 2019-10-15  9:30     ` Anshuman Khandual
  2019-10-15 11:24       ` Matthew Wilcox
  0 siblings, 1 reply; 15+ messages in thread
From: Anshuman Khandual @ 2019-10-15 9:30 UTC (permalink / raw)
  To: Matthew Wilcox, Michal Hocko
  Cc: akpm, ard.biesheuvel, broonie, christophe.leroy, dan.j.williams,
	dave.hansen, davem, gerald.schaefer, gregkh, heiko.carstens, jgg,
	jhogan, keescook, kirill, linux, mark.rutland, mike.kravetz,
	mm-commits, mpe, paul.burton, paulus, penguin-kernel, peterz, ralf,
	rppt, schowdary, schwidefsky, Steven.Price, tglx, vbabka, vgupta,
	yamada.masahiro, linux-mm

On 10/15/2019 01:59 AM, Matthew Wilcox wrote:
> On Mon, Oct 14, 2019 at 02:17:30PM +0200, Michal Hocko wrote:
>> On Fri 11-10-19 13:29:32, Andrew Morton wrote:
>>> alloc_gigantic_page() implements an allocation method where it scans over
>>> various zones looking for a large contiguous memory block which could not
>>> have been allocated through the buddy allocator. A subsequent patch which
>>> tests arch page table helpers needs such a method to allocate PUD_SIZE
>>> sized memory block. In the future such methods might have other use cases
>>> as well. So alloc_gigantic_page() has been split carving out actual
>>> memory allocation method and made available via new
>>> alloc_gigantic_page_order().
>>
>> You are exporting a helper used for hugetlb internally. Is this really
>> what is needed? I haven't followed this patchset but don't you simply
>> need a generic 1GB allocator? If yes then you should be looking at
>> alloc_contig_range.
>
> He actually doesn't need to allocate any memory at all. All he needs is
> the address of a valid contiguous PUD-sized chunk of memory.

We had already discussed the benefits of allocated memory over a
synthetic pfn potentially derived from a kernel text symbol. Moreover,
we are not adding any new helper or new code for this purpose, but
instead just trying to reuse code which is already present.

https://lore.kernel.org/linux-mm/1565335998-22553-1-git-send-email-anshuman.khandual@arm.com/T/
* Re: + mm-hugetlb-make-alloc_gigantic_page-available-for-general-use.patch added to -mm tree
  2019-10-15  9:30     ` Anshuman Khandual
@ 2019-10-15 11:24       ` Matthew Wilcox
  2019-10-15 11:36         ` Michal Hocko
  2019-10-16  8:55         ` Anshuman Khandual
  0 siblings, 2 replies; 15+ messages in thread
From: Matthew Wilcox @ 2019-10-15 11:24 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Michal Hocko, akpm, ard.biesheuvel, broonie, christophe.leroy,
	dan.j.williams, dave.hansen, davem, gerald.schaefer, gregkh,
	heiko.carstens, jgg, jhogan, keescook, kirill, linux, mark.rutland,
	mike.kravetz, mm-commits, mpe, paul.burton, paulus, penguin-kernel,
	peterz, ralf, rppt, schowdary, schwidefsky, Steven.Price, tglx,
	vbabka, vgupta, yamada.masahiro, linux-mm

On Tue, Oct 15, 2019 at 03:00:49PM +0530, Anshuman Khandual wrote:
> On 10/15/2019 01:59 AM, Matthew Wilcox wrote:
> > On Mon, Oct 14, 2019 at 02:17:30PM +0200, Michal Hocko wrote:
> >> On Fri 11-10-19 13:29:32, Andrew Morton wrote:
> >>> alloc_gigantic_page() implements an allocation method where it scans over
> >>> various zones looking for a large contiguous memory block which could not
> >>> have been allocated through the buddy allocator. A subsequent patch which
> >>> tests arch page table helpers needs such a method to allocate PUD_SIZE
> >>> sized memory block. In the future such methods might have other use cases
> >>> as well. So alloc_gigantic_page() has been split carving out actual
> >>> memory allocation method and made available via new
> >>> alloc_gigantic_page_order().
> >>
> >> You are exporting a helper used for hugetlb internally. Is this really
> >> what is needed? I haven't followed this patchset but don't you simply
> >> need a generic 1GB allocator? If yes then you should be looking at
> >> alloc_contig_range.
> >
> > He actually doesn't need to allocate any memory at all. All he needs is
> > the address of a valid contiguous PUD-sized chunk of memory.
>
> We had already discussed the benefits of allocated memory over a
> synthetic pfn potentially derived from a kernel text symbol. Moreover,
> we are not adding any new helper or new code for this purpose, but
> instead just trying to reuse code which is already present.

Yes, and I'm pretty sure you're just wrong. But I don't know that you're
wrong for all architectures. Re-reading that, I'm still not sure you
understood what I was suggesting, so I'll say it again differently.

Look up a kernel symbol, something like kernel_init(). This will have a
virtual address upon which it is safe to call virt_to_pfn(). Let's
presume it's in PFN 0x12345678. Use 0x12345678 as the PFN for your PTE
level tests.

Then clear the bottom (eg) 9 bits from it and use 0x12345600 for your
PMD level tests. Then clear the bottom 18 bits from it and use
0x12340000 for your PUD level tests.

I don't think it matters whether there's physical memory at PFN
0x12340000 or not. The various p?d_* functions should work as long as
the PFN is in some plausible range.

I gave up arguing because you seemed uninterested in this approach, but
now that Michal is pointing out that your approach is all wrong, maybe
you'll listen.
* Re: + mm-hugetlb-make-alloc_gigantic_page-available-for-general-use.patch added to -mm tree
  2019-10-15 11:24       ` Matthew Wilcox
@ 2019-10-15 11:36         ` Michal Hocko
  2019-10-15 12:14           ` Anshuman Khandual
  1 sibling, 1 reply; 15+ messages in thread
From: Michal Hocko @ 2019-10-15 11:36 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Anshuman Khandual, akpm, ard.biesheuvel, broonie, christophe.leroy,
	dan.j.williams, dave.hansen, davem, gerald.schaefer, gregkh,
	heiko.carstens, jgg, jhogan, keescook, kirill, linux, mark.rutland,
	mike.kravetz, mm-commits, mpe, paul.burton, paulus, penguin-kernel,
	peterz, ralf, rppt, schowdary, schwidefsky, Steven.Price, tglx,
	vbabka, vgupta, yamada.masahiro, linux-mm

On Tue 15-10-19 04:24:33, Matthew Wilcox wrote:
> On Tue, Oct 15, 2019 at 03:00:49PM +0530, Anshuman Khandual wrote:
> > On 10/15/2019 01:59 AM, Matthew Wilcox wrote:
> > > On Mon, Oct 14, 2019 at 02:17:30PM +0200, Michal Hocko wrote:
> > >> On Fri 11-10-19 13:29:32, Andrew Morton wrote:
> > >>> alloc_gigantic_page() implements an allocation method where it scans over
> > >>> various zones looking for a large contiguous memory block which could not
> > >>> have been allocated through the buddy allocator. A subsequent patch which
> > >>> tests arch page table helpers needs such a method to allocate PUD_SIZE
> > >>> sized memory block. In the future such methods might have other use cases
> > >>> as well. So alloc_gigantic_page() has been split carving out actual
> > >>> memory allocation method and made available via new
> > >>> alloc_gigantic_page_order().
> > >>
> > >> You are exporting a helper used for hugetlb internally. Is this really
> > >> what is needed? I haven't followed this patchset but don't you simply
> > >> need a generic 1GB allocator? If yes then you should be looking at
> > >> alloc_contig_range.
> > >
> > > He actually doesn't need to allocate any memory at all. All he needs is
> > > the address of a valid contiguous PUD-sized chunk of memory.
> >
> > We had already discussed the benefits of allocated memory over a
> > synthetic pfn potentially derived from a kernel text symbol. Moreover,
> > we are not adding any new helper or new code for this purpose, but
> > instead just trying to reuse code which is already present.
>
> Yes, and I'm pretty sure you're just wrong. But I don't know that you're
> wrong for all architectures. Re-reading that, I'm still not sure you
> understood what I was suggesting, so I'll say it again differently.
>
> Look up a kernel symbol, something like kernel_init(). This will have a
> virtual address upon which it is safe to call virt_to_pfn(). Let's
> presume it's in PFN 0x12345678. Use 0x12345678 as the PFN for your PTE
> level tests.
>
> Then clear the bottom (eg) 9 bits from it and use 0x12345600 for your
> PMD level tests. Then clear the bottom 18 bits from it and use
> 0x12340000 for your PUD level tests.
>
> I don't think it matters whether there's physical memory at PFN
> 0x12340000 or not. The various p?d_* functions should work as long as
> the PFN is in some plausible range.
>
> I gave up arguing because you seemed uninterested in this approach, but
> now that Michal is pointing out that your approach is all wrong, maybe
> you'll listen.

Just for the record, I didn't really get to read patch 2 in this series.
Matthew is right that I disagree with the current state of the "large
pages" allocator. If my concerns get resolved then I do not mind having
it, regardless of what patch 2 ends up doing and whether it uses it or
not.

On the other hand, having testing code that really requires a lot of
memory to be allocated in order to test page table walkers seems indeed
too strong an assumption, especially when there are ways around that, as
Matthew is suggesting, so I would really listen to his suggestions.
-- 
Michal Hocko
SUSE Labs
* Re: + mm-hugetlb-make-alloc_gigantic_page-available-for-general-use.patch added to -mm tree
  2019-10-15 11:36         ` Michal Hocko
@ 2019-10-15 12:14           ` Anshuman Khandual
  0 siblings, 0 replies; 15+ messages in thread
From: Anshuman Khandual @ 2019-10-15 12:14 UTC (permalink / raw)
  To: Michal Hocko, Matthew Wilcox
  Cc: akpm, ard.biesheuvel, broonie, christophe.leroy, dan.j.williams,
	dave.hansen, davem, gerald.schaefer, gregkh, heiko.carstens, jgg,
	jhogan, keescook, kirill, linux, mark.rutland, mike.kravetz,
	mm-commits, mpe, paul.burton, paulus, penguin-kernel, peterz, ralf,
	rppt, schowdary, schwidefsky, Steven.Price, tglx, vbabka, vgupta,
	yamada.masahiro, linux-mm

On 10/15/2019 05:06 PM, Michal Hocko wrote:
> On Tue 15-10-19 04:24:33, Matthew Wilcox wrote:
>> On Tue, Oct 15, 2019 at 03:00:49PM +0530, Anshuman Khandual wrote:
>>> On 10/15/2019 01:59 AM, Matthew Wilcox wrote:
>>>> On Mon, Oct 14, 2019 at 02:17:30PM +0200, Michal Hocko wrote:
>>>>> On Fri 11-10-19 13:29:32, Andrew Morton wrote:
>>>>>> alloc_gigantic_page() implements an allocation method where it scans over
>>>>>> various zones looking for a large contiguous memory block which could not
>>>>>> have been allocated through the buddy allocator. A subsequent patch which
>>>>>> tests arch page table helpers needs such a method to allocate PUD_SIZE
>>>>>> sized memory block. In the future such methods might have other use cases
>>>>>> as well. So alloc_gigantic_page() has been split carving out actual
>>>>>> memory allocation method and made available via new
>>>>>> alloc_gigantic_page_order().
>>>>>
>>>>> You are exporting a helper used for hugetlb internally. Is this really
>>>>> what is needed? I haven't followed this patchset but don't you simply
>>>>> need a generic 1GB allocator? If yes then you should be looking at
>>>>> alloc_contig_range.
>>>>
>>>> He actually doesn't need to allocate any memory at all. All he needs is
>>>> the address of a valid contiguous PUD-sized chunk of memory.
>>>
>>> We had already discussed the benefits of allocated memory over a
>>> synthetic pfn potentially derived from a kernel text symbol. Moreover,
>>> we are not adding any new helper or new code for this purpose, but
>>> instead just trying to reuse code which is already present.
>>
>> Yes, and I'm pretty sure you're just wrong. But I don't know that you're
>> wrong for all architectures. Re-reading that, I'm still not sure you
>> understood what I was suggesting, so I'll say it again differently.
>>
>> Look up a kernel symbol, something like kernel_init(). This will have a
>> virtual address upon which it is safe to call virt_to_pfn(). Let's
>> presume it's in PFN 0x12345678. Use 0x12345678 as the PFN for your PTE
>> level tests.
>>
>> Then clear the bottom (eg) 9 bits from it and use 0x12345600 for your
>> PMD level tests. Then clear the bottom 18 bits from it and use
>> 0x12340000 for your PUD level tests.
>>
>> I don't think it matters whether there's physical memory at PFN
>> 0x12340000 or not. The various p?d_* functions should work as long as
>> the PFN is in some plausible range.
>>
>> I gave up arguing because you seemed uninterested in this approach, but
>> now that Michal is pointing out that your approach is all wrong, maybe
>> you'll listen.
>
> Just for the record, I didn't really get to read patch 2 in this series.
> Matthew is right that I disagree with the current state of the "large
> pages" allocator. If my concerns get resolved then I do not mind having
> it, regardless of what patch 2 ends up doing and whether it uses it or
> not.

Sure, will then remove the first patch from here and post it separately,
after accommodating all suggestions in the meantime.

> On the other hand, having testing code that really requires a lot of
> memory to be allocated in order to test page table walkers seems indeed
> too strong an assumption, especially when there are ways around that, as

I agree it is a strong assumption, even though we have the fallback of
not running the tests when the required memory size could not be
allocated. There were some reasons around it, which I am still trying to
figure out from our previous discussions.

> Matthew is suggesting, so I would really listen to his suggestions.

Sure, let's evaluate it once more (on the other thread). The current
proposal skips the related tests when the required size of memory could
not be allocated; if we can do without allocating memory, it definitely
increases test coverage on many platforms.
* Re: + mm-hugetlb-make-alloc_gigantic_page-available-for-general-use.patch added to -mm tree
  2019-10-15 11:24       ` Matthew Wilcox
@ 2019-10-16  8:55         ` Anshuman Khandual
  1 sibling, 0 replies; 15+ messages in thread
From: Anshuman Khandual @ 2019-10-16 8:55 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Michal Hocko, akpm, ard.biesheuvel, broonie, christophe.leroy,
	dan.j.williams, dave.hansen, davem, gerald.schaefer, gregkh,
	heiko.carstens, jgg, jhogan, keescook, kirill, linux, mark.rutland,
	mike.kravetz, mm-commits, mpe, paul.burton, paulus, penguin-kernel,
	peterz, ralf, rppt, schowdary, schwidefsky, Steven.Price, tglx,
	vbabka, vgupta, yamada.masahiro, linux-mm

On 10/15/2019 04:54 PM, Matthew Wilcox wrote:
> On Tue, Oct 15, 2019 at 03:00:49PM +0530, Anshuman Khandual wrote:
>> On 10/15/2019 01:59 AM, Matthew Wilcox wrote:
>>> On Mon, Oct 14, 2019 at 02:17:30PM +0200, Michal Hocko wrote:
>>>> On Fri 11-10-19 13:29:32, Andrew Morton wrote:
>>>>> alloc_gigantic_page() implements an allocation method where it scans over
>>>>> various zones looking for a large contiguous memory block which could not
>>>>> have been allocated through the buddy allocator. A subsequent patch which
>>>>> tests arch page table helpers needs such a method to allocate PUD_SIZE
>>>>> sized memory block. In the future such methods might have other use cases
>>>>> as well. So alloc_gigantic_page() has been split carving out actual
>>>>> memory allocation method and made available via new
>>>>> alloc_gigantic_page_order().
>>>>
>>>> You are exporting a helper used for hugetlb internally. Is this really
>>>> what is needed? I haven't followed this patchset but don't you simply
>>>> need a generic 1GB allocator? If yes then you should be looking at
>>>> alloc_contig_range.
>>>
>>> He actually doesn't need to allocate any memory at all. All he needs is
>>> the address of a valid contiguous PUD-sized chunk of memory.
>>
>> We had already discussed the benefits of allocated memory over a
>> synthetic pfn potentially derived from a kernel text symbol. Moreover,
>> we are not adding any new helper or new code for this purpose, but
>> instead just trying to reuse code which is already present.
>
> Yes, and I'm pretty sure you're just wrong. But I don't know that you're
> wrong for all architectures. Re-reading that, I'm still not sure you
> understood what I was suggesting, so I'll say it again differently.

Sure, really appreciate that.

> Look up a kernel symbol, something like kernel_init(). This will have a
> virtual address upon which it is safe to call virt_to_pfn(). Let's
> presume it's in PFN 0x12345678. Use 0x12345678 as the PFN for your PTE
> level tests.

Got it.

> Then clear the bottom (eg) 9 bits from it and use 0x12345600 for your
> PMD level tests. Then clear the bottom 18 bits from it and use
> 0x12340000 for your PUD level tests.

Got it.

> I don't think it matters whether there's physical memory at PFN
> 0x12340000 or not. The various p?d_* functions should work as long as
> the PFN is in some plausible range.

A quick check confirms that the p?d_* functions work on those pfns.

Just for my understanding: where would the checks happen which verify
that a pfn mapped into an entry falls within a plausible range? Could
such a check be platform specific? Could it be a problem that these pfns
might not test positive with pfn_valid()? Though I am not sure whether
these could ever cause any problem, just wondering.

One of the other reasons for explicit memory allocation was isolation.
By using pfns as explained earlier, even though the test is transient,
it might map memory which is simultaneously in use somewhere else.
Though it might never cause any problem (this being a test page table
not walked by the MMU), is breaking the test's resource isolation not a
valid concern? Just thinking. As mentioned earlier, I can definitely see
the benefits of the pfn approach.

> I gave up arguing because you seemed uninterested in this approach, but
> now that Michal is pointing out that your approach is all wrong, maybe
> you'll listen.

I had assumed that the previous discussion just remained inconclusive,
but fair enough, will try it out on the x86 and arm64 platforms.
end of thread, other threads:[~2019-10-16  8:55 UTC | newest]

Thread overview: 15+ messages
[not found] <20191011202932.GZoUOoURm%akpm@linux-foundation.org>
2019-10-14 12:17 ` + mm-hugetlb-make-alloc_gigantic_page-available-for-general-use.patch added to -mm tree Michal Hocko
2019-10-14 12:53   ` Anshuman Khandual
2019-10-14 13:00     ` Michal Hocko
2019-10-14 13:08       ` Anshuman Khandual
2019-10-14 13:21         ` Michal Hocko
2019-10-14 16:52           ` Mike Kravetz
2019-10-15  9:57             ` Anshuman Khandual
2019-10-15 10:31               ` Michal Hocko
2019-10-15 10:34                 ` Anshuman Khandual
2019-10-14 20:29   ` Matthew Wilcox
2019-10-15  9:30     ` Anshuman Khandual
2019-10-15 11:24       ` Matthew Wilcox
2019-10-15 11:36         ` Michal Hocko
2019-10-15 12:14           ` Anshuman Khandual
2019-10-16  8:55         ` Anshuman Khandual