All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Rapoport <rppt@kernel.org>
To: David Hildenbrand <david@redhat.com>
Cc: linux-arm-kernel@lists.infradead.org,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Ard Biesheuvel <ardb@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Marc Zyngier <maz@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Mike Rapoport <rppt@linux.ibm.com>, Will Deacon <will@kernel.org>,
	kvmarm@lists.cs.columbia.edu, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [RFC/RFT PATCH 1/3] memblock: update initialization of reserved pages
Date: Wed, 14 Apr 2021 23:06:33 +0300	[thread overview]
Message-ID: <YHdLSeYE3f5+v3n5@kernel.org> (raw)
In-Reply-To: <0c48f98c-7454-1458-15a5-cc5a7e1fb7cd@redhat.com>

On Wed, Apr 14, 2021 at 05:12:11PM +0200, David Hildenbrand wrote:
> On 07.04.21 19:26, Mike Rapoport wrote:
> > From: Mike Rapoport <rppt@linux.ibm.com>
> > 
> > The struct pages representing a reserved memory region are initialized
> > using reserve_bootmem_range() function. This function is called for each
> > reserved region just before the memory is freed from memblock to the buddy
> > page allocator.
> > 
> > The struct pages for MEMBLOCK_NOMAP regions are kept with the default
> > values set by the memory map initialization which makes it necessary to
> > have a special treatment for such pages in pfn_valid() and
> > pfn_valid_within().
> 
> I assume these pages are never given to the buddy, because we don't have a
> direct mapping. So to the kernel, it's essentially just like a memory hole
> with benefits.

The pages should not be accessed as normal memory so they do not have a
direct (or in ARMish linear) mapping and are never given to buddy. 
After looking at ACPI standard I don't see a fundamental reason for this
but they've already made this mess and we need to cope with it.
 
> I can spot that we want to export such memory like any special memory
> thingy/hole in /proc/iomem -- "reserved", which makes sense.

It does, but let's wait with /proc/iomem changes. We don't really have a
100% consistent view of it on different architectures, so adding yet
another type there does not seem, well, urgent.
 
> I would assume that MEMBLOCK_NOMAP is a special type of *reserved* memory.
> IOW, that for_each_reserved_mem_range() should already succeed on these as
> well -- we should mark anything that is MEMBLOCK_NOMAP implicitly as
> reserved. Or are there valid reasons not to do so? What can anyone do with
> that memory?
> 
> I assume they are pretty much useless for the kernel, right? Like other
> reserved memory ranges.

I agree that there is a lot of commonality between NOMAP and reserved. The
problem is that even semantics for reserved is different between
architectures. Moreover, on the same architecture there could be
E820_TYPE_RESERVED and memblock.reserved with different properties.

I'd really prefer moving in baby steps here because any change in the boot
mm can bear several month of early hangs debugging ;-)

> > Split out initialization of the reserved pages to a function with a
> > meaningful name and treat the MEMBLOCK_NOMAP regions the same way as the
> > reserved regions and mark struct pages for the NOMAP regions as
> > PageReserved.
> > 
> > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> > ---
> >   mm/memblock.c | 23 +++++++++++++++++++++--
> >   1 file changed, 21 insertions(+), 2 deletions(-)
> > 
> > diff --git a/mm/memblock.c b/mm/memblock.c
> > index afaefa8fc6ab..6b7ea9d86310 100644
> > --- a/mm/memblock.c
> > +++ b/mm/memblock.c
> > @@ -2002,6 +2002,26 @@ static unsigned long __init __free_memory_core(phys_addr_t start,
> >   	return end_pfn - start_pfn;
> >   }
> > +static void __init memmap_init_reserved_pages(void)
> > +{
> > +	struct memblock_region *region;
> > +	phys_addr_t start, end;
> > +	u64 i;
> > +
> > +	/* initialize struct pages for the reserved regions */
> > +	for_each_reserved_mem_range(i, &start, &end)
> > +		reserve_bootmem_region(start, end);
> > +
> > +	/* and also treat struct pages for the NOMAP regions as PageReserved */
> > +	for_each_mem_region(region) {
> > +		if (memblock_is_nomap(region)) {
> > +			start = region->base;
> > +			end = start + region->size;
> > +			reserve_bootmem_region(start, end);
> > +		}
> > +	}
> > +}
> > +
> >   static unsigned long __init free_low_memory_core_early(void)
> >   {
> >   	unsigned long count = 0;
> > @@ -2010,8 +2030,7 @@ static unsigned long __init free_low_memory_core_early(void)
> >   	memblock_clear_hotplug(0, -1);
> > -	for_each_reserved_mem_range(i, &start, &end)
> > -		reserve_bootmem_region(start, end);
> > +	memmap_init_reserved_pages();
> >   	/*
> >   	 * We need to use NUMA_NO_NODE instead of NODE_DATA(0)->node_id

-- 
Sincerely yours,
Mike.

WARNING: multiple messages have this Message-ID (diff)
From: Mike Rapoport <rppt@kernel.org>
To: David Hildenbrand <david@redhat.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	linux-kernel@vger.kernel.org, Mike Rapoport <rppt@linux.ibm.com>,
	linux-mm@kvack.org, kvmarm@lists.cs.columbia.edu,
	Marc Zyngier <maz@kernel.org>, Will Deacon <will@kernel.org>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [RFC/RFT PATCH 1/3] memblock: update initialization of reserved pages
Date: Wed, 14 Apr 2021 23:06:33 +0300	[thread overview]
Message-ID: <YHdLSeYE3f5+v3n5@kernel.org> (raw)
In-Reply-To: <0c48f98c-7454-1458-15a5-cc5a7e1fb7cd@redhat.com>

On Wed, Apr 14, 2021 at 05:12:11PM +0200, David Hildenbrand wrote:
> On 07.04.21 19:26, Mike Rapoport wrote:
> > From: Mike Rapoport <rppt@linux.ibm.com>
> > 
> > The struct pages representing a reserved memory region are initialized
> > using reserve_bootmem_range() function. This function is called for each
> > reserved region just before the memory is freed from memblock to the buddy
> > page allocator.
> > 
> > The struct pages for MEMBLOCK_NOMAP regions are kept with the default
> > values set by the memory map initialization which makes it necessary to
> > have a special treatment for such pages in pfn_valid() and
> > pfn_valid_within().
> 
> I assume these pages are never given to the buddy, because we don't have a
> direct mapping. So to the kernel, it's essentially just like a memory hole
> with benefits.

The pages should not be accessed as normal memory so they do not have a
direct (or in ARMish linear) mapping and are never given to buddy. 
After looking at ACPI standard I don't see a fundamental reason for this
but they've already made this mess and we need to cope with it.
 
> I can spot that we want to export such memory like any special memory
> thingy/hole in /proc/iomem -- "reserved", which makes sense.

It does, but let's wait with /proc/iomem changes. We don't really have a
100% consistent view of it on different architectures, so adding yet
another type there does not seem, well, urgent.
 
> I would assume that MEMBLOCK_NOMAP is a special type of *reserved* memory.
> IOW, that for_each_reserved_mem_range() should already succeed on these as
> well -- we should mark anything that is MEMBLOCK_NOMAP implicitly as
> reserved. Or are there valid reasons not to do so? What can anyone do with
> that memory?
> 
> I assume they are pretty much useless for the kernel, right? Like other
> reserved memory ranges.

I agree that there is a lot of commonality between NOMAP and reserved. The
problem is that even semantics for reserved is different between
architectures. Moreover, on the same architecture there could be
E820_TYPE_RESERVED and memblock.reserved with different properties.

I'd really prefer moving in baby steps here because any change in the boot
mm can bear several month of early hangs debugging ;-)

> > Split out initialization of the reserved pages to a function with a
> > meaningful name and treat the MEMBLOCK_NOMAP regions the same way as the
> > reserved regions and mark struct pages for the NOMAP regions as
> > PageReserved.
> > 
> > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> > ---
> >   mm/memblock.c | 23 +++++++++++++++++++++--
> >   1 file changed, 21 insertions(+), 2 deletions(-)
> > 
> > diff --git a/mm/memblock.c b/mm/memblock.c
> > index afaefa8fc6ab..6b7ea9d86310 100644
> > --- a/mm/memblock.c
> > +++ b/mm/memblock.c
> > @@ -2002,6 +2002,26 @@ static unsigned long __init __free_memory_core(phys_addr_t start,
> >   	return end_pfn - start_pfn;
> >   }
> > +static void __init memmap_init_reserved_pages(void)
> > +{
> > +	struct memblock_region *region;
> > +	phys_addr_t start, end;
> > +	u64 i;
> > +
> > +	/* initialize struct pages for the reserved regions */
> > +	for_each_reserved_mem_range(i, &start, &end)
> > +		reserve_bootmem_region(start, end);
> > +
> > +	/* and also treat struct pages for the NOMAP regions as PageReserved */
> > +	for_each_mem_region(region) {
> > +		if (memblock_is_nomap(region)) {
> > +			start = region->base;
> > +			end = start + region->size;
> > +			reserve_bootmem_region(start, end);
> > +		}
> > +	}
> > +}
> > +
> >   static unsigned long __init free_low_memory_core_early(void)
> >   {
> >   	unsigned long count = 0;
> > @@ -2010,8 +2030,7 @@ static unsigned long __init free_low_memory_core_early(void)
> >   	memblock_clear_hotplug(0, -1);
> > -	for_each_reserved_mem_range(i, &start, &end)
> > -		reserve_bootmem_region(start, end);
> > +	memmap_init_reserved_pages();
> >   	/*
> >   	 * We need to use NUMA_NO_NODE instead of NODE_DATA(0)->node_id

-- 
Sincerely yours,
Mike.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

WARNING: multiple messages have this Message-ID (diff)
From: Mike Rapoport <rppt@kernel.org>
To: David Hildenbrand <david@redhat.com>
Cc: linux-arm-kernel@lists.infradead.org,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Ard Biesheuvel <ardb@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Marc Zyngier <maz@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Mike Rapoport <rppt@linux.ibm.com>, Will Deacon <will@kernel.org>,
	kvmarm@lists.cs.columbia.edu, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [RFC/RFT PATCH 1/3] memblock: update initialization of reserved pages
Date: Wed, 14 Apr 2021 23:06:33 +0300	[thread overview]
Message-ID: <YHdLSeYE3f5+v3n5@kernel.org> (raw)
In-Reply-To: <0c48f98c-7454-1458-15a5-cc5a7e1fb7cd@redhat.com>

On Wed, Apr 14, 2021 at 05:12:11PM +0200, David Hildenbrand wrote:
> On 07.04.21 19:26, Mike Rapoport wrote:
> > From: Mike Rapoport <rppt@linux.ibm.com>
> > 
> > The struct pages representing a reserved memory region are initialized
> > using reserve_bootmem_range() function. This function is called for each
> > reserved region just before the memory is freed from memblock to the buddy
> > page allocator.
> > 
> > The struct pages for MEMBLOCK_NOMAP regions are kept with the default
> > values set by the memory map initialization which makes it necessary to
> > have a special treatment for such pages in pfn_valid() and
> > pfn_valid_within().
> 
> I assume these pages are never given to the buddy, because we don't have a
> direct mapping. So to the kernel, it's essentially just like a memory hole
> with benefits.

The pages should not be accessed as normal memory so they do not have a
direct (or in ARMish linear) mapping and are never given to buddy. 
After looking at ACPI standard I don't see a fundamental reason for this
but they've already made this mess and we need to cope with it.
 
> I can spot that we want to export such memory like any special memory
> thingy/hole in /proc/iomem -- "reserved", which makes sense.

It does, but let's wait with /proc/iomem changes. We don't really have a
100% consistent view of it on different architectures, so adding yet
another type there does not seem, well, urgent.
 
> I would assume that MEMBLOCK_NOMAP is a special type of *reserved* memory.
> IOW, that for_each_reserved_mem_range() should already succeed on these as
> well -- we should mark anything that is MEMBLOCK_NOMAP implicitly as
> reserved. Or are there valid reasons not to do so? What can anyone do with
> that memory?
> 
> I assume they are pretty much useless for the kernel, right? Like other
> reserved memory ranges.

I agree that there is a lot of commonality between NOMAP and reserved. The
problem is that even semantics for reserved is different between
architectures. Moreover, on the same architecture there could be
E820_TYPE_RESERVED and memblock.reserved with different properties.

I'd really prefer moving in baby steps here because any change in the boot
mm can bear several month of early hangs debugging ;-)

> > Split out initialization of the reserved pages to a function with a
> > meaningful name and treat the MEMBLOCK_NOMAP regions the same way as the
> > reserved regions and mark struct pages for the NOMAP regions as
> > PageReserved.
> > 
> > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> > ---
> >   mm/memblock.c | 23 +++++++++++++++++++++--
> >   1 file changed, 21 insertions(+), 2 deletions(-)
> > 
> > diff --git a/mm/memblock.c b/mm/memblock.c
> > index afaefa8fc6ab..6b7ea9d86310 100644
> > --- a/mm/memblock.c
> > +++ b/mm/memblock.c
> > @@ -2002,6 +2002,26 @@ static unsigned long __init __free_memory_core(phys_addr_t start,
> >   	return end_pfn - start_pfn;
> >   }
> > +static void __init memmap_init_reserved_pages(void)
> > +{
> > +	struct memblock_region *region;
> > +	phys_addr_t start, end;
> > +	u64 i;
> > +
> > +	/* initialize struct pages for the reserved regions */
> > +	for_each_reserved_mem_range(i, &start, &end)
> > +		reserve_bootmem_region(start, end);
> > +
> > +	/* and also treat struct pages for the NOMAP regions as PageReserved */
> > +	for_each_mem_region(region) {
> > +		if (memblock_is_nomap(region)) {
> > +			start = region->base;
> > +			end = start + region->size;
> > +			reserve_bootmem_region(start, end);
> > +		}
> > +	}
> > +}
> > +
> >   static unsigned long __init free_low_memory_core_early(void)
> >   {
> >   	unsigned long count = 0;
> > @@ -2010,8 +2030,7 @@ static unsigned long __init free_low_memory_core_early(void)
> >   	memblock_clear_hotplug(0, -1);
> > -	for_each_reserved_mem_range(i, &start, &end)
> > -		reserve_bootmem_region(start, end);
> > +	memmap_init_reserved_pages();
> >   	/*
> >   	 * We need to use NUMA_NO_NODE instead of NODE_DATA(0)->node_id

-- 
Sincerely yours,
Mike.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  parent reply	other threads:[~2021-04-14 20:06 UTC|newest]

Thread overview: 78+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-07 17:26 [RFC/RFT PATCH 0/3] arm64: drop pfn_valid_within() and simplify pfn_valid() Mike Rapoport
2021-04-07 17:26 ` Mike Rapoport
2021-04-07 17:26 ` Mike Rapoport
2021-04-07 17:26 ` [RFC/RFT PATCH 1/3] memblock: update initialization of reserved pages Mike Rapoport
2021-04-07 17:26   ` Mike Rapoport
2021-04-07 17:26   ` Mike Rapoport
2021-04-08  5:16   ` Anshuman Khandual
2021-04-08  5:16     ` Anshuman Khandual
2021-04-08  5:16     ` Anshuman Khandual
2021-04-08  5:48     ` Mike Rapoport
2021-04-08  5:48       ` Mike Rapoport
2021-04-08  5:48       ` Mike Rapoport
2021-04-14 15:12   ` David Hildenbrand
2021-04-14 15:12     ` David Hildenbrand
2021-04-14 15:12     ` David Hildenbrand
2021-04-14 15:27     ` Ard Biesheuvel
2021-04-14 15:27       ` Ard Biesheuvel
2021-04-14 15:27       ` Ard Biesheuvel
2021-04-14 15:27       ` Ard Biesheuvel
2021-04-14 15:52       ` David Hildenbrand
2021-04-14 15:52         ` David Hildenbrand
2021-04-14 15:52         ` David Hildenbrand
2021-04-14 20:24         ` Mike Rapoport
2021-04-14 20:24           ` Mike Rapoport
2021-04-14 20:24           ` Mike Rapoport
2021-04-15  9:30           ` David Hildenbrand
2021-04-15  9:30             ` David Hildenbrand
2021-04-15  9:30             ` David Hildenbrand
2021-04-16 11:44             ` Mike Rapoport
2021-04-16 11:44               ` Mike Rapoport
2021-04-16 11:44               ` Mike Rapoport
2021-04-16 11:54               ` David Hildenbrand
2021-04-16 11:54                 ` David Hildenbrand
2021-04-16 11:54                 ` David Hildenbrand
2021-04-14 20:11       ` Mike Rapoport
2021-04-14 20:11         ` Mike Rapoport
2021-04-14 20:11         ` Mike Rapoport
2021-04-14 20:06     ` Mike Rapoport [this message]
2021-04-14 20:06       ` Mike Rapoport
2021-04-14 20:06       ` Mike Rapoport
2021-04-14 20:09       ` David Hildenbrand
2021-04-14 20:09         ` David Hildenbrand
2021-04-07 17:26 ` [RFC/RFT PATCH 2/3] arm64: decouple check whether pfn is normal memory from pfn_valid() Mike Rapoport
2021-04-07 17:26   ` Mike Rapoport
2021-04-07 17:26   ` Mike Rapoport
2021-04-08  5:14   ` Anshuman Khandual
2021-04-08  5:14     ` Anshuman Khandual
2021-04-08  5:14     ` Anshuman Khandual
2021-04-08  6:00     ` Mike Rapoport
2021-04-08  6:00       ` Mike Rapoport
2021-04-08  6:00       ` Mike Rapoport
2021-04-14 15:58     ` David Hildenbrand
2021-04-14 15:58       ` David Hildenbrand
2021-04-14 15:58       ` David Hildenbrand
2021-04-14 20:29       ` Mike Rapoport
2021-04-14 20:29         ` Mike Rapoport
2021-04-14 20:29         ` Mike Rapoport
2021-04-15  9:31         ` David Hildenbrand
2021-04-15  9:31           ` David Hildenbrand
2021-04-15  9:31           ` David Hildenbrand
2021-04-16 11:40           ` Mike Rapoport
2021-04-16 11:40             ` Mike Rapoport
2021-04-16 11:40             ` Mike Rapoport
2021-04-07 17:26 ` [RFC/RFT PATCH 3/3] arm64: drop pfn_valid_within() and simplify pfn_valid() Mike Rapoport
2021-04-07 17:26   ` Mike Rapoport
2021-04-07 17:26   ` Mike Rapoport
2021-04-08  5:12   ` Anshuman Khandual
2021-04-08  5:12     ` Anshuman Khandual
2021-04-08  5:12     ` Anshuman Khandual
2021-04-08  6:17     ` Mike Rapoport
2021-04-08  6:17       ` Mike Rapoport
2021-04-08  6:17       ` Mike Rapoport
2021-04-08  5:19 ` [RFC/RFT PATCH 0/3] " Anshuman Khandual
2021-04-08  5:19   ` Anshuman Khandual
2021-04-08  5:19   ` Anshuman Khandual
2021-04-08  6:27   ` Mike Rapoport
2021-04-08  6:27     ` Mike Rapoport
2021-04-08  6:27     ` Mike Rapoport

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YHdLSeYE3f5+v3n5@kernel.org \
    --to=rppt@kernel.org \
    --cc=anshuman.khandual@arm.com \
    --cc=ardb@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=david@redhat.com \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mark.rutland@arm.com \
    --cc=maz@kernel.org \
    --cc=rppt@linux.ibm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.