All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V2 0/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
@ 2021-02-02  4:11 ` Anshuman Khandual
  0 siblings, 0 replies; 46+ messages in thread
From: Anshuman Khandual @ 2021-02-02  4:11 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-mm
  Cc: Anshuman Khandual, Catalin Marinas, Will Deacon, Ard Biesheuvel,
	Mark Rutland, James Morse, Robin Murphy, Jérôme Glisse,
	Dan Williams, David Hildenbrand, Mike Rapoport

This series fixes pfn_valid() for ZONE_DEVICE based memory and also improves
its performance for normal hotplug memory. While here, it also reorganizes
pfn_valid() on CONFIG_SPARSEMEM. This series is based on v5.11-rc6.

Question - should pfn_section_valid() be tested both for boot and non boot
memory as well ?

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org

Changes in V2:

- Dropped pfn_valid() bifurcation based on CONFIG_SPARSEMEM
- Used PFN_PHYS() and PHYS_PFN() instead of __pfn_to_phys() and __phys_to_pfn()
- Moved __pfn_to_section() inside #ifdef CONFIG_SPARSEMEM with a { } construct

Changes in V1:

https://lore.kernel.org/linux-mm/1611905986-20155-1-git-send-email-anshuman.khandual@arm.com/

- Test pfn_section_valid() for non boot memory

Changes in RFC:

https://lore.kernel.org/linux-arm-kernel/1608621144-4001-1-git-send-email-anshuman.khandual@arm.com/

Anshuman Khandual (2):
  arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
  arm64/mm: Reorganize pfn_valid()

 arch/arm64/mm/init.c | 28 +++++++++++++++++++++++++---
 1 file changed, 25 insertions(+), 3 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH V2 0/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
@ 2021-02-02  4:11 ` Anshuman Khandual
  0 siblings, 0 replies; 46+ messages in thread
From: Anshuman Khandual @ 2021-02-02  4:11 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-mm
  Cc: Mark Rutland, Anshuman Khandual, Catalin Marinas,
	David Hildenbrand, Robin Murphy, Mike Rapoport,
	Jérôme Glisse, James Morse, Dan Williams, Will Deacon,
	Ard Biesheuvel

This series fixes pfn_valid() for ZONE_DEVICE based memory and also improves
its performance for normal hotplug memory. While here, it also reorganizes
pfn_valid() on CONFIG_SPARSEMEM. This series is based on v5.11-rc6.

Question - should pfn_section_valid() be tested both for boot and non boot
memory as well ?

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org

Changes in V2:

- Dropped pfn_valid() bifurcation based on CONFIG_SPARSEMEM
- Used PFN_PHYS() and PHYS_PFN() instead of __pfn_to_phys() and __phys_to_pfn()
- Moved __pfn_to_section() inside #ifdef CONFIG_SPARSEMEM with a { } construct

Changes in V1:

https://lore.kernel.org/linux-mm/1611905986-20155-1-git-send-email-anshuman.khandual@arm.com/

- Test pfn_section_valid() for non boot memory

Changes in RFC:

https://lore.kernel.org/linux-arm-kernel/1608621144-4001-1-git-send-email-anshuman.khandual@arm.com/

Anshuman Khandual (2):
  arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
  arm64/mm: Reorganize pfn_valid()

 arch/arm64/mm/init.c | 28 +++++++++++++++++++++++++---
 1 file changed, 25 insertions(+), 3 deletions(-)

-- 
2.20.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
  2021-02-02  4:11 ` Anshuman Khandual
@ 2021-02-02  4:11   ` Anshuman Khandual
  -1 siblings, 0 replies; 46+ messages in thread
From: Anshuman Khandual @ 2021-02-02  4:11 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-mm
  Cc: Anshuman Khandual, Catalin Marinas, Will Deacon, Ard Biesheuvel,
	Mark Rutland, James Morse, Robin Murphy, Jérôme Glisse,
	Dan Williams, David Hildenbrand, Mike Rapoport

pfn_valid() validates a pfn but basically it checks for a valid struct page
backing for that pfn. It should always return positive for memory ranges
backed with struct page mapping. But currently pfn_valid() fails for all
ZONE_DEVICE based memory types even though they have struct page mapping.

pfn_valid() asserts that there is a memblock entry for a given pfn without
MEMBLOCK_NOMAP flag being set. The problem with ZONE_DEVICE based memory is
that they do not have memblock entries. Hence memblock_is_map_memory() will
invariably fail via memblock_search() for a ZONE_DEVICE based address. This
eventually fails pfn_valid() which is wrong. memblock_is_map_memory() needs
to be skipped for such memory ranges. As ZONE_DEVICE memory gets hotplugged
into the system via memremap_pages() called from a driver, their respective
memory sections will not have SECTION_IS_EARLY set.

Normal hotplug memory will never have MEMBLOCK_NOMAP set in their memblock
regions. Because the flag MEMBLOCK_NOMAP was specifically designed and set
for firmware reserved memory regions. memblock_is_map_memory() can just be
skipped as its always going to be positive and that will be an optimization
for the normal hotplug memory. Like ZONE_DEVICE based memory, all normal
hotplugged memory too will not have SECTION_IS_EARLY set for their sections

Skipping memblock_is_map_memory() for all non early memory sections would
fix pfn_valid() problem for ZONE_DEVICE based memory and also improve its
performance for normal hotplug memory as well.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Acked-by: David Hildenbrand <david@redhat.com>
Fixes: 73b20c84d42d ("arm64: mm: implement pte_devmap support")
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/mm/init.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 709d98fea90c..1141075e4d53 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -230,6 +230,18 @@ int pfn_valid(unsigned long pfn)
 
 	if (!valid_section(__pfn_to_section(pfn)))
 		return 0;
+
+	/*
+	 * ZONE_DEVICE memory does not have the memblock entries.
+	 * memblock_is_map_memory() check for ZONE_DEVICE based
+	 * addresses will always fail. Even the normal hotplugged
+	 * memory will never have MEMBLOCK_NOMAP flag set in their
+	 * memblock entries. Skip memblock search for all non early
+	 * memory sections covering all of hotplug memory including
+	 * both normal and ZONE_DEVICE based.
+	 */
+	if (!early_section(__pfn_to_section(pfn)))
+		return pfn_section_valid(__pfn_to_section(pfn), pfn);
 #endif
 	return memblock_is_map_memory(addr);
 }
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
@ 2021-02-02  4:11   ` Anshuman Khandual
  0 siblings, 0 replies; 46+ messages in thread
From: Anshuman Khandual @ 2021-02-02  4:11 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-mm
  Cc: Mark Rutland, Anshuman Khandual, Catalin Marinas,
	David Hildenbrand, Robin Murphy, Mike Rapoport,
	Jérôme Glisse, James Morse, Dan Williams, Will Deacon,
	Ard Biesheuvel

pfn_valid() validates a pfn but basically it checks for a valid struct page
backing for that pfn. It should always return positive for memory ranges
backed with struct page mapping. But currently pfn_valid() fails for all
ZONE_DEVICE based memory types even though they have struct page mapping.

pfn_valid() asserts that there is a memblock entry for a given pfn without
MEMBLOCK_NOMAP flag being set. The problem with ZONE_DEVICE based memory is
that they do not have memblock entries. Hence memblock_is_map_memory() will
invariably fail via memblock_search() for a ZONE_DEVICE based address. This
eventually fails pfn_valid() which is wrong. memblock_is_map_memory() needs
to be skipped for such memory ranges. As ZONE_DEVICE memory gets hotplugged
into the system via memremap_pages() called from a driver, their respective
memory sections will not have SECTION_IS_EARLY set.

Normal hotplug memory will never have MEMBLOCK_NOMAP set in their memblock
regions. Because the flag MEMBLOCK_NOMAP was specifically designed and set
for firmware reserved memory regions. memblock_is_map_memory() can just be
skipped as its always going to be positive and that will be an optimization
for the normal hotplug memory. Like ZONE_DEVICE based memory, all normal
hotplugged memory too will not have SECTION_IS_EARLY set for their sections

Skipping memblock_is_map_memory() for all non early memory sections would
fix pfn_valid() problem for ZONE_DEVICE based memory and also improve its
performance for normal hotplug memory as well.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Acked-by: David Hildenbrand <david@redhat.com>
Fixes: 73b20c84d42d ("arm64: mm: implement pte_devmap support")
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/mm/init.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 709d98fea90c..1141075e4d53 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -230,6 +230,18 @@ int pfn_valid(unsigned long pfn)
 
 	if (!valid_section(__pfn_to_section(pfn)))
 		return 0;
+
+	/*
+	 * ZONE_DEVICE memory does not have the memblock entries.
+	 * memblock_is_map_memory() check for ZONE_DEVICE based
+	 * addresses will always fail. Even the normal hotplugged
+	 * memory will never have MEMBLOCK_NOMAP flag set in their
+	 * memblock entries. Skip memblock search for all non early
+	 * memory sections covering all of hotplug memory including
+	 * both normal and ZONE_DEVICE based.
+	 */
+	if (!early_section(__pfn_to_section(pfn)))
+		return pfn_section_valid(__pfn_to_section(pfn), pfn);
 #endif
 	return memblock_is_map_memory(addr);
 }
-- 
2.20.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH V2 2/2] arm64/mm: Reorganize pfn_valid()
  2021-02-02  4:11 ` Anshuman Khandual
@ 2021-02-02  4:11   ` Anshuman Khandual
  -1 siblings, 0 replies; 46+ messages in thread
From: Anshuman Khandual @ 2021-02-02  4:11 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-mm
  Cc: Anshuman Khandual, Catalin Marinas, Will Deacon, Ard Biesheuvel,
	Mark Rutland, James Morse, Robin Murphy, Jérôme Glisse,
	Dan Williams, David Hildenbrand, Mike Rapoport

There are multiple instances of pfn_to_section_nr() and __pfn_to_section()
when CONFIG_SPARSEMEM is enabled. This can be optimized if memory section
is fetched earlier. This replaces the open coded PFN and ADDR conversion
with PFN_PHYS() and PHYS_PFN() helpers. While there, also add a comment.
This does not cause any functional change.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/mm/init.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 1141075e4d53..5d8fd5360a68 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -219,16 +219,25 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
 
 int pfn_valid(unsigned long pfn)
 {
-	phys_addr_t addr = pfn << PAGE_SHIFT;
+	phys_addr_t addr = PFN_PHYS(pfn);
 
-	if ((addr >> PAGE_SHIFT) != pfn)
+	/*
+	 * Ensure the upper PAGE_SHIFT bits are clear in the
+	 * pfn. Else it might lead to false positives when
+	 * some of the upper bits are set, but the lower bits
+	 * match a valid pfn.
+	 */
+	if (PHYS_PFN(addr) != pfn)
 		return 0;
 
 #ifdef CONFIG_SPARSEMEM
+{
+	struct mem_section *ms = __pfn_to_section(pfn);
+
 	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
 		return 0;
 
-	if (!valid_section(__pfn_to_section(pfn)))
+	if (!valid_section(ms))
 		return 0;
 
 	/*
@@ -240,8 +249,9 @@ int pfn_valid(unsigned long pfn)
 	 * memory sections covering all of hotplug memory including
 	 * both normal and ZONE_DEVICE based.
 	 */
-	if (!early_section(__pfn_to_section(pfn)))
-		return pfn_section_valid(__pfn_to_section(pfn), pfn);
+	if (!early_section(ms))
+		return pfn_section_valid(ms, pfn);
+}
 #endif
 	return memblock_is_map_memory(addr);
 }
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH V2 2/2] arm64/mm: Reorganize pfn_valid()
@ 2021-02-02  4:11   ` Anshuman Khandual
  0 siblings, 0 replies; 46+ messages in thread
From: Anshuman Khandual @ 2021-02-02  4:11 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-mm
  Cc: Mark Rutland, Anshuman Khandual, Catalin Marinas,
	David Hildenbrand, Robin Murphy, Mike Rapoport,
	Jérôme Glisse, James Morse, Dan Williams, Will Deacon,
	Ard Biesheuvel

There are multiple instances of pfn_to_section_nr() and __pfn_to_section()
when CONFIG_SPARSEMEM is enabled. This can be optimized if memory section
is fetched earlier. This replaces the open coded PFN and ADDR conversion
with PFN_PHYS() and PHYS_PFN() helpers. While there, also add a comment.
This does not cause any functional change.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/mm/init.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 1141075e4d53..5d8fd5360a68 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -219,16 +219,25 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
 
 int pfn_valid(unsigned long pfn)
 {
-	phys_addr_t addr = pfn << PAGE_SHIFT;
+	phys_addr_t addr = PFN_PHYS(pfn);
 
-	if ((addr >> PAGE_SHIFT) != pfn)
+	/*
+	 * Ensure the upper PAGE_SHIFT bits are clear in the
+	 * pfn. Else it might lead to false positives when
+	 * some of the upper bits are set, but the lower bits
+	 * match a valid pfn.
+	 */
+	if (PHYS_PFN(addr) != pfn)
 		return 0;
 
 #ifdef CONFIG_SPARSEMEM
+{
+	struct mem_section *ms = __pfn_to_section(pfn);
+
 	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
 		return 0;
 
-	if (!valid_section(__pfn_to_section(pfn)))
+	if (!valid_section(ms))
 		return 0;
 
 	/*
@@ -240,8 +249,9 @@ int pfn_valid(unsigned long pfn)
 	 * memory sections covering all of hotplug memory including
 	 * both normal and ZONE_DEVICE based.
 	 */
-	if (!early_section(__pfn_to_section(pfn)))
-		return pfn_section_valid(__pfn_to_section(pfn), pfn);
+	if (!early_section(ms))
+		return pfn_section_valid(ms, pfn);
+}
 #endif
 	return memblock_is_map_memory(addr);
 }
-- 
2.20.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 2/2] arm64/mm: Reorganize pfn_valid()
  2021-02-02  4:11   ` Anshuman Khandual
@ 2021-02-02  8:26     ` David Hildenbrand
  -1 siblings, 0 replies; 46+ messages in thread
From: David Hildenbrand @ 2021-02-02  8:26 UTC (permalink / raw)
  To: Anshuman Khandual, linux-arm-kernel, linux-kernel, linux-mm
  Cc: Catalin Marinas, Will Deacon, Ard Biesheuvel, Mark Rutland,
	James Morse, Robin Murphy, Jérôme Glisse, Dan Williams,
	Mike Rapoport

On 02.02.21 05:11, Anshuman Khandual wrote:
> There are multiple instances of pfn_to_section_nr() and __pfn_to_section()
> when CONFIG_SPARSEMEM is enabled. This can be optimized if memory section
> is fetched earlier. This replaces the open coded PFN and ADDR conversion
> with PFN_PHYS() and PHYS_PFN() helpers. While there, also add a comment.
> This does not cause any functional change.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Ard Biesheuvel <ardb@kernel.org>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>   arch/arm64/mm/init.c | 20 +++++++++++++++-----
>   1 file changed, 15 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 1141075e4d53..5d8fd5360a68 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -219,16 +219,25 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>   
>   int pfn_valid(unsigned long pfn)
>   {
> -	phys_addr_t addr = pfn << PAGE_SHIFT;
> +	phys_addr_t addr = PFN_PHYS(pfn);
>   
> -	if ((addr >> PAGE_SHIFT) != pfn)
> +	/*
> +	 * Ensure the upper PAGE_SHIFT bits are clear in the
> +	 * pfn. Else it might lead to false positives when
> +	 * some of the upper bits are set, but the lower bits
> +	 * match a valid pfn.
> +	 */
> +	if (PHYS_PFN(addr) != pfn)
>   		return 0;
>   
>   #ifdef CONFIG_SPARSEMEM
> +{
> +	struct mem_section *ms = __pfn_to_section(pfn);
> +
>   	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>   		return 0;
>   
> -	if (!valid_section(__pfn_to_section(pfn)))
> +	if (!valid_section(ms))
>   		return 0;
>   
>   	/*
> @@ -240,8 +249,9 @@ int pfn_valid(unsigned long pfn)
>   	 * memory sections covering all of hotplug memory including
>   	 * both normal and ZONE_DEVICE based.
>   	 */
> -	if (!early_section(__pfn_to_section(pfn)))
> -		return pfn_section_valid(__pfn_to_section(pfn), pfn);
> +	if (!early_section(ms))
> +		return pfn_section_valid(ms, pfn);
> +}
>   #endif
>   	return memblock_is_map_memory(addr);
>   }
> 

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 2/2] arm64/mm: Reorganize pfn_valid()
@ 2021-02-02  8:26     ` David Hildenbrand
  0 siblings, 0 replies; 46+ messages in thread
From: David Hildenbrand @ 2021-02-02  8:26 UTC (permalink / raw)
  To: Anshuman Khandual, linux-arm-kernel, linux-kernel, linux-mm
  Cc: Mark Rutland, Will Deacon, Catalin Marinas, Mike Rapoport,
	Jérôme Glisse, James Morse, Dan Williams, Robin Murphy,
	Ard Biesheuvel

On 02.02.21 05:11, Anshuman Khandual wrote:
> There are multiple instances of pfn_to_section_nr() and __pfn_to_section()
> when CONFIG_SPARSEMEM is enabled. This can be optimized if memory section
> is fetched earlier. This replaces the open coded PFN and ADDR conversion
> with PFN_PHYS() and PHYS_PFN() helpers. While there, also add a comment.
> This does not cause any functional change.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Ard Biesheuvel <ardb@kernel.org>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>   arch/arm64/mm/init.c | 20 +++++++++++++++-----
>   1 file changed, 15 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 1141075e4d53..5d8fd5360a68 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -219,16 +219,25 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>   
>   int pfn_valid(unsigned long pfn)
>   {
> -	phys_addr_t addr = pfn << PAGE_SHIFT;
> +	phys_addr_t addr = PFN_PHYS(pfn);
>   
> -	if ((addr >> PAGE_SHIFT) != pfn)
> +	/*
> +	 * Ensure the upper PAGE_SHIFT bits are clear in the
> +	 * pfn. Else it might lead to false positives when
> +	 * some of the upper bits are set, but the lower bits
> +	 * match a valid pfn.
> +	 */
> +	if (PHYS_PFN(addr) != pfn)
>   		return 0;
>   
>   #ifdef CONFIG_SPARSEMEM
> +{
> +	struct mem_section *ms = __pfn_to_section(pfn);
> +
>   	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>   		return 0;
>   
> -	if (!valid_section(__pfn_to_section(pfn)))
> +	if (!valid_section(ms))
>   		return 0;
>   
>   	/*
> @@ -240,8 +249,9 @@ int pfn_valid(unsigned long pfn)
>   	 * memory sections covering all of hotplug memory including
>   	 * both normal and ZONE_DEVICE based.
>   	 */
> -	if (!early_section(__pfn_to_section(pfn)))
> -		return pfn_section_valid(__pfn_to_section(pfn), pfn);
> +	if (!early_section(ms))
> +		return pfn_section_valid(ms, pfn);
> +}
>   #endif
>   	return memblock_is_map_memory(addr);
>   }
> 

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
  2021-02-02  4:11   ` Anshuman Khandual
@ 2021-02-02 12:32     ` Will Deacon
  -1 siblings, 0 replies; 46+ messages in thread
From: Will Deacon @ 2021-02-02 12:32 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, linux-mm, Catalin Marinas,
	Ard Biesheuvel, Mark Rutland, James Morse, Robin Murphy,
	Jérôme Glisse, Dan Williams, David Hildenbrand,
	Mike Rapoport

On Tue, Feb 02, 2021 at 09:41:53AM +0530, Anshuman Khandual wrote:
> pfn_valid() validates a pfn but basically it checks for a valid struct page
> backing for that pfn. It should always return positive for memory ranges
> backed with struct page mapping. But currently pfn_valid() fails for all
> ZONE_DEVICE based memory types even though they have struct page mapping.
> 
> pfn_valid() asserts that there is a memblock entry for a given pfn without
> MEMBLOCK_NOMAP flag being set. The problem with ZONE_DEVICE based memory is
> that they do not have memblock entries. Hence memblock_is_map_memory() will
> invariably fail via memblock_search() for a ZONE_DEVICE based address. This
> eventually fails pfn_valid() which is wrong. memblock_is_map_memory() needs
> to be skipped for such memory ranges. As ZONE_DEVICE memory gets hotplugged
> into the system via memremap_pages() called from a driver, their respective
> memory sections will not have SECTION_IS_EARLY set.
> 
> Normal hotplug memory will never have MEMBLOCK_NOMAP set in their memblock
> regions. Because the flag MEMBLOCK_NOMAP was specifically designed and set
> for firmware reserved memory regions. memblock_is_map_memory() can just be
> skipped as its always going to be positive and that will be an optimization
> for the normal hotplug memory. Like ZONE_DEVICE based memory, all normal
> hotplugged memory too will not have SECTION_IS_EARLY set for their sections
> 
> Skipping memblock_is_map_memory() for all non early memory sections would
> fix pfn_valid() problem for ZONE_DEVICE based memory and also improve its
> performance for normal hotplug memory as well.

Hmm. Although I follow your logic, this does seem to rely on an awful lot of
assumptions to continue to hold true as the kernel evolves. In particular,
how do we ensure that early sections are always fully backed with
'struct page's and never contain any nomap entries? What's to stop somebody
changing that and quietly breaking our pfn_valid() implementation?

And to be clear, I'm not trying to say that this patch is broken. I'm just
trying to work out how on Earth we can maintain it!

Will

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
@ 2021-02-02 12:32     ` Will Deacon
  0 siblings, 0 replies; 46+ messages in thread
From: Will Deacon @ 2021-02-02 12:32 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Mark Rutland, David Hildenbrand, Catalin Marinas, linux-kernel,
	Mike Rapoport, linux-mm, Jérôme Glisse, James Morse,
	Dan Williams, Robin Murphy, Ard Biesheuvel, linux-arm-kernel

On Tue, Feb 02, 2021 at 09:41:53AM +0530, Anshuman Khandual wrote:
> pfn_valid() validates a pfn but basically it checks for a valid struct page
> backing for that pfn. It should always return positive for memory ranges
> backed with struct page mapping. But currently pfn_valid() fails for all
> ZONE_DEVICE based memory types even though they have struct page mapping.
> 
> pfn_valid() asserts that there is a memblock entry for a given pfn without
> MEMBLOCK_NOMAP flag being set. The problem with ZONE_DEVICE based memory is
> that they do not have memblock entries. Hence memblock_is_map_memory() will
> invariably fail via memblock_search() for a ZONE_DEVICE based address. This
> eventually fails pfn_valid() which is wrong. memblock_is_map_memory() needs
> to be skipped for such memory ranges. As ZONE_DEVICE memory gets hotplugged
> into the system via memremap_pages() called from a driver, their respective
> memory sections will not have SECTION_IS_EARLY set.
> 
> Normal hotplug memory will never have MEMBLOCK_NOMAP set in their memblock
> regions. Because the flag MEMBLOCK_NOMAP was specifically designed and set
> for firmware reserved memory regions. memblock_is_map_memory() can just be
> skipped as its always going to be positive and that will be an optimization
> for the normal hotplug memory. Like ZONE_DEVICE based memory, all normal
> hotplugged memory too will not have SECTION_IS_EARLY set for their sections
> 
> Skipping memblock_is_map_memory() for all non early memory sections would
> fix pfn_valid() problem for ZONE_DEVICE based memory and also improve its
> performance for normal hotplug memory as well.

Hmm. Although I follow your logic, this does seem to rely on an awful lot of
assumptions to continue to hold true as the kernel evolves. In particular,
how do we ensure that early sections are always fully backed with
'struct page's and never contain any nomap entries? What's to stop somebody
changing that and quietly breaking our pfn_valid() implementation?

And to be clear, I'm not trying to say that this patch is broken. I'm just
trying to work out how on Earth we can maintain it!

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
  2021-02-02 12:32     ` Will Deacon
@ 2021-02-02 12:35       ` Will Deacon
  -1 siblings, 0 replies; 46+ messages in thread
From: Will Deacon @ 2021-02-02 12:35 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, linux-mm, Catalin Marinas,
	Ard Biesheuvel, Mark Rutland, James Morse, Robin Murphy,
	Jérôme Glisse, Dan Williams, David Hildenbrand,
	Mike Rapoport

On Tue, Feb 02, 2021 at 12:32:15PM +0000, Will Deacon wrote:
> On Tue, Feb 02, 2021 at 09:41:53AM +0530, Anshuman Khandual wrote:
> > pfn_valid() validates a pfn but basically it checks for a valid struct page
> > backing for that pfn. It should always return positive for memory ranges
> > backed with struct page mapping. But currently pfn_valid() fails for all
> > ZONE_DEVICE based memory types even though they have struct page mapping.
> > 
> > pfn_valid() asserts that there is a memblock entry for a given pfn without
> > MEMBLOCK_NOMAP flag being set. The problem with ZONE_DEVICE based memory is
> > that they do not have memblock entries. Hence memblock_is_map_memory() will
> > invariably fail via memblock_search() for a ZONE_DEVICE based address. This
> > eventually fails pfn_valid() which is wrong. memblock_is_map_memory() needs
> > to be skipped for such memory ranges. As ZONE_DEVICE memory gets hotplugged
> > into the system via memremap_pages() called from a driver, their respective
> > memory sections will not have SECTION_IS_EARLY set.
> > 
> > Normal hotplug memory will never have MEMBLOCK_NOMAP set in their memblock
> > regions. Because the flag MEMBLOCK_NOMAP was specifically designed and set
> > for firmware reserved memory regions. memblock_is_map_memory() can just be
> > skipped as its always going to be positive and that will be an optimization
> > for the normal hotplug memory. Like ZONE_DEVICE based memory, all normal
> > hotplugged memory too will not have SECTION_IS_EARLY set for their sections
> > 
> > Skipping memblock_is_map_memory() for all non early memory sections would
> > fix pfn_valid() problem for ZONE_DEVICE based memory and also improve its
> > performance for normal hotplug memory as well.
> 
> Hmm. Although I follow your logic, this does seem to rely on an awful lot of
> assumptions to continue to hold true as the kernel evolves. In particular,
> how do we ensure that early sections are always fully backed with

Sorry, typo here:       ^^^ should be *non-early* sections.

Will

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
@ 2021-02-02 12:35       ` Will Deacon
  0 siblings, 0 replies; 46+ messages in thread
From: Will Deacon @ 2021-02-02 12:35 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Mark Rutland, David Hildenbrand, Catalin Marinas, linux-kernel,
	Mike Rapoport, linux-mm, Jérôme Glisse, James Morse,
	Dan Williams, Robin Murphy, Ard Biesheuvel, linux-arm-kernel

On Tue, Feb 02, 2021 at 12:32:15PM +0000, Will Deacon wrote:
> On Tue, Feb 02, 2021 at 09:41:53AM +0530, Anshuman Khandual wrote:
> > pfn_valid() validates a pfn but basically it checks for a valid struct page
> > backing for that pfn. It should always return positive for memory ranges
> > backed with struct page mapping. But currently pfn_valid() fails for all
> > ZONE_DEVICE based memory types even though they have struct page mapping.
> > 
> > pfn_valid() asserts that there is a memblock entry for a given pfn without
> > MEMBLOCK_NOMAP flag being set. The problem with ZONE_DEVICE based memory is
> > that they do not have memblock entries. Hence memblock_is_map_memory() will
> > invariably fail via memblock_search() for a ZONE_DEVICE based address. This
> > eventually fails pfn_valid() which is wrong. memblock_is_map_memory() needs
> > to be skipped for such memory ranges. As ZONE_DEVICE memory gets hotplugged
> > into the system via memremap_pages() called from a driver, their respective
> > memory sections will not have SECTION_IS_EARLY set.
> > 
> > Normal hotplug memory will never have MEMBLOCK_NOMAP set in their memblock
> > regions. Because the flag MEMBLOCK_NOMAP was specifically designed and set
> > for firmware reserved memory regions. memblock_is_map_memory() can just be
> > skipped as its always going to be positive and that will be an optimization
> > for the normal hotplug memory. Like ZONE_DEVICE based memory, all normal
> > hotplugged memory too will not have SECTION_IS_EARLY set for their sections
> > 
> > Skipping memblock_is_map_memory() for all non early memory sections would
> > fix pfn_valid() problem for ZONE_DEVICE based memory and also improve its
> > performance for normal hotplug memory as well.
> 
> Hmm. Although I follow your logic, this does seem to rely on an awful lot of
> assumptions to continue to hold true as the kernel evolves. In particular,
> how do we ensure that early sections are always fully backed with

Sorry, typo here:       ^^^ should be *non-early* sections.

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
  2021-02-02 12:35       ` Will Deacon
@ 2021-02-02 12:39         ` David Hildenbrand
  -1 siblings, 0 replies; 46+ messages in thread
From: David Hildenbrand @ 2021-02-02 12:39 UTC (permalink / raw)
  To: Will Deacon, Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, linux-mm, Catalin Marinas,
	Ard Biesheuvel, Mark Rutland, James Morse, Robin Murphy,
	Jérôme Glisse, Dan Williams, Mike Rapoport

On 02.02.21 13:35, Will Deacon wrote:
> On Tue, Feb 02, 2021 at 12:32:15PM +0000, Will Deacon wrote:
>> On Tue, Feb 02, 2021 at 09:41:53AM +0530, Anshuman Khandual wrote:
>>> pfn_valid() validates a pfn but basically it checks for a valid struct page
>>> backing for that pfn. It should always return positive for memory ranges
>>> backed with struct page mapping. But currently pfn_valid() fails for all
>>> ZONE_DEVICE based memory types even though they have struct page mapping.
>>>
>>> pfn_valid() asserts that there is a memblock entry for a given pfn without
>>> MEMBLOCK_NOMAP flag being set. The problem with ZONE_DEVICE based memory is
>>> that they do not have memblock entries. Hence memblock_is_map_memory() will
>>> invariably fail via memblock_search() for a ZONE_DEVICE based address. This
>>> eventually fails pfn_valid() which is wrong. memblock_is_map_memory() needs
>>> to be skipped for such memory ranges. As ZONE_DEVICE memory gets hotplugged
>>> into the system via memremap_pages() called from a driver, their respective
>>> memory sections will not have SECTION_IS_EARLY set.
>>>
>>> Normal hotplug memory will never have MEMBLOCK_NOMAP set in their memblock
>>> regions. Because the flag MEMBLOCK_NOMAP was specifically designed and set
>>> for firmware reserved memory regions. memblock_is_map_memory() can just be
>>> skipped as its always going to be positive and that will be an optimization
>>> for the normal hotplug memory. Like ZONE_DEVICE based memory, all normal
>>> hotplugged memory too will not have SECTION_IS_EARLY set for their sections
>>>
>>> Skipping memblock_is_map_memory() for all non early memory sections would
>>> fix pfn_valid() problem for ZONE_DEVICE based memory and also improve its
>>> performance for normal hotplug memory as well.
>>
>> Hmm. Although I follow your logic, this does seem to rely on an awful lot of
>> assumptions to continue to hold true as the kernel evolves. In particular,
>> how do we ensure that early sections are always fully backed with
> 
> Sorry, typo here:       ^^^ should be *non-early* sections.

It might be a good idea to have a look at generic 
include/linux/mmzone.h:pfn_valid()

As I expressed already, long term we should really get rid of the arm64 
variant and rather special-case the generic one. Then we won't go out of 
sync - just as it happened with ZONE_DEVICE handling here.

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
@ 2021-02-02 12:39         ` David Hildenbrand
  0 siblings, 0 replies; 46+ messages in thread
From: David Hildenbrand @ 2021-02-02 12:39 UTC (permalink / raw)
  To: Will Deacon, Anshuman Khandual
  Cc: Mark Rutland, Catalin Marinas, linux-kernel, Mike Rapoport,
	linux-mm, Jérôme Glisse, James Morse, Dan Williams,
	Robin Murphy, Ard Biesheuvel, linux-arm-kernel

On 02.02.21 13:35, Will Deacon wrote:
> On Tue, Feb 02, 2021 at 12:32:15PM +0000, Will Deacon wrote:
>> On Tue, Feb 02, 2021 at 09:41:53AM +0530, Anshuman Khandual wrote:
>>> pfn_valid() validates a pfn but basically it checks for a valid struct page
>>> backing for that pfn. It should always return positive for memory ranges
>>> backed with struct page mapping. But currently pfn_valid() fails for all
>>> ZONE_DEVICE based memory types even though they have struct page mapping.
>>>
>>> pfn_valid() asserts that there is a memblock entry for a given pfn without
>>> MEMBLOCK_NOMAP flag being set. The problem with ZONE_DEVICE based memory is
>>> that they do not have memblock entries. Hence memblock_is_map_memory() will
>>> invariably fail via memblock_search() for a ZONE_DEVICE based address. This
>>> eventually fails pfn_valid() which is wrong. memblock_is_map_memory() needs
>>> to be skipped for such memory ranges. As ZONE_DEVICE memory gets hotplugged
>>> into the system via memremap_pages() called from a driver, their respective
>>> memory sections will not have SECTION_IS_EARLY set.
>>>
>>> Normal hotplug memory will never have MEMBLOCK_NOMAP set in their memblock
>>> regions. Because the flag MEMBLOCK_NOMAP was specifically designed and set
>>> for firmware reserved memory regions. memblock_is_map_memory() can just be
>>> skipped as its always going to be positive and that will be an optimization
>>> for the normal hotplug memory. Like ZONE_DEVICE based memory, all normal
>>> hotplugged memory too will not have SECTION_IS_EARLY set for their sections
>>>
>>> Skipping memblock_is_map_memory() for all non early memory sections would
>>> fix pfn_valid() problem for ZONE_DEVICE based memory and also improve its
>>> performance for normal hotplug memory as well.
>>
>> Hmm. Although I follow your logic, this does seem to rely on an awful lot of
>> assumptions to continue to hold true as the kernel evolves. In particular,
>> how do we ensure that early sections are always fully backed with
> 
> Sorry, typo here:       ^^^ should be *non-early* sections.

It might be a good idea to have a look at generic 
include/linux/mmzone.h:pfn_valid()

As I expressed already, long term we should really get rid of the arm64 
variant and rather special-case the generic one. Then we won't go out of 
sync - just as it happened with ZONE_DEVICE handling here.

-- 
Thanks,

David / dhildenb


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
  2021-02-02 12:39         ` David Hildenbrand
@ 2021-02-02 12:51           ` Will Deacon
  -1 siblings, 0 replies; 46+ messages in thread
From: Will Deacon @ 2021-02-02 12:51 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Anshuman Khandual, linux-arm-kernel, linux-kernel, linux-mm,
	Catalin Marinas, Ard Biesheuvel, Mark Rutland, James Morse,
	Robin Murphy, Jérôme Glisse, Dan Williams,
	Mike Rapoport

On Tue, Feb 02, 2021 at 01:39:29PM +0100, David Hildenbrand wrote:
> On 02.02.21 13:35, Will Deacon wrote:
> > On Tue, Feb 02, 2021 at 12:32:15PM +0000, Will Deacon wrote:
> > > On Tue, Feb 02, 2021 at 09:41:53AM +0530, Anshuman Khandual wrote:
> > > > pfn_valid() validates a pfn but basically it checks for a valid struct page
> > > > backing for that pfn. It should always return positive for memory ranges
> > > > backed with struct page mapping. But currently pfn_valid() fails for all
> > > > ZONE_DEVICE based memory types even though they have struct page mapping.
> > > > 
> > > > pfn_valid() asserts that there is a memblock entry for a given pfn without
> > > > MEMBLOCK_NOMAP flag being set. The problem with ZONE_DEVICE based memory is
> > > > that they do not have memblock entries. Hence memblock_is_map_memory() will
> > > > invariably fail via memblock_search() for a ZONE_DEVICE based address. This
> > > > eventually fails pfn_valid() which is wrong. memblock_is_map_memory() needs
> > > > to be skipped for such memory ranges. As ZONE_DEVICE memory gets hotplugged
> > > > into the system via memremap_pages() called from a driver, their respective
> > > > memory sections will not have SECTION_IS_EARLY set.
> > > > 
> > > > Normal hotplug memory will never have MEMBLOCK_NOMAP set in their memblock
> > > > regions. Because the flag MEMBLOCK_NOMAP was specifically designed and set
> > > > for firmware reserved memory regions. memblock_is_map_memory() can just be
> > > > skipped as its always going to be positive and that will be an optimization
> > > > for the normal hotplug memory. Like ZONE_DEVICE based memory, all normal
> > > > hotplugged memory too will not have SECTION_IS_EARLY set for their sections
> > > > 
> > > > Skipping memblock_is_map_memory() for all non early memory sections would
> > > > fix pfn_valid() problem for ZONE_DEVICE based memory and also improve its
> > > > performance for normal hotplug memory as well.
> > > 
> > > Hmm. Although I follow your logic, this does seem to rely on an awful lot of
> > > assumptions to continue to hold true as the kernel evolves. In particular,
> > > how do we ensure that early sections are always fully backed with
> > 
> > Sorry, typo here:       ^^^ should be *non-early* sections.
> 
> It might be a good idea to have a look at generic
> include/linux/mmzone.h:pfn_valid()

The generic implementation already makes assumptions that aren't true on
arm64, so that's why we've ended up with our own implementation. But the
patches here put us in a position where I worry that pfn_valid() may return
'true' in future for cases where the underlying struct page is either
non-existent or bogus, and debugging those failures really sucks. We had a
raft of those back when NOMAP was introduced and I don't want to re-live
that experience.

> As I expressed already, long term we should really get rid of the arm64
> variant and rather special-case the generic one. Then we won't go out of
> sync - just as it happened with ZONE_DEVICE handling here.

Why does this have to be long term? This ZONE_DEVICE stuff could be the
carrot on the stick :)

Will

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
@ 2021-02-02 12:51           ` Will Deacon
  0 siblings, 0 replies; 46+ messages in thread
From: Will Deacon @ 2021-02-02 12:51 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Mark Rutland, Anshuman Khandual, Catalin Marinas, linux-kernel,
	Mike Rapoport, linux-mm, Jérôme Glisse, James Morse,
	Dan Williams, Robin Murphy, Ard Biesheuvel, linux-arm-kernel

On Tue, Feb 02, 2021 at 01:39:29PM +0100, David Hildenbrand wrote:
> On 02.02.21 13:35, Will Deacon wrote:
> > On Tue, Feb 02, 2021 at 12:32:15PM +0000, Will Deacon wrote:
> > > On Tue, Feb 02, 2021 at 09:41:53AM +0530, Anshuman Khandual wrote:
> > > > pfn_valid() validates a pfn but basically it checks for a valid struct page
> > > > backing for that pfn. It should always return positive for memory ranges
> > > > backed with struct page mapping. But currently pfn_valid() fails for all
> > > > ZONE_DEVICE based memory types even though they have struct page mapping.
> > > > 
> > > > pfn_valid() asserts that there is a memblock entry for a given pfn without
> > > > MEMBLOCK_NOMAP flag being set. The problem with ZONE_DEVICE based memory is
> > > > that they do not have memblock entries. Hence memblock_is_map_memory() will
> > > > invariably fail via memblock_search() for a ZONE_DEVICE based address. This
> > > > eventually fails pfn_valid() which is wrong. memblock_is_map_memory() needs
> > > > to be skipped for such memory ranges. As ZONE_DEVICE memory gets hotplugged
> > > > into the system via memremap_pages() called from a driver, their respective
> > > > memory sections will not have SECTION_IS_EARLY set.
> > > > 
> > > > Normal hotplug memory will never have MEMBLOCK_NOMAP set in their memblock
> > > > regions. Because the flag MEMBLOCK_NOMAP was specifically designed and set
> > > > for firmware reserved memory regions. memblock_is_map_memory() can just be
> > > > skipped as its always going to be positive and that will be an optimization
> > > > for the normal hotplug memory. Like ZONE_DEVICE based memory, all normal
> > > > hotplugged memory too will not have SECTION_IS_EARLY set for their sections
> > > > 
> > > > Skipping memblock_is_map_memory() for all non early memory sections would
> > > > fix pfn_valid() problem for ZONE_DEVICE based memory and also improve its
> > > > performance for normal hotplug memory as well.
> > > 
> > > Hmm. Although I follow your logic, this does seem to rely on an awful lot of
> > > assumptions to continue to hold true as the kernel evolves. In particular,
> > > how do we ensure that early sections are always fully backed with
> > 
> > Sorry, typo here:       ^^^ should be *non-early* sections.
> 
> It might be a good idea to have a look at generic
> include/linux/mmzone.h:pfn_valid()

The generic implementation already makes assumptions that aren't true on
arm64, so that's why we've ended up with our own implementation. But the
patches here put us in a position where I worry that pfn_valid() may return
'true' in future for cases where the underlying struct page is either
non-existent or bogus, and debugging those failures really sucks. We had a
raft of those back when NOMAP was introduced and I don't want to re-live
that experience.

> As I expressed already, long term we should really get rid of the arm64
> variant and rather special-case the generic one. Then we won't go out of
> sync - just as it happened with ZONE_DEVICE handling here.

Why does this have to be long term? This ZONE_DEVICE stuff could be the
carrot on the stick :)

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
  2021-02-02 12:51           ` Will Deacon
@ 2021-02-02 12:56             ` David Hildenbrand
  -1 siblings, 0 replies; 46+ messages in thread
From: David Hildenbrand @ 2021-02-02 12:56 UTC (permalink / raw)
  To: Will Deacon
  Cc: Anshuman Khandual, linux-arm-kernel, linux-kernel, linux-mm,
	Catalin Marinas, Ard Biesheuvel, Mark Rutland, James Morse,
	Robin Murphy, Jérôme Glisse, Dan Williams,
	Mike Rapoport

On 02.02.21 13:51, Will Deacon wrote:
> On Tue, Feb 02, 2021 at 01:39:29PM +0100, David Hildenbrand wrote:
>> On 02.02.21 13:35, Will Deacon wrote:
>>> On Tue, Feb 02, 2021 at 12:32:15PM +0000, Will Deacon wrote:
>>>> On Tue, Feb 02, 2021 at 09:41:53AM +0530, Anshuman Khandual wrote:
>>>>> pfn_valid() validates a pfn but basically it checks for a valid struct page
>>>>> backing for that pfn. It should always return positive for memory ranges
>>>>> backed with struct page mapping. But currently pfn_valid() fails for all
>>>>> ZONE_DEVICE based memory types even though they have struct page mapping.
>>>>>
>>>>> pfn_valid() asserts that there is a memblock entry for a given pfn without
>>>>> MEMBLOCK_NOMAP flag being set. The problem with ZONE_DEVICE based memory is
>>>>> that they do not have memblock entries. Hence memblock_is_map_memory() will
>>>>> invariably fail via memblock_search() for a ZONE_DEVICE based address. This
>>>>> eventually fails pfn_valid() which is wrong. memblock_is_map_memory() needs
>>>>> to be skipped for such memory ranges. As ZONE_DEVICE memory gets hotplugged
>>>>> into the system via memremap_pages() called from a driver, their respective
>>>>> memory sections will not have SECTION_IS_EARLY set.
>>>>>
>>>>> Normal hotplug memory will never have MEMBLOCK_NOMAP set in their memblock
>>>>> regions. Because the flag MEMBLOCK_NOMAP was specifically designed and set
>>>>> for firmware reserved memory regions. memblock_is_map_memory() can just be
>>>>> skipped as its always going to be positive and that will be an optimization
>>>>> for the normal hotplug memory. Like ZONE_DEVICE based memory, all normal
>>>>> hotplugged memory too will not have SECTION_IS_EARLY set for their sections
>>>>>
>>>>> Skipping memblock_is_map_memory() for all non early memory sections would
>>>>> fix pfn_valid() problem for ZONE_DEVICE based memory and also improve its
>>>>> performance for normal hotplug memory as well.
>>>>
>>>> Hmm. Although I follow your logic, this does seem to rely on an awful lot of
>>>> assumptions to continue to hold true as the kernel evolves. In particular,
>>>> how do we ensure that early sections are always fully backed with
>>>
>>> Sorry, typo here:       ^^^ should be *non-early* sections.
>>
>> It might be a good idea to have a look at generic
>> include/linux/mmzone.h:pfn_valid()
> 
> The generic implementation already makes assumptions that aren't true on
> arm64, so that's why we've ended up with our own implementation. But the
> patches here put us in a position where I worry that pfn_valid() may return
> 'true' in future for cases where the underlying struct page is either
> non-existent or bogus, and debugging those failures really sucks. We had a
> raft of those back when NOMAP was introduced and I don't want to re-live
> that experience.

Yeah, and I agree when it comes to boot mem. However, the way generic 
memory hotplug/memremap infrastructure (->!early sections) works does 
not allow for such special cases you mention and would break quite some 
other code if messed up. So I wouldn't worry about that part too much 
for now.

> 
>> As I expressed already, long term we should really get rid of the arm64
>> variant and rather special-case the generic one. Then we won't go out of
>> sync - just as it happened with ZONE_DEVICE handling here.
> 
> Why does this have to be long term? This ZONE_DEVICE stuff could be the
> carrot on the stick :)

Yes, I suggested to do it now, but Anshuman convinced me that doing a 
simple fix upfront might be cleaner --- for example when it comes to 
backporting :)

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
@ 2021-02-02 12:56             ` David Hildenbrand
  0 siblings, 0 replies; 46+ messages in thread
From: David Hildenbrand @ 2021-02-02 12:56 UTC (permalink / raw)
  To: Will Deacon
  Cc: Mark Rutland, Anshuman Khandual, Catalin Marinas, linux-kernel,
	Mike Rapoport, linux-mm, Jérôme Glisse, James Morse,
	Dan Williams, Robin Murphy, Ard Biesheuvel, linux-arm-kernel

On 02.02.21 13:51, Will Deacon wrote:
> On Tue, Feb 02, 2021 at 01:39:29PM +0100, David Hildenbrand wrote:
>> On 02.02.21 13:35, Will Deacon wrote:
>>> On Tue, Feb 02, 2021 at 12:32:15PM +0000, Will Deacon wrote:
>>>> On Tue, Feb 02, 2021 at 09:41:53AM +0530, Anshuman Khandual wrote:
>>>>> pfn_valid() validates a pfn but basically it checks for a valid struct page
>>>>> backing for that pfn. It should always return positive for memory ranges
>>>>> backed with struct page mapping. But currently pfn_valid() fails for all
>>>>> ZONE_DEVICE based memory types even though they have struct page mapping.
>>>>>
>>>>> pfn_valid() asserts that there is a memblock entry for a given pfn without
>>>>> MEMBLOCK_NOMAP flag being set. The problem with ZONE_DEVICE based memory is
>>>>> that they do not have memblock entries. Hence memblock_is_map_memory() will
>>>>> invariably fail via memblock_search() for a ZONE_DEVICE based address. This
>>>>> eventually fails pfn_valid() which is wrong. memblock_is_map_memory() needs
>>>>> to be skipped for such memory ranges. As ZONE_DEVICE memory gets hotplugged
>>>>> into the system via memremap_pages() called from a driver, their respective
>>>>> memory sections will not have SECTION_IS_EARLY set.
>>>>>
>>>>> Normal hotplug memory will never have MEMBLOCK_NOMAP set in their memblock
>>>>> regions. Because the flag MEMBLOCK_NOMAP was specifically designed and set
>>>>> for firmware reserved memory regions. memblock_is_map_memory() can just be
>>>>> skipped as its always going to be positive and that will be an optimization
>>>>> for the normal hotplug memory. Like ZONE_DEVICE based memory, all normal
>>>>> hotplugged memory too will not have SECTION_IS_EARLY set for their sections
>>>>>
>>>>> Skipping memblock_is_map_memory() for all non early memory sections would
>>>>> fix pfn_valid() problem for ZONE_DEVICE based memory and also improve its
>>>>> performance for normal hotplug memory as well.
>>>>
>>>> Hmm. Although I follow your logic, this does seem to rely on an awful lot of
>>>> assumptions to continue to hold true as the kernel evolves. In particular,
>>>> how do we ensure that early sections are always fully backed with
>>>
>>> Sorry, typo here:       ^^^ should be *non-early* sections.
>>
>> It might be a good idea to have a look at generic
>> include/linux/mmzone.h:pfn_valid()
> 
> The generic implementation already makes assumptions that aren't true on
> arm64, so that's why we've ended up with our own implementation. But the
> patches here put us in a position where I worry that pfn_valid() may return
> 'true' in future for cases where the underlying struct page is either
> non-existent or bogus, and debugging those failures really sucks. We had a
> raft of those back when NOMAP was introduced and I don't want to re-live
> that experience.

Yeah, and I agree when it comes to boot mem. However, the way generic 
memory hotplug/memremap infrastructure (->!early sections) works does 
not allow for such special cases you mention and would break quite some 
other code if messed up. So I wouldn't worry about that part too much 
for now.

> 
>> As I expressed already, long term we should really get rid of the arm64
>> variant and rather special-case the generic one. Then we won't go out of
>> sync - just as it happened with ZONE_DEVICE handling here.
> 
> Why does this have to be long term? This ZONE_DEVICE stuff could be the
> carrot on the stick :)

Yes, I suggested to do it now, but Anshuman convinced me that doing a 
simple fix upfront might be cleaner --- for example when it comes to 
backporting :)

-- 
Thanks,

David / dhildenb


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
  2021-02-02 12:56             ` David Hildenbrand
@ 2021-02-03  3:50               ` Anshuman Khandual
  -1 siblings, 0 replies; 46+ messages in thread
From: Anshuman Khandual @ 2021-02-03  3:50 UTC (permalink / raw)
  To: David Hildenbrand, Will Deacon
  Cc: linux-arm-kernel, linux-kernel, linux-mm, Catalin Marinas,
	Ard Biesheuvel, Mark Rutland, James Morse, Robin Murphy,
	Jérôme Glisse, Dan Williams, Mike Rapoport



On 2/2/21 6:26 PM, David Hildenbrand wrote:
> On 02.02.21 13:51, Will Deacon wrote:
>> On Tue, Feb 02, 2021 at 01:39:29PM +0100, David Hildenbrand wrote:
>>> On 02.02.21 13:35, Will Deacon wrote:
>>>> On Tue, Feb 02, 2021 at 12:32:15PM +0000, Will Deacon wrote:
>>>>> On Tue, Feb 02, 2021 at 09:41:53AM +0530, Anshuman Khandual wrote:
>>>>>> pfn_valid() validates a pfn but basically it checks for a valid struct page
>>>>>> backing for that pfn. It should always return positive for memory ranges
>>>>>> backed with struct page mapping. But currently pfn_valid() fails for all
>>>>>> ZONE_DEVICE based memory types even though they have struct page mapping.
>>>>>>
>>>>>> pfn_valid() asserts that there is a memblock entry for a given pfn without
>>>>>> MEMBLOCK_NOMAP flag being set. The problem with ZONE_DEVICE based memory is
>>>>>> that they do not have memblock entries. Hence memblock_is_map_memory() will
>>>>>> invariably fail via memblock_search() for a ZONE_DEVICE based address. This
>>>>>> eventually fails pfn_valid() which is wrong. memblock_is_map_memory() needs
>>>>>> to be skipped for such memory ranges. As ZONE_DEVICE memory gets hotplugged
>>>>>> into the system via memremap_pages() called from a driver, their respective
>>>>>> memory sections will not have SECTION_IS_EARLY set.
>>>>>>
>>>>>> Normal hotplug memory will never have MEMBLOCK_NOMAP set in their memblock
>>>>>> regions. Because the flag MEMBLOCK_NOMAP was specifically designed and set
>>>>>> for firmware reserved memory regions. memblock_is_map_memory() can just be
>>>>>> skipped as its always going to be positive and that will be an optimization
>>>>>> for the normal hotplug memory. Like ZONE_DEVICE based memory, all normal
>>>>>> hotplugged memory too will not have SECTION_IS_EARLY set for their sections
>>>>>>
>>>>>> Skipping memblock_is_map_memory() for all non early memory sections would
>>>>>> fix pfn_valid() problem for ZONE_DEVICE based memory and also improve its
>>>>>> performance for normal hotplug memory as well.
>>>>>
>>>>> Hmm. Although I follow your logic, this does seem to rely on an awful lot of
>>>>> assumptions to continue to hold true as the kernel evolves. In particular,
>>>>> how do we ensure that early sections are always fully backed with
>>>>
>>>> Sorry, typo here:       ^^^ should be *non-early* sections.
>>>
>>> It might be a good idea to have a look at generic
>>> include/linux/mmzone.h:pfn_valid()
>>
>> The generic implementation already makes assumptions that aren't true on
>> arm64, so that's why we've ended up with our own implementation. But the
>> patches here put us in a position where I worry that pfn_valid() may return
>> 'true' in future for cases where the underlying struct page is either
>> non-existent or bogus, and debugging those failures really sucks. We had a
>> raft of those back when NOMAP was introduced and I don't want to re-live
>> that experience.
> 
> Yeah, and I agree when it comes to boot mem. However, the way generic memory hotplug/memremap infrastructure (->!early sections) works does not allow for such special cases you mention and would break quite some other code if messed up. So I wouldn't worry about that part too much for now.

Agreed.

> 
>>
>>> As I expressed already, long term we should really get rid of the arm64
>>> variant and rather special-case the generic one. Then we won't go out of
>>> sync - just as it happened with ZONE_DEVICE handling here.
>>
>> Why does this have to be long term? This ZONE_DEVICE stuff could be the
>> carrot on the stick :)
> 
> Yes, I suggested to do it now, but Anshuman convinced me that doing a simple fix upfront might be cleaner --- for example when it comes to backporting :)

Right. The current pfn_valid() breaks for ZONE_DEVICE memory and this fixes
the problem in the present context which can be easily backported if required.

Changing or rather overhauling the generic code with new configs as proposed
earlier (which I am planning to work on subsequently) would definitely be an
improvement for the current pfn_valid() situation in terms of maintainability
but then it should not stop us from fixing the problem now.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
@ 2021-02-03  3:50               ` Anshuman Khandual
  0 siblings, 0 replies; 46+ messages in thread
From: Anshuman Khandual @ 2021-02-03  3:50 UTC (permalink / raw)
  To: David Hildenbrand, Will Deacon
  Cc: Mark Rutland, Catalin Marinas, linux-kernel, Mike Rapoport,
	linux-mm, Jérôme Glisse, James Morse, Dan Williams,
	Robin Murphy, Ard Biesheuvel, linux-arm-kernel



On 2/2/21 6:26 PM, David Hildenbrand wrote:
> On 02.02.21 13:51, Will Deacon wrote:
>> On Tue, Feb 02, 2021 at 01:39:29PM +0100, David Hildenbrand wrote:
>>> On 02.02.21 13:35, Will Deacon wrote:
>>>> On Tue, Feb 02, 2021 at 12:32:15PM +0000, Will Deacon wrote:
>>>>> On Tue, Feb 02, 2021 at 09:41:53AM +0530, Anshuman Khandual wrote:
>>>>>> pfn_valid() validates a pfn but basically it checks for a valid struct page
>>>>>> backing for that pfn. It should always return positive for memory ranges
>>>>>> backed with struct page mapping. But currently pfn_valid() fails for all
>>>>>> ZONE_DEVICE based memory types even though they have struct page mapping.
>>>>>>
>>>>>> pfn_valid() asserts that there is a memblock entry for a given pfn without
>>>>>> MEMBLOCK_NOMAP flag being set. The problem with ZONE_DEVICE based memory is
>>>>>> that they do not have memblock entries. Hence memblock_is_map_memory() will
>>>>>> invariably fail via memblock_search() for a ZONE_DEVICE based address. This
>>>>>> eventually fails pfn_valid() which is wrong. memblock_is_map_memory() needs
>>>>>> to be skipped for such memory ranges. As ZONE_DEVICE memory gets hotplugged
>>>>>> into the system via memremap_pages() called from a driver, their respective
>>>>>> memory sections will not have SECTION_IS_EARLY set.
>>>>>>
>>>>>> Normal hotplug memory will never have MEMBLOCK_NOMAP set in their memblock
>>>>>> regions. Because the flag MEMBLOCK_NOMAP was specifically designed and set
>>>>>> for firmware reserved memory regions. memblock_is_map_memory() can just be
>>>>>> skipped as its always going to be positive and that will be an optimization
>>>>>> for the normal hotplug memory. Like ZONE_DEVICE based memory, all normal
>>>>>> hotplugged memory too will not have SECTION_IS_EARLY set for their sections
>>>>>>
>>>>>> Skipping memblock_is_map_memory() for all non early memory sections would
>>>>>> fix pfn_valid() problem for ZONE_DEVICE based memory and also improve its
>>>>>> performance for normal hotplug memory as well.
>>>>>
>>>>> Hmm. Although I follow your logic, this does seem to rely on an awful lot of
>>>>> assumptions to continue to hold true as the kernel evolves. In particular,
>>>>> how do we ensure that early sections are always fully backed with
>>>>
>>>> Sorry, typo here:       ^^^ should be *non-early* sections.
>>>
>>> It might be a good idea to have a look at generic
>>> include/linux/mmzone.h:pfn_valid()
>>
>> The generic implementation already makes assumptions that aren't true on
>> arm64, so that's why we've ended up with our own implementation. But the
>> patches here put us in a position where I worry that pfn_valid() may return
>> 'true' in future for cases where the underlying struct page is either
>> non-existent or bogus, and debugging those failures really sucks. We had a
>> raft of those back when NOMAP was introduced and I don't want to re-live
>> that experience.
> 
> Yeah, and I agree when it comes to boot mem. However, the way generic memory hotplug/memremap infrastructure (->!early sections) works does not allow for such special cases you mention and would break quite some other code if messed up. So I wouldn't worry about that part too much for now.

Agreed.

> 
>>
>>> As I expressed already, long term we should really get rid of the arm64
>>> variant and rather special-case the generic one. Then we won't go out of
>>> sync - just as it happened with ZONE_DEVICE handling here.
>>
>> Why does this have to be long term? This ZONE_DEVICE stuff could be the
>> carrot on the stick :)
> 
> Yes, I suggested to do it now, but Anshuman convinced me that doing a simple fix upfront might be cleaner --- for example when it comes to backporting :)

Right. The current pfn_valid() breaks for ZONE_DEVICE memory and this fixes
the problem in the present context which can be easily backported if required.

Changing or rather overhauling the generic code with new configs as proposed
earlier (which I am planning to work on subsequently) would definitely be an
improvement for the current pfn_valid() situation in terms of maintainability
but then it should not stop us from fixing the problem now.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 0/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
  2021-02-02  4:11 ` Anshuman Khandual
@ 2021-02-05 18:52   ` Will Deacon
  -1 siblings, 0 replies; 46+ messages in thread
From: Will Deacon @ 2021-02-05 18:52 UTC (permalink / raw)
  To: linux-mm, linux-arm-kernel, linux-kernel, Anshuman Khandual
  Cc: catalin.marinas, kernel-team, Will Deacon, David Hildenbrand,
	Mark Rutland, Robin Murphy, Ard Biesheuvel,
	Jérôme Glisse, Mike Rapoport, James Morse,
	Dan Williams

On Tue, 2 Feb 2021 09:41:52 +0530, Anshuman Khandual wrote:
> This series fixes pfn_valid() for ZONE_DEVICE based memory and also improves
> its performance for normal hotplug memory. While here, it also reorganizes
> pfn_valid() on CONFIG_SPARSEMEM. This series is based on v5.11-rc6.
> 
> Question - should pfn_section_valid() be tested both for boot and non boot
> memory as well ?
> 
> [...]

Applied to arm64 (for-next/mm), thanks!

[1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
      https://git.kernel.org/arm64/c/fccf0a3dfeaf
[2/2] arm64/mm: Reorganize pfn_valid()
      https://git.kernel.org/arm64/c/387f3531116e

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 0/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
@ 2021-02-05 18:52   ` Will Deacon
  0 siblings, 0 replies; 46+ messages in thread
From: Will Deacon @ 2021-02-05 18:52 UTC (permalink / raw)
  To: linux-mm, linux-arm-kernel, linux-kernel, Anshuman Khandual
  Cc: Mark Rutland, Will Deacon, David Hildenbrand, catalin.marinas,
	Robin Murphy, Mike Rapoport, Jérôme Glisse,
	James Morse, Dan Williams, kernel-team, Ard Biesheuvel

On Tue, 2 Feb 2021 09:41:52 +0530, Anshuman Khandual wrote:
> This series fixes pfn_valid() for ZONE_DEVICE based memory and also improves
> its performance for normal hotplug memory. While here, it also reorganizes
> pfn_valid() on CONFIG_SPARSEMEM. This series is based on v5.11-rc6.
> 
> Question - should pfn_section_valid() be tested both for boot and non boot
> memory as well ?
> 
> [...]

Applied to arm64 (for-next/mm), thanks!

[1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
      https://git.kernel.org/arm64/c/fccf0a3dfeaf
[2/2] arm64/mm: Reorganize pfn_valid()
      https://git.kernel.org/arm64/c/387f3531116e

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
  2021-02-03  3:50               ` Anshuman Khandual
@ 2021-02-05 18:55                 ` Will Deacon
  -1 siblings, 0 replies; 46+ messages in thread
From: Will Deacon @ 2021-02-05 18:55 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: David Hildenbrand, linux-arm-kernel, linux-kernel, linux-mm,
	Catalin Marinas, Ard Biesheuvel, Mark Rutland, James Morse,
	Robin Murphy, Jérôme Glisse, Dan Williams,
	Mike Rapoport

On Wed, Feb 03, 2021 at 09:20:39AM +0530, Anshuman Khandual wrote:
> On 2/2/21 6:26 PM, David Hildenbrand wrote:
> > On 02.02.21 13:51, Will Deacon wrote:
> >> On Tue, Feb 02, 2021 at 01:39:29PM +0100, David Hildenbrand wrote:
> >>> As I expressed already, long term we should really get rid of the arm64
> >>> variant and rather special-case the generic one. Then we won't go out of
> >>> sync - just as it happened with ZONE_DEVICE handling here.
> >>
> >> Why does this have to be long term? This ZONE_DEVICE stuff could be the
> >> carrot on the stick :)
> > 
> > Yes, I suggested to do it now, but Anshuman convinced me that doing a
> > simple fix upfront might be cleaner --- for example when it comes to
> > backporting :)
> 
> Right. The current pfn_valid() breaks for ZONE_DEVICE memory and this fixes
> the problem in the present context which can be easily backported if required.
> 
> Changing or rather overhauling the generic code with new configs as proposed
> earlier (which I am planning to work on subsequently) would definitely be an
> improvement for the current pfn_valid() situation in terms of maintainability
> but then it should not stop us from fixing the problem now.

Alright, I've mulled this over a bit. I don't agree that this patch helps
with maintainability (quite the opposite, in fact), but perfection is the
enemy of the good so I'll queue the series for 5.12. However, I'll revert
the changes at the first sign of a problem, so please do work towards a
generic solution which can replace this in the medium term.

Even just adding hooks with well-defined (documented) semantics to
pfn_valid() instead of having architectures override it wholesale would be a
massive step forward in my opinion.

Will

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
@ 2021-02-05 18:55                 ` Will Deacon
  0 siblings, 0 replies; 46+ messages in thread
From: Will Deacon @ 2021-02-05 18:55 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Mark Rutland, David Hildenbrand, Catalin Marinas, linux-kernel,
	Mike Rapoport, linux-mm, Jérôme Glisse, James Morse,
	Dan Williams, Robin Murphy, Ard Biesheuvel, linux-arm-kernel

On Wed, Feb 03, 2021 at 09:20:39AM +0530, Anshuman Khandual wrote:
> On 2/2/21 6:26 PM, David Hildenbrand wrote:
> > On 02.02.21 13:51, Will Deacon wrote:
> >> On Tue, Feb 02, 2021 at 01:39:29PM +0100, David Hildenbrand wrote:
> >>> As I expressed already, long term we should really get rid of the arm64
> >>> variant and rather special-case the generic one. Then we won't go out of
> >>> sync - just as it happened with ZONE_DEVICE handling here.
> >>
> >> Why does this have to be long term? This ZONE_DEVICE stuff could be the
> >> carrot on the stick :)
> > 
> > Yes, I suggested to do it now, but Anshuman convinced me that doing a
> > simple fix upfront might be cleaner --- for example when it comes to
> > backporting :)
> 
> Right. The current pfn_valid() breaks for ZONE_DEVICE memory and this fixes
> the problem in the present context which can be easily backported if required.
> 
> Changing or rather overhauling the generic code with new configs as proposed
> earlier (which I am planning to work on subsequently) would definitely be an
> improvement for the current pfn_valid() situation in terms of maintainability
> but then it should not stop us from fixing the problem now.

Alright, I've mulled this over a bit. I don't agree that this patch helps
with maintainability (quite the opposite, in fact), but perfection is the
enemy of the good so I'll queue the series for 5.12. However, I'll revert
the changes at the first sign of a problem, so please do work towards a
generic solution which can replace this in the medium term.

Even just adding hooks with well-defined (documented) semantics to
pfn_valid() instead of having architectures override it wholesale would be a
massive step forward in my opinion.

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
  2021-02-05 18:55                 ` Will Deacon
@ 2021-02-11 11:53                   ` Will Deacon
  -1 siblings, 0 replies; 46+ messages in thread
From: Will Deacon @ 2021-02-11 11:53 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Mark Rutland, David Hildenbrand, Catalin Marinas, linux-kernel,
	Mike Rapoport, linux-mm, Jérôme Glisse, James Morse,
	Dan Williams, Robin Murphy, Ard Biesheuvel, linux-arm-kernel

On Fri, Feb 05, 2021 at 06:55:53PM +0000, Will Deacon wrote:
> On Wed, Feb 03, 2021 at 09:20:39AM +0530, Anshuman Khandual wrote:
> > On 2/2/21 6:26 PM, David Hildenbrand wrote:
> > > On 02.02.21 13:51, Will Deacon wrote:
> > >> On Tue, Feb 02, 2021 at 01:39:29PM +0100, David Hildenbrand wrote:
> > >>> As I expressed already, long term we should really get rid of the arm64
> > >>> variant and rather special-case the generic one. Then we won't go out of
> > >>> sync - just as it happened with ZONE_DEVICE handling here.
> > >>
> > >> Why does this have to be long term? This ZONE_DEVICE stuff could be the
> > >> carrot on the stick :)
> > > 
> > > Yes, I suggested to do it now, but Anshuman convinced me that doing a
> > > simple fix upfront might be cleaner --- for example when it comes to
> > > backporting :)
> > 
> > Right. The current pfn_valid() breaks for ZONE_DEVICE memory and this fixes
> > the problem in the present context which can be easily backported if required.
> > 
> > Changing or rather overhauling the generic code with new configs as proposed
> > earlier (which I am planning to work on subsequently) would definitely be an
> > improvement for the current pfn_valid() situation in terms of maintainability
> > but then it should not stop us from fixing the problem now.
> 
> Alright, I've mulled this over a bit. I don't agree that this patch helps
> with maintainability (quite the opposite, in fact), but perfection is the
> enemy of the good so I'll queue the series for 5.12. However, I'll revert
> the changes at the first sign of a problem, so please do work towards a
> generic solution which can replace this in the medium term.

... and dropped. These patches appear to be responsible for a boot
regression reported by CKI:

https://lore.kernel.org/r/cki.8D1CB60FEC.K6NJMEFQPV@redhat.com

Will

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
@ 2021-02-11 11:53                   ` Will Deacon
  0 siblings, 0 replies; 46+ messages in thread
From: Will Deacon @ 2021-02-11 11:53 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Mark Rutland, David Hildenbrand, Catalin Marinas, linux-kernel,
	Mike Rapoport, linux-mm, Jérôme Glisse, James Morse,
	Dan Williams, Robin Murphy, Ard Biesheuvel, linux-arm-kernel

On Fri, Feb 05, 2021 at 06:55:53PM +0000, Will Deacon wrote:
> On Wed, Feb 03, 2021 at 09:20:39AM +0530, Anshuman Khandual wrote:
> > On 2/2/21 6:26 PM, David Hildenbrand wrote:
> > > On 02.02.21 13:51, Will Deacon wrote:
> > >> On Tue, Feb 02, 2021 at 01:39:29PM +0100, David Hildenbrand wrote:
> > >>> As I expressed already, long term we should really get rid of the arm64
> > >>> variant and rather special-case the generic one. Then we won't go out of
> > >>> sync - just as it happened with ZONE_DEVICE handling here.
> > >>
> > >> Why does this have to be long term? This ZONE_DEVICE stuff could be the
> > >> carrot on the stick :)
> > > 
> > > Yes, I suggested to do it now, but Anshuman convinced me that doing a
> > > simple fix upfront might be cleaner --- for example when it comes to
> > > backporting :)
> > 
> > Right. The current pfn_valid() breaks for ZONE_DEVICE memory and this fixes
> > the problem in the present context which can be easily backported if required.
> > 
> > Changing or rather overhauling the generic code with new configs as proposed
> > earlier (which I am planning to work on subsequently) would definitely be an
> > improvement for the current pfn_valid() situation in terms of maintainability
> > but then it should not stop us from fixing the problem now.
> 
> Alright, I've mulled this over a bit. I don't agree that this patch helps
> with maintainability (quite the opposite, in fact), but perfection is the
> enemy of the good so I'll queue the series for 5.12. However, I'll revert
> the changes at the first sign of a problem, so please do work towards a
> generic solution which can replace this in the medium term.

... and dropped. These patches appear to be responsible for a boot
regression reported by CKI:

https://lore.kernel.org/r/cki.8D1CB60FEC.K6NJMEFQPV@redhat.com

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
  2021-02-11 11:53                   ` Will Deacon
@ 2021-02-11 12:10                     ` Anshuman Khandual
  -1 siblings, 0 replies; 46+ messages in thread
From: Anshuman Khandual @ 2021-02-11 12:10 UTC (permalink / raw)
  To: Will Deacon
  Cc: Mark Rutland, David Hildenbrand, Catalin Marinas, linux-kernel,
	Mike Rapoport, linux-mm, Jérôme Glisse, James Morse,
	Dan Williams, Robin Murphy, Ard Biesheuvel, linux-arm-kernel



On 2/11/21 5:23 PM, Will Deacon wrote:
> On Fri, Feb 05, 2021 at 06:55:53PM +0000, Will Deacon wrote:
>> On Wed, Feb 03, 2021 at 09:20:39AM +0530, Anshuman Khandual wrote:
>>> On 2/2/21 6:26 PM, David Hildenbrand wrote:
>>>> On 02.02.21 13:51, Will Deacon wrote:
>>>>> On Tue, Feb 02, 2021 at 01:39:29PM +0100, David Hildenbrand wrote:
>>>>>> As I expressed already, long term we should really get rid of the arm64
>>>>>> variant and rather special-case the generic one. Then we won't go out of
>>>>>> sync - just as it happened with ZONE_DEVICE handling here.
>>>>>
>>>>> Why does this have to be long term? This ZONE_DEVICE stuff could be the
>>>>> carrot on the stick :)
>>>>
>>>> Yes, I suggested to do it now, but Anshuman convinced me that doing a
>>>> simple fix upfront might be cleaner --- for example when it comes to
>>>> backporting :)
>>>
>>> Right. The current pfn_valid() breaks for ZONE_DEVICE memory and this fixes
>>> the problem in the present context which can be easily backported if required.
>>>
>>> Changing or rather overhauling the generic code with new configs as proposed
>>> earlier (which I am planning to work on subsequently) would definitely be an
>>> improvement for the current pfn_valid() situation in terms of maintainability
>>> but then it should not stop us from fixing the problem now.
>>
>> Alright, I've mulled this over a bit. I don't agree that this patch helps
>> with maintainability (quite the opposite, in fact), but perfection is the
>> enemy of the good so I'll queue the series for 5.12. However, I'll revert
>> the changes at the first sign of a problem, so please do work towards a
>> generic solution which can replace this in the medium term.
> 
> ... and dropped. These patches appear to be responsible for a boot
> regression reported by CKI:

Ahh, boot regression ? These patches only change the behaviour
for non boot memory only.

> 
> https://lore.kernel.org/r/cki.8D1CB60FEC.K6NJMEFQPV@redhat.com

Will look into the logs and see if there is something pointing to
the problem.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
@ 2021-02-11 12:10                     ` Anshuman Khandual
  0 siblings, 0 replies; 46+ messages in thread
From: Anshuman Khandual @ 2021-02-11 12:10 UTC (permalink / raw)
  To: Will Deacon
  Cc: Mark Rutland, David Hildenbrand, Catalin Marinas, linux-kernel,
	Mike Rapoport, linux-mm, Jérôme Glisse, James Morse,
	Dan Williams, Robin Murphy, Ard Biesheuvel, linux-arm-kernel



On 2/11/21 5:23 PM, Will Deacon wrote:
> On Fri, Feb 05, 2021 at 06:55:53PM +0000, Will Deacon wrote:
>> On Wed, Feb 03, 2021 at 09:20:39AM +0530, Anshuman Khandual wrote:
>>> On 2/2/21 6:26 PM, David Hildenbrand wrote:
>>>> On 02.02.21 13:51, Will Deacon wrote:
>>>>> On Tue, Feb 02, 2021 at 01:39:29PM +0100, David Hildenbrand wrote:
>>>>>> As I expressed already, long term we should really get rid of the arm64
>>>>>> variant and rather special-case the generic one. Then we won't go out of
>>>>>> sync - just as it happened with ZONE_DEVICE handling here.
>>>>>
>>>>> Why does this have to be long term? This ZONE_DEVICE stuff could be the
>>>>> carrot on the stick :)
>>>>
>>>> Yes, I suggested to do it now, but Anshuman convinced me that doing a
>>>> simple fix upfront might be cleaner --- for example when it comes to
>>>> backporting :)
>>>
>>> Right. The current pfn_valid() breaks for ZONE_DEVICE memory and this fixes
>>> the problem in the present context which can be easily backported if required.
>>>
>>> Changing or rather overhauling the generic code with new configs as proposed
>>> earlier (which I am planning to work on subsequently) would definitely be an
>>> improvement for the current pfn_valid() situation in terms of maintainability
>>> but then it should not stop us from fixing the problem now.
>>
>> Alright, I've mulled this over a bit. I don't agree that this patch helps
>> with maintainability (quite the opposite, in fact), but perfection is the
>> enemy of the good so I'll queue the series for 5.12. However, I'll revert
>> the changes at the first sign of a problem, so please do work towards a
>> generic solution which can replace this in the medium term.
> 
> ... and dropped. These patches appear to be responsible for a boot
> regression reported by CKI:

Ahh, boot regression ? These patches only change the behaviour
for non boot memory only.

> 
> https://lore.kernel.org/r/cki.8D1CB60FEC.K6NJMEFQPV@redhat.com

Will look into the logs and see if there is something pointing to
the problem.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
  2021-02-11 12:10                     ` Anshuman Khandual
@ 2021-02-11 12:21                       ` Will Deacon
  -1 siblings, 0 replies; 46+ messages in thread
From: Will Deacon @ 2021-02-11 12:21 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Mark Rutland, David Hildenbrand, Catalin Marinas, linux-kernel,
	Mike Rapoport, linux-mm, Jérôme Glisse, James Morse,
	Dan Williams, Robin Murphy, Ard Biesheuvel, linux-arm-kernel

On Thu, Feb 11, 2021 at 05:40:35PM +0530, Anshuman Khandual wrote:
> On 2/11/21 5:23 PM, Will Deacon wrote:
> > On Fri, Feb 05, 2021 at 06:55:53PM +0000, Will Deacon wrote:
> >> On Wed, Feb 03, 2021 at 09:20:39AM +0530, Anshuman Khandual wrote:
> >>> On 2/2/21 6:26 PM, David Hildenbrand wrote:
> >>>> On 02.02.21 13:51, Will Deacon wrote:
> >>>>> On Tue, Feb 02, 2021 at 01:39:29PM +0100, David Hildenbrand wrote:
> >>>>>> As I expressed already, long term we should really get rid of the arm64
> >>>>>> variant and rather special-case the generic one. Then we won't go out of
> >>>>>> sync - just as it happened with ZONE_DEVICE handling here.
> >>>>>
> >>>>> Why does this have to be long term? This ZONE_DEVICE stuff could be the
> >>>>> carrot on the stick :)
> >>>>
> >>>> Yes, I suggested to do it now, but Anshuman convinced me that doing a
> >>>> simple fix upfront might be cleaner --- for example when it comes to
> >>>> backporting :)
> >>>
> >>> Right. The current pfn_valid() breaks for ZONE_DEVICE memory and this fixes
> >>> the problem in the present context which can be easily backported if required.
> >>>
> >>> Changing or rather overhauling the generic code with new configs as proposed
> >>> earlier (which I am planning to work on subsequently) would definitely be an
> >>> improvement for the current pfn_valid() situation in terms of maintainability
> >>> but then it should not stop us from fixing the problem now.
> >>
> >> Alright, I've mulled this over a bit. I don't agree that this patch helps
> >> with maintainability (quite the opposite, in fact), but perfection is the
> >> enemy of the good so I'll queue the series for 5.12. However, I'll revert
> >> the changes at the first sign of a problem, so please do work towards a
> >> generic solution which can replace this in the medium term.
> > 
> > ... and dropped. These patches appear to be responsible for a boot
> > regression reported by CKI:
> 
> Ahh, boot regression ? These patches only change the behaviour
> for non boot memory only.

Sure, but this thing is horribly fragile, which is why I was nervous about
touching it in the first place ;)

> > https://lore.kernel.org/r/cki.8D1CB60FEC.K6NJMEFQPV@redhat.com
> 
> Will look into the logs and see if there is something pointing to
> the problem.

We don't have a log yet, but I've asked whether earlycon works on the
problematic machine (the failure seems to be specific to a certain TX2).

Either way, this is too late for 5.12 now.

Will

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
@ 2021-02-11 12:21                       ` Will Deacon
  0 siblings, 0 replies; 46+ messages in thread
From: Will Deacon @ 2021-02-11 12:21 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Mark Rutland, David Hildenbrand, Catalin Marinas, linux-kernel,
	Mike Rapoport, linux-mm, Jérôme Glisse, James Morse,
	Dan Williams, Robin Murphy, Ard Biesheuvel, linux-arm-kernel

On Thu, Feb 11, 2021 at 05:40:35PM +0530, Anshuman Khandual wrote:
> On 2/11/21 5:23 PM, Will Deacon wrote:
> > On Fri, Feb 05, 2021 at 06:55:53PM +0000, Will Deacon wrote:
> >> On Wed, Feb 03, 2021 at 09:20:39AM +0530, Anshuman Khandual wrote:
> >>> On 2/2/21 6:26 PM, David Hildenbrand wrote:
> >>>> On 02.02.21 13:51, Will Deacon wrote:
> >>>>> On Tue, Feb 02, 2021 at 01:39:29PM +0100, David Hildenbrand wrote:
> >>>>>> As I expressed already, long term we should really get rid of the arm64
> >>>>>> variant and rather special-case the generic one. Then we won't go out of
> >>>>>> sync - just as it happened with ZONE_DEVICE handling here.
> >>>>>
> >>>>> Why does this have to be long term? This ZONE_DEVICE stuff could be the
> >>>>> carrot on the stick :)
> >>>>
> >>>> Yes, I suggested to do it now, but Anshuman convinced me that doing a
> >>>> simple fix upfront might be cleaner --- for example when it comes to
> >>>> backporting :)
> >>>
> >>> Right. The current pfn_valid() breaks for ZONE_DEVICE memory and this fixes
> >>> the problem in the present context which can be easily backported if required.
> >>>
> >>> Changing or rather overhauling the generic code with new configs as proposed
> >>> earlier (which I am planning to work on subsequently) would definitely be an
> >>> improvement for the current pfn_valid() situation in terms of maintainability
> >>> but then it should not stop us from fixing the problem now.
> >>
> >> Alright, I've mulled this over a bit. I don't agree that this patch helps
> >> with maintainability (quite the opposite, in fact), but perfection is the
> >> enemy of the good so I'll queue the series for 5.12. However, I'll revert
> >> the changes at the first sign of a problem, so please do work towards a
> >> generic solution which can replace this in the medium term.
> > 
> > ... and dropped. These patches appear to be responsible for a boot
> > regression reported by CKI:
> 
> Ahh, boot regression ? These patches only change the behaviour
> for non boot memory only.

Sure, but this thing is horribly fragile, which is why I was nervous about
touching it in the first place ;)

> > https://lore.kernel.org/r/cki.8D1CB60FEC.K6NJMEFQPV@redhat.com
> 
> Will look into the logs and see if there is something pointing to
> the problem.

We don't have a log yet, but I've asked whether earlycon works on the
problematic machine (the failure seems to be specific to a certain TX2).

Either way, this is too late for 5.12 now.

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
  2021-02-11 12:10                     ` Anshuman Khandual
@ 2021-02-11 12:35                       ` David Hildenbrand
  -1 siblings, 0 replies; 46+ messages in thread
From: David Hildenbrand @ 2021-02-11 12:35 UTC (permalink / raw)
  To: Anshuman Khandual, Will Deacon
  Cc: Mark Rutland, Catalin Marinas, linux-kernel, Mike Rapoport,
	linux-mm, Jérôme Glisse, James Morse, Dan Williams,
	Robin Murphy, Ard Biesheuvel, linux-arm-kernel

On 11.02.21 13:10, Anshuman Khandual wrote:
> 
> 
> On 2/11/21 5:23 PM, Will Deacon wrote:
>> On Fri, Feb 05, 2021 at 06:55:53PM +0000, Will Deacon wrote:
>>> On Wed, Feb 03, 2021 at 09:20:39AM +0530, Anshuman Khandual wrote:
>>>> On 2/2/21 6:26 PM, David Hildenbrand wrote:
>>>>> On 02.02.21 13:51, Will Deacon wrote:
>>>>>> On Tue, Feb 02, 2021 at 01:39:29PM +0100, David Hildenbrand wrote:
>>>>>>> As I expressed already, long term we should really get rid of the arm64
>>>>>>> variant and rather special-case the generic one. Then we won't go out of
>>>>>>> sync - just as it happened with ZONE_DEVICE handling here.
>>>>>>
>>>>>> Why does this have to be long term? This ZONE_DEVICE stuff could be the
>>>>>> carrot on the stick :)
>>>>>
>>>>> Yes, I suggested to do it now, but Anshuman convinced me that doing a
>>>>> simple fix upfront might be cleaner --- for example when it comes to
>>>>> backporting :)
>>>>
>>>> Right. The current pfn_valid() breaks for ZONE_DEVICE memory and this fixes
>>>> the problem in the present context which can be easily backported if required.
>>>>
>>>> Changing or rather overhauling the generic code with new configs as proposed
>>>> earlier (which I am planning to work on subsequently) would definitely be an
>>>> improvement for the current pfn_valid() situation in terms of maintainability
>>>> but then it should not stop us from fixing the problem now.
>>>
>>> Alright, I've mulled this over a bit. I don't agree that this patch helps
>>> with maintainability (quite the opposite, in fact), but perfection is the
>>> enemy of the good so I'll queue the series for 5.12. However, I'll revert
>>> the changes at the first sign of a problem, so please do work towards a
>>> generic solution which can replace this in the medium term.
>>
>> ... and dropped. These patches appear to be responsible for a boot
>> regression reported by CKI:
> 
> Ahh, boot regression ? These patches only change the behaviour
> for non boot memory only.
> 
>>
>> https://lore.kernel.org/r/cki.8D1CB60FEC.K6NJMEFQPV@redhat.com
> 
> Will look into the logs and see if there is something pointing to
> the problem.
> 

It's strange. One thing I can imagine is a mis-detection of early 
sections. However, I don't see that happening:

In sparse_init_nid(), we:
1. Initialize the memmap
2. Set SECTION_IS_EARLY | SECTION_HAS_MEM_MAP via
    sparse_init_one_section()

Only hotplugged sections (DIMMs, dax/kmem) set SECTION_HAS_MEM_MAP 
without SECTION_IS_EARLY - which is correct, because these are not early.

So once we know that we have valid_section() -- SECTION_HAS_MEM_MAP is 
set -- early_section() should be correct.

Even if someone would be doing a pfn_valid() after 
memblocks_present()->memory_present() but before
sparse_init_nid(), we should be fine (!valid_section() -> return 0).


As it happens early during boot, I doubt that some NVDIMMs that get 
detected and added early during boot as system RAM (via dax/kmem). Are 
the problem.

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
@ 2021-02-11 12:35                       ` David Hildenbrand
  0 siblings, 0 replies; 46+ messages in thread
From: David Hildenbrand @ 2021-02-11 12:35 UTC (permalink / raw)
  To: Anshuman Khandual, Will Deacon
  Cc: Mark Rutland, Catalin Marinas, linux-kernel, Mike Rapoport,
	linux-mm, Jérôme Glisse, James Morse, Dan Williams,
	Robin Murphy, Ard Biesheuvel, linux-arm-kernel

On 11.02.21 13:10, Anshuman Khandual wrote:
> 
> 
> On 2/11/21 5:23 PM, Will Deacon wrote:
>> On Fri, Feb 05, 2021 at 06:55:53PM +0000, Will Deacon wrote:
>>> On Wed, Feb 03, 2021 at 09:20:39AM +0530, Anshuman Khandual wrote:
>>>> On 2/2/21 6:26 PM, David Hildenbrand wrote:
>>>>> On 02.02.21 13:51, Will Deacon wrote:
>>>>>> On Tue, Feb 02, 2021 at 01:39:29PM +0100, David Hildenbrand wrote:
>>>>>>> As I expressed already, long term we should really get rid of the arm64
>>>>>>> variant and rather special-case the generic one. Then we won't go out of
>>>>>>> sync - just as it happened with ZONE_DEVICE handling here.
>>>>>>
>>>>>> Why does this have to be long term? This ZONE_DEVICE stuff could be the
>>>>>> carrot on the stick :)
>>>>>
>>>>> Yes, I suggested to do it now, but Anshuman convinced me that doing a
>>>>> simple fix upfront might be cleaner --- for example when it comes to
>>>>> backporting :)
>>>>
>>>> Right. The current pfn_valid() breaks for ZONE_DEVICE memory and this fixes
>>>> the problem in the present context which can be easily backported if required.
>>>>
>>>> Changing or rather overhauling the generic code with new configs as proposed
>>>> earlier (which I am planning to work on subsequently) would definitely be an
>>>> improvement for the current pfn_valid() situation in terms of maintainability
>>>> but then it should not stop us from fixing the problem now.
>>>
>>> Alright, I've mulled this over a bit. I don't agree that this patch helps
>>> with maintainability (quite the opposite, in fact), but perfection is the
>>> enemy of the good so I'll queue the series for 5.12. However, I'll revert
>>> the changes at the first sign of a problem, so please do work towards a
>>> generic solution which can replace this in the medium term.
>>
>> ... and dropped. These patches appear to be responsible for a boot
>> regression reported by CKI:
> 
> Ahh, boot regression ? These patches only change the behaviour
> for non boot memory only.
> 
>>
>> https://lore.kernel.org/r/cki.8D1CB60FEC.K6NJMEFQPV@redhat.com
> 
> Will look into the logs and see if there is something pointing to
> the problem.
> 

It's strange. One thing I can imagine is a mis-detection of early 
sections. However, I don't see that happening:

In sparse_init_nid(), we:
1. Initialize the memmap
2. Set SECTION_IS_EARLY | SECTION_HAS_MEM_MAP via
    sparse_init_one_section()

Only hotplugged sections (DIMMs, dax/kmem) set SECTION_HAS_MEM_MAP 
without SECTION_IS_EARLY - which is correct, because these are not early.

So once we know that we have valid_section() -- SECTION_HAS_MEM_MAP is 
set -- early_section() should be correct.

Even if someone would be doing a pfn_valid() after 
memblocks_present()->memory_present() but before
sparse_init_nid(), we should be fine (!valid_section() -> return 0).


As it happens early during boot, I doubt that some NVDIMMs that get 
detected and added early during boot as system RAM (via dax/kmem). Are 
the problem.

-- 
Thanks,

David / dhildenb


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
  2021-02-11 12:35                       ` David Hildenbrand
@ 2021-03-03 19:04                         ` Catalin Marinas
  -1 siblings, 0 replies; 46+ messages in thread
From: Catalin Marinas @ 2021-03-03 19:04 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Anshuman Khandual, Will Deacon, Mark Rutland, linux-kernel,
	Mike Rapoport, linux-mm, Jérôme Glisse, James Morse,
	Dan Williams, Robin Murphy, Ard Biesheuvel, linux-arm-kernel

On Thu, Feb 11, 2021 at 01:35:56PM +0100, David Hildenbrand wrote:
> On 11.02.21 13:10, Anshuman Khandual wrote:
> > On 2/11/21 5:23 PM, Will Deacon wrote:
> > > ... and dropped. These patches appear to be responsible for a boot
> > > regression reported by CKI:
> > 
> > Ahh, boot regression ? These patches only change the behaviour
> > for non boot memory only.
> > 
> > > https://lore.kernel.org/r/cki.8D1CB60FEC.K6NJMEFQPV@redhat.com
> > 
> > Will look into the logs and see if there is something pointing to
> > the problem.
> 
> It's strange. One thing I can imagine is a mis-detection of early sections.
> However, I don't see that happening:
> 
> In sparse_init_nid(), we:
> 1. Initialize the memmap
> 2. Set SECTION_IS_EARLY | SECTION_HAS_MEM_MAP via
>    sparse_init_one_section()
> 
> Only hotplugged sections (DIMMs, dax/kmem) set SECTION_HAS_MEM_MAP without
> SECTION_IS_EARLY - which is correct, because these are not early.
> 
> So once we know that we have valid_section() -- SECTION_HAS_MEM_MAP is set
> -- early_section() should be correct.
> 
> Even if someone would be doing a pfn_valid() after
> memblocks_present()->memory_present() but before
> sparse_init_nid(), we should be fine (!valid_section() -> return 0).

I couldn't figure out how this could fail with Anshuman's patches.
Will's suspicion is that some invalid/null pointer gets dereferenced
before being initialised but the only case I see is somewhere in
pfn_section_valid() (ms->usage) if valid_section() && !early_section().

Assuming that we do get a valid_section(ms) && !early_section(ms), is
there a case where ms->usage is not initialised? I guess races with
section_deactivate() are not possible this early.

Another situation could be that pfn_valid() returns true when no memory
is mapped for that pfn.

> As it happens early during boot, I doubt that some NVDIMMs that get detected
> and added early during boot as system RAM (via dax/kmem) are the problem.

It is indeed very early, we can't even get the early console output.
Debugging this is even harder as it's only misbehaving on a board we
don't have access to.

On the logic in this patch, is the hot-added memory always covering a
full subsection? The arm64 pfn_valid() currently relies on
memblock_is_map_memory() but the patch changes it to
pfn_section_valid(). So if hot-added memory doesn't cover the full
subsection, it may return true even if the pfn is not mapped.

Regarding the robustness of the pfn_valid for ZONE_DEVICE memory, could
we instead have a SECTION_IS_DEVICE flag and only check for that as not
to disturb the hotplugged memory check via memblock_is_map_memory()?

-- 
Catalin

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
@ 2021-03-03 19:04                         ` Catalin Marinas
  0 siblings, 0 replies; 46+ messages in thread
From: Catalin Marinas @ 2021-03-03 19:04 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Anshuman Khandual, Will Deacon, Mark Rutland, linux-kernel,
	Mike Rapoport, linux-mm, Jérôme Glisse, James Morse,
	Dan Williams, Robin Murphy, Ard Biesheuvel, linux-arm-kernel

On Thu, Feb 11, 2021 at 01:35:56PM +0100, David Hildenbrand wrote:
> On 11.02.21 13:10, Anshuman Khandual wrote:
> > On 2/11/21 5:23 PM, Will Deacon wrote:
> > > ... and dropped. These patches appear to be responsible for a boot
> > > regression reported by CKI:
> > 
> > Ahh, boot regression ? These patches only change the behaviour
> > for non boot memory only.
> > 
> > > https://lore.kernel.org/r/cki.8D1CB60FEC.K6NJMEFQPV@redhat.com
> > 
> > Will look into the logs and see if there is something pointing to
> > the problem.
> 
> It's strange. One thing I can imagine is a mis-detection of early sections.
> However, I don't see that happening:
> 
> In sparse_init_nid(), we:
> 1. Initialize the memmap
> 2. Set SECTION_IS_EARLY | SECTION_HAS_MEM_MAP via
>    sparse_init_one_section()
> 
> Only hotplugged sections (DIMMs, dax/kmem) set SECTION_HAS_MEM_MAP without
> SECTION_IS_EARLY - which is correct, because these are not early.
> 
> So once we know that we have valid_section() -- SECTION_HAS_MEM_MAP is set
> -- early_section() should be correct.
> 
> Even if someone would be doing a pfn_valid() after
> memblocks_present()->memory_present() but before
> sparse_init_nid(), we should be fine (!valid_section() -> return 0).

I couldn't figure out how this could fail with Anshuman's patches.
Will's suspicion is that some invalid/null pointer gets dereferenced
before being initialised but the only case I see is somewhere in
pfn_section_valid() (ms->usage) if valid_section() && !early_section().

Assuming that we do get a valid_section(ms) && !early_section(ms), is
there a case where ms->usage is not initialised? I guess races with
section_deactivate() are not possible this early.

Another situation could be that pfn_valid() returns true when no memory
is mapped for that pfn.

> As it happens early during boot, I doubt that some NVDIMMs that get detected
> and added early during boot as system RAM (via dax/kmem) are the problem.

It is indeed very early, we can't even get the early console output.
Debugging this is even harder as it's only misbehaving on a board we
don't have access to.

On the logic in this patch, is the hot-added memory always covering a
full subsection? The arm64 pfn_valid() currently relies on
memblock_is_map_memory() but the patch changes it to
pfn_section_valid(). So if hot-added memory doesn't cover the full
subsection, it may return true even if the pfn is not mapped.

Regarding the robustness of the pfn_valid for ZONE_DEVICE memory, could
we instead have a SECTION_IS_DEVICE flag and only check for that as not
to disturb the hotplugged memory check via memblock_is_map_memory()?

-- 
Catalin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
  2021-03-03 19:04                         ` Catalin Marinas
@ 2021-03-03 19:24                           ` David Hildenbrand
  -1 siblings, 0 replies; 46+ messages in thread
From: David Hildenbrand @ 2021-03-03 19:24 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Anshuman Khandual, Will Deacon, Mark Rutland, linux-kernel,
	Mike Rapoport, linux-mm, Jérôme Glisse, James Morse,
	Dan Williams, Robin Murphy, Ard Biesheuvel, linux-arm-kernel

On 03.03.21 20:04, Catalin Marinas wrote:
> On Thu, Feb 11, 2021 at 01:35:56PM +0100, David Hildenbrand wrote:
>> On 11.02.21 13:10, Anshuman Khandual wrote:
>>> On 2/11/21 5:23 PM, Will Deacon wrote:
>>>> ... and dropped. These patches appear to be responsible for a boot
>>>> regression reported by CKI:
>>>
>>> Ahh, boot regression ? These patches only change the behaviour
>>> for non boot memory only.
>>>
>>>> https://lore.kernel.org/r/cki.8D1CB60FEC.K6NJMEFQPV@redhat.com
>>>
>>> Will look into the logs and see if there is something pointing to
>>> the problem.
>>
>> It's strange. One thing I can imagine is a mis-detection of early sections.
>> However, I don't see that happening:
>>
>> In sparse_init_nid(), we:
>> 1. Initialize the memmap
>> 2. Set SECTION_IS_EARLY | SECTION_HAS_MEM_MAP via
>>     sparse_init_one_section()
>>
>> Only hotplugged sections (DIMMs, dax/kmem) set SECTION_HAS_MEM_MAP without
>> SECTION_IS_EARLY - which is correct, because these are not early.
>>
>> So once we know that we have valid_section() -- SECTION_HAS_MEM_MAP is set
>> -- early_section() should be correct.
>>
>> Even if someone would be doing a pfn_valid() after
>> memblocks_present()->memory_present() but before
>> sparse_init_nid(), we should be fine (!valid_section() -> return 0).
> 
> I couldn't figure out how this could fail with Anshuman's patches.
> Will's suspicion is that some invalid/null pointer gets dereferenced
> before being initialised but the only case I see is somewhere in
> pfn_section_valid() (ms->usage) if valid_section() && !early_section().

Indeed, it looks like a latent bug.

> 
> Assuming that we do get a valid_section(ms) && !early_section(ms), is
> there a case where ms->usage is not initialised? I guess races with
> section_deactivate() are not possible this early.
> 

Do you have access to that machine? We could identify which path is 
taken quite easily.

> Another situation could be that pfn_valid() returns true when no memory
> is mapped for that pfn.
> 
>> As it happens early during boot, I doubt that some NVDIMMs that get detected
>> and added early during boot as system RAM (via dax/kmem) are the problem.
> 
> It is indeed very early, we can't even get the early console output.

So even before any hotplug really happens. All sections should be early 
at that point I guess.

> Debugging this is even harder as it's only misbehaving on a board we
> don't have access to.
> 
> On the logic in this patch, is the hot-added memory always covering a
> full subsection? The arm64 pfn_valid() currently relies on
> memblock_is_map_memory() but the patch changes it to
> pfn_section_valid(). So if hot-added memory doesn't cover the full
> subsection, it may return true even if the pfn is not mapped.

Hotplugged System RAM always covers full sections. Hotplugged 
ZONE_DEVICE always covers full subsections. pfn_section_valid() properly 
handles both cases. (see generic pfn_valid())

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
@ 2021-03-03 19:24                           ` David Hildenbrand
  0 siblings, 0 replies; 46+ messages in thread
From: David Hildenbrand @ 2021-03-03 19:24 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Anshuman Khandual, Will Deacon, Mark Rutland, linux-kernel,
	Mike Rapoport, linux-mm, Jérôme Glisse, James Morse,
	Dan Williams, Robin Murphy, Ard Biesheuvel, linux-arm-kernel

On 03.03.21 20:04, Catalin Marinas wrote:
> On Thu, Feb 11, 2021 at 01:35:56PM +0100, David Hildenbrand wrote:
>> On 11.02.21 13:10, Anshuman Khandual wrote:
>>> On 2/11/21 5:23 PM, Will Deacon wrote:
>>>> ... and dropped. These patches appear to be responsible for a boot
>>>> regression reported by CKI:
>>>
>>> Ahh, boot regression ? These patches only change the behaviour
>>> for non boot memory only.
>>>
>>>> https://lore.kernel.org/r/cki.8D1CB60FEC.K6NJMEFQPV@redhat.com
>>>
>>> Will look into the logs and see if there is something pointing to
>>> the problem.
>>
>> It's strange. One thing I can imagine is a mis-detection of early sections.
>> However, I don't see that happening:
>>
>> In sparse_init_nid(), we:
>> 1. Initialize the memmap
>> 2. Set SECTION_IS_EARLY | SECTION_HAS_MEM_MAP via
>>     sparse_init_one_section()
>>
>> Only hotplugged sections (DIMMs, dax/kmem) set SECTION_HAS_MEM_MAP without
>> SECTION_IS_EARLY - which is correct, because these are not early.
>>
>> So once we know that we have valid_section() -- SECTION_HAS_MEM_MAP is set
>> -- early_section() should be correct.
>>
>> Even if someone would be doing a pfn_valid() after
>> memblocks_present()->memory_present() but before
>> sparse_init_nid(), we should be fine (!valid_section() -> return 0).
> 
> I couldn't figure out how this could fail with Anshuman's patches.
> Will's suspicion is that some invalid/null pointer gets dereferenced
> before being initialised but the only case I see is somewhere in
> pfn_section_valid() (ms->usage) if valid_section() && !early_section().

Indeed, it looks like a latent bug.

> 
> Assuming that we do get a valid_section(ms) && !early_section(ms), is
> there a case where ms->usage is not initialised? I guess races with
> section_deactivate() are not possible this early.
> 

Do you have access to that machine? We could identify which path is 
taken quite easily.

> Another situation could be that pfn_valid() returns true when no memory
> is mapped for that pfn.
> 
>> As it happens early during boot, I doubt that some NVDIMMs that get detected
>> and added early during boot as system RAM (via dax/kmem) are the problem.
> 
> It is indeed very early, we can't even get the early console output.

So even before any hotplug really happens. All sections should be early 
at that point I guess.

> Debugging this is even harder as it's only misbehaving on a board we
> don't have access to.
> 
> On the logic in this patch, is the hot-added memory always covering a
> full subsection? The arm64 pfn_valid() currently relies on
> memblock_is_map_memory() but the patch changes it to
> pfn_section_valid(). So if hot-added memory doesn't cover the full
> subsection, it may return true even if the pfn is not mapped.

Hotplugged System RAM always covers full sections. Hotplugged 
ZONE_DEVICE always covers full subsections. pfn_section_valid() properly 
handles both cases. (see generic pfn_valid())

-- 
Thanks,

David / dhildenb


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
  2021-03-03 19:04                         ` Catalin Marinas
@ 2021-03-03 21:24                           ` Will Deacon
  -1 siblings, 0 replies; 46+ messages in thread
From: Will Deacon @ 2021-03-03 21:24 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: David Hildenbrand, Anshuman Khandual, Mark Rutland, linux-kernel,
	Mike Rapoport, linux-mm, Jérôme Glisse, James Morse,
	Dan Williams, Robin Murphy, Ard Biesheuvel, linux-arm-kernel

On Wed, Mar 03, 2021 at 07:04:33PM +0000, Catalin Marinas wrote:
> On Thu, Feb 11, 2021 at 01:35:56PM +0100, David Hildenbrand wrote:
> > On 11.02.21 13:10, Anshuman Khandual wrote:
> > > On 2/11/21 5:23 PM, Will Deacon wrote:
> > > > ... and dropped. These patches appear to be responsible for a boot
> > > > regression reported by CKI:
> > > 
> > > Ahh, boot regression ? These patches only change the behaviour
> > > for non boot memory only.
> > > 
> > > > https://lore.kernel.org/r/cki.8D1CB60FEC.K6NJMEFQPV@redhat.com
> > > 
> > > Will look into the logs and see if there is something pointing to
> > > the problem.
> > 
> > It's strange. One thing I can imagine is a mis-detection of early sections.
> > However, I don't see that happening:
> > 
> > In sparse_init_nid(), we:
> > 1. Initialize the memmap
> > 2. Set SECTION_IS_EARLY | SECTION_HAS_MEM_MAP via
> >    sparse_init_one_section()
> > 
> > Only hotplugged sections (DIMMs, dax/kmem) set SECTION_HAS_MEM_MAP without
> > SECTION_IS_EARLY - which is correct, because these are not early.
> > 
> > So once we know that we have valid_section() -- SECTION_HAS_MEM_MAP is set
> > -- early_section() should be correct.
> > 
> > Even if someone would be doing a pfn_valid() after
> > memblocks_present()->memory_present() but before
> > sparse_init_nid(), we should be fine (!valid_section() -> return 0).
> 
> I couldn't figure out how this could fail with Anshuman's patches.
> Will's suspicion is that some invalid/null pointer gets dereferenced
> before being initialised but the only case I see is somewhere in
> pfn_section_valid() (ms->usage) if valid_section() && !early_section().
> 
> Assuming that we do get a valid_section(ms) && !early_section(ms), is
> there a case where ms->usage is not initialised? I guess races with
> section_deactivate() are not possible this early.
> 
> Another situation could be that pfn_valid() returns true when no memory
> is mapped for that pfn.

The case I wondered about was __pfn_to_section() with a bogus pfn, since
with patch 2/2 we call that *before* checking that pfn_to_section_nr() is
sane.

Will

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
@ 2021-03-03 21:24                           ` Will Deacon
  0 siblings, 0 replies; 46+ messages in thread
From: Will Deacon @ 2021-03-03 21:24 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: David Hildenbrand, Anshuman Khandual, Mark Rutland, linux-kernel,
	Mike Rapoport, linux-mm, Jérôme Glisse, James Morse,
	Dan Williams, Robin Murphy, Ard Biesheuvel, linux-arm-kernel

On Wed, Mar 03, 2021 at 07:04:33PM +0000, Catalin Marinas wrote:
> On Thu, Feb 11, 2021 at 01:35:56PM +0100, David Hildenbrand wrote:
> > On 11.02.21 13:10, Anshuman Khandual wrote:
> > > On 2/11/21 5:23 PM, Will Deacon wrote:
> > > > ... and dropped. These patches appear to be responsible for a boot
> > > > regression reported by CKI:
> > > 
> > > Ahh, boot regression ? These patches only change the behaviour
> > > for non boot memory only.
> > > 
> > > > https://lore.kernel.org/r/cki.8D1CB60FEC.K6NJMEFQPV@redhat.com
> > > 
> > > Will look into the logs and see if there is something pointing to
> > > the problem.
> > 
> > It's strange. One thing I can imagine is a mis-detection of early sections.
> > However, I don't see that happening:
> > 
> > In sparse_init_nid(), we:
> > 1. Initialize the memmap
> > 2. Set SECTION_IS_EARLY | SECTION_HAS_MEM_MAP via
> >    sparse_init_one_section()
> > 
> > Only hotplugged sections (DIMMs, dax/kmem) set SECTION_HAS_MEM_MAP without
> > SECTION_IS_EARLY - which is correct, because these are not early.
> > 
> > So once we know that we have valid_section() -- SECTION_HAS_MEM_MAP is set
> > -- early_section() should be correct.
> > 
> > Even if someone would be doing a pfn_valid() after
> > memblocks_present()->memory_present() but before
> > sparse_init_nid(), we should be fine (!valid_section() -> return 0).
> 
> I couldn't figure out how this could fail with Anshuman's patches.
> Will's suspicion is that some invalid/null pointer gets dereferenced
> before being initialised but the only case I see is somewhere in
> pfn_section_valid() (ms->usage) if valid_section() && !early_section().
> 
> Assuming that we do get a valid_section(ms) && !early_section(ms), is
> there a case where ms->usage is not initialised? I guess races with
> section_deactivate() are not possible this early.
> 
> Another situation could be that pfn_valid() returns true when no memory
> is mapped for that pfn.

The case I wondered about was __pfn_to_section() with a bogus pfn, since
with patch 2/2 we call that *before* checking that pfn_to_section_nr() is
sane.

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
  2021-03-03 21:24                           ` Will Deacon
@ 2021-03-04  3:31                             ` Anshuman Khandual
  -1 siblings, 0 replies; 46+ messages in thread
From: Anshuman Khandual @ 2021-03-04  3:31 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas
  Cc: David Hildenbrand, Mark Rutland, linux-kernel, Mike Rapoport,
	linux-mm, Jérôme Glisse, James Morse, Dan Williams,
	Robin Murphy, Ard Biesheuvel, linux-arm-kernel



On 3/4/21 2:54 AM, Will Deacon wrote:
> On Wed, Mar 03, 2021 at 07:04:33PM +0000, Catalin Marinas wrote:
>> On Thu, Feb 11, 2021 at 01:35:56PM +0100, David Hildenbrand wrote:
>>> On 11.02.21 13:10, Anshuman Khandual wrote:
>>>> On 2/11/21 5:23 PM, Will Deacon wrote:
>>>>> ... and dropped. These patches appear to be responsible for a boot
>>>>> regression reported by CKI:
>>>>
>>>> Ahh, boot regression ? These patches only change the behaviour
>>>> for non boot memory only.
>>>>
>>>>> https://lore.kernel.org/r/cki.8D1CB60FEC.K6NJMEFQPV@redhat.com
>>>>
>>>> Will look into the logs and see if there is something pointing to
>>>> the problem.
>>>
>>> It's strange. One thing I can imagine is a mis-detection of early sections.
>>> However, I don't see that happening:
>>>
>>> In sparse_init_nid(), we:
>>> 1. Initialize the memmap
>>> 2. Set SECTION_IS_EARLY | SECTION_HAS_MEM_MAP via
>>>    sparse_init_one_section()
>>>
>>> Only hotplugged sections (DIMMs, dax/kmem) set SECTION_HAS_MEM_MAP without
>>> SECTION_IS_EARLY - which is correct, because these are not early.
>>>
>>> So once we know that we have valid_section() -- SECTION_HAS_MEM_MAP is set
>>> -- early_section() should be correct.
>>>
>>> Even if someone would be doing a pfn_valid() after
>>> memblocks_present()->memory_present() but before
>>> sparse_init_nid(), we should be fine (!valid_section() -> return 0).
>>
>> I couldn't figure out how this could fail with Anshuman's patches.
>> Will's suspicion is that some invalid/null pointer gets dereferenced
>> before being initialised but the only case I see is somewhere in
>> pfn_section_valid() (ms->usage) if valid_section() && !early_section().
>>
>> Assuming that we do get a valid_section(ms) && !early_section(ms), is
>> there a case where ms->usage is not initialised? I guess races with
>> section_deactivate() are not possible this early.
>>
>> Another situation could be that pfn_valid() returns true when no memory
>> is mapped for that pfn.
> 
> The case I wondered about was __pfn_to_section() with a bogus pfn, since
> with patch 2/2 we call that *before* checking that pfn_to_section_nr() is
> sane.

Right, that is problematic. __pfn_to_section() should not be called without
first validating pfn_to_section_nr(), as it could cause out-of-bound access
on mem_section buffer. Will fix that order but as there is no test scenario
which is definitive for this reported regression, how should we ensure that
it fixes the problem ?

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
@ 2021-03-04  3:31                             ` Anshuman Khandual
  0 siblings, 0 replies; 46+ messages in thread
From: Anshuman Khandual @ 2021-03-04  3:31 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas
  Cc: David Hildenbrand, Mark Rutland, linux-kernel, Mike Rapoport,
	linux-mm, Jérôme Glisse, James Morse, Dan Williams,
	Robin Murphy, Ard Biesheuvel, linux-arm-kernel



On 3/4/21 2:54 AM, Will Deacon wrote:
> On Wed, Mar 03, 2021 at 07:04:33PM +0000, Catalin Marinas wrote:
>> On Thu, Feb 11, 2021 at 01:35:56PM +0100, David Hildenbrand wrote:
>>> On 11.02.21 13:10, Anshuman Khandual wrote:
>>>> On 2/11/21 5:23 PM, Will Deacon wrote:
>>>>> ... and dropped. These patches appear to be responsible for a boot
>>>>> regression reported by CKI:
>>>>
>>>> Ahh, boot regression ? These patches only change the behaviour
>>>> for non boot memory only.
>>>>
>>>>> https://lore.kernel.org/r/cki.8D1CB60FEC.K6NJMEFQPV@redhat.com
>>>>
>>>> Will look into the logs and see if there is something pointing to
>>>> the problem.
>>>
>>> It's strange. One thing I can imagine is a mis-detection of early sections.
>>> However, I don't see that happening:
>>>
>>> In sparse_init_nid(), we:
>>> 1. Initialize the memmap
>>> 2. Set SECTION_IS_EARLY | SECTION_HAS_MEM_MAP via
>>>    sparse_init_one_section()
>>>
>>> Only hotplugged sections (DIMMs, dax/kmem) set SECTION_HAS_MEM_MAP without
>>> SECTION_IS_EARLY - which is correct, because these are not early.
>>>
>>> So once we know that we have valid_section() -- SECTION_HAS_MEM_MAP is set
>>> -- early_section() should be correct.
>>>
>>> Even if someone would be doing a pfn_valid() after
>>> memblocks_present()->memory_present() but before
>>> sparse_init_nid(), we should be fine (!valid_section() -> return 0).
>>
>> I couldn't figure out how this could fail with Anshuman's patches.
>> Will's suspicion is that some invalid/null pointer gets dereferenced
>> before being initialised but the only case I see is somewhere in
>> pfn_section_valid() (ms->usage) if valid_section() && !early_section().
>>
>> Assuming that we do get a valid_section(ms) && !early_section(ms), is
>> there a case where ms->usage is not initialised? I guess races with
>> section_deactivate() are not possible this early.
>>
>> Another situation could be that pfn_valid() returns true when no memory
>> is mapped for that pfn.
> 
> The case I wondered about was __pfn_to_section() with a bogus pfn, since
> with patch 2/2 we call that *before* checking that pfn_to_section_nr() is
> sane.

Right, that is problematic. __pfn_to_section() should not be called without
first validating pfn_to_section_nr(), as it could cause out-of-bound access
on mem_section buffer. Will fix that order but as there is no test scenario
which is definitive for this reported regression, how should we ensure that
it fixes the problem ?

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
  2021-03-04  3:31                             ` Anshuman Khandual
@ 2021-03-04  8:12                               ` David Hildenbrand
  -1 siblings, 0 replies; 46+ messages in thread
From: David Hildenbrand @ 2021-03-04  8:12 UTC (permalink / raw)
  To: Anshuman Khandual, Will Deacon, Catalin Marinas
  Cc: Mark Rutland, linux-kernel, Mike Rapoport, linux-mm,
	Jérôme Glisse, James Morse, Dan Williams, Robin Murphy,
	Ard Biesheuvel, linux-arm-kernel

On 04.03.21 04:31, Anshuman Khandual wrote:
> 
> 
> On 3/4/21 2:54 AM, Will Deacon wrote:
>> On Wed, Mar 03, 2021 at 07:04:33PM +0000, Catalin Marinas wrote:
>>> On Thu, Feb 11, 2021 at 01:35:56PM +0100, David Hildenbrand wrote:
>>>> On 11.02.21 13:10, Anshuman Khandual wrote:
>>>>> On 2/11/21 5:23 PM, Will Deacon wrote:
>>>>>> ... and dropped. These patches appear to be responsible for a boot
>>>>>> regression reported by CKI:
>>>>>
>>>>> Ahh, boot regression ? These patches only change the behaviour
>>>>> for non boot memory only.
>>>>>
>>>>>> https://lore.kernel.org/r/cki.8D1CB60FEC.K6NJMEFQPV@redhat.com
>>>>>
>>>>> Will look into the logs and see if there is something pointing to
>>>>> the problem.
>>>>
>>>> It's strange. One thing I can imagine is a mis-detection of early sections.
>>>> However, I don't see that happening:
>>>>
>>>> In sparse_init_nid(), we:
>>>> 1. Initialize the memmap
>>>> 2. Set SECTION_IS_EARLY | SECTION_HAS_MEM_MAP via
>>>>     sparse_init_one_section()
>>>>
>>>> Only hotplugged sections (DIMMs, dax/kmem) set SECTION_HAS_MEM_MAP without
>>>> SECTION_IS_EARLY - which is correct, because these are not early.
>>>>
>>>> So once we know that we have valid_section() -- SECTION_HAS_MEM_MAP is set
>>>> -- early_section() should be correct.
>>>>
>>>> Even if someone would be doing a pfn_valid() after
>>>> memblocks_present()->memory_present() but before
>>>> sparse_init_nid(), we should be fine (!valid_section() -> return 0).
>>>
>>> I couldn't figure out how this could fail with Anshuman's patches.
>>> Will's suspicion is that some invalid/null pointer gets dereferenced
>>> before being initialised but the only case I see is somewhere in
>>> pfn_section_valid() (ms->usage) if valid_section() && !early_section().
>>>
>>> Assuming that we do get a valid_section(ms) && !early_section(ms), is
>>> there a case where ms->usage is not initialised? I guess races with
>>> section_deactivate() are not possible this early.
>>>
>>> Another situation could be that pfn_valid() returns true when no memory
>>> is mapped for that pfn.
>>
>> The case I wondered about was __pfn_to_section() with a bogus pfn, since
>> with patch 2/2 we call that *before* checking that pfn_to_section_nr() is
>> sane.
> 
> Right, that is problematic. __pfn_to_section() should not be called without
> first validating pfn_to_section_nr(), as it could cause out-of-bound access
> on mem_section buffer. Will fix that order but as there is no test scenario
> which is definitive for this reported regression, how should we ensure that
> it fixes the problem ?

Oh, right, I missed that in patch #2. (and when comparing to generic 
pfn_valid()).

I thought bisecting pointed at patch #1, that's why I didn't even have 
another look at patch #2. Makes sense.

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
@ 2021-03-04  8:12                               ` David Hildenbrand
  0 siblings, 0 replies; 46+ messages in thread
From: David Hildenbrand @ 2021-03-04  8:12 UTC (permalink / raw)
  To: Anshuman Khandual, Will Deacon, Catalin Marinas
  Cc: Mark Rutland, linux-kernel, Mike Rapoport, linux-mm,
	Jérôme Glisse, James Morse, Dan Williams, Robin Murphy,
	Ard Biesheuvel, linux-arm-kernel

On 04.03.21 04:31, Anshuman Khandual wrote:
> 
> 
> On 3/4/21 2:54 AM, Will Deacon wrote:
>> On Wed, Mar 03, 2021 at 07:04:33PM +0000, Catalin Marinas wrote:
>>> On Thu, Feb 11, 2021 at 01:35:56PM +0100, David Hildenbrand wrote:
>>>> On 11.02.21 13:10, Anshuman Khandual wrote:
>>>>> On 2/11/21 5:23 PM, Will Deacon wrote:
>>>>>> ... and dropped. These patches appear to be responsible for a boot
>>>>>> regression reported by CKI:
>>>>>
>>>>> Ahh, boot regression ? These patches only change the behaviour
>>>>> for non boot memory only.
>>>>>
>>>>>> https://lore.kernel.org/r/cki.8D1CB60FEC.K6NJMEFQPV@redhat.com
>>>>>
>>>>> Will look into the logs and see if there is something pointing to
>>>>> the problem.
>>>>
>>>> It's strange. One thing I can imagine is a mis-detection of early sections.
>>>> However, I don't see that happening:
>>>>
>>>> In sparse_init_nid(), we:
>>>> 1. Initialize the memmap
>>>> 2. Set SECTION_IS_EARLY | SECTION_HAS_MEM_MAP via
>>>>     sparse_init_one_section()
>>>>
>>>> Only hotplugged sections (DIMMs, dax/kmem) set SECTION_HAS_MEM_MAP without
>>>> SECTION_IS_EARLY - which is correct, because these are not early.
>>>>
>>>> So once we know that we have valid_section() -- SECTION_HAS_MEM_MAP is set
>>>> -- early_section() should be correct.
>>>>
>>>> Even if someone would be doing a pfn_valid() after
>>>> memblocks_present()->memory_present() but before
>>>> sparse_init_nid(), we should be fine (!valid_section() -> return 0).
>>>
>>> I couldn't figure out how this could fail with Anshuman's patches.
>>> Will's suspicion is that some invalid/null pointer gets dereferenced
>>> before being initialised but the only case I see is somewhere in
>>> pfn_section_valid() (ms->usage) if valid_section() && !early_section().
>>>
>>> Assuming that we do get a valid_section(ms) && !early_section(ms), is
>>> there a case where ms->usage is not initialised? I guess races with
>>> section_deactivate() are not possible this early.
>>>
>>> Another situation could be that pfn_valid() returns true when no memory
>>> is mapped for that pfn.
>>
>> The case I wondered about was __pfn_to_section() with a bogus pfn, since
>> with patch 2/2 we call that *before* checking that pfn_to_section_nr() is
>> sane.
> 
> Right, that is problematic. __pfn_to_section() should not be called without
> first validating pfn_to_section_nr(), as it could cause out-of-bound access
> on mem_section buffer. Will fix that order but as there is no test scenario
> which is definitive for this reported regression, how should we ensure that
> it fixes the problem ?

Oh, right, I missed that in patch #2. (and when comparing to generic 
pfn_valid()).

I thought bisecting pointed at patch #1, that's why I didn't even have 
another look at patch #2. Makes sense.

-- 
Thanks,

David / dhildenb


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
  2021-03-04  8:12                               ` David Hildenbrand
@ 2021-03-04  9:36                                 ` Will Deacon
  -1 siblings, 0 replies; 46+ messages in thread
From: Will Deacon @ 2021-03-04  9:36 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Anshuman Khandual, Catalin Marinas, Mark Rutland, linux-kernel,
	Mike Rapoport, linux-mm, Jérôme Glisse, James Morse,
	Dan Williams, Robin Murphy, Ard Biesheuvel, vkabatov,
	linux-arm-kernel

On Thu, Mar 04, 2021 at 09:12:31AM +0100, David Hildenbrand wrote:
> On 04.03.21 04:31, Anshuman Khandual wrote:
> > On 3/4/21 2:54 AM, Will Deacon wrote:
> > > On Wed, Mar 03, 2021 at 07:04:33PM +0000, Catalin Marinas wrote:
> > > > On Thu, Feb 11, 2021 at 01:35:56PM +0100, David Hildenbrand wrote:
> > > > > On 11.02.21 13:10, Anshuman Khandual wrote:
> > > > > > On 2/11/21 5:23 PM, Will Deacon wrote:
> > > > > > > ... and dropped. These patches appear to be responsible for a boot
> > > > > > > regression reported by CKI:
> > > > > > 
> > > > > > Ahh, boot regression ? These patches only change the behaviour
> > > > > > for non boot memory only.
> > > > > > 
> > > > > > > https://lore.kernel.org/r/cki.8D1CB60FEC.K6NJMEFQPV@redhat.com
> > > > > > 
> > > > > > Will look into the logs and see if there is something pointing to
> > > > > > the problem.
> > > > > 
> > > > > It's strange. One thing I can imagine is a mis-detection of early sections.
> > > > > However, I don't see that happening:
> > > > > 
> > > > > In sparse_init_nid(), we:
> > > > > 1. Initialize the memmap
> > > > > 2. Set SECTION_IS_EARLY | SECTION_HAS_MEM_MAP via
> > > > >     sparse_init_one_section()
> > > > > 
> > > > > Only hotplugged sections (DIMMs, dax/kmem) set SECTION_HAS_MEM_MAP without
> > > > > SECTION_IS_EARLY - which is correct, because these are not early.
> > > > > 
> > > > > So once we know that we have valid_section() -- SECTION_HAS_MEM_MAP is set
> > > > > -- early_section() should be correct.
> > > > > 
> > > > > Even if someone would be doing a pfn_valid() after
> > > > > memblocks_present()->memory_present() but before
> > > > > sparse_init_nid(), we should be fine (!valid_section() -> return 0).
> > > > 
> > > > I couldn't figure out how this could fail with Anshuman's patches.
> > > > Will's suspicion is that some invalid/null pointer gets dereferenced
> > > > before being initialised but the only case I see is somewhere in
> > > > pfn_section_valid() (ms->usage) if valid_section() && !early_section().
> > > > 
> > > > Assuming that we do get a valid_section(ms) && !early_section(ms), is
> > > > there a case where ms->usage is not initialised? I guess races with
> > > > section_deactivate() are not possible this early.
> > > > 
> > > > Another situation could be that pfn_valid() returns true when no memory
> > > > is mapped for that pfn.
> > > 
> > > The case I wondered about was __pfn_to_section() with a bogus pfn, since
> > > with patch 2/2 we call that *before* checking that pfn_to_section_nr() is
> > > sane.
> > 
> > Right, that is problematic. __pfn_to_section() should not be called without
> > first validating pfn_to_section_nr(), as it could cause out-of-bound access
> > on mem_section buffer. Will fix that order but as there is no test scenario
> > which is definitive for this reported regression, how should we ensure that
> > it fixes the problem ?
> 
> Oh, right, I missed that in patch #2. (and when comparing to generic
> pfn_valid()).
> 
> I thought bisecting pointed at patch #1, that's why I didn't even have
> another look at patch #2. Makes sense.

I don't think we ever bisected it beyond these two patches, so it could
be either of them. Anshuman -- please work with Veronika on this, as she
has access to the problematic machine and was really helpful in debugging
this last time.

Will

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
@ 2021-03-04  9:36                                 ` Will Deacon
  0 siblings, 0 replies; 46+ messages in thread
From: Will Deacon @ 2021-03-04  9:36 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Anshuman Khandual, Catalin Marinas, Mark Rutland, linux-kernel,
	Mike Rapoport, linux-mm, Jérôme Glisse, James Morse,
	Dan Williams, Robin Murphy, Ard Biesheuvel, vkabatov,
	linux-arm-kernel

On Thu, Mar 04, 2021 at 09:12:31AM +0100, David Hildenbrand wrote:
> On 04.03.21 04:31, Anshuman Khandual wrote:
> > On 3/4/21 2:54 AM, Will Deacon wrote:
> > > On Wed, Mar 03, 2021 at 07:04:33PM +0000, Catalin Marinas wrote:
> > > > On Thu, Feb 11, 2021 at 01:35:56PM +0100, David Hildenbrand wrote:
> > > > > On 11.02.21 13:10, Anshuman Khandual wrote:
> > > > > > On 2/11/21 5:23 PM, Will Deacon wrote:
> > > > > > > ... and dropped. These patches appear to be responsible for a boot
> > > > > > > regression reported by CKI:
> > > > > > 
> > > > > > Ahh, boot regression ? These patches only change the behaviour
> > > > > > for non boot memory only.
> > > > > > 
> > > > > > > https://lore.kernel.org/r/cki.8D1CB60FEC.K6NJMEFQPV@redhat.com
> > > > > > 
> > > > > > Will look into the logs and see if there is something pointing to
> > > > > > the problem.
> > > > > 
> > > > > It's strange. One thing I can imagine is a mis-detection of early sections.
> > > > > However, I don't see that happening:
> > > > > 
> > > > > In sparse_init_nid(), we:
> > > > > 1. Initialize the memmap
> > > > > 2. Set SECTION_IS_EARLY | SECTION_HAS_MEM_MAP via
> > > > >     sparse_init_one_section()
> > > > > 
> > > > > Only hotplugged sections (DIMMs, dax/kmem) set SECTION_HAS_MEM_MAP without
> > > > > SECTION_IS_EARLY - which is correct, because these are not early.
> > > > > 
> > > > > So once we know that we have valid_section() -- SECTION_HAS_MEM_MAP is set
> > > > > -- early_section() should be correct.
> > > > > 
> > > > > Even if someone would be doing a pfn_valid() after
> > > > > memblocks_present()->memory_present() but before
> > > > > sparse_init_nid(), we should be fine (!valid_section() -> return 0).
> > > > 
> > > > I couldn't figure out how this could fail with Anshuman's patches.
> > > > Will's suspicion is that some invalid/null pointer gets dereferenced
> > > > before being initialised but the only case I see is somewhere in
> > > > pfn_section_valid() (ms->usage) if valid_section() && !early_section().
> > > > 
> > > > Assuming that we do get a valid_section(ms) && !early_section(ms), is
> > > > there a case where ms->usage is not initialised? I guess races with
> > > > section_deactivate() are not possible this early.
> > > > 
> > > > Another situation could be that pfn_valid() returns true when no memory
> > > > is mapped for that pfn.
> > > 
> > > The case I wondered about was __pfn_to_section() with a bogus pfn, since
> > > with patch 2/2 we call that *before* checking that pfn_to_section_nr() is
> > > sane.
> > 
> > Right, that is problematic. __pfn_to_section() should not be called without
> > first validating pfn_to_section_nr(), as it could cause out-of-bound access
> > on mem_section buffer. Will fix that order but as there is no test scenario
> > which is definitive for this reported regression, how should we ensure that
> > it fixes the problem ?
> 
> Oh, right, I missed that in patch #2. (and when comparing to generic
> pfn_valid()).
> 
> I thought bisecting pointed at patch #1, that's why I didn't even have
> another look at patch #2. Makes sense.

I don't think we ever bisected it beyond these two patches, so it could
be either of them. Anshuman -- please work with Veronika on this, as she
has access to the problematic machine and was really helpful in debugging
this last time.

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
  2021-03-04  9:36                                 ` Will Deacon
@ 2021-03-05  4:22                                   ` Anshuman Khandual
  -1 siblings, 0 replies; 46+ messages in thread
From: Anshuman Khandual @ 2021-03-05  4:22 UTC (permalink / raw)
  To: Will Deacon, David Hildenbrand
  Cc: Catalin Marinas, Mark Rutland, linux-kernel, Mike Rapoport,
	linux-mm, Jérôme Glisse, James Morse, Dan Williams,
	Robin Murphy, Ard Biesheuvel, vkabatov, linux-arm-kernel



On 3/4/21 3:06 PM, Will Deacon wrote:
> On Thu, Mar 04, 2021 at 09:12:31AM +0100, David Hildenbrand wrote:
>> On 04.03.21 04:31, Anshuman Khandual wrote:
>>> On 3/4/21 2:54 AM, Will Deacon wrote:
>>>> On Wed, Mar 03, 2021 at 07:04:33PM +0000, Catalin Marinas wrote:
>>>>> On Thu, Feb 11, 2021 at 01:35:56PM +0100, David Hildenbrand wrote:
>>>>>> On 11.02.21 13:10, Anshuman Khandual wrote:
>>>>>>> On 2/11/21 5:23 PM, Will Deacon wrote:
>>>>>>>> ... and dropped. These patches appear to be responsible for a boot
>>>>>>>> regression reported by CKI:
>>>>>>>
>>>>>>> Ahh, boot regression ? These patches only change the behaviour
>>>>>>> for non boot memory only.
>>>>>>>
>>>>>>>> https://lore.kernel.org/r/cki.8D1CB60FEC.K6NJMEFQPV@redhat.com
>>>>>>>
>>>>>>> Will look into the logs and see if there is something pointing to
>>>>>>> the problem.
>>>>>>
>>>>>> It's strange. One thing I can imagine is a mis-detection of early sections.
>>>>>> However, I don't see that happening:
>>>>>>
>>>>>> In sparse_init_nid(), we:
>>>>>> 1. Initialize the memmap
>>>>>> 2. Set SECTION_IS_EARLY | SECTION_HAS_MEM_MAP via
>>>>>>     sparse_init_one_section()
>>>>>>
>>>>>> Only hotplugged sections (DIMMs, dax/kmem) set SECTION_HAS_MEM_MAP without
>>>>>> SECTION_IS_EARLY - which is correct, because these are not early.
>>>>>>
>>>>>> So once we know that we have valid_section() -- SECTION_HAS_MEM_MAP is set
>>>>>> -- early_section() should be correct.
>>>>>>
>>>>>> Even if someone would be doing a pfn_valid() after
>>>>>> memblocks_present()->memory_present() but before
>>>>>> sparse_init_nid(), we should be fine (!valid_section() -> return 0).
>>>>>
>>>>> I couldn't figure out how this could fail with Anshuman's patches.
>>>>> Will's suspicion is that some invalid/null pointer gets dereferenced
>>>>> before being initialised but the only case I see is somewhere in
>>>>> pfn_section_valid() (ms->usage) if valid_section() && !early_section().
>>>>>
>>>>> Assuming that we do get a valid_section(ms) && !early_section(ms), is
>>>>> there a case where ms->usage is not initialised? I guess races with
>>>>> section_deactivate() are not possible this early.
>>>>>
>>>>> Another situation could be that pfn_valid() returns true when no memory
>>>>> is mapped for that pfn.
>>>>
>>>> The case I wondered about was __pfn_to_section() with a bogus pfn, since
>>>> with patch 2/2 we call that *before* checking that pfn_to_section_nr() is
>>>> sane.
>>>
>>> Right, that is problematic. __pfn_to_section() should not be called without
>>> first validating pfn_to_section_nr(), as it could cause out-of-bound access
>>> on mem_section buffer. Will fix that order but as there is no test scenario
>>> which is definitive for this reported regression, how should we ensure that
>>> it fixes the problem ?
>>
>> Oh, right, I missed that in patch #2. (and when comparing to generic
>> pfn_valid()).
>>
>> I thought bisecting pointed at patch #1, that's why I didn't even have
>> another look at patch #2. Makes sense.
> 
> I don't think we ever bisected it beyond these two patches, so it could
> be either of them. Anshuman -- please work with Veronika on this, as she
> has access to the problematic machine and was really helpful in debugging
> this last time.

Sure, will respin the patch series with a fix for [PATCH 2/2] as discussed
and then follow up with Veronika to recreate the problem.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
@ 2021-03-05  4:22                                   ` Anshuman Khandual
  0 siblings, 0 replies; 46+ messages in thread
From: Anshuman Khandual @ 2021-03-05  4:22 UTC (permalink / raw)
  To: Will Deacon, David Hildenbrand
  Cc: Catalin Marinas, Mark Rutland, linux-kernel, Mike Rapoport,
	linux-mm, Jérôme Glisse, James Morse, Dan Williams,
	Robin Murphy, Ard Biesheuvel, vkabatov, linux-arm-kernel



On 3/4/21 3:06 PM, Will Deacon wrote:
> On Thu, Mar 04, 2021 at 09:12:31AM +0100, David Hildenbrand wrote:
>> On 04.03.21 04:31, Anshuman Khandual wrote:
>>> On 3/4/21 2:54 AM, Will Deacon wrote:
>>>> On Wed, Mar 03, 2021 at 07:04:33PM +0000, Catalin Marinas wrote:
>>>>> On Thu, Feb 11, 2021 at 01:35:56PM +0100, David Hildenbrand wrote:
>>>>>> On 11.02.21 13:10, Anshuman Khandual wrote:
>>>>>>> On 2/11/21 5:23 PM, Will Deacon wrote:
>>>>>>>> ... and dropped. These patches appear to be responsible for a boot
>>>>>>>> regression reported by CKI:
>>>>>>>
>>>>>>> Ahh, boot regression ? These patches only change the behaviour
>>>>>>> for non boot memory only.
>>>>>>>
>>>>>>>> https://lore.kernel.org/r/cki.8D1CB60FEC.K6NJMEFQPV@redhat.com
>>>>>>>
>>>>>>> Will look into the logs and see if there is something pointing to
>>>>>>> the problem.
>>>>>>
>>>>>> It's strange. One thing I can imagine is a mis-detection of early sections.
>>>>>> However, I don't see that happening:
>>>>>>
>>>>>> In sparse_init_nid(), we:
>>>>>> 1. Initialize the memmap
>>>>>> 2. Set SECTION_IS_EARLY | SECTION_HAS_MEM_MAP via
>>>>>>     sparse_init_one_section()
>>>>>>
>>>>>> Only hotplugged sections (DIMMs, dax/kmem) set SECTION_HAS_MEM_MAP without
>>>>>> SECTION_IS_EARLY - which is correct, because these are not early.
>>>>>>
>>>>>> So once we know that we have valid_section() -- SECTION_HAS_MEM_MAP is set
>>>>>> -- early_section() should be correct.
>>>>>>
>>>>>> Even if someone would be doing a pfn_valid() after
>>>>>> memblocks_present()->memory_present() but before
>>>>>> sparse_init_nid(), we should be fine (!valid_section() -> return 0).
>>>>>
>>>>> I couldn't figure out how this could fail with Anshuman's patches.
>>>>> Will's suspicion is that some invalid/null pointer gets dereferenced
>>>>> before being initialised but the only case I see is somewhere in
>>>>> pfn_section_valid() (ms->usage) if valid_section() && !early_section().
>>>>>
>>>>> Assuming that we do get a valid_section(ms) && !early_section(ms), is
>>>>> there a case where ms->usage is not initialised? I guess races with
>>>>> section_deactivate() are not possible this early.
>>>>>
>>>>> Another situation could be that pfn_valid() returns true when no memory
>>>>> is mapped for that pfn.
>>>>
>>>> The case I wondered about was __pfn_to_section() with a bogus pfn, since
>>>> with patch 2/2 we call that *before* checking that pfn_to_section_nr() is
>>>> sane.
>>>
>>> Right, that is problematic. __pfn_to_section() should not be called without
>>> first validating pfn_to_section_nr(), as it could cause out-of-bound access
>>> on mem_section buffer. Will fix that order but as there is no test scenario
>>> which is definitive for this reported regression, how should we ensure that
>>> it fixes the problem ?
>>
>> Oh, right, I missed that in patch #2. (and when comparing to generic
>> pfn_valid()).
>>
>> I thought bisecting pointed at patch #1, that's why I didn't even have
>> another look at patch #2. Makes sense.
> 
> I don't think we ever bisected it beyond these two patches, so it could
> be either of them. Anshuman -- please work with Veronika on this, as she
> has access to the problematic machine and was really helpful in debugging
> this last time.

Sure, will respin the patch series with a fix for [PATCH 2/2] as discussed
and then follow up with Veronika to recreate the problem.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2021-03-05  4:24 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-02  4:11 [PATCH V2 0/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory Anshuman Khandual
2021-02-02  4:11 ` Anshuman Khandual
2021-02-02  4:11 ` [PATCH V2 1/2] " Anshuman Khandual
2021-02-02  4:11   ` Anshuman Khandual
2021-02-02 12:32   ` Will Deacon
2021-02-02 12:32     ` Will Deacon
2021-02-02 12:35     ` Will Deacon
2021-02-02 12:35       ` Will Deacon
2021-02-02 12:39       ` David Hildenbrand
2021-02-02 12:39         ` David Hildenbrand
2021-02-02 12:51         ` Will Deacon
2021-02-02 12:51           ` Will Deacon
2021-02-02 12:56           ` David Hildenbrand
2021-02-02 12:56             ` David Hildenbrand
2021-02-03  3:50             ` Anshuman Khandual
2021-02-03  3:50               ` Anshuman Khandual
2021-02-05 18:55               ` Will Deacon
2021-02-05 18:55                 ` Will Deacon
2021-02-11 11:53                 ` Will Deacon
2021-02-11 11:53                   ` Will Deacon
2021-02-11 12:10                   ` Anshuman Khandual
2021-02-11 12:10                     ` Anshuman Khandual
2021-02-11 12:21                     ` Will Deacon
2021-02-11 12:21                       ` Will Deacon
2021-02-11 12:35                     ` David Hildenbrand
2021-02-11 12:35                       ` David Hildenbrand
2021-03-03 19:04                       ` Catalin Marinas
2021-03-03 19:04                         ` Catalin Marinas
2021-03-03 19:24                         ` David Hildenbrand
2021-03-03 19:24                           ` David Hildenbrand
2021-03-03 21:24                         ` Will Deacon
2021-03-03 21:24                           ` Will Deacon
2021-03-04  3:31                           ` Anshuman Khandual
2021-03-04  3:31                             ` Anshuman Khandual
2021-03-04  8:12                             ` David Hildenbrand
2021-03-04  8:12                               ` David Hildenbrand
2021-03-04  9:36                               ` Will Deacon
2021-03-04  9:36                                 ` Will Deacon
2021-03-05  4:22                                 ` Anshuman Khandual
2021-03-05  4:22                                   ` Anshuman Khandual
2021-02-02  4:11 ` [PATCH V2 2/2] arm64/mm: Reorganize pfn_valid() Anshuman Khandual
2021-02-02  4:11   ` Anshuman Khandual
2021-02-02  8:26   ` David Hildenbrand
2021-02-02  8:26     ` David Hildenbrand
2021-02-05 18:52 ` [PATCH V2 0/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory Will Deacon
2021-02-05 18:52   ` Will Deacon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.