linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V3 0/3] arm64/mm/hotplug: Improve memory offline event notifier
@ 2020-09-21 12:05 Anshuman Khandual
  2020-09-21 12:05 ` [PATCH V3 1/3] arm64/mm/hotplug: Register boot memory hot remove notifier earlier Anshuman Khandual
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Anshuman Khandual @ 2020-09-21 12:05 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: catalin.marinas, will, Anshuman Khandual, Will Deacon,
	Mark Rutland, Marc Zyngier, Steve Capper, Mark Brown,
	linux-kernel

This series brings three different changes to the only memory event notifier on
arm64 platform. These changes improve it's robustness while also enhancing debug
capabilities during potential memory offlining error conditions.

This applies on 5.9-rc6

Changes in V3:

- Split the single patch into three patch series per Catalin
- Trigger changed from setup_arch() to early_initcall() per Catalin
- Renamed back memory_hotremove_notifier() as prevent_bootmem_remove_init()
- validate_bootmem_online() is now called from prevent_bootmem_remove_init() per Catalin
- Skip registering the notifier if validate_bootmem_online() returns negative

Changes in V2: (https://patchwork.kernel.org/patch/11732161/)

- Dropped all generic changes wrt MEM_CANCEL_OFFLINE reasons enumeration
- Dropped all related (processing MEM_CANCEL_OFFLINE reasons) changes on arm64
- Added validate_boot_mem_online_state() that gets called with early_initcall()
- Added CONFIG_MEMORY_HOTREMOVE check before registering memory notifier
- Moved notifier registration i.e memory_hotremove_notifier into setup_arch()

Changes in V1: (https://patchwork.kernel.org/project/linux-mm/list/?series=271237)

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org

Anshuman Khandual (3):
  arm64/mm/hotplug: Register boot memory hot remove notifier earlier
  arm64/mm/hotplug: Enable MEM_OFFLINE event handling
  arm64/mm/hotplug: Ensure early memory sections are all online

 arch/arm64/mm/mmu.c | 110 +++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 103 insertions(+), 7 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH V3 1/3] arm64/mm/hotplug: Register boot memory hot remove notifier earlier
  2020-09-21 12:05 [PATCH V3 0/3] arm64/mm/hotplug: Improve memory offline event notifier Anshuman Khandual
@ 2020-09-21 12:05 ` Anshuman Khandual
  2020-09-23  6:04   ` Gavin Shan
  2020-09-21 12:05 ` [PATCH V3 2/3] arm64/mm/hotplug: Enable MEM_OFFLINE event handling Anshuman Khandual
  2020-09-21 12:05 ` [PATCH V3 3/3] arm64/mm/hotplug: Ensure early memory sections are all online Anshuman Khandual
  2 siblings, 1 reply; 9+ messages in thread
From: Anshuman Khandual @ 2020-09-21 12:05 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: catalin.marinas, will, Anshuman Khandual, Will Deacon,
	Mark Rutland, Marc Zyngier, Steve Capper, Mark Brown,
	linux-kernel

This moves memory notifier registration earlier in the boot process from
device_initcall() to early_initcall() which will help in guarding against
potential early boot memory offline requests. Even though there should not
be any actual offlinig requests till memory block devices are initialized
with memory_dev_init() but then generic init sequence might just change in
future. Hence an early registration for the memory event notifier would be
helpful. While here, just skip the registration if CONFIG_MEMORY_HOTREMOVE
is not enabled and also call out when memory notifier registration fails.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/mm/mmu.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 75df62fea1b6..df3b7415b128 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1499,7 +1499,17 @@ static struct notifier_block prevent_bootmem_remove_nb = {
 
 static int __init prevent_bootmem_remove_init(void)
 {
-	return register_memory_notifier(&prevent_bootmem_remove_nb);
+	int ret = 0;
+
+	if (!IS_ENABLED(CONFIG_MEMORY_HOTREMOVE))
+		return ret;
+
+	ret = register_memory_notifier(&prevent_bootmem_remove_nb);
+	if (!ret)
+		return ret;
+
+	pr_err("Notifier registration failed - boot memory can be removed\n");
+	return ret;
 }
-device_initcall(prevent_bootmem_remove_init);
+early_initcall(prevent_bootmem_remove_init);
 #endif
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH V3 2/3] arm64/mm/hotplug: Enable MEM_OFFLINE event handling
  2020-09-21 12:05 [PATCH V3 0/3] arm64/mm/hotplug: Improve memory offline event notifier Anshuman Khandual
  2020-09-21 12:05 ` [PATCH V3 1/3] arm64/mm/hotplug: Register boot memory hot remove notifier earlier Anshuman Khandual
@ 2020-09-21 12:05 ` Anshuman Khandual
  2020-09-23  4:44   ` Anshuman Khandual
  2020-09-23  6:31   ` Gavin Shan
  2020-09-21 12:05 ` [PATCH V3 3/3] arm64/mm/hotplug: Ensure early memory sections are all online Anshuman Khandual
  2 siblings, 2 replies; 9+ messages in thread
From: Anshuman Khandual @ 2020-09-21 12:05 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: catalin.marinas, will, Anshuman Khandual, Will Deacon,
	Mark Rutland, Marc Zyngier, Steve Capper, Mark Brown,
	linux-kernel

This enables MEM_OFFLINE memory event handling. It will help intercept any
possible error condition such as if boot memory some how still got offlined
even after an explicit notifier failure, potentially by a future change in
generic hot plug framework. This would help detect such scenarios and help
debug further.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/mm/mmu.c | 37 ++++++++++++++++++++++++++++++++-----
 1 file changed, 32 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index df3b7415b128..6b171bd88bcf 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1482,13 +1482,40 @@ static int prevent_bootmem_remove_notifier(struct notifier_block *nb,
 	unsigned long end_pfn = arg->start_pfn + arg->nr_pages;
 	unsigned long pfn = arg->start_pfn;
 
-	if (action != MEM_GOING_OFFLINE)
+	if ((action != MEM_GOING_OFFLINE) && (action != MEM_OFFLINE))
 		return NOTIFY_OK;
 
-	for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
-		ms = __pfn_to_section(pfn);
-		if (early_section(ms))
-			return NOTIFY_BAD;
+	if (action == MEM_GOING_OFFLINE) {
+		for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
+			ms = __pfn_to_section(pfn);
+			if (early_section(ms)) {
+				pr_warn("Boot memory offlining attempted\n");
+				return NOTIFY_BAD;
+			}
+		}
+	} else if (action == MEM_OFFLINE) {
+		for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
+			ms = __pfn_to_section(pfn);
+			if (early_section(ms)) {
+
+				/*
+				 * This should have never happened. Boot memory
+				 * offlining should have been prevented by this
+				 * very notifier. Probably some memory removal
+				 * procedure might have changed which would then
+				 * require further debug.
+				 */
+				pr_err("Boot memory offlined\n");
+
+				/*
+				 * Core memory hotplug does not process a return
+				 * code from the notifier for MEM_OFFLINE event.
+				 * Error condition has been reported. Report as
+				 * ignored.
+				 */
+				return NOTIFY_DONE;
+			}
+		}
 	}
 	return NOTIFY_OK;
 }
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH V3 3/3] arm64/mm/hotplug: Ensure early memory sections are all online
  2020-09-21 12:05 [PATCH V3 0/3] arm64/mm/hotplug: Improve memory offline event notifier Anshuman Khandual
  2020-09-21 12:05 ` [PATCH V3 1/3] arm64/mm/hotplug: Register boot memory hot remove notifier earlier Anshuman Khandual
  2020-09-21 12:05 ` [PATCH V3 2/3] arm64/mm/hotplug: Enable MEM_OFFLINE event handling Anshuman Khandual
@ 2020-09-21 12:05 ` Anshuman Khandual
  2 siblings, 0 replies; 9+ messages in thread
From: Anshuman Khandual @ 2020-09-21 12:05 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: catalin.marinas, will, Anshuman Khandual, Will Deacon,
	Mark Rutland, Marc Zyngier, Steve Capper, Mark Brown,
	linux-kernel

This adds a validation function that scans the entire boot memory and makes
sure that all early memory sections are online. This check is essential for
the memory notifier to work properly, as it cannot prevent any boot memory
from offlining, if all sections are not online to begin with. The notifier
registration is skipped, if this validation does not go through. Although
the boot section scanning is selectively enabled with DEBUG_VM.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/mm/mmu.c | 59 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 59 insertions(+)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 6b171bd88bcf..124eeb84ec43 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1524,6 +1524,62 @@ static struct notifier_block prevent_bootmem_remove_nb = {
 	.notifier_call = prevent_bootmem_remove_notifier,
 };
 
+/*
+ * This ensures that boot memory sections on the plaltform are online
+ * during early boot. They could not be prevented from being offlined
+ * if for some reason they are not brought online to begin with. This
+ * help validate the basic assumption on which the above memory event
+ * notifier works to prevent boot memory offlining and it's possible
+ * removal.
+ */
+static bool validate_bootmem_online(void)
+{
+	struct memblock_region *mblk;
+	struct mem_section *ms;
+	unsigned long pfn, end_pfn, start, end;
+	bool all_online = true;
+
+	/*
+	 * Scanning across all memblock might be expensive
+	 * on some big memory systems. Hence enable this
+	 * validation only with DEBUG_VM.
+	 */
+	if (!IS_ENABLED(CONFIG_DEBUG_VM))
+		return all_online;
+
+	for_each_memblock(memory, mblk) {
+		pfn = PHYS_PFN(mblk->base);
+		end_pfn = PHYS_PFN(mblk->base + mblk->size);
+
+		for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
+			ms = __pfn_to_section(pfn);
+
+			/*
+			 * All memory ranges in the system at this point
+			 * should have been marked early sections.
+			 */
+			WARN_ON(!early_section(ms));
+
+			/*
+			 * Memory notifier mechanism here to prevent boot
+			 * memory offlining depends on the fact that each
+			 * early section memory on the system is intially
+			 * online. Otherwise a given memory section which
+			 * is already offline will be overlooked and can
+			 * be removed completely. Call out such sections.
+			 */
+			if (!online_section(ms)) {
+				start = PFN_PHYS(pfn);
+				end = start + (1UL << PA_SECTION_SHIFT);
+				pr_err("Memory range [%lx %lx] is offline\n", start, end);
+				pr_err("Memory range [%lx %lx] can be removed\n", start, end);
+				all_online = false;
+			}
+		}
+	}
+	return all_online;
+}
+
 static int __init prevent_bootmem_remove_init(void)
 {
 	int ret = 0;
@@ -1531,6 +1587,9 @@ static int __init prevent_bootmem_remove_init(void)
 	if (!IS_ENABLED(CONFIG_MEMORY_HOTREMOVE))
 		return ret;
 
+	if (!validate_bootmem_online())
+		return -EINVAL;
+
 	ret = register_memory_notifier(&prevent_bootmem_remove_nb);
 	if (!ret)
 		return ret;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH V3 2/3] arm64/mm/hotplug: Enable MEM_OFFLINE event handling
  2020-09-21 12:05 ` [PATCH V3 2/3] arm64/mm/hotplug: Enable MEM_OFFLINE event handling Anshuman Khandual
@ 2020-09-23  4:44   ` Anshuman Khandual
  2020-09-23  6:31   ` Gavin Shan
  1 sibling, 0 replies; 9+ messages in thread
From: Anshuman Khandual @ 2020-09-23  4:44 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: catalin.marinas, will, Will Deacon, Mark Rutland, Marc Zyngier,
	Steve Capper, Mark Brown, linux-kernel



On 09/21/2020 05:35 PM, Anshuman Khandual wrote:
> This enables MEM_OFFLINE memory event handling. It will help intercept any
> possible error condition such as if boot memory some how still got offlined
> even after an explicit notifier failure, potentially by a future change in
> generic hot plug framework. This would help detect such scenarios and help
> debug further.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Steve Capper <steve.capper@arm.com>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>  arch/arm64/mm/mmu.c | 37 ++++++++++++++++++++++++++++++++-----
>  1 file changed, 32 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index df3b7415b128..6b171bd88bcf 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -1482,13 +1482,40 @@ static int prevent_bootmem_remove_notifier(struct notifier_block *nb,
>  	unsigned long end_pfn = arg->start_pfn + arg->nr_pages;
>  	unsigned long pfn = arg->start_pfn;
>  
> -	if (action != MEM_GOING_OFFLINE)
> +	if ((action != MEM_GOING_OFFLINE) && (action != MEM_OFFLINE))
>  		return NOTIFY_OK;
>  
> -	for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
> -		ms = __pfn_to_section(pfn);
> -		if (early_section(ms))
> -			return NOTIFY_BAD;
> +	if (action == MEM_GOING_OFFLINE) {
> +		for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
> +			ms = __pfn_to_section(pfn);
> +			if (early_section(ms)) {
> +				pr_warn("Boot memory offlining attempted\n");
> +				return NOTIFY_BAD;
> +			}
> +		}
> +	} else if (action == MEM_OFFLINE) {
> +		for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
> +			ms = __pfn_to_section(pfn);
> +			if (early_section(ms)) {
> +
> +				/*
> +				 * This should have never happened. Boot memory
> +				 * offlining should have been prevented by this
> +				 * very notifier. Probably some memory removal
> +				 * procedure might have changed which would then
> +				 * require further debug.
> +				 */
> +				pr_err("Boot memory offlined\n");

It is returning in the first instance, when a section inside the
offline range happen to be part of the boot memory. So wondering
if it would be better to call out here, entire attempted offline
range or just the first section inside that which overlaps with
boot memory ? But some range information here will be helpful.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH V3 1/3] arm64/mm/hotplug: Register boot memory hot remove notifier earlier
  2020-09-21 12:05 ` [PATCH V3 1/3] arm64/mm/hotplug: Register boot memory hot remove notifier earlier Anshuman Khandual
@ 2020-09-23  6:04   ` Gavin Shan
  2020-09-24  3:23     ` Anshuman Khandual
  0 siblings, 1 reply; 9+ messages in thread
From: Gavin Shan @ 2020-09-23  6:04 UTC (permalink / raw)
  To: Anshuman Khandual, linux-arm-kernel
  Cc: Mark Rutland, Will Deacon, catalin.marinas, Steve Capper,
	linux-kernel, Mark Brown, Marc Zyngier, will

Hi Anshuman,

On 9/21/20 10:05 PM, Anshuman Khandual wrote:
> This moves memory notifier registration earlier in the boot process from
> device_initcall() to early_initcall() which will help in guarding against
> potential early boot memory offline requests. Even though there should not
> be any actual offlinig requests till memory block devices are initialized
> with memory_dev_init() but then generic init sequence might just change in
> future. Hence an early registration for the memory event notifier would be
> helpful. While here, just skip the registration if CONFIG_MEMORY_HOTREMOVE
> is not enabled and also call out when memory notifier registration fails.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Steve Capper <steve.capper@arm.com>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>   arch/arm64/mm/mmu.c | 14 ++++++++++++--
>   1 file changed, 12 insertions(+), 2 deletions(-)
> 

With the following nit-picky comments resolved:

Reviewed-by: Gavin Shan <gshan@redhat.com>

> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 75df62fea1b6..df3b7415b128 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -1499,7 +1499,17 @@ static struct notifier_block prevent_bootmem_remove_nb = {
>   
>   static int __init prevent_bootmem_remove_init(void)
>   {
> -	return register_memory_notifier(&prevent_bootmem_remove_nb);
> +	int ret = 0;
> +
> +	if (!IS_ENABLED(CONFIG_MEMORY_HOTREMOVE))
> +		return ret;
> +
> +	ret = register_memory_notifier(&prevent_bootmem_remove_nb);
> +	if (!ret)
> +		return ret;
> +
> +	pr_err("Notifier registration failed - boot memory can be removed\n");
> +	return ret;
>   }

It might be cleaner if the duplicated return statements can be
avoided. Besides, it's always nice to print the errno even though
zero is always returned from register_memory_notifier(). So I guess
you probably need something like below:

         ret = register_memory_notifier(&prevent_bootmem_remove_nb);
         if (ret)
             pr_err("%s: Error %d registering notifier\n", __func__, ret)

         return ret;


register_memory_notifier                   # 0 is returned on !CONFIG_MEMORY_HOTPLUG_SPARSE
    blocking_notifier_chain_register
       notifier_chain_register              # 0 is always returned
       
> -device_initcall(prevent_bootmem_remove_init);
> +early_initcall(prevent_bootmem_remove_init);
>   #endif
> 

Cheers,
Gavin


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH V3 2/3] arm64/mm/hotplug: Enable MEM_OFFLINE event handling
  2020-09-21 12:05 ` [PATCH V3 2/3] arm64/mm/hotplug: Enable MEM_OFFLINE event handling Anshuman Khandual
  2020-09-23  4:44   ` Anshuman Khandual
@ 2020-09-23  6:31   ` Gavin Shan
  2020-09-24  3:51     ` Anshuman Khandual
  1 sibling, 1 reply; 9+ messages in thread
From: Gavin Shan @ 2020-09-23  6:31 UTC (permalink / raw)
  To: Anshuman Khandual, linux-arm-kernel
  Cc: Mark Rutland, Will Deacon, catalin.marinas, Steve Capper,
	linux-kernel, Mark Brown, Marc Zyngier, will

Hi Anshuman,

On 9/21/20 10:05 PM, Anshuman Khandual wrote:
> This enables MEM_OFFLINE memory event handling. It will help intercept any
> possible error condition such as if boot memory some how still got offlined
> even after an explicit notifier failure, potentially by a future change in
> generic hot plug framework. This would help detect such scenarios and help
> debug further.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Steve Capper <steve.capper@arm.com>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---

I'm not sure if it makes sense since MEM_OFFLINE won't be triggered
after NOTIFY_BAD is returned from MEM_GOING_OFFLINE. NOTIFY_BAD means
the whole offline process is stopped. It would be guranteed by generic
framework from syntax standpoint.

However, this looks good if MEM_OFFLINE is triggered without calling
into MEM_GOING_OFFLINE previously, but it would be a bug from generic
framework.

>   arch/arm64/mm/mmu.c | 37 ++++++++++++++++++++++++++++++++-----
>   1 file changed, 32 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index df3b7415b128..6b171bd88bcf 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -1482,13 +1482,40 @@ static int prevent_bootmem_remove_notifier(struct notifier_block *nb,
>   	unsigned long end_pfn = arg->start_pfn + arg->nr_pages;
>   	unsigned long pfn = arg->start_pfn;
>   
> -	if (action != MEM_GOING_OFFLINE)
> +	if ((action != MEM_GOING_OFFLINE) && (action != MEM_OFFLINE))
>   		return NOTIFY_OK;
>   
> -	for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
> -		ms = __pfn_to_section(pfn);
> -		if (early_section(ms))
> -			return NOTIFY_BAD;
> +	if (action == MEM_GOING_OFFLINE) {
> +		for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
> +			ms = __pfn_to_section(pfn);
> +			if (early_section(ms)) {
> +				pr_warn("Boot memory offlining attempted\n");
> +				return NOTIFY_BAD;
> +			}
> +		}
> +	} else if (action == MEM_OFFLINE) {
> +		for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
> +			ms = __pfn_to_section(pfn);
> +			if (early_section(ms)) {
> +
> +				/*
> +				 * This should have never happened. Boot memory
> +				 * offlining should have been prevented by this
> +				 * very notifier. Probably some memory removal
> +				 * procedure might have changed which would then
> +				 * require further debug.
> +				 */
> +				pr_err("Boot memory offlined\n");
> +
> +				/*
> +				 * Core memory hotplug does not process a return
> +				 * code from the notifier for MEM_OFFLINE event.
> +				 * Error condition has been reported. Report as
> +				 * ignored.
> +				 */
> +				return NOTIFY_DONE;
> +			}
> +		}
>   	}
>   	return NOTIFY_OK;
>   }
> 

It's pretty much irrelevant comment if the patch doesn't make sense:
the logical block for MEM_GOING_OFFLINE would be reused by MEM_OFFLINE
as they looks similar except the return value and error message :)

Cheers,
Gavin


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH V3 1/3] arm64/mm/hotplug: Register boot memory hot remove notifier earlier
  2020-09-23  6:04   ` Gavin Shan
@ 2020-09-24  3:23     ` Anshuman Khandual
  0 siblings, 0 replies; 9+ messages in thread
From: Anshuman Khandual @ 2020-09-24  3:23 UTC (permalink / raw)
  To: Gavin Shan, linux-arm-kernel
  Cc: Mark Rutland, Will Deacon, catalin.marinas, Steve Capper,
	linux-kernel, Mark Brown, Marc Zyngier, will



On 09/23/2020 11:34 AM, Gavin Shan wrote:
> Hi Anshuman,
> 
> On 9/21/20 10:05 PM, Anshuman Khandual wrote:
>> This moves memory notifier registration earlier in the boot process from
>> device_initcall() to early_initcall() which will help in guarding against
>> potential early boot memory offline requests. Even though there should not
>> be any actual offlinig requests till memory block devices are initialized
>> with memory_dev_init() but then generic init sequence might just change in
>> future. Hence an early registration for the memory event notifier would be
>> helpful. While here, just skip the registration if CONFIG_MEMORY_HOTREMOVE
>> is not enabled and also call out when memory notifier registration fails.
>>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Will Deacon <will@kernel.com>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: Marc Zyngier <maz@kernel.org>
>> Cc: Steve Capper <steve.capper@arm.com>
>> Cc: Mark Brown <broonie@kernel.org>
>> Cc: linux-arm-kernel@lists.infradead.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>>   arch/arm64/mm/mmu.c | 14 ++++++++++++--
>>   1 file changed, 12 insertions(+), 2 deletions(-)
>>
> 
> With the following nit-picky comments resolved:
> 
> Reviewed-by: Gavin Shan <gshan@redhat.com>
> 
>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>> index 75df62fea1b6..df3b7415b128 100644
>> --- a/arch/arm64/mm/mmu.c
>> +++ b/arch/arm64/mm/mmu.c
>> @@ -1499,7 +1499,17 @@ static struct notifier_block prevent_bootmem_remove_nb = {
>>     static int __init prevent_bootmem_remove_init(void)
>>   {
>> -    return register_memory_notifier(&prevent_bootmem_remove_nb);
>> +    int ret = 0;
>> +
>> +    if (!IS_ENABLED(CONFIG_MEMORY_HOTREMOVE))
>> +        return ret;
>> +
>> +    ret = register_memory_notifier(&prevent_bootmem_remove_nb);
>> +    if (!ret)
>> +        return ret;
>> +
>> +    pr_err("Notifier registration failed - boot memory can be removed\n");
>> +    return ret;
>>   }
> 
> It might be cleaner if the duplicated return statements can be
> avoided. Besides, it's always nice to print the errno even though

Thought about it, just that the error message was too long.

> zero is always returned from register_memory_notifier(). So I guess
> you probably need something like below:
> 
>         ret = register_memory_notifier(&prevent_bootmem_remove_nb);
>         if (ret)
>             pr_err("%s: Error %d registering notifier\n", __func__, ret)
> 
>         return ret;

Sure, will do.

> 
> 
> register_memory_notifier                   # 0 is returned on !CONFIG_MEMORY_HOTPLUG_SPARSE
>    blocking_notifier_chain_register
>       notifier_chain_register              # 0 is always returned
>      
>> -device_initcall(prevent_bootmem_remove_init);
>> +early_initcall(prevent_bootmem_remove_init);
>>   #endif
>>
> 
> Cheers,
> Gavin
> 
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH V3 2/3] arm64/mm/hotplug: Enable MEM_OFFLINE event handling
  2020-09-23  6:31   ` Gavin Shan
@ 2020-09-24  3:51     ` Anshuman Khandual
  0 siblings, 0 replies; 9+ messages in thread
From: Anshuman Khandual @ 2020-09-24  3:51 UTC (permalink / raw)
  To: Gavin Shan, linux-arm-kernel
  Cc: Mark Rutland, Will Deacon, catalin.marinas, Steve Capper,
	linux-kernel, Mark Brown, Marc Zyngier, will



On 09/23/2020 12:01 PM, Gavin Shan wrote:
> Hi Anshuman,
> 
> On 9/21/20 10:05 PM, Anshuman Khandual wrote:
>> This enables MEM_OFFLINE memory event handling. It will help intercept any
>> possible error condition such as if boot memory some how still got offlined
>> even after an explicit notifier failure, potentially by a future change in
>> generic hot plug framework. This would help detect such scenarios and help
>> debug further.
>>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Will Deacon <will@kernel.com>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: Marc Zyngier <maz@kernel.org>
>> Cc: Steve Capper <steve.capper@arm.com>
>> Cc: Mark Brown <broonie@kernel.org>
>> Cc: linux-arm-kernel@lists.infradead.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
> 
> I'm not sure if it makes sense since MEM_OFFLINE won't be triggered
> after NOTIFY_BAD is returned from MEM_GOING_OFFLINE. NOTIFY_BAD means
> the whole offline process is stopped. It would be guranteed by generic
> framework from syntax standpoint.

Right but the intent here is to catch any deviation in generic hotplug
semantics going forward.
 > 
> However, this looks good if MEM_OFFLINE is triggered without calling
> into MEM_GOING_OFFLINE previously, but it would be a bug from generic
> framework.

Exactly, this will just ensure that we know about any change or a bug
in the generic framework. But if required, this additional check can
be enabled only with DEBUG_VM.

> 
>>   arch/arm64/mm/mmu.c | 37 ++++++++++++++++++++++++++++++++-----
>>   1 file changed, 32 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>> index df3b7415b128..6b171bd88bcf 100644
>> --- a/arch/arm64/mm/mmu.c
>> +++ b/arch/arm64/mm/mmu.c
>> @@ -1482,13 +1482,40 @@ static int prevent_bootmem_remove_notifier(struct notifier_block *nb,
>>       unsigned long end_pfn = arg->start_pfn + arg->nr_pages;
>>       unsigned long pfn = arg->start_pfn;
>>   -    if (action != MEM_GOING_OFFLINE)
>> +    if ((action != MEM_GOING_OFFLINE) && (action != MEM_OFFLINE))
>>           return NOTIFY_OK;
>>   -    for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
>> -        ms = __pfn_to_section(pfn);
>> -        if (early_section(ms))
>> -            return NOTIFY_BAD;
>> +    if (action == MEM_GOING_OFFLINE) {
>> +        for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
>> +            ms = __pfn_to_section(pfn);
>> +            if (early_section(ms)) {
>> +                pr_warn("Boot memory offlining attempted\n");
>> +                return NOTIFY_BAD;
>> +            }
>> +        }
>> +    } else if (action == MEM_OFFLINE) {
>> +        for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
>> +            ms = __pfn_to_section(pfn);
>> +            if (early_section(ms)) {
>> +
>> +                /*
>> +                 * This should have never happened. Boot memory
>> +                 * offlining should have been prevented by this
>> +                 * very notifier. Probably some memory removal
>> +                 * procedure might have changed which would then
>> +                 * require further debug.
>> +                 */
>> +                pr_err("Boot memory offlined\n");
>> +
>> +                /*
>> +                 * Core memory hotplug does not process a return
>> +                 * code from the notifier for MEM_OFFLINE event.
>> +                 * Error condition has been reported. Report as
>> +                 * ignored.
>> +                 */
>> +                return NOTIFY_DONE;
>> +            }
>> +        }
>>       }
>>       return NOTIFY_OK;
>>   }
>>
> 
> It's pretty much irrelevant comment if the patch doesn't make sense:
> the logical block for MEM_GOING_OFFLINE would be reused by MEM_OFFLINE
> as they looks similar except the return value and error message :)

This can be reorganized in the above mentioned format as well. Without
much additional code or iteration, it might not need DEBUG_VM as well.

for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
	ms = __pfn_to_section(pfn);
	if (!early_section(ms))
		continue;

	if (action == MEM_GOING_OFFLINE) {
		pr_warn("Boot memory offlining attempted\n");
		return NOTIFY_BAD;
	}
	else if (action == MEM_OFFLINE) {
		pr_err("Boot memory offlined\n");
		return NOTIFY_DONE;
	}
}
return NOTIFY_OK;

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-09-24  3:51 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-21 12:05 [PATCH V3 0/3] arm64/mm/hotplug: Improve memory offline event notifier Anshuman Khandual
2020-09-21 12:05 ` [PATCH V3 1/3] arm64/mm/hotplug: Register boot memory hot remove notifier earlier Anshuman Khandual
2020-09-23  6:04   ` Gavin Shan
2020-09-24  3:23     ` Anshuman Khandual
2020-09-21 12:05 ` [PATCH V3 2/3] arm64/mm/hotplug: Enable MEM_OFFLINE event handling Anshuman Khandual
2020-09-23  4:44   ` Anshuman Khandual
2020-09-23  6:31   ` Gavin Shan
2020-09-24  3:51     ` Anshuman Khandual
2020-09-21 12:05 ` [PATCH V3 3/3] arm64/mm/hotplug: Ensure early memory sections are all online Anshuman Khandual

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).