linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V4 0/3] arm64/mm/hotplug: Improve memory offline event notifier
@ 2020-09-29 13:54 Anshuman Khandual
  2020-09-29 13:54 ` [PATCH V4 1/3] arm64/mm/hotplug: Register boot memory hot remove notifier earlier Anshuman Khandual
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Anshuman Khandual @ 2020-09-29 13:54 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: catalin.marinas, will, gshan, Anshuman Khandual, Mark Rutland,
	Marc Zyngier, Steve Capper, Mark Brown, linux-kernel

This series brings three different changes to the only memory event notifier on
arm64 platform. These changes improve it's robustness while also enhancing debug
capabilities during potential memory offlining error conditions.

This applies on 5.9-rc7

Changes in V4:

- Dropped additional return in prevent_bootmem_remove_init() per Gavin
- Rearranged memory section loop in prevent_bootmem_remove_notifier() per Gavin
- Call out boot memory ranges for attempted offline or offline events

Changes in V3: (https://patchwork.kernel.org/project/linux-arm-kernel/list/?series=352717)

- Split the single patch into three patch series per Catalin
- Trigger changed from setup_arch() to early_initcall() per Catalin
- Renamed back memory_hotremove_notifier() as prevent_bootmem_remove_init()
- validate_bootmem_online() is now called from prevent_bootmem_remove_init() per Catalin
- Skip registering the notifier if validate_bootmem_online() returns negative

Changes in V2: (https://patchwork.kernel.org/patch/11732161/)

- Dropped all generic changes wrt MEM_CANCEL_OFFLINE reasons enumeration
- Dropped all related (processing MEM_CANCEL_OFFLINE reasons) changes on arm64
- Added validate_boot_mem_online_state() that gets called with early_initcall()
- Added CONFIG_MEMORY_HOTREMOVE check before registering memory notifier
- Moved notifier registration i.e memory_hotremove_notifier into setup_arch()

Changes in V1: (https://patchwork.kernel.org/project/linux-mm/list/?series=271237)

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Gavin Shan <gshan@redhat.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org

Anshuman Khandual (3):
  arm64/mm/hotplug: Register boot memory hot remove notifier earlier
  arm64/mm/hotplug: Enable MEM_OFFLINE event handling
  arm64/mm/hotplug: Ensure early memory sections are all online

 arch/arm64/mm/mmu.c | 101 ++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 97 insertions(+), 4 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH V4 1/3] arm64/mm/hotplug: Register boot memory hot remove notifier earlier
  2020-09-29 13:54 [PATCH V4 0/3] arm64/mm/hotplug: Improve memory offline event notifier Anshuman Khandual
@ 2020-09-29 13:54 ` Anshuman Khandual
  2020-10-01 13:51   ` Catalin Marinas
  2020-09-29 13:54 ` [PATCH V4 2/3] arm64/mm/hotplug: Enable MEM_OFFLINE event handling Anshuman Khandual
  2020-09-29 13:54 ` [PATCH V4 3/3] arm64/mm/hotplug: Ensure early memory sections are all online Anshuman Khandual
  2 siblings, 1 reply; 11+ messages in thread
From: Anshuman Khandual @ 2020-09-29 13:54 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: catalin.marinas, will, gshan, Anshuman Khandual, Mark Rutland,
	Marc Zyngier, Steve Capper, Mark Brown, linux-kernel

This moves memory notifier registration earlier in the boot process from
device_initcall() to early_initcall() which will help in guarding against
potential early boot memory offline requests. Even though there should not
be any actual offlinig requests till memory block devices are initialized
with memory_dev_init() but then generic init sequence might just change in
future. Hence an early registration for the memory event notifier would be
helpful. While here, just skip the registration if CONFIG_MEMORY_HOTREMOVE
is not enabled and also call out when memory notifier registration fails.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/mm/mmu.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 75df62fea1b6..4e70f4fea06c 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1499,7 +1499,16 @@ static struct notifier_block prevent_bootmem_remove_nb = {
 
 static int __init prevent_bootmem_remove_init(void)
 {
-	return register_memory_notifier(&prevent_bootmem_remove_nb);
+	int ret = 0;
+
+	if (!IS_ENABLED(CONFIG_MEMORY_HOTREMOVE))
+		return ret;
+
+	ret = register_memory_notifier(&prevent_bootmem_remove_nb);
+	if (ret)
+		pr_err("%s: Notifier registration failed %d\n", __func__, ret);
+
+	return ret;
 }
-device_initcall(prevent_bootmem_remove_init);
+early_initcall(prevent_bootmem_remove_init);
 #endif
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH V4 2/3] arm64/mm/hotplug: Enable MEM_OFFLINE event handling
  2020-09-29 13:54 [PATCH V4 0/3] arm64/mm/hotplug: Improve memory offline event notifier Anshuman Khandual
  2020-09-29 13:54 ` [PATCH V4 1/3] arm64/mm/hotplug: Register boot memory hot remove notifier earlier Anshuman Khandual
@ 2020-09-29 13:54 ` Anshuman Khandual
  2020-09-30 23:57   ` Gavin Shan
  2020-09-29 13:54 ` [PATCH V4 3/3] arm64/mm/hotplug: Ensure early memory sections are all online Anshuman Khandual
  2 siblings, 1 reply; 11+ messages in thread
From: Anshuman Khandual @ 2020-09-29 13:54 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: catalin.marinas, will, gshan, Anshuman Khandual, Mark Rutland,
	Marc Zyngier, Steve Capper, Mark Brown, linux-kernel

This enables MEM_OFFLINE memory event handling. It will help intercept any
possible error condition such as if boot memory some how still got offlined
even after an explicit notifier failure, potentially by a future change in
generic hot plug framework. This would help detect such scenarios and help
debug further. While here, also call out the first section being attempted
for offline or got offlined.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/mm/mmu.c | 29 +++++++++++++++++++++++++++--
 1 file changed, 27 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 4e70f4fea06c..90a30f5ebfc0 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1482,13 +1482,38 @@ static int prevent_bootmem_remove_notifier(struct notifier_block *nb,
 	unsigned long end_pfn = arg->start_pfn + arg->nr_pages;
 	unsigned long pfn = arg->start_pfn;
 
-	if (action != MEM_GOING_OFFLINE)
+	if ((action != MEM_GOING_OFFLINE) && (action != MEM_OFFLINE))
 		return NOTIFY_OK;
 
 	for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
+		unsigned long start = PFN_PHYS(pfn);
+		unsigned long end = start + (1UL << PA_SECTION_SHIFT);
+
 		ms = __pfn_to_section(pfn);
-		if (early_section(ms))
+		if (!early_section(ms))
+			continue;
+
+		if (action == MEM_GOING_OFFLINE) {
+			pr_warn("Boot memory [%lx %lx] offlining attempted\n", start, end);
 			return NOTIFY_BAD;
+		} else if (action == MEM_OFFLINE) {
+			/*
+			 * This should have never happened. Boot memory
+			 * offlining should have been prevented by this
+			 * very notifier. Probably some memory removal
+			 * procedure might have changed which would then
+			 * require further debug.
+			 */
+			pr_err("Boot memory [%lx %lx] offlined\n", start, end);
+
+			/*
+			 * Core memory hotplug does not process a return
+			 * code from the notifier for MEM_OFFLINE event.
+			 * Error condition has been reported. Report as
+			 * ignored.
+			 */
+			return NOTIFY_DONE;
+		}
 	}
 	return NOTIFY_OK;
 }
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH V4 3/3] arm64/mm/hotplug: Ensure early memory sections are all online
  2020-09-29 13:54 [PATCH V4 0/3] arm64/mm/hotplug: Improve memory offline event notifier Anshuman Khandual
  2020-09-29 13:54 ` [PATCH V4 1/3] arm64/mm/hotplug: Register boot memory hot remove notifier earlier Anshuman Khandual
  2020-09-29 13:54 ` [PATCH V4 2/3] arm64/mm/hotplug: Enable MEM_OFFLINE event handling Anshuman Khandual
@ 2020-09-29 13:54 ` Anshuman Khandual
  2020-10-01  0:53   ` Gavin Shan
  2 siblings, 1 reply; 11+ messages in thread
From: Anshuman Khandual @ 2020-09-29 13:54 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: catalin.marinas, will, gshan, Anshuman Khandual, Mark Rutland,
	Marc Zyngier, Steve Capper, Mark Brown, linux-kernel

This adds a validation function that scans the entire boot memory and makes
sure that all early memory sections are online. This check is essential for
the memory notifier to work properly, as it cannot prevent any boot memory
from offlining, if all sections are not online to begin with. The notifier
registration is skipped, if this validation does not go through. Although
the boot section scanning is selectively enabled with DEBUG_VM.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/mm/mmu.c | 59 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 59 insertions(+)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 90a30f5ebfc0..b67a657ea1ad 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1522,6 +1522,62 @@ static struct notifier_block prevent_bootmem_remove_nb = {
 	.notifier_call = prevent_bootmem_remove_notifier,
 };
 
+/*
+ * This ensures that boot memory sections on the plaltform are online
+ * during early boot. They could not be prevented from being offlined
+ * if for some reason they are not brought online to begin with. This
+ * help validate the basic assumption on which the above memory event
+ * notifier works to prevent boot memory offlining and it's possible
+ * removal.
+ */
+static bool validate_bootmem_online(void)
+{
+	struct memblock_region *mblk;
+	struct mem_section *ms;
+	unsigned long pfn, end_pfn, start, end;
+	bool all_online = true;
+
+	/*
+	 * Scanning across all memblock might be expensive
+	 * on some big memory systems. Hence enable this
+	 * validation only with DEBUG_VM.
+	 */
+	if (!IS_ENABLED(CONFIG_DEBUG_VM))
+		return all_online;
+
+	for_each_memblock(memory, mblk) {
+		pfn = PHYS_PFN(mblk->base);
+		end_pfn = PHYS_PFN(mblk->base + mblk->size);
+
+		for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
+			ms = __pfn_to_section(pfn);
+
+			/*
+			 * All memory ranges in the system at this point
+			 * should have been marked early sections.
+			 */
+			WARN_ON(!early_section(ms));
+
+			/*
+			 * Memory notifier mechanism here to prevent boot
+			 * memory offlining depends on the fact that each
+			 * early section memory on the system is intially
+			 * online. Otherwise a given memory section which
+			 * is already offline will be overlooked and can
+			 * be removed completely. Call out such sections.
+			 */
+			if (!online_section(ms)) {
+				start = PFN_PHYS(pfn);
+				end = start + (1UL << PA_SECTION_SHIFT);
+				pr_err("Memory range [%lx %lx] is offline\n", start, end);
+				pr_err("Memory range [%lx %lx] can be removed\n", start, end);
+				all_online = false;
+			}
+		}
+	}
+	return all_online;
+}
+
 static int __init prevent_bootmem_remove_init(void)
 {
 	int ret = 0;
@@ -1529,6 +1585,9 @@ static int __init prevent_bootmem_remove_init(void)
 	if (!IS_ENABLED(CONFIG_MEMORY_HOTREMOVE))
 		return ret;
 
+	if (!validate_bootmem_online())
+		return -EINVAL;
+
 	ret = register_memory_notifier(&prevent_bootmem_remove_nb);
 	if (ret)
 		pr_err("%s: Notifier registration failed %d\n", __func__, ret);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH V4 2/3] arm64/mm/hotplug: Enable MEM_OFFLINE event handling
  2020-09-29 13:54 ` [PATCH V4 2/3] arm64/mm/hotplug: Enable MEM_OFFLINE event handling Anshuman Khandual
@ 2020-09-30 23:57   ` Gavin Shan
  2020-10-06  2:59     ` Anshuman Khandual
  0 siblings, 1 reply; 11+ messages in thread
From: Gavin Shan @ 2020-09-30 23:57 UTC (permalink / raw)
  To: Anshuman Khandual, linux-arm-kernel
  Cc: catalin.marinas, will, Mark Rutland, Marc Zyngier, Steve Capper,
	Mark Brown, linux-kernel

Hi Anshuman,

On 9/29/20 11:54 PM, Anshuman Khandual wrote:
> This enables MEM_OFFLINE memory event handling. It will help intercept any
> possible error condition such as if boot memory some how still got offlined
> even after an explicit notifier failure, potentially by a future change in
> generic hot plug framework. This would help detect such scenarios and help
> debug further. While here, also call out the first section being attempted
> for offline or got offlined.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Steve Capper <steve.capper@arm.com>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>   arch/arm64/mm/mmu.c | 29 +++++++++++++++++++++++++++--
>   1 file changed, 27 insertions(+), 2 deletions(-)
> 

This looks good to me except a nit and it can be improved if
that looks reasonable and only when you get a chance for
respin.

Reviewed-by: Gavin Shan <gshan@redhat.com>

> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 4e70f4fea06c..90a30f5ebfc0 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -1482,13 +1482,38 @@ static int prevent_bootmem_remove_notifier(struct notifier_block *nb,
>   	unsigned long end_pfn = arg->start_pfn + arg->nr_pages;
>   	unsigned long pfn = arg->start_pfn;
>   
> -	if (action != MEM_GOING_OFFLINE)
> +	if ((action != MEM_GOING_OFFLINE) && (action != MEM_OFFLINE))
>   		return NOTIFY_OK;
>   
>   	for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
> +		unsigned long start = PFN_PHYS(pfn);
> +		unsigned long end = start + (1UL << PA_SECTION_SHIFT);
> +
>   		ms = __pfn_to_section(pfn);
> -		if (early_section(ms))
> +		if (!early_section(ms))
> +			continue;
> +

The discussion here is irrelevant to this patch itself. It seems
early_section() is coarse, which means all memory detected during
boot time won't be hotpluggable?

> +		if (action == MEM_GOING_OFFLINE) {
> +			pr_warn("Boot memory [%lx %lx] offlining attempted\n", start, end);
>   			return NOTIFY_BAD;
> +		} else if (action == MEM_OFFLINE) {
> +			/*
> +			 * This should have never happened. Boot memory
> +			 * offlining should have been prevented by this
> +			 * very notifier. Probably some memory removal
> +			 * procedure might have changed which would then
> +			 * require further debug.
> +			 */
> +			pr_err("Boot memory [%lx %lx] offlined\n", start, end);
> +
> +			/*
> +			 * Core memory hotplug does not process a return
> +			 * code from the notifier for MEM_OFFLINE event.
> +			 * Error condition has been reported. Report as
> +			 * ignored.
> +			 */
> +			return NOTIFY_DONE;
> +		}
>   	}
>   	return NOTIFY_OK;
>   }
> 

I think NOTIFY_BAD is returned for MEM_OFFLINE wouldn't be a
bad idea, even the core isn't handling the errno. With this,
the code can be simplified. However, it's not a big deal and
you probably evaluate and change when you need another respin:

     pr_warn("Boot memory [%lx %lx] %s\n",
             (action == MEM_GOING_OFFLINE) ? "offlining attempted" : "offlined",
             start, end);
     return NOTIFY_BAD;

Cheers,
Gavin
         


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH V4 3/3] arm64/mm/hotplug: Ensure early memory sections are all online
  2020-09-29 13:54 ` [PATCH V4 3/3] arm64/mm/hotplug: Ensure early memory sections are all online Anshuman Khandual
@ 2020-10-01  0:53   ` Gavin Shan
  2020-10-06  3:11     ` Anshuman Khandual
  0 siblings, 1 reply; 11+ messages in thread
From: Gavin Shan @ 2020-10-01  0:53 UTC (permalink / raw)
  To: Anshuman Khandual, linux-arm-kernel
  Cc: catalin.marinas, will, Mark Rutland, Marc Zyngier, Steve Capper,
	Mark Brown, linux-kernel

Hi Anshuman,

On 9/29/20 11:54 PM, Anshuman Khandual wrote:
> This adds a validation function that scans the entire boot memory and makes
> sure that all early memory sections are online. This check is essential for
> the memory notifier to work properly, as it cannot prevent any boot memory
> from offlining, if all sections are not online to begin with. The notifier
> registration is skipped, if this validation does not go through. Although
> the boot section scanning is selectively enabled with DEBUG_VM.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Steve Capper <steve.capper@arm.com>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>   arch/arm64/mm/mmu.c | 59 +++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 59 insertions(+)

I don't understand why this is necessary. The core already ensure the
corresponding section is online when trying to offline it. It's guranteed
that section is online when the notifier is triggered. I'm not sure if
there is anything I missed?
  

> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 90a30f5ebfc0..b67a657ea1ad 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -1522,6 +1522,62 @@ static struct notifier_block prevent_bootmem_remove_nb = {
>   	.notifier_call = prevent_bootmem_remove_notifier,
>   };
>   
> +/*
> + * This ensures that boot memory sections on the plaltform are online
                                                     ^^^^^^^^^
> + * during early boot. They could not be prevented from being offlined
> + * if for some reason they are not brought online to begin with. This
> + * help validate the basic assumption on which the above memory event
> + * notifier works to prevent boot memory offlining and it's possible
> + * removal.
> + */
> +static bool validate_bootmem_online(void)
> +{
> +	struct memblock_region *mblk;
> +	struct mem_section *ms;
> +	unsigned long pfn, end_pfn, start, end;
> +	bool all_online = true;
> +
> +	/*
> +	 * Scanning across all memblock might be expensive
> +	 * on some big memory systems. Hence enable this
> +	 * validation only with DEBUG_VM.
> +	 */
> +	if (!IS_ENABLED(CONFIG_DEBUG_VM))
> +		return all_online;
> +
> +	for_each_memblock(memory, mblk) {
> +		pfn = PHYS_PFN(mblk->base);
> +		end_pfn = PHYS_PFN(mblk->base + mblk->size);
> +

It's not a good idea to access @mblk->{base, size}. There are two
accessors: memblock_region_memory_{base, end}_pfn().

> +		for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
> +			ms = __pfn_to_section(pfn);
> +
> +			/*
> +			 * All memory ranges in the system at this point
> +			 * should have been marked early sections.
> +			 */
> +			WARN_ON(!early_section(ms));
> +
> +			/*
> +			 * Memory notifier mechanism here to prevent boot
> +			 * memory offlining depends on the fact that each
> +			 * early section memory on the system is intially
> +			 * online. Otherwise a given memory section which
> +			 * is already offline will be overlooked and can
> +			 * be removed completely. Call out such sections.
> +			 */

s/intially/initially

> +			if (!online_section(ms)) {
> +				start = PFN_PHYS(pfn);
> +				end = start + (1UL << PA_SECTION_SHIFT);
> +				pr_err("Memory range [%lx %lx] is offline\n", start, end);
> +				pr_err("Memory range [%lx %lx] can be removed\n", start, end);
> +				all_online = false;

These two error messages can be combined:

     pr_err("Memory range [%lx %lx] not online, can't be offlined\n",
            start, end);

I think you need to return @all_online immediately, without
checking if the subsequent sections are online or not? :)

> +			}
> +		}
> +	}
> +	return all_online;
> +}
> +
>   static int __init prevent_bootmem_remove_init(void)
>   {
>   	int ret = 0;
> @@ -1529,6 +1585,9 @@ static int __init prevent_bootmem_remove_init(void)
>   	if (!IS_ENABLED(CONFIG_MEMORY_HOTREMOVE))
>   		return ret;
>   
> +	if (!validate_bootmem_online())
> +		return -EINVAL;
> +
>   	ret = register_memory_notifier(&prevent_bootmem_remove_nb);
>   	if (ret)
>   		pr_err("%s: Notifier registration failed %d\n", __func__, ret);
> 

Cheers,
Gavin


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH V4 1/3] arm64/mm/hotplug: Register boot memory hot remove notifier earlier
  2020-09-29 13:54 ` [PATCH V4 1/3] arm64/mm/hotplug: Register boot memory hot remove notifier earlier Anshuman Khandual
@ 2020-10-01 13:51   ` Catalin Marinas
  0 siblings, 0 replies; 11+ messages in thread
From: Catalin Marinas @ 2020-10-01 13:51 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, will, gshan, Mark Rutland, Marc Zyngier,
	Steve Capper, Mark Brown, linux-kernel

On Tue, Sep 29, 2020 at 07:24:45PM +0530, Anshuman Khandual wrote:
> This moves memory notifier registration earlier in the boot process from
> device_initcall() to early_initcall() which will help in guarding against
> potential early boot memory offline requests. Even though there should not
> be any actual offlinig requests till memory block devices are initialized
> with memory_dev_init() but then generic init sequence might just change in
> future. Hence an early registration for the memory event notifier would be
> helpful. While here, just skip the registration if CONFIG_MEMORY_HOTREMOVE
> is not enabled and also call out when memory notifier registration fails.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Steve Capper <steve.capper@arm.com>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Reviewed-by: Gavin Shan <gshan@redhat.com>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH V4 2/3] arm64/mm/hotplug: Enable MEM_OFFLINE event handling
  2020-09-30 23:57   ` Gavin Shan
@ 2020-10-06  2:59     ` Anshuman Khandual
  2020-10-12  3:27       ` Gavin Shan
  0 siblings, 1 reply; 11+ messages in thread
From: Anshuman Khandual @ 2020-10-06  2:59 UTC (permalink / raw)
  To: Gavin Shan, linux-arm-kernel
  Cc: catalin.marinas, will, Mark Rutland, Marc Zyngier, Steve Capper,
	Mark Brown, linux-kernel



On 10/01/2020 05:27 AM, Gavin Shan wrote:
> Hi Anshuman,
> 
> On 9/29/20 11:54 PM, Anshuman Khandual wrote:
>> This enables MEM_OFFLINE memory event handling. It will help intercept any
>> possible error condition such as if boot memory some how still got offlined
>> even after an explicit notifier failure, potentially by a future change in
>> generic hot plug framework. This would help detect such scenarios and help
>> debug further. While here, also call out the first section being attempted
>> for offline or got offlined.
>>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: Marc Zyngier <maz@kernel.org>
>> Cc: Steve Capper <steve.capper@arm.com>
>> Cc: Mark Brown <broonie@kernel.org>
>> Cc: linux-arm-kernel@lists.infradead.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>>   arch/arm64/mm/mmu.c | 29 +++++++++++++++++++++++++++--
>>   1 file changed, 27 insertions(+), 2 deletions(-)
>>
> 
> This looks good to me except a nit and it can be improved if
> that looks reasonable and only when you get a chance for
> respin.
> 
> Reviewed-by: Gavin Shan <gshan@redhat.com>
> 
>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>> index 4e70f4fea06c..90a30f5ebfc0 100644
>> --- a/arch/arm64/mm/mmu.c
>> +++ b/arch/arm64/mm/mmu.c
>> @@ -1482,13 +1482,38 @@ static int prevent_bootmem_remove_notifier(struct notifier_block *nb,
>>       unsigned long end_pfn = arg->start_pfn + arg->nr_pages;
>>       unsigned long pfn = arg->start_pfn;
>>   -    if (action != MEM_GOING_OFFLINE)
>> +    if ((action != MEM_GOING_OFFLINE) && (action != MEM_OFFLINE))
>>           return NOTIFY_OK;
>>         for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
>> +        unsigned long start = PFN_PHYS(pfn);
>> +        unsigned long end = start + (1UL << PA_SECTION_SHIFT);
>> +
>>           ms = __pfn_to_section(pfn);
>> -        if (early_section(ms))
>> +        if (!early_section(ms))
>> +            continue;
>> +
> 
> The discussion here is irrelevant to this patch itself. It seems
> early_section() is coarse, which means all memory detected during
> boot time won't be hotpluggable?

Right, thats the policy being enforced on arm64 platform for various
critical reasons. Please refer to earlier discussions around memory
hot remove development on arm64.

> 
>> +        if (action == MEM_GOING_OFFLINE) {
>> +            pr_warn("Boot memory [%lx %lx] offlining attempted\n", start, end);
>>               return NOTIFY_BAD;
>> +        } else if (action == MEM_OFFLINE) {
>> +            /*
>> +             * This should have never happened. Boot memory
>> +             * offlining should have been prevented by this
>> +             * very notifier. Probably some memory removal
>> +             * procedure might have changed which would then
>> +             * require further debug.
>> +             */
>> +            pr_err("Boot memory [%lx %lx] offlined\n", start, end);
>> +
>> +            /*
>> +             * Core memory hotplug does not process a return
>> +             * code from the notifier for MEM_OFFLINE event.
>> +             * Error condition has been reported. Report as
>> +             * ignored.
>> +             */
>> +            return NOTIFY_DONE;
>> +        }
>>       }
>>       return NOTIFY_OK;
>>   }
>>
> 
> I think NOTIFY_BAD is returned for MEM_OFFLINE wouldn't be a
> bad idea, even the core isn't handling the errno. With this,
> the code can be simplified. However, it's not a big deal and
> you probably evaluate and change when you need another respin:
> 
>     pr_warn("Boot memory [%lx %lx] %s\n",
>             (action == MEM_GOING_OFFLINE) ? "offlining attempted" : "offlined",
>             start, end);
>     return NOTIFY_BAD;

Wondering whether returning a NOTIFY_BAD for MEM_OFFLINE event could
be somewhat risky if generic hotplug mechanism to change later. But
again, probably it might just be OK.

Regardless, also wanted to differentiate error messages for both the
cases. An warning messages i.e pr_warn() for MEM_GOING_OFFLINE which
suggests an unexpected user action but an error message i.e pr_err()
for MEM_OFFLINE which clearly indicates an error condition that needs
to be debugged further.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH V4 3/3] arm64/mm/hotplug: Ensure early memory sections are all online
  2020-10-01  0:53   ` Gavin Shan
@ 2020-10-06  3:11     ` Anshuman Khandual
  2020-10-12  4:07       ` Gavin Shan
  0 siblings, 1 reply; 11+ messages in thread
From: Anshuman Khandual @ 2020-10-06  3:11 UTC (permalink / raw)
  To: Gavin Shan, linux-arm-kernel
  Cc: catalin.marinas, will, Mark Rutland, Marc Zyngier, Steve Capper,
	Mark Brown, linux-kernel



On 10/01/2020 06:23 AM, Gavin Shan wrote:
> Hi Anshuman,
> 
> On 9/29/20 11:54 PM, Anshuman Khandual wrote:
>> This adds a validation function that scans the entire boot memory and makes
>> sure that all early memory sections are online. This check is essential for
>> the memory notifier to work properly, as it cannot prevent any boot memory
>> from offlining, if all sections are not online to begin with. The notifier
>> registration is skipped, if this validation does not go through. Although
>> the boot section scanning is selectively enabled with DEBUG_VM.
>>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: Marc Zyngier <maz@kernel.org>
>> Cc: Steve Capper <steve.capper@arm.com>
>> Cc: Mark Brown <broonie@kernel.org>
>> Cc: linux-arm-kernel@lists.infradead.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>>   arch/arm64/mm/mmu.c | 59 +++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 59 insertions(+)
> 
> I don't understand why this is necessary. The core already ensure the
> corresponding section is online when trying to offline it. It's guranteed
> that section is online when the notifier is triggered. I'm not sure if
> there is anything I missed?

Current memory notifier blocks any boot memory hot removal attempt via
blocking its offlining step itself. So if some sections in boot memory
are not online (because of a bug or change in init sequence) by the
time memory block device can be removed, the notifier loses the ability
to prevent its removal. This validation here, ensures that entire boot
memory is in online state, otherwise call out sections that are not,
with an warning that those boot memory can be removed.  

>  
> 
>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>> index 90a30f5ebfc0..b67a657ea1ad 100644
>> --- a/arch/arm64/mm/mmu.c
>> +++ b/arch/arm64/mm/mmu.c
>> @@ -1522,6 +1522,62 @@ static struct notifier_block prevent_bootmem_remove_nb = {
>>       .notifier_call = prevent_bootmem_remove_notifier,
>>   };
>>   +/*
>> + * This ensures that boot memory sections on the plaltform are online

Will fix.

>                                                     ^^^^^^^^^
>> + * during early boot. They could not be prevented from being offlined
>> + * if for some reason they are not brought online to begin with. This
>> + * help validate the basic assumption on which the above memory event
>> + * notifier works to prevent boot memory offlining and it's possible
>> + * removal.
>> + */
>> +static bool validate_bootmem_online(void)
>> +{
>> +    struct memblock_region *mblk;
>> +    struct mem_section *ms;
>> +    unsigned long pfn, end_pfn, start, end;
>> +    bool all_online = true;
>> +
>> +    /*
>> +     * Scanning across all memblock might be expensive
>> +     * on some big memory systems. Hence enable this
>> +     * validation only with DEBUG_VM.
>> +     */
>> +    if (!IS_ENABLED(CONFIG_DEBUG_VM))
>> +        return all_online;
>> +
>> +    for_each_memblock(memory, mblk) {
>> +        pfn = PHYS_PFN(mblk->base);
>> +        end_pfn = PHYS_PFN(mblk->base + mblk->size);
>> +
> 
> It's not a good idea to access @mblk->{base, size}. There are two
> accessors: memblock_region_memory_{base, end}_pfn().

Sure, will replace.

> 
>> +        for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
>> +            ms = __pfn_to_section(pfn);
>> +
>> +            /*
>> +             * All memory ranges in the system at this point
>> +             * should have been marked early sections.
>> +             */
>> +            WARN_ON(!early_section(ms));
>> +
>> +            /*
>> +             * Memory notifier mechanism here to prevent boot
>> +             * memory offlining depends on the fact that each
>> +             * early section memory on the system is intially
>> +             * online. Otherwise a given memory section which
>> +             * is already offline will be overlooked and can
>> +             * be removed completely. Call out such sections.
>> +             */
> 
> s/intially/initially

Will change.

> 
>> +            if (!online_section(ms)) {
>> +                start = PFN_PHYS(pfn);
>> +                end = start + (1UL << PA_SECTION_SHIFT);
>> +                pr_err("Memory range [%lx %lx] is offline\n", start, end);
>> +                pr_err("Memory range [%lx %lx] can be removed\n", start, end);
>> +                all_online = false;
> 
> These two error messages can be combined:
> 
>     pr_err("Memory range [%lx %lx] not online, can't be offlined\n",
>            start, end);

Will change but it is actually s/can't be offlined/can be removed/ instead.

> 
> I think you need to return @all_online immediately, without
> checking if the subsequent sections are online or not? :)

Thinking about this again. It might be better if the notifier registration
does not depend on return value from validate_bootmem_online(). Instead it
should proceed either way but after calling out all boot memory sections
that are not online. In that case notifier will atleast prevent removal of
some parts of boot memory which are online.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH V4 2/3] arm64/mm/hotplug: Enable MEM_OFFLINE event handling
  2020-10-06  2:59     ` Anshuman Khandual
@ 2020-10-12  3:27       ` Gavin Shan
  0 siblings, 0 replies; 11+ messages in thread
From: Gavin Shan @ 2020-10-12  3:27 UTC (permalink / raw)
  To: Anshuman Khandual, linux-arm-kernel
  Cc: catalin.marinas, will, Mark Rutland, Marc Zyngier, Steve Capper,
	Mark Brown, linux-kernel

Hi Anshuman,

On 10/6/20 1:59 PM, Anshuman Khandual wrote:
> On 10/01/2020 05:27 AM, Gavin Shan wrote:
>> On 9/29/20 11:54 PM, Anshuman Khandual wrote:
>>> This enables MEM_OFFLINE memory event handling. It will help intercept any
>>> possible error condition such as if boot memory some how still got offlined
>>> even after an explicit notifier failure, potentially by a future change in
>>> generic hot plug framework. This would help detect such scenarios and help
>>> debug further. While here, also call out the first section being attempted
>>> for offline or got offlined.
>>>
>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>>> Cc: Will Deacon <will@kernel.org>
>>> Cc: Mark Rutland <mark.rutland@arm.com>
>>> Cc: Marc Zyngier <maz@kernel.org>
>>> Cc: Steve Capper <steve.capper@arm.com>
>>> Cc: Mark Brown <broonie@kernel.org>
>>> Cc: linux-arm-kernel@lists.infradead.org
>>> Cc: linux-kernel@vger.kernel.org
>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>> ---
>>>    arch/arm64/mm/mmu.c | 29 +++++++++++++++++++++++++++--
>>>    1 file changed, 27 insertions(+), 2 deletions(-)
>>>
>>
>> This looks good to me except a nit and it can be improved if
>> that looks reasonable and only when you get a chance for
>> respin.
>>
>> Reviewed-by: Gavin Shan <gshan@redhat.com>
>>
>>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>>> index 4e70f4fea06c..90a30f5ebfc0 100644
>>> --- a/arch/arm64/mm/mmu.c
>>> +++ b/arch/arm64/mm/mmu.c
>>> @@ -1482,13 +1482,38 @@ static int prevent_bootmem_remove_notifier(struct notifier_block *nb,
>>>        unsigned long end_pfn = arg->start_pfn + arg->nr_pages;
>>>        unsigned long pfn = arg->start_pfn;
>>>    -    if (action != MEM_GOING_OFFLINE)
>>> +    if ((action != MEM_GOING_OFFLINE) && (action != MEM_OFFLINE))
>>>            return NOTIFY_OK;
>>>          for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
>>> +        unsigned long start = PFN_PHYS(pfn);
>>> +        unsigned long end = start + (1UL << PA_SECTION_SHIFT);
>>> +
>>>            ms = __pfn_to_section(pfn);
>>> -        if (early_section(ms))
>>> +        if (!early_section(ms))
>>> +            continue;
>>> +
>>
>> The discussion here is irrelevant to this patch itself. It seems
>> early_section() is coarse, which means all memory detected during
>> boot time won't be hotpluggable?
> 
> Right, thats the policy being enforced on arm64 platform for various
> critical reasons. Please refer to earlier discussions around memory
> hot remove development on arm64.
> 

Thanks for the hints.

>>
>>> +        if (action == MEM_GOING_OFFLINE) {
>>> +            pr_warn("Boot memory [%lx %lx] offlining attempted\n", start, end);
>>>                return NOTIFY_BAD;
>>> +        } else if (action == MEM_OFFLINE) {
>>> +            /*
>>> +             * This should have never happened. Boot memory
>>> +             * offlining should have been prevented by this
>>> +             * very notifier. Probably some memory removal
>>> +             * procedure might have changed which would then
>>> +             * require further debug.
>>> +             */
>>> +            pr_err("Boot memory [%lx %lx] offlined\n", start, end);
>>> +
>>> +            /*
>>> +             * Core memory hotplug does not process a return
>>> +             * code from the notifier for MEM_OFFLINE event.
>>> +             * Error condition has been reported. Report as
>>> +             * ignored.
>>> +             */
>>> +            return NOTIFY_DONE;
>>> +        }
>>>        }
>>>        return NOTIFY_OK;
>>>    }
>>>
>>
>> I think NOTIFY_BAD is returned for MEM_OFFLINE wouldn't be a
>> bad idea, even the core isn't handling the errno. With this,
>> the code can be simplified. However, it's not a big deal and
>> you probably evaluate and change when you need another respin:
>>
>>      pr_warn("Boot memory [%lx %lx] %s\n",
>>              (action == MEM_GOING_OFFLINE) ? "offlining attempted" : "offlined",
>>              start, end);
>>      return NOTIFY_BAD;
> 
> Wondering whether returning a NOTIFY_BAD for MEM_OFFLINE event could
> be somewhat risky if generic hotplug mechanism to change later. But
> again, probably it might just be OK.
> 
> Regardless, also wanted to differentiate error messages for both the
> cases. An warning messages i.e pr_warn() for MEM_GOING_OFFLINE which
> suggests an unexpected user action but an error message i.e pr_err()
> for MEM_OFFLINE which clearly indicates an error condition that needs
> to be debugged further.
> 

Ok, fair enough and it looks good to me either.

Cheers,
Gavin



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH V4 3/3] arm64/mm/hotplug: Ensure early memory sections are all online
  2020-10-06  3:11     ` Anshuman Khandual
@ 2020-10-12  4:07       ` Gavin Shan
  0 siblings, 0 replies; 11+ messages in thread
From: Gavin Shan @ 2020-10-12  4:07 UTC (permalink / raw)
  To: Anshuman Khandual, linux-arm-kernel
  Cc: catalin.marinas, will, Mark Rutland, Marc Zyngier, Steve Capper,
	Mark Brown, linux-kernel

Hi Anshuman,

On 10/6/20 2:11 PM, Anshuman Khandual wrote:
> On 10/01/2020 06:23 AM, Gavin Shan wrote:
>> On 9/29/20 11:54 PM, Anshuman Khandual wrote:
>>> This adds a validation function that scans the entire boot memory and makes
>>> sure that all early memory sections are online. This check is essential for
>>> the memory notifier to work properly, as it cannot prevent any boot memory
>>> from offlining, if all sections are not online to begin with. The notifier
>>> registration is skipped, if this validation does not go through. Although
>>> the boot section scanning is selectively enabled with DEBUG_VM.
>>>
>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>>> Cc: Will Deacon <will@kernel.org>
>>> Cc: Mark Rutland <mark.rutland@arm.com>
>>> Cc: Marc Zyngier <maz@kernel.org>
>>> Cc: Steve Capper <steve.capper@arm.com>
>>> Cc: Mark Brown <broonie@kernel.org>
>>> Cc: linux-arm-kernel@lists.infradead.org
>>> Cc: linux-kernel@vger.kernel.org
>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>> ---
>>>    arch/arm64/mm/mmu.c | 59 +++++++++++++++++++++++++++++++++++++++++++++
>>>    1 file changed, 59 insertions(+)
>>
>> I don't understand why this is necessary. The core already ensure the
>> corresponding section is online when trying to offline it. It's guranteed
>> that section is online when the notifier is triggered. I'm not sure if
>> there is anything I missed?
> 
> Current memory notifier blocks any boot memory hot removal attempt via
> blocking its offlining step itself. So if some sections in boot memory
> are not online (because of a bug or change in init sequence) by the
> time memory block device can be removed, the notifier loses the ability
> to prevent its removal. This validation here, ensures that entire boot
> memory is in online state, otherwise call out sections that are not,
> with an warning that those boot memory can be removed.
> 

Well. I think it should be very rare. I guess you don't observe the
errornous case so far? However, I think it's fine to add the check
since it's only enabled with CONFIG_DEBUG_VM.

>>   
>>
>>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>>> index 90a30f5ebfc0..b67a657ea1ad 100644
>>> --- a/arch/arm64/mm/mmu.c
>>> +++ b/arch/arm64/mm/mmu.c
>>> @@ -1522,6 +1522,62 @@ static struct notifier_block prevent_bootmem_remove_nb = {
>>>        .notifier_call = prevent_bootmem_remove_notifier,
>>>    };
>>>    +/*
>>> + * This ensures that boot memory sections on the plaltform are online
> 
> Will fix.
> 
>>                                                      ^^^^^^^^^
>>> + * during early boot. They could not be prevented from being offlined
>>> + * if for some reason they are not brought online to begin with. This
>>> + * help validate the basic assumption on which the above memory event
>>> + * notifier works to prevent boot memory offlining and it's possible
>>> + * removal.
>>> + */
>>> +static bool validate_bootmem_online(void)
>>> +{
>>> +    struct memblock_region *mblk;
>>> +    struct mem_section *ms;
>>> +    unsigned long pfn, end_pfn, start, end;
>>> +    bool all_online = true;
>>> +
>>> +    /*
>>> +     * Scanning across all memblock might be expensive
>>> +     * on some big memory systems. Hence enable this
>>> +     * validation only with DEBUG_VM.
>>> +     */
>>> +    if (!IS_ENABLED(CONFIG_DEBUG_VM))
>>> +        return all_online;
>>> +
>>> +    for_each_memblock(memory, mblk) {
>>> +        pfn = PHYS_PFN(mblk->base);
>>> +        end_pfn = PHYS_PFN(mblk->base + mblk->size);
>>> +
>>
>> It's not a good idea to access @mblk->{base, size}. There are two
>> accessors: memblock_region_memory_{base, end}_pfn().
> 
> Sure, will replace.
> 
>>
>>> +        for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
>>> +            ms = __pfn_to_section(pfn);
>>> +
>>> +            /*
>>> +             * All memory ranges in the system at this point
>>> +             * should have been marked early sections.
>>> +             */
>>> +            WARN_ON(!early_section(ms));
>>> +
>>> +            /*
>>> +             * Memory notifier mechanism here to prevent boot
>>> +             * memory offlining depends on the fact that each
>>> +             * early section memory on the system is intially
>>> +             * online. Otherwise a given memory section which
>>> +             * is already offline will be overlooked and can
>>> +             * be removed completely. Call out such sections.
>>> +             */
>>
>> s/intially/initially
> 
> Will change.
> 
>>
>>> +            if (!online_section(ms)) {
>>> +                start = PFN_PHYS(pfn);
>>> +                end = start + (1UL << PA_SECTION_SHIFT);
>>> +                pr_err("Memory range [%lx %lx] is offline\n", start, end);
>>> +                pr_err("Memory range [%lx %lx] can be removed\n", start, end);
>>> +                all_online = false;
>>
>> These two error messages can be combined:
>>
>>      pr_err("Memory range [%lx %lx] not online, can't be offlined\n",
>>             start, end);
> 
> Will change but it is actually s/can't be offlined/can be removed/ instead.
> 
>>
>> I think you need to return @all_online immediately, without
>> checking if the subsequent sections are online or not? :)
> 
> Thinking about this again. It might be better if the notifier registration
> does not depend on return value from validate_bootmem_online(). Instead it
> should proceed either way but after calling out all boot memory sections
> that are not online. In that case notifier will atleast prevent removal of
> some parts of boot memory which are online.
> 

Yes, agreed. However, the most important part is to print the errornous
messages introduced in validate_bootmem_online().

Cheers,
Gavin



^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2020-10-12  4:08 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-29 13:54 [PATCH V4 0/3] arm64/mm/hotplug: Improve memory offline event notifier Anshuman Khandual
2020-09-29 13:54 ` [PATCH V4 1/3] arm64/mm/hotplug: Register boot memory hot remove notifier earlier Anshuman Khandual
2020-10-01 13:51   ` Catalin Marinas
2020-09-29 13:54 ` [PATCH V4 2/3] arm64/mm/hotplug: Enable MEM_OFFLINE event handling Anshuman Khandual
2020-09-30 23:57   ` Gavin Shan
2020-10-06  2:59     ` Anshuman Khandual
2020-10-12  3:27       ` Gavin Shan
2020-09-29 13:54 ` [PATCH V4 3/3] arm64/mm/hotplug: Ensure early memory sections are all online Anshuman Khandual
2020-10-01  0:53   ` Gavin Shan
2020-10-06  3:11     ` Anshuman Khandual
2020-10-12  4:07       ` Gavin Shan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).