linux-kernel.vger.kernel.org archive mirror
* [PATCH V5 0/3] arm64/mm/hotplug: Improve memory offline event notifier
@ 2020-11-09  4:28 Anshuman Khandual
  2020-11-09  4:28 ` [PATCH V5 1/3] arm64/mm/hotplug: Register boot memory hot remove notifier earlier Anshuman Khandual
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Anshuman Khandual @ 2020-11-09  4:28 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: gshan, Anshuman Khandual, Catalin Marinas, Will Deacon,
	Mark Rutland, Marc Zyngier, Steve Capper, Mark Brown,
	linux-kernel

This series brings three different changes to the only memory event notifier on
the arm64 platform. These changes improve its robustness while also enhancing
debug capabilities during potential memory offlining error conditions.

This applies on 5.10-rc3

Changes in V5:

- Added some more documentation in [PATCH 2/3]
- Used for_each_mem_range() as for_each_memblock() has been dropped
- validate_bootmem_online() just prints non-compliant early sections
- validate_bootmem_online() does not prevent notifier registration
- Folded two pr_err() statements into just a single one per Gavin

Changes in V4: (https://lore.kernel.org/linux-arm-kernel/1601387687-6077-1-git-send-email-anshuman.khandual@arm.com/)

- Dropped additional return in prevent_bootmem_remove_init() per Gavin
- Rearranged memory section loop in prevent_bootmem_remove_notifier() per Gavin
- Call out boot memory ranges for attempted offline or offline events

Changes in V3: (https://patchwork.kernel.org/project/linux-arm-kernel/list/?series=352717)

- Split the single patch into three patch series per Catalin
- Trigger changed from setup_arch() to early_initcall() per Catalin
- Renamed back memory_hotremove_notifier() as prevent_bootmem_remove_init()
- validate_bootmem_online() is now called from prevent_bootmem_remove_init() per Catalin
- Skip registering the notifier if validate_bootmem_online() returns negative

Changes in V2: (https://patchwork.kernel.org/patch/11732161/)

- Dropped all generic changes wrt MEM_CANCEL_OFFLINE reasons enumeration
- Dropped all related (processing MEM_CANCEL_OFFLINE reasons) changes on arm64
- Added validate_boot_mem_online_state() that gets called with early_initcall()
- Added CONFIG_MEMORY_HOTREMOVE check before registering memory notifier
- Moved notifier registration i.e memory_hotremove_notifier into setup_arch()

Changes in V1: (https://patchwork.kernel.org/project/linux-mm/list/?series=271237)

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Gavin Shan <gshan@redhat.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org

Anshuman Khandual (3):
  arm64/mm/hotplug: Register boot memory hot remove notifier earlier
  arm64/mm/hotplug: Enable MEM_OFFLINE event handling
  arm64/mm/hotplug: Ensure early memory sections are all online

 arch/arm64/mm/mmu.c | 95 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 91 insertions(+), 4 deletions(-)

-- 
2.20.1



* [PATCH V5 1/3] arm64/mm/hotplug: Register boot memory hot remove notifier earlier
  2020-11-09  4:28 [PATCH V5 0/3] arm64/mm/hotplug: Improve memory offline event notifier Anshuman Khandual
@ 2020-11-09  4:28 ` Anshuman Khandual
  2020-11-09  4:28 ` [PATCH V5 2/3] arm64/mm/hotplug: Enable MEM_OFFLINE event handling Anshuman Khandual
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Anshuman Khandual @ 2020-11-09  4:28 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: gshan, Anshuman Khandual, Catalin Marinas, Will Deacon,
	Mark Rutland, Marc Zyngier, Steve Capper, Mark Brown,
	linux-kernel

This moves memory notifier registration earlier in the boot process, from
device_initcall() to early_initcall(), which helps guard against potential
early boot memory offline requests. There should not be any actual
offlining requests until memory block devices are initialized with
memory_dev_init(), but the generic init sequence might change in the
future. Hence an early registration for the memory event notifier is
helpful. While here, skip the registration if CONFIG_MEMORY_HOTREMOVE is
not enabled, and call out when memory notifier registration fails.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
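Note for readers: the ordering argument in the commit message above can be
modelled in userspace. The following is only a minimal sketch, not kernel
code. It assumes that memory_dev_init() runs after the early initcalls and
before the device-level initcalls; model_boot(), log_step() and step_index()
are hypothetical helper names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_STEPS 8

/* Records the order in which the modelled boot steps run. */
static const char *boot_log[MAX_STEPS];
static int boot_log_len;

static void log_step(const char *name)
{
	assert(boot_log_len < MAX_STEPS);
	boot_log[boot_log_len++] = name;
}

/* Returns the position of a step in the log, or -1 if it never ran. */
static int step_index(const char *name)
{
	for (int i = 0; i < boot_log_len; i++)
		if (strcmp(boot_log[i], name) == 0)
			return i;
	return -1;
}

/*
 * Models the assumed boot order: early initcalls first, then the step
 * that creates memory block devices, then device-level initcalls.
 */
static void model_boot(bool notifier_is_early)
{
	boot_log_len = 0;

	if (notifier_is_early)
		log_step("notifier_registered");	/* early_initcall() */

	log_step("memory_dev_init");

	if (!notifier_is_early)
		log_step("notifier_registered");	/* device_initcall() */
}

/* Prints the recorded steps in the order they ran. */
static void print_boot_log(void)
{
	for (int i = 0; i < boot_log_len; i++)
		printf("%d: %s\n", i, boot_log[i]);
}
```

Calling model_boot(true) followed by print_boot_log() lists the notifier
registration ahead of memory_dev_init, which is the property this patch
relies on; model_boot(false) shows the previous ordering.
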
 arch/arm64/mm/mmu.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 1c0f3e02f731..71dd9d753b8b 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1510,7 +1510,16 @@ static struct notifier_block prevent_bootmem_remove_nb = {
 
 static int __init prevent_bootmem_remove_init(void)
 {
-	return register_memory_notifier(&prevent_bootmem_remove_nb);
+	int ret = 0;
+
+	if (!IS_ENABLED(CONFIG_MEMORY_HOTREMOVE))
+		return ret;
+
+	ret = register_memory_notifier(&prevent_bootmem_remove_nb);
+	if (ret)
+		pr_err("%s: Notifier registration failed %d\n", __func__, ret);
+
+	return ret;
 }
-device_initcall(prevent_bootmem_remove_init);
+early_initcall(prevent_bootmem_remove_init);
 #endif
-- 
2.20.1



* [PATCH V5 2/3] arm64/mm/hotplug: Enable MEM_OFFLINE event handling
  2020-11-09  4:28 [PATCH V5 0/3] arm64/mm/hotplug: Improve memory offline event notifier Anshuman Khandual
  2020-11-09  4:28 ` [PATCH V5 1/3] arm64/mm/hotplug: Register boot memory hot remove notifier earlier Anshuman Khandual
@ 2020-11-09  4:28 ` Anshuman Khandual
  2020-11-09  4:28 ` [PATCH V5 3/3] arm64/mm/hotplug: Ensure early memory sections are all online Anshuman Khandual
  2020-11-10 19:14 ` [PATCH V5 0/3] arm64/mm/hotplug: Improve memory offline event notifier Catalin Marinas
  3 siblings, 0 replies; 5+ messages in thread
From: Anshuman Khandual @ 2020-11-09  4:28 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: gshan, Anshuman Khandual, Catalin Marinas, Will Deacon,
	Mark Rutland, Marc Zyngier, Steve Capper, Mark Brown,
	linux-kernel

This enables MEM_OFFLINE memory event handling. It helps intercept any
possible error condition, such as boot memory somehow still getting offlined
even after an explicit notifier failure, potentially because of a future
change in the generic hotplug framework. This would help detect such
scenarios and debug them further. While here, also call out the first
section that is being attempted for offline or that got offlined.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
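Note for readers: the decision logic described in the commit message above
can be exercised in userspace. The following is only a minimal sketch under
stated assumptions, not the kernel implementation. All MODEL_* names, the
tiny section size, and the model_early[] table are hypothetical stand-ins
for the kernel's notifier actions, return codes, PAGES_PER_SECTION, and
early_section().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the memory notifier actions. */
enum model_action { MODEL_ONLINE, MODEL_GOING_OFFLINE, MODEL_OFFLINE };

/* Hypothetical stand-ins for the notifier return codes. */
enum model_verdict { MODEL_NOTIFY_DONE, MODEL_NOTIFY_OK, MODEL_NOTIFY_BAD };

#define MODEL_PAGES_PER_SECTION	4UL	/* tiny value, for illustration only */
#define MODEL_NR_SECTIONS	8

/* Sections 0 and 1 model boot memory; the rest model hot-added memory. */
static bool model_early[MODEL_NR_SECTIONS] = { true, true };

static enum model_verdict model_notifier(enum model_action action,
					 unsigned long start_pfn,
					 unsigned long nr_pages)
{
	unsigned long end_pfn = start_pfn + nr_pages;
	unsigned long pfn;

	if (action != MODEL_GOING_OFFLINE && action != MODEL_OFFLINE)
		return MODEL_NOTIFY_OK;

	for (pfn = start_pfn; pfn < end_pfn; pfn += MODEL_PAGES_PER_SECTION) {
		unsigned long sec = pfn / MODEL_PAGES_PER_SECTION;

		assert(sec < MODEL_NR_SECTIONS);
		if (!model_early[sec])
			continue;

		if (action == MODEL_GOING_OFFLINE) {
			/* Veto the request and report the first boot section. */
			fprintf(stderr, "section %lu: offlining attempted\n", sec);
			return MODEL_NOTIFY_BAD;
		}

		/* MODEL_OFFLINE: too late to veto, so only report it. */
		fprintf(stderr, "section %lu: offlined\n", sec);
		return MODEL_NOTIFY_DONE;
	}
	return MODEL_NOTIFY_OK;
}
```

For example, model_notifier(MODEL_GOING_OFFLINE, 0, 4) is vetoed because
section 0 is boot memory in this model, while the same request for pfn 8
onwards is allowed because only hot-added sections are covered.
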
 arch/arm64/mm/mmu.c | 34 ++++++++++++++++++++++++++++++++--
 1 file changed, 32 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 71dd9d753b8b..ca6d4952b733 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1493,13 +1493,43 @@ static int prevent_bootmem_remove_notifier(struct notifier_block *nb,
 	unsigned long end_pfn = arg->start_pfn + arg->nr_pages;
 	unsigned long pfn = arg->start_pfn;
 
-	if (action != MEM_GOING_OFFLINE)
+	if ((action != MEM_GOING_OFFLINE) && (action != MEM_OFFLINE))
 		return NOTIFY_OK;
 
 	for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
+		unsigned long start = PFN_PHYS(pfn);
+		unsigned long end = start + (1UL << PA_SECTION_SHIFT);
+
 		ms = __pfn_to_section(pfn);
-		if (early_section(ms))
+		if (!early_section(ms))
+			continue;
+
+		if (action == MEM_GOING_OFFLINE) {
+			/*
+			 * Boot memory removal is not supported. Prevent
+			 * it via blocking any attempted offline request
+			 * for the boot memory and just report it.
+			 */
+			pr_warn("Boot memory [%lx %lx] offlining attempted\n", start, end);
 			return NOTIFY_BAD;
+		} else if (action == MEM_OFFLINE) {
+			/*
+			 * This should have never happened. Boot memory
+			 * offlining should have been prevented by this
+			 * very notifier. Probably some memory removal
+			 * procedure might have changed which would then
+			 * require further debug.
+			 */
+			pr_err("Boot memory [%lx %lx] offlined\n", start, end);
+
+			/*
+			 * Core memory hotplug does not process a return
+			 * code from the notifier for MEM_OFFLINE events.
+			 * The error condition has been reported. Return
+			 * from here as if ignored.
+			 */
+			return NOTIFY_DONE;
+		}
 	}
 	return NOTIFY_OK;
 }
-- 
2.20.1



* [PATCH V5 3/3] arm64/mm/hotplug: Ensure early memory sections are all online
  2020-11-09  4:28 [PATCH V5 0/3] arm64/mm/hotplug: Improve memory offline event notifier Anshuman Khandual
  2020-11-09  4:28 ` [PATCH V5 1/3] arm64/mm/hotplug: Register boot memory hot remove notifier earlier Anshuman Khandual
  2020-11-09  4:28 ` [PATCH V5 2/3] arm64/mm/hotplug: Enable MEM_OFFLINE event handling Anshuman Khandual
@ 2020-11-09  4:28 ` Anshuman Khandual
  2020-11-10 19:14 ` [PATCH V5 0/3] arm64/mm/hotplug: Improve memory offline event notifier Catalin Marinas
  3 siblings, 0 replies; 5+ messages in thread
From: Anshuman Khandual @ 2020-11-09  4:28 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: gshan, Anshuman Khandual, Catalin Marinas, Will Deacon,
	Mark Rutland, Marc Zyngier, Steve Capper, Mark Brown,
	linux-kernel

This adds a validation function that scans the entire boot memory and makes
sure that all early memory sections are online. This check is essential for
the memory notifier to work properly, as it cannot prevent boot memory from
being offlined if some sections are not online to begin with. Because the
scan might be expensive on big memory systems, it is only enabled with
DEBUG_VM.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
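Note for readers: the section walk described in the commit message above can
be sketched in userspace. This is only an illustration under stated
assumptions, not the kernel implementation. MODEL_SECTION_SHIFT is a
hypothetical value rather than the arm64 PA_SECTION_SHIFT, and struct
model_range, model_online[] and count_offline_sections() are hypothetical
stand-ins for the memblock ranges, online_section() and the validation loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MODEL_SECTION_SHIFT	27	/* hypothetical section size: 128 MiB */
#define MODEL_SECTION_SIZE	(UINT64_C(1) << MODEL_SECTION_SHIFT)
#define MODEL_NR_SECTIONS	16

/* A physical memory range [start, end), like one reported at boot. */
struct model_range {
	uint64_t start;
	uint64_t end;
};

/* Per-section online state; stands in for online_section(). */
static bool model_online[MODEL_NR_SECTIONS];

/*
 * Walks every range one section at a time and reports each section
 * that is not online. Returns the number of offline sections found.
 */
static unsigned int count_offline_sections(const struct model_range *ranges,
					   size_t nr_ranges)
{
	unsigned int offline = 0;

	for (size_t i = 0; i < nr_ranges; i++) {
		for (uint64_t addr = ranges[i].start; addr < ranges[i].end;
		     addr += MODEL_SECTION_SIZE) {
			uint64_t sec = addr >> MODEL_SECTION_SHIFT;

			assert(sec < MODEL_NR_SECTIONS);
			if (model_online[sec])
				continue;

			printf("Boot memory [%llx %llx] is offline\n",
			       (unsigned long long)addr,
			       (unsigned long long)(addr + MODEL_SECTION_SIZE));
			offline++;
		}
	}
	return offline;
}
```

Sections that fall in a hole between two ranges are never visited, so their
state does not affect the count. As in the patch, the walk only reports
offline sections and does not change any state.
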
 arch/arm64/mm/mmu.c | 48 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index ca6d4952b733..f293f2222f50 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1538,6 +1538,53 @@ static struct notifier_block prevent_bootmem_remove_nb = {
 	.notifier_call = prevent_bootmem_remove_notifier,
 };
 
+/*
+ * This ensures that boot memory sections on the platform are online
+ * from early boot. Memory sections cannot be prevented from being
+ * offlined if, for some reason, they are not online to begin with.
+ * This helps validate the basic assumption on which the above memory
+ * event notifier works to prevent boot memory section offlining and
+ * its possible removal.
+ */
+static void validate_bootmem_online(void)
+{
+	phys_addr_t start, end, addr;
+	struct mem_section *ms;
+	u64 i;
+
+	/*
+	 * Scanning across all memblock might be expensive
+	 * on some big memory systems. Hence enable this
+	 * validation only with DEBUG_VM.
+	 */
+	if (!IS_ENABLED(CONFIG_DEBUG_VM))
+		return;
+
+	for_each_mem_range(i, &start, &end) {
+		for (addr = start; addr < end; addr += (1UL << PA_SECTION_SHIFT)) {
+			ms = __pfn_to_section(PHYS_PFN(addr));
+
+			/*
+			 * All memory ranges in the system at this point
+			 * should have been marked as early sections.
+			 */
+			WARN_ON(!early_section(ms));
+
+			/*
+			 * Memory notifier mechanism here to prevent boot
+			 * memory offlining depends on the fact that each
+			 * early section memory on the system is initially
+			 * online. Otherwise a given memory section which
+			 * is already offline will be overlooked and can
+			 * be removed completely. Call out such sections.
+			 */
+			if (!online_section(ms))
+				pr_err("Boot memory [%llx %llx] is offline, can be removed\n",
+					addr, addr + (1UL << PA_SECTION_SHIFT));
+		}
+	}
+}
+
 static int __init prevent_bootmem_remove_init(void)
 {
 	int ret = 0;
@@ -1545,6 +1592,7 @@ static int __init prevent_bootmem_remove_init(void)
 	if (!IS_ENABLED(CONFIG_MEMORY_HOTREMOVE))
 		return ret;
 
+	validate_bootmem_online();
 	ret = register_memory_notifier(&prevent_bootmem_remove_nb);
 	if (ret)
 		pr_err("%s: Notifier registration failed %d\n", __func__, ret);
-- 
2.20.1



* Re: [PATCH V5 0/3] arm64/mm/hotplug: Improve memory offline event notifier
  2020-11-09  4:28 [PATCH V5 0/3] arm64/mm/hotplug: Improve memory offline event notifier Anshuman Khandual
                   ` (2 preceding siblings ...)
  2020-11-09  4:28 ` [PATCH V5 3/3] arm64/mm/hotplug: Ensure early memory sections are all online Anshuman Khandual
@ 2020-11-10 19:14 ` Catalin Marinas
  3 siblings, 0 replies; 5+ messages in thread
From: Catalin Marinas @ 2020-11-10 19:14 UTC (permalink / raw)
  To: linux-arm-kernel, Anshuman Khandual
  Cc: Will Deacon, linux-kernel, Mark Rutland, Steve Capper,
	Mark Brown, Marc Zyngier, gshan

On Mon, 9 Nov 2020 09:58:54 +0530, Anshuman Khandual wrote:
> This series brings three different changes to the only memory event notifier on
> arm64 platform. These changes improve it's robustness while also enhancing debug
> capabilities during potential memory offlining error conditions.
> 
> This applies on 5.10-rc3
> 
> Changes in V5:
> 
> [...]

Applied to arm64 (for-next/mem-hotplug), thanks!

[1/3] arm64/mm/hotplug: Register boot memory hot remove notifier earlier
      https://git.kernel.org/arm64/c/cb45babe1b80
[2/3] arm64/mm/hotplug: Enable MEM_OFFLINE event handling
      https://git.kernel.org/arm64/c/9fb3d4a30338
[3/3] arm64/mm/hotplug: Ensure early memory sections are all online
      https://git.kernel.org/arm64/c/fdd99a4103c9

-- 
Catalin


