linux-kernel.vger.kernel.org archive mirror
* [PATCH V4 0/4] mm/memory_hotplug: Pre-validate the address range with platform
@ 2021-01-25  2:58 Anshuman Khandual
  2021-01-25  2:58 ` [PATCH V4 1/4] mm/memory_hotplug: Prevalidate the address range being added " Anshuman Khandual
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Anshuman Khandual @ 2021-01-25  2:58 UTC (permalink / raw)
  To: linux-mm, akpm, hca, catalin.marinas
  Cc: Anshuman Khandual, Oscar Salvador, Vasily Gorbik, Will Deacon,
	Ard Biesheuvel, Mark Rutland, David Hildenbrand,
	linux-arm-kernel, linux-s390, linux-kernel

This series adds a mechanism that allows platforms to weigh in and
prevalidate an incoming address range before proceeding further with memory
hotplug. This helps prevent platform-specific errors for the given address
range deeper down the hotplug call chain, which would otherwise inevitably
fail the hotplug itself.

This mechanism was suggested by David Hildenbrand during an earlier
discussion about a memory hotplug fix on the arm64 platform.

https://lore.kernel.org/linux-arm-kernel/1600332402-30123-1-git-send-email-anshuman.khandual@arm.com/

This mechanism focuses on the addressability aspect and not the
[sub]section alignment aspect. Hence check_hotplug_memory_range() and
check_pfn_span() have been left unchanged. It remains an open question
whether all of these can be unified into an expanded memhp_range_allowed()
check that could be called from multiple memory hot add and remove paths.

This series applies on v5.11-rc5 and has been tested on arm64, but only
build tested on s390.
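As a rough sketch of the idea (a hedged userspace model with illustrative names and an assumed 46-bit physical address limit, not the actual kernel code), the prevalidation boils down to an inclusive range containment test against a platform-provided range:

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model of the proposed check; struct and function names are
 * illustrative, the real code lives in mm/memory_hotplug.c. */
struct range_model {
	uint64_t start;	/* first byte, inclusive */
	uint64_t end;	/* last byte, inclusive */
};

/* Returns 1 when [start, start + size) fits inside the allowed range. */
static int range_allowed_model(uint64_t start, uint64_t size,
			       struct range_model allowed)
{
	uint64_t end = start + size;	/* exclusive end */

	/* start < end also rejects size == 0 and wrap-around overflow */
	return start < end && start >= allowed.start &&
	       (end - 1) <= allowed.end;
}
```

For example, with allowed = {0, (1ULL << 46) - 1}, a request fully below the limit passes, while one crossing the limit, a zero-sized request, or a wrapping request all fail.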

Changes in V4:

- Moved arch_get_mappable_range() earlier in vmem_add_mapping() on s390
- Moved mhp_range_allowed() back in pagemap_range() as in V2 series
- Changed max_phys as (1ULL << MAX_PHYSMEM_BITS) - 1 in mhp_get_pluggable_range()
- Renamed all memhp instances into mhp
- Dropped the RFC tag from the last patch

Changes in V3:

https://lore.kernel.org/linux-mm/1610975582-12646-1-git-send-email-anshuman.khandual@arm.com/

- Updated the commit message in [PATCH 1/3]
- Replaced 1 with 'true' and 0 with 'false' in memhp_range_allowed()
- Updated memhp_range.end as VMEM_MAX_PHYS - 1 and updated vmem_add_mapping() on s390
- Changed memhp_range_allowed() behaviour in __add_pages()
- Updated __add_pages() to return -E2BIG when memhp_range_allowed() fails for non-linear mapping based requests

Changes in V2:

https://lore.kernel.org/linux-mm/1608218912-28932-1-git-send-email-anshuman.khandual@arm.com/

- Changed s390 version per Heiko and updated the commit message
- Called memhp_range_allowed() only for arch_add_memory() in pagemap_range()
- Exported the symbol memhp_get_pluggable_range() 

Changes in V1:

https://lore.kernel.org/linux-mm/1607400978-31595-1-git-send-email-anshuman.khandual@arm.com/

- Fixed build problems with (MEMORY_HOTPLUG & !MEMORY_HOTREMOVE)
- Added missing prototype for arch_get_mappable_range()
- Added VM_BUG_ON() check for memhp_range_allowed() in arch_add_memory() per David

Changes in RFC V2:

https://lore.kernel.org/linux-mm/1606706992-26656-1-git-send-email-anshuman.khandual@arm.com/

Incorporated all review feedbacks from David.

- Added additional range check in __segment_load() on s390 which was lost
- Changed is_private init in pagemap_range()
- Moved the framework into mm/memory_hotplug.c
- Made arch_get_addressable_range() a __weak function
- Renamed arch_get_addressable_range() as arch_get_mappable_range()
- Callback arch_get_mappable_range() only handles range requiring linear mapping
- Merged multiple memhp_range_allowed() checks in register_memory_resource()
- Replaced WARN() with pr_warn() in memhp_range_allowed()
- Replaced error return code -ERANGE with -E2BIG

Changes in RFC V1:

https://lore.kernel.org/linux-mm/1606098529-7907-1-git-send-email-anshuman.khandual@arm.com/

Cc: Oscar Salvador <osalvador@suse.de>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-s390@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org

Anshuman Khandual (3):
  mm/memory_hotplug: Prevalidate the address range being added with platform
  arm64/mm: Define arch_get_mappable_range()
  s390/mm: Define arch_get_mappable_range()

David Hildenbrand (1):
  virtio-mem: check against mhp_get_pluggable_range() which memory we can hotplug

 arch/arm64/mm/mmu.c            | 15 +++----
 arch/s390/mm/init.c            |  1 +
 arch/s390/mm/vmem.c            | 14 +++++-
 drivers/virtio/virtio_mem.c    | 40 +++++++++++------
 include/linux/memory_hotplug.h | 10 +++++
 mm/memory_hotplug.c            | 78 +++++++++++++++++++++++++---------
 mm/memremap.c                  |  6 ++-
 7 files changed, 122 insertions(+), 42 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH V4 1/4] mm/memory_hotplug: Prevalidate the address range being added with platform
  2021-01-25  2:58 [PATCH V4 0/4] mm/memory_hotplug: Pre-validate the address range with platform Anshuman Khandual
@ 2021-01-25  2:58 ` Anshuman Khandual
  2021-01-25  9:21   ` David Hildenbrand
  2021-01-25  2:58 ` [PATCH V4 2/4] arm64/mm: Define arch_get_mappable_range() Anshuman Khandual
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Anshuman Khandual @ 2021-01-25  2:58 UTC (permalink / raw)
  To: linux-mm, akpm, hca, catalin.marinas
  Cc: Anshuman Khandual, Oscar Salvador, Vasily Gorbik, Will Deacon,
	Ard Biesheuvel, Mark Rutland, David Hildenbrand,
	linux-arm-kernel, linux-s390, linux-kernel

This introduces mhp_range_allowed(), which can be called in various memory
hotplug paths to prevalidate, with the platform, the address range that is
being added. mhp_range_allowed() calls mhp_get_pluggable_range(), which
provides the applicable address range depending on whether a linear mapping
is required or not. For ranges that require a linear mapping, it calls a new
arch callback, arch_get_mappable_range(), which the platform can override.
The new callback thus gives the platform an opportunity to configure
acceptable memory hotplug address ranges in case there are constraints.

This mechanism will help prevent platform-specific errors deep down in the
hotplug call chain. It drops the now redundant
check_hotplug_memory_addressable() check in __add_pages() and instead adds
a VM_BUG_ON() check to ensure that the range has been validated with
mhp_range_allowed() earlier in the call chain. Besides,
mhp_get_pluggable_range() can also be used by potential memory hotplug
callers to query the physical address range that would be allowed on a
given platform.

This does not really add any new range check in generic memory hotplug;
instead it compensates for the checks lost in arch_add_memory(), where
applicable, and in check_hotplug_memory_addressable(), with a unified
mhp_range_allowed().
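To illustrate how the fallback and the clamping interact, here is a hedged userspace model: a function pointer stands in for the kernel's __weak symbol, and the 46-bit MAX_PHYSMEM_BITS value is only an assumption for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_PHYSMEM_BITS 46	/* illustrative only */

struct mhp_range_model {
	uint64_t start, end;	/* both inclusive */
};

/* Fallback: the whole physical address space, like the __weak default. */
static struct mhp_range_model default_mappable_range(void)
{
	return (struct mhp_range_model){ .start = 0, .end = UINT64_MAX };
}

/* Platforms override this; modeled as a settable hook instead of __weak. */
static struct mhp_range_model (*mappable_range_hook)(void) =
	default_mappable_range;

static struct mhp_range_model get_pluggable_range_model(bool need_mapping)
{
	const uint64_t max_phys = (1ULL << MODEL_MAX_PHYSMEM_BITS) - 1;
	struct mhp_range_model r = { .start = 0, .end = max_phys };

	if (need_mapping) {
		r = mappable_range_hook();
		if (r.start > max_phys) {
			/* nothing mappable below the physical limit */
			r.start = 0;
			r.end = 0;
		}
		if (r.end > max_phys)
			r.end = max_phys;	/* the min_t() clamp */
	}
	return r;
}
```

With the default hook in place, both the mapped and unmapped cases collapse to [0..max_phys]; a platform hook that starts above max_phys yields the empty {0, 0} range.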

Cc: David Hildenbrand <david@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 include/linux/memory_hotplug.h | 10 +++++
 mm/memory_hotplug.c            | 78 +++++++++++++++++++++++++---------
 mm/memremap.c                  |  6 ++-
 3 files changed, 74 insertions(+), 20 deletions(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 15acce5ab106..e4c80561e519 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -70,6 +70,9 @@ typedef int __bitwise mhp_t;
  */
 #define MEMHP_MERGE_RESOURCE	((__force mhp_t)BIT(0))
 
+bool mhp_range_allowed(u64 start, u64 size, bool need_mapping);
+struct range mhp_get_pluggable_range(bool need_mapping);
+
 /*
  * Extended parameters for memory hotplug:
  * altmap: alternative allocator for memmap array (optional)
@@ -281,6 +284,13 @@ static inline bool movable_node_is_enabled(void)
 }
 #endif /* ! CONFIG_MEMORY_HOTPLUG */
 
+/*
+ * Keep this declaration outside CONFIG_MEMORY_HOTPLUG as some
+ * platforms might override and use arch_get_mappable_range()
+ * for internal non memory hotplug purposes.
+ */
+struct range arch_get_mappable_range(void);
+
 #if defined(CONFIG_MEMORY_HOTPLUG) || defined(CONFIG_DEFERRED_STRUCT_PAGE_INIT)
 /*
  * pgdat resizing functions
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index f9d57b9be8c7..fb734a865807 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -107,6 +107,9 @@ static struct resource *register_memory_resource(u64 start, u64 size,
 	if (strcmp(resource_name, "System RAM"))
 		flags |= IORESOURCE_SYSRAM_DRIVER_MANAGED;
 
+	if (!mhp_range_allowed(start, size, true))
+		return ERR_PTR(-E2BIG);
+
 	/*
 	 * Make sure value parsed from 'mem=' only restricts memory adding
 	 * while booting, so that memory hotplug won't be impacted. Please
@@ -284,22 +287,6 @@ static int check_pfn_span(unsigned long pfn, unsigned long nr_pages,
 	return 0;
 }
 
-static int check_hotplug_memory_addressable(unsigned long pfn,
-					    unsigned long nr_pages)
-{
-	const u64 max_addr = PFN_PHYS(pfn + nr_pages) - 1;
-
-	if (max_addr >> MAX_PHYSMEM_BITS) {
-		const u64 max_allowed = (1ull << (MAX_PHYSMEM_BITS + 1)) - 1;
-		WARN(1,
-		     "Hotplugged memory exceeds maximum addressable address, range=%#llx-%#llx, maximum=%#llx\n",
-		     (u64)PFN_PHYS(pfn), max_addr, max_allowed);
-		return -E2BIG;
-	}
-
-	return 0;
-}
-
 /*
  * Reasonably generic function for adding memory.  It is
  * expected that archs that support memory hotplug will
@@ -317,9 +304,7 @@ int __ref __add_pages(int nid, unsigned long pfn, unsigned long nr_pages,
 	if (WARN_ON_ONCE(!params->pgprot.pgprot))
 		return -EINVAL;
 
-	err = check_hotplug_memory_addressable(pfn, nr_pages);
-	if (err)
-		return err;
+	VM_BUG_ON(!mhp_range_allowed(PFN_PHYS(pfn), nr_pages * PAGE_SIZE, false));
 
 	if (altmap) {
 		/*
@@ -1180,6 +1165,61 @@ int add_memory_driver_managed(int nid, u64 start, u64 size,
 }
 EXPORT_SYMBOL_GPL(add_memory_driver_managed);
 
+/*
+ * Platforms should define arch_get_mappable_range() that provides
+ * maximum possible addressable physical memory range for which the
+ * linear mapping could be created. The platform returned address
+ * range must adhere to these following semantics.
+ *
+ * - range.start <= range.end
+ * - Range includes both end points [range.start..range.end]
+ *
+ * There is also a fallback definition provided here, allowing the
+ * entire possible physical address range in case any platform does
+ * not define arch_get_mappable_range().
+ */
+struct range __weak arch_get_mappable_range(void)
+{
+	struct range mhp_range = {
+		.start = 0UL,
+		.end = -1ULL,
+	};
+	return mhp_range;
+}
+
+struct range mhp_get_pluggable_range(bool need_mapping)
+{
+	const u64 max_phys = (1ULL << MAX_PHYSMEM_BITS) - 1;
+	struct range mhp_range;
+
+	if (need_mapping) {
+		mhp_range = arch_get_mappable_range();
+		if (mhp_range.start > max_phys) {
+			mhp_range.start = 0;
+			mhp_range.end = 0;
+		}
+		mhp_range.end = min_t(u64, mhp_range.end, max_phys);
+	} else {
+		mhp_range.start = 0;
+		mhp_range.end = max_phys;
+	}
+	return mhp_range;
+}
+EXPORT_SYMBOL_GPL(mhp_get_pluggable_range);
+
+bool mhp_range_allowed(u64 start, u64 size, bool need_mapping)
+{
+	struct range mhp_range = mhp_get_pluggable_range(need_mapping);
+	u64 end = start + size;
+
+	if (start < end && start >= mhp_range.start && (end - 1) <= mhp_range.end)
+		return true;
+
+	pr_warn("Hotplug memory [%#llx-%#llx] exceeds maximum addressable range [%#llx-%#llx]\n",
+		start, end, mhp_range.start, mhp_range.end);
+	return false;
+}
+
 #ifdef CONFIG_MEMORY_HOTREMOVE
 /*
  * Confirm all pages in a range [start, end) belong to the same zone (skipping
diff --git a/mm/memremap.c b/mm/memremap.c
index 16b2fb482da1..fcc5679e3fab 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -185,6 +185,7 @@ static void dev_pagemap_percpu_release(struct percpu_ref *ref)
 static int pagemap_range(struct dev_pagemap *pgmap, struct mhp_params *params,
 		int range_id, int nid)
 {
+	const bool is_private = pgmap->type == MEMORY_DEVICE_PRIVATE;
 	struct range *range = &pgmap->ranges[range_id];
 	struct dev_pagemap *conflict_pgmap;
 	int error, is_ram;
@@ -230,6 +231,9 @@ static int pagemap_range(struct dev_pagemap *pgmap, struct mhp_params *params,
 	if (error)
 		goto err_pfn_remap;
 
+	if (!mhp_range_allowed(range->start, range_len(range), !is_private))
+		goto err_pfn_remap;
+
 	mem_hotplug_begin();
 
 	/*
@@ -243,7 +247,7 @@ static int pagemap_range(struct dev_pagemap *pgmap, struct mhp_params *params,
 	 * the CPU, we do want the linear mapping and thus use
 	 * arch_add_memory().
 	 */
-	if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
+	if (is_private) {
 		error = add_pages(nid, PHYS_PFN(range->start),
 				PHYS_PFN(range_len(range)), params);
 	} else {
-- 
2.20.1



* [PATCH V4 2/4] arm64/mm: Define arch_get_mappable_range()
  2021-01-25  2:58 [PATCH V4 0/4] mm/memory_hotplug: Pre-validate the address range with platform Anshuman Khandual
  2021-01-25  2:58 ` [PATCH V4 1/4] mm/memory_hotplug: Prevalidate the address range being added " Anshuman Khandual
@ 2021-01-25  2:58 ` Anshuman Khandual
  2021-01-25  2:58 ` [PATCH V4 3/4] s390/mm: " Anshuman Khandual
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Anshuman Khandual @ 2021-01-25  2:58 UTC (permalink / raw)
  To: linux-mm, akpm, hca, catalin.marinas
  Cc: Anshuman Khandual, Oscar Salvador, Vasily Gorbik, Will Deacon,
	Ard Biesheuvel, Mark Rutland, David Hildenbrand,
	linux-arm-kernel, linux-s390, linux-kernel

This overrides arch_get_mappable_range() on the arm64 platform, to be used
with the recently added generic framework. It drops inside_linear_region()
and the subsequent check in arch_add_memory(), which are no longer
required. It also adds a VM_BUG_ON() check to ensure that
mhp_range_allowed() has already been called.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/mm/mmu.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index ae0c3d023824..b14fd1b6b1a0 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1442,16 +1442,19 @@ static void __remove_pgd_mapping(pgd_t *pgdir, unsigned long start, u64 size)
 	free_empty_tables(start, end, PAGE_OFFSET, PAGE_END);
 }
 
-static bool inside_linear_region(u64 start, u64 size)
+struct range arch_get_mappable_range(void)
 {
+	struct range mhp_range;
+
 	/*
 	 * Linear mapping region is the range [PAGE_OFFSET..(PAGE_END - 1)]
 	 * accommodating both its ends but excluding PAGE_END. Max physical
 	 * range which can be mapped inside this linear mapping range, must
 	 * also be derived from its end points.
 	 */
-	return start >= __pa(_PAGE_OFFSET(vabits_actual)) &&
-	       (start + size - 1) <= __pa(PAGE_END - 1);
+	mhp_range.start = __pa(_PAGE_OFFSET(vabits_actual));
+	mhp_range.end =  __pa(PAGE_END - 1);
+	return mhp_range;
 }
 
 int arch_add_memory(int nid, u64 start, u64 size,
@@ -1459,11 +1462,7 @@ int arch_add_memory(int nid, u64 start, u64 size,
 {
 	int ret, flags = 0;
 
-	if (!inside_linear_region(start, size)) {
-		pr_err("[%llx %llx] is outside linear mapping region\n", start, start + size);
-		return -EINVAL;
-	}
-
+	VM_BUG_ON(!mhp_range_allowed(start, size, true));
 	if (rodata_full || debug_pagealloc_enabled())
 		flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
 
-- 
2.20.1



* [PATCH V4 3/4] s390/mm: Define arch_get_mappable_range()
  2021-01-25  2:58 [PATCH V4 0/4] mm/memory_hotplug: Pre-validate the address range with platform Anshuman Khandual
  2021-01-25  2:58 ` [PATCH V4 1/4] mm/memory_hotplug: Prevalidate the address range being added " Anshuman Khandual
  2021-01-25  2:58 ` [PATCH V4 2/4] arm64/mm: Define arch_get_mappable_range() Anshuman Khandual
@ 2021-01-25  2:58 ` Anshuman Khandual
  2021-01-25  2:58 ` [PATCH V4 4/4] virtio-mem: check against mhp_get_pluggable_range() which memory we can hotplug Anshuman Khandual
  2021-01-25  9:25 ` [PATCH V4 0/4] mm/memory_hotplug: Pre-validate the address range with platform David Hildenbrand
  4 siblings, 0 replies; 11+ messages in thread
From: Anshuman Khandual @ 2021-01-25  2:58 UTC (permalink / raw)
  To: linux-mm, akpm, hca, catalin.marinas
  Cc: Anshuman Khandual, Oscar Salvador, Vasily Gorbik, Will Deacon,
	Ard Biesheuvel, Mark Rutland, David Hildenbrand,
	linux-arm-kernel, linux-s390, linux-kernel

This overrides arch_get_mappable_range() on the s390 platform, to be used
with the recently added generic framework. It modifies the existing range
check in vmem_add_mapping() to use arch_get_mappable_range(). It also adds
a VM_BUG_ON() check to ensure that mhp_range_allowed() has already been
called on the hotplug path.
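The reworked vmem_add_mapping() check can be modeled in userspace like this (hypothetical names; the kernel returns -ERANGE, and the model assumes the inclusive range end is below UINT64_MAX, as VMEM_MAX_PHYS - 1 is):

```c
#include <assert.h>
#include <stdint.h>

struct vmem_range_model {
	uint64_t start, end;	/* both inclusive; end < UINT64_MAX assumed */
};

/* Returns 0 on success, -1 (standing in for -ERANGE) on a bad range. */
static int vmem_check_model(uint64_t start, uint64_t size,
			    struct vmem_range_model r)
{
	if (start < r.start ||
	    start + size > r.end + 1 ||	/* past the inclusive end */
	    start + size < start)	/* wrap-around overflow */
		return -1;
	return 0;
}
```

A range ending exactly at the inclusive limit passes, one byte beyond it fails, and a wrapping start+size is caught by the overflow test.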

Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: linux-s390@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/s390/mm/init.c |  1 +
 arch/s390/mm/vmem.c | 14 +++++++++++++-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 73a163065b95..0e76b2127dc6 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -297,6 +297,7 @@ int arch_add_memory(int nid, u64 start, u64 size,
 	if (WARN_ON_ONCE(params->pgprot.pgprot != PAGE_KERNEL.pgprot))
 		return -EINVAL;
 
+	VM_BUG_ON(!mhp_range_allowed(start, size, true));
 	rc = vmem_add_mapping(start, size);
 	if (rc)
 		return rc;
diff --git a/arch/s390/mm/vmem.c b/arch/s390/mm/vmem.c
index 01f3a5f58e64..82dbf9450105 100644
--- a/arch/s390/mm/vmem.c
+++ b/arch/s390/mm/vmem.c
@@ -4,6 +4,7 @@
  *    Author(s): Heiko Carstens <heiko.carstens@de.ibm.com>
  */
 
+#include <linux/memory_hotplug.h>
 #include <linux/memblock.h>
 #include <linux/pfn.h>
 #include <linux/mm.h>
@@ -532,11 +533,22 @@ void vmem_remove_mapping(unsigned long start, unsigned long size)
 	mutex_unlock(&vmem_mutex);
 }
 
+struct range arch_get_mappable_range(void)
+{
+	struct range mhp_range;
+
+	mhp_range.start = 0;
+	mhp_range.end =  VMEM_MAX_PHYS - 1;
+	return mhp_range;
+}
+
 int vmem_add_mapping(unsigned long start, unsigned long size)
 {
+	struct range range = arch_get_mappable_range();
 	int ret;
 
-	if (start + size > VMEM_MAX_PHYS ||
+	if (start < range.start ||
+	    start + size > range.end + 1 ||
 	    start + size < start)
 		return -ERANGE;
 
-- 
2.20.1



* [PATCH V4 4/4] virtio-mem: check against mhp_get_pluggable_range() which memory we can hotplug
  2021-01-25  2:58 [PATCH V4 0/4] mm/memory_hotplug: Pre-validate the address range with platform Anshuman Khandual
                   ` (2 preceding siblings ...)
  2021-01-25  2:58 ` [PATCH V4 3/4] s390/mm: " Anshuman Khandual
@ 2021-01-25  2:58 ` Anshuman Khandual
  2021-01-25 12:01   ` David Hildenbrand
  2021-01-25  9:25 ` [PATCH V4 0/4] mm/memory_hotplug: Pre-validate the address range with platform David Hildenbrand
  4 siblings, 1 reply; 11+ messages in thread
From: Anshuman Khandual @ 2021-01-25  2:58 UTC (permalink / raw)
  To: linux-mm, akpm, hca, catalin.marinas
  Cc: Anshuman Khandual, Oscar Salvador, Vasily Gorbik, Will Deacon,
	Ard Biesheuvel, Mark Rutland, David Hildenbrand,
	linux-arm-kernel, linux-s390, linux-kernel, Michael S. Tsirkin,
	Jason Wang, Pankaj Gupta, Wei Yang, teawater, Pankaj Gupta,
	Jonathan Cameron, Michal Hocko

From: David Hildenbrand <david@redhat.com>

Right now, we only check against MAX_PHYSMEM_BITS - but it turns out there
are more restrictions on which memory we can actually hotplug, especially
on arm64 or s390x once we support them: we might receive something like
-E2BIG or -ERANGE from add_memory_driver_managed(), stopping device
operation.

So, check right when initializing the device which memory we can add,
warning the user. Try only adding actually pluggable ranges: in the worst
case, no memory provided by our device is pluggable.

In the usual case, we expect all device memory to be pluggable, and in
corner cases only some memory at the end of the device-managed memory
region to not be pluggable.
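The device-side adjustment amounts to rounding the first address that is both device-covered and pluggable up to the next block boundary; a userspace sketch with hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

/* Model of picking the first usable memory block: start from whichever is
 * higher, the device base address or the start of the pluggable range, and
 * round up to the next block boundary. Mirrors the max_t()-based rounding
 * in virtio_mem_init(); names are illustrative. */
static uint64_t first_usable_block_id(uint64_t dev_addr,
				      uint64_t pluggable_start,
				      uint64_t block_size)
{
	uint64_t base = dev_addr > pluggable_start ? dev_addr : pluggable_start;

	/* (base + block_size - 1) / block_size == ceil(base / block_size) */
	return (base + block_size - 1) / block_size;
}
```

When the pluggable range starts above the device base, the first usable block moves up accordingly; if the start is not block aligned, the partial block in front of it is skipped.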

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: catalin.marinas@arm.com
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: hca@linux.ibm.com
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 drivers/virtio/virtio_mem.c | 40 +++++++++++++++++++++++++------------
 1 file changed, 27 insertions(+), 13 deletions(-)

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index 9fc9ec4a25f5..14c17c5c1695 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -2222,7 +2222,7 @@ static int virtio_mem_unplug_pending_mb(struct virtio_mem *vm)
  */
 static void virtio_mem_refresh_config(struct virtio_mem *vm)
 {
-	const uint64_t phys_limit = 1UL << MAX_PHYSMEM_BITS;
+	const struct range pluggable_range = mhp_get_pluggable_range(true);
 	uint64_t new_plugged_size, usable_region_size, end_addr;
 
 	/* the plugged_size is just a reflection of what _we_ did previously */
@@ -2234,15 +2234,25 @@ static void virtio_mem_refresh_config(struct virtio_mem *vm)
 	/* calculate the last usable memory block id */
 	virtio_cread_le(vm->vdev, struct virtio_mem_config,
 			usable_region_size, &usable_region_size);
-	end_addr = vm->addr + usable_region_size;
-	end_addr = min(end_addr, phys_limit);
+	end_addr = min(vm->addr + usable_region_size - 1,
+		       pluggable_range.end);
 
-	if (vm->in_sbm)
-		vm->sbm.last_usable_mb_id =
-					 virtio_mem_phys_to_mb_id(end_addr) - 1;
-	else
-		vm->bbm.last_usable_bb_id =
-				     virtio_mem_phys_to_bb_id(vm, end_addr) - 1;
+	if (vm->in_sbm) {
+		vm->sbm.last_usable_mb_id = virtio_mem_phys_to_mb_id(end_addr);
+		if (!IS_ALIGNED(end_addr + 1, memory_block_size_bytes()))
+			vm->sbm.last_usable_mb_id--;
+	} else {
+		vm->bbm.last_usable_bb_id = virtio_mem_phys_to_bb_id(vm,
+								     end_addr);
+		if (!IS_ALIGNED(end_addr + 1, vm->bbm.bb_size))
+			vm->bbm.last_usable_bb_id--;
+	}
+	/*
+	 * If we cannot plug any of our device memory (e.g., nothing in the
+	 * usable region is addressable), the last usable memory block id will
+	 * be smaller than the first usable memory block id. We'll stop
+	 * attempting to add memory with -ENOSPC from our main loop.
+	 */
 
 	/* see if there is a request to change the size */
 	virtio_cread_le(vm->vdev, struct virtio_mem_config, requested_size,
@@ -2364,6 +2374,7 @@ static int virtio_mem_init_vq(struct virtio_mem *vm)
 
 static int virtio_mem_init(struct virtio_mem *vm)
 {
+	const struct range pluggable_range = mhp_get_pluggable_range(true);
 	const uint64_t phys_limit = 1UL << MAX_PHYSMEM_BITS;
 	uint64_t sb_size, addr;
 	uint16_t node_id;
@@ -2405,9 +2416,10 @@ static int virtio_mem_init(struct virtio_mem *vm)
 	if (!IS_ALIGNED(vm->addr + vm->region_size, memory_block_size_bytes()))
 		dev_warn(&vm->vdev->dev,
 			 "The alignment of the physical end address can make some memory unusable.\n");
-	if (vm->addr + vm->region_size > phys_limit)
+	if (vm->addr < pluggable_range.start ||
+	    vm->addr + vm->region_size - 1 > pluggable_range.end)
 		dev_warn(&vm->vdev->dev,
-			 "Some memory is not addressable. This can make some memory unusable.\n");
+			 "Some device memory is not addressable/pluggable. This can make some memory unusable.\n");
 
 	/*
 	 * We want subblocks to span at least MAX_ORDER_NR_PAGES and
@@ -2429,7 +2441,8 @@ static int virtio_mem_init(struct virtio_mem *vm)
 				     vm->sbm.sb_size;
 
 		/* Round up to the next full memory block */
-		addr = vm->addr + memory_block_size_bytes() - 1;
+		addr = max_t(uint64_t, vm->addr, pluggable_range.start) +
+		       memory_block_size_bytes() - 1;
 		vm->sbm.first_mb_id = virtio_mem_phys_to_mb_id(addr);
 		vm->sbm.next_mb_id = vm->sbm.first_mb_id;
 	} else {
@@ -2450,7 +2463,8 @@ static int virtio_mem_init(struct virtio_mem *vm)
 		}
 
 		/* Round up to the next aligned big block */
-		addr = vm->addr + vm->bbm.bb_size - 1;
+		addr = max_t(uint64_t, vm->addr, pluggable_range.start) +
+		       vm->bbm.bb_size - 1;
 		vm->bbm.first_bb_id = virtio_mem_phys_to_bb_id(vm, addr);
 		vm->bbm.next_bb_id = vm->bbm.first_bb_id;
 	}
-- 
2.20.1



* Re: [PATCH V4 1/4] mm/memory_hotplug: Prevalidate the address range being added with platform
  2021-01-25  2:58 ` [PATCH V4 1/4] mm/memory_hotplug: Prevalidate the address range being added " Anshuman Khandual
@ 2021-01-25  9:21   ` David Hildenbrand
  0 siblings, 0 replies; 11+ messages in thread
From: David Hildenbrand @ 2021-01-25  9:21 UTC (permalink / raw)
  To: Anshuman Khandual, linux-mm, akpm, hca, catalin.marinas
  Cc: Oscar Salvador, Vasily Gorbik, Will Deacon, Ard Biesheuvel,
	Mark Rutland, linux-arm-kernel, linux-s390, linux-kernel

On 25.01.21 03:58, Anshuman Khandual wrote:
> This introduces mhp_range_allowed(), which can be called in various memory
> hotplug paths to prevalidate, with the platform, the address range that is
> being added. mhp_range_allowed() calls mhp_get_pluggable_range(), which
> provides the applicable address range depending on whether a linear mapping
> is required or not. For ranges that require a linear mapping, it calls a new
> arch callback, arch_get_mappable_range(), which the platform can override.
> The new callback thus gives the platform an opportunity to configure
> acceptable memory hotplug address ranges in case there are constraints.
> 
> This mechanism will help prevent platform-specific errors deep down in the
> hotplug call chain. It drops the now redundant
> check_hotplug_memory_addressable() check in __add_pages() and instead adds
> a VM_BUG_ON() check to ensure that the range has been validated with
> mhp_range_allowed() earlier in the call chain. Besides,
> mhp_get_pluggable_range() can also be used by potential memory hotplug
> callers to query the physical address range that would be allowed on a
> given platform.
> 
> This does not really add any new range check in generic memory hotplug;
> instead it compensates for the checks lost in arch_add_memory(), where
> applicable, and in check_hotplug_memory_addressable(), with a unified
> mhp_range_allowed().
> 
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: linux-mm@kvack.org
> Cc: linux-kernel@vger.kernel.org
> Reviewed-by: Oscar Salvador <osalvador@suse.de>
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>

[...]

> +
> +struct range mhp_get_pluggable_range(bool need_mapping)
> +{
> +	const u64 max_phys = (1ULL << MAX_PHYSMEM_BITS) - 1;
> +	struct range mhp_range;
> +
> +	if (need_mapping) {
> +		mhp_range = arch_get_mappable_range();
> +		if (mhp_range.start > max_phys) {
> +			mhp_range.start = 0;
> +			mhp_range.end = 0;
> +		}
> +		mhp_range.end = min_t(u64, mhp_range.end, max_phys);
> +	} else {
> +		mhp_range.start = 0;
> +		mhp_range.end = max_phys;
> +	}
> +	return mhp_range;
> +}
> +EXPORT_SYMBOL_GPL(mhp_get_pluggable_range);


Some people might prefer the EXPORT_SYMBOL_GPL() going to the actual
user (patch #4), I don't care :) Thanks!

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb



* Re: [PATCH V4 0/4] mm/memory_hotplug: Pre-validate the address range with platform
  2021-01-25  2:58 [PATCH V4 0/4] mm/memory_hotplug: Pre-validate the address range with platform Anshuman Khandual
                   ` (3 preceding siblings ...)
  2021-01-25  2:58 ` [PATCH V4 4/4] virtio-mem: check against mhp_get_pluggable_range() which memory we can hotplug Anshuman Khandual
@ 2021-01-25  9:25 ` David Hildenbrand
  2021-01-25  9:52   ` Anshuman Khandual
  4 siblings, 1 reply; 11+ messages in thread
From: David Hildenbrand @ 2021-01-25  9:25 UTC (permalink / raw)
  To: Anshuman Khandual, linux-mm, akpm, hca, catalin.marinas
  Cc: Oscar Salvador, Vasily Gorbik, Will Deacon, Ard Biesheuvel,
	Mark Rutland, linux-arm-kernel, linux-s390, linux-kernel

On 25.01.21 03:58, Anshuman Khandual wrote:
> This series adds a mechanism that allows platforms to weigh in and
> prevalidate an incoming address range before proceeding further with
> memory hotplug. This helps prevent platform-specific errors for the given
> address range deeper down the hotplug call chain, which would otherwise
> inevitably fail the hotplug itself.
> 
> This mechanism was suggested by David Hildenbrand during an earlier
> discussion about a memory hotplug fix on the arm64 platform.
> 
> https://lore.kernel.org/linux-arm-kernel/1600332402-30123-1-git-send-email-anshuman.khandual@arm.com/
> 
> This mechanism focuses on the addressability aspect and not the
> [sub]section alignment aspect. Hence check_hotplug_memory_range() and
> check_pfn_span() have been left unchanged. It remains an open question
> whether all of these can be unified into an expanded memhp_range_allowed()
> check that could be called from multiple memory hot add and remove paths.
> 
> This series applies on v5.11-rc5 and has been tested on arm64, but only
> build tested on s390.
> 

Note that this fails to apply right now to both -next and Linus' tree.
Do you have a branch with the patches on top that I can use for a quick test?
Thanks

-- 
Thanks,

David / dhildenb



* Re: [PATCH V4 0/4] mm/memory_hotplug: Pre-validate the address range with platform
  2021-01-25  9:25 ` [PATCH V4 0/4] mm/memory_hotplug: Pre-validate the address range with platform David Hildenbrand
@ 2021-01-25  9:52   ` Anshuman Khandual
  2021-01-25  9:53     ` David Hildenbrand
  0 siblings, 1 reply; 11+ messages in thread
From: Anshuman Khandual @ 2021-01-25  9:52 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm, akpm, hca, catalin.marinas
  Cc: Oscar Salvador, Vasily Gorbik, Will Deacon, Ard Biesheuvel,
	Mark Rutland, linux-arm-kernel, linux-s390, linux-kernel



On 1/25/21 2:55 PM, David Hildenbrand wrote:
> On 25.01.21 03:58, Anshuman Khandual wrote:
>> This series adds a mechanism allowing platforms to weigh in and prevalidate
>> incoming address range before proceeding further with the memory hotplug.
>> This helps prevent potential platform errors for the given address range,
>> down the hotplug call chain, which inevitably fails the hotplug itself.
>>
>> This mechanism was suggested by David Hildenbrand during another discussion
>> with respect to a memory hotplug fix on arm64 platform.
>>
>> https://lore.kernel.org/linux-arm-kernel/1600332402-30123-1-git-send-email-anshuman.khandual@arm.com/
>>
>> This mechanism focuses on the addressability aspect and not the [sub]section
>> alignment aspect. Hence check_hotplug_memory_range() and check_pfn_span()
>> have been left unchanged. Wondering if all these can still be unified in
>> an expanded memhp_range_allowed() check, that can be called from multiple
>> memory hot add and remove paths.
>>
>> This series applies on v5.11-rc5 and has been tested on arm64. But only
>> build tested on s390.
>>
> 
> Note that this fails to apply right now to both, -next and Linus' tree.
> Do you have a branch with the patches on top I can use for a quick test?
> Thanks
> 

Applied all four patches on v5.11-rc5.

https://gitlab.arm.com/linux-arm/linux-anshuman/-/tree/mm/hotplug_callback/v4/

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH V4 0/4] mm/memory_hotplug: Pre-validate the address range with platform
  2021-01-25  9:52   ` Anshuman Khandual
@ 2021-01-25  9:53     ` David Hildenbrand
  0 siblings, 0 replies; 11+ messages in thread
From: David Hildenbrand @ 2021-01-25  9:53 UTC (permalink / raw)
  To: Anshuman Khandual, linux-mm, akpm, hca, catalin.marinas
  Cc: Oscar Salvador, Vasily Gorbik, Will Deacon, Ard Biesheuvel,
	Mark Rutland, linux-arm-kernel, linux-s390, linux-kernel

On 25.01.21 10:52, Anshuman Khandual wrote:
> 
> 
> On 1/25/21 2:55 PM, David Hildenbrand wrote:
>> On 25.01.21 03:58, Anshuman Khandual wrote:
>>> This series adds a mechanism allowing platforms to weigh in and prevalidate
>>> incoming address range before proceeding further with the memory hotplug.
>>> This helps prevent potential platform errors for the given address range,
>>> down the hotplug call chain, which inevitably fails the hotplug itself.
>>>
>>> This mechanism was suggested by David Hildenbrand during another discussion
>>> with respect to a memory hotplug fix on arm64 platform.
>>>
>>> https://lore.kernel.org/linux-arm-kernel/1600332402-30123-1-git-send-email-anshuman.khandual@arm.com/
>>>
>>> This mechanism focuses on the addressability aspect and not the [sub]section
>>> alignment aspect. Hence check_hotplug_memory_range() and check_pfn_span()
>>> have been left unchanged. Wondering if all these can still be unified in
>>> an expanded memhp_range_allowed() check, that can be called from multiple
>>> memory hot add and remove paths.
>>>
>>> This series applies on v5.11-rc5 and has been tested on arm64. But only
>>> build tested on s390.
>>>
>>
>> Note that this fails to apply right now to both, -next and Linus' tree.
>> Do you have a branch with the patches on top I can use for a quick test?
>> Thanks
>>
> 
> Applied all four patches on v5.11-rc5.
> 
> https://gitlab.arm.com/linux-arm/linux-anshuman/-/tree/mm/hotplug_callback/v4/
> 

Ah, my fault, they do apply directly on v5.11-rc5 (not sure what I
messed up jumping between branches - thanks!). Will give it a test.

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH V4 4/4] virtio-mem: check against mhp_get_pluggable_range() which memory we can hotplug
  2021-01-25  2:58 ` [PATCH V4 4/4] virtio-mem: check against mhp_get_pluggable_range() which memory we can hotplug Anshuman Khandual
@ 2021-01-25 12:01   ` David Hildenbrand
  2021-01-27  3:42     ` Anshuman Khandual
  0 siblings, 1 reply; 11+ messages in thread
From: David Hildenbrand @ 2021-01-25 12:01 UTC (permalink / raw)
  To: Anshuman Khandual, linux-mm, akpm, hca, catalin.marinas
  Cc: Oscar Salvador, Vasily Gorbik, Will Deacon, Ard Biesheuvel,
	Mark Rutland, linux-arm-kernel, linux-s390, linux-kernel,
	Michael S. Tsirkin, Jason Wang, Pankaj Gupta, Wei Yang, teawater,
	Pankaj Gupta, Jonathan Cameron, Michal Hocko

On 25.01.21 03:58, Anshuman Khandual wrote:
> From: David Hildenbrand <david@redhat.com>
> 
> Right now, we only check against MAX_PHYSMEM_BITS - but turns out there
> are more restrictions of which memory we can actually hotplug, especially
> on arm64 or s390x once we support them: we might receive something like
> -E2BIG or -ERANGE from add_memory_driver_managed(), stopping device
> operation.
> 
> So, check right when initializing the device which memory we can add,
> warning the user. Try only adding actually pluggable ranges: in the worst
> case, no memory provided by our device is pluggable.
> 
> In the usual case, we expect all device memory to be pluggable, and in
> corner cases only some memory at the end of the device-managed memory
> region to not be pluggable.
> 
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Jason Wang <jasowang@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: catalin.marinas@arm.com
> Cc: teawater <teawaterz@linux.alibaba.com>
> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
> Cc: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Cc: hca@linux.ibm.com
> Cc: Vasily Gorbik <gor@linux.ibm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Ard Biesheuvel <ardb@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Heiko Carstens <hca@linux.ibm.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>  drivers/virtio/virtio_mem.c | 40 +++++++++++++++++++++++++------------
>  1 file changed, 27 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
> index 9fc9ec4a25f5..14c17c5c1695 100644
> --- a/drivers/virtio/virtio_mem.c
> +++ b/drivers/virtio/virtio_mem.c
> @@ -2222,7 +2222,7 @@ static int virtio_mem_unplug_pending_mb(struct virtio_mem *vm)
>   */
>  static void virtio_mem_refresh_config(struct virtio_mem *vm)
>  {
> -	const uint64_t phys_limit = 1UL << MAX_PHYSMEM_BITS;
> +	const struct range pluggable_range = mhp_get_pluggable_range(true);
>  	uint64_t new_plugged_size, usable_region_size, end_addr;
>  
>  	/* the plugged_size is just a reflection of what _we_ did previously */
> @@ -2234,15 +2234,25 @@ static void virtio_mem_refresh_config(struct virtio_mem *vm)
>  	/* calculate the last usable memory block id */
>  	virtio_cread_le(vm->vdev, struct virtio_mem_config,
>  			usable_region_size, &usable_region_size);
> -	end_addr = vm->addr + usable_region_size;
> -	end_addr = min(end_addr, phys_limit);
> +	end_addr = min(vm->addr + usable_region_size - 1,
> +		       pluggable_range.end);
>  
> -	if (vm->in_sbm)
> -		vm->sbm.last_usable_mb_id =
> -					 virtio_mem_phys_to_mb_id(end_addr) - 1;
> -	else
> -		vm->bbm.last_usable_bb_id =
> -				     virtio_mem_phys_to_bb_id(vm, end_addr) - 1;
> +	if (vm->in_sbm) {
> +		vm->sbm.last_usable_mb_id = virtio_mem_phys_to_mb_id(end_addr);
> +		if (!IS_ALIGNED(end_addr + 1, memory_block_size_bytes()))
> +			vm->sbm.last_usable_mb_id--;
> +	} else {
> +		vm->bbm.last_usable_bb_id = virtio_mem_phys_to_bb_id(vm,
> +								     end_addr);
> +		if (!IS_ALIGNED(end_addr + 1, vm->bbm.bb_size))
> +			vm->bbm.last_usable_bb_id--;
> +	}
> +	/*
> +	 * If we cannot plug any of our device memory (e.g., nothing in the
> +	 * usable region is addressable), the last usable memory block id will
> +	 * be smaller than the first usable memory block id. We'll stop
> +	 * attempting to add memory with -ENOSPC from our main loop.
> +	 */
>  
>  	/* see if there is a request to change the size */
>  	virtio_cread_le(vm->vdev, struct virtio_mem_config, requested_size,
> @@ -2364,6 +2374,7 @@ static int virtio_mem_init_vq(struct virtio_mem *vm)
>  
>  static int virtio_mem_init(struct virtio_mem *vm)
>  {
> +	const struct range pluggable_range = mhp_get_pluggable_range(true);
>  	const uint64_t phys_limit = 1UL << MAX_PHYSMEM_BITS;

Sorry, forgot to drop ^ (phys_limit), otherwise there is a friendly
warning from the compiler. We have to drop that line.



Apart from that, at least on x86-64 it does what it's supposed to do. I
temporarily changed MAX_PHYSMEM_BITS to 35 bits and added a virtio-mem
device that crosses the 32 GiB address limit.


[    0.572084] virtio_mem virtio1: Some device memory is not
addressable/pluggable. This can make some memory unusable.
[    0.573013] virtio_mem virtio1: start address: 0x740000000
[    0.573497] virtio_mem virtio1: region size: 0x500000000


And virtio-mem won't add any memory exceeding that:

(qemu) qom-set vmem0 requested-size 20G
(qemu) info memory-devices
Memory device [virtio-mem]: "vmem0"
[...]
Memory device [virtio-mem]: "vmem1"
  memaddr: 0x740000000
  node: 1
  requested-size: 21474836480
  size: 3221225472
  max-size: 21474836480
  block-size: 2097152
  memdev: /objects/mem1

It adds all memory up to the 32 GiB address limit (35 bits) and stops.

LGTM (arm64 to be tested in the future once supported).

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH V4 4/4] virtio-mem: check against mhp_get_pluggable_range() which memory we can hotplug
  2021-01-25 12:01   ` David Hildenbrand
@ 2021-01-27  3:42     ` Anshuman Khandual
  0 siblings, 0 replies; 11+ messages in thread
From: Anshuman Khandual @ 2021-01-27  3:42 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm, akpm, hca, catalin.marinas
  Cc: Oscar Salvador, Vasily Gorbik, Will Deacon, Ard Biesheuvel,
	Mark Rutland, linux-arm-kernel, linux-s390, linux-kernel,
	Michael S. Tsirkin, Jason Wang, Pankaj Gupta, Wei Yang, teawater,
	Pankaj Gupta, Jonathan Cameron, Michal Hocko



On 1/25/21 5:31 PM, David Hildenbrand wrote:
> On 25.01.21 03:58, Anshuman Khandual wrote:
>> From: David Hildenbrand <david@redhat.com>
>>
>> Right now, we only check against MAX_PHYSMEM_BITS - but turns out there
>> are more restrictions of which memory we can actually hotplug, especially
>> on arm64 or s390x once we support them: we might receive something like
>> -E2BIG or -ERANGE from add_memory_driver_managed(), stopping device
>> operation.
>>
>> So, check right when initializing the device which memory we can add,
>> warning the user. Try only adding actually pluggable ranges: in the worst
>> case, no memory provided by our device is pluggable.
>>
>> In the usual case, we expect all device memory to be pluggable, and in
>> corner cases only some memory at the end of the device-managed memory
>> region to not be pluggable.
>>
>> Cc: "Michael S. Tsirkin" <mst@redhat.com>
>> Cc: Jason Wang <jasowang@redhat.com>
>> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
>> Cc: Oscar Salvador <osalvador@suse.de>
>> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: catalin.marinas@arm.com
>> Cc: teawater <teawaterz@linux.alibaba.com>
>> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
>> Cc: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
>> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> Cc: hca@linux.ibm.com
>> Cc: Vasily Gorbik <gor@linux.ibm.com>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Ard Biesheuvel <ardb@kernel.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: Heiko Carstens <hca@linux.ibm.com>
>> Cc: Michal Hocko <mhocko@kernel.org>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>>  drivers/virtio/virtio_mem.c | 40 +++++++++++++++++++++++++------------
>>  1 file changed, 27 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
>> index 9fc9ec4a25f5..14c17c5c1695 100644
>> --- a/drivers/virtio/virtio_mem.c
>> +++ b/drivers/virtio/virtio_mem.c
>> @@ -2222,7 +2222,7 @@ static int virtio_mem_unplug_pending_mb(struct virtio_mem *vm)
>>   */
>>  static void virtio_mem_refresh_config(struct virtio_mem *vm)
>>  {
>> -	const uint64_t phys_limit = 1UL << MAX_PHYSMEM_BITS;
>> +	const struct range pluggable_range = mhp_get_pluggable_range(true);
>>  	uint64_t new_plugged_size, usable_region_size, end_addr;
>>  
>>  	/* the plugged_size is just a reflection of what _we_ did previously */
>> @@ -2234,15 +2234,25 @@ static void virtio_mem_refresh_config(struct virtio_mem *vm)
>>  	/* calculate the last usable memory block id */
>>  	virtio_cread_le(vm->vdev, struct virtio_mem_config,
>>  			usable_region_size, &usable_region_size);
>> -	end_addr = vm->addr + usable_region_size;
>> -	end_addr = min(end_addr, phys_limit);
>> +	end_addr = min(vm->addr + usable_region_size - 1,
>> +		       pluggable_range.end);
>>  
>> -	if (vm->in_sbm)
>> -		vm->sbm.last_usable_mb_id =
>> -					 virtio_mem_phys_to_mb_id(end_addr) - 1;
>> -	else
>> -		vm->bbm.last_usable_bb_id =
>> -				     virtio_mem_phys_to_bb_id(vm, end_addr) - 1;
>> +	if (vm->in_sbm) {
>> +		vm->sbm.last_usable_mb_id = virtio_mem_phys_to_mb_id(end_addr);
>> +		if (!IS_ALIGNED(end_addr + 1, memory_block_size_bytes()))
>> +			vm->sbm.last_usable_mb_id--;
>> +	} else {
>> +		vm->bbm.last_usable_bb_id = virtio_mem_phys_to_bb_id(vm,
>> +								     end_addr);
>> +		if (!IS_ALIGNED(end_addr + 1, vm->bbm.bb_size))
>> +			vm->bbm.last_usable_bb_id--;
>> +	}
>> +	/*
>> +	 * If we cannot plug any of our device memory (e.g., nothing in the
>> +	 * usable region is addressable), the last usable memory block id will
>> +	 * be smaller than the first usable memory block id. We'll stop
>> +	 * attempting to add memory with -ENOSPC from our main loop.
>> +	 */
>>  
>>  	/* see if there is a request to change the size */
>>  	virtio_cread_le(vm->vdev, struct virtio_mem_config, requested_size,
>> @@ -2364,6 +2374,7 @@ static int virtio_mem_init_vq(struct virtio_mem *vm)
>>  
>>  static int virtio_mem_init(struct virtio_mem *vm)
>>  {
>> +	const struct range pluggable_range = mhp_get_pluggable_range(true);
>>  	const uint64_t phys_limit = 1UL << MAX_PHYSMEM_BITS;
> 
> Sorry, forgot to drop ^ (phys_limit), otherwise there is a friendly
> warning from the compiler. We have to drop that line.

Okay sure, will drop.

> 
> 
> 
> Apart from that, at least on x86-64 it does what it's supposed to do. I
> temporarily changed MAX_PHYSMEM_BITS to 35 bits and added a virtio-mem
> device that crosses the 32 GiB address limit.
> 
> 
> [    0.572084] virtio_mem virtio1: Some device memory is not
> addressable/pluggable. This can make some memory unusable.
> [    0.573013] virtio_mem virtio1: start address: 0x740000000
> [    0.573497] virtio_mem virtio1: region size: 0x500000000
> 
> 
> And virtio-mem won't add any memory exceeding that:
> 
> (qemu) qom-set vmem0 requested-size 20G
> (qemu) info memory-devices
> Memory device [virtio-mem]: "vmem0"
> [...]
> Memory device [virtio-mem]: "vmem1"
>   memaddr: 0x740000000
>   node: 1
>   requested-size: 21474836480
>   size: 3221225472
>   max-size: 21474836480
>   block-size: 2097152
>   memdev: /objects/mem1
> 
> It adds all memory up to the 32 GiB address limit (35 bits) and stops.
> 
> LGTM (arm64 to be tested in the future once supported).
> 

I will respin the series with the above minor change unless something
else comes up in the meantime. But once this virtio-mem change gets
some more reviews, I guess the series should be complete in itself.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-01-27  6:03 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-25  2:58 [PATCH V4 0/4] mm/memory_hotplug: Pre-validate the address range with platform Anshuman Khandual
2021-01-25  2:58 ` [PATCH V4 1/4] mm/memory_hotplug: Prevalidate the address range being added " Anshuman Khandual
2021-01-25  9:21   ` David Hildenbrand
2021-01-25  2:58 ` [PATCH V4 2/4] arm64/mm: Define arch_get_mappable_range() Anshuman Khandual
2021-01-25  2:58 ` [PATCH V4 3/4] s390/mm: " Anshuman Khandual
2021-01-25  2:58 ` [PATCH V4 4/4] virtio-mem: check against mhp_get_pluggable_range() which memory we can hotplug Anshuman Khandual
2021-01-25 12:01   ` David Hildenbrand
2021-01-27  3:42     ` Anshuman Khandual
2021-01-25  9:25 ` [PATCH V4 0/4] mm/memory_hotplug: Pre-validate the address range with platform David Hildenbrand
2021-01-25  9:52   ` Anshuman Khandual
2021-01-25  9:53     ` David Hildenbrand
