* [PATCH 0/3] xen/balloon: fixes for memory hotplug
From: Roger Pau Monne @ 2020-07-23  8:45 UTC
  To: linux-kernel; +Cc: Roger Pau Monne

Hello,

The following series contains some fixes in order to make Xen balloon
memory hotplug function properly. The first two patches are bugfixes that
IMO should be backported to stable branches; the last patch might be more
controversial (to backport) since it includes a small change to the
generic memory hotplug interface.

Thanks, Roger.

Roger Pau Monne (3):
  xen/balloon: fix accounting in alloc_xenballooned_pages error path
  xen/balloon: make the balloon wait interruptible
  memory: introduce an option to force onlining of hotplug memory

 drivers/xen/balloon.c          | 14 +++++++++++---
 include/linux/memory_hotplug.h |  3 ++-
 mm/memory_hotplug.c            | 16 ++++++++++------
 3 files changed, 23 insertions(+), 10 deletions(-)

-- 
2.27.0


* [PATCH 1/3] xen/balloon: fix accounting in alloc_xenballooned_pages error path
From: Roger Pau Monne @ 2020-07-23  8:45 UTC
  To: linux-kernel
  Cc: Roger Pau Monne, stable, Boris Ostrovsky, Juergen Gross,
	Stefano Stabellini, xen-devel

target_unpopulated is incremented by nr_pages at the start of the
function, but the call to free_xenballooned_pages will only subtract
pgno pages, so the rest must be subtracted before returning or else
the accounting will be skewed.
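
As a toy illustration of the skew (a standalone userspace simulation;
the names merely mirror the driver's, nothing below is actual kernel
code):

    #include <stdio.h>

    static int target_unpopulated;

    /* mirrors free_xenballooned_pages(): only accounts what it frees */
    static void free_ballooned(int freed)
    {
            target_unpopulated -= freed;
    }

    int main(void)
    {
            int nr_pages = 8, pgno = 5;     /* got 5 of 8 pages, then failed */

            target_unpopulated += nr_pages; /* bumped up front */
            free_ballooned(pgno);           /* error path frees only pgno */
            printf("skew without the fix: %d\n", target_unpopulated); /* 3 */
            target_unpopulated -= nr_pages - pgno;                    /* the fix */
            printf("after the fix: %d\n", target_unpopulated);        /* 0 */
            return 0;
    }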

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: stable@vger.kernel.org
---
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: xen-devel@lists.xenproject.org
---
 drivers/xen/balloon.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 77c57568e5d7..3cb10ed32557 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -630,6 +630,12 @@ int alloc_xenballooned_pages(int nr_pages, struct page **pages)
  out_undo:
 	mutex_unlock(&balloon_mutex);
 	free_xenballooned_pages(pgno, pages);
+	/*
+	 * NB: free_xenballooned_pages will only subtract pgno pages, but since
+	 * target_unpopulated is incremented with nr_pages at the start we need
+	 * to remove the remaining ones also, or accounting will be screwed.
+	 */
+	balloon_stats.target_unpopulated -= nr_pages - pgno;
 	return ret;
 }
 EXPORT_SYMBOL(alloc_xenballooned_pages);
-- 
2.27.0


* [PATCH 2/3] xen/balloon: make the balloon wait interruptible
From: Roger Pau Monne @ 2020-07-23  8:45 UTC
  To: linux-kernel
  Cc: Roger Pau Monne, stable, Boris Ostrovsky, Juergen Gross,
	Stefano Stabellini, xen-devel

So it can be killed; otherwise processes can hang indefinitely
waiting for balloon pages.
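
For context: wait_event_interruptible() returns 0 once the condition
becomes true and -ERESTARTSYS if a signal arrives first; the patch
below maps the latter onto -ENOMEM for the caller. A minimal sketch of
the pattern, with assumed stand-ins for the driver's wait queue and
page list:

    #include <linux/errno.h>
    #include <linux/list.h>
    #include <linux/wait.h>

    static DECLARE_WAIT_QUEUE_HEAD(balloon_wq);
    static LIST_HEAD(ballooned_pages);

    static int wait_for_ballooned_pages(void)
    {
            int rc;

            /* 0 when pages are available; -ERESTARTSYS on a signal */
            rc = wait_event_interruptible(balloon_wq,
                                          !list_empty(&ballooned_pages));
            return rc ? -ENOMEM : 0;
    }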

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: stable@vger.kernel.org
---
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: xen-devel@lists.xenproject.org
---
 drivers/xen/balloon.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 3cb10ed32557..292413b27575 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -568,11 +568,13 @@ static int add_ballooned_pages(int nr_pages)
 	if (xen_hotplug_unpopulated) {
 		st = reserve_additional_memory();
 		if (st != BP_ECANCELED) {
+			int rc;
+
 			mutex_unlock(&balloon_mutex);
-			wait_event(balloon_wq,
+			rc = wait_event_interruptible(balloon_wq,
 				   !list_empty(&ballooned_pages));
 			mutex_lock(&balloon_mutex);
-			return 0;
+			return rc ? -ENOMEM : 0;
 		}
 	}
 
-- 
2.27.0


* [PATCH 3/3] memory: introduce an option to force onlining of hotplug memory
From: Roger Pau Monne @ 2020-07-23  8:45 UTC
  To: linux-kernel
  Cc: Roger Pau Monne, Boris Ostrovsky, Juergen Gross,
	Stefano Stabellini, Andrew Morton, xen-devel, linux-mm

Add an extra option to add_memory_resource that overrides the memory
hotplug online behavior in order to force onlining of memory from
add_memory_resource unconditionally.

This is required for the Xen balloon driver, which must run the
online page callback in order to correctly process the newly added
memory region. Note this is an unpopulated region that is used by Linux
to either hotplug RAM or to map foreign pages from other domains;
hence memory hotplug when running on Xen can be used even without the
user explicitly requesting it, as part of the normal operation of the
OS when attempting to map memory from a different domain.

Setting a different default value of memhp_default_online_type when
attaching the balloon driver is not a robust solution, as the user (or
distro init scripts) could still change it and thus break the Xen
balloon driver.
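
To make the threat concrete: that default is exposed to user space as
/sys/devices/system/memory/auto_online_blocks, so anything with enough
privileges can flip it at runtime. An illustration only (userspace C;
the sysfs path is real, everything else is just an example):

    #include <stdio.h>

    /* Example only: user space changing the auto-onlining default the
     * balloon driver would otherwise have to rely on. */
    int main(void)
    {
            FILE *f = fopen("/sys/devices/system/memory/auto_online_blocks", "w");

            if (!f)
                    return 1;
            fprintf(f, "offline\n");        /* turn automatic onlining back off */
            return fclose(f) ? 1 : 0;
    }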

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: xen-devel@lists.xenproject.org
Cc: linux-mm@kvack.org
---
 drivers/xen/balloon.c          |  2 +-
 include/linux/memory_hotplug.h |  3 ++-
 mm/memory_hotplug.c            | 16 ++++++++++------
 3 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 292413b27575..fe0e0c76834b 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -346,7 +346,7 @@ static enum bp_state reserve_additional_memory(void)
 	mutex_unlock(&balloon_mutex);
 	/* add_memory_resource() requires the device_hotplug lock */
 	lock_device_hotplug();
-	rc = add_memory_resource(nid, resource);
+	rc = add_memory_resource(nid, resource, true);
 	unlock_device_hotplug();
 	mutex_lock(&balloon_mutex);
 
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 375515803cd8..1793619fe4a6 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -342,7 +342,8 @@ extern void clear_zone_contiguous(struct zone *zone);
 extern void __ref free_area_init_core_hotplug(int nid);
 extern int __add_memory(int nid, u64 start, u64 size);
 extern int add_memory(int nid, u64 start, u64 size);
-extern int add_memory_resource(int nid, struct resource *resource);
+extern int add_memory_resource(int nid, struct resource *resource,
+			       bool force_online);
 extern int add_memory_driver_managed(int nid, u64 start, u64 size,
 				     const char *resource_name);
 extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index da374cd3d45b..2491588d3f86 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1002,7 +1002,10 @@ static int check_hotplug_memory_range(u64 start, u64 size)
 
 static int online_memory_block(struct memory_block *mem, void *arg)
 {
-	mem->online_type = memhp_default_online_type;
+	bool force_online = arg;
+
+	mem->online_type = force_online ? MMOP_ONLINE
+					: memhp_default_online_type;
 	return device_online(&mem->dev);
 }
 
@@ -1012,7 +1015,7 @@ static int online_memory_block(struct memory_block *mem, void *arg)
  *
  * we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG
  */
-int __ref add_memory_resource(int nid, struct resource *res)
+int __ref add_memory_resource(int nid, struct resource *res, bool force_online)
 {
 	struct mhp_params params = { .pgprot = PAGE_KERNEL };
 	u64 start, size;
@@ -1076,8 +1079,9 @@ int __ref add_memory_resource(int nid, struct resource *res)
 	mem_hotplug_done();
 
 	/* online pages if requested */
-	if (memhp_default_online_type != MMOP_OFFLINE)
-		walk_memory_blocks(start, size, NULL, online_memory_block);
+	if (memhp_default_online_type != MMOP_OFFLINE || force_online)
+		walk_memory_blocks(start, size, (void *)force_online,
+				   online_memory_block);
 
 	return ret;
 error:
@@ -1100,7 +1104,7 @@ int __ref __add_memory(int nid, u64 start, u64 size)
 	if (IS_ERR(res))
 		return PTR_ERR(res);
 
-	ret = add_memory_resource(nid, res);
+	ret = add_memory_resource(nid, res, false);
 	if (ret < 0)
 		release_memory_resource(res);
 	return ret;
@@ -1158,7 +1162,7 @@ int add_memory_driver_managed(int nid, u64 start, u64 size,
 		goto out_unlock;
 	}
 
-	rc = add_memory_resource(nid, res);
+	rc = add_memory_resource(nid, res, false);
 	if (rc < 0)
 		release_memory_resource(res);
 
-- 
2.27.0


* Re: [PATCH 3/3] memory: introduce an option to force onlining of hotplug memory
From: David Hildenbrand @ 2020-07-23 11:37 UTC
  To: Roger Pau Monne, linux-kernel
  Cc: Boris Ostrovsky, Juergen Gross, Stefano Stabellini,
	Andrew Morton, xen-devel, linux-mm

On 23.07.20 10:45, Roger Pau Monne wrote:
> Add an extra option to add_memory_resource that overrides the memory
> hotplug online behavior in order to force onlining of memory from
> add_memory_resource unconditionally.
> 
> This is required for the Xen balloon driver, which must run the
> online page callback in order to correctly process the newly added
> memory region. Note this is an unpopulated region that is used by Linux
> to either hotplug RAM or to map foreign pages from other domains;
> hence memory hotplug when running on Xen can be used even without the
> user explicitly requesting it, as part of the normal operation of the
> OS when attempting to map memory from a different domain.
> 
> Setting a different default value of memhp_default_online_type when
> attaching the balloon driver is not a robust solution, as the user (or
> distro init scripts) could still change it and thus break the Xen
> balloon driver.

I think we discussed this a couple of times before (even triggered by my
request), and this is the responsibility of user space to configure. Usually
distros have udev rules to online memory automatically. Especially, user
space should be able to configure *how* to online memory.

It's the admin/distro responsibility to configure this properly. In case
this doesn't happen (or as you say, users change it), bad luck.

E.g., virtio-mem takes care to not add more memory in case it is not
getting onlined. I remember hyper-v has similar code to at least wait a
bit for memory to get onlined.

Nacked-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb


* Re: [PATCH 3/3] memory: introduce an option to force onlining of hotplug memory
From: David Hildenbrand @ 2020-07-23 11:52 UTC
  To: Roger Pau Monne, linux-kernel
  Cc: Boris Ostrovsky, Juergen Gross, Stefano Stabellini,
	Andrew Morton, xen-devel, linux-mm

On 23.07.20 13:37, David Hildenbrand wrote:
> On 23.07.20 10:45, Roger Pau Monne wrote:
>> Add an extra option to add_memory_resource that overrides the memory
>> hotplug online behavior in order to force onlining of memory from
>> add_memory_resource unconditionally.
>>
>> This is required for the Xen balloon driver, which must run the
>> online page callback in order to correctly process the newly added
>> memory region. Note this is an unpopulated region that is used by Linux
>> to either hotplug RAM or to map foreign pages from other domains;
>> hence memory hotplug when running on Xen can be used even without the
>> user explicitly requesting it, as part of the normal operation of the
>> OS when attempting to map memory from a different domain.
>>
>> Setting a different default value of memhp_default_online_type when
>> attaching the balloon driver is not a robust solution, as the user (or
>> distro init scripts) could still change it and thus break the Xen
>> balloon driver.
> 
> I think we discussed this a couple of times before (even triggered by my
> request), and this is the responsibility of user space to configure. Usually
> distros have udev rules to online memory automatically. Especially, user
> space should be able to configure *how* to online memory.
> 
> It's the admin/distro responsibility to configure this properly. In case
> this doesn't happen (or as you say, users change it), bad luck.
> 
> E.g., virtio-mem takes care to not add more memory in case it is not
> getting onlined. I remember hyper-v has similar code to at least wait a
> bit for memory to get onlined.
> 
> Nacked-by: David Hildenbrand <david@redhat.com>
> 

Oh, BTW, I removed that "online" parameter in

commit f29d8e9c0191a2a02500945db505e5c89159c3f4
Author: David Hildenbrand <david@redhat.com>
Date:   Fri Dec 28 00:35:36 2018 -0800

    mm/memory_hotplug: drop "online" parameter from add_memory_resource()
    
    Userspace should always be in charge of how to online memory and if memory
    should be onlined automatically in the kernel.  Let's drop the parameter
    to overwrite this - XEN passes memhp_auto_online, just like add_memory(),
    so we can directly use that instead internally.


Xen was passing "memhp_auto_online" since

commit 703fc13a3f6615e29ce3eb862275d7b58a5d03ba
Author: Vitaly Kuznetsov <vkuznets@redhat.com>
Date:   Tue Mar 15 14:56:52 2016 -0700

    xen_balloon: support memory auto onlining policy
    
    Add support for the newly added kernel memory auto onlining policy to
    Xen balloon driver.


And before that I assume XEN was completely relying on udev rules to handle it. The parameter was introduced in

commit 31bc3858ea3ebcc3157b3f5f0e624c5962f5a7a6
Author: Vitaly Kuznetsov <vkuznets@redhat.com>
Date:   Tue Mar 15 14:56:48 2016 -0700

    memory-hotplug: add automatic onlining policy for the newly added memory
    
    Currently, all newly added memory blocks remain in 'offline' state
    unless someone onlines them, some linux distributions carry special udev
    rules like:
    
      SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online"


-- 
Thanks,

David / dhildenb


* Re: [PATCH 3/3] memory: introduce an option to force onlining of hotplug memory
From: Roger Pau Monné @ 2020-07-23 12:23 UTC
  To: David Hildenbrand
  Cc: linux-kernel, Boris Ostrovsky, Juergen Gross, Stefano Stabellini,
	Andrew Morton, xen-devel, linux-mm

On Thu, Jul 23, 2020 at 01:37:03PM +0200, David Hildenbrand wrote:
> On 23.07.20 10:45, Roger Pau Monne wrote:
> > Add an extra option to add_memory_resource that overrides the memory
> > hotplug online behavior in order to force onlining of memory from
> > add_memory_resource unconditionally.
> > 
> > This is required for the Xen balloon driver, which must run the
> > online page callback in order to correctly process the newly added
> > memory region. Note this is an unpopulated region that is used by Linux
> > to either hotplug RAM or to map foreign pages from other domains;
> > hence memory hotplug when running on Xen can be used even without the
> > user explicitly requesting it, as part of the normal operation of the
> > OS when attempting to map memory from a different domain.
> > 
> > Setting a different default value of memhp_default_online_type when
> > attaching the balloon driver is not a robust solution, as the user (or
> > distro init scripts) could still change it and thus break the Xen
> > balloon driver.
> 
> I think we discussed this a couple of times before (even triggered by my
> request), and this is the responsibility of user space to configure. Usually
> distros have udev rules to online memory automatically. Especially, user
> space should be able to configure *how* to online memory.

Note (as per the commit message) that in the specific case I'm
referring to, the memory hotplugged by the Xen balloon driver will be
an unpopulated range to be used internally by certain Xen subsystems,
like the xen-blkback or the privcmd drivers. The addition of such
blocks of (unpopulated) memory can happen without the user explicitly
requesting it, and hence without the user even being aware such a
hotplug process is taking place. To be clear: no actual RAM will be
added to the system.

Failure to online such blocks using the Xen specific online handler
(which does not hand back the memory to the allocator in any way)
will result in the system getting stuck and malfunctioning.

> It's the admin/distro responsibility to configure this properly. In case
> this doesn't happen (or as you say, users change it), bad luck.
> 
> E.g., virtio-mem takes care to not add more memory in case it is not
> getting onlined. I remember hyper-v has similar code to at least wait a
> bit for memory to get onlined.

I don't think VirtIO or Hyper-V use the hotplug system in the same way
as Xen; as said, this is done to add unpopulated memory regions that
will be used to map foreign memory (from other domains) by Xen drivers
on the system.

Maybe this should somehow use a different mechanism to hotplug such
empty memory blocks? I don't mind doing this differently, but I would
need some pointers. Allowing user-space to change a (seemingly
unrelated) parameter and as a result produce failures on Xen drivers
is not an acceptable solution IMO.

Thanks, Roger.


* Re: [PATCH 3/3] memory: introduce an option to force onlining of hotplug memory
From: Jürgen Groß @ 2020-07-23 12:28 UTC
  To: Roger Pau Monné, David Hildenbrand
  Cc: linux-kernel, Boris Ostrovsky, Stefano Stabellini, Andrew Morton,
	xen-devel, linux-mm

On 23.07.20 14:23, Roger Pau Monné wrote:
> On Thu, Jul 23, 2020 at 01:37:03PM +0200, David Hildenbrand wrote:
>> On 23.07.20 10:45, Roger Pau Monne wrote:
>>> Add an extra option to add_memory_resource that overrides the memory
>>> hotplug online behavior in order to force onlining of memory from
>>> add_memory_resource unconditionally.
>>>
>>> This is required for the Xen balloon driver, which must run the
>>> online page callback in order to correctly process the newly added
>>> memory region. Note this is an unpopulated region that is used by Linux
>>> to either hotplug RAM or to map foreign pages from other domains;
>>> hence memory hotplug when running on Xen can be used even without the
>>> user explicitly requesting it, as part of the normal operation of the
>>> OS when attempting to map memory from a different domain.
>>>
>>> Setting a different default value of memhp_default_online_type when
>>> attaching the balloon driver is not a robust solution, as the user (or
>>> distro init scripts) could still change it and thus break the Xen
>>> balloon driver.
>>
>> I think we discussed this a couple of times before (even triggered by my
>> request), and this is the responsibility of user space to configure. Usually
>> distros have udev rules to online memory automatically. Especially, user
>> space should be able to configure *how* to online memory.
> 
> Note (as per the commit message) that in the specific case I'm
> referring to, the memory hotplugged by the Xen balloon driver will be
> an unpopulated range to be used internally by certain Xen subsystems,
> like the xen-blkback or the privcmd drivers. The addition of such
> blocks of (unpopulated) memory can happen without the user explicitly
> requesting it, and hence without the user even being aware such a
> hotplug process is taking place. To be clear: no actual RAM will be
> added to the system.
> 
> Failure to online such blocks using the Xen specific online handler
> (which does not hand back the memory to the allocator in any way)
> will result in the system getting stuck and malfunctioning.
> 
>> It's the admin/distro responsibility to configure this properly. In case
>> this doesn't happen (or as you say, users change it), bad luck.
>>
>> E.g., virtio-mem takes care to not add more memory in case it is not
>> getting onlined. I remember hyper-v has similar code to at least wait a
>> bit for memory to get onlined.
> 
> I don't think VirtIO or Hyper-V use the hotplug system in the same way
> as Xen; as said, this is done to add unpopulated memory regions that
> will be used to map foreign memory (from other domains) by Xen drivers
> on the system.
> 
> Maybe this should somehow use a different mechanism to hotplug such
> empty memory blocks? I don't mind doing this differently, but I would
> need some pointers. Allowing user-space to change a (seemingly
> unrelated) parameter and as a result produce failures on Xen drivers
> is not an acceptable solution IMO.

Maybe we can use the same approach as Xen PV-domains: pre-allocate a
region in the memory map to be used for mapping foreign pages. For the
kernel it will look like pre-ballooned memory, so it will create struct
page for the region (which is what we are after), but it won't give the
memory to the allocator.


Juergen


* Re: [PATCH 3/3] memory: introduce an option to force onlining of hotplug memory
From: David Hildenbrand @ 2020-07-23 12:31 UTC
  To: Jürgen Groß, Roger Pau Monné
  Cc: linux-kernel, Boris Ostrovsky, Stefano Stabellini, Andrew Morton,
	xen-devel, linux-mm

On 23.07.20 14:28, Jürgen Groß wrote:
> On 23.07.20 14:23, Roger Pau Monné wrote:
>> On Thu, Jul 23, 2020 at 01:37:03PM +0200, David Hildenbrand wrote:
>>> On 23.07.20 10:45, Roger Pau Monne wrote:
>>>> Add an extra option to add_memory_resource that overrides the memory
>>>> hotplug online behavior in order to force onlining of memory from
>>>> add_memory_resource unconditionally.
>>>>
>>>> This is required for the Xen balloon driver, which must run the
>>>> online page callback in order to correctly process the newly added
>>>> memory region. Note this is an unpopulated region that is used by Linux
>>>> to either hotplug RAM or to map foreign pages from other domains;
>>>> hence memory hotplug when running on Xen can be used even without the
>>>> user explicitly requesting it, as part of the normal operation of the
>>>> OS when attempting to map memory from a different domain.
>>>>
>>>> Setting a different default value of memhp_default_online_type when
>>>> attaching the balloon driver is not a robust solution, as the user (or
>>>> distro init scripts) could still change it and thus break the Xen
>>>> balloon driver.
>>>
>>> I think we discussed this a couple of times before (even triggered by my
>>> request), and this is the responsibility of user space to configure. Usually
>>> distros have udev rules to online memory automatically. Especially, user
>>> space should be able to configure *how* to online memory.
>>
>> Note (as per the commit message) that in the specific case I'm
>> referring to, the memory hotplugged by the Xen balloon driver will be
>> an unpopulated range to be used internally by certain Xen subsystems,
>> like the xen-blkback or the privcmd drivers. The addition of such
>> blocks of (unpopulated) memory can happen without the user explicitly
>> requesting it, and hence without the user even being aware such a
>> hotplug process is taking place. To be clear: no actual RAM will be
>> added to the system.
>>
>> Failure to online such blocks using the Xen specific online handler
>> (which does not hand back the memory to the allocator in any way)
>> will result in the system getting stuck and malfunctioning.
>>
>>> It's the admin/distro responsibility to configure this properly. In case
>>> this doesn't happen (or as you say, users change it), bad luck.
>>>
>>> E.g., virtio-mem takes care to not add more memory in case it is not
>>> getting onlined. I remember hyper-v has similar code to at least wait a
>>> bit for memory to get onlined.
>>
>> I don't think VirtIO or Hyper-V use the hotplug system in the same way
>> as Xen; as said, this is done to add unpopulated memory regions that
>> will be used to map foreign memory (from other domains) by Xen drivers
>> on the system.
>>
>> Maybe this should somehow use a different mechanism to hotplug such
>> empty memory blocks? I don't mind doing this differently, but I would
>> need some pointers. Allowing user-space to change a (seemingly
>> unrelated) parameter and as a result produce failures on Xen drivers
>> is not an acceptable solution IMO.
> 
> Maybe we can use the same approach as Xen PV-domains: pre-allocate a
> region in the memory map to be used for mapping foreign pages. For the
> kernel it will look like pre-ballooned memory, so it will create struct
> page for the region (which is what we are after), but it won't give the
> memory to the allocator.

Something like that sounds a lot cleaner to me than abusing the memory
hotplug mechanism (which the Xen balloon also uses to just expose more
memory), because there are other issues in case you "really want memory
to be onlined": what if onlining fails (nacked by a notifier, e.g., kasan)?

-- 
Thanks,

David / dhildenb


* Re: [PATCH 3/3] memory: introduce an option to force onlining of hotplug memory
From: Roger Pau Monné @ 2020-07-23 13:08 UTC
  To: Jürgen Groß
  Cc: David Hildenbrand, linux-kernel, Boris Ostrovsky,
	Stefano Stabellini, Andrew Morton, xen-devel, linux-mm

On Thu, Jul 23, 2020 at 02:28:13PM +0200, Jürgen Groß wrote:
> On 23.07.20 14:23, Roger Pau Monné wrote:
> > On Thu, Jul 23, 2020 at 01:37:03PM +0200, David Hildenbrand wrote:
> > > On 23.07.20 10:45, Roger Pau Monne wrote:
> > > > Add an extra option to add_memory_resource that overrides the memory
> > > > hotplug online behavior in order to force onlining of memory from
> > > > add_memory_resource unconditionally.
> > > > 
> > > > This is required for the Xen balloon driver, which must run the
> > > > online page callback in order to correctly process the newly added
> > > > memory region. Note this is an unpopulated region that is used by Linux
> > > > to either hotplug RAM or to map foreign pages from other domains;
> > > > hence memory hotplug when running on Xen can be used even without the
> > > > user explicitly requesting it, as part of the normal operation of the
> > > > OS when attempting to map memory from a different domain.
> > > > 
> > > > Setting a different default value of memhp_default_online_type when
> > > > attaching the balloon driver is not a robust solution, as the user (or
> > > > distro init scripts) could still change it and thus break the Xen
> > > > balloon driver.
> > > 
> > > I think we discussed this a couple of times before (even triggered by my
> > > request), and this is the responsibility of user space to configure. Usually
> > > distros have udev rules to online memory automatically. Especially, user
> > > space should be able to configure *how* to online memory.
> > 
> > Note (as per the commit message) that in the specific case I'm
> > referring to, the memory hotplugged by the Xen balloon driver will be
> > an unpopulated range to be used internally by certain Xen subsystems,
> > like the xen-blkback or the privcmd drivers. The addition of such
> > blocks of (unpopulated) memory can happen without the user explicitly
> > requesting it, and hence without the user even being aware such a
> > hotplug process is taking place. To be clear: no actual RAM will be
> > added to the system.
> > 
> > Failure to online such blocks using the Xen specific online handler
> > (which does not hand back the memory to the allocator in any way)
> > will result in the system getting stuck and malfunctioning.
> > 
> > > It's the admin/distro responsibility to configure this properly. In case
> > > this doesn't happen (or as you say, users change it), bad luck.
> > > 
> > > E.g., virtio-mem takes care to not add more memory in case it is not
> > > getting onlined. I remember hyper-v has similar code to at least wait a
> > > bit for memory to get onlined.
> > 
> > I don't think VirtIO or Hyper-V use the hotplug system in the same way
> > as Xen; as said, this is done to add unpopulated memory regions that
> > will be used to map foreign memory (from other domains) by Xen drivers
> > on the system.
> > 
> > Maybe this should somehow use a different mechanism to hotplug such
> > empty memory blocks? I don't mind doing this differently, but I would
> > need some pointers. Allowing user-space to change a (seemingly
> > unrelated) parameter and as a result produce failures on Xen drivers
> > is not an acceptable solution IMO.
> 
> Maybe we can use the same approach as Xen PV-domains: pre-allocate a
> region in the memory map to be used for mapping foreign pages. For the
> kernel it will look like pre-ballooned memory, so it will create struct
> page for the region (which is what we are after), but it won't give the
> memory to the allocator.

IMO using something similar to memory hotplug would give us more
flexibility, and TBH the logic is already there in the balloon driver.
It seems quite wasteful to allocate such region(s) beforehand for all
domains, even when most of them won't end up using foreign mappings at
all.

Anyway, I'm going to take a look at how to do that; I guess it's going
to involve playing with the memory map and reserving some space.

I suggest we remove the Xen balloon hotplug logic, as it's not
working properly and we don't have a plan to fix it.

Thanks, Roger.


* Re: [PATCH 3/3] memory: introduce an option to force onlining of hotplug memory
  2020-07-23 13:08           ` Roger Pau Monné
@ 2020-07-23 13:14             ` David Hildenbrand
  0 siblings, 0 replies; 47+ messages in thread
From: David Hildenbrand @ 2020-07-23 13:14 UTC (permalink / raw)
  To: Roger Pau Monné, Jürgen Groß
  Cc: linux-kernel, Boris Ostrovsky, Stefano Stabellini, Andrew Morton,
	xen-devel, linux-mm

On 23.07.20 15:08, Roger Pau Monné wrote:
> On Thu, Jul 23, 2020 at 02:28:13PM +0200, Jürgen Groß wrote:
>> On 23.07.20 14:23, Roger Pau Monné wrote:
>>> On Thu, Jul 23, 2020 at 01:37:03PM +0200, David Hildenbrand wrote:
>>>> On 23.07.20 10:45, Roger Pau Monne wrote:
>>>>> Add an extra option to add_memory_resource that overrides the memory
>>>>> hotplug online behavior in order to force onlining of memory from
>>>>> add_memory_resource unconditionally.
>>>>>
>>>>> This is required for the Xen balloon driver, that must run the
>>>>> online page callback in order to correctly process the newly added
>>>>> memory region, note this is an unpopulated region that is used by Linux
>>>>> to either hotplug RAM or to map foreign pages from other domains, and
>>>>> hence memory hotplug when running on Xen can be used even without the
>>>>> user explicitly requesting it, as part of the normal operations of the
>>>>> OS when attempting to map memory from a different domain.
>>>>>
>>>>> Setting a different default value of memhp_default_online_type when
>>>>> attaching the balloon driver is not a robust solution, as the user (or
>>>>> distro init scripts) could still change it and thus break the Xen
>>>>> balloon driver.
>>>>
>>>> I think we discussed this a couple of times before (even triggered by my
>>>> request), and this is the responsibility of user space to configure. Usually
>>>> distros have udev rules to online memory automatically. Especially, user
>>>> space should be able to configure *how* to online memory.
>>>
>>> Note (as per the commit message) that in the specific case I'm
>>> referring to the memory hotplugged by the Xen balloon driver will be
>>> an unpopulated range to be used internally by certain Xen subsystems,
>>> like the xen-blkback or the privcmd drivers. The addition of such
>>> blocks of (unpopulated) memory can happen without the user explicitly
>>> requesting it, and hence not even aware such hotplug process is taking
>>> place. To be clear: no actual RAM will be added to the system.
>>>
>>> Failure to online such blocks using the Xen specific online handler
>>> (which does not handle back the memory to the allocator in any way)
>>> will result in the system getting stuck and malfunctioning.
>>>
>>>> It's the admin/distro responsibility to configure this properly. In case
>>>> this doesn't happen (or as you say, users change it), bad luck.
>>>>
>>>> E.g., virtio-mem takes care to not add more memory in case it is not
>>>> getting onlined. I remember hyper-v has similar code to at least wait a
>>>> bit for memory to get onlined.
>>>
>>> I don't think VirtIO or Hyper-V use the hotplug system in the same way
>>> as Xen, as said this is done to add unpopulated memory regions that
>>> will be used to map foreign memory (from other domains) by Xen drivers
>>> on the system.
>>>
>>> Maybe this should somehow use a different mechanism to hotplug such
>>> empty memory blocks? I don't mind doing this differently, but I would
>>> need some pointers. Allowing user-space to change a (seemingly
>>> unrelated) parameter and as a result produce failures on Xen drivers
>>> is not an acceptable solution IMO.
>>
>> Maybe we can use the same approach as Xen PV-domains: pre-allocate a
>> region in the memory map to be used for mapping foreign pages. For the
>> kernel it will look like pre-ballooned memory, so it will create struct
>> page for the region (which is what we are after), but it won't give the
>> memory to the allocator.
> 
> IMO using something similar to memory hotplug would give us more
> flexibility, and TBH the logic is already there in the balloon driver.
> It seems quite wasteful to allocate such region(s) beforehand for all
> domains, even when most of them won't end up using foreign mappings at
> all.

I do wonder why the issues you describe are only starting to pop up
now, literally years after this stuff was implemented - or am I missing
something important?

> 
> Anyway, I'm going to take a look at how to do that, I guess it's going
> to involve playing with the memory map and reserving some space.
> 
> I suggest we should remove the Xen balloon hotplug logic, as it's not
> working properly and we don't have a plan to fix it.

Which exact hotplug logic are you referring to?

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 3/3] memory: introduce an option to force onlining of hotplug memory
  2020-07-23 13:08           ` Roger Pau Monné
@ 2020-07-23 13:20             ` Jürgen Groß
  0 siblings, 0 replies; 47+ messages in thread
From: Jürgen Groß @ 2020-07-23 13:20 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: David Hildenbrand, linux-kernel, Boris Ostrovsky,
	Stefano Stabellini, Andrew Morton, xen-devel, linux-mm

On 23.07.20 15:08, Roger Pau Monné wrote:
> On Thu, Jul 23, 2020 at 02:28:13PM +0200, Jürgen Groß wrote:
>> On 23.07.20 14:23, Roger Pau Monné wrote:
>>> On Thu, Jul 23, 2020 at 01:37:03PM +0200, David Hildenbrand wrote:
>>>> On 23.07.20 10:45, Roger Pau Monne wrote:
>>>>> Add an extra option to add_memory_resource that overrides the memory
>>>>> hotplug online behavior in order to force onlining of memory from
>>>>> add_memory_resource unconditionally.
>>>>>
>>>>> This is required for the Xen balloon driver, that must run the
>>>>> online page callback in order to correctly process the newly added
>>>>> memory region, note this is an unpopulated region that is used by Linux
>>>>> to either hotplug RAM or to map foreign pages from other domains, and
>>>>> hence memory hotplug when running on Xen can be used even without the
>>>>> user explicitly requesting it, as part of the normal operations of the
>>>>> OS when attempting to map memory from a different domain.
>>>>>
>>>>> Setting a different default value of memhp_default_online_type when
>>>>> attaching the balloon driver is not a robust solution, as the user (or
>>>>> distro init scripts) could still change it and thus break the Xen
>>>>> balloon driver.
>>>>
>>>> I think we discussed this a couple of times before (even triggered by my
>>>> request), and this is the responsibility of user space to configure. Usually
>>>> distros have udev rules to online memory automatically. Especially, user
>>>> space should be able to configure *how* to online memory.
>>>
>>> Note (as per the commit message) that in the specific case I'm
>>> referring to the memory hotplugged by the Xen balloon driver will be
>>> an unpopulated range to be used internally by certain Xen subsystems,
>>> like the xen-blkback or the privcmd drivers. The addition of such
>>> blocks of (unpopulated) memory can happen without the user explicitly
>>> requesting it, and hence not even aware such hotplug process is taking
>>> place. To be clear: no actual RAM will be added to the system.
>>>
>>> Failure to online such blocks using the Xen specific online handler
>>> (which does not handle back the memory to the allocator in any way)
>>> will result in the system getting stuck and malfunctioning.
>>>
>>>> It's the admin/distro responsibility to configure this properly. In case
>>>> this doesn't happen (or as you say, users change it), bad luck.
>>>>
>>>> E.g., virtio-mem takes care to not add more memory in case it is not
>>>> getting onlined. I remember hyper-v has similar code to at least wait a
>>>> bit for memory to get onlined.
>>>
>>> I don't think VirtIO or Hyper-V use the hotplug system in the same way
>>> as Xen, as said this is done to add unpopulated memory regions that
>>> will be used to map foreign memory (from other domains) by Xen drivers
>>> on the system.
>>>
>>> Maybe this should somehow use a different mechanism to hotplug such
>>> empty memory blocks? I don't mind doing this differently, but I would
>>> need some pointers. Allowing user-space to change a (seemingly
>>> unrelated) parameter and as a result produce failures on Xen drivers
>>> is not an acceptable solution IMO.
>>
>> Maybe we can use the same approach as Xen PV-domains: pre-allocate a
>> region in the memory map to be used for mapping foreign pages. For the
>> kernel it will look like pre-ballooned memory, so it will create struct
>> page for the region (which is what we are after), but it won't give the
>> memory to the allocator.
> 
> IMO using something similar to memory hotplug would give us more
> flexibility, and TBH the logic is already there in the balloon driver.
> It seems quite wasteful to allocate such region(s) beforehand for all
> domains, even when most of them won't end up using foreign mappings at
> all.

We can do it for dom0 only by default, and add a boot parameter, e.g.
for driver domains.

And the logic is already there (just PV-only right now).

> 
> Anyway, I'm going to take a look at how to do that, I guess it's going
> to involve playing with the memory map and reserving some space.

Look at arch/x86/xen/setup.c (xen_add_extra_mem() and its usage).
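
For reference, the core of the pattern there is roughly the following
(simplified sketch, not the literal code; the real function walks an
array of regions and merges adjacent ones):

  /*
   * Record an extra (unpopulated) region: it stays in the memory map,
   * so the kernel creates a memmap (struct page) for it, but
   * memblock_reserve() keeps it out of the buddy allocator.
   */
  static void __init sketch_add_extra_mem(unsigned long start_pfn,
                                          unsigned long n_pfns)
  {
          xen_extra_mem[0].start_pfn = start_pfn;
          xen_extra_mem[0].n_pfns = n_pfns;

          memblock_reserve(PFN_PHYS(start_pfn), PFN_PHYS(n_pfns));
  }

The balloon driver can then hand out pages from that range as
pre-ballooned memory, with no hotplug involved.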

> 
> I suggest we should remove the Xen balloon hotplug logic, as it's not
> working properly and we don't have a plan to fix it.

I used memory hotplug successfully not long ago.


Juergen

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 3/3] memory: introduce an option to force onlining of hotplug memory
  2020-07-23 12:23       ` Roger Pau Monné
@ 2020-07-23 13:22         ` David Hildenbrand
  0 siblings, 0 replies; 47+ messages in thread
From: David Hildenbrand @ 2020-07-23 13:22 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: linux-kernel, Boris Ostrovsky, Juergen Gross, Stefano Stabellini,
	Andrew Morton, xen-devel, linux-mm

On 23.07.20 14:23, Roger Pau Monné wrote:
> On Thu, Jul 23, 2020 at 01:37:03PM +0200, David Hildenbrand wrote:
>> On 23.07.20 10:45, Roger Pau Monne wrote:
>>> Add an extra option to add_memory_resource that overrides the memory
>>> hotplug online behavior in order to force onlining of memory from
>>> add_memory_resource unconditionally.
>>>
>>> This is required for the Xen balloon driver, that must run the
>>> online page callback in order to correctly process the newly added
>>> memory region, note this is an unpopulated region that is used by Linux
>>> to either hotplug RAM or to map foreign pages from other domains, and
>>> hence memory hotplug when running on Xen can be used even without the
>>> user explicitly requesting it, as part of the normal operations of the
>>> OS when attempting to map memory from a different domain.
>>>
>>> Setting a different default value of memhp_default_online_type when
>>> attaching the balloon driver is not a robust solution, as the user (or
>>> distro init scripts) could still change it and thus break the Xen
>>> balloon driver.
>>
>> I think we discussed this a couple of times before (even triggered by my
>> request), and this is the responsibility of user space to configure. Usually
>> distros have udev rules to online memory automatically. Especially, user
>> space should be able to configure *how* to online memory.
> 
> Note (as per the commit message) that in the specific case I'm
> referring to the memory hotplugged by the Xen balloon driver will be
> an unpopulated range to be used internally by certain Xen subsystems,
> like the xen-blkback or the privcmd drivers. The addition of such
> blocks of (unpopulated) memory can happen without the user explicitly
> requesting it, and hence not even aware such hotplug process is taking
> place. To be clear: no actual RAM will be added to the system.

Okay, but there is also the case where Xen will actually hotplug memory
using this same handler IIRC (at least I've read papers about it). Both
cases use the same handler, correct?

> 
>> It's the admin/distro responsibility to configure this properly. In case
>> this doesn't happen (or as you say, users change it), bad luck.
>>
>> E.g., virtio-mem takes care to not add more memory in case it is not
>> getting onlined. I remember hyper-v has similar code to at least wait a
>> bit for memory to get onlined.
> 
> I don't think VirtIO or Hyper-V use the hotplug system in the same way
> as Xen, as said this is done to add unpopulated memory regions that
> will be used to map foreign memory (from other domains) by Xen drivers
> on the system.

Indeed, if the memory is never exposed to the buddy (and all you need is
struct pages + a kernel virtual mapping), I wonder if
memremap/ZONE_DEVICE is what you want. Then you won't have user-visible
memory blocks created with unclear online semantics, partially involving
the buddy.
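
Something along these lines, I guess (rough sketch only; dev_pagemap
field names have changed across releases, and MEMORY_DEVICE_GENERIC is
just an assumption about which pgmap type would fit):

  #include <linux/memremap.h>

  /*
   * Get struct pages plus a kernel virtual mapping for a physical
   * range without creating user-visible memory blocks and without the
   * buddy allocator ever seeing the pages.
   */
  static void *sketch_memremap_region(struct dev_pagemap *pgmap,
                                      struct resource *res)
  {
          pgmap->type = MEMORY_DEVICE_GENERIC;
          pgmap->res = *res;

          return memremap_pages(pgmap, NUMA_NO_NODE);
  }

The mapping can be torn down again with memunmap_pages() once the
foreign pages go away.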

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 3/3] memory: introduce an option to force onlining of hotplug memory
  2020-07-23 13:14             ` David Hildenbrand
@ 2020-07-23 13:25               ` Roger Pau Monné
  0 siblings, 0 replies; 47+ messages in thread
From: Roger Pau Monné @ 2020-07-23 13:25 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Jürgen Groß,
	linux-kernel, Boris Ostrovsky, Stefano Stabellini, Andrew Morton,
	xen-devel, linux-mm

On Thu, Jul 23, 2020 at 03:14:31PM +0200, David Hildenbrand wrote:
> On 23.07.20 15:08, Roger Pau Monné wrote:
> > On Thu, Jul 23, 2020 at 02:28:13PM +0200, Jürgen Groß wrote:
> >> On 23.07.20 14:23, Roger Pau Monné wrote:
> >>> On Thu, Jul 23, 2020 at 01:37:03PM +0200, David Hildenbrand wrote:
> >>>> On 23.07.20 10:45, Roger Pau Monne wrote:
> >>>>> Add an extra option to add_memory_resource that overrides the memory
> >>>>> hotplug online behavior in order to force onlining of memory from
> >>>>> add_memory_resource unconditionally.
> >>>>>
> >>>>> This is required for the Xen balloon driver, that must run the
> >>>>> online page callback in order to correctly process the newly added
> >>>>> memory region, note this is an unpopulated region that is used by Linux
> >>>>> to either hotplug RAM or to map foreign pages from other domains, and
> >>>>> hence memory hotplug when running on Xen can be used even without the
> >>>>> user explicitly requesting it, as part of the normal operations of the
> >>>>> OS when attempting to map memory from a different domain.
> >>>>>
> >>>>> Setting a different default value of memhp_default_online_type when
> >>>>> attaching the balloon driver is not a robust solution, as the user (or
> >>>>> distro init scripts) could still change it and thus break the Xen
> >>>>> balloon driver.
> >>>>
> >>>> I think we discussed this a couple of times before (even triggered by my
> >>>> request), and this is the responsibility of user space to configure. Usually
> >>>> distros have udev rules to online memory automatically. Especially, user
> >>>> space should be able to configure *how* to online memory.
> >>>
> >>> Note (as per the commit message) that in the specific case I'm
> >>> referring to the memory hotplugged by the Xen balloon driver will be
> >>> an unpopulated range to be used internally by certain Xen subsystems,
> >>> like the xen-blkback or the privcmd drivers. The addition of such
> >>> blocks of (unpopulated) memory can happen without the user explicitly
> >>> requesting it, and hence not even aware such hotplug process is taking
> >>> place. To be clear: no actual RAM will be added to the system.
> >>>
> >>> Failure to online such blocks using the Xen specific online handler
> >>> (which does not handle back the memory to the allocator in any way)
> >>> will result in the system getting stuck and malfunctioning.
> >>>
> >>>> It's the admin/distro responsibility to configure this properly. In case
> >>>> this doesn't happen (or as you say, users change it), bad luck.
> >>>>
> >>>> E.g., virtio-mem takes care to not add more memory in case it is not
> >>>> getting onlined. I remember hyper-v has similar code to at least wait a
> >>>> bit for memory to get onlined.
> >>>
> >>> I don't think VirtIO or Hyper-V use the hotplug system in the same way
> >>> as Xen, as said this is done to add unpopulated memory regions that
> >>> will be used to map foreign memory (from other domains) by Xen drivers
> >>> on the system.
> >>>
> >>> Maybe this should somehow use a different mechanism to hotplug such
> >>> empty memory blocks? I don't mind doing this differently, but I would
> >>> need some pointers. Allowing user-space to change a (seemingly
> >>> unrelated) parameter and as a result produce failures on Xen drivers
> >>> is not an acceptable solution IMO.
> >>
> >> Maybe we can use the same approach as Xen PV-domains: pre-allocate a
> >> region in the memory map to be used for mapping foreign pages. For the
> >> kernel it will look like pre-ballooned memory, so it will create struct
> >> page for the region (which is what we are after), but it won't give the
> >> memory to the allocator.
> > 
> > IMO using something similar to memory hotplug would give us more
> > flexibility, and TBH the logic is already there in the balloon driver.
> > It seems quite wasteful to allocate such region(s) beforehand for all
> > domains, even when most of them won't end up using foreign mappings at
> > all.
> 
> I do wonder why these issues you describe start to pop up now, literally
> years after this stuff has been implemented - or am I missing something
> important?

We are (very slowly) implementing support to switch to a PVH dom0
(something similar to a fully emulated guest acting as dom0), and that
kind of guest no longer has a pre-allocated memory region for mappings:
we mostly use the native path when dealing with memory, and such a
reservation used to be done by the PV-specific setup path that deals
with the memory map.

> > 
> > Anyway, I'm going to take a look at how to do that, I guess it's going
> > to involve playing with the memory map and reserving some space.
> > 
> > I suggest we should remove the Xen balloon hotplug logic, as it's not
> > working properly and we don't have a plan to fix it.
> 
> Which exact hotplug logic are you referring to?

There are some sections in the Xen balloon driver protected with
CONFIG_XEN_BALLOON_MEMORY_HOTPLUG.

When xen_hotplug_unpopulated is enabled but the default memory hotplug
policy is to not online the added blocks, certain operations like
alloc_xenballooned_pages will block forever, waiting for the hotplugged
memory to be onlined before it can be used.
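
The problematic path looks roughly like this (paraphrased from
add_ballooned_pages() in drivers/xen/balloon.c, so take the details
with a grain of salt):

  if (xen_hotplug_unpopulated) {
          /* Hotplugs an unpopulated section via add_memory(). */
          st = reserve_additional_memory();
          if (st != BP_ECANCELED) {
                  mutex_unlock(&balloon_mutex);
                  /*
                   * Pages are only added to ballooned_pages from the
                   * online_page callback, so if the new section never
                   * gets onlined this wait never finishes.
                   */
                  wait_event(balloon_wq, !list_empty(&ballooned_pages));
                  mutex_lock(&balloon_mutex);
                  return st == BP_DONE ? BP_DONE : BP_EAGAIN;
          }
  }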

Thanks, Roger.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 3/3] memory: introduce an option to force onlining of hotplug memory
  2020-07-23 13:20             ` Jürgen Groß
@ 2020-07-23 13:39               ` Roger Pau Monné
  0 siblings, 0 replies; 47+ messages in thread
From: Roger Pau Monné @ 2020-07-23 13:39 UTC (permalink / raw)
  To: Jürgen Groß
  Cc: David Hildenbrand, linux-kernel, Boris Ostrovsky,
	Stefano Stabellini, Andrew Morton, xen-devel, linux-mm

On Thu, Jul 23, 2020 at 03:20:55PM +0200, Jürgen Groß wrote:
> On 23.07.20 15:08, Roger Pau Monné wrote:
> > On Thu, Jul 23, 2020 at 02:28:13PM +0200, Jürgen Groß wrote:
> > > On 23.07.20 14:23, Roger Pau Monné wrote:
> > > > On Thu, Jul 23, 2020 at 01:37:03PM +0200, David Hildenbrand wrote:
> > > > > On 23.07.20 10:45, Roger Pau Monne wrote:
> > > > > > Add an extra option to add_memory_resource that overrides the memory
> > > > > > hotplug online behavior in order to force onlining of memory from
> > > > > > add_memory_resource unconditionally.
> > > > > > 
> > > > > > This is required for the Xen balloon driver, that must run the
> > > > > > online page callback in order to correctly process the newly added
> > > > > > memory region, note this is an unpopulated region that is used by Linux
> > > > > > to either hotplug RAM or to map foreign pages from other domains, and
> > > > > > hence memory hotplug when running on Xen can be used even without the
> > > > > > user explicitly requesting it, as part of the normal operations of the
> > > > > > OS when attempting to map memory from a different domain.
> > > > > > 
> > > > > > Setting a different default value of memhp_default_online_type when
> > > > > > attaching the balloon driver is not a robust solution, as the user (or
> > > > > > distro init scripts) could still change it and thus break the Xen
> > > > > > balloon driver.
> > > > > 
> > > > > I think we discussed this a couple of times before (even triggered by my
> > > > > request), and this is the responsibility of user space to configure. Usually
> > > > > distros have udev rules to online memory automatically. Especially, user
> > > > > space should be able to configure *how* to online memory.
> > > > 
> > > > Note (as per the commit message) that in the specific case I'm
> > > > referring to the memory hotplugged by the Xen balloon driver will be
> > > > an unpopulated range to be used internally by certain Xen subsystems,
> > > > like the xen-blkback or the privcmd drivers. The addition of such
> > > > blocks of (unpopulated) memory can happen without the user explicitly
> > > > requesting it, and hence not even aware such hotplug process is taking
> > > > place. To be clear: no actual RAM will be added to the system.
> > > > 
> > > > Failure to online such blocks using the Xen specific online handler
> > > > (which does not handle back the memory to the allocator in any way)
> > > > will result in the system getting stuck and malfunctioning.
> > > > 
> > > > > It's the admin/distro responsibility to configure this properly. In case
> > > > > this doesn't happen (or as you say, users change it), bad luck.
> > > > > 
> > > > > E.g., virtio-mem takes care to not add more memory in case it is not
> > > > > getting onlined. I remember hyper-v has similar code to at least wait a
> > > > > bit for memory to get onlined.
> > > > 
> > > > I don't think VirtIO or Hyper-V use the hotplug system in the same way
> > > > as Xen, as said this is done to add unpopulated memory regions that
> > > > will be used to map foreign memory (from other domains) by Xen drivers
> > > > on the system.
> > > > 
> > > > Maybe this should somehow use a different mechanism to hotplug such
> > > > empty memory blocks? I don't mind doing this differently, but I would
> > > > need some pointers. Allowing user-space to change a (seemingly
> > > > unrelated) parameter and as a result produce failures on Xen drivers
> > > > is not an acceptable solution IMO.
> > > 
> > > Maybe we can use the same approach as Xen PV-domains: pre-allocate a
> > > region in the memory map to be used for mapping foreign pages. For the
> > > kernel it will look like pre-ballooned memory, so it will create struct
> > > page for the region (which is what we are after), but it won't give the
> > > memory to the allocator.
> > 
> > IMO using something similar to memory hotplug would give us more
> > flexibility, and TBH the logic is already there in the balloon driver.
> > It seems quite wasteful to allocate such region(s) beforehand for all
> > domains, even when most of them won't end up using foreign mappings at
> > all.
> 
> We can do it for dom0 only per default, and add a boot parameter e.g.
> for driver domains.
> 
> And the logic is already there (just pv-only right now).
> 
> > 
> > Anyway, I'm going to take a look at how to do that, I guess it's going
> > to involve playing with the memory map and reserving some space.
> 
> Look at arch/x86/xen/setup.c (xen_add_extra_mem() and its usage).

Yes, I've taken a look. It's my rough understanding that I would need
to add a hook for HVM/PVH that modifies the memory map in order to add
an extra region (or regions) that would be marked as reserved using
memblock_reserve by xen_add_extra_mem.

Adding such a hook for PVH guests booted using the PVH entry point and
fetching the memory map using the hypercall interface
(mem_map_via_hcall) seems feasible; however, I'm not sure dealing with
other guest types is that easy.
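
Something like the following is what I have in mind (purely
hypothetical sketch, none of this exists today; where the region should
live and how large it should be are open questions):

  /*
   * Carve out a hole for foreign mappings when booting PVH: keep the
   * region in the e820 as RAM so that a memmap (struct page) is
   * created for it, but reserve it in memblock so the buddy allocator
   * never sees it; this mirrors what the PV path does via
   * xen_add_extra_mem().
   */
  static void __init sketch_reserve_foreign_region(phys_addr_t start,
                                                   phys_addr_t size)
  {
          e820__range_add(start, size, E820_TYPE_RAM);
          memblock_reserve(start, size);
  }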

> > 
> > I suggest we should remove the Xen balloon hotplug logic, as it's not
> > working properly and we don't have a plan to fix it.
> 
> I have used memory hotplug successfully not very long ago.

Right, but it requires a certain set of enabled options, which IMO is
not obvious. For example, enabling xen_hotplug_unpopulated without also
setting the default memory hotplug policy to online the added blocks
will result in processes getting stuck. That is too fragile IMO.

Roger.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 3/3] memory: introduce an option to force onlining of hotplug memory
  2020-07-23 13:22         ` David Hildenbrand
@ 2020-07-23 13:47           ` David Hildenbrand
  0 siblings, 0 replies; 47+ messages in thread
From: David Hildenbrand @ 2020-07-23 13:47 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: linux-kernel, Boris Ostrovsky, Juergen Gross, Stefano Stabellini,
	Andrew Morton, xen-devel, linux-mm

On 23.07.20 15:22, David Hildenbrand wrote:
> On 23.07.20 14:23, Roger Pau Monné wrote:
>> On Thu, Jul 23, 2020 at 01:37:03PM +0200, David Hildenbrand wrote:
>>> On 23.07.20 10:45, Roger Pau Monne wrote:
>>>> Add an extra option to add_memory_resource that overrides the memory
>>>> hotplug online behavior in order to force onlining of memory from
>>>> add_memory_resource unconditionally.
>>>>
>>>> This is required for the Xen balloon driver, that must run the
>>>> online page callback in order to correctly process the newly added
>>>> memory region, note this is an unpopulated region that is used by Linux
>>>> to either hotplug RAM or to map foreign pages from other domains, and
>>>> hence memory hotplug when running on Xen can be used even without the
>>>> user explicitly requesting it, as part of the normal operations of the
>>>> OS when attempting to map memory from a different domain.
>>>>
>>>> Setting a different default value of memhp_default_online_type when
>>>> attaching the balloon driver is not a robust solution, as the user (or
>>>> distro init scripts) could still change it and thus break the Xen
>>>> balloon driver.
>>>
>>> I think we discussed this a couple of times before (even triggered by my
>>> request), and this is the responsibility of user space to configure. Usually
>>> distros have udev rules to online memory automatically. Especially, user
>>> space should be able to configure *how* to online memory.
>>
>> Note (as per the commit message) that in the specific case I'm
>> referring to the memory hotplugged by the Xen balloon driver will be
>> an unpopulated range to be used internally by certain Xen subsystems,
>> like the xen-blkback or the privcmd drivers. The addition of such
>> blocks of (unpopulated) memory can happen without the user explicitly
>> requesting it, and hence not even aware such hotplug process is taking
>> place. To be clear: no actual RAM will be added to the system.
> 
> Okay, but there is also the case where XEN will actually hotplug memory
> using this same handler IIRC (at least I've read papers about it). Both
> are using the same handler, correct?
> 
>>
>>> It's the admin/distro responsibility to configure this properly. In case
>>> this doesn't happen (or as you say, users change it), bad luck.
>>>
>>> E.g., virtio-mem takes care to not add more memory in case it is not
>>> getting onlined. I remember hyper-v has similar code to at least wait a
>>> bit for memory to get onlined.
>>
>> I don't think VirtIO or Hyper-V use the hotplug system in the same way
>> as Xen, as said this is done to add unpopulated memory regions that
>> will be used to map foreign memory (from other domains) by Xen drivers
>> on the system.
> 
> Indeed, if the memory is never exposed to the buddy (and all you need is
> struct pages +  a kernel virtual mapping), I wonder if
> memremap/ZONE_DEVICE is what you want? Then you won't have user-visible
> memory blocks created with unclear online semantics, partially involving
> the buddy.

And just a note that there is also DCSS on s390x / z/VM, which allows
mapping segments into the VM physical address space (e.g., you can share
segments between VMs). They don't need any memmap (struct page) for that
memory, though. All they do is create the identity mapping in the kernel
virtual address space manually. Not sure what the exact requirements on
the Xen side are. I assume you need a memmap for this memory.

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 47+ messages in thread


* Re: [PATCH 3/3] memory: introduce an option to force onlining of hotplug memory
  2020-07-23 13:39               ` Roger Pau Monné
@ 2020-07-23 13:49                 ` Jürgen Groß
  -1 siblings, 0 replies; 47+ messages in thread
From: Jürgen Groß @ 2020-07-23 13:49 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: David Hildenbrand, linux-kernel, Boris Ostrovsky,
	Stefano Stabellini, Andrew Morton, xen-devel, linux-mm

On 23.07.20 15:39, Roger Pau Monné wrote:
> On Thu, Jul 23, 2020 at 03:20:55PM +0200, Jürgen Groß wrote:
>> On 23.07.20 15:08, Roger Pau Monné wrote:
>>> On Thu, Jul 23, 2020 at 02:28:13PM +0200, Jürgen Groß wrote:
>>>> On 23.07.20 14:23, Roger Pau Monné wrote:
>>>>> On Thu, Jul 23, 2020 at 01:37:03PM +0200, David Hildenbrand wrote:
>>>>>> On 23.07.20 10:45, Roger Pau Monne wrote:
>>>>>>> Add an extra option to add_memory_resource that overrides the memory
>>>>>>> hotplug online behavior in order to force onlining of memory from
>>>>>>> add_memory_resource unconditionally.
>>>>>>>
>>>>>>> This is required for the Xen balloon driver, which must run the
>>>>>>> online page callback in order to correctly process the newly added
>>>>>>> memory region. Note this is an unpopulated region that is used by Linux
>>>>>>> either to hotplug RAM or to map foreign pages from other domains, and
>>>>>>> hence memory hotplug when running on Xen can be used even without the
>>>>>>> user explicitly requesting it, as part of the normal operations of the
>>>>>>> OS when attempting to map memory from a different domain.
>>>>>>>
>>>>>>> Setting a different default value of memhp_default_online_type when
>>>>>>> attaching the balloon driver is not a robust solution, as the user (or
>>>>>>> distro init scripts) could still change it and thus break the Xen
>>>>>>> balloon driver.
>>>>>>
>>>>>> I think we discussed this a couple of times before (even triggered by my
>>>>>> request), and this is the responsibility of user space to configure. Usually
>>>>>> distros have udev rules to online memory automatically. Especially, user
>>>>>> space should be able to configure *how* to online memory.
>>>>>
>>>>> Note (as per the commit message) that in the specific case I'm
>>>>> referring to, the memory hotplugged by the Xen balloon driver will be
>>>>> an unpopulated range to be used internally by certain Xen subsystems,
>>>>> like the xen-blkback or the privcmd drivers. The addition of such
>>>>> blocks of (unpopulated) memory can happen without the user explicitly
>>>>> requesting it, and hence without the user even being aware that such
>>>>> a hotplug process is taking place. To be clear: no actual RAM will be
>>>>> added to the system.
>>>>>
>>>>> Failure to online such blocks using the Xen specific online handler
>>>>> (which does not hand the memory back to the allocator in any way)
>>>>> will result in the system getting stuck and malfunctioning.
>>>>>
>>>>>> It's the admin/distro responsibility to configure this properly. In case
>>>>>> this doesn't happen (or as you say, users change it), bad luck.
>>>>>>
>>>>>> E.g., virtio-mem takes care to not add more memory in case it is not
>>>>>> getting onlined. I remember hyper-v has similar code to at least wait a
>>>>>> bit for memory to get onlined.
>>>>>
>>>>> I don't think VirtIO or Hyper-V use the hotplug system in the same way
>>>>> as Xen; as said, this is done to add unpopulated memory regions that
>>>>> will be used to map foreign memory (from other domains) by Xen drivers
>>>>> on the system.
>>>>>
>>>>> Maybe this should somehow use a different mechanism to hotplug such
>>>>> empty memory blocks? I don't mind doing this differently, but I would
>>>>> need some pointers. Allowing user-space to change a (seemingly
>>>>> unrelated) parameter and as a result produce failures on Xen drivers
>>>>> is not an acceptable solution IMO.
>>>>
>>>> Maybe we can use the same approach as Xen PV-domains: pre-allocate a
>>>> region in the memory map to be used for mapping foreign pages. For the
>>>> kernel it will look like pre-ballooned memory, so it will create struct
>>>> page for the region (which is what we are after), but it won't give the
>>>> memory to the allocator.
>>>
>>> IMO using something similar to memory hotplug would give us more
>>> flexibility, and TBH the logic is already there in the balloon driver.
>>> It seems quite wasteful to allocate such region(s) beforehand for all
>>> domains, even when most of them won't end up using foreign mappings at
>>> all.
>>
>> We can do it for dom0 only per default, and add a boot parameter e.g.
>> for driver domains.
>>
>> And the logic is already there (just pv-only right now).
>>
>>>
>>> Anyway, I'm going to take a look at how to do that, I guess it's going
>>> to involve playing with the memory map and reserving some space.
>>
>> Look at arch/x86/xen/setup.c (xen_add_extra_mem() and its usage).
> 
> Yes, I've taken a look. It's my rough understanding that I would need
> to add a hook for HVM/PVH that modifies the memory map in order to add
> an extra region (or regions) that would be marked as reserved using
> memblock_reserve by xen_add_extra_mem.
> 
> Adding such a hook for PVH guests booted using the PVH entry point and
> fetching the memory map using the hypercall interface
> (mem_map_via_hcall) seems feasible; however, I'm not sure dealing with
> other guest types is that easy.

I think for dom0 we can just use the existing logic based on the host
memory map for selecting which region to use (possibly the size could
be specified as a boot parameter in order to override the default).

For domUs we'd need a boot parameter specifying either just the size
(resulting in a possible clash in case of PCI passthrough) or the
guest physical region for that additional area.
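
Purely as a sketch (the parameter name and syntax here are made up),
such a domU boot parameter could be parsed along these lines:

#include <linux/errno.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/types.h>

/* Hypothetical syntax: xen_fmem=512M or xen_fmem=512M@0x100000000, sizing
 * and optionally placing the pre-allocated foreign-mapping area. */
static phys_addr_t xen_fmem_size __initdata;
static phys_addr_t xen_fmem_start __initdata;

static int __init parse_xen_fmem(char *arg)
{
	if (!arg)
		return -EINVAL;
	xen_fmem_size = memparse(arg, &arg);
	if (*arg == '@')
		xen_fmem_start = memparse(arg + 1, NULL);
	return 0;
}
early_param("xen_fmem", parse_xen_fmem);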

> 
>>>
>>> I suggest we should remove the Xen balloon hotplug logic, as it's not
>>> working properly and we don't have a plan to fix it.
>>
>> I have used memory hotplug successfully not very long ago.
> 
> Right, but it requires a certain set of enabled options, which IMO is
> not obvious. For example enabling xen_hotplug_unpopulated without also
> setting the default memory hotplug policy to online the added blocks
> will result in processes getting stuck. This is IMO too fragile.

Yes, memory hotplug has been an item on my todo list for some years now.


Juergen

^ permalink raw reply	[flat|nested] 47+ messages in thread


* Re: [PATCH 3/3] memory: introduce an option to force onlining of hotplug memory
  2020-07-23 13:47           ` David Hildenbrand
@ 2020-07-23 13:53             ` Jürgen Groß
  -1 siblings, 0 replies; 47+ messages in thread
From: Jürgen Groß @ 2020-07-23 13:53 UTC (permalink / raw)
  To: David Hildenbrand, Roger Pau Monné
  Cc: linux-kernel, Boris Ostrovsky, Stefano Stabellini, Andrew Morton,
	xen-devel, linux-mm

On 23.07.20 15:47, David Hildenbrand wrote:
> On 23.07.20 15:22, David Hildenbrand wrote:
>> On 23.07.20 14:23, Roger Pau Monné wrote:
>>> On Thu, Jul 23, 2020 at 01:37:03PM +0200, David Hildenbrand wrote:
>>>> On 23.07.20 10:45, Roger Pau Monne wrote:
>>>>> Add an extra option to add_memory_resource that overrides the memory
>>>>> hotplug online behavior in order to force onlining of memory from
>>>>> add_memory_resource unconditionally.
>>>>>
>>>>> This is required for the Xen balloon driver, which must run the
>>>>> online page callback in order to correctly process the newly added
>>>>> memory region. Note this is an unpopulated region that is used by Linux
>>>>> either to hotplug RAM or to map foreign pages from other domains, and
>>>>> hence memory hotplug when running on Xen can be used even without the
>>>>> user explicitly requesting it, as part of the normal operations of the
>>>>> OS when attempting to map memory from a different domain.
>>>>>
>>>>> Setting a different default value of memhp_default_online_type when
>>>>> attaching the balloon driver is not a robust solution, as the user (or
>>>>> distro init scripts) could still change it and thus break the Xen
>>>>> balloon driver.
>>>>
>>>> I think we discussed this a couple of times before (even triggered by my
>>>> request), and this is the responsibility of user space to configure. Usually
>>>> distros have udev rules to online memory automatically. Especially, user
>>>> space should be able to configure *how* to online memory.
>>>
>>> Note (as per the commit message) that in the specific case I'm
>>> referring to, the memory hotplugged by the Xen balloon driver will be
>>> an unpopulated range to be used internally by certain Xen subsystems,
>>> like the xen-blkback or the privcmd drivers. The addition of such
>>> blocks of (unpopulated) memory can happen without the user explicitly
>>> requesting it, and hence without the user even being aware that such
>>> a hotplug process is taking place. To be clear: no actual RAM will be
>>> added to the system.
>>
>> Okay, but there is also the case where XEN will actually hotplug memory
>> using this same handler IIRC (at least I've read papers about it). Both
>> are using the same handler, correct?
>>
>>>
>>>> It's the admin/distro responsibility to configure this properly. In case
>>>> this doesn't happen (or as you say, users change it), bad luck.
>>>>
>>>> E.g., virtio-mem takes care to not add more memory in case it is not
>>>> getting onlined. I remember hyper-v has similar code to at least wait a
>>>> bit for memory to get onlined.
>>>
>>> I don't think VirtIO or Hyper-V use the hotplug system in the same way
>>> as Xen; as said, this is done to add unpopulated memory regions that
>>> will be used to map foreign memory (from other domains) by Xen drivers
>>> on the system.
>>
>> Indeed, if the memory is never exposed to the buddy (and all you need is
>> struct pages +  a kernel virtual mapping), I wonder if
>> memremap/ZONE_DEVICE is what you want? Then you won't have user-visible
>> memory blocks created with unclear online semantics, partially involving
>> the buddy.
> 
> And just a note that there is also DCSS on s390x / z/VM, which allows
> mapping segments into the VM physical address space (e.g., you can share
> segments between VMs). They don't need any memmap (struct page) for that
> memory, though. All they do is create the identity mapping in the kernel
> virtual address space manually. Not sure what the exact requirements on
> the XEN side are. I assume you need a memmap for this memory.

We need to be able to do I/O with that memory via normal drivers and we
need to be able to map it, both from userland and from the kernel.


Juergen

^ permalink raw reply	[flat|nested] 47+ messages in thread


* Re: [PATCH 3/3] memory: introduce an option to force onlining of hotplug memory
  2020-07-23 13:22         ` David Hildenbrand
@ 2020-07-23 13:59           ` Roger Pau Monné
  -1 siblings, 0 replies; 47+ messages in thread
From: Roger Pau Monné @ 2020-07-23 13:59 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, Boris Ostrovsky, Juergen Gross, Stefano Stabellini,
	Andrew Morton, xen-devel, linux-mm

On Thu, Jul 23, 2020 at 03:22:49PM +0200, David Hildenbrand wrote:
> On 23.07.20 14:23, Roger Pau Monné wrote:
> > On Thu, Jul 23, 2020 at 01:37:03PM +0200, David Hildenbrand wrote:
> >> On 23.07.20 10:45, Roger Pau Monne wrote:
> >>> Add an extra option to add_memory_resource that overrides the memory
> >>> hotplug online behavior in order to force onlining of memory from
> >>> add_memory_resource unconditionally.
> >>>
> >>> This is required for the Xen balloon driver, which must run the
> >>> online page callback in order to correctly process the newly added
> >>> memory region. Note this is an unpopulated region that is used by Linux
> >>> either to hotplug RAM or to map foreign pages from other domains, and
> >>> hence memory hotplug when running on Xen can be used even without the
> >>> user explicitly requesting it, as part of the normal operations of the
> >>> OS when attempting to map memory from a different domain.
> >>>
> >>> Setting a different default value of memhp_default_online_type when
> >>> attaching the balloon driver is not a robust solution, as the user (or
> >>> distro init scripts) could still change it and thus break the Xen
> >>> balloon driver.
> >>
> >> I think we discussed this a couple of times before (even triggered by my
> >> request), and this is the responsibility of user space to configure. Usually
> >> distros have udev rules to online memory automatically. Especially, user
> >> space should be able to configure *how* to online memory.
> > 
> > Note (as per the commit message) that in the specific case I'm
> > referring to, the memory hotplugged by the Xen balloon driver will be
> > an unpopulated range to be used internally by certain Xen subsystems,
> > like the xen-blkback or the privcmd drivers. The addition of such
> > blocks of (unpopulated) memory can happen without the user explicitly
> > requesting it, and hence without the user even being aware that such
> > a hotplug process is taking place. To be clear: no actual RAM will be
> > added to the system.
> 
> Okay, but there is also the case where XEN will actually hotplug memory
> using this same handler IIRC (at least I've read papers about it). Both
> are using the same handler, correct?

Yes, it's used for this dual purpose, which I have to admit I don't
like that much either.

One set of pages should clearly be used for RAM memory hotplug, and
the other to map foreign pages that are not related to memory hotplug;
it's just that we happen to need a physical region with backing struct
pages.

> > 
> >> It's the admin/distro responsibility to configure this properly. In case
> >> this doesn't happen (or as you say, users change it), bad luck.
> >>
> >> E.g., virtio-mem takes care to not add more memory in case it is not
> >> getting onlined. I remember hyper-v has similar code to at least wait a
> >> bit for memory to get onlined.
> > 
> > I don't think VirtIO or Hyper-V use the hotplug system in the same way
> > as Xen; as said, this is done to add unpopulated memory regions that
> > will be used to map foreign memory (from other domains) by Xen drivers
> > on the system.
> 
> Indeed, if the memory is never exposed to the buddy (and all you need is
> struct pages +  a kernel virtual mapping), I wonder if
> memremap/ZONE_DEVICE is what you want?

I'm certainly not familiar with the Linux memory subsystem, but if
that gets us a backing struct page and a kernel mapping then I would
say yes.

> Then you won't have user-visible
> memory blocks created with unclear online semantics, partially involving
> the buddy.

Seems like a fine solution.

Juergen: would you be OK to use a separate page-list for
alloc_xenballooned_pages on HVM/PVH using the logic described by
David?

I guess I would leave PV as-is, since it already has this reserved
region to map foreign pages.
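
As a rough sketch of the separate page-list idea (names hypothetical,
and the refill of the list from the hotplugged region elided):

#include <linux/list.h>
#include <linux/mm_types.h>
#include <linux/spinlock.h>

/* Hypothetical pool, kept entirely out of the ballooning/onlining path. */
static LIST_HEAD(foreign_pages);
static DEFINE_SPINLOCK(foreign_lock);

static struct page *get_foreign_page(void)
{
	struct page *pg = NULL;

	spin_lock(&foreign_lock);
	if (!list_empty(&foreign_pages)) {
		pg = list_first_entry(&foreign_pages, struct page, lru);
		list_del(&pg->lru);
	}
	spin_unlock(&foreign_lock);
	return pg;
}

static void put_foreign_page(struct page *pg)
{
	spin_lock(&foreign_lock);
	list_add(&pg->lru, &foreign_pages);
	spin_unlock(&foreign_lock);
}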

Thanks, Roger.

^ permalink raw reply	[flat|nested] 47+ messages in thread


* Re: [PATCH 3/3] memory: introduce an option to force onlining of hotplug memory
  2020-07-23 13:59           ` Roger Pau Monné
@ 2020-07-23 15:10             ` Jürgen Groß
  -1 siblings, 0 replies; 47+ messages in thread
From: Jürgen Groß @ 2020-07-23 15:10 UTC (permalink / raw)
  To: Roger Pau Monné, David Hildenbrand
  Cc: linux-kernel, Boris Ostrovsky, Stefano Stabellini, Andrew Morton,
	xen-devel, linux-mm

On 23.07.20 15:59, Roger Pau Monné wrote:
> On Thu, Jul 23, 2020 at 03:22:49PM +0200, David Hildenbrand wrote:
>> On 23.07.20 14:23, Roger Pau Monné wrote:
>>> On Thu, Jul 23, 2020 at 01:37:03PM +0200, David Hildenbrand wrote:
>>>> On 23.07.20 10:45, Roger Pau Monne wrote:
>>>>> Add an extra option to add_memory_resource that overrides the memory
>>>>> hotplug online behavior in order to force onlining of memory from
>>>>> add_memory_resource unconditionally.
>>>>>
>>>>> This is required for the Xen balloon driver, which must run the
>>>>> online page callback in order to correctly process the newly added
>>>>> memory region. Note this is an unpopulated region that is used by Linux
>>>>> either to hotplug RAM or to map foreign pages from other domains, and
>>>>> hence memory hotplug when running on Xen can be used even without the
>>>>> user explicitly requesting it, as part of the normal operations of the
>>>>> OS when attempting to map memory from a different domain.
>>>>>
>>>>> Setting a different default value of memhp_default_online_type when
>>>>> attaching the balloon driver is not a robust solution, as the user (or
>>>>> distro init scripts) could still change it and thus break the Xen
>>>>> balloon driver.
>>>>
>>>> I think we discussed this a couple of times before (even triggered by my
>>>> request), and this is the responsibility of user space to configure. Usually
>>>> distros have udev rules to online memory automatically. Especially, user
>>>> space should be able to configure *how* to online memory.
>>>
>>> Note (as per the commit message) that in the specific case I'm
>>> referring to, the memory hotplugged by the Xen balloon driver will be
>>> an unpopulated range to be used internally by certain Xen subsystems,
>>> like the xen-blkback or the privcmd drivers. The addition of such
>>> blocks of (unpopulated) memory can happen without the user explicitly
>>> requesting it, and hence without the user even being aware that such
>>> a hotplug process is taking place. To be clear: no actual RAM will be
>>> added to the system.
>>
>> Okay, but there is also the case where XEN will actually hotplug memory
>> using this same handler IIRC (at least I've read papers about it). Both
>> are using the same handler, correct?
> 
> Yes, it's used for this dual purpose, which I have to admit I don't
> like that much either.
> 
> One set of pages should clearly be used for RAM memory hotplug, and
> the other to map foreign pages that are not related to memory hotplug;
> it's just that we happen to need a physical region with backing struct
> pages.
> 
>>>
>>>> It's the admin/distro responsibility to configure this properly. In case
>>>> this doesn't happen (or as you say, users change it), bad luck.
>>>>
>>>> E.g., virtio-mem takes care to not add more memory in case it is not
>>>> getting onlined. I remember hyper-v has similar code to at least wait a
>>>> bit for memory to get onlined.
>>>
>>> I don't think VirtIO or Hyper-V use the hotplug system in the same way
>>> as Xen; as said, this is done to add unpopulated memory regions that
>>> will be used to map foreign memory (from other domains) by Xen drivers
>>> on the system.
>>
>> Indeed, if the memory is never exposed to the buddy (and all you need is
>> struct pages +  a kernel virtual mapping), I wonder if
>> memremap/ZONE_DEVICE is what you want?
> 
> I'm certainly not familiar with the Linux memory subsystem, but if
> that gets us a backing struct page and a kernel mapping then I would
> say yes.
> 
>> Then you won't have user-visible
>> memory blocks created with unclear online semantics, partially involving
>> the buddy.
> 
> Seems like a fine solution.
> 
> Juergen: would you be OK to use a separate page-list for
> alloc_xenballooned_pages on HVM/PVH using the logic described by
> David?
> 
> I guess I would leave PV as-is, since it already has this reserved
> region to map foreign pages.

I would really like a common solution, especially as it would enable
pv driver domains to use that feature, too.

And finding a region for this memory zone in PVH dom0 should be common
with PV dom0 after all. We don't want to collide with either PCI space
or hotplug memory.


Juergen

^ permalink raw reply	[flat|nested] 47+ messages in thread


* Re: [PATCH 3/3] memory: introduce an option to force onlining of hotplug memory
  2020-07-23 15:10             ` Jürgen Groß
@ 2020-07-23 16:03               ` Andrew Cooper
  -1 siblings, 0 replies; 47+ messages in thread
From: Andrew Cooper @ 2020-07-23 16:03 UTC (permalink / raw)
  To: Jürgen Groß, Roger Pau Monné, David Hildenbrand
  Cc: Stefano Stabellini, linux-kernel, linux-mm, xen-devel,
	Boris Ostrovsky, Andrew Morton

On 23/07/2020 16:10, Jürgen Groß wrote:
> On 23.07.20 15:59, Roger Pau Monné wrote:
>> On Thu, Jul 23, 2020 at 03:22:49PM +0200, David Hildenbrand wrote:
>>> On 23.07.20 14:23, Roger Pau Monné wrote:
>>>> On Thu, Jul 23, 2020 at 01:37:03PM +0200, David Hildenbrand wrote:
>>>>> On 23.07.20 10:45, Roger Pau Monne wrote:
>>>>>> Add an extra option to add_memory_resource that overrides the memory
>>>>>> hotplug online behavior in order to force onlining of memory from
>>>>>> add_memory_resource unconditionally.
>>>>>>
>>>>>> This is required for the Xen balloon driver, which must run the
>>>>>> online page callback in order to correctly process the newly added
>>>>>> memory region. Note this is an unpopulated region that is used by
>>>>>> Linux either to hotplug RAM or to map foreign pages from other
>>>>>> domains, and hence memory hotplug when running on Xen can be used
>>>>>> even without the user explicitly requesting it, as part of the
>>>>>> normal operations of the OS when attempting to map memory from a
>>>>>> different domain.
>>>>>>
>>>>>> Setting a different default value of memhp_default_online_type when
>>>>>> attaching the balloon driver is not a robust solution, as the
>>>>>> user (or
>>>>>> distro init scripts) could still change it and thus break the Xen
>>>>>> balloon driver.
>>>>>
>>>>> I think we discussed this a couple of times before (even triggered
>>>>> by my request), and this is the responsibility of user space to
>>>>> configure. Usually distros have udev rules to online memory
>>>>> automatically. Especially, user space should be able to configure
>>>>> *how* to online memory.
>>>>
>>>> Note (as per the commit message) that in the specific case I'm
>>>> referring to, the memory hotplugged by the Xen balloon driver will be
>>>> an unpopulated range to be used internally by certain Xen subsystems,
>>>> like the xen-blkback or the privcmd drivers. The addition of such
>>>> blocks of (unpopulated) memory can happen without the user explicitly
>>>> requesting it, and hence without the user even being aware that such
>>>> a hotplug process is taking place. To be clear: no actual RAM will be
>>>> added to the system.
>>>
>>> Okay, but there is also the case where XEN will actually hotplug memory
>>> using this same handler IIRC (at least I've read papers about it). Both
>>> are using the same handler, correct?
>>
>> Yes, it's used for this dual purpose, which I have to admit I don't
>> like that much either.
>>
>> One set of pages should clearly be used for RAM memory hotplug, and
>> the other to map foreign pages that are not related to memory hotplug;
>> it's just that we happen to need a physical region with backing struct
>> pages.
>>
>>>>
>>>>> It's the admin/distro responsibility to configure this properly.
>>>>> In case
>>>>> this doesn't happen (or as you say, users change it), bad luck.
>>>>>
>>>>> E.g., virtio-mem takes care to not add more memory in case it is not
>>>>> getting onlined. I remember hyper-v has similar code to at least
>>>>> wait a
>>>>> bit for memory to get onlined.
>>>>
>>>> I don't think VirtIO or Hyper-V use the hotplug system in the same way
>>>> as Xen; as said, this is done to add unpopulated memory regions that
>>>> will be used to map foreign memory (from other domains) by Xen drivers
>>>> on the system.
>>>
>>> Indeed, if the memory is never exposed to the buddy (and all you
>>> need is
>>> struct pages +  a kernel virtual mapping), I wonder if
>>> memremap/ZONE_DEVICE is what you want?
>>
>> I'm certainly not familiar with the Linux memory subsystem, but if
>> that gets us a backing struct page and a kernel mapping then I would
>> say yes.
>>
>>> Then you won't have user-visible
>>> memory blocks created with unclear online semantics, partially
>>> involving
>>> the buddy.
>>
>> Seems like a fine solution.
>>
>> Juergen: would you be OK to use a separate page-list for
>> alloc_xenballooned_pages on HVM/PVH using the logic described by
>> David?
>>
>> I guess I would leave PV as-is, since it already has this reserved
>> region to map foreign pages.
>
> I would really like a common solution, especially as it would enable
> pv driver domains to use that feature, too.
>
> And finding a region for this memory zone in PVH dom0 should be common
> with PV dom0 after all. We don't want to collide with either PCI space
> or hotplug memory.

While I agree with the goal here, these are two very different things, due
to the completely different nature of PV and HVM/PVH guests.

HVM/PVH guests have a concrete guest physical address space.  Linux
needs to pick some gfn's to use which aren't used by anything else (and
Xen's behaviour of not providing any help here is deeply unhelpful, and
needs fixing), and get struct page_info's for them.

PV is totally different.  Linux still needs page_info's for them, but
there is no concept of a guest physical address space.  You can
literally gain access to foreign mappings or grant maps by asking Xen to
modify a PTE.  For convenience with the core code, Linux tries to map
this concept back into a 1:1 pfn space, but it is quite fictitious.
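
For illustration only (heavily simplified; the real code in
xen_remap_domain_gfn_* batches these requests and handles errors), a PV
foreign mapping boils down to a single PTE update request to Xen:

#include <linux/mm.h>
#include <xen/interface/xen.h>
#include <asm/xen/hypercall.h>
#include <asm/xen/page.h>

/* pte_maddr is the machine address of the PTE to rewrite. */
static int pv_map_foreign(phys_addr_t pte_maddr, unsigned long foreign_mfn,
			  domid_t foreign_domid)
{
	struct mmu_update u = {
		.ptr = pte_maddr | MMU_NORMAL_PT_UPDATE,
		.val = pte_val_ma(mfn_pte(foreign_mfn, PAGE_KERNEL)),
	};

	/* Xen validates the request against the foreign domain. */
	return HYPERVISOR_mmu_update(&u, 1, NULL, foreign_domid);
}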

~Andrew

^ permalink raw reply	[flat|nested] 47+ messages in thread


* Re: [PATCH 3/3] memory: introduce an option to force onlining of hotplug memory
  2020-07-23 15:10             ` Jürgen Groß
@ 2020-07-23 16:22               ` Roger Pau Monné
  -1 siblings, 0 replies; 47+ messages in thread
From: Roger Pau Monné @ 2020-07-23 16:22 UTC (permalink / raw)
  To: Jürgen Groß, David Hildenbrand
  Cc: linux-kernel, Boris Ostrovsky, Stefano Stabellini, Andrew Morton,
	xen-devel, linux-mm

On Thu, Jul 23, 2020 at 05:10:03PM +0200, Jürgen Groß wrote:
> On 23.07.20 15:59, Roger Pau Monné wrote:
> > On Thu, Jul 23, 2020 at 03:22:49PM +0200, David Hildenbrand wrote:
> > > On 23.07.20 14:23, Roger Pau Monné wrote:
> > > > On Thu, Jul 23, 2020 at 01:37:03PM +0200, David Hildenbrand wrote:
> > > > > On 23.07.20 10:45, Roger Pau Monne wrote:
> > > > > > Add an extra option to add_memory_resource that overrides the memory
> > > > > > hotplug online behavior in order to force onlining of memory from
> > > > > > add_memory_resource unconditionally.
> > > > > > 
> > > > > > This is required for the Xen balloon driver, which must run the
> > > > > > online page callback in order to correctly process the newly added
> > > > > > memory region. Note this is an unpopulated region that is used by Linux
> > > > > > either to hotplug RAM or to map foreign pages from other domains, and
> > > > > > hence memory hotplug when running on Xen can be used even without the
> > > > > > user explicitly requesting it, as part of the normal operations of the
> > > > > > OS when attempting to map memory from a different domain.
> > > > > > 
> > > > > > Setting a different default value of memhp_default_online_type when
> > > > > > attaching the balloon driver is not a robust solution, as the user (or
> > > > > > distro init scripts) could still change it and thus break the Xen
> > > > > > balloon driver.
> > > > > 
> > > > > I think we discussed this a couple of times before (even triggered by my
> > > > > request), and this is the responsibility of user space to configure. Usually
> > > > > distros have udev rules to online memory automatically. Especially, user
> > > > > space should be able to configure *how* to online memory.
> > > > 
> > > > Note (as per the commit message) that in the specific case I'm
> > > > referring to, the memory hotplugged by the Xen balloon driver will be
> > > > an unpopulated range to be used internally by certain Xen subsystems,
> > > > like the xen-blkback or the privcmd drivers. The addition of such
> > > > blocks of (unpopulated) memory can happen without the user explicitly
> > > > requesting it, and hence without the user even being aware that such
> > > > a hotplug process is taking place. To be clear: no actual RAM will be
> > > > added to the system.
> > > 
> > > Okay, but there is also the case where XEN will actually hotplug memory
> > > using this same handler IIRC (at least I've read papers about it). Both
> > > are using the same handler, correct?
> > 
> > Yes, it's used for this dual purpose, which I have to admit I don't
> > like that much either.
> > 
> > One set of pages should clearly be used for RAM memory hotplug, and
> > the other to map foreign pages that are not related to memory hotplug;
> > it's just that we happen to need a physical region with backing struct
> > pages.
> > 
> > > > 
> > > > > It's the admin/distro responsibility to configure this properly. In case
> > > > > this doesn't happen (or as you say, users change it), bad luck.
> > > > > 
> > > > > E.g., virtio-mem takes care to not add more memory in case it is not
> > > > > getting onlined. I remember hyper-v has similar code to at least wait a
> > > > > bit for memory to get onlined.
> > > > 
> > > > I don't think VirtIO or Hyper-V use the hotplug system in the same way
> > > > as Xen; as said, this is done to add unpopulated memory regions that
> > > > will be used to map foreign memory (from other domains) by Xen drivers
> > > > on the system.
> > > 
> > > Indeed, if the memory is never exposed to the buddy (and all you need is
> > > struct pages + a kernel virtual mapping), I wonder if
> > > memremap/ZONE_DEVICE is what you want?
> > 
> > I'm certainly not familiar with the Linux memory subsystem, but if
> > that gets us a backing struct page and a kernel mapping then I would
> > say yes.
> > 
> > > Then you won't have user-visible
> > > memory blocks created with unclear online semantics, partially involving
> > > the buddy.
> > 
> > Seems like a fine solution.
> > 
> > Juergen: would you be OK to use a separate page-list for
> > alloc_xenballooned_pages on HVM/PVH using the logic described by
> > David?
> > 
> > I guess I would leave PV as-is, since it already has this reserved
> > region to map foreign pages.
> 
> I would really like a common solution, especially as it would enable
> pv driver domains to use that feature, too.

I think PV is much easier in that regard, as it doesn't have MMIO
holes except when using PCI passthrough, and in that case it's
trivial to identify them.

However on HVM/PVH this is not so trivial. I'm certainly not opposed
to a solution that covers both, but ATM I would really like to get
something working for PVH dom0, or else it's not usable on Linux IMO.

> And finding a region for this memory zone in PVH dom0 should be common
> with PV dom0 after all. We don't want to collide with either PCI space
> or hotplug memory.

Right, we could use the native memory map for that on dom0, and maybe
create a custom resource for the Xen balloon driver instead of
allocating from iomem_resource?
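
Something like the following rough sketch is what I had in mind (all
names are made up, and on dom0 the root resource would need to be
seeded from the holes in the native memory map):

#include <linux/ioport.h>
#include <linux/mm.h>

/* Hypothetical dedicated root, so we never fight over iomem_resource. */
static struct resource xen_foreign_map_res = {
	.name  = "Xen foreign map space",
	.flags = IORESOURCE_MEM,
};

static int xen_alloc_foreign_region(struct resource *res,
				    resource_size_t size)
{
	res->name = "Xen foreign mapping";
	res->flags = IORESOURCE_MEM;

	/* Allocate from the custom root instead of iomem_resource. */
	return allocate_resource(&xen_foreign_map_res, res, size,
				 0, -1, PAGE_SIZE, NULL, NULL);
}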

DomUs are more tricky, as a guest has no idea where mappings can be
safely placed; maybe we will have to resort to iomem_resource in that
case, as I don't see many other options due to the lack of information
from Xen.

I also think that ZONE_DEVICE will need some adjustments; for one, the
types in memory_type don't seem to be suitable for Xen, as they are
either specific to DAX or PCI. I gave allocate_resource plus
memremap_pages a try, but that didn't seem to fly; I will need to
investigate further.

Maybe we can resort to something even simpler than memremap_pages? I
certainly have very little idea of how this is supposed to be used,
but dev_pagemap seems overly complex for what we are trying to
achieve.
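
FWIW, the rough shape of what I tried is below (a sketch only; the
dev_pagemap layout changes between kernel versions, and
MEMORY_DEVICE_XEN is a made-up placeholder, since none of the existing
memory_type values fit):

#include <linux/memremap.h>
#include <linux/numa.h>

static struct dev_pagemap xen_pgmap;

static void *xen_remap_foreign_region(struct resource *res)
{
	xen_pgmap.type = MEMORY_DEVICE_XEN; /* hypothetical new type */
	xen_pgmap.res = *res;

	/*
	 * memremap_pages() gives us backing struct pages plus a kernel
	 * virtual mapping for the range, without involving the buddy.
	 */
	return memremap_pages(&xen_pgmap, NUMA_NO_NODE);
}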

Thanks, Roger.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 3/3] memory: introduce an option to force onlining of hotplug memory
  2020-07-23 16:22               ` Roger Pau Monné
@ 2020-07-23 17:39                 ` David Hildenbrand
  -1 siblings, 0 replies; 47+ messages in thread
From: David Hildenbrand @ 2020-07-23 17:39 UTC (permalink / raw)
  To: Roger Pau Monné, Jürgen Groß
  Cc: linux-kernel, Boris Ostrovsky, Stefano Stabellini, Andrew Morton,
	xen-devel, linux-mm

On 23.07.20 18:22, Roger Pau Monné wrote:
> On Thu, Jul 23, 2020 at 05:10:03PM +0200, Jürgen Groß wrote:
>> On 23.07.20 15:59, Roger Pau Monné wrote:
>>> On Thu, Jul 23, 2020 at 03:22:49PM +0200, David Hildenbrand wrote:
>>>> On 23.07.20 14:23, Roger Pau Monné wrote:
>>>>> On Thu, Jul 23, 2020 at 01:37:03PM +0200, David Hildenbrand wrote:
>>>>>> On 23.07.20 10:45, Roger Pau Monne wrote:
>>>>>>> Add an extra option to add_memory_resource that overrides the memory
>>>>>>> hotplug online behavior in order to force onlining of memory from
>>>>>>> add_memory_resource unconditionally.
>>>>>>>
>>>>>>> This is required for the Xen balloon driver, which must run the
>>>>>>> online page callback in order to correctly process the newly added
>>>>>>> memory region. Note this is an unpopulated region that is used by Linux
>>>>>>> to either hotplug RAM or to map foreign pages from other domains, and
>>>>>>> hence memory hotplug when running on Xen can be used even without the
>>>>>>> user explicitly requesting it, as part of the normal operations of the
>>>>>>> OS when attempting to map memory from a different domain.
>>>>>>>
>>>>>>> Setting a different default value of memhp_default_online_type when
>>>>>>> attaching the balloon driver is not a robust solution, as the user (or
>>>>>>> distro init scripts) could still change it and thus break the Xen
>>>>>>> balloon driver.
>>>>>>
>>>>>> I think we discussed this a couple of times before (even triggered by my
>>>>>> request), and this is the responsibility of user space to configure. Usually
>>>>>> distros have udev rules to online memory automatically. Especially, user
>>>>>> space should be able to configure *how* to online memory.
>>>>>
>>>>> Note (as per the commit message) that in the specific case I'm
>>>>> referring to the memory hotplugged by the Xen balloon driver will be
>>>>> an unpopulated range to be used internally by certain Xen subsystems,
>>>>> like the xen-blkback or the privcmd drivers. The addition of such
>>>>> blocks of (unpopulated) memory can happen without the user explicitly
>>>>> requesting it, and hence without being aware such a hotplug process is
>>>>> taking place. To be clear: no actual RAM will be added to the system.
>>>>
>>>> Okay, but there is also the case where XEN will actually hotplug memory
>>>> using this same handler IIRC (at least I've read papers about it). Both
>>>> are using the same handler, correct?
>>>
>>> Yes, it's used for this dual purpose, which I have to admit I don't
>>> like that much either.
>>>
>>> One set of pages should be clearly used for RAM memory hotplug, and
>>> the other to map foreign pages that are not related to memory hotplug,
>>> it's just that we happen to need a physical region with backing struct
>>> pages.
>>>
>>>>>
>>>>>> It's the admin/distro responsibility to configure this properly. In case
>>>>>> this doesn't happen (or as you say, users change it), bad luck.
>>>>>>
>>>>>> E.g., virtio-mem takes care to not add more memory in case it is not
>>>>>> getting onlined. I remember hyper-v has similar code to at least wait a
>>>>>> bit for memory to get onlined.
>>>>>
>>>>> I don't think VirtIO or Hyper-V use the hotplug system in the same way
>>>>> as Xen; as said, this is done to add unpopulated memory regions that
>>>>> will be used to map foreign memory (from other domains) by Xen drivers
>>>>> on the system.
>>>>
>>>> Indeed, if the memory is never exposed to the buddy (and all you need is
>>>> struct pages + a kernel virtual mapping), I wonder if
>>>> memremap/ZONE_DEVICE is what you want?
>>>
>>> I'm certainly not familiar with the Linux memory subsystem, but if
>>> that gets us a backing struct page and a kernel mapping then I would
>>> say yes.
>>>
>>>> Then you won't have user-visible
>>>> memory blocks created with unclear online semantics, partially involving
>>>> the buddy.
>>>
>>> Seems like a fine solution.
>>>
>>> Juergen: would you be OK to use a separate page-list for
>>> alloc_xenballooned_pages on HVM/PVH using the logic described by
>>> David?
>>>
>>> I guess I would leave PV as-is, since it already has this reserved
>>> region to map foreign pages.
>>
>> I would really like a common solution, especially as it would enable
>> pv driver domains to use that feature, too.
> 
> I think PV is much easier in that regard, as it doesn't have MMIO
> holes except when using PCI passthrough, and in that case it's
> trivial to identify them.
> 
> However on HVM/PVH this is not so trivial. I'm certainly not opposed
> to a solution that covers both, but ATM I would really like to get
> something working for PVH dom0, or else it's not usable on Linux IMO.
> 
>> And finding a region for this memory zone in PVH dom0 should be common
>> with PV dom0 after all. We don't want to collide with either PCI space
>> or hotplug memory.
> 
> Right, we could use the native memory map for that on dom0, and maybe
> create a custom resource for the Xen balloon driver instead of
> allocating from iomem_resource?
> 
> DomUs are more tricky, as a guest has no idea where mappings can be
> safely placed; maybe we will have to resort to iomem_resource in that
> case, as I don't see many other options due to the lack of information
> from Xen.
> 
> I also think that ZONE_DEVICE will need some adjustments; for one, the
> types in memory_type don't seem to be suitable for Xen, as they are
> either specific to DAX or PCI. I gave allocate_resource plus
> memremap_pages a try, but that didn't seem to fly; I will need to
> investigate further.
> 
> Maybe we can resort to something even simpler than memremap_pages? I
> certainly have very little idea of how this is supposed to be used,
> but dev_pagemap seems overly complex for what we are trying to
> achieve.

Yeah, it might require some code churn. It just feels wrong to involve
buddy concepts (e.g., onlining pages, calling memory notifiers, exposing
memory block devices) and to introduce hacks (forced onlining) just to
get a memmap + identity mapping + iomem resource. I think reserving such
a region during boot as suggested is the easiest approach, but I am
*absolutely* not an expert on all these XEN-specific things :)

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 3/3] memory: introduce an option to force onlining of hotplug memory
  2020-07-23 17:39                 ` David Hildenbrand
@ 2020-07-24  7:28                   ` Michal Hocko
  -1 siblings, 0 replies; 47+ messages in thread
From: Michal Hocko @ 2020-07-24  7:28 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Roger Pau Monné, Jürgen Groß,
	linux-kernel, Boris Ostrovsky, Stefano Stabellini, Andrew Morton,
	xen-devel, linux-mm

On Thu 23-07-20 19:39:54, David Hildenbrand wrote:
[...]
> Yeah, might require some code churn. It just feels wrong to involve
> buddy concepts (e.g., onlining pages, calling memory notifiers, exposing
> memory block devices) and introducing hacks (forced onlining) just to
> get a memmap+identity mapping+iomem resource. I think reserving such a
> region during boot as suggested is the easiest approach, but I am
> *absolutely* not an expert on all these XEN-specific things :)

I am late to the discussion but FTR I completely agree.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2020-07-24  7:28 UTC | newest]

Thread overview: 47+ messages
2020-07-23  8:45 [PATCH 0/3] xen/balloon: fixes for memory hotplug Roger Pau Monne
2020-07-23  8:45 ` [PATCH 1/3] xen/balloon: fix accounting in alloc_xenballooned_pages error path Roger Pau Monne
2020-07-23  8:45 ` [PATCH 2/3] xen/balloon: make the balloon wait interruptible Roger Pau Monne
2020-07-23  8:45 ` [PATCH 3/3] memory: introduce an option to force onlining of hotplug memory Roger Pau Monne
2020-07-23 11:37   ` David Hildenbrand
2020-07-23 11:52     ` David Hildenbrand
2020-07-23 12:23     ` Roger Pau Monné
2020-07-23 12:28       ` Jürgen Groß
2020-07-23 12:31         ` David Hildenbrand
2020-07-23 13:08         ` Roger Pau Monné
2020-07-23 13:14           ` David Hildenbrand
2020-07-23 13:25             ` Roger Pau Monné
2020-07-23 13:20           ` Jürgen Groß
2020-07-23 13:39             ` Roger Pau Monné
2020-07-23 13:49               ` Jürgen Groß
2020-07-23 13:22       ` David Hildenbrand
2020-07-23 13:47         ` David Hildenbrand
2020-07-23 13:53           ` Jürgen Groß
2020-07-23 13:59         ` Roger Pau Monné
2020-07-23 15:10           ` Jürgen Groß
2020-07-23 16:03             ` Andrew Cooper
2020-07-23 16:22             ` Roger Pau Monné
2020-07-23 17:39               ` David Hildenbrand
2020-07-24  7:28                 ` Michal Hocko
