All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3] memory-hotplug: add automatic onlining policy for the newly added memory
@ 2016-01-07 17:23 ` Vitaly Kuznetsov
  0 siblings, 0 replies; 23+ messages in thread
From: Vitaly Kuznetsov @ 2016-01-07 17:23 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, linux-acpi, Jonathan Corbet, Greg Kroah-Hartman,
	Daniel Kiper, Dan Williams, Tang Chen, David Vrabel,
	David Rientjes, Andrew Morton, Naoya Horiguchi, Xishi Qiu,
	Mel Gorman, K. Y. Srinivasan, Igor Mammedov, Kay Sievers,
	Konrad Rzeszutek Wilk, Boris Ostrovsky

Currently, all newly added memory blocks remain in 'offline' state unless
someone onlines them, some linux distributions carry special udev rules
like:

SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online"

to make this happen automatically. This is not a great solution for virtual
machines where memory hotplug is being used to address high memory pressure
situations as such onlining is slow and a userspace process doing this
(udev) has a chance of being killed by the OOM killer as it will probably
require to allocate some memory.

Introduce default policy for the newly added memory blocks in
/sys/devices/system/memory/auto_online_blocks file with two possible
values: "offline" which preserves the current behavior and "online" which
causes all newly added memory blocks to go online as soon as they're added.
The default is "offline".

Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Daniel Kiper <daniel.kiper@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Kay Sievers <kay@vrfy.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
- Changes since 'v2':
  - Remove config option, revert to 'offline' by default [Andrew Morton,
    David Rientjes]
  - Rename 'hotplug_autoonline' to 'auto_online_blocks' [David Rientjes]

- Changes since 'v1':
  Add 'online' parameter to add_memory_resource() as it is being used by
  xen ballon driver and it adds "empty" memory pages [David Vrabel].
  (I don't completely understand what prevents manual onlining in this
   case as we still have all newly added blocks in sysfs ... this is the
   discussion point.)

- Changes since 'RFC':
  It seems nobody is strongly opposed to the idea, thus non-RFC.
  Change memhp_autoonline to bool, we support only MMOP_ONLINE_KEEP
  and MMOP_OFFLINE for the auto-onlining policy, eliminate 'unknown'
  from show_memhp_autoonline(). [Daniel Kiper]
  Put everything under CONFIG_MEMORY_HOTPLUG_AUTOONLINE, enable the
  feature by default (when the config option is selected) and add
  kernel parameter (nomemhp_autoonline) to disable the functionality
  upon boot when needed.

- RFC:
  I was able to find previous attempts to fix the issue, e.g.:
  http://marc.info/?l=linux-kernel&m=137425951924598&w=2
  http://marc.info/?l=linux-acpi&m=127186488905382
  but I'm not completely sure why it didn't work out and the solution
  I suggest is not 'smart enough', thus 'RFC'.
---
 Documentation/memory-hotplug.txt | 19 +++++++++++++++----
 drivers/base/memory.c            | 32 ++++++++++++++++++++++++++++++++
 drivers/xen/balloon.c            |  2 +-
 include/linux/memory_hotplug.h   |  4 +++-
 mm/memory_hotplug.c              | 12 ++++++++++--
 5 files changed, 61 insertions(+), 8 deletions(-)

diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
index ce2cfcf..ceaf40c 100644
--- a/Documentation/memory-hotplug.txt
+++ b/Documentation/memory-hotplug.txt
@@ -254,12 +254,23 @@ If the memory block is online, you'll read "online".
 If the memory block is offline, you'll read "offline".
 
 
-5.2. How to online memory
+5.2. Memory onlining
 ------------
-Even if the memory is hot-added, it is not at ready-to-use state.
-For using newly added memory, you have to "online" the memory block.
+When the memory is hot-added, the kernel decides whether or not to "online"
+it according to the policy which can be read from "auto_online_blocks" file:
 
-For onlining, you have to write "online" to the memory block's state file as:
+% cat /sys/devices/system/memory/auto_online_blocks
+
+The default is "offline" which means the newly added memory is not in a
+ready-to-use state and you have to "online" the newly added memory blocks
+manually. Automatic onlining can be requested by writing "online" to
+"auto_online_blocks" file:
+
+% echo online > /sys/devices/system/memory/auto_online_blocks
+
+If the automatic onlining wasn't requested or some memory block was offlined
+it is possible to change the individual block's state by writing to the "state"
+file:
 
 % echo online > /sys/devices/system/memory/memoryXXX/state
 
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 25425d3..44a618d 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -439,6 +439,37 @@ print_block_size(struct device *dev, struct device_attribute *attr,
 static DEVICE_ATTR(block_size_bytes, 0444, print_block_size, NULL);
 
 /*
+ * Memory auto online policy.
+ */
+
+static ssize_t
+show_auto_online_blocks(struct device *dev, struct device_attribute *attr,
+			char *buf)
+{
+	if (memhp_auto_online)
+		return sprintf(buf, "online\n");
+	else
+		return sprintf(buf, "offline\n");
+}
+
+static ssize_t
+store_auto_online_blocks(struct device *dev, struct device_attribute *attr,
+			 const char *buf, size_t count)
+{
+	if (sysfs_streq(buf, "online"))
+		memhp_auto_online = true;
+	else if (sysfs_streq(buf, "offline"))
+		memhp_auto_online = false;
+	else
+		return -EINVAL;
+
+	return count;
+}
+
+static DEVICE_ATTR(auto_online_blocks, 0644, show_auto_online_blocks,
+		   store_auto_online_blocks);
+
+/*
  * Some architectures will have custom drivers to do this, and
  * will not need to do it from userspace.  The fake hot-add code
  * as well as ppc64 will do all of their discovery in userspace
@@ -737,6 +768,7 @@ static struct attribute *memory_root_attrs[] = {
 #endif
 
 	&dev_attr_block_size_bytes.attr,
+	&dev_attr_auto_online_blocks.attr,
 	NULL
 };
 
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 12eab50..890c3b5 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -338,7 +338,7 @@ static enum bp_state reserve_additional_memory(void)
 	}
 #endif
 
-	rc = add_memory_resource(nid, resource);
+	rc = add_memory_resource(nid, resource, false);
 	if (rc) {
 		pr_warn("Cannot add additional memory (%i)\n", rc);
 		goto err;
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 2ea574f..4b7949a 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -99,6 +99,8 @@ extern void __online_page_free(struct page *page);
 
 extern int try_online_node(int nid);
 
+extern bool memhp_auto_online;
+
 #ifdef CONFIG_MEMORY_HOTREMOVE
 extern bool is_pageblock_removable_nolock(struct page *page);
 extern int arch_remove_memory(u64 start, u64 size);
@@ -267,7 +269,7 @@ static inline void remove_memory(int nid, u64 start, u64 size) {}
 extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
 		void *arg, int (*func)(struct memory_block *, void *));
 extern int add_memory(int nid, u64 start, u64 size);
-extern int add_memory_resource(int nid, struct resource *resource);
+extern int add_memory_resource(int nid, struct resource *resource, bool online);
 extern int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
 		bool for_device);
 extern int arch_add_memory(int nid, u64 start, u64 size, bool for_device);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a042a9d..0ecf860 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -76,6 +76,9 @@ static struct {
 #define memhp_lock_acquire()      lock_map_acquire(&mem_hotplug.dep_map)
 #define memhp_lock_release()      lock_map_release(&mem_hotplug.dep_map)
 
+bool memhp_auto_online;
+EXPORT_SYMBOL_GPL(memhp_auto_online);
+
 void get_online_mems(void)
 {
 	might_sleep();
@@ -1232,7 +1235,7 @@ int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
 }
 
 /* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */
-int __ref add_memory_resource(int nid, struct resource *res)
+int __ref add_memory_resource(int nid, struct resource *res, bool online)
 {
 	u64 start, size;
 	pg_data_t *pgdat = NULL;
@@ -1292,6 +1295,11 @@ int __ref add_memory_resource(int nid, struct resource *res)
 	/* create new memmap entry */
 	firmware_map_add_hotplug(start, start + size, "System RAM");
 
+	/* online pages if requested */
+	if (online)
+		online_pages(start >> PAGE_SHIFT, size >> PAGE_SHIFT,
+			     MMOP_ONLINE_KEEP);
+
 	goto out;
 
 error:
@@ -1315,7 +1323,7 @@ int __ref add_memory(int nid, u64 start, u64 size)
 	if (!res)
 		return -EEXIST;
 
-	ret = add_memory_resource(nid, res);
+	ret = add_memory_resource(nid, res, memhp_auto_online);
 	if (ret < 0)
 		release_memory_resource(res);
 	return ret;
-- 
2.4.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3] memory-hotplug: add automatic onlining policy for the newly added memory
@ 2016-01-07 17:23 ` Vitaly Kuznetsov
  0 siblings, 0 replies; 23+ messages in thread
From: Vitaly Kuznetsov @ 2016-01-07 17:23 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, linux-acpi, Jonathan Corbet, Greg Kroah-Hartman,
	Daniel Kiper, Dan Williams, Tang Chen, David Vrabel,
	David Rientjes, Andrew Morton, Naoya Horiguchi, Xishi Qiu,
	Mel Gorman, K. Y. Srinivasan, Igor Mammedov, Kay Sievers,
	Konrad Rzeszutek Wilk, Boris Ostrovsky

Currently, all newly added memory blocks remain in 'offline' state unless
someone onlines them, some linux distributions carry special udev rules
like:

SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online"

to make this happen automatically. This is not a great solution for virtual
machines where memory hotplug is being used to address high memory pressure
situations as such onlining is slow and a userspace process doing this
(udev) has a chance of being killed by the OOM killer as it will probably
require to allocate some memory.

Introduce default policy for the newly added memory blocks in
/sys/devices/system/memory/auto_online_blocks file with two possible
values: "offline" which preserves the current behavior and "online" which
causes all newly added memory blocks to go online as soon as they're added.
The default is "offline".

Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Daniel Kiper <daniel.kiper@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Kay Sievers <kay@vrfy.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
- Changes since 'v2':
  - Remove config option, revert to 'offline' by default [Andrew Morton,
    David Rientjes]
  - Rename 'hotplug_autoonline' to 'auto_online_blocks' [David Rientjes]

- Changes since 'v1':
  Add 'online' parameter to add_memory_resource() as it is being used by
  xen ballon driver and it adds "empty" memory pages [David Vrabel].
  (I don't completely understand what prevents manual onlining in this
   case as we still have all newly added blocks in sysfs ... this is the
   discussion point.)

- Changes since 'RFC':
  It seems nobody is strongly opposed to the idea, thus non-RFC.
  Change memhp_autoonline to bool, we support only MMOP_ONLINE_KEEP
  and MMOP_OFFLINE for the auto-onlining policy, eliminate 'unknown'
  from show_memhp_autoonline(). [Daniel Kiper]
  Put everything under CONFIG_MEMORY_HOTPLUG_AUTOONLINE, enable the
  feature by default (when the config option is selected) and add
  kernel parameter (nomemhp_autoonline) to disable the functionality
  upon boot when needed.

- RFC:
  I was able to find previous attempts to fix the issue, e.g.:
  http://marc.info/?l=linux-kernel&m=137425951924598&w=2
  http://marc.info/?l=linux-acpi&m=127186488905382
  but I'm not completely sure why it didn't work out and the solution
  I suggest is not 'smart enough', thus 'RFC'.
---
 Documentation/memory-hotplug.txt | 19 +++++++++++++++----
 drivers/base/memory.c            | 32 ++++++++++++++++++++++++++++++++
 drivers/xen/balloon.c            |  2 +-
 include/linux/memory_hotplug.h   |  4 +++-
 mm/memory_hotplug.c              | 12 ++++++++++--
 5 files changed, 61 insertions(+), 8 deletions(-)

diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
index ce2cfcf..ceaf40c 100644
--- a/Documentation/memory-hotplug.txt
+++ b/Documentation/memory-hotplug.txt
@@ -254,12 +254,23 @@ If the memory block is online, you'll read "online".
 If the memory block is offline, you'll read "offline".
 
 
-5.2. How to online memory
+5.2. Memory onlining
 ------------
-Even if the memory is hot-added, it is not at ready-to-use state.
-For using newly added memory, you have to "online" the memory block.
+When the memory is hot-added, the kernel decides whether or not to "online"
+it according to the policy which can be read from "auto_online_blocks" file:
 
-For onlining, you have to write "online" to the memory block's state file as:
+% cat /sys/devices/system/memory/auto_online_blocks
+
+The default is "offline" which means the newly added memory is not in a
+ready-to-use state and you have to "online" the newly added memory blocks
+manually. Automatic onlining can be requested by writing "online" to
+"auto_online_blocks" file:
+
+% echo online > /sys/devices/system/memory/auto_online_blocks
+
+If the automatic onlining wasn't requested or some memory block was offlined
+it is possible to change the individual block's state by writing to the "state"
+file:
 
 % echo online > /sys/devices/system/memory/memoryXXX/state
 
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 25425d3..44a618d 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -439,6 +439,37 @@ print_block_size(struct device *dev, struct device_attribute *attr,
 static DEVICE_ATTR(block_size_bytes, 0444, print_block_size, NULL);
 
 /*
+ * Memory auto online policy.
+ */
+
+static ssize_t
+show_auto_online_blocks(struct device *dev, struct device_attribute *attr,
+			char *buf)
+{
+	if (memhp_auto_online)
+		return sprintf(buf, "online\n");
+	else
+		return sprintf(buf, "offline\n");
+}
+
+static ssize_t
+store_auto_online_blocks(struct device *dev, struct device_attribute *attr,
+			 const char *buf, size_t count)
+{
+	if (sysfs_streq(buf, "online"))
+		memhp_auto_online = true;
+	else if (sysfs_streq(buf, "offline"))
+		memhp_auto_online = false;
+	else
+		return -EINVAL;
+
+	return count;
+}
+
+static DEVICE_ATTR(auto_online_blocks, 0644, show_auto_online_blocks,
+		   store_auto_online_blocks);
+
+/*
  * Some architectures will have custom drivers to do this, and
  * will not need to do it from userspace.  The fake hot-add code
  * as well as ppc64 will do all of their discovery in userspace
@@ -737,6 +768,7 @@ static struct attribute *memory_root_attrs[] = {
 #endif
 
 	&dev_attr_block_size_bytes.attr,
+	&dev_attr_auto_online_blocks.attr,
 	NULL
 };
 
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 12eab50..890c3b5 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -338,7 +338,7 @@ static enum bp_state reserve_additional_memory(void)
 	}
 #endif
 
-	rc = add_memory_resource(nid, resource);
+	rc = add_memory_resource(nid, resource, false);
 	if (rc) {
 		pr_warn("Cannot add additional memory (%i)\n", rc);
 		goto err;
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 2ea574f..4b7949a 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -99,6 +99,8 @@ extern void __online_page_free(struct page *page);
 
 extern int try_online_node(int nid);
 
+extern bool memhp_auto_online;
+
 #ifdef CONFIG_MEMORY_HOTREMOVE
 extern bool is_pageblock_removable_nolock(struct page *page);
 extern int arch_remove_memory(u64 start, u64 size);
@@ -267,7 +269,7 @@ static inline void remove_memory(int nid, u64 start, u64 size) {}
 extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
 		void *arg, int (*func)(struct memory_block *, void *));
 extern int add_memory(int nid, u64 start, u64 size);
-extern int add_memory_resource(int nid, struct resource *resource);
+extern int add_memory_resource(int nid, struct resource *resource, bool online);
 extern int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
 		bool for_device);
 extern int arch_add_memory(int nid, u64 start, u64 size, bool for_device);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a042a9d..0ecf860 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -76,6 +76,9 @@ static struct {
 #define memhp_lock_acquire()      lock_map_acquire(&mem_hotplug.dep_map)
 #define memhp_lock_release()      lock_map_release(&mem_hotplug.dep_map)
 
+bool memhp_auto_online;
+EXPORT_SYMBOL_GPL(memhp_auto_online);
+
 void get_online_mems(void)
 {
 	might_sleep();
@@ -1232,7 +1235,7 @@ int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
 }
 
 /* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */
-int __ref add_memory_resource(int nid, struct resource *res)
+int __ref add_memory_resource(int nid, struct resource *res, bool online)
 {
 	u64 start, size;
 	pg_data_t *pgdat = NULL;
@@ -1292,6 +1295,11 @@ int __ref add_memory_resource(int nid, struct resource *res)
 	/* create new memmap entry */
 	firmware_map_add_hotplug(start, start + size, "System RAM");
 
+	/* online pages if requested */
+	if (online)
+		online_pages(start >> PAGE_SHIFT, size >> PAGE_SHIFT,
+			     MMOP_ONLINE_KEEP);
+
 	goto out;
 
 error:
@@ -1315,7 +1323,7 @@ int __ref add_memory(int nid, u64 start, u64 size)
 	if (!res)
 		return -EEXIST;
 
-	ret = add_memory_resource(nid, res);
+	ret = add_memory_resource(nid, res, memhp_auto_online);
 	if (ret < 0)
 		release_memory_resource(res);
 	return ret;
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] memory-hotplug: add automatic onlining policy for the newly added memory
  2016-01-07 17:23 ` Vitaly Kuznetsov
@ 2016-01-08 14:01   ` Daniel Kiper
  -1 siblings, 0 replies; 23+ messages in thread
From: Daniel Kiper @ 2016-01-08 14:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: linux-mm, linux-kernel, linux-acpi, Jonathan Corbet,
	Greg Kroah-Hartman, Dan Williams, Tang Chen, David Vrabel,
	David Rientjes, Andrew Morton, Naoya Horiguchi, Xishi Qiu,
	Mel Gorman, K. Y. Srinivasan, Igor Mammedov, Kay Sievers,
	Konrad Rzeszutek Wilk, Boris Ostrovsky

Hey Vitaly,

Sorry for late reply but I have been on holiday and I was not able
to follow this thread.

I saw that Andrew put this patch in -mm queue but I have still some
concerns. I hope that they could be addressed in one way or another.
Please look below for more details.

On Thu, Jan 07, 2016 at 06:23:41PM +0100, Vitaly Kuznetsov wrote:
> Currently, all newly added memory blocks remain in 'offline' state unless
> someone onlines them, some linux distributions carry special udev rules
> like:
>
> SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online"
>
> to make this happen automatically. This is not a great solution for virtual
> machines where memory hotplug is being used to address high memory pressure
> situations as such onlining is slow and a userspace process doing this
> (udev) has a chance of being killed by the OOM killer as it will probably
> require to allocate some memory.
>
> Introduce default policy for the newly added memory blocks in
> /sys/devices/system/memory/auto_online_blocks file with two possible
> values: "offline" which preserves the current behavior and "online" which
> causes all newly added memory blocks to go online as soon as they're added.
> The default is "offline".
>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Daniel Kiper <daniel.kiper@oracle.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Tang Chen <tangchen@cn.fujitsu.com>
> Cc: David Vrabel <david.vrabel@citrix.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> Cc: Xishi Qiu <qiuxishi@huawei.com>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: "K. Y. Srinivasan" <kys@microsoft.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Kay Sievers <kay@vrfy.org>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
> - Changes since 'v2':
>   - Remove config option, revert to 'offline' by default [Andrew Morton,
>     David Rientjes]
>   - Rename 'hotplug_autoonline' to 'auto_online_blocks' [David Rientjes]
>
> - Changes since 'v1':
>   Add 'online' parameter to add_memory_resource() as it is being used by
>   xen ballon driver and it adds "empty" memory pages [David Vrabel].
>   (I don't completely understand what prevents manual onlining in this
>    case as we still have all newly added blocks in sysfs ... this is the
>    discussion point.)

Sometimes Xen guest needs to share some memory with other guests. This
memory must live in an address space and must be described by struct page.
Original mechanism gets such struct page by ballooning down guest memory a bit.
However, if memory hotplug is available then we do not need to steal real
memory from guest to share memory. We can get address space and struct page
from memory hotplug machinery. However, there is one requirement: every
page shared between guests cannot be allocated (onlined) in normal way.
All of them must be taken from hypervisor and later described by struct page.
Otherwise, if we online whole section/block automatically then we must use
ballooning mechanism described earlier.

David, please correct me if I am wrong.

I hope that helps.

> - Changes since 'RFC':
>   It seems nobody is strongly opposed to the idea, thus non-RFC.
>   Change memhp_autoonline to bool, we support only MMOP_ONLINE_KEEP
>   and MMOP_OFFLINE for the auto-onlining policy, eliminate 'unknown'
>   from show_memhp_autoonline(). [Daniel Kiper]
>   Put everything under CONFIG_MEMORY_HOTPLUG_AUTOONLINE, enable the
>   feature by default (when the config option is selected) and add
>   kernel parameter (nomemhp_autoonline) to disable the functionality
>   upon boot when needed.
>
> - RFC:
>   I was able to find previous attempts to fix the issue, e.g.:
>   http://marc.info/?l=linux-kernel&m=137425951924598&w=2
>   http://marc.info/?l=linux-acpi&m=127186488905382
>   but I'm not completely sure why it didn't work out and the solution
>   I suggest is not 'smart enough', thus 'RFC'.
> ---
>  Documentation/memory-hotplug.txt | 19 +++++++++++++++----
>  drivers/base/memory.c            | 32 ++++++++++++++++++++++++++++++++
>  drivers/xen/balloon.c            |  2 +-
>  include/linux/memory_hotplug.h   |  4 +++-
>  mm/memory_hotplug.c              | 12 ++++++++++--
>  5 files changed, 61 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
> index ce2cfcf..ceaf40c 100644
> --- a/Documentation/memory-hotplug.txt
> +++ b/Documentation/memory-hotplug.txt
> @@ -254,12 +254,23 @@ If the memory block is online, you'll read "online".
>  If the memory block is offline, you'll read "offline".
>
>
> -5.2. How to online memory
> +5.2. Memory onlining
>  ------------
> -Even if the memory is hot-added, it is not at ready-to-use state.
> -For using newly added memory, you have to "online" the memory block.
> +When the memory is hot-added, the kernel decides whether or not to "online"
> +it according to the policy which can be read from "auto_online_blocks" file:
>
> -For onlining, you have to write "online" to the memory block's state file as:
> +% cat /sys/devices/system/memory/auto_online_blocks
> +
> +The default is "offline" which means the newly added memory is not in a
> +ready-to-use state and you have to "online" the newly added memory blocks
> +manually. Automatic onlining can be requested by writing "online" to
> +"auto_online_blocks" file:
> +
> +% echo online > /sys/devices/system/memory/auto_online_blocks
> +
> +If the automatic onlining wasn't requested or some memory block was offlined
> +it is possible to change the individual block's state by writing to the "state"
> +file:
>
>  % echo online > /sys/devices/system/memory/memoryXXX/state

Please say clearly that offlined blocks are not onlined automatically
when /sys/devices/system/memory/auto_online_blocks is set to online.

> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index 25425d3..44a618d 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -439,6 +439,37 @@ print_block_size(struct device *dev, struct device_attribute *attr,
>  static DEVICE_ATTR(block_size_bytes, 0444, print_block_size, NULL);
>
>  /*
> + * Memory auto online policy.
> + */
> +
> +static ssize_t
> +show_auto_online_blocks(struct device *dev, struct device_attribute *attr,
> +			char *buf)
> +{
> +	if (memhp_auto_online)
> +		return sprintf(buf, "online\n");
> +	else
> +		return sprintf(buf, "offline\n");
> +}
> +
> +static ssize_t
> +store_auto_online_blocks(struct device *dev, struct device_attribute *attr,
> +			 const char *buf, size_t count)
> +{
> +	if (sysfs_streq(buf, "online"))
> +		memhp_auto_online = true;
> +	else if (sysfs_streq(buf, "offline"))
> +		memhp_auto_online = false;
> +	else
> +		return -EINVAL;
> +
> +	return count;
> +}
> +
> +static DEVICE_ATTR(auto_online_blocks, 0644, show_auto_online_blocks,
> +		   store_auto_online_blocks);
> +
> +/*
>   * Some architectures will have custom drivers to do this, and
>   * will not need to do it from userspace.  The fake hot-add code
>   * as well as ppc64 will do all of their discovery in userspace
> @@ -737,6 +768,7 @@ static struct attribute *memory_root_attrs[] = {
>  #endif
>
>  	&dev_attr_block_size_bytes.attr,
> +	&dev_attr_auto_online_blocks.attr,
>  	NULL
>  };
>
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index 12eab50..890c3b5 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -338,7 +338,7 @@ static enum bp_state reserve_additional_memory(void)
>  	}
>  #endif
>
> -	rc = add_memory_resource(nid, resource);
> +	rc = add_memory_resource(nid, resource, false);

This is partial solution and does not allow us to use new feature in Xen.
Could you add separate patch which fixes this issue?

>  	if (rc) {
>  		pr_warn("Cannot add additional memory (%i)\n", rc);
>  		goto err;
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 2ea574f..4b7949a 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -99,6 +99,8 @@ extern void __online_page_free(struct page *page);
>
>  extern int try_online_node(int nid);
>
> +extern bool memhp_auto_online;
> +
>  #ifdef CONFIG_MEMORY_HOTREMOVE
>  extern bool is_pageblock_removable_nolock(struct page *page);
>  extern int arch_remove_memory(u64 start, u64 size);
> @@ -267,7 +269,7 @@ static inline void remove_memory(int nid, u64 start, u64 size) {}
>  extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
>  		void *arg, int (*func)(struct memory_block *, void *));
>  extern int add_memory(int nid, u64 start, u64 size);
> -extern int add_memory_resource(int nid, struct resource *resource);
> +extern int add_memory_resource(int nid, struct resource *resource, bool online);
>  extern int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
>  		bool for_device);
>  extern int arch_add_memory(int nid, u64 start, u64 size, bool for_device);
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index a042a9d..0ecf860 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -76,6 +76,9 @@ static struct {
>  #define memhp_lock_acquire()      lock_map_acquire(&mem_hotplug.dep_map)
>  #define memhp_lock_release()      lock_map_release(&mem_hotplug.dep_map)
>
> +bool memhp_auto_online;
> +EXPORT_SYMBOL_GPL(memhp_auto_online);
> +
>  void get_online_mems(void)
>  {
>  	might_sleep();
> @@ -1232,7 +1235,7 @@ int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
>  }
>
>  /* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */
> -int __ref add_memory_resource(int nid, struct resource *res)
> +int __ref add_memory_resource(int nid, struct resource *res, bool online)
>  {
>  	u64 start, size;
>  	pg_data_t *pgdat = NULL;
> @@ -1292,6 +1295,11 @@ int __ref add_memory_resource(int nid, struct resource *res)
>  	/* create new memmap entry */
>  	firmware_map_add_hotplug(start, start + size, "System RAM");
>
> +	/* online pages if requested */
> +	if (online)
> +		online_pages(start >> PAGE_SHIFT, size >> PAGE_SHIFT,
> +			     MMOP_ONLINE_KEEP);
> +

This way we go in deadlock if auto online feature is enabled in Xen (this was
pointed out by David Vrabel). And we want to have it working out of the box.
So, I think that we should find proper solution. I suppose that we can schedule
a task here which auto online attached blocks. Hmmm... Not nice but should work.
Or maybe you have better idea how to fix this issue.

Daniel

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] memory-hotplug: add automatic onlining policy for the newly added memory
@ 2016-01-08 14:01   ` Daniel Kiper
  0 siblings, 0 replies; 23+ messages in thread
From: Daniel Kiper @ 2016-01-08 14:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: linux-mm, linux-kernel, linux-acpi, Jonathan Corbet,
	Greg Kroah-Hartman, Dan Williams, Tang Chen, David Vrabel,
	David Rientjes, Andrew Morton, Naoya Horiguchi, Xishi Qiu,
	Mel Gorman, K. Y. Srinivasan, Igor Mammedov, Kay Sievers,
	Konrad Rzeszutek Wilk, Boris Ostrovsky

Hey Vitaly,

Sorry for late reply but I have been on holiday and I was not able
to follow this thread.

I saw that Andrew put this patch in -mm queue but I have still some
concerns. I hope that they could be addressed in one way or another.
Please look below for more details.

On Thu, Jan 07, 2016 at 06:23:41PM +0100, Vitaly Kuznetsov wrote:
> Currently, all newly added memory blocks remain in 'offline' state unless
> someone onlines them, some linux distributions carry special udev rules
> like:
>
> SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online"
>
> to make this happen automatically. This is not a great solution for virtual
> machines where memory hotplug is being used to address high memory pressure
> situations as such onlining is slow and a userspace process doing this
> (udev) has a chance of being killed by the OOM killer as it will probably
> require to allocate some memory.
>
> Introduce default policy for the newly added memory blocks in
> /sys/devices/system/memory/auto_online_blocks file with two possible
> values: "offline" which preserves the current behavior and "online" which
> causes all newly added memory blocks to go online as soon as they're added.
> The default is "offline".
>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Daniel Kiper <daniel.kiper@oracle.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Tang Chen <tangchen@cn.fujitsu.com>
> Cc: David Vrabel <david.vrabel@citrix.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> Cc: Xishi Qiu <qiuxishi@huawei.com>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: "K. Y. Srinivasan" <kys@microsoft.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Kay Sievers <kay@vrfy.org>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
> - Changes since 'v2':
>   - Remove config option, revert to 'offline' by default [Andrew Morton,
>     David Rientjes]
>   - Rename 'hotplug_autoonline' to 'auto_online_blocks' [David Rientjes]
>
> - Changes since 'v1':
>   Add 'online' parameter to add_memory_resource() as it is being used by
>   xen ballon driver and it adds "empty" memory pages [David Vrabel].
>   (I don't completely understand what prevents manual onlining in this
>    case as we still have all newly added blocks in sysfs ... this is the
>    discussion point.)

Sometimes Xen guest needs to share some memory with other guests. This
memory must live in an address space and must be described by struct page.
Original mechanism gets such struct page by ballooning down guest memory a bit.
However, if memory hotplug is available then we do not need to steal real
memory from guest to share memory. We can get address space and struct page
from memory hotplug machinery. However, there is one requirement: every
page shared between guests cannot be allocated (onlined) in normal way.
All of them must be taken from hypervisor and later described by struct page.
Otherwise, if we online whole section/block automatically then we must use
ballooning mechanism described earlier.

David, please correct me if I am wrong.

I hope that helps.

> - Changes since 'RFC':
>   It seems nobody is strongly opposed to the idea, thus non-RFC.
>   Change memhp_autoonline to bool, we support only MMOP_ONLINE_KEEP
>   and MMOP_OFFLINE for the auto-onlining policy, eliminate 'unknown'
>   from show_memhp_autoonline(). [Daniel Kiper]
>   Put everything under CONFIG_MEMORY_HOTPLUG_AUTOONLINE, enable the
>   feature by default (when the config option is selected) and add
>   kernel parameter (nomemhp_autoonline) to disable the functionality
>   upon boot when needed.
>
> - RFC:
>   I was able to find previous attempts to fix the issue, e.g.:
>   http://marc.info/?l=linux-kernel&m=137425951924598&w=2
>   http://marc.info/?l=linux-acpi&m=127186488905382
>   but I'm not completely sure why it didn't work out and the solution
>   I suggest is not 'smart enough', thus 'RFC'.
> ---
>  Documentation/memory-hotplug.txt | 19 +++++++++++++++----
>  drivers/base/memory.c            | 32 ++++++++++++++++++++++++++++++++
>  drivers/xen/balloon.c            |  2 +-
>  include/linux/memory_hotplug.h   |  4 +++-
>  mm/memory_hotplug.c              | 12 ++++++++++--
>  5 files changed, 61 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
> index ce2cfcf..ceaf40c 100644
> --- a/Documentation/memory-hotplug.txt
> +++ b/Documentation/memory-hotplug.txt
> @@ -254,12 +254,23 @@ If the memory block is online, you'll read "online".
>  If the memory block is offline, you'll read "offline".
>
>
> -5.2. How to online memory
> +5.2. Memory onlining
>  ------------
> -Even if the memory is hot-added, it is not at ready-to-use state.
> -For using newly added memory, you have to "online" the memory block.
> +When the memory is hot-added, the kernel decides whether or not to "online"
> +it according to the policy which can be read from "auto_online_blocks" file:
>
> -For onlining, you have to write "online" to the memory block's state file as:
> +% cat /sys/devices/system/memory/auto_online_blocks
> +
> +The default is "offline" which means the newly added memory is not in a
> +ready-to-use state and you have to "online" the newly added memory blocks
> +manually. Automatic onlining can be requested by writing "online" to
> +"auto_online_blocks" file:
> +
> +% echo online > /sys/devices/system/memory/auto_online_blocks
> +
> +If the automatic onlining wasn't requested or some memory block was offlined
> +it is possible to change the individual block's state by writing to the "state"
> +file:
>
>  % echo online > /sys/devices/system/memory/memoryXXX/state

Please say clearly that offlined blocks are not onlined automatically
when /sys/devices/system/memory/auto_online_blocks is set to online.

> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index 25425d3..44a618d 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -439,6 +439,37 @@ print_block_size(struct device *dev, struct device_attribute *attr,
>  static DEVICE_ATTR(block_size_bytes, 0444, print_block_size, NULL);
>
>  /*
> + * Memory auto online policy.
> + */
> +
> +static ssize_t
> +show_auto_online_blocks(struct device *dev, struct device_attribute *attr,
> +			char *buf)
> +{
> +	if (memhp_auto_online)
> +		return sprintf(buf, "online\n");
> +	else
> +		return sprintf(buf, "offline\n");
> +}
> +
> +static ssize_t
> +store_auto_online_blocks(struct device *dev, struct device_attribute *attr,
> +			 const char *buf, size_t count)
> +{
> +	if (sysfs_streq(buf, "online"))
> +		memhp_auto_online = true;
> +	else if (sysfs_streq(buf, "offline"))
> +		memhp_auto_online = false;
> +	else
> +		return -EINVAL;
> +
> +	return count;
> +}
> +
> +static DEVICE_ATTR(auto_online_blocks, 0644, show_auto_online_blocks,
> +		   store_auto_online_blocks);
> +
> +/*
>   * Some architectures will have custom drivers to do this, and
>   * will not need to do it from userspace.  The fake hot-add code
>   * as well as ppc64 will do all of their discovery in userspace
> @@ -737,6 +768,7 @@ static struct attribute *memory_root_attrs[] = {
>  #endif
>
>  	&dev_attr_block_size_bytes.attr,
> +	&dev_attr_auto_online_blocks.attr,
>  	NULL
>  };
>
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index 12eab50..890c3b5 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -338,7 +338,7 @@ static enum bp_state reserve_additional_memory(void)
>  	}
>  #endif
>
> -	rc = add_memory_resource(nid, resource);
> +	rc = add_memory_resource(nid, resource, false);

This is partial solution and does not allow us to use new feature in Xen.
Could you add separate patch which fixes this issue?

>  	if (rc) {
>  		pr_warn("Cannot add additional memory (%i)\n", rc);
>  		goto err;
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 2ea574f..4b7949a 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -99,6 +99,8 @@ extern void __online_page_free(struct page *page);
>
>  extern int try_online_node(int nid);
>
> +extern bool memhp_auto_online;
> +
>  #ifdef CONFIG_MEMORY_HOTREMOVE
>  extern bool is_pageblock_removable_nolock(struct page *page);
>  extern int arch_remove_memory(u64 start, u64 size);
> @@ -267,7 +269,7 @@ static inline void remove_memory(int nid, u64 start, u64 size) {}
>  extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
>  		void *arg, int (*func)(struct memory_block *, void *));
>  extern int add_memory(int nid, u64 start, u64 size);
> -extern int add_memory_resource(int nid, struct resource *resource);
> +extern int add_memory_resource(int nid, struct resource *resource, bool online);
>  extern int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
>  		bool for_device);
>  extern int arch_add_memory(int nid, u64 start, u64 size, bool for_device);
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index a042a9d..0ecf860 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -76,6 +76,9 @@ static struct {
>  #define memhp_lock_acquire()      lock_map_acquire(&mem_hotplug.dep_map)
>  #define memhp_lock_release()      lock_map_release(&mem_hotplug.dep_map)
>
> +bool memhp_auto_online;
> +EXPORT_SYMBOL_GPL(memhp_auto_online);
> +
>  void get_online_mems(void)
>  {
>  	might_sleep();
> @@ -1232,7 +1235,7 @@ int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
>  }
>
>  /* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */
> -int __ref add_memory_resource(int nid, struct resource *res)
> +int __ref add_memory_resource(int nid, struct resource *res, bool online)
>  {
>  	u64 start, size;
>  	pg_data_t *pgdat = NULL;
> @@ -1292,6 +1295,11 @@ int __ref add_memory_resource(int nid, struct resource *res)
>  	/* create new memmap entry */
>  	firmware_map_add_hotplug(start, start + size, "System RAM");
>
> +	/* online pages if requested */
> +	if (online)
> +		online_pages(start >> PAGE_SHIFT, size >> PAGE_SHIFT,
> +			     MMOP_ONLINE_KEEP);
> +

This way we go in deadlock if auto online feature is enabled in Xen (this was
pointed out by David Vrabel). And we want to have it working out of the box.
So, I think that we should find proper solution. I suppose that we can schedule
a task here which auto online attached blocks. Hmmm... Not nice but should work.
Or maybe you have better idea how to fix this issue.

Daniel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] memory-hotplug: add automatic onlining policy for the newly added memory
  2016-01-08 14:01   ` Daniel Kiper
@ 2016-01-08 16:55     ` Vitaly Kuznetsov
  -1 siblings, 0 replies; 23+ messages in thread
From: Vitaly Kuznetsov @ 2016-01-08 16:55 UTC (permalink / raw)
  To: Daniel Kiper
  Cc: linux-mm, linux-kernel, linux-acpi, Jonathan Corbet,
	Greg Kroah-Hartman, Dan Williams, Tang Chen, David Vrabel,
	David Rientjes, Andrew Morton, Naoya Horiguchi, Xishi Qiu,
	Mel Gorman, K. Y. Srinivasan, Igor Mammedov, Kay Sievers,
	Konrad Rzeszutek Wilk, Boris Ostrovsky

Daniel Kiper <daniel.kiper@oracle.com> writes:

> Hey Vitaly,
>
> Sorry for late reply but I have been on holiday and I was not able
> to follow this thread.
>
> I saw that Andrew put this patch in -mm queue but I have still some
> concerns. I hope that they could be addressed in one way or another.
> Please look below for more details.
>
> On Thu, Jan 07, 2016 at 06:23:41PM +0100, Vitaly Kuznetsov wrote:
>> Currently, all newly added memory blocks remain in 'offline' state unless
>> someone onlines them, some linux distributions carry special udev rules
>> like:
>>
>> SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online"
>>
>> to make this happen automatically. This is not a great solution for virtual
>> machines where memory hotplug is being used to address high memory pressure
>> situations as such onlining is slow and a userspace process doing this
>> (udev) has a chance of being killed by the OOM killer as it will probably
>> require to allocate some memory.
>>
>> Introduce default policy for the newly added memory blocks in
>> /sys/devices/system/memory/auto_online_blocks file with two possible
>> values: "offline" which preserves the current behavior and "online" which
>> causes all newly added memory blocks to go online as soon as they're added.
>> The default is "offline".
>>
>> Cc: Jonathan Corbet <corbet@lwn.net>
>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> Cc: Daniel Kiper <daniel.kiper@oracle.com>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: Tang Chen <tangchen@cn.fujitsu.com>
>> Cc: David Vrabel <david.vrabel@citrix.com>
>> Cc: David Rientjes <rientjes@google.com>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
>> Cc: Xishi Qiu <qiuxishi@huawei.com>
>> Cc: Mel Gorman <mgorman@techsingularity.net>
>> Cc: "K. Y. Srinivasan" <kys@microsoft.com>
>> Cc: Igor Mammedov <imammedo@redhat.com>
>> Cc: Kay Sievers <kay@vrfy.org>
>> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>> - Changes since 'v2':
>>   - Remove config option, revert to 'offline' by default [Andrew Morton,
>>     David Rientjes]
>>   - Rename 'hotplug_autoonline' to 'auto_online_blocks' [David Rientjes]
>>
>> - Changes since 'v1':
>>   Add 'online' parameter to add_memory_resource() as it is being used by
>>   xen ballon driver and it adds "empty" memory pages [David Vrabel].
>>   (I don't completely understand what prevents manual onlining in this
>>    case as we still have all newly added blocks in sysfs ... this is the
>>    discussion point.)

Hi Daniel!

>
> Sometimes Xen guest needs to share some memory with other guests. This
> memory must live in an address space and must be described by struct page.
> Original mechanism gets such struct page by ballooning down guest memory a bit.
> However, if memory hotplug is available then we do not need to steal real
> memory from guest to share memory. We can get address space and struct page
> from memory hotplug machinery. However, there is one requirement: every
> page shared between guests cannot be allocated (onlined) in normal way.
> All of them must be taken from hypervisor and later described by struct page.
> Otherwise, if we online whole section/block automatically then we must use
> ballooning mechanism described earlier.
>
> David, please correct me if I am wrong.
>
> I hope that helps.
>
>> - Changes since 'RFC':
>>   It seems nobody is strongly opposed to the idea, thus non-RFC.
>>   Change memhp_autoonline to bool, we support only MMOP_ONLINE_KEEP
>>   and MMOP_OFFLINE for the auto-onlining policy, eliminate 'unknown'
>>   from show_memhp_autoonline(). [Daniel Kiper]
>>   Put everything under CONFIG_MEMORY_HOTPLUG_AUTOONLINE, enable the
>>   feature by default (when the config option is selected) and add
>>   kernel parameter (nomemhp_autoonline) to disable the functionality
>>   upon boot when needed.
>>
>> - RFC:
>>   I was able to find previous attempts to fix the issue, e.g.:
>>   http://marc.info/?l=linux-kernel&m=137425951924598&w=2
>>   http://marc.info/?l=linux-acpi&m=127186488905382
>>   but I'm not completely sure why it didn't work out and the solution
>>   I suggest is not 'smart enough', thus 'RFC'.
>> ---
>>  Documentation/memory-hotplug.txt | 19 +++++++++++++++----
>>  drivers/base/memory.c            | 32 ++++++++++++++++++++++++++++++++
>>  drivers/xen/balloon.c            |  2 +-
>>  include/linux/memory_hotplug.h   |  4 +++-
>>  mm/memory_hotplug.c              | 12 ++++++++++--
>>  5 files changed, 61 insertions(+), 8 deletions(-)
>>
>> diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
>> index ce2cfcf..ceaf40c 100644
>> --- a/Documentation/memory-hotplug.txt
>> +++ b/Documentation/memory-hotplug.txt
>> @@ -254,12 +254,23 @@ If the memory block is online, you'll read "online".
>>  If the memory block is offline, you'll read "offline".
>>
>>
>> -5.2. How to online memory
>> +5.2. Memory onlining
>>  ------------
>> -Even if the memory is hot-added, it is not at ready-to-use state.
>> -For using newly added memory, you have to "online" the memory block.
>> +When the memory is hot-added, the kernel decides whether or not to "online"
>> +it according to the policy which can be read from "auto_online_blocks" file:
>>
>> -For onlining, you have to write "online" to the memory block's state file as:
>> +% cat /sys/devices/system/memory/auto_online_blocks
>> +
>> +The default is "offline" which means the newly added memory is not in a
>> +ready-to-use state and you have to "online" the newly added memory blocks
>> +manually. Automatic onlining can be requested by writing "online" to
>> +"auto_online_blocks" file:
>> +
>> +% echo online > /sys/devices/system/memory/auto_online_blocks
>> +
>> +If the automatic onlining wasn't requested or some memory block was offlined
>> +it is possible to change the individual block's state by writing to the "state"
>> +file:
>>
>>  % echo online > /sys/devices/system/memory/memoryXXX/state
>
> Please say clearly that offlined blocks are not onlined automatically
> when /sys/devices/system/memory/auto_online_blocks is set to online.
>

You mean the blocks which were manually offlined won't magically come
back, right? Ok, I'll try.
 
>> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>> index 25425d3..44a618d 100644
>> --- a/drivers/base/memory.c
>> +++ b/drivers/base/memory.c
>> @@ -439,6 +439,37 @@ print_block_size(struct device *dev, struct device_attribute *attr,
>>  static DEVICE_ATTR(block_size_bytes, 0444, print_block_size, NULL);
>>
>>  /*
>> + * Memory auto online policy.
>> + */
>> +
>> +static ssize_t
>> +show_auto_online_blocks(struct device *dev, struct device_attribute *attr,
>> +			char *buf)
>> +{
>> +	if (memhp_auto_online)
>> +		return sprintf(buf, "online\n");
>> +	else
>> +		return sprintf(buf, "offline\n");
>> +}
>> +
>> +static ssize_t
>> +store_auto_online_blocks(struct device *dev, struct device_attribute *attr,
>> +			 const char *buf, size_t count)
>> +{
>> +	if (sysfs_streq(buf, "online"))
>> +		memhp_auto_online = true;
>> +	else if (sysfs_streq(buf, "offline"))
>> +		memhp_auto_online = false;
>> +	else
>> +		return -EINVAL;
>> +
>> +	return count;
>> +}
>> +
>> +static DEVICE_ATTR(auto_online_blocks, 0644, show_auto_online_blocks,
>> +		   store_auto_online_blocks);
>> +
>> +/*
>>   * Some architectures will have custom drivers to do this, and
>>   * will not need to do it from userspace.  The fake hot-add code
>>   * as well as ppc64 will do all of their discovery in userspace
>> @@ -737,6 +768,7 @@ static struct attribute *memory_root_attrs[] = {
>>  #endif
>>
>>  	&dev_attr_block_size_bytes.attr,
>> +	&dev_attr_auto_online_blocks.attr,
>>  	NULL
>>  };
>>
>> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
>> index 12eab50..890c3b5 100644
>> --- a/drivers/xen/balloon.c
>> +++ b/drivers/xen/balloon.c
>> @@ -338,7 +338,7 @@ static enum bp_state reserve_additional_memory(void)
>>  	}
>>  #endif
>>
>> -	rc = add_memory_resource(nid, resource);
>> +	rc = add_memory_resource(nid, resource, false);
>
> This is partial solution and does not allow us to use new feature in Xen.
> Could you add separate patch which fixes this issue?
>

Sure, I'd be glad to make this work for Xen too.

>>  	if (rc) {
>>  		pr_warn("Cannot add additional memory (%i)\n", rc);
>>  		goto err;
>> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
>> index 2ea574f..4b7949a 100644
>> --- a/include/linux/memory_hotplug.h
>> +++ b/include/linux/memory_hotplug.h
>> @@ -99,6 +99,8 @@ extern void __online_page_free(struct page *page);
>>
>>  extern int try_online_node(int nid);
>>
>> +extern bool memhp_auto_online;
>> +
>>  #ifdef CONFIG_MEMORY_HOTREMOVE
>>  extern bool is_pageblock_removable_nolock(struct page *page);
>>  extern int arch_remove_memory(u64 start, u64 size);
>> @@ -267,7 +269,7 @@ static inline void remove_memory(int nid, u64 start, u64 size) {}
>>  extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
>>  		void *arg, int (*func)(struct memory_block *, void *));
>>  extern int add_memory(int nid, u64 start, u64 size);
>> -extern int add_memory_resource(int nid, struct resource *resource);
>> +extern int add_memory_resource(int nid, struct resource *resource, bool online);
>>  extern int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
>>  		bool for_device);
>>  extern int arch_add_memory(int nid, u64 start, u64 size, bool for_device);
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index a042a9d..0ecf860 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -76,6 +76,9 @@ static struct {
>>  #define memhp_lock_acquire()      lock_map_acquire(&mem_hotplug.dep_map)
>>  #define memhp_lock_release()      lock_map_release(&mem_hotplug.dep_map)
>>
>> +bool memhp_auto_online;
>> +EXPORT_SYMBOL_GPL(memhp_auto_online);
>> +
>>  void get_online_mems(void)
>>  {
>>  	might_sleep();
>> @@ -1232,7 +1235,7 @@ int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
>>  }
>>
>>  /* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */
>> -int __ref add_memory_resource(int nid, struct resource *res)
>> +int __ref add_memory_resource(int nid, struct resource *res, bool online)
>>  {
>>  	u64 start, size;
>>  	pg_data_t *pgdat = NULL;
>> @@ -1292,6 +1295,11 @@ int __ref add_memory_resource(int nid, struct resource *res)
>>  	/* create new memmap entry */
>>  	firmware_map_add_hotplug(start, start + size, "System RAM");
>>
>> +	/* online pages if requested */
>> +	if (online)
>> +		online_pages(start >> PAGE_SHIFT, size >> PAGE_SHIFT,
>> +			     MMOP_ONLINE_KEEP);
>> +
>
> This way we go in deadlock if auto online feature is enabled in Xen (this was
> pointed out by David Vrabel).

Yes, but as I said the patch doesn't change anything for Xen guests for
now, we always call add_memory_resource() with online = false.

> And we want to have it working out of the box.
> So, I think that we should find proper solution. I suppose that we can schedule
> a task here which auto online attached blocks. Hmmm... Not nice but should work.
> Or maybe you have better idea how to fix this issue.

I'd like to avoid additional delays and memory allocations between
adding new memory and onlining it (and this is the main purpose of the
patch). Maybe we can have a tristate online parameter ('online_now',
'online_delay', 'keep_offlined') and handle it
accordingly. Alternatively I can suggest we have the onlining in Xen
balloon driver code, memhp_auto_online is exported so we can call
online_pages() after we release the ballon_mutex.

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] memory-hotplug: add automatic onlining policy for the newly added memory
@ 2016-01-08 16:55     ` Vitaly Kuznetsov
  0 siblings, 0 replies; 23+ messages in thread
From: Vitaly Kuznetsov @ 2016-01-08 16:55 UTC (permalink / raw)
  To: Daniel Kiper
  Cc: linux-mm, linux-kernel, linux-acpi, Jonathan Corbet,
	Greg Kroah-Hartman, Dan Williams, Tang Chen, David Vrabel,
	David Rientjes, Andrew Morton, Naoya Horiguchi, Xishi Qiu,
	Mel Gorman, K. Y. Srinivasan, Igor Mammedov, Kay Sievers,
	Konrad Rzeszutek Wilk, Boris Ostrovsky

Daniel Kiper <daniel.kiper@oracle.com> writes:

> Hey Vitaly,
>
> Sorry for late reply but I have been on holiday and I was not able
> to follow this thread.
>
> I saw that Andrew put this patch in -mm queue but I have still some
> concerns. I hope that they could be addressed in one way or another.
> Please look below for more details.
>
> On Thu, Jan 07, 2016 at 06:23:41PM +0100, Vitaly Kuznetsov wrote:
>> Currently, all newly added memory blocks remain in 'offline' state unless
>> someone onlines them, some linux distributions carry special udev rules
>> like:
>>
>> SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online"
>>
>> to make this happen automatically. This is not a great solution for virtual
>> machines where memory hotplug is being used to address high memory pressure
>> situations as such onlining is slow and a userspace process doing this
>> (udev) has a chance of being killed by the OOM killer as it will probably
>> require to allocate some memory.
>>
>> Introduce default policy for the newly added memory blocks in
>> /sys/devices/system/memory/auto_online_blocks file with two possible
>> values: "offline" which preserves the current behavior and "online" which
>> causes all newly added memory blocks to go online as soon as they're added.
>> The default is "offline".
>>
>> Cc: Jonathan Corbet <corbet@lwn.net>
>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> Cc: Daniel Kiper <daniel.kiper@oracle.com>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: Tang Chen <tangchen@cn.fujitsu.com>
>> Cc: David Vrabel <david.vrabel@citrix.com>
>> Cc: David Rientjes <rientjes@google.com>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
>> Cc: Xishi Qiu <qiuxishi@huawei.com>
>> Cc: Mel Gorman <mgorman@techsingularity.net>
>> Cc: "K. Y. Srinivasan" <kys@microsoft.com>
>> Cc: Igor Mammedov <imammedo@redhat.com>
>> Cc: Kay Sievers <kay@vrfy.org>
>> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>> - Changes since 'v2':
>>   - Remove config option, revert to 'offline' by default [Andrew Morton,
>>     David Rientjes]
>>   - Rename 'hotplug_autoonline' to 'auto_online_blocks' [David Rientjes]
>>
>> - Changes since 'v1':
>>   Add 'online' parameter to add_memory_resource() as it is being used by
>>   xen ballon driver and it adds "empty" memory pages [David Vrabel].
>>   (I don't completely understand what prevents manual onlining in this
>>    case as we still have all newly added blocks in sysfs ... this is the
>>    discussion point.)

Hi Daniel!

>
> Sometimes Xen guest needs to share some memory with other guests. This
> memory must live in an address space and must be described by struct page.
> Original mechanism gets such struct page by ballooning down guest memory a bit.
> However, if memory hotplug is available then we do not need to steal real
> memory from guest to share memory. We can get address space and struct page
> from memory hotplug machinery. However, there is one requirement: every
> page shared between guests cannot be allocated (onlined) in normal way.
> All of them must be taken from hypervisor and later described by struct page.
> Otherwise, if we online whole section/block automatically then we must use
> ballooning mechanism described earlier.
>
> David, please correct me if I am wrong.
>
> I hope that helps.
>
>> - Changes since 'RFC':
>>   It seems nobody is strongly opposed to the idea, thus non-RFC.
>>   Change memhp_autoonline to bool, we support only MMOP_ONLINE_KEEP
>>   and MMOP_OFFLINE for the auto-onlining policy, eliminate 'unknown'
>>   from show_memhp_autoonline(). [Daniel Kiper]
>>   Put everything under CONFIG_MEMORY_HOTPLUG_AUTOONLINE, enable the
>>   feature by default (when the config option is selected) and add
>>   kernel parameter (nomemhp_autoonline) to disable the functionality
>>   upon boot when needed.
>>
>> - RFC:
>>   I was able to find previous attempts to fix the issue, e.g.:
>>   http://marc.info/?l=linux-kernel&m=137425951924598&w=2
>>   http://marc.info/?l=linux-acpi&m=127186488905382
>>   but I'm not completely sure why it didn't work out and the solution
>>   I suggest is not 'smart enough', thus 'RFC'.
>> ---
>>  Documentation/memory-hotplug.txt | 19 +++++++++++++++----
>>  drivers/base/memory.c            | 32 ++++++++++++++++++++++++++++++++
>>  drivers/xen/balloon.c            |  2 +-
>>  include/linux/memory_hotplug.h   |  4 +++-
>>  mm/memory_hotplug.c              | 12 ++++++++++--
>>  5 files changed, 61 insertions(+), 8 deletions(-)
>>
>> diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
>> index ce2cfcf..ceaf40c 100644
>> --- a/Documentation/memory-hotplug.txt
>> +++ b/Documentation/memory-hotplug.txt
>> @@ -254,12 +254,23 @@ If the memory block is online, you'll read "online".
>>  If the memory block is offline, you'll read "offline".
>>
>>
>> -5.2. How to online memory
>> +5.2. Memory onlining
>>  ------------
>> -Even if the memory is hot-added, it is not at ready-to-use state.
>> -For using newly added memory, you have to "online" the memory block.
>> +When the memory is hot-added, the kernel decides whether or not to "online"
>> +it according to the policy which can be read from "auto_online_blocks" file:
>>
>> -For onlining, you have to write "online" to the memory block's state file as:
>> +% cat /sys/devices/system/memory/auto_online_blocks
>> +
>> +The default is "offline" which means the newly added memory is not in a
>> +ready-to-use state and you have to "online" the newly added memory blocks
>> +manually. Automatic onlining can be requested by writing "online" to
>> +"auto_online_blocks" file:
>> +
>> +% echo online > /sys/devices/system/memory/auto_online_blocks
>> +
>> +If the automatic onlining wasn't requested or some memory block was offlined
>> +it is possible to change the individual block's state by writing to the "state"
>> +file:
>>
>>  % echo online > /sys/devices/system/memory/memoryXXX/state
>
> Please say clearly that offlined blocks are not onlined automatically
> when /sys/devices/system/memory/auto_online_blocks is set to online.
>

You mean the blocks which were manually offlined won't magically come
back, right? Ok, I'll try.
 
>> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>> index 25425d3..44a618d 100644
>> --- a/drivers/base/memory.c
>> +++ b/drivers/base/memory.c
>> @@ -439,6 +439,37 @@ print_block_size(struct device *dev, struct device_attribute *attr,
>>  static DEVICE_ATTR(block_size_bytes, 0444, print_block_size, NULL);
>>
>>  /*
>> + * Memory auto online policy.
>> + */
>> +
>> +static ssize_t
>> +show_auto_online_blocks(struct device *dev, struct device_attribute *attr,
>> +			char *buf)
>> +{
>> +	if (memhp_auto_online)
>> +		return sprintf(buf, "online\n");
>> +	else
>> +		return sprintf(buf, "offline\n");
>> +}
>> +
>> +static ssize_t
>> +store_auto_online_blocks(struct device *dev, struct device_attribute *attr,
>> +			 const char *buf, size_t count)
>> +{
>> +	if (sysfs_streq(buf, "online"))
>> +		memhp_auto_online = true;
>> +	else if (sysfs_streq(buf, "offline"))
>> +		memhp_auto_online = false;
>> +	else
>> +		return -EINVAL;
>> +
>> +	return count;
>> +}
>> +
>> +static DEVICE_ATTR(auto_online_blocks, 0644, show_auto_online_blocks,
>> +		   store_auto_online_blocks);
>> +
>> +/*
>>   * Some architectures will have custom drivers to do this, and
>>   * will not need to do it from userspace.  The fake hot-add code
>>   * as well as ppc64 will do all of their discovery in userspace
>> @@ -737,6 +768,7 @@ static struct attribute *memory_root_attrs[] = {
>>  #endif
>>
>>  	&dev_attr_block_size_bytes.attr,
>> +	&dev_attr_auto_online_blocks.attr,
>>  	NULL
>>  };
>>
>> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
>> index 12eab50..890c3b5 100644
>> --- a/drivers/xen/balloon.c
>> +++ b/drivers/xen/balloon.c
>> @@ -338,7 +338,7 @@ static enum bp_state reserve_additional_memory(void)
>>  	}
>>  #endif
>>
>> -	rc = add_memory_resource(nid, resource);
>> +	rc = add_memory_resource(nid, resource, false);
>
> This is partial solution and does not allow us to use new feature in Xen.
> Could you add separate patch which fixes this issue?
>

Sure, I'd be glad to make this work for Xen too.

>>  	if (rc) {
>>  		pr_warn("Cannot add additional memory (%i)\n", rc);
>>  		goto err;
>> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
>> index 2ea574f..4b7949a 100644
>> --- a/include/linux/memory_hotplug.h
>> +++ b/include/linux/memory_hotplug.h
>> @@ -99,6 +99,8 @@ extern void __online_page_free(struct page *page);
>>
>>  extern int try_online_node(int nid);
>>
>> +extern bool memhp_auto_online;
>> +
>>  #ifdef CONFIG_MEMORY_HOTREMOVE
>>  extern bool is_pageblock_removable_nolock(struct page *page);
>>  extern int arch_remove_memory(u64 start, u64 size);
>> @@ -267,7 +269,7 @@ static inline void remove_memory(int nid, u64 start, u64 size) {}
>>  extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
>>  		void *arg, int (*func)(struct memory_block *, void *));
>>  extern int add_memory(int nid, u64 start, u64 size);
>> -extern int add_memory_resource(int nid, struct resource *resource);
>> +extern int add_memory_resource(int nid, struct resource *resource, bool online);
>>  extern int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
>>  		bool for_device);
>>  extern int arch_add_memory(int nid, u64 start, u64 size, bool for_device);
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index a042a9d..0ecf860 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -76,6 +76,9 @@ static struct {
>>  #define memhp_lock_acquire()      lock_map_acquire(&mem_hotplug.dep_map)
>>  #define memhp_lock_release()      lock_map_release(&mem_hotplug.dep_map)
>>
>> +bool memhp_auto_online;
>> +EXPORT_SYMBOL_GPL(memhp_auto_online);
>> +
>>  void get_online_mems(void)
>>  {
>>  	might_sleep();
>> @@ -1232,7 +1235,7 @@ int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
>>  }
>>
>>  /* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */
>> -int __ref add_memory_resource(int nid, struct resource *res)
>> +int __ref add_memory_resource(int nid, struct resource *res, bool online)
>>  {
>>  	u64 start, size;
>>  	pg_data_t *pgdat = NULL;
>> @@ -1292,6 +1295,11 @@ int __ref add_memory_resource(int nid, struct resource *res)
>>  	/* create new memmap entry */
>>  	firmware_map_add_hotplug(start, start + size, "System RAM");
>>
>> +	/* online pages if requested */
>> +	if (online)
>> +		online_pages(start >> PAGE_SHIFT, size >> PAGE_SHIFT,
>> +			     MMOP_ONLINE_KEEP);
>> +
>
> This way we go in deadlock if auto online feature is enabled in Xen (this was
> pointed out by David Vrabel).

Yes, but as I said the patch doesn't change anything for Xen guests for
now, we always call add_memory_resource() with online = false.

> And we want to have it working out of the box.
> So, I think that we should find proper solution. I suppose that we can schedule
> a task here which auto online attached blocks. Hmmm... Not nice but should work.
> Or maybe you have better idea how to fix this issue.

I'd like to avoid additional delays and memory allocations between
adding new memory and onlining it (and this is the main purpose of the
patch). Maybe we can have a tristate online parameter ('online_now',
'online_delay', 'keep_offlined') and handle it
accordingly. Alternatively I can suggest we have the onlining in Xen
balloon driver code, memhp_auto_online is exported so we can call
online_pages() after we release the ballon_mutex.

-- 
  Vitaly

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] memory-hotplug: add automatic onlining policy for the newly added memory
  2016-01-08 16:55     ` Vitaly Kuznetsov
@ 2016-01-11  8:10       ` Daniel Kiper
  -1 siblings, 0 replies; 23+ messages in thread
From: Daniel Kiper @ 2016-01-11  8:10 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: linux-mm, linux-kernel, linux-acpi, Jonathan Corbet,
	Greg Kroah-Hartman, Dan Williams, Tang Chen, David Vrabel,
	David Rientjes, Andrew Morton, Naoya Horiguchi, Xishi Qiu,
	Mel Gorman, K. Y. Srinivasan, Igor Mammedov, Kay Sievers,
	Konrad Rzeszutek Wilk, Boris Ostrovsky

On Fri, Jan 08, 2016 at 05:55:07PM +0100, Vitaly Kuznetsov wrote:
> Daniel Kiper <daniel.kiper@oracle.com> writes:

[...]

> >> diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
> >> index ce2cfcf..ceaf40c 100644
> >> --- a/Documentation/memory-hotplug.txt
> >> +++ b/Documentation/memory-hotplug.txt
> >> @@ -254,12 +254,23 @@ If the memory block is online, you'll read "online".
> >>  If the memory block is offline, you'll read "offline".
> >>
> >>
> >> -5.2. How to online memory
> >> +5.2. Memory onlining
> >>  ------------
> >> -Even if the memory is hot-added, it is not at ready-to-use state.
> >> -For using newly added memory, you have to "online" the memory block.
> >> +When the memory is hot-added, the kernel decides whether or not to "online"
> >> +it according to the policy which can be read from "auto_online_blocks" file:
> >>
> >> -For onlining, you have to write "online" to the memory block's state file as:
> >> +% cat /sys/devices/system/memory/auto_online_blocks
> >> +
> >> +The default is "offline" which means the newly added memory is not in a
> >> +ready-to-use state and you have to "online" the newly added memory blocks
> >> +manually. Automatic onlining can be requested by writing "online" to
> >> +"auto_online_blocks" file:
> >> +
> >> +% echo online > /sys/devices/system/memory/auto_online_blocks
> >> +
> >> +If the automatic onlining wasn't requested or some memory block was offlined
> >> +it is possible to change the individual block's state by writing to the "state"
> >> +file:
> >>
> >>  % echo online > /sys/devices/system/memory/memoryXXX/state
> >
> > Please say clearly that offlined blocks are not onlined automatically
> > when /sys/devices/system/memory/auto_online_blocks is set to online.
> >
>
> You mean the blocks which were manually offlined won't magically come
> back, right? Ok, I'll try.

Yep, but AIUI it works in that way for all offlined blocks not only for
earlier manually offlined ones.

> >> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> >> index 25425d3..44a618d 100644
> >> --- a/drivers/base/memory.c
> >> +++ b/drivers/base/memory.c
> >> @@ -439,6 +439,37 @@ print_block_size(struct device *dev, struct device_attribute *attr,
> >>  static DEVICE_ATTR(block_size_bytes, 0444, print_block_size, NULL);
> >>
> >>  /*
> >> + * Memory auto online policy.
> >> + */
> >> +
> >> +static ssize_t
> >> +show_auto_online_blocks(struct device *dev, struct device_attribute *attr,
> >> +			char *buf)
> >> +{
> >> +	if (memhp_auto_online)
> >> +		return sprintf(buf, "online\n");
> >> +	else
> >> +		return sprintf(buf, "offline\n");
> >> +}
> >> +
> >> +static ssize_t
> >> +store_auto_online_blocks(struct device *dev, struct device_attribute *attr,
> >> +			 const char *buf, size_t count)
> >> +{
> >> +	if (sysfs_streq(buf, "online"))
> >> +		memhp_auto_online = true;
> >> +	else if (sysfs_streq(buf, "offline"))
> >> +		memhp_auto_online = false;
> >> +	else
> >> +		return -EINVAL;
> >> +
> >> +	return count;
> >> +}
> >> +
> >> +static DEVICE_ATTR(auto_online_blocks, 0644, show_auto_online_blocks,
> >> +		   store_auto_online_blocks);
> >> +
> >> +/*
> >>   * Some architectures will have custom drivers to do this, and
> >>   * will not need to do it from userspace.  The fake hot-add code
> >>   * as well as ppc64 will do all of their discovery in userspace
> >> @@ -737,6 +768,7 @@ static struct attribute *memory_root_attrs[] = {
> >>  #endif
> >>
> >>  	&dev_attr_block_size_bytes.attr,
> >> +	&dev_attr_auto_online_blocks.attr,
> >>  	NULL
> >>  };
> >>
> >> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> >> index 12eab50..890c3b5 100644
> >> --- a/drivers/xen/balloon.c
> >> +++ b/drivers/xen/balloon.c
> >> @@ -338,7 +338,7 @@ static enum bp_state reserve_additional_memory(void)
> >>  	}
> >>  #endif
> >>
> >> -	rc = add_memory_resource(nid, resource);
> >> +	rc = add_memory_resource(nid, resource, false);
> >
> > This is partial solution and does not allow us to use new feature in Xen.
> > Could you add separate patch which fixes this issue?
> >
>
> Sure, I'd be glad to make this work for Xen too.

Great! Thanks a lot!

> >>  	if (rc) {
> >>  		pr_warn("Cannot add additional memory (%i)\n", rc);
> >>  		goto err;
> >> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> >> index 2ea574f..4b7949a 100644
> >> --- a/include/linux/memory_hotplug.h
> >> +++ b/include/linux/memory_hotplug.h
> >> @@ -99,6 +99,8 @@ extern void __online_page_free(struct page *page);
> >>
> >>  extern int try_online_node(int nid);
> >>
> >> +extern bool memhp_auto_online;
> >> +
> >>  #ifdef CONFIG_MEMORY_HOTREMOVE
> >>  extern bool is_pageblock_removable_nolock(struct page *page);
> >>  extern int arch_remove_memory(u64 start, u64 size);
> >> @@ -267,7 +269,7 @@ static inline void remove_memory(int nid, u64 start, u64 size) {}
> >>  extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
> >>  		void *arg, int (*func)(struct memory_block *, void *));
> >>  extern int add_memory(int nid, u64 start, u64 size);
> >> -extern int add_memory_resource(int nid, struct resource *resource);
> >> +extern int add_memory_resource(int nid, struct resource *resource, bool online);
> >>  extern int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
> >>  		bool for_device);
> >>  extern int arch_add_memory(int nid, u64 start, u64 size, bool for_device);
> >> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> >> index a042a9d..0ecf860 100644
> >> --- a/mm/memory_hotplug.c
> >> +++ b/mm/memory_hotplug.c
> >> @@ -76,6 +76,9 @@ static struct {
> >>  #define memhp_lock_acquire()      lock_map_acquire(&mem_hotplug.dep_map)
> >>  #define memhp_lock_release()      lock_map_release(&mem_hotplug.dep_map)
> >>
> >> +bool memhp_auto_online;
> >> +EXPORT_SYMBOL_GPL(memhp_auto_online);
> >> +
> >>  void get_online_mems(void)
> >>  {
> >>  	might_sleep();
> >> @@ -1232,7 +1235,7 @@ int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
> >>  }
> >>
> >>  /* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */
> >> -int __ref add_memory_resource(int nid, struct resource *res)
> >> +int __ref add_memory_resource(int nid, struct resource *res, bool online)
> >>  {
> >>  	u64 start, size;
> >>  	pg_data_t *pgdat = NULL;
> >> @@ -1292,6 +1295,11 @@ int __ref add_memory_resource(int nid, struct resource *res)
> >>  	/* create new memmap entry */
> >>  	firmware_map_add_hotplug(start, start + size, "System RAM");
> >>
> >> +	/* online pages if requested */
> >> +	if (online)
> >> +		online_pages(start >> PAGE_SHIFT, size >> PAGE_SHIFT,
> >> +			     MMOP_ONLINE_KEEP);
> >> +
> >
> > This way we go in deadlock if auto online feature is enabled in Xen (this was
> > pointed out by David Vrabel).
>
> Yes, but as I said the patch doesn't change anything for Xen guests for
> now, we always call add_memory_resource() with online = false.
>
> > And we want to have it working out of the box.
> > So, I think that we should find proper solution. I suppose that we can schedule
> > a task here which auto online attached blocks. Hmmm... Not nice but should work.
> > Or maybe you have better idea how to fix this issue.
>
> I'd like to avoid additional delays and memory allocations between
> adding new memory and onlining it (and this is the main purpose of the
> patch). Maybe we can have a tristate online parameter ('online_now',
> 'online_delay', 'keep_offlined') and handle it
> accordingly. Alternatively I can suggest we have the onlining in Xen
> balloon driver code, memhp_auto_online is exported so we can call
> online_pages() after we release the ballon_mutex.

This is not nice too. I prefer the same code path for every case.
Give me some time. I will think how to solve that issue.

Daniel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] memory-hotplug: add automatic onlining policy for the newly added memory
@ 2016-01-11  8:10       ` Daniel Kiper
  0 siblings, 0 replies; 23+ messages in thread
From: Daniel Kiper @ 2016-01-11  8:10 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: linux-mm, linux-kernel, linux-acpi, Jonathan Corbet,
	Greg Kroah-Hartman, Dan Williams, Tang Chen, David Vrabel,
	David Rientjes, Andrew Morton, Naoya Horiguchi, Xishi Qiu,
	Mel Gorman, K. Y. Srinivasan, Igor Mammedov, Kay Sievers,
	Konrad Rzeszutek Wilk, Boris Ostrovsky

On Fri, Jan 08, 2016 at 05:55:07PM +0100, Vitaly Kuznetsov wrote:
> Daniel Kiper <daniel.kiper@oracle.com> writes:

[...]

> >> diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
> >> index ce2cfcf..ceaf40c 100644
> >> --- a/Documentation/memory-hotplug.txt
> >> +++ b/Documentation/memory-hotplug.txt
> >> @@ -254,12 +254,23 @@ If the memory block is online, you'll read "online".
> >>  If the memory block is offline, you'll read "offline".
> >>
> >>
> >> -5.2. How to online memory
> >> +5.2. Memory onlining
> >>  ------------
> >> -Even if the memory is hot-added, it is not at ready-to-use state.
> >> -For using newly added memory, you have to "online" the memory block.
> >> +When the memory is hot-added, the kernel decides whether or not to "online"
> >> +it according to the policy which can be read from "auto_online_blocks" file:
> >>
> >> -For onlining, you have to write "online" to the memory block's state file as:
> >> +% cat /sys/devices/system/memory/auto_online_blocks
> >> +
> >> +The default is "offline" which means the newly added memory is not in a
> >> +ready-to-use state and you have to "online" the newly added memory blocks
> >> +manually. Automatic onlining can be requested by writing "online" to
> >> +"auto_online_blocks" file:
> >> +
> >> +% echo online > /sys/devices/system/memory/auto_online_blocks
> >> +
> >> +If the automatic onlining wasn't requested or some memory block was offlined
> >> +it is possible to change the individual block's state by writing to the "state"
> >> +file:
> >>
> >>  % echo online > /sys/devices/system/memory/memoryXXX/state
> >
> > Please say clearly that offlined blocks are not onlined automatically
> > when /sys/devices/system/memory/auto_online_blocks is set to online.
> >
>
> You mean the blocks which were manually offlined won't magically come
> back, right? Ok, I'll try.

Yep, but AIUI it works in that way for all offlined blocks not only for
earlier manually offlined ones.

> >> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> >> index 25425d3..44a618d 100644
> >> --- a/drivers/base/memory.c
> >> +++ b/drivers/base/memory.c
> >> @@ -439,6 +439,37 @@ print_block_size(struct device *dev, struct device_attribute *attr,
> >>  static DEVICE_ATTR(block_size_bytes, 0444, print_block_size, NULL);
> >>
> >>  /*
> >> + * Memory auto online policy.
> >> + */
> >> +
> >> +static ssize_t
> >> +show_auto_online_blocks(struct device *dev, struct device_attribute *attr,
> >> +			char *buf)
> >> +{
> >> +	if (memhp_auto_online)
> >> +		return sprintf(buf, "online\n");
> >> +	else
> >> +		return sprintf(buf, "offline\n");
> >> +}
> >> +
> >> +static ssize_t
> >> +store_auto_online_blocks(struct device *dev, struct device_attribute *attr,
> >> +			 const char *buf, size_t count)
> >> +{
> >> +	if (sysfs_streq(buf, "online"))
> >> +		memhp_auto_online = true;
> >> +	else if (sysfs_streq(buf, "offline"))
> >> +		memhp_auto_online = false;
> >> +	else
> >> +		return -EINVAL;
> >> +
> >> +	return count;
> >> +}
> >> +
> >> +static DEVICE_ATTR(auto_online_blocks, 0644, show_auto_online_blocks,
> >> +		   store_auto_online_blocks);
> >> +
> >> +/*
> >>   * Some architectures will have custom drivers to do this, and
> >>   * will not need to do it from userspace.  The fake hot-add code
> >>   * as well as ppc64 will do all of their discovery in userspace
> >> @@ -737,6 +768,7 @@ static struct attribute *memory_root_attrs[] = {
> >>  #endif
> >>
> >>  	&dev_attr_block_size_bytes.attr,
> >> +	&dev_attr_auto_online_blocks.attr,
> >>  	NULL
> >>  };
> >>
> >> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> >> index 12eab50..890c3b5 100644
> >> --- a/drivers/xen/balloon.c
> >> +++ b/drivers/xen/balloon.c
> >> @@ -338,7 +338,7 @@ static enum bp_state reserve_additional_memory(void)
> >>  	}
> >>  #endif
> >>
> >> -	rc = add_memory_resource(nid, resource);
> >> +	rc = add_memory_resource(nid, resource, false);
> >
> > This is partial solution and does not allow us to use new feature in Xen.
> > Could you add separate patch which fixes this issue?
> >
>
> Sure, I'd be glad to make this work for Xen too.

Great! Thanks a lot!

> >>  	if (rc) {
> >>  		pr_warn("Cannot add additional memory (%i)\n", rc);
> >>  		goto err;
> >> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> >> index 2ea574f..4b7949a 100644
> >> --- a/include/linux/memory_hotplug.h
> >> +++ b/include/linux/memory_hotplug.h
> >> @@ -99,6 +99,8 @@ extern void __online_page_free(struct page *page);
> >>
> >>  extern int try_online_node(int nid);
> >>
> >> +extern bool memhp_auto_online;
> >> +
> >>  #ifdef CONFIG_MEMORY_HOTREMOVE
> >>  extern bool is_pageblock_removable_nolock(struct page *page);
> >>  extern int arch_remove_memory(u64 start, u64 size);
> >> @@ -267,7 +269,7 @@ static inline void remove_memory(int nid, u64 start, u64 size) {}
> >>  extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
> >>  		void *arg, int (*func)(struct memory_block *, void *));
> >>  extern int add_memory(int nid, u64 start, u64 size);
> >> -extern int add_memory_resource(int nid, struct resource *resource);
> >> +extern int add_memory_resource(int nid, struct resource *resource, bool online);
> >>  extern int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
> >>  		bool for_device);
> >>  extern int arch_add_memory(int nid, u64 start, u64 size, bool for_device);
> >> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> >> index a042a9d..0ecf860 100644
> >> --- a/mm/memory_hotplug.c
> >> +++ b/mm/memory_hotplug.c
> >> @@ -76,6 +76,9 @@ static struct {
> >>  #define memhp_lock_acquire()      lock_map_acquire(&mem_hotplug.dep_map)
> >>  #define memhp_lock_release()      lock_map_release(&mem_hotplug.dep_map)
> >>
> >> +bool memhp_auto_online;
> >> +EXPORT_SYMBOL_GPL(memhp_auto_online);
> >> +
> >>  void get_online_mems(void)
> >>  {
> >>  	might_sleep();
> >> @@ -1232,7 +1235,7 @@ int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
> >>  }
> >>
> >>  /* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */
> >> -int __ref add_memory_resource(int nid, struct resource *res)
> >> +int __ref add_memory_resource(int nid, struct resource *res, bool online)
> >>  {
> >>  	u64 start, size;
> >>  	pg_data_t *pgdat = NULL;
> >> @@ -1292,6 +1295,11 @@ int __ref add_memory_resource(int nid, struct resource *res)
> >>  	/* create new memmap entry */
> >>  	firmware_map_add_hotplug(start, start + size, "System RAM");
> >>
> >> +	/* online pages if requested */
> >> +	if (online)
> >> +		online_pages(start >> PAGE_SHIFT, size >> PAGE_SHIFT,
> >> +			     MMOP_ONLINE_KEEP);
> >> +
> >
> > This way we go in deadlock if auto online feature is enabled in Xen (this was
> > pointed out by David Vrabel).
>
> Yes, but as I said the patch doesn't change anything for Xen guests for
> now, we always call add_memory_resource() with online = false.
>
> > And we want to have it working out of the box.
> > So, I think that we should find proper solution. I suppose that we can schedule
> > a task here which auto online attached blocks. Hmmm... Not nice but should work.
> > Or maybe you have better idea how to fix this issue.
>
> I'd like to avoid additional delays and memory allocations between
> adding new memory and onlining it (and this is the main purpose of the
> patch). Maybe we can have a tristate online parameter ('online_now',
> 'online_delay', 'keep_offlined') and handle it
> accordingly. Alternatively I can suggest we have the onlining in Xen
> balloon driver code, memhp_auto_online is exported so we can call
> online_pages() after we release the ballon_mutex.

This is not nice too. I prefer the same code path for every case.
Give me some time. I will think how to solve that issue.

Daniel

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] memory-hotplug: add automatic onlining policy for the newly added memory
  2016-01-07 17:23 ` Vitaly Kuznetsov
@ 2016-01-11 11:01   ` David Vrabel
  -1 siblings, 0 replies; 23+ messages in thread
From: David Vrabel @ 2016-01-11 11:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov, linux-mm
  Cc: linux-kernel, linux-acpi, Jonathan Corbet, Greg Kroah-Hartman,
	Daniel Kiper, Dan Williams, Tang Chen, David Rientjes,
	Andrew Morton, Naoya Horiguchi, Xishi Qiu, Mel Gorman,
	K. Y. Srinivasan, Igor Mammedov, Kay Sievers,
	Konrad Rzeszutek Wilk, Boris Ostrovsky

On 07/01/16 17:23, Vitaly Kuznetsov wrote:
> 
> - Changes since 'v1':
>   Add 'online' parameter to add_memory_resource() as it is being used by
>   xen ballon driver and it adds "empty" memory pages [David Vrabel].
>   (I don't completely understand what prevents manual onlining in this
>    case as we still have all newly added blocks in sysfs ... this is the
>    discussion point.)

I'm not sure what you're not understanding here?

Memory added by the Xen balloon driver (whether populated with real
memory or not) does need to be onlined by udev or similar.

David

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] memory-hotplug: add automatic onlining policy for the newly added memory
@ 2016-01-11 11:01   ` David Vrabel
  0 siblings, 0 replies; 23+ messages in thread
From: David Vrabel @ 2016-01-11 11:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov, linux-mm
  Cc: linux-kernel, linux-acpi, Jonathan Corbet, Greg Kroah-Hartman,
	Daniel Kiper, Dan Williams, Tang Chen, David Rientjes,
	Andrew Morton, Naoya Horiguchi, Xishi Qiu, Mel Gorman,
	K. Y. Srinivasan, Igor Mammedov, Kay Sievers,
	Konrad Rzeszutek Wilk, Boris Ostrovsky

On 07/01/16 17:23, Vitaly Kuznetsov wrote:
> 
> - Changes since 'v1':
>   Add 'online' parameter to add_memory_resource() as it is being used by
>   xen ballon driver and it adds "empty" memory pages [David Vrabel].
>   (I don't completely understand what prevents manual onlining in this
>    case as we still have all newly added blocks in sysfs ... this is the
>    discussion point.)

I'm not sure what you're not understanding here?

Memory added by the Xen balloon driver (whether populated with real
memory or not) does need to be onlined by udev or similar.

David

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] memory-hotplug: add automatic onlining policy for the newly added memory
  2016-01-11 11:01   ` David Vrabel
  (?)
@ 2016-01-11 11:59     ` Vitaly Kuznetsov
  -1 siblings, 0 replies; 23+ messages in thread
From: Vitaly Kuznetsov @ 2016-01-11 11:59 UTC (permalink / raw)
  To: David Vrabel
  Cc: linux-mm, linux-kernel, linux-acpi, Jonathan Corbet,
	Greg Kroah-Hartman, Daniel Kiper, Dan Williams, Tang Chen,
	David Rientjes, Andrew Morton, Naoya Horiguchi, Xishi Qiu,
	Mel Gorman, K. Y. Srinivasan, Igor Mammedov, Kay Sievers,
	Konrad Rzeszutek Wilk, Boris Ostrovsky

David Vrabel <david.vrabel@citrix.com> writes:

> On 07/01/16 17:23, Vitaly Kuznetsov wrote:
>> 
>> - Changes since 'v1':
>>   Add 'online' parameter to add_memory_resource() as it is being used by
>>   xen ballon driver and it adds "empty" memory pages [David Vrabel].
>>   (I don't completely understand what prevents manual onlining in this
>>    case as we still have all newly added blocks in sysfs ... this is the
>>    discussion point.)
>

(there is a discussion with Daniel on the same topic in a parallel
thread)

> I'm not sure what you're not understanding here?
>
> Memory added by the Xen balloon driver (whether populated with real
> memory or not) does need to be onlined by udev or similar.


Yes, same as all other memory hotplug mechanisms (hyper-v's balloon
driver and acpi memory hotplug). My patch adds an option to make this
happen automatically. Xen driver is currently excluded because of a
deadlock. If this deadlock is the only problem we can easily change
taking the lock to checking that the lock was taken (and taking in case
it wasn't) or something similar and everything is going to work. From
briefly looking at the code it seems to me it's going to work.

What I wasn't sure about is 'empty pages' you were mentioning. In case
there are some pages we can't online there should be a protection
mechanism actively preventing them from going online (similar to
hv_online_page() in Hyper-V driver) as this patch does nothing
'special' compared to udev onlining newly added blocks. 

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] memory-hotplug: add automatic onlining policy for the newly added memory
@ 2016-01-11 11:59     ` Vitaly Kuznetsov
  0 siblings, 0 replies; 23+ messages in thread
From: Vitaly Kuznetsov @ 2016-01-11 11:59 UTC (permalink / raw)
  To: David Vrabel
  Cc: linux-mm, linux-kernel, linux-acpi, Jonathan Corbet,
	Greg Kroah-Hartman, Daniel Kiper, Dan Williams, Tang Chen,
	David Rientjes, Andrew Morton, Naoya Horiguchi, Xishi Qiu,
	Mel Gorman, K. Y. Srinivasan, Igor Mammedov, Kay Sievers,
	Konrad Rzeszutek Wilk, Boris Ostrovsky

David Vrabel <david.vrabel@citrix.com> writes:

> On 07/01/16 17:23, Vitaly Kuznetsov wrote:
>> 
>> - Changes since 'v1':
>>   Add 'online' parameter to add_memory_resource() as it is being used by
>>   xen ballon driver and it adds "empty" memory pages [David Vrabel].
>>   (I don't completely understand what prevents manual onlining in this
>>    case as we still have all newly added blocks in sysfs ... this is the
>>    discussion point.)
>

(there is a discussion with Daniel on the same topic in a parallel
thread)

> I'm not sure what you're not understanding here?
>
> Memory added by the Xen balloon driver (whether populated with real
> memory or not) does need to be onlined by udev or similar.


Yes, same as all other memory hotplug mechanisms (hyper-v's balloon
driver and acpi memory hotplug). My patch adds an option to make this
happen automatically. Xen driver is currently excluded because of a
deadlock. If this deadlock is the only problem we can easily change
taking the lock to checking that the lock was taken (and taking in case
it wasn't) or something similar and everything is going to work. From
briefly looking at the code it seems to me it's going to work.

What I wasn't sure about is 'empty pages' you were mentioning. In case
there are some pages we can't online there should be a protection
mechanism actively preventing them from going online (similar to
hv_online_page() in Hyper-V driver) as this patch does nothing
'special' compared to udev onlining newly added blocks. 

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] memory-hotplug: add automatic onlining policy for the newly added memory
@ 2016-01-11 11:59     ` Vitaly Kuznetsov
  0 siblings, 0 replies; 23+ messages in thread
From: Vitaly Kuznetsov @ 2016-01-11 11:59 UTC (permalink / raw)
  To: David Vrabel
  Cc: linux-mm, linux-kernel, linux-acpi, Jonathan Corbet,
	Greg Kroah-Hartman, Daniel Kiper, Dan Williams, Tang Chen,
	David Rientjes, Andrew Morton, Naoya Horiguchi, Xishi Qiu,
	Mel Gorman, K. Y. Srinivasan, Igor Mammedov, Kay Sievers,
	Konrad Rzeszutek Wilk, Boris Ostrovsky

David Vrabel <david.vrabel@citrix.com> writes:

> On 07/01/16 17:23, Vitaly Kuznetsov wrote:
>> 
>> - Changes since 'v1':
>>   Add 'online' parameter to add_memory_resource() as it is being used by
>>   xen ballon driver and it adds "empty" memory pages [David Vrabel].
>>   (I don't completely understand what prevents manual onlining in this
>>    case as we still have all newly added blocks in sysfs ... this is the
>>    discussion point.)
>

(there is a discussion with Daniel on the same topic in a parallel
thread)

> I'm not sure what you're not understanding here?
>
> Memory added by the Xen balloon driver (whether populated with real
> memory or not) does need to be onlined by udev or similar.


Yes, same as all other memory hotplug mechanisms (hyper-v's balloon
driver and acpi memory hotplug). My patch adds an option to make this
happen automatically. Xen driver is currently excluded because of a
deadlock. If this deadlock is the only problem we can easily change
taking the lock to checking that the lock was taken (and taking in case
it wasn't) or something similar and everything is going to work. From
briefly looking at the code it seems to me it's going to work.

What I wasn't sure about is 'empty pages' you were mentioning. In case
there are some pages we can't online there should be a protection
mechanism actively preventing them from going online (similar to
hv_online_page() in Hyper-V driver) as this patch does nothing
'special' compared to udev onlining newly added blocks. 

-- 
  Vitaly

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] memory-hotplug: add automatic onlining policy for the newly added memory
  2016-01-11  8:10       ` Daniel Kiper
@ 2016-01-11 12:42         ` Daniel Kiper
  -1 siblings, 0 replies; 23+ messages in thread
From: Daniel Kiper @ 2016-01-11 12:42 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: linux-mm, linux-kernel, linux-acpi, Jonathan Corbet,
	Greg Kroah-Hartman, Dan Williams, Tang Chen, David Vrabel,
	David Rientjes, Andrew Morton, Naoya Horiguchi, Xishi Qiu,
	Mel Gorman, K. Y. Srinivasan, Igor Mammedov, Kay Sievers,
	Konrad Rzeszutek Wilk, Boris Ostrovsky

On Mon, Jan 11, 2016 at 09:10:13AM +0100, Daniel Kiper wrote:
> On Fri, Jan 08, 2016 at 05:55:07PM +0100, Vitaly Kuznetsov wrote:
> > Daniel Kiper <daniel.kiper@oracle.com> writes:
>
> [...]
>
> > >> diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
> > >> index ce2cfcf..ceaf40c 100644
> > >> --- a/Documentation/memory-hotplug.txt
> > >> +++ b/Documentation/memory-hotplug.txt
> > >> @@ -254,12 +254,23 @@ If the memory block is online, you'll read "online".
> > >>  If the memory block is offline, you'll read "offline".
> > >>
> > >>
> > >> -5.2. How to online memory
> > >> +5.2. Memory onlining
> > >>  ------------
> > >> -Even if the memory is hot-added, it is not at ready-to-use state.
> > >> -For using newly added memory, you have to "online" the memory block.
> > >> +When the memory is hot-added, the kernel decides whether or not to "online"
> > >> +it according to the policy which can be read from "auto_online_blocks" file:
> > >>
> > >> -For onlining, you have to write "online" to the memory block's state file as:
> > >> +% cat /sys/devices/system/memory/auto_online_blocks
> > >> +
> > >> +The default is "offline" which means the newly added memory is not in a
> > >> +ready-to-use state and you have to "online" the newly added memory blocks
> > >> +manually. Automatic onlining can be requested by writing "online" to
> > >> +"auto_online_blocks" file:
> > >> +
> > >> +% echo online > /sys/devices/system/memory/auto_online_blocks
> > >> +
> > >> +If the automatic onlining wasn't requested or some memory block was offlined
> > >> +it is possible to change the individual block's state by writing to the "state"
> > >> +file:
> > >>
> > >>  % echo online > /sys/devices/system/memory/memoryXXX/state
> > >
> > > Please say clearly that offlined blocks are not onlined automatically
> > > when /sys/devices/system/memory/auto_online_blocks is set to online.
> > >
> >
> > You mean the blocks which were manually offlined won't magically come
> > back, right? Ok, I'll try.
>
> Yep, but AIUI it works in that way for all offlined blocks not only for
> earlier manually offlined ones.
>
> > >> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> > >> index 25425d3..44a618d 100644
> > >> --- a/drivers/base/memory.c
> > >> +++ b/drivers/base/memory.c
> > >> @@ -439,6 +439,37 @@ print_block_size(struct device *dev, struct device_attribute *attr,
> > >>  static DEVICE_ATTR(block_size_bytes, 0444, print_block_size, NULL);
> > >>
> > >>  /*
> > >> + * Memory auto online policy.
> > >> + */
> > >> +
> > >> +static ssize_t
> > >> +show_auto_online_blocks(struct device *dev, struct device_attribute *attr,
> > >> +			char *buf)
> > >> +{
> > >> +	if (memhp_auto_online)
> > >> +		return sprintf(buf, "online\n");
> > >> +	else
> > >> +		return sprintf(buf, "offline\n");
> > >> +}
> > >> +
> > >> +static ssize_t
> > >> +store_auto_online_blocks(struct device *dev, struct device_attribute *attr,
> > >> +			 const char *buf, size_t count)
> > >> +{
> > >> +	if (sysfs_streq(buf, "online"))
> > >> +		memhp_auto_online = true;
> > >> +	else if (sysfs_streq(buf, "offline"))
> > >> +		memhp_auto_online = false;
> > >> +	else
> > >> +		return -EINVAL;
> > >> +
> > >> +	return count;
> > >> +}
> > >> +
> > >> +static DEVICE_ATTR(auto_online_blocks, 0644, show_auto_online_blocks,
> > >> +		   store_auto_online_blocks);
> > >> +
> > >> +/*
> > >>   * Some architectures will have custom drivers to do this, and
> > >>   * will not need to do it from userspace.  The fake hot-add code
> > >>   * as well as ppc64 will do all of their discovery in userspace
> > >> @@ -737,6 +768,7 @@ static struct attribute *memory_root_attrs[] = {
> > >>  #endif
> > >>
> > >>  	&dev_attr_block_size_bytes.attr,
> > >> +	&dev_attr_auto_online_blocks.attr,
> > >>  	NULL
> > >>  };
> > >>
> > >> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> > >> index 12eab50..890c3b5 100644
> > >> --- a/drivers/xen/balloon.c
> > >> +++ b/drivers/xen/balloon.c
> > >> @@ -338,7 +338,7 @@ static enum bp_state reserve_additional_memory(void)
> > >>  	}
> > >>  #endif
> > >>
> > >> -	rc = add_memory_resource(nid, resource);
> > >> +	rc = add_memory_resource(nid, resource, false);
> > >
> > > This is partial solution and does not allow us to use new feature in Xen.
> > > Could you add separate patch which fixes this issue?
> > >
> >
> > Sure, I'd be glad to make this work for Xen too.
>
> Great! Thanks a lot!
>
> > >>  	if (rc) {
> > >>  		pr_warn("Cannot add additional memory (%i)\n", rc);
> > >>  		goto err;
> > >> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> > >> index 2ea574f..4b7949a 100644
> > >> --- a/include/linux/memory_hotplug.h
> > >> +++ b/include/linux/memory_hotplug.h
> > >> @@ -99,6 +99,8 @@ extern void __online_page_free(struct page *page);
> > >>
> > >>  extern int try_online_node(int nid);
> > >>
> > >> +extern bool memhp_auto_online;
> > >> +
> > >>  #ifdef CONFIG_MEMORY_HOTREMOVE
> > >>  extern bool is_pageblock_removable_nolock(struct page *page);
> > >>  extern int arch_remove_memory(u64 start, u64 size);
> > >> @@ -267,7 +269,7 @@ static inline void remove_memory(int nid, u64 start, u64 size) {}
> > >>  extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
> > >>  		void *arg, int (*func)(struct memory_block *, void *));
> > >>  extern int add_memory(int nid, u64 start, u64 size);
> > >> -extern int add_memory_resource(int nid, struct resource *resource);
> > >> +extern int add_memory_resource(int nid, struct resource *resource, bool online);
> > >>  extern int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
> > >>  		bool for_device);
> > >>  extern int arch_add_memory(int nid, u64 start, u64 size, bool for_device);
> > >> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > >> index a042a9d..0ecf860 100644
> > >> --- a/mm/memory_hotplug.c
> > >> +++ b/mm/memory_hotplug.c
> > >> @@ -76,6 +76,9 @@ static struct {
> > >>  #define memhp_lock_acquire()      lock_map_acquire(&mem_hotplug.dep_map)
> > >>  #define memhp_lock_release()      lock_map_release(&mem_hotplug.dep_map)
> > >>
> > >> +bool memhp_auto_online;
> > >> +EXPORT_SYMBOL_GPL(memhp_auto_online);
> > >> +
> > >>  void get_online_mems(void)
> > >>  {
> > >>  	might_sleep();
> > >> @@ -1232,7 +1235,7 @@ int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
> > >>  }
> > >>
> > >>  /* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */
> > >> -int __ref add_memory_resource(int nid, struct resource *res)
> > >> +int __ref add_memory_resource(int nid, struct resource *res, bool online)
> > >>  {
> > >>  	u64 start, size;
> > >>  	pg_data_t *pgdat = NULL;
> > >> @@ -1292,6 +1295,11 @@ int __ref add_memory_resource(int nid, struct resource *res)
> > >>  	/* create new memmap entry */
> > >>  	firmware_map_add_hotplug(start, start + size, "System RAM");
> > >>
> > >> +	/* online pages if requested */
> > >> +	if (online)
> > >> +		online_pages(start >> PAGE_SHIFT, size >> PAGE_SHIFT,
> > >> +			     MMOP_ONLINE_KEEP);
> > >> +
> > >
> > > This way we go in deadlock if auto online feature is enabled in Xen (this was
> > > pointed out by David Vrabel).
> >
> > Yes, but as I said the patch doesn't change anything for Xen guests for
> > now, we always call add_memory_resource() with online = false.
> >
> > > And we want to have it working out of the box.
> > > So, I think that we should find proper solution. I suppose that we can schedule
> > > a task here which auto online attached blocks. Hmmm... Not nice but should work.
> > > Or maybe you have better idea how to fix this issue.
> >
> > I'd like to avoid additional delays and memory allocations between
> > adding new memory and onlining it (and this is the main purpose of the
> > patch). Maybe we can have a tristate online parameter ('online_now',
> > 'online_delay', 'keep_offlined') and handle it
> > accordingly. Alternatively I can suggest we have the onlining in Xen
> > balloon driver code, memhp_auto_online is exported so we can call
> > online_pages() after we release the ballon_mutex.
>
> This is not nice too. I prefer the same code path for every case.
> Give me some time. I will think how to solve that issue.

It looks that we can safely call mutex_unlock() just before add_memory_resource()
call and retake lock immediately after add_memory_resource(). add_memory_resource()
itself does not play with balloon stuff and even if online_pages() does then it
take balloon_mutex in right place. Additionally, only one balloon task can run,
so, I think that we are on safe side. Am I right?

Daniel

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] memory-hotplug: add automatic onlining policy for the newly added memory
@ 2016-01-11 12:42         ` Daniel Kiper
  0 siblings, 0 replies; 23+ messages in thread
From: Daniel Kiper @ 2016-01-11 12:42 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: linux-mm, linux-kernel, linux-acpi, Jonathan Corbet,
	Greg Kroah-Hartman, Dan Williams, Tang Chen, David Vrabel,
	David Rientjes, Andrew Morton, Naoya Horiguchi, Xishi Qiu,
	Mel Gorman, K. Y. Srinivasan, Igor Mammedov, Kay Sievers,
	Konrad Rzeszutek Wilk, Boris Ostrovsky

On Mon, Jan 11, 2016 at 09:10:13AM +0100, Daniel Kiper wrote:
> On Fri, Jan 08, 2016 at 05:55:07PM +0100, Vitaly Kuznetsov wrote:
> > Daniel Kiper <daniel.kiper@oracle.com> writes:
>
> [...]
>
> > >> diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
> > >> index ce2cfcf..ceaf40c 100644
> > >> --- a/Documentation/memory-hotplug.txt
> > >> +++ b/Documentation/memory-hotplug.txt
> > >> @@ -254,12 +254,23 @@ If the memory block is online, you'll read "online".
> > >>  If the memory block is offline, you'll read "offline".
> > >>
> > >>
> > >> -5.2. How to online memory
> > >> +5.2. Memory onlining
> > >>  ------------
> > >> -Even if the memory is hot-added, it is not at ready-to-use state.
> > >> -For using newly added memory, you have to "online" the memory block.
> > >> +When the memory is hot-added, the kernel decides whether or not to "online"
> > >> +it according to the policy which can be read from "auto_online_blocks" file:
> > >>
> > >> -For onlining, you have to write "online" to the memory block's state file as:
> > >> +% cat /sys/devices/system/memory/auto_online_blocks
> > >> +
> > >> +The default is "offline" which means the newly added memory is not in a
> > >> +ready-to-use state and you have to "online" the newly added memory blocks
> > >> +manually. Automatic onlining can be requested by writing "online" to
> > >> +"auto_online_blocks" file:
> > >> +
> > >> +% echo online > /sys/devices/system/memory/auto_online_blocks
> > >> +
> > >> +If the automatic onlining wasn't requested or some memory block was offlined
> > >> +it is possible to change the individual block's state by writing to the "state"
> > >> +file:
> > >>
> > >>  % echo online > /sys/devices/system/memory/memoryXXX/state
> > >
> > > Please say clearly that offlined blocks are not onlined automatically
> > > when /sys/devices/system/memory/auto_online_blocks is set to online.
> > >
> >
> > You mean the blocks which were manually offlined won't magically come
> > back, right? Ok, I'll try.
>
> Yep, but AIUI it works in that way for all offlined blocks not only for
> earlier manually offlined ones.
>
> > >> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> > >> index 25425d3..44a618d 100644
> > >> --- a/drivers/base/memory.c
> > >> +++ b/drivers/base/memory.c
> > >> @@ -439,6 +439,37 @@ print_block_size(struct device *dev, struct device_attribute *attr,
> > >>  static DEVICE_ATTR(block_size_bytes, 0444, print_block_size, NULL);
> > >>
> > >>  /*
> > >> + * Memory auto online policy.
> > >> + */
> > >> +
> > >> +static ssize_t
> > >> +show_auto_online_blocks(struct device *dev, struct device_attribute *attr,
> > >> +			char *buf)
> > >> +{
> > >> +	if (memhp_auto_online)
> > >> +		return sprintf(buf, "online\n");
> > >> +	else
> > >> +		return sprintf(buf, "offline\n");
> > >> +}
> > >> +
> > >> +static ssize_t
> > >> +store_auto_online_blocks(struct device *dev, struct device_attribute *attr,
> > >> +			 const char *buf, size_t count)
> > >> +{
> > >> +	if (sysfs_streq(buf, "online"))
> > >> +		memhp_auto_online = true;
> > >> +	else if (sysfs_streq(buf, "offline"))
> > >> +		memhp_auto_online = false;
> > >> +	else
> > >> +		return -EINVAL;
> > >> +
> > >> +	return count;
> > >> +}
> > >> +
> > >> +static DEVICE_ATTR(auto_online_blocks, 0644, show_auto_online_blocks,
> > >> +		   store_auto_online_blocks);
> > >> +
> > >> +/*
> > >>   * Some architectures will have custom drivers to do this, and
> > >>   * will not need to do it from userspace.  The fake hot-add code
> > >>   * as well as ppc64 will do all of their discovery in userspace
> > >> @@ -737,6 +768,7 @@ static struct attribute *memory_root_attrs[] = {
> > >>  #endif
> > >>
> > >>  	&dev_attr_block_size_bytes.attr,
> > >> +	&dev_attr_auto_online_blocks.attr,
> > >>  	NULL
> > >>  };
> > >>
> > >> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> > >> index 12eab50..890c3b5 100644
> > >> --- a/drivers/xen/balloon.c
> > >> +++ b/drivers/xen/balloon.c
> > >> @@ -338,7 +338,7 @@ static enum bp_state reserve_additional_memory(void)
> > >>  	}
> > >>  #endif
> > >>
> > >> -	rc = add_memory_resource(nid, resource);
> > >> +	rc = add_memory_resource(nid, resource, false);
> > >
> > > This is partial solution and does not allow us to use new feature in Xen.
> > > Could you add separate patch which fixes this issue?
> > >
> >
> > Sure, I'd be glad to make this work for Xen too.
>
> Great! Thanks a lot!
>
> > >>  	if (rc) {
> > >>  		pr_warn("Cannot add additional memory (%i)\n", rc);
> > >>  		goto err;
> > >> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> > >> index 2ea574f..4b7949a 100644
> > >> --- a/include/linux/memory_hotplug.h
> > >> +++ b/include/linux/memory_hotplug.h
> > >> @@ -99,6 +99,8 @@ extern void __online_page_free(struct page *page);
> > >>
> > >>  extern int try_online_node(int nid);
> > >>
> > >> +extern bool memhp_auto_online;
> > >> +
> > >>  #ifdef CONFIG_MEMORY_HOTREMOVE
> > >>  extern bool is_pageblock_removable_nolock(struct page *page);
> > >>  extern int arch_remove_memory(u64 start, u64 size);
> > >> @@ -267,7 +269,7 @@ static inline void remove_memory(int nid, u64 start, u64 size) {}
> > >>  extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
> > >>  		void *arg, int (*func)(struct memory_block *, void *));
> > >>  extern int add_memory(int nid, u64 start, u64 size);
> > >> -extern int add_memory_resource(int nid, struct resource *resource);
> > >> +extern int add_memory_resource(int nid, struct resource *resource, bool online);
> > >>  extern int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
> > >>  		bool for_device);
> > >>  extern int arch_add_memory(int nid, u64 start, u64 size, bool for_device);
> > >> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > >> index a042a9d..0ecf860 100644
> > >> --- a/mm/memory_hotplug.c
> > >> +++ b/mm/memory_hotplug.c
> > >> @@ -76,6 +76,9 @@ static struct {
> > >>  #define memhp_lock_acquire()      lock_map_acquire(&mem_hotplug.dep_map)
> > >>  #define memhp_lock_release()      lock_map_release(&mem_hotplug.dep_map)
> > >>
> > >> +bool memhp_auto_online;
> > >> +EXPORT_SYMBOL_GPL(memhp_auto_online);
> > >> +
> > >>  void get_online_mems(void)
> > >>  {
> > >>  	might_sleep();
> > >> @@ -1232,7 +1235,7 @@ int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
> > >>  }
> > >>
> > >>  /* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */
> > >> -int __ref add_memory_resource(int nid, struct resource *res)
> > >> +int __ref add_memory_resource(int nid, struct resource *res, bool online)
> > >>  {
> > >>  	u64 start, size;
> > >>  	pg_data_t *pgdat = NULL;
> > >> @@ -1292,6 +1295,11 @@ int __ref add_memory_resource(int nid, struct resource *res)
> > >>  	/* create new memmap entry */
> > >>  	firmware_map_add_hotplug(start, start + size, "System RAM");
> > >>
> > >> +	/* online pages if requested */
> > >> +	if (online)
> > >> +		online_pages(start >> PAGE_SHIFT, size >> PAGE_SHIFT,
> > >> +			     MMOP_ONLINE_KEEP);
> > >> +
> > >
> > > This way we go in deadlock if auto online feature is enabled in Xen (this was
> > > pointed out by David Vrabel).
> >
> > Yes, but as I said the patch doesn't change anything for Xen guests for
> > now, we always call add_memory_resource() with online = false.
> >
> > > And we want to have it working out of the box.
> > > So, I think that we should find proper solution. I suppose that we can schedule
> > > a task here which auto online attached blocks. Hmmm... Not nice but should work.
> > > Or maybe you have better idea how to fix this issue.
> >
> > I'd like to avoid additional delays and memory allocations between
> > adding new memory and onlining it (and this is the main purpose of the
> > patch). Maybe we can have a tristate online parameter ('online_now',
> > 'online_delay', 'keep_offlined') and handle it
> > accordingly. Alternatively I can suggest we have the onlining in Xen
> > balloon driver code, memhp_auto_online is exported so we can call
> > online_pages() after we release the ballon_mutex.
>
> This is not nice too. I prefer the same code path for every case.
> Give me some time. I will think how to solve that issue.

It looks that we can safely call mutex_unlock() just before add_memory_resource()
call and retake lock immediately after add_memory_resource(). add_memory_resource()
itself does not play with balloon stuff and even if online_pages() does then it
take balloon_mutex in right place. Additionally, only one balloon task can run,
so, I think that we are on safe side. Am I right?

Daniel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] memory-hotplug: add automatic onlining policy for the newly added memory
  2016-01-11 11:59     ` Vitaly Kuznetsov
@ 2016-01-11 13:21       ` David Vrabel
  -1 siblings, 0 replies; 23+ messages in thread
From: David Vrabel @ 2016-01-11 13:21 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: linux-mm, linux-kernel, linux-acpi, Jonathan Corbet,
	Greg Kroah-Hartman, Daniel Kiper, Dan Williams, Tang Chen,
	David Rientjes, Andrew Morton, Naoya Horiguchi, Xishi Qiu,
	Mel Gorman, K. Y. Srinivasan, Igor Mammedov, Kay Sievers,
	Konrad Rzeszutek Wilk, Boris Ostrovsky

On 11/01/16 11:59, Vitaly Kuznetsov wrote:
> David Vrabel <david.vrabel@citrix.com> writes:
> 
>> On 07/01/16 17:23, Vitaly Kuznetsov wrote:
>>>
>>> - Changes since 'v1':
>>>   Add 'online' parameter to add_memory_resource() as it is being used by
>>>   xen ballon driver and it adds "empty" memory pages [David Vrabel].
>>>   (I don't completely understand what prevents manual onlining in this
>>>    case as we still have all newly added blocks in sysfs ... this is the
>>>    discussion point.)
>>
> 
> (there is a discussion with Daniel on the same topic in a parallel
> thread)
> 
>> I'm not sure what you're not understanding here?
>>
>> Memory added by the Xen balloon driver (whether populated with real
>> memory or not) does need to be onlined by udev or similar.
> 
> 
> Yes, same as all other memory hotplug mechanisms (hyper-v's balloon
> driver and acpi memory hotplug). My patch adds an option to make this
> happen automatically. Xen driver is currently excluded because of a
> deadlock. If this deadlock is the only problem we can easily change
> taking the lock to checking that the lock was taken (and taking in case
> it wasn't) or something similar and everything is going to work. From
> briefly looking at the code it seems to me it's going to work.

I don't think Linux has recursive mutex that we could use for this.

> What I wasn't sure about is 'empty pages' you were mentioning. In case
> there are some pages we can't online there should be a protection
> mechanism actively preventing them from going online (similar to
> hv_online_page() in Hyper-V driver) as this patch does nothing
> 'special' compared to udev onlining newly added blocks. 

'Empty' (or unpopulated) pages are those without any physical RAM
backing them.  Backend drivers map foreign (from other guest) pages into
these unpopulated pages.  i.e., accesses by the kernel to the virtual
addresses of these pages access memory shared by another guest.

These empty pages are ones we would prefer to be always automatically
onlined because they're hotplugged in response to requests from the
backend drivers.

Anyway, this series is fine with this Xen balloon driver limitation --
it can be addresses at a later date if necessary.

David

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] memory-hotplug: add automatic onlining policy for the newly added memory
@ 2016-01-11 13:21       ` David Vrabel
  0 siblings, 0 replies; 23+ messages in thread
From: David Vrabel @ 2016-01-11 13:21 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: linux-mm, linux-kernel, linux-acpi, Jonathan Corbet,
	Greg Kroah-Hartman, Daniel Kiper, Dan Williams, Tang Chen,
	David Rientjes, Andrew Morton, Naoya Horiguchi, Xishi Qiu,
	Mel Gorman, K. Y. Srinivasan, Igor Mammedov, Kay Sievers,
	Konrad Rzeszutek Wilk, Boris Ostrovsky

On 11/01/16 11:59, Vitaly Kuznetsov wrote:
> David Vrabel <david.vrabel@citrix.com> writes:
> 
>> On 07/01/16 17:23, Vitaly Kuznetsov wrote:
>>>
>>> - Changes since 'v1':
>>>   Add 'online' parameter to add_memory_resource() as it is being used by
>>>   xen ballon driver and it adds "empty" memory pages [David Vrabel].
>>>   (I don't completely understand what prevents manual onlining in this
>>>    case as we still have all newly added blocks in sysfs ... this is the
>>>    discussion point.)
>>
> 
> (there is a discussion with Daniel on the same topic in a parallel
> thread)
> 
>> I'm not sure what you're not understanding here?
>>
>> Memory added by the Xen balloon driver (whether populated with real
>> memory or not) does need to be onlined by udev or similar.
> 
> 
> Yes, same as all other memory hotplug mechanisms (hyper-v's balloon
> driver and acpi memory hotplug). My patch adds an option to make this
> happen automatically. Xen driver is currently excluded because of a
> deadlock. If this deadlock is the only problem we can easily change
> taking the lock to checking that the lock was taken (and taking in case
> it wasn't) or something similar and everything is going to work. From
> briefly looking at the code it seems to me it's going to work.

I don't think Linux has recursive mutex that we could use for this.

> What I wasn't sure about is 'empty pages' you were mentioning. In case
> there are some pages we can't online there should be a protection
> mechanism actively preventing them from going online (similar to
> hv_online_page() in Hyper-V driver) as this patch does nothing
> 'special' compared to udev onlining newly added blocks. 

'Empty' (or unpopulated) pages are those without any physical RAM
backing them.  Backend drivers map foreign (from other guest) pages into
these unpopulated pages.  i.e., accesses by the kernel to the virtual
addresses of these pages access memory shared by another guest.

These empty pages are ones we would prefer to be always automatically
onlined because they're hotplugged in response to requests from the
backend drivers.

Anyway, this series is fine with this Xen balloon driver limitation --
it can be addresses at a later date if necessary.

David

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] memory-hotplug: add automatic onlining policy for the newly added memory
  2016-01-11 12:42         ` Daniel Kiper
@ 2016-01-11 15:03           ` Vitaly Kuznetsov
  -1 siblings, 0 replies; 23+ messages in thread
From: Vitaly Kuznetsov @ 2016-01-11 15:03 UTC (permalink / raw)
  To: Daniel Kiper
  Cc: linux-mm, linux-kernel, linux-acpi, Jonathan Corbet,
	Greg Kroah-Hartman, Dan Williams, Tang Chen, David Vrabel,
	David Rientjes, Andrew Morton, Naoya Horiguchi, Xishi Qiu,
	Mel Gorman, K. Y. Srinivasan, Igor Mammedov, Kay Sievers,
	Konrad Rzeszutek Wilk, Boris Ostrovsky

Daniel Kiper <daniel.kiper@oracle.com> writes:

[skip]

>> > > And we want to have it working out of the box.
>> > > So, I think that we should find proper solution. I suppose that we can schedule
>> > > a task here which auto online attached blocks. Hmmm... Not nice but should work.
>> > > Or maybe you have better idea how to fix this issue.
>> >
>> > I'd like to avoid additional delays and memory allocations between
>> > adding new memory and onlining it (and this is the main purpose of the
>> > patch). Maybe we can have a tristate online parameter ('online_now',
>> > 'online_delay', 'keep_offlined') and handle it
>> > accordingly. Alternatively I can suggest we have the onlining in Xen
>> > balloon driver code, memhp_auto_online is exported so we can call
>> > online_pages() after we release the ballon_mutex.
>>
>> This is not nice too. I prefer the same code path for every case.
>> Give me some time. I will think how to solve that issue.
>
> It looks that we can safely call mutex_unlock() just before add_memory_resource()
> call and retake lock immediately after add_memory_resource(). add_memory_resource()
> itself does not play with balloon stuff and even if online_pages() does then it
> take balloon_mutex in right place. Additionally, only one balloon task can run,
> so, I think that we are on safe side. Am I right?

I think you are as balloon_mutex is internal to xen driver and there is
only one balloon_process() running at the time. I just smoke-tested the
following:

commit 0fce4746a0090d533e9302cc42b3d3c0645d756d
Author: Vitaly Kuznetsov <vkuznets@redhat.com>
Date:   Mon Jan 11 14:22:11 2016 +0100

    xen_balloon: make hotplug auto online work
    
    Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>

diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 890c3b5..08bbf35 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -338,7 +338,10 @@ static enum bp_state reserve_additional_memory(void)
 	}
 #endif
 
-	rc = add_memory_resource(nid, resource, false);
+	mutex_unlock(&balloon_mutex);
+	rc = add_memory_resource(nid, resource, memhp_auto_online);
+	mutex_lock(&balloon_mutex);
+
 	if (rc) {
 		pr_warn("Cannot add additional memory (%i)\n", rc);
 		goto err;
@@ -565,8 +568,10 @@ static void balloon_process(struct work_struct *work)
 		if (credit > 0) {
 			if (balloon_is_inflated())
 				state = increase_reservation(credit);
-			else
+			else {
+				printk("balloon_process: adding memory (credit: %ld)!\n", credit);
 				state = reserve_additional_memory();
+			}
 		}
 
 		if (credit < 0)

And it seems to work (unrelated rant: 'xl mem-set' after 'xl max-mem'
doesn't work with "libxl: error: libxl.c:4809:libxl_set_memory_target:
memory_dynamic_max must be less than or equal to memory_static_max". At
the same time I'm able to increase the reservation with "echo
NEW_VALUE > /sys/devices/system/xen_memory/xen_memory0/target_kb" from
inside the guest. Was it supposed to be like that?).

While the patch misses the logic for empty pages (see David's comment in
a parallel thread) it should work for the general case the same way
auto-online works for Hyper-V and ACPI memory hotplugs.

-- 
  Vitaly

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] memory-hotplug: add automatic onlining policy for the newly added memory
@ 2016-01-11 15:03           ` Vitaly Kuznetsov
  0 siblings, 0 replies; 23+ messages in thread
From: Vitaly Kuznetsov @ 2016-01-11 15:03 UTC (permalink / raw)
  To: Daniel Kiper
  Cc: linux-mm, linux-kernel, linux-acpi, Jonathan Corbet,
	Greg Kroah-Hartman, Dan Williams, Tang Chen, David Vrabel,
	David Rientjes, Andrew Morton, Naoya Horiguchi, Xishi Qiu,
	Mel Gorman, K. Y. Srinivasan, Igor Mammedov, Kay Sievers,
	Konrad Rzeszutek Wilk, Boris Ostrovsky

Daniel Kiper <daniel.kiper@oracle.com> writes:

[skip]

>> > > And we want to have it working out of the box.
>> > > So, I think that we should find proper solution. I suppose that we can schedule
>> > > a task here which auto online attached blocks. Hmmm... Not nice but should work.
>> > > Or maybe you have better idea how to fix this issue.
>> >
>> > I'd like to avoid additional delays and memory allocations between
>> > adding new memory and onlining it (and this is the main purpose of the
>> > patch). Maybe we can have a tristate online parameter ('online_now',
>> > 'online_delay', 'keep_offlined') and handle it
>> > accordingly. Alternatively I can suggest we have the onlining in Xen
>> > balloon driver code, memhp_auto_online is exported so we can call
>> > online_pages() after we release the ballon_mutex.
>>
>> This is not nice too. I prefer the same code path for every case.
>> Give me some time. I will think how to solve that issue.
>
> It looks that we can safely call mutex_unlock() just before add_memory_resource()
> call and retake lock immediately after add_memory_resource(). add_memory_resource()
> itself does not play with balloon stuff and even if online_pages() does then it
> take balloon_mutex in right place. Additionally, only one balloon task can run,
> so, I think that we are on safe side. Am I right?

I think you are as balloon_mutex is internal to xen driver and there is
only one balloon_process() running at the time. I just smoke-tested the
following:

commit 0fce4746a0090d533e9302cc42b3d3c0645d756d
Author: Vitaly Kuznetsov <vkuznets@redhat.com>
Date:   Mon Jan 11 14:22:11 2016 +0100

    xen_balloon: make hotplug auto online work
    
    Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>

diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 890c3b5..08bbf35 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -338,7 +338,10 @@ static enum bp_state reserve_additional_memory(void)
 	}
 #endif
 
-	rc = add_memory_resource(nid, resource, false);
+	mutex_unlock(&balloon_mutex);
+	rc = add_memory_resource(nid, resource, memhp_auto_online);
+	mutex_lock(&balloon_mutex);
+
 	if (rc) {
 		pr_warn("Cannot add additional memory (%i)\n", rc);
 		goto err;
@@ -565,8 +568,10 @@ static void balloon_process(struct work_struct *work)
 		if (credit > 0) {
 			if (balloon_is_inflated())
 				state = increase_reservation(credit);
-			else
+			else {
+				printk("balloon_process: adding memory (credit: %ld)!\n", credit);
 				state = reserve_additional_memory();
+			}
 		}
 
 		if (credit < 0)

And it seems to work (unrelated rant: 'xl mem-set' after 'xl max-mem'
doesn't work with "libxl: error: libxl.c:4809:libxl_set_memory_target:
memory_dynamic_max must be less than or equal to memory_static_max". At
the same time I'm able to increase the reservation with "echo
NEW_VALUE > /sys/devices/system/xen_memory/xen_memory0/target_kb" from
inside the guest. Was it supposed to be like that?).

While the patch misses the logic for empty pages (see David's comment in
a parallel thread) it should work for the general case the same way
auto-online works for Hyper-V and ACPI memory hotplugs.

-- 
  Vitaly

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] memory-hotplug: add automatic onlining policy for the newly added memory
  2016-01-11 15:03           ` Vitaly Kuznetsov
@ 2016-01-11 16:22             ` Daniel Kiper
  -1 siblings, 0 replies; 23+ messages in thread
From: Daniel Kiper @ 2016-01-11 16:22 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: linux-mm, linux-kernel, linux-acpi, Jonathan Corbet,
	Greg Kroah-Hartman, Dan Williams, Tang Chen, David Vrabel,
	David Rientjes, Andrew Morton, Naoya Horiguchi, Xishi Qiu,
	Mel Gorman, K. Y. Srinivasan, Igor Mammedov, Kay Sievers,
	Konrad Rzeszutek Wilk, Boris Ostrovsky

On Mon, Jan 11, 2016 at 04:03:35PM +0100, Vitaly Kuznetsov wrote:
> Daniel Kiper <daniel.kiper@oracle.com> writes:
>
> [skip]
>
> >> > > And we want to have it working out of the box.
> >> > > So, I think that we should find proper solution. I suppose that we can schedule
> >> > > a task here which auto online attached blocks. Hmmm... Not nice but should work.
> >> > > Or maybe you have better idea how to fix this issue.
> >> >
> >> > I'd like to avoid additional delays and memory allocations between
> >> > adding new memory and onlining it (and this is the main purpose of the
> >> > patch). Maybe we can have a tristate online parameter ('online_now',
> >> > 'online_delay', 'keep_offlined') and handle it
> >> > accordingly. Alternatively I can suggest we have the onlining in Xen
> >> > balloon driver code, memhp_auto_online is exported so we can call
> >> > online_pages() after we release the ballon_mutex.
> >>
> >> This is not nice too. I prefer the same code path for every case.
> >> Give me some time. I will think how to solve that issue.
> >
> > It looks that we can safely call mutex_unlock() just before add_memory_resource()
> > call and retake lock immediately after add_memory_resource(). add_memory_resource()
> > itself does not play with balloon stuff and even if online_pages() does then it
> > take balloon_mutex in right place. Additionally, only one balloon task can run,
> > so, I think that we are on safe side. Am I right?
>
> I think you are as balloon_mutex is internal to xen driver and there is
> only one balloon_process() running at the time. I just smoke-tested the
> following:
>
> commit 0fce4746a0090d533e9302cc42b3d3c0645d756d
> Author: Vitaly Kuznetsov <vkuznets@redhat.com>
> Date:   Mon Jan 11 14:22:11 2016 +0100
>
>     xen_balloon: make hotplug auto online work
>
>     Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index 890c3b5..08bbf35 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -338,7 +338,10 @@ static enum bp_state reserve_additional_memory(void)
>  	}
>  #endif
>
> -	rc = add_memory_resource(nid, resource, false);
> +	mutex_unlock(&balloon_mutex);
> +	rc = add_memory_resource(nid, resource, memhp_auto_online);
> +	mutex_lock(&balloon_mutex);
> +
>  	if (rc) {
>  		pr_warn("Cannot add additional memory (%i)\n", rc);
>  		goto err;
> @@ -565,8 +568,10 @@ static void balloon_process(struct work_struct *work)
>  		if (credit > 0) {
>  			if (balloon_is_inflated())
>  				state = increase_reservation(credit);
> -			else
> +			else {
> +				printk("balloon_process: adding memory (credit: %ld)!\n", credit);
>  				state = reserve_additional_memory();
> +			}
>  		}
>
>  		if (credit < 0)
>
> And it seems to work (unrelated rant: 'xl mem-set' after 'xl max-mem'

Great! Thanks!

Let's go further. Please add bool online argument to reserve_additional_memory() and
then call add_memory_resource() with it. Then call reserve_additional_memory() with
memhp_auto_online from balloon_process() and with false from add_ballooned_pages(). Voila!

Please do not forget to add comment for mutex_unlock() and mutex_lock()
around add_memory_resource() (why it is needed and why it works correctly).

> doesn't work with "libxl: error: libxl.c:4809:libxl_set_memory_target:
> memory_dynamic_max must be less than or equal to memory_static_max". At

Ignore that. It must be fixed and it is on my TODO list. However, I am busy
with more important stuff right now.

Daniel

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] memory-hotplug: add automatic onlining policy for the newly added memory
@ 2016-01-11 16:22             ` Daniel Kiper
  0 siblings, 0 replies; 23+ messages in thread
From: Daniel Kiper @ 2016-01-11 16:22 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: linux-mm, linux-kernel, linux-acpi, Jonathan Corbet,
	Greg Kroah-Hartman, Dan Williams, Tang Chen, David Vrabel,
	David Rientjes, Andrew Morton, Naoya Horiguchi, Xishi Qiu,
	Mel Gorman, K. Y. Srinivasan, Igor Mammedov, Kay Sievers,
	Konrad Rzeszutek Wilk, Boris Ostrovsky

On Mon, Jan 11, 2016 at 04:03:35PM +0100, Vitaly Kuznetsov wrote:
> Daniel Kiper <daniel.kiper@oracle.com> writes:
>
> [skip]
>
> >> > > And we want to have it working out of the box.
> >> > > So, I think that we should find proper solution. I suppose that we can schedule
> >> > > a task here which auto online attached blocks. Hmmm... Not nice but should work.
> >> > > Or maybe you have better idea how to fix this issue.
> >> >
> >> > I'd like to avoid additional delays and memory allocations between
> >> > adding new memory and onlining it (and this is the main purpose of the
> >> > patch). Maybe we can have a tristate online parameter ('online_now',
> >> > 'online_delay', 'keep_offlined') and handle it
> >> > accordingly. Alternatively I can suggest we have the onlining in Xen
> >> > balloon driver code, memhp_auto_online is exported so we can call
> >> > online_pages() after we release the ballon_mutex.
> >>
> >> This is not nice too. I prefer the same code path for every case.
> >> Give me some time. I will think how to solve that issue.
> >
> > It looks that we can safely call mutex_unlock() just before add_memory_resource()
> > call and retake lock immediately after add_memory_resource(). add_memory_resource()
> > itself does not play with balloon stuff and even if online_pages() does then it
> > take balloon_mutex in right place. Additionally, only one balloon task can run,
> > so, I think that we are on safe side. Am I right?
>
> I think you are as balloon_mutex is internal to xen driver and there is
> only one balloon_process() running at the time. I just smoke-tested the
> following:
>
> commit 0fce4746a0090d533e9302cc42b3d3c0645d756d
> Author: Vitaly Kuznetsov <vkuznets@redhat.com>
> Date:   Mon Jan 11 14:22:11 2016 +0100
>
>     xen_balloon: make hotplug auto online work
>
>     Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index 890c3b5..08bbf35 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -338,7 +338,10 @@ static enum bp_state reserve_additional_memory(void)
>  	}
>  #endif
>
> -	rc = add_memory_resource(nid, resource, false);
> +	mutex_unlock(&balloon_mutex);
> +	rc = add_memory_resource(nid, resource, memhp_auto_online);
> +	mutex_lock(&balloon_mutex);
> +
>  	if (rc) {
>  		pr_warn("Cannot add additional memory (%i)\n", rc);
>  		goto err;
> @@ -565,8 +568,10 @@ static void balloon_process(struct work_struct *work)
>  		if (credit > 0) {
>  			if (balloon_is_inflated())
>  				state = increase_reservation(credit);
> -			else
> +			else {
> +				printk("balloon_process: adding memory (credit: %ld)!\n", credit);
>  				state = reserve_additional_memory();
> +			}
>  		}
>
>  		if (credit < 0)
>
> And it seems to work (unrelated rant: 'xl mem-set' after 'xl max-mem'

Great! Thanks!

Let's go further. Please add bool online argument to reserve_additional_memory() and
then call add_memory_resource() with it. Then call reserve_additional_memory() with
memhp_auto_online from balloon_process() and with false from add_ballooned_pages(). Voila!

Please do not forget to add comment for mutex_unlock() and mutex_lock()
around add_memory_resource() (why it is needed and why it works correctly).

> doesn't work with "libxl: error: libxl.c:4809:libxl_set_memory_target:
> memory_dynamic_max must be less than or equal to memory_static_max". At

Ignore that. It must be fixed and it is on my TODO list. However, I am busy
with more important stuff right now.

Daniel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] memory-hotplug: add automatic onlining policy for the newly added memory
  2016-01-11 16:22             ` Daniel Kiper
@ 2016-01-12  7:44               ` Daniel Kiper
  -1 siblings, 0 replies; 23+ messages in thread
From: Daniel Kiper @ 2016-01-12  7:44 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: linux-mm, linux-kernel, linux-acpi, Jonathan Corbet,
	Greg Kroah-Hartman, Dan Williams, Tang Chen, David Vrabel,
	David Rientjes, Andrew Morton, Naoya Horiguchi, Xishi Qiu,
	Mel Gorman, K. Y. Srinivasan, Igor Mammedov, Kay Sievers,
	Konrad Rzeszutek Wilk, Boris Ostrovsky

On Mon, Jan 11, 2016 at 05:22:58PM +0100, Daniel Kiper wrote:
> On Mon, Jan 11, 2016 at 04:03:35PM +0100, Vitaly Kuznetsov wrote:
> > Daniel Kiper <daniel.kiper@oracle.com> writes:
> >
> > [skip]
> >
> > >> > > And we want to have it working out of the box.
> > >> > > So, I think that we should find proper solution. I suppose that we can schedule
> > >> > > a task here which auto online attached blocks. Hmmm... Not nice but should work.
> > >> > > Or maybe you have better idea how to fix this issue.
> > >> >
> > >> > I'd like to avoid additional delays and memory allocations between
> > >> > adding new memory and onlining it (and this is the main purpose of the
> > >> > patch). Maybe we can have a tristate online parameter ('online_now',
> > >> > 'online_delay', 'keep_offlined') and handle it
> > >> > accordingly. Alternatively I can suggest we have the onlining in Xen
> > >> > balloon driver code, memhp_auto_online is exported so we can call
> > >> > online_pages() after we release the ballon_mutex.
> > >>
> > >> This is not nice too. I prefer the same code path for every case.
> > >> Give me some time. I will think how to solve that issue.
> > >
> > > It looks that we can safely call mutex_unlock() just before add_memory_resource()
> > > call and retake lock immediately after add_memory_resource(). add_memory_resource()
> > > itself does not play with balloon stuff and even if online_pages() does then it
> > > take balloon_mutex in right place. Additionally, only one balloon task can run,
> > > so, I think that we are on safe side. Am I right?
> >
> > I think you are as balloon_mutex is internal to xen driver and there is
> > only one balloon_process() running at the time. I just smoke-tested the
> > following:
> >
> > commit 0fce4746a0090d533e9302cc42b3d3c0645d756d
> > Author: Vitaly Kuznetsov <vkuznets@redhat.com>
> > Date:   Mon Jan 11 14:22:11 2016 +0100
> >
> >     xen_balloon: make hotplug auto online work
> >
> >     Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> >
> > diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> > index 890c3b5..08bbf35 100644
> > --- a/drivers/xen/balloon.c
> > +++ b/drivers/xen/balloon.c
> > @@ -338,7 +338,10 @@ static enum bp_state reserve_additional_memory(void)
> >  	}
> >  #endif
> >
> > -	rc = add_memory_resource(nid, resource, false);
> > +	mutex_unlock(&balloon_mutex);
> > +	rc = add_memory_resource(nid, resource, memhp_auto_online);
> > +	mutex_lock(&balloon_mutex);
> > +
> >  	if (rc) {
> >  		pr_warn("Cannot add additional memory (%i)\n", rc);
> >  		goto err;
> > @@ -565,8 +568,10 @@ static void balloon_process(struct work_struct *work)
> >  		if (credit > 0) {
> >  			if (balloon_is_inflated())
> >  				state = increase_reservation(credit);
> > -			else
> > +			else {
> > +				printk("balloon_process: adding memory (credit: %ld)!\n", credit);
> >  				state = reserve_additional_memory();
> > +			}
> >  		}
> >
> >  		if (credit < 0)
> >
> > And it seems to work (unrelated rant: 'xl mem-set' after 'xl max-mem'
>
> Great! Thanks!
>
> Let's go further. Please add bool online argument to reserve_additional_memory() and
> then call add_memory_resource() with it. Then call reserve_additional_memory() with
> memhp_auto_online from balloon_process() and with false from add_ballooned_pages(). Voila!
>
> Please do not forget to add comment for mutex_unlock() and mutex_lock()
> around add_memory_resource() (why it is needed and why it works correctly).

I forgot about one thing. Please update help for XEN_BALLOON_MEMORY_HOTPLUG
config option accordingly.

Daniel

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] memory-hotplug: add automatic onlining policy for the newly added memory
@ 2016-01-12  7:44               ` Daniel Kiper
  0 siblings, 0 replies; 23+ messages in thread
From: Daniel Kiper @ 2016-01-12  7:44 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: linux-mm, linux-kernel, linux-acpi, Jonathan Corbet,
	Greg Kroah-Hartman, Dan Williams, Tang Chen, David Vrabel,
	David Rientjes, Andrew Morton, Naoya Horiguchi, Xishi Qiu,
	Mel Gorman, K. Y. Srinivasan, Igor Mammedov, Kay Sievers,
	Konrad Rzeszutek Wilk, Boris Ostrovsky

On Mon, Jan 11, 2016 at 05:22:58PM +0100, Daniel Kiper wrote:
> On Mon, Jan 11, 2016 at 04:03:35PM +0100, Vitaly Kuznetsov wrote:
> > Daniel Kiper <daniel.kiper@oracle.com> writes:
> >
> > [skip]
> >
> > >> > > And we want to have it working out of the box.
> > >> > > So, I think that we should find proper solution. I suppose that we can schedule
> > >> > > a task here which auto online attached blocks. Hmmm... Not nice but should work.
> > >> > > Or maybe you have better idea how to fix this issue.
> > >> >
> > >> > I'd like to avoid additional delays and memory allocations between
> > >> > adding new memory and onlining it (and this is the main purpose of the
> > >> > patch). Maybe we can have a tristate online parameter ('online_now',
> > >> > 'online_delay', 'keep_offlined') and handle it
> > >> > accordingly. Alternatively I can suggest we have the onlining in Xen
> > >> > balloon driver code, memhp_auto_online is exported so we can call
> > >> > online_pages() after we release the ballon_mutex.
> > >>
> > >> This is not nice too. I prefer the same code path for every case.
> > >> Give me some time. I will think how to solve that issue.
> > >
> > > It looks that we can safely call mutex_unlock() just before add_memory_resource()
> > > call and retake lock immediately after add_memory_resource(). add_memory_resource()
> > > itself does not play with balloon stuff and even if online_pages() does then it
> > > take balloon_mutex in right place. Additionally, only one balloon task can run,
> > > so, I think that we are on safe side. Am I right?
> >
> > I think you are as balloon_mutex is internal to xen driver and there is
> > only one balloon_process() running at the time. I just smoke-tested the
> > following:
> >
> > commit 0fce4746a0090d533e9302cc42b3d3c0645d756d
> > Author: Vitaly Kuznetsov <vkuznets@redhat.com>
> > Date:   Mon Jan 11 14:22:11 2016 +0100
> >
> >     xen_balloon: make hotplug auto online work
> >
> >     Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> >
> > diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> > index 890c3b5..08bbf35 100644
> > --- a/drivers/xen/balloon.c
> > +++ b/drivers/xen/balloon.c
> > @@ -338,7 +338,10 @@ static enum bp_state reserve_additional_memory(void)
> >  	}
> >  #endif
> >
> > -	rc = add_memory_resource(nid, resource, false);
> > +	mutex_unlock(&balloon_mutex);
> > +	rc = add_memory_resource(nid, resource, memhp_auto_online);
> > +	mutex_lock(&balloon_mutex);
> > +
> >  	if (rc) {
> >  		pr_warn("Cannot add additional memory (%i)\n", rc);
> >  		goto err;
> > @@ -565,8 +568,10 @@ static void balloon_process(struct work_struct *work)
> >  		if (credit > 0) {
> >  			if (balloon_is_inflated())
> >  				state = increase_reservation(credit);
> > -			else
> > +			else {
> > +				printk("balloon_process: adding memory (credit: %ld)!\n", credit);
> >  				state = reserve_additional_memory();
> > +			}
> >  		}
> >
> >  		if (credit < 0)
> >
> > And it seems to work (unrelated rant: 'xl mem-set' after 'xl max-mem'
>
> Great! Thanks!
>
> Let's go further. Please add bool online argument to reserve_additional_memory() and
> then call add_memory_resource() with it. Then call reserve_additional_memory() with
> memhp_auto_online from balloon_process() and with false from add_ballooned_pages(). Voila!
>
> Please do not forget to add comment for mutex_unlock() and mutex_lock()
> around add_memory_resource() (why it is needed and why it works correctly).

I forgot about one thing. Please update help for XEN_BALLOON_MEMORY_HOTPLUG
config option accordingly.

Daniel

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2016-01-12  7:45 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-07 17:23 [PATCH v3] memory-hotplug: add automatic onlining policy for the newly added memory Vitaly Kuznetsov
2016-01-07 17:23 ` Vitaly Kuznetsov
2016-01-08 14:01 ` Daniel Kiper
2016-01-08 14:01   ` Daniel Kiper
2016-01-08 16:55   ` Vitaly Kuznetsov
2016-01-08 16:55     ` Vitaly Kuznetsov
2016-01-11  8:10     ` Daniel Kiper
2016-01-11  8:10       ` Daniel Kiper
2016-01-11 12:42       ` Daniel Kiper
2016-01-11 12:42         ` Daniel Kiper
2016-01-11 15:03         ` Vitaly Kuznetsov
2016-01-11 15:03           ` Vitaly Kuznetsov
2016-01-11 16:22           ` Daniel Kiper
2016-01-11 16:22             ` Daniel Kiper
2016-01-12  7:44             ` Daniel Kiper
2016-01-12  7:44               ` Daniel Kiper
2016-01-11 11:01 ` David Vrabel
2016-01-11 11:01   ` David Vrabel
2016-01-11 11:59   ` Vitaly Kuznetsov
2016-01-11 11:59     ` Vitaly Kuznetsov
2016-01-11 11:59     ` Vitaly Kuznetsov
2016-01-11 13:21     ` David Vrabel
2016-01-11 13:21       ` David Vrabel

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.