linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1 0/5] mm/memory_hotplug: allow to specify a default online_type
@ 2020-03-11 12:30 David Hildenbrand
  2020-03-11 12:30 ` [PATCH v1 1/5] drivers/base/memory: rename MMOP_ONLINE_KEEP to MMOP_ONLINE David Hildenbrand
                   ` (4 more replies)
  0 siblings, 5 replies; 24+ messages in thread
From: David Hildenbrand @ 2020-03-11 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linuxppc-dev, linux-hyperv, David Hildenbrand,
	Vitaly Kuznetsov, Yumei Huang, Igor Mammedov, Baoquan He,
	Eduardo Habkost, Milan Zamazal, Andrew Morton,
	Benjamin Herrenschmidt, Greg Kroah-Hartman, Haiyang Zhang,
	K. Y. Srinivasan, Michael Ellerman, Michal Hocko, Oscar Salvador,
	Paul Mackerras, Rafael J. Wysocki, Stephen Hemminger,
	Thomas Gleixner, Wei Liu, Wei Yang

Distributions nowadays use udev rules ([1] [2]) to specify if and
how to online hotplugged memory. The rules seem to get more complex with
many special cases. Due to the various special cases,
CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE cannot be used. All memory hotplug
is handled via udev rules.

Everytime we hotplug memory, the udev rule will come to the same
conclusion. Especially Hyper-V (but also soon virtio-mem) add a lot of
memory in separate memory blocks and wait for memory to get onlined by user
space before continuing to add more memory blocks (to not add memory faster
than it is getting onlined). This of course slows down the whole memory
hotplug process.

To make the job of distributions easier and to avoid udev rules that get
more and more complicated, let's extend the mechanism provided by
- /sys/devices/system/memory/auto_online_blocks
- "memhp_default_state=" on the kernel cmdline
to be able to specify also "online_movable" as well as "online_kernel"

=== Example /usr/libexec/config-memhotplug ===

#!/bin/bash

VIRT=`systemd-detect-virt --vm`
ARCH=`uname -p`

sense_virtio_mem() {
  if [ -d "/sys/bus/virtio/drivers/virtio_mem/" ]; then
    DEVICES=`find /sys/bus/virtio/drivers/virtio_mem/ -maxdepth 1 -type l | wc -l`
    if [ $DEVICES != "0" ]; then
        return 0
    fi
  fi
  return 1
}

if [ ! -e "/sys/devices/system/memory/auto_online_blocks" ]; then
  echo "Memory hotplug configuration support missing in the kernel"
  exit 1
fi

if grep "memhp_default_state=" /proc/cmdline > /dev/null; then
  echo "Memory hotplug configuration overridden in kernel cmdline (memhp_default_state=)"
  exit 1
fi

if [ $VIRT == "microsoft" ]; then
  echo "Detected Hyper-V on $ARCH"
  # Hyper-V wants all memory in ZONE_NORMAL
  ONLINE_TYPE="online_kernel"
elif sense_virtio_mem; then
  echo "Detected virtio-mem on $ARCH"
  # virtio-mem wants all memory in ZONE_NORMAL
  ONLINE_TYPE="online_kernel"
elif [ $ARCH == "s390x" ] || [ $ARCH == "s390" ]; then
  echo "Detected $ARCH"
  # standby memory should not be onlined automatically
  ONLINE_TYPE="offline"
elif [ $ARCH == "ppc64" ] || [ $ARCH == "ppc64le" ]; then
  echo "Detected" $ARCH
  # PPC64 is handled via a user space daemon AFAIK
  ONLINE_TYPE="offline"
elif [ $VIRT == "none" ]; then
  echo "Detected bare-metal on $ARCH"
  # Bare metal users expect hotplugged memory to be unpluggable. We assume
  # that ZONE imbalances on such enterpise servers cannot happen and is
  # properly documented
  ONLINE_TYPE="online_movable"
else
  # TODO: Hypervisors that want to unplug DIMMs and can guarantee that ZONE
  # imbalances won't happen
  echo "Detected $VIRT on $ARCH"
  # Usually, ballooning is used in virtual environments, so memory should go to
  # ZONE_NORMAL. However, sometimes "movable_node" is relevant.
  ONLINE_TYPE="online"
fi

echo "Selected online_type:" $ONLINE_TYPE

# Configure what to do with memory that will be hotplugged in the future
echo $ONLINE_TYPE 2>/dev/null > /sys/devices/system/memory/auto_online_blocks
if [ $? != "0" ]; then
  echo "Memory hotplug cannot be configured (e.g., old kernel or missing permissions)"
  # A backup udev rule should handle old kernels if necessary
  exit 1
fi

# Process all already pluggedd blocks (e.g., DIMMs, but also Hyper-V or virtio-mem)
if [ $ONLINE_TYPE != "offline" ]; then
  for MEMORY in /sys/devices/system/memory/memory*; do
    STATE=`cat $MEMORY/state`
    if [ $STATE == "offline" ]; then
        echo $ONLINE_TYPE > $MEMORY/state
    fi
  done
fi


=== Example /usr/lib/systemd/system/config-memhotplug.service ===

[Unit]
Description=Configure memory hotplug behavior
DefaultDependencies=no
Conflicts=shutdown.target
Before=sysinit.target shutdown.target
After=systemd-modules-load.service
ConditionPathExists=|/sys/devices/system/memory/auto_online_blocks

[Service]
ExecStart=/usr/libexec/config-memhotplug
Type=oneshot
TimeoutSec=0
RemainAfterExit=yes

[Install]
WantedBy=sysinit.target


=== Example modification to the 40-redhat.rules [2] ===

diff --git a/40-redhat.rules b/40-redhat.rules-new
index 2c690e5..168fd03 100644
--- a/40-redhat.rules
+++ b/40-redhat.rules-new
@@ -6,6 +6,9 @@ SUBSYSTEM=="cpu", ACTION=="add", TEST=="online", ATTR{online}=="0", ATTR{online}
 # Memory hotadd request
 SUBSYSTEM!="memory", GOTO="memory_hotplug_end"
 ACTION!="add", GOTO="memory_hotplug_end"
+# memory hotplug behavior configured
+PROGRAM=="grep online /sys/devices/system/memory/auto_online_blocks", GOTO="memory_hotplug_end"
+
 PROGRAM="/bin/uname -p", RESULT=="s390*", GOTO="memory_hotplug_end"

 ENV{.state}="online"

===


[1] https://github.com/lnykryn/systemd-rhel/pull/281
[2] https://github.com/lnykryn/systemd-rhel/blob/staging/rules/40-redhat.rules

Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Yumei Huang <yuhuang@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Milan Zamazal <mzamazal@redhat.com>

David Hildenbrand (5):
  drivers/base/memory: rename MMOP_ONLINE_KEEP to MMOP_ONLINE
  drivers/base/memory: map MMOP_OFFLINE to 0
  drivers/base/memory: store mapping between MMOP_* and string in an
    array
  mm/memory_hotplug: convert memhp_auto_online to store an online_type
  mm/memory_hotplug: allow to specify a default online_type

 arch/powerpc/platforms/powernv/memtrace.c |  2 +-
 drivers/base/memory.c                     | 71 ++++++++++++-----------
 drivers/hv/hv_balloon.c                   |  2 +-
 include/linux/memory_hotplug.h            | 13 ++++-
 mm/memory_hotplug.c                       | 17 +++---
 5 files changed, 58 insertions(+), 47 deletions(-)

-- 
2.24.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v1 1/5] drivers/base/memory: rename MMOP_ONLINE_KEEP to MMOP_ONLINE
  2020-03-11 12:30 [PATCH v1 0/5] mm/memory_hotplug: allow to specify a default online_type David Hildenbrand
@ 2020-03-11 12:30 ` David Hildenbrand
  2020-03-11 14:17   ` Wei Yang
  2020-03-16 15:12   ` Michal Hocko
  2020-03-11 12:30 ` [PATCH v1 2/5] drivers/base/memory: map MMOP_OFFLINE to 0 David Hildenbrand
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 24+ messages in thread
From: David Hildenbrand @ 2020-03-11 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linuxppc-dev, linux-hyperv, David Hildenbrand,
	Greg Kroah-Hartman, Andrew Morton, Michal Hocko, Oscar Salvador,
	Rafael J. Wysocki, Baoquan He, Wei Yang

The name is misleading. Let's just name it like the online_type name we
expose to user space ("online").

Add some documentation to the types.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 drivers/base/memory.c          | 9 +++++----
 include/linux/memory_hotplug.h | 6 +++++-
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 6448c9ece2cb..8c5ce42c0fc3 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -216,7 +216,7 @@ static int memory_subsys_online(struct device *dev)
 	 * attribute and need to set the online_type.
 	 */
 	if (mem->online_type < 0)
-		mem->online_type = MMOP_ONLINE_KEEP;
+		mem->online_type = MMOP_ONLINE;
 
 	ret = memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE);
 
@@ -251,7 +251,7 @@ static ssize_t state_store(struct device *dev, struct device_attribute *attr,
 	else if (sysfs_streq(buf, "online_movable"))
 		online_type = MMOP_ONLINE_MOVABLE;
 	else if (sysfs_streq(buf, "online"))
-		online_type = MMOP_ONLINE_KEEP;
+		online_type = MMOP_ONLINE;
 	else if (sysfs_streq(buf, "offline"))
 		online_type = MMOP_OFFLINE;
 	else {
@@ -262,7 +262,7 @@ static ssize_t state_store(struct device *dev, struct device_attribute *attr,
 	switch (online_type) {
 	case MMOP_ONLINE_KERNEL:
 	case MMOP_ONLINE_MOVABLE:
-	case MMOP_ONLINE_KEEP:
+	case MMOP_ONLINE:
 		/* mem->online_type is protected by device_hotplug_lock */
 		mem->online_type = online_type;
 		ret = device_online(&mem->dev);
@@ -342,7 +342,8 @@ static ssize_t valid_zones_show(struct device *dev,
 	}
 
 	nid = mem->nid;
-	default_zone = zone_for_pfn_range(MMOP_ONLINE_KEEP, nid, start_pfn, nr_pages);
+	default_zone = zone_for_pfn_range(MMOP_ONLINE, nid, start_pfn,
+					  nr_pages);
 	strcat(buf, default_zone->name);
 
 	print_allowed_zone(buf, nid, start_pfn, nr_pages, MMOP_ONLINE_KERNEL,
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index f4d59155f3d4..261dbf010d5d 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -47,9 +47,13 @@ enum {
 
 /* Types for control the zone type of onlined and offlined memory */
 enum {
+	/* Offline the memory. */
 	MMOP_OFFLINE = -1,
-	MMOP_ONLINE_KEEP,
+	/* Online the memory. Zone depends, see default_zone_for_pfn(). */
+	MMOP_ONLINE,
+	/* Online the memory to ZONE_NORMAL. */
 	MMOP_ONLINE_KERNEL,
+	/* Online the memory to ZONE_MOVABLE. */
 	MMOP_ONLINE_MOVABLE,
 };
 
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v1 2/5] drivers/base/memory: map MMOP_OFFLINE to 0
  2020-03-11 12:30 [PATCH v1 0/5] mm/memory_hotplug: allow to specify a default online_type David Hildenbrand
  2020-03-11 12:30 ` [PATCH v1 1/5] drivers/base/memory: rename MMOP_ONLINE_KEEP to MMOP_ONLINE David Hildenbrand
@ 2020-03-11 12:30 ` David Hildenbrand
  2020-03-11 14:18   ` Wei Yang
  2020-03-16 15:19   ` Michal Hocko
  2020-03-11 12:30 ` [PATCH v1 3/5] drivers/base/memory: store mapping between MMOP_* and string in an array David Hildenbrand
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 24+ messages in thread
From: David Hildenbrand @ 2020-03-11 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linuxppc-dev, linux-hyperv, David Hildenbrand,
	Greg Kroah-Hartman, Andrew Morton, Michal Hocko, Oscar Salvador,
	Rafael J. Wysocki, Baoquan He, Wei Yang

I have no idea why we have to start at -1. Just treat 0 as the special
case. Clarify a comment (which was wrong, when we come via
device_online() the first time, the online_type would have been 0 /
MEM_ONLINE). The default is now always MMOP_OFFLINE.

This is a preparation to use the online_type as an array index.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 drivers/base/memory.c          | 11 ++++-------
 include/linux/memory_hotplug.h |  2 +-
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 8c5ce42c0fc3..e7e77cafef80 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -211,17 +211,14 @@ static int memory_subsys_online(struct device *dev)
 		return 0;
 
 	/*
-	 * If we are called from state_store(), online_type will be
-	 * set >= 0 Otherwise we were called from the device online
-	 * attribute and need to set the online_type.
+	 * When called via device_online() without configuring the online_type,
+	 * we want to default to MMOP_ONLINE.
 	 */
-	if (mem->online_type < 0)
+	if (mem->online_type == MMOP_OFFLINE)
 		mem->online_type = MMOP_ONLINE;
 
 	ret = memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE);
-
-	/* clear online_type */
-	mem->online_type = -1;
+	mem->online_type = MMOP_OFFLINE;
 
 	return ret;
 }
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 261dbf010d5d..c2e06ed5e0e9 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -48,7 +48,7 @@ enum {
 /* Types for control the zone type of onlined and offlined memory */
 enum {
 	/* Offline the memory. */
-	MMOP_OFFLINE = -1,
+	MMOP_OFFLINE = 0,
 	/* Online the memory. Zone depends, see default_zone_for_pfn(). */
 	MMOP_ONLINE,
 	/* Online the memory to ZONE_NORMAL. */
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v1 3/5] drivers/base/memory: store mapping between MMOP_* and string in an array
  2020-03-11 12:30 [PATCH v1 0/5] mm/memory_hotplug: allow to specify a default online_type David Hildenbrand
  2020-03-11 12:30 ` [PATCH v1 1/5] drivers/base/memory: rename MMOP_ONLINE_KEEP to MMOP_ONLINE David Hildenbrand
  2020-03-11 12:30 ` [PATCH v1 2/5] drivers/base/memory: map MMOP_OFFLINE to 0 David Hildenbrand
@ 2020-03-11 12:30 ` David Hildenbrand
  2020-03-11 14:20   ` Wei Yang
  2020-03-16 15:20   ` Michal Hocko
  2020-03-11 12:30 ` [PATCH v1 4/5] mm/memory_hotplug: convert memhp_auto_online to store an online_type David Hildenbrand
  2020-03-11 12:30 ` [PATCH v1 5/5] mm/memory_hotplug: allow to specify a default online_type David Hildenbrand
  4 siblings, 2 replies; 24+ messages in thread
From: David Hildenbrand @ 2020-03-11 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linuxppc-dev, linux-hyperv, David Hildenbrand,
	Greg Kroah-Hartman, Andrew Morton, Michal Hocko, Oscar Salvador,
	Rafael J. Wysocki, Baoquan He, Wei Yang

Let's use a simple array which we can reuse soon. While at it, move the
string->mmop conversion out of the device hotplug lock.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 drivers/base/memory.c | 38 +++++++++++++++++++++++---------------
 1 file changed, 23 insertions(+), 15 deletions(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index e7e77cafef80..8a7f29c0bf97 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -28,6 +28,24 @@
 
 #define MEMORY_CLASS_NAME	"memory"
 
+static const char *const online_type_to_str[] = {
+	[MMOP_OFFLINE] = "offline",
+	[MMOP_ONLINE] = "online",
+	[MMOP_ONLINE_KERNEL] = "online_kernel",
+	[MMOP_ONLINE_MOVABLE] = "online_movable",
+};
+
+static int memhp_online_type_from_str(const char *str)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(online_type_to_str); i++) {
+		if (sysfs_streq(str, online_type_to_str[i]))
+			return i;
+	}
+	return -EINVAL;
+}
+
 #define to_memory_block(dev) container_of(dev, struct memory_block, dev)
 
 static int sections_per_block;
@@ -236,26 +254,17 @@ static int memory_subsys_offline(struct device *dev)
 static ssize_t state_store(struct device *dev, struct device_attribute *attr,
 			   const char *buf, size_t count)
 {
+	const int online_type = memhp_online_type_from_str(buf);
 	struct memory_block *mem = to_memory_block(dev);
-	int ret, online_type;
+	int ret;
+
+	if (online_type < 0)
+		return -EINVAL;
 
 	ret = lock_device_hotplug_sysfs();
 	if (ret)
 		return ret;
 
-	if (sysfs_streq(buf, "online_kernel"))
-		online_type = MMOP_ONLINE_KERNEL;
-	else if (sysfs_streq(buf, "online_movable"))
-		online_type = MMOP_ONLINE_MOVABLE;
-	else if (sysfs_streq(buf, "online"))
-		online_type = MMOP_ONLINE;
-	else if (sysfs_streq(buf, "offline"))
-		online_type = MMOP_OFFLINE;
-	else {
-		ret = -EINVAL;
-		goto err;
-	}
-
 	switch (online_type) {
 	case MMOP_ONLINE_KERNEL:
 	case MMOP_ONLINE_MOVABLE:
@@ -271,7 +280,6 @@ static ssize_t state_store(struct device *dev, struct device_attribute *attr,
 		ret = -EINVAL; /* should never happen */
 	}
 
-err:
 	unlock_device_hotplug();
 
 	if (ret < 0)
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v1 4/5] mm/memory_hotplug: convert memhp_auto_online to store an online_type
  2020-03-11 12:30 [PATCH v1 0/5] mm/memory_hotplug: allow to specify a default online_type David Hildenbrand
                   ` (2 preceding siblings ...)
  2020-03-11 12:30 ` [PATCH v1 3/5] drivers/base/memory: store mapping between MMOP_* and string in an array David Hildenbrand
@ 2020-03-11 12:30 ` David Hildenbrand
  2020-03-11 14:25   ` Wei Yang
  2020-03-16 15:24   ` Michal Hocko
  2020-03-11 12:30 ` [PATCH v1 5/5] mm/memory_hotplug: allow to specify a default online_type David Hildenbrand
  4 siblings, 2 replies; 24+ messages in thread
From: David Hildenbrand @ 2020-03-11 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linuxppc-dev, linux-hyperv, David Hildenbrand,
	Greg Kroah-Hartman, Andrew Morton, Michal Hocko, Oscar Salvador,
	Rafael J. Wysocki, Baoquan He, Wei Yang, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, K. Y. Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Wei Liu, Thomas Gleixner

... and rename it to memhp_default_online_type. This is a preparation
for more detailed default online behavior.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-hyperv@vger.kernel.org
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/powerpc/platforms/powernv/memtrace.c |  2 +-
 drivers/base/memory.c                     | 10 ++++------
 drivers/hv/hv_balloon.c                   |  2 +-
 include/linux/memory_hotplug.h            |  3 ++-
 mm/memory_hotplug.c                       | 13 +++++++------
 5 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/memtrace.c b/arch/powerpc/platforms/powernv/memtrace.c
index d6d64f8718e6..e15a600cfa4d 100644
--- a/arch/powerpc/platforms/powernv/memtrace.c
+++ b/arch/powerpc/platforms/powernv/memtrace.c
@@ -235,7 +235,7 @@ static int memtrace_online(void)
 		 * If kernel isn't compiled with the auto online option
 		 * we need to online the memory ourselves.
 		 */
-		if (!memhp_auto_online) {
+		if (memhp_default_online_type == MMOP_OFFLINE) {
 			lock_device_hotplug();
 			walk_memory_blocks(ent->start, ent->size, NULL,
 					   online_mem_block);
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 8a7f29c0bf97..8d3e16dab69f 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -386,10 +386,8 @@ static DEVICE_ATTR_RO(block_size_bytes);
 static ssize_t auto_online_blocks_show(struct device *dev,
 				       struct device_attribute *attr, char *buf)
 {
-	if (memhp_auto_online)
-		return sprintf(buf, "online\n");
-	else
-		return sprintf(buf, "offline\n");
+	return sprintf(buf, "%s\n",
+		       online_type_to_str[memhp_default_online_type]);
 }
 
 static ssize_t auto_online_blocks_store(struct device *dev,
@@ -397,9 +395,9 @@ static ssize_t auto_online_blocks_store(struct device *dev,
 					const char *buf, size_t count)
 {
 	if (sysfs_streq(buf, "online"))
-		memhp_auto_online = true;
+		memhp_default_online_type = MMOP_ONLINE;
 	else if (sysfs_streq(buf, "offline"))
-		memhp_auto_online = false;
+		memhp_default_online_type = MMOP_OFFLINE;
 	else
 		return -EINVAL;
 
diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index a02ce43d778d..3b90fd12e0c5 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -727,7 +727,7 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size,
 		spin_unlock_irqrestore(&dm_device.ha_lock, flags);
 
 		init_completion(&dm_device.ol_waitevent);
-		dm_device.ha_waiting = !memhp_auto_online;
+		dm_device.ha_waiting = memhp_default_online_type == MMOP_OFFLINE;
 
 		nid = memory_add_physaddr_to_nid(PFN_PHYS(start_pfn));
 		ret = add_memory(nid, PFN_PHYS((start_pfn)),
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index c2e06ed5e0e9..c6e090b34c4b 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -117,7 +117,8 @@ extern int arch_add_memory(int nid, u64 start, u64 size,
 			struct mhp_restrictions *restrictions);
 extern u64 max_mem_size;
 
-extern bool memhp_auto_online;
+/* Default online_type (MMOP_*) when new memory blocks are added. */
+extern int memhp_default_online_type;
 /* If movable_node boot option specified */
 extern bool movable_node_enabled;
 static inline bool movable_node_is_enabled(void)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 1a00b5a37ef6..01443c70aa27 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -67,18 +67,18 @@ void put_online_mems(void)
 bool movable_node_enabled = false;
 
 #ifndef CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE
-bool memhp_auto_online;
+int memhp_default_online_type = MMOP_OFFLINE;
 #else
-bool memhp_auto_online = true;
+int memhp_default_online_type = MMOP_ONLINE;
 #endif
-EXPORT_SYMBOL_GPL(memhp_auto_online);
+EXPORT_SYMBOL_GPL(memhp_default_online_type);
 
 static int __init setup_memhp_default_state(char *str)
 {
 	if (!strcmp(str, "online"))
-		memhp_auto_online = true;
+		memhp_default_online_type = MMOP_ONLINE;
 	else if (!strcmp(str, "offline"))
-		memhp_auto_online = false;
+		memhp_default_online_type = MMOP_OFFLINE;
 
 	return 1;
 }
@@ -991,6 +991,7 @@ static int check_hotplug_memory_range(u64 start, u64 size)
 
 static int online_memory_block(struct memory_block *mem, void *arg)
 {
+	mem->online_type = memhp_default_online_type;
 	return device_online(&mem->dev);
 }
 
@@ -1063,7 +1064,7 @@ int __ref add_memory_resource(int nid, struct resource *res)
 	mem_hotplug_done();
 
 	/* online pages if requested */
-	if (memhp_auto_online)
+	if (memhp_default_online_type != MMOP_OFFLINE)
 		walk_memory_blocks(start, size, NULL, online_memory_block);
 
 	return ret;
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v1 5/5] mm/memory_hotplug: allow to specify a default online_type
  2020-03-11 12:30 [PATCH v1 0/5] mm/memory_hotplug: allow to specify a default online_type David Hildenbrand
                   ` (3 preceding siblings ...)
  2020-03-11 12:30 ` [PATCH v1 4/5] mm/memory_hotplug: convert memhp_auto_online to store an online_type David Hildenbrand
@ 2020-03-11 12:30 ` David Hildenbrand
  2020-03-11 14:26   ` Wei Yang
                     ` (2 more replies)
  4 siblings, 3 replies; 24+ messages in thread
From: David Hildenbrand @ 2020-03-11 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linuxppc-dev, linux-hyperv, David Hildenbrand,
	Greg Kroah-Hartman, Andrew Morton, Michal Hocko, Oscar Salvador,
	Rafael J. Wysocki, Baoquan He, Wei Yang

For now, distributions implement advanced udev rules to essentially
- Don't online any hotplugged memory (s390x)
- Online all memory to ZONE_NORMAL (e.g., most virt environments like
  hyperv)
- Online all memory to ZONE_MOVABLE in case the zone imbalance is taken
  care of (e.g., bare metal, special virt environments)

In summary: All memory is usually onlined the same way, however, the
kernel always has to ask userspace to come up with the same answer.
E.g., HyperV always waits for a memory block to get onlined before
continuing, otherwise it might end up adding memory faster than
hotplugging it, which can result in strange OOM situations.

Let's allow to specify a default online_type, not just "online" and
"offline". This allows distributions to configure the default online_type
when booting up and be done with it.

We can now specify "offline", "online", "online_movable" and
"online_kernel" via
- "memhp_default_state=" on the kernel cmdline
- /sys/devices/systemn/memory/auto_online_blocks
just like we are able to specify for a single memory block via
/sys/devices/systemn/memory/memoryX/state

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 drivers/base/memory.c          | 11 +++++------
 include/linux/memory_hotplug.h |  2 ++
 mm/memory_hotplug.c            |  8 ++++----
 3 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 8d3e16dab69f..2b09b68b9f78 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -35,7 +35,7 @@ static const char *const online_type_to_str[] = {
 	[MMOP_ONLINE_MOVABLE] = "online_movable",
 };
 
-static int memhp_online_type_from_str(const char *str)
+int memhp_online_type_from_str(const char *str)
 {
 	int i;
 
@@ -394,13 +394,12 @@ static ssize_t auto_online_blocks_store(struct device *dev,
 					struct device_attribute *attr,
 					const char *buf, size_t count)
 {
-	if (sysfs_streq(buf, "online"))
-		memhp_default_online_type = MMOP_ONLINE;
-	else if (sysfs_streq(buf, "offline"))
-		memhp_default_online_type = MMOP_OFFLINE;
-	else
+	const int online_type = memhp_online_type_from_str(buf);
+
+	if (online_type < 0)
 		return -EINVAL;
 
+	memhp_default_online_type = online_type;
 	return count;
 }
 
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index c6e090b34c4b..ef55115320fb 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -117,6 +117,8 @@ extern int arch_add_memory(int nid, u64 start, u64 size,
 			struct mhp_restrictions *restrictions);
 extern u64 max_mem_size;
 
+extern int memhp_online_type_from_str(const char *str);
+
 /* Default online_type (MMOP_*) when new memory blocks are added. */
 extern int memhp_default_online_type;
 /* If movable_node boot option specified */
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 01443c70aa27..4a96273eafa7 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -75,10 +75,10 @@ EXPORT_SYMBOL_GPL(memhp_default_online_type);
 
 static int __init setup_memhp_default_state(char *str)
 {
-	if (!strcmp(str, "online"))
-		memhp_default_online_type = MMOP_ONLINE;
-	else if (!strcmp(str, "offline"))
-		memhp_default_online_type = MMOP_OFFLINE;
+	const int online_type = memhp_online_type_from_str(str);
+
+	if (online_type >= 0)
+		memhp_default_online_type = online_type;
 
 	return 1;
 }
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v1 1/5] drivers/base/memory: rename MMOP_ONLINE_KEEP to MMOP_ONLINE
  2020-03-11 12:30 ` [PATCH v1 1/5] drivers/base/memory: rename MMOP_ONLINE_KEEP to MMOP_ONLINE David Hildenbrand
@ 2020-03-11 14:17   ` Wei Yang
  2020-03-16 15:12   ` Michal Hocko
  1 sibling, 0 replies; 24+ messages in thread
From: Wei Yang @ 2020-03-11 14:17 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, linuxppc-dev, linux-hyperv,
	Greg Kroah-Hartman, Andrew Morton, Michal Hocko, Oscar Salvador,
	Rafael J. Wysocki, Baoquan He, Wei Yang

On Wed, Mar 11, 2020 at 01:30:22PM +0100, David Hildenbrand wrote:
>The name is misleading. Let's just name it like the online_type name we
>expose to user space ("online").
>
>Add some documentation to the types.
>
>Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>Cc: Andrew Morton <akpm@linux-foundation.org>
>Cc: Michal Hocko <mhocko@kernel.org>
>Cc: Oscar Salvador <osalvador@suse.de>
>Cc: "Rafael J. Wysocki" <rafael@kernel.org>
>Cc: Baoquan He <bhe@redhat.com>
>Cc: Wei Yang <richard.weiyang@gmail.com>
>Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Wei Yang <richard.weiyang@gmail.com>

>---
> drivers/base/memory.c          | 9 +++++----
> include/linux/memory_hotplug.h | 6 +++++-
> 2 files changed, 10 insertions(+), 5 deletions(-)
>
>diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>index 6448c9ece2cb..8c5ce42c0fc3 100644
>--- a/drivers/base/memory.c
>+++ b/drivers/base/memory.c
>@@ -216,7 +216,7 @@ static int memory_subsys_online(struct device *dev)
> 	 * attribute and need to set the online_type.
> 	 */
> 	if (mem->online_type < 0)
>-		mem->online_type = MMOP_ONLINE_KEEP;
>+		mem->online_type = MMOP_ONLINE;
> 
> 	ret = memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE);
> 
>@@ -251,7 +251,7 @@ static ssize_t state_store(struct device *dev, struct device_attribute *attr,
> 	else if (sysfs_streq(buf, "online_movable"))
> 		online_type = MMOP_ONLINE_MOVABLE;
> 	else if (sysfs_streq(buf, "online"))
>-		online_type = MMOP_ONLINE_KEEP;
>+		online_type = MMOP_ONLINE;
> 	else if (sysfs_streq(buf, "offline"))
> 		online_type = MMOP_OFFLINE;
> 	else {
>@@ -262,7 +262,7 @@ static ssize_t state_store(struct device *dev, struct device_attribute *attr,
> 	switch (online_type) {
> 	case MMOP_ONLINE_KERNEL:
> 	case MMOP_ONLINE_MOVABLE:
>-	case MMOP_ONLINE_KEEP:
>+	case MMOP_ONLINE:
> 		/* mem->online_type is protected by device_hotplug_lock */
> 		mem->online_type = online_type;
> 		ret = device_online(&mem->dev);
>@@ -342,7 +342,8 @@ static ssize_t valid_zones_show(struct device *dev,
> 	}
> 
> 	nid = mem->nid;
>-	default_zone = zone_for_pfn_range(MMOP_ONLINE_KEEP, nid, start_pfn, nr_pages);
>+	default_zone = zone_for_pfn_range(MMOP_ONLINE, nid, start_pfn,
>+					  nr_pages);
> 	strcat(buf, default_zone->name);
> 
> 	print_allowed_zone(buf, nid, start_pfn, nr_pages, MMOP_ONLINE_KERNEL,
>diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
>index f4d59155f3d4..261dbf010d5d 100644
>--- a/include/linux/memory_hotplug.h
>+++ b/include/linux/memory_hotplug.h
>@@ -47,9 +47,13 @@ enum {
> 
> /* Types for control the zone type of onlined and offlined memory */
> enum {
>+	/* Offline the memory. */
> 	MMOP_OFFLINE = -1,
>-	MMOP_ONLINE_KEEP,
>+	/* Online the memory. Zone depends, see default_zone_for_pfn(). */
>+	MMOP_ONLINE,
>+	/* Online the memory to ZONE_NORMAL. */
> 	MMOP_ONLINE_KERNEL,
>+	/* Online the memory to ZONE_MOVABLE. */
> 	MMOP_ONLINE_MOVABLE,
> };
> 
>-- 
>2.24.1

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v1 2/5] drivers/base/memory: map MMOP_OFFLINE to 0
  2020-03-11 12:30 ` [PATCH v1 2/5] drivers/base/memory: map MMOP_OFFLINE to 0 David Hildenbrand
@ 2020-03-11 14:18   ` Wei Yang
  2020-03-16 15:19   ` Michal Hocko
  1 sibling, 0 replies; 24+ messages in thread
From: Wei Yang @ 2020-03-11 14:18 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, linuxppc-dev, linux-hyperv,
	Greg Kroah-Hartman, Andrew Morton, Michal Hocko, Oscar Salvador,
	Rafael J. Wysocki, Baoquan He, Wei Yang

On Wed, Mar 11, 2020 at 01:30:23PM +0100, David Hildenbrand wrote:
>I have no idea why we have to start at -1. Just treat 0 as the special
>case. Clarify a comment (which was wrong, when we come via
>device_online() the first time, the online_type would have been 0 /
>MEM_ONLINE). The default is now always MMOP_OFFLINE.
>
>This is a preparation to use the online_type as an array index.
>
>Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>Cc: Andrew Morton <akpm@linux-foundation.org>
>Cc: Michal Hocko <mhocko@kernel.org>
>Cc: Oscar Salvador <osalvador@suse.de>
>Cc: "Rafael J. Wysocki" <rafael@kernel.org>
>Cc: Baoquan He <bhe@redhat.com>
>Cc: Wei Yang <richard.weiyang@gmail.com>
>Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Wei Yang <richard.weiyang@gmail.com>

>---
> drivers/base/memory.c          | 11 ++++-------
> include/linux/memory_hotplug.h |  2 +-
> 2 files changed, 5 insertions(+), 8 deletions(-)
>
>diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>index 8c5ce42c0fc3..e7e77cafef80 100644
>--- a/drivers/base/memory.c
>+++ b/drivers/base/memory.c
>@@ -211,17 +211,14 @@ static int memory_subsys_online(struct device *dev)
> 		return 0;
> 
> 	/*
>-	 * If we are called from state_store(), online_type will be
>-	 * set >= 0 Otherwise we were called from the device online
>-	 * attribute and need to set the online_type.
>+	 * When called via device_online() without configuring the online_type,
>+	 * we want to default to MMOP_ONLINE.
> 	 */
>-	if (mem->online_type < 0)
>+	if (mem->online_type == MMOP_OFFLINE)
> 		mem->online_type = MMOP_ONLINE;
> 
> 	ret = memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE);
>-
>-	/* clear online_type */
>-	mem->online_type = -1;
>+	mem->online_type = MMOP_OFFLINE;
> 
> 	return ret;
> }
>diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
>index 261dbf010d5d..c2e06ed5e0e9 100644
>--- a/include/linux/memory_hotplug.h
>+++ b/include/linux/memory_hotplug.h
>@@ -48,7 +48,7 @@ enum {
> /* Types for control the zone type of onlined and offlined memory */
> enum {
> 	/* Offline the memory. */
>-	MMOP_OFFLINE = -1,
>+	MMOP_OFFLINE = 0,
> 	/* Online the memory. Zone depends, see default_zone_for_pfn(). */
> 	MMOP_ONLINE,
> 	/* Online the memory to ZONE_NORMAL. */
>-- 
>2.24.1

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v1 3/5] drivers/base/memory: store mapping between MMOP_* and string in an array
  2020-03-11 12:30 ` [PATCH v1 3/5] drivers/base/memory: store mapping between MMOP_* and string in an array David Hildenbrand
@ 2020-03-11 14:20   ` Wei Yang
  2020-03-11 14:27     ` Wei Yang
  2020-03-16 15:20   ` Michal Hocko
  1 sibling, 1 reply; 24+ messages in thread
From: Wei Yang @ 2020-03-11 14:20 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, linuxppc-dev, linux-hyperv,
	Greg Kroah-Hartman, Andrew Morton, Michal Hocko, Oscar Salvador,
	Rafael J. Wysocki, Baoquan He, Wei Yang

On Wed, Mar 11, 2020 at 01:30:24PM +0100, David Hildenbrand wrote:
>Let's use a simple array which we can reuse soon. While at it, move the
>string->mmop conversion out of the device hotplug lock.
>
>Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>Cc: Andrew Morton <akpm@linux-foundation.org>
>Cc: Michal Hocko <mhocko@kernel.org>
>Cc: Oscar Salvador <osalvador@suse.de>
>Cc: "Rafael J. Wysocki" <rafael@kernel.org>
>Cc: Baoquan He <bhe@redhat.com>
>Cc: Wei Yang <richard.weiyang@gmail.com>
>Signed-off-by: David Hildenbrand <david@redhat.com>
>---
> drivers/base/memory.c | 38 +++++++++++++++++++++++---------------
> 1 file changed, 23 insertions(+), 15 deletions(-)
>
>diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>index e7e77cafef80..8a7f29c0bf97 100644
>--- a/drivers/base/memory.c
>+++ b/drivers/base/memory.c
>@@ -28,6 +28,24 @@
> 
> #define MEMORY_CLASS_NAME	"memory"
> 
>+static const char *const online_type_to_str[] = {
>+	[MMOP_OFFLINE] = "offline",
>+	[MMOP_ONLINE] = "online",
>+	[MMOP_ONLINE_KERNEL] = "online_kernel",
>+	[MMOP_ONLINE_MOVABLE] = "online_movable",
>+};
>+
>+static int memhp_online_type_from_str(const char *str)
>+{
>+	int i;
>+
>+	for (i = 0; i < ARRAY_SIZE(online_type_to_str); i++) {
>+		if (sysfs_streq(str, online_type_to_str[i]))
>+			return i;
>+	}
>+	return -EINVAL;
>+}
>+
> #define to_memory_block(dev) container_of(dev, struct memory_block, dev)
> 
> static int sections_per_block;
>@@ -236,26 +254,17 @@ static int memory_subsys_offline(struct device *dev)
> static ssize_t state_store(struct device *dev, struct device_attribute *attr,
> 			   const char *buf, size_t count)
> {
>+	const int online_type = memhp_online_type_from_str(buf);

In your following patch, you did the same conversion. Is it possible to merge
them into this one?

> 	struct memory_block *mem = to_memory_block(dev);
>-	int ret, online_type;
>+	int ret;
>+
>+	if (online_type < 0)
>+		return -EINVAL;
> 
> 	ret = lock_device_hotplug_sysfs();
> 	if (ret)
> 		return ret;
> 
>-	if (sysfs_streq(buf, "online_kernel"))
>-		online_type = MMOP_ONLINE_KERNEL;
>-	else if (sysfs_streq(buf, "online_movable"))
>-		online_type = MMOP_ONLINE_MOVABLE;
>-	else if (sysfs_streq(buf, "online"))
>-		online_type = MMOP_ONLINE;
>-	else if (sysfs_streq(buf, "offline"))
>-		online_type = MMOP_OFFLINE;
>-	else {
>-		ret = -EINVAL;
>-		goto err;
>-	}
>-
> 	switch (online_type) {
> 	case MMOP_ONLINE_KERNEL:
> 	case MMOP_ONLINE_MOVABLE:
>@@ -271,7 +280,6 @@ static ssize_t state_store(struct device *dev, struct device_attribute *attr,
> 		ret = -EINVAL; /* should never happen */
> 	}
> 
>-err:
> 	unlock_device_hotplug();
> 
> 	if (ret < 0)
>-- 
>2.24.1

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v1 4/5] mm/memory_hotplug: convert memhp_auto_online to store an online_type
  2020-03-11 12:30 ` [PATCH v1 4/5] mm/memory_hotplug: convert memhp_auto_online to store an online_type David Hildenbrand
@ 2020-03-11 14:25   ` Wei Yang
  2020-03-16 15:24   ` Michal Hocko
  1 sibling, 0 replies; 24+ messages in thread
From: Wei Yang @ 2020-03-11 14:25 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, linuxppc-dev, linux-hyperv,
	Greg Kroah-Hartman, Andrew Morton, Michal Hocko, Oscar Salvador,
	Rafael J. Wysocki, Baoquan He, Wei Yang, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, K. Y. Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Wei Liu, Thomas Gleixner

On Wed, Mar 11, 2020 at 01:30:25PM +0100, David Hildenbrand wrote:
>... and rename it to memhp_default_online_type. This is a preparation
>for more detailed default online behavior.
>
>Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>Cc: Andrew Morton <akpm@linux-foundation.org>
>Cc: Michal Hocko <mhocko@kernel.org>
>Cc: Oscar Salvador <osalvador@suse.de>
>Cc: "Rafael J. Wysocki" <rafael@kernel.org>
>Cc: Baoquan He <bhe@redhat.com>
>Cc: Wei Yang <richard.weiyang@gmail.com>
>Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>Cc: Paul Mackerras <paulus@samba.org>
>Cc: Michael Ellerman <mpe@ellerman.id.au>
>Cc: "K. Y. Srinivasan" <kys@microsoft.com>
>Cc: Haiyang Zhang <haiyangz@microsoft.com>
>Cc: Stephen Hemminger <sthemmin@microsoft.com>
>Cc: Wei Liu <wei.liu@kernel.org>
>Cc: Thomas Gleixner <tglx@linutronix.de>
>Cc: linuxppc-dev@lists.ozlabs.org
>Cc: linux-hyperv@vger.kernel.org
>Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Wei Yang <richard.weiyang@gmail.com>

>---
> arch/powerpc/platforms/powernv/memtrace.c |  2 +-
> drivers/base/memory.c                     | 10 ++++------
> drivers/hv/hv_balloon.c                   |  2 +-
> include/linux/memory_hotplug.h            |  3 ++-
> mm/memory_hotplug.c                       | 13 +++++++------
> 5 files changed, 15 insertions(+), 15 deletions(-)
>
>diff --git a/arch/powerpc/platforms/powernv/memtrace.c b/arch/powerpc/platforms/powernv/memtrace.c
>index d6d64f8718e6..e15a600cfa4d 100644
>--- a/arch/powerpc/platforms/powernv/memtrace.c
>+++ b/arch/powerpc/platforms/powernv/memtrace.c
>@@ -235,7 +235,7 @@ static int memtrace_online(void)
> 		 * If kernel isn't compiled with the auto online option
> 		 * we need to online the memory ourselves.
> 		 */
>-		if (!memhp_auto_online) {
>+		if (memhp_default_online_type == MMOP_OFFLINE) {
> 			lock_device_hotplug();
> 			walk_memory_blocks(ent->start, ent->size, NULL,
> 					   online_mem_block);
>diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>index 8a7f29c0bf97..8d3e16dab69f 100644
>--- a/drivers/base/memory.c
>+++ b/drivers/base/memory.c
>@@ -386,10 +386,8 @@ static DEVICE_ATTR_RO(block_size_bytes);
> static ssize_t auto_online_blocks_show(struct device *dev,
> 				       struct device_attribute *attr, char *buf)
> {
>-	if (memhp_auto_online)
>-		return sprintf(buf, "online\n");
>-	else
>-		return sprintf(buf, "offline\n");
>+	return sprintf(buf, "%s\n",
>+		       online_type_to_str[memhp_default_online_type]);
> }
> 
> static ssize_t auto_online_blocks_store(struct device *dev,
>@@ -397,9 +395,9 @@ static ssize_t auto_online_blocks_store(struct device *dev,
> 					const char *buf, size_t count)
> {
> 	if (sysfs_streq(buf, "online"))
>-		memhp_auto_online = true;
>+		memhp_default_online_type = MMOP_ONLINE;
> 	else if (sysfs_streq(buf, "offline"))
>-		memhp_auto_online = false;
>+		memhp_default_online_type = MMOP_OFFLINE;
> 	else
> 		return -EINVAL;
> 
>diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
>index a02ce43d778d..3b90fd12e0c5 100644
>--- a/drivers/hv/hv_balloon.c
>+++ b/drivers/hv/hv_balloon.c
>@@ -727,7 +727,7 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size,
> 		spin_unlock_irqrestore(&dm_device.ha_lock, flags);
> 
> 		init_completion(&dm_device.ol_waitevent);
>-		dm_device.ha_waiting = !memhp_auto_online;
>+		dm_device.ha_waiting = memhp_default_online_type == MMOP_OFFLINE;
> 
> 		nid = memory_add_physaddr_to_nid(PFN_PHYS(start_pfn));
> 		ret = add_memory(nid, PFN_PHYS((start_pfn)),
>diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
>index c2e06ed5e0e9..c6e090b34c4b 100644
>--- a/include/linux/memory_hotplug.h
>+++ b/include/linux/memory_hotplug.h
>@@ -117,7 +117,8 @@ extern int arch_add_memory(int nid, u64 start, u64 size,
> 			struct mhp_restrictions *restrictions);
> extern u64 max_mem_size;
> 
>-extern bool memhp_auto_online;
>+/* Default online_type (MMOP_*) when new memory blocks are added. */
>+extern int memhp_default_online_type;
> /* If movable_node boot option specified */
> extern bool movable_node_enabled;
> static inline bool movable_node_is_enabled(void)
>diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>index 1a00b5a37ef6..01443c70aa27 100644
>--- a/mm/memory_hotplug.c
>+++ b/mm/memory_hotplug.c
>@@ -67,18 +67,18 @@ void put_online_mems(void)
> bool movable_node_enabled = false;
> 
> #ifndef CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE
>-bool memhp_auto_online;
>+int memhp_default_online_type = MMOP_OFFLINE;
> #else
>-bool memhp_auto_online = true;
>+int memhp_default_online_type = MMOP_ONLINE;
> #endif
>-EXPORT_SYMBOL_GPL(memhp_auto_online);
>+EXPORT_SYMBOL_GPL(memhp_default_online_type);
> 
> static int __init setup_memhp_default_state(char *str)
> {
> 	if (!strcmp(str, "online"))
>-		memhp_auto_online = true;
>+		memhp_default_online_type = MMOP_ONLINE;
> 	else if (!strcmp(str, "offline"))
>-		memhp_auto_online = false;
>+		memhp_default_online_type = MMOP_OFFLINE;
> 
> 	return 1;
> }
>@@ -991,6 +991,7 @@ static int check_hotplug_memory_range(u64 start, u64 size)
> 
> static int online_memory_block(struct memory_block *mem, void *arg)
> {
>+	mem->online_type = memhp_default_online_type;
> 	return device_online(&mem->dev);
> }
> 
>@@ -1063,7 +1064,7 @@ int __ref add_memory_resource(int nid, struct resource *res)
> 	mem_hotplug_done();
> 
> 	/* online pages if requested */
>-	if (memhp_auto_online)
>+	if (memhp_default_online_type != MMOP_OFFLINE)
> 		walk_memory_blocks(start, size, NULL, online_memory_block);
> 
> 	return ret;
>-- 
>2.24.1

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v1 5/5] mm/memory_hotplug: allow to specify a default online_type
  2020-03-11 12:30 ` [PATCH v1 5/5] mm/memory_hotplug: allow to specify a default online_type David Hildenbrand
@ 2020-03-11 14:26   ` Wei Yang
  2020-03-11 15:20     ` David Hildenbrand
  2020-03-11 16:55   ` Vitaly Kuznetsov
  2020-03-16 15:31   ` Michal Hocko
  2 siblings, 1 reply; 24+ messages in thread
From: Wei Yang @ 2020-03-11 14:26 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, linuxppc-dev, linux-hyperv,
	Greg Kroah-Hartman, Andrew Morton, Michal Hocko, Oscar Salvador,
	Rafael J. Wysocki, Baoquan He, Wei Yang

On Wed, Mar 11, 2020 at 01:30:26PM +0100, David Hildenbrand wrote:
>For now, distributions implement advanced udev rules to essentially
>- Don't online any hotplugged memory (s390x)
>- Online all memory to ZONE_NORMAL (e.g., most virt environments like
>  hyperv)
>- Online all memory to ZONE_MOVABLE in case the zone imbalance is taken
>  care of (e.g., bare metal, special virt environments)
>
>In summary: All memory is usually onlined the same way, however, the
>kernel always has to ask userspace to come up with the same answer.
>E.g., HyperV always waits for a memory block to get onlined before
>continuing, otherwise it might end up adding memory faster than
>hotplugging it, which can result in strange OOM situations.
>
>Let's allow to specify a default online_type, not just "online" and
>"offline". This allows distributions to configure the default online_type
>when booting up and be done with it.
>
>We can now specify "offline", "online", "online_movable" and
>"online_kernel" via
>- "memhp_default_state=" on the kernel cmdline
>- /sys/devices/systemn/memory/auto_online_blocks
>just like we are able to specify for a single memory block via
>/sys/devices/systemn/memory/memoryX/state
>
>Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>Cc: Andrew Morton <akpm@linux-foundation.org>
>Cc: Michal Hocko <mhocko@kernel.org>
>Cc: Oscar Salvador <osalvador@suse.de>
>Cc: "Rafael J. Wysocki" <rafael@kernel.org>
>Cc: Baoquan He <bhe@redhat.com>
>Cc: Wei Yang <richard.weiyang@gmail.com>
>Signed-off-by: David Hildenbrand <david@redhat.com>

Ok, I got the reason to leave the change on string compare here.

Reviewed-by: Wei Yang <richard.weiyang@gmail.com>

>---
> drivers/base/memory.c          | 11 +++++------
> include/linux/memory_hotplug.h |  2 ++
> mm/memory_hotplug.c            |  8 ++++----
> 3 files changed, 11 insertions(+), 10 deletions(-)
>
>diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>index 8d3e16dab69f..2b09b68b9f78 100644
>--- a/drivers/base/memory.c
>+++ b/drivers/base/memory.c
>@@ -35,7 +35,7 @@ static const char *const online_type_to_str[] = {
> 	[MMOP_ONLINE_MOVABLE] = "online_movable",
> };
> 
>-static int memhp_online_type_from_str(const char *str)
>+int memhp_online_type_from_str(const char *str)
> {
> 	int i;
> 
>@@ -394,13 +394,12 @@ static ssize_t auto_online_blocks_store(struct device *dev,
> 					struct device_attribute *attr,
> 					const char *buf, size_t count)
> {
>-	if (sysfs_streq(buf, "online"))
>-		memhp_default_online_type = MMOP_ONLINE;
>-	else if (sysfs_streq(buf, "offline"))
>-		memhp_default_online_type = MMOP_OFFLINE;
>-	else
>+	const int online_type = memhp_online_type_from_str(buf);
>+
>+	if (online_type < 0)
> 		return -EINVAL;
> 
>+	memhp_default_online_type = online_type;
> 	return count;
> }
> 
>diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
>index c6e090b34c4b..ef55115320fb 100644
>--- a/include/linux/memory_hotplug.h
>+++ b/include/linux/memory_hotplug.h
>@@ -117,6 +117,8 @@ extern int arch_add_memory(int nid, u64 start, u64 size,
> 			struct mhp_restrictions *restrictions);
> extern u64 max_mem_size;
> 
>+extern int memhp_online_type_from_str(const char *str);
>+
> /* Default online_type (MMOP_*) when new memory blocks are added. */
> extern int memhp_default_online_type;
> /* If movable_node boot option specified */
>diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>index 01443c70aa27..4a96273eafa7 100644
>--- a/mm/memory_hotplug.c
>+++ b/mm/memory_hotplug.c
>@@ -75,10 +75,10 @@ EXPORT_SYMBOL_GPL(memhp_default_online_type);
> 
> static int __init setup_memhp_default_state(char *str)
> {
>-	if (!strcmp(str, "online"))
>-		memhp_default_online_type = MMOP_ONLINE;
>-	else if (!strcmp(str, "offline"))
>-		memhp_default_online_type = MMOP_OFFLINE;
>+	const int online_type = memhp_online_type_from_str(str);
>+
>+	if (online_type >= 0)
>+		memhp_default_online_type = online_type;
> 
> 	return 1;
> }
>-- 
>2.24.1

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v1 3/5] drivers/base/memory: store mapping between MMOP_* and string in an array
  2020-03-11 14:20   ` Wei Yang
@ 2020-03-11 14:27     ` Wei Yang
  0 siblings, 0 replies; 24+ messages in thread
From: Wei Yang @ 2020-03-11 14:27 UTC (permalink / raw)
  To: Wei Yang
  Cc: David Hildenbrand, linux-kernel, linux-mm, linuxppc-dev,
	linux-hyperv, Greg Kroah-Hartman, Andrew Morton, Michal Hocko,
	Oscar Salvador, Rafael J. Wysocki, Baoquan He

On Wed, Mar 11, 2020 at 02:20:02PM +0000, Wei Yang wrote:
>On Wed, Mar 11, 2020 at 01:30:24PM +0100, David Hildenbrand wrote:
>>Let's use a simple array which we can reuse soon. While at it, move the
>>string->mmop conversion out of the device hotplug lock.
>>
>>Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>>Cc: Andrew Morton <akpm@linux-foundation.org>
>>Cc: Michal Hocko <mhocko@kernel.org>
>>Cc: Oscar Salvador <osalvador@suse.de>
>>Cc: "Rafael J. Wysocki" <rafael@kernel.org>
>>Cc: Baoquan He <bhe@redhat.com>
>>Cc: Wei Yang <richard.weiyang@gmail.com>
>>Signed-off-by: David Hildenbrand <david@redhat.com>

Ok, I got the reason.

Reviewed-by: Wei Yang <richard.weiyang@gmail.com>

>>---
>> drivers/base/memory.c | 38 +++++++++++++++++++++++---------------
>> 1 file changed, 23 insertions(+), 15 deletions(-)
>>
>>diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>>index e7e77cafef80..8a7f29c0bf97 100644
>>--- a/drivers/base/memory.c
>>+++ b/drivers/base/memory.c
>>@@ -28,6 +28,24 @@
>> 
>> #define MEMORY_CLASS_NAME	"memory"
>> 
>>+static const char *const online_type_to_str[] = {
>>+	[MMOP_OFFLINE] = "offline",
>>+	[MMOP_ONLINE] = "online",
>>+	[MMOP_ONLINE_KERNEL] = "online_kernel",
>>+	[MMOP_ONLINE_MOVABLE] = "online_movable",
>>+};
>>+
>>+static int memhp_online_type_from_str(const char *str)
>>+{
>>+	int i;
>>+
>>+	for (i = 0; i < ARRAY_SIZE(online_type_to_str); i++) {
>>+		if (sysfs_streq(str, online_type_to_str[i]))
>>+			return i;
>>+	}
>>+	return -EINVAL;
>>+}
>>+
>> #define to_memory_block(dev) container_of(dev, struct memory_block, dev)
>> 
>> static int sections_per_block;
>>@@ -236,26 +254,17 @@ static int memory_subsys_offline(struct device *dev)
>> static ssize_t state_store(struct device *dev, struct device_attribute *attr,
>> 			   const char *buf, size_t count)
>> {
>>+	const int online_type = memhp_online_type_from_str(buf);
>
>In your following patch, you did the same conversion. Is it possible to merge
>them into this one?
>
>> 	struct memory_block *mem = to_memory_block(dev);
>>-	int ret, online_type;
>>+	int ret;
>>+
>>+	if (online_type < 0)
>>+		return -EINVAL;
>> 
>> 	ret = lock_device_hotplug_sysfs();
>> 	if (ret)
>> 		return ret;
>> 
>>-	if (sysfs_streq(buf, "online_kernel"))
>>-		online_type = MMOP_ONLINE_KERNEL;
>>-	else if (sysfs_streq(buf, "online_movable"))
>>-		online_type = MMOP_ONLINE_MOVABLE;
>>-	else if (sysfs_streq(buf, "online"))
>>-		online_type = MMOP_ONLINE;
>>-	else if (sysfs_streq(buf, "offline"))
>>-		online_type = MMOP_OFFLINE;
>>-	else {
>>-		ret = -EINVAL;
>>-		goto err;
>>-	}
>>-
>> 	switch (online_type) {
>> 	case MMOP_ONLINE_KERNEL:
>> 	case MMOP_ONLINE_MOVABLE:
>>@@ -271,7 +280,6 @@ static ssize_t state_store(struct device *dev, struct device_attribute *attr,
>> 		ret = -EINVAL; /* should never happen */
>> 	}
>> 
>>-err:
>> 	unlock_device_hotplug();
>> 
>> 	if (ret < 0)
>>-- 
>>2.24.1
>
>-- 
>Wei Yang
>Help you, Help me

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v1 5/5] mm/memory_hotplug: allow to specify a default online_type
  2020-03-11 14:26   ` Wei Yang
@ 2020-03-11 15:20     ` David Hildenbrand
  0 siblings, 0 replies; 24+ messages in thread
From: David Hildenbrand @ 2020-03-11 15:20 UTC (permalink / raw)
  To: Wei Yang
  Cc: linux-kernel, linux-mm, linuxppc-dev, linux-hyperv,
	Greg Kroah-Hartman, Andrew Morton, Michal Hocko, Oscar Salvador,
	Rafael J. Wysocki, Baoquan He

On 11.03.20 15:26, Wei Yang wrote:
> On Wed, Mar 11, 2020 at 01:30:26PM +0100, David Hildenbrand wrote:
>> For now, distributions implement advanced udev rules to essentially
>> - Don't online any hotplugged memory (s390x)
>> - Online all memory to ZONE_NORMAL (e.g., most virt environments like
>>  hyperv)
>> - Online all memory to ZONE_MOVABLE in case the zone imbalance is taken
>>  care of (e.g., bare metal, special virt environments)
>>
>> In summary: All memory is usually onlined the same way, however, the
>> kernel always has to ask userspace to come up with the same answer.
>> E.g., HyperV always waits for a memory block to get onlined before
>> continuing, otherwise it might end up adding memory faster than
>> hotplugging it, which can result in strange OOM situations.
>>
>> Let's allow to specify a default online_type, not just "online" and
>> "offline". This allows distributions to configure the default online_type
>> when booting up and be done with it.
>>
>> We can now specify "offline", "online", "online_movable" and
>> "online_kernel" via
>> - "memhp_default_state=" on the kernel cmdline
>> - /sys/devices/systemn/memory/auto_online_blocks
>> just like we are able to specify for a single memory block via
>> /sys/devices/systemn/memory/memoryX/state
>>
>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Michal Hocko <mhocko@kernel.org>
>> Cc: Oscar Salvador <osalvador@suse.de>
>> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
>> Cc: Baoquan He <bhe@redhat.com>
>> Cc: Wei Yang <richard.weiyang@gmail.com>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
> 
> Ok, I got the reason to leave the change on string compare here.

Thanks for your *very fast* review! :)


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v1 5/5] mm/memory_hotplug: allow to specify a default online_type
  2020-03-11 12:30 ` [PATCH v1 5/5] mm/memory_hotplug: allow to specify a default online_type David Hildenbrand
  2020-03-11 14:26   ` Wei Yang
@ 2020-03-11 16:55   ` Vitaly Kuznetsov
  2020-03-11 17:05     ` David Hildenbrand
  2020-03-16 15:31   ` Michal Hocko
  2 siblings, 1 reply; 24+ messages in thread
From: Vitaly Kuznetsov @ 2020-03-11 16:55 UTC (permalink / raw)
  To: David Hildenbrand, linux-kernel
  Cc: linux-mm, linuxppc-dev, linux-hyperv, David Hildenbrand,
	Greg Kroah-Hartman, Andrew Morton, Michal Hocko, Oscar Salvador,
	Rafael J. Wysocki, Baoquan He, Wei Yang

David Hildenbrand <david@redhat.com> writes:

> For now, distributions implement advanced udev rules to essentially
> - Don't online any hotplugged memory (s390x)
> - Online all memory to ZONE_NORMAL (e.g., most virt environments like
>   hyperv)
> - Online all memory to ZONE_MOVABLE in case the zone imbalance is taken
>   care of (e.g., bare metal, special virt environments)
>
> In summary: All memory is usually onlined the same way, however, the
> kernel always has to ask userspace to come up with the same answer.
> E.g., HyperV always waits for a memory block to get onlined before
> continuing, otherwise it might end up adding memory faster than
> hotplugging it, which can result in strange OOM situations.
>
> Let's allow to specify a default online_type, not just "online" and
> "offline". This allows distributions to configure the default online_type
> when booting up and be done with it.
>
> We can now specify "offline", "online", "online_movable" and
> "online_kernel" via
> - "memhp_default_state=" on the kernel cmdline
> - /sys/devices/systemn/memory/auto_online_blocks
> just like we are able to specify for a single memory block via
> /sys/devices/systemn/memory/memoryX/state
>

Thank you for picking this up! 

It's been awhile since I've added CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE
but I vaguely recall one problem: memory hotplug may happen *very* early
(just because some memory is presented to a VM as hotplug memory, it is
not in e820). It happens way before we launch userspace (including
udev). The question is -- which ZONE will this memory be assigned too?

'memhp_default_state=' resolves the issue but nobody likes additional
kernel parameters for anything but
debug. CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE was supposed to help, but it
is binary and distro-wide (so *all* deployments will get the same
default and as you validly stated we want it differently).

We could've added something like your example onlining script to the
kernel itself but this is likely going to be hard to sell: "policies
belong to userspace!" will likely be the answer. 

So if we don't want to start the endless discussions (again), your
proposal is 'good enough'.


> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Wei Yang <richard.weiyang@gmail.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  drivers/base/memory.c          | 11 +++++------
>  include/linux/memory_hotplug.h |  2 ++
>  mm/memory_hotplug.c            |  8 ++++----
>  3 files changed, 11 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index 8d3e16dab69f..2b09b68b9f78 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -35,7 +35,7 @@ static const char *const online_type_to_str[] = {
>  	[MMOP_ONLINE_MOVABLE] = "online_movable",
>  };
>  
> -static int memhp_online_type_from_str(const char *str)
> +int memhp_online_type_from_str(const char *str)
>  {
>  	int i;
>  
> @@ -394,13 +394,12 @@ static ssize_t auto_online_blocks_store(struct device *dev,
>  					struct device_attribute *attr,
>  					const char *buf, size_t count)
>  {
> -	if (sysfs_streq(buf, "online"))
> -		memhp_default_online_type = MMOP_ONLINE;
> -	else if (sysfs_streq(buf, "offline"))
> -		memhp_default_online_type = MMOP_OFFLINE;
> -	else
> +	const int online_type = memhp_online_type_from_str(buf);
> +
> +	if (online_type < 0)
>  		return -EINVAL;
>  
> +	memhp_default_online_type = online_type;
>  	return count;
>  }
>  
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index c6e090b34c4b..ef55115320fb 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -117,6 +117,8 @@ extern int arch_add_memory(int nid, u64 start, u64 size,
>  			struct mhp_restrictions *restrictions);
>  extern u64 max_mem_size;
>  
> +extern int memhp_online_type_from_str(const char *str);
> +
>  /* Default online_type (MMOP_*) when new memory blocks are added. */
>  extern int memhp_default_online_type;
>  /* If movable_node boot option specified */
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 01443c70aa27..4a96273eafa7 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -75,10 +75,10 @@ EXPORT_SYMBOL_GPL(memhp_default_online_type);
>  
>  static int __init setup_memhp_default_state(char *str)
>  {
> -	if (!strcmp(str, "online"))
> -		memhp_default_online_type = MMOP_ONLINE;
> -	else if (!strcmp(str, "offline"))
> -		memhp_default_online_type = MMOP_OFFLINE;
> +	const int online_type = memhp_online_type_from_str(str);
> +
> +	if (online_type >= 0)
> +		memhp_default_online_type = online_type;
>  
>  	return 1;
>  }

-- 
Vitaly


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v1 5/5] mm/memory_hotplug: allow to specify a default online_type
  2020-03-11 16:55   ` Vitaly Kuznetsov
@ 2020-03-11 17:05     ` David Hildenbrand
  0 siblings, 0 replies; 24+ messages in thread
From: David Hildenbrand @ 2020-03-11 17:05 UTC (permalink / raw)
  To: Vitaly Kuznetsov, linux-kernel
  Cc: linux-mm, linuxppc-dev, linux-hyperv, Greg Kroah-Hartman,
	Andrew Morton, Michal Hocko, Oscar Salvador, Rafael J. Wysocki,
	Baoquan He, Wei Yang

On 11.03.20 17:55, Vitaly Kuznetsov wrote:
> David Hildenbrand <david@redhat.com> writes:
> 
>> For now, distributions implement advanced udev rules to essentially
>> - Don't online any hotplugged memory (s390x)
>> - Online all memory to ZONE_NORMAL (e.g., most virt environments like
>>   hyperv)
>> - Online all memory to ZONE_MOVABLE in case the zone imbalance is taken
>>   care of (e.g., bare metal, special virt environments)
>>
>> In summary: All memory is usually onlined the same way, however, the
>> kernel always has to ask userspace to come up with the same answer.
>> E.g., HyperV always waits for a memory block to get onlined before
>> continuing, otherwise it might end up adding memory faster than
>> hotplugging it, which can result in strange OOM situations.
>>
>> Let's allow to specify a default online_type, not just "online" and
>> "offline". This allows distributions to configure the default online_type
>> when booting up and be done with it.
>>
>> We can now specify "offline", "online", "online_movable" and
>> "online_kernel" via
>> - "memhp_default_state=" on the kernel cmdline
>> - /sys/devices/systemn/memory/auto_online_blocks
>> just like we are able to specify for a single memory block via
>> /sys/devices/systemn/memory/memoryX/state
>>
> 
> Thank you for picking this up! 
> 
> It's been awhile since I've added CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE
> but I vaguely recall one problem: memory hotplug may happen *very* early
> (just because some memory is presented to a VM as hotplug memory, it is
> not in e820). It happens way before we launch userspace (including
> udev). The question is -- which ZONE will this memory be assigned too?

If it's added via add_memory() ("hot/cold plugged memory") like ACPI
DIMMs not part of e820, Hyper-V balloon added memory, XEN balloon added
memory, s390x standby memory etc. the memory will be onlined as
configured via memhp_default_online_type. Assume that one is set to
"offline".

*If* userspace changes memhp_default_online_type (as in my script in the
cover letter), userspace has to online all memory that has been added
before userspace was active itself (again, as done in my script).

Memory not added via add_memory() is considered "initial memory" and not
as hot/cold plugged memory.

Same handling as for now using udev rules. (once userspace is up, udev
rules for all early added memory is triggered as well)

> 
> 'memhp_default_state=' resolves the issue but nobody likes additional
> kernel parameters for anything but
> debug. CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE was supposed to help, but it
> is binary and distro-wide (so *all* deployments will get the same
> default and as you validly stated we want it differently).
> 
> We could've added something like your example onlining script to the
> kernel itself but this is likely going to be hard to sell: "policies
> belong to userspace!" will likely be the answer. 

Exactly my thought.

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v1 1/5] drivers/base/memory: rename MMOP_ONLINE_KEEP to MMOP_ONLINE
  2020-03-11 12:30 ` [PATCH v1 1/5] drivers/base/memory: rename MMOP_ONLINE_KEEP to MMOP_ONLINE David Hildenbrand
  2020-03-11 14:17   ` Wei Yang
@ 2020-03-16 15:12   ` Michal Hocko
  2020-03-16 15:17     ` David Hildenbrand
  1 sibling, 1 reply; 24+ messages in thread
From: Michal Hocko @ 2020-03-16 15:12 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, linuxppc-dev, linux-hyperv,
	Greg Kroah-Hartman, Andrew Morton, Oscar Salvador,
	Rafael J. Wysocki, Baoquan He, Wei Yang

On Wed 11-03-20 13:30:22, David Hildenbrand wrote:
> The name is misleading. Let's just name it like the online_type name we
> expose to user space ("online").

I would disagree the name is misleading. It just says that you want to
online and keep the zone type. Nothing I would insist on though.

> Add some documentation to the types.
> 
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Wei Yang <richard.weiyang@gmail.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  drivers/base/memory.c          | 9 +++++----
>  include/linux/memory_hotplug.h | 6 +++++-
>  2 files changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index 6448c9ece2cb..8c5ce42c0fc3 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -216,7 +216,7 @@ static int memory_subsys_online(struct device *dev)
>  	 * attribute and need to set the online_type.
>  	 */
>  	if (mem->online_type < 0)
> -		mem->online_type = MMOP_ONLINE_KEEP;
> +		mem->online_type = MMOP_ONLINE;
>  
>  	ret = memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE);
>  
> @@ -251,7 +251,7 @@ static ssize_t state_store(struct device *dev, struct device_attribute *attr,
>  	else if (sysfs_streq(buf, "online_movable"))
>  		online_type = MMOP_ONLINE_MOVABLE;
>  	else if (sysfs_streq(buf, "online"))
> -		online_type = MMOP_ONLINE_KEEP;
> +		online_type = MMOP_ONLINE;
>  	else if (sysfs_streq(buf, "offline"))
>  		online_type = MMOP_OFFLINE;
>  	else {
> @@ -262,7 +262,7 @@ static ssize_t state_store(struct device *dev, struct device_attribute *attr,
>  	switch (online_type) {
>  	case MMOP_ONLINE_KERNEL:
>  	case MMOP_ONLINE_MOVABLE:
> -	case MMOP_ONLINE_KEEP:
> +	case MMOP_ONLINE:
>  		/* mem->online_type is protected by device_hotplug_lock */
>  		mem->online_type = online_type;
>  		ret = device_online(&mem->dev);
> @@ -342,7 +342,8 @@ static ssize_t valid_zones_show(struct device *dev,
>  	}
>  
>  	nid = mem->nid;
> -	default_zone = zone_for_pfn_range(MMOP_ONLINE_KEEP, nid, start_pfn, nr_pages);
> +	default_zone = zone_for_pfn_range(MMOP_ONLINE, nid, start_pfn,
> +					  nr_pages);
>  	strcat(buf, default_zone->name);
>  
>  	print_allowed_zone(buf, nid, start_pfn, nr_pages, MMOP_ONLINE_KERNEL,
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index f4d59155f3d4..261dbf010d5d 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -47,9 +47,13 @@ enum {
>  
>  /* Types for control the zone type of onlined and offlined memory */
>  enum {
> +	/* Offline the memory. */
>  	MMOP_OFFLINE = -1,
> -	MMOP_ONLINE_KEEP,
> +	/* Online the memory. Zone depends, see default_zone_for_pfn(). */
> +	MMOP_ONLINE,
> +	/* Online the memory to ZONE_NORMAL. */
>  	MMOP_ONLINE_KERNEL,
> +	/* Online the memory to ZONE_MOVABLE. */
>  	MMOP_ONLINE_MOVABLE,
>  };
>  
> -- 
> 2.24.1

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v1 1/5] drivers/base/memory: rename MMOP_ONLINE_KEEP to MMOP_ONLINE
  2020-03-16 15:12   ` Michal Hocko
@ 2020-03-16 15:17     ` David Hildenbrand
  0 siblings, 0 replies; 24+ messages in thread
From: David Hildenbrand @ 2020-03-16 15:17 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, linuxppc-dev, linux-hyperv,
	Greg Kroah-Hartman, Andrew Morton, Oscar Salvador,
	Rafael J. Wysocki, Baoquan He, Wei Yang

On 16.03.20 16:12, Michal Hocko wrote:
> On Wed 11-03-20 13:30:22, David Hildenbrand wrote:
>> The name is misleading. Let's just name it like the online_type name we
>> expose to user space ("online").
> 
> I would disagree the name is misleading. It just says that you want to
> online and keep the zone type. Nothing I would insist on though.

"online and keep the zone type" - that's not what's happening.

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v1 2/5] drivers/base/memory: map MMOP_OFFLINE to 0
  2020-03-11 12:30 ` [PATCH v1 2/5] drivers/base/memory: map MMOP_OFFLINE to 0 David Hildenbrand
  2020-03-11 14:18   ` Wei Yang
@ 2020-03-16 15:19   ` Michal Hocko
  1 sibling, 0 replies; 24+ messages in thread
From: Michal Hocko @ 2020-03-16 15:19 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, linuxppc-dev, linux-hyperv,
	Greg Kroah-Hartman, Andrew Morton, Oscar Salvador,
	Rafael J. Wysocki, Baoquan He, Wei Yang

On Wed 11-03-20 13:30:23, David Hildenbrand wrote:
> I have no idea why we have to start at -1.

Because this is how the offline state offline was represented
originally.

> Just treat 0 as the special
> case. Clarify a comment (which was wrong, when we come via
> device_online() the first time, the online_type would have been 0 /
> MEM_ONLINE). The default is now always MMOP_OFFLINE.

git grep says that you have covered the only remaining place which
hasn't used the enum value.

> This is a preparation to use the online_type as an array index.
> 
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Wei Yang <richard.weiyang@gmail.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  drivers/base/memory.c          | 11 ++++-------
>  include/linux/memory_hotplug.h |  2 +-
>  2 files changed, 5 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index 8c5ce42c0fc3..e7e77cafef80 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -211,17 +211,14 @@ static int memory_subsys_online(struct device *dev)
>  		return 0;
>  
>  	/*
> -	 * If we are called from state_store(), online_type will be
> -	 * set >= 0 Otherwise we were called from the device online
> -	 * attribute and need to set the online_type.
> +	 * When called via device_online() without configuring the online_type,
> +	 * we want to default to MMOP_ONLINE.
>  	 */
> -	if (mem->online_type < 0)
> +	if (mem->online_type == MMOP_OFFLINE)
>  		mem->online_type = MMOP_ONLINE;
>  
>  	ret = memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE);
> -
> -	/* clear online_type */
> -	mem->online_type = -1;
> +	mem->online_type = MMOP_OFFLINE;
>  
>  	return ret;
>  }
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 261dbf010d5d..c2e06ed5e0e9 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -48,7 +48,7 @@ enum {
>  /* Types for control the zone type of onlined and offlined memory */
>  enum {
>  	/* Offline the memory. */
> -	MMOP_OFFLINE = -1,
> +	MMOP_OFFLINE = 0,
>  	/* Online the memory. Zone depends, see default_zone_for_pfn(). */
>  	MMOP_ONLINE,
>  	/* Online the memory to ZONE_NORMAL. */
> -- 
> 2.24.1

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v1 3/5] drivers/base/memory: store mapping between MMOP_* and string in an array
  2020-03-11 12:30 ` [PATCH v1 3/5] drivers/base/memory: store mapping between MMOP_* and string in an array David Hildenbrand
  2020-03-11 14:20   ` Wei Yang
@ 2020-03-16 15:20   ` Michal Hocko
  1 sibling, 0 replies; 24+ messages in thread
From: Michal Hocko @ 2020-03-16 15:20 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, linuxppc-dev, linux-hyperv,
	Greg Kroah-Hartman, Andrew Morton, Oscar Salvador,
	Rafael J. Wysocki, Baoquan He, Wei Yang

On Wed 11-03-20 13:30:24, David Hildenbrand wrote:
> Let's use a simple array which we can reuse soon. While at it, move the
> string->mmop conversion out of the device hotplug lock.
> 
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Wei Yang <richard.weiyang@gmail.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  drivers/base/memory.c | 38 +++++++++++++++++++++++---------------
>  1 file changed, 23 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index e7e77cafef80..8a7f29c0bf97 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -28,6 +28,24 @@
>  
>  #define MEMORY_CLASS_NAME	"memory"
>  
> +static const char *const online_type_to_str[] = {
> +	[MMOP_OFFLINE] = "offline",
> +	[MMOP_ONLINE] = "online",
> +	[MMOP_ONLINE_KERNEL] = "online_kernel",
> +	[MMOP_ONLINE_MOVABLE] = "online_movable",
> +};
> +
> +static int memhp_online_type_from_str(const char *str)
> +{
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(online_type_to_str); i++) {
> +		if (sysfs_streq(str, online_type_to_str[i]))
> +			return i;
> +	}
> +	return -EINVAL;
> +}
> +
>  #define to_memory_block(dev) container_of(dev, struct memory_block, dev)
>  
>  static int sections_per_block;
> @@ -236,26 +254,17 @@ static int memory_subsys_offline(struct device *dev)
>  static ssize_t state_store(struct device *dev, struct device_attribute *attr,
>  			   const char *buf, size_t count)
>  {
> +	const int online_type = memhp_online_type_from_str(buf);
>  	struct memory_block *mem = to_memory_block(dev);
> -	int ret, online_type;
> +	int ret;
> +
> +	if (online_type < 0)
> +		return -EINVAL;
>  
>  	ret = lock_device_hotplug_sysfs();
>  	if (ret)
>  		return ret;
>  
> -	if (sysfs_streq(buf, "online_kernel"))
> -		online_type = MMOP_ONLINE_KERNEL;
> -	else if (sysfs_streq(buf, "online_movable"))
> -		online_type = MMOP_ONLINE_MOVABLE;
> -	else if (sysfs_streq(buf, "online"))
> -		online_type = MMOP_ONLINE;
> -	else if (sysfs_streq(buf, "offline"))
> -		online_type = MMOP_OFFLINE;
> -	else {
> -		ret = -EINVAL;
> -		goto err;
> -	}
> -
>  	switch (online_type) {
>  	case MMOP_ONLINE_KERNEL:
>  	case MMOP_ONLINE_MOVABLE:
> @@ -271,7 +280,6 @@ static ssize_t state_store(struct device *dev, struct device_attribute *attr,
>  		ret = -EINVAL; /* should never happen */
>  	}
>  
> -err:
>  	unlock_device_hotplug();
>  
>  	if (ret < 0)
> -- 
> 2.24.1

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v1 4/5] mm/memory_hotplug: convert memhp_auto_online to store an online_type
  2020-03-11 12:30 ` [PATCH v1 4/5] mm/memory_hotplug: convert memhp_auto_online to store an online_type David Hildenbrand
  2020-03-11 14:25   ` Wei Yang
@ 2020-03-16 15:24   ` Michal Hocko
  2020-03-16 15:34     ` David Hildenbrand
  1 sibling, 1 reply; 24+ messages in thread
From: Michal Hocko @ 2020-03-16 15:24 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, linuxppc-dev, linux-hyperv,
	Greg Kroah-Hartman, Andrew Morton, Oscar Salvador,
	Rafael J. Wysocki, Baoquan He, Wei Yang, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, K. Y. Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Wei Liu, Thomas Gleixner

On Wed 11-03-20 13:30:25, David Hildenbrand wrote:
[...]
> diff --git a/arch/powerpc/platforms/powernv/memtrace.c b/arch/powerpc/platforms/powernv/memtrace.c
> index d6d64f8718e6..e15a600cfa4d 100644
> --- a/arch/powerpc/platforms/powernv/memtrace.c
> +++ b/arch/powerpc/platforms/powernv/memtrace.c
> @@ -235,7 +235,7 @@ static int memtrace_online(void)
>  		 * If kernel isn't compiled with the auto online option
>  		 * we need to online the memory ourselves.
>  		 */
> -		if (!memhp_auto_online) {
> +		if (memhp_default_online_type == MMOP_OFFLINE) {
>  			lock_device_hotplug();
>  			walk_memory_blocks(ent->start, ent->size, NULL,
>  					   online_mem_block);

Whut? This stinks, doesn't it. For your defense, the original code is
fishy already but this just makes it even more ugly.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v1 5/5] mm/memory_hotplug: allow to specify a default online_type
  2020-03-11 12:30 ` [PATCH v1 5/5] mm/memory_hotplug: allow to specify a default online_type David Hildenbrand
  2020-03-11 14:26   ` Wei Yang
  2020-03-11 16:55   ` Vitaly Kuznetsov
@ 2020-03-16 15:31   ` Michal Hocko
  2020-03-16 15:48     ` David Hildenbrand
  2 siblings, 1 reply; 24+ messages in thread
From: Michal Hocko @ 2020-03-16 15:31 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, linuxppc-dev, linux-hyperv,
	Greg Kroah-Hartman, Andrew Morton, Oscar Salvador,
	Rafael J. Wysocki, Baoquan He, Wei Yang

On Wed 11-03-20 13:30:26, David Hildenbrand wrote:
> For now, distributions implement advanced udev rules to essentially
> - Don't online any hotplugged memory (s390x)
> - Online all memory to ZONE_NORMAL (e.g., most virt environments like
>   hyperv)
> - Online all memory to ZONE_MOVABLE in case the zone imbalance is taken
>   care of (e.g., bare metal, special virt environments)
> 
> In summary: All memory is usually onlined the same way, however, the
> kernel always has to ask userspace to come up with the same answer.
> E.g., HyperV always waits for a memory block to get onlined before
> continuing, otherwise it might end up adding memory faster than
> hotplugging it, which can result in strange OOM situations.
> 
> Let's allow to specify a default online_type, not just "online" and
> "offline". This allows distributions to configure the default online_type
> when booting up and be done with it.
> 
> We can now specify "offline", "online", "online_movable" and
> "online_kernel" via
> - "memhp_default_state=" on the kernel cmdline
> - /sys/devices/systemn/memory/auto_online_blocks
> just like we are able to specify for a single memory block via
> /sys/devices/systemn/memory/memoryX/state

I still strongly believe that the whole interface is wrong. This is just
adding more lipstick on the pig. On the other hand I recognize that the
event based onlining is a PITA as well. The proper interface would
somehow communicate the type of the memory via the event or other sysfs
attribute and then the FW/HV could tell that this is an offline memory,
hotplugable memory or just an additional memory that doesn't need to
support hotremove by the consumer. The userspace or the kernel could
handle the hotadd request much more easier that way.

> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Wei Yang <richard.weiyang@gmail.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>

That being said, I will not object to this patch. I simply gave up
fighting this interface. So if it works for consumers and it doesn't
break the existing userspace (which is shouldn't AFAICS) then go ahead.

> ---
>  drivers/base/memory.c          | 11 +++++------
>  include/linux/memory_hotplug.h |  2 ++
>  mm/memory_hotplug.c            |  8 ++++----
>  3 files changed, 11 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index 8d3e16dab69f..2b09b68b9f78 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -35,7 +35,7 @@ static const char *const online_type_to_str[] = {
>  	[MMOP_ONLINE_MOVABLE] = "online_movable",
>  };
>  
> -static int memhp_online_type_from_str(const char *str)
> +int memhp_online_type_from_str(const char *str)
>  {
>  	int i;
>  
> @@ -394,13 +394,12 @@ static ssize_t auto_online_blocks_store(struct device *dev,
>  					struct device_attribute *attr,
>  					const char *buf, size_t count)
>  {
> -	if (sysfs_streq(buf, "online"))
> -		memhp_default_online_type = MMOP_ONLINE;
> -	else if (sysfs_streq(buf, "offline"))
> -		memhp_default_online_type = MMOP_OFFLINE;
> -	else
> +	const int online_type = memhp_online_type_from_str(buf);
> +
> +	if (online_type < 0)
>  		return -EINVAL;
>  
> +	memhp_default_online_type = online_type;
>  	return count;
>  }
>  
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index c6e090b34c4b..ef55115320fb 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -117,6 +117,8 @@ extern int arch_add_memory(int nid, u64 start, u64 size,
>  			struct mhp_restrictions *restrictions);
>  extern u64 max_mem_size;
>  
> +extern int memhp_online_type_from_str(const char *str);
> +
>  /* Default online_type (MMOP_*) when new memory blocks are added. */
>  extern int memhp_default_online_type;
>  /* If movable_node boot option specified */
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 01443c70aa27..4a96273eafa7 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -75,10 +75,10 @@ EXPORT_SYMBOL_GPL(memhp_default_online_type);
>  
>  static int __init setup_memhp_default_state(char *str)
>  {
> -	if (!strcmp(str, "online"))
> -		memhp_default_online_type = MMOP_ONLINE;
> -	else if (!strcmp(str, "offline"))
> -		memhp_default_online_type = MMOP_OFFLINE;
> +	const int online_type = memhp_online_type_from_str(str);
> +
> +	if (online_type >= 0)
> +		memhp_default_online_type = online_type;
>  
>  	return 1;
>  }
> -- 
> 2.24.1

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v1 4/5] mm/memory_hotplug: convert memhp_auto_online to store an online_type
  2020-03-16 15:24   ` Michal Hocko
@ 2020-03-16 15:34     ` David Hildenbrand
  2020-03-16 15:46       ` Michal Hocko
  0 siblings, 1 reply; 24+ messages in thread
From: David Hildenbrand @ 2020-03-16 15:34 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, linuxppc-dev, linux-hyperv,
	Greg Kroah-Hartman, Andrew Morton, Oscar Salvador,
	Rafael J. Wysocki, Baoquan He, Wei Yang, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, K. Y. Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Wei Liu, Thomas Gleixner

On 16.03.20 16:24, Michal Hocko wrote:
> On Wed 11-03-20 13:30:25, David Hildenbrand wrote:
> [...]
>> diff --git a/arch/powerpc/platforms/powernv/memtrace.c b/arch/powerpc/platforms/powernv/memtrace.c
>> index d6d64f8718e6..e15a600cfa4d 100644
>> --- a/arch/powerpc/platforms/powernv/memtrace.c
>> +++ b/arch/powerpc/platforms/powernv/memtrace.c
>> @@ -235,7 +235,7 @@ static int memtrace_online(void)
>>  		 * If kernel isn't compiled with the auto online option
>>  		 * we need to online the memory ourselves.
>>  		 */
>> -		if (!memhp_auto_online) {
>> +		if (memhp_default_online_type == MMOP_OFFLINE) {
>>  			lock_device_hotplug();
>>  			walk_memory_blocks(ent->start, ent->size, NULL,
>>  					   online_mem_block);
> 
> Whut? This stinks, doesn't it. For your defense, the original code is
> fishy already but this just makes it even more ugly.

PPC64 onlines all memory directly from the kernel, and not triggered by
user space (I think that's ugly and not desired, but it is what it is
and I am not going to touch that).

See
arch/powerpc/platforms/pseries/hotplug-memory.c:dlpar_change_lmb_state().

Best I can do here is to also always online all memory.

What are your suggestions?

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v1 4/5] mm/memory_hotplug: convert memhp_auto_online to store an online_type
  2020-03-16 15:34     ` David Hildenbrand
@ 2020-03-16 15:46       ` Michal Hocko
  0 siblings, 0 replies; 24+ messages in thread
From: Michal Hocko @ 2020-03-16 15:46 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, linuxppc-dev, linux-hyperv,
	Greg Kroah-Hartman, Andrew Morton, Oscar Salvador,
	Rafael J. Wysocki, Baoquan He, Wei Yang, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, K. Y. Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Wei Liu, Thomas Gleixner

On Mon 16-03-20 16:34:06, David Hildenbrand wrote:
[...]
> Best I can do here is to also always online all memory.

Yes that sounds like a cleaner solution than having a condition that
doesn't make much sense at first glance.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v1 5/5] mm/memory_hotplug: allow to specify a default online_type
  2020-03-16 15:31   ` Michal Hocko
@ 2020-03-16 15:48     ` David Hildenbrand
  0 siblings, 0 replies; 24+ messages in thread
From: David Hildenbrand @ 2020-03-16 15:48 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, linuxppc-dev, linux-hyperv,
	Greg Kroah-Hartman, Andrew Morton, Oscar Salvador,
	Rafael J. Wysocki, Baoquan He, Wei Yang

On 16.03.20 16:31, Michal Hocko wrote:
> On Wed 11-03-20 13:30:26, David Hildenbrand wrote:
>> For now, distributions implement advanced udev rules to essentially
>> - Don't online any hotplugged memory (s390x)
>> - Online all memory to ZONE_NORMAL (e.g., most virt environments like
>>   hyperv)
>> - Online all memory to ZONE_MOVABLE in case the zone imbalance is taken
>>   care of (e.g., bare metal, special virt environments)
>>
>> In summary: All memory is usually onlined the same way, however, the
>> kernel always has to ask userspace to come up with the same answer.
>> E.g., HyperV always waits for a memory block to get onlined before
>> continuing, otherwise it might end up adding memory faster than
>> hotplugging it, which can result in strange OOM situations.
>>
>> Let's allow to specify a default online_type, not just "online" and
>> "offline". This allows distributions to configure the default online_type
>> when booting up and be done with it.
>>
>> We can now specify "offline", "online", "online_movable" and
>> "online_kernel" via
>> - "memhp_default_state=" on the kernel cmdline
>> - /sys/devices/systemn/memory/auto_online_blocks
>> just like we are able to specify for a single memory block via
>> /sys/devices/systemn/memory/memoryX/state
> 
> I still strongly believe that the whole interface is wrong. This is just
> adding more lipstick on the pig. On the other hand I recognize that the
> event based onlining is a PITA as well. The proper interface would
> somehow communicate the type of the memory via the event or other sysfs
> attribute and then the FW/HV could tell that this is an offline memory,
> hotplugable memory or just an additional memory that doesn't need to
> support hotremove by the consumer. The userspace or the kernel could
> handle the hotadd request much more easier that way.

Yeah, and I proposed patches like that which were not well received [1] [2].

But then, user space usually wants to online all memory the same way
right now. Also, HyperV and virtio-mem don't want to wait for onlining
to happen in user space, because it slows down the whole "add a hole
bunch of memory" process.

> 
>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Michal Hocko <mhocko@kernel.org>
>> Cc: Oscar Salvador <osalvador@suse.de>
>> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
>> Cc: Baoquan He <bhe@redhat.com>
>> Cc: Wei Yang <richard.weiyang@gmail.com>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
> 
> That being said, I will not object to this patch. I simply gave up
> fighting this interface. So if it works for consumers and it doesn't
> break the existing userspace (which is shouldn't AFAICS) then go ahead.

As it solves a real problem and makes the interface to auto online
usable, I don't think anything speaks against it.

Thanks!

[1] https://spinics.net/lists/linux-driver-devel/msg118337.html
[2]
https://www.mail-archive.com/xen-devel@lists.xenproject.org/msg32420.html

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2020-03-16 15:49 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-11 12:30 [PATCH v1 0/5] mm/memory_hotplug: allow to specify a default online_type David Hildenbrand
2020-03-11 12:30 ` [PATCH v1 1/5] drivers/base/memory: rename MMOP_ONLINE_KEEP to MMOP_ONLINE David Hildenbrand
2020-03-11 14:17   ` Wei Yang
2020-03-16 15:12   ` Michal Hocko
2020-03-16 15:17     ` David Hildenbrand
2020-03-11 12:30 ` [PATCH v1 2/5] drivers/base/memory: map MMOP_OFFLINE to 0 David Hildenbrand
2020-03-11 14:18   ` Wei Yang
2020-03-16 15:19   ` Michal Hocko
2020-03-11 12:30 ` [PATCH v1 3/5] drivers/base/memory: store mapping between MMOP_* and string in an array David Hildenbrand
2020-03-11 14:20   ` Wei Yang
2020-03-11 14:27     ` Wei Yang
2020-03-16 15:20   ` Michal Hocko
2020-03-11 12:30 ` [PATCH v1 4/5] mm/memory_hotplug: convert memhp_auto_online to store an online_type David Hildenbrand
2020-03-11 14:25   ` Wei Yang
2020-03-16 15:24   ` Michal Hocko
2020-03-16 15:34     ` David Hildenbrand
2020-03-16 15:46       ` Michal Hocko
2020-03-11 12:30 ` [PATCH v1 5/5] mm/memory_hotplug: allow to specify a default online_type David Hildenbrand
2020-03-11 14:26   ` Wei Yang
2020-03-11 15:20     ` David Hildenbrand
2020-03-11 16:55   ` Vitaly Kuznetsov
2020-03-11 17:05     ` David Hildenbrand
2020-03-16 15:31   ` Michal Hocko
2020-03-16 15:48     ` David Hildenbrand

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).