All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1 0/5] mm/memory_hotplug: make offline_and_remove_memory() timeout instead of failing on fatal signals
@ 2023-06-27 11:22 ` David Hildenbrand
  0 siblings, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2023-06-27 11:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, virtualization, David Hildenbrand, Andrew Morton,
	Michael S. Tsirkin, John Hubbard, Oscar Salvador, Michal Hocko,
	Jason Wang, Xuan Zhuo

As raised by John Hubbard [1], offline_and_remove_memory() failing on
fatal signals can be sub-optimal for out-of-tree drivers: dying user space
might be the last one holding a device node open.

As that device node gets closed, the driver might unplug the device
and trigger offline_and_remove_memory() to unplug previously
hotplugged device memory. This, however, will fail reliably when fatal
signals are pending on the dying process, turning the device unusable until
the machine gets rebooted.

That can be optizied easily by ignoring fatal signals. In fact, checking
for fatal signals in the case of offline_and_remove_memory() doesn't
make too much sense; the check makes sense when offlining is triggered
directly via sysfs.  However, we actually do want a way to not end up
stuck in offline_and_remove_memory() forever.

What offline_and_remove_memory() users actually want is fail after some
given timeout and not care about fatal signals.

So let's implement that, optimizing virtio-mem along the way.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>

[1] https://lkml.kernel.org/r/20230620011719.155379-1-jhubbard@nvidia.com

David Hildenbrand (5):
  mm/memory_hotplug: check for fatal signals only in offline_pages()
  virtio-mem: convert most offline_and_remove_memory() errors to -EBUSY
  mm/memory_hotplug: make offline_and_remove_memory() timeout instead of
    failing on fatal signals
  virtio-mem: set the timeout for offline_and_remove_memory() to 10
    seconds
  virtio-mem: check if the config changed before (fake) offlining memory

 drivers/virtio/virtio_mem.c    | 22 +++++++++++++--
 include/linux/memory_hotplug.h |  2 +-
 mm/memory_hotplug.c            | 50 ++++++++++++++++++++++++++++++++--
 3 files changed, 68 insertions(+), 6 deletions(-)


base-commit: 6995e2de6891c724bfeb2db33d7b87775f913ad1
-- 
2.40.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v1 0/5] mm/memory_hotplug: make offline_and_remove_memory() timeout instead of failing on fatal signals
@ 2023-06-27 11:22 ` David Hildenbrand
  0 siblings, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2023-06-27 11:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: Xuan Zhuo, Michal Hocko, John Hubbard, Michael S. Tsirkin,
	virtualization, linux-mm, Andrew Morton, Oscar Salvador

As raised by John Hubbard [1], offline_and_remove_memory() failing on
fatal signals can be sub-optimal for out-of-tree drivers: dying user space
might be the last one holding a device node open.

As that device node gets closed, the driver might unplug the device
and trigger offline_and_remove_memory() to unplug previously
hotplugged device memory. This, however, will fail reliably when fatal
signals are pending on the dying process, turning the device unusable until
the machine gets rebooted.

That can be optizied easily by ignoring fatal signals. In fact, checking
for fatal signals in the case of offline_and_remove_memory() doesn't
make too much sense; the check makes sense when offlining is triggered
directly via sysfs.  However, we actually do want a way to not end up
stuck in offline_and_remove_memory() forever.

What offline_and_remove_memory() users actually want is fail after some
given timeout and not care about fatal signals.

So let's implement that, optimizing virtio-mem along the way.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>

[1] https://lkml.kernel.org/r/20230620011719.155379-1-jhubbard@nvidia.com

David Hildenbrand (5):
  mm/memory_hotplug: check for fatal signals only in offline_pages()
  virtio-mem: convert most offline_and_remove_memory() errors to -EBUSY
  mm/memory_hotplug: make offline_and_remove_memory() timeout instead of
    failing on fatal signals
  virtio-mem: set the timeout for offline_and_remove_memory() to 10
    seconds
  virtio-mem: check if the config changed before (fake) offlining memory

 drivers/virtio/virtio_mem.c    | 22 +++++++++++++--
 include/linux/memory_hotplug.h |  2 +-
 mm/memory_hotplug.c            | 50 ++++++++++++++++++++++++++++++++--
 3 files changed, 68 insertions(+), 6 deletions(-)


base-commit: 6995e2de6891c724bfeb2db33d7b87775f913ad1
-- 
2.40.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v1 1/5] mm/memory_hotplug: check for fatal signals only in offline_pages()
  2023-06-27 11:22 ` David Hildenbrand
@ 2023-06-27 11:22   ` David Hildenbrand
  -1 siblings, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2023-06-27 11:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, virtualization, David Hildenbrand, Andrew Morton,
	Michael S. Tsirkin, John Hubbard, Oscar Salvador, Michal Hocko,
	Jason Wang, Xuan Zhuo

Let's check for fatal signals only. That looks cleaner and still keeps
the documented use case for manual user-space triggered memory offlining
working. From Documentation/admin-guide/mm/memory-hotplug.rst:

	% timeout $TIMEOUT offline_block | failure_handling

In fact, we even document there: "the offlining context can be terminated
by sending a fatal signal".

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 8e0fa209d533..0d2151df4ee1 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1879,7 +1879,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 	do {
 		pfn = start_pfn;
 		do {
-			if (signal_pending(current)) {
+			if (fatal_signal_pending(current)) {
 				ret = -EINTR;
 				reason = "signal backoff";
 				goto failed_removal_isolated;
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v1 1/5] mm/memory_hotplug: check for fatal signals only in offline_pages()
@ 2023-06-27 11:22   ` David Hildenbrand
  0 siblings, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2023-06-27 11:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: Xuan Zhuo, Michal Hocko, John Hubbard, Michael S. Tsirkin,
	virtualization, linux-mm, Andrew Morton, Oscar Salvador

Let's check for fatal signals only. That looks cleaner and still keeps
the documented use case for manual user-space triggered memory offlining
working. From Documentation/admin-guide/mm/memory-hotplug.rst:

	% timeout $TIMEOUT offline_block | failure_handling

In fact, we even document there: "the offlining context can be terminated
by sending a fatal signal".

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 8e0fa209d533..0d2151df4ee1 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1879,7 +1879,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 	do {
 		pfn = start_pfn;
 		do {
-			if (signal_pending(current)) {
+			if (fatal_signal_pending(current)) {
 				ret = -EINTR;
 				reason = "signal backoff";
 				goto failed_removal_isolated;
-- 
2.40.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v1 2/5] virtio-mem: convert most offline_and_remove_memory() errors to -EBUSY
  2023-06-27 11:22 ` David Hildenbrand
@ 2023-06-27 11:22   ` David Hildenbrand
  -1 siblings, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2023-06-27 11:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, virtualization, David Hildenbrand, Andrew Morton,
	Michael S. Tsirkin, John Hubbard, Oscar Salvador, Michal Hocko,
	Jason Wang, Xuan Zhuo

Let's prepare for offline_and_remove_memory() to return other error
codes that effectively translate to -EBUSY, such as -ETIMEDOUT.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 drivers/virtio/virtio_mem.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index 835f6cc2fb66..cb8bc6f6aa90 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -750,7 +750,15 @@ static int virtio_mem_offline_and_remove_memory(struct virtio_mem *vm,
 		dev_dbg(&vm->vdev->dev,
 			"offlining and removing memory failed: %d\n", rc);
 	}
-	return rc;
+
+	switch (rc) {
+	case 0:
+	case -ENOMEM:
+	case -EINVAL:
+		return rc;
+	default:
+		return -EBUSY;
+	}
 }
 
 /*
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v1 2/5] virtio-mem: convert most offline_and_remove_memory() errors to -EBUSY
@ 2023-06-27 11:22   ` David Hildenbrand
  0 siblings, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2023-06-27 11:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: Xuan Zhuo, Michal Hocko, John Hubbard, Michael S. Tsirkin,
	virtualization, linux-mm, Andrew Morton, Oscar Salvador

Let's prepare for offline_and_remove_memory() to return other error
codes that effectively translate to -EBUSY, such as -ETIMEDOUT.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 drivers/virtio/virtio_mem.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index 835f6cc2fb66..cb8bc6f6aa90 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -750,7 +750,15 @@ static int virtio_mem_offline_and_remove_memory(struct virtio_mem *vm,
 		dev_dbg(&vm->vdev->dev,
 			"offlining and removing memory failed: %d\n", rc);
 	}
-	return rc;
+
+	switch (rc) {
+	case 0:
+	case -ENOMEM:
+	case -EINVAL:
+		return rc;
+	default:
+		return -EBUSY;
+	}
 }
 
 /*
-- 
2.40.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v1 3/5] mm/memory_hotplug: make offline_and_remove_memory() timeout instead of failing on fatal signals
  2023-06-27 11:22 ` David Hildenbrand
@ 2023-06-27 11:22   ` David Hildenbrand
  -1 siblings, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2023-06-27 11:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, virtualization, David Hildenbrand, Andrew Morton,
	Michael S. Tsirkin, John Hubbard, Oscar Salvador, Michal Hocko,
	Jason Wang, Xuan Zhuo

John Hubbard writes [1]:

        Some device drivers add memory to the system via memory hotplug.
        When the driver is unloaded, that memory is hot-unplugged.

        However, memory hot unplug can fail. And these days, it fails a
        little too easily, with respect to the above case. Specifically, if
        a signal is pending on the process, hot unplug fails.

        [...]

        So in this case, other things (unmovable pages, un-splittable huge
        pages) can also cause the above problem. However, those are
        demonstrably less common than simply having a pending signal. I've
        got bug reports from users who can trivially reproduce this by
        killing their process with a "kill -9", for example.

Especially with ZONE_MOVABLE, offlining is supposed to work in most
cases when offlining actually hotplugged (not boot) memory, and only fail
in rare corner cases (e.g., some driver holds a reference to a page in
ZONE_MOVABLE, turning it unmovable).

In these corner cases we really don't want to be stuck forever in
offline_and_remove_memory(). But in the general cases, we really want to
do our best to make memory offlining succeed -- in a reasonable
timeframe.

Reliably failing in the described case when there is a fatal signal pending
is sub-optimal. The pending signal check is mostly only relevant when user
space explicitly triggers offlining of memory using sysfs device attributes
("state" or "online" attribute), but not when coming via
offline_and_remove_memory().

So let's use a timer instead and ignore fatal signals, because they are
not really expressive for offline_and_remove_memory() users. Let's default
to 30 seconds if no timeout was specified, and limit the timeout to 120
seconds.

This change is also valuable for virtio-mem in BBM (Big Block Mode) with
"bbm_safe_unplug=off", to avoid endless loops when stuck forever in
offline_and_remove_memory().

While at it, drop the "extern" from offline_and_remove_memory() to make
it fit into a single line.

[1] https://lkml.kernel.org/r/20230620011719.155379-1-jhubbard@nvidia.com

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 drivers/virtio/virtio_mem.c    |  2 +-
 include/linux/memory_hotplug.h |  2 +-
 mm/memory_hotplug.c            | 50 ++++++++++++++++++++++++++++++++--
 3 files changed, 50 insertions(+), 4 deletions(-)

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index cb8bc6f6aa90..f8792223f1db 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -738,7 +738,7 @@ static int virtio_mem_offline_and_remove_memory(struct virtio_mem *vm,
 		"offlining and removing memory: 0x%llx - 0x%llx\n", addr,
 		addr + size - 1);
 
-	rc = offline_and_remove_memory(addr, size);
+	rc = offline_and_remove_memory(addr, size, 0);
 	if (!rc) {
 		atomic64_sub(size, &vm->offline_size);
 		/*
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 9fcbf5706595..d5f9e8b5a4a4 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -307,7 +307,7 @@ extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 			 struct zone *zone, struct memory_group *group);
 extern int remove_memory(u64 start, u64 size);
 extern void __remove_memory(u64 start, u64 size);
-extern int offline_and_remove_memory(u64 start, u64 size);
+int offline_and_remove_memory(u64 start, u64 size, unsigned int timeout_ms);
 
 #else
 static inline void try_offline_node(int nid) {}
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 0d2151df4ee1..ca635121644a 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -152,6 +152,22 @@ void put_online_mems(void)
 
 bool movable_node_enabled = false;
 
+/*
+ * Protected by the device hotplug lock: offline_and_remove_memory()
+ * will activate a timer such that offlining cannot be stuck forever.
+ *
+ * With an active timer, fatal signals will be ignored, because they can be
+ * counter-productive when dying user space triggers device unplug/driver
+ * unloading that ends up offlining+removing device memory.
+ */
+static bool mhp_offlining_timer_active;
+static atomic_t mhp_offlining_timer_expired;
+
+static void mhp_offline_timer_fn(struct timer_list *unused)
+{
+	atomic_set(&mhp_offlining_timer_expired, 1);
+}
+
 #ifndef CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE
 int mhp_default_online_type = MMOP_OFFLINE;
 #else
@@ -1879,7 +1895,18 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 	do {
 		pfn = start_pfn;
 		do {
-			if (fatal_signal_pending(current)) {
+			/*
+			 * If a timer is active, we're coming via
+			 * offline_and_remove_memory() and want to ignore even
+			 * fatal signals.
+			 */
+			if (mhp_offlining_timer_active) {
+				if (atomic_read(&mhp_offlining_timer_expired)) {
+					ret = -ETIMEDOUT;
+					reason = "timeout";
+					goto failed_removal_isolated;
+				}
+			} else if (fatal_signal_pending(current)) {
 				ret = -EINTR;
 				reason = "signal backoff";
 				goto failed_removal_isolated;
@@ -2232,11 +2259,17 @@ static int try_reonline_memory_block(struct memory_block *mem, void *arg)
  * memory is still in use. Primarily useful for memory devices that logically
  * unplugged all memory (so it's no longer in use) and want to offline + remove
  * that memory.
+ *
+ * offline_and_remove_memory() will not fail on fatal signals. Instead, it will
+ * fail once the timeout has been reached and offlining was not completed. If
+ * no timeout was specified, it will timeout after 30 seconds. The timeout is
+ * limited to 120 seconds.
  */
-int offline_and_remove_memory(u64 start, u64 size)
+int offline_and_remove_memory(u64 start, u64 size, unsigned int timeout_ms)
 {
 	const unsigned long mb_count = size / memory_block_size_bytes();
 	uint8_t *online_types, *tmp;
+	struct timer_list timer;
 	int rc;
 
 	if (!IS_ALIGNED(start, memory_block_size_bytes()) ||
@@ -2261,9 +2294,22 @@ int offline_and_remove_memory(u64 start, u64 size)
 
 	lock_device_hotplug();
 
+	if (!timeout_ms)
+		timeout_ms = 30 * MSEC_PER_SEC;
+	timeout_ms = min_t(unsigned int, timeout_ms, 120 * MSEC_PER_SEC);
+
+	timer_setup_on_stack(&timer, mhp_offline_timer_fn, 0);
+	mod_timer(&timer, jiffies + msecs_to_jiffies(timeout_ms));
+	mhp_offlining_timer_active = true;
+
 	tmp = online_types;
 	rc = walk_memory_blocks(start, size, &tmp, try_offline_memory_block);
 
+	timer_delete_sync(&timer);
+	atomic_set(&mhp_offlining_timer_expired, 0);
+	mhp_offlining_timer_active = false;
+	destroy_timer_on_stack(&timer);
+
 	/*
 	 * In case we succeeded to offline all memory, remove it.
 	 * This cannot fail as it cannot get onlined in the meantime.
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v1 3/5] mm/memory_hotplug: make offline_and_remove_memory() timeout instead of failing on fatal signals
@ 2023-06-27 11:22   ` David Hildenbrand
  0 siblings, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2023-06-27 11:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: Xuan Zhuo, Michal Hocko, John Hubbard, Michael S. Tsirkin,
	virtualization, linux-mm, Andrew Morton, Oscar Salvador

John Hubbard writes [1]:

        Some device drivers add memory to the system via memory hotplug.
        When the driver is unloaded, that memory is hot-unplugged.

        However, memory hot unplug can fail. And these days, it fails a
        little too easily, with respect to the above case. Specifically, if
        a signal is pending on the process, hot unplug fails.

        [...]

        So in this case, other things (unmovable pages, un-splittable huge
        pages) can also cause the above problem. However, those are
        demonstrably less common than simply having a pending signal. I've
        got bug reports from users who can trivially reproduce this by
        killing their process with a "kill -9", for example.

Especially with ZONE_MOVABLE, offlining is supposed to work in most
cases when offlining actually hotplugged (not boot) memory, and only fail
in rare corner cases (e.g., some driver holds a reference to a page in
ZONE_MOVABLE, turning it unmovable).

In these corner cases we really don't want to be stuck forever in
offline_and_remove_memory(). But in the general cases, we really want to
do our best to make memory offlining succeed -- in a reasonable
timeframe.

Reliably failing in the described case when there is a fatal signal pending
is sub-optimal. The pending signal check is mostly only relevant when user
space explicitly triggers offlining of memory using sysfs device attributes
("state" or "online" attribute), but not when coming via
offline_and_remove_memory().

So let's use a timer instead and ignore fatal signals, because they are
not really expressive for offline_and_remove_memory() users. Let's default
to 30 seconds if no timeout was specified, and limit the timeout to 120
seconds.

This change is also valuable for virtio-mem in BBM (Big Block Mode) with
"bbm_safe_unplug=off", to avoid endless loops when stuck forever in
offline_and_remove_memory().

While at it, drop the "extern" from offline_and_remove_memory() to make
it fit into a single line.

[1] https://lkml.kernel.org/r/20230620011719.155379-1-jhubbard@nvidia.com

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 drivers/virtio/virtio_mem.c    |  2 +-
 include/linux/memory_hotplug.h |  2 +-
 mm/memory_hotplug.c            | 50 ++++++++++++++++++++++++++++++++--
 3 files changed, 50 insertions(+), 4 deletions(-)

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index cb8bc6f6aa90..f8792223f1db 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -738,7 +738,7 @@ static int virtio_mem_offline_and_remove_memory(struct virtio_mem *vm,
 		"offlining and removing memory: 0x%llx - 0x%llx\n", addr,
 		addr + size - 1);
 
-	rc = offline_and_remove_memory(addr, size);
+	rc = offline_and_remove_memory(addr, size, 0);
 	if (!rc) {
 		atomic64_sub(size, &vm->offline_size);
 		/*
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 9fcbf5706595..d5f9e8b5a4a4 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -307,7 +307,7 @@ extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 			 struct zone *zone, struct memory_group *group);
 extern int remove_memory(u64 start, u64 size);
 extern void __remove_memory(u64 start, u64 size);
-extern int offline_and_remove_memory(u64 start, u64 size);
+int offline_and_remove_memory(u64 start, u64 size, unsigned int timeout_ms);
 
 #else
 static inline void try_offline_node(int nid) {}
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 0d2151df4ee1..ca635121644a 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -152,6 +152,22 @@ void put_online_mems(void)
 
 bool movable_node_enabled = false;
 
+/*
+ * Protected by the device hotplug lock: offline_and_remove_memory()
+ * will activate a timer such that offlining cannot be stuck forever.
+ *
+ * With an active timer, fatal signals will be ignored, because they can be
+ * counter-productive when dying user space triggers device unplug/driver
+ * unloading that ends up offlining+removing device memory.
+ */
+static bool mhp_offlining_timer_active;
+static atomic_t mhp_offlining_timer_expired;
+
+static void mhp_offline_timer_fn(struct timer_list *unused)
+{
+	atomic_set(&mhp_offlining_timer_expired, 1);
+}
+
 #ifndef CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE
 int mhp_default_online_type = MMOP_OFFLINE;
 #else
@@ -1879,7 +1895,18 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 	do {
 		pfn = start_pfn;
 		do {
-			if (fatal_signal_pending(current)) {
+			/*
+			 * If a timer is active, we're coming via
+			 * offline_and_remove_memory() and want to ignore even
+			 * fatal signals.
+			 */
+			if (mhp_offlining_timer_active) {
+				if (atomic_read(&mhp_offlining_timer_expired)) {
+					ret = -ETIMEDOUT;
+					reason = "timeout";
+					goto failed_removal_isolated;
+				}
+			} else if (fatal_signal_pending(current)) {
 				ret = -EINTR;
 				reason = "signal backoff";
 				goto failed_removal_isolated;
@@ -2232,11 +2259,17 @@ static int try_reonline_memory_block(struct memory_block *mem, void *arg)
  * memory is still in use. Primarily useful for memory devices that logically
  * unplugged all memory (so it's no longer in use) and want to offline + remove
  * that memory.
+ *
+ * offline_and_remove_memory() will not fail on fatal signals. Instead, it will
+ * fail once the timeout has been reached and offlining was not completed. If
+ * no timeout was specified, it will timeout after 30 seconds. The timeout is
+ * limited to 120 seconds.
  */
-int offline_and_remove_memory(u64 start, u64 size)
+int offline_and_remove_memory(u64 start, u64 size, unsigned int timeout_ms)
 {
 	const unsigned long mb_count = size / memory_block_size_bytes();
 	uint8_t *online_types, *tmp;
+	struct timer_list timer;
 	int rc;
 
 	if (!IS_ALIGNED(start, memory_block_size_bytes()) ||
@@ -2261,9 +2294,22 @@ int offline_and_remove_memory(u64 start, u64 size)
 
 	lock_device_hotplug();
 
+	if (!timeout_ms)
+		timeout_ms = 30 * MSEC_PER_SEC;
+	timeout_ms = min_t(unsigned int, timeout_ms, 120 * MSEC_PER_SEC);
+
+	timer_setup_on_stack(&timer, mhp_offline_timer_fn, 0);
+	mod_timer(&timer, jiffies + msecs_to_jiffies(timeout_ms));
+	mhp_offlining_timer_active = true;
+
 	tmp = online_types;
 	rc = walk_memory_blocks(start, size, &tmp, try_offline_memory_block);
 
+	timer_delete_sync(&timer);
+	atomic_set(&mhp_offlining_timer_expired, 0);
+	mhp_offlining_timer_active = false;
+	destroy_timer_on_stack(&timer);
+
 	/*
 	 * In case we succeeded to offline all memory, remove it.
 	 * This cannot fail as it cannot get onlined in the meantime.
-- 
2.40.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v1 4/5] virtio-mem: set the timeout for offline_and_remove_memory() to 10 seconds
  2023-06-27 11:22 ` David Hildenbrand
@ 2023-06-27 11:22   ` David Hildenbrand
  -1 siblings, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2023-06-27 11:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, virtualization, David Hildenbrand, Andrew Morton,
	Michael S. Tsirkin, John Hubbard, Oscar Salvador, Michal Hocko,
	Jason Wang, Xuan Zhuo

Currently we use the default (30 seconds), let's reduce it to 10
seconds. In BBM, we barely deal with blocks larger than 1/2 GiB, and
after 10 seconds it's most probably best to give up on that memory block
and try another one (or retry this one later).

In the common fake-offline case where we effectively fake-offline memory
using alloc_contig_range() first (SBM or BBM with bbm_safe_unplug=on),
we expect offline_and_remove_memory() to be blazingly fast and never take
anywhere close to 10seconds -- so this should only affect BBM with
bbm_safe_unplug=off.

While at it, update the parameter description and the relationship to
unmovable pages.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 drivers/virtio/virtio_mem.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index f8792223f1db..7468b4a907e3 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -41,7 +41,7 @@ MODULE_PARM_DESC(bbm_block_size,
 static bool bbm_safe_unplug = true;
 module_param(bbm_safe_unplug, bool, 0444);
 MODULE_PARM_DESC(bbm_safe_unplug,
-	     "Use a safe unplug mechanism in BBM, avoiding long/endless loops");
+	     "Use a safe/fast unplug mechanism in BBM, failing faster on unmovable pages");
 
 /*
  * virtio-mem currently supports the following modes of operation:
@@ -738,7 +738,7 @@ static int virtio_mem_offline_and_remove_memory(struct virtio_mem *vm,
 		"offlining and removing memory: 0x%llx - 0x%llx\n", addr,
 		addr + size - 1);
 
-	rc = offline_and_remove_memory(addr, size, 0);
+	rc = offline_and_remove_memory(addr, size, 10 * MSEC_PER_SEC);
 	if (!rc) {
 		atomic64_sub(size, &vm->offline_size);
 		/*
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v1 4/5] virtio-mem: set the timeout for offline_and_remove_memory() to 10 seconds
@ 2023-06-27 11:22   ` David Hildenbrand
  0 siblings, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2023-06-27 11:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: Xuan Zhuo, Michal Hocko, John Hubbard, Michael S. Tsirkin,
	virtualization, linux-mm, Andrew Morton, Oscar Salvador

Currently we use the default (30 seconds), let's reduce it to 10
seconds. In BBM, we barely deal with blocks larger than 1/2 GiB, and
after 10 seconds it's most probably best to give up on that memory block
and try another one (or retry this one later).

In the common fake-offline case where we effectively fake-offline memory
using alloc_contig_range() first (SBM or BBM with bbm_safe_unplug=on),
we expect offline_and_remove_memory() to be blazingly fast and never take
anywhere close to 10seconds -- so this should only affect BBM with
bbm_safe_unplug=off.

While at it, update the parameter description and the relationship to
unmovable pages.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 drivers/virtio/virtio_mem.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index f8792223f1db..7468b4a907e3 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -41,7 +41,7 @@ MODULE_PARM_DESC(bbm_block_size,
 static bool bbm_safe_unplug = true;
 module_param(bbm_safe_unplug, bool, 0444);
 MODULE_PARM_DESC(bbm_safe_unplug,
-	     "Use a safe unplug mechanism in BBM, avoiding long/endless loops");
+	     "Use a safe/fast unplug mechanism in BBM, failing faster on unmovable pages");
 
 /*
  * virtio-mem currently supports the following modes of operation:
@@ -738,7 +738,7 @@ static int virtio_mem_offline_and_remove_memory(struct virtio_mem *vm,
 		"offlining and removing memory: 0x%llx - 0x%llx\n", addr,
 		addr + size - 1);
 
-	rc = offline_and_remove_memory(addr, size, 0);
+	rc = offline_and_remove_memory(addr, size, 10 * MSEC_PER_SEC);
 	if (!rc) {
 		atomic64_sub(size, &vm->offline_size);
 		/*
-- 
2.40.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v1 5/5] virtio-mem: check if the config changed before (fake) offlining memory
  2023-06-27 11:22 ` David Hildenbrand
@ 2023-06-27 11:22   ` David Hildenbrand
  -1 siblings, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2023-06-27 11:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, virtualization, David Hildenbrand, Andrew Morton,
	Michael S. Tsirkin, John Hubbard, Oscar Salvador, Michal Hocko,
	Jason Wang, Xuan Zhuo

If we repeatedly fail to (fake) offline memory, we won't be sending
any unplug requests to the device. However, we only check if the config
changed when sending such (un)plug requests.

So we could end up trying for a long time to offline memory, even though
the config changed already and we're not supposed to unplug memory
anymore.

Let's optimize for that case, identified while testing the
offline_and_remove() memory timeout and simulating it repeatedly running
into the timeout.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 drivers/virtio/virtio_mem.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index 7468b4a907e3..247fb3e0ce61 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -1922,6 +1922,10 @@ static int virtio_mem_sbm_unplug_sb_online(struct virtio_mem *vm,
 	unsigned long start_pfn;
 	int rc;
 
+	/* Stop fake offlining attempts if the config changed. */
+	if (atomic_read(&vm->config_changed))
+		return -EAGAIN;
+
 	start_pfn = PFN_DOWN(virtio_mem_mb_id_to_phys(mb_id) +
 			     sb_id * vm->sbm.sb_size);
 
@@ -2233,6 +2237,10 @@ static int virtio_mem_bbm_unplug_request(struct virtio_mem *vm, uint64_t diff)
 		virtio_mem_bbm_for_each_bb_rev(vm, bb_id, VIRTIO_MEM_BBM_BB_ADDED) {
 			cond_resched();
 
+			/* Stop (fake) offlining attempts if the config changed. */
+			if (atomic_read(&vm->config_changed))
+				return -EAGAIN;
+
 			/*
 			 * As we're holding no locks, these checks are racy,
 			 * but we don't care.
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v1 5/5] virtio-mem: check if the config changed before (fake) offlining memory
@ 2023-06-27 11:22   ` David Hildenbrand
  0 siblings, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2023-06-27 11:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: Xuan Zhuo, Michal Hocko, John Hubbard, Michael S. Tsirkin,
	virtualization, linux-mm, Andrew Morton, Oscar Salvador

If we repeatedly fail to (fake) offline memory, we won't be sending
any unplug requests to the device. However, we only check if the config
changed when sending such (un)plug requests.

So we could end up trying for a long time to offline memory, even though
the config changed already and we're not supposed to unplug memory
anymore.

Let's optimize for that case, identified while testing the
offline_and_remove() memory timeout and simulating it repeatedly running
into the timeout.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 drivers/virtio/virtio_mem.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index 7468b4a907e3..247fb3e0ce61 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -1922,6 +1922,10 @@ static int virtio_mem_sbm_unplug_sb_online(struct virtio_mem *vm,
 	unsigned long start_pfn;
 	int rc;
 
+	/* Stop fake offlining attempts if the config changed. */
+	if (atomic_read(&vm->config_changed))
+		return -EAGAIN;
+
 	start_pfn = PFN_DOWN(virtio_mem_mb_id_to_phys(mb_id) +
 			     sb_id * vm->sbm.sb_size);
 
@@ -2233,6 +2237,10 @@ static int virtio_mem_bbm_unplug_request(struct virtio_mem *vm, uint64_t diff)
 		virtio_mem_bbm_for_each_bb_rev(vm, bb_id, VIRTIO_MEM_BBM_BB_ADDED) {
 			cond_resched();
 
+			/* Stop (fake) offlining attempts if the config changed. */
+			if (atomic_read(&vm->config_changed))
+				return -EAGAIN;
+
 			/*
 			 * As we're holding no locks, these checks are racy,
 			 * but we don't care.
-- 
2.40.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v1 1/5] mm/memory_hotplug: check for fatal signals only in offline_pages()
  2023-06-27 11:22   ` David Hildenbrand
  (?)
@ 2023-06-27 12:34   ` Michal Hocko
  2023-06-27 13:28       ` David Hildenbrand
  -1 siblings, 1 reply; 27+ messages in thread
From: Michal Hocko @ 2023-06-27 12:34 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, virtualization, Andrew Morton,
	Michael S. Tsirkin, John Hubbard, Oscar Salvador, Jason Wang,
	Xuan Zhuo

On Tue 27-06-23 13:22:16, David Hildenbrand wrote:
> Let's check for fatal signals only. That looks cleaner and still keeps
> the documented use case for manual user-space triggered memory offlining
> working. From Documentation/admin-guide/mm/memory-hotplug.rst:
> 
> 	% timeout $TIMEOUT offline_block | failure_handling
> 
> In fact, we even document there: "the offlining context can be terminated
> by sending a fatal signal".

We should be fixing documentation instead. This could break users who do
have a SIGALRM signal hander installed.

> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  mm/memory_hotplug.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 8e0fa209d533..0d2151df4ee1 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1879,7 +1879,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
>  	do {
>  		pfn = start_pfn;
>  		do {
> -			if (signal_pending(current)) {
> +			if (fatal_signal_pending(current)) {
>  				ret = -EINTR;
>  				reason = "signal backoff";
>  				goto failed_removal_isolated;
> -- 
> 2.40.1

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v1 3/5] mm/memory_hotplug: make offline_and_remove_memory() timeout instead of failing on fatal signals
  2023-06-27 11:22   ` David Hildenbrand
  (?)
@ 2023-06-27 12:40   ` Michal Hocko
  2023-06-27 13:14       ` David Hildenbrand
  -1 siblings, 1 reply; 27+ messages in thread
From: Michal Hocko @ 2023-06-27 12:40 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, virtualization, Andrew Morton,
	Michael S. Tsirkin, John Hubbard, Oscar Salvador, Jason Wang,
	Xuan Zhuo

On Tue 27-06-23 13:22:18, David Hildenbrand wrote:
> John Hubbard writes [1]:
> 
>         Some device drivers add memory to the system via memory hotplug.
>         When the driver is unloaded, that memory is hot-unplugged.
> 
>         However, memory hot unplug can fail. And these days, it fails a
>         little too easily, with respect to the above case. Specifically, if
>         a signal is pending on the process, hot unplug fails.
> 
>         [...]
> 
>         So in this case, other things (unmovable pages, un-splittable huge
>         pages) can also cause the above problem. However, those are
>         demonstrably less common than simply having a pending signal. I've
>         got bug reports from users who can trivially reproduce this by
>         killing their process with a "kill -9", for example.

This looks like a bug of the said driver no? If the tear down process is
killed it could very well happen right before offlining so you end up in
the very same state. Or what am I missing?
 
> Especially with ZONE_MOVABLE, offlining is supposed to work in most
> cases when offlining actually hotplugged (not boot) memory, and only fail
> in rare corner cases (e.g., some driver holds a reference to a page in
> ZONE_MOVABLE, turning it unmovable).
> 
> In these corner cases we really don't want to be stuck forever in
> offline_and_remove_memory(). But in the general cases, we really want to
> do our best to make memory offlining succeed -- in a reasonable
> timeframe.
> 
> Reliably failing in the described case when there is a fatal signal pending
> is sub-optimal. The pending signal check is mostly only relevant when user
> space explicitly triggers offlining of memory using sysfs device attributes
> ("state" or "online" attribute), but not when coming via
> offline_and_remove_memory().
> 
> So let's use a timer instead and ignore fatal signals, because they are
> not really expressive for offline_and_remove_memory() users. Let's default
> to 30 seconds if no timeout was specified, and limit the timeout to 120
> seconds.

I really hate having timeouts back. They just proven to be hard to get
right and it is essentially a policy implemented in the kernel. They
simply do not belong to the kernel space IMHO.

> This change is also valuable for virtio-mem in BBM (Big Block Mode) with
> "bbm_safe_unplug=off", to avoid endless loops when stuck forever in
> offline_and_remove_memory().
> 
> While at it, drop the "extern" from offline_and_remove_memory() to make
> it fit into a single line.
> 
> [1] https://lkml.kernel.org/r/20230620011719.155379-1-jhubbard@nvidia.com
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  drivers/virtio/virtio_mem.c    |  2 +-
>  include/linux/memory_hotplug.h |  2 +-
>  mm/memory_hotplug.c            | 50 ++++++++++++++++++++++++++++++++--
>  3 files changed, 50 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
> index cb8bc6f6aa90..f8792223f1db 100644
> --- a/drivers/virtio/virtio_mem.c
> +++ b/drivers/virtio/virtio_mem.c
> @@ -738,7 +738,7 @@ static int virtio_mem_offline_and_remove_memory(struct virtio_mem *vm,
>  		"offlining and removing memory: 0x%llx - 0x%llx\n", addr,
>  		addr + size - 1);
>  
> -	rc = offline_and_remove_memory(addr, size);
> +	rc = offline_and_remove_memory(addr, size, 0);
>  	if (!rc) {
>  		atomic64_sub(size, &vm->offline_size);
>  		/*
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 9fcbf5706595..d5f9e8b5a4a4 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -307,7 +307,7 @@ extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages,
>  			 struct zone *zone, struct memory_group *group);
>  extern int remove_memory(u64 start, u64 size);
>  extern void __remove_memory(u64 start, u64 size);
> -extern int offline_and_remove_memory(u64 start, u64 size);
> +int offline_and_remove_memory(u64 start, u64 size, unsigned int timeout_ms);
>  
>  #else
>  static inline void try_offline_node(int nid) {}
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 0d2151df4ee1..ca635121644a 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -152,6 +152,22 @@ void put_online_mems(void)
>  
>  bool movable_node_enabled = false;
>  
> +/*
> + * Protected by the device hotplug lock: offline_and_remove_memory()
> + * will activate a timer such that offlining cannot be stuck forever.
> + *
> + * With an active timer, fatal signals will be ignored, because they can be
> + * counter-productive when dying user space triggers device unplug/driver
> + * unloading that ends up offlining+removing device memory.
> + */
> +static bool mhp_offlining_timer_active;
> +static atomic_t mhp_offlining_timer_expired;
> +
> +static void mhp_offline_timer_fn(struct timer_list *unused)
> +{
> +	atomic_set(&mhp_offlining_timer_expired, 1);
> +}
> +
>  #ifndef CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE
>  int mhp_default_online_type = MMOP_OFFLINE;
>  #else
> @@ -1879,7 +1895,18 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
>  	do {
>  		pfn = start_pfn;
>  		do {
> -			if (fatal_signal_pending(current)) {
> +			/*
> +			 * If a timer is active, we're coming via
> +			 * offline_and_remove_memory() and want to ignore even
> +			 * fatal signals.
> +			 */
> +			if (mhp_offlining_timer_active) {
> +				if (atomic_read(&mhp_offlining_timer_expired)) {
> +					ret = -ETIMEDOUT;
> +					reason = "timeout";
> +					goto failed_removal_isolated;
> +				}
> +			} else if (fatal_signal_pending(current)) {
>  				ret = -EINTR;
>  				reason = "signal backoff";
>  				goto failed_removal_isolated;
> @@ -2232,11 +2259,17 @@ static int try_reonline_memory_block(struct memory_block *mem, void *arg)
>   * memory is still in use. Primarily useful for memory devices that logically
>   * unplugged all memory (so it's no longer in use) and want to offline + remove
>   * that memory.
> + *
> + * offline_and_remove_memory() will not fail on fatal signals. Instead, it will
> + * fail once the timeout has been reached and offlining was not completed. If
> + * no timeout was specified, it will timeout after 30 seconds. The timeout is
> + * limited to 120 seconds.
>   */
> -int offline_and_remove_memory(u64 start, u64 size)
> +int offline_and_remove_memory(u64 start, u64 size, unsigned int timeout_ms)
>  {
>  	const unsigned long mb_count = size / memory_block_size_bytes();
>  	uint8_t *online_types, *tmp;
> +	struct timer_list timer;
>  	int rc;
>  
>  	if (!IS_ALIGNED(start, memory_block_size_bytes()) ||
> @@ -2261,9 +2294,22 @@ int offline_and_remove_memory(u64 start, u64 size)
>  
>  	lock_device_hotplug();
>  
> +	if (!timeout_ms)
> +		timeout_ms = 30 * MSEC_PER_SEC;
> +	timeout_ms = min_t(unsigned int, timeout_ms, 120 * MSEC_PER_SEC);
> +
> +	timer_setup_on_stack(&timer, mhp_offline_timer_fn, 0);
> +	mod_timer(&timer, jiffies + msecs_to_jiffies(timeout_ms));
> +	mhp_offlining_timer_active = true;
> +
>  	tmp = online_types;
>  	rc = walk_memory_blocks(start, size, &tmp, try_offline_memory_block);
>  
> +	timer_delete_sync(&timer);
> +	atomic_set(&mhp_offlining_timer_expired, 0);
> +	mhp_offlining_timer_active = false;
> +	destroy_timer_on_stack(&timer);
> +
>  	/*
>  	 * In case we succeeded to offline all memory, remove it.
>  	 * This cannot fail as it cannot get onlined in the meantime.
> -- 
> 2.40.1

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v1 3/5] mm/memory_hotplug: make offline_and_remove_memory() timeout instead of failing on fatal signals
  2023-06-27 12:40   ` Michal Hocko
@ 2023-06-27 13:14       ` David Hildenbrand
  0 siblings, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2023-06-27 13:14 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, virtualization, Andrew Morton,
	Michael S. Tsirkin, John Hubbard, Oscar Salvador, Jason Wang,
	Xuan Zhuo

On 27.06.23 14:40, Michal Hocko wrote:
> On Tue 27-06-23 13:22:18, David Hildenbrand wrote:
>> John Hubbard writes [1]:
>>
>>          Some device drivers add memory to the system via memory hotplug.
>>          When the driver is unloaded, that memory is hot-unplugged.
>>
>>          However, memory hot unplug can fail. And these days, it fails a
>>          little too easily, with respect to the above case. Specifically, if
>>          a signal is pending on the process, hot unplug fails.
>>
>>          [...]
>>
>>          So in this case, other things (unmovable pages, un-splittable huge
>>          pages) can also cause the above problem. However, those are
>>          demonstrably less common than simply having a pending signal. I've
>>          got bug reports from users who can trivially reproduce this by
>>          killing their process with a "kill -9", for example.
> 
> This looks like a bug of the said driver no? If the tear down process is
> killed it could very well happen right before offlining so you end up in
> the very same state. Or what am I missing?

IIUC (John can correct me if I am wrong):

1) The process holds the device node open
2) The process gets killed or quits
3) As the process gets torn down, it closes the device node
4) Closing the device node results in the driver removing the device and
    calling offline_and_remove_memory()

So it's not a "tear down process" that triggers that offlining_removal 
somehow explicitly, it's just a side-product of it letting go of the 
device node as the process gets torn down.

>   
>> Especially with ZONE_MOVABLE, offlining is supposed to work in most
>> cases when offlining actually hotplugged (not boot) memory, and only fail
>> in rare corner cases (e.g., some driver holds a reference to a page in
>> ZONE_MOVABLE, turning it unmovable).
>>
>> In these corner cases we really don't want to be stuck forever in
>> offline_and_remove_memory(). But in the general cases, we really want to
>> do our best to make memory offlining succeed -- in a reasonable
>> timeframe.
>>
>> Reliably failing in the described case when there is a fatal signal pending
>> is sub-optimal. The pending signal check is mostly only relevant when user
>> space explicitly triggers offlining of memory using sysfs device attributes
>> ("state" or "online" attribute), but not when coming via
>> offline_and_remove_memory().
>>
>> So let's use a timer instead and ignore fatal signals, because they are
>> not really expressive for offline_and_remove_memory() users. Let's default
>> to 30 seconds if no timeout was specified, and limit the timeout to 120
>> seconds.
> 
> I really hate having timeouts back. They just proven to be hard to get
> right and it is essentially a policy implemented in the kernel. They
> simply do not belong to the kernel space IMHO.

As much as I agree with you in terms of offlining triggered from user 
space (e.g., write "state" or "online" attribute) where user-space is 
actually in charge  and can do something reasonable (timeout, retry, 
whatever), in these the offline_and_remove_memory() case it's the driver 
that wants a best-effort memory offlining+removal.

If it times out, virtio-mem will simply try another block or retry 
later. Right now, it could get stuck forever in 
offline_and_remove_memory(), which is obviously "not great". 
Fortunately, for virtio-mem it's configurable and we use the 
alloc_contig_range()-method for now as default.

If it would time out for John's driver, we most certainly don't want to 
be stuck in offline_and_remove_memory(), blocking device/driver 
unloading (and even a reboot IIRC) possibly forever.


I much rather have offline_and_remove_memory() indicate "timeout" to a 
in-kernel user a bit earlier than getting stuck in there forever. The 
timeout parameter allows for giving the in-kernel users a bit of 
flexibility, which I showcases for virtio-mem that unplugs smaller 
blocks and rather wants to fail fast and retry later.


Sure, we could make the timeout configurable to optimize for some corner 
cases, but that's not really what offline_and_remove_memory() users want 
and I doubt anybody would fine-tune that: they want a best-effort 
attempt. And that's IMHO not really a policy, it's an implementation 
detail of these drivers.

For example, the driver from John could simply never call 
offline_and_remove_memory() and always require a reboot when wanting to 
reuse a device. But that's definitely what users want.

virtio-mem could simply never call offline_and_remove_memory() and 
indicate "I don't support unplug of these online memory blocks". But 
that's *definitely* not what users want.


I'm very open for alternatives regarding offline_and_remove_memory(), so 
far this was the only reasonable thing I could come up with that 
actually achieves what we want for these users: not get stuck in there 
forever but rather fail earlier than later.

And most importantly, not somehow try to involve user space that isn't 
even in charge of the offlining operation.

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v1 3/5] mm/memory_hotplug: make offline_and_remove_memory() timeout instead of failing on fatal signals
@ 2023-06-27 13:14       ` David Hildenbrand
  0 siblings, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2023-06-27 13:14 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Xuan Zhuo, Michael S. Tsirkin, John Hubbard, linux-kernel,
	virtualization, linux-mm, Andrew Morton, Oscar Salvador

On 27.06.23 14:40, Michal Hocko wrote:
> On Tue 27-06-23 13:22:18, David Hildenbrand wrote:
>> John Hubbard writes [1]:
>>
>>          Some device drivers add memory to the system via memory hotplug.
>>          When the driver is unloaded, that memory is hot-unplugged.
>>
>>          However, memory hot unplug can fail. And these days, it fails a
>>          little too easily, with respect to the above case. Specifically, if
>>          a signal is pending on the process, hot unplug fails.
>>
>>          [...]
>>
>>          So in this case, other things (unmovable pages, un-splittable huge
>>          pages) can also cause the above problem. However, those are
>>          demonstrably less common than simply having a pending signal. I've
>>          got bug reports from users who can trivially reproduce this by
>>          killing their process with a "kill -9", for example.
> 
> This looks like a bug of the said driver no? If the tear down process is
> killed it could very well happen right before offlining so you end up in
> the very same state. Or what am I missing?

IIUC (John can correct me if I am wrong):

1) The process holds the device node open
2) The process gets killed or quits
3) As the process gets torn down, it closes the device node
4) Closing the device node results in the driver removing the device and
    calling offline_and_remove_memory()

So it's not a "tear down process" that triggers that offlining_removal 
somehow explicitly, it's just a side-product of it letting go of the 
device node as the process gets torn down.

>   
>> Especially with ZONE_MOVABLE, offlining is supposed to work in most
>> cases when offlining actually hotplugged (not boot) memory, and only fail
>> in rare corner cases (e.g., some driver holds a reference to a page in
>> ZONE_MOVABLE, turning it unmovable).
>>
>> In these corner cases we really don't want to be stuck forever in
>> offline_and_remove_memory(). But in the general cases, we really want to
>> do our best to make memory offlining succeed -- in a reasonable
>> timeframe.
>>
>> Reliably failing in the described case when there is a fatal signal pending
>> is sub-optimal. The pending signal check is mostly only relevant when user
>> space explicitly triggers offlining of memory using sysfs device attributes
>> ("state" or "online" attribute), but not when coming via
>> offline_and_remove_memory().
>>
>> So let's use a timer instead and ignore fatal signals, because they are
>> not really expressive for offline_and_remove_memory() users. Let's default
>> to 30 seconds if no timeout was specified, and limit the timeout to 120
>> seconds.
> 
> I really hate having timeouts back. They just proven to be hard to get
> right and it is essentially a policy implemented in the kernel. They
> simply do not belong to the kernel space IMHO.

As much as I agree with you in terms of offlining triggered from user 
space (e.g., write "state" or "online" attribute) where user-space is 
actually in charge  and can do something reasonable (timeout, retry, 
whatever), in these the offline_and_remove_memory() case it's the driver 
that wants a best-effort memory offlining+removal.

If it times out, virtio-mem will simply try another block or retry 
later. Right now, it could get stuck forever in 
offline_and_remove_memory(), which is obviously "not great". 
Fortunately, for virtio-mem it's configurable and we use the 
alloc_contig_range()-method for now as default.

If it would time out for John's driver, we most certainly don't want to 
be stuck in offline_and_remove_memory(), blocking device/driver 
unloading (and even a reboot IIRC) possibly forever.


I much rather have offline_and_remove_memory() indicate "timeout" to a 
in-kernel user a bit earlier than getting stuck in there forever. The 
timeout parameter allows for giving the in-kernel users a bit of 
flexibility, which I showcases for virtio-mem that unplugs smaller 
blocks and rather wants to fail fast and retry later.


Sure, we could make the timeout configurable to optimize for some corner 
cases, but that's not really what offline_and_remove_memory() users want 
and I doubt anybody would fine-tune that: they want a best-effort 
attempt. And that's IMHO not really a policy, it's an implementation 
detail of these drivers.

For example, the driver from John could simply never call 
offline_and_remove_memory() and always require a reboot when wanting to 
reuse a device. But that's definitely what users want.

virtio-mem could simply never call offline_and_remove_memory() and 
indicate "I don't support unplug of these online memory blocks". But 
that's *definitely* not what users want.


I'm very open for alternatives regarding offline_and_remove_memory(), so 
far this was the only reasonable thing I could come up with that 
actually achieves what we want for these users: not get stuck in there 
forever but rather fail earlier than later.

And most importantly, not somehow try to involve user space that isn't 
even in charge of the offlining operation.

-- 
Cheers,

David / dhildenb

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v1 1/5] mm/memory_hotplug: check for fatal signals only in offline_pages()
  2023-06-27 12:34   ` Michal Hocko
@ 2023-06-27 13:28       ` David Hildenbrand
  0 siblings, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2023-06-27 13:28 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, virtualization, Andrew Morton,
	Michael S. Tsirkin, John Hubbard, Oscar Salvador, Jason Wang,
	Xuan Zhuo

On 27.06.23 14:34, Michal Hocko wrote:
> On Tue 27-06-23 13:22:16, David Hildenbrand wrote:
>> Let's check for fatal signals only. That looks cleaner and still keeps
>> the documented use case for manual user-space triggered memory offlining
>> working. From Documentation/admin-guide/mm/memory-hotplug.rst:
>>
>> 	% timeout $TIMEOUT offline_block | failure_handling
>>
>> In fact, we even document there: "the offlining context can be terminated
>> by sending a fatal signal".
> 
> We should be fixing documentation instead. This could break users who do
> have a SIGALRM signal hander installed.

You mean because timeout will send a SIGALRM, which is not considered 
fatal in case a signal handler is installed?

At least the "traditional" tools I am aware of don't set a timeout at 
all (crossing fingers that they never end up stuck):
* chmem
* QEMU guest agent
* powerpc-utils

libdaxctl also doesn't seem to implement an easy-to-spot timeout for 
memory offlining, but it also doesn't configure SIGALRM.


Of course, that doesn't mean that there isn't somewhere a program that 
does that; I merely assume that it would be pretty unlikely to find such 
a program.

But no strong opinion: we can also keep it like that, update the doc and 
add a comment why this one here is different than most other signal 
backoff checks.


Thanks!

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v1 1/5] mm/memory_hotplug: check for fatal signals only in offline_pages()
@ 2023-06-27 13:28       ` David Hildenbrand
  0 siblings, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2023-06-27 13:28 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Xuan Zhuo, Michael S. Tsirkin, John Hubbard, linux-kernel,
	virtualization, linux-mm, Andrew Morton, Oscar Salvador

On 27.06.23 14:34, Michal Hocko wrote:
> On Tue 27-06-23 13:22:16, David Hildenbrand wrote:
>> Let's check for fatal signals only. That looks cleaner and still keeps
>> the documented use case for manual user-space triggered memory offlining
>> working. From Documentation/admin-guide/mm/memory-hotplug.rst:
>>
>> 	% timeout $TIMEOUT offline_block | failure_handling
>>
>> In fact, we even document there: "the offlining context can be terminated
>> by sending a fatal signal".
> 
> We should be fixing documentation instead. This could break users who do
> have a SIGALRM signal hander installed.

You mean because timeout will send a SIGALRM, which is not considered 
fatal in case a signal handler is installed?

At least the "traditional" tools I am aware of don't set a timeout at 
all (crossing fingers that they never end up stuck):
* chmem
* QEMU guest agent
* powerpc-utils

libdaxctl also doesn't seem to implement an easy-to-spot timeout for 
memory offlining, but it also doesn't configure SIGALRM.


Of course, that doesn't mean that there isn't somewhere a program that 
does that; I merely assume that it would be pretty unlikely to find such 
a program.

But no strong opinion: we can also keep it like that, update the doc and 
add a comment why this one here is different than most other signal 
backoff checks.


Thanks!

-- 
Cheers,

David / dhildenb

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v1 1/5] mm/memory_hotplug: check for fatal signals only in offline_pages()
  2023-06-27 13:28       ` David Hildenbrand
  (?)
@ 2023-06-27 14:07       ` Michal Hocko
  -1 siblings, 0 replies; 27+ messages in thread
From: Michal Hocko @ 2023-06-27 14:07 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, virtualization, Andrew Morton,
	Michael S. Tsirkin, John Hubbard, Oscar Salvador, Jason Wang,
	Xuan Zhuo

On Tue 27-06-23 15:28:29, David Hildenbrand wrote:
> On 27.06.23 14:34, Michal Hocko wrote:
> > On Tue 27-06-23 13:22:16, David Hildenbrand wrote:
> > > Let's check for fatal signals only. That looks cleaner and still keeps
> > > the documented use case for manual user-space triggered memory offlining
> > > working. From Documentation/admin-guide/mm/memory-hotplug.rst:
> > > 
> > > 	% timeout $TIMEOUT offline_block | failure_handling
> > > 
> > > In fact, we even document there: "the offlining context can be terminated
> > > by sending a fatal signal".
> > 
> > We should be fixing documentation instead. This could break users who do
> > have a SIGALRM signal hander installed.
> 
> You mean because timeout will send a SIGALRM, which is not considered fatal
> in case a signal handler is installed?

Correct.

> At least the "traditional" tools I am aware of don't set a timeout at all
> (crossing fingers that they never end up stuck):
> * chmem
> * QEMU guest agent
> * powerpc-utils
> 
> libdaxctl also doesn't seem to implement an easy-to-spot timeout for memory
> offlining, but it also doesn't configure SIGALRM.
> 
> 
> Of course, that doesn't mean that there isn't somewhere a program that does
> that; I merely assume that it would be pretty unlikely to find such a
> program.
> 
> But no strong opinion: we can also keep it like that, update the doc and add
> a comment why this one here is different than most other signal backoff
> checks.

Well, the existing signal handling approach is there for way too long to
be sure. I personally would prefer fatal_signal_pending as that reflects
more what we do elsewhere but here we are. Historical baggage...
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v1 3/5] mm/memory_hotplug: make offline_and_remove_memory() timeout instead of failing on fatal signals
  2023-06-27 13:14       ` David Hildenbrand
  (?)
@ 2023-06-27 14:17       ` Michal Hocko
  2023-06-27 14:57           ` David Hildenbrand
  -1 siblings, 1 reply; 27+ messages in thread
From: Michal Hocko @ 2023-06-27 14:17 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, virtualization, Andrew Morton,
	Michael S. Tsirkin, John Hubbard, Oscar Salvador, Jason Wang,
	Xuan Zhuo

On Tue 27-06-23 15:14:11, David Hildenbrand wrote:
> On 27.06.23 14:40, Michal Hocko wrote:
> > On Tue 27-06-23 13:22:18, David Hildenbrand wrote:
> > > John Hubbard writes [1]:
> > > 
> > >          Some device drivers add memory to the system via memory hotplug.
> > >          When the driver is unloaded, that memory is hot-unplugged.
> > > 
> > >          However, memory hot unplug can fail. And these days, it fails a
> > >          little too easily, with respect to the above case. Specifically, if
> > >          a signal is pending on the process, hot unplug fails.
> > > 
> > >          [...]
> > > 
> > >          So in this case, other things (unmovable pages, un-splittable huge
> > >          pages) can also cause the above problem. However, those are
> > >          demonstrably less common than simply having a pending signal. I've
> > >          got bug reports from users who can trivially reproduce this by
> > >          killing their process with a "kill -9", for example.
> > 
> > This looks like a bug of the said driver no? If the tear down process is
> > killed it could very well happen right before offlining so you end up in
> > the very same state. Or what am I missing?
> 
> IIUC (John can correct me if I am wrong):
> 
> 1) The process holds the device node open
> 2) The process gets killed or quits
> 3) As the process gets torn down, it closes the device node
> 4) Closing the device node results in the driver removing the device and
>    calling offline_and_remove_memory()
> 
> So it's not a "tear down process" that triggers that offlining_removal
> somehow explicitly, it's just a side-product of it letting go of the device
> node as the process gets torn down.

Isn't that just fragile? The operation might fail for other reasons. Why
cannot there be a hold on the resource to control the tear down
explicitly?

> > > Especially with ZONE_MOVABLE, offlining is supposed to work in most
> > > cases when offlining actually hotplugged (not boot) memory, and only fail
> > > in rare corner cases (e.g., some driver holds a reference to a page in
> > > ZONE_MOVABLE, turning it unmovable).
> > > 
> > > In these corner cases we really don't want to be stuck forever in
> > > offline_and_remove_memory(). But in the general cases, we really want to
> > > do our best to make memory offlining succeed -- in a reasonable
> > > timeframe.
> > > 
> > > Reliably failing in the described case when there is a fatal signal pending
> > > is sub-optimal. The pending signal check is mostly only relevant when user
> > > space explicitly triggers offlining of memory using sysfs device attributes
> > > ("state" or "online" attribute), but not when coming via
> > > offline_and_remove_memory().
> > > 
> > > So let's use a timer instead and ignore fatal signals, because they are
> > > not really expressive for offline_and_remove_memory() users. Let's default
> > > to 30 seconds if no timeout was specified, and limit the timeout to 120
> > > seconds.
> > 
> > I really hate having timeouts back. They just proven to be hard to get
> > right and it is essentially a policy implemented in the kernel. They
> > simply do not belong to the kernel space IMHO.
> 
> As much as I agree with you in terms of offlining triggered from user space
> (e.g., write "state" or "online" attribute) where user-space is actually in
> charge  and can do something reasonable (timeout, retry, whatever), in these
> the offline_and_remove_memory() case it's the driver that wants a
> best-effort memory offlining+removal.
> 
> If it times out, virtio-mem will simply try another block or retry later.
> Right now, it could get stuck forever in offline_and_remove_memory(), which
> is obviously "not great". Fortunately, for virtio-mem it's configurable and
> we use the alloc_contig_range()-method for now as default.

It seems that offline_and_remove_memory is using a wrong operation then.
If it wants an opportunistic offlining with some sort of policy. Timeout
might be just one policy to use but failure mode or a retry count might
be a better fit for some users. So rather than (ab)using offline_pages,
would be make more sense to extract basic offlining steps and allow
drivers like virtio-mem to reuse them and define their own policy?

> If it would time out for John's driver, we most certainly don't want to be
> stuck in offline_and_remove_memory(), blocking device/driver unloading (and
> even a reboot IIRC) possibly forever.

Now I am confused. John driver wants to tear down the device now? There
is no way to release that memory otherwise AFAIU from the initial
problem description.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v1 3/5] mm/memory_hotplug: make offline_and_remove_memory() timeout instead of failing on fatal signals
  2023-06-27 14:17       ` Michal Hocko
@ 2023-06-27 14:57           ` David Hildenbrand
  0 siblings, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2023-06-27 14:57 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, virtualization, Andrew Morton,
	Michael S. Tsirkin, John Hubbard, Oscar Salvador, Jason Wang,
	Xuan Zhuo

On 27.06.23 16:17, Michal Hocko wrote:
> On Tue 27-06-23 15:14:11, David Hildenbrand wrote:
>> On 27.06.23 14:40, Michal Hocko wrote:
>>> On Tue 27-06-23 13:22:18, David Hildenbrand wrote:
>>>> John Hubbard writes [1]:
>>>>
>>>>           Some device drivers add memory to the system via memory hotplug.
>>>>           When the driver is unloaded, that memory is hot-unplugged.
>>>>
>>>>           However, memory hot unplug can fail. And these days, it fails a
>>>>           little too easily, with respect to the above case. Specifically, if
>>>>           a signal is pending on the process, hot unplug fails.
>>>>
>>>>           [...]
>>>>
>>>>           So in this case, other things (unmovable pages, un-splittable huge
>>>>           pages) can also cause the above problem. However, those are
>>>>           demonstrably less common than simply having a pending signal. I've
>>>>           got bug reports from users who can trivially reproduce this by
>>>>           killing their process with a "kill -9", for example.
>>>
>>> This looks like a bug of the said driver no? If the tear down process is
>>> killed it could very well happen right before offlining so you end up in
>>> the very same state. Or what am I missing?
>>
>> IIUC (John can correct me if I am wrong):
>>
>> 1) The process holds the device node open
>> 2) The process gets killed or quits
>> 3) As the process gets torn down, it closes the device node
>> 4) Closing the device node results in the driver removing the device and
>>     calling offline_and_remove_memory()
>>
>> So it's not a "tear down process" that triggers that offlining_removal
>> somehow explicitly, it's just a side-product of it letting go of the device
>> node as the process gets torn down.
> 
> Isn't that just fragile? The operation might fail for other reasons. Why
> cannot there be a hold on the resource to control the tear down
> explicitly?

I'll let John comment on that. But from what I understood, in most 
setups where ZONE_MOVABLE gets used for hotplugged memory 
offline_and_remove_memory() succeeds and allows for reusing the device 
later without a reboot.

For the cases where it doesn't work, a reboot is required.

> 
>>>> Especially with ZONE_MOVABLE, offlining is supposed to work in most
>>>> cases when offlining actually hotplugged (not boot) memory, and only fail
>>>> in rare corner cases (e.g., some driver holds a reference to a page in
>>>> ZONE_MOVABLE, turning it unmovable).
>>>>
>>>> In these corner cases we really don't want to be stuck forever in
>>>> offline_and_remove_memory(). But in the general cases, we really want to
>>>> do our best to make memory offlining succeed -- in a reasonable
>>>> timeframe.
>>>>
>>>> Reliably failing in the described case when there is a fatal signal pending
>>>> is sub-optimal. The pending signal check is mostly only relevant when user
>>>> space explicitly triggers offlining of memory using sysfs device attributes
>>>> ("state" or "online" attribute), but not when coming via
>>>> offline_and_remove_memory().
>>>>
>>>> So let's use a timer instead and ignore fatal signals, because they are
>>>> not really expressive for offline_and_remove_memory() users. Let's default
>>>> to 30 seconds if no timeout was specified, and limit the timeout to 120
>>>> seconds.
>>>
>>> I really hate having timeouts back. They just proven to be hard to get
>>> right and it is essentially a policy implemented in the kernel. They
>>> simply do not belong to the kernel space IMHO.
>>
>> As much as I agree with you in terms of offlining triggered from user space
>> (e.g., write "state" or "online" attribute) where user-space is actually in
>> charge  and can do something reasonable (timeout, retry, whatever), in these
>> the offline_and_remove_memory() case it's the driver that wants a
>> best-effort memory offlining+removal.
>>
>> If it times out, virtio-mem will simply try another block or retry later.
>> Right now, it could get stuck forever in offline_and_remove_memory(), which
>> is obviously "not great". Fortunately, for virtio-mem it's configurable and
>> we use the alloc_contig_range()-method for now as default.
> 
> It seems that offline_and_remove_memory is using a wrong operation then.
> If it wants an opportunistic offlining with some sort of policy. Timeout
> might be just one policy to use but failure mode or a retry count might
> be a better fit for some users. So rather than (ab)using offline_pages,
> would be make more sense to extract basic offlining steps and allow
> drivers like virtio-mem to reuse them and define their own policy?

virtio-mem, in default operation, does that: use alloc_contig_range() to 
logically unplug ("fake offline") that memory and then just trigger 
offline_and_remove_memory() to make it "officially offline".

In that mode, offline_and_remove_memory() cannot really timeout and is 
almost always going to succeed (except memory notifiers and some hugetlb 
dissolving).

Right now we also allow the admin to configure ordinary offlining 
directly (without prior fake offlining) when bigger memory blocks are 
used: offline_pages() is more reliable than alloc_contig_range(), for 
example, because it disables the PCP and the LRU cache, and retries more 
often (well, unfortunately then also forever). It has a higher chance of 
succeeding especially when bigger blocks of memory are offlined+removed.

Maybe we should make the alloc_contig_range()-based mechanism more 
configurable and make it the only mode in virtio-mem, such that we don't 
have to mess with offline_and_remove_memory() endless loops -- at least 
for virtio-mem.

> 
>> If it would time out for John's driver, we most certainly don't want to be
>> stuck in offline_and_remove_memory(), blocking device/driver unloading (and
>> even a reboot IIRC) possibly forever.
> 
> Now I am confused. John driver wants to tear down the device now? There
> is no way to release that memory otherwise AFAIU from the initial
> problem description.

 From what I understood if offline_and_remove_memory() succeeded, the 
device can be reinitialized and used again. Otherwise, a reboot is 
required because that memory is still added to the system.

Thanks Michal!

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v1 3/5] mm/memory_hotplug: make offline_and_remove_memory() timeout instead of failing on fatal signals
@ 2023-06-27 14:57           ` David Hildenbrand
  0 siblings, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2023-06-27 14:57 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Xuan Zhuo, Michael S. Tsirkin, John Hubbard, linux-kernel,
	virtualization, linux-mm, Andrew Morton, Oscar Salvador

On 27.06.23 16:17, Michal Hocko wrote:
> On Tue 27-06-23 15:14:11, David Hildenbrand wrote:
>> On 27.06.23 14:40, Michal Hocko wrote:
>>> On Tue 27-06-23 13:22:18, David Hildenbrand wrote:
>>>> John Hubbard writes [1]:
>>>>
>>>>           Some device drivers add memory to the system via memory hotplug.
>>>>           When the driver is unloaded, that memory is hot-unplugged.
>>>>
>>>>           However, memory hot unplug can fail. And these days, it fails a
>>>>           little too easily, with respect to the above case. Specifically, if
>>>>           a signal is pending on the process, hot unplug fails.
>>>>
>>>>           [...]
>>>>
>>>>           So in this case, other things (unmovable pages, un-splittable huge
>>>>           pages) can also cause the above problem. However, those are
>>>>           demonstrably less common than simply having a pending signal. I've
>>>>           got bug reports from users who can trivially reproduce this by
>>>>           killing their process with a "kill -9", for example.
>>>
>>> This looks like a bug of the said driver no? If the tear down process is
>>> killed it could very well happen right before offlining so you end up in
>>> the very same state. Or what am I missing?
>>
>> IIUC (John can correct me if I am wrong):
>>
>> 1) The process holds the device node open
>> 2) The process gets killed or quits
>> 3) As the process gets torn down, it closes the device node
>> 4) Closing the device node results in the driver removing the device and
>>     calling offline_and_remove_memory()
>>
>> So it's not a "tear down process" that triggers that offlining_removal
>> somehow explicitly, it's just a side-product of it letting go of the device
>> node as the process gets torn down.
> 
> Isn't that just fragile? The operation might fail for other reasons. Why
> cannot there be a hold on the resource to control the tear down
> explicitly?

I'll let John comment on that. But from what I understood, in most 
setups where ZONE_MOVABLE gets used for hotplugged memory 
offline_and_remove_memory() succeeds and allows for reusing the device 
later without a reboot.

For the cases where it doesn't work, a reboot is required.

> 
>>>> Especially with ZONE_MOVABLE, offlining is supposed to work in most
>>>> cases when offlining actually hotplugged (not boot) memory, and only fail
>>>> in rare corner cases (e.g., some driver holds a reference to a page in
>>>> ZONE_MOVABLE, turning it unmovable).
>>>>
>>>> In these corner cases we really don't want to be stuck forever in
>>>> offline_and_remove_memory(). But in the general cases, we really want to
>>>> do our best to make memory offlining succeed -- in a reasonable
>>>> timeframe.
>>>>
>>>> Reliably failing in the described case when there is a fatal signal pending
>>>> is sub-optimal. The pending signal check is mostly only relevant when user
>>>> space explicitly triggers offlining of memory using sysfs device attributes
>>>> ("state" or "online" attribute), but not when coming via
>>>> offline_and_remove_memory().
>>>>
>>>> So let's use a timer instead and ignore fatal signals, because they are
>>>> not really expressive for offline_and_remove_memory() users. Let's default
>>>> to 30 seconds if no timeout was specified, and limit the timeout to 120
>>>> seconds.
>>>
>>> I really hate having timeouts back. They just proven to be hard to get
>>> right and it is essentially a policy implemented in the kernel. They
>>> simply do not belong to the kernel space IMHO.
>>
>> As much as I agree with you in terms of offlining triggered from user space
>> (e.g., write "state" or "online" attribute) where user-space is actually in
>> charge  and can do something reasonable (timeout, retry, whatever), in these
>> the offline_and_remove_memory() case it's the driver that wants a
>> best-effort memory offlining+removal.
>>
>> If it times out, virtio-mem will simply try another block or retry later.
>> Right now, it could get stuck forever in offline_and_remove_memory(), which
>> is obviously "not great". Fortunately, for virtio-mem it's configurable and
>> we use the alloc_contig_range()-method for now as default.
> 
> It seems that offline_and_remove_memory is using a wrong operation then.
> If it wants an opportunistic offlining with some sort of policy. Timeout
> might be just one policy to use but failure mode or a retry count might
> be a better fit for some users. So rather than (ab)using offline_pages,
> would be make more sense to extract basic offlining steps and allow
> drivers like virtio-mem to reuse them and define their own policy?

virtio-mem, in default operation, does that: use alloc_contig_range() to 
logically unplug ("fake offline") that memory and then just trigger 
offline_and_remove_memory() to make it "officially offline".

In that mode, offline_and_remove_memory() cannot really timeout and is 
almost always going to succeed (except memory notifiers and some hugetlb 
dissolving).

Right now we also allow the admin to configure ordinary offlining 
directly (without prior fake offlining) when bigger memory blocks are 
used: offline_pages() is more reliable than alloc_contig_range(), for 
example, because it disables the PCP and the LRU cache, and retries more 
often (well, unfortunately then also forever). It has a higher chance of 
succeeding especially when bigger blocks of memory are offlined+removed.

Maybe we should make the alloc_contig_range()-based mechanism more 
configurable and make it the only mode in virtio-mem, such that we don't 
have to mess with offline_and_remove_memory() endless loops -- at least 
for virtio-mem.

> 
>> If it would time out for John's driver, we most certainly don't want to be
>> stuck in offline_and_remove_memory(), blocking device/driver unloading (and
>> even a reboot IIRC) possibly forever.
> 
> Now I am confused. John driver wants to tear down the device now? There
> is no way to release that memory otherwise AFAIU from the initial
> problem description.

 From what I understood if offline_and_remove_memory() succeeded, the 
device can be reinitialized and used again. Otherwise, a reboot is 
required because that memory is still added to the system.

Thanks Michal!

-- 
Cheers,

David / dhildenb

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v1 3/5] mm/memory_hotplug: make offline_and_remove_memory() timeout instead of failing on fatal signals
  2023-06-27 14:57           ` David Hildenbrand
  (?)
@ 2023-06-27 15:14           ` Michal Hocko
  2023-06-27 21:34               ` John Hubbard via Virtualization
  -1 siblings, 1 reply; 27+ messages in thread
From: Michal Hocko @ 2023-06-27 15:14 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, virtualization, Andrew Morton,
	Michael S. Tsirkin, John Hubbard, Oscar Salvador, Jason Wang,
	Xuan Zhuo

On Tue 27-06-23 16:57:53, David Hildenbrand wrote:
> On 27.06.23 16:17, Michal Hocko wrote:
> > On Tue 27-06-23 15:14:11, David Hildenbrand wrote:
> > > On 27.06.23 14:40, Michal Hocko wrote:
> > > > On Tue 27-06-23 13:22:18, David Hildenbrand wrote:
> > > > > John Hubbard writes [1]:
> > > > > 
> > > > >           Some device drivers add memory to the system via memory hotplug.
> > > > >           When the driver is unloaded, that memory is hot-unplugged.
> > > > > 
> > > > >           However, memory hot unplug can fail. And these days, it fails a
> > > > >           little too easily, with respect to the above case. Specifically, if
> > > > >           a signal is pending on the process, hot unplug fails.
> > > > > 
> > > > >           [...]
> > > > > 
> > > > >           So in this case, other things (unmovable pages, un-splittable huge
> > > > >           pages) can also cause the above problem. However, those are
> > > > >           demonstrably less common than simply having a pending signal. I've
> > > > >           got bug reports from users who can trivially reproduce this by
> > > > >           killing their process with a "kill -9", for example.
> > > > 
> > > > This looks like a bug of the said driver no? If the tear down process is
> > > > killed it could very well happen right before offlining so you end up in
> > > > the very same state. Or what am I missing?
> > > 
> > > IIUC (John can correct me if I am wrong):
> > > 
> > > 1) The process holds the device node open
> > > 2) The process gets killed or quits
> > > 3) As the process gets torn down, it closes the device node
> > > 4) Closing the device node results in the driver removing the device and
> > >     calling offline_and_remove_memory()
> > > 
> > > So it's not a "tear down process" that triggers that offlining_removal
> > > somehow explicitly, it's just a side-product of it letting go of the device
> > > node as the process gets torn down.
> > 
> > Isn't that just fragile? The operation might fail for other reasons. Why
> > cannot there be a hold on the resource to control the tear down
> > explicitly?
> 
> I'll let John comment on that. But from what I understood, in most setups
> where ZONE_MOVABLE gets used for hotplugged memory
> offline_and_remove_memory() succeeds and allows for reusing the device later
> without a reboot.
> 
> For the cases where it doesn't work, a reboot is required.

Then the solution should be really robust and means to handle the
failure - e.g. by retrying or alerting the admin.

> > > > > Especially with ZONE_MOVABLE, offlining is supposed to work in most
> > > > > cases when offlining actually hotplugged (not boot) memory, and only fail
> > > > > in rare corner cases (e.g., some driver holds a reference to a page in
> > > > > ZONE_MOVABLE, turning it unmovable).
> > > > > 
> > > > > In these corner cases we really don't want to be stuck forever in
> > > > > offline_and_remove_memory(). But in the general cases, we really want to
> > > > > do our best to make memory offlining succeed -- in a reasonable
> > > > > timeframe.
> > > > > 
> > > > > Reliably failing in the described case when there is a fatal signal pending
> > > > > is sub-optimal. The pending signal check is mostly only relevant when user
> > > > > space explicitly triggers offlining of memory using sysfs device attributes
> > > > > ("state" or "online" attribute), but not when coming via
> > > > > offline_and_remove_memory().
> > > > > 
> > > > > So let's use a timer instead and ignore fatal signals, because they are
> > > > > not really expressive for offline_and_remove_memory() users. Let's default
> > > > > to 30 seconds if no timeout was specified, and limit the timeout to 120
> > > > > seconds.
> > > > 
> > > > I really hate having timeouts back. They just proven to be hard to get
> > > > right and it is essentially a policy implemented in the kernel. They
> > > > simply do not belong to the kernel space IMHO.
> > > 
> > > As much as I agree with you in terms of offlining triggered from user space
> > > (e.g., write "state" or "online" attribute) where user-space is actually in
> > > charge  and can do something reasonable (timeout, retry, whatever), in these
> > > the offline_and_remove_memory() case it's the driver that wants a
> > > best-effort memory offlining+removal.
> > > 
> > > If it times out, virtio-mem will simply try another block or retry later.
> > > Right now, it could get stuck forever in offline_and_remove_memory(), which
> > > is obviously "not great". Fortunately, for virtio-mem it's configurable and
> > > we use the alloc_contig_range()-method for now as default.
> > 
> > It seems that offline_and_remove_memory is using a wrong operation then.
> > If it wants an opportunistic offlining with some sort of policy. Timeout
> > might be just one policy to use but failure mode or a retry count might
> > be a better fit for some users. So rather than (ab)using offline_pages,
> > would be make more sense to extract basic offlining steps and allow
> > drivers like virtio-mem to reuse them and define their own policy?
> 
> virtio-mem, in default operation, does that: use alloc_contig_range() to
> logically unplug ("fake offline") that memory and then just trigger
> offline_and_remove_memory() to make it "officially offline".
> 
> In that mode, offline_and_remove_memory() cannot really timeout and is
> almost always going to succeed (except memory notifiers and some hugetlb
> dissolving).
> 
> Right now we also allow the admin to configure ordinary offlining directly
> (without prior fake offlining) when bigger memory blocks are used:
> offline_pages() is more reliable than alloc_contig_range(), for example,
> because it disables the PCP and the LRU cache, and retries more often (well,
> unfortunately then also forever). It has a higher chance of succeeding
> especially when bigger blocks of memory are offlined+removed.
> 
> Maybe we should make the alloc_contig_range()-based mechanism more
> configurable and make it the only mode in virtio-mem, such that we don't
> have to mess with offline_and_remove_memory() endless loops -- at least for
> virtio-mem.

Yes, that sounds better than hooking up into offline_pages the way this
patch is doing.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v1 3/5] mm/memory_hotplug: make offline_and_remove_memory() timeout instead of failing on fatal signals
  2023-06-27 15:14           ` Michal Hocko
@ 2023-06-27 21:34               ` John Hubbard via Virtualization
  0 siblings, 0 replies; 27+ messages in thread
From: John Hubbard @ 2023-06-27 21:34 UTC (permalink / raw)
  To: Michal Hocko, David Hildenbrand
  Cc: linux-kernel, linux-mm, virtualization, Andrew Morton,
	Michael S. Tsirkin, Oscar Salvador, Jason Wang, Xuan Zhuo

On 6/27/23 08:14, Michal Hocko wrote:
> On Tue 27-06-23 16:57:53, David Hildenbrand wrote:
...
>>>> IIUC (John can correct me if I am wrong):
>>>>
>>>> 1) The process holds the device node open
>>>> 2) The process gets killed or quits
>>>> 3) As the process gets torn down, it closes the device node
>>>> 4) Closing the device node results in the driver removing the device and
>>>>      calling offline_and_remove_memory()
>>>>
>>>> So it's not a "tear down process" that triggers that offlining_removal
>>>> somehow explicitly, it's just a side-product of it letting go of the device
>>>> node as the process gets torn down.
>>>
>>> Isn't that just fragile? The operation might fail for other reasons. Why
>>> cannot there be a hold on the resource to control the tear down
>>> explicitly?
>>
>> I'll let John comment on that. But from what I understood, in most setups
>> where ZONE_MOVABLE gets used for hotplugged memory
>> offline_and_remove_memory() succeeds and allows for reusing the device later
>> without a reboot.
>>
>> For the cases where it doesn't work, a reboot is required.
  
That is exactly correct. That's what we ran into.

And there are workarounds (for example: kthreads don't have any signals
pending...), but I did want to follow through here and make -mm aware of the
problem. And see if there is a better way.

...
>>> It seems that offline_and_remove_memory is using a wrong operation then.
>>> If it wants an opportunistic offlining with some sort of policy. Timeout
>>> might be just one policy to use but failure mode or a retry count might
>>> be a better fit for some users. So rather than (ab)using offline_pages,
>>> would be make more sense to extract basic offlining steps and allow
>>> drivers like virtio-mem to reuse them and define their own policy?

...like this, perhaps. Sounds promising!


thanks,
-- 
John Hubbard
NVIDIA


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v1 3/5] mm/memory_hotplug: make offline_and_remove_memory() timeout instead of failing on fatal signals
@ 2023-06-27 21:34               ` John Hubbard via Virtualization
  0 siblings, 0 replies; 27+ messages in thread
From: John Hubbard via Virtualization @ 2023-06-27 21:34 UTC (permalink / raw)
  To: Michal Hocko, David Hildenbrand
  Cc: Xuan Zhuo, Michael S. Tsirkin, linux-kernel, virtualization,
	linux-mm, Andrew Morton, Oscar Salvador

On 6/27/23 08:14, Michal Hocko wrote:
> On Tue 27-06-23 16:57:53, David Hildenbrand wrote:
...
>>>> IIUC (John can correct me if I am wrong):
>>>>
>>>> 1) The process holds the device node open
>>>> 2) The process gets killed or quits
>>>> 3) As the process gets torn down, it closes the device node
>>>> 4) Closing the device node results in the driver removing the device and
>>>>      calling offline_and_remove_memory()
>>>>
>>>> So it's not a "tear down process" that triggers that offlining_removal
>>>> somehow explicitly, it's just a side-product of it letting go of the device
>>>> node as the process gets torn down.
>>>
>>> Isn't that just fragile? The operation might fail for other reasons. Why
>>> cannot there be a hold on the resource to control the tear down
>>> explicitly?
>>
>> I'll let John comment on that. But from what I understood, in most setups
>> where ZONE_MOVABLE gets used for hotplugged memory
>> offline_and_remove_memory() succeeds and allows for reusing the device later
>> without a reboot.
>>
>> For the cases where it doesn't work, a reboot is required.
  
That is exactly correct. That's what we ran into.

And there are workarounds (for example: kthreads don't have any signals
pending...), but I did want to follow through here and make -mm aware of the
problem. And see if there is a better way.

...
>>> It seems that offline_and_remove_memory is using a wrong operation then.
>>> If it wants an opportunistic offlining with some sort of policy. Timeout
>>> might be just one policy to use but failure mode or a retry count might
>>> be a better fit for some users. So rather than (ab)using offline_pages,
>>> would be make more sense to extract basic offlining steps and allow
>>> drivers like virtio-mem to reuse them and define their own policy?

...like this, perhaps. Sounds promising!


thanks,
-- 
John Hubbard
NVIDIA

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v1 3/5] mm/memory_hotplug: make offline_and_remove_memory() timeout instead of failing on fatal signals
  2023-06-27 11:22   ` David Hildenbrand
@ 2023-06-28  2:00     ` kernel test robot
  -1 siblings, 0 replies; 27+ messages in thread
From: kernel test robot @ 2023-06-28  2:00 UTC (permalink / raw)
  To: David Hildenbrand, linux-kernel
  Cc: llvm, oe-kbuild-all, linux-mm, virtualization, David Hildenbrand,
	Andrew Morton, Michael S. Tsirkin, John Hubbard, Oscar Salvador,
	Michal Hocko, Jason Wang, Xuan Zhuo

Hi David,

kernel test robot noticed the following build warnings:

[auto build test WARNING on 6995e2de6891c724bfeb2db33d7b87775f913ad1]

url:    https://github.com/intel-lab-lkp/linux/commits/David-Hildenbrand/mm-memory_hotplug-check-for-fatal-signals-only-in-offline_pages/20230627-192444
base:   6995e2de6891c724bfeb2db33d7b87775f913ad1
patch link:    https://lore.kernel.org/r/20230627112220.229240-4-david%40redhat.com
patch subject: [PATCH v1 3/5] mm/memory_hotplug: make offline_and_remove_memory() timeout instead of failing on fatal signals
config: x86_64-randconfig-x006-20230627 (https://download.01.org/0day-ci/archive/20230628/202306280935.dKTWlHFD-lkp@intel.com/config)
compiler: clang version 15.0.7 (https://github.com/llvm/llvm-project.git 8dfdcc7b7bf66834a761bd8de445840ef68e4d1a)
reproduce: (https://download.01.org/0day-ci/archive/20230628/202306280935.dKTWlHFD-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202306280935.dKTWlHFD-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> mm/memory_hotplug.c:163:13: warning: unused variable 'mhp_offlining_timer_active' [-Wunused-variable]
   static bool mhp_offlining_timer_active;
               ^
   mm/memory_hotplug.c:166:13: warning: unused function 'mhp_offline_timer_fn' [-Wunused-function]
   static void mhp_offline_timer_fn(struct timer_list *unused)
               ^
   2 warnings generated.


vim +/mhp_offlining_timer_active +163 mm/memory_hotplug.c

   154	
   155	/*
   156	 * Protected by the device hotplug lock: offline_and_remove_memory()
   157	 * will activate a timer such that offlining cannot be stuck forever.
   158	 *
   159	 * With an active timer, fatal signals will be ignored, because they can be
   160	 * counter-productive when dying user space triggers device unplug/driver
   161	 * unloading that ends up offlining+removing device memory.
   162	 */
 > 163	static bool mhp_offlining_timer_active;
   164	static atomic_t mhp_offlining_timer_expired;
   165	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v1 3/5] mm/memory_hotplug: make offline_and_remove_memory() timeout instead of failing on fatal signals
@ 2023-06-28  2:00     ` kernel test robot
  0 siblings, 0 replies; 27+ messages in thread
From: kernel test robot @ 2023-06-28  2:00 UTC (permalink / raw)
  To: David Hildenbrand, linux-kernel
  Cc: Xuan Zhuo, Michal Hocko, John Hubbard, llvm, virtualization,
	linux-mm, Michael S. Tsirkin, oe-kbuild-all, Andrew Morton,
	Oscar Salvador

Hi David,

kernel test robot noticed the following build warnings:

[auto build test WARNING on 6995e2de6891c724bfeb2db33d7b87775f913ad1]

url:    https://github.com/intel-lab-lkp/linux/commits/David-Hildenbrand/mm-memory_hotplug-check-for-fatal-signals-only-in-offline_pages/20230627-192444
base:   6995e2de6891c724bfeb2db33d7b87775f913ad1
patch link:    https://lore.kernel.org/r/20230627112220.229240-4-david%40redhat.com
patch subject: [PATCH v1 3/5] mm/memory_hotplug: make offline_and_remove_memory() timeout instead of failing on fatal signals
config: x86_64-randconfig-x006-20230627 (https://download.01.org/0day-ci/archive/20230628/202306280935.dKTWlHFD-lkp@intel.com/config)
compiler: clang version 15.0.7 (https://github.com/llvm/llvm-project.git 8dfdcc7b7bf66834a761bd8de445840ef68e4d1a)
reproduce: (https://download.01.org/0day-ci/archive/20230628/202306280935.dKTWlHFD-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202306280935.dKTWlHFD-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> mm/memory_hotplug.c:163:13: warning: unused variable 'mhp_offlining_timer_active' [-Wunused-variable]
   static bool mhp_offlining_timer_active;
               ^
   mm/memory_hotplug.c:166:13: warning: unused function 'mhp_offline_timer_fn' [-Wunused-function]
   static void mhp_offline_timer_fn(struct timer_list *unused)
               ^
   2 warnings generated.


vim +/mhp_offlining_timer_active +163 mm/memory_hotplug.c

   154	
   155	/*
   156	 * Protected by the device hotplug lock: offline_and_remove_memory()
   157	 * will activate a timer such that offlining cannot be stuck forever.
   158	 *
   159	 * With an active timer, fatal signals will be ignored, because they can be
   160	 * counter-productive when dying user space triggers device unplug/driver
   161	 * unloading that ends up offlining+removing device memory.
   162	 */
 > 163	static bool mhp_offlining_timer_active;
   164	static atomic_t mhp_offlining_timer_expired;
   165	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2023-06-28  2:00 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-27 11:22 [PATCH v1 0/5] mm/memory_hotplug: make offline_and_remove_memory() timeout instead of failing on fatal signals David Hildenbrand
2023-06-27 11:22 ` David Hildenbrand
2023-06-27 11:22 ` [PATCH v1 1/5] mm/memory_hotplug: check for fatal signals only in offline_pages() David Hildenbrand
2023-06-27 11:22   ` David Hildenbrand
2023-06-27 12:34   ` Michal Hocko
2023-06-27 13:28     ` David Hildenbrand
2023-06-27 13:28       ` David Hildenbrand
2023-06-27 14:07       ` Michal Hocko
2023-06-27 11:22 ` [PATCH v1 2/5] virtio-mem: convert most offline_and_remove_memory() errors to -EBUSY David Hildenbrand
2023-06-27 11:22   ` David Hildenbrand
2023-06-27 11:22 ` [PATCH v1 3/5] mm/memory_hotplug: make offline_and_remove_memory() timeout instead of failing on fatal signals David Hildenbrand
2023-06-27 11:22   ` David Hildenbrand
2023-06-27 12:40   ` Michal Hocko
2023-06-27 13:14     ` David Hildenbrand
2023-06-27 13:14       ` David Hildenbrand
2023-06-27 14:17       ` Michal Hocko
2023-06-27 14:57         ` David Hildenbrand
2023-06-27 14:57           ` David Hildenbrand
2023-06-27 15:14           ` Michal Hocko
2023-06-27 21:34             ` John Hubbard
2023-06-27 21:34               ` John Hubbard via Virtualization
2023-06-28  2:00   ` kernel test robot
2023-06-28  2:00     ` kernel test robot
2023-06-27 11:22 ` [PATCH v1 4/5] virtio-mem: set the timeout for offline_and_remove_memory() to 10 seconds David Hildenbrand
2023-06-27 11:22   ` David Hildenbrand
2023-06-27 11:22 ` [PATCH v1 5/5] virtio-mem: check if the config changed before (fake) offlining memory David Hildenbrand
2023-06-27 11:22   ` David Hildenbrand

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.