All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org,
	David Hildenbrand <david@redhat.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	Juergen Gross <jgross@suse.com>, Ingo Molnar <mingo@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Pavel Tatashin <pasha.tatashin@oracle.com>,
	Vlastimil Babka <vbabka@suse.cz>, Michal Hocko <mhocko@suse.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Reza Arbab <arbab@linux.vnet.ibm.com>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: [PATCH v1 08/10] mm/memory_hotplug: allow to control onlining/offlining of memory by a driver
Date: Wed, 23 May 2018 17:11:49 +0200	[thread overview]
Message-ID: <20180523151151.6730-9-david@redhat.com> (raw)
In-Reply-To: <20180523151151.6730-1-david@redhat.com>

Some devices (esp. paravirtualized) might want to control
- when to online/offline a memory block
- how to online memory (MOVABLE/NORMAL)
- in which granularity to online/offline memory

So let's add a new flag "driver_managed" and disallow to change the
state by user space. Device onlining/offlining will still work, however
the memory will not be actually onlined/offlined. That has to be handled
by the device driver that owns the memory.

Please note that we have to create user visible memory blocks after all
since this is required to trigger the right udevs events in order to
reload kexec/kdump. Also, it allows to see what is going on in the
system (e.g. which memory blocks are still around).

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 drivers/base/memory.c          | 22 ++++++++++++++--------
 drivers/xen/balloon.c          |  2 +-
 include/linux/memory.h         |  1 +
 include/linux/memory_hotplug.h |  4 +++-
 mm/memory_hotplug.c            | 34 ++++++++++++++++++++++++++++++++--
 5 files changed, 51 insertions(+), 12 deletions(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index bffe8616bd55..3b8616551561 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -231,27 +231,28 @@ static bool pages_correctly_probed(unsigned long start_pfn)
  * Must already be protected by mem_hotplug_begin().
  */
 static int
-memory_block_action(unsigned long phys_index, unsigned long action, int online_type)
+memory_block_action(struct memory_block *mem, unsigned long action)
 {
-	unsigned long start_pfn;
+	unsigned long start_pfn = section_nr_to_pfn(mem->start_section_nr);
 	unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;
-	int ret;
+	int ret = 0;
 
-	start_pfn = section_nr_to_pfn(phys_index);
+	if (mem->driver_managed)
+		return 0;
 
 	switch (action) {
 	case MEM_ONLINE:
 		if (!pages_correctly_probed(start_pfn))
 			return -EBUSY;
 
-		ret = online_pages(start_pfn, nr_pages, online_type);
+		ret = online_pages(start_pfn, nr_pages, mem->online_type);
 		break;
 	case MEM_OFFLINE:
 		ret = offline_pages(start_pfn, nr_pages);
 		break;
 	default:
 		WARN(1, KERN_WARNING "%s(%ld, %ld) unknown action: "
-		     "%ld\n", __func__, phys_index, action, action);
+		     "%ld\n", __func__, mem->start_section_nr, action, action);
 		ret = -EINVAL;
 	}
 
@@ -269,8 +270,7 @@ static int memory_block_change_state(struct memory_block *mem,
 	if (to_state == MEM_OFFLINE)
 		mem->state = MEM_GOING_OFFLINE;
 
-	ret = memory_block_action(mem->start_section_nr, to_state,
-				mem->online_type);
+	ret = memory_block_action(mem, to_state);
 
 	mem->state = ret ? from_state_req : to_state;
 
@@ -350,6 +350,11 @@ store_mem_state(struct device *dev,
 	 */
 	mem_hotplug_begin();
 
+	if (mem->driver_managed) {
+		ret = -EINVAL;
+		goto out;
+	}
+
 	switch (online_type) {
 	case MMOP_ONLINE_KERNEL:
 	case MMOP_ONLINE_MOVABLE:
@@ -364,6 +369,7 @@ store_mem_state(struct device *dev,
 		ret = -EINVAL; /* should never happen */
 	}
 
+out:
 	mem_hotplug_done();
 err:
 	unlock_device_hotplug();
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 065f0b607373..89981d573c06 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -401,7 +401,7 @@ static enum bp_state reserve_additional_memory(void)
 	 * callers drop the mutex before trying again.
 	 */
 	mutex_unlock(&balloon_mutex);
-	rc = add_memory_resource(nid, resource, memhp_auto_online);
+	rc = add_memory_resource(nid, resource, memhp_auto_online, false);
 	mutex_lock(&balloon_mutex);
 
 	if (rc) {
diff --git a/include/linux/memory.h b/include/linux/memory.h
index 9f8cd856ca1e..018c5e5ecde1 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -29,6 +29,7 @@ struct memory_block {
 	unsigned long state;		/* serialized by the dev->lock */
 	int section_count;		/* serialized by mem_sysfs_mutex */
 	int online_type;		/* for passing data to online routine */
+	bool driver_managed;		/* driver handles online/offline */
 	int phys_device;		/* to which fru does this belong? */
 	void *hw;			/* optional pointer to fw/hw data */
 	int (*phys_callback)(struct memory_block *);
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index d71829d54360..497e28f5b000 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -326,7 +326,9 @@ static inline void remove_memory(int nid, u64 start, u64 size) {}
 extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
 		void *arg, int (*func)(struct memory_block *, void *));
 extern int add_memory(int nid, u64 start, u64 size);
-extern int add_memory_resource(int nid, struct resource *resource, bool online);
+extern int add_memory_driver_managed(int nid, u64 start, u64 size);
+extern int add_memory_resource(int nid, struct resource *resource, bool online,
+			       bool driver_managed);
 extern int arch_add_memory(int nid, u64 start, u64 size,
 		struct vmem_altmap *altmap, bool want_memblock);
 extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 27f7c27f57ac..1610e214bfc8 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1124,8 +1124,15 @@ static int online_memory_block(struct memory_block *mem, void *arg)
 	return device_online(&mem->dev);
 }
 
+static int mark_memory_block_driver_managed(struct memory_block *mem, void *arg)
+{
+	mem->driver_managed = true;
+	return 0;
+}
+
 /* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */
-int __ref add_memory_resource(int nid, struct resource *res, bool online)
+int __ref add_memory_resource(int nid, struct resource *res, bool online,
+			      bool driver_managed)
 {
 	u64 start, size;
 	pg_data_t *pgdat = NULL;
@@ -1133,6 +1140,9 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
 	bool new_node;
 	int ret;
 
+	if (online && driver_managed)
+		return -EINVAL;
+
 	start = res->start;
 	size = resource_size(res);
 
@@ -1204,6 +1214,9 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
 	if (online)
 		walk_memory_range(PFN_DOWN(start), PFN_UP(start + size - 1),
 				  NULL, online_memory_block);
+	else if (driver_managed)
+		walk_memory_range(PFN_DOWN(start), PFN_UP(start + size - 1),
+				  NULL, mark_memory_block_driver_managed);
 
 	goto out;
 
@@ -1228,13 +1241,30 @@ int __ref add_memory(int nid, u64 start, u64 size)
 	if (IS_ERR(res))
 		return PTR_ERR(res);
 
-	ret = add_memory_resource(nid, res, memhp_auto_online);
+	ret = add_memory_resource(nid, res, memhp_auto_online, false);
 	if (ret < 0)
 		release_memory_resource(res);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(add_memory);
 
+int __ref add_memory_driver_managed(int nid, u64 start, u64 size)
+{
+	struct resource *res;
+	int ret;
+
+	res = register_memory_resource(start, size);
+	if (IS_ERR(res))
+		return PTR_ERR(res);
+
+	ret = add_memory_resource(nid, res, false, true);
+	if (ret < 0)
+		release_memory_resource(res);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(add_memory_driver_managed);
+
+
 #ifdef CONFIG_MEMORY_HOTREMOVE
 /*
  * A free page on the buddy free lists (not the per-cpu lists) has PageBuddy
-- 
2.17.0

  parent reply	other threads:[~2018-05-23 15:12 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-23 15:11 [PATCH v1 00/10] mm: online/offline 4MB chunks controlled by device driver David Hildenbrand
2018-05-23 15:11 ` [PATCH v1 01/10] mm: introduce and use PageOffline() David Hildenbrand
2018-05-23 15:11   ` David Hildenbrand
2018-05-23 15:11 ` [PATCH v1 02/10] mm/page_ext.c: support online/offline of memory < section size David Hildenbrand
2018-05-23 15:11 ` [PATCH v1 03/10] kasan: prepare for online/offline of different start/size David Hildenbrand
2018-05-23 15:11 ` [PATCH v1 04/10] kdump: include PAGE_OFFLINE_MAPCOUNT_VALUE in VMCOREINFO David Hildenbrand
2018-05-23 15:11 ` [PATCH v1 05/10] mm/memory_hotplug: limit offline_pages() to sizes we can actually handle David Hildenbrand
2018-05-23 15:11 ` [PATCH v1 06/10] mm/memory_hotplug: onlining pages can only fail due to notifiers David Hildenbrand
2018-05-23 15:11 ` [PATCH v1 07/10] mm/memory_hotplug: print only with DEBUG_VM in online/offline_pages() David Hildenbrand
2018-05-23 15:11 ` David Hildenbrand [this message]
2018-05-23 15:11 ` [PATCH v1 09/10] mm/memory_hotplug: teach offline_pages() to not try forever David Hildenbrand
2018-05-24 14:39   ` Michal Hocko
2018-05-24 20:36     ` David Hildenbrand
2018-05-23 15:11 ` [PATCH v1 10/10] mm/memory_hotplug: allow online/offline memory by a kernel module David Hildenbrand
2018-05-23 19:51   ` Christoph Hellwig
2018-05-24  5:59     ` David Hildenbrand
2018-05-24  7:53 ` [PATCH v1 00/10] mm: online/offline 4MB chunks controlled by device driver Michal Hocko
2018-05-24  8:31   ` David Hildenbrand
2018-05-24  8:56     ` Dave Young
2018-05-24  9:14       ` David Hildenbrand
2018-05-28  8:28         ` Dave Young
2018-05-28 10:03           ` David Hildenbrand
2018-05-24  9:31     ` Michal Hocko
2018-05-24 10:45       ` David Hildenbrand
2018-05-24 12:03         ` Michal Hocko
2018-05-24 14:04           ` David Hildenbrand
2018-05-24 14:22             ` Michal Hocko
2018-05-24 21:07               ` David Hildenbrand
2018-06-11 11:53                 ` David Hildenbrand
2018-06-11 11:56                   ` Michal Hocko
2018-06-11 12:33                     ` David Hildenbrand
2018-07-16 19:48                       ` David Hildenbrand
2018-07-16 20:05                         ` Michal Hocko
2018-07-18  9:56                           ` David Hildenbrand
2018-07-18 11:23                             ` Michal Hocko
2018-07-18 13:19                 ` Michal Hocko
2018-07-18 13:39                   ` David Hildenbrand
2018-07-18 13:43                     ` Michal Hocko
2018-07-18 13:47                       ` David Hildenbrand
2018-07-18 13:56                         ` Michal Hocko
2018-05-25 15:08           ` David Hildenbrand
2018-05-25 15:08             ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180523151151.6730-9-david@redhat.com \
    --to=david@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=arbab@linux.vnet.ibm.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=dan.j.williams@intel.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=jgross@suse.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=mingo@kernel.org \
    --cc=pasha.tatashin@oracle.com \
    --cc=tglx@linutronix.de \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.