linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1 00/12] powerpc/pseries: CMM: Implement balloon compaction and remove isolate notifier
@ 2019-10-31 14:29 David Hildenbrand
  2019-10-31 14:29 ` [PATCH v1 01/12] powerpc/pseries: CMM: Implement release() function for sysfs device David Hildenbrand
                   ` (11 more replies)
  0 siblings, 12 replies; 20+ messages in thread
From: David Hildenbrand @ 2019-10-31 14:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linuxppc-dev, Andrew Morton, David Hildenbrand,
	Alexander Duyck, Alexander Potapenko, Alexey Kardashevskiy,
	Allison Randal, Anshuman Khandual, Anshuman Khandual, Arun KS,
	Benjamin Herrenschmidt, Christian Brauner, Dan Williams,
	David Howells, Enrico Weigelt, metux IT consult, Gao Xiang,
	Geert Uytterhoeven, Greg Hackmann, Greg Kroah-Hartman,
	Konstantin Khlebnikov, Mel Gorman, Michael Ellerman,
	Michal Hocko, Mike Rapoport, Oliver O'Halloran,
	Oscar Salvador, Paul Mackerras, Pavel Tatashin, Pingfan Liu,
	Qian Cai, Rafael J. Wysocki, Richard Fontana, Stephen Rothwell,
	Thiago Jung Bauermann, Thomas Gleixner, Todd Kjos,
	Vlastimil Babka, Wei Yang

This is the follow-up of:
	https://lkml.org/lkml/2019/9/10/711

We can get rid of the memory isolate notifier by switching to balloon
compaction in powerpc's CMM (Collaborative Memory Management). The memory
isolate notifier was only necessary to allow to offline memory blocks that
contain inflated/"loaned" pages - which also possible when the inflated
pages are movable (via balloon compaction). Having movable pages implies
that memory will also no longer get fragmented when CMM is active.

While I do have access to a LPAR, it does not have CMM active and I have np
clue how to enable it. Instead, I implemented a simple simulation mode. I
did some tests and the whole infrastructure, including page migration,
seems to work fine (e.g., I can still offline memory blocks that contain
inflated pages). Of course, I cannot tell if HW will like the changes,
especially:

1. I now use page_to_phys() to come up with the addresses to report to
   the hypervisor. Hope that's correct.
2. When migrating a page, I first inflate/"loan" the new page and then
   deflate the old page. I have no idea if HW accepts to set pages loaned
   if it didn't request a loan. I assume it does.
3. We might now inflate/deflate pages in parallel (of course, not the
   same page). I have no idea if HW likes that.

It would be good if somebody could either point me at the spec of the
hypervisor interface or verify directly. Also, it would be good if somebody
could test with actual HW that has this feature.

Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: Allison Randal <allison@lohutok.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Arun KS <arunks@codeaurora.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christian Brauner <christian@brauner.io>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: "Enrico Weigelt, metux IT consult" <info@metux.net>
Cc: Gao Xiang <xiang@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greg Hackmann <ghackmann@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Pingfan Liu <kernelfans@gmail.com>
Cc: Qian Cai <cai@lca.pw>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Richard Fontana <rfontana@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Todd Kjos <tkjos@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wei Yang <richardw.yang@linux.intel.com>

David Hildenbrand (12):
  powerpc/pseries: CMM: Implement release() function for sysfs device
  powerpc/pseries: CMM: Report errors when registering notifiers fails
  powerpc/pseries: CMM: Cleanup rc handling in cmm_init()
  powerpc/pseries: CMM: Drop page array
  powerpc/pseries: CMM: Use adjust_managed_page_count() insted of
    totalram_pages_*
  powerpc/pseries: CMM: Rip out memory isolate notifier
  powerpc/pseries: CMM: Convert loaned_pages to an atomic_long_t
  powerpc/pseries: CMM: Implement balloon compaction
  powerpc/pseries: CMM: Switch to balloon_page_alloc()
  powerpc/pseries: CMM: Simulation mode
  mm: remove the memory isolate notifier
  mm: remove "count" parameter from has_unmovable_pages()

 arch/powerpc/platforms/pseries/Kconfig |   1 +
 arch/powerpc/platforms/pseries/cmm.c   | 430 +++++++++++--------------
 drivers/base/memory.c                  |  19 --
 include/linux/memory.h                 |  27 --
 include/linux/page-isolation.h         |   4 +-
 include/uapi/linux/magic.h             |   1 +
 mm/memory_hotplug.c                    |   2 +-
 mm/page_alloc.c                        |  21 +-
 mm/page_isolation.c                    |  27 +-
 9 files changed, 204 insertions(+), 328 deletions(-)

-- 
2.21.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v1 01/12] powerpc/pseries: CMM: Implement release() function for sysfs device
  2019-10-31 14:29 [PATCH v1 00/12] powerpc/pseries: CMM: Implement balloon compaction and remove isolate notifier David Hildenbrand
@ 2019-10-31 14:29 ` David Hildenbrand
  2019-11-14  9:08   ` Michael Ellerman
  2019-10-31 14:29 ` [PATCH v1 02/12] powerpc/pseries: CMM: Report errors when registering notifiers fails David Hildenbrand
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 20+ messages in thread
From: David Hildenbrand @ 2019-10-31 14:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linuxppc-dev, Andrew Morton, David Hildenbrand,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Vlastimil Babka, Konstantin Khlebnikov, Allison Randal,
	Greg Kroah-Hartman, Thomas Gleixner, Arun KS

When unloading the module, one gets
[  548.188594] ------------[ cut here ]------------
[  548.188596] Device 'cmm0' does not have a release() function, it is broken and must be fixed. See Documentation/kobject.txt.
[  548.188622] WARNING: CPU: 0 PID: 19308 at drivers/base/core.c:1244 .device_release+0xcc/0xf0
...

We only have on static fake device. There is nothing to do when
releasing the device (via cmm_exit).

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Allison Randal <allison@lohutok.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arun KS <arunks@codeaurora.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/powerpc/platforms/pseries/cmm.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/cmm.c b/arch/powerpc/platforms/pseries/cmm.c
index b33251d75927..572651a5c87b 100644
--- a/arch/powerpc/platforms/pseries/cmm.c
+++ b/arch/powerpc/platforms/pseries/cmm.c
@@ -411,6 +411,10 @@ static struct bus_type cmm_subsys = {
 	.dev_name = "cmm",
 };
 
+static void cmm_release_device(struct device *dev)
+{
+}
+
 /**
  * cmm_sysfs_register - Register with sysfs
  *
@@ -426,6 +430,7 @@ static int cmm_sysfs_register(struct device *dev)
 
 	dev->id = 0;
 	dev->bus = &cmm_subsys;
+	dev->release = cmm_release_device;
 
 	if ((rc = device_register(dev)))
 		goto subsys_unregister;
-- 
2.21.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v1 02/12] powerpc/pseries: CMM: Report errors when registering notifiers fails
  2019-10-31 14:29 [PATCH v1 00/12] powerpc/pseries: CMM: Implement balloon compaction and remove isolate notifier David Hildenbrand
  2019-10-31 14:29 ` [PATCH v1 01/12] powerpc/pseries: CMM: Implement release() function for sysfs device David Hildenbrand
@ 2019-10-31 14:29 ` David Hildenbrand
  2019-10-31 14:29 ` [PATCH v1 03/12] powerpc/pseries: CMM: Cleanup rc handling in cmm_init() David Hildenbrand
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: David Hildenbrand @ 2019-10-31 14:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linuxppc-dev, Andrew Morton, David Hildenbrand,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Pavel Tatashin, Richard Fontana, Allison Randal, Thomas Gleixner,
	Arun KS

If we don't set the rc, we will return "0", making it look like we
succeeded.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Richard Fontana <rfontana@redhat.com>
Cc: Allison Randal <allison@lohutok.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arun KS <arunks@codeaurora.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/powerpc/platforms/pseries/cmm.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/cmm.c b/arch/powerpc/platforms/pseries/cmm.c
index 572651a5c87b..fab049d3ea1e 100644
--- a/arch/powerpc/platforms/pseries/cmm.c
+++ b/arch/powerpc/platforms/pseries/cmm.c
@@ -683,8 +683,12 @@ static int cmm_init(void)
 	if ((rc = cmm_sysfs_register(&cmm_dev)))
 		goto out_reboot_notifier;
 
-	if (register_memory_notifier(&cmm_mem_nb) ||
-	    register_memory_isolate_notifier(&cmm_mem_isolate_nb))
+	rc = register_memory_notifier(&cmm_mem_nb);
+	if (rc)
+		goto out_unregister_notifier;
+
+	rc = register_memory_isolate_notifier(&cmm_mem_isolate_nb);
+	if (rc)
 		goto out_unregister_notifier;
 
 	if (cmm_disabled)
-- 
2.21.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v1 03/12] powerpc/pseries: CMM: Cleanup rc handling in cmm_init()
  2019-10-31 14:29 [PATCH v1 00/12] powerpc/pseries: CMM: Implement balloon compaction and remove isolate notifier David Hildenbrand
  2019-10-31 14:29 ` [PATCH v1 01/12] powerpc/pseries: CMM: Implement release() function for sysfs device David Hildenbrand
  2019-10-31 14:29 ` [PATCH v1 02/12] powerpc/pseries: CMM: Report errors when registering notifiers fails David Hildenbrand
@ 2019-10-31 14:29 ` David Hildenbrand
  2019-10-31 14:29 ` [PATCH v1 04/12] powerpc/pseries: CMM: Drop page array David Hildenbrand
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: David Hildenbrand @ 2019-10-31 14:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linuxppc-dev, Andrew Morton, David Hildenbrand,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Greg Kroah-Hartman, Michal Hocko, Konstantin Khlebnikov, Arun KS,
	Thomas Gleixner

No need to initialize rc. Also, let's return 0 directly when succeeding.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arun KS <arunks@codeaurora.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/powerpc/platforms/pseries/cmm.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/cmm.c b/arch/powerpc/platforms/pseries/cmm.c
index fab049d3ea1e..738eb1681d40 100644
--- a/arch/powerpc/platforms/pseries/cmm.c
+++ b/arch/powerpc/platforms/pseries/cmm.c
@@ -669,7 +669,7 @@ static struct notifier_block cmm_mem_nb = {
  **/
 static int cmm_init(void)
 {
-	int rc = -ENOMEM;
+	int rc;
 
 	if (!firmware_has_feature(FW_FEATURE_CMO))
 		return -EOPNOTSUPP;
@@ -692,7 +692,7 @@ static int cmm_init(void)
 		goto out_unregister_notifier;
 
 	if (cmm_disabled)
-		return rc;
+		return 0;
 
 	cmm_thread_ptr = kthread_run(cmm_thread, NULL, "cmmthread");
 	if (IS_ERR(cmm_thread_ptr)) {
@@ -700,8 +700,7 @@ static int cmm_init(void)
 		goto out_unregister_notifier;
 	}
 
-	return rc;
-
+	return 0;
 out_unregister_notifier:
 	unregister_memory_notifier(&cmm_mem_nb);
 	unregister_memory_isolate_notifier(&cmm_mem_isolate_nb);
-- 
2.21.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v1 04/12] powerpc/pseries: CMM: Drop page array
  2019-10-31 14:29 [PATCH v1 00/12] powerpc/pseries: CMM: Implement balloon compaction and remove isolate notifier David Hildenbrand
                   ` (2 preceding siblings ...)
  2019-10-31 14:29 ` [PATCH v1 03/12] powerpc/pseries: CMM: Cleanup rc handling in cmm_init() David Hildenbrand
@ 2019-10-31 14:29 ` David Hildenbrand
  2019-10-31 14:29 ` [PATCH v1 05/12] powerpc/pseries: CMM: Use adjust_managed_page_count() insted of totalram_pages_* David Hildenbrand
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: David Hildenbrand @ 2019-10-31 14:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linuxppc-dev, Andrew Morton, David Hildenbrand,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Pavel Tatashin, Richard Fontana, Allison Randal, Thomas Gleixner,
	Arun KS

We can simply store the pages in a list (page->lru), no need for a
separate data structure (+ complicated handling). This is how most
other balloon drivers store allocated pages without additional tracking
data.

For the notifiers, use page_to_pfn() to check if a page is in the
applicable range. Use page_to_phys() in plpar_page_set_loaned() and
plpar_page_set_active() (I assume due to the __pa() that's the right
thing to do).

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Richard Fontana <rfontana@redhat.com>
Cc: Allison Randal <allison@lohutok.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arun KS <arunks@codeaurora.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/powerpc/platforms/pseries/cmm.c | 163 ++++++---------------------
 1 file changed, 36 insertions(+), 127 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/cmm.c b/arch/powerpc/platforms/pseries/cmm.c
index 738eb1681d40..33d31e48ec15 100644
--- a/arch/powerpc/platforms/pseries/cmm.c
+++ b/arch/powerpc/platforms/pseries/cmm.c
@@ -75,21 +75,13 @@ module_param_named(debug, cmm_debug, uint, 0644);
 MODULE_PARM_DESC(debug, "Enable module debugging logging. Set to 1 to enable. "
 		 "[Default=" __stringify(CMM_DEBUG) "]");
 
-#define CMM_NR_PAGES ((PAGE_SIZE - sizeof(void *) - sizeof(unsigned long)) / sizeof(unsigned long))
-
 #define cmm_dbg(...) if (cmm_debug) { printk(KERN_INFO "cmm: "__VA_ARGS__); }
 
-struct cmm_page_array {
-	struct cmm_page_array *next;
-	unsigned long index;
-	unsigned long page[CMM_NR_PAGES];
-};
-
 static unsigned long loaned_pages;
 static unsigned long loaned_pages_target;
 static unsigned long oom_freed_pages;
 
-static struct cmm_page_array *cmm_page_list;
+static LIST_HEAD(cmm_page_list);
 static DEFINE_SPINLOCK(cmm_lock);
 
 static DEFINE_MUTEX(hotplug_mutex);
@@ -97,8 +89,9 @@ static int hotplug_occurred; /* protected by the hotplug mutex */
 
 static struct task_struct *cmm_thread_ptr;
 
-static long plpar_page_set_loaned(unsigned long vpa)
+static long plpar_page_set_loaned(struct page *page)
 {
+	const unsigned long vpa = page_to_phys(page);
 	unsigned long cmo_page_sz = cmo_get_page_size();
 	long rc = 0;
 	int i;
@@ -113,8 +106,9 @@ static long plpar_page_set_loaned(unsigned long vpa)
 	return rc;
 }
 
-static long plpar_page_set_active(unsigned long vpa)
+static long plpar_page_set_active(struct page *page)
 {
+	const unsigned long vpa = page_to_phys(page);
 	unsigned long cmo_page_sz = cmo_get_page_size();
 	long rc = 0;
 	int i;
@@ -138,8 +132,7 @@ static long plpar_page_set_active(unsigned long vpa)
  **/
 static long cmm_alloc_pages(long nr)
 {
-	struct cmm_page_array *pa, *npa;
-	unsigned long addr;
+	struct page *page;
 	long rc;
 
 	cmm_dbg("Begin request for %ld pages\n", nr);
@@ -156,43 +149,20 @@ static long cmm_alloc_pages(long nr)
 			break;
 		}
 
-		addr = __get_free_page(GFP_NOIO | __GFP_NOWARN |
-				       __GFP_NORETRY | __GFP_NOMEMALLOC);
-		if (!addr)
+		page = alloc_page(GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY |
+				  __GFP_NOMEMALLOC);
+		if (!page)
 			break;
 		spin_lock(&cmm_lock);
-		pa = cmm_page_list;
-		if (!pa || pa->index >= CMM_NR_PAGES) {
-			/* Need a new page for the page list. */
-			spin_unlock(&cmm_lock);
-			npa = (struct cmm_page_array *)__get_free_page(
-					GFP_NOIO | __GFP_NOWARN |
-					__GFP_NORETRY | __GFP_NOMEMALLOC);
-			if (!npa) {
-				pr_info("%s: Can not allocate new page list\n", __func__);
-				free_page(addr);
-				break;
-			}
-			spin_lock(&cmm_lock);
-			pa = cmm_page_list;
-
-			if (!pa || pa->index >= CMM_NR_PAGES) {
-				npa->next = pa;
-				npa->index = 0;
-				pa = npa;
-				cmm_page_list = pa;
-			} else
-				free_page((unsigned long) npa);
-		}
-
-		if ((rc = plpar_page_set_loaned(__pa(addr)))) {
+		rc = plpar_page_set_loaned(page);
+		if (rc) {
 			pr_err("%s: Can not set page to loaned. rc=%ld\n", __func__, rc);
 			spin_unlock(&cmm_lock);
-			free_page(addr);
+			__free_page(page);
 			break;
 		}
 
-		pa->page[pa->index++] = addr;
+		list_add(&page->lru, &cmm_page_list);
 		loaned_pages++;
 		totalram_pages_dec();
 		spin_unlock(&cmm_lock);
@@ -212,25 +182,16 @@ static long cmm_alloc_pages(long nr)
  **/
 static long cmm_free_pages(long nr)
 {
-	struct cmm_page_array *pa;
-	unsigned long addr;
+	struct page *page, *tmp;
 
 	cmm_dbg("Begin free of %ld pages.\n", nr);
 	spin_lock(&cmm_lock);
-	pa = cmm_page_list;
-	while (nr) {
-		if (!pa || pa->index <= 0)
+	list_for_each_entry_safe(page, tmp, &cmm_page_list, lru) {
+		if (!nr)
 			break;
-		addr = pa->page[--pa->index];
-
-		if (pa->index == 0) {
-			pa = pa->next;
-			free_page((unsigned long) cmm_page_list);
-			cmm_page_list = pa;
-		}
-
-		plpar_page_set_active(__pa(addr));
-		free_page(addr);
+		plpar_page_set_active(page);
+		list_del(&page->lru);
+		__free_page(page);
 		loaned_pages--;
 		nr--;
 		totalram_pages_inc();
@@ -496,20 +457,13 @@ static struct notifier_block cmm_reboot_nb = {
 static unsigned long cmm_count_pages(void *arg)
 {
 	struct memory_isolate_notify *marg = arg;
-	struct cmm_page_array *pa;
-	unsigned long start = (unsigned long)pfn_to_kaddr(marg->start_pfn);
-	unsigned long end = start + (marg->nr_pages << PAGE_SHIFT);
-	unsigned long idx;
+	struct page *page;
 
 	spin_lock(&cmm_lock);
-	pa = cmm_page_list;
-	while (pa) {
-		if ((unsigned long)pa >= start && (unsigned long)pa < end)
+	list_for_each_entry(page, &cmm_page_list, lru) {
+		if (page_to_pfn(page) >= marg->start_pfn &&
+		    page_to_pfn(page) < marg->start_pfn + marg->nr_pages)
 			marg->pages_found++;
-		for (idx = 0; idx < pa->index; idx++)
-			if (pa->page[idx] >= start && pa->page[idx] < end)
-				marg->pages_found++;
-		pa = pa->next;
 	}
 	spin_unlock(&cmm_lock);
 	return 0;
@@ -550,69 +504,24 @@ static struct notifier_block cmm_mem_isolate_nb = {
 static int cmm_mem_going_offline(void *arg)
 {
 	struct memory_notify *marg = arg;
-	unsigned long start_page = (unsigned long)pfn_to_kaddr(marg->start_pfn);
-	unsigned long end_page = start_page + (marg->nr_pages << PAGE_SHIFT);
-	struct cmm_page_array *pa_curr, *pa_last, *npa;
-	unsigned long idx;
+	struct page *page, *tmp;
 	unsigned long freed = 0;
 
-	cmm_dbg("Memory going offline, searching 0x%lx (%ld pages).\n",
-			start_page, marg->nr_pages);
+	cmm_dbg("Memory going offline, searching PFN 0x%lx (%ld pages).\n",
+		marg->start_pfn, marg->nr_pages);
 	spin_lock(&cmm_lock);
 
 	/* Search the page list for pages in the range to be offlined */
-	pa_last = pa_curr = cmm_page_list;
-	while (pa_curr) {
-		for (idx = (pa_curr->index - 1); (idx + 1) > 0; idx--) {
-			if ((pa_curr->page[idx] < start_page) ||
-			    (pa_curr->page[idx] >= end_page))
-				continue;
-
-			plpar_page_set_active(__pa(pa_curr->page[idx]));
-			free_page(pa_curr->page[idx]);
-			freed++;
-			loaned_pages--;
-			totalram_pages_inc();
-			pa_curr->page[idx] = pa_last->page[--pa_last->index];
-			if (pa_last->index == 0) {
-				if (pa_curr == pa_last)
-					pa_curr = pa_last->next;
-				pa_last = pa_last->next;
-				free_page((unsigned long)cmm_page_list);
-				cmm_page_list = pa_last;
-			}
-		}
-		pa_curr = pa_curr->next;
-	}
-
-	/* Search for page list structures in the range to be offlined */
-	pa_last = NULL;
-	pa_curr = cmm_page_list;
-	while (pa_curr) {
-		if (((unsigned long)pa_curr >= start_page) &&
-				((unsigned long)pa_curr < end_page)) {
-			npa = (struct cmm_page_array *)__get_free_page(
-					GFP_NOIO | __GFP_NOWARN |
-					__GFP_NORETRY | __GFP_NOMEMALLOC);
-			if (!npa) {
-				spin_unlock(&cmm_lock);
-				cmm_dbg("Failed to allocate memory for list "
-						"management. Memory hotplug "
-						"failed.\n");
-				return -ENOMEM;
-			}
-			memcpy(npa, pa_curr, PAGE_SIZE);
-			if (pa_curr == cmm_page_list)
-				cmm_page_list = npa;
-			if (pa_last)
-				pa_last->next = npa;
-			free_page((unsigned long) pa_curr);
-			freed++;
-			pa_curr = npa;
-		}
-
-		pa_last = pa_curr;
-		pa_curr = pa_curr->next;
+	list_for_each_entry_safe(page, tmp, &cmm_page_list, lru) {
+		if (page_to_pfn(page) < marg->start_pfn ||
+		    page_to_pfn(page) >= marg->start_pfn + marg->nr_pages)
+			continue;
+		plpar_page_set_active(page);
+		list_del(&page->lru);
+		__free_page(page);
+		freed++;
+		loaned_pages--;
+		totalram_pages_inc();
 	}
 
 	spin_unlock(&cmm_lock);
-- 
2.21.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v1 05/12] powerpc/pseries: CMM: Use adjust_managed_page_count() insted of totalram_pages_*
  2019-10-31 14:29 [PATCH v1 00/12] powerpc/pseries: CMM: Implement balloon compaction and remove isolate notifier David Hildenbrand
                   ` (3 preceding siblings ...)
  2019-10-31 14:29 ` [PATCH v1 04/12] powerpc/pseries: CMM: Drop page array David Hildenbrand
@ 2019-10-31 14:29 ` David Hildenbrand
  2019-10-31 14:29 ` [PATCH v1 06/12] powerpc/pseries: CMM: Rip out memory isolate notifier David Hildenbrand
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: David Hildenbrand @ 2019-10-31 14:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linuxppc-dev, Andrew Morton, David Hildenbrand,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Pavel Tatashin, Richard Fontana, Allison Randal, Thomas Gleixner,
	Arun KS

adjust_managed_page_count() performs a totalram_pages_add(), but also
adjust the managed pages of the zone. Let's use that instead, similar to
virtio-balloon. Use it before freeing a page.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Richard Fontana <rfontana@redhat.com>
Cc: Allison Randal <allison@lohutok.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arun KS <arunks@codeaurora.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/powerpc/platforms/pseries/cmm.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/cmm.c b/arch/powerpc/platforms/pseries/cmm.c
index 33d31e48ec15..f82c468ca2c4 100644
--- a/arch/powerpc/platforms/pseries/cmm.c
+++ b/arch/powerpc/platforms/pseries/cmm.c
@@ -164,7 +164,7 @@ static long cmm_alloc_pages(long nr)
 
 		list_add(&page->lru, &cmm_page_list);
 		loaned_pages++;
-		totalram_pages_dec();
+		adjust_managed_page_count(page, -1);
 		spin_unlock(&cmm_lock);
 		nr--;
 	}
@@ -191,10 +191,10 @@ static long cmm_free_pages(long nr)
 			break;
 		plpar_page_set_active(page);
 		list_del(&page->lru);
+		adjust_managed_page_count(page, 1);
 		__free_page(page);
 		loaned_pages--;
 		nr--;
-		totalram_pages_inc();
 	}
 	spin_unlock(&cmm_lock);
 	cmm_dbg("End request with %ld pages unfulfilled\n", nr);
@@ -518,10 +518,10 @@ static int cmm_mem_going_offline(void *arg)
 			continue;
 		plpar_page_set_active(page);
 		list_del(&page->lru);
+		adjust_managed_page_count(page, 1);
 		__free_page(page);
 		freed++;
 		loaned_pages--;
-		totalram_pages_inc();
 	}
 
 	spin_unlock(&cmm_lock);
-- 
2.21.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v1 06/12] powerpc/pseries: CMM: Rip out memory isolate notifier
  2019-10-31 14:29 [PATCH v1 00/12] powerpc/pseries: CMM: Implement balloon compaction and remove isolate notifier David Hildenbrand
                   ` (4 preceding siblings ...)
  2019-10-31 14:29 ` [PATCH v1 05/12] powerpc/pseries: CMM: Use adjust_managed_page_count() insted of totalram_pages_* David Hildenbrand
@ 2019-10-31 14:29 ` David Hildenbrand
  2019-10-31 14:29 ` [PATCH v1 07/12] powerpc/pseries: CMM: Convert loaned_pages to an atomic_long_t David Hildenbrand
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: David Hildenbrand @ 2019-10-31 14:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linuxppc-dev, Andrew Morton, David Hildenbrand,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Pavel Tatashin, Richard Fontana, Allison Randal, Thomas Gleixner,
	Arun KS

The memory isolate notifier was added to allow to offline memory blocks
that contain inflated/"loaned" pages. We can achieve the same using the
balloon compaction framework.

Get rid of the memory isolate notifier. Also, we can get rid of
cmm_mem_going_offline(), as we will never reach that code path now when
we have allocated memory in the balloon (allocated pages are unmovable and
will no longer be special-cased using the memory isolation notifier).

Leave the memory notifier in place, so we can still back off in case
memory gets offlined.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Richard Fontana <rfontana@redhat.com>
Cc: Allison Randal <allison@lohutok.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arun KS <arunks@codeaurora.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/powerpc/platforms/pseries/cmm.c | 97 +---------------------------
 1 file changed, 1 insertion(+), 96 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/cmm.c b/arch/powerpc/platforms/pseries/cmm.c
index f82c468ca2c4..29416b621189 100644
--- a/arch/powerpc/platforms/pseries/cmm.c
+++ b/arch/powerpc/platforms/pseries/cmm.c
@@ -38,12 +38,8 @@
 #define CMM_MIN_MEM_MB		256
 #define KB2PAGES(_p)		((_p)>>(PAGE_SHIFT-10))
 #define PAGES2KB(_p)		((_p)<<(PAGE_SHIFT-10))
-/*
- * The priority level tries to ensure that this notifier is called as
- * late as possible to reduce thrashing in the shared memory pool.
- */
+
 #define CMM_MEM_HOTPLUG_PRI	1
-#define CMM_MEM_ISOLATE_PRI	15
 
 static unsigned int delay = CMM_DEFAULT_DELAY;
 static unsigned int hotplug_delay = CMM_HOTPLUG_DELAY;
@@ -446,90 +442,6 @@ static struct notifier_block cmm_reboot_nb = {
 	.notifier_call = cmm_reboot_notifier,
 };
 
-/**
- * cmm_count_pages - Count the number of pages loaned in a particular range.
- *
- * @arg: memory_isolate_notify structure with address range and count
- *
- * Return value:
- *      0 on success
- **/
-static unsigned long cmm_count_pages(void *arg)
-{
-	struct memory_isolate_notify *marg = arg;
-	struct page *page;
-
-	spin_lock(&cmm_lock);
-	list_for_each_entry(page, &cmm_page_list, lru) {
-		if (page_to_pfn(page) >= marg->start_pfn &&
-		    page_to_pfn(page) < marg->start_pfn + marg->nr_pages)
-			marg->pages_found++;
-	}
-	spin_unlock(&cmm_lock);
-	return 0;
-}
-
-/**
- * cmm_memory_isolate_cb - Handle memory isolation notifier calls
- * @self:	notifier block struct
- * @action:	action to take
- * @arg:	struct memory_isolate_notify data for handler
- *
- * Return value:
- *	NOTIFY_OK or notifier error based on subfunction return value
- **/
-static int cmm_memory_isolate_cb(struct notifier_block *self,
-				 unsigned long action, void *arg)
-{
-	int ret = 0;
-
-	if (action == MEM_ISOLATE_COUNT)
-		ret = cmm_count_pages(arg);
-
-	return notifier_from_errno(ret);
-}
-
-static struct notifier_block cmm_mem_isolate_nb = {
-	.notifier_call = cmm_memory_isolate_cb,
-	.priority = CMM_MEM_ISOLATE_PRI
-};
-
-/**
- * cmm_mem_going_offline - Unloan pages where memory is to be removed
- * @arg: memory_notify structure with page range to be offlined
- *
- * Return value:
- *	0 on success
- **/
-static int cmm_mem_going_offline(void *arg)
-{
-	struct memory_notify *marg = arg;
-	struct page *page, *tmp;
-	unsigned long freed = 0;
-
-	cmm_dbg("Memory going offline, searching PFN 0x%lx (%ld pages).\n",
-		marg->start_pfn, marg->nr_pages);
-	spin_lock(&cmm_lock);
-
-	/* Search the page list for pages in the range to be offlined */
-	list_for_each_entry_safe(page, tmp, &cmm_page_list, lru) {
-		if (page_to_pfn(page) < marg->start_pfn ||
-		    page_to_pfn(page) >= marg->start_pfn + marg->nr_pages)
-			continue;
-		plpar_page_set_active(page);
-		list_del(&page->lru);
-		adjust_managed_page_count(page, 1);
-		__free_page(page);
-		freed++;
-		loaned_pages--;
-	}
-
-	spin_unlock(&cmm_lock);
-	cmm_dbg("Released %ld pages in the search range.\n", freed);
-
-	return 0;
-}
-
 /**
  * cmm_memory_cb - Handle memory hotplug notifier calls
  * @self:	notifier block struct
@@ -549,7 +461,6 @@ static int cmm_memory_cb(struct notifier_block *self,
 	case MEM_GOING_OFFLINE:
 		mutex_lock(&hotplug_mutex);
 		hotplug_occurred = 1;
-		ret = cmm_mem_going_offline(arg);
 		break;
 	case MEM_OFFLINE:
 	case MEM_CANCEL_OFFLINE:
@@ -596,10 +507,6 @@ static int cmm_init(void)
 	if (rc)
 		goto out_unregister_notifier;
 
-	rc = register_memory_isolate_notifier(&cmm_mem_isolate_nb);
-	if (rc)
-		goto out_unregister_notifier;
-
 	if (cmm_disabled)
 		return 0;
 
@@ -612,7 +519,6 @@ static int cmm_init(void)
 	return 0;
 out_unregister_notifier:
 	unregister_memory_notifier(&cmm_mem_nb);
-	unregister_memory_isolate_notifier(&cmm_mem_isolate_nb);
 	cmm_unregister_sysfs(&cmm_dev);
 out_reboot_notifier:
 	unregister_reboot_notifier(&cmm_reboot_nb);
@@ -634,7 +540,6 @@ static void cmm_exit(void)
 	unregister_oom_notifier(&cmm_oom_nb);
 	unregister_reboot_notifier(&cmm_reboot_nb);
 	unregister_memory_notifier(&cmm_mem_nb);
-	unregister_memory_isolate_notifier(&cmm_mem_isolate_nb);
 	cmm_free_pages(loaned_pages);
 	cmm_unregister_sysfs(&cmm_dev);
 }
-- 
2.21.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v1 07/12] powerpc/pseries: CMM: Convert loaned_pages to an atomic_long_t
  2019-10-31 14:29 [PATCH v1 00/12] powerpc/pseries: CMM: Implement balloon compaction and remove isolate notifier David Hildenbrand
                   ` (5 preceding siblings ...)
  2019-10-31 14:29 ` [PATCH v1 06/12] powerpc/pseries: CMM: Rip out memory isolate notifier David Hildenbrand
@ 2019-10-31 14:29 ` David Hildenbrand
  2019-10-31 14:29 ` [PATCH v1 08/12] powerpc/pseries: CMM: Implement balloon compaction David Hildenbrand
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: David Hildenbrand @ 2019-10-31 14:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linuxppc-dev, Andrew Morton, David Hildenbrand,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Pavel Tatashin, Richard Fontana, Allison Randal, Thomas Gleixner,
	Arun KS

When switching to balloon compaction, we want to drop the cmm_lock and
completely rely on the balloon compaction list lock internally.
loaned_pages is currently protected under the cmm_lock.

Note: Right now cmm_alloc_pages() and cmm_free_pages() can be called at
the same time, e.g., via the thread and a concurrent OOM notifier.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Richard Fontana <rfontana@redhat.com>
Cc: Allison Randal <allison@lohutok.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arun KS <arunks@codeaurora.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/powerpc/platforms/pseries/cmm.c | 35 +++++++++++++++-------------
 1 file changed, 19 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/cmm.c b/arch/powerpc/platforms/pseries/cmm.c
index 29416b621189..3a55dd1fdd39 100644
--- a/arch/powerpc/platforms/pseries/cmm.c
+++ b/arch/powerpc/platforms/pseries/cmm.c
@@ -73,7 +73,7 @@ MODULE_PARM_DESC(debug, "Enable module debugging logging. Set to 1 to enable. "
 
 #define cmm_dbg(...) if (cmm_debug) { printk(KERN_INFO "cmm: "__VA_ARGS__); }
 
-static unsigned long loaned_pages;
+static atomic_long_t loaned_pages;
 static unsigned long loaned_pages_target;
 static unsigned long oom_freed_pages;
 
@@ -159,7 +159,7 @@ static long cmm_alloc_pages(long nr)
 		}
 
 		list_add(&page->lru, &cmm_page_list);
-		loaned_pages++;
+		atomic_long_inc(&loaned_pages);
 		adjust_managed_page_count(page, -1);
 		spin_unlock(&cmm_lock);
 		nr--;
@@ -189,7 +189,7 @@ static long cmm_free_pages(long nr)
 		list_del(&page->lru);
 		adjust_managed_page_count(page, 1);
 		__free_page(page);
-		loaned_pages--;
+		atomic_long_dec(&loaned_pages);
 		nr--;
 	}
 	spin_unlock(&cmm_lock);
@@ -214,7 +214,7 @@ static int cmm_oom_notify(struct notifier_block *self,
 
 	cmm_dbg("OOM processing started\n");
 	nr = cmm_free_pages(nr);
-	loaned_pages_target = loaned_pages;
+	loaned_pages_target = atomic_long_read(&loaned_pages);
 	*freed += KB2PAGES(oom_kb) - nr;
 	oom_freed_pages += KB2PAGES(oom_kb) - nr;
 	cmm_dbg("OOM processing complete\n");
@@ -231,10 +231,11 @@ static int cmm_oom_notify(struct notifier_block *self,
  **/
 static void cmm_get_mpp(void)
 {
+	const long __loaned_pages = atomic_long_read(&loaned_pages);
+	const long total_pages = totalram_pages() + __loaned_pages;
 	int rc;
 	struct hvcall_mpp_data mpp_data;
 	signed long active_pages_target, page_loan_request, target;
-	signed long total_pages = totalram_pages() + loaned_pages;
 	signed long min_mem_pages = (min_mem_mb * 1024 * 1024) / PAGE_SIZE;
 
 	rc = h_get_mpp(&mpp_data);
@@ -243,7 +244,7 @@ static void cmm_get_mpp(void)
 		return;
 
 	page_loan_request = div_s64((s64)mpp_data.loan_request, PAGE_SIZE);
-	target = page_loan_request + (signed long)loaned_pages;
+	target = page_loan_request + __loaned_pages;
 
 	if (target < 0 || total_pages < min_mem_pages)
 		target = 0;
@@ -264,7 +265,7 @@ static void cmm_get_mpp(void)
 	loaned_pages_target = target;
 
 	cmm_dbg("delta = %ld, loaned = %lu, target = %lu, oom = %lu, totalram = %lu\n",
-		page_loan_request, loaned_pages, loaned_pages_target,
+		page_loan_request, __loaned_pages, loaned_pages_target,
 		oom_freed_pages, totalram_pages());
 }
 
@@ -282,6 +283,7 @@ static struct notifier_block cmm_oom_nb = {
 static int cmm_thread(void *dummy)
 {
 	unsigned long timeleft;
+	long __loaned_pages;
 
 	while (1) {
 		timeleft = msleep_interruptible(delay * 1000);
@@ -312,11 +314,12 @@ static int cmm_thread(void *dummy)
 
 		cmm_get_mpp();
 
-		if (loaned_pages_target > loaned_pages) {
-			if (cmm_alloc_pages(loaned_pages_target - loaned_pages))
-				loaned_pages_target = loaned_pages;
-		} else if (loaned_pages_target < loaned_pages)
-			cmm_free_pages(loaned_pages - loaned_pages_target);
+		__loaned_pages = atomic_long_read(&loaned_pages);
+		if (loaned_pages_target > __loaned_pages) {
+			if (cmm_alloc_pages(loaned_pages_target - __loaned_pages))
+				loaned_pages_target = __loaned_pages;
+		} else if (loaned_pages_target < __loaned_pages)
+			cmm_free_pages(__loaned_pages - loaned_pages_target);
 	}
 	return 0;
 }
@@ -330,7 +333,7 @@ static int cmm_thread(void *dummy)
 	}							\
 	static DEVICE_ATTR(name, 0444, show_##name, NULL)
 
-CMM_SHOW(loaned_kb, "%lu\n", PAGES2KB(loaned_pages));
+CMM_SHOW(loaned_kb, "%lu\n", PAGES2KB(atomic_long_read(&loaned_pages)));
 CMM_SHOW(loaned_target_kb, "%lu\n", PAGES2KB(loaned_pages_target));
 
 static ssize_t show_oom_pages(struct device *dev,
@@ -433,7 +436,7 @@ static int cmm_reboot_notifier(struct notifier_block *nb,
 		if (cmm_thread_ptr)
 			kthread_stop(cmm_thread_ptr);
 		cmm_thread_ptr = NULL;
-		cmm_free_pages(loaned_pages);
+		cmm_free_pages(atomic_long_read(&loaned_pages));
 	}
 	return NOTIFY_DONE;
 }
@@ -540,7 +543,7 @@ static void cmm_exit(void)
 	unregister_oom_notifier(&cmm_oom_nb);
 	unregister_reboot_notifier(&cmm_reboot_nb);
 	unregister_memory_notifier(&cmm_mem_nb);
-	cmm_free_pages(loaned_pages);
+	cmm_free_pages(atomic_long_read(&loaned_pages));
 	cmm_unregister_sysfs(&cmm_dev);
 }
 
@@ -561,7 +564,7 @@ static int cmm_set_disable(const char *val, const struct kernel_param *kp)
 		if (cmm_thread_ptr)
 			kthread_stop(cmm_thread_ptr);
 		cmm_thread_ptr = NULL;
-		cmm_free_pages(loaned_pages);
+		cmm_free_pages(atomic_long_read(&loaned_pages));
 	} else if (!disable && cmm_disabled) {
 		cmm_thread_ptr = kthread_run(cmm_thread, NULL, "cmmthread");
 		if (IS_ERR(cmm_thread_ptr))
-- 
2.21.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v1 08/12] powerpc/pseries: CMM: Implement balloon compaction
  2019-10-31 14:29 [PATCH v1 00/12] powerpc/pseries: CMM: Implement balloon compaction and remove isolate notifier David Hildenbrand
                   ` (6 preceding siblings ...)
  2019-10-31 14:29 ` [PATCH v1 07/12] powerpc/pseries: CMM: Convert loaned_pages to an atomic_long_t David Hildenbrand
@ 2019-10-31 14:29 ` David Hildenbrand
  2019-10-31 14:35   ` David Hildenbrand
  2019-10-31 14:29 ` [PATCH v1 09/12] powerpc/pseries: CMM: Switch to balloon_page_alloc() David Hildenbrand
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 20+ messages in thread
From: David Hildenbrand @ 2019-10-31 14:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linuxppc-dev, Andrew Morton, David Hildenbrand,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thiago Jung Bauermann, Geert Uytterhoeven, Anshuman Khandual,
	Oliver O'Halloran, Alexey Kardashevskiy, Enrico Weigelt,
	metux IT consult, Thomas Gleixner, Allison Randal,
	Greg Kroah-Hartman, Pavel Tatashin, Arun KS, Todd Kjos,
	Christian Brauner, Gao Xiang, Greg Hackmann, David Howells

We can now get rid of the cmm_lock and completely rely on the balloon
compaction internals, which now also manage the page list and the lock.

Inflated/"loaned" pages are now movable. Memory blocks that contain
such apges can get offlined. Also, all such pages will be marked
PageOffline() and can therefore be excluded in memory dumps using recent
versions of makedumpfile.

Don't switch to balloon_page_alloc() yet (due to the GFP_NOIO). Will
do that separately to discuss this change in detail.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: "Enrico Weigelt, metux IT consult" <info@metux.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Allison Randal <allison@lohutok.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Arun KS <arunks@codeaurora.org>
Cc: Todd Kjos <tkjos@google.com>
Cc: Christian Brauner <christian@brauner.io>
Cc: Gao Xiang <xiang@kernel.org>
Cc: Greg Hackmann <ghackmann@google.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/powerpc/platforms/pseries/Kconfig |   1 +
 arch/powerpc/platforms/pseries/cmm.c   | 132 ++++++++++++++++++++++---
 include/uapi/linux/magic.h             |   1 +
 3 files changed, 120 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig
index 9e35cddddf73..595e9f8a6539 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -108,6 +108,7 @@ config PPC_SMLPAR
 config CMM
 	tristate "Collaborative memory management"
 	depends on PPC_SMLPAR
+	select MEMORY_BALLOON
 	default y
 	help
 	  Select this option, if you want to enable the kernel interface
diff --git a/arch/powerpc/platforms/pseries/cmm.c b/arch/powerpc/platforms/pseries/cmm.c
index 3a55dd1fdd39..235fd7fe9df1 100644
--- a/arch/powerpc/platforms/pseries/cmm.c
+++ b/arch/powerpc/platforms/pseries/cmm.c
@@ -19,6 +19,10 @@
 #include <linux/stringify.h>
 #include <linux/swap.h>
 #include <linux/device.h>
+#include <linux/mount.h>
+#include <linux/pseudo_fs.h>
+#include <linux/magic.h>
+#include <linux/balloon_compaction.h>
 #include <asm/firmware.h>
 #include <asm/hvcall.h>
 #include <asm/mmu.h>
@@ -77,13 +81,11 @@ static atomic_long_t loaned_pages;
 static unsigned long loaned_pages_target;
 static unsigned long oom_freed_pages;
 
-static LIST_HEAD(cmm_page_list);
-static DEFINE_SPINLOCK(cmm_lock);
-
 static DEFINE_MUTEX(hotplug_mutex);
 static int hotplug_occurred; /* protected by the hotplug mutex */
 
 static struct task_struct *cmm_thread_ptr;
+static struct balloon_dev_info b_dev_info;
 
 static long plpar_page_set_loaned(struct page *page)
 {
@@ -149,19 +151,16 @@ static long cmm_alloc_pages(long nr)
 				  __GFP_NOMEMALLOC);
 		if (!page)
 			break;
-		spin_lock(&cmm_lock);
 		rc = plpar_page_set_loaned(page);
 		if (rc) {
 			pr_err("%s: Can not set page to loaned. rc=%ld\n", __func__, rc);
-			spin_unlock(&cmm_lock);
 			__free_page(page);
 			break;
 		}
 
-		list_add(&page->lru, &cmm_page_list);
+		balloon_page_enqueue(&b_dev_info, page);
 		atomic_long_inc(&loaned_pages);
 		adjust_managed_page_count(page, -1);
-		spin_unlock(&cmm_lock);
 		nr--;
 	}
 
@@ -178,21 +177,19 @@ static long cmm_alloc_pages(long nr)
  **/
 static long cmm_free_pages(long nr)
 {
-	struct page *page, *tmp;
+	struct page *page;
 
 	cmm_dbg("Begin free of %ld pages.\n", nr);
-	spin_lock(&cmm_lock);
-	list_for_each_entry_safe(page, tmp, &cmm_page_list, lru) {
-		if (!nr)
+	while (nr) {
+		page = balloon_page_dequeue(&b_dev_info);
+		if (!page)
 			break;
 		plpar_page_set_active(page);
-		list_del(&page->lru);
 		adjust_managed_page_count(page, 1);
 		__free_page(page);
 		atomic_long_dec(&loaned_pages);
 		nr--;
 	}
-	spin_unlock(&cmm_lock);
 	cmm_dbg("End request with %ld pages unfulfilled\n", nr);
 	return nr;
 }
@@ -484,6 +481,105 @@ static struct notifier_block cmm_mem_nb = {
 	.priority = CMM_MEM_HOTPLUG_PRI
 };
 
+#ifdef CONFIG_BALLOON_COMPACTION
+static struct vfsmount *balloon_mnt;
+
+static int cmm_init_fs_context(struct fs_context *fc)
+{
+	return init_pseudo(fc, PPC_CMM_MAGIC) ? 0 : -ENOMEM;
+}
+
+static struct file_system_type balloon_fs = {
+	.name = "ppc-cmm",
+	.init_fs_context = cmm_init_fs_context,
+	.kill_sb = kill_anon_super,
+};
+
+static int cmm_migratepage(struct balloon_dev_info *b_dev_info,
+			   struct page *newpage, struct page *page,
+			   enum migrate_mode mode)
+{
+	unsigned long flags;
+
+	/*
+	 * loan/"inflate" the newpage first.
+	 *
+	 * We might race against the cmm_thread who might discover after our
+	 * loan request that another page is to be unloaned. However, once
+	 * the cmm_thread runs again later, this error will automatically
+	 * be corrected.
+	 */
+	if (plpar_page_set_loaned(newpage)) {
+		/* Unlikely, but possible. Tell the caller not to retry now. */
+		pr_err_ratelimited("%s: Cannot set page to loaned.", __func__);
+		return -EBUSY;
+	}
+
+	/* balloon page list reference */
+	get_page(newpage);
+
+	spin_lock_irqsave(&b_dev_info->pages_lock, flags);
+	balloon_page_insert(b_dev_info, newpage);
+	balloon_page_delete(page);
+	spin_unlock_irqrestore(&b_dev_info->pages_lock, flags);
+
+	/*
+	 * activate/"deflate" the old page. We ignore any errors just like the
+	 * other callers.
+	 */
+	plpar_page_set_active(page);
+
+	/* balloon page list reference */
+	put_page(page);
+
+	return MIGRATEPAGE_SUCCESS;
+}
+
+static int cmm_balloon_compaction_init(void)
+{
+	int rc;
+
+	balloon_devinfo_init(&b_dev_info);
+	b_dev_info.migratepage = cmm_migratepage;
+
+	balloon_mnt = kern_mount(&balloon_fs);
+	if (IS_ERR(balloon_mnt)) {
+		rc = PTR_ERR(balloon_mnt);
+		balloon_mnt = NULL;
+		return rc;
+	}
+
+	b_dev_info.inode = alloc_anon_inode(balloon_mnt->mnt_sb);
+	if (IS_ERR(b_dev_info.inode)) {
+		rc = PTR_ERR(b_dev_info.inode);
+		b_dev_info.inode = NULL;
+		kern_unmount(balloon_mnt);
+		balloon_mnt = NULL;
+		return rc;
+	}
+
+	b_dev_info.inode->i_mapping->a_ops = &balloon_aops;
+	return 0;
+}
+static void cmm_balloon_compaction_deinit(void)
+{
+	if (b_dev_info.inode)
+		iput(b_dev_info.inode);
+	b_dev_info.inode = NULL;
+	kern_unmount(balloon_mnt);
+	balloon_mnt = NULL;
+}
+#else /* CONFIG_BALLOON_COMPACTION */
+static int cmm_balloon_compaction_init(void)
+{
+	return 0;
+}
+
+static void cmm_balloon_compaction_deinit(void)
+{
+}
+#endif /* CONFIG_BALLOON_COMPACTION */
+
 /**
  * cmm_init - Module initialization
  *
@@ -497,9 +593,14 @@ static int cmm_init(void)
 	if (!firmware_has_feature(FW_FEATURE_CMO))
 		return -EOPNOTSUPP;
 
-	if ((rc = register_oom_notifier(&cmm_oom_nb)) < 0)
+	rc = cmm_balloon_compaction_init();
+	if (rc)
 		return rc;
 
+	rc = register_oom_notifier(&cmm_oom_nb);
+	if (rc < 0)
+		goto out_balloon_compaction;
+
 	if ((rc = register_reboot_notifier(&cmm_reboot_nb)))
 		goto out_oom_notifier;
 
@@ -527,6 +628,8 @@ static int cmm_init(void)
 	unregister_reboot_notifier(&cmm_reboot_nb);
 out_oom_notifier:
 	unregister_oom_notifier(&cmm_oom_nb);
+out_balloon_compaction:
+	cmm_balloon_compaction_deinit();
 	return rc;
 }
 
@@ -545,6 +648,7 @@ static void cmm_exit(void)
 	unregister_memory_notifier(&cmm_mem_nb);
 	cmm_free_pages(atomic_long_read(&loaned_pages));
 	cmm_unregister_sysfs(&cmm_dev);
+	cmm_balloon_compaction_deinit();
 }
 
 /**
diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
index 903cc2d2750b..3ac436376d79 100644
--- a/include/uapi/linux/magic.h
+++ b/include/uapi/linux/magic.h
@@ -94,5 +94,6 @@
 #define ZSMALLOC_MAGIC		0x58295829
 #define DMA_BUF_MAGIC		0x444d4142	/* "DMAB" */
 #define Z3FOLD_MAGIC		0x33
+#define PPC_CMM_MAGIC		0xc7571590
 
 #endif /* __LINUX_MAGIC_H__ */
-- 
2.21.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v1 09/12] powerpc/pseries: CMM: Switch to balloon_page_alloc()
  2019-10-31 14:29 [PATCH v1 00/12] powerpc/pseries: CMM: Implement balloon compaction and remove isolate notifier David Hildenbrand
                   ` (7 preceding siblings ...)
  2019-10-31 14:29 ` [PATCH v1 08/12] powerpc/pseries: CMM: Implement balloon compaction David Hildenbrand
@ 2019-10-31 14:29 ` David Hildenbrand
  2019-10-31 14:29 ` [PATCH v1 10/12] powerpc/pseries: CMM: Simulation mode David Hildenbrand
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: David Hildenbrand @ 2019-10-31 14:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linuxppc-dev, Andrew Morton, David Hildenbrand,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Konstantin Khlebnikov, Michal Hocko, Vlastimil Babka,
	Pavel Tatashin, Thomas Gleixner, Arun KS

balloon_page_alloc() will use GFP_HIGHUSER_MOVABLE in case we have
CONFIG_BALLOON_COMPACTION. This is now possible, as balloon pages are
movable with CONFIG_BALLOON_COMPACTION. Without
CONFIG_BALLOON_COMPACTION, GFP_HIGHUSER is used.

Note that apart from that, balloon_page_alloc() uses the following
flags:
    __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_NOWARN
And current code used:
    GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY | __GFP_NOMEMALLOC

GFP_HIGHUSER/GFP_HIGHUSER_MOVABLE include
    __GFP_RECLAIM | __GFP_IO | __GFP_FS | __GFP_HARDWALL | __GFP_HIGHMEM

GFP_NOIO is __GFP_RECLAIM.

With CONFIG_BALLOON_COMPACTION, we essentially add:
    __GFP_IO | __GFP_FS | __GFP_HARDWALL | __GFP_HIGHMEM | __GFP_MOVABLE

Without CONFIG_BALLOON_COMPACTION, we essentially add:
    __GFP_IO | __GFP_FS | __GFP_HARDWALL | __GFP_HIGHMEM

I assume this is fine, as this is what all other balloon compaction
users use. If it turns out to be a problem, we could add
__GFP_MOVABLE manually if we have CONFIG_BALLOON_COMPACTION.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arun KS <arunks@codeaurora.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/powerpc/platforms/pseries/cmm.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/cmm.c b/arch/powerpc/platforms/pseries/cmm.c
index 235fd7fe9df1..a6ec2bbb1f91 100644
--- a/arch/powerpc/platforms/pseries/cmm.c
+++ b/arch/powerpc/platforms/pseries/cmm.c
@@ -147,8 +147,7 @@ static long cmm_alloc_pages(long nr)
 			break;
 		}
 
-		page = alloc_page(GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY |
-				  __GFP_NOMEMALLOC);
+		page = balloon_page_alloc();
 		if (!page)
 			break;
 		rc = plpar_page_set_loaned(page);
-- 
2.21.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v1 10/12] powerpc/pseries: CMM: Simulation mode
  2019-10-31 14:29 [PATCH v1 00/12] powerpc/pseries: CMM: Implement balloon compaction and remove isolate notifier David Hildenbrand
                   ` (8 preceding siblings ...)
  2019-10-31 14:29 ` [PATCH v1 09/12] powerpc/pseries: CMM: Switch to balloon_page_alloc() David Hildenbrand
@ 2019-10-31 14:29 ` David Hildenbrand
  2019-10-31 14:29 ` [PATCH v1 11/12] mm: remove the memory isolate notifier David Hildenbrand
  2019-10-31 14:29 ` [PATCH v1 12/12] mm: remove "count" parameter from has_unmovable_pages() David Hildenbrand
  11 siblings, 0 replies; 20+ messages in thread
From: David Hildenbrand @ 2019-10-31 14:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linuxppc-dev, Andrew Morton, David Hildenbrand,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Pavel Tatashin, Richard Fontana, Greg Kroah-Hartman,
	Thomas Gleixner, Arun KS

Let's allow to test the implementation without needing HW support. When
"simulate=1" is specified when loading the module, we bypass all HW
checks and HW calls. The sysfs file "simulate_loan_target_kb" can be
used to simulate HW requests.

The simualtion mode can be activated using
	modprobe cmm debug=1 simulate=1
And the requested loan target can be changed using
	echo X > /sys/devices/system/cmm/cmm0/simulate_loan_target_kb

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Richard Fontana <rfontana@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arun KS <arunks@codeaurora.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/powerpc/platforms/pseries/cmm.c | 38 ++++++++++++++++++++++------
 1 file changed, 30 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/cmm.c b/arch/powerpc/platforms/pseries/cmm.c
index a6ec2bbb1f91..63bb576b05da 100644
--- a/arch/powerpc/platforms/pseries/cmm.c
+++ b/arch/powerpc/platforms/pseries/cmm.c
@@ -51,6 +51,8 @@ static unsigned int oom_kb = CMM_OOM_KB;
 static unsigned int cmm_debug = CMM_DEBUG;
 static unsigned int cmm_disabled = CMM_DISABLE;
 static unsigned long min_mem_mb = CMM_MIN_MEM_MB;
+static bool __read_mostly simulate;
+static unsigned long simulate_loan_target_kb;
 static struct device cmm_dev;
 
 MODULE_AUTHOR("Brian King <brking@linux.vnet.ibm.com>");
@@ -74,6 +76,8 @@ MODULE_PARM_DESC(min_mem_mb, "Minimum amount of memory (in MB) to not balloon. "
 module_param_named(debug, cmm_debug, uint, 0644);
 MODULE_PARM_DESC(debug, "Enable module debugging logging. Set to 1 to enable. "
 		 "[Default=" __stringify(CMM_DEBUG) "]");
+module_param_named(simulate, simulate, bool, 0444);
+MODULE_PARM_DESC(simulate, "Enable simulation mode (no communication with hw).");
 
 #define cmm_dbg(...) if (cmm_debug) { printk(KERN_INFO "cmm: "__VA_ARGS__); }
 
@@ -94,6 +98,9 @@ static long plpar_page_set_loaned(struct page *page)
 	long rc = 0;
 	int i;
 
+	if (unlikely(simulate))
+		return 0;
+
 	for (i = 0; !rc && i < PAGE_SIZE; i += cmo_page_sz)
 		rc = plpar_hcall_norets(H_PAGE_INIT, H_PAGE_SET_LOANED, vpa + i, 0);
 
@@ -111,6 +118,9 @@ static long plpar_page_set_active(struct page *page)
 	long rc = 0;
 	int i;
 
+	if (unlikely(simulate))
+		return 0;
+
 	for (i = 0; !rc && i < PAGE_SIZE; i += cmo_page_sz)
 		rc = plpar_hcall_norets(H_PAGE_INIT, H_PAGE_SET_ACTIVE, vpa + i, 0);
 
@@ -234,13 +244,17 @@ static void cmm_get_mpp(void)
 	signed long active_pages_target, page_loan_request, target;
 	signed long min_mem_pages = (min_mem_mb * 1024 * 1024) / PAGE_SIZE;
 
-	rc = h_get_mpp(&mpp_data);
-
-	if (rc != H_SUCCESS)
-		return;
-
-	page_loan_request = div_s64((s64)mpp_data.loan_request, PAGE_SIZE);
-	target = page_loan_request + __loaned_pages;
+	if (likely(!simulate)) {
+		rc = h_get_mpp(&mpp_data);
+		if (rc != H_SUCCESS)
+			return;
+		page_loan_request = div_s64((s64)mpp_data.loan_request,
+					    PAGE_SIZE);
+		target = page_loan_request + __loaned_pages;
+	} else {
+		target = KB2PAGES(simulate_loan_target_kb);
+		page_loan_request = target - __loaned_pages;
+	}
 
 	if (target < 0 || total_pages < min_mem_pages)
 		target = 0;
@@ -362,6 +376,9 @@ static struct device_attribute *cmm_attrs[] = {
 	&dev_attr_oom_freed_kb,
 };
 
+static DEVICE_ULONG_ATTR(simulate_loan_target_kb, 0644,
+			 simulate_loan_target_kb);
+
 static struct bus_type cmm_subsys = {
 	.name = "cmm",
 	.dev_name = "cmm",
@@ -396,6 +413,11 @@ static int cmm_sysfs_register(struct device *dev)
 			goto fail;
 	}
 
+	if (!simulate)
+		return 0;
+	rc = device_create_file(dev, &dev_attr_simulate_loan_target_kb.attr);
+	if (rc)
+		goto fail;
 	return 0;
 
 fail:
@@ -589,7 +611,7 @@ static int cmm_init(void)
 {
 	int rc;
 
-	if (!firmware_has_feature(FW_FEATURE_CMO))
+	if (!firmware_has_feature(FW_FEATURE_CMO) && !simulate)
 		return -EOPNOTSUPP;
 
 	rc = cmm_balloon_compaction_init();
-- 
2.21.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v1 11/12] mm: remove the memory isolate notifier
  2019-10-31 14:29 [PATCH v1 00/12] powerpc/pseries: CMM: Implement balloon compaction and remove isolate notifier David Hildenbrand
                   ` (9 preceding siblings ...)
  2019-10-31 14:29 ` [PATCH v1 10/12] powerpc/pseries: CMM: Simulation mode David Hildenbrand
@ 2019-10-31 14:29 ` David Hildenbrand
  2019-10-31 14:29 ` [PATCH v1 12/12] mm: remove "count" parameter from has_unmovable_pages() David Hildenbrand
  11 siblings, 0 replies; 20+ messages in thread
From: David Hildenbrand @ 2019-10-31 14:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linuxppc-dev, Andrew Morton, David Hildenbrand,
	Greg Kroah-Hartman, Rafael J. Wysocki, Pavel Tatashin,
	Michal Hocko, Dan Williams, Oscar Salvador, Qian Cai,
	Anshuman Khandual, Pingfan Liu

Luckily, we have no users left, so we can get rid of it.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Qian Cai <cai@lca.pw>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Pingfan Liu <kernelfans@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 drivers/base/memory.c  | 19 -------------------
 include/linux/memory.h | 27 ---------------------------
 mm/page_isolation.c    | 27 ++-------------------------
 3 files changed, 2 insertions(+), 71 deletions(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index a757d9ed88a7..03c18c97c2bf 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -73,20 +73,6 @@ void unregister_memory_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL(unregister_memory_notifier);
 
-static ATOMIC_NOTIFIER_HEAD(memory_isolate_chain);
-
-int register_memory_isolate_notifier(struct notifier_block *nb)
-{
-	return atomic_notifier_chain_register(&memory_isolate_chain, nb);
-}
-EXPORT_SYMBOL(register_memory_isolate_notifier);
-
-void unregister_memory_isolate_notifier(struct notifier_block *nb)
-{
-	atomic_notifier_chain_unregister(&memory_isolate_chain, nb);
-}
-EXPORT_SYMBOL(unregister_memory_isolate_notifier);
-
 static void memory_block_release(struct device *dev)
 {
 	struct memory_block *mem = to_memory_block(dev);
@@ -178,11 +164,6 @@ int memory_notify(unsigned long val, void *v)
 	return blocking_notifier_call_chain(&memory_chain, val, v);
 }
 
-int memory_isolate_notify(unsigned long val, void *v)
-{
-	return atomic_notifier_call_chain(&memory_isolate_chain, val, v);
-}
-
 /*
  * The probe routines leave the pages uninitialized, just as the bootmem code
  * does. Make sure we do not access them, but instead use only information from
diff --git a/include/linux/memory.h b/include/linux/memory.h
index 0ebb105eb261..d3fde2d0d94b 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -55,19 +55,6 @@ struct memory_notify {
 	int status_change_nid;
 };
 
-/*
- * During pageblock isolation, count the number of pages within the
- * range [start_pfn, start_pfn + nr_pages) which are owned by code
- * in the notifier chain.
- */
-#define MEM_ISOLATE_COUNT	(1<<0)
-
-struct memory_isolate_notify {
-	unsigned long start_pfn;	/* Start of range to check */
-	unsigned int nr_pages;		/* # pages in range to check */
-	unsigned int pages_found;	/* # pages owned found by callbacks */
-};
-
 struct notifier_block;
 struct mem_section;
 
@@ -94,27 +81,13 @@ static inline int memory_notify(unsigned long val, void *v)
 {
 	return 0;
 }
-static inline int register_memory_isolate_notifier(struct notifier_block *nb)
-{
-	return 0;
-}
-static inline void unregister_memory_isolate_notifier(struct notifier_block *nb)
-{
-}
-static inline int memory_isolate_notify(unsigned long val, void *v)
-{
-	return 0;
-}
 #else
 extern int register_memory_notifier(struct notifier_block *nb);
 extern void unregister_memory_notifier(struct notifier_block *nb);
-extern int register_memory_isolate_notifier(struct notifier_block *nb);
-extern void unregister_memory_isolate_notifier(struct notifier_block *nb);
 int create_memory_block_devices(unsigned long start, unsigned long size);
 void remove_memory_block_devices(unsigned long start, unsigned long size);
 extern void memory_dev_init(void);
 extern int memory_notify(unsigned long val, void *v);
-extern int memory_isolate_notify(unsigned long val, void *v);
 extern struct memory_block *find_memory_block(struct mem_section *);
 typedef int (*walk_memory_blocks_func_t)(struct memory_block *, void *);
 extern int walk_memory_blocks(unsigned long start, unsigned long size,
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 04ee1663cdbe..20d87d18c7cc 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -18,9 +18,7 @@
 static int set_migratetype_isolate(struct page *page, int migratetype, int isol_flags)
 {
 	struct zone *zone;
-	unsigned long flags, pfn;
-	struct memory_isolate_notify arg;
-	int notifier_ret;
+	unsigned long flags;
 	int ret = -EBUSY;
 
 	zone = page_zone(page);
@@ -35,32 +33,11 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_
 	if (is_migrate_isolate_page(page))
 		goto out;
 
-	pfn = page_to_pfn(page);
-	arg.start_pfn = pfn;
-	arg.nr_pages = pageblock_nr_pages;
-	arg.pages_found = 0;
-
-	/*
-	 * It may be possible to isolate a pageblock even if the
-	 * migratetype is not MIGRATE_MOVABLE. The memory isolation
-	 * notifier chain is used by balloon drivers to return the
-	 * number of pages in a range that are held by the balloon
-	 * driver to shrink memory. If all the pages are accounted for
-	 * by balloons, are free, or on the LRU, isolation can continue.
-	 * Later, for example, when memory hotplug notifier runs, these
-	 * pages reported as "can be isolated" should be isolated(freed)
-	 * by the balloon driver through the memory notifier chain.
-	 */
-	notifier_ret = memory_isolate_notify(MEM_ISOLATE_COUNT, &arg);
-	notifier_ret = notifier_to_errno(notifier_ret);
-	if (notifier_ret)
-		goto out;
 	/*
 	 * FIXME: Now, memory hotplug doesn't call shrink_slab() by itself.
 	 * We just check MOVABLE pages.
 	 */
-	if (!has_unmovable_pages(zone, page, arg.pages_found, migratetype,
-				 isol_flags))
+	if (!has_unmovable_pages(zone, page, 0, migratetype, isol_flags))
 		ret = 0;
 
 	/*
-- 
2.21.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v1 12/12] mm: remove "count" parameter from has_unmovable_pages()
  2019-10-31 14:29 [PATCH v1 00/12] powerpc/pseries: CMM: Implement balloon compaction and remove isolate notifier David Hildenbrand
                   ` (10 preceding siblings ...)
  2019-10-31 14:29 ` [PATCH v1 11/12] mm: remove the memory isolate notifier David Hildenbrand
@ 2019-10-31 14:29 ` David Hildenbrand
  11 siblings, 0 replies; 20+ messages in thread
From: David Hildenbrand @ 2019-10-31 14:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linuxppc-dev, Andrew Morton, David Hildenbrand,
	Michal Hocko, Oscar Salvador, Anshuman Khandual, Qian Cai,
	Pingfan Liu, Stephen Rothwell, Dan Williams, Pavel Tatashin,
	Vlastimil Babka, Mel Gorman, Mike Rapoport, Wei Yang,
	Alexander Duyck, Alexander Potapenko, Arun KS

Now that the memory isolate notifier is gone, the parameter is always 0.
Drop it and cleanup has_unmovable_pages().

Cc: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Pingfan Liu <kernelfans@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Arun KS <arunks@codeaurora.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/page-isolation.h |  4 ++--
 mm/memory_hotplug.c            |  2 +-
 mm/page_alloc.c                | 21 +++++++--------------
 mm/page_isolation.c            |  2 +-
 4 files changed, 11 insertions(+), 18 deletions(-)

diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 6861df759fad..148e65a9c606 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -33,8 +33,8 @@ static inline bool is_migrate_isolate(int migratetype)
 #define MEMORY_OFFLINE	0x1
 #define REPORT_FAILURE	0x2
 
-bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
-			 int migratetype, int flags);
+bool has_unmovable_pages(struct zone *zone, struct page *page, int migratetype,
+			 int flags);
 void set_pageblock_migratetype(struct page *page, int migratetype);
 int move_freepages_block(struct zone *zone, struct page *page,
 				int migratetype, int *num_movable);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 561371ead39a..81e38e699e23 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1149,7 +1149,7 @@ static bool is_pageblock_removable_nolock(unsigned long pfn)
 	if (!zone_spans_pfn(zone, pfn))
 		return false;
 
-	return !has_unmovable_pages(zone, page, 0, MIGRATE_MOVABLE,
+	return !has_unmovable_pages(zone, page, MIGRATE_MOVABLE,
 				    MEMORY_OFFLINE);
 }
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 12f3ce09d33d..efcce493452c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8182,17 +8182,15 @@ void *__init alloc_large_system_hash(const char *tablename,
 
 /*
  * This function checks whether pageblock includes unmovable pages or not.
- * If @count is not zero, it is okay to include less @count unmovable pages
  *
  * PageLRU check without isolation or lru_lock could race so that
  * MIGRATE_MOVABLE block might include unmovable pages. And __PageMovable
  * check without lock_page also may miss some movable non-lru pages at
  * race condition. So you can't expect this function should be exact.
  */
-bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
-			 int migratetype, int flags)
+bool has_unmovable_pages(struct zone *zone, struct page *page, int migratetype,
+			 int flags)
 {
-	unsigned long found;
 	unsigned long iter = 0;
 	unsigned long pfn = page_to_pfn(page);
 	const char *reason = "unmovable page";
@@ -8218,13 +8216,11 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
 		goto unmovable;
 	}
 
-	for (found = 0; iter < pageblock_nr_pages; iter++) {
-		unsigned long check = pfn + iter;
-
-		if (!pfn_valid_within(check))
+	for (; iter < pageblock_nr_pages; iter++) {
+		if (!pfn_valid_within(pfn + iter))
 			continue;
 
-		page = pfn_to_page(check);
+		page = pfn_to_page(pfn + iter);
 
 		if (PageReserved(page))
 			goto unmovable;
@@ -8273,11 +8269,9 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
 		if ((flags & MEMORY_OFFLINE) && PageHWPoison(page))
 			continue;
 
-		if (__PageMovable(page))
+		if (__PageMovable(page) || PageLRU(page))
 			continue;
 
-		if (!PageLRU(page))
-			found++;
 		/*
 		 * If there are RECLAIMABLE pages, we need to check
 		 * it.  But now, memory offline itself doesn't call
@@ -8291,8 +8285,7 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
 		 * is set to both of a memory hole page and a _used_ kernel
 		 * page at boot.
 		 */
-		if (found > count)
-			goto unmovable;
+		goto unmovable;
 	}
 	return false;
 unmovable:
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 20d87d18c7cc..c0ecae9f5f93 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -37,7 +37,7 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_
 	 * FIXME: Now, memory hotplug doesn't call shrink_slab() by itself.
 	 * We just check MOVABLE pages.
 	 */
-	if (!has_unmovable_pages(zone, page, 0, migratetype, isol_flags))
+	if (!has_unmovable_pages(zone, page, migratetype, isol_flags))
 		ret = 0;
 
 	/*
-- 
2.21.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 08/12] powerpc/pseries: CMM: Implement balloon compaction
  2019-10-31 14:29 ` [PATCH v1 08/12] powerpc/pseries: CMM: Implement balloon compaction David Hildenbrand
@ 2019-10-31 14:35   ` David Hildenbrand
  2019-11-12 10:46     ` Michael Ellerman
  0 siblings, 1 reply; 20+ messages in thread
From: David Hildenbrand @ 2019-10-31 14:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linuxppc-dev, Andrew Morton, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Thiago Jung Bauermann,
	Geert Uytterhoeven, Anshuman Khandual, Oliver O'Halloran,
	Alexey Kardashevskiy, Enrico Weigelt, metux IT consult,
	Thomas Gleixner, Allison Randal, Greg Kroah-Hartman,
	Pavel Tatashin, Arun KS, Todd Kjos, Christian Brauner, Gao Xiang,
	Greg Hackmann, David Howells

On 31.10.19 15:29, David Hildenbrand wrote:
> We can now get rid of the cmm_lock and completely rely on the balloon
> compaction internals, which now also manage the page list and the lock.
> 
> Inflated/"loaned" pages are now movable. Memory blocks that contain
> such apges can get offlined. Also, all such pages will be marked
> PageOffline() and can therefore be excluded in memory dumps using recent
> versions of makedumpfile.
> 
> Don't switch to balloon_page_alloc() yet (due to the GFP_NOIO). Will
> do that separately to discuss this change in detail.
> 
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
> Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
> Cc: "Oliver O'Halloran" <oohall@gmail.com>
> Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
> Cc: "Enrico Weigelt, metux IT consult" <info@metux.net>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Allison Randal <allison@lohutok.net>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
> Cc: Arun KS <arunks@codeaurora.org>
> Cc: Todd Kjos <tkjos@google.com>
> Cc: Christian Brauner <christian@brauner.io>
> Cc: Gao Xiang <xiang@kernel.org>
> Cc: Greg Hackmann <ghackmann@google.com>
> Cc: David Howells <dhowells@redhat.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>   arch/powerpc/platforms/pseries/Kconfig |   1 +
>   arch/powerpc/platforms/pseries/cmm.c   | 132 ++++++++++++++++++++++---
>   include/uapi/linux/magic.h             |   1 +
>   3 files changed, 120 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig
> index 9e35cddddf73..595e9f8a6539 100644
> --- a/arch/powerpc/platforms/pseries/Kconfig
> +++ b/arch/powerpc/platforms/pseries/Kconfig
> @@ -108,6 +108,7 @@ config PPC_SMLPAR
>   config CMM
>   	tristate "Collaborative memory management"
>   	depends on PPC_SMLPAR
> +	select MEMORY_BALLOON
>   	default y
>   	help
>   	  Select this option, if you want to enable the kernel interface
> diff --git a/arch/powerpc/platforms/pseries/cmm.c b/arch/powerpc/platforms/pseries/cmm.c
> index 3a55dd1fdd39..235fd7fe9df1 100644
> --- a/arch/powerpc/platforms/pseries/cmm.c
> +++ b/arch/powerpc/platforms/pseries/cmm.c
> @@ -19,6 +19,10 @@
>   #include <linux/stringify.h>
>   #include <linux/swap.h>
>   #include <linux/device.h>
> +#include <linux/mount.h>
> +#include <linux/pseudo_fs.h>
> +#include <linux/magic.h>
> +#include <linux/balloon_compaction.h>
>   #include <asm/firmware.h>
>   #include <asm/hvcall.h>
>   #include <asm/mmu.h>
> @@ -77,13 +81,11 @@ static atomic_long_t loaned_pages;
>   static unsigned long loaned_pages_target;
>   static unsigned long oom_freed_pages;
>   
> -static LIST_HEAD(cmm_page_list);
> -static DEFINE_SPINLOCK(cmm_lock);
> -
>   static DEFINE_MUTEX(hotplug_mutex);
>   static int hotplug_occurred; /* protected by the hotplug mutex */
>   
>   static struct task_struct *cmm_thread_ptr;
> +static struct balloon_dev_info b_dev_info;
>   
>   static long plpar_page_set_loaned(struct page *page)
>   {
> @@ -149,19 +151,16 @@ static long cmm_alloc_pages(long nr)
>   				  __GFP_NOMEMALLOC);
>   		if (!page)
>   			break;
> -		spin_lock(&cmm_lock);
>   		rc = plpar_page_set_loaned(page);
>   		if (rc) {
>   			pr_err("%s: Can not set page to loaned. rc=%ld\n", __func__, rc);
> -			spin_unlock(&cmm_lock);
>   			__free_page(page);
>   			break;
>   		}
>   
> -		list_add(&page->lru, &cmm_page_list);
> +		balloon_page_enqueue(&b_dev_info, page);
>   		atomic_long_inc(&loaned_pages);
>   		adjust_managed_page_count(page, -1);
> -		spin_unlock(&cmm_lock);
>   		nr--;
>   	}
>   
> @@ -178,21 +177,19 @@ static long cmm_alloc_pages(long nr)
>    **/
>   static long cmm_free_pages(long nr)
>   {
> -	struct page *page, *tmp;
> +	struct page *page;
>   
>   	cmm_dbg("Begin free of %ld pages.\n", nr);
> -	spin_lock(&cmm_lock);
> -	list_for_each_entry_safe(page, tmp, &cmm_page_list, lru) {
> -		if (!nr)
> +	while (nr) {
> +		page = balloon_page_dequeue(&b_dev_info);
> +		if (!page)
>   			break;
>   		plpar_page_set_active(page);
> -		list_del(&page->lru);
>   		adjust_managed_page_count(page, 1);
>   		__free_page(page);
>   		atomic_long_dec(&loaned_pages);
>   		nr--;
>   	}
> -	spin_unlock(&cmm_lock);
>   	cmm_dbg("End request with %ld pages unfulfilled\n", nr);
>   	return nr;
>   }
> @@ -484,6 +481,105 @@ static struct notifier_block cmm_mem_nb = {
>   	.priority = CMM_MEM_HOTPLUG_PRI
>   };
>   
> +#ifdef CONFIG_BALLOON_COMPACTION
> +static struct vfsmount *balloon_mnt;
> +
> +static int cmm_init_fs_context(struct fs_context *fc)
> +{
> +	return init_pseudo(fc, PPC_CMM_MAGIC) ? 0 : -ENOMEM;
> +}
> +
> +static struct file_system_type balloon_fs = {
> +	.name = "ppc-cmm",
> +	.init_fs_context = cmm_init_fs_context,
> +	.kill_sb = kill_anon_super,
> +};
> +
> +static int cmm_migratepage(struct balloon_dev_info *b_dev_info,
> +			   struct page *newpage, struct page *page,
> +			   enum migrate_mode mode)
> +{
> +	unsigned long flags;
> +
> +	/*
> +	 * loan/"inflate" the newpage first.
> +	 *
> +	 * We might race against the cmm_thread who might discover after our
> +	 * loan request that another page is to be unloaned. However, once
> +	 * the cmm_thread runs again later, this error will automatically
> +	 * be corrected.
> +	 */
> +	if (plpar_page_set_loaned(newpage)) {
> +		/* Unlikely, but possible. Tell the caller not to retry now. */
> +		pr_err_ratelimited("%s: Cannot set page to loaned.", __func__);
> +		return -EBUSY;
> +	}
> +
> +	/* balloon page list reference */
> +	get_page(newpage);
> +
> +	spin_lock_irqsave(&b_dev_info->pages_lock, flags);
> +	balloon_page_insert(b_dev_info, newpage);
> +	balloon_page_delete(page);

I think I am missing a b_dev_info->isolated_pages-- here.

> +	spin_unlock_irqrestore(&b_dev_info->pages_lock, flags);


-- 

Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 08/12] powerpc/pseries: CMM: Implement balloon compaction
  2019-10-31 14:35   ` David Hildenbrand
@ 2019-11-12 10:46     ` Michael Ellerman
  2019-11-12 11:12       ` David Hildenbrand
  0 siblings, 1 reply; 20+ messages in thread
From: Michael Ellerman @ 2019-11-12 10:46 UTC (permalink / raw)
  To: David Hildenbrand, linux-kernel
  Cc: linux-mm, linuxppc-dev, Andrew Morton, Benjamin Herrenschmidt,
	Paul Mackerras, Thiago Jung Bauermann, Geert Uytterhoeven,
	Anshuman Khandual, Oliver O'Halloran, Alexey Kardashevskiy,
	Enrico Weigelt, metux IT consult, Thomas Gleixner,
	Allison Randal, Greg Kroah-Hartman, Pavel Tatashin, Arun KS,
	Todd Kjos, Christian Brauner, Gao Xiang, Greg Hackmann,
	David Howells

David Hildenbrand <david@redhat.com> writes:
> On 31.10.19 15:29, David Hildenbrand wrote:
>> We can now get rid of the cmm_lock and completely rely on the balloon
>> compaction internals, which now also manage the page list and the lock.
...
>> +
>> +static int cmm_migratepage(struct balloon_dev_info *b_dev_info,
>> +			   struct page *newpage, struct page *page,
>> +			   enum migrate_mode mode)
>> +{
>> +	unsigned long flags;
>> +
>> +	/*
>> +	 * loan/"inflate" the newpage first.
>> +	 *
>> +	 * We might race against the cmm_thread who might discover after our
>> +	 * loan request that another page is to be unloaned. However, once
>> +	 * the cmm_thread runs again later, this error will automatically
>> +	 * be corrected.
>> +	 */
>> +	if (plpar_page_set_loaned(newpage)) {
>> +		/* Unlikely, but possible. Tell the caller not to retry now. */
>> +		pr_err_ratelimited("%s: Cannot set page to loaned.", __func__);
>> +		return -EBUSY;
>> +	}
>> +
>> +	/* balloon page list reference */
>> +	get_page(newpage);
>> +
>> +	spin_lock_irqsave(&b_dev_info->pages_lock, flags);
>> +	balloon_page_insert(b_dev_info, newpage);
>> +	balloon_page_delete(page);
>
> I think I am missing a b_dev_info->isolated_pages-- here.

I don't know this code at all, but looking at other balloon drivers they
do seem to do that in roughly the same spot.

I'll add it, how can we test that it's correct?

cheers

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 08/12] powerpc/pseries: CMM: Implement balloon compaction
  2019-11-12 10:46     ` Michael Ellerman
@ 2019-11-12 11:12       ` David Hildenbrand
  2019-11-13 11:08         ` Michael Ellerman
  0 siblings, 1 reply; 20+ messages in thread
From: David Hildenbrand @ 2019-11-12 11:12 UTC (permalink / raw)
  To: Michael Ellerman, linux-kernel
  Cc: linux-mm, linuxppc-dev, Andrew Morton, Benjamin Herrenschmidt,
	Paul Mackerras, Thiago Jung Bauermann, Geert Uytterhoeven,
	Anshuman Khandual, Oliver O'Halloran, Alexey Kardashevskiy,
	Enrico Weigelt, metux IT consult, Thomas Gleixner,
	Allison Randal, Greg Kroah-Hartman, Pavel Tatashin, Arun KS,
	Todd Kjos, Christian Brauner, Gao Xiang, Greg Hackmann,
	David Howells

On 12.11.19 11:46, Michael Ellerman wrote:
> David Hildenbrand <david@redhat.com> writes:
>> On 31.10.19 15:29, David Hildenbrand wrote:
>>> We can now get rid of the cmm_lock and completely rely on the balloon
>>> compaction internals, which now also manage the page list and the lock.
> ...
>>> +
>>> +static int cmm_migratepage(struct balloon_dev_info *b_dev_info,
>>> +			   struct page *newpage, struct page *page,
>>> +			   enum migrate_mode mode)
>>> +{
>>> +	unsigned long flags;
>>> +
>>> +	/*
>>> +	 * loan/"inflate" the newpage first.
>>> +	 *
>>> +	 * We might race against the cmm_thread who might discover after our
>>> +	 * loan request that another page is to be unloaned. However, once
>>> +	 * the cmm_thread runs again later, this error will automatically
>>> +	 * be corrected.
>>> +	 */
>>> +	if (plpar_page_set_loaned(newpage)) {
>>> +		/* Unlikely, but possible. Tell the caller not to retry now. */
>>> +		pr_err_ratelimited("%s: Cannot set page to loaned.", __func__);
>>> +		return -EBUSY;
>>> +	}
>>> +
>>> +	/* balloon page list reference */
>>> +	get_page(newpage);
>>> +
>>> +	spin_lock_irqsave(&b_dev_info->pages_lock, flags);
>>> +	balloon_page_insert(b_dev_info, newpage);
>>> +	balloon_page_delete(page);
>>
>> I think I am missing a b_dev_info->isolated_pages-- here.
> 
> I don't know this code at all, but looking at other balloon drivers they
> do seem to do that in roughly the same spot.
> 
> I'll add it, how can we test that it's correct?

It's certainly correct. We increment when we isolate 
(balloon_page_isolate()) and decrement when we un-isolate.

Un-isolate happens when we putback a isolated page 
(balloon_page_putback() - migration aborted) or when we successfully 
migrate it (via balloon_page_migrate()).

The issue is that we cannot decrement in balloon_page_migrate(), as we 
have to hold the b_dev_info->pages_lock. That's why we have to do it in 
the registered callback under lock.

Please note that b_dev_info->isolated_pages is only needed for a sanity 
check in balloon_page_dequeue(). That's why I didn't notice during 
testing. I wonder if we should at some point rip out that sanity check ...

Thanks and cheers!

> 
> cheers
> 


-- 

Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 08/12] powerpc/pseries: CMM: Implement balloon compaction
  2019-11-12 11:12       ` David Hildenbrand
@ 2019-11-13 11:08         ` Michael Ellerman
  0 siblings, 0 replies; 20+ messages in thread
From: Michael Ellerman @ 2019-11-13 11:08 UTC (permalink / raw)
  To: David Hildenbrand, linux-kernel
  Cc: linux-mm, linuxppc-dev, Andrew Morton, Benjamin Herrenschmidt,
	Paul Mackerras, Thiago Jung Bauermann, Geert Uytterhoeven,
	Anshuman Khandual, Oliver O'Halloran, Alexey Kardashevskiy,
	Enrico Weigelt, metux IT consult, Thomas Gleixner,
	Allison Randal, Greg Kroah-Hartman, Pavel Tatashin, Arun KS,
	Todd Kjos, Christian Brauner, Gao Xiang, Greg Hackmann,
	David Howells

David Hildenbrand <david@redhat.com> writes:
> On 12.11.19 11:46, Michael Ellerman wrote:
>> David Hildenbrand <david@redhat.com> writes:
>>> On 31.10.19 15:29, David Hildenbrand wrote:
>>>> We can now get rid of the cmm_lock and completely rely on the balloon
>>>> compaction internals, which now also manage the page list and the lock.
>> ...
>>>> +
>>>> +static int cmm_migratepage(struct balloon_dev_info *b_dev_info,
>>>> +			   struct page *newpage, struct page *page,
>>>> +			   enum migrate_mode mode)
>>>> +{
>>>> +	unsigned long flags;
>>>> +
>>>> +	/*
>>>> +	 * loan/"inflate" the newpage first.
>>>> +	 *
>>>> +	 * We might race against the cmm_thread who might discover after our
>>>> +	 * loan request that another page is to be unloaned. However, once
>>>> +	 * the cmm_thread runs again later, this error will automatically
>>>> +	 * be corrected.
>>>> +	 */
>>>> +	if (plpar_page_set_loaned(newpage)) {
>>>> +		/* Unlikely, but possible. Tell the caller not to retry now. */
>>>> +		pr_err_ratelimited("%s: Cannot set page to loaned.", __func__);
>>>> +		return -EBUSY;
>>>> +	}
>>>> +
>>>> +	/* balloon page list reference */
>>>> +	get_page(newpage);
>>>> +
>>>> +	spin_lock_irqsave(&b_dev_info->pages_lock, flags);
>>>> +	balloon_page_insert(b_dev_info, newpage);
>>>> +	balloon_page_delete(page);
>>>
>>> I think I am missing a b_dev_info->isolated_pages-- here.
>> 
>> I don't know this code at all, but looking at other balloon drivers they
>> do seem to do that in roughly the same spot.
>> 
>> I'll add it, how can we test that it's correct?
>
> It's certainly correct. We increment when we isolate 
> (balloon_page_isolate()) and decrement when we un-isolate.
>
> Un-isolate happens when we putback a isolated page 
> (balloon_page_putback() - migration aborted) or when we successfully 
> migrate it (via balloon_page_migrate()).
>
> The issue is that we cannot decrement in balloon_page_migrate(), as we 
> have to hold the b_dev_info->pages_lock. That's why we have to do it in 
> the registered callback under lock.

OK, I get it now.

> Please note that b_dev_info->isolated_pages is only needed for a sanity 
> check in balloon_page_dequeue(). That's why I didn't notice during 
> testing. I wonder if we should at some point rip out that sanity check ...

OK. Sanity checks can be good, though checks that call BUG() are less
nice :)  But I'm not an mm expert so I'll defer to you folks on the
sanity check.

For now I've merged this series with the decrement added to
cmm_migratepage().

cheers

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 01/12] powerpc/pseries: CMM: Implement release() function for sysfs device
  2019-10-31 14:29 ` [PATCH v1 01/12] powerpc/pseries: CMM: Implement release() function for sysfs device David Hildenbrand
@ 2019-11-14  9:08   ` Michael Ellerman
  2019-11-14 12:21     ` David Hildenbrand
  2019-11-20 10:35     ` David Hildenbrand
  0 siblings, 2 replies; 20+ messages in thread
From: Michael Ellerman @ 2019-11-14  9:08 UTC (permalink / raw)
  To: David Hildenbrand, linux-kernel
  Cc: David Hildenbrand, Konstantin Khlebnikov, Greg Kroah-Hartman,
	linux-mm, Paul Mackerras, Allison Randal, Arun KS, Andrew Morton,
	linuxppc-dev, Thomas Gleixner, Vlastimil Babka

On Thu, 2019-10-31 at 14:29:22 UTC, David Hildenbrand wrote:
> When unloading the module, one gets
> [  548.188594] ------------[ cut here ]------------
> [  548.188596] Device 'cmm0' does not have a release() function, it is brok=
> en and must be fixed. See Documentation/kobject.txt.
> [  548.188622] WARNING: CPU: 0 PID: 19308 at drivers/base/core.c:1244 .devi=
> ce_release+0xcc/0xf0
> ...
> 
> We only have on static fake device. There is nothing to do when
> releasing the device (via cmm_exit).
> 
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> Cc: Allison Randal <allison@lohutok.net>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Arun KS <arunks@codeaurora.org>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Patches 1-10 applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/7d8212747435c534c8d564fbef4541a463c976ff

cheers

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 01/12] powerpc/pseries: CMM: Implement release() function for sysfs device
  2019-11-14  9:08   ` Michael Ellerman
@ 2019-11-14 12:21     ` David Hildenbrand
  2019-11-20 10:35     ` David Hildenbrand
  1 sibling, 0 replies; 20+ messages in thread
From: David Hildenbrand @ 2019-11-14 12:21 UTC (permalink / raw)
  To: Michael Ellerman, linux-kernel
  Cc: Konstantin Khlebnikov, Greg Kroah-Hartman, linux-mm,
	Paul Mackerras, Allison Randal, Arun KS, Andrew Morton,
	linuxppc-dev, Thomas Gleixner, Vlastimil Babka

On 14.11.19 10:08, Michael Ellerman wrote:
> On Thu, 2019-10-31 at 14:29:22 UTC, David Hildenbrand wrote:
>> When unloading the module, one gets
>> [  548.188594] ------------[ cut here ]------------
>> [  548.188596] Device 'cmm0' does not have a release() function, it is brok=
>> en and must be fixed. See Documentation/kobject.txt.
>> [  548.188622] WARNING: CPU: 0 PID: 19308 at drivers/base/core.c:1244 .devi=
>> ce_release+0xcc/0xf0
>> ...
>>
>> We only have on static fake device. There is nothing to do when
>> releasing the device (via cmm_exit).
>>
>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> Cc: Paul Mackerras <paulus@samba.org>
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Vlastimil Babka <vbabka@suse.cz>
>> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
>> Cc: Allison Randal <allison@lohutok.net>
>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Arun KS <arunks@codeaurora.org>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
> 
> Patches 1-10 applied to powerpc next, thanks.
> 
> https://git.kernel.org/powerpc/c/7d8212747435c534c8d564fbef4541a463c976ff
> 
> cheers
> 

Thanks! I'll probably resend patch 11/12 to give it more attention and 
to fixup one comment leftover in patch 11. I guess if we get ACKs these 
two patch should also go via your tree to avoid collisions.

-- 

Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v1 01/12] powerpc/pseries: CMM: Implement release() function for sysfs device
  2019-11-14  9:08   ` Michael Ellerman
  2019-11-14 12:21     ` David Hildenbrand
@ 2019-11-20 10:35     ` David Hildenbrand
  1 sibling, 0 replies; 20+ messages in thread
From: David Hildenbrand @ 2019-11-20 10:35 UTC (permalink / raw)
  To: Michael Ellerman, linux-kernel
  Cc: Konstantin Khlebnikov, Greg Kroah-Hartman, linux-mm,
	Paul Mackerras, Allison Randal, Arun KS, Andrew Morton,
	linuxppc-dev, Thomas Gleixner, Vlastimil Babka

On 14.11.19 10:08, Michael Ellerman wrote:
> On Thu, 2019-10-31 at 14:29:22 UTC, David Hildenbrand wrote:
>> When unloading the module, one gets
>> [  548.188594] ------------[ cut here ]------------
>> [  548.188596] Device 'cmm0' does not have a release() function, it is brok=
>> en and must be fixed. See Documentation/kobject.txt.
>> [  548.188622] WARNING: CPU: 0 PID: 19308 at drivers/base/core.c:1244 .devi=
>> ce_release+0xcc/0xf0
>> ...
>>
>> We only have on static fake device. There is nothing to do when
>> releasing the device (via cmm_exit).
>>
>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> Cc: Paul Mackerras <paulus@samba.org>
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Vlastimil Babka <vbabka@suse.cz>
>> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
>> Cc: Allison Randal <allison@lohutok.net>
>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Arun KS <arunks@codeaurora.org>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
> 
> Patches 1-10 applied to powerpc next, thanks.
> 
> https://git.kernel.org/powerpc/c/7d8212747435c534c8d564fbef4541a463c976ff
> 
> cheers
> 

Hi Michael,

just to make sure you saw the two MM patches (and the ACKs from Michal)

https://lkml.org/lkml/2019/11/14/410

if you prefer that Andrew picks these up, please let me know.

Cheers!

-- 

Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2019-11-20 10:35 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-31 14:29 [PATCH v1 00/12] powerpc/pseries: CMM: Implement balloon compaction and remove isolate notifier David Hildenbrand
2019-10-31 14:29 ` [PATCH v1 01/12] powerpc/pseries: CMM: Implement release() function for sysfs device David Hildenbrand
2019-11-14  9:08   ` Michael Ellerman
2019-11-14 12:21     ` David Hildenbrand
2019-11-20 10:35     ` David Hildenbrand
2019-10-31 14:29 ` [PATCH v1 02/12] powerpc/pseries: CMM: Report errors when registering notifiers fails David Hildenbrand
2019-10-31 14:29 ` [PATCH v1 03/12] powerpc/pseries: CMM: Cleanup rc handling in cmm_init() David Hildenbrand
2019-10-31 14:29 ` [PATCH v1 04/12] powerpc/pseries: CMM: Drop page array David Hildenbrand
2019-10-31 14:29 ` [PATCH v1 05/12] powerpc/pseries: CMM: Use adjust_managed_page_count() insted of totalram_pages_* David Hildenbrand
2019-10-31 14:29 ` [PATCH v1 06/12] powerpc/pseries: CMM: Rip out memory isolate notifier David Hildenbrand
2019-10-31 14:29 ` [PATCH v1 07/12] powerpc/pseries: CMM: Convert loaned_pages to an atomic_long_t David Hildenbrand
2019-10-31 14:29 ` [PATCH v1 08/12] powerpc/pseries: CMM: Implement balloon compaction David Hildenbrand
2019-10-31 14:35   ` David Hildenbrand
2019-11-12 10:46     ` Michael Ellerman
2019-11-12 11:12       ` David Hildenbrand
2019-11-13 11:08         ` Michael Ellerman
2019-10-31 14:29 ` [PATCH v1 09/12] powerpc/pseries: CMM: Switch to balloon_page_alloc() David Hildenbrand
2019-10-31 14:29 ` [PATCH v1 10/12] powerpc/pseries: CMM: Simulation mode David Hildenbrand
2019-10-31 14:29 ` [PATCH v1 11/12] mm: remove the memory isolate notifier David Hildenbrand
2019-10-31 14:29 ` [PATCH v1 12/12] mm: remove "count" parameter from has_unmovable_pages() David Hildenbrand

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).