* [PATCH v10 0/5] make balloon pages movable by compaction
@ 2012-09-17 16:38 Rafael Aquini
  2012-09-17 16:38 ` [PATCH v10 1/5] mm: introduce a common interface for balloon pages mobility Rafael Aquini
From: Rafael Aquini @ 2012-09-17 16:38 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, virtualization, Rusty Russell, Michael S. Tsirkin,
	Rik van Riel, Mel Gorman, Andi Kleen, Andrew Morton,
	Konrad Rzeszutek Wilk, Minchan Kim, Peter Zijlstra,
	Paul E. McKenney, aquini

Memory fragmentation introduced by ballooning can significantly reduce the
number of 2MB contiguous memory blocks available within a guest, thus imposing
performance penalties associated with the reduced number of transparent huge
pages that could be used by the guest workload.

This patch set follows the main idea discussed at the 2012 LSF/MM session
"Ballooning for transparent huge pages" -- http://lwn.net/Articles/490114/ --
introducing the required changes to the virtio_balloon driver, as well as to
the core compaction & migration bits, in order to make those subsystems aware
of ballooned pages and to allow memory balloon pages to become movable within
a guest, thus avoiding the aforementioned fragmentation issue.

The numbers below demonstrate the benefit of this patch set in allowing
compaction to be more effective on memory-ballooned guests.

Results for the STRESS-HIGHALLOC benchmark, from Mel Gorman's mmtests suite,
running on a 4 GB RAM KVM guest that ballooned 1 GB of RAM in 256 MB chunks
(inflating/deflating) every minute while the test was running:

===BEGIN stress-highalloc

STRESS-HIGHALLOC
              stress-highalloc   highalloc-3.6.0
                     3.6.0-rc5         rc5-patch
Pass 1          47.00 ( 0.00%)    85.00 (38.00%)
Pass 2          52.00 ( 0.00%)    87.00 (35.00%)
while Rested    77.00 ( 0.00%)    99.00 (22.00%)

MMTests Statistics: duration
               3.6.0       3.6.0
                 rc5   rc5-patch
User         1566.87     1066.77
System        948.78      713.19
Elapsed      2008.95     1650.72

MMTests Statistics: vmstat
                              3.6.0       3.6.0
                                rc5   rc5-patch
Page Ins                    5037962     3458106
Page Outs                  10779728     8969512
Swap Ins                      34282        5565
Swap Outs                     63027       19717
Direct pages scanned         481017      166920
Kswapd pages scanned        2083130     1537202
Kswapd pages reclaimed      1838615     1459932
Direct pages reclaimed       337487      120613
Kswapd efficiency               88%         94%
Kswapd velocity            1036.925     931.231
Direct efficiency               70%         72%
Direct velocity             239.437     101.120
Percentage direct scans         18%          9%
Page writes by reclaim       157305       19855
Page writes file              94278         138
Page writes anon              63027       19717
Page reclaim immediate       111205       64510
Page rescued immediate            0           0
Slabs scanned               3362816     2375680
Direct inode steals           12411        2022
Kswapd inode steals          753789      524457
Kswapd skipped wait             136           7
THP fault alloc                 688         739
THP collapse alloc              378         481
THP splits                      279         317
THP fault fallback              172          45
THP collapse fail                12           5
Compaction stalls              1378         968
Compaction success              406         595
Compaction failures             972         373
Compaction pages moved      3104073     1790932
Compaction move failure       92713       41252

===END stress-highalloc

Rafael Aquini (5):
  mm: introduce a common interface for balloon pages mobility
  mm: introduce compaction and migration for ballooned pages
  virtio_balloon: introduce migration primitives to balloon pages
  mm: introduce putback_movable_pages()
  mm: add vm event counters for balloon pages compaction

 drivers/virtio/virtio_balloon.c    | 306 ++++++++++++++++++++++++++++++++++---
 include/linux/balloon_compaction.h | 147 ++++++++++++++++++
 include/linux/migrate.h            |   2 +
 include/linux/pagemap.h            |  18 +++
 include/linux/vm_event_item.h      |   8 +-
 mm/Kconfig                         |  15 ++
 mm/Makefile                        |   1 +
 mm/balloon_compaction.c            | 154 +++++++++++++++++++
 mm/compaction.c                    |  51 ++++---
 mm/migrate.c                       |  57 ++++++-
 mm/page_alloc.c                    |   2 +-
 mm/vmstat.c                        |  10 +-
 12 files changed, 726 insertions(+), 45 deletions(-)
 create mode 100644 include/linux/balloon_compaction.h
 create mode 100644 mm/balloon_compaction.c

Change log:
v10:
 * Adjust leak_balloon() wait_event logic to make a clear locking scheme (MST);
 * Drop the RCU protection approach for dereferencing balloon's page->mapping;
 * Minor nitpicks on code comments (MST);
v9:
 * Adjust rcu_dereference usage to leverage page lock protection  (Paul, Peter);
 * Enhance doc on compaction interface introduced to balloon driver   (Michael);
 * Fix issue with isolated pages breaking leak_balloon() logics       (Michael);
v8:
 * introduce a common MM interface for balloon driver page compaction (Michael);
 * remove the global state preventing multiple balloon device support (Michael);
 * introduce RCU protection/synchronization to balloon page->mapping  (Michael);
v7:
 * fix a potential page leak case at 'putback_balloon_page'               (Mel);
 * adjust vm-events-counter patch and remove its drop-on-merge message    (Rik);
 * add 'putback_movable_pages' to avoid hacks on 'putback_lru_pages'  (Minchan);
v6:
 * rename 'is_balloon_page()' to 'movable_balloon_page()' 		  (Rik);
v5:
 * address Andrew Morton's review comments on the patch series;
 * address a couple extra nitpick suggestions on PATCH 01 	      (Minchan);
v4: 
 * address Rusty Russell's review comments on PATCH 02;
 * re-base virtio_balloon patch on 9c378abc5c0c6fc8e3acf5968924d274503819b3;
V3: 
 * address reviewers' nitpick suggestions on PATCH 01		 (Mel, Minchan);
V2: 
 * address Mel Gorman's review comments on PATCH 01;
-- 
1.7.11.4


* [PATCH v10 1/5] mm: introduce a common interface for balloon pages mobility
  2012-09-17 16:38 [PATCH v10 0/5] make balloon pages movable by compaction Rafael Aquini
@ 2012-09-17 16:38 ` Rafael Aquini
  2012-09-17 22:15   ` Andrew Morton
  2012-09-24 12:44   ` Peter Zijlstra
  2012-09-17 16:38 ` [PATCH v10 2/5] mm: introduce compaction and migration for ballooned pages Rafael Aquini
From: Rafael Aquini @ 2012-09-17 16:38 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, virtualization, Rusty Russell, Michael S. Tsirkin,
	Rik van Riel, Mel Gorman, Andi Kleen, Andrew Morton,
	Konrad Rzeszutek Wilk, Minchan Kim, Peter Zijlstra,
	Paul E. McKenney, aquini

Memory fragmentation introduced by ballooning can significantly reduce the
number of 2MB contiguous memory blocks available within a guest, thus imposing
performance penalties associated with the reduced number of transparent huge
pages that could be used by the guest workload.

This patch introduces a common interface to help balloon drivers make their
page sets movable by compaction, thus allowing the system to better leverage
compaction's efforts on memory defragmentation.

Signed-off-by: Rafael Aquini <aquini@redhat.com>
---
 include/linux/balloon_compaction.h | 147 +++++++++++++++++++++++++++++++++++
 include/linux/pagemap.h            |  18 +++++
 mm/Kconfig                         |  15 ++++
 mm/Makefile                        |   1 +
 mm/balloon_compaction.c            | 152 +++++++++++++++++++++++++++++++++++++
 5 files changed, 333 insertions(+)
 create mode 100644 include/linux/balloon_compaction.h
 create mode 100644 mm/balloon_compaction.c

diff --git a/include/linux/balloon_compaction.h b/include/linux/balloon_compaction.h
new file mode 100644
index 0000000..1c27a93
--- /dev/null
+++ b/include/linux/balloon_compaction.h
@@ -0,0 +1,147 @@
+/*
+ * include/linux/balloon_compaction.h
+ *
+ * Common interface definitions for making balloon pages movable to compaction.
+ *
+ * Copyright (C) 2012, Red Hat, Inc.  Rafael Aquini <aquini@redhat.com>
+ */
+#ifndef _LINUX_BALLOON_COMPACTION_H
+#define _LINUX_BALLOON_COMPACTION_H
+
+#include <linux/rcupdate.h>
+#include <linux/pagemap.h>
+#include <linux/gfp.h>
+#include <linux/err.h>
+
+/* return code to identify when a ballooned page has been migrated */
+#define BALLOON_MIGRATION_RETURN	0xba1100
+
+#ifdef CONFIG_BALLOON_COMPACTION
+#define count_balloon_event(e)		count_vm_event(e)
+#define free_balloon_mapping(m)		kfree(m)
+
+extern bool isolate_balloon_page(struct page *);
+extern void putback_balloon_page(struct page *);
+extern int migrate_balloon_page(struct page *newpage,
+				struct page *page, enum migrate_mode mode);
+extern struct address_space *alloc_balloon_mapping(void *balloon_device,
+				const struct address_space_operations *a_ops);
+
+static inline void assign_balloon_mapping(struct page *page,
+					  struct address_space *mapping)
+{
+	page->mapping = mapping;
+	smp_wmb();
+}
+
+static inline void clear_balloon_mapping(struct page *page)
+{
+	page->mapping = NULL;
+	smp_wmb();
+}
+
+static inline gfp_t balloon_mapping_gfp_mask(void)
+{
+	return GFP_HIGHUSER_MOVABLE;
+}
+
+static inline bool __is_movable_balloon_page(struct page *page)
+{
+	struct address_space *mapping = ACCESS_ONCE(page->mapping);
+	smp_read_barrier_depends();
+	return mapping_balloon(mapping);
+}
+
+/*
+ * movable_balloon_page - test page->mapping->flags to identify balloon pages
+ *			  that can be moved by compaction/migration.
+ *
+ * This function is used by core compaction's page isolation scheme, and so
+ * it is exposed to pages which may, or may not, be part of a memory
+ * balloon; thus we cannot afford to hold the page locked while testing.
+ *
+ * Therefore, as we might return false positives when a balloon page is
+ * released right under us, page->mapping->flags must be retested, with the
+ * proper page lock held, by the functions that will deal with the balloon
+ * page later on.
+ */
+static inline bool movable_balloon_page(struct page *page)
+{
+	/*
+	 * Before dereferencing and testing mapping->flags, let's make sure
+	 * this is not a page that uses ->mapping in a different way
+	 */
+	if (!PageSlab(page) && !PageSwapCache(page) && !PageAnon(page) &&
+	    !page_mapped(page))
+		return __is_movable_balloon_page(page);
+
+	return false;
+}
+
+/*
+ * __page_balloon_device - get the balloon device that owns the given page.
+ *
+ * This shall only be used by driver callbacks under proper page lock,
+ * to get access to the balloon device to which @page belongs.
+ */
+static inline void *__page_balloon_device(struct page *page)
+{
+	struct address_space *mapping = page->mapping;
+	if (mapping)
+		mapping = mapping->assoc_mapping;
+
+	return mapping;
+}
+
+/*
+ * DEFINE_BALLOON_MAPPING_AOPS - declare and instantiate a callback descriptor
+ *				 to be used as balloon page->mapping->a_ops.
+ *
+ * @label     : declaration identifier (var name)
+ * @isolatepg : callback symbol name for performing the page isolation step
+ * @migratepg : callback symbol name for performing the page migration step
+ * @putbackpg : callback symbol name for performing the page putback step
+ *
+ * address_space_operations utilized methods for ballooned pages:
+ *   .migratepage    - used to perform balloon's page migration (as is)
+ *   .invalidatepage - used to isolate a page from balloon's page list
+ *   .freepage       - used to reinsert an isolated page to balloon's page list
+ */
+#define DEFINE_BALLOON_MAPPING_AOPS(label, isolatepg, migratepg, putbackpg) \
+	const struct address_space_operations (label) = {		    \
+		.migratepage    = (migratepg),				    \
+		.invalidatepage = (isolatepg),				    \
+		.freepage       = (putbackpg),				    \
+	}
+
+#else
+#define assign_balloon_mapping(p, m)	do { } while (0)
+#define clear_balloon_mapping(p)	do { } while (0)
+#define free_balloon_mapping(m)		do { } while (0)
+#define count_balloon_event(e)		do { } while (0)
+#define DEFINE_BALLOON_MAPPING_AOPS(label, isolatepg, migratepg, putbackpg) \
+	const struct {} (label) = {}
+
+static inline bool movable_balloon_page(struct page *page) { return false; }
+static inline bool isolate_balloon_page(struct page *page) { return false; }
+static inline void putback_balloon_page(struct page *page) { return; }
+
+static inline int migrate_balloon_page(struct page *newpage,
+				struct page *page, enum migrate_mode mode)
+{
+	return 0;
+}
+
+static inline gfp_t balloon_mapping_gfp_mask(void)
+{
+	return GFP_HIGHUSER;
+}
+
+static inline void *alloc_balloon_mapping(void *balloon_device,
+				const struct address_space_operations *a_ops)
+{
+	return ERR_PTR(-EOPNOTSUPP);
+}
+#endif /* CONFIG_BALLOON_COMPACTION */
+
+#endif /* _LINUX_BALLOON_COMPACTION_H */
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index e42c762..6df0664 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -24,6 +24,7 @@ enum mapping_flags {
 	AS_ENOSPC	= __GFP_BITS_SHIFT + 1,	/* ENOSPC on async write */
 	AS_MM_ALL_LOCKS	= __GFP_BITS_SHIFT + 2,	/* under mm_take_all_locks() */
 	AS_UNEVICTABLE	= __GFP_BITS_SHIFT + 3,	/* e.g., ramdisk, SHM_LOCK */
+	AS_BALLOON_MAP  = __GFP_BITS_SHIFT + 4, /* balloon page special map */
 };
 
 static inline void mapping_set_error(struct address_space *mapping, int error)
@@ -53,6 +54,23 @@ static inline int mapping_unevictable(struct address_space *mapping)
 	return !!mapping;
 }
 
+static inline void mapping_set_balloon(struct address_space *mapping)
+{
+	set_bit(AS_BALLOON_MAP, &mapping->flags);
+}
+
+static inline void mapping_clear_balloon(struct address_space *mapping)
+{
+	clear_bit(AS_BALLOON_MAP, &mapping->flags);
+}
+
+static inline int mapping_balloon(struct address_space *mapping)
+{
+	if (mapping)
+		return test_bit(AS_BALLOON_MAP, &mapping->flags);
+	return !!mapping;
+}
+
 static inline gfp_t mapping_gfp_mask(struct address_space * mapping)
 {
 	return (__force gfp_t)mapping->flags & __GFP_BITS_MASK;
diff --git a/mm/Kconfig b/mm/Kconfig
index d5c8019..0bd783b 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -188,6 +188,21 @@ config SPLIT_PTLOCK_CPUS
 	default "4"
 
 #
+# support for memory balloon compaction
+config BALLOON_COMPACTION
+	bool "Allow for balloon memory compaction/migration"
+	select COMPACTION
+	depends on VIRTIO_BALLOON
+	help
+	  Memory fragmentation introduced by ballooning can significantly
+	  reduce the number of 2MB contiguous memory blocks available within
+	  a guest, thus imposing performance penalties associated with the
+	  reduced number of transparent huge pages that could be used by the
+	  guest workload. Allowing compaction & migration of memory pages
+	  enlisted as part of memory balloon devices avoids the aforementioned
+	  scenario and helps to improve memory defragmentation.
+
+#
 # support for memory compaction
 config COMPACTION
 	bool "Allow for memory compaction"
diff --git a/mm/Makefile b/mm/Makefile
index 92753e2..23e54c5 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -57,3 +57,4 @@ obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o
 obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o
 obj-$(CONFIG_CLEANCACHE) += cleancache.o
 obj-$(CONFIG_MEMORY_ISOLATION) += page_isolation.o
+obj-$(CONFIG_BALLOON_COMPACTION) += balloon_compaction.o
diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
new file mode 100644
index 0000000..74c09ab
--- /dev/null
+++ b/mm/balloon_compaction.c
@@ -0,0 +1,152 @@
+/*
+ * mm/balloon_compaction.c
+ *
+ * Common interface for making balloon pages movable to compaction.
+ *
+ * Copyright (C) 2012, Red Hat, Inc.  Rafael Aquini <aquini@redhat.com>
+ */
+#include <linux/mm.h>
+#include <linux/slab.h>
+#include <linux/export.h>
+#include <linux/balloon_compaction.h>
+
+#ifdef CONFIG_BALLOON_COMPACTION
+/*
+ * alloc_balloon_mapping - allocates a special ->mapping for ballooned pages.
+ * @balloon_device: pointer address that references the balloon device which
+ *                 owns pages bearing this ->mapping.
+ * @a_ops: balloon_mapping address_space_operations descriptor.
+ *
+ * Users must call it to properly allocate and initialize an instance of
+ * struct address_space which will be used as the special page->mapping for
+ * a balloon device's enlisted page instances.
+ */
+struct address_space *alloc_balloon_mapping(void *balloon_device,
+				const struct address_space_operations *a_ops)
+{
+	struct address_space *mapping;
+
+	mapping = kmalloc(sizeof(*mapping), GFP_KERNEL);
+	if (!mapping)
+		return ERR_PTR(-ENOMEM);
+
+	/*
+	 * Give a clean 'zeroed' status to all elements of this special
+	 * balloon page->mapping struct address_space instance.
+	 */
+	address_space_init_once(mapping);
+
+	/*
+	 * Set mapping->flags appropriately, to allow balloon ->mapping
+	 * identification, as well as give a proper hint to the balloon
+	 * driver on what GFP allocation mask shall be used.
+	 */
+	mapping_set_balloon(mapping);
+	mapping_set_gfp_mask(mapping, balloon_mapping_gfp_mask());
+
+	/* balloon's page->mapping->a_ops callback descriptor */
+	mapping->a_ops = a_ops;
+
+	/*
+	 * balloon special page->mapping overloads ->assoc_mapping
+	 * to hold a reference back to the balloon device which 'owns'
+	 * a given page. This is the way we can cope with multiple
+	 * balloon devices without losing reference of several
+	 * ballooned pagesets.
+	 */
+	mapping->assoc_mapping = balloon_device;
+
+	return mapping;
+}
+EXPORT_SYMBOL_GPL(alloc_balloon_mapping);
+
+static inline void __isolate_balloon_page(struct page *page)
+{
+	page->mapping->a_ops->invalidatepage(page, 0);
+}
+
+static inline void __putback_balloon_page(struct page *page)
+{
+	page->mapping->a_ops->freepage(page);
+}
+
+static inline int __migrate_balloon_page(struct address_space *mapping,
+		struct page *newpage, struct page *page, enum migrate_mode mode)
+{
+	return page->mapping->a_ops->migratepage(mapping, newpage, page, mode);
+}
+
+/* __isolate_lru_page() counterpart for a ballooned page */
+bool isolate_balloon_page(struct page *page)
+{
+	if (likely(get_page_unless_zero(page))) {
+		/*
+		 * As balloon pages are not isolated from LRU lists, concurrent
+		 * compaction threads can race against page migration functions
+		 * move_to_new_page() & __unmap_and_move().
+		 * In order to avoid having an already isolated balloon page
+		 * being (wrongly) re-isolated while it is under migration,
+		 * lets be sure we have the page lock before proceeding with
+		 * the balloon page isolation steps.
+		 */
+		if (likely(trylock_page(page))) {
+			/*
+			 * A ballooned page, by default, has just one refcount.
+			 * Prevent concurrent compaction threads from isolating
+			 * an already isolated balloon page by refcount check.
+			 */
+			if (__is_movable_balloon_page(page) &&
+			    page_count(page) == 2) {
+				__isolate_balloon_page(page);
+				unlock_page(page);
+				return true;
+			}
+			unlock_page(page);
+		}
+		put_page(page);
+	}
+	return false;
+}
+
+/* putback_lru_page() counterpart for a ballooned page */
+void putback_balloon_page(struct page *page)
+{
+	/*
+	 * 'lock_page()' stabilizes the page and prevents races against
+	 * concurrent isolation threads attempting to re-isolate it.
+	 */
+	lock_page(page);
+
+	if (__is_movable_balloon_page(page)) {
+		__putback_balloon_page(page);
+		put_page(page);
+	} else {
+		__WARN();
+		dump_page(page);
+	}
+	unlock_page(page);
+}
+
+/* move_to_new_page() counterpart for a ballooned page */
+int migrate_balloon_page(struct page *newpage,
+			 struct page *page, enum migrate_mode mode)
+{
+	struct address_space *mapping;
+	int rc = -EAGAIN;
+
+	BUG_ON(!trylock_page(newpage));
+
+	if (WARN_ON(!__is_movable_balloon_page(page))) {
+		dump_page(page);
+		unlock_page(newpage);
+		return rc;
+	}
+
+	mapping = page->mapping;
+	if (mapping)
+		rc = __migrate_balloon_page(mapping, newpage, page, mode);
+
+	unlock_page(newpage);
+	return rc;
+}
+#endif /* CONFIG_BALLOON_COMPACTION */
-- 
1.7.11.4



* [PATCH v10 2/5] mm: introduce compaction and migration for ballooned pages
  2012-09-17 16:38 [PATCH v10 0/5] make balloon pages movable by compaction Rafael Aquini
  2012-09-17 16:38 ` [PATCH v10 1/5] mm: introduce a common interface for balloon pages mobility Rafael Aquini
@ 2012-09-17 16:38 ` Rafael Aquini
  2012-09-17 16:38 ` [PATCH v10 3/5] virtio_balloon: introduce migration primitives to balloon pages Rafael Aquini
From: Rafael Aquini @ 2012-09-17 16:38 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, virtualization, Rusty Russell, Michael S. Tsirkin,
	Rik van Riel, Mel Gorman, Andi Kleen, Andrew Morton,
	Konrad Rzeszutek Wilk, Minchan Kim, Peter Zijlstra,
	Paul E. McKenney, aquini

Memory fragmentation introduced by ballooning can significantly reduce the
number of 2MB contiguous memory blocks available within a guest, thus imposing
performance penalties associated with the reduced number of transparent huge
pages that could be used by the guest workload.

This patch introduces the helper functions as well as the necessary changes
to teach compaction and migration bits how to cope with pages which are
part of a guest memory balloon, in order to make them movable by memory
compaction procedures.

Signed-off-by: Rafael Aquini <aquini@redhat.com>
---
 mm/compaction.c | 47 ++++++++++++++++++++++++++++-------------------
 mm/migrate.c    | 36 ++++++++++++++++++++++++++++++++++--
 2 files changed, 62 insertions(+), 21 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 7fcd3a5..e50836b 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -14,6 +14,7 @@
 #include <linux/backing-dev.h>
 #include <linux/sysctl.h>
 #include <linux/sysfs.h>
+#include <linux/balloon_compaction.h>
 #include "internal.h"
 
 #if defined CONFIG_COMPACTION || defined CONFIG_CMA
@@ -358,32 +359,40 @@ isolate_migratepages_range(struct zone *zone, struct compact_control *cc,
 			continue;
 		}
 
-		if (!PageLRU(page))
-			continue;
-
 		/*
-		 * PageLRU is set, and lru_lock excludes isolation,
-		 * splitting and collapsing (collapsing has already
-		 * happened if PageLRU is set).
+		 * It is possible to migrate LRU pages and balloon pages.
+		 * Skip any other type of page.
 		 */
-		if (PageTransHuge(page)) {
-			low_pfn += (1 << compound_order(page)) - 1;
-			continue;
-		}
+		if (PageLRU(page)) {
+			/*
+			 * PageLRU is set, and lru_lock excludes isolation,
+			 * splitting and collapsing (collapsing has already
+			 * happened if PageLRU is set).
+			 */
+			if (PageTransHuge(page)) {
+				low_pfn += (1 << compound_order(page)) - 1;
+				continue;
+			}
 
-		if (!cc->sync)
-			mode |= ISOLATE_ASYNC_MIGRATE;
+			if (!cc->sync)
+				mode |= ISOLATE_ASYNC_MIGRATE;
 
-		lruvec = mem_cgroup_page_lruvec(page, zone);
+			lruvec = mem_cgroup_page_lruvec(page, zone);
 
-		/* Try isolate the page */
-		if (__isolate_lru_page(page, mode) != 0)
-			continue;
+			/* Try isolate the page */
+			if (__isolate_lru_page(page, mode) != 0)
+				continue;
 
-		VM_BUG_ON(PageTransCompound(page));
+			VM_BUG_ON(PageTransCompound(page));
+
+			/* Successfully isolated */
+			del_page_from_lru_list(page, lruvec, page_lru(page));
+		} else if (unlikely(movable_balloon_page(page))) {
+			if (!isolate_balloon_page(page))
+				continue;
+		} else
+			continue;
 
-		/* Successfully isolated */
-		del_page_from_lru_list(page, lruvec, page_lru(page));
 		list_add(&page->lru, migratelist);
 		cc->nr_migratepages++;
 		nr_isolated++;
diff --git a/mm/migrate.c b/mm/migrate.c
index 77ed2d7..ec439f8 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -35,6 +35,7 @@
 #include <linux/hugetlb.h>
 #include <linux/hugetlb_cgroup.h>
 #include <linux/gfp.h>
+#include <linux/balloon_compaction.h>
 
 #include <asm/tlbflush.h>
 
@@ -79,7 +80,10 @@ void putback_lru_pages(struct list_head *l)
 		list_del(&page->lru);
 		dec_zone_page_state(page, NR_ISOLATED_ANON +
 				page_is_file_cache(page));
-		putback_lru_page(page);
+		if (unlikely(movable_balloon_page(page)))
+			putback_balloon_page(page);
+		else
+			putback_lru_page(page);
 	}
 }
 
@@ -799,6 +803,18 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 		goto skip_unmap;
 	}
 
+	if (unlikely(movable_balloon_page(page))) {
+		/*
+		 * A ballooned page does not need any special attention from
+		 * physical to virtual reverse mapping procedures.
+		 * Skip any attempt to unmap PTEs or to remap swap cache,
+		 * in order to avoid burning cycles at rmap level, and perform
+		 * the page migration right away (protected by page lock).
+		 */
+		rc = migrate_balloon_page(newpage, page, mode);
+		goto uncharge;
+	}
+
 	/* Establish migration ptes or remove ptes */
 	try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
 
@@ -814,7 +830,8 @@ skip_unmap:
 		put_anon_vma(anon_vma);
 
 uncharge:
-	mem_cgroup_end_migration(mem, page, newpage, rc == 0);
+	mem_cgroup_end_migration(mem, page, newpage,
+				 (rc == 0 || rc == BALLOON_MIGRATION_RETURN));
 unlock:
 	unlock_page(page);
 out:
@@ -846,6 +863,21 @@ static int unmap_and_move(new_page_t get_new_page, unsigned long private,
 			goto out;
 
 	rc = __unmap_and_move(page, newpage, force, offlining, mode);
+
+	if (unlikely(rc == BALLOON_MIGRATION_RETURN)) {
+		/*
+		 * A ballooned page has been migrated already.
+		 * Now it's time to remove the old page from the isolated
+		 * pageset list, hand it back to Buddy, wrap up the counters
+		 * and return.
+		 */
+		dec_zone_page_state(page, NR_ISOLATED_ANON +
+				    page_is_file_cache(page));
+		list_del(&page->lru);
+		put_page(page);
+		__free_page(page);
+		return 0;
+	}
 out:
 	if (rc != -EAGAIN) {
 		/*
-- 
1.7.11.4



* [PATCH v10 3/5] virtio_balloon: introduce migration primitives to balloon pages
  2012-09-17 16:38 [PATCH v10 0/5] make balloon pages movable by compaction Rafael Aquini
  2012-09-17 16:38 ` [PATCH v10 1/5] mm: introduce a common interface for balloon pages mobility Rafael Aquini
  2012-09-17 16:38 ` [PATCH v10 2/5] mm: introduce compaction and migration for ballooned pages Rafael Aquini
@ 2012-09-17 16:38 ` Rafael Aquini
  2012-09-17 22:15   ` Andrew Morton
  2012-09-25  0:40   ` Michael S. Tsirkin
  2012-09-17 16:38 ` [PATCH v10 4/5] mm: introduce putback_movable_pages() Rafael Aquini
From: Rafael Aquini @ 2012-09-17 16:38 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, virtualization, Rusty Russell, Michael S. Tsirkin,
	Rik van Riel, Mel Gorman, Andi Kleen, Andrew Morton,
	Konrad Rzeszutek Wilk, Minchan Kim, Peter Zijlstra,
	Paul E. McKenney, aquini

Memory fragmentation introduced by ballooning can significantly reduce the
number of 2MB contiguous memory blocks available within a guest, thus imposing
performance penalties associated with the reduced number of transparent huge
pages that could be used by the guest workload.

Besides making balloon pages movable at allocation time and introducing
the necessary primitives to perform balloon page migration/compaction,
this patch also introduces the following locking scheme, in order to
enhance the synchronization of accesses to elements of struct
virtio_balloon, thus providing protection against the concurrent access
introduced by parallel memory compaction threads.

 - balloon_lock (mutex) : synchronizes the access demand to elements of
                          struct virtio_balloon and its queue operations;
 - pages_lock (spinlock): special protection for the balloon's page
                          bookkeeping elements (list and atomic counters)
                          against potential memory compaction concurrency;

Signed-off-by: Rafael Aquini <aquini@redhat.com>
---
 drivers/virtio/virtio_balloon.c | 305 +++++++++++++++++++++++++++++++++++++---
 1 file changed, 286 insertions(+), 19 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 0908e60..a52c768 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -27,6 +27,7 @@
 #include <linux/delay.h>
 #include <linux/slab.h>
 #include <linux/module.h>
+#include <linux/balloon_compaction.h>
 
 /*
  * Balloon device works in 4K page units.  So each page is pointed to by
@@ -34,6 +35,7 @@
  * page units.
  */
 #define VIRTIO_BALLOON_PAGES_PER_PAGE (PAGE_SIZE >> VIRTIO_BALLOON_PFN_SHIFT)
+#define VIRTIO_BALLOON_ARRAY_PFNS_MAX 256
 
 struct virtio_balloon
 {
@@ -46,11 +48,24 @@ struct virtio_balloon
 	/* The thread servicing the balloon. */
 	struct task_struct *thread;
 
+	/* balloon special page->mapping */
+	struct address_space *mapping;
+
+	/* Synchronize access/update to this struct virtio_balloon elements */
+	struct mutex balloon_lock;
+
 	/* Waiting for host to ack the pages we released. */
 	wait_queue_head_t acked;
 
+	/* Protect pages list, and pages bookkeeping counters */
+	spinlock_t pages_lock;
+
+	/* Number of balloon pages isolated from 'pages' list for compaction */
+	unsigned int num_isolated_pages;
+
 	/* Number of balloon pages we've told the Host we're not using. */
 	unsigned int num_pages;
+
 	/*
 	 * The pages we've told the Host we're not using.
 	 * Each page on this list adds VIRTIO_BALLOON_PAGES_PER_PAGE
@@ -60,7 +75,7 @@ struct virtio_balloon
 
 	/* The array of pfns we tell the Host about. */
 	unsigned int num_pfns;
-	u32 pfns[256];
+	u32 pfns[VIRTIO_BALLOON_ARRAY_PFNS_MAX];
 
 	/* Memory statistics */
 	int need_stats_update;
@@ -122,13 +137,17 @@ static void set_page_pfns(u32 pfns[], struct page *page)
 
 static void fill_balloon(struct virtio_balloon *vb, size_t num)
 {
+	/* Get the proper GFP alloc mask from vb->mapping flags */
+	gfp_t vb_gfp_mask = mapping_gfp_mask(vb->mapping);
+
 	/* We can only do one array worth at a time. */
 	num = min(num, ARRAY_SIZE(vb->pfns));
 
+	mutex_lock(&vb->balloon_lock);
 	for (vb->num_pfns = 0; vb->num_pfns < num;
 	     vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
-		struct page *page = alloc_page(GFP_HIGHUSER | __GFP_NORETRY |
-					__GFP_NOMEMALLOC | __GFP_NOWARN);
+		struct page *page = alloc_page(vb_gfp_mask | __GFP_NORETRY |
+					       __GFP_NOWARN | __GFP_NOMEMALLOC);
 		if (!page) {
 			if (printk_ratelimit())
 				dev_printk(KERN_INFO, &vb->vdev->dev,
@@ -139,9 +158,15 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num)
 			break;
 		}
 		set_page_pfns(vb->pfns + vb->num_pfns, page);
-		vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
 		totalram_pages--;
+
+		BUG_ON(!trylock_page(page));
+		spin_lock(&vb->pages_lock);
 		list_add(&page->lru, &vb->pages);
+		assign_balloon_mapping(page, vb->mapping);
+		vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
+		spin_unlock(&vb->pages_lock);
+		unlock_page(page);
 	}
 
 	/* Didn't get any?  Oh well. */
@@ -149,6 +174,7 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num)
 		return;
 
 	tell_host(vb, vb->inflate_vq);
+	mutex_unlock(&vb->balloon_lock);
 }
 
 static void release_pages_by_pfn(const u32 pfns[], unsigned int num)
@@ -162,19 +188,64 @@ static void release_pages_by_pfn(const u32 pfns[], unsigned int num)
 	}
 }
 
-static void leak_balloon(struct virtio_balloon *vb, size_t num)
+/*
+ * __leak_balloon - back-end for the balloon pages deflate mechanism.
+ * @vb : pointer to the balloon device descriptor we're leaking (deflating)
+ * @leak_target : how many pages we are looking to drain off balloon's list.
+ *
+ * Here we do all the heavy lifting on behalf of leak_balloon(). This function
+ * must only be called by leak_balloon(), embedded in its wait_event() callsite
+ * and under the following locking scheme:
+ *	mutex(balloon_lock)
+ *	+--spinlock(pages_lock)
+ *	   +--__leak_balloon
+ */
+static int __leak_balloon(struct virtio_balloon *vb, size_t leak_target)
 {
-	struct page *page;
-
 	/* We can only do one array worth at a time. */
-	num = min(num, ARRAY_SIZE(vb->pfns));
+	leak_target = min(leak_target, ARRAY_SIZE(vb->pfns));
 
-	for (vb->num_pfns = 0; vb->num_pfns < num;
+	for (vb->num_pfns = 0; vb->num_pfns < leak_target;
 	     vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
-		page = list_first_entry(&vb->pages, struct page, lru);
+		struct page *page = NULL;
+		/*
+		 * leak_balloon() attempts to release balloon pages in groups
+		 * of (at most) VIRTIO_BALLOON_ARRAY_PFNS_MAX pages at each
+		 * round.
+		 * When compaction isolates pages from the balloon page list,
+		 * we might end up finding fewer pages on the balloon's list
+		 * than our desired 'leak_target'.
+		 * When that happens, however, we shall wrap up the work for
+		 * this round and wait until enough pages get inserted back
+		 * into the balloon's page list before proceeding.
+		 */
+		if (!list_empty(&vb->pages))
+			page = list_first_entry(&vb->pages, struct page, lru);
+
+		if (!page)
+			break;
+
+		/*
+		 * Grab the page lock to avoid racing against threads isolating
+		 * pages from, or migrating pages back to vb->pages list.
+		 * (both tasks are done under page lock protection)
+		 *
+		 * Failing to grab the page lock here means this page is being
+		 * isolated already, or its migration has not finished yet.
+		 *
+		 * We simply cannot afford to keep waiting on the page lock
+		 * here, otherwise we might cause a lock inversion and end up
+		 * deadlocked with threads isolating/migrating pages.
+		 * So we give up this round if we fail to grab the page lock.
+		 */
+		if (!trylock_page(page))
+			break;
+
+		clear_balloon_mapping(page);
 		list_del(&page->lru);
-		set_page_pfns(vb->pfns + vb->num_pfns, page);
 		vb->num_pages -= VIRTIO_BALLOON_PAGES_PER_PAGE;
+		set_page_pfns(vb->pfns + vb->num_pfns, page);
+		unlock_page(page);
 	}
 
 	/*
@@ -182,8 +253,58 @@ static void leak_balloon(struct virtio_balloon *vb, size_t num)
 	 * virtio_has_feature(vdev, VIRTIO_BALLOON_F_MUST_TELL_HOST);
 	 * is true, we *have* to do it in this order
 	 */
-	tell_host(vb, vb->deflate_vq);
-	release_pages_by_pfn(vb->pfns, vb->num_pfns);
+	if (vb->num_pfns > 0) {
+		spin_unlock(&vb->pages_lock);
+		tell_host(vb, vb->deflate_vq);
+		release_pages_by_pfn(vb->pfns, vb->num_pfns);
+		spin_lock(&vb->pages_lock);
+	}
+
+	return vb->num_pfns;
+}
+
+/*
+ * __wait_on_isolated_pages - evaluates the condition that breaks
+ *			      leak_balloon()'s wait_event loop.
+ * @vb : pointer to the balloon device descriptor we're leaking (deflating)
+ * @leak_target : pointer to our leak target.
+ *
+ * This is a helper to leak_balloon()'s wait_event scheme. Here we test the
+ * conditions to break the wait loop, and embed the call to
+ * __leak_balloon() under the proper locking scheme.
+ */
+static inline bool __wait_on_isolated_pages(struct virtio_balloon *vb,
+					    size_t *leak_target)
+{
+	bool cond = false;
+	size_t leaked = 0;
+
+	mutex_lock(&vb->balloon_lock);
+	spin_lock(&vb->pages_lock);
+	if (*leak_target <= (vb->num_pages - vb->num_isolated_pages))
+		leaked = __leak_balloon(vb, *leak_target);
+
+	/* compensate the target with the number of pages leaked this round */
+	*leak_target -= leaked;
+	cond = (*leak_target <= (vb->num_pages - vb->num_isolated_pages));
+	spin_unlock(&vb->pages_lock);
+	mutex_unlock(&vb->balloon_lock);
+	return cond;
+}
+
+/*
+ * leak_balloon - front-end for balloon pages deflate mechanism.
+ * @vb : pointer to the balloon device descriptor we're leaking (deflating)
+ * @leak_target : how many pages we are looking to drain off balloon's list.
+ */
+static void leak_balloon(struct virtio_balloon *vb, size_t leak_target)
+{
+	/* We can only do one array worth at a time. */
+	leak_target = min(leak_target, ARRAY_SIZE(vb->pfns));
+
+	/* Deflate balloon, or wait if there are too many isolated pages */
+	wait_event(vb->config_change,
+		   __wait_on_isolated_pages(vb, &leak_target));
 }
 
 static inline void update_stat(struct virtio_balloon *vb, int idx,
@@ -239,6 +360,7 @@ static void stats_handle_request(struct virtio_balloon *vb)
 	struct scatterlist sg;
 	unsigned int len;
 
+	mutex_lock(&vb->balloon_lock);
 	vb->need_stats_update = 0;
 	update_balloon_stats(vb);
 
@@ -249,6 +371,7 @@ static void stats_handle_request(struct virtio_balloon *vb)
 	if (virtqueue_add_buf(vq, &sg, 1, 0, vb, GFP_KERNEL) < 0)
 		BUG();
 	virtqueue_kick(vq);
+	mutex_unlock(&vb->balloon_lock);
 }
 
 static void virtballoon_changed(struct virtio_device *vdev)
@@ -261,22 +384,29 @@ static void virtballoon_changed(struct virtio_device *vdev)
 static inline s64 towards_target(struct virtio_balloon *vb)
 {
 	__le32 v;
-	s64 target;
+	s64 target, actual;
 
+	spin_lock(&vb->pages_lock);
+	actual = vb->num_pages;
 	vb->vdev->config->get(vb->vdev,
 			      offsetof(struct virtio_balloon_config, num_pages),
 			      &v, sizeof(v));
 	target = le32_to_cpu(v);
-	return target - vb->num_pages;
+	spin_unlock(&vb->pages_lock);
+	return target - actual;
 }
 
 static void update_balloon_size(struct virtio_balloon *vb)
 {
-	__le32 actual = cpu_to_le32(vb->num_pages);
+	__le32 actual;
+
+	spin_lock(&vb->pages_lock);
+	actual = cpu_to_le32(vb->num_pages);
 
 	vb->vdev->config->set(vb->vdev,
 			      offsetof(struct virtio_balloon_config, actual),
 			      &actual, sizeof(actual));
+	spin_unlock(&vb->pages_lock);
 }
 
 static int balloon(void *_vballoon)
@@ -339,9 +469,121 @@ static int init_vqs(struct virtio_balloon *vb)
 	return 0;
 }
 
+#ifdef CONFIG_BALLOON_COMPACTION
+/*
+ * virtballoon_isolatepage - perform the balloon page isolation on behalf of
+ *			     a compaction thread.    (called under page lock)
+ * @page: the page to be isolated from balloon's page list.
+ * @mode: not used for balloon page isolation.
+ *
+ * A memory compaction thread works by isolating pages from private lists,
+ * like LRUs or the balloon's page list (here), to a private pageset that
+ * will be migrated subsequently. After that pageset gets isolated,
+ * compaction relies on page migration procedures to do the heavy lifting.
+ *
+ * This function isolates a page from the balloon private page list.
+ * Called through balloon_mapping->a_ops.
+ */
+void virtballoon_isolatepage(struct page *page, unsigned long mode)
+{
+	struct virtio_balloon *vb = __page_balloon_device(page);
+
+	BUG_ON(!vb);
+
+	spin_lock(&vb->pages_lock);
+	list_del(&page->lru);
+	vb->num_isolated_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
+	spin_unlock(&vb->pages_lock);
+}
+
+/*
+ * virtballoon_migratepage - perform the balloon page migration on behalf of
+ *			     a compaction thread.    (called under page lock)
+ * @mapping: the page->mapping which will be assigned to the new migrated page.
+ * @newpage: page that will replace the isolated page after migration finishes.
+ * @page   : the isolated (old) page that is about to be migrated to newpage.
+ * @mode   : compaction mode -- not used for balloon page migration.
+ *
+ * After a ballooned page gets isolated by compaction procedures, this is the
+ * function that performs the page migration on behalf of a compaction thread.
+ * The page migration for virtio balloon is done in a simple swap fashion which
+ * follows these two macro steps:
+ *  1) insert newpage into vb->pages list and update the host about it;
+ *  2) update the host about the old page removed from vb->pages list;
+ *
+ * This function performs the balloon page migration task.
+ * Called through balloon_mapping->a_ops.
+ */
+int virtballoon_migratepage(struct address_space *mapping,
+		struct page *newpage, struct page *page, enum migrate_mode mode)
+{
+	struct virtio_balloon *vb = __page_balloon_device(page);
+
+	BUG_ON(!vb);
+
+	mutex_lock(&vb->balloon_lock);
+
+	/* balloon's page migration 1st step */
+	vb->num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;
+	spin_lock(&vb->pages_lock);
+	list_add(&newpage->lru, &vb->pages);
+	assign_balloon_mapping(newpage, mapping);
+	vb->num_isolated_pages -= VIRTIO_BALLOON_PAGES_PER_PAGE;
+	spin_unlock(&vb->pages_lock);
+	set_page_pfns(vb->pfns, newpage);
+	tell_host(vb, vb->inflate_vq);
+
+	/* balloon's page migration 2nd step */
+	vb->num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;
+	clear_balloon_mapping(page);
+	set_page_pfns(vb->pfns, page);
+	tell_host(vb, vb->deflate_vq);
+
+	mutex_unlock(&vb->balloon_lock);
+	wake_up(&vb->config_change);
+
+	return BALLOON_MIGRATION_RETURN;
+}
+
+/*
+ * virtballoon_putbackpage - insert an isolated page back into the list it was
+ *			     once taken off by a compaction thread.
+ *			     (called under page lock)
+ * @page: page that will be re-inserted into balloon page list.
+ *
+ * If, for some reason, a compaction thread cannot finish all its work in one
+ * round, and some isolated pages still remain on the compaction thread's
+ * private pageset (waiting for migration), then those pages will get
+ * re-inserted into their balloon private lists before the compaction thread
+ * ends.
+ *
+ * This function inserts an isolated but not migrated balloon page
+ * back into private balloon list.
+ * Called through balloon_mapping->a_ops.
+ */
+void virtballoon_putbackpage(struct page *page)
+{
+	struct virtio_balloon *vb = __page_balloon_device(page);
+
+	BUG_ON(!vb);
+
+	spin_lock(&vb->pages_lock);
+	list_add(&page->lru, &vb->pages);
+	vb->num_isolated_pages -= VIRTIO_BALLOON_PAGES_PER_PAGE;
+	spin_unlock(&vb->pages_lock);
+	wake_up(&vb->config_change);
+}
+#endif /* CONFIG_BALLOON_COMPACTION */
+
+/* define the balloon_mapping->a_ops callbacks to allow compaction/migration */
+static DEFINE_BALLOON_MAPPING_AOPS(virtio_balloon_aops,
+				   virtballoon_isolatepage,
+				   virtballoon_migratepage,
+				   virtballoon_putbackpage);
+
 static int virtballoon_probe(struct virtio_device *vdev)
 {
 	struct virtio_balloon *vb;
+	struct address_space *vb_mapping;
 	int err;
 
 	vdev->priv = vb = kmalloc(sizeof(*vb), GFP_KERNEL);
@@ -351,12 +593,25 @@ static int virtballoon_probe(struct virtio_device *vdev)
 	}
 
 	INIT_LIST_HEAD(&vb->pages);
+	mutex_init(&vb->balloon_lock);
+	spin_lock_init(&vb->pages_lock);
+
 	vb->num_pages = 0;
+	vb->num_isolated_pages = 0;
 	init_waitqueue_head(&vb->config_change);
 	init_waitqueue_head(&vb->acked);
 	vb->vdev = vdev;
 	vb->need_stats_update = 0;
 
+	vb_mapping = alloc_balloon_mapping(vb, &virtio_balloon_aops);
+	if (IS_ERR(vb_mapping)) {
+		err = PTR_ERR(vb_mapping);
+		if (err != -EOPNOTSUPP)
+			goto out_free_vb;
+	}
+
+	vb->mapping = vb_mapping;
+
 	err = init_vqs(vb);
 	if (err)
 		goto out_free_vb;
@@ -372,16 +627,28 @@ static int virtballoon_probe(struct virtio_device *vdev)
 out_del_vqs:
 	vdev->config->del_vqs(vdev);
 out_free_vb:
+	free_balloon_mapping(vb_mapping);
 	kfree(vb);
 out:
 	return err;
 }
 
+static inline size_t __get_num_pages(struct virtio_balloon *vb)
+{
+	size_t num_pages;
+	spin_lock(&vb->pages_lock);
+	num_pages = vb->num_pages;
+	spin_unlock(&vb->pages_lock);
+	return num_pages;
+}
+
 static void remove_common(struct virtio_balloon *vb)
 {
+	size_t num_pages;
 	/* There might be pages left in the balloon: free them. */
-	while (vb->num_pages)
-		leak_balloon(vb, vb->num_pages);
+	while ((num_pages = __get_num_pages(vb)) > 0)
+		leak_balloon(vb, num_pages);
+
 	update_balloon_size(vb);
 
 	/* Now we reset the device so we can clean up the queues. */
@@ -396,6 +663,7 @@ static void __devexit virtballoon_remove(struct virtio_device *vdev)
 
 	kthread_stop(vb->thread);
 	remove_common(vb);
+	free_balloon_mapping(vb->mapping);
 	kfree(vb);
 }
 
@@ -408,7 +676,6 @@ static int virtballoon_freeze(struct virtio_device *vdev)
 	 * The kthread is already frozen by the PM core before this
 	 * function is called.
 	 */
-
 	remove_common(vb);
 	return 0;
 }
-- 
1.7.11.4


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v10 4/5] mm: introduce putback_movable_pages()
  2012-09-17 16:38 [PATCH v10 0/5] make balloon pages movable by compaction Rafael Aquini
                   ` (2 preceding siblings ...)
  2012-09-17 16:38 ` [PATCH v10 3/5] virtio_balloon: introduce migration primitives to balloon pages Rafael Aquini
@ 2012-09-17 16:38 ` Rafael Aquini
  2012-09-17 16:38 ` [PATCH v10 5/5] mm: add vm event counters for balloon pages compaction Rafael Aquini
  2012-09-17 22:15 ` [PATCH v10 0/5] make balloon pages movable by compaction Andrew Morton
  5 siblings, 0 replies; 20+ messages in thread
From: Rafael Aquini @ 2012-09-17 16:38 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, virtualization, Rusty Russell, Michael S. Tsirkin,
	Rik van Riel, Mel Gorman, Andi Kleen, Andrew Morton,
	Konrad Rzeszutek Wilk, Minchan Kim, Peter Zijlstra,
	Paul E. McKenney, aquini

The PATCH "mm: introduce compaction and migration for virtio ballooned pages"
hacks around putback_lru_pages() in order to allow ballooned pages to be
re-inserted into the balloon page list as if a ballooned page were an LRU page.

As ballooned pages are not legitimate LRU pages, this patch introduces
putback_movable_pages() to properly cope with cases where the isolated
pageset contains ballooned pages and LRU pages, thus fixing the mentioned
inelegant hack around putback_lru_pages().

Signed-off-by: Rafael Aquini <aquini@redhat.com>
---
 include/linux/migrate.h |  2 ++
 mm/compaction.c         |  4 ++--
 mm/migrate.c            | 20 ++++++++++++++++++++
 mm/page_alloc.c         |  2 +-
 4 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index ce7e667..ff103a1 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -10,6 +10,7 @@ typedef struct page *new_page_t(struct page *, unsigned long private, int **);
 #ifdef CONFIG_MIGRATION
 
 extern void putback_lru_pages(struct list_head *l);
+extern void putback_movable_pages(struct list_head *l);
 extern int migrate_page(struct address_space *,
 			struct page *, struct page *, enum migrate_mode);
 extern int migrate_pages(struct list_head *l, new_page_t x,
@@ -33,6 +34,7 @@ extern int migrate_huge_page_move_mapping(struct address_space *mapping,
 #else
 
 static inline void putback_lru_pages(struct list_head *l) {}
+static inline void putback_movable_pages(struct list_head *l) {}
 static inline int migrate_pages(struct list_head *l, new_page_t x,
 		unsigned long private, bool offlining,
 		enum migrate_mode mode) { return -ENOSYS; }
diff --git a/mm/compaction.c b/mm/compaction.c
index e50836b..409b2f5 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -817,9 +817,9 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
 		trace_mm_compaction_migratepages(nr_migrate - nr_remaining,
 						nr_remaining);
 
-		/* Release LRU pages not migrated */
+		/* Release isolated pages not migrated */
 		if (err) {
-			putback_lru_pages(&cc->migratepages);
+			putback_movable_pages(&cc->migratepages);
 			cc->nr_migratepages = 0;
 			if (err == -ENOMEM) {
 				ret = COMPACT_PARTIAL;
diff --git a/mm/migrate.c b/mm/migrate.c
index ec439f8..e47daf5 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -80,6 +80,26 @@ void putback_lru_pages(struct list_head *l)
 		list_del(&page->lru);
 		dec_zone_page_state(page, NR_ISOLATED_ANON +
 				page_is_file_cache(page));
+		putback_lru_page(page);
+	}
+}
+
+/*
+ * Put previously isolated pages back onto the appropriate lists
+ * from where they were once taken off for compaction/migration.
+ *
+ * This function shall be used instead of putback_lru_pages(),
+ * whenever the isolated pageset has been built by isolate_migratepages_range()
+ */
+void putback_movable_pages(struct list_head *l)
+{
+	struct page *page;
+	struct page *page2;
+
+	list_for_each_entry_safe(page, page2, l, lru) {
+		list_del(&page->lru);
+		dec_zone_page_state(page, NR_ISOLATED_ANON +
+				page_is_file_cache(page));
 		if (unlikely(movable_balloon_page(page)))
 			putback_balloon_page(page);
 		else
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c66fb87..a0c2cc5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5675,7 +5675,7 @@ static int __alloc_contig_migrate_range(unsigned long start, unsigned long end)
 				    0, false, MIGRATE_SYNC);
 	}
 
-	putback_lru_pages(&cc.migratepages);
+	putback_movable_pages(&cc.migratepages);
 	return ret > 0 ? 0 : ret;
 }
 
-- 
1.7.11.4


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v10 5/5] mm: add vm event counters for balloon pages compaction
  2012-09-17 16:38 [PATCH v10 0/5] make balloon pages movable by compaction Rafael Aquini
                   ` (3 preceding siblings ...)
  2012-09-17 16:38 ` [PATCH v10 4/5] mm: introduce putback_movable_pages() Rafael Aquini
@ 2012-09-17 16:38 ` Rafael Aquini
  2012-09-17 22:15 ` [PATCH v10 0/5] make balloon pages movable by compaction Andrew Morton
  5 siblings, 0 replies; 20+ messages in thread
From: Rafael Aquini @ 2012-09-17 16:38 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, virtualization, Rusty Russell, Michael S. Tsirkin,
	Rik van Riel, Mel Gorman, Andi Kleen, Andrew Morton,
	Konrad Rzeszutek Wilk, Minchan Kim, Peter Zijlstra,
	Paul E. McKenney, aquini

This patch introduces a new set of vm event counters to keep track of
balloon page compaction activity.

Signed-off-by: Rafael Aquini <aquini@redhat.com>
---
 drivers/virtio/virtio_balloon.c |  1 +
 include/linux/vm_event_item.h   |  8 +++++++-
 mm/balloon_compaction.c         |  2 ++
 mm/migrate.c                    |  1 +
 mm/vmstat.c                     | 10 +++++++++-
 5 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index a52c768..c57d1ed 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -541,6 +541,7 @@ int virtballoon_migratepage(struct address_space *mapping,
 
 	mutex_unlock(&vb->balloon_lock);
 	wake_up(&vb->config_change);
+	count_balloon_event(COMPACTBALLOONMIGRATED);
 
 	return BALLOON_MIGRATION_RETURN;
 }
diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index 57f7b10..13573fe 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -41,7 +41,13 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
 #ifdef CONFIG_COMPACTION
 		COMPACTBLOCKS, COMPACTPAGES, COMPACTPAGEFAILED,
 		COMPACTSTALL, COMPACTFAIL, COMPACTSUCCESS,
-#endif
+#ifdef CONFIG_BALLOON_COMPACTION
+		COMPACTBALLOONISOLATED, /* isolated from balloon pagelist */
+		COMPACTBALLOONMIGRATED, /* balloon page successfully migrated */
+		COMPACTBALLOONRELEASED, /* old-page released after migration */
+		COMPACTBALLOONRETURNED, /* putback to pagelist, not-migrated */
+#endif /* CONFIG_BALLOON_COMPACTION */
+#endif /* CONFIG_COMPACTION */
 #ifdef CONFIG_HUGETLB_PAGE
 		HTLB_BUDDY_PGALLOC, HTLB_BUDDY_PGALLOC_FAIL,
 #endif
diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
index 74c09ab..873af56 100644
--- a/mm/balloon_compaction.c
+++ b/mm/balloon_compaction.c
@@ -98,6 +98,7 @@ bool isolate_balloon_page(struct page *page)
 			if (__is_movable_balloon_page(page) &&
 			    page_count(page) == 2) {
 				__isolate_balloon_page(page);
+				count_balloon_event(COMPACTBALLOONISOLATED);
 				unlock_page(page);
 				return true;
 			}
@@ -120,6 +121,7 @@ void putback_balloon_page(struct page *page)
 	if (__is_movable_balloon_page(page)) {
 		__putback_balloon_page(page);
 		put_page(page);
+		count_balloon_event(COMPACTBALLOONRETURNED);
 	} else {
 		__WARN();
 		dump_page(page);
diff --git a/mm/migrate.c b/mm/migrate.c
index e47daf5..124b16b 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -896,6 +896,7 @@ static int unmap_and_move(new_page_t get_new_page, unsigned long private,
 		list_del(&page->lru);
 		put_page(page);
 		__free_page(page);
+		count_balloon_event(COMPACTBALLOONRELEASED);
 		return 0;
 	}
 out:
diff --git a/mm/vmstat.c b/mm/vmstat.c
index df7a674..5824ad2 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -768,7 +768,15 @@ const char * const vmstat_text[] = {
 	"compact_stall",
 	"compact_fail",
 	"compact_success",
-#endif
+
+#ifdef CONFIG_BALLOON_COMPACTION
+	"compact_balloon_isolated",
+	"compact_balloon_migrated",
+	"compact_balloon_released",
+	"compact_balloon_returned",
+#endif /* CONFIG_BALLOON_COMPACTION */
+
+#endif /* CONFIG_COMPACTION */
 
 #ifdef CONFIG_HUGETLB_PAGE
 	"htlb_buddy_alloc_success",
-- 
1.7.11.4


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v10 0/5] make balloon pages movable by compaction
  2012-09-17 16:38 [PATCH v10 0/5] make balloon pages movable by compaction Rafael Aquini
                   ` (4 preceding siblings ...)
  2012-09-17 16:38 ` [PATCH v10 5/5] mm: add vm event counters for balloon pages compaction Rafael Aquini
@ 2012-09-17 22:15 ` Andrew Morton
  2012-09-17 22:45   ` Rik van Riel
                     ` (2 more replies)
  5 siblings, 3 replies; 20+ messages in thread
From: Andrew Morton @ 2012-09-17 22:15 UTC (permalink / raw)
  To: Rafael Aquini
  Cc: linux-mm, linux-kernel, virtualization, Rusty Russell,
	Michael S. Tsirkin, Rik van Riel, Mel Gorman, Andi Kleen,
	Konrad Rzeszutek Wilk, Minchan Kim, Peter Zijlstra,
	Paul E. McKenney

On Mon, 17 Sep 2012 13:38:15 -0300
Rafael Aquini <aquini@redhat.com> wrote:

> Memory fragmentation introduced by ballooning might reduce significantly
> the number of 2MB contiguous memory blocks that can be used within a guest,
> thus imposing performance penalties associated with the reduced number of
> transparent huge pages that could be used by the guest workload.
> 
> This patch-set follows the main idea discussed at 2012 LSFMMS session:
> "Ballooning for transparent huge pages" -- http://lwn.net/Articles/490114/
> to introduce the required changes to the virtio_balloon driver, as well as
> the changes to the core compaction & migration bits, in order to make those
> subsystems aware of ballooned pages and allow memory balloon pages to become
> movable within a guest, thus avoiding the aforementioned fragmentation issue.
> 
> Following are numbers that prove this patch benefits on allowing compaction
> to be more effective at memory ballooned guests.
> 
> Results for STRESS-HIGHALLOC benchmark, from Mel Gorman's mmtests suite,
> running on a 4gB RAM KVM guest which was ballooning 1gB RAM in 256mB chunks,
> at every minute (inflating/deflating), while test was running:

How can a patchset reach v10 and have zero Reviewed-by's?

The patchset looks reasonable to me and your empirical results look
good.  But I don't feel that I'm in a position to decide on its overall
desirability, either in a standalone sense or in comparison to any
alternative schemes which anyone has proposed.

IOW, Rusty and KVM folks: please consider thyself poked.

I looked through the code and have some comments which are minor in the
overall scheme of things.  I'll be more comfortable when a compaction
expert has had a go over it.  IOW, Mel joins the pokee list ;)

(The question of "overall desirability" is the big one here.  Do we
actually want to add this to Linux?  The rest is details which we can
work out).



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v10 1/5] mm: introduce a common interface for balloon pages mobility
  2012-09-17 16:38 ` [PATCH v10 1/5] mm: introduce a common interface for balloon pages mobility Rafael Aquini
@ 2012-09-17 22:15   ` Andrew Morton
  2012-09-18 16:24     ` Rafael Aquini
  2012-09-24 12:44   ` Peter Zijlstra
  1 sibling, 1 reply; 20+ messages in thread
From: Andrew Morton @ 2012-09-17 22:15 UTC (permalink / raw)
  To: Rafael Aquini
  Cc: linux-mm, linux-kernel, virtualization, Rusty Russell,
	Michael S. Tsirkin, Rik van Riel, Mel Gorman, Andi Kleen,
	Konrad Rzeszutek Wilk, Minchan Kim, Peter Zijlstra,
	Paul E. McKenney

On Mon, 17 Sep 2012 13:38:16 -0300
Rafael Aquini <aquini@redhat.com> wrote:

> Memory fragmentation introduced by ballooning might reduce significantly
> the number of 2MB contiguous memory blocks that can be used within a guest,
> thus imposing performance penalties associated with the reduced number of
> transparent huge pages that could be used by the guest workload.
> 
> This patch introduces a common interface to help a balloon driver on
> making its page set movable to compaction, and thus allowing the system
> to better leverage the compaction efforts on memory defragmentation.
> 
>
> ...
>
> +#ifndef _LINUX_BALLOON_COMPACTION_H
> +#define _LINUX_BALLOON_COMPACTION_H
> +
> +#include <linux/rcupdate.h>
> +#include <linux/pagemap.h>
> +#include <linux/gfp.h>
> +#include <linux/err.h>
> +
> +/* return code to identify when a ballooned page has been migrated */
> +#define BALLOON_MIGRATION_RETURN	0xba1100

I didn't really spend enough time to work out why this was done this
way, but I know a hack when I see one!

We forgot to document the a_ops.migratepage() return value.  Perhaps
it's time to work out what it should be.

> +#ifdef CONFIG_BALLOON_COMPACTION
> +#define count_balloon_event(e)		count_vm_event(e)
> +#define free_balloon_mapping(m)		kfree(m)

It would be better to write these in C please.  That way we get
typechecking, even when CONFIG_BALLOON_COMPACTION=n.

> +extern bool isolate_balloon_page(struct page *);
> +extern void putback_balloon_page(struct page *);
> +extern int migrate_balloon_page(struct page *newpage,
> +				struct page *page, enum migrate_mode mode);
> +extern struct address_space *alloc_balloon_mapping(void *balloon_device,
> +				const struct address_space_operations *a_ops);

There's a useful convention that interface identifiers are prefixed by
their interface's name.  IOW, everything in this file would start with
"balloon_".  balloon_page_isolate, balloon_page_putback, etc.  I think
we could follow that convention here?

> +static inline void assign_balloon_mapping(struct page *page,
> +					  struct address_space *mapping)
> +{
> +	page->mapping = mapping;
> +	smp_wmb();
> +}
> +
> +static inline void clear_balloon_mapping(struct page *page)
> +{
> +	page->mapping = NULL;
> +	smp_wmb();
> +}
> +
> +static inline gfp_t balloon_mapping_gfp_mask(void)
> +{
> +	return GFP_HIGHUSER_MOVABLE;
> +}
> +
> +static inline bool __is_movable_balloon_page(struct page *page)
> +{
> +	struct address_space *mapping = ACCESS_ONCE(page->mapping);
> +	smp_read_barrier_depends();
> +	return mapping_balloon(mapping);
> +}

hm.  Are these barrier tricks copied from somewhere else, or home-made?

> +/*
> + * movable_balloon_page - test page->mapping->flags to identify balloon pages
> + *			  that can be moved by compaction/migration.
> + *
> + * This function is used at core compaction's page isolation scheme and so it's
> + * exposed to several system pages which may, or may not, be part of a memory
> + * balloon, and thus we cannot afford to hold a page locked to perform tests.

I don't understand this.  What is a "system page"?  If I knew that, I
migth perhaps understand why we cannot lock such a page.

> + * Therefore, as we might return false positives in the case a balloon page
> + * is just released under us, the page->mapping->flags need to be retested
> + * with the proper page lock held, on the functions that will cope with the
> + * balloon page later.
> + */
> +static inline bool movable_balloon_page(struct page *page)
> +{
> +	/*
> +	 * Before dereferencing and testing mapping->flags, lets make sure
> +	 * this is not a page that uses ->mapping in a different way
> +	 */
> +	if (!PageSlab(page) && !PageSwapCache(page) && !PageAnon(page) &&
> +	    !page_mapped(page))
> +		return __is_movable_balloon_page(page);
> +
> +	return false;
> +}
> +
> +/*
> + * __page_balloon_device - get the balloon device that owns the given page.
> + *
> + * This shall only be used at driver callbacks under proper page lock,
> + * to get access to the balloon device which @page belongs.
> + */
> +static inline void *__page_balloon_device(struct page *page)
> +{
> +	struct address_space *mapping = page->mapping;
> +	if (mapping)
> +		mapping = mapping->assoc_mapping;
> +
> +	return mapping;
> +}

So you've repurposed address_space.assoc_mapping in new and unexpected
ways.

I don't immediately see a problem with doing this, but we should do it
properly.  Something like:

- rename address_space.assoc_mapping to private_data
- it has type void*
- document its ownership rules
- convert fs/buffer.c

all done as a standalone preparatory patch.

Also, your usage of ->private_data should minimise its use of void* -
use more specific types wherever possible.  So this function should
return a "struct virtio_balloon *".

It is unobvious why this interface function is prefixed with __.

> +/*
> + * DEFINE_BALLOON_MAPPING_AOPS - declare and instantiate a callback descriptor
> + *				 to be used as balloon page->mapping->a_ops.
> + *
> + * @label     : declaration identifier (var name)
> + * @isolatepg : callback symbol name for performing the page isolation step
> + * @migratepg : callback symbol name for performing the page migration step
> + * @putbackpg : callback symbol name for performing the page putback step
> + *
> + * address_space_operations utilized methods for ballooned pages:
> + *   .migratepage    - used to perform balloon's page migration (as is)
> + *   .invalidatepage - used to isolate a page from balloon's page list
> + *   .freepage       - used to reinsert an isolated page to balloon's page list
> + */
> +#define DEFINE_BALLOON_MAPPING_AOPS(label, isolatepg, migratepg, putbackpg) \
> +	const struct address_space_operations (label) = {		    \
> +		.migratepage    = (migratepg),				    \
> +		.invalidatepage = (isolatepg),				    \
> +		.freepage       = (putbackpg),				    \
> +	}

erp.  Can we avoid doing this?  afaict it would be pretty simple to
avoid instantiating virtio_balloon_aops at all if
CONFIG_BALLOON_COMPACTION=n?

> +#else
> +#define assign_balloon_mapping(p, m)	do { } while (0)
> +#define clear_balloon_mapping(p)	do { } while (0)
> +#define free_balloon_mapping(m)		do { } while (0)
> +#define count_balloon_event(e)		do { } while (0)

Written in C with proper types if possible, please.

> +#define DEFINE_BALLOON_MAPPING_AOPS(label, isolatepg, migratepg, putbackpg) \
> +	const struct {} (label) = {}
> +
> +static inline bool movable_balloon_page(struct page *page) { return false; }
> +static inline bool isolate_balloon_page(struct page *page) { return false; }
> +static inline void putback_balloon_page(struct page *page) { return; }
> +
> +static inline int migrate_balloon_page(struct page *newpage,
> +				struct page *page, enum migrate_mode mode)
> +{
> +	return 0;
> +}
> +
>
> ...
>
> @@ -53,6 +54,23 @@ static inline int mapping_unevictable(struct address_space *mapping)
>  	return !!mapping;
>  }
>  
> +static inline void mapping_set_balloon(struct address_space *mapping)
> +{
> +	set_bit(AS_BALLOON_MAP, &mapping->flags);
> +}
> +
> +static inline void mapping_clear_balloon(struct address_space *mapping)
> +{
> +	clear_bit(AS_BALLOON_MAP, &mapping->flags);
> +}
> +
> +static inline int mapping_balloon(struct address_space *mapping)
> +{
> +	if (mapping)
> +		return test_bit(AS_BALLOON_MAP, &mapping->flags);
> +	return !!mapping;

Why not "return 0"?

Or

	return mapping && test_bit(AS_BALLOON_MAP, &mapping->flags);

> +}
> +
>
> ...
>
> +struct address_space *alloc_balloon_mapping(void *balloon_device,
> +				const struct address_space_operations *a_ops)
> +{
> +	struct address_space *mapping;
> +
> +	mapping = kmalloc(sizeof(*mapping), GFP_KERNEL);
> +	if (!mapping)
> +		return ERR_PTR(-ENOMEM);
> +
> +	/*
> +	 * Give a clean 'zeroed' status to all elements of this special
> +	 * balloon page->mapping struct address_space instance.
> +	 */
> +	address_space_init_once(mapping);
> +
> +	/*
> +	 * Set mapping->flags appropriately, to allow balloon ->mapping
> +	 * identification, as well as give a proper hint to the balloon
> +	 * driver on what GFP allocation mask shall be used.
> +	 */
> +	mapping_set_balloon(mapping);
> +	mapping_set_gfp_mask(mapping, balloon_mapping_gfp_mask());
> +
> +	/* balloon's page->mapping->a_ops callback descriptor */
> +	mapping->a_ops = a_ops;
> +
> +	/*
> +	 * balloon special page->mapping overloads ->assoc_mapping
> +	 * to hold a reference back to the balloon device which 'owns'
> +	 * a given page. This is the way we can cope with multiple
> +	 * balloon devices without losing reference of several
> +	 * ballooned pagesets.

I don't really understand the final part of this comment.  Can you
expand more fully on the problem which this code is solving?

> +	 */
> +	mapping->assoc_mapping = balloon_device;
> +
> +	return mapping;
> +}
> +EXPORT_SYMBOL_GPL(alloc_balloon_mapping);

balloon_mapping_alloc() :)

> +static inline void __isolate_balloon_page(struct page *page)
> +{
> +	page->mapping->a_ops->invalidatepage(page, 0);
> +}
> +
>
> ...
>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v10 3/5] virtio_balloon: introduce migration primitives to balloon pages
  2012-09-17 16:38 ` [PATCH v10 3/5] virtio_balloon: introduce migration primitives to balloon pages Rafael Aquini
@ 2012-09-17 22:15   ` Andrew Morton
  2012-09-18 14:07     ` Rafael Aquini
  2012-09-25  0:40   ` Michael S. Tsirkin
  1 sibling, 1 reply; 20+ messages in thread
From: Andrew Morton @ 2012-09-17 22:15 UTC (permalink / raw)
  To: Rafael Aquini
  Cc: linux-mm, linux-kernel, virtualization, Rusty Russell,
	Michael S. Tsirkin, Rik van Riel, Mel Gorman, Andi Kleen,
	Konrad Rzeszutek Wilk, Minchan Kim, Peter Zijlstra,
	Paul E. McKenney

On Mon, 17 Sep 2012 13:38:18 -0300
Rafael Aquini <aquini@redhat.com> wrote:

> Memory fragmentation introduced by ballooning might reduce significantly
> the number of 2MB contiguous memory blocks that can be used within a guest,
> thus imposing performance penalties associated with the reduced number of
> transparent huge pages that could be used by the guest workload.
> 
> Besides making balloon pages movable at allocation time and introducing
> the necessary primitives to perform balloon page migration/compaction,
> this patch also introduces the following locking scheme, in order to
> enhance the synchronization methods for accessing elements of struct
> virtio_balloon, thus providing protection against concurrent access
> introduced by parallel memory compaction threads.
> 
>  - balloon_lock (mutex) : synchronizes the access demand to elements of
>                           struct virtio_balloon and its queue operations;
>  - pages_lock (spinlock): special protection to balloon's pages bookkeeping
>                           elements (list and atomic counters) against the
>                           potential memory compaction concurrency;
> 
>
> ...
>
>  struct virtio_balloon
>  {
> @@ -46,11 +48,24 @@ struct virtio_balloon
>  	/* The thread servicing the balloon. */
>  	struct task_struct *thread;
>  
> +	/* balloon special page->mapping */
> +	struct address_space *mapping;
> +
> +	/* Synchronize access/update to this struct virtio_balloon elements */
> +	struct mutex balloon_lock;
> +
>  	/* Waiting for host to ack the pages we released. */
>  	wait_queue_head_t acked;
>  
> +	/* Protect pages list, and pages bookkeeping counters */
> +	spinlock_t pages_lock;
> +
> +	/* Number of balloon pages isolated from 'pages' list for compaction */
> +	unsigned int num_isolated_pages;

Is it utterly inconceivable that this counter could exceed 4G, ever?

>  	/* Number of balloon pages we've told the Host we're not using. */
>  	unsigned int num_pages;
> +
>  	/*
>  	 * The pages we've told the Host we're not using.
>  	 * Each page on this list adds VIRTIO_BALLOON_PAGES_PER_PAGE
> @@ -60,7 +75,7 @@ struct virtio_balloon
>  
>  	/* The array of pfns we tell the Host about. */
>  	unsigned int num_pfns;
> -	u32 pfns[256];
> +	u32 pfns[VIRTIO_BALLOON_ARRAY_PFNS_MAX];
>  
>  	/* Memory statistics */
>  	int need_stats_update;
> @@ -122,13 +137,17 @@ static void set_page_pfns(u32 pfns[], struct page *page)
>  
>  static void fill_balloon(struct virtio_balloon *vb, size_t num)
>  {
> +	/* Get the proper GFP alloc mask from vb->mapping flags */
> +	gfp_t vb_gfp_mask = mapping_gfp_mask(vb->mapping);
> +
>  	/* We can only do one array worth at a time. */
>  	num = min(num, ARRAY_SIZE(vb->pfns));
>  
> +	mutex_lock(&vb->balloon_lock);
>  	for (vb->num_pfns = 0; vb->num_pfns < num;
>  	     vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
> -		struct page *page = alloc_page(GFP_HIGHUSER | __GFP_NORETRY |
> -					__GFP_NOMEMALLOC | __GFP_NOWARN);
> +		struct page *page = alloc_page(vb_gfp_mask | __GFP_NORETRY |
> +					       __GFP_NOWARN | __GFP_NOMEMALLOC);

That looks like an allocation which could easily fail.

>  		if (!page) {
>  			if (printk_ratelimit())
>  				dev_printk(KERN_INFO, &vb->vdev->dev,

Strangely, we suppressed the core page allocator's warning and
substituted this less useful one.

Also, it would be nice if someone could get that printk_ratelimit() out
of there, for reasons described at the printk_ratelimit() definition
site.

> @@ -139,9 +158,15 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num)
>  			break;
>  		}
>  		set_page_pfns(vb->pfns + vb->num_pfns, page);
> -		vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
>  		totalram_pages--;
> +
> +		BUG_ON(!trylock_page(page));
> +		spin_lock(&vb->pages_lock);
>  		list_add(&page->lru, &vb->pages);
> +		assign_balloon_mapping(page, vb->mapping);
> +		vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
> +		spin_unlock(&vb->pages_lock);
> +		unlock_page(page);
>  	}
>
> ...
>



* Re: [PATCH v10 0/5] make balloon pages movable by compaction
  2012-09-17 22:15 ` [PATCH v10 0/5] make balloon pages movable by compaction Andrew Morton
@ 2012-09-17 22:45   ` Rik van Riel
  2012-09-18  0:45   ` Rusty Russell
  2012-09-25  1:17   ` Michael S. Tsirkin
  2 siblings, 0 replies; 20+ messages in thread
From: Rik van Riel @ 2012-09-17 22:45 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rafael Aquini, linux-mm, linux-kernel, virtualization,
	Rusty Russell, Michael S. Tsirkin, Mel Gorman, Andi Kleen,
	Konrad Rzeszutek Wilk, Minchan Kim, Peter Zijlstra,
	Paul E. McKenney

On 09/17/2012 06:15 PM, Andrew Morton wrote:
> On Mon, 17 Sep 2012 13:38:15 -0300
> Rafael Aquini <aquini@redhat.com> wrote:
>
>> Memory fragmentation introduced by ballooning might reduce significantly
>> the number of 2MB contiguous memory blocks that can be used within a guest,
>> thus imposing performance penalties associated with the reduced number of
>> transparent huge pages that could be used by the guest workload.
>>
>> This patch-set follows the main idea discussed at 2012 LSFMMS session:
>> "Ballooning for transparent huge pages" -- http://lwn.net/Articles/490114/
>> to introduce the required changes to the virtio_balloon driver, as well as
>> the changes to the core compaction & migration bits, in order to make those
>> subsystems aware of ballooned pages and allow memory balloon pages become
>> movable within a guest, thus avoiding the aforementioned fragmentation issue
>>
>> Following are numbers that prove this patch benefits on allowing compaction
>> to be more effective at memory ballooned guests.
>>
>> Results for STRESS-HIGHALLOC benchmark, from Mel Gorman's mmtests suite,
>> running on a 4gB RAM KVM guest which was ballooning 1gB RAM in 256mB chunks,
>> at every minute (inflating/deflating), while test was running:
>
> How can a patchset reach v10 and have zero Reviewed-by's?

Because people kept finding issues and nitpicks in patch 1/5,
which kept people from putting their Reviewed-by's on the other
patches :)

> (The question of "overall desirability" is the big one here.  Do we
> actually want to add this to Linux?  The rest is details which we can
> work out).

I believe we absolutely want this, to increase the likelihood
of being able to use THP in KVM guests, which is exactly
where THP gives the largest performance benefit.

-- 
All rights reversed


* Re: [PATCH v10 0/5] make balloon pages movable by compaction
  2012-09-17 22:15 ` [PATCH v10 0/5] make balloon pages movable by compaction Andrew Morton
  2012-09-17 22:45   ` Rik van Riel
@ 2012-09-18  0:45   ` Rusty Russell
  2012-09-25  1:17   ` Michael S. Tsirkin
  2 siblings, 0 replies; 20+ messages in thread
From: Rusty Russell @ 2012-09-18  0:45 UTC (permalink / raw)
  To: Andrew Morton, Rafael Aquini
  Cc: linux-mm, linux-kernel, virtualization, Michael S. Tsirkin,
	Rik van Riel, Mel Gorman, Andi Kleen, Konrad Rzeszutek Wilk,
	Minchan Kim, Peter Zijlstra, Paul E. McKenney

Andrew Morton <akpm@linux-foundation.org> writes:
> On Mon, 17 Sep 2012 13:38:15 -0300
> Rafael Aquini <aquini@redhat.com> wrote:
>> Results for STRESS-HIGHALLOC benchmark, from Mel Gorman's mmtests suite,
>> running on a 4gB RAM KVM guest which was ballooning 1gB RAM in 256mB chunks,
>> at every minute (inflating/deflating), while test was running:
>
> How can a patchset reach v10 and have zero Reviewed-by's?

The virtio_balloon changes are fairly trivial compared to the mm parts,
and Michael Tsirkin provided feedback on the last round.

However, the real trick is figuring out what the locking rules are when
the mm core calls in to ask us about a page.  And that requires someone
who really knows the mm stuff.

> The patchset looks reasonable to me and your empirical results look
> good.  But I don't feel that I'm in a position to decide on its overall
> desirability, either in a standalone sense or in comparison to any
> alternative schemes which anyone has proposed.

It's definitely nice to have, though it's far more complicated than I
would have thought.

Cheers,
Rusty.


* Re: [PATCH v10 3/5] virtio_balloon: introduce migration primitives to balloon pages
  2012-09-17 22:15   ` Andrew Morton
@ 2012-09-18 14:07     ` Rafael Aquini
  0 siblings, 0 replies; 20+ messages in thread
From: Rafael Aquini @ 2012-09-18 14:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, virtualization, Rusty Russell,
	Michael S. Tsirkin, Rik van Riel, Mel Gorman, Andi Kleen,
	Konrad Rzeszutek Wilk, Minchan Kim, Peter Zijlstra,
	Paul E. McKenney

On Mon, Sep 17, 2012 at 03:15:52PM -0700, Andrew Morton wrote:
> > +	/* Number of balloon pages isolated from 'pages' list for compaction */
> > +	unsigned int num_isolated_pages;
> 
> Is it utterly inconceivable that this counter could exceed 4G, ever?
> 
> >  	/* Number of balloon pages we've told the Host we're not using. */
> >  	unsigned int num_pages;

I've just followed the same unit the driver writers had used to keep track of
how many pages are 'enlisted' to a given balloon device (num_pages). As
compaction cannot isolate more pages than a balloon device possesses, yes,
num_isolated_pages won't get bigger than 4G pages.



> > +	mutex_lock(&vb->balloon_lock);
> >  	for (vb->num_pfns = 0; vb->num_pfns < num;
> >  	     vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
> > -		struct page *page = alloc_page(GFP_HIGHUSER | __GFP_NORETRY |
> > -					__GFP_NOMEMALLOC | __GFP_NOWARN);
> > +		struct page *page = alloc_page(vb_gfp_mask | __GFP_NORETRY |
> > +					       __GFP_NOWARN | __GFP_NOMEMALLOC);
> 
> That looks like an allocation which could easily fail.
>

That's not a big problem. If we fail that allocation and miss the desired
balloon 'inflation' target this round, the driver will take care of it
later, as it keeps chasing its target.


 
> >  		if (!page) {
> >  			if (printk_ratelimit())
> >  				dev_printk(KERN_INFO, &vb->vdev->dev,
> 
> Strangely, we suppressed the core page allocator's warning and
> substituted this less useful one.
> 
> Also, it would be nice if someone could get that printk_ratelimit() out
> of there, for reasons described at the printk_ratelimit() definition
> site.
> 

Although I agree 100% with you here, (IMHO) that change was out of scope
for this patch series' original purpose, so I didn't propose it.

OTOH, I don't mind introducing the aforementioned surgery in this patch,
if the balloon driver folks are OK with it.




* Re: [PATCH v10 1/5] mm: introduce a common interface for balloon pages mobility
  2012-09-17 22:15   ` Andrew Morton
@ 2012-09-18 16:24     ` Rafael Aquini
  2012-09-18 22:09       ` Andrew Morton
  2012-09-25  1:05       ` Michael S. Tsirkin
  0 siblings, 2 replies; 20+ messages in thread
From: Rafael Aquini @ 2012-09-18 16:24 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, virtualization, Rusty Russell,
	Michael S. Tsirkin, Rik van Riel, Mel Gorman, Andi Kleen,
	Konrad Rzeszutek Wilk, Minchan Kim, Peter Zijlstra,
	Paul E. McKenney

On Mon, Sep 17, 2012 at 03:15:43PM -0700, Andrew Morton wrote:
> > +/* return code to identify when a ballooned page has been migrated */
> > +#define BALLOON_MIGRATION_RETURN	0xba1100
> 
> I didn't really spend enough time to work out why this was done this
> way, but I know a hack when I see one!
>
Yes, I'm afraid it's a hack, but, unfortunately, it's a necessary one (IMHO).

This 'distinct' return code is used to flag a successful balloon page migration
at the following unmap_and_move() snippet (patch 2).
If for any reason we fail to identify a successful balloon page migration, we
will cause a page leak, as the old 'page' won't be properly released.
.....
        rc = __unmap_and_move(page, newpage, force, offlining, mode);
+
+        if (unlikely(rc == BALLOON_MIGRATION_RETURN)) {
+                /*
+                 * A ballooned page has been migrated already.
+                 * Now, it's the time to remove the old page from the isolated
+                 * pageset list and handle it back to Buddy, wrap-up counters
+                 * and return.
+                 */
......

By the time we reach that point in the code, we cannot rely on testing
page->mapping flags anymore for either 'page' or 'newpage' because:
a) migration has already finished and 'page'->mapping is wiped out;
b) the balloon might have started to deflate, and 'newpage' might be released
   already;

If the return code approach is unacceptable, we might defer the 'page'->mapping
wipe-out step to that point in the code for the balloon page case.
That, however, tends to be a bit heavier, IMHO, as it will require us to
acquire the page lock once more to perform the mapping wipe-out, thus
potentially introducing overhead through lock contention (especially when
several parallel compaction threads are scanning pages for isolation).

 
> We forgot to document the a_ops.migratepage() return value.  Perhaps
> it's time to work out what it should be.
> 
> > +#ifdef CONFIG_BALLOON_COMPACTION
> > +#define count_balloon_event(e)		count_vm_event(e)
> > +#define free_balloon_mapping(m)		kfree(m)
> 
> It would be better to write these in C please.  That way we get
> typechecking, even when CONFIG_BALLOON_COMPACTION=n.
> 

Consider it done, sir.


> > +extern bool isolate_balloon_page(struct page *);
> > +extern void putback_balloon_page(struct page *);
> > +extern int migrate_balloon_page(struct page *newpage,
> > +				struct page *page, enum migrate_mode mode);
> > +extern struct address_space *alloc_balloon_mapping(void *balloon_device,
> > +				const struct address_space_operations *a_ops);
> 
> There's a useful convention that interface identifiers are prefixed by
> their interface's name.  IOW, everything in this file would start with
> "balloon_".  balloon_page_isolate, balloon_page_putback, etc.  I think
> we could follow that convention here?
> 

Consider it done, sir.


> > +static inline void assign_balloon_mapping(struct page *page,
> > +					  struct address_space *mapping)
> > +{
> > +	page->mapping = mapping;
> > +	smp_wmb();
> > +}
> > +
> > +static inline void clear_balloon_mapping(struct page *page)
> > +{
> > +	page->mapping = NULL;
> > +	smp_wmb();
> > +}
> > +
> > +static inline gfp_t balloon_mapping_gfp_mask(void)
> > +{
> > +	return GFP_HIGHUSER_MOVABLE;
> > +}
> > +
> > +static inline bool __is_movable_balloon_page(struct page *page)
> > +{
> > +	struct address_space *mapping = ACCESS_ONCE(page->mapping);
> > +	smp_read_barrier_depends();
> > +	return mapping_balloon(mapping);
> > +}
> 
> hm.  Are these barrier tricks copied from somewhere else, or home-made?
>

They were introduced at a reviewer's request, to ensure proper ordering when
inserting or deleting pages to/from a balloon device, so a given page won't get
flagged as a balloon page before it gets inserted into the balloon's page
list, just as it will only be deleted from the balloon's page list after it is
decommissioned of its balloon page status (page->mapping wipe-out).

Although the mentioned operations only take place under proper locking, I
thought it wouldn't hurt to enforce such ordering, so I kept the barrier stuff.
Btw, considering the aforementioned usage, I just realized the
assign_balloon_mapping() barrier is misplaced. I'll fix that and introduce
comments on those functions' usage.
 

> > +/*
> > + * movable_balloon_page - test page->mapping->flags to identify balloon pages
> > + *			  that can be moved by compaction/migration.
> > + *
> > + * This function is used at core compaction's page isolation scheme and so it's
> > + * exposed to several system pages which may, or may not, be part of a memory
> > + * balloon, and thus we cannot afford to hold a page locked to perform tests.
> 
> I don't understand this.  What is a "system page"?  If I knew that, I
> migth perhaps understand why we cannot lock such a page.
> 

What I meant is that compaction threads scan through all memory pages in the
system to check whether a page can be isolated or not, so we cannot hold the
page locked while performing the mapping->flags test, as that could cause
undesired effects on other subsystems where the page we're about to test is
in use.

I'll try to rearrange the comment to make it clearer.


> > + * Therefore, as we might return false positives in the case a balloon page
> > + * is just released under us, the page->mapping->flags need to be retested
> > + * with the proper page lock held, on the functions that will cope with the
> > + * balloon page later.
> > + */
> > +static inline bool movable_balloon_page(struct page *page)
> > +{
> > +	/*
> > +	 * Before dereferencing and testing mapping->flags, lets make sure
> > +	 * this is not a page that uses ->mapping in a different way
> > +	 */
> > +	if (!PageSlab(page) && !PageSwapCache(page) && !PageAnon(page) &&
> > +	    !page_mapped(page))
> > +		return __is_movable_balloon_page(page);
> > +
> > +	return false;
> > +}
> > +
> > +/*
> > + * __page_balloon_device - get the balloon device that owns the given page.
> > + *
> > + * This shall only be used at driver callbacks under proper page lock,
> > + * to get access to the balloon device which @page belongs.
> > + */
> > +static inline void *__page_balloon_device(struct page *page)
> > +{
> > +	struct address_space *mapping = page->mapping;
> > +	if (mapping)
> > +		mapping = mapping->assoc_mapping;
> > +
> > +	return mapping;
> > +}
> 
> So you've repurposed address_space.assoc_mapping in new and unexpected
> ways.
> 
> I don't immediately see a problem with doing this, but we should do it
> properly.  Something like:
> 
> - rename address_space.assoc_mapping to private_data
> - it has type void*
> - document its ownership rules
> - convert fs/buffer.c
> 
> all done as a standalone preparatory patch.
>

I'll do it as you requested/suggested.

 
> Also, your usage of ->private_data should minimise its use of void* -
> use more specific types wherever possible.  So this function should
> return a "struct virtio_balloon *".
> 

I believe we can keep it returning the opaque type, as that will allow us to
extend its usage to other balloon drivers, which might have different balloon
descriptors, in the future. I haven't looked at all balloon drivers, but at a
glance it seems vmware's balloon uses a fairly similar scheme to virtio's and
could leverage all these interfaces as well.


> It is unobvious why this interface function is prefixed with __.
> 
> > +/*
> > + * DEFINE_BALLOON_MAPPING_AOPS - declare and instantiate a callback descriptor
> > + *				 to be used as balloon page->mapping->a_ops.
> > + *
> > + * @label     : declaration identifier (var name)
> > + * @isolatepg : callback symbol name for performing the page isolation step
> > + * @migratepg : callback symbol name for performing the page migration step
> > + * @putbackpg : callback symbol name for performing the page putback step
> > + *
> > + * address_space_operations utilized methods for ballooned pages:
> > + *   .migratepage    - used to perform balloon's page migration (as is)
> > + *   .invalidatepage - used to isolate a page from balloon's page list
> > + *   .freepage       - used to reinsert an isolated page to balloon's page list
> > + */
> > +#define DEFINE_BALLOON_MAPPING_AOPS(label, isolatepg, migratepg, putbackpg) \
> > +	const struct address_space_operations (label) = {		    \
> > +		.migratepage    = (migratepg),				    \
> > +		.invalidatepage = (isolatepg),				    \
> > +		.freepage       = (putbackpg),				    \
> > +	}
> 
> erp.  Can we avoid doing this?  afaict it would be pretty simple to
> avoid instantiating virtio_balloon_aops at all if
> CONFIG_BALLOON_COMPACTION=n?
>

That was being instantiated at the driver level directly,
and the driver folks requested this change.

I'll look for a way around it, though. (Just to make sure we're on the same page
here: are you against the preprocessor macro, or just the way it's being used?)

 
> > +#else
> > +#define assign_balloon_mapping(p, m)	do { } while (0)
> > +#define clear_balloon_mapping(p)	do { } while (0)
> > +#define free_balloon_mapping(m)		do { } while (0)
> > +#define count_balloon_event(e)		do { } while (0)
> 
> Written in C with proper types if possible, please.
> 

Consider it done, sir.


> > +#define DEFINE_BALLOON_MAPPING_AOPS(label, isolatepg, migratepg, putbackpg) \
> > +	const struct {} (label) = {}
> > +
> > +static inline bool movable_balloon_page(struct page *page) { return false; }
> > +static inline bool isolate_balloon_page(struct page *page) { return false; }
> > +static inline void putback_balloon_page(struct page *page) { return; }
> > +
> > +static inline int migrate_balloon_page(struct page *newpage,
> > +				struct page *page, enum migrate_mode mode)
> > +{
> > +	return 0;
> > +}
> > +
> >
> > ...
> >
> > @@ -53,6 +54,23 @@ static inline int mapping_unevictable(struct address_space *mapping)
> >  	return !!mapping;
> >  }
> >  
> > +static inline void mapping_set_balloon(struct address_space *mapping)
> > +{
> > +	set_bit(AS_BALLOON_MAP, &mapping->flags);
> > +}
> > +
> > +static inline void mapping_clear_balloon(struct address_space *mapping)
> > +{
> > +	clear_bit(AS_BALLOON_MAP, &mapping->flags);
> > +}
> > +
> > +static inline int mapping_balloon(struct address_space *mapping)
> > +{
> > +	if (mapping)
> > +		return test_bit(AS_BALLOON_MAP, &mapping->flags);
> > +	return !!mapping;
> 
> Why not "return 0"?
> 
> Or
> 
> 	return mapping && test_bit(AS_BALLOON_MAP, &mapping->flags);

Consider it done, sir.


> 
> > +}
> > +
> >
> > ...
> >
> > +struct address_space *alloc_balloon_mapping(void *balloon_device,
> > +				const struct address_space_operations *a_ops)
> > +{
> > +	struct address_space *mapping;
> > +
> > +	mapping = kmalloc(sizeof(*mapping), GFP_KERNEL);
> > +	if (!mapping)
> > +		return ERR_PTR(-ENOMEM);
> > +
> > +	/*
> > +	 * Give a clean 'zeroed' status to all elements of this special
> > +	 * balloon page->mapping struct address_space instance.
> > +	 */
> > +	address_space_init_once(mapping);
> > +
> > +	/*
> > +	 * Set mapping->flags appropriately, to allow balloon ->mapping
> > +	 * identification, as well as give a proper hint to the balloon
> > +	 * driver on what GFP allocation mask shall be used.
> > +	 */
> > +	mapping_set_balloon(mapping);
> > +	mapping_set_gfp_mask(mapping, balloon_mapping_gfp_mask());
> > +
> > +	/* balloon's page->mapping->a_ops callback descriptor */
> > +	mapping->a_ops = a_ops;
> > +
> > +	/*
> > +	 * balloon special page->mapping overloads ->assoc_mapping
> > +	 * to hold a reference back to the balloon device which 'owns'
> > +	 * a given page. This is the way we can cope with multiple
> > +	 * balloon devices without losing reference of several
> > +	 * ballooned pagesets.
> 
> I don't really understand the final part of this comment.  Can you
> expand more fully on the problem which this code is solving?
> 

I was trying to state that we can have several virtio_balloon devices
instantiated for a single guest (the virtio folks have told me that's quite an
easy scenario), and the only way we can safely get to the right balloon page
list to isolate pages from, or put pages back into, is to maintain a pointer
back to the device descriptor in every balloon page's ->mapping instance.

I'll try to rephrase the comment to make it clearer.


> > +	 */
> > +	mapping->assoc_mapping = balloon_device;
> > +
> > +	return mapping;
> > +}
> > +EXPORT_SYMBOL_GPL(alloc_balloon_mapping);
> 
> balloon_mapping_alloc() :)

Consider it done, sir.


> 
> > +static inline void __isolate_balloon_page(struct page *page)
> > +{
> > +	page->mapping->a_ops->invalidatepage(page, 0);
> > +}
> > +
> >
> > ...
> >
> 


* Re: [PATCH v10 1/5] mm: introduce a common interface for balloon pages mobility
  2012-09-18 16:24     ` Rafael Aquini
@ 2012-09-18 22:09       ` Andrew Morton
  2012-09-25  1:05       ` Michael S. Tsirkin
  1 sibling, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2012-09-18 22:09 UTC (permalink / raw)
  To: Rafael Aquini
  Cc: linux-mm, linux-kernel, virtualization, Rusty Russell,
	Michael S. Tsirkin, Rik van Riel, Mel Gorman, Andi Kleen,
	Konrad Rzeszutek Wilk, Minchan Kim, Peter Zijlstra,
	Paul E. McKenney

On Tue, 18 Sep 2012 13:24:21 -0300
Rafael Aquini <aquini@redhat.com> wrote:

> On Mon, Sep 17, 2012 at 03:15:43PM -0700, Andrew Morton wrote:
> > > +/* return code to identify when a ballooned page has been migrated */
> > > +#define BALLOON_MIGRATION_RETURN	0xba1100
> > 
> > I didn't really spend enough time to work out why this was done this
> > way, but I know a hack when I see one!
> >
> Yes, I'm afraid it's a hack, but, unfortunately, it's a necessary one (IMHO).
> 
> This 'distinct' return code is used to flag a sucessful balloon page migration
> at the following unmap_and_move() snippet (patch 2).
> If by any reason we fail to identify a sucessfull balloon page migration, we
> will cause a page leak, as the old 'page' won't be properly released.
> .....
>         rc = __unmap_and_move(page, newpage, force, offlining, mode);
> +
> +        if (unlikely(rc == BALLOON_MIGRATION_RETURN)) {
> +                /*
> +                 * A ballooned page has been migrated already.
> +                 * Now, it's the time to remove the old page from the isolated
> +                 * pageset list and handle it back to Buddy, wrap-up counters
> +                 * and return.
> +                 */
> ......
> 
> By reaching that point in code, we cannot rely on testing page->mapping flags
> anymore for both 'page' and 'newpage' because:
> a) migration has already finished and 'page'->mapping is wiped out;
> b) balloon might have started to deflate, and 'newpage' might be released
>    already;
> 
> If the return code approach is unnaceptable, we might defer the 'page'->mapping
> wipe-out step to that point in code for the balloon page case.
> That, however, tends to be a little bit heavier, IMHO, as it will require us to
> acquire the page lock once more to proceed the mapping wipe out, thus
> potentially introducing overhead by lock contention (specially when several
> parallel compaction threads are scanning pages for isolation)

I think the return code approach _is_ acceptable, but the
implementation could be improved.

As it stands, a naive caller could be expecting either 0 (success) or a
negative errno.  A large positive return value could trigger havoc.  We
can defend against such programming mistakes with code commentary, but
a better approach would be to enumerate the return values.  Along the
lines of

/*
 * Return values from address_space_operations.migratepage().  Returns a
 * negative errno on failure.
 */
#define MIGRATEPAGE_SUCCESS		0
#define MIGRATEPAGE_BALLOON_THINGY	1	/* nice comment goes here */

and convert all callers to explicitly check for MIGRATEPAGE_SUCCESS,
not literal zero.  We should be particularly careful to look for
codesites which are unprepared for positive return values, such as

	ret = migratepage();
	if (ret < 0)
		return ret;
	...
	return ret;		/* success!! */



If we wanted to be really vigilant about this, we could do

#define MIGRATEPAGE_SUCCESS		1
#define MIGRATEPAGE_BALLOON_THINGY	2

so any naive code which tests for literal zero will nicely explode early
in testing.



* Re: [PATCH v10 1/5] mm: introduce a common interface for balloon pages mobility
  2012-09-17 16:38 ` [PATCH v10 1/5] mm: introduce a common interface for balloon pages mobility Rafael Aquini
  2012-09-17 22:15   ` Andrew Morton
@ 2012-09-24 12:44   ` Peter Zijlstra
  1 sibling, 0 replies; 20+ messages in thread
From: Peter Zijlstra @ 2012-09-24 12:44 UTC (permalink / raw)
  To: Rafael Aquini
  Cc: linux-mm, linux-kernel, virtualization, Rusty Russell,
	Michael S. Tsirkin, Rik van Riel, Mel Gorman, Andi Kleen,
	Andrew Morton, Konrad Rzeszutek Wilk, Minchan Kim,
	Paul E. McKenney

On Mon, 2012-09-17 at 13:38 -0300, Rafael Aquini wrote:
> +static inline void assign_balloon_mapping(struct page *page,
> +                                         struct address_space
> *mapping)
> +{
> +       page->mapping = mapping;
> +       smp_wmb();
> +}
> +
> +static inline void clear_balloon_mapping(struct page *page)
> +{
> +       page->mapping = NULL;
> +       smp_wmb();
> +} 

barriers without a comment describing the data race are a mortal sin.


* Re: [PATCH v10 3/5] virtio_balloon: introduce migration primitives to balloon pages
  2012-09-17 16:38 ` [PATCH v10 3/5] virtio_balloon: introduce migration primitives to balloon pages Rafael Aquini
  2012-09-17 22:15   ` Andrew Morton
@ 2012-09-25  0:40   ` Michael S. Tsirkin
  2012-09-25 18:07     ` Rafael Aquini
  1 sibling, 1 reply; 20+ messages in thread
From: Michael S. Tsirkin @ 2012-09-25  0:40 UTC (permalink / raw)
  To: Rafael Aquini
  Cc: linux-mm, linux-kernel, virtualization, Rusty Russell,
	Rik van Riel, Mel Gorman, Andi Kleen, Andrew Morton,
	Konrad Rzeszutek Wilk, Minchan Kim, Peter Zijlstra,
	Paul E. McKenney

On Mon, Sep 17, 2012 at 01:38:18PM -0300, Rafael Aquini wrote:
> Memory fragmentation introduced by ballooning might reduce significantly
> the number of 2MB contiguous memory blocks that can be used within a guest,
> thus imposing performance penalties associated with the reduced number of
> transparent huge pages that could be used by the guest workload.
> 
> Besides making balloon pages movable at allocation time and introducing
> the necessary primitives to perform balloon page migration/compaction,
> this patch also introduces the following locking scheme, in order to
> enhance the synchronization methods for accessing elements of struct
> virtio_balloon, thus providing protection against concurrent access
> introduced by parallel memory compaction threads.
> 
>  - balloon_lock (mutex) : synchronizes the access demand to elements of
>                           struct virtio_balloon and its queue operations;
>  - pages_lock (spinlock): special protection to balloon's pages bookmarking
>                           elements (list and atomic counters) against the
>                           potential memory compaction concurrency;
> 
> Signed-off-by: Rafael Aquini <aquini@redhat.com>
> ---
>  drivers/virtio/virtio_balloon.c | 305 +++++++++++++++++++++++++++++++++++++---
>  1 file changed, 286 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 0908e60..a52c768 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -27,6 +27,7 @@
>  #include <linux/delay.h>
>  #include <linux/slab.h>
>  #include <linux/module.h>
> +#include <linux/balloon_compaction.h>
>  
>  /*
>   * Balloon device works in 4K page units.  So each page is pointed to by
> @@ -34,6 +35,7 @@
>   * page units.
>   */
>  #define VIRTIO_BALLOON_PAGES_PER_PAGE (PAGE_SIZE >> VIRTIO_BALLOON_PFN_SHIFT)
> +#define VIRTIO_BALLOON_ARRAY_PFNS_MAX 256
>  
>  struct virtio_balloon
>  {
> @@ -46,11 +48,24 @@ struct virtio_balloon
>  	/* The thread servicing the balloon. */
>  	struct task_struct *thread;
>  
> +	/* balloon special page->mapping */
> +	struct address_space *mapping;
> +
> +	/* Synchronize access/update to this struct virtio_balloon elements */
> +	struct mutex balloon_lock;

Please document here nesting rules wrt page lock for this and pages_lock.

> +
>  	/* Waiting for host to ack the pages we released. */
>  	wait_queue_head_t acked;
>  
> +	/* Protect the pages list and its bookkeeping counters */
> +	spinlock_t pages_lock;
> +
> +	/* Number of balloon pages isolated from 'pages' list for compaction */
> +	unsigned int num_isolated_pages;
> +
>  	/* Number of balloon pages we've told the Host we're not using. */
>  	unsigned int num_pages;
> +
>  	/*
>  	 * The pages we've told the Host we're not using.
>  	 * Each page on this list adds VIRTIO_BALLOON_PAGES_PER_PAGE

...


> @@ -122,13 +137,17 @@ static void set_page_pfns(u32 pfns[], struct page *page)
>  
>  static void fill_balloon(struct virtio_balloon *vb, size_t num)
>  {
> +	/* Get the proper GFP alloc mask from vb->mapping flags */
> +	gfp_t vb_gfp_mask = mapping_gfp_mask(vb->mapping);
> +
>  	/* We can only do one array worth at a time. */
>  	num = min(num, ARRAY_SIZE(vb->pfns));
>  
> +	mutex_lock(&vb->balloon_lock);
>  	for (vb->num_pfns = 0; vb->num_pfns < num;
>  	     vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
> -		struct page *page = alloc_page(GFP_HIGHUSER | __GFP_NORETRY |
> -					__GFP_NOMEMALLOC | __GFP_NOWARN);
> +		struct page *page = alloc_page(vb_gfp_mask | __GFP_NORETRY |
> +					       __GFP_NOWARN | __GFP_NOMEMALLOC);
>  		if (!page) {
>  			if (printk_ratelimit())
>  				dev_printk(KERN_INFO, &vb->vdev->dev,
> @@ -139,9 +158,15 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num)
>  			break;
>  		}
>  		set_page_pfns(vb->pfns + vb->num_pfns, page);
> -		vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
>  		totalram_pages--;
> +
> +		BUG_ON(!trylock_page(page));

So here page lock is nested within balloon_lock.

> +		spin_lock(&vb->pages_lock);
>  		list_add(&page->lru, &vb->pages);
> +		assign_balloon_mapping(page, vb->mapping);
> +		vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
> +		spin_unlock(&vb->pages_lock);
> +		unlock_page(page);
>  	}
>  
>  	/* Didn't get any?  Oh well. */
> @@ -149,6 +174,7 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num)
>  		return;
>  
>  	tell_host(vb, vb->inflate_vq);
> +	mutex_unlock(&vb->balloon_lock);
>  }

...


> +/*
> + * virtballoon_migratepage - perform the balloon page migration on behalf of
> + *			     a compaction thread.  (called under page lock)
> + * @mapping: the page->mapping which will be assigned to the new migrated page.
> + * @newpage: page that will replace the isolated page after migration finishes.
> + * @page   : the isolated (old) page that is about to be migrated to newpage.
> + * @mode   : compaction mode -- not used for balloon page migration.
> + *
> + * After a ballooned page gets isolated by compaction procedures, this is the
> + * function that performs the page migration on behalf of a compaction thread.
> + * The page migration for virtio balloon is done in a simple swap fashion which
> + * follows these two macro steps:
> + *  1) insert newpage into vb->pages list and update the host about it;
> + *  2) update the host about the old page removed from vb->pages list;
> + *
> + * This function performs the balloon page migration task.
> + * Called through balloon_mapping->a_ops.
> + */
> +int virtballoon_migratepage(struct address_space *mapping,
> +		struct page *newpage, struct page *page, enum migrate_mode mode)
> +{
> +	struct virtio_balloon *vb = __page_balloon_device(page);
> +
> +	BUG_ON(!vb);
> +
> +	mutex_lock(&vb->balloon_lock);


While here balloon_lock is taken and according to documentation
this is called under page lock.

> +
> +	/* balloon's page migration 1st step */
> +	vb->num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;
> +	spin_lock(&vb->pages_lock);
> +	list_add(&newpage->lru, &vb->pages);
> +	assign_balloon_mapping(newpage, mapping);
> +	vb->num_isolated_pages -= VIRTIO_BALLOON_PAGES_PER_PAGE;
> +	spin_unlock(&vb->pages_lock);
> +	set_page_pfns(vb->pfns, newpage);
> +	tell_host(vb, vb->inflate_vq);
> +
> +	/* balloon's page migration 2nd step */
> +	vb->num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;
> +	clear_balloon_mapping(page);
> +	set_page_pfns(vb->pfns, page);
> +	tell_host(vb, vb->deflate_vq);
> +
> +	mutex_unlock(&vb->balloon_lock);
> +	wake_up(&vb->config_change);
> +
> +	return BALLOON_MIGRATION_RETURN;
> +}

So nesting is reversed which is normally a problem.
Unfortunately lockdep does not seem to work for the page lock,
otherwise it would detect this.
If this reversed nesting is not a problem, please add
comments in code documenting that this is intentional
and how it works.

-- 
MST

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v10 1/5] mm: introduce a common interface for balloon pages mobility
  2012-09-18 16:24     ` Rafael Aquini
  2012-09-18 22:09       ` Andrew Morton
@ 2012-09-25  1:05       ` Michael S. Tsirkin
  2012-09-25 14:00         ` Rafael Aquini
  1 sibling, 1 reply; 20+ messages in thread
From: Michael S. Tsirkin @ 2012-09-25  1:05 UTC (permalink / raw)
  To: Rafael Aquini
  Cc: Andrew Morton, linux-mm, linux-kernel, virtualization,
	Rusty Russell, Rik van Riel, Mel Gorman, Andi Kleen,
	Konrad Rzeszutek Wilk, Minchan Kim, Peter Zijlstra,
	Paul E. McKenney

On Tue, Sep 18, 2012 at 01:24:21PM -0300, Rafael Aquini wrote:
> > > +static inline void assign_balloon_mapping(struct page *page,
> > > +					  struct address_space *mapping)
> > > +{
> > > +	page->mapping = mapping;
> > > +	smp_wmb();
> > > +}
> > > +
> > > +static inline void clear_balloon_mapping(struct page *page)
> > > +{
> > > +	page->mapping = NULL;
> > > +	smp_wmb();
> > > +}
> > > +
> > > +static inline gfp_t balloon_mapping_gfp_mask(void)
> > > +{
> > > +	return GFP_HIGHUSER_MOVABLE;
> > > +}
> > > +
> > > +static inline bool __is_movable_balloon_page(struct page *page)
> > > +{
> > > +	struct address_space *mapping = ACCESS_ONCE(page->mapping);
> > > +	smp_read_barrier_depends();
> > > +	return mapping_balloon(mapping);
> > > +}
> > 
> > hm.  Are these barrier tricks copied from somewhere else, or home-made?
> >
> 
> They were introduced at a reviewer's request, to ensure proper ordering when
> inserting or deleting pages to/from a balloon device: a given page must not be
> identified as a balloon page before it gets inserted into the balloon's page
> list, just as it may only be deleted from the balloon's page list after it is
> stripped of its balloon page status (page->mapping wipe-out).
> 
> Although the mentioned operations only take place under proper locking, I
> thought it wouldn't hurt to enforce such ordering, so I kept the barriers.
> Btw, considering the aforementioned usage case, I just realized the
> assign_balloon_mapping() barrier is misplaced. I'll fix that and introduce
> comments on those functions' usage.

If these are all under page lock these barriers just confuse things,
because they are almost never enough by themselves.
So in that case it would be better to drop them and document
usage as you are going to.

Even better would be lockdep check but unfortunately it
does not seem to be possible for page lock.

-- 
MST

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v10 0/5] make balloon pages movable by compaction
  2012-09-17 22:15 ` [PATCH v10 0/5] make balloon pages movable by compaction Andrew Morton
  2012-09-17 22:45   ` Rik van Riel
  2012-09-18  0:45   ` Rusty Russell
@ 2012-09-25  1:17   ` Michael S. Tsirkin
  2 siblings, 0 replies; 20+ messages in thread
From: Michael S. Tsirkin @ 2012-09-25  1:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rafael Aquini, linux-mm, linux-kernel, virtualization,
	Rusty Russell, Rik van Riel, Mel Gorman, Andi Kleen,
	Konrad Rzeszutek Wilk, Minchan Kim, Peter Zijlstra,
	Paul E. McKenney

On Mon, Sep 17, 2012 at 03:15:31PM -0700, Andrew Morton wrote:
> How can a patchset reach v10 and have zero Reviewed-by's?

I think the problem is, this adds an API between mm and balloon
device that is pretty complex: consider that previously we literally
only used alloc_page, __free_page and the page->lru field.

So you end up with a problem: mm bits don't do anything
by themselves so mm people aren't very interested and
don't know about virtio anyway, while
we virtio device driver people lack a clue about compaction.

Having said that I think I'm getting my head about some of the issues so
I commented and the patchset is hopefully getting there.

-- 
MST

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v10 1/5] mm: introduce a common interface for balloon pages mobility
  2012-09-25  1:05       ` Michael S. Tsirkin
@ 2012-09-25 14:00         ` Rafael Aquini
  0 siblings, 0 replies; 20+ messages in thread
From: Rafael Aquini @ 2012-09-25 14:00 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Andrew Morton, linux-mm, linux-kernel, virtualization,
	Rusty Russell, Rik van Riel, Mel Gorman, Andi Kleen,
	Konrad Rzeszutek Wilk, Minchan Kim, Peter Zijlstra,
	Paul E. McKenney

On Tue, Sep 25, 2012 at 03:05:49AM +0200, Michael S. Tsirkin wrote:
> If these are all under page lock these barriers just confuse things,
> because they are almost never enough by themselves.
> So in that case it would be better to drop them and document
> usage as you are going to.
>

Would the following make more sense (with the proper comments, as well)?

---8<---
+static inline void balloon_page_set(struct page *page,
+                                   struct address_space *mapping,
+                                   struct list_head *head)
+{
+       list_add(&page->lru, head);
+       smp_wmb();
+       page->mapping = mapping;
+}
+
+static inline void balloon_page_del(struct page *page)
+{
+       page->mapping = NULL;
+       smp_wmb();
+       list_del(&page->lru);
+}
+
+static inline bool __is_movable_balloon_page(struct page *page)
+{
+       struct address_space *mapping = ACCESS_ONCE(page->mapping);
+       smp_read_barrier_depends();
+       return mapping_balloon(mapping);
+}
+
---8<---

There's still a case where we have to test page->mapping->flags and cannot
afford to wait for, or grab, the page lock: isolate_migratepages_range().
The barriers won't prevent leak_balloon() from racing against
isolate_migratepages_range(), but they surely make tests of page->mapping more
consistent. And for those cases where leak_balloon() races against
isolate_migratepages_range()->isolate_balloon_page(), we solve the conflict of
interest through page refcounting and the page lock. I'm also preparing a more
extensive document for Documentation/ explaining the interfaces and how we
cope with these races.
 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v10 3/5] virtio_balloon: introduce migration primitives to balloon pages
  2012-09-25  0:40   ` Michael S. Tsirkin
@ 2012-09-25 18:07     ` Rafael Aquini
  0 siblings, 0 replies; 20+ messages in thread
From: Rafael Aquini @ 2012-09-25 18:07 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-mm, linux-kernel, virtualization, Rusty Russell,
	Rik van Riel, Mel Gorman, Andi Kleen, Andrew Morton,
	Konrad Rzeszutek Wilk, Minchan Kim, Peter Zijlstra,
	Paul E. McKenney

On Tue, Sep 25, 2012 at 02:40:24AM +0200, Michael S. Tsirkin wrote:
> > @@ -139,9 +158,15 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num)
> >  			break;
> >  		}
> >  		set_page_pfns(vb->pfns + vb->num_pfns, page);
> > -		vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
> >  		totalram_pages--;
> > +
> > +		BUG_ON(!trylock_page(page));
> 
> So here page lock is nested within balloon_lock.
> 
This page is coming from the Buddy free lists and, as such, fill_balloon() does
not race against page migration (which takes pages from an already isolated
page list). We need to grab the page lock here to prevent page isolation from
happening while this page is still in the "balloon insertion" step (mapping
assignment). Also, failing to grab the page lock here (remember this page is
coming from Buddy) means a BUG.

I'll make sure to comment on that in the code. Thanks for the nit.

> > +int virtballoon_migratepage(struct address_space *mapping,
> > +		struct page *newpage, struct page *page, enum migrate_mode mode)
> > +{
> > +	struct virtio_balloon *vb = __page_balloon_device(page);
> > +
> > +	BUG_ON(!vb);
> > +
> > +	mutex_lock(&vb->balloon_lock);
> 
> 
> While here balloon_lock is taken and according to documentation
> this is called under page lock.
> 
> So nesting is reversed which is normally a problem.
> Unfortunately lockdep does not seem to work for the page lock
> otherwise it would detect this.
> If this reversed nesting is not a problem, please add
> comments in code documenting that this is intentional
> and how it works.
>

The scheme works because we do not actually invert the page locking. If we
stumble across a locked page while holding the balloon_lock mutex, we just give
up that page for that round. The comment at __leak_balloon() explains that
already:
----
+               /*
+                * Grab the page lock to avoid racing against threads isolating
+                * pages from, or migrating pages back to vb->pages list.
+                * (both tasks are done under page lock protection)
+                *
+                * Failing to grab the page lock here means this page is being
+                * isolated already, or its migration has not finished yet.
+                *
+                * We simply cannot afford to keep waiting on page lock here,
+                * otherwise we might cause a lock inversion and remain dead-
+                * locked with threads isolating/migrating pages.
+                * So, we give up this round if we fail to grab the page lock.
+                */
+               if (!trylock_page(page))
+                       break;
----

The converse is true as well. If the thread trying to isolate pages stumbles
across a balloon page already locked at leak_balloon(), it gives up the
isolation step, and mutex_lock(&balloon_lock) is never attempted down that
code path. Check out this isolate_balloon_page() snippet:
----
+               /*
+                * As balloon pages are not isolated from LRU lists, concurrent
+                * compaction threads can race against page migration functions
+                * move_to_new_page() & __unmap_and_move().
+                * In order to avoid having an already isolated balloon page
+                * being (wrongly) re-isolated while it is under migration,
+                * lets be sure we have the page lock before proceeding with
+                * the balloon page isolation steps.
+                */
+               if (likely(trylock_page(page))) {
----

I'll rework the comment to indicate the leak_balloon() case as well.


^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2012-09-25 18:08 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-09-17 16:38 [PATCH v10 0/5] make balloon pages movable by compaction Rafael Aquini
2012-09-17 16:38 ` [PATCH v10 1/5] mm: introduce a common interface for balloon pages mobility Rafael Aquini
2012-09-17 22:15   ` Andrew Morton
2012-09-18 16:24     ` Rafael Aquini
2012-09-18 22:09       ` Andrew Morton
2012-09-25  1:05       ` Michael S. Tsirkin
2012-09-25 14:00         ` Rafael Aquini
2012-09-24 12:44   ` Peter Zijlstra
2012-09-17 16:38 ` [PATCH v10 2/5] mm: introduce compaction and migration for ballooned pages Rafael Aquini
2012-09-17 16:38 ` [PATCH v10 3/5] virtio_balloon: introduce migration primitives to balloon pages Rafael Aquini
2012-09-17 22:15   ` Andrew Morton
2012-09-18 14:07     ` Rafael Aquini
2012-09-25  0:40   ` Michael S. Tsirkin
2012-09-25 18:07     ` Rafael Aquini
2012-09-17 16:38 ` [PATCH v10 4/5] mm: introduce putback_movable_pages() Rafael Aquini
2012-09-17 16:38 ` [PATCH v10 5/5] mm: add vm event counters for balloon pages compaction Rafael Aquini
2012-09-17 22:15 ` [PATCH v10 0/5] make balloon pages movable by compaction Andrew Morton
2012-09-17 22:45   ` Rik van Riel
2012-09-18  0:45   ` Rusty Russell
2012-09-25  1:17   ` Michael S. Tsirkin
