* [PATCH v3 0/4] make balloon pages movable by compaction
@ 2012-07-03 23:48 ` Rafael Aquini
  0 siblings, 0 replies; 20+ messages in thread
From: Rafael Aquini @ 2012-07-03 23:48 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, virtualization, Rusty Russell, Michael S. Tsirkin,
	Rik van Riel, Mel Gorman, Andi Kleen, Andrew Morton,
	Konrad Rzeszutek Wilk, Minchan Kim, Rafael Aquini

Memory fragmentation introduced by ballooning can significantly reduce the
number of 2MB contiguous memory blocks available within a guest, imposing
performance penalties by reducing the number of transparent huge pages the
guest workload can use.

This patchset follows the main idea discussed at the 2012 LSF/MM Summit
session "Ballooning for transparent huge pages" --
http://lwn.net/Articles/490114/ -- and introduces the required changes to
the virtio_balloon driver, as well as to the core compaction & migration
bits, to make those subsystems aware of ballooned pages and allow balloon
pages to become movable within a guest, thus avoiding the aforementioned
fragmentation issue.


Rafael Aquini (4):
  mm: introduce compaction and migration for virtio ballooned pages
  virtio_balloon: handle concurrent accesses to virtio_balloon struct
    elements
  virtio_balloon: introduce migration primitives to balloon pages
  mm: add vm event counters for balloon pages compaction

 drivers/virtio/virtio_balloon.c |  142 +++++++++++++++++++++++++++++++++++----
 include/linux/mm.h              |   15 +++++
 include/linux/virtio_balloon.h  |    4 ++
 include/linux/vm_event_item.h   |    2 +
 mm/compaction.c                 |  127 ++++++++++++++++++++++++++++------
 mm/migrate.c                    |   32 ++++++++-
 mm/vmstat.c                     |    4 ++
 7 files changed, 293 insertions(+), 33 deletions(-)


Changelog:
V3: address reviewers' nitpick suggestions (Mel, Minchan)
V2: address Mel Gorman's review comments


Preliminary test results:
(2 vCPU, 1024MB RAM KVM guest running 3.5.0_rc5+)

* 64MB balloon:
[root@localhost ~]# awk '/compact/ {print}' /proc/vmstat
compact_blocks_moved 0
compact_pages_moved 0
compact_pagemigrate_failed 0
compact_stall 0
compact_fail 0
compact_success 0
compact_balloon_migrated 0
compact_balloon_failed 0
compact_balloon_isolated 0
compact_balloon_freed 0
[root@localhost ~]#
[root@localhost ~]# for i in $(seq 1 4); do echo 1 > /proc/sys/vm/compact_memory & done &>/dev/null
[1]   Done                    echo 1 > /proc/sys/vm/compact_memory
[2]   Done                    echo 1 > /proc/sys/vm/compact_memory
[3]-  Done                    echo 1 > /proc/sys/vm/compact_memory
[4]+  Done                    echo 1 > /proc/sys/vm/compact_memory
[root@localhost ~]#
[root@localhost ~]# awk '/compact/ {print}' /proc/vmstat
compact_blocks_moved 2717
compact_pages_moved 46697
compact_pagemigrate_failed 75
compact_stall 0
compact_fail 0
compact_success 0
compact_balloon_migrated 16384
compact_balloon_failed 0
compact_balloon_isolated 16384
compact_balloon_freed 16384


* 128MB balloon:
[root@localhost ~]# awk '/compact/ {print}' /proc/vmstat
compact_blocks_moved 0
compact_pages_moved 0
compact_pagemigrate_failed 0
compact_stall 0
compact_fail 0
compact_success 0
compact_balloon_migrated 0
compact_balloon_failed 0
compact_balloon_isolated 0
compact_balloon_freed 0
[root@localhost ~]#
[root@localhost ~]# for i in $(seq 1 4); do echo 1 > /proc/sys/vm/compact_memory & done &>/dev/null
[1]   Done                    echo 1 > /proc/sys/vm/compact_memory
[2]   Done                    echo 1 > /proc/sys/vm/compact_memory
[3]-  Done                    echo 1 > /proc/sys/vm/compact_memory
[4]+  Done                    echo 1 > /proc/sys/vm/compact_memory
[root@localhost ~]#
[root@localhost ~]# awk '/compact/ {print}' /proc/vmstat
compact_blocks_moved 2598
compact_pages_moved 47660
compact_pagemigrate_failed 103
compact_stall 0
compact_fail 0
compact_success 0
compact_balloon_migrated 26652
compact_balloon_failed 76
compact_balloon_isolated 26728
compact_balloon_freed 26652
-- 
1.7.10.2

^ permalink raw reply	[flat|nested] 20+ messages in thread


* [PATCH v3 1/4] mm: introduce compaction and migration for virtio ballooned pages
  2012-07-03 23:48 ` Rafael Aquini
@ 2012-07-03 23:48   ` Rafael Aquini
  -1 siblings, 0 replies; 20+ messages in thread
From: Rafael Aquini @ 2012-07-03 23:48 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, virtualization, Rusty Russell, Michael S. Tsirkin,
	Rik van Riel, Mel Gorman, Andi Kleen, Andrew Morton,
	Konrad Rzeszutek Wilk, Minchan Kim, Rafael Aquini

This patch introduces helper functions and the changes needed to teach the
compaction and migration code how to cope with pages that are part of a
guest memory balloon, so that memory compaction can move them.

Signed-off-by: Rafael Aquini <aquini@redhat.com>
---
 include/linux/mm.h |   15 +++++++
 mm/compaction.c    |  126 ++++++++++++++++++++++++++++++++++++++++++++--------
 mm/migrate.c       |   30 ++++++++++++-
 3 files changed, 151 insertions(+), 20 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index b36d08c..3112198 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1629,5 +1629,20 @@ static inline unsigned int debug_guardpage_minorder(void) { return 0; }
 static inline bool page_is_guard(struct page *page) { return false; }
 #endif /* CONFIG_DEBUG_PAGEALLOC */
 
+#if (defined(CONFIG_VIRTIO_BALLOON) || \
+	defined(CONFIG_VIRTIO_BALLOON_MODULE)) && defined(CONFIG_COMPACTION)
+extern bool putback_balloon_page(struct page *);
+extern struct address_space *balloon_mapping;
+
+static inline bool is_balloon_page(struct page *page)
+{
+	return (page->mapping == balloon_mapping) ? true : false;
+}
+#else
+static inline bool is_balloon_page(struct page *page)       { return false; }
+static inline bool isolate_balloon_page(struct page *page)  { return false; }
+static inline bool putback_balloon_page(struct page *page)  { return false; }
+#endif /* (VIRTIO_BALLOON || VIRTIO_BALLOON_MODULE) && COMPACTION */
+
 #endif /* __KERNEL__ */
 #endif /* _LINUX_MM_H */
diff --git a/mm/compaction.c b/mm/compaction.c
index 7ea259d..887d0fc 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -14,6 +14,7 @@
 #include <linux/backing-dev.h>
 #include <linux/sysctl.h>
 #include <linux/sysfs.h>
+#include <linux/export.h>
 #include "internal.h"
 
 #if defined CONFIG_COMPACTION || defined CONFIG_CMA
@@ -21,6 +22,85 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/compaction.h>
 
+#if defined(CONFIG_VIRTIO_BALLOON) || defined(CONFIG_VIRTIO_BALLOON_MODULE)
+/*
+ * Balloon pages special page->mapping.
+ * Users must properly allocate and initialize an instance of balloon_mapping,
+ * and set it as the page->mapping for balloon enlisted page instances.
+ * There is no need on utilizing struct address_space locking schemes for
+ * balloon_mapping as, once it gets initialized at balloon driver, it will
+ * remain just like a static reference that helps us on identifying a guest
+ * ballooned page by its mapping, as well as it will keep the 'a_ops' callback
+ * pointers to the functions that will execute the balloon page mobility tasks.
+ *
+ * address_space_operations necessary methods for ballooned pages:
+ *   .migratepage    - used to perform balloon's page migration (as is)
+ *   .invalidatepage - used to isolate a page from balloon's page list
+ *   .freepage       - used to reinsert an isolated page to balloon's page list
+ */
+struct address_space *balloon_mapping;
+EXPORT_SYMBOL_GPL(balloon_mapping);
+
+static inline void __isolate_balloon_page(struct page *page)
+{
+	page->mapping->a_ops->invalidatepage(page, 0);
+}
+
+static inline void __putback_balloon_page(struct page *page)
+{
+	page->mapping->a_ops->freepage(page);
+}
+
+/* __isolate_lru_page() counterpart for a ballooned page */
+static bool isolate_balloon_page(struct page *page)
+{
+	if (WARN_ON(!is_balloon_page(page)))
+		return false;
+
+	if (likely(get_page_unless_zero(page))) {
+		/*
+		 * We can race against move_to_new_page() & __unmap_and_move().
+		 * If we stumble across a locked balloon page and succeed on
+		 * isolating it, the result tends to be disastrous.
+		 */
+		if (likely(trylock_page(page))) {
+			/*
+			 * A ballooned page, by default, has just one refcount.
+			 * Prevent concurrent compaction threads from isolating
+			 * an already isolated balloon page.
+			 */
+			if (is_balloon_page(page) && (page_count(page) == 2)) {
+				__isolate_balloon_page(page);
+				unlock_page(page);
+				return true;
+			}
+			unlock_page(page);
+		}
+		/* Drop refcount taken for this already isolated page */
+		put_page(page);
+	}
+	return false;
+}
+
+/* putback_lru_page() counterpart for a ballooned page */
+bool putback_balloon_page(struct page *page)
+{
+	if (WARN_ON(!is_balloon_page(page)))
+		return false;
+
+	if (likely(trylock_page(page))) {
+		if (is_balloon_page(page)) {
+			__putback_balloon_page(page);
+			put_page(page);
+			unlock_page(page);
+			return true;
+		}
+		unlock_page(page);
+	}
+	return false;
+}
+#endif /* CONFIG_VIRTIO_BALLOON || CONFIG_VIRTIO_BALLOON_MODULE */
+
 static unsigned long release_freepages(struct list_head *freelist)
 {
 	struct page *page, *next;
@@ -312,32 +392,40 @@ isolate_migratepages_range(struct zone *zone, struct compact_control *cc,
 			continue;
 		}
 
-		if (!PageLRU(page))
-			continue;
-
 		/*
-		 * PageLRU is set, and lru_lock excludes isolation,
-		 * splitting and collapsing (collapsing has already
-		 * happened if PageLRU is set).
+		 * It is possible to migrate LRU pages and balloon pages.
+		 * Skip any other type of page.
 		 */
-		if (PageTransHuge(page)) {
-			low_pfn += (1 << compound_order(page)) - 1;
-			continue;
-		}
+		if (PageLRU(page)) {
+			/*
+			 * PageLRU is set, and lru_lock excludes isolation,
+			 * splitting and collapsing (collapsing has already
+			 * happened if PageLRU is set).
+			 */
+			if (PageTransHuge(page)) {
+				low_pfn += (1 << compound_order(page)) - 1;
+				continue;
+			}
 
-		if (!cc->sync)
-			mode |= ISOLATE_ASYNC_MIGRATE;
+			if (!cc->sync)
+				mode |= ISOLATE_ASYNC_MIGRATE;
 
-		lruvec = mem_cgroup_page_lruvec(page, zone);
+			lruvec = mem_cgroup_page_lruvec(page, zone);
 
-		/* Try isolate the page */
-		if (__isolate_lru_page(page, mode) != 0)
-			continue;
+			/* Try isolate the page */
+			if (__isolate_lru_page(page, mode) != 0)
+				continue;
+
+			VM_BUG_ON(PageTransCompound(page));
 
-		VM_BUG_ON(PageTransCompound(page));
+			/* Successfully isolated */
+			del_page_from_lru_list(page, lruvec, page_lru(page));
+		} else if (is_balloon_page(page)) {
+			if (!isolate_balloon_page(page))
+				continue;
+		} else
+			continue;
 
-		/* Successfully isolated */
-		del_page_from_lru_list(page, lruvec, page_lru(page));
 		list_add(&page->lru, migratelist);
 		cc->nr_migratepages++;
 		nr_isolated++;
diff --git a/mm/migrate.c b/mm/migrate.c
index be26d5c..59c7bc5 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -78,7 +78,10 @@ void putback_lru_pages(struct list_head *l)
 		list_del(&page->lru);
 		dec_zone_page_state(page, NR_ISOLATED_ANON +
 				page_is_file_cache(page));
-		putback_lru_page(page);
+		if (unlikely(is_balloon_page(page)))
+			WARN_ON(!putback_balloon_page(page));
+		else
+			putback_lru_page(page);
 	}
 }
 
@@ -783,6 +786,17 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 		}
 	}
 
+	if (is_balloon_page(page)) {
+		/*
+		 * A ballooned page does not need any special attention from
+		 * physical to virtual reverse mapping procedures.
+		 * Skip any attempt to unmap PTEs or to remap swap cache,
+		 * in order to avoid burning cycles at rmap level.
+		 */
+		remap_swapcache = 0;
+		goto skip_unmap;
+	}
+
 	/*
 	 * Corner case handling:
 	 * 1. When a new swap-cache page is read into, it is added to the LRU
@@ -852,6 +866,20 @@ static int unmap_and_move(new_page_t get_new_page, unsigned long private,
 			goto out;
 
 	rc = __unmap_and_move(page, newpage, force, offlining, mode);
+
+	if (is_balloon_page(newpage)) {
+		/*
+		 * A ballooned page has been migrated already. Now, it is the
+		 * time to wrap-up counters, handle the old page back to Buddy
+		 * and return.
+		 */
+		list_del(&page->lru);
+		dec_zone_page_state(page, NR_ISOLATED_ANON +
+				    page_is_file_cache(page));
+		put_page(page);
+		__free_page(page);
+		return rc;
+	}
 out:
 	if (rc != -EAGAIN) {
 		/*
-- 
1.7.10.4



* [PATCH v3 1/4] mm: introduce compaction and migration for virtio ballooned pages
@ 2012-07-03 23:48   ` Rafael Aquini
  0 siblings, 0 replies; 20+ messages in thread
From: Rafael Aquini @ 2012-07-03 23:48 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, virtualization, Rusty Russell, Michael S. Tsirkin,
	Rik van Riel, Mel Gorman, Andi Kleen, Andrew Morton,
	Konrad Rzeszutek Wilk, Minchan Kim, Rafael Aquini

This patch introduces the helper functions as well as the necessary changes
to teach compaction and migration bits how to cope with pages which are
part of a guest memory balloon, in order to make them movable by memory
compaction procedures.

Signed-off-by: Rafael Aquini <aquini@redhat.com>
---
 include/linux/mm.h |   15 +++++++
 mm/compaction.c    |  126 ++++++++++++++++++++++++++++++++++++++++++++--------
 mm/migrate.c       |   30 ++++++++++++-
 3 files changed, 151 insertions(+), 20 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index b36d08c..3112198 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1629,5 +1629,20 @@ static inline unsigned int debug_guardpage_minorder(void) { return 0; }
 static inline bool page_is_guard(struct page *page) { return false; }
 #endif /* CONFIG_DEBUG_PAGEALLOC */
 
+#if (defined(CONFIG_VIRTIO_BALLOON) || \
+	defined(CONFIG_VIRTIO_BALLOON_MODULE)) && defined(CONFIG_COMPACTION)
+extern bool putback_balloon_page(struct page *);
+extern struct address_space *balloon_mapping;
+
+static inline bool is_balloon_page(struct page *page)
+{
+	return (page->mapping == balloon_mapping) ? true : false;
+}
+#else
+static inline bool is_balloon_page(struct page *page)       { return false; }
+static inline bool isolate_balloon_page(struct page *page)  { return false; }
+static inline bool putback_balloon_page(struct page *page)  { return false; }
+#endif /* (VIRTIO_BALLOON || VIRTIO_BALLOON_MODULE) && COMPACTION */
+
 #endif /* __KERNEL__ */
 #endif /* _LINUX_MM_H */
diff --git a/mm/compaction.c b/mm/compaction.c
index 7ea259d..887d0fc 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -14,6 +14,7 @@
 #include <linux/backing-dev.h>
 #include <linux/sysctl.h>
 #include <linux/sysfs.h>
+#include <linux/export.h>
 #include "internal.h"
 
 #if defined CONFIG_COMPACTION || defined CONFIG_CMA
@@ -21,6 +22,85 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/compaction.h>
 
+#if defined(CONFIG_VIRTIO_BALLOON) || defined(CONFIG_VIRTIO_BALLOON_MODULE)
+/*
+ * Balloon pages special page->mapping.
+ * Users must properly allocate and initialize an instance of balloon_mapping,
+ * and set it as the page->mapping for balloon enlisted page instances.
+ * There is no need on utilizing struct address_space locking schemes for
+ * balloon_mapping as, once it gets initialized at balloon driver, it will
+ * remain just like a static reference that helps us on identifying a guest
+ * ballooned page by its mapping, as well as it will keep the 'a_ops' callback
+ * pointers to the functions that will execute the balloon page mobility tasks.
+ *
+ * address_space_operations necessary methods for ballooned pages:
+ *   .migratepage    - used to perform balloon's page migration (as is)
+ *   .invalidatepage - used to isolate a page from balloon's page list
+ *   .freepage       - used to reinsert an isolated page to balloon's page list
+ */
+struct address_space *balloon_mapping;
+EXPORT_SYMBOL_GPL(balloon_mapping);
+
+static inline void __isolate_balloon_page(struct page *page)
+{
+	page->mapping->a_ops->invalidatepage(page, 0);
+}
+
+static inline void __putback_balloon_page(struct page *page)
+{
+	page->mapping->a_ops->freepage(page);
+}
+
+/* __isolate_lru_page() counterpart for a ballooned page */
+static bool isolate_balloon_page(struct page *page)
+{
+	if (WARN_ON(!is_balloon_page(page)))
+		return false;
+
+	if (likely(get_page_unless_zero(page))) {
+		/*
+		 * We can race against move_to_new_page() & __unmap_and_move().
+		 * If we stumble across a locked balloon page and succeed on
+		 * isolating it, the result tends to be disastrous.
+		 */
+		if (likely(trylock_page(page))) {
+			/*
+			 * A ballooned page, by default, has just one refcount.
+			 * Prevent concurrent compaction threads from isolating
+			 * an already isolated balloon page.
+			 */
+			if (is_balloon_page(page) && (page_count(page) == 2)) {
+				__isolate_balloon_page(page);
+				unlock_page(page);
+				return true;
+			}
+			unlock_page(page);
+		}
+		/* Drop refcount taken for this already isolated page */
+		put_page(page);
+	}
+	return false;
+}
+
+/* putback_lru_page() counterpart for a ballooned page */
+bool putback_balloon_page(struct page *page)
+{
+	if (WARN_ON(!is_balloon_page(page)))
+		return false;
+
+	if (likely(trylock_page(page))) {
+		if (is_balloon_page(page)) {
+			__putback_balloon_page(page);
+			put_page(page);
+			unlock_page(page);
+			return true;
+		}
+		unlock_page(page);
+	}
+	return false;
+}
+#endif /* CONFIG_VIRTIO_BALLOON || CONFIG_VIRTIO_BALLOON_MODULE */
+
 static unsigned long release_freepages(struct list_head *freelist)
 {
 	struct page *page, *next;
@@ -312,32 +392,40 @@ isolate_migratepages_range(struct zone *zone, struct compact_control *cc,
 			continue;
 		}
 
-		if (!PageLRU(page))
-			continue;
-
 		/*
-		 * PageLRU is set, and lru_lock excludes isolation,
-		 * splitting and collapsing (collapsing has already
-		 * happened if PageLRU is set).
+		 * It is possible to migrate LRU pages and balloon pages.
+		 * Skip any other type of page.
 		 */
-		if (PageTransHuge(page)) {
-			low_pfn += (1 << compound_order(page)) - 1;
-			continue;
-		}
+		if (PageLRU(page)) {
+			/*
+			 * PageLRU is set, and lru_lock excludes isolation,
+			 * splitting and collapsing (collapsing has already
+			 * happened if PageLRU is set).
+			 */
+			if (PageTransHuge(page)) {
+				low_pfn += (1 << compound_order(page)) - 1;
+				continue;
+			}
 
-		if (!cc->sync)
-			mode |= ISOLATE_ASYNC_MIGRATE;
+			if (!cc->sync)
+				mode |= ISOLATE_ASYNC_MIGRATE;
 
-		lruvec = mem_cgroup_page_lruvec(page, zone);
+			lruvec = mem_cgroup_page_lruvec(page, zone);
 
-		/* Try isolate the page */
-		if (__isolate_lru_page(page, mode) != 0)
-			continue;
+			/* Try isolate the page */
+			if (__isolate_lru_page(page, mode) != 0)
+				continue;
+
+			VM_BUG_ON(PageTransCompound(page));
 
-		VM_BUG_ON(PageTransCompound(page));
+			/* Successfully isolated */
+			del_page_from_lru_list(page, lruvec, page_lru(page));
+		} else if (is_balloon_page(page)) {
+			if (!isolate_balloon_page(page))
+				continue;
+		} else
+			continue;
 
-		/* Successfully isolated */
-		del_page_from_lru_list(page, lruvec, page_lru(page));
 		list_add(&page->lru, migratelist);
 		cc->nr_migratepages++;
 		nr_isolated++;
diff --git a/mm/migrate.c b/mm/migrate.c
index be26d5c..59c7bc5 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -78,7 +78,10 @@ void putback_lru_pages(struct list_head *l)
 		list_del(&page->lru);
 		dec_zone_page_state(page, NR_ISOLATED_ANON +
 				page_is_file_cache(page));
-		putback_lru_page(page);
+		if (unlikely(is_balloon_page(page)))
+			WARN_ON(!putback_balloon_page(page));
+		else
+			putback_lru_page(page);
 	}
 }
 
@@ -783,6 +786,17 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 		}
 	}
 
+	if (is_balloon_page(page)) {
+		/*
+		 * A ballooned page does not need any special attention from
+		 * physical to virtual reverse mapping procedures.
+		 * Skip any attempt to unmap PTEs or to remap swap cache,
+		 * in order to avoid burning cycles at rmap level.
+		 */
+		remap_swapcache = 0;
+		goto skip_unmap;
+	}
+
 	/*
 	 * Corner case handling:
 	 * 1. When a new swap-cache page is read into, it is added to the LRU
@@ -852,6 +866,20 @@ static int unmap_and_move(new_page_t get_new_page, unsigned long private,
 			goto out;
 
 	rc = __unmap_and_move(page, newpage, force, offlining, mode);
+
+	if (is_balloon_page(newpage)) {
+		/*
+		 * A ballooned page has been migrated already. Now, it is the
+		 * time to wrap-up counters, handle the old page back to Buddy
+		 * and return.
+		 */
+		list_del(&page->lru);
+		dec_zone_page_state(page, NR_ISOLATED_ANON +
+				    page_is_file_cache(page));
+		put_page(page);
+		__free_page(page);
+		return rc;
+	}
 out:
 	if (rc != -EAGAIN) {
 		/*
-- 
1.7.10.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v3 1/4] mm: introduce compaction and migration for virtio ballooned pages
  2012-07-03 23:48 ` Rafael Aquini
  (?)
@ 2012-07-03 23:48 ` Rafael Aquini
  -1 siblings, 0 replies; 20+ messages in thread
From: Rafael Aquini @ 2012-07-03 23:48 UTC (permalink / raw)
  To: linux-mm
  Cc: Rik van Riel, Rafael Aquini, Konrad Rzeszutek Wilk,
	Michael S. Tsirkin, linux-kernel, virtualization, Minchan Kim,
	Andi Kleen, Andrew Morton

This patch introduces the helper functions as well as the necessary changes
to teach compaction and migration bits how to cope with pages which are
part of a guest memory balloon, in order to make them movable by memory
compaction procedures.

Signed-off-by: Rafael Aquini <aquini@redhat.com>
---
 include/linux/mm.h |   15 +++++++
 mm/compaction.c    |  126 ++++++++++++++++++++++++++++++++++++++++++++--------
 mm/migrate.c       |   30 ++++++++++++-
 3 files changed, 151 insertions(+), 20 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index b36d08c..3112198 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1629,5 +1629,20 @@ static inline unsigned int debug_guardpage_minorder(void) { return 0; }
 static inline bool page_is_guard(struct page *page) { return false; }
 #endif /* CONFIG_DEBUG_PAGEALLOC */
 
+#if (defined(CONFIG_VIRTIO_BALLOON) || \
+	defined(CONFIG_VIRTIO_BALLOON_MODULE)) && defined(CONFIG_COMPACTION)
+extern bool putback_balloon_page(struct page *);
+extern struct address_space *balloon_mapping;
+
+static inline bool is_balloon_page(struct page *page)
+{
+	return (page->mapping == balloon_mapping) ? true : false;
+}
+#else
+static inline bool is_balloon_page(struct page *page)       { return false; }
+static inline bool isolate_balloon_page(struct page *page)  { return false; }
+static inline bool putback_balloon_page(struct page *page)  { return false; }
+#endif /* (VIRTIO_BALLOON || VIRTIO_BALLOON_MODULE) && COMPACTION */
+
 #endif /* __KERNEL__ */
 #endif /* _LINUX_MM_H */
diff --git a/mm/compaction.c b/mm/compaction.c
index 7ea259d..887d0fc 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -14,6 +14,7 @@
 #include <linux/backing-dev.h>
 #include <linux/sysctl.h>
 #include <linux/sysfs.h>
+#include <linux/export.h>
 #include "internal.h"
 
 #if defined CONFIG_COMPACTION || defined CONFIG_CMA
@@ -21,6 +22,85 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/compaction.h>
 
+#if defined(CONFIG_VIRTIO_BALLOON) || defined(CONFIG_VIRTIO_BALLOON_MODULE)
+/*
+ * Balloon pages special page->mapping.
+ * Users must properly allocate and initialize an instance of balloon_mapping,
+ * and set it as the page->mapping for balloon enlisted page instances.
+ * There is no need on utilizing struct address_space locking schemes for
+ * balloon_mapping as, once it gets initialized at balloon driver, it will
+ * remain just like a static reference that helps us on identifying a guest
+ * ballooned page by its mapping, as well as it will keep the 'a_ops' callback
+ * pointers to the functions that will execute the balloon page mobility tasks.
+ *
+ * address_space_operations necessary methods for ballooned pages:
+ *   .migratepage    - used to perform balloon's page migration (as is)
+ *   .invalidatepage - used to isolate a page from balloon's page list
+ *   .freepage       - used to reinsert an isolated page to balloon's page list
+ */
+struct address_space *balloon_mapping;
+EXPORT_SYMBOL_GPL(balloon_mapping);
+
+static inline void __isolate_balloon_page(struct page *page)
+{
+	page->mapping->a_ops->invalidatepage(page, 0);
+}
+
+static inline void __putback_balloon_page(struct page *page)
+{
+	page->mapping->a_ops->freepage(page);
+}
+
+/* __isolate_lru_page() counterpart for a ballooned page */
+static bool isolate_balloon_page(struct page *page)
+{
+	if (WARN_ON(!is_balloon_page(page)))
+		return false;
+
+	if (likely(get_page_unless_zero(page))) {
+		/*
+		 * We can race against move_to_new_page() & __unmap_and_move().
+		 * If we stumble across a locked balloon page and succeed on
+		 * isolating it, the result tends to be disastrous.
+		 */
+		if (likely(trylock_page(page))) {
+			/*
+			 * A ballooned page, by default, has just one refcount.
+			 * Prevent concurrent compaction threads from isolating
+			 * an already isolated balloon page.
+			 */
+			if (is_balloon_page(page) && (page_count(page) == 2)) {
+				__isolate_balloon_page(page);
+				unlock_page(page);
+				return true;
+			}
+			unlock_page(page);
+		}
+		/* Drop refcount taken for this already isolated page */
+		put_page(page);
+	}
+	return false;
+}
+
+/* putback_lru_page() counterpart for a ballooned page */
+bool putback_balloon_page(struct page *page)
+{
+	if (WARN_ON(!is_balloon_page(page)))
+		return false;
+
+	if (likely(trylock_page(page))) {
+		if (is_balloon_page(page)) {
+			__putback_balloon_page(page);
+			put_page(page);
+			unlock_page(page);
+			return true;
+		}
+		unlock_page(page);
+	}
+	return false;
+}
+#endif /* CONFIG_VIRTIO_BALLOON || CONFIG_VIRTIO_BALLOON_MODULE */
+
 static unsigned long release_freepages(struct list_head *freelist)
 {
 	struct page *page, *next;
@@ -312,32 +392,40 @@ isolate_migratepages_range(struct zone *zone, struct compact_control *cc,
 			continue;
 		}
 
-		if (!PageLRU(page))
-			continue;
-
 		/*
-		 * PageLRU is set, and lru_lock excludes isolation,
-		 * splitting and collapsing (collapsing has already
-		 * happened if PageLRU is set).
+		 * It is possible to migrate LRU pages and balloon pages.
+		 * Skip any other type of page.
 		 */
-		if (PageTransHuge(page)) {
-			low_pfn += (1 << compound_order(page)) - 1;
-			continue;
-		}
+		if (PageLRU(page)) {
+			/*
+			 * PageLRU is set, and lru_lock excludes isolation,
+			 * splitting and collapsing (collapsing has already
+			 * happened if PageLRU is set).
+			 */
+			if (PageTransHuge(page)) {
+				low_pfn += (1 << compound_order(page)) - 1;
+				continue;
+			}
 
-		if (!cc->sync)
-			mode |= ISOLATE_ASYNC_MIGRATE;
+			if (!cc->sync)
+				mode |= ISOLATE_ASYNC_MIGRATE;
 
-		lruvec = mem_cgroup_page_lruvec(page, zone);
+			lruvec = mem_cgroup_page_lruvec(page, zone);
 
-		/* Try isolate the page */
-		if (__isolate_lru_page(page, mode) != 0)
-			continue;
+			/* Try isolate the page */
+			if (__isolate_lru_page(page, mode) != 0)
+				continue;
+
+			VM_BUG_ON(PageTransCompound(page));
 
-		VM_BUG_ON(PageTransCompound(page));
+			/* Successfully isolated */
+			del_page_from_lru_list(page, lruvec, page_lru(page));
+		} else if (is_balloon_page(page)) {
+			if (!isolate_balloon_page(page))
+				continue;
+		} else
+			continue;
 
-		/* Successfully isolated */
-		del_page_from_lru_list(page, lruvec, page_lru(page));
 		list_add(&page->lru, migratelist);
 		cc->nr_migratepages++;
 		nr_isolated++;
diff --git a/mm/migrate.c b/mm/migrate.c
index be26d5c..59c7bc5 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -78,7 +78,10 @@ void putback_lru_pages(struct list_head *l)
 		list_del(&page->lru);
 		dec_zone_page_state(page, NR_ISOLATED_ANON +
 				page_is_file_cache(page));
-		putback_lru_page(page);
+		if (unlikely(is_balloon_page(page)))
+			WARN_ON(!putback_balloon_page(page));
+		else
+			putback_lru_page(page);
 	}
 }
 
@@ -783,6 +786,17 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 		}
 	}
 
+	if (is_balloon_page(page)) {
+		/*
+		 * A ballooned page does not need any special attention from
+		 * physical to virtual reverse mapping procedures.
+		 * Skip any attempt to unmap PTEs or to remap swap cache,
+		 * in order to avoid burning cycles at rmap level.
+		 */
+		remap_swapcache = 0;
+		goto skip_unmap;
+	}
+
 	/*
 	 * Corner case handling:
 	 * 1. When a new swap-cache page is read into, it is added to the LRU
@@ -852,6 +866,20 @@ static int unmap_and_move(new_page_t get_new_page, unsigned long private,
 			goto out;
 
 	rc = __unmap_and_move(page, newpage, force, offlining, mode);
+
+	if (is_balloon_page(newpage)) {
+		/*
+		 * A ballooned page has been migrated already. Now it is
+		 * time to wrap up counters, hand the old page back to
+		 * Buddy and return.
+		 */
+		list_del(&page->lru);
+		dec_zone_page_state(page, NR_ISOLATED_ANON +
+				    page_is_file_cache(page));
+		put_page(page);
+		__free_page(page);
+		return rc;
+	}
 out:
 	if (rc != -EAGAIN) {
 		/*
-- 
1.7.10.4

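The isolation/putback helpers introduced by this patch rely on a trylock-then-recheck discipline: test the page, take the page lock opportunistically, then test again under the lock before touching the balloon list. A minimal user-space sketch of that shape follows; all types and names (`mock_page`, `mock_putback_balloon_page`) are illustrative stand-ins for the kernel primitives, not the real API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct mock_page {
	pthread_mutex_t lock;	/* stands in for the page lock */
	bool is_balloon;	/* stands in for is_balloon_page() */
	int refcount;		/* stands in for the page refcount */
};

/* putback_balloon_page() analogue: re-insert an isolated page. */
bool mock_putback_balloon_page(struct mock_page *page)
{
	if (!page->is_balloon)	/* WARN_ON() path in the original */
		return false;

	if (pthread_mutex_trylock(&page->lock) == 0) {
		/*
		 * Re-check under the lock: the balloon driver may have
		 * deflated the page between the first test and the
		 * moment the lock was actually acquired.
		 */
		if (page->is_balloon) {
			page->refcount--;	/* put_page() analogue */
			pthread_mutex_unlock(&page->lock);
			return true;
		}
		pthread_mutex_unlock(&page->lock);
	}
	return false;
}
```

The second `is_balloon` test is the point of the pattern: the first test is only a cheap filter, and any decision that mutates state must be made while the lock is held.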
^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v3 2/4] virtio_balloon: handle concurrent accesses to virtio_balloon struct elements
  2012-07-03 23:48 ` Rafael Aquini
@ 2012-07-03 23:48   ` Rafael Aquini
  -1 siblings, 0 replies; 20+ messages in thread
From: Rafael Aquini @ 2012-07-03 23:48 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, virtualization, Rusty Russell, Michael S. Tsirkin,
	Rik van Riel, Mel Gorman, Andi Kleen, Andrew Morton,
	Konrad Rzeszutek Wilk, Minchan Kim, Rafael Aquini

This patch introduces access synchronization for the critical elements of struct
virtio_balloon, in order to cope with the concurrency that the compaction/migration
bits may end up imposing on the balloon driver in several situations.

Signed-off-by: Rafael Aquini <aquini@redhat.com>
---
 drivers/virtio/virtio_balloon.c |   45 +++++++++++++++++++++++++++++----------
 1 file changed, 34 insertions(+), 11 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index bfbc15c..d47c5c2 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -51,6 +51,10 @@ struct virtio_balloon
 
 	/* Number of balloon pages we've told the Host we're not using. */
 	unsigned int num_pages;
+
+	/* Protect 'pages', 'pfns' & 'num_pfns' against concurrent updates */
+	spinlock_t pfn_list_lock;
+
 	/*
 	 * The pages we've told the Host we're not using.
 	 * Each page on this list adds VIRTIO_BALLOON_PAGES_PER_PAGE
@@ -97,21 +101,23 @@ static void balloon_ack(struct virtqueue *vq)
 		complete(&vb->acked);
 }
 
-static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq)
-{
-	struct scatterlist sg;
-
-	sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns);
+/* Protection for concurrent accesses to balloon virtqueues and vb->acked */
+DEFINE_MUTEX(vb_queue_completion);
 
+static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq,
+		      struct scatterlist *sg)
+{
+	mutex_lock(&vb_queue_completion);
 	init_completion(&vb->acked);
 
 	/* We should always be able to add one buffer to an empty queue. */
-	if (virtqueue_add_buf(vq, &sg, 1, 0, vb, GFP_KERNEL) < 0)
+	if (virtqueue_add_buf(vq, sg, 1, 0, vb, GFP_KERNEL) < 0)
 		BUG();
 	virtqueue_kick(vq);
 
 	/* When host has read buffer, this completes via balloon_ack */
 	wait_for_completion(&vb->acked);
+	mutex_unlock(&vb_queue_completion);
 }
 
 static void set_page_pfns(u32 pfns[], struct page *page)
@@ -126,9 +132,12 @@ static void set_page_pfns(u32 pfns[], struct page *page)
 
 static void fill_balloon(struct virtio_balloon *vb, size_t num)
 {
+	struct scatterlist sg;
+	int alloc_failed = 0;
 	/* We can only do one array worth at a time. */
 	num = min(num, ARRAY_SIZE(vb->pfns));
 
+	spin_lock(&vb->pfn_list_lock);
 	for (vb->num_pfns = 0; vb->num_pfns < num;
 	     vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
 		struct page *page = alloc_page(GFP_HIGHUSER | __GFP_NORETRY |
@@ -138,8 +147,7 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num)
 				dev_printk(KERN_INFO, &vb->vdev->dev,
 					   "Out of puff! Can't get %zu pages\n",
 					   num);
-			/* Sleep for at least 1/5 of a second before retry. */
-			msleep(200);
+			alloc_failed = 1;
 			break;
 		}
 		set_page_pfns(vb->pfns + vb->num_pfns, page);
@@ -149,10 +157,19 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num)
 	}
 
 	/* Didn't get any?  Oh well. */
-	if (vb->num_pfns == 0)
+	if (vb->num_pfns == 0) {
+		spin_unlock(&vb->pfn_list_lock);
 		return;
+	}
+
+	sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns);
+	spin_unlock(&vb->pfn_list_lock);
 
-	tell_host(vb, vb->inflate_vq);
+	/* alloc_page failed, sleep for at least 1/5 of a sec before retry. */
+	if (alloc_failed)
+		msleep(200);
+
+	tell_host(vb, vb->inflate_vq, &sg);
 }
 
 static void release_pages_by_pfn(const u32 pfns[], unsigned int num)
@@ -169,10 +186,12 @@ static void release_pages_by_pfn(const u32 pfns[], unsigned int num)
 static void leak_balloon(struct virtio_balloon *vb, size_t num)
 {
 	struct page *page;
+	struct scatterlist sg;
 
 	/* We can only do one array worth at a time. */
 	num = min(num, ARRAY_SIZE(vb->pfns));
 
+	spin_lock(&vb->pfn_list_lock);
 	for (vb->num_pfns = 0; vb->num_pfns < num;
 	     vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
 		page = list_first_entry(&vb->pages, struct page, lru);
@@ -180,13 +199,15 @@ static void leak_balloon(struct virtio_balloon *vb, size_t num)
 		set_page_pfns(vb->pfns + vb->num_pfns, page);
 		vb->num_pages -= VIRTIO_BALLOON_PAGES_PER_PAGE;
 	}
+	sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns);
+	spin_unlock(&vb->pfn_list_lock);
 
 	/*
 	 * Note that if
 	 * virtio_has_feature(vdev, VIRTIO_BALLOON_F_MUST_TELL_HOST);
 	 * is true, we *have* to do it in this order
 	 */
-	tell_host(vb, vb->deflate_vq);
+	tell_host(vb, vb->deflate_vq, &sg);
 	release_pages_by_pfn(vb->pfns, vb->num_pfns);
 }
 
@@ -356,6 +377,8 @@ static int virtballoon_probe(struct virtio_device *vdev)
 	}
 
 	INIT_LIST_HEAD(&vb->pages);
+	spin_lock_init(&vb->pfn_list_lock);
+
 	vb->num_pages = 0;
 	init_waitqueue_head(&vb->config_change);
 	vb->vdev = vdev;
-- 
1.7.10.4


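The reshaped fill_balloon()/leak_balloon() above follow one locking shape: snapshot the pfn array into the scatterlist while pfn_list_lock is held, and run anything that may sleep (tell_host(), msleep()) only after the lock is dropped, since a spinlock must not be held across blocking calls. A user-space sketch of that shape, with mock names (`mock_balloon`, `mock_tell_host`) that are illustrative only:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define MAX_PFNS 256

struct mock_balloon {
	pthread_mutex_t pfn_list_lock;	/* spin_lock analogue */
	unsigned int pfns[MAX_PFNS];
	size_t num_pfns;
};

/* tell_host() stand-in: may block, so it must run unlocked. */
void mock_tell_host(const unsigned int *pfns, size_t n)
{
	(void)pfns;
	(void)n;
}

/* Build a private snapshot under the lock, notify outside it. */
size_t mock_fill_and_tell(struct mock_balloon *vb,
			  const unsigned int *new_pfns, size_t n)
{
	unsigned int snapshot[MAX_PFNS];
	size_t count;

	pthread_mutex_lock(&vb->pfn_list_lock);
	for (count = 0; count < n && count < MAX_PFNS; count++)
		vb->pfns[count] = new_pfns[count];
	vb->num_pfns = count;
	/* sg_init_one() analogue: capture the list while locked */
	memcpy(snapshot, vb->pfns, count * sizeof(snapshot[0]));
	pthread_mutex_unlock(&vb->pfn_list_lock);

	/* Blocking work happens only after the lock is released. */
	mock_tell_host(snapshot, count);
	return count;
}
```

This mirrors why the patch moves sg_init_one() inside the locked region but defers both msleep() and tell_host() until after spin_unlock().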
^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v3 3/4] virtio_balloon: introduce migration primitives to balloon pages
  2012-07-03 23:48 ` Rafael Aquini
@ 2012-07-03 23:48   ` Rafael Aquini
  -1 siblings, 0 replies; 20+ messages in thread
From: Rafael Aquini @ 2012-07-03 23:48 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, virtualization, Rusty Russell, Michael S. Tsirkin,
	Rik van Riel, Mel Gorman, Andi Kleen, Andrew Morton,
	Konrad Rzeszutek Wilk, Minchan Kim, Rafael Aquini

This patch makes balloon pages movable at allocation time and introduces the
infrastructure needed to perform the balloon page migration operation.

Signed-off-by: Rafael Aquini <aquini@redhat.com>
---
 drivers/virtio/virtio_balloon.c |   96 ++++++++++++++++++++++++++++++++++++++-
 include/linux/virtio_balloon.h  |    4 ++
 2 files changed, 98 insertions(+), 2 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index d47c5c2..53386aa 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -27,6 +27,8 @@
 #include <linux/delay.h>
 #include <linux/slab.h>
 #include <linux/module.h>
+#include <linux/fs.h>
+#include <linux/pagemap.h>
 
 /*
  * Balloon device works in 4K page units.  So each page is pointed to by
@@ -140,8 +142,9 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num)
 	spin_lock(&vb->pfn_list_lock);
 	for (vb->num_pfns = 0; vb->num_pfns < num;
 	     vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
-		struct page *page = alloc_page(GFP_HIGHUSER | __GFP_NORETRY |
-					__GFP_NOMEMALLOC | __GFP_NOWARN);
+		struct page *page = alloc_page(GFP_HIGHUSER_MOVABLE |
+						__GFP_NORETRY | __GFP_NOWARN |
+						__GFP_NOMEMALLOC);
 		if (!page) {
 			if (printk_ratelimit())
 				dev_printk(KERN_INFO, &vb->vdev->dev,
@@ -154,6 +157,7 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num)
 		vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
 		totalram_pages--;
 		list_add(&page->lru, &vb->pages);
+		page->mapping = balloon_mapping;
 	}
 
 	/* Didn't get any?  Oh well. */
@@ -195,6 +199,7 @@ static void leak_balloon(struct virtio_balloon *vb, size_t num)
 	for (vb->num_pfns = 0; vb->num_pfns < num;
 	     vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
 		page = list_first_entry(&vb->pages, struct page, lru);
+		page->mapping = NULL;
 		list_del(&page->lru);
 		set_page_pfns(vb->pfns + vb->num_pfns, page);
 		vb->num_pages -= VIRTIO_BALLOON_PAGES_PER_PAGE;
@@ -365,6 +370,77 @@ static int init_vqs(struct virtio_balloon *vb)
 	return 0;
 }
 
+/*
+ * Populate balloon_mapping->a_ops->migratepage method to perform the balloon
+ * page migration task.
+ *
+ * After a ballooned page gets isolated by compaction procedures, this is the
+ * function that performs the page migration on behalf of move_to_new_page(),
+ * when the latter calls page->mapping->a_ops->migratepage.
+ *
+ * Page migration for virtio balloon is done in a simple swap fashion which
+ * follows these two steps:
+ *  1) insert newpage into vb->pages list and update the host about it;
+ *  2) notify the host about the old page removed from vb->pages list;
+ */
+int virtballoon_migratepage(struct address_space *mapping,
+		struct page *newpage, struct page *page, enum migrate_mode mode)
+{
+	struct virtio_balloon *vb = (void *)mapping->backing_dev_info;
+	struct scatterlist sg;
+
+	/* balloon's page migration 1st step */
+	spin_lock(&vb->pfn_list_lock);
+	vb->num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;
+	list_add(&newpage->lru, &vb->pages);
+	set_page_pfns(vb->pfns, newpage);
+	sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns);
+	spin_unlock(&vb->pfn_list_lock);
+	tell_host(vb, vb->inflate_vq, &sg);
+
+	/* balloon's page migration 2nd step */
+	spin_lock(&vb->pfn_list_lock);
+	vb->num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;
+	set_page_pfns(vb->pfns, page);
+	sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns);
+	spin_unlock(&vb->pfn_list_lock);
+	tell_host(vb, vb->deflate_vq, &sg);
+
+	return 0;
+}
+
+/*
+ * Populate balloon_mapping->a_ops->invalidatepage method to help compaction on
+ * isolating a page from the balloon page list.
+ */
+void virtballoon_isolatepage(struct page *page, unsigned long mode)
+{
+	struct address_space *mapping = page->mapping;
+	struct virtio_balloon *vb = (void *)mapping->backing_dev_info;
+	spin_lock(&vb->pfn_list_lock);
+	list_del(&page->lru);
+	spin_unlock(&vb->pfn_list_lock);
+}
+
+/*
+ * Populate balloon_mapping->a_ops->freepage method to help compaction on
+ * re-inserting an isolated page into the balloon page list.
+ */
+void virtballoon_putbackpage(struct page *page)
+{
+	struct address_space *mapping = page->mapping;
+	struct virtio_balloon *vb = (void *)mapping->backing_dev_info;
+	spin_lock(&vb->pfn_list_lock);
+	list_add(&page->lru, &vb->pages);
+	spin_unlock(&vb->pfn_list_lock);
+}
+
+static const struct address_space_operations virtio_balloon_aops = {
+	.migratepage = virtballoon_migratepage,
+	.invalidatepage = virtballoon_isolatepage,
+	.freepage = virtballoon_putbackpage,
+};
+
 static int virtballoon_probe(struct virtio_device *vdev)
 {
 	struct virtio_balloon *vb;
@@ -384,6 +460,19 @@ static int virtballoon_probe(struct virtio_device *vdev)
 	vb->vdev = vdev;
 	vb->need_stats_update = 0;
 
+	/* Init the ballooned page->mapping special balloon_mapping */
+	balloon_mapping = kmalloc(sizeof(*balloon_mapping), GFP_KERNEL);
+	if (!balloon_mapping) {
+		err = -ENOMEM;
+		goto out_free_mapping;
+	}
+
+	INIT_RADIX_TREE(&balloon_mapping->page_tree, GFP_ATOMIC | __GFP_NOWARN);
+	INIT_LIST_HEAD(&balloon_mapping->i_mmap_nonlinear);
+	spin_lock_init(&balloon_mapping->tree_lock);
+	balloon_mapping->a_ops = &virtio_balloon_aops;
+	balloon_mapping->backing_dev_info = (void *)vb;
+
 	err = init_vqs(vb);
 	if (err)
 		goto out_free_vb;
@@ -398,6 +487,8 @@ static int virtballoon_probe(struct virtio_device *vdev)
 
 out_del_vqs:
 	vdev->config->del_vqs(vdev);
+out_free_mapping:
+	kfree(balloon_mapping);
 out_free_vb:
 	kfree(vb);
 out:
@@ -424,6 +515,7 @@ static void __devexit virtballoon_remove(struct virtio_device *vdev)
 	kthread_stop(vb->thread);
 	remove_common(vb);
 	kfree(vb);
+	kfree(balloon_mapping);
 }
 
 #ifdef CONFIG_PM
diff --git a/include/linux/virtio_balloon.h b/include/linux/virtio_balloon.h
index 652dc8b..930f1b7 100644
--- a/include/linux/virtio_balloon.h
+++ b/include/linux/virtio_balloon.h
@@ -56,4 +56,8 @@ struct virtio_balloon_stat {
 	u64 val;
 } __attribute__((packed));
 
+#if !defined(CONFIG_COMPACTION)
+struct address_space *balloon_mapping;
+#endif
+
 #endif /* _LINUX_VIRTIO_BALLOON_H */
-- 
1.7.10.4


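The two-step swap performed by virtballoon_migratepage() above can be sketched outside the kernel: step 1 enlists the new page and tells the host to inflate it; step 2 only notifies the host that the old page left the balloon, because the isolate callback already delisted it and unmap_and_move() returns it to Buddy. All names below (`mock_host`, `mock_migratepage`, integer "pages") are illustrative mocks, with counters standing in for the inflate/deflate virtqueue traffic.

```c
#include <assert.h>

struct mock_host {
	int inflated;	/* pages the host was told are ballooned */
	int deflated;	/* pages the host was told were released */
};

struct mock_balloon_list {
	int pages[8];	/* stand-in for the vb->pages list */
	int n;
};

/* virtballoon_migratepage() analogue: swap oldpage for newpage. */
int mock_migratepage(struct mock_balloon_list *vb, struct mock_host *host,
		     int newpage, int oldpage)
{
	(void)oldpage;	/* already delisted by the isolate step */

	/* Step 1: enlist newpage, tell the host (inflate_vq analogue). */
	vb->pages[vb->n++] = newpage;
	host->inflated++;

	/*
	 * Step 2: tell the host the old page left the balloon
	 * (deflate_vq analogue); the page itself is handed back to
	 * Buddy by the migration core, not here.
	 */
	host->deflated++;

	return 0;	/* success, as the real callback returns 0 */
}
```

The ordering matters: the guest-visible balloon size stays constant because the inflate notification for the new page precedes the deflate notification for the old one.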
^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v3 3/4] virtio_balloon: introduce migration primitives to balloon pages
@ 2012-07-03 23:48   ` Rafael Aquini
  0 siblings, 0 replies; 20+ messages in thread
From: Rafael Aquini @ 2012-07-03 23:48 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, virtualization, Rusty Russell, Michael S. Tsirkin,
	Rik van Riel, Mel Gorman, Andi Kleen, Andrew Morton,
	Konrad Rzeszutek Wilk, Minchan Kim, Rafael Aquini

This patch makes balloon pages movable at allocation time and introduces the
infrastructure needed to perform the balloon page migration operation.

Signed-off-by: Rafael Aquini <aquini@redhat.com>
---
 drivers/virtio/virtio_balloon.c |   96 ++++++++++++++++++++++++++++++++++++++-
 include/linux/virtio_balloon.h  |    4 ++
 2 files changed, 98 insertions(+), 2 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index d47c5c2..53386aa 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -27,6 +27,8 @@
 #include <linux/delay.h>
 #include <linux/slab.h>
 #include <linux/module.h>
+#include <linux/fs.h>
+#include <linux/pagemap.h>
 
 /*
  * Balloon device works in 4K page units.  So each page is pointed to by
@@ -140,8 +142,9 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num)
 	spin_lock(&vb->pfn_list_lock);
 	for (vb->num_pfns = 0; vb->num_pfns < num;
 	     vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
-		struct page *page = alloc_page(GFP_HIGHUSER | __GFP_NORETRY |
-					__GFP_NOMEMALLOC | __GFP_NOWARN);
+		struct page *page = alloc_page(GFP_HIGHUSER_MOVABLE |
+						__GFP_NORETRY | __GFP_NOWARN |
+						__GFP_NOMEMALLOC);
 		if (!page) {
 			if (printk_ratelimit())
 				dev_printk(KERN_INFO, &vb->vdev->dev,
@@ -154,6 +157,7 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num)
 		vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
 		totalram_pages--;
 		list_add(&page->lru, &vb->pages);
+		page->mapping = balloon_mapping;
 	}
 
 	/* Didn't get any?  Oh well. */
@@ -195,6 +199,7 @@ static void leak_balloon(struct virtio_balloon *vb, size_t num)
 	for (vb->num_pfns = 0; vb->num_pfns < num;
 	     vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
 		page = list_first_entry(&vb->pages, struct page, lru);
+		page->mapping = NULL;
 		list_del(&page->lru);
 		set_page_pfns(vb->pfns + vb->num_pfns, page);
 		vb->num_pages -= VIRTIO_BALLOON_PAGES_PER_PAGE;
@@ -365,6 +370,77 @@ static int init_vqs(struct virtio_balloon *vb)
 	return 0;
 }
 
+/*
+ * Populate balloon_mapping->a_ops->migratepage method to perform the balloon
+ * page migration task.
+ *
+ * After a ballooned page gets isolated by compaction procedures, this is the
+ * function that performs the page migration on behalf of move_to_new_page(),
+ * when the last calls (page)->mapping->a_ops->migratepage.
+ *
+ * Page migration for virtio balloon is done in a simple swap fashion which
+ * follows these two steps:
+ *  1) insert newpage into vb->pages list and update the host about it;
+ *  2) update the host about the removed old page from vb->pages list;
+ */
+int virtballoon_migratepage(struct address_space *mapping,
+		struct page *newpage, struct page *page, enum migrate_mode mode)
+{
+	struct virtio_balloon *vb = (void *)mapping->backing_dev_info;
+	struct scatterlist sg;
+
+	/* balloon's page migration 1st step */
+	spin_lock(&vb->pfn_list_lock);
+	vb->num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;
+	list_add(&newpage->lru, &vb->pages);
+	set_page_pfns(vb->pfns, newpage);
+	sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns);
+	spin_unlock(&vb->pfn_list_lock);
+	tell_host(vb, vb->inflate_vq, &sg);
+
+	/* balloon's page migration 2nd step */
+	spin_lock(&vb->pfn_list_lock);
+	vb->num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;
+	set_page_pfns(vb->pfns, page);
+	sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns);
+	spin_unlock(&vb->pfn_list_lock);
+	tell_host(vb, vb->deflate_vq, &sg);
+
+	return 0;
+}
+
+/*
+ * Populate balloon_mapping->a_ops->invalidatepage method to help compaction on
+ * isolating a page from the balloon page list.
+ */
+void virtballoon_isolatepage(struct page *page, unsigned long mode)
+{
+	struct address_space *mapping = page->mapping;
+	struct virtio_balloon *vb = (void *)mapping->backing_dev_info;
+	spin_lock(&vb->pfn_list_lock);
+	list_del(&page->lru);
+	spin_unlock(&vb->pfn_list_lock);
+}
+
+/*
+ * Populate balloon_mapping->a_ops->freepage method to help compaction on
+ * re-inserting an isolated page into the balloon page list.
+ */
+void virtballoon_putbackpage(struct page *page)
+{
+	struct address_space *mapping = page->mapping;
+	struct virtio_balloon *vb = (void *)mapping->backing_dev_info;
+	spin_lock(&vb->pfn_list_lock);
+	list_add(&page->lru, &vb->pages);
+	spin_unlock(&vb->pfn_list_lock);
+}
+
+static const struct address_space_operations virtio_balloon_aops = {
+	.migratepage = virtballoon_migratepage,
+	.invalidatepage = virtballoon_isolatepage,
+	.freepage = virtballoon_putbackpage,
+};
+
 static int virtballoon_probe(struct virtio_device *vdev)
 {
 	struct virtio_balloon *vb;
@@ -384,6 +460,19 @@ static int virtballoon_probe(struct virtio_device *vdev)
 	vb->vdev = vdev;
 	vb->need_stats_update = 0;
 
+	/* Init the ballooned page->mapping special balloon_mapping */
+	balloon_mapping = kmalloc(sizeof(*balloon_mapping), GFP_KERNEL);
+	if (!balloon_mapping) {
+		err = -ENOMEM;
+		goto out_free_mapping;
+	}
+
+	INIT_RADIX_TREE(&balloon_mapping->page_tree, GFP_ATOMIC | __GFP_NOWARN);
+	INIT_LIST_HEAD(&balloon_mapping->i_mmap_nonlinear);
+	spin_lock_init(&balloon_mapping->tree_lock);
+	balloon_mapping->a_ops = &virtio_balloon_aops;
+	balloon_mapping->backing_dev_info = (void *)vb;
+
 	err = init_vqs(vb);
 	if (err)
 		goto out_free_vb;
@@ -398,6 +487,8 @@ static int virtballoon_probe(struct virtio_device *vdev)
 
 out_del_vqs:
 	vdev->config->del_vqs(vdev);
+out_free_mapping:
+	kfree(balloon_mapping);
 out_free_vb:
 	kfree(vb);
 out:
@@ -424,6 +515,7 @@ static void __devexit virtballoon_remove(struct virtio_device *vdev)
 	kthread_stop(vb->thread);
 	remove_common(vb);
 	kfree(vb);
+	kfree(balloon_mapping);
 }
 
 #ifdef CONFIG_PM
diff --git a/include/linux/virtio_balloon.h b/include/linux/virtio_balloon.h
index 652dc8b..930f1b7 100644
--- a/include/linux/virtio_balloon.h
+++ b/include/linux/virtio_balloon.h
@@ -56,4 +56,8 @@ struct virtio_balloon_stat {
 	u64 val;
 } __attribute__((packed));
 
+#if !defined(CONFIG_COMPACTION)
+struct address_space *balloon_mapping;
+#endif
+
 #endif /* _LINUX_VIRTIO_BALLOON_H */
-- 
1.7.10.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v3 4/4] mm: add vm event counters for balloon pages compaction
  2012-07-03 23:48 ` Rafael Aquini
@ 2012-07-03 23:48   ` Rafael Aquini
  -1 siblings, 0 replies; 20+ messages in thread
From: Rafael Aquini @ 2012-07-03 23:48 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, virtualization, Rusty Russell, Michael S. Tsirkin,
	Rik van Riel, Mel Gorman, Andi Kleen, Andrew Morton,
	Konrad Rzeszutek Wilk, Minchan Kim, Rafael Aquini

This patch is intended for testing and reporting purposes only, and shall be
dropped if the rest of this patchset is accepted for merging.

Signed-off-by: Rafael Aquini <aquini@redhat.com>
---
 drivers/virtio/virtio_balloon.c |    1 +
 include/linux/vm_event_item.h   |    2 ++
 mm/compaction.c                 |    1 +
 mm/migrate.c                    |    6 ++++--
 mm/vmstat.c                     |    4 ++++
 5 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 53386aa..c4a929d 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -406,6 +406,7 @@ int virtballoon_migratepage(struct address_space *mapping,
 	spin_unlock(&vb->pfn_list_lock);
 	tell_host(vb, vb->deflate_vq, &sg);
 
+	count_vm_event(COMPACTBALLOONMIGRATED);
 	return 0;
 }
 
diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index 06f8e38..e330c5a 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -40,6 +40,8 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
 #ifdef CONFIG_COMPACTION
 		COMPACTBLOCKS, COMPACTPAGES, COMPACTPAGEFAILED,
 		COMPACTSTALL, COMPACTFAIL, COMPACTSUCCESS,
+		COMPACTBALLOONMIGRATED, COMPACTBALLOONFAILED,
+		COMPACTBALLOONISOLATED, COMPACTBALLOONFREED,
 #endif
 #ifdef CONFIG_HUGETLB_PAGE
 		HTLB_BUDDY_PGALLOC, HTLB_BUDDY_PGALLOC_FAIL,
diff --git a/mm/compaction.c b/mm/compaction.c
index 887d0fc..8f7df01 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -72,6 +72,7 @@ static bool isolate_balloon_page(struct page *page)
 			if (is_balloon_page(page) && (page_count(page) == 2)) {
 				__isolate_balloon_page(page);
 				unlock_page(page);
+				count_vm_event(COMPACTBALLOONISOLATED);
 				return true;
 			}
 			unlock_page(page);
diff --git a/mm/migrate.c b/mm/migrate.c
index 59c7bc5..5838719 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -78,9 +78,10 @@ void putback_lru_pages(struct list_head *l)
 		list_del(&page->lru);
 		dec_zone_page_state(page, NR_ISOLATED_ANON +
 				page_is_file_cache(page));
-		if (unlikely(is_balloon_page(page)))
+		if (unlikely(is_balloon_page(page))) {
+			count_vm_event(COMPACTBALLOONFAILED);
 			WARN_ON(!putback_balloon_page(page));
-		else
+		} else
 			putback_lru_page(page);
 	}
 }
@@ -878,6 +879,7 @@ static int unmap_and_move(new_page_t get_new_page, unsigned long private,
 				    page_is_file_cache(page));
 		put_page(page);
 		__free_page(page);
+		count_vm_event(COMPACTBALLOONFREED);
 		return rc;
 	}
 out:
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 1bbbbd9..3b7109f 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -767,6 +767,10 @@ const char * const vmstat_text[] = {
 	"compact_stall",
 	"compact_fail",
 	"compact_success",
+	"compact_balloon_migrated",
+	"compact_balloon_failed",
+	"compact_balloon_isolated",
+	"compact_balloon_freed",
 #endif
 
 #ifdef CONFIG_HUGETLB_PAGE
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 2/4] virtio_balloon: handle concurrent accesses to virtio_balloon struct elements
  2012-07-03 23:48   ` Rafael Aquini
@ 2012-07-04  6:38     ` Rusty Russell
  -1 siblings, 0 replies; 20+ messages in thread
From: Rusty Russell @ 2012-07-04  6:38 UTC (permalink / raw)
  To: Rafael Aquini, linux-mm
  Cc: linux-kernel, virtualization, Michael S. Tsirkin, Rik van Riel,
	Mel Gorman, Andi Kleen, Andrew Morton, Konrad Rzeszutek Wilk,
	Minchan Kim, Rafael Aquini

On Tue,  3 Jul 2012 20:48:50 -0300, Rafael Aquini <aquini@redhat.com> wrote:
> This patch introduces access synchronization to critical elements of struct
> virtio_balloon, in order to allow for the thread concurrency that the
> compaction/migration bits might end up imposing on the balloon driver in
> several situations.
> 
> Signed-off-by: Rafael Aquini <aquini@redhat.com>

That's pretty vague, and it's almost impossible to audit this.

It looks very suspicious though:

> -static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq)
> -{
> -	struct scatterlist sg;
> -
> -	sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns);
> +/* Protection for concurrent accesses to balloon virtqueues and vb->acked */
> +DEFINE_MUTEX(vb_queue_completion);
>  
> +static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq,
> +		      struct scatterlist *sg)
> +{
> +	mutex_lock(&vb_queue_completion);
>  	init_completion(&vb->acked);
>  
>  	/* We should always be able to add one buffer to an empty queue. */
> -	if (virtqueue_add_buf(vq, &sg, 1, 0, vb, GFP_KERNEL) < 0)
> +	if (virtqueue_add_buf(vq, sg, 1, 0, vb, GFP_KERNEL) < 0)
>  		BUG();
>  	virtqueue_kick(vq);
>  
>  	/* When host has read buffer, this completes via balloon_ack */
>  	wait_for_completion(&vb->acked);
> +	mutex_unlock(&vb_queue_completion);
>  }

OK, this lock is superseded by Michael's patch, and AFAICT is not due to
any requirement introduced by these patches.

>  static void set_page_pfns(u32 pfns[], struct page *page)
> @@ -126,9 +132,12 @@ static void set_page_pfns(u32 pfns[], struct page *page)
>  
>  static void fill_balloon(struct virtio_balloon *vb, size_t num)
>  {
> +	struct scatterlist sg;
> +	int alloc_failed = 0;
>  	/* We can only do one array worth at a time. */
>  	num = min(num, ARRAY_SIZE(vb->pfns));
>  
> +	spin_lock(&vb->pfn_list_lock);
>  	for (vb->num_pfns = 0; vb->num_pfns < num;
>  	     vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
>  		struct page *page = alloc_page(GFP_HIGHUSER | __GFP_NORETRY |
> @@ -138,8 +147,7 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num)
>  				dev_printk(KERN_INFO, &vb->vdev->dev,
>  					   "Out of puff! Can't get %zu pages\n",
>  					   num);
> -			/* Sleep for at least 1/5 of a second before retry. */
> -			msleep(200);
> +			alloc_failed = 1;
>  			break;
>  		}
>  		set_page_pfns(vb->pfns + vb->num_pfns, page);
> @@ -149,10 +157,19 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num)
>  	}
>  
>  	/* Didn't get any?  Oh well. */
> -	if (vb->num_pfns == 0)
> +	if (vb->num_pfns == 0) {
> +		spin_unlock(&vb->pfn_list_lock);
>  		return;
> +	}
> +
> +	sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns);
> +	spin_unlock(&vb->pfn_list_lock);
>  
> -	tell_host(vb, vb->inflate_vq);
> +	/* alloc_page failed, sleep for at least 1/5 of a sec before retry. */
> +	if (alloc_failed)
> +		msleep(200);
> +
> +	tell_host(vb, vb->inflate_vq, &sg);

So, we drop the lock which protects vb->pfns[] and vb->num_pfns, then
use it in tell_host?  Surely it could be corrupted in between.

> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index bfbc15c..d47c5c2 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -51,6 +51,10 @@ struct virtio_balloon
>  
>  	/* Number of balloon pages we've told the Host we're not using. */
>  	unsigned int num_pages;
> +
> > +	/* Protect 'pages', 'pfns' & 'num_pfns' against concurrent updates */
> +	spinlock_t pfn_list_lock;
> +
>  	/*
>  	 * The pages we've told the Host we're not using.
>  	 * Each page on this list adds VIRTIO_BALLOON_PAGES_PER_PAGE

You might be better off taking num_pfns and pfns[] out of struct
virtio_balloon, and putting them on the stack (maybe 64, not 256).

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 2/4] virtio_balloon: handle concurrent accesses to virtio_balloon struct elements
  2012-07-04  6:38     ` Rusty Russell
@ 2012-07-04 19:51       ` Rafael Aquini
  -1 siblings, 0 replies; 20+ messages in thread
From: Rafael Aquini @ 2012-07-04 19:51 UTC (permalink / raw)
  To: Rusty Russell
  Cc: linux-mm, linux-kernel, virtualization, Michael S. Tsirkin,
	Rik van Riel, Mel Gorman, Andi Kleen, Andrew Morton,
	Konrad Rzeszutek Wilk, Minchan Kim

Howdy Rusty,

First and foremost, thank you very much for taking the time to go through this
proposal and provide me with such valuable feedback. I really appreciate that.

On Wed, Jul 04, 2012 at 04:08:23PM +0930, Rusty Russell wrote:
> On Tue,  3 Jul 2012 20:48:50 -0300, Rafael Aquini <aquini@redhat.com> wrote:
> > This patch introduces access synchronization to critical elements of struct
> > virtio_balloon, in order to allow for the thread concurrency that the
> > compaction/migration bits might end up imposing on the balloon driver in
> > several situations.
> > 
> > Signed-off-by: Rafael Aquini <aquini@redhat.com>
> 
> That's pretty vague, and it's almost impossible to audit this.
> 

I'll definitely attempt to improve this one.
Although it looks concise to me, as it states the "whats" and the "whys", any clue
on how to improve it and turn it into something that makes a lot more sense
is very welcome and appreciated. I'm probably failing miserably at expressing the
whole idea because I'm a terrible writer, no doubt about it. :)


> > +/* Protection for concurrent accesses to balloon virtqueues and vb->acked */
> > +DEFINE_MUTEX(vb_queue_completion);
> >  
> > +static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq,
> > +		      struct scatterlist *sg)
> > +{
> > +	mutex_lock(&vb_queue_completion);
> >  	init_completion(&vb->acked);
> >  
> >  	/* We should always be able to add one buffer to an empty queue. */
> > -	if (virtqueue_add_buf(vq, &sg, 1, 0, vb, GFP_KERNEL) < 0)
> > +	if (virtqueue_add_buf(vq, sg, 1, 0, vb, GFP_KERNEL) < 0)
> >  		BUG();
> >  	virtqueue_kick(vq);
> >  
> >  	/* When host has read buffer, this completes via balloon_ack */
> >  	wait_for_completion(&vb->acked);
> > +	mutex_unlock(&vb_queue_completion);
> >  }
> 
> OK, this lock is superseded by Michael's patch, and AFAICT is not due to
> any requirement introduced by these patches.
> 

Unfortunately, I'm compelled to disagree with you on this one.

Because tell_host() can be called concurrently, this lock is placed to avoid two
issues, basically:
 a) corruption of the completion var vb->acked (overridden by concurrent
    re-initializations);
 b) virtqueue operations (inflate/deflate) being issued simultaneously;

Even though Michael's patch addresses case (a), as far as this patch series
is concerned, it offers no way to prevent case (b) when two or more threads
call tell_host() simultaneously.


> >  static void set_page_pfns(u32 pfns[], struct page *page)
> > @@ -126,9 +132,12 @@ static void set_page_pfns(u32 pfns[], struct page *page)
> >  
> >  static void fill_balloon(struct virtio_balloon *vb, size_t num)
> >  {
> > +	struct scatterlist sg;
> > +	int alloc_failed = 0;
> >  	/* We can only do one array worth at a time. */
> >  	num = min(num, ARRAY_SIZE(vb->pfns));
> >  
> > +	spin_lock(&vb->pfn_list_lock);
> >  	for (vb->num_pfns = 0; vb->num_pfns < num;
> >  	     vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
> >  		struct page *page = alloc_page(GFP_HIGHUSER | __GFP_NORETRY |
> > @@ -138,8 +147,7 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num)
> >  				dev_printk(KERN_INFO, &vb->vdev->dev,
> >  					   "Out of puff! Can't get %zu pages\n",
> >  					   num);
> > -			/* Sleep for at least 1/5 of a second before retry. */
> > -			msleep(200);
> > +			alloc_failed = 1;
> >  			break;
> >  		}
> >  		set_page_pfns(vb->pfns + vb->num_pfns, page);
> > @@ -149,10 +157,19 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num)
> >  	}
> >  
> >  	/* Didn't get any?  Oh well. */
> > -	if (vb->num_pfns == 0)
> > +	if (vb->num_pfns == 0) {
> > +		spin_unlock(&vb->pfn_list_lock);
> >  		return;
> > +	}
> > +
> > +	sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns);
> > +	spin_unlock(&vb->pfn_list_lock);
> >  
> > -	tell_host(vb, vb->inflate_vq);
> > +	/* alloc_page failed, sleep for at least 1/5 of a sec before retry. */
> > +	if (alloc_failed)
> > +		msleep(200);
> > +
> > +	tell_host(vb, vb->inflate_vq, &sg);
> 
> So, we drop the lock which protects vb->pfns[] and vb->num_pfns, then
> use them in tell_host?  Surely they could be corrupted in between.
> 

The lock can be dropped given these conditions:
 a) virtqueue_add_buf() works on a scatterlist array (buf);
 b) vb->pfns[] and vb->num_pfns are no longer directly accessed/updated at
    the tell_host() level;
 c) the *vb pointer is only used as a token to identify the buffer at
    virtqueue_add_buf(), and no particular struct element is updated/accessed
    through it;
 d) we are not supposed to block/sleep while holding a spinlock;

Changes made to vb->pfns[] and vb->num_pfns after the spinlock is released
no longer matter to a particular thread, since the scatterlist setup has been
moved out of tell_host() and is done within the locked section, and nothing
else down that code path directly uses vb->pfns[] or vb->num_pfns to make
its decisions.
Unfortunately, I'm not able to see the corruption window you've spotted. Am
I missing something here?


Cheers!
Rafael

> > diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> > index bfbc15c..d47c5c2 100644
> > --- a/drivers/virtio/virtio_balloon.c
> > +++ b/drivers/virtio/virtio_balloon.c
> > @@ -51,6 +51,10 @@ struct virtio_balloon
> >  
> >  	/* Number of balloon pages we've told the Host we're not using. */
> >  	unsigned int num_pages;
> > +
> > +	/* Protect 'pages', 'pfns' & 'num_pnfs' against concurrent updates */
> > +	spinlock_t pfn_list_lock;
> > +
> >  	/*
> >  	 * The pages we've told the Host we're not using.
> >  	 * Each page on this list adds VIRTIO_BALLOON_PAGES_PER_PAGE
> 
> You might be better off taking num_pfns and pfns[] out of struct
> virtio_balloon, and putting them on the stack (maybe 64, not 256).
> 
> Cheers,
> Rusty.

^ permalink raw reply	[flat|nested] 20+ messages in thread



end of thread, other threads:[~2012-07-04 19:52 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-03 23:48 [PATCH v3 0/4] make balloon pages movable by compaction Rafael Aquini
2012-07-03 23:48 ` Rafael Aquini
2012-07-03 23:48 ` [PATCH v3 1/4] mm: introduce compaction and migration for virtio ballooned pages Rafael Aquini
2012-07-03 23:48 ` Rafael Aquini
2012-07-03 23:48   ` Rafael Aquini
2012-07-03 23:48 ` [PATCH v3 2/4] virtio_balloon: handle concurrent accesses to virtio_balloon struct elements Rafael Aquini
2012-07-03 23:48   ` Rafael Aquini
2012-07-04  6:38   ` Rusty Russell
2012-07-04  6:38     ` Rusty Russell
2012-07-04  6:38     ` Rusty Russell
2012-07-04 19:51     ` Rafael Aquini
2012-07-04 19:51       ` Rafael Aquini
2012-07-04 19:51     ` Rafael Aquini
2012-07-03 23:48 ` Rafael Aquini
2012-07-03 23:48 ` [PATCH v3 3/4] virtio_balloon: introduce migration primitives to balloon pages Rafael Aquini
2012-07-03 23:48   ` Rafael Aquini
2012-07-03 23:48 ` Rafael Aquini
2012-07-03 23:48 ` [PATCH v3 4/4] mm: add vm event counters for balloon pages compaction Rafael Aquini
2012-07-03 23:48 ` Rafael Aquini
2012-07-03 23:48   ` Rafael Aquini
