* [PATCH RFC 01/10] balloon_compaction: don't BUG() when it is not necessary
2014-10-15 15:54 [PATCH RFC 00/10] Xen balloon page compaction support Wei Liu
@ 2014-10-15 15:54 ` Wei Liu
2014-10-15 15:54 ` [PATCH RFC 02/10] xen/balloon: fix code comment for free_xenballooned_pages Wei Liu
` (10 subsequent siblings)
11 siblings, 0 replies; 22+ messages in thread
From: Wei Liu @ 2014-10-15 15:54 UTC (permalink / raw)
To: xen-devel
Cc: Wei Liu, ian.campbell, stefano.stabellini, david.vrabel, boris.ostrovsky
The Xen balloon driver has several sources of ballooned pages. One of
them is the generic balloon driver, from which the Xen balloon driver
gets pages.
When the Xen balloon driver asks for a page from these sources, the
generic balloon driver might not have enough pages to return to the
caller. In that case we don't want to call BUG(); simply returning NULL
is sufficient.
This patch therefore introduces a flag indicating whether the caller
wants the generic balloon driver to be strict about an empty list. If
the flag is set to true, the generic balloon driver calls BUG() when a
request cannot be satisfied, which is the same behavior as before;
otherwise it just returns NULL and lets the caller handle that
situation.
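The strict-flag pattern can be illustrated with a small userspace
sketch; this is not the kernel code -- a toy list and assert() stand in
for the balloon page list and BUG(), and all names are illustrative:

```c
#include <assert.h>
#include <stddef.h>

/* Toy singly-linked list standing in for the balloon page list. */
struct node { struct node *next; };
struct dev_info { struct node *head; };

/* Dequeue one node. If @strict, an empty list is a fatal bug
 * (modelled with assert() instead of the kernel's BUG()); otherwise
 * the caller gets NULL and handles the shortage itself. */
static struct node *dequeue(struct dev_info *d, int strict)
{
	struct node *n = d->head;

	if (!n) {
		assert(!strict);	/* strict caller must never see this */
		return NULL;
	}
	d->head = n->next;
	return n;
}
```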
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
drivers/virtio/virtio_balloon.c | 2 +-
include/linux/balloon_compaction.h | 3 ++-
mm/balloon_compaction.c | 9 ++++++---
3 files changed, 9 insertions(+), 5 deletions(-)
diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 25ebe8e..9733731 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -179,7 +179,7 @@ static void leak_balloon(struct virtio_balloon *vb, size_t num)
mutex_lock(&vb->balloon_lock);
for (vb->num_pfns = 0; vb->num_pfns < num;
vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
- page = balloon_page_dequeue(vb_dev_info);
+ page = balloon_page_dequeue(vb_dev_info, true);
if (!page)
break;
set_page_pfns(vb->pfns + vb->num_pfns, page);
diff --git a/include/linux/balloon_compaction.h b/include/linux/balloon_compaction.h
index 089743a..1e1f888 100644
--- a/include/linux/balloon_compaction.h
+++ b/include/linux/balloon_compaction.h
@@ -62,7 +62,8 @@ struct balloon_dev_info {
};
extern struct page *balloon_page_enqueue(struct balloon_dev_info *b_dev_info);
-extern struct page *balloon_page_dequeue(struct balloon_dev_info *b_dev_info);
+extern struct page *balloon_page_dequeue(struct balloon_dev_info *b_dev_info,
+ bool strict);
extern struct balloon_dev_info *balloon_devinfo_alloc(
void *balloon_dev_descriptor);
diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
index 6e45a50..d604e83 100644
--- a/mm/balloon_compaction.c
+++ b/mm/balloon_compaction.c
@@ -72,14 +72,17 @@ EXPORT_SYMBOL_GPL(balloon_page_enqueue);
* balloon_page_dequeue - removes a page from balloon's page list and returns
* its address to allow the driver to release the page.
* @b_dev_info: balloon device descriptor where we will grab a page from.
+ * @strict: if true, BUG() when no page can be dequeued.
*
* Driver must call it to properly de-allocate a previously enlisted balloon page
* before definitively releasing it back to the guest system.
* This function returns the page address for the recently dequeued page or
* NULL in the case we find balloon's page list temporarily empty due to
- * compaction isolated pages.
+ * compaction isolated pages. If @strict is true and the caller asks for
+ * more pages than are available, BUG() is called.
*/
-struct page *balloon_page_dequeue(struct balloon_dev_info *b_dev_info)
+struct page *balloon_page_dequeue(struct balloon_dev_info *b_dev_info,
+ bool strict)
{
struct page *page, *tmp;
unsigned long flags;
@@ -122,7 +125,7 @@ struct page *balloon_page_dequeue(struct balloon_dev_info *b_dev_info)
*/
spin_lock_irqsave(&b_dev_info->pages_lock, flags);
if (unlikely(list_empty(&b_dev_info->pages) &&
- !b_dev_info->isolated_pages))
+ !b_dev_info->isolated_pages) && strict)
BUG();
spin_unlock_irqrestore(&b_dev_info->pages_lock, flags);
page = NULL;
--
1.7.10.4
* [PATCH RFC 02/10] xen/balloon: fix code comment for free_xenballooned_pages
From: Wei Liu @ 2014-10-15 15:54 UTC (permalink / raw)
To: xen-devel
Cc: Wei Liu, ian.campbell, stefano.stabellini, david.vrabel, boris.ostrovsky
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
drivers/xen/balloon.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 1e0a317..30a0baf 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -557,7 +557,8 @@ int alloc_xenballooned_pages(int nr_pages, struct page **pages, bool highmem)
EXPORT_SYMBOL(alloc_xenballooned_pages);
/**
- * free_xenballooned_pages - return pages retrieved with get_ballooned_pages
+ * free_xenballooned_pages - return pages retrieved with
+ * alloc_xenballooned_pages
* @nr_pages: Number of pages
* @pages: pages to return
*/
--
1.7.10.4
* [PATCH RFC 03/10] xen/balloon: consolidate data structures
From: Wei Liu @ 2014-10-15 15:54 UTC (permalink / raw)
To: xen-devel
Cc: Wei Liu, ian.campbell, stefano.stabellini, david.vrabel, boris.ostrovsky
Put the Xen balloon mutex, page list and stats into struct xen_balloon,
so that we can easily back-reference those structures. The page
migration callback will need to get hold of them in later patches.
No functional change is introduced.
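The consolidation pattern can be sketched in userspace C (a hedged
illustration, not the kernel code -- toy fields stand in for the mutex,
the page list and the stats): once the formerly file-scope objects live
in one struct, any callback that receives a single back-reference
pointer can reach all three together.

```c
#include <assert.h>

/* Toy stand-ins for the three formerly file-scope objects. */
struct stats { long low, high; };

struct balloon {
	int lock;		/* stands in for balloon_mutex */
	int npages;		/* stands in for the ballooned page list */
	struct stats stats;	/* stands in for balloon_stats */
};

/* A migration-style callback needs only one pointer to reach
 * the lock, the list and the stats. */
static void on_page_added(struct balloon *b, int highmem)
{
	b->lock = 1;		/* "mutex_lock" */
	b->npages++;
	if (highmem)
		b->stats.high++;
	else
		b->stats.low++;
	b->lock = 0;		/* "mutex_unlock" */
}
```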
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
drivers/xen/balloon.c | 168 ++++++++++++++++++++++++---------------------
drivers/xen/xen-balloon.c | 24 ++++---
include/xen/balloon.h | 15 +++-
3 files changed, 119 insertions(+), 88 deletions(-)
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 30a0baf..d8055f0 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -84,20 +84,13 @@ enum bp_state {
BP_ECANCELED
};
-
-static DEFINE_MUTEX(balloon_mutex);
-
-struct balloon_stats balloon_stats;
-EXPORT_SYMBOL_GPL(balloon_stats);
+struct xen_balloon xen_balloon;
+EXPORT_SYMBOL_GPL(xen_balloon);
/* We increase/decrease in batches which fit in a page */
static xen_pfn_t frame_list[PAGE_SIZE / sizeof(unsigned long)];
static DEFINE_PER_CPU(struct page *, balloon_scratch_page);
-
-/* List of ballooned pages, threaded through the mem_map array. */
-static LIST_HEAD(ballooned_pages);
-
/* Main work function, always executed in process context. */
static void balloon_process(struct work_struct *work);
static DECLARE_DELAYED_WORK(balloon_worker, balloon_process);
@@ -119,11 +112,11 @@ static void __balloon_append(struct page *page)
{
/* Lowmem is re-populated first, so highmem pages go at list tail. */
if (PageHighMem(page)) {
- list_add_tail(&page->lru, &ballooned_pages);
- balloon_stats.balloon_high++;
+ list_add_tail(&page->lru, &xen_balloon.ballooned_pages);
+ xen_balloon.balloon_stats.balloon_high++;
} else {
- list_add(&page->lru, &ballooned_pages);
- balloon_stats.balloon_low++;
+ list_add(&page->lru, &xen_balloon.ballooned_pages);
+ xen_balloon.balloon_stats.balloon_low++;
}
}
@@ -138,19 +131,21 @@ static struct page *balloon_retrieve(bool prefer_highmem)
{
struct page *page;
- if (list_empty(&ballooned_pages))
+ if (list_empty(&xen_balloon.ballooned_pages))
return NULL;
if (prefer_highmem)
- page = list_entry(ballooned_pages.prev, struct page, lru);
+ page = list_entry(xen_balloon.ballooned_pages.prev,
+ struct page, lru);
else
- page = list_entry(ballooned_pages.next, struct page, lru);
+ page = list_entry(xen_balloon.ballooned_pages.next,
+ struct page, lru);
list_del(&page->lru);
if (PageHighMem(page))
- balloon_stats.balloon_high--;
+ xen_balloon.balloon_stats.balloon_high--;
else
- balloon_stats.balloon_low--;
+ xen_balloon.balloon_stats.balloon_low--;
adjust_managed_page_count(page, 1);
@@ -160,7 +155,7 @@ static struct page *balloon_retrieve(bool prefer_highmem)
static struct page *balloon_next_page(struct page *page)
{
struct list_head *next = page->lru.next;
- if (next == &ballooned_pages)
+ if (next == &xen_balloon.ballooned_pages)
return NULL;
return list_entry(next, struct page, lru);
}
@@ -168,24 +163,27 @@ static struct page *balloon_next_page(struct page *page)
static enum bp_state update_schedule(enum bp_state state)
{
if (state == BP_DONE) {
- balloon_stats.schedule_delay = 1;
- balloon_stats.retry_count = 1;
+ xen_balloon.balloon_stats.schedule_delay = 1;
+ xen_balloon.balloon_stats.retry_count = 1;
return BP_DONE;
}
- ++balloon_stats.retry_count;
+ ++xen_balloon.balloon_stats.retry_count;
- if (balloon_stats.max_retry_count != RETRY_UNLIMITED &&
- balloon_stats.retry_count > balloon_stats.max_retry_count) {
- balloon_stats.schedule_delay = 1;
- balloon_stats.retry_count = 1;
+ if (xen_balloon.balloon_stats.max_retry_count != RETRY_UNLIMITED &&
+ xen_balloon.balloon_stats.retry_count >
+ xen_balloon.balloon_stats.max_retry_count) {
+ xen_balloon.balloon_stats.schedule_delay = 1;
+ xen_balloon.balloon_stats.retry_count = 1;
return BP_ECANCELED;
}
- balloon_stats.schedule_delay <<= 1;
+ xen_balloon.balloon_stats.schedule_delay <<= 1;
- if (balloon_stats.schedule_delay > balloon_stats.max_schedule_delay)
- balloon_stats.schedule_delay = balloon_stats.max_schedule_delay;
+ if (xen_balloon.balloon_stats.schedule_delay >
+ xen_balloon.balloon_stats.max_schedule_delay)
+ xen_balloon.balloon_stats.schedule_delay =
+ xen_balloon.balloon_stats.max_schedule_delay;
return BP_EAGAIN;
}
@@ -193,14 +191,16 @@ static enum bp_state update_schedule(enum bp_state state)
#ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG
static long current_credit(void)
{
- return balloon_stats.target_pages - balloon_stats.current_pages -
- balloon_stats.hotplug_pages;
+ return xen_balloon.balloon_stats.target_pages -
+ xen_balloon.balloon_stats.current_pages -
+ xen_balloon.balloon_stats.hotplug_pages;
}
static bool balloon_is_inflated(void)
{
- if (balloon_stats.balloon_low || balloon_stats.balloon_high ||
- balloon_stats.balloon_hotplug)
+ if (xen_balloon.balloon_stats.balloon_low ||
+ xen_balloon.balloon_stats.balloon_high ||
+ xen_balloon.balloon_stats.balloon_hotplug)
return true;
else
return false;
@@ -236,8 +236,8 @@ static enum bp_state reserve_additional_memory(long credit)
balloon_hotplug -= credit;
- balloon_stats.hotplug_pages += credit;
- balloon_stats.balloon_hotplug = balloon_hotplug;
+ xen_balloon.balloon_stats.hotplug_pages += credit;
+ xen_balloon.balloon_stats.balloon_hotplug = balloon_hotplug;
return BP_DONE;
}
@@ -246,16 +246,16 @@ static void xen_online_page(struct page *page)
{
__online_page_set_limits(page);
- mutex_lock(&balloon_mutex);
+ mutex_lock(&xen_balloon.balloon_mutex);
__balloon_append(page);
- if (balloon_stats.hotplug_pages)
- --balloon_stats.hotplug_pages;
+ if (xen_balloon.balloon_stats.hotplug_pages)
+ --xen_balloon.balloon_stats.hotplug_pages;
else
- --balloon_stats.balloon_hotplug;
+ --xen_balloon.balloon_stats.balloon_hotplug;
- mutex_unlock(&balloon_mutex);
+ mutex_unlock(&xen_balloon.balloon_mutex);
}
static int xen_memory_notifier(struct notifier_block *nb, unsigned long val, void *v)
@@ -273,19 +273,20 @@ static struct notifier_block xen_memory_nb = {
#else
static long current_credit(void)
{
- unsigned long target = balloon_stats.target_pages;
+ unsigned long target = xen_balloon.balloon_stats.target_pages;
target = min(target,
- balloon_stats.current_pages +
- balloon_stats.balloon_low +
- balloon_stats.balloon_high);
+ xen_balloon.balloon_stats.current_pages +
+ xen_balloon.balloon_stats.balloon_low +
+ xen_balloon.balloon_stats.balloon_high);
- return target - balloon_stats.current_pages;
+ return target - xen_balloon.balloon_stats.current_pages;
}
static bool balloon_is_inflated(void)
{
- if (balloon_stats.balloon_low || balloon_stats.balloon_high)
+ if (xen_balloon.balloon_stats.balloon_low ||
+ xen_balloon.balloon_stats.balloon_high)
return true;
else
return false;
@@ -293,7 +294,8 @@ static bool balloon_is_inflated(void)
static enum bp_state reserve_additional_memory(long credit)
{
- balloon_stats.target_pages = balloon_stats.current_pages;
+ xen_balloon.balloon_stats.target_pages =
+ xen_balloon.balloon_stats.current_pages;
return BP_DONE;
}
#endif /* CONFIG_XEN_BALLOON_MEMORY_HOTPLUG */
@@ -310,10 +312,12 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
};
#ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG
- if (!balloon_stats.balloon_low && !balloon_stats.balloon_high) {
- nr_pages = min(nr_pages, balloon_stats.balloon_hotplug);
- balloon_stats.hotplug_pages += nr_pages;
- balloon_stats.balloon_hotplug -= nr_pages;
+ if (!xen_balloon.balloon_stats.balloon_low &&
+ !xen_balloon.balloon_stats.balloon_high) {
+ nr_pages = min(nr_pages,
+ xen_balloon.balloon_stats.balloon_hotplug);
+ xen_balloon.balloon_stats.hotplug_pages += nr_pages;
+ xen_balloon.balloon_stats.balloon_hotplug -= nr_pages;
return BP_DONE;
}
#endif
@@ -321,7 +325,8 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
if (nr_pages > ARRAY_SIZE(frame_list))
nr_pages = ARRAY_SIZE(frame_list);
- page = list_first_entry_or_null(&ballooned_pages, struct page, lru);
+ page = list_first_entry_or_null(&xen_balloon.ballooned_pages,
+ struct page, lru);
for (i = 0; i < nr_pages; i++) {
if (!page) {
nr_pages = i;
@@ -363,7 +368,7 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
__free_reserved_page(page);
}
- balloon_stats.current_pages += rc;
+ xen_balloon.balloon_stats.current_pages += rc;
return BP_DONE;
}
@@ -381,10 +386,11 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
};
#ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG
- if (balloon_stats.hotplug_pages) {
- nr_pages = min(nr_pages, balloon_stats.hotplug_pages);
- balloon_stats.hotplug_pages -= nr_pages;
- balloon_stats.balloon_hotplug += nr_pages;
+ if (xen_balloon.balloon_stats.hotplug_pages) {
+ nr_pages = min(nr_pages,
+ xen_balloon.balloon_stats.hotplug_pages);
+ xen_balloon.balloon_stats.hotplug_pages -= nr_pages;
+ xen_balloon.balloon_stats.balloon_hotplug += nr_pages;
return BP_DONE;
}
#endif
@@ -451,7 +457,7 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
ret = HYPERVISOR_memory_op(XENMEM_decrease_reservation, &reservation);
BUG_ON(ret != nr_pages);
- balloon_stats.current_pages -= nr_pages;
+ xen_balloon.balloon_stats.current_pages -= nr_pages;
return state;
}
@@ -467,7 +473,7 @@ static void balloon_process(struct work_struct *work)
enum bp_state state = BP_DONE;
long credit;
- mutex_lock(&balloon_mutex);
+ mutex_lock(&xen_balloon.balloon_mutex);
do {
credit = current_credit();
@@ -492,9 +498,10 @@ static void balloon_process(struct work_struct *work)
/* Schedule more work if there is some still to be done. */
if (state == BP_EAGAIN)
- schedule_delayed_work(&balloon_worker, balloon_stats.schedule_delay * HZ);
+ schedule_delayed_work(&balloon_worker,
+ xen_balloon.balloon_stats.schedule_delay * HZ);
- mutex_unlock(&balloon_mutex);
+ mutex_unlock(&xen_balloon.balloon_mutex);
}
struct page *get_balloon_scratch_page(void)
@@ -513,7 +520,7 @@ void put_balloon_scratch_page(void)
void balloon_set_new_target(unsigned long target)
{
/* No need for lock. Not read-modify-write updates. */
- balloon_stats.target_pages = target;
+ xen_balloon.balloon_stats.target_pages = target;
schedule_delayed_work(&balloon_worker, 0);
}
EXPORT_SYMBOL_GPL(balloon_set_new_target);
@@ -529,7 +536,7 @@ int alloc_xenballooned_pages(int nr_pages, struct page **pages, bool highmem)
{
int pgno = 0;
struct page *page;
- mutex_lock(&balloon_mutex);
+ mutex_lock(&xen_balloon.balloon_mutex);
while (pgno < nr_pages) {
page = balloon_retrieve(highmem);
if (page && (highmem || !PageHighMem(page))) {
@@ -544,14 +551,14 @@ int alloc_xenballooned_pages(int nr_pages, struct page **pages, bool highmem)
goto out_undo;
}
}
- mutex_unlock(&balloon_mutex);
+ mutex_unlock(&xen_balloon.balloon_mutex);
return 0;
out_undo:
while (pgno)
balloon_append(pages[--pgno]);
/* Free the memory back to the kernel soon */
schedule_delayed_work(&balloon_worker, 0);
- mutex_unlock(&balloon_mutex);
+ mutex_unlock(&xen_balloon.balloon_mutex);
return -ENOMEM;
}
EXPORT_SYMBOL(alloc_xenballooned_pages);
@@ -566,7 +573,7 @@ void free_xenballooned_pages(int nr_pages, struct page **pages)
{
int i;
- mutex_lock(&balloon_mutex);
+ mutex_lock(&xen_balloon.balloon_mutex);
for (i = 0; i < nr_pages; i++) {
if (pages[i])
@@ -577,7 +584,7 @@ void free_xenballooned_pages(int nr_pages, struct page **pages)
if (current_credit())
schedule_delayed_work(&balloon_worker, 0);
- mutex_unlock(&balloon_mutex);
+ mutex_unlock(&xen_balloon.balloon_mutex);
}
EXPORT_SYMBOL(free_xenballooned_pages);
@@ -660,21 +667,28 @@ static int __init balloon_init(void)
pr_info("Initialising balloon driver\n");
- balloon_stats.current_pages = xen_pv_domain()
+ memset(&xen_balloon, 0, sizeof(xen_balloon));
+
+ mutex_init(&xen_balloon.balloon_mutex);
+
+ INIT_LIST_HEAD(&xen_balloon.ballooned_pages);
+
+ xen_balloon.balloon_stats.current_pages = xen_pv_domain()
? min(xen_start_info->nr_pages - xen_released_pages, max_pfn)
: get_num_physpages();
- balloon_stats.target_pages = balloon_stats.current_pages;
- balloon_stats.balloon_low = 0;
- balloon_stats.balloon_high = 0;
+ xen_balloon.balloon_stats.target_pages =
+ xen_balloon.balloon_stats.current_pages;
+ xen_balloon.balloon_stats.balloon_low = 0;
+ xen_balloon.balloon_stats.balloon_high = 0;
- balloon_stats.schedule_delay = 1;
- balloon_stats.max_schedule_delay = 32;
- balloon_stats.retry_count = 1;
- balloon_stats.max_retry_count = RETRY_UNLIMITED;
+ xen_balloon.balloon_stats.schedule_delay = 1;
+ xen_balloon.balloon_stats.max_schedule_delay = 32;
+ xen_balloon.balloon_stats.retry_count = 1;
+ xen_balloon.balloon_stats.max_retry_count = RETRY_UNLIMITED;
#ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG
- balloon_stats.hotplug_pages = 0;
- balloon_stats.balloon_hotplug = 0;
+ xen_balloon.balloon_stats.hotplug_pages = 0;
+ xen_balloon.balloon_stats.balloon_hotplug = 0;
set_online_page_callback(&xen_online_page);
register_memory_notifier(&xen_memory_nb);
diff --git a/drivers/xen/xen-balloon.c b/drivers/xen/xen-balloon.c
index e555845..ef04236 100644
--- a/drivers/xen/xen-balloon.c
+++ b/drivers/xen/xen-balloon.c
@@ -126,19 +126,23 @@ module_exit(balloon_exit);
} \
static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL)
-BALLOON_SHOW(current_kb, "%lu\n", PAGES2KB(balloon_stats.current_pages));
-BALLOON_SHOW(low_kb, "%lu\n", PAGES2KB(balloon_stats.balloon_low));
-BALLOON_SHOW(high_kb, "%lu\n", PAGES2KB(balloon_stats.balloon_high));
-
-static DEVICE_ULONG_ATTR(schedule_delay, 0444, balloon_stats.schedule_delay);
-static DEVICE_ULONG_ATTR(max_schedule_delay, 0644, balloon_stats.max_schedule_delay);
-static DEVICE_ULONG_ATTR(retry_count, 0444, balloon_stats.retry_count);
-static DEVICE_ULONG_ATTR(max_retry_count, 0644, balloon_stats.max_retry_count);
+BALLOON_SHOW(current_kb, "%lu\n", PAGES2KB(xen_balloon.balloon_stats.current_pages));
+BALLOON_SHOW(low_kb, "%lu\n", PAGES2KB(xen_balloon.balloon_stats.balloon_low));
+BALLOON_SHOW(high_kb, "%lu\n", PAGES2KB(xen_balloon.balloon_stats.balloon_high));
+
+static DEVICE_ULONG_ATTR(schedule_delay, 0444,
+ xen_balloon.balloon_stats.schedule_delay);
+static DEVICE_ULONG_ATTR(max_schedule_delay, 0644,
+ xen_balloon.balloon_stats.max_schedule_delay);
+static DEVICE_ULONG_ATTR(retry_count, 0444,
+ xen_balloon.balloon_stats.retry_count);
+static DEVICE_ULONG_ATTR(max_retry_count, 0644,
+ xen_balloon.balloon_stats.max_retry_count);
static ssize_t show_target_kb(struct device *dev, struct device_attribute *attr,
char *buf)
{
- return sprintf(buf, "%lu\n", PAGES2KB(balloon_stats.target_pages));
+ return sprintf(buf, "%lu\n", PAGES2KB(xen_balloon.balloon_stats.target_pages));
}
static ssize_t store_target_kb(struct device *dev,
@@ -167,7 +171,7 @@ static ssize_t show_target(struct device *dev, struct device_attribute *attr,
char *buf)
{
return sprintf(buf, "%llu\n",
- (unsigned long long)balloon_stats.target_pages
+ (unsigned long long)xen_balloon.balloon_stats.target_pages
<< PAGE_SHIFT);
}
diff --git a/include/xen/balloon.h b/include/xen/balloon.h
index a4c1c6a..1d7efae 100644
--- a/include/xen/balloon.h
+++ b/include/xen/balloon.h
@@ -21,7 +21,20 @@ struct balloon_stats {
#endif
};
-extern struct balloon_stats balloon_stats;
+struct xen_balloon {
+ /* Mutex to protect xen_balloon across inflation / deflation /
+ * page migration.
+ */
+ struct mutex balloon_mutex;
+
+ /* List of ballooned pages managed by Xen balloon driver */
+ struct list_head ballooned_pages;
+
+ /* Memory statistics */
+ struct balloon_stats balloon_stats;
+};
+
+extern struct xen_balloon xen_balloon;
void balloon_set_new_target(unsigned long target);
--
1.7.10.4
* [PATCH RFC 04/10] xen/balloon: factor out function to update balloon stats
From: Wei Liu @ 2014-10-15 15:54 UTC (permalink / raw)
To: xen-devel
Cc: Wei Liu, ian.campbell, stefano.stabellini, david.vrabel, boris.ostrovsky
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
drivers/xen/balloon.c | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index d8055f0..04e12b4 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -107,17 +107,24 @@ static void scrub_page(struct page *page)
#endif
}
+static inline void update_balloon_stats(struct page *page, int count)
+{
+ if (PageHighMem(page))
+ xen_balloon.balloon_stats.balloon_high += count;
+ else
+ xen_balloon.balloon_stats.balloon_low += count;
+}
+
/* balloon_append: add the given page to the balloon. */
static void __balloon_append(struct page *page)
{
/* Lowmem is re-populated first, so highmem pages go at list tail. */
- if (PageHighMem(page)) {
+ if (PageHighMem(page))
list_add_tail(&page->lru, &xen_balloon.ballooned_pages);
- xen_balloon.balloon_stats.balloon_high++;
- } else {
+ else
list_add(&page->lru, &xen_balloon.ballooned_pages);
- xen_balloon.balloon_stats.balloon_low++;
- }
+
+ update_balloon_stats(page, 1);
}
static void balloon_append(struct page *page)
@@ -142,10 +149,7 @@ static struct page *balloon_retrieve(bool prefer_highmem)
struct page, lru);
list_del(&page->lru);
- if (PageHighMem(page))
- xen_balloon.balloon_stats.balloon_high--;
- else
- xen_balloon.balloon_stats.balloon_low--;
+ update_balloon_stats(page, -1);
adjust_managed_page_count(page, 1);
--
1.7.10.4
* [PATCH RFC 05/10] xen/balloon: rework increase_reservation
From: Wei Liu @ 2014-10-15 15:54 UTC (permalink / raw)
To: xen-devel
Cc: Wei Liu, ian.campbell, stefano.stabellini, david.vrabel, boris.ostrovsky
The original increase_reservation dequeues populated balloon pages
after the hypercall. That doesn't quite comply with the core balloon
compaction code, which expects the driver to dequeue pages before
doing any work.
In short, original logic:
1. prepare hypercall structure without grabbing the pages
2. issue hypercall
3. release populated pages to kernel
Change logic of increase_reservation to:
1. grab all pages from balloon driver and prepare hypercall structure
2. issue hypercall
3. release populated pages to kernel
4. add back any pages that are not populated to balloon driver
Note that balloon_retrieve's logic is changed as well -- the accounting
code is moved out of that function and added where appropriate.
This is in preparation for later patches that make use of the core
balloon compaction driver.
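The four steps above can be sketched in userspace C (a hedged
illustration, not the kernel code -- integers stand in for pages, and a
fake hypercall that populates only a prefix of the request stands in
for XENMEM_populate_physmap):

```c
#include <assert.h>

#define MAX 8

static int balloon[MAX], balloon_n;	/* stand-in for ballooned_pages */
static int freed[MAX], freed_n;		/* pages released to the kernel */

/* Fake XENMEM_populate_physmap: populates at most two frames. */
static int fake_hypercall(int nr)
{
	return nr < 2 ? nr : 2;
}

static int increase_reservation(int nr)
{
	int queue[MAX], queue_n = 0;
	int i, rc;

	/* Step 1: grab pages from the balloon and build the frame list */
	for (i = 0; i < nr && balloon_n > 0; i++)
		queue[queue_n++] = balloon[--balloon_n];

	/* Step 2: issue the (fake) hypercall */
	rc = fake_hypercall(queue_n);

	/* Step 3: release the populated pages to the kernel */
	for (i = 0; i < rc; i++)
		freed[freed_n++] = queue[i];

	/* Step 4: move unpopulated pages back to the balloon */
	for (i = rc; i < queue_n; i++)
		balloon[balloon_n++] = queue[i];
	return rc;
}
```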
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
drivers/xen/balloon.c | 50 +++++++++++++++++++++++++++++--------------------
1 file changed, 30 insertions(+), 20 deletions(-)
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 04e12b4..4f46545 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -133,7 +133,11 @@ static void balloon_append(struct page *page)
adjust_managed_page_count(page, -1);
}
-/* balloon_retrieve: rescue a page from the balloon, if it is not empty. */
+/* balloon_retrieve: rescue a page from the Xen balloon driver, if its
* list is not empty. This function doesn't look at pages in the generic
* balloon driver. The accounting is not updated either, as the page
* might be put back on the list.
*/
static struct page *balloon_retrieve(bool prefer_highmem)
{
struct page *page;
@@ -149,21 +153,9 @@ static struct page *balloon_retrieve(bool prefer_highmem)
struct page, lru);
list_del(&page->lru);
- update_balloon_stats(page, -1);
-
- adjust_managed_page_count(page, 1);
-
return page;
}
-static struct page *balloon_next_page(struct page *page)
-{
- struct list_head *next = page->lru.next;
- if (next == &xen_balloon.ballooned_pages)
- return NULL;
- return list_entry(next, struct page, lru);
-}
-
static enum bp_state update_schedule(enum bp_state state)
{
if (state == BP_DONE) {
@@ -309,6 +301,7 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
int rc;
unsigned long pfn, i;
struct page *page;
+ LIST_HEAD(queue);
struct xen_memory_reservation reservation = {
.address_bits = 0,
.extent_order = 0,
@@ -329,27 +322,34 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
if (nr_pages > ARRAY_SIZE(frame_list))
nr_pages = ARRAY_SIZE(frame_list);
- page = list_first_entry_or_null(&xen_balloon.ballooned_pages,
- struct page, lru);
+ /* First step: grab all pages we need to balloon in */
for (i = 0; i < nr_pages; i++) {
+ page = balloon_retrieve(false);
if (!page) {
nr_pages = i;
break;
}
frame_list[i] = page_to_pfn(page);
- page = balloon_next_page(page);
+ /* The order in queue must match the order in frame_list */
+ list_add_tail(&page->lru, &queue);
}
+ /* Second step: issue hypercall */
set_xen_guest_handle(reservation.extent_start, frame_list);
reservation.nr_extents = nr_pages;
rc = HYPERVISOR_memory_op(XENMEM_populate_physmap, &reservation);
- if (rc <= 0)
- return BP_EAGAIN;
+ if (rc <= 0) {
+ rc = BP_EAGAIN;
+ goto move_pages_back;
+ }
+ /* Third step: free populated pages back to kernel allocator */
for (i = 0; i < rc; i++) {
- page = balloon_retrieve(false);
+ page = list_first_entry_or_null(&queue, struct page, lru);
+
BUG_ON(page == NULL);
+ list_del(&page->lru);
pfn = page_to_pfn(page);
#ifdef CONFIG_XEN_HAVE_PVMMU
@@ -370,11 +370,19 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
/* Relinquish the page back to the allocator. */
__free_reserved_page(page);
+
+ /* We only account for those pages that have been populated. */
+ update_balloon_stats(page, -1);
+ adjust_managed_page_count(page, 1);
}
xen_balloon.balloon_stats.current_pages += rc;
+ rc = BP_DONE;
- return BP_DONE;
+move_pages_back:
+ /* Final step: move back any unpopulated pages to balloon driver */
+ list_splice_init(&queue, &xen_balloon.ballooned_pages);
+ return rc;
}
static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
@@ -545,6 +553,8 @@ int alloc_xenballooned_pages(int nr_pages, struct page **pages, bool highmem)
page = balloon_retrieve(highmem);
if (page && (highmem || !PageHighMem(page))) {
pages[pgno++] = page;
+ update_balloon_stats(page, -1);
+ adjust_managed_page_count(page, 1);
} else {
enum bp_state st;
if (page)
--
1.7.10.4
* [PATCH RFC 06/10] xen/balloon: make use of generic balloon driver
From: Wei Liu @ 2014-10-15 15:54 UTC (permalink / raw)
To: xen-devel
Cc: Wei Liu, ian.campbell, stefano.stabellini, david.vrabel, boris.ostrovsky
Let the generic balloon driver handle highmem and movable memory
allocation. Those pages are subject to page migration, so the balloon
page compaction thread can defragment them for us.
The page migration function is not yet implemented. A stub is created
that always returns -EAGAIN, indicating failure in page migration,
which prevents the balloon compaction thread from doing any work. The
patch is arranged this way to ease review.
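A stub of that shape can be sketched as follows (hedged illustration;
the real callback signature comes from the kernel's balloon compaction
hooks, and the names and void-pointer parameters here are placeholders):

```c
#include <assert.h>
#include <errno.h>

/* Placeholder migratepage callback: always report -EAGAIN so the
 * compaction thread never actually moves a balloon page, while the
 * hook itself stays wired up for the later real implementation. */
static int stub_migratepage(void *mapping, void *newpage,
			    void *page, int mode)
{
	(void)mapping;
	(void)newpage;
	(void)page;
	(void)mode;
	return -EAGAIN;
}
```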
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
drivers/xen/balloon.c | 150 ++++++++++++++++++++++++++++++++++++++++---------
include/xen/balloon.h | 4 ++
2 files changed, 126 insertions(+), 28 deletions(-)
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 4f46545..24efdf6 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -95,11 +95,6 @@ static DEFINE_PER_CPU(struct page *, balloon_scratch_page);
static void balloon_process(struct work_struct *work);
static DECLARE_DELAYED_WORK(balloon_worker, balloon_process);
-/* When ballooning out (allocating memory to return to Xen) we don't really
- want the kernel to try too hard since that can trigger the oom killer. */
-#define GFP_BALLOON \
- (GFP_HIGHUSER | __GFP_NOWARN | __GFP_NORETRY | __GFP_NOMEMALLOC)
-
static void scrub_page(struct page *page)
{
#ifdef CONFIG_XEN_SCRUB_PAGES
@@ -115,21 +110,27 @@ static inline void update_balloon_stats(struct page *page, int count)
xen_balloon.balloon_stats.balloon_low += count;
}
-/* balloon_append: add the given page to the balloon. */
-static void __balloon_append(struct page *page)
+/* balloon_append: add the given page to the balloon. "managed"
+ * indicates whether we should add the page to our own list.
+ */
+static void __balloon_append(struct page *page, bool managed)
{
- /* Lowmem is re-populated first, so highmem pages go at list tail. */
- if (PageHighMem(page))
- list_add_tail(&page->lru, &xen_balloon.ballooned_pages);
- else
- list_add(&page->lru, &xen_balloon.ballooned_pages);
+ if (managed) {
+ /* Lowmem is re-populated first, so highmem pages go
+ * at list tail. */
+ if (PageHighMem(page))
+ list_add_tail(&page->lru,
+ &xen_balloon.ballooned_pages);
+ else
+ list_add(&page->lru, &xen_balloon.ballooned_pages);
+ }
update_balloon_stats(page, 1);
}
-static void balloon_append(struct page *page)
+static void balloon_append(struct page *page, bool managed)
{
- __balloon_append(page);
+ __balloon_append(page, managed);
adjust_managed_page_count(page, -1);
}
@@ -244,7 +245,7 @@ static void xen_online_page(struct page *page)
mutex_lock(&xen_balloon.balloon_mutex);
- __balloon_append(page);
+ __balloon_append(page, true);
if (xen_balloon.balloon_stats.hotplug_pages)
--xen_balloon.balloon_stats.hotplug_pages;
@@ -296,12 +297,31 @@ static enum bp_state reserve_additional_memory(long credit)
}
#endif /* CONFIG_XEN_BALLOON_MEMORY_HOTPLUG */
+/* This is used to reinsert pages back to generic balloon driver. We
+ * need to use balloon_page_insert. list_splice is not sufficient.
+ */
+static inline void __reinsert_balloon_pages(struct list_head *head)
+{
+ unsigned long flags;
+ struct page *page, *n;
+
+ spin_lock_irqsave(&xen_balloon.xb_dev_info->pages_lock, flags);
+ list_for_each_entry_safe_reverse(page, n, head, lru)
+ balloon_page_insert(page, xen_balloon.xb_dev_info->mapping,
+ &xen_balloon.xb_dev_info->pages);
+ spin_unlock_irqrestore(&xen_balloon.xb_dev_info->pages_lock, flags);
+}
+
+/* This function will always try to fill in pages managed by Xen
+ * balloon driver, then pages managed by generic balloon driver.
+ */
static enum bp_state increase_reservation(unsigned long nr_pages)
{
int rc;
unsigned long pfn, i;
struct page *page;
LIST_HEAD(queue);
+ bool xen_pages;
struct xen_memory_reservation reservation = {
.address_bits = 0,
.extent_order = 0,
@@ -322,9 +342,19 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
if (nr_pages > ARRAY_SIZE(frame_list))
nr_pages = ARRAY_SIZE(frame_list);
+ /* We always try to fill in pages managed by Xen balloon
+ * driver first. And we separate the process to fill in Xen
+ * balloon driver pages v.s. generic balloon driver in
+ * different rounds for the sake of simplicity.
+ */
+ xen_pages = !list_empty(&xen_balloon.ballooned_pages);
+
/* First step: grab all pages we need to balloon in */
for (i = 0; i < nr_pages; i++) {
- page = balloon_retrieve(false);
+ if (xen_pages)
+ page = balloon_retrieve(false);
+ else
+ page = balloon_page_dequeue(xen_balloon.xb_dev_info, false);
if (!page) {
nr_pages = i;
break;
@@ -369,7 +399,10 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
#endif
/* Relinquish the page back to the allocator. */
- __free_reserved_page(page);
+ if (xen_pages)
+ __free_reserved_page(page);
+ else
+ balloon_page_free(page);
/* We only account for those pages that have been populated. */
update_balloon_stats(page, -1);
@@ -381,11 +414,26 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
move_pages_back:
/* Final step: move back any unpopulated pages to balloon driver */
- list_splice_init(&queue, &xen_balloon.ballooned_pages);
+ if (!list_empty(&queue)) {
+ if (xen_pages)
+ list_splice(&queue, &xen_balloon.ballooned_pages);
+ else
+ __reinsert_balloon_pages(&queue);
+ }
return rc;
}
-static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
+/* decrease_reservation:
+ * core_driver == true:
+ * Page is not movable and managed by Xen balloon driver. Honor gfp
+ * when allocating pages.
+ * core_driver == false:
+ * Page is movable and managed by generic balloon driver, subject to
+ * migration. Gfp is ignored as page is allocated by generic balloon
+ * driver.
+ */
+static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp,
+ bool core_driver)
{
enum bp_state state = BP_DONE;
unsigned long pfn, i;
@@ -411,7 +459,11 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
nr_pages = ARRAY_SIZE(frame_list);
for (i = 0; i < nr_pages; i++) {
- page = alloc_page(gfp);
+ if (core_driver)
+ page = alloc_page(gfp);
+ else
+ page = balloon_page_enqueue(xen_balloon.xb_dev_info);
+
if (page == NULL) {
nr_pages = i;
state = BP_EAGAIN;
@@ -459,7 +511,7 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
}
#endif
- balloon_append(page);
+ balloon_append(page, core_driver);
}
flush_tlb_all();
@@ -498,7 +550,7 @@ static void balloon_process(struct work_struct *work)
}
if (credit < 0)
- state = decrease_reservation(-credit, GFP_BALLOON);
+ state = decrease_reservation(-credit, 0, false);
state = update_schedule(state);
@@ -557,10 +609,10 @@ int alloc_xenballooned_pages(int nr_pages, struct page **pages, bool highmem)
adjust_managed_page_count(page, 1);
} else {
enum bp_state st;
+ gfp_t gfp = highmem ? GFP_HIGHUSER : GFP_USER;
if (page)
- balloon_append(page);
- st = decrease_reservation(nr_pages - pgno,
- highmem ? GFP_HIGHUSER : GFP_USER);
+ balloon_append(page, true);
+ st = decrease_reservation(nr_pages - pgno, gfp, true);
if (st != BP_DONE)
goto out_undo;
}
@@ -569,7 +621,7 @@ int alloc_xenballooned_pages(int nr_pages, struct page **pages, bool highmem)
return 0;
out_undo:
while (pgno)
- balloon_append(pages[--pgno]);
+ balloon_append(pages[--pgno], true);
/* Free the memory back to the kernel soon */
schedule_delayed_work(&balloon_worker, 0);
mutex_unlock(&xen_balloon.balloon_mutex);
@@ -591,7 +643,7 @@ void free_xenballooned_pages(int nr_pages, struct page **pages)
for (i = 0; i < nr_pages; i++) {
if (pages[i])
- balloon_append(pages[i]);
+ balloon_append(pages[i], true);
}
/* The balloon may be too large now. Shrink it if needed. */
@@ -620,7 +672,7 @@ static void __init balloon_add_region(unsigned long start_pfn,
/* totalram_pages and totalhigh_pages do not
include the boot-time balloon extension, so
don't subtract from it. */
- __balloon_append(page);
+ __balloon_append(page, true);
}
}
@@ -658,9 +710,25 @@ static struct notifier_block balloon_cpu_notifier = {
.notifier_call = balloon_cpu_notify,
};
+static const struct address_space_operations xen_balloon_aops;
+#ifdef CONFIG_BALLOON_COMPACTION
+static int xen_balloon_migratepage(struct address_space *mapping,
+ struct page *newpage, struct page *page,
+ enum migrate_mode mode)
+{
+ return -EAGAIN;
+}
+
+static const struct address_space_operations xen_balloon_aops = {
+ .migratepage = xen_balloon_migratepage,
+};
+#endif /* CONFIG_BALLOON_COMPACTION */
+
static int __init balloon_init(void)
{
int i, cpu;
+ struct address_space *mapping;
+ int err = 0;
if (!xen_domain())
return -ENODEV;
@@ -683,6 +751,27 @@ static int __init balloon_init(void)
memset(&xen_balloon, 0, sizeof(xen_balloon));
+ /* Allocate core balloon driver which is in charge of movable
+ * pages.
+ */
+ xen_balloon.xb_dev_info = balloon_devinfo_alloc(&xen_balloon);
+
+ if (IS_ERR(xen_balloon.xb_dev_info)) {
+ err = PTR_ERR(xen_balloon.xb_dev_info);
+ goto out_err;
+ }
+
+ /* Check CONFIG_BALLOON_COMPACTION */
+ if (balloon_compaction_check()) {
+ mapping = balloon_mapping_alloc(xen_balloon.xb_dev_info,
+ &xen_balloon_aops);
+ if (IS_ERR(mapping)) {
+ err = PTR_ERR(mapping);
+ goto out_free_xb_dev_info;
+ }
+ pr_info("balloon page compaction enabled\n");
+ }
+
mutex_init(&xen_balloon.balloon_mutex);
INIT_LIST_HEAD(&xen_balloon.ballooned_pages);
@@ -718,6 +807,11 @@ static int __init balloon_init(void)
PFN_DOWN(xen_extra_mem[i].size));
return 0;
+
+out_free_xb_dev_info:
+ balloon_devinfo_free(xen_balloon.xb_dev_info);
+out_err:
+ return err;
}
subsys_initcall(balloon_init);
diff --git a/include/xen/balloon.h b/include/xen/balloon.h
index 1d7efae..f35b712 100644
--- a/include/xen/balloon.h
+++ b/include/xen/balloon.h
@@ -1,6 +1,7 @@
/******************************************************************************
* Xen balloon functionality
*/
+#include <linux/balloon_compaction.h>
#define RETRY_UNLIMITED 0
@@ -32,6 +33,9 @@ struct xen_balloon {
/* Memory statistic */
struct balloon_stats balloon_stats;
+
+ /* Core balloon driver - in charge of movable pages */
+ struct balloon_dev_info *xb_dev_info;
};
extern struct xen_balloon xen_balloon;
--
1.7.10.4
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH RFC 07/10] xen/balloon: factor out some helper functions
2014-10-15 15:54 [PATCH RFC 00/10] Xen balloon page compaction support Wei Liu
` (5 preceding siblings ...)
2014-10-15 15:54 ` [PATCH RFC 06/10] xen/balloon: make use of generic balloon driver Wei Liu
@ 2014-10-15 15:54 ` Wei Liu
2014-10-15 15:54 ` [PATCH RFC 08/10] xen/balloon: implement migratepage Wei Liu
` (4 subsequent siblings)
11 siblings, 0 replies; 22+ messages in thread
From: Wei Liu @ 2014-10-15 15:54 UTC (permalink / raw)
To: xen-devel
Cc: Wei Liu, ian.campbell, stefano.stabellini, david.vrabel, boris.ostrovsky
They will be used in the page migration routine.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
drivers/xen/balloon.c | 121 +++++++++++++++++++++++++++----------------------
1 file changed, 68 insertions(+), 53 deletions(-)
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 24efdf6..815e1d5 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -312,6 +312,67 @@ static inline void __reinsert_balloon_pages(struct list_head *head)
spin_unlock_irqrestore(&xen_balloon.xb_dev_info->pages_lock, flags);
}
+static int __memory_op_hypercall(int cmd, xen_pfn_t *list, xen_ulong_t nr)
+{
+ int rc;
+ struct xen_memory_reservation reservation = {
+ .address_bits = 0,
+ .extent_order = 0,
+ .domid = DOMID_SELF
+ };
+
+ set_xen_guest_handle(reservation.extent_start, list);
+ reservation.nr_extents = nr;
+ rc = HYPERVISOR_memory_op(cmd, &reservation);
+
+ return rc;
+}
+
+static void __link_back_to_pagetable(struct page *page, xen_ulong_t mfn,
+ pte_t pte)
+{
+#ifdef CONFIG_XEN_HAVE_PVMMU
+ unsigned long pfn = page_to_pfn(page);
+
+ if (!xen_feature(XENFEAT_auto_translated_physmap)) {
+ set_phys_to_machine(pfn, mfn);
+
+ /* Link back into the page tables if not highmem. */
+ if (!PageHighMem(page)) {
+ int ret;
+ ret = HYPERVISOR_update_va_mapping(
+ (unsigned long)__va(pfn << PAGE_SHIFT),
+ pte, 0);
+ BUG_ON(ret);
+ }
+ }
+#endif
+}
+
+static void __replace_mapping_with_scratch_page(struct page *page)
+{
+#ifdef CONFIG_XEN_HAVE_PVMMU
+ /*
+ * Ballooned out frames are effectively replaced with
+ * a scratch frame. Ensure direct mappings and the
+ * p2m are consistent.
+ */
+ if (!xen_feature(XENFEAT_auto_translated_physmap)) {
+ unsigned long p, smfn;
+ struct page *scratch_page = get_balloon_scratch_page();
+
+ p = page_to_pfn(scratch_page);
+ smfn = pfn_to_mfn(p);
+
+ __link_back_to_pagetable(page, smfn,
+ mfn_pte(smfn, PAGE_KERNEL_RO));
+
+ put_balloon_scratch_page();
+ }
+#endif
+}
+
+
/* This function will always try to fill in pages managed by Xen
* balloon driver, then pages managed by generic balloon driver.
*/
@@ -322,11 +383,6 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
struct page *page;
LIST_HEAD(queue);
bool xen_pages;
- struct xen_memory_reservation reservation = {
- .address_bits = 0,
- .extent_order = 0,
- .domid = DOMID_SELF
- };
#ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG
if (!xen_balloon.balloon_stats.balloon_low &&
@@ -365,9 +421,8 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
}
/* Second step: issue hypercall */
- set_xen_guest_handle(reservation.extent_start, frame_list);
- reservation.nr_extents = nr_pages;
- rc = HYPERVISOR_memory_op(XENMEM_populate_physmap, &reservation);
+ rc = __memory_op_hypercall(XENMEM_populate_physmap, frame_list,
+ nr_pages);
if (rc <= 0) {
rc = BP_EAGAIN;
goto move_pages_back;
@@ -382,21 +437,8 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
list_del(&page->lru);
pfn = page_to_pfn(page);
-#ifdef CONFIG_XEN_HAVE_PVMMU
- if (!xen_feature(XENFEAT_auto_translated_physmap)) {
- set_phys_to_machine(pfn, frame_list[i]);
-
- /* Link back into the page tables if not highmem. */
- if (!PageHighMem(page)) {
- int ret;
- ret = HYPERVISOR_update_va_mapping(
- (unsigned long)__va(pfn << PAGE_SHIFT),
- mfn_pte(frame_list[i], PAGE_KERNEL),
- 0);
- BUG_ON(ret);
- }
- }
-#endif
+ __link_back_to_pagetable(page, frame_list[i],
+ mfn_pte(frame_list[i], PAGE_KERNEL));
/* Relinquish the page back to the allocator. */
if (xen_pages)
@@ -439,11 +481,6 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp,
unsigned long pfn, i;
struct page *page;
int ret;
- struct xen_memory_reservation reservation = {
- .address_bits = 0,
- .extent_order = 0,
- .domid = DOMID_SELF
- };
#ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG
if (xen_balloon.balloon_stats.hotplug_pages) {
@@ -489,36 +526,14 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp,
frame_list[i] = pfn_to_mfn(pfn);
page = pfn_to_page(pfn);
-#ifdef CONFIG_XEN_HAVE_PVMMU
- /*
- * Ballooned out frames are effectively replaced with
- * a scratch frame. Ensure direct mappings and the
- * p2m are consistent.
- */
- if (!xen_feature(XENFEAT_auto_translated_physmap)) {
- if (!PageHighMem(page)) {
- struct page *scratch_page = get_balloon_scratch_page();
-
- ret = HYPERVISOR_update_va_mapping(
- (unsigned long)__va(pfn << PAGE_SHIFT),
- pfn_pte(page_to_pfn(scratch_page),
- PAGE_KERNEL_RO), 0);
- BUG_ON(ret);
-
- put_balloon_scratch_page();
- }
- __set_phys_to_machine(pfn, INVALID_P2M_ENTRY);
- }
-#endif
-
+ __replace_mapping_with_scratch_page(page);
balloon_append(page, core_driver);
}
flush_tlb_all();
- set_xen_guest_handle(reservation.extent_start, frame_list);
- reservation.nr_extents = nr_pages;
- ret = HYPERVISOR_memory_op(XENMEM_decrease_reservation, &reservation);
+ ret = __memory_op_hypercall(XENMEM_decrease_reservation, frame_list,
+ nr_pages);
BUG_ON(ret != nr_pages);
xen_balloon.balloon_stats.current_pages -= nr_pages;
--
1.7.10.4
* [PATCH RFC 08/10] xen/balloon: implement migratepage
2014-10-15 15:54 [PATCH RFC 00/10] Xen balloon page compaction support Wei Liu
` (6 preceding siblings ...)
2014-10-15 15:54 ` [PATCH RFC 07/10] xen/balloon: factor out some helper functions Wei Liu
@ 2014-10-15 15:54 ` Wei Liu
2014-10-15 16:16 ` David Vrabel
2014-10-15 15:54 ` [PATCH RFC 09/10] balloon: BALLOON_COMPACTION now depends on XEN_BALLOON Wei Liu
` (3 subsequent siblings)
11 siblings, 1 reply; 22+ messages in thread
From: Wei Liu @ 2014-10-15 15:54 UTC (permalink / raw)
To: xen-devel
Cc: Wei Liu, ian.campbell, stefano.stabellini, david.vrabel, boris.ostrovsky
This patch replaces the xen_balloon_migratepage stub with an actual
implementation.
Migration is done in two macro steps:
1. populate the old page with a machine page and remove it from the
balloon driver
2. give up the machine page that backs the new page and queue the new
page to the balloon driver
The old page is a page owned by the balloon driver; the new page is a
page isolated by the core mm compaction thread.
Together these steps swap a ballooned-out page for a usable page. From
the guest's point of view, this de-fragments the physical address space.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
drivers/xen/balloon.c | 94 +++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 92 insertions(+), 2 deletions(-)
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 815e1d5..112190e 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -727,13 +727,103 @@ static struct notifier_block balloon_cpu_notifier = {
static const struct address_space_operations xen_balloon_aops;
#ifdef CONFIG_BALLOON_COMPACTION
+/*
+ * xen_balloon_migratepage - perform the balloon page migration on behalf of
+ * a compation thread. (called under page lock)
+ * @mapping: the page->mapping which will be assigned to the new migrated page
+ * @newpage: page that will replace the isolated page after migration finishes
+ * @page : the isolated (old) page that is about to be migrated to newpage.
+ * @mode : compaction mode -- not used for balloon page migration.
+ *
+ * After a ballooned page gets isolated by compaction procedures, this
+ * is the function that performs the page migration on behalf of a
+ * compaction thread. The page migration for Xen balloon is done in
+ * these two macro steps:
+ *
+ * A. back @page with machine page
+ * B. release the machine page that backs @newpage
+ *
+ * Logically the above steps should work in reversed order (first B
+ * then A). But if we fail in A (in BA order) due to memory pressure
+ * in Xen we might not get back @newpage easily. With current order,
+ * we can safely return -EAGAIN if step A fails (either due to memory
+ * cap for guest or out of memory in hypervisor).
+ *
+ * This function performs the balloon page migration task.
+ * Called through balloon_mapping->a_ops->migratepage
+ */
static int xen_balloon_migratepage(struct address_space *mapping,
struct page *newpage, struct page *page,
enum migrate_mode mode)
{
- return -EAGAIN;
-}
+ struct xen_balloon *xb;
+ struct balloon_dev_info *info = balloon_page_device(page);
+ unsigned long flags;
+ unsigned long pfn;
+ xen_pfn_t frame;
+ int rc;
+
+ BUG_ON(!info);
+ BUG_ON(info->balloon_device != &xen_balloon);
+
+ xb = info->balloon_device;
+
+ /* Avoid contention if we're increasing / decreasing
+ * reservation.
+ */
+ if (!mutex_trylock(&xb->balloon_mutex))
+ return -EAGAIN;
+
+ kmap_flush_unused();
+
+ /*
+ * Step A:
+ * Back page with machine page
+ */
+ frame = page_to_pfn(page);
+
+ rc = __memory_op_hypercall(XENMEM_populate_physmap, &frame, 1);
+ if (rc != 1) {
+ rc = -EAGAIN;
+ goto out;
+ }
+ /*
+ * It's safe to delete page->lru here because this page is at
+ * an isolated migration list, and this step is expected to happen here
+ */
+ balloon_page_delete(page);
+
+ if (!xen_feature(XENFEAT_auto_translated_physmap))
+ __link_back_to_pagetable(page, frame,
+ mfn_pte(frame, PAGE_KERNEL));
+
+ /*
+ * Step B:
+ * Give up newpage's backing machine page and add it to list
+ */
+ pfn = page_to_pfn(newpage);
+ frame = pfn_to_mfn(pfn);
+ scrub_page(newpage);
+ if (!xen_feature(XENFEAT_auto_translated_physmap))
+ __replace_mapping_with_scratch_page(newpage);
+
+ rc = __memory_op_hypercall(XENMEM_decrease_reservation, &frame, 1);
+ BUG_ON(rc != 1);
+
+ spin_lock_irqsave(&info->pages_lock, flags);
+ balloon_page_insert(newpage, mapping, &info->pages);
+ info->isolated_pages--;
+ spin_unlock_irqrestore(&info->pages_lock, flags);
+
+ rc = MIGRATEPAGE_BALLOON_SUCCESS;
+
+ flush_tlb_all();
+out:
+ mutex_unlock(&xb->balloon_mutex);
+
+ return rc;
+}
static const struct address_space_operations xen_balloon_aops = {
.migratepage = xen_balloon_migratepage,
};
--
1.7.10.4
* Re: [PATCH RFC 08/10] xen/balloon: implement migratepage
2014-10-15 15:54 ` [PATCH RFC 08/10] xen/balloon: implement migratepage Wei Liu
@ 2014-10-15 16:16 ` David Vrabel
2014-10-16 9:28 ` Ian Campbell
0 siblings, 1 reply; 22+ messages in thread
From: David Vrabel @ 2014-10-15 16:16 UTC (permalink / raw)
To: Wei Liu, xen-devel; +Cc: boris.ostrovsky, stefano.stabellini, ian.campbell
On 15/10/14 16:54, Wei Liu wrote:
> This patch replaces the xen_balloon_migratepage stub with actual
> implementation.
>
> It's implemented in two macro steps:
> 1. populate old page with machine page and remove it from balloon driver
> 2. give up machine page that backs new page and queue it to balloon
> driver
>
> Old page is the page owned by balloon driver, new page is a page
> isolated by core mm compaction thread.
>
> The aforementioned steps swaps a ballooned out page with a usable page.
> From guest's point of view, it de-fragments physical address space.
>
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> ---
> drivers/xen/balloon.c | 94 +++++++++++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 92 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index 815e1d5..112190e 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -727,13 +727,103 @@ static struct notifier_block balloon_cpu_notifier = {
>
> static const struct address_space_operations xen_balloon_aops;
> #ifdef CONFIG_BALLOON_COMPACTION
> +/*
> + * xen_balloon_migratepage - perform the balloon page migration on behalf of
> + * a compation thread. (called under page lock)
> + * @mapping: the page->mapping which will be assigned to the new migrated page
> + * @newpage: page that will replace the isolated page after migration finishes
> + * @page : the isolated (old) page that is about to be migrated to newpage.
> + * @mode : compaction mode -- not used for balloon page migration.
> + *
> + * After a ballooned page gets isolated by compaction procedures, this
> + * is the function that performs the page migration on behalf of a
> + * compaction thread. The page migration for Xen balloon is done in
> + * these two macro steps:
> + *
> + * A. back @page with machine page
> + * B. release machine that backs @newpage
If the target is enforced and the guest is above the target, this
prevents any compaction. You need an atomic swap operation.
Also, "frame" is the term for "machine page".
David
* Re: [PATCH RFC 08/10] xen/balloon: implement migratepage
2014-10-15 16:16 ` David Vrabel
@ 2014-10-16 9:28 ` Ian Campbell
0 siblings, 0 replies; 22+ messages in thread
From: Ian Campbell @ 2014-10-16 9:28 UTC (permalink / raw)
To: David Vrabel; +Cc: boris.ostrovsky, Wei Liu, stefano.stabellini, xen-devel
On Wed, 2014-10-15 at 17:16 +0100, David Vrabel wrote:
> On 15/10/14 16:54, Wei Liu wrote:
> > This patch replaces the xen_balloon_migratepage stub with actual
> > implementation.
> >
> > It's implemented in two macro steps:
> > 1. populate old page with machine page and remove it from balloon driver
> > 2. give up machine page that backs new page and queue it to balloon
> > driver
> >
> > Old page is the page owned by balloon driver, new page is a page
> > isolated by core mm compaction thread.
> >
> > The aforementioned steps swaps a ballooned out page with a usable page.
> > From guest's point of view, it de-fragments physical address space.
> >
> > Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> > ---
> > drivers/xen/balloon.c | 94 +++++++++++++++++++++++++++++++++++++++++++++++--
> > 1 file changed, 92 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> > index 815e1d5..112190e 100644
> > --- a/drivers/xen/balloon.c
> > +++ b/drivers/xen/balloon.c
> > @@ -727,13 +727,103 @@ static struct notifier_block balloon_cpu_notifier = {
> >
> > static const struct address_space_operations xen_balloon_aops;
> > #ifdef CONFIG_BALLOON_COMPACTION
> > +/*
> > + * xen_balloon_migratepage - perform the balloon page migration on behalf of
> > + * a compation thread. (called under page lock)
> > + * @mapping: the page->mapping which will be assigned to the new migrated page
> > + * @newpage: page that will replace the isolated page after migration finishes
> > + * @page : the isolated (old) page that is about to be migrated to newpage.
> > + * @mode : compaction mode -- not used for balloon page migration.
> > + *
> > + * After a ballooned page gets isolated by compaction procedures, this
> > + * is the function that performs the page migration on behalf of a
> > + * compaction thread. The page migration for Xen balloon is done in
> > + * these two macro steps:
> > + *
> > + * A. back @page with machine page
> > + * B. release machine that backs @newpage
>
> If the target if enforced and the guest is above the target. This
> prevent any compaction. You need an atomic swap operation.
XENMEM_exchange ought to be usable here I think.
Ian.
* [PATCH RFC 09/10] balloon: BALLOON_COMPACTION now depends on XEN_BALLOON
2014-10-15 15:54 [PATCH RFC 00/10] Xen balloon page compaction support Wei Liu
` (7 preceding siblings ...)
2014-10-15 15:54 ` [PATCH RFC 08/10] xen/balloon: implement migratepage Wei Liu
@ 2014-10-15 15:54 ` Wei Liu
2014-10-15 15:54 ` [PATCH RFC 10/10] XXX: balloon bitmap and sysrq key to dump bitmap Wei Liu
` (2 subsequent siblings)
11 siblings, 0 replies; 22+ messages in thread
From: Wei Liu @ 2014-10-15 15:54 UTC (permalink / raw)
To: xen-devel
Cc: Wei Liu, ian.campbell, stefano.stabellini, david.vrabel, boris.ostrovsky
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
mm/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/Kconfig b/mm/Kconfig
index 886db21..4d30f0b 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -232,7 +232,7 @@ config ARCH_ENABLE_SPLIT_PMD_PTLOCK
config BALLOON_COMPACTION
bool "Allow for balloon memory compaction/migration"
def_bool y
- depends on COMPACTION && VIRTIO_BALLOON
+ depends on COMPACTION && (VIRTIO_BALLOON || XEN_BALLOON)
help
Memory fragmentation introduced by ballooning might reduce
significantly the number of 2MB contiguous memory blocks that can be
--
1.7.10.4
* [PATCH RFC 10/10] XXX: balloon bitmap and sysrq key to dump bitmap
2014-10-15 15:54 [PATCH RFC 00/10] Xen balloon page compaction support Wei Liu
` (8 preceding siblings ...)
2014-10-15 15:54 ` [PATCH RFC 09/10] balloon: BALLOON_COMPACTION now depends on XEN_BALLOON Wei Liu
@ 2014-10-15 15:54 ` Wei Liu
2014-10-15 16:25 ` [PATCH RFC 00/10] Xen balloon page compaction support David Vrabel
2014-10-15 16:54 ` Andrew Cooper
11 siblings, 0 replies; 22+ messages in thread
From: Wei Liu @ 2014-10-15 15:54 UTC (permalink / raw)
To: xen-devel
Cc: Wei Liu, ian.campbell, stefano.stabellini, david.vrabel, boris.ostrovsky
Maintain a bitmap of balloon pages to give an intuitive overview of
their locations.
Write 'x' to /proc/sysrq-trigger to print out this bitmap.
This patch is for debugging purposes only.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
drivers/xen/balloon.c | 60 +++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 60 insertions(+)
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 112190e..9af6206 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -54,6 +54,10 @@
#include <linux/memory.h>
#include <linux/memory_hotplug.h>
#include <linux/percpu-defs.h>
+#include <linux/bitmap.h>
+#include <linux/types.h>
+#include <linux/bitops.h>
+#include <linux/sysrq.h>
#include <asm/page.h>
#include <asm/pgalloc.h>
@@ -102,6 +106,53 @@ static void scrub_page(struct page *page)
#endif
}
+static unsigned long *balloon_bitmap;
+static unsigned long balloon_bitmap_len;
+
+static inline void xen_balloon_bitmap_init(void)
+{
+ balloon_bitmap_len = BITS_TO_LONGS(max_pfn);
+ balloon_bitmap = kzalloc(balloon_bitmap_len * sizeof(unsigned long),
+ GFP_KERNEL);
+ /* XXX: this bitmap is only for debugging anyway... */
+ BUG_ON(!balloon_bitmap);
+}
+
+static inline void xen_balloon_bitmap_set(struct page *page)
+{
+ unsigned long pfn = page_to_pfn(page);
+ set_bit(pfn, balloon_bitmap);
+}
+
+static inline void xen_balloon_bitmap_clear(struct page *page)
+{
+ unsigned long pfn = page_to_pfn(page);
+ clear_bit(pfn, balloon_bitmap);
+}
+
+void xen_balloon_bitmap_dump(void)
+{
+ unsigned long i;
+
+ for (i = 0; i < balloon_bitmap_len; i++) {
+ if ((i % 8) == 0)
+ printk("%8lu: ", i);
+ printk("%016lx ", balloon_bitmap[i]);
+ if (((i + 1) % 8) == 0)
+ printk("\n");
+ }
+}
+
+static void sysrq_handle_dump_xen_balloon_bitmap(int key)
+{
+ xen_balloon_bitmap_dump();
+}
+static struct sysrq_key_op sysrq_xen_op = {
+ .handler = sysrq_handle_dump_xen_balloon_bitmap,
+ .help_msg = "dump-balloon-bitmap(x)",
+ .action_msg = "Dump balloon bitmap",
+};
+
static inline void update_balloon_stats(struct page *page, int count)
{
if (PageHighMem(page))
@@ -132,6 +183,7 @@ static void balloon_append(struct page *page, bool managed)
{
__balloon_append(page, managed);
adjust_managed_page_count(page, -1);
+ xen_balloon_bitmap_set(page);
}
/* balloon_retrieve: rescue a page from Xen balloon driver, if it is
@@ -449,6 +501,7 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
/* We only account for those pages that have been populated. */
update_balloon_stats(page, -1);
adjust_managed_page_count(page, 1);
+ xen_balloon_bitmap_clear(page);
}
xen_balloon.balloon_stats.current_pages += rc;
@@ -622,6 +675,7 @@ int alloc_xenballooned_pages(int nr_pages, struct page **pages, bool highmem)
pages[pgno++] = page;
update_balloon_stats(page, -1);
adjust_managed_page_count(page, 1);
+ xen_balloon_bitmap_clear(page);
} else {
enum bp_state st;
gfp_t gfp = highmem ? GFP_HIGHUSER : GFP_USER;
@@ -818,6 +872,9 @@ static int xen_balloon_migratepage(struct address_space *mapping,
rc = MIGRATEPAGE_BALLOON_SUCCESS;
+ xen_balloon_bitmap_clear(page);
+ xen_balloon_bitmap_set(newpage);
+
flush_tlb_all();
out:
mutex_unlock(&xb->balloon_mutex);
@@ -911,6 +968,9 @@ static int __init balloon_init(void)
balloon_add_region(PFN_UP(xen_extra_mem[i].start),
PFN_DOWN(xen_extra_mem[i].size));
+ xen_balloon_bitmap_init();
+ register_sysrq_key('x', &sysrq_xen_op);
+
return 0;
out_free_xb_dev_info:
--
1.7.10.4
* Re: [PATCH RFC 00/10] Xen balloon page compaction support
2014-10-15 15:54 [PATCH RFC 00/10] Xen balloon page compaction support Wei Liu
` (9 preceding siblings ...)
2014-10-15 15:54 ` [PATCH RFC 10/10] XXX: balloon bitmap and sysrq key to dump bitmap Wei Liu
@ 2014-10-15 16:25 ` David Vrabel
2014-10-15 16:30 ` Wei Liu
2014-10-15 16:54 ` Andrew Cooper
11 siblings, 1 reply; 22+ messages in thread
From: David Vrabel @ 2014-10-15 16:25 UTC (permalink / raw)
To: Wei Liu, xen-devel
Cc: boris.ostrovsky, david.vrabel, ian.campbell, stefano.stabellini
On 15/10/14 16:54, Wei Liu wrote:
> Hi all
>
> This is a prototype to make Xen balloon driver work with balloon page
> compaction. The goal is to reduce guest physical address space fragmentation
> introduced by balloon pages. Having guest physical address space as contiguous
> as possible is generally useful (because guest can have higher order pages), and
> it should also help improve HVM performance (provided that guest kernel knows
> how to use huge pages -- Linux has hugetlbfs and transparent huge page).
>
> The approach is simple. Core balloon driver is made one of the page sources of
> Xen balloon driver. Those pages allocated from core balloon driver are subject
> to balloon page compaction.
Your page migrate function is broken. I'm not going to review anything
else in this series until you propose a solution for this.
David
* Re: [PATCH RFC 00/10] Xen balloon page compaction support
2014-10-15 16:25 ` [PATCH RFC 00/10] Xen balloon page compaction support David Vrabel
@ 2014-10-15 16:30 ` Wei Liu
2014-10-16 9:31 ` David Vrabel
0 siblings, 1 reply; 22+ messages in thread
From: Wei Liu @ 2014-10-15 16:30 UTC (permalink / raw)
To: David Vrabel
Cc: boris.ostrovsky, stefano.stabellini, Wei Liu, ian.campbell, xen-devel
On Wed, Oct 15, 2014 at 05:25:41PM +0100, David Vrabel wrote:
> On 15/10/14 16:54, Wei Liu wrote:
> > Hi all
> >
> > This is a prototype to make Xen balloon driver work with balloon page
> > compaction. The goal is to reduce guest physical address space fragmentation
> > introduced by balloon pages. Having guest physical address space as contiguous
> > as possible is generally useful (because guest can have higher order pages), and
> > it should also help improve HVM performance (provided that guest kernel knows
> > how to use huge pages -- Linux has hugetlbfs and transparent huge page).
> >
> > The approach is simple. Core balloon driver is made one of the page sources of
> > Xen balloon driver. Those pages allocated from core balloon driver are subject
> > to balloon page compaction.
>
> Your page migrate function is broken. I'm not going to review anything
> else in this series until you propose a solution for this.
>
What do you think about this idea in general? That's what I care at this
point. You can skip reviewing at the moment.
Wei.
> David
* Re: [PATCH RFC 00/10] Xen balloon page compaction support
2014-10-15 16:30 ` Wei Liu
@ 2014-10-16 9:31 ` David Vrabel
0 siblings, 0 replies; 22+ messages in thread
From: David Vrabel @ 2014-10-16 9:31 UTC (permalink / raw)
To: Wei Liu; +Cc: boris.ostrovsky, stefano.stabellini, ian.campbell, xen-devel
On 15/10/14 17:30, Wei Liu wrote:
> On Wed, Oct 15, 2014 at 05:25:41PM +0100, David Vrabel wrote:
>> On 15/10/14 16:54, Wei Liu wrote:
>>> Hi all
>>>
>>> This is a prototype to make Xen balloon driver work with balloon page
>>> compaction. The goal is to reduce guest physical address space fragmentation
>>> introduced by balloon pages. Having guest physical address space as contiguous
>>> as possible is generally useful (because guest can have higher order pages), and
>>> it should also help improve HVM performance (provided that guest kernel knows
>>> how to use huge pages -- Linux has hugetlbfs and transparent huge page).
>>>
>>> The approach is simple. Core balloon driver is made one of the page sources of
>>> Xen balloon driver. Those pages allocated from core balloon driver are subject
>>> to balloon page compaction.
>>
>> Your page migrate function is broken. I'm not going to review anything
>> else in this series until you propose a solution for this.
>>
>
> What do you think about this idea in general? That's what I care at this
> point. You can skip reviewing at the moment.
In light of Andrew's comments I would like to see a design for the
complete solution, including the parts necessary to coalesce or preserve
superpages in the stage 2 translation tables.
In particular, it looks like page migration will just fragment the stage
2 tables.
David
* Re: [PATCH RFC 00/10] Xen balloon page compaction support
2014-10-15 15:54 [PATCH RFC 00/10] Xen balloon page compaction support Wei Liu
` (10 preceding siblings ...)
2014-10-15 16:25 ` [PATCH RFC 00/10] Xen balloon page compaction support David Vrabel
@ 2014-10-15 16:54 ` Andrew Cooper
2014-10-15 17:00 ` Wei Liu
11 siblings, 1 reply; 22+ messages in thread
From: Andrew Cooper @ 2014-10-15 16:54 UTC (permalink / raw)
To: Wei Liu, xen-devel
Cc: boris.ostrovsky, david.vrabel, ian.campbell, stefano.stabellini
On 15/10/14 16:54, Wei Liu wrote:
> Hi all
>
> This is a prototype to make Xen balloon driver work with balloon page
> compaction. The goal is to reduce guest physical address space fragmentation
> introduced by balloon pages. Having guest physical address space as contiguous
> as possible is generally useful (because guest can have higher order pages), and
> it should also help improve HVM performance (provided that guest kernel knows
> how to use huge pages -- Linux has hugetlbfs and transparent huge page).
After you have defragmented guest physical memory, does Linux use
2MB/1GB superpages more readily?
On what basis do you think this will improve HVM performance? The HAP
tables still remain fragmented after ballooning.
~Andrew
* Re: [PATCH RFC 00/10] Xen balloon page compaction support
2014-10-15 16:54 ` Andrew Cooper
@ 2014-10-15 17:00 ` Wei Liu
2014-10-15 17:14 ` Andrew Cooper
0 siblings, 1 reply; 22+ messages in thread
From: Wei Liu @ 2014-10-15 17:00 UTC (permalink / raw)
To: Andrew Cooper
Cc: Wei Liu, ian.campbell, stefano.stabellini, xen-devel,
david.vrabel, boris.ostrovsky
On Wed, Oct 15, 2014 at 05:54:24PM +0100, Andrew Cooper wrote:
> On 15/10/14 16:54, Wei Liu wrote:
> > Hi all
> >
> > This is a prototype to make Xen balloon driver work with balloon page
> > compaction. The goal is to reduce guest physical address space fragmentation
> > introduced by balloon pages. Having guest physical address space as contiguous
> > as possible is generally useful (because guest can have higher order pages), and
> > it should also help improve HVM performance (provided that guest kernel knows
> > how to use huge pages -- Linux has hugetlbfs and transparent huge page).
>
> After you have defragmented guest physical memory, does Linux use
> 2MB/1GB superpages more readily?
>
That's completely up to the guest. Having a contiguous guest physical
address space is a prerequisite.
> On what basis do you think this will improve HVM performance? The HAP
> tables still remain fragmented after ballooning.
>
That's the other side of this problem and orthogonal to this patch
series. It should be fixed separately on the hypervisor side, presumably
with a similar mechanism to coalesce the HAP tables in the hypervisor.
Wei.
> ~Andrew
* Re: [PATCH RFC 00/10] Xen balloon page compaction support
2014-10-15 17:00 ` Wei Liu
@ 2014-10-15 17:14 ` Andrew Cooper
2014-10-16 9:12 ` Wei Liu
2014-10-16 9:26 ` Ian Campbell
0 siblings, 2 replies; 22+ messages in thread
From: Andrew Cooper @ 2014-10-15 17:14 UTC (permalink / raw)
To: Wei Liu
Cc: david.vrabel, boris.ostrovsky, stefano.stabellini, ian.campbell,
xen-devel
On 15/10/14 18:00, Wei Liu wrote:
> On Wed, Oct 15, 2014 at 05:54:24PM +0100, Andrew Cooper wrote:
>> On 15/10/14 16:54, Wei Liu wrote:
>>> Hi all
>>>
>>> This is a prototype to make Xen balloon driver work with balloon page
>>> compaction. The goal is to reduce guest physical address space fragmentation
>>> introduced by balloon pages. Having guest physical address space as contiguous
>>> as possible is generally useful (because guest can have higher order pages), and
>>> it should also help improve HVM performance (provided that guest kernel knows
>>> how to use huge pages -- Linux has hugetlbfs and transparent huge page).
>> After you have defragmented guest physical memory, does Linux use
>> 2MB/1GB superpages more readily?
>>
> That's completely up to the guest. Having contiguous guest physical
> address space is prerequisite.
>
>> On what basis do you think this will improve HVM performance? The HAP
>> tables still remain fragmented after ballooning.
>>
> That's the other side of this problem and orthogonal to this patch
> series. It should be fixed separately on the hypervisor side, presumably
> with similar mechanism to coalesce HAP table in hypervisor.
You can't rearrange the memory of any domain with passthrough, or any
subregion which is mapped by another domain. Even if the underlying
pages are in order, you can't coalesce 4K pages to 2M pages for any
region containing a grant.
By far the best solution for both guest and hypervisor is for Xen to do
its utmost to allocate superpages to start with, and then for the guest
not to shoot holes in its physical address space in the first place.
This would involve intelligently choosing which pages to balloon/grant
first, rather than blindly choosing the first free page.
~Andrew
* Re: [PATCH RFC 00/10] Xen balloon page compaction support
2014-10-15 17:14 ` Andrew Cooper
@ 2014-10-16 9:12 ` Wei Liu
2014-10-16 9:26 ` Ian Campbell
1 sibling, 0 replies; 22+ messages in thread
From: Wei Liu @ 2014-10-16 9:12 UTC (permalink / raw)
To: Andrew Cooper
Cc: Wei Liu, ian.campbell, stefano.stabellini, xen-devel,
david.vrabel, boris.ostrovsky
On Wed, Oct 15, 2014 at 06:14:56PM +0100, Andrew Cooper wrote:
> On 15/10/14 18:00, Wei Liu wrote:
> > On Wed, Oct 15, 2014 at 05:54:24PM +0100, Andrew Cooper wrote:
> >> On 15/10/14 16:54, Wei Liu wrote:
> >>> Hi all
> >>>
> >>> This is a prototype to make Xen balloon driver work with balloon page
> >>> compaction. The goal is to reduce guest physical address space fragmentation
> >>> introduced by balloon pages. Having guest physical address space as contiguous
> >>> as possible is generally useful (because guest can have higher order pages), and
> >>> it should also help improve HVM performance (provided that guest kernel knows
> >>> how to use huge pages -- Linux has hugetlbfs and transparent huge page).
> >> After you have defragmented guest physical memory, does Linux use
> >> 2MB/1GB superpages more readily?
> >>
> > That's completely up to the guest. Having contiguous guest physical
> > address space is prerequisite.
> >
> >> On what basis do you think this will improve HVM performance? The HAP
> >> tables still remain fragmented after ballooning.
> >>
> > That's the other side of this problem and orthogonal to this patch
> > series. It should be fixed separately on the hypervisor side, presumably
> > with similar mechanism to coalesce HAP table in hypervisor.
>
> You can't rearrange the memory of any domain with passthrough, or any
> subregion which is mapped by another domain. Even if the underlying
> pages are in order, you can't coalesce 4K pages to 2M pages for any
> region containing a grant.
>
But these cases don't invalidate my point, do they? We can still
coalesce regions that can be coalesced.
> By far the best solution for both guest and hypervisor is for Xen to do
> its utmost to allocate superpages to start with, and then for the guest
That is already the case now, as I understand it.
> not to shoot holes in its physical address space in the first place.
>
> This would involve intelligently choosing which pages to balloon/grant
> first, rather than blindly choosing the first free page.
>
That would require modifications to the Linux memory allocator, which are
very unlikely to get upstreamed.
Wei.
> ~Andrew
* Re: [PATCH RFC 00/10] Xen balloon page compaction support
2014-10-15 17:14 ` Andrew Cooper
2014-10-16 9:12 ` Wei Liu
@ 2014-10-16 9:26 ` Ian Campbell
2014-10-17 12:35 ` Andrew Cooper
1 sibling, 1 reply; 22+ messages in thread
From: Ian Campbell @ 2014-10-16 9:26 UTC (permalink / raw)
To: Andrew Cooper
Cc: boris.ostrovsky, stefano.stabellini, Wei Liu, david.vrabel, xen-devel
On Wed, 2014-10-15 at 18:14 +0100, Andrew Cooper wrote:
> On 15/10/14 18:00, Wei Liu wrote:
> > On Wed, Oct 15, 2014 at 05:54:24PM +0100, Andrew Cooper wrote:
> >> On 15/10/14 16:54, Wei Liu wrote:
> >>> Hi all
> >>>
> >>> This is a prototype to make Xen balloon driver work with balloon page
> >>> compaction. The goal is to reduce guest physical address space fragmentation
> >>> introduced by balloon pages. Having guest physical address space as contiguous
> >>> as possible is generally useful (because guest can have higher order pages), and
> >>> it should also help improve HVM performance (provided that guest kernel knows
> >>> how to use huge pages -- Linux has hugetlbfs and transparent huge page).
> >> After you have defragmented guest physical memory, does Linux use
> >> 2MB/1GB superpages more readily?
> >>
> > That's completely up to the guest. Having contiguous guest physical
> > address space is prerequisite.
> >
> >> On what basis do you think this will improve HVM performance? The HAP
> >> tables still remain fragmented after ballooning.
> >>
> > That's the other side of this problem and orthogonal to this patch
> > series. It should be fixed separately on the hypervisor side, presumably
> > with similar mechanism to coalesce HAP table in hypervisor.
>
> You can't rearrange the memory of any domain with passthrough, or any
> subregion which is mapped by another domain. Even if the underlying
> pages are in order,
That's fine. This work still improves things for plenty of other
domains.
I suspect that with a suitable amount of cunning it might be possible to do
this for domains with passthrough using the SMMU. Whether it is worth the
time and effort (since I certainly concede it won't be easy) I don't know.
> you can't coalesce 4K pages to 2M pages for any
> region containing a grant.
An *active* grant (i.e. something which is granted and actually used by
another domain). At least on ARM it is perfectly fine to grant a page in
the midst of a 2M mapping, with no need to shatter it on the granting domain.
> By far the best solution for both guest and hypervisor is for Xen to do
> its utmost to allocate superpages to start with, and then for the guest
> not to shoot holes in its physical address space in the first place.
Which having the guest try to balloon in 2M increments helps to achieve,
especially since e.g. grant mappings are placed into ballooned-out holes,
meaning they aren't poking holes in other bits of the address space.
> This would involve intelligently choosing which pages to balloon/grant
> first, rather than blindly choosing the first free page.
By trying to do 2M balloons first, and also trying to coalesce any 4K
ballooned pages into 2M using THP, we start to achieve this.
I don't think what you are suggesting is going to be practical within
the constraints of the kernel infrastructure.
Ian.
* Re: [PATCH RFC 00/10] Xen balloon page compaction support
2014-10-16 9:26 ` Ian Campbell
@ 2014-10-17 12:35 ` Andrew Cooper
0 siblings, 0 replies; 22+ messages in thread
From: Andrew Cooper @ 2014-10-17 12:35 UTC (permalink / raw)
To: Ian Campbell
Cc: boris.ostrovsky, stefano.stabellini, Wei Liu, david.vrabel, xen-devel
On 16/10/14 10:26, Ian Campbell wrote:
> On Wed, 2014-10-15 at 18:14 +0100, Andrew Cooper wrote:
>> On 15/10/14 18:00, Wei Liu wrote:
>>> On Wed, Oct 15, 2014 at 05:54:24PM +0100, Andrew Cooper wrote:
>>>> On 15/10/14 16:54, Wei Liu wrote:
>>>>> Hi all
>>>>>
>>>>> This is a prototype to make Xen balloon driver work with balloon page
>>>>> compaction. The goal is to reduce guest physical address space fragmentation
>>>>> introduced by balloon pages. Having guest physical address space as contiguous
>>>>> as possible is generally useful (because guest can have higher order pages), and
>>>>> it should also help improve HVM performance (provided that guest kernel knows
>>>>> how to use huge pages -- Linux has hugetlbfs and transparent huge page).
>>>> After you have defragmented guest physical memory, does Linux use
>>>> 2MB/1GB superpages more readily?
>>>>
>>> That's completely up to the guest. Having contiguous guest physical
>>> address space is prerequisite.
>>>
>>>> On what basis do you think this will improve HVM performance? The HAP
>>>> tables still remain fragmented after ballooning.
>>>>
>>> That's the other side of this problem and orthogonal to this patch
>>> series. It should be fixed separately on the hypervisor side, presumably
>>> with similar mechanism to coalesce HAP table in hypervisor.
>> You can't rearrange the memory of any domain with passthrough, or any
>> subregion which is mapped by another domain. Even if the underlying
>> pages are in order,
> That's fine. This work still improves things for plenty of other
> domains.
>
> I suspect with a suitable amount of cunning it might be possible to do
> this for domains with passthrough using smmu. Whether it is worth the
> time and effort (since I certain concede it won't be easy) I don't know.
x86 IOMMUs do not support restartable faults, and given the timing
constraints in the PCI spec, there is no obvious way to gain support
for them.
As a result, it is impossible for Xen to move a page, while maintaining
DMA read and write coherency for devices.
~Andrew