All of lore.kernel.org
* [PATCH 0/9] VMware balloon: Large page ballooning and VMCI support
@ 2015-04-14 17:29 Philip P. Moltmann
  2015-04-14 17:29 ` [PATCH 1/9] VMware balloon: partially inline vmballoon_reserve_page Philip P. Moltmann
                   ` (9 more replies)
  0 siblings, 10 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-04-14 17:29 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel, xdeguillard, akpm, pv-drivers

The VMware balloon driver optimizes memory reclamation when Linux runs
in a VM on a VMware hypervisor (such as ESXi, VMware Workstation, or
VMware Fusion). The hypervisor signals the balloon driver how much
memory to reclaim. The balloon driver allocates that amount of memory
via the regular Linux allocation mechanisms. Because the memory appears
to be in use by the driver, Linux does not touch it, and the hypervisor
can unmap it and give it to a different VM. This is no different from
any other balloon driver.

This series of patches improves the balloon driver's performance and
latency and reduces the overhead of reclaiming memory. All of this is
done while staying backwards compatible with every version of VMware's
hypervisors ever shipped.

There are four main improvements:
- The balloon driver and the hypervisor communicate their capabilities
  to each other when the balloon is loaded.
- Instead of inflating the memory a single 4 KB page at a time, the
  balloon allocates a whole list of pages and hands it off to the
  hypervisor at once.
- We can now allocate 2 MB as well as 4 KB pages, which further reduces
  the overhead and increases performance.
- When there is a demand to reclaim memory, the hypervisor signals the
  balloon driver via VMCI that there is something to do. Previously,
  the balloon driver polled every second.
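The basic (pre-batching) inflate cycle described above can be sketched as a toy model. All names here are invented for illustration only; the real driver hands pages to the hypervisor through a backdoor I/O port, not a stub like this:

```c
#include <assert.h>

/* Toy model of ballooning: the hypervisor publishes a target and the
 * guest inflates by handing over pages one at a time until the target
 * is reached. */
#define TOY_MAX_PAGES 64	/* cap standing in for guest memory */

struct toy_balloon {
	unsigned int size;	/* pages currently held by the balloon */
	unsigned int target;	/* pages the hypervisor wants reclaimed */
};

/* Stand-in for the per-page LOCK hypervisor call; always succeeds
 * here unless the toy memory cap is hit. */
static int toy_lock_page(struct toy_balloon *b)
{
	if (b->size >= TOY_MAX_PAGES)
		return -1;
	b->size++;
	return 0;
}

/* Inflate toward the target, one 4 KB page per iteration. */
static unsigned int toy_inflate(struct toy_balloon *b)
{
	while (b->size < b->target)
		if (toy_lock_page(b))
			break;
	return b->size;
}
```

The one-page-per-call shape of this loop is exactly the overhead that the batching and 2 MB-page patches in this series attack.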


^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH 1/9] VMware balloon: partially inline vmballoon_reserve_page.
  2015-04-14 17:29 [PATCH 0/9] VMware balloon: Large page ballooning and VMCI support Philip P. Moltmann
@ 2015-04-14 17:29 ` Philip P. Moltmann
  2015-04-14 17:29 ` [PATCH 2/9] VMware balloon: Add support for balloon capabilities Philip P. Moltmann
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-04-14 17:29 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel, xdeguillard, akpm, pv-drivers, Philip P. Moltmann

From: Xavier Deguillard <xdeguillard@vmware.com>

This splits the function in two: the allocation part is inlined into
the inflate function, and the lock part is kept in its own function.

This change is needed in order to allocate more than one page before
making the hypervisor call.

Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
Acked-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Philip P. Moltmann <moltmann@vmware.com>
Acked-by: Andy King <acking@vmware.com>
---
 drivers/misc/vmw_balloon.c | 98 ++++++++++++++++++++--------------------------
 1 file changed, 42 insertions(+), 56 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 1916174..2799c46 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -46,7 +46,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.2.1.3-k");
+MODULE_VERSION("1.2.2.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -402,55 +402,37 @@ static void vmballoon_reset(struct vmballoon *b)
 }
 
 /*
- * Allocate (or reserve) a page for the balloon and notify the host.  If host
- * refuses the page put it on "refuse" list and allocate another one until host
- * is satisfied. "Refused" pages are released at the end of inflation cycle
- * (when we allocate b->rate_alloc pages).
+ * Notify the host of a ballooned page. If host rejects the page put it on the
+ * refuse list, those refused page are then released at the end of the
+ * inflation cycle.
  */
-static int vmballoon_reserve_page(struct vmballoon *b, bool can_sleep)
+static int vmballoon_lock_page(struct vmballoon *b, struct page *page)
 {
-	struct page *page;
-	gfp_t flags;
-	unsigned int hv_status;
-	int locked;
-	flags = can_sleep ? VMW_PAGE_ALLOC_CANSLEEP : VMW_PAGE_ALLOC_NOSLEEP;
-
-	do {
-		if (!can_sleep)
-			STATS_INC(b->stats.alloc);
-		else
-			STATS_INC(b->stats.sleep_alloc);
-
-		page = alloc_page(flags);
-		if (!page) {
-			if (!can_sleep)
-				STATS_INC(b->stats.alloc_fail);
-			else
-				STATS_INC(b->stats.sleep_alloc_fail);
-			return -ENOMEM;
-		}
+	int locked, hv_status;
 
-		/* inform monitor */
-		locked = vmballoon_send_lock_page(b, page_to_pfn(page), &hv_status);
-		if (locked > 0) {
-			STATS_INC(b->stats.refused_alloc);
+	locked = vmballoon_send_lock_page(b, page_to_pfn(page), &hv_status);
+	if (locked > 0) {
+		STATS_INC(b->stats.refused_alloc);
 
-			if (hv_status == VMW_BALLOON_ERROR_RESET ||
-			    hv_status == VMW_BALLOON_ERROR_PPN_NOTNEEDED) {
-				__free_page(page);
-				return -EIO;
-			}
+		if (hv_status == VMW_BALLOON_ERROR_RESET ||
+				hv_status == VMW_BALLOON_ERROR_PPN_NOTNEEDED) {
+			__free_page(page);
+			return -EIO;
+		}
 
-			/*
-			 * Place page on the list of non-balloonable pages
-			 * and retry allocation, unless we already accumulated
-			 * too many of them, in which case take a breather.
-			 */
+		/*
+		 * Place page on the list of non-balloonable pages
+		 * and retry allocation, unless we already accumulated
+		 * too many of them, in which case take a breather.
+		 */
+		if (b->n_refused_pages < VMW_BALLOON_MAX_REFUSED) {
+			b->n_refused_pages++;
 			list_add(&page->lru, &b->refused_pages);
-			if (++b->n_refused_pages >= VMW_BALLOON_MAX_REFUSED)
-				return -EIO;
+		} else {
+			__free_page(page);
 		}
-	} while (locked != 0);
+		return -EIO;
+	}
 
 	/* track allocated page */
 	list_add(&page->lru, &b->pages);
@@ -512,7 +494,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 	unsigned int i;
 	unsigned int allocations = 0;
 	int error = 0;
-	bool alloc_can_sleep = false;
+	gfp_t flags = VMW_PAGE_ALLOC_NOSLEEP;
 
 	pr_debug("%s - size: %d, target %d\n", __func__, b->size, b->target);
 
@@ -543,19 +525,16 @@ static void vmballoon_inflate(struct vmballoon *b)
 		 __func__, goal, rate, b->rate_alloc);
 
 	for (i = 0; i < goal; i++) {
+		struct page *page;
 
-		error = vmballoon_reserve_page(b, alloc_can_sleep);
-		if (error) {
-			if (error != -ENOMEM) {
-				/*
-				 * Not a page allocation failure, stop this
-				 * cycle. Maybe we'll get new target from
-				 * the host soon.
-				 */
-				break;
-			}
+		if (flags == VMW_PAGE_ALLOC_NOSLEEP)
+			STATS_INC(b->stats.alloc);
+		else
+			STATS_INC(b->stats.sleep_alloc);
 
-			if (alloc_can_sleep) {
+		page = alloc_page(flags);
+		if (!page) {
+			if (flags == VMW_PAGE_ALLOC_CANSLEEP) {
 				/*
 				 * CANSLEEP page allocation failed, so guest
 				 * is under severe memory pressure. Quickly
@@ -563,8 +542,10 @@ static void vmballoon_inflate(struct vmballoon *b)
 				 */
 				b->rate_alloc = max(b->rate_alloc / 2,
 						    VMW_BALLOON_RATE_ALLOC_MIN);
+				STATS_INC(b->stats.sleep_alloc_fail);
 				break;
 			}
+			STATS_INC(b->stats.alloc_fail);
 
 			/*
 			 * NOSLEEP page allocation failed, so the guest is
@@ -579,11 +560,16 @@ static void vmballoon_inflate(struct vmballoon *b)
 			if (i >= b->rate_alloc)
 				break;
 
-			alloc_can_sleep = true;
+			flags = VMW_PAGE_ALLOC_CANSLEEP;
 			/* Lower rate for sleeping allocations. */
 			rate = b->rate_alloc;
+			continue;
 		}
 
+		error = vmballoon_lock_page(b, page);
+		if (error)
+			break;
+
 		if (++allocations > VMW_BALLOON_YIELD_THRESHOLD) {
 			cond_resched();
 			allocations = 0;
-- 
1.9.3



* [PATCH 2/9] VMware balloon: Add support for balloon capabilities.
  2015-04-14 17:29 [PATCH 0/9] VMware balloon: Large page ballooning and VMCI support Philip P. Moltmann
  2015-04-14 17:29 ` [PATCH 1/9] VMware balloon: partially inline vmballoon_reserve_page Philip P. Moltmann
@ 2015-04-14 17:29 ` Philip P. Moltmann
  2015-04-14 17:29 ` [PATCH 3/9] VMware balloon: add batching to the vmw_balloon Philip P. Moltmann
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-04-14 17:29 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel, xdeguillard, akpm, pv-drivers, Philip P. Moltmann

From: Xavier Deguillard <xdeguillard@vmware.com>

In order to extend the balloon protocol, the hypervisor and the guest
driver need to agree on the set of functionality they both support.
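A minimal sketch of that agreement (hypothetical names; the actual handshake goes through the START backdoor command shown in the diff below):

```c
#include <assert.h>

/* Capability bits, mirroring the patch's scheme (bit 0 is reserved). */
#define TOY_BASIC_CMDS		(1u << 1)
#define TOY_BATCHED_CMDS	(1u << 2)

#define TOY_SUCCESS		0u
#define TOY_SUCCESS_WITH_CAPS	0x03000000u

/* Compute the capabilities the guest may actually use: a new host
 * replies with the subset it supports, an old host replies with plain
 * SUCCESS and is assumed to speak only the basic commands. */
static unsigned int toy_negotiate(unsigned int status,
				  unsigned int host_caps,
				  unsigned int guest_caps)
{
	if (status == TOY_SUCCESS_WITH_CAPS)
		return host_caps & guest_caps;
	if (status == TOY_SUCCESS)
		return TOY_BASIC_CMDS;
	return 0;	/* start failed, no capabilities */
}
```

The legacy-host fallback is what keeps this series compatible with hypervisors that predate the capability exchange.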

Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
Acked-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Philip P. Moltmann <moltmann@vmware.com>
Acked-by: Andy King <acking@vmware.com>
---
 drivers/misc/vmw_balloon.c | 74 +++++++++++++++++++++++++++-------------------
 1 file changed, 44 insertions(+), 30 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 2799c46..ffb5634 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -46,7 +46,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.2.2.0-k");
+MODULE_VERSION("1.3.0.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -110,9 +110,18 @@ MODULE_LICENSE("GPL");
  */
 #define VMW_BALLOON_HV_PORT		0x5670
 #define VMW_BALLOON_HV_MAGIC		0x456c6d6f
-#define VMW_BALLOON_PROTOCOL_VERSION	2
 #define VMW_BALLOON_GUEST_ID		1	/* Linux */
 
+enum vmwballoon_capabilities {
+	/*
+	 * Bit 0 is reserved and not associated to any capability.
+	 */
+	VMW_BALLOON_BASIC_CMDS		= (1 << 1),
+	VMW_BALLOON_BATCHED_CMDS	= (1 << 2)
+};
+
+#define VMW_BALLOON_CAPABILITIES	(VMW_BALLOON_BASIC_CMDS)
+
 #define VMW_BALLOON_CMD_START		0
 #define VMW_BALLOON_CMD_GET_TARGET	1
 #define VMW_BALLOON_CMD_LOCK		2
@@ -120,32 +129,36 @@ MODULE_LICENSE("GPL");
 #define VMW_BALLOON_CMD_GUEST_ID	4
 
 /* error codes */
-#define VMW_BALLOON_SUCCESS		0
-#define VMW_BALLOON_FAILURE		-1
-#define VMW_BALLOON_ERROR_CMD_INVALID	1
-#define VMW_BALLOON_ERROR_PPN_INVALID	2
-#define VMW_BALLOON_ERROR_PPN_LOCKED	3
-#define VMW_BALLOON_ERROR_PPN_UNLOCKED	4
-#define VMW_BALLOON_ERROR_PPN_PINNED	5
-#define VMW_BALLOON_ERROR_PPN_NOTNEEDED	6
-#define VMW_BALLOON_ERROR_RESET		7
-#define VMW_BALLOON_ERROR_BUSY		8
-
-#define VMWARE_BALLOON_CMD(cmd, data, result)		\
-({							\
-	unsigned long __stat, __dummy1, __dummy2;	\
-	__asm__ __volatile__ ("inl %%dx" :		\
-		"=a"(__stat),				\
-		"=c"(__dummy1),				\
-		"=d"(__dummy2),				\
-		"=b"(result) :				\
-		"0"(VMW_BALLOON_HV_MAGIC),		\
-		"1"(VMW_BALLOON_CMD_##cmd),		\
-		"2"(VMW_BALLOON_HV_PORT),		\
-		"3"(data) :				\
-		"memory");				\
-	result &= -1UL;					\
-	__stat & -1UL;					\
+#define VMW_BALLOON_SUCCESS		        0
+#define VMW_BALLOON_FAILURE		        -1
+#define VMW_BALLOON_ERROR_CMD_INVALID	        1
+#define VMW_BALLOON_ERROR_PPN_INVALID	        2
+#define VMW_BALLOON_ERROR_PPN_LOCKED	        3
+#define VMW_BALLOON_ERROR_PPN_UNLOCKED	        4
+#define VMW_BALLOON_ERROR_PPN_PINNED	        5
+#define VMW_BALLOON_ERROR_PPN_NOTNEEDED	        6
+#define VMW_BALLOON_ERROR_RESET		        7
+#define VMW_BALLOON_ERROR_BUSY		        8
+
+#define VMW_BALLOON_SUCCESS_WITH_CAPABILITIES	(0x03000000)
+
+#define VMWARE_BALLOON_CMD(cmd, data, result)			\
+({								\
+	unsigned long __status, __dummy1, __dummy2;		\
+	__asm__ __volatile__ ("inl %%dx" :			\
+		"=a"(__status),					\
+		"=c"(__dummy1),					\
+		"=d"(__dummy2),					\
+		"=b"(result) :					\
+		"0"(VMW_BALLOON_HV_MAGIC),			\
+		"1"(VMW_BALLOON_CMD_##cmd),			\
+		"2"(VMW_BALLOON_HV_PORT),			\
+		"3"(data) :					\
+		"memory");					\
+	if (VMW_BALLOON_CMD_##cmd == VMW_BALLOON_CMD_START)	\
+		result = __dummy1;				\
+	result &= -1UL;						\
+	__status & -1UL;					\
 })
 
 #ifdef CONFIG_DEBUG_FS
@@ -223,11 +236,12 @@ static struct vmballoon balloon;
  */
 static bool vmballoon_send_start(struct vmballoon *b)
 {
-	unsigned long status, dummy;
+	unsigned long status, capabilities;
 
 	STATS_INC(b->stats.start);
 
-	status = VMWARE_BALLOON_CMD(START, VMW_BALLOON_PROTOCOL_VERSION, dummy);
+	status = VMWARE_BALLOON_CMD(START, VMW_BALLOON_CAPABILITIES,
+				capabilities);
 	if (status == VMW_BALLOON_SUCCESS)
 		return true;
 
-- 
1.9.3



* [PATCH 3/9] VMware balloon: add batching to the vmw_balloon.
  2015-04-14 17:29 [PATCH 0/9] VMware balloon: Large page ballooning and VMCI support Philip P. Moltmann
  2015-04-14 17:29 ` [PATCH 1/9] VMware balloon: partially inline vmballoon_reserve_page Philip P. Moltmann
  2015-04-14 17:29 ` [PATCH 2/9] VMware balloon: Add support for balloon capabilities Philip P. Moltmann
@ 2015-04-14 17:29 ` Philip P. Moltmann
  2015-04-14 17:29 ` [PATCH 4/9] VMware balloon: Update balloon target on each lock/unlock Philip P. Moltmann
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-04-14 17:29 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel, xdeguillard, akpm, pv-drivers, Philip P. Moltmann

From: Xavier Deguillard <xdeguillard@vmware.com>

Introduce a new capability to the driver that allows sending 512 pages
in one hypervisor call. This reduces the cost of the driver when
reclaiming memory.
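Each entry of the shared batch page packs a page's physical address and a per-page status into one u64, as the layout comment in the diff describes. A standalone sketch of that encoding (toy names, 4 KB pages assumed):

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT	12				/* 4 KB pages */
#define TOY_STATUS_MASK	((UINT64_C(1) << 5) - 1)	/* low status bits */
#define TOY_PAGE_MASK	(~((UINT64_C(1) << TOY_PAGE_SHIFT) - 1))

/* Pack a page-aligned physical address into a batch entry; the
 * reserved and status bits start out zero. */
static uint64_t toy_batch_entry(uint64_t pa)
{
	return pa & TOY_PAGE_MASK;
}

/* Extract the physical address back out of an entry. */
static uint64_t toy_entry_pa(uint64_t entry)
{
	return entry & TOY_PAGE_MASK;
}

/* The hypervisor writes a per-page status into the low bits. */
static int toy_entry_status(uint64_t entry)
{
	return (int)(entry & TOY_STATUS_MASK);
}
```

Because address and status share one entry, a single page-sized array carries both the request and the reply for up to 512 pages per call.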

Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
Acked-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Philip P. Moltmann <moltmann@vmware.com>
---
 drivers/misc/vmw_balloon.c | 405 +++++++++++++++++++++++++++++++++++++++------
 1 file changed, 352 insertions(+), 53 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index ffb5634..f65c676 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -1,7 +1,7 @@
 /*
  * VMware Balloon driver.
  *
- * Copyright (C) 2000-2010, VMware, Inc. All Rights Reserved.
+ * Copyright (C) 2000-2013, VMware, Inc. All Rights Reserved.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms of the GNU General Public License as published by the
@@ -46,7 +46,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.3.0.0-k");
+MODULE_VERSION("1.3.1.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -120,13 +120,26 @@ enum vmwballoon_capabilities {
 	VMW_BALLOON_BATCHED_CMDS	= (1 << 2)
 };
 
-#define VMW_BALLOON_CAPABILITIES	(VMW_BALLOON_BASIC_CMDS)
+#define VMW_BALLOON_CAPABILITIES	(VMW_BALLOON_BASIC_CMDS \
+					| VMW_BALLOON_BATCHED_CMDS)
 
+/*
+ * Backdoor commands availability:
+ *
+ * START, GET_TARGET and GUEST_ID are always available,
+ *
+ * VMW_BALLOON_BASIC_CMDS:
+ *	LOCK and UNLOCK commands,
+ * VMW_BALLOON_BATCHED_CMDS:
+ *	BATCHED_LOCK and BATCHED_UNLOCK commands.
+ */
 #define VMW_BALLOON_CMD_START		0
 #define VMW_BALLOON_CMD_GET_TARGET	1
 #define VMW_BALLOON_CMD_LOCK		2
 #define VMW_BALLOON_CMD_UNLOCK		3
 #define VMW_BALLOON_CMD_GUEST_ID	4
+#define VMW_BALLOON_CMD_BATCHED_LOCK	6
+#define VMW_BALLOON_CMD_BATCHED_UNLOCK	7
 
 /* error codes */
 #define VMW_BALLOON_SUCCESS		        0
@@ -142,18 +155,63 @@ enum vmwballoon_capabilities {
 
 #define VMW_BALLOON_SUCCESS_WITH_CAPABILITIES	(0x03000000)
 
-#define VMWARE_BALLOON_CMD(cmd, data, result)			\
+/* Batch page description */
+
+/*
+ * Layout of a page in the batch page:
+ *
+ * +-------------+----------+--------+
+ * |             |          |        |
+ * | Page number | Reserved | Status |
+ * |             |          |        |
+ * +-------------+----------+--------+
+ * 64  PAGE_SHIFT          6         0
+ *
+ * For now only 4K pages are supported, but we can easily support large pages
+ * by using bits in the reserved field.
+ *
+ * The reserved field should be set to 0.
+ */
+#define VMW_BALLOON_BATCH_MAX_PAGES	(PAGE_SIZE / sizeof(u64))
+#define VMW_BALLOON_BATCH_STATUS_MASK	((1UL << 5) - 1)
+#define VMW_BALLOON_BATCH_PAGE_MASK	(~((1UL << PAGE_SHIFT) - 1))
+
+struct vmballoon_batch_page {
+	u64 pages[VMW_BALLOON_BATCH_MAX_PAGES];
+};
+
+static u64 vmballoon_batch_get_pa(struct vmballoon_batch_page *batch, int idx)
+{
+	return batch->pages[idx] & VMW_BALLOON_BATCH_PAGE_MASK;
+}
+
+static int vmballoon_batch_get_status(struct vmballoon_batch_page *batch,
+				int idx)
+{
+	return (int)(batch->pages[idx] & VMW_BALLOON_BATCH_STATUS_MASK);
+}
+
+static void vmballoon_batch_set_pa(struct vmballoon_batch_page *batch, int idx,
+				u64 pa)
+{
+	batch->pages[idx] = pa;
+}
+
+
+#define VMWARE_BALLOON_CMD(cmd, arg1, arg2, result)		\
 ({								\
-	unsigned long __status, __dummy1, __dummy2;		\
+	unsigned long __status, __dummy1, __dummy2, __dummy3;	\
 	__asm__ __volatile__ ("inl %%dx" :			\
 		"=a"(__status),					\
 		"=c"(__dummy1),					\
 		"=d"(__dummy2),					\
-		"=b"(result) :					\
+		"=b"(result),					\
+		"=S" (__dummy3) :				\
 		"0"(VMW_BALLOON_HV_MAGIC),			\
 		"1"(VMW_BALLOON_CMD_##cmd),			\
 		"2"(VMW_BALLOON_HV_PORT),			\
-		"3"(data) :					\
+		"3"(arg1),					\
+		"4" (arg2) :					\
 		"memory");					\
 	if (VMW_BALLOON_CMD_##cmd == VMW_BALLOON_CMD_START)	\
 		result = __dummy1;				\
@@ -192,6 +250,14 @@ struct vmballoon_stats {
 #define STATS_INC(stat)
 #endif
 
+struct vmballoon;
+
+struct vmballoon_ops {
+	void (*add_page)(struct vmballoon *b, int idx, struct page *p);
+	int (*lock)(struct vmballoon *b, unsigned int num_pages);
+	int (*unlock)(struct vmballoon *b, unsigned int num_pages);
+};
+
 struct vmballoon {
 
 	/* list of reserved physical pages */
@@ -215,6 +281,14 @@ struct vmballoon {
 	/* slowdown page allocations for next few cycles */
 	unsigned int slow_allocation_cycles;
 
+	unsigned long capabilities;
+
+	struct vmballoon_batch_page *batch_page;
+	unsigned int batch_max_pages;
+	struct page *page;
+
+	const struct vmballoon_ops *ops;
+
 #ifdef CONFIG_DEBUG_FS
 	/* statistics */
 	struct vmballoon_stats stats;
@@ -234,16 +308,22 @@ static struct vmballoon balloon;
  * Send "start" command to the host, communicating supported version
  * of the protocol.
  */
-static bool vmballoon_send_start(struct vmballoon *b)
+static bool vmballoon_send_start(struct vmballoon *b, unsigned long req_caps)
 {
-	unsigned long status, capabilities;
+	unsigned long status, capabilities, dummy = 0;
 
 	STATS_INC(b->stats.start);
 
-	status = VMWARE_BALLOON_CMD(START, VMW_BALLOON_CAPABILITIES,
-				capabilities);
-	if (status == VMW_BALLOON_SUCCESS)
+	status = VMWARE_BALLOON_CMD(START, req_caps, dummy, capabilities);
+
+	switch (status) {
+	case VMW_BALLOON_SUCCESS_WITH_CAPABILITIES:
+		b->capabilities = capabilities;
+		return true;
+	case VMW_BALLOON_SUCCESS:
+		b->capabilities = VMW_BALLOON_BASIC_CMDS;
 		return true;
+	}
 
 	pr_debug("%s - failed, hv returns %ld\n", __func__, status);
 	STATS_INC(b->stats.start_fail);
@@ -273,9 +353,10 @@ static bool vmballoon_check_status(struct vmballoon *b, unsigned long status)
  */
 static bool vmballoon_send_guest_id(struct vmballoon *b)
 {
-	unsigned long status, dummy;
+	unsigned long status, dummy = 0;
 
-	status = VMWARE_BALLOON_CMD(GUEST_ID, VMW_BALLOON_GUEST_ID, dummy);
+	status = VMWARE_BALLOON_CMD(GUEST_ID, VMW_BALLOON_GUEST_ID, dummy,
+				dummy);
 
 	STATS_INC(b->stats.guest_type);
 
@@ -295,6 +376,7 @@ static bool vmballoon_send_get_target(struct vmballoon *b, u32 *new_target)
 	unsigned long status;
 	unsigned long target;
 	unsigned long limit;
+	unsigned long dummy = 0;
 	u32 limit32;
 
 	/*
@@ -313,7 +395,7 @@ static bool vmballoon_send_get_target(struct vmballoon *b, u32 *new_target)
 	/* update stats */
 	STATS_INC(b->stats.target);
 
-	status = VMWARE_BALLOON_CMD(GET_TARGET, limit, target);
+	status = VMWARE_BALLOON_CMD(GET_TARGET, limit, dummy, target);
 	if (vmballoon_check_status(b, status)) {
 		*new_target = target;
 		return true;
@@ -332,7 +414,7 @@ static bool vmballoon_send_get_target(struct vmballoon *b, u32 *new_target)
 static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 				     unsigned int *hv_status)
 {
-	unsigned long status, dummy;
+	unsigned long status, dummy = 0;
 	u32 pfn32;
 
 	pfn32 = (u32)pfn;
@@ -341,7 +423,7 @@ static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 
 	STATS_INC(b->stats.lock);
 
-	*hv_status = status = VMWARE_BALLOON_CMD(LOCK, pfn, dummy);
+	*hv_status = status = VMWARE_BALLOON_CMD(LOCK, pfn, dummy, dummy);
 	if (vmballoon_check_status(b, status))
 		return 0;
 
@@ -350,13 +432,30 @@ static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 	return 1;
 }
 
+static int vmballoon_send_batched_lock(struct vmballoon *b,
+					unsigned int num_pages)
+{
+	unsigned long status, dummy;
+	unsigned long pfn = page_to_pfn(b->page);
+
+	STATS_INC(b->stats.lock);
+
+	status = VMWARE_BALLOON_CMD(BATCHED_LOCK, pfn, num_pages, dummy);
+	if (vmballoon_check_status(b, status))
+		return 0;
+
+	pr_debug("%s - batch ppn %lx, hv returns %ld\n", __func__, pfn, status);
+	STATS_INC(b->stats.lock_fail);
+	return 1;
+}
+
 /*
  * Notify the host that guest intends to release given page back into
  * the pool of available (to the guest) pages.
  */
 static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
 {
-	unsigned long status, dummy;
+	unsigned long status, dummy = 0;
 	u32 pfn32;
 
 	pfn32 = (u32)pfn;
@@ -365,7 +464,7 @@ static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
 
 	STATS_INC(b->stats.unlock);
 
-	status = VMWARE_BALLOON_CMD(UNLOCK, pfn, dummy);
+	status = VMWARE_BALLOON_CMD(UNLOCK, pfn, dummy, dummy);
 	if (vmballoon_check_status(b, status))
 		return true;
 
@@ -374,6 +473,23 @@ static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
 	return false;
 }
 
+static bool vmballoon_send_batched_unlock(struct vmballoon *b,
+						unsigned int num_pages)
+{
+	unsigned long status, dummy;
+	unsigned long pfn = page_to_pfn(b->page);
+
+	STATS_INC(b->stats.unlock);
+
+	status = VMWARE_BALLOON_CMD(BATCHED_UNLOCK, pfn, num_pages, dummy);
+	if (vmballoon_check_status(b, status))
+		return true;
+
+	pr_debug("%s - batch ppn %lx, hv returns %ld\n", __func__, pfn, status);
+	STATS_INC(b->stats.unlock_fail);
+	return false;
+}
+
 /*
  * Quickly release all pages allocated for the balloon. This function is
  * called when host decides to "reset" balloon for one reason or another.
@@ -396,22 +512,13 @@ static void vmballoon_pop(struct vmballoon *b)
 			cond_resched();
 		}
 	}
-}
 
-/*
- * Perform standard reset sequence by popping the balloon (in case it
- * is not  empty) and then restarting protocol. This operation normally
- * happens when host responds with VMW_BALLOON_ERROR_RESET to a command.
- */
-static void vmballoon_reset(struct vmballoon *b)
-{
-	/* free all pages, skipping monitor unlock */
-	vmballoon_pop(b);
+	if ((b->capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
+		if (b->batch_page)
+			vunmap(b->batch_page);
 
-	if (vmballoon_send_start(b)) {
-		b->reset_required = false;
-		if (!vmballoon_send_guest_id(b))
-			pr_err("failed to send guest ID to the host\n");
+		if (b->page)
+			__free_page(b->page);
 	}
 }
 
@@ -420,9 +527,10 @@ static void vmballoon_reset(struct vmballoon *b)
  * refuse list, those refused page are then released at the end of the
  * inflation cycle.
  */
-static int vmballoon_lock_page(struct vmballoon *b, struct page *page)
+static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages)
 {
 	int locked, hv_status;
+	struct page *page = b->page;
 
 	locked = vmballoon_send_lock_page(b, page_to_pfn(page), &hv_status);
 	if (locked > 0) {
@@ -457,17 +565,68 @@ static int vmballoon_lock_page(struct vmballoon *b, struct page *page)
 	return 0;
 }
 
+static int vmballoon_lock_batched_page(struct vmballoon *b,
+				unsigned int num_pages)
+{
+	int locked, i;
+
+	locked = vmballoon_send_batched_lock(b, num_pages);
+	if (locked > 0) {
+		for (i = 0; i < num_pages; i++) {
+			u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
+			struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
+
+			__free_page(p);
+		}
+
+		return -EIO;
+	}
+
+	for (i = 0; i < num_pages; i++) {
+		u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
+		struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
+
+		locked = vmballoon_batch_get_status(b->batch_page, i);
+
+		switch (locked) {
+		case VMW_BALLOON_SUCCESS:
+			list_add(&p->lru, &b->pages);
+			b->size++;
+			break;
+		case VMW_BALLOON_ERROR_PPN_PINNED:
+		case VMW_BALLOON_ERROR_PPN_INVALID:
+			if (b->n_refused_pages < VMW_BALLOON_MAX_REFUSED) {
+				list_add(&p->lru, &b->refused_pages);
+				b->n_refused_pages++;
+				break;
+			}
+			/* Fallthrough */
+		case VMW_BALLOON_ERROR_RESET:
+		case VMW_BALLOON_ERROR_PPN_NOTNEEDED:
+			__free_page(p);
+			break;
+		default:
+			/* This should never happen */
+			WARN_ON_ONCE(true);
+		}
+	}
+
+	return 0;
+}
+
 /*
  * Release the page allocated for the balloon. Note that we first notify
  * the host so it can make sure the page will be available for the guest
  * to use, if needed.
  */
-static int vmballoon_release_page(struct vmballoon *b, struct page *page)
+static int vmballoon_unlock_page(struct vmballoon *b, unsigned int num_pages)
 {
-	if (!vmballoon_send_unlock_page(b, page_to_pfn(page)))
-		return -EIO;
+	struct page *page = b->page;
 
-	list_del(&page->lru);
+	if (!vmballoon_send_unlock_page(b, page_to_pfn(page))) {
+		list_add(&page->lru, &b->pages);
+		return -EIO;
+	}
 
 	/* deallocate page */
 	__free_page(page);
@@ -479,6 +638,41 @@ static int vmballoon_release_page(struct vmballoon *b, struct page *page)
 	return 0;
 }
 
+static int vmballoon_unlock_batched_page(struct vmballoon *b,
+					unsigned int num_pages)
+{
+	int locked, i, ret = 0;
+	bool hv_success;
+
+	hv_success = vmballoon_send_batched_unlock(b, num_pages);
+	if (!hv_success)
+		ret = -EIO;
+
+	for (i = 0; i < num_pages; i++) {
+		u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
+		struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
+
+		locked = vmballoon_batch_get_status(b->batch_page, i);
+		if (!hv_success || locked != VMW_BALLOON_SUCCESS) {
+			/*
+			 * That page wasn't successfully unlocked by the
+			 * hypervisor, re-add it to the list of pages owned by
+			 * the balloon driver.
+			 */
+			list_add(&p->lru, &b->pages);
+		} else {
+			/* deallocate page */
+			__free_page(p);
+			STATS_INC(b->stats.free);
+
+			/* update balloon size */
+			b->size--;
+		}
+	}
+
+	return ret;
+}
+
 /*
  * Release pages that were allocated while attempting to inflate the
  * balloon but were refused by the host for one reason or another.
@@ -496,6 +690,18 @@ static void vmballoon_release_refused_pages(struct vmballoon *b)
 	b->n_refused_pages = 0;
 }
 
+static void vmballoon_add_page(struct vmballoon *b, int idx, struct page *p)
+{
+	b->page = p;
+}
+
+static void vmballoon_add_batched_page(struct vmballoon *b, int idx,
+				struct page *p)
+{
+	vmballoon_batch_set_pa(b->batch_page, idx,
+			(u64)page_to_pfn(p) << PAGE_SHIFT);
+}
+
 /*
  * Inflate the balloon towards its target size. Note that we try to limit
  * the rate of allocation to make sure we are not choking the rest of the
@@ -507,6 +713,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 	unsigned int rate;
 	unsigned int i;
 	unsigned int allocations = 0;
+	unsigned int num_pages = 0;
 	int error = 0;
 	gfp_t flags = VMW_PAGE_ALLOC_NOSLEEP;
 
@@ -539,14 +746,13 @@ static void vmballoon_inflate(struct vmballoon *b)
 		 __func__, goal, rate, b->rate_alloc);
 
 	for (i = 0; i < goal; i++) {
-		struct page *page;
+		struct page *page = alloc_page(flags);
 
 		if (flags == VMW_PAGE_ALLOC_NOSLEEP)
 			STATS_INC(b->stats.alloc);
 		else
 			STATS_INC(b->stats.sleep_alloc);
 
-		page = alloc_page(flags);
 		if (!page) {
 			if (flags == VMW_PAGE_ALLOC_CANSLEEP) {
 				/*
@@ -580,9 +786,13 @@ static void vmballoon_inflate(struct vmballoon *b)
 			continue;
 		}
 
-		error = vmballoon_lock_page(b, page);
-		if (error)
-			break;
+		b->ops->add_page(b, num_pages++, page);
+		if (num_pages == b->batch_max_pages) {
+			error = b->ops->lock(b, num_pages);
+			num_pages = 0;
+			if (error)
+				break;
+		}
 
 		if (++allocations > VMW_BALLOON_YIELD_THRESHOLD) {
 			cond_resched();
@@ -595,6 +805,9 @@ static void vmballoon_inflate(struct vmballoon *b)
 		}
 	}
 
+	if (num_pages > 0)
+		b->ops->lock(b, num_pages);
+
 	/*
 	 * We reached our goal without failures so try increasing
 	 * allocation rate.
@@ -618,6 +831,7 @@ static void vmballoon_deflate(struct vmballoon *b)
 	struct page *page, *next;
 	unsigned int i = 0;
 	unsigned int goal;
+	unsigned int num_pages = 0;
 	int error;
 
 	pr_debug("%s - size: %d, target %d\n", __func__, b->size, b->target);
@@ -629,21 +843,94 @@ static void vmballoon_deflate(struct vmballoon *b)
 
 	/* free pages to reach target */
 	list_for_each_entry_safe(page, next, &b->pages, lru) {
-		error = vmballoon_release_page(b, page);
-		if (error) {
-			/* quickly decrease rate in case of error */
-			b->rate_free = max(b->rate_free / 2,
-					   VMW_BALLOON_RATE_FREE_MIN);
-			return;
+		list_del(&page->lru);
+		b->ops->add_page(b, num_pages++, page);
+
+		if (num_pages == b->batch_max_pages) {
+			error = b->ops->unlock(b, num_pages);
+			num_pages = 0;
+			if (error) {
+				/* quickly decrease rate in case of error */
+				b->rate_free = max(b->rate_free / 2,
+						VMW_BALLOON_RATE_FREE_MIN);
+				return;
+			}
 		}
 
 		if (++i >= goal)
 			break;
 	}
 
+	if (num_pages > 0)
+		b->ops->unlock(b, num_pages);
+
 	/* slowly increase rate if there were no errors */
-	b->rate_free = min(b->rate_free + VMW_BALLOON_RATE_FREE_INC,
-			   VMW_BALLOON_RATE_FREE_MAX);
+	if (error == 0)
+		b->rate_free = min(b->rate_free + VMW_BALLOON_RATE_FREE_INC,
+				   VMW_BALLOON_RATE_FREE_MAX);
+}
+
+static const struct vmballoon_ops vmballoon_basic_ops = {
+	.add_page = vmballoon_add_page,
+	.lock = vmballoon_lock_page,
+	.unlock = vmballoon_unlock_page
+};
+
+static const struct vmballoon_ops vmballoon_batched_ops = {
+	.add_page = vmballoon_add_batched_page,
+	.lock = vmballoon_lock_batched_page,
+	.unlock = vmballoon_unlock_batched_page
+};
+
+static bool vmballoon_init_batching(struct vmballoon *b)
+{
+	b->page = alloc_page(VMW_PAGE_ALLOC_NOSLEEP);
+	if (!b->page)
+		return false;
+
+	b->batch_page = vmap(&b->page, 1, VM_MAP, PAGE_KERNEL);
+	if (!b->batch_page) {
+		__free_page(b->page);
+		return false;
+	}
+
+	return true;
+}
+
+/*
+ * Perform standard reset sequence by popping the balloon (in case it
+ * is not  empty) and then restarting protocol. This operation normally
+ * happens when host responds with VMW_BALLOON_ERROR_RESET to a command.
+ */
+static void vmballoon_reset(struct vmballoon *b)
+{
+	/* free all pages, skipping monitor unlock */
+	vmballoon_pop(b);
+
+	if (!vmballoon_send_start(b, VMW_BALLOON_CAPABILITIES))
+		return;
+
+	if ((b->capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
+		b->ops = &vmballoon_batched_ops;
+		b->batch_max_pages = VMW_BALLOON_BATCH_MAX_PAGES;
+		if (!vmballoon_init_batching(b)) {
+			/*
+			 * We failed to initialize batching, inform the monitor
+			 * about it by sending a null capability.
+			 *
+			 * The guest will retry in one second.
+			 */
+			vmballoon_send_start(b, 0);
+			return;
+		}
+	} else if ((b->capabilities & VMW_BALLOON_BASIC_CMDS) != 0) {
+		b->ops = &vmballoon_basic_ops;
+		b->batch_max_pages = 1;
+	}
+
+	b->reset_required = false;
+	if (!vmballoon_send_guest_id(b))
+		pr_err("failed to send guest ID to the host\n");
 }
 
 /*
@@ -802,11 +1089,23 @@ static int __init vmballoon_init(void)
 	/*
 	 * Start balloon.
 	 */
-	if (!vmballoon_send_start(&balloon)) {
+	if (!vmballoon_send_start(&balloon, VMW_BALLOON_CAPABILITIES)) {
 		pr_err("failed to send start command to the host\n");
 		return -EIO;
 	}
 
+	if ((balloon.capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
+		balloon.ops = &vmballoon_batched_ops;
+		balloon.batch_max_pages = VMW_BALLOON_BATCH_MAX_PAGES;
+		if (!vmballoon_init_batching(&balloon)) {
+			pr_err("failed to init batching\n");
+			return -EIO;
+		}
+	} else if ((balloon.capabilities & VMW_BALLOON_BASIC_CMDS) != 0) {
+		balloon.ops = &vmballoon_basic_ops;
+		balloon.batch_max_pages = 1;
+	}
+
 	if (!vmballoon_send_guest_id(&balloon)) {
 		pr_err("failed to send guest ID to the host\n");
 		return -EIO;
@@ -833,7 +1132,7 @@ static void __exit vmballoon_exit(void)
 	 * Reset connection before deallocating memory to avoid potential for
 	 * additional spurious resets from guest touching deallocated pages.
 	 */
-	vmballoon_send_start(&balloon);
+	vmballoon_send_start(&balloon, VMW_BALLOON_CAPABILITIES);
 	vmballoon_pop(&balloon);
 }
 module_exit(vmballoon_exit);
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 4/9] VMware balloon: Update balloon target on each lock/unlock.
  2015-04-14 17:29 [PATCH 0/9] VMware balloon: Large page ballooning and VMCI support Philip P. Moltmann
                   ` (2 preceding siblings ...)
  2015-04-14 17:29 ` [PATCH 3/9] VMware balloon: add batching to the vmw_balloon Philip P. Moltmann
@ 2015-04-14 17:29 ` Philip P. Moltmann
  2015-04-14 17:29 ` [PATCH 5/9] VMware balloon: Show capabilities of balloon and resulting capabilities in the debug-fs node Philip P. Moltmann
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-04-14 17:29 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel, xdeguillard, akpm, pv-drivers, Philip P. Moltmann

From: Xavier Deguillard <xdeguillard@vmware.com>

Instead of waiting for the next GET_TARGET command, we can react faster
by exploiting the fact that each hypervisor call also returns the
balloon target.

Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
Acked-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Philip P. Moltmann <moltmann@vmware.com>
Acked-by: Andy King <acking@vmware.com>
---
 drivers/misc/vmw_balloon.c | 85 +++++++++++++++++++++++-----------------------
 1 file changed, 42 insertions(+), 43 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index f65c676..72247d9 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -46,7 +46,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.3.1.0-k");
+MODULE_VERSION("1.3.2.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -254,8 +254,10 @@ struct vmballoon;
 
 struct vmballoon_ops {
 	void (*add_page)(struct vmballoon *b, int idx, struct page *p);
-	int (*lock)(struct vmballoon *b, unsigned int num_pages);
-	int (*unlock)(struct vmballoon *b, unsigned int num_pages);
+	int (*lock)(struct vmballoon *b, unsigned int num_pages,
+						unsigned int *target);
+	int (*unlock)(struct vmballoon *b, unsigned int num_pages,
+						unsigned int *target);
 };
 
 struct vmballoon {
@@ -412,7 +414,7 @@ static bool vmballoon_send_get_target(struct vmballoon *b, u32 *new_target)
  * check the return value and maybe submit a different page.
  */
 static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
-				     unsigned int *hv_status)
+				unsigned int *hv_status, unsigned int *target)
 {
 	unsigned long status, dummy = 0;
 	u32 pfn32;
@@ -423,7 +425,7 @@ static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 
 	STATS_INC(b->stats.lock);
 
-	*hv_status = status = VMWARE_BALLOON_CMD(LOCK, pfn, dummy, dummy);
+	*hv_status = status = VMWARE_BALLOON_CMD(LOCK, pfn, dummy, *target);
 	if (vmballoon_check_status(b, status))
 		return 0;
 
@@ -433,14 +435,14 @@ static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 }
 
 static int vmballoon_send_batched_lock(struct vmballoon *b,
-					unsigned int num_pages)
+				unsigned int num_pages, unsigned int *target)
 {
-	unsigned long status, dummy;
+	unsigned long status;
 	unsigned long pfn = page_to_pfn(b->page);
 
 	STATS_INC(b->stats.lock);
 
-	status = VMWARE_BALLOON_CMD(BATCHED_LOCK, pfn, num_pages, dummy);
+	status = VMWARE_BALLOON_CMD(BATCHED_LOCK, pfn, num_pages, *target);
 	if (vmballoon_check_status(b, status))
 		return 0;
 
@@ -453,7 +455,8 @@ static int vmballoon_send_batched_lock(struct vmballoon *b,
  * Notify the host that guest intends to release given page back into
  * the pool of available (to the guest) pages.
  */
-static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
+static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn,
+							unsigned int *target)
 {
 	unsigned long status, dummy = 0;
 	u32 pfn32;
@@ -464,7 +467,7 @@ static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
 
 	STATS_INC(b->stats.unlock);
 
-	status = VMWARE_BALLOON_CMD(UNLOCK, pfn, dummy, dummy);
+	status = VMWARE_BALLOON_CMD(UNLOCK, pfn, dummy, *target);
 	if (vmballoon_check_status(b, status))
 		return true;
 
@@ -474,14 +477,14 @@ static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
 }
 
 static bool vmballoon_send_batched_unlock(struct vmballoon *b,
-						unsigned int num_pages)
+				unsigned int num_pages, unsigned int *target)
 {
-	unsigned long status, dummy;
+	unsigned long status;
 	unsigned long pfn = page_to_pfn(b->page);
 
 	STATS_INC(b->stats.unlock);
 
-	status = VMWARE_BALLOON_CMD(BATCHED_UNLOCK, pfn, num_pages, dummy);
+	status = VMWARE_BALLOON_CMD(BATCHED_UNLOCK, pfn, num_pages, *target);
 	if (vmballoon_check_status(b, status))
 		return true;
 
@@ -527,12 +530,14 @@ static void vmballoon_pop(struct vmballoon *b)
  * refuse list, those refused page are then released at the end of the
  * inflation cycle.
  */
-static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages)
+static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages,
+							unsigned int *target)
 {
 	int locked, hv_status;
 	struct page *page = b->page;
 
-	locked = vmballoon_send_lock_page(b, page_to_pfn(page), &hv_status);
+	locked = vmballoon_send_lock_page(b, page_to_pfn(page), &hv_status,
+								target);
 	if (locked > 0) {
 		STATS_INC(b->stats.refused_alloc);
 
@@ -566,11 +571,11 @@ static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages)
 }
 
 static int vmballoon_lock_batched_page(struct vmballoon *b,
-				unsigned int num_pages)
+				unsigned int num_pages, unsigned int *target)
 {
 	int locked, i;
 
-	locked = vmballoon_send_batched_lock(b, num_pages);
+	locked = vmballoon_send_batched_lock(b, num_pages, target);
 	if (locked > 0) {
 		for (i = 0; i < num_pages; i++) {
 			u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
@@ -619,11 +624,12 @@ static int vmballoon_lock_batched_page(struct vmballoon *b,
  * the host so it can make sure the page will be available for the guest
  * to use, if needed.
  */
-static int vmballoon_unlock_page(struct vmballoon *b, unsigned int num_pages)
+static int vmballoon_unlock_page(struct vmballoon *b, unsigned int num_pages,
+							unsigned int *target)
 {
 	struct page *page = b->page;
 
-	if (!vmballoon_send_unlock_page(b, page_to_pfn(page))) {
+	if (!vmballoon_send_unlock_page(b, page_to_pfn(page), target)) {
 		list_add(&page->lru, &b->pages);
 		return -EIO;
 	}
@@ -639,12 +645,12 @@ static int vmballoon_unlock_page(struct vmballoon *b, unsigned int num_pages)
 }
 
 static int vmballoon_unlock_batched_page(struct vmballoon *b,
-					unsigned int num_pages)
+				unsigned int num_pages, unsigned int *target)
 {
 	int locked, i, ret = 0;
 	bool hv_success;
 
-	hv_success = vmballoon_send_batched_unlock(b, num_pages);
+	hv_success = vmballoon_send_batched_unlock(b, num_pages, target);
 	if (!hv_success)
 		ret = -EIO;
 
@@ -709,9 +715,7 @@ static void vmballoon_add_batched_page(struct vmballoon *b, int idx,
  */
 static void vmballoon_inflate(struct vmballoon *b)
 {
-	unsigned int goal;
 	unsigned int rate;
-	unsigned int i;
 	unsigned int allocations = 0;
 	unsigned int num_pages = 0;
 	int error = 0;
@@ -734,7 +738,6 @@ static void vmballoon_inflate(struct vmballoon *b)
 	 * slowdown page allocations considerably.
 	 */
 
-	goal = b->target - b->size;
 	/*
 	 * Start with no sleep allocation rate which may be higher
 	 * than sleeping allocation rate.
@@ -743,16 +746,17 @@ static void vmballoon_inflate(struct vmballoon *b)
 			b->rate_alloc : VMW_BALLOON_NOSLEEP_ALLOC_MAX;
 
 	pr_debug("%s - goal: %d, no-sleep rate: %d, sleep rate: %d\n",
-		 __func__, goal, rate, b->rate_alloc);
+		 __func__, b->target - b->size, rate, b->rate_alloc);
 
-	for (i = 0; i < goal; i++) {
-		struct page *page = alloc_page(flags);
+	while (b->size < b->target && num_pages < b->target - b->size) {
+		struct page *page;
 
 		if (flags == VMW_PAGE_ALLOC_NOSLEEP)
 			STATS_INC(b->stats.alloc);
 		else
 			STATS_INC(b->stats.sleep_alloc);
 
+		page = alloc_page(flags);
 		if (!page) {
 			if (flags == VMW_PAGE_ALLOC_CANSLEEP) {
 				/*
@@ -777,7 +781,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 			 */
 			b->slow_allocation_cycles = VMW_BALLOON_SLOW_CYCLES;
 
-			if (i >= b->rate_alloc)
+			if (allocations >= b->rate_alloc)
 				break;
 
 			flags = VMW_PAGE_ALLOC_CANSLEEP;
@@ -788,7 +792,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 
 		b->ops->add_page(b, num_pages++, page);
 		if (num_pages == b->batch_max_pages) {
-			error = b->ops->lock(b, num_pages);
+			error = b->ops->lock(b, num_pages, &b->target);
 			num_pages = 0;
 			if (error)
 				break;
@@ -799,21 +803,21 @@ static void vmballoon_inflate(struct vmballoon *b)
 			allocations = 0;
 		}
 
-		if (i >= rate) {
+		if (allocations >= rate) {
 			/* We allocated enough pages, let's take a break. */
 			break;
 		}
 	}
 
 	if (num_pages > 0)
-		b->ops->lock(b, num_pages);
+		b->ops->lock(b, num_pages, &b->target);
 
 	/*
 	 * We reached our goal without failures so try increasing
 	 * allocation rate.
 	 */
-	if (error == 0 && i >= b->rate_alloc) {
-		unsigned int mult = i / b->rate_alloc;
+	if (error == 0 && allocations >= b->rate_alloc) {
+		unsigned int mult = allocations / b->rate_alloc;
 
 		b->rate_alloc =
 			min(b->rate_alloc + mult * VMW_BALLOON_RATE_ALLOC_INC,
@@ -830,16 +834,11 @@ static void vmballoon_deflate(struct vmballoon *b)
 {
 	struct page *page, *next;
 	unsigned int i = 0;
-	unsigned int goal;
 	unsigned int num_pages = 0;
 	int error;
 
-	pr_debug("%s - size: %d, target %d\n", __func__, b->size, b->target);
-
-	/* limit deallocation rate */
-	goal = min(b->size - b->target, b->rate_free);
-
-	pr_debug("%s - goal: %d, rate: %d\n", __func__, goal, b->rate_free);
+	pr_debug("%s - size: %d, target %d, rate: %d\n", __func__, b->size,
+						b->target, b->rate_free);
 
 	/* free pages to reach target */
 	list_for_each_entry_safe(page, next, &b->pages, lru) {
@@ -847,7 +846,7 @@ static void vmballoon_deflate(struct vmballoon *b)
 		b->ops->add_page(b, num_pages++, page);
 
 		if (num_pages == b->batch_max_pages) {
-			error = b->ops->unlock(b, num_pages);
+			error = b->ops->unlock(b, num_pages, &b->target);
 			num_pages = 0;
 			if (error) {
 				/* quickly decrease rate in case of error */
@@ -857,12 +856,12 @@ static void vmballoon_deflate(struct vmballoon *b)
 			}
 		}
 
-		if (++i >= goal)
+		if (++i >= b->size - b->target)
 			break;
 	}
 
 	if (num_pages > 0)
-		b->ops->unlock(b, num_pages);
+		b->ops->unlock(b, num_pages, &b->target);
 
 	/* slowly increase rate if there were no errors */
 	if (error == 0)
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 5/9] VMware balloon: Show capabilities of balloon and resulting capabilities in the debug-fs node.
  2015-04-14 17:29 [PATCH 0/9] VMware balloon: Large page ballooning and VMCI support Philip P. Moltmann
                   ` (3 preceding siblings ...)
  2015-04-14 17:29 ` [PATCH 4/9] VMware balloon: Update balloon target on each lock/unlock Philip P. Moltmann
@ 2015-04-14 17:29 ` Philip P. Moltmann
  2015-04-14 17:29 ` [PATCH 5/9] VMware balloon: Show capabilities or " Philip P. Moltmann
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-04-14 17:29 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel, xdeguillard, akpm, pv-drivers, Philip P. Moltmann

This helps with debugging vmw_balloon behavior, as it is clear what
functionality is enabled.

Acked-by: Andy King <acking@vmware.com>
Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
---
 drivers/misc/vmw_balloon.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 72247d9..6eaf7f7 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -46,7 +46,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.3.2.0-k");
+MODULE_VERSION("1.3.3.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -978,6 +978,12 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 	struct vmballoon *b = f->private;
 	struct vmballoon_stats *stats = &b->stats;
 
+	/* format capabilities info */
+	seq_printf(f,
+		   "balloon capabilities:   %#4x\n"
+		   "used capabilities:      %#4lx\n",
+		   VMW_BALLOON_CAPABILITIES, b->capabilities);
+
 	/* format size info */
 	seq_printf(f,
 		   "target:             %8d pages\n"
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 5/9] VMware balloon: Show capabilities or balloon and resulting capabilities in the debug-fs node.
  2015-04-14 17:29 [PATCH 0/9] VMware balloon: Large page ballooning and VMCI support Philip P. Moltmann
                   ` (4 preceding siblings ...)
  2015-04-14 17:29 ` [PATCH 5/9] VMware balloon: Show capabilities of balloon and resulting capabilities in the debug-fs node Philip P. Moltmann
@ 2015-04-14 17:29 ` Philip P. Moltmann
  2015-04-14 17:29 ` [PATCH 6/9] VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode Philip P. Moltmann
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-04-14 17:29 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel, xdeguillard, akpm, pv-drivers, Philip P. Moltmann

This helps with debugging vmw_balloon behavior, as it is clear what functionality is enabled.

Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
---
 drivers/misc/vmw_balloon.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 9eaafa6..b12f4dc 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -46,7 +46,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.3.2.0-k");
+MODULE_VERSION("1.3.3.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -978,6 +978,12 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 	struct vmballoon *b = f->private;
 	struct vmballoon_stats *stats = &b->stats;
 
+	/* format capabilities info */
+	seq_printf(f,
+		   "balloon capabilities:   %#4x\n"
+		   "used capabilities:      %#4lx\n",
+		   VMW_BALLOON_CAPABILITIES, b->capabilities);
+
 	/* format size info */
 	seq_printf(f,
 		   "target:             %8d pages\n"
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 6/9] VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode.
  2015-04-14 17:29 [PATCH 0/9] VMware balloon: Large page ballooning and VMCI support Philip P. Moltmann
                   ` (5 preceding siblings ...)
  2015-04-14 17:29 ` [PATCH 5/9] VMware balloon: Show capabilities or " Philip P. Moltmann
@ 2015-04-14 17:29 ` Philip P. Moltmann
  2015-04-16 20:55   ` Dmitry Torokhov
  2015-04-14 17:29 ` [PATCH 7/9] VMware balloon: Support 2m page ballooning Philip P. Moltmann
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 61+ messages in thread
From: Philip P. Moltmann @ 2015-04-14 17:29 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel, xdeguillard, akpm, pv-drivers, Philip P. Moltmann

Before this patch, the slow memory transfer would cause the destination VM to
swap internally until all memory was transferred. Now the memory is
transferred fast enough that the destination VM does not swap. The balloon
loop already yields to the rest of the system, hence the balloon thread
should not monopolize a CPU.

Testing Done: quickly ballooned a lot of pages while watching for any
perceived hiccups (periods of non-responsiveness) in the execution of the
Linux VM. No such hiccups were seen.

Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
---
 drivers/misc/vmw_balloon.c | 66 +++++++++++-----------------------------------
 1 file changed, 15 insertions(+), 51 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 6eaf7f7..a5e1980 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -46,7 +46,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.3.3.0-k");
+MODULE_VERSION("1.3.4.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -57,12 +57,6 @@ MODULE_LICENSE("GPL");
  */
 
 /*
- * Rate of allocating memory when there is no memory pressure
- * (driver performs non-sleeping allocations).
- */
-#define VMW_BALLOON_NOSLEEP_ALLOC_MAX	16384U
-
-/*
  * Rates of memory allocaton when guest experiences memory pressure
  * (driver performs sleeping allocations).
  */
@@ -71,13 +65,6 @@ MODULE_LICENSE("GPL");
 #define VMW_BALLOON_RATE_ALLOC_INC	16U
 
 /*
- * Rates for releasing pages while deflating balloon.
- */
-#define VMW_BALLOON_RATE_FREE_MIN	512U
-#define VMW_BALLOON_RATE_FREE_MAX	16384U
-#define VMW_BALLOON_RATE_FREE_INC	16U
-
-/*
  * When guest is under memory pressure, use a reduced page allocation
  * rate for next several cycles.
  */
@@ -99,9 +86,6 @@ MODULE_LICENSE("GPL");
  */
 #define VMW_PAGE_ALLOC_CANSLEEP		(GFP_HIGHUSER)
 
-/* Maximum number of page allocations without yielding processor */
-#define VMW_BALLOON_YIELD_THRESHOLD	1024
-
 /* Maximum number of refused pages we accumulate during inflation cycle */
 #define VMW_BALLOON_MAX_REFUSED		16
 
@@ -278,7 +262,6 @@ struct vmballoon {
 
 	/* adjustment rates (pages per second) */
 	unsigned int rate_alloc;
-	unsigned int rate_free;
 
 	/* slowdown page allocations for next few cycles */
 	unsigned int slow_allocation_cycles;
@@ -502,18 +485,13 @@ static bool vmballoon_send_batched_unlock(struct vmballoon *b,
 static void vmballoon_pop(struct vmballoon *b)
 {
 	struct page *page, *next;
-	unsigned int count = 0;
 
 	list_for_each_entry_safe(page, next, &b->pages, lru) {
 		list_del(&page->lru);
 		__free_page(page);
 		STATS_INC(b->stats.free);
 		b->size--;
-
-		if (++count >= b->rate_free) {
-			count = 0;
-			cond_resched();
-		}
+		cond_resched();
 	}
 
 	if ((b->capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
@@ -742,13 +720,13 @@ static void vmballoon_inflate(struct vmballoon *b)
 	 * Start with no sleep allocation rate which may be higher
 	 * than sleeping allocation rate.
 	 */
-	rate = b->slow_allocation_cycles ?
-			b->rate_alloc : VMW_BALLOON_NOSLEEP_ALLOC_MAX;
+	rate = b->slow_allocation_cycles ? b->rate_alloc : -1;
 
 	pr_debug("%s - goal: %d, no-sleep rate: %d, sleep rate: %d\n",
 		 __func__, b->target - b->size, rate, b->rate_alloc);
 
-	while (b->size < b->target && num_pages < b->target - b->size) {
+	while (!b->reset_required &&
+		b->size < b->target && num_pages < b->target - b->size) {
 		struct page *page;
 
 		if (flags == VMW_PAGE_ALLOC_NOSLEEP)
@@ -798,12 +776,9 @@ static void vmballoon_inflate(struct vmballoon *b)
 				break;
 		}
 
-		if (++allocations > VMW_BALLOON_YIELD_THRESHOLD) {
-			cond_resched();
-			allocations = 0;
-		}
+		cond_resched();
 
-		if (allocations >= rate) {
+		if (rate != -1 && allocations >= rate) {
 			/* We allocated enough pages, let's take a break. */
 			break;
 		}
@@ -837,36 +812,29 @@ static void vmballoon_deflate(struct vmballoon *b)
 	unsigned int num_pages = 0;
 	int error;
 
-	pr_debug("%s - size: %d, target %d, rate: %d\n", __func__, b->size,
-						b->target, b->rate_free);
+	pr_debug("%s - size: %d, target %d\n", __func__, b->size, b->target);
 
 	/* free pages to reach target */
 	list_for_each_entry_safe(page, next, &b->pages, lru) {
 		list_del(&page->lru);
 		b->ops->add_page(b, num_pages++, page);
 
+
 		if (num_pages == b->batch_max_pages) {
 			error = b->ops->unlock(b, num_pages, &b->target);
 			num_pages = 0;
-			if (error) {
-				/* quickly decrease rate in case of error */
-				b->rate_free = max(b->rate_free / 2,
-						VMW_BALLOON_RATE_FREE_MIN);
+			if (error)
 				return;
-			}
 		}
 
-		if (++i >= b->size - b->target)
+		if (b->reset_required || ++i >= b->size - b->target)
 			break;
+
+		cond_resched();
 	}
 
 	if (num_pages > 0)
 		b->ops->unlock(b, num_pages, &b->target);
-
-	/* slowly increase rate if there were no errors */
-	if (error == 0)
-		b->rate_free = min(b->rate_free + VMW_BALLOON_RATE_FREE_INC,
-				   VMW_BALLOON_RATE_FREE_MAX);
 }
 
 static const struct vmballoon_ops vmballoon_basic_ops = {
@@ -992,11 +960,8 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 
 	/* format rate info */
 	seq_printf(f,
-		   "rateNoSleepAlloc:   %8d pages/sec\n"
-		   "rateSleepAlloc:     %8d pages/sec\n"
-		   "rateFree:           %8d pages/sec\n",
-		   VMW_BALLOON_NOSLEEP_ALLOC_MAX,
-		   b->rate_alloc, b->rate_free);
+		   "rateSleepAlloc:     %8d pages/sec\n",
+		   b->rate_alloc);
 
 	seq_printf(f,
 		   "\n"
@@ -1087,7 +1052,6 @@ static int __init vmballoon_init(void)
 
 	/* initialize rates */
 	balloon.rate_alloc = VMW_BALLOON_RATE_ALLOC_MAX;
-	balloon.rate_free = VMW_BALLOON_RATE_FREE_MAX;
 
 	INIT_DELAYED_WORK(&balloon.dwork, vmballoon_work);
 
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 7/9] VMware balloon: Support 2m page ballooning.
  2015-04-14 17:29 [PATCH 0/9] VMware balloon: Large page ballooning and VMCI support Philip P. Moltmann
                   ` (6 preceding siblings ...)
  2015-04-14 17:29 ` [PATCH 6/9] VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode Philip P. Moltmann
@ 2015-04-14 17:29 ` Philip P. Moltmann
  2015-04-14 17:29 ` [PATCH 8/9] VMware balloon: Treat init like reset Philip P. Moltmann
  2015-04-14 17:29 ` [PATCH 9/9] VMware balloon: Enable notification via VMCI Philip P. Moltmann
  9 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-04-14 17:29 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel, xdeguillard, akpm, pv-drivers, Philip P. Moltmann

2m ballooning significantly reduces the hypervisor side (and guest side)
overhead of ballooning and unballooning.

hypervisor only:
      balloon  unballoon
4 KB  2 GB/s   2.6 GB/s
2 MB  54 GB/s  767 GB/s

Use 2 MB pages, as the hypervisor is always 64-bit and 2 MB is the smallest
supported super-page size.

The code has to run on older versions of ESX, and old balloon drivers run on
newer versions of ESX. Hence the driver matches capabilities with the host
before 2 MB page ballooning can be enabled.

Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
---
 drivers/misc/vmw_balloon.c | 376 +++++++++++++++++++++++++++++++--------------
 1 file changed, 258 insertions(+), 118 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index a5e1980..a45eea6 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -46,7 +46,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.3.4.0-k");
+MODULE_VERSION("1.4.0.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -101,11 +101,16 @@ enum vmwballoon_capabilities {
 	 * Bit 0 is reserved and not associated to any capability.
 	 */
 	VMW_BALLOON_BASIC_CMDS		= (1 << 1),
-	VMW_BALLOON_BATCHED_CMDS	= (1 << 2)
+	VMW_BALLOON_BATCHED_CMDS	= (1 << 2),
+	VMW_BALLOON_BATCHED_2M_CMDS     = (1 << 3),
 };
 
 #define VMW_BALLOON_CAPABILITIES	(VMW_BALLOON_BASIC_CMDS \
-					| VMW_BALLOON_BATCHED_CMDS)
+					| VMW_BALLOON_BATCHED_CMDS \
+					| VMW_BALLOON_BATCHED_2M_CMDS)
+
+#define VMW_BALLOON_2M_SHIFT		(9)
+#define VMW_BALLOON_NUM_PAGE_SIZES	(2)
 
 /*
  * Backdoor commands availability:
@@ -116,14 +121,19 @@ enum vmwballoon_capabilities {
  *	LOCK and UNLOCK commands,
  * VMW_BALLOON_BATCHED_CMDS:
  *	BATCHED_LOCK and BATCHED_UNLOCK commands.
 + * VMW_BALLOON_BATCHED_2M_CMDS:
+ *	BATCHED_2M_LOCK and BATCHED_2M_UNLOCK commands.
  */
-#define VMW_BALLOON_CMD_START		0
-#define VMW_BALLOON_CMD_GET_TARGET	1
-#define VMW_BALLOON_CMD_LOCK		2
-#define VMW_BALLOON_CMD_UNLOCK		3
-#define VMW_BALLOON_CMD_GUEST_ID	4
-#define VMW_BALLOON_CMD_BATCHED_LOCK	6
-#define VMW_BALLOON_CMD_BATCHED_UNLOCK	7
+#define VMW_BALLOON_CMD_START			0
+#define VMW_BALLOON_CMD_GET_TARGET		1
+#define VMW_BALLOON_CMD_LOCK			2
+#define VMW_BALLOON_CMD_UNLOCK			3
+#define VMW_BALLOON_CMD_GUEST_ID		4
+#define VMW_BALLOON_CMD_BATCHED_LOCK		6
+#define VMW_BALLOON_CMD_BATCHED_UNLOCK		7
+#define VMW_BALLOON_CMD_BATCHED_2M_LOCK		8
+#define VMW_BALLOON_CMD_BATCHED_2M_UNLOCK	9
+
 
 /* error codes */
 #define VMW_BALLOON_SUCCESS		        0
@@ -151,9 +161,6 @@ enum vmwballoon_capabilities {
  * +-------------+----------+--------+
  * 64  PAGE_SHIFT          6         0
  *
- * For now only 4K pages are supported, but we can easily support large pages
- * by using bits in the reserved field.
- *
  * The reserved field should be set to 0.
  */
 #define VMW_BALLOON_BATCH_MAX_PAGES	(PAGE_SIZE / sizeof(u64))
@@ -208,19 +215,19 @@ struct vmballoon_stats {
 	unsigned int timer;
 
 	/* allocation statistics */
-	unsigned int alloc;
-	unsigned int alloc_fail;
+	unsigned int alloc[VMW_BALLOON_NUM_PAGE_SIZES];
+	unsigned int alloc_fail[VMW_BALLOON_NUM_PAGE_SIZES];
 	unsigned int sleep_alloc;
 	unsigned int sleep_alloc_fail;
-	unsigned int refused_alloc;
-	unsigned int refused_free;
-	unsigned int free;
+	unsigned int refused_alloc[VMW_BALLOON_NUM_PAGE_SIZES];
+	unsigned int refused_free[VMW_BALLOON_NUM_PAGE_SIZES];
+	unsigned int free[VMW_BALLOON_NUM_PAGE_SIZES];
 
 	/* monitor operations */
-	unsigned int lock;
-	unsigned int lock_fail;
-	unsigned int unlock;
-	unsigned int unlock_fail;
+	unsigned int lock[VMW_BALLOON_NUM_PAGE_SIZES];
+	unsigned int lock_fail[VMW_BALLOON_NUM_PAGE_SIZES];
+	unsigned int unlock[VMW_BALLOON_NUM_PAGE_SIZES];
+	unsigned int unlock_fail[VMW_BALLOON_NUM_PAGE_SIZES];
 	unsigned int target;
 	unsigned int target_fail;
 	unsigned int start;
@@ -239,19 +246,25 @@ struct vmballoon;
 struct vmballoon_ops {
 	void (*add_page)(struct vmballoon *b, int idx, struct page *p);
 	int (*lock)(struct vmballoon *b, unsigned int num_pages,
-						unsigned int *target);
+			bool is_2m_pages, unsigned int *target);
 	int (*unlock)(struct vmballoon *b, unsigned int num_pages,
-						unsigned int *target);
+			bool is_2m_pages, unsigned int *target);
 };
 
-struct vmballoon {
-
+struct vmballoon_page_size {
 	/* list of reserved physical pages */
 	struct list_head pages;
 
 	/* transient list of non-balloonable pages */
 	struct list_head refused_pages;
 	unsigned int n_refused_pages;
+};
+
+struct vmballoon {
+	struct vmballoon_page_size page_sizes[VMW_BALLOON_NUM_PAGE_SIZES];
+
+	/* supported page sizes. 1 == 4k pages only, 2 == 4k and 2m pages */
+	unsigned supported_page_sizes;
 
 	/* balloon size in pages */
 	unsigned int size;
@@ -296,6 +309,7 @@ static struct vmballoon balloon;
 static bool vmballoon_send_start(struct vmballoon *b, unsigned long req_caps)
 {
 	unsigned long status, capabilities, dummy = 0;
+	bool success;
 
 	STATS_INC(b->stats.start);
 
@@ -304,15 +318,26 @@ static bool vmballoon_send_start(struct vmballoon *b, unsigned long req_caps)
 	switch (status) {
 	case VMW_BALLOON_SUCCESS_WITH_CAPABILITIES:
 		b->capabilities = capabilities;
-		return true;
+		success = true;
+		break;
 	case VMW_BALLOON_SUCCESS:
 		b->capabilities = VMW_BALLOON_BASIC_CMDS;
-		return true;
+		success = true;
+		break;
+	default:
+		success = false;
 	}
 
-	pr_debug("%s - failed, hv returns %ld\n", __func__, status);
-	STATS_INC(b->stats.start_fail);
-	return false;
+	if (b->capabilities & VMW_BALLOON_BATCHED_2M_CMDS)
+		b->supported_page_sizes = 2;
+	else
+		b->supported_page_sizes = 1;
+
+	if (!success) {
+		pr_debug("%s - failed, hv returns %ld\n", __func__, status);
+		STATS_INC(b->stats.start_fail);
+	}
+	return success;
 }
 
 static bool vmballoon_check_status(struct vmballoon *b, unsigned long status)
@@ -353,6 +378,14 @@ static bool vmballoon_send_guest_id(struct vmballoon *b)
 	return false;
 }
 
+static u16 vmballoon_page_size(bool is_2m_page)
+{
+	if (is_2m_page)
+		return 1 << VMW_BALLOON_2M_SHIFT;
+
+	return 1;
+}
+
 /*
  * Retrieve desired balloon size from the host.
  */
@@ -406,31 +439,37 @@ static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 	if (pfn32 != pfn)
 		return -1;
 
-	STATS_INC(b->stats.lock);
+	STATS_INC(b->stats.lock[false]);
 
 	*hv_status = status = VMWARE_BALLOON_CMD(LOCK, pfn, dummy, *target);
 	if (vmballoon_check_status(b, status))
 		return 0;
 
 	pr_debug("%s - ppn %lx, hv returns %ld\n", __func__, pfn, status);
-	STATS_INC(b->stats.lock_fail);
+	STATS_INC(b->stats.lock_fail[false]);
 	return 1;
 }
 
 static int vmballoon_send_batched_lock(struct vmballoon *b,
-				unsigned int num_pages, unsigned int *target)
+		unsigned int num_pages, bool is_2m_pages, unsigned int *target)
 {
 	unsigned long status;
 	unsigned long pfn = page_to_pfn(b->page);
 
-	STATS_INC(b->stats.lock);
+	STATS_INC(b->stats.lock[is_2m_pages]);
+
+	if (is_2m_pages)
+		status = VMWARE_BALLOON_CMD(BATCHED_2M_LOCK, pfn, num_pages,
+				*target);
+	else
+		status = VMWARE_BALLOON_CMD(BATCHED_LOCK, pfn, num_pages,
+				*target);
 
-	status = VMWARE_BALLOON_CMD(BATCHED_LOCK, pfn, num_pages, *target);
 	if (vmballoon_check_status(b, status))
 		return 0;
 
 	pr_debug("%s - batch ppn %lx, hv returns %ld\n", __func__, pfn, status);
-	STATS_INC(b->stats.lock_fail);
+	STATS_INC(b->stats.lock_fail[is_2m_pages]);
 	return 1;
 }
 
@@ -448,34 +487,56 @@ static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn,
 	if (pfn32 != pfn)
 		return false;
 
-	STATS_INC(b->stats.unlock);
+	STATS_INC(b->stats.unlock[false]);
 
 	status = VMWARE_BALLOON_CMD(UNLOCK, pfn, dummy, *target);
 	if (vmballoon_check_status(b, status))
 		return true;
 
 	pr_debug("%s - ppn %lx, hv returns %ld\n", __func__, pfn, status);
-	STATS_INC(b->stats.unlock_fail);
+	STATS_INC(b->stats.unlock_fail[false]);
 	return false;
 }
 
 static bool vmballoon_send_batched_unlock(struct vmballoon *b,
-				unsigned int num_pages, unsigned int *target)
+		unsigned int num_pages, bool is_2m_pages, unsigned int *target)
 {
 	unsigned long status;
 	unsigned long pfn = page_to_pfn(b->page);
 
-	STATS_INC(b->stats.unlock);
+	STATS_INC(b->stats.unlock[is_2m_pages]);
+
+	if (is_2m_pages)
+		status = VMWARE_BALLOON_CMD(BATCHED_2M_UNLOCK, pfn, num_pages,
+				*target);
+	else
+		status = VMWARE_BALLOON_CMD(BATCHED_UNLOCK, pfn, num_pages,
+				*target);
 
-	status = VMWARE_BALLOON_CMD(BATCHED_UNLOCK, pfn, num_pages, *target);
 	if (vmballoon_check_status(b, status))
 		return true;
 
 	pr_debug("%s - batch ppn %lx, hv returns %ld\n", __func__, pfn, status);
-	STATS_INC(b->stats.unlock_fail);
+	STATS_INC(b->stats.unlock_fail[is_2m_pages]);
 	return false;
 }
 
+static struct page *vmballoon_alloc_page(gfp_t flags, bool is_2m_page)
+{
+	if (is_2m_page)
+		return alloc_pages(flags, VMW_BALLOON_2M_SHIFT);
+
+	return alloc_page(flags);
+}
+
+static void vmballoon_free_page(struct page *page, bool is_2m_page)
+{
+	if (is_2m_page)
+		__free_pages(page, VMW_BALLOON_2M_SHIFT);
+	else
+		__free_page(page);
+}
+
 /*
  * Quickly release all pages allocated for the balloon. This function is
  * called when host decides to "reset" balloon for one reason or another.
@@ -485,13 +546,21 @@ static bool vmballoon_send_batched_unlock(struct vmballoon *b,
 static void vmballoon_pop(struct vmballoon *b)
 {
 	struct page *page, *next;
-
-	list_for_each_entry_safe(page, next, &b->pages, lru) {
-		list_del(&page->lru);
-		__free_page(page);
-		STATS_INC(b->stats.free);
-		b->size--;
-		cond_resched();
+	unsigned is_2m_pages;
+
+	for (is_2m_pages = 0; is_2m_pages < VMW_BALLOON_NUM_PAGE_SIZES;
+			is_2m_pages++) {
+		struct vmballoon_page_size *page_size =
+				&b->page_sizes[is_2m_pages];
+		u16 size_per_page = vmballoon_page_size(is_2m_pages);
+
+		list_for_each_entry_safe(page, next, &page_size->pages, lru) {
+			list_del(&page->lru);
+			vmballoon_free_page(page, is_2m_pages);
+			STATS_INC(b->stats.free[is_2m_pages]);
+			b->size -= size_per_page;
+			cond_resched();
+		}
 	}
 
 	if ((b->capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
@@ -509,19 +578,22 @@ static void vmballoon_pop(struct vmballoon *b)
  * inflation cycle.
  */
 static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages,
-							unsigned int *target)
+				bool is_2m_pages, unsigned int *target)
 {
 	int locked, hv_status;
 	struct page *page = b->page;
+	struct vmballoon_page_size *page_size = &b->page_sizes[false];
+
+	/* is_2m_pages can never happen as 2m pages support implies batching */
 
 	locked = vmballoon_send_lock_page(b, page_to_pfn(page), &hv_status,
 								target);
 	if (locked > 0) {
-		STATS_INC(b->stats.refused_alloc);
+		STATS_INC(b->stats.refused_alloc[false]);
 
 		if (hv_status == VMW_BALLOON_ERROR_RESET ||
 				hv_status == VMW_BALLOON_ERROR_PPN_NOTNEEDED) {
-			__free_page(page);
+			vmballoon_free_page(page, false);
 			return -EIO;
 		}
 
@@ -530,17 +602,17 @@ static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages,
 		 * and retry allocation, unless we already accumulated
 		 * too many of them, in which case take a breather.
 		 */
-		if (b->n_refused_pages < VMW_BALLOON_MAX_REFUSED) {
-			b->n_refused_pages++;
-			list_add(&page->lru, &b->refused_pages);
+		if (page_size->n_refused_pages < VMW_BALLOON_MAX_REFUSED) {
+			page_size->n_refused_pages++;
+			list_add(&page->lru, &page_size->refused_pages);
 		} else {
-			__free_page(page);
+			vmballoon_free_page(page, false);
 		}
 		return -EIO;
 	}
 
 	/* track allocated page */
-	list_add(&page->lru, &b->pages);
+	list_add(&page->lru, &page_size->pages);
 
 	/* update balloon size */
 	b->size++;
@@ -549,17 +621,19 @@ static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages,
 }
 
 static int vmballoon_lock_batched_page(struct vmballoon *b,
-				unsigned int num_pages, unsigned int *target)
+		unsigned int num_pages, bool is_2m_pages, unsigned int *target)
 {
 	int locked, i;
+	u16 size_per_page = vmballoon_page_size(is_2m_pages);
 
-	locked = vmballoon_send_batched_lock(b, num_pages, target);
+	locked = vmballoon_send_batched_lock(b, num_pages, is_2m_pages,
+			target);
 	if (locked > 0) {
 		for (i = 0; i < num_pages; i++) {
 			u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
 			struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
 
-			__free_page(p);
+			vmballoon_free_page(p, is_2m_pages);
 		}
 
 		return -EIO;
@@ -568,25 +642,28 @@ static int vmballoon_lock_batched_page(struct vmballoon *b,
 	for (i = 0; i < num_pages; i++) {
 		u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
 		struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
+		struct vmballoon_page_size *page_size =
+				&b->page_sizes[is_2m_pages];
 
 		locked = vmballoon_batch_get_status(b->batch_page, i);
 
 		switch (locked) {
 		case VMW_BALLOON_SUCCESS:
-			list_add(&p->lru, &b->pages);
-			b->size++;
+			list_add(&p->lru, &page_size->pages);
+			b->size += size_per_page;
 			break;
 		case VMW_BALLOON_ERROR_PPN_PINNED:
 		case VMW_BALLOON_ERROR_PPN_INVALID:
-			if (b->n_refused_pages < VMW_BALLOON_MAX_REFUSED) {
-				list_add(&p->lru, &b->refused_pages);
-				b->n_refused_pages++;
+			if (page_size->n_refused_pages
+					< VMW_BALLOON_MAX_REFUSED) {
+				list_add(&p->lru, &page_size->refused_pages);
+				page_size->n_refused_pages++;
 				break;
 			}
 			/* Fallthrough */
 		case VMW_BALLOON_ERROR_RESET:
 		case VMW_BALLOON_ERROR_PPN_NOTNEEDED:
-			__free_page(p);
+			vmballoon_free_page(p, is_2m_pages);
 			break;
 		default:
 			/* This should never happen */
@@ -603,18 +680,21 @@ static int vmballoon_lock_batched_page(struct vmballoon *b,
  * to use, if needed.
  */
 static int vmballoon_unlock_page(struct vmballoon *b, unsigned int num_pages,
-							unsigned int *target)
+		bool is_2m_pages, unsigned int *target)
 {
 	struct page *page = b->page;
+	struct vmballoon_page_size *page_size = &b->page_sizes[false];
+
+	/* is_2m_pages can never happen as 2m pages support implies batching */
 
 	if (!vmballoon_send_unlock_page(b, page_to_pfn(page), target)) {
-		list_add(&page->lru, &b->pages);
+		list_add(&page->lru, &page_size->pages);
 		return -EIO;
 	}
 
 	/* deallocate page */
-	__free_page(page);
-	STATS_INC(b->stats.free);
+	vmballoon_free_page(page, false);
+	STATS_INC(b->stats.free[false]);
 
 	/* update balloon size */
 	b->size--;
@@ -623,18 +703,23 @@ static int vmballoon_unlock_page(struct vmballoon *b, unsigned int num_pages,
 }
 
 static int vmballoon_unlock_batched_page(struct vmballoon *b,
-				unsigned int num_pages, unsigned int *target)
+				unsigned int num_pages, bool is_2m_pages,
+				unsigned int *target)
 {
 	int locked, i, ret = 0;
 	bool hv_success;
+	u16 size_per_page = vmballoon_page_size(is_2m_pages);
 
-	hv_success = vmballoon_send_batched_unlock(b, num_pages, target);
+	hv_success = vmballoon_send_batched_unlock(b, num_pages, is_2m_pages,
+			target);
 	if (!hv_success)
 		ret = -EIO;
 
 	for (i = 0; i < num_pages; i++) {
 		u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
 		struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
+		struct vmballoon_page_size *page_size =
+				&b->page_sizes[is_2m_pages];
 
 		locked = vmballoon_batch_get_status(b->batch_page, i);
 		if (!hv_success || locked != VMW_BALLOON_SUCCESS) {
@@ -643,14 +728,14 @@ static int vmballoon_unlock_batched_page(struct vmballoon *b,
 			 * hypervisor, re-add it to the list of pages owned by
 			 * the balloon driver.
 			 */
-			list_add(&p->lru, &b->pages);
+			list_add(&p->lru, &page_size->pages);
 		} else {
 			/* deallocate page */
-			__free_page(p);
-			STATS_INC(b->stats.free);
+			vmballoon_free_page(p, is_2m_pages);
+			STATS_INC(b->stats.free[is_2m_pages]);
 
 			/* update balloon size */
-			b->size--;
+			b->size -= size_per_page;
 		}
 	}
 
@@ -661,17 +746,20 @@ static int vmballoon_unlock_batched_page(struct vmballoon *b,
  * Release pages that were allocated while attempting to inflate the
  * balloon but were refused by the host for one reason or another.
  */
-static void vmballoon_release_refused_pages(struct vmballoon *b)
+static void vmballoon_release_refused_pages(struct vmballoon *b,
+		bool is_2m_pages)
 {
 	struct page *page, *next;
+	struct vmballoon_page_size *page_size =
+			&b->page_sizes[is_2m_pages];
 
-	list_for_each_entry_safe(page, next, &b->refused_pages, lru) {
+	list_for_each_entry_safe(page, next, &page_size->refused_pages, lru) {
 		list_del(&page->lru);
-		__free_page(page);
-		STATS_INC(b->stats.refused_free);
+		vmballoon_free_page(page, is_2m_pages);
+		STATS_INC(b->stats.refused_free[is_2m_pages]);
 	}
 
-	b->n_refused_pages = 0;
+	page_size->n_refused_pages = 0;
 }
 
 static void vmballoon_add_page(struct vmballoon *b, int idx, struct page *p)
@@ -698,6 +786,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 	unsigned int num_pages = 0;
 	int error = 0;
 	gfp_t flags = VMW_PAGE_ALLOC_NOSLEEP;
+	bool is_2m_pages;
 
 	pr_debug("%s - size: %d, target %d\n", __func__, b->size, b->target);
 
@@ -720,22 +809,46 @@ static void vmballoon_inflate(struct vmballoon *b)
 	 * Start with no sleep allocation rate which may be higher
 	 * than sleeping allocation rate.
 	 */
-	rate = b->slow_allocation_cycles ? b->rate_alloc : -1;
+	if (b->slow_allocation_cycles) {
+		rate = b->rate_alloc;
+		is_2m_pages = false;
+	} else {
+		rate = -1;
+		is_2m_pages =
+			b->supported_page_sizes == VMW_BALLOON_NUM_PAGE_SIZES;
+	}
 
 	pr_debug("%s - goal: %d, no-sleep rate: %d, sleep rate: %d\n",
 		 __func__, b->target - b->size, rate, b->rate_alloc);
 
 	while (!b->reset_required &&
-		b->size < b->target && num_pages < b->target - b->size) {
+		b->size + num_pages * vmballoon_page_size(is_2m_pages)
+		< b->target) {
 		struct page *page;
 
 		if (flags == VMW_PAGE_ALLOC_NOSLEEP)
-			STATS_INC(b->stats.alloc);
+			STATS_INC(b->stats.alloc[is_2m_pages]);
 		else
 			STATS_INC(b->stats.sleep_alloc);
 
-		page = alloc_page(flags);
+		page = vmballoon_alloc_page(flags, is_2m_pages);
 		if (!page) {
+			STATS_INC(b->stats.alloc_fail[is_2m_pages]);
+
+			if (is_2m_pages) {
+				b->ops->lock(b, num_pages, true, &b->target);
+
+				/*
+				 * ignore errors from locking as we now switch
+				 * to 4k pages and we might get different
+				 * errors.
+				 */
+
+				num_pages = 0;
+				is_2m_pages = false;
+				continue;
+			}
+
 			if (flags == VMW_PAGE_ALLOC_CANSLEEP) {
 				/*
 				 * CANSLEEP page allocation failed, so guest
@@ -747,7 +860,6 @@ static void vmballoon_inflate(struct vmballoon *b)
 				STATS_INC(b->stats.sleep_alloc_fail);
 				break;
 			}
-			STATS_INC(b->stats.alloc_fail);
 
 			/*
 			 * NOSLEEP page allocation failed, so the guest is
@@ -770,7 +882,8 @@ static void vmballoon_inflate(struct vmballoon *b)
 
 		b->ops->add_page(b, num_pages++, page);
 		if (num_pages == b->batch_max_pages) {
-			error = b->ops->lock(b, num_pages, &b->target);
+			error = b->ops->lock(b, num_pages, is_2m_pages,
+					&b->target);
 			num_pages = 0;
 			if (error)
 				break;
@@ -785,7 +898,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 	}
 
 	if (num_pages > 0)
-		b->ops->lock(b, num_pages, &b->target);
+		b->ops->lock(b, num_pages, is_2m_pages, &b->target);
 
 	/*
 	 * We reached our goal without failures so try increasing
@@ -799,7 +912,8 @@ static void vmballoon_inflate(struct vmballoon *b)
 			    VMW_BALLOON_RATE_ALLOC_MAX);
 	}
 
-	vmballoon_release_refused_pages(b);
+	vmballoon_release_refused_pages(b, true);
+	vmballoon_release_refused_pages(b, false);
 }
 
 /*
@@ -807,34 +921,45 @@ static void vmballoon_inflate(struct vmballoon *b)
  */
 static void vmballoon_deflate(struct vmballoon *b)
 {
-	struct page *page, *next;
-	unsigned int i = 0;
-	unsigned int num_pages = 0;
-	int error;
+	unsigned is_2m_pages;
 
 	pr_debug("%s - size: %d, target %d\n", __func__, b->size, b->target);
 
 	/* free pages to reach target */
-	list_for_each_entry_safe(page, next, &b->pages, lru) {
-		list_del(&page->lru);
-		b->ops->add_page(b, num_pages++, page);
+	for (is_2m_pages = 0; is_2m_pages < b->supported_page_sizes;
+			is_2m_pages++) {
+		struct page *page, *next;
+		unsigned int num_pages = 0;
+		struct vmballoon_page_size *page_size =
+				&b->page_sizes[is_2m_pages];
+
+		list_for_each_entry_safe(page, next, &page_size->pages, lru) {
+			if (b->reset_required ||
+				(b->target > 0 &&
+					b->size - num_pages
+					* vmballoon_page_size(is_2m_pages)
+				< b->target + vmballoon_page_size(true)))
+				break;
 
+			list_del(&page->lru);
+			b->ops->add_page(b, num_pages++, page);
 
-		if (num_pages == b->batch_max_pages) {
-			error = b->ops->unlock(b, num_pages, &b->target);
-			num_pages = 0;
-			if (error)
-				return;
-		}
+			if (num_pages == b->batch_max_pages) {
+				int error;
 
-		if (b->reset_required || ++i >= b->size - b->target)
-			break;
+				error = b->ops->unlock(b, num_pages,
+						is_2m_pages, &b->target);
+				num_pages = 0;
+				if (error)
+					return;
+			}
 
-		cond_resched();
-	}
+			cond_resched();
+		}
 
-	if (num_pages > 0)
-		b->ops->unlock(b, num_pages, &b->target);
+		if (num_pages > 0)
+			b->ops->unlock(b, num_pages, is_2m_pages, &b->target);
+	}
 }
 
 static const struct vmballoon_ops vmballoon_basic_ops = {
@@ -924,7 +1049,8 @@ static void vmballoon_work(struct work_struct *work)
 
 		if (b->size < target)
 			vmballoon_inflate(b);
-		else if (b->size > target)
+		else if (target == 0 ||
+				b->size > target + vmballoon_page_size(true))
 			vmballoon_deflate(b);
 	}
 
@@ -968,24 +1094,35 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 		   "timer:              %8u\n"
 		   "start:              %8u (%4u failed)\n"
 		   "guestType:          %8u (%4u failed)\n"
+		   "2m-lock:            %8u (%4u failed)\n"
 		   "lock:               %8u (%4u failed)\n"
+		   "2m-unlock:          %8u (%4u failed)\n"
 		   "unlock:             %8u (%4u failed)\n"
 		   "target:             %8u (%4u failed)\n"
+		   "prim2mAlloc:        %8u (%4u failed)\n"
 		   "primNoSleepAlloc:   %8u (%4u failed)\n"
 		   "primCanSleepAlloc:  %8u (%4u failed)\n"
+		   "prim2mFree:         %8u\n"
 		   "primFree:           %8u\n"
+		   "err2mAlloc:         %8u\n"
 		   "errAlloc:           %8u\n"
+		   "err2mFree:          %8u\n"
 		   "errFree:            %8u\n",
 		   stats->timer,
 		   stats->start, stats->start_fail,
 		   stats->guest_type, stats->guest_type_fail,
-		   stats->lock,  stats->lock_fail,
-		   stats->unlock, stats->unlock_fail,
+		   stats->lock[true],  stats->lock_fail[true],
+		   stats->lock[false],  stats->lock_fail[false],
+		   stats->unlock[true], stats->unlock_fail[true],
+		   stats->unlock[false], stats->unlock_fail[false],
 		   stats->target, stats->target_fail,
-		   stats->alloc, stats->alloc_fail,
+		   stats->alloc[true], stats->alloc_fail[true],
+		   stats->alloc[false], stats->alloc_fail[false],
 		   stats->sleep_alloc, stats->sleep_alloc_fail,
-		   stats->free,
-		   stats->refused_alloc, stats->refused_free);
+		   stats->free[true],
+		   stats->free[false],
+		   stats->refused_alloc[true], stats->refused_alloc[false],
+		   stats->refused_free[true], stats->refused_free[false]);
 
 	return 0;
 }
@@ -1039,7 +1176,7 @@ static inline void vmballoon_debugfs_exit(struct vmballoon *b)
 static int __init vmballoon_init(void)
 {
 	int error;
-
+	unsigned is_2m_pages;
 	/*
 	 * Check if we are running on VMware's hypervisor and bail out
 	 * if we are not.
@@ -1047,8 +1184,11 @@ static int __init vmballoon_init(void)
 	if (x86_hyper != &x86_hyper_vmware)
 		return -ENODEV;
 
-	INIT_LIST_HEAD(&balloon.pages);
-	INIT_LIST_HEAD(&balloon.refused_pages);
+	for (is_2m_pages = 0; is_2m_pages < VMW_BALLOON_NUM_PAGE_SIZES;
+			is_2m_pages++) {
+		INIT_LIST_HEAD(&balloon.page_sizes[is_2m_pages].pages);
+		INIT_LIST_HEAD(&balloon.page_sizes[is_2m_pages].refused_pages);
+	}
 
 	/* initialize rates */
 	balloon.rate_alloc = VMW_BALLOON_RATE_ALLOC_MAX;
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 8/9] VMware balloon: Treat init like reset
  2015-04-14 17:29 [PATCH 0/9] VMware balloon: Large page ballooning and VMCI support Philip P. Moltmann
                   ` (7 preceding siblings ...)
  2015-04-14 17:29 ` [PATCH 7/9] VMware balloon: Support 2m page ballooning Philip P. Moltmann
@ 2015-04-14 17:29 ` Philip P. Moltmann
  2015-04-14 17:29 ` [PATCH 9/9] VMware balloon: Enable notification via VMCI Philip P. Moltmann
  9 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-04-14 17:29 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel, xdeguillard, akpm, pv-drivers, Philip P. Moltmann

Unify the behavior of the first start of the balloon and a reset. Also, on
unload, declare that the balloon driver no longer has any capabilities.

Acked-by: Andy King <acking@vmware.com>
Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
---
 drivers/misc/vmw_balloon.c | 53 ++++++++++++++++------------------------------
 1 file changed, 18 insertions(+), 35 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index a45eea6..861d9f3 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -46,7 +46,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.4.0.0-k");
+MODULE_VERSION("1.4.1.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -563,12 +563,14 @@ static void vmballoon_pop(struct vmballoon *b)
 		}
 	}
 
-	if ((b->capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
-		if (b->batch_page)
-			vunmap(b->batch_page);
+	if (b->batch_page) {
+		vunmap(b->batch_page);
+		b->batch_page = NULL;
+	}
 
-		if (b->page)
-			__free_page(b->page);
+	if (b->page) {
+		__free_page(b->page);
+		b->page = NULL;
 	}
 }
 
@@ -1043,7 +1045,7 @@ static void vmballoon_work(struct work_struct *work)
 	if (b->slow_allocation_cycles > 0)
 		b->slow_allocation_cycles--;
 
-	if (vmballoon_send_get_target(b, &target)) {
+	if (!b->reset_required && vmballoon_send_get_target(b, &target)) {
 		/* update target, adjust size */
 		b->target = target;
 
@@ -1075,8 +1077,10 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 	/* format capabilities info */
 	seq_printf(f,
 		   "balloon capabilities:   %#4x\n"
-		   "used capabilities:      %#4lx\n",
-		   VMW_BALLOON_CAPABILITIES, b->capabilities);
+		   "used capabilities:      %#4lx\n"
+		   "is resetting:           %c\n",
+		   VMW_BALLOON_CAPABILITIES, b->capabilities,
+		   b->reset_required ? 'y' : 'n');
 
 	/* format size info */
 	seq_printf(f,
@@ -1195,35 +1199,14 @@ static int __init vmballoon_init(void)
 
 	INIT_DELAYED_WORK(&balloon.dwork, vmballoon_work);
 
-	/*
-	 * Start balloon.
-	 */
-	if (!vmballoon_send_start(&balloon, VMW_BALLOON_CAPABILITIES)) {
-		pr_err("failed to send start command to the host\n");
-		return -EIO;
-	}
-
-	if ((balloon.capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
-		balloon.ops = &vmballoon_batched_ops;
-		balloon.batch_max_pages = VMW_BALLOON_BATCH_MAX_PAGES;
-		if (!vmballoon_init_batching(&balloon)) {
-			pr_err("failed to init batching\n");
-			return -EIO;
-		}
-	} else if ((balloon.capabilities & VMW_BALLOON_BASIC_CMDS) != 0) {
-		balloon.ops = &vmballoon_basic_ops;
-		balloon.batch_max_pages = 1;
-	}
-
-	if (!vmballoon_send_guest_id(&balloon)) {
-		pr_err("failed to send guest ID to the host\n");
-		return -EIO;
-	}
-
 	error = vmballoon_debugfs_init(&balloon);
 	if (error)
 		return error;
 
+	balloon.batch_page = NULL;
+	balloon.page = NULL;
+	balloon.reset_required = true;
+
 	queue_delayed_work(system_freezable_wq, &balloon.dwork, 0);
 
 	return 0;
@@ -1241,7 +1224,7 @@ static void __exit vmballoon_exit(void)
 	 * Reset connection before deallocating memory to avoid potential for
 	 * additional spurious resets from guest touching deallocated pages.
 	 */
-	vmballoon_send_start(&balloon, VMW_BALLOON_CAPABILITIES);
+	vmballoon_send_start(&balloon, 0);
 	vmballoon_pop(&balloon);
 }
 module_exit(vmballoon_exit);
-- 
1.9.3



* [PATCH 9/9] VMware balloon: Enable notification via VMCI
  2015-04-14 17:29 [PATCH 0/9] VMware balloon: Large page ballooning and VMCI support Philip P. Moltmann
                   ` (8 preceding siblings ...)
  2015-04-14 17:29 ` [PATCH 8/9] VMware balloon: Treat init like reset Philip P. Moltmann
@ 2015-04-14 17:29 ` Philip P. Moltmann
  9 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-04-14 17:29 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel, xdeguillard, akpm, pv-drivers, Philip P. Moltmann

Get notified immediately when a balloon target is set, instead of waiting for
up to one second.

The up-to-one-second gap could be long enough to cause swapping inside the
destination VM.

Acked-by: Andy King <acking@vmware.com>
Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
Tested-by: Siva Sankar Reddy B <sankars@vmware.com>
---
 drivers/misc/Kconfig       |   2 +-
 drivers/misc/vmw_balloon.c | 105 +++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 97 insertions(+), 10 deletions(-)

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 006242c..1c075b7 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -404,7 +404,7 @@ config TI_DAC7512
 
 config VMWARE_BALLOON
 	tristate "VMware Balloon Driver"
-	depends on X86 && HYPERVISOR_GUEST
+	depends on VMWARE_VMCI && X86 && HYPERVISOR_GUEST
 	help
 	  This is VMware physical memory management driver which acts
 	  like a "balloon" that can be inflated to reclaim physical pages
diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 861d9f3..53d951d 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -1,7 +1,7 @@
 /*
  * VMware Balloon driver.
  *
- * Copyright (C) 2000-2013, VMware, Inc. All Rights Reserved.
+ * Copyright (C) 2000-2014, VMware, Inc. All Rights Reserved.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms of the GNU General Public License as published by the
@@ -42,11 +42,13 @@
 #include <linux/workqueue.h>
 #include <linux/debugfs.h>
 #include <linux/seq_file.h>
+#include <linux/vmw_vmci_defs.h>
+#include <linux/vmw_vmci_api.h>
 #include <asm/hypervisor.h>
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.4.1.0-k");
+MODULE_VERSION("1.5.0.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -100,14 +102,16 @@ enum vmwballoon_capabilities {
 	/*
 	 * Bit 0 is reserved and not associated to any capability.
 	 */
-	VMW_BALLOON_BASIC_CMDS		= (1 << 1),
-	VMW_BALLOON_BATCHED_CMDS	= (1 << 2),
-	VMW_BALLOON_BATCHED_2M_CMDS     = (1 << 3),
+	VMW_BALLOON_BASIC_CMDS			= (1 << 1),
+	VMW_BALLOON_BATCHED_CMDS		= (1 << 2),
+	VMW_BALLOON_BATCHED_2M_CMDS		= (1 << 3),
+	VMW_BALLOON_SIGNALLED_WAKEUP_CMD	= (1 << 4),
 };
 
 #define VMW_BALLOON_CAPABILITIES	(VMW_BALLOON_BASIC_CMDS \
 					| VMW_BALLOON_BATCHED_CMDS \
-					| VMW_BALLOON_BATCHED_2M_CMDS)
+					| VMW_BALLOON_BATCHED_2M_CMDS \
+					| VMW_BALLOON_SIGNALLED_WAKEUP_CMD)
 
 #define VMW_BALLOON_2M_SHIFT		(9)
 #define VMW_BALLOON_NUM_PAGE_SIZES	(2)
@@ -122,7 +126,9 @@ enum vmwballoon_capabilities {
  * VMW_BALLOON_BATCHED_CMDS:
  *	BATCHED_LOCK and BATCHED_UNLOCK commands.
  * VMW BALLOON_BATCHED_2M_CMDS:
- *	BATCHED_2M_LOCK and BATCHED_2M_UNLOCK commands.
+ *	BATCHED_2M_LOCK and BATCHED_2M_UNLOCK commands,
+ * VMW VMW_BALLOON_SIGNALLED_WAKEUP_CMD:
+ *	VMW_BALLOON_CMD_VMCI_DOORBELL_SET command.
  */
 #define VMW_BALLOON_CMD_START			0
 #define VMW_BALLOON_CMD_GET_TARGET		1
@@ -133,6 +139,7 @@ enum vmwballoon_capabilities {
 #define VMW_BALLOON_CMD_BATCHED_UNLOCK		7
 #define VMW_BALLOON_CMD_BATCHED_2M_LOCK		8
 #define VMW_BALLOON_CMD_BATCHED_2M_UNLOCK	9
+#define VMW_BALLOON_CMD_VMCI_DOORBELL_SET	10
 
 
 /* error codes */
@@ -213,6 +220,7 @@ static void vmballoon_batch_set_pa(struct vmballoon_batch_page *batch, int idx,
 #ifdef CONFIG_DEBUG_FS
 struct vmballoon_stats {
 	unsigned int timer;
+	unsigned int doorbell;
 
 	/* allocation statistics */
 	unsigned int alloc[VMW_BALLOON_NUM_PAGE_SIZES];
@@ -234,6 +242,8 @@ struct vmballoon_stats {
 	unsigned int start_fail;
 	unsigned int guest_type;
 	unsigned int guest_type_fail;
+	unsigned int doorbell_set;
+	unsigned int doorbell_unset;
 };
 
 #define STATS_INC(stat) (stat)++
@@ -298,6 +308,8 @@ struct vmballoon {
 	struct sysinfo sysinfo;
 
 	struct delayed_work dwork;
+
+	struct vmci_handle vmci_doorbell;
 };
 
 static struct vmballoon balloon;
@@ -992,12 +1004,75 @@ static bool vmballoon_init_batching(struct vmballoon *b)
 }
 
 /*
+ * Receive notification and resize balloon
+ */
+static void vmballoon_doorbell(void *client_data)
+{
+	struct vmballoon *b = client_data;
+
+	STATS_INC(b->stats.doorbell);
+
+	mod_delayed_work(system_freezable_wq, &b->dwork, 0);
+}
+
+/*
+ * Clean up vmci doorbell
+ */
+static void vmballoon_vmci_cleanup(struct vmballoon *b)
+{
+	int error;
+
+	VMWARE_BALLOON_CMD(VMCI_DOORBELL_SET, VMCI_INVALID_ID,
+			VMCI_INVALID_ID, error);
+	STATS_INC(b->stats.doorbell_unset);
+
+	if (!vmci_handle_is_invalid(b->vmci_doorbell)) {
+		vmci_doorbell_destroy(b->vmci_doorbell);
+		b->vmci_doorbell = VMCI_INVALID_HANDLE;
+	}
+}
+
+/*
+ * Initialize vmci doorbell, to get notified as soon as balloon changes
+ */
+static int vmballoon_vmci_init(struct vmballoon *b)
+{
+	int error = 0;
+
+	if ((b->capabilities & VMW_BALLOON_SIGNALLED_WAKEUP_CMD) != 0) {
+		error = vmci_doorbell_create(&b->vmci_doorbell,
+				VMCI_FLAG_DELAYED_CB,
+				VMCI_PRIVILEGE_FLAG_RESTRICTED,
+				vmballoon_doorbell, b);
+
+		if (error == VMCI_SUCCESS) {
+			VMWARE_BALLOON_CMD(VMCI_DOORBELL_SET,
+					b->vmci_doorbell.context,
+					b->vmci_doorbell.resource, error);
+			STATS_INC(b->stats.doorbell_set);
+		}
+	}
+
+	if (error != 0) {
+		vmballoon_vmci_cleanup(b);
+
+		return -EIO;
+	}
+
+	return 0;
+}
+
+/*
  * Perform standard reset sequence by popping the balloon (in case it
  * is not  empty) and then restarting protocol. This operation normally
  * happens when host responds with VMW_BALLOON_ERROR_RESET to a command.
  */
 static void vmballoon_reset(struct vmballoon *b)
 {
+	int error;
+
+	vmballoon_vmci_cleanup(b);
+
 	/* free all pages, skipping monitor unlock */
 	vmballoon_pop(b);
 
@@ -1023,6 +1098,11 @@ static void vmballoon_reset(struct vmballoon *b)
 	}
 
 	b->reset_required = false;
+
+	error = vmballoon_vmci_init(b);
+	if (error)
+		pr_err("failed to initialize vmci doorbell\n");
+
 	if (!vmballoon_send_guest_id(b))
 		pr_err("failed to send guest ID to the host\n");
 }
@@ -1096,6 +1176,7 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 	seq_printf(f,
 		   "\n"
 		   "timer:              %8u\n"
+		   "doorbell:           %8u\n"
 		   "start:              %8u (%4u failed)\n"
 		   "guestType:          %8u (%4u failed)\n"
 		   "2m-lock:            %8u (%4u failed)\n"
@@ -1111,8 +1192,11 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 		   "err2mAlloc:         %8u\n"
 		   "errAlloc:           %8u\n"
 		   "err2mFree:          %8u\n"
-		   "errFree:            %8u\n",
+		   "errFree:            %8u\n"
+		   "doorbellSet:        %8u\n"
+		   "doorbellUnset:      %8u\n",
 		   stats->timer,
+		   stats->doorbell,
 		   stats->start, stats->start_fail,
 		   stats->guest_type, stats->guest_type_fail,
 		   stats->lock[true],  stats->lock_fail[true],
@@ -1126,7 +1210,8 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 		   stats->free[true],
 		   stats->free[false],
 		   stats->refused_alloc[true], stats->refused_alloc[false],
-		   stats->refused_free[true], stats->refused_free[false]);
+		   stats->refused_free[true], stats->refused_free[false],
+		   stats->doorbell_set, stats->doorbell_unset);
 
 	return 0;
 }
@@ -1203,6 +1288,7 @@ static int __init vmballoon_init(void)
 	if (error)
 		return error;
 
+	balloon.vmci_doorbell = VMCI_INVALID_HANDLE;
 	balloon.batch_page = NULL;
 	balloon.page = NULL;
 	balloon.reset_required = true;
@@ -1215,6 +1301,7 @@ module_init(vmballoon_init);
 
 static void __exit vmballoon_exit(void)
 {
+	vmballoon_vmci_cleanup(&balloon);
 	cancel_delayed_work_sync(&balloon.dwork);
 
 	vmballoon_debugfs_exit(&balloon);
-- 
1.9.3



* Re: [PATCH 6/9] VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode.
  2015-04-14 17:29 ` [PATCH 6/9] VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode Philip P. Moltmann
@ 2015-04-16 20:55   ` Dmitry Torokhov
  2015-06-11 20:10     ` Philip Moltmann
  0 siblings, 1 reply; 61+ messages in thread
From: Dmitry Torokhov @ 2015-04-16 20:55 UTC (permalink / raw)
  To: Philip P. Moltmann; +Cc: gregkh, linux-kernel, xdeguillard, akpm, pv-drivers

Hi Philip,

On Tue, Apr 14, 2015 at 10:29:33AM -0700, Philip P. Moltmann wrote:
> Before this patch the slow memory transfer would cause the destination VM to
> have internal swapping until all memory is transferred. Now the memory is
> transferred fast enough so that the destination VM does not swap. The balloon
> loop already yields to the rest of the system, hence the balloon thread
> should not monopolize a CPU.
> 
> Testing Done: quickly ballooned a lot of pages while watching for any
> perceived hiccups (periods of non-responsiveness) in the execution of the
> Linux VM. No such hiccups were seen.

What happens if you run this new driver on an older hypervisor that does
not support batched operations?

> 
> Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
> ---
>  drivers/misc/vmw_balloon.c | 66 +++++++++++-----------------------------------
>  1 file changed, 15 insertions(+), 51 deletions(-)
> 
> diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
> index 6eaf7f7..a5e1980 100644
> --- a/drivers/misc/vmw_balloon.c
> +++ b/drivers/misc/vmw_balloon.c
> @@ -46,7 +46,7 @@
>  
>  MODULE_AUTHOR("VMware, Inc.");
>  MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
> -MODULE_VERSION("1.3.3.0-k");
> +MODULE_VERSION("1.3.4.0-k");
>  MODULE_ALIAS("dmi:*:svnVMware*:*");
>  MODULE_ALIAS("vmware_vmmemctl");
>  MODULE_LICENSE("GPL");
> @@ -57,12 +57,6 @@ MODULE_LICENSE("GPL");
>   */
>  
>  /*
> - * Rate of allocating memory when there is no memory pressure
> - * (driver performs non-sleeping allocations).
> - */
> -#define VMW_BALLOON_NOSLEEP_ALLOC_MAX	16384U
> -
> -/*
>   * Rates of memory allocaton when guest experiences memory pressure
>   * (driver performs sleeping allocations).
>   */
> @@ -71,13 +65,6 @@ MODULE_LICENSE("GPL");
>  #define VMW_BALLOON_RATE_ALLOC_INC	16U
>  
>  /*
> - * Rates for releasing pages while deflating balloon.
> - */
> -#define VMW_BALLOON_RATE_FREE_MIN	512U
> -#define VMW_BALLOON_RATE_FREE_MAX	16384U
> -#define VMW_BALLOON_RATE_FREE_INC	16U
> -
> -/*
>   * When guest is under memory pressure, use a reduced page allocation
>   * rate for next several cycles.
>   */
> @@ -99,9 +86,6 @@ MODULE_LICENSE("GPL");
>   */
>  #define VMW_PAGE_ALLOC_CANSLEEP		(GFP_HIGHUSER)
>  
> -/* Maximum number of page allocations without yielding processor */
> -#define VMW_BALLOON_YIELD_THRESHOLD	1024
> -
>  /* Maximum number of refused pages we accumulate during inflation cycle */
>  #define VMW_BALLOON_MAX_REFUSED		16
>  
> @@ -278,7 +262,6 @@ struct vmballoon {
>  
>  	/* adjustment rates (pages per second) */
>  	unsigned int rate_alloc;
> -	unsigned int rate_free;
>  
>  	/* slowdown page allocations for next few cycles */
>  	unsigned int slow_allocation_cycles;
> @@ -502,18 +485,13 @@ static bool vmballoon_send_batched_unlock(struct vmballoon *b,
>  static void vmballoon_pop(struct vmballoon *b)
>  {
>  	struct page *page, *next;
> -	unsigned int count = 0;
>  
>  	list_for_each_entry_safe(page, next, &b->pages, lru) {
>  		list_del(&page->lru);
>  		__free_page(page);
>  		STATS_INC(b->stats.free);
>  		b->size--;
> -
> -		if (++count >= b->rate_free) {
> -			count = 0;
> -			cond_resched();
> -		}
> +		cond_resched();
>  	}
>  
>  	if ((b->capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
> @@ -742,13 +720,13 @@ static void vmballoon_inflate(struct vmballoon *b)
>  	 * Start with no sleep allocation rate which may be higher
>  	 * than sleeping allocation rate.
>  	 */
> -	rate = b->slow_allocation_cycles ?
> -			b->rate_alloc : VMW_BALLOON_NOSLEEP_ALLOC_MAX;
> +	rate = b->slow_allocation_cycles ? b->rate_alloc : -1;
>  
>  	pr_debug("%s - goal: %d, no-sleep rate: %d, sleep rate: %d\n",
>  		 __func__, b->target - b->size, rate, b->rate_alloc);
>  
> -	while (b->size < b->target && num_pages < b->target - b->size) {
> +	while (!b->reset_required &&
> +		b->size < b->target && num_pages < b->target - b->size) {
>  		struct page *page;
>  
>  		if (flags == VMW_PAGE_ALLOC_NOSLEEP)
> @@ -798,12 +776,9 @@ static void vmballoon_inflate(struct vmballoon *b)
>  				break;
>  		}
>  
> -		if (++allocations > VMW_BALLOON_YIELD_THRESHOLD) {
> -			cond_resched();
> -			allocations = 0;
> -		}
> +		cond_resched();
>  
> -		if (allocations >= rate) {
> +		if (rate != -1 && allocations >= rate) {
>  			/* We allocated enough pages, let's take a break. */

Why don't you make the rate UINT_MAX when doing fast allocations? Then
you would not need to treat -1 as a special case.

>  			break;
>  		}
> @@ -837,36 +812,29 @@ static void vmballoon_deflate(struct vmballoon *b)
>  	unsigned int num_pages = 0;
>  	int error;
>  
> -	pr_debug("%s - size: %d, target %d, rate: %d\n", __func__, b->size,
> -						b->target, b->rate_free);
> +	pr_debug("%s - size: %d, target %d\n", __func__, b->size, b->target);
>  
>  	/* free pages to reach target */
>  	list_for_each_entry_safe(page, next, &b->pages, lru) {
>  		list_del(&page->lru);
>  		b->ops->add_page(b, num_pages++, page);
>  
> +

Seems like an unintended whitespace addition.

>  		if (num_pages == b->batch_max_pages) {
>  			error = b->ops->unlock(b, num_pages, &b->target);
>  			num_pages = 0;
> -			if (error) {
> -				/* quickly decrease rate in case of error */
> -				b->rate_free = max(b->rate_free / 2,
> -						VMW_BALLOON_RATE_FREE_MIN);
> +			if (error)
>  				return;
> -			}
>  		}
>  
> -		if (++i >= b->size - b->target)
> +		if (b->reset_required || ++i >= b->size - b->target)
>  			break;
> +
> +		cond_resched();
>  	}
>  
>  	if (num_pages > 0)
>  		b->ops->unlock(b, num_pages, &b->target);
> -
> -	/* slowly increase rate if there were no errors */
> -	if (error == 0)
> -		b->rate_free = min(b->rate_free + VMW_BALLOON_RATE_FREE_INC,
> -				   VMW_BALLOON_RATE_FREE_MAX);
>  }
>  
>  static const struct vmballoon_ops vmballoon_basic_ops = {
> @@ -992,11 +960,8 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
>  
>  	/* format rate info */
>  	seq_printf(f,
> -		   "rateNoSleepAlloc:   %8d pages/sec\n"
> -		   "rateSleepAlloc:     %8d pages/sec\n"
> -		   "rateFree:           %8d pages/sec\n",
> -		   VMW_BALLOON_NOSLEEP_ALLOC_MAX,
> -		   b->rate_alloc, b->rate_free);
> +		   "rateSleepAlloc:     %8d pages/sec\n",
> +		   b->rate_alloc);
>  
>  	seq_printf(f,
>  		   "\n"
> @@ -1087,7 +1052,6 @@ static int __init vmballoon_init(void)
>  
>  	/* initialize rates */
>  	balloon.rate_alloc = VMW_BALLOON_RATE_ALLOC_MAX;
> -	balloon.rate_free = VMW_BALLOON_RATE_FREE_MAX;
>  
>  	INIT_DELAYED_WORK(&balloon.dwork, vmballoon_work);

Thanks.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 6/9] VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode.
  2015-04-16 20:55   ` Dmitry Torokhov
@ 2015-06-11 20:10     ` Philip Moltmann
  2015-06-12 11:20       ` dmitry.torokhov
  0 siblings, 1 reply; 61+ messages in thread
From: Philip Moltmann @ 2015-06-11 20:10 UTC (permalink / raw)
  To: dmitry.torokhov; +Cc: linux-kernel, pv-drivers, Xavier Deguillard, gregkh, akpm


Hi,

sorry for taking so long to address your concerns.

> What happens if you run this new driver on an older hypervisor that 
> does not support batched operations?

When the driver starts, or when it gets reset, it checks the hypervisor's
capabilities in vmballoon_send_start. Then it resets its state and only
uses the available functionality.

A reset happens any time the VM gets hot-migrated, snapshotted, resumed,
etc.

I tested this driver on various versions of ESXi to have a full set of
possible capabilities.
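As a rough userspace sketch (the capability bit names come from the patch series; the `vmballoon_negotiate` helper is hypothetical), the negotiation described above boils down to intersecting the driver's advertised capability mask with whatever the hypervisor reports back from the START command:

```c
#include <assert.h>

/* Capability bits as defined in the patch; bit 0 is reserved. */
#define VMW_BALLOON_BASIC_CMDS   (1u << 1)
#define VMW_BALLOON_BATCHED_CMDS (1u << 2)

/* Everything this driver build knows how to do. */
#define VMW_BALLOON_CAPABILITIES \
	(VMW_BALLOON_BASIC_CMDS | VMW_BALLOON_BATCHED_CMDS)

/*
 * Hypothetical helper: at start/reset time the driver sends its mask to
 * the hypervisor and afterwards uses only the bits both sides support.
 */
static unsigned int vmballoon_negotiate(unsigned int host_caps)
{
	return VMW_BALLOON_CAPABILITIES & host_caps;
}
```

On an older hypervisor that only reports VMW_BALLOON_BASIC_CMDS, the batched commands simply never get used.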


> > -           if (allocations >= rate) {
> > +           if (rate != -1 && allocations >= rate) {
> >                     /* We allocated enough pages, let's take a 
> > break. */
> 
> Why don't you make the rate UINT_MAX when doing fast allocations? Then
> you would not need to treat -1 as a special case.

Good suggestion.
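A minimal sketch of the suggested change (helper names are hypothetical; only the UINT_MAX idea comes from the thread): with UINT_MAX as the "unlimited" rate, the allocation counter can never reach it, so the loop needs no -1 special case:

```c
#include <assert.h>
#include <limits.h>

/* Pick the per-cycle allocation rate: the real limit while the guest is
 * under memory pressure, otherwise "unlimited" expressed as UINT_MAX. */
static unsigned int pick_rate(unsigned int slow_allocation_cycles,
			      unsigned int rate_alloc)
{
	return slow_allocation_cycles ? rate_alloc : UINT_MAX;
}

/* The loop's break test needs no "rate != -1" special case anymore. */
static int should_break(unsigned int allocations, unsigned int rate)
{
	return allocations >= rate;
}
```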


> >     /* free pages to reach target */
> >     list_for_each_entry_safe(page, next, &b->pages, lru) {
> >             list_del(&page->lru);
> >             b->ops->add_page(b, num_pages++, page);
> >  
> > +
> 
> Seems like an unintended whitespace addition.

I could not find this in the code.

I will send out a new patch series addressing your concerns.

Thanks
Philip

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 6/9] VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode.
  2015-06-11 20:10     ` Philip Moltmann
@ 2015-06-12 11:20       ` dmitry.torokhov
  2015-06-12 15:06         ` Philip Moltmann
  0 siblings, 1 reply; 61+ messages in thread
From: dmitry.torokhov @ 2015-06-12 11:20 UTC (permalink / raw)
  To: Philip Moltmann; +Cc: linux-kernel, pv-drivers, Xavier Deguillard, gregkh, akpm

Hi Philip,

On Thu, Jun 11, 2015 at 08:10:07PM +0000, Philip Moltmann wrote:
> Hi,
> 
> sorry for taking so long to address your concerns.
> 
> > What happens if you run this new driver on an older hypervisor that 
> > does not support batched operations?
> 
> When the driver starts, or when it gets reset, it checks the hypervisor's
> capabilities in vmballoon_send_start. Then it resets its state and only
> uses the available functionality.
> 
> A reset happens any time the VM gets hot-migrated, snapshotted, resumed,
> etc.
> 
> I tested this driver on various versions of ESXi to have a full set of
> possible capabilities.

I understand that you negotiate the capabilities between hypervisor and
the balloon driver, however that was not my concern (and I am sorry that
I did not express it properly).

The patch description stated:

"Before this patch the slow memory transfer would cause the destination
VM to have internal swapping until all memory is transferred. Now the
memory is transferred fast enough so that the destination VM does not
swap."

As far as I understand, the improvements in memory transfer speed hinge
on the availability of batched operations; you, however, remove the limits
on non-sleep allocations unconditionally. Thus my question: on older
ESXi versions that do not support batched operations, won't this cause the
VM to start swapping?

Thanks.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 6/9] VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode.
  2015-06-12 11:20       ` dmitry.torokhov
@ 2015-06-12 15:06         ` Philip Moltmann
  2015-06-12 15:31           ` dmitry.torokhov
  0 siblings, 1 reply; 61+ messages in thread
From: Philip Moltmann @ 2015-06-12 15:06 UTC (permalink / raw)
  To: dmitry.torokhov; +Cc: linux-kernel, pv-drivers, Xavier Deguillard, gregkh, akpm


Hi,

thanks for taking so much interest in this driver. It is quite good
that our design choices get scrutinized by non-current VMware
employees.


> I understand that you negotiate the capabilities between hypervisor 
> and
> the balloon driver, however that was not my concern (and I am sorry 
> that
> I did not express it properly).
> 
> The patch description stated:
> 
> "Before this patch the slow memory transfer would cause the 
> destination
> VM to have internal swapping until all memory is transferred. Now the
> memory is transferred fast enough so that the destination VM does not
> swap."
> 
> As far as I understand, the improvements in memory transfer speed hinge
> on the availability of batched operations; you, however, remove the
> limits on non-sleep allocations unconditionally. Thus my question: on
> older ESXi versions that do not support batched operations, won't this
> cause the VM to start swapping?

Three improvements contribute to the overall faster speed:
- batched operations reduce the hypervisor overhead per page
- 2 MB instead of 4 KB pages reduces the hypervisor overhead per page
- removing the rate-limiting for non-sleep allocations allows the guest
operating system to reclaim memory as fast as it can instead of
artificially limiting it.

Any of these improvements is great by itself and helps a lot. The
combination of all three makes a rather dramatic difference.

We cause hypervisor-level swapping if the balloon driver does not
reclaim fast enough. As any of these improvements increases reclamation
speed, we reduce swapping risk in any case.

Unfortunately the first two improvements rely on hypervisor support,
the last does not.
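To put rough numbers on the first improvement (the 512-entry figure comes from the series itself; the helper below is an illustration, not driver code): a 4 KB batch page holds 512 eight-byte entries, so one hypervisor call can lock or unlock up to 512 pages at once:

```c
#include <assert.h>

/* One 4096-byte batch page holds 512 eight-byte page entries. */
#define VMW_BALLOON_BATCH_MAX_PAGES 512u

/* How many hypervisor calls it takes to hand over npages. */
static unsigned int hypercalls_needed(unsigned int npages, int batched)
{
	if (!batched)
		return npages;	/* one LOCK/UNLOCK call per page */
	/* Round up to whole batches. */
	return (npages + VMW_BALLOON_BATCH_MAX_PAGES - 1) /
	       VMW_BALLOON_BATCH_MAX_PAGES;
}
```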

Thanks
Philip

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 6/9] VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode.
  2015-06-12 15:06         ` Philip Moltmann
@ 2015-06-12 15:31           ` dmitry.torokhov
  2015-06-12 15:40             ` Philip Moltmann
  0 siblings, 1 reply; 61+ messages in thread
From: dmitry.torokhov @ 2015-06-12 15:31 UTC (permalink / raw)
  To: Philip Moltmann; +Cc: linux-kernel, pv-drivers, Xavier Deguillard, gregkh, akpm

On Fri, Jun 12, 2015 at 03:06:56PM +0000, Philip Moltmann wrote:
> Hi,
> 
> thanks for taking so much interest in this driver. It is quite good
> that our design choices get scrutinized by non-current VMware
> employees.
> 
> 
> > I understand that you negotiate the capabilities between hypervisor 
> > and
> > the balloon driver, however that was not my concern (and I am sorry 
> > that
> > I did not express it properly).
> > 
> > The patch description stated:
> > 
> > "Before this patch the slow memory transfer would cause the 
> > destination
> > VM to have internal swapping until all memory is transferred. Now the
> > memory is transferred fast enough so that the destination VM does not
> > swap."
> > 
> > As far as I understand, the improvements in memory transfer speed
> > hinge on the availability of batched operations; you, however, remove
> > the limits on non-sleep allocations unconditionally. Thus my question:
> > on older ESXi versions that do not support batched operations, won't
> > this cause the VM to start swapping?
> 
> Three improvements contribute to the overall faster speed:
> - batched operations reduce the hypervisor overhead per page
> - 2 MB instead of 4 KB pages reduces the hypervisor overhead per page
> - removing the rate-limiting for non-sleep allocations allows the guest
> operating system to reclaim memory as fast as it can instead of
> artificially limiting it.
> 
> Any of these improvements is great by itself and helps a lot. The
> combination of all three makes a rather dramatic difference.
> 
> We cause hypervisor-level swapping if the balloon driver does not
> reclaim fast enough. As any of these improvements increases reclamation
> speed, we reduce swapping risk in any case.
> 
> Unfortunately the first two improvements rely on hypervisor support,
> the last does not.

As far as I can understand, the justification for removing the limit
(improvement #3) is that we have #1 and #2; at least that's how I read
the patch description. I am saying: what if you are running on a
hypervisor that supports neither #1 nor #2? What was the first release
of ESXi that supports batching and 2M pages? What about Workstation (I
don't recall if it started using ballooning at some point)?

Thanks.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 6/9] VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode.
  2015-06-12 15:31           ` dmitry.torokhov
@ 2015-06-12 15:40             ` Philip Moltmann
  2015-06-12 16:15               ` dmitry.torokhov
  0 siblings, 1 reply; 61+ messages in thread
From: Philip Moltmann @ 2015-06-12 15:40 UTC (permalink / raw)
  To: dmitry.torokhov; +Cc: linux-kernel, pv-drivers, Xavier Deguillard, gregkh, akpm


Hi,
> > 
> > Three improvements contribute to the overall faster speed:
> > - batched operations reduce the hypervisor overhead per page
> > - 2 MB instead of 4 KB pages reduces the hypervisor overhead per page
> > - removing the rate-limiting for non-sleep allocations allows the 
> > guest
> > operating system to reclaim memory as fast as it can instead of
> > artificially limiting it.
> > 
> > Any of these improvements is great by itself and helps a lot. The
> > combination of all three makes a rather dramatic difference.
> > 
> > We cause hypervisor-level swapping if the balloon driver does not
> > reclaim fast enough. As any of these improvements increases 
> > reclamation
> > speed, we reduce swapping risk in any case.
> > 
> > Unfortunately the first two improvements rely on hypervisor 
> > support,
> > the last does not.
> 
> As far as I can understand the justification for removing the limit
> (improvement #3) is that we have #1 and #2, at least that's how I 
> read
> the patch description. I am saying: what if you running on a 
> hypervisor
> that does not support neither #1 nor #2? What was the first release 
> that
> of ESXi supports batching and 2M pages? What about workstation (I 
> don't
> recall if it started using ballooning at some point)?

I see how I caused this confusion. The rate limiting was there to keep
the guest OS from stalling while doing nothing other than ballooning.
With batching, the time spent ballooning is smaller, hence this is
less of a problem when these features are available.

Independent of that, the yielding in the ballooning loop should help
reduce stalling. Also, the hypervisor-level swapping caused by
rate-limited ballooning leads to much worse stalling than the balloon
driver itself does.
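A userspace sketch of that yielding (cond_resched() is stubbed out here; in the driver it is the kernel's voluntary scheduling point, and the loop body is elided): the unthrottled loop yields on every iteration rather than only after a fixed threshold, which is what keeps the guest responsive:

```c
#include <assert.h>

static unsigned int yields;

/* Stub: in the kernel, cond_resched() lets other runnable tasks run. */
static void cond_resched(void)
{
	yields++;
}

/* Simplified reclaim loop: no rate limit, but a yield per page. */
static unsigned int reclaim(unsigned int target)
{
	unsigned int done;

	for (done = 0; done < target; done++) {
		/* ...allocate one page and hand it to the hypervisor... */
		cond_resched();
	}
	return done;
}
```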

Thanks
Philip

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 6/9] VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode.
  2015-06-12 15:40             ` Philip Moltmann
@ 2015-06-12 16:15               ` dmitry.torokhov
  2015-06-12 18:43                 ` [PATCH v3 0/9] Third revision of the performance improvment patch to the VMWare balloon driver Philip P. Moltmann
                                   ` (9 more replies)
  0 siblings, 10 replies; 61+ messages in thread
From: dmitry.torokhov @ 2015-06-12 16:15 UTC (permalink / raw)
  To: Philip Moltmann; +Cc: linux-kernel, pv-drivers, Xavier Deguillard, gregkh, akpm

On Fri, Jun 12, 2015 at 03:40:42PM +0000, Philip Moltmann wrote:
> Hi,
> > > 
> > > Three improvements contribute to the overall faster speed:
> > > - batched operations reduce the hypervisor overhead per page
> > > - 2 MB instead of 4 KB pages reduces the hypervisor overhead per page
> > > - removing the rate-limiting for non-sleep allocations allows the 
> > > guest
> > > operating system to reclaim memory as fast as it can instead of
> > > artificially limiting it.
> > > 
> > > Any of these improvements is great by itself and helps a lot. The
> > > combination of all three makes a rather dramatic difference.
> > > 
> > > We cause hypervisor-level swapping if the balloon driver does not
> > > reclaim fast enough. As any of these improvements increases 
> > > reclamation
> > > speed, we reduce swapping risk in any case.
> > > 
> > > Unfortunately the first two improvements rely on hypervisor 
> > > support,
> > > the last does not.
> > 
> > As far as I can understand, the justification for removing the limit
> > (improvement #3) is that we have #1 and #2; at least that's how I read
> > the patch description. I am saying: what if you are running on a
> > hypervisor that supports neither #1 nor #2? What was the first release
> > of ESXi that supports batching and 2M pages? What about Workstation (I
> > don't recall if it started using ballooning at some point)?
> 
> I see how I caused this confusion. The rate limiting was there to keep
> the guest OS from stalling while doing nothing other than ballooning.
> With batching, the time spent ballooning is smaller, hence this is
> less of a problem when these features are available.
> 
> Independent of that, the yielding in the ballooning loop should help
> reduce stalling. Also, the hypervisor-level swapping caused by
> rate-limited ballooning leads to much worse stalling than the balloon
> driver itself does.

OK, fair enough. Please update the patch description to reflect that
removing the rate limiting is useful on its own and does not require
additional hypervisor changes (although, when present, they improve the
behavior even further).

Thanks.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v3 0/9] Third revision of the performance improvment patch to the VMWare balloon driver
  2015-06-12 16:15               ` dmitry.torokhov
@ 2015-06-12 18:43                 ` Philip P. Moltmann
  2015-06-12 18:43                 ` [PATCH v3 1/9] VMware balloon: partially inline vmballoon_reserve_page Philip P. Moltmann
                                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-06-12 18:43 UTC (permalink / raw)
  To: dmitry.torokhov
  Cc: Philip P. Moltmann, gregkh, linux-kernel, xdeguillard, akpm, pv-drivers

This is the third revision of the patch series for the VMware balloon driver.
The original was sent to linux-kernel@vger.kernel.org on 4/14/15 10:29 am PST.
Please refer to the original change for an overview.

v1:
- Initial implementation
v2:
- Address suggestions by Dmitry Torokhov
  - Use UINT_MAX as "infinite" rate instead of special-casing -1
v3:
- Change commit comment for step 6 to better explain what impact ballooning
  has on VM performance.

Thanks
Philip 

Philip P. Moltmann (5):
  VMware balloon: Show capabilities of balloon and resulting
    capabilities in the debug-fs node.
  VMware balloon: Do not limit the amount of frees and allocations in
    non-sleep mode.
  VMware balloon: Support 2m page ballooning.
  VMware balloon: Treat init like reset
  VMware balloon: Enable notification via VMCI

Xavier Deguillard (4):
  VMware balloon: partially inline vmballoon_reserve_page.
  VMware balloon: Add support for balloon capabilities.
  VMware balloon: add batching to the vmw_balloon.
  VMware balloon: Update balloon target on each lock/unlock.

 drivers/misc/Kconfig       |   2 +-
 drivers/misc/vmw_balloon.c | 954 ++++++++++++++++++++++++++++++++++-----------
 2 files changed, 717 insertions(+), 239 deletions(-)

-- 
2.4.3


^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v3 1/9] VMware balloon: partially inline vmballoon_reserve_page.
  2015-06-12 16:15               ` dmitry.torokhov
  2015-06-12 18:43                 ` [PATCH v3 0/9] Third revision of the performance improvment patch to the VMWare balloon driver Philip P. Moltmann
@ 2015-06-12 18:43                 ` Philip P. Moltmann
  2015-06-12 18:43                 ` [PATCH v3 2/9] VMware balloon: Add support for balloon capabilities Philip P. Moltmann
                                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-06-12 18:43 UTC (permalink / raw)
  To: dmitry.torokhov
  Cc: Xavier Deguillard, gregkh, linux-kernel, akpm, pv-drivers,
	Philip P. Moltmann

From: Xavier Deguillard <xdeguillard@vmware.com>

This splits the function in two: the allocation part is inlined into the
inflate function and the lock part is kept in its own function.

This change is needed in order to be able to allocate more than one page
before doing the hypervisor call.

Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
Acked-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Philip P. Moltmann <moltmann@vmware.com>
Acked-by: Andy King <acking@vmware.com>
---
 drivers/misc/vmw_balloon.c | 98 ++++++++++++++++++++--------------------------
 1 file changed, 42 insertions(+), 56 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 1916174..2799c46 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -46,7 +46,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.2.1.3-k");
+MODULE_VERSION("1.2.2.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -402,55 +402,37 @@ static void vmballoon_reset(struct vmballoon *b)
 }
 
 /*
- * Allocate (or reserve) a page for the balloon and notify the host.  If host
- * refuses the page put it on "refuse" list and allocate another one until host
- * is satisfied. "Refused" pages are released at the end of inflation cycle
- * (when we allocate b->rate_alloc pages).
+ * Notify the host of a ballooned page. If host rejects the page put it on the
+ * refuse list, those refused page are then released at the end of the
+ * inflation cycle.
  */
-static int vmballoon_reserve_page(struct vmballoon *b, bool can_sleep)
+static int vmballoon_lock_page(struct vmballoon *b, struct page *page)
 {
-	struct page *page;
-	gfp_t flags;
-	unsigned int hv_status;
-	int locked;
-	flags = can_sleep ? VMW_PAGE_ALLOC_CANSLEEP : VMW_PAGE_ALLOC_NOSLEEP;
-
-	do {
-		if (!can_sleep)
-			STATS_INC(b->stats.alloc);
-		else
-			STATS_INC(b->stats.sleep_alloc);
-
-		page = alloc_page(flags);
-		if (!page) {
-			if (!can_sleep)
-				STATS_INC(b->stats.alloc_fail);
-			else
-				STATS_INC(b->stats.sleep_alloc_fail);
-			return -ENOMEM;
-		}
+	int locked, hv_status;
 
-		/* inform monitor */
-		locked = vmballoon_send_lock_page(b, page_to_pfn(page), &hv_status);
-		if (locked > 0) {
-			STATS_INC(b->stats.refused_alloc);
+	locked = vmballoon_send_lock_page(b, page_to_pfn(page), &hv_status);
+	if (locked > 0) {
+		STATS_INC(b->stats.refused_alloc);
 
-			if (hv_status == VMW_BALLOON_ERROR_RESET ||
-			    hv_status == VMW_BALLOON_ERROR_PPN_NOTNEEDED) {
-				__free_page(page);
-				return -EIO;
-			}
+		if (hv_status == VMW_BALLOON_ERROR_RESET ||
+				hv_status == VMW_BALLOON_ERROR_PPN_NOTNEEDED) {
+			__free_page(page);
+			return -EIO;
+		}
 
-			/*
-			 * Place page on the list of non-balloonable pages
-			 * and retry allocation, unless we already accumulated
-			 * too many of them, in which case take a breather.
-			 */
+		/*
+		 * Place page on the list of non-balloonable pages
+		 * and retry allocation, unless we already accumulated
+		 * too many of them, in which case take a breather.
+		 */
+		if (b->n_refused_pages < VMW_BALLOON_MAX_REFUSED) {
+			b->n_refused_pages++;
 			list_add(&page->lru, &b->refused_pages);
-			if (++b->n_refused_pages >= VMW_BALLOON_MAX_REFUSED)
-				return -EIO;
+		} else {
+			__free_page(page);
 		}
-	} while (locked != 0);
+		return -EIO;
+	}
 
 	/* track allocated page */
 	list_add(&page->lru, &b->pages);
@@ -512,7 +494,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 	unsigned int i;
 	unsigned int allocations = 0;
 	int error = 0;
-	bool alloc_can_sleep = false;
+	gfp_t flags = VMW_PAGE_ALLOC_NOSLEEP;
 
 	pr_debug("%s - size: %d, target %d\n", __func__, b->size, b->target);
 
@@ -543,19 +525,16 @@ static void vmballoon_inflate(struct vmballoon *b)
 		 __func__, goal, rate, b->rate_alloc);
 
 	for (i = 0; i < goal; i++) {
+		struct page *page;
 
-		error = vmballoon_reserve_page(b, alloc_can_sleep);
-		if (error) {
-			if (error != -ENOMEM) {
-				/*
-				 * Not a page allocation failure, stop this
-				 * cycle. Maybe we'll get new target from
-				 * the host soon.
-				 */
-				break;
-			}
+		if (flags == VMW_PAGE_ALLOC_NOSLEEP)
+			STATS_INC(b->stats.alloc);
+		else
+			STATS_INC(b->stats.sleep_alloc);
 
-			if (alloc_can_sleep) {
+		page = alloc_page(flags);
+		if (!page) {
+			if (flags == VMW_PAGE_ALLOC_CANSLEEP) {
 				/*
 				 * CANSLEEP page allocation failed, so guest
 				 * is under severe memory pressure. Quickly
@@ -563,8 +542,10 @@ static void vmballoon_inflate(struct vmballoon *b)
 				 */
 				b->rate_alloc = max(b->rate_alloc / 2,
 						    VMW_BALLOON_RATE_ALLOC_MIN);
+				STATS_INC(b->stats.sleep_alloc_fail);
 				break;
 			}
+			STATS_INC(b->stats.alloc_fail);
 
 			/*
 			 * NOSLEEP page allocation failed, so the guest is
@@ -579,11 +560,16 @@ static void vmballoon_inflate(struct vmballoon *b)
 			if (i >= b->rate_alloc)
 				break;
 
-			alloc_can_sleep = true;
+			flags = VMW_PAGE_ALLOC_CANSLEEP;
 			/* Lower rate for sleeping allocations. */
 			rate = b->rate_alloc;
+			continue;
 		}
 
+		error = vmballoon_lock_page(b, page);
+		if (error)
+			break;
+
 		if (++allocations > VMW_BALLOON_YIELD_THRESHOLD) {
 			cond_resched();
 			allocations = 0;
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v3 2/9] VMware balloon: Add support for balloon capabilities.
  2015-06-12 16:15               ` dmitry.torokhov
  2015-06-12 18:43                 ` [PATCH v3 0/9] Third revision of the performance improvment patch to the VMWare balloon driver Philip P. Moltmann
  2015-06-12 18:43                 ` [PATCH v3 1/9] VMware balloon: partially inline vmballoon_reserve_page Philip P. Moltmann
@ 2015-06-12 18:43                 ` Philip P. Moltmann
  2015-06-12 18:43                 ` [PATCH v3 3/9] VMware balloon: add batching to the vmw_balloon Philip P. Moltmann
                                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-06-12 18:43 UTC (permalink / raw)
  To: dmitry.torokhov
  Cc: Xavier Deguillard, gregkh, linux-kernel, akpm, pv-drivers,
	Philip P. Moltmann

From: Xavier Deguillard <xdeguillard@vmware.com>

In order to extend the balloon protocol, the hypervisor and the guest
driver need to agree on a set of supported functionality to use.

Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
Acked-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Philip P. Moltmann <moltmann@vmware.com>
Acked-by: Andy King <acking@vmware.com>
---
 drivers/misc/vmw_balloon.c | 74 +++++++++++++++++++++++++++-------------------
 1 file changed, 44 insertions(+), 30 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 2799c46..ffb5634 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -46,7 +46,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.2.2.0-k");
+MODULE_VERSION("1.3.0.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -110,9 +110,18 @@ MODULE_LICENSE("GPL");
  */
 #define VMW_BALLOON_HV_PORT		0x5670
 #define VMW_BALLOON_HV_MAGIC		0x456c6d6f
-#define VMW_BALLOON_PROTOCOL_VERSION	2
 #define VMW_BALLOON_GUEST_ID		1	/* Linux */
 
+enum vmwballoon_capabilities {
+	/*
+	 * Bit 0 is reserved and not associated to any capability.
+	 */
+	VMW_BALLOON_BASIC_CMDS		= (1 << 1),
+	VMW_BALLOON_BATCHED_CMDS	= (1 << 2)
+};
+
+#define VMW_BALLOON_CAPABILITIES	(VMW_BALLOON_BASIC_CMDS)
+
 #define VMW_BALLOON_CMD_START		0
 #define VMW_BALLOON_CMD_GET_TARGET	1
 #define VMW_BALLOON_CMD_LOCK		2
@@ -120,32 +129,36 @@ MODULE_LICENSE("GPL");
 #define VMW_BALLOON_CMD_GUEST_ID	4
 
 /* error codes */
-#define VMW_BALLOON_SUCCESS		0
-#define VMW_BALLOON_FAILURE		-1
-#define VMW_BALLOON_ERROR_CMD_INVALID	1
-#define VMW_BALLOON_ERROR_PPN_INVALID	2
-#define VMW_BALLOON_ERROR_PPN_LOCKED	3
-#define VMW_BALLOON_ERROR_PPN_UNLOCKED	4
-#define VMW_BALLOON_ERROR_PPN_PINNED	5
-#define VMW_BALLOON_ERROR_PPN_NOTNEEDED	6
-#define VMW_BALLOON_ERROR_RESET		7
-#define VMW_BALLOON_ERROR_BUSY		8
-
-#define VMWARE_BALLOON_CMD(cmd, data, result)		\
-({							\
-	unsigned long __stat, __dummy1, __dummy2;	\
-	__asm__ __volatile__ ("inl %%dx" :		\
-		"=a"(__stat),				\
-		"=c"(__dummy1),				\
-		"=d"(__dummy2),				\
-		"=b"(result) :				\
-		"0"(VMW_BALLOON_HV_MAGIC),		\
-		"1"(VMW_BALLOON_CMD_##cmd),		\
-		"2"(VMW_BALLOON_HV_PORT),		\
-		"3"(data) :				\
-		"memory");				\
-	result &= -1UL;					\
-	__stat & -1UL;					\
+#define VMW_BALLOON_SUCCESS		        0
+#define VMW_BALLOON_FAILURE		        -1
+#define VMW_BALLOON_ERROR_CMD_INVALID	        1
+#define VMW_BALLOON_ERROR_PPN_INVALID	        2
+#define VMW_BALLOON_ERROR_PPN_LOCKED	        3
+#define VMW_BALLOON_ERROR_PPN_UNLOCKED	        4
+#define VMW_BALLOON_ERROR_PPN_PINNED	        5
+#define VMW_BALLOON_ERROR_PPN_NOTNEEDED	        6
+#define VMW_BALLOON_ERROR_RESET		        7
+#define VMW_BALLOON_ERROR_BUSY		        8
+
+#define VMW_BALLOON_SUCCESS_WITH_CAPABILITIES	(0x03000000)
+
+#define VMWARE_BALLOON_CMD(cmd, data, result)			\
+({								\
+	unsigned long __status, __dummy1, __dummy2;		\
+	__asm__ __volatile__ ("inl %%dx" :			\
+		"=a"(__status),					\
+		"=c"(__dummy1),					\
+		"=d"(__dummy2),					\
+		"=b"(result) :					\
+		"0"(VMW_BALLOON_HV_MAGIC),			\
+		"1"(VMW_BALLOON_CMD_##cmd),			\
+		"2"(VMW_BALLOON_HV_PORT),			\
+		"3"(data) :					\
+		"memory");					\
+	if (VMW_BALLOON_CMD_##cmd == VMW_BALLOON_CMD_START)	\
+		result = __dummy1;				\
+	result &= -1UL;						\
+	__status & -1UL;					\
 })
 
 #ifdef CONFIG_DEBUG_FS
@@ -223,11 +236,12 @@ static struct vmballoon balloon;
  */
 static bool vmballoon_send_start(struct vmballoon *b)
 {
-	unsigned long status, dummy;
+	unsigned long status, capabilities;
 
 	STATS_INC(b->stats.start);
 
-	status = VMWARE_BALLOON_CMD(START, VMW_BALLOON_PROTOCOL_VERSION, dummy);
+	status = VMWARE_BALLOON_CMD(START, VMW_BALLOON_CAPABILITIES,
+				capabilities);
 	if (status == VMW_BALLOON_SUCCESS)
 		return true;
 
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v3 3/9] VMware balloon: add batching to the vmw_balloon.
  2015-06-12 16:15               ` dmitry.torokhov
                                   ` (2 preceding siblings ...)
  2015-06-12 18:43                 ` [PATCH v3 2/9] VMware balloon: Add support for balloon capabilities Philip P. Moltmann
@ 2015-06-12 18:43                 ` Philip P. Moltmann
  2015-08-05 20:19                   ` Greg KH
  2015-06-12 18:43                 ` [PATCH v3 4/9] VMware balloon: Update balloon target on each lock/unlock Philip P. Moltmann
                                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 61+ messages in thread
From: Philip P. Moltmann @ 2015-06-12 18:43 UTC (permalink / raw)
  To: dmitry.torokhov
  Cc: Xavier Deguillard, gregkh, linux-kernel, akpm, pv-drivers,
	Philip P. Moltmann

From: Xavier Deguillard <xdeguillard@vmware.com>

Introduce a new capability to the driver that allows sending 512 pages in
one hypervisor call. This reduces the cost of the driver when reclaiming
memory.

Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
Acked-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Philip P. Moltmann <moltmann@vmware.com>
---
 drivers/misc/vmw_balloon.c | 405 +++++++++++++++++++++++++++++++++++++++------
 1 file changed, 352 insertions(+), 53 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index ffb5634..f65c676 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -1,7 +1,7 @@
 /*
  * VMware Balloon driver.
  *
- * Copyright (C) 2000-2010, VMware, Inc. All Rights Reserved.
+ * Copyright (C) 2000-2013, VMware, Inc. All Rights Reserved.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms of the GNU General Public License as published by the
@@ -46,7 +46,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.3.0.0-k");
+MODULE_VERSION("1.3.1.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -120,13 +120,26 @@ enum vmwballoon_capabilities {
 	VMW_BALLOON_BATCHED_CMDS	= (1 << 2)
 };
 
-#define VMW_BALLOON_CAPABILITIES	(VMW_BALLOON_BASIC_CMDS)
+#define VMW_BALLOON_CAPABILITIES	(VMW_BALLOON_BASIC_CMDS \
+					| VMW_BALLOON_BATCHED_CMDS)
 
+/*
+ * Backdoor commands availability:
+ *
+ * START, GET_TARGET and GUEST_ID are always available,
+ *
+ * VMW_BALLOON_BASIC_CMDS:
+ *	LOCK and UNLOCK commands,
+ * VMW_BALLOON_BATCHED_CMDS:
+ *	BATCHED_LOCK and BATCHED_UNLOCK commands.
+ */
 #define VMW_BALLOON_CMD_START		0
 #define VMW_BALLOON_CMD_GET_TARGET	1
 #define VMW_BALLOON_CMD_LOCK		2
 #define VMW_BALLOON_CMD_UNLOCK		3
 #define VMW_BALLOON_CMD_GUEST_ID	4
+#define VMW_BALLOON_CMD_BATCHED_LOCK	6
+#define VMW_BALLOON_CMD_BATCHED_UNLOCK	7
 
 /* error codes */
 #define VMW_BALLOON_SUCCESS		        0
@@ -142,18 +155,63 @@ enum vmwballoon_capabilities {
 
 #define VMW_BALLOON_SUCCESS_WITH_CAPABILITIES	(0x03000000)
 
-#define VMWARE_BALLOON_CMD(cmd, data, result)			\
+/* Batch page description */
+
+/*
+ * Layout of a page in the batch page:
+ *
+ * +-------------+----------+--------+
+ * |             |          |        |
+ * | Page number | Reserved | Status |
+ * |             |          |        |
+ * +-------------+----------+--------+
+ * 64  PAGE_SHIFT          6         0
+ *
+ * For now only 4K pages are supported, but we can easily support large pages
+ * by using bits in the reserved field.
+ *
+ * The reserved field should be set to 0.
+ */
+#define VMW_BALLOON_BATCH_MAX_PAGES	(PAGE_SIZE / sizeof(u64))
+#define VMW_BALLOON_BATCH_STATUS_MASK	((1UL << 5) - 1)
+#define VMW_BALLOON_BATCH_PAGE_MASK	(~((1UL << PAGE_SHIFT) - 1))
+
+struct vmballoon_batch_page {
+	u64 pages[VMW_BALLOON_BATCH_MAX_PAGES];
+};
+
+static u64 vmballoon_batch_get_pa(struct vmballoon_batch_page *batch, int idx)
+{
+	return batch->pages[idx] & VMW_BALLOON_BATCH_PAGE_MASK;
+}
+
+static int vmballoon_batch_get_status(struct vmballoon_batch_page *batch,
+				int idx)
+{
+	return (int)(batch->pages[idx] & VMW_BALLOON_BATCH_STATUS_MASK);
+}
+
+static void vmballoon_batch_set_pa(struct vmballoon_batch_page *batch, int idx,
+				u64 pa)
+{
+	batch->pages[idx] = pa;
+}
+
+
+#define VMWARE_BALLOON_CMD(cmd, arg1, arg2, result)		\
 ({								\
-	unsigned long __status, __dummy1, __dummy2;		\
+	unsigned long __status, __dummy1, __dummy2, __dummy3;	\
 	__asm__ __volatile__ ("inl %%dx" :			\
 		"=a"(__status),					\
 		"=c"(__dummy1),					\
 		"=d"(__dummy2),					\
-		"=b"(result) :					\
+		"=b"(result),					\
+		"=S" (__dummy3) :				\
 		"0"(VMW_BALLOON_HV_MAGIC),			\
 		"1"(VMW_BALLOON_CMD_##cmd),			\
 		"2"(VMW_BALLOON_HV_PORT),			\
-		"3"(data) :					\
+		"3"(arg1),					\
+		"4" (arg2) :					\
 		"memory");					\
 	if (VMW_BALLOON_CMD_##cmd == VMW_BALLOON_CMD_START)	\
 		result = __dummy1;				\
@@ -192,6 +250,14 @@ struct vmballoon_stats {
 #define STATS_INC(stat)
 #endif
 
+struct vmballoon;
+
+struct vmballoon_ops {
+	void (*add_page)(struct vmballoon *b, int idx, struct page *p);
+	int (*lock)(struct vmballoon *b, unsigned int num_pages);
+	int (*unlock)(struct vmballoon *b, unsigned int num_pages);
+};
+
 struct vmballoon {
 
 	/* list of reserved physical pages */
@@ -215,6 +281,14 @@ struct vmballoon {
 	/* slowdown page allocations for next few cycles */
 	unsigned int slow_allocation_cycles;
 
+	unsigned long capabilities;
+
+	struct vmballoon_batch_page *batch_page;
+	unsigned int batch_max_pages;
+	struct page *page;
+
+	const struct vmballoon_ops *ops;
+
 #ifdef CONFIG_DEBUG_FS
 	/* statistics */
 	struct vmballoon_stats stats;
@@ -234,16 +308,22 @@ static struct vmballoon balloon;
  * Send "start" command to the host, communicating supported version
  * of the protocol.
  */
-static bool vmballoon_send_start(struct vmballoon *b)
+static bool vmballoon_send_start(struct vmballoon *b, unsigned long req_caps)
 {
-	unsigned long status, capabilities;
+	unsigned long status, capabilities, dummy = 0;
 
 	STATS_INC(b->stats.start);
 
-	status = VMWARE_BALLOON_CMD(START, VMW_BALLOON_CAPABILITIES,
-				capabilities);
-	if (status == VMW_BALLOON_SUCCESS)
+	status = VMWARE_BALLOON_CMD(START, req_caps, dummy, capabilities);
+
+	switch (status) {
+	case VMW_BALLOON_SUCCESS_WITH_CAPABILITIES:
+		b->capabilities = capabilities;
+		return true;
+	case VMW_BALLOON_SUCCESS:
+		b->capabilities = VMW_BALLOON_BASIC_CMDS;
 		return true;
+	}
 
 	pr_debug("%s - failed, hv returns %ld\n", __func__, status);
 	STATS_INC(b->stats.start_fail);
@@ -273,9 +353,10 @@ static bool vmballoon_check_status(struct vmballoon *b, unsigned long status)
  */
 static bool vmballoon_send_guest_id(struct vmballoon *b)
 {
-	unsigned long status, dummy;
+	unsigned long status, dummy = 0;
 
-	status = VMWARE_BALLOON_CMD(GUEST_ID, VMW_BALLOON_GUEST_ID, dummy);
+	status = VMWARE_BALLOON_CMD(GUEST_ID, VMW_BALLOON_GUEST_ID, dummy,
+				dummy);
 
 	STATS_INC(b->stats.guest_type);
 
@@ -295,6 +376,7 @@ static bool vmballoon_send_get_target(struct vmballoon *b, u32 *new_target)
 	unsigned long status;
 	unsigned long target;
 	unsigned long limit;
+	unsigned long dummy = 0;
 	u32 limit32;
 
 	/*
@@ -313,7 +395,7 @@ static bool vmballoon_send_get_target(struct vmballoon *b, u32 *new_target)
 	/* update stats */
 	STATS_INC(b->stats.target);
 
-	status = VMWARE_BALLOON_CMD(GET_TARGET, limit, target);
+	status = VMWARE_BALLOON_CMD(GET_TARGET, limit, dummy, target);
 	if (vmballoon_check_status(b, status)) {
 		*new_target = target;
 		return true;
@@ -332,7 +414,7 @@ static bool vmballoon_send_get_target(struct vmballoon *b, u32 *new_target)
 static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 				     unsigned int *hv_status)
 {
-	unsigned long status, dummy;
+	unsigned long status, dummy = 0;
 	u32 pfn32;
 
 	pfn32 = (u32)pfn;
@@ -341,7 +423,7 @@ static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 
 	STATS_INC(b->stats.lock);
 
-	*hv_status = status = VMWARE_BALLOON_CMD(LOCK, pfn, dummy);
+	*hv_status = status = VMWARE_BALLOON_CMD(LOCK, pfn, dummy, dummy);
 	if (vmballoon_check_status(b, status))
 		return 0;
 
@@ -350,13 +432,30 @@ static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 	return 1;
 }
 
+static int vmballoon_send_batched_lock(struct vmballoon *b,
+					unsigned int num_pages)
+{
+	unsigned long status, dummy;
+	unsigned long pfn = page_to_pfn(b->page);
+
+	STATS_INC(b->stats.lock);
+
+	status = VMWARE_BALLOON_CMD(BATCHED_LOCK, pfn, num_pages, dummy);
+	if (vmballoon_check_status(b, status))
+		return 0;
+
+	pr_debug("%s - batch ppn %lx, hv returns %ld\n", __func__, pfn, status);
+	STATS_INC(b->stats.lock_fail);
+	return 1;
+}
+
 /*
  * Notify the host that guest intends to release given page back into
  * the pool of available (to the guest) pages.
  */
 static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
 {
-	unsigned long status, dummy;
+	unsigned long status, dummy = 0;
 	u32 pfn32;
 
 	pfn32 = (u32)pfn;
@@ -365,7 +464,7 @@ static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
 
 	STATS_INC(b->stats.unlock);
 
-	status = VMWARE_BALLOON_CMD(UNLOCK, pfn, dummy);
+	status = VMWARE_BALLOON_CMD(UNLOCK, pfn, dummy, dummy);
 	if (vmballoon_check_status(b, status))
 		return true;
 
@@ -374,6 +473,23 @@ static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
 	return false;
 }
 
+static bool vmballoon_send_batched_unlock(struct vmballoon *b,
+						unsigned int num_pages)
+{
+	unsigned long status, dummy;
+	unsigned long pfn = page_to_pfn(b->page);
+
+	STATS_INC(b->stats.unlock);
+
+	status = VMWARE_BALLOON_CMD(BATCHED_UNLOCK, pfn, num_pages, dummy);
+	if (vmballoon_check_status(b, status))
+		return true;
+
+	pr_debug("%s - batch ppn %lx, hv returns %ld\n", __func__, pfn, status);
+	STATS_INC(b->stats.unlock_fail);
+	return false;
+}
+
 /*
  * Quickly release all pages allocated for the balloon. This function is
  * called when host decides to "reset" balloon for one reason or another.
@@ -396,22 +512,13 @@ static void vmballoon_pop(struct vmballoon *b)
 			cond_resched();
 		}
 	}
-}
 
-/*
- * Perform standard reset sequence by popping the balloon (in case it
- * is not  empty) and then restarting protocol. This operation normally
- * happens when host responds with VMW_BALLOON_ERROR_RESET to a command.
- */
-static void vmballoon_reset(struct vmballoon *b)
-{
-	/* free all pages, skipping monitor unlock */
-	vmballoon_pop(b);
+	if ((b->capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
+		if (b->batch_page)
+			vunmap(b->batch_page);
 
-	if (vmballoon_send_start(b)) {
-		b->reset_required = false;
-		if (!vmballoon_send_guest_id(b))
-			pr_err("failed to send guest ID to the host\n");
+		if (b->page)
+			__free_page(b->page);
 	}
 }
 
@@ -420,9 +527,10 @@ static void vmballoon_reset(struct vmballoon *b)
  * refuse list, those refused page are then released at the end of the
  * inflation cycle.
  */
-static int vmballoon_lock_page(struct vmballoon *b, struct page *page)
+static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages)
 {
 	int locked, hv_status;
+	struct page *page = b->page;
 
 	locked = vmballoon_send_lock_page(b, page_to_pfn(page), &hv_status);
 	if (locked > 0) {
@@ -457,17 +565,68 @@ static int vmballoon_lock_page(struct vmballoon *b, struct page *page)
 	return 0;
 }
 
+static int vmballoon_lock_batched_page(struct vmballoon *b,
+				unsigned int num_pages)
+{
+	int locked, i;
+
+	locked = vmballoon_send_batched_lock(b, num_pages);
+	if (locked > 0) {
+		for (i = 0; i < num_pages; i++) {
+			u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
+			struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
+
+			__free_page(p);
+		}
+
+		return -EIO;
+	}
+
+	for (i = 0; i < num_pages; i++) {
+		u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
+		struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
+
+		locked = vmballoon_batch_get_status(b->batch_page, i);
+
+		switch (locked) {
+		case VMW_BALLOON_SUCCESS:
+			list_add(&p->lru, &b->pages);
+			b->size++;
+			break;
+		case VMW_BALLOON_ERROR_PPN_PINNED:
+		case VMW_BALLOON_ERROR_PPN_INVALID:
+			if (b->n_refused_pages < VMW_BALLOON_MAX_REFUSED) {
+				list_add(&p->lru, &b->refused_pages);
+				b->n_refused_pages++;
+				break;
+			}
+			/* Fallthrough */
+		case VMW_BALLOON_ERROR_RESET:
+		case VMW_BALLOON_ERROR_PPN_NOTNEEDED:
+			__free_page(p);
+			break;
+		default:
+			/* This should never happen */
+			WARN_ON_ONCE(true);
+		}
+	}
+
+	return 0;
+}
+
 /*
  * Release the page allocated for the balloon. Note that we first notify
  * the host so it can make sure the page will be available for the guest
  * to use, if needed.
  */
-static int vmballoon_release_page(struct vmballoon *b, struct page *page)
+static int vmballoon_unlock_page(struct vmballoon *b, unsigned int num_pages)
 {
-	if (!vmballoon_send_unlock_page(b, page_to_pfn(page)))
-		return -EIO;
+	struct page *page = b->page;
 
-	list_del(&page->lru);
+	if (!vmballoon_send_unlock_page(b, page_to_pfn(page))) {
+		list_add(&page->lru, &b->pages);
+		return -EIO;
+	}
 
 	/* deallocate page */
 	__free_page(page);
@@ -479,6 +638,41 @@ static int vmballoon_release_page(struct vmballoon *b, struct page *page)
 	return 0;
 }
 
+static int vmballoon_unlock_batched_page(struct vmballoon *b,
+					unsigned int num_pages)
+{
+	int locked, i, ret = 0;
+	bool hv_success;
+
+	hv_success = vmballoon_send_batched_unlock(b, num_pages);
+	if (!hv_success)
+		ret = -EIO;
+
+	for (i = 0; i < num_pages; i++) {
+		u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
+		struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
+
+		locked = vmballoon_batch_get_status(b->batch_page, i);
+		if (!hv_success || locked != VMW_BALLOON_SUCCESS) {
+			/*
+			 * That page wasn't successfully unlocked by the
+			 * hypervisor, re-add it to the list of pages owned by
+			 * the balloon driver.
+			 */
+			list_add(&p->lru, &b->pages);
+		} else {
+			/* deallocate page */
+			__free_page(p);
+			STATS_INC(b->stats.free);
+
+			/* update balloon size */
+			b->size--;
+		}
+	}
+
+	return ret;
+}
+
 /*
  * Release pages that were allocated while attempting to inflate the
  * balloon but were refused by the host for one reason or another.
@@ -496,6 +690,18 @@ static void vmballoon_release_refused_pages(struct vmballoon *b)
 	b->n_refused_pages = 0;
 }
 
+static void vmballoon_add_page(struct vmballoon *b, int idx, struct page *p)
+{
+	b->page = p;
+}
+
+static void vmballoon_add_batched_page(struct vmballoon *b, int idx,
+				struct page *p)
+{
+	vmballoon_batch_set_pa(b->batch_page, idx,
+			(u64)page_to_pfn(p) << PAGE_SHIFT);
+}
+
 /*
  * Inflate the balloon towards its target size. Note that we try to limit
  * the rate of allocation to make sure we are not choking the rest of the
@@ -507,6 +713,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 	unsigned int rate;
 	unsigned int i;
 	unsigned int allocations = 0;
+	unsigned int num_pages = 0;
 	int error = 0;
 	gfp_t flags = VMW_PAGE_ALLOC_NOSLEEP;
 
@@ -539,14 +746,13 @@ static void vmballoon_inflate(struct vmballoon *b)
 		 __func__, goal, rate, b->rate_alloc);
 
 	for (i = 0; i < goal; i++) {
-		struct page *page;
+		struct page *page = alloc_page(flags);
 
 		if (flags == VMW_PAGE_ALLOC_NOSLEEP)
 			STATS_INC(b->stats.alloc);
 		else
 			STATS_INC(b->stats.sleep_alloc);
 
-		page = alloc_page(flags);
 		if (!page) {
 			if (flags == VMW_PAGE_ALLOC_CANSLEEP) {
 				/*
@@ -580,9 +786,13 @@ static void vmballoon_inflate(struct vmballoon *b)
 			continue;
 		}
 
-		error = vmballoon_lock_page(b, page);
-		if (error)
-			break;
+		b->ops->add_page(b, num_pages++, page);
+		if (num_pages == b->batch_max_pages) {
+			error = b->ops->lock(b, num_pages);
+			num_pages = 0;
+			if (error)
+				break;
+		}
 
 		if (++allocations > VMW_BALLOON_YIELD_THRESHOLD) {
 			cond_resched();
@@ -595,6 +805,9 @@ static void vmballoon_inflate(struct vmballoon *b)
 		}
 	}
 
+	if (num_pages > 0)
+		b->ops->lock(b, num_pages);
+
 	/*
 	 * We reached our goal without failures so try increasing
 	 * allocation rate.
@@ -618,6 +831,7 @@ static void vmballoon_deflate(struct vmballoon *b)
 	struct page *page, *next;
 	unsigned int i = 0;
 	unsigned int goal;
+	unsigned int num_pages = 0;
 	int error;
 
 	pr_debug("%s - size: %d, target %d\n", __func__, b->size, b->target);
@@ -629,21 +843,94 @@ static void vmballoon_deflate(struct vmballoon *b)
 
 	/* free pages to reach target */
 	list_for_each_entry_safe(page, next, &b->pages, lru) {
-		error = vmballoon_release_page(b, page);
-		if (error) {
-			/* quickly decrease rate in case of error */
-			b->rate_free = max(b->rate_free / 2,
-					   VMW_BALLOON_RATE_FREE_MIN);
-			return;
+		list_del(&page->lru);
+		b->ops->add_page(b, num_pages++, page);
+
+		if (num_pages == b->batch_max_pages) {
+			error = b->ops->unlock(b, num_pages);
+			num_pages = 0;
+			if (error) {
+				/* quickly decrease rate in case of error */
+				b->rate_free = max(b->rate_free / 2,
+						VMW_BALLOON_RATE_FREE_MIN);
+				return;
+			}
 		}
 
 		if (++i >= goal)
 			break;
 	}
 
+	if (num_pages > 0)
+		b->ops->unlock(b, num_pages);
+
 	/* slowly increase rate if there were no errors */
-	b->rate_free = min(b->rate_free + VMW_BALLOON_RATE_FREE_INC,
-			   VMW_BALLOON_RATE_FREE_MAX);
+	if (error == 0)
+		b->rate_free = min(b->rate_free + VMW_BALLOON_RATE_FREE_INC,
+				   VMW_BALLOON_RATE_FREE_MAX);
+}
+
+static const struct vmballoon_ops vmballoon_basic_ops = {
+	.add_page = vmballoon_add_page,
+	.lock = vmballoon_lock_page,
+	.unlock = vmballoon_unlock_page
+};
+
+static const struct vmballoon_ops vmballoon_batched_ops = {
+	.add_page = vmballoon_add_batched_page,
+	.lock = vmballoon_lock_batched_page,
+	.unlock = vmballoon_unlock_batched_page
+};
+
+static bool vmballoon_init_batching(struct vmballoon *b)
+{
+	b->page = alloc_page(VMW_PAGE_ALLOC_NOSLEEP);
+	if (!b->page)
+		return false;
+
+	b->batch_page = vmap(&b->page, 1, VM_MAP, PAGE_KERNEL);
+	if (!b->batch_page) {
+		__free_page(b->page);
+		return false;
+	}
+
+	return true;
+}
+
+/*
+ * Perform standard reset sequence by popping the balloon (in case it
+ * is not  empty) and then restarting protocol. This operation normally
+ * happens when host responds with VMW_BALLOON_ERROR_RESET to a command.
+ */
+static void vmballoon_reset(struct vmballoon *b)
+{
+	/* free all pages, skipping monitor unlock */
+	vmballoon_pop(b);
+
+	if (!vmballoon_send_start(b, VMW_BALLOON_CAPABILITIES))
+		return;
+
+	if ((b->capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
+		b->ops = &vmballoon_batched_ops;
+		b->batch_max_pages = VMW_BALLOON_BATCH_MAX_PAGES;
+		if (!vmballoon_init_batching(b)) {
+			/*
+			 * We failed to initialize batching, inform the monitor
+			 * about it by sending a null capability.
+			 *
+			 * The guest will retry in one second.
+			 */
+			vmballoon_send_start(b, 0);
+			return;
+		}
+	} else if ((b->capabilities & VMW_BALLOON_BASIC_CMDS) != 0) {
+		b->ops = &vmballoon_basic_ops;
+		b->batch_max_pages = 1;
+	}
+
+	b->reset_required = false;
+	if (!vmballoon_send_guest_id(b))
+		pr_err("failed to send guest ID to the host\n");
 }
 
 /*
@@ -802,11 +1089,23 @@ static int __init vmballoon_init(void)
 	/*
 	 * Start balloon.
 	 */
-	if (!vmballoon_send_start(&balloon)) {
+	if (!vmballoon_send_start(&balloon, VMW_BALLOON_CAPABILITIES)) {
 		pr_err("failed to send start command to the host\n");
 		return -EIO;
 	}
 
+	if ((balloon.capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
+		balloon.ops = &vmballoon_batched_ops;
+		balloon.batch_max_pages = VMW_BALLOON_BATCH_MAX_PAGES;
+		if (!vmballoon_init_batching(&balloon)) {
+			pr_err("failed to init batching\n");
+			return -EIO;
+		}
+	} else if ((balloon.capabilities & VMW_BALLOON_BASIC_CMDS) != 0) {
+		balloon.ops = &vmballoon_basic_ops;
+		balloon.batch_max_pages = 1;
+	}
+
 	if (!vmballoon_send_guest_id(&balloon)) {
 		pr_err("failed to send guest ID to the host\n");
 		return -EIO;
@@ -833,7 +1132,7 @@ static void __exit vmballoon_exit(void)
 	 * Reset connection before deallocating memory to avoid potential for
 	 * additional spurious resets from guest touching deallocated pages.
 	 */
-	vmballoon_send_start(&balloon);
+	vmballoon_send_start(&balloon, VMW_BALLOON_CAPABILITIES);
 	vmballoon_pop(&balloon);
 }
 module_exit(vmballoon_exit);
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v3 4/9] VMware balloon: Update balloon target on each lock/unlock.
  2015-06-12 16:15               ` dmitry.torokhov
                                   ` (3 preceding siblings ...)
  2015-06-12 18:43                 ` [PATCH v3 3/9] VMware balloon: add batching to the vmw_balloon Philip P. Moltmann
@ 2015-06-12 18:43                 ` Philip P. Moltmann
  2015-06-12 18:43                 ` [PATCH v3 5/9] VMware balloon: Show capabilities of balloon and resulting capabilities in the debug-fs node Philip P. Moltmann
                                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-06-12 18:43 UTC (permalink / raw)
  To: dmitry.torokhov
  Cc: Xavier Deguillard, gregkh, linux-kernel, akpm, pv-drivers,
	Philip P. Moltmann

From: Xavier Deguillard <xdeguillard@vmware.com>

Instead of waiting for the next GET_TARGET command, we can react faster
by exploiting the fact that each hypervisor call also returns the
balloon target.

Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
Acked-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Philip P. Moltmann <moltmann@vmware.com>
Acked-by: Andy King <acking@vmware.com>
---
 drivers/misc/vmw_balloon.c | 85 +++++++++++++++++++++++-----------------------
 1 file changed, 42 insertions(+), 43 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index f65c676..72247d9 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -46,7 +46,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.3.1.0-k");
+MODULE_VERSION("1.3.2.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -254,8 +254,10 @@ struct vmballoon;
 
 struct vmballoon_ops {
 	void (*add_page)(struct vmballoon *b, int idx, struct page *p);
-	int (*lock)(struct vmballoon *b, unsigned int num_pages);
-	int (*unlock)(struct vmballoon *b, unsigned int num_pages);
+	int (*lock)(struct vmballoon *b, unsigned int num_pages,
+						unsigned int *target);
+	int (*unlock)(struct vmballoon *b, unsigned int num_pages,
+						unsigned int *target);
 };
 
 struct vmballoon {
@@ -412,7 +414,7 @@ static bool vmballoon_send_get_target(struct vmballoon *b, u32 *new_target)
  * check the return value and maybe submit a different page.
  */
 static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
-				     unsigned int *hv_status)
+				unsigned int *hv_status, unsigned int *target)
 {
 	unsigned long status, dummy = 0;
 	u32 pfn32;
@@ -423,7 +425,7 @@ static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 
 	STATS_INC(b->stats.lock);
 
-	*hv_status = status = VMWARE_BALLOON_CMD(LOCK, pfn, dummy, dummy);
+	*hv_status = status = VMWARE_BALLOON_CMD(LOCK, pfn, dummy, *target);
 	if (vmballoon_check_status(b, status))
 		return 0;
 
@@ -433,14 +435,14 @@ static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 }
 
 static int vmballoon_send_batched_lock(struct vmballoon *b,
-					unsigned int num_pages)
+				unsigned int num_pages, unsigned int *target)
 {
-	unsigned long status, dummy;
+	unsigned long status;
 	unsigned long pfn = page_to_pfn(b->page);
 
 	STATS_INC(b->stats.lock);
 
-	status = VMWARE_BALLOON_CMD(BATCHED_LOCK, pfn, num_pages, dummy);
+	status = VMWARE_BALLOON_CMD(BATCHED_LOCK, pfn, num_pages, *target);
 	if (vmballoon_check_status(b, status))
 		return 0;
 
@@ -453,7 +455,8 @@ static int vmballoon_send_batched_lock(struct vmballoon *b,
  * Notify the host that guest intends to release given page back into
  * the pool of available (to the guest) pages.
  */
-static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
+static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn,
+							unsigned int *target)
 {
 	unsigned long status, dummy = 0;
 	u32 pfn32;
@@ -464,7 +467,7 @@ static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
 
 	STATS_INC(b->stats.unlock);
 
-	status = VMWARE_BALLOON_CMD(UNLOCK, pfn, dummy, dummy);
+	status = VMWARE_BALLOON_CMD(UNLOCK, pfn, dummy, *target);
 	if (vmballoon_check_status(b, status))
 		return true;
 
@@ -474,14 +477,14 @@ static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
 }
 
 static bool vmballoon_send_batched_unlock(struct vmballoon *b,
-						unsigned int num_pages)
+				unsigned int num_pages, unsigned int *target)
 {
-	unsigned long status, dummy;
+	unsigned long status;
 	unsigned long pfn = page_to_pfn(b->page);
 
 	STATS_INC(b->stats.unlock);
 
-	status = VMWARE_BALLOON_CMD(BATCHED_UNLOCK, pfn, num_pages, dummy);
+	status = VMWARE_BALLOON_CMD(BATCHED_UNLOCK, pfn, num_pages, *target);
 	if (vmballoon_check_status(b, status))
 		return true;
 
@@ -527,12 +530,14 @@ static void vmballoon_pop(struct vmballoon *b)
  * refuse list, those refused page are then released at the end of the
  * inflation cycle.
  */
-static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages)
+static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages,
+							unsigned int *target)
 {
 	int locked, hv_status;
 	struct page *page = b->page;
 
-	locked = vmballoon_send_lock_page(b, page_to_pfn(page), &hv_status);
+	locked = vmballoon_send_lock_page(b, page_to_pfn(page), &hv_status,
+								target);
 	if (locked > 0) {
 		STATS_INC(b->stats.refused_alloc);
 
@@ -566,11 +571,11 @@ static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages)
 }
 
 static int vmballoon_lock_batched_page(struct vmballoon *b,
-				unsigned int num_pages)
+				unsigned int num_pages, unsigned int *target)
 {
 	int locked, i;
 
-	locked = vmballoon_send_batched_lock(b, num_pages);
+	locked = vmballoon_send_batched_lock(b, num_pages, target);
 	if (locked > 0) {
 		for (i = 0; i < num_pages; i++) {
 			u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
@@ -619,11 +624,12 @@ static int vmballoon_lock_batched_page(struct vmballoon *b,
  * the host so it can make sure the page will be available for the guest
  * to use, if needed.
  */
-static int vmballoon_unlock_page(struct vmballoon *b, unsigned int num_pages)
+static int vmballoon_unlock_page(struct vmballoon *b, unsigned int num_pages,
+							unsigned int *target)
 {
 	struct page *page = b->page;
 
-	if (!vmballoon_send_unlock_page(b, page_to_pfn(page))) {
+	if (!vmballoon_send_unlock_page(b, page_to_pfn(page), target)) {
 		list_add(&page->lru, &b->pages);
 		return -EIO;
 	}
@@ -639,12 +645,12 @@ static int vmballoon_unlock_page(struct vmballoon *b, unsigned int num_pages)
 }
 
 static int vmballoon_unlock_batched_page(struct vmballoon *b,
-					unsigned int num_pages)
+				unsigned int num_pages, unsigned int *target)
 {
 	int locked, i, ret = 0;
 	bool hv_success;
 
-	hv_success = vmballoon_send_batched_unlock(b, num_pages);
+	hv_success = vmballoon_send_batched_unlock(b, num_pages, target);
 	if (!hv_success)
 		ret = -EIO;
 
@@ -709,9 +715,7 @@ static void vmballoon_add_batched_page(struct vmballoon *b, int idx,
  */
 static void vmballoon_inflate(struct vmballoon *b)
 {
-	unsigned int goal;
 	unsigned int rate;
-	unsigned int i;
 	unsigned int allocations = 0;
 	unsigned int num_pages = 0;
 	int error = 0;
@@ -734,7 +738,6 @@ static void vmballoon_inflate(struct vmballoon *b)
 	 * slowdown page allocations considerably.
 	 */
 
-	goal = b->target - b->size;
 	/*
 	 * Start with no sleep allocation rate which may be higher
 	 * than sleeping allocation rate.
@@ -743,16 +746,17 @@ static void vmballoon_inflate(struct vmballoon *b)
 			b->rate_alloc : VMW_BALLOON_NOSLEEP_ALLOC_MAX;
 
 	pr_debug("%s - goal: %d, no-sleep rate: %d, sleep rate: %d\n",
-		 __func__, goal, rate, b->rate_alloc);
+		 __func__, b->target - b->size, rate, b->rate_alloc);
 
-	for (i = 0; i < goal; i++) {
-		struct page *page = alloc_page(flags);
+	while (b->size < b->target && num_pages < b->target - b->size) {
+		struct page *page;
 
 		if (flags == VMW_PAGE_ALLOC_NOSLEEP)
 			STATS_INC(b->stats.alloc);
 		else
 			STATS_INC(b->stats.sleep_alloc);
 
+		page = alloc_page(flags);
 		if (!page) {
 			if (flags == VMW_PAGE_ALLOC_CANSLEEP) {
 				/*
@@ -777,7 +781,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 			 */
 			b->slow_allocation_cycles = VMW_BALLOON_SLOW_CYCLES;
 
-			if (i >= b->rate_alloc)
+			if (allocations >= b->rate_alloc)
 				break;
 
 			flags = VMW_PAGE_ALLOC_CANSLEEP;
@@ -788,7 +792,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 
 		b->ops->add_page(b, num_pages++, page);
 		if (num_pages == b->batch_max_pages) {
-			error = b->ops->lock(b, num_pages);
+			error = b->ops->lock(b, num_pages, &b->target);
 			num_pages = 0;
 			if (error)
 				break;
@@ -799,21 +803,21 @@ static void vmballoon_inflate(struct vmballoon *b)
 			allocations = 0;
 		}
 
-		if (i >= rate) {
+		if (allocations >= rate) {
 			/* We allocated enough pages, let's take a break. */
 			break;
 		}
 	}
 
 	if (num_pages > 0)
-		b->ops->lock(b, num_pages);
+		b->ops->lock(b, num_pages, &b->target);
 
 	/*
 	 * We reached our goal without failures so try increasing
 	 * allocation rate.
 	 */
-	if (error == 0 && i >= b->rate_alloc) {
-		unsigned int mult = i / b->rate_alloc;
+	if (error == 0 && allocations >= b->rate_alloc) {
+		unsigned int mult = allocations / b->rate_alloc;
 
 		b->rate_alloc =
 			min(b->rate_alloc + mult * VMW_BALLOON_RATE_ALLOC_INC,
@@ -830,16 +834,11 @@ static void vmballoon_deflate(struct vmballoon *b)
 {
 	struct page *page, *next;
 	unsigned int i = 0;
-	unsigned int goal;
 	unsigned int num_pages = 0;
 	int error;
 
-	pr_debug("%s - size: %d, target %d\n", __func__, b->size, b->target);
-
-	/* limit deallocation rate */
-	goal = min(b->size - b->target, b->rate_free);
-
-	pr_debug("%s - goal: %d, rate: %d\n", __func__, goal, b->rate_free);
+	pr_debug("%s - size: %d, target %d, rate: %d\n", __func__, b->size,
+						b->target, b->rate_free);
 
 	/* free pages to reach target */
 	list_for_each_entry_safe(page, next, &b->pages, lru) {
@@ -847,7 +846,7 @@ static void vmballoon_deflate(struct vmballoon *b)
 		b->ops->add_page(b, num_pages++, page);
 
 		if (num_pages == b->batch_max_pages) {
-			error = b->ops->unlock(b, num_pages);
+			error = b->ops->unlock(b, num_pages, &b->target);
 			num_pages = 0;
 			if (error) {
 				/* quickly decrease rate in case of error */
@@ -857,12 +856,12 @@ static void vmballoon_deflate(struct vmballoon *b)
 			}
 		}
 
-		if (++i >= goal)
+		if (++i >= b->size - b->target)
 			break;
 	}
 
 	if (num_pages > 0)
-		b->ops->unlock(b, num_pages);
+		b->ops->unlock(b, num_pages, &b->target);
 
 	/* slowly increase rate if there were no errors */
 	if (error == 0)
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v3 5/9] VMware balloon: Show capabilities of balloon and resulting capabilities in the debug-fs node.
  2015-06-12 16:15               ` dmitry.torokhov
                                   ` (4 preceding siblings ...)
  2015-06-12 18:43                 ` [PATCH v3 4/9] VMware balloon: Update balloon target on each lock/unlock Philip P. Moltmann
@ 2015-06-12 18:43                 ` Philip P. Moltmann
  2015-08-05 20:14                   ` Greg KH
  2015-06-12 18:43                 ` [PATCH v3 6/9] VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode Philip P. Moltmann
                                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 61+ messages in thread
From: Philip P. Moltmann @ 2015-06-12 18:43 UTC (permalink / raw)
  To: dmitry.torokhov
  Cc: Philip P. Moltmann, gregkh, linux-kernel, xdeguillard, akpm, pv-drivers

This helps with debugging vmw_balloon behavior, as it is clear what
functionality is enabled.

Acked-by: Andy King <acking@vmware.com>
Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
---
 drivers/misc/vmw_balloon.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 72247d9..6eaf7f7 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -46,7 +46,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.3.2.0-k");
+MODULE_VERSION("1.3.3.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -978,6 +978,12 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 	struct vmballoon *b = f->private;
 	struct vmballoon_stats *stats = &b->stats;
 
+	/* format capabilities info */
+	seq_printf(f,
+		   "balloon capabilities:   %#4x\n"
+		   "used capabilities:      %#4lx\n",
+		   VMW_BALLOON_CAPABILITIES, b->capabilities);
+
 	/* format size info */
 	seq_printf(f,
 		   "target:             %8d pages\n"
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v3 6/9] VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode.
  2015-06-12 16:15               ` dmitry.torokhov
                                   ` (5 preceding siblings ...)
  2015-06-12 18:43                 ` [PATCH v3 5/9] VMware balloon: Show capabilities of balloon and resulting capabilities in the debug-fs node Philip P. Moltmann
@ 2015-06-12 18:43                 ` Philip P. Moltmann
  2015-06-12 18:43                 ` [PATCH v3 7/9] VMware balloon: Support 2m page ballooning Philip P. Moltmann
                                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-06-12 18:43 UTC (permalink / raw)
  To: dmitry.torokhov
  Cc: Philip P. Moltmann, gregkh, linux-kernel, xdeguillard, akpm, pv-drivers

When VMware's hypervisor requests a VM to reclaim memory, this is preferably done
via ballooning. If the balloon driver does not return memory fast enough, more
drastic methods, such as hypervisor-level swapping, are needed. These other methods
cause performance issues, e.g. hypervisor-level swapping requires the hypervisor to
swap in a page synchronously while the virtual CPU is blocked.

Hence it is in the interest of the VM to balloon memory as fast as possible. The
problem with doing this is that the VM might end up doing nothing but ballooning,
and the user might notice that the VM is stalled, especially when the VM has
only a single virtual CPU.

This is less of a problem if the VM and the hypervisor perform balloon operations
faster. Also, the balloon driver yields regularly, so on a single virtual CPU
the Linux scheduler should be able to properly time-slice between ballooning and
other tasks.

Testing Done: quickly ballooned a lot of pages while watching for any
perceived hiccups (periods of non-responsiveness) in the execution of the
Linux VM. No such hiccups were seen.

Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
---
 drivers/misc/vmw_balloon.c | 68 +++++++++++-----------------------------------
 1 file changed, 16 insertions(+), 52 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 6eaf7f7..904b6a6 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -46,7 +46,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.3.3.0-k");
+MODULE_VERSION("1.3.4.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -57,12 +57,6 @@ MODULE_LICENSE("GPL");
  */
 
 /*
- * Rate of allocating memory when there is no memory pressure
- * (driver performs non-sleeping allocations).
- */
-#define VMW_BALLOON_NOSLEEP_ALLOC_MAX	16384U
-
-/*
  * Rates of memory allocaton when guest experiences memory pressure
  * (driver performs sleeping allocations).
  */
@@ -71,13 +65,6 @@ MODULE_LICENSE("GPL");
 #define VMW_BALLOON_RATE_ALLOC_INC	16U
 
 /*
- * Rates for releasing pages while deflating balloon.
- */
-#define VMW_BALLOON_RATE_FREE_MIN	512U
-#define VMW_BALLOON_RATE_FREE_MAX	16384U
-#define VMW_BALLOON_RATE_FREE_INC	16U
-
-/*
  * When guest is under memory pressure, use a reduced page allocation
  * rate for next several cycles.
  */
@@ -99,9 +86,6 @@ MODULE_LICENSE("GPL");
  */
 #define VMW_PAGE_ALLOC_CANSLEEP		(GFP_HIGHUSER)
 
-/* Maximum number of page allocations without yielding processor */
-#define VMW_BALLOON_YIELD_THRESHOLD	1024
-
 /* Maximum number of refused pages we accumulate during inflation cycle */
 #define VMW_BALLOON_MAX_REFUSED		16
 
@@ -278,7 +262,6 @@ struct vmballoon {
 
 	/* adjustment rates (pages per second) */
 	unsigned int rate_alloc;
-	unsigned int rate_free;
 
 	/* slowdown page allocations for next few cycles */
 	unsigned int slow_allocation_cycles;
@@ -502,18 +485,13 @@ static bool vmballoon_send_batched_unlock(struct vmballoon *b,
 static void vmballoon_pop(struct vmballoon *b)
 {
 	struct page *page, *next;
-	unsigned int count = 0;
 
 	list_for_each_entry_safe(page, next, &b->pages, lru) {
 		list_del(&page->lru);
 		__free_page(page);
 		STATS_INC(b->stats.free);
 		b->size--;
-
-		if (++count >= b->rate_free) {
-			count = 0;
-			cond_resched();
-		}
+		cond_resched();
 	}
 
 	if ((b->capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
@@ -715,7 +693,7 @@ static void vmballoon_add_batched_page(struct vmballoon *b, int idx,
  */
 static void vmballoon_inflate(struct vmballoon *b)
 {
-	unsigned int rate;
+	unsigned rate;
 	unsigned int allocations = 0;
 	unsigned int num_pages = 0;
 	int error = 0;
@@ -742,13 +720,13 @@ static void vmballoon_inflate(struct vmballoon *b)
 	 * Start with no sleep allocation rate which may be higher
 	 * than sleeping allocation rate.
 	 */
-	rate = b->slow_allocation_cycles ?
-			b->rate_alloc : VMW_BALLOON_NOSLEEP_ALLOC_MAX;
+	rate = b->slow_allocation_cycles ? b->rate_alloc : UINT_MAX;
 
-	pr_debug("%s - goal: %d, no-sleep rate: %d, sleep rate: %d\n",
+	pr_debug("%s - goal: %d, no-sleep rate: %u, sleep rate: %d\n",
 		 __func__, b->target - b->size, rate, b->rate_alloc);
 
-	while (b->size < b->target && num_pages < b->target - b->size) {
+	while (!b->reset_required &&
+		b->size < b->target && num_pages < b->target - b->size) {
 		struct page *page;
 
 		if (flags == VMW_PAGE_ALLOC_NOSLEEP)
@@ -798,10 +776,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 				break;
 		}
 
-		if (++allocations > VMW_BALLOON_YIELD_THRESHOLD) {
-			cond_resched();
-			allocations = 0;
-		}
+		cond_resched();
 
 		if (allocations >= rate) {
 			/* We allocated enough pages, let's take a break. */
@@ -837,36 +812,29 @@ static void vmballoon_deflate(struct vmballoon *b)
 	unsigned int num_pages = 0;
 	int error;
 
-	pr_debug("%s - size: %d, target %d, rate: %d\n", __func__, b->size,
-						b->target, b->rate_free);
+	pr_debug("%s - size: %d, target %d\n", __func__, b->size, b->target);
 
 	/* free pages to reach target */
 	list_for_each_entry_safe(page, next, &b->pages, lru) {
 		list_del(&page->lru);
 		b->ops->add_page(b, num_pages++, page);
 
+
 		if (num_pages == b->batch_max_pages) {
 			error = b->ops->unlock(b, num_pages, &b->target);
 			num_pages = 0;
-			if (error) {
-				/* quickly decrease rate in case of error */
-				b->rate_free = max(b->rate_free / 2,
-						VMW_BALLOON_RATE_FREE_MIN);
+			if (error)
 				return;
-			}
 		}
 
-		if (++i >= b->size - b->target)
+		if (b->reset_required || ++i >= b->size - b->target)
 			break;
+
+		cond_resched();
 	}
 
 	if (num_pages > 0)
 		b->ops->unlock(b, num_pages, &b->target);
-
-	/* slowly increase rate if there were no errors */
-	if (error == 0)
-		b->rate_free = min(b->rate_free + VMW_BALLOON_RATE_FREE_INC,
-				   VMW_BALLOON_RATE_FREE_MAX);
 }
 
 static const struct vmballoon_ops vmballoon_basic_ops = {
@@ -992,11 +960,8 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 
 	/* format rate info */
 	seq_printf(f,
-		   "rateNoSleepAlloc:   %8d pages/sec\n"
-		   "rateSleepAlloc:     %8d pages/sec\n"
-		   "rateFree:           %8d pages/sec\n",
-		   VMW_BALLOON_NOSLEEP_ALLOC_MAX,
-		   b->rate_alloc, b->rate_free);
+		   "rateSleepAlloc:     %8d pages/sec\n",
+		   b->rate_alloc);
 
 	seq_printf(f,
 		   "\n"
@@ -1087,7 +1052,6 @@ static int __init vmballoon_init(void)
 
 	/* initialize rates */
 	balloon.rate_alloc = VMW_BALLOON_RATE_ALLOC_MAX;
-	balloon.rate_free = VMW_BALLOON_RATE_FREE_MAX;
 
 	INIT_DELAYED_WORK(&balloon.dwork, vmballoon_work);
 
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v3 7/9] VMware balloon: Support 2m page ballooning.
  2015-06-12 16:15               ` dmitry.torokhov
                                   ` (6 preceding siblings ...)
  2015-06-12 18:43                 ` [PATCH v3 6/9] VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode Philip P. Moltmann
@ 2015-06-12 18:43                 ` Philip P. Moltmann
  2015-06-12 18:43                 ` [PATCH v3 8/9] VMware balloon: Treat init like reset Philip P. Moltmann
  2015-06-12 18:43                 ` [PATCH v3 9/9] VMware balloon: Enable notification via VMCI Philip P. Moltmann
  9 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-06-12 18:43 UTC (permalink / raw)
  To: dmitry.torokhov
  Cc: Philip P. Moltmann, gregkh, linux-kernel, xdeguillard, akpm, pv-drivers

2 MB page ballooning significantly reduces the hypervisor-side (and guest-side)
overhead of ballooning and unballooning.

hypervisor only:
      balloon  unballoon
4 KB  2 GB/s   2.6 GB/s
2 MB  54 GB/s  767 GB/s

Use 2 MB pages, as the hypervisor is always 64-bit and 2 MB is the smallest
supported super-page size.

The code has to run on older versions of ESX, and old balloon drivers run on
newer versions of ESX. Hence the capabilities are matched with the host before
2 MB page ballooning is enabled.

Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
---
 drivers/misc/vmw_balloon.c | 376 +++++++++++++++++++++++++++++++--------------
 1 file changed, 258 insertions(+), 118 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 904b6a6..cbaf329 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -46,7 +46,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.3.4.0-k");
+MODULE_VERSION("1.4.0.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -101,11 +101,16 @@ enum vmwballoon_capabilities {
 	 * Bit 0 is reserved and not associated to any capability.
 	 */
 	VMW_BALLOON_BASIC_CMDS		= (1 << 1),
-	VMW_BALLOON_BATCHED_CMDS	= (1 << 2)
+	VMW_BALLOON_BATCHED_CMDS	= (1 << 2),
+	VMW_BALLOON_BATCHED_2M_CMDS     = (1 << 3),
 };
 
 #define VMW_BALLOON_CAPABILITIES	(VMW_BALLOON_BASIC_CMDS \
-					| VMW_BALLOON_BATCHED_CMDS)
+					| VMW_BALLOON_BATCHED_CMDS \
+					| VMW_BALLOON_BATCHED_2M_CMDS)
+
+#define VMW_BALLOON_2M_SHIFT		(9)
+#define VMW_BALLOON_NUM_PAGE_SIZES	(2)
 
 /*
  * Backdoor commands availability:
@@ -116,14 +121,19 @@ enum vmwballoon_capabilities {
  *	LOCK and UNLOCK commands,
  * VMW_BALLOON_BATCHED_CMDS:
  *	BATCHED_LOCK and BATCHED_UNLOCK commands.
+ * VMW BALLOON_BATCHED_2M_CMDS:
+ *	BATCHED_2M_LOCK and BATCHED_2M_UNLOCK commands.
  */
-#define VMW_BALLOON_CMD_START		0
-#define VMW_BALLOON_CMD_GET_TARGET	1
-#define VMW_BALLOON_CMD_LOCK		2
-#define VMW_BALLOON_CMD_UNLOCK		3
-#define VMW_BALLOON_CMD_GUEST_ID	4
-#define VMW_BALLOON_CMD_BATCHED_LOCK	6
-#define VMW_BALLOON_CMD_BATCHED_UNLOCK	7
+#define VMW_BALLOON_CMD_START			0
+#define VMW_BALLOON_CMD_GET_TARGET		1
+#define VMW_BALLOON_CMD_LOCK			2
+#define VMW_BALLOON_CMD_UNLOCK			3
+#define VMW_BALLOON_CMD_GUEST_ID		4
+#define VMW_BALLOON_CMD_BATCHED_LOCK		6
+#define VMW_BALLOON_CMD_BATCHED_UNLOCK		7
+#define VMW_BALLOON_CMD_BATCHED_2M_LOCK		8
+#define VMW_BALLOON_CMD_BATCHED_2M_UNLOCK	9
+
 
 /* error codes */
 #define VMW_BALLOON_SUCCESS		        0
@@ -151,9 +161,6 @@ enum vmwballoon_capabilities {
  * +-------------+----------+--------+
  * 64  PAGE_SHIFT          6         0
  *
- * For now only 4K pages are supported, but we can easily support large pages
- * by using bits in the reserved field.
- *
  * The reserved field should be set to 0.
  */
 #define VMW_BALLOON_BATCH_MAX_PAGES	(PAGE_SIZE / sizeof(u64))
@@ -208,19 +215,19 @@ struct vmballoon_stats {
 	unsigned int timer;
 
 	/* allocation statistics */
-	unsigned int alloc;
-	unsigned int alloc_fail;
+	unsigned int alloc[VMW_BALLOON_NUM_PAGE_SIZES];
+	unsigned int alloc_fail[VMW_BALLOON_NUM_PAGE_SIZES];
 	unsigned int sleep_alloc;
 	unsigned int sleep_alloc_fail;
-	unsigned int refused_alloc;
-	unsigned int refused_free;
-	unsigned int free;
+	unsigned int refused_alloc[VMW_BALLOON_NUM_PAGE_SIZES];
+	unsigned int refused_free[VMW_BALLOON_NUM_PAGE_SIZES];
+	unsigned int free[VMW_BALLOON_NUM_PAGE_SIZES];
 
 	/* monitor operations */
-	unsigned int lock;
-	unsigned int lock_fail;
-	unsigned int unlock;
-	unsigned int unlock_fail;
+	unsigned int lock[VMW_BALLOON_NUM_PAGE_SIZES];
+	unsigned int lock_fail[VMW_BALLOON_NUM_PAGE_SIZES];
+	unsigned int unlock[VMW_BALLOON_NUM_PAGE_SIZES];
+	unsigned int unlock_fail[VMW_BALLOON_NUM_PAGE_SIZES];
 	unsigned int target;
 	unsigned int target_fail;
 	unsigned int start;
@@ -239,19 +246,25 @@ struct vmballoon;
 struct vmballoon_ops {
 	void (*add_page)(struct vmballoon *b, int idx, struct page *p);
 	int (*lock)(struct vmballoon *b, unsigned int num_pages,
-						unsigned int *target);
+			bool is_2m_pages, unsigned int *target);
 	int (*unlock)(struct vmballoon *b, unsigned int num_pages,
-						unsigned int *target);
+			bool is_2m_pages, unsigned int *target);
 };
 
-struct vmballoon {
-
+struct vmballoon_page_size {
 	/* list of reserved physical pages */
 	struct list_head pages;
 
 	/* transient list of non-balloonable pages */
 	struct list_head refused_pages;
 	unsigned int n_refused_pages;
+};
+
+struct vmballoon {
+	struct vmballoon_page_size page_sizes[VMW_BALLOON_NUM_PAGE_SIZES];
+
+	/* supported page sizes. 1 == 4k pages only, 2 == 4k and 2m pages */
+	unsigned supported_page_sizes;
 
 	/* balloon size in pages */
 	unsigned int size;
@@ -296,6 +309,7 @@ static struct vmballoon balloon;
 static bool vmballoon_send_start(struct vmballoon *b, unsigned long req_caps)
 {
 	unsigned long status, capabilities, dummy = 0;
+	bool success;
 
 	STATS_INC(b->stats.start);
 
@@ -304,15 +318,26 @@ static bool vmballoon_send_start(struct vmballoon *b, unsigned long req_caps)
 	switch (status) {
 	case VMW_BALLOON_SUCCESS_WITH_CAPABILITIES:
 		b->capabilities = capabilities;
-		return true;
+		success = true;
+		break;
 	case VMW_BALLOON_SUCCESS:
 		b->capabilities = VMW_BALLOON_BASIC_CMDS;
-		return true;
+		success = true;
+		break;
+	default:
+		success = false;
 	}
 
-	pr_debug("%s - failed, hv returns %ld\n", __func__, status);
-	STATS_INC(b->stats.start_fail);
-	return false;
+	if (b->capabilities & VMW_BALLOON_BATCHED_2M_CMDS)
+		b->supported_page_sizes = 2;
+	else
+		b->supported_page_sizes = 1;
+
+	if (!success) {
+		pr_debug("%s - failed, hv returns %ld\n", __func__, status);
+		STATS_INC(b->stats.start_fail);
+	}
+	return success;
 }
 
 static bool vmballoon_check_status(struct vmballoon *b, unsigned long status)
@@ -353,6 +378,14 @@ static bool vmballoon_send_guest_id(struct vmballoon *b)
 	return false;
 }
 
+static u16 vmballoon_page_size(bool is_2m_page)
+{
+	if (is_2m_page)
+		return 1 << VMW_BALLOON_2M_SHIFT;
+
+	return 1;
+}
+
 /*
  * Retrieve desired balloon size from the host.
  */
@@ -406,31 +439,37 @@ static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 	if (pfn32 != pfn)
 		return -1;
 
-	STATS_INC(b->stats.lock);
+	STATS_INC(b->stats.lock[false]);
 
 	*hv_status = status = VMWARE_BALLOON_CMD(LOCK, pfn, dummy, *target);
 	if (vmballoon_check_status(b, status))
 		return 0;
 
 	pr_debug("%s - ppn %lx, hv returns %ld\n", __func__, pfn, status);
-	STATS_INC(b->stats.lock_fail);
+	STATS_INC(b->stats.lock_fail[false]);
 	return 1;
 }
 
 static int vmballoon_send_batched_lock(struct vmballoon *b,
-				unsigned int num_pages, unsigned int *target)
+		unsigned int num_pages, bool is_2m_pages, unsigned int *target)
 {
 	unsigned long status;
 	unsigned long pfn = page_to_pfn(b->page);
 
-	STATS_INC(b->stats.lock);
+	STATS_INC(b->stats.lock[is_2m_pages]);
+
+	if (is_2m_pages)
+		status = VMWARE_BALLOON_CMD(BATCHED_2M_LOCK, pfn, num_pages,
+				*target);
+	else
+		status = VMWARE_BALLOON_CMD(BATCHED_LOCK, pfn, num_pages,
+				*target);
 
-	status = VMWARE_BALLOON_CMD(BATCHED_LOCK, pfn, num_pages, *target);
 	if (vmballoon_check_status(b, status))
 		return 0;
 
 	pr_debug("%s - batch ppn %lx, hv returns %ld\n", __func__, pfn, status);
-	STATS_INC(b->stats.lock_fail);
+	STATS_INC(b->stats.lock_fail[is_2m_pages]);
 	return 1;
 }
 
@@ -448,34 +487,56 @@ static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn,
 	if (pfn32 != pfn)
 		return false;
 
-	STATS_INC(b->stats.unlock);
+	STATS_INC(b->stats.unlock[false]);
 
 	status = VMWARE_BALLOON_CMD(UNLOCK, pfn, dummy, *target);
 	if (vmballoon_check_status(b, status))
 		return true;
 
 	pr_debug("%s - ppn %lx, hv returns %ld\n", __func__, pfn, status);
-	STATS_INC(b->stats.unlock_fail);
+	STATS_INC(b->stats.unlock_fail[false]);
 	return false;
 }
 
 static bool vmballoon_send_batched_unlock(struct vmballoon *b,
-				unsigned int num_pages, unsigned int *target)
+		unsigned int num_pages, bool is_2m_pages, unsigned int *target)
 {
 	unsigned long status;
 	unsigned long pfn = page_to_pfn(b->page);
 
-	STATS_INC(b->stats.unlock);
+	STATS_INC(b->stats.unlock[is_2m_pages]);
+
+	if (is_2m_pages)
+		status = VMWARE_BALLOON_CMD(BATCHED_2M_UNLOCK, pfn, num_pages,
+				*target);
+	else
+		status = VMWARE_BALLOON_CMD(BATCHED_UNLOCK, pfn, num_pages,
+				*target);
 
-	status = VMWARE_BALLOON_CMD(BATCHED_UNLOCK, pfn, num_pages, *target);
 	if (vmballoon_check_status(b, status))
 		return true;
 
 	pr_debug("%s - batch ppn %lx, hv returns %ld\n", __func__, pfn, status);
-	STATS_INC(b->stats.unlock_fail);
+	STATS_INC(b->stats.unlock_fail[is_2m_pages]);
 	return false;
 }
 
+static struct page *vmballoon_alloc_page(gfp_t flags, bool is_2m_page)
+{
+	if (is_2m_page)
+		return alloc_pages(flags, VMW_BALLOON_2M_SHIFT);
+
+	return alloc_page(flags);
+}
+
+static void vmballoon_free_page(struct page *page, bool is_2m_page)
+{
+	if (is_2m_page)
+		__free_pages(page, VMW_BALLOON_2M_SHIFT);
+	else
+		__free_page(page);
+}
+
 /*
  * Quickly release all pages allocated for the balloon. This function is
  * called when host decides to "reset" balloon for one reason or another.
@@ -485,13 +546,21 @@ static bool vmballoon_send_batched_unlock(struct vmballoon *b,
 static void vmballoon_pop(struct vmballoon *b)
 {
 	struct page *page, *next;
-
-	list_for_each_entry_safe(page, next, &b->pages, lru) {
-		list_del(&page->lru);
-		__free_page(page);
-		STATS_INC(b->stats.free);
-		b->size--;
-		cond_resched();
+	unsigned is_2m_pages;
+
+	for (is_2m_pages = 0; is_2m_pages < VMW_BALLOON_NUM_PAGE_SIZES;
+			is_2m_pages++) {
+		struct vmballoon_page_size *page_size =
+				&b->page_sizes[is_2m_pages];
+		u16 size_per_page = vmballoon_page_size(is_2m_pages);
+
+		list_for_each_entry_safe(page, next, &page_size->pages, lru) {
+			list_del(&page->lru);
+			vmballoon_free_page(page, is_2m_pages);
+			STATS_INC(b->stats.free[is_2m_pages]);
+			b->size -= size_per_page;
+			cond_resched();
+		}
 	}
 
 	if ((b->capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
@@ -509,19 +578,22 @@ static void vmballoon_pop(struct vmballoon *b)
  * inflation cycle.
  */
 static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages,
-							unsigned int *target)
+				bool is_2m_pages, unsigned int *target)
 {
 	int locked, hv_status;
 	struct page *page = b->page;
+	struct vmballoon_page_size *page_size = &b->page_sizes[false];
+
+	/* is_2m_pages can never happen as 2m pages support implies batching */
 
 	locked = vmballoon_send_lock_page(b, page_to_pfn(page), &hv_status,
 								target);
 	if (locked > 0) {
-		STATS_INC(b->stats.refused_alloc);
+		STATS_INC(b->stats.refused_alloc[false]);
 
 		if (hv_status == VMW_BALLOON_ERROR_RESET ||
 				hv_status == VMW_BALLOON_ERROR_PPN_NOTNEEDED) {
-			__free_page(page);
+			vmballoon_free_page(page, false);
 			return -EIO;
 		}
 
@@ -530,17 +602,17 @@ static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages,
 		 * and retry allocation, unless we already accumulated
 		 * too many of them, in which case take a breather.
 		 */
-		if (b->n_refused_pages < VMW_BALLOON_MAX_REFUSED) {
-			b->n_refused_pages++;
-			list_add(&page->lru, &b->refused_pages);
+		if (page_size->n_refused_pages < VMW_BALLOON_MAX_REFUSED) {
+			page_size->n_refused_pages++;
+			list_add(&page->lru, &page_size->refused_pages);
 		} else {
-			__free_page(page);
+			vmballoon_free_page(page, false);
 		}
 		return -EIO;
 	}
 
 	/* track allocated page */
-	list_add(&page->lru, &b->pages);
+	list_add(&page->lru, &page_size->pages);
 
 	/* update balloon size */
 	b->size++;
@@ -549,17 +621,19 @@ static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages,
 }
 
 static int vmballoon_lock_batched_page(struct vmballoon *b,
-				unsigned int num_pages, unsigned int *target)
+		unsigned int num_pages, bool is_2m_pages, unsigned int *target)
 {
 	int locked, i;
+	u16 size_per_page = vmballoon_page_size(is_2m_pages);
 
-	locked = vmballoon_send_batched_lock(b, num_pages, target);
+	locked = vmballoon_send_batched_lock(b, num_pages, is_2m_pages,
+			target);
 	if (locked > 0) {
 		for (i = 0; i < num_pages; i++) {
 			u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
 			struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
 
-			__free_page(p);
+			vmballoon_free_page(p, is_2m_pages);
 		}
 
 		return -EIO;
@@ -568,25 +642,28 @@ static int vmballoon_lock_batched_page(struct vmballoon *b,
 	for (i = 0; i < num_pages; i++) {
 		u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
 		struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
+		struct vmballoon_page_size *page_size =
+				&b->page_sizes[is_2m_pages];
 
 		locked = vmballoon_batch_get_status(b->batch_page, i);
 
 		switch (locked) {
 		case VMW_BALLOON_SUCCESS:
-			list_add(&p->lru, &b->pages);
-			b->size++;
+			list_add(&p->lru, &page_size->pages);
+			b->size += size_per_page;
 			break;
 		case VMW_BALLOON_ERROR_PPN_PINNED:
 		case VMW_BALLOON_ERROR_PPN_INVALID:
-			if (b->n_refused_pages < VMW_BALLOON_MAX_REFUSED) {
-				list_add(&p->lru, &b->refused_pages);
-				b->n_refused_pages++;
+			if (page_size->n_refused_pages
+					< VMW_BALLOON_MAX_REFUSED) {
+				list_add(&p->lru, &page_size->refused_pages);
+				page_size->n_refused_pages++;
 				break;
 			}
 			/* Fallthrough */
 		case VMW_BALLOON_ERROR_RESET:
 		case VMW_BALLOON_ERROR_PPN_NOTNEEDED:
-			__free_page(p);
+			vmballoon_free_page(p, is_2m_pages);
 			break;
 		default:
 			/* This should never happen */
@@ -603,18 +680,21 @@ static int vmballoon_lock_batched_page(struct vmballoon *b,
  * to use, if needed.
  */
 static int vmballoon_unlock_page(struct vmballoon *b, unsigned int num_pages,
-							unsigned int *target)
+		bool is_2m_pages, unsigned int *target)
 {
 	struct page *page = b->page;
+	struct vmballoon_page_size *page_size = &b->page_sizes[false];
+
+	/* is_2m_pages can never happen as 2m pages support implies batching */
 
 	if (!vmballoon_send_unlock_page(b, page_to_pfn(page), target)) {
-		list_add(&page->lru, &b->pages);
+		list_add(&page->lru, &page_size->pages);
 		return -EIO;
 	}
 
 	/* deallocate page */
-	__free_page(page);
-	STATS_INC(b->stats.free);
+	vmballoon_free_page(page, false);
+	STATS_INC(b->stats.free[false]);
 
 	/* update balloon size */
 	b->size--;
@@ -623,18 +703,23 @@ static int vmballoon_unlock_page(struct vmballoon *b, unsigned int num_pages,
 }
 
 static int vmballoon_unlock_batched_page(struct vmballoon *b,
-				unsigned int num_pages, unsigned int *target)
+				unsigned int num_pages, bool is_2m_pages,
+				unsigned int *target)
 {
 	int locked, i, ret = 0;
 	bool hv_success;
+	u16 size_per_page = vmballoon_page_size(is_2m_pages);
 
-	hv_success = vmballoon_send_batched_unlock(b, num_pages, target);
+	hv_success = vmballoon_send_batched_unlock(b, num_pages, is_2m_pages,
+			target);
 	if (!hv_success)
 		ret = -EIO;
 
 	for (i = 0; i < num_pages; i++) {
 		u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
 		struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
+		struct vmballoon_page_size *page_size =
+				&b->page_sizes[is_2m_pages];
 
 		locked = vmballoon_batch_get_status(b->batch_page, i);
 		if (!hv_success || locked != VMW_BALLOON_SUCCESS) {
@@ -643,14 +728,14 @@ static int vmballoon_unlock_batched_page(struct vmballoon *b,
 			 * hypervisor, re-add it to the list of pages owned by
 			 * the balloon driver.
 			 */
-			list_add(&p->lru, &b->pages);
+			list_add(&p->lru, &page_size->pages);
 		} else {
 			/* deallocate page */
-			__free_page(p);
-			STATS_INC(b->stats.free);
+			vmballoon_free_page(p, is_2m_pages);
+			STATS_INC(b->stats.free[is_2m_pages]);
 
 			/* update balloon size */
-			b->size--;
+			b->size -= size_per_page;
 		}
 	}
 
@@ -661,17 +746,20 @@ static int vmballoon_unlock_batched_page(struct vmballoon *b,
  * Release pages that were allocated while attempting to inflate the
  * balloon but were refused by the host for one reason or another.
  */
-static void vmballoon_release_refused_pages(struct vmballoon *b)
+static void vmballoon_release_refused_pages(struct vmballoon *b,
+		bool is_2m_pages)
 {
 	struct page *page, *next;
+	struct vmballoon_page_size *page_size =
+			&b->page_sizes[is_2m_pages];
 
-	list_for_each_entry_safe(page, next, &b->refused_pages, lru) {
+	list_for_each_entry_safe(page, next, &page_size->refused_pages, lru) {
 		list_del(&page->lru);
-		__free_page(page);
-		STATS_INC(b->stats.refused_free);
+		vmballoon_free_page(page, is_2m_pages);
+		STATS_INC(b->stats.refused_free[is_2m_pages]);
 	}
 
-	b->n_refused_pages = 0;
+	page_size->n_refused_pages = 0;
 }
 
 static void vmballoon_add_page(struct vmballoon *b, int idx, struct page *p)
@@ -698,6 +786,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 	unsigned int num_pages = 0;
 	int error = 0;
 	gfp_t flags = VMW_PAGE_ALLOC_NOSLEEP;
+	bool is_2m_pages;
 
 	pr_debug("%s - size: %d, target %d\n", __func__, b->size, b->target);
 
@@ -720,22 +809,46 @@ static void vmballoon_inflate(struct vmballoon *b)
 	 * Start with no sleep allocation rate which may be higher
 	 * than sleeping allocation rate.
 	 */
-	rate = b->slow_allocation_cycles ? b->rate_alloc : UINT_MAX;
+	if (b->slow_allocation_cycles) {
+		rate = b->rate_alloc;
+		is_2m_pages = false;
+	} else {
+		rate = UINT_MAX;
+		is_2m_pages =
+			b->supported_page_sizes == VMW_BALLOON_NUM_PAGE_SIZES;
+	}
 
 	pr_debug("%s - goal: %d, no-sleep rate: %u, sleep rate: %d\n",
 		 __func__, b->target - b->size, rate, b->rate_alloc);
 
 	while (!b->reset_required &&
-		b->size < b->target && num_pages < b->target - b->size) {
+		b->size + num_pages * vmballoon_page_size(is_2m_pages)
+		< b->target) {
 		struct page *page;
 
 		if (flags == VMW_PAGE_ALLOC_NOSLEEP)
-			STATS_INC(b->stats.alloc);
+			STATS_INC(b->stats.alloc[is_2m_pages]);
 		else
 			STATS_INC(b->stats.sleep_alloc);
 
-		page = alloc_page(flags);
+		page = vmballoon_alloc_page(flags, is_2m_pages);
 		if (!page) {
+			STATS_INC(b->stats.alloc_fail[is_2m_pages]);
+
+			if (is_2m_pages) {
+				b->ops->lock(b, num_pages, true, &b->target);
+
+				/*
+				 * ignore errors from locking as we now switch
+				 * to 4k pages and we might get different
+				 * errors.
+				 */
+
+				num_pages = 0;
+				is_2m_pages = false;
+				continue;
+			}
+
 			if (flags == VMW_PAGE_ALLOC_CANSLEEP) {
 				/*
 				 * CANSLEEP page allocation failed, so guest
@@ -747,7 +860,6 @@ static void vmballoon_inflate(struct vmballoon *b)
 				STATS_INC(b->stats.sleep_alloc_fail);
 				break;
 			}
-			STATS_INC(b->stats.alloc_fail);
 
 			/*
 			 * NOSLEEP page allocation failed, so the guest is
@@ -770,7 +882,8 @@ static void vmballoon_inflate(struct vmballoon *b)
 
 		b->ops->add_page(b, num_pages++, page);
 		if (num_pages == b->batch_max_pages) {
-			error = b->ops->lock(b, num_pages, &b->target);
+			error = b->ops->lock(b, num_pages, is_2m_pages,
+					&b->target);
 			num_pages = 0;
 			if (error)
 				break;
@@ -785,7 +898,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 	}
 
 	if (num_pages > 0)
-		b->ops->lock(b, num_pages, &b->target);
+		b->ops->lock(b, num_pages, is_2m_pages, &b->target);
 
 	/*
 	 * We reached our goal without failures so try increasing
@@ -799,7 +912,8 @@ static void vmballoon_inflate(struct vmballoon *b)
 			    VMW_BALLOON_RATE_ALLOC_MAX);
 	}
 
-	vmballoon_release_refused_pages(b);
+	vmballoon_release_refused_pages(b, true);
+	vmballoon_release_refused_pages(b, false);
 }
 
 /*
@@ -807,34 +921,45 @@ static void vmballoon_inflate(struct vmballoon *b)
  */
 static void vmballoon_deflate(struct vmballoon *b)
 {
-	struct page *page, *next;
-	unsigned int i = 0;
-	unsigned int num_pages = 0;
-	int error;
+	unsigned is_2m_pages;
 
 	pr_debug("%s - size: %d, target %d\n", __func__, b->size, b->target);
 
 	/* free pages to reach target */
-	list_for_each_entry_safe(page, next, &b->pages, lru) {
-		list_del(&page->lru);
-		b->ops->add_page(b, num_pages++, page);
+	for (is_2m_pages = 0; is_2m_pages < b->supported_page_sizes;
+			is_2m_pages++) {
+		struct page *page, *next;
+		unsigned int num_pages = 0;
+		struct vmballoon_page_size *page_size =
+				&b->page_sizes[is_2m_pages];
+
+		list_for_each_entry_safe(page, next, &page_size->pages, lru) {
+			if (b->reset_required ||
+				(b->target > 0 &&
+					b->size - num_pages
+					* vmballoon_page_size(is_2m_pages)
+				< b->target + vmballoon_page_size(true)))
+				break;
 
+			list_del(&page->lru);
+			b->ops->add_page(b, num_pages++, page);
 
-		if (num_pages == b->batch_max_pages) {
-			error = b->ops->unlock(b, num_pages, &b->target);
-			num_pages = 0;
-			if (error)
-				return;
-		}
+			if (num_pages == b->batch_max_pages) {
+				int error;
 
-		if (b->reset_required || ++i >= b->size - b->target)
-			break;
+				error = b->ops->unlock(b, num_pages,
+						is_2m_pages, &b->target);
+				num_pages = 0;
+				if (error)
+					return;
+			}
 
-		cond_resched();
-	}
+			cond_resched();
+		}
 
-	if (num_pages > 0)
-		b->ops->unlock(b, num_pages, &b->target);
+		if (num_pages > 0)
+			b->ops->unlock(b, num_pages, is_2m_pages, &b->target);
+	}
 }
 
 static const struct vmballoon_ops vmballoon_basic_ops = {
@@ -924,7 +1049,8 @@ static void vmballoon_work(struct work_struct *work)
 
 		if (b->size < target)
 			vmballoon_inflate(b);
-		else if (b->size > target)
+		else if (target == 0 ||
+				b->size > target + vmballoon_page_size(true))
 			vmballoon_deflate(b);
 	}
 
@@ -968,24 +1094,35 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 		   "timer:              %8u\n"
 		   "start:              %8u (%4u failed)\n"
 		   "guestType:          %8u (%4u failed)\n"
+		   "2m-lock:            %8u (%4u failed)\n"
 		   "lock:               %8u (%4u failed)\n"
+		   "2m-unlock:          %8u (%4u failed)\n"
 		   "unlock:             %8u (%4u failed)\n"
 		   "target:             %8u (%4u failed)\n"
+		   "prim2mAlloc:        %8u (%4u failed)\n"
 		   "primNoSleepAlloc:   %8u (%4u failed)\n"
 		   "primCanSleepAlloc:  %8u (%4u failed)\n"
+		   "prim2mFree:         %8u\n"
 		   "primFree:           %8u\n"
+		   "err2mAlloc:         %8u\n"
 		   "errAlloc:           %8u\n"
+		   "err2mFree:          %8u\n"
 		   "errFree:            %8u\n",
 		   stats->timer,
 		   stats->start, stats->start_fail,
 		   stats->guest_type, stats->guest_type_fail,
-		   stats->lock,  stats->lock_fail,
-		   stats->unlock, stats->unlock_fail,
+		   stats->lock[true],  stats->lock_fail[true],
+		   stats->lock[false],  stats->lock_fail[false],
+		   stats->unlock[true], stats->unlock_fail[true],
+		   stats->unlock[false], stats->unlock_fail[false],
 		   stats->target, stats->target_fail,
-		   stats->alloc, stats->alloc_fail,
+		   stats->alloc[true], stats->alloc_fail[true],
+		   stats->alloc[false], stats->alloc_fail[false],
 		   stats->sleep_alloc, stats->sleep_alloc_fail,
-		   stats->free,
-		   stats->refused_alloc, stats->refused_free);
+		   stats->free[true],
+		   stats->free[false],
+		   stats->refused_alloc[true], stats->refused_alloc[false],
+		   stats->refused_free[true], stats->refused_free[false]);
 
 	return 0;
 }
@@ -1039,7 +1176,7 @@ static inline void vmballoon_debugfs_exit(struct vmballoon *b)
 static int __init vmballoon_init(void)
 {
 	int error;
-
+	unsigned is_2m_pages;
 	/*
 	 * Check if we are running on VMware's hypervisor and bail out
 	 * if we are not.
@@ -1047,8 +1184,11 @@ static int __init vmballoon_init(void)
 	if (x86_hyper != &x86_hyper_vmware)
 		return -ENODEV;
 
-	INIT_LIST_HEAD(&balloon.pages);
-	INIT_LIST_HEAD(&balloon.refused_pages);
+	for (is_2m_pages = 0; is_2m_pages < VMW_BALLOON_NUM_PAGE_SIZES;
+			is_2m_pages++) {
+		INIT_LIST_HEAD(&balloon.page_sizes[is_2m_pages].pages);
+		INIT_LIST_HEAD(&balloon.page_sizes[is_2m_pages].refused_pages);
+	}
 
 	/* initialize rates */
 	balloon.rate_alloc = VMW_BALLOON_RATE_ALLOC_MAX;
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v3 8/9] VMware balloon: Treat init like reset
  2015-06-12 16:15               ` dmitry.torokhov
                                   ` (7 preceding siblings ...)
  2015-06-12 18:43                 ` [PATCH v3 7/9] VMware balloon: Support 2m page ballooning Philip P. Moltmann
@ 2015-06-12 18:43                 ` Philip P. Moltmann
  2015-06-12 18:43                 ` [PATCH v3 9/9] VMware balloon: Enable notification via VMCI Philip P. Moltmann
  9 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-06-12 18:43 UTC (permalink / raw)
  To: dmitry.torokhov
  Cc: Philip P. Moltmann, gregkh, linux-kernel, xdeguillard, akpm, pv-drivers

Unify the behavior of the first start of the balloon and a reset. Also on
unload, declare that the balloon driver does not have any capabilities
anymore.

Acked-by: Andy King <acking@vmware.com>
Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
---
 drivers/misc/vmw_balloon.c | 53 ++++++++++++++++------------------------------
 1 file changed, 18 insertions(+), 35 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index cbaf329..cc4953d 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -46,7 +46,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.4.0.0-k");
+MODULE_VERSION("1.4.1.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -563,12 +563,14 @@ static void vmballoon_pop(struct vmballoon *b)
 		}
 	}
 
-	if ((b->capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
-		if (b->batch_page)
-			vunmap(b->batch_page);
+	if (b->batch_page) {
+		vunmap(b->batch_page);
+		b->batch_page = NULL;
+	}
 
-		if (b->page)
-			__free_page(b->page);
+	if (b->page) {
+		__free_page(b->page);
+		b->page = NULL;
 	}
 }
 
@@ -1043,7 +1045,7 @@ static void vmballoon_work(struct work_struct *work)
 	if (b->slow_allocation_cycles > 0)
 		b->slow_allocation_cycles--;
 
-	if (vmballoon_send_get_target(b, &target)) {
+	if (!b->reset_required && vmballoon_send_get_target(b, &target)) {
 		/* update target, adjust size */
 		b->target = target;
 
@@ -1075,8 +1077,10 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 	/* format capabilities info */
 	seq_printf(f,
 		   "balloon capabilities:   %#4x\n"
-		   "used capabilities:      %#4lx\n",
-		   VMW_BALLOON_CAPABILITIES, b->capabilities);
+		   "used capabilities:      %#4lx\n"
+		   "is resetting:           %c\n",
+		   VMW_BALLOON_CAPABILITIES, b->capabilities,
+		   b->reset_required ? 'y' : 'n');
 
 	/* format size info */
 	seq_printf(f,
@@ -1195,35 +1199,14 @@ static int __init vmballoon_init(void)
 
 	INIT_DELAYED_WORK(&balloon.dwork, vmballoon_work);
 
-	/*
-	 * Start balloon.
-	 */
-	if (!vmballoon_send_start(&balloon, VMW_BALLOON_CAPABILITIES)) {
-		pr_err("failed to send start command to the host\n");
-		return -EIO;
-	}
-
-	if ((balloon.capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
-		balloon.ops = &vmballoon_batched_ops;
-		balloon.batch_max_pages = VMW_BALLOON_BATCH_MAX_PAGES;
-		if (!vmballoon_init_batching(&balloon)) {
-			pr_err("failed to init batching\n");
-			return -EIO;
-		}
-	} else if ((balloon.capabilities & VMW_BALLOON_BASIC_CMDS) != 0) {
-		balloon.ops = &vmballoon_basic_ops;
-		balloon.batch_max_pages = 1;
-	}
-
-	if (!vmballoon_send_guest_id(&balloon)) {
-		pr_err("failed to send guest ID to the host\n");
-		return -EIO;
-	}
-
 	error = vmballoon_debugfs_init(&balloon);
 	if (error)
 		return error;
 
+	balloon.batch_page = NULL;
+	balloon.page = NULL;
+	balloon.reset_required = true;
+
 	queue_delayed_work(system_freezable_wq, &balloon.dwork, 0);
 
 	return 0;
@@ -1241,7 +1224,7 @@ static void __exit vmballoon_exit(void)
 	 * Reset connection before deallocating memory to avoid potential for
 	 * additional spurious resets from guest touching deallocated pages.
 	 */
-	vmballoon_send_start(&balloon, VMW_BALLOON_CAPABILITIES);
+	vmballoon_send_start(&balloon, 0);
 	vmballoon_pop(&balloon);
 }
 module_exit(vmballoon_exit);
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v3 9/9] VMware balloon: Enable notification via VMCI
  2015-06-12 16:15               ` dmitry.torokhov
                                   ` (8 preceding siblings ...)
  2015-06-12 18:43                 ` [PATCH v3 8/9] VMware balloon: Treat init like reset Philip P. Moltmann
@ 2015-06-12 18:43                 ` Philip P. Moltmann
  9 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-06-12 18:43 UTC (permalink / raw)
  To: dmitry.torokhov
  Cc: Philip P. Moltmann, gregkh, linux-kernel, xdeguillard, akpm, pv-drivers

Get notified immediately when a balloon target is set, instead of waiting for
up to one second.

The up-to-one-second gap could be long enough to cause swapping inside of the
VM that receives the balloon target.

Acked-by: Andy King <acking@vmware.com>
Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
Tested-by: Siva Sankar Reddy B <sankars@vmware.com>
---
 drivers/misc/Kconfig       |   2 +-
 drivers/misc/vmw_balloon.c | 105 +++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 97 insertions(+), 10 deletions(-)

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 006242c..1c075b7 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -404,7 +404,7 @@ config TI_DAC7512
 
 config VMWARE_BALLOON
 	tristate "VMware Balloon Driver"
-	depends on X86 && HYPERVISOR_GUEST
+	depends on VMWARE_VMCI && X86 && HYPERVISOR_GUEST
 	help
 	  This is VMware physical memory management driver which acts
 	  like a "balloon" that can be inflated to reclaim physical pages
diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index cc4953d..cfe7655 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -1,7 +1,7 @@
 /*
  * VMware Balloon driver.
  *
- * Copyright (C) 2000-2013, VMware, Inc. All Rights Reserved.
+ * Copyright (C) 2000-2014, VMware, Inc. All Rights Reserved.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms of the GNU General Public License as published by the
@@ -42,11 +42,13 @@
 #include <linux/workqueue.h>
 #include <linux/debugfs.h>
 #include <linux/seq_file.h>
+#include <linux/vmw_vmci_defs.h>
+#include <linux/vmw_vmci_api.h>
 #include <asm/hypervisor.h>
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.4.1.0-k");
+MODULE_VERSION("1.5.0.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -100,14 +102,16 @@ enum vmwballoon_capabilities {
 	/*
 	 * Bit 0 is reserved and not associated to any capability.
 	 */
-	VMW_BALLOON_BASIC_CMDS		= (1 << 1),
-	VMW_BALLOON_BATCHED_CMDS	= (1 << 2),
-	VMW_BALLOON_BATCHED_2M_CMDS     = (1 << 3),
+	VMW_BALLOON_BASIC_CMDS			= (1 << 1),
+	VMW_BALLOON_BATCHED_CMDS		= (1 << 2),
+	VMW_BALLOON_BATCHED_2M_CMDS		= (1 << 3),
+	VMW_BALLOON_SIGNALLED_WAKEUP_CMD	= (1 << 4),
 };
 
 #define VMW_BALLOON_CAPABILITIES	(VMW_BALLOON_BASIC_CMDS \
 					| VMW_BALLOON_BATCHED_CMDS \
-					| VMW_BALLOON_BATCHED_2M_CMDS)
+					| VMW_BALLOON_BATCHED_2M_CMDS \
+					| VMW_BALLOON_SIGNALLED_WAKEUP_CMD)
 
 #define VMW_BALLOON_2M_SHIFT		(9)
 #define VMW_BALLOON_NUM_PAGE_SIZES	(2)
@@ -122,7 +126,9 @@ enum vmwballoon_capabilities {
  * VMW_BALLOON_BATCHED_CMDS:
  *	BATCHED_LOCK and BATCHED_UNLOCK commands.
  * VMW BALLOON_BATCHED_2M_CMDS:
- *	BATCHED_2M_LOCK and BATCHED_2M_UNLOCK commands.
+ *	BATCHED_2M_LOCK and BATCHED_2M_UNLOCK commands,
+ * VMW VMW_BALLOON_SIGNALLED_WAKEUP_CMD:
+ *	VMW_BALLOON_CMD_VMCI_DOORBELL_SET command.
  */
 #define VMW_BALLOON_CMD_START			0
 #define VMW_BALLOON_CMD_GET_TARGET		1
@@ -133,6 +139,7 @@ enum vmwballoon_capabilities {
 #define VMW_BALLOON_CMD_BATCHED_UNLOCK		7
 #define VMW_BALLOON_CMD_BATCHED_2M_LOCK		8
 #define VMW_BALLOON_CMD_BATCHED_2M_UNLOCK	9
+#define VMW_BALLOON_CMD_VMCI_DOORBELL_SET	10
 
 
 /* error codes */
@@ -213,6 +220,7 @@ static void vmballoon_batch_set_pa(struct vmballoon_batch_page *batch, int idx,
 #ifdef CONFIG_DEBUG_FS
 struct vmballoon_stats {
 	unsigned int timer;
+	unsigned int doorbell;
 
 	/* allocation statistics */
 	unsigned int alloc[VMW_BALLOON_NUM_PAGE_SIZES];
@@ -234,6 +242,8 @@ struct vmballoon_stats {
 	unsigned int start_fail;
 	unsigned int guest_type;
 	unsigned int guest_type_fail;
+	unsigned int doorbell_set;
+	unsigned int doorbell_unset;
 };
 
 #define STATS_INC(stat) (stat)++
@@ -298,6 +308,8 @@ struct vmballoon {
 	struct sysinfo sysinfo;
 
 	struct delayed_work dwork;
+
+	struct vmci_handle vmci_doorbell;
 };
 
 static struct vmballoon balloon;
@@ -992,12 +1004,75 @@ static bool vmballoon_init_batching(struct vmballoon *b)
 }
 
 /*
+ * Receive notification and resize balloon
+ */
+static void vmballoon_doorbell(void *client_data)
+{
+	struct vmballoon *b = client_data;
+
+	STATS_INC(b->stats.doorbell);
+
+	mod_delayed_work(system_freezable_wq, &b->dwork, 0);
+}
+
+/*
+ * Clean up vmci doorbell
+ */
+static void vmballoon_vmci_cleanup(struct vmballoon *b)
+{
+	int error;
+
+	VMWARE_BALLOON_CMD(VMCI_DOORBELL_SET, VMCI_INVALID_ID,
+			VMCI_INVALID_ID, error);
+	STATS_INC(b->stats.doorbell_unset);
+
+	if (!vmci_handle_is_invalid(b->vmci_doorbell)) {
+		vmci_doorbell_destroy(b->vmci_doorbell);
+		b->vmci_doorbell = VMCI_INVALID_HANDLE;
+	}
+}
+
+/*
+ * Initialize vmci doorbell, to get notified as soon as balloon changes
+ */
+static int vmballoon_vmci_init(struct vmballoon *b)
+{
+	int error = 0;
+
+	if ((b->capabilities & VMW_BALLOON_SIGNALLED_WAKEUP_CMD) != 0) {
+		error = vmci_doorbell_create(&b->vmci_doorbell,
+				VMCI_FLAG_DELAYED_CB,
+				VMCI_PRIVILEGE_FLAG_RESTRICTED,
+				vmballoon_doorbell, b);
+
+		if (error == VMCI_SUCCESS) {
+			VMWARE_BALLOON_CMD(VMCI_DOORBELL_SET,
+					b->vmci_doorbell.context,
+					b->vmci_doorbell.resource, error);
+			STATS_INC(b->stats.doorbell_set);
+		}
+	}
+
+	if (error != 0) {
+		vmballoon_vmci_cleanup(b);
+
+		return -EIO;
+	}
+
+	return 0;
+}
+
+/*
  * Perform standard reset sequence by popping the balloon (in case it
  * is not  empty) and then restarting protocol. This operation normally
  * happens when host responds with VMW_BALLOON_ERROR_RESET to a command.
  */
 static void vmballoon_reset(struct vmballoon *b)
 {
+	int error;
+
+	vmballoon_vmci_cleanup(b);
+
 	/* free all pages, skipping monitor unlock */
 	vmballoon_pop(b);
 
@@ -1023,6 +1098,11 @@ static void vmballoon_reset(struct vmballoon *b)
 	}
 
 	b->reset_required = false;
+
+	error = vmballoon_vmci_init(b);
+	if (error)
+		pr_err("failed to initialize vmci doorbell\n");
+
 	if (!vmballoon_send_guest_id(b))
 		pr_err("failed to send guest ID to the host\n");
 }
@@ -1096,6 +1176,7 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 	seq_printf(f,
 		   "\n"
 		   "timer:              %8u\n"
+		   "doorbell:           %8u\n"
 		   "start:              %8u (%4u failed)\n"
 		   "guestType:          %8u (%4u failed)\n"
 		   "2m-lock:            %8u (%4u failed)\n"
@@ -1111,8 +1192,11 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 		   "err2mAlloc:         %8u\n"
 		   "errAlloc:           %8u\n"
 		   "err2mFree:          %8u\n"
-		   "errFree:            %8u\n",
+		   "errFree:            %8u\n"
+		   "doorbellSet:        %8u\n"
+		   "doorbellUnset:      %8u\n",
 		   stats->timer,
+		   stats->doorbell,
 		   stats->start, stats->start_fail,
 		   stats->guest_type, stats->guest_type_fail,
 		   stats->lock[true],  stats->lock_fail[true],
@@ -1126,7 +1210,8 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 		   stats->free[true],
 		   stats->free[false],
 		   stats->refused_alloc[true], stats->refused_alloc[false],
-		   stats->refused_free[true], stats->refused_free[false]);
+		   stats->refused_free[true], stats->refused_free[false],
+		   stats->doorbell_set, stats->doorbell_unset);
 
 	return 0;
 }
@@ -1203,6 +1288,7 @@ static int __init vmballoon_init(void)
 	if (error)
 		return error;
 
+	balloon.vmci_doorbell = VMCI_INVALID_HANDLE;
 	balloon.batch_page = NULL;
 	balloon.page = NULL;
 	balloon.reset_required = true;
@@ -1215,6 +1301,7 @@ module_init(vmballoon_init);
 
 static void __exit vmballoon_exit(void)
 {
+	vmballoon_vmci_cleanup(&balloon);
 	cancel_delayed_work_sync(&balloon.dwork);
 
 	vmballoon_debugfs_exit(&balloon);
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH v3 5/9] VMware balloon: Show capabilities of balloon and resulting capabilities in the debug-fs node.
  2015-06-12 18:43                 ` [PATCH v3 5/9] VMware balloon: Show capabilities of balloon and resulting capabilities in the debug-fs node Philip P. Moltmann
@ 2015-08-05 20:14                   ` Greg KH
  2015-08-05 20:22                     ` Philip Moltmann
  0 siblings, 1 reply; 61+ messages in thread
From: Greg KH @ 2015-08-05 20:14 UTC (permalink / raw)
  To: Philip P. Moltmann
  Cc: dmitry.torokhov, linux-kernel, xdeguillard, akpm, pv-drivers

On Fri, Jun 12, 2015 at 11:43:26AM -0700, Philip P. Moltmann wrote:
> This helps with debugging vmw_balloon behavior, as it is clear what
> functionality is enabled.
> 
> Acked-by: Andy King <acking@vmware.com>
> Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
> ---
>  drivers/misc/vmw_balloon.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
> index 72247d9..6eaf7f7 100644
> --- a/drivers/misc/vmw_balloon.c
> +++ b/drivers/misc/vmw_balloon.c
> @@ -46,7 +46,7 @@
>  
>  MODULE_AUTHOR("VMware, Inc.");
>  MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
> -MODULE_VERSION("1.3.2.0-k");
> +MODULE_VERSION("1.3.3.0-k");

This constant change of module version is annoying, is it really even
needed?

I'll take this, but seriously consider just dropping it entirely as it
doesn't mean anything now that the driver is in the kernel tree.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v3 3/9] VMware balloon: add batching to the vmw_balloon.
  2015-06-12 18:43                 ` [PATCH v3 3/9] VMware balloon: add batching to the vmw_balloon Philip P. Moltmann
@ 2015-08-05 20:19                   ` Greg KH
  2015-08-05 22:36                     ` [PATCH v4 " Philip P. Moltmann
  0 siblings, 1 reply; 61+ messages in thread
From: Greg KH @ 2015-08-05 20:19 UTC (permalink / raw)
  To: Philip P. Moltmann
  Cc: dmitry.torokhov, Xavier Deguillard, linux-kernel, akpm, pv-drivers

On Fri, Jun 12, 2015 at 11:43:24AM -0700, Philip P. Moltmann wrote:
> From: Xavier Deguillard <xdeguillard@vmware.com>
> 
> Introduce a new capability to the driver that allows sending 512 pages in
> one hypervisor call. This reduces the cost of the driver when reclaiming
> memory.
> 
> Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
> Acked-by: Dmitry Torokhov <dtor@vmware.com>
> Signed-off-by: Philip P. Moltmann <moltmann@vmware.com>
> ---
>  drivers/misc/vmw_balloon.c | 405 +++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 352 insertions(+), 53 deletions(-)

This patch causes build errors on my system about vmap and vunmap not
being defined :(

I can't take it, so I've stopped here, sorry.  Please always test your
patches properly.

greg k-h

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v3 5/9] VMware balloon: Show capabilities of balloon and resulting capabilities in the debug-fs node.
  2015-08-05 20:14                   ` Greg KH
@ 2015-08-05 20:22                     ` Philip Moltmann
  2015-08-05 20:33                       ` dmitry.torokhov
  2015-08-05 20:40                       ` gregkh
  0 siblings, 2 replies; 61+ messages in thread
From: Philip Moltmann @ 2015-08-05 20:22 UTC (permalink / raw)
  To: gregkh
  Cc: dmitry.torokhov, linux-kernel, pv-drivers, Xavier Deguillard,
	John Savanyo, akpm


Hi,

> >  MODULE_AUTHOR("VMware, Inc.");
> >  MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
> > -MODULE_VERSION("1.3.2.0-k");
> > +MODULE_VERSION("1.3.3.0-k");
> 
> This constant change of module version is annoying, is it really even
> needed?
> 
> I'll take this, but seriously consider just dropping it entirely as 
> it
> doesn't mean anything now that the driver is in the kernel tree.

I think this is meant so that we can track which patches got backported
into RHEL and SLES.

CC-ing John as the policy comes from him.

Philip

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v3 5/9] VMware balloon: Show capabilities of balloon and resulting capabilities in the debug-fs node.
  2015-08-05 20:22                     ` Philip Moltmann
@ 2015-08-05 20:33                       ` dmitry.torokhov
  2015-08-05 20:42                         ` John Savanyo
  2015-08-05 20:40                       ` gregkh
  1 sibling, 1 reply; 61+ messages in thread
From: dmitry.torokhov @ 2015-08-05 20:33 UTC (permalink / raw)
  To: Philip Moltmann
  Cc: gregkh, linux-kernel, pv-drivers, Xavier Deguillard, John Savanyo, akpm

On Wed, Aug 05, 2015 at 08:22:35PM +0000, Philip Moltmann wrote:
> Hi,
> 
> > >  MODULE_AUTHOR("VMware, Inc.");
> > >  MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
> > > -MODULE_VERSION("1.3.2.0-k");
> > > +MODULE_VERSION("1.3.3.0-k");
> > 
> > This constant change of module version is annoying, is it really even
> > needed?
> > 
> > I'll take this, but seriously consider just dropping it entirely as 
> > it
> > doesn't mean anything now that the driver is in the kernel tree.
> 
> I think this is meant so that we can track which patches got backported
> into RHEL and SLES.

That assumes that RHEL and SLES always take everything that is in
mainline, which I would not count on. I.e., if you have a security fix and
also change the version to 1.3.4.0-k and Red Hat picks it up, is the driver
that they have really 1.3.4.0-k? If not, then what?

You really need to keep track of the substance of the changes needing to
go into each distribution.

Thanks.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v3 5/9] VMware balloon: Show capabilities of balloon and resulting capabilities in the debug-fs node.
  2015-08-05 20:22                     ` Philip Moltmann
  2015-08-05 20:33                       ` dmitry.torokhov
@ 2015-08-05 20:40                       ` gregkh
  1 sibling, 0 replies; 61+ messages in thread
From: gregkh @ 2015-08-05 20:40 UTC (permalink / raw)
  To: Philip Moltmann
  Cc: dmitry.torokhov, linux-kernel, pv-drivers, Xavier Deguillard,
	John Savanyo, akpm

On Wed, Aug 05, 2015 at 08:22:35PM +0000, Philip Moltmann wrote:
> Hi,
> 
> > >  MODULE_AUTHOR("VMware, Inc.");
> > >  MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
> > > -MODULE_VERSION("1.3.2.0-k");
> > > +MODULE_VERSION("1.3.3.0-k");
> > 
> > This constant change of module version is annoying, is it really even
> > needed?
> > 
> > I'll take this, but seriously consider just dropping it entirely as 
> > it
> > doesn't mean anything now that the driver is in the kernel tree.
> 
> I think this is meant so that we can track which patches got backported
> into RHEL and SLES.

That guarantees all of those patches will conflict and the engineers
will curse your name.  Don't do that, it's horrid.  What happens if
someone picks one patch, and then skips one, and uses the third?  What
would be the "version" then?

Just drop it entirely, it's useless.  You have the source, so you know
what changes are done, no need to try to match it up with a random
number.

greg k-h

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v3 5/9] VMware balloon: Show capabilities of balloon and resulting capabilities in the debug-fs node.
  2015-08-05 20:33                       ` dmitry.torokhov
@ 2015-08-05 20:42                         ` John Savanyo
  2015-08-05 20:50                           ` gregkh
  0 siblings, 1 reply; 61+ messages in thread
From: John Savanyo @ 2015-08-05 20:42 UTC (permalink / raw)
  To: dmitry.torokhov, Philip Moltmann
  Cc: gregkh, linux-kernel, pv-drivers, Xavier Deguillard, akpm

I agree that version number tracking is not perfect. But it is valuable
for us to use as a rough indication that we have attempted to backport
some appropriate subset of patches from mainline to a sustaining distro
release without having to diff the code.

-John

On 8/5/15, 1:33 PM, "dmitry.torokhov@gmail.com"
<dmitry.torokhov@gmail.com> wrote:

>On Wed, Aug 05, 2015 at 08:22:35PM +0000, Philip Moltmann wrote:
>> Hi,
>> 
>> > >  MODULE_AUTHOR("VMware, Inc.");
>> > >  MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
>> > > -MODULE_VERSION("1.3.2.0-k");
>> > > +MODULE_VERSION("1.3.3.0-k");
>> > 
>> > This constant change of module version is annoying, is it really even
>> > needed?
>> > 
>> > I'll take this, but seriously consider just dropping it entirely as
>> > it
>> > doesn't mean anything now that the driver is in the kernel tree.
>> 
>> I think this is meant so that we can track which patches got backported
>> into RHEL and SLES.
>
>That assumes that RHEL and SLES always take everything that is in
>mainline, which I would not count. I.e if you have a security fix and
>also change version to 1.3.4.0-k and RedHat picks it up is the driver
>that they have really 1.3.4.0-k? If not then what?
>
>You really need to keep track of the substance of the changes needing to
>go into each distribution.
>
>Thanks.
>
>-- 
>Dmitry


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v3 5/9] VMware balloon: Show capabilities of balloon and resulting capabilities in the debug-fs node.
  2015-08-05 20:42                         ` John Savanyo
@ 2015-08-05 20:50                           ` gregkh
  2015-08-05 21:11                             ` John Savanyo
  0 siblings, 1 reply; 61+ messages in thread
From: gregkh @ 2015-08-05 20:50 UTC (permalink / raw)
  To: John Savanyo
  Cc: dmitry.torokhov, Philip Moltmann, linux-kernel, pv-drivers,
	Xavier Deguillard, akpm


A: No.
Q: Should I include quotations after my reply?

http://daringfireball.net/2007/07/on_top

On Wed, Aug 05, 2015 at 08:42:25PM +0000, John Savanyo wrote:
> I agree that version number tracking is not perfect. But it is valuable
> for us to use as a rough indication that we have attempted to back port
> some appropriate subset of patches from mainline to a sustaining distro
> release without having to diff the code.

You have to always diff the code anyway, you can't trust that number,
see my other email as to why.

greg k-h

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v3 5/9] VMware balloon: Show capabilities of balloon and resulting capabilities in the debug-fs node.
  2015-08-05 20:50                           ` gregkh
@ 2015-08-05 21:11                             ` John Savanyo
  0 siblings, 0 replies; 61+ messages in thread
From: John Savanyo @ 2015-08-05 21:11 UTC (permalink / raw)
  To: gregkh
  Cc: dmitry.torokhov, Philip Moltmann, linux-kernel, pv-drivers,
	Xavier Deguillard, akpm


On 8/5/15, 1:50 PM, "gregkh@linuxfoundation.org"
<gregkh@linuxfoundation.org> wrote:

>You have to always diff the code anyway, you can't trust that number,
>see my other email as to why.
>
>greg k-h

I agree that we need to look at the source code to 100% understand the
status of a driver. However, if VMware has a practice of bumping the
version number for our contributions to help with some of our internal
processes, then I don't see any harm in allowing this practice to
continue. We are not asking you personally to interpret these version
numbers in any way. So it should be of little consequence to you to just
accept them as part of our contributions in the future.

Thanks,
John


^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v4 3/9] VMware balloon: add batching to the vmw_balloon.
  2015-08-05 20:19                   ` Greg KH
@ 2015-08-05 22:36                     ` Philip P. Moltmann
  2015-08-05 22:44                       ` Greg KH
  0 siblings, 1 reply; 61+ messages in thread
From: Philip P. Moltmann @ 2015-08-05 22:36 UTC (permalink / raw)
  To: gregkh
  Cc: Xavier Deguillard, dmitry.torokhov, linux-kernel, akpm,
	pv-drivers, Philip P. Moltmann

From: Xavier Deguillard <xdeguillard@vmware.com>

Introduce a new capability to the driver that allows sending 512 pages in
one hypervisor call. This reduces the cost of the driver when reclaiming
memory.

Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
Acked-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Philip P. Moltmann <moltmann@vmware.com>
---
 drivers/misc/vmw_balloon.c | 406 +++++++++++++++++++++++++++++++++++++++------
 1 file changed, 353 insertions(+), 53 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index ffb5634..64f275e 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -1,7 +1,7 @@
 /*
  * VMware Balloon driver.
  *
- * Copyright (C) 2000-2010, VMware, Inc. All Rights Reserved.
+ * Copyright (C) 2000-2013, VMware, Inc. All Rights Reserved.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms of the GNU General Public License as published by the
@@ -37,6 +37,7 @@
 #include <linux/types.h>
 #include <linux/kernel.h>
 #include <linux/mm.h>
+#include <linux/vmalloc.h>
 #include <linux/sched.h>
 #include <linux/module.h>
 #include <linux/workqueue.h>
@@ -46,7 +47,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.3.0.0-k");
+MODULE_VERSION("1.3.1.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -120,13 +121,26 @@ enum vmwballoon_capabilities {
 	VMW_BALLOON_BATCHED_CMDS	= (1 << 2)
 };
 
-#define VMW_BALLOON_CAPABILITIES	(VMW_BALLOON_BASIC_CMDS)
+#define VMW_BALLOON_CAPABILITIES	(VMW_BALLOON_BASIC_CMDS \
+					| VMW_BALLOON_BATCHED_CMDS)
 
+/*
+ * Backdoor commands availability:
+ *
+ * START, GET_TARGET and GUEST_ID are always available,
+ *
+ * VMW_BALLOON_BASIC_CMDS:
+ *	LOCK and UNLOCK commands,
+ * VMW_BALLOON_BATCHED_CMDS:
+ *	BATCHED_LOCK and BATCHED_UNLOCK commands.
+ */
 #define VMW_BALLOON_CMD_START		0
 #define VMW_BALLOON_CMD_GET_TARGET	1
 #define VMW_BALLOON_CMD_LOCK		2
 #define VMW_BALLOON_CMD_UNLOCK		3
 #define VMW_BALLOON_CMD_GUEST_ID	4
+#define VMW_BALLOON_CMD_BATCHED_LOCK	6
+#define VMW_BALLOON_CMD_BATCHED_UNLOCK	7
 
 /* error codes */
 #define VMW_BALLOON_SUCCESS		        0
@@ -142,18 +156,63 @@ enum vmwballoon_capabilities {
 
 #define VMW_BALLOON_SUCCESS_WITH_CAPABILITIES	(0x03000000)
 
-#define VMWARE_BALLOON_CMD(cmd, data, result)			\
+/* Batch page description */
+
+/*
+ * Layout of a page in the batch page:
+ *
+ * +-------------+----------+--------+
+ * |             |          |        |
+ * | Page number | Reserved | Status |
+ * |             |          |        |
+ * +-------------+----------+--------+
+ * 64  PAGE_SHIFT          6         0
+ *
+ * For now only 4K pages are supported, but we can easily support large pages
+ * by using bits in the reserved field.
+ *
+ * The reserved field should be set to 0.
+ */
+#define VMW_BALLOON_BATCH_MAX_PAGES	(PAGE_SIZE / sizeof(u64))
+#define VMW_BALLOON_BATCH_STATUS_MASK	((1UL << 5) - 1)
+#define VMW_BALLOON_BATCH_PAGE_MASK	(~((1UL << PAGE_SHIFT) - 1))
+
+struct vmballoon_batch_page {
+	u64 pages[VMW_BALLOON_BATCH_MAX_PAGES];
+};
+
+static u64 vmballoon_batch_get_pa(struct vmballoon_batch_page *batch, int idx)
+{
+	return batch->pages[idx] & VMW_BALLOON_BATCH_PAGE_MASK;
+}
+
+static int vmballoon_batch_get_status(struct vmballoon_batch_page *batch,
+				int idx)
+{
+	return (int)(batch->pages[idx] & VMW_BALLOON_BATCH_STATUS_MASK);
+}
+
+static void vmballoon_batch_set_pa(struct vmballoon_batch_page *batch, int idx,
+				u64 pa)
+{
+	batch->pages[idx] = pa;
+}
+
+
+#define VMWARE_BALLOON_CMD(cmd, arg1, arg2, result)		\
 ({								\
-	unsigned long __status, __dummy1, __dummy2;		\
+	unsigned long __status, __dummy1, __dummy2, __dummy3;	\
 	__asm__ __volatile__ ("inl %%dx" :			\
 		"=a"(__status),					\
 		"=c"(__dummy1),					\
 		"=d"(__dummy2),					\
-		"=b"(result) :					\
+		"=b"(result),					\
+		"=S" (__dummy3) :				\
 		"0"(VMW_BALLOON_HV_MAGIC),			\
 		"1"(VMW_BALLOON_CMD_##cmd),			\
 		"2"(VMW_BALLOON_HV_PORT),			\
-		"3"(data) :					\
+		"3"(arg1),					\
+		"4" (arg2) :					\
 		"memory");					\
 	if (VMW_BALLOON_CMD_##cmd == VMW_BALLOON_CMD_START)	\
 		result = __dummy1;				\
@@ -192,6 +251,14 @@ struct vmballoon_stats {
 #define STATS_INC(stat)
 #endif
 
+struct vmballoon;
+
+struct vmballoon_ops {
+	void (*add_page)(struct vmballoon *b, int idx, struct page *p);
+	int (*lock)(struct vmballoon *b, unsigned int num_pages);
+	int (*unlock)(struct vmballoon *b, unsigned int num_pages);
+};
+
 struct vmballoon {
 
 	/* list of reserved physical pages */
@@ -215,6 +282,14 @@ struct vmballoon {
 	/* slowdown page allocations for next few cycles */
 	unsigned int slow_allocation_cycles;
 
+	unsigned long capabilities;
+
+	struct vmballoon_batch_page *batch_page;
+	unsigned int batch_max_pages;
+	struct page *page;
+
+	const struct vmballoon_ops *ops;
+
 #ifdef CONFIG_DEBUG_FS
 	/* statistics */
 	struct vmballoon_stats stats;
@@ -234,16 +309,22 @@ static struct vmballoon balloon;
  * Send "start" command to the host, communicating supported version
  * of the protocol.
  */
-static bool vmballoon_send_start(struct vmballoon *b)
+static bool vmballoon_send_start(struct vmballoon *b, unsigned long req_caps)
 {
-	unsigned long status, capabilities;
+	unsigned long status, capabilities, dummy = 0;
 
 	STATS_INC(b->stats.start);
 
-	status = VMWARE_BALLOON_CMD(START, VMW_BALLOON_CAPABILITIES,
-				capabilities);
-	if (status == VMW_BALLOON_SUCCESS)
+	status = VMWARE_BALLOON_CMD(START, req_caps, dummy, capabilities);
+
+	switch (status) {
+	case VMW_BALLOON_SUCCESS_WITH_CAPABILITIES:
+		b->capabilities = capabilities;
+		return true;
+	case VMW_BALLOON_SUCCESS:
+		b->capabilities = VMW_BALLOON_BASIC_CMDS;
 		return true;
+	}
 
 	pr_debug("%s - failed, hv returns %ld\n", __func__, status);
 	STATS_INC(b->stats.start_fail);
@@ -273,9 +354,10 @@ static bool vmballoon_check_status(struct vmballoon *b, unsigned long status)
  */
 static bool vmballoon_send_guest_id(struct vmballoon *b)
 {
-	unsigned long status, dummy;
+	unsigned long status, dummy = 0;
 
-	status = VMWARE_BALLOON_CMD(GUEST_ID, VMW_BALLOON_GUEST_ID, dummy);
+	status = VMWARE_BALLOON_CMD(GUEST_ID, VMW_BALLOON_GUEST_ID, dummy,
+				dummy);
 
 	STATS_INC(b->stats.guest_type);
 
@@ -295,6 +377,7 @@ static bool vmballoon_send_get_target(struct vmballoon *b, u32 *new_target)
 	unsigned long status;
 	unsigned long target;
 	unsigned long limit;
+	unsigned long dummy = 0;
 	u32 limit32;
 
 	/*
@@ -313,7 +396,7 @@ static bool vmballoon_send_get_target(struct vmballoon *b, u32 *new_target)
 	/* update stats */
 	STATS_INC(b->stats.target);
 
-	status = VMWARE_BALLOON_CMD(GET_TARGET, limit, target);
+	status = VMWARE_BALLOON_CMD(GET_TARGET, limit, dummy, target);
 	if (vmballoon_check_status(b, status)) {
 		*new_target = target;
 		return true;
@@ -332,7 +415,7 @@ static bool vmballoon_send_get_target(struct vmballoon *b, u32 *new_target)
 static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 				     unsigned int *hv_status)
 {
-	unsigned long status, dummy;
+	unsigned long status, dummy = 0;
 	u32 pfn32;
 
 	pfn32 = (u32)pfn;
@@ -341,7 +424,7 @@ static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 
 	STATS_INC(b->stats.lock);
 
-	*hv_status = status = VMWARE_BALLOON_CMD(LOCK, pfn, dummy);
+	*hv_status = status = VMWARE_BALLOON_CMD(LOCK, pfn, dummy, dummy);
 	if (vmballoon_check_status(b, status))
 		return 0;
 
@@ -350,13 +433,30 @@ static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 	return 1;
 }
 
+static int vmballoon_send_batched_lock(struct vmballoon *b,
+					unsigned int num_pages)
+{
+	unsigned long status, dummy;
+	unsigned long pfn = page_to_pfn(b->page);
+
+	STATS_INC(b->stats.lock);
+
+	status = VMWARE_BALLOON_CMD(BATCHED_LOCK, pfn, num_pages, dummy);
+	if (vmballoon_check_status(b, status))
+		return 0;
+
+	pr_debug("%s - batch ppn %lx, hv returns %ld\n", __func__, pfn, status);
+	STATS_INC(b->stats.lock_fail);
+	return 1;
+}
+
 /*
  * Notify the host that guest intends to release given page back into
  * the pool of available (to the guest) pages.
  */
 static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
 {
-	unsigned long status, dummy;
+	unsigned long status, dummy = 0;
 	u32 pfn32;
 
 	pfn32 = (u32)pfn;
@@ -365,7 +465,7 @@ static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
 
 	STATS_INC(b->stats.unlock);
 
-	status = VMWARE_BALLOON_CMD(UNLOCK, pfn, dummy);
+	status = VMWARE_BALLOON_CMD(UNLOCK, pfn, dummy, dummy);
 	if (vmballoon_check_status(b, status))
 		return true;
 
@@ -374,6 +474,23 @@ static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
 	return false;
 }
 
+static bool vmballoon_send_batched_unlock(struct vmballoon *b,
+						unsigned int num_pages)
+{
+	unsigned long status, dummy;
+	unsigned long pfn = page_to_pfn(b->page);
+
+	STATS_INC(b->stats.unlock);
+
+	status = VMWARE_BALLOON_CMD(BATCHED_UNLOCK, pfn, num_pages, dummy);
+	if (vmballoon_check_status(b, status))
+		return true;
+
+	pr_debug("%s - batch ppn %lx, hv returns %ld\n", __func__, pfn, status);
+	STATS_INC(b->stats.unlock_fail);
+	return false;
+}
+
 /*
  * Quickly release all pages allocated for the balloon. This function is
  * called when host decides to "reset" balloon for one reason or another.
@@ -396,22 +513,13 @@ static void vmballoon_pop(struct vmballoon *b)
 			cond_resched();
 		}
 	}
-}
 
-/*
- * Perform standard reset sequence by popping the balloon (in case it
- * is not  empty) and then restarting protocol. This operation normally
- * happens when host responds with VMW_BALLOON_ERROR_RESET to a command.
- */
-static void vmballoon_reset(struct vmballoon *b)
-{
-	/* free all pages, skipping monitor unlock */
-	vmballoon_pop(b);
+	if ((b->capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
+		if (b->batch_page)
+			vunmap(b->batch_page);
 
-	if (vmballoon_send_start(b)) {
-		b->reset_required = false;
-		if (!vmballoon_send_guest_id(b))
-			pr_err("failed to send guest ID to the host\n");
+		if (b->page)
+			__free_page(b->page);
 	}
 }
 
@@ -420,9 +528,10 @@ static void vmballoon_reset(struct vmballoon *b)
  * refuse list, those refused page are then released at the end of the
  * inflation cycle.
  */
-static int vmballoon_lock_page(struct vmballoon *b, struct page *page)
+static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages)
 {
 	int locked, hv_status;
+	struct page *page = b->page;
 
 	locked = vmballoon_send_lock_page(b, page_to_pfn(page), &hv_status);
 	if (locked > 0) {
@@ -457,17 +566,68 @@ static int vmballoon_lock_page(struct vmballoon *b, struct page *page)
 	return 0;
 }
 
+static int vmballoon_lock_batched_page(struct vmballoon *b,
+				unsigned int num_pages)
+{
+	int locked, i;
+
+	locked = vmballoon_send_batched_lock(b, num_pages);
+	if (locked > 0) {
+		for (i = 0; i < num_pages; i++) {
+			u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
+			struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
+
+			__free_page(p);
+		}
+
+		return -EIO;
+	}
+
+	for (i = 0; i < num_pages; i++) {
+		u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
+		struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
+
+		locked = vmballoon_batch_get_status(b->batch_page, i);
+
+		switch (locked) {
+		case VMW_BALLOON_SUCCESS:
+			list_add(&p->lru, &b->pages);
+			b->size++;
+			break;
+		case VMW_BALLOON_ERROR_PPN_PINNED:
+		case VMW_BALLOON_ERROR_PPN_INVALID:
+			if (b->n_refused_pages < VMW_BALLOON_MAX_REFUSED) {
+				list_add(&p->lru, &b->refused_pages);
+				b->n_refused_pages++;
+				break;
+			}
+			/* Fallthrough */
+		case VMW_BALLOON_ERROR_RESET:
+		case VMW_BALLOON_ERROR_PPN_NOTNEEDED:
+			__free_page(p);
+			break;
+		default:
+			/* This should never happen */
+			WARN_ON_ONCE(true);
+		}
+	}
+
+	return 0;
+}
+
 /*
  * Release the page allocated for the balloon. Note that we first notify
  * the host so it can make sure the page will be available for the guest
  * to use, if needed.
  */
-static int vmballoon_release_page(struct vmballoon *b, struct page *page)
+static int vmballoon_unlock_page(struct vmballoon *b, unsigned int num_pages)
 {
-	if (!vmballoon_send_unlock_page(b, page_to_pfn(page)))
-		return -EIO;
+	struct page *page = b->page;
 
-	list_del(&page->lru);
+	if (!vmballoon_send_unlock_page(b, page_to_pfn(page))) {
+		list_add(&page->lru, &b->pages);
+		return -EIO;
+	}
 
 	/* deallocate page */
 	__free_page(page);
@@ -479,6 +639,41 @@ static int vmballoon_release_page(struct vmballoon *b, struct page *page)
 	return 0;
 }
 
+static int vmballoon_unlock_batched_page(struct vmballoon *b,
+					unsigned int num_pages)
+{
+	int locked, i, ret = 0;
+	bool hv_success;
+
+	hv_success = vmballoon_send_batched_unlock(b, num_pages);
+	if (!hv_success)
+		ret = -EIO;
+
+	for (i = 0; i < num_pages; i++) {
+		u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
+		struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
+
+		locked = vmballoon_batch_get_status(b->batch_page, i);
+		if (!hv_success || locked != VMW_BALLOON_SUCCESS) {
+			/*
+			 * That page wasn't successfully unlocked by the
+			 * hypervisor, re-add it to the list of pages owned by
+			 * the balloon driver.
+			 */
+			list_add(&p->lru, &b->pages);
+		} else {
+			/* deallocate page */
+			__free_page(p);
+			STATS_INC(b->stats.free);
+
+			/* update balloon size */
+			b->size--;
+		}
+	}
+
+	return ret;
+}
+
 /*
  * Release pages that were allocated while attempting to inflate the
  * balloon but were refused by the host for one reason or another.
@@ -496,6 +691,18 @@ static void vmballoon_release_refused_pages(struct vmballoon *b)
 	b->n_refused_pages = 0;
 }
 
+static void vmballoon_add_page(struct vmballoon *b, int idx, struct page *p)
+{
+	b->page = p;
+}
+
+static void vmballoon_add_batched_page(struct vmballoon *b, int idx,
+				struct page *p)
+{
+	vmballoon_batch_set_pa(b->batch_page, idx,
+			(u64)page_to_pfn(p) << PAGE_SHIFT);
+}
+
 /*
  * Inflate the balloon towards its target size. Note that we try to limit
  * the rate of allocation to make sure we are not choking the rest of the
@@ -507,6 +714,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 	unsigned int rate;
 	unsigned int i;
 	unsigned int allocations = 0;
+	unsigned int num_pages = 0;
 	int error = 0;
 	gfp_t flags = VMW_PAGE_ALLOC_NOSLEEP;
 
@@ -539,14 +747,13 @@ static void vmballoon_inflate(struct vmballoon *b)
 		 __func__, goal, rate, b->rate_alloc);
 
 	for (i = 0; i < goal; i++) {
-		struct page *page;
+		struct page *page = alloc_page(flags);
 
 		if (flags == VMW_PAGE_ALLOC_NOSLEEP)
 			STATS_INC(b->stats.alloc);
 		else
 			STATS_INC(b->stats.sleep_alloc);
 
-		page = alloc_page(flags);
 		if (!page) {
 			if (flags == VMW_PAGE_ALLOC_CANSLEEP) {
 				/*
@@ -580,9 +787,13 @@ static void vmballoon_inflate(struct vmballoon *b)
 			continue;
 		}
 
-		error = vmballoon_lock_page(b, page);
-		if (error)
-			break;
+		b->ops->add_page(b, num_pages++, page);
+		if (num_pages == b->batch_max_pages) {
+			error = b->ops->lock(b, num_pages);
+			num_pages = 0;
+			if (error)
+				break;
+		}
 
 		if (++allocations > VMW_BALLOON_YIELD_THRESHOLD) {
 			cond_resched();
@@ -595,6 +806,9 @@ static void vmballoon_inflate(struct vmballoon *b)
 		}
 	}
 
+	if (num_pages > 0)
+		b->ops->lock(b, num_pages);
+
 	/*
 	 * We reached our goal without failures so try increasing
 	 * allocation rate.
@@ -618,6 +832,7 @@ static void vmballoon_deflate(struct vmballoon *b)
 	struct page *page, *next;
 	unsigned int i = 0;
 	unsigned int goal;
+	unsigned int num_pages = 0;
 	int error;
 
 	pr_debug("%s - size: %d, target %d\n", __func__, b->size, b->target);
@@ -629,21 +844,94 @@ static void vmballoon_deflate(struct vmballoon *b)
 
 	/* free pages to reach target */
 	list_for_each_entry_safe(page, next, &b->pages, lru) {
-		error = vmballoon_release_page(b, page);
-		if (error) {
-			/* quickly decrease rate in case of error */
-			b->rate_free = max(b->rate_free / 2,
-					   VMW_BALLOON_RATE_FREE_MIN);
-			return;
+		list_del(&page->lru);
+		b->ops->add_page(b, num_pages++, page);
+
+		if (num_pages == b->batch_max_pages) {
+			error = b->ops->unlock(b, num_pages);
+			num_pages = 0;
+			if (error) {
+				/* quickly decrease rate in case of error */
+				b->rate_free = max(b->rate_free / 2,
+						VMW_BALLOON_RATE_FREE_MIN);
+				return;
+			}
 		}
 
 		if (++i >= goal)
 			break;
 	}
 
+	if (num_pages > 0)
+		b->ops->unlock(b, num_pages);
+
 	/* slowly increase rate if there were no errors */
-	b->rate_free = min(b->rate_free + VMW_BALLOON_RATE_FREE_INC,
-			   VMW_BALLOON_RATE_FREE_MAX);
+	if (error == 0)
+		b->rate_free = min(b->rate_free + VMW_BALLOON_RATE_FREE_INC,
+				   VMW_BALLOON_RATE_FREE_MAX);
+}
+
+static const struct vmballoon_ops vmballoon_basic_ops = {
+	.add_page = vmballoon_add_page,
+	.lock = vmballoon_lock_page,
+	.unlock = vmballoon_unlock_page
+};
+
+static const struct vmballoon_ops vmballoon_batched_ops = {
+	.add_page = vmballoon_add_batched_page,
+	.lock = vmballoon_lock_batched_page,
+	.unlock = vmballoon_unlock_batched_page
+};
+
+static bool vmballoon_init_batching(struct vmballoon *b)
+{
+	b->page = alloc_page(VMW_PAGE_ALLOC_NOSLEEP);
+	if (!b->page)
+		return false;
+
+	b->batch_page = vmap(&b->page, 1, VM_MAP, PAGE_KERNEL);
+	if (!b->batch_page) {
+		__free_page(b->page);
+		return false;
+	}
+
+	return true;
+}
+
+/*
+ * Perform standard reset sequence by popping the balloon (in case it
+ * is not  empty) and then restarting protocol. This operation normally
+ * happens when host responds with VMW_BALLOON_ERROR_RESET to a command.
+ */
+static void vmballoon_reset(struct vmballoon *b)
+{
+	/* free all pages, skipping monitor unlock */
+	vmballoon_pop(b);
+
+	if (!vmballoon_send_start(b, VMW_BALLOON_CAPABILITIES))
+		return;
+
+	if ((b->capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
+		b->ops = &vmballoon_batched_ops;
+		b->batch_max_pages = VMW_BALLOON_BATCH_MAX_PAGES;
+		if (!vmballoon_init_batching(b)) {
+			/*
+			 * We failed to initialize batching, inform the monitor
+			 * about it by sending a null capability.
+			 *
+			 * The guest will retry in one second.
+			 */
+			vmballoon_send_start(b, 0);
+			return;
+		}
+	} else if ((b->capabilities & VMW_BALLOON_BASIC_CMDS) != 0) {
+		b->ops = &vmballoon_basic_ops;
+		b->batch_max_pages = 1;
+	}
+
+	b->reset_required = false;
+	if (!vmballoon_send_guest_id(b))
+		pr_err("failed to send guest ID to the host\n");
 }
 
 /*
@@ -802,11 +1090,23 @@ static int __init vmballoon_init(void)
 	/*
 	 * Start balloon.
 	 */
-	if (!vmballoon_send_start(&balloon)) {
+	if (!vmballoon_send_start(&balloon, VMW_BALLOON_CAPABILITIES)) {
 		pr_err("failed to send start command to the host\n");
 		return -EIO;
 	}
 
+	if ((balloon.capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
+		balloon.ops = &vmballoon_batched_ops;
+		balloon.batch_max_pages = VMW_BALLOON_BATCH_MAX_PAGES;
+		if (!vmballoon_init_batching(&balloon)) {
+			pr_err("failed to init batching\n");
+			return -EIO;
+		}
+	} else if ((balloon.capabilities & VMW_BALLOON_BASIC_CMDS) != 0) {
+		balloon.ops = &vmballoon_basic_ops;
+		balloon.batch_max_pages = 1;
+	}
+
 	if (!vmballoon_send_guest_id(&balloon)) {
 		pr_err("failed to send guest ID to the host\n");
 		return -EIO;
@@ -833,7 +1133,7 @@ static void __exit vmballoon_exit(void)
 	 * Reset connection before deallocating memory to avoid potential for
 	 * additional spurious resets from guest touching deallocated pages.
 	 */
-	vmballoon_send_start(&balloon);
+	vmballoon_send_start(&balloon, VMW_BALLOON_CAPABILITIES);
 	vmballoon_pop(&balloon);
 }
 module_exit(vmballoon_exit);
-- 
2.4.3



* Re: [PATCH v4 3/9] VMware balloon: add batching to the vmw_balloon.
  2015-08-05 22:36                     ` [PATCH v4 " Philip P. Moltmann
@ 2015-08-05 22:44                       ` Greg KH
  2015-08-05 22:47                         ` Philip Moltmann
  0 siblings, 1 reply; 61+ messages in thread
From: Greg KH @ 2015-08-05 22:44 UTC (permalink / raw)
  To: Philip P. Moltmann
  Cc: Xavier Deguillard, dmitry.torokhov, linux-kernel, akpm, pv-drivers

On Wed, Aug 05, 2015 at 03:36:31PM -0700, Philip P. Moltmann wrote:
> From: Xavier Deguillard <xdeguillard@vmware.com>
> 
> Introduce a new capability to the driver that allows sending 512 pages in
> one hypervisor call. This reduces the cost of the driver when reclaiming
> memory.
> 
> Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
> Acked-by: Dmitry Torokhov <dtor@vmware.com>
> Signed-off-by: Philip P. Moltmann <moltmann@vmware.com>
> ---
>  drivers/misc/vmw_balloon.c | 406 +++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 353 insertions(+), 53 deletions(-)

what changed?


* Re: [PATCH v4 3/9] VMware balloon: add batching to the vmw_balloon.
  2015-08-05 22:44                       ` Greg KH
@ 2015-08-05 22:47                         ` Philip Moltmann
  2015-08-05 23:26                           ` gregkh
  0 siblings, 1 reply; 61+ messages in thread
From: Philip Moltmann @ 2015-08-05 22:47 UTC (permalink / raw)
  To: gregkh; +Cc: dmitry.torokhov, linux-kernel, pv-drivers, Xavier Deguillard, akpm


Hi,

> what changed?

I added the include:

#include <linux/vmalloc.h>

Nothing else changed. None of patches 4-9 of v3 changed, so they should
still apply (and build) cleanly.

Philip


* Re: [PATCH v4 3/9] VMware balloon: add batching to the vmw_balloon.
  2015-08-05 22:47                         ` Philip Moltmann
@ 2015-08-05 23:26                           ` gregkh
  2015-08-06 20:33                             ` [PATCH v4 0/9] Fourth revision of the performance improvement patch to the VMware balloon driver Philip P. Moltmann
                                               ` (9 more replies)
  0 siblings, 10 replies; 61+ messages in thread
From: gregkh @ 2015-08-05 23:26 UTC (permalink / raw)
  To: Philip Moltmann
  Cc: dmitry.torokhov, linux-kernel, pv-drivers, Xavier Deguillard, akpm

On Wed, Aug 05, 2015 at 10:47:37PM +0000, Philip Moltmann wrote:
> Hi,
> 
> > what changed?
> 
> I added the include:
> 
> #include <linux/vmalloc.h>
> 
> Nothing else changes.

How was I supposed to know this?

Please add this type of thing to the patch, in the proper place, as is
required.

> None of patches 4-9 of v3 changed, so they should still
> apply (and build) cleanly.

They are long gone from my queue, please resend everything that I
haven't already applied, including this one, with the proper
information.

greg k-h


* [PATCH v4 0/9] Fourth revision of the performance improvement patch to the VMware balloon driver
  2015-08-05 23:26                           ` gregkh
@ 2015-08-06 20:33                             ` Philip P. Moltmann
  2015-08-06 20:33                             ` [PATCH v4 1/9] VMware balloon: partially inline vmballoon_reserve_page Philip P. Moltmann
                                               ` (8 subsequent siblings)
  9 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-08-06 20:33 UTC (permalink / raw)
  To: gregkh
  Cc: Philip P. Moltmann, dmitry.torokhov, linux-kernel, xdeguillard,
	akpm, pv-drivers

This is the fourth revision of the patch series for the VMware balloon driver. The
original was sent to linux-kernel@vger.kernel.org on 4/14/15 at 10:29 am PST. Please
refer to the original posting for an overview.

v1:
- Initial implementation
v2:
- Address suggestions by Dmitry Torokhov
  - Use UINT_MAX as "infinite" rate instead of special-casing -1
v3:
- Change commit comment for step 6 to better explain what impact ballooning has
  on the VM performance.
v4:
- Add missing include header <linux/vmalloc.h> in step 3

Thanks
Philip

Philip P. Moltmann (5):
  VMware balloon: Show capabilities of balloon and resulting
    capabilities in the debug-fs node.
  VMware balloon: Do not limit the amount of frees and allocations in
    non-sleep mode.
  VMware balloon: Support 2m page ballooning.
  VMware balloon: Treat init like reset
  VMware balloon: Enable notification via VMCI

Xavier Deguillard (4):
  VMware balloon: partially inline vmballoon_reserve_page.
  VMware balloon: Add support for balloon capabilities.
  VMware balloon: add batching to the vmw_balloon.
  VMware balloon: Update balloon target on each lock/unlock.

 drivers/misc/Kconfig       |   2 +-
 drivers/misc/vmw_balloon.c | 955 ++++++++++++++++++++++++++++++++++-----------
 2 files changed, 718 insertions(+), 239 deletions(-)

-- 
2.4.3



* [PATCH v4 1/9] VMware balloon: partially inline vmballoon_reserve_page.
  2015-08-05 23:26                           ` gregkh
  2015-08-06 20:33                             ` [PATCH v4 0/9] Fourth revision of the performance improvement patch to the VMware balloon driver Philip P. Moltmann
@ 2015-08-06 20:33                             ` Philip P. Moltmann
  2015-08-06 21:07                               ` Greg KH
  2015-08-06 20:33                             ` [PATCH v4 2/9] VMware balloon: Add support for balloon capabilities Philip P. Moltmann
                                               ` (7 subsequent siblings)
  9 siblings, 1 reply; 61+ messages in thread
From: Philip P. Moltmann @ 2015-08-06 20:33 UTC (permalink / raw)
  To: gregkh
  Cc: Xavier Deguillard, dmitry.torokhov, linux-kernel, akpm,
	pv-drivers, Philip P. Moltmann

From: Xavier Deguillard <xdeguillard@vmware.com>

This splits the function in two: the allocation part is inlined into the
inflate function and the lock part is kept in its own function.

This change is needed in order to be able to allocate more than one page
before doing the hypervisor call.
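The resulting control flow can be sketched as follows. This is a minimal,
hypothetical userspace sketch: `alloc_one()` and `lock_one()` are stand-ins
for `alloc_page()` and `vmballoon_lock_page()`, and the names are
illustrative, not taken from the driver.

```c
#include <stdbool.h>

/*
 * After the split, allocation happens directly in the inflate loop and
 * only the hypervisor notification lives in a separate lock helper, so
 * several pages can later be allocated before a single batched call.
 */
static int allocated, locked_pages;

static bool alloc_one(void)
{
	allocated++;		/* pretend the page allocation succeeded */
	return true;
}

static int lock_one(void)
{
	locked_pages++;		/* pretend the hypervisor accepted the page */
	return 0;
}

static int inflate(int goal)
{
	int i;

	for (i = 0; i < goal; i++) {
		if (!alloc_one())	/* allocation is inlined here ... */
			return -1;
		if (lock_one() != 0)	/* ... the lock stays in its own helper */
			return -1;
	}
	return 0;
}
```

With the real driver, the lock step is the only part that talks to the
hypervisor, which is what makes the later batching patch possible.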

Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
Acked-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Philip P. Moltmann <moltmann@vmware.com>
Acked-by: Andy King <acking@vmware.com>
---
 drivers/misc/vmw_balloon.c | 98 ++++++++++++++++++++--------------------------
 1 file changed, 42 insertions(+), 56 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 1916174..2799c46 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -46,7 +46,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.2.1.3-k");
+MODULE_VERSION("1.2.2.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -402,55 +402,37 @@ static void vmballoon_reset(struct vmballoon *b)
 }
 
 /*
- * Allocate (or reserve) a page for the balloon and notify the host.  If host
- * refuses the page put it on "refuse" list and allocate another one until host
- * is satisfied. "Refused" pages are released at the end of inflation cycle
- * (when we allocate b->rate_alloc pages).
+ * Notify the host of a ballooned page. If host rejects the page put it on the
+ * refuse list, those refused page are then released at the end of the
+ * inflation cycle.
  */
-static int vmballoon_reserve_page(struct vmballoon *b, bool can_sleep)
+static int vmballoon_lock_page(struct vmballoon *b, struct page *page)
 {
-	struct page *page;
-	gfp_t flags;
-	unsigned int hv_status;
-	int locked;
-	flags = can_sleep ? VMW_PAGE_ALLOC_CANSLEEP : VMW_PAGE_ALLOC_NOSLEEP;
-
-	do {
-		if (!can_sleep)
-			STATS_INC(b->stats.alloc);
-		else
-			STATS_INC(b->stats.sleep_alloc);
-
-		page = alloc_page(flags);
-		if (!page) {
-			if (!can_sleep)
-				STATS_INC(b->stats.alloc_fail);
-			else
-				STATS_INC(b->stats.sleep_alloc_fail);
-			return -ENOMEM;
-		}
+	int locked, hv_status;
 
-		/* inform monitor */
-		locked = vmballoon_send_lock_page(b, page_to_pfn(page), &hv_status);
-		if (locked > 0) {
-			STATS_INC(b->stats.refused_alloc);
+	locked = vmballoon_send_lock_page(b, page_to_pfn(page), &hv_status);
+	if (locked > 0) {
+		STATS_INC(b->stats.refused_alloc);
 
-			if (hv_status == VMW_BALLOON_ERROR_RESET ||
-			    hv_status == VMW_BALLOON_ERROR_PPN_NOTNEEDED) {
-				__free_page(page);
-				return -EIO;
-			}
+		if (hv_status == VMW_BALLOON_ERROR_RESET ||
+				hv_status == VMW_BALLOON_ERROR_PPN_NOTNEEDED) {
+			__free_page(page);
+			return -EIO;
+		}
 
-			/*
-			 * Place page on the list of non-balloonable pages
-			 * and retry allocation, unless we already accumulated
-			 * too many of them, in which case take a breather.
-			 */
+		/*
+		 * Place page on the list of non-balloonable pages
+		 * and retry allocation, unless we already accumulated
+		 * too many of them, in which case take a breather.
+		 */
+		if (b->n_refused_pages < VMW_BALLOON_MAX_REFUSED) {
+			b->n_refused_pages++;
 			list_add(&page->lru, &b->refused_pages);
-			if (++b->n_refused_pages >= VMW_BALLOON_MAX_REFUSED)
-				return -EIO;
+		} else {
+			__free_page(page);
 		}
-	} while (locked != 0);
+		return -EIO;
+	}
 
 	/* track allocated page */
 	list_add(&page->lru, &b->pages);
@@ -512,7 +494,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 	unsigned int i;
 	unsigned int allocations = 0;
 	int error = 0;
-	bool alloc_can_sleep = false;
+	gfp_t flags = VMW_PAGE_ALLOC_NOSLEEP;
 
 	pr_debug("%s - size: %d, target %d\n", __func__, b->size, b->target);
 
@@ -543,19 +525,16 @@ static void vmballoon_inflate(struct vmballoon *b)
 		 __func__, goal, rate, b->rate_alloc);
 
 	for (i = 0; i < goal; i++) {
+		struct page *page;
 
-		error = vmballoon_reserve_page(b, alloc_can_sleep);
-		if (error) {
-			if (error != -ENOMEM) {
-				/*
-				 * Not a page allocation failure, stop this
-				 * cycle. Maybe we'll get new target from
-				 * the host soon.
-				 */
-				break;
-			}
+		if (flags == VMW_PAGE_ALLOC_NOSLEEP)
+			STATS_INC(b->stats.alloc);
+		else
+			STATS_INC(b->stats.sleep_alloc);
 
-			if (alloc_can_sleep) {
+		page = alloc_page(flags);
+		if (!page) {
+			if (flags == VMW_PAGE_ALLOC_CANSLEEP) {
 				/*
 				 * CANSLEEP page allocation failed, so guest
 				 * is under severe memory pressure. Quickly
@@ -563,8 +542,10 @@ static void vmballoon_inflate(struct vmballoon *b)
 				 */
 				b->rate_alloc = max(b->rate_alloc / 2,
 						    VMW_BALLOON_RATE_ALLOC_MIN);
+				STATS_INC(b->stats.sleep_alloc_fail);
 				break;
 			}
+			STATS_INC(b->stats.alloc_fail);
 
 			/*
 			 * NOSLEEP page allocation failed, so the guest is
@@ -579,11 +560,16 @@ static void vmballoon_inflate(struct vmballoon *b)
 			if (i >= b->rate_alloc)
 				break;
 
-			alloc_can_sleep = true;
+			flags = VMW_PAGE_ALLOC_CANSLEEP;
 			/* Lower rate for sleeping allocations. */
 			rate = b->rate_alloc;
+			continue;
 		}
 
+		error = vmballoon_lock_page(b, page);
+		if (error)
+			break;
+
 		if (++allocations > VMW_BALLOON_YIELD_THRESHOLD) {
 			cond_resched();
 			allocations = 0;
-- 
2.4.3



* [PATCH v4 2/9] VMware balloon: Add support for balloon capabilities.
  2015-08-05 23:26                           ` gregkh
  2015-08-06 20:33                             ` [PATCH v4 0/9] Fourth revision of the performance improvement patch to the VMware balloon driver Philip P. Moltmann
  2015-08-06 20:33                             ` [PATCH v4 1/9] VMware balloon: partially inline vmballoon_reserve_page Philip P. Moltmann
@ 2015-08-06 20:33                             ` Philip P. Moltmann
  2015-08-06 20:33                             ` [PATCH v4 3/9] VMware balloon: add batching to the vmw_balloon Philip P. Moltmann
                                               ` (6 subsequent siblings)
  9 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-08-06 20:33 UTC (permalink / raw)
  To: gregkh
  Cc: Xavier Deguillard, dmitry.torokhov, linux-kernel, akpm,
	pv-drivers, Philip P. Moltmann

From: Xavier Deguillard <xdeguillard@vmware.com>

In order to extend the balloon protocol, the hypervisor and the guest
driver need to agree on a set of supported functionality to use.
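A hedged sketch of that handshake (userspace approximation): the guest sends
its supported capability bits with the START command; a new host answers
`SUCCESS_WITH_CAPABILITIES` plus the agreed subset, while a legacy host
answers plain `SUCCESS`, implying basic commands only. The constants mirror
the patch; `host_start()` is a hypothetical stub standing in for the real
backdoor call.

```c
#include <stdbool.h>

enum { BASIC_CMDS = 1 << 1, BATCHED_CMDS = 1 << 2 };

#define SUCCESS				0
#define SUCCESS_WITH_CAPABILITIES	0x03000000

/* stub: a "new" host that understands both command sets */
static unsigned long host_start(unsigned long req, unsigned long *granted)
{
	*granted = req & (BASIC_CMDS | BATCHED_CMDS);
	return SUCCESS_WITH_CAPABILITIES;
}

static bool negotiate(unsigned long req_caps, unsigned long *caps)
{
	unsigned long granted;
	unsigned long status = host_start(req_caps, &granted);

	switch (status) {
	case SUCCESS_WITH_CAPABILITIES:	/* new host: use the granted subset */
		*caps = granted;
		return true;
	case SUCCESS:			/* legacy host: basic commands only */
		*caps = BASIC_CMDS;
		return true;
	}
	return false;			/* host refused to start the balloon */
}
```

This is why the series stays compatible with every shipped hypervisor: an
old host simply never reports the batched capability.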

Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
Acked-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Philip P. Moltmann <moltmann@vmware.com>
Acked-by: Andy King <acking@vmware.com>
---
 drivers/misc/vmw_balloon.c | 74 +++++++++++++++++++++++++++-------------------
 1 file changed, 44 insertions(+), 30 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 2799c46..ffb5634 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -46,7 +46,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.2.2.0-k");
+MODULE_VERSION("1.3.0.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -110,9 +110,18 @@ MODULE_LICENSE("GPL");
  */
 #define VMW_BALLOON_HV_PORT		0x5670
 #define VMW_BALLOON_HV_MAGIC		0x456c6d6f
-#define VMW_BALLOON_PROTOCOL_VERSION	2
 #define VMW_BALLOON_GUEST_ID		1	/* Linux */
 
+enum vmwballoon_capabilities {
+	/*
+	 * Bit 0 is reserved and not associated to any capability.
+	 */
+	VMW_BALLOON_BASIC_CMDS		= (1 << 1),
+	VMW_BALLOON_BATCHED_CMDS	= (1 << 2)
+};
+
+#define VMW_BALLOON_CAPABILITIES	(VMW_BALLOON_BASIC_CMDS)
+
 #define VMW_BALLOON_CMD_START		0
 #define VMW_BALLOON_CMD_GET_TARGET	1
 #define VMW_BALLOON_CMD_LOCK		2
@@ -120,32 +129,36 @@ MODULE_LICENSE("GPL");
 #define VMW_BALLOON_CMD_GUEST_ID	4
 
 /* error codes */
-#define VMW_BALLOON_SUCCESS		0
-#define VMW_BALLOON_FAILURE		-1
-#define VMW_BALLOON_ERROR_CMD_INVALID	1
-#define VMW_BALLOON_ERROR_PPN_INVALID	2
-#define VMW_BALLOON_ERROR_PPN_LOCKED	3
-#define VMW_BALLOON_ERROR_PPN_UNLOCKED	4
-#define VMW_BALLOON_ERROR_PPN_PINNED	5
-#define VMW_BALLOON_ERROR_PPN_NOTNEEDED	6
-#define VMW_BALLOON_ERROR_RESET		7
-#define VMW_BALLOON_ERROR_BUSY		8
-
-#define VMWARE_BALLOON_CMD(cmd, data, result)		\
-({							\
-	unsigned long __stat, __dummy1, __dummy2;	\
-	__asm__ __volatile__ ("inl %%dx" :		\
-		"=a"(__stat),				\
-		"=c"(__dummy1),				\
-		"=d"(__dummy2),				\
-		"=b"(result) :				\
-		"0"(VMW_BALLOON_HV_MAGIC),		\
-		"1"(VMW_BALLOON_CMD_##cmd),		\
-		"2"(VMW_BALLOON_HV_PORT),		\
-		"3"(data) :				\
-		"memory");				\
-	result &= -1UL;					\
-	__stat & -1UL;					\
+#define VMW_BALLOON_SUCCESS		        0
+#define VMW_BALLOON_FAILURE		        -1
+#define VMW_BALLOON_ERROR_CMD_INVALID	        1
+#define VMW_BALLOON_ERROR_PPN_INVALID	        2
+#define VMW_BALLOON_ERROR_PPN_LOCKED	        3
+#define VMW_BALLOON_ERROR_PPN_UNLOCKED	        4
+#define VMW_BALLOON_ERROR_PPN_PINNED	        5
+#define VMW_BALLOON_ERROR_PPN_NOTNEEDED	        6
+#define VMW_BALLOON_ERROR_RESET		        7
+#define VMW_BALLOON_ERROR_BUSY		        8
+
+#define VMW_BALLOON_SUCCESS_WITH_CAPABILITIES	(0x03000000)
+
+#define VMWARE_BALLOON_CMD(cmd, data, result)			\
+({								\
+	unsigned long __status, __dummy1, __dummy2;		\
+	__asm__ __volatile__ ("inl %%dx" :			\
+		"=a"(__status),					\
+		"=c"(__dummy1),					\
+		"=d"(__dummy2),					\
+		"=b"(result) :					\
+		"0"(VMW_BALLOON_HV_MAGIC),			\
+		"1"(VMW_BALLOON_CMD_##cmd),			\
+		"2"(VMW_BALLOON_HV_PORT),			\
+		"3"(data) :					\
+		"memory");					\
+	if (VMW_BALLOON_CMD_##cmd == VMW_BALLOON_CMD_START)	\
+		result = __dummy1;				\
+	result &= -1UL;						\
+	__status & -1UL;					\
 })
 
 #ifdef CONFIG_DEBUG_FS
@@ -223,11 +236,12 @@ static struct vmballoon balloon;
  */
 static bool vmballoon_send_start(struct vmballoon *b)
 {
-	unsigned long status, dummy;
+	unsigned long status, capabilities;
 
 	STATS_INC(b->stats.start);
 
-	status = VMWARE_BALLOON_CMD(START, VMW_BALLOON_PROTOCOL_VERSION, dummy);
+	status = VMWARE_BALLOON_CMD(START, VMW_BALLOON_CAPABILITIES,
+				capabilities);
 	if (status == VMW_BALLOON_SUCCESS)
 		return true;
 
-- 
2.4.3



* [PATCH v4 3/9] VMware balloon: add batching to the vmw_balloon.
  2015-08-05 23:26                           ` gregkh
                                               ` (2 preceding siblings ...)
  2015-08-06 20:33                             ` [PATCH v4 2/9] VMware balloon: Add support for balloon capabilities Philip P. Moltmann
@ 2015-08-06 20:33                             ` Philip P. Moltmann
  2015-08-06 20:33                             ` [PATCH v4 4/9] VMware balloon: Update balloon target on each lock/unlock Philip P. Moltmann
                                               ` (5 subsequent siblings)
  9 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-08-06 20:33 UTC (permalink / raw)
  To: gregkh
  Cc: Xavier Deguillard, dmitry.torokhov, linux-kernel, akpm,
	pv-drivers, Philip P. Moltmann

From: Xavier Deguillard <xdeguillard@vmware.com>

Introduce a new capability to the driver that allows sending 512 pages in
one hypervisor call. This reduces the cost of the driver when reclaiming
memory.
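The batch page that enables this is a single 4 KB page holding 512 64-bit
slots, each packing a page's physical address in the high bits and a status
code in the low bits. The following is a userspace sketch of that encoding;
the masks mirror the patch, while `SKETCH_PAGE_SHIFT` is an assumption
(4 KB pages).

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12
#define SKETCH_PAGE_SIZE	(1UL << SKETCH_PAGE_SHIFT)
#define BATCH_MAX_PAGES		(SKETCH_PAGE_SIZE / sizeof(uint64_t))	/* 512 */
#define BATCH_STATUS_MASK	((uint64_t)((1u << 5) - 1))
#define BATCH_PAGE_MASK		((uint64_t)~(uint64_t)0 << SKETCH_PAGE_SHIFT)

struct batch_page {
	uint64_t pages[BATCH_MAX_PAGES];
};

static void batch_set_pa(struct batch_page *b, int idx, uint64_t pa)
{
	b->pages[idx] = pa & BATCH_PAGE_MASK;	/* status bits start at zero */
}

static uint64_t batch_get_pa(const struct batch_page *b, int idx)
{
	return b->pages[idx] & BATCH_PAGE_MASK;
}

static int batch_get_status(const struct batch_page *b, int idx)
{
	return (int)(b->pages[idx] & BATCH_STATUS_MASK);
}
```

The guest fills the slots and passes the batch page's PFN in a single
BATCHED_LOCK or BATCHED_UNLOCK call; the host writes a per-slot status back
into the same low bits.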

Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
Acked-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Philip P. Moltmann <moltmann@vmware.com>
---
 drivers/misc/vmw_balloon.c | 406 +++++++++++++++++++++++++++++++++++++++------
 1 file changed, 353 insertions(+), 53 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index ffb5634..64f275e 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -1,7 +1,7 @@
 /*
  * VMware Balloon driver.
  *
- * Copyright (C) 2000-2010, VMware, Inc. All Rights Reserved.
+ * Copyright (C) 2000-2013, VMware, Inc. All Rights Reserved.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms of the GNU General Public License as published by the
@@ -37,6 +37,7 @@
 #include <linux/types.h>
 #include <linux/kernel.h>
 #include <linux/mm.h>
+#include <linux/vmalloc.h>
 #include <linux/sched.h>
 #include <linux/module.h>
 #include <linux/workqueue.h>
@@ -46,7 +47,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.3.0.0-k");
+MODULE_VERSION("1.3.1.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -120,13 +121,26 @@ enum vmwballoon_capabilities {
 	VMW_BALLOON_BATCHED_CMDS	= (1 << 2)
 };
 
-#define VMW_BALLOON_CAPABILITIES	(VMW_BALLOON_BASIC_CMDS)
+#define VMW_BALLOON_CAPABILITIES	(VMW_BALLOON_BASIC_CMDS \
+					| VMW_BALLOON_BATCHED_CMDS)
 
+/*
+ * Backdoor commands availability:
+ *
+ * START, GET_TARGET and GUEST_ID are always available,
+ *
+ * VMW_BALLOON_BASIC_CMDS:
+ *	LOCK and UNLOCK commands,
+ * VMW_BALLOON_BATCHED_CMDS:
+ *	BATCHED_LOCK and BATCHED_UNLOCK commands.
+ */
 #define VMW_BALLOON_CMD_START		0
 #define VMW_BALLOON_CMD_GET_TARGET	1
 #define VMW_BALLOON_CMD_LOCK		2
 #define VMW_BALLOON_CMD_UNLOCK		3
 #define VMW_BALLOON_CMD_GUEST_ID	4
+#define VMW_BALLOON_CMD_BATCHED_LOCK	6
+#define VMW_BALLOON_CMD_BATCHED_UNLOCK	7
 
 /* error codes */
 #define VMW_BALLOON_SUCCESS		        0
@@ -142,18 +156,63 @@ enum vmwballoon_capabilities {
 
 #define VMW_BALLOON_SUCCESS_WITH_CAPABILITIES	(0x03000000)
 
-#define VMWARE_BALLOON_CMD(cmd, data, result)			\
+/* Batch page description */
+
+/*
+ * Layout of a page in the batch page:
+ *
+ * +-------------+----------+--------+
+ * |             |          |        |
+ * | Page number | Reserved | Status |
+ * |             |          |        |
+ * +-------------+----------+--------+
+ * 64  PAGE_SHIFT          6         0
+ *
+ * For now only 4K pages are supported, but we can easily support large pages
+ * by using bits in the reserved field.
+ *
+ * The reserved field should be set to 0.
+ */
+#define VMW_BALLOON_BATCH_MAX_PAGES	(PAGE_SIZE / sizeof(u64))
+#define VMW_BALLOON_BATCH_STATUS_MASK	((1UL << 5) - 1)
+#define VMW_BALLOON_BATCH_PAGE_MASK	(~((1UL << PAGE_SHIFT) - 1))
+
+struct vmballoon_batch_page {
+	u64 pages[VMW_BALLOON_BATCH_MAX_PAGES];
+};
+
+static u64 vmballoon_batch_get_pa(struct vmballoon_batch_page *batch, int idx)
+{
+	return batch->pages[idx] & VMW_BALLOON_BATCH_PAGE_MASK;
+}
+
+static int vmballoon_batch_get_status(struct vmballoon_batch_page *batch,
+				int idx)
+{
+	return (int)(batch->pages[idx] & VMW_BALLOON_BATCH_STATUS_MASK);
+}
+
+static void vmballoon_batch_set_pa(struct vmballoon_batch_page *batch, int idx,
+				u64 pa)
+{
+	batch->pages[idx] = pa;
+}
+
+
+#define VMWARE_BALLOON_CMD(cmd, arg1, arg2, result)		\
 ({								\
-	unsigned long __status, __dummy1, __dummy2;		\
+	unsigned long __status, __dummy1, __dummy2, __dummy3;	\
 	__asm__ __volatile__ ("inl %%dx" :			\
 		"=a"(__status),					\
 		"=c"(__dummy1),					\
 		"=d"(__dummy2),					\
-		"=b"(result) :					\
+		"=b"(result),					\
+		"=S" (__dummy3) :				\
 		"0"(VMW_BALLOON_HV_MAGIC),			\
 		"1"(VMW_BALLOON_CMD_##cmd),			\
 		"2"(VMW_BALLOON_HV_PORT),			\
-		"3"(data) :					\
+		"3"(arg1),					\
+		"4" (arg2) :					\
 		"memory");					\
 	if (VMW_BALLOON_CMD_##cmd == VMW_BALLOON_CMD_START)	\
 		result = __dummy1;				\
@@ -192,6 +251,14 @@ struct vmballoon_stats {
 #define STATS_INC(stat)
 #endif
 
+struct vmballoon;
+
+struct vmballoon_ops {
+	void (*add_page)(struct vmballoon *b, int idx, struct page *p);
+	int (*lock)(struct vmballoon *b, unsigned int num_pages);
+	int (*unlock)(struct vmballoon *b, unsigned int num_pages);
+};
+
 struct vmballoon {
 
 	/* list of reserved physical pages */
@@ -215,6 +282,14 @@ struct vmballoon {
 	/* slowdown page allocations for next few cycles */
 	unsigned int slow_allocation_cycles;
 
+	unsigned long capabilities;
+
+	struct vmballoon_batch_page *batch_page;
+	unsigned int batch_max_pages;
+	struct page *page;
+
+	const struct vmballoon_ops *ops;
+
 #ifdef CONFIG_DEBUG_FS
 	/* statistics */
 	struct vmballoon_stats stats;
@@ -234,16 +309,22 @@ static struct vmballoon balloon;
  * Send "start" command to the host, communicating supported version
  * of the protocol.
  */
-static bool vmballoon_send_start(struct vmballoon *b)
+static bool vmballoon_send_start(struct vmballoon *b, unsigned long req_caps)
 {
-	unsigned long status, capabilities;
+	unsigned long status, capabilities, dummy = 0;
 
 	STATS_INC(b->stats.start);
 
-	status = VMWARE_BALLOON_CMD(START, VMW_BALLOON_CAPABILITIES,
-				capabilities);
-	if (status == VMW_BALLOON_SUCCESS)
+	status = VMWARE_BALLOON_CMD(START, req_caps, dummy, capabilities);
+
+	switch (status) {
+	case VMW_BALLOON_SUCCESS_WITH_CAPABILITIES:
+		b->capabilities = capabilities;
+		return true;
+	case VMW_BALLOON_SUCCESS:
+		b->capabilities = VMW_BALLOON_BASIC_CMDS;
 		return true;
+	}
 
 	pr_debug("%s - failed, hv returns %ld\n", __func__, status);
 	STATS_INC(b->stats.start_fail);
@@ -273,9 +354,10 @@ static bool vmballoon_check_status(struct vmballoon *b, unsigned long status)
  */
 static bool vmballoon_send_guest_id(struct vmballoon *b)
 {
-	unsigned long status, dummy;
+	unsigned long status, dummy = 0;
 
-	status = VMWARE_BALLOON_CMD(GUEST_ID, VMW_BALLOON_GUEST_ID, dummy);
+	status = VMWARE_BALLOON_CMD(GUEST_ID, VMW_BALLOON_GUEST_ID, dummy,
+				dummy);
 
 	STATS_INC(b->stats.guest_type);
 
@@ -295,6 +377,7 @@ static bool vmballoon_send_get_target(struct vmballoon *b, u32 *new_target)
 	unsigned long status;
 	unsigned long target;
 	unsigned long limit;
+	unsigned long dummy = 0;
 	u32 limit32;
 
 	/*
@@ -313,7 +396,7 @@ static bool vmballoon_send_get_target(struct vmballoon *b, u32 *new_target)
 	/* update stats */
 	STATS_INC(b->stats.target);
 
-	status = VMWARE_BALLOON_CMD(GET_TARGET, limit, target);
+	status = VMWARE_BALLOON_CMD(GET_TARGET, limit, dummy, target);
 	if (vmballoon_check_status(b, status)) {
 		*new_target = target;
 		return true;
@@ -332,7 +415,7 @@ static bool vmballoon_send_get_target(struct vmballoon *b, u32 *new_target)
 static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 				     unsigned int *hv_status)
 {
-	unsigned long status, dummy;
+	unsigned long status, dummy = 0;
 	u32 pfn32;
 
 	pfn32 = (u32)pfn;
@@ -341,7 +424,7 @@ static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 
 	STATS_INC(b->stats.lock);
 
-	*hv_status = status = VMWARE_BALLOON_CMD(LOCK, pfn, dummy);
+	*hv_status = status = VMWARE_BALLOON_CMD(LOCK, pfn, dummy, dummy);
 	if (vmballoon_check_status(b, status))
 		return 0;
 
@@ -350,13 +433,30 @@ static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 	return 1;
 }
 
+static int vmballoon_send_batched_lock(struct vmballoon *b,
+					unsigned int num_pages)
+{
+	unsigned long status, dummy;
+	unsigned long pfn = page_to_pfn(b->page);
+
+	STATS_INC(b->stats.lock);
+
+	status = VMWARE_BALLOON_CMD(BATCHED_LOCK, pfn, num_pages, dummy);
+	if (vmballoon_check_status(b, status))
+		return 0;
+
+	pr_debug("%s - batch ppn %lx, hv returns %ld\n", __func__, pfn, status);
+	STATS_INC(b->stats.lock_fail);
+	return 1;
+}
+
 /*
  * Notify the host that guest intends to release given page back into
  * the pool of available (to the guest) pages.
  */
 static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
 {
-	unsigned long status, dummy;
+	unsigned long status, dummy = 0;
 	u32 pfn32;
 
 	pfn32 = (u32)pfn;
@@ -365,7 +465,7 @@ static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
 
 	STATS_INC(b->stats.unlock);
 
-	status = VMWARE_BALLOON_CMD(UNLOCK, pfn, dummy);
+	status = VMWARE_BALLOON_CMD(UNLOCK, pfn, dummy, dummy);
 	if (vmballoon_check_status(b, status))
 		return true;
 
@@ -374,6 +474,23 @@ static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
 	return false;
 }
 
+static bool vmballoon_send_batched_unlock(struct vmballoon *b,
+						unsigned int num_pages)
+{
+	unsigned long status, dummy;
+	unsigned long pfn = page_to_pfn(b->page);
+
+	STATS_INC(b->stats.unlock);
+
+	status = VMWARE_BALLOON_CMD(BATCHED_UNLOCK, pfn, num_pages, dummy);
+	if (vmballoon_check_status(b, status))
+		return true;
+
+	pr_debug("%s - batch ppn %lx, hv returns %ld\n", __func__, pfn, status);
+	STATS_INC(b->stats.unlock_fail);
+	return false;
+}
+
 /*
  * Quickly release all pages allocated for the balloon. This function is
  * called when host decides to "reset" balloon for one reason or another.
@@ -396,22 +513,13 @@ static void vmballoon_pop(struct vmballoon *b)
 			cond_resched();
 		}
 	}
-}
 
-/*
- * Perform standard reset sequence by popping the balloon (in case it
- * is not  empty) and then restarting protocol. This operation normally
- * happens when host responds with VMW_BALLOON_ERROR_RESET to a command.
- */
-static void vmballoon_reset(struct vmballoon *b)
-{
-	/* free all pages, skipping monitor unlock */
-	vmballoon_pop(b);
+	if ((b->capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
+		if (b->batch_page)
+			vunmap(b->batch_page);
 
-	if (vmballoon_send_start(b)) {
-		b->reset_required = false;
-		if (!vmballoon_send_guest_id(b))
-			pr_err("failed to send guest ID to the host\n");
+		if (b->page)
+			__free_page(b->page);
 	}
 }
 
@@ -420,9 +528,10 @@ static void vmballoon_reset(struct vmballoon *b)
  * refuse list, those refused page are then released at the end of the
  * inflation cycle.
  */
-static int vmballoon_lock_page(struct vmballoon *b, struct page *page)
+static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages)
 {
 	int locked, hv_status;
+	struct page *page = b->page;
 
 	locked = vmballoon_send_lock_page(b, page_to_pfn(page), &hv_status);
 	if (locked > 0) {
@@ -457,17 +566,68 @@ static int vmballoon_lock_page(struct vmballoon *b, struct page *page)
 	return 0;
 }
 
+static int vmballoon_lock_batched_page(struct vmballoon *b,
+				unsigned int num_pages)
+{
+	int locked, i;
+
+	locked = vmballoon_send_batched_lock(b, num_pages);
+	if (locked > 0) {
+		for (i = 0; i < num_pages; i++) {
+			u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
+			struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
+
+			__free_page(p);
+		}
+
+		return -EIO;
+	}
+
+	for (i = 0; i < num_pages; i++) {
+		u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
+		struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
+
+		locked = vmballoon_batch_get_status(b->batch_page, i);
+
+		switch (locked) {
+		case VMW_BALLOON_SUCCESS:
+			list_add(&p->lru, &b->pages);
+			b->size++;
+			break;
+		case VMW_BALLOON_ERROR_PPN_PINNED:
+		case VMW_BALLOON_ERROR_PPN_INVALID:
+			if (b->n_refused_pages < VMW_BALLOON_MAX_REFUSED) {
+				list_add(&p->lru, &b->refused_pages);
+				b->n_refused_pages++;
+				break;
+			}
+			/* Fallthrough */
+		case VMW_BALLOON_ERROR_RESET:
+		case VMW_BALLOON_ERROR_PPN_NOTNEEDED:
+			__free_page(p);
+			break;
+		default:
+			/* This should never happen */
+			WARN_ON_ONCE(true);
+		}
+	}
+
+	return 0;
+}
+
 /*
  * Release the page allocated for the balloon. Note that we first notify
  * the host so it can make sure the page will be available for the guest
  * to use, if needed.
  */
-static int vmballoon_release_page(struct vmballoon *b, struct page *page)
+static int vmballoon_unlock_page(struct vmballoon *b, unsigned int num_pages)
 {
-	if (!vmballoon_send_unlock_page(b, page_to_pfn(page)))
-		return -EIO;
+	struct page *page = b->page;
 
-	list_del(&page->lru);
+	if (!vmballoon_send_unlock_page(b, page_to_pfn(page))) {
+		list_add(&page->lru, &b->pages);
+		return -EIO;
+	}
 
 	/* deallocate page */
 	__free_page(page);
@@ -479,6 +639,41 @@ static int vmballoon_release_page(struct vmballoon *b, struct page *page)
 	return 0;
 }
 
+static int vmballoon_unlock_batched_page(struct vmballoon *b,
+					unsigned int num_pages)
+{
+	int locked, i, ret = 0;
+	bool hv_success;
+
+	hv_success = vmballoon_send_batched_unlock(b, num_pages);
+	if (!hv_success)
+		ret = -EIO;
+
+	for (i = 0; i < num_pages; i++) {
+		u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
+		struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
+
+		locked = vmballoon_batch_get_status(b->batch_page, i);
+		if (!hv_success || locked != VMW_BALLOON_SUCCESS) {
+			/*
+			 * That page wasn't successfully unlocked by the
+			 * hypervisor, re-add it to the list of pages owned by
+			 * the balloon driver.
+			 */
+			list_add(&p->lru, &b->pages);
+		} else {
+			/* deallocate page */
+			__free_page(p);
+			STATS_INC(b->stats.free);
+
+			/* update balloon size */
+			b->size--;
+		}
+	}
+
+	return ret;
+}
+
 /*
  * Release pages that were allocated while attempting to inflate the
  * balloon but were refused by the host for one reason or another.
@@ -496,6 +691,18 @@ static void vmballoon_release_refused_pages(struct vmballoon *b)
 	b->n_refused_pages = 0;
 }
 
+static void vmballoon_add_page(struct vmballoon *b, int idx, struct page *p)
+{
+	b->page = p;
+}
+
+static void vmballoon_add_batched_page(struct vmballoon *b, int idx,
+				struct page *p)
+{
+	vmballoon_batch_set_pa(b->batch_page, idx,
+			(u64)page_to_pfn(p) << PAGE_SHIFT);
+}
+
 /*
  * Inflate the balloon towards its target size. Note that we try to limit
  * the rate of allocation to make sure we are not choking the rest of the
@@ -507,6 +714,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 	unsigned int rate;
 	unsigned int i;
 	unsigned int allocations = 0;
+	unsigned int num_pages = 0;
 	int error = 0;
 	gfp_t flags = VMW_PAGE_ALLOC_NOSLEEP;
 
@@ -539,14 +747,13 @@ static void vmballoon_inflate(struct vmballoon *b)
 		 __func__, goal, rate, b->rate_alloc);
 
 	for (i = 0; i < goal; i++) {
-		struct page *page;
+		struct page *page = alloc_page(flags);
 
 		if (flags == VMW_PAGE_ALLOC_NOSLEEP)
 			STATS_INC(b->stats.alloc);
 		else
 			STATS_INC(b->stats.sleep_alloc);
 
-		page = alloc_page(flags);
 		if (!page) {
 			if (flags == VMW_PAGE_ALLOC_CANSLEEP) {
 				/*
@@ -580,9 +787,13 @@ static void vmballoon_inflate(struct vmballoon *b)
 			continue;
 		}
 
-		error = vmballoon_lock_page(b, page);
-		if (error)
-			break;
+		b->ops->add_page(b, num_pages++, page);
+		if (num_pages == b->batch_max_pages) {
+			error = b->ops->lock(b, num_pages);
+			num_pages = 0;
+			if (error)
+				break;
+		}
 
 		if (++allocations > VMW_BALLOON_YIELD_THRESHOLD) {
 			cond_resched();
@@ -595,6 +806,9 @@ static void vmballoon_inflate(struct vmballoon *b)
 		}
 	}
 
+	if (num_pages > 0)
+		b->ops->lock(b, num_pages);
+
 	/*
 	 * We reached our goal without failures so try increasing
 	 * allocation rate.
@@ -618,6 +832,7 @@ static void vmballoon_deflate(struct vmballoon *b)
 	struct page *page, *next;
 	unsigned int i = 0;
 	unsigned int goal;
+	unsigned int num_pages = 0;
 	int error;
 
 	pr_debug("%s - size: %d, target %d\n", __func__, b->size, b->target);
@@ -629,21 +844,94 @@ static void vmballoon_deflate(struct vmballoon *b)
 
 	/* free pages to reach target */
 	list_for_each_entry_safe(page, next, &b->pages, lru) {
-		error = vmballoon_release_page(b, page);
-		if (error) {
-			/* quickly decrease rate in case of error */
-			b->rate_free = max(b->rate_free / 2,
-					   VMW_BALLOON_RATE_FREE_MIN);
-			return;
+		list_del(&page->lru);
+		b->ops->add_page(b, num_pages++, page);
+
+		if (num_pages == b->batch_max_pages) {
+			error = b->ops->unlock(b, num_pages);
+			num_pages = 0;
+			if (error) {
+				/* quickly decrease rate in case of error */
+				b->rate_free = max(b->rate_free / 2,
+						VMW_BALLOON_RATE_FREE_MIN);
+				return;
+			}
 		}
 
 		if (++i >= goal)
 			break;
 	}
 
+	if (num_pages > 0)
+		b->ops->unlock(b, num_pages);
+
 	/* slowly increase rate if there were no errors */
-	b->rate_free = min(b->rate_free + VMW_BALLOON_RATE_FREE_INC,
-			   VMW_BALLOON_RATE_FREE_MAX);
+	if (error == 0)
+		b->rate_free = min(b->rate_free + VMW_BALLOON_RATE_FREE_INC,
+				   VMW_BALLOON_RATE_FREE_MAX);
+}
+
+static const struct vmballoon_ops vmballoon_basic_ops = {
+	.add_page = vmballoon_add_page,
+	.lock = vmballoon_lock_page,
+	.unlock = vmballoon_unlock_page
+};
+
+static const struct vmballoon_ops vmballoon_batched_ops = {
+	.add_page = vmballoon_add_batched_page,
+	.lock = vmballoon_lock_batched_page,
+	.unlock = vmballoon_unlock_batched_page
+};
+
+static bool vmballoon_init_batching(struct vmballoon *b)
+{
+	b->page = alloc_page(VMW_PAGE_ALLOC_NOSLEEP);
+	if (!b->page)
+		return false;
+
+	b->batch_page = vmap(&b->page, 1, VM_MAP, PAGE_KERNEL);
+	if (!b->batch_page) {
+		__free_page(b->page);
+		return false;
+	}
+
+	return true;
+}
+
+/*
+ * Perform standard reset sequence by popping the balloon (in case it
+ * is not  empty) and then restarting protocol. This operation normally
+ * happens when host responds with VMW_BALLOON_ERROR_RESET to a command.
+ */
+static void vmballoon_reset(struct vmballoon *b)
+{
+	/* free all pages, skipping monitor unlock */
+	vmballoon_pop(b);
+
+	if (!vmballoon_send_start(b, VMW_BALLOON_CAPABILITIES))
+		return;
+
+	if ((b->capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
+		b->ops = &vmballoon_batched_ops;
+		b->batch_max_pages = VMW_BALLOON_BATCH_MAX_PAGES;
+		if (!vmballoon_init_batching(b)) {
+			/*
+			 * We failed to initialize batching, inform the monitor
+			 * about it by sending a null capability.
+			 *
+			 * The guest will retry in one second.
+			 */
+			vmballoon_send_start(b, 0);
+			return;
+		}
+	} else if ((b->capabilities & VMW_BALLOON_BASIC_CMDS) != 0) {
+		b->ops = &vmballoon_basic_ops;
+		b->batch_max_pages = 1;
+	}
+
+	b->reset_required = false;
+	if (!vmballoon_send_guest_id(b))
+		pr_err("failed to send guest ID to the host\n");
 }
 
 /*
@@ -802,11 +1090,23 @@ static int __init vmballoon_init(void)
 	/*
 	 * Start balloon.
 	 */
-	if (!vmballoon_send_start(&balloon)) {
+	if (!vmballoon_send_start(&balloon, VMW_BALLOON_CAPABILITIES)) {
 		pr_err("failed to send start command to the host\n");
 		return -EIO;
 	}
 
+	if ((balloon.capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
+		balloon.ops = &vmballoon_batched_ops;
+		balloon.batch_max_pages = VMW_BALLOON_BATCH_MAX_PAGES;
+		if (!vmballoon_init_batching(&balloon)) {
+			pr_err("failed to init batching\n");
+			return -EIO;
+		}
+	} else if ((balloon.capabilities & VMW_BALLOON_BASIC_CMDS) != 0) {
+		balloon.ops = &vmballoon_basic_ops;
+		balloon.batch_max_pages = 1;
+	}
+
 	if (!vmballoon_send_guest_id(&balloon)) {
 		pr_err("failed to send guest ID to the host\n");
 		return -EIO;
@@ -833,7 +1133,7 @@ static void __exit vmballoon_exit(void)
 	 * Reset connection before deallocating memory to avoid potential for
 	 * additional spurious resets from guest touching deallocated pages.
 	 */
-	vmballoon_send_start(&balloon);
+	vmballoon_send_start(&balloon, VMW_BALLOON_CAPABILITIES);
 	vmballoon_pop(&balloon);
 }
 module_exit(vmballoon_exit);
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v4 4/9] VMware balloon: Update balloon target on each lock/unlock.
  2015-08-05 23:26                           ` gregkh
                                               ` (3 preceding siblings ...)
  2015-08-06 20:33                             ` [PATCH v4 3/9] VMware balloon: add batching to the vmw_balloon Philip P. Moltmann
@ 2015-08-06 20:33                             ` Philip P. Moltmann
  2015-08-06 20:33                             ` [PATCH v4 5/9] VMware balloon: Show capabilities of balloon and resulting capabilities in the debug-fs node Philip P. Moltmann
                                               ` (4 subsequent siblings)
  9 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-08-06 20:33 UTC (permalink / raw)
  To: gregkh
  Cc: Xavier Deguillard, dmitry.torokhov, linux-kernel, akpm,
	pv-drivers, Philip P. Moltmann

From: Xavier Deguillard <xdeguillard@vmware.com>

Instead of waiting for the next GET_TARGET command, we can react faster
by exploiting the fact that each hypervisor call also returns the
balloon target.

Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
Acked-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Philip P. Moltmann <moltmann@vmware.com>
Acked-by: Andy King <acking@vmware.com>
---
 drivers/misc/vmw_balloon.c | 85 +++++++++++++++++++++++-----------------------
 1 file changed, 42 insertions(+), 43 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 64f275e..0b5aa93 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -47,7 +47,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.3.1.0-k");
+MODULE_VERSION("1.3.2.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -255,8 +255,10 @@ struct vmballoon;
 
 struct vmballoon_ops {
 	void (*add_page)(struct vmballoon *b, int idx, struct page *p);
-	int (*lock)(struct vmballoon *b, unsigned int num_pages);
-	int (*unlock)(struct vmballoon *b, unsigned int num_pages);
+	int (*lock)(struct vmballoon *b, unsigned int num_pages,
+						unsigned int *target);
+	int (*unlock)(struct vmballoon *b, unsigned int num_pages,
+						unsigned int *target);
 };
 
 struct vmballoon {
@@ -413,7 +415,7 @@ static bool vmballoon_send_get_target(struct vmballoon *b, u32 *new_target)
  * check the return value and maybe submit a different page.
  */
 static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
-				     unsigned int *hv_status)
+				unsigned int *hv_status, unsigned int *target)
 {
 	unsigned long status, dummy = 0;
 	u32 pfn32;
@@ -424,7 +426,7 @@ static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 
 	STATS_INC(b->stats.lock);
 
-	*hv_status = status = VMWARE_BALLOON_CMD(LOCK, pfn, dummy, dummy);
+	*hv_status = status = VMWARE_BALLOON_CMD(LOCK, pfn, dummy, *target);
 	if (vmballoon_check_status(b, status))
 		return 0;
 
@@ -434,14 +436,14 @@ static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 }
 
 static int vmballoon_send_batched_lock(struct vmballoon *b,
-					unsigned int num_pages)
+				unsigned int num_pages, unsigned int *target)
 {
-	unsigned long status, dummy;
+	unsigned long status;
 	unsigned long pfn = page_to_pfn(b->page);
 
 	STATS_INC(b->stats.lock);
 
-	status = VMWARE_BALLOON_CMD(BATCHED_LOCK, pfn, num_pages, dummy);
+	status = VMWARE_BALLOON_CMD(BATCHED_LOCK, pfn, num_pages, *target);
 	if (vmballoon_check_status(b, status))
 		return 0;
 
@@ -454,7 +456,8 @@ static int vmballoon_send_batched_lock(struct vmballoon *b,
  * Notify the host that guest intends to release given page back into
  * the pool of available (to the guest) pages.
  */
-static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
+static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn,
+							unsigned int *target)
 {
 	unsigned long status, dummy = 0;
 	u32 pfn32;
@@ -465,7 +468,7 @@ static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
 
 	STATS_INC(b->stats.unlock);
 
-	status = VMWARE_BALLOON_CMD(UNLOCK, pfn, dummy, dummy);
+	status = VMWARE_BALLOON_CMD(UNLOCK, pfn, dummy, *target);
 	if (vmballoon_check_status(b, status))
 		return true;
 
@@ -475,14 +478,14 @@ static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
 }
 
 static bool vmballoon_send_batched_unlock(struct vmballoon *b,
-						unsigned int num_pages)
+				unsigned int num_pages, unsigned int *target)
 {
-	unsigned long status, dummy;
+	unsigned long status;
 	unsigned long pfn = page_to_pfn(b->page);
 
 	STATS_INC(b->stats.unlock);
 
-	status = VMWARE_BALLOON_CMD(BATCHED_UNLOCK, pfn, num_pages, dummy);
+	status = VMWARE_BALLOON_CMD(BATCHED_UNLOCK, pfn, num_pages, *target);
 	if (vmballoon_check_status(b, status))
 		return true;
 
@@ -528,12 +531,14 @@ static void vmballoon_pop(struct vmballoon *b)
  * refuse list, those refused page are then released at the end of the
  * inflation cycle.
  */
-static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages)
+static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages,
+							unsigned int *target)
 {
 	int locked, hv_status;
 	struct page *page = b->page;
 
-	locked = vmballoon_send_lock_page(b, page_to_pfn(page), &hv_status);
+	locked = vmballoon_send_lock_page(b, page_to_pfn(page), &hv_status,
+								target);
 	if (locked > 0) {
 		STATS_INC(b->stats.refused_alloc);
 
@@ -567,11 +572,11 @@ static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages)
 }
 
 static int vmballoon_lock_batched_page(struct vmballoon *b,
-				unsigned int num_pages)
+				unsigned int num_pages, unsigned int *target)
 {
 	int locked, i;
 
-	locked = vmballoon_send_batched_lock(b, num_pages);
+	locked = vmballoon_send_batched_lock(b, num_pages, target);
 	if (locked > 0) {
 		for (i = 0; i < num_pages; i++) {
 			u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
@@ -620,11 +625,12 @@ static int vmballoon_lock_batched_page(struct vmballoon *b,
  * the host so it can make sure the page will be available for the guest
  * to use, if needed.
  */
-static int vmballoon_unlock_page(struct vmballoon *b, unsigned int num_pages)
+static int vmballoon_unlock_page(struct vmballoon *b, unsigned int num_pages,
+							unsigned int *target)
 {
 	struct page *page = b->page;
 
-	if (!vmballoon_send_unlock_page(b, page_to_pfn(page))) {
+	if (!vmballoon_send_unlock_page(b, page_to_pfn(page), target)) {
 		list_add(&page->lru, &b->pages);
 		return -EIO;
 	}
@@ -640,12 +646,12 @@ static int vmballoon_unlock_page(struct vmballoon *b, unsigned int num_pages)
 }
 
 static int vmballoon_unlock_batched_page(struct vmballoon *b,
-					unsigned int num_pages)
+				unsigned int num_pages, unsigned int *target)
 {
 	int locked, i, ret = 0;
 	bool hv_success;
 
-	hv_success = vmballoon_send_batched_unlock(b, num_pages);
+	hv_success = vmballoon_send_batched_unlock(b, num_pages, target);
 	if (!hv_success)
 		ret = -EIO;
 
@@ -710,9 +716,7 @@ static void vmballoon_add_batched_page(struct vmballoon *b, int idx,
  */
 static void vmballoon_inflate(struct vmballoon *b)
 {
-	unsigned int goal;
 	unsigned int rate;
-	unsigned int i;
 	unsigned int allocations = 0;
 	unsigned int num_pages = 0;
 	int error = 0;
@@ -735,7 +739,6 @@ static void vmballoon_inflate(struct vmballoon *b)
 	 * slowdown page allocations considerably.
 	 */
 
-	goal = b->target - b->size;
 	/*
 	 * Start with no sleep allocation rate which may be higher
 	 * than sleeping allocation rate.
@@ -744,16 +747,17 @@ static void vmballoon_inflate(struct vmballoon *b)
 			b->rate_alloc : VMW_BALLOON_NOSLEEP_ALLOC_MAX;
 
 	pr_debug("%s - goal: %d, no-sleep rate: %d, sleep rate: %d\n",
-		 __func__, goal, rate, b->rate_alloc);
+		 __func__, b->target - b->size, rate, b->rate_alloc);
 
-	for (i = 0; i < goal; i++) {
-		struct page *page = alloc_page(flags);
+	while (b->size < b->target && num_pages < b->target - b->size) {
+		struct page *page;
 
 		if (flags == VMW_PAGE_ALLOC_NOSLEEP)
 			STATS_INC(b->stats.alloc);
 		else
 			STATS_INC(b->stats.sleep_alloc);
 
+		page = alloc_page(flags);
 		if (!page) {
 			if (flags == VMW_PAGE_ALLOC_CANSLEEP) {
 				/*
@@ -778,7 +782,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 			 */
 			b->slow_allocation_cycles = VMW_BALLOON_SLOW_CYCLES;
 
-			if (i >= b->rate_alloc)
+			if (allocations >= b->rate_alloc)
 				break;
 
 			flags = VMW_PAGE_ALLOC_CANSLEEP;
@@ -789,7 +793,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 
 		b->ops->add_page(b, num_pages++, page);
 		if (num_pages == b->batch_max_pages) {
-			error = b->ops->lock(b, num_pages);
+			error = b->ops->lock(b, num_pages, &b->target);
 			num_pages = 0;
 			if (error)
 				break;
@@ -800,21 +804,21 @@ static void vmballoon_inflate(struct vmballoon *b)
 			allocations = 0;
 		}
 
-		if (i >= rate) {
+		if (allocations >= rate) {
 			/* We allocated enough pages, let's take a break. */
 			break;
 		}
 	}
 
 	if (num_pages > 0)
-		b->ops->lock(b, num_pages);
+		b->ops->lock(b, num_pages, &b->target);
 
 	/*
 	 * We reached our goal without failures so try increasing
 	 * allocation rate.
 	 */
-	if (error == 0 && i >= b->rate_alloc) {
-		unsigned int mult = i / b->rate_alloc;
+	if (error == 0 && allocations >= b->rate_alloc) {
+		unsigned int mult = allocations / b->rate_alloc;
 
 		b->rate_alloc =
 			min(b->rate_alloc + mult * VMW_BALLOON_RATE_ALLOC_INC,
@@ -831,16 +835,11 @@ static void vmballoon_deflate(struct vmballoon *b)
 {
 	struct page *page, *next;
 	unsigned int i = 0;
-	unsigned int goal;
 	unsigned int num_pages = 0;
 	int error;
 
-	pr_debug("%s - size: %d, target %d\n", __func__, b->size, b->target);
-
-	/* limit deallocation rate */
-	goal = min(b->size - b->target, b->rate_free);
-
-	pr_debug("%s - goal: %d, rate: %d\n", __func__, goal, b->rate_free);
+	pr_debug("%s - size: %d, target %d, rate: %d\n", __func__, b->size,
+						b->target, b->rate_free);
 
 	/* free pages to reach target */
 	list_for_each_entry_safe(page, next, &b->pages, lru) {
@@ -848,7 +847,7 @@ static void vmballoon_deflate(struct vmballoon *b)
 		b->ops->add_page(b, num_pages++, page);
 
 		if (num_pages == b->batch_max_pages) {
-			error = b->ops->unlock(b, num_pages);
+			error = b->ops->unlock(b, num_pages, &b->target);
 			num_pages = 0;
 			if (error) {
 				/* quickly decrease rate in case of error */
@@ -858,12 +857,12 @@ static void vmballoon_deflate(struct vmballoon *b)
 			}
 		}
 
-		if (++i >= goal)
+		if (++i >= b->size - b->target)
 			break;
 	}
 
 	if (num_pages > 0)
-		b->ops->unlock(b, num_pages);
+		b->ops->unlock(b, num_pages, &b->target);
 
 	/* slowly increase rate if there were no errors */
 	if (error == 0)
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v4 5/9] VMware balloon: Show capabilities of balloon and resulting capabilities in the debug-fs node.
  2015-08-05 23:26                           ` gregkh
                                               ` (4 preceding siblings ...)
  2015-08-06 20:33                             ` [PATCH v4 4/9] VMware balloon: Update balloon target on each lock/unlock Philip P. Moltmann
@ 2015-08-06 20:33                             ` Philip P. Moltmann
  2015-08-06 20:33                             ` [PATCH v4 6/9] VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode Philip P. Moltmann
                                               ` (3 subsequent siblings)
  9 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-08-06 20:33 UTC (permalink / raw)
  To: gregkh
  Cc: Philip P. Moltmann, dmitry.torokhov, linux-kernel, xdeguillard,
	akpm, pv-drivers

This helps with debugging vmw_balloon behavior, as it makes clear which
functionality is enabled.

Acked-by: Andy King <acking@vmware.com>
Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
---
 drivers/misc/vmw_balloon.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 0b5aa93..f0beb65 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -47,7 +47,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.3.2.0-k");
+MODULE_VERSION("1.3.3.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -979,6 +979,12 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 	struct vmballoon *b = f->private;
 	struct vmballoon_stats *stats = &b->stats;
 
+	/* format capabilities info */
+	seq_printf(f,
+		   "balloon capabilities:   %#4x\n"
+		   "used capabilities:      %#4lx\n",
+		   VMW_BALLOON_CAPABILITIES, b->capabilities);
+
 	/* format size info */
 	seq_printf(f,
 		   "target:             %8d pages\n"
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v4 6/9] VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode.
  2015-08-05 23:26                           ` gregkh
                                               ` (5 preceding siblings ...)
  2015-08-06 20:33                             ` [PATCH v4 5/9] VMware balloon: Show capabilities of balloon and resulting capabilities in the debug-fs node Philip P. Moltmann
@ 2015-08-06 20:33                             ` Philip P. Moltmann
  2015-08-06 20:33                             ` [PATCH v4 7/9] VMware balloon: Support 2m page ballooning Philip P. Moltmann
                                               ` (2 subsequent siblings)
  9 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-08-06 20:33 UTC (permalink / raw)
  To: gregkh
  Cc: Philip P. Moltmann, dmitry.torokhov, linux-kernel, xdeguillard,
	akpm, pv-drivers

When VMware's hypervisor requests a VM to reclaim memory, this is preferably done
via ballooning. If the balloon driver does not return memory fast enough, more
drastic methods, such as hypervisor-level swapping, are needed. These other methods
cause performance issues; e.g., hypervisor-level swapping requires the hypervisor to
swap in a page synchronously while the virtual CPU is blocked.

Hence it is in the interest of the VM to balloon memory as fast as possible. The
problem with doing this is that the VM might end up doing nothing but ballooning,
and the user might notice that the VM is stalled, especially when the VM has
only a single virtual CPU.

This is less of a problem if the VM and the hypervisor perform balloon operations
faster. Also, the balloon driver yields regularly, so on a single virtual CPU
the Linux scheduler should be able to properly time-slice between ballooning and
other tasks.
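The loop shape described above can be sketched in userspace C. This is an illustrative model, not the kernel code: `flush_batch`, `balloon_pages`, and `BATCH_MAX` are hypothetical names, and `sched_yield()` stands in for the kernel's `cond_resched()`, which the patch now calls on every iteration instead of every `VMW_BALLOON_YIELD_THRESHOLD` pages:

```c
#include <assert.h>
#include <sched.h>

#define BATCH_MAX 64	/* illustrative stand-in for b->batch_max_pages */

static int flushes;

/* stand-in for handing a filled batch page to the hypervisor */
static void flush_batch(int n)
{
	if (n > 0)
		flushes++;
}

static int balloon_pages(int target)
{
	int num_pages = 0, allocated = 0;

	while (allocated < target) {
		allocated++;		/* "allocate" one page */
		if (++num_pages == BATCH_MAX) {
			flush_batch(num_pages);
			num_pages = 0;
		}
		sched_yield();		/* cond_resched() in the kernel */
	}
	flush_batch(num_pages);		/* flush the partial final batch */
	return allocated;
}
```

Because the yield happens every iteration, ballooning stays preemptible even on a single virtual CPU, while batching keeps the per-page hypervisor-call overhead low.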

Testing Done: quickly ballooned a large number of pages while watching for any
perceived hiccups (periods of non-responsiveness) in the execution of the
Linux VM. No such hiccups were seen.

Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
---
 drivers/misc/vmw_balloon.c | 68 +++++++++++-----------------------------------
 1 file changed, 16 insertions(+), 52 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index f0beb65..aed9525 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -47,7 +47,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.3.3.0-k");
+MODULE_VERSION("1.3.4.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -58,12 +58,6 @@ MODULE_LICENSE("GPL");
  */
 
 /*
- * Rate of allocating memory when there is no memory pressure
- * (driver performs non-sleeping allocations).
- */
-#define VMW_BALLOON_NOSLEEP_ALLOC_MAX	16384U
-
-/*
  * Rates of memory allocaton when guest experiences memory pressure
  * (driver performs sleeping allocations).
  */
@@ -72,13 +66,6 @@ MODULE_LICENSE("GPL");
 #define VMW_BALLOON_RATE_ALLOC_INC	16U
 
 /*
- * Rates for releasing pages while deflating balloon.
- */
-#define VMW_BALLOON_RATE_FREE_MIN	512U
-#define VMW_BALLOON_RATE_FREE_MAX	16384U
-#define VMW_BALLOON_RATE_FREE_INC	16U
-
-/*
  * When guest is under memory pressure, use a reduced page allocation
  * rate for next several cycles.
  */
@@ -100,9 +87,6 @@ MODULE_LICENSE("GPL");
  */
 #define VMW_PAGE_ALLOC_CANSLEEP		(GFP_HIGHUSER)
 
-/* Maximum number of page allocations without yielding processor */
-#define VMW_BALLOON_YIELD_THRESHOLD	1024
-
 /* Maximum number of refused pages we accumulate during inflation cycle */
 #define VMW_BALLOON_MAX_REFUSED		16
 
@@ -279,7 +263,6 @@ struct vmballoon {
 
 	/* adjustment rates (pages per second) */
 	unsigned int rate_alloc;
-	unsigned int rate_free;
 
 	/* slowdown page allocations for next few cycles */
 	unsigned int slow_allocation_cycles;
@@ -503,18 +486,13 @@ static bool vmballoon_send_batched_unlock(struct vmballoon *b,
 static void vmballoon_pop(struct vmballoon *b)
 {
 	struct page *page, *next;
-	unsigned int count = 0;
 
 	list_for_each_entry_safe(page, next, &b->pages, lru) {
 		list_del(&page->lru);
 		__free_page(page);
 		STATS_INC(b->stats.free);
 		b->size--;
-
-		if (++count >= b->rate_free) {
-			count = 0;
-			cond_resched();
-		}
+		cond_resched();
 	}
 
 	if ((b->capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
@@ -716,7 +694,7 @@ static void vmballoon_add_batched_page(struct vmballoon *b, int idx,
  */
 static void vmballoon_inflate(struct vmballoon *b)
 {
-	unsigned int rate;
+	unsigned rate;
 	unsigned int allocations = 0;
 	unsigned int num_pages = 0;
 	int error = 0;
@@ -743,13 +721,13 @@ static void vmballoon_inflate(struct vmballoon *b)
 	 * Start with no sleep allocation rate which may be higher
 	 * than sleeping allocation rate.
 	 */
-	rate = b->slow_allocation_cycles ?
-			b->rate_alloc : VMW_BALLOON_NOSLEEP_ALLOC_MAX;
+	rate = b->slow_allocation_cycles ? b->rate_alloc : UINT_MAX;
 
-	pr_debug("%s - goal: %d, no-sleep rate: %d, sleep rate: %d\n",
+	pr_debug("%s - goal: %d, no-sleep rate: %u, sleep rate: %d\n",
 		 __func__, b->target - b->size, rate, b->rate_alloc);
 
-	while (b->size < b->target && num_pages < b->target - b->size) {
+	while (!b->reset_required &&
+		b->size < b->target && num_pages < b->target - b->size) {
 		struct page *page;
 
 		if (flags == VMW_PAGE_ALLOC_NOSLEEP)
@@ -799,10 +777,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 				break;
 		}
 
-		if (++allocations > VMW_BALLOON_YIELD_THRESHOLD) {
-			cond_resched();
-			allocations = 0;
-		}
+		cond_resched();
 
 		if (allocations >= rate) {
 			/* We allocated enough pages, let's take a break. */
@@ -838,36 +813,29 @@ static void vmballoon_deflate(struct vmballoon *b)
 	unsigned int num_pages = 0;
 	int error;
 
-	pr_debug("%s - size: %d, target %d, rate: %d\n", __func__, b->size,
-						b->target, b->rate_free);
+	pr_debug("%s - size: %d, target %d\n", __func__, b->size, b->target);
 
 	/* free pages to reach target */
 	list_for_each_entry_safe(page, next, &b->pages, lru) {
 		list_del(&page->lru);
 		b->ops->add_page(b, num_pages++, page);
 
+
 		if (num_pages == b->batch_max_pages) {
 			error = b->ops->unlock(b, num_pages, &b->target);
 			num_pages = 0;
-			if (error) {
-				/* quickly decrease rate in case of error */
-				b->rate_free = max(b->rate_free / 2,
-						VMW_BALLOON_RATE_FREE_MIN);
+			if (error)
 				return;
-			}
 		}
 
-		if (++i >= b->size - b->target)
+		if (b->reset_required || ++i >= b->size - b->target)
 			break;
+
+		cond_resched();
 	}
 
 	if (num_pages > 0)
 		b->ops->unlock(b, num_pages, &b->target);
-
-	/* slowly increase rate if there were no errors */
-	if (error == 0)
-		b->rate_free = min(b->rate_free + VMW_BALLOON_RATE_FREE_INC,
-				   VMW_BALLOON_RATE_FREE_MAX);
 }
 
 static const struct vmballoon_ops vmballoon_basic_ops = {
@@ -993,11 +961,8 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 
 	/* format rate info */
 	seq_printf(f,
-		   "rateNoSleepAlloc:   %8d pages/sec\n"
-		   "rateSleepAlloc:     %8d pages/sec\n"
-		   "rateFree:           %8d pages/sec\n",
-		   VMW_BALLOON_NOSLEEP_ALLOC_MAX,
-		   b->rate_alloc, b->rate_free);
+		   "rateSleepAlloc:     %8d pages/sec\n",
+		   b->rate_alloc);
 
 	seq_printf(f,
 		   "\n"
@@ -1088,7 +1053,6 @@ static int __init vmballoon_init(void)
 
 	/* initialize rates */
 	balloon.rate_alloc = VMW_BALLOON_RATE_ALLOC_MAX;
-	balloon.rate_free = VMW_BALLOON_RATE_FREE_MAX;
 
 	INIT_DELAYED_WORK(&balloon.dwork, vmballoon_work);
 
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v4 7/9] VMware balloon: Support 2m page ballooning.
  2015-08-05 23:26                           ` gregkh
                                               ` (6 preceding siblings ...)
  2015-08-06 20:33                             ` [PATCH v4 6/9] VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode Philip P. Moltmann
@ 2015-08-06 20:33                             ` Philip P. Moltmann
  2015-08-06 20:33                             ` [PATCH v4 8/9] VMware balloon: Treat init like reset Philip P. Moltmann
  2015-08-06 20:33                             ` [PATCH v4 9/9] VMware balloon: Enable notification via VMCI Philip P. Moltmann
  9 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-08-06 20:33 UTC (permalink / raw)
  To: gregkh
  Cc: Philip P. Moltmann, dmitry.torokhov, linux-kernel, xdeguillard,
	akpm, pv-drivers

2 MB ballooning significantly reduces the hypervisor-side (and guest-side)
overhead of ballooning and unballooning.

hypervisor only:
      balloon  unballoon
4 KB  2 GB/s   2.6 GB/s
2 MB  54 GB/s  767 GB/s

Use 2 MB pages, as the hypervisor is always 64-bit and 2 MB is the smallest
supported super-page size.
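The order arithmetic behind the patch's `VMW_BALLOON_2M_SHIFT` can be checked with a small sketch (the helper names below are illustrative): with 4 KiB base pages, a 2 MiB huge page covers 2^9 = 512 base pages, so the driver allocates order-9 pages and accounts the balloon size in 4 KiB units:

```c
#include <assert.h>

#define PAGE_SHIFT_4K		12	/* 4 KiB = 1 << 12 bytes */
#define VMW_BALLOON_2M_SHIFT	9	/* as defined in the patch */

/* number of 4 KiB pages covered by one 2 MiB page */
static unsigned pages_per_2m(void)
{
	return 1u << VMW_BALLOON_2M_SHIFT;
}

/* size in bytes of one 2 MiB page */
static unsigned long huge_page_bytes(void)
{
	return 1ul << (PAGE_SHIFT_4K + VMW_BALLOON_2M_SHIFT);
}
```

This is why the patch multiplies by `vmballoon_page_size(is_2m_pages)` when updating `b->size`: the balloon size stays denominated in 4 KiB pages regardless of which page size was locked.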

The code has to run on older versions of ESX, and old balloon drivers run on
newer versions of ESX. Hence the driver matches capabilities with the host
before 2 MB page ballooning is enabled.

Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
---
 drivers/misc/vmw_balloon.c | 376 +++++++++++++++++++++++++++++++--------------
 1 file changed, 258 insertions(+), 118 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index aed9525..01519ff 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -47,7 +47,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.3.4.0-k");
+MODULE_VERSION("1.4.0.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -102,11 +102,16 @@ enum vmwballoon_capabilities {
 	 * Bit 0 is reserved and not associated to any capability.
 	 */
 	VMW_BALLOON_BASIC_CMDS		= (1 << 1),
-	VMW_BALLOON_BATCHED_CMDS	= (1 << 2)
+	VMW_BALLOON_BATCHED_CMDS	= (1 << 2),
+	VMW_BALLOON_BATCHED_2M_CMDS     = (1 << 3),
 };
 
 #define VMW_BALLOON_CAPABILITIES	(VMW_BALLOON_BASIC_CMDS \
-					| VMW_BALLOON_BATCHED_CMDS)
+					| VMW_BALLOON_BATCHED_CMDS \
+					| VMW_BALLOON_BATCHED_2M_CMDS)
+
+#define VMW_BALLOON_2M_SHIFT		(9)
+#define VMW_BALLOON_NUM_PAGE_SIZES	(2)
 
 /*
  * Backdoor commands availability:
@@ -117,14 +122,19 @@ enum vmwballoon_capabilities {
  *	LOCK and UNLOCK commands,
  * VMW_BALLOON_BATCHED_CMDS:
  *	BATCHED_LOCK and BATCHED_UNLOCK commands.
+ * VMW_BALLOON_BATCHED_2M_CMDS:
+ *	BATCHED_2M_LOCK and BATCHED_2M_UNLOCK commands.
  */
-#define VMW_BALLOON_CMD_START		0
-#define VMW_BALLOON_CMD_GET_TARGET	1
-#define VMW_BALLOON_CMD_LOCK		2
-#define VMW_BALLOON_CMD_UNLOCK		3
-#define VMW_BALLOON_CMD_GUEST_ID	4
-#define VMW_BALLOON_CMD_BATCHED_LOCK	6
-#define VMW_BALLOON_CMD_BATCHED_UNLOCK	7
+#define VMW_BALLOON_CMD_START			0
+#define VMW_BALLOON_CMD_GET_TARGET		1
+#define VMW_BALLOON_CMD_LOCK			2
+#define VMW_BALLOON_CMD_UNLOCK			3
+#define VMW_BALLOON_CMD_GUEST_ID		4
+#define VMW_BALLOON_CMD_BATCHED_LOCK		6
+#define VMW_BALLOON_CMD_BATCHED_UNLOCK		7
+#define VMW_BALLOON_CMD_BATCHED_2M_LOCK		8
+#define VMW_BALLOON_CMD_BATCHED_2M_UNLOCK	9
+
 
 /* error codes */
 #define VMW_BALLOON_SUCCESS		        0
@@ -152,9 +162,6 @@ enum vmwballoon_capabilities {
  * +-------------+----------+--------+
  * 64  PAGE_SHIFT          6         0
  *
- * For now only 4K pages are supported, but we can easily support large pages
- * by using bits in the reserved field.
- *
  * The reserved field should be set to 0.
  */
 #define VMW_BALLOON_BATCH_MAX_PAGES	(PAGE_SIZE / sizeof(u64))
@@ -209,19 +216,19 @@ struct vmballoon_stats {
 	unsigned int timer;
 
 	/* allocation statistics */
-	unsigned int alloc;
-	unsigned int alloc_fail;
+	unsigned int alloc[VMW_BALLOON_NUM_PAGE_SIZES];
+	unsigned int alloc_fail[VMW_BALLOON_NUM_PAGE_SIZES];
 	unsigned int sleep_alloc;
 	unsigned int sleep_alloc_fail;
-	unsigned int refused_alloc;
-	unsigned int refused_free;
-	unsigned int free;
+	unsigned int refused_alloc[VMW_BALLOON_NUM_PAGE_SIZES];
+	unsigned int refused_free[VMW_BALLOON_NUM_PAGE_SIZES];
+	unsigned int free[VMW_BALLOON_NUM_PAGE_SIZES];
 
 	/* monitor operations */
-	unsigned int lock;
-	unsigned int lock_fail;
-	unsigned int unlock;
-	unsigned int unlock_fail;
+	unsigned int lock[VMW_BALLOON_NUM_PAGE_SIZES];
+	unsigned int lock_fail[VMW_BALLOON_NUM_PAGE_SIZES];
+	unsigned int unlock[VMW_BALLOON_NUM_PAGE_SIZES];
+	unsigned int unlock_fail[VMW_BALLOON_NUM_PAGE_SIZES];
 	unsigned int target;
 	unsigned int target_fail;
 	unsigned int start;
@@ -240,19 +247,25 @@ struct vmballoon;
 struct vmballoon_ops {
 	void (*add_page)(struct vmballoon *b, int idx, struct page *p);
 	int (*lock)(struct vmballoon *b, unsigned int num_pages,
-						unsigned int *target);
+			bool is_2m_pages, unsigned int *target);
 	int (*unlock)(struct vmballoon *b, unsigned int num_pages,
-						unsigned int *target);
+			bool is_2m_pages, unsigned int *target);
 };
 
-struct vmballoon {
-
+struct vmballoon_page_size {
 	/* list of reserved physical pages */
 	struct list_head pages;
 
 	/* transient list of non-balloonable pages */
 	struct list_head refused_pages;
 	unsigned int n_refused_pages;
+};
+
+struct vmballoon {
+	struct vmballoon_page_size page_sizes[VMW_BALLOON_NUM_PAGE_SIZES];
+
+	/* supported page sizes. 1 == 4k pages only, 2 == 4k and 2m pages */
+	unsigned supported_page_sizes;
 
 	/* balloon size in pages */
 	unsigned int size;
@@ -297,6 +310,7 @@ static struct vmballoon balloon;
 static bool vmballoon_send_start(struct vmballoon *b, unsigned long req_caps)
 {
 	unsigned long status, capabilities, dummy = 0;
+	bool success;
 
 	STATS_INC(b->stats.start);
 
@@ -305,15 +319,26 @@ static bool vmballoon_send_start(struct vmballoon *b, unsigned long req_caps)
 	switch (status) {
 	case VMW_BALLOON_SUCCESS_WITH_CAPABILITIES:
 		b->capabilities = capabilities;
-		return true;
+		success = true;
+		break;
 	case VMW_BALLOON_SUCCESS:
 		b->capabilities = VMW_BALLOON_BASIC_CMDS;
-		return true;
+		success = true;
+		break;
+	default:
+		success = false;
 	}
 
-	pr_debug("%s - failed, hv returns %ld\n", __func__, status);
-	STATS_INC(b->stats.start_fail);
-	return false;
+	if (b->capabilities & VMW_BALLOON_BATCHED_2M_CMDS)
+		b->supported_page_sizes = 2;
+	else
+		b->supported_page_sizes = 1;
+
+	if (!success) {
+		pr_debug("%s - failed, hv returns %ld\n", __func__, status);
+		STATS_INC(b->stats.start_fail);
+	}
+	return success;
 }
 
 static bool vmballoon_check_status(struct vmballoon *b, unsigned long status)
@@ -354,6 +379,14 @@ static bool vmballoon_send_guest_id(struct vmballoon *b)
 	return false;
 }
 
+static u16 vmballoon_page_size(bool is_2m_page)
+{
+	if (is_2m_page)
+		return 1 << VMW_BALLOON_2M_SHIFT;
+
+	return 1;
+}
+
 /*
  * Retrieve desired balloon size from the host.
  */
@@ -407,31 +440,37 @@ static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 	if (pfn32 != pfn)
 		return -1;
 
-	STATS_INC(b->stats.lock);
+	STATS_INC(b->stats.lock[false]);
 
 	*hv_status = status = VMWARE_BALLOON_CMD(LOCK, pfn, dummy, *target);
 	if (vmballoon_check_status(b, status))
 		return 0;
 
 	pr_debug("%s - ppn %lx, hv returns %ld\n", __func__, pfn, status);
-	STATS_INC(b->stats.lock_fail);
+	STATS_INC(b->stats.lock_fail[false]);
 	return 1;
 }
 
 static int vmballoon_send_batched_lock(struct vmballoon *b,
-				unsigned int num_pages, unsigned int *target)
+		unsigned int num_pages, bool is_2m_pages, unsigned int *target)
 {
 	unsigned long status;
 	unsigned long pfn = page_to_pfn(b->page);
 
-	STATS_INC(b->stats.lock);
+	STATS_INC(b->stats.lock[is_2m_pages]);
+
+	if (is_2m_pages)
+		status = VMWARE_BALLOON_CMD(BATCHED_2M_LOCK, pfn, num_pages,
+				*target);
+	else
+		status = VMWARE_BALLOON_CMD(BATCHED_LOCK, pfn, num_pages,
+				*target);
 
-	status = VMWARE_BALLOON_CMD(BATCHED_LOCK, pfn, num_pages, *target);
 	if (vmballoon_check_status(b, status))
 		return 0;
 
 	pr_debug("%s - batch ppn %lx, hv returns %ld\n", __func__, pfn, status);
-	STATS_INC(b->stats.lock_fail);
+	STATS_INC(b->stats.lock_fail[is_2m_pages]);
 	return 1;
 }
 
@@ -449,34 +488,56 @@ static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn,
 	if (pfn32 != pfn)
 		return false;
 
-	STATS_INC(b->stats.unlock);
+	STATS_INC(b->stats.unlock[false]);
 
 	status = VMWARE_BALLOON_CMD(UNLOCK, pfn, dummy, *target);
 	if (vmballoon_check_status(b, status))
 		return true;
 
 	pr_debug("%s - ppn %lx, hv returns %ld\n", __func__, pfn, status);
-	STATS_INC(b->stats.unlock_fail);
+	STATS_INC(b->stats.unlock_fail[false]);
 	return false;
 }
 
 static bool vmballoon_send_batched_unlock(struct vmballoon *b,
-				unsigned int num_pages, unsigned int *target)
+		unsigned int num_pages, bool is_2m_pages, unsigned int *target)
 {
 	unsigned long status;
 	unsigned long pfn = page_to_pfn(b->page);
 
-	STATS_INC(b->stats.unlock);
+	STATS_INC(b->stats.unlock[is_2m_pages]);
+
+	if (is_2m_pages)
+		status = VMWARE_BALLOON_CMD(BATCHED_2M_UNLOCK, pfn, num_pages,
+				*target);
+	else
+		status = VMWARE_BALLOON_CMD(BATCHED_UNLOCK, pfn, num_pages,
+				*target);
 
-	status = VMWARE_BALLOON_CMD(BATCHED_UNLOCK, pfn, num_pages, *target);
 	if (vmballoon_check_status(b, status))
 		return true;
 
 	pr_debug("%s - batch ppn %lx, hv returns %ld\n", __func__, pfn, status);
-	STATS_INC(b->stats.unlock_fail);
+	STATS_INC(b->stats.unlock_fail[is_2m_pages]);
 	return false;
 }
 
+static struct page *vmballoon_alloc_page(gfp_t flags, bool is_2m_page)
+{
+	if (is_2m_page)
+		return alloc_pages(flags, VMW_BALLOON_2M_SHIFT);
+
+	return alloc_page(flags);
+}
+
+static void vmballoon_free_page(struct page *page, bool is_2m_page)
+{
+	if (is_2m_page)
+		__free_pages(page, VMW_BALLOON_2M_SHIFT);
+	else
+		__free_page(page);
+}
+
 /*
  * Quickly release all pages allocated for the balloon. This function is
  * called when host decides to "reset" balloon for one reason or another.
@@ -486,13 +547,21 @@ static bool vmballoon_send_batched_unlock(struct vmballoon *b,
 static void vmballoon_pop(struct vmballoon *b)
 {
 	struct page *page, *next;
-
-	list_for_each_entry_safe(page, next, &b->pages, lru) {
-		list_del(&page->lru);
-		__free_page(page);
-		STATS_INC(b->stats.free);
-		b->size--;
-		cond_resched();
+	unsigned is_2m_pages;
+
+	for (is_2m_pages = 0; is_2m_pages < VMW_BALLOON_NUM_PAGE_SIZES;
+			is_2m_pages++) {
+		struct vmballoon_page_size *page_size =
+				&b->page_sizes[is_2m_pages];
+		u16 size_per_page = vmballoon_page_size(is_2m_pages);
+
+		list_for_each_entry_safe(page, next, &page_size->pages, lru) {
+			list_del(&page->lru);
+			vmballoon_free_page(page, is_2m_pages);
+			STATS_INC(b->stats.free[is_2m_pages]);
+			b->size -= size_per_page;
+			cond_resched();
+		}
 	}
 
 	if ((b->capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
@@ -510,19 +579,22 @@ static void vmballoon_pop(struct vmballoon *b)
  * inflation cycle.
  */
 static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages,
-							unsigned int *target)
+				bool is_2m_pages, unsigned int *target)
 {
 	int locked, hv_status;
 	struct page *page = b->page;
+	struct vmballoon_page_size *page_size = &b->page_sizes[false];
+
+	/* is_2m_pages can never happen as 2m pages support implies batching */
 
 	locked = vmballoon_send_lock_page(b, page_to_pfn(page), &hv_status,
 								target);
 	if (locked > 0) {
-		STATS_INC(b->stats.refused_alloc);
+		STATS_INC(b->stats.refused_alloc[false]);
 
 		if (hv_status == VMW_BALLOON_ERROR_RESET ||
 				hv_status == VMW_BALLOON_ERROR_PPN_NOTNEEDED) {
-			__free_page(page);
+			vmballoon_free_page(page, false);
 			return -EIO;
 		}
 
@@ -531,17 +603,17 @@ static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages,
 		 * and retry allocation, unless we already accumulated
 		 * too many of them, in which case take a breather.
 		 */
-		if (b->n_refused_pages < VMW_BALLOON_MAX_REFUSED) {
-			b->n_refused_pages++;
-			list_add(&page->lru, &b->refused_pages);
+		if (page_size->n_refused_pages < VMW_BALLOON_MAX_REFUSED) {
+			page_size->n_refused_pages++;
+			list_add(&page->lru, &page_size->refused_pages);
 		} else {
-			__free_page(page);
+			vmballoon_free_page(page, false);
 		}
 		return -EIO;
 	}
 
 	/* track allocated page */
-	list_add(&page->lru, &b->pages);
+	list_add(&page->lru, &page_size->pages);
 
 	/* update balloon size */
 	b->size++;
@@ -550,17 +622,19 @@ static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages,
 }
 
 static int vmballoon_lock_batched_page(struct vmballoon *b,
-				unsigned int num_pages, unsigned int *target)
+		unsigned int num_pages, bool is_2m_pages, unsigned int *target)
 {
 	int locked, i;
+	u16 size_per_page = vmballoon_page_size(is_2m_pages);
 
-	locked = vmballoon_send_batched_lock(b, num_pages, target);
+	locked = vmballoon_send_batched_lock(b, num_pages, is_2m_pages,
+			target);
 	if (locked > 0) {
 		for (i = 0; i < num_pages; i++) {
 			u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
 			struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
 
-			__free_page(p);
+			vmballoon_free_page(p, is_2m_pages);
 		}
 
 		return -EIO;
@@ -569,25 +643,28 @@ static int vmballoon_lock_batched_page(struct vmballoon *b,
 	for (i = 0; i < num_pages; i++) {
 		u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
 		struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
+		struct vmballoon_page_size *page_size =
+				&b->page_sizes[is_2m_pages];
 
 		locked = vmballoon_batch_get_status(b->batch_page, i);
 
 		switch (locked) {
 		case VMW_BALLOON_SUCCESS:
-			list_add(&p->lru, &b->pages);
-			b->size++;
+			list_add(&p->lru, &page_size->pages);
+			b->size += size_per_page;
 			break;
 		case VMW_BALLOON_ERROR_PPN_PINNED:
 		case VMW_BALLOON_ERROR_PPN_INVALID:
-			if (b->n_refused_pages < VMW_BALLOON_MAX_REFUSED) {
-				list_add(&p->lru, &b->refused_pages);
-				b->n_refused_pages++;
+			if (page_size->n_refused_pages
+					< VMW_BALLOON_MAX_REFUSED) {
+				list_add(&p->lru, &page_size->refused_pages);
+				page_size->n_refused_pages++;
 				break;
 			}
 			/* Fallthrough */
 		case VMW_BALLOON_ERROR_RESET:
 		case VMW_BALLOON_ERROR_PPN_NOTNEEDED:
-			__free_page(p);
+			vmballoon_free_page(p, is_2m_pages);
 			break;
 		default:
 			/* This should never happen */
@@ -604,18 +681,21 @@ static int vmballoon_lock_batched_page(struct vmballoon *b,
  * to use, if needed.
  */
 static int vmballoon_unlock_page(struct vmballoon *b, unsigned int num_pages,
-							unsigned int *target)
+		bool is_2m_pages, unsigned int *target)
 {
 	struct page *page = b->page;
+	struct vmballoon_page_size *page_size = &b->page_sizes[false];
+
+	/* is_2m_pages can never happen as 2m pages support implies batching */
 
 	if (!vmballoon_send_unlock_page(b, page_to_pfn(page), target)) {
-		list_add(&page->lru, &b->pages);
+		list_add(&page->lru, &page_size->pages);
 		return -EIO;
 	}
 
 	/* deallocate page */
-	__free_page(page);
-	STATS_INC(b->stats.free);
+	vmballoon_free_page(page, false);
+	STATS_INC(b->stats.free[false]);
 
 	/* update balloon size */
 	b->size--;
@@ -624,18 +704,23 @@ static int vmballoon_unlock_page(struct vmballoon *b, unsigned int num_pages,
 }
 
 static int vmballoon_unlock_batched_page(struct vmballoon *b,
-				unsigned int num_pages, unsigned int *target)
+				unsigned int num_pages, bool is_2m_pages,
+				unsigned int *target)
 {
 	int locked, i, ret = 0;
 	bool hv_success;
+	u16 size_per_page = vmballoon_page_size(is_2m_pages);
 
-	hv_success = vmballoon_send_batched_unlock(b, num_pages, target);
+	hv_success = vmballoon_send_batched_unlock(b, num_pages, is_2m_pages,
+			target);
 	if (!hv_success)
 		ret = -EIO;
 
 	for (i = 0; i < num_pages; i++) {
 		u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
 		struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
+		struct vmballoon_page_size *page_size =
+				&b->page_sizes[is_2m_pages];
 
 		locked = vmballoon_batch_get_status(b->batch_page, i);
 		if (!hv_success || locked != VMW_BALLOON_SUCCESS) {
@@ -644,14 +729,14 @@ static int vmballoon_unlock_batched_page(struct vmballoon *b,
 			 * hypervisor, re-add it to the list of pages owned by
 			 * the balloon driver.
 			 */
-			list_add(&p->lru, &b->pages);
+			list_add(&p->lru, &page_size->pages);
 		} else {
 			/* deallocate page */
-			__free_page(p);
-			STATS_INC(b->stats.free);
+			vmballoon_free_page(p, is_2m_pages);
+			STATS_INC(b->stats.free[is_2m_pages]);
 
 			/* update balloon size */
-			b->size--;
+			b->size -= size_per_page;
 		}
 	}
 
@@ -662,17 +747,20 @@ static int vmballoon_unlock_batched_page(struct vmballoon *b,
  * Release pages that were allocated while attempting to inflate the
  * balloon but were refused by the host for one reason or another.
  */
-static void vmballoon_release_refused_pages(struct vmballoon *b)
+static void vmballoon_release_refused_pages(struct vmballoon *b,
+		bool is_2m_pages)
 {
 	struct page *page, *next;
+	struct vmballoon_page_size *page_size =
+			&b->page_sizes[is_2m_pages];
 
-	list_for_each_entry_safe(page, next, &b->refused_pages, lru) {
+	list_for_each_entry_safe(page, next, &page_size->refused_pages, lru) {
 		list_del(&page->lru);
-		__free_page(page);
-		STATS_INC(b->stats.refused_free);
+		vmballoon_free_page(page, is_2m_pages);
+		STATS_INC(b->stats.refused_free[is_2m_pages]);
 	}
 
-	b->n_refused_pages = 0;
+	page_size->n_refused_pages = 0;
 }
 
 static void vmballoon_add_page(struct vmballoon *b, int idx, struct page *p)
@@ -699,6 +787,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 	unsigned int num_pages = 0;
 	int error = 0;
 	gfp_t flags = VMW_PAGE_ALLOC_NOSLEEP;
+	bool is_2m_pages;
 
 	pr_debug("%s - size: %d, target %d\n", __func__, b->size, b->target);
 
@@ -721,22 +810,46 @@ static void vmballoon_inflate(struct vmballoon *b)
 	 * Start with no sleep allocation rate which may be higher
 	 * than sleeping allocation rate.
 	 */
-	rate = b->slow_allocation_cycles ? b->rate_alloc : UINT_MAX;
+	if (b->slow_allocation_cycles) {
+		rate = b->rate_alloc;
+		is_2m_pages = false;
+	} else {
+		rate = UINT_MAX;
+		is_2m_pages =
+			b->supported_page_sizes == VMW_BALLOON_NUM_PAGE_SIZES;
+	}
 
 	pr_debug("%s - goal: %d, no-sleep rate: %u, sleep rate: %d\n",
 		 __func__, b->target - b->size, rate, b->rate_alloc);
 
 	while (!b->reset_required &&
-		b->size < b->target && num_pages < b->target - b->size) {
+		b->size + num_pages * vmballoon_page_size(is_2m_pages)
+		< b->target) {
 		struct page *page;
 
 		if (flags == VMW_PAGE_ALLOC_NOSLEEP)
-			STATS_INC(b->stats.alloc);
+			STATS_INC(b->stats.alloc[is_2m_pages]);
 		else
 			STATS_INC(b->stats.sleep_alloc);
 
-		page = alloc_page(flags);
+		page = vmballoon_alloc_page(flags, is_2m_pages);
 		if (!page) {
+			STATS_INC(b->stats.alloc_fail[is_2m_pages]);
+
+			if (is_2m_pages) {
+				b->ops->lock(b, num_pages, true, &b->target);
+
+				/*
+				 * ignore errors from locking as we now switch
+				 * to 4k pages and we might get different
+				 * errors.
+				 */
+
+				num_pages = 0;
+				is_2m_pages = false;
+				continue;
+			}
+
 			if (flags == VMW_PAGE_ALLOC_CANSLEEP) {
 				/*
 				 * CANSLEEP page allocation failed, so guest
@@ -748,7 +861,6 @@ static void vmballoon_inflate(struct vmballoon *b)
 				STATS_INC(b->stats.sleep_alloc_fail);
 				break;
 			}
-			STATS_INC(b->stats.alloc_fail);
 
 			/*
 			 * NOSLEEP page allocation failed, so the guest is
@@ -771,7 +883,8 @@ static void vmballoon_inflate(struct vmballoon *b)
 
 		b->ops->add_page(b, num_pages++, page);
 		if (num_pages == b->batch_max_pages) {
-			error = b->ops->lock(b, num_pages, &b->target);
+			error = b->ops->lock(b, num_pages, is_2m_pages,
+					&b->target);
 			num_pages = 0;
 			if (error)
 				break;
@@ -786,7 +899,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 	}
 
 	if (num_pages > 0)
-		b->ops->lock(b, num_pages, &b->target);
+		b->ops->lock(b, num_pages, is_2m_pages, &b->target);
 
 	/*
 	 * We reached our goal without failures so try increasing
@@ -800,7 +913,8 @@ static void vmballoon_inflate(struct vmballoon *b)
 			    VMW_BALLOON_RATE_ALLOC_MAX);
 	}
 
-	vmballoon_release_refused_pages(b);
+	vmballoon_release_refused_pages(b, true);
+	vmballoon_release_refused_pages(b, false);
 }
 
 /*
@@ -808,34 +922,45 @@ static void vmballoon_inflate(struct vmballoon *b)
  */
 static void vmballoon_deflate(struct vmballoon *b)
 {
-	struct page *page, *next;
-	unsigned int i = 0;
-	unsigned int num_pages = 0;
-	int error;
+	unsigned is_2m_pages;
 
 	pr_debug("%s - size: %d, target %d\n", __func__, b->size, b->target);
 
 	/* free pages to reach target */
-	list_for_each_entry_safe(page, next, &b->pages, lru) {
-		list_del(&page->lru);
-		b->ops->add_page(b, num_pages++, page);
+	for (is_2m_pages = 0; is_2m_pages < b->supported_page_sizes;
+			is_2m_pages++) {
+		struct page *page, *next;
+		unsigned int num_pages = 0;
+		struct vmballoon_page_size *page_size =
+				&b->page_sizes[is_2m_pages];
+
+		list_for_each_entry_safe(page, next, &page_size->pages, lru) {
+			if (b->reset_required ||
+				(b->target > 0 &&
+					b->size - num_pages
+					* vmballoon_page_size(is_2m_pages)
+				< b->target + vmballoon_page_size(true)))
+				break;
 
+			list_del(&page->lru);
+			b->ops->add_page(b, num_pages++, page);
 
-		if (num_pages == b->batch_max_pages) {
-			error = b->ops->unlock(b, num_pages, &b->target);
-			num_pages = 0;
-			if (error)
-				return;
-		}
+			if (num_pages == b->batch_max_pages) {
+				int error;
 
-		if (b->reset_required || ++i >= b->size - b->target)
-			break;
+				error = b->ops->unlock(b, num_pages,
+						is_2m_pages, &b->target);
+				num_pages = 0;
+				if (error)
+					return;
+			}
 
-		cond_resched();
-	}
+			cond_resched();
+		}
 
-	if (num_pages > 0)
-		b->ops->unlock(b, num_pages, &b->target);
+		if (num_pages > 0)
+			b->ops->unlock(b, num_pages, is_2m_pages, &b->target);
+	}
 }
 
 static const struct vmballoon_ops vmballoon_basic_ops = {
@@ -925,7 +1050,8 @@ static void vmballoon_work(struct work_struct *work)
 
 		if (b->size < target)
 			vmballoon_inflate(b);
-		else if (b->size > target)
+		else if (target == 0 ||
+				b->size > target + vmballoon_page_size(true))
 			vmballoon_deflate(b);
 	}
 
@@ -969,24 +1095,35 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 		   "timer:              %8u\n"
 		   "start:              %8u (%4u failed)\n"
 		   "guestType:          %8u (%4u failed)\n"
+		   "2m-lock:            %8u (%4u failed)\n"
 		   "lock:               %8u (%4u failed)\n"
+		   "2m-unlock:          %8u (%4u failed)\n"
 		   "unlock:             %8u (%4u failed)\n"
 		   "target:             %8u (%4u failed)\n"
+		   "prim2mAlloc:        %8u (%4u failed)\n"
 		   "primNoSleepAlloc:   %8u (%4u failed)\n"
 		   "primCanSleepAlloc:  %8u (%4u failed)\n"
+		   "prim2mFree:         %8u\n"
 		   "primFree:           %8u\n"
+		   "err2mAlloc:         %8u\n"
 		   "errAlloc:           %8u\n"
+		   "err2mFree:          %8u\n"
 		   "errFree:            %8u\n",
 		   stats->timer,
 		   stats->start, stats->start_fail,
 		   stats->guest_type, stats->guest_type_fail,
-		   stats->lock,  stats->lock_fail,
-		   stats->unlock, stats->unlock_fail,
+		   stats->lock[true],  stats->lock_fail[true],
+		   stats->lock[false],  stats->lock_fail[false],
+		   stats->unlock[true], stats->unlock_fail[true],
+		   stats->unlock[false], stats->unlock_fail[false],
 		   stats->target, stats->target_fail,
-		   stats->alloc, stats->alloc_fail,
+		   stats->alloc[true], stats->alloc_fail[true],
+		   stats->alloc[false], stats->alloc_fail[false],
 		   stats->sleep_alloc, stats->sleep_alloc_fail,
-		   stats->free,
-		   stats->refused_alloc, stats->refused_free);
+		   stats->free[true],
+		   stats->free[false],
+		   stats->refused_alloc[true], stats->refused_alloc[false],
+		   stats->refused_free[true], stats->refused_free[false]);
 
 	return 0;
 }
@@ -1040,7 +1177,7 @@ static inline void vmballoon_debugfs_exit(struct vmballoon *b)
 static int __init vmballoon_init(void)
 {
 	int error;
-
+	unsigned is_2m_pages;
 	/*
 	 * Check if we are running on VMware's hypervisor and bail out
 	 * if we are not.
@@ -1048,8 +1185,11 @@ static int __init vmballoon_init(void)
 	if (x86_hyper != &x86_hyper_vmware)
 		return -ENODEV;
 
-	INIT_LIST_HEAD(&balloon.pages);
-	INIT_LIST_HEAD(&balloon.refused_pages);
+	for (is_2m_pages = 0; is_2m_pages < VMW_BALLOON_NUM_PAGE_SIZES;
+			is_2m_pages++) {
+		INIT_LIST_HEAD(&balloon.page_sizes[is_2m_pages].pages);
+		INIT_LIST_HEAD(&balloon.page_sizes[is_2m_pages].refused_pages);
+	}
 
 	/* initialize rates */
 	balloon.rate_alloc = VMW_BALLOON_RATE_ALLOC_MAX;
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v4 8/9] VMware balloon: Treat init like reset
  2015-08-05 23:26                           ` gregkh
                                               ` (7 preceding siblings ...)
  2015-08-06 20:33                             ` [PATCH v4 7/9] VMware balloon: Support 2m page ballooning Philip P. Moltmann
@ 2015-08-06 20:33                             ` Philip P. Moltmann
  2015-08-06 20:33                             ` [PATCH v4 9/9] VMware balloon: Enable notification via VMCI Philip P. Moltmann
  9 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-08-06 20:33 UTC (permalink / raw)
  To: gregkh
  Cc: Philip P. Moltmann, dmitry.torokhov, linux-kernel, xdeguillard,
	akpm, pv-drivers

Unify the behavior of the balloon's first start and a reset. Also, on
unload, declare to the hypervisor that the balloon driver no longer has
any capabilities.

Acked-by: Andy King <acking@vmware.com>
Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
---
 drivers/misc/vmw_balloon.c | 53 ++++++++++++++++------------------------------
 1 file changed, 18 insertions(+), 35 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 01519ff..28fe9e5 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -47,7 +47,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.4.0.0-k");
+MODULE_VERSION("1.4.1.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -564,12 +564,14 @@ static void vmballoon_pop(struct vmballoon *b)
 		}
 	}
 
-	if ((b->capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
-		if (b->batch_page)
-			vunmap(b->batch_page);
+	if (b->batch_page) {
+		vunmap(b->batch_page);
+		b->batch_page = NULL;
+	}
 
-		if (b->page)
-			__free_page(b->page);
+	if (b->page) {
+		__free_page(b->page);
+		b->page = NULL;
 	}
 }
 
@@ -1044,7 +1046,7 @@ static void vmballoon_work(struct work_struct *work)
 	if (b->slow_allocation_cycles > 0)
 		b->slow_allocation_cycles--;
 
-	if (vmballoon_send_get_target(b, &target)) {
+	if (!b->reset_required && vmballoon_send_get_target(b, &target)) {
 		/* update target, adjust size */
 		b->target = target;
 
@@ -1076,8 +1078,10 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 	/* format capabilities info */
 	seq_printf(f,
 		   "balloon capabilities:   %#4x\n"
-		   "used capabilities:      %#4lx\n",
-		   VMW_BALLOON_CAPABILITIES, b->capabilities);
+		   "used capabilities:      %#4lx\n"
+		   "is resetting:           %c\n",
+		   VMW_BALLOON_CAPABILITIES, b->capabilities,
+		   b->reset_required ? 'y' : 'n');
 
 	/* format size info */
 	seq_printf(f,
@@ -1196,35 +1200,14 @@ static int __init vmballoon_init(void)
 
 	INIT_DELAYED_WORK(&balloon.dwork, vmballoon_work);
 
-	/*
-	 * Start balloon.
-	 */
-	if (!vmballoon_send_start(&balloon, VMW_BALLOON_CAPABILITIES)) {
-		pr_err("failed to send start command to the host\n");
-		return -EIO;
-	}
-
-	if ((balloon.capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
-		balloon.ops = &vmballoon_batched_ops;
-		balloon.batch_max_pages = VMW_BALLOON_BATCH_MAX_PAGES;
-		if (!vmballoon_init_batching(&balloon)) {
-			pr_err("failed to init batching\n");
-			return -EIO;
-		}
-	} else if ((balloon.capabilities & VMW_BALLOON_BASIC_CMDS) != 0) {
-		balloon.ops = &vmballoon_basic_ops;
-		balloon.batch_max_pages = 1;
-	}
-
-	if (!vmballoon_send_guest_id(&balloon)) {
-		pr_err("failed to send guest ID to the host\n");
-		return -EIO;
-	}
-
 	error = vmballoon_debugfs_init(&balloon);
 	if (error)
 		return error;
 
+	balloon.batch_page = NULL;
+	balloon.page = NULL;
+	balloon.reset_required = true;
+
 	queue_delayed_work(system_freezable_wq, &balloon.dwork, 0);
 
 	return 0;
@@ -1242,7 +1225,7 @@ static void __exit vmballoon_exit(void)
 	 * Reset connection before deallocating memory to avoid potential for
 	 * additional spurious resets from guest touching deallocated pages.
 	 */
-	vmballoon_send_start(&balloon, VMW_BALLOON_CAPABILITIES);
+	vmballoon_send_start(&balloon, 0);
 	vmballoon_pop(&balloon);
 }
 module_exit(vmballoon_exit);
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v4 9/9] VMware balloon: Enable notification via VMCI
  2015-08-05 23:26                           ` gregkh
                                               ` (8 preceding siblings ...)
  2015-08-06 20:33                             ` [PATCH v4 8/9] VMware balloon: Treat init like reset Philip P. Moltmann
@ 2015-08-06 20:33                             ` Philip P. Moltmann
  9 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-08-06 20:33 UTC (permalink / raw)
  To: gregkh
  Cc: Philip P. Moltmann, dmitry.torokhov, linux-kernel, xdeguillard,
	akpm, pv-drivers

Get notified immediately when a balloon target is set, instead of waiting for
up to one second.

The up-to-one-second gap could be long enough to cause swapping inside the
VM that receives the balloon target.

Acked-by: Andy King <acking@vmware.com>
Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
Tested-by: Siva Sankar Reddy B <sankars@vmware.com>
---
 drivers/misc/Kconfig       |   2 +-
 drivers/misc/vmw_balloon.c | 105 +++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 97 insertions(+), 10 deletions(-)

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 42c3852..76a3d42 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -404,7 +404,7 @@ config TI_DAC7512
 
 config VMWARE_BALLOON
 	tristate "VMware Balloon Driver"
-	depends on X86 && HYPERVISOR_GUEST
+	depends on VMWARE_VMCI && X86 && HYPERVISOR_GUEST
 	help
 	  This is VMware physical memory management driver which acts
 	  like a "balloon" that can be inflated to reclaim physical pages
diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 28fe9e5..8930087 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -1,7 +1,7 @@
 /*
  * VMware Balloon driver.
  *
- * Copyright (C) 2000-2013, VMware, Inc. All Rights Reserved.
+ * Copyright (C) 2000-2014, VMware, Inc. All Rights Reserved.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms of the GNU General Public License as published by the
@@ -43,11 +43,13 @@
 #include <linux/workqueue.h>
 #include <linux/debugfs.h>
 #include <linux/seq_file.h>
+#include <linux/vmw_vmci_defs.h>
+#include <linux/vmw_vmci_api.h>
 #include <asm/hypervisor.h>
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.4.1.0-k");
+MODULE_VERSION("1.5.0.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -101,14 +103,16 @@ enum vmwballoon_capabilities {
 	/*
 	 * Bit 0 is reserved and not associated to any capability.
 	 */
-	VMW_BALLOON_BASIC_CMDS		= (1 << 1),
-	VMW_BALLOON_BATCHED_CMDS	= (1 << 2),
-	VMW_BALLOON_BATCHED_2M_CMDS     = (1 << 3),
+	VMW_BALLOON_BASIC_CMDS			= (1 << 1),
+	VMW_BALLOON_BATCHED_CMDS		= (1 << 2),
+	VMW_BALLOON_BATCHED_2M_CMDS		= (1 << 3),
+	VMW_BALLOON_SIGNALLED_WAKEUP_CMD	= (1 << 4),
 };
 
 #define VMW_BALLOON_CAPABILITIES	(VMW_BALLOON_BASIC_CMDS \
 					| VMW_BALLOON_BATCHED_CMDS \
-					| VMW_BALLOON_BATCHED_2M_CMDS)
+					| VMW_BALLOON_BATCHED_2M_CMDS \
+					| VMW_BALLOON_SIGNALLED_WAKEUP_CMD)
 
 #define VMW_BALLOON_2M_SHIFT		(9)
 #define VMW_BALLOON_NUM_PAGE_SIZES	(2)
@@ -123,7 +127,9 @@ enum vmwballoon_capabilities {
  * VMW_BALLOON_BATCHED_CMDS:
  *	BATCHED_LOCK and BATCHED_UNLOCK commands.
  * VMW BALLOON_BATCHED_2M_CMDS:
- *	BATCHED_2M_LOCK and BATCHED_2M_UNLOCK commands.
+ *	BATCHED_2M_LOCK and BATCHED_2M_UNLOCK commands,
+ * VMW_BALLOON_SIGNALLED_WAKEUP_CMD:
+ *	VMW_BALLOON_CMD_VMCI_DOORBELL_SET command.
  */
 #define VMW_BALLOON_CMD_START			0
 #define VMW_BALLOON_CMD_GET_TARGET		1
@@ -134,6 +140,7 @@ enum vmwballoon_capabilities {
 #define VMW_BALLOON_CMD_BATCHED_UNLOCK		7
 #define VMW_BALLOON_CMD_BATCHED_2M_LOCK		8
 #define VMW_BALLOON_CMD_BATCHED_2M_UNLOCK	9
+#define VMW_BALLOON_CMD_VMCI_DOORBELL_SET	10
 
 
 /* error codes */
@@ -214,6 +221,7 @@ static void vmballoon_batch_set_pa(struct vmballoon_batch_page *batch, int idx,
 #ifdef CONFIG_DEBUG_FS
 struct vmballoon_stats {
 	unsigned int timer;
+	unsigned int doorbell;
 
 	/* allocation statistics */
 	unsigned int alloc[VMW_BALLOON_NUM_PAGE_SIZES];
@@ -235,6 +243,8 @@ struct vmballoon_stats {
 	unsigned int start_fail;
 	unsigned int guest_type;
 	unsigned int guest_type_fail;
+	unsigned int doorbell_set;
+	unsigned int doorbell_unset;
 };
 
 #define STATS_INC(stat) (stat)++
@@ -299,6 +309,8 @@ struct vmballoon {
 	struct sysinfo sysinfo;
 
 	struct delayed_work dwork;
+
+	struct vmci_handle vmci_doorbell;
 };
 
 static struct vmballoon balloon;
@@ -993,12 +1005,75 @@ static bool vmballoon_init_batching(struct vmballoon *b)
 }
 
 /*
+ * Receive notification and resize balloon
+ */
+static void vmballoon_doorbell(void *client_data)
+{
+	struct vmballoon *b = client_data;
+
+	STATS_INC(b->stats.doorbell);
+
+	mod_delayed_work(system_freezable_wq, &b->dwork, 0);
+}
+
+/*
+ * Clean up vmci doorbell
+ */
+static void vmballoon_vmci_cleanup(struct vmballoon *b)
+{
+	int error;
+
+	VMWARE_BALLOON_CMD(VMCI_DOORBELL_SET, VMCI_INVALID_ID,
+			VMCI_INVALID_ID, error);
+	STATS_INC(b->stats.doorbell_unset);
+
+	if (!vmci_handle_is_invalid(b->vmci_doorbell)) {
+		vmci_doorbell_destroy(b->vmci_doorbell);
+		b->vmci_doorbell = VMCI_INVALID_HANDLE;
+	}
+}
+
+/*
+ * Initialize vmci doorbell, to get notified as soon as balloon changes
+ */
+static int vmballoon_vmci_init(struct vmballoon *b)
+{
+	int error = 0;
+
+	if ((b->capabilities & VMW_BALLOON_SIGNALLED_WAKEUP_CMD) != 0) {
+		error = vmci_doorbell_create(&b->vmci_doorbell,
+				VMCI_FLAG_DELAYED_CB,
+				VMCI_PRIVILEGE_FLAG_RESTRICTED,
+				vmballoon_doorbell, b);
+
+		if (error == VMCI_SUCCESS) {
+			VMWARE_BALLOON_CMD(VMCI_DOORBELL_SET,
+					b->vmci_doorbell.context,
+					b->vmci_doorbell.resource, error);
+			STATS_INC(b->stats.doorbell_set);
+		}
+	}
+
+	if (error != 0) {
+		vmballoon_vmci_cleanup(b);
+
+		return -EIO;
+	}
+
+	return 0;
+}
+
+/*
  * Perform standard reset sequence by popping the balloon (in case it
  * is not  empty) and then restarting protocol. This operation normally
  * happens when host responds with VMW_BALLOON_ERROR_RESET to a command.
  */
 static void vmballoon_reset(struct vmballoon *b)
 {
+	int error;
+
+	vmballoon_vmci_cleanup(b);
+
 	/* free all pages, skipping monitor unlock */
 	vmballoon_pop(b);
 
@@ -1024,6 +1099,11 @@ static void vmballoon_reset(struct vmballoon *b)
 	}
 
 	b->reset_required = false;
+
+	error = vmballoon_vmci_init(b);
+	if (error)
+		pr_err("failed to initialize vmci doorbell\n");
+
 	if (!vmballoon_send_guest_id(b))
 		pr_err("failed to send guest ID to the host\n");
 }
@@ -1097,6 +1177,7 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 	seq_printf(f,
 		   "\n"
 		   "timer:              %8u\n"
+		   "doorbell:           %8u\n"
 		   "start:              %8u (%4u failed)\n"
 		   "guestType:          %8u (%4u failed)\n"
 		   "2m-lock:            %8u (%4u failed)\n"
@@ -1112,8 +1193,11 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 		   "err2mAlloc:         %8u\n"
 		   "errAlloc:           %8u\n"
 		   "err2mFree:          %8u\n"
-		   "errFree:            %8u\n",
+		   "errFree:            %8u\n"
+		   "doorbellSet:        %8u\n"
+		   "doorbellUnset:      %8u\n",
 		   stats->timer,
+		   stats->doorbell,
 		   stats->start, stats->start_fail,
 		   stats->guest_type, stats->guest_type_fail,
 		   stats->lock[true],  stats->lock_fail[true],
@@ -1127,7 +1211,8 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 		   stats->free[true],
 		   stats->free[false],
 		   stats->refused_alloc[true], stats->refused_alloc[false],
-		   stats->refused_free[true], stats->refused_free[false]);
+		   stats->refused_free[true], stats->refused_free[false],
+		   stats->doorbell_set, stats->doorbell_unset);
 
 	return 0;
 }
@@ -1204,6 +1289,7 @@ static int __init vmballoon_init(void)
 	if (error)
 		return error;
 
+	balloon.vmci_doorbell = VMCI_INVALID_HANDLE;
 	balloon.batch_page = NULL;
 	balloon.page = NULL;
 	balloon.reset_required = true;
@@ -1216,6 +1302,7 @@ module_init(vmballoon_init);
 
 static void __exit vmballoon_exit(void)
 {
+	vmballoon_vmci_cleanup(&balloon);
 	cancel_delayed_work_sync(&balloon.dwork);
 
 	vmballoon_debugfs_exit(&balloon);
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 1/9] VMware balloon: partially inline vmballoon_reserve_page.
  2015-08-06 20:33                             ` [PATCH v4 1/9] VMware balloon: partially inline vmballoon_reserve_page Philip P. Moltmann
@ 2015-08-06 21:07                               ` Greg KH
  2015-08-06 22:17                                 ` [PATCH v5 0/7] Fifth revision of the performance improvement patch to the VMware balloon driver Philip P. Moltmann
                                                   ` (7 more replies)
  0 siblings, 8 replies; 61+ messages in thread
From: Greg KH @ 2015-08-06 21:07 UTC (permalink / raw)
  To: Philip P. Moltmann
  Cc: Xavier Deguillard, dmitry.torokhov, linux-kernel, akpm, pv-drivers

On Thu, Aug 06, 2015 at 01:33:39PM -0700, Philip P. Moltmann wrote:
> From: Xavier Deguillard <xdeguillard@vmware.com>
> 
> This splits the function in two: the allocation part is inlined into the
> inflate function and the lock part is kept in its own function.
> 
> This change is needed in order to be able to allocate more than one page
> before doing the hypervisor call.
> 
> Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
> Acked-by: Dmitry Torokhov <dtor@vmware.com>
> Signed-off-by: Philip P. Moltmann <moltmann@vmware.com>
> Acked-by: Andy King <acking@vmware.com>
> ---
>  drivers/misc/vmw_balloon.c | 98 ++++++++++++++++++++--------------------------
>  1 file changed, 42 insertions(+), 56 deletions(-)

This fails to apply on my tree, so I can't apply this series at all :(

Please rebase on char-misc-next and resend them so that I can apply
them.  This shows you didn't even test this series in any form, which
does not make me feel like it's worth accepting, don't you agree?

greg k-h

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v5 0/7]  Fifth revision of the performance improvement patch to the VMware balloon driver
  2015-08-06 21:07                               ` Greg KH
@ 2015-08-06 22:17                                 ` Philip P. Moltmann
  2015-08-14 23:27                                   ` Philip Moltmann
  2015-08-06 22:17                                 ` [PATCH v5 1/7] VMware balloon: add batching to the vmw_balloon Philip P. Moltmann
                                                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 61+ messages in thread
From: Philip P. Moltmann @ 2015-08-06 22:17 UTC (permalink / raw)
  To: gregkh
  Cc: Philip P. Moltmann, dmitry.torokhov, linux-kernel, xdeguillard,
	akpm, pv-drivers

This is the fifth revision of the patch set for the VMware balloon driver. The
original was sent to linux-kernel@vger.kernel.org on 4/14/15 at 10:29 am PST.
Please refer to the original change for an overview.

v1:
- Initial implementation
v2:
- Address suggestions by Dmitry Torokhov
  - Use UINT_MAX as "infinite" rate instead of special-casing -1
v3:
- Change commit comment for step 6 to better explain what impact ballooning has
  on the VM performance.
v4:
- Add missing include header <linux/vmalloc.h> in step 3
v5:
- Moved from git/torvalds/linux master to git/gregkh/char-misc char-misc-next
  (4816693286d4ff9219b1cc72c2ab9c589448ebcb) and removed the already applied
  patches 1/9 and 2/9. This changes the numbering of the patches: 3/9 is now
  1/7, 4/9 is now 2/7, etc. There were no merge issues during this transition.

  Testing done: recompiled, installed, rebooted and made sure ballooning still
                works

Thanks
Philip

Philip P. Moltmann (5):
  VMware balloon: Show capabilities of balloon and resulting
    capabilities in the debug-fs node.
  VMware balloon: Do not limit the amount of frees and allocations in
    non-sleep mode.
  VMware balloon: Support 2m page ballooning.
  VMware balloon: Treat init like reset
  VMware balloon: Enable notification via VMCI

Xavier Deguillard (2):
  VMware balloon: add batching to the vmw_balloon.
  VMware balloon: Update balloon target on each lock/unlock.

 drivers/misc/Kconfig       |   2 +-
 drivers/misc/vmw_balloon.c | 843 +++++++++++++++++++++++++++++++++++----------
 2 files changed, 662 insertions(+), 183 deletions(-)

-- 
2.4.3


^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v5 1/7] VMware balloon: add batching to the vmw_balloon.
  2015-08-06 21:07                               ` Greg KH
  2015-08-06 22:17                                 ` [PATCH v5 0/7] Fifth revision of the performance improvement patch to the VMware balloon driver Philip P. Moltmann
@ 2015-08-06 22:17                                 ` Philip P. Moltmann
  2015-08-06 22:17                                 ` [PATCH v5 2/7] VMware balloon: Update balloon target on each lock/unlock Philip P. Moltmann
                                                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-08-06 22:17 UTC (permalink / raw)
  To: gregkh
  Cc: Xavier Deguillard, dmitry.torokhov, linux-kernel, akpm,
	pv-drivers, Philip P. Moltmann

From: Xavier Deguillard <xdeguillard@vmware.com>

Introduce a new capability to the driver that allows sending 512 pages in
one hypervisor call. This reduces the cost of the driver when reclaiming
memory.

Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
Acked-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Philip P. Moltmann <moltmann@vmware.com>
---
 drivers/misc/vmw_balloon.c | 406 +++++++++++++++++++++++++++++++++++++++------
 1 file changed, 353 insertions(+), 53 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index ffb5634..64f275e 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -1,7 +1,7 @@
 /*
  * VMware Balloon driver.
  *
- * Copyright (C) 2000-2010, VMware, Inc. All Rights Reserved.
+ * Copyright (C) 2000-2013, VMware, Inc. All Rights Reserved.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms of the GNU General Public License as published by the
@@ -37,6 +37,7 @@
 #include <linux/types.h>
 #include <linux/kernel.h>
 #include <linux/mm.h>
+#include <linux/vmalloc.h>
 #include <linux/sched.h>
 #include <linux/module.h>
 #include <linux/workqueue.h>
@@ -46,7 +47,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.3.0.0-k");
+MODULE_VERSION("1.3.1.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -120,13 +121,26 @@ enum vmwballoon_capabilities {
 	VMW_BALLOON_BATCHED_CMDS	= (1 << 2)
 };
 
-#define VMW_BALLOON_CAPABILITIES	(VMW_BALLOON_BASIC_CMDS)
+#define VMW_BALLOON_CAPABILITIES	(VMW_BALLOON_BASIC_CMDS \
+					| VMW_BALLOON_BATCHED_CMDS)
 
+/*
+ * Backdoor commands availability:
+ *
+ * START, GET_TARGET and GUEST_ID are always available,
+ *
+ * VMW_BALLOON_BASIC_CMDS:
+ *	LOCK and UNLOCK commands,
+ * VMW_BALLOON_BATCHED_CMDS:
+ *	BATCHED_LOCK and BATCHED_UNLOCK commands.
+ */
 #define VMW_BALLOON_CMD_START		0
 #define VMW_BALLOON_CMD_GET_TARGET	1
 #define VMW_BALLOON_CMD_LOCK		2
 #define VMW_BALLOON_CMD_UNLOCK		3
 #define VMW_BALLOON_CMD_GUEST_ID	4
+#define VMW_BALLOON_CMD_BATCHED_LOCK	6
+#define VMW_BALLOON_CMD_BATCHED_UNLOCK	7
 
 /* error codes */
 #define VMW_BALLOON_SUCCESS		        0
@@ -142,18 +156,63 @@ enum vmwballoon_capabilities {
 
 #define VMW_BALLOON_SUCCESS_WITH_CAPABILITIES	(0x03000000)
 
-#define VMWARE_BALLOON_CMD(cmd, data, result)			\
+/* Batch page description */
+
+/*
+ * Layout of a page in the batch page:
+ *
+ * +-------------+----------+--------+
+ * |             |          |        |
+ * | Page number | Reserved | Status |
+ * |             |          |        |
+ * +-------------+----------+--------+
+ * 64  PAGE_SHIFT          6         0
+ *
+ * For now only 4K pages are supported, but we can easily support large pages
+ * by using bits in the reserved field.
+ *
+ * The reserved field should be set to 0.
+ */
+#define VMW_BALLOON_BATCH_MAX_PAGES	(PAGE_SIZE / sizeof(u64))
+#define VMW_BALLOON_BATCH_STATUS_MASK	((1UL << 5) - 1)
+#define VMW_BALLOON_BATCH_PAGE_MASK	(~((1UL << PAGE_SHIFT) - 1))
+
+struct vmballoon_batch_page {
+	u64 pages[VMW_BALLOON_BATCH_MAX_PAGES];
+};
+
+static u64 vmballoon_batch_get_pa(struct vmballoon_batch_page *batch, int idx)
+{
+	return batch->pages[idx] & VMW_BALLOON_BATCH_PAGE_MASK;
+}
+
+static int vmballoon_batch_get_status(struct vmballoon_batch_page *batch,
+				int idx)
+{
+	return (int)(batch->pages[idx] & VMW_BALLOON_BATCH_STATUS_MASK);
+}
+
+static void vmballoon_batch_set_pa(struct vmballoon_batch_page *batch, int idx,
+				u64 pa)
+{
+	batch->pages[idx] = pa;
+}
+
+
+#define VMWARE_BALLOON_CMD(cmd, arg1, arg2, result)		\
 ({								\
-	unsigned long __status, __dummy1, __dummy2;		\
+	unsigned long __status, __dummy1, __dummy2, __dummy3;	\
 	__asm__ __volatile__ ("inl %%dx" :			\
 		"=a"(__status),					\
 		"=c"(__dummy1),					\
 		"=d"(__dummy2),					\
-		"=b"(result) :					\
+		"=b"(result),					\
+		"=S" (__dummy3) :				\
 		"0"(VMW_BALLOON_HV_MAGIC),			\
 		"1"(VMW_BALLOON_CMD_##cmd),			\
 		"2"(VMW_BALLOON_HV_PORT),			\
-		"3"(data) :					\
+		"3"(arg1),					\
+		"4" (arg2) :					\
 		"memory");					\
 	if (VMW_BALLOON_CMD_##cmd == VMW_BALLOON_CMD_START)	\
 		result = __dummy1;				\
@@ -192,6 +251,14 @@ struct vmballoon_stats {
 #define STATS_INC(stat)
 #endif
 
+struct vmballoon;
+
+struct vmballoon_ops {
+	void (*add_page)(struct vmballoon *b, int idx, struct page *p);
+	int (*lock)(struct vmballoon *b, unsigned int num_pages);
+	int (*unlock)(struct vmballoon *b, unsigned int num_pages);
+};
+
 struct vmballoon {
 
 	/* list of reserved physical pages */
@@ -215,6 +282,14 @@ struct vmballoon {
 	/* slowdown page allocations for next few cycles */
 	unsigned int slow_allocation_cycles;
 
+	unsigned long capabilities;
+
+	struct vmballoon_batch_page *batch_page;
+	unsigned int batch_max_pages;
+	struct page *page;
+
+	const struct vmballoon_ops *ops;
+
 #ifdef CONFIG_DEBUG_FS
 	/* statistics */
 	struct vmballoon_stats stats;
@@ -234,16 +309,22 @@ static struct vmballoon balloon;
  * Send "start" command to the host, communicating supported version
  * of the protocol.
  */
-static bool vmballoon_send_start(struct vmballoon *b)
+static bool vmballoon_send_start(struct vmballoon *b, unsigned long req_caps)
 {
-	unsigned long status, capabilities;
+	unsigned long status, capabilities, dummy = 0;
 
 	STATS_INC(b->stats.start);
 
-	status = VMWARE_BALLOON_CMD(START, VMW_BALLOON_CAPABILITIES,
-				capabilities);
-	if (status == VMW_BALLOON_SUCCESS)
+	status = VMWARE_BALLOON_CMD(START, req_caps, dummy, capabilities);
+
+	switch (status) {
+	case VMW_BALLOON_SUCCESS_WITH_CAPABILITIES:
+		b->capabilities = capabilities;
+		return true;
+	case VMW_BALLOON_SUCCESS:
+		b->capabilities = VMW_BALLOON_BASIC_CMDS;
 		return true;
+	}
 
 	pr_debug("%s - failed, hv returns %ld\n", __func__, status);
 	STATS_INC(b->stats.start_fail);
@@ -273,9 +354,10 @@ static bool vmballoon_check_status(struct vmballoon *b, unsigned long status)
  */
 static bool vmballoon_send_guest_id(struct vmballoon *b)
 {
-	unsigned long status, dummy;
+	unsigned long status, dummy = 0;
 
-	status = VMWARE_BALLOON_CMD(GUEST_ID, VMW_BALLOON_GUEST_ID, dummy);
+	status = VMWARE_BALLOON_CMD(GUEST_ID, VMW_BALLOON_GUEST_ID, dummy,
+				dummy);
 
 	STATS_INC(b->stats.guest_type);
 
@@ -295,6 +377,7 @@ static bool vmballoon_send_get_target(struct vmballoon *b, u32 *new_target)
 	unsigned long status;
 	unsigned long target;
 	unsigned long limit;
+	unsigned long dummy = 0;
 	u32 limit32;
 
 	/*
@@ -313,7 +396,7 @@ static bool vmballoon_send_get_target(struct vmballoon *b, u32 *new_target)
 	/* update stats */
 	STATS_INC(b->stats.target);
 
-	status = VMWARE_BALLOON_CMD(GET_TARGET, limit, target);
+	status = VMWARE_BALLOON_CMD(GET_TARGET, limit, dummy, target);
 	if (vmballoon_check_status(b, status)) {
 		*new_target = target;
 		return true;
@@ -332,7 +415,7 @@ static bool vmballoon_send_get_target(struct vmballoon *b, u32 *new_target)
 static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 				     unsigned int *hv_status)
 {
-	unsigned long status, dummy;
+	unsigned long status, dummy = 0;
 	u32 pfn32;
 
 	pfn32 = (u32)pfn;
@@ -341,7 +424,7 @@ static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 
 	STATS_INC(b->stats.lock);
 
-	*hv_status = status = VMWARE_BALLOON_CMD(LOCK, pfn, dummy);
+	*hv_status = status = VMWARE_BALLOON_CMD(LOCK, pfn, dummy, dummy);
 	if (vmballoon_check_status(b, status))
 		return 0;
 
@@ -350,13 +433,30 @@ static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 	return 1;
 }
 
+static int vmballoon_send_batched_lock(struct vmballoon *b,
+					unsigned int num_pages)
+{
+	unsigned long status, dummy;
+	unsigned long pfn = page_to_pfn(b->page);
+
+	STATS_INC(b->stats.lock);
+
+	status = VMWARE_BALLOON_CMD(BATCHED_LOCK, pfn, num_pages, dummy);
+	if (vmballoon_check_status(b, status))
+		return 0;
+
+	pr_debug("%s - batch ppn %lx, hv returns %ld\n", __func__, pfn, status);
+	STATS_INC(b->stats.lock_fail);
+	return 1;
+}
+
 /*
  * Notify the host that guest intends to release given page back into
  * the pool of available (to the guest) pages.
  */
 static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
 {
-	unsigned long status, dummy;
+	unsigned long status, dummy = 0;
 	u32 pfn32;
 
 	pfn32 = (u32)pfn;
@@ -365,7 +465,7 @@ static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
 
 	STATS_INC(b->stats.unlock);
 
-	status = VMWARE_BALLOON_CMD(UNLOCK, pfn, dummy);
+	status = VMWARE_BALLOON_CMD(UNLOCK, pfn, dummy, dummy);
 	if (vmballoon_check_status(b, status))
 		return true;
 
@@ -374,6 +474,23 @@ static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
 	return false;
 }
 
+static bool vmballoon_send_batched_unlock(struct vmballoon *b,
+						unsigned int num_pages)
+{
+	unsigned long status, dummy;
+	unsigned long pfn = page_to_pfn(b->page);
+
+	STATS_INC(b->stats.unlock);
+
+	status = VMWARE_BALLOON_CMD(BATCHED_UNLOCK, pfn, num_pages, dummy);
+	if (vmballoon_check_status(b, status))
+		return true;
+
+	pr_debug("%s - batch ppn %lx, hv returns %ld\n", __func__, pfn, status);
+	STATS_INC(b->stats.unlock_fail);
+	return false;
+}
+
 /*
  * Quickly release all pages allocated for the balloon. This function is
  * called when host decides to "reset" balloon for one reason or another.
@@ -396,22 +513,13 @@ static void vmballoon_pop(struct vmballoon *b)
 			cond_resched();
 		}
 	}
-}
 
-/*
- * Perform standard reset sequence by popping the balloon (in case it
- * is not  empty) and then restarting protocol. This operation normally
- * happens when host responds with VMW_BALLOON_ERROR_RESET to a command.
- */
-static void vmballoon_reset(struct vmballoon *b)
-{
-	/* free all pages, skipping monitor unlock */
-	vmballoon_pop(b);
+	if ((b->capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
+		if (b->batch_page)
+			vunmap(b->batch_page);
 
-	if (vmballoon_send_start(b)) {
-		b->reset_required = false;
-		if (!vmballoon_send_guest_id(b))
-			pr_err("failed to send guest ID to the host\n");
+		if (b->page)
+			__free_page(b->page);
 	}
 }
 
@@ -420,9 +528,10 @@ static void vmballoon_reset(struct vmballoon *b)
  * refuse list, those refused page are then released at the end of the
  * inflation cycle.
  */
-static int vmballoon_lock_page(struct vmballoon *b, struct page *page)
+static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages)
 {
 	int locked, hv_status;
+	struct page *page = b->page;
 
 	locked = vmballoon_send_lock_page(b, page_to_pfn(page), &hv_status);
 	if (locked > 0) {
@@ -457,17 +566,68 @@ static int vmballoon_lock_page(struct vmballoon *b, struct page *page)
 	return 0;
 }
 
+static int vmballoon_lock_batched_page(struct vmballoon *b,
+				unsigned int num_pages)
+{
+	int locked, i;
+
+	locked = vmballoon_send_batched_lock(b, num_pages);
+	if (locked > 0) {
+		for (i = 0; i < num_pages; i++) {
+			u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
+			struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
+
+			__free_page(p);
+		}
+
+		return -EIO;
+	}
+
+	for (i = 0; i < num_pages; i++) {
+		u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
+		struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
+
+		locked = vmballoon_batch_get_status(b->batch_page, i);
+
+		switch (locked) {
+		case VMW_BALLOON_SUCCESS:
+			list_add(&p->lru, &b->pages);
+			b->size++;
+			break;
+		case VMW_BALLOON_ERROR_PPN_PINNED:
+		case VMW_BALLOON_ERROR_PPN_INVALID:
+			if (b->n_refused_pages < VMW_BALLOON_MAX_REFUSED) {
+				list_add(&p->lru, &b->refused_pages);
+				b->n_refused_pages++;
+				break;
+			}
+			/* Fallthrough */
+		case VMW_BALLOON_ERROR_RESET:
+		case VMW_BALLOON_ERROR_PPN_NOTNEEDED:
+			__free_page(p);
+			break;
+		default:
+			/* This should never happen */
+			WARN_ON_ONCE(true);
+		}
+	}
+
+	return 0;
+}
+
 /*
  * Release the page allocated for the balloon. Note that we first notify
  * the host so it can make sure the page will be available for the guest
  * to use, if needed.
  */
-static int vmballoon_release_page(struct vmballoon *b, struct page *page)
+static int vmballoon_unlock_page(struct vmballoon *b, unsigned int num_pages)
 {
-	if (!vmballoon_send_unlock_page(b, page_to_pfn(page)))
-		return -EIO;
+	struct page *page = b->page;
 
-	list_del(&page->lru);
+	if (!vmballoon_send_unlock_page(b, page_to_pfn(page))) {
+		list_add(&page->lru, &b->pages);
+		return -EIO;
+	}
 
 	/* deallocate page */
 	__free_page(page);
@@ -479,6 +639,41 @@ static int vmballoon_release_page(struct vmballoon *b, struct page *page)
 	return 0;
 }
 
+static int vmballoon_unlock_batched_page(struct vmballoon *b,
+					unsigned int num_pages)
+{
+	int locked, i, ret = 0;
+	bool hv_success;
+
+	hv_success = vmballoon_send_batched_unlock(b, num_pages);
+	if (!hv_success)
+		ret = -EIO;
+
+	for (i = 0; i < num_pages; i++) {
+		u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
+		struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
+
+		locked = vmballoon_batch_get_status(b->batch_page, i);
+		if (!hv_success || locked != VMW_BALLOON_SUCCESS) {
+			/*
+			 * That page wasn't successfully unlocked by the
+			 * hypervisor, re-add it to the list of pages owned by
+			 * the balloon driver.
+			 */
+			list_add(&p->lru, &b->pages);
+		} else {
+			/* deallocate page */
+			__free_page(p);
+			STATS_INC(b->stats.free);
+
+			/* update balloon size */
+			b->size--;
+		}
+	}
+
+	return ret;
+}
+
 /*
  * Release pages that were allocated while attempting to inflate the
  * balloon but were refused by the host for one reason or another.
@@ -496,6 +691,18 @@ static void vmballoon_release_refused_pages(struct vmballoon *b)
 	b->n_refused_pages = 0;
 }
 
+static void vmballoon_add_page(struct vmballoon *b, int idx, struct page *p)
+{
+	b->page = p;
+}
+
+static void vmballoon_add_batched_page(struct vmballoon *b, int idx,
+				struct page *p)
+{
+	vmballoon_batch_set_pa(b->batch_page, idx,
+			(u64)page_to_pfn(p) << PAGE_SHIFT);
+}
+
 /*
  * Inflate the balloon towards its target size. Note that we try to limit
  * the rate of allocation to make sure we are not choking the rest of the
@@ -507,6 +714,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 	unsigned int rate;
 	unsigned int i;
 	unsigned int allocations = 0;
+	unsigned int num_pages = 0;
 	int error = 0;
 	gfp_t flags = VMW_PAGE_ALLOC_NOSLEEP;
 
@@ -539,14 +747,13 @@ static void vmballoon_inflate(struct vmballoon *b)
 		 __func__, goal, rate, b->rate_alloc);
 
 	for (i = 0; i < goal; i++) {
-		struct page *page;
+		struct page *page = alloc_page(flags);
 
 		if (flags == VMW_PAGE_ALLOC_NOSLEEP)
 			STATS_INC(b->stats.alloc);
 		else
 			STATS_INC(b->stats.sleep_alloc);
 
-		page = alloc_page(flags);
 		if (!page) {
 			if (flags == VMW_PAGE_ALLOC_CANSLEEP) {
 				/*
@@ -580,9 +787,13 @@ static void vmballoon_inflate(struct vmballoon *b)
 			continue;
 		}
 
-		error = vmballoon_lock_page(b, page);
-		if (error)
-			break;
+		b->ops->add_page(b, num_pages++, page);
+		if (num_pages == b->batch_max_pages) {
+			error = b->ops->lock(b, num_pages);
+			num_pages = 0;
+			if (error)
+				break;
+		}
 
 		if (++allocations > VMW_BALLOON_YIELD_THRESHOLD) {
 			cond_resched();
@@ -595,6 +806,9 @@ static void vmballoon_inflate(struct vmballoon *b)
 		}
 	}
 
+	if (num_pages > 0)
+		b->ops->lock(b, num_pages);
+
 	/*
 	 * We reached our goal without failures so try increasing
 	 * allocation rate.
@@ -618,6 +832,7 @@ static void vmballoon_deflate(struct vmballoon *b)
 	struct page *page, *next;
 	unsigned int i = 0;
 	unsigned int goal;
+	unsigned int num_pages = 0;
 	int error;
 
 	pr_debug("%s - size: %d, target %d\n", __func__, b->size, b->target);
@@ -629,21 +844,94 @@ static void vmballoon_deflate(struct vmballoon *b)
 
 	/* free pages to reach target */
 	list_for_each_entry_safe(page, next, &b->pages, lru) {
-		error = vmballoon_release_page(b, page);
-		if (error) {
-			/* quickly decrease rate in case of error */
-			b->rate_free = max(b->rate_free / 2,
-					   VMW_BALLOON_RATE_FREE_MIN);
-			return;
+		list_del(&page->lru);
+		b->ops->add_page(b, num_pages++, page);
+
+		if (num_pages == b->batch_max_pages) {
+			error = b->ops->unlock(b, num_pages);
+			num_pages = 0;
+			if (error) {
+				/* quickly decrease rate in case of error */
+				b->rate_free = max(b->rate_free / 2,
+						VMW_BALLOON_RATE_FREE_MIN);
+				return;
+			}
 		}
 
 		if (++i >= goal)
 			break;
 	}
 
+	if (num_pages > 0)
+		b->ops->unlock(b, num_pages);
+
 	/* slowly increase rate if there were no errors */
-	b->rate_free = min(b->rate_free + VMW_BALLOON_RATE_FREE_INC,
-			   VMW_BALLOON_RATE_FREE_MAX);
+	if (error == 0)
+		b->rate_free = min(b->rate_free + VMW_BALLOON_RATE_FREE_INC,
+				   VMW_BALLOON_RATE_FREE_MAX);
+}
+
+static const struct vmballoon_ops vmballoon_basic_ops = {
+	.add_page = vmballoon_add_page,
+	.lock = vmballoon_lock_page,
+	.unlock = vmballoon_unlock_page
+};
+
+static const struct vmballoon_ops vmballoon_batched_ops = {
+	.add_page = vmballoon_add_batched_page,
+	.lock = vmballoon_lock_batched_page,
+	.unlock = vmballoon_unlock_batched_page
+};
+
+static bool vmballoon_init_batching(struct vmballoon *b)
+{
+	b->page = alloc_page(VMW_PAGE_ALLOC_NOSLEEP);
+	if (!b->page)
+		return false;
+
+	b->batch_page = vmap(&b->page, 1, VM_MAP, PAGE_KERNEL);
+	if (!b->batch_page) {
+		__free_page(b->page);
+		return false;
+	}
+
+	return true;
+}
+
+/*
+ * Perform standard reset sequence by popping the balloon (in case it
+ * is not  empty) and then restarting protocol. This operation normally
+ * happens when host responds with VMW_BALLOON_ERROR_RESET to a command.
+ */
+static void vmballoon_reset(struct vmballoon *b)
+{
+	/* free all pages, skipping monitor unlock */
+	vmballoon_pop(b);
+
+	if (!vmballoon_send_start(b, VMW_BALLOON_CAPABILITIES))
+		return;
+
+	if ((b->capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
+		b->ops = &vmballoon_batched_ops;
+		b->batch_max_pages = VMW_BALLOON_BATCH_MAX_PAGES;
+		if (!vmballoon_init_batching(b)) {
+			/*
+			 * We failed to initialize batching, inform the monitor
+			 * about it by sending a null capability.
+			 *
+			 * The guest will retry in one second.
+			 */
+			vmballoon_send_start(b, 0);
+			return;
+		}
+	} else if ((b->capabilities & VMW_BALLOON_BASIC_CMDS) != 0) {
+		b->ops = &vmballoon_basic_ops;
+		b->batch_max_pages = 1;
+	}
+
+	b->reset_required = false;
+	if (!vmballoon_send_guest_id(b))
+		pr_err("failed to send guest ID to the host\n");
 }
 
 /*
@@ -802,11 +1090,23 @@ static int __init vmballoon_init(void)
 	/*
 	 * Start balloon.
 	 */
-	if (!vmballoon_send_start(&balloon)) {
+	if (!vmballoon_send_start(&balloon, VMW_BALLOON_CAPABILITIES)) {
 		pr_err("failed to send start command to the host\n");
 		return -EIO;
 	}
 
+	if ((balloon.capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
+		balloon.ops = &vmballoon_batched_ops;
+		balloon.batch_max_pages = VMW_BALLOON_BATCH_MAX_PAGES;
+		if (!vmballoon_init_batching(&balloon)) {
+			pr_err("failed to init batching\n");
+			return -EIO;
+		}
+	} else if ((balloon.capabilities & VMW_BALLOON_BASIC_CMDS) != 0) {
+		balloon.ops = &vmballoon_basic_ops;
+		balloon.batch_max_pages = 1;
+	}
+
 	if (!vmballoon_send_guest_id(&balloon)) {
 		pr_err("failed to send guest ID to the host\n");
 		return -EIO;
@@ -833,7 +1133,7 @@ static void __exit vmballoon_exit(void)
 	 * Reset connection before deallocating memory to avoid potential for
 	 * additional spurious resets from guest touching deallocated pages.
 	 */
-	vmballoon_send_start(&balloon);
+	vmballoon_send_start(&balloon, VMW_BALLOON_CAPABILITIES);
 	vmballoon_pop(&balloon);
 }
 module_exit(vmballoon_exit);
-- 
2.4.3



* [PATCH v5 2/7] VMware balloon: Update balloon target on each lock/unlock.
  2015-08-06 21:07                               ` Greg KH
  2015-08-06 22:17                                 ` [PATCH v5 0/7] Fifth revision of the performance improvement patch to the VMware balloon driver Philip P. Moltmann
  2015-08-06 22:17                                 ` [PATCH v5 1/7] VMware balloon: add batching to the vmw_balloon Philip P. Moltmann
@ 2015-08-06 22:17                                 ` Philip P. Moltmann
  2015-08-06 22:18                                 ` [PATCH v5 3/7] VMware balloon: Show capabilities of balloon and resulting capabilities in the debug-fs node Philip P. Moltmann
                                                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-08-06 22:17 UTC (permalink / raw)
  To: gregkh
  Cc: Xavier Deguillard, dmitry.torokhov, linux-kernel, akpm,
	pv-drivers, Philip P. Moltmann

From: Xavier Deguillard <xdeguillard@vmware.com>

Instead of waiting for the next GET_TARGET command, we can react faster
by exploiting the fact that each hypervisor call also returns the
balloon target.
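
The idea can be modeled in a small userspace sketch (illustrative only — `hv_lock_page` and the hard-coded target stand in for the real backdoor call, and the names are not driver API):

```c
#include <assert.h>

/* Illustrative userspace model of this patch's idea (not the driver
 * code itself): every hypervisor reply carries the current balloon
 * target, so the driver refreshes b->target as a side effect of each
 * lock/unlock instead of waiting for the next GET_TARGET poll. */

struct balloon {
	unsigned int size;
	unsigned int target;
};

/* Stand-in for the LOCK backdoor call: returns a status code and
 * always reports the host's latest target through *target. */
static int hv_lock_page(unsigned long pfn, unsigned int *target)
{
	(void)pfn;
	*target = 512;	/* pretend the host lowered the target */
	return 0;	/* VMW_BALLOON_SUCCESS */
}

static int balloon_lock(struct balloon *b, unsigned long pfn)
{
	/* target is updated on every call, not just on GET_TARGET */
	int status = hv_lock_page(pfn, &b->target);

	if (status == 0)
		b->size++;
	return status;
}
```

With this shape, the inflate/deflate loops always see a fresh target without an extra round trip to the host.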

Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
Acked-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Philip P. Moltmann <moltmann@vmware.com>
Acked-by: Andy King <acking@vmware.com>
---
 drivers/misc/vmw_balloon.c | 85 +++++++++++++++++++++++-----------------------
 1 file changed, 42 insertions(+), 43 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 64f275e..0b5aa93 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -47,7 +47,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.3.1.0-k");
+MODULE_VERSION("1.3.2.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -255,8 +255,10 @@ struct vmballoon;
 
 struct vmballoon_ops {
 	void (*add_page)(struct vmballoon *b, int idx, struct page *p);
-	int (*lock)(struct vmballoon *b, unsigned int num_pages);
-	int (*unlock)(struct vmballoon *b, unsigned int num_pages);
+	int (*lock)(struct vmballoon *b, unsigned int num_pages,
+						unsigned int *target);
+	int (*unlock)(struct vmballoon *b, unsigned int num_pages,
+						unsigned int *target);
 };
 
 struct vmballoon {
@@ -413,7 +415,7 @@ static bool vmballoon_send_get_target(struct vmballoon *b, u32 *new_target)
  * check the return value and maybe submit a different page.
  */
 static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
-				     unsigned int *hv_status)
+				unsigned int *hv_status, unsigned int *target)
 {
 	unsigned long status, dummy = 0;
 	u32 pfn32;
@@ -424,7 +426,7 @@ static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 
 	STATS_INC(b->stats.lock);
 
-	*hv_status = status = VMWARE_BALLOON_CMD(LOCK, pfn, dummy, dummy);
+	*hv_status = status = VMWARE_BALLOON_CMD(LOCK, pfn, dummy, *target);
 	if (vmballoon_check_status(b, status))
 		return 0;
 
@@ -434,14 +436,14 @@ static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 }
 
 static int vmballoon_send_batched_lock(struct vmballoon *b,
-					unsigned int num_pages)
+				unsigned int num_pages, unsigned int *target)
 {
-	unsigned long status, dummy;
+	unsigned long status;
 	unsigned long pfn = page_to_pfn(b->page);
 
 	STATS_INC(b->stats.lock);
 
-	status = VMWARE_BALLOON_CMD(BATCHED_LOCK, pfn, num_pages, dummy);
+	status = VMWARE_BALLOON_CMD(BATCHED_LOCK, pfn, num_pages, *target);
 	if (vmballoon_check_status(b, status))
 		return 0;
 
@@ -454,7 +456,8 @@ static int vmballoon_send_batched_lock(struct vmballoon *b,
  * Notify the host that guest intends to release given page back into
  * the pool of available (to the guest) pages.
  */
-static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
+static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn,
+							unsigned int *target)
 {
 	unsigned long status, dummy = 0;
 	u32 pfn32;
@@ -465,7 +468,7 @@ static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
 
 	STATS_INC(b->stats.unlock);
 
-	status = VMWARE_BALLOON_CMD(UNLOCK, pfn, dummy, dummy);
+	status = VMWARE_BALLOON_CMD(UNLOCK, pfn, dummy, *target);
 	if (vmballoon_check_status(b, status))
 		return true;
 
@@ -475,14 +478,14 @@ static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn)
 }
 
 static bool vmballoon_send_batched_unlock(struct vmballoon *b,
-						unsigned int num_pages)
+				unsigned int num_pages, unsigned int *target)
 {
-	unsigned long status, dummy;
+	unsigned long status;
 	unsigned long pfn = page_to_pfn(b->page);
 
 	STATS_INC(b->stats.unlock);
 
-	status = VMWARE_BALLOON_CMD(BATCHED_UNLOCK, pfn, num_pages, dummy);
+	status = VMWARE_BALLOON_CMD(BATCHED_UNLOCK, pfn, num_pages, *target);
 	if (vmballoon_check_status(b, status))
 		return true;
 
@@ -528,12 +531,14 @@ static void vmballoon_pop(struct vmballoon *b)
  * refuse list, those refused page are then released at the end of the
  * inflation cycle.
  */
-static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages)
+static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages,
+							unsigned int *target)
 {
 	int locked, hv_status;
 	struct page *page = b->page;
 
-	locked = vmballoon_send_lock_page(b, page_to_pfn(page), &hv_status);
+	locked = vmballoon_send_lock_page(b, page_to_pfn(page), &hv_status,
+								target);
 	if (locked > 0) {
 		STATS_INC(b->stats.refused_alloc);
 
@@ -567,11 +572,11 @@ static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages)
 }
 
 static int vmballoon_lock_batched_page(struct vmballoon *b,
-				unsigned int num_pages)
+				unsigned int num_pages, unsigned int *target)
 {
 	int locked, i;
 
-	locked = vmballoon_send_batched_lock(b, num_pages);
+	locked = vmballoon_send_batched_lock(b, num_pages, target);
 	if (locked > 0) {
 		for (i = 0; i < num_pages; i++) {
 			u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
@@ -620,11 +625,12 @@ static int vmballoon_lock_batched_page(struct vmballoon *b,
  * the host so it can make sure the page will be available for the guest
  * to use, if needed.
  */
-static int vmballoon_unlock_page(struct vmballoon *b, unsigned int num_pages)
+static int vmballoon_unlock_page(struct vmballoon *b, unsigned int num_pages,
+							unsigned int *target)
 {
 	struct page *page = b->page;
 
-	if (!vmballoon_send_unlock_page(b, page_to_pfn(page))) {
+	if (!vmballoon_send_unlock_page(b, page_to_pfn(page), target)) {
 		list_add(&page->lru, &b->pages);
 		return -EIO;
 	}
@@ -640,12 +646,12 @@ static int vmballoon_unlock_page(struct vmballoon *b, unsigned int num_pages)
 }
 
 static int vmballoon_unlock_batched_page(struct vmballoon *b,
-					unsigned int num_pages)
+				unsigned int num_pages, unsigned int *target)
 {
 	int locked, i, ret = 0;
 	bool hv_success;
 
-	hv_success = vmballoon_send_batched_unlock(b, num_pages);
+	hv_success = vmballoon_send_batched_unlock(b, num_pages, target);
 	if (!hv_success)
 		ret = -EIO;
 
@@ -710,9 +716,7 @@ static void vmballoon_add_batched_page(struct vmballoon *b, int idx,
  */
 static void vmballoon_inflate(struct vmballoon *b)
 {
-	unsigned int goal;
 	unsigned int rate;
-	unsigned int i;
 	unsigned int allocations = 0;
 	unsigned int num_pages = 0;
 	int error = 0;
@@ -735,7 +739,6 @@ static void vmballoon_inflate(struct vmballoon *b)
 	 * slowdown page allocations considerably.
 	 */
 
-	goal = b->target - b->size;
 	/*
 	 * Start with no sleep allocation rate which may be higher
 	 * than sleeping allocation rate.
@@ -744,16 +747,17 @@ static void vmballoon_inflate(struct vmballoon *b)
 			b->rate_alloc : VMW_BALLOON_NOSLEEP_ALLOC_MAX;
 
 	pr_debug("%s - goal: %d, no-sleep rate: %d, sleep rate: %d\n",
-		 __func__, goal, rate, b->rate_alloc);
+		 __func__, b->target - b->size, rate, b->rate_alloc);
 
-	for (i = 0; i < goal; i++) {
-		struct page *page = alloc_page(flags);
+	while (b->size < b->target && num_pages < b->target - b->size) {
+		struct page *page;
 
 		if (flags == VMW_PAGE_ALLOC_NOSLEEP)
 			STATS_INC(b->stats.alloc);
 		else
 			STATS_INC(b->stats.sleep_alloc);
 
+		page = alloc_page(flags);
 		if (!page) {
 			if (flags == VMW_PAGE_ALLOC_CANSLEEP) {
 				/*
@@ -778,7 +782,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 			 */
 			b->slow_allocation_cycles = VMW_BALLOON_SLOW_CYCLES;
 
-			if (i >= b->rate_alloc)
+			if (allocations >= b->rate_alloc)
 				break;
 
 			flags = VMW_PAGE_ALLOC_CANSLEEP;
@@ -789,7 +793,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 
 		b->ops->add_page(b, num_pages++, page);
 		if (num_pages == b->batch_max_pages) {
-			error = b->ops->lock(b, num_pages);
+			error = b->ops->lock(b, num_pages, &b->target);
 			num_pages = 0;
 			if (error)
 				break;
@@ -800,21 +804,21 @@ static void vmballoon_inflate(struct vmballoon *b)
 			allocations = 0;
 		}
 
-		if (i >= rate) {
+		if (allocations >= rate) {
 			/* We allocated enough pages, let's take a break. */
 			break;
 		}
 	}
 
 	if (num_pages > 0)
-		b->ops->lock(b, num_pages);
+		b->ops->lock(b, num_pages, &b->target);
 
 	/*
 	 * We reached our goal without failures so try increasing
 	 * allocation rate.
 	 */
-	if (error == 0 && i >= b->rate_alloc) {
-		unsigned int mult = i / b->rate_alloc;
+	if (error == 0 && allocations >= b->rate_alloc) {
+		unsigned int mult = allocations / b->rate_alloc;
 
 		b->rate_alloc =
 			min(b->rate_alloc + mult * VMW_BALLOON_RATE_ALLOC_INC,
@@ -831,16 +835,11 @@ static void vmballoon_deflate(struct vmballoon *b)
 {
 	struct page *page, *next;
 	unsigned int i = 0;
-	unsigned int goal;
 	unsigned int num_pages = 0;
 	int error;
 
-	pr_debug("%s - size: %d, target %d\n", __func__, b->size, b->target);
-
-	/* limit deallocation rate */
-	goal = min(b->size - b->target, b->rate_free);
-
-	pr_debug("%s - goal: %d, rate: %d\n", __func__, goal, b->rate_free);
+	pr_debug("%s - size: %d, target %d, rate: %d\n", __func__, b->size,
+						b->target, b->rate_free);
 
 	/* free pages to reach target */
 	list_for_each_entry_safe(page, next, &b->pages, lru) {
@@ -848,7 +847,7 @@ static void vmballoon_deflate(struct vmballoon *b)
 		b->ops->add_page(b, num_pages++, page);
 
 		if (num_pages == b->batch_max_pages) {
-			error = b->ops->unlock(b, num_pages);
+			error = b->ops->unlock(b, num_pages, &b->target);
 			num_pages = 0;
 			if (error) {
 				/* quickly decrease rate in case of error */
@@ -858,12 +857,12 @@ static void vmballoon_deflate(struct vmballoon *b)
 			}
 		}
 
-		if (++i >= goal)
+		if (++i >= b->size - b->target)
 			break;
 	}
 
 	if (num_pages > 0)
-		b->ops->unlock(b, num_pages);
+		b->ops->unlock(b, num_pages, &b->target);
 
 	/* slowly increase rate if there were no errors */
 	if (error == 0)
-- 
2.4.3



* [PATCH v5 3/7] VMware balloon: Show capabilities of balloon and resulting capabilities in the debug-fs node.
  2015-08-06 21:07                               ` Greg KH
                                                   ` (2 preceding siblings ...)
  2015-08-06 22:17                                 ` [PATCH v5 2/7] VMware balloon: Update balloon target on each lock/unlock Philip P. Moltmann
@ 2015-08-06 22:18                                 ` Philip P. Moltmann
  2015-08-06 22:18                                 ` [PATCH v5 4/7] VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode Philip P. Moltmann
                                                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-08-06 22:18 UTC (permalink / raw)
  To: gregkh
  Cc: Philip P. Moltmann, dmitry.torokhov, linux-kernel, xdeguillard,
	akpm, pv-drivers

This helps with debugging vmw_balloon behavior, as it makes clear which
functionality is enabled.

Acked-by: Andy King <acking@vmware.com>
Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
---
 drivers/misc/vmw_balloon.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 0b5aa93..f0beb65 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -47,7 +47,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.3.2.0-k");
+MODULE_VERSION("1.3.3.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -979,6 +979,12 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 	struct vmballoon *b = f->private;
 	struct vmballoon_stats *stats = &b->stats;
 
+	/* format capabilities info */
+	seq_printf(f,
+		   "balloon capabilities:   %#4x\n"
+		   "used capabilities:      %#4lx\n",
+		   VMW_BALLOON_CAPABILITIES, b->capabilities);
+
 	/* format size info */
 	seq_printf(f,
 		   "target:             %8d pages\n"
-- 
2.4.3



* [PATCH v5 4/7] VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode.
  2015-08-06 21:07                               ` Greg KH
                                                   ` (3 preceding siblings ...)
  2015-08-06 22:18                                 ` [PATCH v5 3/7] VMware balloon: Show capabilities of balloon and resulting capabilities in the debug-fs node Philip P. Moltmann
@ 2015-08-06 22:18                                 ` Philip P. Moltmann
  2015-08-06 22:18                                 ` [PATCH v5 5/7] VMware balloon: Support 2m page ballooning Philip P. Moltmann
                                                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-08-06 22:18 UTC (permalink / raw)
  To: gregkh
  Cc: Philip P. Moltmann, dmitry.torokhov, linux-kernel, xdeguillard,
	akpm, pv-drivers

When VMware's hypervisor requests a VM to reclaim memory, this is preferably done
via ballooning. If the balloon driver does not return memory fast enough, more
drastic methods, such as hypervisor-level swapping, are needed. These other methods
cause performance issues, e.g. hypervisor-level swapping requires the hypervisor to
swap in a page synchronously while the virtual CPU is blocked.

Hence it is in the interest of the VM to balloon memory as fast as possible. The
problem with doing this is that the VM might end up doing nothing but ballooning,
and the user might notice that the VM is stalled, especially when the VM has
only a single virtual CPU.

This is less of a problem if the VM and the hypervisor perform balloon operations
faster. Also, the balloon driver yields regularly, so on a single virtual CPU
the Linux scheduler should be able to properly time-slice between ballooning and
other tasks.

Testing Done: quickly ballooned a lot of pages while watching for any perceived
hiccups (periods of non-responsiveness) in the execution of the Linux VM. No such
hiccups were seen.
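
The new policy can be sketched as a userspace model (illustrative only: `cond_resched_model()` is a plain counter standing in for the kernel's `cond_resched()`, and the structures are not driver code):

```c
#include <assert.h>

/* Illustrative model of the new inflate policy: instead of capping
 * non-sleeping allocations at a fixed rate, the loop yields after
 * every page so the scheduler can time-slice ballooning against
 * other tasks on a single vCPU. */

struct balloon {
	unsigned int size;
	unsigned int target;
};

static unsigned int yields;

static void cond_resched_model(void)
{
	yields++;	/* in the kernel this may give up the CPU */
}

/* Inflate until the target is reached, yielding on every iteration
 * rather than stopping at an artificial allocation-rate limit. */
static void inflate(struct balloon *b)
{
	while (b->size < b->target) {
		b->size++;	/* stands in for alloc_page() + lock */
		cond_resched_model();
	}
}
```

The point is that responsiveness comes from frequent yields, not from throttling how much memory is reclaimed per cycle.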

Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
---
 drivers/misc/vmw_balloon.c | 68 +++++++++++-----------------------------------
 1 file changed, 16 insertions(+), 52 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index f0beb65..aed9525 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -47,7 +47,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.3.3.0-k");
+MODULE_VERSION("1.3.4.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -58,12 +58,6 @@ MODULE_LICENSE("GPL");
  */
 
 /*
- * Rate of allocating memory when there is no memory pressure
- * (driver performs non-sleeping allocations).
- */
-#define VMW_BALLOON_NOSLEEP_ALLOC_MAX	16384U
-
-/*
  * Rates of memory allocaton when guest experiences memory pressure
  * (driver performs sleeping allocations).
  */
@@ -72,13 +66,6 @@ MODULE_LICENSE("GPL");
 #define VMW_BALLOON_RATE_ALLOC_INC	16U
 
 /*
- * Rates for releasing pages while deflating balloon.
- */
-#define VMW_BALLOON_RATE_FREE_MIN	512U
-#define VMW_BALLOON_RATE_FREE_MAX	16384U
-#define VMW_BALLOON_RATE_FREE_INC	16U
-
-/*
  * When guest is under memory pressure, use a reduced page allocation
  * rate for next several cycles.
  */
@@ -100,9 +87,6 @@ MODULE_LICENSE("GPL");
  */
 #define VMW_PAGE_ALLOC_CANSLEEP		(GFP_HIGHUSER)
 
-/* Maximum number of page allocations without yielding processor */
-#define VMW_BALLOON_YIELD_THRESHOLD	1024
-
 /* Maximum number of refused pages we accumulate during inflation cycle */
 #define VMW_BALLOON_MAX_REFUSED		16
 
@@ -279,7 +263,6 @@ struct vmballoon {
 
 	/* adjustment rates (pages per second) */
 	unsigned int rate_alloc;
-	unsigned int rate_free;
 
 	/* slowdown page allocations for next few cycles */
 	unsigned int slow_allocation_cycles;
@@ -503,18 +486,13 @@ static bool vmballoon_send_batched_unlock(struct vmballoon *b,
 static void vmballoon_pop(struct vmballoon *b)
 {
 	struct page *page, *next;
-	unsigned int count = 0;
 
 	list_for_each_entry_safe(page, next, &b->pages, lru) {
 		list_del(&page->lru);
 		__free_page(page);
 		STATS_INC(b->stats.free);
 		b->size--;
-
-		if (++count >= b->rate_free) {
-			count = 0;
-			cond_resched();
-		}
+		cond_resched();
 	}
 
 	if ((b->capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
@@ -716,7 +694,7 @@ static void vmballoon_add_batched_page(struct vmballoon *b, int idx,
  */
 static void vmballoon_inflate(struct vmballoon *b)
 {
-	unsigned int rate;
+	unsigned rate;
 	unsigned int allocations = 0;
 	unsigned int num_pages = 0;
 	int error = 0;
@@ -743,13 +721,13 @@ static void vmballoon_inflate(struct vmballoon *b)
 	 * Start with no sleep allocation rate which may be higher
 	 * than sleeping allocation rate.
 	 */
-	rate = b->slow_allocation_cycles ?
-			b->rate_alloc : VMW_BALLOON_NOSLEEP_ALLOC_MAX;
+	rate = b->slow_allocation_cycles ? b->rate_alloc : UINT_MAX;
 
-	pr_debug("%s - goal: %d, no-sleep rate: %d, sleep rate: %d\n",
+	pr_debug("%s - goal: %d, no-sleep rate: %u, sleep rate: %d\n",
 		 __func__, b->target - b->size, rate, b->rate_alloc);
 
-	while (b->size < b->target && num_pages < b->target - b->size) {
+	while (!b->reset_required &&
+		b->size < b->target && num_pages < b->target - b->size) {
 		struct page *page;
 
 		if (flags == VMW_PAGE_ALLOC_NOSLEEP)
@@ -799,10 +777,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 				break;
 		}
 
-		if (++allocations > VMW_BALLOON_YIELD_THRESHOLD) {
-			cond_resched();
-			allocations = 0;
-		}
+		cond_resched();
 
 		if (allocations >= rate) {
 			/* We allocated enough pages, let's take a break. */
@@ -838,36 +813,29 @@ static void vmballoon_deflate(struct vmballoon *b)
 	unsigned int num_pages = 0;
 	int error;
 
-	pr_debug("%s - size: %d, target %d, rate: %d\n", __func__, b->size,
-						b->target, b->rate_free);
+	pr_debug("%s - size: %d, target %d\n", __func__, b->size, b->target);
 
 	/* free pages to reach target */
 	list_for_each_entry_safe(page, next, &b->pages, lru) {
 		list_del(&page->lru);
 		b->ops->add_page(b, num_pages++, page);
 
+
 		if (num_pages == b->batch_max_pages) {
 			error = b->ops->unlock(b, num_pages, &b->target);
 			num_pages = 0;
-			if (error) {
-				/* quickly decrease rate in case of error */
-				b->rate_free = max(b->rate_free / 2,
-						VMW_BALLOON_RATE_FREE_MIN);
+			if (error)
 				return;
-			}
 		}
 
-		if (++i >= b->size - b->target)
+		if (b->reset_required || ++i >= b->size - b->target)
 			break;
+
+		cond_resched();
 	}
 
 	if (num_pages > 0)
 		b->ops->unlock(b, num_pages, &b->target);
-
-	/* slowly increase rate if there were no errors */
-	if (error == 0)
-		b->rate_free = min(b->rate_free + VMW_BALLOON_RATE_FREE_INC,
-				   VMW_BALLOON_RATE_FREE_MAX);
 }
 
 static const struct vmballoon_ops vmballoon_basic_ops = {
@@ -993,11 +961,8 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 
 	/* format rate info */
 	seq_printf(f,
-		   "rateNoSleepAlloc:   %8d pages/sec\n"
-		   "rateSleepAlloc:     %8d pages/sec\n"
-		   "rateFree:           %8d pages/sec\n",
-		   VMW_BALLOON_NOSLEEP_ALLOC_MAX,
-		   b->rate_alloc, b->rate_free);
+		   "rateSleepAlloc:     %8d pages/sec\n",
+		   b->rate_alloc);
 
 	seq_printf(f,
 		   "\n"
@@ -1088,7 +1053,6 @@ static int __init vmballoon_init(void)
 
 	/* initialize rates */
 	balloon.rate_alloc = VMW_BALLOON_RATE_ALLOC_MAX;
-	balloon.rate_free = VMW_BALLOON_RATE_FREE_MAX;
 
 	INIT_DELAYED_WORK(&balloon.dwork, vmballoon_work);
 
-- 
2.4.3



* [PATCH v5 5/7] VMware balloon: Support 2m page ballooning.
  2015-08-06 21:07                               ` Greg KH
                                                   ` (4 preceding siblings ...)
  2015-08-06 22:18                                 ` [PATCH v5 4/7] VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode Philip P. Moltmann
@ 2015-08-06 22:18                                 ` Philip P. Moltmann
  2015-08-21 14:01                                   ` Kamalneet Singh
  2015-08-06 22:18                                 ` [PATCH v5 6/7] VMware balloon: Treat init like reset Philip P. Moltmann
  2015-08-06 22:18                                 ` [PATCH v5 7/7] VMware balloon: Enable notification via VMCI Philip P. Moltmann
  7 siblings, 1 reply; 61+ messages in thread
From: Philip P. Moltmann @ 2015-08-06 22:18 UTC (permalink / raw)
  To: gregkh
  Cc: Philip P. Moltmann, dmitry.torokhov, linux-kernel, xdeguillard,
	akpm, pv-drivers

2m ballooning significantly reduces the hypervisor side (and guest side)
overhead of ballooning and unballooning.

hypervisor only:
      balloon  unballoon
4 KB  2 GB/s   2.6 GB/s
2 MB  54 GB/s  767 GB/s

Use 2 MB pages, as the hypervisor is always 64-bit and 2 MB is the smallest
supported super-page size.

The code has to run on older versions of ESX, and old balloon drivers run on
newer versions of ESX. Hence the driver matches capabilities with the host before
2 MB page ballooning can be enabled.
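
A minimal sketch of that handshake (the capability names mirror the driver's constants, but `negotiate()` and `supported_page_sizes()` are illustrative helpers, not driver API):

```c
#include <assert.h>

/* Capability bits as defined in this patch. */
enum vmwballoon_capabilities {
	VMW_BALLOON_BASIC_CMDS		= (1 << 1),
	VMW_BALLOON_BATCHED_CMDS	= (1 << 2),
	VMW_BALLOON_BATCHED_2M_CMDS	= (1 << 3),
};

/* The guest advertises everything it supports; the host answers with
 * the subset it accepts. Only that intersection is ever used, which
 * keeps new drivers working on old ESX hosts and vice versa. */
static unsigned long negotiate(unsigned long guest_caps,
			       unsigned long host_caps)
{
	return guest_caps & host_caps;
}

/* 1 == 4 KB pages only, 2 == 4 KB and 2 MB pages, mirroring the
 * driver's supported_page_sizes field. */
static unsigned int supported_page_sizes(unsigned long caps)
{
	return (caps & VMW_BALLOON_BATCHED_2M_CMDS) ? 2 : 1;
}
```

So an old host that never sets `VMW_BALLOON_BATCHED_2M_CMDS` silently keeps the driver on 4 KB pages.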

Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
---
 drivers/misc/vmw_balloon.c | 376 +++++++++++++++++++++++++++++++--------------
 1 file changed, 258 insertions(+), 118 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index aed9525..01519ff 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -47,7 +47,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.3.4.0-k");
+MODULE_VERSION("1.4.0.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -102,11 +102,16 @@ enum vmwballoon_capabilities {
 	 * Bit 0 is reserved and not associated to any capability.
 	 */
 	VMW_BALLOON_BASIC_CMDS		= (1 << 1),
-	VMW_BALLOON_BATCHED_CMDS	= (1 << 2)
+	VMW_BALLOON_BATCHED_CMDS	= (1 << 2),
+	VMW_BALLOON_BATCHED_2M_CMDS     = (1 << 3),
 };
 
 #define VMW_BALLOON_CAPABILITIES	(VMW_BALLOON_BASIC_CMDS \
-					| VMW_BALLOON_BATCHED_CMDS)
+					| VMW_BALLOON_BATCHED_CMDS \
+					| VMW_BALLOON_BATCHED_2M_CMDS)
+
+#define VMW_BALLOON_2M_SHIFT		(9)
+#define VMW_BALLOON_NUM_PAGE_SIZES	(2)
 
 /*
  * Backdoor commands availability:
@@ -117,14 +122,19 @@ enum vmwballoon_capabilities {
  *	LOCK and UNLOCK commands,
  * VMW_BALLOON_BATCHED_CMDS:
  *	BATCHED_LOCK and BATCHED_UNLOCK commands.
+ * VMW BALLOON_BATCHED_2M_CMDS:
+ *	BATCHED_2M_LOCK and BATCHED_2M_UNLOCK commands.
  */
-#define VMW_BALLOON_CMD_START		0
-#define VMW_BALLOON_CMD_GET_TARGET	1
-#define VMW_BALLOON_CMD_LOCK		2
-#define VMW_BALLOON_CMD_UNLOCK		3
-#define VMW_BALLOON_CMD_GUEST_ID	4
-#define VMW_BALLOON_CMD_BATCHED_LOCK	6
-#define VMW_BALLOON_CMD_BATCHED_UNLOCK	7
+#define VMW_BALLOON_CMD_START			0
+#define VMW_BALLOON_CMD_GET_TARGET		1
+#define VMW_BALLOON_CMD_LOCK			2
+#define VMW_BALLOON_CMD_UNLOCK			3
+#define VMW_BALLOON_CMD_GUEST_ID		4
+#define VMW_BALLOON_CMD_BATCHED_LOCK		6
+#define VMW_BALLOON_CMD_BATCHED_UNLOCK		7
+#define VMW_BALLOON_CMD_BATCHED_2M_LOCK		8
+#define VMW_BALLOON_CMD_BATCHED_2M_UNLOCK	9
+
 
 /* error codes */
 #define VMW_BALLOON_SUCCESS		        0
@@ -152,9 +162,6 @@ enum vmwballoon_capabilities {
  * +-------------+----------+--------+
  * 64  PAGE_SHIFT          6         0
  *
- * For now only 4K pages are supported, but we can easily support large pages
- * by using bits in the reserved field.
- *
  * The reserved field should be set to 0.
  */
 #define VMW_BALLOON_BATCH_MAX_PAGES	(PAGE_SIZE / sizeof(u64))
@@ -209,19 +216,19 @@ struct vmballoon_stats {
 	unsigned int timer;
 
 	/* allocation statistics */
-	unsigned int alloc;
-	unsigned int alloc_fail;
+	unsigned int alloc[VMW_BALLOON_NUM_PAGE_SIZES];
+	unsigned int alloc_fail[VMW_BALLOON_NUM_PAGE_SIZES];
 	unsigned int sleep_alloc;
 	unsigned int sleep_alloc_fail;
-	unsigned int refused_alloc;
-	unsigned int refused_free;
-	unsigned int free;
+	unsigned int refused_alloc[VMW_BALLOON_NUM_PAGE_SIZES];
+	unsigned int refused_free[VMW_BALLOON_NUM_PAGE_SIZES];
+	unsigned int free[VMW_BALLOON_NUM_PAGE_SIZES];
 
 	/* monitor operations */
-	unsigned int lock;
-	unsigned int lock_fail;
-	unsigned int unlock;
-	unsigned int unlock_fail;
+	unsigned int lock[VMW_BALLOON_NUM_PAGE_SIZES];
+	unsigned int lock_fail[VMW_BALLOON_NUM_PAGE_SIZES];
+	unsigned int unlock[VMW_BALLOON_NUM_PAGE_SIZES];
+	unsigned int unlock_fail[VMW_BALLOON_NUM_PAGE_SIZES];
 	unsigned int target;
 	unsigned int target_fail;
 	unsigned int start;
@@ -240,19 +247,25 @@ struct vmballoon;
 struct vmballoon_ops {
 	void (*add_page)(struct vmballoon *b, int idx, struct page *p);
 	int (*lock)(struct vmballoon *b, unsigned int num_pages,
-						unsigned int *target);
+			bool is_2m_pages, unsigned int *target);
 	int (*unlock)(struct vmballoon *b, unsigned int num_pages,
-						unsigned int *target);
+			bool is_2m_pages, unsigned int *target);
 };
 
-struct vmballoon {
-
+struct vmballoon_page_size {
 	/* list of reserved physical pages */
 	struct list_head pages;
 
 	/* transient list of non-balloonable pages */
 	struct list_head refused_pages;
 	unsigned int n_refused_pages;
+};
+
+struct vmballoon {
+	struct vmballoon_page_size page_sizes[VMW_BALLOON_NUM_PAGE_SIZES];
+
+	/* supported page sizes. 1 == 4k pages only, 2 == 4k and 2m pages */
+	unsigned supported_page_sizes;
 
 	/* balloon size in pages */
 	unsigned int size;
@@ -297,6 +310,7 @@ static struct vmballoon balloon;
 static bool vmballoon_send_start(struct vmballoon *b, unsigned long req_caps)
 {
 	unsigned long status, capabilities, dummy = 0;
+	bool success;
 
 	STATS_INC(b->stats.start);
 
@@ -305,15 +319,26 @@ static bool vmballoon_send_start(struct vmballoon *b, unsigned long req_caps)
 	switch (status) {
 	case VMW_BALLOON_SUCCESS_WITH_CAPABILITIES:
 		b->capabilities = capabilities;
-		return true;
+		success = true;
+		break;
 	case VMW_BALLOON_SUCCESS:
 		b->capabilities = VMW_BALLOON_BASIC_CMDS;
-		return true;
+		success = true;
+		break;
+	default:
+		success = false;
 	}
 
-	pr_debug("%s - failed, hv returns %ld\n", __func__, status);
-	STATS_INC(b->stats.start_fail);
-	return false;
+	if (b->capabilities & VMW_BALLOON_BATCHED_2M_CMDS)
+		b->supported_page_sizes = 2;
+	else
+		b->supported_page_sizes = 1;
+
+	if (!success) {
+		pr_debug("%s - failed, hv returns %ld\n", __func__, status);
+		STATS_INC(b->stats.start_fail);
+	}
+	return success;
 }
 
 static bool vmballoon_check_status(struct vmballoon *b, unsigned long status)
@@ -354,6 +379,14 @@ static bool vmballoon_send_guest_id(struct vmballoon *b)
 	return false;
 }
 
+static u16 vmballoon_page_size(bool is_2m_page)
+{
+	if (is_2m_page)
+		return 1 << VMW_BALLOON_2M_SHIFT;
+
+	return 1;
+}
+
 /*
  * Retrieve desired balloon size from the host.
  */
@@ -407,31 +440,37 @@ static int vmballoon_send_lock_page(struct vmballoon *b, unsigned long pfn,
 	if (pfn32 != pfn)
 		return -1;
 
-	STATS_INC(b->stats.lock);
+	STATS_INC(b->stats.lock[false]);
 
 	*hv_status = status = VMWARE_BALLOON_CMD(LOCK, pfn, dummy, *target);
 	if (vmballoon_check_status(b, status))
 		return 0;
 
 	pr_debug("%s - ppn %lx, hv returns %ld\n", __func__, pfn, status);
-	STATS_INC(b->stats.lock_fail);
+	STATS_INC(b->stats.lock_fail[false]);
 	return 1;
 }
 
 static int vmballoon_send_batched_lock(struct vmballoon *b,
-				unsigned int num_pages, unsigned int *target)
+		unsigned int num_pages, bool is_2m_pages, unsigned int *target)
 {
 	unsigned long status;
 	unsigned long pfn = page_to_pfn(b->page);
 
-	STATS_INC(b->stats.lock);
+	STATS_INC(b->stats.lock[is_2m_pages]);
+
+	if (is_2m_pages)
+		status = VMWARE_BALLOON_CMD(BATCHED_2M_LOCK, pfn, num_pages,
+				*target);
+	else
+		status = VMWARE_BALLOON_CMD(BATCHED_LOCK, pfn, num_pages,
+				*target);
 
-	status = VMWARE_BALLOON_CMD(BATCHED_LOCK, pfn, num_pages, *target);
 	if (vmballoon_check_status(b, status))
 		return 0;
 
 	pr_debug("%s - batch ppn %lx, hv returns %ld\n", __func__, pfn, status);
-	STATS_INC(b->stats.lock_fail);
+	STATS_INC(b->stats.lock_fail[is_2m_pages]);
 	return 1;
 }
 
@@ -449,34 +488,56 @@ static bool vmballoon_send_unlock_page(struct vmballoon *b, unsigned long pfn,
 	if (pfn32 != pfn)
 		return false;
 
-	STATS_INC(b->stats.unlock);
+	STATS_INC(b->stats.unlock[false]);
 
 	status = VMWARE_BALLOON_CMD(UNLOCK, pfn, dummy, *target);
 	if (vmballoon_check_status(b, status))
 		return true;
 
 	pr_debug("%s - ppn %lx, hv returns %ld\n", __func__, pfn, status);
-	STATS_INC(b->stats.unlock_fail);
+	STATS_INC(b->stats.unlock_fail[false]);
 	return false;
 }
 
 static bool vmballoon_send_batched_unlock(struct vmballoon *b,
-				unsigned int num_pages, unsigned int *target)
+		unsigned int num_pages, bool is_2m_pages, unsigned int *target)
 {
 	unsigned long status;
 	unsigned long pfn = page_to_pfn(b->page);
 
-	STATS_INC(b->stats.unlock);
+	STATS_INC(b->stats.unlock[is_2m_pages]);
+
+	if (is_2m_pages)
+		status = VMWARE_BALLOON_CMD(BATCHED_2M_UNLOCK, pfn, num_pages,
+				*target);
+	else
+		status = VMWARE_BALLOON_CMD(BATCHED_UNLOCK, pfn, num_pages,
+				*target);
 
-	status = VMWARE_BALLOON_CMD(BATCHED_UNLOCK, pfn, num_pages, *target);
 	if (vmballoon_check_status(b, status))
 		return true;
 
 	pr_debug("%s - batch ppn %lx, hv returns %ld\n", __func__, pfn, status);
-	STATS_INC(b->stats.unlock_fail);
+	STATS_INC(b->stats.unlock_fail[is_2m_pages]);
 	return false;
 }
 
+static struct page *vmballoon_alloc_page(gfp_t flags, bool is_2m_page)
+{
+	if (is_2m_page)
+		return alloc_pages(flags, VMW_BALLOON_2M_SHIFT);
+
+	return alloc_page(flags);
+}
+
+static void vmballoon_free_page(struct page *page, bool is_2m_page)
+{
+	if (is_2m_page)
+		__free_pages(page, VMW_BALLOON_2M_SHIFT);
+	else
+		__free_page(page);
+}
+
 /*
  * Quickly release all pages allocated for the balloon. This function is
  * called when host decides to "reset" balloon for one reason or another.
@@ -486,13 +547,21 @@ static bool vmballoon_send_batched_unlock(struct vmballoon *b,
 static void vmballoon_pop(struct vmballoon *b)
 {
 	struct page *page, *next;
-
-	list_for_each_entry_safe(page, next, &b->pages, lru) {
-		list_del(&page->lru);
-		__free_page(page);
-		STATS_INC(b->stats.free);
-		b->size--;
-		cond_resched();
+	unsigned is_2m_pages;
+
+	for (is_2m_pages = 0; is_2m_pages < VMW_BALLOON_NUM_PAGE_SIZES;
+			is_2m_pages++) {
+		struct vmballoon_page_size *page_size =
+				&b->page_sizes[is_2m_pages];
+		u16 size_per_page = vmballoon_page_size(is_2m_pages);
+
+		list_for_each_entry_safe(page, next, &page_size->pages, lru) {
+			list_del(&page->lru);
+			vmballoon_free_page(page, is_2m_pages);
+			STATS_INC(b->stats.free[is_2m_pages]);
+			b->size -= size_per_page;
+			cond_resched();
+		}
 	}
 
 	if ((b->capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
@@ -510,19 +579,22 @@ static void vmballoon_pop(struct vmballoon *b)
  * inflation cycle.
  */
 static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages,
-							unsigned int *target)
+				bool is_2m_pages, unsigned int *target)
 {
 	int locked, hv_status;
 	struct page *page = b->page;
+	struct vmballoon_page_size *page_size = &b->page_sizes[false];
+
+	/* is_2m_pages can never happen as 2m pages support implies batching */
 
 	locked = vmballoon_send_lock_page(b, page_to_pfn(page), &hv_status,
 								target);
 	if (locked > 0) {
-		STATS_INC(b->stats.refused_alloc);
+		STATS_INC(b->stats.refused_alloc[false]);
 
 		if (hv_status == VMW_BALLOON_ERROR_RESET ||
 				hv_status == VMW_BALLOON_ERROR_PPN_NOTNEEDED) {
-			__free_page(page);
+			vmballoon_free_page(page, false);
 			return -EIO;
 		}
 
@@ -531,17 +603,17 @@ static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages,
 		 * and retry allocation, unless we already accumulated
 		 * too many of them, in which case take a breather.
 		 */
-		if (b->n_refused_pages < VMW_BALLOON_MAX_REFUSED) {
-			b->n_refused_pages++;
-			list_add(&page->lru, &b->refused_pages);
+		if (page_size->n_refused_pages < VMW_BALLOON_MAX_REFUSED) {
+			page_size->n_refused_pages++;
+			list_add(&page->lru, &page_size->refused_pages);
 		} else {
-			__free_page(page);
+			vmballoon_free_page(page, false);
 		}
 		return -EIO;
 	}
 
 	/* track allocated page */
-	list_add(&page->lru, &b->pages);
+	list_add(&page->lru, &page_size->pages);
 
 	/* update balloon size */
 	b->size++;
@@ -550,17 +622,19 @@ static int vmballoon_lock_page(struct vmballoon *b, unsigned int num_pages,
 }
 
 static int vmballoon_lock_batched_page(struct vmballoon *b,
-				unsigned int num_pages, unsigned int *target)
+		unsigned int num_pages, bool is_2m_pages, unsigned int *target)
 {
 	int locked, i;
+	u16 size_per_page = vmballoon_page_size(is_2m_pages);
 
-	locked = vmballoon_send_batched_lock(b, num_pages, target);
+	locked = vmballoon_send_batched_lock(b, num_pages, is_2m_pages,
+			target);
 	if (locked > 0) {
 		for (i = 0; i < num_pages; i++) {
 			u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
 			struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
 
-			__free_page(p);
+			vmballoon_free_page(p, is_2m_pages);
 		}
 
 		return -EIO;
@@ -569,25 +643,28 @@ static int vmballoon_lock_batched_page(struct vmballoon *b,
 	for (i = 0; i < num_pages; i++) {
 		u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
 		struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
+		struct vmballoon_page_size *page_size =
+				&b->page_sizes[is_2m_pages];
 
 		locked = vmballoon_batch_get_status(b->batch_page, i);
 
 		switch (locked) {
 		case VMW_BALLOON_SUCCESS:
-			list_add(&p->lru, &b->pages);
-			b->size++;
+			list_add(&p->lru, &page_size->pages);
+			b->size += size_per_page;
 			break;
 		case VMW_BALLOON_ERROR_PPN_PINNED:
 		case VMW_BALLOON_ERROR_PPN_INVALID:
-			if (b->n_refused_pages < VMW_BALLOON_MAX_REFUSED) {
-				list_add(&p->lru, &b->refused_pages);
-				b->n_refused_pages++;
+			if (page_size->n_refused_pages
+					< VMW_BALLOON_MAX_REFUSED) {
+				list_add(&p->lru, &page_size->refused_pages);
+				page_size->n_refused_pages++;
 				break;
 			}
 			/* Fallthrough */
 		case VMW_BALLOON_ERROR_RESET:
 		case VMW_BALLOON_ERROR_PPN_NOTNEEDED:
-			__free_page(p);
+			vmballoon_free_page(p, is_2m_pages);
 			break;
 		default:
 			/* This should never happen */
@@ -604,18 +681,21 @@ static int vmballoon_lock_batched_page(struct vmballoon *b,
  * to use, if needed.
  */
 static int vmballoon_unlock_page(struct vmballoon *b, unsigned int num_pages,
-							unsigned int *target)
+		bool is_2m_pages, unsigned int *target)
 {
 	struct page *page = b->page;
+	struct vmballoon_page_size *page_size = &b->page_sizes[false];
+
+	/* is_2m_pages can never happen as 2m pages support implies batching */
 
 	if (!vmballoon_send_unlock_page(b, page_to_pfn(page), target)) {
-		list_add(&page->lru, &b->pages);
+		list_add(&page->lru, &page_size->pages);
 		return -EIO;
 	}
 
 	/* deallocate page */
-	__free_page(page);
-	STATS_INC(b->stats.free);
+	vmballoon_free_page(page, false);
+	STATS_INC(b->stats.free[false]);
 
 	/* update balloon size */
 	b->size--;
@@ -624,18 +704,23 @@ static int vmballoon_unlock_page(struct vmballoon *b, unsigned int num_pages,
 }
 
 static int vmballoon_unlock_batched_page(struct vmballoon *b,
-				unsigned int num_pages, unsigned int *target)
+				unsigned int num_pages, bool is_2m_pages,
+				unsigned int *target)
 {
 	int locked, i, ret = 0;
 	bool hv_success;
+	u16 size_per_page = vmballoon_page_size(is_2m_pages);
 
-	hv_success = vmballoon_send_batched_unlock(b, num_pages, target);
+	hv_success = vmballoon_send_batched_unlock(b, num_pages, is_2m_pages,
+			target);
 	if (!hv_success)
 		ret = -EIO;
 
 	for (i = 0; i < num_pages; i++) {
 		u64 pa = vmballoon_batch_get_pa(b->batch_page, i);
 		struct page *p = pfn_to_page(pa >> PAGE_SHIFT);
+		struct vmballoon_page_size *page_size =
+				&b->page_sizes[is_2m_pages];
 
 		locked = vmballoon_batch_get_status(b->batch_page, i);
 		if (!hv_success || locked != VMW_BALLOON_SUCCESS) {
@@ -644,14 +729,14 @@ static int vmballoon_unlock_batched_page(struct vmballoon *b,
 			 * hypervisor, re-add it to the list of pages owned by
 			 * the balloon driver.
 			 */
-			list_add(&p->lru, &b->pages);
+			list_add(&p->lru, &page_size->pages);
 		} else {
 			/* deallocate page */
-			__free_page(p);
-			STATS_INC(b->stats.free);
+			vmballoon_free_page(p, is_2m_pages);
+			STATS_INC(b->stats.free[is_2m_pages]);
 
 			/* update balloon size */
-			b->size--;
+			b->size -= size_per_page;
 		}
 	}
 
@@ -662,17 +747,20 @@ static int vmballoon_unlock_batched_page(struct vmballoon *b,
  * Release pages that were allocated while attempting to inflate the
  * balloon but were refused by the host for one reason or another.
  */
-static void vmballoon_release_refused_pages(struct vmballoon *b)
+static void vmballoon_release_refused_pages(struct vmballoon *b,
+		bool is_2m_pages)
 {
 	struct page *page, *next;
+	struct vmballoon_page_size *page_size =
+			&b->page_sizes[is_2m_pages];
 
-	list_for_each_entry_safe(page, next, &b->refused_pages, lru) {
+	list_for_each_entry_safe(page, next, &page_size->refused_pages, lru) {
 		list_del(&page->lru);
-		__free_page(page);
-		STATS_INC(b->stats.refused_free);
+		vmballoon_free_page(page, is_2m_pages);
+		STATS_INC(b->stats.refused_free[is_2m_pages]);
 	}
 
-	b->n_refused_pages = 0;
+	page_size->n_refused_pages = 0;
 }
 
 static void vmballoon_add_page(struct vmballoon *b, int idx, struct page *p)
@@ -699,6 +787,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 	unsigned int num_pages = 0;
 	int error = 0;
 	gfp_t flags = VMW_PAGE_ALLOC_NOSLEEP;
+	bool is_2m_pages;
 
 	pr_debug("%s - size: %d, target %d\n", __func__, b->size, b->target);
 
@@ -721,22 +810,46 @@ static void vmballoon_inflate(struct vmballoon *b)
 	 * Start with no sleep allocation rate which may be higher
 	 * than sleeping allocation rate.
 	 */
-	rate = b->slow_allocation_cycles ? b->rate_alloc : UINT_MAX;
+	if (b->slow_allocation_cycles) {
+		rate = b->rate_alloc;
+		is_2m_pages = false;
+	} else {
+		rate = UINT_MAX;
+		is_2m_pages =
+			b->supported_page_sizes == VMW_BALLOON_NUM_PAGE_SIZES;
+	}
 
 	pr_debug("%s - goal: %d, no-sleep rate: %u, sleep rate: %d\n",
 		 __func__, b->target - b->size, rate, b->rate_alloc);
 
 	while (!b->reset_required &&
-		b->size < b->target && num_pages < b->target - b->size) {
+		b->size + num_pages * vmballoon_page_size(is_2m_pages)
+		< b->target) {
 		struct page *page;
 
 		if (flags == VMW_PAGE_ALLOC_NOSLEEP)
-			STATS_INC(b->stats.alloc);
+			STATS_INC(b->stats.alloc[is_2m_pages]);
 		else
 			STATS_INC(b->stats.sleep_alloc);
 
-		page = alloc_page(flags);
+		page = vmballoon_alloc_page(flags, is_2m_pages);
 		if (!page) {
+			STATS_INC(b->stats.alloc_fail[is_2m_pages]);
+
+			if (is_2m_pages) {
+				b->ops->lock(b, num_pages, true, &b->target);
+
+				/*
+				 * ignore errors from locking as we now switch
+				 * to 4k pages and we might get different
+				 * errors.
+				 */
+
+				num_pages = 0;
+				is_2m_pages = false;
+				continue;
+			}
+
 			if (flags == VMW_PAGE_ALLOC_CANSLEEP) {
 				/*
 				 * CANSLEEP page allocation failed, so guest
@@ -748,7 +861,6 @@ static void vmballoon_inflate(struct vmballoon *b)
 				STATS_INC(b->stats.sleep_alloc_fail);
 				break;
 			}
-			STATS_INC(b->stats.alloc_fail);
 
 			/*
 			 * NOSLEEP page allocation failed, so the guest is
@@ -771,7 +883,8 @@ static void vmballoon_inflate(struct vmballoon *b)
 
 		b->ops->add_page(b, num_pages++, page);
 		if (num_pages == b->batch_max_pages) {
-			error = b->ops->lock(b, num_pages, &b->target);
+			error = b->ops->lock(b, num_pages, is_2m_pages,
+					&b->target);
 			num_pages = 0;
 			if (error)
 				break;
@@ -786,7 +899,7 @@ static void vmballoon_inflate(struct vmballoon *b)
 	}
 
 	if (num_pages > 0)
-		b->ops->lock(b, num_pages, &b->target);
+		b->ops->lock(b, num_pages, is_2m_pages, &b->target);
 
 	/*
 	 * We reached our goal without failures so try increasing
@@ -800,7 +913,8 @@ static void vmballoon_inflate(struct vmballoon *b)
 			    VMW_BALLOON_RATE_ALLOC_MAX);
 	}
 
-	vmballoon_release_refused_pages(b);
+	vmballoon_release_refused_pages(b, true);
+	vmballoon_release_refused_pages(b, false);
 }
 
 /*
@@ -808,34 +922,45 @@ static void vmballoon_inflate(struct vmballoon *b)
  */
 static void vmballoon_deflate(struct vmballoon *b)
 {
-	struct page *page, *next;
-	unsigned int i = 0;
-	unsigned int num_pages = 0;
-	int error;
+	unsigned is_2m_pages;
 
 	pr_debug("%s - size: %d, target %d\n", __func__, b->size, b->target);
 
 	/* free pages to reach target */
-	list_for_each_entry_safe(page, next, &b->pages, lru) {
-		list_del(&page->lru);
-		b->ops->add_page(b, num_pages++, page);
+	for (is_2m_pages = 0; is_2m_pages < b->supported_page_sizes;
+			is_2m_pages++) {
+		struct page *page, *next;
+		unsigned int num_pages = 0;
+		struct vmballoon_page_size *page_size =
+				&b->page_sizes[is_2m_pages];
+
+		list_for_each_entry_safe(page, next, &page_size->pages, lru) {
+			if (b->reset_required ||
+				(b->target > 0 &&
+					b->size - num_pages
+					* vmballoon_page_size(is_2m_pages)
+				< b->target + vmballoon_page_size(true)))
+				break;
 
+			list_del(&page->lru);
+			b->ops->add_page(b, num_pages++, page);
 
-		if (num_pages == b->batch_max_pages) {
-			error = b->ops->unlock(b, num_pages, &b->target);
-			num_pages = 0;
-			if (error)
-				return;
-		}
+			if (num_pages == b->batch_max_pages) {
+				int error;
 
-		if (b->reset_required || ++i >= b->size - b->target)
-			break;
+				error = b->ops->unlock(b, num_pages,
+						is_2m_pages, &b->target);
+				num_pages = 0;
+				if (error)
+					return;
+			}
 
-		cond_resched();
-	}
+			cond_resched();
+		}
 
-	if (num_pages > 0)
-		b->ops->unlock(b, num_pages, &b->target);
+		if (num_pages > 0)
+			b->ops->unlock(b, num_pages, is_2m_pages, &b->target);
+	}
 }
 
 static const struct vmballoon_ops vmballoon_basic_ops = {
@@ -925,7 +1050,8 @@ static void vmballoon_work(struct work_struct *work)
 
 		if (b->size < target)
 			vmballoon_inflate(b);
-		else if (b->size > target)
+		else if (target == 0 ||
+				b->size > target + vmballoon_page_size(true))
 			vmballoon_deflate(b);
 	}
 
@@ -969,24 +1095,35 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 		   "timer:              %8u\n"
 		   "start:              %8u (%4u failed)\n"
 		   "guestType:          %8u (%4u failed)\n"
+		   "2m-lock:            %8u (%4u failed)\n"
 		   "lock:               %8u (%4u failed)\n"
+		   "2m-unlock:          %8u (%4u failed)\n"
 		   "unlock:             %8u (%4u failed)\n"
 		   "target:             %8u (%4u failed)\n"
+		   "prim2mAlloc:        %8u (%4u failed)\n"
 		   "primNoSleepAlloc:   %8u (%4u failed)\n"
 		   "primCanSleepAlloc:  %8u (%4u failed)\n"
+		   "prim2mFree:         %8u\n"
 		   "primFree:           %8u\n"
+		   "err2mAlloc:         %8u\n"
 		   "errAlloc:           %8u\n"
+		   "err2mFree:          %8u\n"
 		   "errFree:            %8u\n",
 		   stats->timer,
 		   stats->start, stats->start_fail,
 		   stats->guest_type, stats->guest_type_fail,
-		   stats->lock,  stats->lock_fail,
-		   stats->unlock, stats->unlock_fail,
+		   stats->lock[true],  stats->lock_fail[true],
+		   stats->lock[false],  stats->lock_fail[false],
+		   stats->unlock[true], stats->unlock_fail[true],
+		   stats->unlock[false], stats->unlock_fail[false],
 		   stats->target, stats->target_fail,
-		   stats->alloc, stats->alloc_fail,
+		   stats->alloc[true], stats->alloc_fail[true],
+		   stats->alloc[false], stats->alloc_fail[false],
 		   stats->sleep_alloc, stats->sleep_alloc_fail,
-		   stats->free,
-		   stats->refused_alloc, stats->refused_free);
+		   stats->free[true],
+		   stats->free[false],
+		   stats->refused_alloc[true], stats->refused_alloc[false],
+		   stats->refused_free[true], stats->refused_free[false]);
 
 	return 0;
 }
@@ -1040,7 +1177,7 @@ static inline void vmballoon_debugfs_exit(struct vmballoon *b)
 static int __init vmballoon_init(void)
 {
 	int error;
-
+	unsigned is_2m_pages;
 	/*
 	 * Check if we are running on VMware's hypervisor and bail out
 	 * if we are not.
@@ -1048,8 +1185,11 @@ static int __init vmballoon_init(void)
 	if (x86_hyper != &x86_hyper_vmware)
 		return -ENODEV;
 
-	INIT_LIST_HEAD(&balloon.pages);
-	INIT_LIST_HEAD(&balloon.refused_pages);
+	for (is_2m_pages = 0; is_2m_pages < VMW_BALLOON_NUM_PAGE_SIZES;
+			is_2m_pages++) {
+		INIT_LIST_HEAD(&balloon.page_sizes[is_2m_pages].pages);
+		INIT_LIST_HEAD(&balloon.page_sizes[is_2m_pages].refused_pages);
+	}
 
 	/* initialize rates */
 	balloon.rate_alloc = VMW_BALLOON_RATE_ALLOC_MAX;
-- 
2.4.3



* [PATCH v5 6/7] VMware balloon: Treat init like reset
  2015-08-06 21:07                               ` Greg KH
                                                   ` (5 preceding siblings ...)
  2015-08-06 22:18                                 ` [PATCH v5 5/7] VMware balloon: Support 2m page ballooning Philip P. Moltmann
@ 2015-08-06 22:18                                 ` Philip P. Moltmann
  2015-08-06 22:18                                 ` [PATCH v5 7/7] VMware balloon: Enable notification via VMCI Philip P. Moltmann
  7 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-08-06 22:18 UTC (permalink / raw)
  To: gregkh
  Cc: Philip P. Moltmann, dmitry.torokhov, linux-kernel, xdeguillard,
	akpm, pv-drivers

Unify the behavior of the balloon's first start with that of a reset. Also,
on unload, declare that the balloon driver no longer has any capabilities.

Acked-by: Andy King <acking@vmware.com>
Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
---
 drivers/misc/vmw_balloon.c | 53 ++++++++++++++++------------------------------
 1 file changed, 18 insertions(+), 35 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 01519ff..28fe9e5 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -47,7 +47,7 @@
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.4.0.0-k");
+MODULE_VERSION("1.4.1.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -564,12 +564,14 @@ static void vmballoon_pop(struct vmballoon *b)
 		}
 	}
 
-	if ((b->capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
-		if (b->batch_page)
-			vunmap(b->batch_page);
+	if (b->batch_page) {
+		vunmap(b->batch_page);
+		b->batch_page = NULL;
+	}
 
-		if (b->page)
-			__free_page(b->page);
+	if (b->page) {
+		__free_page(b->page);
+		b->page = NULL;
 	}
 }
 
@@ -1044,7 +1046,7 @@ static void vmballoon_work(struct work_struct *work)
 	if (b->slow_allocation_cycles > 0)
 		b->slow_allocation_cycles--;
 
-	if (vmballoon_send_get_target(b, &target)) {
+	if (!b->reset_required && vmballoon_send_get_target(b, &target)) {
 		/* update target, adjust size */
 		b->target = target;
 
@@ -1076,8 +1078,10 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 	/* format capabilities info */
 	seq_printf(f,
 		   "balloon capabilities:   %#4x\n"
-		   "used capabilities:      %#4lx\n",
-		   VMW_BALLOON_CAPABILITIES, b->capabilities);
+		   "used capabilities:      %#4lx\n"
+		   "is resetting:           %c\n",
+		   VMW_BALLOON_CAPABILITIES, b->capabilities,
+		   b->reset_required ? 'y' : 'n');
 
 	/* format size info */
 	seq_printf(f,
@@ -1196,35 +1200,14 @@ static int __init vmballoon_init(void)
 
 	INIT_DELAYED_WORK(&balloon.dwork, vmballoon_work);
 
-	/*
-	 * Start balloon.
-	 */
-	if (!vmballoon_send_start(&balloon, VMW_BALLOON_CAPABILITIES)) {
-		pr_err("failed to send start command to the host\n");
-		return -EIO;
-	}
-
-	if ((balloon.capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) {
-		balloon.ops = &vmballoon_batched_ops;
-		balloon.batch_max_pages = VMW_BALLOON_BATCH_MAX_PAGES;
-		if (!vmballoon_init_batching(&balloon)) {
-			pr_err("failed to init batching\n");
-			return -EIO;
-		}
-	} else if ((balloon.capabilities & VMW_BALLOON_BASIC_CMDS) != 0) {
-		balloon.ops = &vmballoon_basic_ops;
-		balloon.batch_max_pages = 1;
-	}
-
-	if (!vmballoon_send_guest_id(&balloon)) {
-		pr_err("failed to send guest ID to the host\n");
-		return -EIO;
-	}
-
 	error = vmballoon_debugfs_init(&balloon);
 	if (error)
 		return error;
 
+	balloon.batch_page = NULL;
+	balloon.page = NULL;
+	balloon.reset_required = true;
+
 	queue_delayed_work(system_freezable_wq, &balloon.dwork, 0);
 
 	return 0;
@@ -1242,7 +1225,7 @@ static void __exit vmballoon_exit(void)
 	 * Reset connection before deallocating memory to avoid potential for
 	 * additional spurious resets from guest touching deallocated pages.
 	 */
-	vmballoon_send_start(&balloon, VMW_BALLOON_CAPABILITIES);
+	vmballoon_send_start(&balloon, 0);
 	vmballoon_pop(&balloon);
 }
 module_exit(vmballoon_exit);
-- 
2.4.3



* [PATCH v5 7/7] VMware balloon: Enable notification via VMCI
  2015-08-06 21:07                               ` Greg KH
                                                   ` (6 preceding siblings ...)
  2015-08-06 22:18                                 ` [PATCH v5 6/7] VMware balloon: Treat init like reset Philip P. Moltmann
@ 2015-08-06 22:18                                 ` Philip P. Moltmann
  7 siblings, 0 replies; 61+ messages in thread
From: Philip P. Moltmann @ 2015-08-06 22:18 UTC (permalink / raw)
  To: gregkh
  Cc: Philip P. Moltmann, dmitry.torokhov, linux-kernel, xdeguillard,
	akpm, pv-drivers

Get notified immediately when a balloon target is set, instead of waiting for
up to one second.

The up-to-one-second gap could be long enough to cause swapping inside the
VM that receives the memory.

Acked-by: Andy King <acking@vmware.com>
Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com>
Tested-by: Siva Sankar Reddy B <sankars@vmware.com>
---
 drivers/misc/Kconfig       |   2 +-
 drivers/misc/vmw_balloon.c | 105 +++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 97 insertions(+), 10 deletions(-)

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index ccccc29..22892c7 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -414,7 +414,7 @@ config TI_DAC7512
 
 config VMWARE_BALLOON
 	tristate "VMware Balloon Driver"
-	depends on X86 && HYPERVISOR_GUEST
+	depends on VMWARE_VMCI && X86 && HYPERVISOR_GUEST
 	help
 	  This is VMware physical memory management driver which acts
 	  like a "balloon" that can be inflated to reclaim physical pages
diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 28fe9e5..8930087 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -1,7 +1,7 @@
 /*
  * VMware Balloon driver.
  *
- * Copyright (C) 2000-2013, VMware, Inc. All Rights Reserved.
+ * Copyright (C) 2000-2014, VMware, Inc. All Rights Reserved.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms of the GNU General Public License as published by the
@@ -43,11 +43,13 @@
 #include <linux/workqueue.h>
 #include <linux/debugfs.h>
 #include <linux/seq_file.h>
+#include <linux/vmw_vmci_defs.h>
+#include <linux/vmw_vmci_api.h>
 #include <asm/hypervisor.h>
 
 MODULE_AUTHOR("VMware, Inc.");
 MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver");
-MODULE_VERSION("1.4.1.0-k");
+MODULE_VERSION("1.5.0.0-k");
 MODULE_ALIAS("dmi:*:svnVMware*:*");
 MODULE_ALIAS("vmware_vmmemctl");
 MODULE_LICENSE("GPL");
@@ -101,14 +103,16 @@ enum vmwballoon_capabilities {
 	/*
 	 * Bit 0 is reserved and not associated to any capability.
 	 */
-	VMW_BALLOON_BASIC_CMDS		= (1 << 1),
-	VMW_BALLOON_BATCHED_CMDS	= (1 << 2),
-	VMW_BALLOON_BATCHED_2M_CMDS     = (1 << 3),
+	VMW_BALLOON_BASIC_CMDS			= (1 << 1),
+	VMW_BALLOON_BATCHED_CMDS		= (1 << 2),
+	VMW_BALLOON_BATCHED_2M_CMDS		= (1 << 3),
+	VMW_BALLOON_SIGNALLED_WAKEUP_CMD	= (1 << 4),
 };
 
 #define VMW_BALLOON_CAPABILITIES	(VMW_BALLOON_BASIC_CMDS \
 					| VMW_BALLOON_BATCHED_CMDS \
-					| VMW_BALLOON_BATCHED_2M_CMDS)
+					| VMW_BALLOON_BATCHED_2M_CMDS \
+					| VMW_BALLOON_SIGNALLED_WAKEUP_CMD)
 
 #define VMW_BALLOON_2M_SHIFT		(9)
 #define VMW_BALLOON_NUM_PAGE_SIZES	(2)
@@ -123,7 +127,9 @@ enum vmwballoon_capabilities {
  * VMW_BALLOON_BATCHED_CMDS:
  *	BATCHED_LOCK and BATCHED_UNLOCK commands.
  * VMW BALLOON_BATCHED_2M_CMDS:
- *	BATCHED_2M_LOCK and BATCHED_2M_UNLOCK commands.
+ *	BATCHED_2M_LOCK and BATCHED_2M_UNLOCK commands,
+ * VMW VMW_BALLOON_SIGNALLED_WAKEUP_CMD:
+ *	VMW_BALLOON_CMD_VMCI_DOORBELL_SET command.
  */
 #define VMW_BALLOON_CMD_START			0
 #define VMW_BALLOON_CMD_GET_TARGET		1
@@ -134,6 +140,7 @@ enum vmwballoon_capabilities {
 #define VMW_BALLOON_CMD_BATCHED_UNLOCK		7
 #define VMW_BALLOON_CMD_BATCHED_2M_LOCK		8
 #define VMW_BALLOON_CMD_BATCHED_2M_UNLOCK	9
+#define VMW_BALLOON_CMD_VMCI_DOORBELL_SET	10
 
 
 /* error codes */
@@ -214,6 +221,7 @@ static void vmballoon_batch_set_pa(struct vmballoon_batch_page *batch, int idx,
 #ifdef CONFIG_DEBUG_FS
 struct vmballoon_stats {
 	unsigned int timer;
+	unsigned int doorbell;
 
 	/* allocation statistics */
 	unsigned int alloc[VMW_BALLOON_NUM_PAGE_SIZES];
@@ -235,6 +243,8 @@ struct vmballoon_stats {
 	unsigned int start_fail;
 	unsigned int guest_type;
 	unsigned int guest_type_fail;
+	unsigned int doorbell_set;
+	unsigned int doorbell_unset;
 };
 
 #define STATS_INC(stat) (stat)++
@@ -299,6 +309,8 @@ struct vmballoon {
 	struct sysinfo sysinfo;
 
 	struct delayed_work dwork;
+
+	struct vmci_handle vmci_doorbell;
 };
 
 static struct vmballoon balloon;
@@ -993,12 +1005,75 @@ static bool vmballoon_init_batching(struct vmballoon *b)
 }
 
 /*
+ * Receive notification and resize balloon
+ */
+static void vmballoon_doorbell(void *client_data)
+{
+	struct vmballoon *b = client_data;
+
+	STATS_INC(b->stats.doorbell);
+
+	mod_delayed_work(system_freezable_wq, &b->dwork, 0);
+}
+
+/*
+ * Clean up vmci doorbell
+ */
+static void vmballoon_vmci_cleanup(struct vmballoon *b)
+{
+	int error;
+
+	VMWARE_BALLOON_CMD(VMCI_DOORBELL_SET, VMCI_INVALID_ID,
+			VMCI_INVALID_ID, error);
+	STATS_INC(b->stats.doorbell_unset);
+
+	if (!vmci_handle_is_invalid(b->vmci_doorbell)) {
+		vmci_doorbell_destroy(b->vmci_doorbell);
+		b->vmci_doorbell = VMCI_INVALID_HANDLE;
+	}
+}
+
+/*
+ * Initialize vmci doorbell, to get notified as soon as balloon changes
+ */
+static int vmballoon_vmci_init(struct vmballoon *b)
+{
+	int error = 0;
+
+	if ((b->capabilities & VMW_BALLOON_SIGNALLED_WAKEUP_CMD) != 0) {
+		error = vmci_doorbell_create(&b->vmci_doorbell,
+				VMCI_FLAG_DELAYED_CB,
+				VMCI_PRIVILEGE_FLAG_RESTRICTED,
+				vmballoon_doorbell, b);
+
+		if (error == VMCI_SUCCESS) {
+			VMWARE_BALLOON_CMD(VMCI_DOORBELL_SET,
+					b->vmci_doorbell.context,
+					b->vmci_doorbell.resource, error);
+			STATS_INC(b->stats.doorbell_set);
+		}
+	}
+
+	if (error != 0) {
+		vmballoon_vmci_cleanup(b);
+
+		return -EIO;
+	}
+
+	return 0;
+}
+
+/*
  * Perform standard reset sequence by popping the balloon (in case it
  * is not  empty) and then restarting protocol. This operation normally
  * happens when host responds with VMW_BALLOON_ERROR_RESET to a command.
  */
 static void vmballoon_reset(struct vmballoon *b)
 {
+	int error;
+
+	vmballoon_vmci_cleanup(b);
+
 	/* free all pages, skipping monitor unlock */
 	vmballoon_pop(b);
 
@@ -1024,6 +1099,11 @@ static void vmballoon_reset(struct vmballoon *b)
 	}
 
 	b->reset_required = false;
+
+	error = vmballoon_vmci_init(b);
+	if (error)
+		pr_err("failed to initialize vmci doorbell\n");
+
 	if (!vmballoon_send_guest_id(b))
 		pr_err("failed to send guest ID to the host\n");
 }
@@ -1097,6 +1177,7 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 	seq_printf(f,
 		   "\n"
 		   "timer:              %8u\n"
+		   "doorbell:           %8u\n"
 		   "start:              %8u (%4u failed)\n"
 		   "guestType:          %8u (%4u failed)\n"
 		   "2m-lock:            %8u (%4u failed)\n"
@@ -1112,8 +1193,11 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 		   "err2mAlloc:         %8u\n"
 		   "errAlloc:           %8u\n"
 		   "err2mFree:          %8u\n"
-		   "errFree:            %8u\n",
+		   "errFree:            %8u\n"
+		   "doorbellSet:        %8u\n"
+		   "doorbellUnset:      %8u\n",
 		   stats->timer,
+		   stats->doorbell,
 		   stats->start, stats->start_fail,
 		   stats->guest_type, stats->guest_type_fail,
 		   stats->lock[true],  stats->lock_fail[true],
@@ -1127,7 +1211,8 @@ static int vmballoon_debug_show(struct seq_file *f, void *offset)
 		   stats->free[true],
 		   stats->free[false],
 		   stats->refused_alloc[true], stats->refused_alloc[false],
-		   stats->refused_free[true], stats->refused_free[false]);
+		   stats->refused_free[true], stats->refused_free[false],
+		   stats->doorbell_set, stats->doorbell_unset);
 
 	return 0;
 }
@@ -1204,6 +1289,7 @@ static int __init vmballoon_init(void)
 	if (error)
 		return error;
 
+	balloon.vmci_doorbell = VMCI_INVALID_HANDLE;
 	balloon.batch_page = NULL;
 	balloon.page = NULL;
 	balloon.reset_required = true;
@@ -1216,6 +1302,7 @@ module_init(vmballoon_init);
 
 static void __exit vmballoon_exit(void)
 {
+	vmballoon_vmci_cleanup(&balloon);
 	cancel_delayed_work_sync(&balloon.dwork);
 
 	vmballoon_debugfs_exit(&balloon);
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 0/7]  Fifth revision of the performance improvement patch to the VMware balloon driver
  2015-08-06 22:17                                 ` [PATCH v5 0/7] Fifth revision of the performance improvement patch to the VMware balloon driver Philip P. Moltmann
@ 2015-08-14 23:27                                   ` Philip Moltmann
  0 siblings, 0 replies; 61+ messages in thread
From: Philip Moltmann @ 2015-08-14 23:27 UTC (permalink / raw)
  To: gregkh; +Cc: dmitry.torokhov, linux-kernel, Xavier Deguillard, akpm, pv-drivers

Hi,

will v5 1/7 - 7/7 be considered for Linux 4.2?

Philip
________________________________________
From: Philip P. Moltmann <moltmann@vmware.com>
Sent: Thursday, August 6, 2015 3:17 PM
To: gregkh@linuxfoundation.org
Cc: Philip Moltmann; dmitry.torokhov@gmail.com; linux-kernel@vger.kernel.org; Xavier Deguillard; akpm@linux-foundation.org; pv-drivers@vmware.com
Subject: [PATCH v5 0/7]  Fifth revision of the performance improvement patch to the VMware balloon driver

This is the fifth revision of the patch to the VMware balloon driver. The original
was sent to linux-kernel@vger.kernel.org on 4/14/15 10:29 am PST. Please refer to
the original change for an overview.

v1:
- Initial implementation
v2:
- Address suggestions by Dmitry Torokhov
  - Use UINT_MAX as "infinite" rate instead of special-casing -1
v3:
- Change commit comment for step 6 to better explain what impact ballooning has
  on the VM performance.
v4:
- Add missing include header <linux/vmalloc.h> in step 3
v5:
- Moved from git/torvalds/linux master to git/gregkh/char-misc char-misc-next
  (4816693286d4ff9219b1cc72c2ab9c589448ebcb) and removed the already applied steps
  1/9 and 2/9. This changes the numbering of the patches: 3/9 is now 1/7, 4/9 is
  now 2/7, etc. There were no merge issues during this transition.

  Testing done: recompiled, installed, rebooted, and made sure ballooning still
                works

Thanks
Philip

Philip P. Moltmann (5):
  VMware balloon: Show capabilities of balloon and resulting
    capabilities in the debug-fs node.
  VMware balloon: Do not limit the amount of frees and allocations in
    non-sleep mode.
  VMware balloon: Support 2m page ballooning.
  VMware balloon: Treat init like reset
  VMware balloon: Enable notification via VMCI

Xavier Deguillard (2):
  VMware balloon: add batching to the vmw_balloon.
  VMware balloon: Update balloon target on each lock/unlock.

 drivers/misc/Kconfig       |   2 +-
 drivers/misc/vmw_balloon.c | 843 +++++++++++++++++++++++++++++++++++----------
 2 files changed, 662 insertions(+), 183 deletions(-)

--
2.4.3


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 5/7] VMware balloon: Support 2m page ballooning.
  2015-08-06 22:18                                 ` [PATCH v5 5/7] VMware balloon: Support 2m page ballooning Philip P. Moltmann
@ 2015-08-21 14:01                                   ` Kamalneet Singh
  0 siblings, 0 replies; 61+ messages in thread
From: Kamalneet Singh @ 2015-08-21 14:01 UTC (permalink / raw)
  To: linux-kernel

Philip P. Moltmann <moltmann <at> vmware.com> writes:

...
> Use 2 MB pages as the hypervisor is always 64-bit and 2 MB is the smallest
> supported super-page size.
> 
> The code has to run on older versions of ESX, and old balloon drivers run on
> newer versions of ESX. Hence match the capabilities with the host before 2m
> page ballooning can be enabled.

I applied your patchset. Is there any hardware requirement for 2M ballooning?
Or any ESXi host config? My ESXi 6 installation reports the following
capabilities, which do not include the 2M balloon size.

balloon capabilities:   0x1e
used capabilities:       0x6

Thanks
Kamal


^ permalink raw reply	[flat|nested] 61+ messages in thread

end of thread, other threads:[~2015-08-21 14:05 UTC | newest]

Thread overview: 61+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-04-14 17:29 [PATCH 0/9] VMware balloon: Large page ballooning and VMCI support Philip P. Moltmann
2015-04-14 17:29 ` [PATCH 1/9] VMware balloon: partially inline vmballoon_reserve_page Philip P. Moltmann
2015-04-14 17:29 ` [PATCH 2/9] VMware balloon: Add support for balloon capabilities Philip P. Moltmann
2015-04-14 17:29 ` [PATCH 3/9] VMware balloon: add batching to the vmw_balloon Philip P. Moltmann
2015-04-14 17:29 ` [PATCH 4/9] VMware balloon: Update balloon target on each lock/unlock Philip P. Moltmann
2015-04-14 17:29 ` [PATCH 5/9] VMware balloon: Show capabilities of balloon and resulting capabilities in the debug-fs node Philip P. Moltmann
2015-04-14 17:29 ` [PATCH 5/9] VMware balloon: Show capabilities or " Philip P. Moltmann
2015-04-14 17:29 ` [PATCH 6/9] VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode Philip P. Moltmann
2015-04-16 20:55   ` Dmitry Torokhov
2015-06-11 20:10     ` Philip Moltmann
2015-06-12 11:20       ` dmitry.torokhov
2015-06-12 15:06         ` Philip Moltmann
2015-06-12 15:31           ` dmitry.torokhov
2015-06-12 15:40             ` Philip Moltmann
2015-06-12 16:15               ` dmitry.torokhov
2015-06-12 18:43                 ` [PATCH v3 0/9] Third revision of the performance improvment patch to the VMWare balloon driver Philip P. Moltmann
2015-06-12 18:43                 ` [PATCH v3 1/9] VMware balloon: partially inline vmballoon_reserve_page Philip P. Moltmann
2015-06-12 18:43                 ` [PATCH v3 2/9] VMware balloon: Add support for balloon capabilities Philip P. Moltmann
2015-06-12 18:43                 ` [PATCH v3 3/9] VMware balloon: add batching to the vmw_balloon Philip P. Moltmann
2015-08-05 20:19                   ` Greg KH
2015-08-05 22:36                     ` [PATCH v4 " Philip P. Moltmann
2015-08-05 22:44                       ` Greg KH
2015-08-05 22:47                         ` Philip Moltmann
2015-08-05 23:26                           ` gregkh
2015-08-06 20:33                             ` [PATCH v4 0/9] Fourth revision of the performance improvement patch to the VMware balloon driver Philip P. Moltmann
2015-08-06 20:33                             ` [PATCH v4 1/9] VMware balloon: partially inline vmballoon_reserve_page Philip P. Moltmann
2015-08-06 21:07                               ` Greg KH
2015-08-06 22:17                                 ` [PATCH v5 0/7] Fifth revision of the performance improvement patch to the VMware balloon driver Philip P. Moltmann
2015-08-14 23:27                                   ` Philip Moltmann
2015-08-06 22:17                                 ` [PATCH v5 1/7] VMware balloon: add batching to the vmw_balloon Philip P. Moltmann
2015-08-06 22:17                                 ` [PATCH v5 2/7] VMware balloon: Update balloon target on each lock/unlock Philip P. Moltmann
2015-08-06 22:18                                 ` [PATCH v5 3/7] VMware balloon: Show capabilities of balloon and resulting capabilities in the debug-fs node Philip P. Moltmann
2015-08-06 22:18                                 ` [PATCH v5 4/7] VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode Philip P. Moltmann
2015-08-06 22:18                                 ` [PATCH v5 5/7] VMware balloon: Support 2m page ballooning Philip P. Moltmann
2015-08-21 14:01                                   ` Kamalneet Singh
2015-08-06 22:18                                 ` [PATCH v5 6/7] VMware balloon: Treat init like reset Philip P. Moltmann
2015-08-06 22:18                                 ` [PATCH v5 7/7] VMware balloon: Enable notification via VMCI Philip P. Moltmann
2015-08-06 20:33                             ` [PATCH v4 2/9] VMware balloon: Add support for balloon capabilities Philip P. Moltmann
2015-08-06 20:33                             ` [PATCH v4 3/9] VMware balloon: add batching to the vmw_balloon Philip P. Moltmann
2015-08-06 20:33                             ` [PATCH v4 4/9] VMware balloon: Update balloon target on each lock/unlock Philip P. Moltmann
2015-08-06 20:33                             ` [PATCH v4 5/9] VMware balloon: Show capabilities of balloon and resulting capabilities in the debug-fs node Philip P. Moltmann
2015-08-06 20:33                             ` [PATCH v4 6/9] VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode Philip P. Moltmann
2015-08-06 20:33                             ` [PATCH v4 7/9] VMware balloon: Support 2m page ballooning Philip P. Moltmann
2015-08-06 20:33                             ` [PATCH v4 8/9] VMware balloon: Treat init like reset Philip P. Moltmann
2015-08-06 20:33                             ` [PATCH v4 9/9] VMware balloon: Enable notification via VMCI Philip P. Moltmann
2015-06-12 18:43                 ` [PATCH v3 4/9] VMware balloon: Update balloon target on each lock/unlock Philip P. Moltmann
2015-06-12 18:43                 ` [PATCH v3 5/9] VMware balloon: Show capabilities of balloon and resulting capabilities in the debug-fs node Philip P. Moltmann
2015-08-05 20:14                   ` Greg KH
2015-08-05 20:22                     ` Philip Moltmann
2015-08-05 20:33                       ` dmitry.torokhov
2015-08-05 20:42                         ` John Savanyo
2015-08-05 20:50                           ` gregkh
2015-08-05 21:11                             ` John Savanyo
2015-08-05 20:40                       ` gregkh
2015-06-12 18:43                 ` [PATCH v3 6/9] VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode Philip P. Moltmann
2015-06-12 18:43                 ` [PATCH v3 7/9] VMware balloon: Support 2m page ballooning Philip P. Moltmann
2015-06-12 18:43                 ` [PATCH v3 8/9] VMware balloon: Treat init like reset Philip P. Moltmann
2015-06-12 18:43                 ` [PATCH v3 9/9] VMware balloon: Enable notification via VMCI Philip P. Moltmann
2015-04-14 17:29 ` [PATCH 7/9] VMware balloon: Support 2m page ballooning Philip P. Moltmann
2015-04-14 17:29 ` [PATCH 8/9] VMware balloon: Treat init like reset Philip P. Moltmann
2015-04-14 17:29 ` [PATCH 9/9] VMware balloon: Enable notification via VMCI Philip P. Moltmann
