* [PATCH v6 kernel 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration
@ 2016-12-21  6:52 Liang Li
  2016-12-21  6:52 ` [PATCH v6 kernel 1/5] virtio-balloon: rework deflate to add page to a list Liang Li
                   ` (6 more replies)
  0 siblings, 7 replies; 24+ messages in thread
From: Liang Li @ 2016-12-21  6:52 UTC (permalink / raw)
  To: kvm
  Cc: virtio-dev, qemu-devel, linux-mm, linux-kernel, virtualization,
	amit.shah, dave.hansen, cornelia.huck, pbonzini, mst, david,
	aarcange, dgilbert, quintela, Liang Li

This patch set contains two sets of changes to virtio-balloon.

The first speeds up the inflating and deflating process. The main idea
is to use {pfn|length} entries instead of individual PFNs to represent
the page information, which reduces the overhead of virtio data
transmission, address translation and madvise(). In the test described
in patch 3/5, this cuts the time to inflate the balloon by about 85%.

The second speeds up live migration. By skipping the guest's unused
pages in the first round of data copy, it avoids needless data
processing and saves a lot of CPU cycles and network bandwidth. The
guest's unused page information is put in a {pfn|length} array and sent
to the host through a virtqueue of virtio-balloon. For an idle guest
with 8GB RAM, this shortens the total live migration time from 2 seconds
to about 500ms in a 10Gbps network environment. For a guest with a lot
of page cache and few unused pages, it is possible to let the guest drop
its page cache before live migration, so this case can benefit from the
new feature too.
 
Changes from v5 to v6:
    * Drop the bitmap from the virtio ABI, use {pfn|length} only.
    * Enhance the API to get the unused page information from mm. 

Changes from v4 to v5:
    * Drop the code to get the max_pfn, use another way instead.
    * Simplify the API to get the unused page information from mm. 

Changes from v3 to v4:
    * Use the new scheme suggested by Dave Hansen to encode the bitmap.
    * Add the code missing from v3 to handle page migration.
    * Free the memory for the bitmap promptly once the operation is done.
    * Address some of the comments in v3.

Changes from v2 to v3:
    * Change the name of 'free page' to 'unused page'.
    * Use the scatter & gather bitmap instead of a 1MB page bitmap.
    * Fix overwriting the page bitmap after kicking.
    * Address some of MST's comments on v2.
 
Changes from v1 to v2:
    * Abandon the patch for dropping page cache.
    * Put some structures in the uapi header file.
    * Use a new way to determine the page bitmap size.
    * Use a unified way to send the free page information with the bitmap.
    * Address the issues raised in MST's comments.

Liang Li (5):
  virtio-balloon: rework deflate to add page to a list
  virtio-balloon: define new feature bit and head struct
  virtio-balloon: speed up inflate/deflate process
  virtio-balloon: define flags and head for host request vq
  virtio-balloon: tell host vm's unused page info

 drivers/virtio/virtio_balloon.c     | 510 ++++++++++++++++++++++++++++++++----
 include/linux/mm.h                  |   3 +
 include/uapi/linux/virtio_balloon.h |  34 +++
 mm/page_alloc.c                     | 120 +++++++++
 4 files changed, 621 insertions(+), 46 deletions(-)

-- 
1.9.1


* [PATCH v6 kernel 1/5] virtio-balloon: rework deflate to add page to a list
  2016-12-21  6:52 [PATCH v6 kernel 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration Liang Li
@ 2016-12-21  6:52 ` Liang Li
  2016-12-21  6:52 ` [PATCH v6 kernel 2/5] virtio-balloon: define new feature bit and head struct Liang Li
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 24+ messages in thread
From: Liang Li @ 2016-12-21  6:52 UTC (permalink / raw)
  To: kvm
  Cc: virtio-dev, qemu-devel, linux-mm, linux-kernel, virtualization,
	amit.shah, dave.hansen, cornelia.huck, pbonzini, mst, david,
	aarcange, dgilbert, quintela, Liang Li

When inflating or deflating, the current virtio-balloon implementation
saves 256 PFNs in an array and sends them to the host through virtio,
and the host then processes each PFN one by one. This is inefficient
when inflating or deflating a large amount of memory because the
following operations are repeated too many times:

    1. Virtio data transmission
    2. Page allocate/free
    3. Address translation(GPA->HVA)
    4. madvise

The overhead of these operations consumes a lot of CPU cycles and takes
a long time to complete, which may impact the QoS of the guest as well
as the host. The overhead is reduced a lot if batch processing is used.
E.g. if several pages are physically contiguous in the guest, they can
be processed in one operation.

The main idea of the optimization is to reduce the above operations as
much as possible, which can be achieved by using a {pfn|length} array
instead of a PFN array. Compared with a PFN array, a {pfn|length} array
can represent more pages and is well suited to batch processing.
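
To illustrate why a {pfn|length} array can describe more pages than a
PFN array of the same size, here is a minimal user-space sketch that
collapses an ascending PFN array into ranges. It is not code from this
series; struct pfn_range and coalesce_pfns() are hypothetical names
used only for illustration, and the input is assumed to be sorted and
free of duplicates.

```c
#include <stddef.h>
#include <stdint.h>

/* One run of physically contiguous guest pages (illustrative type). */
struct pfn_range {
	uint64_t base_pfn;	/* first page frame number in the run */
	uint64_t nr_pages;	/* number of contiguous pages in the run */
};

/*
 * Collapse an ascending, duplicate-free PFN array into ranges.
 * Writes at most max_out ranges to out and returns how many were written.
 */
size_t coalesce_pfns(const uint64_t *pfns, size_t n,
		     struct pfn_range *out, size_t max_out)
{
	size_t i, nr = 0;

	for (i = 0; i < n; i++) {
		/* Does this PFN directly follow the current run? */
		if (nr > 0 &&
		    out[nr - 1].base_pfn + out[nr - 1].nr_pages == pfns[i]) {
			out[nr - 1].nr_pages++;
			continue;
		}
		if (nr == max_out)
			break;	/* output array is full */
		out[nr].base_pfn = pfns[i];
		out[nr].nr_pages = 1;
		nr++;
	}
	return nr;
}
```

With the PFNs 100..103, 200..201 and 500 as input, seven array entries
shrink to three ranges. The more contiguous the pages are, the larger
the saving.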

This patch saves the deflated pages to a list instead of the PFN array,
which will allow faster notifications using {pfn|length} down the
road. balloon_pfn_to_page() is removed because it no longer has any
users.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
---
 drivers/virtio/virtio_balloon.c | 22 ++++++++--------------
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 181793f..f59cb4f 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -103,12 +103,6 @@ static u32 page_to_balloon_pfn(struct page *page)
 	return pfn * VIRTIO_BALLOON_PAGES_PER_PAGE;
 }
 
-static struct page *balloon_pfn_to_page(u32 pfn)
-{
-	BUG_ON(pfn % VIRTIO_BALLOON_PAGES_PER_PAGE);
-	return pfn_to_page(pfn / VIRTIO_BALLOON_PAGES_PER_PAGE);
-}
-
 static void balloon_ack(struct virtqueue *vq)
 {
 	struct virtio_balloon *vb = vq->vdev->priv;
@@ -181,18 +175,16 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
 	return num_allocated_pages;
 }
 
-static void release_pages_balloon(struct virtio_balloon *vb)
+static void release_pages_balloon(struct virtio_balloon *vb,
+				 struct list_head *pages)
 {
-	unsigned int i;
-	struct page *page;
+	struct page *page, *next;
 
-	/* Find pfns pointing at start of each page, get pages and free them. */
-	for (i = 0; i < vb->num_pfns; i += VIRTIO_BALLOON_PAGES_PER_PAGE) {
-		page = balloon_pfn_to_page(virtio32_to_cpu(vb->vdev,
-							   vb->pfns[i]));
+	list_for_each_entry_safe(page, next, pages, lru) {
 		if (!virtio_has_feature(vb->vdev,
 					VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
 			adjust_managed_page_count(page, 1);
+		list_del(&page->lru);
 		put_page(page); /* balloon reference */
 	}
 }
@@ -202,6 +194,7 @@ static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)
 	unsigned num_freed_pages;
 	struct page *page;
 	struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
+	LIST_HEAD(pages);
 
 	/* We can only do one array worth at a time. */
 	num = min(num, ARRAY_SIZE(vb->pfns));
@@ -215,6 +208,7 @@ static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)
 		if (!page)
 			break;
 		set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
+		list_add(&page->lru, &pages);
 		vb->num_pages -= VIRTIO_BALLOON_PAGES_PER_PAGE;
 	}
 
@@ -226,7 +220,7 @@ static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)
 	 */
 	if (vb->num_pfns != 0)
 		tell_host(vb, vb->deflate_vq);
-	release_pages_balloon(vb);
+	release_pages_balloon(vb, &pages);
 	mutex_unlock(&vb->balloon_lock);
 	return num_freed_pages;
 }
-- 
1.9.1


* [PATCH v6 kernel 2/5] virtio-balloon: define new feature bit and head struct
  2016-12-21  6:52 [PATCH v6 kernel 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration Liang Li
  2016-12-21  6:52 ` [PATCH v6 kernel 1/5] virtio-balloon: rework deflate to add page to a list Liang Li
@ 2016-12-21  6:52 ` Liang Li
  2017-01-12 19:43   ` Michael S. Tsirkin
  2016-12-21  6:52 ` [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process Liang Li
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 24+ messages in thread
From: Liang Li @ 2016-12-21  6:52 UTC (permalink / raw)
  To: kvm
  Cc: virtio-dev, qemu-devel, linux-mm, linux-kernel, virtualization,
	amit.shah, dave.hansen, cornelia.huck, pbonzini, mst, david,
	aarcange, dgilbert, quintela, Liang Li

Add a new feature bit for sending page information as an array of
ranges, and define the response header that precedes the range data.
The current implementation uses an array of PFNs, which is not very
efficient. Using ranges can significantly improve the performance of
inflating/deflating.
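
The response header in this patch is declared with C bit-fields, whose
placement within the storage unit is implementation-defined. As a
hedged illustration only, the sketch below packs the same four fields
with explicit shifts and then stores the result in little-endian byte
order. The bit positions (cmd in bits 0-7, flag in 8-15, id in 16-31,
data_len in 32-63) are an assumption, not something this patch
specifies, and resp_hdr_pack() and put_le64() are hypothetical helpers.

```c
#include <stdint.h>

/*
 * Pack the header fields into one 64-bit value.
 * Assumed layout: cmd bits 0-7, flag bits 8-15, id bits 16-31,
 * data_len bits 32-63.
 */
uint64_t resp_hdr_pack(uint8_t cmd, uint8_t flag, uint16_t id,
		       uint32_t data_len)
{
	return (uint64_t)cmd |
	       ((uint64_t)flag << 8) |
	       ((uint64_t)id << 16) |
	       ((uint64_t)data_len << 32);
}

/* Extract data_len again under the same assumed layout. */
uint32_t resp_hdr_data_len(uint64_t hdr)
{
	return (uint32_t)(hdr >> 32);
}

/* Store a 64-bit value in little-endian byte order, on any host. */
void put_le64(uint8_t out[8], uint64_t v)
{
	int i;

	for (i = 0; i < 8; i++)
		out[i] = (uint8_t)(v >> (8 * i));
}
```

Writing the layout out with shifts makes the wire format independent of
the compiler and of host endianness, at the cost of a few helpers.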

Signed-off-by: Liang Li <liang.z.li@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
---
 include/uapi/linux/virtio_balloon.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/include/uapi/linux/virtio_balloon.h b/include/uapi/linux/virtio_balloon.h
index 343d7dd..2f850bf 100644
--- a/include/uapi/linux/virtio_balloon.h
+++ b/include/uapi/linux/virtio_balloon.h
@@ -34,10 +34,14 @@
 #define VIRTIO_BALLOON_F_MUST_TELL_HOST	0 /* Tell before reclaiming pages */
 #define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue */
 #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM	2 /* Deflate balloon on OOM */
+#define VIRTIO_BALLOON_F_PAGE_RANGE	3 /* Send page info with ranges */
 
 /* Size of a PFN in the balloon interface. */
 #define VIRTIO_BALLOON_PFN_SHIFT 12
 
+/* Bits width for the length of the pfn range */
+#define VIRTIO_BALLOON_NR_PFN_BITS 12
+
 struct virtio_balloon_config {
 	/* Number of pages host wants Guest to give up. */
 	__u32 num_pages;
@@ -82,4 +86,12 @@ struct virtio_balloon_stat {
 	__virtio64 val;
 } __attribute__((packed));
 
+/* Response header structure */
+struct virtio_balloon_resp_hdr {
+	__le64 cmd : 8; /* Distinguish different requests type */
+	__le64 flag: 8; /* Mark status for a specific request type */
+	__le64 id : 16; /* Distinguish requests of a specific type */
+	__le64 data_len: 32; /* Length of the following data, in bytes */
+};
+
 #endif /* _LINUX_VIRTIO_BALLOON_H */
-- 
1.9.1


* [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process
  2016-12-21  6:52 [PATCH v6 kernel 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration Liang Li
  2016-12-21  6:52 ` [PATCH v6 kernel 1/5] virtio-balloon: rework deflate to add page to a list Liang Li
  2016-12-21  6:52 ` [PATCH v6 kernel 2/5] virtio-balloon: define new feature bit and head struct Liang Li
@ 2016-12-21  6:52 ` Liang Li
  2017-01-17 19:15   ` Michael S. Tsirkin
  2017-01-20 11:48   ` Dr. David Alan Gilbert
  2016-12-21  6:52 ` [PATCH v6 kernel 4/5] virtio-balloon: define flags and head for host request vq Liang Li
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 24+ messages in thread
From: Liang Li @ 2016-12-21  6:52 UTC (permalink / raw)
  To: kvm
  Cc: virtio-dev, qemu-devel, linux-mm, linux-kernel, virtualization,
	amit.shah, dave.hansen, cornelia.huck, pbonzini, mst, david,
	aarcange, dgilbert, quintela, Liang Li

The current virtio-balloon implementation is not very efficient. The
time spent in the different stages of inflating the balloon to 7GB on
an 8GB idle guest breaks down as follows:

a. allocating pages (6.5%)
b. sending PFNs to host (68.3%)
c. address translation (6.1%)
d. madvise (19%)

It takes about 4126ms for the inflating process to complete.
Debugging shows that the bottlenecks are stage b and stage d.

If a {pfn|length} array is used to send the page info instead of the
PFNs, the overhead of stage b is reduced quite a lot. Furthermore, the
address translation and the madvise() call can be done on a range of
memory instead of page by page, so the overhead of stage c and stage d
is also reduced a lot.
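
As a rough user-space illustration of the {pfn|length} encoding, the
sketch below puts the length in the low 12 bits (mirroring
VIRTIO_BALLOON_NR_PFN_BITS) and, when the length does not fit, stores 0
there and places the real length in the next 64-bit word. encode_range()
and decode_range() are hypothetical names. The sketch works on
host-endian values, whereas the driver deals with __le64, and it treats
every length that does not fit in 12 bits, including exactly 4096, as
the two-word form. It assumes pages >= 1.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_PFN_BITS	12	/* mirrors VIRTIO_BALLOON_NR_PFN_BITS */
#define LEN_MASK	((UINT64_C(1) << NR_PFN_BITS) - 1)

/* Encode one range into out[]; returns the number of words used (1 or 2). */
size_t encode_range(uint64_t base_pfn, uint64_t pages, uint64_t *out)
{
	if (pages > LEN_MASK) {
		/* Length field 0 means "the length is in the next word". */
		out[0] = base_pfn << NR_PFN_BITS;
		out[1] = pages;
		return 2;
	}
	out[0] = (base_pfn << NR_PFN_BITS) | pages;
	return 1;
}

/* Decode one range from in[]; returns the number of words consumed. */
size_t decode_range(const uint64_t *in, uint64_t *base_pfn, uint64_t *pages)
{
	*base_pfn = in[0] >> NR_PFN_BITS;
	*pages = in[0] & LEN_MASK;
	if (*pages == 0) {
		*pages = in[1];
		return 2;
	}
	return 1;
}
```

A 16-page range starting at PFN 0x1234 takes a single word, while a
4096-page range needs two. Either way the host can translate and
madvise() the whole range in one step.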

This patch is the kernel-side implementation, which is intended to
speed up the inflating & deflating process by adding a new feature
to the virtio-balloon device. With this new feature, inflating the
balloon to 7GB on an 8GB idle guest takes only 590ms instead of 4126ms,
a reduction of about 85% in inflating time.
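
In this patch the driver first records the pages in an internal bitmap
and then walks it to produce the ranges. The following is a simplified
user-space sketch of such a walk, not the driver code itself. It uses
plain loops in place of the kernel's find_next_bit() and
find_next_zero_bit(), and struct pfn_run and bitmap_to_runs() are
hypothetical names.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)

/* One run of consecutive set bits, expressed as a PFN range. */
struct pfn_run {
	unsigned long base_pfn;
	unsigned long pages;
};

static int bit_is_set(const unsigned long *bmap, size_t pos)
{
	return (bmap[pos / BITS_PER_WORD] >> (pos % BITS_PER_WORD)) & 1UL;
}

/*
 * Report every maximal run of set bits in bmap[0..nbits) as a range
 * starting at start_pfn + bit offset. Returns the number of runs stored,
 * at most max_out.
 */
size_t bitmap_to_runs(const unsigned long *bmap, size_t nbits,
		      unsigned long start_pfn,
		      struct pfn_run *out, size_t max_out)
{
	size_t pos = 0, nr = 0;

	while (pos < nbits && nr < max_out) {
		size_t one = pos, zero;

		/* Find the next set bit. */
		while (one < nbits && !bit_is_set(bmap, one))
			one++;
		if (one == nbits)
			break;
		/* Find the first clear bit after it. */
		zero = one + 1;
		while (zero < nbits && bit_is_set(bmap, zero))
			zero++;
		out[nr].base_pfn = start_pfn + one;
		out[nr].pages = zero - one;
		nr++;
		pos = zero;
	}
	return nr;
}
```

For a bitmap with bits 0-2, 5 and 10-11 set and start_pfn 1000, the walk
yields the ranges (1000, 3), (1005, 1) and (1010, 2).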

TODO: optimize stage a by allocating/freeing a chunk of pages
instead of a single page at a time.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
---
 drivers/virtio/virtio_balloon.c | 348 ++++++++++++++++++++++++++++++++++++----
 1 file changed, 320 insertions(+), 28 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index f59cb4f..03383b3 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -42,6 +42,10 @@
 #define OOM_VBALLOON_DEFAULT_PAGES 256
 #define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
 
+#define BALLOON_BMAP_SIZE	(8 * PAGE_SIZE)
+#define PFNS_PER_BMAP		(BALLOON_BMAP_SIZE * BITS_PER_BYTE)
+#define BALLOON_BMAP_COUNT	32
+
 static int oom_pages = OOM_VBALLOON_DEFAULT_PAGES;
 module_param(oom_pages, int, S_IRUSR | S_IWUSR);
 MODULE_PARM_DESC(oom_pages, "pages to free on OOM");
@@ -67,6 +71,20 @@ struct virtio_balloon {
 
 	/* Number of balloon pages we've told the Host we're not using. */
 	unsigned int num_pages;
+	/* Pointer to the response header. */
+	void *resp_hdr;
+	/* Pointer to the start address of response data. */
+	__le64 *resp_data;
+	/* Size of response data buffer. */
+	unsigned int resp_buf_size;
+	/* Pointer offset of the response data. */
+	unsigned int resp_pos;
+	/* Bitmap used to save the pfns info */
+	unsigned long *page_bitmap[BALLOON_BMAP_COUNT];
+	/* Number of split page bitmaps */
+	unsigned int nr_page_bmap;
+	/* Used to record the processed pfn range */
+	unsigned long min_pfn, max_pfn, start_pfn, end_pfn;
 	/*
 	 * The pages we've told the Host we're not using are enqueued
 	 * at vb_dev_info->pages list.
@@ -110,20 +128,180 @@ static void balloon_ack(struct virtqueue *vq)
 	wake_up(&vb->acked);
 }
 
-static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq)
+static inline void init_bmap_pfn_range(struct virtio_balloon *vb)
 {
-	struct scatterlist sg;
+	vb->min_pfn = ULONG_MAX;
+	vb->max_pfn = 0;
+}
+
+static inline void update_bmap_pfn_range(struct virtio_balloon *vb,
+				 struct page *page)
+{
+	unsigned long balloon_pfn = page_to_balloon_pfn(page);
+
+	vb->min_pfn = min(balloon_pfn, vb->min_pfn);
+	vb->max_pfn = max(balloon_pfn, vb->max_pfn);
+}
+
+static void extend_page_bitmap(struct virtio_balloon *vb,
+				unsigned long nr_pfn)
+{
+	int i, bmap_count;
+	unsigned long bmap_len;
+
+	bmap_len = ALIGN(nr_pfn, BITS_PER_LONG) / BITS_PER_BYTE;
+	bmap_len = ALIGN(bmap_len, BALLOON_BMAP_SIZE);
+	bmap_count = min((int)(bmap_len / BALLOON_BMAP_SIZE),
+				 BALLOON_BMAP_COUNT);
+
+	for (i = 1; i < bmap_count; i++) {
+		vb->page_bitmap[i] = kmalloc(BALLOON_BMAP_SIZE, GFP_KERNEL);
+		if (vb->page_bitmap[i])
+			vb->nr_page_bmap++;
+		else
+			break;
+	}
+}
+
+static void free_extended_page_bitmap(struct virtio_balloon *vb)
+{
+	int i, bmap_count = vb->nr_page_bmap;
+
+	for (i = 1; i < bmap_count; i++) {
+		kfree(vb->page_bitmap[i]);
+		vb->page_bitmap[i] = NULL;
+		vb->nr_page_bmap--;
+	}
+}
+
+static void kfree_page_bitmap(struct virtio_balloon *vb)
+{
+	int i;
+
+	for (i = 0; i < vb->nr_page_bmap; i++)
+		kfree(vb->page_bitmap[i]);
+}
+
+static void clear_page_bitmap(struct virtio_balloon *vb)
+{
+	int i;
+
+	for (i = 0; i < vb->nr_page_bmap; i++)
+		memset(vb->page_bitmap[i], 0, BALLOON_BMAP_SIZE);
+}
+
+static void send_resp_data(struct virtio_balloon *vb, struct virtqueue *vq,
+			bool busy_wait)
+{
+	struct scatterlist sg[2];
+	struct virtio_balloon_resp_hdr *hdr = vb->resp_hdr;
 	unsigned int len;
 
-	sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns);
+	len = hdr->data_len = vb->resp_pos * sizeof(__le64);
+	sg_init_table(sg, 2);
+	sg_set_buf(&sg[0], hdr, sizeof(struct virtio_balloon_resp_hdr));
+	sg_set_buf(&sg[1], vb->resp_data, len);
+
+	if (virtqueue_add_outbuf(vq, sg, 2, vb, GFP_KERNEL) == 0) {
+		virtqueue_kick(vq);
+		if (busy_wait)
+			while (!virtqueue_get_buf(vq, &len)
+				&& !virtqueue_is_broken(vq))
+				cpu_relax();
+		else
+			wait_event(vb->acked, virtqueue_get_buf(vq, &len));
+		vb->resp_pos = 0;
+		free_extended_page_bitmap(vb);
+	}
+}
 
-	/* We should always be able to add one buffer to an empty queue. */
-	virtqueue_add_outbuf(vq, &sg, 1, vb, GFP_KERNEL);
-	virtqueue_kick(vq);
+static void do_set_resp_bitmap(struct virtio_balloon *vb,
+		unsigned long base_pfn, int pages)
 
-	/* When host has read buffer, this completes via balloon_ack */
-	wait_event(vb->acked, virtqueue_get_buf(vq, &len));
+{
+	__le64 *range = vb->resp_data + vb->resp_pos;
 
+	if (pages > (1 << VIRTIO_BALLOON_NR_PFN_BITS)) {
+		/* when the length field can't contain pages, set it to 0 to
+		 * indicate the actual length is in the next __le64;
+		 */
+		*range = cpu_to_le64((base_pfn <<
+				VIRTIO_BALLOON_NR_PFN_BITS) | 0);
+		*(range + 1) = cpu_to_le64(pages);
+		vb->resp_pos += 2;
+	} else {
+		*range = (base_pfn << VIRTIO_BALLOON_NR_PFN_BITS) | pages;
+		vb->resp_pos++;
+	}
+}
+
+static void set_bulk_pages(struct virtio_balloon *vb, struct virtqueue *vq,
+		unsigned long start_pfn, unsigned long *bitmap,
+		unsigned long len, bool busy_wait)
+{
+	unsigned long pos = 0, end = len * BITS_PER_BYTE;
+
+	while (pos < end) {
+		unsigned long one = find_next_bit(bitmap, end, pos);
+
+		if (one < end) {
+			unsigned long pages, zero;
+
+			zero = find_next_zero_bit(bitmap, end, one + 1);
+			if (zero >= end)
+				pages = end - one;
+			else
+				pages = zero - one;
+			if (pages) {
+				if ((vb->resp_pos + 2) * sizeof(__le64) >
+						vb->resp_buf_size)
+					send_resp_data(vb, vq, busy_wait);
+				do_set_resp_bitmap(vb, start_pfn + one,	pages);
+			}
+			pos = one + pages;
+		} else
+			pos = one;
+	}
+}
+
+static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq)
+{
+	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_PAGE_RANGE)) {
+		int nr_pfn, nr_used_bmap, i;
+		unsigned long start_pfn, bmap_len;
+
+		start_pfn = vb->start_pfn;
+		nr_pfn = vb->end_pfn - start_pfn + 1;
+		nr_pfn = roundup(nr_pfn, BITS_PER_LONG);
+		nr_used_bmap = nr_pfn / PFNS_PER_BMAP;
+		if (nr_pfn % PFNS_PER_BMAP)
+			nr_used_bmap++;
+		bmap_len = nr_pfn / BITS_PER_BYTE;
+
+		for (i = 0; i < nr_used_bmap; i++) {
+			unsigned int bmap_size = BALLOON_BMAP_SIZE;
+
+			if (i + 1 == nr_used_bmap)
+				bmap_size = bmap_len - BALLOON_BMAP_SIZE * i;
+			set_bulk_pages(vb, vq, start_pfn + i * PFNS_PER_BMAP,
+				 vb->page_bitmap[i], bmap_size, false);
+		}
+		if (vb->resp_pos > 0)
+			send_resp_data(vb, vq, false);
+	} else {
+		struct scatterlist sg;
+		unsigned int len;
+
+		sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns);
+
+		/* We should always be able to add one buffer to an
+		 * empty queue
+		 */
+		virtqueue_add_outbuf(vq, &sg, 1, vb, GFP_KERNEL);
+		virtqueue_kick(vq);
+		/* When host has read buffer, this completes via balloon_ack */
+		wait_event(vb->acked, virtqueue_get_buf(vq, &len));
+	}
 }
 
 static void set_page_pfns(struct virtio_balloon *vb,
@@ -138,13 +316,59 @@ static void set_page_pfns(struct virtio_balloon *vb,
 					  page_to_balloon_pfn(page) + i);
 }
 
-static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
+static void set_page_bitmap(struct virtio_balloon *vb,
+			 struct list_head *pages, struct virtqueue *vq)
+{
+	unsigned long pfn, pfn_limit;
+	struct page *page;
+	bool found;
+	int bmap_idx;
+
+	vb->min_pfn = rounddown(vb->min_pfn, BITS_PER_LONG);
+	vb->max_pfn = roundup(vb->max_pfn, BITS_PER_LONG);
+	pfn_limit = PFNS_PER_BMAP * vb->nr_page_bmap;
+
+	if (vb->nr_page_bmap == 1)
+		extend_page_bitmap(vb, vb->max_pfn - vb->min_pfn + 1);
+	for (pfn = vb->min_pfn; pfn < vb->max_pfn; pfn += pfn_limit) {
+		unsigned long end_pfn;
+
+		clear_page_bitmap(vb);
+		vb->start_pfn = pfn;
+		end_pfn = pfn;
+		found = false;
+		list_for_each_entry(page, pages, lru) {
+			unsigned long pos, balloon_pfn;
+
+			balloon_pfn = page_to_balloon_pfn(page);
+			if (balloon_pfn < pfn || balloon_pfn >= pfn + pfn_limit)
+				continue;
+			bmap_idx = (balloon_pfn - pfn) / PFNS_PER_BMAP;
+			pos = (balloon_pfn - pfn) % PFNS_PER_BMAP;
+			set_bit(pos, vb->page_bitmap[bmap_idx]);
+			if (balloon_pfn > end_pfn)
+				end_pfn = balloon_pfn;
+			found = true;
+		}
+		if (found) {
+			vb->end_pfn = end_pfn;
+			tell_host(vb, vq);
+		}
+	}
+}
+
+static unsigned int fill_balloon(struct virtio_balloon *vb, size_t num)
 {
 	struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
-	unsigned num_allocated_pages;
+	unsigned int num_allocated_pages;
+	bool use_bmap = virtio_has_feature(vb->vdev,
+				 VIRTIO_BALLOON_F_PAGE_RANGE);
 
-	/* We can only do one array worth at a time. */
-	num = min(num, ARRAY_SIZE(vb->pfns));
+	if (use_bmap)
+		init_bmap_pfn_range(vb);
+	else
+		/* We can only do one array worth at a time. */
+		num = min(num, ARRAY_SIZE(vb->pfns));
 
 	mutex_lock(&vb->balloon_lock);
 	for (vb->num_pfns = 0; vb->num_pfns < num;
@@ -159,7 +383,10 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
 			msleep(200);
 			break;
 		}
-		set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
+		if (use_bmap)
+			update_bmap_pfn_range(vb, page);
+		else
+			set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
 		vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
 		if (!virtio_has_feature(vb->vdev,
 					VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
@@ -168,8 +395,13 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
 
 	num_allocated_pages = vb->num_pfns;
 	/* Did we get any? */
-	if (vb->num_pfns != 0)
-		tell_host(vb, vb->inflate_vq);
+	if (vb->num_pfns != 0) {
+		if (use_bmap)
+			set_page_bitmap(vb, &vb_dev_info->pages,
+					vb->inflate_vq);
+		else
+			tell_host(vb, vb->inflate_vq);
+	}
 	mutex_unlock(&vb->balloon_lock);
 
 	return num_allocated_pages;
@@ -189,15 +421,20 @@ static void release_pages_balloon(struct virtio_balloon *vb,
 	}
 }
 
-static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)
+static unsigned int leak_balloon(struct virtio_balloon *vb, size_t num)
 {
-	unsigned num_freed_pages;
+	unsigned int num_freed_pages;
 	struct page *page;
 	struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
 	LIST_HEAD(pages);
+	bool use_bmap = virtio_has_feature(vb->vdev,
+			 VIRTIO_BALLOON_F_PAGE_RANGE);
 
-	/* We can only do one array worth at a time. */
-	num = min(num, ARRAY_SIZE(vb->pfns));
+	if (use_bmap)
+		init_bmap_pfn_range(vb);
+	else
+		/* We can only do one array worth at a time. */
+		num = min(num, ARRAY_SIZE(vb->pfns));
 
 	mutex_lock(&vb->balloon_lock);
 	/* We can't release more pages than taken */
@@ -207,7 +444,10 @@ static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)
 		page = balloon_page_dequeue(vb_dev_info);
 		if (!page)
 			break;
-		set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
+		if (use_bmap)
+			update_bmap_pfn_range(vb, page);
+		else
+			set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
 		list_add(&page->lru, &pages);
 		vb->num_pages -= VIRTIO_BALLOON_PAGES_PER_PAGE;
 	}
@@ -218,8 +458,12 @@ static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)
 	 * virtio_has_feature(vdev, VIRTIO_BALLOON_F_MUST_TELL_HOST);
 	 * is true, we *have* to do it in this order
 	 */
-	if (vb->num_pfns != 0)
-		tell_host(vb, vb->deflate_vq);
+	if (vb->num_pfns != 0) {
+		if (use_bmap)
+			set_page_bitmap(vb, &pages, vb->deflate_vq);
+		else
+			tell_host(vb, vb->deflate_vq);
+	}
 	release_pages_balloon(vb, &pages);
 	mutex_unlock(&vb->balloon_lock);
 	return num_freed_pages;
@@ -431,6 +675,18 @@ static int init_vqs(struct virtio_balloon *vb)
 }
 
 #ifdef CONFIG_BALLOON_COMPACTION
+static void tell_host_one_page(struct virtio_balloon *vb,
+	struct virtqueue *vq, struct page *page)
+{
+	__le64 *range;
+
+	range = vb->resp_data + vb->resp_pos;
+	*range = cpu_to_le64((page_to_pfn(page) <<
+				VIRTIO_BALLOON_NR_PFN_BITS) | 1);
+	vb->resp_pos++;
+	send_resp_data(vb, vq, false);
+}
+
 /*
  * virtballoon_migratepage - perform the balloon page migration on behalf of
  *			     a compation thread.     (called under page lock)
@@ -455,6 +711,8 @@ static int virtballoon_migratepage(struct balloon_dev_info *vb_dev_info,
 	struct virtio_balloon *vb = container_of(vb_dev_info,
 			struct virtio_balloon, vb_dev_info);
 	unsigned long flags;
+	bool use_bmap = virtio_has_feature(vb->vdev,
+				 VIRTIO_BALLOON_F_PAGE_RANGE);
 
 	/*
 	 * In order to avoid lock contention while migrating pages concurrently
@@ -475,15 +733,23 @@ static int virtballoon_migratepage(struct balloon_dev_info *vb_dev_info,
 	vb_dev_info->isolated_pages--;
 	__count_vm_event(BALLOON_MIGRATE);
 	spin_unlock_irqrestore(&vb_dev_info->pages_lock, flags);
-	vb->num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;
-	set_page_pfns(vb, vb->pfns, newpage);
-	tell_host(vb, vb->inflate_vq);
+	if (use_bmap)
+		tell_host_one_page(vb, vb->inflate_vq, newpage);
+	else {
+		vb->num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;
+		set_page_pfns(vb, vb->pfns, newpage);
+		tell_host(vb, vb->inflate_vq);
+	}
 
 	/* balloon's page migration 2nd step -- deflate "page" */
 	balloon_page_delete(page);
-	vb->num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;
-	set_page_pfns(vb, vb->pfns, page);
-	tell_host(vb, vb->deflate_vq);
+	if (use_bmap)
+		tell_host_one_page(vb, vb->deflate_vq, page);
+	else {
+		vb->num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;
+		set_page_pfns(vb, vb->pfns, page);
+		tell_host(vb, vb->deflate_vq);
+	}
 
 	mutex_unlock(&vb->balloon_lock);
 
@@ -533,6 +799,29 @@ static int virtballoon_probe(struct virtio_device *vdev)
 	spin_lock_init(&vb->stop_update_lock);
 	vb->stop_update = false;
 	vb->num_pages = 0;
+	vb->resp_hdr = kzalloc(sizeof(struct virtio_balloon_resp_hdr),
+				 GFP_KERNEL);
+	/* Clear the feature bit if memory allocation fails */
+	if (!vb->resp_hdr)
+		__virtio_clear_bit(vdev, VIRTIO_BALLOON_F_PAGE_RANGE);
+	else {
+		vb->page_bitmap[0] = kmalloc(BALLOON_BMAP_SIZE, GFP_KERNEL);
+		if (!vb->page_bitmap[0]) {
+			__virtio_clear_bit(vdev, VIRTIO_BALLOON_F_PAGE_RANGE);
+			kfree(vb->resp_hdr);
+		} else {
+			vb->nr_page_bmap = 1;
+			vb->resp_data = kmalloc(BALLOON_BMAP_SIZE, GFP_KERNEL);
+			if (!vb->resp_data) {
+				__virtio_clear_bit(vdev,
+						VIRTIO_BALLOON_F_PAGE_RANGE);
+				kfree(vb->page_bitmap[0]);
+				kfree(vb->resp_hdr);
+			}
+		}
+	}
+	vb->resp_pos = 0;
+	vb->resp_buf_size = BALLOON_BMAP_SIZE;
 	mutex_init(&vb->balloon_lock);
 	init_waitqueue_head(&vb->acked);
 	vb->vdev = vdev;
@@ -611,6 +900,8 @@ static void virtballoon_remove(struct virtio_device *vdev)
 	remove_common(vb);
 	if (vb->vb_dev_info.inode)
 		iput(vb->vb_dev_info.inode);
+	kfree_page_bitmap(vb);
+	kfree(vb->resp_hdr);
 	kfree(vb);
 }
 
@@ -649,6 +940,7 @@ static int virtballoon_restore(struct virtio_device *vdev)
 	VIRTIO_BALLOON_F_MUST_TELL_HOST,
 	VIRTIO_BALLOON_F_STATS_VQ,
 	VIRTIO_BALLOON_F_DEFLATE_ON_OOM,
+	VIRTIO_BALLOON_F_PAGE_RANGE,
 };
 
 static struct virtio_driver virtio_balloon_driver = {
-- 
1.9.1


* [PATCH v6 kernel 4/5] virtio-balloon: define flags and head for host request vq
  2016-12-21  6:52 [PATCH v6 kernel 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration Liang Li
                   ` (2 preceding siblings ...)
  2016-12-21  6:52 ` [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process Liang Li
@ 2016-12-21  6:52 ` Liang Li
  2016-12-21  6:52 ` [PATCH v6 kernel 5/5] virtio-balloon: tell host vm's unused page info Liang Li
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 24+ messages in thread
From: Liang Li @ 2016-12-21  6:52 UTC (permalink / raw)
  To: kvm
  Cc: virtio-dev, qemu-devel, linux-mm, linux-kernel, virtualization,
	amit.shah, dave.hansen, cornelia.huck, pbonzini, mst, david,
	aarcange, dgilbert, quintela, Liang Li, Andrew Morton,
	Mel Gorman

Define the flags and header struct for a new host request virtual
queue. The guest gets requests from the host and responds to them on
this new virtual queue.
The host can use this virtual queue to ask the guest to perform some
operations, e.g. drop page cache, synchronize the file system, etc.
The hypervisor can also get some of the guest's runtime information
through this virtual queue, e.g. the guest's unused page
information, which can be used for live migration optimization.
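
The flags defined here mark whether more data follows for a request.
One way a receiver might use them, shown only as an assumption about
the intended protocol, is to accumulate response chunks until one is
marked done. struct resp_chunk and response_total_len() below are
hypothetical and not part of this patch. The two flag values mirror
BALLOON_FLAG_CONT and BALLOON_FLAG_DONE.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors enum virtio_balloon_flag from this patch. */
enum {
	FLAG_CONT = 0,	/* more data follows for this request */
	FLAG_DONE = 1,	/* no more data for this request */
};

/* Simplified view of one response: its flag and payload size in bytes. */
struct resp_chunk {
	uint8_t flag;
	uint32_t data_len;
};

/*
 * Sum the payload sizes of chunks up to and including the first one
 * marked FLAG_DONE. Returns 0 if a FLAG_DONE chunk was seen, or -1 if
 * the sequence ended while more data was still expected.
 */
int response_total_len(const struct resp_chunk *chunks, size_t n,
		       uint64_t *total)
{
	size_t i;

	*total = 0;
	for (i = 0; i < n; i++) {
		*total += chunks[i].data_len;
		if (chunks[i].flag == FLAG_DONE)
			return 0;
	}
	return -1;
}
```

Splitting a response this way would let the guest reuse one fixed-size
buffer for an answer of any length.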

Signed-off-by: Liang Li <liang.z.li@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
---
 include/uapi/linux/virtio_balloon.h | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/include/uapi/linux/virtio_balloon.h b/include/uapi/linux/virtio_balloon.h
index 2f850bf..b367020 100644
--- a/include/uapi/linux/virtio_balloon.h
+++ b/include/uapi/linux/virtio_balloon.h
@@ -35,6 +35,7 @@
 #define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue */
 #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM	2 /* Deflate balloon on OOM */
 #define VIRTIO_BALLOON_F_PAGE_RANGE	3 /* Send page info with ranges */
+#define VIRTIO_BALLOON_F_HOST_REQ_VQ	4 /* Host request virtqueue */
 
 /* Size of a PFN in the balloon interface. */
 #define VIRTIO_BALLOON_PFN_SHIFT 12
@@ -94,4 +95,25 @@ struct virtio_balloon_resp_hdr {
 	__le64 data_len: 32; /* Length of the following data, in bytes */
 };
 
+enum virtio_balloon_req_id {
+	/* Get unused page information */
+	BALLOON_GET_UNUSED_PAGES,
+};
+
+enum virtio_balloon_flag {
+	/* Have more data for a request */
+	BALLOON_FLAG_CONT,
+	/* No more data for a request */
+	BALLOON_FLAG_DONE,
+};
+
+struct virtio_balloon_req_hdr {
+	/* Used to distinguish different requests */
+	__le16 cmd;
+	/* Reserved */
+	__le16 reserved[3];
+	/* Request parameter */
+	__le64 param;
+};
+
 #endif /* _LINUX_VIRTIO_BALLOON_H */
-- 
1.9.1


* [PATCH v6 kernel 5/5] virtio-balloon: tell host vm's unused page info
  2016-12-21  6:52 [PATCH v6 kernel 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration Liang Li
                   ` (3 preceding siblings ...)
  2016-12-21  6:52 ` [PATCH v6 kernel 4/5] virtio-balloon: define flags and head for host request vq Liang Li
@ 2016-12-21  6:52 ` Liang Li
  2017-01-10  6:43 ` [PATCH v6 kernel 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration Li, Liang Z
  2017-01-18 10:09 ` David Hildenbrand
  6 siblings, 0 replies; 24+ messages in thread
From: Liang Li @ 2016-12-21  6:52 UTC (permalink / raw)
  To: kvm
  Cc: virtio-dev, qemu-devel, linux-mm, linux-kernel, virtualization,
	amit.shah, dave.hansen, cornelia.huck, pbonzini, mst, david,
	aarcange, dgilbert, quintela, Liang Li, Andrew Morton,
	Mel Gorman

This patch contains two parts:

One is to add a new API to mm to get the unused page information.
The virtio balloon driver will use this new API to get the unused
page info and send it to the hypervisor (QEMU) to speed up live
migration. While the page info is being sent, some of the pages may
be modified and used by the guest; this inaccuracy can be corrected
by the dirty page logging mechanism.

The other is to add support for the request for the VM's unused page
information. QEMU can use the unused page information together with
the dirty page logging mechanism to skip transferring some of these
unused pages, which helps to reduce network traffic and speed up the
live migration process.
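
The interplay described above between the unused page report and dirty
page logging can be illustrated with a small host-side model. This is
only a sketch under stated assumptions, not QEMU's actual code: the
names send_map, send_map_skip_unused and send_map_mark_dirty, and the
64-page guest, are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NPAGES 64 /* hypothetical guest size, in pages */

/* One flag per guest page: true means "still has to be sent". */
struct send_map {
	bool need_send[NPAGES];
};

/* Start of migration: assume every page must be transferred. */
static void send_map_init(struct send_map *m)
{
	for (size_t i = 0; i < NPAGES; i++)
		m->need_send[i] = true;
}

/* Apply one unused range reported by the guest: skip those pages. */
static void send_map_skip_unused(struct send_map *m, uint64_t pfn, uint64_t nr)
{
	assert(pfn <= NPAGES && nr <= NPAGES - pfn);
	for (uint64_t i = pfn; i < pfn + nr; i++)
		m->need_send[i] = false;
}

/* Dirty logging: a page written after the report must be sent after all. */
static void send_map_mark_dirty(struct send_map *m, uint64_t pfn)
{
	assert(pfn < NPAGES);
	m->need_send[pfn] = true;
}

/* Number of pages that still have to be transferred. */
static size_t send_map_count(const struct send_map *m)
{
	size_t n = 0;

	for (size_t i = 0; i < NPAGES; i++)
		n += m->need_send[i];
	return n;
}
```

The point of the model is the ordering: a stale "unused" report is
harmless as long as any later write to the page sets the flag again.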

Signed-off-by: Liang Li <liang.z.li@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
---
 drivers/virtio/virtio_balloon.c | 144 ++++++++++++++++++++++++++++++++++++++--
 include/linux/mm.h              |   3 +
 mm/page_alloc.c                 | 120 +++++++++++++++++++++++++++++++++
 3 files changed, 261 insertions(+), 6 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 03383b3..b67f865 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -56,7 +56,7 @@
 
 struct virtio_balloon {
 	struct virtio_device *vdev;
-	struct virtqueue *inflate_vq, *deflate_vq, *stats_vq;
+	struct virtqueue *inflate_vq, *deflate_vq, *stats_vq, *req_vq;
 
 	/* The balloon servicing is delegated to a freezable workqueue. */
 	struct work_struct update_balloon_stats_work;
@@ -85,6 +85,8 @@ struct virtio_balloon {
 	unsigned int nr_page_bmap;
 	/* Used to record the processed pfn range */
 	unsigned long min_pfn, max_pfn, start_pfn, end_pfn;
+	/* Request header */
+	struct virtio_balloon_req_hdr req_hdr;
 	/*
 	 * The pages we've told the Host we're not using are enqueued
 	 * at vb_dev_info->pages list.
@@ -505,6 +507,80 @@ static void update_balloon_stats(struct virtio_balloon *vb)
 				pages_to_bytes(available));
 }
 
+static void __send_unused_pages(struct virtio_balloon *vb,
+	unsigned long req_id, unsigned int pos, bool done)
+{
+	struct virtio_balloon_resp_hdr *hdr = vb->resp_hdr;
+	struct virtqueue *vq = vb->req_vq;
+
+	vb->resp_pos = pos;
+	hdr->cmd = BALLOON_GET_UNUSED_PAGES;
+	hdr->id = req_id;
+	if (!done)
+		hdr->flag = BALLOON_FLAG_CONT;
+	else
+		hdr->flag = BALLOON_FLAG_DONE;
+
+	if (pos > 0 || done)
+		send_resp_data(vb, vq, true);
+
+}
+
+static void send_unused_pages(struct virtio_balloon *vb,
+				unsigned long req_id)
+{
+	struct scatterlist sg_in;
+	unsigned int pos = 0;
+	struct virtqueue *vq = vb->req_vq;
+	int ret, order;
+	struct zone *zone = NULL;
+	bool part_fill = false;
+
+	mutex_lock(&vb->balloon_lock);
+
+	for (order = MAX_ORDER - 1; order >= 0; order--) {
+		ret = mark_unused_pages(&zone, order, vb->resp_data,
+			 vb->resp_buf_size / sizeof(__le64),
+			 &pos, VIRTIO_BALLOON_NR_PFN_BITS, part_fill);
+		if (ret == -ENOSPC) {
+			if (pos == 0) {
+				void *new_resp_data;
+
+				new_resp_data = kmalloc(2 * vb->resp_buf_size,
+							GFP_KERNEL);
+				if (new_resp_data) {
+					kfree(vb->resp_data);
+					vb->resp_data = new_resp_data;
+					vb->resp_buf_size *= 2;
+				} else {
+					part_fill = true;
+					dev_warn(&vb->vdev->dev,
+						 "%s: part fill order: %d\n",
+						 __func__, order);
+				}
+			} else {
+				__send_unused_pages(vb, req_id, pos, false);
+				pos = 0;
+			}
+
+			if (!part_fill) {
+				order++;
+				continue;
+			}
+		} else
+			zone = NULL;
+
+		if (order == 0)
+			__send_unused_pages(vb, req_id, pos, true);
+
+	}
+
+	mutex_unlock(&vb->balloon_lock);
+	sg_init_one(&sg_in, &vb->req_hdr, sizeof(vb->req_hdr));
+	virtqueue_add_inbuf(vq, &sg_in, 1, &vb->req_hdr, GFP_KERNEL);
+	virtqueue_kick(vq);
+}
+
 /*
  * While most virtqueues communicate guest-initiated requests to the hypervisor,
  * the stats queue operates in reverse.  The driver initializes the virtqueue
@@ -639,11 +715,38 @@ static void update_balloon_size_func(struct work_struct *work)
 		queue_work(system_freezable_wq, work);
 }
 
+static void misc_handle_rq(struct virtio_balloon *vb)
+{
+	struct virtio_balloon_req_hdr *ptr_hdr;
+	unsigned int len;
+
+	ptr_hdr = virtqueue_get_buf(vb->req_vq, &len);
+	if (!ptr_hdr || len != sizeof(vb->req_hdr))
+		return;
+
+	switch (ptr_hdr->cmd) {
+	case BALLOON_GET_UNUSED_PAGES:
+		send_unused_pages(vb, ptr_hdr->param);
+		break;
+	default:
+		break;
+	}
+}
+
+static void misc_request(struct virtqueue *vq)
+{
+	struct virtio_balloon *vb = vq->vdev->priv;
+
+	misc_handle_rq(vb);
+}
+
 static int init_vqs(struct virtio_balloon *vb)
 {
-	struct virtqueue *vqs[3];
-	vq_callback_t *callbacks[] = { balloon_ack, balloon_ack, stats_request };
-	static const char * const names[] = { "inflate", "deflate", "stats" };
+	struct virtqueue *vqs[4];
+	vq_callback_t *callbacks[] = { balloon_ack, balloon_ack,
+					 stats_request, misc_request };
+	static const char * const names[] = { "inflate", "deflate", "stats",
+						 "misc" };
 	int err, nvqs;
 
 	/*
@@ -651,6 +754,18 @@ static int init_vqs(struct virtio_balloon *vb)
 	 * optionally stat.
 	 */
 	nvqs = virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_STATS_VQ) ? 3 : 2;
+	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_HOST_REQ_VQ))
+		nvqs = 4;
+	else if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_STATS_VQ))
+		nvqs = 3;
+	else
+		nvqs = 2;
+
+	if (!virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_STATS_VQ)) {
+		__virtio_clear_bit(vb->vdev, VIRTIO_BALLOON_F_PAGE_RANGE);
+		__virtio_clear_bit(vb->vdev, VIRTIO_BALLOON_F_HOST_REQ_VQ);
+	}
+
 	err = vb->vdev->config->find_vqs(vb->vdev, nvqs, vqs, callbacks, names);
 	if (err)
 		return err;
@@ -671,6 +786,18 @@ static int init_vqs(struct virtio_balloon *vb)
 			BUG();
 		virtqueue_kick(vb->stats_vq);
 	}
+	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_HOST_REQ_VQ)) {
+		struct scatterlist sg_in;
+
+		vb->req_vq = vqs[3];
+		sg_init_one(&sg_in, &vb->req_hdr, sizeof(vb->req_hdr));
+		if (virtqueue_add_inbuf(vb->req_vq, &sg_in, 1,
+		    &vb->req_hdr, GFP_KERNEL) < 0)
+			__virtio_clear_bit(vb->vdev,
+					VIRTIO_BALLOON_F_HOST_REQ_VQ);
+		else
+			virtqueue_kick(vb->req_vq);
+	}
 	return 0;
 }
 
@@ -802,12 +929,14 @@ static int virtballoon_probe(struct virtio_device *vdev)
 	vb->resp_hdr = kzalloc(sizeof(struct virtio_balloon_resp_hdr),
 				 GFP_KERNEL);
 	/* Clear the feature bit if memory allocation fails */
-	if (!vb->resp_hdr)
+	if (!vb->resp_hdr) {
 		__virtio_clear_bit(vdev, VIRTIO_BALLOON_F_PAGE_RANGE);
-	else {
+		__virtio_clear_bit(vdev, VIRTIO_BALLOON_F_HOST_REQ_VQ);
+	} else {
 		vb->page_bitmap[0] = kmalloc(BALLOON_BMAP_SIZE, GFP_KERNEL);
 		if (!vb->page_bitmap[0]) {
 			__virtio_clear_bit(vdev, VIRTIO_BALLOON_F_PAGE_RANGE);
+			__virtio_clear_bit(vdev, VIRTIO_BALLOON_F_HOST_REQ_VQ);
 			kfree(vb->resp_hdr);
 		} else {
 			vb->nr_page_bmap = 1;
@@ -815,6 +944,8 @@ static int virtballoon_probe(struct virtio_device *vdev)
 			if (!vb->resp_data) {
 				__virtio_clear_bit(vdev,
 						VIRTIO_BALLOON_F_PAGE_RANGE);
+				__virtio_clear_bit(vdev,
+						VIRTIO_BALLOON_F_HOST_REQ_VQ);
 				kfree(vb->page_bitmap[0]);
 				kfree(vb->resp_hdr);
 			}
@@ -941,6 +1072,7 @@ static int virtballoon_restore(struct virtio_device *vdev)
 	VIRTIO_BALLOON_F_STATS_VQ,
 	VIRTIO_BALLOON_F_DEFLATE_ON_OOM,
 	VIRTIO_BALLOON_F_PAGE_RANGE,
+	VIRTIO_BALLOON_F_HOST_REQ_VQ,
 };
 
 static struct virtio_driver virtio_balloon_driver = {
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 4424784..a80b8f3 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1762,6 +1762,9 @@ static inline spinlock_t *pmd_lock(struct mm_struct *mm, pmd_t *pmd)
 extern void free_area_init_node(int nid, unsigned long * zones_size,
 		unsigned long zone_start_pfn, unsigned long *zholes_size);
 extern void free_initmem(void);
+extern int mark_unused_pages(struct zone **start_zone, int order,
+		__le64 *pages, unsigned int size, unsigned int *pos,
+		u8 len_bits, bool part_fill);
 
 /*
  * Free reserved pages within range [PAGE_ALIGN(start), end & PAGE_MASK)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2c6d5f6..de0e7a4 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4479,6 +4479,126 @@ void show_free_areas(unsigned int filter)
 	show_swap_cache_info();
 }
 
+static int  __mark_unused_pages(struct zone *zone, int order,
+		__le64 *pages, unsigned int size, unsigned int *pos,
+		u8 len_bits, bool part_fill)
+{
+	unsigned long pfn, flags;
+	int t, ret = 0;
+	struct list_head *curr;
+	__le64 *range;
+
+	if (zone_is_empty(zone))
+		return 0;
+
+	spin_lock_irqsave(&zone->lock, flags);
+
+	if (*pos + zone->free_area[order].nr_free > size && !part_fill) {
+		ret = -ENOSPC;
+		goto out;
+	}
+	for (t = 0; t < MIGRATE_TYPES; t++) {
+		list_for_each(curr, &zone->free_area[order].free_list[t]) {
+			pfn = page_to_pfn(list_entry(curr, struct page, lru));
+			range = pages + *pos;
+			if (order < len_bits) {
+				if (*pos + 1 > size) {
+					ret = -ENOSPC;
+					goto out;
+				}
+				*range = cpu_to_le64((pfn << len_bits)
+							| 1 << order);
+				*pos += 1;
+			} else {
+				if (*pos + 2 > size) {
+					ret = -ENOSPC;
+					goto out;
+				}
+				*range = cpu_to_le64((pfn << len_bits) | 0);
+				*(range + 1) = cpu_to_le64(1 << order);
+				*pos += 2;
+			}
+		}
+	}
+
+out:
+	spin_unlock_irqrestore(&zone->lock, flags);
+
+	return ret;
+}
+
+/*
+ * During live migration, a page is discardable unless its content
+ * is needed by the system.
+ * mark_unused_pages provides an API to mark the unused pages, these
+ * unused pages can be discarded if there is no modification since
+ * the request. Some other mechanism, like the dirty page logging
+ * can be used to track the modification.
+ *
+ * This function scans the free page list to mark the unused pages
+ * with the specified order, and set the corresponding range element
+ * in the array 'pages' if unused pages are found for the specified
+ * order.
+ *
+ * @start_zone: zone to start the mark operation.
+ * @order: page order to mark.
+ * @pages: array to save the unused page info.
+ * @size: size of array pages.
+ * @pos: offset in the array to save the page info.
+ * @len_bits: bits for the length field of the range.
+ * @part_fill: indicate if partial fill is used.
+ *
+ * return -EINVAL if parameter is invalid
+ * return -ENOSPC when the array 'pages' can't hold the page info
+ * return 0 on success
+ */
+int mark_unused_pages(struct zone **start_zone, int order,
+	__le64 *pages, unsigned int size, unsigned int *pos,
+	u8 len_bits, bool part_fill)
+{
+	struct zone *zone;
+	int ret = 0;
+	bool skip_check = false;
+
+	/* make sure all the parameters are valid */
+	if (pages == NULL || pos == NULL || *pos < 0
+		|| order >= MAX_ORDER || len_bits > 64)
+		return -EINVAL;
+	if (*start_zone != NULL) {
+		bool found = false;
+
+		for_each_populated_zone(zone) {
+			if (zone != *start_zone)
+				continue;
+			found = true;
+			break;
+		}
+		if (!found)
+			return -EINVAL;
+	} else
+		skip_check = true;
+
+	for_each_populated_zone(zone) {
+		/* Start from *start_zone if it's not NULL */
+		if (!skip_check) {
+			if (*start_zone != zone)
+				continue;
+			else
+				skip_check = true;
+		}
+		ret = __mark_unused_pages(zone, order, pages, size,
+					pos, len_bits, part_fill);
+		if (ret < 0) {
+			/* record the failed zone */
+			*start_zone = zone;
+			break;
+		}
+	}
+
+	return ret;
+}
+EXPORT_SYMBOL(mark_unused_pages);
+
 static void zoneref_set_zone(struct zone *zone, struct zoneref *zoneref)
 {
 	zoneref->zone = zone;
-- 
1.9.1


* RE: [PATCH v6 kernel 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration
  2016-12-21  6:52 [PATCH v6 kernel 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration Liang Li
                   ` (4 preceding siblings ...)
  2016-12-21  6:52 ` [PATCH v6 kernel 5/5] virtio-balloon: tell host vm's unused page info Liang Li
@ 2017-01-10  6:43 ` Li, Liang Z
  2017-01-18 10:09 ` David Hildenbrand
  6 siblings, 0 replies; 24+ messages in thread
From: Li, Liang Z @ 2017-01-10  6:43 UTC (permalink / raw)
  To: kvm
  Cc: virtio-dev, qemu-devel, linux-mm, linux-kernel, virtualization,
	amit.shah, Hansen, Dave, cornelia.huck, pbonzini, mst, david,
	aarcange, dgilbert, quintela

Hi guys,

Could you help to review this patch set?

Thanks!
Liang

> -----Original Message-----
> From: Li, Liang Z
> Sent: Wednesday, December 21, 2016 2:52 PM
> To: kvm@vger.kernel.org
> Cc: virtio-dev@lists.oasis-open.org; qemu-devel@nongnu.org; linux-
> mm@kvack.org; linux-kernel@vger.kernel.org; virtualization@lists.linux-
> foundation.org; amit.shah@redhat.com; Hansen, Dave;
> cornelia.huck@de.ibm.com; pbonzini@redhat.com; mst@redhat.com;
> david@redhat.com; aarcange@redhat.com; dgilbert@redhat.com;
> quintela@redhat.com; Li, Liang Z
> Subject: [PATCH v6 kernel 0/5] Extend virtio-balloon for fast (de)inflating &
> fast live migration
> 
> This patch set contains two parts of changes to the virtio-balloon.
> 
> One is the change for speeding up the inflating & deflating process, the main
> idea of this optimization is to use {pfn|length} to present the page
> information instead of the PFNs, to reduce the overhead of virtio data
> transmission, address translation and madvise(). This can help to improve the
> performance by about 85%.
> 
> Another change is for speeding up live migration. By skipping process guest's
> unused pages in the first round of data copy, to reduce needless data
> processing, this can help to save quite a lot of CPU cycles and network
> bandwidth. We put guest's unused page information in a {pfn|length} array
> and send it to host with the virt queue of virtio-balloon. For an idle guest with
> 8GB RAM, this can help to shorten the total live migration time from 2Sec to
> about 500ms in 10Gbps network environment. For an guest with quite a lot
> of page cache and with little unused pages, it's possible to let the guest drop
> it's page cache before live migration, this case can benefit from this new
> feature too.
> 
> Changes from v5 to v6:
>     * Drop the bitmap from the virtio ABI, use {pfn|length} only.
>     * Enhance the API to get the unused page information from mm.
> 
> Changes from v4 to v5:
>     * Drop the code to get the max_pfn, use another way instead.
>     * Simplify the API to get the unused page information from mm.
> 
> Changes from v3 to v4:
>     * Use the new scheme suggested by Dave Hansen to encode the bitmap.
>     * Add code which is missed in v3 to handle migrate page.
>     * Free the memory for bitmap intime once the operation is done.
>     * Address some of the comments in v3.
> 
> Changes from v2 to v3:
>     * Change the name of 'free page' to 'unused page'.
>     * Use the scatter & gather bitmap instead of a 1MB page bitmap.
>     * Fix overwriting the page bitmap after kicking.
>     * Some of MST's comments for v2.
> 
> Changes from v1 to v2:
>     * Abandon the patch for dropping page cache.
>     * Put some structures to uapi head file.
>     * Use a new way to determine the page bitmap size.
>     * Use a unified way to send the free page information with the bitmap
>     * Address the issues referred in MST's comments
> 
> Liang Li (5):
>   virtio-balloon: rework deflate to add page to a list
>   virtio-balloon: define new feature bit and head struct
>   virtio-balloon: speed up inflate/deflate process
>   virtio-balloon: define flags and head for host request vq
>   virtio-balloon: tell host vm's unused page info
> 
>  drivers/virtio/virtio_balloon.c     | 510
> ++++++++++++++++++++++++++++++++----
>  include/linux/mm.h                  |   3 +
>  include/uapi/linux/virtio_balloon.h |  34 +++
>  mm/page_alloc.c                     | 120 +++++++++
>  4 files changed, 621 insertions(+), 46 deletions(-)
> 
> --
> 1.9.1


* Re: [PATCH v6 kernel 2/5] virtio-balloon: define new feature bit and head struct
  2016-12-21  6:52 ` [PATCH v6 kernel 2/5] virtio-balloon: define new feature bit and head struct Liang Li
@ 2017-01-12 19:43   ` Michael S. Tsirkin
  2017-01-13  9:24     ` [virtio-dev] " Li, Liang Z
  0 siblings, 1 reply; 24+ messages in thread
From: Michael S. Tsirkin @ 2017-01-12 19:43 UTC (permalink / raw)
  To: Liang Li
  Cc: kvm, virtio-dev, qemu-devel, linux-mm, linux-kernel,
	virtualization, amit.shah, dave.hansen, cornelia.huck, pbonzini,
	david, aarcange, dgilbert, quintela

On Wed, Dec 21, 2016 at 02:52:25PM +0800, Liang Li wrote:
> Add a new feature which supports sending the page information
> with range array. The current implementation uses PFNs array,
> which is not very efficient. Using ranges can improve the
> performance of inflating/deflating significantly.
> 
> Signed-off-by: Liang Li <liang.z.li@intel.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
> Cc: Amit Shah <amit.shah@redhat.com>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: David Hildenbrand <david@redhat.com>
> ---
>  include/uapi/linux/virtio_balloon.h | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/include/uapi/linux/virtio_balloon.h b/include/uapi/linux/virtio_balloon.h
> index 343d7dd..2f850bf 100644
> --- a/include/uapi/linux/virtio_balloon.h
> +++ b/include/uapi/linux/virtio_balloon.h
> @@ -34,10 +34,14 @@
>  #define VIRTIO_BALLOON_F_MUST_TELL_HOST	0 /* Tell before reclaiming pages */
>  #define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue */
>  #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM	2 /* Deflate balloon on OOM */
> +#define VIRTIO_BALLOON_F_PAGE_RANGE	3 /* Send page info with ranges */
>  
>  /* Size of a PFN in the balloon interface. */
>  #define VIRTIO_BALLOON_PFN_SHIFT 12
>  
> +/* Bits width for the length of the pfn range */

What does this mean? Couldn't figure it out.

> +#define VIRTIO_BALLOON_NR_PFN_BITS 12
> +
>  struct virtio_balloon_config {
>  	/* Number of pages host wants Guest to give up. */
>  	__u32 num_pages;
> @@ -82,4 +86,12 @@ struct virtio_balloon_stat {
>  	__virtio64 val;
>  } __attribute__((packed));
>  
> +/* Response header structure */
> +struct virtio_balloon_resp_hdr {
> +	__le64 cmd : 8; /* Distinguish different requests type */
> +	__le64 flag: 8; /* Mark status for a specific request type */
> +	__le64 id : 16; /* Distinguish requests of a specific type */
> +	__le64 data_len: 32; /* Length of the following data, in bytes */

This use of __le64 makes no sense.  Just use u8/le16/le32 pls.

> +};
> +
>  #endif /* _LINUX_VIRTIO_BALLOON_H */
> -- 
> 1.9.1


* RE: [virtio-dev] Re: [PATCH v6 kernel 2/5] virtio-balloon: define new feature bit and head struct
  2017-01-12 19:43   ` Michael S. Tsirkin
@ 2017-01-13  9:24     ` Li, Liang Z
  2017-01-17 19:11       ` Michael S. Tsirkin
  0 siblings, 1 reply; 24+ messages in thread
From: Li, Liang Z @ 2017-01-13  9:24 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, virtio-dev, qemu-devel, linux-mm, linux-kernel,
	virtualization, amit.shah, Hansen, Dave, cornelia.huck, pbonzini,
	david, aarcange, dgilbert, quintela

> On Wed, Dec 21, 2016 at 02:52:25PM +0800, Liang Li wrote:
> > Add a new feature which supports sending the page information with
> > range array. The current implementation uses PFNs array, which is not
> > very efficient. Using ranges can improve the performance of
> > inflating/deflating significantly.
> >
> > Signed-off-by: Liang Li <liang.z.li@intel.com>
> > Cc: Michael S. Tsirkin <mst@redhat.com>
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
> > Cc: Amit Shah <amit.shah@redhat.com>
> > Cc: Dave Hansen <dave.hansen@intel.com>
> > Cc: Andrea Arcangeli <aarcange@redhat.com>
> > Cc: David Hildenbrand <david@redhat.com>
> > ---
> >  include/uapi/linux/virtio_balloon.h | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/include/uapi/linux/virtio_balloon.h
> > b/include/uapi/linux/virtio_balloon.h
> > index 343d7dd..2f850bf 100644
> > --- a/include/uapi/linux/virtio_balloon.h
> > +++ b/include/uapi/linux/virtio_balloon.h
> > @@ -34,10 +34,14 @@
> >  #define VIRTIO_BALLOON_F_MUST_TELL_HOST	0 /* Tell before
> reclaiming pages */
> >  #define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue
> */
> >  #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM	2 /* Deflate balloon
> on OOM */
> > +#define VIRTIO_BALLOON_F_PAGE_RANGE	3 /* Send page info
> with ranges */
> >
> >  /* Size of a PFN in the balloon interface. */  #define
> > VIRTIO_BALLOON_PFN_SHIFT 12
> >
> > +/* Bits width for the length of the pfn range */
> 
> What does this mean? Couldn't figure it out.
> 
> > +#define VIRTIO_BALLOON_NR_PFN_BITS 12
> > +
> >  struct virtio_balloon_config {
> >  	/* Number of pages host wants Guest to give up. */
> >  	__u32 num_pages;
> > @@ -82,4 +86,12 @@ struct virtio_balloon_stat {
> >  	__virtio64 val;
> >  } __attribute__((packed));
> >
> > +/* Response header structure */
> > +struct virtio_balloon_resp_hdr {
> > +	__le64 cmd : 8; /* Distinguish different requests type */
> > +	__le64 flag: 8; /* Mark status for a specific request type */
> > +	__le64 id : 16; /* Distinguish requests of a specific type */
> > +	__le64 data_len: 32; /* Length of the following data, in bytes */
> 
> This use of __le64 makes no sense.  Just use u8/le16/le32 pls.
> 

Got it, will change in the next version. 

And could you help take a look at the other parts, as well as the QEMU part?

Thanks!
Liang


* Re: [virtio-dev] Re: [PATCH v6 kernel 2/5] virtio-balloon: define new feature bit and head struct
  2017-01-13  9:24     ` [virtio-dev] " Li, Liang Z
@ 2017-01-17 19:11       ` Michael S. Tsirkin
  2017-01-18  1:55         ` Li, Liang Z
  0 siblings, 1 reply; 24+ messages in thread
From: Michael S. Tsirkin @ 2017-01-17 19:11 UTC (permalink / raw)
  To: Li, Liang Z
  Cc: kvm, virtio-dev, qemu-devel, linux-mm, linux-kernel,
	virtualization, amit.shah, Hansen, Dave, cornelia.huck, pbonzini,
	david, aarcange, dgilbert, quintela

On Fri, Jan 13, 2017 at 09:24:22AM +0000, Li, Liang Z wrote:
> > On Wed, Dec 21, 2016 at 02:52:25PM +0800, Liang Li wrote:
> > > Add a new feature which supports sending the page information with
> > > range array. The current implementation uses PFNs array, which is not
> > > very efficient. Using ranges can improve the performance of
> > > inflating/deflating significantly.
> > >
> > > Signed-off-by: Liang Li <liang.z.li@intel.com>
> > > Cc: Michael S. Tsirkin <mst@redhat.com>
> > > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
> > > Cc: Amit Shah <amit.shah@redhat.com>
> > > Cc: Dave Hansen <dave.hansen@intel.com>
> > > Cc: Andrea Arcangeli <aarcange@redhat.com>
> > > Cc: David Hildenbrand <david@redhat.com>
> > > ---
> > >  include/uapi/linux/virtio_balloon.h | 12 ++++++++++++
> > >  1 file changed, 12 insertions(+)
> > >
> > > diff --git a/include/uapi/linux/virtio_balloon.h
> > > b/include/uapi/linux/virtio_balloon.h
> > > index 343d7dd..2f850bf 100644
> > > --- a/include/uapi/linux/virtio_balloon.h
> > > +++ b/include/uapi/linux/virtio_balloon.h
> > > @@ -34,10 +34,14 @@
> > >  #define VIRTIO_BALLOON_F_MUST_TELL_HOST	0 /* Tell before
> > reclaiming pages */
> > >  #define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue
> > */
> > >  #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM	2 /* Deflate balloon
> > on OOM */
> > > +#define VIRTIO_BALLOON_F_PAGE_RANGE	3 /* Send page info
> > with ranges */
> > >
> > >  /* Size of a PFN in the balloon interface. */  #define
> > > VIRTIO_BALLOON_PFN_SHIFT 12
> > >
> > > +/* Bits width for the length of the pfn range */
> > 
> > What does this mean? Couldn't figure it out.
> > 
> > > +#define VIRTIO_BALLOON_NR_PFN_BITS 12
> > > +
> > >  struct virtio_balloon_config {
> > >  	/* Number of pages host wants Guest to give up. */
> > >  	__u32 num_pages;
> > > @@ -82,4 +86,12 @@ struct virtio_balloon_stat {
> > >  	__virtio64 val;
> > >  } __attribute__((packed));
> > >
> > > +/* Response header structure */
> > > +struct virtio_balloon_resp_hdr {
> > > +	__le64 cmd : 8; /* Distinguish different requests type */
> > > +	__le64 flag: 8; /* Mark status for a specific request type */
> > > +	__le64 id : 16; /* Distinguish requests of a specific type */
> > > +	__le64 data_len: 32; /* Length of the following data, in bytes */
> > 
> > This use of __le64 makes no sense.  Just use u8/le16/le32 pls.
> > 
> 
> Got it, will change in the next version. 
> 
> > And could you help take a look at the other parts, as well as the QEMU part?
> 
> Thanks!
> Liang

Yes but first I would like to understand how come no fields
in this new structure come up if I search for them in the
following patch. I don't see why should I waste time on
reviewing the implementation if the interface isn't
reasonable. You don't have to waste it too - just send RFC
patches with the header until we can agree on it.

-- 
MST


* Re: [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process
  2016-12-21  6:52 ` [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process Liang Li
@ 2017-01-17 19:15   ` Michael S. Tsirkin
  2017-01-18  4:56     ` Li, Liang Z
  2017-01-20 11:48   ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 24+ messages in thread
From: Michael S. Tsirkin @ 2017-01-17 19:15 UTC (permalink / raw)
  To: Liang Li
  Cc: kvm, virtio-dev, qemu-devel, linux-mm, linux-kernel,
	virtualization, amit.shah, dave.hansen, cornelia.huck, pbonzini,
	david, aarcange, dgilbert, quintela

On Wed, Dec 21, 2016 at 02:52:26PM +0800, Liang Li wrote:
>  
> -	/* We should always be able to add one buffer to an empty queue. */
> -	virtqueue_add_outbuf(vq, &sg, 1, vb, GFP_KERNEL);
> -	virtqueue_kick(vq);
> +static void do_set_resp_bitmap(struct virtio_balloon *vb,
> +		unsigned long base_pfn, int pages)
>  
> -	/* When host has read buffer, this completes via balloon_ack */
> -	wait_event(vb->acked, virtqueue_get_buf(vq, &len));
> +{
> +	__le64 *range = vb->resp_data + vb->resp_pos;
>  
> +	if (pages > (1 << VIRTIO_BALLOON_NR_PFN_BITS)) {
> +		/* when the length field can't contain pages, set it to 0 to

/*
 * Multi-line
 * comments
 * should look like this.
 */

Also, pls start sentences with an upper-case letter.

> +		 * indicate the actual length is in the next __le64;
> +		 */

This is part of the interface so should be documented as such.

> +		*range = cpu_to_le64((base_pfn <<
> +				VIRTIO_BALLOON_NR_PFN_BITS) | 0);
> +		*(range + 1) = cpu_to_le64(pages);
> +		vb->resp_pos += 2;

Pls use structs for this kind of stuff.

> +	} else {
> +		*range = (base_pfn << VIRTIO_BALLOON_NR_PFN_BITS) | pages;
> +		vb->resp_pos++;
> +	}
> +}


* RE: [virtio-dev] Re: [PATCH v6 kernel 2/5] virtio-balloon: define new feature bit and head struct
  2017-01-17 19:11       ` Michael S. Tsirkin
@ 2017-01-18  1:55         ` Li, Liang Z
  2017-01-18 15:30           ` Michael S. Tsirkin
  0 siblings, 1 reply; 24+ messages in thread
From: Li, Liang Z @ 2017-01-18  1:55 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, virtio-dev, qemu-devel, linux-mm, linux-kernel,
	virtualization, amit.shah, Hansen, Dave, cornelia.huck, pbonzini,
	david, aarcange, dgilbert, quintela

> Sent: Wednesday, January 18, 2017 3:11 AM
> To: Li, Liang Z
> Cc: kvm@vger.kernel.org; virtio-dev@lists.oasis-open.org; qemu-
> devel@nongnu.org; linux-mm@kvack.org; linux-kernel@vger.kernel.org;
> virtualization@lists.linux-foundation.org; amit.shah@redhat.com; Hansen,
> Dave; cornelia.huck@de.ibm.com; pbonzini@redhat.com;
> david@redhat.com; aarcange@redhat.com; dgilbert@redhat.com;
> quintela@redhat.com
> Subject: Re: [virtio-dev] Re: [PATCH v6 kernel 2/5] virtio-balloon: define new
> feature bit and head struct
> 
> On Fri, Jan 13, 2017 at 09:24:22AM +0000, Li, Liang Z wrote:
> > > On Wed, Dec 21, 2016 at 02:52:25PM +0800, Liang Li wrote:
> > > > Add a new feature which supports sending the page information with
> > > > range array. The current implementation uses PFNs array, which is
> > > > not very efficient. Using ranges can improve the performance of
> > > > inflating/deflating significantly.
> > > >
> > > > Signed-off-by: Liang Li <liang.z.li@intel.com>
> > > > Cc: Michael S. Tsirkin <mst@redhat.com>
> > > > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > > Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
> > > > Cc: Amit Shah <amit.shah@redhat.com>
> > > > Cc: Dave Hansen <dave.hansen@intel.com>
> > > > Cc: Andrea Arcangeli <aarcange@redhat.com>
> > > > Cc: David Hildenbrand <david@redhat.com>
> > > > ---
> > > >  include/uapi/linux/virtio_balloon.h | 12 ++++++++++++
> > > >  1 file changed, 12 insertions(+)
> > > >
> > > > diff --git a/include/uapi/linux/virtio_balloon.h
> > > > b/include/uapi/linux/virtio_balloon.h
> > > > index 343d7dd..2f850bf 100644
> > > > --- a/include/uapi/linux/virtio_balloon.h
> > > > +++ b/include/uapi/linux/virtio_balloon.h
> > > > @@ -34,10 +34,14 @@
> > > >  #define VIRTIO_BALLOON_F_MUST_TELL_HOST	0 /* Tell before
> > > reclaiming pages */
> > > >  #define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue
> > > */
> > > >  #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM	2 /* Deflate
> balloon
> > > on OOM */
> > > > +#define VIRTIO_BALLOON_F_PAGE_RANGE	3 /* Send page info
> > > with ranges */
> > > >
> > > >  /* Size of a PFN in the balloon interface. */  #define
> > > > VIRTIO_BALLOON_PFN_SHIFT 12
> > > >
> > > > +/* Bits width for the length of the pfn range */
> > >
> > > What does this mean? Couldn't figure it out.
> > >
> > > > +#define VIRTIO_BALLOON_NR_PFN_BITS 12
> > > > +
> > > >  struct virtio_balloon_config {
> > > >  	/* Number of pages host wants Guest to give up. */
> > > >  	__u32 num_pages;
> > > > @@ -82,4 +86,12 @@ struct virtio_balloon_stat {
> > > >  	__virtio64 val;
> > > >  } __attribute__((packed));
> > > >
> > > > +/* Response header structure */
> > > > +struct virtio_balloon_resp_hdr {
> > > > +	__le64 cmd : 8; /* Distinguish different requests type */
> > > > +	__le64 flag: 8; /* Mark status for a specific request type */
> > > > +	__le64 id : 16; /* Distinguish requests of a specific type */
> > > > +	__le64 data_len: 32; /* Length of the following data, in bytes
> > > > +*/
> > >
> > > This use of __le64 makes no sense.  Just use u8/le16/le32 pls.
> > >
> >
> > Got it, will change in the next version.
> >
> > > And could you help take a look at the other parts, as well as the QEMU part?
> >
> > Thanks!
> > Liang
> 
> Yes but first I would like to understand how come no fields in this new
> structure come up if I search for them in the following patch. I don't see why

That's not true: all of the fields are referenced in the following patches,
except the 'reserved' field.

> should I waste time on reviewing the implementation if the interface isn't
> reasonable. You don't have to waste it too - just send RFC patches with the
> header until we can agree on it.

OK. I will post the header part separately.

Thanks!
Liang
> 
> --
> MST
> 


* RE: [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process
  2017-01-17 19:15   ` Michael S. Tsirkin
@ 2017-01-18  4:56     ` Li, Liang Z
  2017-01-18 15:30       ` Michael S. Tsirkin
  0 siblings, 1 reply; 24+ messages in thread
From: Li, Liang Z @ 2017-01-18  4:56 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, virtio-dev, qemu-devel, linux-mm, linux-kernel,
	virtualization, amit.shah, Hansen, Dave, cornelia.huck, pbonzini,
	david, aarcange, dgilbert, quintela

> > -	virtqueue_add_outbuf(vq, &sg, 1, vb, GFP_KERNEL);
> > -	virtqueue_kick(vq);
> > +static void do_set_resp_bitmap(struct virtio_balloon *vb,
> > +		unsigned long base_pfn, int pages)
> >
> > -	/* When host has read buffer, this completes via balloon_ack */
> > -	wait_event(vb->acked, virtqueue_get_buf(vq, &len));
> > +{
> > +	__le64 *range = vb->resp_data + vb->resp_pos;
> >
> > +	if (pages > (1 << VIRTIO_BALLOON_NR_PFN_BITS)) {
> > +		/* when the length field can't contain pages, set it to 0 to
> 
> /*
>  * Multi-line
>  * comments
>  * should look like this.
>  */
> 
> Also, pls start sentences with an upper-case letter.
> 

Sorry for that.

> > +		 * indicate the actual length is in the next __le64;
> > +		 */
> 
> This is part of the interface so should be documented as such.
> 
> > +		*range = cpu_to_le64((base_pfn <<
> > +				VIRTIO_BALLOON_NR_PFN_BITS) | 0);
> > +		*(range + 1) = cpu_to_le64(pages);
> > +		vb->resp_pos += 2;
> 
> Pls use structs for this kind of stuff.

I am not sure whether you mean using

struct range {
	__le64 pfn: 52;
	__le64 nr_page: 12;
};

instead of the shift operation?

I didn't do it this way because I don't want to include 'virtio-balloon.h' in
page_alloc.c, or copy the definition of this struct into page_alloc.c.

Thanks!
Liang

* Re: [PATCH v6 kernel 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration
  2016-12-21  6:52 [PATCH v6 kernel 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration Liang Li
                   ` (5 preceding siblings ...)
  2017-01-10  6:43 ` [PATCH v6 kernel 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration Li, Liang Z
@ 2017-01-18 10:09 ` David Hildenbrand
  2017-01-18 13:29   ` Li, Liang Z
  2017-01-18 15:38   ` Michael S. Tsirkin
  6 siblings, 2 replies; 24+ messages in thread
From: David Hildenbrand @ 2017-01-18 10:09 UTC (permalink / raw)
  To: Liang Li, kvm
  Cc: virtio-dev, qemu-devel, linux-mm, linux-kernel, virtualization,
	amit.shah, dave.hansen, cornelia.huck, pbonzini, mst, aarcange,
	dgilbert, quintela

Am 21.12.2016 um 07:52 schrieb Liang Li:
> This patch set contains two parts of changes to the virtio-balloon.
>
> One is the change for speeding up the inflating & deflating process,
> the main idea of this optimization is to use {pfn|length} to present
> the page information instead of the PFNs, to reduce the overhead of
> virtio data transmission, address translation and madvise(). This can
> help to improve the performance by about 85%.
>
> Another change is for speeding up live migration. By skipping process
> guest's unused pages in the first round of data copy, to reduce needless
> data processing, this can help to save quite a lot of CPU cycles and
> network bandwidth. We put guest's unused page information in a
> {pfn|length} array and send it to host with the virt queue of
> virtio-balloon. For an idle guest with 8GB RAM, this can help to shorten
> the total live migration time from 2Sec to about 500ms in 10Gbps network
> environment. For an guest with quite a lot of page cache and with little
> unused pages, it's possible to let the guest drop it's page cache before
> live migration, this case can benefit from this new feature too.

I agree that both changes make sense (although the second change just 
smells very racy, as you also pointed out in the patch description),
however I am not sure if virtio-balloon is really the right place for
the latter change.

virtio-balloon is all about ballooning, nothing else. What you're doing
is using it as a way to communicate balloon-unrelated data from/to the
hypervisor. Yes, it is also about guest memory, but completely unrelated
to the purpose of the balloon device.

Maybe using virtio-balloon for this purpose is okay - I have mixed
feelings (especially as I can't tell where else this could go). I would
like to get a second opinion on this.

-- 

David

* RE: [PATCH v6 kernel 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration
  2017-01-18 10:09 ` David Hildenbrand
@ 2017-01-18 13:29   ` Li, Liang Z
  2017-01-18 15:38   ` Michael S. Tsirkin
  1 sibling, 0 replies; 24+ messages in thread
From: Li, Liang Z @ 2017-01-18 13:29 UTC (permalink / raw)
  To: David Hildenbrand, kvm
  Cc: virtio-dev, qemu-devel, linux-mm, linux-kernel, virtualization,
	amit.shah, Hansen, Dave, cornelia.huck, pbonzini, mst, aarcange,
	dgilbert, quintela

> Am 21.12.2016 um 07:52 schrieb Liang Li:
> > This patch set contains two parts of changes to the virtio-balloon.
> >
> > One is the change for speeding up the inflating & deflating process,
> > the main idea of this optimization is to use {pfn|length} to present
> > the page information instead of the PFNs, to reduce the overhead of
> > virtio data transmission, address translation and madvise(). This can
> > help to improve the performance by about 85%.
> >
> > Another change is for speeding up live migration. By skipping process
> > guest's unused pages in the first round of data copy, to reduce
> > needless data processing, this can help to save quite a lot of CPU
> > cycles and network bandwidth. We put guest's unused page information
> > in a {pfn|length} array and send it to host with the virt queue of
> > virtio-balloon. For an idle guest with 8GB RAM, this can help to
> > shorten the total live migration time from 2Sec to about 500ms in
> > 10Gbps network environment. For an guest with quite a lot of page
> > cache and with little unused pages, it's possible to let the guest
> > drop it's page cache before live migration, this case can benefit from this
> new feature too.
> 
> I agree that both changes make sense (although the second change just
> smells very racy, as you also pointed out in the patch description), however I
> am not sure if virtio-balloon is really the right place for the latter change.
> 
> virtio-balloon is all about ballooning, nothing else. What you're doing is using
> it as a way to communicate balloon-unrelated data from/to the hypervisor.
> Yes, it is also about guest memory, but completely unrelated to the purpose
> of the balloon device.
> 
> Maybe using virtio-balloon for this purpose is okay - I have mixed feelings
> (especially as I can't tell where else this could go). I would like to get a second
> opinion on this.
> 

We discussed the implementation at length. Making use of the existing
virtio-balloon seemed better than the other solutions, and it is what Michael
recommended.

Thanks!
Liang
> --
> 
> David

* Re: [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process
  2017-01-18  4:56     ` Li, Liang Z
@ 2017-01-18 15:30       ` Michael S. Tsirkin
  2017-01-19  1:44         ` Li, Liang Z
  0 siblings, 1 reply; 24+ messages in thread
From: Michael S. Tsirkin @ 2017-01-18 15:30 UTC (permalink / raw)
  To: Li, Liang Z
  Cc: kvm, virtio-dev, qemu-devel, linux-mm, linux-kernel,
	virtualization, amit.shah, Hansen, Dave, cornelia.huck, pbonzini,
	david, aarcange, dgilbert, quintela

On Wed, Jan 18, 2017 at 04:56:58AM +0000, Li, Liang Z wrote:
> > > -	virtqueue_add_outbuf(vq, &sg, 1, vb, GFP_KERNEL);
> > > -	virtqueue_kick(vq);
> > > +static void do_set_resp_bitmap(struct virtio_balloon *vb,
> > > +		unsigned long base_pfn, int pages)
> > >
> > > -	/* When host has read buffer, this completes via balloon_ack */
> > > -	wait_event(vb->acked, virtqueue_get_buf(vq, &len));
> > > +{
> > > +	__le64 *range = vb->resp_data + vb->resp_pos;
> > >
> > > +	if (pages > (1 << VIRTIO_BALLOON_NR_PFN_BITS)) {
> > > +		/* when the length field can't contain pages, set it to 0 to
> > 
> > /*
> >  * Multi-line
> >  * comments
> >  * should look like this.
> >  */
> > 
> > Also, pls start sentences with an upper-case letter.
> > 
> 
> Sorry for that.
> 
> > > +		 * indicate the actual length is in the next __le64;
> > > +		 */
> > 
> > This is part of the interface so should be documented as such.
> > 
> > > +		*range = cpu_to_le64((base_pfn <<
> > > +				VIRTIO_BALLOON_NR_PFN_BITS) | 0);
> > > +		*(range + 1) = cpu_to_le64(pages);
> > > +		vb->resp_pos += 2;
> > 
> > Pls use structs for this kind of stuff.
> 
> I am not sure if you mean to use 
> 
> struct  range {
>  	__le64 pfn: 52;
> 	__le64 nr_page: 12
> }
> Instead of the shift operation?

Not just that. You want to add a pages field as well.

Generally describe the format in the header in some way
so host and guest can easily stay in sync.

All the pointer math and void * means we get zero type
safety and I'm not happy about it.


> I didn't use this way because I don't want to include 'virtio-balloon.h' in page_alloc.c,
> or copy the define of this struct in page_alloc.c
> 
> Thanks!
> Liang


It's not good that virtio format seeps out to page_alloc anyway.
If unavoidable it is not a good idea to try to hide this fact,
people will assume they can change the format at will.

-- 
MST

* Re: [virtio-dev] Re: [PATCH v6 kernel 2/5] virtio-balloon: define new feature bit and head struct
  2017-01-18  1:55         ` Li, Liang Z
@ 2017-01-18 15:30           ` Michael S. Tsirkin
  2017-01-19  1:30             ` Li, Liang Z
  0 siblings, 1 reply; 24+ messages in thread
From: Michael S. Tsirkin @ 2017-01-18 15:30 UTC (permalink / raw)
  To: Li, Liang Z
  Cc: kvm, virtio-dev, qemu-devel, linux-mm, linux-kernel,
	virtualization, amit.shah, Hansen, Dave, cornelia.huck, pbonzini,
	david, aarcange, dgilbert, quintela

On Wed, Jan 18, 2017 at 01:55:12AM +0000, Li, Liang Z wrote:
> > Sent: Wednesday, January 18, 2017 3:11 AM
> > To: Li, Liang Z
> > Cc: kvm@vger.kernel.org; virtio-dev@lists.oasis-open.org; qemu-
> > devel@nongnu.org; linux-mm@kvack.org; linux-kernel@vger.kernel.org;
> > virtualization@lists.linux-foundation.org; amit.shah@redhat.com; Hansen,
> > Dave; cornelia.huck@de.ibm.com; pbonzini@redhat.com;
> > david@redhat.com; aarcange@redhat.com; dgilbert@redhat.com;
> > quintela@redhat.com
> > Subject: Re: [virtio-dev] Re: [PATCH v6 kernel 2/5] virtio-balloon: define new
> > feature bit and head struct
> > 
> > On Fri, Jan 13, 2017 at 09:24:22AM +0000, Li, Liang Z wrote:
> > > > On Wed, Dec 21, 2016 at 02:52:25PM +0800, Liang Li wrote:
> > > > > Add a new feature which supports sending the page information with
> > > > > range array. The current implementation uses PFNs array, which is
> > > > > not very efficient. Using ranges can improve the performance of
> > > > > inflating/deflating significantly.
> > > > >
> > > > > Signed-off-by: Liang Li <liang.z.li@intel.com>
> > > > > Cc: Michael S. Tsirkin <mst@redhat.com>
> > > > > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > > > Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
> > > > > Cc: Amit Shah <amit.shah@redhat.com>
> > > > > Cc: Dave Hansen <dave.hansen@intel.com>
> > > > > Cc: Andrea Arcangeli <aarcange@redhat.com>
> > > > > Cc: David Hildenbrand <david@redhat.com>
> > > > > ---
> > > > >  include/uapi/linux/virtio_balloon.h | 12 ++++++++++++
> > > > >  1 file changed, 12 insertions(+)
> > > > >
> > > > > diff --git a/include/uapi/linux/virtio_balloon.h
> > > > > b/include/uapi/linux/virtio_balloon.h
> > > > > index 343d7dd..2f850bf 100644
> > > > > --- a/include/uapi/linux/virtio_balloon.h
> > > > > +++ b/include/uapi/linux/virtio_balloon.h
> > > > > @@ -34,10 +34,14 @@
> > > > >  #define VIRTIO_BALLOON_F_MUST_TELL_HOST	0 /* Tell before
> > > > reclaiming pages */
> > > > >  #define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue
> > > > */
> > > > >  #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM	2 /* Deflate
> > balloon
> > > > on OOM */
> > > > > +#define VIRTIO_BALLOON_F_PAGE_RANGE	3 /* Send page info
> > > > with ranges */
> > > > >
> > > > >  /* Size of a PFN in the balloon interface. */  #define
> > > > > VIRTIO_BALLOON_PFN_SHIFT 12
> > > > >
> > > > > +/* Bits width for the length of the pfn range */
> > > >
> > > > What does this mean? Couldn't figure it out.
> > > >
> > > > > +#define VIRTIO_BALLOON_NR_PFN_BITS 12
> > > > > +
> > > > >  struct virtio_balloon_config {
> > > > >  	/* Number of pages host wants Guest to give up. */
> > > > >  	__u32 num_pages;
> > > > > @@ -82,4 +86,12 @@ struct virtio_balloon_stat {
> > > > >  	__virtio64 val;
> > > > >  } __attribute__((packed));
> > > > >
> > > > > +/* Response header structure */
> > > > > +struct virtio_balloon_resp_hdr {
> > > > > +	__le64 cmd : 8; /* Distinguish different requests type */
> > > > > +	__le64 flag: 8; /* Mark status for a specific request type */
> > > > > +	__le64 id : 16; /* Distinguish requests of a specific type */
> > > > > +	__le64 data_len: 32; /* Length of the following data, in bytes
> > > > > +*/
> > > >
> > > > This use of __le64 makes no sense.  Just use u8/le16/le32 pls.
> > > >
> > >
> > > Got it, will change in the next version.
> > >
> > > And could help take a look at other parts? as well as the QEMU part.
> > >
> > > Thanks!
> > > Liang
> > 
> > Yes but first I would like to understand how come no fields in this new
> > structure come up if I search for them in the following patch. I don't see why
> 
> That's not the case: all of the fields are referenced in the following patches,
> except the 'reserved' field.

But none of these are used in the following patch 3.

> > should I waste time on reviewing the implementation if the interface isn't
> > reasonable. You don't have to waste it too - just send RFC patches with the
> > header until we can agree on it.
> 
> OK. I will post the header part separately.
> 
> Thanks!
> Liang
> > 
> > --
> > MST
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org

* Re: [PATCH v6 kernel 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration
  2017-01-18 10:09 ` David Hildenbrand
  2017-01-18 13:29   ` Li, Liang Z
@ 2017-01-18 15:38   ` Michael S. Tsirkin
  2017-01-19 17:24     ` David Hildenbrand
  1 sibling, 1 reply; 24+ messages in thread
From: Michael S. Tsirkin @ 2017-01-18 15:38 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Liang Li, kvm, virtio-dev, qemu-devel, linux-mm, linux-kernel,
	virtualization, amit.shah, dave.hansen, cornelia.huck, pbonzini,
	aarcange, dgilbert, quintela

On Wed, Jan 18, 2017 at 11:09:30AM +0100, David Hildenbrand wrote:
> Am 21.12.2016 um 07:52 schrieb Liang Li:
> > This patch set contains two parts of changes to the virtio-balloon.
> > 
> > One is the change for speeding up the inflating & deflating process,
> > the main idea of this optimization is to use {pfn|length} to present
> > the page information instead of the PFNs, to reduce the overhead of
> > virtio data transmission, address translation and madvise(). This can
> > help to improve the performance by about 85%.
> > 
> > Another change is for speeding up live migration. By skipping process
> > guest's unused pages in the first round of data copy, to reduce needless
> > data processing, this can help to save quite a lot of CPU cycles and
> > network bandwidth. We put guest's unused page information in a
> > {pfn|length} array and send it to host with the virt queue of
> > virtio-balloon. For an idle guest with 8GB RAM, this can help to shorten
> > the total live migration time from 2Sec to about 500ms in 10Gbps network
> > environment. For an guest with quite a lot of page cache and with little
> > unused pages, it's possible to let the guest drop it's page cache before
> > live migration, this case can benefit from this new feature too.
> 
> I agree that both changes make sense (although the second change just smells
> very racy, as you also pointed out in the patch description),
> however I am not sure if virtio-balloon is really the right place for
> the latter change.
> 
> virtio-balloon is all about ballooning, nothing else. What you're doing
> is using it as a way to communicate balloon-unrelated data from/to the
> hypervisor. Yes, it is also about guest memory, but completely unrelated
> to the purpose of the balloon device.
> 
> Maybe using virtio-balloon for this purpose is okay - I have mixed
> feelings (especially as I can't tell where else this could go). I would
> like to get a second opinion on this.

As long as the interface is similar, it seems to make
sense to me - why invent a completely new device that
looks very much like the old one?

So this boils down to whether the speedup patches are merged.


> -- 
> 
> David

* RE: [virtio-dev] Re: [PATCH v6 kernel 2/5] virtio-balloon: define new feature bit and head struct
  2017-01-18 15:30           ` Michael S. Tsirkin
@ 2017-01-19  1:30             ` Li, Liang Z
  0 siblings, 0 replies; 24+ messages in thread
From: Li, Liang Z @ 2017-01-19  1:30 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, virtio-dev, qemu-devel, linux-mm, linux-kernel,
	virtualization, amit.shah, Hansen, Dave, cornelia.huck, pbonzini,
	david, aarcange, dgilbert, quintela

> > > > > > Signed-off-by: Liang Li <liang.z.li@intel.com>
> > > > > > Cc: Michael S. Tsirkin <mst@redhat.com>
> > > > > > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > > > > Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
> > > > > > Cc: Amit Shah <amit.shah@redhat.com>
> > > > > > Cc: Dave Hansen <dave.hansen@intel.com>
> > > > > > Cc: Andrea Arcangeli <aarcange@redhat.com>
> > > > > > Cc: David Hildenbrand <david@redhat.com>
> > > > > > ---
> > > > > >  include/uapi/linux/virtio_balloon.h | 12 ++++++++++++
> > > > > >  1 file changed, 12 insertions(+)
> > > > > >
> > > > > > diff --git a/include/uapi/linux/virtio_balloon.h
> > > > > > b/include/uapi/linux/virtio_balloon.h
> > > > > > index 343d7dd..2f850bf 100644
> > > > > > --- a/include/uapi/linux/virtio_balloon.h
> > > > > > +++ b/include/uapi/linux/virtio_balloon.h
> > > > > > @@ -34,10 +34,14 @@
> > > > > >  #define VIRTIO_BALLOON_F_MUST_TELL_HOST	0 /* Tell
> before
> > > > > reclaiming pages */
> > > > > >  #define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats
> virtqueue
> > > > > */
> > > > > >  #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM	2 /* Deflate
> > > balloon
> > > > > on OOM */
> > > > > > +#define VIRTIO_BALLOON_F_PAGE_RANGE	3 /* Send page info
> > > > > with ranges */
> > > > > >
> > > > > >  /* Size of a PFN in the balloon interface. */  #define
> > > > > > VIRTIO_BALLOON_PFN_SHIFT 12
> > > > > >
> > > > > > +/* Bits width for the length of the pfn range */
> > > > >
> > > > > What does this mean? Couldn't figure it out.
> > > > >
> > > > > > +#define VIRTIO_BALLOON_NR_PFN_BITS 12
> > > > > > +
> > > > > >  struct virtio_balloon_config {
> > > > > >  	/* Number of pages host wants Guest to give up. */
> > > > > >  	__u32 num_pages;
> > > > > > @@ -82,4 +86,12 @@ struct virtio_balloon_stat {
> > > > > >  	__virtio64 val;
> > > > > >  } __attribute__((packed));
> > > > > >
> > > > > > +/* Response header structure */ struct
> > > > > > +virtio_balloon_resp_hdr {
> > > > > > +	__le64 cmd : 8; /* Distinguish different requests type */
> > > > > > +	__le64 flag: 8; /* Mark status for a specific request type */
> > > > > > +	__le64 id : 16; /* Distinguish requests of a specific type */
> > > > > > +	__le64 data_len: 32; /* Length of the following data, in
> > > > > > +bytes */
> > > > >
> > > > > This use of __le64 makes no sense.  Just use u8/le16/le32 pls.
> > > > >
> > > >
> > > > Got it, will change in the next version.
> > > >
> > > > And could help take a look at other parts? as well as the QEMU part.
> > > >
> > > > Thanks!
> > > > Liang
> > >
> > > Yes but first I would like to understand how come no fields in this
> > > new structure come up if I search for them in the following patch. I
> > > don't see why
> >
> > That's not the case: all of the fields are referenced in the following
> > patches, except the 'reserved' field.
> 
> But none of these are used in the following patch 3.

Yes. Only 'data_len' is used in patch 3, and for extensibility at least 'cmd'
is probably needed too. I should set it to some default value in patch 3, even
though it's not currently useful. 'flag' and 'id' are for patch 4. I just want
to reuse 'struct virtio_balloon_resp_hdr' and keep the code simpler.
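
For reference, the fixed-width layout asked for in the review quoted above
(u8/le16/le32 rather than __le64 bit-fields) might look roughly like the
following userspace sketch. The names resp_hdr_sketch, resp_hdr_pack and
resp_hdr_unpack are hypothetical and are not taken from the patch; the field
names and widths follow the quoted header. The byte-by-byte packing is only one
way to get a host-independent layout; kernel code would more likely use
__le16/__le32 members with cpu_to_le16()/cpu_to_le32().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the response header; not the real uapi struct. */
struct resp_hdr_sketch {
	uint8_t  cmd;		/* Request type */
	uint8_t  flag;		/* Status for this request type */
	uint16_t id;		/* Request id, little-endian on the wire */
	uint32_t data_len;	/* Payload length in bytes, little-endian on the wire */
};

#define RESP_HDR_WIRE_SIZE 8

/* Write the header byte by byte, so the result is the same on any host. */
static inline void resp_hdr_pack(const struct resp_hdr_sketch *h, uint8_t *out)
{
	assert(h != NULL && out != NULL);
	out[0] = h->cmd;
	out[1] = h->flag;
	out[2] = (uint8_t)(h->id & 0xff);
	out[3] = (uint8_t)(h->id >> 8);
	out[4] = (uint8_t)(h->data_len & 0xff);
	out[5] = (uint8_t)((h->data_len >> 8) & 0xff);
	out[6] = (uint8_t)((h->data_len >> 16) & 0xff);
	out[7] = (uint8_t)(h->data_len >> 24);
}

/* Read the header back from its 8-byte wire form. */
static inline void resp_hdr_unpack(const uint8_t *in, struct resp_hdr_sketch *h)
{
	assert(in != NULL && h != NULL);
	h->cmd = in[0];
	h->flag = in[1];
	h->id = (uint16_t)(in[2] | (in[3] << 8));
	h->data_len = (uint32_t)in[4] | ((uint32_t)in[5] << 8) |
		      ((uint32_t)in[6] << 16) | ((uint32_t)in[7] << 24);
}
```

With separate fields, each member has a fixed offset and size that both host
and guest can read off the header, which a bit-field layout does not guarantee
across compilers.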

Thanks!
Liang

* RE: [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process
  2017-01-18 15:30       ` Michael S. Tsirkin
@ 2017-01-19  1:44         ` Li, Liang Z
  2017-01-20 16:34           ` Michael S. Tsirkin
  0 siblings, 1 reply; 24+ messages in thread
From: Li, Liang Z @ 2017-01-19  1:44 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, virtio-dev, qemu-devel, linux-mm, linux-kernel,
	virtualization, amit.shah, Hansen, Dave, cornelia.huck, pbonzini,
	david, aarcange, dgilbert, quintela

> On Wed, Jan 18, 2017 at 04:56:58AM +0000, Li, Liang Z wrote:
> > > > -	virtqueue_add_outbuf(vq, &sg, 1, vb, GFP_KERNEL);
> > > > -	virtqueue_kick(vq);
> > > > +static void do_set_resp_bitmap(struct virtio_balloon *vb,
> > > > +		unsigned long base_pfn, int pages)
> > > >
> > > > -	/* When host has read buffer, this completes via balloon_ack */
> > > > -	wait_event(vb->acked, virtqueue_get_buf(vq, &len));
> > > > +{
> > > > +	__le64 *range = vb->resp_data + vb->resp_pos;
> > > >
> > > > +	if (pages > (1 << VIRTIO_BALLOON_NR_PFN_BITS)) {
> > > > +		/* when the length field can't contain pages, set it to 0 to
> > >
> > > /*
> > >  * Multi-line
> > >  * comments
> > >  * should look like this.
> > >  */
> > >
> > > Also, pls start sentences with an upper-case letter.
> > >
> >
> > Sorry for that.
> >
> > > > +		 * indicate the actual length is in the next __le64;
> > > > +		 */
> > >
> > > This is part of the interface so should be documented as such.
> > >
> > > > +		*range = cpu_to_le64((base_pfn <<
> > > > +				VIRTIO_BALLOON_NR_PFN_BITS) | 0);
> > > > +		*(range + 1) = cpu_to_le64(pages);
> > > > +		vb->resp_pos += 2;
> > >
> > > Pls use structs for this kind of stuff.
> >
> > I am not sure if you mean to use
> >
> > struct  range {
> >  	__le64 pfn: 52;
> > 	__le64 nr_page: 12
> > }
> > Instead of the shift operation?
> 
> Not just that. You want to add a pages field as well.
> 

pages field? Could you give more hints?

> Generally describe the format in the header in some way so host and guest
> can easily stay in sync.

'VIRTIO_BALLOON_NR_PFN_BITS' is for this purpose and it will be passed to the
related function in page_alloc.c as a parameter.
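
As a rough illustration of passing the width as a parameter, a helper that
knows nothing about the virtio header could look like the sketch below. The
name encode_range and its signature are hypothetical, not the function in the
patch. The sketch treats any count that does not fit in the length field as
"extended" (length field 0, count in the next word), and it leaves values in
host byte order; a driver would still convert with cpu_to_le64().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: append one {pfn|length} record to out[].
 * len_bits is the width of the length field (12 in the patch).
 * Returns the number of 64-bit words written (1 or 2), or 0 if
 * out[] does not have enough room left.
 */
static inline size_t encode_range(uint64_t *out, size_t avail,
				  uint64_t base_pfn, uint64_t pages,
				  unsigned int len_bits)
{
	uint64_t max_inline;

	assert(out != NULL);
	assert(len_bits > 0 && len_bits < 64);
	assert(pages > 0);	/* 0 in the length field is the "extended" marker */

	max_inline = (UINT64_C(1) << len_bits) - 1;

	if (pages <= max_inline) {
		if (avail < 1)
			return 0;
		out[0] = (base_pfn << len_bits) | pages;
		return 1;
	}

	if (avail < 2)
		return 0;
	out[0] = base_pfn << len_bits;	/* Length field 0: count is in the next word */
	out[1] = pages;
	return 2;
}
```

Even in this form the caller and the callee have to agree on what a zero
length field means, so the format is still shared knowledge rather than hidden.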

Thanks!
Liang
> All the pointer math and void * means we get zero type safety and I'm not
> happy about it.
> 
> It's not good that virtio format seeps out to page_alloc anyway.
> If unavoidable it is not a good idea to try to hide this fact, people will assume
> they can change the format at will.
> 
> --
> MST

* Re: [PATCH v6 kernel 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration
  2017-01-18 15:38   ` Michael S. Tsirkin
@ 2017-01-19 17:24     ` David Hildenbrand
  0 siblings, 0 replies; 24+ messages in thread
From: David Hildenbrand @ 2017-01-19 17:24 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Liang Li, kvm, virtio-dev, qemu-devel, linux-mm, linux-kernel,
	virtualization, amit.shah, dave.hansen, cornelia.huck, pbonzini,
	aarcange, dgilbert, quintela


> As long as the interface is similar, it seems to make
> sense for me - why invent a completely new device that
> looks very much like the old one?

The only reason would be that this feature could be used independently
of virtio-balloon. But that would of course only be the case if
ballooning is strictly not wanted in a configuration, or if the current
balloon driver gets replaced by an alternative solution.

I don't have any strong feelings about this, just wanted to double check.

Thanks,

David

> 
> So this boils down to whether the speedup patches are merged.
> 
> 
>> -- 
>>
>> David

* Re: [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process
  2016-12-21  6:52 ` [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process Liang Li
  2017-01-17 19:15   ` Michael S. Tsirkin
@ 2017-01-20 11:48   ` Dr. David Alan Gilbert
  2017-02-04  4:35     ` Li, Liang Z
  1 sibling, 1 reply; 24+ messages in thread
From: Dr. David Alan Gilbert @ 2017-01-20 11:48 UTC (permalink / raw)
  To: Liang Li
  Cc: kvm, virtio-dev, qemu-devel, linux-mm, linux-kernel,
	virtualization, amit.shah, dave.hansen, cornelia.huck, pbonzini,
	mst, david, aarcange, quintela

* Liang Li (liang.z.li@intel.com) wrote:

<snip>

> +static void free_extended_page_bitmap(struct virtio_balloon *vb)
> +{
> +	int i, bmap_count = vb->nr_page_bmap;
> +
> +	for (i = 1; i < bmap_count; i++) {
> +		kfree(vb->page_bitmap[i]);
> +		vb->page_bitmap[i] = NULL;
> +		vb->nr_page_bmap--;
> +	}
> +}
> +
> +static void kfree_page_bitmap(struct virtio_balloon *vb)
> +{
> +	int i;
> +
> +	for (i = 0; i < vb->nr_page_bmap; i++)
> +		kfree(vb->page_bitmap[i]);
> +}

It might be worth commenting that pair of functions to make it clear
why they are so different; I guess the kfree_page_bitmap
is used just before you free the structure above it so you
don't need to keep the count/pointers updated?
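
If that guess is right, the comments could read something like the userspace
sketch below. It uses free() in place of kfree(), and a hypothetical struct
balloon_sketch and MAX_BMAPS_SKETCH in place of the driver's own structure; the
second function is renamed free_all_page_bitmaps here only to make the contrast
visible.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_BMAPS_SKETCH 4	/* Hypothetical limit, not from the patch */

/* Hypothetical stand-in for the bitmap fields of struct virtio_balloon. */
struct balloon_sketch {
	unsigned long *page_bitmap[MAX_BMAPS_SKETCH];
	int nr_page_bmap;
};

/*
 * Release the extra bitmaps but keep page_bitmap[0].
 * The structure stays in use afterwards, so the pointers are cleared
 * and the count is kept in sync with what is still allocated.
 */
static inline void free_extended_page_bitmap(struct balloon_sketch *vb)
{
	int i, bmap_count;

	assert(vb != NULL);
	bmap_count = vb->nr_page_bmap;

	for (i = 1; i < bmap_count; i++) {
		free(vb->page_bitmap[i]);
		vb->page_bitmap[i] = NULL;
		vb->nr_page_bmap--;
	}
}

/*
 * Release every bitmap, including page_bitmap[0].
 * Only meant to be called right before the structure itself goes away,
 * which is why the pointers and the count are left stale.
 */
static inline void free_all_page_bitmaps(struct balloon_sketch *vb)
{
	int i;

	assert(vb != NULL);
	for (i = 0; i < vb->nr_page_bmap; i++)
		free(vb->page_bitmap[i]);
}
```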

Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

* Re: [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process
  2017-01-19  1:44         ` Li, Liang Z
@ 2017-01-20 16:34           ` Michael S. Tsirkin
  0 siblings, 0 replies; 24+ messages in thread
From: Michael S. Tsirkin @ 2017-01-20 16:34 UTC (permalink / raw)
  To: Li, Liang Z
  Cc: kvm, virtio-dev, qemu-devel, linux-mm, linux-kernel,
	virtualization, amit.shah, Hansen, Dave, cornelia.huck, pbonzini,
	david, aarcange, dgilbert, quintela

On Thu, Jan 19, 2017 at 01:44:36AM +0000, Li, Liang Z wrote:
> > > > > +		*range = cpu_to_le64((base_pfn <<
> > > > > +				VIRTIO_BALLOON_NR_PFN_BITS) | 0);
> > > > > +		*(range + 1) = cpu_to_le64(pages);
> > > > > +		vb->resp_pos += 2;
> > > >
> > > > Pls use structs for this kind of stuff.
> > >
> > > I am not sure if you mean to use
> > >
> > > struct  range {
> > >  	__le64 pfn: 52;
> > > 	__le64 nr_page: 12
> > > }
> > > Instead of the shift operation?
> > 
> > Not just that. You want to add a pages field as well.
> > 
> 
> pages field? Could you give more hints?

Well, look at how you are formatting it manually above.
There is clearly a structure with two 64-bit fields.
The first one includes the pfn and 0 (no idea why | 0 makes
sense, but that's a separate issue).
The second one includes the pages value.
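
To make that concrete, here is a minimal userspace sketch of such a two-field
structure. The names range_ext_sketch, range_ext_make, range_ext_pfn and
LEN_BITS_SKETCH are made up for illustration and are not a proposal for the
final header; values are kept in host byte order, where the driver would use
__le64 and cpu_to_le64().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LEN_BITS_SKETCH 12	/* Mirrors VIRTIO_BALLOON_NR_PFN_BITS in the patch */

/* Hypothetical typed form of the two words written by hand in the patch. */
struct range_ext_sketch {
	uint64_t pfn_len;	/* (pfn << LEN_BITS_SKETCH), length bits left at 0 */
	uint64_t pages;		/* Actual number of pages in the range */
};

_Static_assert(sizeof(struct range_ext_sketch) == 16, "two 64-bit words");

static inline struct range_ext_sketch range_ext_make(uint64_t pfn, uint64_t pages)
{
	struct range_ext_sketch r;

	/* The pfn must fit in the bits left above the length field. */
	assert(pfn < (UINT64_C(1) << (64 - LEN_BITS_SKETCH)));
	r.pfn_len = pfn << LEN_BITS_SKETCH;
	r.pages = pages;
	return r;
}

static inline uint64_t range_ext_pfn(const struct range_ext_sketch *r)
{
	return r->pfn_len >> LEN_BITS_SKETCH;
}

/* True when the length bits are 0, i.e. the count lives in ->pages. */
static inline bool range_ext_is_extended(const struct range_ext_sketch *r)
{
	return (r->pfn_len & ((UINT64_C(1) << LEN_BITS_SKETCH) - 1)) == 0;
}
```

Named members replace the *(range + 1) arithmetic, so the compiler can check
each access.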


> > Generally describe the format in the header in some way so host and guest
> > can easily stay in sync.
> 
> 'VIRTIO_BALLOON_NR_PFN_BITS' is for this purpose and it will be passed to the
> related function in page_alloc.c as a parameter.
> 
> Thanks!
> Liang
> > All the pointer math and void * means we get zero type safety and I'm not
> > happy about it.
> > 
> > It's not good that virtio format seeps out to page_alloc anyway.
> > If unavoidable it is not a good idea to try to hide this fact, people will assume
> > they can change the format at will.
> > 
> > --
> > MST

* RE: [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process
  2017-01-20 11:48   ` Dr. David Alan Gilbert
@ 2017-02-04  4:35     ` Li, Liang Z
  0 siblings, 0 replies; 24+ messages in thread
From: Li, Liang Z @ 2017-02-04  4:35 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: kvm, virtio-dev, qemu-devel, linux-mm, linux-kernel,
	virtualization, amit.shah, Hansen, Dave, cornelia.huck, pbonzini,
	mst, david, aarcange, quintela

> <snip>
> 
> > +static void free_extended_page_bitmap(struct virtio_balloon *vb) {
> > +	int i, bmap_count = vb->nr_page_bmap;
> > +
> > +	for (i = 1; i < bmap_count; i++) {
> > +		kfree(vb->page_bitmap[i]);
> > +		vb->page_bitmap[i] = NULL;
> > +		vb->nr_page_bmap--;
> > +	}
> > +}
> > +
> > +static void kfree_page_bitmap(struct virtio_balloon *vb) {
> > +	int i;
> > +
> > +	for (i = 0; i < vb->nr_page_bmap; i++)
> > +		kfree(vb->page_bitmap[i]);
> > +}
> 
> It might be worth commenting that pair of functions to make it clear why
> they are so different; I guess the kfree_page_bitmap is used just before you
> free the structure above it so you don't need to keep the count/pointers
> updated?
> 

Yes. I will add some comments for that. Thanks!

Liang
 
> Dave
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Thread overview: 24+ messages
2016-12-21  6:52 [PATCH v6 kernel 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration Liang Li
2016-12-21  6:52 ` [PATCH v6 kernel 1/5] virtio-balloon: rework deflate to add page to a list Liang Li
2016-12-21  6:52 ` [PATCH v6 kernel 2/5] virtio-balloon: define new feature bit and head struct Liang Li
2017-01-12 19:43   ` Michael S. Tsirkin
2017-01-13  9:24     ` [virtio-dev] " Li, Liang Z
2017-01-17 19:11       ` Michael S. Tsirkin
2017-01-18  1:55         ` Li, Liang Z
2017-01-18 15:30           ` Michael S. Tsirkin
2017-01-19  1:30             ` Li, Liang Z
2016-12-21  6:52 ` [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process Liang Li
2017-01-17 19:15   ` Michael S. Tsirkin
2017-01-18  4:56     ` Li, Liang Z
2017-01-18 15:30       ` Michael S. Tsirkin
2017-01-19  1:44         ` Li, Liang Z
2017-01-20 16:34           ` Michael S. Tsirkin
2017-01-20 11:48   ` Dr. David Alan Gilbert
2017-02-04  4:35     ` Li, Liang Z
2016-12-21  6:52 ` [PATCH v6 kernel 4/5] virtio-balloon: define flags and head for host request vq Liang Li
2016-12-21  6:52 ` [PATCH v6 kernel 5/5] virtio-balloon: tell host vm's unused page info Liang Li
2017-01-10  6:43 ` [PATCH v6 kernel 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration Li, Liang Z
2017-01-18 10:09 ` David Hildenbrand
2017-01-18 13:29   ` Li, Liang Z
2017-01-18 15:38   ` Michael S. Tsirkin
2017-01-19 17:24     ` David Hildenbrand
