linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/6] Fast balloon & fast live migration
@ 2016-06-13  9:47 Liang Li
  2016-06-13  9:47 ` [PATCH 1/6] virtio-balloon: rework deflate to add page to a list Liang Li
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Liang Li @ 2016-06-13  9:47 UTC (permalink / raw)
  To: kvm; +Cc: virtio-dev, qemu-devel, linux-kernel, mst, Liang Li

The current implementation of virtio-balloon is not very
efficient. Below is a breakdown of the time spent inflating
the balloon to 3GB on a 4GB idle guest:

a. allocating pages (6.5%, 103ms)
b. sending PFNs to host (68.3%, 787ms)
c. address translation (6.1%, 96ms)
d. madvise (19%, 300ms)

The whole inflating process takes about 1577ms to complete.
The test shows that the bottlenecks are stages b and d.

If a bitmap is used to send the page info instead of an array
of PFNs, the overhead of stage b can be reduced considerably.
Furthermore, it's possible to do the address translation and the
madvise on a bulk of pages instead of page by page, so the
overhead of stages c and d can also be reduced a lot.
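As an illustration of the stage b savings (not part of the patches),
the buffer sizes can be estimated as follows. The 4-byte entry matches
the __virtio32 PFN entries the current driver sends; the bitmap costs
one bit per page of the covered range:

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to report n_pages pages as 32-bit PFN entries,
 * one __virtio32 per page as in the current virtio-balloon ABI. */
static size_t pfn_array_bytes(size_t n_pages)
{
	return n_pages * sizeof(uint32_t);
}

/* Bytes needed to report the same pages as a bitmap: one bit per
 * page in the covered PFN range, rounded up to whole bytes. */
static size_t bitmap_bytes(size_t range_pages)
{
	return (range_pages + 7) / 8;
}
```

For the 3GB case above (786432 4KB pages), the PFN array is 3MB of
virtqueue traffic while a dense bitmap is 96KB, which is where most of
the stage b reduction comes from.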

In addition, live migration can be sped up by skipping the
transfer of the guest's free pages.

Patch 1 and patch 2 are the kernel-side implementation, which
speeds up the inflating & deflating process by adding a new
feature to the virtio-balloon device. With them applied, inflating
the balloon to 3GB of a 4GB idle guest takes only 200ms, about 8
times as fast as before.


Patch 3 and patch 4 add drop-cache support: the hypervisor can now
request the guest to drop its caches. This is useful before inflating
the virtio-balloon and before starting live migration.

Patch 5 and patch 6 save the guest's free page information into a page
bitmap and send the bitmap to the host through the balloon's virtqueue.

Liang Li (6):
  virtio-balloon: rework deflate to add page to a list
  virtio-balloon: speed up inflate/deflate process
  mm:split the drop cache operation into a function
  virtio-balloon: add drop cache support
  mm: add the related functions to get free page info
  virtio-balloon: tell host vm's free page info

 drivers/virtio/virtio_balloon.c     | 321 +++++++++++++++++++++++++++++++-----
 fs/drop_caches.c                    |  22 ++-
 include/linux/mm.h                  |   1 +
 include/uapi/linux/virtio_balloon.h |   2 +
 mm/page_alloc.c                     |  40 +++++
 5 files changed, 339 insertions(+), 47 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/6] virtio-balloon: rework deflate to add page to a list
  2016-06-13  9:47 [PATCH 0/6] Fast balloon & fast live migration Liang Li
@ 2016-06-13  9:47 ` Liang Li
  2016-06-23  8:25   ` Li, Liang Z
  2016-06-23  8:30   ` Li, Liang Z
  2016-06-13  9:47 ` [PATCH 2/6] virtio-balloon: speed up inflate/deflate process Liang Li
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 13+ messages in thread
From: Liang Li @ 2016-06-13  9:47 UTC (permalink / raw)
  To: kvm
  Cc: virtio-dev, qemu-devel, linux-kernel, mst, Liang Li,
	Paolo Bonzini, Cornelia Huck, Amit Shah

Save the deflated pages to a list instead of converting PFNs back to
pages. This will allow faster notifications using a bitmap down the
road. balloon_pfn_to_page() can now be removed because it is no longer
used.
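The patch switches release_pages_balloon() to list_for_each_entry_safe()
because entries are unlinked and freed while walking the list; a plain
traversal would dereference a freed node's link. A minimal userspace
analogue of that delete-while-iterating pattern (illustrative only, not
the kernel list API):

```c
#include <stdlib.h>

/* Minimal analogue of list_for_each_entry_safe(): remember the next
 * node before freeing the current one, so traversal survives removal. */
struct node {
	int val;
	struct node *next;
};

static struct node *push(struct node *head, int val)
{
	struct node *n = malloc(sizeof(*n));

	n->val = val;
	n->next = head;
	return n;
}

/* Free every node, returning how many were released. */
static int release_all(struct node *head)
{
	struct node *cur, *next;
	int count = 0;

	for (cur = head; cur; cur = next) {
		next = cur->next;	/* the "safe" part */
		free(cur);
		count++;
	}
	return count;
}
```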

Signed-off-by: Liang Li <liang.z.li@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Amit Shah <amit.shah@redhat.com>
---
 drivers/virtio/virtio_balloon.c | 22 ++++++++--------------
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 476c0e3..8d649a2 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -98,12 +98,6 @@ static u32 page_to_balloon_pfn(struct page *page)
 	return pfn * VIRTIO_BALLOON_PAGES_PER_PAGE;
 }
 
-static struct page *balloon_pfn_to_page(u32 pfn)
-{
-	BUG_ON(pfn % VIRTIO_BALLOON_PAGES_PER_PAGE);
-	return pfn_to_page(pfn / VIRTIO_BALLOON_PAGES_PER_PAGE);
-}
-
 static void balloon_ack(struct virtqueue *vq)
 {
 	struct virtio_balloon *vb = vq->vdev->priv;
@@ -176,18 +170,16 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
 	return num_allocated_pages;
 }
 
-static void release_pages_balloon(struct virtio_balloon *vb)
+static void release_pages_balloon(struct virtio_balloon *vb,
+				 struct list_head *pages)
 {
-	unsigned int i;
-	struct page *page;
+	struct page *page, *next;
 
-	/* Find pfns pointing at start of each page, get pages and free them. */
-	for (i = 0; i < vb->num_pfns; i += VIRTIO_BALLOON_PAGES_PER_PAGE) {
-		page = balloon_pfn_to_page(virtio32_to_cpu(vb->vdev,
-							   vb->pfns[i]));
+	list_for_each_entry_safe(page, next, pages, lru) {
 		if (!virtio_has_feature(vb->vdev,
 					VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
 			adjust_managed_page_count(page, 1);
+		list_del(&page->lru);
 		put_page(page); /* balloon reference */
 	}
 }
@@ -197,6 +189,7 @@ static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)
 	unsigned num_freed_pages;
 	struct page *page;
 	struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
+	LIST_HEAD(pages);
 
 	/* We can only do one array worth at a time. */
 	num = min(num, ARRAY_SIZE(vb->pfns));
@@ -208,6 +201,7 @@ static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)
 		if (!page)
 			break;
 		set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
+		list_add(&page->lru, &pages);
 		vb->num_pages -= VIRTIO_BALLOON_PAGES_PER_PAGE;
 	}
 
@@ -219,7 +213,7 @@ static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)
 	 */
 	if (vb->num_pfns != 0)
 		tell_host(vb, vb->deflate_vq);
-	release_pages_balloon(vb);
+	release_pages_balloon(vb, &pages);
 	mutex_unlock(&vb->balloon_lock);
 	return num_freed_pages;
 }
-- 
1.9.1


* [PATCH 2/6] virtio-balloon: speed up inflate/deflate process
  2016-06-13  9:47 [PATCH 0/6] Fast balloon & fast live migration Liang Li
  2016-06-13  9:47 ` [PATCH 1/6] virtio-balloon: rework deflate to add page to a list Liang Li
@ 2016-06-13  9:47 ` Liang Li
  2016-06-13 10:17   ` kbuild test robot
  2016-06-24  5:39   ` Michael S. Tsirkin
  2016-06-13  9:47 ` [PATCH 3/6] mm:split the drop cache operation into a function Liang Li
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 13+ messages in thread
From: Liang Li @ 2016-06-13  9:47 UTC (permalink / raw)
  To: kvm
  Cc: virtio-dev, qemu-devel, linux-kernel, mst, Liang Li,
	Paolo Bonzini, Cornelia Huck, Amit Shah

The current implementation of virtio-balloon is not very efficient.
Below is a breakdown of the time spent inflating the balloon to 3GB of
a 4GB idle guest:

a. allocating pages (6.5%, 103ms)
b. sending PFNs to host (68.3%, 787ms)
c. address translation (6.1%, 96ms)
d. madvise (19%, 300ms)

The whole inflating process takes about 1577ms to complete. The
test shows that the bottlenecks are stages b and d.

If a bitmap is used to send the page info instead of an array of PFNs,
the overhead of stage b can be reduced considerably. Furthermore, it's
possible to do the address translation and the madvise on a bulk of
pages instead of page by page, so the overhead of stages c and d can
also be reduced a lot.

This patch is the kernel-side implementation, which speeds up the
inflating & deflating process by adding a new feature to the
virtio-balloon device. With it applied, inflating the balloon to 3GB of
a 4GB idle guest takes only 200ms, about 8 times as fast as before.

TODO: optimize stage a by allocating/freeing a chunk of pages instead
of a single page at a time.
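The bitmap the patch sends covers a PFN window of at most
VIRTIO_BALLOON_PFNS_LIMIT pages starting at start_pfn, so each page is
recorded at bit (pfn - start_pfn). A userspace sketch of that encoding
(names mirror the patch, but this code is illustrative only):

```c
#include <limits.h>

#define ULONG_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Record a ballooned PFN in a bitmap window that starts at start_pfn,
 * as set_page_bitmap() does with set_bit(balloon_pfn - pfn, ...). */
static void bmap_set_pfn(unsigned long *bitmap, unsigned long start_pfn,
			 unsigned long pfn)
{
	unsigned long bit = pfn - start_pfn;

	bitmap[bit / ULONG_BITS] |= 1UL << (bit % ULONG_BITS);
}

/* Host-side view: test whether a PFN was reported in the window. */
static int bmap_test_pfn(const unsigned long *bitmap,
			 unsigned long start_pfn, unsigned long pfn)
{
	unsigned long bit = pfn - start_pfn;

	return !!(bitmap[bit / ULONG_BITS] & (1UL << (bit % ULONG_BITS)));
}
```

Sending start_pfn and page_shift in the balloon_bmap_hdr is what lets
the host decode these window-relative bits back into guest addresses.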

Signed-off-by: Liang Li <liang.z.li@intel.com>
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Amit Shah <amit.shah@redhat.com>
---
 drivers/virtio/virtio_balloon.c     | 164 +++++++++++++++++++++++++++++++-----
 include/uapi/linux/virtio_balloon.h |   1 +
 2 files changed, 144 insertions(+), 21 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 8d649a2..1fa601b 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -40,11 +40,19 @@
 #define VIRTIO_BALLOON_ARRAY_PFNS_MAX 256
 #define OOM_VBALLOON_DEFAULT_PAGES 256
 #define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
+#define VIRTIO_BALLOON_PFNS_LIMIT ((2 * (1ULL << 30)) >> PAGE_SHIFT) /* 2GB */
 
 static int oom_pages = OOM_VBALLOON_DEFAULT_PAGES;
 module_param(oom_pages, int, S_IRUSR | S_IWUSR);
 MODULE_PARM_DESC(oom_pages, "pages to free on OOM");
 
+struct balloon_bmap_hdr {
+	__virtio32 id;
+	__virtio32 page_shift;
+	__virtio64 start_pfn;
+	__virtio64 bmap_len;
+};
+
 struct virtio_balloon {
 	struct virtio_device *vdev;
 	struct virtqueue *inflate_vq, *deflate_vq, *stats_vq;
@@ -62,6 +70,11 @@ struct virtio_balloon {
 
 	/* Number of balloon pages we've told the Host we're not using. */
 	unsigned int num_pages;
+	/* Bitmap and length used to tell the host the pages */
+	unsigned long *page_bitmap;
+	unsigned long bmap_len;
+	/* Used to record the processed pfn range */
+	unsigned long min_pfn, max_pfn, start_pfn, end_pfn;
 	/*
 	 * The pages we've told the Host we're not using are enqueued
 	 * at vb_dev_info->pages list.
@@ -105,15 +118,51 @@ static void balloon_ack(struct virtqueue *vq)
 	wake_up(&vb->acked);
 }
 
+static inline void init_pfn_range(struct virtio_balloon *vb)
+{
+	vb->min_pfn = (1UL << 48);
+	vb->max_pfn = 0;
+}
+
+static inline void update_pfn_range(struct virtio_balloon *vb,
+				 struct page *page)
+{
+	unsigned long balloon_pfn = page_to_balloon_pfn(page);
+
+	if (balloon_pfn < vb->min_pfn)
+		vb->min_pfn = balloon_pfn;
+	if (balloon_pfn > vb->max_pfn)
+		vb->max_pfn = balloon_pfn;
+}
+
 static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq)
 {
-	struct scatterlist sg;
 	unsigned int len;
 
-	sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns);
+	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_PAGE_BITMAP)) {
+		struct balloon_bmap_hdr hdr;
+		unsigned long bmap_len;
+		struct scatterlist sg[2];
+
+		hdr.id = cpu_to_virtio32(vb->vdev, 0);
+		hdr.page_shift = cpu_to_virtio32(vb->vdev, PAGE_SHIFT);
+		hdr.start_pfn = cpu_to_virtio64(vb->vdev, vb->start_pfn);
+		bmap_len = min(vb->bmap_len,
+				(vb->end_pfn - vb->start_pfn) / BITS_PER_BYTE);
+		hdr.bmap_len = cpu_to_virtio64(vb->vdev, bmap_len);
+		sg_init_table(sg, 2);
+		sg_set_buf(&sg[0], &hdr, sizeof(hdr));
+		sg_set_buf(&sg[1], vb->page_bitmap, bmap_len);
+		virtqueue_add_outbuf(vq, sg, 2, vb, GFP_KERNEL);
+	} else {
+		struct scatterlist sg;
 
-	/* We should always be able to add one buffer to an empty queue. */
-	virtqueue_add_outbuf(vq, &sg, 1, vb, GFP_KERNEL);
+		sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns);
+		/* We should always be able to add one buffer to an
+		* empty queue.
+		*/
+		virtqueue_add_outbuf(vq, &sg, 1, vb, GFP_KERNEL);
+	}
 	virtqueue_kick(vq);
 
 	/* When host has read buffer, this completes via balloon_ack */
@@ -133,13 +182,50 @@ static void set_page_pfns(struct virtio_balloon *vb,
 					  page_to_balloon_pfn(page) + i);
 }
 
-static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
+static void set_page_bitmap(struct virtio_balloon *vb,
+			 struct list_head *pages, struct virtqueue *vq)
+{
+	unsigned long pfn;
+	struct page *page, *next;
+	bool find;
+
+	vb->min_pfn = rounddown(vb->min_pfn, BITS_PER_LONG);
+	vb->max_pfn = roundup(vb->max_pfn, BITS_PER_LONG);
+	for (pfn = vb->min_pfn; pfn < vb->max_pfn;
+			pfn += VIRTIO_BALLOON_PFNS_LIMIT) {
+		vb->start_pfn = pfn;
+		vb->end_pfn = pfn;
+		memset(vb->page_bitmap, 0, vb->bmap_len);
+		find = false;
+		list_for_each_entry_safe(page, next, pages, lru) {
+			unsigned long balloon_pfn = page_to_balloon_pfn(page);
+
+			if (balloon_pfn < pfn ||
+				 balloon_pfn >= pfn + VIRTIO_BALLOON_PFNS_LIMIT)
+				continue;
+			set_bit(balloon_pfn - pfn, vb->page_bitmap);
+			if (balloon_pfn > vb->end_pfn)
+				vb->end_pfn = balloon_pfn;
+			find = true;
+		}
+		if (find) {
+			vb->end_pfn = roundup(vb->end_pfn, BITS_PER_LONG);
+			tell_host(vb, vq);
+		}
+	}
+}
+
+static unsigned int fill_balloon(struct virtio_balloon *vb, size_t num,
+				 bool use_bmap)
 {
 	struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
 	unsigned num_allocated_pages;
 
-	/* We can only do one array worth at a time. */
-	num = min(num, ARRAY_SIZE(vb->pfns));
+	if (use_bmap)
+		init_pfn_range(vb);
+	else
+		/* We can only do one array worth at a time. */
+		num = min(num, ARRAY_SIZE(vb->pfns));
 
 	mutex_lock(&vb->balloon_lock);
 	for (vb->num_pfns = 0; vb->num_pfns < num;
@@ -154,7 +240,10 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
 			msleep(200);
 			break;
 		}
-		set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
+		if (use_bmap)
+			update_pfn_range(vb, page);
+		else
+			set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
 		vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
 		if (!virtio_has_feature(vb->vdev,
 					VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
@@ -163,8 +252,13 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
 
 	num_allocated_pages = vb->num_pfns;
 	/* Did we get any? */
-	if (vb->num_pfns != 0)
-		tell_host(vb, vb->inflate_vq);
+	if (vb->num_pfns != 0) {
+		if (use_bmap)
+			set_page_bitmap(vb, &vb_dev_info->pages,
+					 vb->inflate_vq);
+		else
+			tell_host(vb, vb->inflate_vq);
+	}
 	mutex_unlock(&vb->balloon_lock);
 
 	return num_allocated_pages;
@@ -184,15 +278,19 @@ static void release_pages_balloon(struct virtio_balloon *vb,
 	}
 }
 
-static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)
+static unsigned int leak_balloon(struct virtio_balloon *vb, size_t num,
+				bool use_bmap)
 {
 	unsigned num_freed_pages;
 	struct page *page;
 	struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
 	LIST_HEAD(pages);
 
-	/* We can only do one array worth at a time. */
-	num = min(num, ARRAY_SIZE(vb->pfns));
+	if (use_bmap)
+		init_pfn_range(vb);
+	else
+		/* We can only do one array worth at a time. */
+		num = min(num, ARRAY_SIZE(vb->pfns));
 
 	mutex_lock(&vb->balloon_lock);
 	for (vb->num_pfns = 0; vb->num_pfns < num;
@@ -200,7 +298,10 @@ static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)
 		page = balloon_page_dequeue(vb_dev_info);
 		if (!page)
 			break;
-		set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
+		if (use_bmap)
+			update_pfn_range(vb, page);
+		else
+			set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
 		list_add(&page->lru, &pages);
 		vb->num_pages -= VIRTIO_BALLOON_PAGES_PER_PAGE;
 	}
@@ -211,9 +312,14 @@ static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)
 	 * virtio_has_feature(vdev, VIRTIO_BALLOON_F_MUST_TELL_HOST);
 	 * is true, we *have* to do it in this order
 	 */
-	if (vb->num_pfns != 0)
-		tell_host(vb, vb->deflate_vq);
-	release_pages_balloon(vb, &pages);
+	if (vb->num_pfns != 0) {
+		if (use_bmap)
+			set_page_bitmap(vb, &pages, vb->deflate_vq);
+		else
+			tell_host(vb, vb->deflate_vq);
+
+		release_pages_balloon(vb, &pages);
+	}
 	mutex_unlock(&vb->balloon_lock);
 	return num_freed_pages;
 }
@@ -347,13 +453,15 @@ static int virtballoon_oom_notify(struct notifier_block *self,
 	struct virtio_balloon *vb;
 	unsigned long *freed;
 	unsigned num_freed_pages;
+	bool use_bmap;
 
 	vb = container_of(self, struct virtio_balloon, nb);
 	if (!virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
 		return NOTIFY_OK;
 
 	freed = parm;
-	num_freed_pages = leak_balloon(vb, oom_pages);
+	use_bmap = virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_PAGE_BITMAP);
+	num_freed_pages = leak_balloon(vb, oom_pages, use_bmap);
 	update_balloon_size(vb);
 	*freed += num_freed_pages;
 
@@ -373,15 +481,17 @@ static void update_balloon_size_func(struct work_struct *work)
 {
 	struct virtio_balloon *vb;
 	s64 diff;
+	bool use_bmap;
 
 	vb = container_of(work, struct virtio_balloon,
 			  update_balloon_size_work);
+	use_bmap = virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_PAGE_BITMAP);
 	diff = towards_target(vb);
 
 	if (diff > 0)
-		diff -= fill_balloon(vb, diff);
+		diff -= fill_balloon(vb, diff, use_bmap);
 	else if (diff < 0)
-		diff += leak_balloon(vb, -diff);
+		diff += leak_balloon(vb, -diff, use_bmap);
 	update_balloon_size(vb);
 
 	if (diff)
@@ -508,6 +618,13 @@ static int virtballoon_probe(struct virtio_device *vdev)
 	spin_lock_init(&vb->stop_update_lock);
 	vb->stop_update = false;
 	vb->num_pages = 0;
+	vb->bmap_len = ALIGN(VIRTIO_BALLOON_PFNS_LIMIT, BITS_PER_LONG) /
+		 BITS_PER_BYTE + 2 * sizeof(unsigned long);
+	vb->page_bitmap = kzalloc(vb->bmap_len, GFP_KERNEL);
+	if (!vb->page_bitmap) {
+		err = -ENOMEM;
+		goto out;
+	}
 	mutex_init(&vb->balloon_lock);
 	init_waitqueue_head(&vb->acked);
 	vb->vdev = vdev;
@@ -541,9 +658,12 @@ out:
 
 static void remove_common(struct virtio_balloon *vb)
 {
+	bool use_bmap;
+
+	use_bmap = virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_PAGE_BITMAP);
 	/* There might be pages left in the balloon: free them. */
 	while (vb->num_pages)
-		leak_balloon(vb, vb->num_pages);
+		leak_balloon(vb, vb->num_pages, use_bmap);
 	update_balloon_size(vb);
 
 	/* Now we reset the device so we can clean up the queues. */
@@ -565,6 +685,7 @@ static void virtballoon_remove(struct virtio_device *vdev)
 	cancel_work_sync(&vb->update_balloon_stats_work);
 
 	remove_common(vb);
+	kfree(vb->page_bitmap);
 	kfree(vb);
 }
 
@@ -603,6 +724,7 @@ static unsigned int features[] = {
 	VIRTIO_BALLOON_F_MUST_TELL_HOST,
 	VIRTIO_BALLOON_F_STATS_VQ,
 	VIRTIO_BALLOON_F_DEFLATE_ON_OOM,
+	VIRTIO_BALLOON_F_PAGE_BITMAP,
 };
 
 static struct virtio_driver virtio_balloon_driver = {
diff --git a/include/uapi/linux/virtio_balloon.h b/include/uapi/linux/virtio_balloon.h
index 343d7dd..f78fa47 100644
--- a/include/uapi/linux/virtio_balloon.h
+++ b/include/uapi/linux/virtio_balloon.h
@@ -34,6 +34,7 @@
 #define VIRTIO_BALLOON_F_MUST_TELL_HOST	0 /* Tell before reclaiming pages */
 #define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue */
 #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM	2 /* Deflate balloon on OOM */
+#define VIRTIO_BALLOON_F_PAGE_BITMAP	3 /* Send page info with bitmap */
 
 /* Size of a PFN in the balloon interface. */
 #define VIRTIO_BALLOON_PFN_SHIFT 12
-- 
1.9.1


* [PATCH 3/6] mm:split the drop cache operation into a function
  2016-06-13  9:47 [PATCH 0/6] Fast balloon & fast live migration Liang Li
  2016-06-13  9:47 ` [PATCH 1/6] virtio-balloon: rework deflate to add page to a list Liang Li
  2016-06-13  9:47 ` [PATCH 2/6] virtio-balloon: speed up inflate/deflate process Liang Li
@ 2016-06-13  9:47 ` Liang Li
  2016-06-13  9:47 ` [PATCH 4/6] virtio-balloon: add drop cache support Liang Li
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Liang Li @ 2016-06-13  9:47 UTC (permalink / raw)
  To: kvm
  Cc: virtio-dev, qemu-devel, linux-kernel, mst, Liang Li,
	Paolo Bonzini, Cornelia Huck, Amit Shah, Alexander Viro

Move the drop-caches operation into a new function and export it,
so that it can be reused later.
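The bitmask the new drop_caches() function decodes keeps the existing
sysctl semantics: bit 0 (value 1) drops the page cache, bit 1 (value 2)
drops reclaimable slab, and 3 drops both. A small userspace sketch of
that decoding (illustrative only):

```c
/* Decode the drop_ctl bitmask the same way drop_caches() in the
 * patch does: value 1 = page cache, 2 = slab, 3 = both. */
struct drop_actions {
	int pagecache;
	int slab;
};

static struct drop_actions decode_drop_ctl(int drop_ctl)
{
	struct drop_actions a = {
		.pagecache = !!(drop_ctl & 1),
		.slab = !!(drop_ctl & 2),
	};

	return a;
}
```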

Signed-off-by: Liang Li <liang.z.li@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk> 
---
 fs/drop_caches.c   | 22 ++++++++++++++--------
 include/linux/mm.h |  1 +
 2 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/fs/drop_caches.c b/fs/drop_caches.c
index d72d52b..977dc71 100644
--- a/fs/drop_caches.c
+++ b/fs/drop_caches.c
@@ -50,14 +50,7 @@ int drop_caches_sysctl_handler(struct ctl_table *table, int write,
 	if (write) {
 		static int stfu;
 
-		if (sysctl_drop_caches & 1) {
-			iterate_supers(drop_pagecache_sb, NULL);
-			count_vm_event(DROP_PAGECACHE);
-		}
-		if (sysctl_drop_caches & 2) {
-			drop_slab();
-			count_vm_event(DROP_SLAB);
-		}
+		drop_caches(sysctl_drop_caches);
 		if (!stfu) {
 			pr_info("%s (%d): drop_caches: %d\n",
 				current->comm, task_pid_nr(current),
@@ -67,3 +60,16 @@ int drop_caches_sysctl_handler(struct ctl_table *table, int write,
 	}
 	return 0;
 }
+
+void drop_caches(int drop_ctl)
+{
+	if (drop_ctl & 1) {
+		iterate_supers(drop_pagecache_sb, NULL);
+		count_vm_event(DROP_PAGECACHE);
+	}
+	if (drop_ctl & 2) {
+		drop_slab();
+		count_vm_event(DROP_SLAB);
+	}
+}
+EXPORT_SYMBOL_GPL(drop_caches);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5df5feb..e22e315 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2263,6 +2263,7 @@ static inline int in_gate_area(struct mm_struct *mm, unsigned long addr)
 extern int sysctl_drop_caches;
 int drop_caches_sysctl_handler(struct ctl_table *, int,
 					void __user *, size_t *, loff_t *);
+void drop_caches(int drop_ctl);
 #endif
 
 void drop_slab(void);
-- 
1.9.1


* [PATCH 4/6] virtio-balloon: add drop cache support
  2016-06-13  9:47 [PATCH 0/6] Fast balloon & fast live migration Liang Li
                   ` (2 preceding siblings ...)
  2016-06-13  9:47 ` [PATCH 3/6] mm:split the drop cache operation into a function Liang Li
@ 2016-06-13  9:47 ` Liang Li
  2016-06-13  9:47 ` [PATCH 5/6] mm: add the related functions to get free page info Liang Li
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Liang Li @ 2016-06-13  9:47 UTC (permalink / raw)
  To: kvm
  Cc: virtio-dev, qemu-devel, linux-kernel, mst, Liang Li,
	Paolo Bonzini, Cornelia Huck, Amit Shah, Alexander Viro

virtio-balloon can make use of the amount of free memory to determine
the amount of memory to be filled into the balloon, but the amount of
free memory is affected by the page cache, which can be reclaimed.
Dropping the caches before measuring free memory helps to reflect the
exact amount of memory that can be reclaimed.

This patch adds a new feature to the balloon driver to support this
operation: the hypervisor can request the VM to drop its caches, so as
to reclaim more memory.
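The exchange on the new misc virtqueue is request/response: the guest
posts an inbuf holding a balloon_req_hdr, the host fills in an id and
a parameter, and the guest dispatches on the id. A sketch of just the
dispatch step, with the virtqueue plumbing omitted (illustrative only):

```c
#include <stdint.h>

/* Request header as in the patch: a request id plus one parameter
 * (for BALLOON_DROP_CACHE, the drop_caches() bitmask). */
enum balloon_req_id {
	BALLOON_DROP_CACHE,
};

struct balloon_req_hdr {
	uint32_t id;
	uint32_t param;
};

/* Returns 0 if the request id is recognized, -1 otherwise.
 * *handled_param reports what would be passed to drop_caches(). */
static int handle_misc_request(const struct balloon_req_hdr *hdr,
			       uint32_t *handled_param)
{
	switch (hdr->id) {
	case BALLOON_DROP_CACHE:
		*handled_param = hdr->param; /* e.g. 3 = pagecache + slab */
		return 0;
	default:
		return -1;	/* unknown requests are ignored */
	}
}
```

After handling, the real driver echoes the request id back on the queue
and re-posts the header buffer so the host can issue the next request.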

Signed-off-by: Liang Li <liang.z.li@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
---
 drivers/virtio/virtio_balloon.c     | 75 ++++++++++++++++++++++++++++++++++---
 include/uapi/linux/virtio_balloon.h |  1 +
 2 files changed, 71 insertions(+), 5 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 1fa601b..5a30ca0 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -46,6 +46,15 @@ static int oom_pages = OOM_VBALLOON_DEFAULT_PAGES;
 module_param(oom_pages, int, S_IRUSR | S_IWUSR);
 MODULE_PARM_DESC(oom_pages, "pages to free on OOM");
 
+enum balloon_req_id {
+	BALLOON_DROP_CACHE,
+};
+
+struct balloon_req_hdr {
+	__virtio32 id;
+	__virtio32 param;
+};
+
 struct balloon_bmap_hdr {
 	__virtio32 id;
 	__virtio32 page_shift;
@@ -55,7 +64,7 @@ struct balloon_bmap_hdr {
 
 struct virtio_balloon {
 	struct virtio_device *vdev;
-	struct virtqueue *inflate_vq, *deflate_vq, *stats_vq;
+	struct virtqueue *inflate_vq, *deflate_vq, *stats_vq, *misc_vq;
 
 	/* The balloon servicing is delegated to a freezable workqueue. */
 	struct work_struct update_balloon_stats_work;
@@ -75,6 +84,7 @@ struct virtio_balloon {
 	unsigned long bmap_len;
 	/* Used to record the processed pfn range */
 	unsigned long min_pfn, max_pfn, start_pfn, end_pfn;
+	struct balloon_req_hdr req_hdr;
 	/*
 	 * The pages we've told the Host we're not using are enqueued
 	 * at vb_dev_info->pages list.
@@ -498,18 +508,62 @@ static void update_balloon_size_func(struct work_struct *work)
 		queue_work(system_freezable_wq, work);
 }
 
+static void misc_handle_rq(struct virtio_balloon *vb)
+{
+	struct virtqueue *vq;
+	struct scatterlist sg_out;
+	unsigned int len;
+	struct balloon_req_hdr *ptr_hdr;
+	struct scatterlist sg_in;
+
+	vq = vb->misc_vq;
+	ptr_hdr = virtqueue_get_buf(vq, &len);
+
+	if (!ptr_hdr || len != sizeof(vb->req_hdr))
+		return;
+
+	switch (ptr_hdr->id) {
+	case BALLOON_DROP_CACHE:
+#ifdef CONFIG_SYSCTL
+		drop_caches(ptr_hdr->param);
+#endif
+		sg_init_one(&sg_out, &ptr_hdr->id, sizeof(ptr_hdr->id));
+		virtqueue_add_outbuf(vq, &sg_out, 1, vb, GFP_KERNEL);
+		sg_init_one(&sg_in, &vb->req_hdr, sizeof(vb->req_hdr));
+		virtqueue_add_inbuf(vq, &sg_in, 1, &vb->req_hdr, GFP_KERNEL);
+		break;
+	default:
+		break;
+	}
+
+	virtqueue_kick(vq);
+}
+
+static void misc_request(struct virtqueue *vq)
+{
+	struct virtio_balloon *vb = vq->vdev->priv;
+
+	misc_handle_rq(vb);
+}
+
 static int init_vqs(struct virtio_balloon *vb)
 {
-	struct virtqueue *vqs[3];
-	vq_callback_t *callbacks[] = { balloon_ack, balloon_ack, stats_request };
-	static const char * const names[] = { "inflate", "deflate", "stats" };
+	struct virtqueue *vqs[4];
+	vq_callback_t *callbacks[] = { balloon_ack, balloon_ack,
+					 stats_request, misc_request };
+	const char *names[] = { "inflate", "deflate", "stats", "misc" };
 	int err, nvqs;
 
 	/*
 	 * We expect two virtqueues: inflate and deflate, and
 	 * optionally stat.
 	 */
-	nvqs = virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_STATS_VQ) ? 3 : 2;
+	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_MISC))
+		nvqs = 4;
+	else
+		nvqs = virtio_has_feature(vb->vdev,
+					  VIRTIO_BALLOON_F_STATS_VQ) ? 3 : 2;
+
 	err = vb->vdev->config->find_vqs(vb->vdev, nvqs, vqs, callbacks, names);
 	if (err)
 		return err;
@@ -530,6 +584,16 @@ static int init_vqs(struct virtio_balloon *vb)
 			BUG();
 		virtqueue_kick(vb->stats_vq);
 	}
+	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_MISC)) {
+		struct scatterlist sg_in;
+
+		vb->misc_vq = vqs[3];
+		sg_init_one(&sg_in, &vb->req_hdr, sizeof(vb->req_hdr));
+		if (virtqueue_add_inbuf(vb->misc_vq, &sg_in, 1,
+		    &vb->req_hdr, GFP_KERNEL) < 0)
+			BUG();
+		virtqueue_kick(vb->misc_vq);
+	}
 	return 0;
 }
 
@@ -725,6 +789,7 @@ static unsigned int features[] = {
 	VIRTIO_BALLOON_F_STATS_VQ,
 	VIRTIO_BALLOON_F_DEFLATE_ON_OOM,
 	VIRTIO_BALLOON_F_PAGE_BITMAP,
+	VIRTIO_BALLOON_F_MISC,
 };
 
 static struct virtio_driver virtio_balloon_driver = {
diff --git a/include/uapi/linux/virtio_balloon.h b/include/uapi/linux/virtio_balloon.h
index f78fa47..5a7309d 100644
--- a/include/uapi/linux/virtio_balloon.h
+++ b/include/uapi/linux/virtio_balloon.h
@@ -35,6 +35,7 @@
 #define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue */
 #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM	2 /* Deflate balloon on OOM */
 #define VIRTIO_BALLOON_F_PAGE_BITMAP	3 /* Send page info with bitmap */
+#define VIRTIO_BALLOON_F_MISC		4 /* Send request and get misc info */
 
 /* Size of a PFN in the balloon interface. */
 #define VIRTIO_BALLOON_PFN_SHIFT 12
-- 
1.9.1


* [PATCH 5/6] mm: add the related functions to get free page info
  2016-06-13  9:47 [PATCH 0/6] Fast balloon & fast live migration Liang Li
                   ` (3 preceding siblings ...)
  2016-06-13  9:47 ` [PATCH 4/6] virtio-balloon: add drop cache support Liang Li
@ 2016-06-13  9:47 ` Liang Li
  2016-06-13  9:47 ` [PATCH 6/6] virtio-balloon: tell host vm's " Liang Li
  2016-06-23  8:27 ` [PATCH 0/6] Fast balloon & fast live migration Li, Liang Z
  6 siblings, 0 replies; 13+ messages in thread
From: Liang Li @ 2016-06-13  9:47 UTC (permalink / raw)
  To: kvm
  Cc: virtio-dev, qemu-devel, linux-kernel, mst, Liang Li,
	Paolo Bonzini, Cornelia Huck, Amit Shah

Save the free page info into a page bitmap; it will be used by the
virtio-balloon device driver.
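The marking step mirrors mark_free_pages_bitmap() from the patch: each
free buddy block of 2^order pages starting at some PFN sets 2^order
consecutive bits, provided the whole block fits below the bitmap limit.
A plain userspace sketch (the real code walks zone->free_area under
zone->lock; this is illustrative only):

```c
/* Set one bit per page for a free block of 2^order pages starting at
 * pfn, mirroring the patch's "if (pfn + page_num < limit)
 * bitmap_set(bitmap, pfn, page_num)" check: blocks that would cross
 * the limit are skipped entirely. */
static void mark_free_block(unsigned char *bitmap, unsigned long limit,
			    unsigned long pfn, unsigned int order)
{
	unsigned long n = 1UL << order, i;

	if (pfn + n >= limit)
		return;
	for (i = 0; i < n; i++)
		bitmap[(pfn + i) / 8] |= 1 << ((pfn + i) % 8);
}
```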

Signed-off-by: Liang Li <liang.z.li@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Amit Shah <amit.shah@redhat.com>
---
 mm/page_alloc.c | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6903b69..96b408f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4515,6 +4515,46 @@ void show_free_areas(unsigned int filter)
 	show_swap_cache_info();
 }
 
+unsigned long get_max_pfn(void)
+{
+	return max_pfn;
+}
+EXPORT_SYMBOL(get_max_pfn);
+
+static void mark_free_pages_bitmap(struct zone *zone,
+		unsigned long *bitmap, unsigned long len)
+{
+	unsigned long pfn, flags, limit, page_num;
+	unsigned int order, t;
+	struct list_head *curr;
+
+	if (zone_is_empty(zone))
+		return;
+
+	spin_lock_irqsave(&zone->lock, flags);
+
+	limit = min(len, max_pfn);
+	for_each_migratetype_order(order, t) {
+		list_for_each(curr, &zone->free_area[order].free_list[t]) {
+			pfn = page_to_pfn(list_entry(curr, struct page, lru));
+			page_num = 1UL << order;
+			if (pfn + page_num < limit)
+				bitmap_set(bitmap, pfn, page_num);
+		}
+	}
+
+	spin_unlock_irqrestore(&zone->lock, flags);
+}
+
+void get_free_pages(unsigned long *bitmap, unsigned long len)
+{
+	struct zone *zone;
+
+	for_each_populated_zone(zone)
+		mark_free_pages_bitmap(zone, bitmap, len);
+}
+EXPORT_SYMBOL(get_free_pages);
+
 static void zoneref_set_zone(struct zone *zone, struct zoneref *zoneref)
 {
 	zoneref->zone = zone;
-- 
1.9.1


* [PATCH 6/6] virtio-balloon: tell host vm's free page info
  2016-06-13  9:47 [PATCH 0/6] Fast balloon & fast live migration Liang Li
                   ` (4 preceding siblings ...)
  2016-06-13  9:47 ` [PATCH 5/6] mm: add the related functions to get free page info Liang Li
@ 2016-06-13  9:47 ` Liang Li
  2016-06-23  8:27 ` [PATCH 0/6] Fast balloon & fast live migration Li, Liang Z
  6 siblings, 0 replies; 13+ messages in thread
From: Liang Li @ 2016-06-13  9:47 UTC (permalink / raw)
  To: kvm
  Cc: virtio-dev, qemu-devel, linux-kernel, mst, Liang Li,
	Paolo Bonzini, Cornelia Huck, Amit Shah

Support the new request for the VM's free page information and respond
with a page bitmap. QEMU can make use of this free page bitmap to speed
up the live migration process by skipping the free pages.
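A sketch of how the host side might consume the bitmap (illustrative
only; not part of this guest patch): a set bit means the page was free
when sampled and need not be transferred, so the migration loop only
sends pages whose bit is clear.

```c
/* Count the pages a hypervisor would still have to migrate, given a
 * free-page bitmap with one bit per guest page: bit set = free (skip),
 * bit clear = in use (send). */
static unsigned long pages_to_send(const unsigned char *bitmap,
				   unsigned long total_pages)
{
	unsigned long pfn, send = 0;

	for (pfn = 0; pfn < total_pages; pfn++)
		if (!(bitmap[pfn / 8] & (1 << (pfn % 8))))
			send++;
	return send;
}
```

Since the guest keeps allocating while the bitmap is in flight, a real
consumer must treat set bits as a hint valid only at sampling time and
rely on dirty tracking to catch pages reused afterwards.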

Signed-off-by: Liang Li <liang.z.li@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Amit Shah <amit.shah@redhat.com>
---
 drivers/virtio/virtio_balloon.c | 64 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 63 insertions(+), 1 deletion(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 5a30ca0..5237d50 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -46,8 +46,12 @@ static int oom_pages = OOM_VBALLOON_DEFAULT_PAGES;
 module_param(oom_pages, int, S_IRUSR | S_IWUSR);
 MODULE_PARM_DESC(oom_pages, "pages to free on OOM");
 
+extern void get_free_pages(unsigned long *free_page_bitmap, unsigned long len);
+extern unsigned long get_max_pfn(void);
+
 enum balloon_req_id {
 	BALLOON_DROP_CACHE,
+	BALLOON_GET_FREE_PAGES,
 };
 
 struct balloon_req_hdr {
@@ -85,6 +89,9 @@ struct virtio_balloon {
 	/* Used to record the processed pfn range */
 	unsigned long min_pfn, max_pfn, start_pfn, end_pfn;
 	struct balloon_req_hdr req_hdr;
+	/* Free page bitmap and length to tell the host */
+	unsigned long *free_pages;
+	unsigned long free_bmap_len;
 	/*
 	 * The pages we've told the Host we're not using are enqueued
 	 * at vb_dev_info->pages list.
@@ -370,6 +377,41 @@ static void update_balloon_stats(struct virtio_balloon *vb)
 				pages_to_bytes(available));
 }
 
+static int reset_free_page_bmap(struct virtio_balloon *vb,
+				 unsigned long *max_pfn)
+{
+	int err = 0;
+	unsigned long bitmap_bytes;
+
+	*max_pfn = get_max_pfn();
+	bitmap_bytes = ALIGN(*max_pfn, BITS_PER_LONG) / BITS_PER_BYTE;
+
+	if (bitmap_bytes < vb->free_bmap_len)
+		memset(vb->free_pages, 0, bitmap_bytes);
+	else {
+		kfree(vb->free_pages);
+		vb->free_bmap_len = bitmap_bytes;
+		vb->free_pages = kzalloc(bitmap_bytes, GFP_KERNEL);
+	}
+
+	if (!vb->free_pages) {
+		err = -ENOMEM;
+		vb->free_bmap_len = 0;
+	}
+
+	return err;
+}
+
+static void update_free_pages_stats(struct virtio_balloon *vb)
+{
+	unsigned long max_pfn;
+
+	if (!reset_free_page_bmap(vb, &max_pfn))
+		get_free_pages(vb->free_pages, max_pfn);
+	else
+		dev_err(&vb->vdev->dev, "%s failure: No memory!\n", __func__);
+}
+
 /*
  * While most virtqueues communicate guest-initiated requests to the hypervisor,
  * the stats queue operates in reverse.  The driver initializes the virtqueue
@@ -511,10 +553,11 @@ static void update_balloon_size_func(struct work_struct *work)
 static void misc_handle_rq(struct virtio_balloon *vb)
 {
 	struct virtqueue *vq;
-	struct scatterlist sg_out;
+	struct scatterlist sg_out, sg[2];
 	unsigned int len;
 	struct balloon_req_hdr *ptr_hdr;
 	struct scatterlist sg_in;
+	struct balloon_bmap_hdr hdr;
 
 	vq = vb->misc_vq;
 	ptr_hdr = virtqueue_get_buf(vq, &len);
@@ -532,6 +575,18 @@ static void misc_handle_rq(struct virtio_balloon *vb)
 		sg_init_one(&sg_in, &vb->req_hdr, sizeof(vb->req_hdr));
 		virtqueue_add_inbuf(vq, &sg_in, 1, &vb->req_hdr, GFP_KERNEL);
 		break;
+	case BALLOON_GET_FREE_PAGES:
+		update_free_pages_stats(vb);
+		sg_init_table(sg, 2);
+
+		hdr.id = cpu_to_virtio32(vb->vdev, BALLOON_GET_FREE_PAGES);
+		hdr.page_shift = cpu_to_virtio32(vb->vdev, PAGE_SHIFT);
+		hdr.start_pfn = cpu_to_virtio64(vb->vdev, 0);
+		hdr.bmap_len = cpu_to_virtio64(vb->vdev, vb->free_bmap_len);
+		sg_set_buf(&sg[0], &hdr, sizeof(hdr));
+		sg_set_buf(&sg[1], vb->free_pages, vb->free_bmap_len);
+		virtqueue_add_outbuf(vq, &sg[0], 2, vb, GFP_KERNEL);
+		break;
 	default:
 		break;
 	}
@@ -689,6 +744,12 @@ static int virtballoon_probe(struct virtio_device *vdev)
 		err = -ENOMEM;
 		goto out;
 	}
+	vb->free_bmap_len = ALIGN(get_max_pfn(), BITS_PER_LONG) / BITS_PER_BYTE;
+	vb->free_pages = kzalloc(vb->free_bmap_len, GFP_KERNEL);
+	if (!vb->free_pages) {
+		err = -ENOMEM;
+		goto out;
+	}
 	mutex_init(&vb->balloon_lock);
 	init_waitqueue_head(&vb->acked);
 	vb->vdev = vdev;
@@ -750,6 +811,7 @@ static void virtballoon_remove(struct virtio_device *vdev)
 
 	remove_common(vb);
 	kfree(vb->page_bitmap);
+	kfree(vb->free_pages);
 	kfree(vb);
 }
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/6] virtio-balloon: speed up inflate/deflate process
  2016-06-13  9:47 ` [PATCH 2/6] virtio-balloon: speed up inflate/deflate process Liang Li
@ 2016-06-13 10:17   ` kbuild test robot
  2016-06-24  5:39   ` Michael S. Tsirkin
  1 sibling, 0 replies; 13+ messages in thread
From: kbuild test robot @ 2016-06-13 10:17 UTC (permalink / raw)
  To: Liang Li
  Cc: kbuild-all, kvm, virtio-dev, qemu-devel, linux-kernel, mst,
	Liang Li, Paolo Bonzini, Cornelia Huck, Amit Shah

[-- Attachment #1: Type: text/plain, Size: 1809 bytes --]

Hi,

[auto build test WARNING on v4.7-rc3]
[cannot apply to next-20160609]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Liang-Li/Fast-balloon-fast-live-migration/20160613-175812
config: m68k-allyesconfig (attached as .config)
compiler: m68k-linux-gcc (GCC) 4.9.0
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=m68k 

All warnings (new ones prefixed by >>):

   drivers/virtio/virtio_balloon.c: In function 'init_pfn_range':
>> drivers/virtio/virtio_balloon.c:123:2: warning: left shift count >= width of type
     vb->min_pfn = (1UL << 48);
     ^

vim +123 drivers/virtio/virtio_balloon.c

   107		unsigned long pfn = page_to_pfn(page);
   108	
   109		BUILD_BUG_ON(PAGE_SHIFT < VIRTIO_BALLOON_PFN_SHIFT);
   110		/* Convert pfn from Linux page size to balloon page size. */
   111		return pfn * VIRTIO_BALLOON_PAGES_PER_PAGE;
   112	}
   113	
   114	static void balloon_ack(struct virtqueue *vq)
   115	{
   116		struct virtio_balloon *vb = vq->vdev->priv;
   117	
   118		wake_up(&vb->acked);
   119	}
   120	
   121	static inline void init_pfn_range(struct virtio_balloon *vb)
   122	{
 > 123		vb->min_pfn = (1UL << 48);
   124		vb->max_pfn = 0;
   125	}
   126	
   127	static inline void update_pfn_range(struct virtio_balloon *vb,
   128					 struct page *page)
   129	{
   130		unsigned long balloon_pfn = page_to_balloon_pfn(page);
   131	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 37193 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH 1/6] virtio-balloon: rework deflate to add page to a list
  2016-06-13  9:47 ` [PATCH 1/6] virtio-balloon: rework deflate to add page to a list Liang Li
@ 2016-06-23  8:25   ` Li, Liang Z
  2016-06-23  8:30   ` Li, Liang Z
  1 sibling, 0 replies; 13+ messages in thread
From: Li, Liang Z @ 2016-06-23  8:25 UTC (permalink / raw)
  To: kvm
  Cc: virtio-dev, qemu-devel, linux-kernel, mst, Paolo Bonzini,
	Cornelia Huck, Amit Shah

Ping ... 

Liang


> -----Original Message-----
> From: Li, Liang Z
> Sent: Monday, June 13, 2016 5:47 PM
> To: kvm@vger.kernel.org
> Cc: virtio-dev@lists.oasis-open.org; qemu-devel@nongnu.org; linux-
> kernel@vger.kernel.org; mst@redhat.com; Li, Liang Z; Paolo Bonzini; Cornelia
> Huck; Amit Shah
> Subject: [PATCH 1/6] virtio-balloon: rework deflate to add page to a list
> 
> will allow faster notifications using a bitmap down the road.
> Now balloon_pfn_to_page() can be removed because it is not used.
> 
> Signed-off-by: Liang Li <liang.z.li@intel.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
> Cc: Amit Shah <amit.shah@redhat.com>
> ---
>  drivers/virtio/virtio_balloon.c | 22 ++++++++--------------
>  1 file changed, 8 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 476c0e3..8d649a2 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -98,12 +98,6 @@ static u32 page_to_balloon_pfn(struct page *page)
>  	return pfn * VIRTIO_BALLOON_PAGES_PER_PAGE;  }
> 
> -static struct page *balloon_pfn_to_page(u32 pfn) -{
> -	BUG_ON(pfn % VIRTIO_BALLOON_PAGES_PER_PAGE);
> -	return pfn_to_page(pfn / VIRTIO_BALLOON_PAGES_PER_PAGE);
> -}
> -
>  static void balloon_ack(struct virtqueue *vq)  {
>  	struct virtio_balloon *vb = vq->vdev->priv; @@ -176,18 +170,16 @@
> static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
>  	return num_allocated_pages;
>  }
> 
> -static void release_pages_balloon(struct virtio_balloon *vb)
> +static void release_pages_balloon(struct virtio_balloon *vb,
> +				 struct list_head *pages)
>  {
> -	unsigned int i;
> -	struct page *page;
> +	struct page *page, *next;
> 
> -	/* Find pfns pointing at start of each page, get pages and free them.
> */
> -	for (i = 0; i < vb->num_pfns; i +=
> VIRTIO_BALLOON_PAGES_PER_PAGE) {
> -		page = balloon_pfn_to_page(virtio32_to_cpu(vb->vdev,
> -							   vb->pfns[i]));
> +	list_for_each_entry_safe(page, next, pages, lru) {
>  		if (!virtio_has_feature(vb->vdev,
> 
> 	VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
>  			adjust_managed_page_count(page, 1);
> +		list_del(&page->lru);
>  		put_page(page); /* balloon reference */
>  	}
>  }
> @@ -197,6 +189,7 @@ static unsigned leak_balloon(struct virtio_balloon *vb,
> size_t num)
>  	unsigned num_freed_pages;
>  	struct page *page;
>  	struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
> +	LIST_HEAD(pages);
> 
>  	/* We can only do one array worth at a time. */
>  	num = min(num, ARRAY_SIZE(vb->pfns));
> @@ -208,6 +201,7 @@ static unsigned leak_balloon(struct virtio_balloon *vb,
> size_t num)
>  		if (!page)
>  			break;
>  		set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
> +		list_add(&page->lru, &pages);
>  		vb->num_pages -= VIRTIO_BALLOON_PAGES_PER_PAGE;
>  	}
> 
> @@ -219,7 +213,7 @@ static unsigned leak_balloon(struct virtio_balloon *vb,
> size_t num)
>  	 */
>  	if (vb->num_pfns != 0)
>  		tell_host(vb, vb->deflate_vq);
> -	release_pages_balloon(vb);
> +	release_pages_balloon(vb, &pages);
>  	mutex_unlock(&vb->balloon_lock);
>  	return num_freed_pages;
>  }
> --
> 1.9.1

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH 0/6] Fast balloon & fast live migration
  2016-06-13  9:47 [PATCH 0/6] Fast balloon & fast live migration Liang Li
                   ` (5 preceding siblings ...)
  2016-06-13  9:47 ` [PATCH 6/6] virtio-balloon: tell host vm's " Liang Li
@ 2016-06-23  8:27 ` Li, Liang Z
  6 siblings, 0 replies; 13+ messages in thread
From: Li, Liang Z @ 2016-06-23  8:27 UTC (permalink / raw)
  To: kvm; +Cc: virtio-dev, qemu-devel, linux-kernel, mst

Any comments?

Liang


> -----Original Message-----
> From: Li, Liang Z
> Sent: Monday, June 13, 2016 5:47 PM
> To: kvm@vger.kernel.org
> Cc: virtio-dev@lists.oasis-open.org; qemu-devel@nongnu.org; linux-
> kernel@vger.kernel.org; mst@redhat.com; Li, Liang Z
> Subject: [PATCH 0/6] Fast balloon & fast live migration
> 
> The implementation of the current virtio-balloon is not very efficient; below
> is the test result of time spent on inflating the balloon to 3GB of a 4GB idle
> guest:
> 
> a. allocating pages (6.5%, 103ms)
> b. sending PFNs to host (68.3%, 787ms)
> c. address translation (6.1%, 96ms)
> d. madvise (19%, 300ms)
> 
> It takes about 1577ms for the whole inflating process to complete.
> The test shows that the bottlenecks are stages b and d.
> 
> If using a bitmap to send the page info instead of the PFNs, we can reduce
> the overhead in stage b quite a lot. Furthermore, it's possible to do the
> address translation and the madvise with a bulk of pages, instead of the
> current page per page way, so the overhead of stage c and stage d can also
> be reduced a lot.
> 
> In addition, we can speed up live migration by skipping the processing of
> the guest's free pages.
> 
> Patch 1 and patch 2 are the kernel side implementation which are intended
> to speed up the inflating & deflating process by adding a new feature to the
> virtio-balloon device. And now, inflating the balloon to 3GB of a 4GB idle
> guest only takes 200ms, it's about 8 times as fast as before.
> 
> 
> Patch 3 and patch 4 add cache drop support, so the hypervisor can request
> the guest to drop its cache. This is useful before inflating the virtio-balloon and
> before starting live migration.
> 
> Patch 5 and patch 6 save guest's free page information into a page bitmap
> and send the bitmap to host through balloon's virt queue.
> 
> Liang Li (6):
>   virtio-balloon: rework deflate to add page to a list
>   virtio-balloon: speed up inflate/deflate process
>   mm:split the drop cache operation into a function
>   virtio-balloon: add drop cache support
>   mm: add the related functions to get free page info
>   virtio-balloon: tell host vm's free page info
> 
>  drivers/virtio/virtio_balloon.c     | 321
> +++++++++++++++++++++++++++++++-----
>  fs/drop_caches.c                    |  22 ++-
>  include/linux/mm.h                  |   1 +
>  include/uapi/linux/virtio_balloon.h |   2 +
>  mm/page_alloc.c                     |  40 +++++
>  5 files changed, 339 insertions(+), 47 deletions(-)
> 
> --
> 1.9.1

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH 1/6] virtio-balloon: rework deflate to add page to a list
  2016-06-13  9:47 ` [PATCH 1/6] virtio-balloon: rework deflate to add page to a list Liang Li
  2016-06-23  8:25   ` Li, Liang Z
@ 2016-06-23  8:30   ` Li, Liang Z
  1 sibling, 0 replies; 13+ messages in thread
From: Li, Liang Z @ 2016-06-23  8:30 UTC (permalink / raw)
  To: kvm
  Cc: virtio-dev, qemu-devel, linux-kernel, mst, Paolo Bonzini,
	Cornelia Huck, Amit Shah

Hi Michael,

Could you help review this patch set and give some comments when you have time? 
My work is blocked here.

Thanks !
Liang


> -----Original Message-----
> From: Li, Liang Z
> Sent: Monday, June 13, 2016 5:47 PM
> To: kvm@vger.kernel.org
> Cc: virtio-dev@lists.oasis-open.org; qemu-devel@nongnu.org; linux-
> kernel@vger.kernel.org; mst@redhat.com; Li, Liang Z; Paolo Bonzini; Cornelia
> Huck; Amit Shah
> Subject: [PATCH 1/6] virtio-balloon: rework deflate to add page to a list
> 
> will allow faster notifications using a bitmap down the road.
> Now balloon_pfn_to_page() can be removed because it is not used.
> 
> Signed-off-by: Liang Li <liang.z.li@intel.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
> Cc: Amit Shah <amit.shah@redhat.com>
> ---
>  drivers/virtio/virtio_balloon.c | 22 ++++++++--------------
>  1 file changed, 8 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 476c0e3..8d649a2 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -98,12 +98,6 @@ static u32 page_to_balloon_pfn(struct page *page)
>  	return pfn * VIRTIO_BALLOON_PAGES_PER_PAGE;  }
> 
> -static struct page *balloon_pfn_to_page(u32 pfn) -{
> -	BUG_ON(pfn % VIRTIO_BALLOON_PAGES_PER_PAGE);
> -	return pfn_to_page(pfn / VIRTIO_BALLOON_PAGES_PER_PAGE);
> -}
> -
>  static void balloon_ack(struct virtqueue *vq)  {
>  	struct virtio_balloon *vb = vq->vdev->priv; @@ -176,18 +170,16 @@
> static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
>  	return num_allocated_pages;
>  }
> 
> -static void release_pages_balloon(struct virtio_balloon *vb)
> +static void release_pages_balloon(struct virtio_balloon *vb,
> +				 struct list_head *pages)
>  {
> -	unsigned int i;
> -	struct page *page;
> +	struct page *page, *next;
> 
> -	/* Find pfns pointing at start of each page, get pages and free them.
> */
> -	for (i = 0; i < vb->num_pfns; i +=
> VIRTIO_BALLOON_PAGES_PER_PAGE) {
> -		page = balloon_pfn_to_page(virtio32_to_cpu(vb->vdev,
> -							   vb->pfns[i]));
> +	list_for_each_entry_safe(page, next, pages, lru) {
>  		if (!virtio_has_feature(vb->vdev,
> 
> 	VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
>  			adjust_managed_page_count(page, 1);
> +		list_del(&page->lru);
>  		put_page(page); /* balloon reference */
>  	}
>  }
> @@ -197,6 +189,7 @@ static unsigned leak_balloon(struct virtio_balloon *vb,
> size_t num)
>  	unsigned num_freed_pages;
>  	struct page *page;
>  	struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
> +	LIST_HEAD(pages);
> 
>  	/* We can only do one array worth at a time. */
>  	num = min(num, ARRAY_SIZE(vb->pfns));
> @@ -208,6 +201,7 @@ static unsigned leak_balloon(struct virtio_balloon *vb,
> size_t num)
>  		if (!page)
>  			break;
>  		set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
> +		list_add(&page->lru, &pages);
>  		vb->num_pages -= VIRTIO_BALLOON_PAGES_PER_PAGE;
>  	}
> 
> @@ -219,7 +213,7 @@ static unsigned leak_balloon(struct virtio_balloon *vb,
> size_t num)
>  	 */
>  	if (vb->num_pfns != 0)
>  		tell_host(vb, vb->deflate_vq);
> -	release_pages_balloon(vb);
> +	release_pages_balloon(vb, &pages);
>  	mutex_unlock(&vb->balloon_lock);
>  	return num_freed_pages;
>  }
> --
> 1.9.1

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/6] virtio-balloon: speed up inflate/deflate process
  2016-06-13  9:47 ` [PATCH 2/6] virtio-balloon: speed up inflate/deflate process Liang Li
  2016-06-13 10:17   ` kbuild test robot
@ 2016-06-24  5:39   ` Michael S. Tsirkin
  2016-06-24  6:28     ` Li, Liang Z
  1 sibling, 1 reply; 13+ messages in thread
From: Michael S. Tsirkin @ 2016-06-24  5:39 UTC (permalink / raw)
  To: Liang Li
  Cc: kvm, virtio-dev, qemu-devel, linux-kernel, Paolo Bonzini,
	Cornelia Huck, Amit Shah

On Mon, Jun 13, 2016 at 05:47:09PM +0800, Liang Li wrote:
> The implementation of the current virtio-balloon is not very efficient,
> Below is the test result of time spent on inflating the balloon to 3GB of
> a 4GB idle guest:
> 
> a. allocating pages (6.5%, 103ms)
> b. sending PFNs to host (68.3%, 787ms)
> c. address translation (6.1%, 96ms)
> d. madvise (19%, 300ms)
> 
> It takes about 1577ms for the whole inflating process to complete. The
> test shows that the bottlenecks are stages b and d.
> 
> If using a bitmap to send the page info instead of the PFNs, we can
> reduce the overhead in stage b quite a lot. Furthermore, it's possible
> to do the address translation and the madvise with a bulk of pages,
> instead of the current page per page way, so the overhead of stage c
> and stage d can also be reduced a lot.
> 
> This patch is the kernel side implementation which is intended to speed
> up the inflating & deflating process by adding a new feature to the
> virtio-balloon device. And now, inflating the balloon to 3GB of a 4GB
> idle guest only takes 200ms, it's about 8 times as fast as before.
> 
> TODO: optimize stage a by allocating/freeing a chunk of pages instead
> of a single page at a time.
> 
> Signed-off-by: Liang Li <liang.z.li@intel.com>
> Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
> Cc: Amit Shah <amit.shah@redhat.com>

Causes kbuild warnings

> ---
>  drivers/virtio/virtio_balloon.c     | 164 +++++++++++++++++++++++++++++++-----
>  include/uapi/linux/virtio_balloon.h |   1 +
>  2 files changed, 144 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 8d649a2..1fa601b 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -40,11 +40,19 @@
>  #define VIRTIO_BALLOON_ARRAY_PFNS_MAX 256
>  #define OOM_VBALLOON_DEFAULT_PAGES 256
>  #define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
> +#define VIRTIO_BALLOON_PFNS_LIMIT ((2 * (1ULL << 30)) >> PAGE_SHIFT) /* 2GB */

2 << 30 is 2G, but that is not a useful comment.
Please explain the reason for this choice.

>  
>  static int oom_pages = OOM_VBALLOON_DEFAULT_PAGES;
>  module_param(oom_pages, int, S_IRUSR | S_IWUSR);
>  MODULE_PARM_DESC(oom_pages, "pages to free on OOM");
>  
> +struct balloon_bmap_hdr {
> +	__virtio32 id;
> +	__virtio32 page_shift;
> +	__virtio64 start_pfn;
> +	__virtio64 bmap_len;
> +};
> +

Put this in an uapi header please.

>  struct virtio_balloon {
>  	struct virtio_device *vdev;
>  	struct virtqueue *inflate_vq, *deflate_vq, *stats_vq;
> @@ -62,6 +70,11 @@ struct virtio_balloon {
>  
>  	/* Number of balloon pages we've told the Host we're not using. */
>  	unsigned int num_pages;
> +	/* Bitmap and length used to tell the host the pages */
> +	unsigned long *page_bitmap;
> +	unsigned long bmap_len;
> +	/* Used to record the processed pfn range */
> +	unsigned long min_pfn, max_pfn, start_pfn, end_pfn;
>  	/*
>  	 * The pages we've told the Host we're not using are enqueued
>  	 * at vb_dev_info->pages list.
> @@ -105,15 +118,51 @@ static void balloon_ack(struct virtqueue *vq)
>  	wake_up(&vb->acked);
>  }
>  
> +static inline void init_pfn_range(struct virtio_balloon *vb)
> +{
> +	vb->min_pfn = (1UL << 48);

Where does this value come from? Do you want ULONG_MAX?
This does not fit in long on 32 bit systems.


> +	vb->max_pfn = 0;
> +}
> +
> +static inline void update_pfn_range(struct virtio_balloon *vb,
> +				 struct page *page)
> +{
> +	unsigned long balloon_pfn = page_to_balloon_pfn(page);
> +
> +	if (balloon_pfn < vb->min_pfn)
> +		vb->min_pfn = balloon_pfn;
> +	if (balloon_pfn > vb->max_pfn)
> +		vb->max_pfn = balloon_pfn;
> +}
> +
>  static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq)
>  {
> -	struct scatterlist sg;
>  	unsigned int len;
>  
> -	sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns);
> +	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_PAGE_BITMAP)) {
> +		struct balloon_bmap_hdr hdr;

why not init fields here?

> +		unsigned long bmap_len;

and here

> +		struct scatterlist sg[2];
> +
> +		hdr.id = cpu_to_virtio32(vb->vdev, 0);
> +		hdr.page_shift = cpu_to_virtio32(vb->vdev, PAGE_SHIFT);
> +		hdr.start_pfn = cpu_to_virtio64(vb->vdev, vb->start_pfn);
> +		bmap_len = min(vb->bmap_len,
> +				(vb->end_pfn - vb->start_pfn) / BITS_PER_BYTE);
> +		hdr.bmap_len = cpu_to_virtio64(vb->vdev, bmap_len);
> +		sg_init_table(sg, 2);
> +		sg_set_buf(&sg[0], &hdr, sizeof(hdr));
> +		sg_set_buf(&sg[1], vb->page_bitmap, bmap_len);
> +		virtqueue_add_outbuf(vq, sg, 2, vb, GFP_KERNEL);

This might fail if the queue size is < 2. Validate the queue size and clear
VIRTIO_BALLOON_F_PAGE_BITMAP?

Alternatively, and I think preferably,
use first struct balloon_bmap_hdr bytes in the buffer
to pass the header to host.


> +	} else {
> +		struct scatterlist sg;
>  
> -	/* We should always be able to add one buffer to an empty queue. */
> -	virtqueue_add_outbuf(vq, &sg, 1, vb, GFP_KERNEL);
> +		sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns);
> +		/* We should always be able to add one buffer to an
> +		* empty queue.
> +		*/
> +		virtqueue_add_outbuf(vq, &sg, 1, vb, GFP_KERNEL);
> +	}
>  	virtqueue_kick(vq);
>  
>  	/* When host has read buffer, this completes via balloon_ack */
> @@ -133,13 +182,50 @@ static void set_page_pfns(struct virtio_balloon *vb,
>  					  page_to_balloon_pfn(page) + i);
>  }
>  
> -static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
> +static void set_page_bitmap(struct virtio_balloon *vb,
> +			 struct list_head *pages, struct virtqueue *vq)
> +{
> +	unsigned long pfn;
> +	struct page *page, *next;
> +	bool find;

find -> found

> +
> +	vb->min_pfn = rounddown(vb->min_pfn, BITS_PER_LONG);
> +	vb->max_pfn = roundup(vb->max_pfn, BITS_PER_LONG);
> +	for (pfn = vb->min_pfn; pfn < vb->max_pfn;
> +			pfn += VIRTIO_BALLOON_PFNS_LIMIT) {
> +		vb->start_pfn = pfn;
> +		vb->end_pfn = pfn;
> +		memset(vb->page_bitmap, 0, vb->bmap_len);
> +		find = false;
> +		list_for_each_entry_safe(page, next, pages, lru) {

Why _safe?

> +			unsigned long balloon_pfn = page_to_balloon_pfn(page);
> +
> +			if (balloon_pfn < pfn ||
> +				 balloon_pfn >= pfn + VIRTIO_BALLOON_PFNS_LIMIT)
> +				continue;
> +			set_bit(balloon_pfn - pfn, vb->page_bitmap);
> +			if (balloon_pfn > vb->end_pfn)
> +				vb->end_pfn = balloon_pfn;
> +			find = true;

maybe remove page from list? this way we won't go over same entry
multiple times ...

> +		}
> +		if (find) {
> +			vb->end_pfn = roundup(vb->end_pfn, BITS_PER_LONG);
> +			tell_host(vb, vq);
> +		}
> +	}
> +}
> +
> +static unsigned int fill_balloon(struct virtio_balloon *vb, size_t num,
> +				 bool use_bmap)
>  {
>  	struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
>  	unsigned num_allocated_pages;
>  
> -	/* We can only do one array worth at a time. */
> -	num = min(num, ARRAY_SIZE(vb->pfns));
> +	if (use_bmap)
> +		init_pfn_range(vb);
> +	else
> +		/* We can only do one array worth at a time. */
> +		num = min(num, ARRAY_SIZE(vb->pfns));
>  
>  	mutex_lock(&vb->balloon_lock);
>  	for (vb->num_pfns = 0; vb->num_pfns < num;
> @@ -154,7 +240,10 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
>  			msleep(200);
>  			break;
>  		}
> -		set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
> +		if (use_bmap)
> +			update_pfn_range(vb, page);
> +		else
> +			set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
>  		vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
>  		if (!virtio_has_feature(vb->vdev,
>  					VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
> @@ -163,8 +252,13 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
>  
>  	num_allocated_pages = vb->num_pfns;
>  	/* Did we get any? */
> -	if (vb->num_pfns != 0)
> -		tell_host(vb, vb->inflate_vq);
> +	if (vb->num_pfns != 0) {
> +		if (use_bmap)
> +			set_page_bitmap(vb, &vb_dev_info->pages,
> +					 vb->inflate_vq);

don't we need pages_lock if we access vb_dev_info->pages?

> +		else
> +			tell_host(vb, vb->inflate_vq);
> +	}
>  	mutex_unlock(&vb->balloon_lock);
>  
>  	return num_allocated_pages;
> @@ -184,15 +278,19 @@ static void release_pages_balloon(struct virtio_balloon *vb,
>  	}
>  }
>  
> -static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)
> +static unsigned int leak_balloon(struct virtio_balloon *vb, size_t num,
> +				bool use_bmap)
>  {
>  	unsigned num_freed_pages;
>  	struct page *page;
>  	struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
>  	LIST_HEAD(pages);
>  
> -	/* We can only do one array worth at a time. */
> -	num = min(num, ARRAY_SIZE(vb->pfns));
> +	if (use_bmap)
> +		init_pfn_range(vb);
> +	else
> +		/* We can only do one array worth at a time. */
> +		num = min(num, ARRAY_SIZE(vb->pfns));
>  
>  	mutex_lock(&vb->balloon_lock);
>  	for (vb->num_pfns = 0; vb->num_pfns < num;
> @@ -200,7 +298,10 @@ static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)
>  		page = balloon_page_dequeue(vb_dev_info);
>  		if (!page)
>  			break;
> -		set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
> +		if (use_bmap)
> +			update_pfn_range(vb, page);
> +		else
> +			set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
>  		list_add(&page->lru, &pages);
>  		vb->num_pages -= VIRTIO_BALLOON_PAGES_PER_PAGE;
>  	}
> @@ -211,9 +312,14 @@ static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)
>  	 * virtio_has_feature(vdev, VIRTIO_BALLOON_F_MUST_TELL_HOST);
>  	 * is true, we *have* to do it in this order
>  	 */
> -	if (vb->num_pfns != 0)
> -		tell_host(vb, vb->deflate_vq);
> -	release_pages_balloon(vb, &pages);
> +	if (vb->num_pfns != 0) {
> +		if (use_bmap)
> +			set_page_bitmap(vb, &pages, vb->deflate_vq);
> +		else
> +			tell_host(vb, vb->deflate_vq);
> +
> +		release_pages_balloon(vb, &pages);
> +	}
>  	mutex_unlock(&vb->balloon_lock);
>  	return num_freed_pages;
>  }
> @@ -347,13 +453,15 @@ static int virtballoon_oom_notify(struct notifier_block *self,
>  	struct virtio_balloon *vb;
>  	unsigned long *freed;
>  	unsigned num_freed_pages;
> +	bool use_bmap;
>  
>  	vb = container_of(self, struct virtio_balloon, nb);
>  	if (!virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
>  		return NOTIFY_OK;
>  
>  	freed = parm;
> -	num_freed_pages = leak_balloon(vb, oom_pages);
> +	use_bmap = virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_PAGE_BITMAP);
> +	num_freed_pages = leak_balloon(vb, oom_pages, use_bmap);
>  	update_balloon_size(vb);
>  	*freed += num_freed_pages;
>  
> @@ -373,15 +481,17 @@ static void update_balloon_size_func(struct work_struct *work)
>  {
>  	struct virtio_balloon *vb;
>  	s64 diff;
> +	bool use_bmap;
>  
>  	vb = container_of(work, struct virtio_balloon,
>  			  update_balloon_size_work);
> +	use_bmap = virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_PAGE_BITMAP);
>  	diff = towards_target(vb);
>  
>  	if (diff > 0)
> -		diff -= fill_balloon(vb, diff);
> +		diff -= fill_balloon(vb, diff, use_bmap);
>  	else if (diff < 0)
> -		diff += leak_balloon(vb, -diff);
> +		diff += leak_balloon(vb, -diff, use_bmap);
>  	update_balloon_size(vb);
>  
>  	if (diff)
> @@ -508,6 +618,13 @@ static int virtballoon_probe(struct virtio_device *vdev)
>  	spin_lock_init(&vb->stop_update_lock);
>  	vb->stop_update = false;
>  	vb->num_pages = 0;
> +	vb->bmap_len = ALIGN(VIRTIO_BALLOON_PFNS_LIMIT, BITS_PER_LONG) /
> +		 BITS_PER_BYTE + 2 * sizeof(unsigned long);
> +	vb->page_bitmap = kzalloc(vb->bmap_len, GFP_KERNEL);
> +	if (!vb->page_bitmap) {
> +		err = -ENOMEM;
> +		goto out;
> +	}

How about we clear the bitmap feature on this failure?

>  	mutex_init(&vb->balloon_lock);
>  	init_waitqueue_head(&vb->acked);
>  	vb->vdev = vdev;
> @@ -541,9 +658,12 @@ out:
>  
>  static void remove_common(struct virtio_balloon *vb)
>  {
> +	bool use_bmap;
> +
> +	use_bmap = virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_PAGE_BITMAP);
>  	/* There might be pages left in the balloon: free them. */
>  	while (vb->num_pages)
> -		leak_balloon(vb, vb->num_pages);
> +		leak_balloon(vb, vb->num_pages, use_bmap);
>  	update_balloon_size(vb);
>  
>  	/* Now we reset the device so we can clean up the queues. */
> @@ -565,6 +685,7 @@ static void virtballoon_remove(struct virtio_device *vdev)
>  	cancel_work_sync(&vb->update_balloon_stats_work);
>  
>  	remove_common(vb);
> +	kfree(vb->page_bitmap);
>  	kfree(vb);
>  }
>  
> @@ -603,6 +724,7 @@ static unsigned int features[] = {
>  	VIRTIO_BALLOON_F_MUST_TELL_HOST,
>  	VIRTIO_BALLOON_F_STATS_VQ,
>  	VIRTIO_BALLOON_F_DEFLATE_ON_OOM,
> +	VIRTIO_BALLOON_F_PAGE_BITMAP,
>  };
>  
>  static struct virtio_driver virtio_balloon_driver = {
> diff --git a/include/uapi/linux/virtio_balloon.h b/include/uapi/linux/virtio_balloon.h
> index 343d7dd..f78fa47 100644
> --- a/include/uapi/linux/virtio_balloon.h
> +++ b/include/uapi/linux/virtio_balloon.h
> @@ -34,6 +34,7 @@
>  #define VIRTIO_BALLOON_F_MUST_TELL_HOST	0 /* Tell before reclaiming pages */
>  #define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue */
>  #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM	2 /* Deflate balloon on OOM */
> +#define VIRTIO_BALLOON_F_PAGE_BITMAP	3 /* Send page info with bitmap */
>  
>  /* Size of a PFN in the balloon interface. */
>  #define VIRTIO_BALLOON_PFN_SHIFT 12
> -- 
> 1.9.1

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH 2/6] virtio-balloon: speed up inflate/deflate process
  2016-06-24  5:39   ` Michael S. Tsirkin
@ 2016-06-24  6:28     ` Li, Liang Z
  0 siblings, 0 replies; 13+ messages in thread
From: Li, Liang Z @ 2016-06-24  6:28 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, virtio-dev, qemu-devel, linux-kernel, Paolo Bonzini,
	Cornelia Huck, Amit Shah

Hi Michael,

Thanks for your comments!

> 
> 2 << 30 is 2G, but that is not a useful comment.
> Please explain the reason for this choice.
> 

Will change in the next version.

> > +struct balloon_bmap_hdr {
> > +	__virtio32 id;
> > +	__virtio32 page_shift;
> > +	__virtio64 start_pfn;
> > +	__virtio64 bmap_len;
> > +};
> > +
> 
> Put this in an uapi header please.

Will change in the next version.

> > +static inline void init_pfn_range(struct virtio_balloon *vb) {
> > +	vb->min_pfn = (1UL << 48);
> 
> Where does this value come from? Do you want ULONG_MAX?
> This does not fit in long on 32 bit systems.

I just wanted a value big enough; ULONG_MAX is better. Will change it.

> >  static void tell_host(struct virtio_balloon *vb, struct virtqueue
> > *vq)  {
> > -	struct scatterlist sg;
> >  	unsigned int len;
> >
> > -	sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns);
> > +	if (virtio_has_feature(vb->vdev,
> VIRTIO_BALLOON_F_PAGE_BITMAP)) {
> > +		struct balloon_bmap_hdr hdr;
> 
> why not init fields here?
> 
> > +		unsigned long bmap_len;
> 
> and here

All the fields and bmap_len are assigned just below, so initializing them at declaration seems unnecessary?
 
> > +		struct scatterlist sg[2];
> > +
> > +		hdr.id = cpu_to_virtio32(vb->vdev, 0);
> > +		hdr.page_shift = cpu_to_virtio32(vb->vdev, PAGE_SHIFT);
> > +		hdr.start_pfn = cpu_to_virtio64(vb->vdev, vb->start_pfn);
> > +		bmap_len = min(vb->bmap_len,
> > +				(vb->end_pfn - vb->start_pfn) /
> BITS_PER_BYTE);
> > +		hdr.bmap_len = cpu_to_virtio64(vb->vdev, bmap_len);
> > +		sg_init_table(sg, 2);
> > +		sg_set_buf(&sg[0], &hdr, sizeof(hdr));
> > +		sg_set_buf(&sg[1], vb->page_bitmap, bmap_len);
> > +		virtqueue_add_outbuf(vq, sg, 2, vb, GFP_KERNEL);
> 
> This might fail if the queue size is < 2. Validate the queue size and clear
> VIRTIO_BALLOON_F_PAGE_BITMAP?
> 
I had not considered that case yet.

> Alternatively, and I think preferably,
> use first struct balloon_bmap_hdr bytes in the buffer to pass the header to
> host.

Do you mean putting the header bytes at the start of the buffer and the bitmap right after it, or sending the bitmap in a separate buffer?

> > +	struct page *page, *next;
> > +	bool find;
> 
> find -> found

Will change.

> > +	vb->max_pfn = roundup(vb->max_pfn, BITS_PER_LONG);
> > +	for (pfn = vb->min_pfn; pfn < vb->max_pfn;
> > +			pfn += VIRTIO_BALLOON_PFNS_LIMIT) {
> > +		vb->start_pfn = pfn;
> > +		vb->end_pfn = pfn;
> > +		memset(vb->page_bitmap, 0, vb->bmap_len);
> > +		find = false;
> > +		list_for_each_entry_safe(page, next, pages, lru) {
> 
> Why _safe?

Right, the non-_safe variant is fine since no entries are removed inside the loop. Will change.

> > +			unsigned long balloon_pfn =
> page_to_balloon_pfn(page);
> > +
> > +			if (balloon_pfn < pfn ||
> > +				 balloon_pfn >= pfn +
> VIRTIO_BALLOON_PFNS_LIMIT)
> > +				continue;
> > +			set_bit(balloon_pfn - pfn, vb->page_bitmap);
> > +			if (balloon_pfn > vb->end_pfn)
> > +				vb->end_pfn = balloon_pfn;
> > +			find = true;
> 
> maybe remove page from list? this way we won't go over same entry
> multiple times ...

No, we can't remove the pages from the list. The list holds all the pages currently in the balloon; when deflating, we fetch the pages from the list to return them to the guest. If they were removed, we couldn't find them.

> > unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
> >
> >  	num_allocated_pages = vb->num_pfns;
> >  	/* Did we get any? */
> > -	if (vb->num_pfns != 0)
> > -		tell_host(vb, vb->inflate_vq);
> > +	if (vb->num_pfns != 0) {
> > +		if (use_bmap)
> > +			set_page_bitmap(vb, &vb_dev_info->pages,
> > +					 vb->inflate_vq);
> 
> don't we need pages_lock if we access vb_dev_info->pages?

It is protected by vb->balloon_lock. Is that not enough?

> > +	vb->page_bitmap = kzalloc(vb->bmap_len, GFP_KERNEL);
> > +	if (!vb->page_bitmap) {
> > +		err = -ENOMEM;
> > +		goto out;
> > +	}
> 
> How about we clear the bitmap feature on this failure?

That's better. Will change.

Thanks again!
Liang



Thread overview: 13+ messages
2016-06-13  9:47 [PATCH 0/6] Fast balloon & fast live migration Liang Li
2016-06-13  9:47 ` [PATCH 1/6] virtio-balloon: rework deflate to add page to a list Liang Li
2016-06-23  8:25   ` Li, Liang Z
2016-06-23  8:30   ` Li, Liang Z
2016-06-13  9:47 ` [PATCH 2/6] virtio-balloon: speed up inflate/deflate process Liang Li
2016-06-13 10:17   ` kbuild test robot
2016-06-24  5:39   ` Michael S. Tsirkin
2016-06-24  6:28     ` Li, Liang Z
2016-06-13  9:47 ` [PATCH 3/6] mm:split the drop cache operation into a function Liang Li
2016-06-13  9:47 ` [PATCH 4/6] virtio-balloon: add drop cache support Liang Li
2016-06-13  9:47 ` [PATCH 5/6] mm: add the related functions to get free page info Liang Li
2016-06-13  9:47 ` [PATCH 6/6] virtio-balloon: tell host vm's " Liang Li
2016-06-23  8:27 ` [PATCH 0/6] Fast balloon & fast live migration Li, Liang Z
