* [PATCH v1 0/3] Virtio-balloon Improvement
@ 2017-10-20 11:54 ` Wei Wang
  0 siblings, 0 replies; 47+ messages in thread
From: Wei Wang @ 2017-10-20 11:54 UTC (permalink / raw)
  To: mst, penguin-kernel, mhocko, linux-kernel, linux-mm, virtualization
  Cc: Wei Wang

This patch series builds on the recent contributions from Michael S.
Tsirkin, Tetsuo Handa, Michal Hocko and others, who reported and
discussed the related deadlock issues on the mailing list. Please check
each patch for details.

From a high-level point of view, this patch series:
1) eliminates the deadlock fundamentally caused by the inability
to run leak_balloon and fill_balloon concurrently;
2) enables the OOM notifier to release more than 256 inflated pages; and
3) stops inflating when the guest is under severe memory pressure
(i.e. OOM).

Here is an example of the benefit brought by this patch series:
The guest sets virtio_balloon.oom_pages=100000. When the host requests
to inflate 7.9G of an 8G idle guest, the guest can still run normally,
since the OOM notifier guarantees at least 100000 pages (400MB) for the
guest. Without these patches, the guest kills all the killable
processes and finally falls into a kernel panic.
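
(Worked out, assuming the usual 4KB page size: 100000 pages * 4096
bytes = ~400MB is the floor the OOM notifier preserves, whereas the
7.9G request would otherwise leave the 8G guest only ~100MB.)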

Wei Wang (3):
  virtio-balloon: replace the coarse-grained balloon_lock
  virtio-balloon: deflate up to oom_pages on OOM
  virtio-balloon: stop inflating when OOM occurs

 drivers/virtio/virtio_balloon.c | 149 ++++++++++++++++++++++++----------------
 1 file changed, 91 insertions(+), 58 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v1 1/3] virtio-balloon: replace the coarse-grained balloon_lock
  2017-10-20 11:54 ` Wei Wang
@ 2017-10-20 11:54   ` Wei Wang
  -1 siblings, 0 replies; 47+ messages in thread
From: Wei Wang @ 2017-10-20 11:54 UTC (permalink / raw)
  To: mst, penguin-kernel, mhocko, linux-kernel, linux-mm, virtualization
  Cc: Wei Wang

The balloon_lock was used to synchronize access to the elements of
struct virtio_balloon and to its queue operations (see commit
e22504296d). It prevents leak_balloon and fill_balloon from running
concurrently, which results in a deadlock on OOM:

fill_balloon: take balloon_lock and wait for OOM to get some memory;
oom_notify: release some inflated memory via leak_balloon();
leak_balloon: wait for balloon_lock to be released by fill_balloon.
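
Condensed into a two-thread view (function names are from this driver;
the reclaim path between the failed allocation and the notifier is
simplified):

  Thread A (fill_balloon)              Thread B (OOM path)
  -----------------------              -------------------
  mutex_lock(&vb->balloon_lock);
  balloon_page_enqueue() fails,
  msleep() and retry, waiting          out_of_memory()
  for the OOM code to free pages        -> virtballoon_oom_notify()
                                          -> leak_balloon()
                                             mutex_lock(&vb->balloon_lock);
                                             /* blocks: held by thread A */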

This patch breaks the lock into two fine-grained locks, inflate_lock
and deflate_lock, and eliminates the unnecessary use of the shared data
(i.e. vb->pfns and vb->num_pfns). This enables leak_balloon and
fill_balloon to run concurrently and resolves the deadlock.

Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Michal Hocko <mhocko@kernel.org>
---
 drivers/virtio/virtio_balloon.c | 102 +++++++++++++++++++++-------------------
 1 file changed, 53 insertions(+), 49 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index f0b3a0b..1ecd15a 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -67,7 +67,7 @@ struct virtio_balloon {
 	wait_queue_head_t acked;
 
 	/* Number of balloon pages we've told the Host we're not using. */
-	unsigned int num_pages;
+	atomic64_t num_pages;
 	/*
 	 * The pages we've told the Host we're not using are enqueued
 	 * at vb_dev_info->pages list.
@@ -76,12 +76,9 @@ struct virtio_balloon {
 	 */
 	struct balloon_dev_info vb_dev_info;
 
-	/* Synchronize access/update to this struct virtio_balloon elements */
-	struct mutex balloon_lock;
-
-	/* The array of pfns we tell the Host about. */
-	unsigned int num_pfns;
-	__virtio32 pfns[VIRTIO_BALLOON_ARRAY_PFNS_MAX];
+	/* Synchronize access to inflate_vq and deflate_vq respectively */
+	struct mutex inflate_lock;
+	struct mutex deflate_lock;
 
 	/* Memory statistics */
 	struct virtio_balloon_stat stats[VIRTIO_BALLOON_S_NR];
@@ -111,12 +108,13 @@ static void balloon_ack(struct virtqueue *vq)
 	wake_up(&vb->acked);
 }
 
-static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq)
+static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq,
+		      __virtio32 pfns[], unsigned int num_pfns)
 {
 	struct scatterlist sg;
 	unsigned int len;
 
-	sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns);
+	sg_init_one(&sg, pfns, sizeof(pfns[0]) * num_pfns);
 
 	/* We should always be able to add one buffer to an empty queue. */
 	virtqueue_add_outbuf(vq, &sg, 1, vb, GFP_KERNEL);
@@ -144,14 +142,14 @@ static void set_page_pfns(struct virtio_balloon *vb,
 static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
 {
 	struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
-	unsigned num_allocated_pages;
+	unsigned int num_pfns;
+	__virtio32 pfns[VIRTIO_BALLOON_ARRAY_PFNS_MAX];
 
 	/* We can only do one array worth at a time. */
-	num = min(num, ARRAY_SIZE(vb->pfns));
+	num = min_t(size_t, num, VIRTIO_BALLOON_ARRAY_PFNS_MAX);
 
-	mutex_lock(&vb->balloon_lock);
-	for (vb->num_pfns = 0; vb->num_pfns < num;
-	     vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
+	for (num_pfns = 0; num_pfns < num;
+	     num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
 		struct page *page = balloon_page_enqueue(vb_dev_info);
 
 		if (!page) {
@@ -162,20 +160,20 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
 			msleep(200);
 			break;
 		}
-		set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
-		vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
+		set_page_pfns(vb, pfns + num_pfns, page);
 		if (!virtio_has_feature(vb->vdev,
 					VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
 			adjust_managed_page_count(page, -1);
 	}
 
-	num_allocated_pages = vb->num_pfns;
+	mutex_lock(&vb->inflate_lock);
 	/* Did we get any? */
-	if (vb->num_pfns != 0)
-		tell_host(vb, vb->inflate_vq);
-	mutex_unlock(&vb->balloon_lock);
+	if (num_pfns != 0)
+		tell_host(vb, vb->inflate_vq, pfns, num_pfns);
+	mutex_unlock(&vb->inflate_lock);
+	atomic64_add(num_pfns, &vb->num_pages);
 
-	return num_allocated_pages;
+	return num_pfns;
 }
 
 static void release_pages_balloon(struct virtio_balloon *vb,
@@ -194,38 +192,39 @@ static void release_pages_balloon(struct virtio_balloon *vb,
 
 static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)
 {
-	unsigned num_freed_pages;
 	struct page *page;
 	struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
 	LIST_HEAD(pages);
+	unsigned int num_pfns;
+	__virtio32 pfns[VIRTIO_BALLOON_ARRAY_PFNS_MAX];
 
 	/* We can only do one array worth at a time. */
-	num = min(num, ARRAY_SIZE(vb->pfns));
+	num = min_t(size_t, num, VIRTIO_BALLOON_ARRAY_PFNS_MAX);
 
-	mutex_lock(&vb->balloon_lock);
 	/* We can't release more pages than taken */
-	num = min(num, (size_t)vb->num_pages);
-	for (vb->num_pfns = 0; vb->num_pfns < num;
-	     vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
+	num = min_t(size_t, num, atomic64_read(&vb->num_pages));
+	for (num_pfns = 0; num_pfns < num;
+	     num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
 		page = balloon_page_dequeue(vb_dev_info);
 		if (!page)
 			break;
-		set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
+		set_page_pfns(vb, pfns + num_pfns, page);
 		list_add(&page->lru, &pages);
-		vb->num_pages -= VIRTIO_BALLOON_PAGES_PER_PAGE;
 	}
 
-	num_freed_pages = vb->num_pfns;
 	/*
 	 * Note that if
 	 * virtio_has_feature(vdev, VIRTIO_BALLOON_F_MUST_TELL_HOST);
 	 * is true, we *have* to do it in this order
 	 */
-	if (vb->num_pfns != 0)
-		tell_host(vb, vb->deflate_vq);
+	mutex_lock(&vb->deflate_lock);
+	if (num_pfns != 0)
+		tell_host(vb, vb->deflate_vq, pfns, num_pfns);
+	mutex_unlock(&vb->deflate_lock);
 	release_pages_balloon(vb, &pages);
-	mutex_unlock(&vb->balloon_lock);
-	return num_freed_pages;
+	atomic64_sub(num_pfns, &vb->num_pages);
+
+	return num_pfns;
 }
 
 static inline void update_stat(struct virtio_balloon *vb, int idx,
@@ -327,12 +326,12 @@ static inline s64 towards_target(struct virtio_balloon *vb)
 		num_pages = le32_to_cpu((__force __le32)num_pages);
 
 	target = num_pages;
-	return target - vb->num_pages;
+	return target - atomic64_read(&vb->num_pages);
 }
 
 static void update_balloon_size(struct virtio_balloon *vb)
 {
-	u32 actual = vb->num_pages;
+	u32 actual = atomic64_read(&vb->num_pages);
 
 	/* Legacy balloon config space is LE, unlike all other devices. */
 	if (!virtio_has_feature(vb->vdev, VIRTIO_F_VERSION_1))
@@ -465,6 +464,7 @@ static int virtballoon_migratepage(struct balloon_dev_info *vb_dev_info,
 	struct virtio_balloon *vb = container_of(vb_dev_info,
 			struct virtio_balloon, vb_dev_info);
 	unsigned long flags;
+	__virtio32 pfns[VIRTIO_BALLOON_PAGES_PER_PAGE];
 
 	/*
 	 * In order to avoid lock contention while migrating pages concurrently
@@ -474,8 +474,12 @@ static int virtballoon_migratepage(struct balloon_dev_info *vb_dev_info,
 	 * recursion in the case it ends up triggering memory compaction
 	 * while it is attempting to inflate the ballon.
 	 */
-	if (!mutex_trylock(&vb->balloon_lock))
+	if (!mutex_trylock(&vb->inflate_lock))
+		return -EAGAIN;
+	if (!mutex_trylock(&vb->deflate_lock)) {
+		mutex_unlock(&vb->inflate_lock);
 		return -EAGAIN;
+	}
 
 	get_page(newpage); /* balloon reference */
 
@@ -485,17 +489,16 @@ static int virtballoon_migratepage(struct balloon_dev_info *vb_dev_info,
 	vb_dev_info->isolated_pages--;
 	__count_vm_event(BALLOON_MIGRATE);
 	spin_unlock_irqrestore(&vb_dev_info->pages_lock, flags);
-	vb->num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;
-	set_page_pfns(vb, vb->pfns, newpage);
-	tell_host(vb, vb->inflate_vq);
+
+	set_page_pfns(vb, pfns, newpage);
+	tell_host(vb, vb->inflate_vq, pfns, VIRTIO_BALLOON_PAGES_PER_PAGE);
+	mutex_unlock(&vb->inflate_lock);
 
 	/* balloon's page migration 2nd step -- deflate "page" */
 	balloon_page_delete(page);
-	vb->num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;
-	set_page_pfns(vb, vb->pfns, page);
-	tell_host(vb, vb->deflate_vq);
-
-	mutex_unlock(&vb->balloon_lock);
+	set_page_pfns(vb, pfns, page);
+	tell_host(vb, vb->deflate_vq, pfns, VIRTIO_BALLOON_PAGES_PER_PAGE);
+	mutex_unlock(&vb->deflate_lock);
 
 	put_page(page); /* balloon reference */
 
@@ -542,8 +545,9 @@ static int virtballoon_probe(struct virtio_device *vdev)
 	INIT_WORK(&vb->update_balloon_size_work, update_balloon_size_func);
 	spin_lock_init(&vb->stop_update_lock);
 	vb->stop_update = false;
-	vb->num_pages = 0;
-	mutex_init(&vb->balloon_lock);
+	atomic64_set(&vb->num_pages, 0);
+	mutex_init(&vb->inflate_lock);
+	mutex_init(&vb->deflate_lock);
 	init_waitqueue_head(&vb->acked);
 	vb->vdev = vdev;
 
@@ -596,8 +600,8 @@ static int virtballoon_probe(struct virtio_device *vdev)
 static void remove_common(struct virtio_balloon *vb)
 {
 	/* There might be pages left in the balloon: free them. */
-	while (vb->num_pages)
-		leak_balloon(vb, vb->num_pages);
+	while (atomic64_read(&vb->num_pages))
+		leak_balloon(vb, atomic64_read(&vb->num_pages));
 	update_balloon_size(vb);
 
 	/* Now we reset the device so we can clean up the queues. */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v1 2/3] virtio-balloon: deflate up to oom_pages on OOM
  2017-10-20 11:54 ` Wei Wang
@ 2017-10-20 11:54   ` Wei Wang
  -1 siblings, 0 replies; 47+ messages in thread
From: Wei Wang @ 2017-10-20 11:54 UTC (permalink / raw)
  To: mst, penguin-kernel, mhocko, linux-kernel, linux-mm, virtualization
  Cc: Wei Wang

The current implementation only deflates 256 pages, even when the user
specifies more than that via the oom_pages module param. This patch
enables deflating up to oom_pages pages, provided there are enough
inflated pages.
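
A loop over leak_balloon() is needed because each call is capped at one
pfn array's worth of pages (VIRTIO_BALLOON_ARRAY_PFNS_MAX, 256 entries):
with 4KB pages that is 256 * 4096 bytes = 1MB per call, so e.g.
oom_pages=100000 needs about 391 iterations.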

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 drivers/virtio/virtio_balloon.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 1ecd15a..ab55cf8 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -43,8 +43,8 @@
 #define OOM_VBALLOON_DEFAULT_PAGES 256
 #define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
 
-static int oom_pages = OOM_VBALLOON_DEFAULT_PAGES;
-module_param(oom_pages, int, S_IRUSR | S_IWUSR);
+static unsigned int oom_pages = OOM_VBALLOON_DEFAULT_PAGES;
+module_param(oom_pages, uint, 0600);
 MODULE_PARM_DESC(oom_pages, "pages to free on OOM");
 
 #ifdef CONFIG_BALLOON_COMPACTION
@@ -359,16 +359,20 @@ static int virtballoon_oom_notify(struct notifier_block *self,
 {
 	struct virtio_balloon *vb;
 	unsigned long *freed;
-	unsigned num_freed_pages;
+	unsigned int npages = oom_pages;
 
 	vb = container_of(self, struct virtio_balloon, nb);
 	if (!virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
 		return NOTIFY_OK;
 
 	freed = parm;
-	num_freed_pages = leak_balloon(vb, oom_pages);
+
+	/* Don't deflate more than the number of inflated pages */
+	while (npages && atomic64_read(&vb->num_pages))
+		npages -= leak_balloon(vb, npages);
+
 	update_balloon_size(vb);
-	*freed += num_freed_pages;
+	*freed += oom_pages - npages;
 
 	return NOTIFY_OK;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v1 3/3] virtio-balloon: stop inflating when OOM occurs
  2017-10-20 11:54 ` Wei Wang
@ 2017-10-20 11:54   ` Wei Wang
  -1 siblings, 0 replies; 47+ messages in thread
From: Wei Wang @ 2017-10-20 11:54 UTC (permalink / raw)
  To: mst, penguin-kernel, mhocko, linux-kernel, linux-mm, virtualization
  Cc: Wei Wang

This patch forces the inflating work to stop when OOM occurs.
The fundamental idea of memory ballooning is to take out some guest
pages when the guest has low memory utilization, so it is sensible to
inflate nothing when the guest is already under memory pressure.

On the other hand, the inflation policy is determined by the admin or
the orchestration layer on the host side. That is, the host is expected
to re-send the memory inflating request at a proper later time, when the
guest has enough memory to inflate, for example after checking the
memory stats reported by the balloon. If another inflate request reaches
the guest while it is still under memory pressure, again no pages will
be inflated.
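
Why returning the original @num stops the work: fill_balloon's caller,
update_balloon_size_func (not part of this diff), re-queues itself only
while it still sees progress to make. A sketch of that caller, condensed
from the driver of this era:

	/* update_balloon_size_func(), condensed sketch */
	s64 diff = towards_target(vb);

	if (diff > 0)
		diff -= fill_balloon(vb, diff);  /* returns orig_num in oom mode */
	else if (diff < 0)
		diff += leak_balloon(vb, -diff);
	update_balloon_size(vb);

	if (diff)
		queue_work(system_freezable_wq, work);  /* oom mode: diff == 0 */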

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Michal Hocko <mhocko@kernel.org>
---
 drivers/virtio/virtio_balloon.c | 33 +++++++++++++++++++++++++++++----
 1 file changed, 29 insertions(+), 4 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index ab55cf8..cf29663 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -63,6 +63,15 @@ struct virtio_balloon {
 	spinlock_t stop_update_lock;
 	bool stop_update;
 
+	/*
+	 * The balloon driver enters the oom mode if the oom notifier is
+	 * invoked. Entering the oom mode will force the exit of current
+	 * inflating work. When a later inflating request is received from
+	 * the host, the success of memory allocation via balloon_page_enqueue
+	 * will turn off the mode.
+	 */
+	bool oom_mode;
+
 	/* Waiting for host to ack the pages we released. */
 	wait_queue_head_t acked;
 
@@ -142,22 +151,22 @@ static void set_page_pfns(struct virtio_balloon *vb,
 static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
 {
 	struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
+	struct page *page;
+	size_t orig_num;
 	unsigned int num_pfns;
 	__virtio32 pfns[VIRTIO_BALLOON_ARRAY_PFNS_MAX];
 
+	orig_num = num;
 	/* We can only do one array worth at a time. */
 	num = min_t(size_t, num, VIRTIO_BALLOON_ARRAY_PFNS_MAX);
 
 	for (num_pfns = 0; num_pfns < num;
 	     num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
-		struct page *page = balloon_page_enqueue(vb_dev_info);
-
+		page = balloon_page_enqueue(vb_dev_info);
 		if (!page) {
 			dev_info_ratelimited(&vb->vdev->dev,
 					     "Out of puff! Can't get %u pages\n",
 					     VIRTIO_BALLOON_PAGES_PER_PAGE);
-			/* Sleep for at least 1/5 of a second before retry. */
-			msleep(200);
 			break;
 		}
 		set_page_pfns(vb, pfns + num_pfns, page);
@@ -166,6 +175,13 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
 			adjust_managed_page_count(page, -1);
 	}
 
+	/*
+	 * The oom_mode is set, but we've already been able to get some
+	 * pages, so it is time to turn it off here.
+	 */
+	if (unlikely(READ_ONCE(vb->oom_mode) && page))
+		WRITE_ONCE(vb->oom_mode, false);
+
 	mutex_lock(&vb->inflate_lock);
 	/* Did we get any? */
 	if (num_pfns != 0)
@@ -173,6 +189,13 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
 	mutex_unlock(&vb->inflate_lock);
 	atomic64_add(num_pfns, &vb->num_pages);
 
+	/*
+	 * If oom_mode is on, return the original @num passed by
+	 * update_balloon_size_func to stop the inflating.
+	 */
+	if (READ_ONCE(vb->oom_mode))
+		return orig_num;
+
 	return num_pfns;
 }
 
@@ -365,6 +388,7 @@ static int virtballoon_oom_notify(struct notifier_block *self,
 	if (!virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
 		return NOTIFY_OK;
 
+	WRITE_ONCE(vb->oom_mode, true);
 	freed = parm;
 
 	/* Don't deflate more than the number of inflated pages */
@@ -549,6 +573,7 @@ static int virtballoon_probe(struct virtio_device *vdev)
 	INIT_WORK(&vb->update_balloon_size_work, update_balloon_size_func);
 	spin_lock_init(&vb->stop_update_lock);
 	vb->stop_update = false;
+	vb->oom_mode = false;
 	atomic64_set(&vb->num_pages, 0);
 	mutex_init(&vb->inflate_lock);
 	mutex_init(&vb->deflate_lock);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v1 0/3] Virtio-balloon Improvement
  2017-10-20 11:54 ` Wei Wang
@ 2017-10-22  3:19   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 47+ messages in thread
From: Michael S. Tsirkin @ 2017-10-22  3:19 UTC (permalink / raw)
  To: Wei Wang; +Cc: penguin-kernel, mhocko, linux-kernel, linux-mm, virtualization

On Fri, Oct 20, 2017 at 07:54:23PM +0800, Wei Wang wrote:
> This patch series builds on the recent contributions from Michael S.
> Tsirkin, Tetsuo Handa, Michal Hocko and others, who reported and
> discussed the related deadlock issues on the mailing list. Please check
> each patch for details.
> 
> From a high-level point of view, this patch series:
> 1) eliminates the deadlock fundamentally caused by the inability
> to run leak_balloon and fill_balloon concurrently;

We need to think about this carefully. Is it an issue that
leak can now bypass fill? It seems that we can now
try to leak a page before the fill has been seen by the host,
but I did not look into it deeply.

I really like my patch for this better, at least for the
current kernel. I agree we need to work more on 2+3.

> 2) enables the OOM notifier to release more than 256 inflated pages; and

Does just this help enough? How about my patch + 2?
Tetsuo, what do you think?

> 3) stops inflating when the guest is under severe memory pressure
> (i.e. OOM).

But when do we finally inflate? The question is how the host knows it
needs to resend an interrupt, and when it should do it.


> Here is an example of the benefit brought by this patch series:
> The guest sets virtio_balloon.oom_pages=100000. When the host requests
> to inflate 7.9G of an 8G idle guest, the guest can still run normally,
> since the OOM notifier guarantees at least 100000 pages (400MB) for the
> guest. Without these patches, the guest kills all the killable
> processes and finally falls into a kernel panic.
> 
> Wei Wang (3):
>   virtio-balloon: replace the coarse-grained balloon_lock
>   virtio-balloon: deflate up to oom_pages on OOM
>   virtio-balloon: stop inflating when OOM occurs
> 
>  drivers/virtio/virtio_balloon.c | 149 ++++++++++++++++++++++++----------------
>  1 file changed, 91 insertions(+), 58 deletions(-)
> 
> -- 
> 2.7.4

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v1 2/3] virtio-balloon: deflate up to oom_pages on OOM
  2017-10-20 11:54   ` Wei Wang
@ 2017-10-22  3:21     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 47+ messages in thread
From: Michael S. Tsirkin @ 2017-10-22  3:21 UTC (permalink / raw)
  To: Wei Wang; +Cc: penguin-kernel, mhocko, linux-kernel, linux-mm, virtualization

On Fri, Oct 20, 2017 at 07:54:25PM +0800, Wei Wang wrote:
> The current implementation only deflates 256 pages even when a user
> specifies more than that via the oom_pages module param. This patch
> enables the deflating of up to oom_pages pages if there are enough
> inflated pages.
> 
> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>

This seems reasonable. Does this by itself help?


> ---
>  drivers/virtio/virtio_balloon.c | 14 +++++++++-----
>  1 file changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 1ecd15a..ab55cf8 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -43,8 +43,8 @@
>  #define OOM_VBALLOON_DEFAULT_PAGES 256
>  #define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
>  
> -static int oom_pages = OOM_VBALLOON_DEFAULT_PAGES;
> -module_param(oom_pages, int, S_IRUSR | S_IWUSR);
> +static unsigned int oom_pages = OOM_VBALLOON_DEFAULT_PAGES;
> +module_param(oom_pages, uint, 0600);
>  MODULE_PARM_DESC(oom_pages, "pages to free on OOM");
>  
>  #ifdef CONFIG_BALLOON_COMPACTION
> @@ -359,16 +359,20 @@ static int virtballoon_oom_notify(struct notifier_block *self,
>  {
>  	struct virtio_balloon *vb;
>  	unsigned long *freed;
> -	unsigned num_freed_pages;
> +	unsigned int npages = oom_pages;
>  
>  	vb = container_of(self, struct virtio_balloon, nb);
>  	if (!virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
>  		return NOTIFY_OK;
>  
>  	freed = parm;
> -	num_freed_pages = leak_balloon(vb, oom_pages);
> +
> +	/* Don't deflate more than the number of inflated pages */
> +	while (npages && atomic64_read(&vb->num_pages))
> +		npages -= leak_balloon(vb, npages);
> +
>  	update_balloon_size(vb);
> -	*freed += num_freed_pages;
> +	*freed += oom_pages - npages;
>  
>  	return NOTIFY_OK;
>  }
> -- 
> 2.7.4

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v1 2/3] virtio-balloon: deflate up to oom_pages on OOM
  2017-10-22  3:21     ` Michael S. Tsirkin
@ 2017-10-22  4:11       ` Tetsuo Handa
  -1 siblings, 0 replies; 47+ messages in thread
From: Tetsuo Handa @ 2017-10-22  4:11 UTC (permalink / raw)
  To: mst, wei.w.wang; +Cc: mhocko, linux-kernel, linux-mm, virtualization

Michael S. Tsirkin wrote:
> On Fri, Oct 20, 2017 at 07:54:25PM +0800, Wei Wang wrote:
> > The current implementation only deflates 256 pages even when a user
> > specifies more than that via the oom_pages module param. This patch
> > enables the deflating of up to oom_pages pages if there are enough
> > inflated pages.
> 
> This seems reasonable. Does this by itself help?

At least

> > -	num_freed_pages = leak_balloon(vb, oom_pages);
> > +
> > +	/* Don't deflate more than the number of inflated pages */
> > +	while (npages && atomic64_read(&vb->num_pages))
> > +		npages -= leak_balloon(vb, npages);

don't we need to abort if leak_balloon() returned 0 for some reason?
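
Something like this, perhaps (untested, just to illustrate the bail-out):

	/* Don't deflate more than the number of inflated pages */
	while (npages && atomic64_read(&vb->num_pages)) {
		unsigned int freed = leak_balloon(vb, npages);

		if (!freed)
			break;	/* nothing could be leaked, don't spin */
		npages -= freed;
	}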

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v1 1/3] virtio-balloon: replace the coarse-grained balloon_lock
  2017-10-20 11:54   ` Wei Wang
@ 2017-10-22  5:20     ` Tetsuo Handa
  -1 siblings, 0 replies; 47+ messages in thread
From: Tetsuo Handa @ 2017-10-22  5:20 UTC (permalink / raw)
  To: wei.w.wang, mst; +Cc: mhocko, linux-kernel, linux-mm, virtualization

Wei Wang wrote:
> The balloon_lock was used to synchronize the access demand to elements
> of struct virtio_balloon and its queue operations (please see commit
> e22504296d). This prevents the concurrent run of the leak_balloon and
> fill_balloon functions, thereby resulting in a deadlock issue on OOM:
> 
> fill_balloon: take balloon_lock and wait for OOM to get some memory;
> oom_notify: release some inflated memory via leak_balloon();
> leak_balloon: wait for balloon_lock to be released by fill_balloon.
> 
> This patch breaks the lock into two fine-grained inflate_lock and
> deflate_lock, and eliminates the unnecessary use of the shared data
> (i.e. vb->pfns, vb->num_pfns). This enables leak_balloon and
> fill_balloon to run concurrently and solves the deadlock issue.
> 

> @@ -162,20 +160,20 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
>  			msleep(200);
>  			break;
>  		}
> -		set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
> -		vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
> +		set_page_pfns(vb, pfns + num_pfns, page);
>  		if (!virtio_has_feature(vb->vdev,
>  					VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
>  			adjust_managed_page_count(page, -1);
>  	}
>  
> -	num_allocated_pages = vb->num_pfns;
> +	mutex_lock(&vb->inflate_lock);
>  	/* Did we get any? */
> -	if (vb->num_pfns != 0)
> -		tell_host(vb, vb->inflate_vq);
> -	mutex_unlock(&vb->balloon_lock);
> +	if (num_pfns != 0)
> +		tell_host(vb, vb->inflate_vq, pfns, num_pfns);
> +	mutex_unlock(&vb->inflate_lock);
> +	atomic64_add(num_pfns, &vb->num_pages);

Isn't this addition too late? If leak_balloon() is called due to
out_of_memory(), it will fail to find an up-to-date vb->num_pages value.
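
For example, a hypothetical interleaving:

	fill_balloon()                          leak_balloon() via oom_notify
	  allocates pages
	  tell_host(vb, vb->inflate_vq, ...)
	                                          atomic64_read(&vb->num_pages)
	                                          is still 0, nothing is leaked
	  atomic64_add(num_pfns, &vb->num_pages)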

>  
> -	return num_allocated_pages;
> +	return num_pfns;
>  }
>  
>  static void release_pages_balloon(struct virtio_balloon *vb,
> @@ -194,38 +192,39 @@ static void release_pages_balloon(struct virtio_balloon *vb,
>  
>  static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)
>  {
> -	unsigned num_freed_pages;
>  	struct page *page;
>  	struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
>  	LIST_HEAD(pages);
> +	unsigned int num_pfns;
> +	__virtio32 pfns[VIRTIO_BALLOON_ARRAY_PFNS_MAX];

This array consumes 1024 bytes of kernel stack, doesn't it?
leak_balloon() might be called from out_of_memory(), where the kernel
stack is already largely consumed before entering
__alloc_pages_nodemask(). To reduce the possibility of stack overflow,
and since out_of_memory() is serialized by oom_lock, I suggest using a
static (maybe kmalloc()ed, as
vb->oom_pfns[VIRTIO_BALLOON_ARRAY_PFNS_MAX]) buffer when called from
out_of_memory(), as in the sketch below.
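
For illustration only (vb->oom_pfns is a made-up name, not existing code):

	/* in virtballoon_probe(): */
	vb->oom_pfns = kmalloc(VIRTIO_BALLOON_ARRAY_PFNS_MAX *
			       sizeof(__virtio32), GFP_KERNEL);
	if (!vb->oom_pfns)
		return -ENOMEM;

	/* in leak_balloon(), when called from the OOM notifier; this is
	 * safe because out_of_memory() is serialized by oom_lock: */
	__virtio32 *pfns = vb->oom_pfns;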

>  
>  	/* We can only do one array worth at a time. */
> -	num = min(num, ARRAY_SIZE(vb->pfns));
> +	num = min_t(size_t, num, VIRTIO_BALLOON_ARRAY_PFNS_MAX);
>  
> -	mutex_lock(&vb->balloon_lock);
>  	/* We can't release more pages than taken */
> -	num = min(num, (size_t)vb->num_pages);
> -	for (vb->num_pfns = 0; vb->num_pfns < num;
> -	     vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
> +	num = min_t(size_t, num, atomic64_read(&vb->num_pages));
> +	for (num_pfns = 0; num_pfns < num;
> +	     num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
>  		page = balloon_page_dequeue(vb_dev_info);

If balloon_page_dequeue() can be concurrently called by both the host's
request and the guest's OOM event, is the (!dequeued_page) test in
balloon_page_dequeue() safe? Is such concurrency needed?

>  		if (!page)
>  			break;
> -		set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
> +		set_page_pfns(vb, pfns + num_pfns, page);
>  		list_add(&page->lru, &pages);
> -		vb->num_pages -= VIRTIO_BALLOON_PAGES_PER_PAGE;
>  	}
>  
> -	num_freed_pages = vb->num_pfns;
>  	/*
>  	 * Note that if
>  	 * virtio_has_feature(vdev, VIRTIO_BALLOON_F_MUST_TELL_HOST);
>  	 * is true, we *have* to do it in this order
>  	 */
> -	if (vb->num_pfns != 0)
> -		tell_host(vb, vb->deflate_vq);
> +	mutex_lock(&vb->deflate_lock);
> +	if (num_pfns != 0)
> +		tell_host(vb, vb->deflate_vq, pfns, num_pfns);
> +	mutex_unlock(&vb->deflate_lock);
>  	release_pages_balloon(vb, &pages);
> -	mutex_unlock(&vb->balloon_lock);
> -	return num_freed_pages;
> +	atomic64_sub(num_pfns, &vb->num_pages);

Isn't this subtraction too late?

> +
> +	return num_pfns;
>  }
>  
>  static inline void update_stat(struct virtio_balloon *vb, int idx,

> @@ -465,6 +464,7 @@ static int virtballoon_migratepage(struct balloon_dev_info *vb_dev_info,
>  	struct virtio_balloon *vb = container_of(vb_dev_info,
>  			struct virtio_balloon, vb_dev_info);
>  	unsigned long flags;
> +	__virtio32 pfns[VIRTIO_BALLOON_PAGES_PER_PAGE];

If this is called from the memory allocation path, maybe a kmalloc()ed buffer is safer.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v1 0/3] Virtio-balloon Improvement
  2017-10-22  3:19   ` Michael S. Tsirkin
@ 2017-10-22 11:19     ` Wei Wang
  -1 siblings, 0 replies; 47+ messages in thread
From: Wei Wang @ 2017-10-22 11:19 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: penguin-kernel, mhocko, linux-kernel, linux-mm, virtualization

On 10/22/2017 11:19 AM, Michael S. Tsirkin wrote:
> On Fri, Oct 20, 2017 at 07:54:23PM +0800, Wei Wang wrote:
>> This patch series intends to summarize the recent contributions made by
>> Michael S. Tsirkin, Tetsuo Handa, Michal Hocko etc. via reporting and
>> discussing the related deadlock issues on the mailing list. Please check
>> each patch for details.
>>
>> From a high-level point of view, this patch series achieves:
>> 1) eliminate the deadlock issue fundamentally caused by the inability
>> to run leak_balloon and fill_balloon concurrently;
> We need to think about this carefully. Is it an issue that
> leak can now bypass fill? It seems that we can now
> try to leak a page before fill was seen by host,
> but I did not look into it deeply.
>
> I really like my patch for this better at least for
> current kernel. I agree we need to work more on 2+3.

Yes, we can check more. But consider the original intention (copied
from commit e22504296d):

balloon_lock (mutex): synchronizes the access demand to elements of
                      struct virtio_balloon and its queue operations;

This implementation covers what balloon_lock achieves: we have decoupled
inflating from deflating, and use a small lock for each vq respectively.

I also tested inflating 20G and, before that finished, requested to
deflate 20G; everything worked fine.


>
>> 2) enable OOM to release more than 256 inflated pages; and
> Does just this help enough? How about my patch + 2?
> Tetsuo, what do you think?
>
>> 3) stop inflating when the guest is under severe memory pressure
>> (i.e. OOM).
> But when do we finally inflate?  Question is how does host know it needs
> to resend an interrupt, and when should it do it?

I think "when to inflate again" should be a policy defined by the 
orchestration
layer software on the host. A reasonable inflating request should be 
sent to a
guest on the condition that this guest has enough free memory to inflate
(virtio-balloon memory stats has already supported to report that info).

If the policy defines to inflate guest memory without considering 
whether the guest
is even under memory pressure. The mechanism we provide here is to offer 
no pages
to the host in that case. I think this should be reasonable.
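
Just to illustrate such a policy (host-side pseudocode; all of the names
below are made up, this is not an existing API):

	/* inflate only if the guest reports enough free memory */
	if (guest_stats.free_pages > nr_requested + nr_guard_pages)
		send_inflate_request(guest, nr_requested);
	/* otherwise, wait for the next stats update and retry */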


Best,
Wei

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v1 1/3] virtio-balloon: replace the coarse-grained balloon_lock
  2017-10-22  5:20     ` Tetsuo Handa
@ 2017-10-22 11:24       ` Wei Wang
  -1 siblings, 0 replies; 47+ messages in thread
From: Wei Wang @ 2017-10-22 11:24 UTC (permalink / raw)
  To: Tetsuo Handa, mst; +Cc: mhocko, linux-kernel, linux-mm, virtualization

On 10/22/2017 01:20 PM, Tetsuo Handa wrote:
> Wei Wang wrote:
>> The balloon_lock was used to synchronize the access demand to elements
>> of struct virtio_balloon and its queue operations (please see commit
>> e22504296d). This prevents the concurrent run of the leak_balloon and
>> fill_balloon functions, thereby resulting in a deadlock issue on OOM:
>>
>> fill_balloon: take balloon_lock and wait for OOM to get some memory;
>> oom_notify: release some inflated memory via leak_balloon();
>> leak_balloon: wait for balloon_lock to be released by fill_balloon.
>>
>> This patch breaks the lock into two fine-grained inflate_lock and
>> deflate_lock, and eliminates the unnecessary use of the shared data
>> (i.e. vb->pfns, vb->num_pfns). This enables leak_balloon and
>> fill_balloon to run concurrently and solves the deadlock issue.
>>
>> @@ -162,20 +160,20 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
>>   			msleep(200);
>>   			break;
>>   		}
>> -		set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
>> -		vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
>> +		set_page_pfns(vb, pfns + num_pfns, page);
>>   		if (!virtio_has_feature(vb->vdev,
>>   					VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
>>   			adjust_managed_page_count(page, -1);
>>   	}
>>   
>> -	num_allocated_pages = vb->num_pfns;
>> +	mutex_lock(&vb->inflate_lock);
>>   	/* Did we get any? */
>> -	if (vb->num_pfns != 0)
>> -		tell_host(vb, vb->inflate_vq);
>> -	mutex_unlock(&vb->balloon_lock);
>> +	if (num_pfns != 0)
>> +		tell_host(vb, vb->inflate_vq, pfns, num_pfns);
>> +	mutex_unlock(&vb->inflate_lock);
>> +	atomic64_add(num_pfns, &vb->num_pages);
> Isn't this addition too late? If leak_balloon() is called due to
> out_of_memory(), it will fail to find an up-to-date vb->num_pages value.

Not really. I think the old implementation above:
"vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE"
isn't quite accurate, because "vb->num_pages" should reflect the number
of pages that have already been inflated, which means those pages have
already been given to the host via tell_host().

If we update "vb->num_pages" before tell_host(), then it will include
pages that haven't been given to the host yet, which I think shouldn't
be counted as inflated pages.

On the other hand, OOM will use leak_balloon() to release pages that
have already been inflated.

In addition, I think we would also need to move balloon_page_insert(),
which puts the page onto the inflated page list, after tell_host(); see
the ordering sketch below.
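
Roughly, the intended per-batch ordering would become (a sketch only;
I'm glossing over the exact balloon_page_insert() call site):

	/* allocate the pages and fill the pfns[] array first */
	tell_host(vb, vb->inflate_vq, pfns, num_pfns);
	/* only now make the pages visible as inflated */
	balloon_page_insert(vb_dev_info, page);
	atomic64_add(num_pfns, &vb->num_pages);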



>>   
>> -	return num_allocated_pages;
>> +	return num_pfns;
>>   }
>>   
>>   static void release_pages_balloon(struct virtio_balloon *vb,
>> @@ -194,38 +192,39 @@ static void release_pages_balloon(struct virtio_balloon *vb,
>>   
>>   static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)
>>   {
>> -	unsigned num_freed_pages;
>>   	struct page *page;
>>   	struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
>>   	LIST_HEAD(pages);
>> +	unsigned int num_pfns;
>> +	__virtio32 pfns[VIRTIO_BALLOON_ARRAY_PFNS_MAX];
> This array consumes 1024 bytes of kernel stack, doesn't it?
> leak_balloon() might be called from out_of_memory() where kernel stack
> is already largely consumed before entering __alloc_pages_nodemask().
> For reducing possibility of stack overflow, since out_of_memory() is
> serialized by oom_lock, I suggest using static (maybe kmalloc()ed as
> vb->oom_pfns[VIRTIO_BALLOON_ARRAY_PFNS_MAX]) buffer when called from
> out_of_memory().

In that case, we might as well use

	vb->inflate_pfns = kmalloc(VIRTIO_BALLOON_ARRAY_PFNS_MAX *
				   sizeof(__virtio32), GFP_KERNEL);
	vb->deflate_pfns = kmalloc(VIRTIO_BALLOON_ARRAY_PFNS_MAX *
				   sizeof(__virtio32), GFP_KERNEL);

both allocated in probe().

>>   
>>   	/* We can only do one array worth at a time. */
>> -	num = min(num, ARRAY_SIZE(vb->pfns));
>> +	num = min_t(size_t, num, VIRTIO_BALLOON_ARRAY_PFNS_MAX);
>>   
>> -	mutex_lock(&vb->balloon_lock);
>>   	/* We can't release more pages than taken */
>> -	num = min(num, (size_t)vb->num_pages);
>> -	for (vb->num_pfns = 0; vb->num_pfns < num;
>> -	     vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
>> +	num = min_t(size_t, num, atomic64_read(&vb->num_pages));
>> +	for (num_pfns = 0; num_pfns < num;
>> +	     num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
>>   		page = balloon_page_dequeue(vb_dev_info);
> If balloon_page_dequeue() can be concurrently called by both host's request
> and guest's OOM event, is (!dequeued_page) test in balloon_page_dequeue() safe?


I'm not sure I follow the question. "dequeued_page" is a local variable
in the function; why would it be unsafe for two invocations (the shared
b_dev_info->pages list is operated on under a lock)?



> Is such concurrency needed?

Thanks for this question; it suggests another optimization, which I want
to introduce if this direction is accepted:

I think it is not really necessary to deflate pages in
OOM-->leak_balloon() when a host-requested leak_balloon() is already
running. In that case, OOM can just count the pages that are deflated by
the host request.

The implementation logic is simple; here is the major part:

1) Introduce a "vb->deflating" flag to tell whether deflating is in
progress.

2) At the beginning of leak_balloon():
     if (READ_ONCE(vb->deflating)) {
            npages = atomic64_read(&vb->num_pages);
            /* Wait till the other run of leak_balloon() returns */
            while (READ_ONCE(vb->deflating))
                    cpu_relax();
            npages -= atomic64_read(&vb->num_pages);
     } else {
            WRITE_ONCE(vb->deflating, true);
     }
     ...

3) At the end of leak_balloon():
     WRITE_ONCE(vb->deflating, false);

(The above vb->deflating doesn't have to be in vb, though; it could be a
static variable inside leak_balloon(). We can discuss the implementation
in more detail when we reach that step.)


Best,
Wei

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v1 2/3] virtio-balloon: deflate up to oom_pages on OOM
  2017-10-22  4:11       ` Tetsuo Handa
@ 2017-10-22 11:31         ` Wei Wang
  -1 siblings, 0 replies; 47+ messages in thread
From: Wei Wang @ 2017-10-22 11:31 UTC (permalink / raw)
  To: Tetsuo Handa, mst; +Cc: mhocko, linux-kernel, linux-mm, virtualization

On 10/22/2017 12:11 PM, Tetsuo Handa wrote:
> Michael S. Tsirkin wrote:
>>> -	num_freed_pages = leak_balloon(vb, oom_pages);
>>> +
>>> +	/* Don't deflate more than the number of inflated pages */
>>> +	while (npages && atomic64_read(&vb->num_pages))
>>> +		npages -= leak_balloon(vb, npages);
> don't we need to abort if leak_balloon() returned 0 for some reason?

I don't think so. Returning 0 should be a normal case when the host tries
to give back some pages to the guest but no pages have ever been
inflated. For example, if right after booting the guest the host sends a
deflate request to give the guest 1G of memory, leak_balloon() should
return 0, and the guest wouldn't get 1G more memory.


Best,
Wei

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v1 1/3] virtio-balloon: replace the coarse-grained balloon_lock
  2017-10-22 11:24       ` Wei Wang
@ 2017-10-22 11:50         ` Tetsuo Handa
  -1 siblings, 0 replies; 47+ messages in thread
From: Tetsuo Handa @ 2017-10-22 11:50 UTC (permalink / raw)
  To: wei.w.wang, mst; +Cc: mhocko, linux-kernel, linux-mm, virtualization

Wei Wang wrote:
> >> @@ -162,20 +160,20 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
> >>   			msleep(200);
> >>   			break;
> >>   		}
> >> -		set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
> >> -		vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
> >> +		set_page_pfns(vb, pfns + num_pfns, page);
> >>   		if (!virtio_has_feature(vb->vdev,
> >>   					VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
> >>   			adjust_managed_page_count(page, -1);
> >>   	}
> >>   
> >> -	num_allocated_pages = vb->num_pfns;
> >> +	mutex_lock(&vb->inflate_lock);
> >>   	/* Did we get any? */
> >> -	if (vb->num_pfns != 0)
> >> -		tell_host(vb, vb->inflate_vq);
> >> -	mutex_unlock(&vb->balloon_lock);
> >> +	if (num_pfns != 0)
> >> +		tell_host(vb, vb->inflate_vq, pfns, num_pfns);
> >> +	mutex_unlock(&vb->inflate_lock);
> >> +	atomic64_add(num_pfns, &vb->num_pages);
> > Isn't this addition too late? If leak_balloon() is called due to
> > out_of_memory(), it will fail to find an up-to-date vb->num_pages value.
> 
> Not really. I think the old way of implementing it above:
> "vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE"
> isn't quite accurate, because "vb->num_pages" should reflect the number of
> pages that have already been inflated, which means those pages have
> already been given to the host via "tell_host()".
> 
> If we update "vb->num_pages" earlier, before tell_host(), then it will
> include the pages that haven't been given to the host, which I think
> shouldn't be counted as inflated pages.
> 
> On the other hand, OOM will use leak_balloon() to release the pages that
> should have already been inflated.

But leak_balloon() takes the maximum number of pages to deflate from vb->num_pages, doesn't it?

> 
> >>   
> >>   	/* We can only do one array worth at a time. */
> >> -	num = min(num, ARRAY_SIZE(vb->pfns));
> >> +	num = min_t(size_t, num, VIRTIO_BALLOON_ARRAY_PFNS_MAX);
> >>   
> >> -	mutex_lock(&vb->balloon_lock);
> >>   	/* We can't release more pages than taken */
> >> -	num = min(num, (size_t)vb->num_pages);
> >> -	for (vb->num_pfns = 0; vb->num_pfns < num;
> >> -	     vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
> >> +	num = min_t(size_t, num, atomic64_read(&vb->num_pages));
> >> +	for (num_pfns = 0; num_pfns < num;
> >> +	     num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
> >>   		page = balloon_page_dequeue(vb_dev_info);
> > If balloon_page_dequeue() can be concurrently called by both host's request
> > and guest's OOM event, is the (!dequeued_page) test in balloon_page_dequeue() safe?
> 
> 
> I'm not sure I follow the question. "dequeued_page" is a local variable
> in that function, so why would it be unsafe for two concurrent invocations
> (the shared b_dev_info->pages list is operated on under a lock)?

I'm not an MM person nor a virtio person; I'm commenting from the point of
view of safe programming. My question is, isn't there a possibility of hitting

	if (unlikely(list_empty(&b_dev_info->pages) &&
		     !b_dev_info->isolated_pages))
		BUG();

when things run concurrently.

Wei Wang wrote:
> On 10/22/2017 12:11 PM, Tetsuo Handa wrote:
> > Michael S. Tsirkin wrote:
> >>> -	num_freed_pages = leak_balloon(vb, oom_pages);
> >>> +
> >>> +	/* Don't deflate more than the number of inflated pages */
> >>> +	while (npages && atomic64_read(&vb->num_pages))
> >>> +		npages -= leak_balloon(vb, npages);
> > don't we need to abort if leak_balloon() returned 0 for some reason?
> 
> I don't think so. Returning 0 should be a normal case when the host tries
> to give back some pages to the guest, but there are no pages that have ever
> been inflated. For example, right after the guest boots, the host sends a
> deflate request to give the guest 1G of memory; leak_balloon() should return 0,
> and the guest wouldn't get an extra 1G of memory.
> 
My question is, isn't there a possibility of leak_balloon() returning 0 for
reasons other than vb->num_pages == 0? If yes, this can cause an infinite
loop (i.e. a lockup) when things run concurrently.
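
One defensive way to rule that out, sketched here only for illustration, is
to bail out of the notifier loop as soon as leak_balloon() makes no progress:

	/* Hypothetical hardening of the loop in virtballoon_oom_notify(). */
	while (npages && atomic64_read(&vb->num_pages)) {
		unsigned int freed = leak_balloon(vb, npages);

		if (!freed)
			break;	/* no forward progress: don't spin forever */
		npages -= freed;
	}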

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v1 3/3] virtio-balloon: stop inflating when OOM occurs
  2017-10-20 11:54   ` Wei Wang
@ 2017-10-22 17:13     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 47+ messages in thread
From: Michael S. Tsirkin @ 2017-10-22 17:13 UTC (permalink / raw)
  To: Wei Wang; +Cc: penguin-kernel, mhocko, linux-kernel, linux-mm, virtualization

On Fri, Oct 20, 2017 at 07:54:26PM +0800, Wei Wang wrote:
> This patch forces the inflating work to cease when OOM occurs.
> The fundamental idea of memory ballooning is to take out some guest
> pages when the guest has low memory utilization, so it is sensible to
> inflate nothing when the guest is already under memory pressure.
> 
> On the other hand, the policy is determined by the admin or the
> orchestration layer from the host. That is, the host is expected to
> re-start the memory inflating request at a proper time later when
> the guest has enough memory to inflate, for example, by checking
> the memory stats reported by the balloon.

Is there any other way to do it? And if so, can't we just have the guest do
it automatically? Maybe the issue is really that fill attempts to
allocate memory aggressively instead of checking availability.
Maybe with deflate on oom it should check availability?
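
A minimal sketch of that last idea (assuming si_mem_available() is an
acceptable estimate of free memory; SOME_RESERVE is an illustrative
threshold, not an existing macro, and num is assumed to be in page units):

	/* Hypothetical check at the top of fill_balloon(): with deflate-on-oom
	 * negotiated, inflate nothing if the guest already looks short on memory.
	 */
	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM) &&
	    si_mem_available() < (long)num + SOME_RESERVE)
		return 0;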


> If another inflate
> request is sent to the guest while the guest is still under memory
> pressure, still no pages will be inflated.

Any such changes are hypervisor-visible and need a new feature bit.


> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Cc: Michal Hocko <mhocko@kernel.org>
> ---
>  drivers/virtio/virtio_balloon.c | 33 +++++++++++++++++++++++++++++----
>  1 file changed, 29 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index ab55cf8..cf29663 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -63,6 +63,15 @@ struct virtio_balloon {
>  	spinlock_t stop_update_lock;
>  	bool stop_update;
>  
> +	/*
> +	 * The balloon driver enters the oom mode if the oom notifier is
> +	 * invoked. Entering the oom mode will force the exit of current
> +	 * inflating work. When a later inflating request is received from
> +	 * the host, the success of memory allocation via balloon_page_enqueue
> +	 * will turn off the mode.
> +	 */
> +	bool oom_mode;
> +
>  	/* Waiting for host to ack the pages we released. */
>  	wait_queue_head_t acked;
>  
> @@ -142,22 +151,22 @@ static void set_page_pfns(struct virtio_balloon *vb,
>  static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
>  {
>  	struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
> +	struct page *page;
> +	size_t orig_num;
>  	unsigned int num_pfns;
>  	__virtio32 pfns[VIRTIO_BALLOON_ARRAY_PFNS_MAX];
>  
> +	orig_num = num;
>  	/* We can only do one array worth at a time. */
>  	num = min_t(size_t, num, VIRTIO_BALLOON_ARRAY_PFNS_MAX);
>  
>  	for (num_pfns = 0; num_pfns < num;
>  	     num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
> -		struct page *page = balloon_page_enqueue(vb_dev_info);
> -
> +		page = balloon_page_enqueue(vb_dev_info);
>  		if (!page) {
>  			dev_info_ratelimited(&vb->vdev->dev,
>  					     "Out of puff! Can't get %u pages\n",
>  					     VIRTIO_BALLOON_PAGES_PER_PAGE);
> -			/* Sleep for at least 1/5 of a second before retry. */
> -			msleep(200);
>  			break;
>  		}
>  		set_page_pfns(vb, pfns + num_pfns, page);
> @@ -166,6 +175,13 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
>  			adjust_managed_page_count(page, -1);
>  	}
>  
> +	/*
> +	 * The oom_mode is set, but we've already been able to get some
> +	 * pages, so it is time to turn it off here.
> +	 */
> +	if (unlikely(READ_ONCE(vb->oom_mode) && page))
> +		WRITE_ONCE(vb->oom_mode, false);
> +
>  	mutex_lock(&vb->inflate_lock);
>  	/* Did we get any? */
>  	if (num_pfns != 0)
> @@ -173,6 +189,13 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
>  	mutex_unlock(&vb->inflate_lock);
>  	atomic64_add(num_pfns, &vb->num_pages);
>  
> +	/*
> +	 * If oom_mode is on, return the original @num passed by
> +	 * update_balloon_size_func to stop the inflating.
> +	 */
> +	if (READ_ONCE(vb->oom_mode))
> +		return orig_num;
> +
>  	return num_pfns;
>  }
>  
> @@ -365,6 +388,7 @@ static int virtballoon_oom_notify(struct notifier_block *self,
>  	if (!virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
>  		return NOTIFY_OK;
>  
> +	WRITE_ONCE(vb->oom_mode, true);
>  	freed = parm;
>  
>  	/* Don't deflate more than the number of inflated pages */
> @@ -549,6 +573,7 @@ static int virtballoon_probe(struct virtio_device *vdev)
>  	INIT_WORK(&vb->update_balloon_size_work, update_balloon_size_func);
>  	spin_lock_init(&vb->stop_update_lock);
>  	vb->stop_update = false;
> +	vb->oom_mode = false;
>  	atomic64_set(&vb->num_pages, 0);
>  	mutex_init(&vb->inflate_lock);
>  	mutex_init(&vb->deflate_lock);
> -- 
> 2.7.4

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v1 1/3] virtio-balloon: replace the coarse-grained balloon_lock
  2017-10-22 11:50         ` Tetsuo Handa
@ 2017-10-24  1:46           ` Wei Wang
  -1 siblings, 0 replies; 47+ messages in thread
From: Wei Wang @ 2017-10-24  1:46 UTC (permalink / raw)
  To: Tetsuo Handa, mst; +Cc: mhocko, linux-kernel, linux-mm, virtualization

On 10/22/2017 07:50 PM, Tetsuo Handa wrote:
> Wei Wang wrote:
>>>> @@ -162,20 +160,20 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
>>>>    			msleep(200);
>>>>    			break;
>>>>    		}
>>>> -		set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
>>>> -		vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
>>>> +		set_page_pfns(vb, pfns + num_pfns, page);
>>>>    		if (!virtio_has_feature(vb->vdev,
>>>>    					VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
>>>>    			adjust_managed_page_count(page, -1);
>>>>    	}
>>>>    
>>>> -	num_allocated_pages = vb->num_pfns;
>>>> +	mutex_lock(&vb->inflate_lock);
>>>>    	/* Did we get any? */
>>>> -	if (vb->num_pfns != 0)
>>>> -		tell_host(vb, vb->inflate_vq);
>>>> -	mutex_unlock(&vb->balloon_lock);
>>>> +	if (num_pfns != 0)
>>>> +		tell_host(vb, vb->inflate_vq, pfns, num_pfns);
>>>> +	mutex_unlock(&vb->inflate_lock);
>>>> +	atomic64_add(num_pfns, &vb->num_pages);
>>> Isn't this addition too late? If leak_balloon() is called due to
>>> out_of_memory(), it will fail to find an up-to-date vb->num_pages value.
>> Not really. I think the old way of implementing it above:
>> "vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE"
>> isn't quite accurate, because "vb->num_pages" should reflect the number of
>> pages that have already been inflated, which means those pages have
>> already been given to the host via "tell_host()".
>>
>> If we update "vb->num_pages" earlier, before tell_host(), then it will
>> include the pages that haven't been given to the host, which I think
>> shouldn't be counted as inflated pages.
>>
>> On the other hand, OOM will use leak_balloon() to release the pages that
>> should have already been inflated.
> But leak_balloon() takes the maximum number of pages to deflate from vb->num_pages, doesn't it?
>
>>>>    
>>>>    	/* We can only do one array worth at a time. */
>>>> -	num = min(num, ARRAY_SIZE(vb->pfns));
>>>> +	num = min_t(size_t, num, VIRTIO_BALLOON_ARRAY_PFNS_MAX);
>>>>    
>>>> -	mutex_lock(&vb->balloon_lock);
>>>>    	/* We can't release more pages than taken */
>>>> -	num = min(num, (size_t)vb->num_pages);
>>>> -	for (vb->num_pfns = 0; vb->num_pfns < num;
>>>> -	     vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
>>>> +	num = min_t(size_t, num, atomic64_read(&vb->num_pages));
>>>> +	for (num_pfns = 0; num_pfns < num;
>>>> +	     num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
>>>>    		page = balloon_page_dequeue(vb_dev_info);
>>> If balloon_page_dequeue() can be concurrently called by both host's request
>>> and guest's OOM event, is the (!dequeued_page) test in balloon_page_dequeue() safe?
>>
>> I'm not sure I follow the question. "dequeued_page" is a local variable
>> in that function, so why would it be unsafe for two concurrent invocations
>> (the shared b_dev_info->pages list is operated on under a lock)?
> I'm not an MM person nor a virtio person; I'm commenting from the point of
> view of safe programming. My question is, isn't there a possibility of hitting
>
> 	if (unlikely(list_empty(&b_dev_info->pages) &&
> 		     !b_dev_info->isolated_pages))
> 		BUG();
>
> when things run concurrently.

Thanks for the comments. I'm not 100% confident about all the possible
corner cases here at present (e.g. why the b_dev_info->page_lock is released
and re-taken in balloon_page_dequeue()), and Michael has expressed a
preference for his solution, so I plan not to stick with this one.

Best,
Wei

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v1 3/3] virtio-balloon: stop inflating when OOM occurs
  2017-10-22 17:13     ` Michael S. Tsirkin
@ 2017-10-24  1:58       ` Wei Wang
  -1 siblings, 0 replies; 47+ messages in thread
From: Wei Wang @ 2017-10-24  1:58 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: penguin-kernel, mhocko, linux-kernel, linux-mm, virtualization

On 10/23/2017 01:13 AM, Michael S. Tsirkin wrote:
> On Fri, Oct 20, 2017 at 07:54:26PM +0800, Wei Wang wrote:
>> This patch forces the inflating work to cease when OOM occurs.
>> The fundamental idea of memory ballooning is to take out some guest
>> pages when the guest has low memory utilization, so it is sensible to
>> inflate nothing when the guest is already under memory pressure.
>>
>> On the other hand, the policy is determined by the admin or the
>> orchestration layer from the host. That is, the host is expected to
>> re-start the memory inflating request at a proper time later when
>> the guest has enough memory to inflate, for example, by checking
>> the memory stats reported by the balloon.
> Is there any other way to do it? And if so, can't we just have the guest do
> it automatically? Maybe the issue is really that fill attempts to
> allocate memory aggressively instead of checking availability.
> Maybe with deflate on oom it should check availability?
>

I think it might not be easy to do this in the guest in practice.
For example, the host asks for 4G from the guest, and the guest checks
that it has 4G that can be inflated at that point. While it is inflating,
and 2G has been inflated so far, a new task starts on the guest and
takes the remaining 2G for itself. Now the guest has nothing left to inflate.

This raises two questions:
1) What is the point of checking the availability?
Maybe we could just let the guest inflate as much as it can, that is, until
balloon_page_enqueue() returns NULL, and then stop inflating.

2) How long would the host have to wait for this guest to provide the
remaining 2G?
If I understand "have the guest do it automatically" correctly, the guest is
now responsible for giving the host another 2G, which it owes in this case -
i.e. it keeps trying to inflate whenever some memory becomes free. Maybe in
the next hour it wouldn't have any memory available to give to the host.
The time seems non-deterministic.

If we leave it to the host to define the policy, I think it would be easier.
Once the host finds that the guest can only offer 2G, it can just give up
asking for memory from this guest, and go on to check other guests to see
if it can get some memory there to satisfy its needs.


Best,
Wei

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v1 0/3] Virtio-balloon Improvement
  2017-10-22  3:19   ` Michael S. Tsirkin
@ 2017-11-03  8:35     ` Wei Wang
  -1 siblings, 0 replies; 47+ messages in thread
From: Wei Wang @ 2017-11-03  8:35 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: penguin-kernel, mhocko, linux-kernel, linux-mm, virtualization

On 10/22/2017 11:19 AM, Michael S. Tsirkin wrote:
> On Fri, Oct 20, 2017 at 07:54:23PM +0800, Wei Wang wrote:
>> This patch series intends to summarize the recent contributions made by
>> Michael S. Tsirkin, Tetsuo Handa, Michal Hocko etc. via reporting and
>> discussing the related deadlock issues on the mailinglist. Please check
>> each patch for details.
>>
>> From a high-level point of view, this patch series achieves:
>> 1) eliminate the deadlock issue fundamentally caused by the inability
>> to run leak_balloon and fill_balloon concurrently;
> We need to think about this carefully. Is it an issue that
> leak can now bypass fill? It seems that we can now
> try to leak a page before fill was seen by host,
> but I did not look into it deeply.
>
> I really like my patch for this better, at least for the
> current kernel. I agree we need to work more on 2+3.
>

Since we have many customers interested in the "Virtio-balloon Enhancement"
series, please review the v17 patches first (that series has a dependency on
your patch for the deadlock fix, so I included it there too), and we can get
back to 2+3 here after that series is done. Thanks.

Best,
Wei

^ permalink raw reply	[flat|nested] 47+ messages in thread
