* [PATCH v4 0/2] virtio_balloon: Fix restore and convert to workqueue
@ 2015-12-04 13:37 Petr Mladek
  2015-12-04 13:37 ` [PATCH v4 1/2] virtio_balloon: Restore the entire balloon after the system freeze Petr Mladek
                   ` (3 more replies)
  0 siblings, 4 replies; 21+ messages in thread
From: Petr Mladek @ 2015-12-04 13:37 UTC (permalink / raw)
  To: Rusty Russell, Michael S. Tsirkin
  Cc: Jeff Epler, Tejun Heo, Jiri Kosina, virtualization, linux-kernel,
	Petr Mladek

It has been a long time since I sent v3 of the balloon conversion from
a kthread to a workqueue. I have gained more experience with the APIs
in the meantime, and I hope that you will like the outcome.

I have added one more patch that fixes a separate problem with
restoring the balloon after the system freeze. I found it while
testing the conversion.

Changes against v3:

  + rebased on 4.4-rc3

  + call cancel_work_sync() when removing the balloon

  + do not queue the work from fill_balloon() and leak_balloon()
    because they are also called independently of the workqueue,
    e.g. from remove_common() and virtballoon_oom_notify(). Instead,
    re-queue the work from the work function when necessary.


Changes against v2:

  + Use system_freezable_wq instead of an allocated one
    and move INIT_WORK() higher in virtballoon_probe().

  + Fix typos in the commit message.


Changes against v1:

  + More elegant detection of the pending work in fill_balloon() and
    leak_balloon(). It still needs to keep the original requested number
    of pages but it does not add any extra boolean variable.

  + Remove WQ_MEM_RECLAIM workqueue parameter. If I understand it
    correctly, this is possible because the code manipulates memory
    but it is not used in the memory reclaim path.

  + initialize the work item before allocating the workqueue

JFYI, the discussion about the previous version can be found at
http://thread.gmane.org/gmane.linux.kernel.virtualization/23701

Petr Mladek (2):
  virtio_balloon: Restore the entire balloon after the system freeze
  virtio_balloon: Use a workqueue instead of "vballoon" kthread

 drivers/virtio/virtio_balloon.c | 93 ++++++++++++++++-------------------------
 1 file changed, 35 insertions(+), 58 deletions(-)

-- 
1.8.5.6


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v4 1/2] virtio_balloon: Restore the entire balloon after the system freeze
  2015-12-04 13:37 [PATCH v4 0/2] virtio_balloon: Fix restore and convert to workqueue Petr Mladek
@ 2015-12-04 13:37 ` Petr Mladek
  2016-01-01 10:11     ` Michael S. Tsirkin
  2015-12-04 13:37 ` Petr Mladek
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 21+ messages in thread
From: Petr Mladek @ 2015-12-04 13:37 UTC (permalink / raw)
  To: Rusty Russell, Michael S. Tsirkin
  Cc: Jeff Epler, Tejun Heo, Jiri Kosina, virtualization, linux-kernel,
	Petr Mladek

fill_balloon() and leak_balloon() manipulate only a limited number
of pages in one call. This is why remove_common() calls
leak_balloon() in a while loop.

remove_common() is also called when the system is being frozen.
But fill_balloon() is called only once when the system is being
restored. As a result, most of the balloon stays leaked after
the system freeze and restore.

This patch adds the missing while loop to virtballoon_restore().
It also makes fill_balloon() return the number of pages actually
modified. Note that leak_balloon() already does this.

Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 drivers/virtio/virtio_balloon.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 7efc32945810..d73a86db2490 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -135,9 +135,10 @@ static void set_page_pfns(u32 pfns[], struct page *page)
 		pfns[i] = page_to_balloon_pfn(page) + i;
 }
 
-static void fill_balloon(struct virtio_balloon *vb, size_t num)
+static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
 {
 	struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
+	unsigned num_allocated_pages;
 
 	/* We can only do one array worth at a time. */
 	num = min(num, ARRAY_SIZE(vb->pfns));
@@ -162,10 +163,13 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num)
 			adjust_managed_page_count(page, -1);
 	}
 
+	num_allocated_pages = vb->num_pfns;
 	/* Did we get any? */
 	if (vb->num_pfns != 0)
 		tell_host(vb, vb->inflate_vq);
 	mutex_unlock(&vb->balloon_lock);
+
+	return num_allocated_pages;
 }
 
 static void release_pages_balloon(struct virtio_balloon *vb)
@@ -581,6 +585,7 @@ static int virtballoon_freeze(struct virtio_device *vdev)
 static int virtballoon_restore(struct virtio_device *vdev)
 {
 	struct virtio_balloon *vb = vdev->priv;
+	s64 diff;
 	int ret;
 
 	ret = init_vqs(vdev->priv);
@@ -589,7 +594,9 @@ static int virtballoon_restore(struct virtio_device *vdev)
 
 	virtio_device_ready(vdev);
 
-	fill_balloon(vb, towards_target(vb));
+	diff = towards_target(vb);
+	while (diff > 0)
+		diff -= fill_balloon(vb, diff);
 	update_balloon_size(vb);
 	return 0;
 }
-- 
1.8.5.6


^ permalink raw reply related	[flat|nested] 21+ messages in thread
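
The restore problem described in patch 1/2 can be illustrated with a small
userspace model. This is only a sketch, not driver code: struct model_balloon,
model_fill() and CHUNK_PAGES are hypothetical stand-ins for
struct virtio_balloon, fill_balloon() and ARRAY_SIZE(vb->pfns), and the chunk
size of 256 pages is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed per-call limit; stands in for ARRAY_SIZE(vb->pfns). */
#define CHUNK_PAGES 256

/* Userspace model only; not the kernel's struct virtio_balloon. */
struct model_balloon {
	long long num_pages;	/* pages currently held by the balloon */
	long long target;	/* pages requested by the host */
};

/* Mirrors the idea of towards_target(): positive means "inflate". */
static long long model_towards_target(const struct model_balloon *b)
{
	return b->target - b->num_pages;
}

/* Inflate by at most one chunk and report how many pages were added. */
static unsigned model_fill(struct model_balloon *b, size_t num)
{
	assert(num > 0);	/* callers only ask for a positive count */
	if (num > CHUNK_PAGES)
		num = CHUNK_PAGES;
	b->num_pages += (long long)num;
	return (unsigned)num;
}

/* Behaviour before the patch: one call, so at most one chunk returns. */
static void model_restore_once(struct model_balloon *b)
{
	long long diff = model_towards_target(b);

	if (diff > 0)
		model_fill(b, (size_t)diff);
}

/* Behaviour after the patch: keep filling until the target is reached. */
static void model_restore_loop(struct model_balloon *b)
{
	long long diff = model_towards_target(b);

	while (diff > 0)
		diff -= model_fill(b, (size_t)diff);
}
```

With a target of 1000 pages, the single call in this model restores only one
chunk, while the loop restores all of them. The loop only works because the
fill function reports how many pages it really handled.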

* [PATCH v4 2/2] virtio_balloon: Use a workqueue instead of "vballoon" kthread
  2015-12-04 13:37 [PATCH v4 0/2] virtio_balloon: Fix restore and convert to workqueue Petr Mladek
                   ` (2 preceding siblings ...)
  2015-12-04 13:37 ` [PATCH v4 2/2] virtio_balloon: Use a workqueue instead of "vballoon" kthread Petr Mladek
@ 2015-12-04 13:37 ` Petr Mladek
  2016-01-01 10:18     ` Michael S. Tsirkin
  3 siblings, 1 reply; 21+ messages in thread
From: Petr Mladek @ 2015-12-04 13:37 UTC (permalink / raw)
  To: Rusty Russell, Michael S. Tsirkin
  Cc: Jeff Epler, Tejun Heo, Jiri Kosina, virtualization, linux-kernel,
	Petr Mladek

From: Petr Mladek <pmladek@suse.cz>

This patch moves the deferred work from the "vballoon" kthread into a
system freezable workqueue.

We do not need to maintain and run a dedicated kthread. Also, the
event-driven workqueue API makes the logic much simpler. In particular,
we no longer need our own wait queue, wait function, and freeze point.

The conversion is pretty straightforward. One iteration of the main loop
becomes a work item. The work item is queued instead of waking the kthread.

fill_balloon() and leak_balloon() limit the number of pages modified in
one call. The work item re-queues itself when necessary.

My initial idea was to use a dedicated workqueue. Michael S. Tsirkin
suggested using a system one. Tejun Heo confirmed that the system
workqueue has a pretty high concurrency level (256) by default.
Therefore we need not worry about blocking it for too long.

Signed-off-by: Petr Mladek <pmladek@suse.cz>
---
 drivers/virtio/virtio_balloon.c | 82 +++++++++++++----------------------------
 1 file changed, 26 insertions(+), 56 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index d73a86db2490..960e54b1d0c1 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -22,8 +22,7 @@
 #include <linux/virtio.h>
 #include <linux/virtio_balloon.h>
 #include <linux/swap.h>
-#include <linux/kthread.h>
-#include <linux/freezer.h>
+#include <linux/workqueue.h>
 #include <linux/delay.h>
 #include <linux/slab.h>
 #include <linux/module.h>
@@ -49,11 +48,8 @@ struct virtio_balloon {
 	struct virtio_device *vdev;
 	struct virtqueue *inflate_vq, *deflate_vq, *stats_vq;
 
-	/* Where the ballooning thread waits for config to change. */
-	wait_queue_head_t config_change;
-
-	/* The thread servicing the balloon. */
-	struct task_struct *thread;
+	/* The balloon servicing is delegated to a freezable workqueue. */
+	struct work_struct wq_work;
 
 	/* Waiting for host to ack the pages we released. */
 	wait_queue_head_t acked;
@@ -255,14 +251,15 @@ static void update_balloon_stats(struct virtio_balloon *vb)
  * with a single buffer.  From that point forward, all conversations consist of
  * a hypervisor request (a call to this function) which directs us to refill
  * the virtqueue with a fresh stats buffer.  Since stats collection can sleep,
- * we notify our kthread which does the actual work via stats_handle_request().
+ * we delegate the job to a freezable workqueue that will do the actual work via
+ * stats_handle_request().
  */
 static void stats_request(struct virtqueue *vq)
 {
 	struct virtio_balloon *vb = vq->vdev->priv;
 
 	vb->need_stats_update = 1;
-	wake_up(&vb->config_change);
+	queue_work(system_freezable_wq, &vb->wq_work);
 }
 
 static void stats_handle_request(struct virtio_balloon *vb)
@@ -286,7 +283,7 @@ static void virtballoon_changed(struct virtio_device *vdev)
 {
 	struct virtio_balloon *vb = vdev->priv;
 
-	wake_up(&vb->config_change);
+	queue_work(system_freezable_wq, &vb->wq_work);
 }
 
 static inline s64 towards_target(struct virtio_balloon *vb)
@@ -349,43 +346,25 @@ static int virtballoon_oom_notify(struct notifier_block *self,
 	return NOTIFY_OK;
 }
 
-static int balloon(void *_vballoon)
+static void balloon(struct work_struct *work)
 {
-	struct virtio_balloon *vb = _vballoon;
-	DEFINE_WAIT_FUNC(wait, woken_wake_function);
-
-	set_freezable();
-	while (!kthread_should_stop()) {
-		s64 diff;
-
-		try_to_freeze();
-
-		add_wait_queue(&vb->config_change, &wait);
-		for (;;) {
-			if ((diff = towards_target(vb)) != 0 ||
-			    vb->need_stats_update ||
-			    kthread_should_stop() ||
-			    freezing(current))
-				break;
-			wait_woken(&wait, TASK_INTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT);
-		}
-		remove_wait_queue(&vb->config_change, &wait);
+	struct virtio_balloon *vb;
+	s64 diff;
 
-		if (vb->need_stats_update)
-			stats_handle_request(vb);
-		if (diff > 0)
-			fill_balloon(vb, diff);
-		else if (diff < 0)
-			leak_balloon(vb, -diff);
-		update_balloon_size(vb);
+	vb = container_of(work, struct virtio_balloon, wq_work);
+	diff = towards_target(vb);
 
-		/*
-		 * For large balloon changes, we could spend a lot of time
-		 * and always have work to do.  Be nice if preempt disabled.
-		 */
-		cond_resched();
-	}
-	return 0;
+	if (vb->need_stats_update)
+		stats_handle_request(vb);
+
+	if (diff > 0)
+		diff -= fill_balloon(vb, diff);
+	else if (diff < 0)
+		diff += leak_balloon(vb, -diff);
+	update_balloon_size(vb);
+
+	if (diff)
+		queue_work(system_freezable_wq, work);
 }
 
 static int init_vqs(struct virtio_balloon *vb)
@@ -503,9 +482,9 @@ static int virtballoon_probe(struct virtio_device *vdev)
 		goto out;
 	}
 
+	INIT_WORK(&vb->wq_work, balloon);
 	vb->num_pages = 0;
 	mutex_init(&vb->balloon_lock);
-	init_waitqueue_head(&vb->config_change);
 	init_waitqueue_head(&vb->acked);
 	vb->vdev = vdev;
 	vb->need_stats_update = 0;
@@ -527,16 +506,8 @@ static int virtballoon_probe(struct virtio_device *vdev)
 
 	virtio_device_ready(vdev);
 
-	vb->thread = kthread_run(balloon, vb, "vballoon");
-	if (IS_ERR(vb->thread)) {
-		err = PTR_ERR(vb->thread);
-		goto out_del_vqs;
-	}
-
 	return 0;
 
-out_del_vqs:
-	unregister_oom_notifier(&vb->nb);
 out_oom_notify:
 	vdev->config->del_vqs(vdev);
 out_free_vb:
@@ -563,7 +534,7 @@ static void virtballoon_remove(struct virtio_device *vdev)
 	struct virtio_balloon *vb = vdev->priv;
 
 	unregister_oom_notifier(&vb->nb);
-	kthread_stop(vb->thread);
+	cancel_work_sync(&vb->wq_work);
 	remove_common(vb);
 	kfree(vb);
 }
@@ -574,10 +545,9 @@ static int virtballoon_freeze(struct virtio_device *vdev)
 	struct virtio_balloon *vb = vdev->priv;
 
 	/*
-	 * The kthread is already frozen by the PM core before this
+	 * The workqueue is already frozen by the PM core before this
 	 * function is called.
 	 */
-
 	remove_common(vb);
 	return 0;
 }
-- 
1.8.5.6


^ permalink raw reply related	[flat|nested] 21+ messages in thread
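
The self-requeuing work function from patch 2/2 can be sketched in userspace
as follows. This is a simplified model under stated assumptions, not the
driver: work_pending stands in for a queued work_struct, model_queue_work()
for queue_work(system_freezable_wq, ...), model_run_pending() for a workqueue
worker, and CHUNK_PAGES (256 here) for the per-call limit of fill_balloon()
and leak_balloon(). All of these names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed per-call limit of the fill and leak steps. */
#define CHUNK_PAGES 256

/* Userspace model only; not the kernel's struct virtio_balloon. */
struct model_balloon {
	long long num_pages;	/* pages currently held by the balloon */
	long long target;	/* pages requested by the host */
	bool work_pending;	/* models a queued work item */
	unsigned runs;		/* how many times the work function ran */
};

/* Models queue_work(): queueing an already pending item changes nothing. */
static void model_queue_work(struct model_balloon *b)
{
	b->work_pending = true;
}

/* Limit one step to a single chunk. */
static long long clamp_chunk(long long n)
{
	assert(n > 0);
	return n > CHUNK_PAGES ? CHUNK_PAGES : n;
}

/* One pass of the work function: move one chunk, re-queue if not done. */
static void model_balloon_work(struct model_balloon *b)
{
	long long diff = b->target - b->num_pages;

	b->runs++;
	if (diff > 0) {
		long long done = clamp_chunk(diff);

		b->num_pages += done;	/* inflate step */
		diff -= done;
	} else if (diff < 0) {
		long long done = clamp_chunk(-diff);

		b->num_pages -= done;	/* deflate step */
		diff += done;
	}

	if (diff)
		model_queue_work(b);
}

/* Stand-in for a worker thread that runs the item while it is pending. */
static void model_run_pending(struct model_balloon *b)
{
	while (b->work_pending) {
		b->work_pending = false;
		model_balloon_work(b);
	}
}
```

In this model a config change only queues the item once. The work function
decides on its own whether another pass is needed, so the fill and leak
helpers never have to queue anything themselves.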

* Re: [PATCH v4 1/2] virtio_balloon: Restore the entire balloon after the system freeze
  2015-12-04 13:37 ` [PATCH v4 1/2] virtio_balloon: Restore the entire balloon after the system freeze Petr Mladek
@ 2016-01-01 10:11     ` Michael S. Tsirkin
  0 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2016-01-01 10:11 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Rusty Russell, Jeff Epler, Tejun Heo, Jiri Kosina,
	virtualization, linux-kernel

On Fri, Dec 04, 2015 at 02:37:50PM +0100, Petr Mladek wrote:
> fill_balloon() and leak_balloon() manipulate only a limited number
> of pages in one call. This is why remove_common() calls
> leak_balloon() in a while loop.
> 
> remove_common() is also called when the system is being frozen.
> But fill_balloon() is called only once when the system is being
> restored. As a result, most of the balloon stays leaked after
> the system freeze and restore.

Right, but refilling might take a long while.
In fact, we sleep for 200msec on refill failure,
stalling system resume - which is already a bug.

> This patch adds the missing while loop to virtballoon_restore().
> It also makes fill_balloon() return the number of pages actually
> modified. Note that leak_balloon() already does this.
> 
> Signed-off-by: Petr Mladek <pmladek@suse.com>

This is a replacement for:
	virtio_balloon: Restore the entire balloon after the system freeze

> ---
>  drivers/virtio/virtio_balloon.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 7efc32945810..d73a86db2490 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -135,9 +135,10 @@ static void set_page_pfns(u32 pfns[], struct page *page)
>  		pfns[i] = page_to_balloon_pfn(page) + i;
>  }
>  
> -static void fill_balloon(struct virtio_balloon *vb, size_t num)
> +static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
>  {
>  	struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
> +	unsigned num_allocated_pages;
>  
>  	/* We can only do one array worth at a time. */
>  	num = min(num, ARRAY_SIZE(vb->pfns));
> @@ -162,10 +163,13 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num)
>  			adjust_managed_page_count(page, -1);
>  	}
>  
> +	num_allocated_pages = vb->num_pfns;
>  	/* Did we get any? */
>  	if (vb->num_pfns != 0)
>  		tell_host(vb, vb->inflate_vq);
>  	mutex_unlock(&vb->balloon_lock);
> +
> +	return num_allocated_pages;
>  }
>  
>  static void release_pages_balloon(struct virtio_balloon *vb)
> @@ -581,6 +585,7 @@ static int virtballoon_freeze(struct virtio_device *vdev)
>  static int virtballoon_restore(struct virtio_device *vdev)
>  {
>  	struct virtio_balloon *vb = vdev->priv;
> +	s64 diff;
>  	int ret;
>  
>  	ret = init_vqs(vdev->priv);
> @@ -589,7 +594,9 @@ static int virtballoon_restore(struct virtio_device *vdev)
>  
>  	virtio_device_ready(vdev);
>  
> -	fill_balloon(vb, towards_target(vb));
> +	diff = towards_target(vb);
> +	while (diff > 0)
> +		diff -= fill_balloon(vb, diff);
>  	update_balloon_size(vb);
>  	return 0;
>  }
> -- 
> 1.8.5.6

^ permalink raw reply	[flat|nested] 21+ messages in thread
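
The concern raised in the review above can be sketched with a userspace model
that counts how long the restore loop would sleep. The assumptions are loud
ones: model_fill() fails as a whole for the first failures_left calls (a
simplification of what the driver does), each failure costs the 200 msec
mentioned in the review, and the sleep is only accumulated in slept_ms rather
than performed. All names and CHUNK_PAGES (256 here) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed per-call limit of the fill step. */
#define CHUNK_PAGES 256
/* Sleep after a failed refill, as quoted in the review comment. */
#define RETRY_SLEEP_MS 200

/* Userspace model only; not the kernel's struct virtio_balloon. */
struct model_balloon {
	long long num_pages;		/* pages currently in the balloon */
	long long target;		/* pages requested by the host */
	unsigned failures_left;		/* calls that will fail to allocate */
	unsigned long slept_ms;		/* time the model would have slept */
};

/* Either fail and "sleep", or inflate by at most one chunk. */
static unsigned model_fill(struct model_balloon *b, size_t num)
{
	assert(num > 0);
	if (b->failures_left > 0) {
		b->failures_left--;
		b->slept_ms += RETRY_SLEEP_MS;	/* accounted, not slept */
		return 0;
	}
	if (num > CHUNK_PAGES)
		num = CHUNK_PAGES;
	b->num_pages += (long long)num;
	return (unsigned)num;
}

/* The restore loop from the patch; returns the accumulated stall. */
static unsigned long model_restore_stall_ms(struct model_balloon *b)
{
	long long diff = b->target - b->num_pages;

	while (diff > 0)
		diff -= model_fill(b, (size_t)diff);
	return b->slept_ms;
}
```

In this model every failed call adds 200 msec to the resume path, and the loop
does not finish at all while allocations keep failing. That is the stall the
review points at, and it is why doing the refill synchronously in
virtballoon_restore() is questioned.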

* Re: [PATCH v4 1/2] virtio_balloon: Restore the entire balloon after the system freeze
  2016-01-01 10:11     ` Michael S. Tsirkin
@ 2016-01-01 10:11       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2016-01-01 10:11 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Rusty Russell, Jeff Epler, Tejun Heo, Jiri Kosina,
	virtualization, linux-kernel

On Fri, Jan 01, 2016 at 12:11:02PM +0200, Michael S. Tsirkin wrote:
> On Fri, Dec 04, 2015 at 02:37:50PM +0100, Petr Mladek wrote:
> > fill_balloon() and leak_balloon() manipulate only a limited number
> > of pages in one call. This is why remove_common() calls
> > leak_balloon() in a while loop.
> > 
> > remove_common() is also called when the system is being frozen.
> > But fill_balloon() is called only once when the system is being
> > restored. As a result, most of the balloon stays leaked after
> > the system freeze and restore.
> 
> Right, but refilling might take a long while.
> In fact, we sleep for 200msec on refill failure,
> stalling system resume - which is already a bug.
> 
> > This patch adds the missing while loop to virtballoon_restore().
> > It also makes fill_balloon() return the number of pages actually
> > modified. Note that leak_balloon() already does this.
> > 
> > Signed-off-by: Petr Mladek <pmladek@suse.com>
> 
> This is a replacement for:
> 	virtio_balloon: Restore the entire balloon after the system freeze

oops, typo
I meant to write I'll send a replacement patch shortly :)

> > ---
> >  drivers/virtio/virtio_balloon.c | 11 +++++++++--
> >  1 file changed, 9 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> > index 7efc32945810..d73a86db2490 100644
> > --- a/drivers/virtio/virtio_balloon.c
> > +++ b/drivers/virtio/virtio_balloon.c
> > @@ -135,9 +135,10 @@ static void set_page_pfns(u32 pfns[], struct page *page)
> >  		pfns[i] = page_to_balloon_pfn(page) + i;
> >  }
> >  
> > -static void fill_balloon(struct virtio_balloon *vb, size_t num)
> > +static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
> >  {
> >  	struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
> > +	unsigned num_allocated_pages;
> >  
> >  	/* We can only do one array worth at a time. */
> >  	num = min(num, ARRAY_SIZE(vb->pfns));
> > @@ -162,10 +163,13 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num)
> >  			adjust_managed_page_count(page, -1);
> >  	}
> >  
> > +	num_allocated_pages = vb->num_pfns;
> >  	/* Did we get any? */
> >  	if (vb->num_pfns != 0)
> >  		tell_host(vb, vb->inflate_vq);
> >  	mutex_unlock(&vb->balloon_lock);
> > +
> > +	return num_allocated_pages;
> >  }
> >  
> >  static void release_pages_balloon(struct virtio_balloon *vb)
> > @@ -581,6 +585,7 @@ static int virtballoon_freeze(struct virtio_device *vdev)
> >  static int virtballoon_restore(struct virtio_device *vdev)
> >  {
> >  	struct virtio_balloon *vb = vdev->priv;
> > +	s64 diff;
> >  	int ret;
> >  
> >  	ret = init_vqs(vdev->priv);
> > @@ -589,7 +594,9 @@ static int virtballoon_restore(struct virtio_device *vdev)
> >  
> >  	virtio_device_ready(vdev);
> >  
> > -	fill_balloon(vb, towards_target(vb));
> > +	diff = towards_target(vb);
> > +	while (diff > 0)
> > +		diff -= fill_balloon(vb, diff);
> >  	update_balloon_size(vb);
> >  	return 0;
> >  }
> > -- 
> > 1.8.5.6

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 2/2] virtio_balloon: Use a workqueue instead of "vballoon" kthread
  2015-12-04 13:37 ` Petr Mladek
@ 2016-01-01 10:18     ` Michael S. Tsirkin
  0 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2016-01-01 10:18 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Rusty Russell, Jeff Epler, Tejun Heo, Jiri Kosina,
	virtualization, linux-kernel, Petr Mladek

On Fri, Dec 04, 2015 at 02:37:51PM +0100, Petr Mladek wrote:
> From: Petr Mladek <pmladek@suse.cz>
> 
> This patch moves the deferred work from the "vballoon" kthread into a
> system freezable workqueue.
> 
> We do not need to maintain and run a dedicated kthread. Also the event
> driven workqueues API makes the logic much easier. Especially, we do
> not longer need an own wait queue, wait function, and freeze point.
> 
> The conversion is pretty straightforward. One cycle of the main loop
> is put into a work. The work is queued instead of waking the kthread.
> 
> fill_balloon() and leak_balloon() have a limit for the amount of modified
> pages. The work re-queues itself when necessary.
> 
> My initial idea was to use a dedicated workqueue. Michael S. Tsirkin
> suggested using a system one. Tejun Heo confirmed that the system
> workqueue has a pretty high concurrency level (256) by default.
> Therefore we need not be afraid of too long blocking.

Right but fill has a 1/5 second sleep on failure - *that*
is problematic for a system queue.

There's also a race introduced on remove, see below.

I'm inclined to tread carefully with this conversion.

> 
> Signed-off-by: Petr Mladek <pmladek@suse.cz>
> ---
>  drivers/virtio/virtio_balloon.c | 82 +++++++++++++----------------------------
>  1 file changed, 26 insertions(+), 56 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index d73a86db2490..960e54b1d0c1 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -22,8 +22,7 @@
>  #include <linux/virtio.h>
>  #include <linux/virtio_balloon.h>
>  #include <linux/swap.h>
> -#include <linux/kthread.h>
> -#include <linux/freezer.h>
> +#include <linux/workqueue.h>
>  #include <linux/delay.h>
>  #include <linux/slab.h>
>  #include <linux/module.h>
> @@ -49,11 +48,8 @@ struct virtio_balloon {
>  	struct virtio_device *vdev;
>  	struct virtqueue *inflate_vq, *deflate_vq, *stats_vq;
>  
> -	/* Where the ballooning thread waits for config to change. */
> -	wait_queue_head_t config_change;
> -
> -	/* The thread servicing the balloon. */
> -	struct task_struct *thread;
> +	/* The balloon servicing is delegated to a freezable workqueue. */
> +	struct work_struct wq_work;
>  
>  	/* Waiting for host to ack the pages we released. */
>  	wait_queue_head_t acked;
> @@ -255,14 +251,15 @@ static void update_balloon_stats(struct virtio_balloon *vb)
>   * with a single buffer.  From that point forward, all conversations consist of
>   * a hypervisor request (a call to this function) which directs us to refill
>   * the virtqueue with a fresh stats buffer.  Since stats collection can sleep,
> - * we notify our kthread which does the actual work via stats_handle_request().
> + * we delegate the job to a freezable workqueue that will do the actual work via
> + * stats_handle_request().
>   */
>  static void stats_request(struct virtqueue *vq)
>  {
>  	struct virtio_balloon *vb = vq->vdev->priv;
>  
>  	vb->need_stats_update = 1;
> -	wake_up(&vb->config_change);
> +	queue_work(system_freezable_wq, &vb->wq_work);
>  }
>  
>  static void stats_handle_request(struct virtio_balloon *vb)
> @@ -286,7 +283,7 @@ static void virtballoon_changed(struct virtio_device *vdev)
>  {
>  	struct virtio_balloon *vb = vdev->priv;
>  
> -	wake_up(&vb->config_change);
> +	queue_work(system_freezable_wq, &vb->wq_work);
>  }
>  
>  static inline s64 towards_target(struct virtio_balloon *vb)
> @@ -349,43 +346,25 @@ static int virtballoon_oom_notify(struct notifier_block *self,
>  	return NOTIFY_OK;
>  }
>  
> -static int balloon(void *_vballoon)
> +static void balloon(struct work_struct *work)
>  {
> -	struct virtio_balloon *vb = _vballoon;
> -	DEFINE_WAIT_FUNC(wait, woken_wake_function);
> -
> -	set_freezable();
> -	while (!kthread_should_stop()) {
> -		s64 diff;
> -
> -		try_to_freeze();
> -
> -		add_wait_queue(&vb->config_change, &wait);
> -		for (;;) {
> -			if ((diff = towards_target(vb)) != 0 ||
> -			    vb->need_stats_update ||
> -			    kthread_should_stop() ||
> -			    freezing(current))
> -				break;
> -			wait_woken(&wait, TASK_INTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT);
> -		}
> -		remove_wait_queue(&vb->config_change, &wait);
> +	struct virtio_balloon *vb;
> +	s64 diff;
>  
> -		if (vb->need_stats_update)
> -			stats_handle_request(vb);
> -		if (diff > 0)
> -			fill_balloon(vb, diff);
> -		else if (diff < 0)
> -			leak_balloon(vb, -diff);
> -		update_balloon_size(vb);
> +	vb = container_of(work, struct virtio_balloon, wq_work);
> +	diff = towards_target(vb);
>  
> -		/*
> -		 * For large balloon changes, we could spend a lot of time
> -		 * and always have work to do.  Be nice if preempt disabled.
> -		 */
> -		cond_resched();
> -	}
> -	return 0;
> +	if (vb->need_stats_update)
> +		stats_handle_request(vb);
> +
> +	if (diff > 0)
> +		diff -= fill_balloon(vb, diff);
> +	else if (diff < 0)
> +		diff += leak_balloon(vb, -diff);
> +	update_balloon_size(vb);
> +
> +	if (diff)
> +		queue_work(system_freezable_wq, work);
>  }
>  
>  static int init_vqs(struct virtio_balloon *vb)
> @@ -503,9 +482,9 @@ static int virtballoon_probe(struct virtio_device *vdev)
>  		goto out;
>  	}
>  
> +	INIT_WORK(&vb->wq_work, balloon);
>  	vb->num_pages = 0;
>  	mutex_init(&vb->balloon_lock);
> -	init_waitqueue_head(&vb->config_change);
>  	init_waitqueue_head(&vb->acked);
>  	vb->vdev = vdev;
>  	vb->need_stats_update = 0;
> @@ -527,16 +506,8 @@ static int virtballoon_probe(struct virtio_device *vdev)
>  
>  	virtio_device_ready(vdev);
>  
> -	vb->thread = kthread_run(balloon, vb, "vballoon");
> -	if (IS_ERR(vb->thread)) {
> -		err = PTR_ERR(vb->thread);
> -		goto out_del_vqs;
> -	}
> -
>  	return 0;
>  
> -out_del_vqs:
> -	unregister_oom_notifier(&vb->nb);
>  out_oom_notify:
>  	vdev->config->del_vqs(vdev);
>  out_free_vb:
> @@ -563,7 +534,7 @@ static void virtballoon_remove(struct virtio_device *vdev)
>  	struct virtio_balloon *vb = vdev->priv;
>  
>  	unregister_oom_notifier(&vb->nb);
> -	kthread_stop(vb->thread);
> +	cancel_work_sync(&vb->wq_work);

OK but since job requeues itself, cancelling like this might not be enough.

>  	remove_common(vb);
>  	kfree(vb);
>  }
> @@ -574,10 +545,9 @@ static int virtballoon_freeze(struct virtio_device *vdev)
>  	struct virtio_balloon *vb = vdev->priv;
>  
>  	/*
> -	 * The kthread is already frozen by the PM core before this
> +	 * The workqueue is already frozen by the PM core before this
>  	 * function is called.
>  	 */
> -
>  	remove_common(vb);
>  	return 0;
>  }
> -- 
> 1.8.5.6

^ permalink raw reply	[flat|nested] 21+ messages in thread
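
Note: the "one cycle of the main loop per work invocation, re-queue when necessary" idea from the patch above can be sketched in userspace. This is a single-threaded model, not kernel code. `work_model`, `model_queue_work()`, `model_balloon_work()`, `model_run_until_idle()` and `STEP_MAX` are hypothetical names, and the `pending` flag only approximates how queue_work() refuses an item that is already queued.

```c
#include <stdbool.h>
#include <stddef.h>

#define STEP_MAX 256	/* assumed per-invocation page limit */

struct work_model {
	bool pending;		/* queued, not yet picked up by a worker */
	long long num_pages;	/* current balloon size */
	long long target;	/* size requested by the host */
};

/* Models queue_work(): returns false if the item is already pending. */
static bool model_queue_work(struct work_model *w)
{
	if (w->pending)
		return false;
	w->pending = true;
	return true;
}

/* One pass: move at most STEP_MAX pages, re-queue if the target is not met. */
static void model_balloon_work(struct work_model *w)
{
	long long diff = w->target - w->num_pages;
	long long step = diff;

	if (step > STEP_MAX)
		step = STEP_MAX;
	if (step < -STEP_MAX)
		step = -STEP_MAX;
	w->num_pages += step;
	diff -= step;

	if (diff)
		model_queue_work(w);
}

/* Stand-in for the worker pool: run the item for as long as it is pending. */
static size_t model_run_until_idle(struct work_model *w)
{
	size_t runs = 0;

	while (w->pending) {
		w->pending = false;	/* cleared before the callback runs */
		model_balloon_work(w);
		runs++;
	}
	return runs;
}
```

Because the pending state is cleared before the callback runs, the callback can queue itself again, and a config change arriving in the meantime simply finds the item already queued.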

* Re: [PATCH v4 2/2] virtio_balloon: Use a workqueue instead of "vballoon" kthread
  2016-01-01 10:18     ` Michael S. Tsirkin
@ 2016-01-02 11:43       ` Tejun Heo
  -1 siblings, 0 replies; 21+ messages in thread
From: Tejun Heo @ 2016-01-02 11:43 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Petr Mladek, Rusty Russell, Jeff Epler, Jiri Kosina,
	virtualization, linux-kernel, Petr Mladek

Hello,

On Fri, Jan 01, 2016 at 12:18:17PM +0200, Michael S. Tsirkin wrote:
> > My initial idea was to use a dedicated workqueue. Michael S. Tsirkin
> > suggested using a system one. Tejun Heo confirmed that the system
> > workqueue has a pretty high concurrency level (256) by default.
> > Therefore we need not be afraid of too long blocking.
> 
> Right but fill has a 1/5 second sleep on failure - *that*
> is problematic for a system queue.

Why so?  As long as the maximum concurrently used workers are not
high, 1/5 second or even a lot longer sleeps are completely fine.

> > @@ -563,7 +534,7 @@ static void virtballoon_remove(struct virtio_device *vdev)
> >  	struct virtio_balloon *vb = vdev->priv;
> >  
> >  	unregister_oom_notifier(&vb->nb);
> > -	kthread_stop(vb->thread);
> > +	cancel_work_sync(&vb->wq_work);
> 
> OK but since job requeues itself, cancelling like this might not be enough.

As long as there's no further external queueing, cancel_work_sync() is
guaranteed to kill a self-requeueing work item.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 21+ messages in thread
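
Note: the point about cancel_work_sync() and self-requeueing work can be sketched with a single-threaded model. This is a simplification under stated assumptions and not the workqueue implementation. `cwork`, `cwork_queue()`, `cwork_fn()` and `cwork_cancel_sync()` are hypothetical names, and the `canceling` flag only approximates the real behaviour, where queueing attempts fail while a cancel is in progress.

```c
#include <stdbool.h>

struct cwork {
	bool pending;	/* queued, waiting for a worker */
	bool canceling;	/* a cancel is in progress; queueing is refused */
	int runs;	/* how many times the callback ran */
	int remaining;	/* passes the callback still wants to do */
};

/* Models queue_work(): refused if already pending or being canceled. */
static bool cwork_queue(struct cwork *w)
{
	if (w->pending || w->canceling)
		return false;
	w->pending = true;
	return true;
}

/* A callback that re-queues itself while it has passes left. */
static void cwork_fn(struct cwork *w)
{
	w->runs++;
	if (--w->remaining > 0)
		cwork_queue(w);
}

/*
 * Models cancel_work_sync(). in_flight says whether an instance is
 * executing when the cancel starts; "waiting" for it is modelled by
 * running the callback once while queueing is refused.
 */
static void cwork_cancel_sync(struct cwork *w, bool in_flight)
{
	w->canceling = true;
	w->pending = false;	/* drop a queued, not yet started instance */
	if (in_flight)
		cwork_fn(w);
	w->canceling = false;
}
```

In this model the self-requeue issued during the cancel is refused, so the item stays idle afterwards. A later call to `cwork_queue()` from outside succeeds again, which is the "no further external queueing" condition stated above.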

* Re: [PATCH v4 2/2] virtio_balloon: Use a workqueue instead of "vballoon" kthread
  2016-01-02 11:43       ` Tejun Heo
@ 2016-01-02 21:36         ` Michael S. Tsirkin
  -1 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2016-01-02 21:36 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Petr Mladek, Rusty Russell, Jeff Epler, Jiri Kosina,
	virtualization, linux-kernel, Petr Mladek

On Sat, Jan 02, 2016 at 06:43:16AM -0500, Tejun Heo wrote:
> Hello,
> 
> On Fri, Jan 01, 2016 at 12:18:17PM +0200, Michael S. Tsirkin wrote:
> > > My initial idea was to use a dedicated workqueue. Michael S. Tsirkin
> > > suggested using a system one. Tejun Heo confirmed that the system
> > > workqueue has a pretty high concurrency level (256) by default.
> > > Therefore we need not be afraid of too long blocking.
> > 
> > Right but fill has a 1/5 second sleep on failure - *that*
> > is problematic for a system queue.
> 
> Why so?  As long as the maximum concurrently used workers are not
> high, 1/5 second or even a lot longer sleeps are completely fine.

I always thought the right way to defer executing a work queue item
is to queue delayed work, not sleep + queue work.

Doing a sleep ties up one thread for 1/5 of a second, does it not?
If so, as long as it's the only driver doing this, we'll be fine,
but if many others copy this pattern, things will
start to break, will they not?

> > > @@ -563,7 +534,7 @@ static void virtballoon_remove(struct virtio_device *vdev)
> > >  	struct virtio_balloon *vb = vdev->priv;
> > >  
> > >  	unregister_oom_notifier(&vb->nb);
> > > -	kthread_stop(vb->thread);
> > > +	cancel_work_sync(&vb->wq_work);
> > 
> > OK but since job requeues itself, cancelling like this might not be enough.
> 
> As long as there's no further external queueing, cancel_work_sync() is
> guaranteed to kill a self-requeueing work item.
> 
> Thanks.

I didn't realise this. Thanks!

Unfortunately in this case, there can be further requeueing
if a stats request arrives.

> -- 
> tejun

^ permalink raw reply	[flat|nested] 21+ messages in thread
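
Note: the difference between "sleep + queue work" and "queue delayed work" raised above can be illustrated with a small accounting model. It uses a virtual clock and only counts how long a worker thread is occupied. `retry_stats`, `model_sleep_and_requeue()` and `model_delayed_requeue()` are hypothetical names, and `RETRY_MS` is assumed to match the 1/5 second back-off mentioned in the thread.

```c
#define RETRY_MS 200	/* assumed back-off after a failed fill */

struct retry_stats {
	unsigned busy_ms;	/* time a worker thread was occupied */
	unsigned wall_ms;	/* virtual wall-clock time elapsed */
};

/* Back-off by sleeping inside the work function: the worker is held. */
static struct retry_stats model_sleep_and_requeue(unsigned failures,
						  unsigned work_ms)
{
	struct retry_stats s = { 0, 0 };
	unsigned i;

	for (i = 0; i <= failures; i++) {
		s.busy_ms += work_ms;
		s.wall_ms += work_ms;
		if (i < failures) {
			s.busy_ms += RETRY_MS;	/* sleeping still ties up the worker */
			s.wall_ms += RETRY_MS;
		}
	}
	return s;
}

/* Back-off by re-queueing with a delay: only a timer waits. */
static struct retry_stats model_delayed_requeue(unsigned failures,
						unsigned work_ms)
{
	struct retry_stats s = { 0, 0 };
	unsigned i;

	for (i = 0; i <= failures; i++) {
		s.busy_ms += work_ms;
		s.wall_ms += work_ms;
		if (i < failures)
			s.wall_ms += RETRY_MS;	/* no worker is held meanwhile */
	}
	return s;
}
```

Both variants finish at the same wall-clock time in this model. They differ only in how long a worker from the shared pool is unavailable to other work items.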

* Re: [PATCH v4 2/2] virtio_balloon: Use a workqueue instead of "vballoon" kthread
  2016-01-02 21:36         ` Michael S. Tsirkin
@ 2016-01-03 13:58           ` Tejun Heo
  -1 siblings, 0 replies; 21+ messages in thread
From: Tejun Heo @ 2016-01-03 13:58 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Petr Mladek, Rusty Russell, Jeff Epler, Jiri Kosina,
	virtualization, linux-kernel, Petr Mladek

Hello, Michael.

On Sat, Jan 02, 2016 at 11:36:03PM +0200, Michael S. Tsirkin wrote:
> > Why so?  As long as the maximum concurrently used workers are not
> > high, 1/5 second or even a lot longer sleeps are completely fine.
> 
> I always thought the right way to defer executing a work queue item
> is to queue delayed work, not sleep + queue work.

That works too and is preferable if there are gonna be a lot of work
items sleeping but it isn't different from any other blocking.

> Doing a sleep ties up one thread for 1/5 of a second, does it not?

It does.

> If so, as long as it's the only driver doing this, we'll be fine,
> but if many others copy this pattern, things will
> start to break, will they not?

The maximum concurrency on the system_wq is 256 which is pretty high,
so for most use cases, it's fine.  If high concurrency is expected,
it's better to break it out to a separate workqueue.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 2/2] virtio_balloon: Use a workqueue instead of "vballoon" kthread
  2016-01-02 21:36         ` Michael S. Tsirkin
@ 2016-01-05 14:49           ` Petr Mladek
  -1 siblings, 0 replies; 21+ messages in thread
From: Petr Mladek @ 2016-01-05 14:49 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Tejun Heo, Rusty Russell, Jeff Epler, Jiri Kosina,
	virtualization, linux-kernel

On Sat 2016-01-02 23:36:03, Michael S. Tsirkin wrote:
> On Sat, Jan 02, 2016 at 06:43:16AM -0500, Tejun Heo wrote:
> > On Fri, Jan 01, 2016 at 12:18:17PM +0200, Michael S. Tsirkin wrote:
> > > > My initial idea was to use a dedicated workqueue. Michael S. Tsirkin
> > > > @@ -563,7 +534,7 @@ static void virtballoon_remove(struct virtio_device *vdev)
> > > >  	struct virtio_balloon *vb = vdev->priv;
> > > >  
> > > >  	unregister_oom_notifier(&vb->nb);
> > > > -	kthread_stop(vb->thread);
> > > > +	cancel_work_sync(&vb->wq_work);
> > > 
> > > OK but since job requeues itself, cancelling like this might not be enough.
> > 
> > As long as there's no further external queueing, cancel_work_sync() is
> > guaranteed to kill a self-requeueing work item.
> > 
> > Thanks.
> 
> I didn't realise this. Thanks!
> 
> Unfortunately in this case, there can be further requeueing
> if a stats request arrives.

Is there any point after which stats requests are guaranteed to be
disabled? I am not 100% sure, but it might be after the reset() call:

    vb->vdev->config->reset(vb->vdev);

Then we could split the kthread into two work items: resizing and stats.
The resizing work must still be canceled before leaking the balloon,
but the stats work could be canceled after the reset() call.

In fact, the solution with two work items looks even cleaner.


Thanks for feedback,
Petr

^ permalink raw reply	[flat|nested] 21+ messages in thread
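
Note: the teardown ordering proposed above (cancel the resizing work, leak the balloon, reset the device, then cancel the stats work) can be sketched as a model. This is not driver code. `teardown_model`, `model_stats_request()` and `model_remove()` are hypothetical names, and the model assumes, as suggested in the thread, that no stats request can arrive once reset() has returned.

```c
#include <stdbool.h>

struct teardown_model {
	bool device_live;	/* callbacks may still fire */
	bool resize_pending;	/* resizing work is queued */
	bool stats_pending;	/* stats work is queued */
};

/* Models the stats virtqueue callback queueing the stats work. */
static bool model_stats_request(struct teardown_model *d)
{
	if (!d->device_live)
		return false;
	d->stats_pending = true;
	return true;
}

/*
 * Models device removal with two work items. stats_arrives_late
 * simulates a host stats request that lands in the middle of teardown.
 */
static void model_remove(struct teardown_model *d, bool stats_arrives_late)
{
	d->resize_pending = false;	/* 1. cancel the resizing work */
	/* 2. the balloon would be leaked here */
	if (stats_arrives_late)
		model_stats_request(d);	/* the host can still ask for stats */
	d->device_live = false;		/* 3. reset(): callbacks stop */
	d->stats_pending = false;	/* 4. cancel the stats work last */
}
```

Canceling the stats work only after the reset step means a late request cannot leave the item queued once the structure is freed, which is the race raised earlier in the thread.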

* Re: [PATCH v4 2/2] virtio_balloon: Use a workqueue instead of "vballoon" kthread
  2016-01-05 14:49           ` Petr Mladek
@ 2016-01-05 15:37             ` Michael S. Tsirkin
  -1 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2016-01-05 15:37 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Tejun Heo, Rusty Russell, Jeff Epler, Jiri Kosina,
	virtualization, linux-kernel

On Tue, Jan 05, 2016 at 03:49:18PM +0100, Petr Mladek wrote:
> On Sat 2016-01-02 23:36:03, Michael S. Tsirkin wrote:
> > On Sat, Jan 02, 2016 at 06:43:16AM -0500, Tejun Heo wrote:
> > > On Fri, Jan 01, 2016 at 12:18:17PM +0200, Michael S. Tsirkin wrote:
> > > > > My initial idea was to use a dedicated workqueue. Michael S. Tsirkin
> > > > > @@ -563,7 +534,7 @@ static void virtballoon_remove(struct virtio_device *vdev)
> > > > >  	struct virtio_balloon *vb = vdev->priv;
> > > > >  
> > > > >  	unregister_oom_notifier(&vb->nb);
> > > > > -	kthread_stop(vb->thread);
> > > > > +	cancel_work_sync(&vb->wq_work);
> > > > 
> > > > OK but since job requeues itself, cancelling like this might not be enough.
> > > 
> > > As long as there's no further external queueing, cancel_work_sync() is
> > > guaranteed to kill a self-requeueing work item.
> > > 
> > > Thanks.
> > 
> > I didn't realise this. Thanks!
> > 
> > Unfortunately in this case, there can be further requeueing
> > if a stats request arrives.
> 
> Is there any point after which stats requests are guaranteed to be
> disabled? I am not 100% sure, but it might be after the reset() call:
> 
>     vb->vdev->config->reset(vb->vdev);

Yes.

> Then we could split the kthread into two work items: resizing and stats.
> The resizing work must still be canceled before leaking the balloon,
> but the stats work could be canceled after the reset() call.
> 
> In fact, the solution with two work items looks even cleaner.
> 
> 
> Thanks for feedback,
> Petr

I agree - in fact, not blocking stats calls while inflate is blocked
would be very nice. As things would then happen in parallel, we need
to be careful with locking.

That would be a good reason to switch to wq.

-- 
MST
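
As a rough illustration of the ordering discussed above (cancel the
resizing work before leaking the balloon, cancel the stats work only
after the reset() call), here is a toy userspace model in C. It is not
kernel code and not the eventual patch: toy_device, toy_work,
toy_queue(), toy_cancel_sync(), toy_reset(), toy_stats_request() and
toy_remove() are hypothetical names invented for this sketch. It is
single-threaded and models only the "no further external queueing"
condition, not concurrency or locking.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; none of these are kernel types or kernel APIs. */
struct toy_device {
	bool stats_enabled;	/* host may still send stats requests */
};

struct toy_work {
	bool pending;		/* queued and waiting to run */
};

/* Models queueing a work item: fails if it is already pending. */
static bool toy_queue(struct toy_work *w)
{
	if (w->pending)
		return false;
	w->pending = true;
	return true;
}

/* Models the state after a synchronous cancel: the item is idle. */
static void toy_cancel_sync(struct toy_work *w)
{
	w->pending = false;
}

/* Models the reset() call: the device stops raising stats requests. */
static void toy_reset(struct toy_device *d)
{
	d->stats_enabled = false;
}

/* Models the stats callback: an external source that queues the work. */
static void toy_stats_request(struct toy_device *d, struct toy_work *stats)
{
	if (d->stats_enabled)
		toy_queue(stats);
}

/*
 * Tear down with the chosen ordering and report whether the stats
 * work is still pending afterwards (true means the cancel was undone).
 */
static bool toy_remove(bool cancel_stats_after_reset)
{
	struct toy_device dev = { .stats_enabled = true };
	struct toy_work resize = { .pending = true };
	struct toy_work stats = { .pending = true };

	/* The resizing work must be idle before the balloon is leaked. */
	toy_cancel_sync(&resize);

	if (!cancel_stats_after_reset) {
		toy_cancel_sync(&stats);
		toy_stats_request(&dev, &stats);	/* late request requeues */
		toy_reset(&dev);
	} else {
		toy_stats_request(&dev, &stats);	/* already pending, no change */
		toy_reset(&dev);
		toy_cancel_sync(&stats);
		toy_stats_request(&dev, &stats);	/* ignored: source disabled */
	}

	assert(!resize.pending);
	return stats.pending;
}
```

In this model, toy_remove(false) returns true because a stats request
that arrives after the cancel queues the work again, while
toy_remove(true) returns false because the only external source of
queueing has been disabled before the cancel.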

Thread overview: 21+ messages
2015-12-04 13:37 [PATCH v4 0/2] virtio_balloon: Fix restore and convert to workqueue Petr Mladek
2015-12-04 13:37 ` [PATCH v4 1/2] virtio_balloon: Restore the entire balloon after the system freeze Petr Mladek
2016-01-01 10:11   ` Michael S. Tsirkin
2016-01-01 10:11     ` Michael S. Tsirkin
2016-01-01 10:11     ` Michael S. Tsirkin
2016-01-01 10:11       ` Michael S. Tsirkin
2015-12-04 13:37 ` Petr Mladek
2015-12-04 13:37 ` [PATCH v4 2/2] virtio_balloon: Use a workqueue instead of "vballoon" kthread Petr Mladek
2015-12-04 13:37 ` Petr Mladek
2016-01-01 10:18   ` Michael S. Tsirkin
2016-01-01 10:18     ` Michael S. Tsirkin
2016-01-02 11:43     ` Tejun Heo
2016-01-02 11:43       ` Tejun Heo
2016-01-02 21:36       ` Michael S. Tsirkin
2016-01-02 21:36         ` Michael S. Tsirkin
2016-01-03 13:58         ` Tejun Heo
2016-01-03 13:58           ` Tejun Heo
2016-01-05 14:49         ` Petr Mladek
2016-01-05 14:49           ` Petr Mladek
2016-01-05 15:37           ` Michael S. Tsirkin
2016-01-05 15:37             ` Michael S. Tsirkin