linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/2] virtio-balloon: some improvements
@ 2018-08-03  8:32 Wei Wang
  2018-08-03  8:32 ` [PATCH v3 1/2] virtio-balloon: remove BUG() in init_vqs Wei Wang
  2018-08-03  8:32 ` [PATCH v3 2/2] virtio_balloon: replace oom notifier with shrinker Wei Wang
  0 siblings, 2 replies; 11+ messages in thread
From: Wei Wang @ 2018-08-03  8:32 UTC (permalink / raw)
  To: virtio-dev, linux-kernel, virtualization, linux-mm, mst, mhocko,
	akpm, penguin-kernel
  Cc: wei.w.wang

This series is split from the "Virtio-balloon: support free page
reporting" series to make some improvements.

ChangeLog:
v2->v3:
- shrink the balloon pages according to the amount requested by the
  claimer, instead of using a user specified number;
v1->v2:
- register the shrinker when VIRTIO_BALLOON_F_DEFLATE_ON_OOM is
  negotiated.

Wei Wang (2):
  virtio-balloon: remove BUG() in init_vqs
  virtio_balloon: replace oom notifier with shrinker

 drivers/virtio/virtio_balloon.c | 121 ++++++++++++++++++++++------------------
 1 file changed, 67 insertions(+), 54 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v3 1/2] virtio-balloon: remove BUG() in init_vqs
  2018-08-03  8:32 [PATCH v3 0/2] virtio-balloon: some improvements Wei Wang
@ 2018-08-03  8:32 ` Wei Wang
  2018-08-03  8:32 ` [PATCH v3 2/2] virtio_balloon: replace oom notifier with shrinker Wei Wang
  1 sibling, 0 replies; 11+ messages in thread
From: Wei Wang @ 2018-08-03  8:32 UTC (permalink / raw)
  To: virtio-dev, linux-kernel, virtualization, linux-mm, mst, mhocko,
	akpm, penguin-kernel
  Cc: wei.w.wang

It's a bit overkill to use BUG when failing to add an entry to the
stats_vq in init_vqs. So remove it and just return the error to the
caller to bail out nicely.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/virtio/virtio_balloon.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 3988c09..8100e77 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -455,9 +455,13 @@ static int init_vqs(struct virtio_balloon *vb)
 		num_stats = update_balloon_stats(vb);
 
 		sg_init_one(&sg, vb->stats, sizeof(vb->stats[0]) * num_stats);
-		if (virtqueue_add_outbuf(vb->stats_vq, &sg, 1, vb, GFP_KERNEL)
-		    < 0)
-			BUG();
+		err = virtqueue_add_outbuf(vb->stats_vq, &sg, 1, vb,
+					   GFP_KERNEL);
+		if (err) {
+			dev_warn(&vb->vdev->dev, "%s: add stat_vq failed\n",
+				 __func__);
+			return err;
+		}
 		virtqueue_kick(vb->stats_vq);
 	}
 	return 0;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v3 2/2] virtio_balloon: replace oom notifier with shrinker
  2018-08-03  8:32 [PATCH v3 0/2] virtio-balloon: some improvements Wei Wang
  2018-08-03  8:32 ` [PATCH v3 1/2] virtio-balloon: remove BUG() in init_vqs Wei Wang
@ 2018-08-03  8:32 ` Wei Wang
  2018-08-03 12:11   ` Tetsuo Handa
  2018-08-03 19:15   ` Michael S. Tsirkin
  1 sibling, 2 replies; 11+ messages in thread
From: Wei Wang @ 2018-08-03  8:32 UTC (permalink / raw)
  To: virtio-dev, linux-kernel, virtualization, linux-mm, mst, mhocko,
	akpm, penguin-kernel
  Cc: wei.w.wang

The OOM notifier is getting deprecated to use for the reasons:
- As a callout from the oom context, it is too subtle and easy to
  generate bugs and corner cases which are hard to track;
- It is called too late (after the reclaiming has been performed).
  Drivers with large amuont of reclaimable memory is expected to
  release them at an early stage of memory pressure;
- The notifier callback isn't aware of oom contrains;
Link: https://lkml.org/lkml/2018/7/12/314

This patch replaces the virtio-balloon oom notifier with a shrinker
to release balloon pages on memory pressure. The balloon pages are
given back to mm adaptively by returning the number of pages that the
reclaimer is asking for (i.e. sc->nr_to_scan).

Currently the max possible value of sc->nr_to_scan passed to the balloon
shrinker is SHRINK_BATCH, which is 128. This is smaller than the
limitation that only VIRTIO_BALLOON_ARRAY_PFNS_MAX (256) pages can be
returned via one invocation of leak_balloon. But this patch still
considers the case that SHRINK_BATCH or shrinker->batch could be changed
to a value larger than VIRTIO_BALLOON_ARRAY_PFNS_MAX, which will need to
do multiple invocations of leak_balloon.

Historically, the feature VIRTIO_BALLOON_F_DEFLATE_ON_OOM has been used
to release balloon pages on OOM. We continue to use this feature bit for
the shrinker, so the shrinker is only registered when this feature bit
has been negotiated with host.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 drivers/virtio/virtio_balloon.c | 111 ++++++++++++++++++++++------------------
 1 file changed, 60 insertions(+), 51 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 8100e77..612a359 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -27,7 +27,6 @@
 #include <linux/slab.h>
 #include <linux/module.h>
 #include <linux/balloon_compaction.h>
-#include <linux/oom.h>
 #include <linux/wait.h>
 #include <linux/mm.h>
 #include <linux/mount.h>
@@ -40,13 +39,8 @@
  */
 #define VIRTIO_BALLOON_PAGES_PER_PAGE (unsigned)(PAGE_SIZE >> VIRTIO_BALLOON_PFN_SHIFT)
 #define VIRTIO_BALLOON_ARRAY_PFNS_MAX 256
-#define OOM_VBALLOON_DEFAULT_PAGES 256
 #define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
 
-static int oom_pages = OOM_VBALLOON_DEFAULT_PAGES;
-module_param(oom_pages, int, S_IRUSR | S_IWUSR);
-MODULE_PARM_DESC(oom_pages, "pages to free on OOM");
-
 #ifdef CONFIG_BALLOON_COMPACTION
 static struct vfsmount *balloon_mnt;
 #endif
@@ -86,8 +80,8 @@ struct virtio_balloon {
 	/* Memory statistics */
 	struct virtio_balloon_stat stats[VIRTIO_BALLOON_S_NR];
 
-	/* To register callback in oom notifier call chain */
-	struct notifier_block nb;
+	/* To register a shrinker to shrink memory upon memory pressure */
+	struct shrinker shrinker;
 };
 
 static struct virtio_device_id id_table[] = {
@@ -365,38 +359,6 @@ static void update_balloon_size(struct virtio_balloon *vb)
 		      &actual);
 }
 
-/*
- * virtballoon_oom_notify - release pages when system is under severe
- *			    memory pressure (called from out_of_memory())
- * @self : notifier block struct
- * @dummy: not used
- * @parm : returned - number of freed pages
- *
- * The balancing of memory by use of the virtio balloon should not cause
- * the termination of processes while there are pages in the balloon.
- * If virtio balloon manages to release some memory, it will make the
- * system return and retry the allocation that forced the OOM killer
- * to run.
- */
-static int virtballoon_oom_notify(struct notifier_block *self,
-				  unsigned long dummy, void *parm)
-{
-	struct virtio_balloon *vb;
-	unsigned long *freed;
-	unsigned num_freed_pages;
-
-	vb = container_of(self, struct virtio_balloon, nb);
-	if (!virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
-		return NOTIFY_OK;
-
-	freed = parm;
-	num_freed_pages = leak_balloon(vb, oom_pages);
-	update_balloon_size(vb);
-	*freed += num_freed_pages;
-
-	return NOTIFY_OK;
-}
-
 static void update_balloon_stats_func(struct work_struct *work)
 {
 	struct virtio_balloon *vb;
@@ -550,6 +512,53 @@ static struct file_system_type balloon_fs = {
 
 #endif /* CONFIG_BALLOON_COMPACTION */
 
+static unsigned long virtio_balloon_shrinker_scan(struct shrinker *shrinker,
+						  struct shrink_control *sc)
+{
+	unsigned long pages_to_free, pages_freed = 0;
+	struct virtio_balloon *vb = container_of(shrinker,
+					struct virtio_balloon, shrinker);
+
+	pages_to_free = sc->nr_to_scan * VIRTIO_BALLOON_PAGES_PER_PAGE;
+
+	/*
+	 * One invocation of leak_balloon can deflate at most
+	 * VIRTIO_BALLOON_ARRAY_PFNS_MAX balloon pages, so we call it
+	 * multiple times to deflate pages till reaching pages_to_free.
+	 */
+	while (vb->num_pages && pages_to_free) {
+		pages_to_free -= pages_freed;
+		pages_freed += leak_balloon(vb, pages_to_free);
+	}
+	update_balloon_size(vb);
+
+	return pages_freed / VIRTIO_BALLOON_PAGES_PER_PAGE;
+}
+
+static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker,
+						   struct shrink_control *sc)
+{
+	struct virtio_balloon *vb = container_of(shrinker,
+					struct virtio_balloon, shrinker);
+
+	return vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
+}
+
+static void virtio_balloon_unregister_shrinker(struct virtio_balloon *vb)
+{
+	unregister_shrinker(&vb->shrinker);
+}
+
+static int virtio_balloon_register_shrinker(struct virtio_balloon *vb)
+{
+	vb->shrinker.scan_objects = virtio_balloon_shrinker_scan;
+	vb->shrinker.count_objects = virtio_balloon_shrinker_count;
+	vb->shrinker.batch = 0;
+	vb->shrinker.seeks = DEFAULT_SEEKS;
+
+	return register_shrinker(&vb->shrinker);
+}
+
 static int virtballoon_probe(struct virtio_device *vdev)
 {
 	struct virtio_balloon *vb;
@@ -582,17 +591,10 @@ static int virtballoon_probe(struct virtio_device *vdev)
 	if (err)
 		goto out_free_vb;
 
-	vb->nb.notifier_call = virtballoon_oom_notify;
-	vb->nb.priority = VIRTBALLOON_OOM_NOTIFY_PRIORITY;
-	err = register_oom_notifier(&vb->nb);
-	if (err < 0)
-		goto out_del_vqs;
-
 #ifdef CONFIG_BALLOON_COMPACTION
 	balloon_mnt = kern_mount(&balloon_fs);
 	if (IS_ERR(balloon_mnt)) {
 		err = PTR_ERR(balloon_mnt);
-		unregister_oom_notifier(&vb->nb);
 		goto out_del_vqs;
 	}
 
@@ -601,13 +603,20 @@ static int virtballoon_probe(struct virtio_device *vdev)
 	if (IS_ERR(vb->vb_dev_info.inode)) {
 		err = PTR_ERR(vb->vb_dev_info.inode);
 		kern_unmount(balloon_mnt);
-		unregister_oom_notifier(&vb->nb);
 		vb->vb_dev_info.inode = NULL;
 		goto out_del_vqs;
 	}
 	vb->vb_dev_info.inode->i_mapping->a_ops = &balloon_aops;
 #endif
-
+	/*
+	 * We continue to use VIRTIO_BALLOON_F_DEFLATE_ON_OOM to decide if a
+	 * shrinker needs to be registered to relieve memory pressure.
+	 */
+	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM)) {
+		err = virtio_balloon_register_shrinker(vb);
+		if (err)
+			goto out_del_vqs;
+	}
 	virtio_device_ready(vdev);
 
 	if (towards_target(vb))
@@ -639,8 +648,8 @@ static void virtballoon_remove(struct virtio_device *vdev)
 {
 	struct virtio_balloon *vb = vdev->priv;
 
-	unregister_oom_notifier(&vb->nb);
-
+	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
+		virtio_balloon_unregister_shrinker(vb);
 	spin_lock_irq(&vb->stop_update_lock);
 	vb->stop_update = true;
 	spin_unlock_irq(&vb->stop_update_lock);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/2] virtio_balloon: replace oom notifier with shrinker
  2018-08-03  8:32 ` [PATCH v3 2/2] virtio_balloon: replace oom notifier with shrinker Wei Wang
@ 2018-08-03 12:11   ` Tetsuo Handa
  2018-08-06  9:56     ` Wei Wang
  2018-08-03 19:15   ` Michael S. Tsirkin
  1 sibling, 1 reply; 11+ messages in thread
From: Tetsuo Handa @ 2018-08-03 12:11 UTC (permalink / raw)
  To: Wei Wang, virtio-dev, linux-kernel, virtualization, linux-mm,
	mst, mhocko, akpm

On 2018/08/03 17:32, Wei Wang wrote:
> +static int virtio_balloon_register_shrinker(struct virtio_balloon *vb)
> +{
> +	vb->shrinker.scan_objects = virtio_balloon_shrinker_scan;
> +	vb->shrinker.count_objects = virtio_balloon_shrinker_count;
> +	vb->shrinker.batch = 0;
> +	vb->shrinker.seeks = DEFAULT_SEEKS;

Why flags field is not set? If vb is allocated by kmalloc(GFP_KERNEL)
and is nowhere zero-cleared, KASAN would complain it.

> +
> +	return register_shrinker(&vb->shrinker);
> +}

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/2] virtio_balloon: replace oom notifier with shrinker
  2018-08-03  8:32 ` [PATCH v3 2/2] virtio_balloon: replace oom notifier with shrinker Wei Wang
  2018-08-03 12:11   ` Tetsuo Handa
@ 2018-08-03 19:15   ` Michael S. Tsirkin
  2018-08-06  3:29     ` Wei Wang
  1 sibling, 1 reply; 11+ messages in thread
From: Michael S. Tsirkin @ 2018-08-03 19:15 UTC (permalink / raw)
  To: Wei Wang
  Cc: virtio-dev, linux-kernel, virtualization, linux-mm, mhocko, akpm,
	penguin-kernel

On Fri, Aug 03, 2018 at 04:32:26PM +0800, Wei Wang wrote:
> The OOM notifier is getting deprecated to use for the reasons:
> - As a callout from the oom context, it is too subtle and easy to
>   generate bugs and corner cases which are hard to track;
> - It is called too late (after the reclaiming has been performed).
>   Drivers with large amuont of reclaimable memory is expected to
>   release them at an early stage of memory pressure;
> - The notifier callback isn't aware of oom contrains;
> Link: https://lkml.org/lkml/2018/7/12/314
> 
> This patch replaces the virtio-balloon oom notifier with a shrinker
> to release balloon pages on memory pressure. The balloon pages are
> given back to mm adaptively by returning the number of pages that the
> reclaimer is asking for (i.e. sc->nr_to_scan).
> 
> Currently the max possible value of sc->nr_to_scan passed to the balloon
> shrinker is SHRINK_BATCH, which is 128. This is smaller than the
> limitation that only VIRTIO_BALLOON_ARRAY_PFNS_MAX (256) pages can be
> returned via one invocation of leak_balloon. But this patch still
> considers the case that SHRINK_BATCH or shrinker->batch could be changed
> to a value larger than VIRTIO_BALLOON_ARRAY_PFNS_MAX, which will need to
> do multiple invocations of leak_balloon.
> 
> Historically, the feature VIRTIO_BALLOON_F_DEFLATE_ON_OOM has been used
> to release balloon pages on OOM. We continue to use this feature bit for
> the shrinker, so the shrinker is only registered when this feature bit
> has been negotiated with host.
> 
> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>


Could you add data at how was this tested and how did guest
behaviour change. Which configurations see an improvement?

> ---
>  drivers/virtio/virtio_balloon.c | 111 ++++++++++++++++++++++------------------
>  1 file changed, 60 insertions(+), 51 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 8100e77..612a359 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -27,7 +27,6 @@
>  #include <linux/slab.h>
>  #include <linux/module.h>
>  #include <linux/balloon_compaction.h>
> -#include <linux/oom.h>
>  #include <linux/wait.h>
>  #include <linux/mm.h>
>  #include <linux/mount.h>
> @@ -40,13 +39,8 @@
>   */
>  #define VIRTIO_BALLOON_PAGES_PER_PAGE (unsigned)(PAGE_SIZE >> VIRTIO_BALLOON_PFN_SHIFT)
>  #define VIRTIO_BALLOON_ARRAY_PFNS_MAX 256
> -#define OOM_VBALLOON_DEFAULT_PAGES 256
>  #define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
>  
> -static int oom_pages = OOM_VBALLOON_DEFAULT_PAGES;
> -module_param(oom_pages, int, S_IRUSR | S_IWUSR);
> -MODULE_PARM_DESC(oom_pages, "pages to free on OOM");
> -
>  #ifdef CONFIG_BALLOON_COMPACTION
>  static struct vfsmount *balloon_mnt;
>  #endif
> @@ -86,8 +80,8 @@ struct virtio_balloon {
>  	/* Memory statistics */
>  	struct virtio_balloon_stat stats[VIRTIO_BALLOON_S_NR];
>  
> -	/* To register callback in oom notifier call chain */
> -	struct notifier_block nb;
> +	/* To register a shrinker to shrink memory upon memory pressure */
> +	struct shrinker shrinker;
>  };
>  
>  static struct virtio_device_id id_table[] = {
> @@ -365,38 +359,6 @@ static void update_balloon_size(struct virtio_balloon *vb)
>  		      &actual);
>  }
>  
> -/*
> - * virtballoon_oom_notify - release pages when system is under severe
> - *			    memory pressure (called from out_of_memory())
> - * @self : notifier block struct
> - * @dummy: not used
> - * @parm : returned - number of freed pages
> - *
> - * The balancing of memory by use of the virtio balloon should not cause
> - * the termination of processes while there are pages in the balloon.
> - * If virtio balloon manages to release some memory, it will make the
> - * system return and retry the allocation that forced the OOM killer
> - * to run.
> - */
> -static int virtballoon_oom_notify(struct notifier_block *self,
> -				  unsigned long dummy, void *parm)
> -{
> -	struct virtio_balloon *vb;
> -	unsigned long *freed;
> -	unsigned num_freed_pages;
> -
> -	vb = container_of(self, struct virtio_balloon, nb);
> -	if (!virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
> -		return NOTIFY_OK;
> -
> -	freed = parm;
> -	num_freed_pages = leak_balloon(vb, oom_pages);
> -	update_balloon_size(vb);
> -	*freed += num_freed_pages;
> -
> -	return NOTIFY_OK;
> -}
> -
>  static void update_balloon_stats_func(struct work_struct *work)
>  {
>  	struct virtio_balloon *vb;
> @@ -550,6 +512,53 @@ static struct file_system_type balloon_fs = {
>  
>  #endif /* CONFIG_BALLOON_COMPACTION */
>  
> +static unsigned long virtio_balloon_shrinker_scan(struct shrinker *shrinker,
> +						  struct shrink_control *sc)
> +{
> +	unsigned long pages_to_free, pages_freed = 0;
> +	struct virtio_balloon *vb = container_of(shrinker,
> +					struct virtio_balloon, shrinker);
> +
> +	pages_to_free = sc->nr_to_scan * VIRTIO_BALLOON_PAGES_PER_PAGE;
> +
> +	/*
> +	 * One invocation of leak_balloon can deflate at most
> +	 * VIRTIO_BALLOON_ARRAY_PFNS_MAX balloon pages, so we call it
> +	 * multiple times to deflate pages till reaching pages_to_free.
> +	 */
> +	while (vb->num_pages && pages_to_free) {
> +		pages_to_free -= pages_freed;
> +		pages_freed += leak_balloon(vb, pages_to_free);
> +	}
> +	update_balloon_size(vb);
> +
> +	return pages_freed / VIRTIO_BALLOON_PAGES_PER_PAGE;
> +}
> +
> +static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker,
> +						   struct shrink_control *sc)
> +{
> +	struct virtio_balloon *vb = container_of(shrinker,
> +					struct virtio_balloon, shrinker);
> +
> +	return vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> +}
> +
> +static void virtio_balloon_unregister_shrinker(struct virtio_balloon *vb)
> +{
> +	unregister_shrinker(&vb->shrinker);
> +}
> +
> +static int virtio_balloon_register_shrinker(struct virtio_balloon *vb)
> +{
> +	vb->shrinker.scan_objects = virtio_balloon_shrinker_scan;
> +	vb->shrinker.count_objects = virtio_balloon_shrinker_count;
> +	vb->shrinker.batch = 0;
> +	vb->shrinker.seeks = DEFAULT_SEEKS;
> +
> +	return register_shrinker(&vb->shrinker);
> +}
> +
>  static int virtballoon_probe(struct virtio_device *vdev)
>  {
>  	struct virtio_balloon *vb;
> @@ -582,17 +591,10 @@ static int virtballoon_probe(struct virtio_device *vdev)
>  	if (err)
>  		goto out_free_vb;
>  
> -	vb->nb.notifier_call = virtballoon_oom_notify;
> -	vb->nb.priority = VIRTBALLOON_OOM_NOTIFY_PRIORITY;
> -	err = register_oom_notifier(&vb->nb);
> -	if (err < 0)
> -		goto out_del_vqs;
> -
>  #ifdef CONFIG_BALLOON_COMPACTION
>  	balloon_mnt = kern_mount(&balloon_fs);
>  	if (IS_ERR(balloon_mnt)) {
>  		err = PTR_ERR(balloon_mnt);
> -		unregister_oom_notifier(&vb->nb);
>  		goto out_del_vqs;
>  	}
>  
> @@ -601,13 +603,20 @@ static int virtballoon_probe(struct virtio_device *vdev)
>  	if (IS_ERR(vb->vb_dev_info.inode)) {
>  		err = PTR_ERR(vb->vb_dev_info.inode);
>  		kern_unmount(balloon_mnt);
> -		unregister_oom_notifier(&vb->nb);
>  		vb->vb_dev_info.inode = NULL;
>  		goto out_del_vqs;
>  	}
>  	vb->vb_dev_info.inode->i_mapping->a_ops = &balloon_aops;
>  #endif
> -
> +	/*
> +	 * We continue to use VIRTIO_BALLOON_F_DEFLATE_ON_OOM to decide if a
> +	 * shrinker needs to be registered to relieve memory pressure.
> +	 */
> +	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM)) {
> +		err = virtio_balloon_register_shrinker(vb);
> +		if (err)
> +			goto out_del_vqs;
> +	}
>  	virtio_device_ready(vdev);
>  
>  	if (towards_target(vb))
> @@ -639,8 +648,8 @@ static void virtballoon_remove(struct virtio_device *vdev)
>  {
>  	struct virtio_balloon *vb = vdev->priv;
>  
> -	unregister_oom_notifier(&vb->nb);
> -
> +	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
> +		virtio_balloon_unregister_shrinker(vb);
>  	spin_lock_irq(&vb->stop_update_lock);
>  	vb->stop_update = true;
>  	spin_unlock_irq(&vb->stop_update_lock);
> -- 
> 2.7.4

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/2] virtio_balloon: replace oom notifier with shrinker
  2018-08-03 19:15   ` Michael S. Tsirkin
@ 2018-08-06  3:29     ` Wei Wang
  0 siblings, 0 replies; 11+ messages in thread
From: Wei Wang @ 2018-08-06  3:29 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, linux-kernel, virtualization, linux-mm, mhocko, akpm,
	penguin-kernel

On 08/04/2018 03:15 AM, Michael S. Tsirkin wrote:
> On Fri, Aug 03, 2018 at 04:32:26PM +0800, Wei Wang wrote:
>> The OOM notifier is getting deprecated to use for the reasons:
>> - As a callout from the oom context, it is too subtle and easy to
>>    generate bugs and corner cases which are hard to track;
>> - It is called too late (after the reclaiming has been performed).
>>    Drivers with large amuont of reclaimable memory is expected to
>>    release them at an early stage of memory pressure;
>> - The notifier callback isn't aware of oom contrains;
>> Link: https://lkml.org/lkml/2018/7/12/314
>>
>> This patch replaces the virtio-balloon oom notifier with a shrinker
>> to release balloon pages on memory pressure. The balloon pages are
>> given back to mm adaptively by returning the number of pages that the
>> reclaimer is asking for (i.e. sc->nr_to_scan).
>>
>> Currently the max possible value of sc->nr_to_scan passed to the balloon
>> shrinker is SHRINK_BATCH, which is 128. This is smaller than the
>> limitation that only VIRTIO_BALLOON_ARRAY_PFNS_MAX (256) pages can be
>> returned via one invocation of leak_balloon. But this patch still
>> considers the case that SHRINK_BATCH or shrinker->batch could be changed
>> to a value larger than VIRTIO_BALLOON_ARRAY_PFNS_MAX, which will need to
>> do multiple invocations of leak_balloon.
>>
>> Historically, the feature VIRTIO_BALLOON_F_DEFLATE_ON_OOM has been used
>> to release balloon pages on OOM. We continue to use this feature bit for
>> the shrinker, so the shrinker is only registered when this feature bit
>> has been negotiated with host.
>>
>> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
>> Cc: Michael S. Tsirkin <mst@redhat.com>
>> Cc: Michal Hocko <mhocko@kernel.org>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>
> Could you add data at how was this tested and how did guest
> behaviour change. Which configurations see an improvement?
>

Yes. Please see the differences from the "*1" and "*2" cases below.

Taking this chance, I use "*2" and "*3" to show Michal etc the 
differences of applying and not applying the shrinker fix patch here: 
https://lkml.org/lkml/2018/8/3/384


*1. V3 patches
1)After inflating some amount of memory, actual=1000001536 Bytes
free -m
               total        used        free      shared buff/cache   
available
Mem:           7975        7289         514          10 171         447
Swap:         10236           0       10236

2) dd if=478MB_file of=/dev/null, actual=1058721792 Bytes
free -m
               total        used        free      shared buff/cache   
available
Mem:           7975        7233         102          10 639         475
Swap:         10236           0       10236

The advantage is that the inflated pages are given back to mm based on 
the number, i.e. ~56MB(diff "actual" above) of the reclaimer is asking 
for. This is more adaptive.



*2. V2 paches, balloon_pages_to_shrink=1000000 pages (around 4GB), with 
the shrinker fix patches applied.
1)After inflating some amount of memory, actual=1000001536 Bytes
free -m
               total        used        free      shared buff/cache   
available
Mem:           7975        7288         530          10 157         455
Swap:         10236           0       10236

2)dd if=478MB_file of=/dev/null, actual=5096001536 Bytes
free -m
               total        used        free      shared buff/cache   
available
Mem:           7975        3381        3953          10 640        4327
Swap:         10236           0       10236

In the above example, we set 4GB to shrink to make the difference 
obvious. Though the claimer only needs to reclaim ~56MB memory, 4GB 
inflated pages are given back to mm, which is unnecessary. From the 
user's perspective, it has no idea of how many pages to given back at 
the time of setting the module parameter (balloon_pages_to_shrink). So I 
think the above "*1" is better.



*3.  V2 paches, balloon_pages_to_shrink=1000000 pages (around 4GB), 
without the shrinker fix patches applied.
1) After inflating some amount of memory, actual=1000001536 Bytes
free -m
                total        used        free      shared buff/cache   
available
Mem:           7975        7292         524          10 158         450
Swap:         10236           0       10236

2) dd if=478MB_file of=/dev/null, actual=8589934592 Bytes
free -m
              total        used        free      shared  buff/cache 
available
Mem:           7975          53        7281          10 640        7656
Swap:         10236           0       10236

Compared to *2, all the balloon pages are shrunk, but users expect 4GB 
to shrink. The reason is that do_slab_shrink has a mistake in 
calculating schrinkctl->nr_scanned, which should be the actual number of 
pages that the shrinker has freed, but do slab_shrink still treat that 
value as 128 (but 4GB has actually been freed).


Best,
Wei

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/2] virtio_balloon: replace oom notifier with shrinker
  2018-08-03 12:11   ` Tetsuo Handa
@ 2018-08-06  9:56     ` Wei Wang
  2018-08-06 10:29       ` Tetsuo Handa
  0 siblings, 1 reply; 11+ messages in thread
From: Wei Wang @ 2018-08-06  9:56 UTC (permalink / raw)
  To: Tetsuo Handa, virtio-dev, linux-kernel, virtualization, linux-mm,
	mst, mhocko, akpm

On 08/03/2018 08:11 PM, Tetsuo Handa wrote:
> On 2018/08/03 17:32, Wei Wang wrote:
>> +static int virtio_balloon_register_shrinker(struct virtio_balloon *vb)
>> +{
>> +	vb->shrinker.scan_objects = virtio_balloon_shrinker_scan;
>> +	vb->shrinker.count_objects = virtio_balloon_shrinker_count;
>> +	vb->shrinker.batch = 0;
>> +	vb->shrinker.seeks = DEFAULT_SEEKS;
> Why flags field is not set? If vb is allocated by kmalloc(GFP_KERNEL)
> and is nowhere zero-cleared, KASAN would complain it.

Could you point where in the code that would complain it?
I only see two shrinker flags (NUMA_AWARE and MEMCG_AWARE), and they 
seem not related to that.


Best,
Wei

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/2] virtio_balloon: replace oom notifier with shrinker
  2018-08-06  9:56     ` Wei Wang
@ 2018-08-06 10:29       ` Tetsuo Handa
  2018-08-06 12:44         ` Wang, Wei W
  0 siblings, 1 reply; 11+ messages in thread
From: Tetsuo Handa @ 2018-08-06 10:29 UTC (permalink / raw)
  To: Wei Wang
  Cc: virtio-dev, linux-kernel, virtualization, linux-mm, mst, mhocko, akpm

On 2018/08/06 18:56, Wei Wang wrote:
> On 08/03/2018 08:11 PM, Tetsuo Handa wrote:
>> On 2018/08/03 17:32, Wei Wang wrote:
>>> +static int virtio_balloon_register_shrinker(struct virtio_balloon *vb)
>>> +{
>>> +    vb->shrinker.scan_objects = virtio_balloon_shrinker_scan;
>>> +    vb->shrinker.count_objects = virtio_balloon_shrinker_count;
>>> +    vb->shrinker.batch = 0;
>>> +    vb->shrinker.seeks = DEFAULT_SEEKS;
>> Why flags field is not set? If vb is allocated by kmalloc(GFP_KERNEL)
>> and is nowhere zero-cleared, KASAN would complain it.
> 
> Could you point where in the code that would complain it?
> I only see two shrinker flags (NUMA_AWARE and MEMCG_AWARE), and they seem not related to that.

Where is vb->shrinker.flags initialized?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH v3 2/2] virtio_balloon: replace oom notifier with shrinker
  2018-08-06 10:29       ` Tetsuo Handa
@ 2018-08-06 12:44         ` Wang, Wei W
  2018-08-06 13:28           ` Tetsuo Handa
  0 siblings, 1 reply; 11+ messages in thread
From: Wang, Wei W @ 2018-08-06 12:44 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: virtio-dev, linux-kernel, virtualization, linux-mm, mst, mhocko, akpm

On Monday, August 6, 2018 6:29 PM, Tetsuo Handa wrote:
> On 2018/08/06 18:56, Wei Wang wrote:
> > On 08/03/2018 08:11 PM, Tetsuo Handa wrote:
> >> On 2018/08/03 17:32, Wei Wang wrote:
> >>> +static int virtio_balloon_register_shrinker(struct virtio_balloon
> >>> +*vb) {
> >>> +    vb->shrinker.scan_objects = virtio_balloon_shrinker_scan;
> >>> +    vb->shrinker.count_objects = virtio_balloon_shrinker_count;
> >>> +    vb->shrinker.batch = 0;
> >>> +    vb->shrinker.seeks = DEFAULT_SEEKS;
> >> Why flags field is not set? If vb is allocated by kmalloc(GFP_KERNEL)
> >> and is nowhere zero-cleared, KASAN would complain it.
> >
> > Could you point where in the code that would complain it?
> > I only see two shrinker flags (NUMA_AWARE and MEMCG_AWARE), and
> they seem not related to that.
> 
> Where is vb->shrinker.flags initialized?

Is that mandatory to be initialized? I find it's not initialized in most shrinkers (e.g. zs_register_shrinker, huge_zero_page_shrinker).

Best,
Wei

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/2] virtio_balloon: replace oom notifier with shrinker
  2018-08-06 12:44         ` Wang, Wei W
@ 2018-08-06 13:28           ` Tetsuo Handa
  2018-08-06 14:02             ` Wang, Wei W
  0 siblings, 1 reply; 11+ messages in thread
From: Tetsuo Handa @ 2018-08-06 13:28 UTC (permalink / raw)
  To: Wang, Wei W
  Cc: virtio-dev, linux-kernel, virtualization, linux-mm, mst, mhocko, akpm

On 2018/08/06 21:44, Wang, Wei W wrote:
> On Monday, August 6, 2018 6:29 PM, Tetsuo Handa wrote:
>> On 2018/08/06 18:56, Wei Wang wrote:
>>> On 08/03/2018 08:11 PM, Tetsuo Handa wrote:
>>>> On 2018/08/03 17:32, Wei Wang wrote:
>>>>> +static int virtio_balloon_register_shrinker(struct virtio_balloon
>>>>> +*vb) {
>>>>> +    vb->shrinker.scan_objects = virtio_balloon_shrinker_scan;
>>>>> +    vb->shrinker.count_objects = virtio_balloon_shrinker_count;
>>>>> +    vb->shrinker.batch = 0;
>>>>> +    vb->shrinker.seeks = DEFAULT_SEEKS;
>>>> Why flags field is not set? If vb is allocated by kmalloc(GFP_KERNEL)
>>>> and is nowhere zero-cleared, KASAN would complain it.
>>>
>>> Could you point where in the code that would complain it?
>>> I only see two shrinker flags (NUMA_AWARE and MEMCG_AWARE), and
>> they seem not related to that.
>>
>> Where is vb->shrinker.flags initialized?
> 
> Is that mandatory to be initialized?

Of course. ;-)

> I find it's not initialized in most shrinkers (e.g. zs_register_shrinker, huge_zero_page_shrinker).

Because most shrinkers are "statically initialized (which means that unspecified fields are
implicitly zero-cleared)" or "dynamically allocated with __GFP_ZERO or zero-cleared using
memset() (which means that all fields are once zero-cleared)".

And if you once zero-clear vb at allocation time, you will get a bonus that
calling unregister_shrinker() without corresponding register_shrinker() is safe
(which will simplify initialization failure path).

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH v3 2/2] virtio_balloon: replace oom notifier with shrinker
  2018-08-06 13:28           ` Tetsuo Handa
@ 2018-08-06 14:02             ` Wang, Wei W
  0 siblings, 0 replies; 11+ messages in thread
From: Wang, Wei W @ 2018-08-06 14:02 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: virtio-dev, linux-kernel, virtualization, linux-mm, mst, mhocko, akpm

On Monday, August 6, 2018 9:29 PM, Tetsuo Handa wrote:
> On 2018/08/06 21:44, Wang, Wei W wrote:
> > On Monday, August 6, 2018 6:29 PM, Tetsuo Handa wrote:
> >> On 2018/08/06 18:56, Wei Wang wrote:
> >>> On 08/03/2018 08:11 PM, Tetsuo Handa wrote:
> >>>> On 2018/08/03 17:32, Wei Wang wrote:
> >>>>> +static int virtio_balloon_register_shrinker(struct virtio_balloon
> >>>>> +*vb) {
> >>>>> +    vb->shrinker.scan_objects = virtio_balloon_shrinker_scan;
> >>>>> +    vb->shrinker.count_objects = virtio_balloon_shrinker_count;
> >>>>> +    vb->shrinker.batch = 0;
> >>>>> +    vb->shrinker.seeks = DEFAULT_SEEKS;
> >>>> Why flags field is not set? If vb is allocated by
> >>>> kmalloc(GFP_KERNEL) and is nowhere zero-cleared, KASAN would
> complain it.
> >>>
> >>> Could you point where in the code that would complain it?
> >>> I only see two shrinker flags (NUMA_AWARE and MEMCG_AWARE), and
> >> they seem not related to that.
> >>
> >> Where is vb->shrinker.flags initialized?
> >
> > Is that mandatory to be initialized?
> 
> Of course. ;-)
> 
> > I find it's not initialized in most shrinkers (e.g. zs_register_shrinker,
> huge_zero_page_shrinker).
> 
> Because most shrinkers are "statically initialized (which means that
> unspecified fields are implicitly zero-cleared)" or "dynamically allocated with
> __GFP_ZERO or zero-cleared using
> memset() (which means that all fields are once zero-cleared)".
> 
> And if you once zero-clear vb at allocation time, you will get a bonus that
> calling unregister_shrinker() without corresponding register_shrinker() is safe
> (which will simplify initialization failure path).

Oh, I see, thanks. So it sounds better to directly kzalloc vb.

Best,
Wei

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2018-08-06 14:02 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-03  8:32 [PATCH v3 0/2] virtio-balloon: some improvements Wei Wang
2018-08-03  8:32 ` [PATCH v3 1/2] virtio-balloon: remove BUG() in init_vqs Wei Wang
2018-08-03  8:32 ` [PATCH v3 2/2] virtio_balloon: replace oom notifier with shrinker Wei Wang
2018-08-03 12:11   ` Tetsuo Handa
2018-08-06  9:56     ` Wei Wang
2018-08-06 10:29       ` Tetsuo Handa
2018-08-06 12:44         ` Wang, Wei W
2018-08-06 13:28           ` Tetsuo Handa
2018-08-06 14:02             ` Wang, Wei W
2018-08-03 19:15   ` Michael S. Tsirkin
2018-08-06  3:29     ` Wei Wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).