linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] virtio_balloon: use non-blocking allocation
@ 2018-01-02 14:50 Tetsuo Handa
  2018-01-04  5:56 ` Wei Wang
  2018-01-31  0:01 ` Michael S. Tsirkin
  0 siblings, 2 replies; 5+ messages in thread
From: Tetsuo Handa @ 2018-01-02 14:50 UTC (permalink / raw)
  To: virtio-dev
  Cc: linux-kernel, linux-mm, Tetsuo Handa, Matthew Wilcox,
	Michael S. Tsirkin, Michal Hocko, Wei Wang

Commit c7cdff0e864713a0 ("virtio_balloon: fix deadlock on OOM") tried to
avoid OOM lockup by moving memory allocations to outside of balloon_lock.

Now, Wei is trying to allocate far more pages outside of balloon_lock and
some more memory inside of balloon_lock in order to perform efficient
communication between host and guest using scatter-gather API.

Since pages allocated outside of balloon_lock are not visible to the OOM
notifier path until fill_balloon() holds balloon_lock (and enqueues the
pending pages), allocating more pages than now may lead to unacceptably
premature OOM killer invocation.

It would be possible to make the pending pages visible to the OOM notifier
path. But there is no need to try to allocate memory so hard from the
beginning. As of commit 18468d93e53b037e ("mm: introduce a common
interface for balloon pages mobility"), it made sense to try allocation
as hard as possible. But after commit 5a10b7dbf904bfe0 ("virtio_balloon:
free some memory from balloon on OOM"), it no longer makes sense to try
allocation as hard as possible, for fill_balloon() will after all have to
release just allocated memory if some allocation request hits the OOM
notifier path. Therefore, this patch disables __GFP_DIRECT_RECLAIM when
allocating memory for inflating the balloon. Then, memory for inflating
the balloon can be allocated inside balloon_lock, and we can release just
allocated memory as needed.

Also, this patch adds __GFP_NOWARN, for possibility of hitting memory
allocation failure is increased by removing __GFP_DIRECT_RECLAIM, which
might spam the kernel log buffer. At the same time, this patch moves
"puff" messages to outside of balloon_lock, for it is not a good thing to
block the OOM notifier path for 1/5 of a second. (Moreover, it is better
to release the workqueue and allow processing other pending items. But
that change is out of this patch's scope.)

__GFP_NOMEMALLOC is currently not required because workqueue context
which calls balloon_page_alloc() won't cause __gfp_pfmemalloc_flags()
to return ALLOC_OOM. But since some process context might start calling
balloon_page_alloc() in future, this patch does not remove
__GFP_NOMEMALLOC.

(Only compile tested. Please do runtime tests before committing.)

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Wei Wang <wei.w.wang@intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
---
 drivers/virtio/virtio_balloon.c | 23 +++++++++++++----------
 mm/balloon_compaction.c         |  5 +++--
 2 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index dfe5684..4d9409b 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -141,7 +141,7 @@ static void set_page_pfns(struct virtio_balloon *vb,
 					  page_to_balloon_pfn(page) + i);
 }
 
-static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
+static unsigned fill_balloon(struct virtio_balloon *vb, size_t num, bool *oom)
 {
 	unsigned num_allocated_pages;
 	unsigned num_pfns;
@@ -151,24 +151,19 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
 	/* We can only do one array worth at a time. */
 	num = min(num, ARRAY_SIZE(vb->pfns));
 
+	mutex_lock(&vb->balloon_lock);
+
 	for (num_pfns = 0; num_pfns < num;
 	     num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
 		struct page *page = balloon_page_alloc();
 
 		if (!page) {
-			dev_info_ratelimited(&vb->vdev->dev,
-					     "Out of puff! Can't get %u pages\n",
-					     VIRTIO_BALLOON_PAGES_PER_PAGE);
-			/* Sleep for at least 1/5 of a second before retry. */
-			msleep(200);
+			*oom = true;
 			break;
 		}
-
 		balloon_page_push(&pages, page);
 	}
 
-	mutex_lock(&vb->balloon_lock);
-
 	vb->num_pfns = 0;
 
 	while ((page = balloon_page_pop(&pages))) {
@@ -404,17 +399,25 @@ static void update_balloon_size_func(struct work_struct *work)
 {
 	struct virtio_balloon *vb;
 	s64 diff;
+	bool oom = false;
 
 	vb = container_of(work, struct virtio_balloon,
 			  update_balloon_size_work);
 	diff = towards_target(vb);
 
 	if (diff > 0)
-		diff -= fill_balloon(vb, diff);
+		diff -= fill_balloon(vb, diff, &oom);
 	else if (diff < 0)
 		diff += leak_balloon(vb, -diff);
 	update_balloon_size(vb);
 
+	if (oom) {
+		dev_info_ratelimited(&vb->vdev->dev,
+				     "Out of puff! Can't get %u pages\n",
+				     VIRTIO_BALLOON_PAGES_PER_PAGE);
+		/* Sleep for at least 1/5 of a second before retry. */
+		msleep(200);
+	}
 	if (diff)
 		queue_work(system_freezable_wq, work);
 }
diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
index ef858d5..067df56 100644
--- a/mm/balloon_compaction.c
+++ b/mm/balloon_compaction.c
@@ -21,8 +21,9 @@
  */
 struct page *balloon_page_alloc(void)
 {
-	struct page *page = alloc_page(balloon_mapping_gfp_mask() |
-				       __GFP_NOMEMALLOC | __GFP_NORETRY);
+	struct page *page = alloc_page((balloon_mapping_gfp_mask() |
+					__GFP_NOMEMALLOC | __GFP_NOWARN) &
+				       ~__GFP_DIRECT_RECLAIM);
 	return page;
 }
 EXPORT_SYMBOL_GPL(balloon_page_alloc);
-- 
1.8.3.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] virtio_balloon: use non-blocking allocation
  2018-01-02 14:50 [PATCH] virtio_balloon: use non-blocking allocation Tetsuo Handa
@ 2018-01-04  5:56 ` Wei Wang
  2018-01-31  0:01 ` Michael S. Tsirkin
  1 sibling, 0 replies; 5+ messages in thread
From: Wei Wang @ 2018-01-04  5:56 UTC (permalink / raw)
  To: Tetsuo Handa, virtio-dev
  Cc: linux-kernel, linux-mm, Matthew Wilcox, Michael S. Tsirkin, Michal Hocko

On 01/02/2018 10:50 PM, Tetsuo Handa wrote:
> Commit c7cdff0e864713a0 ("virtio_balloon: fix deadlock on OOM") tried to
> avoid OOM lockup by moving memory allocations to outside of balloon_lock.
>
> Now, Wei is trying to allocate far more pages outside of balloon_lock and
> some more memory inside of balloon_lock in order to perform efficient
> communication between host and guest using scatter-gather API.
>
> Since pages allocated outside of balloon_lock are not visible to the OOM
> notifier path until fill_balloon() holds balloon_lock (and enqueues the
> pending pages), allocating more pages than now may lead to unacceptably
> premature OOM killer invocation.
>
> It would be possible to make the pending pages visible to the OOM notifier
> path. But there is no need to try to allocate memory so hard from the
> beginning. As of commit 18468d93e53b037e ("mm: introduce a common
> interface for balloon pages mobility"), it made sense to try allocation
> as hard as possible. But after commit 5a10b7dbf904bfe0 ("virtio_balloon:
> free some memory from balloon on OOM"), it no longer makes sense to try
> allocation as hard as possible, for fill_balloon() will after all have to
> release just allocated memory if some allocation request hits the OOM
> notifier path. Therefore, this patch disables __GFP_DIRECT_RECLAIM when
> allocating memory for inflating the balloon. Then, memory for inflating
> the balloon can be allocated inside balloon_lock, and we can release just
> allocated memory as needed.
>
> Also, this patch adds __GFP_NOWARN, for possibility of hitting memory
> allocation failure is increased by removing __GFP_DIRECT_RECLAIM, which
> might spam the kernel log buffer. At the same time, this patch moves
> "puff" messages to outside of balloon_lock, for it is not a good thing to
> block the OOM notifier path for 1/5 of a second. (Moreover, it is better
> to release the workqueue and allow processing other pending items. But
> that change is out of this patch's scope.)
>
> __GFP_NOMEMALLOC is currently not required because workqueue context
> which calls balloon_page_alloc() won't cause __gfp_pfmemalloc_flags()
> to return ALLOC_OOM. But since some process context might start calling
> balloon_page_alloc() in future, this patch does not remove
> __GFP_NOMEMALLOC.
>
> (Only compile tested. Please do runtime tests before committing.)
>
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Wei Wang <wei.w.wang@intel.com>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Michal Hocko <mhocko@suse.com>
> ---
>   drivers/virtio/virtio_balloon.c | 23 +++++++++++++----------
>   mm/balloon_compaction.c         |  5 +++--
>   2 files changed, 16 insertions(+), 12 deletions(-)
>
>

I think it is better to simply make the temporal "LIST_HEAD(pages)" to 
be visible to oom notify, e.g. make it "struct list_head 
vb->inflating_pages"

Then we can change virtioballoon_oom_notify():

static int oom_notify()
{
     ...
     if (*freed != oom_pages && !list_empty(&vb->inflating_pages))
                return NOTIFY_BAD;

     return NOTIFY_OK;
}


virtioballoon_oom_notify() {
     int ret;

     do {
         ret = oom_notify()
     } while (ret == NOTIFY_BAD);

     return ret;
}


I view the above as something "nice to have" (users also have an option 
to disable F_DEFLATE_ON_OOM, in which case inflated pages are also not 
released by oom).  I can help with this after the "virtio_balloon 
enhancement" series is done.


Best,
Wei

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] virtio_balloon: use non-blocking allocation
  2018-01-02 14:50 [PATCH] virtio_balloon: use non-blocking allocation Tetsuo Handa
  2018-01-04  5:56 ` Wei Wang
@ 2018-01-31  0:01 ` Michael S. Tsirkin
  2018-01-31 11:13   ` Tetsuo Handa
  1 sibling, 1 reply; 5+ messages in thread
From: Michael S. Tsirkin @ 2018-01-31  0:01 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: virtio-dev, linux-kernel, linux-mm, Matthew Wilcox, Michal Hocko,
	Wei Wang

On Tue, Jan 02, 2018 at 11:50:21PM +0900, Tetsuo Handa wrote:
> Commit c7cdff0e864713a0 ("virtio_balloon: fix deadlock on OOM") tried to
> avoid OOM lockup by moving memory allocations to outside of balloon_lock.
> 
> Now, Wei is trying to allocate far more pages outside of balloon_lock and
> some more memory inside of balloon_lock in order to perform efficient
> communication between host and guest using scatter-gather API.
> 
> Since pages allocated outside of balloon_lock are not visible to the OOM
> notifier path until fill_balloon() holds balloon_lock (and enqueues the
> pending pages), allocating more pages than now may lead to unacceptably
> premature OOM killer invocation.
> 
> It would be possible to make the pending pages visible to the OOM notifier
> path. But there is no need to try to allocate memory so hard from the
> beginning. As of commit 18468d93e53b037e ("mm: introduce a common
> interface for balloon pages mobility"), it made sense to try allocation
> as hard as possible. But after commit 5a10b7dbf904bfe0 ("virtio_balloon:
> free some memory from balloon on OOM"),

However, please not that this behavious is optional.
Can you keep the current behaviour when deflate on OOM is disabled?


> it no longer makes sense to try
> allocation as hard as possible, for fill_balloon() will after all have to
> release just allocated memory if some allocation request hits the OOM
> notifier path. Therefore, this patch disables __GFP_DIRECT_RECLAIM when
> allocating memory for inflating the balloon. Then, memory for inflating
> the balloon can be allocated inside balloon_lock, and we can release just
> allocated memory as needed.
> 
> Also, this patch adds __GFP_NOWARN, for possibility of hitting memory
> allocation failure is increased by removing __GFP_DIRECT_RECLAIM, which
> might spam the kernel log buffer. At the same time, this patch moves
> "puff" messages to outside of balloon_lock, for it is not a good thing to
> block the OOM notifier path for 1/5 of a second. (Moreover, it is better
> to release the workqueue and allow processing other pending items. But
> that change is out of this patch's scope.)
> 
> __GFP_NOMEMALLOC is currently not required because workqueue context
> which calls balloon_page_alloc() won't cause __gfp_pfmemalloc_flags()
> to return ALLOC_OOM. But since some process context might start calling
> balloon_page_alloc() in future, this patch does not remove
> __GFP_NOMEMALLOC.
> 
> (Only compile tested. Please do runtime tests before committing.)

You will have to find someone to test it.

> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Wei Wang <wei.w.wang@intel.com>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Michal Hocko <mhocko@suse.com>
> ---
>  drivers/virtio/virtio_balloon.c | 23 +++++++++++++----------
>  mm/balloon_compaction.c         |  5 +++--
>  2 files changed, 16 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index dfe5684..4d9409b 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -141,7 +141,7 @@ static void set_page_pfns(struct virtio_balloon *vb,
>  					  page_to_balloon_pfn(page) + i);
>  }
>  
> -static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
> +static unsigned fill_balloon(struct virtio_balloon *vb, size_t num, bool *oom)
>  {
>  	unsigned num_allocated_pages;
>  	unsigned num_pfns;
> @@ -151,24 +151,19 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
>  	/* We can only do one array worth at a time. */
>  	num = min(num, ARRAY_SIZE(vb->pfns));
>  
> +	mutex_lock(&vb->balloon_lock);
> +
>  	for (num_pfns = 0; num_pfns < num;
>  	     num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
>  		struct page *page = balloon_page_alloc();
>  
>  		if (!page) {
> -			dev_info_ratelimited(&vb->vdev->dev,
> -					     "Out of puff! Can't get %u pages\n",
> -					     VIRTIO_BALLOON_PAGES_PER_PAGE);
> -			/* Sleep for at least 1/5 of a second before retry. */
> -			msleep(200);
> +			*oom = true;
>  			break;
>  		}
> -
>  		balloon_page_push(&pages, page);
>  	}
>  
> -	mutex_lock(&vb->balloon_lock);
> -
>  	vb->num_pfns = 0;
>  
>  	while ((page = balloon_page_pop(&pages))) {
> @@ -404,17 +399,25 @@ static void update_balloon_size_func(struct work_struct *work)
>  {
>  	struct virtio_balloon *vb;
>  	s64 diff;
> +	bool oom = false;
>  
>  	vb = container_of(work, struct virtio_balloon,
>  			  update_balloon_size_work);
>  	diff = towards_target(vb);
>  
>  	if (diff > 0)
> -		diff -= fill_balloon(vb, diff);
> +		diff -= fill_balloon(vb, diff, &oom);
>  	else if (diff < 0)
>  		diff += leak_balloon(vb, -diff);
>  	update_balloon_size(vb);
>  
> +	if (oom) {
> +		dev_info_ratelimited(&vb->vdev->dev,
> +				     "Out of puff! Can't get %u pages\n",
> +				     VIRTIO_BALLOON_PAGES_PER_PAGE);
> +		/* Sleep for at least 1/5 of a second before retry. */
> +		msleep(200);
> +	}
>  	if (diff)
>  		queue_work(system_freezable_wq, work);
>  }
> diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
> index ef858d5..067df56 100644
> --- a/mm/balloon_compaction.c
> +++ b/mm/balloon_compaction.c
> @@ -21,8 +21,9 @@
>   */
>  struct page *balloon_page_alloc(void)
>  {
> -	struct page *page = alloc_page(balloon_mapping_gfp_mask() |
> -				       __GFP_NOMEMALLOC | __GFP_NORETRY);
> +	struct page *page = alloc_page((balloon_mapping_gfp_mask() |
> +					__GFP_NOMEMALLOC | __GFP_NOWARN) &
> +				       ~__GFP_DIRECT_RECLAIM);
>  	return page;
>  }
>  EXPORT_SYMBOL_GPL(balloon_page_alloc);
> -- 
> 1.8.3.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] virtio_balloon: use non-blocking allocation
  2018-01-31  0:01 ` Michael S. Tsirkin
@ 2018-01-31 11:13   ` Tetsuo Handa
  2018-01-31 15:25     ` Michael S. Tsirkin
  0 siblings, 1 reply; 5+ messages in thread
From: Tetsuo Handa @ 2018-01-31 11:13 UTC (permalink / raw)
  To: mst; +Cc: virtio-dev, linux-kernel, linux-mm, willy, mhocko, wei.w.wang

Michael S. Tsirkin wrote:
> On Tue, Jan 02, 2018 at 11:50:21PM +0900, Tetsuo Handa wrote:
> > Commit c7cdff0e864713a0 ("virtio_balloon: fix deadlock on OOM") tried to
> > avoid OOM lockup by moving memory allocations to outside of balloon_lock.
> > 
> > Now, Wei is trying to allocate far more pages outside of balloon_lock and
> > some more memory inside of balloon_lock in order to perform efficient
> > communication between host and guest using scatter-gather API.
> > 
> > Since pages allocated outside of balloon_lock are not visible to the OOM
> > notifier path until fill_balloon() holds balloon_lock (and enqueues the
> > pending pages), allocating more pages than now may lead to unacceptably
> > premature OOM killer invocation.
> > 
> > It would be possible to make the pending pages visible to the OOM notifier
> > path. But there is no need to try to allocate memory so hard from the
> > beginning. As of commit 18468d93e53b037e ("mm: introduce a common
> > interface for balloon pages mobility"), it made sense to try allocation
> > as hard as possible. But after commit 5a10b7dbf904bfe0 ("virtio_balloon:
> > free some memory from balloon on OOM"),
> 
> However, please not that this behavious is optional.
> Can you keep the current behaviour when deflate on OOM is disabled?

I can, for passing a flag to balloon_page_alloc() will do it.

But do we really prefer behavior up to comment 27 of
https://bugzilla.redhat.com/show_bug.cgi?id=1525356 ?

> 
> 
> > it no longer makes sense to try
> > allocation as hard as possible, for fill_balloon() will after all have to
> > release just allocated memory if some allocation request hits the OOM
> > notifier path. Therefore, this patch disables __GFP_DIRECT_RECLAIM when
> > allocating memory for inflating the balloon. Then, memory for inflating
> > the balloon can be allocated inside balloon_lock, and we can release just
> > allocated memory as needed.
> > 
> > Also, this patch adds __GFP_NOWARN, for possibility of hitting memory
> > allocation failure is increased by removing __GFP_DIRECT_RECLAIM, which
> > might spam the kernel log buffer. At the same time, this patch moves
> > "puff" messages to outside of balloon_lock, for it is not a good thing to
> > block the OOM notifier path for 1/5 of a second. (Moreover, it is better
> > to release the workqueue and allow processing other pending items. But
> > that change is out of this patch's scope.)
> > 
> > __GFP_NOMEMALLOC is currently not required because workqueue context
> > which calls balloon_page_alloc() won't cause __gfp_pfmemalloc_flags()
> > to return ALLOC_OOM. But since some process context might start calling
> > balloon_page_alloc() in future, this patch does not remove
> > __GFP_NOMEMALLOC.
> > 
> > (Only compile tested. Please do runtime tests before committing.)
> 
> You will have to find someone to test it.

I don't have machines with much memory.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] virtio_balloon: use non-blocking allocation
  2018-01-31 11:13   ` Tetsuo Handa
@ 2018-01-31 15:25     ` Michael S. Tsirkin
  0 siblings, 0 replies; 5+ messages in thread
From: Michael S. Tsirkin @ 2018-01-31 15:25 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: virtio-dev, linux-kernel, linux-mm, willy, mhocko, wei.w.wang

On Wed, Jan 31, 2018 at 08:13:26PM +0900, Tetsuo Handa wrote:
> Michael S. Tsirkin wrote:
> > On Tue, Jan 02, 2018 at 11:50:21PM +0900, Tetsuo Handa wrote:
> > > Commit c7cdff0e864713a0 ("virtio_balloon: fix deadlock on OOM") tried to
> > > avoid OOM lockup by moving memory allocations to outside of balloon_lock.
> > > 
> > > Now, Wei is trying to allocate far more pages outside of balloon_lock and
> > > some more memory inside of balloon_lock in order to perform efficient
> > > communication between host and guest using scatter-gather API.
> > > 
> > > Since pages allocated outside of balloon_lock are not visible to the OOM
> > > notifier path until fill_balloon() holds balloon_lock (and enqueues the
> > > pending pages), allocating more pages than now may lead to unacceptably
> > > premature OOM killer invocation.
> > > 
> > > It would be possible to make the pending pages visible to the OOM notifier
> > > path. But there is no need to try to allocate memory so hard from the
> > > beginning. As of commit 18468d93e53b037e ("mm: introduce a common
> > > interface for balloon pages mobility"), it made sense to try allocation
> > > as hard as possible. But after commit 5a10b7dbf904bfe0 ("virtio_balloon:
> > > free some memory from balloon on OOM"),
> > 
> > However, please not that this behavious is optional.
> > Can you keep the current behaviour when deflate on OOM is disabled?
> 
> I can, for passing a flag to balloon_page_alloc() will do it.
> 
> But do we really prefer behavior up to comment 27 of
> https://bugzilla.redhat.com/show_bug.cgi?id=1525356 ?


You show a config where so much memory is taken that guest crashes, but
hopefully in other situations it's just an application hogging memory.

So I'm sure current behaviour is not optimal for your config but the
problem with deflate on OOM is it does not restart inflating when OOM
condition goes away.  So if host *really* needs that memory it can't
enable deflate on OOM.

Crashing the kernel is of course not really useful, I wish
there was a way to avoid that while still reliably
giving as much as we can to the host.

Ideally balloon flow looks like this:
1. host needs some memory e.g. to start a new guest
2. guest gets request from host to give it back some memory
3. while request not satisfied:
	a. try to get hold of free memory
	b. free up some by flushing caches
	c. free up some by killing memory hogs

All this without crashing the guest.

What we have implemented is a rough approximation.

Deflate on OOM reduces the chance of a crash or hang but it makes the
inflate unreliable: host can no longer use the memory for another guest,
this one might request it back at any time.

What is deflate on oom good for then?  I suspect that people use deflate
on oom as poor man's page hinting.


> > 
> > 
> > > it no longer makes sense to try
> > > allocation as hard as possible, for fill_balloon() will after all have to
> > > release just allocated memory if some allocation request hits the OOM
> > > notifier path. Therefore, this patch disables __GFP_DIRECT_RECLAIM when
> > > allocating memory for inflating the balloon. Then, memory for inflating
> > > the balloon can be allocated inside balloon_lock, and we can release just
> > > allocated memory as needed.
> > > 
> > > Also, this patch adds __GFP_NOWARN, for possibility of hitting memory
> > > allocation failure is increased by removing __GFP_DIRECT_RECLAIM, which
> > > might spam the kernel log buffer. At the same time, this patch moves
> > > "puff" messages to outside of balloon_lock, for it is not a good thing to
> > > block the OOM notifier path for 1/5 of a second. (Moreover, it is better
> > > to release the workqueue and allow processing other pending items. But
> > > that change is out of this patch's scope.)
> > > 
> > > __GFP_NOMEMALLOC is currently not required because workqueue context
> > > which calls balloon_page_alloc() won't cause __gfp_pfmemalloc_flags()
> > > to return ALLOC_OOM. But since some process context might start calling
> > > balloon_page_alloc() in future, this patch does not remove
> > > __GFP_NOMEMALLOC.
> > > 
> > > (Only compile tested. Please do runtime tests before committing.)
> > 
> > You will have to find someone to test it.
> 
> I don't have machines with much memory.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-01-31 15:25 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-02 14:50 [PATCH] virtio_balloon: use non-blocking allocation Tetsuo Handa
2018-01-04  5:56 ` Wei Wang
2018-01-31  0:01 ` Michael S. Tsirkin
2018-01-31 11:13   ` Tetsuo Handa
2018-01-31 15:25     ` Michael S. Tsirkin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).