* [PATCH 0/2] coroutine: Fix crashes due to too large pool batch size
@ 2022-05-10 15:10 Kevin Wolf
  2022-05-10 15:10 ` [PATCH 1/2] coroutine: Rename qemu_coroutine_inc/dec_pool_size() Kevin Wolf
  2022-05-10 15:10 ` [PATCH 2/2] coroutine: Revert to constant batch size Kevin Wolf
  0 siblings, 2 replies; 7+ messages in thread
From: Kevin Wolf @ 2022-05-10 15:10 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, stefanha, hnarukaw, qemu-devel, qemu-stable

Kevin Wolf (2):
  coroutine: Rename qemu_coroutine_inc/dec_pool_size()
  coroutine: Revert to constant batch size

 include/qemu/coroutine.h |  6 +++---
 hw/block/virtio-blk.c    |  6 ++----
 util/qemu-coroutine.c    | 26 ++++++++++++++++----------
 3 files changed, 21 insertions(+), 17 deletions(-)

-- 
2.35.3




* [PATCH 1/2] coroutine: Rename qemu_coroutine_inc/dec_pool_size()
  2022-05-10 15:10 [PATCH 0/2] coroutine: Fix crashes due to too large pool batch size Kevin Wolf
@ 2022-05-10 15:10 ` Kevin Wolf
  2022-05-10 15:10 ` [PATCH 2/2] coroutine: Revert to constant batch size Kevin Wolf
  1 sibling, 0 replies; 7+ messages in thread
From: Kevin Wolf @ 2022-05-10 15:10 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, stefanha, hnarukaw, qemu-devel, qemu-stable

It's true that these functions currently affect the batch size in which
coroutines are reused (i.e. moved from the global release pool to the
allocation pool of a specific thread), but this is a bug and will be
fixed in a separate patch.

In fact, the comment in the header file only promises that the
functions influence the pool size, so reflect this in their names.
As a nice side effect, the shorter names make some line wrapping
unnecessary.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 include/qemu/coroutine.h | 6 +++---
 hw/block/virtio-blk.c    | 6 ++----
 util/qemu-coroutine.c    | 4 ++--
 3 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index 284571badb..031cf23711 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -334,12 +334,12 @@ void coroutine_fn yield_until_fd_readable(int fd);
 /**
  * Increase coroutine pool size
  */
-void qemu_coroutine_increase_pool_batch_size(unsigned int additional_pool_size);
+void qemu_coroutine_inc_pool_size(unsigned int additional_pool_size);
 
 /**
- * Devcrease coroutine pool size
+ * Decrease coroutine pool size
  */
-void qemu_coroutine_decrease_pool_batch_size(unsigned int additional_pool_size);
+void qemu_coroutine_dec_pool_size(unsigned int additional_pool_size);
 
 #include "qemu/lockable.h"
 
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 540c38f829..6a1cc41877 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -1215,8 +1215,7 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
     for (i = 0; i < conf->num_queues; i++) {
         virtio_add_queue(vdev, conf->queue_size, virtio_blk_handle_output);
     }
-    qemu_coroutine_increase_pool_batch_size(conf->num_queues * conf->queue_size
-                                            / 2);
+    qemu_coroutine_inc_pool_size(conf->num_queues * conf->queue_size / 2);
     virtio_blk_data_plane_create(vdev, conf, &s->dataplane, &err);
     if (err != NULL) {
         error_propagate(errp, err);
@@ -1253,8 +1252,7 @@ static void virtio_blk_device_unrealize(DeviceState *dev)
     for (i = 0; i < conf->num_queues; i++) {
         virtio_del_queue(vdev, i);
     }
-    qemu_coroutine_decrease_pool_batch_size(conf->num_queues * conf->queue_size
-                                            / 2);
+    qemu_coroutine_dec_pool_size(conf->num_queues * conf->queue_size / 2);
     qemu_del_vm_change_state_handler(s->change);
     blockdev_mark_auto_del(s->blk);
     virtio_cleanup(vdev);
diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
index f3e8300c8d..ea23929a74 100644
--- a/util/qemu-coroutine.c
+++ b/util/qemu-coroutine.c
@@ -212,12 +212,12 @@ AioContext *coroutine_fn qemu_coroutine_get_aio_context(Coroutine *co)
     return co->ctx;
 }
 
-void qemu_coroutine_increase_pool_batch_size(unsigned int additional_pool_size)
+void qemu_coroutine_inc_pool_size(unsigned int additional_pool_size)
 {
     qatomic_add(&pool_batch_size, additional_pool_size);
 }
 
-void qemu_coroutine_decrease_pool_batch_size(unsigned int removing_pool_size)
+void qemu_coroutine_dec_pool_size(unsigned int removing_pool_size)
 {
     qatomic_sub(&pool_batch_size, removing_pool_size);
 }
-- 
2.35.3




* [PATCH 2/2] coroutine: Revert to constant batch size
  2022-05-10 15:10 [PATCH 0/2] coroutine: Fix crashes due to too large pool batch size Kevin Wolf
  2022-05-10 15:10 ` [PATCH 1/2] coroutine: Rename qemu_coroutine_inc/dec_pool_size() Kevin Wolf
@ 2022-05-10 15:10 ` Kevin Wolf
  2022-05-12  6:56   ` 成川 弘樹
  1 sibling, 1 reply; 7+ messages in thread
From: Kevin Wolf @ 2022-05-10 15:10 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, stefanha, hnarukaw, qemu-devel, qemu-stable

Commit 4c41c69e changed the way the coroutine pool is sized because for
virtio-blk devices with a large queue size and heavy I/O, it was just
too small and caused coroutines to be deleted and reallocated soon
afterwards. The change made the size dynamic based on the number of
queues and the queue size of virtio-blk devices.

There are two important numbers here: Slightly simplified, when a
coroutine terminates, it is generally stored in the global release pool
up to a certain pool size, and if the pool is full, it is freed.
Conversely, when allocating a new coroutine, the coroutines in the
release pool are reused if the pool already has reached a certain
minimum size (the batch size), otherwise we allocate new coroutines.
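
In simplified pseudo-C, the lifecycle described above looks like this
(a sketch of the logic only, not the actual QEMU code; pool_push(),
pool_pop() and the variable names are illustrative):

    /* On termination: park the coroutine unless the pool is full. */
    static void coroutine_release(Coroutine *co)
    {
        if (release_pool_size < pool_max_size) {
            pool_push(&release_pool, co);   /* keep it around for reuse */
            release_pool_size++;
        } else {
            coroutine_free(co);             /* pool full: really delete */
        }
    }

    /* On creation: reuse from the pool only once it holds a full batch. */
    static Coroutine *coroutine_get(void)
    {
        if (release_pool_size > batch_size) {
            release_pool_size--;
            return pool_pop(&release_pool); /* fast path: recycle */
        }
        return coroutine_alloc();           /* otherwise allocate fresh */
    }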

The problem after commit 4c41c69e is that it not only increases the
maximum pool size (which is the intended effect), but also the batch
size for reusing coroutines (which is a bug). It means that in cases
with many devices and/or a large queue size (which defaults to the
number of vcpus for virtio-blk-pci), many thousands of coroutines could
be sitting in the release pool without being reused.
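
For illustration (the numbers here are assumed, not from a report):
with eight virtio-blk-pci devices, each using num_queues = 16 and
queue_size = 256, every device adds 16 * 256 / 2 = 2048 to the batch
size, 16384 in total. Up to that many terminated coroutines can then
sit in the release pool without ever being reused, and because the pool
only starts freeing coroutines at twice the batch size, roughly 32768
of them can accumulate.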

This is not only a waste of memory and allocations; it also makes the
QEMU process likely to hit the vm.max_map_count limit on Linux, because
each coroutine requires two mappings (its stack and the guard page for
the stack). When the limit is hit, mprotect() starts to fail with
ENOMEM, causing QEMU to abort() in qemu_alloc_stack().
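
Continuing the assumed example above: 32768 parked coroutines need
about 65536 mappings, already more than the Linux default
vm.max_map_count of 65530. The sketch below shows why a coroutine stack
costs two mappings; it is a simplified stand-in, not QEMU's actual
qemu_alloc_stack():

    #include <stdlib.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* Allocate a stack with a guard page. The single mmap() creates one
     * mapping; the mprotect() of the first page splits it into two VMAs
     * (inaccessible guard page + usable stack), so every live coroutine
     * consumes two entries against vm.max_map_count. */
    static void *alloc_stack_sketch(size_t stack_size)
    {
        size_t page = sysconf(_SC_PAGESIZE);
        char *ptr = mmap(NULL, stack_size + page, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (ptr == MAP_FAILED) {
            abort();
        }
        /* Once vm.max_map_count is exhausted, this mprotect() fails with
         * ENOMEM and the caller can do little else than abort(). */
        if (mprotect(ptr, page, PROT_NONE) != 0) {
            abort();
        }
        return ptr + page;   /* usable stack starts above the guard page */
    }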

In order to fix the problem, change the batch size back to 64 to avoid
uselessly accumulating coroutines in the release pool, but keep the
dynamic maximum pool size so that coroutines aren't freed too early
in heavy I/O scenarios.

Note that this fix doesn't strictly make it impossible to hit the limit,
but this would only happen if most of the coroutines are actually in use
at the same time, not just sitting in a pool. This is the same behaviour
as we already had before commit 4c41c69e. Fully preventing this would
require allowing qemu_coroutine_create() to return an error, but it
doesn't seem to be a scenario that people hit in practice.

Cc: qemu-stable@nongnu.org
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2079938
Fixes: 4c41c69e05fe28c0f95f8abd2ebf407e95a4f04b
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 util/qemu-coroutine.c | 22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
index ea23929a74..4a8bd63ef0 100644
--- a/util/qemu-coroutine.c
+++ b/util/qemu-coroutine.c
@@ -21,14 +21,20 @@
 #include "qemu/coroutine-tls.h"
 #include "block/aio.h"
 
-/** Initial batch size is 64, and is increased on demand */
+/**
+ * The minimal batch size is always 64; coroutines from the release_pool are
+ * reused as soon as there are 64 coroutines in it. The maximum pool size starts
+ * with 64 and is increased on demand so that coroutines are not deleted even if
+ * they are not immediately reused.
+ */
 enum {
-    POOL_INITIAL_BATCH_SIZE = 64,
+    POOL_MIN_BATCH_SIZE = 64,
+    POOL_INITIAL_MAX_SIZE = 64,
 };
 
 /** Free list to speed up creation */
 static QSLIST_HEAD(, Coroutine) release_pool = QSLIST_HEAD_INITIALIZER(pool);
-static unsigned int pool_batch_size = POOL_INITIAL_BATCH_SIZE;
+static unsigned int pool_max_size = POOL_INITIAL_MAX_SIZE;
 static unsigned int release_pool_size;
 
 typedef QSLIST_HEAD(, Coroutine) CoroutineQSList;
@@ -57,7 +63,7 @@ Coroutine *qemu_coroutine_create(CoroutineEntry *entry, void *opaque)
 
         co = QSLIST_FIRST(alloc_pool);
         if (!co) {
-            if (release_pool_size > qatomic_read(&pool_batch_size)) {
+            if (release_pool_size > POOL_MIN_BATCH_SIZE) {
                 /* Slow path; a good place to register the destructor, too.  */
                 Notifier *notifier = get_ptr_coroutine_pool_cleanup_notifier();
                 if (!notifier->notify) {
@@ -95,12 +101,12 @@ static void coroutine_delete(Coroutine *co)
     co->caller = NULL;
 
     if (CONFIG_COROUTINE_POOL) {
-        if (release_pool_size < qatomic_read(&pool_batch_size) * 2) {
+        if (release_pool_size < qatomic_read(&pool_max_size) * 2) {
             QSLIST_INSERT_HEAD_ATOMIC(&release_pool, co, pool_next);
             qatomic_inc(&release_pool_size);
             return;
         }
-        if (get_alloc_pool_size() < qatomic_read(&pool_batch_size)) {
+        if (get_alloc_pool_size() < qatomic_read(&pool_max_size)) {
             QSLIST_INSERT_HEAD(get_ptr_alloc_pool(), co, pool_next);
             set_alloc_pool_size(get_alloc_pool_size() + 1);
             return;
@@ -214,10 +220,10 @@ AioContext *coroutine_fn qemu_coroutine_get_aio_context(Coroutine *co)
 
 void qemu_coroutine_inc_pool_size(unsigned int additional_pool_size)
 {
-    qatomic_add(&pool_batch_size, additional_pool_size);
+    qatomic_add(&pool_max_size, additional_pool_size);
 }
 
 void qemu_coroutine_dec_pool_size(unsigned int removing_pool_size)
 {
-    qatomic_sub(&pool_batch_size, removing_pool_size);
+    qatomic_sub(&pool_max_size, removing_pool_size);
 }
-- 
2.35.3




* Re: [PATCH 2/2] coroutine: Revert to constant batch size
  2022-05-10 15:10 ` [PATCH 2/2] coroutine: Revert to constant batch size Kevin Wolf
@ 2022-05-12  6:56   ` 成川 弘樹
  2022-05-12  9:57     ` Kevin Wolf
  2022-05-12 12:50     ` Philippe Mathieu-Daudé via
  0 siblings, 2 replies; 7+ messages in thread
From: 成川 弘樹 @ 2022-05-12  6:56 UTC (permalink / raw)
  To: Kevin Wolf, qemu-block; +Cc: stefanha, qemu-devel, qemu-stable

Thank you for your fix.

I confirmed that after applying this patch, the performance improvement
I intended with 4c41c69e is still preserved in our environment.



* Re: [PATCH 2/2] coroutine: Revert to constant batch size
  2022-05-12  6:56   ` 成川 弘樹
@ 2022-05-12  9:57     ` Kevin Wolf
  2022-05-12 12:50     ` Philippe Mathieu-Daudé via
  1 sibling, 0 replies; 7+ messages in thread
From: Kevin Wolf @ 2022-05-12  9:57 UTC (permalink / raw)
  To: 成川 弘樹
  Cc: qemu-block, stefanha, qemu-devel, qemu-stable

On 12.05.2022 at 08:56, 成川 弘樹 wrote:
> Thank you for your fix.
> 
> I confirmed that after applying this patch, the performance improvement
> I intended with 4c41c69e is still preserved in our environment.

This is good news. Thank you for testing the patch!

Kevin




* Re: [PATCH 2/2] coroutine: Revert to constant batch size
  2022-05-12  6:56   ` 成川 弘樹
  2022-05-12  9:57     ` Kevin Wolf
@ 2022-05-12 12:50     ` Philippe Mathieu-Daudé via
  2022-05-13  1:18       ` 成川 弘樹
  1 sibling, 1 reply; 7+ messages in thread
From: Philippe Mathieu-Daudé via @ 2022-05-12 12:50 UTC (permalink / raw)
  To: 成川 弘樹
  Cc: Kevin Wolf, open list:Block layer core, Stefan Hajnoczi,
	qemu-devel@nongnu.org Developers, qemu-stable

Hi Hiroki,

On Thu, May 12, 2022 at 8:57 AM 成川 弘樹 <hnarukaw@yahoo-corp.jp> wrote:
>
> Thank you for your fix.
>
> I confirmed that after applying this patch, the performance improvement
> I intended with 4c41c69e is still preserved in our environment.

Is that equivalent to a formal
Tested-by: Hiroki Narukawa <hnarukaw@yahoo-corp.jp>
tag?


* Re: [PATCH 2/2] coroutine: Revert to constant batch size
  2022-05-12 12:50     ` Philippe Mathieu-Daudé via
@ 2022-05-13  1:18       ` 成川 弘樹
  0 siblings, 0 replies; 7+ messages in thread
From: 成川 弘樹 @ 2022-05-13  1:18 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Kevin Wolf, open list:Block layer core, Stefan Hajnoczi,
	qemu-devel@nongnu.org Developers, qemu-stable

I'm not sure how much testing is expected for "Tested-by".

If verifying it from my perspective is enough, yes. But I did not check
that this patch fixes the problem of excessive resource consumption.


On 2022/05/12 21:50, Philippe Mathieu-Daudé wrote:
> Hi Hiroki,
> 
> On Thu, May 12, 2022 at 8:57 AM 成川 弘樹 <hnarukaw@yahoo-corp.jp> wrote:
>>
>> Thank you for your fix.
>>
>> I confirmed that after applying this patch, the performance improvement
>> I intended with 4c41c69e is still preserved in our environment.
> 
> Is that equivalent to a formal
> Tested-by: Hiroki Narukawa <hnarukaw@yahoo-corp.jp>
> tag?
