* [RFC] thread-pool: Add option to fix the pool size
@ 2022-02-02 17:52 Nicolas Saenz Julienne
  2022-02-03 10:53 ` Stefan Hajnoczi
  0 siblings, 1 reply; 12+ messages in thread
From: Nicolas Saenz Julienne @ 2022-02-02 17:52 UTC (permalink / raw)
  To: kwolf, hreitz, pbonzini
  Cc: Nicolas Saenz Julienne, mtosatti, stefanha, qemu-devel, qemu-block

The thread pool regulates itself: when idle, it kills threads until
empty; when in demand, it creates new threads until full. This behaviour
doesn't play well with latency-sensitive workloads, where the price of
creating a new thread is too high. For example, when paired with qemu's
'-mlock', or when using safety features like SafeStack, creating a new
thread has been measured to take multiple milliseconds.

In order to mitigate this, let's introduce a new option to set a fixed
pool size. The threads will be created during the pool's initialization,
remain available during its lifetime regardless of demand, and be
destroyed upon freeing it. A properly characterized workload will then
be able to configure the pool to avoid any latency spikes.

Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>

---

The fix I propose here works for my specific use-case, but I'm pretty
sure it'll need to be a bit more versatile to accommodate other
use-cases.

Some questions:

- Is uniformly setting these parameters for every pool instance too
  limiting? It'd make sense to move the options into the AioContext the
  pool belongs to. IIUC, for the general block use-case, this would be
  'qemu_aio_context' as initialized in qemu_init_main_loop().

- Currently I'm setting two pool properties through a single qemu
  option: the pool's size and its dynamic behaviour (or lack thereof). I
  think it'd be better to split them into separate options. I thought of
  different ways of expressing this (min/max-size, where a static pool
  happens when min-size=max-size; size plus static/dynamic; etc.), but
  you might have ideas on what could be useful to other use-cases.

Some background on my workload: I'm using IDE emulation; the guest is an
old RTOS that doesn't support virtio, and using 'aio=native' isn't
possible either (unaligned IO accesses).
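
For illustration, an invocation for a workload like this one could then
pin the pool up front with something like the following (the pool size
and the drive options are placeholders, to be sized and adapted to the
actual setup):

    qemu-system-x86_64 -thread-pool fixed-size=16 \
        -drive file=disk.img,if=ide,aio=threads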

Thanks!

 include/block/thread-pool.h |  2 ++
 qemu-options.hx             | 21 +++++++++++++++++++++
 softmmu/vl.c                | 28 ++++++++++++++++++++++++++++
 util/thread-pool.c          | 20 +++++++++++++++++---
 4 files changed, 68 insertions(+), 3 deletions(-)

diff --git a/include/block/thread-pool.h b/include/block/thread-pool.h
index 7dd7d730a0..3337971669 100644
--- a/include/block/thread-pool.h
+++ b/include/block/thread-pool.h
@@ -23,6 +23,8 @@
 typedef int ThreadPoolFunc(void *opaque);
 
 typedef struct ThreadPool ThreadPool;
+extern int thread_pool_max_threads;
+extern bool thread_pool_fixed_size;
 
 ThreadPool *thread_pool_new(struct AioContext *ctx);
 void thread_pool_free(ThreadPool *pool);
diff --git a/qemu-options.hx b/qemu-options.hx
index ba3ae6a42a..cb8f50db66 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -4627,6 +4627,27 @@ SRST
     user-provided config files on sysconfdir.
 ERST
 
+DEF("thread-pool", HAS_ARG, QEMU_OPTION_threadpool,
+    "-thread-pool fixed-size=[n]\n"
+    "               Sets the number of threads always available in the pool.\n",
+    QEMU_ARCH_ALL)
+SRST
+``-thread-pool fixed-size=[n]``
+    The ``fixed-size=n`` option sets the number of readily available
+    threads in the pool. When set, the pool will create the threads during
+    initialization and will abstain from growing or shrinking during runtime.
+    This moves the burden of properly sizing the pool to the user in exchange
+    for a more deterministic thread pool behaviour. The number of threads has
+    to be greater than 0.
+
+    When not used, the thread pool size will change dynamically based on
+    demand: converging to being empty when idle and maxing out at 64 threads.
+
+    This option targets real-time systems sensitive to the latency introduced
+    by creating new threads during runtime. Performance-sensitive use-cases are
+    better off not using this.
+ERST
+
 DEF("trace", HAS_ARG, QEMU_OPTION_trace,
     "-trace [[enable=]<pattern>][,events=<file>][,file=<file>]\n"
     "                specify tracing options\n",
diff --git a/softmmu/vl.c b/softmmu/vl.c
index 5e1b35ba48..6a44cc1818 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -72,6 +72,7 @@
 #include "qemu/log.h"
 #include "sysemu/blockdev.h"
 #include "hw/block/block.h"
+#include "block/thread-pool.h"
 #include "hw/i386/x86.h"
 #include "hw/i386/pc.h"
 #include "migration/misc.h"
@@ -496,6 +497,19 @@ static QemuOptsList qemu_action_opts = {
     },
 };
 
+static QemuOptsList qemu_thread_pool_opts = {
+    .name = "thread-pool",
+    .head = QTAILQ_HEAD_INITIALIZER(qemu_thread_pool_opts.head),
+    .desc = {
+        {
+            .name = "fixed-size",
+            .type = QEMU_OPT_NUMBER,
+            .help = "Sets the number of threads available in the pool",
+        },
+        { /* end of list */ }
+    },
+};
+
 const char *qemu_get_vm_name(void)
 {
     return qemu_name;
@@ -2809,6 +2823,7 @@ void qemu_init(int argc, char **argv, char **envp)
     qemu_add_opts(&qemu_semihosting_config_opts);
     qemu_add_opts(&qemu_fw_cfg_opts);
     qemu_add_opts(&qemu_action_opts);
+    qemu_add_opts(&qemu_thread_pool_opts);
     module_call_init(MODULE_INIT_OPTS);
 
     error_init(argv[0]);
@@ -3658,6 +3673,19 @@ void qemu_init(int argc, char **argv, char **envp)
             case QEMU_OPTION_nouserconfig:
                 /* Nothing to be parsed here. Especially, do not error out below. */
                 break;
+            case QEMU_OPTION_threadpool:
+                opts = qemu_opts_parse_noisily(qemu_find_opts("thread-pool"),
+                                               optarg, false);
+                if (!opts) {
+                    exit(1);
+                }
+                thread_pool_max_threads = qemu_opt_get_number(opts, "fixed-size", 0);
+                if (thread_pool_max_threads <= 0) {
+                    error_report("fixed-size is invalid");
+                    exit(1);
+                }
+                thread_pool_fixed_size = true;
+                break;
             default:
                 if (os_parse_cmd_args(popt->index, optarg)) {
                     error_report("Option not supported in this build");
diff --git a/util/thread-pool.c b/util/thread-pool.c
index d763cea505..3081f502ff 100644
--- a/util/thread-pool.c
+++ b/util/thread-pool.c
@@ -25,6 +25,8 @@
 static void do_spawn_thread(ThreadPool *pool);
 
 typedef struct ThreadPoolElement ThreadPoolElement;
+int thread_pool_max_threads = 64;
+bool thread_pool_fixed_size;
 
 enum ThreadState {
     THREAD_QUEUED,
@@ -59,6 +61,7 @@ struct ThreadPool {
     QemuCond worker_stopped;
     QemuSemaphore sem;
     int max_threads;
+    bool fixed_size;
     QEMUBH *new_thread_bh;
 
     /* The following variables are only accessed from one AioContext. */
@@ -83,12 +86,16 @@ static void *worker_thread(void *opaque)
 
     while (!pool->stopping) {
         ThreadPoolElement *req;
-        int ret;
+        int ret = 0;
 
         do {
             pool->idle_threads++;
             qemu_mutex_unlock(&pool->lock);
-            ret = qemu_sem_timedwait(&pool->sem, 10000);
+            if (pool->fixed_size) {
+                qemu_sem_wait(&pool->sem);
+            } else {
+                ret = qemu_sem_timedwait(&pool->sem, 10000);
+            }
             qemu_mutex_lock(&pool->lock);
             pool->idle_threads--;
         } while (ret == -1 && !QTAILQ_EMPTY(&pool->request_list));
@@ -306,11 +313,18 @@ static void thread_pool_init_one(ThreadPool *pool, AioContext *ctx)
     qemu_mutex_init(&pool->lock);
     qemu_cond_init(&pool->worker_stopped);
     qemu_sem_init(&pool->sem, 0);
-    pool->max_threads = 64;
+    pool->max_threads = thread_pool_max_threads;
+    pool->fixed_size = thread_pool_fixed_size;
     pool->new_thread_bh = aio_bh_new(ctx, spawn_thread_bh_fn, pool);
 
     QLIST_INIT(&pool->head);
     QTAILQ_INIT(&pool->request_list);
+
+    for (int i = 0; pool->fixed_size && i < pool->max_threads; i++) {
+        qemu_mutex_lock(&pool->lock);
+        spawn_thread(pool);
+        qemu_mutex_unlock(&pool->lock);
+    }
 }
 
 ThreadPool *thread_pool_new(AioContext *ctx)
-- 
2.34.1




* Re: [RFC] thread-pool: Add option to fix the pool size
  2022-02-02 17:52 [RFC] thread-pool: Add option to fix the pool size Nicolas Saenz Julienne
@ 2022-02-03 10:53 ` Stefan Hajnoczi
  2022-02-03 10:56   ` Daniel P. Berrangé
  2022-02-07 12:00   ` Nicolas Saenz Julienne
  0 siblings, 2 replies; 12+ messages in thread
From: Stefan Hajnoczi @ 2022-02-03 10:53 UTC (permalink / raw)
  To: Nicolas Saenz Julienne
  Cc: kwolf, qemu-block, mtosatti, qemu-devel, hreitz, pbonzini


On Wed, Feb 02, 2022 at 06:52:34PM +0100, Nicolas Saenz Julienne wrote:
> The thread pool regulates itself: when idle, it kills threads until
> empty, when in demand, it creates new threads until full. This behaviour
> doesn't play well with latency sensitive workloads where the price of
> creating a new thread is too high. For example, when paired with qemu's
> '-mlock', or using safety features like SafeStack, creating a new thread
> has been measured take multiple milliseconds.
> 
> In order to mitigate this let's introduce a new option to set a fixed
> pool size. The threads will be created during the pool's initialization,
> remain available during its lifetime regardless of demand, and destroyed
> upon freeing it. A properly characterized workload will then be able to
> configure the pool to avoid any latency spike.
> 
> Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
> 
> ---
> 
> The fix I propose here works for my specific use-case, but I'm pretty
> sure it'll need to be a bit more versatile to accommodate other
> use-cases.
> 
> Some questions:
> 
> - Is unanimously setting these parameters for any pool instance too
>   limiting? It'd make sense to move the options into the AioContext the
>   pool belongs to. IIUC, for the general block use-case, this would be
>   'qemu_aio_context' as initialized in qemu_init_main_loop().

Yes, qemu_aio_context is the main loop's AioContext. It's used unless
IOThreads are configured.

It's nice to have global settings that affect all AioContexts, so I
think this patch is fine for now.

In the future IOThread-specific parameters could be added if individual
IOThread AioContexts need tuning (similar to how poll-max-ns works
today).
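
For reference, that per-IOThread tuning is done today through object
properties on the command line, e.g. (poll-max-ns already exists;
iothread0/drive0 are placeholder IDs, and any thread pool knobs would be
new properties):

    --object iothread,id=iothread0,poll-max-ns=32768 \
    -device virtio-blk-pci,drive=drive0,iothread=iothread0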

> - Currently I'm setting two pool properties through a single qemu
>   option. The pool's size and dynamic behaviour, or lack thereof. I
>   think it'd be better to split them into separate options. I thought of
>   different ways of expressing this (min/max-size where static happens
>   when min-size=max-size, size and static/dynamic, etc..), but you might
>   have ideas on what could be useful to other use-cases.

Yes, "min" and "max" is more flexible than fixed-size=n. fixed-size=n is
equivalent to min=n,max=n. The current default policy is min=0,max=64.
If you want more threads you could do min=0,max=128. If you want to
reserve 1 thread all the time use min=1,max=64.

I would go with min and max.
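
As a purely illustrative sketch of what that could look like on the
command line (none of this syntax is implemented, and the option name is
just borrowed from this RFC):

    -thread-pool min=8,max=8      # static pool, equivalent to fixed-size=8
    -thread-pool min=1,max=64     # one resident thread, grow on demand up to 64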

> 
> Some background on my workload: I'm using IDE emulation, the guest is an
> old RTOS that doesn't support virtio, using 'aio=native' isn't possible
> either (unaligned IO accesses).

I thought QEMU's block layer creates bounce buffers for unaligned
accesses, handling both memory buffer alignment and LBA alignment
necessary for aio=native,cache=none?


* Re: [RFC] thread-pool: Add option to fix the pool size
  2022-02-03 10:53 ` Stefan Hajnoczi
@ 2022-02-03 10:56   ` Daniel P. Berrangé
  2022-02-03 14:19     ` Stefan Hajnoczi
  2022-02-07 12:00   ` Nicolas Saenz Julienne
  1 sibling, 1 reply; 12+ messages in thread
From: Daniel P. Berrangé @ 2022-02-03 10:56 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: kwolf, qemu-block, mtosatti, qemu-devel, hreitz, pbonzini,
	Nicolas Saenz Julienne

On Thu, Feb 03, 2022 at 10:53:07AM +0000, Stefan Hajnoczi wrote:
> On Wed, Feb 02, 2022 at 06:52:34PM +0100, Nicolas Saenz Julienne wrote:
> > The thread pool regulates itself: when idle, it kills threads until
> > empty, when in demand, it creates new threads until full. This behaviour
> > doesn't play well with latency sensitive workloads where the price of
> > creating a new thread is too high. For example, when paired with qemu's
> > '-mlock', or using safety features like SafeStack, creating a new thread
> > has been measured take multiple milliseconds.
> > 
> > In order to mitigate this let's introduce a new option to set a fixed
> > pool size. The threads will be created during the pool's initialization,
> > remain available during its lifetime regardless of demand, and destroyed
> > upon freeing it. A properly characterized workload will then be able to
> > configure the pool to avoid any latency spike.
> > 
> > Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
> > 
> > ---
> > 
> > The fix I propose here works for my specific use-case, but I'm pretty
> > sure it'll need to be a bit more versatile to accommodate other
> > use-cases.
> > 
> > Some questions:
> > 
> > - Is unanimously setting these parameters for any pool instance too
> >   limiting? It'd make sense to move the options into the AioContext the
> >   pool belongs to. IIUC, for the general block use-case, this would be
> >   'qemu_aio_context' as initialized in qemu_init_main_loop().
> 
> Yes, qemu_aio_context is the main loop's AioContext. It's used unless
> IOThreads are configured.
> 
> It's nice to have global settings that affect all AioContexts, so I
> think this patch is fine for now.
> 
> In the future IOThread-specific parameters could be added if individual
> IOThread AioContexts need tuning (similar to how poll-max-ns works
> today).
> 
> > - Currently I'm setting two pool properties through a single qemu
> >   option. The pool's size and dynamic behaviour, or lack thereof. I
> >   think it'd be better to split them into separate options. I thought of
> >   different ways of expressing this (min/max-size where static happens
> >   when min-size=max-size, size and static/dynamic, etc..), but you might
> >   have ideas on what could be useful to other use-cases.
> 
> Yes, "min" and "max" is more flexible than fixed-size=n. fixed-size=n is
> equivalent to min=n,max=n. The current default policy is min=0,max=64.
> If you want more threads you could do min=0,max=128. If you want to
> reserve 1 thread all the time use min=1,max=64.
> 
> I would go with min and max.

This commit also exposes this as a new top-level command line
argument. Given our aim to eliminate QemuOpts and use QAPI/QOM
properties for everything, I think we need a different approach.

I'm not sure which existing QAPI/QOM option is most appropriate
to graft these tunables onto: -machine? -accel? Or is there
no good fit yet?

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [RFC] thread-pool: Add option to fix the pool size
  2022-02-03 10:56   ` Daniel P. Berrangé
@ 2022-02-03 14:19     ` Stefan Hajnoczi
  2022-02-11  9:30       ` Nicolas Saenz Julienne
  2022-02-11 11:32       ` Kevin Wolf
  0 siblings, 2 replies; 12+ messages in thread
From: Stefan Hajnoczi @ 2022-02-03 14:19 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: kwolf, qemu-block, mtosatti, qemu-devel, hreitz, pbonzini,
	Nicolas Saenz Julienne


On Thu, Feb 03, 2022 at 10:56:49AM +0000, Daniel P. Berrangé wrote:
> On Thu, Feb 03, 2022 at 10:53:07AM +0000, Stefan Hajnoczi wrote:
> > On Wed, Feb 02, 2022 at 06:52:34PM +0100, Nicolas Saenz Julienne wrote:
> > > The thread pool regulates itself: when idle, it kills threads until
> > > empty, when in demand, it creates new threads until full. This behaviour
> > > doesn't play well with latency sensitive workloads where the price of
> > > creating a new thread is too high. For example, when paired with qemu's
> > > '-mlock', or using safety features like SafeStack, creating a new thread
> > > has been measured take multiple milliseconds.
> > > 
> > > In order to mitigate this let's introduce a new option to set a fixed
> > > pool size. The threads will be created during the pool's initialization,
> > > remain available during its lifetime regardless of demand, and destroyed
> > > upon freeing it. A properly characterized workload will then be able to
> > > configure the pool to avoid any latency spike.
> > > 
> > > Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
> > > 
> > > ---
> > > 
> > > The fix I propose here works for my specific use-case, but I'm pretty
> > > sure it'll need to be a bit more versatile to accommodate other
> > > use-cases.
> > > 
> > > Some questions:
> > > 
> > > - Is unanimously setting these parameters for any pool instance too
> > >   limiting? It'd make sense to move the options into the AioContext the
> > >   pool belongs to. IIUC, for the general block use-case, this would be
> > >   'qemu_aio_context' as initialized in qemu_init_main_loop().
> > 
> > Yes, qemu_aio_context is the main loop's AioContext. It's used unless
> > IOThreads are configured.
> > 
> > It's nice to have global settings that affect all AioContexts, so I
> > think this patch is fine for now.
> > 
> > In the future IOThread-specific parameters could be added if individual
> > IOThread AioContexts need tuning (similar to how poll-max-ns works
> > today).
> > 
> > > - Currently I'm setting two pool properties through a single qemu
> > >   option. The pool's size and dynamic behaviour, or lack thereof. I
> > >   think it'd be better to split them into separate options. I thought of
> > >   different ways of expressing this (min/max-size where static happens
> > >   when min-size=max-size, size and static/dynamic, etc..), but you might
> > >   have ideas on what could be useful to other use-cases.
> > 
> > Yes, "min" and "max" is more flexible than fixed-size=n. fixed-size=n is
> > equivalent to min=n,max=n. The current default policy is min=0,max=64.
> > If you want more threads you could do min=0,max=128. If you want to
> > reserve 1 thread all the time use min=1,max=64.
> > 
> > I would go with min and max.
> 
> This commit also exposes this as a new top level command line
> argument. Given our aim to eliminate QemuOpts and use QAPI/QOM
> properties for everything I think we need a different approach.
> 
> I'm not sure which exisiting QAPI/QOM option it most appropriate
> to graft these tunables onto ?  -machine ?  -accel ?  Or is there
> no good fit yet ?

Yep, I didn't comment on this because I don't have a good suggestion.

In terms of semantics I think we should have:

1. A global default value that all new AioContexts take. The QEMU main
   loop's qemu_aio_context will use this and all IOThread AioContext
   will use it (unless they have been overridden).

   I would define it on --machine because that's the "global" object for
   a guest, but that's not very satisfying.

2. (Future patch) --object iothread,thread-pool-min=N,thread-pool-max=M
   just like poll-max-ns and friends. This allows the values to be set
   on a per-IOThread basis.
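
Put together, a hypothetical end state could look like the following
(all property names here are illustrative only, nothing below exists
yet):

    # 1. global default, picked up by qemu_aio_context and new IOThreads
    -machine q35,thread-pool-min=1,thread-pool-max=64

    # 2. per-IOThread override (future patch)
    --object iothread,id=iothread1,thread-pool-min=4,thread-pool-max=4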

Stefan


* Re: [RFC] thread-pool: Add option to fix the pool size
  2022-02-03 10:53 ` Stefan Hajnoczi
  2022-02-03 10:56   ` Daniel P. Berrangé
@ 2022-02-07 12:00   ` Nicolas Saenz Julienne
  2022-02-07 15:20     ` Stefan Hajnoczi
  1 sibling, 1 reply; 12+ messages in thread
From: Nicolas Saenz Julienne @ 2022-02-07 12:00 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: kwolf, qemu-block, mtosatti, qemu-devel, hreitz, pbonzini

Hi Stefan, thanks for the review. I took note of your comments.

On Thu, 2022-02-03 at 10:53 +0000, Stefan Hajnoczi wrote:
> > Some background on my workload: I'm using IDE emulation, the guest is an
> > old RTOS that doesn't support virtio, using 'aio=native' isn't possible
> > either (unaligned IO accesses).
> 
> I thought QEMU's block layer creates bounce buffers for unaligned
> accesses, handling both memory buffer alignment and LBA alignment
> necessary for aio=native,cache=none?

See block/file-posix.c:raw_co_prw() {

    /*
     * When using O_DIRECT, the request must be aligned to be able to use
     * either libaio or io_uring interface. If not fail back to regular thread
     * pool read/write code which emulates this for us if we set
     * QEMU_AIO_MISALIGNED.
     */
    if (s->needs_alignment && !bdrv_qiov_is_aligned(bs, qiov))
	type |= QEMU_AIO_MISALIGNED;
    else if (s->use_linux_io_uring)
        return luring_co_submit(...);
    else if (s->use_linux_aio)
        return laio_co_submit(...);

    return raw_thread_pool_submit(..., handle_aiocb_rw, ...);
}

bdrv_qiov_is_aligned() returns 'false' on my use-case.

I believe what you're referring to happens in handle_aiocb_rw(), but it's too
late then.

Thanks,

-- 
Nicolás Sáenz




* Re: [RFC] thread-pool: Add option to fix the pool size
  2022-02-07 12:00   ` Nicolas Saenz Julienne
@ 2022-02-07 15:20     ` Stefan Hajnoczi
  0 siblings, 0 replies; 12+ messages in thread
From: Stefan Hajnoczi @ 2022-02-07 15:20 UTC (permalink / raw)
  To: Nicolas Saenz Julienne
  Cc: kwolf, qemu-block, mtosatti, qemu-devel, hreitz, pbonzini


On Mon, Feb 07, 2022 at 01:00:46PM +0100, Nicolas Saenz Julienne wrote:
> Hi Stefan, thanks for the review. I took note of your comments.
> 
> On Thu, 2022-02-03 at 10:53 +0000, Stefan Hajnoczi wrote:
> > > Some background on my workload: I'm using IDE emulation, the guest is an
> > > old RTOS that doesn't support virtio, using 'aio=native' isn't possible
> > > either (unaligned IO accesses).
> > 
> > I thought QEMU's block layer creates bounce buffers for unaligned
> > accesses, handling both memory buffer alignment and LBA alignment
> > necessary for aio=native,cache=none?
> 
> See block/file-posix.c:raw_co_prw() {
> 
>     /*
>      * When using O_DIRECT, the request must be aligned to be able to use
>      * either libaio or io_uring interface. If not fail back to regular thread
>      * pool read/write code which emulates this for us if we set
>      * QEMU_AIO_MISALIGNED.
>      */
>     if (s->needs_alignment && !bdrv_qiov_is_aligned(bs, qiov))
> 	type |= QEMU_AIO_MISALIGNED;
>     else if (s->use_linux_io_uring)
>         return luring_co_submit(...);
>     else if (s->use_linux_aio)
>         return laio_co_submit(...);
> 
>     return raw_thread_pool_submit(..., handle_aiocb_rw, ...);
> }
> 
> bdrv_qiov_is_aligned() returns 'false' on my use-case.
> 
> I believe what you're referring to happens in handle_aiocb_rw(), but it's too
> late then.

I was thinking of block/io.c:bdrv_co_preadv_part() but it only aligns
the LBA, not memory buffers.

Stefan


* Re: [RFC] thread-pool: Add option to fix the pool size
  2022-02-03 14:19     ` Stefan Hajnoczi
@ 2022-02-11  9:30       ` Nicolas Saenz Julienne
  2022-02-11 11:32       ` Kevin Wolf
  1 sibling, 0 replies; 12+ messages in thread
From: Nicolas Saenz Julienne @ 2022-02-11  9:30 UTC (permalink / raw)
  To: Stefan Hajnoczi, Daniel P. Berrangé
  Cc: kwolf, qemu-block, mtosatti, qemu-devel, hreitz, pbonzini

On Thu, 2022-02-03 at 14:19 +0000, Stefan Hajnoczi wrote:
> Yep, I didn't comment on this because I don't have a good suggestion.
> 
> In terms of semantics I think we should have:
> 
> 1. A global default value that all new AioContext take. The QEMU main
>    loop's qemu_aio_context will use this and all IOThread AioContext
>    will use it (unless they have been overridden).
> 
>    I would define it on --machine because that's the "global" object for
>    a guest, but that's not very satisfying.

So I tried to implement this. One problem arose:
 - The thread pool properties are now part of the MachineState, so as to
   properly use QOM.
 - Sadly, the main loop is initialized before the machine class options are
   populated. See 'qemu_init_main_loop()' and 'qemu_apply_machine_options()' in
   'softmmu/vl.c'.
 - Short of manually parsing the options, which IMO defeats the purpose of
   using QOM, or changing the initialization order, which I'm sure won't be
   easy, I can't access the properties early enough.

Any ideas?

Thanks!

-- 
Nicolás Sáenz




* Re: [RFC] thread-pool: Add option to fix the pool size
  2022-02-03 14:19     ` Stefan Hajnoczi
  2022-02-11  9:30       ` Nicolas Saenz Julienne
@ 2022-02-11 11:32       ` Kevin Wolf
  2022-02-14 10:09         ` Stefan Hajnoczi
  1 sibling, 1 reply; 12+ messages in thread
From: Kevin Wolf @ 2022-02-11 11:32 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Daniel P. Berrangé,
	qemu-block, mtosatti, qemu-devel, hreitz, pbonzini,
	Nicolas Saenz Julienne


Am 03.02.2022 um 15:19 hat Stefan Hajnoczi geschrieben:
> On Thu, Feb 03, 2022 at 10:56:49AM +0000, Daniel P. Berrangé wrote:
> > On Thu, Feb 03, 2022 at 10:53:07AM +0000, Stefan Hajnoczi wrote:
> > > On Wed, Feb 02, 2022 at 06:52:34PM +0100, Nicolas Saenz Julienne wrote:
> > > > The thread pool regulates itself: when idle, it kills threads until
> > > > empty, when in demand, it creates new threads until full. This behaviour
> > > > doesn't play well with latency sensitive workloads where the price of
> > > > creating a new thread is too high. For example, when paired with qemu's
> > > > '-mlock', or using safety features like SafeStack, creating a new thread
> > > > has been measured take multiple milliseconds.
> > > > 
> > > > In order to mitigate this let's introduce a new option to set a fixed
> > > > pool size. The threads will be created during the pool's initialization,
> > > > remain available during its lifetime regardless of demand, and destroyed
> > > > upon freeing it. A properly characterized workload will then be able to
> > > > configure the pool to avoid any latency spike.
> > > > 
> > > > Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
> > > > 
> > > > ---
> > > > 
> > > > The fix I propose here works for my specific use-case, but I'm pretty
> > > > sure it'll need to be a bit more versatile to accommodate other
> > > > use-cases.
> > > > 
> > > > Some questions:
> > > > 
> > > > - Is unanimously setting these parameters for any pool instance too
> > > >   limiting? It'd make sense to move the options into the AioContext the
> > > >   pool belongs to. IIUC, for the general block use-case, this would be
> > > >   'qemu_aio_context' as initialized in qemu_init_main_loop().
> > > 
> > > Yes, qemu_aio_context is the main loop's AioContext. It's used unless
> > > IOThreads are configured.
> > > 
> > > It's nice to have global settings that affect all AioContexts, so I
> > > think this patch is fine for now.
> > > 
> > > In the future IOThread-specific parameters could be added if individual
> > > IOThread AioContexts need tuning (similar to how poll-max-ns works
> > > today).
> > > 
> > > > - Currently I'm setting two pool properties through a single qemu
> > > >   option. The pool's size and dynamic behaviour, or lack thereof. I
> > > >   think it'd be better to split them into separate options. I thought of
> > > >   different ways of expressing this (min/max-size where static happens
> > > >   when min-size=max-size, size and static/dynamic, etc..), but you might
> > > >   have ideas on what could be useful to other use-cases.
> > > 
> > > Yes, "min" and "max" is more flexible than fixed-size=n. fixed-size=n is
> > > equivalent to min=n,max=n. The current default policy is min=0,max=64.
> > > If you want more threads you could do min=0,max=128. If you want to
> > > reserve 1 thread all the time use min=1,max=64.
> > > 
> > > I would go with min and max.
> > 
> > This commit also exposes this as a new top level command line
> > argument. Given our aim to eliminate QemuOpts and use QAPI/QOM
> > properties for everything I think we need a different approach.
> > 
> > I'm not sure which exisiting QAPI/QOM option it most appropriate
> > to graft these tunables onto ?  -machine ?  -accel ?  Or is there
> > no good fit yet ?

I would agree that it should be QAPI, but just like QemuOpts doesn't
require that you shoehorn it into an existing option, QAPI doesn't
necessarily require it either, if that's the interface we want. You could just
create a new QAPI struct for it and parse the new option into that. This
would already be an improvement over this RFC.

Now, whether we actually want a new top-level option is a different
question (we usually try to avoid it), but it's not related to QAPI vs.
QemuOpts.

> Yep, I didn't comment on this because I don't have a good suggestion.
> 
> In terms of semantics I think we should have:
> 
> 1. A global default value that all new AioContext take. The QEMU main
>    loop's qemu_aio_context will use this and all IOThread AioContext
>    will use it (unless they have been overridden).
> 
>    I would define it on --machine because that's the "global" object for
>    a guest, but that's not very satisfying.

Semantically, -machine is about the virtual hardware whereas iothreads
are about the backend, so I agree it's not a good fit.

For the main thread, you may want to configure all the same options that
you can configure for an iothread. So to me that sounds like we would
want to allow using an iothread object for the main thread, too.

That would still require us to tell QEMU which iothread object should be
used for the main thread, though.

> 2. (Future patch) --object iothread,thread-pool-min=N,thread-pool-max=M
>    just like poll-max-ns and friends. This allows the values to be set
>    on a per-IOThread basis.

And to be updated with qom-set. (Which is again something that you'll
want for the main thread, too.)

Kevin


* Re: [RFC] thread-pool: Add option to fix the pool size
  2022-02-11 11:32       ` Kevin Wolf
@ 2022-02-14 10:09         ` Stefan Hajnoczi
  2022-02-14 11:38           ` Kevin Wolf
  0 siblings, 1 reply; 12+ messages in thread
From: Stefan Hajnoczi @ 2022-02-14 10:09 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Daniel P. Berrangé,
	qemu-block, mtosatti, qemu-devel, hreitz, pbonzini,
	Nicolas Saenz Julienne


On Fri, Feb 11, 2022 at 12:32:57PM +0100, Kevin Wolf wrote:
> Am 03.02.2022 um 15:19 hat Stefan Hajnoczi geschrieben:
> > On Thu, Feb 03, 2022 at 10:56:49AM +0000, Daniel P. Berrangé wrote:
> > > On Thu, Feb 03, 2022 at 10:53:07AM +0000, Stefan Hajnoczi wrote:
> > > > On Wed, Feb 02, 2022 at 06:52:34PM +0100, Nicolas Saenz Julienne wrote:
> > 1. A global default value that all new AioContext take. The QEMU main
> >    loop's qemu_aio_context will use this and all IOThread AioContext
> >    will use it (unless they have been overridden).
> > 
> >    I would define it on --machine because that's the "global" object for
> >    a guest, but that's not very satisfying.
> 
> Semantically, -machine is about the virtual hardware where as iothreads
> are about the backend, so I agree it's not a good fit.
> 
> For the main thread, you may want to configure all the same options that
> you can configure for an iothread. So to me that sounds like we would
> want to allow using an iothread object for the main thread, too.
> 
> That would still require us to tell QEMU which iothread object should be
> used for the main thread, though.

Making the main loop thread an IOThread is an interesting direction but
not an easy change to make.

The main loop thread has a custom event loop that is not interchangeable
with the IOThread event loop:
- The main loop has a poll notifier interface for libslirp fd monitoring
  integration.
- The main loop is a GLib event loop but manually polls to get
  nanosecond resolution timers.
- The main loop has icount integration.
- The main loop has the special iohandler AioContext

The IOThread event loop runs an optimized AioContext event loop instead.
It falls back to regular g_main_loop_run() if there is a GSource user.

It would definitely be nice to unify the main loop with IOThread and
then use --object iothread,... to configure main loop parameters.

I'm not sure if requiring that of Nicolas is fair though. The event
loops in QEMU are complex and changes are likely to introduce subtle
bugs or performance regressions.

Stefan


* Re: [RFC] thread-pool: Add option to fix the pool size
  2022-02-14 10:09         ` Stefan Hajnoczi
@ 2022-02-14 11:38           ` Kevin Wolf
  2022-02-14 13:10             ` Stefan Hajnoczi
  0 siblings, 1 reply; 12+ messages in thread
From: Kevin Wolf @ 2022-02-14 11:38 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Daniel P. Berrangé,
	qemu-block, mtosatti, qemu-devel, hreitz, pbonzini,
	Nicolas Saenz Julienne


Am 14.02.2022 um 11:09 hat Stefan Hajnoczi geschrieben:
> On Fri, Feb 11, 2022 at 12:32:57PM +0100, Kevin Wolf wrote:
> > Am 03.02.2022 um 15:19 hat Stefan Hajnoczi geschrieben:
> > > On Thu, Feb 03, 2022 at 10:56:49AM +0000, Daniel P. Berrangé wrote:
> > > > On Thu, Feb 03, 2022 at 10:53:07AM +0000, Stefan Hajnoczi wrote:
> > > > > On Wed, Feb 02, 2022 at 06:52:34PM +0100, Nicolas Saenz Julienne wrote:
> > > 1. A global default value that all new AioContext take. The QEMU main
> > >    loop's qemu_aio_context will use this and all IOThread AioContext
> > >    will use it (unless they have been overridden).
> > > 
> > >    I would define it on --machine because that's the "global" object for
> > >    a guest, but that's not very satisfying.
> > 
> > Semantically, -machine is about the virtual hardware where as iothreads
> > are about the backend, so I agree it's not a good fit.
> > 
> > For the main thread, you may want to configure all the same options that
> > you can configure for an iothread. So to me that sounds like we would
> > want to allow using an iothread object for the main thread, too.
> > 
> > That would still require us to tell QEMU which iothread object should be
> > used for the main thread, though.
> 
> Making the main loop thread an IOThread is an interesting direction but
> not an easy change to make.
> 
> The main loop thread has a custom event loop that is not interchangeable
> with the IOThread event loop:
> - The main loop has a poll notifier interface for libslirp fd monitoring
>   integration.
> - The main loop is a GLib event loop but manually polls to get
>   nanosecond resolution timers.
> - The main loop has icount integration.
> - The main loop has the special iohandler AioContext
> 
> The IOThread event loop runs an optimized AioContext event loop instead.
> It falls back to regular g_main_loop_run() if there is a GSource user.
> 
> It would definitely be nice to unify the main loop with IOThread and
> then use --object iothread,... to configure main loop parameters.
> 
> I'm not sure if requiring that of Nicolas is fair though. The event
> loops in QEMU are complex and changes are likely to introduce subtle
> bugs or performance regressions.

I'm not suggesting actually running the iothread event loop there,
merely using the properties of an iothread object as the external user
interface for configuring the main thread.
Whether this uses the same main loop code as today or is moved to the
regular iothread event loop is an implementation detail that can be
changed later.

Or we could maybe use a different object type like 'mainthread' and
share the properties using QOM inheritance.

Kevin


* Re: [RFC] thread-pool: Add option to fix the pool size
  2022-02-14 11:38           ` Kevin Wolf
@ 2022-02-14 13:10             ` Stefan Hajnoczi
  2022-02-14 14:09               ` Kevin Wolf
  0 siblings, 1 reply; 12+ messages in thread
From: Stefan Hajnoczi @ 2022-02-14 13:10 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Daniel P. Berrangé,
	qemu-block, mtosatti, qemu-devel, hreitz, pbonzini,
	Nicolas Saenz Julienne


On Mon, Feb 14, 2022 at 12:38:22PM +0100, Kevin Wolf wrote:
> Am 14.02.2022 um 11:09 hat Stefan Hajnoczi geschrieben:
> > On Fri, Feb 11, 2022 at 12:32:57PM +0100, Kevin Wolf wrote:
> > > Am 03.02.2022 um 15:19 hat Stefan Hajnoczi geschrieben:
> > > > On Thu, Feb 03, 2022 at 10:56:49AM +0000, Daniel P. Berrangé wrote:
> > > > > On Thu, Feb 03, 2022 at 10:53:07AM +0000, Stefan Hajnoczi wrote:
> > > > > > On Wed, Feb 02, 2022 at 06:52:34PM +0100, Nicolas Saenz Julienne wrote:
> > > > 1. A global default value that all new AioContext take. The QEMU main
> > > >    loop's qemu_aio_context will use this and all IOThread AioContext
> > > >    will use it (unless they have been overridden).
> > > > 
> > > >    I would define it on --machine because that's the "global" object for
> > > >    a guest, but that's not very satisfying.
> > > 
> > > Semantically, -machine is about the virtual hardware where as iothreads
> > > are about the backend, so I agree it's not a good fit.
> > > 
> > > For the main thread, you may want to configure all the same options that
> > > you can configure for an iothread. So to me that sounds like we would
> > > want to allow using an iothread object for the main thread, too.
> > > 
> > > That would still require us to tell QEMU which iothread object should be
> > > used for the main thread, though.
> > 
> > Making the main loop thread an IOThread is an interesting direction but
> > not an easy change to make.
> > 
> > The main loop thread has a custom event loop that is not interchangeable
> > with the IOThread event loop:
> > - The main loop has a poll notifier interface for libslirp fd monitoring
> >   integration.
> > - The main loop is a GLib event loop but manually polls to get
> >   nanosecond resolution timers.
> > - The main loop has icount integration.
> > - The main loop has the special iohandler AioContext
> > 
> > The IOThread event loop runs an optimized AioContext event loop instead.
> > It falls back to regular g_main_loop_run() if there is a GSource user.
> > 
> > It would definitely be nice to unify the main loop with IOThread and
> > then use --object iothread,... to configure main loop parameters.
> > 
> > I'm not sure if requiring that of Nicolas is fair though. The event
> > loops in QEMU are complex and changes are likely to introduce subtle
> > bugs or performance regressions.
> 
> I'm not suggesting actually running the iothread event loop instead,
> merely using the properties of an object to configure the main thread as
> the external user interface.
> Whether this uses the same main loop code as today or is moved to the
> regular iothread event loop is an implementation detail that can be
> changed later.
> 
> Or we could maybe use a different object type like 'mainthread' and
> share the properties using QOM inheritance.

That seems cleaner to me than trying to fake an IOThread, since I don't
see a concrete plan to unify the two event loops.

The main loop code is in util/main-loop.c. Maybe call it --object
main-loop? Attempting to instantiate more than one main-loop object
should fail.
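
As a sketch (neither the object type nor the thread pool properties
exist today, so everything below is illustrative), the command line
could then end up looking like:

    --object main-loop,id=main,thread-pool-min=1,thread-pool-max=64 \
    --object iothread,id=iothread0,thread-pool-min=4,thread-pool-max=4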

Stefan


* Re: [RFC] thread-pool: Add option to fix the pool size
  2022-02-14 13:10             ` Stefan Hajnoczi
@ 2022-02-14 14:09               ` Kevin Wolf
  0 siblings, 0 replies; 12+ messages in thread
From: Kevin Wolf @ 2022-02-14 14:09 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Daniel P. Berrangé,
	qemu-block, mtosatti, qemu-devel, hreitz, pbonzini,
	Nicolas Saenz Julienne


Am 14.02.2022 um 14:10 hat Stefan Hajnoczi geschrieben:
> On Mon, Feb 14, 2022 at 12:38:22PM +0100, Kevin Wolf wrote:
> > Am 14.02.2022 um 11:09 hat Stefan Hajnoczi geschrieben:
> > > On Fri, Feb 11, 2022 at 12:32:57PM +0100, Kevin Wolf wrote:
> > > > Am 03.02.2022 um 15:19 hat Stefan Hajnoczi geschrieben:
> > > > > On Thu, Feb 03, 2022 at 10:56:49AM +0000, Daniel P. Berrangé wrote:
> > > > > > On Thu, Feb 03, 2022 at 10:53:07AM +0000, Stefan Hajnoczi wrote:
> > > > > > > On Wed, Feb 02, 2022 at 06:52:34PM +0100, Nicolas Saenz Julienne wrote:
> > > > > 1. A global default value that all new AioContext take. The QEMU main
> > > > >    loop's qemu_aio_context will use this and all IOThread AioContext
> > > > >    will use it (unless they have been overridden).
> > > > > 
> > > > >    I would define it on --machine because that's the "global" object for
> > > > >    a guest, but that's not very satisfying.
> > > > 
> > > > Semantically, -machine is about the virtual hardware where as iothreads
> > > > are about the backend, so I agree it's not a good fit.
> > > > 
> > > > For the main thread, you may want to configure all the same options that
> > > > you can configure for an iothread. So to me that sounds like we would
> > > > want to allow using an iothread object for the main thread, too.
> > > > 
> > > > That would still require us to tell QEMU which iothread object should be
> > > > used for the main thread, though.
> > > 
> > > Making the main loop thread an IOThread is an interesting direction but
> > > not an easy change to make.
> > > 
> > > The main loop thread has a custom event loop that is not interchangeable
> > > with the IOThread event loop:
> > > - The main loop has a poll notifier interface for libslirp fd monitoring
> > >   integration.
> > > - The main loop is a GLib event loop but manually polls to get
> > >   nanosecond resolution timers.
> > > - The main loop has icount integration.
> > > - The main loop has the special iohandler AioContext
> > > 
> > > The IOThread event loop runs an optimized AioContext event loop instead.
> > > It falls back to regular g_main_loop_run() if there is a GSource user.
> > > 
> > > It would definitely be nice to unify the main loop with IOThread and
> > > then use --object iothread,... to configure main loop parameters.
> > > 
> > > I'm not sure if requiring that of Nicolas is fair though. The event
> > > loops in QEMU are complex and changes are likely to introduce subtle
> > > bugs or performance regressions.
> > 
> > I'm not suggesting actually running the iothread event loop instead,
> > merely using the properties of an object to configure the main thread as
> > the external user interface.
> > Whether this uses the same main loop code as today or is moved to the
> > regular iothread event loop is an implementation detail that can be
> > changed later.
> > 
> > Or we could maybe use a different object type like 'mainthread' and
> > share the properties using QOM inheritance.
> 
> That seems cleaner than trying faking an IOThread to me since I don't
> see a concrete plan to unify the two event loops.
> 
> The main loop code is in util/main-loop.c. Maybe call it --object
> main-loop? Attempting to instantiate more than one main-loop object
> should fail.

Sounds good. And if you don't create one explicitly, we'll just
internally create a default main-loop object.

Kevin

