All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] tests/iothread: Always connect iothread GSource to a GMainContext
@ 2020-01-06 14:45 Peter Maydell
  2020-01-06 15:22 ` Marc-André Lureau
  2020-01-06 17:28 ` Peter Xu
  0 siblings, 2 replies; 7+ messages in thread
From: Peter Maydell @ 2020-01-06 14:45 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Stefan Hajnoczi

On older versions of glib (anything prior to glib commit 0f056ebe
from May 2019), the implementation of g_source_ref() and
g_source_unref() is not threadsafe for a GSource which is not
attached to a GMainContext.

QEMU's real iothread.c implementation always attaches its
iothread->ctx's GSource to a GMainContext created for that iothread,
so it is OK, but the simple test framework implementation in
tests/iothread.c was not doing this.  This was causing intermittent
assertion failures in the test-aio-multithread subtest
"/aio/multi/mutex/contended" test on the BSD hosts.  (It's unclear
why only BSD seems to have been affected -- perhaps a combination of
the specific glib version being used in the VMs and their happening
to run on a host with a lot of CPUs).

Borrow the iothread_init_gcontext() from the real iothread.c
and add the corresponding cleanup code and the calls to
g_main_context_push/pop_thread_default() so we actually use
the GMainContext we create.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
I don't really have a good understanding of the glib APIs here,
so I'm mostly just cribbing code from the real iothread.c;
review by people who do know the glib/iothread stuff better
welcomed. It does seem to fix the intermittent test failure
on NetBSD, at least, where we were running into an assertion
failure because a g_source_unref() incorrectly thought it
had decremented the refcount to 0 and should delete a context
that was actually still in use.

 tests/iothread.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/tests/iothread.c b/tests/iothread.c
index 13c9fdcd8df..d3a2ee9a014 100644
--- a/tests/iothread.c
+++ b/tests/iothread.c
@@ -21,6 +21,8 @@
 
 struct IOThread {
     AioContext *ctx;
+    GMainContext *worker_context;
+    GMainLoop *main_loop;
 
     QemuThread thread;
     QemuMutex init_done_lock;
@@ -35,6 +37,17 @@ AioContext *qemu_get_current_aio_context(void)
     return my_iothread ? my_iothread->ctx : qemu_get_aio_context();
 }
 
+static void iothread_init_gcontext(IOThread *iothread)
+{
+    GSource *source;
+
+    iothread->worker_context = g_main_context_new();
+    source = aio_get_g_source(iothread_get_aio_context(iothread));
+    g_source_attach(source, iothread->worker_context);
+    g_source_unref(source);
+    iothread->main_loop = g_main_loop_new(iothread->worker_context, TRUE);
+}
+
 static void *iothread_run(void *opaque)
 {
     IOThread *iothread = opaque;
@@ -44,6 +57,20 @@ static void *iothread_run(void *opaque)
     my_iothread = iothread;
     qemu_mutex_lock(&iothread->init_done_lock);
     iothread->ctx = aio_context_new(&error_abort);
+
+    /*
+     * We must connect the ctx to a GMainContext, because in older versions
+     * of glib the g_source_ref()/unref() functions are not threadsafe
+     * on sources without a context.
+     */
+    iothread_init_gcontext(iothread);
+
+    /*
+     * g_main_context_push_thread_default() must be called before anything
+     * in this new thread uses glib.
+     */
+    g_main_context_push_thread_default(iothread->worker_context);
+
     qemu_cond_signal(&iothread->init_done_cond);
     qemu_mutex_unlock(&iothread->init_done_lock);
 
@@ -51,6 +78,7 @@ static void *iothread_run(void *opaque)
         aio_poll(iothread->ctx, true);
     }
 
+    g_main_context_pop_thread_default(iothread->worker_context);
     rcu_unregister_thread();
     return NULL;
 }
@@ -66,6 +94,8 @@ void iothread_join(IOThread *iothread)
 {
     aio_bh_schedule_oneshot(iothread->ctx, iothread_stop_bh, iothread);
     qemu_thread_join(&iothread->thread);
+    g_main_context_unref(iothread->worker_context);
+    g_main_loop_unref(iothread->main_loop);
     qemu_cond_destroy(&iothread->init_done_cond);
     qemu_mutex_destroy(&iothread->init_done_lock);
     aio_context_unref(iothread->ctx);
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] tests/iothread: Always connect iothread GSource to a GMainContext
  2020-01-06 14:45 [PATCH] tests/iothread: Always connect iothread GSource to a GMainContext Peter Maydell
@ 2020-01-06 15:22 ` Marc-André Lureau
  2020-01-06 15:31   ` Peter Maydell
  2020-01-06 17:28 ` Peter Xu
  1 sibling, 1 reply; 7+ messages in thread
From: Marc-André Lureau @ 2020-01-06 15:22 UTC (permalink / raw)
  To: Peter Maydell; +Cc: Stefan Hajnoczi, QEMU, Peter Xu

Hi

On Mon, Jan 6, 2020 at 7:03 PM Peter Maydell <peter.maydell@linaro.org> wrote:
>
> On older versions of glib (anything prior to glib commit 0f056ebe
> from May 2019), the implementation of g_source_ref() and
> g_source_unref() is not threadsafe for a GSource which is not
> attached to a GMainContext.
>
> QEMU's real iothread.c implementation always attaches its
> iothread->ctx's GSource to a GMainContext created for that iothread,
> so it is OK, but the simple test framework implementation in
> tests/iothread.c was not doing this.  This was causing intermittent
> assertion failures in the test-aio-multithread subtest
> "/aio/multi/mutex/contended" test on the BSD hosts.  (It's unclear
> why only BSD seems to have been affected -- perhaps a combination of
> the specific glib version being used in the VMs and their happening
> to run on a host with a lot of CPUs).
>
> Borrow the iothread_init_gcontext() from the real iothread.c
> and add the corresponding cleanup code and the calls to
> g_main_context_push/pop_thread_default() so we actually use
> the GMainContext we create.
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> I don't really have a good understanding of the glib APIs here,
> so I'm mostly just cribbing code from the real iothread.c;
> review by people who do know the glib/iothread stuff better
> welcomed. It does seem to fix the intermittent test failure
> on NetBSD, at least, where we were running into an assertion
> failure because a g_source_unref() incorrectly thought it
> had decremented the refcount to 0 and should delete a context
> that was actually still in use.
>
>  tests/iothread.c | 30 ++++++++++++++++++++++++++++++
>  1 file changed, 30 insertions(+)
>
> diff --git a/tests/iothread.c b/tests/iothread.c
> index 13c9fdcd8df..d3a2ee9a014 100644
> --- a/tests/iothread.c
> +++ b/tests/iothread.c
> @@ -21,6 +21,8 @@
>
>  struct IOThread {
>      AioContext *ctx;
> +    GMainContext *worker_context;
> +    GMainLoop *main_loop;
>
>      QemuThread thread;
>      QemuMutex init_done_lock;
> @@ -35,6 +37,17 @@ AioContext *qemu_get_current_aio_context(void)
>      return my_iothread ? my_iothread->ctx : qemu_get_aio_context();
>  }
>
> +static void iothread_init_gcontext(IOThread *iothread)
> +{
> +    GSource *source;
> +
> +    iothread->worker_context = g_main_context_new();
> +    source = aio_get_g_source(iothread_get_aio_context(iothread));
> +    g_source_attach(source, iothread->worker_context);
> +    g_source_unref(source);
> +    iothread->main_loop = g_main_loop_new(iothread->worker_context, TRUE);
> +}
> +
>  static void *iothread_run(void *opaque)
>  {
>      IOThread *iothread = opaque;
> @@ -44,6 +57,20 @@ static void *iothread_run(void *opaque)
>      my_iothread = iothread;
>      qemu_mutex_lock(&iothread->init_done_lock);
>      iothread->ctx = aio_context_new(&error_abort);
> +
> +    /*
> +     * We must connect the ctx to a GMainContext, because in older versions
> +     * of glib the g_source_ref()/unref() functions are not threadsafe
> +     * on sources without a context.
> +     */
> +    iothread_init_gcontext(iothread);
> +
> +    /*
> +     * g_main_context_push_thread_default() must be called before anything
> +     * in this new thread uses glib.

in/if, I suppose

> +     */
> +    g_main_context_push_thread_default(iothread->worker_context);
> +
>      qemu_cond_signal(&iothread->init_done_cond);
>      qemu_mutex_unlock(&iothread->init_done_lock);
>
> @@ -51,6 +78,7 @@ static void *iothread_run(void *opaque)
>          aio_poll(iothread->ctx, true);
>      }
>
> +    g_main_context_pop_thread_default(iothread->worker_context);
>      rcu_unregister_thread();
>      return NULL;
>  }
> @@ -66,6 +94,8 @@ void iothread_join(IOThread *iothread)
>  {
>      aio_bh_schedule_oneshot(iothread->ctx, iothread_stop_bh, iothread);
>      qemu_thread_join(&iothread->thread);
> +    g_main_context_unref(iothread->worker_context);
> +    g_main_loop_unref(iothread->main_loop);
>      qemu_cond_destroy(&iothread->init_done_cond);
>      qemu_mutex_destroy(&iothread->init_done_lock);
>      aio_context_unref(iothread->ctx);
> --
> 2.20.1
>
>

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>


-- 
Marc-André Lureau


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tests/iothread: Always connect iothread GSource to a GMainContext
  2020-01-06 15:22 ` Marc-André Lureau
@ 2020-01-06 15:31   ` Peter Maydell
  0 siblings, 0 replies; 7+ messages in thread
From: Peter Maydell @ 2020-01-06 15:31 UTC (permalink / raw)
  To: Marc-André Lureau; +Cc: Stefan Hajnoczi, QEMU, Peter Xu

On Mon, 6 Jan 2020 at 15:22, Marc-André Lureau
<marcandre.lureau@gmail.com> wrote:
>
> Hi
>
> On Mon, Jan 6, 2020 at 7:03 PM Peter Maydell <peter.maydell@linaro.org> wrote:
> >
> > On older versions of glib (anything prior to glib commit 0f056ebe
> > from May 2019), the implementation of g_source_ref() and
> > g_source_unref() is not threadsafe for a GSource which is not
> > attached to a GMainContext.
> >
> > QEMU's real iothread.c implementation always attaches its
> > iothread->ctx's GSource to a GMainContext created for that iothread,
> > so it is OK, but the simple test framework implementation in
> > tests/iothread.c was not doing this.  This was causing intermittent
> > assertion failures in the test-aio-multithread subtest
> > "/aio/multi/mutex/contended" test on the BSD hosts.  (It's unclear
> > why only BSD seems to have been affected -- perhaps a combination of
> > the specific glib version being used in the VMs and their happening
> > to run on a host with a lot of CPUs).

> >  static void *iothread_run(void *opaque)
> >  {
> >      IOThread *iothread = opaque;
> > @@ -44,6 +57,20 @@ static void *iothread_run(void *opaque)
> >      my_iothread = iothread;
> >      qemu_mutex_lock(&iothread->init_done_lock);
> >      iothread->ctx = aio_context_new(&error_abort);
> > +
> > +    /*
> > +     * We must connect the ctx to a GMainContext, because in older versions
> > +     * of glib the g_source_ref()/unref() functions are not threadsafe
> > +     * on sources without a context.
> > +     */
> > +    iothread_init_gcontext(iothread);
> > +
> > +    /*
> > +     * g_main_context_push_thread_default() must be called before anything
> > +     * in this new thread uses glib.
>
> in/if, I suppose

No; it means "before anything in this new thread specifically" as
opposed to "before anything in the whole process". (This comment is
verbatim copied from the main iothread.c, incidentally).

Thanks for the review.

-- PMM


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tests/iothread: Always connect iothread GSource to a GMainContext
  2020-01-06 14:45 [PATCH] tests/iothread: Always connect iothread GSource to a GMainContext Peter Maydell
  2020-01-06 15:22 ` Marc-André Lureau
@ 2020-01-06 17:28 ` Peter Xu
  2020-01-06 17:40   ` Peter Maydell
  1 sibling, 1 reply; 7+ messages in thread
From: Peter Xu @ 2020-01-06 17:28 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel, Stefan Hajnoczi

On Mon, Jan 06, 2020 at 02:45:52PM +0000, Peter Maydell wrote:
> On older versions of glib (anything prior to glib commit 0f056ebe
> from May 2019), the implementation of g_source_ref() and
> g_source_unref() is not threadsafe for a GSource which is not
> attached to a GMainContext.
> 
> QEMU's real iothread.c implementation always attaches its
> iothread->ctx's GSource to a GMainContext created for that iothread,
> so it is OK, but the simple test framework implementation in
> tests/iothread.c was not doing this.  This was causing intermittent
> assertion failures in the test-aio-multithread subtest
> "/aio/multi/mutex/contended" test on the BSD hosts.  (It's unclear
> why only BSD seems to have been affected -- perhaps a combination of
> the specific glib version being used in the VMs and their happening
> to run on a host with a lot of CPUs).
> 
> Borrow the iothread_init_gcontext() from the real iothread.c
> and add the corresponding cleanup code and the calls to
> g_main_context_push/pop_thread_default() so we actually use
> the GMainContext we create.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

I've no idea on the g_source_ref() issue, but if so then a patch like
this makes sense to me especially if it fixes up test failures.

Reviewed-by: Peter Xu <peterx@redhat.com>

Still a few comments below.

> ---
> I don't really have a good understanding of the glib APIs here,
> so I'm mostly just cribbing code from the real iothread.c;
> review by people who do know the glib/iothread stuff better
> welcomed. It does seem to fix the intermittent test failure
> on NetBSD, at least, where we were running into an assertion
> failure because a g_source_unref() incorrectly thought it
> had decremented the refcount to 0 and should delete a context
> that was actually still in use.
> 
>  tests/iothread.c | 30 ++++++++++++++++++++++++++++++
>  1 file changed, 30 insertions(+)
> 
> diff --git a/tests/iothread.c b/tests/iothread.c
> index 13c9fdcd8df..d3a2ee9a014 100644
> --- a/tests/iothread.c
> +++ b/tests/iothread.c
> @@ -21,6 +21,8 @@
>  
>  struct IOThread {
>      AioContext *ctx;
> +    GMainContext *worker_context;
> +    GMainLoop *main_loop;
>  
>      QemuThread thread;
>      QemuMutex init_done_lock;
> @@ -35,6 +37,17 @@ AioContext *qemu_get_current_aio_context(void)
>      return my_iothread ? my_iothread->ctx : qemu_get_aio_context();
>  }
>  
> +static void iothread_init_gcontext(IOThread *iothread)
> +{
> +    GSource *source;
> +
> +    iothread->worker_context = g_main_context_new();
> +    source = aio_get_g_source(iothread_get_aio_context(iothread));
> +    g_source_attach(source, iothread->worker_context);
> +    g_source_unref(source);
> +    iothread->main_loop = g_main_loop_new(iothread->worker_context, TRUE);

IIUC the main_loop is not required because in this case we only use
the aio context to run rather than the main context itself.

> +}
> +
>  static void *iothread_run(void *opaque)
>  {
>      IOThread *iothread = opaque;
> @@ -44,6 +57,20 @@ static void *iothread_run(void *opaque)
>      my_iothread = iothread;
>      qemu_mutex_lock(&iothread->init_done_lock);
>      iothread->ctx = aio_context_new(&error_abort);
> +
> +    /*
> +     * We must connect the ctx to a GMainContext, because in older versions
> +     * of glib the g_source_ref()/unref() functions are not threadsafe
> +     * on sources without a context.
> +     */
> +    iothread_init_gcontext(iothread);
> +
> +    /*
> +     * g_main_context_push_thread_default() must be called before anything
> +     * in this new thread uses glib.
> +     */
> +    g_main_context_push_thread_default(iothread->worker_context);

IMHO if all the users of tests/iothread.c are block layers who only
uses the aio context directly, then I think this is not needed too.

Thanks,

> +
>      qemu_cond_signal(&iothread->init_done_cond);
>      qemu_mutex_unlock(&iothread->init_done_lock);
>  
> @@ -51,6 +78,7 @@ static void *iothread_run(void *opaque)
>          aio_poll(iothread->ctx, true);
>      }
>  
> +    g_main_context_pop_thread_default(iothread->worker_context);
>      rcu_unregister_thread();
>      return NULL;
>  }
> @@ -66,6 +94,8 @@ void iothread_join(IOThread *iothread)
>  {
>      aio_bh_schedule_oneshot(iothread->ctx, iothread_stop_bh, iothread);
>      qemu_thread_join(&iothread->thread);
> +    g_main_context_unref(iothread->worker_context);
> +    g_main_loop_unref(iothread->main_loop);
>      qemu_cond_destroy(&iothread->init_done_cond);
>      qemu_mutex_destroy(&iothread->init_done_lock);
>      aio_context_unref(iothread->ctx);
> -- 
> 2.20.1
> 

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tests/iothread: Always connect iothread GSource to a GMainContext
  2020-01-06 17:28 ` Peter Xu
@ 2020-01-06 17:40   ` Peter Maydell
  2020-01-06 18:01     ` Peter Xu
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Maydell @ 2020-01-06 17:40 UTC (permalink / raw)
  To: Peter Xu; +Cc: QEMU Developers, Stefan Hajnoczi

On Mon, 6 Jan 2020 at 17:28, Peter Xu <peterx@redhat.com> wrote:
>
> On Mon, Jan 06, 2020 at 02:45:52PM +0000, Peter Maydell wrote:
> > On older versions of glib (anything prior to glib commit 0f056ebe
> > from May 2019), the implementation of g_source_ref() and
> > g_source_unref() is not threadsafe for a GSource which is not
> > attached to a GMainContext.
> >
> > QEMU's real iothread.c implementation always attaches its
> > iothread->ctx's GSource to a GMainContext created for that iothread,
> > so it is OK, but the simple test framework implementation in
> > tests/iothread.c was not doing this.  This was causing intermittent
> > assertion failures in the test-aio-multithread subtest
> > "/aio/multi/mutex/contended" test on the BSD hosts.  (It's unclear
> > why only BSD seems to have been affected -- perhaps a combination of
> > the specific glib version being used in the VMs and their happening
> > to run on a host with a lot of CPUs).
> >
> > Borrow the iothread_init_gcontext() from the real iothread.c
> > and add the corresponding cleanup code and the calls to
> > g_main_context_push/pop_thread_default() so we actually use
> > the GMainContext we create.
> >
> > Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
>
> I've no idea on the g_source_ref() issue, but if so then a patch like
> this makes sense to me especially if it fixes up test failures.
>
> Reviewed-by: Peter Xu <peterx@redhat.com>
>
> Still a few comments below.

> > +static void iothread_init_gcontext(IOThread *iothread)
> > +{
> > +    GSource *source;
> > +
> > +    iothread->worker_context = g_main_context_new();
> > +    source = aio_get_g_source(iothread_get_aio_context(iothread));
> > +    g_source_attach(source, iothread->worker_context);
> > +    g_source_unref(source);
> > +    iothread->main_loop = g_main_loop_new(iothread->worker_context, TRUE);
>
> IIUC the main_loop is not required because in this case we only use
> the aio context to run rather than the main context itself.

Mmm. I wasn't sure to what extent glib expects the
GMainContext and GMainLoop to match up, so I was mostly
just copying from iothread.c.

> > +    /*
> > +     * g_main_context_push_thread_default() must be called before anything
> > +     * in this new thread uses glib.
> > +     */
> > +    g_main_context_push_thread_default(iothread->worker_context);
>
> IMHO if all the users of tests/iothread.c are block layers who only
> uses the aio context directly, then I think this is not needed too.

So we're OK to not do this because tests/iothread.c's
main loop doesn't call g_main_loop_run(), and it doesn't
provide an iothread_get_g_main_context() ?

I'm kind of inclined towards being lazy and sticking with
what this patch has, because:
 * it matches the real iothread.c, which reduces the possiblity
   of future surprise bugs due to things not matching up
 * it's already been reviewed
 * it saves me having to do a respin and retest

But if people would prefer these bits deleted I'll stop
being lazy :-)

thanks
-- PMM


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tests/iothread: Always connect iothread GSource to a GMainContext
  2020-01-06 17:40   ` Peter Maydell
@ 2020-01-06 18:01     ` Peter Xu
  2020-01-07 15:32       ` Peter Maydell
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Xu @ 2020-01-06 18:01 UTC (permalink / raw)
  To: Peter Maydell; +Cc: QEMU Developers, Stefan Hajnoczi

On Mon, Jan 06, 2020 at 05:40:15PM +0000, Peter Maydell wrote:

[...]

> So we're OK to not do this because tests/iothread.c's
> main loop doesn't call g_main_loop_run(), and it doesn't
> provide an iothread_get_g_main_context() ?
> 
> I'm kind of inclined towards being lazy and sticking with
> what this patch has, because:
>  * it matches the real iothread.c, which reduces the possiblity
>    of future surprise bugs due to things not matching up
>  * it's already been reviewed
>  * it saves me having to do a respin and retest
> 
> But if people would prefer these bits deleted I'll stop
> being lazy :-)

Please feel free to be lazy and fix the test sooner (and that's why I
offered my r-b :).

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tests/iothread: Always connect iothread GSource to a GMainContext
  2020-01-06 18:01     ` Peter Xu
@ 2020-01-07 15:32       ` Peter Maydell
  0 siblings, 0 replies; 7+ messages in thread
From: Peter Maydell @ 2020-01-07 15:32 UTC (permalink / raw)
  To: Peter Xu; +Cc: QEMU Developers, Stefan Hajnoczi

On Mon, 6 Jan 2020 at 18:01, Peter Xu <peterx@redhat.com> wrote:
>
> On Mon, Jan 06, 2020 at 05:40:15PM +0000, Peter Maydell wrote:
>
> [...]
>
> > So we're OK to not do this because tests/iothread.c's
> > main loop doesn't call g_main_loop_run(), and it doesn't
> > provide an iothread_get_g_main_context() ?
> >
> > I'm kind of inclined towards being lazy and sticking with
> > what this patch has, because:
> >  * it matches the real iothread.c, which reduces the possiblity
> >    of future surprise bugs due to things not matching up
> >  * it's already been reviewed
> >  * it saves me having to do a respin and retest
> >
> > But if people would prefer these bits deleted I'll stop
> > being lazy :-)
>
> Please feel free to be lazy and fix the test sooner (and that's why I
> offered my r-b :).

Thanks; I've applied this to master (though some other
problem with the NetBSD VM test run seems to have crept
in while I was ignoring failures on the assumption they
were due to this bug :-( )

-- PMM


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-01-07 15:55 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-06 14:45 [PATCH] tests/iothread: Always connect iothread GSource to a GMainContext Peter Maydell
2020-01-06 15:22 ` Marc-André Lureau
2020-01-06 15:31   ` Peter Maydell
2020-01-06 17:28 ` Peter Xu
2020-01-06 17:40   ` Peter Maydell
2020-01-06 18:01     ` Peter Xu
2020-01-07 15:32       ` Peter Maydell

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.