All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 0/5] Fix socket chardev regression
@ 2018-08-23 14:31 Marc-André Lureau
  2018-08-23 14:31 ` [Qemu-devel] [PATCH v2 1/5] Revert "chardev: tcp: postpone TLS work until machine done" Marc-André Lureau
                   ` (5 more replies)
  0 siblings, 6 replies; 12+ messages in thread
From: Marc-André Lureau @ 2018-08-23 14:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, berrange, peterx, Marc-André Lureau

Hi,

In commit 25679e5d58e "chardev: tcp: postpone async connection setup"
(and its follow up 99f2f54174a59), Peter moved chardev socket
connection to machine_done event. However, chardev created later will
no longer attempt to connect, and chardev created in tests do not have
machine_done event (breaking some of vhost-user-test).

The goal was to move the "connect" source to the chardev frontend
context (the monitor thread context in his case). chr->gcontext is set
with qemu_chr_fe_set_handlers(). But there is no guarantee that the
function will be called in general, so we can't delay connection until
then: the chardev should still attempt to connect during open(), using
the main context.

An alternative would be to specify the iothread during chardev
creation. Setting up monitor OOB would be quite different too, it
would take the same iothread as argument.

99f2f54174a595e is also a bit problematic, since it will behave
differently before and after machine_done (the first case gives a
chance to use a different context reliably, the second looks racy)

In the end, I am not sure this is all necessary, as chardev callbacks
are called after qemu_chr_fe_set_handlers(), at which point the
context of sources are updated. In "char-socket: update all ioc
handlers when changing context", I moved also the hup handler to the
updated context. So unless the main thread is already stuck, we can
setup a different context for the chardev at that time. Or not?

v2:
- fix a random socket chardev test failure

Marc-André Lureau (5):
  Revert "chardev: tcp: postpone TLS work until machine done"
  Revert "chardev: tcp: postpone async connection setup"
  char-socket: update all ioc handlers when changing context
  test-char: fix random socket test failure
  test-char: add socket reconnect test

 chardev/char-socket.c | 86 ++++++++++++++++++-------------------------
 tests/test-char.c     | 25 +++++++++++--
 2 files changed, 57 insertions(+), 54 deletions(-)

-- 
2.18.0.547.g1d89318c48

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH v2 1/5] Revert "chardev: tcp: postpone TLS work until machine done"
  2018-08-23 14:31 [Qemu-devel] [PATCH v2 0/5] Fix socket chardev regression Marc-André Lureau
@ 2018-08-23 14:31 ` Marc-André Lureau
  2018-08-23 14:31 ` [Qemu-devel] [PATCH v2 2/5] Revert "chardev: tcp: postpone async connection setup" Marc-André Lureau
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Marc-André Lureau @ 2018-08-23 14:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, berrange, peterx, Marc-André Lureau

This reverts commit 99f2f54174a595e3ada6e4332fcd2b37ebb0d55d.

See next commit reverting 25679e5d58e258e9950685ffbd0cae4cd40d9cc2 as
well for rationale.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
 chardev/char-socket.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index efbad6ee7c..e28d2cca4c 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -32,7 +32,6 @@
 #include "qapi/error.h"
 #include "qapi/clone-visitor.h"
 #include "qapi/qapi-visit-sockets.h"
-#include "sysemu/sysemu.h"
 
 #include "chardev/char-io.h"
 
@@ -724,11 +723,6 @@ static void tcp_chr_tls_init(Chardev *chr)
     Error *err = NULL;
     gchar *name;
 
-    if (!machine_init_done) {
-        /* This will be postponed to machine_done notifier */
-        return;
-    }
-
     if (s->is_listen) {
         tioc = qio_channel_tls_new_server(
             s->ioc, s->tls_creds,
@@ -1169,10 +1163,6 @@ static int tcp_chr_machine_done_hook(Chardev *chr)
         tcp_chr_connect_async(chr);
     }
 
-    if (s->ioc && s->tls_creds) {
-        tcp_chr_tls_init(chr);
-    }
-
     return 0;
 }
 
-- 
2.18.0.547.g1d89318c48

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH v2 2/5] Revert "chardev: tcp: postpone async connection setup"
  2018-08-23 14:31 [Qemu-devel] [PATCH v2 0/5] Fix socket chardev regression Marc-André Lureau
  2018-08-23 14:31 ` [Qemu-devel] [PATCH v2 1/5] Revert "chardev: tcp: postpone TLS work until machine done" Marc-André Lureau
@ 2018-08-23 14:31 ` Marc-André Lureau
  2018-08-23 14:31 ` [Qemu-devel] [PATCH v2 3/5] char-socket: update all ioc handlers when changing context Marc-André Lureau
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Marc-André Lureau @ 2018-08-23 14:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, berrange, peterx, Marc-André Lureau

This reverts commit 25679e5d58e258e9950685ffbd0cae4cd40d9cc2.

This commit broke "reconnect socket" chardev that are created after
"machine_done": they no longer try to connect. It broke also
vhost-user-test that uses chardev while there is no "machine_done"
event.

The goal of this patch was to move the "connect" source to the
frontend context. chr->gcontext is set with
qemu_chr_fe_set_handlers(). But there is no guarantee that it will be
called, so we can't delay connection until then: the chardev should
still attempt to connect during open(). qemu_chr_fe_set_handlers() is
eventually called later and will update the context.

Unless there is a good reason to not use initially the default
context, I think we should revert to the previous state to fix the
regressions.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
 chardev/char-socket.c | 17 +++--------------
 1 file changed, 3 insertions(+), 14 deletions(-)

diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index e28d2cca4c..14f6567f6a 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -1005,8 +1005,9 @@ static void qmp_chardev_open_socket(Chardev *chr,
         s->reconnect_time = reconnect;
     }
 
-    /* If reconnect_time is set, will do that in chr_machine_done. */
-    if (!s->reconnect_time) {
+    if (s->reconnect_time) {
+        tcp_chr_connect_async(chr);
+    } else {
         if (s->is_listen) {
             char *name;
             s->listener = qio_net_listener_new();
@@ -1155,17 +1156,6 @@ char_socket_get_connected(Object *obj, Error **errp)
     return s->connected;
 }
 
-static int tcp_chr_machine_done_hook(Chardev *chr)
-{
-    SocketChardev *s = SOCKET_CHARDEV(chr);
-
-    if (s->reconnect_time) {
-        tcp_chr_connect_async(chr);
-    }
-
-    return 0;
-}
-
 static void char_socket_class_init(ObjectClass *oc, void *data)
 {
     ChardevClass *cc = CHARDEV_CLASS(oc);
@@ -1181,7 +1171,6 @@ static void char_socket_class_init(ObjectClass *oc, void *data)
     cc->chr_add_client = tcp_chr_add_client;
     cc->chr_add_watch = tcp_chr_add_watch;
     cc->chr_update_read_handler = tcp_chr_update_read_handler;
-    cc->chr_machine_done = tcp_chr_machine_done_hook;
 
     object_class_property_add(oc, "addr", "SocketAddress",
                               char_socket_get_addr, NULL,
-- 
2.18.0.547.g1d89318c48

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH v2 3/5] char-socket: update all ioc handlers when changing context
  2018-08-23 14:31 [Qemu-devel] [PATCH v2 0/5] Fix socket chardev regression Marc-André Lureau
  2018-08-23 14:31 ` [Qemu-devel] [PATCH v2 1/5] Revert "chardev: tcp: postpone TLS work until machine done" Marc-André Lureau
  2018-08-23 14:31 ` [Qemu-devel] [PATCH v2 2/5] Revert "chardev: tcp: postpone async connection setup" Marc-André Lureau
@ 2018-08-23 14:31 ` Marc-André Lureau
  2018-08-23 14:31 ` [Qemu-devel] [PATCH v2 4/5] test-char: fix random socket test failure Marc-André Lureau
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Marc-André Lureau @ 2018-08-23 14:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, berrange, peterx, Marc-André Lureau

So far, tcp_chr_update_read_handler() only updated the read
handler. Let's also update the hup handler.

Factorize the code while at it. (note that s->ioc != NULL when
s->connected)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
 chardev/char-socket.c | 59 ++++++++++++++++++++++++-------------------
 1 file changed, 33 insertions(+), 26 deletions(-)

diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 14f6567f6a..7cd0ae2824 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -353,6 +353,15 @@ static GSource *tcp_chr_add_watch(Chardev *chr, GIOCondition cond)
     return qio_channel_create_watch(s->ioc, cond);
 }
 
+static void remove_hup_source(SocketChardev *s)
+{
+    if (s->hup_source != NULL) {
+        g_source_destroy(s->hup_source);
+        g_source_unref(s->hup_source);
+        s->hup_source = NULL;
+    }
+}
+
 static void tcp_chr_free_connection(Chardev *chr)
 {
     SocketChardev *s = SOCKET_CHARDEV(chr);
@@ -367,11 +376,7 @@ static void tcp_chr_free_connection(Chardev *chr)
         s->read_msgfds_num = 0;
     }
 
-    if (s->hup_source != NULL) {
-        g_source_destroy(s->hup_source);
-        g_source_unref(s->hup_source);
-        s->hup_source = NULL;
-    }
+    remove_hup_source(s);
 
     tcp_set_msgfds(chr, NULL, 0);
     remove_fd_in_watch(chr);
@@ -540,6 +545,27 @@ static char *sockaddr_to_str(struct sockaddr_storage *ss, socklen_t ss_len,
     }
 }
 
+static void update_ioc_handlers(SocketChardev *s)
+{
+    Chardev *chr = CHARDEV(s);
+
+    if (!s->connected) {
+        return;
+    }
+
+    remove_fd_in_watch(chr);
+    chr->gsource = io_add_watch_poll(chr, s->ioc,
+                                     tcp_chr_read_poll,
+                                     tcp_chr_read, chr,
+                                     chr->gcontext);
+
+    remove_hup_source(s);
+    s->hup_source = qio_channel_create_watch(s->ioc, G_IO_HUP);
+    g_source_set_callback(s->hup_source, (GSourceFunc)tcp_chr_hup,
+                          chr, NULL);
+    g_source_attach(s->hup_source, chr->gcontext);
+}
+
 static void tcp_chr_connect(void *opaque)
 {
     Chardev *chr = CHARDEV(opaque);
@@ -552,16 +578,7 @@ static void tcp_chr_connect(void *opaque)
         s->is_listen, s->is_telnet);
 
     s->connected = 1;
-    chr->gsource = io_add_watch_poll(chr, s->ioc,
-                                       tcp_chr_read_poll,
-                                       tcp_chr_read,
-                                       chr, chr->gcontext);
-
-    s->hup_source = qio_channel_create_watch(s->ioc, G_IO_HUP);
-    g_source_set_callback(s->hup_source, (GSourceFunc)tcp_chr_hup,
-                          chr, NULL);
-    g_source_attach(s->hup_source, chr->gcontext);
-
+    update_ioc_handlers(s);
     qemu_chr_be_event(chr, CHR_EVENT_OPENED);
 }
 
@@ -592,17 +609,7 @@ static void tcp_chr_update_read_handler(Chardev *chr)
         tcp_chr_telnet_init(CHARDEV(s));
     }
 
-    if (!s->connected) {
-        return;
-    }
-
-    remove_fd_in_watch(chr);
-    if (s->ioc) {
-        chr->gsource = io_add_watch_poll(chr, s->ioc,
-                                           tcp_chr_read_poll,
-                                           tcp_chr_read, chr,
-                                           chr->gcontext);
-    }
+    update_ioc_handlers(s);
 }
 
 static gboolean tcp_chr_telnet_init_io(QIOChannel *ioc,
-- 
2.18.0.547.g1d89318c48

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH v2 4/5] test-char: fix random socket test failure
  2018-08-23 14:31 [Qemu-devel] [PATCH v2 0/5] Fix socket chardev regression Marc-André Lureau
                   ` (2 preceding siblings ...)
  2018-08-23 14:31 ` [Qemu-devel] [PATCH v2 3/5] char-socket: update all ioc handlers when changing context Marc-André Lureau
@ 2018-08-23 14:31 ` Marc-André Lureau
  2018-09-11 13:46   ` Paolo Bonzini
  2018-08-23 14:31 ` [Qemu-devel] [PATCH v2 5/5] test-char: add socket reconnect test Marc-André Lureau
  2018-08-24  3:45 ` [Qemu-devel] [PATCH v2 0/5] Fix socket chardev regression Peter Xu
  5 siblings, 1 reply; 12+ messages in thread
From: Marc-André Lureau @ 2018-08-23 14:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, berrange, peterx, Marc-André Lureau

Peter reported a test failure on FreeBSD with the new reconnect test:

MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}
gtester -k --verbose -m=quick tests/test-char
TEST: tests/test-char... (pid=16190)
  /char/null:                                                          OK
  /char/invalid:                                                       OK
  /char/ringbuf:                                                       OK
  /char/mux:                                                           OK
  /char/stdio:                                                         OK
  /char/pipe:                                                          OK
  /char/file:                                                          OK
  /char/file-fifo:                                                     OK
  /char/udp:                                                           OK
  /char/serial:                                                        OK
  /char/hotswap:                                                       OK
  /char/socket/basic:                                                  OK
  /char/socket/reconnect:                                              FAIL
GTester: last random seed: R02S521380d9c12f1dac3ad1763bf5665c27
(pid=16367)
  /char/socket/fdpass:                                                 OK
FAIL: tests/test-char
**
ERROR:tests/test-char.c:353:char_socket_test_common: assertion failed:
(object_property_get_bool(OBJECT(chr_client), "connected",
&error_abort))

It turns out that the socket test code checks both server and client
connection states, but doesn't wait for both.

Wait for the client side as well.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
 tests/test-char.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/tests/test-char.c b/tests/test-char.c
index 5905d31441..2b1e400542 100644
--- a/tests/test-char.c
+++ b/tests/test-char.c
@@ -349,6 +349,12 @@ static void char_socket_test_common(Chardev *chr)
     g_assert_cmpint(id, >, 0);
     main_loop();
 
+    d.chr = chr_client;
+    id = g_idle_add(char_socket_test_idle, &d);
+    g_source_set_name_by_id(id, "test-idle");
+    g_assert_cmpint(id, >, 0);
+    main_loop();
+
     g_assert(object_property_get_bool(OBJECT(chr), "connected", &error_abort));
     g_assert(object_property_get_bool(OBJECT(chr_client),
                                       "connected", &error_abort));
@@ -358,6 +364,7 @@ static void char_socket_test_common(Chardev *chr)
 
     object_unparent(OBJECT(chr_client));
 
+    d.chr = chr;
     d.conn_expected = false;
     g_idle_add(char_socket_test_idle, &d);
     main_loop();
-- 
2.18.0.547.g1d89318c48

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH v2 5/5] test-char: add socket reconnect test
  2018-08-23 14:31 [Qemu-devel] [PATCH v2 0/5] Fix socket chardev regression Marc-André Lureau
                   ` (3 preceding siblings ...)
  2018-08-23 14:31 ` [Qemu-devel] [PATCH v2 4/5] test-char: fix random socket test failure Marc-André Lureau
@ 2018-08-23 14:31 ` Marc-André Lureau
  2018-08-24  3:45 ` [Qemu-devel] [PATCH v2 0/5] Fix socket chardev regression Peter Xu
  5 siblings, 0 replies; 12+ messages in thread
From: Marc-André Lureau @ 2018-08-23 14:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, berrange, peterx, Marc-André Lureau

This test exhibits a regression fixed by the previous reverts.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
 tests/test-char.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/tests/test-char.c b/tests/test-char.c
index 2b1e400542..68a82a0d93 100644
--- a/tests/test-char.c
+++ b/tests/test-char.c
@@ -309,7 +309,7 @@ static int socket_can_read_hello(void *opaque)
     return 10;
 }
 
-static void char_socket_test_common(Chardev *chr)
+static void char_socket_test_common(Chardev *chr, bool reconnect)
 {
     Chardev *chr_client;
     QObject *addr;
@@ -329,7 +329,8 @@ static void char_socket_test_common(Chardev *chr)
     addr = object_property_get_qobject(OBJECT(chr), "addr", &error_abort);
     qdict = qobject_to(QDict, addr);
     port = qdict_get_str(qdict, "port");
-    tmp = g_strdup_printf("tcp:127.0.0.1:%s", port);
+    tmp = g_strdup_printf("tcp:127.0.0.1:%s%s", port,
+                          reconnect ? ",reconnect=1" : "");
     qobject_unref(qdict);
 
     qemu_chr_fe_init(&be, chr, &error_abort);
@@ -377,7 +378,15 @@ static void char_socket_basic_test(void)
 {
     Chardev *chr = qemu_chr_new("server", "tcp:127.0.0.1:0,server,nowait");
 
-    char_socket_test_common(chr);
+    char_socket_test_common(chr, false);
+}
+
+
+static void char_socket_reconnect_test(void)
+{
+    Chardev *chr = qemu_chr_new("server", "tcp:127.0.0.1:0,server,nowait");
+
+    char_socket_test_common(chr, true);
 }
 
 
@@ -409,7 +418,7 @@ static void char_socket_fdpass_test(void)
 
     qemu_opts_del(opts);
 
-    char_socket_test_common(chr);
+    char_socket_test_common(chr, false);
 }
 
 
@@ -830,6 +839,7 @@ int main(int argc, char **argv)
     g_test_add_func("/char/file-fifo", char_file_fifo_test);
 #endif
     g_test_add_func("/char/socket/basic", char_socket_basic_test);
+    g_test_add_func("/char/socket/reconnect", char_socket_reconnect_test);
     g_test_add_func("/char/socket/fdpass", char_socket_fdpass_test);
     g_test_add_func("/char/udp", char_udp_test);
 #ifdef HAVE_CHARDEV_SERIAL
-- 
2.18.0.547.g1d89318c48

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/5] Fix socket chardev regression
  2018-08-23 14:31 [Qemu-devel] [PATCH v2 0/5] Fix socket chardev regression Marc-André Lureau
                   ` (4 preceding siblings ...)
  2018-08-23 14:31 ` [Qemu-devel] [PATCH v2 5/5] test-char: add socket reconnect test Marc-André Lureau
@ 2018-08-24  3:45 ` Peter Xu
  2018-08-24  8:32   ` Marc-André Lureau
  5 siblings, 1 reply; 12+ messages in thread
From: Peter Xu @ 2018-08-24  3:45 UTC (permalink / raw)
  To: Marc-André Lureau; +Cc: qemu-devel, Paolo Bonzini, berrange

On Thu, Aug 23, 2018 at 04:31:20PM +0200, Marc-André Lureau wrote:
> Hi,
> 
> In commit 25679e5d58e "chardev: tcp: postpone async connection setup"
> (and its follow up 99f2f54174a59), Peter moved chardev socket
> connection to machine_done event. However, chardev created later will
> no longer attempt to connect, and chardev created in tests do not have
> machine_done event (breaking some of vhost-user-test).
> 
> The goal was to move the "connect" source to the chardev frontend
> context (the monitor thread context in his case). chr->gcontext is set
> with qemu_chr_fe_set_handlers(). But there is no guarantee that the
> function will be called in general, so we can't delay connection until
> then: the chardev should still attempt to connect during open(), using
> the main context.
> 
> An alternative would be to specify the iothread during chardev
> creation. Setting up monitor OOB would be quite different too, it
> would take the same iothread as argument.
> 
> 99f2f54174a595e is also a bit problematic, since it will behave
> differently before and after machine_done (the first case gives a
> chance to use a different context reliably, the second looks racy)
> 
> In the end, I am not sure this is all necessary, as chardev callbacks
> are called after qemu_chr_fe_set_handlers(), at which point the
> context of sources are updated. In "char-socket: update all ioc
> handlers when changing context", I moved also the hup handler to the
> updated context. So unless the main thread is already stuck, we can
> setup a different context for the chardev at that time. Or not?
> 
> v2:
> - fix a random socket chardev test failure

I have no problem on patch 3-5, but aren't patch 1-2 still breaking
the context switch of chardev?  Or did I misunderstood?

Regards,

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/5] Fix socket chardev regression
  2018-08-24  3:45 ` [Qemu-devel] [PATCH v2 0/5] Fix socket chardev regression Peter Xu
@ 2018-08-24  8:32   ` Marc-André Lureau
  2018-08-24  8:44     ` Peter Xu
  0 siblings, 1 reply; 12+ messages in thread
From: Marc-André Lureau @ 2018-08-24  8:32 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, Paolo Bonzini, P. Berrange, Daniel

Hi

On Fri, Aug 24, 2018 at 5:45 AM, Peter Xu <peterx@redhat.com> wrote:
> On Thu, Aug 23, 2018 at 04:31:20PM +0200, Marc-André Lureau wrote:
>> Hi,
>>
>> In commit 25679e5d58e "chardev: tcp: postpone async connection setup"
>> (and its follow up 99f2f54174a59), Peter moved chardev socket
>> connection to machine_done event. However, chardev created later will
>> no longer attempt to connect, and chardev created in tests do not have
>> machine_done event (breaking some of vhost-user-test).
>>
>> The goal was to move the "connect" source to the chardev frontend
>> context (the monitor thread context in his case). chr->gcontext is set
>> with qemu_chr_fe_set_handlers(). But there is no guarantee that the
>> function will be called in general, so we can't delay connection until
>> then: the chardev should still attempt to connect during open(), using
>> the main context.
>>
>> An alternative would be to specify the iothread during chardev
>> creation. Setting up monitor OOB would be quite different too, it
>> would take the same iothread as argument.
>>
>> 99f2f54174a595e is also a bit problematic, since it will behave
>> differently before and after machine_done (the first case gives a
>> chance to use a different context reliably, the second looks racy)
>>
>> In the end, I am not sure this is all necessary, as chardev callbacks
>> are called after qemu_chr_fe_set_handlers(), at which point the
>> context of sources are updated. In "char-socket: update all ioc
>> handlers when changing context", I moved also the hup handler to the
>> updated context. So unless the main thread is already stuck, we can
>> setup a different context for the chardev at that time. Or not?
>>
>> v2:
>> - fix a random socket chardev test failure
>
> I have no problem on patch 3-5, but aren't patch 1-2 still breaking
> the context switch of chardev?  Or did I misunderstood?

Switching context is broken/racy regardless. My current plan is to
suspend the context to switch to, and update all sources before
thawing the target context.

If you prefer we can delay the revert, but I would rather have them
because the regression seems much worse to me.

>
> Regards,
>
> --
> Peter Xu

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/5] Fix socket chardev regression
  2018-08-24  8:32   ` Marc-André Lureau
@ 2018-08-24  8:44     ` Peter Xu
  2018-08-24  8:51       ` Marc-André Lureau
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Xu @ 2018-08-24  8:44 UTC (permalink / raw)
  To: Marc-André Lureau; +Cc: qemu-devel, Paolo Bonzini, P. Berrange, Daniel

On Fri, Aug 24, 2018 at 10:32:58AM +0200, Marc-André Lureau wrote:
> Hi
> 
> On Fri, Aug 24, 2018 at 5:45 AM, Peter Xu <peterx@redhat.com> wrote:
> > On Thu, Aug 23, 2018 at 04:31:20PM +0200, Marc-André Lureau wrote:
> >> Hi,
> >>
> >> In commit 25679e5d58e "chardev: tcp: postpone async connection setup"
> >> (and its follow up 99f2f54174a59), Peter moved chardev socket
> >> connection to machine_done event. However, chardev created later will
> >> no longer attempt to connect, and chardev created in tests do not have
> >> machine_done event (breaking some of vhost-user-test).
> >>
> >> The goal was to move the "connect" source to the chardev frontend
> >> context (the monitor thread context in his case). chr->gcontext is set
> >> with qemu_chr_fe_set_handlers(). But there is no guarantee that the
> >> function will be called in general, so we can't delay connection until
> >> then: the chardev should still attempt to connect during open(), using
> >> the main context.
> >>
> >> An alternative would be to specify the iothread during chardev
> >> creation. Setting up monitor OOB would be quite different too, it
> >> would take the same iothread as argument.
> >>
> >> 99f2f54174a595e is also a bit problematic, since it will behave
> >> differently before and after machine_done (the first case gives a
> >> chance to use a different context reliably, the second looks racy)
> >>
> >> In the end, I am not sure this is all necessary, as chardev callbacks
> >> are called after qemu_chr_fe_set_handlers(), at which point the
> >> context of sources are updated. In "char-socket: update all ioc
> >> handlers when changing context", I moved also the hup handler to the
> >> updated context. So unless the main thread is already stuck, we can
> >> setup a different context for the chardev at that time. Or not?
> >>
> >> v2:
> >> - fix a random socket chardev test failure
> >
> > I have no problem on patch 3-5, but aren't patch 1-2 still breaking
> > the context switch of chardev?  Or did I misunderstood?
> 
> Switching context is broken/racy regardless. My current plan is to
> suspend the context to switch to, and update all sources before
> thawing the target context.
> 
> If you prefer we can delay the revert, but I would rather have them
> because the regression seems much worse to me.

Have you started working on above?  I started to code up some RFC
patch to let the chardev maintain the iothreads (now it'll only have a
monitor iothread) then we don't need to switch context for chardev any
more.  I plan to post it today for a quick look.  With that, this
series will not break anything then.

> 
> >
> > Regards,
> >
> > --
> > Peter Xu

Regards,

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/5] Fix socket chardev regression
  2018-08-24  8:44     ` Peter Xu
@ 2018-08-24  8:51       ` Marc-André Lureau
  2018-08-24  9:12         ` Peter Xu
  0 siblings, 1 reply; 12+ messages in thread
From: Marc-André Lureau @ 2018-08-24  8:51 UTC (permalink / raw)
  To: Peter Xu; +Cc: Paolo Bonzini, QEMU

Hi

On Fri, Aug 24, 2018 at 10:45 AM Peter Xu <peterx@redhat.com> wrote:
>
> On Fri, Aug 24, 2018 at 10:32:58AM +0200, Marc-André Lureau wrote:
> > Hi
> >
> > On Fri, Aug 24, 2018 at 5:45 AM, Peter Xu <peterx@redhat.com> wrote:
> > > On Thu, Aug 23, 2018 at 04:31:20PM +0200, Marc-André Lureau wrote:
> > >> Hi,
> > >>
> > >> In commit 25679e5d58e "chardev: tcp: postpone async connection setup"
> > >> (and its follow up 99f2f54174a59), Peter moved chardev socket
> > >> connection to machine_done event. However, chardev created later will
> > >> no longer attempt to connect, and chardev created in tests do not have
> > >> machine_done event (breaking some of vhost-user-test).
> > >>
> > >> The goal was to move the "connect" source to the chardev frontend
> > >> context (the monitor thread context in his case). chr->gcontext is set
> > >> with qemu_chr_fe_set_handlers(). But there is no guarantee that the
> > >> function will be called in general, so we can't delay connection until
> > >> then: the chardev should still attempt to connect during open(), using
> > >> the main context.
> > >>
> > >> An alternative would be to specify the iothread during chardev
> > >> creation. Setting up monitor OOB would be quite different too, it
> > >> would take the same iothread as argument.
> > >>
> > >> 99f2f54174a595e is also a bit problematic, since it will behave
> > >> differently before and after machine_done (the first case gives a
> > >> chance to use a different context reliably, the second looks racy)
> > >>
> > >> In the end, I am not sure this is all necessary, as chardev callbacks
> > >> are called after qemu_chr_fe_set_handlers(), at which point the
> > >> context of sources are updated. In "char-socket: update all ioc
> > >> handlers when changing context", I moved also the hup handler to the
> > >> updated context. So unless the main thread is already stuck, we can
> > >> setup a different context for the chardev at that time. Or not?
> > >>
> > >> v2:
> > >> - fix a random socket chardev test failure
> > >
> > > I have no problem on patch 3-5, but aren't patch 1-2 still breaking
> > > the context switch of chardev?  Or did I misunderstood?
> >
> > Switching context is broken/racy regardless. My current plan is to
> > suspend the context to switch to, and update all sources before
> > thawing the target context.
> >
> > If you prefer we can delay the revert, but I would rather have them
> > because the regression seems much worse to me.
>
> Have you started working on above?  I started to code up some RFC
> patch to let the chardev maintain the iothreads (now it'll only have a
> monitor iothread) then we don't need to switch context for chardev any
> more.  I plan to post it today for a quick look.  With that, this
> series will not break anything then.

No, I haven't really started the code, but spent some time thinking...

Ok, if you manage to put together that proposal, that sounds interesting.

Nevertheless, I can work on improving/fixing the race when switching
contexts (already used by colo)

>
> >
> > >
> > > Regards,
> > >
> > > --
> > > Peter Xu
>
> Regards,
>
> --
> Peter Xu
>


-- 
Marc-André Lureau

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/5] Fix socket chardev regression
  2018-08-24  8:51       ` Marc-André Lureau
@ 2018-08-24  9:12         ` Peter Xu
  0 siblings, 0 replies; 12+ messages in thread
From: Peter Xu @ 2018-08-24  9:12 UTC (permalink / raw)
  To: Marc-André Lureau; +Cc: Paolo Bonzini, QEMU

On Fri, Aug 24, 2018 at 10:51:18AM +0200, Marc-André Lureau wrote:
> Hi
> 
> On Fri, Aug 24, 2018 at 10:45 AM Peter Xu <peterx@redhat.com> wrote:
> >
> > On Fri, Aug 24, 2018 at 10:32:58AM +0200, Marc-André Lureau wrote:
> > > Hi
> > >
> > > On Fri, Aug 24, 2018 at 5:45 AM, Peter Xu <peterx@redhat.com> wrote:
> > > > On Thu, Aug 23, 2018 at 04:31:20PM +0200, Marc-André Lureau wrote:
> > > >> Hi,
> > > >>
> > > >> In commit 25679e5d58e "chardev: tcp: postpone async connection setup"
> > > >> (and its follow up 99f2f54174a59), Peter moved chardev socket
> > > >> connection to machine_done event. However, chardev created later will
> > > >> no longer attempt to connect, and chardev created in tests do not have
> > > >> machine_done event (breaking some of vhost-user-test).
> > > >>
> > > >> The goal was to move the "connect" source to the chardev frontend
> > > >> context (the monitor thread context in his case). chr->gcontext is set
> > > >> with qemu_chr_fe_set_handlers(). But there is no guarantee that the
> > > >> function will be called in general, so we can't delay connection until
> > > >> then: the chardev should still attempt to connect during open(), using
> > > >> the main context.
> > > >>
> > > >> An alternative would be to specify the iothread during chardev
> > > >> creation. Setting up monitor OOB would be quite different too, it
> > > >> would take the same iothread as argument.
> > > >>
> > > >> 99f2f54174a595e is also a bit problematic, since it will behave
> > > >> differently before and after machine_done (the first case gives a
> > > >> chance to use a different context reliably, the second looks racy)
> > > >>
> > > >> In the end, I am not sure this is all necessary, as chardev callbacks
> > > >> are called after qemu_chr_fe_set_handlers(), at which point the
> > > >> context of sources are updated. In "char-socket: update all ioc
> > > >> handlers when changing context", I moved also the hup handler to the
> > > >> updated context. So unless the main thread is already stuck, we can
> > > >> setup a different context for the chardev at that time. Or not?
> > > >>
> > > >> v2:
> > > >> - fix a random socket chardev test failure
> > > >
> > > > I have no problem on patch 3-5, but aren't patch 1-2 still breaking
> > > > the context switch of chardev?  Or did I misunderstood?
> > >
> > > Switching context is broken/racy regardless. My current plan is to
> > > suspend the context to switch to, and update all sources before
> > > thawing the target context.
> > >
> > > If you prefer we can delay the revert, but I would rather have them
> > > because the regression seems much worse to me.
> >
> > Have you started working on above?  I started to code up some RFC
> > patch to let the chardev maintain the iothreads (now it'll only have a
> > monitor iothread) then we don't need to switch context for chardev any
> > more.  I plan to post it today for a quick look.  With that, this
> > series will not break anything then.
> 
> No, I haven't really started the code, but spent some time thinking...
> 
> Ok, if you manage to put together that proposal, that sounds interesting.

Not really your iothread proposal; I tried to keep the interface
unchanged, but move the iothread ownership from the monitor code to
chardev code.  Please have a look at:

  [RFC 0/3] chardev: introduce chardev contexts

> 
> Nevertheless, I can work on improving/fixing the race when switching
> contexts (already used by colo)

Ouch, is colo changing context too?  I hope that series won't break it.

(Ok if so I believe it'll very possible to break it...)

Regards,

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v2 4/5] test-char: fix random socket test failure
  2018-08-23 14:31 ` [Qemu-devel] [PATCH v2 4/5] test-char: fix random socket test failure Marc-André Lureau
@ 2018-09-11 13:46   ` Paolo Bonzini
  0 siblings, 0 replies; 12+ messages in thread
From: Paolo Bonzini @ 2018-09-11 13:46 UTC (permalink / raw)
  To: Marc-André Lureau, qemu-devel; +Cc: berrange, peterx

On 23/08/2018 16:31, Marc-André Lureau wrote:
> Peter reported a test failure on FreeBSD with the new reconnect test:
> 
> MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}
> gtester -k --verbose -m=quick tests/test-char
> TEST: tests/test-char... (pid=16190)
>   /char/null:                                                          OK
>   /char/invalid:                                                       OK
>   /char/ringbuf:                                                       OK
>   /char/mux:                                                           OK
>   /char/stdio:                                                         OK
>   /char/pipe:                                                          OK
>   /char/file:                                                          OK
>   /char/file-fifo:                                                     OK
>   /char/udp:                                                           OK
>   /char/serial:                                                        OK
>   /char/hotswap:                                                       OK
>   /char/socket/basic:                                                  OK
>   /char/socket/reconnect:                                              FAIL
> GTester: last random seed: R02S521380d9c12f1dac3ad1763bf5665c27
> (pid=16367)
>   /char/socket/fdpass:                                                 OK
> FAIL: tests/test-char
> **
> ERROR:tests/test-char.c:353:char_socket_test_common: assertion failed:
> (object_property_get_bool(OBJECT(chr_client), "connected",
> &error_abort))
> 
> It turns out that the socket test code checks both server and client
> connection states, but doesn't wait for both.
> 
> Wait for the client side as well.
> 
> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> ---
>  tests/test-char.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/tests/test-char.c b/tests/test-char.c
> index 5905d31441..2b1e400542 100644
> --- a/tests/test-char.c
> +++ b/tests/test-char.c
> @@ -349,6 +349,12 @@ static void char_socket_test_common(Chardev *chr)
>      g_assert_cmpint(id, >, 0);
>      main_loop();
>  
> +    d.chr = chr_client;
> +    id = g_idle_add(char_socket_test_idle, &d);
> +    g_source_set_name_by_id(id, "test-idle");
> +    g_assert_cmpint(id, >, 0);
> +    main_loop();
> +
>      g_assert(object_property_get_bool(OBJECT(chr), "connected", &error_abort));
>      g_assert(object_property_get_bool(OBJECT(chr_client),
>                                        "connected", &error_abort));
> @@ -358,6 +364,7 @@ static void char_socket_test_common(Chardev *chr)
>  
>      object_unparent(OBJECT(chr_client));
>  
> +    d.chr = chr;
>      d.conn_expected = false;
>      g_idle_add(char_socket_test_idle, &d);
>      main_loop();
> 

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2018-09-11 13:47 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-23 14:31 [Qemu-devel] [PATCH v2 0/5] Fix socket chardev regression Marc-André Lureau
2018-08-23 14:31 ` [Qemu-devel] [PATCH v2 1/5] Revert "chardev: tcp: postpone TLS work until machine done" Marc-André Lureau
2018-08-23 14:31 ` [Qemu-devel] [PATCH v2 2/5] Revert "chardev: tcp: postpone async connection setup" Marc-André Lureau
2018-08-23 14:31 ` [Qemu-devel] [PATCH v2 3/5] char-socket: update all ioc handlers when changing context Marc-André Lureau
2018-08-23 14:31 ` [Qemu-devel] [PATCH v2 4/5] test-char: fix random socket test failure Marc-André Lureau
2018-09-11 13:46   ` Paolo Bonzini
2018-08-23 14:31 ` [Qemu-devel] [PATCH v2 5/5] test-char: add socket reconnect test Marc-André Lureau
2018-08-24  3:45 ` [Qemu-devel] [PATCH v2 0/5] Fix socket chardev regression Peter Xu
2018-08-24  8:32   ` Marc-André Lureau
2018-08-24  8:44     ` Peter Xu
2018-08-24  8:51       ` Marc-André Lureau
2018-08-24  9:12         ` Peter Xu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.