* [Qemu-devel] [PATCH] monitor: postpone monitor_qmp_cleanup_queues
@ 2018-06-04  8:06 Peter Xu
  2018-06-04 15:01 ` Marc-André Lureau
  2018-06-07 11:53 ` Markus Armbruster
  0 siblings, 2 replies; 15+ messages in thread
From: Peter Xu @ 2018-06-04  8:06 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Marc-André Lureau, Daniel P . Berrange,
	Kevin Wolf, Max Reitz, peterx, Eric Blake, John Snow,
	Markus Armbruster, Stefan Hajnoczi, Dr . David Alan Gilbert

Previously we cleanup the queues when we got CLOSED event.  It was used
to make sure we won't leftover replies/events of a old client to a new
client.  Now this patch postpones that until OPENED.

In most cases, it does not really matter much since either way will make
sure that the new client won't receive unexpected old events/responses.
However there can be a very rare race condition with the old way when we
use QMP with stdio and pipelines (which is quite common in iotests).
The problem is that, we can easily have something like this in scripts:

  cat $QMP_COMMANDS | qemu -qmp stdio ... | filter_commands

In most cases, a QMP session will be based on a bidirectional
channel (read/write to a TCP port, for example), so IN and OUT are
sharing some fundamentally same thing.  However that's untrue for pipes
above - the IN is the "cat" program, while the "OUT" is directed to the
"filter_commands" which is actually another process.

Now when we received the CLOSED event, we cleanup the queues (including
the QMP response queue).  That means, if we kill/stop the "cat" process
faster than the filter_commands process, the filter_commands process is
possible to miss some responses/events that should belong to it.

In practise, I encountered a very strange SHUTDOWN event missing when
running test with iotest 087.  Without this patch, iotest 078 will have
~10% chance to miss the SHUTDOWN event and fail when with Out-Of-Band
enabled:

087 8s ... - output mismatch (see 087.out.bad)
--- /home/peterx/git/qemu/tests/qemu-iotests/087.out    2018-06-01 18:44:22.378982462 +0800
+++ /home/peterx/git/qemu/bin/tests/qemu-iotests/087.out.bad    2018-06-01 18:53:44.267840928 +0800
@@ -8,7 +8,6 @@
 {"return": {}}
 {"error": {"class": "GenericError", "desc": "'node-name' must be specified for the root node"}}
 {"return": {}}
-{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}

 === Duplicate ID ===
@@ -53,7 +52,6 @@
 {"return": {}}
 {"return": {}}
 {"return": {}}
-{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}

This patch fixes the problem.

Fixes: 6d2d563f8c ("qmp: cleanup qmp queues properly", 2018-03-27)
Signed-off-by: Peter Xu <peterx@redhat.com>

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 monitor.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/monitor.c b/monitor.c
index 46814af533..6f26108108 100644
--- a/monitor.c
+++ b/monitor.c
@@ -4354,6 +4354,7 @@ static void monitor_qmp_event(void *opaque, int event)
 
     switch (event) {
     case CHR_EVENT_OPENED:
+        monitor_qmp_cleanup_queues(mon);
         mon->qmp.commands = &qmp_cap_negotiation_commands;
         monitor_qmp_caps_reset(mon);
         data = get_qmp_greeting(mon);
@@ -4362,7 +4363,6 @@ static void monitor_qmp_event(void *opaque, int event)
         mon_refcount++;
         break;
     case CHR_EVENT_CLOSED:
-        monitor_qmp_cleanup_queues(mon);
         json_message_parser_destroy(&mon->qmp.parser);
         json_message_parser_init(&mon->qmp.parser, handle_qmp_command);
         mon_refcount--;
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH] monitor: postpone monitor_qmp_cleanup_queues
  2018-06-04  8:06 [Qemu-devel] [PATCH] monitor: postpone monitor_qmp_cleanup_queues Peter Xu
@ 2018-06-04 15:01 ` Marc-André Lureau
  2018-06-05  3:39   ` Peter Xu
  2018-06-07 11:53 ` Markus Armbruster
  1 sibling, 1 reply; 15+ messages in thread
From: Marc-André Lureau @ 2018-06-04 15:01 UTC (permalink / raw)
  To: Peter Xu
  Cc: QEMU, Kevin Wolf, John Snow, Markus Armbruster, Max Reitz,
	Stefan Hajnoczi, Paolo Bonzini, Dr . David Alan Gilbert

Hi

On Mon, Jun 4, 2018 at 10:06 AM, Peter Xu <peterx@redhat.com> wrote:
> Previously we cleanup the queues when we got CLOSED event.  It was used
> to make sure we won't leftover replies/events of a old client to a new
> client.  Now this patch postpones that until OPENED.
>
> In most cases, it does not really matter much since either way will make
> sure that the new client won't receive unexpected old events/responses.
> However there can be a very rare race condition with the old way when we
> use QMP with stdio and pipelines (which is quite common in iotests).
> The problem is that, we can easily have something like this in scripts:
>
>   cat $QMP_COMMANDS | qemu -qmp stdio ... | filter_commands
>
> In most cases, a QMP session will be based on a bidirectional
> channel (read/write to a TCP port, for example), so IN and OUT are
> sharing some fundamentally same thing.  However that's untrue for pipes
> above - the IN is the "cat" program, while the "OUT" is directed to the
> "filter_commands" which is actually another process.
>
> Now when we received the CLOSED event, we cleanup the queues (including
> the QMP response queue).  That means, if we kill/stop the "cat" process
> faster than the filter_commands process, the filter_commands process is
> possible to miss some responses/events that should belong to it.
>
> In practise, I encountered a very strange SHUTDOWN event missing when
> running test with iotest 087.  Without this patch, iotest 078 will have
> ~10% chance to miss the SHUTDOWN event and fail when with Out-Of-Band
> enabled:
>
> 087 8s ... - output mismatch (see 087.out.bad)
> --- /home/peterx/git/qemu/tests/qemu-iotests/087.out    2018-06-01 18:44:22.378982462 +0800
> +++ /home/peterx/git/qemu/bin/tests/qemu-iotests/087.out.bad    2018-06-01 18:53:44.267840928 +0800
> @@ -8,7 +8,6 @@
>  {"return": {}}
>  {"error": {"class": "GenericError", "desc": "'node-name' must be specified for the root node"}}
>  {"return": {}}
> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
>
>  === Duplicate ID ===
> @@ -53,7 +52,6 @@
>  {"return": {}}
>  {"return": {}}
>  {"return": {}}
> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
>
> This patch fixes the problem.
>
> Fixes: 6d2d563f8c ("qmp: cleanup qmp queues properly", 2018-03-27)
> Signed-off-by: Peter Xu <peterx@redhat.com>
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  monitor.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/monitor.c b/monitor.c
> index 46814af533..6f26108108 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -4354,6 +4354,7 @@ static void monitor_qmp_event(void *opaque, int event)
>
>      switch (event) {
>      case CHR_EVENT_OPENED:
> +        monitor_qmp_cleanup_queues(mon);
>          mon->qmp.commands = &qmp_cap_negotiation_commands;
>          monitor_qmp_caps_reset(mon);
>          data = get_qmp_greeting(mon);
> @@ -4362,7 +4363,6 @@ static void monitor_qmp_event(void *opaque, int event)
>          mon_refcount++;
>          break;
>      case CHR_EVENT_CLOSED:
> -        monitor_qmp_cleanup_queues(mon);
>          json_message_parser_destroy(&mon->qmp.parser);
>          json_message_parser_init(&mon->qmp.parser, handle_qmp_command);
>          mon_refcount--;
> --
> 2.17.0
>
>

Drawback, we will not clean up pending commands until next client.

Perhaps we could have something more specific to the stdio case (untested):


diff --git a/include/io/channel-file.h b/include/io/channel-file.h
index ebfe54ec70..04a20bc2ed 100644
--- a/include/io/channel-file.h
+++ b/include/io/channel-file.h
@@ -90,4 +90,12 @@ qio_channel_file_new_path(const char *path,
                           mode_t mode,
                           Error **errp);

+/**
+ * qio_channel_file_is_opened:
+ *
+ * Returns: true if the associated file descriptor is valid & opened.
+ */
+bool
+qio_channel_file_is_opened(QIOChannelFile *ioc);
+
 #endif /* QIO_CHANNEL_FILE_H */
diff --git a/chardev/char-fd.c b/chardev/char-fd.c
index 2c9b2ce567..bf9803b4c9 100644
--- a/chardev/char-fd.c
+++ b/chardev/char-fd.c
@@ -59,7 +59,10 @@ static gboolean fd_chr_read(QIOChannel *chan,
GIOCondition cond, void *opaque)
         chan, (gchar *)buf, len, NULL);
     if (ret == 0) {
         remove_fd_in_watch(chr);
-        qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
+        if (!qio_channel_file_is_opened(s->ioc_out)) {
+            /* only send close event if both in & out channels are closed */
+            qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
+        }
         return FALSE;
     }
     if (ret > 0) {
diff --git a/io/channel-file.c b/io/channel-file.c
index db948abc3e..1a02f99abf 100644
--- a/io/channel-file.c
+++ b/io/channel-file.c
@@ -63,6 +63,12 @@ qio_channel_file_new_path(const char *path,
     return ioc;
 }

+bool
+qio_channel_file_is_opened(QIOChannelFile *ioc)
+{
+    errno = 0;
+    return ioc->fd != -1 && (fcntl(ioc->fd, F_GETFD) != -1 || errno != EBADF);
+}

 static void qio_channel_file_init(Object *obj)
 {

-- 
Marc-André Lureau


* Re: [Qemu-devel] [PATCH] monitor: postpone monitor_qmp_cleanup_queues
  2018-06-04 15:01 ` Marc-André Lureau
@ 2018-06-05  3:39   ` Peter Xu
  0 siblings, 0 replies; 15+ messages in thread
From: Peter Xu @ 2018-06-05  3:39 UTC (permalink / raw)
  To: Marc-André Lureau
  Cc: QEMU, Kevin Wolf, John Snow, Markus Armbruster, Max Reitz,
	Stefan Hajnoczi, Paolo Bonzini, Dr . David Alan Gilbert

On Mon, Jun 04, 2018 at 05:01:21PM +0200, Marc-André Lureau wrote:
> Hi
> 
> On Mon, Jun 4, 2018 at 10:06 AM, Peter Xu <peterx@redhat.com> wrote:
> > Previously we cleanup the queues when we got CLOSED event.  It was used
> > to make sure we won't leftover replies/events of a old client to a new
> > client.  Now this patch postpones that until OPENED.
> >
> > In most cases, it does not really matter much since either way will make
> > sure that the new client won't receive unexpected old events/responses.
> > However there can be a very rare race condition with the old way when we
> > use QMP with stdio and pipelines (which is quite common in iotests).
> > The problem is that, we can easily have something like this in scripts:
> >
> >   cat $QMP_COMMANDS | qemu -qmp stdio ... | filter_commands
> >
> > In most cases, a QMP session will be based on a bidirectional
> > channel (read/write to a TCP port, for example), so IN and OUT are
> > sharing some fundamentally same thing.  However that's untrue for pipes
> > above - the IN is the "cat" program, while the "OUT" is directed to the
> > "filter_commands" which is actually another process.
> >
> > Now when we received the CLOSED event, we cleanup the queues (including
> > the QMP response queue).  That means, if we kill/stop the "cat" process
> > faster than the filter_commands process, the filter_commands process is
> > possible to miss some responses/events that should belong to it.
> >
> > In practise, I encountered a very strange SHUTDOWN event missing when
> > running test with iotest 087.  Without this patch, iotest 078 will have
> > ~10% chance to miss the SHUTDOWN event and fail when with Out-Of-Band
> > enabled:
> >
> > 087 8s ... - output mismatch (see 087.out.bad)
> > --- /home/peterx/git/qemu/tests/qemu-iotests/087.out    2018-06-01 18:44:22.378982462 +0800
> > +++ /home/peterx/git/qemu/bin/tests/qemu-iotests/087.out.bad    2018-06-01 18:53:44.267840928 +0800
> > @@ -8,7 +8,6 @@
> >  {"return": {}}
> >  {"error": {"class": "GenericError", "desc": "'node-name' must be specified for the root node"}}
> >  {"return": {}}
> > -{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
> >
> >  === Duplicate ID ===
> > @@ -53,7 +52,6 @@
> >  {"return": {}}
> >  {"return": {}}
> >  {"return": {}}
> > -{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
> >
> > This patch fixes the problem.
> >
> > Fixes: 6d2d563f8c ("qmp: cleanup qmp queues properly", 2018-03-27)
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> >
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  monitor.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/monitor.c b/monitor.c
> > index 46814af533..6f26108108 100644
> > --- a/monitor.c
> > +++ b/monitor.c
> > @@ -4354,6 +4354,7 @@ static void monitor_qmp_event(void *opaque, int event)
> >
> >      switch (event) {
> >      case CHR_EVENT_OPENED:
> > +        monitor_qmp_cleanup_queues(mon);
> >          mon->qmp.commands = &qmp_cap_negotiation_commands;
> >          monitor_qmp_caps_reset(mon);
> >          data = get_qmp_greeting(mon);
> > @@ -4362,7 +4363,6 @@ static void monitor_qmp_event(void *opaque, int event)
> >          mon_refcount++;
> >          break;
> >      case CHR_EVENT_CLOSED:
> > -        monitor_qmp_cleanup_queues(mon);
> >          json_message_parser_destroy(&mon->qmp.parser);
> >          json_message_parser_init(&mon->qmp.parser, handle_qmp_command);
> >          mon_refcount--;
> > --
> > 2.17.0
> >
> >
> 
> Drawback, we will not clean up pending commands until next client.

IMHO that's fine.

> 
> Perhaps we could have something more specific to the stdio case (untested):

Yeah, if we can fix that at the QIO layer, that would work for me too,
though...

> 
> 
> diff --git a/include/io/channel-file.h b/include/io/channel-file.h
> index ebfe54ec70..04a20bc2ed 100644
> --- a/include/io/channel-file.h
> +++ b/include/io/channel-file.h
> @@ -90,4 +90,12 @@ qio_channel_file_new_path(const char *path,
>                            mode_t mode,
>                            Error **errp);
> 
> +/**
> + * qio_channel_file_is_opened:
> + *
> + * Returns: true if the associated file descriptor is valid & opened.
> + */
> +bool
> +qio_channel_file_is_opened(QIOChannelFile *ioc);
> +
>  #endif /* QIO_CHANNEL_FILE_H */
> diff --git a/chardev/char-fd.c b/chardev/char-fd.c
> index 2c9b2ce567..bf9803b4c9 100644
> --- a/chardev/char-fd.c
> +++ b/chardev/char-fd.c
> @@ -59,7 +59,10 @@ static gboolean fd_chr_read(QIOChannel *chan,
> GIOCondition cond, void *opaque)
>          chan, (gchar *)buf, len, NULL);
>      if (ret == 0) {
>          remove_fd_in_watch(chr);
> -        qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
> +        if (!qio_channel_file_is_opened(s->ioc_out)) {
> +            /* only send close event if both in & out channels are closed */
> +            qemu_chr_be_event(chr, CHR_EVENT_CLOSED);

... here, what if we close the READ file first and then the WRITE file?
Would we then miss the CLOSED event for the channel?

Regards,

> +        }
>          return FALSE;
>      }
>      if (ret > 0) {
> diff --git a/io/channel-file.c b/io/channel-file.c
> index db948abc3e..1a02f99abf 100644
> --- a/io/channel-file.c
> +++ b/io/channel-file.c
> @@ -63,6 +63,12 @@ qio_channel_file_new_path(const char *path,
>      return ioc;
>  }
> 
> +bool
> +qio_channel_file_is_opened(QIOChannelFile *ioc)
> +{
> +    errno = 0;
> +    return ioc->fd != -1 && (fcntl(ioc->fd, F_GETFD) != -1 || errno != EBADF);
> +}
> 
>  static void qio_channel_file_init(Object *obj)
>  {
> 
> -- 
> Marc-André Lureau

-- 
Peter Xu


* Re: [Qemu-devel] [PATCH] monitor: postpone monitor_qmp_cleanup_queues
  2018-06-04  8:06 [Qemu-devel] [PATCH] monitor: postpone monitor_qmp_cleanup_queues Peter Xu
  2018-06-04 15:01 ` Marc-André Lureau
@ 2018-06-07 11:53 ` Markus Armbruster
  2018-06-08  4:42   ` Peter Xu
  1 sibling, 1 reply; 15+ messages in thread
From: Markus Armbruster @ 2018-06-07 11:53 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Kevin Wolf, John Snow, Max Reitz, Stefan Hajnoczi,
	Marc-André Lureau, Paolo Bonzini, Dr . David Alan Gilbert

Peter Xu <peterx@redhat.com> writes:

> Previously we cleanup the queues when we got CLOSED event.  It was used

we clean up

> to make sure we won't leftover replies/events of a old client to a new

we won't send leftover replies/events of an old client

> client.  Now this patch postpones that until OPENED.

What if OPENED never comes?

> In most cases, it does not really matter much since either way will make
> sure that the new client won't receive unexpected old events/responses.
> However there can be a very rare race condition with the old way when we
> use QMP with stdio and pipelines (which is quite common in iotests).
> The problem is that, we can easily have something like this in scripts:
>
>   cat $QMP_COMMANDS | qemu -qmp stdio ... | filter_commands
>
> In most cases, a QMP session will be based on a bidirectional
> channel (read/write to a TCP port, for example), so IN and OUT are
> sharing some fundamentally same thing.  However that's untrue for pipes

Suggest "are fundamentally the same thing".

> above - the IN is the "cat" program, while the "OUT" is directed to the
> "filter_commands" which is actually another process.

Regarding 'IN is the "cat" program': a pipe is not a program.  Suggest
'IN is connected to "cat", while OUT is connected to "filter_commands",
which are separate processes.

> Now when we received the CLOSED event, we cleanup the queues (including

we clean up

> the QMP response queue).  That means, if we kill/stop the "cat" process
> faster than the filter_commands process, the filter_commands process is
> possible to miss some responses/events that should belong to it.

I'm not sure I understand the error scenario.  Can you explain it in
more concrete terms?  Like "cat terminates, QEMU reads EOF, QEMU does
this, QEMU does that, oops, it shouldn't have done that".

> In practise, I encountered a very strange SHUTDOWN event missing when

practice

May I suggest use of a spell checker?  ;)

> running test with iotest 087.  Without this patch, iotest 078 will have

087 or 078?

> ~10% chance to miss the SHUTDOWN event and fail when with Out-Of-Band
> enabled:
>
> 087 8s ... - output mismatch (see 087.out.bad)


* Re: [Qemu-devel] [PATCH] monitor: postpone monitor_qmp_cleanup_queues
  2018-06-07 11:53 ` Markus Armbruster
@ 2018-06-08  4:42   ` Peter Xu
  2018-06-08  7:05     ` Markus Armbruster
  2018-06-08  8:04     ` Stefan Hajnoczi
  0 siblings, 2 replies; 15+ messages in thread
From: Peter Xu @ 2018-06-08  4:42 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: qemu-devel, Kevin Wolf, John Snow, Max Reitz, Stefan Hajnoczi,
	Marc-André Lureau, Paolo Bonzini, Dr . David Alan Gilbert

On Thu, Jun 07, 2018 at 01:53:01PM +0200, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > Previously we cleanup the queues when we got CLOSED event.  It was used
> 
> we clean up
> 
> > to make sure we won't leftover replies/events of a old client to a new
> 
> we won't send leftover replies/events of an old client
> 
> > client.  Now this patch postpones that until OPENED.
> 
> What if OPENED never comes?

Then we clean that up until destruction of the monitor.  IMHO it's
fine, but I'm not sure whether that's an overall acceptable approach.

> 
> > In most cases, it does not really matter much since either way will make
> > sure that the new client won't receive unexpected old events/responses.
> > However there can be a very rare race condition with the old way when we
> > use QMP with stdio and pipelines (which is quite common in iotests).
> > The problem is that, we can easily have something like this in scripts:
> >
> >   cat $QMP_COMMANDS | qemu -qmp stdio ... | filter_commands
> >
> > In most cases, a QMP session will be based on a bidirectional
> > channel (read/write to a TCP port, for example), so IN and OUT are
> > sharing some fundamentally same thing.  However that's untrue for pipes
> 
> Suggest "are fundamentally the same thing".
> 
> > above - the IN is the "cat" program, while the "OUT" is directed to the
> > "filter_commands" which is actually another process.
> 
> Regarding 'IN is the "cat" program': a pipe is not a program.  Suggest
> 'IN is connected to "cat", while OUT is connected to "filter_commands",
> which are separate processes.
> 
> > Now when we received the CLOSED event, we cleanup the queues (including
> 
> we clean up
> 
> > the QMP response queue).  That means, if we kill/stop the "cat" process
> > faster than the filter_commands process, the filter_commands process is
> > possible to miss some responses/events that should belong to it.
> 
> I'm not sure I understand the error scenario.  Can you explain it in
> more concrete terms?  Like "cat terminates, QEMU reads EOF, QEMU does
> this, QEMU does that, oops, it shouldn't have done that".

One condition could be this (after "quit" command is executed and QEMU
quits the main loop):

1. [main thread] QEMU queues one SHUTDOWN event into response queue

2. "cat" terminates (to distinguish it from the animal, I quote it).
   Logically it can terminate even earlier, but let's just assume it's
   here.

3. [monitor iothread] QEMU reads EOF from stdin, which connects to the
   "cat" process

4. [monitor iothread] QEMU calls the CLOSED event hook for the monitor,
   which will clean up the response queue of the monitor, then the
   SHUTDOWN event is dropped

5. [main thread] clean up the monitors in monitor_cleanup(), when
   trying to flush pending responses, it sees nothing.  SHUTDOWN is
   lost forever

Note that before the monitor iothread was introduced, step [4]/[5]
could never happen since the main loop was the only place to detect
the EOF event of stdin and run the CLOSED event hooks.  Now things can
happen in parallel in the iothread.

I can add these into commit message.

> 
> > In practise, I encountered a very strange SHUTDOWN event missing when
> 
> practice
> 
> May I suggest use of a spell checker?  ;)

Sorry for that.  TODO added: "Find a spell checker for Emacs".

> 
> > running test with iotest 087.  Without this patch, iotest 078 will have
> 
> 087 or 078?

087.  Err.  Even a spell checker won't help me here!

> 
> > ~10% chance to miss the SHUTDOWN event and fail when with Out-Of-Band
> > enabled:
> >
> > 087 8s ... - output mismatch (see 087.out.bad)

I'll take all the rest comments.  Thanks for reviewing.

-- 
Peter Xu


* Re: [Qemu-devel] [PATCH] monitor: postpone monitor_qmp_cleanup_queues
  2018-06-08  4:42   ` Peter Xu
@ 2018-06-08  7:05     ` Markus Armbruster
  2018-06-08  8:04     ` Stefan Hajnoczi
  1 sibling, 0 replies; 15+ messages in thread
From: Markus Armbruster @ 2018-06-08  7:05 UTC (permalink / raw)
  To: Peter Xu
  Cc: Markus Armbruster, Kevin Wolf, qemu-devel, Max Reitz,
	Stefan Hajnoczi, Paolo Bonzini, Marc-André Lureau,
	John Snow, Dr . David Alan Gilbert

Peter Xu <peterx@redhat.com> writes:

> On Thu, Jun 07, 2018 at 01:53:01PM +0200, Markus Armbruster wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > Previously we cleanup the queues when we got CLOSED event.  It was used
>> 
>> we clean up
>> 
>> > to make sure we won't leftover replies/events of a old client to a new
>> 
>> we won't send leftover replies/events of an old client
>> 
>> > client.  Now this patch postpones that until OPENED.
>> 
>> What if OPENED never comes?
>
> Then we clean that up until destruction of the monitor.  IMHO it's
> fine, but I'm not sure whether that's an overall acceptable approach.

See below.

>> > In most cases, it does not really matter much since either way will make
>> > sure that the new client won't receive unexpected old events/responses.
>> > However there can be a very rare race condition with the old way when we
>> > use QMP with stdio and pipelines (which is quite common in iotests).
>> > The problem is that, we can easily have something like this in scripts:
>> >
>> >   cat $QMP_COMMANDS | qemu -qmp stdio ... | filter_commands
>> >
>> > In most cases, a QMP session will be based on a bidirectional
>> > channel (read/write to a TCP port, for example), so IN and OUT are
>> > sharing some fundamentally same thing.  However that's untrue for pipes
>> 
>> Suggest "are fundamentally the same thing".
>> 
>> > above - the IN is the "cat" program, while the "OUT" is directed to the
>> > "filter_commands" which is actually another process.
>> 
>> Regarding 'IN is the "cat" program': a pipe is not a program.  Suggest
>> 'IN is connected to "cat", while OUT is connected to "filter_commands",
>> which are separate processes.
>> 
>> > Now when we received the CLOSED event, we cleanup the queues (including
>> 
>> we clean up
>> 
>> > the QMP response queue).  That means, if we kill/stop the "cat" process
>> > faster than the filter_commands process, the filter_commands process is
>> > possible to miss some responses/events that should belong to it.
>> 
>> I'm not sure I understand the error scenario.  Can you explain it in
>> more concrete terms?  Like "cat terminates, QEMU reads EOF, QEMU does
>> this, QEMU does that, oops, it shouldn't have done that".
>
> One condition could be this (after "quit" command is executed and QEMU
> quits the main loop):
>
> 1. [main thread] QEMU queues one SHUTDOWN event into response queue
>
> 2. "cat" terminates (to distinguish it from the animal, I quote it).
>    Logically it can terminate even earlier, but let's just assume it's
>    here.
>
> 3. [monitor iothread] QEMU reads EOF from stdin, which connects to the
>    "cat" process
>
> 4. [monitor iothread] QEMU calls the CLOSED event hook for the monitor,
>    which will clean up the response queue of the monitor, then the
>    SHUTDOWN event is dropped
>
> 5. [main thread] clean up the monitors in monitor_cleanup(), when
>    trying to flush pending responses, it sees nothing.  SHUTDOWN is
>    lost forever

Got it, thanks!

The actual problem is that we drop the output queue when we see EOF on
input.

Here's what I think we should do on such EOF:

1. Shut down input

   Stop reading input.  If input has its own file descriptor (as opposed
   to a read/write fd shared with output), close it.

2. Flush output

   Continue to write output until the queue becomes empty or we run into
   an error (such as EPIPE, telling us the output's consumer went away).
   Works exactly as before, except we proceed to step 3 when the queue
   becomes empty.

3. Shut down output

   Close the output file descriptor.

> Note that before the monitor iothread was introduced, step [4]/[5]
> could never happen since the main loop was the only place to detect
> the EOF event of stdin and run the CLOSED event hooks.

I can't quite see what ensured the response queue was empty before the
main loop ran the CLOSED event hook.

>                                                         Now things can
> happen in parallel in the iothread.

Perhaps that just made the bug more likely to bite.

> I can add these into commit message.
>
>> 
>> > In practise, I encountered a very strange SHUTDOWN event missing when
>> 
>> practice
>> 
>> May I suggest use of a spell checker?  ;)
>
> Sorry for that.  TODO added: "Find a spell checker for Emacs".

C-h i
M emacs RET
M spelling RET

>> > running test with iotest 087.  Without this patch, iotest 078 will have
>> 
>> 087 or 078?
>
> 087.  Err.  Even a spell checker won't help me here!

Need to leave something for reviewers to do ;)

>> > ~10% chance to miss the SHUTDOWN event and fail when with Out-Of-Band
>> > enabled:
>> >
>> > 087 8s ... - output mismatch (see 087.out.bad)
>
> I'll take all the rest comments.  Thanks for reviewing.

I'm not sure this patch is the proper solution.  Can we explore the one
I sketched?


* Re: [Qemu-devel] [PATCH] monitor: postpone monitor_qmp_cleanup_queues
  2018-06-08  4:42   ` Peter Xu
  2018-06-08  7:05     ` Markus Armbruster
@ 2018-06-08  8:04     ` Stefan Hajnoczi
  2018-06-08  8:18       ` Markus Armbruster
  1 sibling, 1 reply; 15+ messages in thread
From: Stefan Hajnoczi @ 2018-06-08  8:04 UTC (permalink / raw)
  To: Peter Xu
  Cc: Markus Armbruster, Kevin Wolf, qemu-devel, Max Reitz,
	Stefan Hajnoczi, Paolo Bonzini, Marc-André Lureau,
	John Snow, Dr . David Alan Gilbert

On Fri, Jun 08, 2018 at 12:42:35PM +0800, Peter Xu wrote:
> On Thu, Jun 07, 2018 at 01:53:01PM +0200, Markus Armbruster wrote:
> > Peter Xu <peterx@redhat.com> writes:
> > 
> > > Previously we cleanup the queues when we got CLOSED event.  It was used
> > 
> > we clean up
> > 
> > > to make sure we won't leftover replies/events of a old client to a new
> > 
> > we won't send leftover replies/events of an old client
> > 
> > > client.  Now this patch postpones that until OPENED.
> > 
> > What if OPENED never comes?
> 
> Then we clean that up until destruction of the monitor.  IMHO it's
> fine, but I'm not sure whether that's an overall acceptable approach.

I think this patch fixes the problem at the wrong level.  Marc-André's
fix seemed like a cleaner solution.

Stefan



* Re: [Qemu-devel] [PATCH] monitor: postpone monitor_qmp_cleanup_queues
  2018-06-08  8:04     ` Stefan Hajnoczi
@ 2018-06-08  8:18       ` Markus Armbruster
  2018-06-08  9:11         ` Peter Xu
  2018-06-11 16:45         ` Stefan Hajnoczi
  0 siblings, 2 replies; 15+ messages in thread
From: Markus Armbruster @ 2018-06-08  8:18 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Peter Xu, Kevin Wolf, qemu-devel, Markus Armbruster, Max Reitz,
	Stefan Hajnoczi, Marc-André Lureau, Paolo Bonzini,
	John Snow, Dr . David Alan Gilbert

Stefan Hajnoczi <stefanha@gmail.com> writes:

> On Fri, Jun 08, 2018 at 12:42:35PM +0800, Peter Xu wrote:
>> On Thu, Jun 07, 2018 at 01:53:01PM +0200, Markus Armbruster wrote:
>> > Peter Xu <peterx@redhat.com> writes:
>> > 
>> > > Previously we cleanup the queues when we got CLOSED event.  It was used
>> > 
>> > we clean up
>> > 
>> > > to make sure we won't leftover replies/events of a old client to a new
>> > 
>> > we won't send leftover replies/events of an old client
>> > 
>> > > client.  Now this patch postpones that until OPENED.
>> > 
>> > What if OPENED never comes?
>> 
>> Then we clean that up until destruction of the monitor.  IMHO it's
>> fine, but I'm not sure whether that's an overall acceptable approach.
>
> I think this patch fixes the problem at the wrong level.  Marc-André's
> fix seemed like a cleaner solution.

Is it the right solution?

I proposed another one:

[...]
>> > Here's what I think we should do on such EOF:
>> > 
>> > 1. Shut down input
>> > 
>> >    Stop reading input.  If input has its own file descriptor (as opposed
>> >    to a read/write fd shared with output), close it.
>> > 
>> > 2. Flush output
>> > 
>> >    Continue to write output until the queue becomes empty or we run into
>> >    an error (such as EPIPE, telling us the output's consumer went away).
>> >    Works exactly as before, except we proceed to step 3 when the queue

exactly as before EOF

>> >    becomes empty.
>> > 
>> > 3. Shut down output
>> > 
>> >    Close the output file descriptor.


* Re: [Qemu-devel] [PATCH] monitor: postpone monitor_qmp_cleanup_queues
  2018-06-08  8:18       ` Markus Armbruster
@ 2018-06-08  9:11         ` Peter Xu
  2018-06-08  9:24           ` Peter Xu
  2018-06-11 16:45         ` Stefan Hajnoczi
  1 sibling, 1 reply; 15+ messages in thread
From: Peter Xu @ 2018-06-08  9:11 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Stefan Hajnoczi, Kevin Wolf, qemu-devel, Max Reitz,
	Stefan Hajnoczi, Marc-André Lureau, Paolo Bonzini,
	John Snow, Dr . David Alan Gilbert

On Fri, Jun 08, 2018 at 10:18:25AM +0200, Markus Armbruster wrote:
> Stefan Hajnoczi <stefanha@gmail.com> writes:
> 
> > On Fri, Jun 08, 2018 at 12:42:35PM +0800, Peter Xu wrote:
> >> On Thu, Jun 07, 2018 at 01:53:01PM +0200, Markus Armbruster wrote:
> >> > Peter Xu <peterx@redhat.com> writes:
> >> > 
> >> > > Previously we cleanup the queues when we got CLOSED event.  It was used
> >> > 
> >> > we clean up
> >> > 
> >> > > to make sure we won't leftover replies/events of a old client to a new
> >> > 
> >> > we won't send leftover replies/events of an old client
> >> > 
> >> > > client.  Now this patch postpones that until OPENED.
> >> > 
> >> > What if OPENED never comes?
> >> 
> >> Then we clean that up until destruction of the monitor.  IMHO it's
> >> fine, but I'm not sure whether that's an overall acceptable approach.
> >
> > I think this patch fixes the problem at the wrong level.  Marc-André's
> > fix seemed like a cleaner solution.

Actually I like Marc-Andre's solution too.

However, I left a comment there; I'm not sure whether that's workable.
My feeling is that currently our chardev layer can't really work very
nicely when the chardev backend is composed of two different channels,
say, when IN and OUT are different fds underneath.

Btw, I just noticed that fd_chr_add_watch() adds a watch only to the
ioc_out object; I'm not sure why (note that we have both ioc_in and
ioc_out for the fd chardev):

static GSource *fd_chr_add_watch(Chardev *chr, GIOCondition cond)
{
    FDChardev *s = FD_CHARDEV(chr);
    return qio_channel_create_watch(s->ioc_out, cond);
}

(this is not related to the current problem at all; it's a pure
 question, feel free to ignore)

> 
> Is it the right solution?
> 
> I proposed another one:
> 
> [...]
> >> > Here's what I think we should do on such EOF:
> >> > 
> >> > 1. Shut down input
> >> > 
> >> >    Stop reading input.  If input has its own file descriptor (as opposed
> >> >    to a read/write fd shared with output), close it.
> >> > 
> >> > 2. Flush output
> >> > 
> >> >    Continue to write output until the queue becomes empty or we run into
> >> >    an error (such as EPIPE, telling us the output's consumer went away).
> >> >    Works exactly as before, except we proceed to step 3 when the queue
> 
> exactly as before EOF

Yeah this seems reasonable.

I did a quick test with the change below and it fixes the problem too:

diff --git a/monitor.c b/monitor.c
index 5eb4630215..b76cab5022 100644
--- a/monitor.c
+++ b/monitor.c
@@ -4366,6 +4366,8 @@ static void monitor_qmp_event(void *opaque, int event)
         mon_refcount++;
         break;
     case CHR_EVENT_CLOSED:
+        /* Force flush all events */
+        monitor_qmp_bh_responder(NULL);
         monitor_qmp_cleanup_queues(mon);
         json_message_parser_destroy(&mon->qmp.parser);
         json_message_parser_init(&mon->qmp.parser, handle_qmp_command);

(Though maybe I can do it a bit "prettier" when I post a formal
 patch... for example, only flush the response queue for that specific
 monitor)

Frankly speaking, I think this might not be an ideal fix either.  For
example, what if we are executing the dispatcher of a command when we
receive the CLOSED event?  If so, the dispatcher will put the
response onto the response queue after the CLOSED event, and ideally
we should also deliver that to the filter_output process.

> 
> >> >    becomes empty.
> >> > 
> >> > 3. Shut down output
> >> > 
> >> >    Close the output file descriptor.

For this one I don't know whether it'll be necessary.  That's managed
by the chardev backends now; I think it won't be closed until QEMU
quits (for our bash pipeline scenario).

Regards,

-- 
Peter Xu


* Re: [Qemu-devel] [PATCH] monitor: postpone monitor_qmp_cleanup_queues
  2018-06-08  9:11         ` Peter Xu
@ 2018-06-08  9:24           ` Peter Xu
  2018-06-11  7:58             ` Markus Armbruster
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Xu @ 2018-06-08  9:24 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Kevin Wolf, Stefan Hajnoczi, qemu-devel, Max Reitz,
	Stefan Hajnoczi, Paolo Bonzini, Marc-André Lureau,
	John Snow, Dr . David Alan Gilbert

On Fri, Jun 08, 2018 at 05:11:54PM +0800, Peter Xu wrote:

[...]

> Frankly speaking, I think this might not be an ideal fix either.  For
> example, what if we are executing the dispatcher of a command when we
> receive the CLOSED event?  If so, the dispatcher will put the
> response onto the response queue after the CLOSED event, and ideally
> we should also deliver that to the filter_output process.

Please ignore this paragraph.  Actually, if that happens, we'll queue
the response onto the response queue as usual; then, as long as the
output channel is not closed, it'll still be delivered to the
filter_output process.

So I think I agree with Markus's solution: we just flush the response
queue when we get CLOSED (but we don't close the output fd; IMHO
that's the chardev backend's job).  Would that work?

Regards,

-- 
Peter Xu


* Re: [Qemu-devel] [PATCH] monitor: postpone monitor_qmp_cleanup_queues
  2018-06-08  9:24           ` Peter Xu
@ 2018-06-11  7:58             ` Markus Armbruster
  0 siblings, 0 replies; 15+ messages in thread
From: Markus Armbruster @ 2018-06-11  7:58 UTC (permalink / raw)
  To: Peter Xu
  Cc: Markus Armbruster, Kevin Wolf, Stefan Hajnoczi, qemu-devel,
	Max Reitz, Stefan Hajnoczi, Marc-André Lureau,
	Paolo Bonzini, John Snow, Dr . David Alan Gilbert

Peter Xu <peterx@redhat.com> writes:

> On Fri, Jun 08, 2018 at 05:11:54PM +0800, Peter Xu wrote:
[...]
> So I think I agree with Markus's solution: we just flush the response
> queue when we get CLOSED (but we don't close the output fd; IMHO
> that's the chardev backend's job).  Would that work?

I figure you're right on close() being the chardev backend's job.


* Re: [Qemu-devel] [PATCH] monitor: postpone monitor_qmp_cleanup_queues
  2018-06-08  8:18       ` Markus Armbruster
  2018-06-08  9:11         ` Peter Xu
@ 2018-06-11 16:45         ` Stefan Hajnoczi
  2018-06-12  5:47           ` Peter Xu
  1 sibling, 1 reply; 15+ messages in thread
From: Stefan Hajnoczi @ 2018-06-11 16:45 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Stefan Hajnoczi, Peter Xu, Kevin Wolf, qemu-devel, Max Reitz,
	Marc-André Lureau, Paolo Bonzini, John Snow,
	Dr . David Alan Gilbert

On Fri, Jun 08, 2018 at 10:18:25AM +0200, Markus Armbruster wrote:
> Stefan Hajnoczi <stefanha@gmail.com> writes:
> > On Fri, Jun 08, 2018 at 12:42:35PM +0800, Peter Xu wrote:
> >> On Thu, Jun 07, 2018 at 01:53:01PM +0200, Markus Armbruster wrote:
> >> > Peter Xu <peterx@redhat.com> writes:
> >> > 
> >> > > Previously we cleanup the queues when we got CLOSED event.  It was used
> >> > 
> >> > we clean up
> >> > 
> >> > > to make sure we won't leftover replies/events of a old client to a new
> >> > 
> >> > we won't send leftover replies/events of an old client
> >> > 
> >> > > client.  Now this patch postpones that until OPENED.
> >> > 
> >> > What if OPENED never comes?
> >> 
> >> Then we clean that up until destruction of the monitor.  IMHO it's
> >> fine, but I'm not sure whether that's an overall acceptable approach.
> >
> > I think this patch fixes the problem at the wrong level.  Marc-André's
> > fix seemed like a cleaner solution.
> 
> Is it the right solution?
> 
> I proposed another one:

Sorry, I won't be able to participate in this because I'm behind on
other patches and tasks.  Therefore, feel free to disregard but I'll
give my 2 cents:

This seems like a chardev bug.  The solution should probably be in the
chardev layer (Marc-André's patch or something else), not in the monitor
(this patch).

Even if there is a monitor change, it's probably necessary to at least
clarify the meaning of the CLOSED event to reduce the chance of similar
bugs in future chardev users.

Stefan



* Re: [Qemu-devel] [PATCH] monitor: postpone monitor_qmp_cleanup_queues
  2018-06-11 16:45         ` Stefan Hajnoczi
@ 2018-06-12  5:47           ` Peter Xu
  2018-06-13 14:30             ` Stefan Hajnoczi
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Xu @ 2018-06-12  5:47 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Markus Armbruster, Stefan Hajnoczi, Kevin Wolf, qemu-devel,
	Max Reitz, Marc-André Lureau, Paolo Bonzini, John Snow,
	Dr . David Alan Gilbert

On Mon, Jun 11, 2018 at 05:45:49PM +0100, Stefan Hajnoczi wrote:
> On Fri, Jun 08, 2018 at 10:18:25AM +0200, Markus Armbruster wrote:
> > Stefan Hajnoczi <stefanha@gmail.com> writes:
> > > On Fri, Jun 08, 2018 at 12:42:35PM +0800, Peter Xu wrote:
> > >> On Thu, Jun 07, 2018 at 01:53:01PM +0200, Markus Armbruster wrote:
> > >> > Peter Xu <peterx@redhat.com> writes:
> > >> > 
> > >> > > Previously we cleanup the queues when we got CLOSED event.  It was used
> > >> > 
> > >> > we clean up
> > >> > 
> > >> > > to make sure we won't leftover replies/events of a old client to a new
> > >> > 
> > >> > we won't send leftover replies/events of an old client
> > >> > 
> > >> > > client.  Now this patch postpones that until OPENED.
> > >> > 
> > >> > What if OPENED never comes?
> > >> 
> > >> Then we clean that up until destruction of the monitor.  IMHO it's
> > >> fine, but I'm not sure whether that's an overall acceptable approach.
> > >
> > > I think this patch fixes the problem at the wrong level.  Marc-André's
> > > fix seemed like a cleaner solution.
> > 
> > Is it the right solution?
> > 
> > I proposed another one:
> 
> Sorry, I won't be able to participate in this because I'm behind on
> other patches and tasks.  Therefore, feel free to disregard but I'll
> give my 2 cents:
> 
> This seems like a chardev bug.  The solution should probably be in the
> chardev layer (Marc-André's patch or something else), not in the monitor
> (this patch).

Yes, that's why I said I like Marc-Andre's patch. :) I just don't know
a reliable way to achieve what we want there.

The thing is that we don't really monitor the ioc_out for fd-typed
chardevs.  We only do that when we call qemu_chr_fe_add_watch() (e.g.,
in monitor_flush_locked()) when the write buffer is full.  But
normally we can't detect any event from the output side, hence there's
no way to deliver a CLOSED event when the output fd is the last fd
that is closed.

Maybe we can keep that output watch for the whole lifecycle of the
chardev?  I'm not sure yet.

> 
> Even if there is a monitor change, it's probably necessary to at least
> clarify the meaning of the CLOSED event to reduce the chance of similar
> bugs in future chardev users.

Makes sense.

I'll wait for a few more days (I want to enable OOB asap, since
otherwise we'll reach another rc soon) to see whether
Paolo/Marc-Andre/... has any better idea; otherwise I'll post a patch
to fix our problem first, then we can turn OOB on alongside the fix.

Thanks,

-- 
Peter Xu


* Re: [Qemu-devel] [PATCH] monitor: postpone monitor_qmp_cleanup_queues
  2018-06-12  5:47           ` Peter Xu
@ 2018-06-13 14:30             ` Stefan Hajnoczi
  2018-06-14  3:05               ` Peter Xu
  0 siblings, 1 reply; 15+ messages in thread
From: Stefan Hajnoczi @ 2018-06-13 14:30 UTC (permalink / raw)
  To: Peter Xu
  Cc: Markus Armbruster, Stefan Hajnoczi, Kevin Wolf, qemu-devel,
	Max Reitz, Marc-André Lureau, Paolo Bonzini, John Snow,
	Dr . David Alan Gilbert

On Tue, Jun 12, 2018 at 01:47:00PM +0800, Peter Xu wrote:
> On Mon, Jun 11, 2018 at 05:45:49PM +0100, Stefan Hajnoczi wrote:
> > On Fri, Jun 08, 2018 at 10:18:25AM +0200, Markus Armbruster wrote:
> > > Stefan Hajnoczi <stefanha@gmail.com> writes:
> > > > On Fri, Jun 08, 2018 at 12:42:35PM +0800, Peter Xu wrote:
> > > >> On Thu, Jun 07, 2018 at 01:53:01PM +0200, Markus Armbruster wrote:
> > > >> > Peter Xu <peterx@redhat.com> writes:
> > > >> > 
> > > >> > > Previously we cleanup the queues when we got CLOSED event.  It was used
> > > >> > 
> > > >> > we clean up
> > > >> > 
> > > >> > > to make sure we won't leftover replies/events of a old client to a new
> > > >> > 
> > > >> > we won't send leftover replies/events of an old client
> > > >> > 
> > > >> > > client.  Now this patch postpones that until OPENED.
> > > >> > 
> > > >> > What if OPENED never comes?
> > > >> 
> > > >> Then we clean that up until destruction of the monitor.  IMHO it's
> > > >> fine, but I'm not sure whether that's an overall acceptable approach.
> > > >
> > > > I think this patch fixes the problem at the wrong level.  Marc-André's
> > > > fix seemed like a cleaner solution.
> > > 
> > > Is it the right solution?
> > > 
> > > I proposed another one:
> > 
> > Sorry, I won't be able to participate in this because I'm behind on
> > other patches and tasks.  Therefore, feel free to disregard but I'll
> > give my 2 cents:
> > 
> > This seems like a chardev bug.  The solution should probably be in the
> > chardev layer (Marc-André's patch or something else), not in the monitor
> > (this patch).
> 
> Yes, that's why I said I like Marc-Andre's patch. :) I just don't know
> a reliable way to achieve what we want there.
> 
> The thing is that we don't really monitor the ioc_out for fd-typed
> chardevs.  We only do that when we call qemu_chr_fe_add_watch() (e.g.,
> in monitor_flush_locked()) when the write buffer is full.  But
> normally we can't detect any event from the output side, hence no way
> to deliver a CLOSED event when the output fd is the last fd that is
> closed.
> 
> Maybe we can keep that output watch for the whole lifecycle of the
> chardev?  I'm not sure yet.

It's not possible to keep a POLLOUT fd watch since poll(2) would return
instantly and the event loop would spin.  However, POLLHUP (G_IO_HUP)
might work.

Stefan



* Re: [Qemu-devel] [PATCH] monitor: postpone monitor_qmp_cleanup_queues
  2018-06-13 14:30             ` Stefan Hajnoczi
@ 2018-06-14  3:05               ` Peter Xu
  0 siblings, 0 replies; 15+ messages in thread
From: Peter Xu @ 2018-06-14  3:05 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Markus Armbruster, Stefan Hajnoczi, Kevin Wolf, qemu-devel,
	Max Reitz, Marc-André Lureau, Paolo Bonzini, John Snow,
	Dr . David Alan Gilbert

On Wed, Jun 13, 2018 at 03:30:05PM +0100, Stefan Hajnoczi wrote:
> On Tue, Jun 12, 2018 at 01:47:00PM +0800, Peter Xu wrote:
> > On Mon, Jun 11, 2018 at 05:45:49PM +0100, Stefan Hajnoczi wrote:
> > > On Fri, Jun 08, 2018 at 10:18:25AM +0200, Markus Armbruster wrote:
> > > > Stefan Hajnoczi <stefanha@gmail.com> writes:
> > > > > On Fri, Jun 08, 2018 at 12:42:35PM +0800, Peter Xu wrote:
> > > > >> On Thu, Jun 07, 2018 at 01:53:01PM +0200, Markus Armbruster wrote:
> > > > >> > Peter Xu <peterx@redhat.com> writes:
> > > > >> > 
> > > > >> > > Previously we cleanup the queues when we got CLOSED event.  It was used
> > > > >> > 
> > > > >> > we clean up
> > > > >> > 
> > > > >> > > to make sure we won't leftover replies/events of a old client to a new
> > > > >> > 
> > > > >> > we won't send leftover replies/events of an old client
> > > > >> > 
> > > > >> > > client.  Now this patch postpones that until OPENED.
> > > > >> > 
> > > > >> > What if OPENED never comes?
> > > > >> 
> > > > >> Then we clean that up until destruction of the monitor.  IMHO it's
> > > > >> fine, but I'm not sure whether that's an overall acceptable approach.
> > > > >
> > > > > I think this patch fixes the problem at the wrong level.  Marc-André's
> > > > > fix seemed like a cleaner solution.
> > > > 
> > > > Is it the right solution?
> > > > 
> > > > I proposed another one:
> > > 
> > > Sorry, I won't be able to participate in this because I'm behind on
> > > other patches and tasks.  Therefore, feel free to disregard but I'll
> > > give my 2 cents:
> > > 
> > > This seems like a chardev bug.  The solution should probably be in the
> > > chardev layer (Marc-André's patch or something else), not in the monitor
> > > (this patch).
> > 
> > Yes, that's why I said I like Marc-Andre's patch. :) I just don't know
> > a reliable way to achieve what we want there.
> > 
> > The thing is that we don't really monitor the ioc_out for fd-typed
> > chardevs.  We only do that when we call qemu_chr_fe_add_watch() (e.g.,
> > in monitor_flush_locked()) when the write buffer is full.  But
> > normally we can't detect any event from the output side, hence no way
> > to deliver a CLOSED event when the output fd is the last fd that is
> > closed.
> > 
> > Maybe we can keep that output watch for the whole lifecycle of the
> > chardev?  I'm not sure yet.
> 
> It's not possible to keep a POLLOUT fd watch since poll(2) would return
> instantly and the event loop would spin.  However, POLLHUP (G_IO_HUP)
> might work.

It seems so.  But I am not confident enough to quickly provide a
mature patch for chardev, so I'll just work around it in the monitor
right now.  I'll leave comments there with your ideas, and also on the
events.  We can rework it when we have time.

Regards,

-- 
Peter Xu


end of thread, other threads:[~2018-06-14  3:05 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-04  8:06 [Qemu-devel] [PATCH] monitor: postpone monitor_qmp_cleanup_queues Peter Xu
2018-06-04 15:01 ` Marc-André Lureau
2018-06-05  3:39   ` Peter Xu
2018-06-07 11:53 ` Markus Armbruster
2018-06-08  4:42   ` Peter Xu
2018-06-08  7:05     ` Markus Armbruster
2018-06-08  8:04     ` Stefan Hajnoczi
2018-06-08  8:18       ` Markus Armbruster
2018-06-08  9:11         ` Peter Xu
2018-06-08  9:24           ` Peter Xu
2018-06-11  7:58             ` Markus Armbruster
2018-06-11 16:45         ` Stefan Hajnoczi
2018-06-12  5:47           ` Peter Xu
2018-06-13 14:30             ` Stefan Hajnoczi
2018-06-14  3:05               ` Peter Xu
