All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] chardev's and fd's in monitors
@ 2016-10-12 19:15 Dr. David Alan Gilbert
  2016-10-12 20:02 ` Marc-André Lureau
                   ` (2 more replies)
  0 siblings, 3 replies; 54+ messages in thread
From: Dr. David Alan Gilbert @ 2016-10-12 19:15 UTC (permalink / raw)
  To: qemu-devel, berrange, armbru

Hi,
  I had a look at a couple of readline like libraries;
editline and linenoise.  A difficulty with using them is that
they both want fd's or FILE*'s; editline takes either but
from a brief look I think it's expecting to extract the fd.
That makes them tricky to integrate into qemu, where
the chardev's hide a whole bunch of non-fd things; in particular
tls, mux, ringbuffers etc.

If we could get away with just a FILE* then we could use fopencookie,
but that's GNU only.

Is there any sane way of shepherding all chardev's into having an
fd?

Once you had those then you could also use them in a separate thread.

Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-12 19:15 [Qemu-devel] chardev's and fd's in monitors Dr. David Alan Gilbert
@ 2016-10-12 20:02 ` Marc-André Lureau
  2016-10-13 15:47   ` Dr. David Alan Gilbert
  2016-10-18 10:04 ` Daniel P. Berrange
  2016-10-18 10:06 ` Daniel P. Berrange
  2 siblings, 1 reply; 54+ messages in thread
From: Marc-André Lureau @ 2016-10-12 20:02 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, qemu-devel, berrange, armbru

Hi

On Wed, Oct 12, 2016 at 11:15 PM Dr. David Alan Gilbert <dgilbert@redhat.com>
wrote:

> Hi,
>   I had a look at a couple of readline like libraries;
> editline and linenoise.  A difficulty with using them is that
> they both want fd's or FILE*'s; editline takes either but
> from a brief look I think it's expecting to extract the fd.
> That makes them tricky to integrate into qemu, where
> the chardev's hide a whole bunch of non-fd things; in particular
> tls, mux, ringbuffers etc.
>

We could restrict readline usage to chardev with fd? But even with that,
how would it be compatible with mux? It would have to somehow steal/restore
the chardev fd. Alternatively, we could have a "fd pipe"/socketpair chardev
frontend compatible with any chardev. Sounds contrived though, but it
should work, and probably not so much code. (qemu_chr_new_fd_fe?)


>
> If we could get away with just a FILE* then we could use fopencookie,
> but that's GNU only.
>
> Is there any sane way of shepherding all chardev's into having an
> fd?
>

Ah that would be nice! But I think the point is to stay in userspace (and
avoid extra copy, context switch, or extra fds). Otherwise, it feels like
the whole chr interface could be a socketpair + a thin layer for events,
that would simplify things indeed.


> Once you had those then you could also use them in a separate thread.
>
>
You can already use chardev in seperate thread, but I don't know to which
extent (see add_handlers_full for completely seperate thread, locking for
write for multi-writer, I suppose s->chr_read is called from the
dispatching context and is responsability for frontend callback to lock
properly)


-- 
Marc-André Lureau

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-12 20:02 ` Marc-André Lureau
@ 2016-10-13 15:47   ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 54+ messages in thread
From: Dr. David Alan Gilbert @ 2016-10-13 15:47 UTC (permalink / raw)
  To: Marc-André Lureau; +Cc: qemu-devel, berrange, armbru

* Marc-André Lureau (marcandre.lureau@gmail.com) wrote:
> Hi
> 
> On Wed, Oct 12, 2016 at 11:15 PM Dr. David Alan Gilbert <dgilbert@redhat.com>
> wrote:
> 
> > Hi,
> >   I had a look at a couple of readline like libraries;
> > editline and linenoise.  A difficulty with using them is that
> > they both want fd's or FILE*'s; editline takes either but
> > from a brief look I think it's expecting to extract the fd.
> > That makes them tricky to integrate into qemu, where
> > the chardev's hide a whole bunch of non-fd things; in particular
> > tls, mux, ringbuffers etc.
> >
> 
> We could restrict readline usage to chardev with fd? But even with that,
> how would it be compatible with mux? It would have to somehow steal/restore
> the chardev fd. Alternatively, we could have a "fd pipe"/socketpair chardev
> frontend compatible with any chardev. Sounds contrived though, but it
> should work, and probably not so much code. (qemu_chr_new_fd_fe?)

Right; you'd still have to be careful about where the code ran that
stuffed it down that fd; for example I was thinking that maybe
I could connect editline to a pipe, and then just add a handler
that streamed that out to the chardev; but I worry that if, in the main
thread, I was to pass input to editline that caused it to output a huge
amount (say a big tab complete or a max-len filename) then it could deadlock
because the thing emptying the pipe wouldn't get run until the main loop
returned.

editline does have an 'el_push' so you could avoid having a real fd for it's
input (or at least an fd that ever does anything) and just call el_push
to stuff characters into it's input queue.

> >
> > If we could get away with just a FILE* then we could use fopencookie,
> > but that's GNU only.
> >
> > Is there any sane way of shepherding all chardev's into having an
> > fd?
> >
> 
> Ah that would be nice! But I think the point is to stay in userspace (and
> avoid extra copy, context switch, or extra fds). Otherwise, it feels like
> the whole chr interface could be a socketpair + a thin layer for events,
> that would simplify things indeed.

Well that would be nice :-)  I don't have much sympathy for saving on copies
and context switches at the bandwidth the monitor is going at.

> > Once you had those then you could also use them in a separate thread.
> >
> >
> You can already use chardev in seperate thread, but I don't know to which
> extent (see add_handlers_full for completely seperate thread, locking for
> write for multi-writer, I suppose s->chr_read is called from the
> dispatching context and is responsability for frontend callback to lock
> properly)

Oh that's fancy and new.   It would be fun to run a monitor in a different
thread with that; or use it to drain an output fd.
But would you trust multiple threads to drive the two different parts of a mux?

Dave

> -- 
> Marc-André Lureau
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-12 19:15 [Qemu-devel] chardev's and fd's in monitors Dr. David Alan Gilbert
  2016-10-12 20:02 ` Marc-André Lureau
@ 2016-10-18 10:04 ` Daniel P. Berrange
  2016-10-18 11:32   ` Dr. David Alan Gilbert
  2016-10-18 12:08   ` Markus Armbruster
  2016-10-18 10:06 ` Daniel P. Berrange
  2 siblings, 2 replies; 54+ messages in thread
From: Daniel P. Berrange @ 2016-10-18 10:04 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel, armbru

On Wed, Oct 12, 2016 at 08:15:02PM +0100, Dr. David Alan Gilbert wrote:
> Hi,
>   I had a look at a couple of readline like libraries;
> editline and linenoise.  A difficulty with using them is that
> they both want fd's or FILE*'s; editline takes either but
> from a brief look I think it's expecting to extract the fd.
> That makes them tricky to integrate into qemu, where
> the chardev's hide a whole bunch of non-fd things; in particular
> tls, mux, ringbuffers etc.
> 
> If we could get away with just a FILE* then we could use fopencookie,
> but that's GNU only.
> 
> Is there any sane way of shepherding all chardev's into having an
> fd?

The entire chardev abstraction model exists precisely because we cannot
make all chardevs look like a single fd. Even those which are fd based
may have separate FDs for input and output.

IMHO the only viable approach would be to enhance linenoise/editline to
not assume use of fd* or FILE * abstractions.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-12 19:15 [Qemu-devel] chardev's and fd's in monitors Dr. David Alan Gilbert
  2016-10-12 20:02 ` Marc-André Lureau
  2016-10-18 10:04 ` Daniel P. Berrange
@ 2016-10-18 10:06 ` Daniel P. Berrange
  2 siblings, 0 replies; 54+ messages in thread
From: Daniel P. Berrange @ 2016-10-18 10:06 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel, armbru

On Wed, Oct 12, 2016 at 08:15:02PM +0100, Dr. David Alan Gilbert wrote:
> Hi,
>   I had a look at a couple of readline like libraries;
> editline and linenoise.  A difficulty with using them is that
> they both want fd's or FILE*'s; editline takes either but
> from a brief look I think it's expecting to extract the fd.
> That makes them tricky to integrate into qemu, where
> the chardev's hide a whole bunch of non-fd things; in particular
> tls, mux, ringbuffers etc.
> 
> If we could get away with just a FILE* then we could use fopencookie,
> but that's GNU only.
> 
> Is there any sane way of shepherding all chardev's into having an
> fd?
> 
> Once you had those then you could also use them in a separate thread.

BTW, what is the actual thread issue you are facing ? Chardevs at least
ought to be usable from a separate thread, as long as each distinct
chardev object instance was only used from one thread at a time ?

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-18 10:04 ` Daniel P. Berrange
@ 2016-10-18 11:32   ` Dr. David Alan Gilbert
  2016-10-18 11:41     ` Marc-André Lureau
  2016-10-18 12:01     ` Daniel P. Berrange
  2016-10-18 12:08   ` Markus Armbruster
  1 sibling, 2 replies; 54+ messages in thread
From: Dr. David Alan Gilbert @ 2016-10-18 11:32 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: qemu-devel, armbru

* Daniel P. Berrange (berrange@redhat.com) wrote:
> On Wed, Oct 12, 2016 at 08:15:02PM +0100, Dr. David Alan Gilbert wrote:
> > Hi,
> >   I had a look at a couple of readline like libraries;
> > editline and linenoise.  A difficulty with using them is that
> > they both want fd's or FILE*'s; editline takes either but
> > from a brief look I think it's expecting to extract the fd.
> > That makes them tricky to integrate into qemu, where
> > the chardev's hide a whole bunch of non-fd things; in particular
> > tls, mux, ringbuffers etc.
> > 
> > If we could get away with just a FILE* then we could use fopencookie,
> > but that's GNU only.
> > 
> > Is there any sane way of shepherding all chardev's into having an
> > fd?
> 
> The entire chardev abstraction model exists precisely because we cannot
> make all chardevs look like a single fd. Even those which are fd based
> may have separate FDs for input and output.

Note that editline takes separate in/out streams, but it does want those streams
to be FILE*'s.

> IMHO the only viable approach would be to enhance linenoise/editline to
> not assume use of fd* or FILE * abstractions.

I think if it came to that then we'd probably end up sticking with what we
had for a very long time; I'd assume it would take a long time before
any mods we made to the libraries would come around to be generally useful.

> BTW, what is the actual thread issue you are facing ? Chardevs at least
> ought to be usable from a separate thread, as long as each distinct
> chardev object instance was only used from one thread at a time ?

Marc-André pointed that out; I hadn't realised they were thread safe.
But what are the rules? You say 'only used from one thread at a time' -
what happens if we have a mux and the different streams to the mux come
from different threads?

My actual thoughts for threads came from a few sides:
  a) Maybe I could have a shim thread that fed the editline fd from a chardev
  b) I'd eventually like multiple monitor threads.

Dave

> Regards,
> Daniel
> -- 
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-18 11:32   ` Dr. David Alan Gilbert
@ 2016-10-18 11:41     ` Marc-André Lureau
  2016-10-18 11:44       ` Marc-André Lureau
  2016-10-18 12:01     ` Daniel P. Berrange
  1 sibling, 1 reply; 54+ messages in thread
From: Marc-André Lureau @ 2016-10-18 11:41 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, Daniel P. Berrange; +Cc: qemu-devel, armbru

Hi

On Tue, Oct 18, 2016 at 2:32 PM Dr. David Alan Gilbert <dgilbert@redhat.com>
wrote:

> * Daniel P. Berrange (berrange@redhat.com) wrote:
> > On Wed, Oct 12, 2016 at 08:15:02PM +0100, Dr. David Alan Gilbert wrote:
> > > Hi,
> > >   I had a look at a couple of readline like libraries;
> > > editline and linenoise.  A difficulty with using them is that
> > > they both want fd's or FILE*'s; editline takes either but
> > > from a brief look I think it's expecting to extract the fd.
> > > That makes them tricky to integrate into qemu, where
> > > the chardev's hide a whole bunch of non-fd things; in particular
> > > tls, mux, ringbuffers etc.
> > >
> > > If we could get away with just a FILE* then we could use fopencookie,
> > > but that's GNU only.
> > >
> > > Is there any sane way of shepherding all chardev's into having an
> > > fd?
> >
> > The entire chardev abstraction model exists precisely because we cannot
> > make all chardevs look like a single fd. Even those which are fd based
> > may have separate FDs for input and output.
>
> Note that editline takes separate in/out streams, but it does want those
> streams
> to be FILE*'s.
>

glibc can have custom streams iirc:
https://www.gnu.org/software/libc/manual/html_node/Custom-Streams.html#Custom-Streams

I haven't looked in details if that would solve it.


> > IMHO the only viable approach would be to enhance linenoise/editline to
> > not assume use of fd* or FILE * abstractions.
>
> I think if it came to that then we'd probably end up sticking with what we
> had for a very long time; I'd assume it would take a long time before
> any mods we made to the libraries would come around to be generally useful.
>
> > BTW, what is the actual thread issue you are facing ? Chardevs at least
> > ought to be usable from a separate thread, as long as each distinct
> > chardev object instance was only used from one thread at a time ?
>
> Marc-André pointed that out; I hadn't realised they were thread safe.
> But what are the rules? You say 'only used from one thread at a time' -
> what happens if we have a mux and the different streams to the mux come
> from different threads?
>
> My actual thoughts for threads came from a few sides:
>   a) Maybe I could have a shim thread that fed the editline fd from a
> chardev
>   b) I'd eventually like multiple monitor threads.
>
> Dave
>
> > Regards,
> > Daniel
> > --
> > |: http://berrange.com      -o-
> http://www.flickr.com/photos/dberrange/ :|
> > |: http://libvirt.org              -o-
> http://virt-manager.org :|
> > |: http://entangle-photo.org       -o-
> http://search.cpan.org/~danberr/ :|
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> --
Marc-André Lureau

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-18 11:41     ` Marc-André Lureau
@ 2016-10-18 11:44       ` Marc-André Lureau
  0 siblings, 0 replies; 54+ messages in thread
From: Marc-André Lureau @ 2016-10-18 11:44 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, Daniel P. Berrange; +Cc: qemu-devel, armbru

Hi

On Tue, Oct 18, 2016 at 2:41 PM Marc-André Lureau <
marcandre.lureau@gmail.com> wrote:

> Hi
>
> On Tue, Oct 18, 2016 at 2:32 PM Dr. David Alan Gilbert <
> dgilbert@redhat.com> wrote:
>
> * Daniel P. Berrange (berrange@redhat.com) wrote:
> > On Wed, Oct 12, 2016 at 08:15:02PM +0100, Dr. David Alan Gilbert wrote:
> > > Hi,
> > >   I had a look at a couple of readline like libraries;
> > > editline and linenoise.  A difficulty with using them is that
> > > they both want fd's or FILE*'s; editline takes either but
> > > from a brief look I think it's expecting to extract the fd.
> > > That makes them tricky to integrate into qemu, where
> > > the chardev's hide a whole bunch of non-fd things; in particular
> > > tls, mux, ringbuffers etc.
> > >
> > > If we could get away with just a FILE* then we could use fopencookie,
> > > but that's GNU only.
> > >
> > > Is there any sane way of shepherding all chardev's into having an
> > > fd?
> >
> > The entire chardev abstraction model exists precisely because we cannot
> > make all chardevs look like a single fd. Even those which are fd based
> > may have separate FDs for input and output.
>
> Note that editline takes separate in/out streams, but it does want those
> streams
> to be FILE*'s.
>
>
> glibc can have custom streams iirc:
>
> https://www.gnu.org/software/libc/manual/html_node/Custom-Streams.html#Custom-Streams
>
> I haven't looked in details if that would solve it.
>

Sorry, just noticed you mentionned it already (didn't realize it was called
fopencookie)


> > IMHO the only viable approach would be to enhance linenoise/editline to
> > not assume use of fd* or FILE * abstractions.
>
> I think if it came to that then we'd probably end up sticking with what we
> had for a very long time; I'd assume it would take a long time before
> any mods we made to the libraries would come around to be generally useful.
>
> > BTW, what is the actual thread issue you are facing ? Chardevs at least
> > ought to be usable from a separate thread, as long as each distinct
> > chardev object instance was only used from one thread at a time ?
>
> Marc-André pointed that out; I hadn't realised they were thread safe.
> But what are the rules? You say 'only used from one thread at a time' -
> what happens if we have a mux and the different streams to the mux come
> from different threads?
>
> My actual thoughts for threads came from a few sides:
>   a) Maybe I could have a shim thread that fed the editline fd from a
> chardev
>   b) I'd eventually like multiple monitor threads.
>
> Dave
>
> > Regards,
> > Daniel
> > --
> > |: http://berrange.com      -o-
> http://www.flickr.com/photos/dberrange/ :|
> > |: http://libvirt.org              -o-
> http://virt-manager.org :|
> > |: http://entangle-photo.org       -o-
> http://search.cpan.org/~danberr/ :|
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> --
> Marc-André Lureau
>
-- 
Marc-André Lureau

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-18 11:32   ` Dr. David Alan Gilbert
  2016-10-18 11:41     ` Marc-André Lureau
@ 2016-10-18 12:01     ` Daniel P. Berrange
  2016-10-18 13:25       ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 54+ messages in thread
From: Daniel P. Berrange @ 2016-10-18 12:01 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel, armbru

On Tue, Oct 18, 2016 at 12:32:02PM +0100, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrange (berrange@redhat.com) wrote:
> > On Wed, Oct 12, 2016 at 08:15:02PM +0100, Dr. David Alan Gilbert wrote:
> > > Hi,
> > >   I had a look at a couple of readline like libraries;
> > > editline and linenoise.  A difficulty with using them is that
> > > they both want fd's or FILE*'s; editline takes either but
> > > from a brief look I think it's expecting to extract the fd.
> > > That makes them tricky to integrate into qemu, where
> > > the chardev's hide a whole bunch of non-fd things; in particular
> > > tls, mux, ringbuffers etc.
> > > 
> > > If we could get away with just a FILE* then we could use fopencookie,
> > > but that's GNU only.
> > > 
> > > Is there any sane way of shepherding all chardev's into having an
> > > fd?
> > 
> > The entire chardev abstraction model exists precisely because we cannot
> > make all chardevs look like a single fd. Even those which are fd based
> > may have separate FDs for input and output.
> 
> Note that editline takes separate in/out streams, but it does want those streams
> to be FILE*'s.
> 
> > IMHO the only viable approach would be to enhance linenoise/editline to
> > not assume use of fd* or FILE * abstractions.
> 
> I think if it came to that then we'd probably end up sticking with what we
> had for a very long time; I'd assume it would take a long time before
> any mods we made to the libraries would come around to be generally useful.
> 
> > BTW, what is the actual thread issue you are facing ? Chardevs at least
> > ought to be usable from a separate thread, as long as each distinct
> > chardev object instance was only used from one thread at a time ?
> 
> Marc-André pointed that out; I hadn't realised they were thread safe.
> But what are the rules? You say 'only used from one thread at a time' -
> what happens if we have a mux and the different streams to the mux come
> from different threads?

Well there is no mutex locking on the CharDriverState objects, so the
exact rule is "you mustn't do anything from multiple threads that will
race on contents of CharDriverState". That's too fuzzy to be useful to
developers though, so I think the only sensible option right now is to
say any "top level" CharDriverState should only be touch from one thread
at a time. IOW, if you have a mux, that that rule would apply to the
mux itself and the various children it owns as if they were a single
unnit.

> My actual thoughts for threads came from a few sides:
>   a) Maybe I could have a shim thread that fed the editline fd from a chardev
>   b) I'd eventually like multiple monitor threads.

Can you expand on what you mean by multiple monitor threads ? Presumably
you're meaning a single monitor instance, with multiple threads processing
commands concurrently ?  If so, I think that ought to be fine even with
the current thread rules around chardevs. The processing of individual
monitor commands doesn't interact with the CharDriverState AFAIR, as we
have clean separation between parsing the incoming command, running the
command, and formatting the outgoing response. IOW, for a single monitor
it is still sufficient to have a single thread deal with all I/O for the
chardev - only the command execution needs to be delegated to other
threads, and those wouldn't be touching the chardev at all.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-18 10:04 ` Daniel P. Berrange
  2016-10-18 11:32   ` Dr. David Alan Gilbert
@ 2016-10-18 12:08   ` Markus Armbruster
  2016-10-18 12:13     ` Daniel P. Berrange
  1 sibling, 1 reply; 54+ messages in thread
From: Markus Armbruster @ 2016-10-18 12:08 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: Dr. David Alan Gilbert, qemu-devel

"Daniel P. Berrange" <berrange@redhat.com> writes:

> On Wed, Oct 12, 2016 at 08:15:02PM +0100, Dr. David Alan Gilbert wrote:
>> Hi,
>>   I had a look at a couple of readline like libraries;
>> editline and linenoise.  A difficulty with using them is that
>> they both want fd's or FILE*'s; editline takes either but
>> from a brief look I think it's expecting to extract the fd.
>> That makes them tricky to integrate into qemu, where
>> the chardev's hide a whole bunch of non-fd things; in particular
>> tls, mux, ringbuffers etc.
>> 
>> If we could get away with just a FILE* then we could use fopencookie,
>> but that's GNU only.
>> 
>> Is there any sane way of shepherding all chardev's into having an
>> fd?
>
> The entire chardev abstraction model exists precisely because we cannot
> make all chardevs look like a single fd. Even those which are fd based
> may have separate FDs for input and output.
>
> IMHO the only viable approach would be to enhance linenoise/editline to
> not assume use of fd* or FILE * abstractions.

The real thing (GNU readline) has hooks rl_getc_function,
rl_input_available_hook, rl_redisplay_function and so forth, which might
do the trick.  Unfortunately, we're stuck with cheap copies due to our
foolish acceptance of GPLv2-only contributions.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-18 12:08   ` Markus Armbruster
@ 2016-10-18 12:13     ` Daniel P. Berrange
  2016-10-18 12:43       ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 54+ messages in thread
From: Daniel P. Berrange @ 2016-10-18 12:13 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Dr. David Alan Gilbert, qemu-devel

On Tue, Oct 18, 2016 at 02:08:14PM +0200, Markus Armbruster wrote:
> "Daniel P. Berrange" <berrange@redhat.com> writes:
> 
> > On Wed, Oct 12, 2016 at 08:15:02PM +0100, Dr. David Alan Gilbert wrote:
> >> Hi,
> >>   I had a look at a couple of readline like libraries;
> >> editline and linenoise.  A difficulty with using them is that
> >> they both want fd's or FILE*'s; editline takes either but
> >> from a brief look I think it's expecting to extract the fd.
> >> That makes them tricky to integrate into qemu, where
> >> the chardev's hide a whole bunch of non-fd things; in particular
> >> tls, mux, ringbuffers etc.
> >> 
> >> If we could get away with just a FILE* then we could use fopencookie,
> >> but that's GNU only.
> >> 
> >> Is there any sane way of shepherding all chardev's into having an
> >> fd?
> >
> > The entire chardev abstraction model exists precisely because we cannot
> > make all chardevs look like a single fd. Even those which are fd based
> > may have separate FDs for input and output.
> >
> > IMHO the only viable approach would be to enhance linenoise/editline to
> > not assume use of fd* or FILE * abstractions.
> 
> The real thing (GNU readline) has hooks rl_getc_function,
> rl_input_available_hook, rl_redisplay_function and so forth, which might
> do the trick.  Unfortunately, we're stuck with cheap copies due to our
> foolish acceptance of GPLv2-only contributions.

I'm wondering if improving read line facilities in HMP is really important
enough for us to bother dealing with the problems, as opposed to just
staying with our current impl ?

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-18 12:13     ` Daniel P. Berrange
@ 2016-10-18 12:43       ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 54+ messages in thread
From: Dr. David Alan Gilbert @ 2016-10-18 12:43 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: Markus Armbruster, qemu-devel

* Daniel P. Berrange (berrange@redhat.com) wrote:
> On Tue, Oct 18, 2016 at 02:08:14PM +0200, Markus Armbruster wrote:
> > "Daniel P. Berrange" <berrange@redhat.com> writes:
> > 
> > > On Wed, Oct 12, 2016 at 08:15:02PM +0100, Dr. David Alan Gilbert wrote:
> > >> Hi,
> > >>   I had a look at a couple of readline like libraries;
> > >> editline and linenoise.  A difficulty with using them is that
> > >> they both want fd's or FILE*'s; editline takes either but
> > >> from a brief look I think it's expecting to extract the fd.
> > >> That makes them tricky to integrate into qemu, where
> > >> the chardev's hide a whole bunch of non-fd things; in particular
> > >> tls, mux, ringbuffers etc.
> > >> 
> > >> If we could get away with just a FILE* then we could use fopencookie,
> > >> but that's GNU only.
> > >> 
> > >> Is there any sane way of shepherding all chardev's into having an
> > >> fd?
> > >
> > > The entire chardev abstraction model exists precisely because we cannot
> > > make all chardevs look like a single fd. Even those which are fd based
> > > may have separate FDs for input and output.
> > >
> > > IMHO the only viable approach would be to enhance linenoise/editline to
> > > not assume use of fd* or FILE * abstractions.
> > 
> > The real thing (GNU readline) has hooks rl_getc_function,
> > rl_input_available_hook, rl_redisplay_function and so forth, which might
> > do the trick.  Unfortunately, we're stuck with cheap copies due to our
> > foolish acceptance of GPLv2-only contributions.
> 
> I'm wondering if improving read line facilities in HMP is really important
> enough for us to bother dealing with the problems, as opposed to just
> staying with our current impl ?

It's worth us understanding it, but not worth us putting infrastructure in for.
If we can find a way to do it sanely on top of what we've got I think it's worth
it.

Dave

> Regards,
> Daniel
> -- 
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-18 12:01     ` Daniel P. Berrange
@ 2016-10-18 13:25       ` Dr. David Alan Gilbert
  2016-10-18 13:35         ` Daniel P. Berrange
  0 siblings, 1 reply; 54+ messages in thread
From: Dr. David Alan Gilbert @ 2016-10-18 13:25 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: qemu-devel, armbru

* Daniel P. Berrange (berrange@redhat.com) wrote:
> On Tue, Oct 18, 2016 at 12:32:02PM +0100, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrange (berrange@redhat.com) wrote:
> > > On Wed, Oct 12, 2016 at 08:15:02PM +0100, Dr. David Alan Gilbert wrote:
> > > > Hi,
> > > >   I had a look at a couple of readline like libraries;
> > > > editline and linenoise.  A difficulty with using them is that
> > > > they both want fd's or FILE*'s; editline takes either but
> > > > from a brief look I think it's expecting to extract the fd.
> > > > That makes them tricky to integrate into qemu, where
> > > > the chardev's hide a whole bunch of non-fd things; in particular
> > > > tls, mux, ringbuffers etc.
> > > > 
> > > > If we could get away with just a FILE* then we could use fopencookie,
> > > > but that's GNU only.
> > > > 
> > > > Is there any sane way of shepherding all chardev's into having an
> > > > fd?
> > > 
> > > The entire chardev abstraction model exists precisely because we cannot
> > > make all chardevs look like a single fd. Even those which are fd based
> > > may have separate FDs for input and output.
> > 
> > Note that editline takes separate in/out streams, but it does want those streams
> > to be FILE*'s.
> > 
> > > IMHO the only viable approach would be to enhance linenoise/editline to
> > > not assume use of fd* or FILE * abstractions.
> > 
> > I think if it came to that then we'd probably end up sticking with what we
> > had for a very long time; I'd assume it would take a long time before
> > any mods we made to the libraries would come around to be generally useful.
> > 
> > > BTW, what is the actual thread issue you are facing ? Chardevs at least
> > > ought to be usable from a separate thread, as long as each distinct
> > > chardev object instance was only used from one thread at a time ?
> > 
> > Marc-André pointed that out; I hadn't realised they were thread safe.
> > But what are the rules? You say 'only used from one thread at a time' -
> > what happens if we have a mux and the different streams to the mux come
> > from different threads?
> 
> Well there is no mutex locking on the CharDriverState objects, so the
> exact rule is "you mustn't do anything from multiple threads that will
> race on contents of CharDriverState". That's too fuzzy to be useful to
> developers though, so I think the only sensible option right now is to
> say any "top level" CharDriverState should only be touch from one thread
> at a time. IOW, if you have a mux, that that rule would apply to the
> mux itself and the various children it owns as if they were a single
> unnit.

OK; I think we're probably saved by the big lock at the moment, so that
all device emulation that outputs text is probably holding it and the monitor
is also.  What about something like an error_report from a different thread
while something is happening in the monitor?

> > My actual thoughts for threads came from a few sides:
> >   a) Maybe I could have a shim thread that fed the editline fd from a chardev
> >   b) I'd eventually like multiple monitor threads.
> 
> Can you expand on what you mean by multiple monitor threads ? Presumably
> you're meaning a single monitor instance, with multiple threads processing
> commands concurrently ?  If so, I think that ought to be fine even with
> the current thread rules around chardevs. The processing of individual
> monitor commands doesn't interact with the CharDriverState AFAIR, as we
> have clean separation between parsing the incoming command, running the
> command, and formatting the outgoing response. IOW, for a single monitor
> it is still sufficient to have a single thread deal with all I/O for the
> chardev - only the command execution needs to be delegated to other
> threads, and those wouldn't be touching the chardev at all.

Hmm, I'd thought of the other way around - multiple individual monitors each
running one command; ie each connection for a monitor would be it's own thread.

Dave

> 
> Regards,
> Daniel
> -- 
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-18 13:25       ` Dr. David Alan Gilbert
@ 2016-10-18 13:35         ` Daniel P. Berrange
  2016-10-18 13:52           ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 54+ messages in thread
From: Daniel P. Berrange @ 2016-10-18 13:35 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel, armbru

On Tue, Oct 18, 2016 at 02:25:25PM +0100, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrange (berrange@redhat.com) wrote:
> > On Tue, Oct 18, 2016 at 12:32:02PM +0100, Dr. David Alan Gilbert wrote:
> > > * Daniel P. Berrange (berrange@redhat.com) wrote:
> > > > On Wed, Oct 12, 2016 at 08:15:02PM +0100, Dr. David Alan Gilbert wrote:
> > > > > Hi,
> > > > >   I had a look at a couple of readline like libraries;
> > > > > editline and linenoise.  A difficulty with using them is that
> > > > > they both want fd's or FILE*'s; editline takes either but
> > > > > from a brief look I think it's expecting to extract the fd.
> > > > > That makes them tricky to integrate into qemu, where
> > > > > the chardev's hide a whole bunch of non-fd things; in particular
> > > > > tls, mux, ringbuffers etc.
> > > > > 
> > > > > If we could get away with just a FILE* then we could use fopencookie,
> > > > > but that's GNU only.
> > > > > 
> > > > > Is there any sane way of shepherding all chardev's into having an
> > > > > fd?
> > > > 
> > > > The entire chardev abstraction model exists precisely because we cannot
> > > > make all chardevs look like a single fd. Even those which are fd based
> > > > may have separate FDs for input and output.
> > > 
> > > Note that editline takes separate in/out streams, but it does want those streams
> > > to be FILE*'s.
> > > 
> > > > IMHO the only viable approach would be to enhance linenoise/editline to
> > > > not assume use of fd* or FILE * abstractions.
> > > 
> > > I think if it came to that then we'd probably end up sticking with what we
> > > had for a very long time; I'd assume it would take a long time before
> > > any mods we made to the libraries would come around to be generally useful.
> > > 
> > > > BTW, what is the actual thread issue you are facing ? Chardevs at least
> > > > ought to be usable from a separate thread, as long as each distinct
> > > > chardev object instance was only used from one thread at a time ?
> > > 
> > > Marc-André pointed that out; I hadn't realised they were thread safe.
> > > But what are the rules? You say 'only used from one thread at a time' -
> > > what happens if we have a mux and the different streams to the mux come
> > > from different threads?
> > 
> > Well there is no mutex locking on the CharDriverState objects, so the
> > exact rule is "you mustn't do anything from multiple threads that will
> > race on contents of CharDriverState". That's too fuzzy to be useful to
> > developers though, so I think the only sensible option right now is to
> > say any "top level" CharDriverState should only be touch from one thread
> > at a time. IOW, if you have a mux, that that rule would apply to the
> > mux itself and the various children it owns as if they were a single
> > unnit.
> 
> OK; I think we're probably saved by the big lock at the moment, so that
> all device emulation that outputs text is probably holding it and the monitor
> is also.  What about something like an error_report from a different thread
> while something is happening in the monitor?

If we moved execution of monitor commands to separate thread from the
thread handling monitor I/O, then we'd have to modify error_report so
that it queued the text in some manner, such that it was only then
fed back to the client once the command thread completed. Alternatively
we'd have to introduced locking in the Monitor object, that serialized
access to the underling CharDriverState I/O funcs.

> > > My actual thoughts for threads came from a few sides:
> > >   a) Maybe I could have a shim thread that fed the editline fd from a chardev
> > >   b) I'd eventually like multiple monitor threads.
> > 
> > Can you expand on what you mean by multiple monitor threads ? Presumably
> > you're meaning a single monitor instance, with multiple threads processing
> > commands concurrently ?  If so, I think that ought to be fine even with
> > the current thread rules around chardevs. The processing of individual
> > monitor commands doesn't interact with the CharDriverState AFAIR, as we
> > have clean separation between parsing the incoming command, running the
> > command, and formatting the outgoing response. IOW, for a single monitor
> > it is still sufficient to have a single thread deal with all I/O for the
> > chardev - only the command execution needs to be delegated to other
> > threads, and those wouldn't be touching the chardev at all.
> 
> Hmm, I'd thought of the other way around - multiple individual monitors each
> running one command; ie each connection for a monitor would be it's own
> thread.

So I guess there's two problems with the monitor handling right now wrt.

 - A long running command will block the event loop thread for too long
 - A long running command prevents a client issuing other commands while
   waiting for the previous command to complete.

Running a thread per monitor server solves the first problem. If we make
monitor command handling async though, then it solves both problems.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-18 13:35         ` Daniel P. Berrange
@ 2016-10-18 13:52           ` Dr. David Alan Gilbert
  2016-10-18 14:01             ` Daniel P. Berrange
  0 siblings, 1 reply; 54+ messages in thread
From: Dr. David Alan Gilbert @ 2016-10-18 13:52 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: qemu-devel, armbru

* Daniel P. Berrange (berrange@redhat.com) wrote:
> On Tue, Oct 18, 2016 at 02:25:25PM +0100, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrange (berrange@redhat.com) wrote:
> > > On Tue, Oct 18, 2016 at 12:32:02PM +0100, Dr. David Alan Gilbert wrote:
> > > > * Daniel P. Berrange (berrange@redhat.com) wrote:
> > > > > On Wed, Oct 12, 2016 at 08:15:02PM +0100, Dr. David Alan Gilbert wrote:
> > > > > > Hi,
> > > > > >   I had a look at a couple of readline like libraries;
> > > > > > editline and linenoise.  A difficulty with using them is that
> > > > > > they both want fd's or FILE*'s; editline takes either but
> > > > > > from a brief look I think it's expecting to extract the fd.
> > > > > > That makes them tricky to integrate into qemu, where
> > > > > > the chardev's hide a whole bunch of non-fd things; in particular
> > > > > > tls, mux, ringbuffers etc.
> > > > > > 
> > > > > > If we could get away with just a FILE* then we could use fopencookie,
> > > > > > but that's GNU only.
> > > > > > 
> > > > > > Is there any sane way of shepherding all chardev's into having an
> > > > > > fd?
> > > > > 
> > > > > The entire chardev abstraction model exists precisely because we cannot
> > > > > make all chardevs look like a single fd. Even those which are fd based
> > > > > may have separate FDs for input and output.
> > > > 
> > > > Note that editline takes separate in/out streams, but it does want those streams
> > > > to be FILE*'s.
> > > > 
> > > > > IMHO the only viable approach would be to enhance linenoise/editline to
> > > > > not assume use of fd* or FILE * abstractions.
> > > > 
> > > > I think if it came to that then we'd probably end up sticking with what we
> > > > had for a very long time; I'd assume it would take a long time before
> > > > any mods we made to the libraries would come around to be generally useful.
> > > > 
> > > > > BTW, what is the actual thread issue you are facing ? Chardevs at least
> > > > > ought to be usable from a separate thread, as long as each distinct
> > > > > chardev object instance was only used from one thread at a time ?
> > > > 
> > > > Marc-André pointed that out; I hadn't realised they were thread safe.
> > > > But what are the rules? You say 'only used from one thread at a time' -
> > > > what happens if we have a mux and the different streams to the mux come
> > > > from different threads?
> > > 
> > > Well there is no mutex locking on the CharDriverState objects, so the
> > > exact rule is "you mustn't do anything from multiple threads that will
> > > race on contents of CharDriverState". That's too fuzzy to be useful to
> > > developers though, so I think the only sensible option right now is to
> > > say any "top level" CharDriverState should only be touch from one thread
> > > at a time. IOW, if you have a mux, that that rule would apply to the
> > > mux itself and the various children it owns as if they were a single
> > > unnit.
> > 
> > OK; I think we're probably saved by the big lock at the moment, so that
> > all device emulation that outputs text is probably holding it and the monitor
> > is also.  What about something like an error_report from a different thread
> > while something is happening in the monitor?
> 
> If we moved execution of monitor commands to separate thread from the
> thread handling monitor I/O, then we'd have to modify error_report so
> that it queued the text in some manner, such that it was only then
> fed back to the client once the command thread completed. Alternatively
> we'd have to introduced locking in the Monitor object, that serialized
> access to the underling CharDriverState I/O funcs.

I already use error_report's in places in migration threads of various
types; I'm not sure if that's a problem.

> > > > My actual thoughts for threads came from a few sides:
> > > >   a) Maybe I could have a shim thread that fed the editline fd from a chardev
> > > >   b) I'd eventually like multiple monitor threads.
> > > 
> > > Can you expand on what you mean by multiple monitor threads ? Presumably
> > > you're meaning a single monitor instance, with multiple threads processing
> > > commands concurrently ?  If so, I think that ought to be fine even with
> > > the current thread rules around chardevs. The processing of individual
> > > monitor commands doesn't interact with the CharDriverState AFAIR, as we
> > > have clean separation between parsing the incoming command, running the
> > > command, and formatting the outgoing response. IOW, for a single monitor
> > > it is still sufficient to have a single thread deal with all I/O for the
> > > chardev - only the command execution needs to be delegated to other
> > > threads, and those wouldn't be touching the chardev at all.
> > 
> > Hmm, I'd thought of the other way around - multiple individual monitors each
> > running one command; ie each connection for a monitor would be it's own
> > thread.
> 
> So I guess there's two problems with the monitor handling right now wrt.
> 
>  - A long running command will block the event loop thread for too long
>  - A long running command prevents a client issuing other commands while
>    waiting for the previous command to complete.
> 
> Running a thread per monitor server solves the first problem. If we make
> monitor command handling async though, then it solves both problems.

There are some other error cases that cause the main thread to be blocked
and my main interest in having multiple monitor threads is being able to dig
ourselves out of those error cases.
The cases I've got are:
    a) COLO or migration, where the networking/storage dies during the final
     sync of a migration with the big lock held.
    b) Postcopy where the network dies and a device emulation tries to access
     memory where the memory isn't on the host yet; the device emulation
     has the lock and we can't issue a command to trigger a recovery.

  Neither of those are necessarily running an existing monitor command.

Dave

> Regards,
> Daniel
> -- 
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-18 13:52           ` Dr. David Alan Gilbert
@ 2016-10-18 14:01             ` Daniel P. Berrange
  2016-10-18 18:53               ` Dr. David Alan Gilbert
                                 ` (2 more replies)
  0 siblings, 3 replies; 54+ messages in thread
From: Daniel P. Berrange @ 2016-10-18 14:01 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel, armbru

On Tue, Oct 18, 2016 at 02:52:13PM +0100, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrange (berrange@redhat.com) wrote:
> > On Tue, Oct 18, 2016 at 02:25:25PM +0100, Dr. David Alan Gilbert wrote:
> > > * Daniel P. Berrange (berrange@redhat.com) wrote:
> > > > On Tue, Oct 18, 2016 at 12:32:02PM +0100, Dr. David Alan Gilbert wrote:
> > > > > * Daniel P. Berrange (berrange@redhat.com) wrote:
> > > > > > On Wed, Oct 12, 2016 at 08:15:02PM +0100, Dr. David Alan Gilbert wrote:
> > > > > > > Hi,
> > > > > > >   I had a look at a couple of readline like libraries;
> > > > > > > editline and linenoise.  A difficulty with using them is that
> > > > > > > they both want fd's or FILE*'s; editline takes either but
> > > > > > > from a brief look I think it's expecting to extract the fd.
> > > > > > > That makes them tricky to integrate into qemu, where
> > > > > > > the chardev's hide a whole bunch of non-fd things; in particular
> > > > > > > tls, mux, ringbuffers etc.
> > > > > > > 
> > > > > > > If we could get away with just a FILE* then we could use fopencookie,
> > > > > > > but that's GNU only.
> > > > > > > 
> > > > > > > Is there any sane way of shepherding all chardev's into having an
> > > > > > > fd?
> > > > > > 
> > > > > > The entire chardev abstraction model exists precisely because we cannot
> > > > > > make all chardevs look like a single fd. Even those which are fd based
> > > > > > may have separate FDs for input and output.
> > > > > 
> > > > > Note that editline takes separate in/out streams, but it does want those streams
> > > > > to be FILE*'s.
> > > > > 
> > > > > > IMHO the only viable approach would be to enhance linenoise/editline to
> > > > > > not assume use of fd* or FILE * abstractions.
> > > > > 
> > > > > I think if it came to that then we'd probably end up sticking with what we
> > > > > had for a very long time; I'd assume it would take a long time before
> > > > > any mods we made to the libraries would come around to be generally useful.
> > > > > 
> > > > > > BTW, what is the actual thread issue you are facing ? Chardevs at least
> > > > > > ought to be usable from a separate thread, as long as each distinct
> > > > > > chardev object instance was only used from one thread at a time ?
> > > > > 
> > > > > Marc-André pointed that out; I hadn't realised they were thread safe.
> > > > > But what are the rules? You say 'only used from one thread at a time' -
> > > > > what happens if we have a mux and the different streams to the mux come
> > > > > from different threads?
> > > > 
> > > > Well there is no mutex locking on the CharDriverState objects, so the
> > > > exact rule is "you mustn't do anything from multiple threads that will
> > > > race on contents of CharDriverState". That's too fuzzy to be useful to
> > > > developers though, so I think the only sensible option right now is to
> > > > say any "top level" CharDriverState should only be touch from one thread
> > > > at a time. IOW, if you have a mux, that that rule would apply to the
> > > > mux itself and the various children it owns as if they were a single
> > > > unnit.
> > > 
> > > OK; I think we're probably saved by the big lock at the moment, so that
> > > all device emulation that outputs text is probably holding it and the monitor
> > > is also.  What about something like an error_report from a different thread
> > > while something is happening in the monitor?
> > 
> > If we moved execution of monitor commands to separate thread from the
> > thread handling monitor I/O, then we'd have to modify error_report so
> > that it queued the text in some manner, such that it was only then
> > fed back to the client once the command thread completed. Alternatively
> > we'd have to introduced locking in the Monitor object, that serialized
> > access to the underling CharDriverState I/O funcs.
> 
> I already use error_report's in places in migration threads of various
> types; I'm not sure if that's a problem.

Unless those places are protected by the big qemu lock, that sounds
not good. error_report calls into error_vprintf which checks the
'cur_mon' global "Monitor" pointer. This variable is updated at
runtime - eg in qmp_human_monitor_command(), monitor_qmp_read(),
monitor_read(), etc. So if migration threads outside the BQL are
calling error_report() that could well cause problems. If you
are lucky messages will merely end up going to stderr instead of
the monitor, but in worst case I wouldn't be surprised if there
is a crash possibility in some race conditions.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-18 14:01             ` Daniel P. Berrange
@ 2016-10-18 18:53               ` Dr. David Alan Gilbert
  2016-10-19  7:45                 ` Daniel P. Berrange
  2016-10-19  8:00               ` Markus Armbruster
  2016-10-19 12:26               ` Paolo Bonzini
  2 siblings, 1 reply; 54+ messages in thread
From: Dr. David Alan Gilbert @ 2016-10-18 18:53 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: qemu-devel, armbru

* Daniel P. Berrange (berrange@redhat.com) wrote:
> On Tue, Oct 18, 2016 at 02:52:13PM +0100, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrange (berrange@redhat.com) wrote:
> > > On Tue, Oct 18, 2016 at 02:25:25PM +0100, Dr. David Alan Gilbert wrote:
> > > > * Daniel P. Berrange (berrange@redhat.com) wrote:
> > > > > On Tue, Oct 18, 2016 at 12:32:02PM +0100, Dr. David Alan Gilbert wrote:
> > > > > > * Daniel P. Berrange (berrange@redhat.com) wrote:
> > > > > > > On Wed, Oct 12, 2016 at 08:15:02PM +0100, Dr. David Alan Gilbert wrote:
> > > > > > > > Hi,
> > > > > > > >   I had a look at a couple of readline like libraries;
> > > > > > > > editline and linenoise.  A difficulty with using them is that
> > > > > > > > they both want fd's or FILE*'s; editline takes either but
> > > > > > > > from a brief look I think it's expecting to extract the fd.
> > > > > > > > That makes them tricky to integrate into qemu, where
> > > > > > > > the chardev's hide a whole bunch of non-fd things; in particular
> > > > > > > > tls, mux, ringbuffers etc.
> > > > > > > > 
> > > > > > > > If we could get away with just a FILE* then we could use fopencookie,
> > > > > > > > but that's GNU only.
> > > > > > > > 
> > > > > > > > Is there any sane way of shepherding all chardev's into having an
> > > > > > > > fd?
> > > > > > > 
> > > > > > > The entire chardev abstraction model exists precisely because we cannot
> > > > > > > make all chardevs look like a single fd. Even those which are fd based
> > > > > > > may have separate FDs for input and output.
> > > > > > 
> > > > > > Note that editline takes separate in/out streams, but it does want those streams
> > > > > > to be FILE*'s.
> > > > > > 
> > > > > > > IMHO the only viable approach would be to enhance linenoise/editline to
> > > > > > > not assume use of fd* or FILE * abstractions.
> > > > > > 
> > > > > > I think if it came to that then we'd probably end up sticking with what we
> > > > > > had for a very long time; I'd assume it would take a long time before
> > > > > > any mods we made to the libraries would come around to be generally useful.
> > > > > > 
> > > > > > > BTW, what is the actual thread issue you are facing ? Chardevs at least
> > > > > > > ought to be usable from a separate thread, as long as each distinct
> > > > > > > chardev object instance was only used from one thread at a time ?
> > > > > > 
> > > > > > Marc-André pointed that out; I hadn't realised they were thread safe.
> > > > > > But what are the rules? You say 'only used from one thread at a time' -
> > > > > > what happens if we have a mux and the different streams to the mux come
> > > > > > from different threads?
> > > > > 
> > > > > Well there is no mutex locking on the CharDriverState objects, so the
> > > > > exact rule is "you mustn't do anything from multiple threads that will
> > > > > race on contents of CharDriverState". That's too fuzzy to be useful to
> > > > > developers though, so I think the only sensible option right now is to
> > > > > say any "top level" CharDriverState should only be touch from one thread
> > > > > at a time. IOW, if you have a mux, that that rule would apply to the
> > > > > mux itself and the various children it owns as if they were a single
> > > > > unnit.
> > > > 
> > > > OK; I think we're probably saved by the big lock at the moment, so that
> > > > all device emulation that outputs text is probably holding it and the monitor
> > > > is also.  What about something like an error_report from a different thread
> > > > while something is happening in the monitor?
> > > 
> > > If we moved execution of monitor commands to separate thread from the
> > > thread handling monitor I/O, then we'd have to modify error_report so
> > > that it queued the text in some manner, such that it was only then
> > > fed back to the client once the command thread completed. Alternatively
> > > we'd have to introduced locking in the Monitor object, that serialized
> > > access to the underling CharDriverState I/O funcs.
> > 
> > I already use error_report's in places in migration threads of various
> > types; I'm not sure if that's a problem.
> 
> Unless those places are protected by the big qemu lock, that sounds
> not good. error_report calls into error_vprintf which checks the
> 'cur_mon' global "Monitor" pointer. This variable is updated at
> runtime - eg in qmp_human_monitor_command(), monitor_qmp_read(),
> monitor_read(), etc. So if migration threads outside the BQL are
> calling error_report() that could well cause problems. If you
> are lucky messages will merely end up going to stderr instead of
> the monitor, but in worst case I wouldn't be surprised if there
> is a crash possibility in some race conditions.

Hmm that's going to be interesting to fix;  I certainly use error_report
all over in postcopy, and the postcopy code uses device load code in its
threads that are shared by the normal load paths.   I doubt any of the
rest of the threaded code is clean from them either; does block code
used in the iothreads ever end up with an error_report?

Can't we take the bql in the inside of error_report?

Dave

> Regards,
> Daniel
> -- 
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-18 18:53               ` Dr. David Alan Gilbert
@ 2016-10-19  7:45                 ` Daniel P. Berrange
  0 siblings, 0 replies; 54+ messages in thread
From: Daniel P. Berrange @ 2016-10-19  7:45 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel, armbru

On Tue, Oct 18, 2016 at 07:53:51PM +0100, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrange (berrange@redhat.com) wrote:
> > On Tue, Oct 18, 2016 at 02:52:13PM +0100, Dr. David Alan Gilbert wrote:
> > > * Daniel P. Berrange (berrange@redhat.com) wrote:
> > > > On Tue, Oct 18, 2016 at 02:25:25PM +0100, Dr. David Alan Gilbert wrote:
> > > > > * Daniel P. Berrange (berrange@redhat.com) wrote:
> > > > > > On Tue, Oct 18, 2016 at 12:32:02PM +0100, Dr. David Alan Gilbert wrote:
> > > > > > > * Daniel P. Berrange (berrange@redhat.com) wrote:
> > > > > > > > On Wed, Oct 12, 2016 at 08:15:02PM +0100, Dr. David Alan Gilbert wrote:
> > > > > > > > > Hi,
> > > > > > > > >   I had a look at a couple of readline like libraries;
> > > > > > > > > editline and linenoise.  A difficulty with using them is that
> > > > > > > > > they both want fd's or FILE*'s; editline takes either but
> > > > > > > > > from a brief look I think it's expecting to extract the fd.
> > > > > > > > > That makes them tricky to integrate into qemu, where
> > > > > > > > > the chardev's hide a whole bunch of non-fd things; in particular
> > > > > > > > > tls, mux, ringbuffers etc.
> > > > > > > > > 
> > > > > > > > > If we could get away with just a FILE* then we could use fopencookie,
> > > > > > > > > but that's GNU only.
> > > > > > > > > 
> > > > > > > > > Is there any sane way of shepherding all chardev's into having an
> > > > > > > > > fd?
> > > > > > > > 
> > > > > > > > The entire chardev abstraction model exists precisely because we cannot
> > > > > > > > make all chardevs look like a single fd. Even those which are fd based
> > > > > > > > may have separate FDs for input and output.
> > > > > > > 
> > > > > > > Note that editline takes separate in/out streams, but it does want those streams
> > > > > > > to be FILE*'s.
> > > > > > > 
> > > > > > > > IMHO the only viable approach would be to enhance linenoise/editline to
> > > > > > > > not assume use of fd* or FILE * abstractions.
> > > > > > > 
> > > > > > > I think if it came to that then we'd probably end up sticking with what we
> > > > > > > had for a very long time; I'd assume it would take a long time before
> > > > > > > any mods we made to the libraries would come around to be generally useful.
> > > > > > > 
> > > > > > > > BTW, what is the actual thread issue you are facing ? Chardevs at least
> > > > > > > > ought to be usable from a separate thread, as long as each distinct
> > > > > > > > chardev object instance was only used from one thread at a time ?
> > > > > > > 
> > > > > > > Marc-André pointed that out; I hadn't realised they were thread safe.
> > > > > > > But what are the rules? You say 'only used from one thread at a time' -
> > > > > > > what happens if we have a mux and the different streams to the mux come
> > > > > > > from different threads?
> > > > > > 
> > > > > > Well there is no mutex locking on the CharDriverState objects, so the
> > > > > > exact rule is "you mustn't do anything from multiple threads that will
> > > > > > race on contents of CharDriverState". That's too fuzzy to be useful to
> > > > > > developers though, so I think the only sensible option right now is to
> > > > > > say any "top level" CharDriverState should only be touch from one thread
> > > > > > at a time. IOW, if you have a mux, that that rule would apply to the
> > > > > > mux itself and the various children it owns as if they were a single
> > > > > > unnit.
> > > > > 
> > > > > OK; I think we're probably saved by the big lock at the moment, so that
> > > > > all device emulation that outputs text is probably holding it and the monitor
> > > > > is also.  What about something like an error_report from a different thread
> > > > > while something is happening in the monitor?
> > > > 
> > > > If we moved execution of monitor commands to separate thread from the
> > > > thread handling monitor I/O, then we'd have to modify error_report so
> > > > that it queued the text in some manner, such that it was only then
> > > > fed back to the client once the command thread completed. Alternatively
> > > > we'd have to introduced locking in the Monitor object, that serialized
> > > > access to the underling CharDriverState I/O funcs.
> > > 
> > > I already use error_report's in places in migration threads of various
> > > types; I'm not sure if that's a problem.
> > 
> > Unless those places are protected by the big qemu lock, that sounds
> > not good. error_report calls into error_vprintf which checks the
> > 'cur_mon' global "Monitor" pointer. This variable is updated at
> > runtime - eg in qmp_human_monitor_command(), monitor_qmp_read(),
> > monitor_read(), etc. So if migration threads outside the BQL are
> > calling error_report() that could well cause problems. If you
> > are lucky messages will merely end up going to stderr instead of
> > the monitor, but in worst case I wouldn't be surprised if there
> > is a crash possibility in some race conditions.
> 
> Hmm that's going to be interesting to fix;  I certainly use error_report
> all over in postcopy, and the postcopy code uses device load code in its
> threads that are shared by the normal load paths.   I doubt any of the
> rest of the threaded code is clean from them either; does block code
> used in the iothreads ever end up with an error_report?

One approach might be to turn the 'cur_mon' variable in to a
thread-local instead of a regular global variable. That would mean
any error_report() from a non-eventloop thread ends up going to
stderr instead of the to the monitor, but at least that is then
unambigously threadsafe.

> Can't we take the bql in the inside of error_report?

We could probably do that, or we could associate a dedicated mutex
with the cur_mon variable to have more fine grained locking.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-18 14:01             ` Daniel P. Berrange
  2016-10-18 18:53               ` Dr. David Alan Gilbert
@ 2016-10-19  8:00               ` Markus Armbruster
  2016-10-19  8:12                 ` Dr. David Alan Gilbert
  2016-10-19 12:26               ` Paolo Bonzini
  2 siblings, 1 reply; 54+ messages in thread
From: Markus Armbruster @ 2016-10-19  8:00 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: Dr. David Alan Gilbert, qemu-devel

"Daniel P. Berrange" <berrange@redhat.com> writes:

> On Tue, Oct 18, 2016 at 02:52:13PM +0100, Dr. David Alan Gilbert wrote:
>> * Daniel P. Berrange (berrange@redhat.com) wrote:
>> > On Tue, Oct 18, 2016 at 02:25:25PM +0100, Dr. David Alan Gilbert wrote:
>> > > * Daniel P. Berrange (berrange@redhat.com) wrote:
>> > > > On Tue, Oct 18, 2016 at 12:32:02PM +0100, Dr. David Alan Gilbert wrote:
>> > > > > * Daniel P. Berrange (berrange@redhat.com) wrote:
>> > > > > > On Wed, Oct 12, 2016 at 08:15:02PM +0100, Dr. David Alan Gilbert wrote:
>> > > > > > > Hi,
>> > > > > > >   I had a look at a couple of readline like libraries;
>> > > > > > > editline and linenoise.  A difficulty with using them is that
>> > > > > > > they both want fd's or FILE*'s; editline takes either but
>> > > > > > > from a brief look I think it's expecting to extract the fd.
>> > > > > > > That makes them tricky to integrate into qemu, where
>> > > > > > > the chardev's hide a whole bunch of non-fd things; in particular
>> > > > > > > tls, mux, ringbuffers etc.
>> > > > > > > 
>> > > > > > > If we could get away with just a FILE* then we could use fopencookie,
>> > > > > > > but that's GNU only.
>> > > > > > > 
>> > > > > > > Is there any sane way of shepherding all chardev's into having an
>> > > > > > > fd?
>> > > > > > 
>> > > > > > The entire chardev abstraction model exists precisely because we cannot
>> > > > > > make all chardevs look like a single fd. Even those which are fd based
>> > > > > > may have separate FDs for input and output.
>> > > > > 
>> > > > > Note that editline takes separate in/out streams, but it does want those streams
>> > > > > to be FILE*'s.
>> > > > > 
>> > > > > > IMHO the only viable approach would be to enhance linenoise/editline to
>> > > > > > not assume use of fd* or FILE * abstractions.
>> > > > > 
>> > > > > I think if it came to that then we'd probably end up sticking with what we
>> > > > > had for a very long time; I'd assume it would take a long time before
>> > > > > any mods we made to the libraries would come around to be generally useful.
>> > > > > 
>> > > > > > BTW, what is the actual thread issue you are facing ? Chardevs at least
>> > > > > > ought to be usable from a separate thread, as long as each distinct
>> > > > > > chardev object instance was only used from one thread at a time ?
>> > > > > 
>> > > > > Marc-André pointed that out; I hadn't realised they were thread safe.
>> > > > > But what are the rules? You say 'only used from one thread at a time' -
>> > > > > what happens if we have a mux and the different streams to the mux come
>> > > > > from different threads?
>> > > > 
>> > > > Well there is no mutex locking on the CharDriverState objects, so the
>> > > > exact rule is "you mustn't do anything from multiple threads that will
>> > > > race on contents of CharDriverState". That's too fuzzy to be useful to
>> > > > developers though, so I think the only sensible option right now is to
>> > > > say any "top level" CharDriverState should only be touch from one thread
>> > > > at a time. IOW, if you have a mux, that that rule would apply to the
>> > > > mux itself and the various children it owns as if they were a single
>> > > > unnit.
>> > > 
>> > > OK; I think we're probably saved by the big lock at the moment, so that
>> > > all device emulation that outputs text is probably holding it and the monitor
>> > > is also.  What about something like an error_report from a different thread
>> > > while something is happening in the monitor?
>> > 
>> > If we moved execution of monitor commands to separate thread from the
>> > thread handling monitor I/O, then we'd have to modify error_report so
>> > that it queued the text in some manner, such that it was only then
>> > fed back to the client once the command thread completed. Alternatively
>> > we'd have to introduced locking in the Monitor object, that serialized
>> > access to the underling CharDriverState I/O funcs.
>> 
>> I already use error_report's in places in migration threads of various
>> types; I'm not sure if that's a problem.
>
> Unless those places are protected by the big qemu lock, that sounds
> not good. error_report calls into error_vprintf which checks the
> 'cur_mon' global "Monitor" pointer. This variable is updated at
> runtime - eg in qmp_human_monitor_command(), monitor_qmp_read(),
> monitor_read(), etc. So if migration threads outside the BQL are
> calling error_report() that could well cause problems. If you
> are lucky messages will merely end up going to stderr instead of
> the monitor, but in worst case I wouldn't be surprised if there
> is a crash possibility in some race conditions.

cur_mon dates back to single-threaded times.

The idea is to print to the monitor when running within an HMP command,
else to stderr.

The current solution is to set cur_mon around monitor commands.  Fine
with a single thread, not fine at all with multiple threads.

Making cur_mon thread-local should fix things.

If you do want to report errors from another thread in a monitor, you
should use error_setg() & friends to get them into the monitor, in my
opinion.  Asynchronously barfing output to a monitor doesn't strike me
as a sensible design.  Not least because it doesn't work at all with
QMP!  If an error message is important enough for the human monitor's
user to make use route it to the human monitor, why is hiding it from
the QMP client okay?

If I'm wrong and it is sensible, we need locking.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-19  8:00               ` Markus Armbruster
@ 2016-10-19  8:12                 ` Dr. David Alan Gilbert
  2016-10-19  8:42                   ` Daniel P. Berrange
  0 siblings, 1 reply; 54+ messages in thread
From: Dr. David Alan Gilbert @ 2016-10-19  8:12 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Daniel P. Berrange, qemu-devel

* Markus Armbruster (armbru@redhat.com) wrote:
> "Daniel P. Berrange" <berrange@redhat.com> writes:
> 
> > On Tue, Oct 18, 2016 at 02:52:13PM +0100, Dr. David Alan Gilbert wrote:
> >> * Daniel P. Berrange (berrange@redhat.com) wrote:
> >> > On Tue, Oct 18, 2016 at 02:25:25PM +0100, Dr. David Alan Gilbert wrote:
> >> > > * Daniel P. Berrange (berrange@redhat.com) wrote:
> >> > > > On Tue, Oct 18, 2016 at 12:32:02PM +0100, Dr. David Alan Gilbert wrote:
> >> > > > > * Daniel P. Berrange (berrange@redhat.com) wrote:
> >> > > > > > On Wed, Oct 12, 2016 at 08:15:02PM +0100, Dr. David Alan Gilbert wrote:
> >> > > > > > > Hi,
> >> > > > > > >   I had a look at a couple of readline like libraries;
> >> > > > > > > editline and linenoise.  A difficulty with using them is that
> >> > > > > > > they both want fd's or FILE*'s; editline takes either but
> >> > > > > > > from a brief look I think it's expecting to extract the fd.
> >> > > > > > > That makes them tricky to integrate into qemu, where
> >> > > > > > > the chardev's hide a whole bunch of non-fd things; in particular
> >> > > > > > > tls, mux, ringbuffers etc.
> >> > > > > > > 
> >> > > > > > > If we could get away with just a FILE* then we could use fopencookie,
> >> > > > > > > but that's GNU only.
> >> > > > > > > 
> >> > > > > > > Is there any sane way of shepherding all chardev's into having an
> >> > > > > > > fd?
> >> > > > > > 
> >> > > > > > The entire chardev abstraction model exists precisely because we cannot
> >> > > > > > make all chardevs look like a single fd. Even those which are fd based
> >> > > > > > may have separate FDs for input and output.
> >> > > > > 
> >> > > > > Note that editline takes separate in/out streams, but it does want those streams
> >> > > > > to be FILE*'s.
> >> > > > > 
> >> > > > > > IMHO the only viable approach would be to enhance linenoise/editline to
> >> > > > > > not assume use of fd* or FILE * abstractions.
> >> > > > > 
> >> > > > > I think if it came to that then we'd probably end up sticking with what we
> >> > > > > had for a very long time; I'd assume it would take a long time before
> >> > > > > any mods we made to the libraries would come around to be generally useful.
> >> > > > > 
> >> > > > > > BTW, what is the actual thread issue you are facing ? Chardevs at least
> >> > > > > > ought to be usable from a separate thread, as long as each distinct
> >> > > > > > chardev object instance was only used from one thread at a time ?
> >> > > > > 
> >> > > > > Marc-André pointed that out; I hadn't realised they were thread safe.
> >> > > > > But what are the rules? You say 'only used from one thread at a time' -
> >> > > > > what happens if we have a mux and the different streams to the mux come
> >> > > > > from different threads?
> >> > > > 
> >> > > > Well there is no mutex locking on the CharDriverState objects, so the
> >> > > > exact rule is "you mustn't do anything from multiple threads that will
> >> > > > race on contents of CharDriverState". That's too fuzzy to be useful to
> >> > > > developers though, so I think the only sensible option right now is to
> >> > > > say any "top level" CharDriverState should only be touch from one thread
> >> > > > at a time. IOW, if you have a mux, that that rule would apply to the
> >> > > > mux itself and the various children it owns as if they were a single
> >> > > > unnit.
> >> > > 
> >> > > OK; I think we're probably saved by the big lock at the moment, so that
> >> > > all device emulation that outputs text is probably holding it and the monitor
> >> > > is also.  What about something like an error_report from a different thread
> >> > > while something is happening in the monitor?
> >> > 
> >> > If we moved execution of monitor commands to separate thread from the
> >> > thread handling monitor I/O, then we'd have to modify error_report so
> >> > that it queued the text in some manner, such that it was only then
> >> > fed back to the client once the command thread completed. Alternatively
> >> > we'd have to introduced locking in the Monitor object, that serialized
> >> > access to the underling CharDriverState I/O funcs.
> >> 
> >> I already use error_report's in places in migration threads of various
> >> types; I'm not sure if that's a problem.
> >
> > Unless those places are protected by the big qemu lock, that sounds
> > not good. error_report calls into error_vprintf which checks the
> > 'cur_mon' global "Monitor" pointer. This variable is updated at
> > runtime - eg in qmp_human_monitor_command(), monitor_qmp_read(),
> > monitor_read(), etc. So if migration threads outside the BQL are
> > calling error_report() that could well cause problems. If you
> > are lucky messages will merely end up going to stderr instead of
> > the monitor, but in worst case I wouldn't be surprised if there
> > is a crash possibility in some race conditions.
> 
> cur_mon dates back to single-threaded times.
> 
> The idea is to print to the monitor when running within an HMP command,
> else to stderr.
> 
> The current solution is to set cur_mon around monitor commands.  Fine
> with a single thread, not fine at all with multiple threads.
> 
> Making cur_mon thread-local should fix things.
> 
> If you do want to report errors from another thread in a monitor, you
> should use error_setg() & friends to get them into the monitor, in my
> opinion.  Asynchronously barfing output to a monitor doesn't strike me
> as a sensible design.  Not least because it doesn't work at all with
> QMP!  If an error message is important enough for the human monitor's
> user to make use route it to the human monitor, why is hiding it from
> the QMP client okay?
> 
> If I'm wrong and it is sensible, we need locking.

The difficulty is that we've long tried to be consistent and use error_report
rather than fprintf's; now that is turning out to be wrong if we're in
other threads.  It's even trickier for the cases of routines that might
be called in either the main thread or another thread - we have no
right answer as to how they should print na error.

Looks like we need the lock.

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-19  8:12                 ` Dr. David Alan Gilbert
@ 2016-10-19  8:42                   ` Daniel P. Berrange
  2016-10-19  9:48                     ` Markus Armbruster
  0 siblings, 1 reply; 54+ messages in thread
From: Daniel P. Berrange @ 2016-10-19  8:42 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: Markus Armbruster, qemu-devel

On Wed, Oct 19, 2016 at 09:12:11AM +0100, Dr. David Alan Gilbert wrote:
> * Markus Armbruster (armbru@redhat.com) wrote:
> > "Daniel P. Berrange" <berrange@redhat.com> writes:
> > 
> > > On Tue, Oct 18, 2016 at 02:52:13PM +0100, Dr. David Alan Gilbert wrote:
> > >> * Daniel P. Berrange (berrange@redhat.com) wrote:
> > >> > On Tue, Oct 18, 2016 at 02:25:25PM +0100, Dr. David Alan Gilbert wrote:
> > >> > > * Daniel P. Berrange (berrange@redhat.com) wrote:
> > >> > > > On Tue, Oct 18, 2016 at 12:32:02PM +0100, Dr. David Alan Gilbert wrote:
> > >> > > > > * Daniel P. Berrange (berrange@redhat.com) wrote:
> > >> > > > > > On Wed, Oct 12, 2016 at 08:15:02PM +0100, Dr. David Alan Gilbert wrote:
> > >> > > > > > > Hi,
> > >> > > > > > >   I had a look at a couple of readline like libraries;
> > >> > > > > > > editline and linenoise.  A difficulty with using them is that
> > >> > > > > > > they both want fd's or FILE*'s; editline takes either but
> > >> > > > > > > from a brief look I think it's expecting to extract the fd.
> > >> > > > > > > That makes them tricky to integrate into qemu, where
> > >> > > > > > > the chardev's hide a whole bunch of non-fd things; in particular
> > >> > > > > > > tls, mux, ringbuffers etc.
> > >> > > > > > > 
> > >> > > > > > > If we could get away with just a FILE* then we could use fopencookie,
> > >> > > > > > > but that's GNU only.
> > >> > > > > > > 
> > >> > > > > > > Is there any sane way of shepherding all chardev's into having an
> > >> > > > > > > fd?
> > >> > > > > > 
> > >> > > > > > The entire chardev abstraction model exists precisely because we cannot
> > >> > > > > > make all chardevs look like a single fd. Even those which are fd based
> > >> > > > > > may have separate FDs for input and output.
> > >> > > > > 
> > >> > > > > Note that editline takes separate in/out streams, but it does want those streams
> > >> > > > > to be FILE*'s.
> > >> > > > > 
> > >> > > > > > IMHO the only viable approach would be to enhance linenoise/editline to
> > >> > > > > > not assume use of fd* or FILE * abstractions.
> > >> > > > > 
> > >> > > > > I think if it came to that then we'd probably end up sticking with what we
> > >> > > > > had for a very long time; I'd assume it would take a long time before
> > >> > > > > any mods we made to the libraries would come around to be generally useful.
> > >> > > > > 
> > >> > > > > > BTW, what is the actual thread issue you are facing ? Chardevs at least
> > >> > > > > > ought to be usable from a separate thread, as long as each distinct
> > >> > > > > > chardev object instance was only used from one thread at a time ?
> > >> > > > > 
> > >> > > > > Marc-André pointed that out; I hadn't realised they were thread safe.
> > >> > > > > But what are the rules? You say 'only used from one thread at a time' -
> > >> > > > > what happens if we have a mux and the different streams to the mux come
> > >> > > > > from different threads?
> > >> > > > 
> > >> > > > Well there is no mutex locking on the CharDriverState objects, so the
> > >> > > > exact rule is "you mustn't do anything from multiple threads that will
> > >> > > > race on contents of CharDriverState". That's too fuzzy to be useful to
> > >> > > > developers though, so I think the only sensible option right now is to
> > >> > > > say any "top level" CharDriverState should only be touch from one thread
> > >> > > > at a time. IOW, if you have a mux, that that rule would apply to the
> > >> > > > mux itself and the various children it owns as if they were a single
> > >> > > > unnit.
> > >> > > 
> > >> > > OK; I think we're probably saved by the big lock at the moment, so that
> > >> > > all device emulation that outputs text is probably holding it and the monitor
> > >> > > is also.  What about something like an error_report from a different thread
> > >> > > while something is happening in the monitor?
> > >> > 
> > >> > If we moved execution of monitor commands to separate thread from the
> > >> > thread handling monitor I/O, then we'd have to modify error_report so
> > >> > that it queued the text in some manner, such that it was only then
> > >> > fed back to the client once the command thread completed. Alternatively
> > >> > we'd have to introduced locking in the Monitor object, that serialized
> > >> > access to the underling CharDriverState I/O funcs.
> > >> 
> > >> I already use error_report's in places in migration threads of various
> > >> types; I'm not sure if that's a problem.
> > >
> > > Unless those places are protected by the big qemu lock, that sounds
> > > not good. error_report calls into error_vprintf which checks the
> > > 'cur_mon' global "Monitor" pointer. This variable is updated at
> > > runtime - eg in qmp_human_monitor_command(), monitor_qmp_read(),
> > > monitor_read(), etc. So if migration threads outside the BQL are
> > > calling error_report() that could well cause problems. If you
> > > are lucky messages will merely end up going to stderr instead of
> > > the monitor, but in worst case I wouldn't be surprised if there
> > > is a crash possibility in some race conditions.
> > 
> > cur_mon dates back to single-threaded times.
> > 
> > The idea is to print to the monitor when running within an HMP command,
> > else to stderr.
> > 
> > The current solution is to set cur_mon around monitor commands.  Fine
> > with a single thread, not fine at all with multiple threads.
> > 
> > Making cur_mon thread-local should fix things.
> > 
> > If you do want to report errors from another thread in a monitor, you
> > should use error_setg() & friends to get them into the monitor, in my
> > opinion.  Asynchronously barfing output to a monitor doesn't strike me
> > as a sensible design.  Not least because it doesn't work at all with
> > QMP!  If an error message is important enough for the human monitor's
> > user to make use route it to the human monitor, why is hiding it from
> > the QMP client okay?
> > 
> > If I'm wrong and it is sensible, we need locking.
> 
> The difficulty is that we've long tried to be consistent and use error_report
> rather than fprintf's; now that is turning out to be wrong if we're in
> other threads.  It's even trickier for the cases of routines that might
> be called in either the main thread or another thread - we have no
> right answer as to how they should print na error.

Pretty much all code should be using error_setg together with an Error **errp
parameter to the method. The only places that should use error_report are at
top of the call chain where they unambigously know that the printed result
is going to the right place.

Historically the migration hasn't use error_setg because where was no
mechanism to feed the Error **errp object back to the caller, due to the
"migrate" QMP command always being in the background. Similarly in the
incoming migration side, we've always wanted errors to go to stderr.

I fixed the outgoing side in

  commit d59ce6f34434bf47a9b26138c908650bf9a24be1
  Author: Daniel P. Berrange <berrange@redhat.com>
  Date:   Wed Apr 27 11:05:00 2016 +0100

    migration: add reporting of errors for outgoing migration

So we could useful convert the migration code to use error_setg everywhere
and then use error_report_err() on the incoming side to spit that Error
onto stderr.

The block layer has almost no use of Error **errp parameters either and
this has long been an issue preventing useful error reporting there.


Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-19  8:42                   ` Daniel P. Berrange
@ 2016-10-19  9:48                     ` Markus Armbruster
  2016-10-19 10:05                       ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 54+ messages in thread
From: Markus Armbruster @ 2016-10-19  9:48 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: Dr. David Alan Gilbert, qemu-devel

"Daniel P. Berrange" <berrange@redhat.com> writes:

> On Wed, Oct 19, 2016 at 09:12:11AM +0100, Dr. David Alan Gilbert wrote:
>> * Markus Armbruster (armbru@redhat.com) wrote:
>> > "Daniel P. Berrange" <berrange@redhat.com> writes:
>> > 
>> > > On Tue, Oct 18, 2016 at 02:52:13PM +0100, Dr. David Alan Gilbert wrote:
[...]
>> > >> I already use error_report's in places in migration threads of various
>> > >> types; I'm not sure if that's a problem.
>> > >
>> > > Unless those places are protected by the big qemu lock, that sounds
>> > > not good. error_report calls into error_vprintf which checks the
>> > > 'cur_mon' global "Monitor" pointer. This variable is updated at
>> > > runtime - eg in qmp_human_monitor_command(), monitor_qmp_read(),
>> > > monitor_read(), etc. So if migration threads outside the BQL are
>> > > calling error_report() that could well cause problems. If you
>> > > are lucky messages will merely end up going to stderr instead of
>> > > the monitor, but in worst case I wouldn't be surprised if there
>> > > is a crash possibility in some race conditions.
>> > 
>> > cur_mon dates back to single-threaded times.
>> > 
>> > The idea is to print to the monitor when running within an HMP command,
>> > else to stderr.
>> > 
>> > The current solution is to set cur_mon around monitor commands.  Fine
>> > with a single thread, not fine at all with multiple threads.
>> > 
>> > Making cur_mon thread-local should fix things.
>> > 
>> > If you do want to report errors from another thread in a monitor, you
>> > should use error_setg() & friends to get them into the monitor, in my
>> > opinion.  Asynchronously barfing output to a monitor doesn't strike me
>> > as a sensible design.  Not least because it doesn't work at all with
>> > QMP!  If an error message is important enough for the human monitor's
>> > user to make use route it to the human monitor, why is hiding it from
>> > the QMP client okay?
>> > 
>> > If I'm wrong and it is sensible, we need locking.
>> 
>> The difficulty is that we've long tried to be consistent and use error_report
>> rather than fprintf's; now that is turning out to be wrong if we're in
>> other threads.

No, the two are equally wrong as wrong as far as threading is concerned:
unless that other thread is executing an HMP command, error_report()
calls vfprintf().

>>                 It's even trickier for the cases of routines that might
>> be called in either the main thread or another thread - we have no
>> right answer as to how they should print na error.
>
> Pretty much all code should be using error_setg together with an Error **errp
> parameter to the method. The only places that should use error_report are at
> top of the call chain where they unambigously know that the printed result
> is going to the right place.

When you know your code's running on behalf of startup code, a monitor
command or similar, go ahead and error_report().

When you don't know your context, error reporting should be left to
something up the stack that does.  This means returning a suitable error
value in simple cases (null, -1, -errno, whatever makes sense, just
document it), and error_setg() when error values don't provide enough
information to the caller to report the error in a useful way.

[...]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-19  9:48                     ` Markus Armbruster
@ 2016-10-19 10:05                       ` Dr. David Alan Gilbert
  2016-10-19 10:16                         ` Daniel P. Berrange
  0 siblings, 1 reply; 54+ messages in thread
From: Dr. David Alan Gilbert @ 2016-10-19 10:05 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Daniel P. Berrange, qemu-devel

* Markus Armbruster (armbru@redhat.com) wrote:
> "Daniel P. Berrange" <berrange@redhat.com> writes:
> 
> > On Wed, Oct 19, 2016 at 09:12:11AM +0100, Dr. David Alan Gilbert wrote:
> >> * Markus Armbruster (armbru@redhat.com) wrote:
> >> > "Daniel P. Berrange" <berrange@redhat.com> writes:
> >> > 
> >> > > On Tue, Oct 18, 2016 at 02:52:13PM +0100, Dr. David Alan Gilbert wrote:
> [...]
> >> > >> I already use error_report's in places in migration threads of various
> >> > >> types; I'm not sure if that's a problem.
> >> > >
> >> > > Unless those places are protected by the big qemu lock, that sounds
> >> > > not good. error_report calls into error_vprintf which checks the
> >> > > 'cur_mon' global "Monitor" pointer. This variable is updated at
> >> > > runtime - eg in qmp_human_monitor_command(), monitor_qmp_read(),
> >> > > monitor_read(), etc. So if migration threads outside the BQL are
> >> > > calling error_report() that could well cause problems. If you
> >> > > are lucky messages will merely end up going to stderr instead of
> >> > > the monitor, but in worst case I wouldn't be surprised if there
> >> > > is a crash possibility in some race conditions.
> >> > 
> >> > cur_mon dates back to single-threaded times.
> >> > 
> >> > The idea is to print to the monitor when running within an HMP command,
> >> > else to stderr.
> >> > 
> >> > The current solution is to set cur_mon around monitor commands.  Fine
> >> > with a single thread, not fine at all with multiple threads.
> >> > 
> >> > Making cur_mon thread-local should fix things.
> >> > 
> >> > If you do want to report errors from another thread in a monitor, you
> >> > should use error_setg() & friends to get them into the monitor, in my
> >> > opinion.  Asynchronously barfing output to a monitor doesn't strike me
> >> > as a sensible design.  Not least because it doesn't work at all with
> >> > QMP!  If an error message is important enough for the human monitor's
> >> > user to make use route it to the human monitor, why is hiding it from
> >> > the QMP client okay?
> >> > 
> >> > If I'm wrong and it is sensible, we need locking.
> >> 
> >> The difficulty is that we've long tried to be consistent and use error_report
> >> rather than fprintf's; now that is turning out to be wrong if we're in
> >> other threads.
> 
> No, the two are equally wrong as wrong as far as threading is concerned:
> unless that other thread is executing an HMP command, error_report()
> calls vfprintf().
> 
> >>                 It's even trickier for the cases of routines that might
> >> be called in either the main thread or another thread - we have no
> >> right answer as to how they should print na error.
> >
> > Pretty much all code should be using error_setg together with an Error **errp
> > parameter to the method. The only places that should use error_report are at
> > top of the call chain where they unambigously know that the printed result
> > is going to the right place.
> 
> When you know your code's running on behalf of startup code, a monitor
> command or similar, go ahead and error_report().
> 
> When you don't know your context, error reporting should be left to
> something up the stack that does.  This means returning a suitable error
> value in simple cases (null, -1, -errno, whatever makes sense, just
> document it), and error_setg() when error values don't provide enough
> information to the caller to report the error in a useful way.

We need a way to be able to report an error without plumbing error_setg
up the stack; if you're saying error_report isn't suitable then we
should just recommend we switch everything in migration back to
fprintf(stderr,


Dave

> [...]
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-19 10:05                       ` Dr. David Alan Gilbert
@ 2016-10-19 10:16                         ` Daniel P. Berrange
  2016-10-19 12:16                           ` Markus Armbruster
  0 siblings, 1 reply; 54+ messages in thread
From: Daniel P. Berrange @ 2016-10-19 10:16 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: Markus Armbruster, qemu-devel

On Wed, Oct 19, 2016 at 11:05:53AM +0100, Dr. David Alan Gilbert wrote:
> * Markus Armbruster (armbru@redhat.com) wrote:
> > "Daniel P. Berrange" <berrange@redhat.com> writes:
> > 
> > > On Wed, Oct 19, 2016 at 09:12:11AM +0100, Dr. David Alan Gilbert wrote:
> > >> * Markus Armbruster (armbru@redhat.com) wrote:
> > >> > "Daniel P. Berrange" <berrange@redhat.com> writes:
> > >> > 
> > >> > > On Tue, Oct 18, 2016 at 02:52:13PM +0100, Dr. David Alan Gilbert wrote:
> > [...]
> > >> > >> I already use error_report's in places in migration threads of various
> > >> > >> types; I'm not sure if that's a problem.
> > >> > >
> > >> > > Unless those places are protected by the big qemu lock, that sounds
> > >> > > not good. error_report calls into error_vprintf which checks the
> > >> > > 'cur_mon' global "Monitor" pointer. This variable is updated at
> > >> > > runtime - eg in qmp_human_monitor_command(), monitor_qmp_read(),
> > >> > > monitor_read(), etc. So if migration threads outside the BQL are
> > >> > > calling error_report() that could well cause problems. If you
> > >> > > are lucky messages will merely end up going to stderr instead of
> > >> > > the monitor, but in worst case I wouldn't be surprised if there
> > >> > > is a crash possibility in some race conditions.
> > >> > 
> > >> > cur_mon dates back to single-threaded times.
> > >> > 
> > >> > The idea is to print to the monitor when running within an HMP command,
> > >> > else to stderr.
> > >> > 
> > >> > The current solution is to set cur_mon around monitor commands.  Fine
> > >> > with a single thread, not fine at all with multiple threads.
> > >> > 
> > >> > Making cur_mon thread-local should fix things.
> > >> > 
> > >> > If you do want to report errors from another thread in a monitor, you
> > >> > should use error_setg() & friends to get them into the monitor, in my
> > >> > opinion.  Asynchronously barfing output to a monitor doesn't strike me
> > >> > as a sensible design.  Not least because it doesn't work at all with
> > >> > QMP!  If an error message is important enough for the human monitor's
> > >> > user to make use route it to the human monitor, why is hiding it from
> > >> > the QMP client okay?
> > >> > 
> > >> > If I'm wrong and it is sensible, we need locking.
> > >> 
> > >> The difficulty is that we've long tried to be consistent and use error_report
> > >> rather than fprintf's; now that is turning out to be wrong if we're in
> > >> other threads.
> > 
> > No, the two are equally wrong as wrong as far as threading is concerned:
> > unless that other thread is executing an HMP command, error_report()
> > calls vfprintf().
> > 
> > >>                 It's even trickier for the cases of routines that might
> > >> be called in either the main thread or another thread - we have no
> > >> right answer as to how they should print na error.
> > >
> > > Pretty much all code should be using error_setg together with an Error **errp
> > > parameter to the method. The only places that should use error_report are at
> > > top of the call chain where they unambigously know that the printed result
> > > is going to the right place.
> > 
> > When you know your code's running on behalf of startup code, a monitor
> > command or similar, go ahead and error_report().
> > 
> > When you don't know your context, error reporting should be left to
> > something up the stack that does.  This means returning a suitable error
> > value in simple cases (null, -1, -errno, whatever makes sense, just
> > document it), and error_setg() when error values don't provide enough
> > information to the caller to report the error in a useful way.
> 
> We need a way to be able to report an error without plumbing error_setg
> up the stack; if you're saying error_report isn't suitable then we
> should just recommend we switch everything in migration back to
> fprintf(stderr,

Well both error_report() + fprintf  are broken from POV of anything
using QMP. error_report() is slightly less broken for HMP, but doesn't
help QMP.

In the short term we should just make error_report be  threadsafe in
its usage of the monitor. Beyond the short term we have no choice but
to plumb in error_setg throughout, otherwise QMP will continue to
have useless error reporting in this area of code.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-19 10:16                         ` Daniel P. Berrange
@ 2016-10-19 12:16                           ` Markus Armbruster
  2016-10-19 12:21                             ` Daniel P. Berrange
  0 siblings, 1 reply; 54+ messages in thread
From: Markus Armbruster @ 2016-10-19 12:16 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: Dr. David Alan Gilbert, qemu-devel

"Daniel P. Berrange" <berrange@redhat.com> writes:

> On Wed, Oct 19, 2016 at 11:05:53AM +0100, Dr. David Alan Gilbert wrote:
>> * Markus Armbruster (armbru@redhat.com) wrote:
>> > "Daniel P. Berrange" <berrange@redhat.com> writes:
>> > 
>> > > On Wed, Oct 19, 2016 at 09:12:11AM +0100, Dr. David Alan Gilbert wrote:
>> > >> * Markus Armbruster (armbru@redhat.com) wrote:
>> > >> > "Daniel P. Berrange" <berrange@redhat.com> writes:
>> > >> > 
>> > >> > > On Tue, Oct 18, 2016 at 02:52:13PM +0100, Dr. David Alan Gilbert wrote:
>> > [...]
>> > >> > >> I already use error_report's in places in migration threads of various
>> > >> > >> types; I'm not sure if that's a problem.
>> > >> > >
>> > >> > > Unless those places are protected by the big qemu lock, that sounds
>> > >> > > not good. error_report calls into error_vprintf which checks the
>> > >> > > 'cur_mon' global "Monitor" pointer. This variable is updated at
>> > >> > > runtime - eg in qmp_human_monitor_command(), monitor_qmp_read(),
>> > >> > > monitor_read(), etc. So if migration threads outside the BQL are
>> > >> > > calling error_report() that could well cause problems. If you
>> > >> > > are lucky messages will merely end up going to stderr instead of
>> > >> > > the monitor, but in worst case I wouldn't be surprised if there
>> > >> > > is a crash possibility in some race conditions.
>> > >> > 
>> > >> > cur_mon dates back to single-threaded times.
>> > >> > 
>> > >> > The idea is to print to the monitor when running within an HMP command,
>> > >> > else to stderr.
>> > >> > 
>> > >> > The current solution is to set cur_mon around monitor commands.  Fine
>> > >> > with a single thread, not fine at all with multiple threads.
>> > >> > 
>> > >> > Making cur_mon thread-local should fix things.
>> > >> > 
>> > >> > If you do want to report errors from another thread in a monitor, you
>> > >> > should use error_setg() & friends to get them into the monitor, in my
>> > >> > opinion.  Asynchronously barfing output to a monitor doesn't strike me
>> > >> > as a sensible design.  Not least because it doesn't work at all with
>> > >> > QMP!  If an error message is important enough for the human monitor's
>> > >> > user to make use route it to the human monitor, why is hiding it from
>> > >> > the QMP client okay?
>> > >> > 
>> > >> > If I'm wrong and it is sensible, we need locking.
>> > >> 
>> > >> The difficulty is that we've long tried to be consistent and use error_report
>> > >> rather than fprintf's; now that is turning out to be wrong if we're in
>> > >> other threads.
>> > 
>> > No, the two are equally wrong as wrong as far as threading is concerned:
>> > unless that other thread is executing an HMP command, error_report()
>> > calls vfprintf().
>> > 
>> > >>                 It's even trickier for the cases of routines that might
>> > >> be called in either the main thread or another thread - we have no
>> > >> right answer as to how they should print na error.
>> > >
>> > > Pretty much all code should be using error_setg together with an Error **errp
>> > > parameter to the method. The only places that should use error_report are at
>> > > top of the call chain where they unambigously know that the printed result
>> > > is going to the right place.
>> > 
>> > When you know your code's running on behalf of startup code, a monitor
>> > command or similar, go ahead and error_report().
>> > 
>> > When you don't know your context, error reporting should be left to
>> > something up the stack that does.  This means returning a suitable error
>> > value in simple cases (null, -1, -errno, whatever makes sense, just
>> > document it), and error_setg() when error values don't provide enough
>> > information to the caller to report the error in a useful way.
>> 
>> We need a way to be able to report an error without plumbing error_setg
>> up the stack; if you're saying error_report isn't suitable then we
>> should just recommend we switch everything in migration back to
>> fprintf(stderr,

In the cases where error_report() isn't suitable, fprintf() is just as
unsuitable for the exact same reasons.

> Well both error_report() + fprintf  are broken from POV of anything
> using QMP. error_report() is slightly less broken for HMP,

error_report() is not broken at all for HMP code.  The trouble is code
that can't know whether it's running in a context where error_report()
is suitable.

>                                                            but doesn't
> help QMP.

Correct.

> In the short term we should just make error_report be  threadsafe in
> its usage of the monitor.

Any problems left once cur_mon is thread-local (which it should be
anyway)?

>                           Beyond the short term we have no choice but
> to plumb in error_setg throughout, otherwise QMP will continue to
> have useless error reporting in this area of code.

Actually, no choice but propagate errors up the stack until they reach a
spot that knows how to report them.  error_setg() is *one* way to do
that.  Its prime advantage is ability to carry an error message.  Its
disadvantage is boilerplate.  Use it as needed.  Just don't convert back
and forth between Error and other representations while propagating up
the stack.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-19 12:16                           ` Markus Armbruster
@ 2016-10-19 12:21                             ` Daniel P. Berrange
  2016-10-19 18:06                               ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 54+ messages in thread
From: Daniel P. Berrange @ 2016-10-19 12:21 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Dr. David Alan Gilbert, qemu-devel

On Wed, Oct 19, 2016 at 02:16:05PM +0200, Markus Armbruster wrote:
> "Daniel P. Berrange" <berrange@redhat.com> writes:
> 
> > On Wed, Oct 19, 2016 at 11:05:53AM +0100, Dr. David Alan Gilbert wrote:
> >> 
> >> We need a way to be able to report an error without plumbing error_setg
> >> up the stack; if you're saying error_report isn't suitable then we
> >> should just recommend we switch everything in migration back to
> >> fprintf(stderr,
> 
> In the cases where error_report() isn't suitable, fprintf() is just as
> unsuitable for the exact same reasons.
> 
> > Well both error_report() + fprintf  are broken from POV of anything
> > using QMP. error_report() is slightly less broken for HMP,
> 
> error_report() is not broken at all for HMP code.  The trouble is code
> that can't know whether it's running in a context where error_report()
> is suitable.
> 
> >                                                            but doesn't
> > help QMP.
> 
> Correct.
> 
> > In the short term we should just make error_report be  threadsafe in
> > its usage of the monitor.
> 
> Any problems left once cur_mon is thread-local (which it should be
> anyway)?

If we make cur_mon a thread-local, then error_report() is equivalent
to fprintf(stderr) for the migration code, since the migration
code runs in a different thread thread, and so would always see
cur_mon == NULL.

> >                           Beyond the short term we have no choice but
> > to plumb in error_setg throughout, otherwise QMP will continue to
> > have useless error reporting in this area of code.
> 
> Actually, no choice but propagate errors up the stack until they reach a
> spot that knows how to report them.  error_setg() is *one* way to do
> that.  Its prime advantage is ability to carry an error message.  Its
> disadvantage is boilerplate.  Use it as needed.  Just don't convert back
> and forth between Error and other representations while propagating up
> the stack.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-18 14:01             ` Daniel P. Berrange
  2016-10-18 18:53               ` Dr. David Alan Gilbert
  2016-10-19  8:00               ` Markus Armbruster
@ 2016-10-19 12:26               ` Paolo Bonzini
  2016-10-19 17:01                 ` Dr. David Alan Gilbert
  2 siblings, 1 reply; 54+ messages in thread
From: Paolo Bonzini @ 2016-10-19 12:26 UTC (permalink / raw)
  To: Daniel P. Berrange, Dr. David Alan Gilbert; +Cc: qemu-devel, armbru



On 18/10/2016 16:01, Daniel P. Berrange wrote:
>> > 
>> > I already use error_report's in places in migration threads of various
>> > types; I'm not sure if that's a problem.
> Unless those places are protected by the big qemu lock, that sounds
> not good. error_report calls into error_vprintf which checks the
> 'cur_mon' global "Monitor" pointer. This variable is updated at
> runtime - eg in qmp_human_monitor_command(), monitor_qmp_read(),
> monitor_read(), etc. So if migration threads outside the BQL are
> calling error_report() that could well cause problems. If you
> are lucky messages will merely end up going to stderr instead of
> the monitor, but in worst case I wouldn't be surprised if there
> is a crash possibility in some race conditions.

Writes to chardevs *are* thread-safe (assuming qio_channel_create_watch
is thread-safe; it seems to be).

Only reads aren't, in the sense that they require an event loop so they
use that event loop for serialization.

Paolo

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-19 12:26               ` Paolo Bonzini
@ 2016-10-19 17:01                 ` Dr. David Alan Gilbert
  2016-10-19 20:51                   ` Paolo Bonzini
  0 siblings, 1 reply; 54+ messages in thread
From: Dr. David Alan Gilbert @ 2016-10-19 17:01 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Daniel P. Berrange, qemu-devel, armbru

* Paolo Bonzini (pbonzini@redhat.com) wrote:
> 
> 
> On 18/10/2016 16:01, Daniel P. Berrange wrote:
> >> > 
> >> > I already use error_report's in places in migration threads of various
> >> > types; I'm not sure if that's a problem.
> > Unless those places are protected by the big qemu lock, that sounds
> > not good. error_report calls into error_vprintf which checks the
> > 'cur_mon' global "Monitor" pointer. This variable is updated at
> > runtime - eg in qmp_human_monitor_command(), monitor_qmp_read(),
> > monitor_read(), etc. So if migration threads outside the BQL are
> > calling error_report() that could well cause problems. If you
> > are lucky messages will merely end up going to stderr instead of
> > the monitor, but in worst case I wouldn't be surprised if there
> > is a crash possibility in some race conditions.
> 
> Writes to chardevs *are* thread-safe (assuming qio_channel_create_watch
> is thread-safe; it seems to be).

Hmm that's useful (although it doesn't solve error_report because error_vprintf
is racy itself).

How deadlock safe is it?  In particular imagine I've got another
thread which is doing:
  take bql
     <many layers of stuff>
     write to a chardev
  release bql

  if that chardev is registered on the main thread, will that
deadlock?

> Only reads aren't, in the sense that they require an event loop so they
> use that event loop for serialization.

Hmm OK.

Dave

> 
> Paolo
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-19 12:21                             ` Daniel P. Berrange
@ 2016-10-19 18:06                               ` Dr. David Alan Gilbert
  2016-10-20  8:37                                 ` Daniel P. Berrange
  2016-10-20  8:55                                 ` Markus Armbruster
  0 siblings, 2 replies; 54+ messages in thread
From: Dr. David Alan Gilbert @ 2016-10-19 18:06 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: Markus Armbruster, qemu-devel

* Daniel P. Berrange (berrange@redhat.com) wrote:
> On Wed, Oct 19, 2016 at 02:16:05PM +0200, Markus Armbruster wrote:
> > "Daniel P. Berrange" <berrange@redhat.com> writes:
> > 
> > > On Wed, Oct 19, 2016 at 11:05:53AM +0100, Dr. David Alan Gilbert wrote:
> > >> 
> > >> We need a way to be able to report an error without plumbing error_setg
> > >> up the stack; if you're saying error_report isn't suitable then we
> > >> should just recommend we switch everything in migration back to
> > >> fprintf(stderr,
> > 
> > In the cases where error_report() isn't suitable, fprintf() is just as
> > unsuitable for the exact same reasons.
> > 
> > > Well both error_report() + fprintf  are broken from POV of anything
> > > using QMP. error_report() is slightly less broken for HMP,
> > 
> > error_report() is not broken at all for HMP code.  The trouble is code
> > that can't know whether it's running in a context where error_report()
> > is suitable.
> > 
> > >                                                            but doesn't
> > > help QMP.
> > 
> > Correct.
> > 
> > > In the short term we should just make error_report be  threadsafe in
> > > its usage of the monitor.
> > 
> > Any problems left once cur_mon is thread-local (which it should be
> > anyway)?
> 
> If we make cur_mon a thread-local, then error_report() is equivalent
> to fprintf(stderr) for the migration code, since the migration
> code runs in a different thread thread, and so would always see
> cur_mon == NULL.

Yes, that would become safe; it does sound the best fix for the current
worry.

If we had that, then why not wire up error_report to pass errors back to QMP
as well?

Dave

> > >                           Beyond the short term we have no choice but
> > > to plumb in error_setg throughout, otherwise QMP will continue to
> > > have useless error reporting in this area of code.
> > 
> > Actually, no choice but propagate errors up the stack until they reach a
> > spot that knows how to report them.  error_setg() is *one* way to do
> > that.  Its prime advantage is ability to carry an error message.  Its
> > disadvantage is boilerplate.  Use it as needed.  Just don't convert back
> > and forth between Error and other representations while propagating up
> > the stack.
> 
> Regards,
> Daniel
> -- 
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-19 17:01                 ` Dr. David Alan Gilbert
@ 2016-10-19 20:51                   ` Paolo Bonzini
  2016-10-20  8:34                     ` Daniel P. Berrange
  0 siblings, 1 reply; 54+ messages in thread
From: Paolo Bonzini @ 2016-10-19 20:51 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel, armbru



On 19/10/2016 19:01, Dr. David Alan Gilbert wrote:
> * Paolo Bonzini (pbonzini@redhat.com) wrote:
>>
>>
>> On 18/10/2016 16:01, Daniel P. Berrange wrote:
>>>>>
>>>>> I already use error_report's in places in migration threads of various
>>>>> types; I'm not sure if that's a problem.
>>> Unless those places are protected by the big qemu lock, that sounds
>>> not good. error_report calls into error_vprintf which checks the
>>> 'cur_mon' global "Monitor" pointer. This variable is updated at
>>> runtime - eg in qmp_human_monitor_command(), monitor_qmp_read(),
>>> monitor_read(), etc. So if migration threads outside the BQL are
>>> calling error_report() that could well cause problems. If you
>>> are lucky messages will merely end up going to stderr instead of
>>> the monitor, but in worst case I wouldn't be surprised if there
>>> is a crash possibility in some race conditions.
>>
>> Writes to chardevs *are* thread-safe (assuming qio_channel_create_watch
>> is thread-safe; it seems to be).
> 
> Hmm that's useful (although it doesn't solve error_report because error_vprintf
> is racy itself).

How is it racy?  Because of the case where cur_mon changes under the
feet of error_vprintf?  I guess that can be ignored for now (just a TODO
comment will do).

> How deadlock safe is it?  In particular imagine I've got another
> thread which is doing:
>   take bql
>      <many layers of stuff>
>      write to a chardev
>   release bql
> 
>   if that chardev is registered on the main thread, will that
> deadlock?

No, it won't.  Since your usecase is about the monitor, note that the
monitor does its own buffering; if the nonblocking write leaves stuff in
the buffer, the monitor will process the watch only after the BQL is
released by the thread.  However, there's no deadlock.

Paolo

>> Only reads aren't, in the sense that they require an event loop so they
>> use that event loop for serialization.
> 
> Hmm OK.
> 
> Dave
> 
>>
>> Paolo
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-19 20:51                   ` Paolo Bonzini
@ 2016-10-20  8:34                     ` Daniel P. Berrange
  0 siblings, 0 replies; 54+ messages in thread
From: Daniel P. Berrange @ 2016-10-20  8:34 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Dr. David Alan Gilbert, qemu-devel, armbru

On Wed, Oct 19, 2016 at 10:51:07PM +0200, Paolo Bonzini wrote:
> 
> 
> On 19/10/2016 19:01, Dr. David Alan Gilbert wrote:
> > * Paolo Bonzini (pbonzini@redhat.com) wrote:
> >>
> >>
> >> On 18/10/2016 16:01, Daniel P. Berrange wrote:
> >>>>>
> >>>>> I already use error_report's in places in migration threads of various
> >>>>> types; I'm not sure if that's a problem.
> >>> Unless those places are protected by the big qemu lock, that sounds
> >>> not good. error_report calls into error_vprintf which checks the
> >>> 'cur_mon' global "Monitor" pointer. This variable is updated at
> >>> runtime - eg in qmp_human_monitor_command(), monitor_qmp_read(),
> >>> monitor_read(), etc. So if migration threads outside the BQL are
> >>> calling error_report() that could well cause problems. If you
> >>> are lucky messages will merely end up going to stderr instead of
> >>> the monitor, but in worst case I wouldn't be surprised if there
> >>> is a crash possibility in some race conditions.
> >>
> >> Writes to chardevs *are* thread-safe (assuming qio_channel_create_watch
> >> is thread-safe; it seems to be).
> > 
> > Hmm that's useful (although it doesn't solve error_report because error_vprintf
> > is racy itself).
> 
> How is it racy?  Because of the case where cur_mon changes under the
> feet of error_vprintf?  I guess that can be ignored for now (just a TODO
> comment will do).

Yes, the usage of cur_mon is unsafe as there's a TOCTOU race in
there that I believe can result in a NULL pointer crash.

eg these two methods:

static inline bool monitor_is_qmp(const Monitor *mon)
{
    return (mon->flags & MONITOR_USE_CONTROL);
}

bool monitor_cur_is_qmp(void)
{
    return cur_mon && monitor_is_qmp(cur_mon);
}


In using error_vprintf() from a non-main thread and not holding
the big QEMU lock, then 'cur_mon' can change between time we
check that its non-NULL, and when accessing cur_mon->flags.

Admittedly its a rare race - you could have to have an error
being reported by background migration, concurrently with a
monitor command being invoked, but that will hit someone
eventually unless I'm missing some synchronization somewhere.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-19 18:06                               ` Dr. David Alan Gilbert
@ 2016-10-20  8:37                                 ` Daniel P. Berrange
  2016-10-20  8:53                                   ` Marc-André Lureau
                                                     ` (2 more replies)
  2016-10-20  8:55                                 ` Markus Armbruster
  1 sibling, 3 replies; 54+ messages in thread
From: Daniel P. Berrange @ 2016-10-20  8:37 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: Markus Armbruster, qemu-devel

On Wed, Oct 19, 2016 at 07:06:16PM +0100, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrange (berrange@redhat.com) wrote:
> > On Wed, Oct 19, 2016 at 02:16:05PM +0200, Markus Armbruster wrote:
> > > "Daniel P. Berrange" <berrange@redhat.com> writes:
> > > 
> > > > On Wed, Oct 19, 2016 at 11:05:53AM +0100, Dr. David Alan Gilbert wrote:
> > > >> 
> > > >> We need a way to be able to report an error without plumbing error_setg
> > > >> up the stack; if you're saying error_report isn't suitable then we
> > > >> should just recommend we switch everything in migration back to
> > > >> fprintf(stderr,
> > > 
> > > In the cases where error_report() isn't suitable, fprintf() is just as
> > > unsuitable for the exact same reasons.
> > > 
> > > > Well both error_report() + fprintf  are broken from POV of anything
> > > > using QMP. error_report() is slightly less broken for HMP,
> > > 
> > > error_report() is not broken at all for HMP code.  The trouble is code
> > > that can't know whether it's running in a context where error_report()
> > > is suitable.
> > > 
> > > >                                                            but doesn't
> > > > help QMP.
> > > 
> > > Correct.
> > > 
> > > > In the short term we should just make error_report be  threadsafe in
> > > > its usage of the monitor.
> > > 
> > > Any problems left once cur_mon is thread-local (which it should be
> > > anyway)?
> > 
> > If we make cur_mon a thread-local, then error_report() is equivalent
> > to fprintf(stderr) for the migration code, since the migration
> > code runs in a different thread thread, and so would always see
> > cur_mon == NULL.
> 
> Yes, that would become safe; it does sound the best fix for the current
> worry.
> 
> If we had that, then why not wire up error_report to pass errors back to QMP
> as well?

You have a problem of context - if you have multiple monitors, how do
you know which to send the error back to if you're not in the event
loop thread, and thus cur_mon is NULL. With Marc-Andre's series which
allows proper async command processing it gets even harder, because
there's potentially many outstanding commands associated with a monitor
and you need to decide which the error should be given to.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-20  8:37                                 ` Daniel P. Berrange
@ 2016-10-20  8:53                                   ` Marc-André Lureau
  2016-10-20 10:45                                     ` Markus Armbruster
  2016-10-20 16:56                                   ` Paolo Bonzini
  2016-10-20 16:59                                   ` Dr. David Alan Gilbert
  2 siblings, 1 reply; 54+ messages in thread
From: Marc-André Lureau @ 2016-10-20  8:53 UTC (permalink / raw)
  To: Daniel P. Berrange, Dr. David Alan Gilbert; +Cc: Markus Armbruster, qemu-devel

Hi

On Thu, Oct 20, 2016 at 11:38 AM Daniel P. Berrange <berrange@redhat.com>
wrote:

> On Wed, Oct 19, 2016 at 07:06:16PM +0100, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrange (berrange@redhat.com) wrote:
> > > On Wed, Oct 19, 2016 at 02:16:05PM +0200, Markus Armbruster wrote:
> > > > "Daniel P. Berrange" <berrange@redhat.com> writes:
> > > >
> > > > > On Wed, Oct 19, 2016 at 11:05:53AM +0100, Dr. David Alan Gilbert
> wrote:
> > > > >>
> > > > >> We need a way to be able to report an error without plumbing
> error_setg
> > > > >> up the stack; if you're saying error_report isn't suitable then we
> > > > >> should just recommend we switch everything in migration back to
> > > > >> fprintf(stderr,
> > > >
> > > > In the cases where error_report() isn't suitable, fprintf() is just
> as
> > > > unsuitable for the exact same reasons.
> > > >
> > > > > Well both error_report() + fprintf  are broken from POV of anything
> > > > > using QMP. error_report() is slightly less broken for HMP,
> > > >
> > > > error_report() is not broken at all for HMP code.  The trouble is
> code
> > > > that can't know whether it's running in a context where
> error_report()
> > > > is suitable.
> > > >
> > > > >                                                            but
> doesn't
> > > > > help QMP.
> > > >
> > > > Correct.
> > > >
> > > > > In the short term we should just make error_report be  threadsafe
> in
> > > > > its usage of the monitor.
> > > >
> > > > Any problems left once cur_mon is thread-local (which it should be
> > > > anyway)?
> > >
> > > If we make cur_mon a thread-local, then error_report() is equivalent
> > > to fprintf(stderr) for the migration code, since the migration
> > > code runs in a different thread thread, and so would always see
> > > cur_mon == NULL.
> >
> > Yes, that would become safe; it does sound the best fix for the current
> > worry.
> >
> > If we had that, then why not wire up error_report to pass errors back to
> QMP
> > as well?
>
> You have a problem of context - if you have multiple monitors, how do
> you know which to send the error back to if you're not in the event
> loop thread, and thus cur_mon is NULL. With Marc-Andre's series which
> allows proper async command processing it gets even harder, because
> there's potentially many outstanding commands associated with a monitor
> and you need to decide which the error should be given to.
>

I would think the contrary, it gets potentially easier as it may actually
keep the associated monitor (the one that triggered the command) with the
'async' context.

-- 
Marc-André Lureau

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-19 18:06                               ` Dr. David Alan Gilbert
  2016-10-20  8:37                                 ` Daniel P. Berrange
@ 2016-10-20  8:55                                 ` Markus Armbruster
  2016-10-20  9:03                                   ` Daniel P. Berrange
  1 sibling, 1 reply; 54+ messages in thread
From: Markus Armbruster @ 2016-10-20  8:55 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: Daniel P. Berrange, qemu-devel

"Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:

> * Daniel P. Berrange (berrange@redhat.com) wrote:
>> On Wed, Oct 19, 2016 at 02:16:05PM +0200, Markus Armbruster wrote:
>> > "Daniel P. Berrange" <berrange@redhat.com> writes:
>> > 
>> > > On Wed, Oct 19, 2016 at 11:05:53AM +0100, Dr. David Alan Gilbert wrote:
>> > >> 
>> > >> We need a way to be able to report an error without plumbing error_setg
>> > >> up the stack; if you're saying error_report isn't suitable then we
>> > >> should just recommend we switch everything in migration back to
>> > >> fprintf(stderr,
>> > 
>> > In the cases where error_report() isn't suitable, fprintf() is just as
>> > unsuitable for the exact same reasons.
>> > 
>> > > Well both error_report() + fprintf  are broken from POV of anything
>> > > using QMP. error_report() is slightly less broken for HMP,
>> > 
>> > error_report() is not broken at all for HMP code.  The trouble is code
>> > that can't know whether it's running in a context where error_report()
>> > is suitable.
>> > 
>> > >                                                            but doesn't
>> > > help QMP.
>> > 
>> > Correct.
>> > 
>> > > In the short term we should just make error_report be  threadsafe in
>> > > its usage of the monitor.
>> > 
>> > Any problems left once cur_mon is thread-local (which it should be
>> > anyway)?
>> 
>> If we make cur_mon a thread-local, then error_report() is equivalent
>> to fprintf(stderr) for the migration code, since the migration
>> code runs in a different thread thread, and so would always see
>> cur_mon == NULL.
>
> Yes, that would become safe; it does sound the best fix for the current
> worry.
>
> If we had that, then why not wire up error_report to pass errors back to QMP
> as well?

Well, that would be similar to how QMP used to work.

Back when the design of the QMP monitor was hammered out, we discussed
how to do errors.

Anthony argued for passing around error objects.  I pointed out the
enormous amount of work this would require: every call chain from the
monitor to an error needs to be modified, with ripple effects throughout
QEMU.

So I proposed a shortcut: have a function that reports the error, except
when in QMP context store it in the monitor instead.  That way, you need
to touch only places reporting errors, not every call chains leading to
one.

Sadly, that function couldn't be error_report() back then, because
Anthony insisted on rich error objects, against my opposition.  To
support them, we invented a new function, in commit 8204a91.  Code still
had to be converted to this new function.  But it was the least
laborious solution given the rich error object requirement.

Anthony reluctantly accepted "store errors in monitor" as a transitional
interface, mostly because we needed to get QMP off the ground fast, and
passing around error objects would have slowed command conversion to a
crawl.  I hoped the transitional interface would turn out to be quite
practical, and remain.

Rich errors turned out to be a dead end, and we abandoned them after a
bit over two years (commit de253f1).

The "store error in the monitor" turned out to be a dead end, too.  They
lingered in the tree for a long time, until commit 4629ed1.  My memory
is foggy on why exactly they didn't work out, but reasons include:

* What if code attempts to store multiple errors?  We initially made
  that an assertion failure, but quickly had to relax that so that
  subsequent errors are silently ignored (commit 27a749f).  That's
  differently suboptimal.

* Failure remains difficult to see in the code.  Before QMP, a monitor
  command handler didn't return status to the monitor core, it simply
  reported it to the human user, possibly buried deep down in some call
  chain.  Only if something up the chain needed to know, we additionally
  propagated failure up the chain in ad hoc ways.  Making error
  propagation the only way to fail commands made failure more obvious in
  the code.

* Plumbing errors to the correct monitor is easy only in the
  (synchronous) monitor command handler.  If the handler kicks off some
  background job, you can't store them in the monitor even if you know
  which monitor kicked off the job, because that could interfere with
  another handler's execution!  You'd have to find some other place to
  store, and create some other code to examine that store and do what
  needs to be done.  Whatever that may be: could be sending the error in
  an asynchronous event, could be retaining for a later command to
  report synchronously.  But then propagating errors up the call chain
  starts to look more appealing than it used to.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-20  8:55                                 ` Markus Armbruster
@ 2016-10-20  9:03                                   ` Daniel P. Berrange
  2016-10-20  9:58                                     ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 54+ messages in thread
From: Daniel P. Berrange @ 2016-10-20  9:03 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Dr. David Alan Gilbert, qemu-devel

On Thu, Oct 20, 2016 at 10:55:52AM +0200, Markus Armbruster wrote:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:
> 
> > * Daniel P. Berrange (berrange@redhat.com) wrote:
> >> On Wed, Oct 19, 2016 at 02:16:05PM +0200, Markus Armbruster wrote:
> >> > "Daniel P. Berrange" <berrange@redhat.com> writes:
> >> > 
> >> > > On Wed, Oct 19, 2016 at 11:05:53AM +0100, Dr. David Alan Gilbert wrote:
> >> > >> 
> >> > >> We need a way to be able to report an error without plumbing error_setg
> >> > >> up the stack; if you're saying error_report isn't suitable then we
> >> > >> should just recommend we switch everything in migration back to
> >> > >> fprintf(stderr,
> >> > 
> >> > In the cases where error_report() isn't suitable, fprintf() is just as
> >> > unsuitable for the exact same reasons.
> >> > 
> >> > > Well both error_report() + fprintf  are broken from POV of anything
> >> > > using QMP. error_report() is slightly less broken for HMP,
> >> > 
> >> > error_report() is not broken at all for HMP code.  The trouble is code
> >> > that can't know whether it's running in a context where error_report()
> >> > is suitable.
> >> > 
> >> > >                                                            but doesn't
> >> > > help QMP.
> >> > 
> >> > Correct.
> >> > 
> >> > > In the short term we should just make error_report be  threadsafe in
> >> > > its usage of the monitor.
> >> > 
> >> > Any problems left once cur_mon is thread-local (which it should be
> >> > anyway)?
> >> 
> >> If we make cur_mon a thread-local, then error_report() is equivalent
> >> to fprintf(stderr) for the migration code, since the migration
> >> code runs in a different thread thread, and so would always see
> >> cur_mon == NULL.
> >
> > Yes, that would become safe; it does sound the best fix for the current
> > worry.
> >
> > If we had that, then why not wire up error_report to pass errors back to QMP
> > as well?
> 
> Well, that would be similar to how QMP used to work.
> 
> Back when the design of the QMP monitor was hammered out, we discussed
> how to do errors.
> 
> Anthony argued for passing around error objects.  I pointed out the
> enormous amount of work this would require: every call chain from the
> monitor to an error needs to be modified, with ripple effects throughout
> QEMU.
> 
> So I proposed a shortcut: have a function that reports the error, except
> when in QMP context store it in the monitor instead.  That way, you need
> to touch only places reporting errors, not every call chains leading to
> one.
> 
> Sadly, that function couldn't be error_report() back then, because
> Anthony insisted on rich error objects, against my opposition.  To
> support them, we invented a new function, in commit 8204a91.  Code still
> had to be converted to this new function.  But it was the least
> laborious solution given the rich error object requirement.
> 
> Anthony reluctantly accepted "store errors in monitor" as a transitional
> interface, mostly because we needed to get QMP off the ground fast, and
> passing around error objects would have slowed command conversion to a
> crawl.  I hoped the transitional interface would turn out to be quite
> practical, and remain.
> 
> Rich errors turned out to be a dead end, and we abandoned them after a
> bit over two years (commit de253f1).
> 
> The "store error in the monitor" turned out to be a dead end, too.  They
> lingered in the tree for a long time, until commit 4629ed1.  My memory
> is foggy on why exactly they didn't work out, but reasons include:
> 
> * What if code attempts to store multiple errors?  We initially made
>   that an assertion failure, but quickly had to relax that so that
>   subsequent errors are silently ignored (commit 27a749f).  That's
>   differently suboptimal.
> 
> * Failure remains difficult to see in the code.  Before QMP, a monitor
>   command handler didn't return status to the monitor core, it simply
>   reported it to the human user, possibly buried deep down in some call
>   chain.  Only if something up the chain needed to know, we additionally
>   propagated failure up the chain in ad hoc ways.  Making error
>   propagation the only way to fail commands made failure more obvious in
>   the code.
> 
> * Plumbing errors to the correct monitor is easy only in the
>   (synchronous) monitor command handler.  If the handler kicks off some
>   background job, you can't store them in the monitor even if you know
>   which monitor kicked off the job, because that could interfere with
>   another handler's execution!  You'd have to find some other place to
>   store, and create some other code to examine that store and do what
>   needs to be done.  Whatever that may be: could be sending the error in
>   an asynchronous event, could be retaining for a later command to
>   report synchronously.  But then propagating errors up the call chain
>   starts to look more appealing than it used to.

Our code has increasingly converted to propagate errors up the call
chain, but having a mix of different error reporting approaches
is increasingly causing pain.

eg a function which propagates errors wants to call into a function
whicih uses error_report - there's no nice way to propagate the error
since it has already been reported.  If the function then wants to
explicitly ignore the error, then that's impossible too,since it has
already been reported.  Add in our code which doesn't use error_report
and instead returns errno values, such as the block layer, and it gets
even worse because if that calls a function which propagates an error,
it has to throw away that useful error and return a useless invented
errno value :(

IMHO continuing to convert code to propagate errors is the only way
out of this swamp, because it provides the greatest flexibility to
the callers of said functions to decide how to deal with the error.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-20  9:03                                   ` Daniel P. Berrange
@ 2016-10-20  9:58                                     ` Dr. David Alan Gilbert
  2016-10-20 10:42                                       ` Markus Armbruster
  0 siblings, 1 reply; 54+ messages in thread
From: Dr. David Alan Gilbert @ 2016-10-20  9:58 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: Markus Armbruster, qemu-devel

* Daniel P. Berrange (berrange@redhat.com) wrote:
> On Thu, Oct 20, 2016 at 10:55:52AM +0200, Markus Armbruster wrote:
> > "Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:
> > 
> > > * Daniel P. Berrange (berrange@redhat.com) wrote:
> > >> On Wed, Oct 19, 2016 at 02:16:05PM +0200, Markus Armbruster wrote:
> > >> > "Daniel P. Berrange" <berrange@redhat.com> writes:
> > >> > 
> > >> > > On Wed, Oct 19, 2016 at 11:05:53AM +0100, Dr. David Alan Gilbert wrote:
> > >> > >> 
> > >> > >> We need a way to be able to report an error without plumbing error_setg
> > >> > >> up the stack; if you're saying error_report isn't suitable then we
> > >> > >> should just recommend we switch everything in migration back to
> > >> > >> fprintf(stderr,
> > >> > 
> > >> > In the cases where error_report() isn't suitable, fprintf() is just as
> > >> > unsuitable for the exact same reasons.
> > >> > 
> > >> > > Well both error_report() + fprintf  are broken from POV of anything
> > >> > > using QMP. error_report() is slightly less broken for HMP,
> > >> > 
> > >> > error_report() is not broken at all for HMP code.  The trouble is code
> > >> > that can't know whether it's running in a context where error_report()
> > >> > is suitable.
> > >> > 
> > >> > >                                                            but doesn't
> > >> > > help QMP.
> > >> > 
> > >> > Correct.
> > >> > 
> > >> > > In the short term we should just make error_report be  threadsafe in
> > >> > > its usage of the monitor.
> > >> > 
> > >> > Any problems left once cur_mon is thread-local (which it should be
> > >> > anyway)?
> > >> 
> > >> If we make cur_mon a thread-local, then error_report() is equivalent
> > >> to fprintf(stderr) for the migration code, since the migration
> > >> code runs in a different thread thread, and so would always see
> > >> cur_mon == NULL.
> > >
> > > Yes, that would become safe; it does sound the best fix for the current
> > > worry.
> > >
> > > If we had that, then why not wire up error_report to pass errors back to QMP
> > > as well?
> > 
> > Well, that would be similar to how QMP used to work.
> > 
> > Back when the design of the QMP monitor was hammered out, we discussed
> > how to do errors.
> > 
> > Anthony argued for passing around error objects.  I pointed out the
> > enormous amount of work this would require: every call chain from the
> > monitor to an error needs to be modified, with ripple effects throughout
> > QEMU.
> > 
> > So I proposed a shortcut: have a function that reports the error, except
> > when in QMP context store it in the monitor instead.  That way, you need
> > to touch only places reporting errors, not every call chains leading to
> > one.
> > 
> > Sadly, that function couldn't be error_report() back then, because
> > Anthony insisted on rich error objects, against my opposition.  To
> > support them, we invented a new function, in commit 8204a91.  Code still
> > had to be converted to this new function.  But it was the least
> > laborious solution given the rich error object requirement.
> > 
> > Anthony reluctantly accepted "store errors in monitor" as a transitional
> > interface, mostly because we needed to get QMP off the ground fast, and
> > passing around error objects would have slowed command conversion to a
> > crawl.  I hoped the transitional interface would turn out to be quite
> > practical, and remain.
> > 
> > Rich errors turned out to be a dead end, and we abandoned them after a
> > bit over two years (commit de253f1).
> > 
> > The "store error in the monitor" turned out to be a dead end, too.  They
> > lingered in the tree for a long time, until commit 4629ed1.  My memory
> > is foggy on why exactly they didn't work out, but reasons include:
> > 
> > * What if code attempts to store multiple errors?  We initially made
> >   that an assertion failure, but quickly had to relax that so that
> >   subsequent errors are silently ignored (commit 27a749f).  That's
> >   differently suboptimal.
> > 
> > * Failure remains difficult to see in the code.  Before QMP, a monitor
> >   command handler didn't return status to the monitor core, it simply
> >   reported it to the human user, possibly buried deep down in some call
> >   chain.  Only if something up the chain needed to know, we additionally
> >   propagated failure up the chain in ad hoc ways.  Making error
> >   propagation the only way to fail commands made failure more obvious in
> >   the code.
> > 
> > * Plumbing errors to the correct monitor is easy only in the
> >   (synchronous) monitor command handler.  If the handler kicks off some
> >   background job, you can't store them in the monitor even if you know
> >   which monitor kicked off the job, because that could interfere with
> >   another handler's execution!  You'd have to find some other place to
> >   store, and create some other code to examine that store and do what
> >   needs to be done.  Whatever that may be: could be sending the error in
> >   an asynchronous event, could be retaining for a later command to
> >   report synchronously.  But then propagating errors up the call chain
> >   starts to look more appealing than it used to.
> 
> Our code has increasingly converted to propagate errors up the call
> chain, but having a mix of different error reporting approaches
> is increasingly causing pain.
> 
> eg a function which propagates errors wants to call into a function
> whicih uses error_report - there's no nice way to propagate the error
> since it has already been reported.  If the function then wants to
> explicitly ignore the error, then that's impossible too,since it has
> already been reported.  Add in our code which doesn't use error_report
> and instead returns errno values, such as the block layer, and it gets
> even worse because if that calls a function which propagates an error,
> it has to throw away that useful error and return a useless invented
> errno value :(
> 
> IMHO continuing to convert code to propagate errors is the only way
> out of this swamp, because it provides the greatest flexibility to
> the callers of said functions to decide how to deal with the error.

The problem is that our way of propagating errors actively discourages 
people from adding errors and you're left with lots of useless invented errno's.
error_report makes it easy for people to scatter meaningful error messages
in at any point.

Make an easy, concise way of reporting an error that fits in with
a propagation scheme and I'd consider converting stuff.

Dave

> 
> Regards,
> Daniel
> -- 
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-20  9:58                                     ` Dr. David Alan Gilbert
@ 2016-10-20 10:42                                       ` Markus Armbruster
  2016-10-20 11:01                                         ` Dr. David Alan Gilbert
  2016-10-20 11:08                                         ` Daniel P. Berrange
  0 siblings, 2 replies; 54+ messages in thread
From: Markus Armbruster @ 2016-10-20 10:42 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: Daniel P. Berrange, qemu-devel

"Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:

> * Daniel P. Berrange (berrange@redhat.com) wrote:
>> On Thu, Oct 20, 2016 at 10:55:52AM +0200, Markus Armbruster wrote:
>> > "Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:
>> > 
>> > > * Daniel P. Berrange (berrange@redhat.com) wrote:
>> > >> On Wed, Oct 19, 2016 at 02:16:05PM +0200, Markus Armbruster wrote:
>> > >> > "Daniel P. Berrange" <berrange@redhat.com> writes:
>> > >> > 
>> > >> > > On Wed, Oct 19, 2016 at 11:05:53AM +0100, Dr. David Alan Gilbert wrote:
>> > >> > >> 
>> > >> > >> We need a way to be able to report an error without plumbing error_setg
>> > >> > >> up the stack; if you're saying error_report isn't suitable then we
>> > >> > >> should just recommend we switch everything in migration back to
>> > >> > >> fprintf(stderr,
>> > >> > 
>> > >> > In the cases where error_report() isn't suitable, fprintf() is just as
>> > >> > unsuitable for the exact same reasons.
>> > >> > 
>> > >> > > Well both error_report() + fprintf  are broken from POV of anything
>> > >> > > using QMP. error_report() is slightly less broken for HMP,
>> > >> > 
>> > >> > error_report() is not broken at all for HMP code.  The trouble is code
>> > >> > that can't know whether it's running in a context where error_report()
>> > >> > is suitable.
>> > >> > 
>> > >> > >                                                            but doesn't
>> > >> > > help QMP.
>> > >> > 
>> > >> > Correct.
>> > >> > 
>> > >> > > In the short term we should just make error_report be  threadsafe in
>> > >> > > its usage of the monitor.
>> > >> > 
>> > >> > Any problems left once cur_mon is thread-local (which it should be
>> > >> > anyway)?
>> > >> 
>> > >> If we make cur_mon a thread-local, then error_report() is equivalent
>> > >> to fprintf(stderr) for the migration code, since the migration
>> > >> code runs in a different thread thread, and so would always see
>> > >> cur_mon == NULL.
>> > >
>> > > Yes, that would become safe; it does sound the best fix for the current
>> > > worry.
>> > >
>> > > If we had that, then why not wire up error_report to pass errors back to QMP
>> > > as well?
>> > 
>> > Well, that would be similar to how QMP used to work.
>> > 
>> > Back when the design of the QMP monitor was hammered out, we discussed
>> > how to do errors.
>> > 
>> > Anthony argued for passing around error objects.  I pointed out the
>> > enormous amount of work this would require: every call chain from the
>> > monitor to an error needs to be modified, with ripple effects throughout
>> > QEMU.
>> > 
>> > So I proposed a shortcut: have a function that reports the error, except
>> > when in QMP context store it in the monitor instead.  That way, you need
>> > to touch only places reporting errors, not every call chains leading to
>> > one.
>> > 
>> > Sadly, that function couldn't be error_report() back then, because
>> > Anthony insisted on rich error objects, against my opposition.  To
>> > support them, we invented a new function, in commit 8204a91.  Code still
>> > had to be converted to this new function.  But it was the least
>> > laborious solution given the rich error object requirement.
>> > 
>> > Anthony reluctantly accepted "store errors in monitor" as a transitional
>> > interface, mostly because we needed to get QMP off the ground fast, and
>> > passing around error objects would have slowed command conversion to a
>> > crawl.  I hoped the transitional interface would turn out to be quite
>> > practical, and remain.
>> > 
>> > Rich errors turned out to be a dead end, and we abandoned them after a
>> > bit over two years (commit de253f1).
>> > 
>> > The "store error in the monitor" turned out to be a dead end, too.  They
>> > lingered in the tree for a long time, until commit 4629ed1.  My memory
>> > is foggy on why exactly they didn't work out, but reasons include:
>> > 
>> > * What if code attempts to store multiple errors?  We initially made
>> >   that an assertion failure, but quickly had to relax that so that
>> >   subsequent errors are silently ignored (commit 27a749f).  That's
>> >   differently suboptimal.
>> > 
>> > * Failure remains difficult to see in the code.  Before QMP, a monitor
>> >   command handler didn't return status to the monitor core, it simply
>> >   reported it to the human user, possibly buried deep down in some call
>> >   chain.  Only if something up the chain needed to know, we additionally
>> >   propagated failure up the chain in ad hoc ways.  Making error
>> >   propagation the only way to fail commands made failure more obvious in
>> >   the code.
>> > 
>> > * Plumbing errors to the correct monitor is easy only in the
>> >   (synchronous) monitor command handler.  If the handler kicks off some
>> >   background job, you can't store them in the monitor even if you know
>> >   which monitor kicked off the job, because that could interfere with
>> >   another handler's execution!  You'd have to find some other place to
>> >   store, and create some other code to examine that store and do what
>> >   needs to be done.  Whatever that may be: could be sending the error in
>> >   an asynchronous event, could be retaining for a later command to
>> >   report synchronously.  But then propagating errors up the call chain
>> >   starts to look more appealing than it used to.
>> 
>> Our code has increasingly converted to propagate errors up the call
>> chain, but having a mix of different error reporting approaches
>> is increasingly causing pain.
>> 
>> eg a function which propagates errors wants to call into a function
>> whicih uses error_report

When you add an Error * parameter to a function, you get to convert
everything it calls.  Failure to do so is a bug, simple as that.

Likewise, when you add a new call to a function that takes an Error *
parameter.

>>                          - there's no nice way to propagate the error
>> since it has already been reported.  If the function then wants to
>> explicitly ignore the error, then that's impossible too,since it has
>> already been reported.  Add in our code which doesn't use error_report
>> and instead returns errno values, such as the block layer, and it gets
>> even worse because if that calls a function which propagates an error,
>> it has to throw away that useful error and return a useless invented
>> errno value :(

Different bug: when you receive an Error, you have to either handle and
consume it, or pass it on.  Throwing it away and returning an error code
instead counts as neither, and is a bug.

>> IMHO continuing to convert code to propagate errors is the only way
>> out of this swamp, because it provides the greatest flexibility to
>> the callers of said functions to decide how to deal with the error.

I wouldn't call the whole situation a swamp.  error_report() is just
fine in many places.  So is returning -errno.  We convert to Error when
these techniques cease to be fine.  The swampy part is old code where
these techniques have never been fine, or at least not for a while.  To
be drained incrementally.

> The problem is that our way of propagating errors actively discourages 
> people from adding errors and you're left with lots of useless invented errno's.
> error_report makes it easy for people to scatter meaningful error messages
> in at any point.
>
> Make an easy, concise way of reporting an error that fits in with
> a propagation scheme and I'd consider converting stuff.

error_setg(errp, "This is as simple as it gets, I'm afraid")

Snark aside, I acknowledge the pain of converting call chains to
propagate Error objects, having converted "a few" myself.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-20  8:53                                   ` Marc-André Lureau
@ 2016-10-20 10:45                                     ` Markus Armbruster
  0 siblings, 0 replies; 54+ messages in thread
From: Markus Armbruster @ 2016-10-20 10:45 UTC (permalink / raw)
  To: Marc-André Lureau
  Cc: Daniel P. Berrange, Dr. David Alan Gilbert, qemu-devel

Marc-André Lureau <marcandre.lureau@gmail.com> writes:

> Hi
>
> On Thu, Oct 20, 2016 at 11:38 AM Daniel P. Berrange <berrange@redhat.com>
> wrote:
>
>> On Wed, Oct 19, 2016 at 07:06:16PM +0100, Dr. David Alan Gilbert wrote:
>> > * Daniel P. Berrange (berrange@redhat.com) wrote:
[...]
>> > > If we make cur_mon a thread-local, then error_report() is equivalent
>> > > to fprintf(stderr) for the migration code, since the migration
>> > > code runs in a different thread thread, and so would always see
>> > > cur_mon == NULL.
>> >
>> > Yes, that would become safe; it does sound the best fix for the current
>> > worry.
>> >
>> > If we had that, then why not wire up error_report to pass errors back to QMP
>> > as well?
>>
>> You have a problem of context - if you have multiple monitors, how do
>> you know which to send the error back to if you're not in the event
>> loop thread, and thus cur_mon is NULL. With Marc-Andre's series which
>> allows proper async command processing it gets even harder, because
>> there's potentially many outstanding commands associated with a monitor
>> and you need to decide which the error should be given to.
>>
>
> I would think the contrary, it gets potentially easier as it may actually
> keep the associated monitor (the one that triggered the command) with the
> 'async' context.

Finding the monitor where the asynchronous command was issued is
feasible, but you cannot use it to store an asynchronous command's
error, because the currently running command is already using it.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-20 10:42                                       ` Markus Armbruster
@ 2016-10-20 11:01                                         ` Dr. David Alan Gilbert
  2016-10-20 11:10                                           ` Daniel P. Berrange
  2016-10-20 11:08                                         ` Daniel P. Berrange
  1 sibling, 1 reply; 54+ messages in thread
From: Dr. David Alan Gilbert @ 2016-10-20 11:01 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Daniel P. Berrange, qemu-devel

* Markus Armbruster (armbru@redhat.com) wrote:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:

> > The problem is that our way of propagating errors actively discourages 
> > people from adding errors and you're left with lots of useless invented errno's.
> > error_report makes it easy for people to scatter meaningful error messages
> > in at any point.
> >
> > Make an easy, concise way of reporting an error that fits in with
> > a propagation scheme and I'd consider converting stuff.
> 
> error_setg(errp, "This is as simple as it gets, I'm afraid")
> 
> Snark aside, I acknowledge the pain of converting call chains to
> propagate Error objects, having converted "a few" myself.

If you can get it down to that line it would be great!

But unfortunately it isn't:
  a) I have to make sure my cleanup path after that error_setg doesn't
     cause any other errors because that breaks the rule
     that I can't call error_setg twice.

  b) I've got to put the whole local_err/error_propagate stuff all over.

  c) We insist on allowing the Err ** pointer to be NULL that
     removes some potential simplifications.

No; just too messy - give me something I can just drop those
 error_setg's in and I'd be happy - it's everything around it
that I find truly painful.

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-20 10:42                                       ` Markus Armbruster
  2016-10-20 11:01                                         ` Dr. David Alan Gilbert
@ 2016-10-20 11:08                                         ` Daniel P. Berrange
  2016-10-20 11:57                                           ` Markus Armbruster
  1 sibling, 1 reply; 54+ messages in thread
From: Daniel P. Berrange @ 2016-10-20 11:08 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Dr. David Alan Gilbert, qemu-devel

On Thu, Oct 20, 2016 at 12:42:01PM +0200, Markus Armbruster wrote:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:
> 
> > * Daniel P. Berrange (berrange@redhat.com) wrote:
> >> On Thu, Oct 20, 2016 at 10:55:52AM +0200, Markus Armbruster wrote:
> >> 
> >> Our code has increasingly converted to propagate errors up the call
> >> chain, but having a mix of different error reporting approaches
> >> is increasingly causing pain.
> >> 
> >> eg a function which propagates errors wants to call into a function
> >> whicih uses error_report
> 
> When you add an Error * parameter to a function, you get to convert
> everything it calls.  Failure to do so is a bug, simple as that.
> 
> Likewise, when you add a new call to a function that takes an Error *
> parameter.

I agree its a bug - just that from a pragmatic POV it isn't always
practical to convert the entire call chain in one big bang - particularly
if you are working on shared infrastructure. For example, the
qemu-sockets.c APIs for connecting & listening on unix/tcp sockets
use Error * to report errors. Those APIs were used in the block
layer, migration, vnc, chardev and network backend code. It clearly
isn't practical to convert all those regions to code to correctly
propagate Error ** in one go. That's certainly the desired end
goal IMHO, but getting there is going to take a while.

This slow conversion has been going on for a long time now, so
maybe we need to make an aggressive push in strategic areas of
the code to stop it dragging out indefinitely. This is kind of
a general problem we have in QEMU - we have introduced lots of
new standard frameworks, but very rarely finish converting
code to use them.  This is why I did a big push to try to get
everything switched over to use QIOChannel, in the shortest
possible time, rather than doing only small bits over many
releases.

> >>                          - there's no nice way to propagate the error
> >> since it has already been reported.  If the function then wants to
> >> explicitly ignore the error, then that's impossible too,since it has
> >> already been reported.  Add in our code which doesn't use error_report
> >> and instead returns errno values, such as the block layer, and it gets
> >> even worse because if that calls a function which propagates an error,
> >> it has to throw away that useful error and return a useless invented
> >> errno value :(
> 
> Different bug: when you receive an Error, you have to either handle and
> consume it, or pass it on.  Throwing it away and returning an error code
> instead counts as neither, and is a bug.

Totally agree its a bug - just again hits the reality of having to
do major code conversions.

> >> IMHO continuing to convert code to propagate errors is the only way
> >> out of this swamp, because it provides the greatest flexibility to
> >> the callers of said functions to decide how to deal with the error.
> 
> I wouldn't call the whole situation a swamp.  error_report() is just
> fine in many places.  So is returning -errno.  We convert to Error when
> these techniques cease to be fine.  The swampy part is old code where
> these techniques have never been fine, or at least not for a while.  To
> be drained incrementally.

Realistically all the major backend subsystems (chardev, network, block,
ui and migration) need to be converted to Error ** propagation, since
they all ultimately call into some common code that reports Error **.
Very few places will end up being able to stick with -errno, or plain
error_report in the long term.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-20 11:01                                         ` Dr. David Alan Gilbert
@ 2016-10-20 11:10                                           ` Daniel P. Berrange
  2016-10-20 11:45                                             ` Markus Armbruster
  0 siblings, 1 reply; 54+ messages in thread
From: Daniel P. Berrange @ 2016-10-20 11:10 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: Markus Armbruster, qemu-devel

On Thu, Oct 20, 2016 at 12:01:08PM +0100, Dr. David Alan Gilbert wrote:
> * Markus Armbruster (armbru@redhat.com) wrote:
> > "Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:
> 
> > > The problem is that our way of propagating errors actively discourages 
> > > people from adding errors and you're left with lots of useless invented errno's.
> > > error_report makes it easy for people to scatter meaningful error messages
> > > in at any point.
> > >
> > > Make an easy, concise way of reporting an error that fits in with
> > > a propagation scheme and I'd consider converting stuff.
> > 
> > error_setg(errp, "This is as simple as it gets, I'm afraid")
> > 
> > Snark aside, I acknowledge the pain of converting call chains to
> > propagate Error objects, having converted "a few" myself.
> 
> If you can get it down to that line it would be great!
> 
> But unfortunately it isn't:
>   a) I have to make sure my cleanup path after that error_setg doesn't
>      cause any other errors because that breaks the rule
>      that I can't call error_setg twice.

>   b) I've got to put the whole local_err/error_propagate stuff all over.
> 
>   c) We insist on allowing the Err ** pointer to be NULL that
>      removes some potential simplifications.

This last point is one i really dislike - if we could mandate that
the error parameters was non-NULL, it will eliminate alot of the
need to use local_err/error_propagate, as you could just check
if (*errp), and so largely solve (b).


Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-20 11:10                                           ` Daniel P. Berrange
@ 2016-10-20 11:45                                             ` Markus Armbruster
  0 siblings, 0 replies; 54+ messages in thread
From: Markus Armbruster @ 2016-10-20 11:45 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: Dr. David Alan Gilbert, qemu-devel

"Daniel P. Berrange" <berrange@redhat.com> writes:

> On Thu, Oct 20, 2016 at 12:01:08PM +0100, Dr. David Alan Gilbert wrote:
>> * Markus Armbruster (armbru@redhat.com) wrote:
>> > "Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:
>> 
>> > > The problem is that our way of propagating errors actively discourages 
>> > > people from adding errors and you're left with lots of useless invented errno's.
>> > > error_report makes it easy for people to scatter meaningful error messages
>> > > in at any point.
>> > >
>> > > Make an easy, concise way of reporting an error that fits in with
>> > > a propagation scheme and I'd consider converting stuff.
>> > 
>> > error_setg(errp, "This is as simple as it gets, I'm afraid")
>> > 
>> > Snark aside, I acknowledge the pain of converting call chains to
>> > propagate Error objects, having converted "a few" myself.
>> 
>> If you can get it down to that line it would be great!
>> 
>> But unfortunately it isn't:
>>   a) I have to make sure my cleanup path after that error_setg doesn't
>>      cause any other errors because that breaks the rule
>>      that I can't call error_setg twice.

We disallowed setting a second error because that's a common symptom of
forgotten or botched error handling.

For what it's worth, g_error_set() does the same.

>>   b) I've got to put the whole local_err/error_propagate stuff all over.

Only when you need to examine the errors you're propagating.

>>   c) We insist on allowing the Err ** pointer to be NULL that
>>      removes some potential simplifications.
>
> This last point is one i really dislike - if we could mandate that
> the error parameters was non-NULL, it will eliminate alot of the
> need to use local_err/error_propagate, as you could just check
> if (*errp), and so largely solve (b).

Outlawing null errp shifts the pain to the places that ignore errors.
Whether that would be a win is not obvious.  A patch doing the change
might demonstrate it is, but it would be a lot of work, possibly for
naught.

For what it's worth, g_error_set() permits null, too.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-20 11:08                                         ` Daniel P. Berrange
@ 2016-10-20 11:57                                           ` Markus Armbruster
  2016-10-20 17:56                                             ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 54+ messages in thread
From: Markus Armbruster @ 2016-10-20 11:57 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: Dr. David Alan Gilbert, qemu-devel

"Daniel P. Berrange" <berrange@redhat.com> writes:

> On Thu, Oct 20, 2016 at 12:42:01PM +0200, Markus Armbruster wrote:
>> "Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:
>> 
>> > * Daniel P. Berrange (berrange@redhat.com) wrote:
>> >> On Thu, Oct 20, 2016 at 10:55:52AM +0200, Markus Armbruster wrote:
>> >> 
>> >> Our code has increasingly converted to propagate errors up the call
>> >> chain, but having a mix of different error reporting approaches
>> >> is increasingly causing pain.
>> >> 
>> >> eg a function which propagates errors wants to call into a function
>> >> whicih uses error_report
>> 
>> When you add an Error * parameter to a function, you get to convert
>> everything it calls.  Failure to do so is a bug, simple as that.
>> 
>> Likewise, when you add a new call to a function that takes an Error *
>> parameter.
>
> I agree its a bug - just that from a pragmatic POV it isn't always
> practical to convert the entire call chain in one big bang - particularly
> if you are working on shared infrastructure.

Been there, felt the pain.  I just wish we'd be more diligent adding
FIXME comments when we take shortcuts.

>                                              For example, the
> qemu-sockets.c APIs for connecting & listening on unix/tcp sockets
> use Error * to report errors. Those APIs were used in the block
> layer, migration, vnc, chardev and network backend code. It clearly
> isn't practical to convert all those regions to code to correctly
> propagate Error ** in one go. That's certainly the desired end
> goal IMHO, but getting there is going to take a while.
>
> This slow conversion has been going on for a long time now, so
> maybe we need to make an aggressive push in strategic areas of
> the code to stop it dragging out indefinitely. This is kind of
> a general problem we have in QEMU - we have introduced lots of
> new standard frameworks, but very rarely finish converting
> code to use them.

Yep.

>                    This is why I did a big push to try to get
> everything switched over to use QIOChannel, in the shortest
> possible time, rather than doing only small bits over many
> releases.

Much better when you can pull it off.

A common road block is having to update code nobody wants to touch, let
alone maintain, with no way to test.  The easy way out is to update just
the code you actually care about, then call the conversion process
"incremental".  Well, it's not incremental without commitments to
increment.

>> >>                          - there's no nice way to propagate the error
>> >> since it has already been reported.  If the function then wants to
>> >> explicitly ignore the error, then that's impossible too,since it has
>> >> already been reported.  Add in our code which doesn't use error_report
>> >> and instead returns errno values, such as the block layer, and it gets
>> >> even worse because if that calls a function which propagates an error,
>> >> it has to throw away that useful error and return a useless invented
>> >> errno value :(
>> 
>> Different bug: when you receive an Error, you have to either handle and
>> consume it, or pass it on.  Throwing it away and returning an error code
>> instead counts as neither, and is a bug.
>
> Totally agree its a bug - just again hits the reality of having to
> do major code conversions.
>
>> >> IMHO continuing to convert code to propagate errors is the only way
>> >> out of this swamp, because it provides the greatest flexibility to
>> >> the callers of said functions to decide how to deal with the error.
>> 
>> I wouldn't call the whole situation a swamp.  error_report() is just
>> fine in many places.  So is returning -errno.  We convert to Error when
>> these techniques cease to be fine.  The swampy part is old code where
>> these techniques have never been fine, or at least not for a while.  To
>> be drained incrementally.
>
> Realistically all the major backend subsystems (chardev, network, block,
> ui and migration) need to be converted to Error ** propagation, since
> they all ultimately call into some common code that reports Error **.

Infrastucture generally doesn't know how it's used, which means
error_report() is generally wrong there.  Sufficiently simple functions
can keep returning -errno, null, whatever, but the interesting stuff
needs to use Error.

> Very few places will end up being able to stick with -errno, or plain
> error_report in the long term.

Not sure about "very few".  Less than now.  We'll see.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-20  8:37                                 ` Daniel P. Berrange
  2016-10-20  8:53                                   ` Marc-André Lureau
@ 2016-10-20 16:56                                   ` Paolo Bonzini
  2016-10-21  9:12                                     ` Markus Armbruster
  2016-10-21  9:35                                     ` Daniel P. Berrange
  2016-10-20 16:59                                   ` Dr. David Alan Gilbert
  2 siblings, 2 replies; 54+ messages in thread
From: Paolo Bonzini @ 2016-10-20 16:56 UTC (permalink / raw)
  To: Daniel P. Berrange, Dr. David Alan Gilbert; +Cc: Markus Armbruster, qemu-devel



On 20/10/2016 10:37, Daniel P. Berrange wrote:
> You have a problem of context - if you have multiple monitors, how do
> you know which to send the error back to if you're not in the event
> loop thread, and thus cur_mon is NULL. With Marc-Andre's series which
> allows proper async command processing it gets even harder, because
> there's potentially many outstanding commands associated with a monitor
> and you need to decide which the error should be given to.

Why should it even consult the current monitor?  Shouldn't errors go to
all HMP monitors when QEMU falls back to error_report?

Paolo

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-20  8:37                                 ` Daniel P. Berrange
  2016-10-20  8:53                                   ` Marc-André Lureau
  2016-10-20 16:56                                   ` Paolo Bonzini
@ 2016-10-20 16:59                                   ` Dr. David Alan Gilbert
  2 siblings, 0 replies; 54+ messages in thread
From: Dr. David Alan Gilbert @ 2016-10-20 16:59 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: Markus Armbruster, qemu-devel

* Daniel P. Berrange (berrange@redhat.com) wrote:
> On Wed, Oct 19, 2016 at 07:06:16PM +0100, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrange (berrange@redhat.com) wrote:
> > > On Wed, Oct 19, 2016 at 02:16:05PM +0200, Markus Armbruster wrote:
> > > > "Daniel P. Berrange" <berrange@redhat.com> writes:
> > > > 
> > > > > On Wed, Oct 19, 2016 at 11:05:53AM +0100, Dr. David Alan Gilbert wrote:
> > > > >> 
> > > > >> We need a way to be able to report an error without plumbing error_setg
> > > > >> up the stack; if you're saying error_report isn't suitable then we
> > > > >> should just recommend we switch everything in migration back to
> > > > >> fprintf(stderr,
> > > > 
> > > > In the cases where error_report() isn't suitable, fprintf() is just as
> > > > unsuitable for the exact same reasons.
> > > > 
> > > > > Well both error_report() + fprintf  are broken from POV of anything
> > > > > using QMP. error_report() is slightly less broken for HMP,
> > > > 
> > > > error_report() is not broken at all for HMP code.  The trouble is code
> > > > that can't know whether it's running in a context where error_report()
> > > > is suitable.
> > > > 
> > > > >                                                            but doesn't
> > > > > help QMP.
> > > > 
> > > > Correct.
> > > > 
> > > > > In the short term we should just make error_report be  threadsafe in
> > > > > its usage of the monitor.
> > > > 
> > > > Any problems left once cur_mon is thread-local (which it should be
> > > > anyway)?
> > > 
> > > If we make cur_mon a thread-local, then error_report() is equivalent
> > > to fprintf(stderr) for the migration code, since the migration
> > > code runs in a different thread thread, and so would always see
> > > cur_mon == NULL.
> > 
> > Yes, that would become safe; it does sound the best fix for the current
> > worry.
> > 
> > If we had that, then why not wire up error_report to pass errors back to QMP
> > as well?
> 
> You have a problem of context - if you have multiple monitors, how do
> you know which to send the error back to if you're not in the event
> loop thread, and thus cur_mon is NULL. With Marc-Andre's series which
> allows proper async command processing it gets even harder, because
> there's potentially many outstanding commands associated with a monitor
> and you need to decide which the error should be given to.

Are those async commands operating in a particular coroutine? i.e. should
it be a coroutine-local pointer and if it's not in that then a thread-local?

Dave

> Regards,
> Daniel
> -- 
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-20 11:57                                           ` Markus Armbruster
@ 2016-10-20 17:56                                             ` Dr. David Alan Gilbert
  2016-10-21  9:06                                               ` Markus Armbruster
  0 siblings, 1 reply; 54+ messages in thread
From: Dr. David Alan Gilbert @ 2016-10-20 17:56 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Daniel P. Berrange, qemu-devel

* Markus Armbruster (armbru@redhat.com) wrote:
> "Daniel P. Berrange" <berrange@redhat.com> writes:
> 
> > On Thu, Oct 20, 2016 at 12:42:01PM +0200, Markus Armbruster wrote:
> >> "Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:
> >> 
> >> > * Daniel P. Berrange (berrange@redhat.com) wrote:
> >> >> On Thu, Oct 20, 2016 at 10:55:52AM +0200, Markus Armbruster wrote:
> >> >> 
> >> >> Our code has increasingly converted to propagate errors up the call
> >> >> chain, but having a mix of different error reporting approaches
> >> >> is increasingly causing pain.
> >> >> 
> >> >> eg a function which propagates errors wants to call into a function
> >> >> whicih uses error_report
> >> 
> >> When you add an Error * parameter to a function, you get to convert
> >> everything it calls.  Failure to do so is a bug, simple as that.
> >> 
> >> Likewise, when you add a new call to a function that takes an Error *
> >> parameter.
> >
> > I agree its a bug - just that from a pragmatic POV it isn't always
> > practical to convert the entire call chain in one big bang - particularly
> > if you are working on shared infrastructure.
> 
> Been there, felt the pain.  I just wish we'd be more diligent adding
> FIXME comments when we take shortcuts.
> 
> >                                              For example, the
> > qemu-sockets.c APIs for connecting & listening on unix/tcp sockets
> > use Error * to report errors. Those APIs were used in the block
> > layer, migration, vnc, chardev and network backend code. It clearly
> > isn't practical to convert all those regions to code to correctly
> > propagate Error ** in one go. That's certainly the desired end
> > goal IMHO, but getting there is going to take a while.
> >
> > This slow conversion has been going on for a long time now, so
> > maybe we need to make an aggressive push in strategic areas of
> > the code to stop it dragging out indefinitely. This is kind of
> > a general problem we have in QEMU - we have introduced lots of
> > new standard frameworks, but very rarely finish converting
> > code to use them.
> 
> Yep.
> 
> >                    This is why I did a big push to try to get
> > everything switched over to use QIOChannel, in the shortest
> > possible time, rather than doing only small bits over many
> > releases.
> 
> Much better when you can pull it off.
> 
> A common road block is having to update code nobody wants to touch, let
> alone maintain, with no way to test.  The easy way out is to update just
> the code you actually care about, then call the conversion process
> "incremental".  Well, it's not incremental without commitments to
> increment.

I don't think we should do this until we can come up with a scheme
that has less impact on the code being modified.

> >> >>                          - there's no nice way to propagate the error
> >> >> since it has already been reported.  If the function then wants to
> >> >> explicitly ignore the error, then that's impossible too,since it has
> >> >> already been reported.  Add in our code which doesn't use error_report
> >> >> and instead returns errno values, such as the block layer, and it gets
> >> >> even worse because if that calls a function which propagates an error,
> >> >> it has to throw away that useful error and return a useless invented
> >> >> errno value :(
> >> 
> >> Different bug: when you receive an Error, you have to either handle and
> >> consume it, or pass it on.  Throwing it away and returning an error code
> >> instead counts as neither, and is a bug.
> >
> > Totally agree its a bug - just again hits the reality of having to
> > do major code conversions.
> >
> >> >> IMHO continuing to convert code to propagate errors is the only way
> >> >> out of this swamp, because it provides the greatest flexibility to
> >> >> the callers of said functions to decide how to deal with the error.
> >> 
> >> I wouldn't call the whole situation a swamp.  error_report() is just
> >> fine in many places.  So is returning -errno.  We convert to Error when
> >> these techniques cease to be fine.  The swampy part is old code where
> >> these techniques have never been fine, or at least not for a while.  To
> >> be drained incrementally.
> >
> > Realistically all the major backend subsystems (chardev, network, block,
> > ui and migration) need to be converted to Error ** propagation, since
> > they all ultimately call into some common code that reports Error **.
> 
> Infrastucture generally doesn't know how it's used, which means
> error_report() is generally wrong there.  Sufficiently simple functions
> can keep returning -errno, null, whatever, but the interesting stuff
> needs to use Error.
> > Very few places will end up being able to stick with -errno, or plain
> > error_report in the long term.
> 
> Not sure about "very few".  Less than now.  We'll see.

I'd also prefer we got the very-few level; Migration used to be
characterised by getting a 'load of migration failed -22' and having
no clue in the logs to why; I've slowly fought back to be able
to get an error from the lowest level that caused the failure.
I want more of that, so that when someone gets a rare failure in the field
I can see why.

Dave

> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-20 17:56                                             ` Dr. David Alan Gilbert
@ 2016-10-21  9:06                                               ` Markus Armbruster
  2016-10-21  9:37                                                 ` Daniel P. Berrange
  2016-10-21  9:45                                                 ` Dr. David Alan Gilbert
  0 siblings, 2 replies; 54+ messages in thread
From: Markus Armbruster @ 2016-10-21  9:06 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel

"Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:

> * Markus Armbruster (armbru@redhat.com) wrote:
>> "Daniel P. Berrange" <berrange@redhat.com> writes:
[...]
>> > Realistically all the major backend subsystems (chardev, network, block,
>> > ui and migration) need to be converted to Error ** propagation, since
>> > they all ultimately call into some common code that reports Error **.
>> 
>> Infrastucture generally doesn't know how it's used, which means
>> error_report() is generally wrong there.  Sufficiently simple functions
>> can keep returning -errno, null, whatever, but the interesting stuff
>> needs to use Error.
>> > Very few places will end up being able to stick with -errno, or plain
>> > error_report in the long term.
>> 
>> Not sure about "very few".  Less than now.  We'll see.
>
> I'd also prefer we got the very-few level; Migration used to be
> characterised by getting a 'load of migration failed -22' and having
> no clue in the logs to why; I've slowly fought back to be able
> to get an error from the lowest level that caused the failure.
> I want more of that, so that when someone gets a rare failure in the field
> I can see why.

When it's about details that are only useful for debugging, logging
might be a practical alternative.  No excuse for shoddy error reporting,
of course.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-20 16:56                                   ` Paolo Bonzini
@ 2016-10-21  9:12                                     ` Markus Armbruster
  2016-10-21 21:06                                       ` Paolo Bonzini
  2016-10-21  9:35                                     ` Daniel P. Berrange
  1 sibling, 1 reply; 54+ messages in thread
From: Markus Armbruster @ 2016-10-21  9:12 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Daniel P. Berrange, Dr. David Alan Gilbert, qemu-devel

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 20/10/2016 10:37, Daniel P. Berrange wrote:
>> You have a problem of context - if you have multiple monitors, how do
>> you know which to send the error back to if you're not in the event
>> loop thread, and thus cur_mon is NULL. With Marc-Andre's series which
>> allows proper async command processing it gets even harder, because
>> there's potentially many outstanding commands associated with a monitor
>> and you need to decide which the error should be given to.
>
> Why should it even consult the current monitor?  Shouldn't errors go to
> all HMP monitors when QEMU falls back to error_report?

Maybe.  Copying error_report() output to human monitors is a courtesy to
users who fail to watch stderr.  But what about QMP?  I figure all we
could do there is sending some "Uh, here's an error message that may or
may not be related to anything you've been doing in this monitor, do
with it what you want" event.  Would that be useful?

Also, when you got an HMP monitor and stderr on the same tty (or
whatever), you'd want to suppress one of the two messages.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-20 16:56                                   ` Paolo Bonzini
  2016-10-21  9:12                                     ` Markus Armbruster
@ 2016-10-21  9:35                                     ` Daniel P. Berrange
  1 sibling, 0 replies; 54+ messages in thread
From: Daniel P. Berrange @ 2016-10-21  9:35 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Dr. David Alan Gilbert, Markus Armbruster, qemu-devel

On Thu, Oct 20, 2016 at 06:56:11PM +0200, Paolo Bonzini wrote:
> 
> 
> On 20/10/2016 10:37, Daniel P. Berrange wrote:
> > You have a problem of context - if you have multiple monitors, how do
> > you know which to send the error back to if you're not in the event
> > loop thread, and thus cur_mon is NULL. With Marc-Andre's series which
> > allows proper async command processing it gets even harder, because
> > there's potentially many outstanding commands associated with a monitor
> > and you need to decide which the error should be given to.
> 
> Why should it even consult the current monitor?  Shouldn't errors go to
> all HMP monitors when QEMU falls back to error_report?

I rather think not - say you're running two monitors, one for issuing
commands like hotplug, and in the other just using to query stats/info
about the VM. Why should an error about a failed hotplug command pollute
the output of the other monitor you're using for querying stats.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-21  9:06                                               ` Markus Armbruster
@ 2016-10-21  9:37                                                 ` Daniel P. Berrange
  2016-10-21 11:56                                                   ` Dr. David Alan Gilbert
  2016-10-21  9:45                                                 ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 54+ messages in thread
From: Daniel P. Berrange @ 2016-10-21  9:37 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Dr. David Alan Gilbert, qemu-devel

On Fri, Oct 21, 2016 at 11:06:21AM +0200, Markus Armbruster wrote:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:
> 
> > * Markus Armbruster (armbru@redhat.com) wrote:
> >> "Daniel P. Berrange" <berrange@redhat.com> writes:
> [...]
> >> > Realistically all the major backend subsystems (chardev, network, block,
> >> > ui and migration) need to be converted to Error ** propagation, since
> >> > they all ultimately call into some common code that reports Error **.
> >> 
> >> Infrastucture generally doesn't know how it's used, which means
> >> error_report() is generally wrong there.  Sufficiently simple functions
> >> can keep returning -errno, null, whatever, but the interesting stuff
> >> needs to use Error.
> >> > Very few places will end up being able to stick with -errno, or plain
> >> > error_report in the long term.
> >> 
> >> Not sure about "very few".  Less than now.  We'll see.
> >
> > I'd also prefer we got the very-few level; Migration used to be
> > characterised by getting a 'load of migration failed -22' and having
> > no clue in the logs to why; I've slowly fought back to be able
> > to get an error from the lowest level that caused the failure.
> > I want more of that, so that when someone gets a rare failure in the field
> > I can see why.
> 
> When it's about details that are only useful for debugging, logging
> might be a practical alternative.  No excuse for shoddy error reporting,
> of course.

FWIW, I would very much like it if incoming migration were able to report
the error failing to load migration stream back via the monitor, instead
of spitting them to stderr - the latter makes it hard for libvirt to
provide good error report to the users.

On the outgoing side we've now fed the errors back via the query-migrate
command - the same approach could be used on the incoming side.


Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-21  9:06                                               ` Markus Armbruster
  2016-10-21  9:37                                                 ` Daniel P. Berrange
@ 2016-10-21  9:45                                                 ` Dr. David Alan Gilbert
  1 sibling, 0 replies; 54+ messages in thread
From: Dr. David Alan Gilbert @ 2016-10-21  9:45 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: qemu-devel

* Markus Armbruster (armbru@redhat.com) wrote:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:
> 
> > * Markus Armbruster (armbru@redhat.com) wrote:
> >> "Daniel P. Berrange" <berrange@redhat.com> writes:
> [...]
> >> > Realistically all the major backend subsystems (chardev, network, block,
> >> > ui and migration) need to be converted to Error ** propagation, since
> >> > they all ultimately call into some common code that reports Error **.
> >> 
> >> Infrastucture generally doesn't know how it's used, which means
> >> error_report() is generally wrong there.  Sufficiently simple functions
> >> can keep returning -errno, null, whatever, but the interesting stuff
> >> needs to use Error.
> >> > Very few places will end up being able to stick with -errno, or plain
> >> > error_report in the long term.
> >> 
> >> Not sure about "very few".  Less than now.  We'll see.
> >
> > I'd also prefer we got the very-few level; Migration used to be
> > characterised by getting a 'load of migration failed -22' and having
> > no clue in the logs to why; I've slowly fought back to be able
> > to get an error from the lowest level that caused the failure.
> > I want more of that, so that when someone gets a rare failure in the field
> > I can see why.
> 
> When it's about details that are only useful for debugging, logging
> might be a practical alternative.  No excuse for shoddy error reporting,
> of course.

It's detail I want in the initial error report rather than future debug;
to give a real example:

 qemu-system-ppc64: 9223477658187168481 != 9223477658187151905
 qemu-system-ppc64: Failed to load cpu:env.insns_flags
 qemu-system-ppc64: error while loading state for instance 0x0 of device 'cpu'
 qemu-system-ppc64: load of migration failed: Invalid argument

I wouldn't want to log the value loaded from each field unless I'm in the
pain of really bad debugging; but when one goes wrong like this getting the
value mismatch, the field name, and the device that failed is what I want
in the logs.

One point here is that those lines each come from a different function 
as we come back up out of the failure;  the one at the bottom doesn't
have the information on the device it's loading, just that it's loading
an integer and it's not the expected value.

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-21  9:37                                                 ` Daniel P. Berrange
@ 2016-10-21 11:56                                                   ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 54+ messages in thread
From: Dr. David Alan Gilbert @ 2016-10-21 11:56 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: Markus Armbruster, qemu-devel

* Daniel P. Berrange (berrange@redhat.com) wrote:
> On Fri, Oct 21, 2016 at 11:06:21AM +0200, Markus Armbruster wrote:
> > "Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:
> > 
> > > * Markus Armbruster (armbru@redhat.com) wrote:
> > >> "Daniel P. Berrange" <berrange@redhat.com> writes:
> > [...]
> > >> > Realistically all the major backend subsystems (chardev, network, block,
> > >> > ui and migration) need to be converted to Error ** propagation, since
> > >> > they all ultimately call into some common code that reports Error **.
> > >> 
> > >> Infrastucture generally doesn't know how it's used, which means
> > >> error_report() is generally wrong there.  Sufficiently simple functions
> > >> can keep returning -errno, null, whatever, but the interesting stuff
> > >> needs to use Error.
> > >> > Very few places will end up being able to stick with -errno, or plain
> > >> > error_report in the long term.
> > >> 
> > >> Not sure about "very few".  Less than now.  We'll see.
> > >
> > > I'd also prefer we got the very-few level; Migration used to be
> > > characterised by getting a 'load of migration failed -22' and having
> > > no clue in the logs to why; I've slowly fought back to be able
> > > to get an error from the lowest level that caused the failure.
> > > I want more of that, so that when someone gets a rare failure in the field
> > > I can see why.
> > 
> > When it's about details that are only useful for debugging, logging
> > might be a practical alternative.  No excuse for shoddy error reporting,
> > of course.
> 
> FWIW, I would very much like it if incoming migration were able to report
> the error failing to load migration stream back via the monitor, instead
> of spitting them to stderr - the latter makes it hard for libvirt to
> provide good error report to the users.

As long as we also get them in the stderr log then that's fine.

Dave

> On the outgoing side we've now fed the errors back via the query-migrate
> command - the same approach could be used on the incoming side.
> 
> 
> Regards,
> Daniel
> -- 
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-21  9:12                                     ` Markus Armbruster
@ 2016-10-21 21:06                                       ` Paolo Bonzini
  2016-10-24  7:07                                         ` Markus Armbruster
  0 siblings, 1 reply; 54+ messages in thread
From: Paolo Bonzini @ 2016-10-21 21:06 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Dr. David Alan Gilbert, qemu-devel



On 21/10/2016 11:12, Markus Armbruster wrote:
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
>> On 20/10/2016 10:37, Daniel P. Berrange wrote:
>>> You have a problem of context - if you have multiple monitors, how do
>>> you know which to send the error back to if you're not in the event
>>> loop thread, and thus cur_mon is NULL. With Marc-Andre's series which
>>> allows proper async command processing it gets even harder, because
>>> there's potentially many outstanding commands associated with a monitor
>>> and you need to decide which the error should be given to.
>>
>> Why should it even consult the current monitor?  Shouldn't errors go to
>> all HMP monitors when QEMU falls back to error_report?
> 
> Maybe.  Copying error_report() output to human monitors is a courtesy to
> users who fail to watch stderr.  But what about QMP?  I figure all we
> could do there is sending some "Uh, here's an error message that may or
> may not be related to anything you've been doing in this monitor, do
> with it what you want" event.  Would that be useful?
> 
> Also, when you got an HMP monitor and stderr on the same tty (or
> whatever), you'd want to suppress one of the two messages.

If you have an HMP monitor you wouldn't print to stderr---stderr would
be reserved for the case where there are only QMP monitors.

Paolo

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Qemu-devel] chardev's and fd's in monitors
  2016-10-21 21:06                                       ` Paolo Bonzini
@ 2016-10-24  7:07                                         ` Markus Armbruster
  0 siblings, 0 replies; 54+ messages in thread
From: Markus Armbruster @ 2016-10-24  7:07 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Dr. David Alan Gilbert, qemu-devel

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 21/10/2016 11:12, Markus Armbruster wrote:
>> Paolo Bonzini <pbonzini@redhat.com> writes:
>> 
>>> On 20/10/2016 10:37, Daniel P. Berrange wrote:
>>>> You have a problem of context - if you have multiple monitors, how do
>>>> you know which to send the error back to if you're not in the event
>>>> loop thread, and thus cur_mon is NULL. With Marc-Andre's series which
>>>> allows proper async command processing it gets even harder, because
>>>> there's potentially many outstanding commands associated with a monitor
>>>> and you need to decide which the error should be given to.
>>>
>>> Why should it even consult the current monitor?  Shouldn't errors go to
>>> all HMP monitors when QEMU falls back to error_report?
>> 
>> Maybe.  Copying error_report() output to human monitors is a courtesy to
>> users who fail to watch stderr.  But what about QMP?  I figure all we
>> could do there is sending some "Uh, here's an error message that may or
>> may not be related to anything you've been doing in this monitor, do
>> with it what you want" event.  Would that be useful?
>> 
>> Also, when you got an HMP monitor and stderr on the same tty (or
>> whatever), you'd want to suppress one of the two messages.
>
> If you have an HMP monitor you wouldn't print to stderr---stderr would
> be reserved for the case where there are only QMP monitors.

I disagree.

* Users may run QEMU with the HMP monitor hidden away in the UI, but
  stderr in view.

* Management applications that capture and log stderr won't appreciate
  having to track HMP monitors, pick one, split output into monitor
  command output and errors from elsewhere (if that's even possible) so
  they can log the latter.

  While management applications shouldn't create HMP monitors for their
  own use, flexible ones should be okay with letting their human users
  have one, without breaking down completely.

^ permalink raw reply	[flat|nested] 54+ messages in thread

end of thread, other threads:[~2016-10-24  7:07 UTC | newest]

Thread overview: 54+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-12 19:15 [Qemu-devel] chardev's and fd's in monitors Dr. David Alan Gilbert
2016-10-12 20:02 ` Marc-André Lureau
2016-10-13 15:47   ` Dr. David Alan Gilbert
2016-10-18 10:04 ` Daniel P. Berrange
2016-10-18 11:32   ` Dr. David Alan Gilbert
2016-10-18 11:41     ` Marc-André Lureau
2016-10-18 11:44       ` Marc-André Lureau
2016-10-18 12:01     ` Daniel P. Berrange
2016-10-18 13:25       ` Dr. David Alan Gilbert
2016-10-18 13:35         ` Daniel P. Berrange
2016-10-18 13:52           ` Dr. David Alan Gilbert
2016-10-18 14:01             ` Daniel P. Berrange
2016-10-18 18:53               ` Dr. David Alan Gilbert
2016-10-19  7:45                 ` Daniel P. Berrange
2016-10-19  8:00               ` Markus Armbruster
2016-10-19  8:12                 ` Dr. David Alan Gilbert
2016-10-19  8:42                   ` Daniel P. Berrange
2016-10-19  9:48                     ` Markus Armbruster
2016-10-19 10:05                       ` Dr. David Alan Gilbert
2016-10-19 10:16                         ` Daniel P. Berrange
2016-10-19 12:16                           ` Markus Armbruster
2016-10-19 12:21                             ` Daniel P. Berrange
2016-10-19 18:06                               ` Dr. David Alan Gilbert
2016-10-20  8:37                                 ` Daniel P. Berrange
2016-10-20  8:53                                   ` Marc-André Lureau
2016-10-20 10:45                                     ` Markus Armbruster
2016-10-20 16:56                                   ` Paolo Bonzini
2016-10-21  9:12                                     ` Markus Armbruster
2016-10-21 21:06                                       ` Paolo Bonzini
2016-10-24  7:07                                         ` Markus Armbruster
2016-10-21  9:35                                     ` Daniel P. Berrange
2016-10-20 16:59                                   ` Dr. David Alan Gilbert
2016-10-20  8:55                                 ` Markus Armbruster
2016-10-20  9:03                                   ` Daniel P. Berrange
2016-10-20  9:58                                     ` Dr. David Alan Gilbert
2016-10-20 10:42                                       ` Markus Armbruster
2016-10-20 11:01                                         ` Dr. David Alan Gilbert
2016-10-20 11:10                                           ` Daniel P. Berrange
2016-10-20 11:45                                             ` Markus Armbruster
2016-10-20 11:08                                         ` Daniel P. Berrange
2016-10-20 11:57                                           ` Markus Armbruster
2016-10-20 17:56                                             ` Dr. David Alan Gilbert
2016-10-21  9:06                                               ` Markus Armbruster
2016-10-21  9:37                                                 ` Daniel P. Berrange
2016-10-21 11:56                                                   ` Dr. David Alan Gilbert
2016-10-21  9:45                                                 ` Dr. David Alan Gilbert
2016-10-19 12:26               ` Paolo Bonzini
2016-10-19 17:01                 ` Dr. David Alan Gilbert
2016-10-19 20:51                   ` Paolo Bonzini
2016-10-20  8:34                     ` Daniel P. Berrange
2016-10-18 12:08   ` Markus Armbruster
2016-10-18 12:13     ` Daniel P. Berrange
2016-10-18 12:43       ` Dr. David Alan Gilbert
2016-10-18 10:06 ` Daniel P. Berrange

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.