* [Qemu-devel] Abort in monitor_puts.
From: KONRAD Frédéric @ 2013-03-22  9:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: Anthony Liguori, fred.konrad

Hi,

There seems to be an issue with the current git tree (found by toddf on IRC).

To reproduce:

./qemu-system-x86_64 --monitor stdio --nographic

and type "?" at the monitor prompt; it should abort.

Here is the backtrace:

#0  0x00007f77cd347935 in raise () from /lib64/libc.so.6
#1  0x00007f77cd3490e8 in abort () from /lib64/libc.so.6
#2  0x00007f77cd3406a2 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007f77cd340752 in __assert_fail () from /lib64/libc.so.6
#4  0x00007f77d1c1f226 in monitor_puts (mon=<optimized out>,
     str=<optimized out>)
     at /home/konradf/Bureau/test-qemu/qemu/monitor.c:297
#5  monitor_puts (mon=0x7f77d2bb10f0, str=
     0x7fff8780b106 " qcow2.  The -n flag requests QEMU\n\t\t\tto reuse the image found in new-image-file, instead of\n\t\t\trecreating it from scratch.\n")
     at /home/konradf/Bureau/test-qemu/qemu/monitor.c:292
#6  0x00007f77d1c201b6 in monitor_vprintf (mon=0x7f77d2bb10f0,
     fmt=<optimized out>, ap=<optimized out>)
     at /home/konradf/Bureau/test-qemu/qemu/monitor.c:322
#7  0x00007f77d1c203d4 in monitor_printf (mon=mon@entry=0x7f77d2bb10f0,
     fmt=fmt@entry=0x7f77d1d13a63 "%s%s %s -- %s\n")
     at /home/konradf/Bureau/test-qemu/qemu/monitor.c:329
#8  0x00007f77d1c21b07 in help_cmd_dump (name=<optimized out>,
     prefix=<optimized out>, cmds=<optimized out>, mon=<optimized out>)
     at /home/konradf/Bureau/test-qemu/qemu/monitor.c:732
#9  help_cmd (mon=0x7f77d2bb10f0, name=<optimized out>)
     at /home/konradf/Bureau/test-qemu/qemu/monitor.c:742
#10 0x00007f77d1c256b9 in handle_user_command (mon=mon@entry=0x7f77d2bb10f0,
     cmdline=<optimized out>)
     at /home/konradf/Bureau/test-qemu/qemu/monitor.c:3985
#11 0x00007f77d1c25a5e in monitor_command_cb (mon=0x7f77d2bb10f0,
     cmdline=<optimized out>, opaque=<optimized out>)
     at /home/konradf/Bureau/test-qemu/qemu/monitor.c:4601
#12 0x00007f77d1b8fa4b in readline_handle_byte (rs=0x7f77d2bb1560,
     ch=<optimized out>) at readline.c:373
#13 0x00007f77d1c25787 in monitor_read (opaque=<optimized out>,
     buf=<optimized out>, size=<optimized out>)
     at /home/konradf/Bureau/test-qemu/qemu/monitor.c:4587
#14 0x00007f77d1b7d76d in qemu_chr_be_write (len=<optimized out>, buf=
     0x7fff8780c200 "\rL03", s=0x7f77d2ba88a0) at qemu-char.c:160
#15 fd_chr_read (chan=<optimized out>, cond=<optimized out>, opaque=
     0x7f77d2ba88a0) at qemu-char.c:767
#16 0x00007f77d10ba825 in g_main_context_dispatch ()
    from /lib64/libglib-2.0.so.0
#17 0x00007f77d1b56c39 in glib_pollfds_poll () at main-loop.c:187
#18 os_host_main_loop_wait (timeout=<optimized out>) at main-loop.c:207
#19 main_loop_wait (nonblocking=<optimized out>) at main-loop.c:443
#20 0x00007f77d1a3cd15 in main_loop () at vl.c:2038
#21 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
     at vl.c:4424

Fred.

* Re: [Qemu-devel] Abort in monitor_puts.
From: Luiz Capitulino @ 2013-03-22 20:50 UTC (permalink / raw)
  To: KONRAD Frédéric; +Cc: Anthony Liguori, qemu-devel, kraxel

On Fri, 22 Mar 2013 10:17:58 +0100
KONRAD Frédéric <fred.konrad@greensocs.com> wrote:

> Hi,
> 
> Seems there is an issue with the current git (found by toddf on IRC).
> 
> To reproduce:
> 
> ./qemu-system-x86_64 --monitor stdio --nographic
> 
> and put "?" it should abort.
> 
> Here is the backtrace:
> 
> #0  0x00007f77cd347935 in raise () from /lib64/libc.so.6
> #1  0x00007f77cd3490e8 in abort () from /lib64/libc.so.6
> #2  0x00007f77cd3406a2 in __assert_fail_base () from /lib64/libc.so.6
> #3  0x00007f77cd340752 in __assert_fail () from /lib64/libc.so.6
> #4  0x00007f77d1c1f226 in monitor_puts (mon=<optimized out>,
>      str=<optimized out>) at 

Yes, it's easy to reproduce. Bisect says:

f628926bb423fa8a7e0b114511400ea9df38b76a is the first bad commit
commit f628926bb423fa8a7e0b114511400ea9df38b76a
Author: Gerd Hoffmann <kraxel@redhat.com>
Date:   Tue Mar 19 10:57:56 2013 +0100

    fix monitor
    
    chardev flow control broke monitor, fix it by adding watch support.
    
    Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>

My impression is that monitor_puts() is being called in parallel.

* Re: [Qemu-devel] Abort in monitor_puts.
From: Luiz Capitulino @ 2013-03-22 21:39 UTC (permalink / raw)
  To: Luiz Capitulino
  Cc: Anthony Liguori, kraxel, qemu-devel, KONRAD Frédéric

On Fri, 22 Mar 2013 16:50:39 -0400
Luiz Capitulino <lcapitulino@redhat.com> wrote:

> On Fri, 22 Mar 2013 10:17:58 +0100
> KONRAD Frédéric <fred.konrad@greensocs.com> wrote:
> 
> > Hi,
> > 
> > Seems there is an issue with the current git (found by toddf on IRC).
> > 
> > To reproduce:
> > 
> > ./qemu-system-x86_64 --monitor stdio --nographic
> > 
> > and put "?" it should abort.
> > 
> > Here is the backtrace:
> > 
> > #0  0x00007f77cd347935 in raise () from /lib64/libc.so.6
> > #1  0x00007f77cd3490e8 in abort () from /lib64/libc.so.6
> > #2  0x00007f77cd3406a2 in __assert_fail_base () from /lib64/libc.so.6
> > #3  0x00007f77cd340752 in __assert_fail () from /lib64/libc.so.6
> > #4  0x00007f77d1c1f226 in monitor_puts (mon=<optimized out>,
> >      str=<optimized out>) at 
> 
> Yes, it's easy to reproduce. Bisect says:
> 
> f628926bb423fa8a7e0b114511400ea9df38b76a is the first bad commit
> commit f628926bb423fa8a7e0b114511400ea9df38b76a
> Author: Gerd Hoffmann <kraxel@redhat.com>
> Date:   Tue Mar 19 10:57:56 2013 +0100
> 
>     fix monitor
>     
>     chardev flow control broke monitor, fix it by adding watch support.
>     
>     Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
> 
> My impression is that monitor_puts() in being called in parallel.

Not at all.

What's happening is that qemu_chr_fe_write() returns < 0, so
mon->outbuf_index is not reset and the buffer stays full; this causes the
assert in monitor_puts() to trigger.
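
To make the failure mode concrete, here is a minimal sketch in C. It is a
simplified model, not the code in monitor.c: the struct, the buffer size and
all sketch_* names are hypothetical, and the place where the real code
asserts is represented by a -1 return so the behaviour can be observed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_OUTBUF_SIZE 1024   /* assumed size, for illustration only */

/* Hypothetical stand-in for a monitor with a fixed output buffer. */
struct sketch_mon {
    char outbuf[SKETCH_OUTBUF_SIZE];
    size_t outbuf_index;
};

/* Hypothetical write callback: bytes written, or < 0 on error. */
typedef int (*sketch_write_fn)(const char *buf, size_t len);

/* Flush that only frees space for bytes the callback accepted. */
static void sketch_flush(struct sketch_mon *mon, sketch_write_fn wr)
{
    int rc;

    if (mon->outbuf_index == 0) {
        return;
    }
    rc = wr(mon->outbuf, mon->outbuf_index);
    if (rc == (int)mon->outbuf_index) {
        mon->outbuf_index = 0;                      /* all flushed */
    } else if (rc > 0) {
        /* partial write: keep the unsent tail at the front */
        memmove(mon->outbuf, mon->outbuf + rc,
                mon->outbuf_index - (size_t)rc);
        mon->outbuf_index -= (size_t)rc;
    }
    /* rc < 0: nothing was consumed, the buffer is as full as before */
}

/* Append one character; returns -1 where an assert on free space would fire. */
static int sketch_putc(struct sketch_mon *mon, char c, sketch_write_fn wr)
{
    assert(mon != NULL && wr != NULL);
    if (mon->outbuf_index >= SKETCH_OUTBUF_SIZE - 1) {
        sketch_flush(mon, wr);
        if (mon->outbuf_index >= SKETCH_OUTBUF_SIZE - 1) {
            return -1;                              /* still full after flush */
        }
    }
    mon->outbuf[mon->outbuf_index++] = c;
    return 0;
}

/* Callback that models a write which fails every time. */
static int sketch_write_always_fails(const char *buf, size_t len)
{
    (void)buf;
    (void)len;
    return -1;
}
```

Feeding characters through sketch_putc() with sketch_write_always_fails
works until the buffer fills up; the next call finds the buffer still full
after the flush, which is the situation the assert catches.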

The previous version of monitor_flush() ignored errors, and everything
worked, so doing the same thing here fixes the problem :)

For some reason I'm unable to see what the error code is. Gerd, do you think
the patch below is reasonable? If it's not, how should we handle errors here?

diff --git a/monitor.c b/monitor.c
index cfb5d64..ecfe97c 100644
--- a/monitor.c
+++ b/monitor.c
@@ -274,12 +274,11 @@ void monitor_flush(Monitor *mon)
 
     if (mon && mon->outbuf_index != 0 && !mon->mux_out) {
         rc = qemu_chr_fe_write(mon->chr, mon->outbuf, mon->outbuf_index);
-        if (rc == mon->outbuf_index) {
+        if (rc == mon->outbuf_index || rc < 0) {
             /* all flushed */
             mon->outbuf_index = 0;
             return;
-        }
-        if (rc > 0) {
+        } else {
             /* partinal write */
             memmove(mon->outbuf, mon->outbuf + rc, mon->outbuf_index - rc);
             mon->outbuf_index -= rc;

* Re: [Qemu-devel] Abort in monitor_puts.
From: Gerd Hoffmann @ 2013-03-25  7:42 UTC (permalink / raw)
  To: Luiz Capitulino; +Cc: Anthony Liguori, qemu-devel, KONRAD Frédéric

On 03/22/13 22:39, Luiz Capitulino wrote:
> On Fri, 22 Mar 2013 16:50:39 -0400
> Luiz Capitulino <lcapitulino@redhat.com> wrote:
> 
>> On Fri, 22 Mar 2013 10:17:58 +0100
>> KONRAD Frédéric <fred.konrad@greensocs.com> wrote:
>>
>>> Hi,
>>>
>>> Seems there is an issue with the current git (found by toddf on IRC).
>>>
>>> To reproduce:
>>>
>>> ./qemu-system-x86_64 --monitor stdio --nographic
>>>
>>> and put "?" it should abort.
>>>
>>> Here is the backtrace:
>>>
>>> #0  0x00007f77cd347935 in raise () from /lib64/libc.so.6
>>> #1  0x00007f77cd3490e8 in abort () from /lib64/libc.so.6
>>> #2  0x00007f77cd3406a2 in __assert_fail_base () from /lib64/libc.so.6
>>> #3  0x00007f77cd340752 in __assert_fail () from /lib64/libc.so.6
>>> #4  0x00007f77d1c1f226 in monitor_puts (mon=<optimized out>,
>>>      str=<optimized out>) at 
>>
>> Yes, it's easy to reproduce. Bisect says:
>>
>> f628926bb423fa8a7e0b114511400ea9df38b76a is the first bad commit
>> commit f628926bb423fa8a7e0b114511400ea9df38b76a
>> Author: Gerd Hoffmann <kraxel@redhat.com>
>> Date:   Tue Mar 19 10:57:56 2013 +0100
>>
>>     fix monitor
>>     
>>     chardev flow control broke monitor, fix it by adding watch support.
>>     
>>     Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
>>
>> My impression is that monitor_puts() in being called in parallel.
> 
> Not all.
> 
> What's happening is that qemu_chr_fe_write() is returning < 0,
> mon->outbuf_index is not reset and is full, this causes the assert in
> monitor_puts() to trig.
> 
> The previous version of monitor_flush() ignores errors, and everything
> works, so doing the same thing here fixes the problem :)

No, ignoring errors breaks QMP because the output is no longer valid JSON
when part of it is cut off ...

> For some reason I'm unable to see what the error code is. Gerd, do you think
> the patch below is reasonable? If it's not, how should we handle errors here?

No, it's not.

Ignoring the error for errno = EAGAIN breaks flow control.

Ignoring the error for errno != EAGAIN (and maybe logging a debug
message) would be ok, but I suspect it's actually EAGAIN you get here.
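
A minimal sketch of that distinction, assuming a plain non-blocking file
descriptor and POSIX write(2) rather than QEMU's chardev layer. The helper
name sketch_flush_fd() and its return convention are made up for
illustration; treat it as one possible shape, not as the fix.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to write the first *len bytes of buf to fd.
 * Returns 1 if data is still pending and the caller should retry once
 * the descriptor is writable again, 0 if nothing is left in the buffer
 * (either written, or dropped after a hard error).
 */
static int sketch_flush_fd(int fd, char *buf, size_t *len)
{
    assert(buf != NULL && len != NULL);

    while (*len > 0) {
        ssize_t rc = write(fd, buf, *len);

        if (rc > 0) {
            /* full or partial write: keep only the unsent tail */
            memmove(buf, buf + rc, *len - (size_t)rc);
            *len -= (size_t)rc;
            continue;
        }
        if (rc == 0) {
            return 1;           /* no progress; retry later */
        }
        if (errno == EINTR) {
            continue;           /* interrupted; try again right away */
        }
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            return 1;           /* flow control: keep the data, retry later */
        }
        /* any other error: drop the pending output (could log here) */
        *len = 0;
        return 0;
    }
    return 0;
}
```

The point of the return value is that the EAGAIN case must leave the data
in place and tell the caller to wait, while other errors may discard it.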

Just go for a larger buffer?

cheers,
  Gerd

* Re: [Qemu-devel] Abort in monitor_puts.
From: Luiz Capitulino @ 2013-03-25 11:56 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: Anthony Liguori, qemu-devel, KONRAD Frédéric

On Mon, 25 Mar 2013 08:42:57 +0100
Gerd Hoffmann <kraxel@redhat.com> wrote:

> On 03/22/13 22:39, Luiz Capitulino wrote:
> > On Fri, 22 Mar 2013 16:50:39 -0400
> > Luiz Capitulino <lcapitulino@redhat.com> wrote:
> > 
> >> On Fri, 22 Mar 2013 10:17:58 +0100
> >> KONRAD Frédéric <fred.konrad@greensocs.com> wrote:
> >>
> >>> Hi,
> >>>
> >>> Seems there is an issue with the current git (found by toddf on IRC).
> >>>
> >>> To reproduce:
> >>>
> >>> ./qemu-system-x86_64 --monitor stdio --nographic
> >>>
> >>> and put "?" it should abort.
> >>>
> >>> Here is the backtrace:
> >>>
> >>> #0  0x00007f77cd347935 in raise () from /lib64/libc.so.6
> >>> #1  0x00007f77cd3490e8 in abort () from /lib64/libc.so.6
> >>> #2  0x00007f77cd3406a2 in __assert_fail_base () from /lib64/libc.so.6
> >>> #3  0x00007f77cd340752 in __assert_fail () from /lib64/libc.so.6
> >>> #4  0x00007f77d1c1f226 in monitor_puts (mon=<optimized out>,
> >>>      str=<optimized out>) at 
> >>
> >> Yes, it's easy to reproduce. Bisect says:
> >>
> >> f628926bb423fa8a7e0b114511400ea9df38b76a is the first bad commit
> >> commit f628926bb423fa8a7e0b114511400ea9df38b76a
> >> Author: Gerd Hoffmann <kraxel@redhat.com>
> >> Date:   Tue Mar 19 10:57:56 2013 +0100
> >>
> >>     fix monitor
> >>     
> >>     chardev flow control broke monitor, fix it by adding watch support.
> >>     
> >>     Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
> >>
> >> My impression is that monitor_puts() in being called in parallel.
> > 
> > Not all.
> > 
> > What's happening is that qemu_chr_fe_write() is returning < 0,
> > mon->outbuf_index is not reset and is full, this causes the assert in
> > monitor_puts() to trig.
> > 
> > The previous version of monitor_flush() ignores errors, and everything
> > works, so doing the same thing here fixes the problem :)
> 
> No, ignoring errors breaks qmp because the output isn't valid json any
> more when you cut off something ...

What do you mean by "when you cut off"? When the other side disconnects? Do we care?

> > For some reason I'm unable to see what the error code is. Gerd, do you think
> > the patch below is reasonable? If it's not, how should we handle errors here?
> 
> No, it's not.
> 
> Ignoring the error for errno = EAGAIN breaks flow control.
> 
> Ignoring the error for errno != EAGAIN (and maybe logging a debug
> message) would be ok, but I suspect it's actually EAGAIN you get here.
> 
> Just go for a larger buffer?

That's simple, but it's not a real fix. We hit that problem because
the help output is a large one. I'd guess that this is easily reproduced
with something like QIDL, which (iirc) generates long json output on QMP.

Looks like we need a dynamic buffer there.

Other ideas?

* Re: [Qemu-devel] Abort in monitor_puts.
From: Gerd Hoffmann @ 2013-03-25 20:11 UTC (permalink / raw)
  To: Luiz Capitulino; +Cc: Anthony Liguori, qemu-devel, KONRAD Frédéric

  Hi,

>>> The previous version of monitor_flush() ignores errors, and everything
>>> works, so doing the same thing here fixes the problem :)
>>
>> No, ignoring errors breaks qmp because the output isn't valid json any
>> more when you cut off something ...
> 
> What you mean "when you cut off"? When the other side disconnects? Do we care?

errno = EAGAIN means "kernel buffers full, can't accept your data atm,
try again later".  Simply ignoring this will throw away the data which
didn't fit, which to the receiver will look like someone cut off part of
the response ...
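
For anyone who wants to see this outside of QEMU, a small sketch in C that
provokes EAGAIN on purpose: it keeps writing to a non-blocking descriptor
that nobody reads from until the kernel refuses more data. Both helper
names are hypothetical; only fcntl(2) and write(2) are real interfaces.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Switch a descriptor to non-blocking mode; returns 0 on success. */
static int sketch_set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/*
 * Write zero-filled chunks to fd until write() fails.
 * Stores the number of bytes the kernel accepted in *accepted and
 * returns the errno value of the failing write().
 */
static int sketch_fill_until_error(int fd, long *accepted)
{
    char chunk[4096] = {0};

    assert(accepted != NULL);
    *accepted = 0;
    for (;;) {
        ssize_t rc = write(fd, chunk, sizeof chunk);

        if (rc < 0) {
            if (errno == EINTR) {
                continue;       /* interrupted by a signal; retry */
            }
            return errno;       /* EAGAIN once the kernel buffer is full */
        }
        *accepted += rc;
    }
}
```

Used on the write end of a pipe(2) whose read end is left alone, the loop
ends once the pipe buffer is full. Everything counted in *accepted was
taken by the kernel; the chunk that triggered EAGAIN was not, and is lost
unless the caller keeps it and retries.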

>> Just go for a larger buffer?
> 
> That's simple, but it's not a real fix. We hit that problem because
> the help output is a large one. I'd guess that this is easily reproduced
> with something like QIDL, which (iirc) generates long json output on QMP.
> 
> Looks like we need a dynamic buffer there.

Yes.

Or generate the data piecewise (i.e. in monitor_unblocked which is
called back when the kernel has room again).  Which is probably only
worth the trouble if we have _really_ big responses (megabytes).
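
A rough sketch of the dynamic-buffer idea in C, using only the standard
library. The sketch_dynbuf type and both functions are hypothetical and not
taken from the QEMU tree; the doubling growth policy and the initial
capacity of 64 bytes are arbitrary assumptions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable output buffer; zero-initialise before use. */
struct sketch_dynbuf {
    char *data;
    size_t len;     /* bytes queued for output */
    size_t cap;     /* bytes allocated */
};

/* Queue n bytes, growing the allocation as needed; 0 on success, -1 on OOM. */
static int sketch_dynbuf_append(struct sketch_dynbuf *b, const char *s, size_t n)
{
    assert(b != NULL);
    if (n == 0) {
        return 0;
    }
    if (n > b->cap - b->len) {
        size_t need = b->len + n;
        size_t cap = b->cap ? b->cap : 64;
        char *p;

        while (cap < need) {
            cap *= 2;           /* sketch only: no overflow check */
        }
        p = realloc(b->data, cap);
        if (p == NULL) {
            return -1;          /* old buffer is still valid */
        }
        b->data = p;
        b->cap = cap;
    }
    memcpy(b->data + b->len, s, n);
    b->len += n;
    return 0;
}

/* Drop the first n bytes, e.g. after a partial write accepted them. */
static void sketch_dynbuf_consume(struct sketch_dynbuf *b, size_t n)
{
    assert(b != NULL && n <= b->len);
    if (n == 0) {
        return;
    }
    memmove(b->data, b->data + n, b->len - n);
    b->len -= n;
}
```

With something like this, a flush that hits EAGAIN can leave the unsent
bytes queued and new output is simply appended, so there is no fixed limit
to assert on. The cost is unbounded memory if the reader never drains the
output, which is where generating the data piecewise would come in.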

cheers,
  Gerd
