All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Marc-André Lureau" <marcandre.lureau@gmail.com>
To: Li Feng <fengli@smartx.com>
Cc: Li Feng <lifeng1519@gmail.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Yang Fan <fanyang@smartx.com>, Kyle Zhang <kyle@smartx.com>,
	"open list:All patches CC here" <qemu-devel@nongnu.org>
Subject: Re: [PATCH 3/4] char-socket: avoid double call tcp_chr_free_connection
Date: Mon, 20 Apr 2020 16:28:05 +0200	[thread overview]
Message-ID: <CAJ+F1CKjouQiJcNQCd4yYKAy3jdd4jD=vBe39uo-ssrk-p8ZaA@mail.gmail.com> (raw)
In-Reply-To: <CAHckoCziWv_QBZtoJ8+qBXPD4T4UUZARU5ES9e5ZPcSNAo0UgQ@mail.gmail.com>

Hi

On Mon, Apr 20, 2020 at 5:10 AM Li Feng <fengli@smartx.com> wrote:
>
> Hi Lureau,
>
> Thanks for your comment.
> I have spent some time to reproduce this crash, so that delay my
> reply, apologies.
>
> This is the description of this bug using gdb:
> 1. Set break points using gdb;
> b chardev/char-socket.c:410 if s->state == TCP_CHARDEV_STATE_DISCONNECTED
> b break vhost_user_write
>
> 2. hotplug a vhost-blk device:
> echo "chardev-add
> socket,id=spdk_vhost_blk2,path=/vhost-blk.0,reconnect=1 "| nc -U
> /tmp/a.socket
> echo "device_add
> vhost-user-blk-pci,chardev=spdk_vhost_blk2,id=spdk_vhost_blk2,num-queues=4"
> | nc -U /tmp/a.socket
>
> 3. Gdb will stop at vhost_user_write, and CTRL-C the vhost target.
> 4. Put 'c' to let gdb continue.
> You will see a crash:
>
> 192 login: qemu-system-x86_64: chardev/char-socket.c:125:
> qemu_chr_socket_restart_timer: Assertion `!s->reconnect_timer' failed.
>
> The related code:
> 394 static void tcp_chr_free_connection(Chardev *chr)
>  395 {
>  396     SocketChardev *s = SOCKET_CHARDEV(chr);
>  397     int i;
>  398
>  399     if (s->read_msgfds_num) {
>  400         for (i = 0; i < s->read_msgfds_num; i++) {
>  401             close(s->read_msgfds[i]);
>  402         }
>  403         g_free(s->read_msgfds);
>  404         s->read_msgfds = NULL;
>  405         s->read_msgfds_num = 0;
>  406     }
>  407
>  408     remove_hup_source(s);
>  409
>  410     tcp_set_msgfds(chr, NULL, 0);
>  411     remove_fd_in_watch(chr);
>  412     object_unref(OBJECT(s->sioc));
>  413     s->sioc = NULL;
>  414     object_unref(OBJECT(s->ioc));
>  415     s->ioc = NULL;
>  416     g_free(chr->filename);
>  417     chr->filename = NULL;
>  418     tcp_chr_change_state(s, TCP_CHARDEV_STATE_DISCONNECTED);
>  419 }
>
> #0  0x0000555555c1aae8 in tcp_chr_free_connection
> (chr=chr@entry=0x55555751ee00) at chardev/char-socket.c:410
> #1  0x0000555555c1aec8 in tcp_chr_disconnect_locked
> (chr=0x55555751ee00) at chardev/char-socket.c:479
> #2  0x0000555555c1af5d in tcp_chr_disconnect (chr=0x55555751ee00) at
> chardev/char-socket.c:497
> #3  0x0000555555c16715 in qemu_chr_fe_set_handlers
> (b=b@entry=0x5555575f3b48, fd_can_read=fd_can_read@entry=0x0,
> fd_read=fd_
> read@entry=0x0, fd_event=fd_event@entry=0x55555588e740
> <vhost_user_blk_event>, be_change=be_change@entry=0x0, opaque=opaque@
> entry=0x5555575f3990, context=0x0, set_open=true) at chardev/char-fe.c:304
> #4  0x000055555588e5c0 in vhost_user_blk_device_realize
> (dev=0x5555575f3990, errp=<optimized out>)
>     at /root/qemu-master/hw/block/vhost-user-blk.c:447
> #5  0x00005555558ca478 in virtio_device_realize (dev=0x5555575f3990,
> errp=0x7fffffffbb90)
>     at /root/qemu-master/hw/virtio/virtio.c:3615
> #6  0x00005555559dc842 in device_set_realized (obj=<optimized out>,
> value=<optimized out>, errp=0x7fffffffbd20)
>     at hw/core/qdev.c:891
> #7  0x0000555555b78e07 in property_set_bool (obj=0x5555575f3990,
> v=<optimized out>, name=<optimized out>, opaque=0x555556453
> 040, errp=0x7fffffffbd20) at qom/object.c:2238
> #8  0x0000555555b7db3f in object_property_set_qobject
> (obj=obj@entry=0x5555575f3990, value=value@entry=0x555556ab89c0, name=
> name@entry=0x555555d15308 "realized", errp=errp@entry=0x7fffffffbd20)
> at qom/qom-qobject.c:26
> #9  0x0000555555b7b2c5 in object_property_set_bool
> (obj=0x5555575f3990, value=<optimized out>, name=0x555555d15308
> "realized
> ", errp=0x7fffffffbd20) at qom/object.c:1390
> #10 0x0000555555af13b2 in virtio_pci_realize (pci_dev=0x5555575eb800,
> errp=0x7fffffffbd20) at hw/virtio/virtio-pci.c:1807
> #11 0x0000555555a7a14b in pci_qdev_realize (qdev=<optimized out>,
> errp=<optimized out>) at hw/pci/pci.c:2098
> #12 0x00005555559dc842 in device_set_realized (obj=<optimized out>,
> value=<optimized out>, errp=0x7fffffffbef8)
>     at hw/core/qdev.c:891
>
> (gdb) p s
> $23 = (SocketChardev *) 0x55555751ee00
> (gdb) p s->state
> $24 = TCP_CHARDEV_STATE_DISCONNECTED
>
> At 410 line of char-socket.c, the state has been changed to
> TCP_CHARDEV_STATE_DISCONNECTED before tcp_chr_change_state(s,
> TCP_CHARDEV_STATE_DISCONNECTED);
> It means the tcp_chr_disconnect_locked is called by more than one
> times, and the tcp socket has been freed before.
>
> The crash reason is s->reconnect_timer is set in the previous call of
> tcp_chr_disconnect_locked.
> The another approach fix maybe like this:
>
> diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> index 185fe38dda..94957de367 100644
> --- a/chardev/char-socket.c
> +++ b/chardev/char-socket.c
> @@ -486,7 +486,7 @@ static void tcp_chr_disconnect_locked(Chardev *chr)
> if (emit_close) {
> qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
> }
> - if (s->reconnect_time) {
> + if (s->reconnect_time && !s->reconnect_timer) {
> qemu_chr_socket_restart_timer(chr);
> }
> }
>
> The base code is here:
> 474 static void tcp_chr_disconnect_locked(Chardev *chr)
> 475 {
> 476 SocketChardev *s = SOCKET_CHARDEV(chr);
> 477 bool emit_close = s->state == TCP_CHARDEV_STATE_CONNECTED;
> 478
> 479 tcp_chr_free_connection(chr);
> 480
> 481 if (s->listener) {
> 482 qio_net_listener_set_client_func_full(s->listener, tcp_chr_accept,
> 483 chr, NULL, chr->gcontext);
> 484 }
> 485 update_disconnected_filename(s);
> 486 if (emit_close) {
> 487 qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
> 488 }
> 489 if (s->reconnect_time) {
> 490 qemu_chr_socket_restart_timer(chr);
> 491 }
> 492 }
>
> Which one is better?

+ if (s->reconnect_time && !s->reconnect_timer)

Looks like a good solution to me.

>
> I don't know how to inject an error at socket write when writing tests
> code(tests/test-char.c ).
> Any tips?

You could check the patch I just sent fixing a related issue:
https://patchew.org/QEMU/20200420112012.567284-1-marcandre.lureau@redhat.com/

In your case, it might be as easy as calling qemu_chr_fe_disconnect()
2 times on a reconnect socket, I'll let you figure out.

Thanks!

>
> Thanks,
> Feng Li
>
> Marc-André Lureau <marcandre.lureau@gmail.com> 于2020年4月15日周三 下午6:35写道:
>
> >
> > Hi
> >
> > On Wed, Apr 15, 2020 at 5:31 AM Li Feng <fengli@smartx.com> wrote:
> > >
> > > double call tcp_chr_free_connection generates a crash.
> > >
> > > Signed-off-by: Li Feng <fengli@smartx.com>
> >
> > Could you describe how you reach the "double call".
> >
> > Even better would be to write a test for it in tests/test-char.c
> >
> > thanks
> >
> > > ---
> > >  chardev/char-socket.c | 5 +++++
> > >  1 file changed, 5 insertions(+)
> > >
> > > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> > > index 185fe38dda..43aab8f24b 100644
> > > --- a/chardev/char-socket.c
> > > +++ b/chardev/char-socket.c
> > > @@ -476,6 +476,11 @@ static void tcp_chr_disconnect_locked(Chardev *chr)
> > >      SocketChardev *s = SOCKET_CHARDEV(chr);
> > >      bool emit_close = s->state == TCP_CHARDEV_STATE_CONNECTED;
> > >
> > > +    /* avoid re-enter when socket read/write error and disconnect event. */
> > > +    if (s->state == TCP_CHARDEV_STATE_DISCONNECTED) {
> > > +        return;
> > > +    }
> > > +
> > >      tcp_chr_free_connection(chr);
> > >
> > >      if (s->listener) {
> > > --
> > > 2.11.0
> > >
> > >
> > > --
> > > The SmartX email address is only for business purpose. Any sent message
> > > that is not related to the business is not authorized or permitted by
> > > SmartX.
> > > 本邮箱为北京志凌海纳科技有限公司(SmartX)工作邮箱. 如本邮箱发出的邮件与工作无关,该邮件未得到本公司任何的明示或默示的授权.
> > >
> > >
> > >
> >
> >
> > --
> > Marc-André Lureau
>
> --
> The SmartX email address is only for business purpose. Any sent message
> that is not related to the business is not authorized or permitted by
> SmartX.
> 本邮箱为北京志凌海纳科技有限公司(SmartX)工作邮箱. 如本邮箱发出的邮件与工作无关,该邮件未得到本公司任何的明示或默示的授权.
>
>


-- 
Marc-André Lureau


  reply	other threads:[~2020-04-20 14:29 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-15  3:28 [PATCH 0/4] fix crashes when inject errors to vhost-user-blk chardev Li Feng
2020-04-15  3:28 ` [PATCH 1/4] vhost-user-blk: delay vhost_user_blk_disconnect Li Feng
2020-04-14  1:24   ` Raphael Norwitz
2020-04-20 11:18     ` Li Feng
2020-04-15  3:28 ` [PATCH 2/4] vhost-user-blk: fix invalid memory access Li Feng
2020-04-14  1:44   ` Raphael Norwitz
2020-04-15  3:28 ` [PATCH 3/4] char-socket: avoid double call tcp_chr_free_connection Li Feng
2020-04-15 10:35   ` Marc-André Lureau
2020-04-20  3:10     ` Li Feng
2020-04-20 14:28       ` Marc-André Lureau [this message]
2020-04-28  8:51   ` [PATCH v2] char-socket: initialize reconnect timer only when the timer doesn't start Li Feng
     [not found]     ` <20200508051441.8143-1-fengli@smartx.com>
2020-05-08  5:14       ` [PATCH v3 1/2] io/channel: fix crash when qio_channel_readv_all return 0 Li Feng
2020-05-08 12:32         ` Marc-André Lureau
2020-05-08 12:42           ` Li Feng
2020-05-11 10:02             ` Daniel P. Berrangé
2020-05-11 12:09               ` Li Feng
2020-05-08  5:14       ` [PATCH v3 2/2] char-socket: initialize reconnect timer only when the timer doesn't start Li Feng
2020-05-08 11:00         ` Marc-André Lureau
2020-05-08 12:16           ` Li Feng
2020-05-08 12:28             ` Marc-André Lureau
2020-05-08 12:31               ` Li Feng
2020-05-11 12:39     ` [PATCH v4] " Li Feng
2020-05-21 15:15       ` Paolo Bonzini
2020-04-28  8:54   ` [PATCH v2] " Li Feng
2020-04-28  8:59     ` Li Feng
2020-04-28 11:05       ` Marc-André Lureau
2020-04-28 14:09         ` Feng Li
2020-04-15  3:28 ` [PATCH 4/4] vhost-user-blk: fix crash in realize process Li Feng
2020-04-14  1:54   ` Raphael Norwitz
     [not found]     ` <CAHckoCwsLwqNhVHK4PF+pzt5ExkurPNsWLw7ueRG8dfOUA2rXg@mail.gmail.com>
2020-04-17 10:07       ` Fwd: " Li Feng
2020-04-17  9:45 ` [PATCH 0/4] fix crashes when inject errors to vhost-user-blk chardev Michael S. Tsirkin
2020-04-17 10:11   ` Li Feng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJ+F1CKjouQiJcNQCd4yYKAy3jdd4jD=vBe39uo-ssrk-p8ZaA@mail.gmail.com' \
    --to=marcandre.lureau@gmail.com \
    --cc=fanyang@smartx.com \
    --cc=fengli@smartx.com \
    --cc=kyle@smartx.com \
    --cc=lifeng1519@gmail.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.