* Re: [PATCH 1/4] vhost-user-blk: delay vhost_user_blk_disconnect
2020-04-15 3:28 ` [PATCH 1/4] vhost-user-blk: delay vhost_user_blk_disconnect Li Feng
@ 2020-04-14 1:24 ` Raphael Norwitz
2020-04-20 11:18 ` Li Feng
0 siblings, 1 reply; 32+ messages in thread
From: Raphael Norwitz @ 2020-04-14 1:24 UTC (permalink / raw)
To: Li Feng; +Cc: qemu-devel, mst
On Wed, Apr 15, 2020 at 11:28:23AM +0800, Li Feng wrote:
>
> switch (event) {
> case CHR_EVENT_OPENED:
> @@ -363,7 +376,16 @@ static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
> }
> break;
> case CHR_EVENT_CLOSED:
> - vhost_user_blk_disconnect(dev);
> + /*
> + * a close event may happen during a read/write, but vhost
> + * code assumes the vhost_dev remains setup, so delay the
> + * stop & clear to idle.
> + */
> + ctx = qemu_get_current_aio_context();
> +
> + qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, NULL,
> + NULL, NULL, NULL, false);
> + aio_bh_schedule_oneshot(ctx, vhost_user_blk_chr_closed_bh, opaque);
This seems a bit racy. What’s to stop the async operation from executing before
the next read?
If the issue is just that the vhost_dev state is being destroyed too early, why
don’t we rather move the vhost_dev_cleanup() call from vhost_user_blk_disconnect()
to vhost_user_blk_connect()? We may need to add some state to tell if this is the
first connect or a reconnect so we don’t call vhost_dev_cleanup() on initial
connect, but as long as we call vhost_dev_cleanup() before vhost_dev_init()
every time the device reconnects it shouldn’t matter that we keep that state
around.
Thoughts?
> break;
> case CHR_EVENT_BREAK:
> case CHR_EVENT_MUX_IN:
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/4] vhost-user-blk: fix invalid memory access
2020-04-15 3:28 ` [PATCH 2/4] vhost-user-blk: fix invalid memory access Li Feng
@ 2020-04-14 1:44 ` Raphael Norwitz
0 siblings, 0 replies; 32+ messages in thread
From: Raphael Norwitz @ 2020-04-14 1:44 UTC (permalink / raw)
To: Li Feng; +Cc: qemu-devel, mst
On Wed, Apr 15, 2020 at 11:28:24AM +0800, Li Feng wrote:
>
> when s->inflight is freed, vhost_dev_free_inflight may try to access
> s->inflight->addr, it will retrigger the following issue.
>
> ==7309==ERROR: AddressSanitizer: heap-use-after-free on address 0x604001020d18 at pc 0x555555ce948a bp 0x7fffffffb170 sp 0x7fffffffb160
> READ of size 8 at 0x604001020d18 thread T0
> #0 0x555555ce9489 in vhost_dev_free_inflight /root/smartx/qemu-el7/qemu-test/hw/virtio/vhost.c:1473
> #1 0x555555cd86eb in virtio_reset /root/smartx/qemu-el7/qemu-test/hw/virtio/virtio.c:1214
> #2 0x5555560d3eff in virtio_pci_reset hw/virtio/virtio-pci.c:1859
> #3 0x555555f2ac53 in device_set_realized hw/core/qdev.c:893
> #4 0x5555561d572c in property_set_bool qom/object.c:1925
> #5 0x5555561de8de in object_property_set_qobject qom/qom-qobject.c:27
> #6 0x5555561d99f4 in object_property_set_bool qom/object.c:1188
> #7 0x555555e50ae7 in qdev_device_add /root/smartx/qemu-el7/qemu-test/qdev-monitor.c:626
> #8 0x555555e51213 in qmp_device_add /root/smartx/qemu-el7/qemu-test/qdev-monitor.c:806
> #9 0x555555e8ff40 in hmp_device_add /root/smartx/qemu-el7/qemu-test/hmp.c:1951
> #10 0x555555be889a in handle_hmp_command /root/smartx/qemu-el7/qemu-test/monitor.c:3404
> #11 0x555555beac8b in monitor_command_cb /root/smartx/qemu-el7/qemu-test/monitor.c:4296
> #12 0x555556433eb7 in readline_handle_byte util/readline.c:393
> #13 0x555555be89ec in monitor_read /root/smartx/qemu-el7/qemu-test/monitor.c:4279
> #14 0x5555563285cc in tcp_chr_read chardev/char-socket.c:470
> #15 0x7ffff670b968 in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x4a968)
> #16 0x55555640727c in glib_pollfds_poll util/main-loop.c:215
> #17 0x55555640727c in os_host_main_loop_wait util/main-loop.c:238
> #18 0x55555640727c in main_loop_wait util/main-loop.c:497
> #19 0x555555b2d0bf in main_loop /root/smartx/qemu-el7/qemu-test/vl.c:2013
> #20 0x555555b2d0bf in main /root/smartx/qemu-el7/qemu-test/vl.c:4776
> #21 0x7fffdd2eb444 in __libc_start_main (/lib64/libc.so.6+0x22444)
> #22 0x555555b3767a (/root/smartx/qemu-el7/qemu-test/x86_64-softmmu/qemu-system-x86_64+0x5e367a)
>
> 0x604001020d18 is located 8 bytes inside of 40-byte region [0x604001020d10,0x604001020d38)
> freed by thread T0 here:
> #0 0x7ffff6f00508 in __interceptor_free (/lib64/libasan.so.4+0xde508)
> #1 0x7ffff671107d in g_free (/lib64/libglib-2.0.so.0+0x5007d)
>
> previously allocated by thread T0 here:
> #0 0x7ffff6f00a88 in __interceptor_calloc (/lib64/libasan.so.4+0xdea88)
> #1 0x7ffff6710fc5 in g_malloc0 (/lib64/libglib-2.0.so.0+0x4ffc5)
>
> SUMMARY: AddressSanitizer: heap-use-after-free /root/smartx/qemu-el7/qemu-test/hw/virtio/vhost.c:1473 in vhost_dev_free_inflight
> Shadow bytes around the buggy address:
> 0x0c08801fc150: fa fa 00 00 00 00 04 fa fa fa fd fd fd fd fd fa
> 0x0c08801fc160: fa fa fd fd fd fd fd fd fa fa 00 00 00 00 04 fa
> 0x0c08801fc170: fa fa 00 00 00 00 00 01 fa fa 00 00 00 00 04 fa
> 0x0c08801fc180: fa fa 00 00 00 00 00 01 fa fa 00 00 00 00 00 01
> 0x0c08801fc190: fa fa 00 00 00 00 00 fa fa fa 00 00 00 00 04 fa
> =>0x0c08801fc1a0: fa fa fd[fd]fd fd fd fa fa fa fd fd fd fd fd fa
> 0x0c08801fc1b0: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fa
> 0x0c08801fc1c0: fa fa 00 00 00 00 00 fa fa fa fd fd fd fd fd fd
> 0x0c08801fc1d0: fa fa 00 00 00 00 00 01 fa fa fd fd fd fd fd fa
> 0x0c08801fc1e0: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fd
> 0x0c08801fc1f0: fa fa 00 00 00 00 00 01 fa fa fd fd fd fd fd fa
> Shadow byte legend (one shadow byte represents 8 application bytes):
> Addressable: 00
> Partially addressable: 01 02 03 04 05 06 07
> Heap left redzone: fa
> Freed heap region: fd
> Stack left redzone: f1
> Stack mid redzone: f2
> Stack right redzone: f3
> Stack after return: f5
> Stack use after scope: f8
> Global redzone: f9
> Global init order: f6
> Poisoned by user: f7
> Container overflow: fc
> Array cookie: ac
> Intra object redzone: bb
> ASan internal: fe
> Left alloca redzone: ca
> Right alloca redzone: cb
> ==7309==ABORTING
>
> Signed-off-by: Li Feng <fengli@smartx.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
> ---
> hw/block/vhost-user-blk.c | 4 ++++
> hw/virtio/vhost.c | 2 +-
> 2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> index 776b9af3eb..19e79b96e4 100644
> --- a/hw/block/vhost-user-blk.c
> +++ b/hw/block/vhost-user-blk.c
> @@ -463,7 +463,9 @@ reconnect:
>
> virtio_err:
> g_free(s->vhost_vqs);
> + s->vhost_vqs = NULL;
> g_free(s->inflight);
> + s->inflight = NULL;
> for (i = 0; i < s->num_queues; i++) {
> virtio_delete_queue(s->virtqs[i]);
> }
> @@ -484,7 +486,9 @@ static void vhost_user_blk_device_unrealize(DeviceState *dev, Error **errp)
> vhost_dev_cleanup(&s->dev);
> vhost_dev_free_inflight(s->inflight);
> g_free(s->vhost_vqs);
> + s->vhost_vqs = NULL;
> g_free(s->inflight);
> + s->inflight = NULL;
>
> for (i = 0; i < s->num_queues; i++) {
> virtio_delete_queue(s->virtqs[i]);
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 01ebe12f28..aff98a0ede 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -1514,7 +1514,7 @@ void vhost_dev_set_config_notifier(struct vhost_dev *hdev,
>
> void vhost_dev_free_inflight(struct vhost_inflight *inflight)
> {
> - if (inflight->addr) {
> + if (inflight && inflight->addr) {
> qemu_memfd_free(inflight->addr, inflight->size, inflight->fd);
> inflight->addr = NULL;
> inflight->fd = -1;
> --
> 2.11.0
>
>
> --
> The SmartX email address is only for business purpose. Any sent message
> that is not related to the business is not authorized or permitted by
> SmartX.
> 本邮箱为北京志凌海纳科技有限公司(SmartX)工作邮箱. 如本邮箱发出的邮件与工作无关,该邮件未得到本公司任何的明示或默示的授权.
>
>
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 4/4] vhost-user-blk: fix crash in realize process
2020-04-15 3:28 ` [PATCH 4/4] vhost-user-blk: fix crash in realize process Li Feng
@ 2020-04-14 1:54 ` Raphael Norwitz
[not found] ` <CAHckoCwsLwqNhVHK4PF+pzt5ExkurPNsWLw7ueRG8dfOUA2rXg@mail.gmail.com>
0 siblings, 1 reply; 32+ messages in thread
From: Raphael Norwitz @ 2020-04-14 1:54 UTC (permalink / raw)
To: Li Feng; +Cc: qemu-devel, mst
Mostly looks good - just a few superficial notes.
On Wed, Apr 15, 2020 at 11:28:26AM +0800, Li Feng wrote:
> 1. set s->connected to true after vhost_dev_init;
> 2. call vhost_dev_get_config when s->connected is true, otherwise the
> hdev->host_ops will be nullptr.
You mean hdev->vhost_ops, right?
>
> Signed-off-by: Li Feng <fengli@smartx.com>
> ---
> hw/block/vhost-user-blk.c | 47 +++++++++++++++++++++++++----------------------
> 1 file changed, 25 insertions(+), 22 deletions(-)
> + /*
> + * set true util vhost_dev_init return ok, because CLOSE event may happen
> + * in vhost_dev_init routine.
> + */
I'm a little confused by this comment. Do you mean to say “wait until vhost_dev_init
succeeds to set connected to true, because a close event may happen while
vhost_dev_init is executing”?
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 0/4] fix crashes when inject errors to vhost-user-blk chardev
@ 2020-04-15 3:28 Li Feng
2020-04-15 3:28 ` [PATCH 1/4] vhost-user-blk: delay vhost_user_blk_disconnect Li Feng
` (4 more replies)
0 siblings, 5 replies; 32+ messages in thread
From: Li Feng @ 2020-04-15 3:28 UTC (permalink / raw)
To: kyle, Kevin Wolf, Marc-André Lureau, Max Reitz,
Michael S. Tsirkin, Paolo Bonzini, open list:Block layer core,
open list:All patches CC here, Raphael Norwitz
Cc: Li Feng
The following patches fix various crashes happened when injecting errors to
chardev unix domain socket.
The crashes are encountered when the socket is from connected to disconnected at
vhost-user-blk realize routine.
These crashes could be reproduced like this:
1. gdb break at vhost_user_write;
2. add a vhost-user-blk device through qmp;
3. when stop at vhost_user_write, kill the vhost-user-blk target;
3. let qemu continue running;
4. start vhost-user-blk;
5. see crash!
The 'CLOSE' event path is core trouble maker.
qemu_chr_fe_set_handlers
-> vhost_user_blk_event(OPEN)
-> vhost_user_blk_connect
-> vhost_dev_init
-> vhost_user_blk_event(CLOSE)
-> vhost_dev_cleanup
Li Feng (4):
vhost-user-blk: delay vhost_user_blk_disconnect
vhost-user-blk: fix invalid memory access
char-socket: avoid double call tcp_chr_free_connection
vhost-user-blk: fix crash in realize process
chardev/char-socket.c | 5 ++++
hw/block/vhost-user-blk.c | 75 ++++++++++++++++++++++++++++++++---------------
hw/virtio/vhost.c | 2 +-
3 files changed, 58 insertions(+), 24 deletions(-)
--
2.11.0
--
The SmartX email address is only for business purpose. Any sent message
that is not related to the business is not authorized or permitted by
SmartX.
本邮箱为北京志凌海纳科技有限公司(SmartX)工作邮箱. 如本邮箱发出的邮件与工作无关,该邮件未得到本公司任何的明示或默示的授权.
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 1/4] vhost-user-blk: delay vhost_user_blk_disconnect
2020-04-15 3:28 [PATCH 0/4] fix crashes when inject errors to vhost-user-blk chardev Li Feng
@ 2020-04-15 3:28 ` Li Feng
2020-04-14 1:24 ` Raphael Norwitz
2020-04-15 3:28 ` [PATCH 2/4] vhost-user-blk: fix invalid memory access Li Feng
` (3 subsequent siblings)
4 siblings, 1 reply; 32+ messages in thread
From: Li Feng @ 2020-04-15 3:28 UTC (permalink / raw)
To: kyle, Raphael Norwitz, Michael S. Tsirkin, Kevin Wolf, Max Reitz,
open list:Block layer core, open list:All patches CC here
Cc: Li Feng
Since commit b0a335e351103bf92f3f9d0bd5759311be8156ac, a socket write
may trigger a disconnect events, calling vhost_user_blk_disconnect() and
clearing all the vhost_dev strutures. Then the next socket read will
encounter an invalid pointer to vhost_dev.
Signed-off-by: Li Feng <fengli@smartx.com>
---
hw/block/vhost-user-blk.c | 24 +++++++++++++++++++++++-
1 file changed, 23 insertions(+), 1 deletion(-)
diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index 17df5338e7..776b9af3eb 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -349,11 +349,24 @@ static void vhost_user_blk_disconnect(DeviceState *dev)
vhost_dev_cleanup(&s->dev);
}
+static void vhost_user_blk_event(void *opaque, QEMUChrEvent event);
+
+static void vhost_user_blk_chr_closed_bh(void *opaque)
+{
+ DeviceState *dev = opaque;
+ VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+ VHostUserBlk *s = VHOST_USER_BLK(vdev);
+ vhost_user_blk_disconnect(dev);
+ qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, vhost_user_blk_event,
+ NULL, (void *)dev, NULL, true);
+}
+
static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
{
DeviceState *dev = opaque;
VirtIODevice *vdev = VIRTIO_DEVICE(dev);
VHostUserBlk *s = VHOST_USER_BLK(vdev);
+ AioContext *ctx;
switch (event) {
case CHR_EVENT_OPENED:
@@ -363,7 +376,16 @@ static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
}
break;
case CHR_EVENT_CLOSED:
- vhost_user_blk_disconnect(dev);
+ /*
+ * a close event may happen during a read/write, but vhost
+ * code assumes the vhost_dev remains setup, so delay the
+ * stop & clear to idle.
+ */
+ ctx = qemu_get_current_aio_context();
+
+ qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, NULL,
+ NULL, NULL, NULL, false);
+ aio_bh_schedule_oneshot(ctx, vhost_user_blk_chr_closed_bh, opaque);
break;
case CHR_EVENT_BREAK:
case CHR_EVENT_MUX_IN:
--
2.11.0
--
The SmartX email address is only for business purpose. Any sent message
that is not related to the business is not authorized or permitted by
SmartX.
本邮箱为北京志凌海纳科技有限公司(SmartX)工作邮箱. 如本邮箱发出的邮件与工作无关,该邮件未得到本公司任何的明示或默示的授权.
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 2/4] vhost-user-blk: fix invalid memory access
2020-04-15 3:28 [PATCH 0/4] fix crashes when inject errors to vhost-user-blk chardev Li Feng
2020-04-15 3:28 ` [PATCH 1/4] vhost-user-blk: delay vhost_user_blk_disconnect Li Feng
@ 2020-04-15 3:28 ` Li Feng
2020-04-14 1:44 ` Raphael Norwitz
2020-04-15 3:28 ` [PATCH 3/4] char-socket: avoid double call tcp_chr_free_connection Li Feng
` (2 subsequent siblings)
4 siblings, 1 reply; 32+ messages in thread
From: Li Feng @ 2020-04-15 3:28 UTC (permalink / raw)
To: kyle, Raphael Norwitz, Michael S. Tsirkin, Kevin Wolf, Max Reitz,
open list:Block layer core, open list:All patches CC here
Cc: Li Feng
when s->inflight is freed, vhost_dev_free_inflight may try to access
s->inflight->addr, it will retrigger the following issue.
==7309==ERROR: AddressSanitizer: heap-use-after-free on address 0x604001020d18 at pc 0x555555ce948a bp 0x7fffffffb170 sp 0x7fffffffb160
READ of size 8 at 0x604001020d18 thread T0
#0 0x555555ce9489 in vhost_dev_free_inflight /root/smartx/qemu-el7/qemu-test/hw/virtio/vhost.c:1473
#1 0x555555cd86eb in virtio_reset /root/smartx/qemu-el7/qemu-test/hw/virtio/virtio.c:1214
#2 0x5555560d3eff in virtio_pci_reset hw/virtio/virtio-pci.c:1859
#3 0x555555f2ac53 in device_set_realized hw/core/qdev.c:893
#4 0x5555561d572c in property_set_bool qom/object.c:1925
#5 0x5555561de8de in object_property_set_qobject qom/qom-qobject.c:27
#6 0x5555561d99f4 in object_property_set_bool qom/object.c:1188
#7 0x555555e50ae7 in qdev_device_add /root/smartx/qemu-el7/qemu-test/qdev-monitor.c:626
#8 0x555555e51213 in qmp_device_add /root/smartx/qemu-el7/qemu-test/qdev-monitor.c:806
#9 0x555555e8ff40 in hmp_device_add /root/smartx/qemu-el7/qemu-test/hmp.c:1951
#10 0x555555be889a in handle_hmp_command /root/smartx/qemu-el7/qemu-test/monitor.c:3404
#11 0x555555beac8b in monitor_command_cb /root/smartx/qemu-el7/qemu-test/monitor.c:4296
#12 0x555556433eb7 in readline_handle_byte util/readline.c:393
#13 0x555555be89ec in monitor_read /root/smartx/qemu-el7/qemu-test/monitor.c:4279
#14 0x5555563285cc in tcp_chr_read chardev/char-socket.c:470
#15 0x7ffff670b968 in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x4a968)
#16 0x55555640727c in glib_pollfds_poll util/main-loop.c:215
#17 0x55555640727c in os_host_main_loop_wait util/main-loop.c:238
#18 0x55555640727c in main_loop_wait util/main-loop.c:497
#19 0x555555b2d0bf in main_loop /root/smartx/qemu-el7/qemu-test/vl.c:2013
#20 0x555555b2d0bf in main /root/smartx/qemu-el7/qemu-test/vl.c:4776
#21 0x7fffdd2eb444 in __libc_start_main (/lib64/libc.so.6+0x22444)
#22 0x555555b3767a (/root/smartx/qemu-el7/qemu-test/x86_64-softmmu/qemu-system-x86_64+0x5e367a)
0x604001020d18 is located 8 bytes inside of 40-byte region [0x604001020d10,0x604001020d38)
freed by thread T0 here:
#0 0x7ffff6f00508 in __interceptor_free (/lib64/libasan.so.4+0xde508)
#1 0x7ffff671107d in g_free (/lib64/libglib-2.0.so.0+0x5007d)
previously allocated by thread T0 here:
#0 0x7ffff6f00a88 in __interceptor_calloc (/lib64/libasan.so.4+0xdea88)
#1 0x7ffff6710fc5 in g_malloc0 (/lib64/libglib-2.0.so.0+0x4ffc5)
SUMMARY: AddressSanitizer: heap-use-after-free /root/smartx/qemu-el7/qemu-test/hw/virtio/vhost.c:1473 in vhost_dev_free_inflight
Shadow bytes around the buggy address:
0x0c08801fc150: fa fa 00 00 00 00 04 fa fa fa fd fd fd fd fd fa
0x0c08801fc160: fa fa fd fd fd fd fd fd fa fa 00 00 00 00 04 fa
0x0c08801fc170: fa fa 00 00 00 00 00 01 fa fa 00 00 00 00 04 fa
0x0c08801fc180: fa fa 00 00 00 00 00 01 fa fa 00 00 00 00 00 01
0x0c08801fc190: fa fa 00 00 00 00 00 fa fa fa 00 00 00 00 04 fa
=>0x0c08801fc1a0: fa fa fd[fd]fd fd fd fa fa fa fd fd fd fd fd fa
0x0c08801fc1b0: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fa
0x0c08801fc1c0: fa fa 00 00 00 00 00 fa fa fa fd fd fd fd fd fd
0x0c08801fc1d0: fa fa 00 00 00 00 00 01 fa fa fd fd fd fd fd fa
0x0c08801fc1e0: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fd
0x0c08801fc1f0: fa fa 00 00 00 00 00 01 fa fa fd fd fd fd fd fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==7309==ABORTING
Signed-off-by: Li Feng <fengli@smartx.com>
---
hw/block/vhost-user-blk.c | 4 ++++
hw/virtio/vhost.c | 2 +-
2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index 776b9af3eb..19e79b96e4 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -463,7 +463,9 @@ reconnect:
virtio_err:
g_free(s->vhost_vqs);
+ s->vhost_vqs = NULL;
g_free(s->inflight);
+ s->inflight = NULL;
for (i = 0; i < s->num_queues; i++) {
virtio_delete_queue(s->virtqs[i]);
}
@@ -484,7 +486,9 @@ static void vhost_user_blk_device_unrealize(DeviceState *dev, Error **errp)
vhost_dev_cleanup(&s->dev);
vhost_dev_free_inflight(s->inflight);
g_free(s->vhost_vqs);
+ s->vhost_vqs = NULL;
g_free(s->inflight);
+ s->inflight = NULL;
for (i = 0; i < s->num_queues; i++) {
virtio_delete_queue(s->virtqs[i]);
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 01ebe12f28..aff98a0ede 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1514,7 +1514,7 @@ void vhost_dev_set_config_notifier(struct vhost_dev *hdev,
void vhost_dev_free_inflight(struct vhost_inflight *inflight)
{
- if (inflight->addr) {
+ if (inflight && inflight->addr) {
qemu_memfd_free(inflight->addr, inflight->size, inflight->fd);
inflight->addr = NULL;
inflight->fd = -1;
--
2.11.0
--
The SmartX email address is only for business purpose. Any sent message
that is not related to the business is not authorized or permitted by
SmartX.
本邮箱为北京志凌海纳科技有限公司(SmartX)工作邮箱. 如本邮箱发出的邮件与工作无关,该邮件未得到本公司任何的明示或默示的授权.
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 3/4] char-socket: avoid double call tcp_chr_free_connection
2020-04-15 3:28 [PATCH 0/4] fix crashes when inject errors to vhost-user-blk chardev Li Feng
2020-04-15 3:28 ` [PATCH 1/4] vhost-user-blk: delay vhost_user_blk_disconnect Li Feng
2020-04-15 3:28 ` [PATCH 2/4] vhost-user-blk: fix invalid memory access Li Feng
@ 2020-04-15 3:28 ` Li Feng
2020-04-15 10:35 ` Marc-André Lureau
` (2 more replies)
2020-04-15 3:28 ` [PATCH 4/4] vhost-user-blk: fix crash in realize process Li Feng
2020-04-17 9:45 ` [PATCH 0/4] fix crashes when inject errors to vhost-user-blk chardev Michael S. Tsirkin
4 siblings, 3 replies; 32+ messages in thread
From: Li Feng @ 2020-04-15 3:28 UTC (permalink / raw)
To: kyle, Marc-André Lureau, Paolo Bonzini,
open list:All patches CC here
Cc: Li Feng
double call tcp_chr_free_connection generates a crash.
Signed-off-by: Li Feng <fengli@smartx.com>
---
chardev/char-socket.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 185fe38dda..43aab8f24b 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -476,6 +476,11 @@ static void tcp_chr_disconnect_locked(Chardev *chr)
SocketChardev *s = SOCKET_CHARDEV(chr);
bool emit_close = s->state == TCP_CHARDEV_STATE_CONNECTED;
+ /* avoid re-enter when socket read/write error and disconnect event. */
+ if (s->state == TCP_CHARDEV_STATE_DISCONNECTED) {
+ return;
+ }
+
tcp_chr_free_connection(chr);
if (s->listener) {
--
2.11.0
--
The SmartX email address is only for business purpose. Any sent message
that is not related to the business is not authorized or permitted by
SmartX.
本邮箱为北京志凌海纳科技有限公司(SmartX)工作邮箱. 如本邮箱发出的邮件与工作无关,该邮件未得到本公司任何的明示或默示的授权.
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 4/4] vhost-user-blk: fix crash in realize process
2020-04-15 3:28 [PATCH 0/4] fix crashes when inject errors to vhost-user-blk chardev Li Feng
` (2 preceding siblings ...)
2020-04-15 3:28 ` [PATCH 3/4] char-socket: avoid double call tcp_chr_free_connection Li Feng
@ 2020-04-15 3:28 ` Li Feng
2020-04-14 1:54 ` Raphael Norwitz
2020-04-17 9:45 ` [PATCH 0/4] fix crashes when inject errors to vhost-user-blk chardev Michael S. Tsirkin
4 siblings, 1 reply; 32+ messages in thread
From: Li Feng @ 2020-04-15 3:28 UTC (permalink / raw)
To: kyle, Raphael Norwitz, Michael S. Tsirkin, Kevin Wolf, Max Reitz,
open list:Block layer core, open list:All patches CC here
Cc: Li Feng
The crash could be reproduced like this:
1. break vhost_user_write;
2. kill the vhost-user-blk target;
3. let qemu continue running;
4. start vhost-user-blk;
5. see crash!
This fix makes changes:
1. set s->connected to true after vhost_dev_init;
2. call vhost_dev_get_config when s->connected is true, otherwise the
hdev->host_ops will be nullptr.
Signed-off-by: Li Feng <fengli@smartx.com>
---
hw/block/vhost-user-blk.c | 47 +++++++++++++++++++++++++----------------------
1 file changed, 25 insertions(+), 22 deletions(-)
diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index 19e79b96e4..35386b7cb7 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -303,8 +303,6 @@ static int vhost_user_blk_connect(DeviceState *dev)
if (s->connected) {
return 0;
}
- s->connected = true;
-
s->dev.nvqs = s->num_queues;
s->dev.vqs = s->vhost_vqs;
s->dev.vq_index = 0;
@@ -318,6 +316,11 @@ static int vhost_user_blk_connect(DeviceState *dev)
strerror(-ret));
return ret;
}
+ /*
+ * set true util vhost_dev_init return ok, because CLOSE event may happen
+ * in vhost_dev_init routine.
+ */
+ s->connected = true;
/* restore vhost state */
if (virtio_device_started(vdev, vdev->status)) {
@@ -401,6 +404,7 @@ static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
VHostUserBlk *s = VHOST_USER_BLK(vdev);
Error *err = NULL;
int i, ret;
+ bool reconnect;
if (!s->chardev.chr) {
error_setg(errp, "vhost-user-blk: chardev is mandatory");
@@ -433,27 +437,26 @@ static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
s->inflight = g_new0(struct vhost_inflight, 1);
s->vhost_vqs = g_new0(struct vhost_virtqueue, s->num_queues);
s->connected = false;
+ reconnect = false;
- qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, vhost_user_blk_event,
- NULL, (void *)dev, NULL, true);
-
-reconnect:
- if (qemu_chr_fe_wait_connected(&s->chardev, &err) < 0) {
- error_report_err(err);
- goto virtio_err;
- }
-
- /* check whether vhost_user_blk_connect() failed or not */
- if (!s->connected) {
- goto reconnect;
- }
-
- ret = vhost_dev_get_config(&s->dev, (uint8_t *)&s->blkcfg,
- sizeof(struct virtio_blk_config));
- if (ret < 0) {
- error_report("vhost-user-blk: get block config failed");
- goto reconnect;
- }
+ do {
+ if (qemu_chr_fe_wait_connected(&s->chardev, &err) < 0) {
+ error_report_err(err);
+ goto virtio_err;
+ }
+ qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, vhost_user_blk_event,
+ NULL, (void *)dev, NULL, true);
+ if (s->connected) {
+ ret = vhost_dev_get_config(&s->dev, (uint8_t *)&s->blkcfg,
+ sizeof(struct virtio_blk_config));
+ if (ret < 0) {
+ error_report("vhost-user-blk: get block config failed");
+ reconnect = true;
+ } else {
+ reconnect = false;
+ }
+ }
+ } while (!s->connected || reconnect);
if (s->blkcfg.num_queues != s->num_queues) {
s->blkcfg.num_queues = s->num_queues;
--
2.11.0
--
The SmartX email address is only for business purpose. Any sent message
that is not related to the business is not authorized or permitted by
SmartX.
本邮箱为北京志凌海纳科技有限公司(SmartX)工作邮箱. 如本邮箱发出的邮件与工作无关,该邮件未得到本公司任何的明示或默示的授权.
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH 3/4] char-socket: avoid double call tcp_chr_free_connection
2020-04-15 3:28 ` [PATCH 3/4] char-socket: avoid double call tcp_chr_free_connection Li Feng
@ 2020-04-15 10:35 ` Marc-André Lureau
2020-04-20 3:10 ` Li Feng
2020-04-28 8:51 ` [PATCH v2] char-socket: initialize reconnect timer only when the timer doesn't start Li Feng
2020-04-28 8:54 ` [PATCH v2] " Li Feng
2 siblings, 1 reply; 32+ messages in thread
From: Marc-André Lureau @ 2020-04-15 10:35 UTC (permalink / raw)
To: Li Feng; +Cc: Paolo Bonzini, kyle, open list:All patches CC here
Hi
On Wed, Apr 15, 2020 at 5:31 AM Li Feng <fengli@smartx.com> wrote:
>
> double call tcp_chr_free_connection generates a crash.
>
> Signed-off-by: Li Feng <fengli@smartx.com>
Could you describe how you reach the "double call".
Even better would be to write a test for it in tests/test-char.c
thanks
> ---
> chardev/char-socket.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> index 185fe38dda..43aab8f24b 100644
> --- a/chardev/char-socket.c
> +++ b/chardev/char-socket.c
> @@ -476,6 +476,11 @@ static void tcp_chr_disconnect_locked(Chardev *chr)
> SocketChardev *s = SOCKET_CHARDEV(chr);
> bool emit_close = s->state == TCP_CHARDEV_STATE_CONNECTED;
>
> + /* avoid re-enter when socket read/write error and disconnect event. */
> + if (s->state == TCP_CHARDEV_STATE_DISCONNECTED) {
> + return;
> + }
> +
> tcp_chr_free_connection(chr);
>
> if (s->listener) {
> --
> 2.11.0
>
>
> --
> The SmartX email address is only for business purpose. Any sent message
> that is not related to the business is not authorized or permitted by
> SmartX.
> 本邮箱为北京志凌海纳科技有限公司(SmartX)工作邮箱. 如本邮箱发出的邮件与工作无关,该邮件未得到本公司任何的明示或默示的授权.
>
>
>
--
Marc-André Lureau
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 0/4] fix crashes when inject errors to vhost-user-blk chardev
2020-04-15 3:28 [PATCH 0/4] fix crashes when inject errors to vhost-user-blk chardev Li Feng
` (3 preceding siblings ...)
2020-04-15 3:28 ` [PATCH 4/4] vhost-user-blk: fix crash in realize process Li Feng
@ 2020-04-17 9:45 ` Michael S. Tsirkin
2020-04-17 10:11 ` Li Feng
4 siblings, 1 reply; 32+ messages in thread
From: Michael S. Tsirkin @ 2020-04-17 9:45 UTC (permalink / raw)
To: Li Feng
Cc: Kevin Wolf, open list:Block layer core,
open list:All patches CC here, Max Reitz, kyle, Paolo Bonzini,
Marc-André Lureau, Raphael Norwitz
On Wed, Apr 15, 2020 at 11:28:22AM +0800, Li Feng wrote:
> The following patches fix various crashes happened when injecting errors to
> chardev unix domain socket.
I think these are mostly unrelated fixes right?
If so pls post them separately so I know I can just apply
some and wait for others to get acked.
> The crashes are encountered when the socket is from connected to disconnected at
> vhost-user-blk realize routine.
>
> These crashes could be reproduced like this:
> 1. gdb break at vhost_user_write;
> 2. add a vhost-user-blk device through qmp;
> 3. when stop at vhost_user_write, kill the vhost-user-blk target;
> 3. let qemu continue running;
> 4. start vhost-user-blk;
> 5. see crash!
>
> The 'CLOSE' event path is core trouble maker.
>
> qemu_chr_fe_set_handlers
> -> vhost_user_blk_event(OPEN)
> -> vhost_user_blk_connect
> -> vhost_dev_init
> -> vhost_user_blk_event(CLOSE)
> -> vhost_dev_cleanup
>
>
> Li Feng (4):
> vhost-user-blk: delay vhost_user_blk_disconnect
> vhost-user-blk: fix invalid memory access
> char-socket: avoid double call tcp_chr_free_connection
> vhost-user-blk: fix crash in realize process
>
> chardev/char-socket.c | 5 ++++
> hw/block/vhost-user-blk.c | 75 ++++++++++++++++++++++++++++++++---------------
> hw/virtio/vhost.c | 2 +-
> 3 files changed, 58 insertions(+), 24 deletions(-)
>
> --
> 2.11.0
>
>
> --
> The SmartX email address is only for business purpose. Any sent message
> that is not related to the business is not authorized or permitted by
> SmartX.
> 本邮箱为北京志凌海纳科技有限公司(SmartX)工作邮箱. 如本邮箱发出的邮件与工作无关,该邮件未得到本公司任何的明示或默示的授权.
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Fwd: [PATCH 4/4] vhost-user-blk: fix crash in realize process
[not found] ` <CAHckoCwsLwqNhVHK4PF+pzt5ExkurPNsWLw7ueRG8dfOUA2rXg@mail.gmail.com>
@ 2020-04-17 10:07 ` Li Feng
0 siblings, 0 replies; 32+ messages in thread
From: Li Feng @ 2020-04-17 10:07 UTC (permalink / raw)
To: Raphael Norwitz
Cc: lifeng1519, Yang Fan, Kyle Zhang, open list:All patches CC here,
Michael S. Tsirkin
Add mail group list.
Thank you, Raphael .
Raphael Norwitz <raphael.norwitz@nutanix.com> 于2020年4月17日周五 下午12:10写道:
>
> Mostly looks good - just a few superficial notes.
>
> On Wed, Apr 15, 2020 at 11:28:26AM +0800, Li Feng wrote:
> > 1. set s->connected to true after vhost_dev_init;
> > 2. call vhost_dev_get_config when s->connected is true, otherwise the
> > hdev->host_ops will be nullptr.
>
> You mean hdev->vhost_ops, right?
>
Yes.
> >
> > Signed-off-by: Li Feng <fengli@smartx.com>
> > ---
> > hw/block/vhost-user-blk.c | 47 +++++++++++++++++++++++++----------------------
> > 1 file changed, 25 insertions(+), 22 deletions(-)
> > + /*
> > + * set true util vhost_dev_init return ok, because CLOSE event may happen
> > + * in vhost_dev_init routine.
> > + */
>
> I'm a little confused by this comment. Do you mean to say “wait until vhost_dev_init
> succeeds to set connected to true, because a close event may happen while
> vhost_dev_init is executing”?
Yes. This is the exception path:
qemu_chr_fe_set_handlers
-> vhost_user_blk_event(OPEN)
-> vhost_user_blk_connect
-> vhost_dev_init
-> vhost_user_blk_event(CLOSE)
-> vhost_dev_cleanup
We should set connected to true only when vhost_dev_init returns OK.
If a close event is triggered, we shouldn't set connected to true.
--
The SmartX email address is only for business purpose. Any sent message
that is not related to the business is not authorized or permitted by
SmartX.
本邮箱为北京志凌海纳科技有限公司(SmartX)工作邮箱. 如本邮箱发出的邮件与工作无关,该邮件未得到本公司任何的明示或默示的授权.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 0/4] fix crashes when inject errors to vhost-user-blk chardev
2020-04-17 9:45 ` [PATCH 0/4] fix crashes when inject errors to vhost-user-blk chardev Michael S. Tsirkin
@ 2020-04-17 10:11 ` Li Feng
0 siblings, 0 replies; 32+ messages in thread
From: Li Feng @ 2020-04-17 10:11 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Kevin Wolf, lifeng1519, Yang Fan, open list:Block layer core,
open list:All patches CC here, Max Reitz, Kyle Zhang,
Paolo Bonzini, Marc-André Lureau, Raphael Norwitz
OK, I will submit this patch "vhost-user-blk: fix invalid memory
access" firstly.
This is unrelated with other three and it has been acked.
Thanks,
Feng Li
Michael S. Tsirkin <mst@redhat.com> 于2020年4月17日周五 下午5:45写道:
>
> On Wed, Apr 15, 2020 at 11:28:22AM +0800, Li Feng wrote:
> > The following patches fix various crashes happened when injecting errors to
> > chardev unix domain socket.
>
> I think these are mostly unrelated fixes right?
> If so pls post them separately so I know I can just apply
> some and wait for others to get acked.
>
>
> > The crashes are encountered when the socket is from connected to disconnected at
> > vhost-user-blk realize routine.
> >
> > These crashes could be reproduced like this:
> > 1. gdb break at vhost_user_write;
> > 2. add a vhost-user-blk device through qmp;
> > 3. when stop at vhost_user_write, kill the vhost-user-blk target;
> > 3. let qemu continue running;
> > 4. start vhost-user-blk;
> > 5. see crash!
> >
> > The 'CLOSE' event path is core trouble maker.
> >
> > qemu_chr_fe_set_handlers
> > -> vhost_user_blk_event(OPEN)
> > -> vhost_user_blk_connect
> > -> vhost_dev_init
> > -> vhost_user_blk_event(CLOSE)
> > -> vhost_dev_cleanup
> >
> >
> > Li Feng (4):
> > vhost-user-blk: delay vhost_user_blk_disconnect
> > vhost-user-blk: fix invalid memory access
> > char-socket: avoid double call tcp_chr_free_connection
> > vhost-user-blk: fix crash in realize process
> >
> > chardev/char-socket.c | 5 ++++
> > hw/block/vhost-user-blk.c | 75 ++++++++++++++++++++++++++++++++---------------
> > hw/virtio/vhost.c | 2 +-
> > 3 files changed, 58 insertions(+), 24 deletions(-)
> >
> > --
> > 2.11.0
> >
> >
> > --
> > The SmartX email address is only for business purpose. Any sent message
> > that is not related to the business is not authorized or permitted by
> > SmartX.
> > 本邮箱为北京志凌海纳科技有限公司(SmartX)工作邮箱. 如本邮箱发出的邮件与工作无关,该邮件未得到本公司任何的明示或默示的授权.
> >
>
--
The SmartX email address is only for business purpose. Any sent message
that is not related to the business is not authorized or permitted by
SmartX.
本邮箱为北京志凌海纳科技有限公司(SmartX)工作邮箱. 如本邮箱发出的邮件与工作无关,该邮件未得到本公司任何的明示或默示的授权.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/4] char-socket: avoid double call tcp_chr_free_connection
2020-04-15 10:35 ` Marc-André Lureau
@ 2020-04-20 3:10 ` Li Feng
2020-04-20 14:28 ` Marc-André Lureau
0 siblings, 1 reply; 32+ messages in thread
From: Li Feng @ 2020-04-20 3:10 UTC (permalink / raw)
To: Marc-André Lureau
Cc: lifeng1519, Paolo Bonzini, Yang Fan, Kyle Zhang,
open list:All patches CC here
Hi Lureau,
Thanks for your comment.
I have spent some time to reproduce this crash, so that delay my
reply, apologies.
This is the description of this bug using gdb:
1. Set break points using gdb;
b chardev/char-socket.c:410 if s->state == TCP_CHARDEV_STATE_DISCONNECTED
b break vhost_user_write
2. hotplug a vhost-blk device:
echo "chardev-add
socket,id=spdk_vhost_blk2,path=/vhost-blk.0,reconnect=1 "| nc -U
/tmp/a.socket
echo "device_add
vhost-user-blk-pci,chardev=spdk_vhost_blk2,id=spdk_vhost_blk2,num-queues=4"
| nc -U /tmp/a.socket
3. Gdb will stop at vhost_user_write, and CTRL-C the vhost target.
4. Put 'c' to let gdb continue.
You will see a crash:
192 login: qemu-system-x86_64: chardev/char-socket.c:125:
qemu_chr_socket_restart_timer: Assertion `!s->reconnect_timer' failed.
The related code:
394 static void tcp_chr_free_connection(Chardev *chr)
395 {
396 SocketChardev *s = SOCKET_CHARDEV(chr);
397 int i;
398
399 if (s->read_msgfds_num) {
400 for (i = 0; i < s->read_msgfds_num; i++) {
401 close(s->read_msgfds[i]);
402 }
403 g_free(s->read_msgfds);
404 s->read_msgfds = NULL;
405 s->read_msgfds_num = 0;
406 }
407
408 remove_hup_source(s);
409
410 tcp_set_msgfds(chr, NULL, 0);
411 remove_fd_in_watch(chr);
412 object_unref(OBJECT(s->sioc));
413 s->sioc = NULL;
414 object_unref(OBJECT(s->ioc));
415 s->ioc = NULL;
416 g_free(chr->filename);
417 chr->filename = NULL;
418 tcp_chr_change_state(s, TCP_CHARDEV_STATE_DISCONNECTED);
419 }
#0 0x0000555555c1aae8 in tcp_chr_free_connection
(chr=chr@entry=0x55555751ee00) at chardev/char-socket.c:410
#1 0x0000555555c1aec8 in tcp_chr_disconnect_locked
(chr=0x55555751ee00) at chardev/char-socket.c:479
#2 0x0000555555c1af5d in tcp_chr_disconnect (chr=0x55555751ee00) at
chardev/char-socket.c:497
#3 0x0000555555c16715 in qemu_chr_fe_set_handlers
(b=b@entry=0x5555575f3b48, fd_can_read=fd_can_read@entry=0x0,
fd_read=fd_
read@entry=0x0, fd_event=fd_event@entry=0x55555588e740
<vhost_user_blk_event>, be_change=be_change@entry=0x0, opaque=opaque@
entry=0x5555575f3990, context=0x0, set_open=true) at chardev/char-fe.c:304
#4 0x000055555588e5c0 in vhost_user_blk_device_realize
(dev=0x5555575f3990, errp=<optimized out>)
at /root/qemu-master/hw/block/vhost-user-blk.c:447
#5 0x00005555558ca478 in virtio_device_realize (dev=0x5555575f3990,
errp=0x7fffffffbb90)
at /root/qemu-master/hw/virtio/virtio.c:3615
#6 0x00005555559dc842 in device_set_realized (obj=<optimized out>,
value=<optimized out>, errp=0x7fffffffbd20)
at hw/core/qdev.c:891
#7 0x0000555555b78e07 in property_set_bool (obj=0x5555575f3990,
v=<optimized out>, name=<optimized out>, opaque=0x555556453
040, errp=0x7fffffffbd20) at qom/object.c:2238
#8 0x0000555555b7db3f in object_property_set_qobject
(obj=obj@entry=0x5555575f3990, value=value@entry=0x555556ab89c0, name=
name@entry=0x555555d15308 "realized", errp=errp@entry=0x7fffffffbd20)
at qom/qom-qobject.c:26
#9 0x0000555555b7b2c5 in object_property_set_bool
(obj=0x5555575f3990, value=<optimized out>, name=0x555555d15308
"realized
", errp=0x7fffffffbd20) at qom/object.c:1390
#10 0x0000555555af13b2 in virtio_pci_realize (pci_dev=0x5555575eb800,
errp=0x7fffffffbd20) at hw/virtio/virtio-pci.c:1807
#11 0x0000555555a7a14b in pci_qdev_realize (qdev=<optimized out>,
errp=<optimized out>) at hw/pci/pci.c:2098
#12 0x00005555559dc842 in device_set_realized (obj=<optimized out>,
value=<optimized out>, errp=0x7fffffffbef8)
at hw/core/qdev.c:891
(gdb) p s
$23 = (SocketChardev *) 0x55555751ee00
(gdb) p s->state
$24 = TCP_CHARDEV_STATE_DISCONNECTED
At 410 line of char-socket.c, the state has been changed to
TCP_CHARDEV_STATE_DISCONNECTED before tcp_chr_change_state(s,
TCP_CHARDEV_STATE_DISCONNECTED);
It means the tcp_chr_disconnect_locked is called by more than one
times, and the tcp socket has been freed before.
The crash reason is s->reconnect_timer is set in the previous call of
tcp_chr_disconnect_locked.
The another approach fix maybe like this:
diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 185fe38dda..94957de367 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -486,7 +486,7 @@ static void tcp_chr_disconnect_locked(Chardev *chr)
if (emit_close) {
qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
}
- if (s->reconnect_time) {
+ if (s->reconnect_time && !s->reconnect_timer) {
qemu_chr_socket_restart_timer(chr);
}
}
The base code is here:
474 static void tcp_chr_disconnect_locked(Chardev *chr)
475 {
476 SocketChardev *s = SOCKET_CHARDEV(chr);
477 bool emit_close = s->state == TCP_CHARDEV_STATE_CONNECTED;
478
479 tcp_chr_free_connection(chr);
480
481 if (s->listener) {
482 qio_net_listener_set_client_func_full(s->listener, tcp_chr_accept,
483 chr, NULL, chr->gcontext);
484 }
485 update_disconnected_filename(s);
486 if (emit_close) {
487 qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
488 }
489 if (s->reconnect_time) {
490 qemu_chr_socket_restart_timer(chr);
491 }
492 }
Which one is better?
I don't know how to inject an error at socket write when writing tests
code(tests/test-char.c ).
Any tips?
Thanks,
Feng Li
Marc-André Lureau <marcandre.lureau@gmail.com> 于2020年4月15日周三 下午6:35写道:
>
> Hi
>
> On Wed, Apr 15, 2020 at 5:31 AM Li Feng <fengli@smartx.com> wrote:
> >
> > double call tcp_chr_free_connection generates a crash.
> >
> > Signed-off-by: Li Feng <fengli@smartx.com>
>
> Could you describe how you reach the "double call".
>
> Even better would be to write a test for it in tests/test-char.c
>
> thanks
>
> > ---
> > chardev/char-socket.c | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> > index 185fe38dda..43aab8f24b 100644
> > --- a/chardev/char-socket.c
> > +++ b/chardev/char-socket.c
> > @@ -476,6 +476,11 @@ static void tcp_chr_disconnect_locked(Chardev *chr)
> > SocketChardev *s = SOCKET_CHARDEV(chr);
> > bool emit_close = s->state == TCP_CHARDEV_STATE_CONNECTED;
> >
> > + /* avoid re-enter when socket read/write error and disconnect event. */
> > + if (s->state == TCP_CHARDEV_STATE_DISCONNECTED) {
> > + return;
> > + }
> > +
> > tcp_chr_free_connection(chr);
> >
> > if (s->listener) {
> > --
> > 2.11.0
> >
> >
> > --
> > The SmartX email address is only for business purpose. Any sent message
> > that is not related to the business is not authorized or permitted by
> > SmartX.
> > 本邮箱为北京志凌海纳科技有限公司(SmartX)工作邮箱. 如本邮箱发出的邮件与工作无关,该邮件未得到本公司任何的明示或默示的授权.
> >
> >
> >
>
>
> --
> Marc-André Lureau
--
The SmartX email address is only for business purpose. Any sent message
that is not related to the business is not authorized or permitted by
SmartX.
本邮箱为北京志凌海纳科技有限公司(SmartX)工作邮箱. 如本邮箱发出的邮件与工作无关,该邮件未得到本公司任何的明示或默示的授权.
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH 1/4] vhost-user-blk: delay vhost_user_blk_disconnect
2020-04-14 1:24 ` Raphael Norwitz
@ 2020-04-20 11:18 ` Li Feng
0 siblings, 0 replies; 32+ messages in thread
From: Li Feng @ 2020-04-20 11:18 UTC (permalink / raw)
To: Raphael Norwitz
Cc: lifeng1519, Yang Fan, Kyle Zhang, open list:All patches CC here,
Michael S. Tsirkin
Hi Norwitz ,
Thanks for your good suggestion.
I got this fix from net/vhost-user.c, it has the same issue with this case.
Your suggestion is a good option.
I'm trying to do some work. but there is another crash issue ...
I need some time to make your idea be workable.
This is the net/vhost-user patch:
commit e7c83a885f865128ae3cf1946f8cb538b63cbfba
Author: Marc-André Lureau <marcandre.lureau@redhat.com>
Date: Mon Feb 27 14:49:56 2017 +0400
vhost-user: delay vhost_user_stop
Since commit b0a335e351103bf92f3f9d0bd5759311be8156ac, a socket write
may trigger a disconnect events, calling vhost_user_stop() and clearing
all the vhost_dev strutures holding data that vhost.c functions expect
to remain valid. Delay the cleanup to keep the vhost_dev structure
valid during the vhost.c functions.
Feng Li
Raphael Norwitz <raphael.norwitz@nutanix.com> 于2020年4月17日周五 上午11:41写道:
>
> On Wed, Apr 15, 2020 at 11:28:23AM +0800, Li Feng wrote:
> >
> > switch (event) {
> > case CHR_EVENT_OPENED:
> > @@ -363,7 +376,16 @@ static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
> > }
> > break;
> > case CHR_EVENT_CLOSED:
> > - vhost_user_blk_disconnect(dev);
> > + /*
> > + * a close event may happen during a read/write, but vhost
> > + * code assumes the vhost_dev remains setup, so delay the
> > + * stop & clear to idle.
> > + */
> > + ctx = qemu_get_current_aio_context();
> > +
> > + qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, NULL,
> > + NULL, NULL, NULL, false);
> > + aio_bh_schedule_oneshot(ctx, vhost_user_blk_chr_closed_bh, opaque);
>
> This seems a bit racy. What’s to stop the async operation from executing before
> the next read?
>
> If the issue is just that the vhost_dev state is being destroyed too early, why
> don’t we rather move the vhost_dev_cleanup() call from vhost_user_blk_disconnect()
> to vhost_user_blk_connect()? We may need to add some state to tell if this is the
> first connect or a reconnect so we don’t call vhost_dev_cleanup() on initial
> connect, but as long as we call vhost_dev_cleanup() before vhost_dev_init()
> every time the device reconnects it shouldn’t matter that we keep that state
> around.
>
> Thoughts?
>
> > break;
> > case CHR_EVENT_BREAK:
> > case CHR_EVENT_MUX_IN:
--
The SmartX email address is only for business purpose. Any sent message
that is not related to the business is not authorized or permitted by
SmartX.
本邮箱为北京志凌海纳科技有限公司(SmartX)工作邮箱. 如本邮箱发出的邮件与工作无关,该邮件未得到本公司任何的明示或默示的授权.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/4] char-socket: avoid double call tcp_chr_free_connection
2020-04-20 3:10 ` Li Feng
@ 2020-04-20 14:28 ` Marc-André Lureau
0 siblings, 0 replies; 32+ messages in thread
From: Marc-André Lureau @ 2020-04-20 14:28 UTC (permalink / raw)
To: Li Feng
Cc: Li Feng, Paolo Bonzini, Yang Fan, Kyle Zhang,
open list:All patches CC here
Hi
On Mon, Apr 20, 2020 at 5:10 AM Li Feng <fengli@smartx.com> wrote:
>
> Hi Lureau,
>
> Thanks for your comment.
> I have spent some time to reproduce this crash, so that delay my
> reply, apologies.
>
> This is the description of this bug using gdb:
> 1. Set break points using gdb;
> b chardev/char-socket.c:410 if s->state == TCP_CHARDEV_STATE_DISCONNECTED
> b break vhost_user_write
>
> 2. hotplug a vhost-blk device:
> echo "chardev-add
> socket,id=spdk_vhost_blk2,path=/vhost-blk.0,reconnect=1 "| nc -U
> /tmp/a.socket
> echo "device_add
> vhost-user-blk-pci,chardev=spdk_vhost_blk2,id=spdk_vhost_blk2,num-queues=4"
> | nc -U /tmp/a.socket
>
> 3. Gdb will stop at vhost_user_write, and CTRL-C the vhost target.
> 4. Put 'c' to let gdb continue.
> You will see a crash:
>
> 192 login: qemu-system-x86_64: chardev/char-socket.c:125:
> qemu_chr_socket_restart_timer: Assertion `!s->reconnect_timer' failed.
>
> The related code:
> 394 static void tcp_chr_free_connection(Chardev *chr)
> 395 {
> 396 SocketChardev *s = SOCKET_CHARDEV(chr);
> 397 int i;
> 398
> 399 if (s->read_msgfds_num) {
> 400 for (i = 0; i < s->read_msgfds_num; i++) {
> 401 close(s->read_msgfds[i]);
> 402 }
> 403 g_free(s->read_msgfds);
> 404 s->read_msgfds = NULL;
> 405 s->read_msgfds_num = 0;
> 406 }
> 407
> 408 remove_hup_source(s);
> 409
> 410 tcp_set_msgfds(chr, NULL, 0);
> 411 remove_fd_in_watch(chr);
> 412 object_unref(OBJECT(s->sioc));
> 413 s->sioc = NULL;
> 414 object_unref(OBJECT(s->ioc));
> 415 s->ioc = NULL;
> 416 g_free(chr->filename);
> 417 chr->filename = NULL;
> 418 tcp_chr_change_state(s, TCP_CHARDEV_STATE_DISCONNECTED);
> 419 }
>
> #0 0x0000555555c1aae8 in tcp_chr_free_connection
> (chr=chr@entry=0x55555751ee00) at chardev/char-socket.c:410
> #1 0x0000555555c1aec8 in tcp_chr_disconnect_locked
> (chr=0x55555751ee00) at chardev/char-socket.c:479
> #2 0x0000555555c1af5d in tcp_chr_disconnect (chr=0x55555751ee00) at
> chardev/char-socket.c:497
> #3 0x0000555555c16715 in qemu_chr_fe_set_handlers
> (b=b@entry=0x5555575f3b48, fd_can_read=fd_can_read@entry=0x0,
> fd_read=fd_
> read@entry=0x0, fd_event=fd_event@entry=0x55555588e740
> <vhost_user_blk_event>, be_change=be_change@entry=0x0, opaque=opaque@
> entry=0x5555575f3990, context=0x0, set_open=true) at chardev/char-fe.c:304
> #4 0x000055555588e5c0 in vhost_user_blk_device_realize
> (dev=0x5555575f3990, errp=<optimized out>)
> at /root/qemu-master/hw/block/vhost-user-blk.c:447
> #5 0x00005555558ca478 in virtio_device_realize (dev=0x5555575f3990,
> errp=0x7fffffffbb90)
> at /root/qemu-master/hw/virtio/virtio.c:3615
> #6 0x00005555559dc842 in device_set_realized (obj=<optimized out>,
> value=<optimized out>, errp=0x7fffffffbd20)
> at hw/core/qdev.c:891
> #7 0x0000555555b78e07 in property_set_bool (obj=0x5555575f3990,
> v=<optimized out>, name=<optimized out>, opaque=0x555556453
> 040, errp=0x7fffffffbd20) at qom/object.c:2238
> #8 0x0000555555b7db3f in object_property_set_qobject
> (obj=obj@entry=0x5555575f3990, value=value@entry=0x555556ab89c0, name=
> name@entry=0x555555d15308 "realized", errp=errp@entry=0x7fffffffbd20)
> at qom/qom-qobject.c:26
> #9 0x0000555555b7b2c5 in object_property_set_bool
> (obj=0x5555575f3990, value=<optimized out>, name=0x555555d15308
> "realized
> ", errp=0x7fffffffbd20) at qom/object.c:1390
> #10 0x0000555555af13b2 in virtio_pci_realize (pci_dev=0x5555575eb800,
> errp=0x7fffffffbd20) at hw/virtio/virtio-pci.c:1807
> #11 0x0000555555a7a14b in pci_qdev_realize (qdev=<optimized out>,
> errp=<optimized out>) at hw/pci/pci.c:2098
> #12 0x00005555559dc842 in device_set_realized (obj=<optimized out>,
> value=<optimized out>, errp=0x7fffffffbef8)
> at hw/core/qdev.c:891
>
> (gdb) p s
> $23 = (SocketChardev *) 0x55555751ee00
> (gdb) p s->state
> $24 = TCP_CHARDEV_STATE_DISCONNECTED
>
> At 410 line of char-socket.c, the state has been changed to
> TCP_CHARDEV_STATE_DISCONNECTED before tcp_chr_change_state(s,
> TCP_CHARDEV_STATE_DISCONNECTED);
> It means the tcp_chr_disconnect_locked is called by more than one
> times, and the tcp socket has been freed before.
>
> The crash reason is s->reconnect_timer is set in the previous call of
> tcp_chr_disconnect_locked.
> The another approach fix maybe like this:
>
> diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> index 185fe38dda..94957de367 100644
> --- a/chardev/char-socket.c
> +++ b/chardev/char-socket.c
> @@ -486,7 +486,7 @@ static void tcp_chr_disconnect_locked(Chardev *chr)
> if (emit_close) {
> qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
> }
> - if (s->reconnect_time) {
> + if (s->reconnect_time && !s->reconnect_timer) {
> qemu_chr_socket_restart_timer(chr);
> }
> }
>
> The base code is here:
> 474 static void tcp_chr_disconnect_locked(Chardev *chr)
> 475 {
> 476 SocketChardev *s = SOCKET_CHARDEV(chr);
> 477 bool emit_close = s->state == TCP_CHARDEV_STATE_CONNECTED;
> 478
> 479 tcp_chr_free_connection(chr);
> 480
> 481 if (s->listener) {
> 482 qio_net_listener_set_client_func_full(s->listener, tcp_chr_accept,
> 483 chr, NULL, chr->gcontext);
> 484 }
> 485 update_disconnected_filename(s);
> 486 if (emit_close) {
> 487 qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
> 488 }
> 489 if (s->reconnect_time) {
> 490 qemu_chr_socket_restart_timer(chr);
> 491 }
> 492 }
>
> Which one is better?
+ if (s->reconnect_time && !s->reconnect_timer)
Looks like a good solution to me.
>
> I don't know how to inject an error at socket write when writing tests
> code(tests/test-char.c ).
> Any tips?
You could check the patch I just sent fixing a related issue:
https://patchew.org/QEMU/20200420112012.567284-1-marcandre.lureau@redhat.com/
In your case, it might be as easy as calling qemu_chr_fe_disconnect()
2 times on a reconnect socket, I'll let you figure out.
Thanks!
>
> Thanks,
> Feng Li
>
> Marc-André Lureau <marcandre.lureau@gmail.com> 于2020年4月15日周三 下午6:35写道:
>
> >
> > Hi
> >
> > On Wed, Apr 15, 2020 at 5:31 AM Li Feng <fengli@smartx.com> wrote:
> > >
> > > double call tcp_chr_free_connection generates a crash.
> > >
> > > Signed-off-by: Li Feng <fengli@smartx.com>
> >
> > Could you describe how you reach the "double call".
> >
> > Even better would be to write a test for it in tests/test-char.c
> >
> > thanks
> >
> > > ---
> > > chardev/char-socket.c | 5 +++++
> > > 1 file changed, 5 insertions(+)
> > >
> > > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> > > index 185fe38dda..43aab8f24b 100644
> > > --- a/chardev/char-socket.c
> > > +++ b/chardev/char-socket.c
> > > @@ -476,6 +476,11 @@ static void tcp_chr_disconnect_locked(Chardev *chr)
> > > SocketChardev *s = SOCKET_CHARDEV(chr);
> > > bool emit_close = s->state == TCP_CHARDEV_STATE_CONNECTED;
> > >
> > > + /* avoid re-enter when socket read/write error and disconnect event. */
> > > + if (s->state == TCP_CHARDEV_STATE_DISCONNECTED) {
> > > + return;
> > > + }
> > > +
> > > tcp_chr_free_connection(chr);
> > >
> > > if (s->listener) {
> > > --
> > > 2.11.0
> > >
> > >
> > > --
> > > The SmartX email address is only for business purpose. Any sent message
> > > that is not related to the business is not authorized or permitted by
> > > SmartX.
> > > 本邮箱为北京志凌海纳科技有限公司(SmartX)工作邮箱. 如本邮箱发出的邮件与工作无关,该邮件未得到本公司任何的明示或默示的授权.
> > >
> > >
> > >
> >
> >
> > --
> > Marc-André Lureau
>
> --
> The SmartX email address is only for business purpose. Any sent message
> that is not related to the business is not authorized or permitted by
> SmartX.
> 本邮箱为北京志凌海纳科技有限公司(SmartX)工作邮箱. 如本邮箱发出的邮件与工作无关,该邮件未得到本公司任何的明示或默示的授权.
>
>
--
Marc-André Lureau
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v2] char-socket: initialize reconnect timer only when the timer doesn't start
2020-04-15 3:28 ` [PATCH 3/4] char-socket: avoid double call tcp_chr_free_connection Li Feng
2020-04-15 10:35 ` Marc-André Lureau
@ 2020-04-28 8:51 ` Li Feng
[not found] ` <20200508051441.8143-1-fengli@smartx.com>
2020-05-11 12:39 ` [PATCH v4] " Li Feng
2020-04-28 8:54 ` [PATCH v2] " Li Feng
2 siblings, 2 replies; 32+ messages in thread
From: Li Feng @ 2020-04-28 8:51 UTC (permalink / raw)
To: kyle, lifeng1519, dimastep, Marc-André Lureau,
Paolo Bonzini, open list:All patches CC here
Cc: Li Feng
When the disconnect event is triggered in the connecting stage,
the tcp_chr_disconnect_locked may be called twice.
The first call:
#0 qemu_chr_socket_restart_timer (chr=0x55555582ee90) at chardev/char-socket.c:120
#1 0x000055555558e38c in tcp_chr_disconnect_locked (chr=<optimized out>) at chardev/char-socket.c:490
#2 0x000055555558e3cd in tcp_chr_disconnect (chr=0x55555582ee90) at chardev/char-socket.c:497
#3 0x000055555558ea32 in tcp_chr_new_client (chr=chr@entry=0x55555582ee90, sioc=sioc@entry=0x55555582f0b0) at chardev/char-socket.c:892
#4 0x000055555558eeb8 in qemu_chr_socket_connected (task=0x55555582f300, opaque=<optimized out>) at chardev/char-socket.c:1090
#5 0x0000555555574352 in qio_task_complete (task=task@entry=0x55555582f300) at io/task.c:196
#6 0x00005555555745f4 in qio_task_thread_result (opaque=0x55555582f300) at io/task.c:111
#7 qio_task_wait_thread (task=0x55555582f300) at io/task.c:190
#8 0x000055555558f17e in tcp_chr_wait_connected (chr=0x55555582ee90, errp=0x555555802a08 <error_abort>) at chardev/char-socket.c:1013
#9 0x0000555555567cbd in char_socket_client_reconnect_test (opaque=0x5555557fe020 <client8unix>) at tests/test-char.c:1152
The second call:
#0 0x00007ffff5ac3277 in raise () from /lib64/libc.so.6
#1 0x00007ffff5ac4968 in abort () from /lib64/libc.so.6
#2 0x00007ffff5abc096 in __assert_fail_base () from /lib64/libc.so.6
#3 0x00007ffff5abc142 in __assert_fail () from /lib64/libc.so.6
#4 0x000055555558d10a in qemu_chr_socket_restart_timer (chr=0x55555582ee90) at chardev/char-socket.c:125
#5 0x000055555558df0c in tcp_chr_disconnect_locked (chr=<optimized out>) at chardev/char-socket.c:490
#6 0x000055555558df4d in tcp_chr_disconnect (chr=0x55555582ee90) at chardev/char-socket.c:497
#7 0x000055555558e5b2 in tcp_chr_new_client (chr=chr@entry=0x55555582ee90, sioc=sioc@entry=0x55555582f0b0) at chardev/char-socket.c:892
#8 0x000055555558e93a in tcp_chr_connect_client_sync (chr=chr@entry=0x55555582ee90, errp=errp@entry=0x7fffffffd178) at chardev/char-socket.c:944
#9 0x000055555558ec78 in tcp_chr_wait_connected (chr=0x55555582ee90, errp=0x555555802a08 <error_abort>) at chardev/char-socket.c:1035
#10 0x000055555556804b in char_socket_client_test (opaque=0x5555557fe020 <client8unix>) at tests/test-char.c:1023
Run test/test-char to reproduce this issue.
test-char: chardev/char-socket.c:125: qemu_chr_socket_restart_timer: Assertion `!s->reconnect_timer' failed.
Signed-off-by: Li Feng <fengli@smartx.com>
---
v2:
- Rewrite the solution.
- Add test to reproduce this issue.
chardev/char-socket.c | 2 +-
tests/test-char.c | 48 ++++++++++++++++++++++++++++++++++++++----------
2 files changed, 39 insertions(+), 11 deletions(-)
diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 1f14c2c7c8..d84330b3c9 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -486,7 +486,7 @@ static void tcp_chr_disconnect_locked(Chardev *chr)
if (emit_close) {
qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
}
- if (s->reconnect_time) {
+ if (s->reconnect_time && !s->reconnect_timer) {
qemu_chr_socket_restart_timer(chr);
}
}
diff --git a/tests/test-char.c b/tests/test-char.c
index 8d39bdc9fa..13dbbfe2a3 100644
--- a/tests/test-char.c
+++ b/tests/test-char.c
@@ -625,12 +625,14 @@ static void char_udp_test(void)
typedef struct {
int event;
bool got_pong;
+ CharBackend *be;
} CharSocketTestData;
#define SOCKET_PING "Hello"
#define SOCKET_PONG "World"
+typedef void (*char_socket_cb)(void *opaque, QEMUChrEvent event);
static void
char_socket_event(void *opaque, QEMUChrEvent event)
@@ -639,6 +641,23 @@ char_socket_event(void *opaque, QEMUChrEvent event)
data->event = event;
}
+static void
+char_socket_event_with_error(void *opaque, QEMUChrEvent event)
+{
+ CharSocketTestData *data = opaque;
+ CharBackend *be = data->be;
+ data->event = event;
+ switch (event) {
+ case CHR_EVENT_OPENED:
+ qemu_chr_fe_disconnect(be);
+ return;
+ case CHR_EVENT_CLOSED:
+ return;
+ default:
+ return;
+ }
+}
+
static void
char_socket_read(void *opaque, const uint8_t *buf, int size)
@@ -783,6 +802,7 @@ static void char_socket_server_test(gconstpointer opaque)
reconnect:
data.event = -1;
+ data.be = &be;
qemu_chr_fe_set_handlers(&be, NULL, NULL,
char_socket_event, NULL,
&data, NULL, true);
@@ -869,6 +889,7 @@ typedef struct {
const char *reconnect;
bool wait_connected;
bool fd_pass;
+ char_socket_cb event_cb;
} CharSocketClientTestConfig;
static void char_socket_client_dupid_test(gconstpointer opaque)
@@ -920,6 +941,7 @@ static void char_socket_client_dupid_test(gconstpointer opaque)
static void char_socket_client_test(gconstpointer opaque)
{
const CharSocketClientTestConfig *config = opaque;
+ const char_socket_cb event_cb = config->event_cb;
QIOChannelSocket *ioc;
char *optstr;
Chardev *chr;
@@ -983,8 +1005,9 @@ static void char_socket_client_test(gconstpointer opaque)
reconnect:
data.event = -1;
+ data.be = &be;
qemu_chr_fe_set_handlers(&be, NULL, NULL,
- char_socket_event, NULL,
+ event_cb, NULL,
&data, NULL, true);
if (config->reconnect) {
g_assert(data.event == -1);
@@ -1022,7 +1045,7 @@ static void char_socket_client_test(gconstpointer opaque)
/* Setup a callback to receive the reply to our greeting */
qemu_chr_fe_set_handlers(&be, char_socket_can_read,
char_socket_read,
- char_socket_event, NULL,
+ event_cb, NULL,
&data, NULL, true);
g_assert(data.event == CHR_EVENT_OPENED);
data.event = -1;
@@ -1467,19 +1490,22 @@ int main(int argc, char **argv)
#define SOCKET_CLIENT_TEST(name, addr) \
static CharSocketClientTestConfig client1 ## name = \
- { addr, NULL, false, false }; \
+ { addr, NULL, false, false, char_socket_event}; \
static CharSocketClientTestConfig client2 ## name = \
- { addr, NULL, true, false }; \
+ { addr, NULL, true, false, char_socket_event }; \
static CharSocketClientTestConfig client3 ## name = \
- { addr, ",reconnect=1", false }; \
+ { addr, ",reconnect=1", false, false, char_socket_event }; \
static CharSocketClientTestConfig client4 ## name = \
- { addr, ",reconnect=1", true }; \
+ { addr, ",reconnect=1", true, false, char_socket_event }; \
static CharSocketClientTestConfig client5 ## name = \
- { addr, NULL, false, true }; \
+ { addr, NULL, false, true, char_socket_event }; \
static CharSocketClientTestConfig client6 ## name = \
- { addr, NULL, true, true }; \
+ { addr, NULL, true, true, char_socket_event }; \
static CharSocketClientTestConfig client7 ## name = \
- { addr, ",reconnect=1", false, false }; \
+ { addr, ",reconnect=1", false, false, char_socket_event }; \
+ static CharSocketClientTestConfig client8 ## name = \
+ { addr, ",reconnect=1", true, false, \
+ char_socket_event_with_error }; \
g_test_add_data_func("/char/socket/client/mainloop/" # name, \
&client1 ##name, char_socket_client_test); \
g_test_add_data_func("/char/socket/client/wait-conn/" # name, \
@@ -1493,7 +1519,9 @@ int main(int argc, char **argv)
g_test_add_data_func("/char/socket/client/wait-conn-fdpass/" # name, \
&client6 ##name, char_socket_client_test); \
g_test_add_data_func("/char/socket/client/dupid-reconnect/" # name, \
- &client7 ##name, char_socket_client_dupid_test)
+ &client7 ##name, char_socket_client_dupid_test);\
+ g_test_add_data_func("/char/socket/client/reconnect-error/" # name, \
+ &client8 ##name, char_socket_client_test)
if (has_ipv4) {
SOCKET_SERVER_TEST(tcp, &tcpaddr);
--
2.11.0
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v2] char-socket: initialize reconnect timer only when the timer doesn't start
2020-04-15 3:28 ` [PATCH 3/4] char-socket: avoid double call tcp_chr_free_connection Li Feng
2020-04-15 10:35 ` Marc-André Lureau
2020-04-28 8:51 ` [PATCH v2] char-socket: initialize reconnect timer only when the timer doesn't start Li Feng
@ 2020-04-28 8:54 ` Li Feng
2020-04-28 8:59 ` Li Feng
2 siblings, 1 reply; 32+ messages in thread
From: Li Feng @ 2020-04-28 8:54 UTC (permalink / raw)
To: kyle, lifeng1519, dimastep, Marc-André Lureau,
Paolo Bonzini, open list:All patches CC here
Cc: Li Feng
When the disconnect event is triggered in the connecting stage,
the tcp_chr_disconnect_locked may be called twice.
The first call:
#0 qemu_chr_socket_restart_timer (chr=0x55555582ee90) at chardev/char-socket.c:120
#1 0x000055555558e38c in tcp_chr_disconnect_locked (chr=<optimized out>) at chardev/char-socket.c:490
#2 0x000055555558e3cd in tcp_chr_disconnect (chr=0x55555582ee90) at chardev/char-socket.c:497
#3 0x000055555558ea32 in tcp_chr_new_client (chr=chr@entry=0x55555582ee90, sioc=sioc@entry=0x55555582f0b0) at chardev/char-socket.c:892
#4 0x000055555558eeb8 in qemu_chr_socket_connected (task=0x55555582f300, opaque=<optimized out>) at chardev/char-socket.c:1090
#5 0x0000555555574352 in qio_task_complete (task=task@entry=0x55555582f300) at io/task.c:196
#6 0x00005555555745f4 in qio_task_thread_result (opaque=0x55555582f300) at io/task.c:111
#7 qio_task_wait_thread (task=0x55555582f300) at io/task.c:190
#8 0x000055555558f17e in tcp_chr_wait_connected (chr=0x55555582ee90, errp=0x555555802a08 <error_abort>) at chardev/char-socket.c:1013
#9 0x0000555555567cbd in char_socket_client_reconnect_test (opaque=0x5555557fe020 <client8unix>) at tests/test-char.c:1152
The second call:
#0 0x00007ffff5ac3277 in raise () from /lib64/libc.so.6
#1 0x00007ffff5ac4968 in abort () from /lib64/libc.so.6
#2 0x00007ffff5abc096 in __assert_fail_base () from /lib64/libc.so.6
#3 0x00007ffff5abc142 in __assert_fail () from /lib64/libc.so.6
#4 0x000055555558d10a in qemu_chr_socket_restart_timer (chr=0x55555582ee90) at chardev/char-socket.c:125
#5 0x000055555558df0c in tcp_chr_disconnect_locked (chr=<optimized out>) at chardev/char-socket.c:490
#6 0x000055555558df4d in tcp_chr_disconnect (chr=0x55555582ee90) at chardev/char-socket.c:497
#7 0x000055555558e5b2 in tcp_chr_new_client (chr=chr@entry=0x55555582ee90, sioc=sioc@entry=0x55555582f0b0) at chardev/char-socket.c:892
#8 0x000055555558e93a in tcp_chr_connect_client_sync (chr=chr@entry=0x55555582ee90, errp=errp@entry=0x7fffffffd178) at chardev/char-socket.c:944
#9 0x000055555558ec78 in tcp_chr_wait_connected (chr=0x55555582ee90, errp=0x555555802a08 <error_abort>) at chardev/char-socket.c:1035
#10 0x000055555556804b in char_socket_client_test (opaque=0x5555557fe020 <client8unix>) at tests/test-char.c:1023
Run test/test-char to reproduce this issue.
test-char: chardev/char-socket.c:125: qemu_chr_socket_restart_timer: Assertion `!s->reconnect_timer' failed.
Signed-off-by: Li Feng <fengli@smartx.com>
---
v2:
- Rewrite the solution.
- Add test to reproduce this issue.
chardev/char-socket.c | 2 +-
tests/test-char.c | 48 ++++++++++++++++++++++++++++++++++++++----------
2 files changed, 39 insertions(+), 11 deletions(-)
diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 1f14c2c7c8..d84330b3c9 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -486,7 +486,7 @@ static void tcp_chr_disconnect_locked(Chardev *chr)
if (emit_close) {
qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
}
- if (s->reconnect_time) {
+ if (s->reconnect_time && !s->reconnect_timer) {
qemu_chr_socket_restart_timer(chr);
}
}
diff --git a/tests/test-char.c b/tests/test-char.c
index 8d39bdc9fa..13dbbfe2a3 100644
--- a/tests/test-char.c
+++ b/tests/test-char.c
@@ -625,12 +625,14 @@ static void char_udp_test(void)
typedef struct {
int event;
bool got_pong;
+ CharBackend *be;
} CharSocketTestData;
#define SOCKET_PING "Hello"
#define SOCKET_PONG "World"
+typedef void (*char_socket_cb)(void *opaque, QEMUChrEvent event);
static void
char_socket_event(void *opaque, QEMUChrEvent event)
@@ -639,6 +641,23 @@ char_socket_event(void *opaque, QEMUChrEvent event)
data->event = event;
}
+static void
+char_socket_event_with_error(void *opaque, QEMUChrEvent event)
+{
+ CharSocketTestData *data = opaque;
+ CharBackend *be = data->be;
+ data->event = event;
+ switch (event) {
+ case CHR_EVENT_OPENED:
+ qemu_chr_fe_disconnect(be);
+ return;
+ case CHR_EVENT_CLOSED:
+ return;
+ default:
+ return;
+ }
+}
+
static void
char_socket_read(void *opaque, const uint8_t *buf, int size)
@@ -783,6 +802,7 @@ static void char_socket_server_test(gconstpointer opaque)
reconnect:
data.event = -1;
+ data.be = &be;
qemu_chr_fe_set_handlers(&be, NULL, NULL,
char_socket_event, NULL,
&data, NULL, true);
@@ -869,6 +889,7 @@ typedef struct {
const char *reconnect;
bool wait_connected;
bool fd_pass;
+ char_socket_cb event_cb;
} CharSocketClientTestConfig;
static void char_socket_client_dupid_test(gconstpointer opaque)
@@ -920,6 +941,7 @@ static void char_socket_client_dupid_test(gconstpointer opaque)
static void char_socket_client_test(gconstpointer opaque)
{
const CharSocketClientTestConfig *config = opaque;
+ const char_socket_cb event_cb = config->event_cb;
QIOChannelSocket *ioc;
char *optstr;
Chardev *chr;
@@ -983,8 +1005,9 @@ static void char_socket_client_test(gconstpointer opaque)
reconnect:
data.event = -1;
+ data.be = &be;
qemu_chr_fe_set_handlers(&be, NULL, NULL,
- char_socket_event, NULL,
+ event_cb, NULL,
&data, NULL, true);
if (config->reconnect) {
g_assert(data.event == -1);
@@ -1022,7 +1045,7 @@ static void char_socket_client_test(gconstpointer opaque)
/* Setup a callback to receive the reply to our greeting */
qemu_chr_fe_set_handlers(&be, char_socket_can_read,
char_socket_read,
- char_socket_event, NULL,
+ event_cb, NULL,
&data, NULL, true);
g_assert(data.event == CHR_EVENT_OPENED);
data.event = -1;
@@ -1467,19 +1490,22 @@ int main(int argc, char **argv)
#define SOCKET_CLIENT_TEST(name, addr) \
static CharSocketClientTestConfig client1 ## name = \
- { addr, NULL, false, false }; \
+ { addr, NULL, false, false, char_socket_event}; \
static CharSocketClientTestConfig client2 ## name = \
- { addr, NULL, true, false }; \
+ { addr, NULL, true, false, char_socket_event }; \
static CharSocketClientTestConfig client3 ## name = \
- { addr, ",reconnect=1", false }; \
+ { addr, ",reconnect=1", false, false, char_socket_event }; \
static CharSocketClientTestConfig client4 ## name = \
- { addr, ",reconnect=1", true }; \
+ { addr, ",reconnect=1", true, false, char_socket_event }; \
static CharSocketClientTestConfig client5 ## name = \
- { addr, NULL, false, true }; \
+ { addr, NULL, false, true, char_socket_event }; \
static CharSocketClientTestConfig client6 ## name = \
- { addr, NULL, true, true }; \
+ { addr, NULL, true, true, char_socket_event }; \
static CharSocketClientTestConfig client7 ## name = \
- { addr, ",reconnect=1", false, false }; \
+ { addr, ",reconnect=1", false, false, char_socket_event }; \
+ static CharSocketClientTestConfig client8 ## name = \
+ { addr, ",reconnect=1", true, false, \
+ char_socket_event_with_error }; \
g_test_add_data_func("/char/socket/client/mainloop/" # name, \
&client1 ##name, char_socket_client_test); \
g_test_add_data_func("/char/socket/client/wait-conn/" # name, \
@@ -1493,7 +1519,9 @@ int main(int argc, char **argv)
g_test_add_data_func("/char/socket/client/wait-conn-fdpass/" # name, \
&client6 ##name, char_socket_client_test); \
g_test_add_data_func("/char/socket/client/dupid-reconnect/" # name, \
- &client7 ##name, char_socket_client_dupid_test)
+ &client7 ##name, char_socket_client_dupid_test);\
+ g_test_add_data_func("/char/socket/client/reconnect-error/" # name, \
+ &client8 ##name, char_socket_client_test)
if (has_ipv4) {
SOCKET_SERVER_TEST(tcp, &tcpaddr);
--
2.11.0
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v2] char-socket: initialize reconnect timer only when the timer doesn't start
2020-04-28 8:54 ` [PATCH v2] " Li Feng
@ 2020-04-28 8:59 ` Li Feng
2020-04-28 11:05 ` Marc-André Lureau
0 siblings, 1 reply; 32+ messages in thread
From: Li Feng @ 2020-04-28 8:59 UTC (permalink / raw)
To: Kyle Zhang, Feng Li, Dima Stepanov, Marc-André Lureau,
Paolo Bonzini, open list:All patches CC here, Yutian Zhou
Sorry for sending the weird same mail out.
Ignore the second one.
Hi Lureau,
I found another issue when running my new test: tests/test-char -p
/char/socket/client/reconnect-error/unix
The backtrace like this:
#0 0x00007ffff5ac3277 in raise () from /lib64/libc.so.6
#1 0x00007ffff5ac4968 in abort () from /lib64/libc.so.6
#2 0x00005555555aaa34 in error_handle_fatal (errp=<optimized out>,
err=0x7fffec0012d0) at util/error.c:40
#3 0x00005555555aab0d in error_setv (errp=0x555555802a08
<error_abort>, src=0x5555555c4220 "io/channel.c", line=148,
func=0x5555555c4520 <__func__.17450> "qio_channel_readv_all",
err_class=ERROR_CLASS_GENERIC_ERROR, fmt=<optimized out>,
ap=0x7ffff423bb10, suffix=0x0) at util/error.c:73
#4 0x00005555555aac90 in error_setg_internal
(errp=errp@entry=0x555555802a08 <error_abort>,
src=src@entry=0x5555555c4220 "io/channel.c", line=line@entry=148,
func=func@entry=0x5555555c4520 <__func__.17450>
"qio_channel_readv_all", fmt=fmt@entry=0x5555555c4340 "Unexpected
end-of-file before all bytes were read") at util/error.c:97
#5 0x000055555556c1fc in qio_channel_readv_all (ioc=<optimized out>,
iov=<optimized out>, niov=<optimized out>, errp=0x555555802a08
<error_abort>) at io/channel.c:147
#6 0x000055555556c23a in qio_channel_read_all (ioc=<optimized out>,
buf=<optimized out>, buflen=<optimized out>, errp=<optimized out>) at
io/channel.c:247
#7 0x000055555556ad0e in char_socket_ping_pong (ioc=0x7fffec0008c0)
at tests/test-char.c:727
#8 0x000055555556adcf in char_socket_client_server_thread
(data=data@entry=0x55555582e350) at tests/test-char.c:881
#9 0x00005555555a9556 in qemu_thread_start (args=<optimized out>) at
util/qemu-thread-posix.c:519
#10 0x00007ffff5e61e25 in start_thread () from /lib64/libpthread.so.0
#11 0x00007ffff5b8bbad in clone () from /lib64/libc.so.6
I think this is a new issue of qemu, not my test issue.
How do you think?
Thanks,
Feng Li
Li Feng <fengli@smartx.com> 于2020年4月28日周二 下午4:53写道:
>
> When the disconnect event is triggered in the connecting stage,
> the tcp_chr_disconnect_locked may be called twice.
>
> The first call:
> #0 qemu_chr_socket_restart_timer (chr=0x55555582ee90) at chardev/char-socket.c:120
> #1 0x000055555558e38c in tcp_chr_disconnect_locked (chr=<optimized out>) at chardev/char-socket.c:490
> #2 0x000055555558e3cd in tcp_chr_disconnect (chr=0x55555582ee90) at chardev/char-socket.c:497
> #3 0x000055555558ea32 in tcp_chr_new_client (chr=chr@entry=0x55555582ee90, sioc=sioc@entry=0x55555582f0b0) at chardev/char-socket.c:892
> #4 0x000055555558eeb8 in qemu_chr_socket_connected (task=0x55555582f300, opaque=<optimized out>) at chardev/char-socket.c:1090
> #5 0x0000555555574352 in qio_task_complete (task=task@entry=0x55555582f300) at io/task.c:196
> #6 0x00005555555745f4 in qio_task_thread_result (opaque=0x55555582f300) at io/task.c:111
> #7 qio_task_wait_thread (task=0x55555582f300) at io/task.c:190
> #8 0x000055555558f17e in tcp_chr_wait_connected (chr=0x55555582ee90, errp=0x555555802a08 <error_abort>) at chardev/char-socket.c:1013
> #9 0x0000555555567cbd in char_socket_client_reconnect_test (opaque=0x5555557fe020 <client8unix>) at tests/test-char.c:1152
> The second call:
> #0 0x00007ffff5ac3277 in raise () from /lib64/libc.so.6
> #1 0x00007ffff5ac4968 in abort () from /lib64/libc.so.6
> #2 0x00007ffff5abc096 in __assert_fail_base () from /lib64/libc.so.6
> #3 0x00007ffff5abc142 in __assert_fail () from /lib64/libc.so.6
> #4 0x000055555558d10a in qemu_chr_socket_restart_timer (chr=0x55555582ee90) at chardev/char-socket.c:125
> #5 0x000055555558df0c in tcp_chr_disconnect_locked (chr=<optimized out>) at chardev/char-socket.c:490
> #6 0x000055555558df4d in tcp_chr_disconnect (chr=0x55555582ee90) at chardev/char-socket.c:497
> #7 0x000055555558e5b2 in tcp_chr_new_client (chr=chr@entry=0x55555582ee90, sioc=sioc@entry=0x55555582f0b0) at chardev/char-socket.c:892
> #8 0x000055555558e93a in tcp_chr_connect_client_sync (chr=chr@entry=0x55555582ee90, errp=errp@entry=0x7fffffffd178) at chardev/char-socket.c:944
> #9 0x000055555558ec78 in tcp_chr_wait_connected (chr=0x55555582ee90, errp=0x555555802a08 <error_abort>) at chardev/char-socket.c:1035
> #10 0x000055555556804b in char_socket_client_test (opaque=0x5555557fe020 <client8unix>) at tests/test-char.c:1023
>
> Run test/test-char to reproduce this issue.
>
> test-char: chardev/char-socket.c:125: qemu_chr_socket_restart_timer: Assertion `!s->reconnect_timer' failed.
>
> Signed-off-by: Li Feng <fengli@smartx.com>
> ---
> v2:
> - Rewrite the solution.
> - Add test to reproduce this issue.
>
> chardev/char-socket.c | 2 +-
> tests/test-char.c | 48 ++++++++++++++++++++++++++++++++++++++----------
> 2 files changed, 39 insertions(+), 11 deletions(-)
>
> diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> index 1f14c2c7c8..d84330b3c9 100644
> --- a/chardev/char-socket.c
> +++ b/chardev/char-socket.c
> @@ -486,7 +486,7 @@ static void tcp_chr_disconnect_locked(Chardev *chr)
> if (emit_close) {
> qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
> }
> - if (s->reconnect_time) {
> + if (s->reconnect_time && !s->reconnect_timer) {
> qemu_chr_socket_restart_timer(chr);
> }
> }
> diff --git a/tests/test-char.c b/tests/test-char.c
> index 8d39bdc9fa..13dbbfe2a3 100644
> --- a/tests/test-char.c
> +++ b/tests/test-char.c
> @@ -625,12 +625,14 @@ static void char_udp_test(void)
> typedef struct {
> int event;
> bool got_pong;
> + CharBackend *be;
> } CharSocketTestData;
>
>
> #define SOCKET_PING "Hello"
> #define SOCKET_PONG "World"
>
> +typedef void (*char_socket_cb)(void *opaque, QEMUChrEvent event);
>
> static void
> char_socket_event(void *opaque, QEMUChrEvent event)
> @@ -639,6 +641,23 @@ char_socket_event(void *opaque, QEMUChrEvent event)
> data->event = event;
> }
>
> +static void
> +char_socket_event_with_error(void *opaque, QEMUChrEvent event)
> +{
> + CharSocketTestData *data = opaque;
> + CharBackend *be = data->be;
> + data->event = event;
> + switch (event) {
> + case CHR_EVENT_OPENED:
> + qemu_chr_fe_disconnect(be);
> + return;
> + case CHR_EVENT_CLOSED:
> + return;
> + default:
> + return;
> + }
> +}
> +
>
> static void
> char_socket_read(void *opaque, const uint8_t *buf, int size)
> @@ -783,6 +802,7 @@ static void char_socket_server_test(gconstpointer opaque)
>
> reconnect:
> data.event = -1;
> + data.be = &be;
> qemu_chr_fe_set_handlers(&be, NULL, NULL,
> char_socket_event, NULL,
> &data, NULL, true);
> @@ -869,6 +889,7 @@ typedef struct {
> const char *reconnect;
> bool wait_connected;
> bool fd_pass;
> + char_socket_cb event_cb;
> } CharSocketClientTestConfig;
>
> static void char_socket_client_dupid_test(gconstpointer opaque)
> @@ -920,6 +941,7 @@ static void char_socket_client_dupid_test(gconstpointer opaque)
> static void char_socket_client_test(gconstpointer opaque)
> {
> const CharSocketClientTestConfig *config = opaque;
> + const char_socket_cb event_cb = config->event_cb;
> QIOChannelSocket *ioc;
> char *optstr;
> Chardev *chr;
> @@ -983,8 +1005,9 @@ static void char_socket_client_test(gconstpointer opaque)
>
> reconnect:
> data.event = -1;
> + data.be = &be;
> qemu_chr_fe_set_handlers(&be, NULL, NULL,
> - char_socket_event, NULL,
> + event_cb, NULL,
> &data, NULL, true);
> if (config->reconnect) {
> g_assert(data.event == -1);
> @@ -1022,7 +1045,7 @@ static void char_socket_client_test(gconstpointer opaque)
> /* Setup a callback to receive the reply to our greeting */
> qemu_chr_fe_set_handlers(&be, char_socket_can_read,
> char_socket_read,
> - char_socket_event, NULL,
> + event_cb, NULL,
> &data, NULL, true);
> g_assert(data.event == CHR_EVENT_OPENED);
> data.event = -1;
> @@ -1467,19 +1490,22 @@ int main(int argc, char **argv)
>
> #define SOCKET_CLIENT_TEST(name, addr) \
> static CharSocketClientTestConfig client1 ## name = \
> - { addr, NULL, false, false }; \
> + { addr, NULL, false, false, char_socket_event}; \
> static CharSocketClientTestConfig client2 ## name = \
> - { addr, NULL, true, false }; \
> + { addr, NULL, true, false, char_socket_event }; \
> static CharSocketClientTestConfig client3 ## name = \
> - { addr, ",reconnect=1", false }; \
> + { addr, ",reconnect=1", false, false, char_socket_event }; \
> static CharSocketClientTestConfig client4 ## name = \
> - { addr, ",reconnect=1", true }; \
> + { addr, ",reconnect=1", true, false, char_socket_event }; \
> static CharSocketClientTestConfig client5 ## name = \
> - { addr, NULL, false, true }; \
> + { addr, NULL, false, true, char_socket_event }; \
> static CharSocketClientTestConfig client6 ## name = \
> - { addr, NULL, true, true }; \
> + { addr, NULL, true, true, char_socket_event }; \
> static CharSocketClientTestConfig client7 ## name = \
> - { addr, ",reconnect=1", false, false }; \
> + { addr, ",reconnect=1", false, false, char_socket_event }; \
> + static CharSocketClientTestConfig client8 ## name = \
> + { addr, ",reconnect=1", true, false, \
> + char_socket_event_with_error }; \
> g_test_add_data_func("/char/socket/client/mainloop/" # name, \
> &client1 ##name, char_socket_client_test); \
> g_test_add_data_func("/char/socket/client/wait-conn/" # name, \
> @@ -1493,7 +1519,9 @@ int main(int argc, char **argv)
> g_test_add_data_func("/char/socket/client/wait-conn-fdpass/" # name, \
> &client6 ##name, char_socket_client_test); \
> g_test_add_data_func("/char/socket/client/dupid-reconnect/" # name, \
> - &client7 ##name, char_socket_client_dupid_test)
> + &client7 ##name, char_socket_client_dupid_test);\
> + g_test_add_data_func("/char/socket/client/reconnect-error/" # name, \
> + &client8 ##name, char_socket_client_test)
>
> if (has_ipv4) {
> SOCKET_SERVER_TEST(tcp, &tcpaddr);
> --
> 2.11.0
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v2] char-socket: initialize reconnect timer only when the timer doesn't start
2020-04-28 8:59 ` Li Feng
@ 2020-04-28 11:05 ` Marc-André Lureau
2020-04-28 14:09 ` Feng Li
0 siblings, 1 reply; 32+ messages in thread
From: Marc-André Lureau @ 2020-04-28 11:05 UTC (permalink / raw)
To: Li Feng
Cc: Feng Li, open list:All patches CC here, Yutian Zhou, Kyle Zhang,
Paolo Bonzini, Dima Stepanov
Hi
On Tue, Apr 28, 2020 at 10:59 AM Li Feng <fengli@smartx.com> wrote:
>
> Sorry for sending the weird same mail out.
> Ignore the second one.
>
> Hi Lureau,
>
> I found another issue when running my new test: tests/test-char -p
> /char/socket/client/reconnect-error/unix
> The backtrace like this:
> #0 0x00007ffff5ac3277 in raise () from /lib64/libc.so.6
> #1 0x00007ffff5ac4968 in abort () from /lib64/libc.so.6
> #2 0x00005555555aaa34 in error_handle_fatal (errp=<optimized out>,
> err=0x7fffec0012d0) at util/error.c:40
> #3 0x00005555555aab0d in error_setv (errp=0x555555802a08
> <error_abort>, src=0x5555555c4220 "io/channel.c", line=148,
> func=0x5555555c4520 <__func__.17450> "qio_channel_readv_all",
> err_class=ERROR_CLASS_GENERIC_ERROR, fmt=<optimized out>,
> ap=0x7ffff423bb10, suffix=0x0) at util/error.c:73
> #4 0x00005555555aac90 in error_setg_internal
> (errp=errp@entry=0x555555802a08 <error_abort>,
> src=src@entry=0x5555555c4220 "io/channel.c", line=line@entry=148,
> func=func@entry=0x5555555c4520 <__func__.17450>
> "qio_channel_readv_all", fmt=fmt@entry=0x5555555c4340 "Unexpected
> end-of-file before all bytes were read") at util/error.c:97
> #5 0x000055555556c1fc in qio_channel_readv_all (ioc=<optimized out>,
> iov=<optimized out>, niov=<optimized out>, errp=0x555555802a08
> <error_abort>) at io/channel.c:147
> #6 0x000055555556c23a in qio_channel_read_all (ioc=<optimized out>,
> buf=<optimized out>, buflen=<optimized out>, errp=<optimized out>) at
> io/channel.c:247
> #7 0x000055555556ad0e in char_socket_ping_pong (ioc=0x7fffec0008c0)
> at tests/test-char.c:727
> #8 0x000055555556adcf in char_socket_client_server_thread
> (data=data@entry=0x55555582e350) at tests/test-char.c:881
> #9 0x00005555555a9556 in qemu_thread_start (args=<optimized out>) at
> util/qemu-thread-posix.c:519
> #10 0x00007ffff5e61e25 in start_thread () from /lib64/libpthread.so.0
> #11 0x00007ffff5b8bbad in clone () from /lib64/libc.so.6
>
> I think this is a new issue of qemu, not my test issue.
> How do you think?
No idea, it could be a bug in the test itself. How did you produce it?
>
> Thanks,
>
> Feng Li
>
> Li Feng <fengli@smartx.com> 于2020年4月28日周二 下午4:53写道:
>
> >
> > When the disconnect event is triggered in the connecting stage,
> > the tcp_chr_disconnect_locked may be called twice.
> >
> > The first call:
> > #0 qemu_chr_socket_restart_timer (chr=0x55555582ee90) at chardev/char-socket.c:120
> > #1 0x000055555558e38c in tcp_chr_disconnect_locked (chr=<optimized out>) at chardev/char-socket.c:490
> > #2 0x000055555558e3cd in tcp_chr_disconnect (chr=0x55555582ee90) at chardev/char-socket.c:497
> > #3 0x000055555558ea32 in tcp_chr_new_client (chr=chr@entry=0x55555582ee90, sioc=sioc@entry=0x55555582f0b0) at chardev/char-socket.c:892
> > #4 0x000055555558eeb8 in qemu_chr_socket_connected (task=0x55555582f300, opaque=<optimized out>) at chardev/char-socket.c:1090
> > #5 0x0000555555574352 in qio_task_complete (task=task@entry=0x55555582f300) at io/task.c:196
> > #6 0x00005555555745f4 in qio_task_thread_result (opaque=0x55555582f300) at io/task.c:111
> > #7 qio_task_wait_thread (task=0x55555582f300) at io/task.c:190
> > #8 0x000055555558f17e in tcp_chr_wait_connected (chr=0x55555582ee90, errp=0x555555802a08 <error_abort>) at chardev/char-socket.c:1013
> > #9 0x0000555555567cbd in char_socket_client_reconnect_test (opaque=0x5555557fe020 <client8unix>) at tests/test-char.c:1152
> > The second call:
> > #0 0x00007ffff5ac3277 in raise () from /lib64/libc.so.6
> > #1 0x00007ffff5ac4968 in abort () from /lib64/libc.so.6
> > #2 0x00007ffff5abc096 in __assert_fail_base () from /lib64/libc.so.6
> > #3 0x00007ffff5abc142 in __assert_fail () from /lib64/libc.so.6
> > #4 0x000055555558d10a in qemu_chr_socket_restart_timer (chr=0x55555582ee90) at chardev/char-socket.c:125
> > #5 0x000055555558df0c in tcp_chr_disconnect_locked (chr=<optimized out>) at chardev/char-socket.c:490
> > #6 0x000055555558df4d in tcp_chr_disconnect (chr=0x55555582ee90) at chardev/char-socket.c:497
> > #7 0x000055555558e5b2 in tcp_chr_new_client (chr=chr@entry=0x55555582ee90, sioc=sioc@entry=0x55555582f0b0) at chardev/char-socket.c:892
> > #8 0x000055555558e93a in tcp_chr_connect_client_sync (chr=chr@entry=0x55555582ee90, errp=errp@entry=0x7fffffffd178) at chardev/char-socket.c:944
> > #9 0x000055555558ec78 in tcp_chr_wait_connected (chr=0x55555582ee90, errp=0x555555802a08 <error_abort>) at chardev/char-socket.c:1035
> > #10 0x000055555556804b in char_socket_client_test (opaque=0x5555557fe020 <client8unix>) at tests/test-char.c:1023
> >
> > Run test/test-char to reproduce this issue.
> >
> > test-char: chardev/char-socket.c:125: qemu_chr_socket_restart_timer: Assertion `!s->reconnect_timer' failed.
> >
> > Signed-off-by: Li Feng <fengli@smartx.com>
> > ---
> > v2:
> > - Rewrite the solution.
> > - Add test to reproduce this issue.
> >
> > chardev/char-socket.c | 2 +-
> > tests/test-char.c | 48 ++++++++++++++++++++++++++++++++++++++----------
> > 2 files changed, 39 insertions(+), 11 deletions(-)
> >
> > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> > index 1f14c2c7c8..d84330b3c9 100644
> > --- a/chardev/char-socket.c
> > +++ b/chardev/char-socket.c
> > @@ -486,7 +486,7 @@ static void tcp_chr_disconnect_locked(Chardev *chr)
> > if (emit_close) {
> > qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
> > }
> > - if (s->reconnect_time) {
> > + if (s->reconnect_time && !s->reconnect_timer) {
> > qemu_chr_socket_restart_timer(chr);
> > }
> > }
> > diff --git a/tests/test-char.c b/tests/test-char.c
> > index 8d39bdc9fa..13dbbfe2a3 100644
> > --- a/tests/test-char.c
> > +++ b/tests/test-char.c
> > @@ -625,12 +625,14 @@ static void char_udp_test(void)
> > typedef struct {
> > int event;
> > bool got_pong;
> > + CharBackend *be;
> > } CharSocketTestData;
> >
> >
> > #define SOCKET_PING "Hello"
> > #define SOCKET_PONG "World"
> >
> > +typedef void (*char_socket_cb)(void *opaque, QEMUChrEvent event);
> >
> > static void
> > char_socket_event(void *opaque, QEMUChrEvent event)
> > @@ -639,6 +641,23 @@ char_socket_event(void *opaque, QEMUChrEvent event)
> > data->event = event;
> > }
> >
> > +static void
> > +char_socket_event_with_error(void *opaque, QEMUChrEvent event)
> > +{
> > + CharSocketTestData *data = opaque;
> > + CharBackend *be = data->be;
> > + data->event = event;
> > + switch (event) {
> > + case CHR_EVENT_OPENED:
> > + qemu_chr_fe_disconnect(be);
> > + return;
> > + case CHR_EVENT_CLOSED:
> > + return;
> > + default:
> > + return;
> > + }
> > +}
> > +
> >
> > static void
> > char_socket_read(void *opaque, const uint8_t *buf, int size)
> > @@ -783,6 +802,7 @@ static void char_socket_server_test(gconstpointer opaque)
> >
> > reconnect:
> > data.event = -1;
> > + data.be = &be;
> > qemu_chr_fe_set_handlers(&be, NULL, NULL,
> > char_socket_event, NULL,
> > &data, NULL, true);
> > @@ -869,6 +889,7 @@ typedef struct {
> > const char *reconnect;
> > bool wait_connected;
> > bool fd_pass;
> > + char_socket_cb event_cb;
> > } CharSocketClientTestConfig;
> >
> > static void char_socket_client_dupid_test(gconstpointer opaque)
> > @@ -920,6 +941,7 @@ static void char_socket_client_dupid_test(gconstpointer opaque)
> > static void char_socket_client_test(gconstpointer opaque)
> > {
> > const CharSocketClientTestConfig *config = opaque;
> > + const char_socket_cb event_cb = config->event_cb;
> > QIOChannelSocket *ioc;
> > char *optstr;
> > Chardev *chr;
> > @@ -983,8 +1005,9 @@ static void char_socket_client_test(gconstpointer opaque)
> >
> > reconnect:
> > data.event = -1;
> > + data.be = &be;
> > qemu_chr_fe_set_handlers(&be, NULL, NULL,
> > - char_socket_event, NULL,
> > + event_cb, NULL,
> > &data, NULL, true);
> > if (config->reconnect) {
> > g_assert(data.event == -1);
> > @@ -1022,7 +1045,7 @@ static void char_socket_client_test(gconstpointer opaque)
> > /* Setup a callback to receive the reply to our greeting */
> > qemu_chr_fe_set_handlers(&be, char_socket_can_read,
> > char_socket_read,
> > - char_socket_event, NULL,
> > + event_cb, NULL,
> > &data, NULL, true);
> > g_assert(data.event == CHR_EVENT_OPENED);
> > data.event = -1;
> > @@ -1467,19 +1490,22 @@ int main(int argc, char **argv)
> >
> > #define SOCKET_CLIENT_TEST(name, addr) \
> > static CharSocketClientTestConfig client1 ## name = \
> > - { addr, NULL, false, false }; \
> > + { addr, NULL, false, false, char_socket_event}; \
> > static CharSocketClientTestConfig client2 ## name = \
> > - { addr, NULL, true, false }; \
> > + { addr, NULL, true, false, char_socket_event }; \
> > static CharSocketClientTestConfig client3 ## name = \
> > - { addr, ",reconnect=1", false }; \
> > + { addr, ",reconnect=1", false, false, char_socket_event }; \
> > static CharSocketClientTestConfig client4 ## name = \
> > - { addr, ",reconnect=1", true }; \
> > + { addr, ",reconnect=1", true, false, char_socket_event }; \
> > static CharSocketClientTestConfig client5 ## name = \
> > - { addr, NULL, false, true }; \
> > + { addr, NULL, false, true, char_socket_event }; \
> > static CharSocketClientTestConfig client6 ## name = \
> > - { addr, NULL, true, true }; \
> > + { addr, NULL, true, true, char_socket_event }; \
> > static CharSocketClientTestConfig client7 ## name = \
> > - { addr, ",reconnect=1", false, false }; \
> > + { addr, ",reconnect=1", false, false, char_socket_event }; \
> > + static CharSocketClientTestConfig client8 ## name = \
> > + { addr, ",reconnect=1", true, false, \
> > + char_socket_event_with_error }; \
> > g_test_add_data_func("/char/socket/client/mainloop/" # name, \
> > &client1 ##name, char_socket_client_test); \
> > g_test_add_data_func("/char/socket/client/wait-conn/" # name, \
> > @@ -1493,7 +1519,9 @@ int main(int argc, char **argv)
> > g_test_add_data_func("/char/socket/client/wait-conn-fdpass/" # name, \
> > &client6 ##name, char_socket_client_test); \
> > g_test_add_data_func("/char/socket/client/dupid-reconnect/" # name, \
> > - &client7 ##name, char_socket_client_dupid_test)
> > + &client7 ##name, char_socket_client_dupid_test);\
> > + g_test_add_data_func("/char/socket/client/reconnect-error/" # name, \
> > + &client8 ##name, char_socket_client_test)
> >
> > if (has_ipv4) {
> > SOCKET_SERVER_TEST(tcp, &tcpaddr);
> > --
> > 2.11.0
> >
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v2] char-socket: initialize reconnect timer only when the timer doesn't start
2020-04-28 11:05 ` Marc-André Lureau
@ 2020-04-28 14:09 ` Feng Li
0 siblings, 0 replies; 32+ messages in thread
From: Feng Li @ 2020-04-28 14:09 UTC (permalink / raw)
To: Marc-André Lureau
Cc: open list:All patches CC here, Yutian Zhou, Kyle Zhang,
Paolo Bonzini, Li Feng, Dima Stepanov
Apply the above patch, then `make check-unit`
or
`tests/test-char -p /char/socket/client/reconnect-error/unix`
Marc-André Lureau <marcandre.lureau@redhat.com> 于2020年4月28日周二 下午7:06写道:
>
> Hi
>
> On Tue, Apr 28, 2020 at 10:59 AM Li Feng <fengli@smartx.com> wrote:
> >
> > Sorry for sending the weird same mail out.
> > Ignore the second one.
> >
> > Hi Lureau,
> >
> > I found another issue when running my new test: tests/test-char -p
> > /char/socket/client/reconnect-error/unix
> > The backtrace like this:
> > #0 0x00007ffff5ac3277 in raise () from /lib64/libc.so.6
> > #1 0x00007ffff5ac4968 in abort () from /lib64/libc.so.6
> > #2 0x00005555555aaa34 in error_handle_fatal (errp=<optimized out>,
> > err=0x7fffec0012d0) at util/error.c:40
> > #3 0x00005555555aab0d in error_setv (errp=0x555555802a08
> > <error_abort>, src=0x5555555c4220 "io/channel.c", line=148,
> > func=0x5555555c4520 <__func__.17450> "qio_channel_readv_all",
> > err_class=ERROR_CLASS_GENERIC_ERROR, fmt=<optimized out>,
> > ap=0x7ffff423bb10, suffix=0x0) at util/error.c:73
> > #4 0x00005555555aac90 in error_setg_internal
> > (errp=errp@entry=0x555555802a08 <error_abort>,
> > src=src@entry=0x5555555c4220 "io/channel.c", line=line@entry=148,
> > func=func@entry=0x5555555c4520 <__func__.17450>
> > "qio_channel_readv_all", fmt=fmt@entry=0x5555555c4340 "Unexpected
> > end-of-file before all bytes were read") at util/error.c:97
> > #5 0x000055555556c1fc in qio_channel_readv_all (ioc=<optimized out>,
> > iov=<optimized out>, niov=<optimized out>, errp=0x555555802a08
> > <error_abort>) at io/channel.c:147
> > #6 0x000055555556c23a in qio_channel_read_all (ioc=<optimized out>,
> > buf=<optimized out>, buflen=<optimized out>, errp=<optimized out>) at
> > io/channel.c:247
> > #7 0x000055555556ad0e in char_socket_ping_pong (ioc=0x7fffec0008c0)
> > at tests/test-char.c:727
> > #8 0x000055555556adcf in char_socket_client_server_thread
> > (data=data@entry=0x55555582e350) at tests/test-char.c:881
> > #9 0x00005555555a9556 in qemu_thread_start (args=<optimized out>) at
> > util/qemu-thread-posix.c:519
> > #10 0x00007ffff5e61e25 in start_thread () from /lib64/libpthread.so.0
> > #11 0x00007ffff5b8bbad in clone () from /lib64/libc.so.6
> >
> > I think this is a new issue of qemu, not my test issue.
> > How do you think?
>
>
> No idea, it could be a bug in the test itself. How did you produce it?
>
> >
> > Thanks,
> >
> > Feng Li
> >
> > Li Feng <fengli@smartx.com> 于2020年4月28日周二 下午4:53写道:
> >
> > >
> > > When the disconnect event is triggered in the connecting stage,
> > > the tcp_chr_disconnect_locked may be called twice.
> > >
> > > The first call:
> > > #0 qemu_chr_socket_restart_timer (chr=0x55555582ee90) at chardev/char-socket.c:120
> > > #1 0x000055555558e38c in tcp_chr_disconnect_locked (chr=<optimized out>) at chardev/char-socket.c:490
> > > #2 0x000055555558e3cd in tcp_chr_disconnect (chr=0x55555582ee90) at chardev/char-socket.c:497
> > > #3 0x000055555558ea32 in tcp_chr_new_client (chr=chr@entry=0x55555582ee90, sioc=sioc@entry=0x55555582f0b0) at chardev/char-socket.c:892
> > > #4 0x000055555558eeb8 in qemu_chr_socket_connected (task=0x55555582f300, opaque=<optimized out>) at chardev/char-socket.c:1090
> > > #5 0x0000555555574352 in qio_task_complete (task=task@entry=0x55555582f300) at io/task.c:196
> > > #6 0x00005555555745f4 in qio_task_thread_result (opaque=0x55555582f300) at io/task.c:111
> > > #7 qio_task_wait_thread (task=0x55555582f300) at io/task.c:190
> > > #8 0x000055555558f17e in tcp_chr_wait_connected (chr=0x55555582ee90, errp=0x555555802a08 <error_abort>) at chardev/char-socket.c:1013
> > > #9 0x0000555555567cbd in char_socket_client_reconnect_test (opaque=0x5555557fe020 <client8unix>) at tests/test-char.c:1152
> > > The second call:
> > > #0 0x00007ffff5ac3277 in raise () from /lib64/libc.so.6
> > > #1 0x00007ffff5ac4968 in abort () from /lib64/libc.so.6
> > > #2 0x00007ffff5abc096 in __assert_fail_base () from /lib64/libc.so.6
> > > #3 0x00007ffff5abc142 in __assert_fail () from /lib64/libc.so.6
> > > #4 0x000055555558d10a in qemu_chr_socket_restart_timer (chr=0x55555582ee90) at chardev/char-socket.c:125
> > > #5 0x000055555558df0c in tcp_chr_disconnect_locked (chr=<optimized out>) at chardev/char-socket.c:490
> > > #6 0x000055555558df4d in tcp_chr_disconnect (chr=0x55555582ee90) at chardev/char-socket.c:497
> > > #7 0x000055555558e5b2 in tcp_chr_new_client (chr=chr@entry=0x55555582ee90, sioc=sioc@entry=0x55555582f0b0) at chardev/char-socket.c:892
> > > #8 0x000055555558e93a in tcp_chr_connect_client_sync (chr=chr@entry=0x55555582ee90, errp=errp@entry=0x7fffffffd178) at chardev/char-socket.c:944
> > > #9 0x000055555558ec78 in tcp_chr_wait_connected (chr=0x55555582ee90, errp=0x555555802a08 <error_abort>) at chardev/char-socket.c:1035
> > > #10 0x000055555556804b in char_socket_client_test (opaque=0x5555557fe020 <client8unix>) at tests/test-char.c:1023
> > >
> > > Run test/test-char to reproduce this issue.
> > >
> > > test-char: chardev/char-socket.c:125: qemu_chr_socket_restart_timer: Assertion `!s->reconnect_timer' failed.
> > >
> > > Signed-off-by: Li Feng <fengli@smartx.com>
> > > ---
> > > v2:
> > > - Rewrite the solution.
> > > - Add test to reproduce this issue.
> > >
> > > chardev/char-socket.c | 2 +-
> > > tests/test-char.c | 48 ++++++++++++++++++++++++++++++++++++++----------
> > > 2 files changed, 39 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> > > index 1f14c2c7c8..d84330b3c9 100644
> > > --- a/chardev/char-socket.c
> > > +++ b/chardev/char-socket.c
> > > @@ -486,7 +486,7 @@ static void tcp_chr_disconnect_locked(Chardev *chr)
> > > if (emit_close) {
> > > qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
> > > }
> > > - if (s->reconnect_time) {
> > > + if (s->reconnect_time && !s->reconnect_timer) {
> > > qemu_chr_socket_restart_timer(chr);
> > > }
> > > }
> > > diff --git a/tests/test-char.c b/tests/test-char.c
> > > index 8d39bdc9fa..13dbbfe2a3 100644
> > > --- a/tests/test-char.c
> > > +++ b/tests/test-char.c
> > > @@ -625,12 +625,14 @@ static void char_udp_test(void)
> > > typedef struct {
> > > int event;
> > > bool got_pong;
> > > + CharBackend *be;
> > > } CharSocketTestData;
> > >
> > >
> > > #define SOCKET_PING "Hello"
> > > #define SOCKET_PONG "World"
> > >
> > > +typedef void (*char_socket_cb)(void *opaque, QEMUChrEvent event);
> > >
> > > static void
> > > char_socket_event(void *opaque, QEMUChrEvent event)
> > > @@ -639,6 +641,23 @@ char_socket_event(void *opaque, QEMUChrEvent event)
> > > data->event = event;
> > > }
> > >
> > > +static void
> > > +char_socket_event_with_error(void *opaque, QEMUChrEvent event)
> > > +{
> > > + CharSocketTestData *data = opaque;
> > > + CharBackend *be = data->be;
> > > + data->event = event;
> > > + switch (event) {
> > > + case CHR_EVENT_OPENED:
> > > + qemu_chr_fe_disconnect(be);
> > > + return;
> > > + case CHR_EVENT_CLOSED:
> > > + return;
> > > + default:
> > > + return;
> > > + }
> > > +}
> > > +
> > >
> > > static void
> > > char_socket_read(void *opaque, const uint8_t *buf, int size)
> > > @@ -783,6 +802,7 @@ static void char_socket_server_test(gconstpointer opaque)
> > >
> > > reconnect:
> > > data.event = -1;
> > > + data.be = &be;
> > > qemu_chr_fe_set_handlers(&be, NULL, NULL,
> > > char_socket_event, NULL,
> > > &data, NULL, true);
> > > @@ -869,6 +889,7 @@ typedef struct {
> > > const char *reconnect;
> > > bool wait_connected;
> > > bool fd_pass;
> > > + char_socket_cb event_cb;
> > > } CharSocketClientTestConfig;
> > >
> > > static void char_socket_client_dupid_test(gconstpointer opaque)
> > > @@ -920,6 +941,7 @@ static void char_socket_client_dupid_test(gconstpointer opaque)
> > > static void char_socket_client_test(gconstpointer opaque)
> > > {
> > > const CharSocketClientTestConfig *config = opaque;
> > > + const char_socket_cb event_cb = config->event_cb;
> > > QIOChannelSocket *ioc;
> > > char *optstr;
> > > Chardev *chr;
> > > @@ -983,8 +1005,9 @@ static void char_socket_client_test(gconstpointer opaque)
> > >
> > > reconnect:
> > > data.event = -1;
> > > + data.be = &be;
> > > qemu_chr_fe_set_handlers(&be, NULL, NULL,
> > > - char_socket_event, NULL,
> > > + event_cb, NULL,
> > > &data, NULL, true);
> > > if (config->reconnect) {
> > > g_assert(data.event == -1);
> > > @@ -1022,7 +1045,7 @@ static void char_socket_client_test(gconstpointer opaque)
> > > /* Setup a callback to receive the reply to our greeting */
> > > qemu_chr_fe_set_handlers(&be, char_socket_can_read,
> > > char_socket_read,
> > > - char_socket_event, NULL,
> > > + event_cb, NULL,
> > > &data, NULL, true);
> > > g_assert(data.event == CHR_EVENT_OPENED);
> > > data.event = -1;
> > > @@ -1467,19 +1490,22 @@ int main(int argc, char **argv)
> > >
> > > #define SOCKET_CLIENT_TEST(name, addr) \
> > > static CharSocketClientTestConfig client1 ## name = \
> > > - { addr, NULL, false, false }; \
> > > + { addr, NULL, false, false, char_socket_event}; \
> > > static CharSocketClientTestConfig client2 ## name = \
> > > - { addr, NULL, true, false }; \
> > > + { addr, NULL, true, false, char_socket_event }; \
> > > static CharSocketClientTestConfig client3 ## name = \
> > > - { addr, ",reconnect=1", false }; \
> > > + { addr, ",reconnect=1", false, false, char_socket_event }; \
> > > static CharSocketClientTestConfig client4 ## name = \
> > > - { addr, ",reconnect=1", true }; \
> > > + { addr, ",reconnect=1", true, false, char_socket_event }; \
> > > static CharSocketClientTestConfig client5 ## name = \
> > > - { addr, NULL, false, true }; \
> > > + { addr, NULL, false, true, char_socket_event }; \
> > > static CharSocketClientTestConfig client6 ## name = \
> > > - { addr, NULL, true, true }; \
> > > + { addr, NULL, true, true, char_socket_event }; \
> > > static CharSocketClientTestConfig client7 ## name = \
> > > - { addr, ",reconnect=1", false, false }; \
> > > + { addr, ",reconnect=1", false, false, char_socket_event }; \
> > > + static CharSocketClientTestConfig client8 ## name = \
> > > + { addr, ",reconnect=1", true, false, \
> > > + char_socket_event_with_error }; \
> > > g_test_add_data_func("/char/socket/client/mainloop/" # name, \
> > > &client1 ##name, char_socket_client_test); \
> > > g_test_add_data_func("/char/socket/client/wait-conn/" # name, \
> > > @@ -1493,7 +1519,9 @@ int main(int argc, char **argv)
> > > g_test_add_data_func("/char/socket/client/wait-conn-fdpass/" # name, \
> > > &client6 ##name, char_socket_client_test); \
> > > g_test_add_data_func("/char/socket/client/dupid-reconnect/" # name, \
> > > - &client7 ##name, char_socket_client_dupid_test)
> > > + &client7 ##name, char_socket_client_dupid_test);\
> > > + g_test_add_data_func("/char/socket/client/reconnect-error/" # name, \
> > > + &client8 ##name, char_socket_client_test)
> > >
> > > if (has_ipv4) {
> > > SOCKET_SERVER_TEST(tcp, &tcpaddr);
> > > --
> > > 2.11.0
> > >
> >
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v3 1/2] io/channel: fix crash when qio_channel_readv_all return 0
[not found] ` <20200508051441.8143-1-fengli@smartx.com>
@ 2020-05-08 5:14 ` Li Feng
2020-05-08 12:32 ` Marc-André Lureau
2020-05-08 5:14 ` [PATCH v3 2/2] char-socket: initialize reconnect timer only when the timer doesn't start Li Feng
1 sibling, 1 reply; 32+ messages in thread
From: Li Feng @ 2020-05-08 5:14 UTC (permalink / raw)
To: kyle, lifeng1519, dimastep, marcandre.lureau,
Daniel P. Berrangé,
open list:All patches CC here
Cc: Li Feng
Root cause:
From `man recvmsg`, the RETURN VALUE says:
These calls return the number of bytes received, or -1 if an error occurred.
In the event of an error, errno is set to indicate the error.
The return value will be 0 when the peer has performed an orderly shutdown.
When an error happens, the socket will be closed, and recvmsg return 0,
then error_setg will trigger a crash.
This unit test could reproduce this issue:
tests/test-char -p /char/socket/client/reconnect-error/unix
The core file backtrace is :
(gdb) bt
#0 0x00007ffff5ac3277 in raise () from /lib64/libc.so.6
#1 0x00007ffff5ac4968 in abort () from /lib64/libc.so.6
#2 0x00005555555aaa94 in error_handle_fatal (errp=<optimized out>, err=0x7fffec0012d0) at util/error.c:40
#3 0x00005555555aab6d in error_setv (errp=0x555555802a08 <error_abort>, src=0x5555555c4280 "io/channel.c", line=148,
func=0x5555555c4580 <__func__.17489> "qio_channel_readv_all", err_class=ERROR_CLASS_GENERIC_ERROR,
fmt=<optimized out>, ap=0x7ffff423bae0, suffix=0x0) at util/error.c:73
#4 0x00005555555aacf0 in error_setg_internal (errp=errp@entry=0x555555802a08 <error_abort>,
src=src@entry=0x5555555c4280 "io/channel.c", line=line@entry=148,
func=func@entry=0x5555555c4580 <__func__.17489> "qio_channel_readv_all",
fmt=fmt@entry=0x5555555c43a0 "Unexpected end-of-file before all bytes were read") at util/error.c:97
#5 0x000055555556c25c in qio_channel_readv_all (ioc=<optimized out>, iov=<optimized out>, niov=<optimized out>,
errp=0x555555802a08 <error_abort>) at io/channel.c:147
#6 0x000055555556c29a in qio_channel_read_all (ioc=<optimized out>, buf=<optimized out>, buflen=<optimized out>,
errp=<optimized out>) at io/channel.c:247
#7 0x000055555556ad22 in char_socket_ping_pong (ioc=0x7fffec0008c0) at tests/test-char.c:732
#8 0x000055555556ae12 in char_socket_client_server_thread (data=data@entry=0x55555582e350) at tests/test-char.c:891
#9 0x00005555555a95b6 in qemu_thread_start (args=<optimized out>) at util/qemu-thread-posix.c:519
#10 0x00007ffff5e61e25 in start_thread () from /lib64/libpthread.so.0
#11 0x00007ffff5b8bbad in clone () from /lib64/libc.so.6
Signed-off-by: Li Feng <fengli@smartx.com>
---
io/channel.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/io/channel.c b/io/channel.c
index e4376eb0bc..1a4a505f01 100644
--- a/io/channel.c
+++ b/io/channel.c
@@ -144,8 +144,6 @@ int qio_channel_readv_all(QIOChannel *ioc,
if (ret == 0) {
ret = -1;
- error_setg(errp,
- "Unexpected end-of-file before all bytes were read");
} else if (ret == 1) {
ret = 0;
}
--
2.11.0
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v3 2/2] char-socket: initialize reconnect timer only when the timer doesn't start
[not found] ` <20200508051441.8143-1-fengli@smartx.com>
2020-05-08 5:14 ` [PATCH v3 1/2] io/channel: fix crash when qio_channel_readv_all return 0 Li Feng
@ 2020-05-08 5:14 ` Li Feng
2020-05-08 11:00 ` Marc-André Lureau
1 sibling, 1 reply; 32+ messages in thread
From: Li Feng @ 2020-05-08 5:14 UTC (permalink / raw)
To: kyle, lifeng1519, dimastep, marcandre.lureau, Paolo Bonzini,
open list:All patches CC here
Cc: Li Feng
When the disconnect event is triggered in the connecting stage,
the tcp_chr_disconnect_locked may be called twice.
The first call:
#0 qemu_chr_socket_restart_timer (chr=0x55555582ee90) at chardev/char-socket.c:120
#1 0x000055555558e38c in tcp_chr_disconnect_locked (chr=<optimized out>) at chardev/char-socket.c:490
#2 0x000055555558e3cd in tcp_chr_disconnect (chr=0x55555582ee90) at chardev/char-socket.c:497
#3 0x000055555558ea32 in tcp_chr_new_client (chr=chr@entry=0x55555582ee90, sioc=sioc@entry=0x55555582f0b0) at chardev/char-socket.c:892
#4 0x000055555558eeb8 in qemu_chr_socket_connected (task=0x55555582f300, opaque=<optimized out>) at chardev/char-socket.c:1090
#5 0x0000555555574352 in qio_task_complete (task=task@entry=0x55555582f300) at io/task.c:196
#6 0x00005555555745f4 in qio_task_thread_result (opaque=0x55555582f300) at io/task.c:111
#7 qio_task_wait_thread (task=0x55555582f300) at io/task.c:190
#8 0x000055555558f17e in tcp_chr_wait_connected (chr=0x55555582ee90, errp=0x555555802a08 <error_abort>) at chardev/char-socket.c:1013
#9 0x0000555555567cbd in char_socket_client_reconnect_test (opaque=0x5555557fe020 <client8unix>) at tests/test-char.c:1152
The second call:
#0 0x00007ffff5ac3277 in raise () from /lib64/libc.so.6
#1 0x00007ffff5ac4968 in abort () from /lib64/libc.so.6
#2 0x00007ffff5abc096 in __assert_fail_base () from /lib64/libc.so.6
#3 0x00007ffff5abc142 in __assert_fail () from /lib64/libc.so.6
#4 0x000055555558d10a in qemu_chr_socket_restart_timer (chr=0x55555582ee90) at chardev/char-socket.c:125
#5 0x000055555558df0c in tcp_chr_disconnect_locked (chr=<optimized out>) at chardev/char-socket.c:490
#6 0x000055555558df4d in tcp_chr_disconnect (chr=0x55555582ee90) at chardev/char-socket.c:497
#7 0x000055555558e5b2 in tcp_chr_new_client (chr=chr@entry=0x55555582ee90, sioc=sioc@entry=0x55555582f0b0) at chardev/char-socket.c:892
#8 0x000055555558e93a in tcp_chr_connect_client_sync (chr=chr@entry=0x55555582ee90, errp=errp@entry=0x7fffffffd178) at chardev/char-socket.c:944
#9 0x000055555558ec78 in tcp_chr_wait_connected (chr=0x55555582ee90, errp=0x555555802a08 <error_abort>) at chardev/char-socket.c:1035
#10 0x000055555556804b in char_socket_client_test (opaque=0x5555557fe020 <client8unix>) at tests/test-char.c:1023
Run test/test-char to reproduce this issue.
test-char: chardev/char-socket.c:125: qemu_chr_socket_restart_timer: Assertion `!s->reconnect_timer' failed.
Signed-off-by: Li Feng <fengli@smartx.com>
---
chardev/char-socket.c | 2 +-
tests/test-char.c | 68 ++++++++++++++++++++++++++++++++++++++++-----------
2 files changed, 55 insertions(+), 15 deletions(-)
diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 1f14c2c7c8..d84330b3c9 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -486,7 +486,7 @@ static void tcp_chr_disconnect_locked(Chardev *chr)
if (emit_close) {
qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
}
- if (s->reconnect_time) {
+ if (s->reconnect_time && !s->reconnect_timer) {
qemu_chr_socket_restart_timer(chr);
}
}
diff --git a/tests/test-char.c b/tests/test-char.c
index 8d39bdc9fa..d5c9049eec 100644
--- a/tests/test-char.c
+++ b/tests/test-char.c
@@ -625,12 +625,14 @@ static void char_udp_test(void)
typedef struct {
int event;
bool got_pong;
+ CharBackend *be;
} CharSocketTestData;
#define SOCKET_PING "Hello"
#define SOCKET_PONG "World"
+typedef void (*char_socket_cb)(void *opaque, QEMUChrEvent event);
static void
char_socket_event(void *opaque, QEMUChrEvent event)
@@ -639,6 +641,27 @@ char_socket_event(void *opaque, QEMUChrEvent event)
data->event = event;
}
+static void
+char_socket_event_with_error(void *opaque, QEMUChrEvent event)
+{
+ static bool first_error;
+ CharSocketTestData *data = opaque;
+ CharBackend *be = data->be;
+ data->event = event;
+ switch (event) {
+ case CHR_EVENT_OPENED:
+ if (!first_error) {
+ first_error = true;
+ qemu_chr_fe_disconnect(be);
+ }
+ return;
+ case CHR_EVENT_CLOSED:
+ return;
+ default:
+ return;
+ }
+}
+
static void
char_socket_read(void *opaque, const uint8_t *buf, int size)
@@ -699,19 +722,24 @@ char_socket_addr_to_opt_str(SocketAddress *addr, bool fd_pass,
}
-static void
+static int
char_socket_ping_pong(QIOChannel *ioc)
{
char greeting[sizeof(SOCKET_PING)];
const char *response = SOCKET_PONG;
- qio_channel_read_all(ioc, greeting, sizeof(greeting), &error_abort);
+ int ret;
+ ret = qio_channel_read_all(ioc, greeting, sizeof(greeting), &error_abort);
+ if (ret != 0) {
+ object_unref(OBJECT(ioc));
+ return -1;
+ }
g_assert(memcmp(greeting, SOCKET_PING, sizeof(greeting)) == 0);
qio_channel_write_all(ioc, response, sizeof(SOCKET_PONG), &error_abort);
-
object_unref(OBJECT(ioc));
+ return 0;
}
@@ -783,6 +811,7 @@ static void char_socket_server_test(gconstpointer opaque)
reconnect:
data.event = -1;
+ data.be = &be;
qemu_chr_fe_set_handlers(&be, NULL, NULL,
char_socket_event, NULL,
&data, NULL, true);
@@ -855,10 +884,13 @@ char_socket_client_server_thread(gpointer data)
QIOChannelSocket *ioc = data;
QIOChannelSocket *cioc;
+retry:
cioc = qio_channel_socket_accept(ioc, &error_abort);
g_assert_nonnull(cioc);
- char_socket_ping_pong(QIO_CHANNEL(cioc));
+ if (char_socket_ping_pong(QIO_CHANNEL(cioc)) != 0) {
+ goto retry;
+ }
return NULL;
}
@@ -869,6 +901,7 @@ typedef struct {
const char *reconnect;
bool wait_connected;
bool fd_pass;
+ char_socket_cb event_cb;
} CharSocketClientTestConfig;
static void char_socket_client_dupid_test(gconstpointer opaque)
@@ -920,6 +953,7 @@ static void char_socket_client_dupid_test(gconstpointer opaque)
static void char_socket_client_test(gconstpointer opaque)
{
const CharSocketClientTestConfig *config = opaque;
+ const char_socket_cb event_cb = config->event_cb;
QIOChannelSocket *ioc;
char *optstr;
Chardev *chr;
@@ -983,8 +1017,9 @@ static void char_socket_client_test(gconstpointer opaque)
reconnect:
data.event = -1;
+ data.be = &be;
qemu_chr_fe_set_handlers(&be, NULL, NULL,
- char_socket_event, NULL,
+ event_cb, NULL,
&data, NULL, true);
if (config->reconnect) {
g_assert(data.event == -1);
@@ -1022,7 +1057,7 @@ static void char_socket_client_test(gconstpointer opaque)
/* Setup a callback to receive the reply to our greeting */
qemu_chr_fe_set_handlers(&be, char_socket_can_read,
char_socket_read,
- char_socket_event, NULL,
+ event_cb, NULL,
&data, NULL, true);
g_assert(data.event == CHR_EVENT_OPENED);
data.event = -1;
@@ -1467,19 +1502,22 @@ int main(int argc, char **argv)
#define SOCKET_CLIENT_TEST(name, addr) \
static CharSocketClientTestConfig client1 ## name = \
- { addr, NULL, false, false }; \
+ { addr, NULL, false, false, char_socket_event}; \
static CharSocketClientTestConfig client2 ## name = \
- { addr, NULL, true, false }; \
+ { addr, NULL, true, false, char_socket_event }; \
static CharSocketClientTestConfig client3 ## name = \
- { addr, ",reconnect=1", false }; \
+ { addr, ",reconnect=1", false, false, char_socket_event }; \
static CharSocketClientTestConfig client4 ## name = \
- { addr, ",reconnect=1", true }; \
+ { addr, ",reconnect=1", true, false, char_socket_event }; \
static CharSocketClientTestConfig client5 ## name = \
- { addr, NULL, false, true }; \
+ { addr, NULL, false, true, char_socket_event }; \
static CharSocketClientTestConfig client6 ## name = \
- { addr, NULL, true, true }; \
+ { addr, NULL, true, true, char_socket_event }; \
static CharSocketClientTestConfig client7 ## name = \
- { addr, ",reconnect=1", false, false }; \
+ { addr, ",reconnect=1", false, false, char_socket_event }; \
+ static CharSocketClientTestConfig client8 ## name = \
+ { addr, ",reconnect=1", true, false, \
+ char_socket_event_with_error }; \
g_test_add_data_func("/char/socket/client/mainloop/" # name, \
&client1 ##name, char_socket_client_test); \
g_test_add_data_func("/char/socket/client/wait-conn/" # name, \
@@ -1493,7 +1531,9 @@ int main(int argc, char **argv)
g_test_add_data_func("/char/socket/client/wait-conn-fdpass/" # name, \
&client6 ##name, char_socket_client_test); \
g_test_add_data_func("/char/socket/client/dupid-reconnect/" # name, \
- &client7 ##name, char_socket_client_dupid_test)
+ &client7 ##name, char_socket_client_dupid_test);\
+ g_test_add_data_func("/char/socket/client/reconnect-error/" # name, \
+ &client8 ##name, char_socket_client_test)
if (has_ipv4) {
SOCKET_SERVER_TEST(tcp, &tcpaddr);
--
2.11.0
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v3 2/2] char-socket: initialize reconnect timer only when the timer doesn't start
2020-05-08 5:14 ` [PATCH v3 2/2] char-socket: initialize reconnect timer only when the timer doesn't start Li Feng
@ 2020-05-08 11:00 ` Marc-André Lureau
2020-05-08 12:16 ` Li Feng
0 siblings, 1 reply; 32+ messages in thread
From: Marc-André Lureau @ 2020-05-08 11:00 UTC (permalink / raw)
To: Li Feng
Cc: Feng Li, Dima Stepanov, Kyle Zhang,
open list:All patches CC here, Paolo Bonzini
On Fri, May 8, 2020 at 7:14 AM Li Feng <fengli@smartx.com> wrote:
>
> When the disconnect event is triggered in the connecting stage,
> the tcp_chr_disconnect_locked may be called twice.
>
> The first call:
> #0 qemu_chr_socket_restart_timer (chr=0x55555582ee90) at chardev/char-socket.c:120
> #1 0x000055555558e38c in tcp_chr_disconnect_locked (chr=<optimized out>) at chardev/char-socket.c:490
> #2 0x000055555558e3cd in tcp_chr_disconnect (chr=0x55555582ee90) at chardev/char-socket.c:497
> #3 0x000055555558ea32 in tcp_chr_new_client (chr=chr@entry=0x55555582ee90, sioc=sioc@entry=0x55555582f0b0) at chardev/char-socket.c:892
> #4 0x000055555558eeb8 in qemu_chr_socket_connected (task=0x55555582f300, opaque=<optimized out>) at chardev/char-socket.c:1090
> #5 0x0000555555574352 in qio_task_complete (task=task@entry=0x55555582f300) at io/task.c:196
> #6 0x00005555555745f4 in qio_task_thread_result (opaque=0x55555582f300) at io/task.c:111
> #7 qio_task_wait_thread (task=0x55555582f300) at io/task.c:190
> #8 0x000055555558f17e in tcp_chr_wait_connected (chr=0x55555582ee90, errp=0x555555802a08 <error_abort>) at chardev/char-socket.c:1013
> #9 0x0000555555567cbd in char_socket_client_reconnect_test (opaque=0x5555557fe020 <client8unix>) at tests/test-char.c:1152
> The second call:
> #0 0x00007ffff5ac3277 in raise () from /lib64/libc.so.6
> #1 0x00007ffff5ac4968 in abort () from /lib64/libc.so.6
> #2 0x00007ffff5abc096 in __assert_fail_base () from /lib64/libc.so.6
> #3 0x00007ffff5abc142 in __assert_fail () from /lib64/libc.so.6
> #4 0x000055555558d10a in qemu_chr_socket_restart_timer (chr=0x55555582ee90) at chardev/char-socket.c:125
> #5 0x000055555558df0c in tcp_chr_disconnect_locked (chr=<optimized out>) at chardev/char-socket.c:490
> #6 0x000055555558df4d in tcp_chr_disconnect (chr=0x55555582ee90) at chardev/char-socket.c:497
> #7 0x000055555558e5b2 in tcp_chr_new_client (chr=chr@entry=0x55555582ee90, sioc=sioc@entry=0x55555582f0b0) at chardev/char-socket.c:892
> #8 0x000055555558e93a in tcp_chr_connect_client_sync (chr=chr@entry=0x55555582ee90, errp=errp@entry=0x7fffffffd178) at chardev/char-socket.c:944
> #9 0x000055555558ec78 in tcp_chr_wait_connected (chr=0x55555582ee90, errp=0x555555802a08 <error_abort>) at chardev/char-socket.c:1035
> #10 0x000055555556804b in char_socket_client_test (opaque=0x5555557fe020 <client8unix>) at tests/test-char.c:1023
>
> Run test/test-char to reproduce this issue.
>
> test-char: chardev/char-socket.c:125: qemu_chr_socket_restart_timer: Assertion `!s->reconnect_timer' failed.
>
> Signed-off-by: Li Feng <fengli@smartx.com>
After applying your patch, with qemu configure --enable-debug
--enable-sanitizers:
$ tests/test-char
Unexpected error in qio_channel_readv_all() at
/home/elmarco/src/qq/io/channel.c:147:
Unexpected end-of-file before all bytes were read
[1] 2445287 abort (core dumped) tests/test-char
> ---
> chardev/char-socket.c | 2 +-
> tests/test-char.c | 68 ++++++++++++++++++++++++++++++++++++++++-----------
> 2 files changed, 55 insertions(+), 15 deletions(-)
>
> diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> index 1f14c2c7c8..d84330b3c9 100644
> --- a/chardev/char-socket.c
> +++ b/chardev/char-socket.c
> @@ -486,7 +486,7 @@ static void tcp_chr_disconnect_locked(Chardev *chr)
> if (emit_close) {
> qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
> }
> - if (s->reconnect_time) {
> + if (s->reconnect_time && !s->reconnect_timer) {
> qemu_chr_socket_restart_timer(chr);
> }
> }
> diff --git a/tests/test-char.c b/tests/test-char.c
> index 8d39bdc9fa..d5c9049eec 100644
> --- a/tests/test-char.c
> +++ b/tests/test-char.c
> @@ -625,12 +625,14 @@ static void char_udp_test(void)
> typedef struct {
> int event;
> bool got_pong;
> + CharBackend *be;
> } CharSocketTestData;
>
>
> #define SOCKET_PING "Hello"
> #define SOCKET_PONG "World"
>
> +typedef void (*char_socket_cb)(void *opaque, QEMUChrEvent event);
>
> static void
> char_socket_event(void *opaque, QEMUChrEvent event)
> @@ -639,6 +641,27 @@ char_socket_event(void *opaque, QEMUChrEvent event)
> data->event = event;
> }
>
> +static void
> +char_socket_event_with_error(void *opaque, QEMUChrEvent event)
> +{
> + static bool first_error;
> + CharSocketTestData *data = opaque;
> + CharBackend *be = data->be;
> + data->event = event;
> + switch (event) {
> + case CHR_EVENT_OPENED:
> + if (!first_error) {
> + first_error = true;
> + qemu_chr_fe_disconnect(be);
> + }
> + return;
> + case CHR_EVENT_CLOSED:
> + return;
> + default:
> + return;
> + }
> +}
> +
>
> static void
> char_socket_read(void *opaque, const uint8_t *buf, int size)
> @@ -699,19 +722,24 @@ char_socket_addr_to_opt_str(SocketAddress *addr, bool fd_pass,
> }
>
>
> -static void
> +static int
> char_socket_ping_pong(QIOChannel *ioc)
> {
> char greeting[sizeof(SOCKET_PING)];
> const char *response = SOCKET_PONG;
>
> - qio_channel_read_all(ioc, greeting, sizeof(greeting), &error_abort);
> + int ret;
> + ret = qio_channel_read_all(ioc, greeting, sizeof(greeting), &error_abort);
> + if (ret != 0) {
> + object_unref(OBJECT(ioc));
> + return -1;
> + }
>
> g_assert(memcmp(greeting, SOCKET_PING, sizeof(greeting)) == 0);
>
> qio_channel_write_all(ioc, response, sizeof(SOCKET_PONG), &error_abort);
> -
> object_unref(OBJECT(ioc));
> + return 0;
> }
>
>
> @@ -783,6 +811,7 @@ static void char_socket_server_test(gconstpointer opaque)
>
> reconnect:
> data.event = -1;
> + data.be = &be;
> qemu_chr_fe_set_handlers(&be, NULL, NULL,
> char_socket_event, NULL,
> &data, NULL, true);
> @@ -855,10 +884,13 @@ char_socket_client_server_thread(gpointer data)
> QIOChannelSocket *ioc = data;
> QIOChannelSocket *cioc;
>
> +retry:
> cioc = qio_channel_socket_accept(ioc, &error_abort);
> g_assert_nonnull(cioc);
>
> - char_socket_ping_pong(QIO_CHANNEL(cioc));
> + if (char_socket_ping_pong(QIO_CHANNEL(cioc)) != 0) {
> + goto retry;
> + }
>
> return NULL;
> }
> @@ -869,6 +901,7 @@ typedef struct {
> const char *reconnect;
> bool wait_connected;
> bool fd_pass;
> + char_socket_cb event_cb;
> } CharSocketClientTestConfig;
>
> static void char_socket_client_dupid_test(gconstpointer opaque)
> @@ -920,6 +953,7 @@ static void char_socket_client_dupid_test(gconstpointer opaque)
> static void char_socket_client_test(gconstpointer opaque)
> {
> const CharSocketClientTestConfig *config = opaque;
> + const char_socket_cb event_cb = config->event_cb;
> QIOChannelSocket *ioc;
> char *optstr;
> Chardev *chr;
> @@ -983,8 +1017,9 @@ static void char_socket_client_test(gconstpointer opaque)
>
> reconnect:
> data.event = -1;
> + data.be = &be;
> qemu_chr_fe_set_handlers(&be, NULL, NULL,
> - char_socket_event, NULL,
> + event_cb, NULL,
> &data, NULL, true);
> if (config->reconnect) {
> g_assert(data.event == -1);
> @@ -1022,7 +1057,7 @@ static void char_socket_client_test(gconstpointer opaque)
> /* Setup a callback to receive the reply to our greeting */
> qemu_chr_fe_set_handlers(&be, char_socket_can_read,
> char_socket_read,
> - char_socket_event, NULL,
> + event_cb, NULL,
> &data, NULL, true);
> g_assert(data.event == CHR_EVENT_OPENED);
> data.event = -1;
> @@ -1467,19 +1502,22 @@ int main(int argc, char **argv)
>
> #define SOCKET_CLIENT_TEST(name, addr) \
> static CharSocketClientTestConfig client1 ## name = \
> - { addr, NULL, false, false }; \
> + { addr, NULL, false, false, char_socket_event}; \
> static CharSocketClientTestConfig client2 ## name = \
> - { addr, NULL, true, false }; \
> + { addr, NULL, true, false, char_socket_event }; \
> static CharSocketClientTestConfig client3 ## name = \
> - { addr, ",reconnect=1", false }; \
> + { addr, ",reconnect=1", false, false, char_socket_event }; \
> static CharSocketClientTestConfig client4 ## name = \
> - { addr, ",reconnect=1", true }; \
> + { addr, ",reconnect=1", true, false, char_socket_event }; \
> static CharSocketClientTestConfig client5 ## name = \
> - { addr, NULL, false, true }; \
> + { addr, NULL, false, true, char_socket_event }; \
> static CharSocketClientTestConfig client6 ## name = \
> - { addr, NULL, true, true }; \
> + { addr, NULL, true, true, char_socket_event }; \
> static CharSocketClientTestConfig client7 ## name = \
> - { addr, ",reconnect=1", false, false }; \
> + { addr, ",reconnect=1", false, false, char_socket_event }; \
> + static CharSocketClientTestConfig client8 ## name = \
> + { addr, ",reconnect=1", true, false, \
> + char_socket_event_with_error }; \
> g_test_add_data_func("/char/socket/client/mainloop/" # name, \
> &client1 ##name, char_socket_client_test); \
> g_test_add_data_func("/char/socket/client/wait-conn/" # name, \
> @@ -1493,7 +1531,9 @@ int main(int argc, char **argv)
> g_test_add_data_func("/char/socket/client/wait-conn-fdpass/" # name, \
> &client6 ##name, char_socket_client_test); \
> g_test_add_data_func("/char/socket/client/dupid-reconnect/" # name, \
> - &client7 ##name, char_socket_client_dupid_test)
> + &client7 ##name, char_socket_client_dupid_test);\
> + g_test_add_data_func("/char/socket/client/reconnect-error/" # name, \
> + &client8 ##name, char_socket_client_test)
>
> if (has_ipv4) {
> SOCKET_SERVER_TEST(tcp, &tcpaddr);
> --
> 2.11.0
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3 2/2] char-socket: initialize reconnect timer only when the timer doesn't start
2020-05-08 11:00 ` Marc-André Lureau
@ 2020-05-08 12:16 ` Li Feng
2020-05-08 12:28 ` Marc-André Lureau
0 siblings, 1 reply; 32+ messages in thread
From: Li Feng @ 2020-05-08 12:16 UTC (permalink / raw)
To: Marc-André Lureau
Cc: Feng Li, Dima Stepanov, Kyle Zhang,
open list:All patches CC here, Paolo Bonzini
Have you applied the first one?
io/channel: fix crash when qio_channel_readv_all return 0
Thanks,
Feng Li
Marc-André Lureau <marcandre.lureau@redhat.com> 于2020年5月8日周五 下午7:01写道:
>
> On Fri, May 8, 2020 at 7:14 AM Li Feng <fengli@smartx.com> wrote:
> >
> > When the disconnect event is triggered in the connecting stage,
> > the tcp_chr_disconnect_locked may be called twice.
> >
> > The first call:
> > #0 qemu_chr_socket_restart_timer (chr=0x55555582ee90) at chardev/char-socket.c:120
> > #1 0x000055555558e38c in tcp_chr_disconnect_locked (chr=<optimized out>) at chardev/char-socket.c:490
> > #2 0x000055555558e3cd in tcp_chr_disconnect (chr=0x55555582ee90) at chardev/char-socket.c:497
> > #3 0x000055555558ea32 in tcp_chr_new_client (chr=chr@entry=0x55555582ee90, sioc=sioc@entry=0x55555582f0b0) at chardev/char-socket.c:892
> > #4 0x000055555558eeb8 in qemu_chr_socket_connected (task=0x55555582f300, opaque=<optimized out>) at chardev/char-socket.c:1090
> > #5 0x0000555555574352 in qio_task_complete (task=task@entry=0x55555582f300) at io/task.c:196
> > #6 0x00005555555745f4 in qio_task_thread_result (opaque=0x55555582f300) at io/task.c:111
> > #7 qio_task_wait_thread (task=0x55555582f300) at io/task.c:190
> > #8 0x000055555558f17e in tcp_chr_wait_connected (chr=0x55555582ee90, errp=0x555555802a08 <error_abort>) at chardev/char-socket.c:1013
> > #9 0x0000555555567cbd in char_socket_client_reconnect_test (opaque=0x5555557fe020 <client8unix>) at tests/test-char.c:1152
> > The second call:
> > #0 0x00007ffff5ac3277 in raise () from /lib64/libc.so.6
> > #1 0x00007ffff5ac4968 in abort () from /lib64/libc.so.6
> > #2 0x00007ffff5abc096 in __assert_fail_base () from /lib64/libc.so.6
> > #3 0x00007ffff5abc142 in __assert_fail () from /lib64/libc.so.6
> > #4 0x000055555558d10a in qemu_chr_socket_restart_timer (chr=0x55555582ee90) at chardev/char-socket.c:125
> > #5 0x000055555558df0c in tcp_chr_disconnect_locked (chr=<optimized out>) at chardev/char-socket.c:490
> > #6 0x000055555558df4d in tcp_chr_disconnect (chr=0x55555582ee90) at chardev/char-socket.c:497
> > #7 0x000055555558e5b2 in tcp_chr_new_client (chr=chr@entry=0x55555582ee90, sioc=sioc@entry=0x55555582f0b0) at chardev/char-socket.c:892
> > #8 0x000055555558e93a in tcp_chr_connect_client_sync (chr=chr@entry=0x55555582ee90, errp=errp@entry=0x7fffffffd178) at chardev/char-socket.c:944
> > #9 0x000055555558ec78 in tcp_chr_wait_connected (chr=0x55555582ee90, errp=0x555555802a08 <error_abort>) at chardev/char-socket.c:1035
> > #10 0x000055555556804b in char_socket_client_test (opaque=0x5555557fe020 <client8unix>) at tests/test-char.c:1023
> >
> > Run test/test-char to reproduce this issue.
> >
> > test-char: chardev/char-socket.c:125: qemu_chr_socket_restart_timer: Assertion `!s->reconnect_timer' failed.
> >
> > Signed-off-by: Li Feng <fengli@smartx.com>
>
> After applying your patch, with qemu configure --enable-debug
> --enable-sanitizers:
>
> $ tests/test-char
> Unexpected error in qio_channel_readv_all() at
> /home/elmarco/src/qq/io/channel.c:147:
> Unexpected end-of-file before all bytes were read
> [1] 2445287 abort (core dumped) tests/test-char
>
> > ---
> > chardev/char-socket.c | 2 +-
> > tests/test-char.c | 68 ++++++++++++++++++++++++++++++++++++++++-----------
> > 2 files changed, 55 insertions(+), 15 deletions(-)
> >
> > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> > index 1f14c2c7c8..d84330b3c9 100644
> > --- a/chardev/char-socket.c
> > +++ b/chardev/char-socket.c
> > @@ -486,7 +486,7 @@ static void tcp_chr_disconnect_locked(Chardev *chr)
> > if (emit_close) {
> > qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
> > }
> > - if (s->reconnect_time) {
> > + if (s->reconnect_time && !s->reconnect_timer) {
> > qemu_chr_socket_restart_timer(chr);
> > }
> > }
> > diff --git a/tests/test-char.c b/tests/test-char.c
> > index 8d39bdc9fa..d5c9049eec 100644
> > --- a/tests/test-char.c
> > +++ b/tests/test-char.c
> > @@ -625,12 +625,14 @@ static void char_udp_test(void)
> > typedef struct {
> > int event;
> > bool got_pong;
> > + CharBackend *be;
> > } CharSocketTestData;
> >
> >
> > #define SOCKET_PING "Hello"
> > #define SOCKET_PONG "World"
> >
> > +typedef void (*char_socket_cb)(void *opaque, QEMUChrEvent event);
> >
> > static void
> > char_socket_event(void *opaque, QEMUChrEvent event)
> > @@ -639,6 +641,27 @@ char_socket_event(void *opaque, QEMUChrEvent event)
> > data->event = event;
> > }
> >
> > +static void
> > +char_socket_event_with_error(void *opaque, QEMUChrEvent event)
> > +{
> > + static bool first_error;
> > + CharSocketTestData *data = opaque;
> > + CharBackend *be = data->be;
> > + data->event = event;
> > + switch (event) {
> > + case CHR_EVENT_OPENED:
> > + if (!first_error) {
> > + first_error = true;
> > + qemu_chr_fe_disconnect(be);
> > + }
> > + return;
> > + case CHR_EVENT_CLOSED:
> > + return;
> > + default:
> > + return;
> > + }
> > +}
> > +
> >
> > static void
> > char_socket_read(void *opaque, const uint8_t *buf, int size)
> > @@ -699,19 +722,24 @@ char_socket_addr_to_opt_str(SocketAddress *addr, bool fd_pass,
> > }
> >
> >
> > -static void
> > +static int
> > char_socket_ping_pong(QIOChannel *ioc)
> > {
> > char greeting[sizeof(SOCKET_PING)];
> > const char *response = SOCKET_PONG;
> >
> > - qio_channel_read_all(ioc, greeting, sizeof(greeting), &error_abort);
> > + int ret;
> > + ret = qio_channel_read_all(ioc, greeting, sizeof(greeting), &error_abort);
> > + if (ret != 0) {
> > + object_unref(OBJECT(ioc));
> > + return -1;
> > + }
> >
> > g_assert(memcmp(greeting, SOCKET_PING, sizeof(greeting)) == 0);
> >
> > qio_channel_write_all(ioc, response, sizeof(SOCKET_PONG), &error_abort);
> > -
> > object_unref(OBJECT(ioc));
> > + return 0;
> > }
> >
> >
> > @@ -783,6 +811,7 @@ static void char_socket_server_test(gconstpointer opaque)
> >
> > reconnect:
> > data.event = -1;
> > + data.be = &be;
> > qemu_chr_fe_set_handlers(&be, NULL, NULL,
> > char_socket_event, NULL,
> > &data, NULL, true);
> > @@ -855,10 +884,13 @@ char_socket_client_server_thread(gpointer data)
> > QIOChannelSocket *ioc = data;
> > QIOChannelSocket *cioc;
> >
> > +retry:
> > cioc = qio_channel_socket_accept(ioc, &error_abort);
> > g_assert_nonnull(cioc);
> >
> > - char_socket_ping_pong(QIO_CHANNEL(cioc));
> > + if (char_socket_ping_pong(QIO_CHANNEL(cioc)) != 0) {
> > + goto retry;
> > + }
> >
> > return NULL;
> > }
> > @@ -869,6 +901,7 @@ typedef struct {
> > const char *reconnect;
> > bool wait_connected;
> > bool fd_pass;
> > + char_socket_cb event_cb;
> > } CharSocketClientTestConfig;
> >
> > static void char_socket_client_dupid_test(gconstpointer opaque)
> > @@ -920,6 +953,7 @@ static void char_socket_client_dupid_test(gconstpointer opaque)
> > static void char_socket_client_test(gconstpointer opaque)
> > {
> > const CharSocketClientTestConfig *config = opaque;
> > + const char_socket_cb event_cb = config->event_cb;
> > QIOChannelSocket *ioc;
> > char *optstr;
> > Chardev *chr;
> > @@ -983,8 +1017,9 @@ static void char_socket_client_test(gconstpointer opaque)
> >
> > reconnect:
> > data.event = -1;
> > + data.be = &be;
> > qemu_chr_fe_set_handlers(&be, NULL, NULL,
> > - char_socket_event, NULL,
> > + event_cb, NULL,
> > &data, NULL, true);
> > if (config->reconnect) {
> > g_assert(data.event == -1);
> > @@ -1022,7 +1057,7 @@ static void char_socket_client_test(gconstpointer opaque)
> > /* Setup a callback to receive the reply to our greeting */
> > qemu_chr_fe_set_handlers(&be, char_socket_can_read,
> > char_socket_read,
> > - char_socket_event, NULL,
> > + event_cb, NULL,
> > &data, NULL, true);
> > g_assert(data.event == CHR_EVENT_OPENED);
> > data.event = -1;
> > @@ -1467,19 +1502,22 @@ int main(int argc, char **argv)
> >
> > #define SOCKET_CLIENT_TEST(name, addr) \
> > static CharSocketClientTestConfig client1 ## name = \
> > - { addr, NULL, false, false }; \
> > + { addr, NULL, false, false, char_socket_event}; \
> > static CharSocketClientTestConfig client2 ## name = \
> > - { addr, NULL, true, false }; \
> > + { addr, NULL, true, false, char_socket_event }; \
> > static CharSocketClientTestConfig client3 ## name = \
> > - { addr, ",reconnect=1", false }; \
> > + { addr, ",reconnect=1", false, false, char_socket_event }; \
> > static CharSocketClientTestConfig client4 ## name = \
> > - { addr, ",reconnect=1", true }; \
> > + { addr, ",reconnect=1", true, false, char_socket_event }; \
> > static CharSocketClientTestConfig client5 ## name = \
> > - { addr, NULL, false, true }; \
> > + { addr, NULL, false, true, char_socket_event }; \
> > static CharSocketClientTestConfig client6 ## name = \
> > - { addr, NULL, true, true }; \
> > + { addr, NULL, true, true, char_socket_event }; \
> > static CharSocketClientTestConfig client7 ## name = \
> > - { addr, ",reconnect=1", false, false }; \
> > + { addr, ",reconnect=1", false, false, char_socket_event }; \
> > + static CharSocketClientTestConfig client8 ## name = \
> > + { addr, ",reconnect=1", true, false, \
> > + char_socket_event_with_error }; \
> > g_test_add_data_func("/char/socket/client/mainloop/" # name, \
> > &client1 ##name, char_socket_client_test); \
> > g_test_add_data_func("/char/socket/client/wait-conn/" # name, \
> > @@ -1493,7 +1531,9 @@ int main(int argc, char **argv)
> > g_test_add_data_func("/char/socket/client/wait-conn-fdpass/" # name, \
> > &client6 ##name, char_socket_client_test); \
> > g_test_add_data_func("/char/socket/client/dupid-reconnect/" # name, \
> > - &client7 ##name, char_socket_client_dupid_test)
> > + &client7 ##name, char_socket_client_dupid_test);\
> > + g_test_add_data_func("/char/socket/client/reconnect-error/" # name, \
> > + &client8 ##name, char_socket_client_test)
> >
> > if (has_ipv4) {
> > SOCKET_SERVER_TEST(tcp, &tcpaddr);
> > --
> > 2.11.0
> >
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3 2/2] char-socket: initialize reconnect timer only when the timer doesn't start
2020-05-08 12:16 ` Li Feng
@ 2020-05-08 12:28 ` Marc-André Lureau
2020-05-08 12:31 ` Li Feng
0 siblings, 1 reply; 32+ messages in thread
From: Marc-André Lureau @ 2020-05-08 12:28 UTC (permalink / raw)
To: Li Feng
Cc: Feng Li, Kyle Zhang, Dima Stepanov,
open list:All patches CC here, Paolo Bonzini
Hi
On Fri, May 8, 2020 at 2:16 PM Li Feng <fengli@smartx.com> wrote:
>
> Have you applied the first one?
> io/channel: fix crash when qio_channel_readv_all return 0
>
Sorry, I missed that one. With that it passes. thanks
> Thanks,
>
> Feng Li
>
> Marc-André Lureau <marcandre.lureau@redhat.com> 于2020年5月8日周五 下午7:01写道:
>
> >
> > On Fri, May 8, 2020 at 7:14 AM Li Feng <fengli@smartx.com> wrote:
> > >
> > > When the disconnect event is triggered in the connecting stage,
> > > the tcp_chr_disconnect_locked may be called twice.
> > >
> > > The first call:
> > > #0 qemu_chr_socket_restart_timer (chr=0x55555582ee90) at chardev/char-socket.c:120
> > > #1 0x000055555558e38c in tcp_chr_disconnect_locked (chr=<optimized out>) at chardev/char-socket.c:490
> > > #2 0x000055555558e3cd in tcp_chr_disconnect (chr=0x55555582ee90) at chardev/char-socket.c:497
> > > #3 0x000055555558ea32 in tcp_chr_new_client (chr=chr@entry=0x55555582ee90, sioc=sioc@entry=0x55555582f0b0) at chardev/char-socket.c:892
> > > #4 0x000055555558eeb8 in qemu_chr_socket_connected (task=0x55555582f300, opaque=<optimized out>) at chardev/char-socket.c:1090
> > > #5 0x0000555555574352 in qio_task_complete (task=task@entry=0x55555582f300) at io/task.c:196
> > > #6 0x00005555555745f4 in qio_task_thread_result (opaque=0x55555582f300) at io/task.c:111
> > > #7 qio_task_wait_thread (task=0x55555582f300) at io/task.c:190
> > > #8 0x000055555558f17e in tcp_chr_wait_connected (chr=0x55555582ee90, errp=0x555555802a08 <error_abort>) at chardev/char-socket.c:1013
> > > #9 0x0000555555567cbd in char_socket_client_reconnect_test (opaque=0x5555557fe020 <client8unix>) at tests/test-char.c:1152
> > > The second call:
> > > #0 0x00007ffff5ac3277 in raise () from /lib64/libc.so.6
> > > #1 0x00007ffff5ac4968 in abort () from /lib64/libc.so.6
> > > #2 0x00007ffff5abc096 in __assert_fail_base () from /lib64/libc.so.6
> > > #3 0x00007ffff5abc142 in __assert_fail () from /lib64/libc.so.6
> > > #4 0x000055555558d10a in qemu_chr_socket_restart_timer (chr=0x55555582ee90) at chardev/char-socket.c:125
> > > #5 0x000055555558df0c in tcp_chr_disconnect_locked (chr=<optimized out>) at chardev/char-socket.c:490
> > > #6 0x000055555558df4d in tcp_chr_disconnect (chr=0x55555582ee90) at chardev/char-socket.c:497
> > > #7 0x000055555558e5b2 in tcp_chr_new_client (chr=chr@entry=0x55555582ee90, sioc=sioc@entry=0x55555582f0b0) at chardev/char-socket.c:892
> > > #8 0x000055555558e93a in tcp_chr_connect_client_sync (chr=chr@entry=0x55555582ee90, errp=errp@entry=0x7fffffffd178) at chardev/char-socket.c:944
> > > #9 0x000055555558ec78 in tcp_chr_wait_connected (chr=0x55555582ee90, errp=0x555555802a08 <error_abort>) at chardev/char-socket.c:1035
> > > #10 0x000055555556804b in char_socket_client_test (opaque=0x5555557fe020 <client8unix>) at tests/test-char.c:1023
> > >
> > > Run test/test-char to reproduce this issue.
> > >
> > > test-char: chardev/char-socket.c:125: qemu_chr_socket_restart_timer: Assertion `!s->reconnect_timer' failed.
> > >
> > > Signed-off-by: Li Feng <fengli@smartx.com>
> >
> > After applying your patch, with qemu configure --enable-debug
> > --enable-sanitizers:
> >
> > $ tests/test-char
> > Unexpected error in qio_channel_readv_all() at
> > /home/elmarco/src/qq/io/channel.c:147:
> > Unexpected end-of-file before all bytes were read
> > [1] 2445287 abort (core dumped) tests/test-char
> >
> > > ---
> > > chardev/char-socket.c | 2 +-
> > > tests/test-char.c | 68 ++++++++++++++++++++++++++++++++++++++++-----------
> > > 2 files changed, 55 insertions(+), 15 deletions(-)
> > >
> > > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> > > index 1f14c2c7c8..d84330b3c9 100644
> > > --- a/chardev/char-socket.c
> > > +++ b/chardev/char-socket.c
> > > @@ -486,7 +486,7 @@ static void tcp_chr_disconnect_locked(Chardev *chr)
> > > if (emit_close) {
> > > qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
> > > }
> > > - if (s->reconnect_time) {
> > > + if (s->reconnect_time && !s->reconnect_timer) {
> > > qemu_chr_socket_restart_timer(chr);
> > > }
> > > }
> > > diff --git a/tests/test-char.c b/tests/test-char.c
> > > index 8d39bdc9fa..d5c9049eec 100644
> > > --- a/tests/test-char.c
> > > +++ b/tests/test-char.c
> > > @@ -625,12 +625,14 @@ static void char_udp_test(void)
> > > typedef struct {
> > > int event;
> > > bool got_pong;
> > > + CharBackend *be;
> > > } CharSocketTestData;
> > >
> > >
> > > #define SOCKET_PING "Hello"
> > > #define SOCKET_PONG "World"
> > >
> > > +typedef void (*char_socket_cb)(void *opaque, QEMUChrEvent event);
> > >
> > > static void
> > > char_socket_event(void *opaque, QEMUChrEvent event)
> > > @@ -639,6 +641,27 @@ char_socket_event(void *opaque, QEMUChrEvent event)
> > > data->event = event;
> > > }
> > >
> > > +static void
> > > +char_socket_event_with_error(void *opaque, QEMUChrEvent event)
> > > +{
> > > + static bool first_error;
> > > + CharSocketTestData *data = opaque;
> > > + CharBackend *be = data->be;
> > > + data->event = event;
> > > + switch (event) {
> > > + case CHR_EVENT_OPENED:
> > > + if (!first_error) {
> > > + first_error = true;
> > > + qemu_chr_fe_disconnect(be);
> > > + }
> > > + return;
> > > + case CHR_EVENT_CLOSED:
> > > + return;
> > > + default:
> > > + return;
> > > + }
> > > +}
> > > +
> > >
> > > static void
> > > char_socket_read(void *opaque, const uint8_t *buf, int size)
> > > @@ -699,19 +722,24 @@ char_socket_addr_to_opt_str(SocketAddress *addr, bool fd_pass,
> > > }
> > >
> > >
> > > -static void
> > > +static int
> > > char_socket_ping_pong(QIOChannel *ioc)
> > > {
> > > char greeting[sizeof(SOCKET_PING)];
> > > const char *response = SOCKET_PONG;
> > >
> > > - qio_channel_read_all(ioc, greeting, sizeof(greeting), &error_abort);
> > > + int ret;
> > > + ret = qio_channel_read_all(ioc, greeting, sizeof(greeting), &error_abort);
> > > + if (ret != 0) {
> > > + object_unref(OBJECT(ioc));
> > > + return -1;
> > > + }
> > >
> > > g_assert(memcmp(greeting, SOCKET_PING, sizeof(greeting)) == 0);
> > >
> > > qio_channel_write_all(ioc, response, sizeof(SOCKET_PONG), &error_abort);
> > > -
> > > object_unref(OBJECT(ioc));
> > > + return 0;
> > > }
> > >
> > >
> > > @@ -783,6 +811,7 @@ static void char_socket_server_test(gconstpointer opaque)
> > >
> > > reconnect:
> > > data.event = -1;
> > > + data.be = &be;
> > > qemu_chr_fe_set_handlers(&be, NULL, NULL,
> > > char_socket_event, NULL,
> > > &data, NULL, true);
> > > @@ -855,10 +884,13 @@ char_socket_client_server_thread(gpointer data)
> > > QIOChannelSocket *ioc = data;
> > > QIOChannelSocket *cioc;
> > >
> > > +retry:
> > > cioc = qio_channel_socket_accept(ioc, &error_abort);
> > > g_assert_nonnull(cioc);
> > >
> > > - char_socket_ping_pong(QIO_CHANNEL(cioc));
> > > + if (char_socket_ping_pong(QIO_CHANNEL(cioc)) != 0) {
> > > + goto retry;
> > > + }
> > >
> > > return NULL;
> > > }
> > > @@ -869,6 +901,7 @@ typedef struct {
> > > const char *reconnect;
> > > bool wait_connected;
> > > bool fd_pass;
> > > + char_socket_cb event_cb;
> > > } CharSocketClientTestConfig;
> > >
> > > static void char_socket_client_dupid_test(gconstpointer opaque)
> > > @@ -920,6 +953,7 @@ static void char_socket_client_dupid_test(gconstpointer opaque)
> > > static void char_socket_client_test(gconstpointer opaque)
> > > {
> > > const CharSocketClientTestConfig *config = opaque;
> > > + const char_socket_cb event_cb = config->event_cb;
> > > QIOChannelSocket *ioc;
> > > char *optstr;
> > > Chardev *chr;
> > > @@ -983,8 +1017,9 @@ static void char_socket_client_test(gconstpointer opaque)
> > >
> > > reconnect:
> > > data.event = -1;
> > > + data.be = &be;
> > > qemu_chr_fe_set_handlers(&be, NULL, NULL,
> > > - char_socket_event, NULL,
> > > + event_cb, NULL,
> > > &data, NULL, true);
> > > if (config->reconnect) {
> > > g_assert(data.event == -1);
> > > @@ -1022,7 +1057,7 @@ static void char_socket_client_test(gconstpointer opaque)
> > > /* Setup a callback to receive the reply to our greeting */
> > > qemu_chr_fe_set_handlers(&be, char_socket_can_read,
> > > char_socket_read,
> > > - char_socket_event, NULL,
> > > + event_cb, NULL,
> > > &data, NULL, true);
> > > g_assert(data.event == CHR_EVENT_OPENED);
> > > data.event = -1;
> > > @@ -1467,19 +1502,22 @@ int main(int argc, char **argv)
> > >
> > > #define SOCKET_CLIENT_TEST(name, addr) \
> > > static CharSocketClientTestConfig client1 ## name = \
> > > - { addr, NULL, false, false }; \
> > > + { addr, NULL, false, false, char_socket_event}; \
> > > static CharSocketClientTestConfig client2 ## name = \
> > > - { addr, NULL, true, false }; \
> > > + { addr, NULL, true, false, char_socket_event }; \
> > > static CharSocketClientTestConfig client3 ## name = \
> > > - { addr, ",reconnect=1", false }; \
> > > + { addr, ",reconnect=1", false, false, char_socket_event }; \
> > > static CharSocketClientTestConfig client4 ## name = \
> > > - { addr, ",reconnect=1", true }; \
> > > + { addr, ",reconnect=1", true, false, char_socket_event }; \
> > > static CharSocketClientTestConfig client5 ## name = \
> > > - { addr, NULL, false, true }; \
> > > + { addr, NULL, false, true, char_socket_event }; \
> > > static CharSocketClientTestConfig client6 ## name = \
> > > - { addr, NULL, true, true }; \
> > > + { addr, NULL, true, true, char_socket_event }; \
> > > static CharSocketClientTestConfig client7 ## name = \
> > > - { addr, ",reconnect=1", false, false }; \
> > > + { addr, ",reconnect=1", false, false, char_socket_event }; \
> > > + static CharSocketClientTestConfig client8 ## name = \
> > > + { addr, ",reconnect=1", true, false, \
> > > + char_socket_event_with_error }; \
> > > g_test_add_data_func("/char/socket/client/mainloop/" # name, \
> > > &client1 ##name, char_socket_client_test); \
> > > g_test_add_data_func("/char/socket/client/wait-conn/" # name, \
> > > @@ -1493,7 +1531,9 @@ int main(int argc, char **argv)
> > > g_test_add_data_func("/char/socket/client/wait-conn-fdpass/" # name, \
> > > &client6 ##name, char_socket_client_test); \
> > > g_test_add_data_func("/char/socket/client/dupid-reconnect/" # name, \
> > > - &client7 ##name, char_socket_client_dupid_test)
> > > + &client7 ##name, char_socket_client_dupid_test);\
> > > + g_test_add_data_func("/char/socket/client/reconnect-error/" # name, \
> > > + &client8 ##name, char_socket_client_test)
> > >
> > > if (has_ipv4) {
> > > SOCKET_SERVER_TEST(tcp, &tcpaddr);
> > > --
> > > 2.11.0
> > >
> >
>
--
Marc-André Lureau
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3 2/2] char-socket: initialize reconnect timer only when the timer doesn't start
2020-05-08 12:28 ` Marc-André Lureau
@ 2020-05-08 12:31 ` Li Feng
0 siblings, 0 replies; 32+ messages in thread
From: Li Feng @ 2020-05-08 12:31 UTC (permalink / raw)
To: Marc-André Lureau
Cc: Feng Li, Kyle Zhang, Dima Stepanov,
open list:All patches CC here, Paolo Bonzini
Thanks!
Feng Li
Marc-André Lureau <marcandre.lureau@gmail.com> 于2020年5月8日周五 下午8:29写道:
>
> Hi
>
> On Fri, May 8, 2020 at 2:16 PM Li Feng <fengli@smartx.com> wrote:
> >
> > Have you applied the first one?
> > io/channel: fix crash when qio_channel_readv_all return 0
> >
>
> Sorry, I missed that one. With that it passes. thanks
>
> > Thanks,
> >
> > Feng Li
> >
> > Marc-André Lureau <marcandre.lureau@redhat.com> 于2020年5月8日周五 下午7:01写道:
> >
> > >
> > > On Fri, May 8, 2020 at 7:14 AM Li Feng <fengli@smartx.com> wrote:
> > > >
> > > > When the disconnect event is triggered in the connecting stage,
> > > > the tcp_chr_disconnect_locked may be called twice.
> > > >
> > > > The first call:
> > > > #0 qemu_chr_socket_restart_timer (chr=0x55555582ee90) at chardev/char-socket.c:120
> > > > #1 0x000055555558e38c in tcp_chr_disconnect_locked (chr=<optimized out>) at chardev/char-socket.c:490
> > > > #2 0x000055555558e3cd in tcp_chr_disconnect (chr=0x55555582ee90) at chardev/char-socket.c:497
> > > > #3 0x000055555558ea32 in tcp_chr_new_client (chr=chr@entry=0x55555582ee90, sioc=sioc@entry=0x55555582f0b0) at chardev/char-socket.c:892
> > > > #4 0x000055555558eeb8 in qemu_chr_socket_connected (task=0x55555582f300, opaque=<optimized out>) at chardev/char-socket.c:1090
> > > > #5 0x0000555555574352 in qio_task_complete (task=task@entry=0x55555582f300) at io/task.c:196
> > > > #6 0x00005555555745f4 in qio_task_thread_result (opaque=0x55555582f300) at io/task.c:111
> > > > #7 qio_task_wait_thread (task=0x55555582f300) at io/task.c:190
> > > > #8 0x000055555558f17e in tcp_chr_wait_connected (chr=0x55555582ee90, errp=0x555555802a08 <error_abort>) at chardev/char-socket.c:1013
> > > > #9 0x0000555555567cbd in char_socket_client_reconnect_test (opaque=0x5555557fe020 <client8unix>) at tests/test-char.c:1152
> > > > The second call:
> > > > #0 0x00007ffff5ac3277 in raise () from /lib64/libc.so.6
> > > > #1 0x00007ffff5ac4968 in abort () from /lib64/libc.so.6
> > > > #2 0x00007ffff5abc096 in __assert_fail_base () from /lib64/libc.so.6
> > > > #3 0x00007ffff5abc142 in __assert_fail () from /lib64/libc.so.6
> > > > #4 0x000055555558d10a in qemu_chr_socket_restart_timer (chr=0x55555582ee90) at chardev/char-socket.c:125
> > > > #5 0x000055555558df0c in tcp_chr_disconnect_locked (chr=<optimized out>) at chardev/char-socket.c:490
> > > > #6 0x000055555558df4d in tcp_chr_disconnect (chr=0x55555582ee90) at chardev/char-socket.c:497
> > > > #7 0x000055555558e5b2 in tcp_chr_new_client (chr=chr@entry=0x55555582ee90, sioc=sioc@entry=0x55555582f0b0) at chardev/char-socket.c:892
> > > > #8 0x000055555558e93a in tcp_chr_connect_client_sync (chr=chr@entry=0x55555582ee90, errp=errp@entry=0x7fffffffd178) at chardev/char-socket.c:944
> > > > #9 0x000055555558ec78 in tcp_chr_wait_connected (chr=0x55555582ee90, errp=0x555555802a08 <error_abort>) at chardev/char-socket.c:1035
> > > > #10 0x000055555556804b in char_socket_client_test (opaque=0x5555557fe020 <client8unix>) at tests/test-char.c:1023
> > > >
> > > > Run test/test-char to reproduce this issue.
> > > >
> > > > test-char: chardev/char-socket.c:125: qemu_chr_socket_restart_timer: Assertion `!s->reconnect_timer' failed.
> > > >
> > > > Signed-off-by: Li Feng <fengli@smartx.com>
> > >
> > > After applying your patch, with qemu configure --enable-debug
> > > --enable-sanitizers:
> > >
> > > $ tests/test-char
> > > Unexpected error in qio_channel_readv_all() at
> > > /home/elmarco/src/qq/io/channel.c:147:
> > > Unexpected end-of-file before all bytes were read
> > > [1] 2445287 abort (core dumped) tests/test-char
> > >
> > > > ---
> > > > chardev/char-socket.c | 2 +-
> > > > tests/test-char.c | 68 ++++++++++++++++++++++++++++++++++++++++-----------
> > > > 2 files changed, 55 insertions(+), 15 deletions(-)
> > > >
> > > > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> > > > index 1f14c2c7c8..d84330b3c9 100644
> > > > --- a/chardev/char-socket.c
> > > > +++ b/chardev/char-socket.c
> > > > @@ -486,7 +486,7 @@ static void tcp_chr_disconnect_locked(Chardev *chr)
> > > > if (emit_close) {
> > > > qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
> > > > }
> > > > - if (s->reconnect_time) {
> > > > + if (s->reconnect_time && !s->reconnect_timer) {
> > > > qemu_chr_socket_restart_timer(chr);
> > > > }
> > > > }
> > > > diff --git a/tests/test-char.c b/tests/test-char.c
> > > > index 8d39bdc9fa..d5c9049eec 100644
> > > > --- a/tests/test-char.c
> > > > +++ b/tests/test-char.c
> > > > @@ -625,12 +625,14 @@ static void char_udp_test(void)
> > > > typedef struct {
> > > > int event;
> > > > bool got_pong;
> > > > + CharBackend *be;
> > > > } CharSocketTestData;
> > > >
> > > >
> > > > #define SOCKET_PING "Hello"
> > > > #define SOCKET_PONG "World"
> > > >
> > > > +typedef void (*char_socket_cb)(void *opaque, QEMUChrEvent event);
> > > >
> > > > static void
> > > > char_socket_event(void *opaque, QEMUChrEvent event)
> > > > @@ -639,6 +641,27 @@ char_socket_event(void *opaque, QEMUChrEvent event)
> > > > data->event = event;
> > > > }
> > > >
> > > > +static void
> > > > +char_socket_event_with_error(void *opaque, QEMUChrEvent event)
> > > > +{
> > > > + static bool first_error;
> > > > + CharSocketTestData *data = opaque;
> > > > + CharBackend *be = data->be;
> > > > + data->event = event;
> > > > + switch (event) {
> > > > + case CHR_EVENT_OPENED:
> > > > + if (!first_error) {
> > > > + first_error = true;
> > > > + qemu_chr_fe_disconnect(be);
> > > > + }
> > > > + return;
> > > > + case CHR_EVENT_CLOSED:
> > > > + return;
> > > > + default:
> > > > + return;
> > > > + }
> > > > +}
> > > > +
> > > >
> > > > static void
> > > > char_socket_read(void *opaque, const uint8_t *buf, int size)
> > > > @@ -699,19 +722,24 @@ char_socket_addr_to_opt_str(SocketAddress *addr, bool fd_pass,
> > > > }
> > > >
> > > >
> > > > -static void
> > > > +static int
> > > > char_socket_ping_pong(QIOChannel *ioc)
> > > > {
> > > > char greeting[sizeof(SOCKET_PING)];
> > > > const char *response = SOCKET_PONG;
> > > >
> > > > - qio_channel_read_all(ioc, greeting, sizeof(greeting), &error_abort);
> > > > + int ret;
> > > > + ret = qio_channel_read_all(ioc, greeting, sizeof(greeting), &error_abort);
> > > > + if (ret != 0) {
> > > > + object_unref(OBJECT(ioc));
> > > > + return -1;
> > > > + }
> > > >
> > > > g_assert(memcmp(greeting, SOCKET_PING, sizeof(greeting)) == 0);
> > > >
> > > > qio_channel_write_all(ioc, response, sizeof(SOCKET_PONG), &error_abort);
> > > > -
> > > > object_unref(OBJECT(ioc));
> > > > + return 0;
> > > > }
> > > >
> > > >
> > > > @@ -783,6 +811,7 @@ static void char_socket_server_test(gconstpointer opaque)
> > > >
> > > > reconnect:
> > > > data.event = -1;
> > > > + data.be = &be;
> > > > qemu_chr_fe_set_handlers(&be, NULL, NULL,
> > > > char_socket_event, NULL,
> > > > &data, NULL, true);
> > > > @@ -855,10 +884,13 @@ char_socket_client_server_thread(gpointer data)
> > > > QIOChannelSocket *ioc = data;
> > > > QIOChannelSocket *cioc;
> > > >
> > > > +retry:
> > > > cioc = qio_channel_socket_accept(ioc, &error_abort);
> > > > g_assert_nonnull(cioc);
> > > >
> > > > - char_socket_ping_pong(QIO_CHANNEL(cioc));
> > > > + if (char_socket_ping_pong(QIO_CHANNEL(cioc)) != 0) {
> > > > + goto retry;
> > > > + }
> > > >
> > > > return NULL;
> > > > }
> > > > @@ -869,6 +901,7 @@ typedef struct {
> > > > const char *reconnect;
> > > > bool wait_connected;
> > > > bool fd_pass;
> > > > + char_socket_cb event_cb;
> > > > } CharSocketClientTestConfig;
> > > >
> > > > static void char_socket_client_dupid_test(gconstpointer opaque)
> > > > @@ -920,6 +953,7 @@ static void char_socket_client_dupid_test(gconstpointer opaque)
> > > > static void char_socket_client_test(gconstpointer opaque)
> > > > {
> > > > const CharSocketClientTestConfig *config = opaque;
> > > > + const char_socket_cb event_cb = config->event_cb;
> > > > QIOChannelSocket *ioc;
> > > > char *optstr;
> > > > Chardev *chr;
> > > > @@ -983,8 +1017,9 @@ static void char_socket_client_test(gconstpointer opaque)
> > > >
> > > > reconnect:
> > > > data.event = -1;
> > > > + data.be = &be;
> > > > qemu_chr_fe_set_handlers(&be, NULL, NULL,
> > > > - char_socket_event, NULL,
> > > > + event_cb, NULL,
> > > > &data, NULL, true);
> > > > if (config->reconnect) {
> > > > g_assert(data.event == -1);
> > > > @@ -1022,7 +1057,7 @@ static void char_socket_client_test(gconstpointer opaque)
> > > > /* Setup a callback to receive the reply to our greeting */
> > > > qemu_chr_fe_set_handlers(&be, char_socket_can_read,
> > > > char_socket_read,
> > > > - char_socket_event, NULL,
> > > > + event_cb, NULL,
> > > > &data, NULL, true);
> > > > g_assert(data.event == CHR_EVENT_OPENED);
> > > > data.event = -1;
> > > > @@ -1467,19 +1502,22 @@ int main(int argc, char **argv)
> > > >
> > > > #define SOCKET_CLIENT_TEST(name, addr) \
> > > > static CharSocketClientTestConfig client1 ## name = \
> > > > - { addr, NULL, false, false }; \
> > > > + { addr, NULL, false, false, char_socket_event}; \
> > > > static CharSocketClientTestConfig client2 ## name = \
> > > > - { addr, NULL, true, false }; \
> > > > + { addr, NULL, true, false, char_socket_event }; \
> > > > static CharSocketClientTestConfig client3 ## name = \
> > > > - { addr, ",reconnect=1", false }; \
> > > > + { addr, ",reconnect=1", false, false, char_socket_event }; \
> > > > static CharSocketClientTestConfig client4 ## name = \
> > > > - { addr, ",reconnect=1", true }; \
> > > > + { addr, ",reconnect=1", true, false, char_socket_event }; \
> > > > static CharSocketClientTestConfig client5 ## name = \
> > > > - { addr, NULL, false, true }; \
> > > > + { addr, NULL, false, true, char_socket_event }; \
> > > > static CharSocketClientTestConfig client6 ## name = \
> > > > - { addr, NULL, true, true }; \
> > > > + { addr, NULL, true, true, char_socket_event }; \
> > > > static CharSocketClientTestConfig client7 ## name = \
> > > > - { addr, ",reconnect=1", false, false }; \
> > > > + { addr, ",reconnect=1", false, false, char_socket_event }; \
> > > > + static CharSocketClientTestConfig client8 ## name = \
> > > > + { addr, ",reconnect=1", true, false, \
> > > > + char_socket_event_with_error }; \
> > > > g_test_add_data_func("/char/socket/client/mainloop/" # name, \
> > > > &client1 ##name, char_socket_client_test); \
> > > > g_test_add_data_func("/char/socket/client/wait-conn/" # name, \
> > > > @@ -1493,7 +1531,9 @@ int main(int argc, char **argv)
> > > > g_test_add_data_func("/char/socket/client/wait-conn-fdpass/" # name, \
> > > > &client6 ##name, char_socket_client_test); \
> > > > g_test_add_data_func("/char/socket/client/dupid-reconnect/" # name, \
> > > > - &client7 ##name, char_socket_client_dupid_test)
> > > > + &client7 ##name, char_socket_client_dupid_test);\
> > > > + g_test_add_data_func("/char/socket/client/reconnect-error/" # name, \
> > > > + &client8 ##name, char_socket_client_test)
> > > >
> > > > if (has_ipv4) {
> > > > SOCKET_SERVER_TEST(tcp, &tcpaddr);
> > > > --
> > > > 2.11.0
> > > >
> > >
> >
>
>
> --
> Marc-André Lureau
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3 1/2] io/channel: fix crash when qio_channel_readv_all return 0
2020-05-08 5:14 ` [PATCH v3 1/2] io/channel: fix crash when qio_channel_readv_all return 0 Li Feng
@ 2020-05-08 12:32 ` Marc-André Lureau
2020-05-08 12:42 ` Li Feng
0 siblings, 1 reply; 32+ messages in thread
From: Marc-André Lureau @ 2020-05-08 12:32 UTC (permalink / raw)
To: Li Feng
Cc: Li Feng, Dima Stepanov, Kyle Zhang, Daniel P. Berrangé,
open list:All patches CC here
Hi
On Fri, May 8, 2020 at 7:14 AM Li Feng <fengli@smartx.com> wrote:
>
> Root cause:
> From `man recvmsg`, the RETURN VALUE says:
> These calls return the number of bytes received, or -1 if an error occurred.
> In the event of an error, errno is set to indicate the error.
> The return value will be 0 when the peer has performed an orderly shutdown.
>
> When an error happens, the socket will be closed, and recvmsg return 0,
> then error_setg will trigger a crash.
>
> This unit test could reproduce this issue:
> tests/test-char -p /char/socket/client/reconnect-error/unix
Current master doesn't trigger the backtrace, it's only after your patch 2.
>
> The core file backtrace is :
>
> (gdb) bt
> #0 0x00007ffff5ac3277 in raise () from /lib64/libc.so.6
> #1 0x00007ffff5ac4968 in abort () from /lib64/libc.so.6
> #2 0x00005555555aaa94 in error_handle_fatal (errp=<optimized out>, err=0x7fffec0012d0) at util/error.c:40
> #3 0x00005555555aab6d in error_setv (errp=0x555555802a08 <error_abort>, src=0x5555555c4280 "io/channel.c", line=148,
> func=0x5555555c4580 <__func__.17489> "qio_channel_readv_all", err_class=ERROR_CLASS_GENERIC_ERROR,
> fmt=<optimized out>, ap=0x7ffff423bae0, suffix=0x0) at util/error.c:73
> #4 0x00005555555aacf0 in error_setg_internal (errp=errp@entry=0x555555802a08 <error_abort>,
> src=src@entry=0x5555555c4280 "io/channel.c", line=line@entry=148,
> func=func@entry=0x5555555c4580 <__func__.17489> "qio_channel_readv_all",
> fmt=fmt@entry=0x5555555c43a0 "Unexpected end-of-file before all bytes were read") at util/error.c:97
> #5 0x000055555556c25c in qio_channel_readv_all (ioc=<optimized out>, iov=<optimized out>, niov=<optimized out>,
> errp=0x555555802a08 <error_abort>) at io/channel.c:147
> #6 0x000055555556c29a in qio_channel_read_all (ioc=<optimized out>, buf=<optimized out>, buflen=<optimized out>,
> errp=<optimized out>) at io/channel.c:247
> #7 0x000055555556ad22 in char_socket_ping_pong (ioc=0x7fffec0008c0) at tests/test-char.c:732
> #8 0x000055555556ae12 in char_socket_client_server_thread (data=data@entry=0x55555582e350) at tests/test-char.c:891
> #9 0x00005555555a95b6 in qemu_thread_start (args=<optimized out>) at util/qemu-thread-posix.c:519
> #10 0x00007ffff5e61e25 in start_thread () from /lib64/libpthread.so.0
> #11 0x00007ffff5b8bbad in clone () from /lib64/libc.so.6
>
> Signed-off-by: Li Feng <fengli@smartx.com>
> ---
> io/channel.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/io/channel.c b/io/channel.c
> index e4376eb0bc..1a4a505f01 100644
> --- a/io/channel.c
> +++ b/io/channel.c
> @@ -144,8 +144,6 @@ int qio_channel_readv_all(QIOChannel *ioc,
>
> if (ret == 0) {
> ret = -1;
> - error_setg(errp,
> - "Unexpected end-of-file before all bytes were read");
Nack, this code is fine.
The problem is that the test case doesn't expect a disconnect in
char_socket_ping_pong().
--
Marc-André Lureau
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3 1/2] io/channel: fix crash when qio_channel_readv_all return 0
2020-05-08 12:32 ` Marc-André Lureau
@ 2020-05-08 12:42 ` Li Feng
2020-05-11 10:02 ` Daniel P. Berrangé
0 siblings, 1 reply; 32+ messages in thread
From: Li Feng @ 2020-05-08 12:42 UTC (permalink / raw)
To: Marc-André Lureau
Cc: Li Feng, Dima Stepanov, Kyle Zhang, Daniel P. Berrangé,
open list:All patches CC here
Marc-André Lureau <marcandre.lureau@gmail.com> 于2020年5月8日周五 下午8:32写道:
>
> Hi
>
> On Fri, May 8, 2020 at 7:14 AM Li Feng <fengli@smartx.com> wrote:
> >
> > Root cause:
> > From `man recvmsg`, the RETURN VALUE says:
> > These calls return the number of bytes received, or -1 if an error occurred.
> > In the event of an error, errno is set to indicate the error.
> > The return value will be 0 when the peer has performed an orderly shutdown.
> >
> > When an error happens, the socket will be closed, and recvmsg return 0,
> > then error_setg will trigger a crash.
> >
> > This unit test could reproduce this issue:
> > tests/test-char -p /char/socket/client/reconnect-error/unix
>
> Current master doesn't trigger the backtrace, it's only after your patch 2.
Yes. However, the issue did exist in the master code base.
The test case in patch 2 revealed this issue.
>
> >
> > The core file backtrace is :
> >
> > (gdb) bt
> > #0 0x00007ffff5ac3277 in raise () from /lib64/libc.so.6
> > #1 0x00007ffff5ac4968 in abort () from /lib64/libc.so.6
> > #2 0x00005555555aaa94 in error_handle_fatal (errp=<optimized out>, err=0x7fffec0012d0) at util/error.c:40
> > #3 0x00005555555aab6d in error_setv (errp=0x555555802a08 <error_abort>, src=0x5555555c4280 "io/channel.c", line=148,
> > func=0x5555555c4580 <__func__.17489> "qio_channel_readv_all", err_class=ERROR_CLASS_GENERIC_ERROR,
> > fmt=<optimized out>, ap=0x7ffff423bae0, suffix=0x0) at util/error.c:73
> > #4 0x00005555555aacf0 in error_setg_internal (errp=errp@entry=0x555555802a08 <error_abort>,
> > src=src@entry=0x5555555c4280 "io/channel.c", line=line@entry=148,
> > func=func@entry=0x5555555c4580 <__func__.17489> "qio_channel_readv_all",
> > fmt=fmt@entry=0x5555555c43a0 "Unexpected end-of-file before all bytes were read") at util/error.c:97
> > #5 0x000055555556c25c in qio_channel_readv_all (ioc=<optimized out>, iov=<optimized out>, niov=<optimized out>,
> > errp=0x555555802a08 <error_abort>) at io/channel.c:147
> > #6 0x000055555556c29a in qio_channel_read_all (ioc=<optimized out>, buf=<optimized out>, buflen=<optimized out>,
> > errp=<optimized out>) at io/channel.c:247
> > #7 0x000055555556ad22 in char_socket_ping_pong (ioc=0x7fffec0008c0) at tests/test-char.c:732
> > #8 0x000055555556ae12 in char_socket_client_server_thread (data=data@entry=0x55555582e350) at tests/test-char.c:891
> > #9 0x00005555555a95b6 in qemu_thread_start (args=<optimized out>) at util/qemu-thread-posix.c:519
> > #10 0x00007ffff5e61e25 in start_thread () from /lib64/libpthread.so.0
> > #11 0x00007ffff5b8bbad in clone () from /lib64/libc.so.6
> >
> > Signed-off-by: Li Feng <fengli@smartx.com>
> > ---
> > io/channel.c | 2 --
> > 1 file changed, 2 deletions(-)
> >
> > diff --git a/io/channel.c b/io/channel.c
> > index e4376eb0bc..1a4a505f01 100644
> > --- a/io/channel.c
> > +++ b/io/channel.c
> > @@ -144,8 +144,6 @@ int qio_channel_readv_all(QIOChannel *ioc,
> >
> > if (ret == 0) {
> > ret = -1;
> > - error_setg(errp,
> > - "Unexpected end-of-file before all bytes were read");
>
> Nack, this code is fine.
>
> The problem is that the test case doesn't expect a disconnect in
> char_socket_ping_pong().
Yes, in the test case I try to inject an error in char_socket_ping_pong.
Any concerns about these two patches?
>
> --
> Marc-André Lureau
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3 1/2] io/channel: fix crash when qio_channel_readv_all return 0
2020-05-08 12:42 ` Li Feng
@ 2020-05-11 10:02 ` Daniel P. Berrangé
2020-05-11 12:09 ` Li Feng
0 siblings, 1 reply; 32+ messages in thread
From: Daniel P. Berrangé @ 2020-05-11 10:02 UTC (permalink / raw)
To: Li Feng
Cc: Li Feng, Kyle Zhang, Marc-André Lureau,
open list:All patches CC here, Dima Stepanov
On Fri, May 08, 2020 at 08:42:22PM +0800, Li Feng wrote:
> Marc-André Lureau <marcandre.lureau@gmail.com> 于2020年5月8日周五 下午8:32写道:
> >
> > Hi
> >
> > On Fri, May 8, 2020 at 7:14 AM Li Feng <fengli@smartx.com> wrote:
> > >
> > > Root cause:
> > > From `man recvmsg`, the RETURN VALUE says:
> > > These calls return the number of bytes received, or -1 if an error occurred.
> > > In the event of an error, errno is set to indicate the error.
> > > The return value will be 0 when the peer has performed an orderly shutdown.
> > >
> > > When an error happens, the socket will be closed, and recvmsg return 0,
> > > then error_setg will trigger a crash.
> > >
> > > This unit test could reproduce this issue:
> > > tests/test-char -p /char/socket/client/reconnect-error/unix
> >
> > Current master doesn't trigger the backtrace, it's only after your patch 2.
> Yes. However, the issue did exist in the master code base.
> The test case in patch 2 revealed this issue.
>
> >
> > >
> > > The core file backtrace is :
> > >
> > > (gdb) bt
> > > #0 0x00007ffff5ac3277 in raise () from /lib64/libc.so.6
> > > #1 0x00007ffff5ac4968 in abort () from /lib64/libc.so.6
> > > #2 0x00005555555aaa94 in error_handle_fatal (errp=<optimized out>, err=0x7fffec0012d0) at util/error.c:40
> > > #3 0x00005555555aab6d in error_setv (errp=0x555555802a08 <error_abort>, src=0x5555555c4280 "io/channel.c", line=148,
> > > func=0x5555555c4580 <__func__.17489> "qio_channel_readv_all", err_class=ERROR_CLASS_GENERIC_ERROR,
> > > fmt=<optimized out>, ap=0x7ffff423bae0, suffix=0x0) at util/error.c:73
> > > #4 0x00005555555aacf0 in error_setg_internal (errp=errp@entry=0x555555802a08 <error_abort>,
> > > src=src@entry=0x5555555c4280 "io/channel.c", line=line@entry=148,
> > > func=func@entry=0x5555555c4580 <__func__.17489> "qio_channel_readv_all",
> > > fmt=fmt@entry=0x5555555c43a0 "Unexpected end-of-file before all bytes were read") at util/error.c:97
> > > #5 0x000055555556c25c in qio_channel_readv_all (ioc=<optimized out>, iov=<optimized out>, niov=<optimized out>,
> > > errp=0x555555802a08 <error_abort>) at io/channel.c:147
> > > #6 0x000055555556c29a in qio_channel_read_all (ioc=<optimized out>, buf=<optimized out>, buflen=<optimized out>,
> > > errp=<optimized out>) at io/channel.c:247
> > > #7 0x000055555556ad22 in char_socket_ping_pong (ioc=0x7fffec0008c0) at tests/test-char.c:732
> > > #8 0x000055555556ae12 in char_socket_client_server_thread (data=data@entry=0x55555582e350) at tests/test-char.c:891
> > > #9 0x00005555555a95b6 in qemu_thread_start (args=<optimized out>) at util/qemu-thread-posix.c:519
> > > #10 0x00007ffff5e61e25 in start_thread () from /lib64/libpthread.so.0
> > > #11 0x00007ffff5b8bbad in clone () from /lib64/libc.so.6
> > >
> > > Signed-off-by: Li Feng <fengli@smartx.com>
> > > ---
> > > io/channel.c | 2 --
> > > 1 file changed, 2 deletions(-)
> > >
> > > diff --git a/io/channel.c b/io/channel.c
> > > index e4376eb0bc..1a4a505f01 100644
> > > --- a/io/channel.c
> > > +++ b/io/channel.c
> > > @@ -144,8 +144,6 @@ int qio_channel_readv_all(QIOChannel *ioc,
> > >
> > > if (ret == 0) {
> > > ret = -1;
> > > - error_setg(errp,
> > > - "Unexpected end-of-file before all bytes were read");
> >
> > Nack, this code is fine.
> >
> > The problem is that the test case doesn't expect a disconnect in
> > char_socket_ping_pong().
> Yes, in the test case I try to inject an error in char_socket_ping_pong.
> Any concerns about these two patches?
I agree with Marc-André - this patch is wrong. qio_channel_readv_all
*MUST* always set 'errp' if the return value is -1. To not set 'errp'
is violating the calling convention.
The bug here is in your test case - it is passing '&error_abort' to the
'qio_channel_readv_all' call. If you don't want the test to crash, then
don't pass &error_abort
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3 1/2] io/channel: fix crash when qio_channel_readv_all return 0
2020-05-11 10:02 ` Daniel P. Berrangé
@ 2020-05-11 12:09 ` Li Feng
0 siblings, 0 replies; 32+ messages in thread
From: Li Feng @ 2020-05-11 12:09 UTC (permalink / raw)
To: Daniel P. Berrangé
Cc: Li Feng, Kyle Zhang, Marc-André Lureau,
open list:All patches CC here, Dima Stepanov
Thank you, Daniel and Marc-André.
I understand now. The error_abort is used to abort the program.
I should rewrite the char_socket_ping_pong.
I will post a new version without this wrong patch.
Thanks,
Feng Li
Daniel P. Berrangé <berrange@redhat.com> 于2020年5月11日周一 下午6:03写道:
>
> On Fri, May 08, 2020 at 08:42:22PM +0800, Li Feng wrote:
> > Marc-André Lureau <marcandre.lureau@gmail.com> 于2020年5月8日周五 下午8:32写道:
> > >
> > > Hi
> > >
> > > On Fri, May 8, 2020 at 7:14 AM Li Feng <fengli@smartx.com> wrote:
> > > >
> > > > Root cause:
> > > > From `man recvmsg`, the RETURN VALUE says:
> > > > These calls return the number of bytes received, or -1 if an error occurred.
> > > > In the event of an error, errno is set to indicate the error.
> > > > The return value will be 0 when the peer has performed an orderly shutdown.
> > > >
> > > > When an error happens, the socket will be closed, and recvmsg return 0,
> > > > then error_setg will trigger a crash.
> > > >
> > > > This unit test could reproduce this issue:
> > > > tests/test-char -p /char/socket/client/reconnect-error/unix
> > >
> > > Current master doesn't trigger the backtrace, it's only after your patch 2.
> > Yes. However, the issue did exist in the master code base.
> > The test case in patch 2 revealed this issue.
> >
> > >
> > > >
> > > > The core file backtrace is :
> > > >
> > > > (gdb) bt
> > > > #0 0x00007ffff5ac3277 in raise () from /lib64/libc.so.6
> > > > #1 0x00007ffff5ac4968 in abort () from /lib64/libc.so.6
> > > > #2 0x00005555555aaa94 in error_handle_fatal (errp=<optimized out>, err=0x7fffec0012d0) at util/error.c:40
> > > > #3 0x00005555555aab6d in error_setv (errp=0x555555802a08 <error_abort>, src=0x5555555c4280 "io/channel.c", line=148,
> > > > func=0x5555555c4580 <__func__.17489> "qio_channel_readv_all", err_class=ERROR_CLASS_GENERIC_ERROR,
> > > > fmt=<optimized out>, ap=0x7ffff423bae0, suffix=0x0) at util/error.c:73
> > > > #4 0x00005555555aacf0 in error_setg_internal (errp=errp@entry=0x555555802a08 <error_abort>,
> > > > src=src@entry=0x5555555c4280 "io/channel.c", line=line@entry=148,
> > > > func=func@entry=0x5555555c4580 <__func__.17489> "qio_channel_readv_all",
> > > > fmt=fmt@entry=0x5555555c43a0 "Unexpected end-of-file before all bytes were read") at util/error.c:97
> > > > #5 0x000055555556c25c in qio_channel_readv_all (ioc=<optimized out>, iov=<optimized out>, niov=<optimized out>,
> > > > errp=0x555555802a08 <error_abort>) at io/channel.c:147
> > > > #6 0x000055555556c29a in qio_channel_read_all (ioc=<optimized out>, buf=<optimized out>, buflen=<optimized out>,
> > > > errp=<optimized out>) at io/channel.c:247
> > > > #7 0x000055555556ad22 in char_socket_ping_pong (ioc=0x7fffec0008c0) at tests/test-char.c:732
> > > > #8 0x000055555556ae12 in char_socket_client_server_thread (data=data@entry=0x55555582e350) at tests/test-char.c:891
> > > > #9 0x00005555555a95b6 in qemu_thread_start (args=<optimized out>) at util/qemu-thread-posix.c:519
> > > > #10 0x00007ffff5e61e25 in start_thread () from /lib64/libpthread.so.0
> > > > #11 0x00007ffff5b8bbad in clone () from /lib64/libc.so.6
> > > >
> > > > Signed-off-by: Li Feng <fengli@smartx.com>
> > > > ---
> > > > io/channel.c | 2 --
> > > > 1 file changed, 2 deletions(-)
> > > >
> > > > diff --git a/io/channel.c b/io/channel.c
> > > > index e4376eb0bc..1a4a505f01 100644
> > > > --- a/io/channel.c
> > > > +++ b/io/channel.c
> > > > @@ -144,8 +144,6 @@ int qio_channel_readv_all(QIOChannel *ioc,
> > > >
> > > > if (ret == 0) {
> > > > ret = -1;
> > > > - error_setg(errp,
> > > > - "Unexpected end-of-file before all bytes were read");
> > >
> > > Nack, this code is fine.
> > >
> > > The problem is that the test case doesn't expect a disconnect in
> > > char_socket_ping_pong().
> > Yes, in the test case I try to inject an error in char_socket_ping_pong.
> > Any concerns about these two patches?
>
> I agree with Marc-André - this patch is wrong. qio_channel_readv_all
> *MUST* always set 'errp' if the return value is -1. To not set 'errp'
> is violating the calling convention.
>
> The bug here is in your test case - it is passing '&error_abort' to the
> 'qio_channel_readv_all' call. If you don't want the test to crash, then
> don't pass &error_abort
>
> Regards,
> Daniel
> --
> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o- https://fstop138.berrange.com :|
> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v4] char-socket: initialize reconnect timer only when the timer doesn't start
2020-04-28 8:51 ` [PATCH v2] char-socket: initialize reconnect timer only when the timer doesn't start Li Feng
[not found] ` <20200508051441.8143-1-fengli@smartx.com>
@ 2020-05-11 12:39 ` Li Feng
2020-05-21 15:15 ` Paolo Bonzini
1 sibling, 1 reply; 32+ messages in thread
From: Li Feng @ 2020-05-11 12:39 UTC (permalink / raw)
To: kyle, lifeng1519, dimastep, marcandre.lureau, Paolo Bonzini,
open list:All patches CC here
Cc: Li Feng
When the disconnect event is triggered in the connecting stage,
the tcp_chr_disconnect_locked may be called twice.
The first call:
#0 qemu_chr_socket_restart_timer (chr=0x55555582ee90) at chardev/char-socket.c:120
#1 0x000055555558e38c in tcp_chr_disconnect_locked (chr=<optimized out>) at chardev/char-socket.c:490
#2 0x000055555558e3cd in tcp_chr_disconnect (chr=0x55555582ee90) at chardev/char-socket.c:497
#3 0x000055555558ea32 in tcp_chr_new_client (chr=chr@entry=0x55555582ee90, sioc=sioc@entry=0x55555582f0b0) at chardev/char-socket.c:892
#4 0x000055555558eeb8 in qemu_chr_socket_connected (task=0x55555582f300, opaque=<optimized out>) at chardev/char-socket.c:1090
#5 0x0000555555574352 in qio_task_complete (task=task@entry=0x55555582f300) at io/task.c:196
#6 0x00005555555745f4 in qio_task_thread_result (opaque=0x55555582f300) at io/task.c:111
#7 qio_task_wait_thread (task=0x55555582f300) at io/task.c:190
#8 0x000055555558f17e in tcp_chr_wait_connected (chr=0x55555582ee90, errp=0x555555802a08 <error_abort>) at chardev/char-socket.c:1013
#9 0x0000555555567cbd in char_socket_client_reconnect_test (opaque=0x5555557fe020 <client8unix>) at tests/test-char.c:1152
The second call:
#0 0x00007ffff5ac3277 in raise () from /lib64/libc.so.6
#1 0x00007ffff5ac4968 in abort () from /lib64/libc.so.6
#2 0x00007ffff5abc096 in __assert_fail_base () from /lib64/libc.so.6
#3 0x00007ffff5abc142 in __assert_fail () from /lib64/libc.so.6
#4 0x000055555558d10a in qemu_chr_socket_restart_timer (chr=0x55555582ee90) at chardev/char-socket.c:125
#5 0x000055555558df0c in tcp_chr_disconnect_locked (chr=<optimized out>) at chardev/char-socket.c:490
#6 0x000055555558df4d in tcp_chr_disconnect (chr=0x55555582ee90) at chardev/char-socket.c:497
#7 0x000055555558e5b2 in tcp_chr_new_client (chr=chr@entry=0x55555582ee90, sioc=sioc@entry=0x55555582f0b0) at chardev/char-socket.c:892
#8 0x000055555558e93a in tcp_chr_connect_client_sync (chr=chr@entry=0x55555582ee90, errp=errp@entry=0x7fffffffd178) at chardev/char-socket.c:944
#9 0x000055555558ec78 in tcp_chr_wait_connected (chr=0x55555582ee90, errp=0x555555802a08 <error_abort>) at chardev/char-socket.c:1035
#10 0x000055555556804b in char_socket_client_test (opaque=0x5555557fe020 <client8unix>) at tests/test-char.c:1023
Run test/test-char to reproduce this issue.
test-char: chardev/char-socket.c:125: qemu_chr_socket_restart_timer: Assertion `!s->reconnect_timer' failed.
Signed-off-by: Li Feng <fengli@smartx.com>
---
v4:
- remove the wrong patch
- fix the char_socket_ping_pong to support the reconnect exception test
v3:
- add a patch to fix a crash when recvmsg return 0
- make the tests reproduce the two crash
v2:
- add unit test
chardev/char-socket.c | 2 +-
tests/test-char.c | 74 +++++++++++++++++++++++++++++++++++++++------------
2 files changed, 58 insertions(+), 18 deletions(-)
diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 1f14c2c7c8..d84330b3c9 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -486,7 +486,7 @@ static void tcp_chr_disconnect_locked(Chardev *chr)
if (emit_close) {
qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
}
- if (s->reconnect_time) {
+ if (s->reconnect_time && !s->reconnect_timer) {
qemu_chr_socket_restart_timer(chr);
}
}
diff --git a/tests/test-char.c b/tests/test-char.c
index 8d39bdc9fa..d94580c61e 100644
--- a/tests/test-char.c
+++ b/tests/test-char.c
@@ -625,12 +625,14 @@ static void char_udp_test(void)
typedef struct {
int event;
bool got_pong;
+ CharBackend *be;
} CharSocketTestData;
#define SOCKET_PING "Hello"
#define SOCKET_PONG "World"
+typedef void (*char_socket_cb)(void *opaque, QEMUChrEvent event);
static void
char_socket_event(void *opaque, QEMUChrEvent event)
@@ -639,6 +641,27 @@ char_socket_event(void *opaque, QEMUChrEvent event)
data->event = event;
}
+static void
+char_socket_event_with_error(void *opaque, QEMUChrEvent event)
+{
+ static bool first_error;
+ CharSocketTestData *data = opaque;
+ CharBackend *be = data->be;
+ data->event = event;
+ switch (event) {
+ case CHR_EVENT_OPENED:
+ if (!first_error) {
+ first_error = true;
+ qemu_chr_fe_disconnect(be);
+ }
+ return;
+ case CHR_EVENT_CLOSED:
+ return;
+ default:
+ return;
+ }
+}
+
static void
char_socket_read(void *opaque, const uint8_t *buf, int size)
@@ -699,19 +722,24 @@ char_socket_addr_to_opt_str(SocketAddress *addr, bool fd_pass,
}
-static void
-char_socket_ping_pong(QIOChannel *ioc)
+static int
+char_socket_ping_pong(QIOChannel *ioc, Error **errp)
{
char greeting[sizeof(SOCKET_PING)];
const char *response = SOCKET_PONG;
- qio_channel_read_all(ioc, greeting, sizeof(greeting), &error_abort);
+ int ret;
+ ret = qio_channel_read_all(ioc, greeting, sizeof(greeting), errp);
+ if (ret != 0) {
+ object_unref(OBJECT(ioc));
+ return -1;
+ }
g_assert(memcmp(greeting, SOCKET_PING, sizeof(greeting)) == 0);
- qio_channel_write_all(ioc, response, sizeof(SOCKET_PONG), &error_abort);
-
+ qio_channel_write_all(ioc, response, sizeof(SOCKET_PONG), errp);
object_unref(OBJECT(ioc));
+ return 0;
}
@@ -723,7 +751,7 @@ char_socket_server_client_thread(gpointer data)
qio_channel_socket_connect_sync(ioc, addr, &error_abort);
- char_socket_ping_pong(QIO_CHANNEL(ioc));
+ char_socket_ping_pong(QIO_CHANNEL(ioc), &error_abort);
return NULL;
}
@@ -783,6 +811,7 @@ static void char_socket_server_test(gconstpointer opaque)
reconnect:
data.event = -1;
+ data.be = &be;
qemu_chr_fe_set_handlers(&be, NULL, NULL,
char_socket_event, NULL,
&data, NULL, true);
@@ -855,10 +884,13 @@ char_socket_client_server_thread(gpointer data)
QIOChannelSocket *ioc = data;
QIOChannelSocket *cioc;
+retry:
cioc = qio_channel_socket_accept(ioc, &error_abort);
g_assert_nonnull(cioc);
- char_socket_ping_pong(QIO_CHANNEL(cioc));
+ if (char_socket_ping_pong(QIO_CHANNEL(cioc), NULL) != 0) {
+ goto retry;
+ }
return NULL;
}
@@ -869,6 +901,7 @@ typedef struct {
const char *reconnect;
bool wait_connected;
bool fd_pass;
+ char_socket_cb event_cb;
} CharSocketClientTestConfig;
static void char_socket_client_dupid_test(gconstpointer opaque)
@@ -920,6 +953,7 @@ static void char_socket_client_dupid_test(gconstpointer opaque)
static void char_socket_client_test(gconstpointer opaque)
{
const CharSocketClientTestConfig *config = opaque;
+ const char_socket_cb event_cb = config->event_cb;
QIOChannelSocket *ioc;
char *optstr;
Chardev *chr;
@@ -983,8 +1017,9 @@ static void char_socket_client_test(gconstpointer opaque)
reconnect:
data.event = -1;
+ data.be = &be;
qemu_chr_fe_set_handlers(&be, NULL, NULL,
- char_socket_event, NULL,
+ event_cb, NULL,
&data, NULL, true);
if (config->reconnect) {
g_assert(data.event == -1);
@@ -1022,7 +1057,7 @@ static void char_socket_client_test(gconstpointer opaque)
/* Setup a callback to receive the reply to our greeting */
qemu_chr_fe_set_handlers(&be, char_socket_can_read,
char_socket_read,
- char_socket_event, NULL,
+ event_cb, NULL,
&data, NULL, true);
g_assert(data.event == CHR_EVENT_OPENED);
data.event = -1;
@@ -1467,19 +1502,22 @@ int main(int argc, char **argv)
#define SOCKET_CLIENT_TEST(name, addr) \
static CharSocketClientTestConfig client1 ## name = \
- { addr, NULL, false, false }; \
+ { addr, NULL, false, false, char_socket_event}; \
static CharSocketClientTestConfig client2 ## name = \
- { addr, NULL, true, false }; \
+ { addr, NULL, true, false, char_socket_event }; \
static CharSocketClientTestConfig client3 ## name = \
- { addr, ",reconnect=1", false }; \
+ { addr, ",reconnect=1", false, false, char_socket_event }; \
static CharSocketClientTestConfig client4 ## name = \
- { addr, ",reconnect=1", true }; \
+ { addr, ",reconnect=1", true, false, char_socket_event }; \
static CharSocketClientTestConfig client5 ## name = \
- { addr, NULL, false, true }; \
+ { addr, NULL, false, true, char_socket_event }; \
static CharSocketClientTestConfig client6 ## name = \
- { addr, NULL, true, true }; \
+ { addr, NULL, true, true, char_socket_event }; \
static CharSocketClientTestConfig client7 ## name = \
- { addr, ",reconnect=1", false, false }; \
+ { addr, ",reconnect=1", false, false, char_socket_event }; \
+ static CharSocketClientTestConfig client8 ## name = \
+ { addr, ",reconnect=1", true, false, \
+ char_socket_event_with_error }; \
g_test_add_data_func("/char/socket/client/mainloop/" # name, \
&client1 ##name, char_socket_client_test); \
g_test_add_data_func("/char/socket/client/wait-conn/" # name, \
@@ -1493,7 +1531,9 @@ int main(int argc, char **argv)
g_test_add_data_func("/char/socket/client/wait-conn-fdpass/" # name, \
&client6 ##name, char_socket_client_test); \
g_test_add_data_func("/char/socket/client/dupid-reconnect/" # name, \
- &client7 ##name, char_socket_client_dupid_test)
+ &client7 ##name, char_socket_client_dupid_test);\
+ g_test_add_data_func("/char/socket/client/reconnect-error/" # name, \
+ &client8 ##name, char_socket_client_test)
if (has_ipv4) {
SOCKET_SERVER_TEST(tcp, &tcpaddr);
--
2.11.0
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v4] char-socket: initialize reconnect timer only when the timer doesn't start
2020-05-11 12:39 ` [PATCH v4] " Li Feng
@ 2020-05-21 15:15 ` Paolo Bonzini
0 siblings, 0 replies; 32+ messages in thread
From: Paolo Bonzini @ 2020-05-21 15:15 UTC (permalink / raw)
To: Li Feng, kyle, lifeng1519, dimastep, marcandre.lureau,
open list:All patches CC here
Sorry, can you resend this?
I cannot apply it.
Paolo
On 11/05/20 14:39, Li Feng wrote:
> When the disconnect event is triggered in the connecting stage,
> the tcp_chr_disconnect_locked may be called twice.
>
> The first call:
> #0 qemu_chr_socket_restart_timer (chr=0x55555582ee90) at chardev/char-socket.c:120
> #1 0x000055555558e38c in tcp_chr_disconnect_locked (chr=<optimized out>) at chardev/char-socket.c:490
> #2 0x000055555558e3cd in tcp_chr_disconnect (chr=0x55555582ee90) at chardev/char-socket.c:497
> #3 0x000055555558ea32 in tcp_chr_new_client (chr=chr@entry=0x55555582ee90, sioc=sioc@entry=0x55555582f0b0) at chardev/char-socket.c:892
> #4 0x000055555558eeb8 in qemu_chr_socket_connected (task=0x55555582f300, opaque=<optimized out>) at chardev/char-socket.c:1090
> #5 0x0000555555574352 in qio_task_complete (task=task@entry=0x55555582f300) at io/task.c:196
> #6 0x00005555555745f4 in qio_task_thread_result (opaque=0x55555582f300) at io/task.c:111
> #7 qio_task_wait_thread (task=0x55555582f300) at io/task.c:190
> #8 0x000055555558f17e in tcp_chr_wait_connected (chr=0x55555582ee90, errp=0x555555802a08 <error_abort>) at chardev/char-socket.c:1013
> #9 0x0000555555567cbd in char_socket_client_reconnect_test (opaque=0x5555557fe020 <client8unix>) at tests/test-char.c:1152
> The second call:
> #0 0x00007ffff5ac3277 in raise () from /lib64/libc.so.6
> #1 0x00007ffff5ac4968 in abort () from /lib64/libc.so.6
> #2 0x00007ffff5abc096 in __assert_fail_base () from /lib64/libc.so.6
> #3 0x00007ffff5abc142 in __assert_fail () from /lib64/libc.so.6
> #4 0x000055555558d10a in qemu_chr_socket_restart_timer (chr=0x55555582ee90) at chardev/char-socket.c:125
> #5 0x000055555558df0c in tcp_chr_disconnect_locked (chr=<optimized out>) at chardev/char-socket.c:490
> #6 0x000055555558df4d in tcp_chr_disconnect (chr=0x55555582ee90) at chardev/char-socket.c:497
> #7 0x000055555558e5b2 in tcp_chr_new_client (chr=chr@entry=0x55555582ee90, sioc=sioc@entry=0x55555582f0b0) at chardev/char-socket.c:892
> #8 0x000055555558e93a in tcp_chr_connect_client_sync (chr=chr@entry=0x55555582ee90, errp=errp@entry=0x7fffffffd178) at chardev/char-socket.c:944
> #9 0x000055555558ec78 in tcp_chr_wait_connected (chr=0x55555582ee90, errp=0x555555802a08 <error_abort>) at chardev/char-socket.c:1035
> #10 0x000055555556804b in char_socket_client_test (opaque=0x5555557fe020 <client8unix>) at tests/test-char.c:1023
>
> Run test/test-char to reproduce this issue.
>
> test-char: chardev/char-socket.c:125: qemu_chr_socket_restart_timer: Assertion `!s->reconnect_timer' failed.
>
> Signed-off-by: Li Feng <fengli@smartx.com>
> ---
> v4:
> - remove the wrong patch
> - fix the char_socket_ping_pong to support the reconnect exception test
>
> v3:
> - add a patch to fix a crash when recvmsg return 0
> - make the tests reproduce the two crash
>
> v2:
> - add unit test
>
> chardev/char-socket.c | 2 +-
> tests/test-char.c | 74 +++++++++++++++++++++++++++++++++++++++------------
> 2 files changed, 58 insertions(+), 18 deletions(-)
>
> diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> index 1f14c2c7c8..d84330b3c9 100644
> --- a/chardev/char-socket.c
> +++ b/chardev/char-socket.c
> @@ -486,7 +486,7 @@ static void tcp_chr_disconnect_locked(Chardev *chr)
> if (emit_close) {
> qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
> }
> - if (s->reconnect_time) {
> + if (s->reconnect_time && !s->reconnect_timer) {
> qemu_chr_socket_restart_timer(chr);
> }
> }
> diff --git a/tests/test-char.c b/tests/test-char.c
> index 8d39bdc9fa..d94580c61e 100644
> --- a/tests/test-char.c
> +++ b/tests/test-char.c
> @@ -625,12 +625,14 @@ static void char_udp_test(void)
> typedef struct {
> int event;
> bool got_pong;
> + CharBackend *be;
> } CharSocketTestData;
>
>
> #define SOCKET_PING "Hello"
> #define SOCKET_PONG "World"
>
> +typedef void (*char_socket_cb)(void *opaque, QEMUChrEvent event);
>
> static void
> char_socket_event(void *opaque, QEMUChrEvent event)
> @@ -639,6 +641,27 @@ char_socket_event(void *opaque, QEMUChrEvent event)
> data->event = event;
> }
>
> +static void
> +char_socket_event_with_error(void *opaque, QEMUChrEvent event)
> +{
> + static bool first_error;
> + CharSocketTestData *data = opaque;
> + CharBackend *be = data->be;
> + data->event = event;
> + switch (event) {
> + case CHR_EVENT_OPENED:
> + if (!first_error) {
> + first_error = true;
> + qemu_chr_fe_disconnect(be);
> + }
> + return;
> + case CHR_EVENT_CLOSED:
> + return;
> + default:
> + return;
> + }
> +}
> +
>
> static void
> char_socket_read(void *opaque, const uint8_t *buf, int size)
> @@ -699,19 +722,24 @@ char_socket_addr_to_opt_str(SocketAddress *addr, bool fd_pass,
> }
>
>
> -static void
> -char_socket_ping_pong(QIOChannel *ioc)
> +static int
> +char_socket_ping_pong(QIOChannel *ioc, Error **errp)
> {
> char greeting[sizeof(SOCKET_PING)];
> const char *response = SOCKET_PONG;
>
> - qio_channel_read_all(ioc, greeting, sizeof(greeting), &error_abort);
> + int ret;
> + ret = qio_channel_read_all(ioc, greeting, sizeof(greeting), errp);
> + if (ret != 0) {
> + object_unref(OBJECT(ioc));
> + return -1;
> + }
>
> g_assert(memcmp(greeting, SOCKET_PING, sizeof(greeting)) == 0);
>
> - qio_channel_write_all(ioc, response, sizeof(SOCKET_PONG), &error_abort);
> -
> + qio_channel_write_all(ioc, response, sizeof(SOCKET_PONG), errp);
> object_unref(OBJECT(ioc));
> + return 0;
> }
>
>
> @@ -723,7 +751,7 @@ char_socket_server_client_thread(gpointer data)
>
> qio_channel_socket_connect_sync(ioc, addr, &error_abort);
>
> - char_socket_ping_pong(QIO_CHANNEL(ioc));
> + char_socket_ping_pong(QIO_CHANNEL(ioc), &error_abort);
>
> return NULL;
> }
> @@ -783,6 +811,7 @@ static void char_socket_server_test(gconstpointer opaque)
>
> reconnect:
> data.event = -1;
> + data.be = &be;
> qemu_chr_fe_set_handlers(&be, NULL, NULL,
> char_socket_event, NULL,
> &data, NULL, true);
> @@ -855,10 +884,13 @@ char_socket_client_server_thread(gpointer data)
> QIOChannelSocket *ioc = data;
> QIOChannelSocket *cioc;
>
> +retry:
> cioc = qio_channel_socket_accept(ioc, &error_abort);
> g_assert_nonnull(cioc);
>
> - char_socket_ping_pong(QIO_CHANNEL(cioc));
> + if (char_socket_ping_pong(QIO_CHANNEL(cioc), NULL) != 0) {
> + goto retry;
> + }
>
> return NULL;
> }
> @@ -869,6 +901,7 @@ typedef struct {
> const char *reconnect;
> bool wait_connected;
> bool fd_pass;
> + char_socket_cb event_cb;
> } CharSocketClientTestConfig;
>
> static void char_socket_client_dupid_test(gconstpointer opaque)
> @@ -920,6 +953,7 @@ static void char_socket_client_dupid_test(gconstpointer opaque)
> static void char_socket_client_test(gconstpointer opaque)
> {
> const CharSocketClientTestConfig *config = opaque;
> + const char_socket_cb event_cb = config->event_cb;
> QIOChannelSocket *ioc;
> char *optstr;
> Chardev *chr;
> @@ -983,8 +1017,9 @@ static void char_socket_client_test(gconstpointer opaque)
>
> reconnect:
> data.event = -1;
> + data.be = &be;
> qemu_chr_fe_set_handlers(&be, NULL, NULL,
> - char_socket_event, NULL,
> + event_cb, NULL,
> &data, NULL, true);
> if (config->reconnect) {
> g_assert(data.event == -1);
> @@ -1022,7 +1057,7 @@ static void char_socket_client_test(gconstpointer opaque)
> /* Setup a callback to receive the reply to our greeting */
> qemu_chr_fe_set_handlers(&be, char_socket_can_read,
> char_socket_read,
> - char_socket_event, NULL,
> + event_cb, NULL,
> &data, NULL, true);
> g_assert(data.event == CHR_EVENT_OPENED);
> data.event = -1;
> @@ -1467,19 +1502,22 @@ int main(int argc, char **argv)
>
> #define SOCKET_CLIENT_TEST(name, addr) \
> static CharSocketClientTestConfig client1 ## name = \
> - { addr, NULL, false, false }; \
> + { addr, NULL, false, false, char_socket_event}; \
> static CharSocketClientTestConfig client2 ## name = \
> - { addr, NULL, true, false }; \
> + { addr, NULL, true, false, char_socket_event }; \
> static CharSocketClientTestConfig client3 ## name = \
> - { addr, ",reconnect=1", false }; \
> + { addr, ",reconnect=1", false, false, char_socket_event }; \
> static CharSocketClientTestConfig client4 ## name = \
> - { addr, ",reconnect=1", true }; \
> + { addr, ",reconnect=1", true, false, char_socket_event }; \
> static CharSocketClientTestConfig client5 ## name = \
> - { addr, NULL, false, true }; \
> + { addr, NULL, false, true, char_socket_event }; \
> static CharSocketClientTestConfig client6 ## name = \
> - { addr, NULL, true, true }; \
> + { addr, NULL, true, true, char_socket_event }; \
> static CharSocketClientTestConfig client7 ## name = \
> - { addr, ",reconnect=1", false, false }; \
> + { addr, ",reconnect=1", false, false, char_socket_event }; \
> + static CharSocketClientTestConfig client8 ## name = \
> + { addr, ",reconnect=1", true, false, \
> + char_socket_event_with_error }; \
> g_test_add_data_func("/char/socket/client/mainloop/" # name, \
> &client1 ##name, char_socket_client_test); \
> g_test_add_data_func("/char/socket/client/wait-conn/" # name, \
> @@ -1493,7 +1531,9 @@ int main(int argc, char **argv)
> g_test_add_data_func("/char/socket/client/wait-conn-fdpass/" # name, \
> &client6 ##name, char_socket_client_test); \
> g_test_add_data_func("/char/socket/client/dupid-reconnect/" # name, \
> - &client7 ##name, char_socket_client_dupid_test)
> + &client7 ##name, char_socket_client_dupid_test);\
> + g_test_add_data_func("/char/socket/client/reconnect-error/" # name, \
> + &client8 ##name, char_socket_client_test)
>
> if (has_ipv4) {
> SOCKET_SERVER_TEST(tcp, &tcpaddr);
>
^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2020-05-21 15:16 UTC | newest]
Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-15 3:28 [PATCH 0/4] fix crashes when inject errors to vhost-user-blk chardev Li Feng
2020-04-15 3:28 ` [PATCH 1/4] vhost-user-blk: delay vhost_user_blk_disconnect Li Feng
2020-04-14 1:24 ` Raphael Norwitz
2020-04-20 11:18 ` Li Feng
2020-04-15 3:28 ` [PATCH 2/4] vhost-user-blk: fix invalid memory access Li Feng
2020-04-14 1:44 ` Raphael Norwitz
2020-04-15 3:28 ` [PATCH 3/4] char-socket: avoid double call tcp_chr_free_connection Li Feng
2020-04-15 10:35 ` Marc-André Lureau
2020-04-20 3:10 ` Li Feng
2020-04-20 14:28 ` Marc-André Lureau
2020-04-28 8:51 ` [PATCH v2] char-socket: initialize reconnect timer only when the timer doesn't start Li Feng
[not found] ` <20200508051441.8143-1-fengli@smartx.com>
2020-05-08 5:14 ` [PATCH v3 1/2] io/channel: fix crash when qio_channel_readv_all return 0 Li Feng
2020-05-08 12:32 ` Marc-André Lureau
2020-05-08 12:42 ` Li Feng
2020-05-11 10:02 ` Daniel P. Berrangé
2020-05-11 12:09 ` Li Feng
2020-05-08 5:14 ` [PATCH v3 2/2] char-socket: initialize reconnect timer only when the timer doesn't start Li Feng
2020-05-08 11:00 ` Marc-André Lureau
2020-05-08 12:16 ` Li Feng
2020-05-08 12:28 ` Marc-André Lureau
2020-05-08 12:31 ` Li Feng
2020-05-11 12:39 ` [PATCH v4] " Li Feng
2020-05-21 15:15 ` Paolo Bonzini
2020-04-28 8:54 ` [PATCH v2] " Li Feng
2020-04-28 8:59 ` Li Feng
2020-04-28 11:05 ` Marc-André Lureau
2020-04-28 14:09 ` Feng Li
2020-04-15 3:28 ` [PATCH 4/4] vhost-user-blk: fix crash in realize process Li Feng
2020-04-14 1:54 ` Raphael Norwitz
[not found] ` <CAHckoCwsLwqNhVHK4PF+pzt5ExkurPNsWLw7ueRG8dfOUA2rXg@mail.gmail.com>
2020-04-17 10:07 ` Fwd: " Li Feng
2020-04-17 9:45 ` [PATCH 0/4] fix crashes when inject errors to vhost-user-blk chardev Michael S. Tsirkin
2020-04-17 10:11 ` Li Feng
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.