qemu-devel.nongnu.org archive mirror
* [PATCH v3 0/3] vhost-user-blk: fix bug on device disconnection during initialization
@ 2021-03-25 15:12 Denis Plotnikov
  2021-03-25 15:12 ` [PATCH v3 1/3] vhost-user-blk: use different event handlers on initialization Denis Plotnikov
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Denis Plotnikov @ 2021-03-25 15:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, qemu-block, mst, raphael.norwitz, yc-core, mreitz

v3:
  * 0003: a new patch added fixing the problem on vm shutdown;
    I stumbled upon this bug after sending v2.
  * 0001: grammar fixes (Raphael)
  * 0002: commit message fixes (Raphael)

v2:
  * split the initial patch into two (Raphael)
  * rename init to realized (Raphael)
  * remove unrelated comment (Raphael)

When the vhost-user-blk device loses the connection to the daemon during
the initialization phase, qemu is killed by an assert in the code.
This series fixes the bug.

0001 is a preparation for the fix
0002 fixes the bug; its patch description has the full motivation for the series
0003 (added in v3) fixes a bug on vm shutdown

Denis Plotnikov (3):
  vhost-user-blk: use different event handlers on initialization
  vhost-user-blk: perform immediate cleanup if disconnect on
    initialization
  vhost-user-blk: add immediate cleanup on shutdown

 hw/block/vhost-user-blk.c | 79 ++++++++++++++++++++++++---------------
 1 file changed, 48 insertions(+), 31 deletions(-)

-- 
2.25.1



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v3 1/3] vhost-user-blk: use different event handlers on initialization
  2021-03-25 15:12 [PATCH v3 0/3] vhost-user-blk: fix bug on device disconnection during initialization Denis Plotnikov
@ 2021-03-25 15:12 ` Denis Plotnikov
  2021-03-25 15:12 ` [PATCH v3 2/3] vhost-user-blk: perform immediate cleanup if disconnect " Denis Plotnikov
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Denis Plotnikov @ 2021-03-25 15:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, qemu-block, mst, raphael.norwitz, yc-core, mreitz

It is useful to use different connect/disconnect event handlers
during device initialization and during operation, as shown by the
following commit, which fixes a bug on device initialization.

This patch refactors the code to make use of them: we no longer rely
on the VM state to choose how to clean up the device; instead
we explicitly use the proper event handler depending on whether
the device has been initialized.

Signed-off-by: Denis Plotnikov <den-plotnikov@yandex-team.ru>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
---
 hw/block/vhost-user-blk.c | 31 ++++++++++++++++++++++++-------
 1 file changed, 24 insertions(+), 7 deletions(-)

diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index b870a50e6b20..1af95ec6aae7 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -362,7 +362,18 @@ static void vhost_user_blk_disconnect(DeviceState *dev)
     vhost_dev_cleanup(&s->dev);
 }
 
-static void vhost_user_blk_event(void *opaque, QEMUChrEvent event);
+static void vhost_user_blk_event(void *opaque, QEMUChrEvent event,
+                                 bool realized);
+
+static void vhost_user_blk_event_realize(void *opaque, QEMUChrEvent event)
+{
+    vhost_user_blk_event(opaque, event, false);
+}
+
+static void vhost_user_blk_event_oper(void *opaque, QEMUChrEvent event)
+{
+    vhost_user_blk_event(opaque, event, true);
+}
 
 static void vhost_user_blk_chr_closed_bh(void *opaque)
 {
@@ -371,11 +382,12 @@ static void vhost_user_blk_chr_closed_bh(void *opaque)
     VHostUserBlk *s = VHOST_USER_BLK(vdev);
 
     vhost_user_blk_disconnect(dev);
-    qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, vhost_user_blk_event,
-            NULL, opaque, NULL, true);
+    qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL,
+            vhost_user_blk_event_oper, NULL, opaque, NULL, true);
 }
 
-static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
+static void vhost_user_blk_event(void *opaque, QEMUChrEvent event,
+                                 bool realized)
 {
     DeviceState *dev = opaque;
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -406,7 +418,7 @@ static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
          * TODO: maybe it is a good idea to make the same fix
          * for other vhost-user devices.
          */
-        if (runstate_is_running()) {
+        if (realized) {
             AioContext *ctx = qemu_get_current_aio_context();
 
             qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, NULL, NULL,
@@ -473,8 +485,9 @@ static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
     s->vhost_vqs = g_new0(struct vhost_virtqueue, s->num_queues);
     s->connected = false;
 
-    qemu_chr_fe_set_handlers(&s->chardev,  NULL, NULL, vhost_user_blk_event,
-                             NULL, (void *)dev, NULL, true);
+    qemu_chr_fe_set_handlers(&s->chardev,  NULL, NULL,
+                             vhost_user_blk_event_realize, NULL, (void *)dev,
+                             NULL, true);
 
 reconnect:
     if (qemu_chr_fe_wait_connected(&s->chardev, &err) < 0) {
@@ -494,6 +507,10 @@ reconnect:
         goto reconnect;
     }
 
+    /* we're fully initialized, now we can operate, so change the handler */
+    qemu_chr_fe_set_handlers(&s->chardev,  NULL, NULL,
+                             vhost_user_blk_event_oper, NULL, (void *)dev,
+                             NULL, true);
     return;
 
 virtio_err:
-- 
2.25.1
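As a self-contained illustration of the handler-swap pattern this patch introduces (not QEMU code -- the struct and function names below are invented for the sketch), the realize path can be modeled like this:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the two-handler scheme; names are illustrative,
 * not the real QEMU API. */
typedef enum { EV_OPENED, EV_CLOSED } ChrEvent;

typedef struct Dev {
    bool connected;
    bool cleanup_deferred;   /* stands in for the BH-postponed disconnect */
    void (*event_cb)(struct Dev *d, ChrEvent ev);
} Dev;

/* init-time handler: nobody uses the device yet, so clean up at once */
static void event_realize(Dev *d, ChrEvent ev)
{
    if (ev == EV_CLOSED) {
        d->connected = false;        /* immediate disconnect/cleanup */
    } else {
        d->connected = true;
    }
}

/* operation-time handler: defer cleanup so in-flight users can roll back */
static void event_oper(Dev *d, ChrEvent ev)
{
    if (ev == EV_CLOSED) {
        d->cleanup_deferred = true;  /* stands in for scheduling the BH */
    } else {
        d->connected = true;
    }
}

/* models the realize function: start with the init handler, swap to the
 * operational one once initialization has succeeded */
static void dev_realize(Dev *d)
{
    d->event_cb = event_realize;
    d->event_cb(d, EV_OPENED);
    d->event_cb = event_oper;
}
```

A close event delivered before the swap runs the immediate path; one delivered after the swap only marks the cleanup as deferred, mirroring the two branches of vhost_user_blk_event().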




* [PATCH v3 2/3] vhost-user-blk: perform immediate cleanup if disconnect on initialization
  2021-03-25 15:12 [PATCH v3 0/3] vhost-user-blk: fix bug on device disconnection during initialization Denis Plotnikov
  2021-03-25 15:12 ` [PATCH v3 1/3] vhost-user-blk: use different event handlers on initialization Denis Plotnikov
@ 2021-03-25 15:12 ` Denis Plotnikov
  2021-04-21 15:24   ` Kevin Wolf
  2021-03-25 15:12 ` [PATCH v3 3/3] vhost-user-blk: add immediate cleanup on shutdown Denis Plotnikov
  2021-03-29 13:44 ` [PATCH v3 0/3] vhost-user-blk: fix bug on device disconnection during initialization Denis Plotnikov
  3 siblings, 1 reply; 12+ messages in thread
From: Denis Plotnikov @ 2021-03-25 15:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, qemu-block, mst, raphael.norwitz, yc-core, mreitz

Commit 4bcad76f4c39 ("vhost-user-blk: delay vhost_user_blk_disconnect")
introduced postponing vhost_dev cleanup, aiming to eliminate qemu aborts
because of connection problems with the vhost-blk daemon.

However, it introduces a new problem. Now, any communication error
during the execution of vhost_dev_init() called by vhost_user_blk_device_realize()
leads to a qemu abort on the assert in vhost_dev_get_config().

This happens because vhost_user_blk_disconnect() is postponed, but
it should have dropped the s->connected flag by the time
vhost_user_blk_device_realize() opens a new connection.
On the connection opening, vhost_dev initialization in
vhost_user_blk_connect() relies on the s->connected flag, and
if it hasn't been dropped, it skips vhost_dev initialization and returns
success. Then, vhost_user_blk_device_realize()'s execution flow
reaches vhost_dev_get_config(), where it is aborted on the assert.

To fix the problem, this patch adds immediate cleanup on device
initialization (in vhost_user_blk_device_realize()) using the different
event handlers for initialization and operation introduced in the
previous patch.
On initialization (in vhost_user_blk_device_realize()) we fully
control the initialization process. At that point, nobody can use the
device since it isn't initialized yet, and we don't need to postpone any
cleanup, so we can clean up right away when there is a communication
problem with the vhost-blk daemon.
On operation we leave it as is, since the disconnect may happen while
the device is in use, so the device users may want to use the vhost_dev's data
to roll back before the vhost_dev is re-initialized (e.g. in vhost_dev_set_log()).

Signed-off-by: Denis Plotnikov <den-plotnikov@yandex-team.ru>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
---
 hw/block/vhost-user-blk.c | 48 +++++++++++++++++++--------------------
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index 1af95ec6aae7..4e215f71f152 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -402,38 +402,38 @@ static void vhost_user_blk_event(void *opaque, QEMUChrEvent event,
         break;
     case CHR_EVENT_CLOSED:
         /*
-         * A close event may happen during a read/write, but vhost
-         * code assumes the vhost_dev remains setup, so delay the
-         * stop & clear. There are two possible paths to hit this
-         * disconnect event:
-         * 1. When VM is in the RUN_STATE_PRELAUNCH state. The
-         * vhost_user_blk_device_realize() is a caller.
-         * 2. In tha main loop phase after VM start.
-         *
-         * For p2 the disconnect event will be delayed. We can't
-         * do the same for p1, because we are not running the loop
-         * at this moment. So just skip this step and perform
-         * disconnect in the caller function.
-         *
-         * TODO: maybe it is a good idea to make the same fix
-         * for other vhost-user devices.
+         * Closing the connection should happen differently on device
+         * initialization and operation stages.
+         * On initalization, we want to re-start vhost_dev initialization
+         * from the very beginning right away when the connection is closed,
+         * so we clean up vhost_dev on each connection closing.
+         * On operation, we want to postpone vhost_dev cleanup to let the
+         * other code perform its own cleanup sequence using vhost_dev data
+         * (e.g. vhost_dev_set_log).
          */
         if (realized) {
+            /*
+             * A close event may happen during a read/write, but vhost
+             * code assumes the vhost_dev remains setup, so delay the
+             * stop & clear.
+             */
             AioContext *ctx = qemu_get_current_aio_context();
 
             qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, NULL, NULL,
                     NULL, NULL, false);
             aio_bh_schedule_oneshot(ctx, vhost_user_blk_chr_closed_bh, opaque);
-        }
 
-        /*
-         * Move vhost device to the stopped state. The vhost-user device
-         * will be clean up and disconnected in BH. This can be useful in
-         * the vhost migration code. If disconnect was caught there is an
-         * option for the general vhost code to get the dev state without
-         * knowing its type (in this case vhost-user).
-         */
-        s->dev.started = false;
+            /*
+             * Move vhost device to the stopped state. The vhost-user device
+             * will be clean up and disconnected in BH. This can be useful in
+             * the vhost migration code. If disconnect was caught there is an
+             * option for the general vhost code to get the dev state without
+             * knowing its type (in this case vhost-user).
+             */
+            s->dev.started = false;
+        } else {
+            vhost_user_blk_disconnect(dev);
+        }
         break;
     case CHR_EVENT_BREAK:
     case CHR_EVENT_MUX_IN:
-- 
2.25.1
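The stale-flag bug this patch fixes can be reduced to a few lines. Below is a hypothetical sketch (the struct and function names are invented for illustration; they model vhost_user_blk_connect(), vhost_dev_get_config(), and the immediate cleanup, not the real signatures):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the realize/reconnect state machine. */
typedef struct {
    bool connected;
    bool dev_initialized;
} Blk;

/* models vhost_user_blk_connect(): skips init if already "connected" */
static int blk_connect(Blk *s)
{
    if (s->connected) {
        return 0;               /* skip vhost_dev initialization */
    }
    s->connected = true;
    s->dev_initialized = true;  /* stands in for vhost_dev_init() */
    return 0;
}

/* models vhost_dev_get_config(): asserts the device was set up */
static void blk_get_config(Blk *s)
{
    assert(s->dev_initialized);
}

/* this patch's immediate cleanup: drop the state right away, so a
 * reconnect during realize actually re-runs initialization */
static void blk_disconnect_immediate(Blk *s)
{
    s->connected = false;
    s->dev_initialized = false;
}
```

With the postponed cleanup, the connected flag survives into the retry, blk_connect() skips initialization, and blk_get_config() aborts; with the immediate cleanup the retry re-initializes and the assert passes.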




* [PATCH v3 3/3] vhost-user-blk: add immediate cleanup on shutdown
  2021-03-25 15:12 [PATCH v3 0/3] vhost-user-blk: fix bug on device disconnection during initialization Denis Plotnikov
  2021-03-25 15:12 ` [PATCH v3 1/3] vhost-user-blk: use different event handlers on initialization Denis Plotnikov
  2021-03-25 15:12 ` [PATCH v3 2/3] vhost-user-blk: perform immediate cleanup if disconnect " Denis Plotnikov
@ 2021-03-25 15:12 ` Denis Plotnikov
  2021-03-29 13:44 ` [PATCH v3 0/3] vhost-user-blk: fix bug on device disconnection during initialization Denis Plotnikov
  3 siblings, 0 replies; 12+ messages in thread
From: Denis Plotnikov @ 2021-03-25 15:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, qemu-block, mst, raphael.norwitz, yc-core, mreitz

Qemu crashes on shutdown if the chardev used by vhost-user-blk has been
finalized before the vhost-user-blk device.

This happens with a char-socket chardev operating in listening mode (server).
The char-socket chardev emits a "close" event at the end of finalizing, when
its internal data is destroyed. This calls the vhost-user-blk event handler,
which in turn tries to manipulate the already destroyed chardev by setting an
empty event handler to postpone the vhost-user-blk cleanup.

This patch separates the shutdown case from the cleanup postponing, removing
the need to set an event handler.

Signed-off-by: Denis Plotnikov <den-plotnikov@yandex-team.ru>
---
 hw/block/vhost-user-blk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index 4e215f71f152..0b5b9d44cdb0 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -411,7 +411,7 @@ static void vhost_user_blk_event(void *opaque, QEMUChrEvent event,
          * other code perform its own cleanup sequence using vhost_dev data
          * (e.g. vhost_dev_set_log).
          */
-        if (realized) {
+        if (realized && !runstate_check(RUN_STATE_SHUTDOWN)) {
             /*
              * A close event may happen during a read/write, but vhost
              * code assumes the vhost_dev remains setup, so delay the
-- 
2.25.1
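The condition after this one-line change can be captured in a small decision function. A sketch (illustrative names, not the QEMU API), showing that cleanup is deferred only when the device is realized and the VM is not shutting down:

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { CLEANUP_DEFERRED, CLEANUP_IMMEDIATE } CleanupMode;

/* models the CHR_EVENT_CLOSED branch after this patch: defer the
 * cleanup only when the device is realized AND we are not in the
 * shutdown runstate, since on shutdown the chardev may already be
 * gone and no handler can be installed on it */
static CleanupMode close_cleanup_mode(bool realized, bool shutdown)
{
    return (realized && !shutdown) ? CLEANUP_DEFERRED : CLEANUP_IMMEDIATE;
}
```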




* Re: [PATCH v3 0/3] vhost-user-blk: fix bug on device disconnection during initialization
  2021-03-25 15:12 [PATCH v3 0/3] vhost-user-blk: fix bug on device disconnection during initialization Denis Plotnikov
                   ` (2 preceding siblings ...)
  2021-03-25 15:12 ` [PATCH v3 3/3] vhost-user-blk: add immediate cleanup on shutdown Denis Plotnikov
@ 2021-03-29 13:44 ` Denis Plotnikov
  2021-04-01  9:21   ` [BUG FIX][PATCH " Denis Plotnikov
  3 siblings, 1 reply; 12+ messages in thread
From: Denis Plotnikov @ 2021-03-29 13:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, qemu-block, mst, raphael.norwitz, yc-core, mreitz


ping!

On 25.03.2021 18:12, Denis Plotnikov wrote:
> [...]



* [BUG FIX][PATCH v3 0/3] vhost-user-blk: fix bug on device disconnection during initialization
  2021-03-29 13:44 ` [PATCH v3 0/3] vhost-user-blk: fix bug on device disconnection during initialization Denis Plotnikov
@ 2021-04-01  9:21   ` Denis Plotnikov
  2021-04-01 19:01     ` Valentin Sinitsyn
  0 siblings, 1 reply; 12+ messages in thread
From: Denis Plotnikov @ 2021-04-01  9:21 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, qemu-block, mst, raphael.norwitz, yc-core, mreitz


This is a series fixing a bug in vhost-user-blk.
Is there any chance for it to be considered for the next rc?

Thanks!

Denis

On 29.03.2021 16:44, Denis Plotnikov wrote:
>
> ping!
>
> On 25.03.2021 18:12, Denis Plotnikov wrote:
>> [...]



* Re: [BUG FIX][PATCH v3 0/3] vhost-user-blk: fix bug on device disconnection during initialization
  2021-04-01  9:21   ` [BUG FIX][PATCH " Denis Plotnikov
@ 2021-04-01 19:01     ` Valentin Sinitsyn
  0 siblings, 0 replies; 12+ messages in thread
From: Valentin Sinitsyn @ 2021-04-01 19:01 UTC (permalink / raw)
  To: Denis Plotnikov, qemu-devel
  Cc: kwolf, qemu-block, mst, raphael.norwitz, yc-core, mreitz

On 01.04.2021 14:21, Denis Plotnikov wrote:
> This is a series fixing a bug in vhost-user-blk.
More specifically, it's not just a bug but a crasher.

Valentine

> Is there any chance for it to be considered for the next rc?
> 
> Thanks!
> 
> Denis
> 
> On 29.03.2021 16:44, Denis Plotnikov wrote:
>> [...]



* Re: [PATCH v3 2/3] vhost-user-blk: perform immediate cleanup if disconnect on initialization
  2021-03-25 15:12 ` [PATCH v3 2/3] vhost-user-blk: perform immediate cleanup if disconnect " Denis Plotnikov
@ 2021-04-21 15:24   ` Kevin Wolf
  2021-04-21 16:13     ` Denis Plotnikov
  0 siblings, 1 reply; 12+ messages in thread
From: Kevin Wolf @ 2021-04-21 15:24 UTC (permalink / raw)
  To: Denis Plotnikov
  Cc: qemu-block, mst, qemu-devel, mreitz, yc-core, raphael.norwitz

Am 25.03.2021 um 16:12 hat Denis Plotnikov geschrieben:
> [...]

I think there is something wrong with this patch.

I'm debugging an error case, specifically num-queues being larger in
QEMU than in the vhost-user-blk export. Before this patch, it just
printed an unfriendly error message:

qemu-system-x86_64: -device vhost-user-blk-pci,chardev=vhost1,id=blk1,iommu_platform=off,disable-legacy=on,num-queues=4: Unexpected end-of-file before all data were read
qemu-system-x86_64: -device vhost-user-blk-pci,chardev=vhost1,id=blk1,iommu_platform=off,disable-legacy=on,num-queues=4: Failed to read msg header. Read 0 instead of 12. Original request 24.
qemu-system-x86_64: -device vhost-user-blk-pci,chardev=vhost1,id=blk1,iommu_platform=off,disable-legacy=on,num-queues=4: vhost-user-blk: get block config failed
qemu-system-x86_64: Failed to set msg fds.
qemu-system-x86_64: vhost VQ 0 ring restore failed: -1: Resource temporarily unavailable (11)

After the patch, it crashes:

#0  0x0000555555d0a4bd in vhost_user_read_cb (source=0x5555568f4690, condition=(G_IO_IN | G_IO_HUP), opaque=0x7fffffffcbf0) at ../hw/virtio/vhost-user.c:313
#1  0x0000555555d950d3 in qio_channel_fd_source_dispatch (source=0x555557c3f750, callback=0x555555d0a478 <vhost_user_read_cb>, user_data=0x7fffffffcbf0) at ../io/channel-watch.c:84
#2  0x00007ffff7b32a9f in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
#3  0x00007ffff7b84a98 in g_main_context_iterate.constprop () at /lib64/libglib-2.0.so.0
#4  0x00007ffff7b32163 in g_main_loop_run () at /lib64/libglib-2.0.so.0
#5  0x0000555555d0a724 in vhost_user_read (dev=0x555557bc62f8, msg=0x7fffffffcc50) at ../hw/virtio/vhost-user.c:402
#6  0x0000555555d0ee6b in vhost_user_get_config (dev=0x555557bc62f8, config=0x555557bc62ac "", config_len=60) at ../hw/virtio/vhost-user.c:2133
#7  0x0000555555d56d46 in vhost_dev_get_config (hdev=0x555557bc62f8, config=0x555557bc62ac "", config_len=60) at ../hw/virtio/vhost.c:1566
#8  0x0000555555cdd150 in vhost_user_blk_device_realize (dev=0x555557bc60b0, errp=0x7fffffffcf90) at ../hw/block/vhost-user-blk.c:510
#9  0x0000555555d08f6d in virtio_device_realize (dev=0x555557bc60b0, errp=0x7fffffffcff0) at ../hw/virtio/virtio.c:3660

The problem is that vhost_user_read_cb() still accesses dev->opaque even
though the device has been cleaned up meanwhile when the connection was
closed (the vhost_user_blk_disconnect() added by this patch), so it's
NULL now. This problem was actually mentioned in the comment that is
removed by this patch.

I tried to fix this by making vhost_user_read() cope with the fact that
the device might have been cleaned up meanwhile, but then I'm running
into the next set of problems.

The first is that retrying is pointless, the error condition is in the
configuration, it will never change.

The other is that after many repetitions of the same error message, I
got a crash where the device is cleaned up a second time in
vhost_dev_init() and the virtqueues are already NULL.

So it seems to me that erroring out during the initialisation phase
makes a lot more sense than retrying.

Kevin
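The failure mode described above reduces to dereferencing state that the immediate disconnect already tore down. A hypothetical sketch (invented names, not the actual vhost-user code) of what a defensive read callback would look like, checking the cleaned-up pointer instead of crashing:

```c
#include <assert.h>
#include <stddef.h>

/* Toy reduction of the crash: after the immediate disconnect, the
 * connection's opaque pointer back to the vhost device is NULL, but
 * the read callback still fires for data that was already in flight. */
struct conn {
    void *opaque;    /* set to NULL by the disconnect cleanup */
};

/* defensive callback: fail the read instead of dereferencing NULL */
static int read_cb(struct conn *c)
{
    if (c->opaque == NULL) {
        return -1;   /* connection torn down; report an error */
    }
    return 0;        /* normal path would parse the message here */
}
```

Even with such a guard, as noted above, retrying is pointless when the error is a fixed configuration mismatch, so failing realize outright is the saner behavior.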

> [...]




* Re: [PATCH v3 2/3] vhost-user-blk: perform immediate cleanup if disconnect on initialization
  2021-04-21 15:24   ` Kevin Wolf
@ 2021-04-21 16:13     ` Denis Plotnikov
  2021-04-21 19:59       ` Michael S. Tsirkin
  0 siblings, 1 reply; 12+ messages in thread
From: Denis Plotnikov @ 2021-04-21 16:13 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, mst, qemu-devel, mreitz, yc-core, raphael.norwitz


On 21.04.2021 18:24, Kevin Wolf wrote:
> Am 25.03.2021 um 16:12 hat Denis Plotnikov geschrieben:
>> [...]
> I think there is something wrong with this patch.
>
> I'm debugging an error case, specifically num-queues being larger in
> QEMU than in the vhost-user-blk export. Before this patch, it just
> printed an unfriendly error message:
>
> qemu-system-x86_64: -device vhost-user-blk-pci,chardev=vhost1,id=blk1,iommu_platform=off,disable-legacy=on,num-queues=4: Unexpected end-of-file before all data were read
> qemu-system-x86_64: -device vhost-user-blk-pci,chardev=vhost1,id=blk1,iommu_platform=off,disable-legacy=on,num-queues=4: Failed to read msg header. Read 0 instead of 12. Original request 24.
> qemu-system-x86_64: -device vhost-user-blk-pci,chardev=vhost1,id=blk1,iommu_platform=off,disable-legacy=on,num-queues=4: vhost-user-blk: get block config failed
> qemu-system-x86_64: Failed to set msg fds.
> qemu-system-x86_64: vhost VQ 0 ring restore failed: -1: Resource temporarily unavailable (11)
>
> After the patch, it crashes:
>
> #0  0x0000555555d0a4bd in vhost_user_read_cb (source=0x5555568f4690, condition=(G_IO_IN | G_IO_HUP), opaque=0x7fffffffcbf0) at ../hw/virtio/vhost-user.c:313
> #1  0x0000555555d950d3 in qio_channel_fd_source_dispatch (source=0x555557c3f750, callback=0x555555d0a478 <vhost_user_read_cb>, user_data=0x7fffffffcbf0) at ../io/channel-watch.c:84
> #2  0x00007ffff7b32a9f in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
> #3  0x00007ffff7b84a98 in g_main_context_iterate.constprop () at /lib64/libglib-2.0.so.0
> #4  0x00007ffff7b32163 in g_main_loop_run () at /lib64/libglib-2.0.so.0
> #5  0x0000555555d0a724 in vhost_user_read (dev=0x555557bc62f8, msg=0x7fffffffcc50) at ../hw/virtio/vhost-user.c:402
> #6  0x0000555555d0ee6b in vhost_user_get_config (dev=0x555557bc62f8, config=0x555557bc62ac "", config_len=60) at ../hw/virtio/vhost-user.c:2133
> #7  0x0000555555d56d46 in vhost_dev_get_config (hdev=0x555557bc62f8, config=0x555557bc62ac "", config_len=60) at ../hw/virtio/vhost.c:1566
> #8  0x0000555555cdd150 in vhost_user_blk_device_realize (dev=0x555557bc60b0, errp=0x7fffffffcf90) at ../hw/block/vhost-user-blk.c:510
> #9  0x0000555555d08f6d in virtio_device_realize (dev=0x555557bc60b0, errp=0x7fffffffcff0) at ../hw/virtio/virtio.c:3660
>
> The problem is that vhost_user_read_cb() still accesses dev->opaque even
> though the device has been cleaned up meanwhile when the connection was
> closed (the vhost_user_blk_disconnect() added by this patch), so it's
> NULL now. This problem was actually mentioned in the comment that is
> removed by this patch.
>
> I tried to fix this by making vhost_user_read() cope with the fact that
> the device might have been cleaned up meanwhile, but then I'm running
> into the next set of problems.
>
> The first is that retrying is pointless, the error condition is in the
> configuration, it will never change.
>
> The other is that after many repetitions of the same error message, I
> got a crash where the device is cleaned up a second time in
> vhost_dev_init() and the virtqueues are already NULL.
>
> So it seems to me that erroring out during the initialisation phase
> makes a lot more sense than retrying.
>
> Kevin

But without the patch there will be another problem, which the patch 
actually addresses.

It seems to me that there are cases where retrying is useless, and yours 
is exactly such a case -- we'll never get a proper configuration.

What if we get rid of the re-connection and give the device only one try 
to realize? Then we don't need to handle initialization and operation of 
the device separately, correct? But I'm not familiar with the cases where 
the reconnect is needed. Do you know anything about them?

Denis
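
[Editorial note: the skip-reinit failure the commit message describes (postponed cleanup leaves s->connected set, so the retried connect skips vhost_dev initialization and the later vhost_dev_get_config() hits its assert) can be modeled in isolation. The sketch below is purely illustrative -- BlkModel and its helpers are invented stand-ins, not QEMU code.]

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model only -- not QEMU code. It mimics the flow from the
 * commit message: vhost_user_blk_connect() skips vhost_dev initialization
 * whenever s->connected is still set. */
typedef struct {
    bool connected;        /* stands in for VHostUserBlk.connected */
    bool dev_initialized;  /* stands in for a fully set up vhost_dev */
} BlkModel;

static void model_connect(BlkModel *s)
{
    if (s->connected) {
        return;            /* initialization skipped: the buggy path */
    }
    s->connected = true;
    s->dev_initialized = true;
}

static void model_disconnect(BlkModel *s, bool immediate_cleanup)
{
    s->dev_initialized = false;   /* the failed init tore down vhost_dev */
    if (immediate_cleanup) {
        s->connected = false;     /* the patch: drop the flag right away */
    }
    /* otherwise the flag drop waits for a BH that has not run yet when
     * realize retries the connection */
}

/* realize: connect, lose the daemon mid-init, then retry the connection */
static bool model_realize_ok(BlkModel *s, bool immediate_cleanup)
{
    model_connect(s);
    model_disconnect(s, immediate_cleanup);
    model_connect(s);
    /* vhost_dev_get_config() would assert if vhost_dev is not set up */
    return s->dev_initialized;
}
```

With immediate cleanup the retried connect re-initializes the device; with postponed cleanup the stale flag makes the retry a no-op and the model ends up in the state where the real code aborts.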

>
>>   hw/block/vhost-user-blk.c | 48 +++++++++++++++++++--------------------
>>   1 file changed, 24 insertions(+), 24 deletions(-)
>>
>> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
>> index 1af95ec6aae7..4e215f71f152 100644
>> --- a/hw/block/vhost-user-blk.c
>> +++ b/hw/block/vhost-user-blk.c
>> @@ -402,38 +402,38 @@ static void vhost_user_blk_event(void *opaque, QEMUChrEvent event,
>>           break;
>>       case CHR_EVENT_CLOSED:
>>           /*
>> -         * A close event may happen during a read/write, but vhost
>> -         * code assumes the vhost_dev remains setup, so delay the
>> -         * stop & clear. There are two possible paths to hit this
>> -         * disconnect event:
>> -         * 1. When VM is in the RUN_STATE_PRELAUNCH state. The
>> -         * vhost_user_blk_device_realize() is a caller.
>> -         * 2. In tha main loop phase after VM start.
>> -         *
>> -         * For p2 the disconnect event will be delayed. We can't
>> -         * do the same for p1, because we are not running the loop
>> -         * at this moment. So just skip this step and perform
>> -         * disconnect in the caller function.
>> -         *
>> -         * TODO: maybe it is a good idea to make the same fix
>> -         * for other vhost-user devices.
>> +         * Closing the connection should happen differently on device
>> +         * initialization and operation stages.
>> +         * On initalization, we want to re-start vhost_dev initialization
>> +         * from the very beginning right away when the connection is closed,
>> +         * so we clean up vhost_dev on each connection closing.
>> +         * On operation, we want to postpone vhost_dev cleanup to let the
>> +         * other code perform its own cleanup sequence using vhost_dev data
>> +         * (e.g. vhost_dev_set_log).
>>            */
>>           if (realized) {
>> +            /*
>> +             * A close event may happen during a read/write, but vhost
>> +             * code assumes the vhost_dev remains setup, so delay the
>> +             * stop & clear.
>> +             */
>>               AioContext *ctx = qemu_get_current_aio_context();
>>   
>>               qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, NULL, NULL,
>>                       NULL, NULL, false);
>>               aio_bh_schedule_oneshot(ctx, vhost_user_blk_chr_closed_bh, opaque);
>> -        }
>>   
>> -        /*
>> -         * Move vhost device to the stopped state. The vhost-user device
>> -         * will be clean up and disconnected in BH. This can be useful in
>> -         * the vhost migration code. If disconnect was caught there is an
>> -         * option for the general vhost code to get the dev state without
>> -         * knowing its type (in this case vhost-user).
>> -         */
>> -        s->dev.started = false;
>> +            /*
>> +             * Move vhost device to the stopped state. The vhost-user device
>> +             * will be clean up and disconnected in BH. This can be useful in
>> +             * the vhost migration code. If disconnect was caught there is an
>> +             * option for the general vhost code to get the dev state without
>> +             * knowing its type (in this case vhost-user).
>> +             */
>> +            s->dev.started = false;
>> +        } else {
>> +            vhost_user_blk_disconnect(dev);
>> +        }
>>           break;
>>       case CHR_EVENT_BREAK:
>>       case CHR_EVENT_MUX_IN:
>> -- 
>> 2.25.1
>>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 2/3] vhost-user-blk: perform immediate cleanup if disconnect on initialization
  2021-04-21 16:13     ` Denis Plotnikov
@ 2021-04-21 19:59       ` Michael S. Tsirkin
  2021-04-22  8:09         ` Denis Plotnikov
  2021-04-22  9:31         ` Kevin Wolf
  0 siblings, 2 replies; 12+ messages in thread
From: Michael S. Tsirkin @ 2021-04-21 19:59 UTC (permalink / raw)
  To: Denis Plotnikov
  Cc: Kevin Wolf, qemu-block, qemu-devel, mreitz, yc-core, raphael.norwitz

On Wed, Apr 21, 2021 at 07:13:24PM +0300, Denis Plotnikov wrote:
> 
> On 21.04.2021 18:24, Kevin Wolf wrote:
> > Am 25.03.2021 um 16:12 hat Denis Plotnikov geschrieben:
> > > Commit 4bcad76f4c39 ("vhost-user-blk: delay vhost_user_blk_disconnect")
> > > introduced postponing vhost_dev cleanup aiming to eliminate qemu aborts
> > > because of connection problems with vhost-blk daemon.
> > > 
> > > However, it introduces a new problem. Now, any communication errors
> > > during execution of vhost_dev_init() called by vhost_user_blk_device_realize()
> > > lead to qemu abort on assert in vhost_dev_get_config().
> > > 
> > > This happens because vhost_user_blk_disconnect() is postponed but
> > > it should have dropped s->connected flag by the time
> > > vhost_user_blk_device_realize() performs a new connection opening.
> > > On the connection opening, vhost_dev initialization in
> > > vhost_user_blk_connect() relies on the s->connected flag and
> > > if it's not dropped, it skips vhost_dev initialization and returns
> > > with success. Then, vhost_user_blk_device_realize()'s execution flow
> > > goes to vhost_dev_get_config() where it's aborted on the assert.
> > > 
> > > To fix the problem this patch adds immediate cleanup on device
> > > initialization (in vhost_user_blk_device_realize()) using different
> > > event handlers for initialization and operation introduced in the
> > > previous patch.
> > > On initialization (in vhost_user_blk_device_realize()) we fully
> > > control the initialization process. At that point, nobody can use the
> > > device since it isn't initialized and we don't need to postpone any
> > > cleanups, so we can do cleanup right away when there is a communication
> > > problem with the vhost-blk daemon.
> > > On operation we leave it as is, since the disconnect may happen when
> > > the device is in use, so the device users may want to use vhost_dev's data
> > > to do rollback before vhost_dev is re-initialized (e.g. in vhost_dev_set_log()).
> > > 
> > > Signed-off-by: Denis Plotnikov <den-plotnikov@yandex-team.ru>
> > > Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
> > I think there is something wrong with this patch.
> > 
> > I'm debugging an error case, specifically num-queues being larger in
> > QEMU than in the vhost-user-blk export. Before this patch, it has just
> > an unfriendly error message:
> > 
> > qemu-system-x86_64: -device vhost-user-blk-pci,chardev=vhost1,id=blk1,iommu_platform=off,disable-legacy=on,num-queues=4: Unexpected end-of-file before all data were read
> > qemu-system-x86_64: -device vhost-user-blk-pci,chardev=vhost1,id=blk1,iommu_platform=off,disable-legacy=on,num-queues=4: Failed to read msg header. Read 0 instead of 12. Original request 24.
> > qemu-system-x86_64: -device vhost-user-blk-pci,chardev=vhost1,id=blk1,iommu_platform=off,disable-legacy=on,num-queues=4: vhost-user-blk: get block config failed
> > qemu-system-x86_64: Failed to set msg fds.
> > qemu-system-x86_64: vhost VQ 0 ring restore failed: -1: Resource temporarily unavailable (11)
> > 
> > After the patch, it crashes:
> > 
> > #0  0x0000555555d0a4bd in vhost_user_read_cb (source=0x5555568f4690, condition=(G_IO_IN | G_IO_HUP), opaque=0x7fffffffcbf0) at ../hw/virtio/vhost-user.c:313
> > #1  0x0000555555d950d3 in qio_channel_fd_source_dispatch (source=0x555557c3f750, callback=0x555555d0a478 <vhost_user_read_cb>, user_data=0x7fffffffcbf0) at ../io/channel-watch.c:84
> > #2  0x00007ffff7b32a9f in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
> > #3  0x00007ffff7b84a98 in g_main_context_iterate.constprop () at /lib64/libglib-2.0.so.0
> > #4  0x00007ffff7b32163 in g_main_loop_run () at /lib64/libglib-2.0.so.0
> > #5  0x0000555555d0a724 in vhost_user_read (dev=0x555557bc62f8, msg=0x7fffffffcc50) at ../hw/virtio/vhost-user.c:402
> > #6  0x0000555555d0ee6b in vhost_user_get_config (dev=0x555557bc62f8, config=0x555557bc62ac "", config_len=60) at ../hw/virtio/vhost-user.c:2133
> > #7  0x0000555555d56d46 in vhost_dev_get_config (hdev=0x555557bc62f8, config=0x555557bc62ac "", config_len=60) at ../hw/virtio/vhost.c:1566
> > #8  0x0000555555cdd150 in vhost_user_blk_device_realize (dev=0x555557bc60b0, errp=0x7fffffffcf90) at ../hw/block/vhost-user-blk.c:510
> > #9  0x0000555555d08f6d in virtio_device_realize (dev=0x555557bc60b0, errp=0x7fffffffcff0) at ../hw/virtio/virtio.c:3660
> > 
> > The problem is that vhost_user_read_cb() still accesses dev->opaque even
> > though the device has been cleaned up meanwhile when the connection was
> > closed (the vhost_user_blk_disconnect() added by this patch), so it's
> > NULL now. This problem was actually mentioned in the comment that is
> > removed by this patch.
> > 
> > I tried to fix this by making vhost_user_read() cope with the fact that
> > the device might have been cleaned up meanwhile, but then I'm running
> > into the next set of problems.
> > 
> > The first is that retrying is pointless, the error condition is in the
> > configuration, it will never change.
> > 
> > The other is that after many repetitions of the same error message, I
> > got a crash where the device is cleaned up a second time in
> > vhost_dev_init() and the virtqueues are already NULL.
> > 
> > So it seems to me that erroring out during the initialisation phase
> > makes a lot more sense than retrying.
> > 
> > Kevin
> 
> But without the patch there will be another problem, which the patch
> actually addresses.
> 
> It seems to me that there are cases where retrying is useless, and yours
> is exactly such a case -- we'll never get a proper configuration.
> 
> What if we get rid of the re-connection and give the device only one try
> to realize? Then we don't need to handle initialization and operation of
> the device separately, correct? But I'm not familiar with the cases where
> the reconnect is needed. Do you know anything about them?

Reconnect is for when the server is restarted while we are talking to it.

> Denis
> 
> > 
> > >   hw/block/vhost-user-blk.c | 48 +++++++++++++++++++--------------------
> > >   1 file changed, 24 insertions(+), 24 deletions(-)
> > > 
> > > diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> > > index 1af95ec6aae7..4e215f71f152 100644
> > > --- a/hw/block/vhost-user-blk.c
> > > +++ b/hw/block/vhost-user-blk.c
> > > @@ -402,38 +402,38 @@ static void vhost_user_blk_event(void *opaque, QEMUChrEvent event,
> > >           break;
> > >       case CHR_EVENT_CLOSED:
> > >           /*
> > > -         * A close event may happen during a read/write, but vhost
> > > -         * code assumes the vhost_dev remains setup, so delay the
> > > -         * stop & clear. There are two possible paths to hit this
> > > -         * disconnect event:
> > > -         * 1. When VM is in the RUN_STATE_PRELAUNCH state. The
> > > -         * vhost_user_blk_device_realize() is a caller.
> > > -         * 2. In tha main loop phase after VM start.
> > > -         *
> > > -         * For p2 the disconnect event will be delayed. We can't
> > > -         * do the same for p1, because we are not running the loop
> > > -         * at this moment. So just skip this step and perform
> > > -         * disconnect in the caller function.
> > > -         *
> > > -         * TODO: maybe it is a good idea to make the same fix
> > > -         * for other vhost-user devices.
> > > +         * Closing the connection should happen differently on device
> > > +         * initialization and operation stages.
> > > +         * On initalization, we want to re-start vhost_dev initialization
> > > +         * from the very beginning right away when the connection is closed,
> > > +         * so we clean up vhost_dev on each connection closing.
> > > +         * On operation, we want to postpone vhost_dev cleanup to let the
> > > +         * other code perform its own cleanup sequence using vhost_dev data
> > > +         * (e.g. vhost_dev_set_log).
> > >            */
> > >           if (realized) {
> > > +            /*
> > > +             * A close event may happen during a read/write, but vhost
> > > +             * code assumes the vhost_dev remains setup, so delay the
> > > +             * stop & clear.
> > > +             */
> > >               AioContext *ctx = qemu_get_current_aio_context();
> > >               qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, NULL, NULL,
> > >                       NULL, NULL, false);
> > >               aio_bh_schedule_oneshot(ctx, vhost_user_blk_chr_closed_bh, opaque);
> > > -        }
> > > -        /*
> > > -         * Move vhost device to the stopped state. The vhost-user device
> > > -         * will be clean up and disconnected in BH. This can be useful in
> > > -         * the vhost migration code. If disconnect was caught there is an
> > > -         * option for the general vhost code to get the dev state without
> > > -         * knowing its type (in this case vhost-user).
> > > -         */
> > > -        s->dev.started = false;
> > > +            /*
> > > +             * Move vhost device to the stopped state. The vhost-user device
> > > +             * will be clean up and disconnected in BH. This can be useful in
> > > +             * the vhost migration code. If disconnect was caught there is an
> > > +             * option for the general vhost code to get the dev state without
> > > +             * knowing its type (in this case vhost-user).
> > > +             */
> > > +            s->dev.started = false;
> > > +        } else {
> > > +            vhost_user_blk_disconnect(dev);
> > > +        }
> > >           break;
> > >       case CHR_EVENT_BREAK:
> > >       case CHR_EVENT_MUX_IN:
> > > -- 
> > > 2.25.1
> > > 




* Re: [PATCH v3 2/3] vhost-user-blk: perform immediate cleanup if disconnect on initialization
  2021-04-21 19:59       ` Michael S. Tsirkin
@ 2021-04-22  8:09         ` Denis Plotnikov
  2021-04-22  9:31         ` Kevin Wolf
  1 sibling, 0 replies; 12+ messages in thread
From: Denis Plotnikov @ 2021-04-22  8:09 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Kevin Wolf, qemu-block, qemu-devel, mreitz, yc-core, raphael.norwitz


On 21.04.2021 22:59, Michael S. Tsirkin wrote:
> On Wed, Apr 21, 2021 at 07:13:24PM +0300, Denis Plotnikov wrote:
>> On 21.04.2021 18:24, Kevin Wolf wrote:
>>> Am 25.03.2021 um 16:12 hat Denis Plotnikov geschrieben:
>>>> Commit 4bcad76f4c39 ("vhost-user-blk: delay vhost_user_blk_disconnect")
>>>> introduced postponing vhost_dev cleanup aiming to eliminate qemu aborts
>>>> because of connection problems with vhost-blk daemon.
>>>>
>>>> However, it introduces a new problem. Now, any communication errors
>>>> during execution of vhost_dev_init() called by vhost_user_blk_device_realize()
>>>> lead to qemu abort on assert in vhost_dev_get_config().
>>>>
>>>> This happens because vhost_user_blk_disconnect() is postponed but
>>>> it should have dropped s->connected flag by the time
>>>> vhost_user_blk_device_realize() performs a new connection opening.
>>>> On the connection opening, vhost_dev initialization in
>>>> vhost_user_blk_connect() relies on the s->connected flag and
>>>> if it's not dropped, it skips vhost_dev initialization and returns
>>>> with success. Then, vhost_user_blk_device_realize()'s execution flow
>>>> goes to vhost_dev_get_config() where it's aborted on the assert.
>>>>
>>>> To fix the problem this patch adds immediate cleanup on device
>>>> initialization (in vhost_user_blk_device_realize()) using different
>>>> event handlers for initialization and operation introduced in the
>>>> previous patch.
>>>> On initialization (in vhost_user_blk_device_realize()) we fully
>>>> control the initialization process. At that point, nobody can use the
>>>> device since it isn't initialized and we don't need to postpone any
>>>> cleanups, so we can do cleanup right away when there is a communication
>>>> problem with the vhost-blk daemon.
>>>> On operation we leave it as is, since the disconnect may happen when
>>>> the device is in use, so the device users may want to use vhost_dev's data
>>>> to do rollback before vhost_dev is re-initialized (e.g. in vhost_dev_set_log()).
>>>>
>>>> Signed-off-by: Denis Plotnikov <den-plotnikov@yandex-team.ru>
>>>> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
>>> I think there is something wrong with this patch.
>>>
>>> I'm debugging an error case, specifically num-queues being larger in
>>> QEMU than in the vhost-user-blk export. Before this patch, it has just
>>> an unfriendly error message:
>>>
>>> qemu-system-x86_64: -device vhost-user-blk-pci,chardev=vhost1,id=blk1,iommu_platform=off,disable-legacy=on,num-queues=4: Unexpected end-of-file before all data were read
>>> qemu-system-x86_64: -device vhost-user-blk-pci,chardev=vhost1,id=blk1,iommu_platform=off,disable-legacy=on,num-queues=4: Failed to read msg header. Read 0 instead of 12. Original request 24.
>>> qemu-system-x86_64: -device vhost-user-blk-pci,chardev=vhost1,id=blk1,iommu_platform=off,disable-legacy=on,num-queues=4: vhost-user-blk: get block config failed
>>> qemu-system-x86_64: Failed to set msg fds.
>>> qemu-system-x86_64: vhost VQ 0 ring restore failed: -1: Resource temporarily unavailable (11)
>>>
>>> After the patch, it crashes:
>>>
>>> #0  0x0000555555d0a4bd in vhost_user_read_cb (source=0x5555568f4690, condition=(G_IO_IN | G_IO_HUP), opaque=0x7fffffffcbf0) at ../hw/virtio/vhost-user.c:313
>>> #1  0x0000555555d950d3 in qio_channel_fd_source_dispatch (source=0x555557c3f750, callback=0x555555d0a478 <vhost_user_read_cb>, user_data=0x7fffffffcbf0) at ../io/channel-watch.c:84
>>> #2  0x00007ffff7b32a9f in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
>>> #3  0x00007ffff7b84a98 in g_main_context_iterate.constprop () at /lib64/libglib-2.0.so.0
>>> #4  0x00007ffff7b32163 in g_main_loop_run () at /lib64/libglib-2.0.so.0
>>> #5  0x0000555555d0a724 in vhost_user_read (dev=0x555557bc62f8, msg=0x7fffffffcc50) at ../hw/virtio/vhost-user.c:402
>>> #6  0x0000555555d0ee6b in vhost_user_get_config (dev=0x555557bc62f8, config=0x555557bc62ac "", config_len=60) at ../hw/virtio/vhost-user.c:2133
>>> #7  0x0000555555d56d46 in vhost_dev_get_config (hdev=0x555557bc62f8, config=0x555557bc62ac "", config_len=60) at ../hw/virtio/vhost.c:1566
>>> #8  0x0000555555cdd150 in vhost_user_blk_device_realize (dev=0x555557bc60b0, errp=0x7fffffffcf90) at ../hw/block/vhost-user-blk.c:510
>>> #9  0x0000555555d08f6d in virtio_device_realize (dev=0x555557bc60b0, errp=0x7fffffffcff0) at ../hw/virtio/virtio.c:3660
>>>
>>> The problem is that vhost_user_read_cb() still accesses dev->opaque even
>>> though the device has been cleaned up meanwhile when the connection was
>>> closed (the vhost_user_blk_disconnect() added by this patch), so it's
>>> NULL now. This problem was actually mentioned in the comment that is
>>> removed by this patch.
>>>
>>> I tried to fix this by making vhost_user_read() cope with the fact that
>>> the device might have been cleaned up meanwhile, but then I'm running
>>> into the next set of problems.
>>>
>>> The first is that retrying is pointless, the error condition is in the
>>> configuration, it will never change.
>>>
>>> The other is that after many repetitions of the same error message, I
>>> got a crash where the device is cleaned up a second time in
>>> vhost_dev_init() and the virtqueues are already NULL.
>>>
>>> So it seems to me that erroring out during the initialisation phase
>>> makes a lot more sense than retrying.
>>>
>>> Kevin
>> But without the patch there will be another problem, which the patch
>> actually addresses.
>>
>> It seems to me that there are cases where retrying is useless, and yours
>> is exactly such a case -- we'll never get a proper configuration.
>>
>> What if we get rid of the re-connection and give the device only one try
>> to realize? Then we don't need to handle initialization and operation of
>> the device separately, correct? But I'm not familiar with the cases where
>> the reconnect is needed. Do you know anything about them?
> Reconnect is for when the server is restarted while we are talking to it.

Can we eliminate reconnect on vhost-user-blk device initialization?

In addition to Kevin's case, I saw an inconvenient behavior where the qmp 
add_device command hangs until it manages to connect to the server properly.

Denis
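
[Editorial note: one way to avoid such a hang is to bound the connection attempts made during realize instead of retrying forever. The sketch below only illustrates that idea; the try_connect_fn callback type and max_attempts knob are invented for the example and are not QEMU API.]

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative sketch, not QEMU code: bound the connection attempts made
 * while realizing the device so a management command cannot block forever. */
typedef bool (*try_connect_fn)(void *opaque);

static int realize_with_bounded_retry(try_connect_fn try_connect,
                                      void *opaque, int max_attempts)
{
    for (int i = 0; i < max_attempts; i++) {
        if (try_connect(opaque)) {
            return 0;    /* connected; continue with vhost_dev setup */
        }
        /* a real implementation would sleep/backoff between attempts */
    }
    return -1;           /* give up and fail realize with a proper error */
}

/* toy connector for testing: succeeds once its countdown reaches zero */
static bool flaky_connect(void *opaque)
{
    int *countdown = opaque;
    return --(*countdown) <= 0;
}
```

A server that comes up within the budget still connects; one that never answers makes realize fail cleanly instead of hanging the monitor.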

>
>> Denis
>>
>>>>    hw/block/vhost-user-blk.c | 48 +++++++++++++++++++--------------------
>>>>    1 file changed, 24 insertions(+), 24 deletions(-)
>>>>
>>>> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
>>>> index 1af95ec6aae7..4e215f71f152 100644
>>>> --- a/hw/block/vhost-user-blk.c
>>>> +++ b/hw/block/vhost-user-blk.c
>>>> @@ -402,38 +402,38 @@ static void vhost_user_blk_event(void *opaque, QEMUChrEvent event,
>>>>            break;
>>>>        case CHR_EVENT_CLOSED:
>>>>            /*
>>>> -         * A close event may happen during a read/write, but vhost
>>>> -         * code assumes the vhost_dev remains setup, so delay the
>>>> -         * stop & clear. There are two possible paths to hit this
>>>> -         * disconnect event:
>>>> -         * 1. When VM is in the RUN_STATE_PRELAUNCH state. The
>>>> -         * vhost_user_blk_device_realize() is a caller.
>>>> -         * 2. In tha main loop phase after VM start.
>>>> -         *
>>>> -         * For p2 the disconnect event will be delayed. We can't
>>>> -         * do the same for p1, because we are not running the loop
>>>> -         * at this moment. So just skip this step and perform
>>>> -         * disconnect in the caller function.
>>>> -         *
>>>> -         * TODO: maybe it is a good idea to make the same fix
>>>> -         * for other vhost-user devices.
>>>> +         * Closing the connection should happen differently on device
>>>> +         * initialization and operation stages.
>>>> +         * On initalization, we want to re-start vhost_dev initialization
>>>> +         * from the very beginning right away when the connection is closed,
>>>> +         * so we clean up vhost_dev on each connection closing.
>>>> +         * On operation, we want to postpone vhost_dev cleanup to let the
>>>> +         * other code perform its own cleanup sequence using vhost_dev data
>>>> +         * (e.g. vhost_dev_set_log).
>>>>             */
>>>>            if (realized) {
>>>> +            /*
>>>> +             * A close event may happen during a read/write, but vhost
>>>> +             * code assumes the vhost_dev remains setup, so delay the
>>>> +             * stop & clear.
>>>> +             */
>>>>                AioContext *ctx = qemu_get_current_aio_context();
>>>>                qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, NULL, NULL,
>>>>                        NULL, NULL, false);
>>>>                aio_bh_schedule_oneshot(ctx, vhost_user_blk_chr_closed_bh, opaque);
>>>> -        }
>>>> -        /*
>>>> -         * Move vhost device to the stopped state. The vhost-user device
>>>> -         * will be clean up and disconnected in BH. This can be useful in
>>>> -         * the vhost migration code. If disconnect was caught there is an
>>>> -         * option for the general vhost code to get the dev state without
>>>> -         * knowing its type (in this case vhost-user).
>>>> -         */
>>>> -        s->dev.started = false;
>>>> +            /*
>>>> +             * Move vhost device to the stopped state. The vhost-user device
>>>> +             * will be clean up and disconnected in BH. This can be useful in
>>>> +             * the vhost migration code. If disconnect was caught there is an
>>>> +             * option for the general vhost code to get the dev state without
>>>> +             * knowing its type (in this case vhost-user).
>>>> +             */
>>>> +            s->dev.started = false;
>>>> +        } else {
>>>> +            vhost_user_blk_disconnect(dev);
>>>> +        }
>>>>            break;
>>>>        case CHR_EVENT_BREAK:
>>>>        case CHR_EVENT_MUX_IN:
>>>> -- 
>>>> 2.25.1
>>>>



* Re: [PATCH v3 2/3] vhost-user-blk: perform immediate cleanup if disconnect on initialization
  2021-04-21 19:59       ` Michael S. Tsirkin
  2021-04-22  8:09         ` Denis Plotnikov
@ 2021-04-22  9:31         ` Kevin Wolf
  1 sibling, 0 replies; 12+ messages in thread
From: Kevin Wolf @ 2021-04-22  9:31 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: qemu-block, qemu-devel, mreitz, Denis Plotnikov, yc-core,
	raphael.norwitz

Am 21.04.2021 um 21:59 hat Michael S. Tsirkin geschrieben:
> On Wed, Apr 21, 2021 at 07:13:24PM +0300, Denis Plotnikov wrote:
> > 
> > On 21.04.2021 18:24, Kevin Wolf wrote:
> > > Am 25.03.2021 um 16:12 hat Denis Plotnikov geschrieben:
> > > > Commit 4bcad76f4c39 ("vhost-user-blk: delay vhost_user_blk_disconnect")
> > > > introduced postponing vhost_dev cleanup aiming to eliminate qemu aborts
> > > > because of connection problems with vhost-blk daemon.
> > > > 
> > > > However, it introduces a new problem. Now, any communication errors
> > > > during execution of vhost_dev_init() called by vhost_user_blk_device_realize()
> > > > lead to qemu abort on assert in vhost_dev_get_config().
> > > > 
> > > > This happens because vhost_user_blk_disconnect() is postponed but
> > > > it should have dropped s->connected flag by the time
> > > > vhost_user_blk_device_realize() performs a new connection opening.
> > > > On the connection opening, vhost_dev initialization in
> > > > vhost_user_blk_connect() relies on the s->connected flag and
> > > > if it's not dropped, it skips vhost_dev initialization and returns
> > > > with success. Then, vhost_user_blk_device_realize()'s execution flow
> > > > goes to vhost_dev_get_config() where it's aborted on the assert.
> > > > 
> > > > To fix the problem this patch adds immediate cleanup on device
> > > > initialization (in vhost_user_blk_device_realize()) using different
> > > > event handlers for initialization and operation introduced in the
> > > > previous patch.
> > > > On initialization (in vhost_user_blk_device_realize()) we fully
> > > > control the initialization process. At that point, nobody can use the
> > > > device since it isn't initialized and we don't need to postpone any
> > > > cleanups, so we can do cleanup right away when there is a communication
> > > > problem with the vhost-blk daemon.
> > > > On operation we leave it as is, since the disconnect may happen when
> > > > the device is in use, so the device users may want to use vhost_dev's data
> > > > to do rollback before vhost_dev is re-initialized (e.g. in vhost_dev_set_log()).
> > > > 
> > > > Signed-off-by: Denis Plotnikov <den-plotnikov@yandex-team.ru>
> > > > Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
> > > I think there is something wrong with this patch.
> > > 
> > > I'm debugging an error case, specifically num-queues being larger in
> > > QEMU than in the vhost-user-blk export. Before this patch, it has just
> > > an unfriendly error message:
> > > 
> > > qemu-system-x86_64: -device vhost-user-blk-pci,chardev=vhost1,id=blk1,iommu_platform=off,disable-legacy=on,num-queues=4: Unexpected end-of-file before all data were read
> > > qemu-system-x86_64: -device vhost-user-blk-pci,chardev=vhost1,id=blk1,iommu_platform=off,disable-legacy=on,num-queues=4: Failed to read msg header. Read 0 instead of 12. Original request 24.
> > > qemu-system-x86_64: -device vhost-user-blk-pci,chardev=vhost1,id=blk1,iommu_platform=off,disable-legacy=on,num-queues=4: vhost-user-blk: get block config failed
> > > qemu-system-x86_64: Failed to set msg fds.
> > > qemu-system-x86_64: vhost VQ 0 ring restore failed: -1: Resource temporarily unavailable (11)
> > > 
> > > After the patch, it crashes:
> > > 
> > > #0  0x0000555555d0a4bd in vhost_user_read_cb (source=0x5555568f4690, condition=(G_IO_IN | G_IO_HUP), opaque=0x7fffffffcbf0) at ../hw/virtio/vhost-user.c:313
> > > #1  0x0000555555d950d3 in qio_channel_fd_source_dispatch (source=0x555557c3f750, callback=0x555555d0a478 <vhost_user_read_cb>, user_data=0x7fffffffcbf0) at ../io/channel-watch.c:84
> > > #2  0x00007ffff7b32a9f in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
> > > #3  0x00007ffff7b84a98 in g_main_context_iterate.constprop () at /lib64/libglib-2.0.so.0
> > > #4  0x00007ffff7b32163 in g_main_loop_run () at /lib64/libglib-2.0.so.0
> > > #5  0x0000555555d0a724 in vhost_user_read (dev=0x555557bc62f8, msg=0x7fffffffcc50) at ../hw/virtio/vhost-user.c:402
> > > #6  0x0000555555d0ee6b in vhost_user_get_config (dev=0x555557bc62f8, config=0x555557bc62ac "", config_len=60) at ../hw/virtio/vhost-user.c:2133
> > > #7  0x0000555555d56d46 in vhost_dev_get_config (hdev=0x555557bc62f8, config=0x555557bc62ac "", config_len=60) at ../hw/virtio/vhost.c:1566
> > > #8  0x0000555555cdd150 in vhost_user_blk_device_realize (dev=0x555557bc60b0, errp=0x7fffffffcf90) at ../hw/block/vhost-user-blk.c:510
> > > #9  0x0000555555d08f6d in virtio_device_realize (dev=0x555557bc60b0, errp=0x7fffffffcff0) at ../hw/virtio/virtio.c:3660
> > > 
> > > The problem is that vhost_user_read_cb() still accesses dev->opaque even
> > > though the device has been cleaned up meanwhile when the connection was
> > > closed (the vhost_user_blk_disconnect() added by this patch), so it's
> > > NULL now. This problem was actually mentioned in the comment that is
> > > removed by this patch.
> > > 
> > > I tried to fix this by making vhost_user_read() cope with the fact that
> > > the device might have been cleaned up meanwhile, but then I'm running
> > > into the next set of problems.
> > > 
> > > The first is that retrying is pointless, the error condition is in the
> > > configuration, it will never change.
> > > 
> > > The other is that after many repetitions of the same error message, I
> > > got a crash where the device is cleaned up a second time in
> > > vhost_dev_init() and the virtqueues are already NULL.
> > > 
> > > So it seems to me that erroring out during the initialisation phase
> > > makes a lot more sense than retrying.
> > > 
> > > Kevin
> > 
> > But without the patch there will be another problem, which the patch
> > actually addresses.
> > 
> > It seems to me that there is a case where retrying is useless, and it
> > is exactly your case -- we'll never get a proper configuration.
> > 
> > What if we get rid of the reconnection and make only a single attempt
> > to realize the device? Then we wouldn't need to separate the cases of
> > device initialization and operation, correct? But I'm not familiar
> > with the cases where reconnect is needed. Do you know of any?
> 
> Reconnect is for when server is restarted while we are talking to it.

Sure, and that makes sense once the device has actually been realized.
But do we really have a use case for reconnecting during initialisation?
Commit 77542d43149 seems to just have added both at once, without
explaining in the commit message what the reason for this decision is.

The problem here is that we can't seem to tell what is a fatal error and
what is just a server restart. Fatal errors just seem much more likely
during device initialisation and turning them into endless loops in
vhost_user_blk_device_realize() is probably not a great idea.

Either way, we have a use-after-free in the error case and that needs
fixing once we have decided on the intended behaviour.

(The real fix for my num-queues error is of course checking the number
explicitly instead of making invalid requests, and this would hide the
bug we're discussing here, but I assume that more error cases like this
exist.)

Unrelated to the bug, I also wonder why the reconnection logic was
implemented specifically in vhost-user-blk. Isn't this something that
should apply to all vhost-user devices? Possibly with some callbacks for
device specific logic?

Kevin




end of thread, other threads:[~2021-04-22  9:34 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-25 15:12 [PATCH v3 0/3] vhost-user-blk: fix bug on device disconnection during initialization Denis Plotnikov
2021-03-25 15:12 ` [PATCH v3 1/3] vhost-user-blk: use different event handlers on initialization Denis Plotnikov
2021-03-25 15:12 ` [PATCH v3 2/3] vhost-user-blk: perform immediate cleanup if disconnect " Denis Plotnikov
2021-04-21 15:24   ` Kevin Wolf
2021-04-21 16:13     ` Denis Plotnikov
2021-04-21 19:59       ` Michael S. Tsirkin
2021-04-22  8:09         ` Denis Plotnikov
2021-04-22  9:31         ` Kevin Wolf
2021-03-25 15:12 ` [PATCH v3 3/3] vhost-user-blk: add immediate cleanup on shutdown Denis Plotnikov
2021-03-29 13:44 ` [PATCH v3 0/3] vhost-user-blk: fix bug on device disconnection during initialization Denis Plotnikov
2021-04-01  9:21   ` [BUG FIX][PATCH " Denis Plotnikov
2021-04-01 19:01     ` Valentin Sinitsyn
