* [RFC PATCH 0/7] block-backend: Introduce I/O hang
@ 2020-09-27 13:04 Ying Fang
  2020-09-27 13:04 ` [RFC PATCH 1/7] block-backend: introduce I/O rehandle info Ying Fang
                   ` (9 more replies)
  0 siblings, 10 replies; 13+ messages in thread
From: Ying Fang @ 2020-09-27 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Ying Fang, zhang.zhanghailiang, mreitz

A VM in a cloud environment may use a virtual disk as its backend storage,
and there are usually filesystems on the virtual block device. When the backend
storage is temporarily down, any I/O issued to the virtual block device will
cause an error. For example, an error in an ext4 filesystem makes the
filesystem read-only. However, cloud backend storage can often be recovered
quickly: an IP-SAN may go down due to a network failure and come back online
soon after the network recovers. The error in the filesystem, though, may not
be recoverable without a device reattach or a system restart. So an I/O
rehandle mechanism is needed to implement self-healing.

This patch series proposes a feature called I/O hang. It can rehandle AIOs
that fail with EIO without returning the error to the guest. From the guest's
point of view it just looks as if the I/O is hanging and has not returned.
With this feature enabled, the guest gets back to running smoothly once the
I/O recovers.
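
With this series applied, the feature would be enabled per block device via
the option added in patch 6, roughly like this (the image path, drive id and
the 30-second timeout are made-up example values):

  -drive file=test.img,format=raw,if=none,id=drive0,iohang-timeout=30
  -device virtio-blk-pci,drive=drive0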


Ying Fang (7):
  block-backend: introduce I/O rehandle info
  block-backend: rehandle block aios when EIO
  block-backend: add I/O hang timeout
  block-backend: add I/O hang drain when disbale
  virtio-blk: disable I/O hang when resetting
  qemu-option: add I/O hang timeout option
  qapi: add I/O hang and I/O hang timeout qapi event

 block/block-backend.c          | 285 +++++++++++++++++++++++++++++++++
 blockdev.c                     |  11 ++
 hw/block/virtio-blk.c          |   8 +
 include/sysemu/block-backend.h |   5 +
 qapi/block-core.json           |  26 +++
 5 files changed, 335 insertions(+)

-- 
2.23.0




* [RFC PATCH 1/7] block-backend: introduce I/O rehandle info
  2020-09-27 13:04 [RFC PATCH 0/7] block-backend: Introduce I/O hang Ying Fang
@ 2020-09-27 13:04 ` Ying Fang
  2020-09-27 13:04 ` [RFC PATCH 2/7] block-backend: rehandle block aios when EIO Ying Fang
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Ying Fang @ 2020-09-27 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Ying Fang, Jiahui Cen, zhang.zhanghailiang, mreitz

The I/O hang feature is implemented on top of a rehandle mechanism.
Each block backend has a list to store hanging block AIOs,
and a timer to regularly resend these AIOs. In order to issue
the AIOs again, each block AIO also needs to store its coroutine entry.

Signed-off-by: Jiahui Cen <cenjiahui@huawei.com>
Signed-off-by: Ying Fang <fangying1@huawei.com>
---
 block/block-backend.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/block/block-backend.c b/block/block-backend.c
index 24dd0670d1..bf104a7cf5 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -35,6 +35,18 @@
 
 static AioContext *blk_aiocb_get_aio_context(BlockAIOCB *acb);
 
+/* block backend rehandle timer interval 5s */
+#define BLOCK_BACKEND_REHANDLE_TIMER_INTERVAL   5000
+
+typedef struct BlockBackendRehandleInfo {
+    bool enable;
+    QEMUTimer *ts;
+    unsigned timer_interval_ms;
+
+    unsigned int in_flight;
+    QTAILQ_HEAD(, BlkAioEmAIOCB) re_aios;
+} BlockBackendRehandleInfo;
+
 typedef struct BlockBackendAioNotifier {
     void (*attached_aio_context)(AioContext *new_context, void *opaque);
     void (*detach_aio_context)(void *opaque);
@@ -95,6 +107,8 @@ struct BlockBackend {
      * Accessed with atomic ops.
      */
     unsigned int in_flight;
+
+    BlockBackendRehandleInfo reinfo;
 };
 
 typedef struct BlockBackendAIOCB {
@@ -350,6 +364,7 @@ BlockBackend *blk_new(AioContext *ctx, uint64_t perm, uint64_t shared_perm)
     qemu_co_queue_init(&blk->queued_requests);
     notifier_list_init(&blk->remove_bs_notifiers);
     notifier_list_init(&blk->insert_bs_notifiers);
+
     QLIST_INIT(&blk->aio_notifiers);
 
     QTAILQ_INSERT_TAIL(&block_backends, blk, link);
@@ -1392,6 +1407,10 @@ typedef struct BlkAioEmAIOCB {
     BlkRwCo rwco;
     int bytes;
     bool has_returned;
+
+    /* for rehandle */
+    CoroutineEntry *co_entry;
+    QTAILQ_ENTRY(BlkAioEmAIOCB) list;
 } BlkAioEmAIOCB;
 
 static AioContext *blk_aio_em_aiocb_get_aio_context(BlockAIOCB *acb_)
-- 
2.23.0




* [RFC PATCH 2/7] block-backend: rehandle block aios when EIO
  2020-09-27 13:04 [RFC PATCH 0/7] block-backend: Introduce I/O hang Ying Fang
  2020-09-27 13:04 ` [RFC PATCH 1/7] block-backend: introduce I/O rehandle info Ying Fang
@ 2020-09-27 13:04 ` Ying Fang
  2020-09-27 13:04 ` [RFC PATCH 3/7] block-backend: add I/O hang timeout Ying Fang
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Ying Fang @ 2020-09-27 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Ying Fang, Jiahui Cen, zhang.zhanghailiang, mreitz

When a backend device temporarily does not respond, e.g. a network disk down
due to some network fault, any I/O to the corresponding virtual block device
in the VM returns an I/O error. If the hypervisor returns the error to the VM,
the filesystem on this block device may stop working as usual. And in many
situations, the returned error is an EIO.

To avoid this unavailability, we can store the failed AIOs and resend them
later. If the error is temporary, the retries can succeed and the AIOs can
be successfully completed.

Signed-off-by: Ying Fang <fangying1@huawei.com>
Signed-off-by: Jiahui Cen <cenjiahui@huawei.com>
---
 block/block-backend.c | 89 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 89 insertions(+)

diff --git a/block/block-backend.c b/block/block-backend.c
index bf104a7cf5..90f1ca5753 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -365,6 +365,12 @@ BlockBackend *blk_new(AioContext *ctx, uint64_t perm, uint64_t shared_perm)
     notifier_list_init(&blk->remove_bs_notifiers);
     notifier_list_init(&blk->insert_bs_notifiers);
 
+    /* for rehandle */
+    blk->reinfo.enable = false;
+    blk->reinfo.ts = NULL;
+    atomic_set(&blk->reinfo.in_flight, 0);
+    QTAILQ_INIT(&blk->reinfo.re_aios);
+
     QLIST_INIT(&blk->aio_notifiers);
 
     QTAILQ_INSERT_TAIL(&block_backends, blk, link);
@@ -1425,8 +1431,16 @@ static const AIOCBInfo blk_aio_em_aiocb_info = {
     .get_aio_context    = blk_aio_em_aiocb_get_aio_context,
 };
 
+static void blk_rehandle_timer_cb(void *opaque);
+static void blk_rehandle_aio_complete(BlkAioEmAIOCB *acb);
+
 static void blk_aio_complete(BlkAioEmAIOCB *acb)
 {
+    if (acb->rwco.blk->reinfo.enable) {
+        blk_rehandle_aio_complete(acb);
+        return;
+    }
+
     if (acb->has_returned) {
         acb->common.cb(acb->common.opaque, acb->rwco.ret);
         blk_dec_in_flight(acb->rwco.blk);
@@ -1459,6 +1473,7 @@ static BlockAIOCB *blk_aio_prwv(BlockBackend *blk, int64_t offset, int bytes,
         .ret    = NOT_DONE,
     };
     acb->bytes = bytes;
+    acb->co_entry = co_entry;
     acb->has_returned = false;
 
     co = qemu_coroutine_create(co_entry, acb);
@@ -2054,6 +2069,20 @@ static int blk_do_set_aio_context(BlockBackend *blk, AioContext *new_context,
             throttle_group_attach_aio_context(tgm, new_context);
             bdrv_drained_end(bs);
         }
+
+        if (blk->reinfo.enable) {
+            if (blk->reinfo.ts) {
+                timer_del(blk->reinfo.ts);
+                timer_free(blk->reinfo.ts);
+            }
+            blk->reinfo.ts = aio_timer_new(new_context, QEMU_CLOCK_REALTIME,
+                                           SCALE_MS, blk_rehandle_timer_cb,
+                                           blk);
+            if (atomic_read(&blk->reinfo.in_flight)) {
+                timer_mod(blk->reinfo.ts,
+                          qemu_clock_get_ms(QEMU_CLOCK_REALTIME));
+            }
+        }
     }
 
     blk->ctx = new_context;
@@ -2405,6 +2434,66 @@ static void blk_root_drained_end(BdrvChild *child, int *drained_end_counter)
     }
 }
 
+static void blk_rehandle_insert_aiocb(BlockBackend *blk, BlkAioEmAIOCB *acb)
+{
+    assert(blk->reinfo.enable);
+
+    atomic_inc(&blk->reinfo.in_flight);
+    QTAILQ_INSERT_TAIL(&blk->reinfo.re_aios, acb, list);
+    timer_mod(blk->reinfo.ts, qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
+                              blk->reinfo.timer_interval_ms);
+}
+
+static void blk_rehandle_remove_aiocb(BlockBackend *blk, BlkAioEmAIOCB *acb)
+{
+    QTAILQ_REMOVE(&blk->reinfo.re_aios, acb, list);
+    atomic_dec(&blk->reinfo.in_flight);
+}
+
+static void blk_rehandle_timer_cb(void *opaque)
+{
+    BlockBackend *blk = opaque;
+    BlockBackendRehandleInfo *reinfo = &blk->reinfo;
+    BlkAioEmAIOCB *acb, *tmp;
+    Coroutine *co;
+
+    aio_context_acquire(blk_get_aio_context(blk));
+    QTAILQ_FOREACH_SAFE(acb, &reinfo->re_aios, list, tmp) {
+        if (acb->rwco.ret == NOT_DONE) {
+            continue;
+        }
+
+        blk_inc_in_flight(acb->rwco.blk);
+        acb->rwco.ret = NOT_DONE;
+        acb->has_returned = false;
+
+        co = qemu_coroutine_create(acb->co_entry, acb);
+        bdrv_coroutine_enter(blk_bs(blk), co);
+
+        acb->has_returned = true;
+        if (acb->rwco.ret != NOT_DONE) {
+            blk_rehandle_remove_aiocb(acb->rwco.blk, acb);
+            replay_bh_schedule_oneshot_event(blk_get_aio_context(blk),
+                                             blk_aio_complete_bh, acb);
+        }
+    }
+    aio_context_release(blk_get_aio_context(blk));
+}
+
+static void blk_rehandle_aio_complete(BlkAioEmAIOCB *acb)
+{
+    if (acb->has_returned) {
+        blk_dec_in_flight(acb->rwco.blk);
+        if (acb->rwco.ret == -EIO) {
+            blk_rehandle_insert_aiocb(acb->rwco.blk, acb);
+            return;
+        }
+
+        acb->common.cb(acb->common.opaque, acb->rwco.ret);
+        qemu_aio_unref(acb);
+    }
+}
+
 void blk_register_buf(BlockBackend *blk, void *host, size_t size)
 {
     bdrv_register_buf(blk_bs(blk), host, size);
-- 
2.23.0




* [RFC PATCH 3/7] block-backend: add I/O hang timeout
  2020-09-27 13:04 [RFC PATCH 0/7] block-backend: Introduce I/O hang Ying Fang
  2020-09-27 13:04 ` [RFC PATCH 1/7] block-backend: introduce I/O rehandle info Ying Fang
  2020-09-27 13:04 ` [RFC PATCH 2/7] block-backend: rehandle block aios when EIO Ying Fang
@ 2020-09-27 13:04 ` Ying Fang
  2020-09-27 13:04 ` [RFC PATCH 4/7] block-backend: add I/O hang drain when disbale Ying Fang
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Ying Fang @ 2020-09-27 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Ying Fang, Jiahui Cen, zhang.zhanghailiang, mreitz

Not all errors can be recovered from, so it is better to add a rehandle
timeout to I/O hang. For example, with a 30-second hang timeout and the 5s
rehandle interval, a persistently failing AIO is retried about every 5
seconds; once 30 seconds have passed since the first failure, the error is
returned to the guest instead of being rehandled again.

Signed-off-by: Jiahui Cen <cenjiahui@huawei.com>
Signed-off-by: Ying Fang <fangying1@huawei.com>
---
 block/block-backend.c          | 99 +++++++++++++++++++++++++++++++++-
 include/sysemu/block-backend.h |  2 +
 2 files changed, 100 insertions(+), 1 deletion(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 90f1ca5753..d0b2b59f55 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -38,6 +38,11 @@ static AioContext *blk_aiocb_get_aio_context(BlockAIOCB *acb);
 /* block backend rehandle timer interval 5s */
 #define BLOCK_BACKEND_REHANDLE_TIMER_INTERVAL   5000
 
+enum BlockIOHangStatus {
+    BLOCK_IO_HANG_STATUS_NORMAL = 0,
+    BLOCK_IO_HANG_STATUS_HANG,
+};
+
 typedef struct BlockBackendRehandleInfo {
     bool enable;
     QEMUTimer *ts;
@@ -109,6 +114,11 @@ struct BlockBackend {
     unsigned int in_flight;
 
     BlockBackendRehandleInfo reinfo;
+
+    int64_t iohang_timeout; /* The I/O hang timeout value in sec. */
+    int64_t iohang_time;    /* The I/O hang start time */
+    bool is_iohang_timeout;
+    int iohang_status;
 };
 
 typedef struct BlockBackendAIOCB {
@@ -2480,20 +2490,107 @@ static void blk_rehandle_timer_cb(void *opaque)
     aio_context_release(blk_get_aio_context(blk));
 }
 
+static bool blk_iohang_handle(BlockBackend *blk, int new_status)
+{
+    int64_t now;
+    int old_status = blk->iohang_status;
+    bool need_rehandle = false;
+
+    switch (new_status) {
+    case BLOCK_IO_HANG_STATUS_NORMAL:
+        if (old_status == BLOCK_IO_HANG_STATUS_HANG) {
+            /* Case when I/O Hang is recovered */
+            blk->is_iohang_timeout = false;
+            blk->iohang_time = 0;
+        }
+        break;
+    case BLOCK_IO_HANG_STATUS_HANG:
+        if (old_status != BLOCK_IO_HANG_STATUS_HANG) {
+            /* Case when I/O hang is first triggered */
+            blk->iohang_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) / 1000;
+            need_rehandle = true;
+        } else {
+            if (!blk->is_iohang_timeout) {
+                now = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) / 1000;
+                if (now >= (blk->iohang_time + blk->iohang_timeout)) {
+                    /* Case when I/O hang is timeout */
+                    blk->is_iohang_timeout = true;
+                } else {
+                    /* Case when I/O hang is continued */
+                    need_rehandle = true;
+                }
+            }
+        }
+        break;
+    default:
+        break;
+    }
+
+    blk->iohang_status = new_status;
+    return need_rehandle;
+}
+
+static bool blk_rehandle_aio(BlkAioEmAIOCB *acb, bool *has_timeout)
+{
+    bool need_rehandle = false;
+
+    /* Rehandle aio which returns EIO before hang timeout */
+    if (acb->rwco.ret == -EIO) {
+        if (acb->rwco.blk->is_iohang_timeout) {
+            /* I/O hang has timeout and not recovered */
+            *has_timeout = true;
+        } else {
+            need_rehandle = blk_iohang_handle(acb->rwco.blk,
+                                              BLOCK_IO_HANG_STATUS_HANG);
+            /* I/O hang timeout first trigger */
+            if (acb->rwco.blk->is_iohang_timeout) {
+                *has_timeout = true;
+            }
+        }
+    }
+
+    return need_rehandle;
+}
+
 static void blk_rehandle_aio_complete(BlkAioEmAIOCB *acb)
 {
+    bool has_timeout = false;
+    bool need_rehandle = false;
+
     if (acb->has_returned) {
         blk_dec_in_flight(acb->rwco.blk);
-        if (acb->rwco.ret == -EIO) {
+        need_rehandle = blk_rehandle_aio(acb, &has_timeout);
+        if (need_rehandle) {
             blk_rehandle_insert_aiocb(acb->rwco.blk, acb);
             return;
         }
 
         acb->common.cb(acb->common.opaque, acb->rwco.ret);
+
+        /* I/O hang return to normal status */
+        if (!has_timeout) {
+            blk_iohang_handle(acb->rwco.blk, BLOCK_IO_HANG_STATUS_NORMAL);
+        }
+
         qemu_aio_unref(acb);
     }
 }
 
+void blk_iohang_init(BlockBackend *blk, int64_t iohang_timeout)
+{
+    if (!blk) {
+        return;
+    }
+
+    blk->is_iohang_timeout = false;
+    blk->iohang_time = 0;
+    blk->iohang_timeout = 0;
+    blk->iohang_status = BLOCK_IO_HANG_STATUS_NORMAL;
+    if (iohang_timeout > 0) {
+        blk->iohang_timeout = iohang_timeout;
+    }
+}
+
 void blk_register_buf(BlockBackend *blk, void *host, size_t size)
 {
     bdrv_register_buf(blk_bs(blk), host, size);
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index 8203d7f6f9..bfebe3a960 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -268,4 +268,6 @@ const BdrvChild *blk_root(BlockBackend *blk);
 
 int blk_make_empty(BlockBackend *blk, Error **errp);
 
+void blk_iohang_init(BlockBackend *blk, int64_t iohang_timeout);
+
 #endif
-- 
2.23.0




* [RFC PATCH 4/7] block-backend: add I/O hang drain when disbale
  2020-09-27 13:04 [RFC PATCH 0/7] block-backend: Introduce I/O hang Ying Fang
                   ` (2 preceding siblings ...)
  2020-09-27 13:04 ` [RFC PATCH 3/7] block-backend: add I/O hang timeout Ying Fang
@ 2020-09-27 13:04 ` Ying Fang
  2020-09-28 15:09   ` Eric Blake
  2020-09-27 13:04 ` [RFC PATCH 5/7] virtio-blk: disable I/O hang when resetting Ying Fang
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 13+ messages in thread
From: Ying Fang @ 2020-09-27 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Ying Fang, Jiahui Cen, zhang.zhanghailiang, mreitz

To disable I/O hang, all hanging AIOs need to be drained. A rehandle status
field is introduced to tell the rehandle mechanism not to rehandle failed
AIOs while I/O hang is disabled.

Signed-off-by: Ying Fang <fangying1@huawei.com>
Signed-off-by: Jiahui Cen <cenjiahui@huawei.com>
---
 block/block-backend.c          | 85 ++++++++++++++++++++++++++++++++--
 include/sysemu/block-backend.h |  3 ++
 2 files changed, 84 insertions(+), 4 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index d0b2b59f55..95b2d6a679 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -37,6 +37,9 @@ static AioContext *blk_aiocb_get_aio_context(BlockAIOCB *acb);
 
 /* block backend rehandle timer interval 5s */
 #define BLOCK_BACKEND_REHANDLE_TIMER_INTERVAL   5000
+#define BLOCK_BACKEND_REHANDLE_NORMAL           1
+#define BLOCK_BACKEND_REHANDLE_DRAIN_REQUESTED  2
+#define BLOCK_BACKEND_REHANDLE_DRAINED          3
 
 enum BlockIOHangStatus {
     BLOCK_IO_HANG_STATUS_NORMAL = 0,
@@ -50,6 +53,8 @@ typedef struct BlockBackendRehandleInfo {
 
     unsigned int in_flight;
     QTAILQ_HEAD(, BlkAioEmAIOCB) re_aios;
+
+    int status;
 } BlockBackendRehandleInfo;
 
 typedef struct BlockBackendAioNotifier {
@@ -471,6 +476,8 @@ static void blk_delete(BlockBackend *blk)
     assert(!blk->refcnt);
     assert(!blk->name);
     assert(!blk->dev);
+    assert(atomic_read(&blk->reinfo.in_flight) == 0);
+    blk_rehandle_disable(blk);
     if (blk->public.throttle_group_member.throttle_state) {
         blk_io_limits_disable(blk);
     }
@@ -2460,6 +2467,37 @@ static void blk_rehandle_remove_aiocb(BlockBackend *blk, BlkAioEmAIOCB *acb)
     atomic_dec(&blk->reinfo.in_flight);
 }
 
+static void blk_rehandle_drain(BlockBackend *blk)
+{
+    if (blk_bs(blk)) {
+        bdrv_drained_begin(blk_bs(blk));
+        BDRV_POLL_WHILE(blk_bs(blk), atomic_read(&blk->reinfo.in_flight) > 0);
+        bdrv_drained_end(blk_bs(blk));
+    }
+}
+
+static bool blk_rehandle_is_paused(BlockBackend *blk)
+{
+    return blk->reinfo.status == BLOCK_BACKEND_REHANDLE_DRAIN_REQUESTED ||
+           blk->reinfo.status == BLOCK_BACKEND_REHANDLE_DRAINED;
+}
+
+static void blk_rehandle_pause(BlockBackend *blk)
+{
+    BlockBackendRehandleInfo *reinfo = &blk->reinfo;
+
+    aio_context_acquire(blk_get_aio_context(blk));
+    if (!reinfo->enable || reinfo->status == BLOCK_BACKEND_REHANDLE_DRAINED) {
+        aio_context_release(blk_get_aio_context(blk));
+        return;
+    }
+
+    reinfo->status = BLOCK_BACKEND_REHANDLE_DRAIN_REQUESTED;
+    blk_rehandle_drain(blk);
+    reinfo->status = BLOCK_BACKEND_REHANDLE_DRAINED;
+    aio_context_release(blk_get_aio_context(blk));
+}
+
 static void blk_rehandle_timer_cb(void *opaque)
 {
     BlockBackend *blk = opaque;
@@ -2559,10 +2597,12 @@ static void blk_rehandle_aio_complete(BlkAioEmAIOCB *acb)
 
     if (acb->has_returned) {
         blk_dec_in_flight(acb->rwco.blk);
-        need_rehandle = blk_rehandle_aio(acb, &has_timeout);
-        if (need_rehandle) {
-            blk_rehandle_insert_aiocb(acb->rwco.blk, acb);
-            return;
+        if (!blk_rehandle_is_paused(acb->rwco.blk)) {
+            need_rehandle = blk_rehandle_aio(acb, &has_timeout);
+            if (need_rehandle) {
+                blk_rehandle_insert_aiocb(acb->rwco.blk, acb);
+                return;
+            }
         }
 
         acb->common.cb(acb->common.opaque, acb->rwco.ret);
@@ -2576,6 +2616,42 @@ static void blk_rehandle_aio_complete(BlkAioEmAIOCB *acb)
     }
 }
 
+void blk_rehandle_enable(BlockBackend *blk)
+{
+    BlockBackendRehandleInfo *reinfo = &blk->reinfo;
+
+    aio_context_acquire(blk_get_aio_context(blk));
+    if (reinfo->enable) {
+        aio_context_release(blk_get_aio_context(blk));
+        return;
+    }
+
+    reinfo->ts = aio_timer_new(blk_get_aio_context(blk), QEMU_CLOCK_REALTIME,
+                               SCALE_MS, blk_rehandle_timer_cb, blk);
+    reinfo->timer_interval_ms = BLOCK_BACKEND_REHANDLE_TIMER_INTERVAL;
+    reinfo->status = BLOCK_BACKEND_REHANDLE_NORMAL;
+    reinfo->enable = true;
+    aio_context_release(blk_get_aio_context(blk));
+}
+
+void blk_rehandle_disable(BlockBackend *blk)
+{
+    if (!blk->reinfo.enable) {
+        return;
+    }
+
+    blk_rehandle_pause(blk);
+    timer_del(blk->reinfo.ts);
+    timer_free(blk->reinfo.ts);
+    blk->reinfo.ts = NULL;
+    blk->reinfo.enable = false;
+}
+
+bool blk_iohang_is_enabled(BlockBackend *blk)
+{
+    return blk->iohang_timeout != 0;
+}
+
 void blk_iohang_init(BlockBackend *blk, int64_t iohang_timeout)
 {
     if (!blk) {
@@ -2588,6 +2664,7 @@ void blk_iohang_init(BlockBackend *blk, int64_t iohang_timeout)
     blk->iohang_status = BLOCK_IO_HANG_STATUS_NORMAL;
     if (iohang_timeout > 0) {
         blk->iohang_timeout = iohang_timeout;
+        blk_rehandle_enable(blk);
     }
 }
 
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index bfebe3a960..375ae13b0b 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -268,6 +268,9 @@ const BdrvChild *blk_root(BlockBackend *blk);
 
 int blk_make_empty(BlockBackend *blk, Error **errp);
 
+void blk_rehandle_enable(BlockBackend *blk);
+void blk_rehandle_disable(BlockBackend *blk);
+bool blk_iohang_is_enabled(BlockBackend *blk);
 void blk_iohang_init(BlockBackend *blk, int64_t iohang_timeout);
 
 #endif
-- 
2.23.0




* [RFC PATCH 5/7] virtio-blk: disable I/O hang when resetting
  2020-09-27 13:04 [RFC PATCH 0/7] block-backend: Introduce I/O hang Ying Fang
                   ` (3 preceding siblings ...)
  2020-09-27 13:04 ` [RFC PATCH 4/7] block-backend: add I/O hang drain when disbale Ying Fang
@ 2020-09-27 13:04 ` Ying Fang
  2020-09-27 13:04 ` [RFC PATCH 6/7] qemu-option: add I/O hang timeout option Ying Fang
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Ying Fang @ 2020-09-27 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Ying Fang, Jiahui Cen, zhang.zhanghailiang, mreitz

All AIOs, including the hanging AIOs, need to be drained when resetting
virtio-blk. So it is necessary to disable I/O hang before resetting and to
enable it again after resetting, if I/O hang is enabled.

Signed-off-by: Ying Fang <fangying1@huawei.com>
Signed-off-by: Jiahui Cen <cenjiahui@huawei.com>
---
 hw/block/virtio-blk.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 2204ba149e..11837a54f5 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -892,6 +892,10 @@ static void virtio_blk_reset(VirtIODevice *vdev)
     AioContext *ctx;
     VirtIOBlockReq *req;
 
+    if (blk_iohang_is_enabled(s->blk)) {
+        blk_rehandle_disable(s->blk);
+    }
+
     ctx = blk_get_aio_context(s->blk);
     aio_context_acquire(ctx);
     blk_drain(s->blk);
@@ -909,6 +913,10 @@ static void virtio_blk_reset(VirtIODevice *vdev)
 
     assert(!s->dataplane_started);
     blk_set_enable_write_cache(s->blk, s->original_wce);
+
+    if (blk_iohang_is_enabled(s->blk)) {
+        blk_rehandle_enable(s->blk);
+    }
 }
 
 /* coalesce internal state, copy to pci i/o region 0
-- 
2.23.0




* [RFC PATCH 6/7] qemu-option: add I/O hang timeout option
  2020-09-27 13:04 [RFC PATCH 0/7] block-backend: Introduce I/O hang Ying Fang
                   ` (4 preceding siblings ...)
  2020-09-27 13:04 ` [RFC PATCH 5/7] virtio-blk: disable I/O hang when resetting Ying Fang
@ 2020-09-27 13:04 ` Ying Fang
  2020-09-27 13:04 ` [RFC PATCH 7/7] qapi: add I/O hang and I/O hang timeout qapi event Ying Fang
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Ying Fang @ 2020-09-27 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Ying Fang, Jiahui Cen, zhang.zhanghailiang, mreitz

The appropriate I/O hang timeout differs from situation to situation, so it
is better to provide an option that lets the user set the I/O hang timeout
for each block device.

Signed-off-by: Jiahui Cen <cenjiahui@huawei.com>
Signed-off-by: Ying Fang <fangying1@huawei.com>
---
 blockdev.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/blockdev.c b/blockdev.c
index 7f2561081e..ff8cdcd497 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -500,6 +500,7 @@ static BlockBackend *blockdev_init(const char *file, QDict *bs_opts,
     BlockdevDetectZeroesOptions detect_zeroes =
         BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF;
     const char *throttling_group = NULL;
+    int64_t iohang_timeout = 0;
 
     /* Check common options by copying from bs_opts to opts, all other options
      * stay in bs_opts for processing by bdrv_open(). */
@@ -622,6 +623,12 @@ static BlockBackend *blockdev_init(const char *file, QDict *bs_opts,
 
         bs->detect_zeroes = detect_zeroes;
 
+        /* init timeout value for I/O Hang */
+        iohang_timeout = qemu_opt_get_number(opts, "iohang-timeout", 0);
+        if (iohang_timeout > 0) {
+            blk_iohang_init(blk, iohang_timeout);
+        }
+
         block_acct_setup(blk_get_stats(blk), account_invalid, account_failed);
 
         if (!parse_stats_intervals(blk_get_stats(blk), interval_list, errp)) {
@@ -3786,6 +3793,10 @@ QemuOptsList qemu_common_drive_opts = {
             .type = QEMU_OPT_BOOL,
             .help = "whether to account for failed I/O operations "
                     "in the statistics",
+        },{
+            .name = "iohang-timeout",
+            .type = QEMU_OPT_NUMBER,
+            .help = "timeout value for I/O hang, in seconds",
         },
         { /* end of list */ }
     },
-- 
2.23.0




* [RFC PATCH 7/7] qapi: add I/O hang and I/O hang timeout qapi event
  2020-09-27 13:04 [RFC PATCH 0/7] block-backend: Introduce I/O hang Ying Fang
                   ` (5 preceding siblings ...)
  2020-09-27 13:04 ` [RFC PATCH 6/7] qemu-option: add I/O hang timeout option Ying Fang
@ 2020-09-27 13:04 ` Ying Fang
  2020-09-27 13:27 ` [RFC PATCH 0/7] block-backend: Introduce I/O hang no-reply
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Ying Fang @ 2020-09-27 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Ying Fang, Jiahui Cen, zhang.zhanghailiang, mreitz

Hypervisor management tools like libvirt may need to monitor I/O hang
events, so report the I/O hang and I/O hang timeout events via QAPI.
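
As an illustration, a QMP client would then see an event roughly like the
following when an I/O hang begins (the timestamp values here are invented):

  {"event": "BLOCK_IO_HANG",
   "data": {"set": true},
   "timestamp": {"seconds": 1601211860, "microseconds": 123456}}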

Signed-off-by: Jiahui Cen <cenjiahui@huawei.com>
Signed-off-by: Ying Fang <fangying1@huawei.com>
---
 block/block-backend.c |  3 +++
 qapi/block-core.json  | 26 ++++++++++++++++++++++++++
 2 files changed, 29 insertions(+)

diff --git a/block/block-backend.c b/block/block-backend.c
index 95b2d6a679..5dc5b11bcc 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -2540,6 +2540,7 @@ static bool blk_iohang_handle(BlockBackend *blk, int new_status)
             /* Case when I/O Hang is recovered */
             blk->is_iohang_timeout = false;
             blk->iohang_time = 0;
+            qapi_event_send_block_io_hang(false);
         }
         break;
     case BLOCK_IO_HANG_STATUS_HANG:
@@ -2547,12 +2548,14 @@ static bool blk_iohang_handle(BlockBackend *blk, int new_status)
             /* Case when I/O hang is first triggered */
             blk->iohang_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) / 1000;
             need_rehandle = true;
+            qapi_event_send_block_io_hang(true);
         } else {
             if (!blk->is_iohang_timeout) {
                 now = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) / 1000;
                 if (now >= (blk->iohang_time + blk->iohang_timeout)) {
                     /* Case when I/O hang is timeout */
                     blk->is_iohang_timeout = true;
+                    qapi_event_send_block_io_hang_timeout(true);
                 } else {
                     /* Case when I/O hang is continued */
                     need_rehandle = true;
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 3c16f1e11d..7bdf75c6d7 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -5535,3 +5535,29 @@
 { 'command': 'blockdev-snapshot-delete-internal-sync',
   'data': { 'device': 'str', '*id': 'str', '*name': 'str'},
   'returns': 'SnapshotInfo' }
+
+##
+# @BLOCK_IO_HANG:
+#
+# Emitted when a device I/O hang begins or ends
+#
+# @set: true if the I/O hang begins; false if it ends.
+#
+# Since: 5.2
+#
+##
+{ 'event': 'BLOCK_IO_HANG',
+  'data': { 'set': 'bool' }}
+
+##
+# @BLOCK_IO_HANG_TIMEOUT:
+#
+# Emitted when the device I/O hang timeout state is set or cleared
+#
+# @set: true if set; false if cleared.
+#
+# Since: 5.2
+#
+##
+{ 'event': 'BLOCK_IO_HANG_TIMEOUT',
+  'data': { 'set': 'bool' }}
-- 
2.23.0




* Re: [RFC PATCH 0/7] block-backend: Introduce I/O hang
  2020-09-27 13:04 [RFC PATCH 0/7] block-backend: Introduce I/O hang Ying Fang
                   ` (6 preceding siblings ...)
  2020-09-27 13:04 ` [RFC PATCH 7/7] qapi: add I/O hang and I/O hang timeout qapi event Ying Fang
@ 2020-09-27 13:27 ` no-reply
  2020-09-27 13:32 ` no-reply
  2020-09-28 10:57 ` Kevin Wolf
  9 siblings, 0 replies; 13+ messages in thread
From: no-reply @ 2020-09-27 13:27 UTC (permalink / raw)
  To: fangying1; +Cc: kwolf, fangying1, mreitz, qemu-devel, zhang.zhanghailiang

Patchew URL: https://patchew.org/QEMU/20200927130420.1095-1-fangying1@huawei.com/



Hi,

This series failed the docker-mingw@fedora build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#! /bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-mingw@fedora J=14 NETWORK=1
=== TEST SCRIPT END ===

Host machine cpu: x86_64
Target machine cpu family: x86
Target machine cpu: x86_64
../src/meson.build:10: WARNING: Module unstable-keyval has no backwards or forwards compatibility and might not exist in future releases.
Program sh found: YES
Program python3 found: YES (/usr/bin/python3)
Configuring ninjatool using configuration
---
Compiling C object libblock.fa.p/block_vdi.c.obj
Compiling C object libblock.fa.p/block_cloop.c.obj
../src/block/block-backend.c: In function 'blk_new':
../src/block/block-backend.c:386:5: error: implicit declaration of function 'atomic_set'; did you mean 'qatomic_set'? [-Werror=implicit-function-declaration]
  386 |     atomic_set(&blk->reinfo.in_flight, 0);
      |     ^~~~~~~~~~
      |     qatomic_set
../src/block/block-backend.c:386:5: error: nested extern declaration of 'atomic_set' [-Werror=nested-externs]
In file included from /usr/x86_64-w64-mingw32/sys-root/mingw/lib/glib-2.0/include/glibconfig.h:9,
                 from /usr/x86_64-w64-mingw32/sys-root/mingw/include/glib-2.0/glib/gtypes.h:32,
                 from /usr/x86_64-w64-mingw32/sys-root/mingw/include/glib-2.0/glib/galloca.h:32,
---
                 from /tmp/qemu-test/src/include/qemu/osdep.h:126,
                 from ../src/block/block-backend.c:13:
../src/block/block-backend.c: In function 'blk_delete':
../src/block/block-backend.c:479:12: error: implicit declaration of function 'atomic_read'; did you mean 'qatomic_read'? [-Werror=implicit-function-declaration]
  479 |     assert(atomic_read(&blk->reinfo.in_flight) == 0);
      |            ^~~~~~~~~~~
/usr/x86_64-w64-mingw32/sys-root/mingw/include/glib-2.0/glib/gmacros.h:928:8: note: in definition of macro '_G_BOOLEAN_EXPR'
---
../src/block/block-backend.c:479:5: note: in expansion of macro 'assert'
  479 |     assert(atomic_read(&blk->reinfo.in_flight) == 0);
      |     ^~~~~~
../src/block/block-backend.c:479:12: error: nested extern declaration of 'atomic_read' [-Werror=nested-externs]
  479 |     assert(atomic_read(&blk->reinfo.in_flight) == 0);
      |            ^~~~~~~~~~~
/usr/x86_64-w64-mingw32/sys-root/mingw/include/glib-2.0/glib/gmacros.h:928:8: note: in definition of macro '_G_BOOLEAN_EXPR'
---
  479 |     assert(atomic_read(&blk->reinfo.in_flight) == 0);
      |     ^~~~~~
../src/block/block-backend.c: In function 'blk_rehandle_insert_aiocb':
../src/block/block-backend.c:2459:5: error: implicit declaration of function 'atomic_inc'; did you mean 'qatomic_inc'? [-Werror=implicit-function-declaration]
 2459 |     atomic_inc(&blk->reinfo.in_flight);
      |     ^~~~~~~~~~
      |     qatomic_inc
../src/block/block-backend.c:2459:5: error: nested extern declaration of 'atomic_inc' [-Werror=nested-externs]
../src/block/block-backend.c: In function 'blk_rehandle_remove_aiocb':
../src/block/block-backend.c:2468:5: error: implicit declaration of function 'atomic_dec'; did you mean 'qatomic_dec'? [-Werror=implicit-function-declaration]
 2468 |     atomic_dec(&blk->reinfo.in_flight);
      |     ^~~~~~~~~~
      |     qatomic_dec
../src/block/block-backend.c:2468:5: error: nested extern declaration of 'atomic_dec' [-Werror=nested-externs]
cc1: all warnings being treated as errors
make: *** [Makefile.ninja:888: libblock.fa.p/block_block-backend.c.obj] Error 1
make: *** Waiting for unfinished jobs....
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 709, in <module>
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--rm', '--label', 'com.qemu.instance.uuid=4c3aba1eb35b428ca91e79a610e892a6', '-u', '1001', '--security-opt', 'seccomp=unconfined', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-1pm_eno6/src/docker-src.2020-09-27-09.21.55.30331:/var/tmp/qemu:z,ro', 'qemu/fedora', '/var/tmp/qemu/run', 'test-mingw']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=4c3aba1eb35b428ca91e79a610e892a6
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-1pm_eno6/src'
make: *** [docker-run-test-mingw@fedora] Error 2

real    5m11.016s
user    0m19.775s


The full log is available at
http://patchew.org/logs/20200927130420.1095-1-fangying1@huawei.com/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com


* Re: [RFC PATCH 0/7] block-backend: Introduce I/O hang
  2020-09-27 13:04 [RFC PATCH 0/7] block-backend: Introduce I/O hang Ying Fang
                   ` (7 preceding siblings ...)
  2020-09-27 13:27 ` [RFC PATCH 0/7] block-backend: Introduce I/O hang no-reply
@ 2020-09-27 13:32 ` no-reply
  2020-09-28 10:57 ` Kevin Wolf
  9 siblings, 0 replies; 13+ messages in thread
From: no-reply @ 2020-09-27 13:32 UTC (permalink / raw)
  To: fangying1; +Cc: kwolf, fangying1, mreitz, qemu-devel, zhang.zhanghailiang

Patchew URL: https://patchew.org/QEMU/20200927130420.1095-1-fangying1@huawei.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

C linker for the host machine: cc ld.bfd 2.27-43
Host machine cpu family: x86_64
Host machine cpu: x86_64
../src/meson.build:10: WARNING: Module unstable-keyval has no backwards or forwards compatibility and might not exist in future releases.
Program sh found: YES
Program python3 found: YES (/usr/bin/python3)
Configuring ninjatool using configuration
---
Compiling C object libblock.fa.p/block_commit.c.o
Compiling C object libblock.fa.p/block_vhdx-endian.c.o
../src/block/block-backend.c: In function 'blk_new':
../src/block/block-backend.c:386:5: error: implicit declaration of function 'atomic_set' [-Werror=implicit-function-declaration]
     atomic_set(&blk->reinfo.in_flight, 0);
     ^
../src/block/block-backend.c:386:5: error: nested extern declaration of 'atomic_set' [-Werror=nested-externs]
../src/block/block-backend.c: In function 'blk_delete':
../src/block/block-backend.c:479:5: error: implicit declaration of function 'atomic_read' [-Werror=implicit-function-declaration]
     assert(atomic_read(&blk->reinfo.in_flight) == 0);
     ^
../src/block/block-backend.c:479:5: error: nested extern declaration of 'atomic_read' [-Werror=nested-externs]
../src/block/block-backend.c: In function 'blk_rehandle_insert_aiocb':
../src/block/block-backend.c:2459:5: error: implicit declaration of function 'atomic_inc' [-Werror=implicit-function-declaration]
     atomic_inc(&blk->reinfo.in_flight);
     ^
../src/block/block-backend.c:2459:5: error: nested extern declaration of 'atomic_inc' [-Werror=nested-externs]
../src/block/block-backend.c: In function 'blk_rehandle_remove_aiocb':
../src/block/block-backend.c:2468:5: error: implicit declaration of function 'atomic_dec' [-Werror=implicit-function-declaration]
     atomic_dec(&blk->reinfo.in_flight);
     ^
../src/block/block-backend.c:2468:5: error: nested extern declaration of 'atomic_dec' [-Werror=nested-externs]
cc1: all warnings being treated as errors
make: *** [libblock.fa.p/block_block-backend.c.o] Error 1
make: *** Waiting for unfinished jobs....
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 709, in <module>
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--rm', '--label', 'com.qemu.instance.uuid=39951e04bf3b4809a4afe5755ca771f5', '-u', '1001', '--security-opt', 'seccomp=unconfined', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-5qkiksy1/src/docker-src.2020-09-27-09.28.20.6987:/var/tmp/qemu:z,ro', 'qemu/centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=39951e04bf3b4809a4afe5755ca771f5
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-5qkiksy1/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    4m6.755s
user    0m23.139s


The full log is available at
http://patchew.org/logs/20200927130420.1095-1-fangying1@huawei.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com


* Re: [RFC PATCH 0/7] block-backend: Introduce I/O hang
  2020-09-27 13:04 [RFC PATCH 0/7] block-backend: Introduce I/O hang Ying Fang
                   ` (8 preceding siblings ...)
  2020-09-27 13:32 ` no-reply
@ 2020-09-28 10:57 ` Kevin Wolf
  2020-09-29  9:48   ` cenjiahui
  9 siblings, 1 reply; 13+ messages in thread
From: Kevin Wolf @ 2020-09-28 10:57 UTC (permalink / raw)
  To: Ying Fang; +Cc: zhang.zhanghailiang, qemu-devel, mreitz

Am 27.09.2020 um 15:04 hat Ying Fang geschrieben:
> A VM in a cloud environment may use a virtual disk as its backend storage,
> and there are usually filesystems on the virtual block device. When the backend
> storage is temporarily down, any I/O issued to the virtual block device will
> cause an error. For example, an error in an ext4 filesystem makes the
> filesystem read-only. However, cloud backend storage can often be recovered
> quickly: an IP-SAN may go down due to a network failure and come back online
> soon after the network recovers. The error in the filesystem, though, may not
> be recoverable without a device reattach or a system restart. So an I/O
> rehandle mechanism is needed to implement self-healing.
> 
> This patch series proposes a feature called I/O hang. It can rehandle AIOs
> that fail with EIO without returning the error to the guest. From the guest's
> point of view it just looks as if the I/O is hanging and has not returned.
> With this feature enabled, the guest gets back to running smoothly once the
> I/O recovers.

What is the problem with setting werror=stop and rerror=stop for the
device? Is it that QEMU won't automatically retry, but management tool
interaction is required to resume the guest?
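
For reference, that existing behaviour is configured per device roughly like
this (the image path is an illustrative placeholder):

  -drive file=test.img,format=raw,werror=stop,rerror=stop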

I haven't checked your patches in detail yet, but implementing this
functionality in the backend means that blk_drain() will hang (or if it
doesn't hang, it doesn't do what it's supposed to do), making the whole
QEMU process unresponsive until the I/O succeeds again. Amongst others,
this would make it impossible to migrate away from a host with storage
problems.

Kevin




* Re: [RFC PATCH 4/7] block-backend: add I/O hang drain when disbale
  2020-09-27 13:04 ` [RFC PATCH 4/7] block-backend: add I/O hang drain when disbale Ying Fang
@ 2020-09-28 15:09   ` Eric Blake
  0 siblings, 0 replies; 13+ messages in thread
From: Eric Blake @ 2020-09-28 15:09 UTC (permalink / raw)
  To: Ying Fang, qemu-devel; +Cc: kwolf, Jiahui Cen, zhang.zhanghailiang, mreitz

On 9/27/20 8:04 AM, Ying Fang wrote:

In the subject: s/disbale/disabled/

> To disable I/O hang, all hanging AIOs need to be drained. A rehandle status
> field is introduced to tell the rehandle mechanism not to rehandle failed
> AIOs while I/O hang is disabled.
> 
> Signed-off-by: Ying Fang <fangying1@huawei.com>
> Signed-off-by: Jiahui Cen <cenjiahui@huawei.com>
> ---
>   block/block-backend.c          | 85 ++++++++++++++++++++++++++++++++--
>   include/sysemu/block-backend.h |  3 ++
>   2 files changed, 84 insertions(+), 4 deletions(-)
> 
-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




* Re: [RFC PATCH 0/7] block-backend: Introduce I/O hang
  2020-09-28 10:57 ` Kevin Wolf
@ 2020-09-29  9:48   ` cenjiahui
  0 siblings, 0 replies; 13+ messages in thread
From: cenjiahui @ 2020-09-29  9:48 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Ying Fang, mreitz, zhang.zhanghailiang, qemu-devel


On 2020/9/28 18:57, Kevin Wolf wrote:
> Am 27.09.2020 um 15:04 hat Ying Fang geschrieben:
>> A VM in a cloud environment may use a virtual disk as its backend storage,
>> and there are usually filesystems on the virtual block device. When the backend
>> storage is temporarily down, any I/O issued to the virtual block device will
>> cause an error. For example, an error in an ext4 filesystem makes the
>> filesystem read-only. However, cloud backend storage can often be recovered
>> quickly: an IP-SAN may go down due to a network failure and come back online
>> soon after the network recovers. The error in the filesystem, though, may not
>> be recoverable without a device reattach or a system restart. So an I/O
>> rehandle mechanism is needed to implement self-healing.
>>
>> This patch series proposes a feature called I/O hang. It can rehandle AIOs
>> that fail with EIO without returning the error to the guest. From the guest's
>> point of view it just looks as if the I/O is hanging and has not returned.
>> With this feature enabled, the guest gets back to running smoothly once the
>> I/O recovers.
> 
> What is the problem with setting werror=stop and rerror=stop for the
When an I/O error occurs, simply setting werror=stop and rerror=stop pauses
the whole VM and makes it unavailable. Moreover, the VM is not recovered until
the management tool manually resumes it after the backend storage recovers.
> device? Is it that QEMU won't automatically retry, but management tool
> interaction is required to resume the guest?
By using the I/O hang mechanism, we can temporarily hang the I/Os, and any
other services unrelated to the hung virtual block device, such as the
network, can go on working. Besides, once the backend storage is recovered,
our I/O rehandle mechanism automatically completes the hung I/Os and the VM
continues its work.
> 
> I haven't checked your patches in detail yet, but implementing this
> functionality in the backend means that blk_drain() will hang (or if it
> doesn't hang, it doesn't do what it's supposed to do), making the whole
What if we disable rehandle before blk_drain()?
> QEMU process unresponsive until the I/O succeeds again. Amongst others,
> this would make it impossible to migrate away from a host with storage
> problems.
Exactly. If the storage is recovered during the migration iteration phase,
the migration can succeed; but if the storage is still not recovered in the
migration completion phase, the migration should fail and be cancelled.

Thanks,
Jiahui Cen
> 
> Kevin
> 
> 
> .
>



end of thread, other threads:[~2020-09-29  9:49 UTC | newest]

Thread overview: 13+ messages
2020-09-27 13:04 [RFC PATCH 0/7] block-backend: Introduce I/O hang Ying Fang
2020-09-27 13:04 ` [RFC PATCH 1/7] block-backend: introduce I/O rehandle info Ying Fang
2020-09-27 13:04 ` [RFC PATCH 2/7] block-backend: rehandle block aios when EIO Ying Fang
2020-09-27 13:04 ` [RFC PATCH 3/7] block-backend: add I/O hang timeout Ying Fang
2020-09-27 13:04 ` [RFC PATCH 4/7] block-backend: add I/O hang drain when disbale Ying Fang
2020-09-28 15:09   ` Eric Blake
2020-09-27 13:04 ` [RFC PATCH 5/7] virtio-blk: disable I/O hang when resetting Ying Fang
2020-09-27 13:04 ` [RFC PATCH 6/7] qemu-option: add I/O hang timeout option Ying Fang
2020-09-27 13:04 ` [RFC PATCH 7/7] qapi: add I/O hang and I/O hang timeout qapi event Ying Fang
2020-09-27 13:27 ` [RFC PATCH 0/7] block-backend: Introduce I/O hang no-reply
2020-09-27 13:32 ` no-reply
2020-09-28 10:57 ` Kevin Wolf
2020-09-29  9:48   ` cenjiahui
