* [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support
@ 2021-07-02 17:23 Ilya Dryomov
  2021-07-02 17:23 ` [PATCH v5 1/6] block/rbd: bump librbd requirement to luminous release Ilya Dryomov
                   ` (7 more replies)
  0 siblings, 8 replies; 21+ messages in thread
From: Ilya Dryomov @ 2021-07-02 17:23 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Daniel P. Berrange, ct, Peter Lieven, qemu-devel,
	Max Reitz, Paolo Bonzini

This series migrates the qemu rbd driver from the old aio emulation
to native coroutines and adds write zeroes support which is important
for block operations.

To achieve this we first bump the librbd requirement to the already
outdated luminous release of ceph to get rid of some wrappers and
ifdef'ry in the code.
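
For orientation, here is a condensed sketch of the request path after the
conversion (simplified from patch 4 below; the RBD_AIO_* dispatch, error
reporting and short-read padding are omitted):

    typedef struct RBDTask {
        BlockDriverState *bs;
        Coroutine *co;
        bool complete;
        int64_t ret;
    } RBDTask;

    /* runs as a BH in the BDS AioContext */
    static void qemu_rbd_finish_bh(void *opaque)
    {
        RBDTask *task = opaque;
        task->complete = true;
        aio_co_wake(task->co);
    }

    /* completion callback, invoked from a librbd thread */
    static void qemu_rbd_completion_cb(rbd_completion_t c, RBDTask *task)
    {
        task->ret = rbd_aio_get_return_value(c);
        rbd_aio_release(c);
        aio_bh_schedule_oneshot(bdrv_get_aio_context(task->bs),
                                qemu_rbd_finish_bh, task);
    }

    static int coroutine_fn qemu_rbd_start_co(BlockDriverState *bs,
                                              uint64_t offset, uint64_t bytes,
                                              QEMUIOVector *qiov, int flags,
                                              RBDAIOCmd cmd)
    {
        BDRVRBDState *s = bs->opaque;
        RBDTask task = { .bs = bs, .co = qemu_coroutine_self() };
        rbd_completion_t c;

        rbd_aio_create_completion(&task,
                                  (rbd_callback_t) qemu_rbd_completion_cb, &c);
        /* patch 4 dispatches on cmd here; only the read case is shown */
        rbd_aio_readv(s->image, qiov->iov, qiov->niov, offset, c);

        while (!task.complete) {
            qemu_coroutine_yield();   /* woken by qemu_rbd_finish_bh() */
        }
        return task.ret < 0 ? task.ret : 0;
    }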

V4->V5:
 - rebase on Kevin's block branch (https://repo.or.cz/qemu/kevin.git block)
   to resolve conflicts with "block/rbd: Add support for rbd image encryption"
   patch

V3->V4:
 - this patch is now rebased on top of current master
 - Patch 1: just mention librbd, tweak version numbers [Ilya]
 - Patch 3: use rbd_get_size instead of rbd_stat [Ilya]
 - Patch 4: retain comment about using a BH in the callback [Ilya]
 - Patch 5: set BDRV_REQ_NO_FALLBACK and silently ignore BDRV_REQ_MAY_UNMAP [Ilya]

V2->V3:
 - this patch is now rebased on top of current master
 - Patch 1: only use cc.links and not cc.run to not break
   cross-compiling. [Kevin]
   Since Qemu 6.1 it's okay to rely on librbd >= 12.x because RHEL-7
   support was dropped [Daniel]
 - Patch 4: dropped
 - Patch 5: store BDS in RBDTask and use bdrv_get_aio_context() [Kevin]

V1->V2:
 - this patch is now rebased on top of current master with Paolo's
   upcoming fixes for the meson.build script included:
    - meson: accept either shared or static libraries if --disable-static
    - meson: honor --enable-rbd if cc.links test fails
 - Patch 1: adjusted to meson.build script
 - Patch 2: unchanged
 - Patch 3: new patch
 - Patch 4: do not implement empty detach_aio_context callback [Jason]
 - Patch 5: - fix aio completion cleanup in error case [Jason]
            - return error codes from librbd
 - Patch 6: - add support for thick provisioning [Jason]
            - do not set write zeroes alignment
 - Patch 7: new patch

Peter Lieven (6):
  block/rbd: bump librbd requirement to luminous release
  block/rbd: store object_size in BDRVRBDState
  block/rbd: update s->image_size in qemu_rbd_getlength
  block/rbd: migrate from aio to coroutines
  block/rbd: add write zeroes support
  block/rbd: drop qemu_rbd_refresh_limits

 block/rbd.c | 406 ++++++++++++++++------------------------------------
 meson.build |   7 +-
 2 files changed, 128 insertions(+), 285 deletions(-)

-- 
2.19.2



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v5 1/6] block/rbd: bump librbd requirement to luminous release
  2021-07-02 17:23 [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support Ilya Dryomov
@ 2021-07-02 17:23 ` Ilya Dryomov
  2021-07-02 17:23 ` [PATCH v5 2/6] block/rbd: store object_size in BDRVRBDState Ilya Dryomov
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Ilya Dryomov @ 2021-07-02 17:23 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Daniel P. Berrange, ct, Peter Lieven, qemu-devel,
	Max Reitz, Paolo Bonzini

From: Peter Lieven <pl@kamp.de>

Ceph Luminous (version 12.2.z) is almost 4 years old at this point.
Bump the requirement to get rid of the ifdef'ry in the code.
Qemu 6.1 dropped support for RHEL-7, which was the last supported
OS that required an older librbd.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
---
 block/rbd.c | 120 ++++------------------------------------------------
 meson.build |   7 ++-
 2 files changed, 13 insertions(+), 114 deletions(-)

diff --git a/block/rbd.c b/block/rbd.c
index f51adb364670..b4b928bbb99f 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -55,24 +55,10 @@
  * leading "\".
  */
 
-/* rbd_aio_discard added in 0.1.2 */
-#if LIBRBD_VERSION_CODE >= LIBRBD_VERSION(0, 1, 2)
-#define LIBRBD_SUPPORTS_DISCARD
-#else
-#undef LIBRBD_SUPPORTS_DISCARD
-#endif
-
 #define OBJ_MAX_SIZE (1UL << OBJ_DEFAULT_OBJ_ORDER)
 
 #define RBD_MAX_SNAPS 100
 
-/* The LIBRBD_SUPPORTS_IOVEC is defined in librbd.h */
-#ifdef LIBRBD_SUPPORTS_IOVEC
-#define LIBRBD_USE_IOVEC 1
-#else
-#define LIBRBD_USE_IOVEC 0
-#endif
-
 #define RBD_ENCRYPTION_LUKS_HEADER_VERIFICATION_LEN 8
 
 static const char rbd_luks_header_verification[
@@ -96,7 +82,6 @@ typedef struct RBDAIOCB {
     BlockAIOCB common;
     int64_t ret;
     QEMUIOVector *qiov;
-    char *bounce;
     RBDAIOCmd cmd;
     int error;
     struct BDRVRBDState *s;
@@ -106,7 +91,6 @@ typedef struct RADOSCB {
     RBDAIOCB *acb;
     struct BDRVRBDState *s;
     int64_t size;
-    char *buf;
     int64_t ret;
 } RADOSCB;
 
@@ -354,13 +338,9 @@ static int qemu_rbd_set_keypairs(rados_t cluster, const char *keypairs_json,
 
 static void qemu_rbd_memset(RADOSCB *rcb, int64_t offs)
 {
-    if (LIBRBD_USE_IOVEC) {
-        RBDAIOCB *acb = rcb->acb;
-        iov_memset(acb->qiov->iov, acb->qiov->niov, offs, 0,
-                   acb->qiov->size - offs);
-    } else {
-        memset(rcb->buf + offs, 0, rcb->size - offs);
-    }
+    RBDAIOCB *acb = rcb->acb;
+    iov_memset(acb->qiov->iov, acb->qiov->niov, offs, 0,
+               acb->qiov->size - offs);
 }
 
 #ifdef LIBRBD_SUPPORTS_ENCRYPTION
@@ -787,13 +767,6 @@ static void qemu_rbd_complete_aio(RADOSCB *rcb)
 
     g_free(rcb);
 
-    if (!LIBRBD_USE_IOVEC) {
-        if (acb->cmd == RBD_AIO_READ) {
-            qemu_iovec_from_buf(acb->qiov, 0, acb->bounce, acb->qiov->size);
-        }
-        qemu_vfree(acb->bounce);
-    }
-
     acb->common.cb(acb->common.opaque, (acb->ret > 0 ? 0 : acb->ret));
 
     qemu_aio_unref(acb);
@@ -1174,28 +1147,6 @@ static void rbd_finish_aiocb(rbd_completion_t c, RADOSCB *rcb)
                                      rbd_finish_bh, rcb);
 }
 
-static int rbd_aio_discard_wrapper(rbd_image_t image,
-                                   uint64_t off,
-                                   uint64_t len,
-                                   rbd_completion_t comp)
-{
-#ifdef LIBRBD_SUPPORTS_DISCARD
-    return rbd_aio_discard(image, off, len, comp);
-#else
-    return -ENOTSUP;
-#endif
-}
-
-static int rbd_aio_flush_wrapper(rbd_image_t image,
-                                 rbd_completion_t comp)
-{
-#ifdef LIBRBD_SUPPORTS_AIO_FLUSH
-    return rbd_aio_flush(image, comp);
-#else
-    return -ENOTSUP;
-#endif
-}
-
 static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
                                  int64_t off,
                                  QEMUIOVector *qiov,
@@ -1218,21 +1169,6 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
 
     rcb = g_new(RADOSCB, 1);
 
-    if (!LIBRBD_USE_IOVEC) {
-        if (cmd == RBD_AIO_DISCARD || cmd == RBD_AIO_FLUSH) {
-            acb->bounce = NULL;
-        } else {
-            acb->bounce = qemu_try_blockalign(bs, qiov->size);
-            if (acb->bounce == NULL) {
-                goto failed;
-            }
-        }
-        if (cmd == RBD_AIO_WRITE) {
-            qemu_iovec_to_buf(acb->qiov, 0, acb->bounce, qiov->size);
-        }
-        rcb->buf = acb->bounce;
-    }
-
     acb->ret = 0;
     acb->error = 0;
     acb->s = s;
@@ -1246,7 +1182,7 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
     }
 
     switch (cmd) {
-    case RBD_AIO_WRITE: {
+    case RBD_AIO_WRITE:
         /*
          * RBD APIs don't allow us to write more than actual size, so in order
          * to support growing images, we resize the image before write
@@ -1258,25 +1194,16 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
                 goto failed_completion;
             }
         }
-#ifdef LIBRBD_SUPPORTS_IOVEC
-            r = rbd_aio_writev(s->image, qiov->iov, qiov->niov, off, c);
-#else
-            r = rbd_aio_write(s->image, off, size, rcb->buf, c);
-#endif
+        r = rbd_aio_writev(s->image, qiov->iov, qiov->niov, off, c);
         break;
-    }
     case RBD_AIO_READ:
-#ifdef LIBRBD_SUPPORTS_IOVEC
-            r = rbd_aio_readv(s->image, qiov->iov, qiov->niov, off, c);
-#else
-            r = rbd_aio_read(s->image, off, size, rcb->buf, c);
-#endif
+        r = rbd_aio_readv(s->image, qiov->iov, qiov->niov, off, c);
         break;
     case RBD_AIO_DISCARD:
-        r = rbd_aio_discard_wrapper(s->image, off, size, c);
+        r = rbd_aio_discard(s->image, off, size, c);
         break;
     case RBD_AIO_FLUSH:
-        r = rbd_aio_flush_wrapper(s->image, c);
+        r = rbd_aio_flush(s->image, c);
         break;
     default:
         r = -EINVAL;
@@ -1291,9 +1218,6 @@ failed_completion:
     rbd_aio_release(c);
 failed:
     g_free(rcb);
-    if (!LIBRBD_USE_IOVEC) {
-        qemu_vfree(acb->bounce);
-    }
 
     qemu_aio_unref(acb);
     return NULL;
@@ -1319,7 +1243,6 @@ static BlockAIOCB *qemu_rbd_aio_pwritev(BlockDriverState *bs,
                          RBD_AIO_WRITE);
 }
 
-#ifdef LIBRBD_SUPPORTS_AIO_FLUSH
 static BlockAIOCB *qemu_rbd_aio_flush(BlockDriverState *bs,
                                       BlockCompletionFunc *cb,
                                       void *opaque)
@@ -1327,20 +1250,6 @@ static BlockAIOCB *qemu_rbd_aio_flush(BlockDriverState *bs,
     return rbd_start_aio(bs, 0, NULL, 0, cb, opaque, RBD_AIO_FLUSH);
 }
 
-#else
-
-static int qemu_rbd_co_flush(BlockDriverState *bs)
-{
-#if LIBRBD_VERSION_CODE >= LIBRBD_VERSION(0, 1, 1)
-    /* rbd_flush added in 0.1.1 */
-    BDRVRBDState *s = bs->opaque;
-    return rbd_flush(s->image);
-#else
-    return 0;
-#endif
-}
-#endif
-
 static int qemu_rbd_getinfo(BlockDriverState *bs, BlockDriverInfo *bdi)
 {
     BDRVRBDState *s = bs->opaque;
@@ -1546,7 +1455,6 @@ static int qemu_rbd_snap_list(BlockDriverState *bs,
     return snap_count;
 }
 
-#ifdef LIBRBD_SUPPORTS_DISCARD
 static BlockAIOCB *qemu_rbd_aio_pdiscard(BlockDriverState *bs,
                                          int64_t offset,
                                          int bytes,
@@ -1556,9 +1464,7 @@ static BlockAIOCB *qemu_rbd_aio_pdiscard(BlockDriverState *bs,
     return rbd_start_aio(bs, offset, NULL, bytes, cb, opaque,
                          RBD_AIO_DISCARD);
 }
-#endif
 
-#ifdef LIBRBD_SUPPORTS_INVALIDATE
 static void coroutine_fn qemu_rbd_co_invalidate_cache(BlockDriverState *bs,
                                                       Error **errp)
 {
@@ -1568,7 +1474,6 @@ static void coroutine_fn qemu_rbd_co_invalidate_cache(BlockDriverState *bs,
         error_setg_errno(errp, -r, "Failed to invalidate the cache");
     }
 }
-#endif
 
 static QemuOptsList qemu_rbd_create_opts = {
     .name = "rbd-create-opts",
@@ -1643,23 +1548,14 @@ static BlockDriver bdrv_rbd = {
     .bdrv_aio_preadv        = qemu_rbd_aio_preadv,
     .bdrv_aio_pwritev       = qemu_rbd_aio_pwritev,
 
-#ifdef LIBRBD_SUPPORTS_AIO_FLUSH
     .bdrv_aio_flush         = qemu_rbd_aio_flush,
-#else
-    .bdrv_co_flush_to_disk  = qemu_rbd_co_flush,
-#endif
-
-#ifdef LIBRBD_SUPPORTS_DISCARD
     .bdrv_aio_pdiscard      = qemu_rbd_aio_pdiscard,
-#endif
 
     .bdrv_snapshot_create   = qemu_rbd_snap_create,
     .bdrv_snapshot_delete   = qemu_rbd_snap_remove,
     .bdrv_snapshot_list     = qemu_rbd_snap_list,
     .bdrv_snapshot_goto     = qemu_rbd_snap_rollback,
-#ifdef LIBRBD_SUPPORTS_INVALIDATE
     .bdrv_co_invalidate_cache = qemu_rbd_co_invalidate_cache,
-#endif
 
     .strong_runtime_opts    = qemu_rbd_strong_runtime_opts,
 };
diff --git a/meson.build b/meson.build
index db6789af9c90..5974a9d4b235 100644
--- a/meson.build
+++ b/meson.build
@@ -706,13 +706,16 @@ if not get_option('rbd').auto() or have_block
       int main(void) {
         rados_t cluster;
         rados_create(&cluster, NULL);
+        #if LIBRBD_VERSION_CODE < LIBRBD_VERSION(1, 12, 0)
+        #error
+        #endif
         return 0;
       }''', dependencies: [librbd, librados])
       rbd = declare_dependency(dependencies: [librbd, librados])
     elif get_option('rbd').enabled()
-      error('could not link librados')
+      error('librbd >= 1.12.0 required')
     else
-      warning('could not link librados, disabling')
+      warning('librbd >= 1.12.0 not found, disabling')
     endif
   endif
 endif
-- 
2.19.2



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v5 2/6] block/rbd: store object_size in BDRVRBDState
  2021-07-02 17:23 [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support Ilya Dryomov
  2021-07-02 17:23 ` [PATCH v5 1/6] block/rbd: bump librbd requirement to luminous release Ilya Dryomov
@ 2021-07-02 17:23 ` Ilya Dryomov
  2021-07-02 17:23 ` [PATCH v5 3/6] block/rbd: update s->image_size in qemu_rbd_getlength Ilya Dryomov
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Ilya Dryomov @ 2021-07-02 17:23 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Daniel P. Berrange, ct, Peter Lieven, qemu-devel,
	Max Reitz, Paolo Bonzini

From: Peter Lieven <pl@kamp.de>

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
---
 block/rbd.c | 18 +++++++-----------
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/block/rbd.c b/block/rbd.c
index b4b928bbb99f..1ebf8f7e4875 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -102,6 +102,7 @@ typedef struct BDRVRBDState {
     char *snap;
     char *namespace;
     uint64_t image_size;
+    uint64_t object_size;
 } BDRVRBDState;
 
 static int qemu_rbd_connect(rados_t *cluster, rados_ioctx_t *io_ctx,
@@ -958,6 +959,7 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
     const QDictEntry *e;
     Error *local_err = NULL;
     char *keypairs, *secretid;
+    rbd_image_info_t info;
     int r;
 
     keypairs = g_strdup(qdict_get_try_str(options, "=keyvalue-pairs"));
@@ -1035,12 +1037,14 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
 #endif
     }
 
-    r = rbd_get_size(s->image, &s->image_size);
+    r = rbd_stat(s->image, &info, sizeof(info));
     if (r < 0) {
-        error_setg_errno(errp, -r, "error getting image size from %s",
+        error_setg_errno(errp, -r, "error getting image info from %s",
                          s->image_name);
         goto failed_post_open;
     }
+    s->image_size = info.size;
+    s->object_size = info.obj_size;
 
     /* If we are using an rbd snapshot, we must be r/o, otherwise
      * leave as-is */
@@ -1253,15 +1257,7 @@ static BlockAIOCB *qemu_rbd_aio_flush(BlockDriverState *bs,
 static int qemu_rbd_getinfo(BlockDriverState *bs, BlockDriverInfo *bdi)
 {
     BDRVRBDState *s = bs->opaque;
-    rbd_image_info_t info;
-    int r;
-
-    r = rbd_stat(s->image, &info, sizeof(info));
-    if (r < 0) {
-        return r;
-    }
-
-    bdi->cluster_size = info.obj_size;
+    bdi->cluster_size = s->object_size;
     return 0;
 }
 
-- 
2.19.2



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v5 3/6] block/rbd: update s->image_size in qemu_rbd_getlength
  2021-07-02 17:23 [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support Ilya Dryomov
  2021-07-02 17:23 ` [PATCH v5 1/6] block/rbd: bump librbd requirement to luminous release Ilya Dryomov
  2021-07-02 17:23 ` [PATCH v5 2/6] block/rbd: store object_size in BDRVRBDState Ilya Dryomov
@ 2021-07-02 17:23 ` Ilya Dryomov
  2021-07-02 17:23 ` [PATCH v5 4/6] block/rbd: migrate from aio to coroutines Ilya Dryomov
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Ilya Dryomov @ 2021-07-02 17:23 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Daniel P. Berrange, ct, Peter Lieven, qemu-devel,
	Max Reitz, Paolo Bonzini

From: Peter Lieven <pl@kamp.de>

While at it just call rbd_get_size and avoid rbd_image_info_t.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
---
 block/rbd.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/block/rbd.c b/block/rbd.c
index 1ebf8f7e4875..e2028d3db5ff 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -1304,15 +1304,14 @@ static ImageInfoSpecific *qemu_rbd_get_specific_info(BlockDriverState *bs,
 static int64_t qemu_rbd_getlength(BlockDriverState *bs)
 {
     BDRVRBDState *s = bs->opaque;
-    rbd_image_info_t info;
     int r;
 
-    r = rbd_stat(s->image, &info, sizeof(info));
+    r = rbd_get_size(s->image, &s->image_size);
     if (r < 0) {
         return r;
     }
 
-    return info.size;
+    return s->image_size;
 }
 
 static int coroutine_fn qemu_rbd_co_truncate(BlockDriverState *bs,
-- 
2.19.2



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v5 4/6] block/rbd: migrate from aio to coroutines
  2021-07-02 17:23 [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support Ilya Dryomov
                   ` (2 preceding siblings ...)
  2021-07-02 17:23 ` [PATCH v5 3/6] block/rbd: update s->image_size in qemu_rbd_getlength Ilya Dryomov
@ 2021-07-02 17:23 ` Ilya Dryomov
  2021-07-02 17:23 ` [PATCH v5 5/6] block/rbd: add write zeroes support Ilya Dryomov
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Ilya Dryomov @ 2021-07-02 17:23 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Daniel P. Berrange, ct, Peter Lieven, qemu-devel,
	Max Reitz, Paolo Bonzini

From: Peter Lieven <pl@kamp.de>

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
---
 block/rbd.c | 252 +++++++++++++++++++---------------------------------
 1 file changed, 90 insertions(+), 162 deletions(-)

diff --git a/block/rbd.c b/block/rbd.c
index e2028d3db5ff..380ad28861ad 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -78,22 +78,6 @@ typedef enum {
     RBD_AIO_FLUSH
 } RBDAIOCmd;
 
-typedef struct RBDAIOCB {
-    BlockAIOCB common;
-    int64_t ret;
-    QEMUIOVector *qiov;
-    RBDAIOCmd cmd;
-    int error;
-    struct BDRVRBDState *s;
-} RBDAIOCB;
-
-typedef struct RADOSCB {
-    RBDAIOCB *acb;
-    struct BDRVRBDState *s;
-    int64_t size;
-    int64_t ret;
-} RADOSCB;
-
 typedef struct BDRVRBDState {
     rados_t cluster;
     rados_ioctx_t io_ctx;
@@ -105,6 +89,13 @@ typedef struct BDRVRBDState {
     uint64_t object_size;
 } BDRVRBDState;
 
+typedef struct RBDTask {
+    BlockDriverState *bs;
+    Coroutine *co;
+    bool complete;
+    int64_t ret;
+} RBDTask;
+
 static int qemu_rbd_connect(rados_t *cluster, rados_ioctx_t *io_ctx,
                             BlockdevOptionsRbd *opts, bool cache,
                             const char *keypairs, const char *secretid,
@@ -337,13 +328,6 @@ static int qemu_rbd_set_keypairs(rados_t cluster, const char *keypairs_json,
     return ret;
 }
 
-static void qemu_rbd_memset(RADOSCB *rcb, int64_t offs)
-{
-    RBDAIOCB *acb = rcb->acb;
-    iov_memset(acb->qiov->iov, acb->qiov->niov, offs, 0,
-               acb->qiov->size - offs);
-}
-
 #ifdef LIBRBD_SUPPORTS_ENCRYPTION
 static int qemu_rbd_convert_luks_options(
         RbdEncryptionOptionsLUKSBase *luks_opts,
@@ -733,46 +717,6 @@ exit:
     return ret;
 }
 
-/*
- * This aio completion is being called from rbd_finish_bh() and runs in qemu
- * BH context.
- */
-static void qemu_rbd_complete_aio(RADOSCB *rcb)
-{
-    RBDAIOCB *acb = rcb->acb;
-    int64_t r;
-
-    r = rcb->ret;
-
-    if (acb->cmd != RBD_AIO_READ) {
-        if (r < 0) {
-            acb->ret = r;
-            acb->error = 1;
-        } else if (!acb->error) {
-            acb->ret = rcb->size;
-        }
-    } else {
-        if (r < 0) {
-            qemu_rbd_memset(rcb, 0);
-            acb->ret = r;
-            acb->error = 1;
-        } else if (r < rcb->size) {
-            qemu_rbd_memset(rcb, r);
-            if (!acb->error) {
-                acb->ret = rcb->size;
-            }
-        } else if (!acb->error) {
-            acb->ret = r;
-        }
-    }
-
-    g_free(rcb);
-
-    acb->common.cb(acb->common.opaque, (acb->ret > 0 ? 0 : acb->ret));
-
-    qemu_aio_unref(acb);
-}
-
 static char *qemu_rbd_mon_host(BlockdevOptionsRbd *opts, Error **errp)
 {
     const char **vals;
@@ -1122,89 +1066,59 @@ static int qemu_rbd_resize(BlockDriverState *bs, uint64_t size)
     return 0;
 }
 
-static const AIOCBInfo rbd_aiocb_info = {
-    .aiocb_size = sizeof(RBDAIOCB),
-};
-
-static void rbd_finish_bh(void *opaque)
+static void qemu_rbd_finish_bh(void *opaque)
 {
-    RADOSCB *rcb = opaque;
-    qemu_rbd_complete_aio(rcb);
+    RBDTask *task = opaque;
+    task->complete = 1;
+    aio_co_wake(task->co);
 }
 
 /*
- * This is the callback function for rbd_aio_read and _write
+ * This is the completion callback function for all rbd aio calls
+ * started from qemu_rbd_start_co().
  *
  * Note: this function is being called from a non qemu thread so
  * we need to be careful about what we do here. Generally we only
  * schedule a BH, and do the rest of the io completion handling
- * from rbd_finish_bh() which runs in a qemu context.
+ * from qemu_rbd_finish_bh() which runs in a qemu context.
  */
-static void rbd_finish_aiocb(rbd_completion_t c, RADOSCB *rcb)
+static void qemu_rbd_completion_cb(rbd_completion_t c, RBDTask *task)
 {
-    RBDAIOCB *acb = rcb->acb;
-
-    rcb->ret = rbd_aio_get_return_value(c);
+    task->ret = rbd_aio_get_return_value(c);
     rbd_aio_release(c);
-
-    replay_bh_schedule_oneshot_event(bdrv_get_aio_context(acb->common.bs),
-                                     rbd_finish_bh, rcb);
+    aio_bh_schedule_oneshot(bdrv_get_aio_context(task->bs),
+                            qemu_rbd_finish_bh, task);
 }
 
-static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
-                                 int64_t off,
-                                 QEMUIOVector *qiov,
-                                 int64_t size,
-                                 BlockCompletionFunc *cb,
-                                 void *opaque,
-                                 RBDAIOCmd cmd)
+static int coroutine_fn qemu_rbd_start_co(BlockDriverState *bs,
+                                          uint64_t offset,
+                                          uint64_t bytes,
+                                          QEMUIOVector *qiov,
+                                          int flags,
+                                          RBDAIOCmd cmd)
 {
-    RBDAIOCB *acb;
-    RADOSCB *rcb = NULL;
+    BDRVRBDState *s = bs->opaque;
+    RBDTask task = { .bs = bs, .co = qemu_coroutine_self() };
     rbd_completion_t c;
     int r;
 
-    BDRVRBDState *s = bs->opaque;
-
-    acb = qemu_aio_get(&rbd_aiocb_info, bs, cb, opaque);
-    acb->cmd = cmd;
-    acb->qiov = qiov;
-    assert(!qiov || qiov->size == size);
-
-    rcb = g_new(RADOSCB, 1);
+    assert(!qiov || qiov->size == bytes);
 
-    acb->ret = 0;
-    acb->error = 0;
-    acb->s = s;
-
-    rcb->acb = acb;
-    rcb->s = acb->s;
-    rcb->size = size;
-    r = rbd_aio_create_completion(rcb, (rbd_callback_t) rbd_finish_aiocb, &c);
+    r = rbd_aio_create_completion(&task,
+                                  (rbd_callback_t) qemu_rbd_completion_cb, &c);
     if (r < 0) {
-        goto failed;
+        return r;
     }
 
     switch (cmd) {
-    case RBD_AIO_WRITE:
-        /*
-         * RBD APIs don't allow us to write more than actual size, so in order
-         * to support growing images, we resize the image before write
-         * operations that exceed the current size.
-         */
-        if (off + size > s->image_size) {
-            r = qemu_rbd_resize(bs, off + size);
-            if (r < 0) {
-                goto failed_completion;
-            }
-        }
-        r = rbd_aio_writev(s->image, qiov->iov, qiov->niov, off, c);
-        break;
     case RBD_AIO_READ:
-        r = rbd_aio_readv(s->image, qiov->iov, qiov->niov, off, c);
+        r = rbd_aio_readv(s->image, qiov->iov, qiov->niov, offset, c);
+        break;
+    case RBD_AIO_WRITE:
+        r = rbd_aio_writev(s->image, qiov->iov, qiov->niov, offset, c);
         break;
     case RBD_AIO_DISCARD:
-        r = rbd_aio_discard(s->image, off, size, c);
+        r = rbd_aio_discard(s->image, offset, bytes, c);
         break;
     case RBD_AIO_FLUSH:
         r = rbd_aio_flush(s->image, c);
@@ -1214,44 +1128,69 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
     }
 
     if (r < 0) {
-        goto failed_completion;
+        error_report("rbd request failed early: cmd %d offset %" PRIu64
+                     " bytes %" PRIu64 " flags %d r %d (%s)", cmd, offset,
+                     bytes, flags, r, strerror(-r));
+        rbd_aio_release(c);
+        return r;
     }
-    return &acb->common;
 
-failed_completion:
-    rbd_aio_release(c);
-failed:
-    g_free(rcb);
+    while (!task.complete) {
+        qemu_coroutine_yield();
+    }
 
-    qemu_aio_unref(acb);
-    return NULL;
+    if (task.ret < 0) {
+        error_report("rbd request failed: cmd %d offset %" PRIu64 " bytes %"
+                     PRIu64 " flags %d task.ret %" PRIi64 " (%s)", cmd, offset,
+                     bytes, flags, task.ret, strerror(-task.ret));
+        return task.ret;
+    }
+
+    /* zero pad short reads */
+    if (cmd == RBD_AIO_READ && task.ret < qiov->size) {
+        qemu_iovec_memset(qiov, task.ret, 0, qiov->size - task.ret);
+    }
+
+    return 0;
+}
+
+static int
+coroutine_fn qemu_rbd_co_preadv(BlockDriverState *bs, uint64_t offset,
+                               uint64_t bytes, QEMUIOVector *qiov,
+                               int flags)
+{
+    return qemu_rbd_start_co(bs, offset, bytes, qiov, flags, RBD_AIO_READ);
 }
 
-static BlockAIOCB *qemu_rbd_aio_preadv(BlockDriverState *bs,
-                                       uint64_t offset, uint64_t bytes,
-                                       QEMUIOVector *qiov, int flags,
-                                       BlockCompletionFunc *cb,
-                                       void *opaque)
+static int
+coroutine_fn qemu_rbd_co_pwritev(BlockDriverState *bs, uint64_t offset,
+                                 uint64_t bytes, QEMUIOVector *qiov,
+                                 int flags)
 {
-    return rbd_start_aio(bs, offset, qiov, bytes, cb, opaque,
-                         RBD_AIO_READ);
+    BDRVRBDState *s = bs->opaque;
+    /*
+     * RBD APIs don't allow us to write more than actual size, so in order
+     * to support growing images, we resize the image before write
+     * operations that exceed the current size.
+     */
+    if (offset + bytes > s->image_size) {
+        int r = qemu_rbd_resize(bs, offset + bytes);
+        if (r < 0) {
+            return r;
+        }
+    }
+    return qemu_rbd_start_co(bs, offset, bytes, qiov, flags, RBD_AIO_WRITE);
 }
 
-static BlockAIOCB *qemu_rbd_aio_pwritev(BlockDriverState *bs,
-                                        uint64_t offset, uint64_t bytes,
-                                        QEMUIOVector *qiov, int flags,
-                                        BlockCompletionFunc *cb,
-                                        void *opaque)
+static int coroutine_fn qemu_rbd_co_flush(BlockDriverState *bs)
 {
-    return rbd_start_aio(bs, offset, qiov, bytes, cb, opaque,
-                         RBD_AIO_WRITE);
+    return qemu_rbd_start_co(bs, 0, 0, NULL, 0, RBD_AIO_FLUSH);
 }
 
-static BlockAIOCB *qemu_rbd_aio_flush(BlockDriverState *bs,
-                                      BlockCompletionFunc *cb,
-                                      void *opaque)
+static int coroutine_fn qemu_rbd_co_pdiscard(BlockDriverState *bs,
+                                             int64_t offset, int count)
 {
-    return rbd_start_aio(bs, 0, NULL, 0, cb, opaque, RBD_AIO_FLUSH);
+    return qemu_rbd_start_co(bs, offset, count, NULL, 0, RBD_AIO_DISCARD);
 }
 
 static int qemu_rbd_getinfo(BlockDriverState *bs, BlockDriverInfo *bdi)
@@ -1450,16 +1389,6 @@ static int qemu_rbd_snap_list(BlockDriverState *bs,
     return snap_count;
 }
 
-static BlockAIOCB *qemu_rbd_aio_pdiscard(BlockDriverState *bs,
-                                         int64_t offset,
-                                         int bytes,
-                                         BlockCompletionFunc *cb,
-                                         void *opaque)
-{
-    return rbd_start_aio(bs, offset, NULL, bytes, cb, opaque,
-                         RBD_AIO_DISCARD);
-}
-
 static void coroutine_fn qemu_rbd_co_invalidate_cache(BlockDriverState *bs,
                                                       Error **errp)
 {
@@ -1540,11 +1469,10 @@ static BlockDriver bdrv_rbd = {
     .bdrv_co_truncate       = qemu_rbd_co_truncate,
     .protocol_name          = "rbd",
 
-    .bdrv_aio_preadv        = qemu_rbd_aio_preadv,
-    .bdrv_aio_pwritev       = qemu_rbd_aio_pwritev,
-
-    .bdrv_aio_flush         = qemu_rbd_aio_flush,
-    .bdrv_aio_pdiscard      = qemu_rbd_aio_pdiscard,
+    .bdrv_co_preadv         = qemu_rbd_co_preadv,
+    .bdrv_co_pwritev        = qemu_rbd_co_pwritev,
+    .bdrv_co_flush_to_disk  = qemu_rbd_co_flush,
+    .bdrv_co_pdiscard       = qemu_rbd_co_pdiscard,
 
     .bdrv_snapshot_create   = qemu_rbd_snap_create,
     .bdrv_snapshot_delete   = qemu_rbd_snap_remove,
-- 
2.19.2



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v5 5/6] block/rbd: add write zeroes support
  2021-07-02 17:23 [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support Ilya Dryomov
                   ` (3 preceding siblings ...)
  2021-07-02 17:23 ` [PATCH v5 4/6] block/rbd: migrate from aio to coroutines Ilya Dryomov
@ 2021-07-02 17:23 ` Ilya Dryomov
  2021-07-02 17:23 ` [PATCH v5 6/6] block/rbd: drop qemu_rbd_refresh_limits Ilya Dryomov
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Ilya Dryomov @ 2021-07-02 17:23 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Daniel P. Berrange, ct, Peter Lieven, qemu-devel,
	Max Reitz, Paolo Bonzini

From: Peter Lieven <pl@kamp.de>

This patch deliberately sets BDRV_REQ_NO_FALLBACK and silently ignores
BDRV_REQ_MAY_UNMAP for older librbd versions.

The rationale for this is as follows (quoting Ilya Dryomov, the current
RBD maintainer):

---8<---
a) remove the BDRV_REQ_MAY_UNMAP check in qemu_rbd_co_pwrite_zeroes()
   and as a consequence always unmap if librbd is too old

   It's not clear what qemu's expectation is but in general Write
   Zeroes is allowed to unmap.  The only guarantee is that subsequent
   reads return zeroes, everything else is a hint.  This is how it is
   specified in the kernel and in the NVMe spec.

   In particular, block/nvme.c implements it as follows:

   if (flags & BDRV_REQ_MAY_UNMAP) {
       cdw12 |= (1 << 25);
   }

   This sets the Deallocate bit.  But if it's not set, the device may
   still deallocate:

   """
   If the Deallocate bit (CDW12.DEAC) is set to '1' in a Write Zeroes
   command, and the namespace supports clearing all bytes to 0h in the
   values read (e.g., bits 2:0 in the DLFEAT field are set to 001b)
   from a deallocated logical block and its metadata (excluding
   protection information), then for each specified logical block, the
   controller:
   - should deallocate that logical block;

   ...

   If the Deallocate bit is cleared to '0' in a Write Zeroes command,
   and the namespace supports clearing all bytes to 0h in the values
   read (e.g., bits 2:0 in the DLFEAT field are set to 001b) from
   a deallocated logical block and its metadata (excluding protection
   information), then, for each specified logical block, the
   controller:
   - may deallocate that logical block;
   """

   https://nvmexpress.org/wp-content/uploads/NVM-Express-NVM-Command-Set-Specification-2021.06.02-Ratified-1.pdf

b) set BDRV_REQ_NO_FALLBACK in supported_zero_flags

   Again, it's not clear what qemu expects here, but without it we end
   up in a ridiculous situation where specifying the "don't allow slow
   fallback" switch immediately fails all efficient zeroing requests on
   a device where Write Zeroes is always efficient:

   $ qemu-io -c 'help write' | grep -- '-[zun]'
    -n, -- with -z, don't allow slow fallback
    -u, -- with -z, allow unmapping
    -z, -- write zeroes using blk_co_pwrite_zeroes

   $ qemu-io -f rbd -c 'write -z -u -n 0 1M' rbd:foo/bar
   write failed: Operation not supported
--->8---

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
---
 block/rbd.c | 32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/block/rbd.c b/block/rbd.c
index 380ad28861ad..3152bc8ba00a 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -75,7 +75,8 @@ typedef enum {
     RBD_AIO_READ,
     RBD_AIO_WRITE,
     RBD_AIO_DISCARD,
-    RBD_AIO_FLUSH
+    RBD_AIO_FLUSH,
+    RBD_AIO_WRITE_ZEROES
 } RBDAIOCmd;
 
 typedef struct BDRVRBDState {
@@ -999,6 +1000,10 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
         }
     }
 
+#ifdef LIBRBD_SUPPORTS_WRITE_ZEROES
+    bs->supported_zero_flags = BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK;
+#endif
+
     /* When extending regular files, we get zeros from the OS */
     bs->supported_truncate_flags = BDRV_REQ_ZERO_WRITE;
 
@@ -1123,6 +1128,18 @@ static int coroutine_fn qemu_rbd_start_co(BlockDriverState *bs,
     case RBD_AIO_FLUSH:
         r = rbd_aio_flush(s->image, c);
         break;
+#ifdef LIBRBD_SUPPORTS_WRITE_ZEROES
+    case RBD_AIO_WRITE_ZEROES: {
+        int zero_flags = 0;
+#ifdef RBD_WRITE_ZEROES_FLAG_THICK_PROVISION
+        if (!(flags & BDRV_REQ_MAY_UNMAP)) {
+            zero_flags = RBD_WRITE_ZEROES_FLAG_THICK_PROVISION;
+        }
+#endif
+        r = rbd_aio_write_zeroes(s->image, offset, bytes, c, zero_flags, 0);
+        break;
+    }
+#endif
     default:
         r = -EINVAL;
     }
@@ -1193,6 +1210,16 @@ static int coroutine_fn qemu_rbd_co_pdiscard(BlockDriverState *bs,
     return qemu_rbd_start_co(bs, offset, count, NULL, 0, RBD_AIO_DISCARD);
 }
 
+#ifdef LIBRBD_SUPPORTS_WRITE_ZEROES
+static int
+coroutine_fn qemu_rbd_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
+                                      int count, BdrvRequestFlags flags)
+{
+    return qemu_rbd_start_co(bs, offset, count, NULL, flags,
+                             RBD_AIO_WRITE_ZEROES);
+}
+#endif
+
 static int qemu_rbd_getinfo(BlockDriverState *bs, BlockDriverInfo *bdi)
 {
     BDRVRBDState *s = bs->opaque;
@@ -1473,6 +1500,9 @@ static BlockDriver bdrv_rbd = {
     .bdrv_co_pwritev        = qemu_rbd_co_pwritev,
     .bdrv_co_flush_to_disk  = qemu_rbd_co_flush,
     .bdrv_co_pdiscard       = qemu_rbd_co_pdiscard,
+#ifdef LIBRBD_SUPPORTS_WRITE_ZEROES
+    .bdrv_co_pwrite_zeroes  = qemu_rbd_co_pwrite_zeroes,
+#endif
 
     .bdrv_snapshot_create   = qemu_rbd_snap_create,
     .bdrv_snapshot_delete   = qemu_rbd_snap_remove,
-- 
2.19.2



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v5 6/6] block/rbd: drop qemu_rbd_refresh_limits
  2021-07-02 17:23 [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support Ilya Dryomov
                   ` (4 preceding siblings ...)
  2021-07-02 17:23 ` [PATCH v5 5/6] block/rbd: add write zeroes support Ilya Dryomov
@ 2021-07-02 17:23 ` Ilya Dryomov
  2021-07-02 19:55 ` [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support Ilya Dryomov
  2021-07-06 13:19 ` Kevin Wolf
  7 siblings, 0 replies; 21+ messages in thread
From: Ilya Dryomov @ 2021-07-02 17:23 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Daniel P. Berrange, ct, Peter Lieven, qemu-devel,
	Max Reitz, Paolo Bonzini

From: Peter Lieven <pl@kamp.de>

librbd supports 1-byte alignment for all aio operations.

Currently, there is no API call to query limits from the Ceph
ObjectStore backend, so drop the bdrv_refresh_limits callback
completely until there is such an API call.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
---
 block/rbd.c | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/block/rbd.c b/block/rbd.c
index 3152bc8ba00a..01a7b94d6257 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -240,14 +240,6 @@ done:
     return;
 }
 
-
-static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
-{
-    /* XXX Does RBD support AIO on less than 512-byte alignment? */
-    bs->bl.request_alignment = 512;
-}
-
-
 static int qemu_rbd_set_auth(rados_t cluster, BlockdevOptionsRbd *opts,
                              Error **errp)
 {
@@ -1482,7 +1474,6 @@ static BlockDriver bdrv_rbd = {
     .format_name            = "rbd",
     .instance_size          = sizeof(BDRVRBDState),
     .bdrv_parse_filename    = qemu_rbd_parse_filename,
-    .bdrv_refresh_limits    = qemu_rbd_refresh_limits,
     .bdrv_file_open         = qemu_rbd_open,
     .bdrv_close             = qemu_rbd_close,
     .bdrv_reopen_prepare    = qemu_rbd_reopen_prepare,
-- 
2.19.2



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support
  2021-07-02 17:23 [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support Ilya Dryomov
                   ` (5 preceding siblings ...)
  2021-07-02 17:23 ` [PATCH v5 6/6] block/rbd: drop qemu_rbd_refresh_limits Ilya Dryomov
@ 2021-07-02 19:55 ` Ilya Dryomov
  2021-07-06 13:19 ` Kevin Wolf
  7 siblings, 0 replies; 21+ messages in thread
From: Ilya Dryomov @ 2021-07-02 19:55 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Daniel P. Berrange, ct, Peter Lieven, qemu-devel,
	Max Reitz, Paolo Bonzini

On Fri, Jul 2, 2021 at 7:24 PM Ilya Dryomov <idryomov@gmail.com> wrote:
>
> This series migrates the qemu rbd driver from the old aio emulation
> to native coroutines and adds write zeroes support which is important
> for block operations.
>
> To achieve this we first bump the librbd requirement to the already
> outdated luminous release of ceph to get rid of some wrappers and
> ifdef'ry in the code.
>
> V4->V5:
>  - rebase on Kevin's block branch (https://repo.or.cz/qemu/kevin.git block)
>    to resolve conflicts with "block/rbd: Add support for rbd image encryption"
>    patch
>
> V3->V4:
>  - this patch is now rebased on top of current master
>  - Patch 1: just mention librbd, tweak version numbers [Ilya]
>  - Patch 3: use rbd_get_size instead of rbd_stat [Ilya]
>  - Patch 4: retain comment about using a BH in the callback [Ilya]
>  - Patch 5: set BDRV_REQ_NO_FALLBACK and silently ignore BDRV_REQ_MAY_UNMAP [Ilya]
>
> V2->V3:
>  - this patch is now rebased on top of current master
>  - Patch 1: only use cc.links and not cc.run to not break
>    cross-compiling. [Kevin]
>    Since Qemu 6.1 its okay to rely on librbd >= 12.x since RHEL-7
>    support was dropped [Daniel]
>  - Patch 4: dropped
>  - Patch 5: store BDS in RBDTask and use bdrv_get_aio_context() [Kevin]
>
> V1->V2:
>  - this patch is now rebased on top of current master with Paolos
>    upcoming fixes for the meson.build script included:
>     - meson: accept either shared or static libraries if --disable-static
>     - meson: honor --enable-rbd if cc.links test fails
>  - Patch 1: adjusted to meson.build script
>  - Patch 2: unchanged
>  - Patch 3: new patch
>  - Patch 4: do not implement empty detach_aio_context callback [Jason]
>  - Patch 5: - fix aio completion cleanup in error case [Jason]
>             - return error codes from librbd
>  - Patch 6: - add support for thick provisioning [Jason]
>             - do not set write zeroes alignment
>  - Patch 7: new patch
>
> Peter Lieven (6):
>   block/rbd: bump librbd requirement to luminous release
>   block/rbd: store object_size in BDRVRBDState
>   block/rbd: update s->image_size in qemu_rbd_getlength
>   block/rbd: migrate from aio to coroutines
>   block/rbd: add write zeroes support
>   block/rbd: drop qemu_rbd_refresh_limits
>
>  block/rbd.c | 406 ++++++++++++++++------------------------------------
>  meson.build |   7 +-
>  2 files changed, 128 insertions(+), 285 deletions(-)
>
> --
> 2.19.2
>

Attempting to satisfy patchew:

Based-on: <20210627114635.39326-1-oro@il.ibm.com>
([PATCH v2] block/rbd: Add support for rbd image encryption)

Thanks,

                Ilya


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support
  2021-07-02 17:23 [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support Ilya Dryomov
                   ` (6 preceding siblings ...)
  2021-07-02 19:55 ` [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support Ilya Dryomov
@ 2021-07-06 13:19 ` Kevin Wolf
  2021-07-06 14:55   ` Peter Lieven
  7 siblings, 1 reply; 21+ messages in thread
From: Kevin Wolf @ 2021-07-06 13:19 UTC (permalink / raw)
  To: Ilya Dryomov
  Cc: Daniel P. Berrange, qemu-block, ct, Peter Lieven, qemu-devel,
	Max Reitz, Paolo Bonzini

Am 02.07.2021 um 19:23 hat Ilya Dryomov geschrieben:
> This series migrates the qemu rbd driver from the old aio emulation
> to native coroutines and adds write zeroes support which is important
> for block operations.
> 
> To achieve this we first bump the librbd requirement to the already
> outdated luminous release of ceph to get rid of some wrappers and
> ifdef'ry in the code.

Thanks, applied to the block branch.

I've only had a very quick look at the patches, but I think there is one
suggestion for a cleanup I can make: The qemu_rbd_finish_bh()
indirection is probably unnecessary now because aio_co_wake() is thread
safe.
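
A minimal sketch of what that might look like (reusing the RBDTask and
qemu_rbd_completion_cb names from patch 4; untested):

    /* still called from a librbd thread */
    static void qemu_rbd_completion_cb(rbd_completion_t c, RBDTask *task)
    {
        task->ret = rbd_aio_get_return_value(c);
        rbd_aio_release(c);
        task->complete = true;
        /* aio_co_wake() is thread-safe, so no BH indirection is needed */
        aio_co_wake(task->co);
    }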

(Also, if I were the responsible maintainer, I would prefer true/false
rather than 0/1 for bools, but that's minor. :-))

Kevin



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support
  2021-07-06 13:19 ` Kevin Wolf
@ 2021-07-06 14:55   ` Peter Lieven
  2021-07-06 15:25     ` Kevin Wolf
  0 siblings, 1 reply; 21+ messages in thread
From: Peter Lieven @ 2021-07-06 14:55 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Daniel P. Berrange, qemu-block, ct, qemu-devel, Max Reitz,
	Paolo Bonzini, Ilya Dryomov



> Am 06.07.2021 um 15:19 schrieb Kevin Wolf <kwolf@redhat.com>:
> 
> Am 02.07.2021 um 19:23 hat Ilya Dryomov geschrieben:
>> This series migrates the qemu rbd driver from the old aio emulation
>> to native coroutines and adds write zeroes support which is important
>> for block operations.
>> 
>> To achieve this we first bump the librbd requirement to the already
>> outdated luminous release of ceph to get rid of some wrappers and
>> ifdef'ry in the code.
> 
> Thanks, applied to the block branch.
> 
> I've only had a very quick look at the patches, but I think there is one
> suggestion for a cleanup I can make: The qemu_rbd_finish_bh()
> indirection is probably unnecessary now because aio_co_wake() is thread
> safe.

But this is new, isn’t it?

We also have this indirection in the iscsi and nfs drivers, I think.

Does it matter that the completion callback is called from an librbd thread? Will the coroutine continue to run in the right thread?

I will have a decent look after my vacation.

Anyway, Thanks for applying,
Peter

> 
> (Also, if I were the responsible maintainer, I would prefer true/false
> rather than 0/1 for bools, but that's minor. :-))
> 
> Kevin
> 




^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support
  2021-07-06 14:55   ` Peter Lieven
@ 2021-07-06 15:25     ` Kevin Wolf
  2021-07-06 15:48       ` Peter Lieven
  2021-07-07 18:13       ` Peter Lieven
  0 siblings, 2 replies; 21+ messages in thread
From: Kevin Wolf @ 2021-07-06 15:25 UTC (permalink / raw)
  To: Peter Lieven
  Cc: Daniel P. Berrange, qemu-block, ct, qemu-devel, Max Reitz,
	Paolo Bonzini, Ilya Dryomov

Am 06.07.2021 um 16:55 hat Peter Lieven geschrieben:
> > Am 06.07.2021 um 15:19 schrieb Kevin Wolf <kwolf@redhat.com>:
> > 
> > Am 02.07.2021 um 19:23 hat Ilya Dryomov geschrieben:
> >> This series migrates the qemu rbd driver from the old aio emulation
> >> to native coroutines and adds write zeroes support which is important
> >> for block operations.
> >> 
> >> To achieve this we first bump the librbd requirement to the already
> >> outdated luminous release of ceph to get rid of some wrappers and
> >> ifdef'ry in the code.
> > 
> > Thanks, applied to the block branch.
> > 
> > I've only had a very quick look at the patches, but I think there is one
> > suggestion for a cleanup I can make: The qemu_rbd_finish_bh()
> > indirection is probably unnecessary now because aio_co_wake() is thread
> > safe.
> 
> But this is new, isn’t it?

Not sure in what sense you mean. aio_co_wake() has always been thread
safe, as far as I know.

Obviously, the old code didn't use aio_co_wake(), but directly called
some callbacks, so the part that is new is your coroutine conversion
that enables getting rid of the BH.

> We also have this indirection in iscsi and nfs drivers I think.

Indeed, the resulting code looks the same. In iscsi and nfs it doesn't
come from an incomplete conversion to coroutines, but they both used
qemu_coroutine_enter() originally, which resumes the coroutine in the
current thread...

> Does it matter that the completion callback is called from an librbd
> thread? Will the coroutine continue to run in the right thread?

...whereas aio_co_wake() resumes the coroutine in the thread where it
was running before.

(Before commit 5f50be9b5, this would have been buggy from an librbd
thread, but now it should work correctly even for threads that are
neither iothreads nor vcpu threads.)
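
To illustrate the difference with the RBDTask coroutine handle from
patch 4 (a minimal sketch, not code from any of the patches):

    /* resumes the coroutine immediately, in the calling thread --
     * problematic if the caller is a librbd completion thread */
    qemu_coroutine_enter(task->co);

    /* schedules the coroutine in the AioContext it last ran in, so it
     * continues in the right qemu thread */
    aio_co_wake(task->co);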

> I will have a decent look after my vacation.

Sounds good, thanks. Enjoy your vacation!

Kevin



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support
  2021-07-06 15:25     ` Kevin Wolf
@ 2021-07-06 15:48       ` Peter Lieven
  2021-07-07 18:13       ` Peter Lieven
  1 sibling, 0 replies; 21+ messages in thread
From: Peter Lieven @ 2021-07-06 15:48 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Daniel P. Berrange, qemu-block, ct, qemu-devel, Max Reitz,
	Paolo Bonzini, Ilya Dryomov



> Am 06.07.2021 um 17:25 schrieb Kevin Wolf <kwolf@redhat.com>:
> 
> Am 06.07.2021 um 16:55 hat Peter Lieven geschrieben:
>>>> Am 06.07.2021 um 15:19 schrieb Kevin Wolf <kwolf@redhat.com>:
>>> 
>>> Am 02.07.2021 um 19:23 hat Ilya Dryomov geschrieben:
>>>> This series migrates the qemu rbd driver from the old aio emulation
>>>> to native coroutines and adds write zeroes support which is important
>>>> for block operations.
>>>> 
>>>> To achieve this we first bump the librbd requirement to the already
>>>> outdated luminous release of ceph to get rid of some wrappers and
>>>> ifdef'ry in the code.
>>> 
>>> Thanks, applied to the block branch.
>>> 
>>> I've only had a very quick look at the patches, but I think there is one
>>> suggestion for a cleanup I can make: The qemu_rbd_finish_bh()
>>> indirection is probably unnecessary now because aio_co_wake() is thread
>>> safe.
>> 
>> But this is new, isn’t it?
> 
> Not sure in what sense you mean. aio_co_wake() has always been thread
> safe, as far as I know.
> 
> Obviously, the old code didn't use aio_co_wake(), but directly called
> some callbacks, so the part that is new is your coroutine conversion
> that enables getting rid of the BH.
> 
>> We also have this indirection in iscsi and nfs drivers I think.
> 
> Indeed, the resulting codes look the same. In iscsi and nfs it doesn't
> come from an incomplete converstion to coroutines, but they both used
> qemu_coroutine_enter() originally, which resumes the coroutine in the
> current thread...

If I remember correctly, this would also serialize requests, and that is why we used BHs. libnfs and libiscsi are not thread-safe either, but they run entirely in qemu's threads, so that wasn't the original reason.

Thanks for the hints to the relevant commit.

I will send a follow up for rbd/nfs/iscsi in about 2 weeks.

Peter

> 
>> Does it matter that the completion callback is called from an librbd
>> thread? Will the coroutine continue to run in the right thread?
> 
> ...whereas aio_co_wake() resumes the coroutine in the thread where it
> was running before.
> 
> (Before commit 5f50be9b5, this would have been buggy from an librbd
> thread, but now it should work correctly even for threads that are
> neither iothreads nor vcpu threads.)
> 
>> I will have a decent look after my vacation.
> 
> Sounds good, thanks. Enjoy your vacation!
> 
> Kevin
> 




^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support
  2021-07-06 15:25     ` Kevin Wolf
  2021-07-06 15:48       ` Peter Lieven
@ 2021-07-07 18:13       ` Peter Lieven
  2021-07-08 12:18         ` Kevin Wolf
  1 sibling, 1 reply; 21+ messages in thread
From: Peter Lieven @ 2021-07-07 18:13 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Daniel P. Berrange, qemu-block, Christian Theune, qemu-devel,
	Max Reitz, Paolo Bonzini, Ilya Dryomov



> Am 06.07.2021 um 17:25 schrieb Kevin Wolf <kwolf@redhat.com>:
> 
> Am 06.07.2021 um 16:55 hat Peter Lieven geschrieben:
>>> Am 06.07.2021 um 15:19 schrieb Kevin Wolf <kwolf@redhat.com>:
>>> 
>>> Am 02.07.2021 um 19:23 hat Ilya Dryomov geschrieben:
>>>> This series migrates the qemu rbd driver from the old aio emulation
>>>> to native coroutines and adds write zeroes support which is important
>>>> for block operations.
>>>> 
>>>> To achieve this we first bump the librbd requirement to the already
>>>> outdated luminous release of ceph to get rid of some wrappers and
>>>> ifdef'ry in the code.
>>> 
>>> Thanks, applied to the block branch.
>>> 
>>> I've only had a very quick look at the patches, but I think there is one
>>> suggestion for a cleanup I can make: The qemu_rbd_finish_bh()
>>> indirection is probably unnecessary now because aio_co_wake() is thread
>>> safe.
>> 
>> But this is new, isn’t it?
> 
> Not sure in what sense you mean. aio_co_wake() has always been thread
> safe, as far as I know.
> 
> Obviously, the old code didn't use aio_co_wake(), but directly called
> some callbacks, so the part that is new is your coroutine conversion
> that enables getting rid of the BH.
> 
>> We also have this indirection in iscsi and nfs drivers I think.
> 
> Indeed, the resulting codes look the same. In iscsi and nfs it doesn't
> come from an incomplete converstion to coroutines, but they both used
> qemu_coroutine_enter() originally, which resumes the coroutine in the
> current thread...
> 
>> Does it matter that the completion callback is called from an librbd
>> thread? Will the coroutine continue to run in the right thread?
> 
> ...whereas aio_co_wake() resumes the coroutine in the thread where it
> was running before.
> 
> (Before commit 5f50be9b5, this would have been buggy from an librbd
> thread, but now it should work correctly even for threads that are
> neither iothreads nor vcpu threads.)
> 
>> I will have a decent look after my vacation.
> 
> Sounds good, thanks. Enjoy your vacation!


As I had to fire up my laptop to look into another issue anyway, I have sent two patches: one updating MAINTAINERS and one fixing the int vs. bool mix for task->complete. As Paolo's fix (5f50be9b5) is relatively new, there may be other non-obvious problems in removing the BH indirection, and we are close to soft freeze, so I would leave the BH removal change for 6.2.

Best,
Peter




^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support
  2021-07-07 18:13       ` Peter Lieven
@ 2021-07-08 12:18         ` Kevin Wolf
  2021-07-08 18:23           ` Peter Lieven
  0 siblings, 1 reply; 21+ messages in thread
From: Kevin Wolf @ 2021-07-08 12:18 UTC (permalink / raw)
  To: Peter Lieven
  Cc: Daniel P. Berrange, qemu-block, Christian Theune, qemu-devel,
	Max Reitz, Paolo Bonzini, Ilya Dryomov

Am 07.07.2021 um 20:13 hat Peter Lieven geschrieben:
> Am 06.07.2021 um 17:25 schrieb Kevin Wolf <kwolf@redhat.com>:
> > Am 06.07.2021 um 16:55 hat Peter Lieven geschrieben:
> >> I will have a decent look after my vacation.
> > 
> > Sounds good, thanks. Enjoy your vacation!
> 
> As I had to fire up my laptop to look into another issue anyway, I
> have sent two patches for updating MAINTAINERS and to fix the int vs.
> bool mix for task->complete.

I think you need to reevaluate your definition of vacation. ;-)

But thanks anyway.

> As Paolos fix (5f50be9b5) is relatively new and there are maybe other
> non obvious problems when removing the BH indirection and we are close
> to soft freeze I would leave the BH removal change for 6.2.

Sure, code cleanups aren't urgent.

Kevin



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support
  2021-07-08 12:18         ` Kevin Wolf
@ 2021-07-08 18:23           ` Peter Lieven
  2021-07-09 10:21             ` Kevin Wolf
  0 siblings, 1 reply; 21+ messages in thread
From: Peter Lieven @ 2021-07-08 18:23 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Daniel P. Berrange, qemu-block, Christian Theune, qemu-devel,
	Max Reitz, Paolo Bonzini, Ilya Dryomov



> Am 08.07.2021 um 14:18 schrieb Kevin Wolf <kwolf@redhat.com>:
> 
> Am 07.07.2021 um 20:13 hat Peter Lieven geschrieben:
>>> Am 06.07.2021 um 17:25 schrieb Kevin Wolf <kwolf@redhat.com>:
>>> Am 06.07.2021 um 16:55 hat Peter Lieven geschrieben:
>>>> I will have a decent look after my vacation.
>>> 
>>> Sounds good, thanks. Enjoy your vacation!
>> 
>> As I had to fire up my laptop to look into another issue anyway, I
>> have sent two patches for updating MAINTAINERS and to fix the int vs.
>> bool mix for task->complete.
> 
> I think you need to reevaluate your definition of vacation. ;-)

Let's talk about this when the kids are grown up. Sometimes sending patches can be quite relaxing :-)

> 
> But thanks anyway.
> 
>> As Paolos fix (5f50be9b5) is relatively new and there are maybe other
>> non obvious problems when removing the BH indirection and we are close
>> to soft freeze I would leave the BH removal change for 6.2.
> 
> Sure, code cleanups aren't urgent.

Isn’t the indirection also a slight performance drop?

Peter





^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support
  2021-07-08 18:23           ` Peter Lieven
@ 2021-07-09 10:21             ` Kevin Wolf
  2021-09-16 12:34               ` Peter Lieven
  0 siblings, 1 reply; 21+ messages in thread
From: Kevin Wolf @ 2021-07-09 10:21 UTC (permalink / raw)
  To: Peter Lieven
  Cc: Daniel P. Berrange, qemu-block, Christian Theune, qemu-devel,
	Max Reitz, Paolo Bonzini, Ilya Dryomov

Am 08.07.2021 um 20:23 hat Peter Lieven geschrieben:
> Am 08.07.2021 um 14:18 schrieb Kevin Wolf <kwolf@redhat.com>:
> > Am 07.07.2021 um 20:13 hat Peter Lieven geschrieben:
> >>> Am 06.07.2021 um 17:25 schrieb Kevin Wolf <kwolf@redhat.com>:
> >>> Am 06.07.2021 um 16:55 hat Peter Lieven geschrieben:
> >>>> I will have a decent look after my vacation.
> >>> 
> >>> Sounds good, thanks. Enjoy your vacation!
> >> 
> >> As I had to fire up my laptop to look into another issue anyway, I
> >> have sent two patches for updating MAINTAINERS and to fix the int vs.
> >> bool mix for task->complete.
> > 
> > I think you need to reevaluate your definition of vacation. ;-)
> 
> Lets talk about this when the kids are grown up. Sometimes sending
> patches can be quite relaxing :-)

Heh, fair enough. :-)

> > But thanks anyway.
> > 
> >> As Paolos fix (5f50be9b5) is relatively new and there are maybe other
> >> non obvious problems when removing the BH indirection and we are close
> >> to soft freeze I would leave the BH removal change for 6.2.
> > 
> > Sure, code cleanups aren't urgent.
> 
> Isn’t the indirection also a slight performance drop?

Yeah, I guess technically it is, though I doubt it's measurable.

Kevin



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support
  2021-07-09 10:21             ` Kevin Wolf
@ 2021-09-16 12:34               ` Peter Lieven
  2021-10-25 11:39                 ` Peter Lieven
  0 siblings, 1 reply; 21+ messages in thread
From: Peter Lieven @ 2021-09-16 12:34 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Daniel P. Berrange, qemu-block, Christian Theune, qemu-devel,
	Max Reitz, Paolo Bonzini, Ilya Dryomov

On 09.07.21 at 12:21, Kevin Wolf wrote:
> On 08.07.2021 at 20:23, Peter Lieven wrote:
>> On 08.07.2021 at 14:18, Kevin Wolf <kwolf@redhat.com> wrote:
>>> On 07.07.2021 at 20:13, Peter Lieven wrote:
>>>>> On 06.07.2021 at 17:25, Kevin Wolf <kwolf@redhat.com> wrote:
>>>>> On 06.07.2021 at 16:55, Peter Lieven wrote:
>>>>>> I will have a decent look after my vacation.
>>>>> Sounds good, thanks. Enjoy your vacation!
>>>> As I had to fire up my laptop to look into another issue anyway, I
>>>> have sent two patches for updating MAINTAINERS and to fix the int vs.
>>>> bool mix for task->complete.
>>> I think you need to reevaluate your definition of vacation. ;-)
>> Lets talk about this when the kids are grown up. Sometimes sending
>> patches can be quite relaxing :-)
> Heh, fair enough. :-)
>
>>> But thanks anyway.
>>>
>>>> As Paolos fix (5f50be9b5) is relatively new and there are maybe other
>>>> non obvious problems when removing the BH indirection and we are close
>>>> to soft freeze I would leave the BH removal change for 6.2.
>>> Sure, code cleanups aren't urgent.
>> Isn’t the indirection also a slight performance drop?
> Yeah, I guess technically it is, though I doubt it's measurable.


As promised, I tried to remove the indirection through the BH after the Qemu 6.1 release.

However, if I remove the BH, I run into the following assertion while running some fio tests:


qemu-system-x86_64: ../block/block-backend.c:1197: blk_wait_while_drained: Assertion `blk->in_flight > 0' failed.


Any idea?


This is what I changed:


diff --git a/block/rbd.c b/block/rbd.c
index 3cb24f9981..bc1dbc20f7 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -1063,13 +1063,6 @@ static int qemu_rbd_resize(BlockDriverState *bs, uint64_t size)
      return 0;
  }

-static void qemu_rbd_finish_bh(void *opaque)
-{
-    RBDTask *task = opaque;
-    task->complete = true;
-    aio_co_wake(task->co);
-}
-
  /*
   * This is the completion callback function for all rbd aio calls
   * started from qemu_rbd_start_co().
@@ -1083,8 +1076,8 @@ static void qemu_rbd_completion_cb(rbd_completion_t c, RBDTask *task)
  {
      task->ret = rbd_aio_get_return_value(c);
      rbd_aio_release(c);
-    aio_bh_schedule_oneshot(bdrv_get_aio_context(task->bs),
-                            qemu_rbd_finish_bh, task);
+    task->complete = true;
+    aio_co_wake(task->co);
  }
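
For easier side-by-side reading, here is the same change written out as the two
resulting completion paths instead of a diff. This is only a sketch built from
the identifiers visible above: the RBDTask layout is abbreviated to the members
the completion path touches, and the second function carries a _direct suffix
purely so that both variants can be shown next to each other.

/* Members of RBDTask used by the completion path (abbreviated). */
typedef struct RBDTask {
    BlockDriverState *bs;   /* used to look up the image's AioContext */
    Coroutine *co;          /* coroutine that issued the request in qemu_rbd_start_co() */
    bool complete;
    int64_t ret;            /* result from rbd_aio_get_return_value() */
} RBDTask;

/* Variant A (current code): the librbd callback only schedules a bottom
 * half; the wakeup itself then runs in the BlockDriverState's AioContext. */
static void qemu_rbd_finish_bh(void *opaque)
{
    RBDTask *task = opaque;
    task->complete = true;
    aio_co_wake(task->co);
}

static void qemu_rbd_completion_cb(rbd_completion_t c, RBDTask *task)
{
    task->ret = rbd_aio_get_return_value(c);
    rbd_aio_release(c);
    aio_bh_schedule_oneshot(bdrv_get_aio_context(task->bs),
                            qemu_rbd_finish_bh, task);
}

/* Variant B (the cleanup above): wake the coroutine directly from the
 * librbd callback, i.e. from a librbd-internal thread (see the Finisher
 * threads in the backtrace later in this thread) instead of from a BH
 * running in the image's AioContext. */
static void qemu_rbd_completion_cb_direct(rbd_completion_t c, RBDTask *task)
{
    task->ret = rbd_aio_get_return_value(c);
    rbd_aio_release(c);
    task->complete = true;
    aio_co_wake(task->co);
}

So the only functional difference is which thread ends up calling
aio_co_wake() and when task->complete is set, which seems like the first
thing to look at when chasing the assertion above.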


Peter





^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support
  2021-09-16 12:34               ` Peter Lieven
@ 2021-10-25 11:39                 ` Peter Lieven
  2021-10-25 12:58                   ` Kevin Wolf
  0 siblings, 1 reply; 21+ messages in thread
From: Peter Lieven @ 2021-10-25 11:39 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Daniel P. Berrange, qemu-block, Christian Theune, qemu-devel,
	Max Reitz, Paolo Bonzini, Ilya Dryomov

On 16.09.21 at 14:34, Peter Lieven wrote:
> On 09.07.21 at 12:21, Kevin Wolf wrote:
>> On 08.07.2021 at 20:23, Peter Lieven wrote:
>>> On 08.07.2021 at 14:18, Kevin Wolf <kwolf@redhat.com> wrote:
>>>> On 07.07.2021 at 20:13, Peter Lieven wrote:
>>>>>> On 06.07.2021 at 17:25, Kevin Wolf <kwolf@redhat.com> wrote:
>>>>>> On 06.07.2021 at 16:55, Peter Lieven wrote:
>>>>>>> I will have a decent look after my vacation.
>>>>>> Sounds good, thanks. Enjoy your vacation!
>>>>> As I had to fire up my laptop to look into another issue anyway, I
>>>>> have sent two patches for updating MAINTAINERS and to fix the int vs.
>>>>> bool mix for task->complete.
>>>> I think you need to reevaluate your definition of vacation. ;-)
>>> Lets talk about this when the kids are grown up. Sometimes sending
>>> patches can be quite relaxing :-)
>> Heh, fair enough. :-)
>>
>>>> But thanks anyway.
>>>>
>>>>> As Paolos fix (5f50be9b5) is relatively new and there are maybe other
>>>>> non obvious problems when removing the BH indirection and we are close
>>>>> to soft freeze I would leave the BH removal change for 6.2.
>>>> Sure, code cleanups aren't urgent.
>>> Isn’t the indirection also a slight performance drop?
>> Yeah, I guess technically it is, though I doubt it's measurable.
>
>
> As promised I was trying to remove the indirection through the BH after Qemu 6.1 release.
>
> However, if I remove the BH I run into the following assertion while running some fio tests:
>
>
> qemu-system-x86_64: ../block/block-backend.c:1197: blk_wait_while_drained: Assertion `blk->in_flight > 0' failed.
>
>
> Any idea?
>
>
> This is what I changed:
>
>
> diff --git a/block/rbd.c b/block/rbd.c
> index 3cb24f9981..bc1dbc20f7 100644
> --- a/block/rbd.c
> +++ b/block/rbd.c
> @@ -1063,13 +1063,6 @@ static int qemu_rbd_resize(BlockDriverState *bs, uint64_t size)
>      return 0;
>  }
>
> -static void qemu_rbd_finish_bh(void *opaque)
> -{
> -    RBDTask *task = opaque;
> -    task->complete = true;
> -    aio_co_wake(task->co);
> -}
> -
>  /*
>   * This is the completion callback function for all rbd aio calls
>   * started from qemu_rbd_start_co().
> @@ -1083,8 +1076,8 @@ static void qemu_rbd_completion_cb(rbd_completion_t c, RBDTask *task)
>  {
>      task->ret = rbd_aio_get_return_value(c);
>      rbd_aio_release(c);
> -    aio_bh_schedule_oneshot(bdrv_get_aio_context(task->bs),
> -                            qemu_rbd_finish_bh, task);
> +    task->complete = true;
> +    aio_co_wake(task->co);
>  }
>
>
> Peter
>
>

Kevin, Paolo, any idea?


Thanks,

Peter


-- 

Kind regards

Peter Lieven

...........................................................

   KAMP Netzwerkdienste GmbH
   Vestische Str. 89-91 | 46117 Oberhausen
   Tel: +49 (0) 208.89 402-50 | Fax: +49 (0) 208.89 402-40
   pl@kamp.de | http://www.kamp.de

   Geschäftsführer: Heiner Lante | Michael Lante
   Amtsgericht Duisburg | HRB Nr. 12154
   USt-Id-Nr.: DE 120607556

...........................................................




^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support
  2021-10-25 11:39                 ` Peter Lieven
@ 2021-10-25 12:58                   ` Kevin Wolf
  2021-10-26 14:53                     ` Peter Lieven
  0 siblings, 1 reply; 21+ messages in thread
From: Kevin Wolf @ 2021-10-25 12:58 UTC (permalink / raw)
  To: Peter Lieven
  Cc: Daniel P. Berrange, qemu-block, Christian Theune, qemu-devel,
	Max Reitz, Paolo Bonzini, Ilya Dryomov

On 25.10.2021 at 13:39, Peter Lieven wrote:
> On 16.09.21 at 14:34, Peter Lieven wrote:
> > On 09.07.21 at 12:21, Kevin Wolf wrote:
> > > On 08.07.2021 at 20:23, Peter Lieven wrote:
> > > > On 08.07.2021 at 14:18, Kevin Wolf <kwolf@redhat.com> wrote:
> > > > > On 07.07.2021 at 20:13, Peter Lieven wrote:
> > > > > > > On 06.07.2021 at 17:25, Kevin Wolf <kwolf@redhat.com> wrote:
> > > > > > > On 06.07.2021 at 16:55, Peter Lieven wrote:
> > > > > > > > I will have a decent look after my vacation.
> > > > > > > Sounds good, thanks. Enjoy your vacation!
> > > > > > As I had to fire up my laptop to look into another issue anyway, I
> > > > > > have sent two patches for updating MAINTAINERS and to fix the int vs.
> > > > > > bool mix for task->complete.
> > > > > I think you need to reevaluate your definition of vacation. ;-)
> > > > Lets talk about this when the kids are grown up. Sometimes sending
> > > > patches can be quite relaxing :-)
> > > Heh, fair enough. :-)
> > > 
> > > > > But thanks anyway.
> > > > > 
> > > > > > As Paolos fix (5f50be9b5) is relatively new and there are maybe other
> > > > > > non obvious problems when removing the BH indirection and we are close
> > > > > > to soft freeze I would leave the BH removal change for 6.2.
> > > > > Sure, code cleanups aren't urgent.
> > > > Isn’t the indirection also a slight performance drop?
> > > Yeah, I guess technically it is, though I doubt it's measurable.
> > 
> > 
> > As promised I was trying to remove the indirection through the BH after Qemu 6.1 release.
> > 
> > However, if I remove the BH I run into the following assertion while running some fio tests:
> > 
> > 
> > qemu-system-x86_64: ../block/block-backend.c:1197: blk_wait_while_drained: Assertion `blk->in_flight > 0' failed.
> > 
> > 
> > Any idea?
> > 
> > 
> > This is what I changed:
> > 
> > 
> > diff --git a/block/rbd.c b/block/rbd.c
> > index 3cb24f9981..bc1dbc20f7 100644
> > --- a/block/rbd.c
> > +++ b/block/rbd.c
> > @@ -1063,13 +1063,6 @@ static int qemu_rbd_resize(BlockDriverState *bs, uint64_t size)
> >      return 0;
> >  }
> > 
> > -static void qemu_rbd_finish_bh(void *opaque)
> > -{
> > -    RBDTask *task = opaque;
> > -    task->complete = true;
> > -    aio_co_wake(task->co);
> > -}
> > -
> >  /*
> >   * This is the completion callback function for all rbd aio calls
> >   * started from qemu_rbd_start_co().
> > @@ -1083,8 +1076,8 @@ static void qemu_rbd_completion_cb(rbd_completion_t c, RBDTask *task)
> >  {
> >      task->ret = rbd_aio_get_return_value(c);
> >      rbd_aio_release(c);
> > -    aio_bh_schedule_oneshot(bdrv_get_aio_context(task->bs),
> > -                            qemu_rbd_finish_bh, task);
> > +    task->complete = true;
> > +    aio_co_wake(task->co);
> >  }
> 
> Kevin, Paolo, any idea?

Not really, I don't see the connection between the two places.

Do you have a stack trace for the crash?

Kevin



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support
  2021-10-25 12:58                   ` Kevin Wolf
@ 2021-10-26 14:53                     ` Peter Lieven
  2021-11-15 11:17                       ` Peter Lieven
  0 siblings, 1 reply; 21+ messages in thread
From: Peter Lieven @ 2021-10-26 14:53 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Daniel P. Berrange, qemu-block, Christian Theune, qemu-devel,
	Max Reitz, Paolo Bonzini, Ilya Dryomov

On 25.10.21 at 14:58, Kevin Wolf wrote:
> On 25.10.2021 at 13:39, Peter Lieven wrote:
>> On 16.09.21 at 14:34, Peter Lieven wrote:
>>> On 09.07.21 at 12:21, Kevin Wolf wrote:
>>>> On 08.07.2021 at 20:23, Peter Lieven wrote:
>>>>> On 08.07.2021 at 14:18, Kevin Wolf <kwolf@redhat.com> wrote:
>>>>>> On 07.07.2021 at 20:13, Peter Lieven wrote:
>>>>>>>> On 06.07.2021 at 17:25, Kevin Wolf <kwolf@redhat.com> wrote:
>>>>>>>> On 06.07.2021 at 16:55, Peter Lieven wrote:
>>>>>>>>> I will have a decent look after my vacation.
>>>>>>>> Sounds good, thanks. Enjoy your vacation!
>>>>>>> As I had to fire up my laptop to look into another issue anyway, I
>>>>>>> have sent two patches for updating MAINTAINERS and to fix the int vs.
>>>>>>> bool mix for task->complete.
>>>>>> I think you need to reevaluate your definition of vacation. ;-)
>>>>> Lets talk about this when the kids are grown up. Sometimes sending
>>>>> patches can be quite relaxing :-)
>>>> Heh, fair enough. :-)
>>>>
>>>>>> But thanks anyway.
>>>>>>
>>>>>>> As Paolos fix (5f50be9b5) is relatively new and there are maybe other
>>>>>>> non obvious problems when removing the BH indirection and we are close
>>>>>>> to soft freeze I would leave the BH removal change for 6.2.
>>>>>> Sure, code cleanups aren't urgent.
>>>>> Isn’t the indirection also a slight performance drop?
>>>> Yeah, I guess technically it is, though I doubt it's measurable.
>>>
>>> As promised I was trying to remove the indirection through the BH after Qemu 6.1 release.
>>>
>>> However, if I remove the BH I run into the following assertion while running some fio tests:
>>>
>>>
>>> qemu-system-x86_64: ../block/block-backend.c:1197: blk_wait_while_drained: Assertion `blk->in_flight > 0' failed.
>>>
>>>
>>> Any idea?
>>>
>>>
>>> This is what I changed:
>>>
>>>
>>> diff --git a/block/rbd.c b/block/rbd.c
>>> index 3cb24f9981..bc1dbc20f7 100644
>>> --- a/block/rbd.c
>>> +++ b/block/rbd.c
>>> @@ -1063,13 +1063,6 @@ static int qemu_rbd_resize(BlockDriverState *bs, uint64_t size)
>>>      return 0;
>>>  }
>>>
>>> -static void qemu_rbd_finish_bh(void *opaque)
>>> -{
>>> -    RBDTask *task = opaque;
>>> -    task->complete = true;
>>> -    aio_co_wake(task->co);
>>> -}
>>> -
>>>  /*
>>>   * This is the completion callback function for all rbd aio calls
>>>   * started from qemu_rbd_start_co().
>>> @@ -1083,8 +1076,8 @@ static void qemu_rbd_completion_cb(rbd_completion_t c, RBDTask *task)
>>>  {
>>>      task->ret = rbd_aio_get_return_value(c);
>>>      rbd_aio_release(c);
>>> -    aio_bh_schedule_oneshot(bdrv_get_aio_context(task->bs),
>>> -                            qemu_rbd_finish_bh, task);
>>> +    task->complete = true;
>>> +    aio_co_wake(task->co);
>>>  }
>> Kevin, Paolo, any idea?
> Not really, I don't see the connection between both places.
>
> Do you have a stack trace for the crash?


The crash seems not to be limited to that assertion. I have also seen:


qemu-system-x86_64: ../block/block-backend.c:1497: blk_aio_write_entry: Assertion `!qiov || qiov->size == acb->bytes' failed.


Although harder to trigger, I caught this backtrace in gdb:


qemu-system-x86_64: ../block/block-backend.c:1497: blk_aio_write_entry: Assertion `!qiov || qiov->size == acb->bytes' failed.
[Switching to Thread 0x7ffff7fa8f40 (LWP 17852)]

Thread 1 "qemu-system-x86" hit Breakpoint 1, __GI_abort () at abort.c:49
49    abort.c: No such file or directory.
(gdb) bt
#0  0x00007ffff42567e0 in __GI_abort () at abort.c:49
#1  0x00007ffff424648a in __assert_fail_base (fmt=0x7ffff43cd750 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x555555e638e0 "!qiov || qiov->size == acb->bytes", file=file@entry=0x555555e634b2 "../block/block-backend.c", line=line@entry=1497, function=function@entry=0x555555e63b20 <__PRETTY_FUNCTION__.32450> "blk_aio_write_entry") at assert.c:92
#2  0x00007ffff4246502 in __GI___assert_fail (assertion=assertion@entry=0x555555e638e0 "!qiov || qiov->size == acb->bytes", file=file@entry=0x555555e634b2 "../block/block-backend.c", line=line@entry=1497, function=function@entry=0x555555e63b20 <__PRETTY_FUNCTION__.32450> "blk_aio_write_entry") at assert.c:101
#3  0x0000555555becc78 in blk_aio_write_entry (opaque=0x555556b534f0) at ../block/block-backend.c:1497
#4  0x0000555555cf0e4c in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at ../util/coroutine-ucontext.c:173
#5  0x00007ffff426e7b0 in __start_context () at /lib/x86_64-linux-gnu/libc.so.6
#6  0x00007fffffffd5a0 in  ()
#7  0x0000000000000000 in  ()


(gdb) thread apply all bt full

Thread 27 (Thread 0x7fffb89ff700 (LWP 17884)):
#0  0x00007ffff4614ad3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x5555571d9dec) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
        __ret = -512
        oldtype = 0
        err = <optimized out>
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffb89eb1d0, __canceltype = 0, __prev = 0x0}
        cbuffer = {wseq = 1103, cond = 0x5555571d9dc0, mutex = 0x5555571d9df8, private = 0}
        err = <optimized out>
        g = 1
        flags = <optimized out>
        signals = <optimized out>
        result = 0
        seq = 551
#1  0x00007ffff4614ad3 in __pthread_cond_wait_common (abstime=0x0, mutex=0x5555571d9df8, cond=0x5555571d9dc0) at pthread_cond_wait.c:502
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffb89eb1d0, __canceltype = 0, __prev = 0x0}
        cbuffer = {wseq = 1103, cond = 0x5555571d9dc0, mutex = 0x5555571d9df8, private = 0}
        err = <optimized out>
        g = 1
        flags = <optimized out>
        signals = <optimized out>
        result = 0
        seq = 551
#2  0x00007ffff4614ad3 in __pthread_cond_wait (cond=cond@entry=0x5555571d9dc0, mutex=mutex@entry=0x5555571d9df8) at pthread_cond_wait.c:655
#3  0x0000555555ce0d0b in qemu_cond_wait_impl (cond=0x5555571d9dc0, mutex=0x5555571d9df8, file=0x555555d65554 "../ui/vnc-jobs.c", line=248) at ../util/qemu-thread-posix.c:194
        err = <optimized out>
        __PRETTY_FUNCTION__ = "qemu_cond_wait_impl"
        __func__ = "qemu_cond_wait_impl"
#4  0x000055555584d1e9 in vnc_worker_thread_loop (queue=queue@entry=0x5555571d9dc0) at ../ui/vnc-jobs.c:248
        _f = <optimized out>
        job = <optimized out>
        entry = <optimized out>
        tmp = <optimized out>
        vs = Python Exception <class 'gdb.error'> value of type `VncState' requires 66304 bytes, which is more than max-value-size:
#5  0x000055555584df48 in vnc_worker_thread (arg=arg@entry=0x5555571d9dc0) at ../ui/vnc-jobs.c:361
        queue = 0x5555571d9dc0
#6  0x0000555555ce05b6 in qemu_thread_start (args=<optimized out>) at ../util/qemu-thread-posix.c:557
        __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {93825029951232, -891796517554577268, 140736290863040, 0, 93825029951232, 140737488346704, 891927086486954124, 6428070804076404876}, __mask_was_saved = 0}}, __pad = {0x7fffb89fb710, 0x0, 0x0, 0x0}}
        __cancel_routine = 0x555555ce0610 <qemu_thread_atexit_notify>
        __not_first_call = <optimized out>
        qemu_thread_args = <optimized out>
        start_routine = 0x55555584df30 <vnc_worker_thread>
        arg = 0x5555571d9dc0
        r = <optimized out>
#7  0x00007ffff460e6db in start_thread (arg=0x7fffb89ff700) at pthread_create.c:463
        pd = 0x7fffb89ff700
        now = <optimized out>
        unwind_buf =
              {cancel_jmp_buf = {{jmp_buf = {140736290879232, -891796517554577268, 140736290863040, 0, 93825029951232, 140737488346704, 891927086447108236, 891812829063979148}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#8  0x00007ffff433771f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 26 (Thread 0x7fffba7fc700 (LWP 17882)):
#0  0x00007ffff432c317 in ioctl () at ../sysdeps/unix/syscall-template.S:78
#1  0x0000555555b7a327 in kvm_vcpu_ioctl (cpu=cpu@entry=0x555556c49090, type=type@entry=44672) at ../accel/kvm/kvm-all.c:3020
        ret = <optimized out>
        arg = 0x0
        ap = {{gp_offset = 16, fp_offset = 32767, overflow_arg_area = 0x7fffba7f85a0, reg_save_area = 0x7fffba7f8530}}
#2  0x0000555555b7a49a in kvm_cpu_exec (cpu=cpu@entry=0x555556c49090) at ../accel/kvm/kvm-all.c:2846
        attrs = {unspecified = 0, secure = 0, user = 0, requester_id = 0, byte_swap = 0, target_tlb_bit0 = 0, target_tlb_bit1 = 0, target_tlb_bit2 = 0}
        run = <optimized out>
        ret = <optimized out>
        run_ret = <optimized out>
#3  0x0000555555b7b84d in kvm_vcpu_thread_fn (arg=arg@entry=0x555556c49090) at ../accel/kvm/kvm-accel-ops.c:49
        cpu = 0x555556c49090
        r = <optimized out>
#4  0x0000555555ce05b6 in qemu_thread_start (args=<optimized out>) at ../util/qemu-thread-posix.c:557
        __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {93825016352880, -891796517554577268, 140736322308032, 0, 93825016352880, 140737488345904, 891931759948243084, 6428070804076404876}, __mask_was_saved = 0}}, __pad = {0x7fffba7f8710, 0x0, 0x0, 0x0}}
        __cancel_routine = 0x555555ce0610 <qemu_thread_atexit_notify>
        __not_first_call = <optimized out>

        qemu_thread_args = <optimized out>
        start_routine = 0x555555b7b7b0 <kvm_vcpu_thread_fn>
        arg = 0x555556c49090
        r = <optimized out>
#5  0x00007ffff460e6db in start_thread (arg=0x7fffba7fc700) at pthread_create.c:463
        pd = 0x7fffba7fc700
        now = <optimized out>
        unwind_buf =
              {cancel_jmp_buf = {{jmp_buf = {140736322324224, -891796517554577268, 140736322308032, 0, 93825016352880, 140737488345904, 891931759908397196, 891812829063979148}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#6  0x00007ffff433771f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 25 (Thread 0x7fffbaffd700 (LWP 17879)):
#0  0x00007ffff4615065 in futex_abstimed_wait_cancelable (private=<optimized out>, abstime=0x7fffbaff9670, expected=0, futex_word=0x7fffbc003288) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
        __ret = -516
        oldtype = 0
        err = <optimized out>
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffbaff95c0, __canceltype = -1157654928, __prev = 0x0}
        cbuffer = {wseq = 212, cond = 0x7fffbc003260, mutex = 0x7fffb408d6a8, private = 0}
        err = <optimized out>
        g = 0
        flags = <optimized out>
        maxspin = 0
        signals = <optimized out>
        result = 0
        seq = 106
#1  0x00007ffff4615065 in __pthread_cond_wait_common (abstime=0x7fffbaff9670, mutex=0x7fffb408d6a8, cond=0x7fffbc003260) at pthread_cond_wait.c:539
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffbaff95c0, __canceltype = -1157654928, __prev = 0x0}
        cbuffer = {wseq = 212, cond = 0x7fffbc003260, mutex = 0x7fffb408d6a8, private = 0}
        err = <optimized out>
        g = 0
        flags = <optimized out>
        maxspin = 0
        signals = <optimized out>
        result = 0
        seq = 106
#2  0x00007ffff4615065 in __pthread_cond_timedwait (cond=0x7fffbc003260, mutex=0x7fffb408d6a8, abstime=0x7fffbaff9670) at pthread_cond_wait.c:667
#3  0x00007ffff54df350 in  () at /usr/lib/librbd.so.1
#4  0x00007ffff54f501d in  () at /usr/lib/librbd.so.1
#5  0x00007ffff460e6db in start_thread (arg=0x7fffbaffd700) at pthread_create.c:463
        pd = 0x7fffbaffd700
        now = <optimized out>
        unwind_buf =
              {cancel_jmp_buf = {{jmp_buf = {140736330716928, -891796517554577268, 140736330700736, 0, 140736347517600, 140736448130592, 891930660933640332, 891812829063979148}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#6  0x00007ffff433771f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 24 (Thread 0x7fffbb7fe700 (LWP 17878)):
#0  0x00007ffff4614ad3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7fffbc003320) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
        __ret = -512
        oldtype = 0
        err = <optimized out>
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffbb7fa5c0, __canceltype = 1453715840, __prev = 0x0}
        cbuffer = {wseq = 553868, cond = 0x7fffbc0032f8, mutex = 0x7fffbc0032d0, private = 0}
        err = <optimized out>
        g = 0
        flags = <optimized out>
        signals = <optimized out>
        result = 0
        seq = 276934
#1  0x00007ffff4614ad3 in __pthread_cond_wait_common (abstime=0x0, mutex=0x7fffbc0032d0, cond=0x7fffbc0032f8) at pthread_cond_wait.c:502
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffbb7fa5c0, __canceltype = 1453715840, __prev = 0x0}
        cbuffer = {wseq = 553868, cond = 0x7fffbc0032f8, mutex = 0x7fffbc0032d0, private = 0}
        err = <optimized out>
        g = 0
        flags = <optimized out>

        signals = <optimized out>
        result = 0
        seq = 276934
#2  0x00007ffff4614ad3 in __pthread_cond_wait (cond=0x7fffbc0032f8, mutex=0x7fffbc0032d0) at pthread_cond_wait.c:655
#3  0x00007fffe7bd48bc in std::condition_variable::wait(std::unique_lock<std::mutex>&) () at /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007fffe81558bc in Finisher::finisher_thread_entry() () at /usr/lib/ceph/libceph-common.so.0
#5  0x00007ffff460e6db in start_thread (arg=0x7fffbb7fe700) at pthread_create.c:463
        pd = 0x7fffbb7fe700
        now = <optimized out>
        unwind_buf =
              {cancel_jmp_buf = {{jmp_buf = {140736339109632, -891796517554577268, 140736339093440, 0, 140736347517856, 140736448130320, 891929559811399820, 891812829063979148}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#6  0x00007ffff433771f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 23 (Thread 0x7fffbbfff700 (LWP 17877)):
#0  0x00007ffff4614ad3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7fffb400b108) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
        __ret = -512
        oldtype = 0
        err = <optimized out>
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffbbffb5c0, __canceltype = 0, __prev = 0x0}
        cbuffer = {wseq = 0, cond = 0x7fffb400b0e0, mutex = 0x7fffb400b0b8, private = 0}
        err = <optimized out>
        g = 0
        flags = <optimized out>
        signals = <optimized out>
        result = 0
        seq = 0
#1  0x00007ffff4614ad3 in __pthread_cond_wait_common (abstime=0x0, mutex=0x7fffb400b0b8, cond=0x7fffb400b0e0) at pthread_cond_wait.c:502
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffbbffb5c0, __canceltype = 0, __prev = 0x0}
        cbuffer = {wseq = 0, cond = 0x7fffb400b0e0, mutex = 0x7fffb400b0b8, private = 0}
        err = <optimized out>
        g = 0
        flags = <optimized out>
        signals = <optimized out>
        result = 0
        seq = 0
#2  0x00007ffff4614ad3 in __pthread_cond_wait (cond=0x7fffb400b0e0, mutex=0x7fffb400b0b8) at pthread_cond_wait.c:655
#3  0x00007fffe7bd48bc in std::condition_variable::wait(std::unique_lock<std::mutex>&) () at /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007fffe81558bc in Finisher::finisher_thread_entry() () at /usr/lib/ceph/libceph-common.so.0
#5  0x00007ffff460e6db in start_thread (arg=0x7fffbbfff700) at pthread_create.c:463
        pd = 0x7fffbbfff700
        now = <optimized out>
        unwind_buf =
              {cancel_jmp_buf = {{jmp_buf = {140736347502336, -891796517554577268, 140736347486144, 0, 140736213332360, 140736456523312, 891928460836642956, 891812829063979148}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#6  0x00007ffff433771f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 22 (Thread 0x7fffc0ff9700 (LWP 17876)):
#0  0x00007ffff4614ad3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7fffb400ae08) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
        __ret = -512
        oldtype = 0
        err = <optimized out>
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffc0ff55d0, __canceltype = -1057008016, __prev = 0x0}
        cbuffer = {wseq = 0, cond = 0x7fffb400ade0, mutex = 0x7fffb4006bf0, private = 0}
        err = <optimized out>
        g = 0
        flags = <optimized out>
        signals = <optimized out>
        result = 0
        seq = 0
#1  0x00007ffff4614ad3 in __pthread_cond_wait_common (abstime=0x0, mutex=0x7fffb4006bf0, cond=0x7fffb400ade0) at pthread_cond_wait.c:502
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffc0ff55d0, __canceltype = -1057008016, __prev = 0x0}
        cbuffer = {wseq = 0, cond = 0x7fffb400ade0, mutex = 0x7fffb4006bf0, private = 0}
        err = <optimized out>
        g = 0
        flags = <optimized out>

        signals = <optimized out>
        result = 0
        seq = 0
#2  0x00007ffff4614ad3 in __pthread_cond_wait (cond=0x7fffb400ade0, mutex=0x7fffb4006bf0) at pthread_cond_wait.c:655
#3  0x00007fffe81a20c5 in SafeTimer::timer_thread() () at /usr/lib/ceph/libceph-common.so.0
#4  0x00007fffe81a383d in SafeTimerThread::entry() () at /usr/lib/ceph/libceph-common.so.0
#5  0x00007ffff460e6db in start_thread (arg=0x7fffc0ff9700) at pthread_create.c:463
        pd = 0x7fffc0ff9700
        now = <optimized out>
        unwind_buf =
              {cancel_jmp_buf = {{jmp_buf = {140736431363840, -891796517554577268, 140736431347648, 0, 140736213314176, 140736456523312, 891767926770278540, 891812829063979148}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#6  0x00007ffff433771f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 21 (Thread 0x7fffc17fa700 (LWP 17875)):
#0  0x00007ffff4615065 in futex_abstimed_wait_cancelable (private=<optimized out>, abstime=0x7fffc17f6670, expected=0, futex_word=0x5555567968ac) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
        __ret = -516
        oldtype = 0
        err = <optimized out>
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffc17f65d0, __canceltype = -1048615312, __prev = 0x0}
        cbuffer = {wseq = 349, cond = 0x555556796880, mutex = 0x555556796958, private = 0}
        err = <optimized out>
        g = 1
        flags = <optimized out>
        maxspin = 0
        signals = <optimized out>
        result = 0
        seq = 174
#1  0x00007ffff4615065 in __pthread_cond_wait_common (abstime=0x7fffc17f6670, mutex=0x555556796958, cond=0x555556796880) at pthread_cond_wait.c:539
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffc17f65d0, __canceltype = -1048615312, __prev = 0x0}
        cbuffer = {wseq = 349, cond = 0x555556796880, mutex = 0x555556796958, private = 0}
        err = <optimized out>
        g = 1
        flags = <optimized out>
        maxspin = 0
        signals = <optimized out>
        result = 0
        seq = 174
#2  0x00007ffff4615065 in __pthread_cond_timedwait (cond=0x555556796880, mutex=0x555556796958, abstime=0x7fffc17f6670) at pthread_cond_wait.c:667
#3  0x00007fffe81a24f2 in SafeTimer::timer_thread() () at /usr/lib/ceph/libceph-common.so.0
#4  0x00007fffe81a383d in SafeTimerThread::entry() () at /usr/lib/ceph/libceph-common.so.0
#5  0x00007ffff460e6db in start_thread (arg=0x7fffc17fa700) at pthread_create.c:463
        pd = 0x7fffc17fa700
        now = <optimized out>
        unwind_buf =
              {cancel_jmp_buf = {{jmp_buf = {140736439756544, -891796517554577268, 140736439740352, 0, 93825015256320, 140737488342304, 891766825648038028, 891812829063979148}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#6  0x00007ffff433771f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 20 (Thread 0x7fffc1ffb700 (LWP 17874)):
#0  0x00007ffff4615065 in futex_abstimed_wait_cancelable (private=<optimized out>, abstime=0x7fffc1ff7490, expected=0, futex_word=0x5555567bbd24) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
        __ret = -516
        oldtype = 0
        err = <optimized out>
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffc1ff73b0, __canceltype = 0, __prev = 0x0}
        cbuffer = {wseq = 1339747, cond = 0x5555567bbcf8, mutex = 0x5555567bbcd0, private = 0}
        err = <optimized out>
        g = 1
        flags = <optimized out>
        maxspin = 0
        signals = <optimized out>
        result = 0
        seq = 669873
#1  0x00007ffff4615065 in __pthread_cond_wait_common (abstime=0x7fffc1ff7490, mutex=0x5555567bbcd0, cond=0x5555567bbcf8) at pthread_cond_wait.c:539
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffc1ff73b0, __canceltype = 0, __prev = 0x0}
        cbuffer = {wseq = 1339747, cond = 0x5555567bbcf8, mutex = 0x5555567bbcd0, private = 0}

        err = <optimized out>
        g = 1
        flags = <optimized out>
        maxspin = 0
        signals = <optimized out>
        result = 0
        seq = 669873
#2  0x00007ffff4615065 in __pthread_cond_timedwait (cond=0x5555567bbcf8, mutex=0x5555567bbcd0, abstime=0x7fffc1ff7490) at pthread_cond_wait.c:667
#3  0x00007fffe81b5946 in ThreadPool::worker(ThreadPool::WorkThread*) () at /usr/lib/ceph/libceph-common.so.0
#4  0x00007fffe81b6c61 in ThreadPool::WorkThread::entry() () at /usr/lib/ceph/libceph-common.so.0
#5  0x00007ffff460e6db in start_thread (arg=0x7fffc1ffb700) at pthread_create.c:463
        pd = 0x7fffc1ffb700
        now = <optimized out>
        unwind_buf =
              {cancel_jmp_buf = {{jmp_buf = {140736448149248, -891796517554577268, 140736448133056, 0, 93825015298832, 140737488342448, 891765726673281164, 891812829063979148}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#6  0x00007ffff433771f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 19 (Thread 0x7fffc27fc700 (LWP 17873)):
#0  0x00007ffff4614ad3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x555556a09250) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
        __ret = -512
        oldtype = 0
        err = <optimized out>
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffc27f85c0, __canceltype = -1031830016, __prev = 0x0}
        cbuffer = {wseq = 490660, cond = 0x555556a09228, mutex = 0x555556a09200, private = 0}
        err = <optimized out>
        g = 0
        flags = <optimized out>
        signals = <optimized out>
        result = 0
        seq = 245330
#1  0x00007ffff4614ad3 in __pthread_cond_wait_common (abstime=0x0, mutex=0x555556a09200, cond=0x555556a09228) at pthread_cond_wait.c:502
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffc27f85c0, __canceltype = -1031830016, __prev = 0x0}
        cbuffer = {wseq = 490660, cond = 0x555556a09228, mutex = 0x555556a09200, private = 0}
        err = <optimized out>
        g = 0
        flags = <optimized out>
        signals = <optimized out>
        result = 0
        seq = 245330
#2  0x00007ffff4614ad3 in __pthread_cond_wait (cond=0x555556a09228, mutex=0x555556a09200) at pthread_cond_wait.c:655
#3  0x00007fffe7bd48bc in std::condition_variable::wait(std::unique_lock<std::mutex>&) () at /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007fffe81558bc in Finisher::finisher_thread_entry() () at /usr/lib/ceph/libceph-common.so.0
#5  0x00007ffff460e6db in start_thread (arg=0x7fffc27fc700) at pthread_create.c:463
        pd = 0x7fffc27fc700
        now = <optimized out>
        unwind_buf =
              {cancel_jmp_buf = {{jmp_buf = {140736456541952, -891796517554577268, 140736456525760, 0, 93825013945040, 140737488342576, 891773430233997452, 891812829063979148}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#6  0x00007ffff433771f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 18 (Thread 0x7fffc2ffd700 (LWP 17872)):
#0  0x00007ffff4614ad3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x555556a090b0) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
        __ret = -512
        oldtype = 0
        err = <optimized out>
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffc2ff95d0, __canceltype = -1023437200, __prev = 0x0}
        cbuffer = {wseq = 0, cond = 0x555556a09088, mutex = 0x555556a08ff0, private = 0}
        err = <optimized out>
        g = 0
        flags = <optimized out>
        signals = <optimized out>
        result = 0
        seq = 0
#1  0x00007ffff4614ad3 in __pthread_cond_wait_common (abstime=0x0, mutex=0x555556a08ff0, cond=0x555556a09088) at pthread_cond_wait.c:502
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffc2ff95d0, __canceltype = -1023437200, __prev = 0x0}

        cbuffer = {wseq = 0, cond = 0x555556a09088, mutex = 0x555556a08ff0, private = 0}
        err = <optimized out>
        g = 0
        flags = <optimized out>
        signals = <optimized out>
        result = 0
        seq = 0
#2  0x00007ffff4614ad3 in __pthread_cond_wait (cond=0x555556a09088, mutex=0x555556a08ff0) at pthread_cond_wait.c:655
#3  0x00007fffe81a20c5 in SafeTimer::timer_thread() () at /usr/lib/ceph/libceph-common.so.0
#4  0x00007fffe81a383d in SafeTimerThread::entry() () at /usr/lib/ceph/libceph-common.so.0
#5  0x00007ffff460e6db in start_thread (arg=0x7fffc2ffd700) at pthread_create.c:463
        pd = 0x7fffc2ffd700
        now = <optimized out>
        unwind_buf =
              {cancel_jmp_buf = {{jmp_buf = {140736464934656, -891796517554577268, 140736464918464, 0, 93825014997344, 140737488342576, 891772331259240588, 891812829063979148}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#6  0x00007ffff433771f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 17 (Thread 0x7fffc37fe700 (LWP 17871)):
#0  0x00007ffff4614ad3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x555556a08d78) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
        __ret = -512
        oldtype = 0
        err = <optimized out>
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffc37fa5d0, __canceltype = -1015044496, __prev = 0x0}
        cbuffer = {wseq = 0, cond = 0x555556a08d50, mutex = 0x555556a08cb0, private = 0}
        err = <optimized out>
        g = 0
        flags = <optimized out>
        signals = <optimized out>
        result = 0
        seq = 0
#1  0x00007ffff4614ad3 in __pthread_cond_wait_common (abstime=0x0, mutex=0x555556a08cb0, cond=0x555556a08d50) at pthread_cond_wait.c:502
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffc37fa5d0, __canceltype = -1015044496, __prev = 0x0}
        cbuffer = {wseq = 0, cond = 0x555556a08d50, mutex = 0x555556a08cb0, private = 0}
        err = <optimized out>
        g = 0
        flags = <optimized out>
        signals = <optimized out>
        result = 0
        seq = 0
#2  0x00007ffff4614ad3 in __pthread_cond_wait (cond=0x555556a08d50, mutex=0x555556a08cb0) at pthread_cond_wait.c:655
#3  0x00007fffe81a20c5 in SafeTimer::timer_thread() () at /usr/lib/ceph/libceph-common.so.0
#4  0x00007fffe81a383d in SafeTimerThread::entry() () at /usr/lib/ceph/libceph-common.so.0
#5  0x00007ffff460e6db in start_thread (arg=0x7fffc37fe700) at pthread_create.c:463
        pd = 0x7fffc37fe700
        now = <optimized out>
        unwind_buf =
              {cancel_jmp_buf = {{jmp_buf = {140736473327360, -891796517554577268, 140736473311168, 0, 93825014995888, 140737488342544, 891771230137000076, 891812829063979148}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#6  0x00007ffff433771f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 16 (Thread 0x7fffc3fff700 (LWP 17870)):
#0  0x00007ffff4614ad3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x555556a0885c) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
        __ret = -512
        oldtype = 0
        err = <optimized out>
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffc3ffb5c0, __canceltype = -735858672, __prev = 0x0}
        cbuffer = {wseq = 7, cond = 0x555556a08830, mutex = 0x555556a08808, private = 0}
        err = <optimized out>
        g = 1
        flags = <optimized out>
        signals = <optimized out>
        result = 0
        seq = 3
#1  0x00007ffff4614ad3 in __pthread_cond_wait_common (abstime=0x0, mutex=0x555556a08808, cond=0x555556a08830) at pthread_cond_wait.c:502
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffc3ffb5c0, __canceltype = -735858672, __prev = 0x0}

        cbuffer = {wseq = 7, cond = 0x555556a08830, mutex = 0x555556a08808, private = 0}
        err = <optimized out>
        g = 1
        flags = <optimized out>
        signals = <optimized out>
        result = 0
        seq = 3
#2  0x00007ffff4614ad3 in __pthread_cond_wait (cond=0x555556a08830, mutex=0x555556a08808) at pthread_cond_wait.c:655
#3  0x00007fffe7bd48bc in std::condition_variable::wait(std::unique_lock<std::mutex>&) () at /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007fffe81558bc in Finisher::finisher_thread_entry() () at /usr/lib/ceph/libceph-common.so.0
#5  0x00007ffff460e6db in start_thread (arg=0x7fffc3fff700) at pthread_create.c:463
        pd = 0x7fffc3fff700
        now = <optimized out>
        unwind_buf =
              {cancel_jmp_buf = {{jmp_buf = {140736481720064, -891796517554577268, 140736481703872, 0, 93825013942488, 140737488342432, 891770131162243212, 891812829063979148}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#6  0x00007ffff433771f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 15 (Thread 0x7fffd8ff9700 (LWP 17869)):
#0  0x00007ffff4615065 in futex_abstimed_wait_cancelable (private=<optimized out>, abstime=0x7fffd8ff5670, expected=0, futex_word=0x555556a08778) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
        __ret = -516
        oldtype = 0
        err = <optimized out>
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffd8ff55d0, __canceltype = -654354943, __prev = 0x0}
        cbuffer = {wseq = 22, cond = 0x555556a08750, mutex = 0x555556a086f8, private = 0}
        err = <optimized out>
        g = 0
        flags = <optimized out>
        maxspin = 0
        signals = <optimized out>
        result = 0
        seq = 11
#1  0x00007ffff4615065 in __pthread_cond_wait_common (abstime=0x7fffd8ff5670, mutex=0x555556a086f8, cond=0x555556a08750) at pthread_cond_wait.c:539
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffd8ff55d0, __canceltype = -654354943, __prev = 0x0}
        cbuffer = {wseq = 22, cond = 0x555556a08750, mutex = 0x555556a086f8, private = 0}
        err = <optimized out>
        g = 0
        flags = <optimized out>
        maxspin = 0
        signals = <optimized out>
        result = 0
        seq = 11
#2  0x00007ffff4615065 in __pthread_cond_timedwait (cond=0x555556a08750, mutex=0x555556a086f8, abstime=0x7fffd8ff5670) at pthread_cond_wait.c:667
#3  0x00007fffe81a24f2 in SafeTimer::timer_thread() () at /usr/lib/ceph/libceph-common.so.0
#4  0x00007fffe81a383d in SafeTimerThread::entry() () at /usr/lib/ceph/libceph-common.so.0
#5  0x00007ffff460e6db in start_thread (arg=0x7fffd8ff9700) at pthread_create.c:463
        pd = 0x7fffd8ff9700
        now = <optimized out>
        unwind_buf =
              {cancel_jmp_buf = {{jmp_buf = {140736834017024, -891796517554577268, 140736834000832, 0, 93825014947296, 140737488342432, 891715150212145292, 891812829063979148}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#6  0x00007ffff433771f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 14 (Thread 0x7fffd97fa700 (LWP 17868)):
#0  0x00007ffff4614ad3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x555556a4c720) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
        __ret = -512
        oldtype = 0
        err = <optimized out>
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffd97f65c0, __canceltype = 0, __prev = 0x0}
        cbuffer = {wseq = 0, cond = 0x555556a4c6f8, mutex = 0x555556a4c6b8, private = 0}
        err = <optimized out>
        g = 0
        flags = <optimized out>
        signals = <optimized out>
        result = 0
        seq = 0
#1  0x00007ffff4614ad3 in __pthread_cond_wait_common (abstime=0x0, mutex=0x555556a4c6b8, cond=0x555556a4c6f8) at pthread_cond_wait.c:502

        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffd97f65c0, __canceltype = 0, __prev = 0x0}
        cbuffer = {wseq = 0, cond = 0x555556a4c6f8, mutex = 0x555556a4c6b8, private = 0}
        err = <optimized out>
        g = 0
        flags = <optimized out>
        signals = <optimized out>
        result = 0
        seq = 0
#2  0x00007ffff4614ad3 in __pthread_cond_wait (cond=0x555556a4c6f8, mutex=0x555556a4c6b8) at pthread_cond_wait.c:655
#3  0x00007fffe833966c in DispatchQueue::run_local_delivery() () at /usr/lib/ceph/libceph-common.so.0
#4  0x00007fffe83ebf4d in DispatchQueue::LocalDeliveryThread::entry() () at /usr/lib/ceph/libceph-common.so.0
#5  0x00007ffff460e6db in start_thread (arg=0x7fffd97fa700) at pthread_create.c:463
        pd = 0x7fffd97fa700
        now = <optimized out>
        unwind_buf =
              {cancel_jmp_buf = {{jmp_buf = {140736842409728, -891796517554577268, 140736842393536, 0, 93825014220680, 140737488342544, 891714049089904780, 891812829063979148}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#6  0x00007ffff433771f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 13 (Thread 0x7fffd9ffb700 (LWP 17867)):
#0  0x00007ffff4614ad3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x555556a4c574) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
        __ret = -512
        oldtype = 0
        err = <optimized out>
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffd9ff7280, __canceltype = 0, __prev = 0x0}
        cbuffer = {wseq = 35, cond = 0x555556a4c548, mutex = 0x555556a4c508, private = 0}
        err = <optimized out>
        g = 1
        flags = <optimized out>
        signals = <optimized out>
        result = 0
        seq = 17
#1  0x00007ffff4614ad3 in __pthread_cond_wait_common (abstime=0x0, mutex=0x555556a4c508, cond=0x555556a4c548) at pthread_cond_wait.c:502
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffd9ff7280, __canceltype = 0, __prev = 0x0}
        cbuffer = {wseq = 35, cond = 0x555556a4c548, mutex = 0x555556a4c508, private = 0}
        err = <optimized out>
        g = 1
        flags = <optimized out>
        signals = <optimized out>
        result = 0
        seq = 17
#2  0x00007ffff4614ad3 in __pthread_cond_wait (cond=0x555556a4c548, mutex=0x555556a4c508) at pthread_cond_wait.c:655
#3  0x00007fffe833a285 in DispatchQueue::entry() () at /usr/lib/ceph/libceph-common.so.0
#4  0x00007fffe83ebe3d in DispatchQueue::DispatchThread::entry() () at /usr/lib/ceph/libceph-common.so.0
#5  0x00007ffff460e6db in start_thread (arg=0x7fffd9ffb700) at pthread_create.c:463
        pd = 0x7fffd9ffb700
        now = <optimized out>
        unwind_buf =
              {cancel_jmp_buf = {{jmp_buf = {140736850802432, -891796517554577268, 140736850786240, 0, 93825014220392, 140737488342512, 891712950115147916, 891812829063979148}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#6  0x00007ffff433771f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 12 (Thread 0x7fffda7fc700 (LWP 17866)):
#0  0x00007ffff4615065 in futex_abstimed_wait_cancelable (private=<optimized out>, abstime=0x7fffda7f8680, expected=0, futex_word=0x555556a61de4) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
        __ret = -516
        oldtype = 0
        err = <optimized out>
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffda7f85f0, __canceltype = 1454415344, __prev = 0x0}
        cbuffer = {wseq = 49, cond = 0x555556a61db8, mutex = 0x555556a61d90, private = 0}
        err = <optimized out>
        g = 1
        flags = <optimized out>
        maxspin = 0
        signals = <optimized out>
        result = 0
        seq = 24

#1  0x00007ffff4615065 in __pthread_cond_wait_common (abstime=0x7fffda7f8680, mutex=0x555556a61d90, cond=0x555556a61db8) at pthread_cond_wait.c:539
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffda7f85f0, __canceltype = 1454415344, __prev = 0x0}
        cbuffer = {wseq = 49, cond = 0x555556a61db8, mutex = 0x555556a61d90, private = 0}
        err = <optimized out>
        g = 1
        flags = <optimized out>
        maxspin = 0
        signals = <optimized out>
        result = 0
        seq = 24
#2  0x00007ffff4615065 in __pthread_cond_timedwait (cond=0x555556a61db8, mutex=0x555556a61d90, abstime=0x7fffda7f8680) at pthread_cond_wait.c:667
#3  0x00007ffff4e7b354 in  () at /usr/lib/librados.so.2
#4  0x00007fffe7bda6df in  () at /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007ffff460e6db in start_thread (arg=0x7fffda7fc700) at pthread_create.c:463
        pd = 0x7fffda7fc700
        now = <optimized out>
        unwind_buf =
              {cancel_jmp_buf = {{jmp_buf = {140736859195136, -891796517554577268, 140736859178944, 0, 93825014218800, 140737488343248, 891720653675864204, 891812829063979148}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#6  0x00007ffff433771f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 11 (Thread 0x7fffdaffd700 (LWP 17865)):
#0  0x00007ffff4615065 in futex_abstimed_wait_cancelable (private=<optimized out>, abstime=0x7fffdaff96a0, expected=0, futex_word=0x555556a09610) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
        __ret = -516
        oldtype = 0
        err = <optimized out>
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffdaff9610, __canceltype = -194944912, __prev = 0x0}
        cbuffer = {wseq = 44, cond = 0x555556a095e8, mutex = 0x555556a095c0, private = 0}
        err = <optimized out>
        g = 0
        flags = <optimized out>
        maxspin = 0
        signals = <optimized out>
        result = 0
        seq = 22
#1  0x00007ffff4615065 in __pthread_cond_wait_common (abstime=0x7fffdaff96a0, mutex=0x555556a095c0, cond=0x555556a095e8) at pthread_cond_wait.c:539
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffdaff9610, __canceltype = -194944912, __prev = 0x0}
        cbuffer = {wseq = 44, cond = 0x555556a095e8, mutex = 0x555556a095c0, private = 0}
        err = <optimized out>
        g = 0
        flags = <optimized out>
        maxspin = 0
        signals = <optimized out>
        result = 0
        seq = 22
#2  0x00007ffff4615065 in __pthread_cond_timedwait (cond=0x555556a095e8, mutex=0x555556a095c0, abstime=0x7fffdaff96a0) at pthread_cond_wait.c:667
#3  0x00007fffe81f2736 in CephContextServiceThread::entry() () at /usr/lib/ceph/libceph-common.so.0
#4  0x00007ffff460e6db in start_thread (arg=0x7fffdaffd700) at pthread_create.c:463
        pd = 0x7fffdaffd700
        now = <optimized out>
        unwind_buf =
              {cancel_jmp_buf = {{jmp_buf = {140736867587840, -891796517554577268, 140736867571648, 0, 93825013945760, 140737488342320, 891719554701107340, 891812829063979148}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#5  0x00007ffff433771f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 6 (Thread 0x7fffdb7fe700 (LWP 17860)):
#0  0x00007ffff4337a47 in epoll_wait (epfd=15, events=0x555556abd660, maxevents=5000, timeout=30000) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
        resultvar = 18446744073709551612
        sc_cancel_oldtype = 0
#1  0x00007fffe8462a1e in EpollDriver::event_wait(std::vector<FiredFileEvent, std::allocator<FiredFileEvent> >&, timeval*) () at /usr/lib/ceph/libceph-common.so.0
#2  0x00007fffe8449723 in EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*) () at /usr/lib/ceph/libceph-common.so.0
#3  0x00007fffe844ea78 in  () at /usr/lib/ceph/libceph-common.so.0
#4  0x00007fffe7bda6df in  () at /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007ffff460e6db in start_thread (arg=0x7fffdb7fe700) at pthread_create.c:463
        pd = 0x7fffdb7fe700
        now = <optimized out>
        unwind_buf =

              {cancel_jmp_buf = {{jmp_buf = {140736875980544, -891796517554577268, 140736875964352, 0, 93825014864512, 140737488342000, 891718453578866828, 891812829063979148}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#6  0x00007ffff433771f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 5 (Thread 0x7fffdbfff700 (LWP 17859)):
#0  0x00007ffff4337a47 in epoll_wait (epfd=12, events=0x555556a90870, maxevents=5000, timeout=30000) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
        resultvar = 18446744073709551612
        sc_cancel_oldtype = 0
#1  0x00007fffe8462a1e in EpollDriver::event_wait(std::vector<FiredFileEvent, std::allocator<FiredFileEvent> >&, timeval*) () at /usr/lib/ceph/libceph-common.so.0
#2  0x00007fffe8449723 in EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*) () at /usr/lib/ceph/libceph-common.so.0
#3  0x00007fffe844ea78 in  () at /usr/lib/ceph/libceph-common.so.0
#4  0x00007fffe7bda6df in  () at /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007ffff460e6db in start_thread (arg=0x7fffdbfff700) at pthread_create.c:463
        pd = 0x7fffdbfff700
        now = <optimized out>
        unwind_buf =
              {cancel_jmp_buf = {{jmp_buf = {140736884373248, -891796517554577268, 140736884357056, 0, 93825014863888, 140737488342000, 891717354604109964, 891812829063979148}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#6  0x00007ffff433771f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 4 (Thread 0x7fffe0bc1700 (LWP 17858)):
#0  0x00007ffff4337a47 in epoll_wait (epfd=9, events=0x555556a63d10, maxevents=5000, timeout=30000) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
        resultvar = 18446744073709551612
        sc_cancel_oldtype = 0
#1  0x00007fffe8462a1e in EpollDriver::event_wait(std::vector<FiredFileEvent, std::allocator<FiredFileEvent> >&, timeval*) () at /usr/lib/ceph/libceph-common.so.0
#2  0x00007fffe8449723 in EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*) () at /usr/lib/ceph/libceph-common.so.0
#3  0x00007fffe844ea78 in  () at /usr/lib/ceph/libceph-common.so.0
#4  0x00007fffe7bda6df in  () at /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007ffff460e6db in start_thread (arg=0x7fffe0bc1700) at pthread_create.c:463
        pd = 0x7fffe0bc1700
        now = <optimized out>
        unwind_buf =
              {cancel_jmp_buf = {{jmp_buf = {140736963811072, -891796517554577268, 140736963794880, 0, 93825014863312, 140737488342000, 891838883924975756, 891812829063979148}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#6  0x00007ffff433771f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 3 (Thread 0x7fffe1bbb700 (LWP 17857)):
#0  0x00007ffff4614ad3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x5555568e0cd8) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
        __ret = -512
        oldtype = 0
        err = <optimized out>
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffe1bb7630, __canceltype = -194959278, __prev = 0x0}
        cbuffer = {wseq = 256, cond = 0x5555568e0cb0, mutex = 0x5555568e0c30, private = 0}
        err = <optimized out>
        g = 0
        flags = <optimized out>
        signals = <optimized out>
        result = 0
        seq = 128
#1  0x00007ffff4614ad3 in __pthread_cond_wait_common (abstime=0x0, mutex=0x5555568e0c30, cond=0x5555568e0cb0) at pthread_cond_wait.c:502
        spin = 0
        buffer = {__routine = 0x7ffff4614770 <__condvar_cleanup_waiting>, __arg = 0x7fffe1bb7630, __canceltype = -194959278, __prev = 0x0}
        cbuffer = {wseq = 256, cond = 0x5555568e0cb0, mutex = 0x5555568e0c30, private = 0}
        err = <optimized out>
        g = 0
        flags = <optimized out>
        signals = <optimized out>
        result = 0
        seq = 128
#2  0x00007ffff4614ad3 in __pthread_cond_wait (cond=0x5555568e0cb0, mutex=0x5555568e0c30) at pthread_cond_wait.c:655
#3  0x00007fffe7bd48bc in std::condition_variable::wait(std::unique_lock<std::mutex>&) () at /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007fffe8493daf in ceph::logging::Log::entry() () at /usr/lib/ceph/libceph-common.so.0
#5  0x00007ffff460e6db in start_thread (arg=0x7fffe1bbb700) at pthread_create.c:463
        pd = 0x7fffe1bbb700
        now = <optimized out>
        unwind_buf =
              {cancel_jmp_buf = {{jmp_buf = {140736980563712, -891796517554577268, 140736980547520, 0, 93825012730880, 140737488342688, 891836679533011084, 891812829063979148}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}

        not_first_call = <optimized out>
#6  0x00007ffff433771f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 2 (Thread 0x7fffe300f700 (LWP 17856)):
#0  0x00007ffff4331639 in syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1  0x0000555555ce140b in qemu_futex_wait (val=<optimized out>, f=<optimized out>) at /home/lieven/git/qemu-6.0.0/include/qemu/futex.h:29
        value = <optimized out>
        __PRETTY_FUNCTION__ = "qemu_event_wait"
#2  0x0000555555ce140b in qemu_event_wait (ev=ev@entry=0x55555666f268 <rcu_call_ready_event>) at ../util/qemu-thread-posix.c:480
        value = <optimized out>
        __PRETTY_FUNCTION__ = "qemu_event_wait"
#3  0x0000555555ce9258 in call_rcu_thread (opaque=opaque@entry=0x0) at ../util/rcu.c:258
        tries = 0
        n = <optimized out>
        node = <optimized out>
#4  0x0000555555ce05b6 in qemu_thread_start (args=<optimized out>) at ../util/qemu-thread-posix.c:557
        __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {93825010610432, -891796517554577268, 140737001863104, 0, 93825010610432, 140737488347232, 891842243702988940, 6428070804076404876}, __mask_was_saved = 0}}, __pad = {0x7fffe300b710, 0x0, 0x0, 0x0}}
        __cancel_routine = 0x555555ce0610 <qemu_thread_atexit_notify>
        __not_first_call = <optimized out>
        qemu_thread_args = <optimized out>
        start_routine = 0x555555ce9200 <call_rcu_thread>
        arg = 0x0
        r = <optimized out>
#5  0x00007ffff460e6db in start_thread (arg=0x7fffe300f700) at pthread_create.c:463
        pd = 0x7fffe300f700
        now = <optimized out>
        unwind_buf =
              {cancel_jmp_buf = {{jmp_buf = {140737001879296, -891796517554577268, 140737001863104, 0, 93825010610432, 140737488347232, 891842243663143052, 891812829063979148}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#6  0x00007ffff433771f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 1 (Thread 0x7ffff7fa8f40 (LWP 17852)):
#0  0x00007ffff42567e0 in __GI_abort () at abort.c:49
        act =
          {__sigaction_handler = {sa_handler = 0x555556a4e240, sa_sigaction = 0x555556a4e240}, sa_mask = {__val = {0, 0, 0, 0, 140733912637160, 0, 0, 5, 140733912636992, 140737291030896, 140737291016016, 0, 3954949963032095744, 140737291000980, 0, 140737291016016}}, sa_flags = 1441150130, sa_restorer = 0x555555e638e0}
        sigs =
            {__val = {93825001732896, 140737291011849, 93825001731250, 3954949963032095744, 140737293330048, 93825001731250, 93825001732320, 1497, 140737291016016, 140737289732025, 206158430232, 140733912637232, 140733912637040, 3954949963032095744, 93825001731250, 1497}}
#1  0x00007ffff424648a in __assert_fail_base (fmt=0x7ffff43cd750 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x555555e638e0 "!qiov || qiov->size == acb->bytes", file=file@entry=0x555555e634b2 "../block/block-backend.c", line=line@entry=1497, function=function@entry=0x555555e63b20 <__PRETTY_FUNCTION__.32450> "blk_aio_write_entry") at assert.c:92
        str = 0x555556a4e240 "\320\361mWUU"
        total = 4096
#2  0x00007ffff4246502 in __GI___assert_fail (assertion=assertion@entry=0x555555e638e0 "!qiov || qiov->size == acb->bytes", file=file@entry=0x555555e634b2 "../block/block-backend.c", line=line@entry=1497, function=function@entry=0x555555e63b20 <__PRETTY_FUNCTION__.32450> "blk_aio_write_entry") at assert.c:101
#3  0x0000555555becc78 in blk_aio_write_entry (opaque=0x555556b534f0) at ../block/block-backend.c:1497
        acb = 0x555556b534f0
        rwco = 0x555556b53518
        qiov = <optimized out>
        opaque = 0x555556b534f0
        acb = 0x555556b534f0
        rwco = 0x555556b53518
        qiov = <optimized out>
#4  0x0000555555cf0e4c in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at ../util/coroutine-ucontext.c:173
        arg = <optimized out>
        self = <optimized out>
        co = 0x5555570f73a0
#5  0x00007ffff426e7b0 in __start_context () at /lib/x86_64-linux-gnu/libc.so.6
#6  0x00007fffffffd5a0 in  ()
#7  0x0000000000000000 in  ()





^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support
  2021-10-26 14:53                     ` Peter Lieven
@ 2021-11-15 11:17                       ` Peter Lieven
  0 siblings, 0 replies; 21+ messages in thread
From: Peter Lieven @ 2021-11-15 11:17 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Daniel P. Berrange, qemu-block, Christian Theune, qemu-devel,
	Max Reitz, Paolo Bonzini, Ilya Dryomov

On 26.10.21 at 16:53, Peter Lieven wrote:
> On 25.10.21 at 14:58, Kevin Wolf wrote:
>> On 25.10.2021 at 13:39, Peter Lieven wrote:
>>> On 16.09.21 at 14:34, Peter Lieven wrote:
>>>> On 09.07.21 at 12:21, Kevin Wolf wrote:
>>>>> On 08.07.2021 at 20:23, Peter Lieven wrote:
>>>>>> On 08.07.2021 at 14:18, Kevin Wolf <kwolf@redhat.com> wrote:
>>>>>>> On 07.07.2021 at 20:13, Peter Lieven wrote:
>>>>>>>>> On 06.07.2021 at 17:25, Kevin Wolf <kwolf@redhat.com> wrote:
>>>>>>>>> On 06.07.2021 at 16:55, Peter Lieven wrote:
>>>>>>>>>> I will have a decent look after my vacation.
>>>>>>>>> Sounds good, thanks. Enjoy your vacation!
>>>>>>>> As I had to fire up my laptop to look into another issue anyway, I
>>>>>>>> have sent two patches for updating MAINTAINERS and to fix the int vs.
>>>>>>>> bool mix for task->complete.
>>>>>>> I think you need to reevaluate your definition of vacation. ;-)
>>>>>> Let's talk about this when the kids are grown up. Sometimes sending
>>>>>> patches can be quite relaxing :-)
>>>>> Heh, fair enough. :-)
>>>>>
>>>>>>> But thanks anyway.
>>>>>>>
>>>>>>>> As Paolo's fix (5f50be9b5) is relatively new, there may be other
>>>>>>>> non-obvious problems when removing the BH indirection, and we are close
>>>>>>>> to soft freeze, I would leave the BH removal change for 6.2.
>>>>>>> Sure, code cleanups aren't urgent.
>>>>>> Isn’t the indirection also a slight performance drop?
>>>>> Yeah, I guess technically it is, though I doubt it's measurable.
>>>> As promised, I tried to remove the indirection through the BH after the Qemu 6.1 release.
>>>>
>>>> However, if I remove the BH, I run into the following assertion while running some fio tests:
>>>>
>>>>
>>>> qemu-system-x86_64: ../block/block-backend.c:1197: blk_wait_while_drained: Assertion `blk->in_flight > 0' failed.
>>>>
>>>>
>>>> Any idea?
>>>>
>>>>
>>>> This is what I changed:
>>>>
>>>>
>>>> diff --git a/block/rbd.c b/block/rbd.c
>>>> index 3cb24f9981..bc1dbc20f7 100644
>>>> --- a/block/rbd.c
>>>> +++ b/block/rbd.c
>>>> @@ -1063,13 +1063,6 @@ static int qemu_rbd_resize(BlockDriverState *bs, uint64_t size)
>>>>       return 0;
>>>>   }
>>>>
>>>> -static void qemu_rbd_finish_bh(void *opaque)
>>>> -{
>>>> -    RBDTask *task = opaque;
>>>> -    task->complete = true;
>>>> -    aio_co_wake(task->co);
>>>> -}
>>>> -
>>>>   /*
>>>>    * This is the completion callback function for all rbd aio calls
>>>>    * started from qemu_rbd_start_co().
>>>> @@ -1083,8 +1076,8 @@ static void qemu_rbd_completion_cb(rbd_completion_t c, RBDTask *task)
>>>>   {
>>>>       task->ret = rbd_aio_get_return_value(c);
>>>>       rbd_aio_release(c);
>>>> -    aio_bh_schedule_oneshot(bdrv_get_aio_context(task->bs),
>>>> -                            qemu_rbd_finish_bh, task);
>>>> +    task->complete = true;
>>>> +    aio_co_wake(task->co);
>>>>   }
>>> Kevin, Paolo, any idea?
>> Not really, I don't see the connection between the two places.
>>
>> Do you have a stack trace for the crash?
>
> The crash seems not to be limited to that assertion. I have also seen:
>
>
> qemu-system-x86_64: ../block/block-backend.c:1497: blk_aio_write_entry: Assertion `!qiov || qiov->size == acb->bytes' failed.
>
>
> Although harder to trigger, I caught this backtrace in gdb:
>
>
> qemu-system-x86_64: ../block/block-backend.c:1497: blk_aio_write_entry: Assertion `!qiov || qiov->size == acb->bytes' failed.
> [Switching to Thread 0x7ffff7fa8f40 (LWP 17852)]
>
> Thread 1 "qemu-system-x86" hit Breakpoint 1, __GI_abort () at abort.c:49
> 49    abort.c: No such file or directory.
> (gdb) bt
> #0  0x00007ffff42567e0 in __GI_abort () at abort.c:49
> #1  0x00007ffff424648a in __assert_fail_base (fmt=0x7ffff43cd750 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x555555e638e0 "!qiov || qiov->size == acb->bytes", file=file@entry=0x555555e634b2 "../block/block-backend.c", line=line@entry=1497, function=function@entry=0x555555e63b20 <__PRETTY_FUNCTION__.32450> "blk_aio_write_entry") at assert.c:92
> #2  0x00007ffff4246502 in __GI___assert_fail (assertion=assertion@entry=0x555555e638e0 "!qiov || qiov->size == acb->bytes", file=file@entry=0x555555e634b2 "../block/block-backend.c", line=line@entry=1497, function=function@entry=0x555555e63b20 <__PRETTY_FUNCTION__.32450> "blk_aio_write_entry") at assert.c:101
> #3  0x0000555555becc78 in blk_aio_write_entry (opaque=0x555556b534f0) at ../block/block-backend.c:1497
> #4  0x0000555555cf0e4c in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at ../util/coroutine-ucontext.c:173
> #5  0x00007ffff426e7b0 in __start_context () at /lib/x86_64-linux-gnu/libc.so.6
> #6  0x00007fffffffd5a0 in  ()
> #7  0x0000000000000000 in  ()
>


Any ideas? Or should we just abandon the idea of removing the BH?


Peter
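
For reference, a minimal sketch of the two completion paths compared in the diff quoted
above. Only the two callback bodies correspond to that hunk; the SketchRBDTask type, the
sketch_* names, the header paths and the coroutine-side wait loop are assumptions added
for illustration and are not the actual block/rbd.c code. The fragment is meant to be
read against the QEMU tree, not compiled standalone.

/*
 * Sketch only, under the assumptions stated above.
 */
#include "qemu/osdep.h"
#include "qemu/coroutine.h"
#include "block/aio.h"
#include "block/block.h"
#include <rbd/librbd.h>

typedef struct SketchRBDTask {
    BlockDriverState *bs;
    Coroutine *co;          /* coroutine that issued the request */
    bool complete;
    int64_t ret;
} SketchRBDTask;

/* Variant A (what was merged): the librbd callback thread only schedules a
 * bottom half; the wakeup itself runs in the BDS's AioContext. */
static void sketch_finish_bh(void *opaque)
{
    SketchRBDTask *task = opaque;
    task->complete = true;
    aio_co_wake(task->co);
}

static void sketch_completion_with_bh(rbd_completion_t c, SketchRBDTask *task)
{
    task->ret = rbd_aio_get_return_value(c);
    rbd_aio_release(c);
    aio_bh_schedule_oneshot(bdrv_get_aio_context(task->bs),
                            sketch_finish_bh, task);
}

/* Variant B (the change under discussion): wake the coroutine directly from
 * whichever librbd worker thread delivers the completion, with no BH hop. */
static void sketch_completion_direct(rbd_completion_t c, SketchRBDTask *task)
{
    task->ret = rbd_aio_get_return_value(c);
    rbd_aio_release(c);
    task->complete = true;
    aio_co_wake(task->co);
}

/* Assumed coroutine side: submit the rbd_aio_* request, then yield until the
 * completion callback flips task->complete. */
static int64_t coroutine_fn sketch_wait(SketchRBDTask *task)
{
    task->co = qemu_coroutine_self();
    /* rbd_aio_create_completion() + rbd_aio_writev()/readv() would go here. */
    while (!task->complete) {
        qemu_coroutine_yield();
    }
    return task->ret;
}

The practical difference is where aio_co_wake() is entered: with the BH it always runs
in the BlockDriverState's AioContext, without it it runs in a librbd callback thread,
which is the configuration under which the assertions quoted above were hit.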





^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2021-11-15 11:18 UTC | newest]

Thread overview: 21+ messages
2021-07-02 17:23 [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support Ilya Dryomov
2021-07-02 17:23 ` [PATCH v5 1/6] block/rbd: bump librbd requirement to luminous release Ilya Dryomov
2021-07-02 17:23 ` [PATCH v5 2/6] block/rbd: store object_size in BDRVRBDState Ilya Dryomov
2021-07-02 17:23 ` [PATCH v5 3/6] block/rbd: update s->image_size in qemu_rbd_getlength Ilya Dryomov
2021-07-02 17:23 ` [PATCH v5 4/6] block/rbd: migrate from aio to coroutines Ilya Dryomov
2021-07-02 17:23 ` [PATCH v5 5/6] block/rbd: add write zeroes support Ilya Dryomov
2021-07-02 17:23 ` [PATCH v5 6/6] block/rbd: drop qemu_rbd_refresh_limits Ilya Dryomov
2021-07-02 19:55 ` [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support Ilya Dryomov
2021-07-06 13:19 ` Kevin Wolf
2021-07-06 14:55   ` Peter Lieven
2021-07-06 15:25     ` Kevin Wolf
2021-07-06 15:48       ` Peter Lieven
2021-07-07 18:13       ` Peter Lieven
2021-07-08 12:18         ` Kevin Wolf
2021-07-08 18:23           ` Peter Lieven
2021-07-09 10:21             ` Kevin Wolf
2021-09-16 12:34               ` Peter Lieven
2021-10-25 11:39                 ` Peter Lieven
2021-10-25 12:58                   ` Kevin Wolf
2021-10-26 14:53                     ` Peter Lieven
2021-11-15 11:17                       ` Peter Lieven
