All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/6] block: byte-based AIO read/write
@ 2018-02-15 19:28 Eric Blake
  2018-02-15 19:28 ` [Qemu-devel] [PATCH 1/6] block: Support byte-based aio callbacks Eric Blake
                   ` (6 more replies)
  0 siblings, 7 replies; 17+ messages in thread
From: Eric Blake @ 2018-02-15 19:28 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, kwolf

While we would prefer that block drivers use coroutines instead
of aio callbacks, it is a fairly easy exercise to prove that
all existing drivers with aio callbacks are merely scaling
from bytes into sectors and back to bytes.  So, even though I
am not set up to completely run (or even compile-test) this
full series, it seems pretty straightforward to change the
signature to quit playing games with pointless scaling.

Note that except for the null-aio driver, I intentionally did
NOT try and change the request_alignment from the block layer's
default of 512 (it defaults to 1 only for drivers that use
coroutines).

(And along the way, I got my docker-test-mingw@fedora working;
thanks to the help I got on IRC)

Eric Blake (6):
  block: Support byte-based aio callbacks
  file-win32: Switch to byte-based callbacks
  null: Switch to byte-based read/write
  rbd: Switch to byte-based callbacks
  vxhs: Switch to byte-based callbacks
  block: Drop last of the sector-based aio callbacks

 include/block/block_int.h |  8 +++---
 include/block/raw-aio.h   |  2 +-
 block/io.c                | 26 ++++++++++++-------
 block/file-win32.c        | 36 +++++++++++++-------------
 block/null.c              | 66 ++++++++++++++++++++++++++---------------------
 block/rbd.c               | 36 ++++++++++++--------------
 block/vxhs.c              | 36 +++++++++++---------------
 block/win32-aio.c         |  5 ++--
 8 files changed, 109 insertions(+), 106 deletions(-)

-- 
2.14.3

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 1/6] block: Support byte-based aio callbacks
  2018-02-15 19:28 [Qemu-devel] [PATCH 0/6] block: byte-based AIO read/write Eric Blake
@ 2018-02-15 19:28 ` Eric Blake
  2018-04-24 15:40   ` Kevin Wolf
  2018-02-15 19:28 ` [Qemu-devel] [PATCH 2/6] file-win32: Switch to byte-based callbacks Eric Blake
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Eric Blake @ 2018-02-15 19:28 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, kwolf, Stefan Hajnoczi, Fam Zheng, Max Reitz

We are gradually moving away from sector-based interfaces, towards
byte-based.  Add new sector-based aio callbacks for read and write,
to match the fact that bdrv_aio_pdiscard is already byte-based.

Ideally, drivers should be converted to use coroutine callbacks
rather than aio; but that is not quite as trivial (if we do that
conversion, the null-aio driver will disappear), so for the
short term, converting the signature but keeping things with aio
is easier.  Once all drivers are converted, the sector-based aio
callbacks will be removed.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 include/block/block_int.h |  6 ++++++
 block/io.c                | 37 +++++++++++++++++++++++++++----------
 2 files changed, 33 insertions(+), 10 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 5ae7738cf8d..c882dc4232d 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -137,9 +137,15 @@ struct BlockDriver {
     BlockAIOCB *(*bdrv_aio_readv)(BlockDriverState *bs,
         int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
         BlockCompletionFunc *cb, void *opaque);
+    BlockAIOCB *(*bdrv_aio_preadv)(BlockDriverState *bs,
+        uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags,
+        BlockCompletionFunc *cb, void *opaque);
     BlockAIOCB *(*bdrv_aio_writev)(BlockDriverState *bs,
         int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
         BlockCompletionFunc *cb, void *opaque);
+    BlockAIOCB *(*bdrv_aio_pwritev)(BlockDriverState *bs,
+        uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags,
+        BlockCompletionFunc *cb, void *opaque);
     BlockAIOCB *(*bdrv_aio_flush)(BlockDriverState *bs,
         BlockCompletionFunc *cb, void *opaque);
     BlockAIOCB *(*bdrv_aio_pdiscard)(BlockDriverState *bs,
diff --git a/block/io.c b/block/io.c
index 4d3d1f640a3..84a4caa72b7 100644
--- a/block/io.c
+++ b/block/io.c
@@ -934,9 +934,11 @@ static int coroutine_fn bdrv_driver_preadv(BlockDriverState *bs,
     sector_num = offset >> BDRV_SECTOR_BITS;
     nb_sectors = bytes >> BDRV_SECTOR_BITS;

-    assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
-    assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
-    assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
+    if (!drv->bdrv_aio_preadv) {
+        assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
+        assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
+        assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
+    }

     if (drv->bdrv_co_readv) {
         return drv->bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
@@ -946,8 +948,13 @@ static int coroutine_fn bdrv_driver_preadv(BlockDriverState *bs,
             .coroutine = qemu_coroutine_self(),
         };

-        acb = bs->drv->bdrv_aio_readv(bs, sector_num, qiov, nb_sectors,
-                                      bdrv_co_io_em_complete, &co);
+        if (drv->bdrv_aio_preadv) {
+            acb = bs->drv->bdrv_aio_preadv(bs, offset, bytes, qiov, flags,
+                                           bdrv_co_io_em_complete, &co);
+        } else {
+            acb = bs->drv->bdrv_aio_readv(bs, sector_num, qiov, nb_sectors,
+                                          bdrv_co_io_em_complete, &co);
+        }
         if (acb == NULL) {
             return -EIO;
         } else {
@@ -982,9 +989,11 @@ static int coroutine_fn bdrv_driver_pwritev(BlockDriverState *bs,
     sector_num = offset >> BDRV_SECTOR_BITS;
     nb_sectors = bytes >> BDRV_SECTOR_BITS;

-    assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
-    assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
-    assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
+    if (!drv->bdrv_aio_pwritev) {
+        assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
+        assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
+        assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
+    }

     if (drv->bdrv_co_writev_flags) {
         ret = drv->bdrv_co_writev_flags(bs, sector_num, nb_sectors, qiov,
@@ -999,8 +1008,16 @@ static int coroutine_fn bdrv_driver_pwritev(BlockDriverState *bs,
             .coroutine = qemu_coroutine_self(),
         };

-        acb = bs->drv->bdrv_aio_writev(bs, sector_num, qiov, nb_sectors,
-                                       bdrv_co_io_em_complete, &co);
+        if (drv->bdrv_aio_pwritev) {
+            acb = bs->drv->bdrv_aio_pwritev(bs, offset, bytes, qiov,
+                                            flags & bs->supported_write_flags,
+                                            bdrv_co_io_em_complete, &co);
+            flags &= ~bs->supported_write_flags;
+        } else {
+            assert(!bs->supported_write_flags);
+            acb = bs->drv->bdrv_aio_writev(bs, sector_num, qiov, nb_sectors,
+                                           bdrv_co_io_em_complete, &co);
+        }
         if (acb == NULL) {
             ret = -EIO;
         } else {
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 2/6] file-win32: Switch to byte-based callbacks
  2018-02-15 19:28 [Qemu-devel] [PATCH 0/6] block: byte-based AIO read/write Eric Blake
  2018-02-15 19:28 ` [Qemu-devel] [PATCH 1/6] block: Support byte-based aio callbacks Eric Blake
@ 2018-02-15 19:28 ` Eric Blake
  2018-02-15 19:28 ` [Qemu-devel] [PATCH 3/6] null: Switch to byte-based read/write Eric Blake
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Eric Blake @ 2018-02-15 19:28 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, kwolf, Max Reitz, Stefan Weil

We are gradually moving away from sector-based interfaces, towards
byte-based.  Make the change for the last few sector-based callbacks
in the file-win32 driver.

Note that the driver was already using byte-based calls for
performing actual I/O, so this just gets rid of a round trip
of scaling; however, the block layer defaults this driver to
use request_alignment of 512, and I did not change that,
because I don't know if Windows is tolerant of non-sector AIO
operations.

Signed-off-by: Eric Blake <eblake@redhat.com>

---
Compile-tested via 'make docker-test-mingw@fedora', but I don't
have a sane way to test whether it actually works.
---
 include/block/raw-aio.h |  2 +-
 block/file-win32.c      | 36 ++++++++++++++++++------------------
 block/win32-aio.c       |  5 ++---
 3 files changed, 21 insertions(+), 22 deletions(-)

diff --git a/include/block/raw-aio.h b/include/block/raw-aio.h
index a4cdbbf1b7a..9e47b8a629d 100644
--- a/include/block/raw-aio.h
+++ b/include/block/raw-aio.h
@@ -57,7 +57,7 @@ void win32_aio_cleanup(QEMUWin32AIOState *aio);
 int win32_aio_attach(QEMUWin32AIOState *aio, HANDLE hfile);
 BlockAIOCB *win32_aio_submit(BlockDriverState *bs,
         QEMUWin32AIOState *aio, HANDLE hfile,
-        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
+        uint64_t offset, uint64_t bytes, QEMUIOVector *qiov,
         BlockCompletionFunc *cb, void *opaque, int type);
 void win32_aio_detach_aio_context(QEMUWin32AIOState *aio,
                                   AioContext *old_context);
diff --git a/block/file-win32.c b/block/file-win32.c
index f24c7bb92c6..7a2e3367946 100644
--- a/block/file-win32.c
+++ b/block/file-win32.c
@@ -410,32 +410,32 @@ fail:
     return ret;
 }

-static BlockAIOCB *raw_aio_readv(BlockDriverState *bs,
-                         int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
-                         BlockCompletionFunc *cb, void *opaque)
+static BlockAIOCB *raw_aio_preadv(BlockDriverState *bs,
+                                  uint64_t offset, uint64_t bytes,
+                                  QEMUIOVector *qiov, int flags,
+                                  BlockCompletionFunc *cb, void *opaque)
 {
     BDRVRawState *s = bs->opaque;
     if (s->aio) {
-        return win32_aio_submit(bs, s->aio, s->hfile, sector_num, qiov,
-                                nb_sectors, cb, opaque, QEMU_AIO_READ);
+        return win32_aio_submit(bs, s->aio, s->hfile, offset, bytes, qiov,
+                                cb, opaque, QEMU_AIO_READ);
     } else {
-        return paio_submit(bs, s->hfile, sector_num << BDRV_SECTOR_BITS, qiov,
-                           nb_sectors << BDRV_SECTOR_BITS,
+        return paio_submit(bs, s->hfile, offset, qiov, bytes,
                            cb, opaque, QEMU_AIO_READ);
     }
 }

-static BlockAIOCB *raw_aio_writev(BlockDriverState *bs,
-                          int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
-                          BlockCompletionFunc *cb, void *opaque)
+static BlockAIOCB *raw_aio_pwritev(BlockDriverState *bs,
+                                   uint64_t offset, uint64_t bytes,
+                                   QEMUIOVector *qiov, int flags,
+                                   BlockCompletionFunc *cb, void *opaque)
 {
     BDRVRawState *s = bs->opaque;
     if (s->aio) {
-        return win32_aio_submit(bs, s->aio, s->hfile, sector_num, qiov,
-                                nb_sectors, cb, opaque, QEMU_AIO_WRITE);
+        return win32_aio_submit(bs, s->aio, s->hfile, offset, bytes, qiov,
+                                cb, opaque, QEMU_AIO_WRITE);
     } else {
-        return paio_submit(bs, s->hfile, sector_num << BDRV_SECTOR_BITS, qiov,
-                           nb_sectors << BDRV_SECTOR_BITS,
+        return paio_submit(bs, s->hfile, offset, qiov, bytes,
                            cb, opaque, QEMU_AIO_WRITE);
     }
 }
@@ -602,8 +602,8 @@ BlockDriver bdrv_file = {
     .bdrv_create        = raw_create,
     .bdrv_has_zero_init = bdrv_has_zero_init_1,

-    .bdrv_aio_readv     = raw_aio_readv,
-    .bdrv_aio_writev    = raw_aio_writev,
+    .bdrv_aio_preadv    = raw_aio_preadv,
+    .bdrv_aio_pwritev   = raw_aio_pwritev,
     .bdrv_aio_flush     = raw_aio_flush,

     .bdrv_truncate	= raw_truncate,
@@ -764,8 +764,8 @@ static BlockDriver bdrv_host_device = {
     .bdrv_file_open	= hdev_open,
     .bdrv_close		= raw_close,

-    .bdrv_aio_readv     = raw_aio_readv,
-    .bdrv_aio_writev    = raw_aio_writev,
+    .bdrv_aio_preadv    = raw_aio_preadv,
+    .bdrv_aio_pwritev   = raw_aio_pwritev,
     .bdrv_aio_flush     = raw_aio_flush,

     .bdrv_detach_aio_context = raw_detach_aio_context,
diff --git a/block/win32-aio.c b/block/win32-aio.c
index 3be8f458fab..9cd355d42f8 100644
--- a/block/win32-aio.c
+++ b/block/win32-aio.c
@@ -112,15 +112,14 @@ static const AIOCBInfo win32_aiocb_info = {

 BlockAIOCB *win32_aio_submit(BlockDriverState *bs,
         QEMUWin32AIOState *aio, HANDLE hfile,
-        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
+        uint64_t offset, uint64_t bytes, QEMUIOVector *qiov,
         BlockCompletionFunc *cb, void *opaque, int type)
 {
     struct QEMUWin32AIOCB *waiocb;
-    uint64_t offset = sector_num * 512;
     DWORD rc;

     waiocb = qemu_aio_get(&win32_aiocb_info, bs, cb, opaque);
-    waiocb->nbytes = nb_sectors * 512;
+    waiocb->nbytes = bytes;
     waiocb->qiov = qiov;
     waiocb->is_read = (type == QEMU_AIO_READ);

-- 
2.14.3

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 3/6] null: Switch to byte-based read/write
  2018-02-15 19:28 [Qemu-devel] [PATCH 0/6] block: byte-based AIO read/write Eric Blake
  2018-02-15 19:28 ` [Qemu-devel] [PATCH 1/6] block: Support byte-based aio callbacks Eric Blake
  2018-02-15 19:28 ` [Qemu-devel] [PATCH 2/6] file-win32: Switch to byte-based callbacks Eric Blake
@ 2018-02-15 19:28 ` Eric Blake
  2018-04-24 15:52   ` Kevin Wolf
  2018-02-15 19:28 ` [Qemu-devel] [PATCH 4/6] rbd: Switch to byte-based callbacks Eric Blake
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Eric Blake @ 2018-02-15 19:28 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, kwolf, Fam Zheng, Max Reitz

We are gradually moving away from sector-based interfaces, towards
byte-based.  Make the change for the last few sector-based callbacks
in the null-co and null-aio drivers.

Note that since the null driver does nothing on writes, it trivially
supports the BDRV_REQ_FUA flag (all writes have already landed to
the same bit-bucket without needing an extra flush call).  Furthermore,
bdrv_refresh_limits() defaults the block size to 512 for any driver
that does not support coroutines; while this is still correct for the
other aio-based drivers, the null driver does just as well with
byte-based requests, and being explicit means we can avoid cycles
wasted on read-modify-write.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 block/null.c | 66 ++++++++++++++++++++++++++++++++++--------------------------
 1 file changed, 37 insertions(+), 29 deletions(-)

diff --git a/block/null.c b/block/null.c
index 806a8631e4d..2381090dfcf 100644
--- a/block/null.c
+++ b/block/null.c
@@ -93,6 +93,7 @@ static int null_file_open(BlockDriverState *bs, QDict *options, int flags,
     }
     s->read_zeroes = qemu_opt_get_bool(opts, NULL_OPT_ZEROES, false);
     qemu_opts_del(opts);
+    bs->supported_write_flags = BDRV_REQ_FUA;
     return ret;
 }

@@ -116,22 +117,22 @@ static coroutine_fn int null_co_common(BlockDriverState *bs)
     return 0;
 }

-static coroutine_fn int null_co_readv(BlockDriverState *bs,
-                                      int64_t sector_num, int nb_sectors,
-                                      QEMUIOVector *qiov)
+static coroutine_fn int null_co_preadv(BlockDriverState *bs,
+                                       uint64_t offset, uint64_t bytes,
+                                       QEMUIOVector *qiov, int flags)
 {
     BDRVNullState *s = bs->opaque;

     if (s->read_zeroes) {
-        qemu_iovec_memset(qiov, 0, 0, nb_sectors * BDRV_SECTOR_SIZE);
+        qemu_iovec_memset(qiov, 0, 0, bytes);
     }

     return null_co_common(bs);
 }

-static coroutine_fn int null_co_writev(BlockDriverState *bs,
-                                       int64_t sector_num, int nb_sectors,
-                                       QEMUIOVector *qiov)
+static coroutine_fn int null_co_pwritev(BlockDriverState *bs,
+                                        uint64_t offset, uint64_t bytes,
+                                        QEMUIOVector *qiov, int flags)
 {
     return null_co_common(bs);
 }
@@ -186,26 +187,26 @@ static inline BlockAIOCB *null_aio_common(BlockDriverState *bs,
     return &acb->common;
 }

-static BlockAIOCB *null_aio_readv(BlockDriverState *bs,
-                                  int64_t sector_num, QEMUIOVector *qiov,
-                                  int nb_sectors,
-                                  BlockCompletionFunc *cb,
-                                  void *opaque)
-{
-    BDRVNullState *s = bs->opaque;
-
-    if (s->read_zeroes) {
-        qemu_iovec_memset(qiov, 0, 0, nb_sectors * BDRV_SECTOR_SIZE);
-    }
-
-    return null_aio_common(bs, cb, opaque);
-}
-
-static BlockAIOCB *null_aio_writev(BlockDriverState *bs,
-                                   int64_t sector_num, QEMUIOVector *qiov,
-                                   int nb_sectors,
+static BlockAIOCB *null_aio_preadv(BlockDriverState *bs,
+                                   uint64_t offset, uint64_t bytes,
+                                   QEMUIOVector *qiov, int flags,
                                    BlockCompletionFunc *cb,
                                    void *opaque)
+{
+    BDRVNullState *s = bs->opaque;
+
+    if (s->read_zeroes) {
+        qemu_iovec_memset(qiov, 0, 0, bytes);
+    }
+
+    return null_aio_common(bs, cb, opaque);
+}
+
+static BlockAIOCB *null_aio_pwritev(BlockDriverState *bs,
+                                    uint64_t offset, uint64_t bytes,
+                                    QEMUIOVector *qiov, int flags,
+                                    BlockCompletionFunc *cb,
+                                    void *opaque)
 {
     return null_aio_common(bs, cb, opaque);
 }
@@ -256,6 +257,11 @@ static void null_refresh_filename(BlockDriverState *bs, QDict *opts)
     bs->full_open_options = opts;
 }

+static void null_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+    bs->bl.request_alignment = 1;
+}
+
 static BlockDriver bdrv_null_co = {
     .format_name            = "null-co",
     .protocol_name          = "null-co",
@@ -266,14 +272,15 @@ static BlockDriver bdrv_null_co = {
     .bdrv_close             = null_close,
     .bdrv_getlength         = null_getlength,

-    .bdrv_co_readv          = null_co_readv,
-    .bdrv_co_writev         = null_co_writev,
+    .bdrv_co_preadv         = null_co_preadv,
+    .bdrv_co_pwritev        = null_co_pwritev,
     .bdrv_co_flush_to_disk  = null_co_flush,
     .bdrv_reopen_prepare    = null_reopen_prepare,

     .bdrv_co_block_status   = null_co_block_status,

     .bdrv_refresh_filename  = null_refresh_filename,
+    .bdrv_refresh_limits    = null_refresh_limits,
 };

 static BlockDriver bdrv_null_aio = {
@@ -286,14 +293,15 @@ static BlockDriver bdrv_null_aio = {
     .bdrv_close             = null_close,
     .bdrv_getlength         = null_getlength,

-    .bdrv_aio_readv         = null_aio_readv,
-    .bdrv_aio_writev        = null_aio_writev,
+    .bdrv_aio_preadv        = null_aio_preadv,
+    .bdrv_aio_pwritev       = null_aio_pwritev,
     .bdrv_aio_flush         = null_aio_flush,
     .bdrv_reopen_prepare    = null_reopen_prepare,

     .bdrv_co_block_status   = null_co_block_status,

     .bdrv_refresh_filename  = null_refresh_filename,
+    .bdrv_refresh_limits    = null_refresh_limits,
 };

 static void bdrv_null_init(void)
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 4/6] rbd: Switch to byte-based callbacks
  2018-02-15 19:28 [Qemu-devel] [PATCH 0/6] block: byte-based AIO read/write Eric Blake
                   ` (2 preceding siblings ...)
  2018-02-15 19:28 ` [Qemu-devel] [PATCH 3/6] null: Switch to byte-based read/write Eric Blake
@ 2018-02-15 19:28 ` Eric Blake
  2018-02-15 19:28 ` [Qemu-devel] [PATCH 5/6] vxhs: " Eric Blake
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Eric Blake @ 2018-02-15 19:28 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, kwolf, Josh Durgin, Jeff Cody, Max Reitz

We are gradually moving away from sector-based interfaces, towards
byte-based.  Make the change for the last few sector-based callbacks
in the rbd driver.

Note that the driver was already using byte-based calls for
performing actual I/O, so this just gets rid of a round trip
of scaling; however, the block layer defaults this driver to
use request_alignment of 512, and I did not change that.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 block/rbd.c | 36 ++++++++++++++++--------------------
 1 file changed, 16 insertions(+), 20 deletions(-)

diff --git a/block/rbd.c b/block/rbd.c
index 8474b0ba117..b3a613023c1 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -870,27 +870,23 @@ failed:
     return NULL;
 }

-static BlockAIOCB *qemu_rbd_aio_readv(BlockDriverState *bs,
-                                      int64_t sector_num,
-                                      QEMUIOVector *qiov,
-                                      int nb_sectors,
-                                      BlockCompletionFunc *cb,
-                                      void *opaque)
-{
-    return rbd_start_aio(bs, sector_num << BDRV_SECTOR_BITS, qiov,
-                         (int64_t) nb_sectors << BDRV_SECTOR_BITS, cb, opaque,
-                         RBD_AIO_READ);
-}
-
-static BlockAIOCB *qemu_rbd_aio_writev(BlockDriverState *bs,
-                                       int64_t sector_num,
-                                       QEMUIOVector *qiov,
-                                       int nb_sectors,
+static BlockAIOCB *qemu_rbd_aio_preadv(BlockDriverState *bs,
+                                       uint64_t offset, uint64_t bytes,
+                                       QEMUIOVector *qiov, int flags,
                                        BlockCompletionFunc *cb,
                                        void *opaque)
 {
-    return rbd_start_aio(bs, sector_num << BDRV_SECTOR_BITS, qiov,
-                         (int64_t) nb_sectors << BDRV_SECTOR_BITS, cb, opaque,
+    return rbd_start_aio(bs, offset, qiov, bytes, cb, opaque,
+                         RBD_AIO_READ);
+}
+
+static BlockAIOCB *qemu_rbd_aio_pwritev(BlockDriverState *bs,
+                                        uint64_t offset, uint64_t bytes,
+                                        QEMUIOVector *qiov, int flags,
+                                        BlockCompletionFunc *cb,
+                                        void *opaque)
+{
+    return rbd_start_aio(bs, offset, qiov, bytes, cb, opaque,
                          RBD_AIO_WRITE);
 }

@@ -1140,8 +1136,8 @@ static BlockDriver bdrv_rbd = {
     .bdrv_truncate          = qemu_rbd_truncate,
     .protocol_name          = "rbd",

-    .bdrv_aio_readv         = qemu_rbd_aio_readv,
-    .bdrv_aio_writev        = qemu_rbd_aio_writev,
+    .bdrv_aio_preadv        = qemu_rbd_aio_preadv,
+    .bdrv_aio_pwritev       = qemu_rbd_aio_pwritev,

 #ifdef LIBRBD_SUPPORTS_AIO_FLUSH
     .bdrv_aio_flush         = qemu_rbd_aio_flush,
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 5/6] vxhs: Switch to byte-based callbacks
  2018-02-15 19:28 [Qemu-devel] [PATCH 0/6] block: byte-based AIO read/write Eric Blake
                   ` (3 preceding siblings ...)
  2018-02-15 19:28 ` [Qemu-devel] [PATCH 4/6] rbd: Switch to byte-based callbacks Eric Blake
@ 2018-02-15 19:28 ` Eric Blake
  2018-02-15 19:28 ` [Qemu-devel] [PATCH 6/6] block: Drop last of the sector-based aio callbacks Eric Blake
  2018-04-24 15:02 ` [Qemu-devel] [PATCH 0/6] block: byte-based AIO read/write Eric Blake
  6 siblings, 0 replies; 17+ messages in thread
From: Eric Blake @ 2018-02-15 19:28 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, kwolf, Max Reitz

We are gradually moving away from sector-based interfaces, towards
byte-based.  Make the change for the last few sector-based callbacks
in the vxhs driver.

Note that the driver was already using byte-based calls for
performing actual I/O, so this just gets rid of a round trip
of scaling; however, the block layer defaults this driver to
use request_alignment of 512, and I did not change that.

Signed-off-by: Eric Blake <eblake@redhat.com>

---
I was unable to get this to compile, as I do not have the vxhs
devel headers installed.  By inspection, it should work, but
a careful review is appreciated.
---
 block/vxhs.c | 36 +++++++++++++++---------------------
 1 file changed, 15 insertions(+), 21 deletions(-)

diff --git a/block/vxhs.c b/block/vxhs.c
index 75cc6c8672c..c478ae26035 100644
--- a/block/vxhs.c
+++ b/block/vxhs.c
@@ -424,21 +424,17 @@ static const AIOCBInfo vxhs_aiocb_info = {
  * and is passed to QNIO. When QNIO completes the work,
  * it will be passed back through the callback.
  */
-static BlockAIOCB *vxhs_aio_rw(BlockDriverState *bs, int64_t sector_num,
-                               QEMUIOVector *qiov, int nb_sectors,
+static BlockAIOCB *vxhs_aio_rw(BlockDriverState *bs, uint64_t offset,
+                               QEMUIOVector *qiov, uint64_t size,
                                BlockCompletionFunc *cb, void *opaque,
                                VDISKAIOCmd iodir)
 {
     VXHSAIOCB *acb = NULL;
     BDRVVXHSState *s = bs->opaque;
-    size_t size;
-    uint64_t offset;
     int iio_flags = 0;
     int ret = 0;
     void *dev_handle = s->vdisk_hostinfo.dev_handle;

-    offset = sector_num * BDRV_SECTOR_SIZE;
-    size = nb_sectors * BDRV_SECTOR_SIZE;
     acb = qemu_aio_get(&vxhs_aiocb_info, bs, cb, opaque);

     /*
@@ -451,11 +447,11 @@ static BlockAIOCB *vxhs_aio_rw(BlockDriverState *bs, int64_t sector_num,
     switch (iodir) {
     case VDISK_AIO_WRITE:
             ret = iio_writev(dev_handle, acb, qiov->iov, qiov->niov,
-                             offset, (uint64_t)size, iio_flags);
+                             offset, size, iio_flags);
             break;
     case VDISK_AIO_READ:
             ret = iio_readv(dev_handle, acb, qiov->iov, qiov->niov,
-                            offset, (uint64_t)size, iio_flags);
+                            offset, size, iio_flags);
             break;
     default:
             trace_vxhs_aio_rw_invalid(iodir);
@@ -474,22 +470,20 @@ errout:
     return NULL;
 }

-static BlockAIOCB *vxhs_aio_readv(BlockDriverState *bs,
-                                   int64_t sector_num, QEMUIOVector *qiov,
-                                   int nb_sectors,
+static BlockAIOCB *vxhs_aio_preadv(BlockDriverState *bs,
+                                   uint64_t offset, uint64_t bytes,
+                                   QEMUIOVector *qiov, int flags,
                                    BlockCompletionFunc *cb, void *opaque)
 {
-    return vxhs_aio_rw(bs, sector_num, qiov, nb_sectors, cb,
-                       opaque, VDISK_AIO_READ);
+    return vxhs_aio_rw(bs, offset, qiov, bytes, cb, opaque, VDISK_AIO_READ);
 }

-static BlockAIOCB *vxhs_aio_writev(BlockDriverState *bs,
-                                   int64_t sector_num, QEMUIOVector *qiov,
-                                   int nb_sectors,
-                                   BlockCompletionFunc *cb, void *opaque)
+static BlockAIOCB *vxhs_aio_pwritev(BlockDriverState *bs,
+                                    uint64_t offset, uint64_t bytes,
+                                    QEMUIOVector *qiov, int flags,
+                                    BlockCompletionFunc *cb, void *opaque)
 {
-    return vxhs_aio_rw(bs, sector_num, qiov, nb_sectors,
-                       cb, opaque, VDISK_AIO_WRITE);
+    return vxhs_aio_rw(bs, offset, qiov, bytes, cb, opaque, VDISK_AIO_WRITE);
 }

 static void vxhs_close(BlockDriverState *bs)
@@ -563,8 +557,8 @@ static BlockDriver bdrv_vxhs = {
     .bdrv_parse_filename          = vxhs_parse_filename,
     .bdrv_close                   = vxhs_close,
     .bdrv_getlength               = vxhs_getlength,
-    .bdrv_aio_readv               = vxhs_aio_readv,
-    .bdrv_aio_writev              = vxhs_aio_writev,
+    .bdrv_aio_preadv              = vxhs_aio_preadv,
+    .bdrv_aio_pwritev             = vxhs_aio_pwritev,
 };

 static void bdrv_vxhs_init(void)
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 6/6] block: Drop last of the sector-based aio callbacks
  2018-02-15 19:28 [Qemu-devel] [PATCH 0/6] block: byte-based AIO read/write Eric Blake
                   ` (4 preceding siblings ...)
  2018-02-15 19:28 ` [Qemu-devel] [PATCH 5/6] vxhs: " Eric Blake
@ 2018-02-15 19:28 ` Eric Blake
  2018-04-24 15:02 ` [Qemu-devel] [PATCH 0/6] block: byte-based AIO read/write Eric Blake
  6 siblings, 0 replies; 17+ messages in thread
From: Eric Blake @ 2018-02-15 19:28 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, kwolf, Stefan Hajnoczi, Fam Zheng, Max Reitz

We are gradually moving away from sector-based interfaces, towards
byte-based.  Now that all drivers with aio callbacks are using the
byte-based interfaces, we can remove the sector-based versions.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 include/block/block_int.h |  6 ------
 block/io.c                | 23 ++++++-----------------
 2 files changed, 6 insertions(+), 23 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index c882dc4232d..0f5b7accfa8 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -134,15 +134,9 @@ struct BlockDriver {
     void (*bdrv_refresh_filename)(BlockDriverState *bs, QDict *options);

     /* aio */
-    BlockAIOCB *(*bdrv_aio_readv)(BlockDriverState *bs,
-        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
-        BlockCompletionFunc *cb, void *opaque);
     BlockAIOCB *(*bdrv_aio_preadv)(BlockDriverState *bs,
         uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags,
         BlockCompletionFunc *cb, void *opaque);
-    BlockAIOCB *(*bdrv_aio_writev)(BlockDriverState *bs,
-        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
-        BlockCompletionFunc *cb, void *opaque);
     BlockAIOCB *(*bdrv_aio_pwritev)(BlockDriverState *bs,
         uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags,
         BlockCompletionFunc *cb, void *opaque);
diff --git a/block/io.c b/block/io.c
index 84a4caa72b7..04239f01eef 100644
--- a/block/io.c
+++ b/block/io.c
@@ -948,13 +948,8 @@ static int coroutine_fn bdrv_driver_preadv(BlockDriverState *bs,
             .coroutine = qemu_coroutine_self(),
         };

-        if (drv->bdrv_aio_preadv) {
-            acb = bs->drv->bdrv_aio_preadv(bs, offset, bytes, qiov, flags,
-                                           bdrv_co_io_em_complete, &co);
-        } else {
-            acb = bs->drv->bdrv_aio_readv(bs, sector_num, qiov, nb_sectors,
-                                          bdrv_co_io_em_complete, &co);
-        }
+        acb = bs->drv->bdrv_aio_preadv(bs, offset, bytes, qiov, flags,
+                                       bdrv_co_io_em_complete, &co);
         if (acb == NULL) {
             return -EIO;
         } else {
@@ -1008,16 +1003,10 @@ static int coroutine_fn bdrv_driver_pwritev(BlockDriverState *bs,
             .coroutine = qemu_coroutine_self(),
         };

-        if (drv->bdrv_aio_pwritev) {
-            acb = bs->drv->bdrv_aio_pwritev(bs, offset, bytes, qiov,
-                                            flags & bs->supported_write_flags,
-                                            bdrv_co_io_em_complete, &co);
-            flags &= ~bs->supported_write_flags;
-        } else {
-            assert(!bs->supported_write_flags);
-            acb = bs->drv->bdrv_aio_writev(bs, sector_num, qiov, nb_sectors,
-                                           bdrv_co_io_em_complete, &co);
-        }
+        acb = bs->drv->bdrv_aio_pwritev(bs, offset, bytes, qiov,
+                                        flags & bs->supported_write_flags,
+                                        bdrv_co_io_em_complete, &co);
+        flags &= ~bs->supported_write_flags;
         if (acb == NULL) {
             ret = -EIO;
         } else {
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 0/6] block: byte-based AIO read/write
  2018-02-15 19:28 [Qemu-devel] [PATCH 0/6] block: byte-based AIO read/write Eric Blake
                   ` (5 preceding siblings ...)
  2018-02-15 19:28 ` [Qemu-devel] [PATCH 6/6] block: Drop last of the sector-based aio callbacks Eric Blake
@ 2018-04-24 15:02 ` Eric Blake
  2018-04-24 19:13   ` John Snow
  6 siblings, 1 reply; 17+ messages in thread
From: Eric Blake @ 2018-04-24 15:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, qemu-block

[-- Attachment #1: Type: text/plain, Size: 1754 bytes --]

On 02/15/2018 01:28 PM, Eric Blake wrote:
> While we would prefer that block drivers use coroutines instead
> of aio callbacks, it is a fairly easy exercise to prove that
> all existing drivers with aio callbacks are merely scaling
> from bytes into sectors and back to bytes.  So, even though I
> am not set up to completely run (or even compile-test) this
> full series, it seems pretty straightforward to change the
> signature to quit playing games with pointless scaling.
> 
> Note that except for the null-aio driver, I intentionally did
> NOT try and change the request_alignment from the block layer's
> default of 512 (it defaults to 1 only for drivers that use
> coroutines).
> 
> (And along the way, I got my docker-test-mingw@fedora working;
> thanks to the help I got on IRC)

ping

> 
> Eric Blake (6):
>   block: Support byte-based aio callbacks
>   file-win32: Switch to byte-based callbacks
>   null: Switch to byte-based read/write
>   rbd: Switch to byte-based callbacks
>   vxhs: Switch to byte-based callbacks
>   block: Drop last of the sector-based aio callbacks
> 
>  include/block/block_int.h |  8 +++---
>  include/block/raw-aio.h   |  2 +-
>  block/io.c                | 26 ++++++++++++-------
>  block/file-win32.c        | 36 +++++++++++++-------------
>  block/null.c              | 66 ++++++++++++++++++++++++++---------------------
>  block/rbd.c               | 36 ++++++++++++--------------
>  block/vxhs.c              | 36 +++++++++++---------------
>  block/win32-aio.c         |  5 ++--
>  8 files changed, 109 insertions(+), 106 deletions(-)
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 619 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 1/6] block: Support byte-based aio callbacks
  2018-02-15 19:28 ` [Qemu-devel] [PATCH 1/6] block: Support byte-based aio callbacks Eric Blake
@ 2018-04-24 15:40   ` Kevin Wolf
  2018-04-24 17:06     ` Eric Blake
  0 siblings, 1 reply; 17+ messages in thread
From: Kevin Wolf @ 2018-04-24 15:40 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, qemu-block, Stefan Hajnoczi, Fam Zheng, Max Reitz

Am 15.02.2018 um 20:28 hat Eric Blake geschrieben:
> We are gradually moving away from sector-based interfaces, towards
> byte-based.  Add new sector-based aio callbacks for read and write,
> to match the fact that bdrv_aio_pdiscard is already byte-based.
> 
> Ideally, drivers should be converted to use coroutine callbacks
> rather than aio; but that is not quite as trivial (if we do that
> conversion, the null-aio driver will disappear), so for the
> short term, converting the signature but keeping things with aio
> is easier.  Once all drivers are converted, the sector-based aio
> callbacks will be removed.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>  include/block/block_int.h |  6 ++++++
>  block/io.c                | 37 +++++++++++++++++++++++++++----------
>  2 files changed, 33 insertions(+), 10 deletions(-)
> 
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 5ae7738cf8d..c882dc4232d 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -137,9 +137,15 @@ struct BlockDriver {
>      BlockAIOCB *(*bdrv_aio_readv)(BlockDriverState *bs,
>          int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
>          BlockCompletionFunc *cb, void *opaque);
> +    BlockAIOCB *(*bdrv_aio_preadv)(BlockDriverState *bs,
> +        uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags,
> +        BlockCompletionFunc *cb, void *opaque);
>      BlockAIOCB *(*bdrv_aio_writev)(BlockDriverState *bs,
>          int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
>          BlockCompletionFunc *cb, void *opaque);
> +    BlockAIOCB *(*bdrv_aio_pwritev)(BlockDriverState *bs,
> +        uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags,
> +        BlockCompletionFunc *cb, void *opaque);
>      BlockAIOCB *(*bdrv_aio_flush)(BlockDriverState *bs,
>          BlockCompletionFunc *cb, void *opaque);
>      BlockAIOCB *(*bdrv_aio_pdiscard)(BlockDriverState *bs,
> diff --git a/block/io.c b/block/io.c
> index 4d3d1f640a3..84a4caa72b7 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -934,9 +934,11 @@ static int coroutine_fn bdrv_driver_preadv(BlockDriverState *bs,
>      sector_num = offset >> BDRV_SECTOR_BITS;
>      nb_sectors = bytes >> BDRV_SECTOR_BITS;
> 
> -    assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
> -    assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
> -    assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
> +    if (!drv->bdrv_aio_preadv) {
> +        assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
> +        assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
> +        assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
> +    }

Hm, this is kind of ugly. Previously, we handled everything byte-aligned
in the first section, now we mix both in the second section.

I can see that you do this so you don't have to duplicate the acb and
coroutine yielding code below, but can we move things into the right
place in the final patch at least? That is, calculate sector_num and
nb_sectors only if all the byte-based interfaces weren't available.

Kevin

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 3/6] null: Switch to byte-based read/write
  2018-02-15 19:28 ` [Qemu-devel] [PATCH 3/6] null: Switch to byte-based read/write Eric Blake
@ 2018-04-24 15:52   ` Kevin Wolf
  2018-04-24 17:00     ` Eric Blake
  0 siblings, 1 reply; 17+ messages in thread
From: Kevin Wolf @ 2018-04-24 15:52 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, qemu-block, Fam Zheng, Max Reitz

Am 15.02.2018 um 20:28 hat Eric Blake geschrieben:
> We are gradually moving away from sector-based interfaces, towards
> byte-based.  Make the change for the last few sector-based callbacks
> in the null-co and null-aio drivers.
> 
> Note that since the null driver does nothing on writes, it trivially
> supports the BDRV_REQ_FUA flag (all writes have already landed to
> the same bit-bucket without needing an extra flush call).  Furthermore,
> bdrv_refresh_limits() defaults the block size to 512 for any driver
> that does not support coroutines; while this is still correct for the
> other aio-based drivers, the null driver does just as well with
> byte-based requests, and being explicit means we can avoid cycles
> wasted on read-modify-write.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>

> +static void null_refresh_limits(BlockDriverState *bs, Error **errp)
> +{
> +    bs->bl.request_alignment = 1;
> +}

I would rather modify bdrv_refresh_limits() so that it defaults to 1 for
drivers supporting either .bdrv_co_preadv or .bdrv_aio_preadv.

Kevin

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 3/6] null: Switch to byte-based read/write
  2018-04-24 15:52   ` Kevin Wolf
@ 2018-04-24 17:00     ` Eric Blake
  2018-04-24 17:19       ` Kevin Wolf
  0 siblings, 1 reply; 17+ messages in thread
From: Eric Blake @ 2018-04-24 17:00 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, qemu-block, Fam Zheng, Max Reitz

[-- Attachment #1: Type: text/plain, Size: 1446 bytes --]

On 04/24/2018 10:52 AM, Kevin Wolf wrote:
> Am 15.02.2018 um 20:28 hat Eric Blake geschrieben:
>> We are gradually moving away from sector-based interfaces, towards
>> byte-based.  Make the change for the last few sector-based callbacks
>> in the null-co and null-aio drivers.
>>
>> Note that since the null driver does nothing on writes, it trivially
>> supports the BDRV_REQ_FUA flag (all writes have already landed to
>> the same bit-bucket without needing an extra flush call).  Furthermore,
>> bdrv_refresh_limits() defaults the block size to 512 for any driver
>> that does not support coroutines; while this is still correct for the
>> other aio-based drivers, the null driver does just as well with
>> byte-based requests, and being explicit means we can avoid cycles
>> wasted on read-modify-write.
>>
>> Signed-off-by: Eric Blake <eblake@redhat.com>
> 
>> +static void null_refresh_limits(BlockDriverState *bs, Error **errp)
>> +{
>> +    bs->bl.request_alignment = 1;
>> +}
> 
> I would rather modify bdrv_refresh_limits() so that it defaults to 1 for
> drivers supporting either .bdrv_co_preadv or .bdrv_aio_preadv.

Sure, I can do that (although then I may have to provide a
refresh_limits callback for each of the other drivers that I
specifically left at 512 alignment).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 619 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 1/6] block: Support byte-based aio callbacks
  2018-04-24 15:40   ` Kevin Wolf
@ 2018-04-24 17:06     ` Eric Blake
  2018-04-24 17:15       ` Kevin Wolf
  0 siblings, 1 reply; 17+ messages in thread
From: Eric Blake @ 2018-04-24 17:06 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, qemu-block, Stefan Hajnoczi, Fam Zheng, Max Reitz

[-- Attachment #1: Type: text/plain, Size: 3247 bytes --]

On 04/24/2018 10:40 AM, Kevin Wolf wrote:
> Am 15.02.2018 um 20:28 hat Eric Blake geschrieben:
>> We are gradually moving away from sector-based interfaces, towards
>> byte-based.  Add new sector-based aio callbacks for read and write,
>> to match the fact that bdrv_aio_pdiscard is already byte-based.
>>
>> Ideally, drivers should be converted to use coroutine callbacks
>> rather than aio; but that is not quite as trivial (if we do that
>> conversion, the null-aio driver will disappear), so for the
>> short term, converting the signature but keeping things with aio
>> is easier.  Once all drivers are converted, the sector-based aio
>> callbacks will be removed.
>>
>> Signed-off-by: Eric Blake <eblake@redhat.com>
>> ---
>>  include/block/block_int.h |  6 ++++++
>>  block/io.c                | 37 +++++++++++++++++++++++++++----------
>>  2 files changed, 33 insertions(+), 10 deletions(-)
>>

>> +++ b/block/io.c
>> @@ -934,9 +934,11 @@ static int coroutine_fn bdrv_driver_preadv(BlockDriverState *bs,
>>      sector_num = offset >> BDRV_SECTOR_BITS;
>>      nb_sectors = bytes >> BDRV_SECTOR_BITS;
>>
>> -    assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
>> -    assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
>> -    assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
>> +    if (!drv->bdrv_aio_preadv) {
>> +        assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
>> +        assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
>> +        assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
>> +    }
> 
> Hm, this is kind of ugly. Previously, we handled everything byte-aligned
> in the first section, now we mix both in the second section.
> 
> I can see that you do this so you don't have to duplicate the acb and
> coroutine yielding code below, but can we move things into the right
> place in the final patch at least? That is, calculate sector_num and
> nb_sectors only if all the byte-based interfaces weren't available.

Yeah, that's easy enough to squash into patch 6:

diff --git i/block/io.c w/block/io.c
index ba767612931..49fabe8eeb1 100644
--- i/block/io.c
+++ w/block/io.c
@@ -924,16 +924,13 @@ static int coroutine_fn
bdrv_driver_preadv(BlockDriverState *bs,
         return drv->bdrv_co_preadv(bs, offset, bytes, qiov, flags);
     }

-    sector_num = offset >> BDRV_SECTOR_BITS;
-    nb_sectors = bytes >> BDRV_SECTOR_BITS;
-
-    if (!drv->bdrv_aio_preadv) {
-        assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
-        assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
-        assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
-    }
-
     if (drv->bdrv_co_readv) {
+        sector_num = offset >> BDRV_SECTOR_BITS;
+        nb_sectors = bytes >> BDRV_SECTOR_BITS;
+
+        assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
+        assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
+        assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
         return drv->bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
     } else {
         BlockAIOCB *acb;

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 619 bytes --]

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 1/6] block: Support byte-based aio callbacks
  2018-04-24 17:06     ` Eric Blake
@ 2018-04-24 17:15       ` Kevin Wolf
  2018-04-24 19:16         ` Eric Blake
  0 siblings, 1 reply; 17+ messages in thread
From: Kevin Wolf @ 2018-04-24 17:15 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, qemu-block, Stefan Hajnoczi, Fam Zheng, Max Reitz

[-- Attachment #1: Type: text/plain, Size: 3431 bytes --]

Am 24.04.2018 um 19:06 hat Eric Blake geschrieben:
> On 04/24/2018 10:40 AM, Kevin Wolf wrote:
> > Am 15.02.2018 um 20:28 hat Eric Blake geschrieben:
> >> We are gradually moving away from sector-based interfaces, towards
> >> byte-based.  Add new sector-based aio callbacks for read and write,
> >> to match the fact that bdrv_aio_pdiscard is already byte-based.
> >>
> >> Ideally, drivers should be converted to use coroutine callbacks
> >> rather than aio; but that is not quite as trivial (if we do that
> >> conversion, the null-aio driver will disappear), so for the
> >> short term, converting the signature but keeping things with aio
> >> is easier.  Once all drivers are converted, the sector-based aio
> >> callbacks will be removed.
> >>
> >> Signed-off-by: Eric Blake <eblake@redhat.com>
> >> ---
> >>  include/block/block_int.h |  6 ++++++
> >>  block/io.c                | 37 +++++++++++++++++++++++++++----------
> >>  2 files changed, 33 insertions(+), 10 deletions(-)
> >>
> 
> >> +++ b/block/io.c
> >> @@ -934,9 +934,11 @@ static int coroutine_fn bdrv_driver_preadv(BlockDriverState *bs,
> >>      sector_num = offset >> BDRV_SECTOR_BITS;
> >>      nb_sectors = bytes >> BDRV_SECTOR_BITS;
> >>
> >> -    assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
> >> -    assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
> >> -    assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
> >> +    if (!drv->bdrv_aio_preadv) {
> >> +        assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
> >> +        assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
> >> +        assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
> >> +    }
> > 
> > Hm, this is kind of ugly. Previously, we handled everything byte-aligned
> > in the first section, now we mix both in the second section.
> > 
> > I can see that you do this so you don't have to duplicate the acb and
> > coroutine yielding code below, but can we move things into the right
> > place in the final patch at least? That is, calculate sector_num and
> > nb_sectors only if all the byte-based interfaces weren't available.
> 
> Yeah, that's easy enough to squash into patch 6:
> 
> diff --git i/block/io.c w/block/io.c
> index ba767612931..49fabe8eeb1 100644
> --- i/block/io.c
> +++ w/block/io.c
> @@ -924,16 +924,13 @@ static int coroutine_fn
> bdrv_driver_preadv(BlockDriverState *bs,
>          return drv->bdrv_co_preadv(bs, offset, bytes, qiov, flags);
>      }
> 
> -    sector_num = offset >> BDRV_SECTOR_BITS;
> -    nb_sectors = bytes >> BDRV_SECTOR_BITS;
> -
> -    if (!drv->bdrv_aio_preadv) {
> -        assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
> -        assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
> -        assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
> -    }
> -
>      if (drv->bdrv_co_readv) {
> +        sector_num = offset >> BDRV_SECTOR_BITS;
> +        nb_sectors = bytes >> BDRV_SECTOR_BITS;
> +
> +        assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
> +        assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
> +        assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
>          return drv->bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
>      } else {
>          BlockAIOCB *acb;

Ah, yes. I thought of moving the code in the else block, but this works,
too. Maybe it's even a bit nicer.

Kevin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 3/6] null: Switch to byte-based read/write
  2018-04-24 17:00     ` Eric Blake
@ 2018-04-24 17:19       ` Kevin Wolf
  2018-04-24 17:40         ` Eric Blake
  0 siblings, 1 reply; 17+ messages in thread
From: Kevin Wolf @ 2018-04-24 17:19 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, qemu-block, Fam Zheng, Max Reitz

[-- Attachment #1: Type: text/plain, Size: 1779 bytes --]

Am 24.04.2018 um 19:00 hat Eric Blake geschrieben:
> On 04/24/2018 10:52 AM, Kevin Wolf wrote:
> > Am 15.02.2018 um 20:28 hat Eric Blake geschrieben:
> >> We are gradually moving away from sector-based interfaces, towards
> >> byte-based.  Make the change for the last few sector-based callbacks
> >> in the null-co and null-aio drivers.
> >>
> >> Note that since the null driver does nothing on writes, it trivially
> >> supports the BDRV_REQ_FUA flag (all writes have already landed to
> >> the same bit-bucket without needing an extra flush call).  Furthermore,
> >> bdrv_refresh_limits() defaults the block size to 512 for any driver
> >> that does not support coroutines; while this is still correct for the
> >> other aio-based drivers, the null driver does just as well with
> >> byte-based requests, and being explicit means we can avoid cycles
> >> wasted on read-modify-write.
> >>
> >> Signed-off-by: Eric Blake <eblake@redhat.com>
> > 
> >> +static void null_refresh_limits(BlockDriverState *bs, Error **errp)
> >> +{
> >> +    bs->bl.request_alignment = 1;
> >> +}
> > 
> > I would rather modify bdrv_refresh_limits() so that it defaults to 1 for
> > drivers supporting either .bdrv_co_preadv or .bdrv_aio_preadv.
> 
> Sure, I can do that (although then I may have to provide a
> refresh_limits callback for each of the other drivers that I
> specifically left at 512 alignment).

Do we know that the other drivers need 512, or do we expect than they
can handle byte granularity, but we try to err on the safe side?

If it's the latter (which would be my expectation), then it's probably
better to set request_alignment = 512 with a comment in those drivers
and hope that someone will lift the restriction in the future.

Kevin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 3/6] null: Switch to byte-based read/write
  2018-04-24 17:19       ` Kevin Wolf
@ 2018-04-24 17:40         ` Eric Blake
  0 siblings, 0 replies; 17+ messages in thread
From: Eric Blake @ 2018-04-24 17:40 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, qemu-block, Fam Zheng, Max Reitz

[-- Attachment #1: Type: text/plain, Size: 1395 bytes --]

On 04/24/2018 12:19 PM, Kevin Wolf wrote:

>>>> +static void null_refresh_limits(BlockDriverState *bs, Error **errp)
>>>> +{
>>>> +    bs->bl.request_alignment = 1;
>>>> +}
>>>
>>> I would rather modify bdrv_refresh_limits() so that it defaults to 1 for
>>> drivers supporting either .bdrv_co_preadv or .bdrv_aio_preadv.
>>
>> Sure, I can do that (although then I may have to provide a
>> refresh_limits callback for each of the other drivers that I
>> specifically left at 512 alignment).
> 
> Do we know that the other drivers need 512, or do we expect than they
> can handle byte granularity, but we try to err on the safe side?

Done to err on the safe side and avoid the difficulty of a deep audit in
code I'm unfamiliar with; in fact, for vxhs, I wasn't even able to
compile test things, which means I certainly wasn't able to reproduce a
test to see if dropping the alignment would break anything under qemu-io.

> 
> If it's the latter (which would be my expectation), then it's probably
> better to set request_alignment = 512 with a comment in those drivers
> and hope that someone will lift the restriction in the future.

Indeed, leaving a good comment for a future reader is worth the effort.
I'll send a v2 later today.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 619 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 0/6] block: byte-based AIO read/write
  2018-04-24 15:02 ` [Qemu-devel] [PATCH 0/6] block: byte-based AIO read/write Eric Blake
@ 2018-04-24 19:13   ` John Snow
  0 siblings, 0 replies; 17+ messages in thread
From: John Snow @ 2018-04-24 19:13 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: kwolf, qemu-block



On 04/24/2018 11:02 AM, Eric Blake wrote:
> On 02/15/2018 01:28 PM, Eric Blake wrote:
>> While we would prefer that block drivers use coroutines instead
>> of aio callbacks, it is a fairly easy exercise to prove that
>> all existing drivers with aio callbacks are merely scaling
>> from bytes into sectors and back to bytes.  So, even though I
>> am not set up to completely run (or even compile-test) this
>> full series, it seems pretty straightforward to change the
>> signature to quit playing games with pointless scaling.
>>
>> Note that except for the null-aio driver, I intentionally did
>> NOT try and change the request_alignment from the block layer's
>> default of 512 (it defaults to 1 only for drivers that use
>> coroutines).
>>
>> (And along the way, I got my docker-test-mingw@fedora working;
>> thanks to the help I got on IRC)
> 
> ping
> 

One of your replies to patch 3 suggests you were going to send a v2 --
is that not the case? Do you want deeper review of v1 first?

--js

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 1/6] block: Support byte-based aio callbacks
  2018-04-24 17:15       ` Kevin Wolf
@ 2018-04-24 19:16         ` Eric Blake
  0 siblings, 0 replies; 17+ messages in thread
From: Eric Blake @ 2018-04-24 19:16 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, qemu-block, Stefan Hajnoczi, Fam Zheng, Max Reitz

[-- Attachment #1: Type: text/plain, Size: 1034 bytes --]

On 04/24/2018 12:15 PM, Kevin Wolf wrote:

>>> Hm, this is kind of ugly. Previously, we handled everything byte-aligned
>>> in the first section, now we mix both in the second section.
>>>
>>> I can see that you do this so you don't have to duplicate the acb and
>>> coroutine yielding code below, but can we move things into the right
>>> place in the final patch at least? That is, calculate sector_num and
>>> nb_sectors only if all the byte-based interfaces weren't available.
>>
>> Yeah, that's easy enough to squash into patch 6:
>>

> 
> Ah, yes. I thought of moving the code in the else block, but this works,
> too. Maybe it's even a bit nicer.

Moving the code into the 'if' works for bdrv_co_readv, but not so nicely
for the bdrv_co_writev vs. bdrv_co_writev_flags.  So for v2, I'll just
hoist the aio code first; and I guess I smell another round of cleanups
coming...

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 615 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2018-04-24 19:16 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-15 19:28 [Qemu-devel] [PATCH 0/6] block: byte-based AIO read/write Eric Blake
2018-02-15 19:28 ` [Qemu-devel] [PATCH 1/6] block: Support byte-based aio callbacks Eric Blake
2018-04-24 15:40   ` Kevin Wolf
2018-04-24 17:06     ` Eric Blake
2018-04-24 17:15       ` Kevin Wolf
2018-04-24 19:16         ` Eric Blake
2018-02-15 19:28 ` [Qemu-devel] [PATCH 2/6] file-win32: Switch to byte-based callbacks Eric Blake
2018-02-15 19:28 ` [Qemu-devel] [PATCH 3/6] null: Switch to byte-based read/write Eric Blake
2018-04-24 15:52   ` Kevin Wolf
2018-04-24 17:00     ` Eric Blake
2018-04-24 17:19       ` Kevin Wolf
2018-04-24 17:40         ` Eric Blake
2018-02-15 19:28 ` [Qemu-devel] [PATCH 4/6] rbd: Switch to byte-based callbacks Eric Blake
2018-02-15 19:28 ` [Qemu-devel] [PATCH 5/6] vxhs: " Eric Blake
2018-02-15 19:28 ` [Qemu-devel] [PATCH 6/6] block: Drop last of the sector-based aio callbacks Eric Blake
2018-04-24 15:02 ` [Qemu-devel] [PATCH 0/6] block: byte-based AIO read/write Eric Blake
2018-04-24 19:13   ` John Snow

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.