All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 00/17] block: Convert .bdrv_read/write to .bdrv_preadv/pwritev
@ 2016-04-28 13:16 Kevin Wolf
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 01/17] block: Introduce bdrv_driver_preadv() Kevin Wolf
                   ` (18 more replies)
  0 siblings, 19 replies; 34+ messages in thread
From: Kevin Wolf @ 2016-04-28 13:16 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, stefanha, famz, sw, qemu-devel

This series introduces a new BlockDriver interface, which will hopefully be the
final one, or at least good enough for another few years: .bdrv_preadv/pwritev.
It is based on coroutines, vectored, has flags and uses a byte granularity.
This is now the preferred interface for new drivers.

All drivers still using the legacy interface .bdrv_read/write are converted to
the new interface and the emulation code we had for the old interface is
removed. Most of the drivers become zero-copy with these patches as they are
vectored now; only vvfat continues to allocate a bounce buffer.

The less obscure formats (vmdk, vdi, vpc) also natively support byte-aligned
requests now. The block layer is still enforcing a minimal alignment of 512, so
this isn't actually used yet, but in a next step, we can lift this restriction
for drivers that implement .bdrv_preadv/pwritev.

v2:
- Patch 3 ('block: Support AIO drivers in bdrv_driver_preadv/pwritev()')
  Removed bdrv_co_io_em in trace-event after removing its user [Eric]

- Patch 4 ('block: Rename bdrv_co_do_preadv/writev to bdrv_co_preadv/writev')
  Fixed some indentations [Eric]

- Patches 6, 7, 8, 12 (bochs, cloop, dmg, vmdk):
  Replaced returns by goto fail to fix missing unlock and qiov leaks [Stefan]

- Patch 13 ('vmdk: Implement .bdrv_co_pwritev() interface')
  Use g_free() instead of free() [Stefan]

Kevin Wolf (17):
  block: Introduce bdrv_driver_preadv()
  block: Introduce bdrv_driver_pwritev()
  block: Support AIO drivers in bdrv_driver_preadv/pwritev()
  block: Rename bdrv_co_do_preadv/writev to bdrv_co_preadv/writev
  block: Introduce .bdrv_co_preadv/pwritev BlockDriver function
  bochs: Implement .bdrv_co_preadv() interface
  cloop: Implement .bdrv_co_preadv() interface
  dmg: Implement .bdrv_co_preadv() interface
  vdi: Implement .bdrv_co_preadv() interface
  vdi: Implement .bdrv_co_pwritev() interface
  vmdk: Add vmdk_find_offset_in_cluster()
  vmdk: Implement .bdrv_co_preadv() interface
  vmdk: Implement .bdrv_co_pwritev() interface
  vpc: Implement .bdrv_co_preadv() interface
  vpc: Implement .bdrv_co_pwritev() interface
  vvfat: Implement .bdrv_co_preadv/pwritev interfaces
  block: Remove BlockDriver.bdrv_read/write

 block.c                   |   2 -
 block/block-backend.c     |   4 +-
 block/bochs.c             |  51 ++++---
 block/cloop.c             |  38 +++---
 block/dmg.c               |  40 +++---
 block/io.c                | 321 ++++++++++++++++++--------------------------
 block/iscsi.c             |   8 --
 block/nbd.c               |   9 --
 block/raw_bsd.c           |  12 +-
 block/vdi.c               | 127 ++++++++++--------
 block/vmdk.c              | 330 ++++++++++++++++++++++++++++------------------
 block/vpc.c               | 165 ++++++++++++-----------
 block/vvfat.c             |  55 ++++++--
 hw/ide/macio.c            |   4 +-
 include/block/block_int.h |  12 +-
 trace-events              |   1 -
 16 files changed, 629 insertions(+), 550 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH v2 01/17] block: Introduce bdrv_driver_preadv()
  2016-04-28 13:16 [Qemu-devel] [PATCH v2 00/17] block: Convert .bdrv_read/write to .bdrv_preadv/pwritev Kevin Wolf
@ 2016-04-28 13:16 ` Kevin Wolf
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 02/17] block: Introduce bdrv_driver_pwritev() Kevin Wolf
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Kevin Wolf @ 2016-04-28 13:16 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, stefanha, famz, sw, qemu-devel

This is a function that simply calls into the block driver for doing a
read, providing the byte granularity interface we want to eventually
have everywhere, and using whatever interface that driver supports.

For now, this is just a wrapper for calling bs->drv->bdrv_co_readv().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/io.c | 30 +++++++++++++++++++++++-------
 1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/block/io.c b/block/io.c
index a7dbf85..586a46a 100644
--- a/block/io.c
+++ b/block/io.c
@@ -800,6 +800,21 @@ int bdrv_pwrite_sync(BlockDriverState *bs, int64_t offset,
     return 0;
 }
 
+static int coroutine_fn bdrv_driver_preadv(BlockDriverState *bs,
+                                           uint64_t offset, uint64_t bytes,
+                                           QEMUIOVector *qiov, int flags)
+{
+    BlockDriver *drv = bs->drv;
+    int64_t sector_num = offset >> BDRV_SECTOR_BITS;
+    unsigned int nb_sectors = bytes >> BDRV_SECTOR_BITS;
+
+    assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
+    assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
+    assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
+
+    return drv->bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
+}
+
 static int coroutine_fn bdrv_co_do_copy_on_readv(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
 {
@@ -836,8 +851,9 @@ static int coroutine_fn bdrv_co_do_copy_on_readv(BlockDriverState *bs,
 
     qemu_iovec_init_external(&bounce_qiov, &iov, 1);
 
-    ret = drv->bdrv_co_readv(bs, cluster_sector_num, cluster_nb_sectors,
-                             &bounce_qiov);
+    ret = bdrv_driver_preadv(bs, cluster_sector_num * BDRV_SECTOR_SIZE,
+                             cluster_nb_sectors * BDRV_SECTOR_SIZE,
+                             &bounce_qiov, 0);
     if (ret < 0) {
         goto err;
     }
@@ -880,7 +896,6 @@ static int coroutine_fn bdrv_aligned_preadv(BlockDriverState *bs,
     BdrvTrackedRequest *req, int64_t offset, unsigned int bytes,
     int64_t align, QEMUIOVector *qiov, int flags)
 {
-    BlockDriver *drv = bs->drv;
     int ret;
 
     int64_t sector_num = offset >> BDRV_SECTOR_BITS;
@@ -921,7 +936,7 @@ static int coroutine_fn bdrv_aligned_preadv(BlockDriverState *bs,
 
     /* Forward the request to the BlockDriver */
     if (!bs->zero_beyond_eof) {
-        ret = drv->bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
+        ret = bdrv_driver_preadv(bs, offset, bytes, qiov, 0);
     } else {
         /* Read zeros after EOF */
         int64_t total_sectors, max_nb_sectors;
@@ -935,7 +950,7 @@ static int coroutine_fn bdrv_aligned_preadv(BlockDriverState *bs,
         max_nb_sectors = ROUND_UP(MAX(0, total_sectors - sector_num),
                                   align >> BDRV_SECTOR_BITS);
         if (nb_sectors < max_nb_sectors) {
-            ret = drv->bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
+            ret = bdrv_driver_preadv(bs, offset, bytes, qiov, 0);
         } else if (max_nb_sectors > 0) {
             QEMUIOVector local_qiov;
 
@@ -943,8 +958,9 @@ static int coroutine_fn bdrv_aligned_preadv(BlockDriverState *bs,
             qemu_iovec_concat(&local_qiov, qiov, 0,
                               max_nb_sectors * BDRV_SECTOR_SIZE);
 
-            ret = drv->bdrv_co_readv(bs, sector_num, max_nb_sectors,
-                                     &local_qiov);
+            ret = bdrv_driver_preadv(bs, offset,
+                                     max_nb_sectors * BDRV_SECTOR_SIZE,
+                                     &local_qiov, 0);
 
             qemu_iovec_destroy(&local_qiov);
         } else {
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH v2 02/17] block: Introduce bdrv_driver_pwritev()
  2016-04-28 13:16 [Qemu-devel] [PATCH v2 00/17] block: Convert .bdrv_read/write to .bdrv_preadv/pwritev Kevin Wolf
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 01/17] block: Introduce bdrv_driver_preadv() Kevin Wolf
@ 2016-04-28 13:16 ` Kevin Wolf
  2016-04-29  1:14   ` Fam Zheng
  2016-04-30 21:54   ` Eric Blake
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 03/17] block: Support AIO drivers in bdrv_driver_preadv/pwritev() Kevin Wolf
                   ` (16 subsequent siblings)
  18 siblings, 2 replies; 34+ messages in thread
From: Kevin Wolf @ 2016-04-28 13:16 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, stefanha, famz, sw, qemu-devel

This is a function that simply calls into the block driver for doing a
write, providing the byte granularity interface we want to eventually
have everywhere, and using whatever interface that driver supports.

This one is a bit more interesting that the version for reads: It adds
support for .bdrv_co_writev_flags() everywhere, so that drivers
implementing this function can drop .bdrv_co_writev() now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/io.c      | 51 ++++++++++++++++++++++++++++++++++++---------------
 block/iscsi.c   |  8 --------
 block/nbd.c     |  9 ---------
 block/raw_bsd.c |  8 --------
 4 files changed, 36 insertions(+), 40 deletions(-)

diff --git a/block/io.c b/block/io.c
index 586a46a..c9b2864 100644
--- a/block/io.c
+++ b/block/io.c
@@ -815,6 +815,36 @@ static int coroutine_fn bdrv_driver_preadv(BlockDriverState *bs,
     return drv->bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
 }
 
+static int coroutine_fn bdrv_driver_pwritev(BlockDriverState *bs,
+                                            uint64_t offset, uint64_t bytes,
+                                            QEMUIOVector *qiov, int flags)
+{
+    BlockDriver *drv = bs->drv;
+    int64_t sector_num = offset >> BDRV_SECTOR_BITS;
+    unsigned int nb_sectors = bytes >> BDRV_SECTOR_BITS;
+    int ret;
+
+    assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
+    assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
+    assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
+
+    if (drv->bdrv_co_writev_flags) {
+        ret = drv->bdrv_co_writev_flags(bs, sector_num, nb_sectors, qiov,
+                                        flags);
+    } else {
+        assert(drv->supported_write_flags == 0);
+        ret = drv->bdrv_co_writev(bs, sector_num, nb_sectors, qiov);
+    }
+
+    if (ret == 0 && (flags & BDRV_REQ_FUA) &&
+        !(drv->supported_write_flags & BDRV_REQ_FUA))
+    {
+        ret = bdrv_co_flush(bs);
+    }
+
+    return ret;
+}
+
 static int coroutine_fn bdrv_co_do_copy_on_readv(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
 {
@@ -866,8 +896,9 @@ static int coroutine_fn bdrv_co_do_copy_on_readv(BlockDriverState *bs,
         /* This does not change the data on the disk, it is not necessary
          * to flush even in cache=writethrough mode.
          */
-        ret = drv->bdrv_co_writev(bs, cluster_sector_num, cluster_nb_sectors,
-                                  &bounce_qiov);
+        ret = bdrv_driver_pwritev(bs, cluster_sector_num * BDRV_SECTOR_SIZE,
+                                  cluster_nb_sectors * BDRV_SECTOR_SIZE,
+                                  &bounce_qiov, 0);
     }
 
     if (ret < 0) {
@@ -1155,7 +1186,8 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
             }
             qemu_iovec_init_external(&qiov, &iov, 1);
 
-            ret = drv->bdrv_co_writev(bs, sector_num, num, &qiov);
+            ret = bdrv_driver_pwritev(bs, sector_num * BDRV_SECTOR_SIZE,
+                                      num * BDRV_SECTOR_SIZE, &qiov, 0);
 
             /* Keep bounce buffer around if it is big enough for all
              * all future requests.
@@ -1215,23 +1247,12 @@ static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs,
     } else if (flags & BDRV_REQ_ZERO_WRITE) {
         bdrv_debug_event(bs, BLKDBG_PWRITEV_ZERO);
         ret = bdrv_co_do_write_zeroes(bs, sector_num, nb_sectors, flags);
-    } else if (drv->bdrv_co_writev_flags) {
-        bdrv_debug_event(bs, BLKDBG_PWRITEV);
-        ret = drv->bdrv_co_writev_flags(bs, sector_num, nb_sectors, qiov,
-                                        flags);
     } else {
-        assert(drv->supported_write_flags == 0);
         bdrv_debug_event(bs, BLKDBG_PWRITEV);
-        ret = drv->bdrv_co_writev(bs, sector_num, nb_sectors, qiov);
+        ret = bdrv_driver_pwritev(bs, offset, bytes, qiov, flags);
     }
     bdrv_debug_event(bs, BLKDBG_PWRITEV_DONE);
 
-    if (ret == 0 && (flags & BDRV_REQ_FUA) &&
-        !(drv->supported_write_flags & BDRV_REQ_FUA))
-    {
-        ret = bdrv_co_flush(bs);
-    }
-
     bdrv_set_dirty(bs, sector_num, nb_sectors);
 
     if (bs->wr_highest_offset < offset + bytes) {
diff --git a/block/iscsi.c b/block/iscsi.c
index 302baf8..4f75204 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -513,13 +513,6 @@ retry:
     return 0;
 }
 
-static int coroutine_fn
-iscsi_co_writev(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
-                QEMUIOVector *iov)
-{
-    return iscsi_co_writev_flags(bs, sector_num, nb_sectors, iov, 0);
-}
-
 
 static bool iscsi_allocationmap_is_allocated(IscsiLun *iscsilun,
                                              int64_t sector_num, int nb_sectors)
@@ -1847,7 +1840,6 @@ static BlockDriver bdrv_iscsi = {
     .bdrv_co_discard      = iscsi_co_discard,
     .bdrv_co_write_zeroes = iscsi_co_write_zeroes,
     .bdrv_co_readv         = iscsi_co_readv,
-    .bdrv_co_writev        = iscsi_co_writev,
     .bdrv_co_writev_flags  = iscsi_co_writev_flags,
     .supported_write_flags = BDRV_REQ_FUA,
     .bdrv_co_flush_to_disk = iscsi_co_flush,
diff --git a/block/nbd.c b/block/nbd.c
index f7ea3b3..fccbfef 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -374,12 +374,6 @@ static int nbd_co_writev_flags(BlockDriverState *bs, int64_t sector_num,
     return ret;
 }
 
-static int nbd_co_writev(BlockDriverState *bs, int64_t sector_num,
-                         int nb_sectors, QEMUIOVector *qiov)
-{
-    return nbd_co_writev_flags(bs, sector_num, nb_sectors, qiov, 0);
-}
-
 static int nbd_co_flush(BlockDriverState *bs)
 {
     return nbd_client_co_flush(bs);
@@ -476,7 +470,6 @@ static BlockDriver bdrv_nbd = {
     .bdrv_parse_filename        = nbd_parse_filename,
     .bdrv_file_open             = nbd_open,
     .bdrv_co_readv              = nbd_co_readv,
-    .bdrv_co_writev             = nbd_co_writev,
     .bdrv_co_writev_flags       = nbd_co_writev_flags,
     .supported_write_flags      = BDRV_REQ_FUA,
     .bdrv_close                 = nbd_close,
@@ -496,7 +489,6 @@ static BlockDriver bdrv_nbd_tcp = {
     .bdrv_parse_filename        = nbd_parse_filename,
     .bdrv_file_open             = nbd_open,
     .bdrv_co_readv              = nbd_co_readv,
-    .bdrv_co_writev             = nbd_co_writev,
     .bdrv_co_writev_flags       = nbd_co_writev_flags,
     .supported_write_flags      = BDRV_REQ_FUA,
     .bdrv_close                 = nbd_close,
@@ -516,7 +508,6 @@ static BlockDriver bdrv_nbd_unix = {
     .bdrv_parse_filename        = nbd_parse_filename,
     .bdrv_file_open             = nbd_open,
     .bdrv_co_readv              = nbd_co_readv,
-    .bdrv_co_writev             = nbd_co_writev,
     .bdrv_co_writev_flags       = nbd_co_writev_flags,
     .supported_write_flags      = BDRV_REQ_FUA,
     .bdrv_close                 = nbd_close,
diff --git a/block/raw_bsd.c b/block/raw_bsd.c
index a6cc7e9..9c9d39b 100644
--- a/block/raw_bsd.c
+++ b/block/raw_bsd.c
@@ -116,13 +116,6 @@ fail:
     return ret;
 }
 
-static int coroutine_fn
-raw_co_writev(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
-              QEMUIOVector *qiov)
-{
-    return raw_co_writev_flags(bs, sector_num, nb_sectors, qiov, 0);
-}
-
 static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
                                             int64_t sector_num,
                                             int nb_sectors, int *pnum,
@@ -256,7 +249,6 @@ BlockDriver bdrv_raw = {
     .bdrv_close           = &raw_close,
     .bdrv_create          = &raw_create,
     .bdrv_co_readv        = &raw_co_readv,
-    .bdrv_co_writev       = &raw_co_writev,
     .bdrv_co_writev_flags = &raw_co_writev_flags,
     .supported_write_flags = BDRV_REQ_FUA,
     .bdrv_co_write_zeroes = &raw_co_write_zeroes,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH v2 03/17] block: Support AIO drivers in bdrv_driver_preadv/pwritev()
  2016-04-28 13:16 [Qemu-devel] [PATCH v2 00/17] block: Convert .bdrv_read/write to .bdrv_preadv/pwritev Kevin Wolf
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 01/17] block: Introduce bdrv_driver_preadv() Kevin Wolf
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 02/17] block: Introduce bdrv_driver_pwritev() Kevin Wolf
@ 2016-04-28 13:16 ` Kevin Wolf
  2016-04-29  1:22   ` Fam Zheng
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 04/17] block: Rename bdrv_co_do_preadv/writev to bdrv_co_preadv/writev Kevin Wolf
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 34+ messages in thread
From: Kevin Wolf @ 2016-04-28 13:16 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, stefanha, famz, sw, qemu-devel

Instead of registering emulation functions as .bdrv_co_writev, just
directly check whether the function is there or not, and use the AIO
interface if it isn't. This makes the read/write functions more
consistent with how things are done in other places (flush, discard,
etc.)

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/io.c   | 126 ++++++++++++++++++++++++-----------------------------------
 trace-events |   1 -
 2 files changed, 52 insertions(+), 75 deletions(-)

diff --git a/block/io.c b/block/io.c
index c9b2864..86d97c3 100644
--- a/block/io.c
+++ b/block/io.c
@@ -40,12 +40,6 @@ static BlockAIOCB *bdrv_aio_readv_em(BlockDriverState *bs,
 static BlockAIOCB *bdrv_aio_writev_em(BlockDriverState *bs,
         int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
         BlockCompletionFunc *cb, void *opaque);
-static int coroutine_fn bdrv_co_readv_em(BlockDriverState *bs,
-                                         int64_t sector_num, int nb_sectors,
-                                         QEMUIOVector *iov);
-static int coroutine_fn bdrv_co_writev_em(BlockDriverState *bs,
-                                         int64_t sector_num, int nb_sectors,
-                                         QEMUIOVector *iov);
 static BlockAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs,
                                          int64_t sector_num,
                                          QEMUIOVector *qiov,
@@ -125,19 +119,13 @@ void bdrv_io_limits_update_group(BlockDriverState *bs, const char *group)
 
 void bdrv_setup_io_funcs(BlockDriver *bdrv)
 {
-    /* Block drivers without coroutine functions need emulation */
-    if (!bdrv->bdrv_co_readv) {
-        bdrv->bdrv_co_readv = bdrv_co_readv_em;
-        bdrv->bdrv_co_writev = bdrv_co_writev_em;
-
-        /* bdrv_co_readv_em()/brdv_co_writev_em() work in terms of aio, so if
-         * the block driver lacks aio we need to emulate that too.
-         */
-        if (!bdrv->bdrv_aio_readv) {
-            /* add AIO emulation layer */
-            bdrv->bdrv_aio_readv = bdrv_aio_readv_em;
-            bdrv->bdrv_aio_writev = bdrv_aio_writev_em;
-        }
+    /* bdrv_co_readv_em()/brdv_co_writev_em() work in terms of aio, so if
+     * the block driver lacks aio we need to emulate that.
+     */
+    if (!bdrv->bdrv_aio_readv) {
+        /* add AIO emulation layer */
+        bdrv->bdrv_aio_readv = bdrv_aio_readv_em;
+        bdrv->bdrv_aio_writev = bdrv_aio_writev_em;
     }
 }
 
@@ -800,6 +788,19 @@ int bdrv_pwrite_sync(BlockDriverState *bs, int64_t offset,
     return 0;
 }
 
+typedef struct CoroutineIOCompletion {
+    Coroutine *coroutine;
+    int ret;
+} CoroutineIOCompletion;
+
+static void bdrv_co_io_em_complete(void *opaque, int ret)
+{
+    CoroutineIOCompletion *co = opaque;
+
+    co->ret = ret;
+    qemu_coroutine_enter(co->coroutine, NULL);
+}
+
 static int coroutine_fn bdrv_driver_preadv(BlockDriverState *bs,
                                            uint64_t offset, uint64_t bytes,
                                            QEMUIOVector *qiov, int flags)
@@ -812,7 +813,23 @@ static int coroutine_fn bdrv_driver_preadv(BlockDriverState *bs,
     assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
     assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
 
-    return drv->bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
+    if (drv->bdrv_co_readv) {
+        return drv->bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
+    } else {
+        BlockAIOCB *acb;
+        CoroutineIOCompletion co = {
+            .coroutine = qemu_coroutine_self(),
+        };
+
+        acb = bs->drv->bdrv_aio_readv(bs, sector_num, qiov, nb_sectors,
+                                      bdrv_co_io_em_complete, &co);
+        if (acb == NULL) {
+            return -EIO;
+        } else {
+            qemu_coroutine_yield();
+            return co.ret;
+        }
+    }
 }
 
 static int coroutine_fn bdrv_driver_pwritev(BlockDriverState *bs,
@@ -831,9 +848,23 @@ static int coroutine_fn bdrv_driver_pwritev(BlockDriverState *bs,
     if (drv->bdrv_co_writev_flags) {
         ret = drv->bdrv_co_writev_flags(bs, sector_num, nb_sectors, qiov,
                                         flags);
-    } else {
+    } else if (drv->bdrv_co_writev) {
         assert(drv->supported_write_flags == 0);
         ret = drv->bdrv_co_writev(bs, sector_num, nb_sectors, qiov);
+    } else {
+        BlockAIOCB *acb;
+        CoroutineIOCompletion co = {
+            .coroutine = qemu_coroutine_self(),
+        };
+
+        acb = bs->drv->bdrv_aio_writev(bs, sector_num, qiov, nb_sectors,
+                                       bdrv_co_io_em_complete, &co);
+        if (acb == NULL) {
+            return -EIO;
+        } else {
+            qemu_coroutine_yield();
+            return co.ret;
+        }
     }
 
     if (ret == 0 && (flags & BDRV_REQ_FUA) &&
@@ -2351,59 +2382,6 @@ void qemu_aio_unref(void *p)
 /**************************************************************/
 /* Coroutine block device emulation */
 
-typedef struct CoroutineIOCompletion {
-    Coroutine *coroutine;
-    int ret;
-} CoroutineIOCompletion;
-
-static void bdrv_co_io_em_complete(void *opaque, int ret)
-{
-    CoroutineIOCompletion *co = opaque;
-
-    co->ret = ret;
-    qemu_coroutine_enter(co->coroutine, NULL);
-}
-
-static int coroutine_fn bdrv_co_io_em(BlockDriverState *bs, int64_t sector_num,
-                                      int nb_sectors, QEMUIOVector *iov,
-                                      bool is_write)
-{
-    CoroutineIOCompletion co = {
-        .coroutine = qemu_coroutine_self(),
-    };
-    BlockAIOCB *acb;
-
-    if (is_write) {
-        acb = bs->drv->bdrv_aio_writev(bs, sector_num, iov, nb_sectors,
-                                       bdrv_co_io_em_complete, &co);
-    } else {
-        acb = bs->drv->bdrv_aio_readv(bs, sector_num, iov, nb_sectors,
-                                      bdrv_co_io_em_complete, &co);
-    }
-
-    trace_bdrv_co_io_em(bs, sector_num, nb_sectors, is_write, acb);
-    if (!acb) {
-        return -EIO;
-    }
-    qemu_coroutine_yield();
-
-    return co.ret;
-}
-
-static int coroutine_fn bdrv_co_readv_em(BlockDriverState *bs,
-                                         int64_t sector_num, int nb_sectors,
-                                         QEMUIOVector *iov)
-{
-    return bdrv_co_io_em(bs, sector_num, nb_sectors, iov, false);
-}
-
-static int coroutine_fn bdrv_co_writev_em(BlockDriverState *bs,
-                                         int64_t sector_num, int nb_sectors,
-                                         QEMUIOVector *iov)
-{
-    return bdrv_co_io_em(bs, sector_num, nb_sectors, iov, true);
-}
-
 static void coroutine_fn bdrv_flush_co_entry(void *opaque)
 {
     RwCo *rwco = opaque;
diff --git a/trace-events b/trace-events
index 8350743..b4acd2a 100644
--- a/trace-events
+++ b/trace-events
@@ -74,7 +74,6 @@ bdrv_co_copy_on_readv(void *bs, int64_t sector_num, int nb_sector) "bs %p sector
 bdrv_co_readv_no_serialising(void *bs, int64_t sector_num, int nb_sector) "bs %p sector_num %"PRId64" nb_sectors %d"
 bdrv_co_writev(void *bs, int64_t sector_num, int nb_sector) "bs %p sector_num %"PRId64" nb_sectors %d"
 bdrv_co_write_zeroes(void *bs, int64_t sector_num, int nb_sector, int flags) "bs %p sector_num %"PRId64" nb_sectors %d flags %#x"
-bdrv_co_io_em(void *bs, int64_t sector_num, int nb_sectors, int is_write, void *acb) "bs %p sector_num %"PRId64" nb_sectors %d is_write %d acb %p"
 bdrv_co_do_copy_on_readv(void *bs, int64_t sector_num, int nb_sectors, int64_t cluster_sector_num, int cluster_nb_sectors) "bs %p sector_num %"PRId64" nb_sectors %d cluster_sector_num %"PRId64" cluster_nb_sectors %d"
 
 # block/stream.c
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH v2 04/17] block: Rename bdrv_co_do_preadv/writev to bdrv_co_preadv/writev
  2016-04-28 13:16 [Qemu-devel] [PATCH v2 00/17] block: Convert .bdrv_read/write to .bdrv_preadv/pwritev Kevin Wolf
                   ` (2 preceding siblings ...)
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 03/17] block: Support AIO drivers in bdrv_driver_preadv/pwritev() Kevin Wolf
@ 2016-04-28 13:16 ` Kevin Wolf
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 05/17] block: Introduce .bdrv_co_preadv/pwritev BlockDriver function Kevin Wolf
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Kevin Wolf @ 2016-04-28 13:16 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, stefanha, famz, sw, qemu-devel

It used to be an internal helper function just for implementing
bdrv_co_do_readv/writev(), but now that it's a public interface, it
deserves a name without "do" in it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/block-backend.c     |  4 ++--
 block/io.c                | 24 ++++++++++++------------
 block/raw_bsd.c           |  4 ++--
 hw/ide/macio.c            |  4 ++--
 include/block/block_int.h |  4 ++--
 5 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 16c9d5e..115b069 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -692,7 +692,7 @@ static int coroutine_fn blk_co_preadv(BlockBackend *blk, int64_t offset,
         return ret;
     }
 
-    return bdrv_co_do_preadv(blk_bs(blk), offset, bytes, qiov, flags);
+    return bdrv_co_preadv(blk_bs(blk), offset, bytes, qiov, flags);
 }
 
 static int coroutine_fn blk_co_pwritev(BlockBackend *blk, int64_t offset,
@@ -710,7 +710,7 @@ static int coroutine_fn blk_co_pwritev(BlockBackend *blk, int64_t offset,
         flags |= BDRV_REQ_FUA;
     }
 
-    return bdrv_co_do_pwritev(blk_bs(blk), offset, bytes, qiov, flags);
+    return bdrv_co_pwritev(blk_bs(blk), offset, bytes, qiov, flags);
 }
 
 typedef struct BlkRwCo {
diff --git a/block/io.c b/block/io.c
index 86d97c3..2d1e039 100644
--- a/block/io.c
+++ b/block/io.c
@@ -569,13 +569,13 @@ static void coroutine_fn bdrv_rw_co_entry(void *opaque)
     RwCo *rwco = opaque;
 
     if (!rwco->is_write) {
-        rwco->ret = bdrv_co_do_preadv(rwco->bs, rwco->offset,
-                                      rwco->qiov->size, rwco->qiov,
-                                      rwco->flags);
+        rwco->ret = bdrv_co_preadv(rwco->bs, rwco->offset,
+                                   rwco->qiov->size, rwco->qiov,
+                                   rwco->flags);
     } else {
-        rwco->ret = bdrv_co_do_pwritev(rwco->bs, rwco->offset,
-                                       rwco->qiov->size, rwco->qiov,
-                                       rwco->flags);
+        rwco->ret = bdrv_co_pwritev(rwco->bs, rwco->offset,
+                                    rwco->qiov->size, rwco->qiov,
+                                    rwco->flags);
     }
 }
 
@@ -1045,7 +1045,7 @@ out:
 /*
  * Handle a read request in coroutine context
  */
-int coroutine_fn bdrv_co_do_preadv(BlockDriverState *bs,
+int coroutine_fn bdrv_co_preadv(BlockDriverState *bs,
     int64_t offset, unsigned int bytes, QEMUIOVector *qiov,
     BdrvRequestFlags flags)
 {
@@ -1127,8 +1127,8 @@ static int coroutine_fn bdrv_co_do_readv(BlockDriverState *bs,
         return -EINVAL;
     }
 
-    return bdrv_co_do_preadv(bs, sector_num << BDRV_SECTOR_BITS,
-                             nb_sectors << BDRV_SECTOR_BITS, qiov, flags);
+    return bdrv_co_preadv(bs, sector_num << BDRV_SECTOR_BITS,
+                          nb_sectors << BDRV_SECTOR_BITS, qiov, flags);
 }
 
 int coroutine_fn bdrv_co_readv(BlockDriverState *bs, int64_t sector_num,
@@ -1388,7 +1388,7 @@ fail:
 /*
  * Handle a write request in coroutine context
  */
-int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
+int coroutine_fn bdrv_co_pwritev(BlockDriverState *bs,
     int64_t offset, unsigned int bytes, QEMUIOVector *qiov,
     BdrvRequestFlags flags)
 {
@@ -1523,8 +1523,8 @@ static int coroutine_fn bdrv_co_do_writev(BlockDriverState *bs,
         return -EINVAL;
     }
 
-    return bdrv_co_do_pwritev(bs, sector_num << BDRV_SECTOR_BITS,
-                              nb_sectors << BDRV_SECTOR_BITS, qiov, flags);
+    return bdrv_co_pwritev(bs, sector_num << BDRV_SECTOR_BITS,
+                           nb_sectors << BDRV_SECTOR_BITS, qiov, flags);
 }
 
 int coroutine_fn bdrv_co_writev(BlockDriverState *bs, int64_t sector_num,
diff --git a/block/raw_bsd.c b/block/raw_bsd.c
index 9c9d39b..5e65fb0 100644
--- a/block/raw_bsd.c
+++ b/block/raw_bsd.c
@@ -105,8 +105,8 @@ raw_co_writev_flags(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
     }
 
     BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
-    ret = bdrv_co_do_pwritev(bs->file->bs, sector_num * BDRV_SECTOR_SIZE,
-                             nb_sectors * BDRV_SECTOR_SIZE, qiov, flags);
+    ret = bdrv_co_pwritev(bs->file->bs, sector_num * BDRV_SECTOR_SIZE,
+                          nb_sectors * BDRV_SECTOR_SIZE, qiov, flags);
 
 fail:
     if (qiov == &local_qiov) {
diff --git a/hw/ide/macio.c b/hw/ide/macio.c
index 76256eb..ae29b6f 100644
--- a/hw/ide/macio.c
+++ b/hw/ide/macio.c
@@ -55,8 +55,8 @@ static const int debug_macio = 0;
 /*
  * Unaligned DMA read/write access functions required for OS X/Darwin which
  * don't perform DMA transactions on sector boundaries. These functions are
- * modelled on bdrv_co_do_preadv()/bdrv_co_do_pwritev() and so should be
- * easy to remove if the unaligned block APIs are ever exposed.
+ * modelled on bdrv_co_preadv()/bdrv_co_pwritev() and so should be easy to
+ * remove if the unaligned block APIs are ever exposed.
  */
 
 static void pmac_dma_read(BlockBackend *blk,
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 10d8759..f0171e3 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -517,10 +517,10 @@ extern BlockDriver bdrv_qcow2;
  */
 void bdrv_setup_io_funcs(BlockDriver *bdrv);
 
-int coroutine_fn bdrv_co_do_preadv(BlockDriverState *bs,
+int coroutine_fn bdrv_co_preadv(BlockDriverState *bs,
     int64_t offset, unsigned int bytes, QEMUIOVector *qiov,
     BdrvRequestFlags flags);
-int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
+int coroutine_fn bdrv_co_pwritev(BlockDriverState *bs,
     int64_t offset, unsigned int bytes, QEMUIOVector *qiov,
     BdrvRequestFlags flags);
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH v2 05/17] block: Introduce .bdrv_co_preadv/pwritev BlockDriver function
  2016-04-28 13:16 [Qemu-devel] [PATCH v2 00/17] block: Convert .bdrv_read/write to .bdrv_preadv/pwritev Kevin Wolf
                   ` (3 preceding siblings ...)
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 04/17] block: Rename bdrv_co_do_preadv/writev to bdrv_co_preadv/writev Kevin Wolf
@ 2016-04-28 13:16 ` Kevin Wolf
  2016-04-28 15:21   ` Eric Blake
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 06/17] bochs: Implement .bdrv_co_preadv() interface Kevin Wolf
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 34+ messages in thread
From: Kevin Wolf @ 2016-04-28 13:16 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, stefanha, famz, sw, qemu-devel

Many parts of the block layer are already byte granularity. The block
driver interface, however, was still missing an interface that allows
making use of this. This patch introduces a new BlockDriver interface,
which is based on coroutines, vectored, has flags and uses a byte
granularity. This is now the preferred interface for new drivers.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/io.c                | 28 ++++++++++++++++++++++------
 include/block/block_int.h |  4 ++++
 2 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/block/io.c b/block/io.c
index 2d1e039..2f53636 100644
--- a/block/io.c
+++ b/block/io.c
@@ -806,8 +806,15 @@ static int coroutine_fn bdrv_driver_preadv(BlockDriverState *bs,
                                            QEMUIOVector *qiov, int flags)
 {
     BlockDriver *drv = bs->drv;
-    int64_t sector_num = offset >> BDRV_SECTOR_BITS;
-    unsigned int nb_sectors = bytes >> BDRV_SECTOR_BITS;
+    int64_t sector_num;
+    unsigned int nb_sectors;
+
+    if (drv->bdrv_co_preadv) {
+        return drv->bdrv_co_preadv(bs, offset, bytes, qiov, flags);
+    }
+
+    sector_num = offset >> BDRV_SECTOR_BITS;
+    nb_sectors = bytes >> BDRV_SECTOR_BITS;
 
     assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
     assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
@@ -837,10 +844,18 @@ static int coroutine_fn bdrv_driver_pwritev(BlockDriverState *bs,
                                             QEMUIOVector *qiov, int flags)
 {
     BlockDriver *drv = bs->drv;
-    int64_t sector_num = offset >> BDRV_SECTOR_BITS;
-    unsigned int nb_sectors = bytes >> BDRV_SECTOR_BITS;
+    int64_t sector_num;
+    unsigned int nb_sectors;
     int ret;
 
+    if (drv->bdrv_co_pwritev) {
+        ret = drv->bdrv_co_pwritev(bs, offset, bytes, qiov, flags);
+        goto emulate_flags;
+    }
+
+    sector_num = offset >> BDRV_SECTOR_BITS;
+    nb_sectors = bytes >> BDRV_SECTOR_BITS;
+
     assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
     assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
     assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
@@ -860,13 +875,14 @@ static int coroutine_fn bdrv_driver_pwritev(BlockDriverState *bs,
         acb = bs->drv->bdrv_aio_writev(bs, sector_num, qiov, nb_sectors,
                                        bdrv_co_io_em_complete, &co);
         if (acb == NULL) {
-            return -EIO;
+            ret = -EIO;
         } else {
             qemu_coroutine_yield();
-            return co.ret;
+            ret = co.ret;
         }
     }
 
+emulate_flags:
     if (ret == 0 && (flags & BDRV_REQ_FUA) &&
         !(drv->supported_write_flags & BDRV_REQ_FUA))
     {
diff --git a/include/block/block_int.h b/include/block/block_int.h
index f0171e3..0ef6f2c 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -153,10 +153,14 @@ struct BlockDriver {
 
     int coroutine_fn (*bdrv_co_readv)(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors, QEMUIOVector *qiov);
+    int coroutine_fn (*bdrv_co_preadv)(BlockDriverState *bs,
+        uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags);
     int coroutine_fn (*bdrv_co_writev)(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors, QEMUIOVector *qiov);
     int coroutine_fn (*bdrv_co_writev_flags)(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors, QEMUIOVector *qiov, int flags);
+    int coroutine_fn (*bdrv_co_pwritev)(BlockDriverState *bs,
+        uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags);
 
     int supported_write_flags;
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH v2 06/17] bochs: Implement .bdrv_co_preadv() interface
  2016-04-28 13:16 [Qemu-devel] [PATCH v2 00/17] block: Convert .bdrv_read/write to .bdrv_preadv/pwritev Kevin Wolf
                   ` (4 preceding siblings ...)
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 05/17] block: Introduce .bdrv_co_preadv/pwritev BlockDriver function Kevin Wolf
@ 2016-04-28 13:16 ` Kevin Wolf
  2016-04-28 15:50   ` Eric Blake
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 07/17] cloop: " Kevin Wolf
                   ` (12 subsequent siblings)
  18 siblings, 1 reply; 34+ messages in thread
From: Kevin Wolf @ 2016-04-28 13:16 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, stefanha, famz, sw, qemu-devel

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/bochs.c | 51 +++++++++++++++++++++++++++++++++------------------
 1 file changed, 33 insertions(+), 18 deletions(-)

diff --git a/block/bochs.c b/block/bochs.c
index af8b7ab..f0e18c0 100644
--- a/block/bochs.c
+++ b/block/bochs.c
@@ -104,6 +104,7 @@ static int bochs_open(BlockDriverState *bs, QDict *options, int flags,
     int ret;
 
     bs->read_only = 1; // no write support yet
+    bs->request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O supported */
 
     ret = bdrv_pread(bs->file->bs, 0, &bochs, sizeof(bochs));
     if (ret < 0) {
@@ -221,38 +222,52 @@ static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num)
     return bitmap_offset + (512 * (s->bitmap_blocks + extent_offset));
 }
 
-static int bochs_read(BlockDriverState *bs, int64_t sector_num,
-                    uint8_t *buf, int nb_sectors)
+static int coroutine_fn
+bochs_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+                QEMUIOVector *qiov, int flags)
 {
+    BDRVBochsState *s = bs->opaque;
+    uint64_t sector_num = offset >> BDRV_SECTOR_BITS;
+    int nb_sectors = bytes >> BDRV_SECTOR_BITS;
+    uint64_t bytes_done = 0;
+    QEMUIOVector local_qiov;
     int ret;
 
+    assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
+    assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
+
+    qemu_iovec_init(&local_qiov, qiov->niov);
+    qemu_co_mutex_lock(&s->lock);
+
     while (nb_sectors > 0) {
         int64_t block_offset = seek_to_sector(bs, sector_num);
         if (block_offset < 0) {
-            return block_offset;
-        } else if (block_offset > 0) {
-            ret = bdrv_pread(bs->file->bs, block_offset, buf, 512);
+            ret = block_offset;
+            goto fail;
+        }
+
+        qemu_iovec_reset(&local_qiov);
+        qemu_iovec_concat(&local_qiov, qiov, bytes_done, 512);
+
+        if (block_offset > 0) {
+            ret = bdrv_co_preadv(bs->file->bs, block_offset, 512,
+                                 &local_qiov, 0);
             if (ret < 0) {
-                return ret;
+                goto fail;
             }
         } else {
-            memset(buf, 0, 512);
+            qemu_iovec_memset(&local_qiov, 0, 0, 512);
         }
         nb_sectors--;
         sector_num++;
-        buf += 512;
+        bytes_done += 512;
     }
-    return 0;
-}
 
-static coroutine_fn int bochs_co_read(BlockDriverState *bs, int64_t sector_num,
-                                      uint8_t *buf, int nb_sectors)
-{
-    int ret;
-    BDRVBochsState *s = bs->opaque;
-    qemu_co_mutex_lock(&s->lock);
-    ret = bochs_read(bs, sector_num, buf, nb_sectors);
+    ret = 0;
+fail:
     qemu_co_mutex_unlock(&s->lock);
+    qemu_iovec_destroy(&local_qiov);
+
     return ret;
 }
 
@@ -267,7 +282,7 @@ static BlockDriver bdrv_bochs = {
     .instance_size	= sizeof(BDRVBochsState),
     .bdrv_probe		= bochs_probe,
     .bdrv_open		= bochs_open,
-    .bdrv_read          = bochs_co_read,
+    .bdrv_co_preadv = bochs_co_preadv,
     .bdrv_close		= bochs_close,
 };
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH v2 07/17] cloop: Implement .bdrv_co_preadv() interface
  2016-04-28 13:16 [Qemu-devel] [PATCH v2 00/17] block: Convert .bdrv_read/write to .bdrv_preadv/pwritev Kevin Wolf
                   ` (5 preceding siblings ...)
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 06/17] bochs: Implement .bdrv_co_preadv() interface Kevin Wolf
@ 2016-04-28 13:16 ` Kevin Wolf
  2016-04-29  2:28   ` Eric Blake
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 08/17] dmg: " Kevin Wolf
                   ` (11 subsequent siblings)
  18 siblings, 1 reply; 34+ messages in thread
From: Kevin Wolf @ 2016-04-28 13:16 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, stefanha, famz, sw, qemu-devel

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/cloop.c | 38 ++++++++++++++++++++++----------------
 1 file changed, 22 insertions(+), 16 deletions(-)

diff --git a/block/cloop.c b/block/cloop.c
index a84f140..fc1ca3a 100644
--- a/block/cloop.c
+++ b/block/cloop.c
@@ -66,6 +66,7 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
     int ret;
 
     bs->read_only = 1;
+    bs->request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O supported */
 
     /* read header */
     ret = bdrv_pread(bs->file->bs, 128, &s->block_size, 4);
@@ -229,33 +230,38 @@ static inline int cloop_read_block(BlockDriverState *bs, int block_num)
     return 0;
 }
 
-static int cloop_read(BlockDriverState *bs, int64_t sector_num,
-                    uint8_t *buf, int nb_sectors)
+static int coroutine_fn
+cloop_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+                QEMUIOVector *qiov, int flags)
 {
     BDRVCloopState *s = bs->opaque;
-    int i;
+    uint64_t sector_num = offset >> BDRV_SECTOR_BITS;
+    int nb_sectors = bytes >> BDRV_SECTOR_BITS;
+    int ret, i;
+
+    assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
+    assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
+
+    qemu_co_mutex_lock(&s->lock);
 
     for (i = 0; i < nb_sectors; i++) {
+        void *data;
         uint32_t sector_offset_in_block =
             ((sector_num + i) % s->sectors_per_block),
             block_num = (sector_num + i) / s->sectors_per_block;
         if (cloop_read_block(bs, block_num) != 0) {
-            return -1;
+            ret = -EIO;
+            goto fail;
         }
-        memcpy(buf + i * 512,
-            s->uncompressed_block + sector_offset_in_block * 512, 512);
+
+        data = s->uncompressed_block + sector_offset_in_block * 512;
+        qemu_iovec_from_buf(qiov, i * 512, data, 512);
     }
-    return 0;
-}
 
-static coroutine_fn int cloop_co_read(BlockDriverState *bs, int64_t sector_num,
-                                      uint8_t *buf, int nb_sectors)
-{
-    int ret;
-    BDRVCloopState *s = bs->opaque;
-    qemu_co_mutex_lock(&s->lock);
-    ret = cloop_read(bs, sector_num, buf, nb_sectors);
+    ret = 0;
+fail:
     qemu_co_mutex_unlock(&s->lock);
+
     return ret;
 }
 
@@ -273,7 +279,7 @@ static BlockDriver bdrv_cloop = {
     .instance_size  = sizeof(BDRVCloopState),
     .bdrv_probe     = cloop_probe,
     .bdrv_open      = cloop_open,
-    .bdrv_read      = cloop_co_read,
+    .bdrv_co_preadv = cloop_co_preadv,
     .bdrv_close     = cloop_close,
 };
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH v2 08/17] dmg: Implement .bdrv_co_preadv() interface
  2016-04-28 13:16 [Qemu-devel] [PATCH v2 00/17] block: Convert .bdrv_read/write to .bdrv_preadv/pwritev Kevin Wolf
                   ` (6 preceding siblings ...)
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 07/17] cloop: " Kevin Wolf
@ 2016-04-28 13:16 ` Kevin Wolf
  2016-04-29  2:39   ` Eric Blake
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 09/17] vdi: " Kevin Wolf
                   ` (10 subsequent siblings)
  18 siblings, 1 reply; 34+ messages in thread
From: Kevin Wolf @ 2016-04-28 13:16 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, stefanha, famz, sw, qemu-devel

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/dmg.c | 40 +++++++++++++++++++++++-----------------
 1 file changed, 23 insertions(+), 17 deletions(-)

diff --git a/block/dmg.c b/block/dmg.c
index a496eb7..1ea5f22 100644
--- a/block/dmg.c
+++ b/block/dmg.c
@@ -440,6 +440,8 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
     int ret;
 
     bs->read_only = 1;
+    bs->request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O supported */
+
     s->n_chunks = 0;
     s->offsets = s->lengths = s->sectors = s->sectorcounts = NULL;
     /* used by dmg_read_mish_block to keep track of the current I/O position */
@@ -659,38 +661,42 @@ static inline int dmg_read_chunk(BlockDriverState *bs, uint64_t sector_num)
     return 0;
 }
 
-static int dmg_read(BlockDriverState *bs, int64_t sector_num,
-                    uint8_t *buf, int nb_sectors)
+static int coroutine_fn
+dmg_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+              QEMUIOVector *qiov, int flags)
 {
     BDRVDMGState *s = bs->opaque;
-    int i;
+    uint64_t sector_num = offset >> BDRV_SECTOR_BITS;
+    int nb_sectors = bytes >> BDRV_SECTOR_BITS;
+    int ret, i;
+
+    assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
+    assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
+
+    qemu_co_mutex_lock(&s->lock);
 
     for (i = 0; i < nb_sectors; i++) {
         uint32_t sector_offset_in_chunk;
+        void *data;
+
         if (dmg_read_chunk(bs, sector_num + i) != 0) {
-            return -1;
+            ret = -EIO;
+            goto fail;
         }
         /* Special case: current chunk is all zeroes. Do not perform a memcpy as
          * s->uncompressed_chunk may be too small to cover the large all-zeroes
          * section. dmg_read_chunk is called to find s->current_chunk */
         if (s->types[s->current_chunk] == 2) { /* all zeroes block entry */
-            memset(buf + i * 512, 0, 512);
+            qemu_iovec_memset(qiov, i * 512, 0, 512);
             continue;
         }
         sector_offset_in_chunk = sector_num + i - s->sectors[s->current_chunk];
-        memcpy(buf + i * 512,
-               s->uncompressed_chunk + sector_offset_in_chunk * 512, 512);
+        data = s->uncompressed_chunk + sector_offset_in_chunk * 512;
+        qemu_iovec_from_buf(qiov, i * 512, data, 512);
     }
-    return 0;
-}
 
-static coroutine_fn int dmg_co_read(BlockDriverState *bs, int64_t sector_num,
-                                    uint8_t *buf, int nb_sectors)
-{
-    int ret;
-    BDRVDMGState *s = bs->opaque;
-    qemu_co_mutex_lock(&s->lock);
-    ret = dmg_read(bs, sector_num, buf, nb_sectors);
+    ret = 0;
+fail:
     qemu_co_mutex_unlock(&s->lock);
     return ret;
 }
@@ -715,7 +721,7 @@ static BlockDriver bdrv_dmg = {
     .instance_size  = sizeof(BDRVDMGState),
     .bdrv_probe     = dmg_probe,
     .bdrv_open      = dmg_open,
-    .bdrv_read      = dmg_co_read,
+    .bdrv_co_preadv = dmg_co_preadv,
     .bdrv_close     = dmg_close,
 };
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH v2 09/17] vdi: Implement .bdrv_co_preadv() interface
  2016-04-28 13:16 [Qemu-devel] [PATCH v2 00/17] block: Convert .bdrv_read/write to .bdrv_preadv/pwritev Kevin Wolf
                   ` (7 preceding siblings ...)
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 08/17] dmg: " Kevin Wolf
@ 2016-04-28 13:16 ` Kevin Wolf
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 10/17] vdi: Implement .bdrv_co_pwritev() interface Kevin Wolf
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Kevin Wolf @ 2016-04-28 13:16 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, stefanha, famz, sw, qemu-devel

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/vdi.c | 55 ++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 32 insertions(+), 23 deletions(-)

diff --git a/block/vdi.c b/block/vdi.c
index 75d4819..8295511 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -557,48 +557,57 @@ static int64_t coroutine_fn vdi_co_get_block_status(BlockDriverState *bs,
     return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | offset;
 }
 
-static int vdi_co_read(BlockDriverState *bs,
-        int64_t sector_num, uint8_t *buf, int nb_sectors)
+static int coroutine_fn
+vdi_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+              QEMUIOVector *qiov, int flags)
 {
     BDRVVdiState *s = bs->opaque;
+    QEMUIOVector local_qiov;
     uint32_t bmap_entry;
     uint32_t block_index;
-    uint32_t sector_in_block;
-    uint32_t n_sectors;
+    uint32_t offset_in_block;
+    uint32_t n_bytes;
+    uint64_t bytes_done = 0;
     int ret = 0;
 
     logout("\n");
 
-    while (ret >= 0 && nb_sectors > 0) {
-        block_index = sector_num / s->block_sectors;
-        sector_in_block = sector_num % s->block_sectors;
-        n_sectors = s->block_sectors - sector_in_block;
-        if (n_sectors > nb_sectors) {
-            n_sectors = nb_sectors;
-        }
+    qemu_iovec_init(&local_qiov, qiov->niov);
 
-        logout("will read %u sectors starting at sector %" PRIu64 "\n",
-               n_sectors, sector_num);
+    while (ret >= 0 && bytes > 0) {
+        block_index = offset / s->block_size;
+        offset_in_block = offset % s->block_size;
+        n_bytes = MIN(bytes, s->block_size - offset_in_block);
+
+        logout("will read %u bytes starting at offset %" PRIu64 "\n",
+               n_bytes, offset);
 
         /* prepare next AIO request */
         bmap_entry = le32_to_cpu(s->bmap[block_index]);
         if (!VDI_IS_ALLOCATED(bmap_entry)) {
             /* Block not allocated, return zeros, no need to wait. */
-            memset(buf, 0, n_sectors * SECTOR_SIZE);
+            qemu_iovec_memset(qiov, bytes_done, 0, n_bytes);
             ret = 0;
         } else {
-            uint64_t offset = s->header.offset_data / SECTOR_SIZE +
-                              (uint64_t)bmap_entry * s->block_sectors +
-                              sector_in_block;
-            ret = bdrv_read(bs->file->bs, offset, buf, n_sectors);
+            uint64_t data_offset = s->header.offset_data +
+                                   (uint64_t)bmap_entry * s->block_size +
+                                   offset_in_block;
+
+            qemu_iovec_reset(&local_qiov);
+            qemu_iovec_concat(&local_qiov, qiov, bytes_done, n_bytes);
+
+            ret = bdrv_co_preadv(bs->file->bs, data_offset, n_bytes,
+                                 &local_qiov, 0);
         }
-        logout("%u sectors read\n", n_sectors);
+        logout("%u bytes read\n", n_bytes);
 
-        nb_sectors -= n_sectors;
-        sector_num += n_sectors;
-        buf += n_sectors * SECTOR_SIZE;
+        bytes -= n_bytes;
+        offset += n_bytes;
+        bytes_done += n_bytes;
     }
 
+    qemu_iovec_destroy(&local_qiov);
+
     return ret;
 }
 
@@ -903,7 +912,7 @@ static BlockDriver bdrv_vdi = {
     .bdrv_co_get_block_status = vdi_co_get_block_status,
     .bdrv_make_empty = vdi_make_empty,
 
-    .bdrv_read = vdi_co_read,
+    .bdrv_co_preadv     = vdi_co_preadv,
 #if defined(CONFIG_VDI_WRITE)
     .bdrv_write = vdi_co_write,
 #endif
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH v2 10/17] vdi: Implement .bdrv_co_pwritev() interface
  2016-04-28 13:16 [Qemu-devel] [PATCH v2 00/17] block: Convert .bdrv_read/write to .bdrv_preadv/pwritev Kevin Wolf
                   ` (8 preceding siblings ...)
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 09/17] vdi: " Kevin Wolf
@ 2016-04-28 13:16 ` Kevin Wolf
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 11/17] vmdk: Add vmdk_find_offset_in_cluster() Kevin Wolf
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Kevin Wolf @ 2016-04-28 13:16 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, stefanha, famz, sw, qemu-devel

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/vdi.c | 72 +++++++++++++++++++++++++++++++++++--------------------------
 1 file changed, 41 insertions(+), 31 deletions(-)

diff --git a/block/vdi.c b/block/vdi.c
index 8295511..e5fe4e8 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -611,53 +611,55 @@ vdi_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
     return ret;
 }
 
-static int vdi_co_write(BlockDriverState *bs,
-        int64_t sector_num, const uint8_t *buf, int nb_sectors)
+static int coroutine_fn
+vdi_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+               QEMUIOVector *qiov, int flags)
 {
     BDRVVdiState *s = bs->opaque;
+    QEMUIOVector local_qiov;
     uint32_t bmap_entry;
     uint32_t block_index;
-    uint32_t sector_in_block;
-    uint32_t n_sectors;
+    uint32_t offset_in_block;
+    uint32_t n_bytes;
     uint32_t bmap_first = VDI_UNALLOCATED;
     uint32_t bmap_last = VDI_UNALLOCATED;
     uint8_t *block = NULL;
+    uint64_t bytes_done = 0;
     int ret = 0;
 
     logout("\n");
 
-    while (ret >= 0 && nb_sectors > 0) {
-        block_index = sector_num / s->block_sectors;
-        sector_in_block = sector_num % s->block_sectors;
-        n_sectors = s->block_sectors - sector_in_block;
-        if (n_sectors > nb_sectors) {
-            n_sectors = nb_sectors;
-        }
+    qemu_iovec_init(&local_qiov, qiov->niov);
 
-        logout("will write %u sectors starting at sector %" PRIu64 "\n",
-               n_sectors, sector_num);
+    while (ret >= 0 && bytes > 0) {
+        block_index = offset / s->block_size;
+        offset_in_block = offset % s->block_size;
+        n_bytes = MIN(bytes, s->block_size - offset_in_block);
+
+        logout("will write %u bytes starting at offset %" PRIu64 "\n",
+               n_bytes, offset);
 
         /* prepare next AIO request */
         bmap_entry = le32_to_cpu(s->bmap[block_index]);
         if (!VDI_IS_ALLOCATED(bmap_entry)) {
             /* Allocate new block and write to it. */
-            uint64_t offset;
+            uint64_t data_offset;
             bmap_entry = s->header.blocks_allocated;
             s->bmap[block_index] = cpu_to_le32(bmap_entry);
             s->header.blocks_allocated++;
-            offset = s->header.offset_data / SECTOR_SIZE +
-                     (uint64_t)bmap_entry * s->block_sectors;
+            data_offset = s->header.offset_data +
+                          (uint64_t)bmap_entry * s->block_size;
             if (block == NULL) {
                 block = g_malloc(s->block_size);
                 bmap_first = block_index;
             }
             bmap_last = block_index;
             /* Copy data to be written to new block and zero unused parts. */
-            memset(block, 0, sector_in_block * SECTOR_SIZE);
-            memcpy(block + sector_in_block * SECTOR_SIZE,
-                   buf, n_sectors * SECTOR_SIZE);
-            memset(block + (sector_in_block + n_sectors) * SECTOR_SIZE, 0,
-                   (s->block_sectors - n_sectors - sector_in_block) * SECTOR_SIZE);
+            memset(block, 0, offset_in_block);
+            qemu_iovec_to_buf(qiov, bytes_done, block + offset_in_block,
+                              n_bytes);
+            memset(block + offset_in_block + n_bytes, 0,
+                   s->block_size - n_bytes - offset_in_block);
 
             /* Note that this coroutine does not yield anywhere from reading the
              * bmap entry until here, so in regards to all the coroutines trying
@@ -667,12 +669,12 @@ static int vdi_co_write(BlockDriverState *bs,
              * acquire the lock and thus the padded cluster is written before
              * the other coroutines can write to the affected area. */
             qemu_co_mutex_lock(&s->write_lock);
-            ret = bdrv_write(bs->file->bs, offset, block, s->block_sectors);
+            ret = bdrv_pwrite(bs->file->bs, data_offset, block, s->block_size);
             qemu_co_mutex_unlock(&s->write_lock);
         } else {
-            uint64_t offset = s->header.offset_data / SECTOR_SIZE +
-                              (uint64_t)bmap_entry * s->block_sectors +
-                              sector_in_block;
+            uint64_t data_offset = s->header.offset_data +
+                                   (uint64_t)bmap_entry * s->block_size +
+                                   offset_in_block;
             qemu_co_mutex_lock(&s->write_lock);
             /* This lock is only used to make sure the following write operation
              * is executed after the write issued by the coroutine allocating
@@ -683,16 +685,23 @@ static int vdi_co_write(BlockDriverState *bs,
              * that that write operation has returned (there may be other writes
              * in flight, but they do not concern this very operation). */
             qemu_co_mutex_unlock(&s->write_lock);
-            ret = bdrv_write(bs->file->bs, offset, buf, n_sectors);
+
+            qemu_iovec_reset(&local_qiov);
+            qemu_iovec_concat(&local_qiov, qiov, bytes_done, n_bytes);
+
+            ret = bdrv_co_pwritev(bs->file->bs, data_offset, n_bytes,
+                                  &local_qiov, 0);
         }
 
-        nb_sectors -= n_sectors;
-        sector_num += n_sectors;
-        buf += n_sectors * SECTOR_SIZE;
+        bytes -= n_bytes;
+        offset += n_bytes;
+        bytes_done += n_bytes;
 
-        logout("%u sectors written\n", n_sectors);
+        logout("%u bytes written\n", n_bytes);
     }
 
+    qemu_iovec_destroy(&local_qiov);
+
     logout("finished data write\n");
     if (ret < 0) {
         return ret;
@@ -703,6 +712,7 @@ static int vdi_co_write(BlockDriverState *bs,
         VdiHeader *header = (VdiHeader *) block;
         uint8_t *base;
         uint64_t offset;
+        uint32_t n_sectors;
 
         logout("now writing modified header\n");
         assert(VDI_IS_ALLOCATED(bmap_first));
@@ -914,7 +924,7 @@ static BlockDriver bdrv_vdi = {
 
     .bdrv_co_preadv     = vdi_co_preadv,
 #if defined(CONFIG_VDI_WRITE)
-    .bdrv_write = vdi_co_write,
+    .bdrv_co_pwritev    = vdi_co_pwritev,
 #endif
 
     .bdrv_get_info = vdi_get_info,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH v2 11/17] vmdk: Add vmdk_find_offset_in_cluster()
  2016-04-28 13:16 [Qemu-devel] [PATCH v2 00/17] block: Convert .bdrv_read/write to .bdrv_preadv/pwritev Kevin Wolf
                   ` (9 preceding siblings ...)
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 10/17] vdi: Implement .bdrv_co_pwritev() interface Kevin Wolf
@ 2016-04-28 13:16 ` Kevin Wolf
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 12/17] vmdk: Implement .bdrv_co_preadv() interface Kevin Wolf
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Kevin Wolf @ 2016-04-28 13:16 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, stefanha, famz, sw, qemu-devel

This is a byte granularity version of vmdk_find_index_in_cluster().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/vmdk.c | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 45f9d3c..f1e01f9 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1259,15 +1259,26 @@ static VmdkExtent *find_extent(BDRVVmdkState *s,
     return NULL;
 }
 
+static inline uint64_t vmdk_find_offset_in_cluster(VmdkExtent *extent,
+                                                   int64_t offset)
+{
+    uint64_t offset_in_cluster, extent_begin_offset, extent_relative_offset;
+    uint64_t cluster_size = extent->cluster_sectors * BDRV_SECTOR_SIZE;
+
+    extent_begin_offset =
+        (extent->end_sector - extent->sectors) * BDRV_SECTOR_SIZE;
+    extent_relative_offset = offset - extent_begin_offset;
+    offset_in_cluster = extent_relative_offset % cluster_size;
+
+    return offset_in_cluster;
+}
+
 static inline uint64_t vmdk_find_index_in_cluster(VmdkExtent *extent,
                                                   int64_t sector_num)
 {
-    uint64_t index_in_cluster, extent_begin_sector, extent_relative_sector_num;
-
-    extent_begin_sector = extent->end_sector - extent->sectors;
-    extent_relative_sector_num = sector_num - extent_begin_sector;
-    index_in_cluster = extent_relative_sector_num % extent->cluster_sectors;
-    return index_in_cluster;
+    uint64_t offset;
+    offset = vmdk_find_offset_in_cluster(extent, sector_num * BDRV_SECTOR_SIZE);
+    return offset / BDRV_SECTOR_SIZE;
 }
 
 static int64_t coroutine_fn vmdk_co_get_block_status(BlockDriverState *bs,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH v2 12/17] vmdk: Implement .bdrv_co_preadv() interface
  2016-04-28 13:16 [Qemu-devel] [PATCH v2 00/17] block: Convert .bdrv_read/write to .bdrv_preadv/pwritev Kevin Wolf
                   ` (10 preceding siblings ...)
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 11/17] vmdk: Add vmdk_find_offset_in_cluster() Kevin Wolf
@ 2016-04-28 13:16 ` Kevin Wolf
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 13/17] vmdk: Implement .bdrv_co_pwritev() interface Kevin Wolf
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Kevin Wolf @ 2016-04-28 13:16 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, stefanha, famz, sw, qemu-devel

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/vmdk.c | 98 ++++++++++++++++++++++++++++++++----------------------------
 1 file changed, 53 insertions(+), 45 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index f1e01f9..6c447ad 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1381,8 +1381,8 @@ static int vmdk_write_extent(VmdkExtent *extent, int64_t cluster_offset,
 }
 
 static int vmdk_read_extent(VmdkExtent *extent, int64_t cluster_offset,
-                            int64_t offset_in_cluster, uint8_t *buf,
-                            int nb_sectors)
+                            int64_t offset_in_cluster, QEMUIOVector *qiov,
+                            int bytes)
 {
     int ret;
     int cluster_bytes, buf_bytes;
@@ -1394,14 +1394,13 @@ static int vmdk_read_extent(VmdkExtent *extent, int64_t cluster_offset,
 
 
     if (!extent->compressed) {
-        ret = bdrv_pread(extent->file->bs,
-                          cluster_offset + offset_in_cluster,
-                          buf, nb_sectors * 512);
-        if (ret == nb_sectors * 512) {
-            return 0;
-        } else {
-            return -EIO;
+        ret = bdrv_co_preadv(extent->file->bs,
+                             cluster_offset + offset_in_cluster, bytes,
+                             qiov, 0);
+        if (ret < 0) {
+            return ret;
         }
+        return 0;
     }
     cluster_bytes = extent->cluster_sectors * 512;
     /* Read two clusters in case GrainMarker + compressed data > one cluster */
@@ -1433,11 +1432,11 @@ static int vmdk_read_extent(VmdkExtent *extent, int64_t cluster_offset,
 
     }
     if (offset_in_cluster < 0 ||
-            offset_in_cluster + nb_sectors * 512 > buf_len) {
+            offset_in_cluster + bytes > buf_len) {
         ret = -EINVAL;
         goto out;
     }
-    memcpy(buf, uncomp_buf + offset_in_cluster, nb_sectors * 512);
+    qemu_iovec_from_buf(qiov, 0, uncomp_buf + offset_in_cluster, bytes);
     ret = 0;
 
  out:
@@ -1446,64 +1445,73 @@ static int vmdk_read_extent(VmdkExtent *extent, int64_t cluster_offset,
     return ret;
 }
 
-static int vmdk_read(BlockDriverState *bs, int64_t sector_num,
-                    uint8_t *buf, int nb_sectors)
+static int coroutine_fn
+vmdk_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+               QEMUIOVector *qiov, int flags)
 {
     BDRVVmdkState *s = bs->opaque;
     int ret;
-    uint64_t n, index_in_cluster;
+    uint64_t n_bytes, offset_in_cluster;
     VmdkExtent *extent = NULL;
+    QEMUIOVector local_qiov;
     uint64_t cluster_offset;
+    uint64_t bytes_done = 0;
 
-    while (nb_sectors > 0) {
-        extent = find_extent(s, sector_num, extent);
+    qemu_iovec_init(&local_qiov, qiov->niov);
+    qemu_co_mutex_lock(&s->lock);
+
+    while (bytes > 0) {
+        extent = find_extent(s, offset >> BDRV_SECTOR_BITS, extent);
         if (!extent) {
-            return -EIO;
+            ret = -EIO;
+            goto fail;
         }
         ret = get_cluster_offset(bs, extent, NULL,
-                                 sector_num << 9, false, &cluster_offset,
-                                 0, 0);
-        index_in_cluster = vmdk_find_index_in_cluster(extent, sector_num);
-        n = extent->cluster_sectors - index_in_cluster;
-        if (n > nb_sectors) {
-            n = nb_sectors;
-        }
+                                 offset, false, &cluster_offset, 0, 0);
+        offset_in_cluster = vmdk_find_offset_in_cluster(extent, offset);
+
+        n_bytes = MIN(bytes, extent->cluster_sectors * BDRV_SECTOR_SIZE
+                             - offset_in_cluster);
+
         if (ret != VMDK_OK) {
             /* if not allocated, try to read from parent image, if exist */
             if (bs->backing && ret != VMDK_ZEROED) {
                 if (!vmdk_is_cid_valid(bs)) {
-                    return -EINVAL;
+                    ret = -EINVAL;
+                    goto fail;
                 }
-                ret = bdrv_read(bs->backing->bs, sector_num, buf, n);
+
+                qemu_iovec_reset(&local_qiov);
+                qemu_iovec_concat(&local_qiov, qiov, bytes_done, n_bytes);
+
+                ret = bdrv_co_preadv(bs->backing->bs, offset, n_bytes,
+                                     &local_qiov, 0);
                 if (ret < 0) {
-                    return ret;
+                    goto fail;
                 }
             } else {
-                memset(buf, 0, 512 * n);
+                qemu_iovec_memset(qiov, bytes_done, 0, n_bytes);
             }
         } else {
-            ret = vmdk_read_extent(extent,
-                            cluster_offset, index_in_cluster * 512,
-                            buf, n);
+            qemu_iovec_reset(&local_qiov);
+            qemu_iovec_concat(&local_qiov, qiov, bytes_done, n_bytes);
+
+            ret = vmdk_read_extent(extent, cluster_offset, offset_in_cluster,
+                                   &local_qiov, n_bytes);
             if (ret) {
-                return ret;
+                goto fail;
             }
         }
-        nb_sectors -= n;
-        sector_num += n;
-        buf += n * 512;
+        bytes -= n_bytes;
+        offset += n_bytes;
+        bytes_done += n_bytes;
     }
-    return 0;
-}
 
-static coroutine_fn int vmdk_co_read(BlockDriverState *bs, int64_t sector_num,
-                                     uint8_t *buf, int nb_sectors)
-{
-    int ret;
-    BDRVVmdkState *s = bs->opaque;
-    qemu_co_mutex_lock(&s->lock);
-    ret = vmdk_read(bs, sector_num, buf, nb_sectors);
+    ret = 0;
+fail:
     qemu_co_mutex_unlock(&s->lock);
+    qemu_iovec_destroy(&local_qiov);
+
     return ret;
 }
 
@@ -2332,7 +2340,7 @@ static BlockDriver bdrv_vmdk = {
     .bdrv_open                    = vmdk_open,
     .bdrv_check                   = vmdk_check,
     .bdrv_reopen_prepare          = vmdk_reopen_prepare,
-    .bdrv_read                    = vmdk_co_read,
+    .bdrv_co_preadv               = vmdk_co_preadv,
     .bdrv_write                   = vmdk_co_write,
     .bdrv_write_compressed        = vmdk_write_compressed,
     .bdrv_co_write_zeroes         = vmdk_co_write_zeroes,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH v2 13/17] vmdk: Implement .bdrv_co_pwritev() interface
  2016-04-28 13:16 [Qemu-devel] [PATCH v2 00/17] block: Convert .bdrv_read/write to .bdrv_preadv/pwritev Kevin Wolf
                   ` (11 preceding siblings ...)
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 12/17] vmdk: Implement .bdrv_co_preadv() interface Kevin Wolf
@ 2016-04-28 13:16 ` Kevin Wolf
  2016-04-29  3:08   ` Fam Zheng
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 14/17] vpc: Implement .bdrv_co_preadv() interface Kevin Wolf
                   ` (5 subsequent siblings)
  18 siblings, 1 reply; 34+ messages in thread
From: Kevin Wolf @ 2016-04-28 13:16 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, stefanha, famz, sw, qemu-devel

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/vmdk.c | 209 ++++++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 135 insertions(+), 74 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 6c447ad..f243527 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1016,27 +1016,26 @@ static void vmdk_refresh_limits(BlockDriverState *bs, Error **errp)
  */
 static int get_whole_cluster(BlockDriverState *bs,
                              VmdkExtent *extent,
-                             uint64_t cluster_sector_num,
-                             uint64_t sector_num,
-                             uint64_t skip_start_sector,
-                             uint64_t skip_end_sector)
+                             uint64_t cluster_offset,
+                             uint64_t offset,
+                             uint64_t skip_start_bytes,
+                             uint64_t skip_end_bytes)
 {
     int ret = VMDK_OK;
     int64_t cluster_bytes;
     uint8_t *whole_grain;
 
     /* For COW, align request sector_num to cluster start */
-    sector_num = QEMU_ALIGN_DOWN(sector_num, extent->cluster_sectors);
     cluster_bytes = extent->cluster_sectors << BDRV_SECTOR_BITS;
+    offset = QEMU_ALIGN_DOWN(offset, cluster_bytes);
     whole_grain = qemu_blockalign(bs, cluster_bytes);
 
     if (!bs->backing) {
-        memset(whole_grain, 0,  skip_start_sector << BDRV_SECTOR_BITS);
-        memset(whole_grain + (skip_end_sector << BDRV_SECTOR_BITS), 0,
-               cluster_bytes - (skip_end_sector << BDRV_SECTOR_BITS));
+        memset(whole_grain, 0, skip_start_bytes);
+        memset(whole_grain + skip_end_bytes, 0, cluster_bytes - skip_end_bytes);
     }
 
-    assert(skip_end_sector <= extent->cluster_sectors);
+    assert(skip_end_bytes <= cluster_bytes);
     /* we will be here if it's first write on non-exist grain(cluster).
      * try to read from parent image, if exist */
     if (bs->backing && !vmdk_is_cid_valid(bs)) {
@@ -1045,42 +1044,43 @@ static int get_whole_cluster(BlockDriverState *bs,
     }
 
     /* Read backing data before skip range */
-    if (skip_start_sector > 0) {
+    if (skip_start_bytes > 0) {
         if (bs->backing) {
-            ret = bdrv_read(bs->backing->bs, sector_num,
-                            whole_grain, skip_start_sector);
+            ret = bdrv_pread(bs->backing->bs, offset, whole_grain,
+                             skip_start_bytes);
             if (ret < 0) {
                 ret = VMDK_ERROR;
                 goto exit;
             }
         }
-        ret = bdrv_write(extent->file->bs, cluster_sector_num, whole_grain,
-                         skip_start_sector);
+        ret = bdrv_pwrite(extent->file->bs, cluster_offset, whole_grain,
+                          skip_start_bytes);
         if (ret < 0) {
             ret = VMDK_ERROR;
             goto exit;
         }
     }
     /* Read backing data after skip range */
-    if (skip_end_sector < extent->cluster_sectors) {
+    if (skip_end_bytes < cluster_bytes) {
         if (bs->backing) {
-            ret = bdrv_read(bs->backing->bs, sector_num + skip_end_sector,
-                            whole_grain + (skip_end_sector << BDRV_SECTOR_BITS),
-                            extent->cluster_sectors - skip_end_sector);
+            ret = bdrv_pread(bs->backing->bs, offset + skip_end_bytes,
+                             whole_grain + skip_end_bytes,
+                             cluster_bytes - skip_end_bytes);
             if (ret < 0) {
                 ret = VMDK_ERROR;
                 goto exit;
             }
         }
-        ret = bdrv_write(extent->file->bs, cluster_sector_num + skip_end_sector,
-                         whole_grain + (skip_end_sector << BDRV_SECTOR_BITS),
-                         extent->cluster_sectors - skip_end_sector);
+        ret = bdrv_pwrite(extent->file->bs, cluster_offset + skip_end_bytes,
+                          whole_grain + skip_end_bytes,
+                          cluster_bytes - skip_end_bytes);
         if (ret < 0) {
             ret = VMDK_ERROR;
             goto exit;
         }
     }
 
+    ret = VMDK_OK;
 exit:
     qemu_vfree(whole_grain);
     return ret;
@@ -1142,8 +1142,8 @@ static int get_cluster_offset(BlockDriverState *bs,
                               uint64_t offset,
                               bool allocate,
                               uint64_t *cluster_offset,
-                              uint64_t skip_start_sector,
-                              uint64_t skip_end_sector)
+                              uint64_t skip_start_bytes,
+                              uint64_t skip_end_bytes)
 {
     unsigned int l1_index, l2_offset, l2_index;
     int min_index, i, j;
@@ -1230,10 +1230,8 @@ static int get_cluster_offset(BlockDriverState *bs,
          * This problem may occur because of insufficient space on host disk
          * or inappropriate VM shutdown.
          */
-        ret = get_whole_cluster(bs, extent,
-                                cluster_sector,
-                                offset >> BDRV_SECTOR_BITS,
-                                skip_start_sector, skip_end_sector);
+        ret = get_whole_cluster(bs, extent, cluster_sector * BDRV_SECTOR_SIZE,
+                                offset, skip_start_bytes, skip_end_bytes);
         if (ret) {
             return ret;
         }
@@ -1330,38 +1328,57 @@ static int64_t coroutine_fn vmdk_co_get_block_status(BlockDriverState *bs,
 }
 
 static int vmdk_write_extent(VmdkExtent *extent, int64_t cluster_offset,
-                            int64_t offset_in_cluster, const uint8_t *buf,
-                            int nb_sectors, int64_t sector_num)
+                            int64_t offset_in_cluster, QEMUIOVector *qiov,
+                            uint64_t qiov_offset, uint64_t n_bytes,
+                            uint64_t offset)
 {
     int ret;
     VmdkGrainMarker *data = NULL;
     uLongf buf_len;
-    const uint8_t *write_buf = buf;
-    int write_len = nb_sectors * 512;
+    QEMUIOVector local_qiov;
+    struct iovec iov;
     int64_t write_offset;
     int64_t write_end_sector;
 
     if (extent->compressed) {
+        void *compressed_data;
+
         if (!extent->has_marker) {
             ret = -EINVAL;
             goto out;
         }
         buf_len = (extent->cluster_sectors << 9) * 2;
         data = g_malloc(buf_len + sizeof(VmdkGrainMarker));
-        if (compress(data->data, &buf_len, buf, nb_sectors << 9) != Z_OK ||
-                buf_len == 0) {
+
+        compressed_data = g_malloc(n_bytes);
+        qemu_iovec_to_buf(qiov, qiov_offset, compressed_data, n_bytes);
+        ret = compress(data->data, &buf_len, compressed_data, n_bytes);
+        g_free(compressed_data);
+
+        if (ret != Z_OK || buf_len == 0) {
             ret = -EINVAL;
             goto out;
         }
-        data->lba = sector_num;
+
+        data->lba = offset >> BDRV_SECTOR_BITS;
         data->size = buf_len;
-        write_buf = (uint8_t *)data;
-        write_len = buf_len + sizeof(VmdkGrainMarker);
+
+        n_bytes = buf_len + sizeof(VmdkGrainMarker);
+        iov = (struct iovec) {
+            .iov_base   = data,
+            .iov_len    = n_bytes,
+        };
+        qemu_iovec_init_external(&local_qiov, &iov, 1);
+    } else {
+        qemu_iovec_init(&local_qiov, qiov->niov);
+        qemu_iovec_concat(&local_qiov, qiov, qiov_offset, n_bytes);
     }
+
     write_offset = cluster_offset + offset_in_cluster,
-    ret = bdrv_pwrite(extent->file->bs, write_offset, write_buf, write_len);
+    ret = bdrv_co_pwritev(extent->file->bs, write_offset, n_bytes,
+                          &local_qiov, 0);
 
-    write_end_sector = DIV_ROUND_UP(write_offset + write_len, BDRV_SECTOR_SIZE);
+    write_end_sector = DIV_ROUND_UP(write_offset + n_bytes, BDRV_SECTOR_SIZE);
 
     if (extent->compressed) {
         extent->next_cluster_sector = write_end_sector;
@@ -1370,13 +1387,15 @@ static int vmdk_write_extent(VmdkExtent *extent, int64_t cluster_offset,
                                           write_end_sector);
     }
 
-    if (ret != write_len) {
-        ret = ret < 0 ? ret : -EIO;
+    if (ret < 0) {
         goto out;
     }
     ret = 0;
  out:
     g_free(data);
+    if (!extent->compressed) {
+        qemu_iovec_destroy(&local_qiov);
+    }
     return ret;
 }
 
@@ -1525,38 +1544,38 @@ fail:
  *
  * Returns: error code with 0 for success.
  */
-static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
-                      const uint8_t *buf, int nb_sectors,
-                      bool zeroed, bool zero_dry_run)
+static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
+                       uint64_t bytes, QEMUIOVector *qiov,
+                       bool zeroed, bool zero_dry_run)
 {
     BDRVVmdkState *s = bs->opaque;
     VmdkExtent *extent = NULL;
     int ret;
-    int64_t index_in_cluster, n;
+    int64_t offset_in_cluster, n_bytes;
     uint64_t cluster_offset;
+    uint64_t bytes_done = 0;
     VmdkMetaData m_data;
 
-    if (sector_num > bs->total_sectors) {
-        error_report("Wrong offset: sector_num=0x%" PRIx64
+    if (DIV_ROUND_UP(offset, BDRV_SECTOR_SIZE) > bs->total_sectors) {
+        error_report("Wrong offset: offset=0x%" PRIx64
                      " total_sectors=0x%" PRIx64,
-                     sector_num, bs->total_sectors);
+                     offset, bs->total_sectors);
         return -EIO;
     }
 
-    while (nb_sectors > 0) {
-        extent = find_extent(s, sector_num, extent);
+    while (bytes > 0) {
+        extent = find_extent(s, offset >> BDRV_SECTOR_BITS, extent);
         if (!extent) {
             return -EIO;
         }
-        index_in_cluster = vmdk_find_index_in_cluster(extent, sector_num);
-        n = extent->cluster_sectors - index_in_cluster;
-        if (n > nb_sectors) {
-            n = nb_sectors;
-        }
-        ret = get_cluster_offset(bs, extent, &m_data, sector_num << 9,
+        offset_in_cluster = vmdk_find_offset_in_cluster(extent, offset);
+        n_bytes = MIN(bytes, extent->cluster_sectors * BDRV_SECTOR_SIZE
+                             - offset_in_cluster);
+
+        ret = get_cluster_offset(bs, extent, &m_data, offset,
                                  !(extent->compressed || zeroed),
-                                 &cluster_offset,
-                                 index_in_cluster, index_in_cluster + n);
+                                 &cluster_offset, offset_in_cluster,
+                                 offset_in_cluster + n_bytes);
         if (extent->compressed) {
             if (ret == VMDK_OK) {
                 /* Refuse write to allocated cluster for streamOptimized */
@@ -1565,7 +1584,7 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
                 return -EIO;
             } else {
                 /* allocate */
-                ret = get_cluster_offset(bs, extent, &m_data, sector_num << 9,
+                ret = get_cluster_offset(bs, extent, &m_data, offset,
                                          true, &cluster_offset, 0, 0);
             }
         }
@@ -1575,9 +1594,9 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
         if (zeroed) {
             /* Do zeroed write, buf is ignored */
             if (extent->has_zero_grain &&
-                    index_in_cluster == 0 &&
-                    n >= extent->cluster_sectors) {
-                n = extent->cluster_sectors;
+                    offset_in_cluster == 0 &&
+                    n_bytes >= extent->cluster_sectors * BDRV_SECTOR_SIZE) {
+                n_bytes = extent->cluster_sectors * BDRV_SECTOR_SIZE;
                 if (!zero_dry_run) {
                     /* update L2 tables */
                     if (vmdk_L2update(extent, &m_data, VMDK_GTE_ZEROED)
@@ -1589,9 +1608,8 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
                 return -ENOTSUP;
             }
         } else {
-            ret = vmdk_write_extent(extent,
-                            cluster_offset, index_in_cluster * 512,
-                            buf, n, sector_num);
+            ret = vmdk_write_extent(extent, cluster_offset, offset_in_cluster,
+                                    qiov, bytes_done, n_bytes, offset);
             if (ret) {
                 return ret;
             }
@@ -1604,9 +1622,9 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
                 }
             }
         }
-        nb_sectors -= n;
-        sector_num += n;
-        buf += n * 512;
+        bytes -= n_bytes;
+        offset += n_bytes;
+        bytes_done += n_bytes;
 
         /* update CID on the first write every time the virtual disk is
          * opened */
@@ -1621,25 +1639,65 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
     return 0;
 }
 
-static coroutine_fn int vmdk_co_write(BlockDriverState *bs, int64_t sector_num,
-                                      const uint8_t *buf, int nb_sectors)
+static int coroutine_fn
+vmdk_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+                QEMUIOVector *qiov, int flags)
 {
     int ret;
     BDRVVmdkState *s = bs->opaque;
     qemu_co_mutex_lock(&s->lock);
-    ret = vmdk_write(bs, sector_num, buf, nb_sectors, false, false);
+    ret = vmdk_pwritev(bs, offset, bytes, qiov, false, false);
     qemu_co_mutex_unlock(&s->lock);
     return ret;
 }
 
+typedef struct VmdkWriteCompressedCo {
+    BlockDriverState *bs;
+    int64_t sector_num;
+    const uint8_t *buf;
+    int nb_sectors;
+    int ret;
+} VmdkWriteCompressedCo;
+
+static void vmdk_co_write_compressed(void *opaque)
+{
+    VmdkWriteCompressedCo *co = opaque;
+    QEMUIOVector local_qiov;
+    uint64_t offset = co->sector_num * BDRV_SECTOR_SIZE;
+    uint64_t bytes = co->nb_sectors * BDRV_SECTOR_SIZE;
+
+    struct iovec iov = (struct iovec) {
+        .iov_base   = (uint8_t*) co->buf,
+        .iov_len    = bytes,
+    };
+    qemu_iovec_init_external(&local_qiov, &iov, 1);
+
+    co->ret = vmdk_pwritev(co->bs, offset, bytes, &local_qiov, false, false);
+}
+
 static int vmdk_write_compressed(BlockDriverState *bs,
                                  int64_t sector_num,
                                  const uint8_t *buf,
                                  int nb_sectors)
 {
     BDRVVmdkState *s = bs->opaque;
+
     if (s->num_extents == 1 && s->extents[0].compressed) {
-        return vmdk_write(bs, sector_num, buf, nb_sectors, false, false);
+        Coroutine *co;
+        AioContext *aio_context = bdrv_get_aio_context(bs);
+        VmdkWriteCompressedCo data = {
+            .bs         = bs,
+            .sector_num = sector_num,
+            .buf        = buf,
+            .nb_sectors = nb_sectors,
+            .ret        = -EINPROGRESS,
+        };
+        co = qemu_coroutine_create(vmdk_co_write_compressed);
+        qemu_coroutine_enter(co, &data);
+        while (data.ret == -EINPROGRESS) {
+            aio_poll(aio_context, true);
+        }
+        return data.ret;
     } else {
         return -ENOTSUP;
     }
@@ -1652,12 +1710,15 @@ static int coroutine_fn vmdk_co_write_zeroes(BlockDriverState *bs,
 {
     int ret;
     BDRVVmdkState *s = bs->opaque;
+    uint64_t offset = sector_num * BDRV_SECTOR_SIZE;
+    uint64_t bytes = nb_sectors * BDRV_SECTOR_SIZE;
+
     qemu_co_mutex_lock(&s->lock);
     /* write zeroes could fail if sectors not aligned to cluster, test it with
      * dry_run == true before really updating image */
-    ret = vmdk_write(bs, sector_num, NULL, nb_sectors, true, true);
+    ret = vmdk_pwritev(bs, offset, bytes, NULL, true, true);
     if (!ret) {
-        ret = vmdk_write(bs, sector_num, NULL, nb_sectors, true, false);
+        ret = vmdk_pwritev(bs, offset, bytes, NULL, true, false);
     }
     qemu_co_mutex_unlock(&s->lock);
     return ret;
@@ -2341,7 +2402,7 @@ static BlockDriver bdrv_vmdk = {
     .bdrv_check                   = vmdk_check,
     .bdrv_reopen_prepare          = vmdk_reopen_prepare,
     .bdrv_co_preadv               = vmdk_co_preadv,
-    .bdrv_write                   = vmdk_co_write,
+    .bdrv_co_pwritev              = vmdk_co_pwritev,
     .bdrv_write_compressed        = vmdk_write_compressed,
     .bdrv_co_write_zeroes         = vmdk_co_write_zeroes,
     .bdrv_close                   = vmdk_close,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH v2 14/17] vpc: Implement .bdrv_co_preadv() interface
  2016-04-28 13:16 [Qemu-devel] [PATCH v2 00/17] block: Convert .bdrv_read/write to .bdrv_preadv/pwritev Kevin Wolf
                   ` (12 preceding siblings ...)
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 13/17] vmdk: Implement .bdrv_co_pwritev() interface Kevin Wolf
@ 2016-04-28 13:16 ` Kevin Wolf
  2016-04-29  3:14   ` Fam Zheng
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 15/17] vpc: Implement .bdrv_co_pwritev() interface Kevin Wolf
                   ` (4 subsequent siblings)
  18 siblings, 1 reply; 34+ messages in thread
From: Kevin Wolf @ 2016-04-28 13:16 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, stefanha, famz, sw, qemu-devel

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/vpc.c | 79 ++++++++++++++++++++++++++++++++-----------------------------
 1 file changed, 42 insertions(+), 37 deletions(-)

diff --git a/block/vpc.c b/block/vpc.c
index 3e2ea69..0099a80 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -454,22 +454,21 @@ static int vpc_reopen_prepare(BDRVReopenState *state,
  * The parameter write must be 1 if the offset will be used for a write
  * operation (the block bitmaps is updated then), 0 otherwise.
  */
-static inline int64_t get_sector_offset(BlockDriverState *bs,
-    int64_t sector_num, int write)
+static inline int64_t get_image_offset(BlockDriverState *bs, uint64_t offset,
+                                        bool write)
 {
     BDRVVPCState *s = bs->opaque;
-    uint64_t offset = sector_num * 512;
     uint64_t bitmap_offset, block_offset;
-    uint32_t pagetable_index, pageentry_index;
+    uint32_t pagetable_index, offset_in_block;
 
     pagetable_index = offset / s->block_size;
-    pageentry_index = (offset % s->block_size) / 512;
+    offset_in_block = offset % s->block_size;
 
     if (pagetable_index >= s->max_table_entries || s->pagetable[pagetable_index] == 0xffffffff)
         return -1; /* not allocated */
 
     bitmap_offset = 512 * (uint64_t) s->pagetable[pagetable_index];
-    block_offset = bitmap_offset + s->bitmap_size + (512 * pageentry_index);
+    block_offset = bitmap_offset + s->bitmap_size + offset_in_block;
 
     /* We must ensure that we don't write to any sectors which are marked as
        unused in the bitmap. We get away with setting all bits in the block
@@ -487,6 +486,12 @@ static inline int64_t get_sector_offset(BlockDriverState *bs,
     return block_offset;
 }
 
+static inline int64_t get_sector_offset(BlockDriverState *bs,
+                                        int64_t sector_num, bool write)
+{
+    return get_image_offset(bs, sector_num * BDRV_SECTOR_SIZE, write);
+}
+
 /*
  * Writes the footer to the end of the image file. This is needed when the
  * file grows as it overwrites the old footer
@@ -573,52 +578,52 @@ static int vpc_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
     return 0;
 }
 
-static int vpc_read(BlockDriverState *bs, int64_t sector_num,
-                    uint8_t *buf, int nb_sectors)
+static int coroutine_fn
+vpc_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+              QEMUIOVector *qiov, int flags)
 {
     BDRVVPCState *s = bs->opaque;
     int ret;
-    int64_t offset;
-    int64_t sectors, sectors_per_block;
+    int64_t image_offset;
+    int64_t n_bytes;
+    int64_t bytes_done = 0;
     VHDFooter *footer = (VHDFooter *) s->footer_buf;
+    QEMUIOVector local_qiov;
 
     if (be32_to_cpu(footer->type) == VHD_FIXED) {
-        return bdrv_read(bs->file->bs, sector_num, buf, nb_sectors);
+        return bdrv_co_preadv(bs->file->bs, offset, bytes, qiov, 0);
     }
-    while (nb_sectors > 0) {
-        offset = get_sector_offset(bs, sector_num, 0);
 
-        sectors_per_block = s->block_size >> BDRV_SECTOR_BITS;
-        sectors = sectors_per_block - (sector_num % sectors_per_block);
-        if (sectors > nb_sectors) {
-            sectors = nb_sectors;
-        }
+    qemu_co_mutex_lock(&s->lock);
+    qemu_iovec_init(&local_qiov, qiov->niov);
 
-        if (offset == -1) {
-            memset(buf, 0, sectors * BDRV_SECTOR_SIZE);
+    while (bytes > 0) {
+        image_offset = get_image_offset(bs, offset, false);
+        n_bytes = MIN(bytes, s->block_size - (offset % s->block_size));
+
+        if (image_offset == -1) {
+            qemu_iovec_memset(qiov, bytes_done, 0, n_bytes);
         } else {
-            ret = bdrv_pread(bs->file->bs, offset, buf,
-                sectors * BDRV_SECTOR_SIZE);
-            if (ret != sectors * BDRV_SECTOR_SIZE) {
-                return -1;
+            qemu_iovec_reset(&local_qiov);
+            qemu_iovec_concat(&local_qiov, qiov, bytes_done, n_bytes);
+
+            ret = bdrv_co_preadv(bs->file->bs, image_offset, n_bytes,
+                                 &local_qiov, 0);
+            if (ret < 0) {
+                goto fail;
             }
         }
 
-        nb_sectors -= sectors;
-        sector_num += sectors;
-        buf += sectors * BDRV_SECTOR_SIZE;
+        bytes -= n_bytes;
+        offset += n_bytes;
+        bytes_done += n_bytes;
     }
-    return 0;
-}
 
-static coroutine_fn int vpc_co_read(BlockDriverState *bs, int64_t sector_num,
-                                    uint8_t *buf, int nb_sectors)
-{
-    int ret;
-    BDRVVPCState *s = bs->opaque;
-    qemu_co_mutex_lock(&s->lock);
-    ret = vpc_read(bs, sector_num, buf, nb_sectors);
+    ret = 0;
+fail:
+    qemu_iovec_destroy(&local_qiov);
     qemu_co_mutex_unlock(&s->lock);
+
     return ret;
 }
 
@@ -1056,7 +1061,7 @@ static BlockDriver bdrv_vpc = {
     .bdrv_reopen_prepare    = vpc_reopen_prepare,
     .bdrv_create            = vpc_create,
 
-    .bdrv_read                  = vpc_co_read,
+    .bdrv_co_preadv             = vpc_co_preadv,
     .bdrv_write                 = vpc_co_write,
     .bdrv_co_get_block_status   = vpc_co_get_block_status,
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH v2 15/17] vpc: Implement .bdrv_co_pwritev() interface
  2016-04-28 13:16 [Qemu-devel] [PATCH v2 00/17] block: Convert .bdrv_read/write to .bdrv_preadv/pwritev Kevin Wolf
                   ` (13 preceding siblings ...)
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 14/17] vpc: Implement .bdrv_co_preadv() interface Kevin Wolf
@ 2016-04-28 13:16 ` Kevin Wolf
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 16/17] vvfat: Implement .bdrv_co_preadv/pwritev interfaces Kevin Wolf
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Kevin Wolf @ 2016-04-28 13:16 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, stefanha, famz, sw, qemu-devel

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/vpc.c | 86 ++++++++++++++++++++++++++++++-------------------------------
 1 file changed, 43 insertions(+), 43 deletions(-)

diff --git a/block/vpc.c b/block/vpc.c
index 0099a80..61a8427 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -518,7 +518,7 @@ static int rewrite_footer(BlockDriverState* bs)
  *
  * Returns the sectors' offset in the image file on success and < 0 on error
  */
-static int64_t alloc_block(BlockDriverState* bs, int64_t sector_num)
+static int64_t alloc_block(BlockDriverState* bs, int64_t offset)
 {
     BDRVVPCState *s = bs->opaque;
     int64_t bat_offset;
@@ -527,14 +527,13 @@ static int64_t alloc_block(BlockDriverState* bs, int64_t sector_num)
     uint8_t bitmap[s->bitmap_size];
 
     /* Check if sector_num is valid */
-    if ((sector_num < 0) || (sector_num > bs->total_sectors))
-        return -1;
+    if ((offset < 0) || (offset > bs->total_sectors * BDRV_SECTOR_SIZE)) {
+        return -EINVAL;
+    }
 
     /* Write entry into in-memory BAT */
-    index = (sector_num * 512) / s->block_size;
-    if (s->pagetable[index] != 0xFFFFFFFF)
-        return -1;
-
+    index = offset / s->block_size;
+    assert(s->pagetable[index] == 0xFFFFFFFF);
     s->pagetable[index] = s->free_data_block_offset / 512;
 
     /* Initialize the block's bitmap */
@@ -558,11 +557,11 @@ static int64_t alloc_block(BlockDriverState* bs, int64_t sector_num)
     if (ret < 0)
         goto fail;
 
-    return get_sector_offset(bs, sector_num, 0);
+    return get_image_offset(bs, offset, false);
 
 fail:
     s->free_data_block_offset -= (s->block_size + s->bitmap_size);
-    return -1;
+    return ret;
 }
 
 static int vpc_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
@@ -627,55 +626,56 @@ fail:
     return ret;
 }
 
-static int vpc_write(BlockDriverState *bs, int64_t sector_num,
-    const uint8_t *buf, int nb_sectors)
+static int coroutine_fn
+vpc_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+               QEMUIOVector *qiov, int flags)
 {
     BDRVVPCState *s = bs->opaque;
-    int64_t offset;
-    int64_t sectors, sectors_per_block;
+    int64_t image_offset;
+    int64_t n_bytes;
+    int64_t bytes_done = 0;
     int ret;
     VHDFooter *footer =  (VHDFooter *) s->footer_buf;
+    QEMUIOVector local_qiov;
 
     if (be32_to_cpu(footer->type) == VHD_FIXED) {
-        return bdrv_write(bs->file->bs, sector_num, buf, nb_sectors);
+        return bdrv_co_pwritev(bs->file->bs, offset, bytes, qiov, 0);
     }
-    while (nb_sectors > 0) {
-        offset = get_sector_offset(bs, sector_num, 1);
 
-        sectors_per_block = s->block_size >> BDRV_SECTOR_BITS;
-        sectors = sectors_per_block - (sector_num % sectors_per_block);
-        if (sectors > nb_sectors) {
-            sectors = nb_sectors;
-        }
+    qemu_co_mutex_lock(&s->lock);
+    qemu_iovec_init(&local_qiov, qiov->niov);
+
+    while (bytes > 0) {
+        image_offset = get_image_offset(bs, offset, true);
+        n_bytes = MIN(bytes, s->block_size - (offset % s->block_size));
 
-        if (offset == -1) {
-            offset = alloc_block(bs, sector_num);
-            if (offset < 0)
-                return -1;
+        if (image_offset == -1) {
+            image_offset = alloc_block(bs, offset);
+            if (image_offset < 0) {
+                ret = image_offset;
+                goto fail;
+            }
         }
 
-        ret = bdrv_pwrite(bs->file->bs, offset, buf,
-                          sectors * BDRV_SECTOR_SIZE);
-        if (ret != sectors * BDRV_SECTOR_SIZE) {
-            return -1;
+        qemu_iovec_reset(&local_qiov);
+        qemu_iovec_concat(&local_qiov, qiov, bytes_done, n_bytes);
+
+        ret = bdrv_co_pwritev(bs->file->bs, image_offset, n_bytes,
+                              &local_qiov, 0);
+        if (ret < 0) {
+            goto fail;
         }
 
-        nb_sectors -= sectors;
-        sector_num += sectors;
-        buf += sectors * BDRV_SECTOR_SIZE;
+        bytes -= n_bytes;
+        offset += n_bytes;
+        bytes_done += n_bytes;
     }
 
-    return 0;
-}
-
-static coroutine_fn int vpc_co_write(BlockDriverState *bs, int64_t sector_num,
-                                     const uint8_t *buf, int nb_sectors)
-{
-    int ret;
-    BDRVVPCState *s = bs->opaque;
-    qemu_co_mutex_lock(&s->lock);
-    ret = vpc_write(bs, sector_num, buf, nb_sectors);
+    ret = 0;
+fail:
+    qemu_iovec_destroy(&local_qiov);
     qemu_co_mutex_unlock(&s->lock);
+
     return ret;
 }
 
@@ -1062,7 +1062,7 @@ static BlockDriver bdrv_vpc = {
     .bdrv_create            = vpc_create,
 
     .bdrv_co_preadv             = vpc_co_preadv,
-    .bdrv_write                 = vpc_co_write,
+    .bdrv_co_pwritev            = vpc_co_pwritev,
     .bdrv_co_get_block_status   = vpc_co_get_block_status,
 
     .bdrv_get_info          = vpc_get_info,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH v2 16/17] vvfat: Implement .bdrv_co_preadv/pwritev interfaces
  2016-04-28 13:16 [Qemu-devel] [PATCH v2 00/17] block: Convert .bdrv_read/write to .bdrv_preadv/pwritev Kevin Wolf
                   ` (14 preceding siblings ...)
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 15/17] vpc: Implement .bdrv_co_pwritev() interface Kevin Wolf
@ 2016-04-28 13:16 ` Kevin Wolf
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 17/17] block: Remove BlockDriver.bdrv_read/write Kevin Wolf
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Kevin Wolf @ 2016-04-28 13:16 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, stefanha, famz, sw, qemu-devel

This doesn't really convert any of the actual vvfat logic to use
vectored I/O (and it's doubtful whether that would make sense), but
instead just adapts the wrappers to the modern interface.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/vvfat.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 46 insertions(+), 9 deletions(-)

diff --git a/block/vvfat.c b/block/vvfat.c
index 183fc4f..5b0c8dd 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -1179,6 +1179,7 @@ static int vvfat_open(BlockDriverState *bs, QDict *options, int flags,
         bs->read_only = 0;
     }
 
+    bs->request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O supported */
     bs->total_sectors = cyls * heads * secs;
 
     if (init_directories(s, dirname, heads, secs, errp)) {
@@ -1421,14 +1422,31 @@ DLOG(fprintf(stderr, "sector %d not allocated\n", (int)sector_num));
     return 0;
 }
 
-static coroutine_fn int vvfat_co_read(BlockDriverState *bs, int64_t sector_num,
-                                      uint8_t *buf, int nb_sectors)
+static int coroutine_fn
+vvfat_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+                QEMUIOVector *qiov, int flags)
 {
     int ret;
     BDRVVVFATState *s = bs->opaque;
+    uint64_t sector_num = offset >> BDRV_SECTOR_BITS;
+    int nb_sectors = bytes >> BDRV_SECTOR_BITS;
+    void *buf;
+
+    assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
+    assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
+
+    buf = g_try_malloc(bytes);
+    if (bytes && buf == NULL) {
+        return -ENOMEM;
+    }
+
     qemu_co_mutex_lock(&s->lock);
     ret = vvfat_read(bs, sector_num, buf, nb_sectors);
     qemu_co_mutex_unlock(&s->lock);
+
+    qemu_iovec_from_buf(qiov, 0, buf, bytes);
+    g_free(buf);
+
     return ret;
 }
 
@@ -2880,14 +2898,31 @@ DLOG(checkpoint());
     return 0;
 }
 
-static coroutine_fn int vvfat_co_write(BlockDriverState *bs, int64_t sector_num,
-                                       const uint8_t *buf, int nb_sectors)
+static int coroutine_fn
+vvfat_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+                 QEMUIOVector *qiov, int flags)
 {
     int ret;
     BDRVVVFATState *s = bs->opaque;
+    uint64_t sector_num = offset >> BDRV_SECTOR_BITS;
+    int nb_sectors = bytes >> BDRV_SECTOR_BITS;
+    void *buf;
+
+    assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
+    assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
+
+    buf = g_try_malloc(bytes);
+    if (bytes && buf == NULL) {
+        return -ENOMEM;
+    }
+    qemu_iovec_to_buf(qiov, 0, buf, bytes);
+
     qemu_co_mutex_lock(&s->lock);
     ret = vvfat_write(bs, sector_num, buf, nb_sectors);
     qemu_co_mutex_unlock(&s->lock);
+
+    g_free(buf);
+
     return ret;
 }
 
@@ -2904,8 +2939,10 @@ static int64_t coroutine_fn vvfat_co_get_block_status(BlockDriverState *bs,
     return BDRV_BLOCK_DATA;
 }
 
-static int write_target_commit(BlockDriverState *bs, int64_t sector_num,
-	const uint8_t* buffer, int nb_sectors) {
+static int coroutine_fn
+write_target_commit(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+                    QEMUIOVector *qiov, int flags)
+{
     BDRVVVFATState* s = *((BDRVVVFATState**) bs->opaque);
     return try_commit(s);
 }
@@ -2918,7 +2955,7 @@ static void write_target_close(BlockDriverState *bs) {
 
 static BlockDriver vvfat_write_target = {
     .format_name        = "vvfat_write_target",
-    .bdrv_write         = write_target_commit,
+    .bdrv_co_pwritev    = write_target_commit,
     .bdrv_close         = write_target_close,
 };
 
@@ -3014,8 +3051,8 @@ static BlockDriver bdrv_vvfat = {
     .bdrv_file_open         = vvfat_open,
     .bdrv_close             = vvfat_close,
 
-    .bdrv_read              = vvfat_co_read,
-    .bdrv_write             = vvfat_co_write,
+    .bdrv_co_preadv         = vvfat_co_preadv,
+    .bdrv_co_pwritev        = vvfat_co_pwritev,
     .bdrv_co_get_block_status = vvfat_co_get_block_status,
 };
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH v2 17/17] block: Remove BlockDriver.bdrv_read/write
  2016-04-28 13:16 [Qemu-devel] [PATCH v2 00/17] block: Convert .bdrv_read/write to .bdrv_preadv/pwritev Kevin Wolf
                   ` (15 preceding siblings ...)
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 16/17] vvfat: Implement .bdrv_co_preadv/pwritev interfaces Kevin Wolf
@ 2016-04-28 13:16 ` Kevin Wolf
  2016-04-29  3:29 ` [Qemu-devel] [PATCH v2 00/17] block: Convert .bdrv_read/write to .bdrv_preadv/pwritev Fam Zheng
  2016-04-29  9:57 ` Kevin Wolf
  18 siblings, 0 replies; 34+ messages in thread
From: Kevin Wolf @ 2016-04-28 13:16 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, stefanha, famz, sw, qemu-devel

There are no block drivers left that implement the old .bdrv_read/write
interface, so it can be removed now. This gets us rid of the
corresponding emulation functions, too.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.c                   |  2 --
 block/io.c                | 92 -----------------------------------------------
 include/block/block_int.h |  4 ---
 3 files changed, 98 deletions(-)

diff --git a/block.c b/block.c
index d4939b4..78e1f15 100644
--- a/block.c
+++ b/block.c
@@ -218,8 +218,6 @@ void bdrv_get_full_backing_filename(BlockDriverState *bs, char *dest, size_t sz,
 
 void bdrv_register(BlockDriver *bdrv)
 {
-    bdrv_setup_io_funcs(bdrv);
-
     QLIST_INSERT_HEAD(&bdrv_drivers, bdrv, list);
 }
 
diff --git a/block/io.c b/block/io.c
index 2f53636..6233174 100644
--- a/block/io.c
+++ b/block/io.c
@@ -34,12 +34,6 @@
 
 #define NOT_DONE 0x7fffffff /* used while emulated sync operation in progress */
 
-static BlockAIOCB *bdrv_aio_readv_em(BlockDriverState *bs,
-        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
-        BlockCompletionFunc *cb, void *opaque);
-static BlockAIOCB *bdrv_aio_writev_em(BlockDriverState *bs,
-        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
-        BlockCompletionFunc *cb, void *opaque);
 static BlockAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs,
                                          int64_t sector_num,
                                          QEMUIOVector *qiov,
@@ -117,18 +111,6 @@ void bdrv_io_limits_update_group(BlockDriverState *bs, const char *group)
     bdrv_io_limits_enable(bs, group);
 }
 
-void bdrv_setup_io_funcs(BlockDriver *bdrv)
-{
-    /* bdrv_co_readv_em()/brdv_co_writev_em() work in terms of aio, so if
-     * the block driver lacks aio we need to emulate that.
-     */
-    if (!bdrv->bdrv_aio_readv) {
-        /* add AIO emulation layer */
-        bdrv->bdrv_aio_readv = bdrv_aio_readv_em;
-        bdrv->bdrv_aio_writev = bdrv_aio_writev_em;
-    }
-}
-
 void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
 {
     BlockDriver *drv = bs->drv;
@@ -2148,80 +2130,6 @@ void bdrv_aio_cancel_async(BlockAIOCB *acb)
 /**************************************************************/
 /* async block device emulation */
 
-typedef struct BlockAIOCBSync {
-    BlockAIOCB common;
-    QEMUBH *bh;
-    int ret;
-    /* vector translation state */
-    QEMUIOVector *qiov;
-    uint8_t *bounce;
-    int is_write;
-} BlockAIOCBSync;
-
-static const AIOCBInfo bdrv_em_aiocb_info = {
-    .aiocb_size         = sizeof(BlockAIOCBSync),
-};
-
-static void bdrv_aio_bh_cb(void *opaque)
-{
-    BlockAIOCBSync *acb = opaque;
-
-    if (!acb->is_write && acb->ret >= 0) {
-        qemu_iovec_from_buf(acb->qiov, 0, acb->bounce, acb->qiov->size);
-    }
-    qemu_vfree(acb->bounce);
-    acb->common.cb(acb->common.opaque, acb->ret);
-    qemu_bh_delete(acb->bh);
-    acb->bh = NULL;
-    qemu_aio_unref(acb);
-}
-
-static BlockAIOCB *bdrv_aio_rw_vector(BlockDriverState *bs,
-                                      int64_t sector_num,
-                                      QEMUIOVector *qiov,
-                                      int nb_sectors,
-                                      BlockCompletionFunc *cb,
-                                      void *opaque,
-                                      int is_write)
-
-{
-    BlockAIOCBSync *acb;
-
-    acb = qemu_aio_get(&bdrv_em_aiocb_info, bs, cb, opaque);
-    acb->is_write = is_write;
-    acb->qiov = qiov;
-    acb->bounce = qemu_try_blockalign(bs, qiov->size);
-    acb->bh = aio_bh_new(bdrv_get_aio_context(bs), bdrv_aio_bh_cb, acb);
-
-    if (acb->bounce == NULL) {
-        acb->ret = -ENOMEM;
-    } else if (is_write) {
-        qemu_iovec_to_buf(acb->qiov, 0, acb->bounce, qiov->size);
-        acb->ret = bs->drv->bdrv_write(bs, sector_num, acb->bounce, nb_sectors);
-    } else {
-        acb->ret = bs->drv->bdrv_read(bs, sector_num, acb->bounce, nb_sectors);
-    }
-
-    qemu_bh_schedule(acb->bh);
-
-    return &acb->common;
-}
-
-static BlockAIOCB *bdrv_aio_readv_em(BlockDriverState *bs,
-        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
-        BlockCompletionFunc *cb, void *opaque)
-{
-    return bdrv_aio_rw_vector(bs, sector_num, qiov, nb_sectors, cb, opaque, 0);
-}
-
-static BlockAIOCB *bdrv_aio_writev_em(BlockDriverState *bs,
-        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
-        BlockCompletionFunc *cb, void *opaque)
-{
-    return bdrv_aio_rw_vector(bs, sector_num, qiov, nb_sectors, cb, opaque, 1);
-}
-
-
 typedef struct BlockAIOCBCoroutine {
     BlockAIOCB common;
     BlockRequest req;
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 0ef6f2c..22fbea4 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -127,10 +127,6 @@ struct BlockDriver {
                      Error **errp);
     int (*bdrv_file_open)(BlockDriverState *bs, QDict *options, int flags,
                           Error **errp);
-    int (*bdrv_read)(BlockDriverState *bs, int64_t sector_num,
-                     uint8_t *buf, int nb_sectors);
-    int (*bdrv_write)(BlockDriverState *bs, int64_t sector_num,
-                      const uint8_t *buf, int nb_sectors);
     void (*bdrv_close)(BlockDriverState *bs);
     int (*bdrv_create)(const char *filename, QemuOpts *opts, Error **errp);
     int (*bdrv_set_key)(BlockDriverState *bs, const char *key);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH v2 05/17] block: Introduce .bdrv_co_preadv/pwritev BlockDriver function
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 05/17] block: Introduce .bdrv_co_preadv/pwritev BlockDriver function Kevin Wolf
@ 2016-04-28 15:21   ` Eric Blake
  0 siblings, 0 replies; 34+ messages in thread
From: Eric Blake @ 2016-04-28 15:21 UTC (permalink / raw)
  To: Kevin Wolf, qemu-block; +Cc: famz, sw, qemu-devel, mreitz, stefanha

[-- Attachment #1: Type: text/plain, Size: 780 bytes --]

On 04/28/2016 07:16 AM, Kevin Wolf wrote:
> Many parts of the block layer are already byte granularity. The block
> driver interface, however, was still missing an interface that allows
> making use of this. This patch introduces a new BlockDriver interface,
> which is based on coroutines, vectored, has flags and uses a byte
> granularity. This is now the preferred interface for new drivers.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/io.c                | 28 ++++++++++++++++++++++------
>  include/block/block_int.h |  4 ++++
>  2 files changed, 26 insertions(+), 6 deletions(-)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH v2 06/17] bochs: Implement .bdrv_co_preadv() interface
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 06/17] bochs: Implement .bdrv_co_preadv() interface Kevin Wolf
@ 2016-04-28 15:50   ` Eric Blake
  0 siblings, 0 replies; 34+ messages in thread
From: Eric Blake @ 2016-04-28 15:50 UTC (permalink / raw)
  To: Kevin Wolf, qemu-block; +Cc: famz, sw, qemu-devel, mreitz, stefanha

[-- Attachment #1: Type: text/plain, Size: 2094 bytes --]

On 04/28/2016 07:16 AM, Kevin Wolf wrote:
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/bochs.c | 51 +++++++++++++++++++++++++++++++++------------------
>  1 file changed, 33 insertions(+), 18 deletions(-)
> 

> +    uint64_t sector_num = offset >> BDRV_SECTOR_BITS;
> +    int nb_sectors = bytes >> BDRV_SECTOR_BITS;

Since we're using the named constant here (instead of /512 or >>9),...

> +    uint64_t bytes_done = 0;
> +    QEMUIOVector local_qiov;
>      int ret;
>  
> +    assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
> +    assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
> +
> +    qemu_iovec_init(&local_qiov, qiov->niov);
> +    qemu_co_mutex_lock(&s->lock);
> +
>      while (nb_sectors > 0) {
>          int64_t block_offset = seek_to_sector(bs, sector_num);
>          if (block_offset < 0) {
> -            return block_offset;
> -        } else if (block_offset > 0) {
> -            ret = bdrv_pread(bs->file->bs, block_offset, buf, 512);
> +            ret = block_offset;
> +            goto fail;
> +        }
> +
> +        qemu_iovec_reset(&local_qiov);
> +        qemu_iovec_concat(&local_qiov, qiov, bytes_done, 512);

should we also use the named constant BDRV_SECTOR_SIZE here instead of 512?

I don't care strongly enough for a respin, though, particularly since it
would affect line wrapping.

Reviewed-by: Eric Blake <eblake@redhat.com>

> +
> +        if (block_offset > 0) {
> +            ret = bdrv_co_preadv(bs->file->bs, block_offset, 512,
> +                                 &local_qiov, 0);
>              if (ret < 0) {
> -                return ret;
> +                goto fail;
>              }
>          } else {
> -            memset(buf, 0, 512);
> +            qemu_iovec_memset(&local_qiov, 0, 0, 512);
>          }
>          nb_sectors--;
>          sector_num++;
> -        buf += 512;
> +        bytes_done += 512;

More of the magic 512, if you care.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH v2 02/17] block: Introduce bdrv_driver_pwritev()
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 02/17] block: Introduce bdrv_driver_pwritev() Kevin Wolf
@ 2016-04-29  1:14   ` Fam Zheng
  2016-04-30 21:54   ` Eric Blake
  1 sibling, 0 replies; 34+ messages in thread
From: Fam Zheng @ 2016-04-29  1:14 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, mreitz, stefanha, sw, qemu-devel

On Thu, 04/28 15:16, Kevin Wolf wrote:
> This is a function that simply calls into the block driver for doing a
> write, providing the byte granularity interface we want to eventually
> have everywhere, and using whatever interface that driver supports.
> 
> This one is a bit more interesting that the version for reads: It adds

s/that/than/

Othewise looks good.

Fam

> support for .bdrv_co_writev_flags() everywhere, so that drivers
> implementing this function can drop .bdrv_co_writev() now.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH v2 03/17] block: Support AIO drivers in bdrv_driver_preadv/pwritev()
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 03/17] block: Support AIO drivers in bdrv_driver_preadv/pwritev() Kevin Wolf
@ 2016-04-29  1:22   ` Fam Zheng
  2016-04-29  3:25     ` Fam Zheng
  0 siblings, 1 reply; 34+ messages in thread
From: Fam Zheng @ 2016-04-29  1:22 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, mreitz, stefanha, sw, qemu-devel

On Thu, 04/28 15:16, Kevin Wolf wrote:
> Instead of registering emulation functions as .bdrv_co_writev, just
> directly check whether the function is there or not, and use the AIO
> interface if it isn't. This makes the read/write functions more
> consistent with how things are done in other places (flush, discard,
> etc.)
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  block/io.c   | 126 ++++++++++++++++++++++++-----------------------------------
>  trace-events |   1 -
>  2 files changed, 52 insertions(+), 75 deletions(-)
> 
> diff --git a/block/io.c b/block/io.c
> index c9b2864..86d97c3 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -40,12 +40,6 @@ static BlockAIOCB *bdrv_aio_readv_em(BlockDriverState *bs,
>  static BlockAIOCB *bdrv_aio_writev_em(BlockDriverState *bs,
>          int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
>          BlockCompletionFunc *cb, void *opaque);
> -static int coroutine_fn bdrv_co_readv_em(BlockDriverState *bs,
> -                                         int64_t sector_num, int nb_sectors,
> -                                         QEMUIOVector *iov);
> -static int coroutine_fn bdrv_co_writev_em(BlockDriverState *bs,
> -                                         int64_t sector_num, int nb_sectors,
> -                                         QEMUIOVector *iov);
>  static BlockAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs,
>                                           int64_t sector_num,
>                                           QEMUIOVector *qiov,
> @@ -125,19 +119,13 @@ void bdrv_io_limits_update_group(BlockDriverState *bs, const char *group)
>  
>  void bdrv_setup_io_funcs(BlockDriver *bdrv)
>  {
> -    /* Block drivers without coroutine functions need emulation */
> -    if (!bdrv->bdrv_co_readv) {
> -        bdrv->bdrv_co_readv = bdrv_co_readv_em;
> -        bdrv->bdrv_co_writev = bdrv_co_writev_em;
> -
> -        /* bdrv_co_readv_em()/brdv_co_writev_em() work in terms of aio, so if
> -         * the block driver lacks aio we need to emulate that too.
> -         */
> -        if (!bdrv->bdrv_aio_readv) {
> -            /* add AIO emulation layer */
> -            bdrv->bdrv_aio_readv = bdrv_aio_readv_em;
> -            bdrv->bdrv_aio_writev = bdrv_aio_writev_em;
> -        }
> +    /* bdrv_co_readv_em()/brdv_co_writev_em() work in terms of aio, so if
> +     * the block driver lacks aio we need to emulate that.
> +     */
> +    if (!bdrv->bdrv_aio_readv) {

I'd also AND with !bdrv->bdrv_co_readv, because in that case the em functions
are assigned but not used.

> +        /* add AIO emulation layer */
> +        bdrv->bdrv_aio_readv = bdrv_aio_readv_em;
> +        bdrv->bdrv_aio_writev = bdrv_aio_writev_em;
>      }
>  }
>  

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH v2 07/17] cloop: Implement .bdrv_co_preadv() interface
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 07/17] cloop: " Kevin Wolf
@ 2016-04-29  2:28   ` Eric Blake
  0 siblings, 0 replies; 34+ messages in thread
From: Eric Blake @ 2016-04-29  2:28 UTC (permalink / raw)
  To: Kevin Wolf, qemu-block; +Cc: famz, sw, qemu-devel, mreitz, stefanha

[-- Attachment #1: Type: text/plain, Size: 888 bytes --]

On 04/28/2016 07:16 AM, Kevin Wolf wrote:
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/cloop.c | 38 ++++++++++++++++++++++----------------
>  1 file changed, 22 insertions(+), 16 deletions(-)
> 

>      for (i = 0; i < nb_sectors; i++) {
> +        void *data;
>          uint32_t sector_offset_in_block =
>              ((sector_num + i) % s->sectors_per_block),
>              block_num = (sector_num + i) / s->sectors_per_block;
>          if (cloop_read_block(bs, block_num) != 0) {
> -            return -1;
> +            ret = -EIO;
> +            goto fail;

Might want to mention in the commit message that the fix of the return
value to use negative errno is intentional.

Otherwise,
Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH v2 08/17] dmg: Implement .bdrv_co_preadv() interface
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 08/17] dmg: " Kevin Wolf
@ 2016-04-29  2:39   ` Eric Blake
  0 siblings, 0 replies; 34+ messages in thread
From: Eric Blake @ 2016-04-29  2:39 UTC (permalink / raw)
  To: Kevin Wolf, qemu-block; +Cc: famz, sw, qemu-devel, mreitz, stefanha

[-- Attachment #1: Type: text/plain, Size: 549 bytes --]

On 04/28/2016 07:16 AM, Kevin Wolf wrote:
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/dmg.c | 40 +++++++++++++++++++++++-----------------
>  1 file changed, 23 insertions(+), 17 deletions(-)
> 

>          if (dmg_read_chunk(bs, sector_num + i) != 0) {
> -            return -1;
> +            ret = -EIO;
> +            goto fail;

Same comment as for cloop.

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH v2 13/17] vmdk: Implement .bdrv_co_pwritev() interface
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 13/17] vmdk: Implement .bdrv_co_pwritev() interface Kevin Wolf
@ 2016-04-29  3:08   ` Fam Zheng
  2016-04-29  7:41     ` Kevin Wolf
  2016-04-29  8:49     ` Pavel Butsykin
  0 siblings, 2 replies; 34+ messages in thread
From: Fam Zheng @ 2016-04-29  3:08 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, mreitz, stefanha, sw, qemu-devel

On Thu, 04/28 15:16, Kevin Wolf wrote:
> +typedef struct VmdkWriteCompressedCo {
> +    BlockDriverState *bs;
> +    int64_t sector_num;
> +    const uint8_t *buf;
> +    int nb_sectors;
> +    int ret;
> +} VmdkWriteCompressedCo;
> +
> +static void vmdk_co_write_compressed(void *opaque)
> +{
> +    VmdkWriteCompressedCo *co = opaque;
> +    QEMUIOVector local_qiov;
> +    uint64_t offset = co->sector_num * BDRV_SECTOR_SIZE;
> +    uint64_t bytes = co->nb_sectors * BDRV_SECTOR_SIZE;
> +
> +    struct iovec iov = (struct iovec) {
> +        .iov_base   = (uint8_t*) co->buf,
> +        .iov_len    = bytes,
> +    };
> +    qemu_iovec_init_external(&local_qiov, &iov, 1);
> +
> +    co->ret = vmdk_pwritev(co->bs, offset, bytes, &local_qiov, false, false);

Should it acquire s->lock?

> +}
> +
>  static int vmdk_write_compressed(BlockDriverState *bs,
>                                   int64_t sector_num,
>                                   const uint8_t *buf,
>                                   int nb_sectors)
>  {
>      BDRVVmdkState *s = bs->opaque;
> +
>      if (s->num_extents == 1 && s->extents[0].compressed) {
> -        return vmdk_write(bs, sector_num, buf, nb_sectors, false, false);
> +        Coroutine *co;
> +        AioContext *aio_context = bdrv_get_aio_context(bs);
> +        VmdkWriteCompressedCo data = {
> +            .bs         = bs,
> +            .sector_num = sector_num,
> +            .buf        = buf,
> +            .nb_sectors = nb_sectors,
> +            .ret        = -EINPROGRESS,
> +        };
> +        co = qemu_coroutine_create(vmdk_co_write_compressed);
> +        qemu_coroutine_enter(co, &data);
> +        while (data.ret == -EINPROGRESS) {
> +            aio_poll(aio_context, true);
> +        }
> +        return data.ret;

Don't you have a plan to make the creation of coroutine for compressed write in
in block layer?  Or will bdrv_co_pwritev gain a "compressed" flag
(BDRV_REQ_COMPRESSED) in the future?

Fam

>      } else {
>          return -ENOTSUP;
>      }

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH v2 14/17] vpc: Implement .bdrv_co_preadv() interface
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 14/17] vpc: Implement .bdrv_co_preadv() interface Kevin Wolf
@ 2016-04-29  3:14   ` Fam Zheng
  0 siblings, 0 replies; 34+ messages in thread
From: Fam Zheng @ 2016-04-29  3:14 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, mreitz, stefanha, sw, qemu-devel

On Thu, 04/28 15:16, Kevin Wolf wrote:
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/vpc.c | 79 ++++++++++++++++++++++++++++++++-----------------------------
>  1 file changed, 42 insertions(+), 37 deletions(-)
> 
> diff --git a/block/vpc.c b/block/vpc.c
> index 3e2ea69..0099a80 100644
> --- a/block/vpc.c
> +++ b/block/vpc.c
> @@ -454,22 +454,21 @@ static int vpc_reopen_prepare(BDRVReopenState *state,
>   * The parameter write must be 1 if the offset will be used for a write
>   * operation (the block bitmaps is updated then), 0 otherwise.
>   */
> -static inline int64_t get_sector_offset(BlockDriverState *bs,
> -    int64_t sector_num, int write)
> +static inline int64_t get_image_offset(BlockDriverState *bs, uint64_t offset,
> +                                        bool write)

Parameter's indentation is one column off.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH v2 03/17] block: Support AIO drivers in bdrv_driver_preadv/pwritev()
  2016-04-29  1:22   ` Fam Zheng
@ 2016-04-29  3:25     ` Fam Zheng
  0 siblings, 0 replies; 34+ messages in thread
From: Fam Zheng @ 2016-04-29  3:25 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, sw, stefanha, qemu-block, mreitz

On Fri, 04/29 09:22, Fam Zheng wrote:
> > @@ -125,19 +119,13 @@ void bdrv_io_limits_update_group(BlockDriverState *bs, const char *group)
> >  
> >  void bdrv_setup_io_funcs(BlockDriver *bdrv)
> >  {
> > -    /* Block drivers without coroutine functions need emulation */
> > -    if (!bdrv->bdrv_co_readv) {
> > -        bdrv->bdrv_co_readv = bdrv_co_readv_em;
> > -        bdrv->bdrv_co_writev = bdrv_co_writev_em;
> > -
> > -        /* bdrv_co_readv_em()/brdv_co_writev_em() work in terms of aio, so if
> > -         * the block driver lacks aio we need to emulate that too.
> > -         */
> > -        if (!bdrv->bdrv_aio_readv) {
> > -            /* add AIO emulation layer */
> > -            bdrv->bdrv_aio_readv = bdrv_aio_readv_em;
> > -            bdrv->bdrv_aio_writev = bdrv_aio_writev_em;
> > -        }
> > +    /* bdrv_co_readv_em()/brdv_co_writev_em() work in terms of aio, so if
> > +     * the block driver lacks aio we need to emulate that.
> > +     */
> > +    if (!bdrv->bdrv_aio_readv) {
> 
> I'd also AND with !bdrv->bdrv_co_readv, because in that case the em functions
> are assigned but not used.

Never mind, I see the last patch removes this altogether.

Fam

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH v2 00/17] block: Convert .bdrv_read/write to .bdrv_preadv/pwritev
  2016-04-28 13:16 [Qemu-devel] [PATCH v2 00/17] block: Convert .bdrv_read/write to .bdrv_preadv/pwritev Kevin Wolf
                   ` (16 preceding siblings ...)
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 17/17] block: Remove BlockDriver.bdrv_read/write Kevin Wolf
@ 2016-04-29  3:29 ` Fam Zheng
  2016-04-29  9:57 ` Kevin Wolf
  18 siblings, 0 replies; 34+ messages in thread
From: Fam Zheng @ 2016-04-29  3:29 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, mreitz, stefanha, sw, qemu-devel

On Thu, 04/28 15:16, Kevin Wolf wrote:
> This series introduces a new BlockDriver interface, which will hopefully be the
> final one, or at least good enough for another few years: .bdrv_preadv/pwritev.
> It is based on coroutines, vectored, has flags and uses a byte granularity.
> This is now the preferred interface for new drivers.
> 
> All drivers still using the legacy interface .bdrv_read/write are converted to
> the new interface and the emulation code we had for the old interface is
> removed. Most of the drivers become zero-copy with these patches as they are
> vectored now; only vvfat continues to allocate a bounce buffer.
> 
> The less obscure formats (vmdk, vdi, vpc) also natively support byte-aligned
> requests now. The block layer is still enforcing a minimal alignment of 512, so
> this isn't actually used yet, but in a next step, we can lift this restriction
> for drivers that implement .bdrv_preadv/pwritev.

Code working on byte granularity is consistent and much nicer to read!

Though I'm not 100% sure about the compressed write change in VMDK, but there
is apparently nothing wrong, too. So:

Reviewed-by: Fam Zheng <famz@redhat.com>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH v2 13/17] vmdk: Implement .bdrv_co_pwritev() interface
  2016-04-29  3:08   ` Fam Zheng
@ 2016-04-29  7:41     ` Kevin Wolf
  2016-04-29  8:49     ` Pavel Butsykin
  1 sibling, 0 replies; 34+ messages in thread
From: Kevin Wolf @ 2016-04-29  7:41 UTC (permalink / raw)
  To: Fam Zheng; +Cc: qemu-block, mreitz, stefanha, sw, qemu-devel

Am 29.04.2016 um 05:08 hat Fam Zheng geschrieben:
> On Thu, 04/28 15:16, Kevin Wolf wrote:
> > +typedef struct VmdkWriteCompressedCo {
> > +    BlockDriverState *bs;
> > +    int64_t sector_num;
> > +    const uint8_t *buf;
> > +    int nb_sectors;
> > +    int ret;
> > +} VmdkWriteCompressedCo;
> > +
> > +static void vmdk_co_write_compressed(void *opaque)
> > +{
> > +    VmdkWriteCompressedCo *co = opaque;
> > +    QEMUIOVector local_qiov;
> > +    uint64_t offset = co->sector_num * BDRV_SECTOR_SIZE;
> > +    uint64_t bytes = co->nb_sectors * BDRV_SECTOR_SIZE;
> > +
> > +    struct iovec iov = (struct iovec) {
> > +        .iov_base   = (uint8_t*) co->buf,
> > +        .iov_len    = bytes,
> > +    };
> > +    qemu_iovec_init_external(&local_qiov, &iov, 1);
> > +
> > +    co->ret = vmdk_pwritev(co->bs, offset, bytes, &local_qiov, false, false);
> 
> Should it acquire s->lock?

Yes, probably, if only for consistency.

It didn't acquire the lock before, so it didn't seem appropriate to add
it in this patch. Not taking the lock is fine because this function is
only called by qemu-img, where there is no concurrency.

There was a proposal recently to allow compressed writes during a normal
VM run. If we want to enable this, the locking would be needed. (The
same is true for qcow2 and qcow2.)

> > +}
> > +
> >  static int vmdk_write_compressed(BlockDriverState *bs,
> >                                   int64_t sector_num,
> >                                   const uint8_t *buf,
> >                                   int nb_sectors)
> >  {
> >      BDRVVmdkState *s = bs->opaque;
> > +
> >      if (s->num_extents == 1 && s->extents[0].compressed) {
> > -        return vmdk_write(bs, sector_num, buf, nb_sectors, false, false);
> > +        Coroutine *co;
> > +        AioContext *aio_context = bdrv_get_aio_context(bs);
> > +        VmdkWriteCompressedCo data = {
> > +            .bs         = bs,
> > +            .sector_num = sector_num,
> > +            .buf        = buf,
> > +            .nb_sectors = nb_sectors,
> > +            .ret        = -EINPROGRESS,
> > +        };
> > +        co = qemu_coroutine_create(vmdk_co_write_compressed);
> > +        qemu_coroutine_enter(co, &data);
> > +        while (data.ret == -EINPROGRESS) {
> > +            aio_poll(aio_context, true);
> > +        }
> > +        return data.ret;
> 
> Don't you have a plan to make the creation of coroutine for compressed write in
> in block layer?  Or will bdrv_co_pwritev gain a "compressed" flag
> (BDRV_REQ_COMPRESSED) in the future?

Well, there was this recent discussion, but personally I don't have any
plans. If we want to do compressed writes from running VMs, we will need
the conversion because we must not block the vcpu.

Or maybe the flag would actually be the nicer option, yes. I don't have
a strong opinion on this yet.

Kevin

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH v2 13/17] vmdk: Implement .bdrv_co_pwritev() interface
  2016-04-29  3:08   ` Fam Zheng
  2016-04-29  7:41     ` Kevin Wolf
@ 2016-04-29  8:49     ` Pavel Butsykin
  2016-04-29  9:49       ` Kevin Wolf
  1 sibling, 1 reply; 34+ messages in thread
From: Pavel Butsykin @ 2016-04-29  8:49 UTC (permalink / raw)
  To: Fam Zheng, Kevin Wolf; +Cc: qemu-devel, sw, stefanha, qemu-block, mreitz

On 29.04.2016 06:08, Fam Zheng wrote:
> On Thu, 04/28 15:16, Kevin Wolf wrote:
>> +typedef struct VmdkWriteCompressedCo {
>> +    BlockDriverState *bs;
>> +    int64_t sector_num;
>> +    const uint8_t *buf;
>> +    int nb_sectors;
>> +    int ret;
>> +} VmdkWriteCompressedCo;
>> +
>> +static void vmdk_co_write_compressed(void *opaque)
>> +{
>> +    VmdkWriteCompressedCo *co = opaque;
>> +    QEMUIOVector local_qiov;
>> +    uint64_t offset = co->sector_num * BDRV_SECTOR_SIZE;
>> +    uint64_t bytes = co->nb_sectors * BDRV_SECTOR_SIZE;
>> +
>> +    struct iovec iov = (struct iovec) {
>> +        .iov_base   = (uint8_t*) co->buf,
>> +        .iov_len    = bytes,
>> +    };
>> +    qemu_iovec_init_external(&local_qiov, &iov, 1);
>> +
>> +    co->ret = vmdk_pwritev(co->bs, offset, bytes, &local_qiov, false, false);
>
> Should it acquire s->lock?
>
The write_compressed callback is currently used only for image
converting, so I think the lock is not required.

>> +}
>> +
>>   static int vmdk_write_compressed(BlockDriverState *bs,
>>                                    int64_t sector_num,
>>                                    const uint8_t *buf,
>>                                    int nb_sectors)
>>   {
>>       BDRVVmdkState *s = bs->opaque;
>> +
>>       if (s->num_extents == 1 && s->extents[0].compressed) {
>> -        return vmdk_write(bs, sector_num, buf, nb_sectors, false, false);
>> +        Coroutine *co;
>> +        AioContext *aio_context = bdrv_get_aio_context(bs);
>> +        VmdkWriteCompressedCo data = {
>> +            .bs         = bs,
>> +            .sector_num = sector_num,
>> +            .buf        = buf,
>> +            .nb_sectors = nb_sectors,
>> +            .ret        = -EINPROGRESS,
>> +        };
>> +        co = qemu_coroutine_create(vmdk_co_write_compressed);
>> +        qemu_coroutine_enter(co, &data);
>> +        while (data.ret == -EINPROGRESS) {
>> +            aio_poll(aio_context, true);
>> +        }
>> +        return data.ret;
>
> Don't you have a plan to make the creation of coroutine for compressed write in
> in block layer?  Or will bdrv_co_pwritev gain a "compressed" flag
> (BDRV_REQ_COMPRESSED) in the future?
>
Actually, I already have these patches as part of the issue of backup
compression, hope to send today.

> Fam
>
>>       } else {
>>           return -ENOTSUP;
>>       }
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH v2 13/17] vmdk: Implement .bdrv_co_pwritev() interface
  2016-04-29  8:49     ` Pavel Butsykin
@ 2016-04-29  9:49       ` Kevin Wolf
  2016-04-29 10:31         ` Pavel Butsykin
  0 siblings, 1 reply; 34+ messages in thread
From: Kevin Wolf @ 2016-04-29  9:49 UTC (permalink / raw)
  To: Pavel Butsykin; +Cc: Fam Zheng, qemu-devel, sw, stefanha, qemu-block, mreitz

Am 29.04.2016 um 10:49 hat Pavel Butsykin geschrieben:
> On 29.04.2016 06:08, Fam Zheng wrote:
> >On Thu, 04/28 15:16, Kevin Wolf wrote:
> >>  static int vmdk_write_compressed(BlockDriverState *bs,
> >>                                   int64_t sector_num,
> >>                                   const uint8_t *buf,
> >>                                   int nb_sectors)
> >>  {
> >>      BDRVVmdkState *s = bs->opaque;
> >>+
> >>      if (s->num_extents == 1 && s->extents[0].compressed) {
> >>-        return vmdk_write(bs, sector_num, buf, nb_sectors, false, false);
> >>+        Coroutine *co;
> >>+        AioContext *aio_context = bdrv_get_aio_context(bs);
> >>+        VmdkWriteCompressedCo data = {
> >>+            .bs         = bs,
> >>+            .sector_num = sector_num,
> >>+            .buf        = buf,
> >>+            .nb_sectors = nb_sectors,
> >>+            .ret        = -EINPROGRESS,
> >>+        };
> >>+        co = qemu_coroutine_create(vmdk_co_write_compressed);
> >>+        qemu_coroutine_enter(co, &data);
> >>+        while (data.ret == -EINPROGRESS) {
> >>+            aio_poll(aio_context, true);
> >>+        }
> >>+        return data.ret;
> >
> >Don't you have a plan to make the creation of coroutine for compressed write in
> >in block layer?  Or will bdrv_co_pwritev gain a "compressed" flag
> >(BDRV_REQ_COMPRESSED) in the future?
> >
> Actually, I already have these patches as part of the issue of backup
> compression, hope to send today.

Sounds good. Do you use a new flag for the normal write functions or just
update the existing .bdrv_write_compressed function?

Also, please base your series on this one to avoid merge conflicts in
vmdk.c. I'm going to apply the series to block-next now as it's fully
reveiwed. I guess you'll effectively need to revert what this patch does
to vmdk_write_compressed() in favour of a block/io.c solution.

Kevin

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH v2 00/17] block: Convert .bdrv_read/write to .bdrv_preadv/pwritev
  2016-04-28 13:16 [Qemu-devel] [PATCH v2 00/17] block: Convert .bdrv_read/write to .bdrv_preadv/pwritev Kevin Wolf
                   ` (17 preceding siblings ...)
  2016-04-29  3:29 ` [Qemu-devel] [PATCH v2 00/17] block: Convert .bdrv_read/write to .bdrv_preadv/pwritev Fam Zheng
@ 2016-04-29  9:57 ` Kevin Wolf
  18 siblings, 0 replies; 34+ messages in thread
From: Kevin Wolf @ 2016-04-29  9:57 UTC (permalink / raw)
  To: qemu-block; +Cc: mreitz, stefanha, famz, sw, qemu-devel

Am 28.04.2016 um 15:16 hat Kevin Wolf geschrieben:
> This series introduces a new BlockDriver interface, which will hopefully be the
> final one, or at least good enough for another few years: .bdrv_preadv/pwritev.
> It is based on coroutines, vectored, has flags and uses a byte granularity.
> This is now the preferred interface for new drivers.
> 
> All drivers still using the legacy interface .bdrv_read/write are converted to
> the new interface and the emulation code we had for the old interface is
> removed. Most of the drivers become zero-copy with these patches as they are
> vectored now; only vvfat continues to allocate a bounce buffer.
> 
> The less obscure formats (vmdk, vdi, vpc) also natively support byte-aligned
> requests now. The block layer is still enforcing a minimal alignment of 512, so
> this isn't actually used yet, but in a next step, we can lift this restriction
> for drivers that implement .bdrv_preadv/pwritev.

Applied to block-next.

Kevin

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH v2 13/17] vmdk: Implement .bdrv_co_pwritev() interface
  2016-04-29  9:49       ` Kevin Wolf
@ 2016-04-29 10:31         ` Pavel Butsykin
  0 siblings, 0 replies; 34+ messages in thread
From: Pavel Butsykin @ 2016-04-29 10:31 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Fam Zheng, qemu-devel, sw, stefanha, qemu-block, mreitz

On 29.04.2016 12:49, Kevin Wolf wrote:
> Am 29.04.2016 um 10:49 hat Pavel Butsykin geschrieben:
>> On 29.04.2016 06:08, Fam Zheng wrote:
>>> On Thu, 04/28 15:16, Kevin Wolf wrote:
>>>>   static int vmdk_write_compressed(BlockDriverState *bs,
>>>>                                    int64_t sector_num,
>>>>                                    const uint8_t *buf,
>>>>                                    int nb_sectors)
>>>>   {
>>>>       BDRVVmdkState *s = bs->opaque;
>>>> +
>>>>       if (s->num_extents == 1 && s->extents[0].compressed) {
>>>> -        return vmdk_write(bs, sector_num, buf, nb_sectors, false, false);
>>>> +        Coroutine *co;
>>>> +        AioContext *aio_context = bdrv_get_aio_context(bs);
>>>> +        VmdkWriteCompressedCo data = {
>>>> +            .bs         = bs,
>>>> +            .sector_num = sector_num,
>>>> +            .buf        = buf,
>>>> +            .nb_sectors = nb_sectors,
>>>> +            .ret        = -EINPROGRESS,
>>>> +        };
>>>> +        co = qemu_coroutine_create(vmdk_co_write_compressed);
>>>> +        qemu_coroutine_enter(co, &data);
>>>> +        while (data.ret == -EINPROGRESS) {
>>>> +            aio_poll(aio_context, true);
>>>> +        }
>>>> +        return data.ret;
>>>
>>> Don't you have a plan to make the creation of coroutine for compressed write in
>>> in block layer?  Or will bdrv_co_pwritev gain a "compressed" flag
>>> (BDRV_REQ_COMPRESSED) in the future?
>>>
>> Actually, I already have these patches as part of the issue of backup
>> compression, hope to send today.
>
> Sounds good. Do you use a new flag for the normal write functions or just
> update the existing .bdrv_write_compressed function?
>
No, I just added a new interface .bdrv_co_write_compressed and removed
the old. As for compatibility with the old interface in the
bdrv_write_compressed will create a new coroutine and called
bdrv_co_write_compressed, how you did it in the vmdk(only now is in
io.c). Maybe it's not the best idea, but it can be discussed.

> Also, please base your series on this one to avoid merge conflicts in
> vmdk.c. I'm going to apply the series to block-next now as it's fully
> reveiwed. I guess you'll effectively need to revert what this patch does
> to vmdk_write_compressed() in favour of a block/io.c solution.
>
Yes, patches will be based on this series, I specifically made sure of it :)

> Kevin
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH v2 02/17] block: Introduce bdrv_driver_pwritev()
  2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 02/17] block: Introduce bdrv_driver_pwritev() Kevin Wolf
  2016-04-29  1:14   ` Fam Zheng
@ 2016-04-30 21:54   ` Eric Blake
  1 sibling, 0 replies; 34+ messages in thread
From: Eric Blake @ 2016-04-30 21:54 UTC (permalink / raw)
  To: Kevin Wolf, qemu-block; +Cc: famz, sw, qemu-devel, mreitz, stefanha

[-- Attachment #1: Type: text/plain, Size: 1303 bytes --]

On 04/28/2016 07:16 AM, Kevin Wolf wrote:
> This is a function that simply calls into the block driver for doing a
> write, providing the byte granularity interface we want to eventually
> have everywhere, and using whatever interface that driver supports.
> 
> This one is a bit more interesting that the version for reads: It adds
> support for .bdrv_co_writev_flags() everywhere, so that drivers
> implementing this function can drop .bdrv_co_writev() now.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---

> @@ -1155,7 +1186,8 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
>              }
>              qemu_iovec_init_external(&qiov, &iov, 1);
>  
> -            ret = drv->bdrv_co_writev(bs, sector_num, num, &qiov);
> +            ret = bdrv_driver_pwritev(bs, sector_num * BDRV_SECTOR_SIZE,
> +                                      num * BDRV_SECTOR_SIZE, &qiov, 0);

My review missed a latent bug here (pre-patch this should have been
using bdrv_co_writev_flags(..., flags & ~BDRV_REQ_ZERO_WRITE) so as to
keep BDRV_REQ_FUA semantics working), so as penance I've sent a patch.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2016-04-30 21:54 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-28 13:16 [Qemu-devel] [PATCH v2 00/17] block: Convert .bdrv_read/write to .bdrv_preadv/pwritev Kevin Wolf
2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 01/17] block: Introduce bdrv_driver_preadv() Kevin Wolf
2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 02/17] block: Introduce bdrv_driver_pwritev() Kevin Wolf
2016-04-29  1:14   ` Fam Zheng
2016-04-30 21:54   ` Eric Blake
2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 03/17] block: Support AIO drivers in bdrv_driver_preadv/pwritev() Kevin Wolf
2016-04-29  1:22   ` Fam Zheng
2016-04-29  3:25     ` Fam Zheng
2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 04/17] block: Rename bdrv_co_do_preadv/writev to bdrv_co_preadv/writev Kevin Wolf
2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 05/17] block: Introduce .bdrv_co_preadv/pwritev BlockDriver function Kevin Wolf
2016-04-28 15:21   ` Eric Blake
2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 06/17] bochs: Implement .bdrv_co_preadv() interface Kevin Wolf
2016-04-28 15:50   ` Eric Blake
2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 07/17] cloop: " Kevin Wolf
2016-04-29  2:28   ` Eric Blake
2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 08/17] dmg: " Kevin Wolf
2016-04-29  2:39   ` Eric Blake
2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 09/17] vdi: " Kevin Wolf
2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 10/17] vdi: Implement .bdrv_co_pwritev() interface Kevin Wolf
2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 11/17] vmdk: Add vmdk_find_offset_in_cluster() Kevin Wolf
2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 12/17] vmdk: Implement .bdrv_co_preadv() interface Kevin Wolf
2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 13/17] vmdk: Implement .bdrv_co_pwritev() interface Kevin Wolf
2016-04-29  3:08   ` Fam Zheng
2016-04-29  7:41     ` Kevin Wolf
2016-04-29  8:49     ` Pavel Butsykin
2016-04-29  9:49       ` Kevin Wolf
2016-04-29 10:31         ` Pavel Butsykin
2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 14/17] vpc: Implement .bdrv_co_preadv() interface Kevin Wolf
2016-04-29  3:14   ` Fam Zheng
2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 15/17] vpc: Implement .bdrv_co_pwritev() interface Kevin Wolf
2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 16/17] vvfat: Implement .bdrv_co_preadv/pwritev interfaces Kevin Wolf
2016-04-28 13:16 ` [Qemu-devel] [PATCH v2 17/17] block: Remove BlockDriver.bdrv_read/write Kevin Wolf
2016-04-29  3:29 ` [Qemu-devel] [PATCH v2 00/17] block: Convert .bdrv_read/write to .bdrv_preadv/pwritev Fam Zheng
2016-04-29  9:57 ` Kevin Wolf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.