All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 00/13] Kill sector-based write_zeroes
@ 2016-06-01 21:10 Eric Blake
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 01/13] iscsi: Use block size as minimum zero/discard alignment Eric Blake
                   ` (13 more replies)
  0 siblings, 14 replies; 20+ messages in thread
From: Eric Blake @ 2016-06-01 21:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, qemu-block

Kevin pointed out that my recent change to byte-based instead
of sector-based blk_write_zeroes() (commit 983a1600) makes life
harder as long as bdrv_write_zeroes is still sector-based, and
where the compiler doesn't flag any change in parameter types.
Complete the conversion, by making all write_zeroes operations
nominally take bytes.

Depends on:
https://lists.gnu.org/archive/html/qemu-devel/2016-05/msg04636.html

Available as a tag here:
git fetch git://repo.or.cz/qemu/ericb.git nbd-zero-v2

Changes from v1:
- Original patch 1 already landed
- Rebase to other changes that have landed, as well as to proposed
qcow2 changes [multiple, especially patch 6]
- Fix calculation bugs in iscsi [new patch 1, patch 5]
- Allow granularity of 1, rather than forcing a minimum of 512 [patch 3]
- Implement bdrv_pwrite_zeroes() atop bdrv_prwv_co() [patch 4]
- Drop some useless variables, but don't drop qiov from raw-posix
[patches 8, 10, 12]
- Fix qed to use fallback on unaligned request even without backing
file [patch 9]

Ideas for future series:
Audit for using uint32_t instead of int for length
Track all block limits in bytes rather than sectors
Convert discard to byte-based interface
Switch drivers to byte-based read/write

001/13:[down] 'iscsi: Use block size as minimum zero/discard alignment'
002/13:[0060] [FC] 'block: Track write zero limits in bytes'
003/13:[0037] [FC] 'block: Add .bdrv_co_pwrite_zeroes()'
004/13:[0028] [FC] 'block: Switch bdrv_write_zeroes() to byte interface'
005/13:[0003] [FC] 'iscsi: Convert to bdrv_co_pwrite_zeroes()'
006/13:[0071] [FC] 'qcow2: Convert to bdrv_co_pwrite_zeroes()'
007/13:[----] [--] 'blkreplay: Convert to bdrv_co_pwrite_zeroes()'
008/13:[0003] [FC] 'gluster: Convert to bdrv_co_pwrite_zeroes()'
009/13:[0012] [FC] 'qed: Convert to bdrv_co_pwrite_zeroes()'
010/13:[0019] [FC] 'raw-posix: Convert to bdrv_co_pwrite_zeroes()'
011/13:[----] [--] 'raw_bsd: Convert to bdrv_co_pwrite_zeroes()'
012/13:[0003] [FC] 'vmdk: Convert to bdrv_co_pwrite_zeroes()'
013/13:[----] [--] 'block: Kill bdrv_co_write_zeroes()'

Eric Blake (13):
  iscsi: Use block size as minimum zero/discard alignment
  block: Track write zero limits in bytes
  block: Add .bdrv_co_pwrite_zeroes()
  block: Switch bdrv_write_zeroes() to byte interface
  iscsi: Convert to bdrv_co_pwrite_zeroes()
  qcow2: Convert to bdrv_co_pwrite_zeroes()
  blkreplay: Convert to bdrv_co_pwrite_zeroes()
  gluster: Convert to bdrv_co_pwrite_zeroes()
  qed: Convert to bdrv_co_pwrite_zeroes()
  raw-posix: Convert to bdrv_co_pwrite_zeroes()
  raw_bsd: Convert to bdrv_co_pwrite_zeroes()
  vmdk: Convert to bdrv_co_pwrite_zeroes()
  block: Kill bdrv_co_write_zeroes()

 include/block/block.h     |  16 +++----
 include/block/block_int.h |  16 ++++---
 block/blkreplay.c         |   8 ++--
 block/gluster.c           |  14 +++---
 block/io.c                | 114 +++++++++++++++++++++++++---------------------
 block/iscsi.c             |  70 ++++++++++++++++------------
 block/parallels.c         |   4 +-
 block/qcow2-cluster.c     |   3 +-
 block/qcow2.c             |  48 +++++++++----------
 block/qed.c               |  35 +++++++-------
 block/raw-posix.c         |  34 +++++++-------
 block/raw_bsd.c           |  10 ++--
 block/vmdk.c              |  18 ++++----
 migration/block.c         |   5 +-
 tests/qemu-iotests/034    |   2 +-
 tests/qemu-iotests/154    |   2 +-
 trace-events              |  10 ++--
 17 files changed, 216 insertions(+), 193 deletions(-)

-- 
2.5.5

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 01/13] iscsi: Use block size as minimum zero/discard alignment
  2016-06-01 21:10 [Qemu-devel] [PATCH v2 00/13] Kill sector-based write_zeroes Eric Blake
@ 2016-06-01 21:10 ` Eric Blake
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 02/13] block: Track write zero limits in bytes Eric Blake
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Eric Blake @ 2016-06-01 21:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, qemu-block, Ronnie Sahlberg, Paolo Bonzini, Peter Lieven,
	Max Reitz

If hardware does not advertise a minimum zero/discard
alignment, we still want to guarantee that the block layer
will align requests to our blocks, rather than the arbitrary
512-byte BDRV sector size.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 block/iscsi.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/block/iscsi.c b/block/iscsi.c
index e7d5f7b..94f9974 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1711,6 +1711,8 @@ static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp)
         }
         bs->bl.discard_alignment =
             sector_limits_lun2qemu(iscsilun->bl.opt_unmap_gran, iscsilun);
+    } else {
+        bs->bl.discard_alignment = iscsilun->block_size >> BDRV_SECTOR_BITS;
     }

     if (iscsilun->bl.max_ws_len < 0xffffffff) {
@@ -1720,6 +1722,9 @@ static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp)
     if (iscsilun->lbp.lbpws) {
         bs->bl.write_zeroes_alignment =
             sector_limits_lun2qemu(iscsilun->bl.opt_unmap_gran, iscsilun);
+    } else {
+        bs->bl.write_zeroes_alignment =
+            iscsilun->block_size >> BDRV_SECTOR_BITS;
     }
     bs->bl.opt_transfer_length =
         sector_limits_lun2qemu(iscsilun->bl.opt_xfer_len, iscsilun);
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 02/13] block: Track write zero limits in bytes
  2016-06-01 21:10 [Qemu-devel] [PATCH v2 00/13] Kill sector-based write_zeroes Eric Blake
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 01/13] iscsi: Use block size as minimum zero/discard alignment Eric Blake
@ 2016-06-01 21:10 ` Eric Blake
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 03/13] block: Add .bdrv_co_pwrite_zeroes() Eric Blake
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Eric Blake @ 2016-06-01 21:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, qemu-block, Stefan Hajnoczi, Fam Zheng, Max Reitz,
	Ronnie Sahlberg, Paolo Bonzini, Peter Lieven

Another step towards removing sector-based interfaces: convert
the maximum write and minimum alignment values from sectors to
bytes.  Rename the variables to let the compiler check that all
users are converted to the new semantics.

The maximum remains an int as long as BDRV_REQUEST_MAX_SECTORS
is constrained by INT_MAX (this means that we can't even
support a 2G write_zeroes, but just under it) - changing
operation lengths to unsigned or to 64-bits is a much bigger
audit, and debatable if we even want to do it (since at the
core, a 32-bit platform will still have ssize_t as its
underlying limit on write()).

Meanwhile, alignment is changed to 'uint32_t', since it makes no
sense to have an alignment larger than the maximum write, and
less painful to use an unsigned type with well-defined behavior
in bit operations than to have to worry about what happens if
a driver mistakenly supplies a negative alignment.

Add an assert that no one was trying to use sectors to get a
write zeroes larger than 2G, and therefore that a later conversion
to bytes won't be impacted by keeping the limit at 32 bits.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 include/block/block_int.h | 10 ++++++----
 block/io.c                | 22 +++++++++++++---------
 block/iscsi.c             | 13 ++++++-------
 block/qcow2.c             |  2 +-
 block/qed.c               |  2 +-
 block/vmdk.c              |  6 +++---
 6 files changed, 30 insertions(+), 25 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 30a9717..2e9c81f 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -328,11 +328,13 @@ typedef struct BlockLimits {
     /* optimal alignment for discard requests in sectors */
     int64_t discard_alignment;

-    /* maximum number of sectors that can zeroized at once */
-    int max_write_zeroes;
+    /* maximum number of bytes that can zeroized at once (since it is
+     * signed, it must be < 2G, if set) */
+    int32_t max_pwrite_zeroes;

-    /* optimal alignment for write zeroes requests in sectors */
-    int64_t write_zeroes_alignment;
+    /* optimal alignment for write zeroes requests in bytes, must be
+     * power of 2, and less than max_pwrite_zeroes if that is set */
+    uint32_t pwrite_zeroes_alignment;

     /* optimal transfer length in sectors */
     int opt_transfer_length;
diff --git a/block/io.c b/block/io.c
index 26b5845..108cd35 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1121,15 +1121,19 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
     int head = 0;
     int tail = 0;

-    int max_write_zeroes = MIN_NON_ZERO(bs->bl.max_write_zeroes,
-                                        BDRV_REQUEST_MAX_SECTORS);
-    if (bs->bl.write_zeroes_alignment) {
-        assert(is_power_of_2(bs->bl.write_zeroes_alignment));
-        head = sector_num & (bs->bl.write_zeroes_alignment - 1);
-        tail = (sector_num + nb_sectors) & (bs->bl.write_zeroes_alignment - 1);
-        max_write_zeroes &= ~(bs->bl.write_zeroes_alignment - 1);
+    int max_write_zeroes = MIN_NON_ZERO(bs->bl.max_pwrite_zeroes, INT_MAX);
+    int write_zeroes_sector_align =
+        bs->bl.pwrite_zeroes_alignment >> BDRV_SECTOR_BITS;
+
+    max_write_zeroes >>= BDRV_SECTOR_BITS;
+    if (write_zeroes_sector_align) {
+        assert(is_power_of_2(bs->bl.pwrite_zeroes_alignment));
+        head = sector_num & (write_zeroes_sector_align - 1);
+        tail = (sector_num + nb_sectors) & (write_zeroes_sector_align - 1);
+        max_write_zeroes &= ~(write_zeroes_sector_align - 1);
     }

+    assert(nb_sectors <= BDRV_REQUEST_MAX_SECTORS);
     while (nb_sectors > 0 && !ret) {
         int num = nb_sectors;

@@ -1139,9 +1143,9 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
          */
         if (head) {
             /* Make a small request up to the first aligned sector.  */
-            num = MIN(nb_sectors, bs->bl.write_zeroes_alignment - head);
+            num = MIN(nb_sectors, write_zeroes_sector_align - head);
             head = 0;
-        } else if (tail && num > bs->bl.write_zeroes_alignment) {
+        } else if (tail && num > write_zeroes_sector_align) {
             /* Shorten the request to the last aligned sector.  */
             num -= tail;
         }
diff --git a/block/iscsi.c b/block/iscsi.c
index 94f9974..52ea9d7 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1715,16 +1715,15 @@ static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp)
         bs->bl.discard_alignment = iscsilun->block_size >> BDRV_SECTOR_BITS;
     }

-    if (iscsilun->bl.max_ws_len < 0xffffffff) {
-        bs->bl.max_write_zeroes =
-            sector_limits_lun2qemu(iscsilun->bl.max_ws_len, iscsilun);
+    if (iscsilun->bl.max_ws_len < 0xffffffff / iscsilun->block_size) {
+        bs->bl.max_pwrite_zeroes =
+            iscsilun->bl.max_ws_len * iscsilun->block_size;
     }
     if (iscsilun->lbp.lbpws) {
-        bs->bl.write_zeroes_alignment =
-            sector_limits_lun2qemu(iscsilun->bl.opt_unmap_gran, iscsilun);
+        bs->bl.pwrite_zeroes_alignment =
+            iscsilun->bl.opt_unmap_gran * iscsilun->block_size;
     } else {
-        bs->bl.write_zeroes_alignment =
-            iscsilun->block_size >> BDRV_SECTOR_BITS;
+        bs->bl.pwrite_zeroes_alignment = iscsilun->block_size;
     }
     bs->bl.opt_transfer_length =
         sector_limits_lun2qemu(iscsilun->bl.opt_xfer_len, iscsilun);
diff --git a/block/qcow2.c b/block/qcow2.c
index ecac399..a6ea6cb 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1193,7 +1193,7 @@ static void qcow2_refresh_limits(BlockDriverState *bs, Error **errp)
 {
     BDRVQcow2State *s = bs->opaque;

-    bs->bl.write_zeroes_alignment = s->cluster_sectors;
+    bs->bl.pwrite_zeroes_alignment = s->cluster_size;
 }

 static int qcow2_set_key(BlockDriverState *bs, const char *key)
diff --git a/block/qed.c b/block/qed.c
index b591d4a..0ab5b40 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -518,7 +518,7 @@ static void bdrv_qed_refresh_limits(BlockDriverState *bs, Error **errp)
 {
     BDRVQEDState *s = bs->opaque;

-    bs->bl.write_zeroes_alignment = s->header.cluster_size >> BDRV_SECTOR_BITS;
+    bs->bl.pwrite_zeroes_alignment = s->header.cluster_size;
 }

 /* We have nothing to do for QED reopen, stubs just return
diff --git a/block/vmdk.c b/block/vmdk.c
index 372e5ed..8494d63 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -998,9 +998,9 @@ static void vmdk_refresh_limits(BlockDriverState *bs, Error **errp)

     for (i = 0; i < s->num_extents; i++) {
         if (!s->extents[i].flat) {
-            bs->bl.write_zeroes_alignment =
-                MAX(bs->bl.write_zeroes_alignment,
-                    s->extents[i].cluster_sectors);
+            bs->bl.pwrite_zeroes_alignment =
+                MAX(bs->bl.pwrite_zeroes_alignment,
+                    s->extents[i].cluster_sectors << BDRV_SECTOR_BITS);
         }
     }
 }
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 03/13] block: Add .bdrv_co_pwrite_zeroes()
  2016-06-01 21:10 [Qemu-devel] [PATCH v2 00/13] Kill sector-based write_zeroes Eric Blake
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 01/13] iscsi: Use block size as minimum zero/discard alignment Eric Blake
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 02/13] block: Track write zero limits in bytes Eric Blake
@ 2016-06-01 21:10 ` Eric Blake
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 04/13] block: Switch bdrv_write_zeroes() to byte interface Eric Blake
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Eric Blake @ 2016-06-01 21:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, qemu-block, Stefan Hajnoczi, Fam Zheng, Max Reitz

Update bdrv_co_do_write_zeroes() to be byte-based, and select
between the new byte-based bdrv_co_pwrite_zeroes() or the old
bdrv_co_write_zeroes().  The next patches will convert drivers,
then remove the old interface.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 include/block/block_int.h |  4 ++-
 block/io.c                | 78 ++++++++++++++++++++++++++---------------------
 2 files changed, 46 insertions(+), 36 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 2e9c81f..1dfdf92 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -165,6 +165,8 @@ struct BlockDriver {
      */
     int coroutine_fn (*bdrv_co_write_zeroes)(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors, BdrvRequestFlags flags);
+    int coroutine_fn (*bdrv_co_pwrite_zeroes)(BlockDriverState *bs,
+        int64_t offset, int count, BdrvRequestFlags flags);
     int coroutine_fn (*bdrv_co_discard)(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors);
     int64_t coroutine_fn (*bdrv_co_get_block_status)(BlockDriverState *bs,
@@ -456,7 +458,7 @@ struct BlockDriverState {
     unsigned int request_alignment;
     /* Flags honored during pwrite (so far: BDRV_REQ_FUA) */
     unsigned int supported_write_flags;
-    /* Flags honored during write_zeroes (so far: BDRV_REQ_FUA,
+    /* Flags honored during pwrite_zeroes (so far: BDRV_REQ_FUA,
      * BDRV_REQ_MAY_UNMAP) */
     unsigned int supported_zero_flags;

diff --git a/block/io.c b/block/io.c
index 108cd35..3fe7576 100644
--- a/block/io.c
+++ b/block/io.c
@@ -42,8 +42,8 @@ static BlockAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs,
                                          void *opaque,
                                          bool is_write);
 static void coroutine_fn bdrv_co_do_rw(void *opaque);
-static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
-    int64_t sector_num, int nb_sectors, BdrvRequestFlags flags);
+static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
+    int64_t offset, int count, BdrvRequestFlags flags);

 static void bdrv_parent_drained_begin(BlockDriverState *bs)
 {
@@ -893,10 +893,12 @@ static int coroutine_fn bdrv_co_do_copy_on_readv(BlockDriverState *bs,
         goto err;
     }

-    if (drv->bdrv_co_write_zeroes &&
+    if ((drv->bdrv_co_write_zeroes || drv->bdrv_co_pwrite_zeroes) &&
         buffer_is_zero(bounce_buffer, iov.iov_len)) {
-        ret = bdrv_co_do_write_zeroes(bs, cluster_sector_num,
-                                      cluster_nb_sectors, 0);
+        ret = bdrv_co_do_pwrite_zeroes(bs,
+                                       cluster_sector_num * BDRV_SECTOR_SIZE,
+                                       cluster_nb_sectors * BDRV_SECTOR_SIZE,
+                                       0);
     } else {
         /* This does not change the data on the disk, it is not necessary
          * to flush even in cache=writethrough mode.
@@ -1110,8 +1112,8 @@ int coroutine_fn bdrv_co_readv(BlockDriverState *bs, int64_t sector_num,

 #define MAX_WRITE_ZEROES_BOUNCE_BUFFER 32768

-static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
-    int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
+static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
+    int64_t offset, int count, BdrvRequestFlags flags)
 {
     BlockDriver *drv = bs->drv;
     QEMUIOVector qiov;
@@ -1122,20 +1124,16 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
     int tail = 0;

     int max_write_zeroes = MIN_NON_ZERO(bs->bl.max_pwrite_zeroes, INT_MAX);
-    int write_zeroes_sector_align =
-        bs->bl.pwrite_zeroes_alignment >> BDRV_SECTOR_BITS;
+    int alignment = MAX(bs->bl.pwrite_zeroes_alignment ?: 1,
+                        bs->request_alignment);

-    max_write_zeroes >>= BDRV_SECTOR_BITS;
-    if (write_zeroes_sector_align) {
-        assert(is_power_of_2(bs->bl.pwrite_zeroes_alignment));
-        head = sector_num & (write_zeroes_sector_align - 1);
-        tail = (sector_num + nb_sectors) & (write_zeroes_sector_align - 1);
-        max_write_zeroes &= ~(write_zeroes_sector_align - 1);
-    }
+    assert(is_power_of_2(alignment));
+    head = offset & (alignment - 1);
+    tail = (offset + count) & (alignment - 1);
+    max_write_zeroes &= ~(alignment - 1);

-    assert(nb_sectors <= BDRV_REQUEST_MAX_SECTORS);
-    while (nb_sectors > 0 && !ret) {
-        int num = nb_sectors;
+    while (count > 0 && !ret) {
+        int num = count;

         /* Align request.  Block drivers can expect the "bulk" of the request
          * to be aligned, and that unaligned requests do not cross cluster
@@ -1143,9 +1141,9 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
          */
         if (head) {
             /* Make a small request up to the first aligned sector.  */
-            num = MIN(nb_sectors, write_zeroes_sector_align - head);
+            num = MIN(count, alignment - head);
             head = 0;
-        } else if (tail && num > write_zeroes_sector_align) {
+        } else if (tail && num > alignment) {
             /* Shorten the request to the last aligned sector.  */
             num -= tail;
         }
@@ -1157,8 +1155,18 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,

         ret = -ENOTSUP;
         /* First try the efficient write zeroes operation */
-        if (drv->bdrv_co_write_zeroes) {
-            ret = drv->bdrv_co_write_zeroes(bs, sector_num, num,
+        if (drv->bdrv_co_pwrite_zeroes) {
+            ret = drv->bdrv_co_pwrite_zeroes(bs, offset, num,
+                                             flags & bs->supported_zero_flags);
+            if (ret != -ENOTSUP && (flags & BDRV_REQ_FUA) &&
+                !(bs->supported_zero_flags & BDRV_REQ_FUA)) {
+                need_flush = true;
+            }
+        } else if (drv->bdrv_co_write_zeroes) {
+            assert(offset % BDRV_SECTOR_SIZE == 0);
+            assert(count % BDRV_SECTOR_SIZE == 0);
+            ret = drv->bdrv_co_write_zeroes(bs, offset >> BDRV_SECTOR_BITS,
+                                            num >> BDRV_SECTOR_BITS,
                                             flags & bs->supported_zero_flags);
             if (ret != -ENOTSUP && (flags & BDRV_REQ_FUA) &&
                 !(bs->supported_zero_flags & BDRV_REQ_FUA)) {
@@ -1181,33 +1189,31 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
                 write_flags &= ~BDRV_REQ_FUA;
                 need_flush = true;
             }
-            num = MIN(num, max_xfer_len);
-            iov.iov_len = num * BDRV_SECTOR_SIZE;
+            num = MIN(num, max_xfer_len << BDRV_SECTOR_BITS);
+            iov.iov_len = num;
             if (iov.iov_base == NULL) {
-                iov.iov_base = qemu_try_blockalign(bs, num * BDRV_SECTOR_SIZE);
+                iov.iov_base = qemu_try_blockalign(bs, num);
                 if (iov.iov_base == NULL) {
                     ret = -ENOMEM;
                     goto fail;
                 }
-                memset(iov.iov_base, 0, num * BDRV_SECTOR_SIZE);
+                memset(iov.iov_base, 0, num);
             }
             qemu_iovec_init_external(&qiov, &iov, 1);

-            ret = bdrv_driver_pwritev(bs, sector_num * BDRV_SECTOR_SIZE,
-                                      num * BDRV_SECTOR_SIZE, &qiov,
-                                      write_flags);
+            ret = bdrv_driver_pwritev(bs, offset, num, &qiov, write_flags);

             /* Keep bounce buffer around if it is big enough for all
              * all future requests.
              */
-            if (num < max_xfer_len) {
+            if (num < max_xfer_len << BDRV_SECTOR_BITS) {
                 qemu_vfree(iov.iov_base);
                 iov.iov_base = NULL;
             }
         }

-        sector_num += num;
-        nb_sectors -= num;
+        offset += num;
+        count -= num;
     }

 fail:
@@ -1245,7 +1251,8 @@ static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs,
     ret = notifier_with_return_list_notify(&bs->before_write_notifiers, req);

     if (!ret && bs->detect_zeroes != BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF &&
-        !(flags & BDRV_REQ_ZERO_WRITE) && drv->bdrv_co_write_zeroes &&
+        !(flags & BDRV_REQ_ZERO_WRITE) &&
+        (drv->bdrv_co_pwrite_zeroes || drv->bdrv_co_write_zeroes) &&
         qemu_iovec_is_zero(qiov)) {
         flags |= BDRV_REQ_ZERO_WRITE;
         if (bs->detect_zeroes == BLOCKDEV_DETECT_ZEROES_OPTIONS_UNMAP) {
@@ -1257,7 +1264,8 @@ static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs,
         /* Do nothing, write notifier decided to fail this request */
     } else if (flags & BDRV_REQ_ZERO_WRITE) {
         bdrv_debug_event(bs, BLKDBG_PWRITEV_ZERO);
-        ret = bdrv_co_do_write_zeroes(bs, sector_num, nb_sectors, flags);
+        ret = bdrv_co_do_pwrite_zeroes(bs, sector_num << BDRV_SECTOR_BITS,
+                                       nb_sectors << BDRV_SECTOR_BITS, flags);
     } else {
         bdrv_debug_event(bs, BLKDBG_PWRITEV);
         ret = bdrv_driver_pwritev(bs, offset, bytes, qiov, flags);
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 04/13] block: Switch bdrv_write_zeroes() to byte interface
  2016-06-01 21:10 [Qemu-devel] [PATCH v2 00/13] Kill sector-based write_zeroes Eric Blake
                   ` (2 preceding siblings ...)
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 03/13] block: Add .bdrv_co_pwrite_zeroes() Eric Blake
@ 2016-06-01 21:10 ` Eric Blake
  2016-06-02 11:01   ` Kevin Wolf
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 05/13] iscsi: Convert to bdrv_co_pwrite_zeroes() Eric Blake
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 20+ messages in thread
From: Eric Blake @ 2016-06-01 21:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, qemu-block, Max Reitz, Stefan Hajnoczi, Fam Zheng,
	Denis V. Lunev, Juan Quintela, Amit Shah

Rename to bdrv_pwrite_zeroes() to let the compiler ensure we
cater to the updated semantics.  Do the same for
bdrv_aio_write_zeroes() and bdrv_co_write_zeroes().  Two of the
three places map to the byte-based counterparts; but for now,
since we have no byte-based aio write, we still require sector
alignment in those callers, via assertions.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 include/block/block.h  | 16 ++++++++--------
 block/blkreplay.c      |  4 +++-
 block/io.c             | 45 ++++++++++++++++++++++++++++-----------------
 block/parallels.c      |  4 +++-
 block/qcow2-cluster.c  |  3 +--
 block/qcow2.c          |  9 ++++-----
 block/raw_bsd.c        |  3 ++-
 migration/block.c      |  5 +++--
 tests/qemu-iotests/034 |  2 +-
 tests/qemu-iotests/154 |  2 +-
 trace-events           |  4 ++--
 11 files changed, 56 insertions(+), 41 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 0c3e709..7e87451 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -33,7 +33,7 @@ typedef struct BlockDriverInfo {
      * True if the driver can optimize writing zeroes by unmapping
      * sectors. This is equivalent to the BLKDISCARDZEROES ioctl in Linux
      * with the difference that in qemu a discard is allowed to silently
-     * fail. Therefore we have to use bdrv_write_zeroes with the
+     * fail. Therefore we have to use bdrv_pwrite_zeroes with the
      * BDRV_REQ_MAY_UNMAP flag for an optimized zero write with unmapping.
      * After this call the driver has to guarantee that the contents read
      * back as zero. It is additionally required that the block device is
@@ -227,11 +227,11 @@ int bdrv_read(BlockDriverState *bs, int64_t sector_num,
               uint8_t *buf, int nb_sectors);
 int bdrv_write(BlockDriverState *bs, int64_t sector_num,
                const uint8_t *buf, int nb_sectors);
-int bdrv_write_zeroes(BlockDriverState *bs, int64_t sector_num,
-               int nb_sectors, BdrvRequestFlags flags);
-BlockAIOCB *bdrv_aio_write_zeroes(BlockDriverState *bs, int64_t sector_num,
-                                  int nb_sectors, BdrvRequestFlags flags,
-                                  BlockCompletionFunc *cb, void *opaque);
+int bdrv_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
+                       int count, BdrvRequestFlags flags);
+BlockAIOCB *bdrv_aio_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
+                                   int count, BdrvRequestFlags flags,
+                                   BlockCompletionFunc *cb, void *opaque);
 int bdrv_make_zero(BlockDriverState *bs, BdrvRequestFlags flags);
 int bdrv_pread(BlockDriverState *bs, int64_t offset,
                void *buf, int count);
@@ -250,8 +250,8 @@ int coroutine_fn bdrv_co_writev(BlockDriverState *bs, int64_t sector_num,
  * function is not suitable for zeroing the entire image in a single request
  * because it may allocate memory for the entire region.
  */
-int coroutine_fn bdrv_co_write_zeroes(BlockDriverState *bs, int64_t sector_num,
-    int nb_sectors, BdrvRequestFlags flags);
+int coroutine_fn bdrv_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
+    int count, BdrvRequestFlags flags);
 BlockDriverState *bdrv_find_backing_image(BlockDriverState *bs,
     const char *backing_file);
 int bdrv_get_backing_file_depth(BlockDriverState *bs);
diff --git a/block/blkreplay.c b/block/blkreplay.c
index 42f1813..1a721ad 100755
--- a/block/blkreplay.c
+++ b/block/blkreplay.c
@@ -107,7 +107,9 @@ static int coroutine_fn blkreplay_co_write_zeroes(BlockDriverState *bs,
     int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
 {
     uint64_t reqid = request_id++;
-    int ret = bdrv_co_write_zeroes(bs->file->bs, sector_num, nb_sectors, flags);
+    int ret = bdrv_co_pwrite_zeroes(bs->file->bs,
+                                    sector_num << BDRV_SECTOR_BITS,
+                                    nb_sectors << BDRV_SECTOR_BITS, flags);
     block_request_create(reqid, bs, qemu_coroutine_self());
     qemu_coroutine_yield();

diff --git a/block/io.c b/block/io.c
index 3fe7576..5f021d5 100644
--- a/block/io.c
+++ b/block/io.c
@@ -620,18 +620,25 @@ int bdrv_write(BlockDriverState *bs, int64_t sector_num,
     return bdrv_rw_co(bs, sector_num, (uint8_t *)buf, nb_sectors, true, 0);
 }

-int bdrv_write_zeroes(BlockDriverState *bs, int64_t sector_num,
-                      int nb_sectors, BdrvRequestFlags flags)
+int bdrv_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
+                       int count, BdrvRequestFlags flags)
 {
-    return bdrv_rw_co(bs, sector_num, NULL, nb_sectors, true,
-                      BDRV_REQ_ZERO_WRITE | flags);
+    QEMUIOVector qiov;
+    struct iovec iov = {
+        .iov_base = NULL,
+        .iov_len = count,
+    };
+
+    qemu_iovec_init_external(&qiov, &iov, 1);
+    return bdrv_prwv_co(bs, offset, &qiov, true,
+                        BDRV_REQ_ZERO_WRITE | flags);
 }

 /*
- * Completely zero out a block device with the help of bdrv_write_zeroes.
+ * Completely zero out a block device with the help of bdrv_pwrite_zeroes.
  * The operation is sped up by checking the block status and only writing
  * zeroes to the device if they currently do not return zeroes. Optional
- * flags are passed through to bdrv_write_zeroes (e.g. BDRV_REQ_MAY_UNMAP,
+ * flags are passed through to bdrv_pwrite_zeroes (e.g. BDRV_REQ_MAY_UNMAP,
  * BDRV_REQ_FUA).
  *
  * Returns < 0 on error, 0 on success. For error codes see bdrv_write().
@@ -662,7 +669,8 @@ int bdrv_make_zero(BlockDriverState *bs, BdrvRequestFlags flags)
             sector_num += n;
             continue;
         }
-        ret = bdrv_write_zeroes(bs, sector_num, n, flags);
+        ret = bdrv_pwrite_zeroes(bs, sector_num << BDRV_SECTOR_BITS,
+                                 n << BDRV_SECTOR_BITS, flags);
         if (ret < 0) {
             error_report("error writing zeroes at sector %" PRId64 ": %s",
                          sector_num, strerror(-ret));
@@ -1518,18 +1526,18 @@ int coroutine_fn bdrv_co_writev(BlockDriverState *bs, int64_t sector_num,
     return bdrv_co_do_writev(bs, sector_num, nb_sectors, qiov, 0);
 }

-int coroutine_fn bdrv_co_write_zeroes(BlockDriverState *bs,
-                                      int64_t sector_num, int nb_sectors,
-                                      BdrvRequestFlags flags)
+int coroutine_fn bdrv_co_pwrite_zeroes(BlockDriverState *bs,
+                                       int64_t offset, int count,
+                                       BdrvRequestFlags flags)
 {
-    trace_bdrv_co_write_zeroes(bs, sector_num, nb_sectors, flags);
+    trace_bdrv_co_pwrite_zeroes(bs, offset, count, flags);

     if (!(bs->open_flags & BDRV_O_UNMAP)) {
         flags &= ~BDRV_REQ_MAY_UNMAP;
     }

-    return bdrv_co_do_writev(bs, sector_num, nb_sectors, NULL,
-                             BDRV_REQ_ZERO_WRITE | flags);
+    return bdrv_co_pwritev(bs, offset, count, NULL,
+                           BDRV_REQ_ZERO_WRITE | flags);
 }

 typedef struct BdrvCoGetBlockStatusData {
@@ -1881,13 +1889,16 @@ BlockAIOCB *bdrv_aio_writev(BlockDriverState *bs, int64_t sector_num,
                                  cb, opaque, true);
 }

-BlockAIOCB *bdrv_aio_write_zeroes(BlockDriverState *bs,
-        int64_t sector_num, int nb_sectors, BdrvRequestFlags flags,
+BlockAIOCB *bdrv_aio_pwrite_zeroes(BlockDriverState *bs,
+        int64_t offset, int count, BdrvRequestFlags flags,
         BlockCompletionFunc *cb, void *opaque)
 {
-    trace_bdrv_aio_write_zeroes(bs, sector_num, nb_sectors, flags, opaque);
+    trace_bdrv_aio_pwrite_zeroes(bs, offset, count, flags, opaque);
+    assert(offset % BDRV_SECTOR_SIZE == 0);
+    assert(count % BDRV_SECTOR_SIZE == 0);

-    return bdrv_co_aio_rw_vector(bs, sector_num, NULL, nb_sectors,
+    return bdrv_co_aio_rw_vector(bs, offset >> BDRV_SECTOR_BITS, NULL,
+                                 count >> BDRV_SECTOR_BITS,
                                  BDRV_REQ_ZERO_WRITE | flags,
                                  cb, opaque, true);
 }
diff --git a/block/parallels.c b/block/parallels.c
index 99fc0f7..4621553 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -210,7 +210,9 @@ static int64_t allocate_clusters(BlockDriverState *bs, int64_t sector_num,
         int ret;
         space += s->prealloc_size;
         if (s->prealloc_mode == PRL_PREALLOC_MODE_FALLOCATE) {
-            ret = bdrv_write_zeroes(bs->file->bs, s->data_end, space, 0);
+            ret = bdrv_pwrite_zeroes(bs->file->bs,
+                                     s->data_end << BDRV_SECTOR_BITS,
+                                     space << BDRV_SECTOR_BITS, 0);
         } else {
             ret = bdrv_truncate(bs->file->bs,
                                 (s->data_end + space) << BDRV_SECTOR_BITS);
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 892e0fb..d901d89 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1765,8 +1765,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
                 goto fail;
             }

-            ret = bdrv_write_zeroes(bs->file->bs, offset / BDRV_SECTOR_SIZE,
-                                    s->cluster_sectors, 0);
+            ret = bdrv_pwrite_zeroes(bs->file->bs, offset, s->cluster_size, 0);
             if (ret < 0) {
                 if (!preallocated) {
                     qcow2_free_clusters(bs, offset, s->cluster_size,
diff --git a/block/qcow2.c b/block/qcow2.c
index a6ea6cb..cc59efc 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2655,8 +2655,8 @@ static int make_completely_empty(BlockDriverState *bs)
     /* After this call, neither the in-memory nor the on-disk refcount
      * information accurately describe the actual references */

-    ret = bdrv_write_zeroes(bs->file->bs, s->l1_table_offset / BDRV_SECTOR_SIZE,
-                            l1_clusters * s->cluster_sectors, 0);
+    ret = bdrv_pwrite_zeroes(bs->file->bs, s->l1_table_offset,
+                             l1_clusters * s->cluster_size, 0);
     if (ret < 0) {
         goto fail_broken_refcounts;
     }
@@ -2669,9 +2669,8 @@ static int make_completely_empty(BlockDriverState *bs)
      * overwrite parts of the existing refcount and L1 table, which is not
      * an issue because the dirty flag is set, complete data loss is in fact
      * desired and partial data loss is consequently fine as well */
-    ret = bdrv_write_zeroes(bs->file->bs, s->cluster_size / BDRV_SECTOR_SIZE,
-                            (2 + l1_clusters) * s->cluster_size /
-                            BDRV_SECTOR_SIZE, 0);
+    ret = bdrv_pwrite_zeroes(bs->file->bs, s->cluster_size,
+                             (2 + l1_clusters) * s->cluster_size, 0);
     /* This call (even if it failed overall) may have overwritten on-disk
      * refcount structures; in that case, the in-memory refcount information
      * will probably differ from the on-disk information which makes the BDS
diff --git a/block/raw_bsd.c b/block/raw_bsd.c
index 3385ed4..d9adf90 100644
--- a/block/raw_bsd.c
+++ b/block/raw_bsd.c
@@ -131,7 +131,8 @@ static int coroutine_fn raw_co_write_zeroes(BlockDriverState *bs,
                                             int64_t sector_num, int nb_sectors,
                                             BdrvRequestFlags flags)
 {
-    return bdrv_co_write_zeroes(bs->file->bs, sector_num, nb_sectors, flags);
+    return bdrv_co_pwrite_zeroes(bs->file->bs, sector_num << BDRV_SECTOR_BITS,
+                                 nb_sectors << BDRV_SECTOR_BITS, flags);
 }

 static int coroutine_fn raw_co_discard(BlockDriverState *bs,
diff --git a/migration/block.c b/migration/block.c
index e0628d1..16cc1f8 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -883,8 +883,9 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
             }

             if (flags & BLK_MIG_FLAG_ZERO_BLOCK) {
-                ret = bdrv_write_zeroes(bs, addr, nr_sectors,
-                                        BDRV_REQ_MAY_UNMAP);
+                ret = bdrv_pwrite_zeroes(bs, addr << BDRV_SECTOR_BITS,
+                                         nr_sectors << BDRV_SECTOR_BITS,
+                                         BDRV_REQ_MAY_UNMAP);
             } else {
                 buf = g_malloc(BLOCK_SIZE);
                 qemu_get_buffer(f, buf, BLOCK_SIZE);
diff --git a/tests/qemu-iotests/034 b/tests/qemu-iotests/034
index c711cfc..1b28bda 100755
--- a/tests/qemu-iotests/034
+++ b/tests/qemu-iotests/034
@@ -1,6 +1,6 @@
 #!/bin/bash
 #
-# Test bdrv_write_zeroes with backing files
+# Test bdrv_pwrite_zeroes with backing files (see also 154)
 #
 # Copyright (C) 2012 Red Hat, Inc.
 #
diff --git a/tests/qemu-iotests/154 b/tests/qemu-iotests/154
index 5905c55..7ca7219 100755
--- a/tests/qemu-iotests/154
+++ b/tests/qemu-iotests/154
@@ -1,6 +1,6 @@
 #!/bin/bash
 #
-# qcow2 specific bdrv_write_zeroes tests with backing files (complements 034)
+# qcow2 specific bdrv_pwrite_zeroes tests with backing files (complements 034)
 #
 # Copyright (C) 2016 Red Hat, Inc.
 #
diff --git a/trace-events b/trace-events
index da930d5..6329c01 100644
--- a/trace-events
+++ b/trace-events
@@ -70,10 +70,10 @@ bdrv_aio_discard(void *bs, int64_t sector_num, int nb_sectors, void *opaque) "bs
 bdrv_aio_flush(void *bs, void *opaque) "bs %p opaque %p"
 bdrv_aio_readv(void *bs, int64_t sector_num, int nb_sectors, void *opaque) "bs %p sector_num %"PRId64" nb_sectors %d opaque %p"
 bdrv_aio_writev(void *bs, int64_t sector_num, int nb_sectors, void *opaque) "bs %p sector_num %"PRId64" nb_sectors %d opaque %p"
-bdrv_aio_write_zeroes(void *bs, int64_t sector_num, int nb_sectors, int flags, void *opaque) "bs %p sector_num %"PRId64" nb_sectors %d flags %#x opaque %p"
+bdrv_aio_pwrite_zeroes(void *bs, int64_t offset, int count, int flags, void *opaque) "bs %p offset %"PRId64" count %d flags %#x opaque %p"
 bdrv_co_readv(void *bs, int64_t sector_num, int nb_sector) "bs %p sector_num %"PRId64" nb_sectors %d"
 bdrv_co_writev(void *bs, int64_t sector_num, int nb_sector) "bs %p sector_num %"PRId64" nb_sectors %d"
-bdrv_co_write_zeroes(void *bs, int64_t sector_num, int nb_sector, int flags) "bs %p sector_num %"PRId64" nb_sectors %d flags %#x"
+bdrv_co_pwrite_zeroes(void *bs, int64_t offset, int count, int flags) "bs %p offset %"PRId64" count %d flags %#x"
 bdrv_co_do_copy_on_readv(void *bs, int64_t sector_num, int nb_sectors, int64_t cluster_sector_num, int cluster_nb_sectors) "bs %p sector_num %"PRId64" nb_sectors %d cluster_sector_num %"PRId64" cluster_nb_sectors %d"

 # block/stream.c
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 05/13] iscsi: Convert to bdrv_co_pwrite_zeroes()
  2016-06-01 21:10 [Qemu-devel] [PATCH v2 00/13] Kill sector-based write_zeroes Eric Blake
                   ` (3 preceding siblings ...)
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 04/13] block: Switch bdrv_write_zeroes() to byte interface Eric Blake
@ 2016-06-01 21:10 ` Eric Blake
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 06/13] qcow2: " Eric Blake
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Eric Blake @ 2016-06-01 21:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, qemu-block, Ronnie Sahlberg, Paolo Bonzini, Peter Lieven,
	Max Reitz

Another step on our continuing quest to switch to byte-based
interfaces.

As this is the first byte-based iscsi interface, convert
is_request_lun_aligned() into two versions, one for sectors
and one for bytes.  Also, change from outright -EINVAL failure
on an unaligned request, to instead failing with -ENOTSUP to
trigger a read-modify-write fallback, particularly since the
block layer should be honoring bs->request_alignment to avoid
-EINVAL on read/write requests.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 block/iscsi.c | 56 +++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 33 insertions(+), 23 deletions(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index 52ea9d7..7e78ade 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -401,18 +401,26 @@ static int64_t sector_qemu2lun(int64_t sector, IscsiLun *iscsilun)
     return sector * BDRV_SECTOR_SIZE / iscsilun->block_size;
 }

-static bool is_request_lun_aligned(int64_t sector_num, int nb_sectors,
-                                      IscsiLun *iscsilun)
+static bool is_byte_request_lun_aligned(int64_t offset, int count,
+                                        IscsiLun *iscsilun)
 {
-    if ((sector_num * BDRV_SECTOR_SIZE) % iscsilun->block_size ||
-        (nb_sectors * BDRV_SECTOR_SIZE) % iscsilun->block_size) {
-            error_report("iSCSI misaligned request: "
-                         "iscsilun->block_size %u, sector_num %" PRIi64
-                         ", nb_sectors %d",
-                         iscsilun->block_size, sector_num, nb_sectors);
-            return 0;
+    if (offset % iscsilun->block_size || count % iscsilun->block_size) {
+        error_report("iSCSI misaligned request: "
+                     "iscsilun->block_size %u, offset %" PRIi64
+                     ", count %d",
+                     iscsilun->block_size, offset, count);
+        return false;
     }
-    return 1;
+    return true;
+}
+
+static bool is_sector_request_lun_aligned(int64_t sector_num, int nb_sectors,
+                                          IscsiLun *iscsilun)
+{
+    assert(nb_sectors < BDRV_REQUEST_MAX_SECTORS);
+    return is_byte_request_lun_aligned(sector_num << BDRV_SECTOR_BITS,
+                                       nb_sectors << BDRV_SECTOR_BITS,
+                                       iscsilun);
 }

 static unsigned long *iscsi_allocationmap_init(IscsiLun *iscsilun)
@@ -461,7 +469,7 @@ iscsi_co_writev_flags(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
     if (fua) {
         assert(iscsilun->dpofua);
     }
-    if (!is_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
+    if (!is_sector_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
         return -EINVAL;
     }

@@ -541,7 +549,7 @@ static int64_t coroutine_fn iscsi_co_get_block_status(BlockDriverState *bs,

     iscsi_co_init_iscsitask(iscsilun, &iTask);

-    if (!is_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
+    if (!is_sector_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
         ret = -EINVAL;
         goto out;
     }
@@ -638,7 +646,7 @@ static int coroutine_fn iscsi_co_readv(BlockDriverState *bs,
     uint64_t lba;
     uint32_t num_sectors;

-    if (!is_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
+    if (!is_sector_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
         return -EINVAL;
     }

@@ -926,7 +934,7 @@ coroutine_fn iscsi_co_discard(BlockDriverState *bs, int64_t sector_num,
     struct IscsiTask iTask;
     struct unmap_list list;

-    if (!is_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
+    if (!is_sector_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
         return -EINVAL;
     }

@@ -977,8 +985,8 @@ retry:
 }

 static int
-coroutine_fn iscsi_co_write_zeroes(BlockDriverState *bs, int64_t sector_num,
-                                   int nb_sectors, BdrvRequestFlags flags)
+coroutine_fn iscsi_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
+                                    int count, BdrvRequestFlags flags)
 {
     IscsiLun *iscsilun = bs->opaque;
     struct IscsiTask iTask;
@@ -986,8 +994,8 @@ coroutine_fn iscsi_co_write_zeroes(BlockDriverState *bs, int64_t sector_num,
     uint32_t nb_blocks;
     bool use_16_for_ws = iscsilun->use_16_for_rw;

-    if (!is_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
-        return -EINVAL;
+    if (!is_byte_request_lun_aligned(offset, count, iscsilun)) {
+        return -ENOTSUP;
     }

     if (flags & BDRV_REQ_MAY_UNMAP) {
@@ -1008,8 +1016,8 @@ coroutine_fn iscsi_co_write_zeroes(BlockDriverState *bs, int64_t sector_num,
         return -ENOTSUP;
     }

-    lba = sector_qemu2lun(sector_num, iscsilun);
-    nb_blocks = sector_qemu2lun(nb_sectors, iscsilun);
+    lba = offset / iscsilun->block_size;
+    nb_blocks = count / iscsilun->block_size;

     if (iscsilun->zeroblock == NULL) {
         iscsilun->zeroblock = g_try_malloc0(iscsilun->block_size);
@@ -1065,9 +1073,11 @@ retry:
     }

     if (flags & BDRV_REQ_MAY_UNMAP) {
-        iscsi_allocationmap_clear(iscsilun, sector_num, nb_sectors);
+        iscsi_allocationmap_clear(iscsilun, offset >> BDRV_SECTOR_BITS,
+                                  count >> BDRV_SECTOR_BITS);
     } else {
-        iscsi_allocationmap_set(iscsilun, sector_num, nb_sectors);
+        iscsi_allocationmap_set(iscsilun, offset >> BDRV_SECTOR_BITS,
+                                count >> BDRV_SECTOR_BITS);
     }

     return 0;
@@ -1856,7 +1866,7 @@ static BlockDriver bdrv_iscsi = {

     .bdrv_co_get_block_status = iscsi_co_get_block_status,
     .bdrv_co_discard      = iscsi_co_discard,
-    .bdrv_co_write_zeroes = iscsi_co_write_zeroes,
+    .bdrv_co_pwrite_zeroes = iscsi_co_pwrite_zeroes,
     .bdrv_co_readv         = iscsi_co_readv,
     .bdrv_co_writev_flags  = iscsi_co_writev_flags,
     .bdrv_co_flush_to_disk = iscsi_co_flush,
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 06/13] qcow2: Convert to bdrv_co_pwrite_zeroes()
  2016-06-01 21:10 [Qemu-devel] [PATCH v2 00/13] Kill sector-based write_zeroes Eric Blake
                   ` (4 preceding siblings ...)
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 05/13] iscsi: Convert to bdrv_co_pwrite_zeroes() Eric Blake
@ 2016-06-01 21:10 ` Eric Blake
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 07/13] blkreplay: " Eric Blake
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Eric Blake @ 2016-06-01 21:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, qemu-block, Max Reitz

Another step on our continuing quest to switch to byte-based
interfaces.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 block/qcow2.c | 37 +++++++++++++++++++------------------
 trace-events  |  4 ++--
 2 files changed, 21 insertions(+), 20 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index cc59efc..5e26c3d 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2421,38 +2421,39 @@ static bool is_zero_sectors(BlockDriverState *bs, int64_t start,
     return res >= 0 && (res & BDRV_BLOCK_ZERO) && nr == count;
 }

-static coroutine_fn int qcow2_co_write_zeroes(BlockDriverState *bs,
-    int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
+static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
+    int64_t offset, int count, BdrvRequestFlags flags)
 {
     int ret;
     BDRVQcow2State *s = bs->opaque;

-    uint32_t head = sector_num % s->cluster_sectors;
-    uint32_t tail = (sector_num + nb_sectors) % s->cluster_sectors;
+    uint32_t head = offset % s->cluster_size;
+    uint32_t tail = (offset + count) % s->cluster_size;

-    trace_qcow2_write_zeroes_start_req(qemu_coroutine_self(), sector_num,
-                                       nb_sectors);
+    trace_qcow2_pwrite_zeroes_start_req(qemu_coroutine_self(), offset, count);

     if (head || tail) {
-        int64_t cl_start = sector_num - head;
+        int64_t cl_start = (offset - head) >> BDRV_SECTOR_BITS;
         uint64_t off;
         int nr;

-        assert(cl_start + s->cluster_sectors >= sector_num + nb_sectors);
+        assert(head + count <= s->cluster_size);

         /* check whether remainder of cluster already reads as zero */
-        if (!(is_zero_sectors(bs, cl_start, head) &&
-              is_zero_sectors(bs, sector_num + nb_sectors,
-                              -tail & (s->cluster_sectors - 1)))) {
+        if (!(is_zero_sectors(bs, cl_start,
+                              DIV_ROUND_UP(head, BDRV_SECTOR_SIZE)) &&
+              is_zero_sectors(bs, (offset + count) >> BDRV_SECTOR_BITS,
+                              DIV_ROUND_UP(-tail & (s->cluster_size - 1),
+                                           BDRV_SECTOR_SIZE)))) {
             return -ENOTSUP;
         }

         qemu_co_mutex_lock(&s->lock);
         /* We can have new write after previous check */
-        sector_num = cl_start;
-        nb_sectors = nr = s->cluster_sectors;
-        ret = qcow2_get_cluster_offset(bs, cl_start << BDRV_SECTOR_BITS,
-                                       &nr, &off);
+        offset = cl_start << BDRV_SECTOR_BITS;
+        count = s->cluster_size;
+        nr = s->cluster_sectors;
+        ret = qcow2_get_cluster_offset(bs, offset, &nr, &off);
         if (ret != QCOW2_CLUSTER_UNALLOCATED && ret != QCOW2_CLUSTER_ZERO) {
             qemu_co_mutex_unlock(&s->lock);
             return -ENOTSUP;
@@ -2461,10 +2462,10 @@ static coroutine_fn int qcow2_co_write_zeroes(BlockDriverState *bs,
         qemu_co_mutex_lock(&s->lock);
     }

-    trace_qcow2_write_zeroes(qemu_coroutine_self(), sector_num, nb_sectors);
+    trace_qcow2_pwrite_zeroes(qemu_coroutine_self(), offset, count);

     /* Whatever is left can use real zero clusters */
-    ret = qcow2_zero_clusters(bs, sector_num << BDRV_SECTOR_BITS, nb_sectors);
+    ret = qcow2_zero_clusters(bs, offset, count >> BDRV_SECTOR_BITS);
     qemu_co_mutex_unlock(&s->lock);

     return ret;
@@ -3371,7 +3372,7 @@ BlockDriver bdrv_qcow2 = {
     .bdrv_co_writev         = qcow2_co_writev,
     .bdrv_co_flush_to_os    = qcow2_co_flush_to_os,

-    .bdrv_co_write_zeroes   = qcow2_co_write_zeroes,
+    .bdrv_co_pwrite_zeroes  = qcow2_co_pwrite_zeroes,
     .bdrv_co_discard        = qcow2_co_discard,
     .bdrv_truncate          = qcow2_truncate,
     .bdrv_write_compressed  = qcow2_write_compressed,
diff --git a/trace-events b/trace-events
index 6329c01..2537eec 100644
--- a/trace-events
+++ b/trace-events
@@ -612,8 +612,8 @@ qcow2_writev_done_req(void *co, int ret) "co %p ret %d"
 qcow2_writev_start_part(void *co) "co %p"
 qcow2_writev_done_part(void *co, int cur_nr_sectors) "co %p cur_nr_sectors %d"
 qcow2_writev_data(void *co, uint64_t offset) "co %p offset %" PRIx64
-qcow2_write_zeroes_start_req(void *co, int64_t sector, int nb_sectors) "co %p sector %" PRIx64 " nb_sectors %d"
-qcow2_write_zeroes(void *co, int64_t sector, int nb_sectors) "co %p sector %" PRIx64 " nb_sectors %d"
+qcow2_pwrite_zeroes_start_req(void *co, int64_t offset, int count) "co %p offset %" PRIx64 " count %d"
+qcow2_pwrite_zeroes(void *co, int64_t offset, int count) "co %p offset %" PRIx64 " count %d"

 # block/qcow2-cluster.c
 qcow2_alloc_clusters_offset(void *co, uint64_t offset, int num) "co %p offset %" PRIx64 " num %d"
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 07/13] blkreplay: Convert to bdrv_co_pwrite_zeroes()
  2016-06-01 21:10 [Qemu-devel] [PATCH v2 00/13] Kill sector-based write_zeroes Eric Blake
                   ` (5 preceding siblings ...)
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 06/13] qcow2: " Eric Blake
@ 2016-06-01 21:10 ` Eric Blake
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 08/13] gluster: " Eric Blake
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Eric Blake @ 2016-06-01 21:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, qemu-block, Max Reitz

Another step on our continuing quest to switch to byte-based
interfaces.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
---
 block/blkreplay.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/block/blkreplay.c b/block/blkreplay.c
index 1a721ad..525c2d5 100755
--- a/block/blkreplay.c
+++ b/block/blkreplay.c
@@ -103,13 +103,11 @@ static int coroutine_fn blkreplay_co_writev(BlockDriverState *bs,
     return ret;
 }

-static int coroutine_fn blkreplay_co_write_zeroes(BlockDriverState *bs,
-    int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
+static int coroutine_fn blkreplay_co_pwrite_zeroes(BlockDriverState *bs,
+    int64_t offset, int count, BdrvRequestFlags flags)
 {
     uint64_t reqid = request_id++;
-    int ret = bdrv_co_pwrite_zeroes(bs->file->bs,
-                                    sector_num << BDRV_SECTOR_BITS,
-                                    nb_sectors << BDRV_SECTOR_BITS, flags);
+    int ret = bdrv_co_pwrite_zeroes(bs->file->bs, offset, count, flags);
     block_request_create(reqid, bs, qemu_coroutine_self());
     qemu_coroutine_yield();

@@ -149,7 +147,7 @@ static BlockDriver bdrv_blkreplay = {
     .bdrv_co_readv          = blkreplay_co_readv,
     .bdrv_co_writev         = blkreplay_co_writev,

-    .bdrv_co_write_zeroes   = blkreplay_co_write_zeroes,
+    .bdrv_co_pwrite_zeroes  = blkreplay_co_pwrite_zeroes,
     .bdrv_co_discard        = blkreplay_co_discard,
     .bdrv_co_flush          = blkreplay_co_flush,
 };
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 08/13] gluster: Convert to bdrv_co_pwrite_zeroes()
  2016-06-01 21:10 [Qemu-devel] [PATCH v2 00/13] Kill sector-based write_zeroes Eric Blake
                   ` (6 preceding siblings ...)
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 07/13] blkreplay: " Eric Blake
@ 2016-06-01 21:10 ` Eric Blake
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 09/13] qed: " Eric Blake
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Eric Blake @ 2016-06-01 21:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, qemu-block, Jeff Cody, Max Reitz

Another step on our continuing quest to switch to byte-based
interfaces.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
---
 block/gluster.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/block/gluster.c b/block/gluster.c
index a8aaacf..d361d8e 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -454,14 +454,12 @@ static void qemu_gluster_reopen_abort(BDRVReopenState *state)
 }

 #ifdef CONFIG_GLUSTERFS_ZEROFILL
-static coroutine_fn int qemu_gluster_co_write_zeroes(BlockDriverState *bs,
-        int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
+static coroutine_fn int qemu_gluster_co_pwrite_zeroes(BlockDriverState *bs,
+        int64_t offset, int size, BdrvRequestFlags flags)
 {
     int ret;
     GlusterAIOCB acb;
     BDRVGlusterState *s = bs->opaque;
-    off_t size = nb_sectors * BDRV_SECTOR_SIZE;
-    off_t offset = sector_num * BDRV_SECTOR_SIZE;

     acb.size = size;
     acb.ret = 0;
@@ -769,7 +767,7 @@ static BlockDriver bdrv_gluster = {
     .bdrv_co_discard              = qemu_gluster_co_discard,
 #endif
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
-    .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
+    .bdrv_co_pwrite_zeroes        = qemu_gluster_co_pwrite_zeroes,
 #endif
     .create_opts                  = &qemu_gluster_create_opts,
 };
@@ -796,7 +794,7 @@ static BlockDriver bdrv_gluster_tcp = {
     .bdrv_co_discard              = qemu_gluster_co_discard,
 #endif
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
-    .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
+    .bdrv_co_pwrite_zeroes        = qemu_gluster_co_pwrite_zeroes,
 #endif
     .create_opts                  = &qemu_gluster_create_opts,
 };
@@ -823,7 +821,7 @@ static BlockDriver bdrv_gluster_unix = {
     .bdrv_co_discard              = qemu_gluster_co_discard,
 #endif
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
-    .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
+    .bdrv_co_pwrite_zeroes        = qemu_gluster_co_pwrite_zeroes,
 #endif
     .create_opts                  = &qemu_gluster_create_opts,
 };
@@ -850,7 +848,7 @@ static BlockDriver bdrv_gluster_rdma = {
     .bdrv_co_discard              = qemu_gluster_co_discard,
 #endif
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
-    .bdrv_co_write_zeroes         = qemu_gluster_co_write_zeroes,
+    .bdrv_co_pwrite_zeroes        = qemu_gluster_co_pwrite_zeroes,
 #endif
     .create_opts                  = &qemu_gluster_create_opts,
 };
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 09/13] qed: Convert to bdrv_co_pwrite_zeroes()
  2016-06-01 21:10 [Qemu-devel] [PATCH v2 00/13] Kill sector-based write_zeroes Eric Blake
                   ` (7 preceding siblings ...)
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 08/13] gluster: " Eric Blake
@ 2016-06-01 21:10 ` Eric Blake
  2016-06-02 11:16   ` Kevin Wolf
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 10/13] raw-posix: " Eric Blake
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 20+ messages in thread
From: Eric Blake @ 2016-06-01 21:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, qemu-block, Stefan Hajnoczi, Max Reitz

Another step on our continuing quest to switch to byte-based
interfaces.

Kill an abuse of the comma operator while at it (fortunately,
the semantics were still right).  Also, the test for requests
not aligned to clusters should be applied always, not just
when a backing file is present.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 block/qed.c | 33 +++++++++++++++------------------
 1 file changed, 15 insertions(+), 18 deletions(-)

diff --git a/block/qed.c b/block/qed.c
index 0ab5b40..45ec13d 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -1419,7 +1419,7 @@ typedef struct {
     bool done;
 } QEDWriteZeroesCB;

-static void coroutine_fn qed_co_write_zeroes_cb(void *opaque, int ret)
+static void coroutine_fn qed_co_pwrite_zeroes_cb(void *opaque, int ret)
 {
     QEDWriteZeroesCB *cb = opaque;

@@ -1430,10 +1430,10 @@ static void coroutine_fn qed_co_write_zeroes_cb(void *opaque, int ret)
     }
 }

-static int coroutine_fn bdrv_qed_co_write_zeroes(BlockDriverState *bs,
-                                                 int64_t sector_num,
-                                                 int nb_sectors,
-                                                 BdrvRequestFlags flags)
+static int coroutine_fn bdrv_qed_co_pwrite_zeroes(BlockDriverState *bs,
+                                                  int64_t offset,
+                                                  int count,
+                                                  BdrvRequestFlags flags)
 {
     BlockAIOCB *blockacb;
     BDRVQEDState *s = bs->opaque;
@@ -1441,25 +1441,22 @@ static int coroutine_fn bdrv_qed_co_write_zeroes(BlockDriverState *bs,
     QEMUIOVector qiov;
     struct iovec iov;

-    /* Refuse if there are untouched backing file sectors */
-    if (bs->backing) {
-        if (qed_offset_into_cluster(s, sector_num * BDRV_SECTOR_SIZE) != 0) {
-            return -ENOTSUP;
-        }
-        if (qed_offset_into_cluster(s, nb_sectors * BDRV_SECTOR_SIZE) != 0) {
-            return -ENOTSUP;
-        }
+    /* Fall back if the request is not */
+    if (qed_offset_into_cluster(s, offset) ||
+        qed_offset_into_cluster(s, count)) {
+        return -ENOTSUP;
     }

     /* Zero writes start without an I/O buffer.  If a buffer becomes necessary
      * then it will be allocated during request processing.
      */
-    iov.iov_base = NULL,
-    iov.iov_len  = nb_sectors * BDRV_SECTOR_SIZE,
+    iov.iov_base = NULL;
+    iov.iov_len = count;

     qemu_iovec_init_external(&qiov, &iov, 1);
-    blockacb = qed_aio_setup(bs, sector_num, &qiov, nb_sectors,
-                             qed_co_write_zeroes_cb, &cb,
+    blockacb = qed_aio_setup(bs, offset >> BDRV_SECTOR_BITS, &qiov,
+                             count >> BDRV_SECTOR_BITS,
+                             qed_co_pwrite_zeroes_cb, &cb,
                              QED_AIOCB_WRITE | QED_AIOCB_ZERO);
     if (!blockacb) {
         return -EIO;
@@ -1664,7 +1661,7 @@ static BlockDriver bdrv_qed = {
     .bdrv_co_get_block_status = bdrv_qed_co_get_block_status,
     .bdrv_aio_readv           = bdrv_qed_aio_readv,
     .bdrv_aio_writev          = bdrv_qed_aio_writev,
-    .bdrv_co_write_zeroes     = bdrv_qed_co_write_zeroes,
+    .bdrv_co_pwrite_zeroes    = bdrv_qed_co_pwrite_zeroes,
     .bdrv_truncate            = bdrv_qed_truncate,
     .bdrv_getlength           = bdrv_qed_getlength,
     .bdrv_get_info            = bdrv_qed_get_info,
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 10/13] raw-posix: Convert to bdrv_co_pwrite_zeroes()
  2016-06-01 21:10 [Qemu-devel] [PATCH v2 00/13] Kill sector-based write_zeroes Eric Blake
                   ` (8 preceding siblings ...)
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 09/13] qed: " Eric Blake
@ 2016-06-01 21:10 ` Eric Blake
  2016-06-03 16:21   ` Kevin Wolf
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 11/13] raw_bsd: " Eric Blake
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 20+ messages in thread
From: Eric Blake @ 2016-06-01 21:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, qemu-block, Max Reitz

Another step on our continuing quest to switch to byte-based
interfaces.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 block/raw-posix.c | 34 +++++++++++++++++-----------------
 trace-events      |  2 +-
 2 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index a4f5a1b..8bfcb4a 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1252,8 +1252,8 @@ static int aio_worker(void *arg)
 }

 static int paio_submit_co(BlockDriverState *bs, int fd,
-        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
-        int type)
+                          int64_t offset, QEMUIOVector *qiov,
+                          int count, int type)
 {
     RawPosixAIOData *acb = g_new(RawPosixAIOData, 1);
     ThreadPool *pool;
@@ -1262,16 +1262,16 @@ static int paio_submit_co(BlockDriverState *bs, int fd,
     acb->aio_type = type;
     acb->aio_fildes = fd;

-    acb->aio_nbytes = nb_sectors * BDRV_SECTOR_SIZE;
-    acb->aio_offset = sector_num * BDRV_SECTOR_SIZE;
+    acb->aio_nbytes = count;
+    acb->aio_offset = offset;

     if (qiov) {
         acb->aio_iov = qiov->iov;
         acb->aio_niov = qiov->niov;
-        assert(qiov->size == acb->aio_nbytes);
+        assert(qiov->size == count);
     }

-    trace_paio_submit_co(sector_num, nb_sectors, type);
+    trace_paio_submit_co(offset, qiov->size, type);
     pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
     return thread_pool_submit_co(pool, aio_worker, acb);
 }
@@ -1868,17 +1868,17 @@ static coroutine_fn BlockAIOCB *raw_aio_discard(BlockDriverState *bs,
                        cb, opaque, QEMU_AIO_DISCARD);
 }

-static int coroutine_fn raw_co_write_zeroes(
-    BlockDriverState *bs, int64_t sector_num,
-    int nb_sectors, BdrvRequestFlags flags)
+static int coroutine_fn raw_co_pwrite_zeroes(
+    BlockDriverState *bs, int64_t offset,
+    int count, BdrvRequestFlags flags)
 {
     BDRVRawState *s = bs->opaque;

     if (!(flags & BDRV_REQ_MAY_UNMAP)) {
-        return paio_submit_co(bs, s->fd, sector_num, NULL, nb_sectors,
+        return paio_submit_co(bs, s->fd, offset, NULL, count,
                               QEMU_AIO_WRITE_ZEROES);
     } else if (s->discard_zeroes) {
-        return paio_submit_co(bs, s->fd, sector_num, NULL, nb_sectors,
+        return paio_submit_co(bs, s->fd, offset, NULL, count,
                               QEMU_AIO_DISCARD);
     }
     return -ENOTSUP;
@@ -1931,7 +1931,7 @@ BlockDriver bdrv_file = {
     .bdrv_create = raw_create,
     .bdrv_has_zero_init = bdrv_has_zero_init_1,
     .bdrv_co_get_block_status = raw_co_get_block_status,
-    .bdrv_co_write_zeroes = raw_co_write_zeroes,
+    .bdrv_co_pwrite_zeroes = raw_co_pwrite_zeroes,

     .bdrv_aio_readv = raw_aio_readv,
     .bdrv_aio_writev = raw_aio_writev,
@@ -2293,8 +2293,8 @@ static coroutine_fn BlockAIOCB *hdev_aio_discard(BlockDriverState *bs,
                        cb, opaque, QEMU_AIO_DISCARD|QEMU_AIO_BLKDEV);
 }

-static coroutine_fn int hdev_co_write_zeroes(BlockDriverState *bs,
-    int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
+static coroutine_fn int hdev_co_pwrite_zeroes(BlockDriverState *bs,
+    int64_t offset, int count, BdrvRequestFlags flags)
 {
     BDRVRawState *s = bs->opaque;
     int rc;
@@ -2304,10 +2304,10 @@ static coroutine_fn int hdev_co_write_zeroes(BlockDriverState *bs,
         return rc;
     }
     if (!(flags & BDRV_REQ_MAY_UNMAP)) {
-        return paio_submit_co(bs, s->fd, sector_num, NULL, nb_sectors,
+        return paio_submit_co(bs, s->fd, offset, NULL, count,
                               QEMU_AIO_WRITE_ZEROES|QEMU_AIO_BLKDEV);
     } else if (s->discard_zeroes) {
-        return paio_submit_co(bs, s->fd, sector_num, NULL, nb_sectors,
+        return paio_submit_co(bs, s->fd, offset, NULL, count,
                               QEMU_AIO_DISCARD|QEMU_AIO_BLKDEV);
     }
     return -ENOTSUP;
@@ -2379,7 +2379,7 @@ static BlockDriver bdrv_host_device = {
     .bdrv_reopen_abort   = raw_reopen_abort,
     .bdrv_create         = hdev_create,
     .create_opts         = &raw_create_opts,
-    .bdrv_co_write_zeroes = hdev_co_write_zeroes,
+    .bdrv_co_pwrite_zeroes = hdev_co_pwrite_zeroes,

     .bdrv_aio_readv	= raw_aio_readv,
     .bdrv_aio_writev	= raw_aio_writev,
diff --git a/trace-events b/trace-events
index 2537eec..7cdbc62 100644
--- a/trace-events
+++ b/trace-events
@@ -132,7 +132,7 @@ thread_pool_cancel(void *req, void *opaque) "req %p opaque %p"

 # block/raw-win32.c
 # block/raw-posix.c
-paio_submit_co(int64_t sector_num, int nb_sectors, int type) "sector_num %"PRId64" nb_sectors %d type %d"
+paio_submit_co(int64_t offset, int count, int type) "offset %"PRId64" count %d type %d"
 paio_submit(void *acb, void *opaque, int64_t sector_num, int nb_sectors, int type) "acb %p opaque %p sector_num %"PRId64" nb_sectors %d type %d"

 # ioport.c
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 11/13] raw_bsd: Convert to bdrv_co_pwrite_zeroes()
  2016-06-01 21:10 [Qemu-devel] [PATCH v2 00/13] Kill sector-based write_zeroes Eric Blake
                   ` (9 preceding siblings ...)
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 10/13] raw-posix: " Eric Blake
@ 2016-06-01 21:10 ` Eric Blake
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 12/13] vmdk: " Eric Blake
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Eric Blake @ 2016-06-01 21:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, qemu-block, Max Reitz

Another step on our continuing quest to switch to byte-based
interfaces.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
---
 block/raw_bsd.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/block/raw_bsd.c b/block/raw_bsd.c
index d9adf90..b1d5237 100644
--- a/block/raw_bsd.c
+++ b/block/raw_bsd.c
@@ -127,12 +127,11 @@ static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
            (sector_num << BDRV_SECTOR_BITS);
 }

-static int coroutine_fn raw_co_write_zeroes(BlockDriverState *bs,
-                                            int64_t sector_num, int nb_sectors,
-                                            BdrvRequestFlags flags)
+static int coroutine_fn raw_co_pwrite_zeroes(BlockDriverState *bs,
+                                             int64_t offset, int count,
+                                             BdrvRequestFlags flags)
 {
-    return bdrv_co_pwrite_zeroes(bs->file->bs, sector_num << BDRV_SECTOR_BITS,
-                                 nb_sectors << BDRV_SECTOR_BITS, flags);
+    return bdrv_co_pwrite_zeroes(bs->file->bs, offset, count, flags);
 }

 static int coroutine_fn raw_co_discard(BlockDriverState *bs,
@@ -253,7 +252,7 @@ BlockDriver bdrv_raw = {
     .bdrv_create          = &raw_create,
     .bdrv_co_readv        = &raw_co_readv,
     .bdrv_co_writev_flags = &raw_co_writev_flags,
-    .bdrv_co_write_zeroes = &raw_co_write_zeroes,
+    .bdrv_co_pwrite_zeroes = &raw_co_pwrite_zeroes,
     .bdrv_co_discard      = &raw_co_discard,
     .bdrv_co_get_block_status = &raw_co_get_block_status,
     .bdrv_truncate        = &raw_truncate,
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 12/13] vmdk: Convert to bdrv_co_pwrite_zeroes()
  2016-06-01 21:10 [Qemu-devel] [PATCH v2 00/13] Kill sector-based write_zeroes Eric Blake
                   ` (10 preceding siblings ...)
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 11/13] raw_bsd: " Eric Blake
@ 2016-06-01 21:10 ` Eric Blake
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 13/13] block: Kill bdrv_co_write_zeroes() Eric Blake
  2016-06-02 11:26 ` [Qemu-devel] [PATCH v2 00/13] Kill sector-based write_zeroes Kevin Wolf
  13 siblings, 0 replies; 20+ messages in thread
From: Eric Blake @ 2016-06-01 21:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, qemu-block, Fam Zheng, Max Reitz

Another step on our continuing quest to switch to byte-based
interfaces.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
---
 block/vmdk.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 8494d63..fa5bc09 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1704,15 +1704,13 @@ static int vmdk_write_compressed(BlockDriverState *bs,
     }
 }

-static int coroutine_fn vmdk_co_write_zeroes(BlockDriverState *bs,
-                                             int64_t sector_num,
-                                             int nb_sectors,
-                                             BdrvRequestFlags flags)
+static int coroutine_fn vmdk_co_pwrite_zeroes(BlockDriverState *bs,
+                                              int64_t offset,
+                                              int bytes,
+                                              BdrvRequestFlags flags)
 {
     int ret;
     BDRVVmdkState *s = bs->opaque;
-    uint64_t offset = sector_num * BDRV_SECTOR_SIZE;
-    uint64_t bytes = nb_sectors * BDRV_SECTOR_SIZE;

     qemu_co_mutex_lock(&s->lock);
     /* write zeroes could fail if sectors not aligned to cluster, test it with
@@ -2403,7 +2401,7 @@ static BlockDriver bdrv_vmdk = {
     .bdrv_co_preadv               = vmdk_co_preadv,
     .bdrv_co_pwritev              = vmdk_co_pwritev,
     .bdrv_write_compressed        = vmdk_write_compressed,
-    .bdrv_co_write_zeroes         = vmdk_co_write_zeroes,
+    .bdrv_co_pwrite_zeroes        = vmdk_co_pwrite_zeroes,
     .bdrv_close                   = vmdk_close,
     .bdrv_create                  = vmdk_create,
     .bdrv_co_flush_to_disk        = vmdk_co_flush,
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v2 13/13] block: Kill bdrv_co_write_zeroes()
  2016-06-01 21:10 [Qemu-devel] [PATCH v2 00/13] Kill sector-based write_zeroes Eric Blake
                   ` (11 preceding siblings ...)
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 12/13] vmdk: " Eric Blake
@ 2016-06-01 21:10 ` Eric Blake
  2016-06-02 11:26 ` [Qemu-devel] [PATCH v2 00/13] Kill sector-based write_zeroes Kevin Wolf
  13 siblings, 0 replies; 20+ messages in thread
From: Eric Blake @ 2016-06-01 21:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, qemu-block, Stefan Hajnoczi, Fam Zheng, Max Reitz

Now that all drivers have been converted to a byte interface,
we no longer need a sector interface.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
---
 include/block/block_int.h |  2 --
 block/io.c                | 15 ++-------------
 2 files changed, 2 insertions(+), 15 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 1dfdf92..8a4963c 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -163,8 +163,6 @@ struct BlockDriver {
      * function pointer may be NULL or return -ENOSUP and .bdrv_co_writev()
      * will be called instead.
      */
-    int coroutine_fn (*bdrv_co_write_zeroes)(BlockDriverState *bs,
-        int64_t sector_num, int nb_sectors, BdrvRequestFlags flags);
     int coroutine_fn (*bdrv_co_pwrite_zeroes)(BlockDriverState *bs,
         int64_t offset, int count, BdrvRequestFlags flags);
     int coroutine_fn (*bdrv_co_discard)(BlockDriverState *bs,
diff --git a/block/io.c b/block/io.c
index 5f021d5..6013b97 100644
--- a/block/io.c
+++ b/block/io.c
@@ -901,7 +901,7 @@ static int coroutine_fn bdrv_co_do_copy_on_readv(BlockDriverState *bs,
         goto err;
     }

-    if ((drv->bdrv_co_write_zeroes || drv->bdrv_co_pwrite_zeroes) &&
+    if (drv->bdrv_co_pwrite_zeroes &&
         buffer_is_zero(bounce_buffer, iov.iov_len)) {
         ret = bdrv_co_do_pwrite_zeroes(bs,
                                        cluster_sector_num * BDRV_SECTOR_SIZE,
@@ -1170,16 +1170,6 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
                 !(bs->supported_zero_flags & BDRV_REQ_FUA)) {
                 need_flush = true;
             }
-        } else if (drv->bdrv_co_write_zeroes) {
-            assert(offset % BDRV_SECTOR_SIZE == 0);
-            assert(count % BDRV_SECTOR_SIZE == 0);
-            ret = drv->bdrv_co_write_zeroes(bs, offset >> BDRV_SECTOR_BITS,
-                                            num >> BDRV_SECTOR_BITS,
-                                            flags & bs->supported_zero_flags);
-            if (ret != -ENOTSUP && (flags & BDRV_REQ_FUA) &&
-                !(bs->supported_zero_flags & BDRV_REQ_FUA)) {
-                need_flush = true;
-            }
         } else {
             assert(!bs->supported_zero_flags);
         }
@@ -1259,8 +1249,7 @@ static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs,
     ret = notifier_with_return_list_notify(&bs->before_write_notifiers, req);

     if (!ret && bs->detect_zeroes != BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF &&
-        !(flags & BDRV_REQ_ZERO_WRITE) &&
-        (drv->bdrv_co_pwrite_zeroes || drv->bdrv_co_write_zeroes) &&
+        !(flags & BDRV_REQ_ZERO_WRITE) && drv->bdrv_co_pwrite_zeroes &&
         qemu_iovec_is_zero(qiov)) {
         flags |= BDRV_REQ_ZERO_WRITE;
         if (bs->detect_zeroes == BLOCKDEV_DETECT_ZEROES_OPTIONS_UNMAP) {
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2 04/13] block: Switch bdrv_write_zeroes() to byte interface
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 04/13] block: Switch bdrv_write_zeroes() to byte interface Eric Blake
@ 2016-06-02 11:01   ` Kevin Wolf
  0 siblings, 0 replies; 20+ messages in thread
From: Kevin Wolf @ 2016-06-02 11:01 UTC (permalink / raw)
  To: Eric Blake
  Cc: qemu-devel, qemu-block, Max Reitz, Stefan Hajnoczi, Fam Zheng,
	Denis V. Lunev, Juan Quintela, Amit Shah

Am 01.06.2016 um 23:10 hat Eric Blake geschrieben:
> Rename to bdrv_pwrite_zeroes() to let the compiler ensure we
> cater to the updated semantics.  Do the same for
> bdrv_aio_write_zeroes() and bdrv_co_write_zeroes().  Two of the
> three places map to the byte-based counterparts; but for now,
> since we have no byte-based aio write, we still require sector
> alignment in those callers, via assertions.

As it happens, I sent a patch (and you already reviewed it) that removes
the AIO function anyway because those callers don't exist. :-)

So this patch will conflict with it, but I guess I can resolve that
while rebasing.

Kevin

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2 09/13] qed: Convert to bdrv_co_pwrite_zeroes()
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 09/13] qed: " Eric Blake
@ 2016-06-02 11:16   ` Kevin Wolf
  2016-06-02 12:40     ` Eric Blake
  0 siblings, 1 reply; 20+ messages in thread
From: Kevin Wolf @ 2016-06-02 11:16 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, qemu-block, Stefan Hajnoczi, Max Reitz

Am 01.06.2016 um 23:10 hat Eric Blake geschrieben:
> Another step on our continuing quest to switch to byte-based
> interfaces.
> 
> Kill an abuse of the comma operator while at it (fortunately,
> the semantics were still right).  Also, the test for requests
> not aligned to clusters should be applied always, not just
> when a backing file is present.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>  block/qed.c | 33 +++++++++++++++------------------
>  1 file changed, 15 insertions(+), 18 deletions(-)
> 
> diff --git a/block/qed.c b/block/qed.c
> index 0ab5b40..45ec13d 100644
> --- a/block/qed.c
> +++ b/block/qed.c
> @@ -1419,7 +1419,7 @@ typedef struct {
>      bool done;
>  } QEDWriteZeroesCB;
> 
> -static void coroutine_fn qed_co_write_zeroes_cb(void *opaque, int ret)
> +static void coroutine_fn qed_co_pwrite_zeroes_cb(void *opaque, int ret)
>  {
>      QEDWriteZeroesCB *cb = opaque;
> 
> @@ -1430,10 +1430,10 @@ static void coroutine_fn qed_co_write_zeroes_cb(void *opaque, int ret)
>      }
>  }
> 
> -static int coroutine_fn bdrv_qed_co_write_zeroes(BlockDriverState *bs,
> -                                                 int64_t sector_num,
> -                                                 int nb_sectors,
> -                                                 BdrvRequestFlags flags)
> +static int coroutine_fn bdrv_qed_co_pwrite_zeroes(BlockDriverState *bs,
> +                                                  int64_t offset,
> +                                                  int count,
> +                                                  BdrvRequestFlags flags)
>  {
>      BlockAIOCB *blockacb;
>      BDRVQEDState *s = bs->opaque;
> @@ -1441,25 +1441,22 @@ static int coroutine_fn bdrv_qed_co_write_zeroes(BlockDriverState *bs,
>      QEMUIOVector qiov;
>      struct iovec iov;
> 
> -    /* Refuse if there are untouched backing file sectors */
> -    if (bs->backing) {
> -        if (qed_offset_into_cluster(s, sector_num * BDRV_SECTOR_SIZE) != 0) {
> -            return -ENOTSUP;
> -        }
> -        if (qed_offset_into_cluster(s, nb_sectors * BDRV_SECTOR_SIZE) != 0) {
> -            return -ENOTSUP;
> -        }
> +    /* Fall back if the request is not */

...aligned?

> +    if (qed_offset_into_cluster(s, offset) ||
> +        qed_offset_into_cluster(s, count)) {
> +        return -ENOTSUP;
>      }

This is obviously correct and almost as obviously suboptimal compared to
the original version (we need cluster alignment with a backing file, but
without a backing file, sector alignment would be enough).

But as this is QED, which is only supported for compatibility these
days, simpler if slightly suboptimal code is okay with me.

Kevin

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2 00/13] Kill sector-based write_zeroes
  2016-06-01 21:10 [Qemu-devel] [PATCH v2 00/13] Kill sector-based write_zeroes Eric Blake
                   ` (12 preceding siblings ...)
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 13/13] block: Kill bdrv_co_write_zeroes() Eric Blake
@ 2016-06-02 11:26 ` Kevin Wolf
  13 siblings, 0 replies; 20+ messages in thread
From: Kevin Wolf @ 2016-06-02 11:26 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, qemu-block

Am 01.06.2016 um 23:10 hat Eric Blake geschrieben:
> Kevin pointed out that my recent change to byte-based instead
> of sector-based blk_write_zeroes() (commit 983a1600) makes life
> harder as long as bdrv_write_zeroes is still sector-based, and
> where the compiler doesn't flag any change in parameter types.
> Complete the conversion, by making all write_zeroes operations
> nominally take bytes.

Thanks, applied to the block branch (after completing the comment in the
qed patch).

Kevin

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2 09/13] qed: Convert to bdrv_co_pwrite_zeroes()
  2016-06-02 11:16   ` Kevin Wolf
@ 2016-06-02 12:40     ` Eric Blake
  2016-06-02 12:45       ` Kevin Wolf
  0 siblings, 1 reply; 20+ messages in thread
From: Eric Blake @ 2016-06-02 12:40 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, qemu-block, Stefan Hajnoczi, Max Reitz

[-- Attachment #1: Type: text/plain, Size: 1891 bytes --]

On 06/02/2016 05:16 AM, Kevin Wolf wrote:
> Am 01.06.2016 um 23:10 hat Eric Blake geschrieben:
>> Another step on our continuing quest to switch to byte-based
>> interfaces.
>>
>> Kill an abuse of the comma operator while at it (fortunately,
>> the semantics were still right).  Also, the test for requests
>> not aligned to clusters should be applied always, not just
>> when a backing file is present.
>>
>> Signed-off-by: Eric Blake <eblake@redhat.com>
>> ---
>>  block/qed.c | 33 +++++++++++++++------------------
>>  1 file changed, 15 insertions(+), 18 deletions(-)

>> -        }
>> +    /* Fall back if the request is not */
> 
> ...aligned?

Yes, thanks.

> 
>> +    if (qed_offset_into_cluster(s, offset) ||
>> +        qed_offset_into_cluster(s, count)) {
>> +        return -ENOTSUP;
>>      }
> 
> This is obviously correct and almost as obviously suboptimal compared to
> the original version (we need cluster alignment with a backing file, but
> without a backing file, sector alignment would be enough).

Does QED support per-sector zeroing, or is it only per-cluster zero like
qcow2?

/me checks the spec

only per-cluster zeroing

> 
> But as this is QED, which is only supported for compatibility these
> days, simpler if slightly suboptimal code is okay with me.

Widening a request to cluster boundaries is only possible if you know
the cluster is otherwise unallocated or already reads as zeroes, and
unless qed tracks zero sectors (as opposed to zero clusters), I argue
that this was actually a bug fix, even when the request was
sector-aligned, since we lack the check for cluster remains
unallocated/zero before widening, the way qcow2 does it.  Sub-optimal
but safe is better than incorrectly optimized.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2 09/13] qed: Convert to bdrv_co_pwrite_zeroes()
  2016-06-02 12:40     ` Eric Blake
@ 2016-06-02 12:45       ` Kevin Wolf
  0 siblings, 0 replies; 20+ messages in thread
From: Kevin Wolf @ 2016-06-02 12:45 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, qemu-block, Stefan Hajnoczi, Max Reitz

[-- Attachment #1: Type: text/plain, Size: 2158 bytes --]

Am 02.06.2016 um 14:40 hat Eric Blake geschrieben:
> On 06/02/2016 05:16 AM, Kevin Wolf wrote:
> > Am 01.06.2016 um 23:10 hat Eric Blake geschrieben:
> >> Another step on our continuing quest to switch to byte-based
> >> interfaces.
> >>
> >> Kill an abuse of the comma operator while at it (fortunately,
> >> the semantics were still right).  Also, the test for requests
> >> not aligned to clusters should be applied always, not just
> >> when a backing file is present.
> >>
> >> Signed-off-by: Eric Blake <eblake@redhat.com>
> >> ---
> >>  block/qed.c | 33 +++++++++++++++------------------
> >>  1 file changed, 15 insertions(+), 18 deletions(-)
> 
> >> -        }
> >> +    /* Fall back if the request is not */
> > 
> > ...aligned?
> 
> Yes, thanks.
> 
> > 
> >> +    if (qed_offset_into_cluster(s, offset) ||
> >> +        qed_offset_into_cluster(s, count)) {
> >> +        return -ENOTSUP;
> >>      }
> > 
> > This is obviously correct and almost as obviously suboptimal compared to
> > the original version (we need cluster alignment with a backing file, but
> > without a backing file, sector alignment would be enough).
> 
> Does QED support per-sector zeroing, or is it only per-cluster zero like
> qcow2?
> 
> /me checks the spec
> 
> only per-cluster zeroing
> 
> > 
> > But as this is QED, which is only supported for compatibility these
> > days, simpler if slightly suboptimal code is okay with me.
> 
> Widening a request to cluster boundaries is only possible if you know
> the cluster is otherwise unallocated or already reads as zeroes, and
> unless qed tracks zero sectors (as opposed to zero clusters), I argue
> that this was actually a bug fix, even when the request was
> sector-aligned, since we lack the check for cluster remains
> unallocated/zero before widening, the way qcow2 does it.  Sub-optimal
> but safe is better than incorrectly optimized.

It actually checks whether the cluster is already allocated; in that
case, qed implements a fallback itself. So I think it was working. Just
as usual with the callback based qed code, it's not very easy to read.

Kevin

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v2 10/13] raw-posix: Convert to bdrv_co_pwrite_zeroes()
  2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 10/13] raw-posix: " Eric Blake
@ 2016-06-03 16:21   ` Kevin Wolf
  0 siblings, 0 replies; 20+ messages in thread
From: Kevin Wolf @ 2016-06-03 16:21 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, qemu-block, Max Reitz

Am 01.06.2016 um 23:10 hat Eric Blake geschrieben:
> Another step on our continuing quest to switch to byte-based
> interfaces.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>  block/raw-posix.c | 34 +++++++++++++++++-----------------
>  trace-events      |  2 +-
>  2 files changed, 18 insertions(+), 18 deletions(-)
> 
> diff --git a/block/raw-posix.c b/block/raw-posix.c
> index a4f5a1b..8bfcb4a 100644
> --- a/block/raw-posix.c
> +++ b/block/raw-posix.c
> @@ -1252,8 +1252,8 @@ static int aio_worker(void *arg)
>  }
> 
>  static int paio_submit_co(BlockDriverState *bs, int fd,
> -        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
> -        int type)
> +                          int64_t offset, QEMUIOVector *qiov,
> +                          int count, int type)
>  {
>      RawPosixAIOData *acb = g_new(RawPosixAIOData, 1);
>      ThreadPool *pool;
> @@ -1262,16 +1262,16 @@ static int paio_submit_co(BlockDriverState *bs, int fd,
>      acb->aio_type = type;
>      acb->aio_fildes = fd;
> 
> -    acb->aio_nbytes = nb_sectors * BDRV_SECTOR_SIZE;
> -    acb->aio_offset = sector_num * BDRV_SECTOR_SIZE;
> +    acb->aio_nbytes = count;
> +    acb->aio_offset = offset;
> 
>      if (qiov) {
>          acb->aio_iov = qiov->iov;
>          acb->aio_niov = qiov->niov;
> -        assert(qiov->size == acb->aio_nbytes);
> +        assert(qiov->size == count);
>      }
> 
> -    trace_paio_submit_co(sector_num, nb_sectors, type);
> +    trace_paio_submit_co(offset, qiov->size, type);

As discussed on IRC, I'm fixing this up in my branch with
s/qiov->size/count/ because qiov can be NULL.

This fixes qemu-iotests 048 for raw.

>      pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
>      return thread_pool_submit_co(pool, aio_worker, acb);
>  }

Kevin

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2016-06-03 16:21 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-01 21:10 [Qemu-devel] [PATCH v2 00/13] Kill sector-based write_zeroes Eric Blake
2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 01/13] iscsi: Use block size as minimum zero/discard alignment Eric Blake
2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 02/13] block: Track write zero limits in bytes Eric Blake
2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 03/13] block: Add .bdrv_co_pwrite_zeroes() Eric Blake
2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 04/13] block: Switch bdrv_write_zeroes() to byte interface Eric Blake
2016-06-02 11:01   ` Kevin Wolf
2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 05/13] iscsi: Convert to bdrv_co_pwrite_zeroes() Eric Blake
2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 06/13] qcow2: " Eric Blake
2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 07/13] blkreplay: " Eric Blake
2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 08/13] gluster: " Eric Blake
2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 09/13] qed: " Eric Blake
2016-06-02 11:16   ` Kevin Wolf
2016-06-02 12:40     ` Eric Blake
2016-06-02 12:45       ` Kevin Wolf
2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 10/13] raw-posix: " Eric Blake
2016-06-03 16:21   ` Kevin Wolf
2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 11/13] raw_bsd: " Eric Blake
2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 12/13] vmdk: " Eric Blake
2016-06-01 21:10 ` [Qemu-devel] [PATCH v2 13/13] block: Kill bdrv_co_write_zeroes() Eric Blake
2016-06-02 11:26 ` [Qemu-devel] [PATCH v2 00/13] Kill sector-based write_zeroes Kevin Wolf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.