All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/5] qcow2: Implement .bdrv_co_preadv/pwritev
@ 2016-06-03 17:21 Kevin Wolf
  2016-06-03 17:21 ` [Qemu-devel] [PATCH 1/5] qcow2: Work with bytes in qcow2_get_cluster_offset() Kevin Wolf
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Kevin Wolf @ 2016-06-03 17:21 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, jsnow, qemu-devel

This series converts qcow2 to the byte-based I/O interfaces. This simplifies
the code by removing many unit conversions, and in the unlikely case of actual
unaligned requests, it even makes the driver work more efficiently by avoiding
read-modify-write.

Kevin Wolf (5):
  qcow2: Work with bytes in qcow2_get_cluster_offset()
  qcow2: Implement .bdrv_co_preadv()
  qcow2: Make copy_sectors() byte based
  qcow2: Use bytes instead of sectors for QCowL2Meta
  qcow2: Implement .bdrv_co_pwritev

 block/qcow2-cluster.c | 121 ++++++++++++++------------------
 block/qcow2.c         | 190 ++++++++++++++++++++++++++------------------------
 block/qcow2.h         |  18 ++---
 trace-events          |   6 +-
 4 files changed, 159 insertions(+), 176 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH 1/5] qcow2: Work with bytes in qcow2_get_cluster_offset()
  2016-06-03 17:21 [Qemu-devel] [PATCH 0/5] qcow2: Implement .bdrv_co_preadv/pwritev Kevin Wolf
@ 2016-06-03 17:21 ` Kevin Wolf
  2016-06-03 17:44   ` Eric Blake
  2016-06-03 17:21 ` [Qemu-devel] [PATCH 2/5] qcow2: Implement .bdrv_co_preadv() Kevin Wolf
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Kevin Wolf @ 2016-06-03 17:21 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, jsnow, qemu-devel

This patch changes the units that qcow2_get_cluster_offset() uses
internally, without touching the interface just yet. This will be done
in another patch.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-cluster.c | 43 ++++++++++++++++++++++---------------------
 1 file changed, 22 insertions(+), 21 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index d901d89..b2405b1 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -482,29 +482,27 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
     unsigned int l2_index;
     uint64_t l1_index, l2_offset, *l2_table;
     int l1_bits, c;
-    unsigned int index_in_cluster, nb_clusters;
-    uint64_t nb_available, nb_needed;
+    unsigned int offset_in_cluster, nb_clusters;
+    uint64_t bytes_available, bytes_needed;
     int ret;
+    unsigned int bytes;
 
-    index_in_cluster = (offset >> 9) & (s->cluster_sectors - 1);
-    nb_needed = *num + index_in_cluster;
+    bytes = *num * BDRV_SECTOR_SIZE;
 
-    l1_bits = s->l2_bits + s->cluster_bits;
-
-    /* compute how many bytes there are between the offset and
-     * the end of the l1 entry
-     */
+    offset_in_cluster = offset_into_cluster(s, offset);
+    bytes_needed = bytes + offset_in_cluster;
 
-    nb_available = (1ULL << l1_bits) - (offset & ((1ULL << l1_bits) - 1));
-
-    /* compute the number of available sectors */
+    l1_bits = s->l2_bits + s->cluster_bits;
 
-    nb_available = (nb_available >> 9) + index_in_cluster;
+    /* compute how many bytes there are between the start of the cluster
+     * containing offset and the end of the l1 entry */
+    bytes_available = (1ULL << l1_bits) - (offset & ((1ULL << l1_bits) - 1))
+                    + offset_in_cluster;
 
-    if (nb_needed > nb_available) {
-        nb_needed = nb_available;
+    if (bytes_needed > bytes_available) {
+        bytes_needed = bytes_available;
     }
-    assert(nb_needed <= INT_MAX);
+    assert(bytes_needed <= INT_MAX);
 
     *cluster_offset = 0;
 
@@ -542,7 +540,7 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
     *cluster_offset = be64_to_cpu(l2_table[l2_index]);
 
     /* nb_needed <= INT_MAX, thus nb_clusters <= INT_MAX, too */
-    nb_clusters = size_to_clusters(s, nb_needed << 9);
+    nb_clusters = size_to_clusters(s, bytes_needed);
 
     ret = qcow2_get_cluster_type(*cluster_offset);
     switch (ret) {
@@ -589,13 +587,16 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
 
     qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
 
-    nb_available = (c * s->cluster_sectors);
+    bytes_available = (c * s->cluster_size);
 
 out:
-    if (nb_available > nb_needed)
-        nb_available = nb_needed;
+    if (bytes_available > bytes_needed) {
+        bytes_available = bytes_needed;
+    }
 
-    *num = nb_available - index_in_cluster;
+    bytes = bytes_available - offset_in_cluster;
+    assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
+    *num = bytes >> BDRV_SECTOR_BITS;
 
     return ret;
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH 2/5] qcow2: Implement .bdrv_co_preadv()
  2016-06-03 17:21 [Qemu-devel] [PATCH 0/5] qcow2: Implement .bdrv_co_preadv/pwritev Kevin Wolf
  2016-06-03 17:21 ` [Qemu-devel] [PATCH 1/5] qcow2: Work with bytes in qcow2_get_cluster_offset() Kevin Wolf
@ 2016-06-03 17:21 ` Kevin Wolf
  2016-06-03 19:18   ` Eric Blake
  2016-06-03 17:21 ` [Qemu-devel] [PATCH 3/5] qcow2: Make copy_sectors() byte based Kevin Wolf
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Kevin Wolf @ 2016-06-03 17:21 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, jsnow, qemu-devel

Reading from qcow2 images is now byte granularity.

Most of the affected code in qcow2 actually gets simpler with this
change. The only exception is encryption, which is fixed on 512 bytes
blocks; in order to keep this working, bs->request_alignment is set for
encrypted images.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-cluster.c |  18 ++++-----
 block/qcow2.c         | 108 +++++++++++++++++++++++++++-----------------------
 block/qcow2.h         |   2 +-
 3 files changed, 67 insertions(+), 61 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index b2405b1..9fb7f9f 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -424,7 +424,8 @@ static int coroutine_fn copy_sectors(BlockDriverState *bs,
      * interface.  This avoids double I/O throttling and request tracking,
      * which can lead to deadlock when block layer copy-on-read is enabled.
      */
-    ret = bs->drv->bdrv_co_readv(bs, start_sect + n_start, n, &qiov);
+    ret = bs->drv->bdrv_co_preadv(bs, (start_sect + n_start) * BDRV_SECTOR_SIZE,
+                                  n * BDRV_SECTOR_SIZE, &qiov, 0);
     if (ret < 0) {
         goto out;
     }
@@ -467,16 +468,16 @@ out:
  * For a given offset of the disk image, find the cluster offset in
  * qcow2 file. The offset is stored in *cluster_offset.
  *
- * on entry, *num is the number of contiguous sectors we'd like to
+ * on entry, *bytes is the number of contiguous bytes we'd like to
  * access following offset.
  *
- * on exit, *num is the number of contiguous sectors we can read.
+ * on exit, *bytes is the number of contiguous bytes we can read.
  *
  * Returns the cluster type (QCOW2_CLUSTER_*) on success, -errno in error
  * cases.
  */
 int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
-    int *num, uint64_t *cluster_offset)
+                             unsigned int *bytes, uint64_t *cluster_offset)
 {
     BDRVQcow2State *s = bs->opaque;
     unsigned int l2_index;
@@ -485,12 +486,9 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
     unsigned int offset_in_cluster, nb_clusters;
     uint64_t bytes_available, bytes_needed;
     int ret;
-    unsigned int bytes;
-
-    bytes = *num * BDRV_SECTOR_SIZE;
 
     offset_in_cluster = offset_into_cluster(s, offset);
-    bytes_needed = bytes + offset_in_cluster;
+    bytes_needed = *bytes + offset_in_cluster;
 
     l1_bits = s->l2_bits + s->cluster_bits;
 
@@ -594,9 +592,7 @@ out:
         bytes_available = bytes_needed;
     }
 
-    bytes = bytes_available - offset_in_cluster;
-    assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
-    *num = bytes >> BDRV_SECTOR_BITS;
+    *bytes = bytes_available - offset_in_cluster;
 
     return ret;
 
diff --git a/block/qcow2.c b/block/qcow2.c
index 6f5fb81..b498753 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -975,6 +975,9 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
         }
 
         bs->encrypted = 1;
+
+        /* Encryption works on a sector granularity */
+        bs->request_alignment = BDRV_SECTOR_SIZE;
     }
 
     s->l2_bits = s->cluster_bits - 3; /* L2 is always one cluster */
@@ -1331,16 +1334,20 @@ static int64_t coroutine_fn qcow2_co_get_block_status(BlockDriverState *bs,
     BDRVQcow2State *s = bs->opaque;
     uint64_t cluster_offset;
     int index_in_cluster, ret;
+    unsigned int bytes;
     int64_t status = 0;
 
-    *pnum = nb_sectors;
+    bytes = MIN(INT_MAX, nb_sectors * BDRV_SECTOR_SIZE);
     qemu_co_mutex_lock(&s->lock);
-    ret = qcow2_get_cluster_offset(bs, sector_num << 9, pnum, &cluster_offset);
+    ret = qcow2_get_cluster_offset(bs, sector_num << 9, &bytes,
+                                   &cluster_offset);
     qemu_co_mutex_unlock(&s->lock);
     if (ret < 0) {
         return ret;
     }
 
+    *pnum = bytes >> BDRV_SECTOR_BITS;
+
     if (cluster_offset != 0 && ret != QCOW2_CLUSTER_COMPRESSED &&
         !s->cipher) {
         index_in_cluster = sector_num & (s->cluster_sectors - 1);
@@ -1358,28 +1365,34 @@ static int64_t coroutine_fn qcow2_co_get_block_status(BlockDriverState *bs,
 
 /* handle reading after the end of the backing file */
 int qcow2_backing_read1(BlockDriverState *bs, QEMUIOVector *qiov,
-                  int64_t sector_num, int nb_sectors)
+                        int64_t offset, int bytes)
 {
+    uint64_t bs_size = bs->total_sectors * BDRV_SECTOR_SIZE;
     int n1;
-    if ((sector_num + nb_sectors) <= bs->total_sectors)
-        return nb_sectors;
-    if (sector_num >= bs->total_sectors)
+
+    if ((offset + bytes) <= bs_size) {
+        return bytes;
+    }
+
+    if (offset >= bs_size) {
         n1 = 0;
-    else
-        n1 = bs->total_sectors - sector_num;
+    } else {
+        n1 = bs_size - offset;
+    }
 
-    qemu_iovec_memset(qiov, 512 * n1, 0, 512 * (nb_sectors - n1));
+    qemu_iovec_memset(qiov, n1, 0, bytes - n1);
 
     return n1;
 }
 
-static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
-                          int remaining_sectors, QEMUIOVector *qiov)
+static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
+                                        uint64_t bytes, QEMUIOVector *qiov,
+                                        int flags)
 {
     BDRVQcow2State *s = bs->opaque;
-    int index_in_cluster, n1;
+    int offset_in_cluster, n1;
     int ret;
-    int cur_nr_sectors; /* number of sectors in current iteration */
+    unsigned int cur_bytes; /* number of sectors in current iteration */
     uint64_t cluster_offset = 0;
     uint64_t bytes_done = 0;
     QEMUIOVector hd_qiov;
@@ -1389,26 +1402,24 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
 
     qemu_co_mutex_lock(&s->lock);
 
-    while (remaining_sectors != 0) {
+    while (bytes != 0) {
 
         /* prepare next request */
-        cur_nr_sectors = remaining_sectors;
+        cur_bytes = MIN(bytes, INT_MAX);
         if (s->cipher) {
-            cur_nr_sectors = MIN(cur_nr_sectors,
-                QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors);
+            cur_bytes = MIN(cur_bytes,
+                            QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size);
         }
 
-        ret = qcow2_get_cluster_offset(bs, sector_num << 9,
-            &cur_nr_sectors, &cluster_offset);
+        ret = qcow2_get_cluster_offset(bs, offset, &cur_bytes, &cluster_offset);
         if (ret < 0) {
             goto fail;
         }
 
-        index_in_cluster = sector_num & (s->cluster_sectors - 1);
+        offset_in_cluster = offset_into_cluster(s, offset);
 
         qemu_iovec_reset(&hd_qiov);
-        qemu_iovec_concat(&hd_qiov, qiov, bytes_done,
-            cur_nr_sectors * 512);
+        qemu_iovec_concat(&hd_qiov, qiov, bytes_done, cur_bytes);
 
         switch (ret) {
         case QCOW2_CLUSTER_UNALLOCATED:
@@ -1416,18 +1427,17 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
             if (bs->backing) {
                 /* read from the base image */
                 n1 = qcow2_backing_read1(bs->backing->bs, &hd_qiov,
-                    sector_num, cur_nr_sectors);
+                                         offset, cur_bytes);
                 if (n1 > 0) {
                     QEMUIOVector local_qiov;
 
                     qemu_iovec_init(&local_qiov, hd_qiov.niov);
-                    qemu_iovec_concat(&local_qiov, &hd_qiov, 0,
-                                      n1 * BDRV_SECTOR_SIZE);
+                    qemu_iovec_concat(&local_qiov, &hd_qiov, 0, n1);
 
                     BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
                     qemu_co_mutex_unlock(&s->lock);
-                    ret = bdrv_co_readv(bs->backing->bs, sector_num,
-                                        n1, &local_qiov);
+                    ret = bdrv_co_preadv(bs->backing->bs, offset, n1,
+                                         &local_qiov, 0);
                     qemu_co_mutex_lock(&s->lock);
 
                     qemu_iovec_destroy(&local_qiov);
@@ -1438,12 +1448,12 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
                 }
             } else {
                 /* Note: in this case, no need to wait */
-                qemu_iovec_memset(&hd_qiov, 0, 0, 512 * cur_nr_sectors);
+                qemu_iovec_memset(&hd_qiov, 0, 0, cur_bytes);
             }
             break;
 
         case QCOW2_CLUSTER_ZERO:
-            qemu_iovec_memset(&hd_qiov, 0, 0, 512 * cur_nr_sectors);
+            qemu_iovec_memset(&hd_qiov, 0, 0, cur_bytes);
             break;
 
         case QCOW2_CLUSTER_COMPRESSED:
@@ -1454,8 +1464,8 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
             }
 
             qemu_iovec_from_buf(&hd_qiov, 0,
-                s->cluster_cache + index_in_cluster * 512,
-                512 * cur_nr_sectors);
+                                s->cluster_cache + offset_in_cluster,
+                                cur_bytes);
             break;
 
         case QCOW2_CLUSTER_NORMAL:
@@ -1482,34 +1492,34 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
                     }
                 }
 
-                assert(cur_nr_sectors <=
-                    QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors);
+                assert(cur_bytes <= QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size);
                 qemu_iovec_reset(&hd_qiov);
-                qemu_iovec_add(&hd_qiov, cluster_data,
-                    512 * cur_nr_sectors);
+                qemu_iovec_add(&hd_qiov, cluster_data, cur_bytes);
             }
 
             BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
             qemu_co_mutex_unlock(&s->lock);
-            ret = bdrv_co_readv(bs->file->bs,
-                                (cluster_offset >> 9) + index_in_cluster,
-                                cur_nr_sectors, &hd_qiov);
+            ret = bdrv_co_preadv(bs->file->bs,
+                                 cluster_offset + offset_in_cluster,
+                                 cur_bytes, &hd_qiov, 0);
             qemu_co_mutex_lock(&s->lock);
             if (ret < 0) {
                 goto fail;
             }
             if (bs->encrypted) {
                 assert(s->cipher);
+                assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
+                assert((cur_bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
                 Error *err = NULL;
-                if (qcow2_encrypt_sectors(s, sector_num,  cluster_data,
-                                          cluster_data, cur_nr_sectors, false,
-                                          &err) < 0) {
+                if (qcow2_encrypt_sectors(s, offset >> BDRV_SECTOR_BITS,
+                                          cluster_data, cluster_data,
+                                          cur_bytes >> BDRV_SECTOR_BITS,
+                                          false, &err) < 0) {
                     error_free(err);
                     ret = -EIO;
                     goto fail;
                 }
-                qemu_iovec_from_buf(qiov, bytes_done,
-                    cluster_data, 512 * cur_nr_sectors);
+                qemu_iovec_from_buf(qiov, bytes_done, cluster_data, cur_bytes);
             }
             break;
 
@@ -1519,9 +1529,9 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
             goto fail;
         }
 
-        remaining_sectors -= cur_nr_sectors;
-        sector_num += cur_nr_sectors;
-        bytes_done += cur_nr_sectors * 512;
+        bytes -= cur_bytes;
+        offset += cur_bytes;
+        bytes_done += cur_bytes;
     }
     ret = 0;
 
@@ -2435,7 +2445,7 @@ static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
     if (head || tail) {
         int64_t cl_start = (offset - head) >> BDRV_SECTOR_BITS;
         uint64_t off;
-        int nr;
+        unsigned int nr;
 
         assert(head + count <= s->cluster_size);
 
@@ -2452,7 +2462,7 @@ static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
         /* We can have new write after previous check */
         offset = cl_start << BDRV_SECTOR_BITS;
         count = s->cluster_size;
-        nr = s->cluster_sectors;
+        nr = s->cluster_size;
         ret = qcow2_get_cluster_offset(bs, offset, &nr, &off);
         if (ret != QCOW2_CLUSTER_UNALLOCATED && ret != QCOW2_CLUSTER_ZERO) {
             qemu_co_mutex_unlock(&s->lock);
@@ -3368,7 +3378,7 @@ BlockDriver bdrv_qcow2 = {
     .bdrv_co_get_block_status = qcow2_co_get_block_status,
     .bdrv_set_key       = qcow2_set_key,
 
-    .bdrv_co_readv          = qcow2_co_readv,
+    .bdrv_co_preadv         = qcow2_co_preadv,
     .bdrv_co_writev         = qcow2_co_writev,
     .bdrv_co_flush_to_os    = qcow2_co_flush_to_os,
 
diff --git a/block/qcow2.h b/block/qcow2.h
index 7db9795..e2c42d5 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -544,7 +544,7 @@ int qcow2_encrypt_sectors(BDRVQcow2State *s, int64_t sector_num,
                           int nb_sectors, bool enc, Error **errp);
 
 int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
-    int *num, uint64_t *cluster_offset);
+                             unsigned int *bytes, uint64_t *cluster_offset);
 int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
     int *num, uint64_t *host_offset, QCowL2Meta **m);
 uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH 3/5] qcow2: Make copy_sectors() byte based
  2016-06-03 17:21 [Qemu-devel] [PATCH 0/5] qcow2: Implement .bdrv_co_preadv/pwritev Kevin Wolf
  2016-06-03 17:21 ` [Qemu-devel] [PATCH 1/5] qcow2: Work with bytes in qcow2_get_cluster_offset() Kevin Wolf
  2016-06-03 17:21 ` [Qemu-devel] [PATCH 2/5] qcow2: Implement .bdrv_co_preadv() Kevin Wolf
@ 2016-06-03 17:21 ` Kevin Wolf
  2016-06-03 19:34   ` Eric Blake
  2016-06-03 17:21 ` [Qemu-devel] [PATCH 4/5] qcow2: Use bytes instead of sectors for QCowL2Meta Kevin Wolf
  2016-06-03 17:21 ` [Qemu-devel] [PATCH 5/5] qcow2: Implement .bdrv_co_pwritev Kevin Wolf
  4 siblings, 1 reply; 13+ messages in thread
From: Kevin Wolf @ 2016-06-03 17:21 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, jsnow, qemu-devel

This will allow copy on write operations where the overwritten part of
the cluster is not aligned to sector boundaries.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-cluster.c | 37 +++++++++++++++++--------------------
 1 file changed, 17 insertions(+), 20 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 9fb7f9f..5d04eb7 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -391,21 +391,17 @@ int qcow2_encrypt_sectors(BDRVQcow2State *s, int64_t sector_num,
 }
 
 static int coroutine_fn copy_sectors(BlockDriverState *bs,
-                                     uint64_t start_sect,
+                                     uint64_t src_cluster_offset,
                                      uint64_t cluster_offset,
-                                     int n_start, int n_end)
+                                     int offset_in_cluster,
+                                     int bytes)
 {
     BDRVQcow2State *s = bs->opaque;
     QEMUIOVector qiov;
     struct iovec iov;
-    int n, ret;
-
-    n = n_end - n_start;
-    if (n <= 0) {
-        return 0;
-    }
+    int ret;
 
-    iov.iov_len = n * BDRV_SECTOR_SIZE;
+    iov.iov_len = bytes;
     iov.iov_base = qemu_try_blockalign(bs, iov.iov_len);
     if (iov.iov_base == NULL) {
         return -ENOMEM;
@@ -424,18 +420,20 @@ static int coroutine_fn copy_sectors(BlockDriverState *bs,
      * interface.  This avoids double I/O throttling and request tracking,
      * which can lead to deadlock when block layer copy-on-read is enabled.
      */
-    ret = bs->drv->bdrv_co_preadv(bs, (start_sect + n_start) * BDRV_SECTOR_SIZE,
-                                  n * BDRV_SECTOR_SIZE, &qiov, 0);
+    ret = bs->drv->bdrv_co_preadv(bs, src_cluster_offset + offset_in_cluster,
+                                  bytes, &qiov, 0);
     if (ret < 0) {
         goto out;
     }
 
     if (bs->encrypted) {
         Error *err = NULL;
+        int sector = (cluster_offset + offset_in_cluster) >> BDRV_SECTOR_BITS;
         assert(s->cipher);
-        if (qcow2_encrypt_sectors(s, start_sect + n_start,
-                                  iov.iov_base, iov.iov_base, n,
-                                  true, &err) < 0) {
+        assert((offset_in_cluster & BDRV_SECTOR_MASK) == 0);
+        assert((bytes & ~BDRV_SECTOR_MASK) == 0);
+        if (qcow2_encrypt_sectors(s, sector, iov.iov_base, iov.iov_base,
+                                  bytes >> BDRV_SECTOR_BITS, true, &err) < 0) {
             ret = -EIO;
             error_free(err);
             goto out;
@@ -443,14 +441,14 @@ static int coroutine_fn copy_sectors(BlockDriverState *bs,
     }
 
     ret = qcow2_pre_write_overlap_check(bs, 0,
-            cluster_offset + n_start * BDRV_SECTOR_SIZE, n * BDRV_SECTOR_SIZE);
+            cluster_offset + offset_in_cluster, bytes);
     if (ret < 0) {
         goto out;
     }
 
     BLKDBG_EVENT(bs->file, BLKDBG_COW_WRITE);
-    ret = bdrv_co_writev(bs->file->bs, (cluster_offset >> 9) + n_start, n,
-                         &qiov);
+    ret = bdrv_co_pwritev(bs->file->bs, cluster_offset + offset_in_cluster,
+                          bytes, &qiov, 0);
     if (ret < 0) {
         goto out;
     }
@@ -743,9 +741,8 @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m, Qcow2COWRegion *r)
     }
 
     qemu_co_mutex_unlock(&s->lock);
-    ret = copy_sectors(bs, m->offset / BDRV_SECTOR_SIZE, m->alloc_offset,
-                       r->offset / BDRV_SECTOR_SIZE,
-                       r->offset / BDRV_SECTOR_SIZE + r->nb_sectors);
+    ret = copy_sectors(bs, m->offset, m->alloc_offset,
+                       r->offset, r->nb_sectors * BDRV_SECTOR_SIZE);
     qemu_co_mutex_lock(&s->lock);
 
     if (ret < 0) {
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH 4/5] qcow2: Use bytes instead of sectors for QCowL2Meta
  2016-06-03 17:21 [Qemu-devel] [PATCH 0/5] qcow2: Implement .bdrv_co_preadv/pwritev Kevin Wolf
                   ` (2 preceding siblings ...)
  2016-06-03 17:21 ` [Qemu-devel] [PATCH 3/5] qcow2: Make copy_sectors() byte based Kevin Wolf
@ 2016-06-03 17:21 ` Kevin Wolf
  2016-06-03 19:44   ` Eric Blake
  2016-06-03 17:21 ` [Qemu-devel] [PATCH 5/5] qcow2: Implement .bdrv_co_pwritev Kevin Wolf
  4 siblings, 1 reply; 13+ messages in thread
From: Kevin Wolf @ 2016-06-03 17:21 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, jsnow, qemu-devel

In preparation for implementing .bdrv_co_pwritev in qcow2.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-cluster.c | 32 ++++++++++++--------------------
 block/qcow2.h         | 13 +++----------
 2 files changed, 15 insertions(+), 30 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 5d04eb7..112f056 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -736,13 +736,12 @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m, Qcow2COWRegion *r)
     BDRVQcow2State *s = bs->opaque;
     int ret;
 
-    if (r->nb_sectors == 0) {
+    if (r->nb_bytes == 0) {
         return 0;
     }
 
     qemu_co_mutex_unlock(&s->lock);
-    ret = copy_sectors(bs, m->offset, m->alloc_offset,
-                       r->offset, r->nb_sectors * BDRV_SECTOR_SIZE);
+    ret = copy_sectors(bs, m->offset, m->alloc_offset, r->offset, r->nb_bytes);
     qemu_co_mutex_lock(&s->lock);
 
     if (ret < 0) {
@@ -1192,25 +1191,20 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
     /*
      * Save info needed for meta data update.
      *
-     * requested_sectors: Number of sectors from the start of the first
+     * requested_sectors: Number of bytes from the start of the first
      * newly allocated cluster to the end of the (possibly shortened
      * before) write request.
      *
-     * avail_sectors: Number of sectors from the start of the first
+     * avail_bytes: Number of bytes from the start of the first
      * newly allocated to the end of the last newly allocated cluster.
      *
-     * nb_sectors: The number of sectors from the start of the first
+     * nb_bytes: The number of bytes from the start of the first
      * newly allocated cluster to the end of the area that the write
      * request actually writes to (excluding COW at the end)
      */
-    int requested_sectors =
-        (*bytes + offset_into_cluster(s, guest_offset))
-        >> BDRV_SECTOR_BITS;
-    int avail_sectors = nb_clusters
-                        << (s->cluster_bits - BDRV_SECTOR_BITS);
-    int alloc_n_start = offset_into_cluster(s, guest_offset)
-                        >> BDRV_SECTOR_BITS;
-    int nb_sectors = MIN(requested_sectors, avail_sectors);
+    int requested_bytes = *bytes + offset_into_cluster(s, guest_offset);
+    int avail_bytes = nb_clusters << s->cluster_bits;
+    int nb_bytes = MIN(requested_bytes, avail_bytes);
     QCowL2Meta *old_m = *m;
 
     *m = g_malloc0(sizeof(**m));
@@ -1221,23 +1215,21 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
         .alloc_offset   = alloc_cluster_offset,
         .offset         = start_of_cluster(s, guest_offset),
         .nb_clusters    = nb_clusters,
-        .nb_available   = nb_sectors,
 
         .cow_start = {
             .offset     = 0,
-            .nb_sectors = alloc_n_start,
+            .nb_bytes   = offset_into_cluster(s, guest_offset),
         },
         .cow_end = {
-            .offset     = nb_sectors * BDRV_SECTOR_SIZE,
-            .nb_sectors = avail_sectors - nb_sectors,
+            .offset     = nb_bytes,
+            .nb_bytes   = avail_bytes - nb_bytes,
         },
     };
     qemu_co_queue_init(&(*m)->dependent_requests);
     QLIST_INSERT_HEAD(&s->cluster_allocs, *m, next_in_flight);
 
     *host_offset = alloc_cluster_offset + offset_into_cluster(s, guest_offset);
-    *bytes = MIN(*bytes, (nb_sectors * BDRV_SECTOR_SIZE)
-                         - offset_into_cluster(s, guest_offset));
+    *bytes = MIN(*bytes, nb_bytes - offset_into_cluster(s, guest_offset));
     assert(*bytes != 0);
 
     return 1;
diff --git a/block/qcow2.h b/block/qcow2.h
index e2c42d5..175b0c1 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -302,8 +302,8 @@ typedef struct Qcow2COWRegion {
      */
     uint64_t    offset;
 
-    /** Number of sectors to copy */
-    int         nb_sectors;
+    /** Number of bytes to copy */
+    int         nb_bytes;
 } Qcow2COWRegion;
 
 /**
@@ -318,12 +318,6 @@ typedef struct QCowL2Meta
     /** Host offset of the first newly allocated cluster */
     uint64_t alloc_offset;
 
-    /**
-     * Number of sectors from the start of the first allocated cluster to
-     * the end of the (possibly shortened) request
-     */
-    int nb_available;
-
     /** Number of newly allocated clusters */
     int nb_clusters;
 
@@ -471,8 +465,7 @@ static inline uint64_t l2meta_cow_start(QCowL2Meta *m)
 
 static inline uint64_t l2meta_cow_end(QCowL2Meta *m)
 {
-    return m->offset + m->cow_end.offset
-        + (m->cow_end.nb_sectors << BDRV_SECTOR_BITS);
+    return m->offset + m->cow_end.offset + m->cow_end.nb_bytes;
 }
 
 static inline uint64_t refcount_diff(uint64_t r1, uint64_t r2)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH 5/5] qcow2: Implement .bdrv_co_pwritev
  2016-06-03 17:21 [Qemu-devel] [PATCH 0/5] qcow2: Implement .bdrv_co_preadv/pwritev Kevin Wolf
                   ` (3 preceding siblings ...)
  2016-06-03 17:21 ` [Qemu-devel] [PATCH 4/5] qcow2: Use bytes instead of sectors for QCowL2Meta Kevin Wolf
@ 2016-06-03 17:21 ` Kevin Wolf
  2016-06-03 20:13   ` Eric Blake
  2016-06-03 22:59   ` Eric Blake
  4 siblings, 2 replies; 13+ messages in thread
From: Kevin Wolf @ 2016-06-03 17:21 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, jsnow, qemu-devel

This changes qcow2 to implement the byte-based .bdrv_co_pwritev
interface rather than the sector-based old one.

As preallocation uses the same allocation function as normal writes, and
the interface of that function needs to be changed, it is converted in
the same patch.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-cluster.c | 13 ++++----
 block/qcow2.c         | 82 ++++++++++++++++++++++++---------------------------
 block/qcow2.h         |  3 +-
 trace-events          |  6 ++--
 4 files changed, 49 insertions(+), 55 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 112f056..9c8378b 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1261,7 +1261,8 @@ fail:
  * Return 0 on success and -errno in error cases
  */
 int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
-    int *num, uint64_t *host_offset, QCowL2Meta **m)
+                               unsigned int *bytes, uint64_t *host_offset,
+                               QCowL2Meta **m)
 {
     BDRVQcow2State *s = bs->opaque;
     uint64_t start, remaining;
@@ -1269,13 +1270,11 @@ int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
     uint64_t cur_bytes;
     int ret;
 
-    trace_qcow2_alloc_clusters_offset(qemu_coroutine_self(), offset, *num);
-
-    assert((offset & ~BDRV_SECTOR_MASK) == 0);
+    trace_qcow2_alloc_clusters_offset(qemu_coroutine_self(), offset, *bytes);
 
 again:
     start = offset;
-    remaining = (uint64_t)*num << BDRV_SECTOR_BITS;
+    remaining = *bytes;
     cluster_offset = 0;
     *host_offset = 0;
     cur_bytes = 0;
@@ -1361,8 +1360,8 @@ again:
         }
     }
 
-    *num -= remaining >> BDRV_SECTOR_BITS;
-    assert(*num > 0);
+    *bytes -= remaining;
+    assert(*bytes > 0);
     assert(*host_offset != 0);
 
     return 0;
diff --git a/block/qcow2.c b/block/qcow2.c
index b498753..68c6d0d 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1544,23 +1544,20 @@ fail:
     return ret;
 }
 
-static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
-                           int64_t sector_num,
-                           int remaining_sectors,
-                           QEMUIOVector *qiov)
+static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs,
+        uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags)
 {
     BDRVQcow2State *s = bs->opaque;
-    int index_in_cluster;
+    int offset_in_cluster;
     int ret;
-    int cur_nr_sectors; /* number of sectors in current iteration */
+    unsigned int cur_bytes; /* number of sectors in current iteration */
     uint64_t cluster_offset;
     QEMUIOVector hd_qiov;
     uint64_t bytes_done = 0;
     uint8_t *cluster_data = NULL;
     QCowL2Meta *l2meta = NULL;
 
-    trace_qcow2_writev_start_req(qemu_coroutine_self(), sector_num,
-                                 remaining_sectors);
+    trace_qcow2_writev_start_req(qemu_coroutine_self(), offset, bytes);
 
     qemu_iovec_init(&hd_qiov, qiov->niov);
 
@@ -1568,22 +1565,21 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
 
     qemu_co_mutex_lock(&s->lock);
 
-    while (remaining_sectors != 0) {
+    while (bytes != 0) {
 
         l2meta = NULL;
 
         trace_qcow2_writev_start_part(qemu_coroutine_self());
-        index_in_cluster = sector_num & (s->cluster_sectors - 1);
-        cur_nr_sectors = remaining_sectors;
-        if (bs->encrypted &&
-            cur_nr_sectors >
-            QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors - index_in_cluster) {
-            cur_nr_sectors =
-                QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors - index_in_cluster;
+        offset_in_cluster = offset_into_cluster(s, offset);
+        cur_bytes = MIN(bytes, INT_MAX);
+        if (bs->encrypted) {
+            cur_bytes = MIN(cur_bytes,
+                            QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size
+                            - offset_in_cluster);
         }
 
-        ret = qcow2_alloc_cluster_offset(bs, sector_num << 9,
-            &cur_nr_sectors, &cluster_offset, &l2meta);
+        ret = qcow2_alloc_cluster_offset(bs, offset, &cur_bytes,
+                                         &cluster_offset, &l2meta);
         if (ret < 0) {
             goto fail;
         }
@@ -1591,8 +1587,7 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
         assert((cluster_offset & 511) == 0);
 
         qemu_iovec_reset(&hd_qiov);
-        qemu_iovec_concat(&hd_qiov, qiov, bytes_done,
-            cur_nr_sectors * 512);
+        qemu_iovec_concat(&hd_qiov, qiov, bytes_done, cur_bytes);
 
         if (bs->encrypted) {
             Error *err = NULL;
@@ -1611,8 +1606,9 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
                    QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size);
             qemu_iovec_to_buf(&hd_qiov, 0, cluster_data, hd_qiov.size);
 
-            if (qcow2_encrypt_sectors(s, sector_num, cluster_data,
-                                      cluster_data, cur_nr_sectors,
+            if (qcow2_encrypt_sectors(s, offset >> BDRV_SECTOR_BITS,
+                                      cluster_data, cluster_data,
+                                      cur_bytes >>BDRV_SECTOR_BITS,
                                       true, &err) < 0) {
                 error_free(err);
                 ret = -EIO;
@@ -1620,13 +1616,11 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
             }
 
             qemu_iovec_reset(&hd_qiov);
-            qemu_iovec_add(&hd_qiov, cluster_data,
-                cur_nr_sectors * 512);
+            qemu_iovec_add(&hd_qiov, cluster_data, cur_bytes);
         }
 
         ret = qcow2_pre_write_overlap_check(bs, 0,
-                cluster_offset + index_in_cluster * BDRV_SECTOR_SIZE,
-                cur_nr_sectors * BDRV_SECTOR_SIZE);
+                cluster_offset + offset_in_cluster, cur_bytes);
         if (ret < 0) {
             goto fail;
         }
@@ -1634,10 +1628,10 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
         qemu_co_mutex_unlock(&s->lock);
         BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
         trace_qcow2_writev_data(qemu_coroutine_self(),
-                                (cluster_offset >> 9) + index_in_cluster);
-        ret = bdrv_co_writev(bs->file->bs,
-                             (cluster_offset >> 9) + index_in_cluster,
-                             cur_nr_sectors, &hd_qiov);
+                                cluster_offset + offset_in_cluster);
+        ret = bdrv_co_pwritev(bs->file->bs,
+                              cluster_offset + offset_in_cluster,
+                              cur_bytes, &hd_qiov, 0);
         qemu_co_mutex_lock(&s->lock);
         if (ret < 0) {
             goto fail;
@@ -1663,10 +1657,10 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
             l2meta = next;
         }
 
-        remaining_sectors -= cur_nr_sectors;
-        sector_num += cur_nr_sectors;
-        bytes_done += cur_nr_sectors * 512;
-        trace_qcow2_writev_done_part(qemu_coroutine_self(), cur_nr_sectors);
+        bytes -= cur_bytes;
+        offset += cur_bytes;
+        bytes_done += cur_bytes;
+        trace_qcow2_writev_done_part(qemu_coroutine_self(), cur_bytes);
     }
     ret = 0;
 
@@ -2008,19 +2002,19 @@ static int qcow2_change_backing_file(BlockDriverState *bs,
 
 static int preallocate(BlockDriverState *bs)
 {
-    uint64_t nb_sectors;
+    uint64_t bytes;
     uint64_t offset;
     uint64_t host_offset = 0;
-    int num;
+    unsigned int cur_bytes;
     int ret;
     QCowL2Meta *meta;
 
-    nb_sectors = bdrv_nb_sectors(bs);
+    bytes = bdrv_getlength(bs);
     offset = 0;
 
-    while (nb_sectors) {
-        num = MIN(nb_sectors, INT_MAX >> BDRV_SECTOR_BITS);
-        ret = qcow2_alloc_cluster_offset(bs, offset, &num,
+    while (bytes) {
+        cur_bytes = MIN(bytes, INT_MAX);
+        ret = qcow2_alloc_cluster_offset(bs, offset, &cur_bytes,
                                          &host_offset, &meta);
         if (ret < 0) {
             return ret;
@@ -2046,8 +2040,8 @@ static int preallocate(BlockDriverState *bs)
 
         /* TODO Preallocate data if requested */
 
-        nb_sectors -= num;
-        offset += num << BDRV_SECTOR_BITS;
+        bytes -= cur_bytes;
+        offset += cur_bytes;
     }
 
     /*
@@ -2059,7 +2053,7 @@ static int preallocate(BlockDriverState *bs)
         uint8_t buf[BDRV_SECTOR_SIZE];
         memset(buf, 0, BDRV_SECTOR_SIZE);
         ret = bdrv_write(bs->file->bs,
-                         (host_offset >> BDRV_SECTOR_BITS) + num - 1,
+                         ((host_offset + cur_bytes) >> BDRV_SECTOR_BITS) - 1,
                          buf, 1);
         if (ret < 0) {
             return ret;
@@ -3379,7 +3373,7 @@ BlockDriver bdrv_qcow2 = {
     .bdrv_set_key       = qcow2_set_key,
 
     .bdrv_co_preadv         = qcow2_co_preadv,
-    .bdrv_co_writev         = qcow2_co_writev,
+    .bdrv_co_pwritev        = qcow2_co_pwritev,
     .bdrv_co_flush_to_os    = qcow2_co_flush_to_os,
 
     .bdrv_co_pwrite_zeroes  = qcow2_co_pwrite_zeroes,
diff --git a/block/qcow2.h b/block/qcow2.h
index 175b0c1..b36a7bf 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -539,7 +539,8 @@ int qcow2_encrypt_sectors(BDRVQcow2State *s, int64_t sector_num,
 int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
                              unsigned int *bytes, uint64_t *cluster_offset);
 int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
-    int *num, uint64_t *host_offset, QCowL2Meta **m);
+                               unsigned int *bytes, uint64_t *host_offset,
+                               QCowL2Meta **m);
 uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
                                          uint64_t offset,
                                          int compressed_size);
diff --git a/trace-events b/trace-events
index 353454c..81c9ec3 100644
--- a/trace-events
+++ b/trace-events
@@ -606,16 +606,16 @@ qemu_system_shutdown_request(void) ""
 qemu_system_powerdown_request(void) ""
 
 # block/qcow2.c
-qcow2_writev_start_req(void *co, int64_t sector, int nb_sectors) "co %p sector %" PRIx64 " nb_sectors %d"
+qcow2_writev_start_req(void *co, int64_t offset, int bytes) "co %p offset %" PRIx64 " bytes %d"
 qcow2_writev_done_req(void *co, int ret) "co %p ret %d"
 qcow2_writev_start_part(void *co) "co %p"
-qcow2_writev_done_part(void *co, int cur_nr_sectors) "co %p cur_nr_sectors %d"
+qcow2_writev_done_part(void *co, int cur_bytes) "co %p cur_bytes %d"
 qcow2_writev_data(void *co, uint64_t offset) "co %p offset %" PRIx64
 qcow2_pwrite_zeroes_start_req(void *co, int64_t offset, int count) "co %p offset %" PRIx64 " count %d"
 qcow2_pwrite_zeroes(void *co, int64_t offset, int count) "co %p offset %" PRIx64 " count %d"
 
 # block/qcow2-cluster.c
-qcow2_alloc_clusters_offset(void *co, uint64_t offset, int num) "co %p offset %" PRIx64 " num %d"
+qcow2_alloc_clusters_offset(void *co, uint64_t offset, int bytes) "co %p offset %" PRIx64 " bytes %d"
 qcow2_handle_copied(void *co, uint64_t guest_offset, uint64_t host_offset, uint64_t bytes) "co %p guest_offset %" PRIx64 " host_offset %" PRIx64 " bytes %" PRIx64
 qcow2_handle_alloc(void *co, uint64_t guest_offset, uint64_t host_offset, uint64_t bytes) "co %p guest_offset %" PRIx64 " host_offset %" PRIx64 " bytes %" PRIx64
 qcow2_do_alloc_clusters_offset(void *co, uint64_t guest_offset, uint64_t host_offset, int nb_clusters) "co %p guest_offset %" PRIx64 " host_offset %" PRIx64 " nb_clusters %d"
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 1/5] qcow2: Work with bytes in qcow2_get_cluster_offset()
  2016-06-03 17:21 ` [Qemu-devel] [PATCH 1/5] qcow2: Work with bytes in qcow2_get_cluster_offset() Kevin Wolf
@ 2016-06-03 17:44   ` Eric Blake
  0 siblings, 0 replies; 13+ messages in thread
From: Eric Blake @ 2016-06-03 17:44 UTC (permalink / raw)
  To: Kevin Wolf, qemu-block; +Cc: jsnow, qemu-devel, mreitz

[-- Attachment #1: Type: text/plain, Size: 4334 bytes --]

On 06/03/2016 11:21 AM, Kevin Wolf wrote:
> This patch changes the units that qcow2_get_cluster_offset() uses
> internally, without touching the interface just yet. This will be done
> in another patch.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/qcow2-cluster.c | 43 ++++++++++++++++++++++---------------------
>  1 file changed, 22 insertions(+), 21 deletions(-)
> 
> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index d901d89..b2405b1 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -482,29 +482,27 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
>      unsigned int l2_index;
>      uint64_t l1_index, l2_offset, *l2_table;
>      int l1_bits, c;
> -    unsigned int index_in_cluster, nb_clusters;
> -    uint64_t nb_available, nb_needed;
> +    unsigned int offset_in_cluster, nb_clusters;
> +    uint64_t bytes_available, bytes_needed;
>      int ret;
> +    unsigned int bytes;
>  
> -    index_in_cluster = (offset >> 9) & (s->cluster_sectors - 1);
> -    nb_needed = *num + index_in_cluster;
> +    bytes = *num * BDRV_SECTOR_SIZE;

bytes can overflow if num > BDRV_REQUEST_MAX_SECTORS.  Is that ever a
problem? Would an assert() help, or else clamping it (since it is really
just a hint of how far we are allowed to look for similar clusters, but
we can always quit looking early):

bytes = MIN(*num * BDRV_SECTOR_SIZE, UINT32_MAX);

Then again, the interface is about to change, so this may just be a
transient problem.

>  
> -    l1_bits = s->l2_bits + s->cluster_bits;
> -
> -    /* compute how many bytes there are between the offset and
> -     * the end of the l1 entry
> -     */
> +    offset_in_cluster = offset_into_cluster(s, offset);
> +    bytes_needed = bytes + offset_in_cluster;

Hmm, my idea for clamped bytes above may need tweaking (that is,
UINT32_MAX is inappropriate as the clamp value, if you are going to
round it up here).  Or, since bytes_needed is 64-bits, make either bytes
or offset_in_cluster also be 64 bits, or start with '0ULL +', to avoid
overflow.

>  
> -    nb_available = (1ULL << l1_bits) - (offset & ((1ULL << l1_bits) - 1));
> -
> -    /* compute the number of available sectors */
> +    l1_bits = s->l2_bits + s->cluster_bits;
>  
> -    nb_available = (nb_available >> 9) + index_in_cluster;
> +    /* compute how many bytes there are between the start of the cluster
> +     * containing offset and the end of the l1 entry */
> +    bytes_available = (1ULL << l1_bits) - (offset & ((1ULL << l1_bits) - 1))
> +                    + offset_in_cluster;
>  
> -    if (nb_needed > nb_available) {
> -        nb_needed = nb_available;
> +    if (bytes_needed > bytes_available) {
> +        bytes_needed = bytes_available;
>      }
> -    assert(nb_needed <= INT_MAX);
> +    assert(bytes_needed <= INT_MAX);

Is this assertion too late given the spots I pointed out above as
possible overflow points?

>  
>      *cluster_offset = 0;
>  
> @@ -542,7 +540,7 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
>      *cluster_offset = be64_to_cpu(l2_table[l2_index]);
>  
>      /* nb_needed <= INT_MAX, thus nb_clusters <= INT_MAX, too */
> -    nb_clusters = size_to_clusters(s, nb_needed << 9);
> +    nb_clusters = size_to_clusters(s, bytes_needed);
>  
>      ret = qcow2_get_cluster_type(*cluster_offset);
>      switch (ret) {
> @@ -589,13 +587,16 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
>  
>      qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
>  
> -    nb_available = (c * s->cluster_sectors);
> +    bytes_available = (c * s->cluster_size);
>  
>  out:
> -    if (nb_available > nb_needed)
> -        nb_available = nb_needed;
> +    if (bytes_available > bytes_needed) {
> +        bytes_available = bytes_needed;
> +    }
>  
> -    *num = nb_available - index_in_cluster;
> +    bytes = bytes_available - offset_in_cluster;
> +    assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
> +    *num = bytes >> BDRV_SECTOR_BITS;
>  
>      return ret;

Looks like it is headed in the right direction, though.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] qcow2: Implement .bdrv_co_preadv()
  2016-06-03 17:21 ` [Qemu-devel] [PATCH 2/5] qcow2: Implement .bdrv_co_preadv() Kevin Wolf
@ 2016-06-03 19:18   ` Eric Blake
  0 siblings, 0 replies; 13+ messages in thread
From: Eric Blake @ 2016-06-03 19:18 UTC (permalink / raw)
  To: Kevin Wolf, qemu-block; +Cc: jsnow, qemu-devel, mreitz

[-- Attachment #1: Type: text/plain, Size: 5051 bytes --]

On 06/03/2016 11:21 AM, Kevin Wolf wrote:
> Reading from qcow2 images is now byte granularity.
> 
> Most of the affected code in qcow2 actually gets simpler with this
> change. The only exception is encryption, which is fixed on 512 bytes
> blocks; in order to keep this working, bs->request_alignment is set for
> encrypted images.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/qcow2-cluster.c |  18 ++++-----
>  block/qcow2.c         | 108 +++++++++++++++++++++++++++-----------------------
>  block/qcow2.h         |   2 +-
>  3 files changed, 67 insertions(+), 61 deletions(-)
> 

> @@ -467,16 +468,16 @@ out:
>   * For a given offset of the disk image, find the cluster offset in
>   * qcow2 file. The offset is stored in *cluster_offset.
>   *
> - * on entry, *num is the number of contiguous sectors we'd like to
> + * on entry, *bytes is the number of contiguous bytes we'd like to

maybe s/number/maximum number/

>   * access following offset.
>   *
> - * on exit, *num is the number of contiguous sectors we can read.
> + * on exit, *bytes is the number of contiguous bytes we can read.

maybe s/we can read/with the same cluster type/

>   *
>   * Returns the cluster type (QCOW2_CLUSTER_*) on success, -errno in error
>   * cases.
>   */
>  int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
> -    int *num, uint64_t *cluster_offset)
> +                             unsigned int *bytes, uint64_t *cluster_offset)
>  {
>      BDRVQcow2State *s = bs->opaque;
>      unsigned int l2_index;
> @@ -485,12 +486,9 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
>      unsigned int offset_in_cluster, nb_clusters;
>      uint64_t bytes_available, bytes_needed;
>      int ret;
> -    unsigned int bytes;
> -
> -    bytes = *num * BDRV_SECTOR_SIZE;

One potential overflow gone...

>  
>      offset_in_cluster = offset_into_cluster(s, offset);
> -    bytes_needed = bytes + offset_in_cluster;
> +    bytes_needed = *bytes + offset_in_cluster;

...but not the other.  Looks like your callers limit their input 'bytes'
to at most INT_MAX, and therefore it happens to not overflow unsigned
int in practice, but you may want an assertion?

> +++ b/block/qcow2.c
> @@ -975,6 +975,9 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
>          }
>  
>          bs->encrypted = 1;
> +
> +        /* Encryption works on a sector granularity */
> +        bs->request_alignment = BDRV_SECTOR_SIZE;

Trivial conflict with my patch 5/5 that moves request_alignment into
BlockLimits (if we even want that, since I still have to find why my
patch makes qemu-iotests 77 hang)

>  
> -static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
> -                          int remaining_sectors, QEMUIOVector *qiov)
> +static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
> +                                        uint64_t bytes, QEMUIOVector *qiov,
> +                                        int flags)

Wait a minute.  .bdrv_co_preadv() takes uint64_t bytes, while
bdrv_co_preadv() takes only unsigned int bytes?  Eww.  We've got some
more scrubbing work to do.  At least it is going to get easier to
universally turn on full 64-bit byte interfaces everywhere, especially
once my patches for auto-fragmenting at max_transfer_length land (which
in turn won't be posted before your conversion of bdrv_aligned_preadv()
to a byte interface).  So no impact to this patch.

>  {
>      BDRVQcow2State *s = bs->opaque;
> -    int index_in_cluster, n1;
> +    int offset_in_cluster, n1;
>      int ret;
> -    int cur_nr_sectors; /* number of sectors in current iteration */
> +    unsigned int cur_bytes; /* number of sectors in current iteration */

comment is stale now

>      uint64_t cluster_offset = 0;
>      uint64_t bytes_done = 0;
>      QEMUIOVector hd_qiov;
> @@ -1389,26 +1402,24 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
>  
>      qemu_co_mutex_lock(&s->lock);
>  
> -    while (remaining_sectors != 0) {
> +    while (bytes != 0) {
>  
>          /* prepare next request */
> -        cur_nr_sectors = remaining_sectors;
> +        cur_bytes = MIN(bytes, INT_MAX);
>          if (s->cipher) {
> -            cur_nr_sectors = MIN(cur_nr_sectors,
> -                QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors);
> +            cur_bytes = MIN(cur_bytes,
> +                            QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size);

Again, my work on auto-fragmenting at the block layer should make it so
that we can eventually further simplify this part to just assert that
bytes doesn't exceed max_transfer_length, rather than having to fragment
it at INT_MAX ourselves.

Couple of tweaks to fix as pointed out above, but mostly looks sane.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 3/5] qcow2: Make copy_sectors() byte based
  2016-06-03 17:21 ` [Qemu-devel] [PATCH 3/5] qcow2: Make copy_sectors() byte based Kevin Wolf
@ 2016-06-03 19:34   ` Eric Blake
  0 siblings, 0 replies; 13+ messages in thread
From: Eric Blake @ 2016-06-03 19:34 UTC (permalink / raw)
  To: Kevin Wolf, qemu-block; +Cc: jsnow, qemu-devel, mreitz

[-- Attachment #1: Type: text/plain, Size: 1337 bytes --]

On 06/03/2016 11:21 AM, Kevin Wolf wrote:
> This will allow copy on write operations where the overwritten part of
> the cluster is not aligned to sector boundaries.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/qcow2-cluster.c | 37 +++++++++++++++++--------------------
>  1 file changed, 17 insertions(+), 20 deletions(-)
> 
> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index 9fb7f9f..5d04eb7 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -391,21 +391,17 @@ int qcow2_encrypt_sectors(BDRVQcow2State *s, int64_t sector_num,
>  }
>  
>  static int coroutine_fn copy_sectors(BlockDriverState *bs,
> -                                     uint64_t start_sect,
> +                                     uint64_t src_cluster_offset,
>                                       uint64_t cluster_offset,
> -                                     int n_start, int n_end)
> +                                     int offset_in_cluster,
> +                                     int bytes)

Do we still want this named copy_sectors(), now that it can copy a
sub-sector (well, when not encrypted)?

Otherwise,

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] qcow2: Use bytes instead of sectors for QCowL2Meta
  2016-06-03 17:21 ` [Qemu-devel] [PATCH 4/5] qcow2: Use bytes instead of sectors for QCowL2Meta Kevin Wolf
@ 2016-06-03 19:44   ` Eric Blake
  0 siblings, 0 replies; 13+ messages in thread
From: Eric Blake @ 2016-06-03 19:44 UTC (permalink / raw)
  To: Kevin Wolf, qemu-block; +Cc: jsnow, qemu-devel, mreitz

[-- Attachment #1: Type: text/plain, Size: 1446 bytes --]

On 06/03/2016 11:21 AM, Kevin Wolf wrote:
> In preparation for implementing .bdrv_co_pwritev in qcow2.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/qcow2-cluster.c | 32 ++++++++++++--------------------
>  block/qcow2.h         | 13 +++----------
>  2 files changed, 15 insertions(+), 30 deletions(-)
> 

> @@ -1192,25 +1191,20 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
>      /*
>       * Save info needed for meta data update.
>       *
> -     * requested_sectors: Number of sectors from the start of the first
> +     * requested_sectors: Number of bytes from the start of the first

This comment seems like an incomplete conversion. Is it sectors or bytes?


> -    int requested_sectors =
> -        (*bytes + offset_into_cluster(s, guest_offset))
> -        >> BDRV_SECTOR_BITS;
> -    int avail_sectors = nb_clusters
> -                        << (s->cluster_bits - BDRV_SECTOR_BITS);
> -    int alloc_n_start = offset_into_cluster(s, guest_offset)
> -                        >> BDRV_SECTOR_BITS;
> -    int nb_sectors = MIN(requested_sectors, avail_sectors);
> +    int requested_bytes = *bytes + offset_into_cluster(s, guest_offset);
> +    int avail_bytes = nb_clusters << s->cluster_bits;

Can this ever overflow?

Otherwise looks correct.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 5/5] qcow2: Implement .bdrv_co_pwritev
  2016-06-03 17:21 ` [Qemu-devel] [PATCH 5/5] qcow2: Implement .bdrv_co_pwritev Kevin Wolf
@ 2016-06-03 20:13   ` Eric Blake
  2016-06-06 14:50     ` Kevin Wolf
  2016-06-03 22:59   ` Eric Blake
  1 sibling, 1 reply; 13+ messages in thread
From: Eric Blake @ 2016-06-03 20:13 UTC (permalink / raw)
  To: Kevin Wolf, qemu-block; +Cc: jsnow, qemu-devel, mreitz

[-- Attachment #1: Type: text/plain, Size: 5222 bytes --]

On 06/03/2016 11:21 AM, Kevin Wolf wrote:
> This changes qcow2 to implement the byte-based .bdrv_co_pwritev
> interface rather than the sector-based old one.
> 
> As preallocation uses the same allocation function as normal writes, and
> the interface of that function needs to be changed, it is converted in
> the same patch.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/qcow2-cluster.c | 13 ++++----
>  block/qcow2.c         | 82 ++++++++++++++++++++++++---------------------------
>  block/qcow2.h         |  3 +-
>  trace-events          |  6 ++--
>  4 files changed, 49 insertions(+), 55 deletions(-)
> 

> +++ b/block/qcow2-cluster.c
> @@ -1261,7 +1261,8 @@ fail:
>   * Return 0 on success and -errno in error cases
>   */
>  int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
> -    int *num, uint64_t *host_offset, QCowL2Meta **m)
> +                               unsigned int *bytes, uint64_t *host_offset,
> +                               QCowL2Meta **m)
>  {

>  
> -    *num -= remaining >> BDRV_SECTOR_BITS;
> -    assert(*num > 0);
> +    *bytes -= remaining;
> +    assert(*bytes > 0);

The assertion is now weaker.  Beforehand, num could go negative.  But
bytes cannot, all it can do is go to 0.  If we wrap around, the
assertion won't catch it.  Do you want to strengthen it by also
asserting that bytes < INT_MAX?

> +++ b/block/qcow2.c
> @@ -1544,23 +1544,20 @@ fail:
>      return ret;
>  }
>  
> -static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
> -                           int64_t sector_num,
> -                           int remaining_sectors,
> -                           QEMUIOVector *qiov)
> +static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs,
> +        uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags)

Worth consistent indentation, while touching it?


> +    while (bytes != 0) {
>  
>          l2meta = NULL;
>  
>          trace_qcow2_writev_start_part(qemu_coroutine_self());
> -        index_in_cluster = sector_num & (s->cluster_sectors - 1);
> -        cur_nr_sectors = remaining_sectors;
> -        if (bs->encrypted &&
> -            cur_nr_sectors >
> -            QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors - index_in_cluster) {
> -            cur_nr_sectors =
> -                QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors - index_in_cluster;
> +        offset_in_cluster = offset_into_cluster(s, offset);
> +        cur_bytes = MIN(bytes, INT_MAX);
> +        if (bs->encrypted) {
> +            cur_bytes = MIN(cur_bytes,
> +                            QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size
> +                            - offset_in_cluster);

Again, if the block layer learns to auto-fragment, then we should just
inform the block layer about QCOW_MAX_CRYPT_CLUSTERS as our max_transfer
limit.


> @@ -1634,10 +1628,10 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
>          qemu_co_mutex_unlock(&s->lock);
>          BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
>          trace_qcow2_writev_data(qemu_coroutine_self(),
> -                                (cluster_offset >> 9) + index_in_cluster);
> -        ret = bdrv_co_writev(bs->file->bs,
> -                             (cluster_offset >> 9) + index_in_cluster,
> -                             cur_nr_sectors, &hd_qiov);
> +                                cluster_offset + offset_in_cluster);
> +        ret = bdrv_co_pwritev(bs->file->bs,
> +                              cluster_offset + offset_in_cluster,
> +                              cur_bytes, &hd_qiov, 0);

s/0/flags/, even if .supported_write_flags stays 0 for now.

> @@ -2008,19 +2002,19 @@ static int qcow2_change_backing_file(BlockDriverState *bs,
>  
>  static int preallocate(BlockDriverState *bs)
>  {
> -    uint64_t nb_sectors;
> +    uint64_t bytes;
>      uint64_t offset;
>      uint64_t host_offset = 0;
> -    int num;
> +    unsigned int cur_bytes;
>      int ret;
>      QCowL2Meta *meta;
>  
> -    nb_sectors = bdrv_nb_sectors(bs);
> +    bytes = bdrv_getlength(bs);
>      offset = 0;
>  
> -    while (nb_sectors) {
> -        num = MIN(nb_sectors, INT_MAX >> BDRV_SECTOR_BITS);
> -        ret = qcow2_alloc_cluster_offset(bs, offset, &num,
> +    while (bytes) {
> +        cur_bytes = MIN(bytes, INT_MAX);

Okay, while bdrv_co_pwritev() should (eventually) let the block layer
auto-fragment, here, you are right that you have to fragment yourself
since this is not being called from the block layer.


> @@ -2059,7 +2053,7 @@ static int preallocate(BlockDriverState *bs)
>          uint8_t buf[BDRV_SECTOR_SIZE];
>          memset(buf, 0, BDRV_SECTOR_SIZE);
>          ret = bdrv_write(bs->file->bs,
> -                         (host_offset >> BDRV_SECTOR_BITS) + num - 1,
> +                         ((host_offset + cur_bytes) >> BDRV_SECTOR_BITS) - 1,
>                           buf, 1);

Do we still have to call bdrv_write(), or can we convert this to a
byte-based interface call?

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 5/5] qcow2: Implement .bdrv_co_pwritev
  2016-06-03 17:21 ` [Qemu-devel] [PATCH 5/5] qcow2: Implement .bdrv_co_pwritev Kevin Wolf
  2016-06-03 20:13   ` Eric Blake
@ 2016-06-03 22:59   ` Eric Blake
  1 sibling, 0 replies; 13+ messages in thread
From: Eric Blake @ 2016-06-03 22:59 UTC (permalink / raw)
  To: Kevin Wolf, qemu-block; +Cc: jsnow, qemu-devel, mreitz

[-- Attachment #1: Type: text/plain, Size: 535 bytes --]

On 06/03/2016 11:21 AM, Kevin Wolf wrote:
> This changes qcow2 to implement the byte-based .bdrv_co_pwritev
> interface rather than the sector-based old one.
> 

Subject line consistency: 2/5 has trailing (), this one does not. I
prefer 2/5.

> As preallocation uses the same allocation function as normal writes, and
> the interface of that function needs to be changed, it is converted in
> the same patch.
> 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 5/5] qcow2: Implement .bdrv_co_pwritev
  2016-06-03 20:13   ` Eric Blake
@ 2016-06-06 14:50     ` Kevin Wolf
  0 siblings, 0 replies; 13+ messages in thread
From: Kevin Wolf @ 2016-06-06 14:50 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-block, jsnow, qemu-devel, mreitz

[-- Attachment #1: Type: text/plain, Size: 5278 bytes --]

Am 03.06.2016 um 22:13 hat Eric Blake geschrieben:
> On 06/03/2016 11:21 AM, Kevin Wolf wrote:
> > This changes qcow2 to implement the byte-based .bdrv_co_pwritev
> > interface rather than the sector-based old one.
> > 
> > As preallocation uses the same allocation function as normal writes, and
> > the interface of that function needs to be changed, it is converted in
> > the same patch.
> > 
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> >  block/qcow2-cluster.c | 13 ++++----
> >  block/qcow2.c         | 82 ++++++++++++++++++++++++---------------------------
> >  block/qcow2.h         |  3 +-
> >  trace-events          |  6 ++--
> >  4 files changed, 49 insertions(+), 55 deletions(-)
> > 
> 
> > +++ b/block/qcow2-cluster.c
> > @@ -1261,7 +1261,8 @@ fail:
> >   * Return 0 on success and -errno in error cases
> >   */
> >  int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
> > -    int *num, uint64_t *host_offset, QCowL2Meta **m)
> > +                               unsigned int *bytes, uint64_t *host_offset,
> > +                               QCowL2Meta **m)
> >  {
> 
> >  
> > -    *num -= remaining >> BDRV_SECTOR_BITS;
> > -    assert(*num > 0);
> > +    *bytes -= remaining;
> > +    assert(*bytes > 0);
> 
> The assertion is now weaker.  Beforehand, num could go negative.  But
> bytes cannot, all it can do is go to 0.  If we wrap around, the
> assertion won't catch it.  Do you want to strengthen it by also
> asserting that bytes < INT_MAX?

Is it useful to assert this, though? We have tons of such loops and we
never assert that cur_bytes <= bytes (which is what you would be
checking effectively) because it's quite obvious.

This assertion was meant to document the less obvious fact that we
always make some progress and never return 0 bytes.

> > +++ b/block/qcow2.c
> > @@ -1544,23 +1544,20 @@ fail:
> >      return ret;
> >  }
> >  
> > -static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
> > -                           int64_t sector_num,
> > -                           int remaining_sectors,
> > -                           QEMUIOVector *qiov)
> > +static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs,
> > +        uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags)
> 
> Worth consistent indentation, while touching it?

Will fix.

> > +    while (bytes != 0) {
> >  
> >          l2meta = NULL;
> >  
> >          trace_qcow2_writev_start_part(qemu_coroutine_self());
> > -        index_in_cluster = sector_num & (s->cluster_sectors - 1);
> > -        cur_nr_sectors = remaining_sectors;
> > -        if (bs->encrypted &&
> > -            cur_nr_sectors >
> > -            QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors - index_in_cluster) {
> > -            cur_nr_sectors =
> > -                QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors - index_in_cluster;
> > +        offset_in_cluster = offset_into_cluster(s, offset);
> > +        cur_bytes = MIN(bytes, INT_MAX);
> > +        if (bs->encrypted) {
> > +            cur_bytes = MIN(cur_bytes,
> > +                            QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size
> > +                            - offset_in_cluster);
> 
> Again, if the block layer learns to auto-fragment, then we should just
> inform the block layer about QCOW_MAX_CRYPT_CLUSTERS as our max_transfer
> limit.
> 
> 
> > @@ -1634,10 +1628,10 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
> >          qemu_co_mutex_unlock(&s->lock);
> >          BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
> >          trace_qcow2_writev_data(qemu_coroutine_self(),
> > -                                (cluster_offset >> 9) + index_in_cluster);
> > -        ret = bdrv_co_writev(bs->file->bs,
> > -                             (cluster_offset >> 9) + index_in_cluster,
> > -                             cur_nr_sectors, &hd_qiov);
> > +                                cluster_offset + offset_in_cluster);
> > +        ret = bdrv_co_pwritev(bs->file->bs,
> > +                              cluster_offset + offset_in_cluster,
> > +                              cur_bytes, &hd_qiov, 0);
> 
> s/0/flags/, even if .supported_write_flags stays 0 for now.

I don't agree. That might make sense for a passthrough driver like raw,
but I don't see a reason to guess for qcow2 that passing flags down is
the right thing to do. The only flag that we currently have (FUA) isn't
correctly implemented by passing it to the lower layer because qcow2
metadata must be considered.

So I would leave changing flags here for the patch that adds some flag
to .supported_write_flags.

> > @@ -2059,7 +2053,7 @@ static int preallocate(BlockDriverState *bs)
> >          uint8_t buf[BDRV_SECTOR_SIZE];
> >          memset(buf, 0, BDRV_SECTOR_SIZE);
> >          ret = bdrv_write(bs->file->bs,
> > -                         (host_offset >> BDRV_SECTOR_BITS) + num - 1,
> > +                         ((host_offset + cur_bytes) >> BDRV_SECTOR_BITS) - 1,
> >                           buf, 1);
> 
> Do we still have to call bdrv_write(), or can we convert this to a
> byte-based interface call?

Okay, I'll make it a one-byte bdrv_pwrite().

Kevin

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2016-06-06 14:50 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-03 17:21 [Qemu-devel] [PATCH 0/5] qcow2: Implement .bdrv_co_preadv/pwritev Kevin Wolf
2016-06-03 17:21 ` [Qemu-devel] [PATCH 1/5] qcow2: Work with bytes in qcow2_get_cluster_offset() Kevin Wolf
2016-06-03 17:44   ` Eric Blake
2016-06-03 17:21 ` [Qemu-devel] [PATCH 2/5] qcow2: Implement .bdrv_co_preadv() Kevin Wolf
2016-06-03 19:18   ` Eric Blake
2016-06-03 17:21 ` [Qemu-devel] [PATCH 3/5] qcow2: Make copy_sectors() byte based Kevin Wolf
2016-06-03 19:34   ` Eric Blake
2016-06-03 17:21 ` [Qemu-devel] [PATCH 4/5] qcow2: Use bytes instead of sectors for QCowL2Meta Kevin Wolf
2016-06-03 19:44   ` Eric Blake
2016-06-03 17:21 ` [Qemu-devel] [PATCH 5/5] qcow2: Implement .bdrv_co_pwritev Kevin Wolf
2016-06-03 20:13   ` Eric Blake
2016-06-06 14:50     ` Kevin Wolf
2016-06-03 22:59   ` Eric Blake

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.